(Springer Tracts in Modern Physics 173) Gernot Alber, Thomas Beth, Michał Horodecki, Paweł Horodecki, Ryszard Horodecki, Martin Rötteler, Harald Weinfurter, Reinhard Werner, Anton Zeilinger (auth.) - .pdf
Gernot Alber
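[Figure: a source s between two analyzers A and B with settings αi and βi and outcomes +1 and −1; the settings α1, α2, β1, β2 are drawn with angles ϕ between them]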
Fig. 1.1. Basic experimental setup for testing Bell’s inequality; the choices of the
directions of polarization on the Bloch sphere for optimal violation of the CHSH
inequality (1.3) correspond to ϕ = π/4 for spin-1/2 systems
1 From the Foundations of Quantum Theory to Quantum Technology
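\[ \hat{a}_x \hat{b}_y \hat{c}_y |\psi_{GHZ}\rangle = \hat{a}_y \hat{b}_x \hat{c}_y |\psi_{GHZ}\rangle = \hat{a}_y \hat{b}_y \hat{c}_x |\psi_{GHZ}\rangle = |\psi_{GHZ}\rangle . \tag{1.9} \]
Therefore the quantum mechanical result for the product of (1.8) is given by
\[ R_{QM} |\psi_{GHZ}\rangle = (\hat{a}_x \hat{b}_x \hat{c}_x)(\hat{a}_x \hat{b}_y \hat{c}_y)(\hat{a}_y \hat{b}_x \hat{c}_y)(\hat{a}_y \hat{b}_y \hat{c}_x) |\psi_{GHZ}\rangle = (-1) |\psi_{GHZ}\rangle \tag{1.10} \]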
and contradicts the corresponding result of an LRT. These peculiar quantum
mechanical predictions have recently been observed experimentally [38]. The
entanglement inherent in these states offers interesting perspectives on the
possibility of distributing quantum information between three parties [39].
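The sign in (1.10) can be checked numerically. The following is a minimal sketch rather than part of the original argument: it assumes that the hatted spin components are represented by the standard Pauli matrices, and the helper names (`kron`, `matmul`, `triple`, `lrt_products`) are illustrative only.

```python
from itertools import product

# Standard Pauli matrices (assumed representation of the spin components).
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def triple(a, b, c):
    """Three-particle operator a (x) b (x) c."""
    return kron(kron(a, b), c)

def ghz_operator_product():
    """Operator product of the four terms appearing in (1.10)."""
    terms = [triple(X, X, X), triple(X, Y, Y), triple(Y, X, Y), triple(Y, Y, X)]
    result = terms[0]
    for term in terms[1:]:
        result = matmul(result, term)
    return result

def lrt_products():
    """The same product when every observable carries a fixed value +1 or -1."""
    return {ax * bx * cx * ax * by * cy * ay * bx * cy * ay * by * cx
            for ax, ay, bx, by, cx, cy in product((1, -1), repeat=6)}

R = ghz_operator_product()
is_minus_identity = all(R[i][j] == (-1 if i == j else 0)
                        for i in range(8) for j in range(8))
print(is_minus_identity, lrt_products())
```

Under these assumptions the operator product equals minus the identity on every three-particle state, whereas each of the 64 assignments of fixed values ±1 gives the product +1, since every value enters twice.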
$|0\rangle$ and $|1\rangle$ are orthogonal, i.e. $\langle 0|1\rangle = 0$, or if $\langle 0|1\rangle = 1 = \langle a_0|a_1\rangle$. Both possibilities
contradict the original assumption of nonorthogonal, nonidentical
initial states. Therefore a quantum process capable of copying nonorthogo-
nal quantum states is impossible. This is an early example of an impossible
quantum process.
Soon afterwards, Bennett and Brassard [42] proposed the first quan-
tum protocol (BB84) for secure transmission of a random, secret key using
nonorthogonal states of polarized photons for the encoding (see Table 1.1).
In the Vernam cipher, such a secret key is used for encoding and decoding
messages safely [6, 43]. In this latter encoding procedure the message and
the secret key are added bit by bit, and in the decoding procedure they are
subtracted again. If the random key is secret, the safety of this protocol is
guaranteed provided the key is used only once, has the same length as the
message and is truly random [44]. Nonorthogonal quantum states can help in
transmitting such a random, secret key safely. For this purpose A(lice) sends
photons to B(ob) which are polarized randomly either horizontally (+1) or
vertically (−1) along two directions of polarization. It is convenient to choose
the magnitude of the angle between these two directions of polarization to be
π/8. B(ob) also chooses his polarizers randomly to be polarized along these
directions. After A(lice) has sent all photons to B(ob), both communicate to
each other their choices of directions of polarization over a public channel.
However, the sent or measured polarizations of the photons are kept secret.
Whenever they chose the same direction (yes), their measured polarizations
are correlated perfectly and they keep the corresponding measured results
for their secret key. The other measurement results (no) cannot be used for
the key. Provided the transmission channel is ideal, A(lice) and B(ob) can
use part of the key for detecting a possible eavesdropper because in this case
some of the measurements are not correlated perfectly. In practice, however,
the transmission channel is not perfect and A(lice) and B(ob) have to process
their raw key further to extract from it a secret key [45].
Table 1.1. Part of a possible idealized protocol for transmitting a secret key,
according to [42]
It took some more years to realize that an exchange of secret keys can be achieved with the
help of entangled quantum states [46]. Thereby, the characteristic quantum
correlations of entangled states and the very fact that they are incompat-
ible with any LRT can be used for ensuring security of the key exchange.
After the first proof-of-principle experiments [47, 48], the first practical im-
plementation of quantum cryptography over a distance of about 1 km was
realized at the University of Geneva using single, polarized photons trans-
mitted through an optical fiber [49]. These developments launched the whole
new field of quantum cryptography. Now, this field represents the most devel-
oped part of quantum information processing. Quantum cryptography based
on the BB84 protocol has already been realized over a distance of 23 km
[50]. Recent experiments [30, 31] have demonstrated that photon pairs can
also be entangled over large distances, so that entanglement-based quantum
cryptography over such large distances might become accessible soon. Some
of these experiments are discussed in Chap. 3.
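The sifting step of the BB84 protocol and the subsequent use of the key in a Vernam cipher, both described above, can be sketched as follows. This is an illustration under idealized assumptions that are not all in the text: a perfect channel, no eavesdropper, and Python's pseudo-random generator standing in for a truly random source. The function names `bb84_sifted_keys` and `vernam` are hypothetical.

```python
import random

def bb84_sifted_keys(key_length, rng):
    """Idealized BB84 sifting: perfect channel and no eavesdropper (assumption)."""
    alice_key, bob_key = [], []
    while len(alice_key) < key_length:
        alice_direction = rng.randrange(2)  # which of the two polarization directions
        alice_bit = rng.randrange(2)        # 0 stands for (+1), 1 for (-1)
        bob_direction = rng.randrange(2)    # Bob's independent random choice
        if bob_direction != alice_direction:
            continue                        # "no": the result cannot be used for the key
        # "yes": same direction, so Bob's result is perfectly correlated with Alice's.
        alice_key.append(alice_bit)
        bob_key.append(alice_bit)
    return alice_key, bob_key

def vernam(bits, key):
    """Add message and key bit by bit modulo 2; applying it twice restores the message."""
    if len(key) != len(bits):
        raise ValueError("the key must have the same length as the message")
    return [b ^ k for b, k in zip(bits, key)]

# A fixed seed only makes the sketch reproducible; a real key must be truly random.
rng = random.Random(2024)
message = [1, 0, 1, 1, 0, 0, 1, 0]
alice_key, bob_key = bb84_sifted_keys(len(message), rng)
ciphertext = vernam(message, alice_key)
decoded = vernam(ciphertext, bob_key)
print(decoded == message)
```

Because addition and subtraction coincide modulo 2, the same function serves for encoding and decoding. The sketch omits the eavesdropper test and the further processing of the raw key mentioned in the text.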
Simultaneously with these developments in quantum cryptography, nu-
merous other physical processes were discovered which were either enabled
by entanglement or in which entanglement led to an improvement of perfor-
mance. The most prominent examples are dense coding [51], entanglement-
assisted teleportation [10, 11, 52] and entanglement swapping [52, 53]. (These
processes are discussed in detail in Chaps. 2 and 3.) In the spirit of Feynman’s
suggestion, all these developments demonstrate that characteristic quantum
phenomena have practical applications in quantum information processing.
Let us first of all discuss briefly the classical complexity of this problem.
In order to answer the question in the worst possible case, the oracle has to be
queried more than $2^{n-1}$ times. It can happen, for example, that the first $2^{n-1}$
queries all give the same result, so that at least one more query of the oracle
is required to decide whether f is constant or balanced. Thus, classically, it
is apparent that the number of steps required grows exponentially with the
number of bits.
[Figure: the oracle $U_f$ maps $|x\rangle |a\rangle$ to $|x\rangle |a \oplus f(x)\rangle$]
Fig. 1.2. Basic operation of a quantum oracle $U_f$ which evaluates a Boolean
function $f : x \in \mathbb{Z}_2^n \to f(x) \in \mathbb{Z}_2^1 \equiv \{0, 1\}$; $|x\rangle$ is the input state of an
n-qubit quantum system; $|a\rangle$ is a one-qubit state and $\oplus$ denotes addition modulo 2
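1. The n-qubit quantum system and the ancilla system are prepared in
states $|0\rangle$ and $(|0\rangle - |1\rangle)/\sqrt{2}$. Then a Hadamard transformation
\[ H : |0\rangle \to \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle) , \qquad |1\rangle \to \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle) \tag{1.12} \]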
Taking into account the single application of the quantum oracle in step 2
and the application of the Hadamard transformations in the preparation and
measurement processes, Deutsch’s quantum algorithm requires O(n) steps to
obtain the final answer, in contrast to any classical algorithm, which needs
an exponential number of steps. Thus Deutsch’s quantum algorithm leads to
an exponential speedup.
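The contrast between the classical and the quantum query count can be illustrated with a small classical simulation. It is a sketch only, and it assumes the standard form of the algorithm, which the surviving text does not reproduce in full: the ancilla state turns the oracle into a phase factor $(-1)^{f(x)}$, and a final Hadamard transformation followed by a measurement of the first n qubits yields $|0 \ldots 0\rangle$ with certainty exactly when f is constant. The function names are hypothetical, and the simulation itself takes a time exponential in n.

```python
from itertools import product

def classical_worst_case_queries(n):
    """Oracle queries a deterministic classical strategy may need: 2^(n-1) + 1."""
    return 2 ** (n - 1) + 1

def is_constant_quantum(f, n):
    """Simulate the single-query quantum test for a constant or balanced f.

    After the oracle the first n qubits are in the state
    2^(-n/2) * sum_x (-1)^f(x) |x>; the final Hadamards put the amplitude
    2^(-n) * sum_x (-1)^f(x) on |0...0>, which is +-1 for a constant f
    and 0 for a balanced f.
    """
    inputs = list(product((0, 1), repeat=n))
    amplitude = sum((-1) ** f(x) for x in inputs) / len(inputs)
    return abs(amplitude) == 1

def constant_one(x):
    return 1

def first_bit(x):
    return x[0]          # balanced: half of all inputs start with 1

def parity(x):
    return sum(x) % 2    # balanced as well

n = 4
print(classical_worst_case_queries(n))
print(is_constant_quantum(constant_one, n),
      is_constant_quantum(first_bit, n),
      is_constant_quantum(parity, n))
```

The test is only meaningful under the promise that f is either constant or balanced; for other functions the amplitude lies strictly between 0 and 1 in magnitude and the sketch simply returns False.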
A key element of this quantum algorithm and of those discovered later is
the quantum parallelism involved in step 2, where the linear superposition
of the first n qubits comprises the requested global information about the
function f . For most of the possible functions f this intermediate quantum
state is expected to be entangled. An exception is the case of a constant function
$f$, for which the quantum state $|\psi_2\rangle$ is separable. Furthermore, it is also
crucial for the success of this quantum algorithm that the final measurement
in step 3 yielding the required answer can be implemented by a fast quantum
measurement whose complexity is polynomial in n. This is a requirement
fulfilled by all other known fast quantum algorithms. The quantum algo-
rithm described above was the first example demonstrating that quantum
phenomena may speed up computations in such a way that an exponential
gap appears between the complexity class of the quantum problem and the
complexity class of the corresponding classical probabilistic problem.
Continuing this development initiated by Deutsch, other, new fast quan-
tum algorithms were discovered in the subsequent years. The most prominent
examples are Simon’s quantum algorithm [57], Shor’s celebrated algorithm
[9] for factorizing numbers, and Grover’s search algorithm [58]. (Quantum al-
gorithms are discussed in detail in Chap. 4.) In addition, possible realizations
of quantum computing devices were suggested which were based on trapped
ions [59] and on cavity quantum electrodynamical setups [60]. These devel-
opments called for new methods for stabilizing quantum algorithms against
perturbing environmental influences, which tend to destroy quantum inter-
ference and quantum entanglement [61]. This led to the development of the
first error-correcting codes [62, 63, 64, 65, 66] by adaptation of classical error-
correcting techniques to the quantum domain. An introduction to the theory
of quantum error correction is presented in Chap. 4.
been demonstrated with photons [10, 11, 49, 70]. Realizations of elementary
quantum logical operations have been based on trapped ions [13, 14] and on
nuclear magnetic resonance [15]. Recent experiments indicate that besides
cavity quantum electrodynamical setups [16], trapped neutral atoms which
are guided along magnetic wires (atom chips) might also be useful for quan-
tum information processing [17]. There have also been theoretical proposals
on using ultracold atoms in optical lattices [18, 19], on ions in an array of
microtraps [20] and on solid-state devices [21, 22, 23] for the implementation
of quantum logical gates.
By now, quantum information processing has become an interdisciplinary
subject which attracts not only physicists but also researchers from other
communities. The common interest is the practical, technologically oriented
application of characteristic quantum phenomena. At this stage of develop-
ment, it appears necessary to examine recent achievements and to emphasize
the underlying, general, basic concepts, which have been developing gradu-
ally and which are now commonly adopted by all researchers in this field.
This is one of the main intentions of the rest of the book.
In Chap. 2, Werner introduces the basic concepts of quantum information
theory and describes the fundamental mathematical structures underlying re-
cent and current developments. In particular, this chapter addresses a natural
question appearing in connection with Feynman’s suggestion, namely what
can be done with the help of quantum systems and what cannot be done. A
first example of an impossible quantum process, the copying of nonorthogonal
quantum states, has already been mentioned. Other examples of possible and
impossible quantum processes are discussed in detail in this contribution.
First experimental realizations of basic quantum communication schemes
based on entangled photon pairs are discussed in Chap. 3 by Weinfurter and
Zeilinger. These first experiments on entanglement-based quantum cryptog-
raphy, dense coding and quantum teleportation demonstrate the important
role photons play in current experiments. Furthermore, these experiments
also emphasize once again the fundamental significance of entanglement for
quantum information processing.
The basic theoretical concepts of quantum computation and the mathe-
matical structure underlying quantum algorithms are discussed in Chap. 4
by Beth and Rötteler. In particular, it is demonstrated how recent results in
the theory of signal processing can be used for the development of new fast
quantum algorithms. A short introduction to the theory of quantum error
correction is also presented.
A comprehensive account of the mathematical structure of entanglement
and of the significance of mixed entangled states for quantum information
processing is presented in Chap. 5 by M. Horodecki, P. Horodecki and R.
Horodecki. One of the most surprising recent developments in this context
has been the discovery of bound entanglement [71]. Though much is still
unknown, this section gives a state-of-the-art presentation of what is known
about this new form of entanglement and its implications for processing quan-
tum information.
2 Quantum Information Theory
– an Invitation
Reinhard F. Werner
2.1 Introduction
part, Sect. 2.6, then gives a description of the mathematical structures and
of some of the tools needed to develop the theory.
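[Figure: a measuring device M takes a quantum system as input and passes classical information to a preparing device P, which outputs a quantum system]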
Fig. 2.1. Classical teleportation. Here and in the following diagrams, a wavy arrow
stands for quantum systems, and a straight arrow for the flow of classical informa-
tion
commonplace operation. For example, one can encode one classical bit in
the polarization degree of freedom of a photon (clearly a quantum system),
by choosing one of two orthogonal polarizations for the photon, depending
on the value of the classical bit. The readout is done by a photomultiplier
combined with a polarization filter in one of the corresponding directions. In
principle, this allows a perfect transmission. In some sense every transmis-
sion of classical information is of this kind, because every physical system
ultimately obeys the laws of quantum mechanics, even if we can often dis-
regard this fact and treat it classically. Hence classical information can be
translated into quantum information (and back).
But what about the converse? This hypothetical (and in fact, impossi-
ble) process has come to be known as classical teleportation (see Fig. 2.1). It
would involve a measuring device M, operating on some input quantum sys-
tems. The results of the measurements are subsequently fed into a preparing
device P, which produces the final output of the combined device. The task
is to set things up such that the outputs of the combined device are indistin-
guishable from the quantum inputs. Of course, we have to say precisely what
“indistinguishable” should mean. Clearly, this cannot mean that “the same”
system comes out at the other end. In the classical case this is not demanded
either. What can only be meant in quantum mechanics is that no statistical
test will see the difference. In other words, no matter what the preparation
of the input systems is and no matter what observable we measure on the
outputs of the teleportation device, we shall always get the same probability
distribution of results as if the inputs had been directly measured. Note also
that this criterion does not involve the states of individual systems, but only
states in the form of the distribution parameters of ensembles of identically
prepared systems.
The impossibility of classical teleportation will be treated extensively in
the following section, where it is related to a hierarchy of impossible machines.
For a mathematical statement of this impossibility in the standard
formalism of quantum mechanics, see the remark after (2.7). For the moment,
however, let us take it for granted, and see what all this says about the new
concept of quantum information.
First of all, we are concerned here with problems of transmission, not with
content or meaning. This is exactly the same as in classical information the-
ory. There, too, it is often not easy to avoid confusion with a different concept
[Figure: the measuring device M of a teleportation line feeds two copies of the preparing device P; the combination acts as a copier C]
Fig. 2.2. Making a copier from a “classical teleportation” line
This is the task of combining two separate measuring devices into a single
device, or the “simultaneous measurement” of two quantum observables A
and B. Thus, a joint measuring device “A&B” is a device giving a pair (a, b)
of classical outputs each time it is operated, such that a is a possible output
of A, and b is a possible output of B. (We use the symbol A to denote both an
observable and a device that measures this observable, and similarly for B.) We
require that the statistics of the a outcomes alone are the same as for device
A, and similarly for B. Note that once again our criterion is statistical, and
can be tested without recourse to counterfactual conditionals such as “the
result which would have resulted if B rather than A had been measured on
this particular quantum particle”.
Many quantum observables are not jointly measurable in this sense. The
most famous examples, position and momentum, different components of
angular momentum, and positions of a free particle at different times, are
probably contained in every quantum mechanics course. Hence the impossi-
bility of joint measurements is nothing but a precise statement of an aspect
of “complementarity”.
Nevertheless, a joint measurement device for any of these could readily
be constructed given a functioning quantum copier (see Fig. 2.3): one would
simply run the copier C on the quantum system, and then apply the two given
measuring devices, A and B, to the copies. It is easy to see that the definition
of the copier then guarantees that the statistics of a and b separately come
out right. In other words, a copier can be seen as a universal joint measuring
device.
This is not named after a certain phone company, but after John S. Bell,
who never proposed it in this form, but might have. It refers to a project of
we shall denote the correlation coefficient, which lies between −1 and +1.
The combination
and is borne out by all known experimental data. Now suppose Bob has a
joint measuring device for his B1 and B2 , which we shall denote by B1 &B2 ,
which produces pair outcomes (b1 , b2 ) (see Fig. 2.4). We can then determine
each for i = 1, 2. The basic rule for the information transmission is the
following:
Alice encodes the bit she wants to send by choosing either apparatus
A1 or apparatus A2 . Then Bob looks at his readout and interprets it
as “A1 ” whenever the two displays coincide (b1 = b2 ), and as “A2 ”
if they are different.
We can then estimate the probability $p_{\mathrm{ok}}$ for Bob to be right, assuming
that the choices A1 and A2 are made with the same frequency. Assume first
that Alice chooses A1 . Then Bob is right with probability
\[ \sum_{a_1, b_1, b_2} \left| \frac{b_1 + b_2}{2} \right| \, |a_1| \, p_1(a_1, b_1, b_2) , \]
where the first factor takes into account the condition b1 = b2 , and the second
is introduced for later convenience. Combining this with a second term of
similar kind for Alice’s choice A2 , and taking into account the probability
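1/2 for each of these choices, we obtain the overall probability $p_{\mathrm{ok}}$ for Bob to
be correct as
\begin{align}
p_{\mathrm{ok}} &= \frac{1}{2} \sum_{a_1, b_1, b_2} \left| \frac{b_1 + b_2}{2} \right| \, |a_1| \, p_1(a_1, b_1, b_2)
  + \frac{1}{2} \sum_{a_2, b_1, b_2} \left| \frac{b_1 - b_2}{2} \right| \, |a_2| \, p_2(a_2, b_1, b_2) \nonumber \\
&\geq \frac{1}{4} \sum_{a_1, b_1, b_2} (b_1 + b_2) \, a_1 \, p_1(a_1, b_1, b_2)
  + \frac{1}{4} \sum_{a_2, b_1, b_2} (b_1 - b_2) \, a_2 \, p_2(a_2, b_1, b_2) \nonumber \\
&= \frac{1}{4} \bigl[ C(A_1, B_1) + C(A_1, B_2) + C(A_2, B_1) - C(A_2, B_2) \bigr] \nonumber \\
&= \frac{\beta}{4} . \tag{2.4}
\end{align}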
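The combination of correlation coefficients on the right-hand side of (2.4) can be evaluated for a concrete case. The sketch below is an illustration under assumptions that are not taken from the text: the state is the maximally entangled state $(|00\rangle + |11\rangle)/\sqrt{2}$, each observable has the form $\cos\theta\,\sigma_z + \sin\theta\,\sigma_x$, and the four angles are one standard choice. The function names are hypothetical.

```python
import math
from itertools import product

def observable(theta):
    """Observable cos(theta) sigma_z + sin(theta) sigma_x with outcomes +1 and -1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def correlation(a, b):
    """C(A, B) = <Phi| A (x) B |Phi> for |Phi> = (|00> + |11>)/sqrt(2)."""
    A, B = observable(a), observable(b)
    # Only |00> and |11> carry amplitude, so <ii| A (x) B |kk> = A[i][k] * B[i][k].
    return 0.5 * sum(A[i][k] * B[i][k] for i in (0, 1) for k in (0, 1))

def chsh(a1, a2, b1, b2):
    """The combination of correlation coefficients appearing in (2.4)."""
    return (correlation(a1, b1) + correlation(a1, b2)
            + correlation(a2, b1) - correlation(a2, b2))

def local_maximum():
    """Largest value of the same combination for fixed outcomes +1 or -1."""
    return max(a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2
               for a1, a2, b1, b2 in product((1, -1), repeat=4))

beta = chsh(0.0, math.pi / 2, math.pi / 4, -math.pi / 4)
print(local_maximum())   # fixed +-1 assignments never exceed 2
print(beta, beta / 4)    # quantum value and the resulting lower bound on p_ok
```

With these assumed settings the combination comes out close to $2\sqrt{2}$, so the lower bound $\beta/4$ exceeds 1/2, while fixed ±1 assignments reach at most 2.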
Bob is right with a better probability than chance if $p_{\mathrm{ok}} > 1/2$, which, by this
computation, can be guaranteed if β > 2, i.e. if the classical Bell inequality
(in Clauser–Horne–Shimony–Holt form [72]) is violated. But this is indeed the
were available (to Alice in this case): then we could say that the two decom-
positions were just the first step in an even finer decomposition, a further
reduction of ignorance, which would be brought to light if Alice were to ap-
ply her joint measurement. Presumably the mixed-state analyzer would then
yield this finer decomposition, because the operation of this device would not
depend on how closely Alice cared to look at her particles.
But just as two quantum observables are often not jointly measurable, two
decompositions of mixed states often have no common refinement (actually,
in the formalism of quantum theory, these are two variants of the same the-
orem). In particular, the two decompositions belonging to Alice’s choices in
an experiment demonstrating a violation of Bell’s inequalities have no com-
mon refinement, and any mixed-state analyzer could be used for superluminal
communication in this situation.
Another device, which is suggested by the individual-state interpretation,
arises from a naive extrapolation of this view to the parts of a composite
system: if every single system could be assigned a pure state, a composite
system could be assigned a pair of pure states, one for each subsystem. A
correlated state should therefore be given by a probability distribution of
such pairs. A device which represented an arbitrary state of a composite
system as a mixture of uncorrelated pure product states might be called a
correlation resolver. It could be built given a classical teleportation line: when
one applies teleportation to one of the subsystems and applies conditions
on the classical measurement results of the intermediate stage, one obtains
precisely a representation of an arbitrary state in this form. But it is easy to
see that any state which can be so analyzed automatically satisfies all Bell-
type inequalities, and hence once again the experimental violations of Bell’s
inequalities show that such a correlation resolver cannot exist. Hence we
have here a second line of reasoning in favor of the no-teleportation theorem:
a teleportation device would allow classical correlation resolution, which is
shown to be impossible by the Bell experiments.
The distinction between resolvable states and their complement is one
of the starting points of entanglement theory, where the “resolvable” states
are called “separable”, or “classically correlated”, and all others are called
“entangled”. For a more detailed treatment and an up-to-date overview, the
reader is referred to Chap. 5.
Without going into philosophical discussions about the foundations of
quantum mechanics, I should like to comment briefly on the individual-state
interpretation, which has suggested the two impossible machines discussed in
this subsection. First, this view is not at all uncommon, and it is quite possi-
ble to read some passages from the masters of the Copenhagen interpretation
as an endorsement of this view. Secondly, if we define a hidden-variable theory
as a theory in which individual systems are described by classical parame-
ters, whose distribution is responsible for the randomness seen in quantum
experiments, we have no choice but to call the individual-state interpreta-
The no-teleportation theorem derived in the previous section says that there
is no way to measure a quantum state in such a way that the measuring
results suffice to reconstruct the state. At first sight this seems to deny that
the notion of “quantum states” has an operational meaning at all. But there is
no contradiction, and we shall resolve the apparent conflict in this subsection,
if only to sharpen the statement of the no-teleportation theorem.
Let us recall the operational definition of quantum states, according to
the statistical interpretation of quantum mechanics. A state is a description
of a way of preparing quantum systems, and in all its aspects it is related to
computing expectation values. We might also say that it is the assignment of
an expectation value to every observable of the system. So to the extent that
expectation values can be measured, it is possible to determine the state by
testing it on sufficiently many observables. What is crucial, however, is that
even the determination of a single expectation value is a statistical measure-
ment. Hence such a determination requires a repetition of the experiment
many times, using many systems prepared according to the same procedure.
In contrast, the above description of teleportation demands that it works
with a single quantum system as input, and that the measuring device does
not accumulate results from several input systems. Expressed in the current
jargon, teleportation is required to be a one-shot operation. Note that this
does not contradict our statistical criteria for the success of teleportation and
of other devices, which involve the statistics of independent “single shots”.
If we have available many identically prepared systems, many operations
which are otherwise impossible become easy. Let us begin with classical tele-
portation. Its multiinput analogue is the state estimation problem: how can
we design a measurement operating on samples of many (say, N ) systems
from the same preparing device, such that the measurement result in each
case is a collection of classical parameters forming a Hermitian matrix which
tors with negative eigenvalues. So the reversal of noise is not possible with a
one-shot device, but is easy to perform to high accuracy when many equally
prepared inputs are available. In the simplest case of a so-called depolarizing
channel, this problem is well understood [78]; it is also well understood in the
version requiring many outputs, as in the optimal-cloning problem [79].
This is arguably the first major discovery in the field of quantum informa-
tion. The no-cloning and no-teleportation theorems, although they had not
been formulated in such terms, would hardly have come as a surprise to peo-
ple working on the foundations of quantum mechanics in the 1960s, say. But
entanglement assistance was really an unexpected turn. It was first seen by
Bennett et al. [52], who also coined the term “teleportation”. It is gratifying
to see, though it is hardly a surprise on the same scale, that this prediction of
quantum mechanics has also been implemented experimentally. The experi-
ments are another interesting story, which will no doubt be told much better
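[Figure: Alice's measuring device M receives 1 qubit and sends 2 classical bits to Bob's preparing device P, which outputs 1 qubit; a source S supplies the 1 ebit shared by M and P]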
Fig. 2.6. Entanglement-assisted teleportation
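The resource count in Fig. 2.6 (one qubit in, two classical bits, one ebit, one qubit out) can be made concrete with a small state-vector sketch. It assumes the standard protocol of Bennett et al. [52], whose details are not spelled out in the surviving text: Alice measures her two qubits in the Bell basis, and Bob applies a Pauli correction selected by the two bits. The labelling of the outcomes by the bits `(flip, sign)` and the function name `teleport` are illustrative choices.

```python
import math

S = 1 / math.sqrt(2)

# Bell basis for Alice's two qubits; BELL[bits][q0][q1] is the amplitude of |q0 q1>.
# The key is the pair of classical bits (flip, sign) that Alice sends to Bob.
BELL = {
    (0, 0): [[S, 0], [0, S]],    # (|00> + |11>)/sqrt(2)
    (0, 1): [[S, 0], [0, -S]],   # (|00> - |11>)/sqrt(2)
    (1, 0): [[0, S], [S, 0]],    # (|01> + |10>)/sqrt(2)
    (1, 1): [[0, S], [-S, 0]],   # (|01> - |10>)/sqrt(2)
}

def teleport(psi):
    """Map each two-bit outcome to (probability, Bob's state after correction).

    psi = [alpha, beta] is the unknown input qubit; Alice and Bob share the
    ebit (|00> + |11>)/sqrt(2), which is used up by the protocol.
    """
    results = {}
    for bits, bell in BELL.items():
        # Project (input qubit, Alice's half of the ebit) onto the Bell state.
        # The ebit forces Bob's qubit to equal Alice's half, hence the index q2.
        bob = [sum(bell[q0][q2] * psi[q0] for q0 in (0, 1)) * S for q2 in (0, 1)]
        probability = sum(abs(c) ** 2 for c in bob)
        bob = [c / math.sqrt(probability) for c in bob]
        flip, sign = bits
        if flip:                  # Pauli X correction
            bob = [bob[1], bob[0]]
        if sign:                  # Pauli Z correction
            bob = [bob[0], -bob[1]]
        results[bits] = (probability, bob)
    return results

psi = [0.6, 0.8j]
outcomes = teleport(psi)
for bits, (probability, bob) in outcomes.items():
    print(bits, round(probability, 3), bob)
```

In this sketch all four outcomes occur with probability 1/4 regardless of the input, so the two classical bits by themselves carry no information about the input state, and Bob recovers the input in every case.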
will have very many degrees of freedom, of which only very few are singled
out as the “qubits” on which the quantum computation is performed. Hence
it is necessary to analyze to what degree and on what timescales it is justi-
fied to treat the qubit degrees of freedom separately, and with what errors
the desired quantum operation can be realized in the given system. These
questions are crucial for the realization of any quantum devices, and require
specialized in-depth knowledge of the appropriate theory, e.g. quantum op-
tics, solid-state theory or quantum chemistry (in the case of NMR quantum
computing). However, these problems are not what we want to look at in this
chapter. The other way in which theoretical physics contributes to the field of
quantum information processing is in the form of another kind of theoretical
work, which could be called the “abstract quantum theory of information”.
Recall the arguments in Sect. 2.2, where the possibility of translating be-
tween different carriers of (classical) information was taken as the justifica-
tion for looking at an abstracted version, the classical theory of information,
as founded by Shannon. While it is true that quantum information cannot
be translated into this framework, and is hence a new kind of information,
translation is often possible (at least in principle) between different carriers
of quantum information. Therefore, we can make a similar abstraction in the
quantum case. To this abstract theory all qubits are the same, whether they
are realized as polarizations of photons, nuclear spins, excited states of ions in
a trap, modes of a cavity electromagnetic field or whatever other realization
may be feasible. A large amount of work is currently being devoted to this
abstract branch of quantum information theory, so I shall list some of the
reasons for this effort.
So what will be the basic concepts and features of the emerging quantum
theory of information? The information-theoretical perspective typically gen-
erates questions like
How can a given task of quantum information processing be performed
optimally with the given resources?
We have already seen a few typical tasks of quantum information process-
ing in the previous section and, of course, there are more. Typical resources
required for cryptography, quantum teleportation and dense coding are en-
tangled states, quantum channels and classical channels. In error correction
and computing tasks, the resources are the size of the quantum memory
and the number of quantum operations. Hence all these notions take on a
quantitative meaning.
For example, in entanglement-assisted teleportation the entangled pairs
are used up (one maximally entangled qubit pair is needed for every qubit
teleported). If we try to run this process with less than maximally entangled
states, we may still ask how many pairs from a given preparation device are
needed per qubit to teleport a message of many qubits, say, with an error less
than ε. This quantity is clearly a measure of entanglement. But other tasks
may lead to different quantitative measures of entanglement. Very often it is
possible to find inequalities between different measures of entanglement, and
establishing these inequalities is again a task of quantum information theory.
that $X$ is a finite set, and the algebra of observables $\mathcal{A}$ will be $\mathcal{C}(X)$, the space
of all functions $f : X \to \mathbb{C}$. A single classical bit corresponds to the choice
$X = \{0, 1\}$. On the other hand, a purely quantum system is determined by the
choice $\mathcal{A} = \mathcal{B}(\mathcal{H})$, the algebra of all bounded linear operators on the Hilbert
space $\mathcal{H}$. The finiteness assumption requires that $\mathcal{H}$ has a finite dimension
$d$, so $\mathcal{A}$ is just the space $\mathcal{M}_d$ of complex $d \times d$ matrices. A qubit is given by
$\mathcal{A} = \mathcal{M}_2$.
The basic statistical interpretation of the algebra of observables is the
same in the quantum and classical cases, and hinges on the cone of positive
elements in the algebra. Here $Y$ is called positive (in symbols, $Y \geq 0$) if it
can be written in the form $Y = X^* X$. Then $Y \in \mathcal{M}_d$ is positive exactly
if it is given by a positive semidefinite matrix, and $f \in \mathcal{C}(X)$ is positive iff
$f(x) \geq 0$ for all $x$. In any algebra of observables $\mathcal{A}$, we shall denote by $\mathbb{1} \in \mathcal{A}$
the identity element.
A state $\Phi$ on $\mathcal{A}$ is a positive normalized linear functional on $\mathcal{A}$. That is,
$\Phi : \mathcal{A} \to \mathbb{C}$ is linear, with $\Phi(X^* X) \geq 0$ and $\Phi(\mathbb{1}) = 1$. Each state describes
a way of preparing systems, in all the details that are relevant to subsequent
statistical measurements on the systems. The measurements are described by
assigning to each outcome from a device an effect $F \in \mathcal{A}$, i.e. an element with
$0 \leq F \leq \mathbb{1}$. The prediction of the theory for the probability of that outcome,
measured on systems prepared according to the state $\rho$, is then $\rho(F)$.
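The rule "probability = state applied to effect" can be illustrated for one classical bit and one qubit. The sketch below is not from the original; for the quantum case it uses the trace form of a state, $\rho(F) = \mathrm{tr}(\hat\rho F)$ with a density matrix $\hat\rho$, which the text introduces in the next paragraph. The function names and the example state and effects are assumptions.

```python
def classical_probability(p, effect):
    """p(f) = sum_x p_x f(x) for a distribution p and an effect f with 0 <= f(x) <= 1."""
    return sum(p[x] * effect[x] for x in p)

def quantum_probability(rho, effect):
    """tr(rho F) for a d x d density matrix rho and an effect F with 0 <= F <= 1."""
    d = len(rho)
    return sum(rho[i][k] * effect[k][i] for i in range(d) for k in range(d))

def complement(effect):
    """The effect 1 - F, i.e. 'the outcome described by F did not occur'."""
    d = len(effect)
    return [[(1 if i == k else 0) - effect[i][k] for k in range(d)]
            for i in range(d)]

# One classical bit: X = {0, 1}, with an effect that responds only to x = 1.
p = {0: 0.25, 1: 0.75}
responds_to_one = {0: 0.0, 1: 1.0}

# One qubit: density matrix of (|0> + |1>)/sqrt(2) and the effect |0><0|.
rho_plus = [[0.5, 0.5], [0.5, 0.5]]
F0 = [[1, 0], [0, 0]]

print(classical_probability(p, responds_to_one))
print(quantum_probability(rho_plus, F0),
      quantum_probability(rho_plus, complement(F0)))
```

Normalization of the state shows up as the fact that the probabilities of an effect and of its complement add up to one.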
For explicit computations we shall often need to expand states and ele-
ments of A in a basis. The standard basis in C(X) consists of the functions
ex , x ∈ X, such that ex (y) = 1 for x = y and zero otherwise. Similarly,
if φµ ∈ H is an orthonormal basis of the Hilbert space of a quantum sys-
tem, we denote by eµν = |eµ eν | ∈ B(H) the corresponding “matrix units”.
Then a state p on the classical algebra C(X) is characterized by the numbers
px ≡ p(ex ), which form a probability distribution on X, i.e. p(x) ≥ 0 and
x p(x) = 1. Similarly, a quantum state ρ on B(H) is given by the numbers
ρµν ≡ ρ(eνµ ), which form the so-called density matrix. If we interpret them
as the expansion coefficients of an operator ρ = µν ρµν eµν , the density
operator of ρ, we can also write ρ(A) = tr( ρA).
A state is called pure if it is extremal in the convex set of all states, i.e.
if it cannot be written as a convex combination λρ + (1 − λ)ρ of other
states. These are the states which contain as little randomness as possible.
In the classical case, the only pure states are those concentrated on a single
point z ∈ X, i.e. pz = 1, or p(f ) = f (z). The pure states in the quantum
case are determined by “wave vectors” ψ ∈ H such that ρ(A) = ψ, Aψ,
and ρ = |ψψ|. Thus, in the simplest case of a classical bit, there are just
two extreme points, whereas in the case of a qubit the extreme points form
a sphere in three dimensions and are given by the expectations of the three
36 Reinhard F. Werner
Fig. 2.7. State spaces as convex sets: left, one classical bit; right, one quantum bit
(qubit)
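Pauli matrices:
\[
  \hat\rho = \frac{1}{2}
  \begin{pmatrix} 1 + x_3 & x_1 - i x_2 \\ x_1 + i x_2 & 1 - x_3 \end{pmatrix}
  = \frac{1}{2}\,(\mathbb{1} + \vec\sigma \cdot \vec x) ,
  \qquad x_k = \rho(\sigma_k) . \tag{2.5}
\]
Then positivity requires |x| ≤ 1, with equality when ρ is pure. This is shown
in Fig. 2.7.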
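The correspondence (2.5) between qubit density matrices and points of the unit ball can be checked numerically. The following Python sketch is only an illustration under our own assumptions: the function names are hypothetical, and nested lists stand in for 2 × 2 matrices. It builds the density matrix from a vector x, recovers x_k as the expectation of σ_k, and uses the eigenvalues (1 ± |x|)/2 to show that positivity fails as soon as |x| > 1.

```python
import math

# Pauli matrices sigma_1, sigma_2, sigma_3 as nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def density_from_bloch(x):
    """Return (1/2)(1 + sigma . x) as a 2x2 matrix, as in Eq. (2.5)."""
    x1, x2, x3 = x
    return [[(1 + x3) / 2, (x1 - 1j * x2) / 2],
            [(x1 + 1j * x2) / 2, (1 - x3) / 2]]


def bloch_from_density(rho):
    """Recover x_k = tr(rho sigma_k) for k = 1, 2, 3."""
    return [sum(rho[i][j] * s[j][i] for i in range(2) for j in range(2)).real
            for s in SIGMA]


def eigenvalues(rho):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return ((tr - disc) / 2, (tr + disc) / 2)


# A vector on the sphere gives a pure state (one eigenvalue vanishes) ...
pure = density_from_bloch([0.6, 0.0, 0.8])
# ... while a vector outside the ball gives a matrix that is not positive.
invalid = density_from_bloch([0.9, 0.9, 0.0])
print(eigenvalues(pure), eigenvalues(invalid))
```

The smaller eigenvalue is (1 − |x|)/2, so it changes sign exactly on the sphere |x| = 1, which is where the pure states sit.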
Thus, in addition to the north pole |1⟩ and the south pole |0⟩, which
roughly correspond to the extremal states of the classical bit, we have their
coherent superpositions corresponding to the wave vectors α|1⟩ + β|0⟩, where
α, β ∈ C, and |α|² + |β|² = 1. This additional freedom becomes even more
dramatic in higher-dimensional systems, and is crucial for the possibility of
entanglement.
Entanglement is a property of states of composite systems, so we must
introduce the notion of composition of systems. We shall define this in a
way which applies to classical and quantum systems alike. If A and B are
the algebras of observables of the subsystems, the algebra of observables of
the composition is defined to be the tensor product A ⊗ B. In the finite-
dimensional case, which is our main concern, this is defined as the space of
linear combinations of elements that can be written as A ⊗ B with A ∈ A
and B ∈ B, such that A ⊗ B is linear in A and linear in B. The algebraic
operations are defined by (A ⊗ B)* = A* ⊗ B*, and (A_1 ⊗ B_1)(A_2 ⊗ B_2) =
(A_1A_2) ⊗ (B_1B_2). Thus 𝟙 = 𝟙_A ⊗ 𝟙_B. Since positivity is defined in terms of a
star operation (adjoint) and a product, these definitions also determine the
states and effects of the composite system.
Let us explore how this unifies the more common definitions in the clas-
sical and quantum cases. For two classical factors C(X) ⊗ C(Y ), a basis is
formed by the elements ex ⊗ ey , so the general element can be expanded as
\[ f = \sum_{x,y} f(x,y)\, e_x \otimes e_y , \]
2 Quantum Information Theory – an Invitation 37
and each element can be identified with a function on the Cartesian product
X × Y. Hence C(X) ⊗ C(Y) ≅ C(X × Y). Similarly, in the purely quantum
case, we can expand in matrix units and obtain quantities with four indices:
(A ⊗ B)_{µν,µ′ν′} = A_{µµ′} B_{νν′}. In a basis-free way, i.e. when A, B are considered
as operators on Hilbert spaces H_A, H_B, this is defined by the equation
\[ (A \otimes B)(\phi \otimes \psi) = (A\phi) \otimes (B\psi) , \]
where φ ∈ H_A and ψ ∈ H_B, and the tensor product of the Hilbert spaces is
formed in the usual way. Hence B(H_A) ⊗ B(H_B) ≅ B(H_A ⊗ H_B).
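In matrix terms the tensor product of operators is the Kronecker product, and the defining relations above can be verified directly. The following Python sketch is an illustration only; the helper names (`kron`, `kron_vec`, `matmul`, `matvec`) and the sample matrices are our own choices, and integer entries are used so that all comparisons are exact.

```python
def kron(a, b):
    """Kronecker product: entry ((mu, nu), (mu', nu')) equals a[mu][mu'] * b[nu][nu']."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def kron_vec(u, v):
    """Components of the product vector u (x) v, with the index of v running fastest."""
    return [ui * vj for ui in u for vj in v]


def matmul(a, b):
    """Ordinary matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    """Matrix applied to a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


A = [[1, 2], [3, 4]]
B = [[0, 1, 0], [2, 0, 1], [1, 1, 1]]
phi = [1, -1]
psi = [2, 0, 1]

# (A (x) B)(phi (x) psi) compared with (A phi) (x) (B psi)
lhs = matvec(kron(A, B), kron_vec(phi, psi))
rhs = kron_vec(matvec(A, phi), matvec(B, psi))
print(lhs == rhs)
```

The same helpers also confirm the product rule (A_1 ⊗ B_1)(A_2 ⊗ B_2) = (A_1A_2) ⊗ (B_1B_2) for any sample matrices of matching size.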
But the definition of a composition by a tensor product of algebras of ob-
servables also determines how a quantum–classical hybrid must be described.
Such systems occur frequently in quantum information theory, whenever a
combination of classical and quantum information is given. We shall approach
hybrids in two equivalent ways, which are also useful more generally. Suppose
we know only that the first subsystem is classical and make no assumptions
about the nature of the second, i.e. we want to characterize tensor products
of the form C(X) ⊗ B. Then every element can be expanded in the
form B = Σ_x e_x ⊗ B_x, where now B_x ∈ B. Clearly, the elements B_x determine
B, and hence we can identify the tensor product with the space
(sometimes denoted by C(X; B)) of B-valued functions on X with pointwise
algebraic operations. Similarly, suppose that we know only that B = M_d is
the algebra of d × d matrices. Then, expanding in matrix units, we find that
A = Σ_µν A_µν ⊗ e_µν with A_µν ∈ A. That is, we can identify A ⊗ M_d with
the space (sometimes denoted by Md (A)) of d × d matrices with entries from
A. By using the relation eµν eαβ = δνα eµβ , one can readily verify that the
product in A ⊗ Md indeed corresponds to the usual matrix multiplication
in Md (A), with due care given to the order of factors in products with ele-
ments from A, if A happens to be noncommutative. The adjoint is given by
(A∗ )µν = (Aνµ )∗ . Hence a hybrid algebra C(X) ⊗ Md can be viewed either
as the algebra of C(X)-valued d × d matrices or as the space of Md -valued
functions on X.
The physical interpretation of a composite system A⊗B in terms of states
and effects is straightforward. When F ∈ A and G ∈ B are effects, so is F ⊗G,
and this is interpreted as the joint measurement of F on the first subsystem
and of G on the second subsystem, where the “yes” outcome is taken as “both
effects give yes”. In particular, F ⊗ 𝟙_B corresponds to measuring F on the first
system, completely ignoring the second. Thus, for any state ρ on A ⊗ B we
define the restriction ρ_A of ρ to A by ρ_A(A) = ρ(A ⊗ 𝟙_B). In the classical case,
the probability density for ρA is obtained by integrating out the B variables.
In the quantum case, it corresponds to the partial trace of density matrices
with respect to HB . In general, it is not possible to reconstruct the state ρ
from the restrictions ρA and ρB , which is another way of saying that ρ also
describes correlations between the systems. However, given ρA and ρB , there
is always a state with these restrictions, namely the tensor product ρA ⊗ ρB ,
which corresponds to an independent preparation of the subsystems.
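For two qubits the restriction is the familiar partial trace, and a short computation shows how much information about ρ the restrictions discard. The following Python sketch is an illustration under our own assumptions: the function names are hypothetical, and the example state is the density matrix of the pure vector (|00⟩ + |11⟩)/√2, written out by hand.

```python
def restrict_to_A(rho, dA, dB):
    """Partial trace over B: (rho_A)[i][j] = sum_k rho[(i, k)][(j, k)]."""
    return [[sum(rho[i * dB + k][j * dB + k] for k in range(dB))
             for j in range(dA)] for i in range(dA)]


def restrict_to_B(rho, dA, dB):
    """Partial trace over A: (rho_B)[k][l] = sum_i rho[(i, k)][(i, l)]."""
    return [[sum(rho[i * dB + k][i * dB + l] for i in range(dA))
             for l in range(dB)] for k in range(dB)]


def kron(a, b):
    """Kronecker product, used here to form the product state rho_A (x) rho_B."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


# Density matrix of (|00> + |11>)/sqrt(2) in the basis 00, 01, 10, 11.
rho = [[0.5, 0, 0, 0.5],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0.5, 0, 0, 0.5]]

rho_A = restrict_to_A(rho, 2, 2)
rho_B = restrict_to_B(rho, 2, 2)
product = kron(rho_A, rho_B)

# The product state has the same restrictions as rho but is a different state.
print(restrict_to_A(product, 2, 2) == rho_A, product == rho)
```

Both restrictions come out as half the identity, so the product state is the completely mixed state on two qubits; the correlations present in ρ are invisible in ρ_A and ρ_B.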
Since A is arbitrary (e.g. A = |e_α⟩⟨e_β|), we may compare coefficients, and
obtain ⟨ψ_µ, ψ_ν⟩ = λ_µ δ_µν. Hence e_µ = λ_µ^{−1/2} ψ_µ is the desired orthonormal
system.
(2) The existence of the purification is evident if one defines Φ as above,
with the orthonormal system e_µ chosen in an arbitrary way. Then ρ_B =
Σ_µ λ_µ |e_µ⟩⟨e_µ|, and the above computation shows that choosing the basis e_µ
is the only freedom in this construction. But any two bases are linked by a
unitary transformation.
with states ρ_µ^A and ρ_µ^B on A and B, respectively, and weights λ_µ > 0.
Otherwise, ρ is called entangled.
2.6.2 Channels
In the “constructive” approach one allows only maps which can be built
from the basic operations of (1) tensoring with a second system in a speci-
fied state, (2) unitary transformation and (3) reduction to a subsystem. Let
us describe these and some other basic channels more formally, if only to
show the richness of this concept. We leave the verification of the channel
properties, including complete positivity, to the reader.
property of T is equivalent to
\[ F_x \in \mathcal{A} , \qquad F_x \ge 0 , \qquad \sum_x F_x = \mathbb{1}_{\mathcal{A}} . \]
ignore the measurement results, which gives the overall state change
T̄ = Σ_x T_x : B → A.
Note that this looks very similar to the conditions for an instrument, but
the normalization is different. An interesting special case is a “prepara-
tor”, for which A = C is trivial. This prepares B states that depend in
an arbitrary way on the classical input x.
The claim that every channel can be represented in the last two forms is
a direct consequence of the fundamental structural theorem for completely
positive maps, due to Stinespring [89]. We state it here in a version adapted
to pure quantum systems, containing no classical components.
Theorem 2.1. (Stinespring Theorem). Let T : Mn → Mm be a completely
positive linear map. Then there is a number ℓ and an operator V : C^m →
C^n ⊗ C^ℓ such that
Tx (X) = V ∗ (X ⊗ Fx )V.
create in this way. For example, if the channel is separable, the state will also
be separable.
Mathematically, the kind of relationship we shall describe here is very
reminiscent of the relationship between bilinear forms and linear operators:
an operator from an n-dimensional vector space to an m-dimensional vector
space is parametrized by an n × m matrix, just like a bilinear form with
arguments from an n-dimensional and an m-dimensional space. It is there-
fore hardly surprising that the matrix elements of a density operator on a
tensor product can be reorganized and reinterpreted as the matrix elements
of an operator between operator spaces. What is perhaps not so obvious,
however, is that the positivity conditions for states and for channels exactly
match up in this correspondence. This is the content of the following Lemma,
graphically represented in Fig. 2.9.
Fig. 2.9. The duality scheme of Lemma 2.2: an arbitrary preparation P is uniquely
represented as a preparation S of a pure state and the application of a channel T
to half of the system
ρ = σ ◦ (IH ⊗ T ) . (2.11)
In the definition of channel capacity, we shall have to use a criterion for the
approximation of one channel by another. Since channels are maps between
normed spaces, one obvious choice would be to use the standard norm
\[ \|S - T\| := \sup \bigl\{ \|S(A) - T(A)\| \;\big|\; \|A\| \le 1 \bigr\} . \tag{2.13} \]
called the norm of complete boundedness, or “cb norm” for short. This name
derives from the observation that on infinite-dimensional C* algebras the
above supremum may be infinite even though each term in the supremum
is finite. By definition, a completely bounded map is one with ‖T‖_cb < ∞.
On a finite-dimensional C* algebra, every linear map is completely bounded:
for maps into M_d we have ‖T‖_cb ≤ d‖T‖. (As a general reference on these
matters, I recommend the book [91].) One might conclude from this that
the distinction between these norms is irrelevant. However, since we shall
need estimates for large tensor products, every factor that increases with
dimension can make a decisive difference. This is the reason for employing
the cb norm in the definition of channel capacity. It will turn out, however,
that in the most important cases one has only to estimate differences from
the identity, and ‖T − I‖ and ‖T − I‖_cb can be estimated in terms of each
other with dimension-independent bounds.
The basis of the notion of channel capacity is a comparison between the
given channel T : A2 → A1 and an “ideal” channel S : B1 → B2 . The
comparison is effected by suitable encoding and decoding transformations
E : A1 → B1 and D : B2 → A2 so that the composed operator ET D : B2 →
B1 is a map which can be compared directly with the ideal channel S. Of
course, we are only interested in such a comparison in the case of optimal
encoding and decoding, i.e. in the quantity
where the infimum is over all channels (i.e. all unit-preserving completely
positive maps) E and D with appropriate domain and range. Since these
data are at least implicitly given together with S and T , there is no need
to specify them in the notation. S should be thought of as representing one
word of the kind of message to be sent, whereas T represents one invocation
of the channel. Channel capacity is defined as the number of S words per
invocation of the channel T which can be faithfully transmitted, with suitable
encoding and decoding for long messages. Here “messages of length n” are
represented by the tensor power S ⊗n , and “m invocations of the channel T ”
are represented by the tensor power T ⊗m .
The supremum of all achievable rates is called the capacity of T with respect
to S, and is denoted by C(S, T ).
this notation, we shall now summarize the capacities of ideal quantum and
classical channels. Of course, these are basic data for the whole theory:
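\[ C(\mathcal{M}_k, \mathcal{C}_n) = 0 \quad \text{for } k \ge 2 , \tag{2.16} \]
\[ C(\mathcal{C}_k, \mathcal{C}_n) = C(\mathcal{M}_k, \mathcal{M}_n) = C(\mathcal{C}_k, \mathcal{M}_n) = \frac{\log n}{\log k} . \tag{2.17} \]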
Here the first equation is the capacity version of the no-teleportation theorem:
it is impossible to transport any quantum information on a classical channel.
The second equation shows that for capacity purposes, Mn is indeed best
compared with Cn . In classical information theory one uses the one-bit system
C2 as the ideal reference channel. Similarly, we use the one-qubit channel as
the reference standard for quantum information, i.e. we define the classical
capacity Cc (T ) and the quantum capacity Cq (T ) of an arbitrary channel by
Cc (T ) = C(C2 , T ), (2.18)
Cq (T ) = C(M2 , T ) . (2.19)
Combining the results (2.17) with the “triangle inequality”, or two-step coding
inequality,
we see that this is really only a choice of units, i.e. for arbitrary channels
T we obtain C(Mn , T ) = (log 2/ log n)C(M2 , T ), and a similar equation for
classical capacities. Note that the term “qubit” refers to the reference system
M2 , but it is not advisable to use “qubit” as a special unit for quantum
information (rather than just “bit”): this would be like distinguishing between
the units “vertical meter” and “horizontal meter” and would create problems
in every equation in which the two capacities were directly compared. The
simplest relation of this kind is
Cq (T ) ≤ Cc (T ) , (2.21)
which follows from combining (2.20) with (2.17). Note that both definitions
apply to arbitrary channels T , whether the input and/or output are classical
or quantum or hybrids. In order for a channel to have a positive quantum
capacity, it is necessary that both the input and the output are quantum
systems. This is shown by combining (2.16) with the bottleneck inequality
\[ C(S, T_1 T_2) \le \min \bigl\{ C(S, T_1),\, C(S, T_2) \bigr\} . \tag{2.22} \]
for the standard ideal channels, and when all systems involved are classical,
we even have equality. However, it is one of the big unsolved problems to
decide under what general circumstances this is true.
Comparison with the Classical Definition. Since the definition of clas-
sical capacity Cc (T ) also applies to the purely classical situation, we have to
verify that it is indeed equivalent to the standard definition in this case. To
that end, we have to evaluate the error quantity ‖T − I‖_cb for a
classical-to-classical channel. As noted, a classical channel T : C(Y) → C(X) is
given by a transition probability matrix T(x → y). Since the cb norm coincides
with the ordinary norm in the classical case, we obtain
\[
  \|I - T\|_{cb} = \|I - T\|
  = \sup_{x,f} \Bigl| \sum_y \bigl( \delta_{xy} - T(x \to y) \bigr) f(y) \Bigr|
  = 2 \sup_x \bigl( 1 - T(x \to x) \bigr) ,
\]
where the supremum is over all f ∈ C(Y) with |f(y)| ≤ 1 and is attained
where f is just the sign of the parenthesis (δ_xy − T(x → y)), and we have
used the normalization of the transition probabilities. Hence, apart from an
irrelevant factor of two, ‖T − I‖_cb is just the maximal probability of error,
i.e. the largest probability for sending x and obtaining anything different.
This is precisely the quantity which is required to go to zero (after suitable
coding and decoding) in Shannon’s classical definition of the channel capacity
of discrete memoryless channels [92]. Hence the above definition agrees with
the classical one.
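The norm computation above is easy to confirm on a small example. The following Python sketch is an illustration under our own assumptions: the transition matrix is made up, the function names are hypothetical, and the supremum over f is taken over sign functions f(y) = ±1 only, which suffices because a linear functional on the cube |f(y)| ≤ 1 attains its maximal modulus at a corner.

```python
from itertools import product


def error_norm_bruteforce(T):
    """sup over x and sign functions f of |sum_y (delta_xy - T[x][y]) f(y)|."""
    n_in, n_out = len(T), len(T[0])
    best = 0.0
    for x in range(n_in):
        for f in product((-1, 1), repeat=n_out):
            value = abs(sum(((1 if x == y else 0) - T[x][y]) * f[y]
                            for y in range(n_out)))
            best = max(best, value)
    return best


def error_norm_formula(T):
    """Closed form 2 * sup_x (1 - T(x -> x)): twice the maximal error probability."""
    return 2 * max(1 - T[x][x] for x in range(len(T)))


# Hypothetical transition probabilities T[x][y] = T(x -> y); each row sums to 1.
T = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.25, 0.75]]

print(error_norm_bruteforce(T), error_norm_formula(T))
```

For this matrix the worst input is the second one, which is transmitted correctly with probability 0.7, and both functions return twice its error probability up to rounding.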
When considering the classical capacity Cc (T ) of a quantum channel,
it is natural to look at a coded channel ET D as a channel in its own right.
Since we are considering transmission of classical information, this is a purely
classical channel, and we can look at its classical capacity. Optimizing over
coding and decoding, we obtain the quantity
This is called the one-shot classical capacity, because it can be said to involve
only one invocation of the channel T . Of course, many uses of the channel
are implicit in the capacity on the right-hand side, but these are in some
sense harmless. In fact, every coding and decoding scheme for comparing
(ET D)⊗n with an ideal classical channel is also a coding/decoding for T ⊗n ,
but the codings/decodings that arise in this way from the coding ET D are
only those in which the coded input states and the measurements at the
where the supremum is over all unit vectors. Hence the achievable rates are
those for which F(E T^{⊗n_α} D) → 1, where E, D map to a system of m_α qubits,
and these integer sequences satisfy the same constraints as above. This def-
inition is equivalent to ours, because the error estimates are equivalent. In
fact, if we introduce the off-diagonal fidelity
\[ F_\%(T) = \sup_{\phi,\psi} \mathrm{Re}\, \langle \phi, T(|\phi\rangle\langle\psi|)\, \psi \rangle \tag{2.27} \]
\[ C_{c,1}(T) = \max \Bigl[ S\Bigl( \sum_i p_i T_*[\rho_i] \Bigr) - \sum_i p_i S\bigl( T_*[\rho_i] \bigr) \Bigr] . \tag{2.32} \]
Whether or not this is equal to the classical capacity depends on whether the
conjectured equality in (2.25) holds or not. In any case, equality is known
to hold for channels with classical input, so Holevo’s coding theorem is a
genuine extension of Shannon’s.
No coding theorem has been proved yet for the quantum capacity. How-
ever, there is a fairly good candidate for the right-hand side, related to a
quantity called “coherent information” [97]. The formula is written most
compactly by relating it to an entanglement quantity via Lemma 2.2. For
any bipartite state ρ with restriction ρB to the second factor, let
where the supremum is over all bipartite pure states σ. Note that any measure
of entanglement can be turned into a capacity-like expression by this proce-
dure. Since this quantity is known not to be additive [99], the candidate for
the right-hand side of the quantum coding theorem is
\[ C_S(T) = \sup_{\ell} \frac{1}{\ell}\, C_{S,1}(T^{\otimes \ell}) , \tag{2.35} \]
in analogy to (2.25). So far there have been some good heuristic arguments
[100, 101] in favor of this candidate, but a full proof remains one of the main
challenges in the field.
An interesting upper bound on Cq (T ) can be written in terms of the
transpose operation Θ on the output system [81]: we have
Let us take a similar look at teleportation. Here three quantum systems are
involved: the entangled pair in state ω, and the input system given to Alice,
in state ρ. Thus the overall initial state is ρ ⊗ ω. Alice measures an observable
F on the first two factors, obtaining a result x, which is sent to Bob. Bob
applies a transformation Tx to his particle, and makes a final measurement
of an observable A of his choice. Thus the probability of Alice measuring x
and of Bob obtaining a result “yes” on A is tr(ρ ⊗ ω)[Fx ⊗ Tx (A)]. Note that
the tensor symbols in this equation refer to different splittings of the system
(1 ⊗ 23 and 12 ⊗ 3, respectively). Teleportation is successful if the overall
probability of obtaining A, computed by summing over all possibilities x, is
the same as for an ideal channel, i.e.
\[ \sum_{x \in X} \mathrm{tr}\,(\rho \otimes \omega)\,[F_x \otimes T_x(A)] = \mathrm{tr}(\rho A) . \tag{2.38} \]
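Equation (2.38) can be checked numerically for the standard qubit scheme. The following Python sketch is an illustration under assumptions of our own, none of which are fixed by the text so far: ω is the projector onto Ω = (|00⟩ + |11⟩)/√2, Alice's effects are the projectors onto Φ_x = (U_x ⊗ 𝟙)Ω, where U_x runs over the identity and the three Pauli matrices, and Bob's correction is T_x(A) = U_x* A U_x. The sample state `rho`, the observable `A` and all function names are hypothetical.

```python
import math


def kron(a, b):
    """Kronecker product of nested-list matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def projector(v):
    """The rank-one operator |v><v|."""
    return [[vi * complex(vj).conjugate() for vj in v] for vi in v]


I2 = [[1, 0], [0, 1]]
UNITARIES = [I2, [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

s = 1 / math.sqrt(2)
OMEGA_VEC = [s, 0, 0, s]          # (|00> + |11>)/sqrt(2)
omega = projector(OMEGA_VEC)      # shared entangled state on systems 2 and 3


def teleportation_terms(rho, A):
    """One term tr (rho (x) omega)[F_x (x) T_x(A)] for each classical signal x."""
    terms = []
    for U in UNITARIES:
        UI = kron(U, I2)
        phi_x = [sum(UI[i][j] * OMEGA_VEC[j] for j in range(4)) for i in range(4)]
        F_x = projector(phi_x)                   # Alice's effect on systems 1 and 2
        T_x_A = matmul(matmul(dagger(U), A), U)  # Bob's correction on system 3
        terms.append(trace(matmul(kron(rho, omega), kron(F_x, T_x_A))))
    return terms


rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]   # sample input state
A = [[0.3, 0.5j], [-0.5j, 0.9]]                # sample observable
terms = teleportation_terms(rho, A)
print(sum(terms), trace(matmul(rho, A)))
```

The two tensor products are formed with respect to the splittings 1 ⊗ 23 and 12 ⊗ 3, as noted in the text; with all three systems being qubits both give 8 × 8 matrices in the same index order. In this scheme each of the four terms separately equals tr(ρA)/4.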
Surprisingly, in the tight case one obtains exactly the same conditions on
ω, Tx , Fx for teleportation and for dense coding, i.e. a dense-coding scheme
can be turned into a teleportation scheme simply by letting Bob and Al-
ice swap their equipment. However, this symmetry depends crucially on the
tightness condition, because teleportation schemes with |X| > d² signals are
trivial to achieve, but |X| > d² is impossible for dense coding. Conversely,
dense coding through a d′ > d-dimensional channel is trivial to achieve, while
teleportation of states with d′ > d dimensions (with the same X) is impossible.
Let us now give a heuristic sketch of the arguments leading to the neces-
sary and sufficient conditions for (2.37) and (2.38) to hold. For full proofs we
refer to [102]. A crucial ingredient in the analysis of the teleportation equation
is the “no measurement without perturbation” principle from Lemma 2.1: the
left-hand side of (2.38) is indeed such a decomposition, so each term must be
equal to λx tr(ρA) for all ρ, A. But we can carry this even further: suppose
we decompose ω, Fx or Tx into a sum of (completely) positive terms. Then
each term in the resulting sum must also be proportional to tr(ρA). Hence
any components of ω, Tx or Fx satisfy a teleportation equation as well (up
to normalization). Similarly, the vanishing of the dense-coding equation for
x ≠ y carries over to every positive summand in ω, T_x or F_x. Hence it is
plausible that we must first analyze the case where all ω, Fx , Tx are “pure”,
i.e. have no nontrivial decompositions as sums of (completely) positive terms:
\[ \omega = |\Omega\rangle\langle\Omega| , \tag{2.39} \]
\[ F_x = |\Phi_x\rangle\langle\Phi_x| , \tag{2.40} \]
\[ T_x(A) = U_x^* A U_x . \tag{2.41} \]
The further analysis will show that in the pure case any two of these elements
determine the third via the teleportation or the dense-coding equation, so
that in fact all components of ω (and correspondingly Tx or Fx ) have to be
proportional. Hence each of these has to be pure in the first place. For the
present discussion, let us just assume purity in the form (2.39)–(2.41) from
now on. Note that normalization requires that each Ux is unitary.
The second normalization condition, Σ_x |Φ_x⟩⟨Φ_x| = Σ_x F_x = 𝟙, has
an interesting consequence in conjunction with the tightness condition: the
vectors Φ_x live in a d²-dimensional space, and there are exactly d² of them.
This implies that they are orthogonal: since each vector Φ_x satisfies ‖Φ_x‖ ≤ 1,
and d² = tr(𝟙) = Σ_x ‖Φ_x‖², we must have ‖Φ_x‖ = 1 for all x. Hence, in the
\[ \langle \phi \otimes \Omega_1, \Omega_2 \otimes \psi \rangle = \lambda \langle \phi, \psi \rangle . \tag{2.43} \]
Taking the matrix element ⟨φ| · |e_k⟩ of this equation and summing over k, we
find
\[
  \sum_{x,k} \langle e_k, U_x^* \phi \rangle \langle \phi, U_x e_k \rangle
  = \sum_x \mathrm{tr}\bigl( U_x^* |\phi\rangle\langle\phi| U_x \bigr)
  = d^2 \|\phi\|^2 = \|\phi\|^2\, \mathrm{tr}(\omega_1^{-1}) .
\]
Hence tr(ω_1^{−1}) = d² = Σ_k r_k^{−1}, where r_k are the eigenvalues of ω_1. Using
again the fact that the smallest value of this sum under the constraint
Σ_k r_k = 1 is attained only for constant r_k, we find ω_1 = d^{−1}𝟙, and Ω is
indeed maximally entangled.
To summarize, we have the following theorem (again, for a detailed proof
see [102]):
Theorem 2.3. Given either a teleportation scheme or a dense-coding scheme,
which is tight in the sense that all Hilbert spaces are d-dimensional and
|X| = d² classical signals are distinguished, then
Quantum entanglement lies at the heart of the new field of quantum com-
munication and computation. For a long time, entanglement was seen just as
one of those fancy features which make quantum mechanics so counterintu-
itive. But recently, quantum information theory has shown the tremendous
importance of quantum correlations for the formulation of new methods of
information transfer and for algorithms exploiting the capabilities of quan-
tum computers. While the latter needs entanglement between a large number
of quantum systems, the basic quantum communication schemes rely only on
entanglement between the members of a pair of particles, directly pointing
to a possible realization of such schemes by means of correlated photon pairs
such as those produced by parametric down-conversion.
This chapter describes the first experimental realizations of quantum com-
munication schemes using entangled photon pairs. We show how to make com-
munication secure against eavesdropping using entanglement-based quantum
cryptography, how to increase the information capacity of a quantum chan-
nel by quantum dense coding and, finally, how to communicate quantum
information itself in the process of quantum teleportation.
3.1 Introduction
Quantum mechanics is probably the most successful physical theory of this
century. It provides powerful tools which form one of the cornerstones of
scientific progress, and which are indispensable for the understanding of om-
nipresent technical devices such as the transistor, semiconductor chips and
the laser. The most important areas where those devices are used are mod-
ern communication and information-processing technologies. But quantum
mechanics, until now, has only been used to construct these devices – quan-
tum effects are absolutely avoided in the representation and manipulation
of information. Rather than using single photons, one still uses strong light
pulses to send information along optical high-speed connections, and one re-
lies on electrical currents in semiconductor logic chips instead of applying
single electrons as signal carriers.
This caution surely is due to the fact that, at first glance, the inherent
stochastic character of quantum effects seems only to introduce unavoidable
G. Alber, T. Beth, M. Horodecki, P. Horodecki, R. Horodecki, M. Rötteler, H. Weinfurter,
R. Werner, A. Zeilinger: Quantum Information, STMP 173, 58–95 (2001)
© Springer-Verlag Berlin Heidelberg 2001
3. Quantum Communication 59
noise and thus does not really recommend their use. Yet quantum informa-
tion theory shows us, in more and more examples, how one can profit from
the peculiar properties of quantum systems, and, when applied correctly,
how fundamental quantum effects can add to the power and features of clas-
sical information processing and transmission [12, 105, 106]. For example,
quantum computers will outperform conventional computers, and quantum
cryptography enables, for the first time, secure communication. While quan-
tum cryptography, in principle, can be performed even with single quantum
particles, all the other proposals utilize entanglement between two or more
particles, for example to enhance communication rates or to enable the tele-
portation of quantum states.
Entanglement between quantum systems is a pure quantum effect. It is
closely related to the superposition principle and describes correlations be-
tween quantum systems that are much stronger and richer than any classical
correlation could be. Originally this property was introduced by Einstein,
Podolsky and Rosen (EPR) [24], and also by Schrödinger [5] and Bohr [107]
in the discussion of the completeness of quantum mechanics and by von Neu-
mann [108] in his description of the measurement process. Entanglement also
provides a handle to distinguish various interpretations of quantum mechan-
ics via Bell’s theorem [72, 109] or the GHZ argument [37]. The development
of experimental techniques has enabled researchers to perform the recent
long-distance tests of entanglement [30], the first Bell experiment fulfilling
Einstein locality conditions [31] and the first GHZ experiment [38], which
all provided convincing demonstrations of the validity of standard quantum
mechanics.1
The field of quantum information is not concerned with the fundamental
issues. Instead, it builds on the validity of quantum mechanics and applies the
characteristic features of entangled systems to devise new, powerful schemes
for communication and computation. Entanglement between a large number
of quantum systems will enable very efficient computations. In particular,
the factorization algorithm of Shor [9] and the search algorithm of Grover
[58] (together with the increasing number of algorithms derived from one or
the other) show how entanglement and the associated interference between
entangled states can boost the power of quantum computers.
Quantum communication exploits entanglement between only two or three
particles. As will be seen in the following sections, the often counterintu-
itive features of such small entangled systems enable powerful communication
methods. After the very basic properties of pairs of entangled particles have
been introduced (Sect. 3.2), Sect. 3.3 gives an overview of the possibilities
of three important quantum communication schemes: entanglement-based
quantum cryptography enables secret key exchange and thus truly secure
communication [46]; using quantum dense coding, one can send classical in-
1
We are aware of the detection loophole [110], which will be closed whenever
technology allows.
60 Harald Weinfurter and Anton Zeilinger
formation more efficiently [51]; and, finally, with quantum teleportation one
can transfer quantum information, that is, the quantum state itself, from
one quantum system to another [52]. The tools for the experimental real-
ization of those quantum communication schemes are presented in Sect. 3.4.
In particular, we show how to produce polarization-entangled photon pairs
by parametric down-conversion [111] and how to observe these nonclassical
states by interferometric Bell-state analysis [112]. In Sect. 3.5 we describe the
first experimental realizations of basic quantum communication schemes. In
experiments performed during recent years at the University of Innsbruck,
we could realize entanglement-based quantum cryptography with randomly
switched analyzers and with the two users separated by more than 400 m
[113]; we demonstrated the possibility of transmitting 1.58 bits of classical
information by encoding trits on a single two-state photon [114]; and we
could transfer a qubit, in our case the polarization state, from one photon to
another by quantum teleportation [10, 11] and entanglement swapping [115].
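\[ |\Psi^+\rangle_{12} = \frac{1}{\sqrt{2}} \bigl( |0\rangle_1 |1\rangle_2 + |1\rangle_1 |0\rangle_2 \bigr) , \tag{3.2} \]
\[ |\Psi^-\rangle_{12} = \frac{1}{\sqrt{2}} \bigl( |0\rangle_1 |1\rangle_2 - |1\rangle_1 |0\rangle_2 \bigr) , \tag{3.3} \]
\[ |\Phi^+\rangle_{12} = \frac{1}{\sqrt{2}} \bigl( |0\rangle_1 |0\rangle_2 + |1\rangle_1 |1\rangle_2 \bigr) , \tag{3.4} \]
\[ |\Phi^-\rangle_{12} = \frac{1}{\sqrt{2}} \bigl( |0\rangle_1 |0\rangle_2 - |1\rangle_1 |1\rangle_2 \bigr) . \tag{3.5} \]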
The name “Bell states” was given to these states since they maximally
violate a Bell inequality [121]. This inequality was deduced in the context of
so-called local realistic theories (see Chap. 1), and gives a range of possible
results for certain statistical tests on identically prepared pairs of particles
[109]. Quantum mechanics predicts different results if the measurements are
performed on entangled pairs. If the two particles are not correlated, i.e. are
described by a product state, the quantum mechanical prediction is within
the range given by Bell’s inequality.
The remarkably nonclassical features of entangled pairs arise from the
fact that the two systems can no longer be seen as being independent but
now have to be seen as one combined system, where the observation of one of
the two will change the possible predictions of measurement results obtained
for the other [5, 107]. Formally, this mutual dependence is reflected by the
fact that the entangled state can no longer be factored into a product of two
states for the two subsystems separately.
If one looks only at one of the two particles, one finds it with equal probability
in state |0⟩ or in state |1⟩. One has no information about the particular
outcome of a measurement to be performed. However, the observation of
one of the two particles determines the result of a measurement of the other
particle. This holds not only for a measurement in the basis |0⟩/|1⟩, but for
any arbitrary superposition, that is, for any arbitrary orientation of the
measurement apparatus. In particular, for the state |Ψ−⟩ we shall find the two
particles always in orthogonal states, no matter which measurement appa-
ratus is used. If, for the case of polarization-entangled photons, we observe
only one of the two photons, it appears to be completely unpolarized, and
any polarization direction is observed with equal probability. However, the
results for both photons are perfectly correlated. For example, this means
that photon 2 has vertical polarization if we found horizontal polarization
for photon 1, but also that photon 2 will be circularly polarized left if we
observed right circular polarization for photon 1.
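These statements about |Ψ−⟩ can be verified for any analyzer orientation. The following Python sketch is an illustration under our own assumptions: the two particles are analyzed in the same basis, parametrized by angles θ and φ on the Bloch sphere, and all names are hypothetical. With this parametrization θ = 0 gives the |0⟩/|1⟩ basis, and θ = π/2, φ = π/2 gives the basis (|0⟩ ± i|1⟩)/√2, which we take to represent the two circular polarizations.

```python
import cmath
import math

S = 1 / math.sqrt(2)
PSI_MINUS = [0, S, -S, 0]   # components in the order 00, 01, 10, 11


def analyzer_basis(theta, phi):
    """Two orthonormal single-particle states for the direction (theta, phi)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    plus = [c, cmath.exp(1j * phi) * s]
    minus = [-cmath.exp(-1j * phi) * s, c]
    return plus, minus


def joint_probability(state, a, b):
    """Probability of finding particle 1 in state a and particle 2 in state b."""
    amplitude = sum(complex(a[i] * b[j]).conjugate() * state[2 * i + j]
                    for i in range(2) for j in range(2))
    return abs(amplitude) ** 2


def outcome_table(theta, phi):
    """Joint probabilities for all four outcome pairs in the chosen basis."""
    plus, minus = analyzer_basis(theta, phi)
    return {
        "++": joint_probability(PSI_MINUS, plus, plus),
        "+-": joint_probability(PSI_MINUS, plus, minus),
        "-+": joint_probability(PSI_MINUS, minus, plus),
        "--": joint_probability(PSI_MINUS, minus, minus),
    }


for theta, phi in [(0.0, 0.0), (math.pi / 2, math.pi / 2), (1.1, 2.3)]:
    print(outcome_table(theta, phi))
```

In every basis the equal outcomes "++" and "--" have probability zero up to rounding, while each single particle yields either result with probability 1/2.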
Another important feature of the four Bell states is that a manipulation
of only one of the two particles suffices to transform from any Bell state to
any of the other three states. This is not possible for the basis formed by the
products. For example, to transform |0⟩_1|0⟩_2 into |1⟩_1|1⟩_2 one has to flip the
state of both particles.
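The local transformations between Bell states can be made explicit. The following Python sketch is an illustration under our own assumptions: the manipulations of particle 1 are chosen to be the bit flip X, the phase flip Z and their product, and all names are hypothetical. States are compared up to an overall sign, which has no physical meaning.

```python
import math

S = 1 / math.sqrt(2)
# Components in the order 00, 01, 10, 11.
BELL = {
    "Psi+": [0, S, S, 0],
    "Psi-": [0, S, -S, 0],
    "Phi+": [S, 0, 0, S],
    "Phi-": [S, 0, 0, -S],
}

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]     # bit flip
Z = [[1, 0], [0, -1]]    # phase flip


def kron(a, b):
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def on_particle_1(U, state):
    """Apply U to particle 1 only, i.e. the operator U (x) 1 on the pair."""
    UI = kron(U, I2)
    return [sum(UI[i][j] * state[j] for j in range(4)) for i in range(4)]


def same_up_to_sign(u, v):
    return u == v or u == [-c for c in v]


start = BELL["Phi+"]
print(same_up_to_sign(on_particle_1(X, start), BELL["Psi+"]))
print(same_up_to_sign(on_particle_1(Z, start), BELL["Phi-"]))
print(same_up_to_sign(on_particle_1(matmul(Z, X), start), BELL["Psi-"]))
```

Applied to the product state |0⟩_1|0⟩_2, the same flip of particle 1 gives |1⟩_1|0⟩_2, so |1⟩_1|1⟩_2 is out of reach without touching particle 2 as well.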
These three features,
• different statistical results for measurements on entangled or unentangled
pairs
• perfect correlations between the observations of the two particles of a pair,
although the results of the measurements on the individual particles are
fully random
• the possibility to transform between the Bell states by manipulating only
one of the two particles,
are the ingredients of the fundamental quantum communication schemes de-
scribed here.
Suppose two parties, let us call them Alice and Bob, want to send each
other secret messages. There exists a cryptographic method, the one-time
pad scheme,3 which is secure against eavesdropping attacks – provided the
key used for encoding and decoding the message is perfectly random, is as
long as the original message and, most importantly, is secret and known
only to Alice and Bob. But how can they be sure that the key was securely
distributed to the two, and that no third person has knowledge about the key?
Quantum cryptography [42, 122] provides a means to ensure the security of
3
In the so-called “one-time pad” encryption (see Sect. 3.1), every character of the
message is encoded with a random key character. As shown by Shannon [44], the
cipher cannot be decoded without knowledge of the key. Eavesdropping
is impossible as long as the key is securely exchanged between the sender and
receiver.
the key distribution and thus enables, together with the one-time pad scheme,
absolutely secret communication.4
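The one-time pad itself is a purely classical operation and takes only a few lines. The following Python sketch is an illustration under our own assumptions: characters are encoded as bytes and combined with the key by bitwise XOR, the function name is hypothetical, and the key is drawn from the operating system's random source via `secrets.token_bytes`, which merely stands in for a key established by quantum key distribution.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine each message byte with the corresponding key byte by XOR.

    The same function encrypts and decrypts, because XOR is its own inverse.
    """
    if len(key) != len(data):
        raise ValueError("the key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"MEET AT NOON"
key = secrets.token_bytes(len(message))   # shared secret key, used only once
ciphertext = xor_bytes(message, key)      # Alice encrypts
recovered = xor_bytes(ciphertext, key)    # Bob decrypts with the same key
print(recovered == message)
```

The length check enforces the requirement stated above that the key be as long as the message; the scheme is secure only if each key is random, secret and never reused.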
Let us first discuss how quantum cryptography can profit from the fas-
cinating properties of entangled systems to provide secure key exchange
[46, 123]. Suppose that Alice and Bob receive particles which are in entan-
gled pairs, from an EPR source (Fig. 3.1). Beforehand, Alice and Bob agreed
on some preferred basis, again called $|0\rangle$/$|1\rangle$, in which they start to perform
measurements. The possible results, +1 and −1, correspond to observation
of the state $|1\rangle$ or $|0\rangle$, respectively. Owing to the entanglement of the
particles, the measurement results of Alice and Bob will be perfectly correlated
or, in a case where the source produces pairs in the $|\Psi^-\rangle$ state, perfectly
anticorrelated. For each instance where Alice obtained −1, she knows that Bob
observed +1, and if she obtained the result +1, she knows that Bob had −1.
Alice and Bob can translate the result −1 to the bit value 0 and the result
+1 to the bit value 1 and thereby establish a random key, ideal for encoding
messages. But how can they be sure that no eavesdropper has intercepted
the key exchange? There are two different techniques. The first scheme for
entanglement-based quantum cryptography [123] builds on the ideas of the
basic quantum cryptography protocol for single photons [42, 122]. In this
case, Alice and Bob randomly and independently vary their analysis directions
between 0°, corresponding to the $|0\rangle$/$|1\rangle$ basis, and 45°, corresponding
to a second, noncommuting basis. They will observe perfect anticorrelations
of their measurements whenever they happen to have polarizers oriented par-
allel (Alice and Bob thus obtain identical keys, if one of them inverts all bits
of his/her key string). This can be viewed in the following way: as Alice makes
⁴ For descriptions of quantum cryptography schemes not relying on entanglement,
see [105, 106].
In order to implement quantum key distribution, Alice and Bob each vary
their analyzers randomly between two settings: Alice uses −30° and 0°, and Bob
uses 0° and 30°. Because Alice and Bob operate independently, four possible
combinations of analyzer settings will occur, of which the three oblique settings
allow a test of Wigner’s inequality and the remaining combination of parallel
settings allows the generation of keys via the perfect anticorrelation (where,
again, either Alice or Bob has to invert all bits of the key to obtain identical
keys). If the measured probabilities violate Wigner’s inequality, the security
of the quantum channel is ascertained, and the keys generated can readily
be used. This scheme is an improvement on the Ekert scheme, which uses
the CHSH inequality. Since there are fewer settings on each side, the above
version is technically easier to implement and also uses the photon pairs more
efficiently for key generation.
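The size of the expected violation for these settings can be estimated with a few lines of code. This is only a sketch under stated assumptions: Wigner's inequality, which is not reproduced in this excerpt, is assumed to have the form p++(χ, ψ) + p++(ψ, ω) − p++(χ, ω) ≥ 0, and the quantum mechanical prediction for pairs in the $|\Psi^-\rangle$ state is assumed to be p++(α, β) = (1/2) sin²(α − β) for polarization analyzers. The function names are hypothetical.

```python
import math


def p_pp_singlet(alpha_deg: float, beta_deg: float) -> float:
    """Assumed quantum prediction for |Psi->: (1/2) sin^2(alpha - beta)."""
    delta = math.radians(alpha_deg - beta_deg)
    return 0.5 * math.sin(delta) ** 2


def wigner_lhs(chi: float, psi: float, omega: float) -> float:
    """Left-hand side of the assumed Wigner form p(chi,psi) + p(psi,omega) - p(chi,omega)."""
    return (p_pp_singlet(chi, psi)
            + p_pp_singlet(psi, omega)
            - p_pp_singlet(chi, omega))


# Settings used in the text: -30 deg and 0 deg for Alice, 0 deg and 30 deg for Bob.
lhs = wigner_lhs(-30.0, 0.0, 30.0)
print(f"{lhs:.3f}")  # -0.125, i.e. below the local-realistic bound of 0
```

Under these assumptions the three oblique settings give 1/8 + 1/8 − 3/8 = −1/8, so any measured value clearly below zero indicates a violation.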
Compared with standard attenuated-pulse quantum cryptography, such
systems are practically immune to any beam-splitter attack (or other attacks
that try to split pulses containing more than one photon) by a potential
eavesdropper. First of all, a photon pair source can be used as an (almost)
ideal source of single photons. If one of the photons is detected, the gate
time of the coincidence electronics (typically on the order of 1 ns) determines
the equivalent pulse duration in standard quantum cryptography. Since the
probability of generating one photon pair during such a short time is very
low, e.g., for the experiment described in Sect. 3.5.1, only about $6.8 \times 10^{-4}$,
the probability of having two photons in the gate time is less than $3 \times 10^{-7}$
and can be almost neglected. This has to be compared with a probability of
having two photons in a pulse of 0.005 for a typical quantum cryptography
realization using a mean of 0.1 photons per pulse.
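The orders of magnitude quoted above can be checked with a small calculation. The sketch assumes Poissonian statistics both for the attenuated pulses and for the pair emission, and it treats the quoted single-pair probability as the mean pair number per gate time; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
import math


def poisson(n: int, mean: float) -> float:
    """Probability of finding exactly n events for a Poisson distribution."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


# Attenuated pulses with a mean of 0.1 photons per pulse.
p_two_in_pulse = poisson(2, 0.1)
# Pair source: assumed mean of 6.8e-4 pairs within the gate time.
p_two_in_gate = poisson(2, 6.8e-4)

print(f"{p_two_in_pulse:.4f}")  # 0.0045
print(f"{p_two_in_gate:.2e}")   # 2.31e-07
```

Both results are consistent with the figures quoted in the text: roughly 0.005 for the attenuated pulse and less than $3 \times 10^{-7}$ for the pair source.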
However, the security against beam-splitting attacks can be further in-
creased when entanglement-based schemes are used. In this case, there is
only a correlation between two entangled pairs if they are simultaneously
generated during a time interval of the order of the coherence time of the
photons, i.e. during a time of typically 500 fs. This reduces the chances of
an eavesdropper learning the value of a key bit to about $6 \times 10^{-14}$ and
guarantees unprecedented security of the quantum key. Moreover, by utilizing
the peculiar properties of entangled photon pairs produced by parametric
down-conversion, one immediately profits from the inherent randomness of
quantum mechanical observations, which guarantees a truly random and non-
deterministic key.
If one wants to send some information, one encodes the message with
distinguishable symbols, writes them on some physical entity, and finally
transmits this entity to the receiver. To send one bit of information one uses,
for example, the binary values “0” and “1” as code symbols written on the
information carrier. If one wants to send two bits of information, one
consequently has to perform the process twice; that means one has to send two
such entities.

Fig. 3.2. Scheme for the efficient transmission of classical information by quantum
dense coding [51] (BSM, Bell-state measurement; U, unitary transformation)
As mentioned above, in the case of quantum information one identifies
the two binary values with the two orthogonal basis states $|0\rangle$ and $|1\rangle$ of
the qubit. In order to send a classical message to Bob, Alice uses quantum
particles, all prepared in the same state by some source. Alice translates the
bit values of the message by either leaving the state of the qubit unchanged or
flipping it to the other, orthogonal state, and Bob, consequently, will observe
the particle in one or the other state. That means that Alice can encode one
bit of information in a single qubit. Obviously, she cannot do better, since in
order to avoid errors, the states arriving at Bob have to be distinguishable,
which is only guaranteed when orthogonal states are used. In this respect,
they do not gain anything by using qubits as compared with classical bits.
Also, if she wants to communicate two bits of information, Alice has to send
two qubits.
Bennett and Wiesner found a clever way to circumvent the classical limit
and showed how to increase the channel capacity by utilizing entangled par-
ticles [51]. Suppose the particle which Alice obtained from the source is en-
tangled with another particle, which was sent directly to Bob (Fig. 3.2). The
two particles are in one of the four Bell states, say $|\Psi^-\rangle$. Alice now can use a
particular feature of the Bell basis, that manipulation of one of the two entan-
gled particles suffices to transform to any other of the four Bell states. Thus
she can perform one out of four possible transformations – that is, doing
nothing, shifting the phase by π, flipping the state, or flipping and phase-
shifting the state – to transform the two-particle state of their common pair
to another state. After Alice has sent the transformed two-state particle to
Bob, he can read the information by performing a combined measurement on
both particles. He makes a measurement in the Bell-state basis and can iden-
tify which of the four possible messages was sent by Alice. Thus it is possible
to encode two bits of classical information by manipulating and transmit-
ting a single two-state system. Entanglement enables one to communicate
information more efficiently than any classical system could do.
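The protocol described above can be followed step by step in a small state-vector sketch. It assumes that the shared pair starts in $|\Psi^-\rangle$, that amplitudes are ordered as |00>, |01>, |10>, |11> with Alice's qubit first, and that an ideal Bell-state measurement is available to Bob. The assignment of two-bit messages to the four transformations is a hypothetical choice made only for this illustration.

```python
import math

S = 1 / math.sqrt(2)
# Two-qubit amplitudes ordered as |00>, |01>, |10>, |11>; the first qubit is Alice's.
BELL = {
    "Psi-": [0, S, -S, 0],
    "Psi+": [0, S, S, 0],
    "Phi-": [S, 0, 0, -S],
    "Phi+": [S, 0, 0, S],
}


def flip_first(state):
    """Flip Alice's qubit: |0> <-> |1>."""
    a00, a01, a10, a11 = state
    return [a10, a11, a00, a01]


def phase_first(state):
    """Shift the phase of Alice's qubit by pi: |1> -> -|1>."""
    a00, a01, a10, a11 = state
    return [a00, a01, -a10, -a11]


# Hypothetical assignment of two-bit messages to Alice's four transformations.
ENCODE = {
    "00": lambda s: s,                           # do nothing
    "01": phase_first,                           # phase shift by pi
    "10": flip_first,                            # flip
    "11": lambda s: flip_first(phase_first(s)),  # phase shift, then flip
}


def bell_measurement(state):
    """Return the Bell state with the largest overlap (amplitudes are real here)."""
    def probability(name):
        return abs(sum(b * a for b, a in zip(BELL[name], state))) ** 2
    return max(BELL, key=probability)


def dense_coding(bits):
    """Alice encodes two bits on her qubit; Bob measures both qubits."""
    return bell_measurement(ENCODE[bits](BELL["Psi-"]))


for bits in ENCODE:
    print(bits, dense_coding(bits))
```

Each of the four messages leads to a different Bell state, so Bob recovers two bits although Alice manipulated and sent only one two-state particle.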
The preceding examples show how quantum information can be applied
for secure and efficient transmission of classical information. But can one
also transmit quantum information, that is, the state of a qubit? Obviously,
quantum mechanics places a number of obstacles in the way of this intention,
above all, the problem of measuring quantum states, which is utilized in
quantum cryptography as already described.
The Idea. It is an everyday task, in our classical world, for Alice to send
some information to Bob. Imagine a fax machine. Alice might have some
message, written on a sheet of paper. For the fax machine the actual written
information does not matter, in fact, it reduces to just a sequence of white
and black pixels. For the transmission, the machine scans the paper pixel by
pixel. It measures whether a pixel is white or black and sends this information
to Bob’s machine, which writes the state of each pixel onto another sheet of
paper. In classical physics, by definition, one can make the measurements with
arbitrary precision, and Bob’s sheet can thus become an ideal copy of Alice’s
original sheet of paper. If Alice’s pixels were made smaller and smaller, they
would, in reality, sooner or later be encoded on single molecules or atoms.
If we again confined ourselves to coding in only the basis states, we surely
could measure and transfer the binary value of even such pixels.
Now, imagine Alice not only has classical binary values encoded on her
system, but wants to send a quantum state, i.e. quantum information, to
Bob. She has a qubit encoded on some quantum system such as a molecule
or atom, and wishes that a quantum system in Bob’s hands should represent
this qubit at the end of the transmission. Evidently, Alice cannot read the
quantum information, that is, measure the state of the quantum object with
arbitrary precision. All she would learn from her measurement would be that
the amplitude of the observed basis state was not zero. But this is not enough
information for Bob to reconstruct the qubit on his quantum particle.
Another limitation, which definitely seems to bring the quest for perfect
transfer of the quantum information to an end, is the no-cloning theorem
(see Sect. 3.1) [4]. According to this theorem, the state of a quantum system
cannot be copied onto another quantum system with arbitrary precision.
Thus, how could Bob’s quantum particle obtain the state of Alice’s particle?
In 1993 Bennett et al. found the solution to this problem [52]. In their
scheme, a chain of quantum correlations is established between the particle
carrying the initial quantum state and Bob’s particle. They dispense with
measuring the initial state; actually, they avoid gaining any knowledge about
this state at all!

Fig. 3.3. Scheme for teleporting a quantum state from one system to another one
[52]
To perform quantum teleportation, initially Alice and Bob share an en-
tangled pair of particles 2 and 3, which they have obtained from some source
of entangled particles, in, say, the state $|\Psi^-\rangle_{2,3}$ (Fig. 3.3). As mentioned
before, we cannot say anything about the state of particle 2 on its own. Nor do
we know the state of particle 3. In fact, these particles do not have a (pure)
state at all. But, whatever the results of measurements might be, we know
for sure that they are orthogonal to each other. Next, particle 1, which carries
the state to be sent to Bob, is given to Alice. She now measures particles
1 and 2 together, by projecting them onto the Bell-state basis. After projecting
the two particles into an entangled state, she cannot infer anything
about the individual states of particles 1 and 2 anymore. However, she knows
about correlations between the two. Let us assume she has obtained the result
$|\Psi^-\rangle_{1,2}$. This tells her that, whatever the two states of particles 1 and 2 have
been, they have been orthogonal to each other. But from this, Alice already
knows that the state of particle 3 is equal to the state of particle 1 (up to a
possible overall phase shift). This follows because the state of particle 1 was
orthogonal to 2 and, owing to the preparation of particles 2 and 3, the state
of particle 2 was orthogonal to 3. All Alice has to do is to tell this to Bob,
to let him know that, in this particular case, the state of his particle 3 is the
same as that which particle 1 had initially.
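The argument for the outcome $|\Psi^-\rangle_{1,2}$ can be verified numerically. The following sketch assumes ideal states and an ideal projection; the helper names and the example input state are hypothetical. It projects particles 1 and 2 onto $|\Psi^-\rangle$ and compares the resulting state of particle 3 with the initial state of particle 1.

```python
import math

S = 1 / math.sqrt(2)
# PSI_MINUS[i][j] is the amplitude of |i>|j> in (|0>|1> - |1>|0>)/sqrt(2).
PSI_MINUS = [[0.0, S], [-S, 0.0]]


def teleport_on_psi_minus(psi):
    """Project particles 1 and 2 onto |Psi->_{1,2}.

    psi is the state [a, b] of particle 1; particles 2 and 3 start in |Psi->_{2,3}.
    Returns the probability of this outcome and the normalized state of particle 3.
    """
    # Unnormalized amplitude of particle 3 in |k>: sum_ij B_ij * psi_i * B_jk
    # (B is real, so the projection needs no complex conjugation).
    out = [sum(PSI_MINUS[i][j] * psi[i] * PSI_MINUS[j][k]
               for i in range(2) for j in range(2))
           for k in range(2)]
    probability = sum(abs(c) ** 2 for c in out)
    norm = math.sqrt(probability)
    return probability, [c / norm for c in out]


def fidelity(u, v):
    """Squared overlap |<u|v>|^2 of two normalized single-qubit states."""
    return abs(sum(complex(a).conjugate() * b for a, b in zip(u, v))) ** 2


psi_in = [0.6, 0.8j]  # arbitrary normalized example state
probability, psi_out = teleport_on_psi_minus(psi_in)
print(round(probability, 6), round(fidelity(psi_in, psi_out), 6))  # 0.25 1.0
```

The outcome occurs with probability 1/4, and the state of particle 3 agrees with the input state up to an overall phase, as argued in the text.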
Of course, since there are four orthogonal Bell states, there are four equally
probable outcomes for Alice’s Bell-state measurement. If Alice has obtained
another result, the state of Bob’s particle is again related to the initial state
Fig. 3.4. Scheme for entangling particles that have never interacted by the process
of entanglement swapping [125]
And, thirdly, there is also no transfer of matter or energy (other than that
required for the transmission of classical information). All that makes up a
particle are its properties, described by the quantum state. For example, the
state of a free neutron defines its momentum and its spin. If one transfers
the state onto another neutron, this particle obtains all the properties of the
first one; in fact, it becomes the initial particle. We leave it to the science
fiction writers to apply the scheme to bigger and bigger objects. Whether
this idea will help some Captain Kirk to get back to his space ship
cannot be answered here. Certainly, a lot of other problems need to be
solved as well.⁵
It is appropriate to point out some generalizations of the principle of
quantum teleportation. It is not necessary that the initial state which is to
be teleported is a pure state. In fact it can be any mixed state, or even the
undefined state of an entangled particle. This is best demonstrated by entan-
glement swapping [125]. Here, the particle to be teleported (1) is entangled
with yet another one (4) (Fig. 3.4). The state of particle 1 on its own is a
mixed state; however, it can be determined by the observation of particle 4.
Quantum teleportation allows us to transfer the state of particle 1 onto par-
ticle 3. Since quantum teleportation works for any arbitrary quantum state,
particle 3 thus becomes entangled with particle 4. Note that particles 3 and 4
do not come from the same source, nor did they ever interact with each other.
Nevertheless, it is possible to entangle them by swapping the entanglement
in the process of quantum teleportation.
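Entanglement swapping can be checked in the same spirit. The sketch below assumes that both pairs, (1, 4) and (2, 3), are initially in the state $|\Psi^-\rangle$ and that Alice's Bell-state measurement yields $|\Psi^-\rangle_{1,2}$; these choices, like the helper names, are assumptions made for illustration only.

```python
import math

S = 1 / math.sqrt(2)
# PSI_MINUS[i][j] is the amplitude of |i>|j> in (|0>|1> - |1>|0>)/sqrt(2).
PSI_MINUS = [[0.0, S], [-S, 0.0]]


def swap_on_psi_minus(pair_14, pair_23):
    """Project particles 1 and 2 onto |Psi->_{1,2}.

    pair_14[i][l]: amplitude for particle 1 in |i> and particle 4 in |l>.
    pair_23[j][k]: amplitude for particle 2 in |j> and particle 3 in |k>.
    Returns the outcome probability and the normalized state out[k][l]
    of particles 3 and 4.
    """
    out = [[sum(PSI_MINUS[i][j] * pair_14[i][l] * pair_23[j][k]
                for i in range(2) for j in range(2))
            for l in range(2)]
           for k in range(2)]
    probability = sum(abs(c) ** 2 for row in out for c in row)
    norm = math.sqrt(probability)
    return probability, [[c / norm for c in row] for row in out]


def overlap(a, b):
    """|<a|b>|^2 for two-particle states with real amplitudes."""
    return abs(sum(a[m][n] * b[m][n] for m in range(2) for n in range(2))) ** 2


probability, pair_34 = swap_on_psi_minus(PSI_MINUS, PSI_MINUS)
print(round(probability, 6), round(overlap(pair_34, PSI_MINUS), 6))  # 0.25 1.0
```

Under these assumptions particles 3 and 4 end up in $|\Psi^-\rangle$ (up to an overall sign), although they never interacted.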
⁵ The “technical manuals of Star Trek” mention, as a necessary part of their
transporter, a “Heisenberg compensator” [124]. Quantum teleportation seems to
provide a solution for this marvelous device. However, a lot more is necessary to
beam large objects.
the originally mixed state of particle 2 can be turned into a pure state which
depends on the manipulation initially performed on particle 1. Using such
a scheme, one can remotely prepare particle 3 in any pure quantum state.
Thus, it is not necessary to send two real numbers to Bob if one wants him to
have a certain, pure quantum state prepared on his particle. If he is provided
with one of a pair of entangled particles, Alice simply has to transmit two
bits of classical information to Bob.
transfer the entangled particles over reasonable distances. Thus photons (with
wavelengths in the visible or near infrared) are clearly a better choice. For
entangling photons via such a coupling, various methods have been proposed
and partially realized [129, 130, 131] but still need to be investigated more
thoroughly. Fortunately, the process of parametric down-conversion offers an
ideal source of entangled photon pairs without the need for strong coupling
(see Sect. 3.4.2).
To perform Bell-state analysis, one first has to transform the entangled
state into a product state. This is necessary since two particles can be an-
alyzed only if they are measured separately. Otherwise one would need to
entangle the two measurement apparatuses, each of which analyzes one of the
two particles – clearly an even more challenging task. In principle, a disentan-
gling transformation can be performed by reversing the entangling interaction
described above. However, as long as such couplings are not achievable, one
has to find replacements. In the following it is shown how two-particle inter-
ference can be employed for partial Bell-state analysis (see Sect. 3.4.3). Since
the manipulations and unitary transformations have to be performed on only
one quantum particle at a time, this does not create new obstacles. These
operations are often routine; in the case of light they have been routine for
two centuries.
Fig. 3.6. The different relations between the emission directions for type I and type
II down-conversion
conversion of a light quantum from the incident pump field into a pair of
photons in the “idler” and “signal” modes can occur. In principle, this can
be seen as the inverse of the frequency-doubling process in nonlinear optics
[134].
As mentioned above, energy and momentum conservation can give rise
to entanglement in various degrees of freedom, such as position–momentum
and time–energy entanglement. However, the interaction time and volume
will determine the sharpness and quality of the correlations observed, which
are formally obtained by integration of the interaction Hamiltonian [135].
The interaction time is given by the coherence time $\tau_c$ of the UV pump light;
the volume is given by the extent and spatial distribution of the pump light
in the nonlinear crystal.
The relative orientations of the direction and polarization of the pump
beam, and the optic axis of the crystal determine the actual direction of
the emission of any given wavelength. We distinguish two possible alignment
types (Fig. 3.6): for type I down-conversion, the pump has, for example,
the extraordinary polarization and the idler and signal beams both have the
ordinary polarization. Different colors are emitted into cones centered on the
pump beam.
In type II down-conversion, the pump has the extraordinary polarization
and, in order to fulfill the momentum conservation condition inside the crystal
(phase-matching), the two down-converted photons have different, for most
directions orthogonal, polarizations, offering the possibility of a new source
of polarization-entangled photon pairs (Sect. 3.4.2).
One can distinguish two basic ways to observe entanglement. In the first
way, by selecting detection events one can choose a subensemble of possible
outcomes which exhibits the nonclassical features of entangled states.⁶ This
additional selection seems to contradict the spirit of EPR–Bell experiments;
however, it was shown recently that, after a detailed analysis of all detection
events, the validity of local hidden-variable theories can be tested on the basis
of refined Bell inequalities [141]. Therefore, such sources can also be useful
for entanglement-based quantum cryptography [142].

⁶ For the observation of polarization entanglement, see [136]. For momentum
entanglement, see the proposal [137] and the experimental results [138].
Time–energy entanglement was proposed in [139]. Experiments are described in [140].

In the second way, true entangled photon pairs can be generated. This is
essential for all the other quantum communication schemes, where one cannot
use the detection selection method. Several methods to obtain momentum-
entangled pairs [143] have been demonstrated experimentally [144], but are
extremely difficult to handle experimentally owing to the huge requirements
on the stability of the whole setup. Any phase change, i.e. a change in the path
lengths by as little as 10 nm, is devastating for the experiment. Also, the re-
cently developed source of time–energy-entangled photon pairs [145] partially
shares these problems and, to avoid detection selection, requires fast optical
switches. Fortunately, with polarization entanglement as produced by type
II parametric down-conversion, the stability requirements are considerably
more relaxed.
by an entangled state:
$$|\Psi\rangle = |H\rangle_1 |V\rangle_2 + e^{i\alpha} |V\rangle_1 |H\rangle_2 , \qquad (3.13)$$
where the relative phase α arises from the crystal birefringence, and an overall
phase shift is omitted. Using an additional birefringent phase shifter (or even
by slightly rotating the down-conversion crystal itself), the value of α can be
set as desired, e.g. to the value 0 or π. Thus, polarization-entangled states
are produced directly out of a single nonlinear crystal (beta barium borate,
BBO), with no need for extra beam splitters or mirrors and no requirement
to discard detected pairs.
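The role of the phase α in (3.13) can be made visible by computing the coincidence probability behind two linear polarizers. The sketch assumes the normalized version of (3.13), i.e. with an additional factor $1/\sqrt{2}$, and ideal polarizers that transmit cos θ |H> + sin θ |V>; the function name and the sampled angles are hypothetical.

```python
import cmath
import math


def coincidence_probability(theta1_deg, theta2_deg, alpha):
    """Probability that both photons pass polarizers at theta1 and theta2.

    Assumes the normalized state (|H>|V> + exp(i*alpha) |V>|H>) / sqrt(2) and
    polarizers that transmit cos(theta)|H> + sin(theta)|V>.
    """
    t1 = math.radians(theta1_deg)
    t2 = math.radians(theta2_deg)
    amplitude = (math.cos(t1) * math.sin(t2)
                 + cmath.exp(1j * alpha) * math.sin(t1) * math.cos(t2)) / math.sqrt(2)
    return abs(amplitude) ** 2


# Coincidence fringes when the first analyzer is varied and the second is fixed at 45 deg.
for label, alpha in (("alpha = 0 ", 0.0), ("alpha = pi", math.pi)):
    fringe = [coincidence_probability(t, 45.0, alpha) for t in (-45, 0, 45, 90)]
    print(label, [f"{p:.2f}" for p in fringe])
```

Within this model the two phase settings give complementary fringes, (1/2) sin²(θ₁ + θ₂) for α = 0 and (1/2) sin²(θ₁ − θ₂) for α = π, which is how the two states can be told apart in a correlation measurement.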
Best of all, by using two extra birefringent elements, one can easily pro-
duce any of the four orthogonal Bell states. For example, when starting with
the state $|\Psi^+\rangle$, a net phase shift of π and thus a transformation to the state
$|\Psi^-\rangle$ may be obtained by rotating a quarter-wave plate in one of the two
paths by 90° from the vertical to the horizontal direction. Similarly, a half-wave
plate in one path can be used to change a horizontal polarization to
vertical and to switch to the states $|\Phi^\pm\rangle$.
The birefringent nature of the down-conversion crystal complicates the
actual entangled state produced, since the ordinary and the extraordinary
photons have different velocities inside the crystal, and propagate along dif-
ferent directions even though they become parallel and, for short crystals,
collinear outside the crystal. The resulting longitudinal and transverse walk-
off between the two polarizations in the entangled state is maximal for pairs
created near the entrance face of the crystal, which consequently acquire the
greatest time delay and relative lateral displacement. Thus the two possible
emissions become, in principle, distinguishable by the order in which the de-
tectors would fire or by their spatial location, and no entanglement will be
observable. However, the photons are produced coherently along the entire
length of the crystal. One can thus completely compensate for the longitudi-
nal walk-off and partially for the transverse walk-off by using two additional
crystals, one in each path [147]. By verifying the correlations produced by
this source, one can observe strong violations of Bell’s inequalities (modulo
the typical auxiliary assumptions) within a short measurement time [31].
The experimental setup is shown in Fig. 3.8a. The 351.1 nm pump beam
(150 mW) is obtained from a single-mode argon ion laser, followed by a dis-
persion prism to remove unwanted laser fluorescence (not shown) [111]. Our
3 mm long BBO crystal was nominally cut such that $\theta_{pm}$, the angle between
the optic axis and the pump beam, was 49.2°, to allow collinear, degenerate
operation when the pump beam is precisely orthogonal to the surface.
The optic axis was oriented in the vertical plane, and the entire crystal was
tilted (in the plane containing the optic axis, the surface normal and the
pump beam) by 0.72°, thus increasing the effective value of $\theta_{pm}$ inside the
crystal to 49.63°. The two cone overlap directions, selected by irises before
the detectors, were consequently separated by 6.0°. Each polarization analyzer
consisted of two-channel polarizers (polarizing beam splitters) preceded
Fig. 3.8. (a) Experimental setup for the observation of entanglement produced by
a type II down-conversion source. The additional birefringent crystals are needed
to compensate for the birefringent walk-off effects from the first crystal. (b) Coincidence
fringes for the Bell states $|\Psi^+\rangle$ (•) and $|\Psi^-\rangle$ (◦) obtained when varying the
analyzer angle $\Theta_1$, with $\Theta_2$ set to 45°
ups and is remarkably stable. One of the reasons is that phase drifts are
not detrimental to a polarization-entangled state unless they are birefrin-
gent, i.e. polarization-dependent – this is a clear advantage over experiments
with momentum-entangled or energy–time-entangled photon pairs. Recently,
Kwiat and coworkers tested sandwiched type I crystals and achieved, for thin
crystals, a significantly higher relative yield of entangled photon pairs [148].
Also, utilizing cavities to enhance the pump field in the nonlinear crystal can
boost the output by a factor of 20 [149]. This gives hope that even more
efficient generation of entangled photon pairs will be obtained in the future.
The Principle. Let us discuss first the generic case of two interfering par-
ticles. If we have two otherwise indistinguishable particles in different beams
and overlap these two beams at a beam splitter, we ask ourselves, what is the
probability to find the two particles in different output beams of the beam
splitter (Fig. 3.9a). Alternatively we can ask, what is the probability that
two detectors, one in each output beam, detect one photon each.
If we performed this experiment with fermions, we would at first naively
expect the two fermions to arrive in different output beams. This is sug-
gested by the Pauli principle, which requires that the two particles cannot
be in the same quantum state, that is, they cannot exit in the same output
beam. Analogously, interference of bosons at a beam splitter will result in
the expectation of finding both bosons in one output beam. For a symmetric
50/50 beam splitter, it is fully random whether the two bosons will be detected
in the upper or the lower detector, but they will always be detected
by the same detector. However, it is important to realize that the statements
above are only correct if one disregards the internal degrees of freedom of the
interfering particles.
Ultimately, the reason for the different behaviors lies in the different sym-
metries of the wave functions describing bosonic and fermionic particles [150].
There are four different possibilities for how the two particles could propa-
gate from the input to the output beams of the beam splitter. We obtain
one particle in each output if both particles are reflected or both particles
are transmitted; we observe both particles at one detector if one particle
is transmitted and the other reflected, or vice versa. For the antisymmetric
states of fermions, the two possibilities of both particles being transmitted
and both being reflected interfere constructively, resulting in firing of each
of the two detectors. For the symmetric states of bosons, these two amplitudes
interfere destructively, giving no simultaneous detection in different
output beams [152]. For photons with identical polarizations, which means
for bosons, this interference effect has been known since the experiments by
Hong et al. [153],⁷ but it has not yet been observed for fermions.

Fig. 3.9. (a) Interference of two particles at a beam splitter. The observation of
coincident detection, i.e. detection of one particle at each of the two detectors, is
sensitive to the symmetry of the spatial component of the quantum state of the
combined system. (b) Bell-state analyzer for identifying the Bell states $|\Psi^+\rangle$ and
$|\Psi^-\rangle$ by observing different types of coincidences. The other two Bell states $|\Phi^\pm\rangle$
exhibit the same detection probabilities (both photons are detected by one detector)
for this setup and cannot be distinguished

What kinds of interference effects of two photons at a beam splitter are to
be expected if we consider also the internal degree of freedom of the photons,
i.e. their polarization? In particular, if we interfere two polarization-entangled
photons at a beam splitter, the Bell state describes only the internal degree of
freedom. Inspection of the four Bell states shows that the state $|\Psi^-\rangle$ is
antisymmetric, whereas the other three are symmetric. However, if two particles
interfere at a nonpolarizing beam splitter, what matters is only the spatial
part of the wave function. The symmetry of the wave function is determined
by the requirement that for two photons, the total state has to be symmetric
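again. We therefore obtain, for the total state of two photons in the antisymmetric
Bell state formed from two beams a and b at the beam splitter,
$$|\Psi\rangle = \frac{1}{2} \left( |H\rangle_1 |V\rangle_2 - |V\rangle_1 |H\rangle_2 \right) \left( |a\rangle_1 |b\rangle_2 - |b\rangle_1 |a\rangle_2 \right) . \qquad (3.14)$$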
⁷ For further experiments and theoretical generalizations, see [154].
This means that, for the state $|\Psi^-\rangle$, we also have an antisymmetric spatial
part of the wave function and thus expect a different detection probability,
that is, different coincidences between the two detectors, compared with the
other three Bell states.
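The different coincidence behavior of symmetric and antisymmetric spatial states can be reproduced with a short calculation. The sketch assumes an ideal, lossless 50/50 beam splitter with the symmetric convention a → (c + i d)/√2, b → (i c + d)/√2; this convention, the index ordering and the function names are assumptions chosen for illustration.

```python
import math

S = 1 / math.sqrt(2)
# Assumed symmetric 50/50 beam splitter, U[out][in]; index 0 = beam a (output c),
# index 1 = beam b (output d).
U = [[S, 1j * S], [1j * S, S]]


def through_beam_splitter(m):
    """Transform both particles: m'[p][q] = sum_xy U[p][x] U[q][y] m[x][y]."""
    return [[sum(U[p][x] * U[q][y] * m[x][y]
                 for x in range(2) for y in range(2))
             for q in range(2)]
            for p in range(2)]


def coincidence_probability(m):
    """Probability of finding one particle in each output beam."""
    out = through_beam_splitter(m)
    return abs(out[0][1]) ** 2 + abs(out[1][0]) ** 2


# m[x][y]: amplitude for particle 1 in beam x and particle 2 in beam y.
symmetric = [[0, S], [S, 0]]       # (|a>|b> + |b>|a>)/sqrt(2)
antisymmetric = [[0, S], [-S, 0]]  # (|a>|b> - |b>|a>)/sqrt(2)

print(round(coincidence_probability(symmetric), 6))      # 0.0
print(round(coincidence_probability(antisymmetric), 6))  # 1.0
```

With this convention the antisymmetric spatial state passes the beam splitter unchanged and always produces a coincidence, whereas the symmetric state never does.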
We therefore can discriminate the state $|\Psi^-\rangle$ from all the other states. It
is the only one which leads to coincidences between the two detectors in the
output beams of the beam splitter. Can we also identify the other Bell states?
If two photons are in the state $|\Psi^+\rangle$, they will both propagate in the same
output beam but with orthogonal polarizations in the horizontal/vertical
(H/V) basis, whereas two photons in the state $|\Phi^+\rangle$ or in the state $|\Phi^-\rangle$,
which also both leave the beam splitter in the same output arm, have the
same polarization in the H/V basis. Thus we can discriminate between the
state $|\Psi^+\rangle$ and the states $|\Phi^\pm\rangle$ by a polarization analysis in the H/V basis
and by observing either coincidences between the outputs of a two-channel
polarizer or both photons again in only one output (Fig. 3.9b). Note that
reorientation of the polarization analysis allows one to separate any other of
these three states from the other two, but it is not possible to distinguish
between all of them simultaneously [155]. If the photons were entangled in
yet another degree of freedom, i.e. they were four-state systems rather than
regular qubits, one could also discriminate between the states $|\Phi^+\rangle$ and $|\Phi^-\rangle$
[156]. But up to now, no quantum communication scheme seems to have
profited from this fact.
Summarizing, we conclude that two-photon interference can be used to
identify two of the four Bell states, with the other two giving the same third
detection result. One thus cannot perform complete Bell-state analysis by
these interferometric means, but we can identify three different settings in
quantum dense coding and, for teleportation, even identification of only one
of the Bell states is sufficient to transfer any quantum state from one particle
to another, although then only in a quarter of the trials.
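The consequence for dense coding can be stated as a small bookkeeping exercise. The sketch below only tabulates the detection signatures described above; the dictionary and variable names are hypothetical, and the capacity figure assumes that Alice uses three equally likely messages and that the apparatus is ideal.

```python
import math

# Detection signature of each Bell state in the interferometric analyzer,
# as described in the text (H/V polarization analysis behind the beam splitter).
SIGNATURE = {
    "Psi-": "one photon in each output beam of the beam splitter",
    "Psi+": "same output beam, orthogonal polarizations",
    "Phi+": "same output beam, same polarization",
    "Phi-": "same output beam, same polarization",
}

distinguishable = len(set(SIGNATURE.values()))
bits_per_particle = math.log2(distinguishable)

print(distinguishable)              # 3
print(f"{bits_per_particle:.3f}")   # 1.585
```

Three distinguishable results correspond to about 1.58 bits per transmitted two-state particle, which is still more than the one bit available without entanglement.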
For example, if we detect one photon behind the beam splitter at almost
the same time as one of the additional down-conversion photons, we can infer
the origin of the photon that is to interfere. However, if the time difference
between the detection events of the two interfering photons, that is, the over-
lap at the beam splitter, is much less than their coherence time, then the
detection of any other photon cannot give any additional information about
their origin. This ultra-coincidence condition requires the use of narrow filters
in order to make the coherence time as long as possible. However, even if we
consider using state-of-the-art interference filters that yield a coherence time
of about 3 ps, no detectors fast enough exist at present. And an even stronger
filtering by Fabry–Perot cavities (to achieve the necessary coherence time of
about 500 ps) results in prohibitively low count rates. Only a considerable
increase of the number of photon pairs emitted into a narrow wavelength
window may allow one to use this technique (e.g. with a subthreshold OPO
configuration as demonstrated in [158]).
The best choice, as it turns out, is not to try to detect the two photons
simultaneously, but rather to generate them with a time definition much
better than their coherence time. Consider two down-conversion processes
pumped by pulsed UV beams (using either two crystals or, as is the case
in our experiments, one crystal pumped by two passages of a UV beam).
Again we attempt to observe interference between two photons, one from each
down-conversion process. Then, without any narrow filters in the beams, the
tight time correlation of the photons coming from the same down-conversion
permits one again to associate simultaneously detected photons with each
other. This provides path information and hence prohibits interference.
We now insert filters before (or behind) the beam splitter. With stan-
dard filters, and thus also with high enough count rates, one easily achieves
coherence times on the order of 1 ps. And it is possible to pump the two down-
conversion processes with UV pulses with a duration shorter than 200 fs. Thus
it follows that the photons detected behind the beam splitter carry practi-
cally no information anymore on the detection times of their twin photons,
and, vice versa, detection of those latter photons does not give which-path
information, which would destroy the interference.
The “coincidence time” for registering the photons now can be very long;
it merely needs to be shorter than the repetition time of the UV pulses, which
is on the order of 10 ns for commercially available laser systems. One thus
can expect very good visibility of interference and very good precision of the
Bell-state analysis.
and have a correlation time of less than 100 ns [161]. The photons are de-
tected by silicon avalanche photodiodes, and time interval analyzers on local
personal computers register all detection events as time stamps together with
the settings of the analyzers and the detection results.
Quantum key distribution is started by a single light pulse sent from the
source to Alice and Bob via a second optical fiber. After a run of about 5 s
duration has been completed, Alice and Bob compare their lists of detec-
tions to extract the coincidences. In order to record the detection events very
accurately, the time bases in Alice’s and Bob’s time interval analyzers are
controlled by two rubidium oscillators. Overall, the system has a measured
rate of total coincidences of ∼ 1700 per second, and a collection efficiency of
each photon path of 5%. All the necessary equipment for the source, Alice and
Bob has been proven to operate outside shielded laboratory environments
with very high reliability.
For the realization of entanglement-based quantum cryptography using
the Wigner inequality, Alice switches the analyzer randomly between −30°
and 0°, and Bob between 0° and +30°. After a run, Alice and Bob extract
from the coincidences the probabilities p++(0°, 30°), p++(−30°, 0°), and
p++(−30°, 30°) for the corresponding analyzer settings. We obtain −0.112 ±
0.014 for the left-hand side of the Wigner inequality (3.7), which is in good
agreement with the predictions of quantum mechanics, and the coincidences
obtained at the parallel settings, (0◦ , 0◦ ), can be used as a quantum key. In a
typical run, Alice and Bob established 2162 bits of raw quantum key material
at a rate of 420 baud, and observed a quantum bit error rate (QBER) of 3.4%.
By biasing the frequencies of the analyzer combinations, the production rate
of the quantum keys can be increased to about 1700 baud without sacrificing
security.
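The quantum mechanical prediction referred to above can be checked with a short calculation. The following is only a sketch under two assumptions that are not spelled out in this passage: that the Wigner inequality (3.7) has the form p++(α, β) + p++(β, γ) − p++(α, γ) ≥ 0, and that the entangled polarization state gives p++(α, β) = ½ sin²(β − α). The function names are hypothetical.

```python
import math

def p_plus_plus(alpha_deg, beta_deg):
    """Assumed quantum prediction for the joint '++' probability:
    1/2 * sin^2(beta - alpha), with analyzer angles given in degrees."""
    delta = math.radians(beta_deg - alpha_deg)
    return 0.5 * math.sin(delta) ** 2

def wigner_lhs(alpha_deg, beta_deg, gamma_deg):
    """Left-hand side of the assumed Wigner inequality
    p++(alpha, beta) + p++(beta, gamma) - p++(alpha, gamma) >= 0."""
    return (p_plus_plus(alpha_deg, beta_deg)
            + p_plus_plus(beta_deg, gamma_deg)
            - p_plus_plus(alpha_deg, gamma_deg))

# Analyzer settings used in the experiment: -30, 0 and +30 degrees.
lhs = wigner_lhs(-30.0, 0.0, 30.0)
print(round(lhs, 3))
```

Under these assumptions the left-hand side evaluates to −1/8, which the measured value −0.112 ± 0.014 reproduces within one standard deviation.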
To demonstrate the entanglement-based BB84 scheme, Alice’s and Bob’s
analyzers both switched independently and randomly between 0° and 45°.
After a measurement run, Alice and Bob extracted the coincidences measured
with parallel analyzers to generate the quantum key. In the experiment, Alice
and Bob collected 80 000 bits of quantum key at a rate of 850 baud and
observed a quantum bit error rate of 2.5%. To correct the remaining errors
and ensure the secrecy of the key, various classical error correction and privacy
amplification schemes have been developed. With a very fast and efficient
algorithm, a single iteration gives 49 984 bits with a significantly reduced
QBER of 0.40% [113].
For the first realization of this quantum communication scheme, the experi-
ment consisted of three distinct parts (Fig. 3.11): the EPR source, generating
entangled photons in a well-defined state; Alice’s station, for encoding the
messages by a unitary transformation of her particle; and Bob’s Bell-state
analyzer, for reading the signal sent by Alice.
86 Harald Weinfurter and Anton Zeilinger
Fig. 3.11. Experimental setup for quantum dense coding. The two entangled pho-
tons created by type II down-conversion are distributed to Alice and Bob. Alice
sends her photon, after manipulation with birefringent plates, to Bob, who can read
the encoded information by interferometric Bell-state analysis. The path length de-
lay ∆ is varied to achieve optimal interference
Fig. 3.12. Coincidence rates C_HV (•) and C_HV′ (◦) as functions of the path length
difference ∆, when the states |Ψ+⟩ (left) or |Ψ−⟩ (right) are analyzed by Bob’s
interferometric Bell-state analyzer
Figure 3.12 shows the dependence of the coincidence rates C_HV (•) and
C_HV′ (◦) on the path length difference, when either the state |Ψ+⟩ (left) or
the state |Ψ−⟩ (right) has been sent to the Bell-state analyzer (the rates C_H′V and
C_H′V′ display analogous behavior; we use the notation C_AB for the coincidence
rate between the detectors D_A and D_B). For perfect path length tuning, C_HV
reaches its maximum for |Ψ+⟩ (left) and vanishes (apart from noise) for |Ψ−⟩
(right). C_HV′ displays the opposite dependence and clearly signifies |Ψ−⟩. The
results of these measurements imply that if both photons are detected, we
can identify the state |Ψ+⟩ with a reliability of 95%, and the state |Ψ−⟩ with
a reliability of 93%.
The performance of the dense-coding transmission is influenced not only
by the quality of the alignment procedure, but also by the quality of the
states sent by Alice. In order to evaluate the latter, the beam splitter was
translated out of the beams. Then an Einstein–Podolsky–Rosen–Bell-type
correlation measurement analyzed the degree of entanglement of the source,
as well as the quality of Alice’s transformations. The correlations were only 1–
2% higher than the visibilities with the beam splitter in place, which means
that the quality of this experiment is limited more by the quality of the
entanglement of the two beams than by that of the interference achieved.
When using Si avalanche diodes in the Geiger mode for single-photon de-
tection, a modification of the Bell-state analyzer is necessary, since then, for
the states |Φ±⟩, one has to register the two photons leaving the Bell-state
analyzer via a coincidence detection. One possibility is to avoid interference
altogether for these states by introducing polarization-dependent delays before
Bob’s beam splitter. Another approach is to split the incoming two-photon
state at an additional beam splitter and to detect it (with 50% likelihood)
by a coincidence count between detectors in each output (inset of Fig. 3.13).
For the purpose of a proof-of-principle demonstration, we put such a con-
Fig. 3.13. Coincidence rates as functions of the path length difference ∆. Because
of the nature of the Si avalanche photodiodes, the extension shown in the inset is
necessary for identifying two-photon states in one output
Fig. 3.14. “1.58 bits per photon” quantum dense coding: the ASCII codes for the
letters “KM°” (i.e. 75, 77, 179) are encoded in 15 trits instead of the 24 bits usually
necessary. The data for each type of encoded state are normalized to the maximum
coincidence rate for that state
3. Quantum Communication 89
Fig. 3.16. Coincidence rate between the two detectors of Alice’s Bell-state analyzer
as a function of the delay between the two photons 1 and 2. The data for the +45°
and −45° polarizations of photon 1 are equal within the statistics, which shows
that no information about the state of photon 1 is revealed to Alice
graphs show the results obtained when the initial polarization of photon 1
was set either to 45° or to vertical polarization and then the polarization of
photon 3 along the corresponding direction was analyzed. The reduction in
the polarization to about 65% is due to the limited degree of entanglement
between photons 2 and 3 (85%), and to the reduced contrast of the interfer-
ence at the beam splitter as a consequence of the relatively short coherence
time of the detected photons. Of course, better beam definition by narrow
pinholes and more stringent filtering could improve this value. However, this
would cause further, unacceptable loss in the fourfold coincidence rates. Each
of the polarization data points shown was obtained from about 100 four-fold
coincidence counts in 4000 s.
Fig. 3.17. Polarization of photon 3 after teleportation, compared with the po-
larization initially prepared on photon 1. The analyzer testing the quality of the
teleportation performed by Alice and Bob was oriented parallel to the initial po-
larization
These measurements and also runs with the initial polarization along
other directions demonstrate the ability to teleport the polarization of any
pure state. Of course, since the directions used are mutually nonorthogonal,
one can infer that the scheme works for any arbitrary quantum state. How-
ever, there is a much more direct way to experimentally demonstrate the full
power of quantum teleportation.
One way to demonstrate that any arbitrary quantum state can be trans-
ferred is to use the fact that we can also obtain entanglement between photons
1 and 4 (Fig. 3.18). After the polarizer was removed from arm 1 and put into
arm 4, the state of 1 was not defined anymore, but still could be teleported
to photon 3; this was demonstrated by showing that now the entanglement
had been swapped to photons 3 and 4.
The state of photon 1 (Fig. 3.18), which is part of an entangled pair (pho-
tons 1 and 4), is fully undetermined and is formally described by a mixed
state. If one can teleport this state to another photon, i.e. to Bob’s photon 3,
we expect to find this photon in a mixed state, that means it is unpolarized.
Now, since Bob’s photon was originally also part of an entangled pair (pho-
tons 2 and 3), it was unpolarized anyway. One might conclude that here we
did not achieve anything. However, if one determines not only the polariza-
tion of photon 3 but the correlations between photons 3 and 4, one finds that
now these two photons, which have been produced independently by different
processes, are entangled [115].
Figure 3.19 verifies the entanglement between photons 3 and 4, condi-
tioned on coincidence detection of photons 2 and 3. Varying the angle Θ of
the polarizer in arm 4 causes a sinusoidal variation of the count rate, here
with the analyzer of photon 3 set to ±45°. This shows that we did not teleport
just a mixed state, but actually the as yet undetermined state of the
entangled photon.
These experiments present the first demonstration of quantum teleporta-
tion, that is, the transfer of a qubit from one two-state particle to another.
In the meantime, further steps have been achieved, in particular the remote
preparation of the state of Bob’s photon (sometimes also called “telepor-
tation”) [10] and, especially important, the teleportation of the state of an
electromagnetic field [11]. The latter is the first example of teleportation of
continuous variables based on the original EPR entanglement. The first ex-
periment demonstrated the feasibility of transfer of fluctuations of a coherent
state from one light beam to another. Although the experiment was limited
to a narrow bandwidth of 100 kHz, this was only a technical limitation due
to the detection electronics, the modulators and the bandwidth of the source
Fig. 3.19. Verification of the entanglement between photons 3 and 4. The sinusoidal
dependence of the fourfold coincidence rate on the orientation Θ of the polarizer
in arm 4 for ±45° polarization analysis of photon 3 demonstrates the possibility to
teleport any arbitrary quantum state
3.6 Outlook
Quantum communication with entangled photons has shown its power and
its fascinating features. Our experiments, where realistic entanglement-based
quantum cryptography has been performed, where the capacity of commu-
nication channels has been increased beyond classical limits and where the
polarization state of a photon has been transferred to another one by means
of quantum teleportation, are only the first steps towards the exploitation of
new resources for communication and information processing.
Quantum communication can offer a wealth of further possibilities, es-
pecially when combined with simple quantum logic circuitry. Quantum com-
puters have to operate on large numbers of qubits to really demonstrate their
power. But quantum communication schemes already profit from combining
only a few qubits and entangled systems. Quantum logic operations with sev-
eral particles are already useful in examples of the quantum coding theorem
[69], but have shown their importance particularly in the proposals for entan-
glement purification [119]. Any realistic transmission of quantum states will
suffer from noise and decoherence along the line. If one wants to distribute
entangled pairs of particles to, say, Alice and Bob, the entanglement between
the received particles will be considerably degraded, which would prevent
successful quantum teleportation, for example. If Alice and Bob now com-
bine the particles of several such noisy pairs on each side by quantum logic
that the future will show an enormous potential for and benefit from the
use of other quantum communication methods, such as the distribution of
entanglement over large distances and the transfer of quantum information
in the process of quantum teleportation.
4 Quantum Algorithms: Applicable Algebra
and Quantum Physics
4.1 Introduction
[Figure: graphical notation for a local unitary operation $U^{(i)}$ acting on qubit i and for the gate $\mathrm{CNOT}^{(i,j)}$ with control qubit j (•) and target qubit i (⊕)]
The possible operations that this computer can perform are the elements
of the unitary group $U(2^n)$. To study the complexity of performing unitary
operations on n-qubit quantum systems, we introduce the following two types
of computational primitives: local unitary operations on a qubit i are matrices
of the form $U^{(i)} = 1_{2^{i-1}} \otimes U \otimes 1_{2^{n-i}}$, where U is an element of the unitary
group $U(2)$ of 2 × 2 matrices and $1_N$ denotes the identity matrix of size
N. Furthermore, we need operations which affect two qubits at a time, the
most prominent of which is the so-called controlled NOT gate (also called a
measurement gate) between the qubits j (control) and i (target), denoted by
$\mathrm{CNOT}^{(i,j)}$. On the basis vectors $|x_n, \ldots, x_1\rangle$ of $\mathcal{H}_{2^n}$, the operation $\mathrm{CNOT}^{(i,j)}$
is defined by
where the first two operations generate a dense subgroup in $U(2)$. The
Hadamard transformation, which is part of this generating set, is denoted
by
\[ H_2 := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]
and is an example of a Fourier transformation on the abelian group $Z_2$
(see Sect. 4.4).
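As a small numerical illustration, the following sketch checks that the Hadamard matrix is unitary and that it coincides with the Fourier matrix of $Z_2$, i.e. with entries $\omega^{ij}/\sqrt{2}$ for $\omega = -1$. The helper name `matmul` is hypothetical and the code uses plain nested lists rather than any particular linear algebra library.

```python
import math

# Hadamard matrix H2 as a nested list.
H2 = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
      [1 / math.sqrt(2), -1 / math.sqrt(2)]]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# H2 is real and symmetric, so unitarity reduces to H2 * H2 = identity.
product = matmul(H2, H2)

# Fourier matrix of Z_2: entries omega^(i*j) / sqrt(2) with omega = -1.
dft_z2 = [[(-1) ** (i * j) / math.sqrt(2) for j in range(2)] for i in range(2)]
```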
• Knill has obtained a general upper bound $O(n 4^n)$ for the approximation
of unitary matrices using a counting argument [178, 179].
• We cite the following approximation result from Sect. 4.2 of [180]: Fix
a number n of qubits and suppose that $\overline{\langle X_1, \ldots, X_r \rangle} = SU(2^n)$, i.e.
$X_1, \ldots, X_r$ generate a dense subgroup in the special unitary group $SU(2^n)$.
Then it is possible to approximate a given matrix $U \in SU(2^n)$ with
given accuracy $\epsilon > 0$ by a product of length $O\{\mathrm{poly}[\log(1/\epsilon)]\}$, where the
factors belong to the set $\{X_1, \ldots, X_r, X_1^{-1}, \ldots, X_r^{-1}\}$. Furthermore, this
approximation is constructive and efficient, since there is an algorithm
with running time $O\{\mathrm{poly}[\log(1/\epsilon)]\}$ which computes the approximating
product. However, we remind the reader that this holds only for a fixed
value of n; the constant hidden in the O-calculus grows exponentially
with n (see Theorem 4.8 of [180]).
From now on, we put the main emphasis on the model for realizing unitary
transformations exactly and on the associated complexity measure κ.
In general, only exponential upper bounds for the minimal length occurring
in factorizations are known. However, there are many interesting classes of
unitary matrices in $U(2^n)$ that lead to only a polylogarithmic word length,
which means that the length of a minimal factorization grows asymptotically
like O[p(n)], where p is a polynomial.
In the following we give some examples of transformations, their factor-
ization into elementary gates and their graphical representation in terms of
quantum gate arrays. The operations considered in these examples admit
short factorizations and will be useful in the subsequent parts of this chap-
ter.
Fig. 4.2. Factorization (1, 3, 2) = (1, 2)(2, 3)
Fig. 4.3. Controlled gates with (left) normal control bit, $\Lambda_1(U) = 1_{2^n} \oplus U$, and
(right) inverted control bit, $U \oplus 1_{2^n}$. Here ⊕ is used to denote the block-direct
sum of matrices
Example 4.3 (Cyclic Shift). Let $P_n \in S_{2^n}$ be the cyclic shift acting on the
states of the quantum register as $x \mapsto x + 1 \bmod 2^n$. The corresponding
permutation matrix is the $2^n$-cycle $(0, 1, \ldots, 2^n - 1)$. The unitary matrix $P_n$
can be realized in a polylogarithmic number of operations; see Fig. 4.4 for a
realization using Boolean gates only.
[Fig. 4.4: circuit for the cyclic shift $P_n$, drawn as a cascade of multiply controlled NOT gates]
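The cyclic shift of Example 4.3 can be simulated on classical bit strings. The sketch below is one possible realization by a cascade of multiply controlled NOT gates; the gate ordering and the function names are assumptions and need not coincide with the circuit of Fig. 4.4. Each multiply controlled NOT would still have to be decomposed into elementary gates, which the code does not attempt.

```python
def controlled_not(bits, controls, target):
    """Flip bits[target] if all control bits are 1 (multiply controlled NOT)."""
    if all(bits[c] for c in controls):
        bits[target] ^= 1

def cyclic_shift(x, n):
    """Map x -> x + 1 mod 2**n using only multiply controlled NOT gates."""
    bits = [(x >> k) & 1 for k in range(n)]   # bits[0] is the least significant bit
    # Work from the most significant bit down, so that every gate still
    # sees the unmodified lower bits as its controls.
    for target in range(n - 1, -1, -1):
        controlled_not(bits, controls=range(target), target=target)
    return sum(b << k for k, b in enumerate(bits))

print([cyclic_shift(x, 3) for x in range(8)])
```

A bit is flipped exactly when all lower bits are 1, which is the carry rule of binary incrementing; the gate acting on the lowest bit has no controls and always fires.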
where the last sum runs over all nonconstant multilinear monomials m in the
ring Rn .
This section deals with the issue of embedding a given transform A into a
unitary matrix of larger size. We start by considering the problem of realizing
a given matrix A as a submatrix of a unitary matrix of larger size. The
following theorem (see also [186]) shows that the only condition A has to
fulfill in order to allow an embedding involving one additional qubit is to be
of bounded norm, i.e. $\|A\| \le 1$, with respect to the spectral norm.
Proof. Observe that the $2n \times n$ block column matrix $U_1 := (A, (1_n - A^\dagger A)^{1/2})^t$ has the
property $U_1^\dagger \cdot U_1 = 1_n$. Analogously, for the $n \times 2n$ matrix $U_2 := [A, (1_n - AA^\dagger)^{1/2}]$,
the identity $U_2 \cdot U_2^\dagger = 1_n$ holds. An easy computation shows that (4.2) is
indeed unitary.
Since each matrix in $C^{n \times n}$ can be renormalized by multiplication with a
suitable scalar to fulfill the requirement of a bounded norm, we can realize
all operations up to a scalar prefactor by unitary embeddings. The embedding
(4.2) is by no means unique. However, it is possible to parametrize all
embeddings by $(1_n \oplus V_1) \cdot U_A \cdot (1_n \oplus V_2)$, where $V_1, V_2 \in U(n)$ are arbitrary
unitary transforms.
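To make the embedding concrete, the following sketch builds a unitary of twice the size for a diagonal matrix A and checks unitarity numerically. It assumes that the embedding has the standard block form with A and $(1_n - AA^\dagger)^{1/2}$ in the first block row and $(1_n - A^\dagger A)^{1/2}$ and $-A^\dagger$ in the second, which is consistent with the matrices $U_1$ and $U_2$ of the proof but is not reproduced in this passage. The restriction to diagonal A (so that the matrix square roots are entrywise) and all function names are choices made for this illustration.

```python
import math

def dagger(m):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def embed_diagonal(diag):
    """Embed A = diag(a_1, ..., a_n) with |a_k| <= 1 into a 2n x 2n unitary
    of the assumed block form [[A, S], [S, -A^dagger]], S = (1 - |A|^2)^(1/2)."""
    n = len(diag)
    u = [[0j] * (2 * n) for _ in range(2 * n)]
    for k, a in enumerate(diag):
        a = complex(a)
        s = math.sqrt(1 - abs(a) ** 2)
        u[k][k] = a
        u[k][n + k] = complex(s)
        u[n + k][k] = complex(s)
        u[n + k][n + k] = -a.conjugate()
    return u

U = embed_diagonal([0.6, 0.5j])
UUd = matmul(U, dagger(U))   # should be the 4 x 4 identity up to rounding
```

The upper left block of `U` is A itself, so applying `U` to states whose additional qubit is in the state |0⟩ and projecting back onto that subspace reproduces the action of A.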
We are naturally led to a different kind of embedding if the given transformation
is unitary and we want to realize it on a qubit architecture, i.e. if
we restrict ourselves to matrices whose size is a power of 2. Then, a given
unitary matrix $U \in U(N)$ can be embedded into a unitary matrix in $U(2^n)$ by
choosing $n = \lceil \log N \rceil$ and padding U with an identity matrix $1_{2^n - N}$ of size
Example 4.5. Let $f : GF(2)^2 \to GF(2)$ be the AND function, i.e. let $f = x \cdot y$
be the RNF of f. Note that $\tilde{f}$ can be chosen to be the function $(x, y, z) \mapsto
(x, y, z \oplus f(x, y))$ since the codomain is endowed with a group structure.
Overall, we obtain the function table of $\tilde{f}$ given in Fig. 4.5. The variables
with a prime correspond to the values after the transformation has been
performed.
x y z | x′ y′ z′      x y z | x′ y′ z′
0 0 0 | 0  0  0       1 0 0 | 1  0  0
0 0 1 | 0  0  1       1 0 1 | 1  0  1
0 1 0 | 0  1  0       1 1 0 | 1  1  1
0 1 1 | 0  1  1       1 1 1 | 1  1  0

|x⟩ ──•── |x⟩
|y⟩ ──•── |y⟩
|z⟩ ──⊕── |z ⊕ x · y⟩
Fig. 4.5. Truth table for the Toffoli gate and the corresponding quantum circuit
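The truth table of Fig. 4.5 can be regenerated mechanically from the definition $(x, y, z) \mapsto (x, y, z \oplus x \cdot y)$. The short sketch below does this and is meant only as an illustration; the function name `toffoli` is a choice made here.

```python
from itertools import product

def toffoli(x, y, z):
    """Reversible AND: the target z is flipped exactly when x = y = 1."""
    return x, y, z ^ (x & y)

# Function table of the reversible extension of AND on three bits.
table = {bits: toffoli(*bits) for bits in product((0, 1), repeat=3)}
for inp, out in table.items():
    print(inp, "->", out)
```

Since applying `toffoli` twice returns the input, the map is a permutation of the eight bit strings, which is what makes it admissible as a quantum gate.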
Theorem 4.3. Suppose $f : \{0,1\}^n \to \{0,1\}^m$ is a Boolean function which
can be computed using c operations from the universal set {AND, NOT}
of classical gates. Then $\tilde{f} : \{0,1\}^{n+m} \to \{0,1\}^{n+m}$, defined by $(x, y) \mapsto
(x, y \oplus f(x))$, is a reversible Boolean function which can be computed by a
circuit of length 2c + m built up from the set {CNOT, τ} of reversible gates.
Even though the construction described in Theorem 4.3 works for arbitrary
$f : \{0,1\}^n \to \{0,1\}^m$, in general only $r_f = \lceil \log \max_{y \in \{0,1\}^m} |f^{-1}(y)| \rceil$
additional bits are necessary to define a reversible Boolean function $f_{rev} :
\{0,1\}^{n+r_f} \to \{0,1\}^{n+r_f}$ with the property $f_{rev}|_{\{0,1\}^n} = f$. The reason is that
by using the additional $r_f$ bits, the preimages of f can be separated via a
suitable binary encoding. However, the complexity of a Boolean circuit realizing
$f_{rev}$ constructed in this way cannot be controlled as easily as for the
function defined in Theorem 4.3.
4.2.4 Permutations
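[Figure: circuit preparing a cat state from the ground state |0 . . . 0⟩, consisting of a Hadamard gate $H_2$ on the first qubit followed by controlled NOT gates onto the remaining qubits]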
appearing on the right-hand side of the above equation can be prepared by the
induction hypothesis using the circuits $U_1$ and $U_2$. Let A be the local transform
\[ A := \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \otimes 1_{2^{n-1}} . \]
In general, the quantum circuit $U_\varphi$ for preparing a state $|\varphi\rangle \in \mathcal{H}_{2^n}$ generated
by this algorithm has a complexity $\kappa(U_\varphi) = O(2^n)$, which is linear in the
dimension of the Hilbert space but exponential in the number of qubits.
However, as the example of cat states previously mentioned shows, there are
states which admit much more efficient preparation sequences. In such a set
of states, we also find the so-called symmetric states [192]
\[ \frac{1}{\sqrt{n+1}} \left( |00 \ldots 0\rangle + |10 \ldots 0\rangle + |01 \ldots 0\rangle + \ldots + |00 \ldots 1\rangle \right) , \]
i.e. the union of the orbits of |00 . . . 0⟩ and |10 . . . 0⟩ under the cyclic group
acting on the qubits. As shown in Sect. 4 of [192], these states can be prepared
using O(n) operations and a quadratic overhead of ancilla qubits.
Finally, we give circuits for preparation of the states $|\psi_\nu\rangle := (1/\sqrt{\nu}) \sum_{i=1}^{\nu} |i\rangle$
for $\nu = 1, \ldots, 2^n$, which represent equal amplitudes over the first ν basis
states of $\mathcal{H}_{2^n}$. The states $|\psi_\nu\rangle$ can be efficiently prepared from the ground
state |0⟩ by the following procedure (using the principle of binary search
[193]), which is described in Sect. 4 of [187].
Since $|\psi_{2^n}\rangle$ can easily be prepared by application of the Hadamard transformation
$H_2^{\otimes n}$, we can assume $\nu < 2^n$ without loss of generality. We now
choose $k \in N$ such that $2^k \le \nu < 2^{k+1}$ and apply the transformation
\[ U := \frac{1}{\sqrt{\nu}} \begin{pmatrix} \sqrt{2^k} & -\sqrt{\nu - 2^k} \\ \sqrt{\nu - 2^k} & \sqrt{2^k} \end{pmatrix} \]
to the first bit of the ground state |0⟩. Next we achieve equal superposition on
the first $2^k$ basis states |0 . . . 0⟩, . . . , |0 . . . 01 . . . 1⟩ by application of an (n−k)-
fold controlled $\Lambda_1(H_2^{\otimes k})$ operation, which can be implemented using $O(n^2)$
operations. Finally, we apply the preparation circuit for the state $|\psi_{\nu - 2^k}\rangle$
(which has been constructed by induction), conditioned on the (k + 1)th
bit. Overall, we obtain a complexity for the preparation of $|\psi_\nu\rangle$ of $O(n^3)$
operations.
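The recursion behind this procedure can be traced on the level of amplitudes. The sketch below is a classical bookkeeping of the splitting step, not a gate-level circuit: it assumes that basis states are numbered from 0, that the rotation U acts on the bit of weight $2^k$, and that the branch with this bit equal to 0 is spread out by Hadamard gates while the other branch is treated recursively. The function name is hypothetical.

```python
import math

def uniform_prefix_amplitudes(nu, n):
    """Amplitudes of (1/sqrt(nu)) * sum_{i < nu} |i> on n qubits, obtained by
    the recursive splitting described above (basis states numbered from 0)."""
    if not 1 <= nu <= 2 ** n:
        raise ValueError("nu must lie between 1 and 2**n")
    amps = [0.0] * (2 ** n)

    def fill(offset, count, weight):
        if count == 0:
            return
        k = count.bit_length() - 1               # 2**k <= count < 2**(k+1)
        block = 2 ** k
        a = math.sqrt(block / count)             # branch with bit k equal to 0
        b = math.sqrt((count - block) / count)   # branch with bit k equal to 1
        # Branch 0: Hadamard gates spread the weight evenly over 2**k states.
        for i in range(block):
            amps[offset + i] = weight * a / math.sqrt(block)
        # Branch 1: recurse on the remaining count - 2**k states.
        fill(offset + block, count - block, weight * b)

    fill(0, nu, 1.0)
    return amps

amps = uniform_prefix_amplitudes(5, 3)
```

The two entries a and b are the first column of the rotation U, and the recursion depth is at most n, one level per binary digit of ν.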
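[Figure: tree of configurations of a probabilistic computation, starting from $q_0$, branching to $q_1$ and $q_2$, and ending in final configurations such as $q_{f_1}$ and $q_{f_2}$]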
t : Q × Σ × Q × Σ × {←, ↓, →} −→ [0, 1] ,
which assigns probabilities from the real interval [0, 1] to the possible actions
of T . A normalization condition, which guarantees the well-formedness of a
probabilistic Turing machine, is that for all configurations the sum of the
probabilities of all successors is 1. Therefore, the admissible state transitions
of a probabilistic Turing machine T can be described by a stochastic matrix
$S_T \in [0, 1]^{Z \times Z}$, where stochasticity means that the rows of $S_T$ add up to 1
and the successor $c_{succ}$ of c is obtained by $c_{succ} = S_T \cdot c$. Note that a deterministic
Turing machine is a special case of a probabilistic Turing machine
t : Q × Σ × Q × Σ × {←, ↓, →} −→ C (4.4)
Proof. The basic idea is to replace each elementary step in the computation
of f by a reversible operation (using Theorem 4.3), keeping in mind that ancilla
qubits are needed to make the computation reversible (see Sect. 4.2.3).
We now adjoin additional qubits to the system, which are initialized in the
ground state |0⟩, and apply a controlled NOT operation using the computational
qubits holding the result f(x) as controls. Of course, after the application of
112 Thomas Beth and Martin Rötteler
this operation the state is highly entangled between the computational register
and the additional register holding the result. Next, we run the whole
computation that was done to compute f backwards, reversibly, on the computational
register to get rid of the garbage which might destroy the coherence,
and end up with the state |x, 0, f(x)⟩, where the |0⟩ refers to the ancilla bits
used in the first step of this procedure.
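The three phases of this proof (compute, copy, uncompute) can be followed on classical bits. The sketch below does so for the hypothetical example f(x) = x0 AND x1 AND x2; the wire layout and the function names are assumptions made for this illustration only.

```python
from itertools import product

def toffoli(bits, c1, c2, t):
    """Flip bit t if both control bits are 1."""
    bits[t] ^= bits[c1] & bits[c2]

def cnot(bits, c, t):
    """Flip bit t if the control bit is 1."""
    bits[t] ^= bits[c]

# Hypothetical wire layout: 0-2 input x, 3-4 ancillas, 5 output register.
COMPUTE = [(toffoli, (0, 1, 3)),   # ancilla 3 <- x0 AND x1
           (toffoli, (3, 2, 4))]   # ancilla 4 <- x0 AND x1 AND x2 = f(x)

def reversible_and3(x):
    """Return the final register contents (x, ancillas, f(x))."""
    bits = list(x) + [0, 0, 0]
    for gate, wires in COMPUTE:             # 1. compute f with garbage
        gate(bits, *wires)
    cnot(bits, 4, 5)                        # 2. copy the result
    for gate, wires in reversed(COMPUTE):   # 3. run the computation backwards
        gate(bits, *wires)
    return tuple(bits)

for x in product((0, 1), repeat=3):
    print(x, "->", reversible_and3(x))
```

Because every gate used here is its own inverse, running the list `COMPUTE` in reverse order restores the ancillas to 0 while the copied result stays in the output register.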
Remark 4.2.
• The class of quantum Turing machines allows the definition and study
of the important complexity class BQP [200], as well as the relation of
BQP to other classes known from classical complexity theory.
• There are programming primitives for QTMs, such as composition, loops
and branching [200], as in the classical case. However, a problem arises
in realizing while-loops, since the predicate which decides whether the
loop terminates can be in a superposition of true and false, depending on
the computation path. Therefore all computations have to be arranged
in such a way that this predicate is never in a superposed state, i.e. the
state of the predicate has to be classical. As a consequence, we obtain
the result that a quantum Turing machine can only perform loops with
a prescribed number of iterations, which in turn can be determined by a
classical Turing machine.
• An important issue is whether QTMs constitute an analog or discrete
model of computation. One might be tempted to think of the possibil-
ity of encoding an arbitrary amount of information into the transition
amplitudes of t, i.e. of producing a machine model which could benefit
from computing with complex numbers to arbitrary precision (for the
strange effects of such models see, e.g., [201]). However, see [200, 202] for
a proof of the fact that it is sufficient to take transition amplitudes from
the finite set {±3/5, ±4/5, ±1, 0} in order to approximate a given QTM
to arbitrary precision. The reason for this is that the Pythagorean-triple
transformation
\[ \frac{1}{5} \begin{pmatrix} 3 & -4 \\ 4 & 3 \end{pmatrix} \in SO(2) \]
\[ |\varphi_1\rangle = |0 \ldots 0\rangle \otimes |0 \ldots 0\rangle \]
\[ |\varphi_3\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in Z_2^n} |x\rangle |f(x)\rangle . \]
5. Application of the Hadamard transformation $H_2^{\otimes n}$ to the first register
transforms the coset into the superposition $\sum_{y \in U^\perp} (-1)^{y \cdot g_0} |y\rangle$. The
supported vectors of this superposition are the elements of $U^\perp$, which is the
group defined by $U^\perp := \{y \in Z_2^n : x \cdot y = \sum_{i=1}^{n} x_i y_i = 0 \text{ for all } x \in U\}$,
i.e. the orthogonal complement of U with respect to the scalar product in $Z_2^n$
(see also Sect. 4.4.2).
6. Now measure the first register. We draw from the set of irreducible
representations of $Z_2^n$ having U in the kernel, i.e. we obtain an equal
distribution over the elements of $U^\perp$.
7. By iterating steps 1–6, we produce elements of $Z_2^n$ which generate the
group $U^\perp$ with high probability. After performing this experiment an
expected number of n times, we generate with probability greater than $1 - 2^{-n}$
the group $U^\perp$.
8. By solving linear equations over GF(2), it is easy to find generators for
$(U^\perp)^\perp = U$.
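The classical part of steps 6–8 can be sketched as follows. The quantum measurement is replaced here by drawing uniformly from $U^\perp$, which is what the algorithm is stated to deliver; the hidden subgroup is taken to be U = {0, s} for a nonzero bit string s, bit vectors are stored as integers, and the final linear system is solved by exhaustive search instead of Gaussian elimination to keep the example short. All names are hypothetical.

```python
import random

def dot2(x, y):
    """Scalar product of two bit vectors (stored as ints) over GF(2)."""
    return bin(x & y).count("1") % 2

def insert_into_basis(basis, v):
    """Reduce v against an XOR basis kept in decreasing order and
    append it if it is linearly independent of the basis."""
    for b in basis:
        v = min(v, v ^ b)
    if v:
        basis.append(v)
        basis.sort(reverse=True)
    return bool(v)

def recover_hidden_subgroup(n, s, rng):
    """Classical stand-in for steps 6-8 with U = {0, s}, s nonzero."""
    u_perp = [y for y in range(2 ** n) if dot2(y, s) == 0]
    basis = []
    while len(basis) < n - 1:            # U-perp has dimension n - 1
        insert_into_basis(basis, rng.choice(u_perp))   # one simulated measurement
    # Step 8, done by exhaustive search for brevity.
    return [x for x in range(1, 2 ** n) if all(dot2(x, y) == 0 for y in basis)]

print(recover_hidden_subgroup(4, 0b1011, random.Random(0)))
```

Once the sampled vectors span $U^\perp$, the only nonzero vector orthogonal to all of them is s itself, so the function returns a list containing just s.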
Analysis of Algorithm 2
• We first address the measurement in step 4. If we do not perform this
measurement, we are left with the state
\[ |\varphi_4\rangle = \frac{1}{\sqrt{2^n}} \sum_{\sigma \in Z_2^n / U} \sum_{x \in U} |\sigma \oplus x\rangle |f(\sigma)\rangle , \]
If, equivalently, we adopt the point of view that the signal f and the Fourier
transform are vectors in $C^N$, we see that performing the $DFT_N$ is a matrix
vector multiplication of f with the unitary matrix
\[ DFT_N := \frac{1}{\sqrt{N}} \cdot \left[ \omega^{i \cdot j} \right]_{i,j = 0, \ldots, N-1} , \]
where $\omega = e^{2\pi i/N}$ denotes a primitive Nth root of unity.
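The definition translates directly into code. The sketch below builds $DFT_N$ as a nested list and applies it to a periodic signal; the function names and the example signal are chosen here for illustration and are not taken from the text.

```python
import cmath
import math

def dft_matrix(n):
    """Unitary DFT matrix with entries omega^(i*j) / sqrt(n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** (i * j) / math.sqrt(n) for j in range(n)]
            for i in range(n)]

def apply(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector)))
            for row in matrix]

F = dft_matrix(8)
signal = [1, 0, 0, 0, 1, 0, 0, 0]   # period 4 inside a signal of length 8
spectrum = apply(F, signal)
```

For this signal only the even-indexed Fourier coefficients are nonzero, a small instance of how periodicity of the input shows up in the support of the transform.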
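\[ \Phi : C Z_N \longrightarrow C^N , \]
\[ \Pi_\tau \, DFT_{2^n} = \begin{pmatrix} DFT_{2^{n-1}} & DFT_{2^{n-1}} \\ DFT_{2^{n-1}} W_n & -DFT_{2^{n-1}} W_n \end{pmatrix} = (1_2 \otimes DFT_{2^{n-1}}) \cdot T_n \cdot (DFT_2 \otimes 1_{2^{n-1}}) . \]
Here we denote by
\[ T_n := 1_{2^{n-1}} \oplus W_n , \qquad W_n := \mathrm{diag}\left( 1, \omega_{2^n}, \omega_{2^n}^2, \ldots, \omega_{2^n}^{2^{n-1}-1} \right) \]
the matrix of twiddle factors [171]. Taking into account the fact that $W_n$ has
the tensor decomposition
\[ W_n = \bigotimes_{i=0}^{n-2} \begin{pmatrix} 1 & \\ & \omega_{2^n}^{2^i} \end{pmatrix} , \]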
we see that $T_n$ can be implemented by n−1 gates having one control wire each.
These can be factored into the elementary gates $G_1$ with constant overhead.
Because tensor products are free in our computational model, by recursion
we arrive at an upper bound of $O(n^2)$ for the number of elementary operations
necessary to compute the discrete Fourier transform on a quantum computer
(this operation will be referred to as “QFT”).
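The factorization can be verified numerically for small n. The sketch below compares the row-permuted matrix $DFT_{2^n}$ with the product $(1_2 \otimes DFT_{2^{n-1}}) \cdot T_n \cdot (DFT_2 \otimes 1_{2^{n-1}})$, using unitary (normalized) DFT matrices throughout. It assumes that the permutation sorts the rows into even-indexed rows followed by odd-indexed rows, which is the usual decimation step of the Cooley–Tukey formula but is not defined in this passage; all function names are hypothetical.

```python
import cmath
import math

def dft(n):
    """Unitary DFT matrix of size n as a nested list."""
    w = cmath.exp(2j * cmath.pi / n)
    return [[w ** (i * j) / math.sqrt(n) for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def cooley_tukey_product(n):
    """(1_2 (x) DFT_{2^(n-1)}) * T_n * (DFT_2 (x) 1_{2^(n-1)})."""
    half = 2 ** (n - 1)
    w = cmath.exp(2j * cmath.pi / 2 ** n)
    twiddle = [[0] * (2 * half) for _ in range(2 * half)]
    for j in range(half):
        twiddle[j][j] = 1                      # block 1_{2^(n-1)}
        twiddle[half + j][half + j] = w ** j   # block W_n
    return matmul(matmul(kron(identity(2), dft(half)), twiddle),
                  kron(dft(2), identity(half)))

def permuted_dft(n):
    """Rows of DFT_{2^n}: even-indexed rows first, then odd-indexed rows."""
    full = dft(2 ** n)
    order = list(range(0, 2 ** n, 2)) + list(range(1, 2 ** n, 2))
    return [full[i] for i in order]

lhs, rhs = permuted_dft(3), cooley_tukey_product(3)
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(8) for j in range(8))
```

Applying the same step recursively to the factor $DFT_{2^{n-1}}$ gives n levels with at most n−1 controlled phase gates each, which is where the $O(n^2)$ bound comes from.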
In Fig. 4.8, the derived decomposition into quantum gates is displayed
using the graphical notation introduced in Sect. 4.2.1. The gates labeled by
$D_k$ in this circuit are the diagonal phase shifts $\mathrm{diag}(1, e^{2\pi i/k})$ and, in addition,
we have used the abbreviation $N = 2^n$. We observe that the permutations
$\Pi_n$, which arose in the Cooley–Tukey formula, have all been collected
together, yielding the so-called bit reversal, which is the permutation
of the quantum wires (1, n) (2, n−1) . . . (n/2, n/2 + 1) when n is even and
(1, n) (2, n−1) . . . ((n − 1)/2, (n + 3)/2) when n is odd.
[Fig. 4.8: quantum circuit for the QFT, consisting of the bit reversal permutation of the wires, Hadamard gates $H_2$ and controlled phase shifts $D_4, \ldots, D_N$]
Let A be a finite abelian group. Then A splits into a direct product of its
p-components: $A \cong A_{p_1} \times \ldots \times A_{p_n}$, where $p_i$, $i = 1, \ldots, n$, are the prime
divisors of the order |A| of A (see [211], Part I, paragraph 8).
Proof. The decomposition of $DFT_{2^n}$ has already been considered and an
implementation in $O(n^2)$ operations has been derived from the Cooley–Tukey
formula in Sect. 4.4.1. Since tensor products are free in our computational
model, we can conclude that a direct factor $Z_{2^n}$ is already the worst
case for an implementation.
Example 4.6. The Fourier transform for the elementary abelian 2-group $Z_2^n$
is given by the tensor product $DFT_2 \otimes \cdots \otimes DFT_2$ of the Fourier transform
for the cyclic factors and hence coincides with the Hadamard matrix $H_2^{\otimes n}$
used in Algorithm 2.
The Dual Group. Given a finite abelian group A, we can consider $\mathrm{Hom}(A, C^*)$,
i.e. the group of characters of A (NB: in the nonabelian case a character is
generalized to the traces of the representing matrices and hence is not a
homomorphism anymore). The following theorem says that A is isomorphic to
its group of characters.
\[ DFT_A \, \frac{1}{\sqrt{|U|}} \sum_{x \in U} |x\rangle = \frac{1}{\sqrt{|U^\perp|}} \sum_{y \in U^\perp} |y\rangle . \]
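Proof. Since
\[ DFT_A \, \frac{1}{\sqrt{|U|}} \sum_{x \in U} |x\rangle = \frac{1}{\sqrt{|A|}} \sum_{x, y \in A} \beta(x, y) |y\rangle \langle x| \; \frac{1}{\sqrt{|U|}} \sum_{x \in U} |x\rangle = \frac{1}{\sqrt{|A| |U|}} \sum_{y \in A} \sum_{x \in U} \beta(x, y) |y\rangle , \]
it suffices to show that $\sum_{x \in U} \beta(x, y) = 0$ for $y \notin U^\perp$, but this statement
follows from the fact that the existence of $x_0 \in U$ with $\beta(x_0, y) \neq 1$ implies that
\[ \sum_{x \in U} \beta(x, y) = \sum_{x \in U} \beta(x + x_0, y) = \beta(x_0, y) \sum_{x \in U} \beta(x, y) . \]
The other case,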
where $\varphi_{c,y} \in U(1)$ are phase factors which depend on c and y but are always
e-th roots of unity, where e denotes the exponent of A. Since making
measurements involves taking the squares of the amplitudes, we obtain an
equal distribution over $U^\perp$. The states $|\chi_{c+U}\rangle$, which in general will be highly
entangled, make the principle of interference and its use in quantum algorithms
apparent: only those Fourier coefficients remain which correspond to
the elements of $U^\perp$ (constructive interference), whereas the amplitudes of all
other elements vanish (destructive interference).
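The interference pattern can be observed directly in a small cyclic example. The sketch below takes A = $Z_{12}$ and the subgroup U generated by 4, applies the DFT to the uniform superposition over U and lists the indices whose amplitudes survive. The choice of group, the threshold used to decide which amplitudes vanish, and the function name are assumptions of this illustration.

```python
import cmath
import math

def dft_of_subgroup(n, generator):
    """DFT over Z_n of the normalized uniform superposition on the cyclic
    subgroup U generated by `generator`."""
    subgroup = sorted({(generator * k) % n for k in range(n)})
    omega = cmath.exp(2j * cmath.pi / n)
    norm = math.sqrt(n * len(subgroup))
    amplitudes = [sum(omega ** (x * y) for x in subgroup) / norm
                  for y in range(n)]
    return subgroup, amplitudes

subgroup, amplitudes = dft_of_subgroup(12, 4)
# Indices with non-vanishing amplitude (threshold chosen for rounding errors).
support = [y for y, a in enumerate(amplitudes) if abs(a) > 1e-9]
print(subgroup, support)
```

Here U = {0, 4, 8} and the surviving indices are the multiples of 3, i.e. exactly those y with $\omega^{xy} = 1$ for all x in U; each of them carries the amplitude $1/\sqrt{|U^\perp|} = 1/2$.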
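Theorem 4.10. Let G be a finite abelian group, $\mathrm{Hom}(G, C^*)$ the dual group,
and $\beta : G \times \mathrm{Hom}(G, C^*) \to C^*$ the canonical pairing defined by $\beta(g, \varphi) :=
\varphi(g)$. Then for each subgroup $U \subseteq G$, the following holds (normalization
omitted):
\[ DFT_G \sum_{u \in U} |u\rangle = \sum_{\substack{\varphi \in \mathrm{Hom}(G, C^*) \\ U \subseteq \mathrm{Ker}(\varphi)}} |\varphi\rangle . \]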
Proof. The mapping $DFT_G : CG \to \bigoplus_{i=1}^{|G|} C$ is given by evaluation of elements
of G for the irreducible representations $\{\varphi_1, \ldots, \varphi_s\}$ of G, which are all
one-dimensional (i.e. s = |G|) and hence are characters, since G was assumed to
be abelian. Therefore, the coefficient for the irreducible representation $\varphi_i$ is
computed from
\[ \sum_{u \in U} \varphi_i(u_0 + u) = \sum_{u \in U} \varphi_i(u_0) \cdot \varphi_i(u) = \varphi_i(u_0) \cdot \sum_{u \in U} \varphi_i(u) . \]
\[ \rho(g)^{-1} \cdot A \cdot \rho(g) = A , \]
since from this we could conclude $\rho(n_0) = 1_{d \times d}$, contrary to the assumption
$n_0 \notin \mathrm{Ker}(\rho)$.
Fig. 4.9. (a) Equal distribution, (b) flip solutions, (c) inversion about average
Remark 4.3. The question of what linear transforms can be performed op-
tically is equivalent to the question of what matrices can be factored into
diagonal matrices and Fourier transforms, which correspond to diagonal ma-
trices and circulant matrices. It has been shown in [217] that for fields
K ∈ {GF (3), GF (5)}, every square matrix M with entries in K can be
written as a product of circulant and diagonal matrices with entries in K.
Furthermore, if M is unitary the circulant and diagonal factors can also be
chosen to be unitary [217].
In this section we briefly review Shor’s factorization algorithm and show how
the Fourier transform comes into play. It is known that factoring a number N
is easy under the assumption that it is easy to determine the (multiplicative)
order of an arbitrary element in (ZN )× . For a proof of this, we refer the
reader to [222] and to Shor’s original paper [9].
Once this reduction has been done, the following observation is the cru-
cial step for the quantum algorithm. Let y be randomly chosen and let
gcd(y, N ) = 1. To determine the multiplicative order r of y mod N , con-
sider the function
\[ f_y(x) := y^x \bmod N . \]
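To make the role of this function concrete, the following minimal sketch evaluates $f_y$ classically and finds its period by brute force. The values N = 15 and y = 7 are illustrative assumptions, and the loop takes time proportional to r, which is exactly the cost that the quantum algorithm is meant to avoid.

```python
from math import gcd

def f(y, x, N):
    """The function f_y(x) = y^x mod N."""
    return pow(y, x, N)

def multiplicative_order(y, N):
    """Smallest r >= 1 with y^r = 1 mod N, found by exhaustive search."""
    if gcd(y, N) != 1:
        raise ValueError("y must be coprime to N")
    r, value = 1, y % N
    while value != 1:
        value = (value * y) % N
        r += 1
    return r

N, y = 15, 7                                  # illustrative small example
values = [f(y, x, N) for x in range(8)]       # the sequence repeats with period r
r = multiplicative_order(y, N)
```

The list `values` repeats after r steps, so the order of y is the period of $f_y$.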
5. Measuring the right part of the register gives a certain value $z_0$. The
remaining state is the superposition of all x satisfying $f_y(x) = z_0$:
\[ \sum_{y^x = z_0} |x\rangle|z_0\rangle = \sum_{k=0}^{s-1} |x_0 + kr\rangle|z_0\rangle , \quad \text{where } y^{x_0} = z_0 \text{ and } s = \frac{M}{r} . \]
Application of this algorithm produces data from which the period r can be
extracted after classical postprocessing involving Diophantine approximation
(see Sect. 4.5.2).
A thorough analysis of this algorithm must take into account the overhead
for the calculation of the function $f_y : x \mapsto y^x \bmod N$. However, after this
function has been realized as a quantum network once (which can be obtained
from a classical network for this function in polynomial time), the superpo-
sition principle applies since all inputs can be processed by one application
of fy .
Remark 4.5. It should be noted that for small numbers, as in the example of
Figs. 4.11 and 4.12, an optical setup using Fourier lenses (cf. Fig. 4.10) could
implement Shor’s algorithm. This very example was initially simulated with
the DigiOpt system [223].
Fig. 4.12. The function shown in Fig. 4.11, transformed by a DFT of length 1024
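occur for those basis states $|l\rangle$ for which $lr$ is close to M. The probability of
measuring a specific l in this sum can be bounded by
\[ \left| \frac{1}{\sqrt{sM}} \sum_{k=0}^{s-1} e^{2\pi i k r l/M} \right|^2 \ge \left| \frac{1}{\sqrt{sM}} \sum_{k=0}^{s-1} e^{\pi i k r/M} \right|^2 = \frac{1}{sM} \frac{|1 - e^{\pi i r s/M}|^2}{|1 - e^{\pi i r/M}|^2} \]
\[ = \frac{1}{sM} \frac{|\sin[\pi r s/(2M)]|^2}{|\sin[\pi r/(2M)]|^2} \ge \frac{4}{\pi^2 r} . \]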
The fractions l/M that we obtain from sampling fulfill the condition
\[ \left| \frac{l}{M} - \frac{p}{r} \right| \le \frac{1}{2M} , \]
for some integer p. Because of the choice $M \ge N^2$ we obtain the result that
l/M can be approximated efficiently by continued fractions, as described in
the following section. This classical postprocessing completes the description
of Shor’s algorithm for finding the order r of y. To factor N we compute the
greatest common divisors $\gcd(y^{r/2} + 1, N)$ and $\gcd(y^{r/2} - 1, N)$, obtaining nontrivial
factors of N if r is even and $y^{r/2} \not\equiv \pm 1 \bmod N$.
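The classical postprocessing can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the text: it uses Python's `Fraction.limit_denominator` as a stand-in for the continued-fraction expansion, the helper names and the parameter `max_multiple` are hypothetical, and the values N = 15, y = 7, M = 256 are chosen only so that $M \ge N^2$ holds.

```python
from fractions import Fraction
from math import gcd

def order_from_sample(l, M, y, N, max_multiple=4):
    """Guess the order r of y mod N from one measured value l.

    The best rational approximation p/r' of l/M with r' < N yields a
    candidate r'. If p and r share a factor, r' is only a divisor of r,
    so a few small multiples are tried as well (illustrative fallback).
    """
    candidate = Fraction(l, M).limit_denominator(N - 1).denominator
    for k in range(1, max_multiple + 1):
        r = candidate * k
        if pow(y, r, N) == 1:
            return r
    return None

def factors_from_order(y, r, N):
    """Return nontrivial factors of N from the order r, or None if r is unsuitable."""
    if r % 2 == 1:
        return None                      # r must be even
    half = pow(y, r // 2, N)
    if half in (1, N - 1):
        return None                      # y^(r/2) = +-1 mod N gives no information
    return tuple(sorted((gcd(half - 1, N), gcd(half + 1, N))))

N, y, M = 15, 7, 256                     # illustrative values with M >= N^2
r = order_from_sample(192, M, y, N)      # l/M = 3/4 reveals the denominator 4
factors = factors_from_order(y, r, N)
```

A sample such as l = 128 gives l/M = 1/2 and hence only the divisor 2 of r, which is why the sketch checks multiples of the candidate before giving up.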
A similar method can be applied to the discrete logarithm problem [9].
We mention that both the factoring and the discrete logarithm problem can
be readily recognized as hidden-subgroup problems (see also Sect. 5): in the
case of factoring, this corresponds to the group U generated by y and we are
interested in the index [Z : U ], which equals the multiplicative order of y. The
discrete logarithm problem for $GF(q)^\times$ can be considered a hidden-subgroup
problem for the group $G = \mathbb{Z} \times \mathbb{Z}$ and the function $f : G \to GF(q)^\times$ given
by $f(x, y) := \zeta^x \alpha^{-y}$, where ζ is the primitive element and α is the element
for which we want to compute the logarithm.
The main difference from the situation in Simon’s algorithm (see Sect.
4.3) is that in these cases we cannot apply the Fourier transform for the
parent group G, since we do not know its order a priori. Rather, we have to
compute larger Fourier transforms; preferably, the length is chosen to be a
power of 2, to oversample. We then obtain the information from the sampled
Fourier spectrum by classical postprocessing.
The basic features of the method of Fourier sampling which we invoked in
Shor’s factoring algorithm are also incorporated in Kitaev’s algorithm for the
abelian stabilizer problem [187]. At the very heart of Kitaev’s approach is a
method to measure the eigenvalues of a unitary operator U , supposing that
the corresponding eigenvectors can be prepared. This estimation procedure
becomes efficient if, besides U, the powers $U^{2^i}$, i = 0, 1, . . . , can also be
implemented efficiently [180, 187]. We must also mention that the method of
Diophantine approximation is crucial for the phase estimation.
Diophantine Approximation. In the following we briefly review some
properties of the continued-fraction expansion of a real number. An impor-
tant property a number can have is to be one of the convergents of such an
The quantum algorithms which have been discovered by now fall into two
categories, the principles of which we shall describe in the following.
Entanglement-Driven Algorithms. Suppose we are given a function f :
X → Y from a (finite) domain X to a codomain Y . This function does not
have to be injective; however, for a quantum computer to be able to perform
f with respect to a suitably encoded X and Y , the function f has to be
embedded into a unitary matrix Vf (cf. Sect. 4.2.3). We then can compute
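simultaneously the images of all inputs $x \in X$ using $|x\rangle|0\rangle \mapsto |x\rangle|f(x)\rangle$ for all
$x \in X$ by preparing an equal superposition $\sum_{x \in X} |x\rangle$ in the X register and
the ground state $|0\rangle$ in the Y register first, and then applying the quantum
circuit $V_f$ to obtain $\sum_{x \in X} |x\rangle|f(x)\rangle$. This entangled state can then be written
as
\[ \sum_{y \in \mathrm{Im}(f)} \Bigl( \sum_{x : f(x) = y} |x\rangle \Bigr) |y\rangle , \]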
In Sect. 4.4 we have encountered the special case of the discrete Fourier trans-
form for abelian groups. This concept can be generalized to arbitrary finite
groups, which leads to an interesting and well-studied topic for classical com-
puters. We refer to [171, 174, 225, 226] as representatives of a vast number of
publications. The reader not familiar with the standard notations concerning
group representations is referred to these publications and to standard ref-
erences such as [213, 227]. Following [228], we briefly present the terms and
notations from representation theory which we are going to use, and recall
the definition of Fourier transforms.
of $\mathbb{C}$-algebras, where $M_{d_i}(\mathbb{C})$ denotes the full matrix ring of $d_i \times d_i$ matrices
with coefficients in $\mathbb{C}$. This decomposition is also known as Wedderburn
decomposition of the group algebra $\mathbb{C}G$.
We remind the reader of the fact that the decomposition in (4.7) is quite
familiar from the theory of error-avoiding quantum codes and noiseless sub-
systems [229, 230, 231, 232].
As an example of a Fourier transform, let $G = \mathbb{Z}_n = \langle x \mid x^n = 1 \rangle$
be the cyclic group of order n with regular representation $\phi = 1_E \uparrow_T G$,
$T = (x^0, x^1, \ldots, x^{n-1})$, and let $\omega_n$ be a primitive nth root of unity.⁴ Now
$\phi^A = \bigoplus_{i=0}^{n-1} \rho_i$, where $\rho_i : x \mapsto \omega_n^i$ and $A = \mathrm{DFT}_n = (1/\sqrt{n})[\omega_n^{ij} \mid i, j = 0 \ldots n-1]$
is the (unitary) discrete Fourier transform well known from signal
processing.
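The decomposition of the regular representation of the cyclic group by the DFT matrix can be verified numerically. The sketch below assumes n = 8, takes $\omega_n = e^{2\pi i/n}$, and realizes the generator x as the cyclic shift $e_j \mapsto e_{j+1}$; these conventions are assumptions made for illustration. With them the conjugated generator becomes diagonal with entries $\omega_n^{-j}$, while the opposite shift direction would give $\omega_n^{j}$.

```python
import numpy as np

n = 8                                          # illustrative group order
omega = np.exp(2j * np.pi / n)                 # assumed primitive n-th root of unity
idx = np.arange(n)

# A = DFT_n = (1/sqrt(n)) [omega^(i*j)]
A = omega ** np.outer(idx, idx) / np.sqrt(n)

# Regular representation of the generator x: the cyclic shift e_j -> e_(j+1 mod n).
shift = np.roll(np.eye(n), 1, axis=0)

# Conjugating with A decomposes the shift into one-dimensional representations.
decomposed = A.conj().T @ shift @ A

is_unitary = np.allclose(A.conj().T @ A, np.eye(n))
is_diagonal = np.allclose(decomposed, np.diag(np.diag(decomposed)))
```

Each diagonal entry is the value of one irreducible representation $\rho_i$ on the generator, so all n characters appear exactly once.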
If A is a Fourier transform for the group G, then any fast algorithm
for the multiplication with A is called a fast Fourier transform for G. Of
course, the term fast depends on the complexity model chosen. Since we
are primarily interested in the realization of a fast Fourier transform on a
quantum computer (QFT), we first have to use the complexity measure κ, as
derived in Sect. 4.2.1.
Classically, a fast Fourier transform is given by a factorization of the
decomposition matrix A into a product of sparse matrices⁵ [171, 174, 225,
233]. For a solvable group G, this factorization can be obtained recursively
using the following idea. First, a normal subgroup of prime index (G : N ) = p
is chosen. Using transitivity of induction, φ = 1E ↑ G is written as (1E ↑ N ) ↑
G (note that we have the freedom to choose the transversals appropriately).
Then 1E ↑ N , which again is a regular representation, is decomposed (by
⁴ The induction of a representation φ of a subgroup H ≤ G with transversal
$T = (t_1, \ldots, t_k)$ is defined by $(\phi \uparrow_T G)(g) := [\dot\phi(t_i g t_j^{-1}) \mid i, j = 1 \ldots k]$, where
$\dot\phi(x) := \phi(x)$ for $x \in H$ or else is the zero matrix of the appropriate size.
⁵ Note that in general, sparseness of a matrix does not imply low computational
complexity with respect to the complexity measure κ.
In the case of a cyclic group G the formula yields exactly the well-known
Cooley–Tukey decomposition (see also Sect. 4.4.1 and [209]), in which D is
usually called the twiddle matrix.
Assume that $N \trianglelefteq G$ is a normal subgroup of prime index p with Fourier
transform A and decomposition $\phi^A = \rho = \bigoplus_{i=1}^{m} \rho_i$. We can reorder the $\rho_i$
such that the first, say k, $\rho_i$ have an extension $\bar\rho_i$ to G and the other $\rho_i$ occur
as sequences $\rho_i \oplus \rho_i^{t} \oplus \ldots \oplus \rho_i^{t^{(p-1)}}$ of inner conjugates (cf. Theorem 4.14; note
that the irreducibles $\rho_i$, $\rho_i^{t^j}$ have the same multiplicity since φ is regular). In
the first case the extension may be calculated by Minkwitz’s formula [234];
in the latter case each sequence can be extended by $\rho_i \uparrow_T G$ (Theorem 4.14,
case 2). We do not state Minkwitz’s formula here, since we shall not need it
in the special cases treated later on. Altogether, we obtain an extension $\bar\rho$ of ρ
and can apply Theorem 4.15. The remaining task is to ensure that equivalent
irreducibles in $\bigoplus_{i=1}^{p} \lambda_i \cdot \bar\rho$ are equal. For summands of $\bar\rho$ of the form $\bar\rho_i$ we
have the result that $\lambda_j \cdot \bar\rho_i$ and $\bar\rho_i$ are inequivalent, and hence there is nothing
to do. For summands of $\bar\rho$ of the form $\rho_i \uparrow_T G$, we conjugate $\lambda_j \cdot (\rho_i \uparrow_T G)$
onto $\rho_i \uparrow_T G$ using Theorem 4.14, case 2.
Now we are ready to formulate the recursive algorithm for constructing a
fast Fourier transform for a solvable group G due to Püschel et al. [228].
Algorithm 5 Let $N \trianglelefteq G$ be a normal subgroup of prime index p with transversal
$T = (t^0, t^1, \ldots, t^{(p-1)})$. Suppose that φ is a regular representation of N
with (fast) Fourier transform A, i.e. $\phi^A = \rho_1 \oplus \ldots \oplus \rho_k$, fulfilling $\rho_i \cong \rho_j \Rightarrow \rho_i = \rho_j$.
A Fourier transform B of G with respect to the regular representation
$\phi \uparrow_T G$ can be obtained as follows.
1. Determine a permutation matrix P that rearranges the $\rho_i$, i = 1, . . . , k,
such that the extensible $\rho_i$ (i.e. those satisfying $\rho_i = \rho_i^{t}$) come first,
followed by the other representations ordered into sequences of length p
equivalent to $\rho_i, \rho_i^{t}, \ldots, \rho_i^{t^{(p-1)}}$. (Note that these sequences need to be equal
to $\rho_i, \rho_i^{t}, \ldots, \rho_i^{t^{(p-1)}}$, which is established in the next step.)
2. Calculate a matrix M which is the identity on the extensibles and conjugates
the sequences of length p to make them equal to $\rho_i, \rho_i^{t}, \ldots, \rho_i^{t^{(p-1)}}$.
3. Note that $A \cdot P \cdot M$ is a decomposition matrix for φ, too, and let $\rho = \phi^{A \cdot P \cdot M}$.
Extend ρ to G summand-wise. For the extensible summands use
Minkwitz’s formula; the sequences $\rho_i, \rho_i^{t}, \ldots, \rho_i^{t^{(p-1)}}$ can be extended by
$\rho_i \uparrow_T G$.
4. Evaluate $\bar\rho$ at t and build $D = \bigoplus_{i=0}^{p-1} \bar\rho(t)^i$.
5. Construct a block-diagonal matrix C with Theorem 4.14, case 2, conjugating
$\bigoplus_{i=0}^{p-1} \lambda_i \cdot \bar\rho$ such that equivalent irreducibles are equal. C is the
identity on the extended summands.
Result:
\[ B = (1_p \otimes A \cdot P \cdot M) \cdot D \cdot (\mathrm{DFT}_p \otimes 1_{|N|}) \cdot C \qquad (4.8) \]
is a fast Fourier transform for G.
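Observe that the extensions 1, 3 and 4 of the cyclic subgroup $\mathbb{Z}_{2^n} = \langle x \rangle$ split,
i.e. the groups have the structure of a semidirect product of $\mathbb{Z}_{2^n}$ with $\mathbb{Z}_2$.
The three isomorphism types correspond to the three different embeddings
of $\mathbb{Z}_2 = \langle y \rangle$ into $(\mathbb{Z}_{2^n})^\times \cong \mathbb{Z}_2 \times \mathbb{Z}_{2^{n-2}}$.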
In [228] quantum circuits with polylogarithmic gate complexity are given
for the Fourier transforms for each of these groups. See also [236, 237] for
quantum Fourier transforms for nonabelian groups.
{(ϕ, h) : h ∈ H, ϕ : [1, . . . , n] → G}
where ψ is the mapping given by $i \mapsto \varphi_1(i^{h_2})\varphi_2(i)$ for $i \in [1, \ldots, n]$.
In other words, the wreath product is isomorphic to a semidirect product
of the so-called base group N := G×. . .×G, which is the n-fold direct product
of (independent) copies of G with H, in symbols $G \wr H = N \rtimes H$, where H
operates via permutation of the direct factors of N . So we can think of the
elements as n-tuples of elements from G together with a permutation τ , and
multiplication is done componentwise after a suitable permutation of the first
n factors:
\[ A \wr \mathbb{Z}_2 \trianglerighteq A \times A \trianglerighteq E , \]
where the second composition factor is the base group. We first want to
determine the irreducible representations of $G := A \wr \mathbb{Z}_2$. Let $G^*$ be the
base group of G, i.e. $G^* = A \times A$. $G^*$ is a normal subgroup of G of index
2. Denoting by A = {χ1 , . . . , χk } the set of irreducible representations of A,
recall that the irreducible representations of $G^*$ are given by the set $\{\chi_i \otimes \chi_j : i, j = 1, \ldots, k\}$
of pairwise tensor products (e.g. Sect. 5.6 of [212]).
Since $G^* \trianglelefteq G$, the group G operates on the representations of $G^*$ via inner
conjugation. Because G is a semidirect product of $G^*$ with $\mathbb{Z}_2$, we can write
each element $g \in G$ as $g = (a_1, a_2; \tau)$ with $a_1, a_2 \in A$ and we conclude
i.e. only the factor group $G/G^* = \mathbb{Z}_2$ operates via permutation of the tensor
factors. The operation of τ is to map $\chi_1 \otimes \chi_2 \mapsto \chi_2 \otimes \chi_1$.
Therefore, it is easy to determine the inertia groups (see [171, 235] for
definitions) Tρ of a representation ρ of G∗ . We have to consider two cases:
(a) ρ = χi ⊗ χi . Then Tρ = G, since permutation of the factors leaves ρ
invariant.
Example 4.7. We consider the special case $W_n := \mathbb{Z}_2^n \wr \mathbb{Z}_2$, for which the
quantum Fourier transforms have an especially appealing form.
Applying the design principles for Fourier transforms described in this
section (see also [171, 174, 226, 228, 249]), we obtain the circuits for $\mathrm{DFT}_{W_n}$
in a straightforward way. Once we have studied the extension/induction behavior
of the irreducible representations of $G^*$, the recursive formula
\[ (1_2 \otimes \mathrm{DFT}_{G^*}) \cdot \bigoplus_{t \in T} \Phi(t) \cdot (\mathrm{DFT}_{\mathbb{Z}_2} \otimes 1_{|A|^2}) \qquad (4.9) \]
provides a Fourier transform for G. Here Φ(t) denotes the extension (as a
whole) of the regular representation of $G^*$ to a representation of G [171, 226,
228]. In the case of $W_n$, the transform $\mathrm{DFT}_{G^*}$ is the Fourier transform for
$\mathbb{Z}_2^{2n}$ and therefore a tensor product of 2n Hadamard matrices.
The circuits for the case of $W_n$ are shown in Fig. 4.14. Obviously, the
complexity cost of this circuit is linear in the number of qubits, since the
conditional gate representing the evaluation at the transversal $\bigoplus_{t \in T} \Phi(t)$
can be realized with 3n Toffoli gates.
The input state of the quantum computer considered here, which has a
register of length n carrying $|x\rangle$ and an extra qubit (which is initialized in
the ground state), is $|\varphi_{\mathrm{in}}\rangle = |0\rangle|x\rangle$ (written explicitly, this is $x(0)|00 \ldots 0\rangle + \cdots + x(2^n - 1)|01 \ldots 1\rangle$),
which is transformed by the circuit given in Fig. 4.15
to yield $|y\rangle$.
which means that this expression is equal to $2 \cos[m(i + 1/2)\pi/2^n]$. The
multiplication with these (relative) phase factors corresponds to a diagonal
matrix $T = 1_2 \otimes \mathrm{diag}(\omega_{2^{n+1}}^{m/2},\ m = 0, \ldots, 2^n - 1)$, which can be implemented
by a tensor product $T = 1_2 \otimes T_n \otimes \cdots \otimes T_1$ of local operations, where
$T_i = \mathrm{diag}(1, \omega_{2^{n+2}}^{2^{i-1}})$ for i = 1, . . . , n.
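The factorization of the diagonal phase matrix into local operations can be checked directly. The sketch below assumes $\omega_k = e^{2\pi i/k}$, uses $\omega_{2^{n+1}}^{m/2} = \omega_{2^{n+2}}^{m}$, and takes the leftmost tensor factor to act on the most significant bit of m; the value n = 3 is an arbitrary illustrative choice.

```python
import numpy as np
from functools import reduce

n = 3                                            # illustrative register length
omega = np.exp(2j * np.pi / 2 ** (n + 2))        # assumed convention for omega_(2^(n+2))

# Full diagonal: entry m is omega_(2^(n+1))^(m/2) = omega_(2^(n+2))^m.
full_diagonal = np.diag(omega ** np.arange(2 ** n))

# Local factors T_n, ..., T_1 with T_i = diag(1, omega^(2^(i-1))).
local_factors = [np.diag([1, omega ** (2 ** (i - 1))]) for i in range(n, 0, -1)]
tensor_product = reduce(np.kron, local_factors)

factorization_holds = np.allclose(full_diagonal, tensor_product)
```

Writing m in binary as $m = \sum_i b_i 2^{i-1}$ shows why this works: the phase $\omega^m$ is the product of the single-qubit phases $\omega^{b_i 2^{i-1}}$.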
Looking at the state obtained so far, we see that $|\varphi_{\mathrm{in}}\rangle$ has been mapped
to $|\varphi'\rangle$, the lower $2^n$ components of which are given by
\[ \varphi'(m) = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n - 1} \cos[m(i + 1/2)\pi/2^n]\, x(i) , \]
Fig. 4.16. Grouping the matrix entries pairwise via π
Fig. 4.17. Complete quantum circuit for DCTII using one auxiliary qubit
4.7.1 Introduction
4.7.2 Background
\[ x = (x_0, \ldots, x_{n-1}) \]
of length n, where each “bit” $x_i$ takes a value in GF(2) = {0, 1}, the
finite field of two elements. With this elementary notion, the codewords can be
treated as vectors of the n-dimensional GF(2) vector space $GF(2)^n$, where
the set of codewords is usually assumed to form a k-dimensional subspace
$C \le GF(2)^n$. This can be obtained canonically as the range im(G) of a GF(2)-linear
mapping $G : GF(2)^k \to GF(2)^n$ of the so-called encoder matrix, which
maps k-bit messages onto n-bit codewords, adding a redundancy of r = n − k
bits in the r so-called parity check bits.
The characteristic parameters of such a linear code C are the rate R = k/n
and the minimum Hamming weight
\[ w_H := \min\{ \mathrm{wgt}_H(c) : c \in C,\ c \ne 0 \} . \]
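[Fig. 4.18: transition diagram of the binary symmetric channel; each bit 0 or 1 is received unchanged with probability q = 1 − p and flipped with probability p.]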
\[ s := u \cdot H^t = c \cdot H^t + e \cdot H^t = e \cdot H^t \]
we see that the error pattern determines an affine subspace of GF (2)n , namely
a coset of C in GF (2)n . In order to allow error correction, the syndrome has
to be linked to the unknown error vector e uniquely, usually according to
the maximum-likelihood decoding principle. Obviously, for the BSC this can
be achieved by finding the unique codeword c that minimizes the Hamming
distance to u. For combinatorial reasons this is possible if
\[ \mathrm{wgt}(e) \le \frac{w_H - 1}{2} . \]
Thus a code with minimum Hamming weight wH = 2t + 1 is said to correct
t errors per codeword.
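Syndrome decoding can be illustrated with the smallest nontrivial example. The sketch below assumes the binary repetition code {000, 111} with one possible parity check matrix H; this code and matrix are illustrative choices, not taken from the text. Its minimum weight is $w_H = 3 = 2t + 1$ with t = 1, so every single bit flip has its own syndrome and can be undone.

```python
from itertools import product

# One possible parity check matrix of the repetition code {000, 111} (assumed example).
H = [(1, 1, 0), (0, 1, 1)]

def syndrome(u):
    """s = u * H^t over GF(2)."""
    return tuple(sum(a * b for a, b in zip(row, u)) % 2 for row in H)

def weight(v):
    """Hamming weight of a binary vector."""
    return sum(v)

# For each syndrome keep the lowest-weight error that produces it (coset leader).
leaders = {}
for e in sorted(product((0, 1), repeat=3), key=weight):
    leaders.setdefault(syndrome(e), e)

def decode(u):
    """Subtract the most likely error for the observed syndrome."""
    e = leaders[syndrome(u)]
    return tuple((a + b) % 2 for a, b in zip(u, e))

corrected = decode((1, 0, 1))          # single flip in the middle bit of 111
```

The syndrome depends only on the error e and not on the codeword c, which is the point of the identity $s = e \cdot H^t$ above.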
An example which today can be considered a classic in the field of science and
technology is the application and design of (first-order) Reed–Müller codes
RM (1, m). This is not merely because they were a decisive piece of discrete
mathematics in producing the first pictures from the surface of Mars in the
early 1970s after the landing of Mariner 9 on 19 January 1972.
The description of their geometric construction and the method of er-
ror detection/correction of the corresponding wave functions, which we have
taken from [104], Chap. 13, is quite similar to the concept of quantum error-
correcting codes. In this section we describe Reed–Müller codes in a natural
geometrical setting (cf. [103, 104, 252]).
In the first-order Reed–Müller code RM(1, m), the codewords $f \in GF(2)^{2^m}$
are GF(2)-linear combinations of class functions of the index-2 subgroups
$H < \mathbb{Z}_2^m$ and their cosets. From this notion, a natural transform into the
orthonormal basis of Walsh–Hadamard functions [253, 254] is given by the
characters of $\mathbb{Z}_2^m$, i.e. the Hadamard transformation $H_{2^m}$ (see also Sect. 4.2.1
and [255]). By these means, the GF(2) vector
\[ f = (f(u))_{u \in \mathbb{Z}_2^m} \in GF(2)^{2^m} \]
Note that for the first Hadamard coefficient $\hat F_0$ of F(t), which is given by
\[ \hat F_0 = \int_0^{2^m} F(t)\, dt , \]
the following identity holds:
\[ \hat F_0 = 2^m - 2\, \mathrm{wgt}(f) . \]
This provides a beautiful maximum-likelihood decoding device similar to that
needed for quantum codes.
We shall illustrate this with the example of the first-order Reed–Müller
code RM (1, 3).
The codewords of the first-order Reed–Müller code RM (1, m) with m = 3,
of length 8, can be regarded as incidence vectors of special subsets of points of
AG(3, 2) (see Fig. 4.19). The following table of Boolean functions (see Sect.
4.2.2) fi (x3 , x2 , x1 ) = 1 ⊕ xi of incidence vectors,
Fig. 4.20. Transmitted signal corresponding to the codeword f 0 + f 1 + f 2 =
(1, 0, 0, 1, 0, 1, 1, 0) with and without noise
In the more general case of Reed–Müller codes RM (1, m) and their modu-
lated signal functions, noise e(t) is added to the signal function by the channel
during transmission, so that the receiver will detect only a signal
Fig. 4.21. Minimum-distance decoder for Reed–Müller codes (cf. [104], Chap. 13)
With the overall ± parity given by the sign $\mathrm{sign}(\hat\psi_0) = (-1)^{\varepsilon(\psi)}$, the deconverter
produces a most likely codeword $f = \varepsilon(\psi) \cdot 1 + x^\perp$. If the enumeration of
the generating hyperspaces $u_i$ (cf. (4.10)) is chosen suitably, the decoder
therefore reproduces the (m + 1)-bit message $n = (\varepsilon, x_1, \ldots, x_m)$, which is a
maximum-likelihood estimator of the word originally encoded into $f = M \cdot G$.
The reader is urged to compare this decoding algorithm with the quantum
decoding algorithm described in Sect. 4.7.4.
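The decoder can be sketched in a few lines. The following is a minimal illustration under stated assumptions: the message $(\varepsilon, x)$ is encoded as $f(u) = \varepsilon \oplus x \cdot u$ for $u \in \mathbb{Z}_2^m$ (with x and u stored as integers whose bits are the coordinates), the signal is $F = (-1)^f$ sampled at $2^m$ points, and the function names are hypothetical. The decoder applies the fast Hadamard transform, picks the coefficient of largest magnitude, and reads ε from its sign.

```python
import numpy as np

def encode(eps, x, m):
    """Codeword f(u) = eps XOR (x . u) of RM(1, m); x and u are bit masks."""
    return np.array([(eps + bin(x & u).count("1")) % 2 for u in range(2 ** m)])

def fast_hadamard(a):
    """Unnormalized Walsh-Hadamard transform in O(m 2^m) additions."""
    a = np.array(a, dtype=float)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            top = a[start:start + h].copy()
            bottom = a[start + h:start + 2 * h].copy()
            a[start:start + h] = top + bottom
            a[start + h:start + 2 * h] = top - bottom
        h *= 2
    return a

def decode(signal):
    """Return (eps, x) for the Hadamard coefficient of largest magnitude."""
    spectrum = fast_hadamard(signal)
    x = int(np.argmax(np.abs(spectrum)))
    eps = 0 if spectrum[x] > 0 else 1
    return eps, x

m = 3
f = encode(1, 5, m)                    # illustrative message: eps = 1, x = 101
signal = 1 - 2 * f                     # modulated signal with values +1 and -1
noisy = signal.copy()
noisy[2] = -noisy[2]                   # one transmission error
decoded = decode(noisy)
first_coefficient = fast_hadamard(signal)[0]
```

Since RM(1, 3) has minimum distance 4, a single error leaves the correct coefficient at magnitude 6 against 2 for all others, so the maximiser still selects it.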
⁶ With the fast Hadamard transform algorithm (see Sect. 4.2.1), this decoder requires
$O(m 2^m)$ computational steps. Note that this is one of the earliest communication
applications of generalized FFT algorithms (cf. [171]).
One of the basic features described so far in this contribution has been the use
of quantum mechanical properties (entanglement, superposition) to speed up
the solution of classical problems. Most notably, this basic idea of quantum
computing has been present in the complex of hidden-subgroup problems
(see Sect. 4.3, 4.5.2 and 5), where the problem was to identify an unknown
subgroup of a given group out of exponentially many candidates, given a su-
perposition over a coset of the unknown subgroup. In this section we shall take
the dual point of view: we construct states which are simultaneous eigenstates
of a suitably chosen subgroup of a fixed error group and have the additional
property that an element of an unknown coset of this group – this models an
error which happens to the states – can be identified and also corrected.
To construct these states, we rely on the powerful theory of classical ECC
[184] introduced in the preceding sections. In what follows, we shall introduce
the class of binary QECCs, the so-called CSS codes, referring to the elaborate
article [255] by Beth and Grassl. These codes were independently discovered
by Calderbank and Shor [82] and Steane [62].
Quantum Channels. Before describing the construction of these codes,
we shall loosely describe the similarities and differences between a classical
binary symmetric channel (BSC) (see Sect. 4.7.2, Fig. 4.18) and a quantum
channel (QC).
Much as in the idealized case of a BSC, where vectors $c \in GF(2)^n$ are
transmitted, we shall consider the QC to be a carrier of kets $|\psi\rangle \in \mathcal{H}_{2^n} = \mathbb{C}^{2^n}$
spanned by the basis kets $|x\rangle$, where $x \in GF(2)^n$. In this system, so-called
error operations, generated by local errors such as bit-flip errors and sign-flip
errors, can occur in superposition. Similarly to the case of a BSC, where the
error group is isomorphic to $(GF(2)^n, \oplus) = \{(e_1, \ldots, e_n) : e_i \in \{0, 1\},\ i = 1, \ldots, n\}$,
in QC the error group
generated by the local bit flips and phase flips, represents all possible error
operators in the quantum channel by the transition diagram described below.
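Initially, the input wavefunction $|\psi\rangle$ will interact with the environment
via $|\psi\rangle \mapsto |\psi\rangle|\epsilon\rangle$ through a modulator; within the channel, this waveform
evolves under the error group according to the channel characteristics,
\[ |\psi\rangle|\epsilon\rangle \xrightarrow{\text{channel error}} \sum_{\gamma \in E} (\gamma|\psi\rangle)|\epsilon_\gamma\rangle , \qquad (4.11) \]
ket
\[ |\mu\rangle = \sum_{\gamma \in E,\ \mathrm{wgt}(\gamma) \le t} \gamma|\psi\rangle|\epsilon_\gamma\rangle \qquad (4.12) \]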
will range only over those group elements γ which are products of at most
t local (single-bit) errors. Much in accordance with the classical case, for
γ = e1 ⊗ . . . ⊗ en ∈ E we define wgt(γ) to be the number of occurrences of
σx and σz needed to generate γ.
In order to protect quantum states against errors of this kind in a quan-
tum channel, a so-called Pauli channel, a quantum error-correcting code must
be constructed to map original quantum states into certain “protected” or-
thogonal subspaces, so that errors of the type of (4.12) cannot in practice
damage the original state seriously. For this purpose, the theory of classical
codes for the BSC can be successfully applied, as we describe below.
Quantum Codes. The basic principle is to consider encoded quantum states
which are superpositions of basis vectors belonging to classical codes, e.g.
\[ |C\rangle := \frac{1}{\sqrt{|C|}} \sum_{c \in C} |c\rangle . \]
which holds for all subgroups U of an abelian group A (see Sect. 4.4.2 and
4.4.3).
The lemma says that any bit-flip error applied to the state $|C\rangle$ will give
a state whose support $C^\perp$ is translation invariant, the shift being expressed
only in the phases of the elements of $C^\perp$. Dually, a phase-flip error in $|C\rangle$ will
occur as a bit-flip error in $|C^\perp\rangle$.
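This behaviour can be observed on three qubits. The sketch below assumes the code C = {000, 111}, whose dual $C^\perp$ consists of the four even-weight words, and interprets the statement as referring to the Hadamard-transformed state; the basis ordering (first qubit as most significant bit) and all names are illustrative assumptions. A bit flip on the first qubit leaves the support of the transformed state on $C^\perp$ unchanged and shows up only as signs.

```python
import numpy as np

H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # single-qubit Hadamard matrix
X = np.array([[0, 1], [1, 0]])                  # bit flip
I2 = np.eye(2)

def kron3(a, b, c):
    """Tensor product of three single-qubit operators (first factor = first qubit)."""
    return np.kron(np.kron(a, b), c)

def support(v):
    """Basis indices with nonzero amplitude."""
    return [i for i in range(len(v)) if abs(v[i]) > 1e-9]

H3 = kron3(H1, H1, H1)

code_state = np.zeros(8)
code_state[0b000] = code_state[0b111] = 1 / np.sqrt(2)   # |C> for C = {000, 111}
flipped = kron3(X, I2, I2) @ code_state                   # bit flip on the first qubit

clean_spectrum = H3 @ code_state
error_spectrum = H3 @ flipped
```

Both spectra live on the indices 000, 011, 101 and 110; the error changes the sign of exactly those amplitudes whose word has a 1 in the flipped position.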
From this, we deduce the following basic coding principle ([255], p. 462):
given a classical binary linear (n, k) code C of length n and dimension k, the
states of the related quantum code are given by
\[ |\psi_{w_i}\rangle = \frac{1}{\sqrt{|C|}} \sum_{c \in C} (-1)^{c \cdot w_i} |c\rangle , \qquad (4.15) \]
with suitable $w_i \in GF(2)^n$ encoding the ith basis ket of the initial state
to be protected. In addition to the choice of the classical code C, for the
construction of a binary quantum code a subset $W \subseteq GF(2)^n / C^\perp$ has to be
given to define the vectors $w_i$.
Example 4.8. Let $C := \{(0, 0, 0), (1, 1, 1)\} \subseteq GF(2)^3$ be the dual of the Reed–Müller
code R = RM(1, 3) shown in Fig. 4.19. Here $W := GF(2)^3 / C^\perp$ provides
an appropriate choice,
Note that the state $|0, 0, 0\rangle + |1, 1, 1\rangle$ is the maximally entangled GHZ state.
We remark that the GHZ state is, up to local unitary transformations, the
unique maximally entangled state [256]. The “protected” subspace of code
vectors is, by definition,
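\[ \varepsilon_3 = \mathrm{id} \otimes \mathrm{id} \otimes \sigma_x , \quad \varepsilon_2 = \mathrm{id} \otimes \sigma_x \otimes \mathrm{id} , \quad \varepsilon_1 = \sigma_x \otimes \mathrm{id} \otimes \mathrm{id} \in U(8) . \]
\[ \mathcal{H}_{2^3} = \bigoplus_{i=0}^{3} \mathcal{H}_i \]
Thus, for any linear combination E of these single bit-flip errors $\varepsilon_i$, the encoded
state $|\psi\rangle = \alpha|\psi_{w_0}\rangle + \beta|\psi_{w_1}\rangle$ is represented by
\[ E|\psi\rangle = \sum_{i=0}^{3} \lambda_i (\alpha\, \varepsilon_i|\psi_{w_0}\rangle + \beta\, \varepsilon_i|\psi_{w_1}\rangle) , \]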
forms a quantum code Q which can correct (d1 −1)/2 bit errors and (d2 −1)/2
phase errors.
In practice, we have the following corollary in the case of weakly self-dual
codes $C \subseteq C^\perp$, which were shown to be asymptotically good in [82]:
Theorem 4.18. Let C be a weakly self-dual binary code with dual distance d.
Then the corresponding quantum code is capable of correcting up to (d − 1)/2
errors.
The construction in this theorem can be made more general, as Beth and
Grassl have shown in [255]:
Theorem 4.19. Let $C^\perp$ be a weakly self-dual binary code. If for
\[ M_0 := \{ w \in GF(2)^n / C^\perp \mid d_H(C^\perp, C^\perp + w) \le t \} \]
the condition
\[ \forall w_i, w_j : i \ne j \Rightarrow M_0 \cap (M_0 + (w_i - w_j)) = \emptyset , \qquad (4.17) \]
is satisfied, the quantum code can correct t errors, i.e. any error operator
ε ∈ E where the total number of positions exposed to bit flips or sign flips is
at most t.
4.8 Conclusions
5.1 Introduction
Quantum entanglement is one of the most striking features of the quantum
formalism [26]. It can be expressed as follows: If two systems interacted in
the past it is, in general, not possible to assign a single state vector to either
of the two subsystems [257]. This is what is sometimes called the principle of
nonseparability. A common example of an entangled state is the singlet state
[258],
\[ \psi_- = \frac{1}{\sqrt{2}} (|01\rangle - |10\rangle) . \qquad (5.1) \]
One can see that it cannot be represented as a product of individual vectors
describing states of subsystems. Historically, entanglement was first recog-
nized by Einstein, Podolsky and Rosen (EPR) [24] and by Schrödinger [5].1
In their famous paper, EPR suggested a description of the world (called “local
realism”) which assigns an independent and objective reality to the physi-
cal properties of the well-separated subsystems of a compound system. Then
EPR applied the criterion of local realism to predictions associated with an
entangled state to conclude that quantum mechanics was incomplete. The
EPR criticism was the source of many discussions concerning fundamental
differences between the quantum and classical descriptions of nature.
The most significant progress toward the resolution of the EPR problem
was made by Bell [25], who proved that local realism implies constraints on
the predictions of spin correlations in the form of inequalities (called Bell’s
inequalities) which can be violated by quantum mechanical predictions for
a system in the state (5.1). The latter feature of quantum mechanics, usu-
ally called nonlocality, is one of the most evident manifestations of quantum
entanglement.
Information-theoretical aspects of entanglement were first considered by
Schrödinger, who wrote [5], in the context of the EPR problem,
“Thus one disposes provisionally (until the entanglement is resolved by actual obser-
vation) of only a common description of the two in that space of higher dimension.
1
In fact, entangled quantum states had been used in investigations of the proper-
ties of atomic and molecular systems [259].
G. Alber, T. Beth, M. Horodecki, P. Horodecki, R. Horodecki, M. Rötteler, H. Weinfurter,
R. Werner, A. Zeilinger: Quantum Information, STMP 173, 151–195 (2001)
© Springer-Verlag Berlin Heidelberg 2001
152 Michał Horodecki, Paweł Horodecki and Ryszard Horodecki
This is the reason that knowledge of the individual systems can decline to the scant-
iest, even to zero, while that of the combined system remains continually maximal.
Best possible knowledge of a whole does not include best possible knowledge of its
parts – and that is what keeps coming back to haunt us”.
In this way Schrödinger recognized a profoundly nonclassical relation between
the information which an entangled state gives us about the whole system
and the information which it gives us about the subsystems.
The recent development of quantum information theory has shown that
entanglement can have important practical applications (see, e.g., [1, 2, 3,
260]). In particular, it turns out that entanglement can be used as a resource
for communication of quantum states in an astonishing process called quan-
tum teleportation [261].2 In the latter, a quantum state is transmitted by use
of a pair of particles in a singlet state (5.1) shared by the sender and receiver
(usually referred to as Alice and Bob), and two bits of classical communica-
tion. However, in real conditions, owing to interaction with the environment,
called decoherence, we encounter mixed states rather than pure ones. These
mixed states can still possess some residual entanglement. More specifically,
a mixed state is considered to be entangled if it is not a mixture of prod-
uct states [263]. In mixed states the quantum correlations are weakened, and
hence the manifestations of mixed-state entanglement can be very subtle
[263, 264, 265]. Nevertheless, it appears that it can be used as a resource for
quantum communication. Such a possibility is due to the discovery of distil-
lation of entanglement [266]: by manipulation of noisy pairs, involving local
operations and classical communication, Alice and Bob can obtain singlet
pairs and apply teleportation. This procedure provides a powerful protection
of the quantum data transmission against the environment.
Consequently, the fundamental problem was to investigate the structure
of mixed-state entanglement, especially in the context of quantum communication.
These investigations have led to the discovery of a discontinuity in the
structure of mixed-state entanglement. It appears that there are at least two
qualitatively different types of entanglement [71]: free, which is useful for
quantum communication, and bound, which is a nondistillable, very weak
and mysterious type of entanglement.
The present contribution is divided into two main parts. In the first part
we report results of an investigation of the mathematical structure of entan-
glement. The main question is: given a mixed state, is it entangled or not? We
present powerful tools that allow us to obtain the answer in many interesting
cases. A crucial role is played here by the connection between entanglement
and the theory of positive maps [267]. In contrast to completely positive maps
[88], positive maps have not been applied in physics so far. The second part
is devoted to the application of the entanglement of mixed states to quantum
communication. Now, the leading question is: given an entangled state, can
it be distilled? The mathematical tools worked out in the first part allow us
2
For experimental realizations, see [262].
Mixed-State Entanglement and Quantum Communication 153
to answer the question. Surprisingly, the answer does not simplify the picture
but, rather, reveals a new horizon including the basic question: what is the
role of bound entanglement in nature?
Since entanglement is a basic ingredient of quantum information theory,
the scope of application of the research presented here goes far beyond the
quantum communication problem. The insight into the structure of entangle-
ment of mixed states can be helpful in many subfields of quantum information
theory, including quantum computing, quantum cryptography, etc.
Finally, it must be emphasized that our approach will be basically qual-
itative. Thus we shall not review here the beautiful work performed in the
domain of quantifying entanglement [66, 268, 269, 270, 271] (we shall only
touch on this subject in the second part). Owing to the limited space for
the present contribution, we shall also restrict our considerations to the en-
tanglement of bipartite systems, even though a number of results have been
recently obtained for multipartite systems (see, e.g., [272, 273]).
$$\mathrm{Tr}\,\varrho P \ge 0 \tag{5.2}$$
We shall further need the following maximally entangled pure state of the
d ⊗ d system:
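$$\psi_+^d = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |i\rangle \otimes |i\rangle . \tag{5.4}$$
$$\psi = U_1 \otimes U_2\, \psi_+ ,$$
$$F(\varrho) = \max_{\psi} \langle\psi|\varrho|\psi\rangle , \tag{5.5}$$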
where the maximum is taken over all maximally entangled vectors of the d⊗d
system.
where the positive coefficients $a_i$ are called Schmidt coefficients. The state
is entangled if at least two coefficients do not vanish. One finds that the
positive eigenvalues of either of the reductions are equal to the squares of
the Schmidt coefficients. In the next section we shall introduce a series of
necessary conditions for the separability of mixed states. It turns out that
all of them are equivalent to separability in the case of pure states [276, 277].
4
In fact, the state ψ+ used in the definition of the singlet fraction is a local
transformation of the true singlet state. Nevertheless, we shall keep the name
“singlet fraction”, while using the state ψ+ , which is more convenient for technical
reasons.
$$\mathrm{Tr}\,\varrho\mathcal{B} \le 2 , \tag{5.7}$$
the Pauli matrices. For any given set of vectors we have a different inequality.
In [278] we derived the condition for a two-qubit$^6$ state that was equivalent
to satisfying all the inequalities jointly. This condition has the following form:
$$M(\varrho) \le 1 , \tag{5.9}$$
where $\varrho_A = \mathrm{Tr}_B\,\varrho$, and similarly for $\varrho_B$. The above inequalities were proved
[277, 280, 281, 282] to be satisfied by separable states for four different
entropies that are particular cases of the Rényi quantum entropies
$S_\alpha = (1-\alpha)^{-1} \log \mathrm{Tr}\,\varrho^\alpha$:
$$\begin{aligned}
S_0 &= \log R(\varrho) , \\
S_1 &= -\mathrm{Tr}\,\varrho \log \varrho , \\
S_2 &= -\log \mathrm{Tr}\,\varrho^2 , \\
S_\infty &= -\log \|\varrho\| ,
\end{aligned} \tag{5.11}$$
where $R(\varrho)$ denotes the rank of the state $\varrho$ (the number of nonvanishing
eigenvalues). The above inequalities are useful tools in many cases (as we
shall see in Sect. 5.3.5, one of them allows us to obtain a bound on the
possible rank of the bound entangled states); still, however, they are not very
strong criteria.
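As a small illustration, the four entropies of (5.11) can be evaluated directly from the eigenvalues of a state. The sketch below is not taken from the text: the function name `renyi_entropy`, the base-2 logarithm, and the cutoff used to decide which eigenvalues count as nonvanishing are assumptions made for this example. It also assumes the entropic inequalities take the standard form $S_\alpha(\varrho_A) \le S_\alpha(\varrho)$.

```python
from math import log2, isclose

def renyi_entropy(eigenvalues, alpha, cutoff=1e-12):
    """Renyi entropy S_alpha computed from the eigenvalues of a state.

    alpha may be 0, 1, 2 or float('inf'); base-2 logarithms are an
    assumption of this sketch (the text leaves the base unspecified).
    """
    nonzero = [x for x in eigenvalues if x > cutoff]
    if alpha == 0:
        return log2(len(nonzero))                  # S_0 = log R(rho)
    if alpha == 1:
        return -sum(x * log2(x) for x in nonzero)  # S_1 = -Tr rho log rho
    if alpha == float("inf"):
        return -log2(max(nonzero))                 # S_inf = -log ||rho||
    return log2(sum(x ** alpha for x in nonzero)) / (1 - alpha)

# Two-qubit singlet: the total state is pure, the reduction is maximally mixed.
total_spectrum = [1.0, 0.0, 0.0, 0.0]
reduced_spectrum = [0.5, 0.5]

violations = {
    alpha: renyi_entropy(reduced_spectrum, alpha) > renyi_entropy(total_spectrum, alpha)
    for alpha in (0, 1, 2, float("inf"))
}
```

For the singlet, every one of the four entropies of the reduction equals one bit while the entropy of the total state vanishes, so the assumed inequality is violated in all four cases.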
A different approach, presented in [66], is based on local manipulations of
entanglement (this approach was anticipated in [265]). The main idea is the
following: a given state is entangled because parties sharing many systems
(pairs of particles) in this state can produce a smaller number of pairs in a
highly entangled state (of easily “detectable” entanglement) by local oper-
ations and classical communication (LOCC). This approach initiated a new
field in quantum information theory: manipulating entanglement. The sec-
ond part of this contribution will be devoted to this field. It also initiated the
subject of the quantification of entanglement. Still, however, the seemingly
simple qualitative question of whether a given state is entangled or not was
not solved.
A breakthrough was achieved by Peres [284], who derived a surprisingly
simple but very strong criterion. He noted that a separable state remains a
positive operator if subjected to partial transposition (PT). We will call this
the positive partial transposition (PPT) criterion.
To define partial transposition, we shall use the matrix elements of a state
in some product basis:
$$\varrho_{m\mu,n\nu} = \langle m| \otimes \langle\mu| \,\varrho\, |n\rangle \otimes |\nu\rangle , \tag{5.12}$$
where the kets with Latin and Greek letters form an orthonormal basis in the
Hilbert space describing the first and the second system, respectively. Hence
the partial transposition of $\varrho$ is defined as
$$\varrho^{T_B}_{m\mu,n\nu} \equiv \varrho_{m\nu,n\mu} . \tag{5.13}$$
The form of the operator $\varrho^{T_B}$ depends on the choice of basis, but its eigenvalues
do not. We shall say that a state “is PPT” if $\varrho^{T_B} \ge 0$; otherwise we shall
say that the state “is NPT”. The partial transposition is easy to perform in
matrix notation. The state of the $m \otimes n$ system can be written as
$$\varrho = \begin{pmatrix} A_{11} & \dots & A_{1m} \\ \vdots & \ddots & \vdots \\ A_{m1} & \dots & A_{mm} \end{pmatrix} , \tag{5.14}$$
with $n \times n$ matrices $A_{ij}$ acting on the second ($\mathbb{C}^n$) space. These matrices are
defined by their matrix elements $\{A_{ij}\}_{\mu\nu} \equiv \varrho_{i\mu,j\nu}$. Then the partial
transposition can be realized simply by transposition (denoted by T) of all of these
matrices, namely
$$\varrho^{T_B} = \begin{pmatrix} A_{11}^T & \dots & A_{1m}^T \\ \vdots & \ddots & \vdots \\ A_{m1}^T & \dots & A_{mm}^T \end{pmatrix} . \tag{5.15}$$
Now, for any separable state $\varrho$, the operator $\varrho^{T_B}$ must still have
nonnegative eigenvalues [284]. Indeed, consider a partially transposed separable
state
$$\varrho^{T_B} = \sum_i p_i\, \varrho_i \otimes (\tilde\varrho_i)^T . \tag{5.16}$$
Since the state $\tilde\varrho_i$ remains positive under transposition, so does the total
state.
Note that what distinguishes the Peres criterion from the earlier ones is
that it is structural. In other words, it does not say that some scalar function
of a state satisfies some inequality, but it imposes constraints on the structure
of the operator resulting from PT. Thus the criterion amounts to satisfying
many inequalities at the same time. In the next section we shall see that there
is also another crucial feature of the criterion: it involves a transposition that
is a positive map but is not a completely positive one. This feature, abstracted
from the Peres criterion, allows us to find an intimate connection between
entanglement and the theory of positive maps.
Finally, it should be mentioned that necessary conditions for separability
have been recently obtained in the infinite-dimensional case [285, 286]. In
particular, the Peres criterion was expressed in terms of the Wigner repre-
sentation and applied to Gaussian wave packets [286].
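$$\Lambda_n = \Lambda \otimes I_n : \mathcal{A}_A \otimes \mathcal{M}_n \to \mathcal{A}_B \otimes \mathcal{M}_n \tag{5.17}$$
is positive for all $n$; here $I_n$ is the identity map on the space $\mathcal{M}_n$.$^8$ Thus
the tensor product of a CP map and the identity maps positive operators
into positive ones. An example of a CP map is $\varrho \to W\varrho W^\dagger$, where $W$ is an
arbitrary operator. As a matter of fact, the general form of a CP map is
$$\Lambda(\varrho) = \sum_i W_i\, \varrho\, W_i^\dagger . \tag{5.18}$$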
$$\mathrm{Tr}\,\varrho^T P = \mathrm{Tr}\,\varrho P^T \ge 0 \tag{5.19}$$
and $P^T$ is still some projector. We have used here the fact that $\mathrm{Tr}\,A^T = \mathrm{Tr}\,A$.
On the other hand, $I \otimes T$ is no longer positive. One can easily check this,
showing that $(I \otimes T)P_+ \equiv P_+^{T_B}$ is not a positive operator.
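The check mentioned above can be carried out by hand for two qubits. The following is a minimal sketch, not the authors' calculation: the helper names are hypothetical, and it is restricted to real amplitudes so that plain nested lists suffice. It builds the projector onto $\psi_+ = (|00\rangle + |11\rangle)/\sqrt{2}$, applies the index swap of (5.13), and evaluates the result on the singlet vector.

```python
from math import sqrt

def projector(v):
    """Return |v><v| for a real vector v as a nested list."""
    return [[a * b for b in v] for a in v]

def partial_transpose_b(rho, m, n):
    """Partial transposition on the second subsystem of an m x n system.

    A row or column index is (i * n + mu), with i labelling the first
    system and mu the second; the Greek indices are swapped as in (5.13).
    """
    dim = m * n
    out = [[0.0] * dim for _ in range(dim)]
    for i in range(m):
        for mu in range(n):
            for j in range(m):
                for nu in range(n):
                    out[i * n + mu][j * n + nu] = rho[i * n + nu][j * n + mu]
    return out

def expectation(op, v):
    """<v|op|v> for a real vector v."""
    size = len(v)
    return sum(v[r] * op[r][c] * v[c] for r in range(size) for c in range(size))

psi_plus = [1 / sqrt(2), 0.0, 0.0, 1 / sqrt(2)]    # (|00> + |11>)/sqrt(2)
psi_minus = [0.0, 1 / sqrt(2), -1 / sqrt(2), 0.0]  # (|01> - |10>)/sqrt(2)

p_plus_tb = partial_transpose_b(projector(psi_plus), 2, 2)
value_on_singlet = expectation(p_plus_tb, psi_minus)
```

Here `value_on_singlet` comes out as -1/2 (up to rounding), so the partially transposed projector has a negative expectation value and cannot be a positive operator.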
A positive map is called decomposable [287] if it can be represented in the
form
$$\mathrm{Tr}(A\varrho) \ge 0 \tag{5.21}$$
for any operator $A$ satisfying $\mathrm{Tr}(A\,P \otimes Q) \ge 0$, for all pure states $P$ and $Q$
acting on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively.
Remark. Note that operator $A$, which is positive on product states (i.e. it
satisfies $\mathrm{Tr}(A\,P \otimes Q) \ge 0$), is automatically Hermitian.
The lemma is a reflection of the fact that in real Euclidean space, a convex
set and a point lying outside it can always be separated by a hyperplane.$^9$
Here, the convex set is the set of separable states, while the point is the
entangled state. The hyperplane is determined by the operator A. Though
this operator is not positive, its restriction to product states is still positive.
Thus, this operator has been called the “entanglement witness” [290], as it
indicates the entanglement of some state (the first entanglement witness was
provided in [263]; see Sect. 5.2.4). Now, to pass to positive maps, we shall use
the isomorphism between entanglement witnesses and positive non-CP maps
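[291]. Note that, if we have any linear operator $A \in \mathcal{A}_A \otimes \mathcal{A}_B$, we can define
a map
$$\langle k|\, \Lambda(|i\rangle\langle j|)\, |l\rangle = \langle i| \otimes \langle k| \,A\, |j\rangle \otimes |l\rangle , \tag{5.22}$$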
9
For infinite dimensions one must invoke the Hahn–Banach theorem, whose geo-
metric form is a generalization of this fact.
160 Michał Horodecki, Paweł Horodecki and Ryszard Horodecki
As we mentioned, the relevant positive maps here are the ones that are not
completely positive. Indeed, for the CP map $\Lambda$ we have $(I \otimes \Lambda)\varrho \ge 0$ for any
state $\varrho$, and hence CP maps are of no use here. The above theorem presents, to
our knowledge, the first application of the theory of positive maps in physics.
So far, only completely positive maps have been of interest to physicists.
As we shall see, the theorem has proved fruitful both for mathematics (the
theory of positive maps) and for physics (the theory of entanglement).
Operational Characterization of Entanglement
in Low Dimensions (2 ⊗ 2 and 2 ⊗ 3 Systems). The first conclusion
derived from the theorem is an operational characterization of the separable
states in low dimensions (2 ⊗ 2 and 2 ⊗ 3). This follows from the previously
mentioned result that positive maps in low dimensions are decomposable.
Then the condition $(I \otimes \Lambda)\varrho \ge 0$ reads
$(I \otimes \Lambda_1^{CP})\varrho + (I \otimes \Lambda_2^{CP})\varrho^{T_B} \ge 0$. Now,
since $\varrho$ is positive and $\Lambda_1^{CP}$ is CP, the first term is always positive. If $\varrho^{T_B}$
is positive, then the second term is also positive, and hence their sum is
a positive operator. Thus, to check whether for all positive maps we have
$(I \otimes \Lambda)\varrho \ge 0$, it suffices to check only transposition. One obtains the following
[267] (see also [292]):
Theorem 5.2. A state $\varrho$ of a 2 ⊗ 2 or 2 ⊗ 3 system is separable if and only
if its partial transposition is a positive operator.
Remark. Equivalently, one can use the partial transposition with respect to
the first space.
The above theorem is an important result, as it allows one to determine
unambiguously whether a given quantum state of a 2⊗2 or a 2⊗3 system can
be written as a mixture of product states or not. The necessary and sufficient
condition for separability here is surprisingly simple; hence it has found many
applications. In particular, it has been applied in the context of broadcasting
entanglement [293], quantum information flow in quantum copying networks
[294], disentangling machines [295], imperfect two-qubit gates [296], analysis
of the volume of the set of entangled states [297, 298], decomposition of sepa-
rable states into minimal ensembles or pseudo-ensembles [299], entanglement
splitting [300] and analysis of entanglement measures [270, 301, 302].
In Sect. 5.3.2 we describe the first application [303]: by use of this theorem
we show that any entangled two-qubit system can be distilled, and hence is
useful for quantum communication.
Higher Dimensions – Entangled States
with Positive Partial Transposition. Since the Størmer–Woronowicz char-
acterization of positive maps applies only to low dimensions, it follows that
for higher dimensions partial transposition will not constitute a necessary and
sufficient condition for separability. Thus there exist states that are entangled
but are PPT (see Fig. 5.1). The first explicit examples of an entangled but
PPT state were provided in [274]. Later on, it became apparent that the
mathematical literature concerning nondecomposable maps contains exam-
ples of matrices that can be treated as prototypes of PPT entangled states
[292, 304].
We shall now describe the method of obtaining such states presented
in [274], as it has proved to be a fruitful direction in searching for PPT
entangled states. Section 5.3 will describe the motivation for undertaking
the very tedious task of this search – the states represent a curious type of
entanglement, namely bound entanglement.
To find the desired examples we must find an entangled PPT state. Of
course, we cannot use the strongest tool so far described, i.e. the PPT crite-
rion, because we are actually trying to find a state that is PPT. So we must
derive a criterion that would be stronger in some cases. It appears that the
range$^{10}$ of the state $\varrho$ can tell us much about its entanglement in some cases.
This is contained in the following theorem, derived in [274] on the basis of
the analogous condition for positive maps considered in [289].
Now, in [274] there were presented two examples of PPT states violating
the above criterion. We shall present the example for a 2 ⊗ 4 case.$^{11}$ The
10
The range of an operator A acting on the Hilbert space H is given by R(A) =
{A(ψ) : ψ ∈ H}. If A is a Hermitian operator, then the range is equivalent to the
support, i.e. the space spanned by its eigenvectors with nonzero eigenvalues.
11
This is based on an example concerning positive maps [289].
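Fig. 5.1. Structure of entanglement of mixed states for 2 ⊗ 2 and 2 ⊗ 3 systems (a)
and for higher dimensions (b). [Diagram: in (a) the PPT states coincide with the
separable states and the NPT states with the entangled states; in (b) the separable
states form a subset of the PPT states, and the entangled states comprise the
remaining PPT states together with all NPT states.]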
5.2.4 Examples
We present here a couple of examples, illustrating the results contained in
previous sections. In particular, we introduce two families of states that play
important roles in the problem of distillation of entanglement.
Reduction Criterion for Separability. As mentioned in Sect. 8, if $\Lambda$ is a
positive map, then for separable states we have
$$(I \otimes \Lambda)\varrho \ge 0 . \tag{5.27}$$
If the map is not CP, then this condition is not trivial, i.e. for some states
$(I \otimes \Lambda)\varrho$ is not positive. Consider the map given by $\Lambda(A) = (\mathrm{Tr}\,A)I - A$.
The eigenvalues of the resulting operator $\Lambda(A)$ are given by $\lambda_i = \mathrm{Tr}\,A - a_i$,
where $a_i$ are the eigenvalues of $A$. If $A$ is positive, then $a_i \ge 0$. Now, since
$\mathrm{Tr}\,A = \sum_i a_i$, the $\lambda_i$ are also nonnegative. Thus the map is positive. Now,
the formula (5.27) and the dual formula $(\Lambda \otimes I)\varrho \ge 0$ applied to this particular
map imply that separable states must satisfy the following inequalities:
$$I \otimes \varrho_B - \varrho \ge 0 , \qquad \varrho_A \otimes I - \varrho \ge 0 . \tag{5.28}$$
The two conditions, taken jointly, are called the reduction criterion [281, 309].
One can check that it implies the entropic inequalities (hence it is better in
“detecting” entanglement). From the reduction criterion, it follows that states
$\varrho$ of a d ⊗ d system with $F(\varrho) > 1/d$ must be entangled (this was originally
argued in [66]). Indeed, from the above inequalities, it follows that for a
separable state $\sigma$ and a maximally entangled state $\psi_{me}$ one has
$\langle\psi_{me}|\sigma_A \otimes I - \sigma|\psi_{me}\rangle \ge 0$. Since the reduced density matrix
$\varrho_A^{\psi_{me}}$ of the state $\psi_{me}$ is proportional to the identity, we obtain
$\langle\psi_{me}|\sigma_A \otimes I|\psi_{me}\rangle = \mathrm{Tr}(\varrho_A^{\psi_{me}} \sigma_A) = 1/d$,
and hence $\langle\psi_{me}|\sigma|\psi_{me}\rangle \le 1/d$.
He showed that such states (called Werner states) must be of the following
form:
$$W(d) = \frac{1}{d^2 + \beta d}(I + \beta V) , \quad -1 \le \beta \le 1 , \tag{5.30}$$
where $V$ is the flip operator defined above. Another form for $W$ is [310]
$$W(d) = p\,\frac{P_A}{N_A} + (1-p)\,\frac{P_S}{N_S} , \quad 0 \le p \le 1 , \tag{5.31}$$
where $N_A = (d^2 - d)/2$ and $N_S = (d^2 + d)/2$ are the dimensions of the
antisymmetric and symmetric subspaces, respectively. It was shown [263]
that $W$ is entangled if and only if $\mathrm{Tr}\,VW < 0$. Equivalent conditions are
$\beta < -1/d$, $p > 1/2$ or $W$ is NPT. Thus $W$ is separable if and only if it is PPT.
For d = 2 (the two-qubit case) the state can be written as (see [264])
$$W(2) = p\,|\psi_-\rangle\langle\psi_-| + (1-p)\,\frac{I}{4} , \quad -\frac{1}{3} \le p \le 1 . \tag{5.32}$$
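The equivalence of the conditions on $\beta$ and on the weight $p$ of (5.31) can be checked with exact arithmetic. The sketch below is an illustration under stated assumptions rather than part of the original argument: it takes $W(d)$ to be normalized to unit trace (which, with $\mathrm{Tr}\,I = d^2$ and $\mathrm{Tr}\,V = d$, fixes the prefactor of $I + \beta V$ to $1/(d^2 + \beta d)$), and it uses $V = P_S - P_A$ and $I = P_S + P_A$. The function names are hypothetical.

```python
from fractions import Fraction

def trace_flip_from_beta(beta, d):
    """Tr(V W) for W = (I + beta V)/(d^2 + beta d), using Tr V = d and V^2 = I."""
    return (d + beta * d * d) / Fraction(d * d + beta * d)

def p_from_beta(beta, d):
    """Weight p of P_A / N_A in (5.31) that gives the same state as beta."""
    return (1 - beta) * (d - 1) / Fraction(2 * (d + beta))

def trace_flip_from_p(p):
    """Tr(V W) in the parametrization (5.31): V acts as -1 on P_A and +1 on P_S."""
    return 1 - 2 * p

def is_entangled_werner(beta, d):
    """Entanglement condition Tr(V W) < 0 quoted in the text."""
    return trace_flip_from_beta(beta, d) < 0

# Boundary case beta = -1/d for d = 3: Tr(V W) = 0 and p = 1/2.
boundary_beta = Fraction(-1, 3)
boundary_p = p_from_beta(boundary_beta, 3)
```

Both parametrizations give $\mathrm{Tr}\,VW = (1 + \beta d)/(d + \beta) = 1 - 2p$, so the sign change occurs at $\beta = -1/d$, i.e. at $p = 1/2$.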
Note that any state $\varrho$, if subjected to a random transformation of the
form $U \otimes U$ (we call such an operation $U \otimes U$ twirling), becomes a Werner
state:
$$\int \mathrm{d}U\; U \otimes U \,\varrho\, U^\dagger \otimes U^\dagger = W . \tag{5.33}$$
$$\varrho(p, d) = pP_+ + (1-p)\,\frac{I}{d^2} , \quad \text{where } -\frac{1}{d^2 - 1} \le p \le 1 . \tag{5.34}$$
The state will be called “isotropic” [311] here.$^{13}$ For p > 0 it is interpreted as
a mixture of a maximally entangled state $P_+$ with a completely chaotic noise
represented by $I/d^2$. It was shown that it is the only state invariant under
$U \otimes U^*$ transformations.$^{14}$ If we use the singlet fraction $F = \mathrm{Tr}\,\varrho P_+$ as a
parameter, we obtain
$$\varrho(F, d) = \frac{d^2}{d^2 - 1}\left[(1-F)\,\frac{I}{d^2} + \left(F - \frac{1}{d^2}\right)P_+\right] , \quad 0 \le F \le 1 . \tag{5.35}$$
The two parameters are related via $p = (d^2 F - 1)/(d^2 - 1)$. The state is
entangled if and only if $F > 1/d$ or, equivalently, if it is NPT. Similarly to
the case for Werner states, a state $\varrho$ subjected to $U \otimes U^*$ twirling (random
$U \otimes U^*$ operations) becomes isotropic, and the parameter $F(\varrho)$ is invariant
under this operation.
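The relation between the two parametrizations of the isotropic state is easy to get wrong by a factor of $d$, so a short exact check may help. This is a minimal sketch with hypothetical function names; it relies only on $\mathrm{Tr}\,P_+ = 1$ and $\mathrm{Tr}(P_+ I)/d^2 = 1/d^2$, which give $F = p + (1-p)/d^2$.

```python
from fractions import Fraction

def singlet_fraction_from_p(p, d):
    """F = Tr(rho P+) for rho = p P+ + (1 - p) I / d^2."""
    return p + (1 - p) / Fraction(d * d)

def p_from_singlet_fraction(F, d):
    """Inverse relation p = (d^2 F - 1) / (d^2 - 1)."""
    return (d * d * F - 1) / Fraction(d * d - 1)

def is_entangled_isotropic(F, d):
    """Entanglement condition F > 1/d quoted in the text."""
    return F > Fraction(1, d)

# Two qubits: the threshold F = 1/2 corresponds to p = 1/3.
threshold_p = p_from_singlet_fraction(Fraction(1, 2), 2)
```

The end points behave as expected: $F = 0$ gives the lower bound $p = -1/(d^2 - 1)$ of (5.34), and $F = 1/d^2$ gives $p = 0$, the maximally mixed state.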
13
In [281] it was called a “noisy singlet”.
14
The star denotes complex conjugation.
This map has been shown to be positive [287]. Now, one can calculate the
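operator $(I \otimes \Lambda)\varrho$ and find that one of its eigenvalues is negative for α > 3
(explicitly, $\lambda_- = (3 - \alpha)/2$). This implies that
• the state is entangled (for a separable state we would have $(I \otimes \Lambda)\varrho \ge 0$)
• the map is nondecomposable (for a decomposable map and PPT state
we would also have $(I \otimes \Lambda)\varrho \ge 0$).
For 2 ≤ α ≤ 3 it is separable, as it can be written as a mixture of other
separable states $\sigma_\alpha = \frac{6}{7}\varrho_1 + \frac{\alpha - 2}{7}\sigma_+ + \frac{3 - \alpha}{7}\sigma_-$, where
$\varrho_1 = (|\psi_+\rangle\langle\psi_+| + \sigma_+ + \sigma_-)/3$. The latter state can be written as an integral over
product states:
$$\varrho_1 = \frac{1}{8}\int_0^{2\pi} |\psi(\theta)\rangle\langle\psi(\theta)| \otimes |\psi(-\theta)\rangle\langle\psi(-\theta)| \,\frac{\mathrm{d}\theta}{2\pi} ,$$
where $|\psi(\theta)\rangle = \frac{1}{\sqrt{3}}(|0\rangle + e^{i\theta}|1\rangle + e^{-2i\theta}|2\rangle)$ (there exists also a finite
decomposition exploiting phases of roots of unity [312]).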
used in this approach is the maximally mixed state with a small admixture of
some pure entangled state. In [313] the sufficient conditions of the above kind
were further developed, and it was concluded that in all of the NMR quantum
computing experiments performed so far, the admixture of the pure state was
too small. Thus the total state used in these experiments was separable: it
satisfied a condition sufficient for separability. This raised an interesting dis-
cussion as to what extent entanglement is necessary for quantum computing
[314, 315] (see also [316]). Even though there is still no general answer, it was
shown [314] that the Shor algorithm [9] requires entanglement.
Let us now turn back to the question of the volumes of Ve and Vpe . If one
takes ψ+ of a d ⊗ d system for simplicity, it is easy to see that a not very
large admixture of any state will ensure F > 1/d. Thus any state belonging
to the neighborhood must be entangled. Showing that the volume of PPT
entangled states is nonzero is a bit more involved [297].
In conclusion, all three types of states are not atypical in the set of all
states of a given system. However, it appears that the ratio of the volume of
the set of PPT states VPPT (and hence also separable states) to the volume
of the total set of states goes down exponentially with the dimension of the
system (see Fig. 5.2). This result was obtained numerically [297] and still
Fig. 5.2. The ratio $P_N = V_{PPT}/V$ of the volume of PPT states to the volume of
the set of all states versus the dimension N of the total system. Different symbols
distinguish different sizes of one subsystem (k = 2, k = 3). (This figure is
reproduced from Phys. Rev. A 58, 883 (1998) by permission of the authors)
[Plot: vertical axis $P_N$ from 0.0 to 0.6; horizontal axis N from 4 to 24.]
As one knows, if two distant observers (one usually calls them Alice and
Bob) share a pair of particles in a singlet state ψ− then they can send a
quantum state to one another by the use of only two additional classical bits.
This is called quantum teleportation [261].$^{16}$ If the classical communication
is free of charge (since it is much cheaper than communication of quantum
bits), one can say that a singlet pair is a resource equivalent to sending one
qubit. In the following, it will be shown that mixed-state entanglement can
also be a resource for quantum communication. Quantum communication
via mixed entangled states will require, apart from teleportation, an action
called distillation. It will be also shown that there exists a peculiar type of
entanglement (bound entanglement) that is a surprisingly weak resource.
$$\varrho \to \Lambda(\varrho) = p\varrho + (1-p)\,\frac{I}{2} , \tag{5.42}$$
where I/2 is the maximally mixed state of one qubit. This channel has been
thoroughly investigated [66, 266, 323, 324]. What is important here is that it
has been shown [324] that for p ≤ 2/3 the above method of error correction
does not work. In the classical domain, it would mean that the channel was
useless. Here, surprisingly, there is a trick that allows one to beat this limit,
even down to p = 1/3! The scheme that realizes this fact is quite mysterious.
In direct error correction we deal directly with the systems carrying the infor-
mation to be protected. Now, it appears that by using entanglement, one can
remove the results of the action of noise without even having the information
to be sent. Therefore, it can be called counterfactual error correction.
How does this work? The idea itself is not complicated. Instead of sending
the qubits of information, Alice (the sender) sends Bob particles from entan-
gled pairs (in the state ψ− ), keeping one particle from each pair. The pairs
are disturbed by the action of the channel, so that their state turns into a
mixture$^{18}$ that still possesses some residual entanglement. Now, it turns out
that by local quantum operations (including collective actions over all mem-
bers of the pairs in each lab) and classical communication (local operations
and classical communication, LOCC) between Alice and Bob, Alice and Bob
are able to obtain a smaller number of pairs in a nearly maximally entan-
gled state ψ− (see Fig. 5.3). Such a procedure, proposed in [266], is called
distillation. As in the case of direct error correction, one can achieve a finite
asymptotic rate k/n for the distilled pairs per input pair, and the fidelity,
which now denotes the similarity of the distilled pairs to a product of sin-
glet pairs, is asymptotically perfect. Now, the distilled pairs can be used for
teleportation of quantum information. The maximal possible rate achievable
within the above framework is called the entanglement of distillation of the
17
The capacity Q of a quantum channel is the greatest ratio k/n for reliable trans-
mission down the given channel.
18
If the channel is memoryless, the mixture factorizes into states of individual
pairs.
state $\varrho$, and is denoted by $D(\varrho)$. Thus, if Alice and Bob share n pairs each
in state $\varrho$, they can faithfully teleport $k = nD(\varrho)$ qubits.
As we have mentioned, the error correction stage and the transmission
stage are separated in time here; the error correction can be performed even
before the information to be sent was produced. Using the terminology of
[325], one can say that Alice and Bob operate on potentialities (an entangled
pair represents a potential communication) and correct the potential error,
so that when the actual information is coming, it can be teleported without
any additional action.
The above scheme is not only mysterious. It is also much more powerful
than the direct method. In the next section we describe a distillation protocol
that allows one to send quantum information reliably via a channel with
p > 1/3. A general question is: where are the limits of distillation? As we
have seen, the basic action refers to mixed bipartite states, so that instead of
talking about channels, we can concentrate on bipartite states. The question
can be formulated as follows: which states can be distilled by the most
general LOCC actions? Here, by saying that a state can be distilled, we
mean that Alice and Bob can obtain singlets from the initial state $\varrho^{\otimes n}$ of n
pairs (thus we shall work with memoryless channels).
One can easily see that separable states cannot be distilled: they contain
no entanglement, so it is impossible to convert them into entangled states by
LOCC operations. Then the final form of our question is: can all entangled
states be distilled? Before the answer to this question was provided, the
default was “yes”, and the problem was how to prove it. Now, we know that
the answer is “no”, so that the structure of the entanglement of bipartite
states is much more puzzling than one might have suspected.
Finally, we should mention that for pure states the problem of conversion
into singlet pairs has been solved. Here there is no surprise: all entangled pure
states can be distilled [326] (see also [327]). What is especially important
is that this distillation can be performed reversibly: from the singlet pairs
obtained, we can recover (asymptotically) the same number of input pairs
[326]. As we shall see, this is not the case for mixed states.
In this section, we shall describe what was historically the first distillation
protocol for two-qubit states, devised by Bennett, Brassard, Popescu, Schu-
macher, Smolin and Wootters (BBPSSW) [266]. Then we shall show that a
more general protocol can distill any entangled two-qubit state.$^{19}$
BBPSSW Distillation Protocol. The BBPSSW distillation protocol still
remains the most transparent example of distillation. It works for two-qubit
states with a fully entangled fraction (5.5) exceeding 1/2. Such states are
equivalent, up to a product unitary transformation, to those with singlet
fraction F > 1/2, so that we can restrict ourselves to the
latter states. Hence we assume that Alice and Bob initially share a huge
number of pairs, each in the same state $\varrho$ with F > 1/2, so that the total
state is $\varrho^{\otimes n}$. Now they aim to obtain a smaller number of pairs with a higher
singlet fraction F. To this end they iterate the following steps:
1. They take two pairs and apply U ⊗ U ∗ twirling to each of them, i.e.
a random unitary transformation of the form U ⊗ U ∗ (Alice picks at
random a transformation U , applies it, and communicates to Bob which
transformation she has chosen; then he applies U ∗ to his particle). Thus
one has a transformation from two copies of $\varrho$ to two copies of the isotropic
state $\varrho_F$ with an unchanged F:
$$\varrho \otimes \varrho \to \varrho_F \otimes \varrho_F . \tag{5.43}$$
(the first qubit is called the source, the second qubit the target). They
obtain some complicated state $\tilde\varrho$ of two pairs.
3. The pair of target qubits is measured locally in the basis $|0\rangle, |1\rangle$ and it is
discarded. If the results agree (success), the source pair is kept and has a
greater singlet fraction. Otherwise (failure), the source pair is discarded
too.
If the results in step 3 agree, the final state of the source pair kept can be
calculated from the formula
Fig. 5.4. Bilateral quantum XOR operation [Diagram: Alice and Bob each apply an
XOR gate between their particle of the source pair and their particle of the
target pair.]
where the partial trace is performed over the Hilbert space $\mathcal{H}^{(t)}$ of the target
pair, $I_s$ is the identity on the space of the source pair (because it was
not measured), and $P_t = |00\rangle\langle 00| + |11\rangle\langle 11|$ acts on target pair space and
corresponds to the case “results agree”.
Subsequently, one can calculate the singlet fraction of the surviving pair
as a function of the singlet fraction of the two initial pairs, obtaining
$$F'(F) = \frac{F^2 + \frac{1}{9}(1-F)^2}{F^2 + \frac{2}{3}F(1-F) + \frac{5}{9}(1-F)^2} . \tag{5.46}$$
Since the function $F'(F)$ is continuous, $F'(F) > F$ for F > 1/2 and $F'(1) = 1$,
we obtain the result that by iterating the procedure, Alice and Bob can
obtain a state with arbitrarily high F. Of course, the larger F is required
to be, the more pairs must be sacrificed, and the lower the probability p of
success. Thus if Alice and Bob start with some $F_{in}$ and would like to
end up with some higher $F_{out}$, the number of final pairs will be on average
$k = np/2^l$, where l and p depend on $F_{in}$ and $F_{out}$, and denote the number
of iterations of the function $F'(F)$ required to reach $F_{out}$ starting from $F_{in}$,
and the probability of a string of l successful operations, respectively.
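The behaviour of the recurrence (5.46) can be explored numerically. The following is a minimal sketch and not the authors' code: the function names, the floating-point iteration, and the `max_steps` guard are choices made for this illustration. It evaluates one step of the map and counts the iterations $l$ needed to pass from a given $F_{in}$ to a target $F_{out} < 1$.

```python
from fractions import Fraction

def recurrence_step(F):
    """Singlet fraction of the kept source pair after one round, Eq. (5.46)."""
    G = 1 - F
    numerator = F * F + Fraction(1, 9) * G * G
    denominator = F * F + Fraction(2, 3) * F * G + Fraction(5, 9) * G * G
    return numerator / denominator

def iterations_to_reach(F_in, F_out, max_steps=100):
    """Iterate (5.46) in floating point until the singlet fraction is at least F_out.

    Returns the pair (l, F_final). Requires 1/2 < F_in and F_out < 1; the
    max_steps guard is a safety limit added for this sketch.
    """
    if not (0.5 < F_in and F_out < 1):
        raise ValueError("need F_in > 1/2 and F_out < 1")
    F, steps = float(F_in), 0
    while F < F_out and steps < max_steps:
        F = recurrence_step(F)
        steps += 1
    return steps, F

# Exact single step starting from F = 3/4.
after_one_round = recurrence_step(Fraction(3, 4))   # equals 41/52
rounds, reached = iterations_to_reach(0.75, 0.9)
```

Since each round consumes two pairs to produce at most one, `rounds` plays the role of $l$ in the estimate $k = np/2^l$; the success probability $p$ is not computed here. Note that $F = 1/2$ is a fixed point of the map, which is why the iteration only makes progress for $F > 1/2$.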
The above method allows one to obtain an arbitrarily high F , but the
asymptotic rate is zero. However, if F is high enough so that 1 − S > 0, where
S is the von Neumann entropy of the state $\varrho$, then there exists a protocol
(called hashing) that gives a nonzero rate [66]. We shall not describe this
protocol here, but we note that for any state with F > 1/2 Alice and Bob
can start by using the recurrence method to obtain 1 − S > 0, and then
apply the hashing protocol. This gives a nonzero rate for any state with
F > 1/2. This means that quantum information can be transmitted via a
depolarizing channel (5.42) provided that p > 1/3. Indeed, one can check that if
Alice sends one of the particles from a pair in a state $\psi_+$ via the channel
to Bob, then the final state shared by them will be an isotropic one with
F > 1/2. By repeating this process, Alice and Bob can obtain many such
pairs. Then distillation will allow them to use the pairs for asymptotically
faithful quantum communication.
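The threshold $p > 1/3$ follows from a one-line calculation, sketched here under the assumption that the channel (5.42) acts on one particle of a pair prepared in $\psi_+$. In that case the shared state is $pP_+ + (1-p)\,I/4$, whose singlet fraction is $F = p + (1-p)/4$. The function names are hypothetical.

```python
from fractions import Fraction

def singlet_fraction_after_channel(p):
    """Singlet fraction of (I x Lambda)(P+) for the depolarizing channel (5.42)."""
    return p + (1 - p) * Fraction(1, 4)

def exceeds_recurrence_threshold(p):
    """True if the resulting pair has F > 1/2, the condition used by the recurrence method."""
    return singlet_fraction_after_channel(p) > Fraction(1, 2)

# Channel parameters around the threshold p = 1/3.
at_threshold = singlet_fraction_after_channel(Fraction(1, 3))   # exactly 1/2
just_above = exceeds_recurrence_threshold(Fraction(2, 5))
```

Solving $(1 + 3p)/4 > 1/2$ gives $p > 1/3$, below the value $p = 2/3$ quoted in the text as the limit of direct error correction.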
All Entangled Two-Qubit States are Distillable. As was mentioned in
Sect. 5.2.4, there exist entangled two-qubit states with F < 1/2, so that no
product unitary transformation can produce F > 1/2. Thus the BBPSSW
protocol cannot be applied to all entangled two-qubit states. We shall show
below that, nevertheless, all such states are distillable [303]. It was possible
to solve the problem mainly because of the characterization of the entangled
states as discussed in Sect. 5.2.3.
Since we are not interested in the value of the asymptotic rate, it suf-
fices to show that by starting with pairs in an entangled state, Alice and
Bob are able to obtain a fraction of them in a new state with F > 1/2 (and
then the BBPSSW protocol will do the job). Our main tool will be the so-
called filtering operation [326, 330], which involves generalized measurement
performed by one of the parties (say, Alice) on individual pairs. This mea-
surement consists of two outcomes {1, 2}, associated with operators W1 and
W2 satisfying
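$$\langle\psi|\varrho^{T_B}|\psi\rangle < 0 . \tag{5.49}$$
$$\tilde\varrho = \frac{A_\psi^\dagger \otimes I \,\varrho\, A_\psi \otimes I}{\mathrm{Tr}(A_\psi^\dagger \otimes I \,\varrho\, A_\psi \otimes I)} .$$
Now it is clear that the role of the filter W will be played by the operator $A_\psi^\dagger$.
We shall show that $\langle\psi_-|\tilde\varrho|\psi_-\rangle > 1/2$, where $\psi_- = (|01\rangle - |10\rangle)/\sqrt{2}$. Then a
suitable unitary transformation by Alice can convert $\tilde\varrho$ into a state with
F > 1/2.
From the inequality (5.51), we obtain
If we use the product basis $|1\rangle = |00\rangle$, $|2\rangle = |01\rangle$, $|3\rangle = |10\rangle$, $|4\rangle = |11\rangle$, the
inequality (5.52) can be written as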
5.3.3 Examples
$\langle\psi_-|\tilde\varrho|\psi_-\rangle = (p^3/8 + \lambda^2 p/2 - \lambda p^2/2)/N$, is greater than 1/2 only if p > 0.
The new state can be distilled by the BBPSSW protocol.
Below we shall prove that some states of higher-dimensional systems are
distillable. We shall do this by showing that some LOCC operation can con-
vert them (possibly with some probability) into an entangled two-qubit state.
Distillation of Isotropic State for d ⊗ d System. For F > 1/d, an
isotropic state can be distilled [281, 313]. If both Alice and Bob apply the
projector $P = |0\rangle\langle 0| + |1\rangle\langle 1|$, where $|0\rangle, |1\rangle$ are vectors from the local basis,
then the isotropic state will be converted into a two-qubit isotropic state.
(Note that the projectors play the role of filters; also, the filtering is successful
if both Alice and Bob obtain outcomes corresponding to P .) Now, if the initial
state satisfied F > 1/d then the final state, as a two-qubit state, will have
F > 1/2. Thus it is entangled and hence can be distilled.
Distillation and Reduction Criterion. Any state $\varrho$ of a d ⊗ d system
that violates the reduction criterion (see Sect. 5.2.4) can be distilled [281].
Indeed, take a vector $\psi$ for which $\langle\psi|\varrho_A \otimes I - \varrho|\psi\rangle < 0$. It is easy to see that
by applying the filter W given by $\psi = W \otimes I\,\psi_+$, one obtains a state with
F > 1/d. Now, the random U ⊗ U ∗ transformations will convert it into an
isotropic state with the same F . As shown above, the latter state is distillable.
In the light of the result for two qubits, it was naturally expected that any
entangled state could be distilled. It was a great surprise when it became
apparent that it was not the case. In [71] it was shown that there exist
entangled states that cannot be distilled. The following theorem provides a
necessary and sufficient condition for the distillability of a mixed state [71].
Theorem 5.4. A state $\varrho$ is distillable if and only if, for some two-dimensional
projectors P, Q and for some number n, the state $P \otimes Q\,\varrho^{\otimes n}\,P \otimes Q$ is
entangled.
Remarks. (1) Note that the state $P \otimes Q\,\varrho^{\otimes n}\,P \otimes Q$ is effectively a two-qubit
one as its support is contained in the $\mathbb{C}^2 \otimes \mathbb{C}^2$ subspace determined by the
Thus $(\varrho')^{T_B}$ is a result of the action of some completely positive map on an
operator $\varrho^{T_B}$ that by assumption is positive. Then also the operator $(\varrho')^{T_B}$
must be positive. Thus a LOCC map does not move outside the set of PPT
states.
To prove (ii), let us now show that PPT states can never have a high
singlet fraction F. Consider a PPT state ϱ of a d ⊗ d system. We obtain
(Recall that R(ϱ) denotes the rank of ϱ.) Note that the above inequality
is nothing but the entropic inequality (5.10) with the entropy (5.11). Thus
it appears that the latter inequality is a necessary condition not only for
separability, but also for nondistillability. The proof is based on the fact [281]
that any state violating a reduction criterion (see Sects. 5.2.4 and 5.3.3) can
be distilled. It can be shown that, if a state violates the above inequality, then
it must also violate a reduction criterion, and hence can be distilled. Then
it follows that there does not exist any BE state of rank two [282]. Indeed,
if such a state existed, then its local ranks could not exceed two. Hence the
total state would be effectively a two-qubit state. However, from Sect. 19 we
know that two-qubit bound entangled states do not exist.
out to be invalid: it was based on a theorem [311] on the additivity of the relative
entropy of Werner states. However, an explicit counterexample to this theorem
was provided in [337].
So far we have considered BE states that arise from Theorem 5.5, which says
that the NPT condition is necessary for distillability. As mentioned in Sect.
19, for 2 ⊗ n systems all NPT states can be distilled [331], and hence the
condition is also sufficient in this case. However, it is not known whether
it is sufficient in general. A necessary and sufficient condition is given by
Theorem 5.4. To find if this condition is equivalent to the PPT one, it must
be determined whether there exists an NPT state ϱ such that, for any number
of copies n, the state ϱ^{⊗n} will not have an entangled two-qubit “substate”
(i.e. the state P ⊗ Q ϱ^{⊗n} P ⊗ Q). In [281] it was pointed out that one can
reduce the problem by means of the following observation.
Proposition 5.1. The following statements are equivalent:
1. Any NPT state is distillable.
2. Any entangled Werner state (5.30) is distillable.
Proof. The proof of the implication (1) ⇒ (2) is immediate, as Werner states
are entangled if and only if they are NPT. If we can distill any NPT state,
then also Werner entangled states are distillable. To obtain (2) ⇒ (1), note
that the reasoning of Sect. 19, from (5.49) to (5.52), is insensitive to the
dimension d of the problem. Consequently, from any NPT state, a suitable
filtering produces a state ϱ̃ satisfying Tr ϱ̃V < 0. As mentioned in Sect. 5.2.4,
the parameter Tr ϱV is invariant under U ⊗ U twirling, so that by applying
the latter (which is an LOCC operation), Alice and Bob obtain a Werner
state ϱ_W satisfying Tr ϱ_W V < 0. Thus any NPT state can be converted by
means of LOCC operations into an entangled Werner state, which completes
the proof.
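The link between the sign of Tr ϱV and the NPT property of Werner states can be illustrated numerically. The sketch below assumes a parametrization of Werner states by a weight p on the symmetric subspace and 1 − p on the antisymmetric one (an illustrative choice; the form (5.30) used in the text may be parametrized differently), with V the flip operator V|i⟩|j⟩ = |j⟩|i⟩.

```python
import numpy as np

def swap_operator(d):
    """Flip operator V with V |i>|j> = |j>|i> on C^d (x) C^d."""
    V = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            V[j * d + i, i * d + j] = 1.0
    return V

def werner_state(d, p):
    """Assumed form: weight p on the symmetric, 1 - p on the antisymmetric subspace."""
    V = swap_operator(d)
    identity = np.eye(d * d)
    P_sym, P_anti = (identity + V) / 2, (identity - V) / 2
    return p * P_sym / np.trace(P_sym) + (1 - p) * P_anti / np.trace(P_anti)

def partial_transpose(rho, d):
    """Transpose Bob's indices of a state on C^d (x) C^d."""
    return rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

def flip_expectation(rho, d):
    """Tr(rho V), the parameter preserved by U (x) U twirling."""
    return np.trace(rho @ swap_operator(d)).real

def min_pt_eigenvalue(rho, d):
    """Smallest eigenvalue of the partial transpose; negative means NPT."""
    return np.linalg.eigvalsh(partial_transpose(rho, d)).min()

if __name__ == "__main__":
    d = 3
    for p in (0.3, 0.7):
        rho = werner_state(d, p)
        print(p, flip_expectation(rho, d), min_pt_eigenvalue(rho, d))
```

In this parametrization Tr ϱV = 2p − 1, and the smallest eigenvalue of the partial transpose equals (2p − 1)/d whenever that quantity is negative, so the two criteria change sign together.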
The above proposition implies that to determine whether there exist NPT
bound entangled states, one can restrict oneself to the family of Werner states,
which is a one-parameter family of very high symmetry. Even after such a
reduction of the problem, the latter remains extremely difficult. In [310, 338]
the authors examine the nth tensor power of Werner states (in [338] a larger,
two-parameter family is considered). The results, though not conclusive yet,
strongly suggest that there exist NPT bound entangled states (see Fig. 5.5).
Thus it is likely that the characterization of distillable states is not as simple
as reduction to an NPT condition. The possible existence of NPT bound
entanglement would make the total picture much more obscure (and hence
much more interesting). Among others, there would arise the following question:
for two distinct BE states ϱ_1 and ϱ_2, is the state ϱ_1 ⊗ ϱ_2 also BE? (If
BE were equivalent to PPT, this question would have the immediate answer
“yes”, because the PPT property is additive, i.e. if two states are PPT, then
so is their tensor product [284].) Recently, a negative answer to this question
was obtained in [339] in the case of a multipartite system. For bipartite states
the answer is still unknown.
[Figure: panels (a) and (b), each divided into a PPT and an NPT region containing the separable states and the free entangled states.]
Fig. 5.5. Entanglement and distillability of mixed states for 2 ⊗ 2 and 2 ⊗ 3 systems
(a) and for higher dimensions (b). The area filled with diagonal lines denotes the
hypothetical set of bound entangled NPT states
5.3.6 Example
Consider the family of states (5.37) introduced in Sect. 5.2.4. One obtains the
following classification: the state is
• separable for 2 ≤ α ≤ 3
• bound entangled (BE) for 3 < α ≤ 4
• free entangled (FE) for 4 < α ≤ 5 .
The separability was shown in Sect. 5.2.4. It was also shown there that, for
3 < α ≤ 4, the state is entangled and PPT; we therefore conclude that it is
BE. For α > 4, Alice and Bob can apply the local projectors P = |0⟩⟨0| + |1⟩⟨1|,
obtaining an entangled two-qubit state. Hence the initial state is FE in this
region of α.
    |ψ_{A'}⟩⟨ψ_{A'}| ⊗ ϱ_{AB} ,
where ψ_{A'} is the state to be teleported (unknown to Alice and Bob). Now Alice
and Bob perform some trace-preserving LOCC operation (trace-preserving,
because teleportation is an operation that must be performed with probability 1).
The form of the operation depends on the state ϱ_{AB} that is known to
Alice and Bob, but is independent of the input state ψ_{A'} because that state
is unknown. Now the total system is in a new, perhaps very complicated
state ϱ_{A'AB}. The transmitted state is given by Tr_{A'A}(ϱ_{A'AB}). The overall
transmission stages are the following:
    ψ_{A'} → |ψ_{A'}⟩⟨ψ_{A'}| ⊗ ϱ_{AB} → Λ(|ψ_{A'}⟩⟨ψ_{A'}| ⊗ ϱ_{AB}) → Tr_{A'A} ϱ_{A'AB} = ϱ_B .
    f = \overline{⟨ψ_{A'}|ϱ_B|ψ_{A'}⟩} ,
where the average (indicated by the overline) is taken over a uniform
distribution of the input states ψ_{A'}.²⁴ In the original teleportation scheme
(where ϱ_{AB} is a maximally entangled state), the state ϱ_B is exactly equal to
the input state, so that f = 1.
If Alice and Bob share a pair in a separable state (or, equivalently, share no
pair), then the best they can do is the following: Alice measures the state and
sends the results to Bob [264]. Since it is impossible to determine the
state when one has only a single system in that state [345] (this would
contradict the no-cloning theorem [346]; see Sect. 1), the performance of such
a process will be very poor. One can check that the best possible fidelity is
f = 2/(d + 1). If the shared pair is entangled but is not a pure maximally
entangled state, we shall obtain some intermediate value of f.
Optimal Teleportation. Having defined the general teleportation scheme,
one can ask about the maximal fidelity that can be achieved for a given state
ϱ_{AB} within the scheme. Thus, for a given ϱ_{AB} we must maximize f over all
possible trace-preserving LOCC operations. The problem is, in general, extremely
difficult. However, the high symmetry of the chosen fidelity function
allows one to reduce it in the following way. It has been shown [343] that the
best Alice and Bob can do is the following. They first perform some LOCC
action that aims at increasing F(ϱ_{AB}) as much as possible. Then they perform
the standard teleportation scheme via the new state ϱ'_{AB} (just as if it
were the state P_+). The fidelity obtained is given by
    f_max = (F_max d + 1)/(d + 1) , (5.64)
where F_max = F(ϱ'_{AB}) is the maximal F that can be obtained by trace-preserving
LOCC actions if the initial state is ϱ_{AB}.
²⁴ Note that the fidelity so defined is not a unique criterion of the performance of
teleportation. For example, one can consider a restricted input: Alice receives one
of two nonorthogonal vectors with some probabilities [344]. Then the formula for
the fidelity would be different. In general, the fidelity is determined by a chosen
distribution over input states.
Teleportation Via Bound Entangled States. According to (5.64), to
check the performance of teleportation via BE states of a d ⊗ d system, we
should find the maximal F attainable from BE states via trace-preserving
LOCC actions. As was argued in Sect. 5.3.4, a BE state subjected to any
LOCC operation remains BE. Moreover, the singlet fraction F of a BE state
of a d⊗d system satisfies F ≤ 1/d (because states with F > 1/d are distillable,
as shown in Sect. 5.3.3). We conclude that, if the initial state is BE, then
the highest F achievable by any (not only trace-preserving) LOCC actions
is F = 1/d. However, as we have argued, this gives a fidelity f = 2/(d + 1),
which can be achieved without entanglement. Thus the BE states behave
here like separable states – their entanglement does not manifest itself.
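As a quick arithmetic check of this argument, the following sketch evaluates formula (5.64) for a given singlet fraction; the function name is a hypothetical helper introduced only for illustration.

```python
def teleportation_fidelity(F, d):
    """Fidelity f = (F d + 1) / (d + 1) from (5.64) for singlet fraction F."""
    return (F * d + 1) / (d + 1)

if __name__ == "__main__":
    for d in (2, 3, 4):
        # F = 1/d is the largest singlet fraction reachable from a BE state,
        # F = 1 corresponds to a maximally entangled pair.
        print(d, teleportation_fidelity(1 / d, d), teleportation_fidelity(1.0, d))
```

Setting F = 1/d reproduces the value 2/(d + 1) attainable without entanglement, while F = 1 gives perfect fidelity.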
Activation of Bound Entanglement. Here we shall show that bound
entanglement can produce a nonclassical effect, even though the effect is a
very subtle one. This effect is the so-called activation of bound entanglement
[340]. The underlying concept originates from a formal entanglement–energy
analogy developed in [71, 269, 325, 335, 347]. One can imagine that the bound
entanglement is like the energy of a system confined in a shallow potential
well. Then, as in the process of chemical activation, if we add a small amount
of extra energy to the system, its energy can be liberated.
In our case, the role of the system is played by a large number of bound
entangled pairs, while that of the extra energy is played by a single pair
that is free entangled. More specifically, we shall show that a process called
conclusive teleportation [348] can be performed with arbitrarily high fidelity
if Alice and Bob can perform joint operations over the BE pairs and the
FE pair. We shall argue that it is impossible if either of the two elements is
lacking.
Conclusive Teleportation. Suppose that Alice and Bob have a pair in a
state for which the optimal teleportation fidelity is f0 . Suppose, further, that
the fidelity is too poor for some of Alice and Bob’s purposes. What they can
do to change the situation is to perform a so-called conclusive teleportation.
Namely, they can perform some LOCC operation with two final outcomes 0
and 1. If they obtain the outcome 0, they fail and decide to discard the pair.
If the outcome is 1 they perform teleportation, and the fidelity is now better
than the initial f0 . Of course, the price they must pay is that the probability
of success (outcome 1) may be small. The scheme is illustrated in Fig. 5.6.
[Figure: Alice and Bob apply an LOCC operation to prepare a strongly entangled pair; the attempt fails with probability 1 − p and succeeds with probability p, after which the state φ is teleported using LOCC.]
Fig. 5.6. Conclusive teleportation. Starting with a weakly entangled pair, Alice
and Bob prepare with probability p a strongly entangled pair and then perform
teleportation
A simple example is the following. Suppose that Alice and Bob share a
pair in a pure state |ψ⟩ = a|00⟩ + b|11⟩ which is nearly a product state (e.g. a
is close to 1). Then the standard teleportation scheme provides a rather poor
fidelity f = 2(1 + ab)/3 [341, 349]. However, Alice can subject her particle to
a filtering procedure [326, 330] described by the operation
    Λ = W(·)W† + V(·)V† , (5.65)
where W = diag(b, a), V = diag(a, b). Here the outcome 1 (success) corresponds
to the operator W. Indeed, if this outcome is obtained, the state
collapses to the singlet state
    ψ̃ = (W ⊗ I ψ)/||W ⊗ I ψ|| = (1/√2)(|00⟩ + |11⟩) . (5.66)
Then, in this case, perfect teleportation can be performed. Thus, if Alice
and Bob teleported directly via the initial state, they would obtain a very
poor performance. Now they have a small but nonzero chance of performing
perfect teleportation.
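The filtering example can be reproduced in a few lines. This is a minimal sketch assuming real, nonnegative amplitudes with a² + b² = 1; the function names are illustrative. It applies W ⊗ I to the state, returns the normalized post-selected state and the success probability, and evaluates the fidelity of direct teleportation quoted above for comparison.

```python
import numpy as np

def filter_pure_state(a):
    """Apply W (x) I, W = diag(b, a), to a|00> + b|11> (real a, b assumed).

    Returns the normalized post-filter state and the success probability.
    """
    b = np.sqrt(1 - a**2)
    psi = np.zeros(4)
    psi[0], psi[3] = a, b                 # components along |00> and |11>
    W = np.diag([b, a])
    out = np.kron(W, np.eye(2)) @ psi     # unnormalized state after outcome 1
    p_success = float(out @ out)
    return out / np.sqrt(p_success), p_success

def standard_fidelity(a):
    """Fidelity f = 2(1 + ab)/3 of direct teleportation via a|00> + b|11>."""
    b = np.sqrt(1 - a**2)
    return 2 * (1 + a * b) / 3

if __name__ == "__main__":
    state, p = filter_pure_state(0.99)
    print(state, p, standard_fidelity(0.99))
```

Under these assumptions the success probability comes out as 2a²b², which makes the trade-off explicit: the closer the initial state is to a product state, the rarer the successful outcome.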
[Figure: Alice and Bob apply an LOCC operation to a shared pair; the attempt fails with probability 1 − p and succeeds with probability p.]
Fig. 5.7. Conclusive increase of the singlet fraction. Alice and Bob obtain, with
a probability p of success, a state with a higher singlet fraction than that of the
initial state
where σ± are separable states given by (5.38). It is easy to see that the
state (5.67) is free entangled. Namely, after action of the local projections
(|0⟩⟨0| + |1⟩⟨1|) ⊗ (|0⟩⟨0| + |1⟩⟨1|), we obtain an entangled 2 ⊗ 2 state (its
After some algebra, one can see that the success in step (ii) occurs with
a nonzero probability
    P_{F→F'} = [2F + (1 − F)(5 − α)]/7 , (5.70)
²⁵ Here we need the quantum XOR gate not for two qubits, as in Sect. 19, but for
two qutrits (three-level systems). A general XOR operation for a d ⊗ d system,
which was used in [281, 351], is defined as
where the initial states |a⟩ and |b⟩ correspond to the source and target states,
respectively.
Fig. 5.8. Liberation of bound entanglement. The singlet fraction of the FE state is
plotted versus the number of successful iterations of (i) and (ii), and the parameter
α of the state ϱ_α of the BE pairs used. The initial singlet fraction of the FE pair is
taken as F_in = 0.3. (This figure is reproduced from Phys. Rev. Lett. 82, 1056 (1999)
by permission of the authors.)
    F'(F) = 2F/[2F + (1 − F)(5 − α)] . (5.71)
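The recursion can be iterated numerically to see how the singlet fraction evolves over successful rounds. The sketch below assumes that (5.70) and (5.71) have the forms P = [2F + (1 − F)(5 − α)]/7 and F' = 2F/[2F + (1 − F)(5 − α)]; the function names are illustrative.

```python
def next_singlet_fraction(F, alpha):
    """Singlet fraction after one successful round, following (5.71)."""
    return 2 * F / (2 * F + (1 - F) * (5 - alpha))

def success_probability(F, alpha):
    """Probability that step (ii) succeeds, following (5.70)."""
    return (2 * F + (1 - F) * (5 - alpha)) / 7

def iterate(F_in, alpha, steps):
    """Singlet fractions F_in, F', F'', ... over a run of successful rounds."""
    history = [F_in]
    for _ in range(steps):
        history.append(next_singlet_fraction(history[-1], alpha))
    return history

if __name__ == "__main__":
    # F_in = 0.3 as in Fig. 5.8; alpha in the bound entangled range 3 < alpha <= 4.
    for alpha in (3.2, 3.6, 4.0):
        print(alpha, [round(F, 4) for F in iterate(0.3, alpha, 10)])
```

A short calculation with these formulas shows that F' > F exactly when (1 − F)(α − 3) > 0, so for 0 < F < 1 the singlet fraction grows in every successful round if α > 3 and stays constant at α = 3.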
see that it has helped to obtain a strong upper bound for the entanglement
of distillation D (recall that the latter has the meaning of the capacity of a
noisy teleportation channel constituted by bipartite mixed states, and hence
is a central parameter of quantum communication theory).
The first upper bound for D was the entanglement of formation [66],
calculated explicitly for two-qubit states [301]. However, a stronger bound
has been provided in [269] (see also [334]). It is given by the following measure
of entanglement [268, 269] based on the relative entropy:
where the infimum is taken over all separable states σ. The relative entropy
is defined by
Vedral and Plenio provided a complicated argument [269] showing that E_VP is
an upper bound for D(ϱ), under the additional assumption that it is additive.
Even though we still do not know if it is indeed additive, Rains showed [311]
that it is a bound for D even without this assumption. He also obtained a
stronger bound by use of BE states (more precisely, PPT states). It appears
that, if the infimum in (5.72) is taken over PPT states (which are either
separable or bound entangled), the new measure E_R is a bound for the
distillable entanglement, too. However, since the set of PPT states is strictly
larger than the set of separable states, the bound is stronger. For example,
the entangled PPT states have zero distillable entanglement. Since they are not
separable, E_VP does not vanish for them, and hence the evaluation of D by means
of E_VP is too rough. The Rains measure vanishes for these states.
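To make the quantity being minimized concrete, here is a minimal sketch of the relative entropy between two density matrices. It assumes the standard form S(ϱ|σ) = Tr ϱ(log ϱ − log σ) with base-2 logarithms (the displayed definition is not reproduced here, so both the form and the base are assumptions), and it evaluates a single candidate σ rather than performing the infimum.

```python
import numpy as np

def relative_entropy(rho, sigma, base=2.0, tol=1e-12):
    """S(rho|sigma) = Tr rho (log rho - log sigma); inf if supp(rho) is not in supp(sigma)."""
    r_vals = np.linalg.eigvalsh(rho)
    s_vals, s_vecs = np.linalg.eigh(sigma)
    # Tr rho log rho, with the convention 0 log 0 = 0
    term_rho = sum(r * np.log(r) for r in r_vals if r > tol)
    # <s_j| rho |s_j> for every eigenvector s_j of sigma
    overlaps = np.real(np.einsum('ij,ik,kj->j', s_vecs.conj(), rho, s_vecs))
    term_sigma = 0.0
    for s, w in zip(s_vals, overlaps):
        if w > tol:
            if s <= tol:
                return np.inf      # rho has weight outside the support of sigma
            term_sigma += w * np.log(s)
    return (term_rho - term_sigma) / np.log(base)

if __name__ == "__main__":
    bell = np.zeros(4)
    bell[0] = bell[3] = 1 / np.sqrt(2)
    rho = np.outer(bell, bell)                 # maximally entangled two-qubit state
    sigma = np.diag([0.5, 0.0, 0.0, 0.5])      # separable mixture of |00> and |11>
    print(relative_entropy(rho, sigma))
    print(relative_entropy(rho, np.eye(4) / 4))
```

Any particular separable σ gives only an upper estimate of the infimum; computing E_VP itself requires an optimization over all separable states, which this sketch does not attempt.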
We will not provide here the original proof of the Rains result. Instead
we demonstrate a general theorem on bounds for distillable entanglement
obtained in [357], which allows a major simplification of the proof of the
result.
31
Classical messages can be sent only from Alice to Bob during distillation.