Quantum Probability
computation
Greg Kuperberg
UC Davis
(Dated: October 8, 2007)
Electronic address: [email protected]

Quantum mechanics is one of the most surprising sides of modern physics. Its basic precepts require only undergraduate or early graduate mathematics; but because quantum mechanics is surprising, it is more difficult than these prerequisites suggest. Moreover, the rigorous and clear rules of quantum mechanics are sometimes confused with the more difficult and less rigorous rules of quantum field theory.

Many working mathematicians have an excellent intuitive grasp of two parent theories of quantum mechanics, namely classical mechanics and probability theory. The empirical interpretations of both of these theories, above and beyond their mathematical formalism, have been a great source of ideas in mathematics, even for many questions that have nothing to do with physics or practical statistics. For example, the probabilistic method of Erdős and others [?] is a fundamental method in combinatorics to show the existence of combinatorial objects. In principle, the precepts of quantum mechanics could be similarly influential; there could easily be one or more kinds of quantum probabilistic method.

But in practice the precepts of quantum mechanics are not very familiar to most mathematicians. Two subdisciplines of mathematics that have assimilated these precepts are mathematical physics and operator algebras. However, much of the intention of mathematical physics is the converse of our purpose, to apply mathematics to problems in physics. The theory of operator algebras is close to the spirit of this article; in this theory what we call quantum probability is often called non-commutative probability.

Recently quantum computation has entered as a new reason for both mathematicians and computer scientists to learn the precepts of quantum mechanics. Just as randomized algorithms can be moderately faster than deterministic algorithms for some computational problems, quantum algorithms can be moderately faster or sometimes much faster than their classical and randomized alternatives. Quantum algorithms can only run on a new kind of computer called a quantum computer. As of this writing, convincing quantum computers do not exist. Nonetheless, theoretical results suggest that quantum computers are possible rather than impossible. Entirely apart from technological implications, quantum computation is a beautiful subject that combines mathematics, physics, and computer science.

This article is an introduction to quantum probability theory, quantum mechanics, and quantum computation for the mathematically prepared reader. Chapters ?? and ?? depend on Section 1 but not on each other, so the reader who is interested in quantum computation can go directly from Chapter 1 to Chapter ??.

This article owes a great debt to the textbook on quantum computation by Nielsen and Chuang [4], and to the Feynman Lectures, Vol. III [2]. Another good textbook written for physics students is by Sakurai [5].

Exercises

These exercises are meant to illustrate how empirical interpretations can lead to solutions of problems in pure mathematics.

1. The probabilistic method: The Ramsey number $R(n)$ is defined as the least $R$ such that if a simple graph has $R$ vertices, then either it or its complement must have a complete subgraph with $n$ vertices. By considering random graphs, show that
$$R(n) \ge \frac{2^{(n-1)/2}}{(2(n!))^{1/n}}.$$
(The proof can be described as a counting argument. However, a solution phrased in terms of probabilistic existence is more in the spirit of these notes.)

2. Angular momentum: Let $S$ be a smooth surface of revolution about the $z$-axis in $\mathbb{R}^3$, and let $\vec{p}(t)$ be a geodesic arc on $S$, parameterized by length, that begins at the point $(1, 0, 0)$ at $t = 0$. Show that $\vec{p}(t)$ never reaches any point within $1/|p_y(0)|$ of the vertical axis.

3. Kirchhoff's laws: Suppose that a unit square is tiled by finitely many smaller squares. Show
that the edge lengths are uniquely determined by the combinatorial structure of the tiling, and that they are rational. (Hint: Build the unit square out of material with unit resistivity with a battery connected to the top and bottom edges. Cut slits along the vertical edges of the tiles and affix zero-resistance wires to the horizontal edges. Each square becomes a unit resistor in an electrical network.)

1. QUANTUM PROBABILITY

The precepts of quantum mechanics are neither a set of physical forces nor a geometric model for physical objects. Rather, they are a generalization of classical probability theory that modifies the effects of physical forces. If you have firmly accepted classical probability, it is tempting to suppose that quantum mechanics is a set of probabilistic objects, in effect a special case of probability rather than a generalization. But this is not true in any reasonable sense; quantum probability violates certain inequalities that hold in classical probability (Section ??). It is also tempting to view quantum mechanics as a deterministic dynamical system that produces classical probabilities and is otherwise hidden. This interpretation is not reasonable either.

In physics courses, quantum mechanics is usually defined in terms of operators acting on Hilbert spaces. A state of a system is a vector of its Hilbert space, the vector evolves by unitary operators, the vector is measured by Hermitian operators, and the measured values have probability distributions.

Although we will discuss the vector-state model, we will emphasize the non-commutative probability model from operator algebras. In this model, a system can be fully quantum, or fully classical, or something in between. The fully quantum case corresponds to the vector-state model, but even in this case, the general state is described by an operator rather than a vector. The states that can be described by vectors are called pure; the others are mixed states.

The vector-state model of quantum mechanics was originally known as matrix mechanics and is due to Heisenberg. The historical alternative is Schrödinger's wave mechanics. Wave mechanics is best understood as a special case of matrix mechanics, and we will describe it this way. The probabilistic interpretation of quantum mechanics is due to Max Born and is known as the Copenhagen interpretation (Section ??).

Since classical probability is a major analogy for us, it is reviewed in Section ??. The point is that a classical probabilistic system (or measurable space) is an algebra of random variables that satisfies relevant axioms. One of the restrictions on the algebra is commutativity: If $x$ and $y$ are two real- or complex-valued random variables, then $xy$ and $yx$ are the same random variable. In quantum probability, this commutative algebra is replaced by a non-commutative algebra called a von Neumann algebra. The remaining definitions stay as much the same as possible.

We will mostly consider finite-dimensional quantum systems. These are enough to show most of the basic ideas of quantum probability, just as finite or combinatorial probability is enough to show most of the basic ideas of classical probability. Infinite-dimensional quantum systems are discussed in Section ??.

To summarize, quantum probability is the most natural non-commutative generalization of classical probability. In this author's opinion, this description does the most to demystify quantum probability and quantum mechanics.

1.1. Quantum superpositions

We will begin by discussing part of the pure-state model of quantum mechanics in order to show the inadequacy of classical probability.

A pure state of a quantum mechanical system can be described as a vector of a complex vector space $\mathcal{H}$. If the system is finite, then we can say that the vector space is $\mathbb{C}^n$. It will be convenient to label the basis of this vector space by an arbitrary finite set $A$ rather than by the numbers from 1 to $n$; we can then denote the vector space $\mathbb{C}^A$. The general state space $\mathcal{H}$ is not just a vector space but a Hilbert space, meaning that it has a positive-definite Hermitian inner product $\langle\cdot|\cdot\rangle$. When $\mathcal{H}$ is $\mathbb{C}^n$ or $\mathbb{C}^A$, then it has the standard inner product
$$\langle\psi|\phi\rangle = \sum_{a \in A} \overline{\psi_a}\,\phi_a.$$
In quantum theory, the traditional notation is $|\psi\rangle$ (a "ket") for a vector $\psi$ and $\langle\psi|$ (a "bra") for the corresponding dual vector
$$\langle\psi| = \psi^* = \langle\psi|\cdot\rangle.$$
This notation is due to Dirac [1] and is called bra-ket notation. Recall also that a linear map from a Hilbert space to itself is called an operator.

In finite quantum mechanics, as in classical probability, we can define a physical object by specifying a finite set $A$ of independent configurations. In information theory (both quantum and classical), the object is often called "Alice". Classically, the set of all normalized states of Alice is the simplex $\Delta_A$ spanned
by $A$ in the vector space $\mathbb{R}^A$ (see Section ??). I.e., a general state has the form
$$\rho = \sum_{a \in A} p_a [a]$$
for probabilities $p_a \ge 0$ that sum to 1. (For unnormalized states, the sum need not be 1.) The number $p_a$ is interpreted as the probability that Alice is in state $a$. Quantumly, Alice's set of pure states is the vector space $\mathbb{C}^A$. In other words, a state of Alice is a vector
$$|\psi\rangle = \sum_{a \in A} \psi_a |a\rangle$$
with complex coefficients $\psi_a$ that are called amplitudes. The square norm $|\psi_a|^2$ is interpreted as the probability that Alice is in the configuration $|a\rangle$. The total probability is therefore the sum
$$\langle\psi|\psi\rangle = \sum_{a \in A} |\psi_a|^2.$$
The state $|\psi\rangle$ is normalized if this sum is 1. The phase of $\psi_a$ (i.e., its argument or angle as a complex number) has no direct probabilistic interpretation, but it becomes important when we consider operators on $|\psi\rangle$. While the relative phase of two coordinates $\psi_a$ and $\psi_{a'}$ is indirectly measurable, it will turn out that the global phase of $|\psi\rangle$ is not measurable, i.e., it is not empirical. Indeed, the global phase of $|\psi\rangle$ is absent from the operator formalism that we will define in Section 1.3.

The state $|\psi\rangle$ is also called a quantum superposition, an amplitude function, or a wave function. This last name is motivated by the fact that $|\psi\rangle$ typically satisfies a wave equation in infinite quantum mechanics (Example ?? and Section ??). It also predates the Copenhagen interpretation and arguably distracts from it.

If $A$ and $B$ ("Alice" and "Bob") are the configuration sets of two classical systems, then an empirically allowed map from Alice's state to Bob's state is given by a stochastic linear map
$$M : \mathbb{R}^A \to \mathbb{R}^B,$$
also called a Markov map. The property that $M$ is linear is the classical superposition principle: disjoint probabilities add. In addition, in order to be stochastic, $M$ must have non-negative entries (so that probabilities remain non-negative) and its column sums must be 1 (to conserve probability).

In the quantum case, an empirical transition from Alice's vector states to Bob's vector states is a linear map
$$U : \mathbb{C}^A \to \mathbb{C}^B.$$
The requirement that $U$ is linear is the quantum superposition principle. It appears to contradict the classical superposition principle, and it is thus an apparent paradox of quantum probability. (However, the treatment in Section 1.3 reconciles the two sides of this paradox.) The entries of $U$ are also called amplitudes, just as the entries of a stochastic map are also probabilities. Since we have posited that $|\psi_a|^2$ is a probability, $U$ conserves total probability if and only if
$$\|U\psi\| = \|\psi\|$$
for all $\psi \in \mathbb{C}^A$; i.e., if $U$ is a unitary embedding. If $A = B$ or at least $|A| = |B|$, then $U$ is a unitary operator.

It will be convenient to consider maps that preserve or decrease probability. Such maps are called extinction processes; they model random walks that can terminate, experiments that can be scratched, etc. A classical map $M$ of this kind is called substochastic. The corresponding quantum condition is
$$\|U\psi\| \le \|\psi\|$$
and such a $U$ is subunitary.

Figure 1: An idealized two-slit experiment. [Diagram omitted; each path segment is labeled with an amplitude, shown as i/2.]

One traditional, idealized setting for the quantum superposition principle is a diffraction apparatus known as the two-slit experiment. Figure 1 shows the basic idea: A laser emits photons that can travel through either of two slits in a grating and then may (or may not) reach a detector. The source has a single state (the state set $A$ has one element), while the grating has two states and there are two detectors ($B$ and $C$ each have two elements). The transitions for each photon, as it passes from $A$ to $B$ to $C$, are described by two subunitary matrices
$$U : \mathbb{C}^A \to \mathbb{C}^B \qquad V : \mathbb{C}^B \to \mathbb{C}^C.$$
We can choose the matrices to be
$$U = \begin{pmatrix} \tfrac{i}{2} \\ \tfrac{i}{2} \end{pmatrix} \qquad V = \begin{pmatrix} \tfrac{i}{2} & \tfrac{i}{2} \\ \tfrac{i}{2} & -\tfrac{i}{2} \end{pmatrix},$$
so that
$$VU = \begin{pmatrix} -\tfrac{1}{2} \\ 0 \end{pmatrix}.$$
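As a numerical check on the two-slit arithmetic, here is a minimal sketch in plain Python. It assumes that every path segment carries amplitude i/2 except for one entry of V with the opposite sign (taken here to be the lower-right entry, which is an assumption about the garbled matrix); the helper names `matmul` and `probabilities` are illustrative only.

```python
# Amplitudes for the idealized two-slit experiment.
# Assumption: all entries are i/2 except the lower-right entry of V,
# whose sign is flipped so that the bottom detector sees cancellation.
U = [[0.5j], [0.5j]]                    # source -> two slits (2x1)
V = [[0.5j, 0.5j], [0.5j, -0.5j]]       # slits -> two detectors (2x2)

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def probabilities(column):
    """Squared norms of the entries of a column vector."""
    return [abs(row[0]) ** 2 for row in column]

# Both slits open: amplitudes add before squaring.
both_open = matmul(V, U)
quantum = probabilities(both_open)

# One slit blocked: discard one state of B, then square.
p_top_slit = probabilities(matmul(V, [[0.5j], [0]]))
p_bottom_slit = probabilities(matmul(V, [[0], [0.5j]]))

# Classical superposition would add the single-slit probabilities.
classical = [a + b for a, b in zip(p_top_slit, p_bottom_slit)]

print("amplitudes with both slits open:", both_open)
print("quantum probabilities:", quantum)
print("classical prediction:", classical)
```

With these choices the quantum probabilities come out as 1/4 and 0, while adding single-slit probabilities would give 1/8 at each detector, which is the discrepancy the example is meant to exhibit.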
The total amplitude of the photon reaching the top detector is $-\tfrac{1}{2}$ and the probability is $\tfrac{1}{4}$; this case is called constructive interference. The total amplitude reaching the bottom detector is 0, so the photon never reaches it; this case is called destructive interference. On the other hand, if one of the slits is blocked, then we can discard one of the states in $B$, with the result that each detector is reached with probability $\tfrac{1}{16}$. The classical superposition principle would dictate a probability of $\tfrac{1}{8}$ for each detector with both slits open; thus it is violated.

Figure 2: An angle-dependent detector in the two-slit experiment. [Diagram omitted; path segments are labeled with amplitudes, shown as i/2.]

A natural reaction to the violation of classical superposition is to try to determine which slit the photon went through. For instance, the detector could be sensitive to the angle that the photon comes in, as in Figure 2. Or there could be a detector at one of the slits that notices that the photon passed through it. But in any such circumstance, the two paths then result in different final states (of the experiment as a whole) rather than in the same state. Thus the final state vector is
$$|\psi\rangle = \begin{pmatrix} -\tfrac{1}{4} \\ -\tfrac{1}{4} \end{pmatrix}$$
and its total probability is
$$\langle\psi|\psi\rangle = \|\psi\|^2 = \tfrac{1}{8},$$
regardless of the phases of path segments to and from the slits. The lesson is that amplitudes of different trajectories of an object only add when there is no evidence of which trajectory it took. If the trajectory is recorded at all, the probabilities add. If we want to see quantum superposition, it is not enough to wittingly or unwittingly ignore such evidence. Rather, if the two trajectories induce different states of the universe, so that some observer could in principle distinguish them, then they obey classical superposition. Moreover, the effect is not the result of interaction between photons; photons do not interact with each other. Indeed, the laser could be tuned to fire only one photon at a time.

Of course, a two-slit experiment is only an idealization of a real experiment; however, it is very similar to many actual experiments and even routine demonstrations. Note also that diffraction experiments can portray any operator and therefore any process in quantum mechanics, just as any classical stochastic map can be modelled by balls falling through chutes. The two-slit experiment can be demonstrated with photons, or electrons or even molecules, but it really describes general probabilistic rules. See Sections 1.5 and ?? for more discussion.

Examples 1.1.1. A qubit is a two-state quantum object with configuration set $\{0, 1\}$. Two of its quantum superpositions are:
$$|+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}} \qquad |-\rangle = \frac{|0\rangle - |1\rangle}{\sqrt{2}}$$
Both of these states have probability $\tfrac{1}{2}$ of being in either configuration $|0\rangle$ or $|1\rangle$, but they are different states. This is demonstrated by the effect of a unitary operator $H$ called the Hadamard gate:
$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
It exchanges $|0\rangle$ with $|+\rangle$ and $|1\rangle$ with $|-\rangle$.

The spin state of a spin-$\tfrac{1}{2}$ particle is a two-state system which is important in physics. (Electrons, protons, and neutrons are all spin-$\tfrac{1}{2}$ particles.) The conventional orthonormal basis is $|{\uparrow}\rangle$ (spin up) and $|{\downarrow}\rangle$ (spin down). The names of the states refer to the property of the electron spinning (according to the right-hand rule) about a vertical axis in these two states. Even though a rotated electron is still an electron, neither this configuration set nor any other is preserved by rotations. The resolution of this paradox is that rotated states appear as superpositions. For example, the spin left and spin right states are analogous to $|+\rangle$ and $|-\rangle$:
$$|{\leftarrow}\rangle = \frac{|{\uparrow}\rangle + |{\downarrow}\rangle}{\sqrt{2}} \qquad |{\rightarrow}\rangle = \frac{|{\uparrow}\rangle - |{\downarrow}\rangle}{\sqrt{2}}.$$

Although we will soon switch to a more general model of quantum probability, we can say a few words about presenting this vector space model in a basis-independent form. As we said, a quantum object can be assigned any Hilbert space $\mathcal{H}$ rather than the standard finite-dimensional vector space $\mathbb{C}^A$ for
a configuration set $A$. Since states evolve by unitary operators, we can conjugate the standard basis of $\mathbb{C}^A$ by any unitary operator, to conclude that any orthonormal basis of any Hilbert space $\mathcal{H}$ can be called a configuration set. Likewise, if we accept a real-valued function $f$ on $A$ as a random variable, then we can model it by a diagonal matrix $D$ whose entries are the values of $f$. We can then conjugate that too by a unitary operator $U$, to conclude that any Hermitian operator $H = UDU^{-1}$ represents a real-valued random variable. If the eigenvalues of $H$ are 0 and 1, so that $H = P$ is a Hermitian projection, then this corresponds to a Boolean random variable. These are the basic rules of vector-state quantum probability, with the exception of the crucial tensor product rule for joint states (Section 1.5).

Such an expansion is interpreted as path summation; it is the same idea as a sum over histories in classical probability.

For example, let $n = 4$ and let each
$$U_k = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
Find the amplitudes of the 16 paths and group them according to how they sum.

5. In general for a spin-$\tfrac{1}{2}$ particle, the state
$$|\vec{v}\rangle = \alpha|{\uparrow}\rangle + \beta|{\downarrow}\rangle$$
spins in the direction
events add. The state is normalized if $\rho(1) = 1$, which means that the total probability is 1.

If the algebra $\Omega$ is finite, then it is isomorphic to $(\mathbb{Z}/2)^A$, the $(\mathbb{Z}/2)$-valued functions on some finite set $A$ or the algebra of subsets of $A$. The set $A$ is the set of configurations of $\Omega$.

The complex-valued random variables over $\Omega$ or $A$ form an algebra denoted $L^\infty(\Omega)$ or $\mathbb{C}^A = \ell^\infty(A)$. This algebra is generated (as an algebra over $\mathbb{C}$) by the elements of $\Omega$, with addition in $\mathbb{Z}/2$ forgotten and multiplication retained. In other words, if $xy = z$ in $\Omega$, then this is also imposed as a relation in $L^\infty(\Omega)$. (Technically, $L^\infty(\Omega)$ is only the bounded random variables and is a Banach-space completion of the algebra so generated, but these concerns are only important in the infinite case; see Section ??.) The state $\rho$ extends to a linear functional on $L^\infty(\Omega)$, so that
$$E[x] = \rho(x)$$

The state is normalized if $\rho(1) = 1$. (We write $\mathcal{M}^\#$ instead of $\mathcal{M}^*$ for the dual space because $*$ is already used internally to $\mathcal{M}$.)

If you know or suppose that $\mathcal{M} = \mathbb{C}^A$, then it is not hard to show that the self-adjoint elements are the real-valued random variables $\mathbb{R}^A$, the positive elements are the non-negative random variables $\mathbb{R}^A_{\ge 0}$, and the Boolean variables are the 0-1-valued variables $\{0, 1\}^A = (\mathbb{Z}/2)^A = \Omega$.

It is also not hard to show that the two definitions of a state are equivalent. Indeed, the set of normalized states of $\mathcal{M}$ or $\Omega$ is the simplex $\Delta_A \subset \mathbb{R}^A$ that consists of convex sums of elements of $A$. This simplex is shown for a two-state system (a randomized bit) and a three-state system (a randomized trit) in Figure ??, together with an example element in each case. To support this picture, we define $[a]$ to be the state which is definitely $a$. For example, if a bit is 1 with probability $p$, then its state is
$\mathcal{M}$ is positive-definite, meaning that if $x^* x = 0$, then $x = 0$.
$$\mathcal{M} = \bigoplus_k M_{n_k}.$$
$$\rho(x) = \mathrm{Tr}(\rho x).$$
As in the example of a qubit, an important difference between quantum probability and classical probability is that the state region of $\mathcal{M}$ is not a simplex (except in the commutative case). But it is always convex, because it is defined by linear equalities and inequalities. This convex structure allows classical superpositions in a quantum setting. Empirically, if we have two states $\rho_1$ and $\rho_2$ of a quantum system, and if we prepare a new state $\rho$ by choosing $\rho_1$ with probability $p$ and $\rho_2$ with probability $1 - p$, then
$$\rho = p\rho_1 + (1 - p)\rho_2.$$

for some vector $\psi \in \mathbb{C}^n$, since it is also Hermitian. If $\rho$ is normalized, then in addition $\psi$ is normalized, by the relation
$$\mathrm{Tr}\,\rho = \langle\psi|\psi\rangle.$$
In basis-independent form, if $\mathcal{M} = B(\mathcal{H})$, and if a state on $\mathcal{M}$ is pure, then it is described by a vector $|\psi\rangle \in \mathcal{H}$. A configuration set of $\mathcal{M}$ is, by definition, any orthonormal basis of $\mathcal{H}$. Any state $|\psi\rangle$ is a complex linear combination of the configurations, and such a linear combination can be called a quantum superposition.
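The convex mixture of two states and the trace relation for a pure state can be illustrated with a small sketch in plain Python. It assumes that a pure state with vector psi is represented by the matrix with entries psi_a times the conjugate of psi_b; the helper names `density`, `trace`, and `mix`, and the choice p = 0.25, are illustrative assumptions rather than notation from the text.

```python
import math

def density(psi):
    """Matrix |psi><psi| of a state vector (assumed representation)."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def trace(rho):
    """Sum of the diagonal entries."""
    return sum(rho[i][i] for i in range(len(rho)))

def mix(p, rho1, rho2):
    """Classical mixture p*rho1 + (1 - p)*rho2, entry by entry."""
    return [[p * x + (1 - p) * y for x, y in zip(r1, r2)]
            for r1, r2 in zip(rho1, rho2)]

s = 1 / math.sqrt(2)
zero = [1 + 0j, 0j]            # the configuration |0>
plus = [complex(s), complex(s)]  # the superposition |+>

rho1, rho2 = density(zero), density(plus)

# For a pure state, the trace equals the squared norm of the vector.
norm_sq = sum(abs(a) ** 2 for a in plus)
trace_pure = trace(rho2)

# A mixture of normalized states is again normalized and Hermitian.
rho = mix(0.25, rho1, rho2)
trace_mixed = trace(rho)

# A nonzero determinant means rho has rank 2, so it is not |psi><psi|
# for any single vector: the mixture is not a pure state.
det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real

print(trace_pure, norm_sq, trace_mixed, det)
```

The mixture stays inside the convex state region, but it is not described by any single vector, which is one way to see why the general state is an operator rather than a vector.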
If Alice and Bob are both fully quantum and have Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ of the same dimension, then every algebra isomorphism
$$E : B(\mathcal{H}_B) \to B(\mathcal{H}_A)$$

In this section we will look more closely at measuring quantum random variables. A random variable $x \in \mathcal{M}^{sa}$ (or more generally a classical domain $\mathcal{A} \subseteq \mathcal{M}$ as defined below) is also called a measurable or an observable. To measure it is to pass to the conditional state, just as is done in classical probability.

We said that if $p$ is a Boolean random variable in an algebra $\mathcal{M}$, then the unnormalized conditional state is

random variable is then in general defined by a set of mutually exclusive Booleans that sum to 1:
$$\sum_{a \in A} p_a = 1 \qquad a \ne b \implies p_a p_b = 0.$$
that $\mathcal{H} = \mathbb{C}^A$. The set $A$ is a configuration set, in keeping with Section 1.1. If $n = |A| = \dim \mathcal{H}$, then we say that $\mathcal{M}$ is an $n$-state system, even though technically $n$ is the number of configurations rather than the number of states. For example, a qubit can also be described as any fully quantum 2-state system.

A maximal classical realm $\mathcal{A}$ is also called a complete measurement. The name evokes the fact that once any configuration $a \in A$ is measured, the conditional state is pure and determined by $a$, so there is no more left to learn from the state of $\mathcal{M}$. (Nonetheless, $\mathcal{M}$ has many different complete measurements; and as we said, even a pure state is typically still a source of perpetual randomness.) Note also that if $\mathcal{M}$ is fully quantum and we have chosen a basis so that $\mathcal{A}$ consists of diagonal matrices, then the diagonal entry $\rho_{aa}$ of a state $\rho$ is just the probability of the outcome $a \in A$. We can view $\rho$ as a classical probability distribution on $A$, plus extra off-diagonal information.

If $\mathcal{M}$ is not fully quantum, then some of the above analysis has to be modified. Nonetheless, it is still true that all of the conditional states of a maximal classical realm $\mathcal{A}$ are pure, that a set $A$ of such outcomes is called a configuration set, and that any two configuration sets have the same cardinality. If
$$\mathcal{M} = \bigoplus_k M_{n_k},$$
then the cardinality of $A$ is the total sum of the matrix sizes, $n = \sum_k n_k$.

Similar to a complete measurement, if $\rho$ is a pure state, then there is an associated minimal Boolean $p$ with the same matrix as $\rho$ which answers whether the system is in the state $\rho$. If $p$ and $q$ are two such minimal Booleans, then $\mathrm{Tr}(pq)$ is both the probability that the state $p$ will be found in the state $q$, and vice versa; it can be called the overlap between $p$ and $q$. If $\mathcal{M}$ is fully quantum, so that $p$ and $q$ have state vectors $|a\rangle$ and $|b\rangle$, then their overlap is $|\langle a|b\rangle|^2$. Two pure states are mutually exclusive if and only if they have no overlap (Exercise ??).

Unlike real-valued random variables, there are two notions of complex-valued and vector-valued random variables. We can let $z$ be any element of $\mathcal{M}$, which is the complexification of $\mathcal{M}^{sa}$, so that
$$z = x + iy \qquad x = \frac{z + z^*}{2} \qquad y = \frac{z - z^*}{2i}.$$
Then $z$ is a complex random variable in a weak sense, because $x$ and $y$ are both self-adjoint and are therefore both real random variables. The wrinkle is that $x$ and $y$ may not commute with each other, in which case $z$ and a state $\rho$ do not yield a distribution on $\mathbb{C}$. If the real and imaginary parts $x$ and $y$ do commute, or equivalently if $z$ and $z^*$ commute, then $z$ is normal. A state $\rho$ and a normal $z$ generate a classical realm and a classical state on $\mathbb{C}$ as usual.

Likewise a vector-valued random variable in the weak sense is any $\vec{v} \in \mathcal{M} \otimes V$ for some vector space $V$. We can say that $\vec{v}$ is normal when its components commute in any basis of $V$, in which case it generates a classical realm and a classical state on $V$, given a state on $\mathcal{M}$. An important example of the weak kind of vector-valued random variable is the angular momentum operator (Section ??).

Note that the set $\mathcal{M}^{nor}$ of normal elements of $\mathcal{M}$ is not closed under either addition or multiplication (Exercise ??). The same is true of the set $(\mathcal{M} \otimes V)^{nor}$ of vector-valued measurements, or in general the $A$-valued measurements where the set $A$ is an abelian group. Only the real random variables, $\mathcal{M}^{sa}$, have the special property that they can be added even if they do not commute.

1.4.1. Exercises

1.5. Joint systems
[1] Paul A. Dirac, Principles of quantum mechanics, Oxford University Press, 1930.
[2] Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman lectures on physics. Vol. 3: quantum mechanics, Addison-Wesley, 1965.
[3] K. M. Lang, S. Nam, J. Aumentado, C. Urbina, and John M. Martinis, Banishing quasiparticles from Josephson-junction qubits: why and how to do it, IEEE Trans. Appl. Superconduct. 13 (2003), no. 2, 989–993.
[4] Michael A. Nielsen and Isaac L. Chuang, Quantum computation and quantum information, Cambridge University Press, Cambridge, 2000.
[5] Jun John Sakurai, Modern quantum mechanics, 2nd ed., Benjamin/Cummings, 1985.