Chap 3
Aram W. Harrow
Contents
1 Axioms of Quantum Mechanics
  1.1 One system
  1.2 Two systems
  1.3 The problem of partial measurement
2 Density operators
  2.1 Introduction and definition
  2.2 Examples
5 Examples of decoherence
  5.1 Looking inside a Mach-Zehnder interferometer
  5.2 Spin rotations in NMR
  5.3 Spontaneous emission
1 Axioms of Quantum Mechanics
1.1 One system
States are given by unit vectors |ψ⟩ ∈ V for some vector space V.
Observables are Hermitian operators Â ∈ L(V).
Measurements: Suppose Â = Σ_{i=1}^d λ_i |v_i⟩⟨v_i| for {|v_1⟩, . . . , |v_d⟩} an orthonormal basis of eigenvectors and (for simplicity) each λ_i distinct. If we measure observable Â on state |ψ⟩ then the outcomes are distributed according to Pr[λ_i] = |⟨v_i|ψ⟩|².
Time evolution is given by Schrödinger's equation: iℏ (∂/∂t)|ψ⟩ = H|ψ⟩, where H is the Hamiltonian.
Heisenberg picture: We can instead evolve operators in time using

iℏ (∂/∂t) Â_H = [Â_H, H_H].   (1)
Time-independent solution
If the Hamiltonian does not change in time, then the time evolution operator for time t is the unitary operator U = e^{−iHt/ℏ}. The state evolves according to |ψ(t)⟩ = U|ψ(0)⟩ in the Schrödinger picture, or the operator evolves according to Â_H(t) = U† Â_H(0) U in the Heisenberg picture.
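As a quick numerical illustration of these rules (not part of the original notes; the operators and the time below are arbitrary choices), the following Python sketch computes the measurement distribution Pr[λ_i] = |⟨v_i|ψ⟩|² and the evolved state U|ψ(0)⟩ with U = e^{−iHt/ℏ}:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0                                      # work in units where hbar = 1

# A Hermitian observable and a Hamiltonian on a d = 2 system.
A = np.array([[1.0, 1.0j], [-1.0j, -1.0]])      # Hermitian observable
H = np.array([[0.5, 0.2], [0.2, -0.5]])         # Hamiltonian (also Hermitian)

psi0 = np.array([1.0, 0.0], dtype=complex)      # initial state |psi(0)>

# Measurement statistics: Pr[lambda_i] = |<v_i|psi>|^2
eigvals, eigvecs = np.linalg.eigh(A)            # columns of eigvecs are the |v_i>
probs = np.abs(eigvecs.conj().T @ psi0) ** 2
print("outcomes:", eigvals, "probabilities:", probs, "sum:", probs.sum())

# Schroedinger-picture evolution: |psi(t)> = exp(-iHt/hbar) |psi(0)>
t = 1.3
U = expm(-1j * H * t / hbar)
psit = U @ psi0
print("norm preserved:", np.isclose(np.linalg.norm(psit), 1.0))
```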
Systems are described by a pair (V, H).
for some unit vectors |w_1⟩, . . . , |w_d⟩ (not necessarily orthogonal) and some p_1, . . . , p_d such that p_i ≥ 0 and Σ_{i=1}^d p_i = 1. Then the probability of outcome i is p_i and the residual state in this case is |v_i⟩ ⊗ |w_i⟩.
Time evolution is still given by Schrödinger’s equation, but now the joint Hamiltonian of two
non-interacting systems is
H = H1 ⊗ I + I ⊗ H2 . (2)
Interactions can add more terms, such as the Coulomb interaction q_1 q_2 / |r⃗_1 − r⃗_2|, which generally cannot be written in this way. Note that a Hamiltonian term of the form Â_1 ⊗ Â_2 does represent an interaction; e.g. σ_z ⊗ σ_z has energy ±1 depending on whether the two spins have Z components pointing in the same or opposite directions.
Time-independent solution
For a Hamiltonian of the form in (2), the time evolution operator is

U = e^{−iHt/ℏ} = e^{−iH_1 t/ℏ} ⊗ e^{−iH_2 t/ℏ}.

You should convince yourself that this second equality is true. Of course if the Hamiltonian contains interactions then U will generally not be of this form.
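If you prefer to check this numerically, here is a minimal sketch (not from the notes; the dimensions, seed, and time are arbitrary) verifying that the joint propagator factorizes when the Hamiltonian has the non-interacting form (2):

```python
import numpy as np
from scipy.linalg import expm

hbar, t = 1.0, 0.7
rng = np.random.default_rng(0)

def random_hermitian(d):
    """Random d x d Hermitian matrix."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (M + M.conj().T) / 2

H1, H2 = random_hermitian(2), random_hermitian(3)
I1, I2 = np.eye(2), np.eye(3)

# Non-interacting joint Hamiltonian, eq. (2): H = H1 (x) I + I (x) H2
H = np.kron(H1, I2) + np.kron(I1, H2)

U_joint = expm(-1j * H * t / hbar)
U_factored = np.kron(expm(-1j * H1 * t / hbar), expm(-1j * H2 * t / hbar))
print(np.allclose(U_joint, U_factored))   # True: the two terms of H commute
```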
These principles are actually profoundly different from anything we have seen before. For
example, consider the number of degrees of freedom. One d-level system needs d complex numbers
to describe (neglecting normalization and the overall phase ambiguity), but N d-level systems need d^N complex numbers to describe, instead of dN. This exponential extravagance is behind the power
of quantum computers, which will be discussed briefly at the end of the course, if time permits. It
also seemed intuitively wrong to many physicists in the early 20th century, most notably including
Einstein. The objections of EPR [A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777–780 (1935)] led to Bell's theorem, which we saw in 8.05 and will review on pset 6. Here, though,
we will consider a simpler problem.
Table 1: Outcomes when Alice measures her half of the singlet state (3) in the {|+⟩, |−⟩} basis.
What can we say about Bob's state after such a measurement? It is not a deterministic object, but rather an ensemble of states, each with an associated probability. For this we use the notation

{p_i, |ψ_i⟩}_{1≤i≤m}   (4)

to indicate that state |ψ_i⟩ occurs with probability p_i. The numbers p_1, . . . , p_m should form a
probability distribution, meaning that they are nonnegative reals that sum to one. The states |ψi i
should be unit vectors but do not have to be orthogonal. In fact, the number m could be much
larger than the dimension d, and could even be infinite; e.g. we could imagine a state with some
coefficients that are given by a Gaussian distribution. We generally consider m to be finite because
it keeps the notation simple and doesn’t sacrifice any important generality.
In the example where Alice measures in the {|+⟩, |−⟩} basis, Bob is left with the ensemble

{(1/2, |+⟩), (1/2, |−⟩)}.   (5)
What if Alice chooses a different basis? Recall from 8.05 that if n⃗ ∈ R³ is a unit vector then a spin-1/2 particle pointing in that direction has state

|n⃗⟩ ≡ |n; +⟩ = cos(θ/2)|+⟩ + e^{iφ} sin(θ/2)|−⟩.   (6)

Here (1, θ, φ) are the polar coordinates for n⃗; i.e. n_x = sin θ cos φ, n_y = sin θ sin φ, and n_z = cos θ. The notation |n; +⟩ was what we used in 8.05 and |n⃗⟩ will be the notation used in 8.06 in contexts where it is clear that we are talking about spin states. The orthonormal basis {|n; +⟩, |n; −⟩} in our new notation is denoted {|n⃗⟩, |−n⃗⟩}.
Suppose that Alice measures in the {|n⃗⟩, |−n⃗⟩} basis. It can be shown (see 8.05 notes or Griffiths §12.2) that for any n⃗,
Table 2: Outcomes when Alice measures her half of the singlet state (3) in the {|n⃗⟩, |−n⃗⟩} basis.
Uh-oh! At this point, our elegant theories of quantum mechanics have run into a number of
problems.
• Theory isn’t closed. When we combine two systems with tensor product we get a new
system, meaning a new vector space and a new Hamiltonian. It still fits the definition of
a quantum system. But when we look at the state of a subsystem, we do not get a single
quantum state, we get an ensemble. Thus, if we start with states being represented by unit
vectors, we are inevitably forced into having to use ensembles of vectors instead.
• Ensembles aren’t unique. Any choice of ~n will give Bob a different ensemble. We expect
our physical theories to give us unique answers, but here we cannot uniquely determine which
ensemble is the right one for Bob. Note that other choices of measurement can leave Bob
with different ensembles as well; e.g. if Alice flips a coin and uses that to choose between two measurement settings, then Bob will have a distribution over four states, each occurring
with probability 1/4.
• Time travel?! If Bob could distinguish between these different ensembles (including the case
in which Alice does nothing and he still holds half of an entangled state), then Alice could
instantaneously communicate to Bob with her choice of measurement basis (or perhaps her
choice of whether to measure at all or not). According to special relativity, there is a different
inertial frame in which this process looks like Alice sending a message backwards in time.
This rapidly leads to trouble...
Fortunately density operators solve all three problems! As a bonus, they are far more elegant than
ensembles.
2 Density operators
2.1 Introduction and definition
We would like to develop a theory of states that combines randomness and quantum mechanics.
So it is worth reviewing how both randomness and quantum mechanics can be viewed as two
different ways of generalizing classical states. For simplicity, consider a classical system which can
be in d different states labelled 1, 2, . . . d. The quantum mechanical generalization of this would
be to consider complex d-dimensional unit vectors while the probabilistic generalization would be
nonnegative real d-dimensional vectors whose entries sum to one. These can be thought of as
two incomparable generalizations of the classical picture. We are interested in considering both
generalizations at once so that we consider state spaces that are both probabilistic and quantum.
We summarize these different choices of state spaces in Table 3.
                 classical                      quantum
 deterministic   {1, . . . , d}                 |ψ⟩ ∈ C^d s.t. ⟨ψ|ψ⟩ = 1
 probabilistic   p_1, . . . , p_d ≥ 0           ensembles?
                 s.t. p_1 + · · · + p_d = 1     density operators?

Table 3: Different theories yield different state spaces.
What do we put in the fourth box (probabilistic quantum) of Table 3? One possibility is to put ensembles of quantum states, as defined in (4). Besides the drawbacks mentioned in the previous section, these also have the flaw of involving an unbounded number of degrees of freedom. For example, let's take a spin-1/2 particle (i.e. d = 2), so our quantum states are of the form c_+|+⟩ + c_−|−⟩. Then one such probability distribution is |+⟩ with probability 1/3, |−⟩ with probability 1/2 and (|+⟩ − i|−⟩)/√2 with probability 1/6. Another distribution is cos(θ)|+⟩ + sin(θ)|−⟩ where θ is distributed according to a Gaussian with mean 0 and variance σ².
This works, but there are an infinite number of degrees of freedom, even if we start with a single lousy electron spin! Surely nature would not be so cruel.
Another drawback with this approach is that different distributions can give the same measurement statistics for all possible measurements. As a result, many of these infinite degrees of freedom turn out to be simply redundant.
To see how this works, suppose that we have a discrete distribution where state |ψa i occurs
with probability pa , for a = 1, . . . , m. Consider an observable Â. The expectation of  with respect
to this ensemble is:

Σ_{a=1}^m p_a ⟨ψ_a|Â|ψ_a⟩ = Σ_{a=1}^m p_a tr[⟨ψ_a|Â|ψ_a⟩]   (since tr has no effect on 1 × 1 matrices)
                          = Σ_{a=1}^m p_a tr[Â |ψ_a⟩⟨ψ_a|]   (cyclic property of the trace)
                          = tr[Â Σ_{a=1}^m p_a |ψ_a⟩⟨ψ_a|]   (linearity of the trace)
                          = tr[Â ρ],

where ρ ≡ Σ_{a=1}^m p_a |ψ_a⟩⟨ψ_a| is called the density matrix.
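Here is a small numerical check of this identity (not part of the notes; the random ensemble and observable are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 3, 5

# A random ensemble {p_a, |psi_a>} and a random Hermitian observable A.
p = rng.random(m); p /= p.sum()
psis = rng.normal(size=(m, d)) + 1j * rng.normal(size=(m, d))
psis /= np.linalg.norm(psis, axis=1, keepdims=True)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)); A = (A + A.conj().T) / 2

# Ensemble average of <psi_a|A|psi_a> ...
lhs = sum(p[a] * (psis[a].conj() @ A @ psis[a]) for a in range(m))
# ... equals tr(A rho) with rho = sum_a p_a |psi_a><psi_a|
rho = sum(p[a] * np.outer(psis[a], psis[a].conj()) for a in range(m))
print(np.isclose(lhs, np.trace(A @ rho)))   # True
```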
The step labelled "cyclic property of the trace" above, tr[XY] = tr[YX] (9), is so called because it is often applied to traces of long strings of matrices. For example, we can repeatedly apply (9) (at each step taking X to be all but the last matrix and Y the last one) to obtain

tr[ABCD] = tr[DABC] = tr[CDAB] = tr[BCDA].   (10)
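A one-line numerical sanity check (not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
A, B, C, D = (rng.normal(size=(4, 4)) for _ in range(4))

t = np.trace(A @ B @ C @ D)
print(np.allclose(t, np.trace(D @ A @ B @ C)),
      np.allclose(t, np.trace(C @ D @ A @ B)),
      np.allclose(t, np.trace(B @ C @ D @ A)))   # True True True
```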
The trace can also be used to define an inner product on operators. Define

⟨X, Y⟩ ≡ tr[X†Y] = Σ_{i,j} X*_{i,j} Y_{i,j}.   (11)

From this last expression we see that ⟨X, Y⟩ is equivalent to turning X and Y into vectors in the natural way (just listing all the elements in order) and taking the conventional inner product between those vectors.
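For example (an illustrative check, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Y = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

inner_trace = np.trace(X.conj().T @ Y)          # tr[X^dagger Y]
inner_vec = np.vdot(X.flatten(), Y.flatten())   # vectorize, then take the ordinary inner product
print(np.isclose(inner_trace, inner_vec))       # True
```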
2.2 Examples
2.2.1 Pure states
If we know the state is |ψ⟩, then the density matrix is |ψ⟩⟨ψ|. Observe that there is no phase ambiguity (|ψ⟩ ↦ e^{iφ}|ψ⟩ leaves the density matrix unchanged) and each |ψ⟩ gives rise to a distinct density matrix. Such density matrices are called pure states, and sometimes this terminology is also used when talking about wavefunctions, to justify not using the density matrix formalism. By contrast, all other density matrices are called mixed states.
|n⃗⟩⟨n⃗| = (I + n⃗ · σ⃗)/2.   (12)
With this in hand we can return to the example of Alice measuring half of a singlet state. Whatever her choice of n⃗, Bob's density matrix is

(1/2)|n⃗⟩⟨n⃗| + (1/2)|−n⃗⟩⟨−n⃗| = (1/2)(I + n⃗·σ⃗)/2 + (1/2)(I − n⃗·σ⃗)/2 = I/2.   (13)

This rules out their earlier attempts at instantaneous signaling (and later we will prove this in more generality). Bob's density matrix fully determines the results of any measurement he makes, and it is independent of Alice's choice of n⃗.
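A numerical version of (12) and (13), with a randomly chosen direction n⃗ (a sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(4)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2)

theta, phi = np.arccos(rng.uniform(-1, 1)), rng.uniform(0, 2 * np.pi)
n = np.array([np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)])

# |n> = cos(theta/2)|+> + e^{i phi} sin(theta/2)|->, and |-n> is its orthogonal partner.
ket_n = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
ket_minus_n = np.array([-np.exp(-1j * phi) * np.sin(theta / 2), np.cos(theta / 2)])

rho = 0.5 * np.outer(ket_n, ket_n.conj()) + 0.5 * np.outer(ket_minus_n, ket_minus_n.conj())
print(np.allclose(rho, I / 2))                                   # True for any direction n
print(np.allclose(np.outer(ket_n, ket_n.conj()),
                  (I + n[0] * sx + n[1] * sy + n[2] * sz) / 2))  # also checks eq. (12)
```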
2.2.3 The maximally mixed state
If {|v1 i, . . . , |vd i} are an orthonormal basis, and each occurs with probability 1/d, then the resulting
density matrix is

ρ = (1/d) Σ_{i=1}^d |v_i⟩⟨v_i| = I/d,   (14)
independent of the choice of basis. This is called the maximally mixed state. The previous example
was the d = 2 case of this: a 1/2 probability of spin-up and 1/2 probability of spin-down results in
the same density matrix, no matter which direction “up” refers to.
The continuous distribution over all unit vectors in Cd also yields the same density matrix,
although this is a harder calculation.
One lesson is that we shouldn’t take the probabilities pa too seriously; i.e. they are not uniquely
determined by the density matrix. Neither is the property of the states in the ensemble being
orthogonal.
For a system with Hamiltonian H in equilibrium at temperature T, the corresponding state is ρ_thermal = e^{−H/k_B T}/tr[e^{−H/k_B T}]. This is known as the Gibbs state or the thermal state. It describes the state of a quantum system at thermal equilibrium.
One specific example comes from NMR. Consider a proton spin in a magnetic field, say an 11.74 Tesla field in the ẑ direction. At this field strength, the proton spin will experience the Hamiltonian H = −ℏω_0 σ_z where ω_0 ≈ 500 MHz. (In fact, if you buy an 11.74 T superconducting magnet, the vendor will probably call it a "500 MHz" magnet for this reason. It could also reasonably be called a 500K magnet because of its price.) The thermal state is then ρ_thermal = e^{ℏω_0 σ_z/k_B T}/tr[e^{ℏω_0 σ_z/k_B T}] ≈ (I + (ℏω_0/k_B T) σ_z)/2, since ℏω_0 ≪ k_B T at room temperature.
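To get a feel for the numbers, here is a short calculation (not from the notes; it takes T = 300 K and interprets the quoted 500 MHz as the Larmor frequency, ω_0 = 2π × 500 MHz):

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.054571817e-34      # J s
kB = 1.380649e-23           # J / K
T = 300.0                   # room temperature, K
omega0 = 2 * np.pi * 500e6  # rad/s, for a "500 MHz" magnet

sz = np.array([[1.0, 0.0], [0.0, -1.0]])
H = -hbar * omega0 * sz

rho_thermal = expm(-H / (kB * T))
rho_thermal /= np.trace(rho_thermal)

print(rho_thermal)
print("polarization <sigma_z> ~", np.trace(rho_thermal @ sz))   # ~ 8e-5: a very weakly polarized spin
```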
Every density matrix ρ = Σ_{a=1}^m p_a |ψ_a⟩⟨ψ_a| satisfies:
1. tr ρ = 1.
2. ρ ⪰ 0.
Conversely, for any d × d matrix ρ satisfying these two conditions, there exists an ensemble {p_a, |ψ_a⟩}_{1≤a≤m} such that ρ = Σ_{a=1}^m p_a |ψ_a⟩⟨ψ_a|. Here m can be taken to be the rank of ρ.
The inequality ρ ⪰ 0 means that ρ is positive semidefinite, which is defined to mean that ⟨ψ|ρ|ψ⟩ ≥ 0 for all |ψ⟩. It is the matrix analogue of being nonnegative.
Theorem 2. Let A be a Hermitian matrix. Then the following are equivalent:
1. ⟨ψ|A|ψ⟩ ≥ 0 for all |ψ⟩.
2. All eigenvalues of A are nonnegative.
3. There exists a matrix B such that A = B†B. (This is called a Cholesky factorization.)
To get intuition for this last condition, observe that for 1 × 1 matrices, it is the statement that a real number x ≥ 0 iff x = z*z for some complex z.
Proof of Theorem 2. Since A is Hermitian, we can write A = Σ_{i=1}^d λ_i |e_i⟩⟨e_i| for some orthonormal basis {|e_1⟩, . . . , |e_d⟩} and some real λ_1, . . . , λ_d.
(1 → 2): Take |ψ⟩ = |e_i⟩. Then 0 ≤ ⟨ψ|A|ψ⟩ = ⟨e_i|A|e_i⟩ = λ_i.
(2 → 3): Let B = Σ_{i=1}^d √λ_i |e_i⟩⟨e_i|. As an aside, one can show that B satisfies A = B†B if and only if B = Σ_{i=1}^d √λ_i |f_i⟩⟨e_i| for some orthonormal basis {|f_1⟩, . . . , |f_d⟩}. Sometimes we say that B = √A, by analogy to the scalar case.
(3 → 1): For any |ψ⟩, let |ϕ⟩ = B|ψ⟩. Then ⟨ψ|A|ψ⟩ = ⟨ψ|B†B|ψ⟩ = ⟨ϕ|ϕ⟩ ≥ 0.
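The construction in step (2 → 3) is easy to carry out numerically (a sketch, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(5)

# Build a random positive semidefinite A (as C^dagger C, so it satisfies condition 3).
C = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = C.conj().T @ C

# Condition 2: the eigenvalues are nonnegative (eigh also gives the basis |e_i>).
lam, E = np.linalg.eigh(A)
print("eigenvalues >= 0:", np.all(lam >= -1e-12))

# The square root B = sum_i sqrt(lambda_i) |e_i><e_i| satisfies A = B^dagger B.
B = E @ np.diag(np.sqrt(np.clip(lam, 0, None))) @ E.conj().T
print(np.allclose(A, B.conj().T @ B))   # True
```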
• Finally, tr ρ = Σ_{i=1}^m p_i tr|ψ_i⟩⟨ψ_i| = Σ_{i=1}^m p_i = 1.
To prove the other direction, suppose that tr ρ = 1 and ρ ⪰ 0. By Theorem 2, ρ = Σ_{i=1}^d λ_i |e_i⟩⟨e_i| for {|e_1⟩, . . . , |e_d⟩} an orthonormal basis and each λ_i ≥ 0. Additionally tr ρ = Σ_{i=1}^d λ_i = 1. Thus we can take p_i = λ_i and now ρ is the density matrix corresponding to the ensemble {p_i, |e_i⟩}_{1≤i≤d}. If rank ρ < d, then the sum only needs rank ρ terms.
Geometrically this looks like the unit ball in R³. The pure states form the surface of the ball, corresponding to the case |a⃗| = 1. The maximally mixed state I/2 corresponds to a⃗ = 0. In general, |a⃗| can be thought of as the "purity" of a state.
This set is called the Bloch ball. The unit vectors at the surface are called the Bloch sphere.
These have nothing to do with Bloch states or Bloch’s theorem (which arise in the solution of
periodic potentials) except for the name of the inventor.
Beware also that for d > 2, the set of density matrices is no longer a ball and there is no longer
a canonical way to quantify “purity.” However, notions of entropy do exist and are used in fields
such as quantum statistical mechanics.
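The correspondence between a qubit density matrix and its Bloch vector, ρ = (I + a⃗·σ⃗)/2 with a_i = tr[ρσ_i], is easy to play with numerically (a sketch, not from the notes):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2)

def bloch_vector(rho):
    """Extract a_i = tr(rho sigma_i) from a qubit density matrix."""
    return np.real([np.trace(rho @ s) for s in (sx, sy, sz)])

def density_matrix(a):
    """Inverse map: rho = (I + a . sigma) / 2."""
    return (I + a[0] * sx + a[1] * sy + a[2] * sz) / 2

rho_pure = np.array([[1, 0], [0, 0]], dtype=complex)      # |+z><+z|: on the sphere
rho_mixed = I / 2                                         # maximally mixed: the center
for rho in (rho_pure, rho_mixed):
    a = bloch_vector(rho)
    print(a, "|a| =", np.linalg.norm(a))                  # 1 for the pure state, 0 for I/2
    print(np.allclose(density_matrix(a), rho))            # the round trip works
```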
iℏ ∂ρ/∂t = [H, ρ].   (20)

This is reminiscent of the Heisenberg equation of motion for operators, but with the opposite sign:

iℏ ∂Â_H/∂t = [Â_H, H_H].   (21)
One way to explain the different signs is that states and observables are dual to each other, in the sense that they appear in the expectation value as ⟨Â, ρ⟩.
Another way to talk about quantum dynamics is in terms of unitary transformations. If a system undergoes Hamiltonian evolution for a finite time then this evolution can be described by a unitary operator U, so that state |ψ⟩ gets mapped to U|ψ⟩. In this case |ψ⟩⟨ψ| is mapped to U|ψ⟩⟨ψ|U†. By linearity, a general density matrix ρ is then mapped to UρU†.
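The linearity argument can be checked directly (an illustrative sketch, not from the notes): evolving ρ as UρU† agrees with evolving each member of an ensemble and re-mixing.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(6)
d, m = 2, 3

H = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)); H = (H + H.conj().T) / 2
U = expm(-1j * H * 0.4)          # hbar = 1

p = rng.random(m); p /= p.sum()
psis = rng.normal(size=(m, d)) + 1j * rng.normal(size=(m, d))
psis /= np.linalg.norm(psis, axis=1, keepdims=True)
rho = sum(p[a] * np.outer(psis[a], psis[a].conj()) for a in range(m))

# Evolving the density matrix directly ...
rho_evolved = U @ rho @ U.conj().T
# ... agrees with evolving each pure state in the ensemble and re-mixing.
rho_from_ensemble = sum(p[a] * np.outer(U @ psis[a], (U @ psis[a]).conj()) for a in range(m))
print(np.allclose(rho_evolved, rho_from_ensemble))   # True
```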
4.2 Measurement
A similar argument shows that if we measure ρ in the orthonormal basis {|v_1⟩, . . . , |v_d⟩}, then the probability of outcome j is ⟨v_j|ρ|v_j⟩ and the post-measurement state is |v_j⟩⟨v_j|. The fastest way to see this is to consider the observable |v_j⟩⟨v_j|, which has eigenvalue 1 (corresponding to obtaining outcome |v_j⟩) and eigenvalue 0 repeated d − 1 times (corresponding to the orthogonal outcomes). Then we use the fact that ⟨Â⟩ = tr[Âρ] and set Â = |v_j⟩⟨v_j|.
An alternate derivation is to decompose ρ = Σ_{a=1}^m p_a |ψ_a⟩⟨ψ_a|. Then Pr[j|a] = |⟨v_j|ψ_a⟩|² and

Pr[j] = Σ_{a=1}^m p_a Pr[j|a]
      = Σ_{a=1}^m p_a |⟨v_j|ψ_a⟩|²
      = Σ_{a=1}^m p_a ⟨v_j|ψ_a⟩⟨ψ_a|v_j⟩
      = ⟨v_j| (Σ_{a=1}^m p_a |ψ_a⟩⟨ψ_a|) |v_j⟩
      = ⟨v_j|ρ|v_j⟩.
It should be reassuring that, even though we used the ensemble decomposition in this derivation,
the final probability we obtained depends only on ρ.
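A quick numerical restatement of this point (not from the notes; the ensemble and measurement basis are random):

```python
import numpy as np

rng = np.random.default_rng(7)
d, m = 3, 4

p = rng.random(m); p /= p.sum()
psis = rng.normal(size=(m, d)) + 1j * rng.normal(size=(m, d))
psis /= np.linalg.norm(psis, axis=1, keepdims=True)
rho = sum(p[a] * np.outer(psis[a], psis[a].conj()) for a in range(m))

V = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))[0]  # columns are |v_j>

# Pr[j] from the ensemble: sum_a p_a |<v_j|psi_a>|^2
pr_ensemble = sum(p[a] * np.abs(V.conj().T @ psis[a]) ** 2 for a in range(m))
# Pr[j] from the density matrix alone: <v_j|rho|v_j>
pr_rho = np.real(np.diag(V.conj().T @ rho @ V))
print(np.allclose(pr_ensemble, pr_rho), np.isclose(pr_rho.sum(), 1.0))   # True True
```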
What if we forget the measurement outcome, or never knew it (e.g. someone else measures the
state while our back is turned)? Then ρ is mapped to

Σ_{j=1}^d ⟨v_j|ρ|v_j⟩ |v_j⟩⟨v_j| = Σ_{j=1}^d |v_j⟩⟨v_j| ρ |v_j⟩⟨v_j|.   (22)
Here it is important to note that density matrices, like probability distributions, represent not only
objective states of the world but also subjective states; in other words, they describe our knowledge
about a state. So subjective uncertainty (i.e. the state “really is” something definite but we don’t
know what it is) will have implications for the density matrix.
If we now write ρ from (22) as a matrix in the |v_1⟩, . . . , |v_d⟩ basis, this looks like

diag(ρ_{1,1}, ρ_{2,2}, . . . , ρ_{d,d}),

i.e. the diagonal entries are untouched and all off-diagonal entries have been set to zero.
Can we unify measurement and unitary evolution the way that we have unified the probabilis-
tic and quantum pictures of states? For example, how should we model an atom in an excited
state undergoing fluorescence? We will return to this topic later when we discuss open quantum
systems and quantum operations. However, already we are equipped to handle the phenomenon of
decoherence, which is the monster lurking in the closet of every quantum mechanical experiment.
4.3 Decoherence
Unitary operators correspond to reversible operations: if U is a valid unitary time evolution then
so is U † . In terms of Hamiltonians, evolution according to −H will reverse evolution according to
H. But other quantum processes cause an irreversible loss of information. Irreversible quantum
processes are generally called “decoherence.” This somewhat imprecise term refers to the fact that
this information loss is always associated with a loss of “coherence” and with quantum systems
becoming more like classical systems. In what follows we will illustrate it via a series of examples,
but will not give a general definition.
Let's warm up with the concept of a mixture. If state |ψ_a⟩ occurs with probability p_a, then the density matrix is Σ_a p_a |ψ_a⟩⟨ψ_a|. But what if we have an ensemble of density matrices, e.g. {(p_1, ρ_1), . . . , (p_m, ρ_m)}? Then the "average" density matrix is

ρ = Σ_{a=1}^m p_a ρ_a.   (23)
We can use this to model random unitary evolution. Suppose that our state experiences a random Hamiltonian. Model this by saying that unitary U_a occurs with probability p_a for a = 1, . . . , m. This corresponds to the map

ρ ↦ Σ_{a=1}^m p_a U_a ρ U_a†.   (24)
Let's see how this can explain how coherence is lost in simple quantum systems. Suppose we start with the density matrix

ρ = ( ρ_{+,+}   ρ_{+,−} )
    ( ρ_{−,+}   ρ_{−,−} )

and choose a random unitary to perform as follows: with probability 1 − p we do nothing and with probability p we perform a unitary transformation equal to σ_z. This corresponds to the ensemble of unitary transformations {(1 − p, I), (p, σ_z)}. The density matrix is then mapped to

ρ′ = (1 − p) ρ + p σ_z ρ σ_z = (       ρ_{+,+}        (1 − 2p) ρ_{+,−} )
                               ( (1 − 2p) ρ_{−,+}          ρ_{−,−}     ).

If p = 0 then this of course corresponds to doing nothing, and if p = 1, we simply have ρ′ = σ_z ρ σ_z.
In between we see that the diagonal terms remain the same, but the off-diagonal terms are reduced
in absolute value. The diagonal terms correspond to the probability of outcomes we would observe
if we measured in the ẑ basis, and so it is not surprising that a ẑ rotation would not affect these.
However, the off-diagonal terms reduce just as we would expect for a vector that is averaged with a
rotated version of itself. If p = 1/2, then the off-diagonal terms are completely eliminated, meaning
that all polarization in the x̂ and ŷ directions has been eliminated. One way to see this is that
the x̂ and ŷ polarization of σz ρσz is opposite to that of ρ. Thus averaging ρ and σz ρσz leaves zero
polarization in the x̂-ŷ plane.
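A few lines of Python make the (1 − 2p) attenuation of the off-diagonal terms explicit (a sketch, not from the notes; the starting ρ is an arbitrary valid density matrix):

```python
import numpy as np

sz = np.array([[1, 0], [0, -1]], dtype=complex)
rho = np.array([[0.6, 0.3 - 0.1j],
                [0.3 + 0.1j, 0.4]])            # a generic single-qubit density matrix

for p in (0.0, 0.25, 0.5, 1.0):
    rho_out = (1 - p) * rho + p * sz @ rho @ sz
    print("p =", p)
    print(np.round(rho_out, 3))
# The diagonal entries never change, while the off-diagonal entries are scaled by
# (1 - 2p) and vanish entirely at p = 1/2.
```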
With a series of examples, I will illustrate that:
• Decoherence can be achieved in several ways that look different but have the same results.
5 Examples of decoherence
5.1 Looking inside a Mach-Zehnder interferometer
This example is physically unrealistic (in one place) but makes the decoherence phenomenon clearest
to see.
A Mach-Zehnder interferometer is depicted in Fig. 1.
Figure 1: Mach-Zehnder interferometer. Image taken from the Wikipedia article with this name.
At each point the photon can take one of two possible paths, which we denote by the states
|1i and |2i. Technically |1i means photon number in one mode and zero in the other modes, and
similarly for |2i. Also, we use |1i, |2i to first denote the two inputs to the first beam splitter, then
the two possible paths through the interferometer, and finally the two outputs of the second beam
splitter leading to the detectors.
Each beam splitter can be modeled as a unitary operator. If they are “50-50” beam splitters,
then this operator is

U_bs = (1/√2) ( 1    1 )
              ( 1   −1 ).
Thus, a photon entering in state |1⟩ will go through the first beam splitter and be transformed into the state (|1⟩ + |2⟩)/√2, corresponding to an even superposition of both paths. Assuming the paths have the same length and refractive index, it will have the same state when it reaches the second beam splitter. At this point the state will be mapped to

U_bs (|1⟩ + |2⟩)/√2 = (|1⟩ + |2⟩ + |1⟩ − |2⟩)/2 = |1⟩
and the first detector will click with probability 1.
This is very different from what we’d observe if a particle entering a 50-50 beam splitter chose
randomly which path to take. In that case, both detectors would click half the time.
The usual reason to build a Mach-Zehnder experiment, though, is not only to demonstrate the
wave nature of light, but to measure something. Suppose we put some object in one of the paths
so that light passing through it experiences a phase shift of θ. This corresponds to the unitary
transformation

U_ph ≡ ( e^{iθ}   0 )
       (   0      1 ).   (25)
Our modified experiment now corresponds to the sequence U_bs U_ph U_bs, which maps |1⟩ to

U_bs U_ph U_bs |1⟩ = U_bs U_ph (|1⟩ + |2⟩)/√2 = U_bs (e^{iθ}|1⟩ + |2⟩)/√2 = (e^{iθ}|1⟩ + e^{iθ}|2⟩ + |1⟩ − |2⟩)/2.

The probability of the first detector clicking is now |(1 + e^{iθ})/2|² = cos²(θ/2).
Now add decoherence. Suppose you find a way to look at which branch the photon is in without destroying the photon. (This part is a bit unrealistic, but if we use larger objects, then it becomes more reasonable. See the readings for a description of a two-slit experiment conducted with C_60 molecules.) If we observe it, then we will find that, regardless of the phase shift θ, each detector clicks with probability 1/2. Our measurement has caused decoherence that has destroyed the phase information in θ.
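Here is a short simulation of both versions of the experiment (a sketch, not from the notes): coherent evolution reproduces cos²(θ/2), while a which-path measurement between the beam splitters gives 1/2 for every θ.

```python
import numpy as np

Ubs = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def detector_probs(theta, measure_path=False):
    """Detector click probabilities for input |1>, optionally with a which-path measurement."""
    Uph = np.diag([np.exp(1j * theta), 1.0])
    psi = Ubs @ np.array([1.0, 0.0])              # state after the first beam splitter
    if not measure_path:
        return np.abs(Ubs @ Uph @ psi) ** 2       # coherent: interference survives
    # Which-path measurement: collapse onto path |1> or |2> with probability |psi_i|^2,
    # then average the detector statistics over the two branches.
    total = np.zeros(2)
    for branch, p in enumerate(np.abs(psi) ** 2):
        branch_state = np.zeros(2, dtype=complex)
        branch_state[branch] = 1.0
        total += p * np.abs(Ubs @ Uph @ branch_state) ** 2
    return total

for theta in (0.0, np.pi / 3, np.pi):
    print(round(theta, 3), detector_probs(theta), detector_probs(theta, measure_path=True))
# Second column: the cos^2(theta/2) interference pattern.  Third column: always (0.5, 0.5).
```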
5.2 Spin rotations in NMR

(1/2)|+⟩⟨+| + (1/2)|−⟩⟨−| = I/2.
Applying U again leaves the density matrix unchanged. Decoherence has destroyed the polarization
of the spin.
In actual NMR experiments, we have a test tube with 10²⁰ water molecules at room temperature and we are not going to measure their individual spins. Instead, suppose that two nuclear spins get close to each other and interact briefly. Suppose that the first spin is in state ρ and the second spin is maximally mixed (i.e. density matrix I/2). Suppose that they interact for a time t according to the Hamiltonian

H = (2λ/ℏ) S_z ⊗ S_z.

(Why not S⃗^(1) · S⃗^(2) ≡ Σ_{i=1}^3 S_i ⊗ S_i? This is a consequence of perturbation theory: if there is a large S_z ⊗ I + I ⊗ S_z term in the Hamiltonian, then the S_x ⊗ S_x and S_y ⊗ S_y terms are suppressed but the S_z ⊗ S_z term is not.) This is equivalent to the first spin experiencing a Hamiltonian λS_z if the second spin is in a |+⟩ state, and experiencing −λS_z if the second spin is in a |−⟩ state.
Averaging over these, the first spin is mapped to the state

ρ′ = (1/2) e^{−iλtS_z/ℏ} ρ e^{iλtS_z/ℏ} + (1/2) e^{iλtS_z/ℏ} ρ e^{−iλtS_z/ℏ},

where e^{−iλtS_z/ℏ} = ( e^{−iλt/2}       0      )
                      (      0       e^{iλt/2}  ).

Multiplying out the matrices and averaging gives

ρ′ = (      ρ_{++}       cos(λt) ρ_{+−} )
     ( cos(λt) ρ_{−+}        ρ_{−−}     ).

This doesn't completely destroy the off-diagonal terms, but attenuates them. Here we should think of λt as usually small.
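A quick numerical check of that attenuation factor (not from the notes; λ, t, and ρ are arbitrary illustrative values):

```python
import numpy as np
from scipy.linalg import expm

hbar, lam, t = 1.0, 1.0, 0.3
Sz = (hbar / 2) * np.array([[1, 0], [0, -1]], dtype=complex)

rho = np.array([[0.7, 0.2 + 0.1j],
                [0.2 - 0.1j, 0.3]])

U = expm(-1j * lam * t * Sz / hbar)            # evolution when the second spin is |+>
rho_out = 0.5 * U @ rho @ U.conj().T + 0.5 * U.conj().T @ rho @ U

print("off-diagonal scaling:", np.real(rho_out[0, 1] / rho[0, 1]),
      " cos(lambda*t) =", np.cos(lam * t))     # the two numbers agree
```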
If we average over many such interactions, then this might (skipping many steps, which you will explore on the pset) result in a process that looks like

ρ̇ = −(1/T_2) (    0      ρ_{+−} )
              ( ρ_{−+}      0    ),   (26)

where T_2 is the decoherence time, sometimes also called the dephasing time for this kind of decoherence.
If this is T_2, then is there also a T_1? Yes, T_1 refers to a different kind of decoherence. In NMR, there is typically a static magnetic field in the ẑ direction, which gives rise to a Hamiltonian of the form H = −γB S_z. From this (together with the temperature) we obtain a thermal state ρ_thermal described in Section 2.2.5. The process of thermalization is challenging to derive rigorously from the Schrödinger equation, but it is usually sufficient to model it phenomenologically. Suppose that, according to a Poisson process with rate 1/T_1, the spin is discarded and replaced with a fresh spin in the state ρ_thermal. Then we would obtain the differential equation

ρ̇ = −(1/T_1)(ρ − ρ_thermal).   (27)
Of course, there is another source of dynamics, which is the natural time evolution from the Schrödinger equation: ρ̇ = −(i/ℏ)[H, ρ]. Putting this together, we obtain the Bloch equation:

ρ̇ = −(i/ℏ)[H, ρ] − (1/T_1)(ρ − ρ_thermal) − (1/T_2) (    0      ρ_{+−} )
                                                     ( ρ_{−+}      0    ).   (28)

If we write ρ = (I + a⃗·σ⃗)/2, then (28) becomes

∂a⃗/∂t = M̂ a⃗ + b⃗,   (29)

for M̂, b⃗ to be determined on a pset.
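For intuition about how (28) behaves, here is a crude forward-Euler integration (purely illustrative and not from the notes; the values of ω_0, T_1, T_2 and the thermal polarization are arbitrary):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2)

hbar, omega0 = 1.0, 2 * np.pi      # units with hbar = 1
T1, T2 = 5.0, 1.0                  # illustrative relaxation and dephasing times
H = -0.5 * hbar * omega0 * sz
rho_thermal = (I + 1e-4 * sz) / 2  # weakly polarized thermal state

def rhs(rho):
    """Right-hand side of the Bloch equation (28)."""
    dephase = np.array([[0, rho[0, 1]], [rho[1, 0], 0]])
    return (-1j / hbar) * (H @ rho - rho @ H) - (rho - rho_thermal) / T1 - dephase / T2

rho = (I + (sx + sz) / np.sqrt(2)) / 2     # a pure state tilted between x and z
dt = 1e-3
for _ in range(int(3.0 / dt)):             # integrate to t = 3 with forward Euler
    rho = rho + dt * rhs(rho)

print(np.round(rho, 4))
# The off-diagonal elements (coherences) have decayed on the T2 timescale, while the
# z-polarization relaxes more slowly, on the T1 timescale, toward its small thermal value.
```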
(Why do I keep talking about NMR, and not ESR (electron spin resonance)? The electron’s
gyromagnetic ratio is about 657 times higher than the proton’s so its room-temperature polarization
is larger by about this amount, and signals from it are easier to detect. However, it also interacts
more promiscuously and thus often decoheres quickly, with T2 on the order of microseconds or
worse in most cases. So when you get a knee injury, your diagnosis will be made via your nuclei
and not your electrons.)
5.3 Spontaneous emission
Consider an atom with states |g⟩ and |e⟩, corresponding to "ground" and "excited." We will also consider a photon mode, i.e. a harmonic oscillator. Suppose the initial state of the system is |ψ⟩_atom ⊗ |0⟩_photon with |ψ⟩ = c_1|g⟩ + c_2|e⟩. These will interact via the Jaynes-Cummings Hamiltonian

H = ℏΩ(|g⟩⟨e| ⊗ â† + |e⟩⟨g| ⊗ â).   (30)

(For simplicity we have left out some terms that are usually in this Hamiltonian. This Hamiltonian can be derived using perturbation theory, as we discussed on a pset.) Suppose that the atom and photon field interact via this Hamiltonian for a time t. Assume that δ ≡ Ωt is small and expand the state of the system in powers of δ:

e^{−iHt/ℏ} |ψ⟩ ⊗ |0⟩ = (c_1|g⟩ + c_2|e⟩) ⊗ |0⟩ − iδ c_2 |g⟩ ⊗ |1⟩ − (δ²/2) c_2 |e⟩ ⊗ |0⟩ + O(δ³).   (31)
Now measure and we see that with probability |c2 |2 δ 2 the photon number is 1 and the atom is
in the state |gi. In this case, we observe an emitted photon and can conclude that the atom must
currently be in the state |gi. (It is tempting to conclude that we know it was previously in the
state |ei. This sort of reasoning about the past can be dangerous. In fact, all we can conclude is
that c2 must have been nonzero.)
With probability |c1 |2 + (1 − δ 2 )|c2 |2 = 1 − |c2 |2 δ 2 we observe 0 photons and the state is
again, up to O(δ 3 ) corrections. If we repeat this for long enough then we also end up in the state
|gi. This is because if we watch an atom for a long time and it never emits a photon we can
conclude that it’s probably in the ground state.
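The "watch for a long time and never see a photon" argument can be simulated directly. The sketch below (not from the notes) truncates the photon mode to zero or one photon and repeats short interactions followed by a photon-number measurement; the parameter values are arbitrary.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(8)

# Atom basis (|g>, |e>) tensored with a photon mode truncated to {|0>, |1>}.
g, e = np.array([1.0, 0.0]), np.array([0.0, 1.0])
vac, one = np.array([1.0, 0.0]), np.array([0.0, 1.0])
a = np.array([[0.0, 1.0], [0.0, 0.0]])          # truncated annihilation operator

hbar, Omega, dt = 1.0, 1.0, 0.05                # delta = Omega * dt is small
H = hbar * Omega * (np.kron(np.outer(g, e), a.T) + np.kron(np.outer(e, g), a))
U = expm(-1j * H * dt / hbar)

c1 = c2 = 1 / np.sqrt(2)
psi = np.kron(c1 * g + c2 * e, vac)             # initial state |psi> (x) |0>

P0 = np.kron(np.eye(2), np.outer(vac, vac))     # projector onto "no photon seen"
P1 = np.kron(np.eye(2), np.outer(one, one))     # projector onto "photon detected"

for step in range(2000):
    psi = U @ psi
    if rng.random() < np.real(psi.conj() @ P1 @ psi):
        print("photon detected at step", step, "-> atom is now in |g>")
        psi = np.kron(g, vac)                   # emitted photon carried away; atom in ground state
    else:
        psi = P0 @ psi                          # no click: project onto 0 photons and renormalize
        psi /= np.linalg.norm(psi)

p_excited = np.abs(np.vdot(np.kron(e, vac), psi)) ** 2
print("final excited-state population:", p_excited)
# Either a photon was eventually detected (atom reset to |g>) or the long run of
# "no click" outcomes has itself pushed the atom's state toward |g>.
```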
Therefore the product rule for density matrices can be inferred from the product rule for pure
states.
We can also derive it from observables. If we measure observable Â on the first system then this corresponds to the observable Â ⊗ I on the composite system; likewise B̂ on the second system corresponds to I ⊗ B̂ on the joint system. Their product is Â ⊗ B̂. This arises for example when the dipole moments of two spins are coupled and the Hamiltonian gets a term proportional to S⃗_1 · S⃗_2 = S_x ⊗ S_x + S_y ⊗ S_y + S_z ⊗ S_z. Let ω be the joint state of a system where the first particle is in state ρ and the second is in state σ. The expectation of Â ⊗ B̂ with respect to ω should be tr[ρÂ] tr[σB̂], since the two subsystems were prepared independently. Since this is equal to tr[ω(Â ⊗ B̂)] for all choices of Â, B̂, we must have that ω = ρ ⊗ σ.
We can define the state of system A to be the matrix ρ_A satisfying

tr[ρ_A X] = tr[ρ_AB (X ⊗ I)]   (34)

for all observables X. (It is an instructive exercise to verify that there is always a solution to (34) and that it is unique.) Similarly we can define the state of system B to be ρ_B satisfying tr[ρ_B X] = tr[ρ_AB (I ⊗ X)].
Expanding (34) in terms of matrix elements yields

Σ_{a,a′} ρ^A_{a,a′} X_{a,a′} = Σ_{a,b,a′,b′} ρ^{AB}_{ab,a′b′} X_{a,a′} δ_{b,b′} = Σ_{a,a′,b} ρ^{AB}_{ab,a′b} X_{a,a′}.   (35)

This looks like taking a trace over the B subsystem (i.e. summing over the b = b′ entries) while leaving the A system alone. For this reason we call the map ρ_AB ↦ ρ_A the "partial trace" and denote it tr_B; i.e. ρ_A = tr_B[ρ_AB]. The partial trace is the quantum analogue of the rule for marginals of probability distributions: p_X(x) = Σ_y p_{XY}(x, y).
A similar equation holds for ρ_B ≡ tr_A[ρ_AB], which can be expressed in terms of matrix elements as

ρ^B_{b,b′} = (tr_A[ρ])_{b,b′} = Σ_a ρ^{AB}_{ab,ab′}.   (37)
If A and B have dimensions d_A and d_B respectively and M_d denotes the set of d × d matrices, then tr_A : M_{d_A d_B} → M_{d_B} and tr_B : M_{d_A d_B} → M_{d_A} are linear maps defined by their action on product operators: if {|a⟩} and {|b⟩} are orthonormal bases then tr_A[|a⟩⟨a′| ⊗ |b⟩⟨b′|] = δ_{a,a′} |b⟩⟨b′| and tr_B[|a⟩⟨a′| ⊗ |b⟩⟨b′|] = δ_{b,b′} |a⟩⟨a′|.
Let's illustrate this by revisiting the example of spontaneous emission from Section 5.3. Suppose

|ψ⟩ = e^{−iHt/ℏ} (|g⟩ + |e⟩)/√2 ⊗ |0⟩,

where H = ℏΩ(|g⟩⟨e| ⊗ â† + |e⟩⟨g| ⊗ â). Since H|g, 0⟩ = 0 and H acts on the {|e, 0⟩, |g, 1⟩} subspace as a rotation, we have (writing θ = Ωt)

|ψ⟩ = (1/√2)|g, 0⟩ + (1/√2)(cos(θ)|e, 0⟩ − i sin(θ)|g, 1⟩).
The corresponding density matrix of the photon mode is

tr_atom |ψ⟩⟨ψ| = ( 1/2 + (1/2)cos²(θ)     (i/2) sin(θ)  )
                 (   −(i/2) sin(θ)       (1/2) sin²(θ)  ),   (40)

where the first row and column correspond to the photon state |0⟩ and the second to |1⟩.
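The following sketch (not from the notes) computes this partial trace numerically, with the photon mode truncated to {|0⟩, |1⟩} and an arbitrary value of θ = Ωt:

```python
import numpy as np
from scipy.linalg import expm

hbar, Omega, t = 1.0, 1.0, 0.7
theta = Omega * t

g, e = np.array([1.0, 0.0]), np.array([0.0, 1.0])
vac = np.array([1.0, 0.0])
a = np.array([[0.0, 1.0], [0.0, 0.0]])                 # photon mode truncated to {|0>, |1>}

H = hbar * Omega * (np.kron(np.outer(g, e), a.T) + np.kron(np.outer(e, g), a))
psi = expm(-1j * H * t / hbar) @ np.kron((g + e) / np.sqrt(2), vac)
rho = np.outer(psi, psi.conj())                        # |psi><psi| on atom (x) photon

# Partial trace over the atom: reshape to indices (atom, photon, atom', photon')
# and sum the entries with atom = atom'.
rho_photon = np.einsum('ajak->jk', rho.reshape(2, 2, 2, 2))

expected = np.array([[0.5 + 0.5 * np.cos(theta) ** 2, 0.5j * np.sin(theta)],
                     [-0.5j * np.sin(theta), 0.5 * np.sin(theta) ** 2]])
print(np.allclose(rho_photon, expected))               # reproduces eq. (40)
```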
6.3 Purifications
One way density matrices can arise is via subjective uncertainty; i.e. we don’t know what the state
is, but it “really” is pure. If so, we might imagine that density matrices would be useful for a
quantum theory of statistics or information, but are not essential to quantum physics. However,
density matrices also arise in settings where the overall state is known exactly. We saw this earlier
where Bob could not distinguish his half of a singlet from a uniformly random state. Conversely,
a uniformly random state cannot be distinguished from half of a singlet, with the other half in an
unknown location. This is in fact only a representative example of the general rule that any density matrix could arise by being part of an entangled state.
First, let |ψ⟩ = Σ_{i=1}^{d_A} Σ_{j=1}^{d_B} α_{i,j} |i⟩ ⊗ |j⟩. If Bob measures his system, he obtains outcome j with probability p_j ≡ Σ_i |α_{i,j}|² and the residual state for Alice is Σ_i α_{i,j}|i⟩/√p_j. Her density matrix is
Σ_{j=1}^{d_B} p_j (Σ_{i=1}^{d_A} α_{i,j}|i⟩/√p_j)(Σ_{i′=1}^{d_A} α*_{i′,j}⟨i′|/√p_j) = Σ_{i=1}^{d_A} Σ_{i′=1}^{d_A} Σ_{j=1}^{d_B} α_{i,j} α*_{i′,j} |i⟩⟨i′| = αα†,

where in the last step we recognized Σ_j α_{i,j} α*_{i′,j} = (αα†)_{i,i′}.
What if Alice measures? Working this out is a good exercise. The answer is (α† α)T .
By Theorem 2 any density matrix ρ can be written as αα† for some matrix α. It remains only to check the normalization to ensure that |ψ⟩ is a valid state:

1 = tr ρ = tr[αα†] = Σ_{i,j} |α_{i,j}|².
This means that if we produce ρ in the lab, we can never know whether the state is mixed
because of uncertainty about which pure state it is, or because it is entangled with a particle that
is out of our control.
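To close the loop, here is a sketch (not from the notes) that builds such a purification explicitly, taking α = √ρ, and checks that tracing out the second system recovers ρ:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(9)
d = 3

# A random density matrix rho (Hermitian, positive semidefinite, trace 1).
C = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = C @ C.conj().T
rho /= np.trace(rho)

alpha = sqrtm(rho)                     # one valid choice: alpha alpha^dagger = rho

# |psi> = sum_{i,j} alpha_{i,j} |i> (x) |j>, i.e. alpha flattened in row-major order.
psi = alpha.flatten()
print("normalized:", np.isclose(np.linalg.norm(psi), 1.0))

# Tracing out the second (Bob's) system recovers rho.
rho_A = np.einsum('ijkj->ik', np.outer(psi, psi.conj()).reshape(d, d, d, d))
print("purification recovers rho:", np.allclose(rho_A, rho))
```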