The Physics of
Quantum Mechanics
Daniel F. Styer
Schiffer Professor of Physics, Oberlin College
This book is in draft form: it is not polished or complete. It needs more
problems. I appreciate your comments.
Copyright © 19 August 2021 Daniel F. Styer
The copyright holder grants the freedom to copy, modify, convey, adapt,
and/or redistribute this work under the terms of the Creative Commons
Attribution Share Alike 4.0 International License. A copy of that license is
available at http://creativecommons.org/licenses/by-sa/4.0/legalcode.
You may freely download this book in pdf format from
http://www.oberlin.edu/physics/dstyer/ThePhysicsOfQM.
It is formatted to print nicely on either A4 or U.S. Letter paper. The author
receives no monetary gain from your download: it is reward enough for him
that you want to explore quantum mechanics.
Instructions for living a life:
Pay attention.
Be astonished.
Tell about it.
— Mary Oliver, Sometimes
Contents
Welcome
1. What is Quantum Mechanics About?
1.1 Quantization
1.2 Interference
1.3 Aharonov-Bohm effect
1.4 Light on the atoms
1.5 Entanglement
1.6 Quantum cryptography
1.7 What is a qubit?
2. Forging Mathematical Tools
2.1 What is a quantal state?
2.2 Amplitude
2.3 Reversal-conjugation relation
2.4 Establishing a phase convention
2.5 How can I specify a quantal state?
2.6 States for entangled systems
2.7 Are states “real”?
2.8 What is a qubit?
3. Refining Mathematical Tools
3.1 Extras
3.2 Outer products, operators, measurement
3.3 Photon polarization
3.4 Lightning linear algebra
3.5 Problems
4. Formalism
4.1 The role of formalism
4.2 The density matrix
5. Time Evolution
5.1 Operator for time evolution
5.2 Working with the Schrödinger equation
5.3 Formal properties of time evolution; Conservation laws
5.4 Magnetic moment in a uniform magnetic field
5.5 The neutral K meson
6. The Quantum Mechanics of Position
6.1 Describing states in continuum systems
6.2 How does position amplitude change with time?
6.3 What is wavefunction?
6.4 Operators and their representations; The momentum basis
6.5 Time evolution of average quantities
7. The Free Particle
7.1 Problems
8. Square Wells
8.1 What does an electron look like?
9. The Simple Harmonic Oscillator
9.1 Resume of energy eigenproblem
9.2 Solution of the energy eigenproblem: Differential equation approach
9.3 Solution of the energy eigenproblem: Operator factorization approach
9.4 Problems
10. Qualitative Solution of Energy Eigenproblems
11. Perturbation Theory
11.1 The O notation
11.2 Perturbation theory for cubic equations
11.3 Derivation of perturbation theory for the energy eigenproblem
11.4 Perturbation theory for the energy eigenproblem: Summary of results
11.5 Problems
12. Quantum Mechanics in Two and Three Dimensions
12.1 More degrees of freedom
12.2 Vector operators
12.3 Multiple particles
12.4 The phenomena of quantum mechanics
13. Angular Momentum
13.1 Solution of the angular momentum eigenproblem
13.2 Summary of the angular momentum eigenproblem
13.3 Ordinary differential equations for the $d^{(j)}_{m,m'}(\theta)$
13.4 Problems
14. Central Force Motion
14.1 Energy eigenproblem in two dimensions
14.2 Energy eigenproblem in three dimensions
14.3 Bound state energy eigenproblem for Coulombic potentials
14.4 Summary of the bound state energy eigenproblem for a Coulombic potential
14.5 Problems
15. Identical Particles
15.1 Many-particle systems in quantum mechanics
15.2 An antisymmetric basis for the helium problem
16. Breather
16.1 Scaled variables
16.2 Variational method for finding the ground state energy
16.3 Problems
17. Hydrogen
17.1 The Stark effect
18. Helium
18.1 Ground state energy of helium
19. Atoms
19.1 Addition of angular momenta
19.2 Hartree-Fock approximation
19.3 Atomic ground states
20. Molecules
20.1 The hydrogen molecule ion
20.2 Problems
20.3 The hydrogen molecule
20.4 Can we do better?
21. WKB: The Quasiclassical Approximation
21.1 The connection region
21.2 Why is WKB the “quasiclassical” approximation?
21.3 The “power law” potential
22. The Interaction of Matter and Radiation
22.1 Perturbation Theory for the Time Development Problem
22.2 Setup
22.3 Light absorption
22.4 Absorbing incoherent light
22.5 Absorbing and emitting light
22.6 Problems
23. The Territory Ahead
Appendix A Tutorial on Matrix Diagonalization
A.1 What’s in a name?
A.2 Vectors in two dimensions
A.3 Tensors in two dimensions
A.4 Tensors in three dimensions
A.5 Tensors in d dimensions
A.6 Linear transformations in two dimensions
A.7 What does “eigen” mean?
A.8 How to diagonalize a symmetric matrix
A.9 A glance at computer algorithms
A.10 A glance at non-symmetric matrices and the Jordan form
Appendix B The Spherical Harmonics
Appendix C Radial Wavefunctions for the Coulomb Problem
Appendix D Quantum Mechanics Cheat Sheet
Index
Welcome
Why would anyone want to study a book titled The Physics of Quantum
Mechanics?
Starting in the year 1900, physicists exploring the newly discovered atom
found that the atomic world of electrons and protons is not just smaller than
our familiar world of trees, balls, and automobiles, it is also fundamentally
different in character. Objects in the atomic world obey different rules from
those obeyed by a tossed ball or an orbiting planet. These atomic rules are
so different from the familiar rules of everyday physics, so counterintuitive
and unexpected, that it took more than 25 years of intense research to
uncover them.
But it is really only since the year 1990 that physicists have come to
appreciate that the rules of the atomic world (now called “quantum mechan-
ics”) are not just different from the everyday rules (now called “classical
mechanics”). The atomic rules are also far richer. The atomic rules provide
for phenomena like particle interference and entanglement that are simply
absent from the everyday world. Every phenomenon of classical mechanics
is also present in quantum mechanics, but the quantum world provides for
many additional phenomena.
Here’s an analogy: Some films are in black-and-white and some are in
color. It does not malign any black-and-white film to say that a color film
has more possibilities, more richness. In fact, black-and-white films are
simply one category of color films, because black and white are both colors.
Anyone moving from the world of only black-and-white to the world of color
is opening up the door to a new world — a world ripe with new possibilities
and new expression — without closing the door to the old world.
This same flood of richness and freshness comes from entering the quan-
tum world. It is a difficult world to enter, because we humans have no expe-
rience, no intuition, no expectations about this world. Even our language,
invented by people living in the everyday world, has no words for the new
quantal phenomena — just as a language among a race of the color-blind
would have no word for “red”.
Reading this book is not easy: it is like a color-blind student learning
about color from a color-blind teacher. The book is just one long argument,
building up the structure of a world that we can explore not through touch
or through sight or through scent, but only through logic. Those willing to
follow and to challenge the logic, to open their minds to a new world, will
find themselves richly rewarded.
The place of quantum mechanics in nature
Quantum mechanics is the framework for describing and analyzing small
things, like atoms and nuclei. Quantum mechanics also applies to big
things, like baseballs and galaxies, but when applied to big things, cer-
tain approximations become legitimate: taken together, these are called
the classical approximation to quantum mechanics, and the result is the
familiar classical mechanics.
Quantum mechanics is not only less familiar and less intuitive than
classical mechanics; it is also harder than classical mechanics. So whenever
the classical approximation is sufficiently accurate, we would be foolish not
to use it. This leads some to develop the misimpression that quantum
mechanics applies to small things, while classical mechanics applies to big
things. No. Quantum mechanics applies to all sizes, but classical mechanics
is a good approximation to quantum mechanics when it is applied to big
things.
For what size is the classical approximation good enough? That depends
on the accuracy desired. The higher the accuracy demanded, the more situ-
ations will require full quantal treatment rather than approximate classical
treatment. But as a rule of thumb, something as big as a DNA strand is
almost always treated classically, not quantum mechanically.
This situation is analogous to the relationship between relativistic me-
chanics and classical mechanics. Relativity applies always, but classical
mechanics is a good approximation to relativistic mechanics when applied
to slow things (that is, with speeds much less than light speed c). The speed
at which the classical approximation becomes legitimate depends upon the
accuracy demanded, but as a rule of thumb particles moving less than a
quarter of light speed are treated classically.
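To get a feel for this rule of thumb, one can tabulate the Lorentz factor γ = 1/√(1 − v²/c²), which sets the size of relativistic corrections. The sketch below is only an illustration: using γ − 1 as the measure of "how wrong the classical approximation is" is an assumption made here, not a criterion stated in the text, and the function name `lorentz_factor` is hypothetical.

```python
import math

def lorentz_factor(beta: float) -> float:
    """Return gamma = 1/sqrt(1 - beta^2) for a speed v = beta * c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)

# gamma - 1 is used here (as an assumption) to gauge the relativistic correction.
for beta in (0.01, 0.1, 0.25, 0.5, 0.9):
    gamma = lorentz_factor(beta)
    print(f"v = {beta:4.2f} c   gamma - 1 = {gamma - 1.0:.4f}")
```

At a quarter of light speed the correction γ − 1 is about 0.03, so whether that is "small enough" does indeed depend on the accuracy demanded.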
The difference between the quantal case and the relativistic case is that
while relativistic mechanics is less familiar, less comforting, and less ex-
pected than classical mechanics, it is no more intricate than classical me-
chanics. Quantum mechanics, in contrast, is less familiar, less comforting,
less expected, and more intricate than classical mechanics. This intricacy
makes quantum mechanics harder than classical mechanics, yes, but also
richer, more textured, more nuanced. Whether to curse or celebrate this
intricacy is your choice.
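[Figure: domains of applicability, with speed on the vertical axis (from 0, slow, up to c, fast) and size on the horizontal axis (small to big). Fast and small: relativistic quantum mechanics. Fast and big: relativistic mechanics. Slow and small: quantum mechanics. Slow and big: classical mechanics.]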
Finally, is there a framework that applies to situations that are both fast
and small? There is: it is called “relativistic quantum mechanics” and is
closely related to “quantum field theory”. Ordinary non-relativistic quan-
tum mechanics is a good approximation for relativistic quantum mechanics
when applied to slow things. Relativistic mechanics is a good approxima-
tion for relativistic quantum mechanics when applied to big things. And
classical mechanics is a good approximation for relativistic quantum me-
chanics when applied to big, slow things.
What you can expect from this book
This book introduces quantum mechanics at the third- or fourth-year
American undergraduate level. It assumes the reader knows about. . . .
This is a book about physics, not about mathematics. The word
“physics” derives from the Greek word for “nature”, so the emphasis lies in
nature, not in the mathematics we use to describe nature. Thus the book
starts with experiments about nature, then builds mathematical machinery
to describe nature, then erects a formalism (“postulates”), and then moves
on to applications, where the formalism is applied to nature and where the
understanding of both nature and formalism is deepened.
The book never abandons its focus on nature. It provides a balanced,
interwoven treatment of concepts, formalism, and applications so that each
thread reinforces the other. There are many problems at many levels of
difficulty, but no problem is there just for “make-work”: each has a “moral
to the story”. Some problems are essential to the logical development of
the subject: these are labeled (unsurprisingly) “essential”. Other problems
promote learning far better than simple reading can: these are labeled
“recommended”. Sample problems build both mathematical technique and
physical insight.
The book does not merely convey correct ideas, it also refutes miscon-
ceptions. Just to get started, I list the most important and most pernicious
misconceptions about quantum mechanics: (a) An electron has a position
but you don’t know what it is. (b) The only states are energy states. (c) The
wavefunction $\psi(\vec{x}, t)$ is “out there” in space and you could reach out and
touch it if only your fingers were sufficiently sensitive.
The object of the biographical footnotes in this book is twofold: First, to
present the briefest of outlines of the subject’s historical development, lest
anyone get the misimpression that quantum mechanics arose fully formed,
like Aphrodite from sea foam. Second, to show that the founders of quan-
tum mechanics were not inaccessible giants, but people with foibles and
strengths, with interests both inside and outside of physics, just like you
and me.
Chapter 1
What is Quantum Mechanics About?
1.1 Quantization
We are used to things that vary continuously: An oven can take on any
temperature, a recipe might call for any quantity of flour, a child can grow to
a range of heights. If I told you that an oven might take on the temperature
of 172.1 °C or 181.7 °C, but that a temperature of 173.8 °C was physically
impossible, you would laugh in my face.
So you can imagine the surprise of physicists on 14 December 1900,
when Max Planck announced that certain features of blackbody radiation
(that is, of light in thermal equilibrium) could be explained by assuming
that the energy of the light could not take on any value, but only certain
discrete values. Specifically, Planck found that light of frequency ω could
take on only the energies of
$$E = \hbar\omega\left(n + \tfrac{1}{2}\right), \quad \text{where } n = 0, 1, 2, 3, \ldots, \qquad (1.1)$$
and where the constant $\hbar$ (now called the “reduced Planck constant”) is
$$\hbar = 1.054\,571\,817 \times 10^{-34}\ \text{J s}. \qquad (1.2)$$
(I use modern terminology and the current value for $\hbar$, rather than the
terminology and value used by Planck in 1900.)
That is, light of frequency ω can have an energy of $3.5\,\hbar\omega$, and it can
have an energy of $4.5\,\hbar\omega$, but it is physically impossible for this light to have
an energy of $3.8\,\hbar\omega$. Any numerical quantity that can take on only discrete
values like this is called “quantized”. By contrast, a numerical quantity
that can take on any value is called “continuous”.
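The allowed energies of equation (1.1) are easy to tabulate. The following is a minimal sketch, not anything taken from the book: the function names `allowed_energy` and `is_allowed` are hypothetical, and the frequency value is an illustrative assumption (roughly that of visible light).

```python
HBAR = 1.054571817e-34  # reduced Planck constant in J s, equation (1.2)

def allowed_energy(n: int, omega: float) -> float:
    """Energy hbar * omega * (n + 1/2) from equation (1.1), in joules."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    return HBAR * omega * (n + 0.5)

def is_allowed(multiple: float, tol: float = 1e-9) -> bool:
    """True if multiple * hbar * omega equals (n + 1/2) * hbar * omega for some n >= 0."""
    n = multiple - 0.5
    return n > -tol and abs(n - round(n)) < tol

omega = 3.0e15  # rad/s, an illustrative value (assumption), roughly visible light
for n in range(4):
    print(n, allowed_energy(n, omega))

# The three cases discussed in the text: 3.5 and 4.5 are allowed, 3.8 is not.
print(is_allowed(3.5), is_allowed(4.5), is_allowed(3.8))
```

The energies are separated by ħω, and nothing in between ever occurs.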
The photoelectric effect supplies additional evidence that the energy of
light comes only in discrete values. And if the energy of light comes in
discrete values, then it’s a good guess that the energy of an atom comes in
discrete values too. This good guess was confirmed through investigations of
atomic spectra (where energy goes into or out of an atom via absorption or
emission of light) and through the Franck–Hertz experiment (where energy
goes into or out of an atom via collisions).
Furthermore, if the energy of an atom comes in discrete values, then
it’s a good guess that other properties of an atom — such as its magnetic
moment — also take on only discrete values. The theme of this book is
that these good guesses have all proved to be correct.
The story of Planck’s¹ discovery is a fascinating one, but it’s a difficult
and elaborate story because it involves not just quantization, but also ther-
mal equilibrium and electromagnetic radiation. The story of the discovery
of atomic energy quantization is just as fascinating, but again fraught with
intricacies. In an effort to remove the extraneous and dive deep to the heart
of the matter, we focus on the magnetic moment of an atom. We will, to the
extent possible, do a quantum-mechanical treatment of an atom’s magnetic
moment while maintaining a classical treatment of all other aspects — such
as its energy and momentum and position. (In chapter 6, “The Quantum
Mechanics of Position”, we take up a quantum-mechanical treatment of
position, momentum, and energy.)
1.1.1 The Stern-Gerlach experiment
An electric current flowing in a loop produces a magnetic moment, so it
makes sense that the electron orbiting (or whatever it does) an atomic
nucleus would produce a magnetic moment for that atom. And of course, it
also makes sense that physicists would be itching to measure that magnetic
moment.
It is not difficult to measure the magnetic moment of, say, a scout
compass. Place the magnetic compass needle in a known magnetic field
and measure the torque that acts to align the needle with the field. You
will need to measure an angle and you might need to look up a formula in
your magnetism textbook, but there is no fundamental difficulty.
1 Max Karl Ernst Ludwig Planck (1858–1947) was a German theoretical physicist
particularly interested in thermodynamics and radiation. Concerning his greatest discovery,
the introduction of quantization into physics, he wrote, “I can characterize the whole
procedure as an act of desperation, since, by nature I am peaceable and opposed to doubtful
adventures.” [Letter from Planck to R.W. Wood, 7 October 1931, quoted in J. Mehra
and H. Rechenberg, The Historical Development of Quantum Theory (Springer–Verlag,
New York, 1982) volume 1, page 49.]
Measuring the magnetic moment of an atom is a different matter. You
can’t even see an atom, so you can’t watch it twist in a magnetic field like a
compass needle. Furthermore, because the atom is very small, you expect
the associated magnetic moment to be very small, and hence very hard to
measure. The technical difficulties are immense.
These difficulties must have deterred but certainly did not stop Otto
Stern and Walter Gerlach.² They realized that the twisting of a magnetic
moment in a uniform magnetic field could not be observed for atomic-sized
magnets, and also that the moment would experience zero net force. But
they also realized that a magnetic moment in a non-uniform magnetic field
would experience a net force, and that this force could be used to measure
the magnetic moment.
[Figure: A classical magnetic moment in a non-uniform magnetic field. Labels: field $\vec{B}$, z axis, moment $\vec{\mu}$.]
A classical magnetic moment $\vec{\mu}$, situated in a magnetic field $\vec{B}$ that
points in the z direction and increases in magnitude in the z direction, is
subject to a force
$$\mu_z \frac{\partial B}{\partial z}, \qquad (1.3)$$
where µz is the z-component of the magnetic moment or, in other words,
the projection of $\vec{\mu}$ on the z axis. (If this is not obvious to you, then work
problem 1.1, “Force on a classical magnetic moment”, on page 9.)
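Expression (1.3) can be checked numerically against the "magnetic charge fiction" of problem 1.1. The sketch below is an illustration under stated assumptions: the exponential field profile, its 1 cm length scale, and all function names are hypothetical choices made here, and the field magnitude is taken to depend on z only.

```python
import math

def field_magnitude(z: float) -> float:
    """Hypothetical field profile B(z) in tesla, increasing with z (an assumption)."""
    return 1.0 * math.exp(z / 0.01)  # 1 T at z = 0, length scale 1 cm

def field_gradient(z: float) -> float:
    """Analytic dB/dz for the profile above, in T/m."""
    return field_magnitude(z) / 0.01

def two_charge_force(m: float, d_z: float, z_center: float) -> float:
    """Net z-force m*B_plus - m*B_minus on charges +m and -m a height d_z apart."""
    b_plus = field_magnitude(z_center + d_z / 2)
    b_minus = field_magnitude(z_center - d_z / 2)
    return m * b_plus - m * b_minus

def gradient_force(mu_z: float, z_center: float) -> float:
    """Force mu_z * dB/dz from expression (1.3)."""
    return mu_z * field_gradient(z_center)

d_z = 1.0e-10     # separation of atomic scale, in metres
mu_z = 9.274e-24  # about one Bohr magneton, in J/T
m = mu_z / d_z    # fictitious magnetic charge chosen so that mu_z = m * d_z

exact = two_charge_force(m, d_z, z_center=0.0)
approx = gradient_force(mu_z, z_center=0.0)
print(exact, approx, abs(exact - approx) / approx)
```

For a separation of atomic size the two forces agree to within floating-point rounding, which is the sense in which the approximation in part (b) of the problem is "more than adequate".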
2 Otto Stern (1888–1969) was a Polish-German-Jewish physicist who made contributions
to both theory and experiment. He left Germany for the United States in 1933 upon
the Nazi ascension to power. Walter Gerlach (1889–1979) was a German experimental
physicist. During the Second World War he led the physics section of the Reich Research
Council and for a time directed the German effort to build a nuclear bomb.
Stern and Gerlach used this fact to measure the z-component of the
magnetic moment of an atom. First, they heated silver in an electric “oven”.
The vaporized silver atoms emerged from a pinhole in one side of the oven,
and then passed through a non-uniform magnetic field. At the far side of
the field the atoms struck and stuck to a glass plate. The entire apparatus
had to be sealed within a good vacuum, so that collisions with nitrogen
molecules would not push the silver atoms around. The deflection of an
atom away from straight-line motion is proportional to the magnetic force,
and hence proportional to the projection µz . In this ingenious way, Stern
and Gerlach could measure the z-component of the magnetic moment of an
atom even though any single atom is invisible.
Before reading on, pause and think about what results you would expect
from this experiment.
Here are the results that I expect: I expect that an atom which happens
to enter the field with magnetic moment pointing straight up (in the z
direction) will experience a large upward force. Hence it will move upward
and stick high up on the glass-plate detector. I expect that an atom which
happens to enter with magnetic moment pointing straight down (in the −z
direction) will experience a large downward force, and hence will stick far
down on the glass plate. I expect that an atom entering with magnetic
moment tilted upward, but not straight upward, will move upward but
not as far up as the straight-up atoms, and the mirror image for an atom
entering with magnetic moment tilted downward. I expect that an atom
entering with horizontal magnetic moment will experience a net force of
zero, so it will pass through the non-uniform field undeflected.
Furthermore, I expect that when a silver atom emerges from the oven
source, its magnetic moment will be oriented randomly — as likely to point
in one direction as in any other. There is only one way to point straight up,
so I expect that very few atoms will stick high on the glass plate. There are
many ways to point horizontally, so I expect many atoms to pass through
undeflected. There is only one way to point straight down, so I expect very
few atoms to stick far down on the glass plate.³
In summary, I expect that atoms would leave the magnetic field in any of
a range of deflections: a very few with large positive deflection, more with a
small positive deflection, a lot with no deflection, some with a small negative
deflection, and a very few with large negative deflection. This continuity of
deflections reflects a continuity of magnetic moment projections.
3 To be specific, this reasoning suggests that the number of atoms with moment tilted
at angle θ relative to the z direction is proportional to sin θ, where θ ranges from 0° to
180°. You might want to prove this to yourself, but we’ll never use this result so don’t
feel compelled.
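The sin θ claim of footnote 3 can be tested with a small Monte Carlo experiment rather than a proof. This is a sketch under assumptions: random directions are generated by normalizing three independent Gaussian numbers (a standard way to sample directions uniformly over the sphere), and the function names, sample size, and bin count are hypothetical choices.

```python
import math
import random

def random_tilt_angle(rng: random.Random) -> float:
    """Tilt angle theta (radians from the z axis) of an isotropically random direction."""
    # Normalizing three independent Gaussians gives a uniformly random direction.
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        r = math.sqrt(x * x + y * y + z * z)
        if r > 0.0:
            return math.acos(max(-1.0, min(1.0, z / r)))

def tilt_histogram(n_atoms: int, n_bins: int, seed: int = 0) -> list:
    """Count atoms in equal-width theta bins spanning 0 to pi."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_atoms):
        theta = random_tilt_angle(rng)
        index = min(int(theta / math.pi * n_bins), n_bins - 1)
        counts[index] += 1
    return counts

n_atoms, n_bins = 100_000, 9
counts = tilt_histogram(n_atoms, n_bins)
for i, count in enumerate(counts):
    lo, hi = i * math.pi / n_bins, (i + 1) * math.pi / n_bins
    # Fraction predicted by a sin(theta) density: (cos(lo) - cos(hi)) / 2
    expected = (math.cos(lo) - math.cos(hi)) / 2
    print(f"{math.degrees(lo):5.0f} to {math.degrees(hi):3.0f} deg   "
          f"sampled {count / n_atoms:.4f}   sin law {expected:.4f}")
```

The bins near 90° (horizontal moments) are the most heavily populated and the bins near 0° and 180° the least, as the footnote states.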
In fact, however, this is not what happens at all! The projection µz
does not take on a continuous range of values. Instead, it is quantized and
takes on only two values, one positive and one negative. Those two values
are called µz = ±µB, where µB, the so-called “Bohr magneton”, has the
measured value of
$$\mu_B = 9.274\,010\,078 \times 10^{-24}\ \text{J/T}, \qquad (1.4)$$
with an uncertainty of 3 in the last decimal digit.
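As a cross-check on the value in equation (1.4), the Bohr magneton can be computed from other constants through the standard expression µB = eħ/(2mₑ). That expression is not given in this section, so treat it as outside information supplied here. The constants below are CODATA 2018 values, and the function name is hypothetical.

```python
# CODATA 2018 values in SI units
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact by definition)
HBAR = 1.054571817e-34               # J s, equation (1.2)
ELECTRON_MASS = 9.1093837015e-31     # kg

def bohr_magneton() -> float:
    """Return e * hbar / (2 * m_e) in J/T (standard expression, not from this section)."""
    return ELEMENTARY_CHARGE * HBAR / (2.0 * ELECTRON_MASS)

mu_b = bohr_magneton()
quoted = 9.274010078e-24             # J/T, equation (1.4)
print(f"computed {mu_b:.9e} J/T, quoted {quoted:.9e} J/T")
```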
[Figure: Distribution of µz. Two panels, “Expected” and “Actual”, each with a vertical µz axis; the marked values are +µB, 0, and −µB.]
The Stern-Gerlach experiment was initially performed with silver atoms
but has been repeated with many other types of atoms. When nitrogen is
used, the projection µz takes on one of the four quantized values of +3µB,
+µB, −µB, or −3µB. When sulfur is used, it takes on one of the five
quantized values of +4µB, +2µB, 0, −2µB, or −4µB. For no atom do the
values of µz take on the broad continuum of my classical expectation. For
all atoms, the projection µz is quantized.
Problems
1.1 Force on a classical magnetic moment
The force on a classical magnetic moment is most easily calculated
using “magnetic charge fiction”: Consider the magnetic moment
to consist of two “magnetic charges” of magnitude +m and −m,
separated by the position vector $\vec{d}$ running from −m to +m. The
magnetic moment is then $\vec{\mu} = m\vec{d}$.
a. Use $B_+$ for the magnitude of the magnetic field at +m, and
$B_-$ for the magnitude of the magnetic field at −m. Show that
the net force on the magnetic moment is in the z direction with
magnitude $mB_+ - mB_-$.
b. Use $d_z$ for the z-component of the vector $\vec{d}$. Show that to high
accuracy
$$B_+ = B_- + \frac{\partial B}{\partial z}\, d_z.$$
Surely, for distances of atomic scale, this accuracy is more
than adequate.
c. Derive expression (1.3) for the force on a magnetic moment.
1.1.2 The conundrum of projections
I would expect the projections µz of a silver atom to take on a continuous
range of values. But in fact, these values are quantized: Whenever µz
is measured, it turns out to be either +µB or −µB , and never anything
else. This is counterintuitive and unexpected, but we can live with the
counterintuitive and unexpected — it happens all the time in politics.
However, this fact of quantization appears to result in a logical con-
tradiction, because there are many possible axes upon which the magnetic
moment can be projected. The figures on the next page make it clear that
it is impossible for any vector to have a projection of either ±µB on all
axes!
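Because if the projection of $\vec{\mu}$ on the z axis is +µB . . .
[Figure: the arrow $\vec{\mu}$ with its projection +µB marked on the z axis.]
. . . then the projection of $\vec{\mu}$ on this second axis must be more than +µB . . .
[Figure: the same arrow $\vec{\mu}$ projected on a second axis.]
. . . while the projection of $\vec{\mu}$ on this third axis must be less than +µB.
[Figure: the same arrow $\vec{\mu}$ projected on a third axis.]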
Whenever we measure the magnetic moment, projected onto any axis,
the result is either +µB or −µB. Yet it is impossible for the projection
of any classical arrow on all axes to be either +µB or −µB! This seeming
contradiction is called “the conundrum of projections”. We can live with
the counterintuitive, the unexpected, the strange, but we cannot live with
a logical contradiction. How can we resolve it?
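The geometric half of the conundrum is easy to confirm with numbers: the projection of a fixed classical arrow varies continuously as the axis is rotated. In this minimal sketch the particular arrow, the choice of axes in the x-z plane, and the function name `projection` are all illustrative assumptions.

```python
import math

MU_B = 1.0  # work in units of the Bohr magneton

def projection(vector, axis_angle_deg: float) -> float:
    """Project a vector (x and z components) onto an axis in the x-z plane.

    The axis makes the angle axis_angle_deg with the z axis, tilted toward +x.
    """
    angle = math.radians(axis_angle_deg)
    axis_x, axis_z = math.sin(angle), math.cos(angle)
    v_x, v_z = vector
    return v_x * axis_x + v_z * axis_z

# A hypothetical classical arrow chosen so that its projection on z is +mu_B.
arrow = (0.8 * MU_B, 1.0 * MU_B)

for angle in (0, 20, 40, 60, 90, 120, 180):
    print(f"axis at {angle:3d} deg from z: projection = {projection(arrow, angle):+.3f} mu_B")
```

Some axes give more than +µB and others less, and no choice of arrow can avoid this: a classical vector cannot project to ±µB on every axis.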
The resolution comes not from meditating on the question, but from
experimenting about it. Let us actually measure the projection on one
axis, and then on a second. To do this easily, we modify the Stern-Gerlach
apparatus and package it into a box called a “Stern-Gerlach analyzer”. This
box consists of a Stern-Gerlach apparatus followed by “pipes” that channel
the outgoing atoms into horizontal paths.⁴ This chapter treats only silver
atoms, so we use analyzers with two exit ports.
[Figure: a Stern-Gerlach apparatus packaged into a Stern-Gerlach analyzer.]
An atom enters a vertical analyzer through the single hole on the left.
If it exits through the upper hole on the right (the “+ port”) then the
outgoing atom has µz = +µB. If it exits through the lower hole on the
right (the “− port”) then the outgoing atom has µz = −µB.
[Figure: a vertical analyzer with its + port labeled µz = +µB and its − port labeled µz = −µB.]
4 In general, the “pipes” will manipulate the atoms through electromagnetic fields, not
through touching. One way to make such “pipes” is to insert a second Stern-Gerlach
apparatus, oriented upside-down relative to the first. The atoms with µz = +µB , which
had experienced an upward force in the first half, will experience an equal downward
force in the second half, and the net impulse delivered will be zero. But whatever their
manner of construction, the pipes must not change the magnetic moment of an atom
passing through them.
1.1.3 Two vertical analyzers
In order to check the operation of our analyzers, we do preliminary
experiments. Atoms are fed into a vertical analyzer. Any atom exiting from the
+ port is then channeled into a second vertical analyzer. That atom exits
from the + port of the second analyzer. This makes sense: the atom had
µz = +µB when exiting the first analyzer, and the second analyzer confirms
that it has µz = +µB .
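[Figure: two vertical analyzers in series. Atoms leaving the + port of the first (µz = +µB) enter the second: all exit its + port, none its − port. Atoms leaving the − port of the first (µz = −µB) are ignored.]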
Furthermore, if an atom exiting from the − port of the first analyzer
is channeled into a second vertical analyzer, then that atom exits from the
− port of the second analyzer.
1.1.4 One vertical and one upside-down analyzer
Atoms are fed into a vertical analyzer. Any atom exiting from the + port is
then channeled into a second analyzer, but this analyzer is oriented upside-
down. What happens? If the projection on an upward-pointing axis is +µB
(that is, µz = +µB ), then the projection on a downward-pointing axis is
−µB (we write this as $\mu_{(-z)} = -\mu_B$). So I expect that these atoms will
emerge from the − port of the second analyzer (which happens to be the
higher port). And this is exactly what happens.
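[Figure: a vertical analyzer followed by an upside-down analyzer. Atoms leaving the + port of the first (µz = +µB) enter the second: all exit its − port (the higher one), none its + port. Atoms leaving the − port of the first (µz = −µB) are ignored.]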
Similarly, if an atom exiting from the − port of the first analyzer is
channeled into an upside-down analyzer, then that atom emerges from the
+ port of the second analyzer.
1.1.5 One vertical and one horizontal analyzer
Atoms are fed into a vertical analyzer. Any atom exiting from the + port is
then channeled into a second analyzer, but this analyzer is oriented horizon-
tally. The second analyzer doesn’t measure the projection µz , it measures
the projection µx . What happens in this case? Experiment shows that the
atoms emerge randomly: half from the + port, half from the − port.
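[Figure: a vertical analyzer followed by a horizontal analyzer, with x, y, z axes shown. Atoms with µz = +µB enter the horizontal analyzer: half exit with µx = +µB and half with µx = −µB. Atoms with µz = −µB are ignored.]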
This makes some sort of sense. If a classical magnetic moment were
vertically oriented, it would have µx = 0, and such a classical moment
would go straight through a horizontal Stern-Gerlach analyzer. We’ve seen
that atomic magnetic moments never go straight through. If you “want” to
go straight but are forced to turn either left or right, the best you can do is
turn left half the time and right half the time. (Don’t take this paragraph
literally. . . atoms have no personalities and they don’t “want” anything.
But it is a useful mnemonic.)
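What "half from the + port, half from the − port" means in practice is a statistical statement about many atoms. The toy simulation below encodes only the experimental result quoted above as a rule: each atom's exit port is an independent fair coin flip. It is a sketch with hypothetical function names and does not model the apparatus itself.

```python
import random

def horizontal_exit_port(rng: random.Random) -> str:
    """Exit port for an atom with mu_z = +mu_B entering a horizontal analyzer.

    Toy rule taken from the quoted experimental result: "+" or "-" with
    probability 1/2 each, independently for every atom.
    """
    return "+" if rng.random() < 0.5 else "-"

def run_experiment(n_atoms: int, seed: int = 1) -> dict:
    """Send n_atoms through the horizontal analyzer and count exits per port."""
    rng = random.Random(seed)
    counts = {"+": 0, "-": 0}
    for _ in range(n_atoms):
        counts[horizontal_exit_port(rng)] += 1
    return counts

counts = run_experiment(10_000)
print(counts)
```

The two counts come out close to equal but not exactly equal, as expected for random exits.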
1.1.6 One vertical and one backwards horizontal analyzer
Perform the same experiment as above (section 1.1.5), except insert the
horizontal analyzer in the opposite sense, so that it measures the projection
on the negative x axis rather than the positive x axis. Again, half the atoms
emerge from the + port, and half emerge from the − port.
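[Figure: the set-up of section 1.1.5 with the horizontal analyzer reversed,
drawn with x, y, and z axes. Atoms leaving the first analyzer with
µz = −µB are ignored. Of the atoms leaving with µz = +µB, half exit the
second analyzer with µ(−x) = +µB and half with µ(−x) = −µB.]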
1.1.7 One horizontal and one vertical analyzer
A +x analyzer followed by a +z analyzer is the same apparatus as above
(section 1.1.6), except that both analyzers are rotated as a unit by 90° about
the y axis. So of course it has the same result: half the atoms emerge from
the + port, and half emerge from the − port.
[Figure: a horizontal analyzer followed by a vertical analyzer, drawn with
x, y, and z axes. Port labels: µx = +µB and µx = −µB on the first analyzer,
µz = +µB and µz = −µB on the second.]
1.1.8 Three analyzers
Atoms are fed into a vertical analyzer. Any atom exiting from the + port
is then channeled into a horizontal analyzer. Half of these atoms exit from
the + port of the horizontal analyzer (see section 1.1.5), and these atoms
are channeled into a third analyzer, oriented vertically. What happens at
the third analyzer?
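[Figure: three analyzers in sequence (vertical, horizontal, vertical), drawn
with x, y, and z axes. Port labels: µz = +µB, µz = −µB, µx = +µB, µx = −µB;
question marks indicate the outcome still to be determined.]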
There are two ways to think of this: (I) When the atom emerged from
the + port of the first analyzer, it was determined to have µz = +µB .
When that same atom emerged from the + port of the second analyzer,
it was determined to have µx = +µB . Now we know two projections
of the magnetic moment. When it enters the third analyzer, it still has
µz = +µB , so it will emerge from the + port. (II) The last two analyzers
in this sequence are a horizontal analyzer followed by a vertical analyzer,
and from section 1.1.7 we know what happens in this case: a 50/50 split.
That will happen in this case, too.
So, analysis (I) predicts that all the atoms entering the third analyzer
will exit through the + port and none through the − port. Analysis (II)
predicts that half the atoms will exit through the + port and half through
the − port.
Experiment shows that analysis (II) gives the correct result. But what
could possibly be wrong with analysis (I)? Let’s go through line by line:
“When the atom emerged from the + port of the first analyzer, it was
determined to have µz = +µB .” Nothing wrong here — this is what an
analyzer does. “When that same atom emerged from the + port of the
second analyzer, it was determined to have µx = +µB .” Ditto. “Now we
know two projections of the magnetic moment.” This has got to be the
problem. To underscore that problem, look at the figure below.
[Figure: a moment vector µ with projection +µB on the z axis and
projection +µB on the x axis.]
If an atom did have both µz = +µB and µx = +µB, then the projection
on an axis rotated 45° from the vertical would be µ45° = +√2 µB. But
the Stern-Gerlach experiment assures us that whenever µ45° is measured,
the result is either +µB or −µB, and never +√2 µB. In summary, it is not
possible for a moment to have a projection on both the z axis and on the
x axis. Passing to the fourth sentence of analysis (I) — “When the atom
enters the third analyzer, it still has µz = +µB , so it will emerge from the
+ port” — we immediately see the problem. The atom emerging from the
+ port of the second analyzer does not have µz = +µB — it doesn’t have
a projection on the z axis at all.
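The arithmetic behind this contradiction is easy to check. The sketch below treats the moment as a classical arrow (the very picture being refuted), gives it µz = µx = +µB, and projects it onto an axis tilted 45° from the vertical. The function name `classical_projection` and the choice of units (µB = 1) are illustrative assumptions, not notation from this book.

```python
import math

MU_B = 1.0  # work in units of the Bohr magneton (an assumed convention)

def classical_projection(mu_z, mu_x, theta_deg):
    """Projection of a classical arrow with components (mu_x, mu_z) onto an
    axis in the x-z plane tilted by theta_deg from the vertical (z) axis."""
    theta = math.radians(theta_deg)
    return mu_z * math.cos(theta) + mu_x * math.sin(theta)

mu_45 = classical_projection(+MU_B, +MU_B, 45.0)
print(mu_45)               # about 1.414, that is, sqrt(2) in units of mu_B
print(abs(mu_45) > MU_B)   # True: larger than any value ever measured
```

Because every measured projection is either +µB or −µB, a result of magnitude √2 µB rules out the classical arrow with two definite projections.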
Because it’s easy to fall into misconceptions, let me emphasize what I’m
saying and what I’m not saying:
I’m saying that if an atom has a value for µx , then it
doesn’t have a value for µz .
I’m not saying that the atom has a value for µz but no one
knows what it is.
I’m not saying that the atom has a value for µz but that
value is changing rapidly.
I’m not saying that the atom has a value for µz but that
value is changing unpredictably.
I’m not saying that a random half of such atoms have the
value µz = +µB and the other half have the value µz =
−µB .
I’m not saying that the atom has a value for µz which will
be disturbed upon measurement.
The atom with a value for µx does not have a value for µz in the same way
that love does not have a color.
This is a new phenomenon, and it deserves a new name. That name
is “indeterminacy”. This is perhaps not the best name, because it might
suggest, incorrectly, that an atom with a value for µx has a value for µz and
we merely haven’t yet determined what that value is. The English language
was invented by people who didn’t understand quantum mechanics, so it is
not surprising that there are no perfectly appropriate names for quantum
mechanical phenomena. This is a defect in our language, not a defect in
quantum mechanics or in our understanding of quantum mechanics, and it
is certainly not a defect in nature.5
How can a vector have a projection on one axis but not on another?
It is the job of the rest of this book to answer that question, 6 but one
thing is clear already: The visualization of an atomic magnetic moment as
a classical arrow must be wrong.
5 In exactly the same manner, the name “orange” applies to light within the wavelength
range 590–620 nm and the name “red” applies to light within the wavelength range 620–
740 nm, but the English language has no word to distinguish the wavelength range
1590–1620 nm from the wavelength range 1620–1740 nm. This is not because optical
light is “better” or “more deserving” than infrared light. It is due merely to the accident
that our eyes detect optical light but not infrared light.
6 Preview: In quantum mechanics, the magnetic moment is represented mathematically
not by a vector but by a vector operator.

1.1.9 The upshot
We escape from the conundrum of projections through probability. If an
atom has µz = +µB , and if the projection on some other axis is measured,
then the result cannot be predicted with certainty: we instead give proba-
bilities for the various results. If the second analyzer is rotated by angle θ
relative to the vertical, the probability of emerging from the + port of the
second analyzer is called P+ (θ).
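[Figure: a vertical analyzer followed by an analyzer tilted by angle θ from
the z axis. Atoms with µz = +µB enter the second analyzer and exit with
either µθ = +µB or µθ = −µB; atoms with µz = −µB are not used.]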
We already know some special values: from section 1.1.3, P+(0°) = 1;
from section 1.1.5, P+(90°) = 1/2; from section 1.1.4, P+(180°) = 0; from
section 1.1.6, P+(270°) = 1/2; from section 1.1.3, P+(360°) = 1. It is not
hard to guess the curve that interpolates between these values:
P+(θ) = cos²(θ/2), (1.5)
and experiment confirms this guess.
[Figure: plot of P+(θ) versus θ, for θ from 0° to 360°.]
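As a quick check that equation (1.5) reproduces the special values listed above, here is a minimal sketch. The function name `p_plus` and the use of degrees as the input unit are assumptions made for illustration.

```python
import math

def p_plus(theta_deg):
    """Probability that an atom with mu_z = +mu_B exits the + port of an
    analyzer tilted by theta_deg from the vertical, per equation (1.5)."""
    return math.cos(math.radians(theta_deg) / 2) ** 2

# The five special cases quoted from sections 1.1.3 through 1.1.6.
for theta in (0, 90, 180, 270, 360):
    print(theta, round(p_plus(theta), 6))
```

The printed values agree, to rounding, with 1, 1/2, 0, 1/2, 1.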
Problems
1.2 Exit probabilities (essential problem)
a. An analyzer is tilted from the vertical by angle α. An atom
leaving its + port is channeled into a vertical analyzer. What
is the probability that this atom emerges from the + port?
The − port? (Clue: Use the “rotate as a unit” concept intro-
duced in section 1.1.7.)
b. An atom exiting the − port of a vertical analyzer behaves
exactly like one exiting the + port of an upside-down analyzer
(see section 1.1.4). Such an atom is channeled into an analyzer
tilted from the vertical by angle β. What is the probability
that this atom emerges from the + port? The − port?
c. An analyzer is tilted from the vertical by angle γ. An atom
leaving its − port is channeled into a vertical analyzer. What
is the probability that this atom emerges from the + port?
The − port?
[Figure: an analyzer tilted from the vertical (z axis) by angle γ.]
1.3 Multiple analyzers

An atom with µz = +µB is channeled through the following line
of three Stern-Gerlach analyzers.
[Figure: a line of three analyzers, A, B, and C, with the angles α, β,
and γ marked.]
Find the probability that it emerges from (a) the − port of analyzer
A; (b) the + port of analyzer B; (c) the + port of analyzer C; (d)
the − port of analyzer C.
1.4 Properties of the P+ (θ) function
a. An atom exits the + port of a vertical analyzer; that is, it has
µz = +µB . Argue that the probability of this atom exiting
from the − port of a θ analyzer is the same as the probability
of it exiting from the + port of a (180◦ − θ) analyzer.
b. Conclude that the P+ (θ) function introduced in section 1.1.9
must satisfy
P+ (θ) + P+ (180◦ − θ) = 1.
c. Does the experimental result (1.5) satisfy this condition?
1.2 Interference
There are more quantum mechanical phenomena to uncover. To support
our exploration, we build a new experimental device called the “analyzer
loop”.7 This is nothing but a Stern-Gerlach analyzer followed by “piping”
that channels the two exit paths together again.8
[Figure: a Stern-Gerlach analyzer followed by piping that rejoins its two
exit paths, packaged into a single box (the analyzer loop).]
The device must be constructed to high precision, so that there can be
no way to distinguish whether the atom passed through by way of the top
path or the bottom path. For example, the two paths must have the same
length: If the top path were longer, then an atom going through via the top
path would take more time, and hence there would be a way to tell which
way the atom passed through the analyzer loop.
7 We build it in our minds. The experiments described in this section have never been
performed exactly as described here, although researchers are getting close. [See Shi-
mon Machluf, Yonathan Japha, and Ron Folman, “Coherent Stern–Gerlach momentum
splitting on an atom chip” Nature Communications 4 (9 September 2013) 2424.] We
know the results that would come from these experiments because conceptually parallel
(but more complex!) experiments have been performed on photons, neutrons, atoms,
and molecules.
8 If you followed the footnote on page 12, you will recall that these “pipes” manipulate
atoms through electromagnetic fields, not through touching. One way to make them
would be to insert two more Stern-Gerlach apparatuses, the first one upside-down and
the second one rightside-up relative to the initial apparatus. But whatever the manner of
their construction, the pipes must not change the magnetic moment of an atom passing
through them.
In fact, the analyzer loop is constructed so precisely that it doesn’t
change the character of the atom passing through it. If the atom enters
with µz = +µB , it exits with µz = +µB . If it enters with µx = −µB , it exits
with µx = −µB . If it enters with µ17◦ = −µB , it exits with µ17◦ = −µB .
It is hard to see why anyone would want to build such a device, because
they’re expensive (due to the precision demands), and they do absolutely
nothing!
Once you made one, however, you could convert it into something useful.
For example, you could insert a piece of metal blocking path a. In that case,
all the atoms exiting would have taken path b, so (if the analyzer loop were
oriented vertically) all would emerge with µz = −µB .
Using the analyzer loop, we set up the following apparatus: First, chan-
nel atoms with µz = +µB into a horizontal analyzer loop.9 Then, channel
the atoms emerging from that analyzer loop into a vertical analyzer. Ignore
atoms emerging from the + port of the vertical analyzer and look for atoms
emerging from the − port.
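[Figure: input atoms with µz = +µB enter a horizontal analyzer loop with
paths a and b, then a vertical analyzer. Atoms leaving the + port are
ignored; atoms leaving the − port (µz = −µB) are the output.]
We execute three experiments with this set-up: first we pass atoms
through when path a is blocked, then when path b is blocked, finally when
neither path is blocked.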
1.2.1 Path a blocked
(1) Atoms enter the analyzer loop with µz = +µB.
(2) Half of them attempt path a, and end up impaled on the blockage.
(3) The other half take path b, and emerge from the analyzer loop with
µx = −µB.
(4) Those atoms then enter the vertical analyzer. Similar to the result
of section 1.1.7, half of these atoms emerge from the + port and are
ignored. Half of them emerge from the − port and are counted.
(5) The overall probability of passing through the set-up is 1/2 × 1/2 = 1/4.
If you perform this experiment, you will find that this analysis is correct
and that these results are indeed obtained.
9 To make sure that all of these atoms have µz = +µB, they are harvested from the
+ port of a vertical analyzer.
1.2.2 Path b blocked
(1) Atoms enter the analyzer loop with µz = +µB.
(2) Half of them attempt path b, and end up impaled on the blockage.
(3) The other half take path a, and emerge from the analyzer loop with
µx = +µB.
(4) Those atoms then enter the vertical analyzer. Exactly as in sec-
tion 1.1.7, half of these atoms emerge from the + port and are ignored.
Half of them emerge from the − port and are counted.
(5) The overall probability of passing through the set-up is 1/2 × 1/2 = 1/4.
Once again, experiment confirms these results.
1.2.3 Neither path blocked
Here, I have not just one, but two ways to analyze the experiment:
Analysis I:
(1) An atom passes through the set-up either via path b or via path a.
(2) From section 1.2.1, the probability of passing through via path b is 1/4.
(3) From section 1.2.2, the probability of passing through via path a is 1/4.
(4) Thus the probability of passing through the entire set-up is 1/4 + 1/4 = 1/2.
Analysis II:
(1) Because “the analyzer loop is constructed so precisely that it doesn’t
change the character of the atom passing through it”, the atom emerges
from the analyzer loop with µz = +µB.
(2) When such atoms enter the vertical analyzer, all of them emerge
through the + port. (See section 1.1.3.)
(3) Thus the probability of passing through the entire set-up is zero.
These two analyses cannot both be correct. Experiment confirms the
result of analysis II, but what could possibly be wrong with analysis I?
Item (2) is already confirmed through the experiment of section 1.2.1,
item (3) is already confirmed through the experiment of section 1.2.2, and
don’t tell me that I made a mistake in the arithmetic of item (4). The only
thing left is item (1): “An atom passes through the set-up either via path b
or via path a.” This simple, appealing, common-sense statement must be
wrong!
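The three experimental results can be reproduced with a short calculation, provided one borrows a rule that this chapter has not yet introduced: each path carries an amplitude, amplitudes for open paths add, and the probability is the square of the total. The sketch below assumes the standard two-component description of a spin-1/2 moment; the names `amp` and `path_amplitude` and the state tuples are illustrative choices, not notation from this book.

```python
import math

S = 1 / math.sqrt(2)
# States written as (component along z+, component along z-); this
# representation is the usual textbook convention, assumed here.
Z_PLUS, Z_MINUS = (1.0, 0.0), (0.0, 1.0)
X_PLUS, X_MINUS = (S, S), (S, -S)

def amp(final, initial):
    """Amplitude <final|initial> for real two-component states."""
    return final[0] * initial[0] + final[1] * initial[1]

def path_amplitude(path_state, start=Z_PLUS, end=Z_MINUS):
    """Amplitude to go from start, through one path of the loop, to end."""
    return amp(end, path_state) * amp(path_state, start)

amp_a = path_amplitude(X_PLUS)    # path a is associated with mu_x = +mu_B
amp_b = path_amplitude(X_MINUS)   # path b is associated with mu_x = -mu_B

p_a_blocked = amp_b ** 2                  # only path b open (section 1.2.1)
p_b_blocked = amp_a ** 2                  # only path a open (section 1.2.2)
p_both_open = (amp_a + amp_b) ** 2        # add amplitudes, then square
p_analysis_1 = p_a_blocked + p_b_blocked  # adding probabilities instead

print(p_a_blocked, p_b_blocked, p_both_open, p_analysis_1)
```

With these assumptions the two single-path probabilities come out as 1/4 each, the both-paths-open probability is 0 (as experiment shows), and the sum of probabilities used in analysis I is 1/2.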
Just a moment ago, the analyzer loop seemed like a waste of money and
skill. But in fact, a horizontal analyzer loop is an extremely clever way of
correlating the path through the analyzer loop with the value of µx : If the
atom has µx = +µB , then it takes path a. If the atom has µx = −µB , then
it takes path b. If the atom has µz = +µB , then it doesn’t have a value of
µx and hence it doesn’t take a path.
Notice again what I’m saying: I’m not saying the atom takes one path
or the other but we don’t know which. I’m not saying the atom breaks
into two pieces and each half traverses its own path. I’m saying the atom
doesn’t take a path. The µz = +µB atoms within the horizontal analyzer
loop do not have a position in the same sense that love does not have a
color. If you think of an atom as a smaller, harder version of a classical
marble, then you’re visualizing the atom incorrectly.
Once again, our experiments have uncovered a phenomenon that doesn’t
happen in daily life, so there is no word for it in conventional language.10
Sometimes people say that “the atom takes both paths”, but that phrase
does not really get to the heart of the new phenomenon. I have asked
students to invent a new word to represent this new phenomenon, and
my favorite of their many suggestions is “ambivate” — a combination of
ambulate and ambivalent — as in “an atom with µz = +µB ambivates
through both paths of a horizontal analyzer loop”. While this is a great
word, it hasn’t caught on. The conventional name for this phenomenon is
“quantal interference”.
10 In exactly the same way, there was no need for the word “latitude” or the word
“longitude” when it was thought that the Earth was flat. The discovery of the near-
spherical character of the Earth forced our forebears to invent new words to represent
these new concepts. Words do not determine reality; instead reality determines which
words are worth inventing.
The name “quantal interference” comes from a (far-fetched) analogy
with interference in wave optics. Recall that in the two-slit interference of
light, there are some observation points that have a light intensity if light
passes through slit a alone, and the same intensity if light passes through
slit b alone, but zero intensity if light passes through both slits. This is
called “destructive interference”. There are other observation points that
have a light intensity if the light passes through slit a alone, and the same
intensity if light passes through slit b alone, but four times that intensity if
light passes through both slits. This is called “constructive interference”.
But in fact the word “interference” is a poor name for this phenomenon as
well. It’s adapted from a football term, and football players never (or at
least never intentionally) run “constructive interference”.
One last word about language: The device that I’ve called the “analyzer
loop” is more conventionally called an “interferometer”. I didn’t use that
name at first because that would have given away the ending.
Back on page 6 I said that, to avoid unnecessary distraction, this chapter
would “to the extent possible, do a quantum-mechanical treatment of an
atom’s magnetic moment while maintaining a classical treatment of all
other aspects — such as its energy and momentum and position”. You
can see now why I put in that qualifier “to the extent possible”: we have
found that within an interferometer, a quantum-mechanical treatment of
magnetic moment demands a quantum-mechanical treatment of position as
well.
1.2.4 Sample Problem: Constructive interference
Consider the same set-up as on page 23, but now ignore atoms leaving the
− port of the vertical analyzer and consider as output atoms leaving the
+ port. What is the probability of passing through the set-up when path
a is blocked? When path b is blocked? When neither path is blocked?
Solution: 1/4; 1/4; 1. Because 1/4 + 1/4 < 1, this is an example of constructive
interference.
1.2.5 Sample Problem: Two analyzer loops
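[Figure: input atoms with µz = +µB pass through horizontal analyzer
loop 1 (paths 1a and 1b), then vertical analyzer loop 2 (paths 2a and 2b),
to the output.]
Atoms with µz = +µB are channeled through a horizontal analyzer loop
(number 1), then a vertical analyzer loop (number 2). If all paths are open,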
100% of the incoming atoms exit from the output. What percentage of the
incoming atoms leave from the output if the following paths are blocked?
(a) 2a (d) 1b
(b) 2b (e) 1b and 2a
(c) 1a (f) 1a and 2b
Solution: Only two principles are needed to solve this problem: First,
an atom leaving an unblocked analyzer loop leaves in the same condition
it had when it entered. Second, an atom leaving an analyzer loop with
one path blocked leaves in the condition specified by the path that it took,
regardless of the condition it had when it entered. Use of these principles
gives the solution in the table below. Notice that in changing
from situation (a) to situation (e), you add blockage, yet you increase the
output!
paths       input      path taken           intermediate  path taken           output     probability of
blocked     condition  through #1           condition     through #2           condition  input → output
none        µz = +µB   “both”               µz = +µB      a                    µz = +µB   100%
2a          µz = +µB   “both”               µz = +µB      100% blocked at a    none       0%
2b          µz = +µB   “both”               µz = +µB      a                    µz = +µB   100%
1a          µz = +µB   50% blocked at a     µx = −µB      “both”               µx = −µB   50%
                       50% pass through b
1b          µz = +µB   50% pass through a   µx = +µB      “both”               µx = +µB   50%
                       50% blocked at b
1b and 2a   µz = +µB   50% pass through a   µx = +µB      25% blocked at a     µz = −µB   25%
                       50% blocked at b                   25% pass through b
1a and 2b   µz = +µB   50% blocked at a     µx = −µB      25% pass through a   µz = +µB   25%
                       50% pass through b                 25% blocked at b
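The two principles used in this solution can be turned into a small calculation. In the sketch below an unblocked loop leaves the state alone, and a loop with one path blocked projects the state onto the condition of the open path. Representing conditions as two-component vectors and blockage as a projection is an assumption borrowed from the standard formalism, and the names `STATES`, `project`, `pass_loop`, and `output_probability` are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
# Condition associated with each path, written as (z+ part, z- part).
STATES = {
    "1a": (S, S),      # loop 1 (horizontal), path a: mu_x = +mu_B
    "1b": (S, -S),     # loop 1 (horizontal), path b: mu_x = -mu_B
    "2a": (1.0, 0.0),  # loop 2 (vertical), path a: mu_z = +mu_B
    "2b": (0.0, 1.0),  # loop 2 (vertical), path b: mu_z = -mu_B
}

def project(state, onto):
    """Part of `state` lying along the unit vector `onto`."""
    overlap = onto[0] * state[0] + onto[1] * state[1]
    return (overlap * onto[0], overlap * onto[1])

def pass_loop(state, loop, blocked):
    """Send a state through one analyzer loop with some paths blocked."""
    open_paths = [loop + p for p in "ab" if loop + p not in blocked]
    if len(open_paths) == 2:
        return state              # an unblocked loop changes nothing
    if not open_paths:
        return (0.0, 0.0)         # both paths blocked: nothing gets out
    return project(state, STATES[open_paths[0]])

def output_probability(blocked):
    state = (1.0, 0.0)            # input atoms have mu_z = +mu_B
    for loop in ("1", "2"):
        state = pass_loop(state, loop, blocked)
    return state[0] ** 2 + state[1] ** 2

for case in ([], ["2a"], ["2b"], ["1a"], ["1b"], ["1b", "2a"], ["1a", "2b"]):
    print(case, round(output_probability(set(case)), 3))
```

Under these assumptions the printed probabilities match the last column of the table, including the increase from 0% to 25% when blockage 1b is added to blockage 2a.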
1.2.6 Sample Problem: Find the flaw
No one would write a computer program and call it finished without test-
ing and debugging their first attempt. Yet some approach physics problem
solving in exactly this way: they get to the equation that is “the solution”,
stop, and then head off to bed for some well-earned sleep without investi-
gating whether the solution makes sense. This is a loss, because the real
fun and interest in a problem comes not from our cleverness in finding “the
solution”, but from uncovering what that solution tells us about nature.
To give you experience in this reflection step, I’ve designed “find the flaw”
problems in which you don’t find the solution, you only test it. Here’s an
example.
Find the flaw: Tilted analyzer loop
Four students — Aldo, Beth, Celine, and Denzel — work problem 1.5
presented on the next page. All find the same answer for part (a), namely
zero, but for parts (b) and (c) they produce four different answers! Their
candidate answers are:
          (b)                  (c)
Aldo      cos⁴(θ/2)            sin⁴(θ/2)
Beth      (1/4) sin(θ)         (1/4) sin(θ)
Celine    (1/4)√2 sin(θ/2)     (1/4)√2 sin(θ/2)
Denzel    (1/2) sin²(θ)        (1/2) sin²(θ)
Without actually working the problem, provide simple reasons showing that
all of these candidates must be wrong.
Solution: For the special case θ = 0° the correct answers for (b) and (c)
are both 0. Aldo’s answer to (b) fails this test.
The special case θ = 90° was investigated in sections 1.2.1 and 1.2.2: in
this case the answers for (b) and (c) are both 1/4. Denzel’s answer fails this
test.
Beth’s answer gives negative probabilities when 180° < θ < 360°. Bad
idea!
The answer should not change when θ increases by 360°. Celine’s answer
fails this test. (For example, it gives the answer +1/4 when θ = 90° and −1/4
when θ = 450°, despite the fact that 90° and 450° are the same angle.)
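The same four sanity checks can be automated, which is a handy habit when testing a candidate formula. The sketch below is one possible way to do it; the dictionary `CANDIDATES`, the function `failed_checks`, the one-degree sampling grid, and the tolerance are all illustrative assumptions.

```python
import math

# Each student's candidate answers for parts (b) and (c), as functions of
# the tilt angle theta in radians.
CANDIDATES = {
    "Aldo": (lambda t: math.cos(t / 2) ** 4, lambda t: math.sin(t / 2) ** 4),
    "Beth": (lambda t: math.sin(t) / 4, lambda t: math.sin(t) / 4),
    "Celine": (lambda t: math.sqrt(2) * math.sin(t / 2) / 4,
               lambda t: math.sqrt(2) * math.sin(t / 2) / 4),
    "Denzel": (lambda t: math.sin(t) ** 2 / 2, lambda t: math.sin(t) ** 2 / 2),
}

def failed_checks(f, tol=1e-9):
    """Return the sanity checks from the solution that f(theta) fails."""
    rad = math.radians
    failures = []
    if abs(f(rad(0))) > tol:
        failures.append("not 0 at 0 degrees")
    if abs(f(rad(90)) - 0.25) > tol:
        failures.append("not 1/4 at 90 degrees")
    if any(f(rad(d)) < -tol for d in range(0, 361)):
        failures.append("negative for some angle")
    if any(abs(f(rad(d)) - f(rad(d + 360))) > tol for d in range(0, 361)):
        failures.append("changes when theta increases by 360 degrees")
    return failures

results = {}
for name, parts in CANDIDATES.items():
    failures = set()
    for f in parts:               # a student fails if either part fails
        failures.update(failed_checks(f))
    results[name] = sorted(failures)
    print(name, results[name])
```

Each student is flagged by exactly the test named in the solution above. Passing all four checks would not prove a formula correct; the checks can only rule candidates out.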
Problems
1.5 Tilted analyzer loop (recommended problem)
[Figure: input atoms with µz = +µB enter an analyzer loop, with path a
marked, tilted at angle θ from the z axis; the outgoing atoms go on to
the output.]
An atom with µz = +µB enters the analyzer loop (interferometer)
shown above, tilted at angle θ to the vertical. The outgoing atom
enters a z-analyzer, and whatever comes out the − port is considered
output. What is the probability for passage from input to output when:
a. Paths a and b are both open?
b. Path b is blocked?
c. Path a is blocked?
1.6 Three analyzer loops (recommended problem)

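Atoms with µz = +µB are channeled into a horizontal analyzer loop,
followed by a vertical analyzer loop, followed by a horizontal analyzer
loop.
[Figure: input atoms with µz = +µB pass through horizontal loop 1
(paths 1a and 1b), vertical loop 2 (paths 2a and 2b), and horizontal
loop 3 (paths 3a and 3b) to the output.]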
If all paths are open, 100% of the incoming atoms exit from the out-
put. What percent of the incoming atoms leave from the output if the
following paths are blocked?
(a) 3a    (d) 2b           (g) 1b and 3b
(b) 3b    (e) 1b           (h) 1b and 3a
(c) 2a    (f) 2a and 3b    (i) 1b and 3a and 2a
(Note that in going from situation (h) to situation (i) you get more
output from increased blockage.)
1.3 Aharonov-Bohm effect
We have seen how to sort atoms using a Stern-Gerlach analyzer, made of a
non-uniform magnetic field. But how do atoms behave in a uniform magnetic
field? In general, this is an elaborate question (treated in section 5.4),
and the answer will depend on the initial condition of the atom’s magnetic
moment, on the magnitude of the field, and on the amount of time that the
atom spends in the field. But for one special case the answer, determined
experimentally, is easy. If an atom is exposed to uniform magnetic field
B for exactly the right amount of time (which turns out to be a time of
πħ/(µB B)), then the atom emerges with exactly the same magnetic condition
it had initially: If it starts with µz = −µB, it ends with µz = −µB.
If it starts with µx = +µB , it ends with µx = +µB . If it starts with
µ29◦ = +µB , it ends with µ29◦ = +µB . Thus for atoms moving at a given
speed, we can build a box containing a uniform magnetic field with just the
right length so that any atom passing through it will spend just the right
amount of time to emerge in the same condition it had when it entered.
We call this box a “replicator”.
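To get a feel for the sizes involved, the replicator time πħ/(µB B) and the matching box length can be evaluated numerically. In the sketch below the field strength (1 mT) and atom speed (500 m/s) are hypothetical values chosen only for illustration, and the function names are assumptions; the constants are the standard SI values of ħ and the Bohr magneton.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, in J s
MU_B = 9.2740100783e-24   # Bohr magneton, in J/T

def replicator_time(field_tesla):
    """Time pi*hbar/(mu_B*B) an atom must spend in the uniform field."""
    return math.pi * HBAR / (MU_B * field_tesla)

def replicator_length(field_tesla, speed_m_per_s):
    """Box length that gives an atom of this speed the replicator time."""
    return speed_m_per_s * replicator_time(field_tesla)

t = replicator_time(1.0e-3)                 # hypothetical 1 mT field
length = replicator_length(1.0e-3, 500.0)   # hypothetical 500 m/s atoms
print(t)        # roughly 3.6e-8 s
print(length)   # roughly 1.8e-5 m
```

Because the time is inversely proportional to B, a weaker field calls for a proportionally longer box at the same atom speed.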
If you play with one of these boxes you’ll find that you can build any
elaborate set-up of sources, detectors, blockages, and analyzers, and that
inserting a replicator into any path will not affect the outcome of any exper-
iment. But notice that this apparatus list does not include interferometers
(our “analyzer loops”)! Build the interference experiment of page 23. Do
not block either path. Instead, slip a replicator into one of the two paths a
or b — it doesn’t matter which.
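[Figure: the interference experiment of page 23 (input µz = +µB, a
horizontal analyzer loop with paths a and b, and a vertical analyzer whose
+ port is ignored and whose − port, µz = −µB, is the output), with a
replicator inserted into one path.]
Without the replicator no atom emerges at output. But experiment shows
that after inserting the replicator, all the atoms emerge at output.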
How can this be? Didn’t we just say of a replicator that “any atom pass-
ing through it will. . . emerge in the same condition it had when it entered”?
Indeed we did, and indeed this is true. But an atom with µz = +µB doesn’t
pass through path a or path b — it ambivates through both paths.
If the atom did take one path or the other, then the replicator would
have no effect on the experimental results. The fact that it does have an
effect is proof that the atom doesn’t take one path or the other.
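This argument can be illustrated with the amplitude rule that the book develops in later chapters. The sketch below rests on one assumption not stated in the text above: in the standard quantum-mechanical description, a replicator multiplies the amplitude of whatever passes through it by −1, a factor that is invisible whenever the whole atom goes through the box. The path amplitudes and the function names are likewise illustrative.

```python
import math

S = 1 / math.sqrt(2)
AMP_A = S * S     # amplitude z+ -> path a (x+) -> z-, standard convention
AMP_B = -S * S    # amplitude z+ -> path b (x-) -> z-, standard convention
REPLICATOR = -1   # assumed factor applied by a replicator (see lead-in)

def p_output_ambivating(factor_a=1, factor_b=1):
    """Both paths contribute: add amplitudes, then square."""
    return abs(factor_a * AMP_A + factor_b * AMP_B) ** 2

def p_output_definite_path(factor_a=1, factor_b=1):
    """Hypothetical atom that takes one path or the other: add probabilities."""
    return abs(factor_a * AMP_A) ** 2 + abs(factor_b * AMP_B) ** 2

print(p_output_ambivating())                        # no replicator
print(p_output_ambivating(factor_a=REPLICATOR))     # replicator in path a
print(p_output_ambivating(factor_b=REPLICATOR))     # replicator in path b
print(p_output_definite_path())                     # no replicator
print(p_output_definite_path(factor_a=REPLICATOR))  # replicator in path a
```

In this sketch the ambivating atom goes from output probability 0 to 1 when a replicator is slipped into either path, while the definite-path picture gives the same value with or without the replicator.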
The fact11 that one can perform this remarkable experiment was pre-
dicted theoretically (in a different context) by Walter Franz. He announced
his result in Danzig (now Gdańsk, Poland) in May 1939, just months before
the Nazi invasion of Poland, and his prediction was largely forgotten in the
resulting chaos. The effect was rediscovered theoretically by Werner Ehren-
berg and Raymond Siday in 1949, but they published their result under the
opaque title of “The refractive index in electron optics and the principles of
dynamics” and their prediction was also largely forgotten. The effect was
rediscovered theoretically a third time by Yakir Aharonov and David Bohm
in 1959, and this time it sparked enormous interest, both experimental and
theoretical. The phenomenon is called today the “Aharonov-Bohm effect”.
Problem
1.7 Bomb-testing interferometer12 (recommended problem)
The Acme Bomb Company sells a bomb triggered by the presence of
silver, and claims that the trigger is so sensitive that the bomb explodes
when its trigger absorbs even a single silver atom. You have heard sim-
ilar extravagant claims from other manufacturers, so you’re suspicious.
You purchase a dozen bombs, then shoot individual silver atoms at
each in turn. The first bomb tested explodes! The trigger worked as
advertised, but now it’s useless because it’s blasted to pieces. The sec-
ond bomb tested doesn’t explode — the atom slips through a hole in
the trigger. This confirms your suspicion that not all the triggers are
as sensitive as claimed, so this bomb is useless to you as well. If you
continue testing in this fashion, at the end all your good bombs will be
blown up and you will be left with a stash of bad bombs.
11 See B.J. Hiley, “The early history of the Aharonov-Bohm effect” (17 April 2013)
https://arxiv.org/abs/1304.4736.
12 Avshalom C. Elitzur and Lev Vaidman, “Quantum mechanical interaction-free
measurements” Foundations of Physics 23 (July 1993) 987–997.
So instead, you set up the test apparatus sketched here:
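[Figure: an atom with µz = +µB enters an interferometer with paths a
and b, followed by an analyzer whose two exit ports are each marked
with a question mark. The bomb, with its trigger, sits in path a.]
An atom with µz = +µB enters the interferometer. If the bomb trigger
has a hole, then the atom ambivates through both paths, arrives at the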
analyzer with µz = +µB , and exits the + port of the analyzer. But if
the bomb trigger is good, then either (a) the atom takes path a and
sets off the bomb, or else (b) the atom takes path b.
a. If the bomb trigger is good, what is the probability of option (a)?
Of option (b)?
b. If option (b) happens, what kind of atom arrives at the analyzer?
What is the probability of that atom exiting through the + port?
The − port?
Conclusion: If the atom exits through the − port, then the bomb is
good. If it exits through the + port then the bomb might be good or
bad and further testing is required. But you can determine that the
bomb trigger is good without blowing it up!
1.4 Light on the atoms
Our conclusion that, under some circumstances, the atom “does not have
a position” is so dramatically counterintuitive that you might — no, you
should — be tempted to test it experimentally. Set up the interference ex-
periment on page 23, but instead of simply allowing atoms to pass through
the interferometer, watch to see which path the atom takes through the
set-up. To watch them, you need light. So set up the apparatus with lamps
trained on the two paths a and b.
Send in one atom. There’s a flash of light at path a.
Another atom. Flash of light at b.
Another atom. Flash at b again.
Then a, then a, then b.
You get the drift. Always the light appears at one path or the other. (In
fact, the flashes come at random with probability 1/2 for a flash at a and 1/2
for a flash at b.) Never is there no flash. Never are there “two half flashes”.
The atom always has a position when passing through the interferometer.
“So much”, say the skeptics, “for this metaphysical nonsense about ‘the
atom takes both paths’.”
But wait. Go back and look at the output of the vertical analyzer.
When we ran the experiment with no light, the probability of coming out
the − port was 0. When we turn the lamps on, then the probability of
coming out the − port becomes 1/2.
When the lamps are off, analysis II on page 24 is correct: the atoms
ambivate through both paths, and the probability of exiting from the − port
is 0. When the lamps are on and a flash is seen at path a, then the atom
does take path a, and now the analysis of section 1.2.2 on page 24 is correct:
the probability of exiting from the − port is 1/2.
The process when the lamps are on is called “observation” or “measure-
ment”, and a lot of nonsense has come from the use of these two words.
The important thing is whether the light is present or absent. Whether
or not the flashes are “observed” by a person is irrelevant. To prove this
to yourself, you may, instead of observing the flashes in person, record the
flashes on video. If the lamps are on, the probability of exiting from the
− port is 1/2. If the lamps are off, the probability of exiting from the − port
is 0. Now, after the experiment is performed, you may either destroy the
video, or play it back to a human audience, or play it back to a feline au-
dience. Surely, by this point it is too late to change the results at the exit
port.
It’s not just light. Any method you can dream up for determining the
path taken will show that the atom takes just one path, but that method
will also change the output probability from 0 to 1/2. No person needs to
actually read the results of this mechanism: as long as the mechanism is at
work, as long as it is in principle possible to determine which path is taken,
then one path is taken and no interference happens.
What happens if you train a lamp on path a but leave path b in the
dark? In this case a flash means the atom has taken path a. No flash means
the atom has taken path b. In both cases the probability of passage for the
atom is 1/2.
How can the atom taking path b “know” that the lamp at path a is
turned on? The atom initially “sniffs out” both paths, like a fog creeping
down two passageways. The atom that eventually does take path b in
the dark started out attempting both paths, and that’s how it “knows”
the lamp at path a is on. This is called the “Renninger negative-result
experiment”.
It is not surprising that the presence or absence of light should affect an
atom’s motion: this happens even in classical mechanics. When an object
absorbs or reflects light, that object experiences a force, so its motion is
altered. For example, a baseball tossed upward in a gymnasium with the
overhead lamps off attains a slightly greater height than an identical baseball
experiencing an identical toss in the same gymnasium with the overhead
lamps on, because the downward-directed light beams push the baseball
downward. (This is the same “radiation pressure” that is responsible for
the tails of comets. And of course, the effect occurs whenever the lamps are
turned on: whether any person actually watches the illuminated baseball
is irrelevant.) This effect is negligible for typical human-scale baseballs
and tosses and lamps, but atoms are far smaller than baseballs and it is
reasonable that the light should alter the motion of an atom more than it
alters the motion of a baseball.
One last experiment: Look for the atoms with dim light. In this case,
some of the atoms will pass through with a flash. But — because of the
dimness — some atoms will pass through without any flash at all. For those
atoms passing through with a flash, the probability for exiting the − port
is 1/2. For those atoms passing through without a flash, the probability of
exiting the − port is 0.
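The dim-light result implies a simple weighted average for the whole beam. The sketch below assumes a hypothetical parameter `flash_fraction`, the fraction of atoms that pass through with a flash, and combines the two conditional probabilities quoted above; the function name is also an assumption.

```python
P_WITH_FLASH = 0.5   # probability of exiting the - port, given a flash
P_NO_FLASH = 0.0     # probability of exiting the - port, given no flash

def p_minus_port_dim(flash_fraction):
    """Overall probability of exiting the - port when a fraction
    flash_fraction (between 0 and 1) of the atoms produce a flash."""
    if not 0.0 <= flash_fraction <= 1.0:
        raise ValueError("flash_fraction must lie between 0 and 1")
    return flash_fraction * P_WITH_FLASH + (1.0 - flash_fraction) * P_NO_FLASH

print(p_minus_port_dim(0.0))   # lamps off: 0.0
print(p_minus_port_dim(1.0))   # bright lamps: 0.5
print(p_minus_port_dim(0.3))   # dim lamps, hypothetical 30% flash rate
```

The two limiting cases recover the lamps-off and lamps-on results, and intermediate brightness interpolates linearly between them.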
1.5 Entanglement
I have claimed that an atom with µz = +µB doesn’t have a value of µx,
and that when such an atom passes through a horizontal interferometer, it
doesn’t have a position. You might say to yourself, “These claims are so
weird, so far from common sense, that I just can’t accept them. I believe
the atom does have a value of µx and does have a position, but something
else very complicated is going on to make the atom appear to lack a µx and
a position. I don’t know what that complicated thing is, but just because
I haven’t thought it up yet doesn’t mean that it doesn’t exist.”
If you think this, you’re in good company: Einstein13 thought it too.
This section introduces a new phenomenon of quantum mechanics, and
shows that no local deterministic mechanism, no matter how complex or
how fantastic, can give rise to all the results of quantum mechanics. Einstein
was wrong.
1.5.1 Flipping Stern-Gerlach analyzer
A new piece of apparatus helps us uncover this new phenomenon of nature.

Mount a Stern-Gerlach analyzer on a stand so that it can be oriented either
vertically (0°), or tilted one-third of a circle clockwise (+120°), or tilted
one-third of a circle counterclockwise (−120°). Call these three orientations
V (for vertical), O (for out of the page), or I (for into the page). As an atom
approaches the analyzer, select one of these three orientations at random,
flip the analyzer to that orientation, and allow the atom to pass through as
usual. As a new atom approaches, again select an orientation at random,
flip the analyzer, and let the atom pass through. Repeat many times.
[Figure: Flipping Stern-Gerlach analyzer. The arrows V, O, and I, oriented 120°
apart, all lie within the plane perpendicular to the atom’s approach path.]
13 Although Albert Einstein (1879–1955) is most famous for his work on relativity, he
claimed that he had “thought a hundred times as much about the quantum problems
as I have about general relativity theory.” (Remark to Otto Stern, reported in Abraham
Pais, “Subtle is the Lord. . . ”: The Science and the Life of Albert Einstein, [Oxford
University Press, Oxford, UK, 1982] page 9.)
What happens if an atom with µz = +µB enters a flipping analyzer?
With probability 1/3, the atom enters a vertical analyzer (orientation V), and
in that case it exits the + port with probability 1. With probability 1/3, the
atom enters an out-of-the-page analyzer (orientation O), and in that case
(see equation 1.5) it exits the + port with probability
    cos²(120°/2) = 1/4.
With probability 1/3, the atom enters an into-the-page analyzer (orientation
I), and in that case it exits the + port with probability 1/4. Thus the overall
probability of this atom exiting through the + port is
    (1/3) × 1 + (1/3) × (1/4) + (1/3) × (1/4) = 1/2.    (1.6)
A similar analysis shows that if an atom with µz = −µB enters the flipping
analyzer, it exits the + port with probability 1/2.
You could repeat the analysis for an atom entering with µ(+120°) = +µB,
but you don’t need to. Because the three orientations are exactly one-third
of a circle apart, rotational symmetry demands that an atom entering with
µ(+120°) = +µB behaves exactly as an atom entering with µz = +µB.
In conclusion, an atom entering in any of the six conditions µz = +µB,
µz = −µB, µ(+120°) = +µB, µ(+120°) = −µB, µ(−120°) = +µB, or
µ(−120°) = −µB will exit through the + port with probability 1/2.
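The arithmetic behind equation (1.6), and the symmetry argument that extends it to all six entering conditions, can be checked with a short script. This is only a sketch: it assumes the rule quoted from equation 1.5, namely that an atom with a definite + projection along one axis exits the + port of an analyzer tilted by angle θ from that axis with probability cos²(θ/2), and that an atom with a − projection exits + with the complementary probability. The function and variable names are my own, chosen for illustration.

```python
import math

# The three analyzer orientations, as angles in degrees from the vertical.
ORIENTATIONS = {"V": 0.0, "O": 120.0, "I": -120.0}

def prob_plus(atom_axis_deg, atom_sign, analyzer_deg):
    """Probability of exiting the + port, using the assumed cos^2(theta/2) rule."""
    theta = math.radians(analyzer_deg - atom_axis_deg)
    p = math.cos(theta / 2) ** 2
    # An atom with projection -muB along its axis exits + with probability 1 - p.
    return p if atom_sign > 0 else 1.0 - p

def flipping_prob_plus(atom_axis_deg, atom_sign):
    """Average over the three equally likely analyzer orientations."""
    total = sum(prob_plus(atom_axis_deg, atom_sign, a) for a in ORIENTATIONS.values())
    return total / len(ORIENTATIONS)

# All six entering conditions: projection + or - along V, O, or I.
results = {
    (axis, sign): flipping_prob_plus(angle, sign)
    for axis, angle in ORIENTATIONS.items()
    for sign in (+1, -1)
}

for (axis, sign), p in results.items():
    print(axis, "+" if sign > 0 else "-", round(p, 6))
```

Every one of the six lines should report 0.5, in agreement with the conclusion above.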
1.5.2 EPR source of atom pairs
Up to now, our atoms have come from an oven. For the next experiments we
need a special source14 that expels two atoms at once, one moving to the left
and the other to the right. For the time being we call this an “EPR” source,
which produces an atomic pair in an “EPR” condition. The letters come
from the names of those who discovered this condition: Albert Einstein,
Boris Podolsky, and Nathan Rosen. After investigating this condition we
will develop a more descriptive name.
14 The question of how to build this special source need not concern us at the moment: it
is an experimental fact that such sources do exist. One way to make one would start with
a diatomic molecule with zero magnetic moment. Cause the molecule to disintegrate and
eject the two daughter atoms in opposite directions. Because the initial molecule had
zero magnetic moment, the pair of daughter atoms will have the properties of magnetic
moment described. In fact, it’s easier to build a source, not for a pair of atoms, but for
a pair of photons using a process called spontaneous parametric down-conversion.
The following experiments investigate the EPR condition:

(1) Each atom encounters a vertical Stern-Gerlach analyzer. The experimental
result: the two atoms exit through opposite ports. To be precise:
with probability 1/2, the left atom exits + and the right atom exits −, and
with probability 1/2, the left atom exits − and the right atom exits +, but
it never happens that both atoms exit + or that both atoms exit −.
[Figure: the four conceivable outcomes, labeled in turn “probability 1/2”,
“probability 1/2”, “never”, and “never”.]
You might suppose that this is because for half the pairs, the left atom
is generated with µz = +µB while the right atom is generated with
µz = −µB , while for the other half of the pairs, the left atom is generated
with µz = −µB while the right atom is generated with µz = +µB . This
supposition seems suspicious, because it singles out the z axis as special,
but at this stage in our experimentation it’s possible.
(2) Repeat the above experiment with horizontal Stern-Gerlach analyzers.
The experimental result: Exactly the same as in experiment (1)! The
two atoms always exit through opposite ports.
Problem 1.9 on page 47 demonstrates that the results of this experiment
rule out the supposition presented at the bottom of experiment (1).
(3) Repeat the above experiment with the two Stern-Gerlach analyzers
oriented at +120°, or with both oriented at −120°, or with both oriented
at 57°, or for any other angle, as long as both have the same orientation.
The experimental result: Exactly the same for any orientation!
(4) In an attempt to trick the atoms, we set the analyzers to vertical,
then launch the pair of atoms, then (while the atoms are in flight) switch
both analyzers to, say, 42°, and have the atoms encounter these analyzers
both with switched orientation. The experimental result: Regardless of
what the orientation is, and regardless of when that orientation is set, the
two atoms always exit through opposite ports.
Here is one way to picture this situation: The pair of atoms has a total
magnetic moment of zero. But whenever the projection of a single atom
on any axis is measured, the result must be +µB or −µB , never zero.
The only way to ensure that the total magnetic moment, projected on
any axis, sums to zero is the way described above. Do not put too much
weight on this picture: like the “wants to go straight” story of section 1.1.5
(page 14), this is a classical story that happens to give the correct result.
The definitive answer to any question is always experiment, not any picture
or story, however appealing it may be.
These four experiments show that it is impossible to describe the con-
dition of the atoms through anything like “the left atom has µz = +µB ,
the right atom has µz = −µB ”. How can we describe the condition of the
pair? This will require further experimentation. For now, we say it has an
EPR condition.
1.5.3 EPR atom pair encounters flipping Stern-Gerlach analyzers
A pair of atoms leaves the EPR source, and each atom travels at the same
speed to vertical analyzers located 100 meters away. The left atom exits the
− port, the right atom exits the + port. When the pair is flying from source
to analyzer, it’s not correct to describe it as “the left atom has µz = −µB ,
the right atom has µz = +µB ”, but after the atoms leave their analyzers,
then this is a correct description.
Now shift the left analyzer one meter closer to the source. The left atom
encounters its analyzer before the right atom encounters its. Suppose the
left atom exits the − port, while the right atom is still in flight toward its
analyzer. We know that when the right atom eventually does encounter
its vertical analyzer, it will exit the + port. Thus it is correct to describe
the right atom as having “µz = +µB ”, even though that atom hasn’t yet
encountered its analyzer.
Replace the right vertical analyzer with a flipping Stern-Gerlach ana-
lyzer. (In the figure below, it is in orientation O, out of the page.) Suppose
the left atom encounters its vertical analyzer and exits the − port. Through
the reasoning of the previous paragraph, the right atom now has µz = +µB .
We know that when such an atom encounters a flipping Stern-Gerlach analyzer,
it exits the + port with probability 1/2.
Similarly, if the left atom encounters its vertical analyzer and exits the
+ port, the right atom now has µz = −µB, and once it arrives at its flipping
analyzer, it will exit the − port with probability 1/2. Summarizing these two
paragraphs: Regardless of which port the left atom exits, the right atom
will exit the opposite port with probability 1/2.
Now suppose that the left analyzer were not vertical, but instead in
orientation I, tilted into the page by one-third of a circle. It’s easy to see
that, again, regardless of which port the left atom exits, the right atom will
exit the opposite port with probability 1/2.
Finally, suppose that the left analyzer is a flipping analyzer. Once again,
the two atoms will exit from opposite ports with probability 1/2.
The above analysis supposed that the left analyzer was one meter closer
to the source than the right analyzer, but clearly it also works if the right
analyzer is one meter closer to the source than the left analyzer. Or one
centimeter. One suspects that the same result will hold even if the two
analyzers are exactly equidistant from the source, and experiment bears
out this suspicion.
In summary: Each atom from this EPR source enters a flipping Stern-Gerlach
analyzer.
(A) The atoms exit from opposite ports with probability 1/2.
(B) If the two analyzers happen to have the same orientation, the atoms
exit from opposite ports.
This is the prediction of quantum mechanics, and experiment confirms this
prediction.
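Facts (A) and (B) can be reproduced by enumerating the nine equally likely pairs of analyzer orientations. The sketch below rests on two assumptions taken from the reasoning of this section: once the left atom exits one port of its analyzer, the right atom behaves like an atom with the opposite projection along that same axis; and such an atom follows the cos²(θ/2) rule of equation 1.5 when it meets an analyzer tilted by θ. The names are illustrative only.

```python
import math
from itertools import product

# Analyzer orientations, as angles in degrees from the vertical.
ANGLES = {"V": 0.0, "O": 120.0, "I": -120.0}

def prob_opposite(left, right):
    """Probability that the two atoms exit opposite ports.

    Whichever port the left atom takes, the right atom then has the opposite
    projection along the left analyzer's axis, so it exits the opposite port
    of its own analyzer with probability cos^2(theta/2) (assumed rule).
    """
    theta = math.radians(ANGLES[right] - ANGLES[left])
    return math.cos(theta / 2) ** 2

pairs = list(product(ANGLES, repeat=2))  # VV, VO, VI, OV, ... nine in all

# Fact (A): average over all nine equally likely orientation pairs.
overall = sum(prob_opposite(l, r) for l, r in pairs) / len(pairs)

# Fact (B): the three pairs in which both analyzers have the same orientation.
same_orientation = [prob_opposite(a, a) for a in ANGLES]

print("overall probability of opposite ports:", round(overall, 6))
print("same-orientation probabilities:", [round(p, 6) for p in same_orientation])
```

The overall figure should come out as 0.5 and each same-orientation figure as 1.0.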
1.5.4 The prediction of local determinism
Suppose you didn’t know anything about quantum mechanics, and you
were told the result that “if the two analyzers have the same orientation,
the atoms exit from opposite ports.” Could you explain it?
I am sure you could. In fact, there are two possible explanations: First,
the communication explanation. The left atom enters its vertical analyzer,
and notices that it’s being pulled toward the + port. It calls up the right
atom with its walkie-talkie and says “If your analyzer has orientation I or O
then you might go either way, but if your analyzer has orientation V you’ve
got to go to the − port!” This is a possible explanation, but it’s not a local
explanation. The two analyzers might be 200 meters apart, or they might
be 200 light-years apart. In either case, the message would have to get from
the left analyzer to the right analyzer instantaneously. The walkie-talkies
would have to use not radio waves, which propagate at the speed of light,
but some sort of not-yet-discovered “insta-rays”. Physicists have always
been skeptical of non-local explanations, and since the advent of relativity
they have grown even more skeptical, so we set this explanation aside. Can
you find a local explanation?
Again, I am sure you can. Suppose that when the atoms are launched,
they have some sort of characteristic that specifies which exit port they will
take when they arrive at their analyzer. This very reasonable supposition,
called “determinism”, pervades all of classical mechanics. It is similar to
saying “If I stand atop a 131 meter cliff and toss a ball horizontally with
speed 23.3 m/s, I can predict the angle with which the ball strikes the
ground, even though that event will happen far away and long in the fu-
ture.” In the case of the ball, the resulting strike angle is encoded into the
initial position and velocity. In the case of the atoms, it’s not clear how the
exit port will be encoded: perhaps through the orientation of its magnetic
moment, perhaps in some other, more elaborate way. But the method of
encoding is irrelevant: if local determinism holds, then something within
the atom determines which exit port it will take when it reaches its ana-
lyzer.15 I’ll represent this “something” through a code like (+ + −). The
first symbol means that if the atom encounters an analyzer in orientation V,
it will exit through the + port. The second means that if it encounters an
analyzer in orientation O, it will exit through the + port. The third means
that if it encounters an analyzer in orientation I, it will exit through the
− port. The only way to ensure that “if the two analyzers have the same
orientation, the atoms exit from opposite ports” is to assume that when the
two atoms separate from each other within the source, they have opposite
codes. If the left atom has (+ − +), the right atom must have (− + −). If
the left atom has (− − −), the right atom must have (+ + +). This is the
local deterministic scheme for explaining fact (B) that “if the two analyzers
have the same orientation, the atoms exit from opposite ports”.
But can this scheme explain fact (A)? Let’s investigate. Consider first
the case mentioned above: the left atom has (+−+) and the right atom has
(− + −). These atoms will encounter analyzers set to any of 3² = 9 possible
pairs of orientations. We list them below, along with the exit ports taken
by the atoms. (For example, the third line of the table considers a left
analyzer in orientation V and a right analyzer in orientation I. The left
atom has code (+ − +), and the first entry in that code determines that
the left atom will exit from the V analyzer through the + port. The right
atom has code (− + −), and the third entry in that code determines that
the right atom will exit from the I analyzer through the − port.)
15 But remember that in quantum mechanics determinism does not hold. The infor-
mation can’t be encoded within the three projections of a classical magnetic moment
vector, because at any one instant, the quantum magnetic moment vector has only one
projection.
left    left        right       right    opposite?
port    analyzer    analyzer    port
+       V           V           −        yes
+       V           O           +        no
+       V           I           −        yes
−       O           V           −        no
−       O           O           +        yes
−       O           I           −        no
+       I           V           −        yes
+       I           O           +        no
+       I           I           −        yes
Each of the nine orientation pairs (VV, OI, etc.) is equally likely, and five of
the orientation pairs result in atoms exiting from opposite ports, so when
atoms of this type emerge from the source, the probability of these atoms
exiting from opposite ports is 5/9.
What about a pair of atoms generated with different codes? Suppose the
left atom has (− − +) so the right atom must have (+ + −). If you perform
the analysis again, you will find that the probability of atoms exiting from
opposite ports is once again 5/9.
Suppose the left atom has (−−−), so the right atom must have (+++).
The probability of the atoms exiting from opposite ports is of course 1.
There are, in fact, just 2³ = 8 possible codes:
code for     probability of
left atom    exiting opposite
+++          1
−++          5/9
+−+          5/9
++−          5/9
+−−          5/9
−+−          5/9
−−+          5/9
−−−          1
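The table of codes can be generated by brute force rather than by hand. The following sketch enumerates all eight codes for the left atom, gives the right atom the opposite code, and counts how many of the nine orientation pairs send the atoms out of opposite ports. It writes the codes with the plain keyboard characters "+" and "-", and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ORIENTATIONS = ("V", "O", "I")  # position in a code: first V, then O, then I

def opposite_code(code):
    """The right atom's code: every symbol of the left atom's code reversed."""
    return tuple("-" if s == "+" else "+" for s in code)

def prob_opposite(left_code):
    """Fraction of the nine orientation pairs giving opposite exit ports."""
    right_code = opposite_code(left_code)
    n = len(ORIENTATIONS)
    hits = sum(
        left_code[i] != right_code[j]           # left analyzer i, right analyzer j
        for i, j in product(range(n), repeat=2)
    )
    return Fraction(hits, n * n)

table = {"".join(code): prob_opposite(code) for code in product("+-", repeat=3)}

for code, p in table.items():
    print(code, p)
```

Only two values ever appear, 1 for the codes +++ and ---, and 5/9 for the other six.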
If the source makes left atoms of only type (−−+), then the probability
of atoms exiting from opposite ports is 5/9. If the source makes left atoms
of only type (+ + +), then the probability of atoms exiting from opposite
ports is 1. If the source makes left atoms of type (− − +) half the time,
and of type (+ + +) half the time, then the probability of atoms exiting
from opposite ports is halfway between 5/9 and 1, namely 7/9. But no matter
how the source makes atoms, the probability of atoms exiting from opposite
ports must be somewhere between 5/9 and 1.
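The bound in the last sentence is a statement about weighted averages, and it may help to see it written out. In the sketch below, p_c stands for the probability that the source emits a pair whose left atom carries code c, and P_c for the probability (1 or 5/9, from the table) that such a pair exits from opposite ports. Both symbols are introduced here for illustration and do not appear elsewhere in the text.

```latex
P_{\text{opposite}} = \sum_{c} p_c \, P_c ,
\qquad p_c \ge 0 , \quad \sum_{c} p_c = 1 , \quad \tfrac{5}{9} \le P_c \le 1
\quad\Longrightarrow\quad
\tfrac{5}{9} \le P_{\text{opposite}} \le 1 .

% The half-and-half example from the paragraph above:
\tfrac{1}{2} \cdot \tfrac{5}{9} + \tfrac{1}{2} \cdot 1 = \tfrac{7}{9} .
```

An average can never fall below the smallest of the values being averaged, which is why no choice of the p_c can bring the result down to 1/2.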
But experiment and quantum mechanics agree: That probability is actually
1/2, and 1/2 is not between 5/9 and 1. No local deterministic scheme,
no matter how clever, or how elaborate, or how baroque, can give the
result 1/2. There is no “something within the atom that determines which
exit port it will take when it reaches its analyzer”. If the magnetic moment
has a projection on axis V, then it doesn’t have a projection on axis O or
axis I.
There is a reason that Einstein, despite his many attempts, never pro-
duced a scheme that explained quantum mechanics in terms of some more
fundamental, local and deterministic mechanism. It is not that Einstein
wasn’t clever. It is that no such scheme exists.
1.5.5 The upshot
This is a new phenomenon, one totally absent from classical physics, so
it deserves a new name, something more descriptive than “EPR”. Einstein
called it “spooky action at a distance”.16 The phenomenon is spooky all
right, but this phrase misses the central point that the phenomenon involves
“correlations at a distance”, whereas the word “action” suggests “cause-
and-effect at a distance”. Schrödinger coined the term “entanglement” for
this phenomenon and said it was “not. . . one but rather the characteristic
trait of quantum mechanics, the one that enforces its entire departure from
classical lines of thought”.17 The world has followed Schrödinger and the
phenomenon is today called entanglement. We will later investigate
entanglement in more detail, but for now we will just call our EPR source a
“source of entangled atom pairs” and describe the condition of the atom
pair as “entangled”.
16 Letter from Einstein to Max Born, 3 March 1947, The Born-Einstein Letters
(Macmillan, New York, 1971) translated by Irene Born.
17 Erwin Schrödinger, “Discussion of probability relations between separated systems”
Mathematical Proceedings of the Cambridge Philosophical Society 31 (October 1935)
555–563.
The failure of local determinism described above is a special case of
“Bell’s Theorem”, developed by John Bell18 in 1964. The theorem has
by now been tested experimentally numerous times in numerous contexts
(various different angles; various distances between the analyzers; various
sources of entangled pairs; various kinds of particles flying apart — gamma
rays, or optical photons, or ions). In every test, quantum mechanics has
been shown correct and local determinism wrong. What do we gain from
these results?
First, they show that nature does not obey local determinism. To our
minds, local determinism is common sense and any departure from it is
weird. Thus whatever theory of quantum mechanics we eventually develop
will be, to our eyes, weird. This will be a strength, not a defect, in the
theory. The weirdness lies in nature, not in the theory used to describe
nature.
Each of us feels a strong psychological tendency to reject the unfamil-
iar. In 1633, the Holy Office of the Inquisition found Galileo Galilei’s idea
that the Earth orbited the Sun so unfamiliar that they rejected it. The
inquisitors put Galileo on trial and forced him to abjure his position. From
the point of view of nature, the trial was irrelevant, Galileo’s abjuration
was irrelevant: the Earth orbits the Sun whether the Holy Office finds that
fact comforting or not. It is our job as scientists to change our minds to fit
nature; we do not change nature to fit our preconceptions. Don’t make the
inquisitors’ mistake.
Second, the Bell’s theorem result guides not just our calculations about
nature but also our visualizations of nature, and even the very idea of
what it means to “understand” nature. Lord Kelvin19 framed the situation
perfectly in his 1884 Baltimore lectures: “I never satisfy myself until I can
make a mechanical model of a thing. If I can make a mechanical model
I can understand it. As long as I cannot make a mechanical model all
the way through I cannot understand, and this is why I cannot get the
electromagnetic theory.”20 If we take this as our meaning of “understand”,
then the experimental tests of Bell’s theorem assure us that we will never be
able to understand quantum mechanics.21 What is to be done about this?
There are only two choices. Either we can give up on understanding, or we
can develop a new and more appropriate meaning for “understanding”.
18 John Stewart Bell (1928–1990), a Northern Irish physicist, worked principally in
accelerator design, and his investigation of the foundations of quantum mechanics was
something of a hobby. Concerning tests of his theorem, he remarked that “The reasonable
thing just doesn’t work.” [Jeremy Bernstein, Quantum Profiles (Princeton University
Press, Princeton, NJ, 1991) page 84.]
19 William Thomson, the first Baron Kelvin (1824–1907), was an Irish mathematical
physicist and engineer who worked in Scotland. He is best known today for establishing
the thermodynamic temperature scale that bears his name, but he also made fundamental
contributions to electromagnetism. He was knighted for his engineering work on the
first transatlantic telegraph cable.
Max Born22 argued for the first choice: “The ultimate origin of the
difficulty lies in the fact (or philosophical principle) that we are compelled to
use the words of common language when we wish to describe a phenomenon,
not by logical or mathematical analysis, but by a picture appealing to the
imagination. Common language has grown by everyday experience and can
never surpass these limits.”23 Born felt that it was impossible to visualize
or “understand” quantum mechanics: all you could do was grind through
the “mathematical analysis”.
Humans are visual animals, however, and I have found that when we are
told not to visualize, we do so anyway. But we do so in an illicit and uncrit-
ical way. For example, many people visualize an atom passing through an
interferometer as a small, hard marble with a definite position, despite the
already-discovered fact that this visualization is untenable. Many people
visualize a photon as a “ball of light” despite the fact that a photon (as
conventionally defined) has a definite energy and hence can never have a
position.
It is possible to develop a visualization and understanding of quantum
mechanics. This can’t be done by building a “mechanical model all the
way through”. It must be done through both analogy and contrast: atoms
behave in some ways like small hard marbles, in some ways like classical
waves, and in some ways like a cloud or fog of probability. Atoms don’t
behave exactly like any of these things, but if you keep in mind both the
analogy and its limitations, then you can develop a pretty good visualization
and understanding.
20 William Thomson, “Baltimore lectures on wave theory and molecular dynamics,” in
Robert Kargon and Peter Achinstein, editors, Kelvin’s Baltimore Lectures and Modern
Theoretical Physics (MIT Press, Cambridge, MA, 1987) page 206.
21 The first time I studied quantum mechanics seriously, I wrote in the margin of my
textbook “Good God they do it! But how?” I see now that I was looking for a mechanical
mechanism undergirding quantum mechanics. It doesn’t exist, but it’s very natural for
anyone to want it to exist.
22 Max Born (1882–1970) was a German-Jewish theoretical physicist with a particular
interest in optics. At the University of Göttingen in 1925 he directed Heisenberg’s research
which resulted in the first formulation of quantum mechanics. His granddaughter, the
British-born Australian actress and singer Olivia Newton-John, is famous for her 1981
hit song “Physical”.
23 Max Born, Atomic Physics, sixth edition (Hafner Press, New York, 1957) page 97.
And that brings us back to the name “entanglement”. It’s an important
name for an important phenomenon, but it suggests that the two distant
atoms are connected mechanically, through strings. They aren’t. The two
atoms are correlated — if the left comes out +, the right comes out −, and
vice versa — but they aren’t correlated because of some signal sent back
and forth through either strings or walkie-talkies. Entanglement involves
correlation without causality.
Problems
1.8 An atom walks into an analyzer
Execute the “similar analysis” mentioned in the sentence below equa-
tion (1.6).
1.9 A supposition squashed (essential problem)
If atoms were generated according to the supposition presented below
experiment (1) on page 38, then what would happen when they encountered
the two horizontal analyzers of experiment (2)?
1.10 A probability found through local determinism
Suppose that the codes postulated on page 42 did exist. Suppose also
that a given source produces the various possible codes with these prob-
abilities:
code for     probability of
left atom    making such a pair
+++          1/2
++−          1/4
+−−          1/8
−−+          1/8
If this given source were used in the experiment of section 1.5.3 with
distant flipping Stern-Gerlach analyzers, what would be the probability
of the two atoms exiting from opposite ports?
1.11 A probability found through quantum mechanics

In the test of Bell’s inequality (the experiment of section 1.5.3), what
is the probability given by quantum mechanics that, if the orientation
settings are different, the two atoms exit from opposite ports?
1.6 Quantum cryptography
We’ve seen a lot of new phenomena, and the rest of this book is devoted
to filling out our understanding of these phenomena and applying that
understanding to various circumstances. But first, can we use them for
anything?
We can. The sending of coded messages used to be the province of
armies and spies and giant corporations, but today everyone does it. All
transactions through automatic teller machines are coded. All Internet
commerce is coded. This section describes a particular, highly reliable
encoding scheme and then shows how quantal entanglement may someday
be used to implement this scheme. (Quantum cryptography was used to
securely transmit voting ballots cast in the Geneva canton of Switzerland
during parliamentary elections held 21 October 2007. But it is not today
in regular use anywhere.)
In this section I use names conventional in the field of coded messages
(called cryptography). Alice and Bob wish to exchange private messages,
but they know that Eve is eavesdropping on their communication. How
can they encode their messages to maintain their privacy?
1.6.1 The Vernam cipher
The Vernam cipher or “one-time pad” technique is the only coding scheme
proven to be absolutely unbreakable (if used correctly). It does not rely on
the use of computers — it was invented by Gilbert Vernam in 1919 — but
today it is mostly implemented using computers, so I’ll describe it in that
context.
Data are stored on computer disks through a series of magnetic patches
on the disk that are magnetized either “up” or “down”. An “up” patch
is taken to represent 1, and a “down” patch is taken to represent 0. A
string of seven patches is used to represent a character. For example, by a
convention called ASCII, the letter “a” is represented through the sequence
1100001 (or, in terms of magnetizations, up, up, down, down, down, down,
up). The letter “W” is represented through the sequence 1010111. Any
computer the world around will represent the message “What?” through
the sequence
1010111 1101000 1100001 1110100 0111111
This sequence is called the “plaintext”.
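For readers who want to check the plaintext themselves, here is a minimal sketch of the conversion. It uses Python’s built-in `ord` to look up each character’s code and writes that code as seven binary digits; the function name `to_ascii_bits` is my own.

```python
def to_ascii_bits(message):
    """Return the 7-bit ASCII code of each character as a string of 0s and 1s."""
    return [format(ord(ch), "07b") for ch in message]

plaintext_bits = to_ascii_bits("What?")
print(" ".join(plaintext_bits))
# prints: 1010111 1101000 1100001 1110100 0111111
```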

But Alice doesn’t want a message recognizable by any computer the
world around. She wants to send the message “What?” to Bob in such a
way that Eve will not be able to read the message, even though Eve has
eavesdropped on the message. Here is the scheme invented by Vernam:
Before sending her message, Alice generates a string of random 0s and 1s
just as long as the message she wants to send — in this case, 7 × 5 = 35
bits. She might do this by flipping 35 coins, or by flipping one coin 35
times. I’ve just done that, producing the random number
0100110 0110011 1010110 1001100 1011100
Then Alice gives Bob a copy of that random number – the “key”.
Instead of sending the plaintext, Alice modifies her plaintext into a
coded “ciphertext” using the key. She writes down her plaintext and writes
the key below it, then works through column by column. For each position,
if the key is 0 the plaintext is left unchanged; but if the key is 1 the plaintext
is reversed (from 0 to 1 or vice versa). For the first column, the key is 0, so
Alice doesn’t change the plaintext: the first character of ciphertext is the
same as the first character of plaintext. For the second column, the key
is 1, so Alice does change the plaintext: the second character of ciphertext
is the reverse of the second character of plaintext. Alice goes through all
the columns, duplicating the plaintext where the key is 0 and reversing the
plaintext where the key is 1.
plaintext:  1010111 1101000 1100001 1110100 0111111
key:        0100110 0110011 1010110 1001100 1011100
ciphertext: 1110001 1011011 0110111 0111000 1100011
Then, Alice sends out her ciphertext over open communication lines.
Now, the ciphertext that Bob (and Eve) receive translates to some message
through the ASCII convention (in fact, it translates to “q[78c”), but
because the key is random, the ciphertext is just as random. Bob deciphers
Alice’s message by carrying out the encoding process on the ciphertext,
namely, duplicating the ciphertext where the key is 0 and reversing the
ciphertext where the key is 1. The result is the plaintext. Eve does not
know the key, so she cannot produce the plaintext.
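The column-by-column rule (copy the bit where the key is 0, reverse it where the key is 1) is what programmers call a bitwise exclusive-or, and the whole scheme fits in a few lines. The sketch below uses the plaintext and key from the example above; the function names `xor_bits` and `bits_to_text` are illustrative, not standard.

```python
def xor_bits(bits, key):
    """Copy each bit where the key is 0 and reverse it where the key is 1."""
    if len(bits) != len(key):
        raise ValueError("the key must be exactly as long as the message")
    return "".join("1" if b != k else "0" for b, k in zip(bits, key))

def bits_to_text(bits):
    """Read a bit string seven bits at a time as ASCII characters."""
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, len(bits), 7))

# Plaintext ("What?") and key, copied from the worked example in the text.
plaintext = "1010111" "1101000" "1100001" "1110100" "0111111"
key       = "0100110" "0110011" "1010110" "1001100" "1011100"

ciphertext = xor_bits(plaintext, key)    # what Alice sends
recovered = xor_bits(ciphertext, key)    # what Bob computes with the same key

print(bits_to_text(ciphertext))  # prints: q[78c
print(bits_to_text(recovered))   # prints: What?
```

Applying the same operation twice with the same key restores the original bits, which is why Bob can decipher by simply repeating Alice’s encoding step.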
The whole scheme relies on the facts that the key is (1) random and
(2) unknown to Eve. The very name “one-time pad” underscores that a
key can only be used once and must then be discarded. If a single key is
used for two messages, then the second key is not “random” — it is instead
perfectly correlated with the first key. There are easy methods to break the
code when a key is reused.
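One way to see the danger of reuse: if Eve combines two ciphertexts made with the same key, bit by bit, the key cancels out entirely and she is left with the combination of the two plaintexts, which is far from random. The sketch below illustrates only this cancellation, not a full attack. The second message, “Where”, is a made-up example, and the key is drawn with Python’s `secrets` module as a stand-in for coin flips.

```python
import secrets

def to_bits(message):
    """Concatenated 7-bit ASCII codes of the characters in the message."""
    return "".join(format(ord(ch), "07b") for ch in message)

def xor_bits(a, b):
    """Bit-by-bit comparison: 1 where the two strings differ, 0 where they agree."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

msg1 = to_bits("What?")
msg2 = to_bits("Where")   # hypothetical second message of the same length

# One random key, 35 bits long, (mis)used for both messages.
key = "".join(secrets.choice("01") for _ in range(len(msg1)))
c1 = xor_bits(msg1, key)
c2 = xor_bits(msg2, key)

# Eve never sees the key, yet it drops out when she combines the ciphertexts.
leak = xor_bits(c1, c2)
print(leak == xor_bits(msg1, msg2))   # prints: True
```

Wherever `leak` shows a 0, Eve knows the two messages agree at that bit, and from there ordinary guesswork about likely words does the rest.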
Generating random numbers is not easy, and the Vernam cipher de-
mands keys as long as the messages transmitted. As recently as 1992,
high-quality computer random-number generators were classified by the
U.S. government as munitions, along with tanks and fighter planes, and
their export from the country was prohibited.
And of course Eve must not know the key. So there must be some way
for Alice to get the key to Bob securely. If they have some secure method
for transmitting keys, why don’t they just use that same secure method for
sending their messages?
In common parlance, the word “random” can mean “unimportant, not
worth considering” (as in “Joe made a random comment”). So it may
seem remarkable that a major problem for government, the military, and
commerce is the generation and distribution of randomness, but that is
indeed the case.
1.6.2 Quantum mechanics to the rescue
Since quantum mechanics involves randomness, it seems uniquely positioned
to solve this problem. Here’s one scheme.
Alice and Bob set up a source of entangled atoms halfway between their
two homes. Both of them erect vertical Stern-Gerlach analyzers to detect
the atoms. If Alice’s atom comes out +, she will interpret it as a 1, if −,
a 0. Bob interprets his atoms in the opposite sense. Since the entangled
atoms always exit from opposite ports, Alice and Bob end up with the
same random number, which they use as a key for their Vernam-cipher
communications over conventional telephone or computer lines.
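The bookkeeping of this scheme can be mimicked in a few lines. The sketch below is a classical stand-in that reproduces only the statistics described above (each outcome has probability 1/2, and the two atoms always exit from opposite ports of identically oriented analyzers). It does not model entanglement itself, and a real key would come from the physical measurements, not from a pseudorandom generator. All names are hypothetical.

```python
import random

def shared_key(n_pairs, rng=random):
    """Simulate the exit ports for n_pairs atom pairs and build both keys."""
    alice_bits, bob_bits = [], []
    for _ in range(n_pairs):
        alice_port = rng.choice("+-")                  # + or -, each with probability 1/2
        bob_port = "-" if alice_port == "+" else "+"   # always the opposite port
        alice_bits.append("1" if alice_port == "+" else "0")   # Alice: + means 1
        bob_bits.append("1" if bob_port == "-" else "0")       # Bob: opposite sense
    return "".join(alice_bits), "".join(bob_bits)

alice_key, bob_key = shared_key(35)   # 35 bits, enough for "What?"
print(alice_key)
print(alice_key == bob_key)           # prints: True
```

Because Bob reads his ports in the opposite sense, the two strings agree bit for bit even though neither Alice nor Bob could have predicted any single bit in advance.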
This scheme will indeed produce and distribute copious, high-quality
random numbers. But Eve can get at those same numbers through the
following trick: She cuts open the atom pipe leading from the entangled
source to Alice’s home, and inserts a vertical interferometer.24 She watches
the atoms pass through her interferometer. If the atom takes path a, Eve
knows that when Alice receives that same atom, it will exit from Alice’s
+ port. If the atom takes path b, the opposite holds. Eve gets the key, Eve
breaks the code.
It’s worth looking at this eavesdropping in just a bit more detail. When
the two atoms depart from their source, they are entangled. It is not true
that, say, Alice’s atom has µz = +µB while Bob’s atom has µz = −µB
— the pair of atoms is in the condition we’ve called “entangled”, but the
individual atoms themselves are not in any condition. However, after Eve
sees the atom taking path a of her interferometer, then the two atoms are
no longer entangled — now it is true that Alice’s atom has the condition
µz = +µB while Bob’s atom has the condition µz = −µB . The key received
by Alice and Bob will be random whether or not Eve is listening in. To
test for eavesdropping, Alice and Bob must examine it in some other way.
Replace Alice and Bob’s vertical analyzers with flipping Stern-Gerlach
analyzers. After Bob receives his random sequence of pluses and minuses,
he sends it to Alice over an open communication line. (Eve will intercept
that sequence but it won’t do her any good, because Bob sends only the
pluses and minuses, not the orientations of his analyzer.) Alice now knows
both the results at her analyzer and the results at Bob’s analyzer, so she
can perform a test of Bell’s theorem: If she finds that the probability of
atoms coming out opposite is 1/2, then she knows that their atoms have
arrived entangled, thus Eve has not observed the atoms in transit. If she
finds that the probability is between 5/9 and 1, then she knows for certain
that Eve is listening in, and they must not use their compromised key.
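The two benchmark numbers, 1/2 and 5/9, can be reproduced with a short calculation. The sketch below rests on assumptions recalled from the Bell's theorem discussion rather than restated here: each flipping analyzer takes one of three orientations 120° apart with equal likelihood, an entangled pair gives opposite results with probability cos²(relative angle/2), and a locally determinate atom carries a preset "instruction set" with its partner carrying the opposite one. It does not model Eve's interferometer itself, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product
import math

ANGLES_DEG = (0, 120, 240)  # assumed analyzer orientations

def quantum_opposite_probability():
    """Average chance of opposite results for entangled pairs, over all 9 orientation pairs."""
    total = 0.0
    for alice_angle, bob_angle in product(ANGLES_DEG, repeat=2):
        half_relative_angle = math.radians(alice_angle - bob_angle) / 2
        total += math.cos(half_relative_angle) ** 2
    return total / 9

def instruction_set_opposite_probability(instructions):
    """Alice's atom shows instructions[i]; Bob's atom shows the opposite of instructions[j]."""
    hits = sum(1 for i, j in product(range(3), repeat=2)
               if instructions[i] == instructions[j])  # equal instructions give opposite results
    return Fraction(hits, 9)

# One entry per possible instruction set, from "+++" to "---".
instruction_set_results = {
    "".join(s): instruction_set_opposite_probability(s)
    for s in product("+-", repeat=3)
}
```

Under these assumptions the entangled average comes out at 1/2, while every instruction set gives at least 5/9, which is the gap Alice's test exploits.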
Is there some other way for Eve to tap the line? No! If the atom pairs
pass the test for entanglement, then no one can know the values of their
µz projections because those projections don’t exist! We have guaranteed
that no one has intercepted the key by the interferometer method, or by
any other method whatsoever.
24 Inspired by James Bond, I always picture Eve as an exotic beauty in a little black dress
slinking to the back of an eastern European café to tap the diplomatic cable which
conveniently runs there. But in point of fact Eve would be a computer.
Once Alice has tested Bell’s theorem, she and Bob still have a lot of
work to do. For a key they must use only those random numbers produced
when their two analyzers happen to have the same orientations. There are
detailed protocols specifying how Alice and Bob must exchange information
about their analyzer orientations, in such a way that Eve can’t uncover
them. I won’t describe these protocols because while they tell you how
clever people are, they tell you nothing about how nature behaves. But
you should take away that entanglement is not merely a phenomenon of
nature: it is also a natural resource.
1.7 What is a qubit?
We’ve devoted an entire chapter to the magnetic moment of a silver atom.
Perhaps you find this inappropriate: do you really care so much about
silver atoms? Yes you do, because the phenomena and principles we’ve
established concerning the magnetic moment of a silver atom apply to a
host of other systems: the polarization of a light photon, the hybridization
of a benzene molecule, the position of the nitrogen atom within an ammonia
molecule, the neutral kaon, and more. Such systems are called “two-state
systems” or “spin-1/2 systems” or “qubit systems”. The ideas we establish
concerning the magnetic moment of a silver atom apply equally well to all
these systems.
After developing these ideas in the next chapter, we will (in chapter 6,
“The Quantum Mechanics of Position”) generalize them to continuum sys-
tems like the position of an electron.
Problem
1.12 Questions (recommended problem)
Answering questions is an important scientific skill and, like any skill,
it is sharpened through practice. This book gives you plenty of oppor-
tunities to develop that skill. Asking questions is another important
scientific skill.25 To hone that skill, write down a list of questions you
have about quantum mechanics at this point. Be brief and pointed:
you will not be graded for number or for verbosity. In future problems,
I will ask you to add to your list.
[[For example, one of my questions would be: “Can entanglement be
used to send a message from the left analyzer to the right analyzer?”]]
25 “The important thing is not to stop questioning,” said Einstein. “Never lose a holy
curiosity.” [Interview by William Miller, “Death of a Genius”, Life magazine, volume 38,
number 18 (2 May 1955) pages 61–64 on page 64.]
Chapter 2
Forging Mathematical Tools
When you walked into your introductory classical mechanics course, you
were already familiar with the phenomena of introductory classical mechan-
ics: flying balls, spinning wheels, colliding billiard balls. Your introductory
mechanics textbook didn’t need to introduce these things to you, but in-
stead jumped right into describing these phenomena mathematically and
explaining them in terms of more general principles.
The first chapter of this textbook made you familiar with the phenom-
ena of quantum mechanics: quantization, interference, and entanglement
— at least, insofar as these phenomena are manifest in the behavior of the
magnetic moment of a silver atom. You are now, with respect to quan-
tum mechanics, at the same level that you were, with respect to classical
mechanics, when you walked into your introductory mechanics course. It
is now our job to describe these quantal phenomena mathematically, to
explain them in terms of more general principles, and (eventually) to inves-
tigate situations more complex than the magnetic moment of one or two
silver atoms.
2.1 What is a quantal state?
We’ve been talking about the state of the silver atom’s magnetic moment
by saying things like “the projection of the magnetic moment on the z axis
is µz = −µB ” or “µx = +µB ” or “µθ = −µB ”. This notation is clumsy.
First of all, it requires you to write down the same old µs time and time
again. Second, the most important thing is the axis (z or x or θ), and the
symbol for the axis is also the smallest and easiest to overlook.
P.A.M. Dirac1 invented a notation that overcomes these faults. He
looked at descriptions like
$$\mu_z = -\mu_B \quad\text{or}\quad \mu_x = +\mu_B \quad\text{or}\quad \mu_\theta = -\mu_B$$
and noted that the only difference from one expression to the other was
the axis subscript and the sign in front of $\mu_B$. Since the only thing that
distinguishes one expression from another is $(z, -)$, or $(x, +)$, or $(\theta, -)$,
Dirac thought, these should be the only things we need to write down. He
denoted these three states as
$$|z-\rangle \quad\text{or}\quad |x+\rangle \quad\text{or}\quad |\theta-\rangle.$$
The placeholders $|\ \rangle$ are simply ornaments to remind us that we’re talking
about quantal states, just as the arrow atop $\vec{r}$ is simply an ornament to
remind us that we’re talking about a vector. States expressed using this
notation are sometimes called “kets”.
Simply establishing a notation doesn’t tell us much. Just as in classical
mechanics, we say we know a state when we know all the information needed
to describe the system now and to predict its future. In our universe the
classical time evolution law is
$$\vec{F} = m\,\frac{d^2\vec{r}}{dt^2}$$
and so the state is specified by giving both a position $\vec{r}$ and a velocity $\vec{v}$. If
nature had instead provided the time evolution law
$$\vec{F} = m\,\frac{d^3\vec{r}}{dt^3}$$
then the state would have been specified by giving a position $\vec{r}$, a velocity
$\vec{v}$, and an acceleration $\vec{a}$. The specification of state is dictated by nature,
not by humanity, so we can’t know how to specify a state until we know the
laws of physics governing that state. Since we don’t yet know the laws of
quantal physics, we can’t yet know exactly how to specify a quantal state.
Classical intuition makes us suppose that, to specify the magnetic mo-
ment of a silver atom, we need to specify all three components µz, µx, and
µy. We have already seen that nature precludes such a specification: if the
magnetic moment has a value for µz, then it doesn’t have a value for µx,
and it’s absurd to demand a specification for something that doesn’t ex-
ist. As we learn more and more quantum physics, we will learn better and
better how to specify states. There will be surprises. But always keep in
mind that (just as in classical mechanics) it is experiment, not philosophy
or meditation, and certainly not common sense, that tells us how to specify
states.
1 The Englishman Paul Adrien Maurice Dirac (1902–1984) in 1928 formulated a rela-
tivistically correct quantum mechanical equation that turns out to describe the electron.
In connection with this so-called Dirac equation, he predicted the existence of antimatter.
Dirac was painfully shy and notoriously cryptic.
2.2 Amplitude
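[Figure: an interferometer with two paths, a and b, between input and output; the sketch carries the state labels $|z+\rangle$ and $|z-\rangle$.]
An atom in state $|z+\rangle$ ambivates through the apparatus above. We have
already seen that, when the atom ambivates in darkness,
$$\begin{aligned}
&\text{probability to go from input to output} \neq \\
&\qquad \text{probability to go from input to output via path } a \\
&\qquad + \text{probability to go from input to output via path } b.
\end{aligned} \tag{2.1}$$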
On the other hand, it makes sense to associate some sort of “influence
to go from input to output via path a” with the path through a and a
corresponding “influence to go from input to output via path b” with the
path through b. This postulated influence is called “probability amplitude”
or just “amplitude”.2 Whatever amplitude is, its desired property is that
$$\begin{aligned}
&\text{amplitude to go from input to output} = \\
&\qquad \text{amplitude to go from input to output via path } a \\
&\qquad + \text{amplitude to go from input to output via path } b.
\end{aligned} \tag{2.2}$$
For the moment, the very existence of amplitude is nothing but a hopeful
surmise. Scientists cannot now and indeed never will be able to prove that
the concept of amplitude applies to all situations. That’s because new
situations are being investigated every day, and perhaps tomorrow a new
situation will be discovered that cannot be described in terms of amplitudes.
But as of today, that hasn’t happened.
2 The name “amplitude” is a poor one, because it is also used for the maximum value of
a sinusoidal signal (in the function A sin(ωt), the symbol A represents the amplitude),
and this sinusoidal signal “amplitude” has nothing to do with the quantal “amplitude”.
One of my students correctly suggested that a better name for quantal amplitude would
be “proclivity”. But it’s too late now to change the word.
The role of amplitude, whatever it may prove to be, is to calculate
probabilities. We establish three desirable rules:
(1) From amplitude to probability. For every possible action there is an
associated amplitude, such that
$$\text{probability for the action} = |\text{amplitude for the action}|^2 .$$
(2) Actions in series. If an action takes place through several successive
stages, the amplitude for that action is the product of the amplitudes
for each stage.
(3) Actions in parallel. If an action could take place in several possible
ways, the amplitude for that action is the sum of the amplitudes for
each possibility.
The first rule is a simple way to make sure that probabilities are never
negative. The second rule is a natural generalization of the rule for
probabilities in series — that if an action happens through several stages,
the probability for the action as a whole is the product of the probabilities
for each stage. And the third rule simply restates the “desired property”
presented in equation (2.2).
We apply these rules to various situations that we’ve already encoun-
tered, beginning with the interference experiment sketched above. Recall
the probabilities already established (first column in table):
                                      probability   |amplitude|   amplitude
go from input to output                    0             0            0
go from input to output via path a        1/4           1/2         +1/2
go from input to output via path b        1/4           1/2         −1/2
If rule (1) is to hold, then the amplitude to go from input to output must
also be 0, while the amplitude to go via a path must have magnitude 1/2
(second column in table). According to rule (3), the two amplitudes to
go via a and via b must sum to zero, so they cannot both be represented
by positive numbers. Whatever mathematical entity is used to represent
amplitude, it must enable two such entities, each with non-zero magnitude,
to sum to zero. There are many such entities: real numbers, complex
numbers, hypercomplex numbers, and vectors in three dimensions are all
possibilities. For this particular interference experiment, it suffices to assign
real numbers to amplitudes: the amplitude to go via path a is +1/2, and the
amplitude to go via path b is −1/2. (Third column in table. The negative
sign could have been assigned to path a rather than to path b: this choice is
merely conventional.) For other interference experiments complex numbers
are required. It turns out that, for all situations yet encountered, one can
represent amplitude mathematically as a complex number. Once again,
this reflects the results of experiment, not of philosophy or meditation.
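The three rules translate directly into code. The sketch below is illustrative only: the function names `probability`, `in_series`, and `in_parallel` are hypothetical, and the path amplitudes +1/2 and −1/2 are the values from the third column of the table.

```python
def probability(amplitude):
    """Rule (1): probability is the squared magnitude of the amplitude."""
    return abs(amplitude) ** 2

def in_series(*stage_amplitudes):
    """Rule (2): multiply the amplitudes of successive stages."""
    result = 1
    for stage in stage_amplitudes:
        result *= stage
    return result

def in_parallel(*path_amplitudes):
    """Rule (3): add the amplitudes of alternative paths."""
    return sum(path_amplitudes)

amp_a = +0.5                              # amplitude via path a
amp_b = -0.5                              # amplitude via path b
amp_total = in_parallel(amp_a, amp_b)     # amplitude from input to output
```

Here `probability(amp_total)` is 0 even though `probability(amp_a) + probability(amp_b)` is 1/2, which is inequality (2.1) in numbers: amplitudes add, probabilities do not.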
The second situation we’ll consider is a Stern-Gerlach analyzer.
[Figure: a $\theta$-analyzer, tilted at angle $\theta$ from the $z$ axis, with entering state $|z+\rangle$ and exit states $|\theta+\rangle$ and $|\theta-\rangle$.]
The amplitude that an atom entering the $\theta$-analyzer in state $|z+\rangle$ exits in
state $|\theta+\rangle$ is called3 $\langle\theta+|z+\rangle$. That phrase is a real mouthful, so the symbol
$\langle\theta+|z+\rangle$ is pronounced “the amplitude that $|z+\rangle$ is in $|\theta+\rangle$”, even though
this briefer pronunciation leaves out the important role of the analyzer.4
From rule (1), we know that
$$|\langle\theta+|z+\rangle|^2 = \cos^2(\theta/2) \tag{2.3}$$
$$|\langle\theta-|z+\rangle|^2 = \sin^2(\theta/2). \tag{2.4}$$
You can also use rule (1), in connection with the experiments described in
problem 1.2, “Exit probabilities” (on page 20) to determine that
$$\begin{aligned}
|\langle z+|\theta+\rangle|^2 &= \cos^2(\theta/2) \\
|\langle z-|\theta+\rangle|^2 &= \sin^2(\theta/2) \\
|\langle\theta+|z-\rangle|^2 &= \sin^2(\theta/2) \\
|\langle\theta-|z-\rangle|^2 &= \cos^2(\theta/2) \\
|\langle z+|\theta-\rangle|^2 &= \sin^2(\theta/2) \\
|\langle z-|\theta-\rangle|^2 &= \cos^2(\theta/2).
\end{aligned}$$
3 The states appear in the symbol in the opposite sequence from their appearance in
the description.
4 The ultimate source of such problems is that the English language was invented by
people who did not understand quantum mechanics, hence they never produced concise,
accurate phrases to describe quantal phenomena. In the same way, the ancient phrase
“search the four corners of the Earth” is still colorful and practical, and is used today
even by those who know that the Earth doesn’t have four corners.
Clearly analyzer experiments like these determine the magnitude of an
amplitude. No analyzer experiment can determine the phase of an amplitude.
To determine phases, we must perform interference experiments.
So the third situation is an interference experiment.
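[Figure: an interferometer tilted at angle $\theta$ from the $z$ axis, with paths a and b, input state $|z+\rangle$, and output state $|z-\rangle$.]
Rule (2), actions in series, tells us that the amplitude to go from $|z+\rangle$ to
$|z-\rangle$ via path a is the product of the amplitude to go from $|z+\rangle$ to $|\theta+\rangle$
times the amplitude to go from $|\theta+\rangle$ to $|z-\rangle$:
$$\text{amplitude to go via path } a = \langle z-|\theta+\rangle\langle\theta+|z+\rangle.$$
Similarly
$$\text{amplitude to go via path } b = \langle z-|\theta-\rangle\langle\theta-|z+\rangle.$$
And then rule (3), actions in parallel, tells us that the amplitude to go from
$|z+\rangle$ to $|z-\rangle$ is the sum of the amplitude to go via path a and the amplitude
to go via path b. In other words
$$\langle z-|z+\rangle = \langle z-|\theta+\rangle\langle\theta+|z+\rangle + \langle z-|\theta-\rangle\langle\theta-|z+\rangle. \tag{2.5}$$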
We know the magnitude of each of these amplitudes from analyzer experiments:

amplitude                          magnitude
$\langle z-|z+\rangle$             $0$
$\langle z-|\theta+\rangle$        $|\sin(\theta/2)|$
$\langle\theta+|z+\rangle$         $|\cos(\theta/2)|$
$\langle z-|\theta-\rangle$        $|\cos(\theta/2)|$
$\langle\theta-|z+\rangle$         $|\sin(\theta/2)|$

The task now is to assign phases to these magnitudes in such a way that
equation (2.5) is satisfied. In doing so we are faced with an embarrassment
of riches: there are many consistent ways to make this assignment. Here
are two commonly used conventions:

amplitude                          convention I           convention II
$\langle z-|z+\rangle$             $0$                    $0$
$\langle z-|\theta+\rangle$        $\sin(\theta/2)$       $i\sin(\theta/2)$
$\langle\theta+|z+\rangle$         $\cos(\theta/2)$       $\cos(\theta/2)$
$\langle z-|\theta-\rangle$        $\cos(\theta/2)$       $\cos(\theta/2)$
$\langle\theta-|z+\rangle$         $-\sin(\theta/2)$      $-i\sin(\theta/2)$
There are two things to notice about these amplitude assignments.
First, one normally assigns values to physical quantities by experiment, or
by calculation, but not “by convention”. Second, both of these conventions
show unexpected behaviors: Because the angle $0^\circ$ is the same as the angle
$360^\circ$, one would expect that $\langle 0^\circ+|z+\rangle$ would equal $\langle 360^\circ+|z+\rangle$, whereas
in fact the first amplitude is $+1$ and the second is $-1$. Because the state
$|180^\circ-\rangle$ (that is, $|\theta-\rangle$ with $\theta = 180^\circ$) is the same as the state $|z+\rangle$, one
would expect that $\langle 180^\circ-|z+\rangle = 1$, whereas in fact $\langle 180^\circ-|z+\rangle$ is either
$-1$ or $-i$, depending on convention. These two observations underscore
the fact that amplitude is a mathematical tool that enables us to calculate
physically observable quantities, like probabilities. It is not itself a physical
entity. No experiment measures amplitude. Amplitude is not “out there,
physically present in space” in the way that, say, a nitrogen molecule is.
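Both observations are easy to check numerically. The sketch below tabulates the two conventions and evaluates the right-hand side of equation (2.5); the function names and the string labels used as dictionary keys are hypothetical conveniences.

```python
import math

def amplitudes(theta_deg, convention):
    """The four tabulated amplitudes for convention "I" or "II"."""
    half = math.radians(theta_deg) / 2
    phase = 1 if convention == "I" else 1j   # convention II carries a factor of i on the sines
    return {
        "<z-|th+>": phase * math.sin(half),
        "<th+|z+>": math.cos(half),
        "<z-|th->": math.cos(half),
        "<th-|z+>": -phase * math.sin(half),
    }

def z_minus_from_z_plus(theta_deg, convention):
    """Right-hand side of equation (2.5): path a plus path b."""
    a = amplitudes(theta_deg, convention)
    path_a = a["<z-|th+>"] * a["<th+|z+>"]   # rule (2), actions in series
    path_b = a["<z-|th->"] * a["<th-|z+>"]
    return path_a + path_b                   # rule (3), actions in parallel
```

For every angle and for either convention the sum vanishes, as equation (2.5) requires, while `amplitudes(0, "I")["<th+|z+>"]` and `amplitudes(360, "I")["<th+|z+>"]` come out with opposite signs.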
A good analogy is that an amplitude convention is like a language. Any
language is a human convention: there is no intrinsic connection between a
physical horse and the English word “horse”, or the German word “Pferd”,
or the Swahili word “farasi”. The fact that language is pure human
convention, and that there are multiple conventions for the name of a horse,
doesn’t mean that language is unimportant: on the contrary language is
an immensely powerful tool. And the fact that language is pure human
convention doesn’t mean that you can’t develop intuition about language:
on the contrary if you know the meaning of “arachnid” and the meaning
of “phobia”, then your intuition for English tells you that “arachnopho-
bia” means fear of spiders. Exactly the same is true for amplitude: it is a
powerful tool, and with practice you can develop intuition for it.
When I introduced the phenomenon of quantal interference on page 25,
I said that there was no word or phrase in the English language that ac-
curately represents what’s going on: It’s flat-out wrong to say “the atom
takes path a” and it’s flat-out wrong to say “the atom takes path b”. It
gives a wrong impression to say “the atom takes no path” or “the atom
takes both paths”. I introduced the phrase “the atom ambivates through
the two paths of the interferometer”. Now we have a technically correct
way of describing the phenomenon: “the atom has an amplitude to take
path a and an amplitude to take path b”.
Here’s another warning about language: If an atom in state $|\psi\rangle$ enters
a vertical analyzer, the amplitude for it to exit from the + port is $\langle z+|\psi\rangle$.
(And of course the amplitude for it to exit from the − port is $\langle z-|\psi\rangle$.) This is
often stated “If the atom is in state $|\psi\rangle$, the amplitude of it being in state
$|z+\rangle$ is $\langle z+|\psi\rangle$.” This is an acceptable shorthand for the full explanation,
which requires thinking about an analyzer experiment, even though the
shorthand never mentions the analyzer. But never say “If the atom is in
state $|\psi\rangle$, the probability of it being in state $|z+\rangle$ is $|\langle z+|\psi\rangle|^2$.” This gives
the distinct and incorrect impression that before entering the analyzer, the
atom was either in state $|z+\rangle$ or in state $|z-\rangle$, and you just didn’t know
which it was. Instead, say “If an atom in state $|\psi\rangle$ enters a vertical analyzer,
the probability of exiting from the + port in state $|z+\rangle$ is $|\langle z+|\psi\rangle|^2$.”
2.2.1 Sample Problem: Two paths
Find an equation similar to equation (2.5) representing the amplitude to
start in state $|\psi\rangle$ at input, ambivate through a vertical interferometer, and
end in state $|\phi\rangle$ at output.
[Figure: a vertical interferometer with paths 1a and 1b, input state $|\psi\rangle$, and output state $|\phi\rangle$.]
Solution: Because of rule (2), actions in series, the amplitude for the
atom to take the top path is the product
$$\langle\phi|z+\rangle\langle z+|\psi\rangle.$$
Similarly the amplitude for it to take the bottom path is
$$\langle\phi|z-\rangle\langle z-|\psi\rangle.$$
Because of rule (3), actions in parallel, the amplitude for it to ambivate
through both paths is the sum of these two, and we conclude that
$$\langle\phi|\psi\rangle = \langle\phi|z+\rangle\langle z+|\psi\rangle + \langle\phi|z-\rangle\langle z-|\psi\rangle. \tag{2.6}$$
2.2.2 Sample Problem: Three paths
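Stretch apart a vertical interferometer, so that the recombining rear end
is far from the splitting front end, and insert a $\theta$ interferometer into the
bottom path. Now there are three paths from input to output. Find an
equation similar to equation (2.5) representing the amplitude to start in
state $|\psi\rangle$ at input and end in state $|\phi\rangle$ at output.
[Figure: a vertical interferometer with paths 1a and 1b; a $\theta$ interferometer with paths 2a and 2b is inserted into the bottom path. Input state $|\psi\rangle$, output state $|\phi\rangle$.]
Solution:
$$\begin{aligned}
\langle\phi|\psi\rangle = {} & \langle\phi|z+\rangle\langle z+|\psi\rangle \\
& + \langle\phi|z-\rangle\langle z-|\theta+\rangle\langle\theta+|z-\rangle\langle z-|\psi\rangle \\
& + \langle\phi|z-\rangle\langle z-|\theta-\rangle\langle\theta-|z-\rangle\langle z-|\psi\rangle
\end{aligned} \tag{2.7}$$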
Problems
2.1 Talking about interference
An atom in state $|\psi\rangle$ ambivates through a vertical analyzer. We say,
appropriately, that “the atom has an amplitude to take the top path
and an amplitude to take the bottom path”. Find expressions for those
two amplitudes and describe, in ten sentences or fewer, why it is not
appropriate to say “the atom has probability $|\langle z+|\psi\rangle|^2$ to take the top
path and probability $|\langle z-|\psi\rangle|^2$ to take the bottom path”.
2.2 Other conventions
Two conventions for assigning amplitudes are given in the table on
page 61. Show that if $\langle z-|\theta+\rangle$ and $\langle z-|\theta-\rangle$ are multiplied by phase
factor $e^{i\alpha}$, and if $\langle z+|\theta+\rangle$ and $\langle z+|\theta-\rangle$ are multiplied by phase factor
$e^{i\beta}$ (where $\alpha$ and $\beta$ are both real), then the resulting amplitudes are
just as good as the original (for either convention I or convention II).
2.3 Peculiarities of amplitude
Page 61 pointed out some of the peculiarities of amplitude; this problem
points out another. Since the angle $\theta$ is the same as the angle $360^\circ + \theta$,
one would expect that $\langle\theta+|z+\rangle$ would equal $\langle(360^\circ + \theta)+|z+\rangle$. Show,
using either of the conventions given in the table on page 61, that this
expectation is false. What is instead correct?
2.3 Reversal-conjugation relation
Working with amplitudes is made easier through the theorem that the am-
plitude to go from state $|\psi\rangle$ to state $|\phi\rangle$ and the amplitude to go in the
opposite direction are related through complex conjugation:
$$\langle\phi|\psi\rangle = \langle\psi|\phi\rangle^*. \tag{2.8}$$
The proof below works for states of the magnetic moment of a silver atom
(the kind of states we’ve worked with so far), but in fact the result holds
for any quantal system.
The proof relies on three facts: First, the probability for one state to
be analyzed into another depends only on the magnitude of the angle be-
tween the incoming magnetic moment and the analyzer, and not on the
sense of that angle. (An atom in state $|z+\rangle$ has the same probability of
leaving the + port of an analyzer whether the analyzer is rotated $17^\circ$ clockwise
or $17^\circ$ counterclockwise.) Thus
$$|\langle\phi|\psi\rangle|^2 = |\langle\psi|\phi\rangle|^2 . \tag{2.9}$$
Second, an atom exits an interferometer in the same state in which it en-
tered, so
$$\langle\phi|\psi\rangle = \langle\phi|\theta+\rangle\langle\theta+|\psi\rangle + \langle\phi|\theta-\rangle\langle\theta-|\psi\rangle. \tag{2.10}$$
Third, an atom entering an analyzer comes out somewhere, so
$$1 = |\langle\theta+|\psi\rangle|^2 + |\langle\theta-|\psi\rangle|^2 . \tag{2.11}$$
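The proof also relies on a mathematical result called “the triangle in-
equality for complex numbers”: If $a$ and $b$ are positive real numbers with $a + b = 1$,
and in addition $e^{i\alpha} a + e^{i\beta} b = 1$, with $\alpha$ and $\beta$ real, then $\alpha = \beta = 0$. You
can find very general, very abstract, proofs of the triangle inequality, but
the complex plane sketch below encapsulates the idea:
[Figure: complex-plane sketch (real and imaginary axes) showing the segments $a$ and $b$ laid end to end along the real axis to reach 1, together with the rotated segments $e^{i\alpha}a$ and $e^{i\beta}b$.]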
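From the first fact (2.9), the two complex numbers $\langle\phi|\psi\rangle$ and $\langle\psi|\phi\rangle$ have
the same magnitude, so they differ only in phase. Write this statement as
$$\langle\phi|\psi\rangle = e^{i\delta}\langle\psi|\phi\rangle^* \tag{2.12}$$
where the phase $\delta$ is a real number that might depend on the states $|\phi\rangle$ and
$|\psi\rangle$. Apply this general result first to the particular state $|\phi\rangle = |\theta+\rangle$:
$$\langle\theta+|\psi\rangle = e^{i\delta_+}\langle\psi|\theta+\rangle^* , \tag{2.13}$$
and then to the particular state $|\phi\rangle = |\theta-\rangle$:
$$\langle\theta-|\psi\rangle = e^{i\delta_-}\langle\psi|\theta-\rangle^* , \tag{2.14}$$
where the two real numbers $\delta_+$ and $\delta_-$ might be different. Our objective is
to prove that $\delta_+ = \delta_- = 0$.
Apply the second fact (2.10) with $|\phi\rangle = |\psi\rangle$, giving
$$\begin{aligned}
1 &= \langle\psi|\theta+\rangle\langle\theta+|\psi\rangle + \langle\psi|\theta-\rangle\langle\theta-|\psi\rangle \\
&= e^{i\delta_+}\langle\psi|\theta+\rangle\langle\psi|\theta+\rangle^* + e^{i\delta_-}\langle\psi|\theta-\rangle\langle\psi|\theta-\rangle^* \\
&= e^{i\delta_+}|\langle\psi|\theta+\rangle|^2 + e^{i\delta_-}|\langle\psi|\theta-\rangle|^2 \\
&= e^{i\delta_+}|\langle\theta+|\psi\rangle|^2 + e^{i\delta_-}|\langle\theta-|\psi\rangle|^2 .
\end{aligned} \tag{2.15}$$
Compare this result to the third fact (2.11)
$$1 = |\langle\theta+|\psi\rangle|^2 + |\langle\theta-|\psi\rangle|^2 \tag{2.16}$$
and use the triangle inequality with $a = |\langle\theta+|\psi\rangle|^2$ and $b = |\langle\theta-|\psi\rangle|^2$.
The two phases $\delta_+$ and $\delta_-$ must vanish, so the “reversal-conjugation
relation” is proven.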
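The triangle-inequality step can be illustrated numerically. The sketch below is not a proof: it picks one arbitrary pair a = 0.3, b = 0.7 (hypothetical values standing in for the two squared magnitudes), scans the phases on a 5° grid, and records every pair of phases for which the rotated sum still equals 1.

```python
import cmath
import math

def rotated_sum(a, b, alpha, beta):
    """The complex number e^{i alpha} a + e^{i beta} b."""
    return cmath.exp(1j * alpha) * a + cmath.exp(1j * beta) * b

a, b = 0.3, 0.7                                       # illustrative values with a + b = 1
grid = [2 * math.pi * k / 72 for k in range(72)]      # phases from 0 up to 355 degrees

# Phase pairs on the grid for which the rotated sum is still 1.
solutions = [(alpha, beta) for alpha in grid for beta in grid
             if abs(rotated_sum(a, b, alpha, beta) - 1) < 1e-9]

# Largest magnitude the rotated sum reaches anywhere on the grid.
largest = max(abs(rotated_sum(a, b, alpha, beta)) for alpha in grid for beta in grid)
```

On this grid the only surviving pair is α = β = 0, and the magnitude of the sum never exceeds 1, which is the content of the sketch in the complex plane.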
2.4 Establishing a phase convention
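Although there are multiple alternative phase conventions for amplitudes
(see problem 2.2 on page 64), we will from now on use only phase conven-
tion I from page 61:
$$\begin{aligned}
\langle z+|\theta+\rangle &= \cos(\theta/2) \\
\langle z-|\theta+\rangle &= \sin(\theta/2) \\
\langle z+|\theta-\rangle &= -\sin(\theta/2) \\
\langle z-|\theta-\rangle &= \cos(\theta/2)
\end{aligned} \tag{2.17}$$
In particular, for $\theta = 90^\circ$ we have
$$\begin{aligned}
\langle z+|x+\rangle &= 1/\sqrt{2} \\
\langle z-|x+\rangle &= 1/\sqrt{2} \\
\langle z+|x-\rangle &= -1/\sqrt{2} \\
\langle z-|x-\rangle &= 1/\sqrt{2}
\end{aligned} \tag{2.18}$$
This convention has a desirable special case for $\theta = 0^\circ$, namely
$$\begin{aligned}
\langle z+|\theta+\rangle &= 1 \\
\langle z-|\theta+\rangle &= 0 \\
\langle z+|\theta-\rangle &= 0 \\
\langle z-|\theta-\rangle &= 1
\end{aligned} \tag{2.19}$$
but an unexpected special case for $\theta = 360^\circ$, namely
$$\begin{aligned}
\langle z+|\theta+\rangle &= -1 \\
\langle z-|\theta+\rangle &= 0 \\
\langle z+|\theta-\rangle &= 0 \\
\langle z-|\theta-\rangle &= -1
\end{aligned} \tag{2.20}$$
This is perplexing, given that the angle $\theta = 0^\circ$ is the same as the angle
$\theta = 360^\circ$! Any convention will have similar perplexing cases. Such perplexities
underscore the fact that amplitudes are important mathematical tools used
to calculate probabilities, but are not “physically real”.
Given these amplitudes, we can use the interference result (2.6) to cal-
culate any amplitude of interest:
$$\begin{aligned}
\langle\phi|\psi\rangle &= \langle\phi|z+\rangle\langle z+|\psi\rangle + \langle\phi|z-\rangle\langle z-|\psi\rangle \\
&= \langle z+|\phi\rangle^*\langle z+|\psi\rangle + \langle z-|\phi\rangle^*\langle z-|\psi\rangle
\end{aligned} \tag{2.21}$$
where in the last line we have used the reversal-conjugation relation (2.8).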
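Equation (2.21) is a recipe that can be coded directly. In the sketch below the function names `z_amplitudes` and `amplitude` are hypothetical, and the angles 100° and 30° are arbitrary illustrative choices; the amplitudes themselves are those of convention I, equation (2.17).

```python
import math

def z_amplitudes(theta_deg, sign):
    """Return the pair (<z+|state>, <z-|state>) for |theta+> or |theta->, per eq. (2.17)."""
    half = math.radians(theta_deg) / 2
    if sign == "+":
        return (math.cos(half), math.sin(half))
    return (-math.sin(half), math.cos(half))

def amplitude(bra, ket):
    """<bra|ket> via eq. (2.21): conjugate the bra's z-amplitudes, multiply, and add."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

# Example: the amplitude <100°+|30°+>.
example = amplitude(z_amplitudes(100, "+"), z_amplitudes(30, "+"))
```

The example evaluates to cos(35°), that is, the cosine of half the angle between the two directions, in line with the analyzer probabilities quoted earlier.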
Problems
2.4 Other conventions, other peculiarities
Write what this section would have been had we adopted convention II
rather than convention I from page 61. In addition, evaluate the four
amplitudes of equation (2.17) for $\theta = +180^\circ$ and $\theta = -180^\circ$.
2.5 Finding amplitudes (recommended problem)
Using the interference idea embodied in equation (2.21), calculate the
amplitudes $\langle\theta+|54^\circ+\rangle$ and $\langle\theta-|54^\circ+\rangle$ as a function of $\theta$. Do these
amplitudes have the values you expect for $\theta = 54^\circ$? For $\theta = 234^\circ$?
Plot $\langle\theta+|54^\circ+\rangle$ for $\theta$ from $0^\circ$ to $360^\circ$. Compare the result for $\theta = 0^\circ$
and $\theta = 360^\circ$.
2.6 Rotations
Use the interference idea embodied in equation (2.21) to show that
$$\begin{aligned}
\langle x+|\theta+\rangle &= \tfrac{1}{\sqrt{2}}[\cos(\theta/2) + \sin(\theta/2)] \\
\langle x-|\theta+\rangle &= -\tfrac{1}{\sqrt{2}}[\cos(\theta/2) - \sin(\theta/2)] \\
\langle x+|\theta-\rangle &= \tfrac{1}{\sqrt{2}}[\cos(\theta/2) - \sin(\theta/2)] \\
\langle x-|\theta-\rangle &= \tfrac{1}{\sqrt{2}}[\cos(\theta/2) + \sin(\theta/2)]
\end{aligned} \tag{2.22}$$
If and only if you enjoy trigonometric identities, you should then show
that these results can be written equivalently as
$$\begin{aligned}
\langle x+|\theta+\rangle &= \cos((\theta - 90^\circ)/2) \\
\langle x-|\theta+\rangle &= \sin((\theta - 90^\circ)/2) \\
\langle x+|\theta-\rangle &= -\sin((\theta - 90^\circ)/2) \\
\langle x-|\theta-\rangle &= \cos((\theta - 90^\circ)/2)
\end{aligned} \tag{2.23}$$
This makes perfect geometric sense, as the angle relative to the x axis
is $90^\circ$ less than the angle relative to the z axis.
2.5 How can I specify a quantal state?
We introduced the Dirac notation for quantal states on page 56, but haven’t
yet fleshed out that notation by specifying a state mathematically. Start
with an analogy:
2.5.1 How can I specify a position vector?
We are so used to writing down the position vector $\vec{r}$ that we rarely stop
to ask ourselves what it means. But the plain fact is that whenever we
measure a length (say, with a meter stick) we find not a vector, but a single
number! Experiments never measure the vector $\vec{r}$ but always a scalar:
the dot product between $\vec{r}$ and some other vector, call it $\vec{s}$ for “some other”.
If we know the dot product between $\vec{r}$ and every vector $\vec{s}$, then we know
everything there is to know about $\vec{r}$. Does this mean that to specify $\vec{r}$, we
must keep a list of all possible dot products $\vec{s}\cdot\vec{r}$? Of course not... such a
list would be infinitely long!
You know that if you write $\vec{r}$ in terms of an orthonormal basis $\{\hat{\imath}, \hat{\jmath}, \hat{k}\}$,
namely
$$\vec{r} = r_x\hat{\imath} + r_y\hat{\jmath} + r_z\hat{k} \tag{2.24}$$
where $r_x = \hat{\imath}\cdot\vec{r}$, $r_y = \hat{\jmath}\cdot\vec{r}$, and $r_z = \hat{k}\cdot\vec{r}$, then you’ve specified the vector.
Why? Because if you know the triplet $(r_x, r_y, r_z)$ and the triplet $(s_x, s_y, s_z)$,
then you can easily find the desired dot product
$$\vec{s}\cdot\vec{r} = \begin{pmatrix} s_x & s_y & s_z \end{pmatrix}\begin{pmatrix} r_x \\ r_y \\ r_z \end{pmatrix} = s_x r_x + s_y r_y + s_z r_z . \tag{2.25}$$
It’s a lot more compact to specify the vector through three dot products
(namely $\hat{\imath}\cdot\vec{r}$, $\hat{\jmath}\cdot\vec{r}$, and $\hat{k}\cdot\vec{r}$), from which you can readily calculate an
infinite number of desired dot products, than it is to list the infinitely many dot
products themselves!
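As a small illustration of equation (2.25), the sketch below stores a vector as its three basis dot products and recovers any other dot product from them. The component values are arbitrary, made up for the example, and the name `dot` is hypothetical.

```python
def dot(s, r):
    """Dot product from components, as in eq. (2.25)."""
    return sum(s_component * r_component for s_component, r_component in zip(s, r))

r = (1.0, 2.0, 3.0)        # made-up components (r_x, r_y, r_z)
s = (0.5, -1.0, 2.0)       # made-up components of "some other" vector

# The basis vectors themselves, written as triplets.
i_hat, j_hat, k_hat = (1, 0, 0), (0, 1, 0), (0, 0, 1)
```

Here `dot(i_hat, r)`, `dot(j_hat, r)`, and `dot(k_hat, r)` return the stored components, and `dot(s, r)` gives 0.5 − 2 + 6 = 4.5 without any further measurement.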
2.5.2 How can I specify a quantal state?
Like the position vector $\vec{r}$, the quantal state $|\psi\rangle$ cannot by itself be mea-
sured. But if we determine (through some combination of analyzer exper-
iments, interference experiments, and convention) the amplitude $\langle\sigma|\psi\rangle$ for
every possible state $|\sigma\rangle$, then we know everything there is to know about
$|\psi\rangle$. Is there some compact way of specifying the state, or do we have to
keep an infinitely long list of all these amplitudes?
This nut is cracked through the interference experiment result
$$\langle\sigma|\psi\rangle = \langle\sigma|\theta+\rangle\langle\theta+|\psi\rangle + \langle\sigma|\theta-\rangle\langle\theta-|\psi\rangle, \tag{2.26}$$
which simply says, in symbols, that the atom exits an interferometer in the
same state in which it entered (see equation 2.10). It gets hard to keep
track of all these symbols, so I’ll introduce the names
$$\langle\theta+|\psi\rangle = \psi_+ \qquad \langle\theta-|\psi\rangle = \psi_-$$
and
$$\langle\theta+|\sigma\rangle = \sigma_+ \qquad \langle\theta-|\sigma\rangle = \sigma_- .$$
From the reversal-conjugation relation, this means
$$\langle\sigma|\theta+\rangle = \sigma_+^* \qquad \langle\sigma|\theta-\rangle = \sigma_-^* .$$
In terms of these symbols, the interference result (2.26) is
$$\langle\sigma|\psi\rangle = \sigma_+^*\psi_+ + \sigma_-^*\psi_- = \begin{pmatrix} \sigma_+^* & \sigma_-^* \end{pmatrix}\begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}. \tag{2.27}$$
And this is our shortcut! By keeping track of only two amplitudes, $\psi_+$ and
$\psi_-$, for each state, we can readily calculate any amplitude desired. We
don’t have to keep an infinitely long list of amplitudes.
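The shortcut of equation (2.27) amounts to a conjugated dot product of two pairs. The sketch below stores each state as its pair of amplitudes; the three sample states are made-up pairs of complex numbers chosen only to exercise the formula, not states singled out in the text.

```python
import math

def amplitude(sigma, psi):
    """Eq. (2.27): <sigma|psi> = sigma_+^* psi_+ + sigma_-^* psi_-."""
    sigma_plus, sigma_minus = sigma
    psi_plus, psi_minus = psi
    return sigma_plus.conjugate() * psi_plus + sigma_minus.conjugate() * psi_minus

r = 1 / math.sqrt(2)
psi = (r, 1j * r)       # made-up pair (psi_+, psi_-)
sigma = (r, -1j * r)    # made-up pair (sigma_+, sigma_-)
tau = (0.6, 0.8)        # a third made-up pair
```

With these pairs, `amplitude(sigma, psi)` vanishes, `amplitude(psi, psi)` is 1, and `amplitude(tau, psi)` is the complex conjugate of `amplitude(psi, tau)`, as the reversal-conjugation relation demands.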
This dot product result for computing amplitude is so useful and so
convenient that sometimes people say the amplitude is a dot product. No.
The amplitude reflects analyzer experiments, plus interference experiments,
plus convention. The dot product is a powerful mathematical tool for com-
puting amplitudes. (A parallel situation: There are many ways to find the
latitude and longitude coordinates for a point on the Earth’s surface, but
the easiest is to use a GPS device. Some people are so enamored of this
ease that they call the latitude and longitude the “GPS coordinates”. But
in fact the coordinates were established long before the Global Positioning
System was built.)
2.5.3 What is a basis?
For vectors in three-dimensional space, an orthonormal basis such as
$\{\hat{\imath}, \hat{\jmath}, \hat{k}\}$ is a set of three vectors of unit magnitude perpendicular to each
other. As we’ve seen, the importance of a basis is that every vector $\vec{r}$ can
be represented as a sum over these basis vectors,
$$\vec{r} = r_x\hat{\imath} + r_y\hat{\jmath} + r_z\hat{k},$$
and hence any vector $\vec{r}$ can be conveniently represented through the triplet
$$\begin{pmatrix} r_x \\ r_y \\ r_z \end{pmatrix} = \begin{pmatrix} \hat{\imath}\cdot\vec{r} \\ \hat{\jmath}\cdot\vec{r} \\ \hat{k}\cdot\vec{r} \end{pmatrix}.$$
For quantal states, we’ve seen that a set of two states such as
$\{|\theta+\rangle, |\theta-\rangle\}$ plays a similar role, so it too is called a basis. For the magnetic
moment of a silver atom, two states $|a\rangle$ and $|b\rangle$ constitute a basis when-
ever $\langle a|b\rangle = 0$, and the analyzer experiment of section 1.1.4 shows that
the states $|\theta+\rangle$ and $|\theta-\rangle$ certainly satisfy this requirement. In the basis
$\{|a\rangle, |b\rangle\}$ an arbitrary state $|\psi\rangle$ can be conveniently represented through
the pair of amplitudes
$$\begin{pmatrix} \langle a|\psi\rangle \\ \langle b|\psi\rangle \end{pmatrix}.$$
2.5.4 Hilbert space
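We have learned to express a physical state as a mathematical entity:
namely, using the $\{|a\rangle, |b\rangle\}$ basis, the state $|\psi\rangle$ is represented as a column
matrix of amplitudes
$$\begin{pmatrix} \langle a|\psi\rangle \\ \langle b|\psi\rangle \end{pmatrix}.$$
This mathematical entity is called a “state vector in Hilbert5 space”.
For example, in the basis $\{|z+\rangle, |z-\rangle\}$ the state $|\theta+\rangle$ is represented by
$$\begin{pmatrix} \langle z+|\theta+\rangle \\ \langle z-|\theta+\rangle \end{pmatrix} = \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix}. \tag{2.28}$$
Whereas (in light of equation 2.22) in the basis $\{|x+\rangle, |x-\rangle\}$ that same
state $|\theta+\rangle$ is represented by the different column matrix
$$\begin{pmatrix} \langle x+|\theta+\rangle \\ \langle x-|\theta+\rangle \end{pmatrix} = \begin{pmatrix} \tfrac{1}{\sqrt{2}}[\cos(\theta/2) + \sin(\theta/2)] \\ -\tfrac{1}{\sqrt{2}}[\cos(\theta/2) - \sin(\theta/2)] \end{pmatrix}. \tag{2.29}$$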
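The passage from (2.28) to (2.29) is a change of basis, and it can be carried out mechanically. The sketch below assumes convention I, with the z-basis amplitudes of the x states taken from equation (2.18); the function and constant names are hypothetical.

```python
import math

def theta_plus_in_z_basis(theta_deg):
    """Column matrix of |theta+> in the {|z+>, |z->} basis, eq. (2.28)."""
    half = math.radians(theta_deg) / 2
    return (math.cos(half), math.sin(half))

# z-basis amplitudes (<z+|x>, <z-|x>) of the two x states, from eq. (2.18).
X_PLUS = (1 / math.sqrt(2), 1 / math.sqrt(2))
X_MINUS = (-1 / math.sqrt(2), 1 / math.sqrt(2))

def bracket(bra, ket):
    """<bra|ket> computed from z-basis amplitudes: conjugate the bra, multiply, add."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def theta_plus_in_x_basis(theta_deg):
    """Column matrix (<x+|theta+>, <x-|theta+>) of the same state in the x basis."""
    state = theta_plus_in_z_basis(theta_deg)
    return (bracket(X_PLUS, state), bracket(X_MINUS, state))
```

For any angle the two entries returned by `theta_plus_in_x_basis` agree with the right-hand side of (2.29): same state, different column matrix.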
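Write down the interference experiment result twice
$$\begin{aligned}
\langle a|\psi\rangle &= \langle a|z+\rangle\langle z+|\psi\rangle + \langle a|z-\rangle\langle z-|\psi\rangle \\
\langle b|\psi\rangle &= \langle b|z+\rangle\langle z+|\psi\rangle + \langle b|z-\rangle\langle z-|\psi\rangle
\end{aligned}$$
and then write these two equations as one using column matrix notation
$$\begin{pmatrix} \langle a|\psi\rangle \\ \langle b|\psi\rangle \end{pmatrix} = \begin{pmatrix} \langle a|z+\rangle \\ \langle b|z+\rangle \end{pmatrix}\langle z+|\psi\rangle + \begin{pmatrix} \langle a|z-\rangle \\ \langle b|z-\rangle \end{pmatrix}\langle z-|\psi\rangle.$$
Notice the column matrix representations of states $|\psi\rangle$, $|z+\rangle$, and $|z-\rangle$, and
write this equation as
$$|\psi\rangle = |z+\rangle\langle z+|\psi\rangle + |z-\rangle\langle z-|\psi\rangle. \tag{2.30}$$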
5 The German mathematician David Hilbert (1862–1943) made contributions to func-
tional analysis, geometry, mathematical physics, and other areas. He formalized and
extended the concept of a vector space. Hilbert and Albert Einstein raced to uncover
the field equations of general relativity, but Einstein beat Hilbert by a matter of weeks.
And now we have a new thing under the sun. We never talk about adding
together two classical states, nor multiplying them by numbers, but this
equation gives us the meaning of such state addition in quantum mechan-
ics. This is a new mathematical tool, it deserves a new name, and that
name is “superposition”. Superposition is the mathematical reflection of
the physical phenomenon of interference, as in the sentence: “When an
atom ambivates through an interferometer, its state is a superposition of
the state of an atom taking path a and the state of an atom taking path b.”
Superposition is not familiar from daily life or from classical mechanics,
but there is a story6 that increases understanding: “A medieval European
traveler returns home from a journey to India, and describes a rhinoceros
as a sort of cross between a dragon and a unicorn.” In this story the
rhinoceros, an animal that is not familiar but that does exist, is described
as intermediate (a “sort of cross”) between two fantasy animals (the dragon
and the unicorn) that are familiar (to the medieval European) but that do
not exist.
Similarly, an atom in state |z+⟩ ambivates through both paths of a
horizontal interferometer. This action is not familiar but does happen, and
it is characterized as a superposition (a “sort of cross”) between two actions
(“taking path a” and “taking path b”) that are familiar (to all of us steeped
in the classical approximation) but that do not happen.
In principle, any calculation performed using the Hilbert space rep-
resentation of states could be performed by considering suitable, cleverly
designed analyzer and interference experiments. But it’s a lot easier to use
the abstract Hilbert space machinery. (Similarly, any result in electrostatics
could be found using Coulomb’s Law, but it’s a lot easier to use the ab-
stract electric field and electric potential. Any calculation involving vectors
could be performed graphically, but it’s a lot easier to use abstract compo-
nents. Any addition or subtraction of whole numbers could be performed
by counting out marbles, but it’s a lot easier to use abstract mathematical
tools like carrying and borrowing.)
2.5.5 Peculiarities of state vectors
Because state vectors are built from amplitudes, and amplitudes have
peculiarities (see pages 61 and 67), it is natural that state vectors have
similar peculiarities. For example, since the angle θ is the same as the angle
θ + 360°, I would expect that the state vector |θ+⟩ would be the same as
the state vector |(θ + 360°)+⟩.
6 Invented by John D. Roberts, but first published in Robert T. Morrison and Robert
N. Boyd, Organic Chemistry, second edition (Allyn & Bacon, Boston, 1966) page 318.
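But in fact, in the {|z+⟩, |z−⟩} basis, the state |θ+⟩ is represented by
\[ \begin{pmatrix} \langle z+|\theta+\rangle \\ \langle z-|\theta+\rangle \end{pmatrix}
 = \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix}, \tag{2.31} \]
so the state |(θ + 360°)+⟩ is represented by
\[ \begin{pmatrix} \langle z+|(\theta+360^\circ)+\rangle \\ \langle z-|(\theta+360^\circ)+\rangle \end{pmatrix}
 = \begin{pmatrix} \cos((\theta+360^\circ)/2) \\ \sin((\theta+360^\circ)/2) \end{pmatrix}
 = \begin{pmatrix} \cos(\theta/2+180^\circ) \\ \sin(\theta/2+180^\circ) \end{pmatrix}
 = \begin{pmatrix} -\cos(\theta/2) \\ -\sin(\theta/2) \end{pmatrix}. \tag{2.32} \]
So in fact |θ+⟩ = −|(θ + 360°)+⟩. Bizarre!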
This bizarreness is one facet of a general rule: If you multiply any state
vector by a complex number with magnitude unity (a number such as
−1, or i, or (−1 + i)/√2, or e^{2.7i}, a so-called “complex unit” or “phase
factor”) then you get a different state vector that represents the same
state. This fact is called “global phase freedom”: you are free to set the
overall phase of your state vector for your own convenience. This general
rule applies only for multiplying both elements of the state vector by the
same complex unit: if you multiply the two elements with different complex
units, you will obtain a vector representing a different state (see problem 2.8
on page 75).
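A numerical spot check (not a proof) of the difference between a global and a relative phase factor is sketched below. It assumes the measurable quantities are the probabilities |⟨σ|ψ⟩|² for a handful of analyzer states |σ⟩; the helper names and the chosen angles are illustrative only.

```python
import cmath
import math

def theta_plus(theta_deg):
    """Column (<z+|theta+>, <z-|theta+>) from equation (2.31)."""
    half = math.radians(theta_deg) / 2
    return (math.cos(half), math.sin(half))

def probability(sigma, psi):
    """|<sigma|psi>|^2 for two columns written in the {|z+>, |z->} basis."""
    amp = sum(complex(s).conjugate() * p for s, p in zip(sigma, psi))
    return abs(amp) ** 2

psi = theta_plus(60)
phase = cmath.exp(2.7j)                        # a complex unit

psi_global = tuple(phase * c for c in psi)     # same unit on both elements
psi_relative = (psi[0], phase * psi[1])        # unit on one element only

# A few analyzer orientations (arbitrary choices for this sketch).
analyzers = [theta_plus(t) for t in (0, 45, 90, 200)]

def same_probabilities(a, b):
    return all(
        math.isclose(probability(s, a), probability(s, b), abs_tol=1e-12)
        for s in analyzers
    )

global_same = same_probabilities(psi, psi_global)
relative_same = same_probabilities(psi, psi_relative)
print(global_same, relative_same)
```

The first comparison agrees for every analyzer tried, while the second does not, which is the distinction the rule draws.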
2.5.6 Names for position vectors
The vector r⃗ is specified in the basis {î, ĵ, k̂} by the three components
\[ \begin{pmatrix} r_x \\ r_y \\ r_z \end{pmatrix}
 = \begin{pmatrix} \hat{\imath}\cdot\vec{r} \\ \hat{\jmath}\cdot\vec{r} \\ \hat{k}\cdot\vec{r} \end{pmatrix}. \]
Because this component specification is so convenient, it is sometimes said
that the vector r⃗ is not just specified, but is equal to this triplet of numbers.
That’s false.
Think of the vector r⃗ = 5î + 5ĵ. It is represented in the basis {î, ĵ, k̂} by
the triplet (5, 5, 0). But this is not the only basis that exists. In the basis
{î′ = (î + ĵ)/√2, ĵ′ = (−î + ĵ)/√2, k̂}, that same vector is represented by the
triplet (5√2, 0, 0). If we had said that r⃗ = (5, 5, 0) and that r⃗ = (5√2, 0, 0),
then we would be forced to conclude that 5 = 5√2 and that 5 = 0!
[Figure: the basis vectors î and ĵ, together with the rotated basis vectors î′ and ĵ′.]
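The two “names” of the vector 5î + 5ĵ can be reproduced with a few dot products. This is a small sketch with invented helper names; each basis vector is written by its components in the original {î, ĵ, k̂} basis.

```python
import math

def dot(u, v):
    """Ordinary dot product of two 3-component vectors."""
    return sum(a * b for a, b in zip(u, v))

def components(vector, basis):
    """Components of `vector` along each element of an orthonormal basis."""
    return tuple(dot(element, vector) for element in basis)

s = 1 / math.sqrt(2)
standard = ((1, 0, 0), (0, 1, 0), (0, 0, 1))      # i, j, k
rotated = ((s, s, 0), (-s, s, 0), (0, 0, 1))      # i', j', k

r = (5.0, 5.0, 0.0)                               # the vector 5i + 5j
name_standard = components(r, standard)
name_rotated = components(r, rotated)
print(name_standard)
print(name_rotated)
```

The vector is unchanged; only the triplet that represents it depends on the basis.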
To specify a position vector r⃗, we use the components of r⃗ in a particular
basis, usually denoted (r_x, r_y, r_z). We often write “r⃗ = (r_x, r_y, r_z)” but in
fact that’s not exactly correct. The vector r⃗ represents a position: it is
independent of basis. The row matrix (r_x, r_y, r_z) represents the components
of that position vector in a particular basis: it is the “name” of the
position in a particular basis. Instead of using an equals sign = we use
the symbol ≐ to mean “represented by in a particular basis”, as in “r⃗ ≐
(5, 5, 0)” meaning “the vector r⃗ = 5î + 5ĵ is represented by the triplet
(5, 5, 0) in the basis {î, ĵ, k̂}”.
Vectors are physical things: a caveman throwing a spear at a mam-
moth was performing addition of position vectors, even though the caveman
didn’t understand basis vectors or Cartesian coordinates. The concept of
“position” was known to cavemen who did not have any concept of “basis”.
2.5.7 Names for quantal states
We’ve been specifying a state like |ψ⟩ = |17°+⟩ by stating the axis upon
which the projection of µ⃗ is definite and equal to +µB: in this case, the
axis tilted 17° from the vertical.
Another way to specify a state |ψ⟩ would be to give the amplitude
that |ψ⟩ is in any possible state: that is, to list ⟨θ+|ψ⟩ and ⟨θ−|ψ⟩ for
all values of θ: 0° ≤ θ < 360°. One of those amplitudes (in this case
⟨17°+|ψ⟩) will have value 1, and finding this one amplitude would give
us back the information in the specification |17°+⟩. In some ways this is a
more convenient specification because we don’t have to look up amplitudes:
they’re right there in the list. On the other hand it is an awful lot of
information to have to carry around.
The Hilbert space approach is a third way to specify a state that
combines the brevity of the first way with the convenience of the second way.
Instead of listing the amplitude ⟨σ|ψ⟩ for every state |σ⟩ we list only the
two amplitudes ⟨a|ψ⟩ and ⟨b|ψ⟩ for the elements {|a⟩, |b⟩} of a basis. We’ve
already seen (equation 2.27) how quantal interference then allows us to
readily calculate any amplitude.
Just as we said “the position vector r⃗ is represented in the basis {î, ĵ, k̂}
as (1, 1, 0)” or
\[ \vec{r} \doteq (1, 1, 0), \]
so we say “the quantal state |ψ⟩ is represented in the basis {|z+⟩, |z−⟩} as
\[ |\psi\rangle \doteq \begin{pmatrix} \langle z+|\psi\rangle \\ \langle z-|\psi\rangle \end{pmatrix}. \]”
Problems
2.7 Superposition and interference (recommended problem)
On page 72 I wrote that “When an atom ambivates through an inter-
ferometer, its state is a superposition of the state of an atom taking
path a and the state of an atom taking path b.”
a. Write down a superposition equation reflecting this sentence for
the interference experiment sketched on page 57.
b. Do the same for the interference experiment sketched on page 60.
2.8 Representations (recommended problem)
In the {|z+⟩, |z−⟩} basis the state |ψ⟩ is represented by
\[ \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}. \]
(In other words, ψ+ = ⟨z+|ψ⟩ and ψ− = ⟨z−|ψ⟩.)
a. If ψ+ and ψ− are both real, show that there is one and only one
axis upon which the projection of µ⃗ has a definite, positive value,
and find the angle between that axis and the z axis in terms of
ψ+ and ψ−.
b. What would change if you multiplied both ψ+ and ψ− by the same
phase factor (complex unit)?
c. What would change if you multiplied ψ+ and ψ− by different phase
factors?
This problem invites the question “What if the ratio ψ+/ψ− is not
pure real?” When you study more quantum mechanics, you will find
that in this case the axis upon which the projection of µ⃗ has a definite,
positive value is not in the x-z plane, but instead has a component in
the y direction as well.
2.9 Addition of states
Some students in your class wonder “What does it mean to ‘add two
quantal states’ ? You never add two classical states.” For their benefit
you decide to write four sentences interpreting the equation
\[ |\psi\rangle = a|z+\rangle + b|z-\rangle \tag{2.33} \]
describing why you can add quantal states but can’t add classical states.
Your four sentences should include a formula for the amplitude a in
terms of the states |ψ⟩ and |z+⟩.
2.10 Names of six states, in two bases
Write down the representations (the “names”) of the states |z+⟩, |z−⟩,
|x+⟩, |x−⟩, |θ+⟩, and |θ−⟩ in (a) the basis {|z+⟩, |z−⟩} and in (b) the
basis {|x+⟩, |x−⟩}.
2.11 More peculiarities of states
Because a vector pointing down at angle θ is the same as a vector pointing
up at angle θ − 180°, I would expect that |θ−⟩ = |(θ − 180°)+⟩.
Show that this expectation is false by uncovering the true relation
between these two state vectors.
2.12 Translation matrix
(This problem requires background knowledge in the mathematics of
matrix multiplication.)
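Suppose that the representation of |ψ⟩ in the basis {|z+⟩, |z−⟩} is
\[ \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}
 = \begin{pmatrix} \langle z+|\psi\rangle \\ \langle z-|\psi\rangle \end{pmatrix}. \]
The representation of |ψ⟩ in the basis {|θ+⟩, |θ−⟩} is just as good, and
we call it
\[ \begin{pmatrix} \psi'_+ \\ \psi'_- \end{pmatrix}
 = \begin{pmatrix} \langle \theta+|\psi\rangle \\ \langle \theta-|\psi\rangle \end{pmatrix}. \]
Show that you can “translate” between these two representations using
the matrix multiplication
\[ \begin{pmatrix} \psi'_+ \\ \psi'_- \end{pmatrix}
 = \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}
   \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}. \]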
2.6 States for entangled systems
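In the Einstein-Podolsky-Rosen experiment (1) on page 38, with two
vertical analyzers, the initial state is represented by |ψ⟩, and various possible
final states are represented by |↑↓⟩ and so forth, as shown below. (In this
section all analyzers will be vertical, so we adopt the oft-used convention
that writes |z+⟩ as |↑⟩ and |z−⟩ as |↓⟩.)
[Figure: the initial state |ψ⟩ and the four possible final states |↑↓⟩, |↓↑⟩, |↑↑⟩, and |↓↓⟩.]
The experimental results tell us that
\[ \begin{aligned}
 |\langle\uparrow\downarrow|\psi\rangle|^2 &= \tfrac{1}{2} \\
 |\langle\downarrow\uparrow|\psi\rangle|^2 &= \tfrac{1}{2} \\
 |\langle\uparrow\uparrow|\psi\rangle|^2 &= 0 \\
 |\langle\downarrow\downarrow|\psi\rangle|^2 &= 0.
\end{aligned} \tag{2.34} \]
Additional analysis (sketched in problem 15.??, “Normalization of singlet
spin state”) is needed to assign phases to these amplitudes. The results are
\[ \begin{aligned}
 \langle\uparrow\downarrow|\psi\rangle &= +\tfrac{1}{\sqrt{2}} \\
 \langle\downarrow\uparrow|\psi\rangle &= -\tfrac{1}{\sqrt{2}} \\
 \langle\uparrow\uparrow|\psi\rangle &= 0 \\
 \langle\downarrow\downarrow|\psi\rangle &= 0.
\end{aligned} \tag{2.35} \]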
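Using the generalization of equation (2.30) for a four-state basis, these
results tell us that
\[ \begin{aligned}
 |\psi\rangle &= |\uparrow\downarrow\rangle\langle\uparrow\downarrow|\psi\rangle
  + |\downarrow\uparrow\rangle\langle\downarrow\uparrow|\psi\rangle
  + |\uparrow\uparrow\rangle\langle\uparrow\uparrow|\psi\rangle
  + |\downarrow\downarrow\rangle\langle\downarrow\downarrow|\psi\rangle \\
 &= \tfrac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right).
\end{aligned} \tag{2.36} \]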
A simple derivation, with profound implications.
2.6.1 State pertains to system, not to atom
In this entangled situation there is no such thing as an “amplitude for the

right atom to exit from the + port,” because the probability for the right
atom to exit from the + port depends on whether the left atom exits the
+ or the − port. The pair of atoms has a state, but the right atom by
itself doesn’t have a state, in the same way that an atom passing through
an interferometer doesn’t have a position and that love doesn’t have a color.
Leonard Susskind7 puts it this way: If entangled states existed in auto
mechanics as well as quantum mechanics, then an auto mechanic might tell
you “I know everything about your car but . . . I can’t tell you anything
about any of its parts.”
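The claim that the pair has a state while neither atom has one of its own can be illustrated numerically. The sketch below leans on a standard linear-algebra fact that is not derived in this text: a two-atom state factors into (left state) × (right state) exactly when the 2 × 2 array of its four amplitudes has zero determinant. The function names are invented for illustration.

```python
import math

s = 1 / math.sqrt(2)

# Amplitudes <left right|psi> of equation (2.35), keyed by (left, right).
singlet = {
    ("up", "down"): +s,
    ("down", "up"): -s,
    ("up", "up"): 0.0,
    ("down", "down"): 0.0,
}

def total_probability(amps):
    """Sum of |amplitude|^2 over the four basis states."""
    return sum(abs(a) ** 2 for a in amps.values())

def product_state(left, right):
    """Amplitudes of an unentangled pair: left = (l_up, l_down), etc."""
    labels = ("up", "down")
    return {
        (a, b): left[i] * right[j]
        for i, a in enumerate(labels)
        for j, b in enumerate(labels)
    }

def factorization_defect(amps):
    """Determinant of the amplitude array; zero for any product state."""
    return (amps[("up", "up")] * amps[("down", "down")]
            - amps[("up", "down")] * amps[("down", "up")])

unentangled = product_state((0.6, 0.8), (s, s))
print(total_probability(singlet))
print(factorization_defect(singlet))
print(factorization_defect(unentangled))
```

Under that assumption, a nonzero defect for the singlet means no choice of a separate state for each atom reproduces the four amplitudes.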
2.6.2 “Collapse of the state vector”
Set up this EPR experiment with the left analyzer 100 kilometers from the
source, and the right analyzer 101 kilometers from the source. As soon as
the left atom comes out of its − port, then it is known that the right atom
will come out of its + port. The system is no longer in the entangled state
(1/√2)(|↑↓⟩ − |↓↑⟩); instead the left atom is in state |↓⟩ and the right atom
is in state |↑⟩. The state of the right atom has changed (some say it has
“collapsed”) despite the fact that it is 200 kilometers from the left analyzer
that did the state changing!
This fact disturbs those who hold the misconception that states are
physical things located out in space like nitrogen molecules, because it
seems that information about state has made an instantaneous jump across
200 kilometers. In fact no information has been transferred from left to
right: true, Alice at the left interferometer knows that the right atom will
exit the + port 201 kilometers away, but Bob at the right interferometer
doesn’t have this information and won’t unless she tells him in some
conventional, light-speed-or-slower fashion.8
7 Leonard Susskind and Art Friedman, Quantum Mechanics: The Theoretical Minimum
(Basic Books, New York, 2014) page xii.
If Alice could in some magical way manipulate her atom to ensure that
it would exit the − port, then she could send a message instantaneously.
But Alice does not possess magic, so she cannot manipulate the left-bound
atom in this way. Neither Alice, nor Bob, nor even the left-bound atom
itself knows from which port it will exit. Neither Alice, nor Bob, nor even
the left-bound atom itself can influence from which port it will exit.
8 If you are familiar with gauges in electrodynamics, you will find quantal state similar
to the Coulomb gauge. In the Coulomb gauge, the electric potential at a point in
space changes the instant that any charged particle moves, regardless of how far away
that charged particle is. This does not imply that information moves instantly, because
electric potential by itself is not measurable. The same applies for quantal state.
2.6.3 Measurement and entanglement
Back in section 1.4, “Light on the atoms” (page 33), we discussed the
character of “observation” or “measurement” in quantum mechanics. Let’s
bring our new machinery concerning quantal states to bear on this situation.
The figure on the next page shows, in the top panel, a potential
measurement about to happen. An atom (represented by a black dot) in state
|z+⟩ approaches a horizontal interferometer at the same time that a photon
(represented by a white dot) approaches path a of that interferometer.
We employ a simplified model in which the photon either misses the
atom, in which case it continues undeflected upward, or else the photon
interacts with the atom, in which case it is deflected outward from the
page. In this model there are four possible outcomes, shown in the bottom
four panels of the figure.
After this potential measurement, the system of photon plus atom is
in an entangled state: the states shown on the right must list both the
condition of the photon (“up” or “out”) and the condition of the atom (+
or −).
If the photon misses the atom, then the atom must emerge from the +
port of the analyzer: there is zero probability that the system has final state
|up; −⟩. But if the photon interacts with the atom, then the atom might
emerge from either port: there is non-zero probability that the system has
final state |out; −⟩. These two states are exactly the same as far as the
atom is concerned; they differ only in the position of the photon.
If we focus only on the atom, we would say that something strange has
happened (a “measurement” at path a) that enabled the atom to emerge
from the − port which (in the absence of “measurement”) that atom would
never do. But if we focus on the entire system of photon plus atom, then
it is an issue of entanglement, not of measurement.
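[Figure: top panel, an atom in state |z+⟩ and a photon approaching path a of a
horizontal interferometer with paths a and b, the system in state |ψ⟩; bottom
four panels, the possible outcomes |up; +⟩, |up; −⟩, |out; +⟩, and |out; −⟩.]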
Problem
2.13 Amplitudes for “Measurement and entanglement”
Suppose that, in the “simplified model” for measurement and entanglement,
the probability for photon deflection is 1/5. Find the four probabilities
|⟨up; +|ψ⟩|², |⟨up; −|ψ⟩|², |⟨out; +|ψ⟩|², and |⟨out; −|ψ⟩|².
2.7 Are states “real”?
This is a philosophical question for which there is no specific meaning and

hence no specific answer. But in my opinion, states are mathematical tools
that enable us to efficiently and accurately calculate the probabilities that
can be found through repeated analyzer experiments, interference experi-
ments, and indeed all experiments. They are not physically “real”.
Indeed, it is possible to formulate quantum mechanics in such a way that
probabilities and amplitudes are found without using the mathematical tool
of “state” at all. Richard Feynman and Albert Hibbs do just this in their
1965 book Quantum Mechanics and Path Integrals. States do not make an
appearance until deep into their book, and even when they do appear they
are not essential. The Feynman “sum over histories” formulation described
in that book is, for me, the most intuitively appealing approach to quantum
mechanics. There is, however, a price to be paid for this appeal: it’s very
difficult to work problems in the Feynman formulation.
2.8 What is a qubit?
At the end of the last chapter (on page 52) we listed several so-called
“two-state systems” or “spin-1/2 systems” or “qubit systems”. You might have
found these terms strange: There are an infinite number of states for the
magnetic moment of a silver atom: |z+⟩, |1°+⟩, |2°+⟩, and so forth. Where
does the name “two-state system” come from? You now see the answer:
it’s short for “two-basis-state system”.
The term “spin” originated in the 1920s when it was thought that an
electron was a classical charged rigid sphere that created a magnetic
moment through spinning about an axis. A residual of that history is that
people still call9 the state |z+⟩ by the name “spin up” and by the symbol
|↑⟩, and the state |z−⟩ by “spin down” and |↓⟩. (Sometimes the
association is made in the opposite way.) Meanwhile the state |x+⟩ is given the
name “spin sideways” and the symbol |→⟩.
9 The very most precise and pedantic people restrict the term “spin” to elementary
particles, such as electrons and neutrinos. For composite systems like the silver atom
they speak instead of “the total angular momentum J⃗ of the silver atom in its ground
state, projected on a given axis, and divided by ħ.” For me, the payoff in precision is
not worth the penalty in polysyllables.
Today, two-basis-state systems are more often called “qubit” systems
from the term used in quantum information processing. In a classical
computer, like the ones we use today, a bit of information can be represented
physically by a patch of magnetic material on a disk: the patch magnetized
“up” is interpreted as a 1, the patch magnetized “down” is interpreted as
a 0. Those are the only two possibilities. In a quantum computer, a qubit
of information can be represented physically by the magnetic moment of a
silver atom: the atom in state |z+⟩ is interpreted as |1⟩, the atom in state
|z−⟩ is interpreted as |0⟩. But the atom might be in any (normalized)
superposition a|1⟩ + b|0⟩, so rather than two possibilities there are an infinite
number.
Furthermore, qubits can interfere with and become entangled with other
qubits, options that are simply unavailable to classical bits. With more
states, and more ways to interact, quantum computers can only be faster
than classical computers, and even as I write these possibilities are being
explored.
In today’s state of technology, quantum computers are hard to build,
and they may never live up to their promise. But maybe they will.
Chapters 1 and 2 have focused on two-basis-state systems, but of course
nature provides other systems as well. For example, the magnetic moment
of a nitrogen atom (mentioned on page 9) is a “four-basis-state” system,
where one basis is
\[ |z; +2\rangle, \; |z; +1\rangle, \; |z; -1\rangle, \; |z; -2\rangle. \tag{2.37} \]
And chapter 6 shifts our focus to a system with an infinite number of basis
states.
Problem
2.14 Questions (recommended problem)
Update your list of quantum mechanics questions that you started at
problem 1.12 on page 53. Write down new questions and, if you have un-
covered answers to any of your old questions, write them down briefly.
[[For example, one of my questions would be: “I’d like to see a proof
that the global phase freedom mentioned on page 73, which obviously
changes the amplitudes computed, does not change any experimentally
accessible result.”]]
Chapter 3
Refining Mathematical Tools
3.1 Extras
Summing states
We have said that “if we determine the amplitude ⟨σ|ψ⟩ for every
possible state |σ⟩, then we know everything about |ψ⟩ that there is to know.”
So, for example, if two particular states |ψ₁⟩ and |ψ₂⟩ have the same
amplitudes ⟨σ|ψ₁⟩ = ⟨σ|ψ₂⟩ for every state |σ⟩, then the two states must be the
same: |ψ₁⟩ = |ψ₂⟩. In short, we can just erase the leading ⟨σ|s from both
sides.
Apply this idea to a more elaborate equation like the interference result
\[ \langle\sigma|\psi\rangle = \langle\sigma|\theta+\rangle\langle\theta+|\psi\rangle + \langle\sigma|\theta-\rangle\langle\theta-|\psi\rangle. \tag{3.1} \]
Since this holds for every state |σ⟩, we can erase the leading ⟨σ|s to find
\[ |\psi\rangle = |\theta+\rangle\langle\theta+|\psi\rangle + |\theta-\rangle\langle\theta-|\psi\rangle. \tag{3.2} \]
What is this supposed to mean? We have never before summed two quantal
states! Physically, its meaning is nothing new: it just says if an atom
ambivates through the two branches of a θ-interferometer, it exits in the
same state (|ψ⟩) that it entered. An atom entering in state |θ+⟩ goes via
branch a, and exits in state |θ+⟩. An atom entering in state |θ−⟩ goes via
branch b, and exits in state |θ−⟩. An atom entering in state |ψ⟩ goes via
branch a with amplitude ⟨θ+|ψ⟩ and via branch b with amplitude ⟨θ−|ψ⟩,
and exits in state |ψ⟩.
Mathematically, however, equation (3.2) implies some new kind of
addition, an addition of states. It’s easiest to flesh out the mathematical
meaning by replacing the states |ψ⟩, |θ+⟩, and |θ−⟩ with their representations as
column matrices in the {|z+⟩, |z−⟩} basis. The interference equation (3.2)
above becomes
\[ \begin{pmatrix} \langle z+|\psi\rangle \\ \langle z-|\psi\rangle \end{pmatrix}
 = \begin{pmatrix} \langle z+|\theta+\rangle \\ \langle z-|\theta+\rangle \end{pmatrix} \langle\theta+|\psi\rangle
 + \begin{pmatrix} \langle z+|\theta-\rangle \\ \langle z-|\theta-\rangle \end{pmatrix} \langle\theta-|\psi\rangle. \tag{3.3} \]
The first line of this equation is just equation (3.1) with |σ⟩ = |z+⟩, while
the second line is just equation (3.1) with |σ⟩ = |z−⟩.
We have introduced the idea of adding states through interference
experiments, and then forged the mathematical tools to describe those
experiments, including state addition. It is possible, however, to go the other
direction, and just talk about summing two states such as
\[ \alpha|a\rangle + \beta|b\rangle. \tag{3.4} \]
Exercise: Show that for the above sum to produce a state, the identity
\[ |\alpha|^2 + |\beta|^2 + 2\,\Re e\{\alpha^* \beta \langle a|b\rangle\} = 1 \]
must apply.
When looked at in this way — mathematics first, experiments afterwards

— the concept of state summation and the phenomenon of interference are
often called “superposition”.
The whole idea of summing states is new. In Newtonian mechanics, two
states are never summed. That is because summing of states represents the
phenomenon of interference, and in classical mechanics interference doesn’t
happen.
When we learned how to add vectors, we learned to add them both
geometrically (by setting them tail to head and drawing a vector from the
first tail to the last head) and through components. The same holds for
adding quantal states: You can add them physically, through interference
experiments, or through components.
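The equation
\[ \vec{r} = \hat{\imath}\, r_x + \hat{\jmath}\, r_y + \hat{k}\, r_z
 = \hat{\imath}(\hat{\imath}\cdot\vec{r}) + \hat{\jmath}(\hat{\jmath}\cdot\vec{r}) + \hat{k}(\hat{k}\cdot\vec{r}) \]
for geometrical vectors is useful and familiar. The equation
\[ |\psi\rangle = |\theta+\rangle\langle\theta+|\psi\rangle + |\theta-\rangle\langle\theta-|\psi\rangle \]
for state vectors is just as useful and will soon be just as familiar.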
Alternative approach — sum of three or more states:

Consider the three path experiment at equation (2.7) and think about
how you would describe the situation if you walked into the experiment
exactly when the atom was ambivating through the middle of the apparatus.
If the initial state were |ψ⟩, you would say the atom has amplitude ⟨z+|ψ⟩
of ambivating through the top path (1a), amplitude ⟨θ+|z−⟩⟨z−|ψ⟩ of
ambivating through the middle path (2a), and amplitude ⟨θ−|z−⟩⟨z−|ψ⟩
of ambivating through the bottom path (2b). Since an atom in the top path
has state |z+⟩, an atom in the middle path has state |θ+⟩, and an atom in
the bottom path has state |θ−⟩, we say that the atom in state |ψ⟩ has been
dissected into the three states through
\[ \begin{aligned}
 |\psi\rangle = {} & |z+\rangle\langle z+|\psi\rangle \\
 & + |\theta+\rangle\langle\theta+|z-\rangle\langle z-|\psi\rangle \\
 & + |\theta-\rangle\langle\theta-|z-\rangle\langle z-|\psi\rangle.
\end{aligned} \tag{3.5} \]
Now we know what it means to add three states — not necessarily basis
states. By using combinations of more and more interferometers, we could
build up devices that would sum more and more states. The mathematics
of state summation is a formal expression of the physics of interference.
Analysis of states into basis states
For all states |ψ⟩, |φ⟩, and any basis {|a⟩, |b⟩}:
\[ \langle\phi|\psi\rangle = \langle\phi|a\rangle\langle a|\psi\rangle + \langle\phi|b\rangle\langle b|\psi\rangle. \]
This equation is exactly the interference experiment: If the atom goes
through both branch a and through branch b, then it emerges unchanged.
In particular, we write this equation for |φ⟩ = |z+⟩ and then for |φ⟩ =
|z−⟩:
\[ \begin{pmatrix} \langle z+|\psi\rangle \\ \langle z-|\psi\rangle \end{pmatrix}
 = \begin{pmatrix} \langle z+|a\rangle \\ \langle z-|a\rangle \end{pmatrix} \langle a|\psi\rangle
 + \begin{pmatrix} \langle z+|b\rangle \\ \langle z-|b\rangle \end{pmatrix} \langle b|\psi\rangle. \]
This equation is the representation, in the {|z+⟩, |z−⟩} basis, of
\[ |\psi\rangle = |a\rangle\langle a|\psi\rangle + |b\rangle\langle b|\psi\rangle. \tag{3.6} \]
What is this equation supposed to mean? The quantities |ψ⟩, |a⟩, and |b⟩
represent states, the quantities ⟨a|ψ⟩ and ⟨b|ψ⟩ represent complex numbers.
This is the first time we have ever added states. What does it mean? It
means nothing more nor less than the interferometer experiment. (Just
as cavemen were able to spear mammoths through vector addition, even
though they didn’t know about basis states or the coordinate
representations of vectors in a particular basis, so atoms are able to interfere through
the above equation, even though they don’t know about bases or
representations. The representations make it easier to work with position vectors
or with quantal states, but they aren’t required.)
To reinforce the meaning of this equation, write it down for the
basis {|θ+⟩, |θ−⟩}: When I write
\[ |\psi\rangle = |\theta+\rangle\langle\theta+|\psi\rangle + |\theta-\rangle\langle\theta-|\psi\rangle, \]
I mean that if an atom passes through a θ-analyzer, it has amplitude
⟨θ+|ψ⟩ to behave like an atom in state |θ+⟩, and it has amplitude ⟨θ−|ψ⟩
to behave like an atom in state |θ−⟩.
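There’s an easier way to derive the equation (3.6). If two states |α⟩ and
|β⟩ have the same amplitudes
\[ \langle\phi|\alpha\rangle = \langle\phi|\beta\rangle \]
for all states |φ⟩, then they are the same state:
\[ |\alpha\rangle = |\beta\rangle. \]
(Two atoms that behave identically under all circumstances are in the same
state.) In other words, because the state |φ⟩ is arbitrary, we can erase the
symbol ⟨φ| from both sides of the equation. Applying this principle to
\[ \langle\phi|\psi\rangle = \langle\phi|a\rangle\langle a|\psi\rangle + \langle\phi|b\rangle\langle b|\psi\rangle, \]
where the state |φ⟩ is arbitrary, results in
\[ |\psi\rangle = |a\rangle\langle a|\psi\rangle + |b\rangle\langle b|\psi\rangle. \]
This equation is analogous to the relation for position vectors that
\[ \vec{r} = \hat{\imath}\, x + \hat{\jmath}\, y + \hat{k}\, z. \]
Example: If the basis {|a⟩, |b⟩} is {|z+⟩, |z−⟩}, then
\[ |\psi\rangle = |z+\rangle\langle z+|\psi\rangle + |z-\rangle\langle z-|\psi\rangle. \]
Using the representations for |z+⟩ and |z−⟩ in the {|z+⟩, |z−⟩} basis, this
equation becomes
\[ |\psi\rangle \doteq \begin{pmatrix} 1 \\ 0 \end{pmatrix} \langle z+|\psi\rangle
 + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \langle z-|\psi\rangle
 = \begin{pmatrix} \langle z+|\psi\rangle \\ \langle z-|\psi\rangle \end{pmatrix}. \]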
Change of basis
Suppose the two amplitudes ⟨z+|ψ⟩ and ⟨z−|ψ⟩ are known. Then we
can easily find the amplitudes ⟨θ+|ψ⟩ and ⟨θ−|ψ⟩, for any value of θ,
through
\[ \langle\theta+|\psi\rangle = \langle\theta+|z+\rangle\langle z+|\psi\rangle + \langle\theta+|z-\rangle\langle z-|\psi\rangle \]
\[ \langle\theta-|\psi\rangle = \langle\theta-|z+\rangle\langle z+|\psi\rangle + \langle\theta-|z-\rangle\langle z-|\psi\rangle \]
These two equations might seem arcane, but in fact each one just represents
the interference experiment performed with a vertical interferometer: The state
|ψ⟩ is unaltered if the atom travels through the two branches of a vertical
interferometer, that is via the upper z+ branch and the lower z− branch.
And if the state is unaltered then the amplitude to go to state |θ+⟩ is of
course also unaltered.
The pair of equations is most conveniently written as a matrix equation
\[ \begin{pmatrix} \langle\theta+|\psi\rangle \\ \langle\theta-|\psi\rangle \end{pmatrix}
 = \begin{pmatrix} \langle\theta+|z+\rangle & \langle\theta+|z-\rangle \\ \langle\theta-|z+\rangle & \langle\theta-|z-\rangle \end{pmatrix}
   \begin{pmatrix} \langle z+|\psi\rangle \\ \langle z-|\psi\rangle \end{pmatrix}. \]
The 2 × 1 column matrix on the right side is called the representation of
state |ψ⟩ in the basis {|z+⟩, |z−⟩}. The 2 × 1 column matrix on the left
side is called the representation of state |ψ⟩ in the basis {|θ+⟩, |θ−⟩}. The
square 2 × 2 matrix is independent of the state |ψ⟩, and depends only on
the geometrical relationship between the initial basis {|z+⟩, |z−⟩} and the
final basis {|θ+⟩, |θ−⟩}:
\[ \begin{pmatrix} \langle\theta+|z+\rangle & \langle\theta+|z-\rangle \\ \langle\theta-|z+\rangle & \langle\theta-|z-\rangle \end{pmatrix}
 = \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}. \]
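The change of basis can be tried out in a few lines. This is a sketch only: the function names are made up for illustration, and the sample amplitudes (0.6, 0.8i) are an arbitrary normalized choice, not values from the text.

```python
import math

def translation_matrix(theta_deg):
    """2x2 matrix taking the {|z+>, |z->} name to the {|theta+>, |theta->} name."""
    half = math.radians(theta_deg) / 2
    c, s = math.cos(half), math.sin(half)
    return ((c, s), (-s, c))

def translate(matrix, column):
    """Matrix times column: one interference sum per row."""
    return tuple(sum(m * x for m, x in zip(row, column)) for row in matrix)

def total_probability(column):
    return sum(abs(a) ** 2 for a in column)

# An arbitrary normalized state, named in the z basis (illustrative values).
psi_z = (0.6, 0.8j)
psi_theta = translate(translation_matrix(90), psi_z)
print(psi_theta, total_probability(psi_theta))

# Sanity check: |theta+> itself should be named (1, 0) in its own basis.
half = math.radians(50) / 2
own_name = translate(translation_matrix(50), (math.cos(half), math.sin(half)))
print(own_name)
```

The total probability stays at 1 after translation, as it must if both columns name the same state.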
Terms concerning quantum states
For atoms in state |z+⟩, the probability of measuring µθ and finding
µθ = +µB is cos²(θ/2). We say “The projection probability from |z+⟩ to
|θ+⟩ is cos²(θ/2).” This situation is frequently, but incorrectly, described as
“The probability that an atom in state |z+⟩ is in state |θ+⟩ is cos²(θ/2).”
If the projection probability from |A⟩ to |B⟩ is zero, and vice versa, the
two states are orthogonal. (For example, |z+⟩ and |z−⟩ are orthogonal,
whereas |z+⟩ and |x−⟩ are not.)
Given a set of states {|A⟩, |B⟩, . . . , |N⟩}, this set is said to be complete if
an atom in any state is analyzed into one state of this set. In other words,
it is complete if
\[ \sum_{i=A}^{N} (\text{projection probability from any given state to } |i\rangle) = 1. \]
(For example, the set {|θ+⟩, |θ−⟩} is complete.)

General definition of basis
We say that a set of states {|a⟩, |b⟩, . . . , |n⟩} is a basis if both of the
following apply:
• An atom in any state is analyzed into one element of this set. That is,
for any state |ψ⟩
\[ |\langle a|\psi\rangle|^2 + |\langle b|\psi\rangle|^2 + \cdots + |\langle n|\psi\rangle|^2 = 1. \tag{3.7} \]
• There is zero amplitude for one element to be another element. That
is
\[ \begin{aligned}
 \langle a|b\rangle = 0, \; \langle a|c\rangle = 0, \; \ldots, \; & \langle a|n\rangle = 0, \\
 \langle b|c\rangle = 0, \; \ldots, \; & \langle b|n\rangle = 0,
\end{aligned} \tag{3.8} \]
etc.
For example, the set {|θ+⟩, |θ−⟩} is a basis for any value of θ. The set
{|z+⟩, |x−⟩} is not a basis.
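Both requirements can be tested numerically for the two example sets. In this sketch the function names, the angle 37° and the test state |120°+⟩ are arbitrary illustrative choices, and every state is written by its column in the {|z+⟩, |z−⟩} basis.

```python
import math

def theta_plus(theta_deg):
    half = math.radians(theta_deg) / 2
    return (math.cos(half), math.sin(half))

def theta_minus(theta_deg):
    half = math.radians(theta_deg) / 2
    return (-math.sin(half), math.cos(half))

def inner(bra, ket):
    """Amplitude <bra|ket> for columns written in the same basis."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def is_orthogonal(a, b, tol=1e-12):
    """Second requirement (3.8): zero amplitude between the two elements."""
    return abs(inner(a, b)) < tol

def completeness_sum(candidate, psi):
    """Left side of (3.7): total projection probability onto the set."""
    return sum(abs(inner(element, psi)) ** 2 for element in candidate)

good = (theta_plus(37), theta_minus(37))   # {|theta+>, |theta->}
bad = (theta_plus(0), theta_minus(90))     # {|z+>, |x->}
psi = theta_plus(120)                      # any test state

print(is_orthogonal(*good), completeness_sum(good, psi))
print(is_orthogonal(*bad), completeness_sum(bad, psi))
```

The first set passes both tests; the second fails both, since its probabilities for this test state do not add to 1.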
3.2 Outer products, operators, measurement
The outer product

Let’s go back to our equation that represents the interference
experiment: For any states |φ⟩ and |ψ⟩, and for any pair of basis states |a⟩ and
|b⟩,
\[ \langle\phi|\psi\rangle = \langle\phi|a\rangle\langle a|\psi\rangle + \langle\phi|b\rangle\langle b|\psi\rangle. \]
Now effect the divorce of amplitudes into inner product of states:
\[ \langle\phi|\psi\rangle = \langle\phi| \Big\{ |a\rangle\langle a| + |b\rangle\langle b| \Big\} |\psi\rangle. \tag{3.9} \]
Our question: What’s that thing between curly brackets?
In any particular basis, |a⟩ is represented by a 2 × 1 column matrix,
while ⟨a| is represented by a 1 × 2 row matrix. Thus the product |a⟩⟨a| is
represented by a 2 × 2 square matrix. Similarly for |b⟩⟨b|. Thus, in any
particular basis, the thing between curly brackets is represented by a 2 × 2
matrix.
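If this confuses you, then think of it this way. If
\[ |\alpha\rangle \doteq \begin{pmatrix} \alpha_a \\ \alpha_b \end{pmatrix}
 \quad\text{and}\quad
 |\beta\rangle \doteq \begin{pmatrix} \beta_a \\ \beta_b \end{pmatrix}, \]
then
\[ \langle\alpha| \doteq \begin{pmatrix} \alpha_a^* & \alpha_b^* \end{pmatrix}
 \quad\text{and}\quad
 \langle\beta| \doteq \begin{pmatrix} \beta_a^* & \beta_b^* \end{pmatrix}. \]
Thus the “inner product” is the 1 × 1 matrix
\[ \langle\alpha|\beta\rangle = \begin{pmatrix} \alpha_a^* & \alpha_b^* \end{pmatrix}
 \begin{pmatrix} \beta_a \\ \beta_b \end{pmatrix}
 = \alpha_a^* \beta_a + \alpha_b^* \beta_b, \]
while the “outer product” is represented by the 2 × 2 matrix
\[ |\alpha\rangle\langle\beta| \doteq \begin{pmatrix} \alpha_a \\ \alpha_b \end{pmatrix}
 \begin{pmatrix} \beta_a^* & \beta_b^* \end{pmatrix}
 = \begin{pmatrix} \alpha_a \beta_a^* & \alpha_a \beta_b^* \\ \alpha_b \beta_a^* & \alpha_b \beta_b^* \end{pmatrix}. \]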
A piece of terminology: |α⟩⟨β| is called an operator and the square matrix
that represents it in a particular basis is called a matrix. The two terms
are often used interchangeably, but if you care to make the distinction then
this is how to make it. It’s conventional to symbolize operators with hats,
like Â.
With these ideas in place, we see what’s inside the curly brackets of
expression (3.9): it’s the identity operator
\[ \hat{1} = |a\rangle\langle a| + |b\rangle\langle b|, \]
and this holds true for any basis {|a⟩, |b⟩}.
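We check this out two ways. First, in the basis {|z+⟩, |z−⟩}, we find
the representation for the operator
\[ |z+\rangle\langle z+| + |z-\rangle\langle z-|. \]
Remember that in this basis
\[ |z+\rangle \doteq \begin{pmatrix} 1 \\ 0 \end{pmatrix}
 \quad\text{while}\quad
 |z-\rangle \doteq \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \]
so
\[ |z+\rangle\langle z+| \doteq \begin{pmatrix} 1 \\ 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \end{pmatrix}
 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.10} \]
Meanwhile
\[ |z-\rangle\langle z-| \doteq \begin{pmatrix} 0 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix}
 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{3.11} \]
Thus
\[ |z+\rangle\langle z+| + |z-\rangle\langle z-| \doteq
 \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
Yes! As required, this combination is the identity matrix, which is of course
the representation of the identity operator.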
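For our second check, in the basis {|z+⟩, |z−⟩} we find the representation
for the operator
\[ |\theta+\rangle\langle\theta+| + |\theta-\rangle\langle\theta-|. \]
Remember that in this basis
\[ |\theta+\rangle \doteq \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix}
 \quad\text{while}\quad
 |\theta-\rangle \doteq \begin{pmatrix} -\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix}, \]
so
\[ |\theta+\rangle\langle\theta+| \doteq
 \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix}
 \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \end{pmatrix}
 = \begin{pmatrix} \cos^2(\theta/2) & \cos(\theta/2)\sin(\theta/2) \\ \sin(\theta/2)\cos(\theta/2) & \sin^2(\theta/2) \end{pmatrix}. \tag{3.12} \]
Meanwhile
\[ |\theta-\rangle\langle\theta-| \doteq
 \begin{pmatrix} -\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix}
 \begin{pmatrix} -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}
 = \begin{pmatrix} \sin^2(\theta/2) & -\sin(\theta/2)\cos(\theta/2) \\ -\cos(\theta/2)\sin(\theta/2) & \cos^2(\theta/2) \end{pmatrix}. \tag{3.13} \]
(As a check, notice that when θ = 0, equation (3.12) reduces to
equation (3.10), and equation (3.13) reduces to equation (3.11).) Thus
\[ |\theta+\rangle\langle\theta+| + |\theta-\rangle\langle\theta-| \doteq
 \begin{pmatrix} \cos^2(\theta/2) & \cos(\theta/2)\sin(\theta/2) \\ \sin(\theta/2)\cos(\theta/2) & \sin^2(\theta/2) \end{pmatrix}
 + \begin{pmatrix} \sin^2(\theta/2) & -\sin(\theta/2)\cos(\theta/2) \\ -\cos(\theta/2)\sin(\theta/2) & \cos^2(\theta/2) \end{pmatrix}
 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
Yes! Once again this combination is the identity matrix.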
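The same check can be run numerically for any angle. The sketch below uses invented helper names and stores a 2 × 2 matrix as a tuple of rows; it is a spot check at a few angles, not a replacement for the algebra.

```python
import math

def theta_plus(theta_deg):
    half = math.radians(theta_deg) / 2
    return (math.cos(half), math.sin(half))

def theta_minus(theta_deg):
    half = math.radians(theta_deg) / 2
    return (-math.sin(half), math.cos(half))

def outer(alpha, beta):
    """Matrix of |alpha><beta|: element (i, j) is alpha_i times conj(beta_j)."""
    return tuple(
        tuple(a * complex(b).conjugate() for b in beta) for a in alpha
    )

def add(m1, m2):
    """Element-by-element sum of two matrices."""
    return tuple(
        tuple(x + y for x, y in zip(r1, r2)) for r1, r2 in zip(m1, m2)
    )

def sum_of_projectors(theta_deg):
    """Matrix of |theta+><theta+| + |theta-><theta-| in the z basis."""
    plus, minus = theta_plus(theta_deg), theta_minus(theta_deg)
    return add(outer(plus, plus), outer(minus, minus))

for angle in (0, 30, 217):
    print(angle, sum_of_projectors(angle))
```

Each printed matrix should agree with the identity matrix to rounding error.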
Measurement
What happens when an atom in state |ψ⟩ passes through a θ-analyzer?
Or, what is the same thing, what happens when an atom in state |ψ⟩ is
measured to find the projection of µ⃗ on the θ axis? (We call the projection
of µ⃗ on the θ axis µθ.)
The atom enters the analyzer in state |ψ⟩. It has two possible fates:
• It emerges from the + port, in which case the atom has been measured
to have µθ = +µB, and it emerges in state |θ+⟩. This happens with
probability |⟨θ+|ψ⟩|².
• It emerges from the − port, in which case the atom has been measured
to have µθ = −µB, and it emerges in state |θ−⟩. This happens with
probability |⟨θ−|ψ⟩|².
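What is the average value of $\mu_\theta$?
\begin{align*}
\langle\mu_\theta\rangle &= (+\mu_B)|\langle\theta+|\psi\rangle|^2 + (-\mu_B)|\langle\theta-|\psi\rangle|^2 \\
&= (+\mu_B)\langle\theta+|\psi\rangle^* \langle\theta+|\psi\rangle + (-\mu_B)\langle\theta-|\psi\rangle^* \langle\theta-|\psi\rangle \\
&= (+\mu_B)\langle\psi|\theta+\rangle\langle\theta+|\psi\rangle + (-\mu_B)\langle\psi|\theta-\rangle\langle\theta-|\psi\rangle \\
&= \langle\psi| \Big\{ (+\mu_B)|\theta+\rangle\langle\theta+| + (-\mu_B)|\theta-\rangle\langle\theta-| \Big\} |\psi\rangle
\end{align*}
In the last line we have effected the divorce, writing amplitudes in terms
of inner products between states.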
Given the last line, it makes sense to define an operator associated with
the measurement of $\mu_\theta$, namely
\[ \hat\mu_\theta = (+\mu_B)|\theta+\rangle\langle\theta+| + (-\mu_B)|\theta-\rangle\langle\theta-|, \]
so that
\[ \langle\mu_\theta\rangle = \langle\psi|\hat\mu_\theta|\psi\rangle. \]
Notice what we’ve done here: To find the average value of $\mu_\theta$ for a particular
atom, we’ve split up the problem into an operator $\hat\mu_\theta$ involving only the
measuring device and a state $|\psi\rangle$ involving only the atomic state.
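Example: What is the matrix representation of $\hat\mu_\theta$ in the basis
$\{|z+\rangle, |z-\rangle\}$? We have already found representations for the outer product
$|\theta+\rangle\langle\theta+|$ in equation (3.12) and for the outer product $|\theta-\rangle\langle\theta-|$ in
equation (3.13). Using these expressions
\begin{align*}
\hat\mu_\theta &= (+\mu_B)|\theta+\rangle\langle\theta+| + (-\mu_B)|\theta-\rangle\langle\theta-| \\
&\doteq (+\mu_B) \begin{pmatrix} \cos^2(\theta/2) & \cos(\theta/2)\sin(\theta/2) \\ \sin(\theta/2)\cos(\theta/2) & \sin^2(\theta/2) \end{pmatrix}
 + (-\mu_B) \begin{pmatrix} \sin^2(\theta/2) & -\sin(\theta/2)\cos(\theta/2) \\ -\cos(\theta/2)\sin(\theta/2) & \cos^2(\theta/2) \end{pmatrix} \\
&= \mu_B \begin{pmatrix} \cos^2(\theta/2) - \sin^2(\theta/2) & 2\cos(\theta/2)\sin(\theta/2) \\ 2\cos(\theta/2)\sin(\theta/2) & \sin^2(\theta/2) - \cos^2(\theta/2) \end{pmatrix} \\
&= \mu_B \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}
\end{align*}
where in the last line I have used the trigonometric half-angle formulas that
everyone learned in high school and then forgot. (I forgot them too, but I
know where to look them up.)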
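In particular, using the values θ = 0 and θ = 90°,
\[
\hat\mu_z \doteq \mu_B \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\quad\text{and}\quad
\hat\mu_x \doteq \mu_B \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]
and furthermore
\[ \hat\mu_\theta = \cos\theta\, \hat\mu_z + \sin\theta\, \hat\mu_x . \]
This is convenient because the unit vector $\hat{\mathbf{r}}$ in the direction of θ is
\[ \hat{\mathbf{r}} = \cos\theta\, \hat{\mathbf{k}} + \sin\theta\, \hat{\mathbf{i}}. \]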
So, knowing the operator associated with a measurement, we can easily

find the resulting average value for any given state when measured. But
we often want to know more than the average. We want to know also the
standard deviation. Indeed we would like to know everything about the
measurement: the possible results, the probability of each result, the state
the system will be in after the measurement is performed. Surprisingly, all
this information is wrapped up within the measurement operator as well.
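As a concrete illustration of finding the average from the operator, here is a minimal sketch that computes the average of µθ two ways: from the matrix of the measurement operator, and from the two outcome probabilities. The function names, the choice of units with µB = 1, and the sample state are assumptions made for the example.

```python
import math

MU_B = 1.0  # assumption: units in which the Bohr magneton equals 1

def amplitude(bra, ket):
    """<bra|ket> for two columns given in the {|z+>, |z->} basis."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def mu_theta_matrix(theta):
    """Matrix representing the mu_theta operator in the {|z+>, |z->} basis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[MU_B * c, MU_B * s], [MU_B * s, -MU_B * c]]

def average_from_operator(theta, psi):
    """<psi| mu_theta |psi>, computed by matrix multiplication."""
    m = mu_theta_matrix(theta)
    m_psi = [sum(m[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return amplitude(psi, m_psi).real

def average_from_probabilities(theta, psi):
    """(+mu_B) P(+) + (-mu_B) P(-), using the two outcome probabilities."""
    plus = [math.cos(theta / 2), math.sin(theta / 2)]
    minus = [-math.sin(theta / 2), math.cos(theta / 2)]
    p_plus = abs(amplitude(plus, psi)) ** 2
    p_minus = abs(amplitude(minus, psi)) ** 2
    return MU_B * p_plus - MU_B * p_minus

if __name__ == "__main__":
    psi = [0.6, 0.8j]   # a normalized sample state, chosen arbitrarily
    theta = 1.1         # analyzer angle in radians, chosen arbitrarily
    print(average_from_operator(theta, psi))
    print(average_from_probabilities(theta, psi))
```

The two printed numbers should agree to rounding error, which is the content of the identity relating the average value to the operator sandwiched between the state and itself.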
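We know that there are only two states that have a definite value of $\mu_\theta$,
namely $|\theta+\rangle$ and $|\theta-\rangle$. How do these states behave when acted upon by
the operator $\hat\mu_\theta$?
\begin{align*}
\hat\mu_\theta|\theta+\rangle &= \{(+\mu_B)|\theta+\rangle\langle\theta+| + (-\mu_B)|\theta-\rangle\langle\theta-|\}|\theta+\rangle \\
&= (+\mu_B)|\theta+\rangle\langle\theta+|\theta+\rangle + (-\mu_B)|\theta-\rangle\langle\theta-|\theta+\rangle \\
&= (+\mu_B)|\theta+\rangle(1) + (-\mu_B)|\theta-\rangle(0) \\
&= (+\mu_B)|\theta+\rangle
\end{align*}
In other words, when the operator $\hat\mu_\theta$ acts upon the state $|\theta+\rangle$, the result is
$(+\mu_B)$ times that same state $|\theta+\rangle$, and $(+\mu_B)$ is exactly the result that
we would always obtain if we measured $\mu_\theta$ for an atom in state $|\theta+\rangle$! A
parallel result holds for $|\theta-\rangle$.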
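To convince you of how rare this phenomenon is, let me apply the operator
$\hat\mu_\theta$ to some other state, say $|z+\rangle$. The result is
\begin{align*}
\hat\mu_\theta|z+\rangle &= \{(+\mu_B)|\theta+\rangle\langle\theta+| + (-\mu_B)|\theta-\rangle\langle\theta-|\}|z+\rangle \\
&= (+\mu_B)|\theta+\rangle\langle\theta+|z+\rangle + (-\mu_B)|\theta-\rangle\langle\theta-|z+\rangle \\
&= (+\mu_B)|\theta+\rangle(\cos(\theta/2)) + (-\mu_B)|\theta-\rangle(-\sin(\theta/2)).
\end{align*}
But
\begin{align*}
|\theta+\rangle &= |z+\rangle\langle z+|\theta+\rangle + |z-\rangle\langle z-|\theta+\rangle = |z+\rangle(\cos(\theta/2)) + |z-\rangle(\sin(\theta/2)) \\
|\theta-\rangle &= |z+\rangle\langle z+|\theta-\rangle + |z-\rangle\langle z-|\theta-\rangle = |z+\rangle(-\sin(\theta/2)) + |z-\rangle(\cos(\theta/2)),
\end{align*}
so
\begin{align*}
\hat\mu_\theta|z+\rangle &= (+\mu_B)|\theta+\rangle(\cos(\theta/2)) + (-\mu_B)|\theta-\rangle(-\sin(\theta/2)) \\
&= \mu_B \left[ |z+\rangle(\cos^2(\theta/2) - \sin^2(\theta/2)) + |z-\rangle(2\cos(\theta/2)\sin(\theta/2)) \right] \\
&= \mu_B \left[ |z+\rangle \cos\theta + |z-\rangle \sin\theta \right],
\end{align*}
where in the last line I have again used the half-remembered half-angle
formulas.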
The upshot is that most of the time, $\hat\mu_\theta$ acting upon $|z+\rangle$ does not
produce a number times $|z+\rangle$; most of the time it produces some combination
of $|z+\rangle$ and $|z-\rangle$. In fact the only case in which $\hat\mu_\theta$ acting upon
$|z+\rangle$ produces a number times $|z+\rangle$ is when sin θ = 0, that is when θ = 0°
or when θ = 180°.
States $|\psi\rangle$ for which $\hat\mu_\theta$ acting upon $|\psi\rangle$ produces a number times the
original state $|\psi\rangle$ are rare: they are called eigenstates. The associated numbers
are called eigenvalues. We have found the two eigenstates of $\hat\mu_\theta$: they are
$|\theta+\rangle$ with eigenvalue $+\mu_B$ and $|\theta-\rangle$ with eigenvalue $-\mu_B$.
\begin{align*}
\hat\mu_\theta|\theta+\rangle &= (+\mu_B)|\theta+\rangle \qquad \text{eigenstate } |\theta+\rangle \text{ with eigenvalue } +\mu_B \\
\hat\mu_\theta|\theta-\rangle &= (-\mu_B)|\theta-\rangle \qquad \text{eigenstate } |\theta-\rangle \text{ with eigenvalue } -\mu_B
\end{align*}
The eigenstates are the states with definite values of $\mu_\theta$. And the eigenvalues
are those values!
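The eigenstate property is easy to test numerically. The sketch below is an illustration only, with hypothetical function names, units in which µB = 1, and real two-component columns in the {|z+>, |z->} basis: it applies the matrix of the measurement operator to a column and asks whether the result is a multiple of the input.

```python
import math

def apply_mu_theta(theta, ket, mu_b=1.0):
    """Apply the matrix mu_B [[cos t, sin t], [sin t, -cos t]] to a column."""
    c, s = math.cos(theta), math.sin(theta)
    return [mu_b * (c * ket[0] + s * ket[1]),
            mu_b * (s * ket[0] - c * ket[1])]

def is_eigenstate(theta, ket, tol=1e-12):
    """Return (True, eigenvalue) if mu_theta|ket> is a number times |ket>.

    Assumes a real, nonzero two-component column. Two such columns are
    parallel exactly when the 2 x 2 determinant built from them vanishes.
    """
    out = apply_mu_theta(theta, ket)
    det = out[0] * ket[1] - out[1] * ket[0]
    if abs(det) > tol:
        return (False, None)
    norm_squared = ket[0] ** 2 + ket[1] ** 2
    eigenvalue = (ket[0] * out[0] + ket[1] * out[1]) / norm_squared
    return (True, eigenvalue)

if __name__ == "__main__":
    theta = math.radians(60)  # an arbitrary analyzer angle
    plus = [math.cos(theta / 2), math.sin(theta / 2)]
    minus = [-math.sin(theta / 2), math.cos(theta / 2)]
    z_plus = [1.0, 0.0]
    for name, ket in (("theta+", plus), ("theta-", minus), ("z+", z_plus)):
        print(name, is_eigenstate(theta, ket))
```

For an angle with sin θ ≠ 0, the first two states should be reported as eigenstates with eigenvalues close to +1 and −1 (in units of µB), and |z+> should not.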
Summary: The quantum theory of measurement
This summarizes the quantum theory of measurement as applied to the
measurement of $\boldsymbol{\mu}$ projected onto the unit vector in the direction of θ:
The operator $\hat\mu_\theta$ has two eigenstates which constitute a complete and
orthogonal basis:
state $|\theta+\rangle$ with eigenvalue $+\mu_B$
state $|\theta-\rangle$ with eigenvalue $-\mu_B$
(a) If you measure $\mu_\theta$ of an atom in an eigenstate of $\hat\mu_\theta$, then the number
you measure will be the corresponding eigenvalue, and the atom will remain
in that eigenstate.
(b) If you measure $\mu_\theta$ of an atom in an arbitrary state $|\psi\rangle$, then the
number you measure will be one of the two eigenvalues of $\hat\mu_\theta$: It will be $+\mu_B$
with probability $|\langle\theta+|\psi\rangle|^2$, and it will be $-\mu_B$ with probability $|\langle\theta-|\psi\rangle|^2$. If
the value measured was $+\mu_B$, then the atom will leave in state $|\theta+\rangle$; if the
value measured was $-\mu_B$, then the atom will leave in state $|\theta-\rangle$.
Exercise: Show that (a) follows from (b).
Are states and operators real?
This is a philosophical question for which there’s no specific meaning
and hence no specific answer. But in my opinion, states and operators are
mathematical tools that enable us to efficiently and accurately calculate
the probabilities that we find through repeated measurement experiments,
interference experiments, and indeed all experiments. They are not “real”.
Indeed, it is possible to formulate quantum mechanics in such a way
that probabilities and amplitudes are found without using the mathemat-
ical tools of “state” and “operator” at all. Richard Feynman and Albert
Hibbs do just this in their book Quantum Mechanics and Path Integrals.
States and operators do not make an appearance until deep into this book,
and even when they do appear it is only to make a connection between Feyn-
man’s formulation and more traditional formulations of quantum mechan-
ics — states and operators are not essential. In my opinion, this Feynman
“sum over histories” formulation is the most intuitively appealing approach
to quantum mechanics. There is, however, a price to be paid for this appeal:
it’s very difficult to work problems in the Feynman formulation.
For more extensive discussion, see N. David Mermin, “What’s bad about
this habit?” Physics Today, 62 (5), May 2009, pages 8–9, and discussion
about this essay in Physics Today, 62 (9), September 2009, pages 10–15.
3.3 Photon polarization
This book has developed the principles of quantum mechanics using a par-
ticular system, the magnetic moment of a silver atom, which has two basis
states. Another system with two basis states is polarized light. I did not
use this system mainly because photons are less familiar than atoms. These
problems develop the quantum mechanics of photon polarization much as
the text developed the quantum mechanics of magnetic moment.
One cautionary note: There is always a tendency to view the photon

as a little bundle of electric and magnetic fields, a “wave packet” made up
of these familiar vectors. This view is completely incorrect. In quantum
electrodynamics, in fact, the electric field is a classical macroscopic quantity
that takes on meaning only when a large number of photons are present.
3.15 Classical description of polarized light
[Figure: unpolarized light, x-polarized light, and θ-polarized light.]
When a beam of unpolarized light passes through an ideal polarizing
sheet (represented in the figure by a symbol whose arrow shows the
polarizing axis), the emerging beam is of
lower intensity and it is “polarized”, that is, the electric field vector
undulates but points only parallel or antiparallel to the polarizing axis.
When a beam of vertically polarized light (an “x-polarized beam”) is
passed through an ideal polarizing sheet with polarizing axis oriented at
an angle θ to the vertical, the beam is reduced in intensity and emerges
with an electric field undulating parallel to the sheet’s polarizing axis (a
“θ-polarized beam”). The sheet performs these feats by absorbing any
component of electric field perpendicular to its polarizing axis. Show
that if the incoming x-polarized beam has intensity $I_0$, then the outgoing
θ-polarized beam has intensity $I_0 \cos^2\theta$. Show that this expression
gives the expected results when θ is 0°, 90°, 180° or 270°.
3.16 Quantal description of polarized light: Analyzers
In quantum mechanics, a photon state is described by three quantities:
energy, direction of motion, and polarization. We ignore the first two
quantities for now. There are an infinite number of possible polarization
states: the photons in an x-polarized beam are all in the $|x\rangle$ state, the
photons in a θ-polarized beam (0° ≤ θ < 180°) are all in the $|\theta\rangle$ state,
etc. In the quantum description, when an $|x\rangle$ photon encounters a
polarizing sheet oriented at an angle θ to the vertical, then either it
is absorbed (with probability $\sin^2\theta$) or else it emerges as a $|\theta\rangle$ photon
(with probability $\cos^2\theta$). A polarizing sheet is thus not an analyzer:
whereas an analyzer would split the incident beam into two (or more)
beams, the polarizing sheet absorbs one of the beams that an analyzer
would emit. An analyzer can be constructed out of any material that
exhibits double refraction. It is conventional to use a simple calcite
crystal:
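[Figure: an arbitrary input beam entering a calcite analyzer. Output labels: x-polarized beam and y-polarized beam; for an analyzer at angle θ, θ-polarized beam and (θ+90°)-polarized beam.]
What are the probabilities $|\langle x|\theta\rangle|^2$, $|\langle x|\theta+90^\circ\rangle|^2$? Design a pair of
experiments to show that the states $\{|\theta\rangle, |\theta+90^\circ\rangle\}$ constitute a basis.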
3.17 Interference
As usual, two analyzers (one inserted backwards) make up an analyzer
loop.
[Figure: an analyzer loop. A calcite analyzer splits the beam into x-polarized and y-polarized branches, which a reversed calcite analyzer recombines.]
Invent a series of experiments that demonstrates quantum interference.
(I used input photons in state $|x\rangle$, passed through an analyzer loop
rotated at angle θ to the vertical, followed by a vertical analyzer. But
you might develop some other arrangement.) Show that the results of
these experiments, plus the results of problem 3.16, are consistent with
the amplitudes
\begin{align*}
\langle x|\theta\rangle &= \cos\theta & \langle x|\theta+90^\circ\rangle &= -\sin\theta \\
\langle y|\theta\rangle &= \sin\theta & \langle y|\theta+90^\circ\rangle &= \cos\theta.
\end{align*}
(3.14)
3.18 Circular polarization
Just as it is possible to analyze any light beam into x- and y-polarized
beams, or into θ- and θ + 90◦ -polarized beams, so it is possible to
analyze any beam into right- and left-circularly polarized beams. You
might remember from classical optics that any linearly polarized beam
splits half-and-half into right- and left-circularly polarized light when
so analyzed.
[Figure: linearly polarized light enters an RL analyzer and emerges as right-circularly polarized light and left-circularly polarized light.]
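Quantum mechanics maintains that right- and left-circularly polarized
beams are made up of photons in the $|R\rangle$ and $|L\rangle$ states, respectively.
The projection amplitudes thus have magnitudes
\[
|\langle R|\ell p\rangle| = 1/\sqrt{2}, \qquad |\langle L|\ell p\rangle| = 1/\sqrt{2} \tag{3.15}
\]
where $|\ell p\rangle$ is any linearly polarized state. An RL analyzer loop is
described through the equation
\[
\langle\theta|R\rangle\langle R|x\rangle + \langle\theta|L\rangle\langle L|x\rangle = \langle\theta|x\rangle = \cos\theta. \tag{3.16}
\]
Show that no real-valued projection amplitudes can satisfy both relations
(3.15) and (3.16), but that the complex values
\begin{align*}
\langle L|\theta\rangle &= e^{i\theta}/\sqrt{2} & \langle L|x\rangle &= 1/\sqrt{2} \\
\langle R|\theta\rangle &= e^{-i\theta}/\sqrt{2} & \langle R|x\rangle &= 1/\sqrt{2}
\end{align*}
(3.17)
are satisfactory!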
3.4 Lightning linear algebra
Linear algebra provides many of the mathematical tools used in quantum

mechanics. This section will scan through and summarize linear algebra
to drive home the main points. . . it won’t attempt to prove things or to
develop the theory in the most elegant form using the smallest number of
assumptions.
Scalars: either real numbers (x) or complex numbers (z)

Vectors: notation a, b, c, or $|\psi\rangle$, $|\phi\rangle$, $|\chi\rangle$
In addition, there must be a rule for multiplying a vector by a scalar and a
rule for adding vectors, so that a + xb is a vector.
I won’t define “vector” any more than I defined “number”. But I will
give some examples:
arrows in N -dimensional space

n-tuples, with real entries or with complex entries
polynomials
functions
n × m matrices
Inner product
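The “inner product” is a function from the ordered pairs of vectors to
the scalars,
\[ \text{I.P.}(a, b) = \text{a real or complex number}, \tag{3.18} \]
that satisfies
\begin{align*}
\text{I.P.}(a, b + c) &= \text{I.P.}(a, b) + \text{I.P.}(a, c) \tag{3.19} \\
\text{I.P.}(a, zb) &= z\,\text{I.P.}(a, b) \tag{3.20} \\
\text{I.P.}(a, b) &= [\text{I.P.}(b, a)]^* \tag{3.21} \\
\text{I.P.}(a, a) &> 0 \quad \text{unless } a = 0 \tag{3.22}
\end{align*}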
It follows from equation (3.21) that I.P.(a, a) is real. Equation (3.22)

demands also that it’s positive.
Why is there a complex conjugation in equation (3.21)? Why not just
demand that I.P.(a, b) = I.P.(b, a)? The complex conjugation is needed
for consistency with (3.22). If it weren’t there, then
I.P.(ia, ia) = (i · i)I.P.(a, a) = −I.P.(a, a) < 0.
Notation: I.P.(a, b) = (a, b) = a·b, I.P.($|\phi\rangle$, $|\psi\rangle$) = $\langle\phi|\psi\rangle$.
Definition: The norm of $|\psi\rangle$ is $\sqrt{\langle\psi|\psi\rangle}$.
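For complex n-tuples, one function that satisfies properties (3.19) through (3.22) conjugates the entries of its first argument. The sketch below uses hypothetical function names and arbitrarily chosen sample vectors; it shows that function alongside the tempting alternative with no conjugation, which fails the positivity requirement.

```python
def inner(a, b):
    """I.P.(a, b) for complex n-tuples: conjugate the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def bilinear(a, b):
    """The alternative with no conjugation, shown only for contrast."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Square root of I.P.(a, a), which is real and non-negative."""
    return inner(a, a).real ** 0.5

if __name__ == "__main__":
    a = [1 + 2j, 3j]
    b = [2, 1 - 1j]
    print(inner(a, b), inner(b, a).conjugate())  # equal, as in (3.21)
    print(inner(a, a))                           # real and positive, as in (3.22)
    ia = [1j, 0]
    print(bilinear(ia, ia))                      # negative: no conjugation breaks (3.22)
    print(norm([3, 4j]))
```

The vector `ia` plays the role of the vector ia in the argument above: without conjugation its product with itself picks up the factor i·i = −1.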
Making new vectors from old
Given some vectors, say a1 and a2 , what vectors can you build from
them using scalar multiplication and vector addition?
Example: arrows in the plane.
[Figure: arrows in the plane. Panel (a): a1 and a2. Panel (b): a1 and a′2. Panel (c): a1, a2, and a3.]
In (a), any arrow in the plane can be built out of a1 and a2 . In other
words, any arrow in the plane can be written in the form r = r1 a1 + r2 a2 .
We say that “the set {a1 , a2 } spans the plane”.
In (b), we cannot build the whole plane from $a_1$ and $a'_2$. These two
vectors do not span the plane.
In (c), the set $\{a_1, a_2, a_3\}$ spans the plane, but the set is redundant: you
don’t need all three. You can build $a_3$ from $a_1$ and $a_2$: $a_3 = a_2 - \tfrac{1}{2} a_1$, so
anything that can be built from $\{a_1, a_2, a_3\}$ can also be built from $\{a_1, a_2\}$.
The set $\{a_1, a_2\}$ is “linearly independent”, the set $\{a_1, a_2, a_3\}$ is not.
Linearly independent: You can’t build any member of the set out of the
other members.
So any arrow r in the plane has a unique representation in terms of
{a1 , a2 } but not in terms of {a1 , a2 , a3 }. For example,
r = 2a3 = −1a1 + 2a2 + 0a3
= 0a1 + 0a2 + 2a3
Basis: A spanning set of linearly independent vectors. (That is, a

minimum set of building blocks from which any vector you want can be
constructed. In any given basis, there is a unique representation for an
arbitrary vector.)
It’s easy to see that all bases have the same number of elements, and
this number is called the dimensionality, N .
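The easiest basis to work with is an orthonormal basis: A basis
$\{|1\rangle, |2\rangle, \ldots, |N\rangle\}$ is orthonormal if
\[ \langle n|m\rangle = \delta_{n,m}. \tag{3.23} \]
For any basis an arbitrary vector $|\psi\rangle$ can be written
\[
|\psi\rangle = \psi_1|1\rangle + \psi_2|2\rangle + \cdots + \psi_N|N\rangle = \sum_{n=1}^{N} \psi_n|n\rangle, \tag{3.24}
\]
but for many bases it’s hard to find the coefficients $\psi_n$. For an orthonormal
basis, however, it’s easy. Take the inner product of basis element $|m\rangle$ with
$|\psi\rangle$, giving
\[
\langle m|\psi\rangle = \sum_{n=1}^{N} \psi_n \langle m|n\rangle = \sum_{n=1}^{N} \psi_n \delta_{m,n} = \psi_m. \tag{3.25}
\]
Thus the expansion (3.24) is
\[
|\psi\rangle = \sum_{n=1}^{N} |n\rangle\langle n|\psi\rangle. \tag{3.26}
\]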
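You have seen this formula in the context of arrows. For example,
using two-dimensional arrows with the orthonormal basis $\{\hat{\mathbf{i}}, \hat{\mathbf{j}}\}$, also called
$\{\mathbf{e}_x, \mathbf{e}_y\}$, you know that
\[ \mathbf{r} = x\,\mathbf{e}_x + y\,\mathbf{e}_y, \]
where
\[ x = \mathbf{r}\cdot\mathbf{e}_x \quad\text{and}\quad y = \mathbf{r}\cdot\mathbf{e}_y. \]
Thus
\[ \mathbf{r} = \mathbf{e}_x(\mathbf{e}_x\cdot\mathbf{r}) + \mathbf{e}_y(\mathbf{e}_y\cdot\mathbf{r}), \]
which is just an instance of the more general expression (3.26).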
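Expansion (3.26) translates directly into a two-step recipe: take inner products with the basis elements to get the coefficients, then add up the basis elements weighted by those coefficients. A minimal sketch follows, with hypothetical function names and an arbitrarily rotated orthonormal basis for arrows in the plane.

```python
import math

def inner(a, b):
    """Inner product of two n-tuples, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def expand(psi, basis):
    """Coefficients psi_n = <n|psi> in an orthonormal basis, as in (3.25)."""
    return [inner(e, psi) for e in basis]

def rebuild(coeffs, basis):
    """Sum over n of |n> psi_n, as in (3.24)."""
    dim = len(basis[0])
    return [sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(dim)]

def rotated_basis(angle):
    """A hypothetical orthonormal basis: the usual {e_x, e_y} rotated by `angle`."""
    return [[math.cos(angle), math.sin(angle)],
            [-math.sin(angle), math.cos(angle)]]

if __name__ == "__main__":
    r = [2.0, -1.5]
    basis = rotated_basis(0.3)
    coeffs = expand(r, basis)
    print(coeffs)
    print(rebuild(coeffs, basis))  # should reproduce r up to rounding error
```

The recipe relies on orthonormality; for a basis that is not orthonormal, the inner products are not the expansion coefficients.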
Representations
Any vector $|\psi\rangle$ is completely specified by the N numbers $\psi_1, \psi_2, \ldots, \psi_N$
(that is, the N numbers $\langle n|\psi\rangle$). We say that in the basis $\{|1\rangle, |2\rangle, \ldots, |N\rangle\}$,
the vector $|\psi\rangle$ is represented by the column matrix
\[
\begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}
= \begin{pmatrix} \langle 1|\psi\rangle \\ \langle 2|\psi\rangle \\ \vdots \\ \langle N|\psi\rangle \end{pmatrix}. \tag{3.27}
\]
It is very easy to manipulate vectors through their representations, so
representations are used often. So often, that some people go overboard and
say that the vector $|\psi\rangle$ is equal to this column matrix. This is false. The
matrix representation is a name for the vector, but is not equal to the
vector, much as the word “tree” is a name for a tree, but is not the same as
a tree. The symbol for “is represented by” is $\doteq$, so we write
\[
|\psi\rangle \doteq \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}
= \begin{pmatrix} \langle 1|\psi\rangle \\ \langle 2|\psi\rangle \\ \vdots \\ \langle N|\psi\rangle \end{pmatrix}. \tag{3.28}
\]
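What can we do with representations? Here’s a way to connect an inner
product, which is defined solely through the list of properties (3.19)–(3.22),
to a formula in terms of representations.
\begin{align*}
\langle\phi|\psi\rangle & && [[\text{ using (3.26) \ldots }]] \\
&= \langle\phi| \left\{ \sum_n |n\rangle\langle n|\psi\rangle \right\} && [[\text{ using (3.19) \ldots }]] \\
&= \sum_n \langle\phi|n\rangle\langle n|\psi\rangle && [[\text{ using (3.21) \ldots }]] \\
&= \sum_n \phi_n^* \psi_n \\
&= \begin{pmatrix} \phi_1^* & \phi_2^* & \cdots & \phi_N^* \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}
\end{align*}
We will sometimes say that $\langle\phi|$ is the “dual vector” to $|\phi\rangle$ and is
represented by the row matrix
\[ \begin{pmatrix} \phi_1^* & \phi_2^* & \cdots & \phi_N^* \end{pmatrix}. \tag{3.29} \]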
Transformation of representations
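In the orthonormal basis $\{|1\rangle, |2\rangle, \ldots, |N\rangle\}$, the vector $|\psi\rangle$ is represented
by
\[ \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}. \tag{3.30} \]
But in the different orthonormal basis $\{|1'\rangle, |2'\rangle, \ldots, |N'\rangle\}$, the vector $|\psi\rangle$
is represented by
\[ \begin{pmatrix} \psi'_1 \\ \psi'_2 \\ \vdots \\ \psi'_N \end{pmatrix}. \tag{3.31} \]
How are these two representations related?
\begin{align*}
\psi'_n &= \langle n'|\psi\rangle \\
&= \langle n'| \left\{ \sum_m |m\rangle\langle m|\psi\rangle \right\} \\
&= \sum_m \langle n'|m\rangle\langle m|\psi\rangle
\end{align*}
so
\[
\begin{pmatrix} \psi'_1 \\ \psi'_2 \\ \vdots \\ \psi'_N \end{pmatrix}
= \begin{pmatrix}
\langle 1'|1\rangle & \langle 1'|2\rangle & \cdots & \langle 1'|N\rangle \\
\langle 2'|1\rangle & \langle 2'|2\rangle & \cdots & \langle 2'|N\rangle \\
\vdots & & & \vdots \\
\langle N'|1\rangle & \langle N'|2\rangle & \cdots & \langle N'|N\rangle
\end{pmatrix}
\begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}. \tag{3.32}
\]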
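Equation (3.32) can be sketched in a few lines. In this illustration every vector is stored as its column in the old basis, so the old basis elements are the columns (1, 0) and (0, 1). The new basis chosen here, (|1> ± i|2>)/√2, is a hypothetical example picked only because it has complex entries; the function names are likewise assumptions.

```python
import math

def inner(a, b):
    """Inner product of two columns, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def transformation_matrix(new_basis, old_basis):
    """Matrix whose (n, m) entry is <n'|m>, as in equation (3.32)."""
    return [[inner(n_new, m_old) for m_old in old_basis] for n_new in new_basis]

def transform(matrix, column):
    """Multiply a square matrix into a column."""
    return [sum(row[m] * column[m] for m in range(len(column))) for row in matrix]

S = 1 / math.sqrt(2)
OLD = [[1, 0], [0, 1]]              # old basis, written in itself
NEW = [[S, S * 1j], [S, -S * 1j]]   # hypothetical new basis, written in the old one

if __name__ == "__main__":
    T = transformation_matrix(NEW, OLD)
    psi = [0.6, 0.8j]               # representation in the old basis
    psi_new = transform(T, psi)     # representation in the new basis
    print(psi_new)
    print(sum(abs(c) ** 2 for c in psi), sum(abs(c) ** 2 for c in psi_new))
```

The last line prints the sum of squared magnitudes of the components in each basis; for two orthonormal bases these should agree.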
Operators
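A linear operator $\hat{A}$ is a function from vectors to vectors
\[ \hat{A} : |\psi\rangle \mapsto |\phi\rangle \quad\text{or in other words}\quad |\phi\rangle = \hat{A}|\psi\rangle, \tag{3.33} \]
with the property that
\[ \hat{A}(z_1|\psi_1\rangle + z_2|\psi_2\rangle) = z_1\hat{A}|\psi_1\rangle + z_2\hat{A}|\psi_2\rangle. \tag{3.34} \]
If you know how $\hat{A}$ acts upon each member of a basis set
$\{|1\rangle, |2\rangle, \ldots, |N\rangle\}$, then you know everything there is to know about $\hat{A}$,
because for any vector $|\psi\rangle$
\[
\hat{A}|\psi\rangle = \hat{A}\left\{ \sum_n \psi_n|n\rangle \right\} = \sum_n \psi_n \hat{A}|n\rangle, \tag{3.35}
\]
and the vectors $\hat{A}|n\rangle$ are known.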

Examples of linear operators:
• The identity operator: 1̂|ψi = |ψi.

• Rotations in the plane. (Linear because the sum of the rotated arrows
is the same as the rotation of the summed arrows.)
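• The “projection operator” $\hat{P}_{|a\rangle}$, defined in terms of some fixed vector
$|a\rangle$ as
\[ \hat{P}_{|a\rangle} : |\psi\rangle \mapsto (\langle a|\psi\rangle)\,|a\rangle \tag{3.36} \]
This is often used for vectors $|a\rangle$ of norm 1, in which case, for arrows
in space, it looks like:
[Figure: an arrow x, an arrow a, and the projection $P_a x$ of x along a.]
• More generally, for a given $|a\rangle$ and $|b\rangle$ the operator
\[ \hat{S} : |\psi\rangle \mapsto (\langle b|\psi\rangle)\,|a\rangle \tag{3.37} \]
is linear.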
Operators may not commute. That is, we might well have
\[ \hat{A}_1\hat{A}_2|\psi\rangle \neq \hat{A}_2\hat{A}_1|\psi\rangle. \tag{3.38} \]
Non-linear operators also exist, such as $\hat{N} : |\psi\rangle \mapsto (\langle\psi|\psi\rangle)\,|\psi\rangle$, but
are not so important in applications to quantum mechanics. [[This is the
source of the name linear algebra. For non-linear operators, knowledge of
the action of $\hat{N}$ on the basis vectors is not sufficient to define the operator.
It is a mystery why all the exact operators in quantum mechanics are linear.]]
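The difference between the projection operator (3.36) and the non-linear operator N̂ shows up as soon as property (3.34) is tried on a sample. The sketch below uses hypothetical function names. Passing the check for one sample does not prove linearity; failing it for one sample does disprove it.

```python
def inner(a, b):
    """Inner product of two columns, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def project(a, psi):
    """The projection operator of (3.36): psi -> <a|psi> a."""
    c = inner(a, psi)
    return [c * x for x in a]

def nonlinear_n(psi):
    """The non-linear operator N: psi -> <psi|psi> psi."""
    c = inner(psi, psi)
    return [c * x for x in psi]

def passes_linearity_check(op, psi1, psi2, z1, z2, tol=1e-9):
    """Spot check of property (3.34) for one choice of vectors and scalars."""
    combined = [z1 * x + z2 * y for x, y in zip(psi1, psi2)]
    lhs = op(combined)
    rhs = [z1 * u + z2 * v for u, v in zip(op(psi1), op(psi2))]
    return all(abs(l - r) < tol for l, r in zip(lhs, rhs))

if __name__ == "__main__":
    a = [0.6, 0.8]  # a fixed vector of norm 1, chosen arbitrarily
    print(passes_linearity_check(lambda v: project(a, v), [1, 2j], [3, -1], 2, 1j))
    print(passes_linearity_check(nonlinear_n, [1, 0], [0, 1], 2, 2))
```

In the second check, N̂ acting on the combined vector (2, 2) gives (16, 16), while combining the separately transformed basis vectors gives (2, 2), so the check fails.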
Outer products
Recall the operator
\[
\hat{S} : |\psi\rangle \mapsto (\langle b|\psi\rangle)\,|a\rangle = |a\rangle\langle b|\psi\rangle,
\qquad
\hat{S}|\psi\rangle = |a\rangle\langle b|\psi\rangle. \tag{3.39}
\]
We will write the operator $\hat{S}$ as $|a\rangle\langle b|$ and call it “the outer product of
$|a\rangle$ and $|b\rangle$”. This means neither more nor less than the defining
equation (3.39).
For any orthonormal basis $\{|1\rangle, |2\rangle, \ldots, |N\rangle\}$, consider the operator
\[ \hat{T} \equiv |1\rangle\langle 1| + |2\rangle\langle 2| + \cdots + |N\rangle\langle N|. \tag{3.40} \]
The effect of this operator on an arbitrary vector $|\psi\rangle$ is given in
equation (3.26), which shows that $\hat{T}|\psi\rangle = |\psi\rangle$ for any $|\psi\rangle$. Hence my favorite
equation
\[ \hat{1} = \sum_n |n\rangle\langle n|. \tag{3.41} \]
This might look like magic, but it means nothing more than equation (3.26):
that a vector may be resolved into its components. The operator of
equation (3.41) simply represents the act of chopping a vector into its
components and reassembling them. It is the mathematical representation of an
analyzer loop!
Representations of operators
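Operators are represented by N × N matrices. If
\[ |\phi\rangle = \hat{A}|\psi\rangle, \tag{3.42} \]
then
\begin{align*}
\langle n|\phi\rangle &= \langle n|\hat{A}|\psi\rangle \\
&= \langle n|\hat{A}\hat{1}|\psi\rangle \\
&= \langle n|\hat{A} \left\{ \sum_m |m\rangle\langle m| \right\} |\psi\rangle \\
&= \sum_m \langle n|\hat{A}|m\rangle\langle m|\psi\rangle, \tag{3.43}
\end{align*}
or, in matrix form,
\[
\begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_N \end{pmatrix}
= \begin{pmatrix}
\langle 1|\hat{A}|1\rangle & \langle 1|\hat{A}|2\rangle & \cdots & \langle 1|\hat{A}|N\rangle \\
\langle 2|\hat{A}|1\rangle & \langle 2|\hat{A}|2\rangle & \cdots & \langle 2|\hat{A}|N\rangle \\
\vdots & & & \vdots \\
\langle N|\hat{A}|1\rangle & \langle N|\hat{A}|2\rangle & \cdots & \langle N|\hat{A}|N\rangle
\end{pmatrix}
\begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}. \tag{3.44}
\]
The matrix M that represents operator $\hat{A}$ in this particular basis has
elements $M_{n,m} = \langle n|\hat{A}|m\rangle$.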
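The rule for the matrix elements can be tried on an operator given only as a function. The sketch below is an illustration with hypothetical names: the operator is a quarter-turn rotation of arrows in the plane, and the basis is the usual pair of unit arrows.

```python
def inner(a, b):
    """Inner product of two columns, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def matrix_of(op, basis):
    """Matrix with elements M[n][m] = <n| op |m> in an orthonormal basis."""
    return [[inner(n, op(m)) for m in basis] for n in basis]

def apply(matrix, column):
    """Multiply a square matrix into a column, as in equation (3.44)."""
    return [sum(row[m] * column[m] for m in range(len(column))) for row in matrix]

def rotate_quarter_turn(v):
    """Rotate an arrow in the plane counterclockwise by 90 degrees."""
    return [-v[1], v[0]]

STANDARD = [[1, 0], [0, 1]]

if __name__ == "__main__":
    M = matrix_of(rotate_quarter_turn, STANDARD)
    print(M)
    psi = [2, 3]
    # Acting with the operator and multiplying by its matrix should agree.
    print(rotate_quarter_turn(psi), apply(M, psi))
```

Because the columns here are already written in the basis used for the matrix, multiplying by the matrix reproduces the action of the operator.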
In a different basis, the same operator Â will be represented by a dif-
ferent matrix. You can figure out for yourself how to transform the matrix
representation of an operator in one basis into the matrix representation of
that operator in a second basis. But it’s not all that important to do so.
Usually you work in the abstract operator notation until you’ve figured out
the easiest basis to work with, and then work in only that basis.
Unitary operators
If the norm of Û |ψi equals the norm of |ψi for all |ψi, then Û should
be called “norm preserving” but in fact is called “unitary”. The rotation
operator is unitary.
Hermitian conjugate
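For every operator $\hat{A}$ there is a unique operator $\hat{A}^\dagger$, the “Hermitian$^1$
conjugate” (or “Hermitian adjoint”) of $\hat{A}$ such that
\[ \langle\psi|\hat{A}^\dagger|\phi\rangle = \langle\phi|\hat{A}|\psi\rangle^* \tag{3.45} \]
for all vectors $|\psi\rangle$ and $|\phi\rangle$.
If the matrix elements for $\hat{A}$ are $M_{n,m}$, then the matrix elements for $\hat{A}^\dagger$
are $K_{n,m} = M^*_{m,n}$.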
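In terms of matrices, the Hermitian conjugate is the transpose with every entry complex conjugated. The sketch below, with hypothetical function names and an arbitrarily chosen matrix and pair of vectors, forms it and spot-checks the defining relation (3.45).

```python
def dagger(m):
    """Conjugate transpose: entry (i, j) of the result is conj(m[j][i])."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[j][i]).conjugate() for j in range(rows)] for i in range(cols)]

def sandwich(bra, m, ket):
    """<bra| M |ket> for columns `bra` and `ket` and a square matrix M."""
    m_ket = [sum(m[i][j] * ket[j] for j in range(len(ket))) for i in range(len(m))]
    return sum(complex(b).conjugate() * k for b, k in zip(bra, m_ket))

if __name__ == "__main__":
    A = [[1, 2j], [3, 4 + 1j]]
    psi = [1, 1j]
    phi = [2, 1 - 1j]
    left = sandwich(psi, dagger(A), phi)
    right = sandwich(phi, A, psi).conjugate()
    print(left, right)  # the two sides of equation (3.45)
```

A single pair of vectors is only a spot check; the relation is required to hold for all pairs.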
Hermitian operators
If $\hat{H}^\dagger = \hat{H}$, then $\hat{H}$ is said to be Hermitian. Matrix representations of
Hermitian operators have $M_{n,m} = M^*_{m,n}$. Hermitian operators are important
in quantum mechanics because if an operator is to correspond to an
observable, then that operator must be Hermitian.
Theorem: If $\hat{H}$ is Hermitian, then: (a) All of its eigenvalues are real.
(b) There is an orthonormal basis consisting of eigenvectors of $\hat{H}$.
Corollaries: If the orthonormal basis mentioned in (b) is
$\{|1\rangle, |2\rangle, \ldots, |N\rangle\}$, and $\hat{H}|n\rangle = \lambda_n|n\rangle$, then
\[ \hat{H} = \lambda_1|1\rangle\langle 1| + \lambda_2|2\rangle\langle 2| + \cdots + \lambda_N|N\rangle\langle N|. \tag{3.46} \]
The matrix representation of $\hat{H}$ in this basis is diagonal:
\[
\hat{H} \doteq \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_N
\end{pmatrix}. \tag{3.47}
\]
1 Charles Hermite (1822–1901), French mathematician who contributed to number theory,
orthogonal polynomials, elliptic functions, quadratic forms, and linear algebra.
Teacher of Hadamard and Poincaré, father-in-law of Picard.
3.5 Problems
3.19 Change of basis

The set $\{|a\rangle, |b\rangle\}$ is an orthonormal basis.
a. Show that the set $\{|a'\rangle, |b'\rangle\}$, where
\begin{align*}
|a'\rangle &= +\cos\phi\,|a\rangle + \sin\phi\,|b\rangle \\
|b'\rangle &= -\sin\phi\,|a\rangle + \cos\phi\,|b\rangle
\end{align*}
is also an orthonormal basis. (The angle φ is simply a parameter;
it has no physical significance.)
b. Write down the transformation matrix from the $\{|a\rangle, |b\rangle\}$ basis
representation to the $\{|a'\rangle, |b'\rangle\}$ basis representation.
(If you suspect a change of basis is going to help you, but you’re not
sure how or why, this change often works, so it’s a good one to try
first. You can adjust φ to any value you want, but it’s been my
experience that it is most often helpful when φ = 45°.)
3.20 Change of representation, I
If the set $\{|a\rangle, |b\rangle\}$ is an orthonormal basis, then the set $\{|a'\rangle, |b'\rangle\}$,
where $|a'\rangle = |b\rangle$ and $|b'\rangle = |a\rangle$, is also an orthonormal basis: it’s just a
reordering of the original basis states. Find the transformation matrix.
If state $|\psi\rangle$ is represented in the $\{|a\rangle, |b\rangle\}$ basis as
\[ \begin{pmatrix} \psi_a \\ \psi_b \end{pmatrix}, \]
then how is this state represented in the $\{|a'\rangle, |b'\rangle\}$ basis?
3.21 Change of representation, II
Same as the previous problem, but use $|a'\rangle = i|a\rangle$ and $|b'\rangle = -i|b\rangle$.
3.22 Inner product
You know that the inner product between two position unit vectors
is the cosine of the angle between them. What is the inner product
between the states $|z+\rangle$ and $|\theta+\rangle$? Does the geometrical interpretation
hold?
3.23 Outer product
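Using the $\{|z+\rangle, |z-\rangle\}$ basis representations
\[
|\psi\rangle \doteq \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}, \quad
|\phi\rangle \doteq \begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix}, \quad
|\theta+\rangle \doteq \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix}, \quad
|\theta-\rangle \doteq \begin{pmatrix} -\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix},
\]
write representations for $|\theta+\rangle\langle\theta+|$ and $|\theta-\rangle\langle\theta-|$, then for
$\langle\phi|\theta+\rangle\langle\theta+|\psi\rangle$ and $\langle\phi|\theta-\rangle\langle\theta-|\psi\rangle$, and finally verify that
\[ \langle\phi|\psi\rangle = \langle\phi|\theta+\rangle\langle\theta+|\psi\rangle + \langle\phi|\theta-\rangle\langle\theta-|\psi\rangle. \]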
3.24 Measurement operator

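Write the representation of the $\hat\mu_\theta$ operator
\[ \hat\mu_\theta = (+\mu_B)|\theta+\rangle\langle\theta+| + (-\mu_B)|\theta-\rangle\langle\theta-| \]
in the $\{|z+\rangle, |z-\rangle\}$ basis. Using this representation, verify that $|\theta+\rangle$
and $|\theta-\rangle$ are eigenvectors.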
3.25 The trace
For any N × N matrix A (with components $a_{ij}$) the trace of A is defined
by
\[ \mathrm{tr}(A) = \sum_{i=1}^{N} a_{ii} \]
Show that tr(AB) = tr(BA), and hence that tr(ABCD) =

tr(DABC) = tr(CDAB), etc. (the so-called “cyclic invariance” of the
trace). However, show that tr(ABC) does not generally equal tr(CBA)
by constructing a counterexample. (Assume all matrices to be square.)
3.26 The outer product
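Any two complex N-tuples can be multiplied to form an N × N matrix
as follows: (The star represents complex conjugation.)
\begin{align*}
x &= (x_1\; x_2\; \ldots\; x_N) \\
y &= (y_1\; y_2\; \ldots\; y_N)
\end{align*}
\[
x \otimes y = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix}
\begin{pmatrix} y_1^* & y_2^* & \ldots & y_N^* \end{pmatrix}
= \begin{pmatrix}
x_1 y_1^* & x_1 y_2^* & \ldots & x_1 y_N^* \\
x_2 y_1^* & x_2 y_2^* & \ldots & x_2 y_N^* \\
\vdots & & & \vdots \\
x_N y_1^* & x_N y_2^* & \ldots & x_N y_N^*
\end{pmatrix}.
\]
This so-called “outer product” is quite different from the familiar “dot
product” or “inner product”
\[
x \cdot y = \begin{pmatrix} x_1^* & x_2^* & \ldots & x_N^* \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}
= x_1^* y_1 + x_2^* y_2 + \cdots + x_N^* y_N .
\]
Write a formula for the i, j component of $x \otimes y$ and use it to show that
$\mathrm{tr}(y \otimes x) = x \cdot y$.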
3.27 Pauli matrix algebra

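Three important matrices are the Pauli matrices:
\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
(Sometimes they are called $\sigma_1, \sigma_2, \sigma_3$ and other times they are called
$\sigma_x, \sigma_y, \sigma_z$.)
a. Show that the four matrices $\{I, \sigma_1, \sigma_2, \sigma_3\}$, where
\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \]
constitute a basis for the set of 2 × 2 matrices, by showing that
any matrix
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]
can be written as
\[ A = z_0 I + z_1\sigma_1 + z_2\sigma_2 + z_3\sigma_3 . \]
Produce formulas for the $z_i$ in terms of the $a_{ij}$.
b. Show that
i. $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = I^2 = I$
ii. $\sigma_i\sigma_j = -\sigma_j\sigma_i$ for $i \neq j$
iii. $\sigma_1\sigma_2 = i\sigma_3$ (a)
$\sigma_2\sigma_3 = i\sigma_1$ (b)
$\sigma_3\sigma_1 = i\sigma_2$ (c)
Note: Equations (b) and (c) are called “cyclic permutations” of
equation (a), because in each equation, the indices go in the order
[Figure: the indices 1, 2, 3 arranged in a cycle.]
and differ only by starting at different points in the “merry-go-round.”
c. Show that for any complex numbers $c_1, c_2, c_3$,
\[ (c_1\sigma_1 + c_2\sigma_2 + c_3\sigma_3)^2 = (c_1^2 + c_2^2 + c_3^2) I. \]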
3.28 Diagonalizing the Pauli matrices
Find the eigenvalues and corresponding (normalized) eigenvectors for
all three Pauli matrices.
3.29 Exponentiation of Pauli matrices

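Define exponentiation of matrices through
\[ e^{M} = \sum_{n=0}^{\infty} \frac{M^n}{n!}. \]
a. Show that
\[ e^{z\sigma_i} = \cosh(z) I + \sinh(z)\sigma_i \quad\text{for } i = 1, 2, 3. \]
(Hint: Look up the series expansions of sinh and cosh.)
b. Show that
\[ e^{(\sigma_1 + \sigma_3)} = \cosh(\sqrt{2}) I + \frac{\sinh(\sqrt{2})}{\sqrt{2}} (\sigma_1 + \sigma_3). \]
c. Prove that $e^{\sigma_1} e^{\sigma_3} \neq e^{(\sigma_1 + \sigma_3)}$.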
3.30 Hermitian operators
a. Show that if Â is a linear operator and (a, Âa) is real for all vectors
a, then Â is Hermitian. (Hint: Employ the hypothesis with a =
b + c and a = b + ic.)
b. Show that any operator of the form
\[ \hat{A} = c_a|a\rangle\langle a| + c_b|b\rangle\langle b| + \cdots + c_z|z\rangle\langle z|, \]
where the $c_n$ are real constants, is Hermitian.
c. You know that if an operator is Hermitian then all of its eigen-
values are real. Show that the converse is false by producing a
counterexample. (Hint: Try a 2 × 2 upper triangular matrix.)
3.31 Unitary operators
Show that all the eigenvalues of a unitary operator have square modulus
unity.
3.32 Commutator algebra
Prove that
[Â, bB̂ + cĈ] = b[Â, B̂] + c[Â, Ĉ]
[aÂ + bB̂, Ĉ] = a[Â, Ĉ] + b[B̂, Ĉ]
[Â, B̂ Ĉ] = B̂[Â, Ĉ] + [Â, B̂]Ĉ
[ÂB̂, Ĉ] = Â[B̂, Ĉ] + [Â, Ĉ]B̂
0 = [Â, [B̂, Ĉ]] + [B̂, [Ĉ, Â]] + [Ĉ, [Â, B̂]] (the “Jacobi identity”).
Chapter 4
Formalism
Look at A Quantum Mechanics Primer by Daniel T. Gillespie, pages 1

through 70.
4.1 The role of formalism
We started off trying to follow the behavior of a silver atom as it passed

through various magnetic fields, and we ended up with an elaborate mathe-
matical structure of state vectors, Hilbert space, operators, and eigenstates.
This is a good time to step back and focus, not on the formalism, but on
what the formalism is good for: what it does, what it doesn’t do, and why
we should care. We do so by looking at a different mathematical formalism
for a more familiar physical problem.
Here’s the physical problem: Suppose I count out 178 marbles and put
them in an empty bucket. Then I count out 252 more marbles and put
them in the same bucket. How many marbles are in the bucket?
There are a number of ways to solve this problem. First, by experiment:
One could actually count out and place the marbles, and then count the
number of marbles in the bucket at the end of the process. Second, by
addition using Arabic numerals, using the rules for addition of three-digit
numbers (“carrying”) that we all learned in elementary school. Third, by
the trick of writing
178 + 252 = 180 + 250 = 430
which reduces the problem to two-digit addition. Fourth, by converting
from Arabic numerals in base 10 (decimal) to Arabic numerals in base 8
(octal) and adding the octal numerals:

\[ 178_{\text{(dec)}} + 252_{\text{(dec)}} = 262_{\text{(oct)}} + 374_{\text{(oct)}} = 656_{\text{(oct)}} = 430_{\text{(dec)}} . \]
Fifth, by converting to Roman numerals and adding them using the Roman
addition rules that are simple and direct, but that you probably didn’t learn
in elementary school. Sixth, by converting to Mayan numerals and adding
them using rules that are, to you, even less familiar. If you think about it,
you’ll come up with other methods.
The formal processes of Arabic numeral addition, Roman numeral ad-
dition, and Mayan numeral addition are interesting only because they give
the same result as the experimental method of counting out marbles. That
is, these formal, mathematical processes matter only because they reflect
something about the physical world. (It’s clear that addition using deci-
mal Arabic numerals is considerably easier — and cheaper — than actually
doing the experiment. If you were trained in octal or Roman or Mayan nu-
merals, then you’d also find executing those algorithms easier than doing
the experiment.)
Does the algorithm of “carrying” tell us anything about addition? For
example, does it help us understand what’s going on when we count out
the total number of marbles in the bucket at the end of the experiment? I
would answer “no”. The algorithm of carrying tells us not about addition,
but about how we represent numbers using Arabic numerals with decimal
positional notation (“place value”). The “carry digits” are a convenient
mathematical tool to help calculate the total number of marbles in the
bucket. The amount of carrying involved differs depending upon whether
the addition is performed in decimal or in octal. It is absurd to think that
one could look into the bucket and identify which marbles were involved in
the carry and which were not! Nevertheless, you can and should develop
an intuition about whether or not a carry will be needed when performing
a sum. Indeed, when we wrote 178 + 252 as 180 + 250, we did so precisely
to avoid a carry.
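The point that carrying belongs to the representation rather than to the marbles can be made concrete. The sketch below uses hypothetical function names: it adds 178 and 252 by the schoolbook digit-by-digit algorithm in an arbitrary base and counts the carries along the way.

```python
def to_digits(n, base):
    """Digits of a non-negative integer, most significant first."""
    digits = []
    while n > 0:
        digits.append(n % base)
        n //= base
    return digits[::-1] or [0]

def from_digits(digits, base):
    """Inverse of to_digits."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

def add_with_carries(a, b, base):
    """Schoolbook addition in the given base.

    Returns (digits of the sum, number of carries performed).
    """
    da = to_digits(a, base)[::-1]  # least significant digit first
    db = to_digits(b, base)[::-1]
    result, carry, carries = [], 0, 0
    for i in range(max(len(da), len(db))):
        total = carry
        total += da[i] if i < len(da) else 0
        total += db[i] if i < len(db) else 0
        result.append(total % base)
        carry = total // base      # 0 or 1 when adding two numbers
        carries += carry
    if carry:
        result.append(carry)
    return result[::-1], carries

if __name__ == "__main__":
    for base in (10, 8):
        digits, carries = add_with_carries(178, 252, base)
        print(base, digits, carries, from_digits(digits, base))
```

The sum is 430 marbles either way, but the digit strings and the number of carries depend on the base.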
There are many ways to find the sum of two integers. These different
methods differ in ease of use, in familiarity, in concreteness, in ability to
generalize to negative, fractional, and imaginary numbers. So you might
prefer one method to another. But you can’t say that one method is right
and another is wrong: the significance of the various methods is, in fact,
that they all produce the same answer, and that that answer is the same
as the number of marbles in the bucket at the end of the process.
As with marbles in a bucket, so with classical mechanics. You know

several formalisms — several algorithms — for solving problems in classi-
cal mechanics: the Newtonian formalism, the Lagrangian formalism, the
Hamiltonian formalism, Poisson brackets, etc. These formal, mathemati-
cal processes are significant only because they reflect something about the
physical world.
The mathematical manipulations involved in solving a particular prob-
lem using Newton’s force-oriented method differ dramatically from the
mathematical manipulations involved in solving that same problem using
Hamilton’s energy-oriented method, but the two answers will always be the
same. Just as one can convert integers from a representation as decimal
Arabic numerals to a representation as octal Arabic numerals, or as Roman
numerals, or as Mayan numerals, so one can add any constant to a Hamilto-
nian and obtain a different Hamiltonian that is just as good as the original.
Poisson brackets don’t actually exist out in nature — you can never per-
form an experiment to measure the numerical value of a Poisson bracket
— but they are convenient mathematical tools that help us calculate the
values of positions that we can measure.
Although Lagrangians, Hamiltonians, and Poisson brackets are features
of the algorithm, not features of nature, it is nevertheless possible to develop
intuition concerning Lagrangians, Hamiltonians, and Poisson brackets. You
might call this “physical intuition” or you might call it “mathematical in-
tuition” or “algorithmic intuition”. Regardless of what you call it, it’s a
valuable thing to learn.
These different methods for solving classical problems differ in ease of
use, in familiarity, in concreteness, in ability to generalize to relativistic and
quantal situations. So you might prefer one method to another. But you
can’t say that one method is right and another is wrong: the significance
of the various methods is, in fact, that they all produce the same answer,
and that that answer is the same as the classical behavior exhibited by the
system in question.
As with marbles in a bucket, and as with classical mechanics, so with
quantum mechanics. This chapter has developed an elegant and formidable
formal apparatus representing quantal states as vectors in Hilbert space and
experiments as operators in Hilbert space. This is not the only way of solv-
ing problems in quantum mechanics: One could go back to the fundamental
rules for combining amplitudes in series and in parallel (page 58), just as
one could go back to solving arithmetic problems by throwing marbles into

a bucket. Or one could develop more elaborate and more formal ways to
solve quantum mechanics problems, just as one could use the Lagrangian or
Hamiltonian formulations in classical mechanics. This book will not treat
these alternative formulations of quantum mechanics: the path integral
formulation (Feynman), the phase space formulation (Wigner), the density
matrix formulation (for an introduction, see section 4.2), the variational
formulation, the pilot wave formulation (de Broglie-Bohm), or any of the
others. But be assured that these alternative formulations exist, and their
existence proves that kets and operators are features of the algorithmic
tools we use to solve quantum mechanical problems, not features of nature.
The mathematical manipulations involved in solving a particular prob-
lem using the Hilbert space formalism differ dramatically from the mathe-
matical manipulations involved in solving that same problem using the rules
for combining amplitudes in series and in parallel, but the two answers will
always be the same. In almost all cases the Hilbert space formalism is far
easier to apply, and that’s why we use it. We use it so often that we can fall
into the trap of thinking that kets and operators are features of nature, not
features of an algorithm. But remember that just as one can convert inte-
gers from a representation as decimal Arabic numerals to a representation
as octal Arabic numerals, or as Roman numerals, or as Mayan numerals, so
one can multiply any state vector by a constant of modulus unity to obtain
a different state vector that is just as good as the original. State vectors
don’t actually exist out in nature — you can never perform an experiment
to measure the numerical value of a state vector (or even of an amplitude)
— but they are convenient mathematical tools that help us calculate the
values of probabilities that we can measure.
Many students, faced with the formidable mathematical formalism of
quantum mechanics, fall into the trap of despair. “How can nature possibly
be so sophisticated and formal?” This is the same trap as wondering “How
can marbles know the algorithm for carrying in the addition of decimal Ara-
bic numerals?” Nature doesn’t know anything about Hilbert space, just as
marbles don’t know anything about carrying. The fact that the formalism
of quantum mechanics is more sophisticated than the formalism of addi-
tion, or the formalism of classical mechanics, simply reflects the two facts
(noted previously on pages XXX and YYY) that quantum mechanics is far
removed from common sense, and that quantum mechanics is stupendously
rich.
4.2 The density matrix
4.1 Definition
Consider a system in quantum state |ψ⟩. Define the operator
$$\hat{\rho} = |\psi\rangle\langle\psi|,$$
called the density matrix, and show that the expectation value of the
observable associated with operator Â in |ψ⟩ is
$$\mathrm{tr}\{\hat{\rho}\hat{A}\}.$$
4.2 Statistical mechanics
Frequently physicists don’t know exactly which quantum state their
system is in. (For example, silver atoms coming out of an oven are in
states of definite µ projection, but there is no way to know which state
any given atom is in.) In this case there are two different sources of
measurement uncertainty: first, we don’t know what state the system
is in (statistical uncertainty, due to our ignorance) and second, even
if we did know, we couldn’t predict the result of every measurement
(quantum uncertainty, due to the way the world works). The density
matrix formalism neatly handles both kinds of uncertainty at once.
If the system could be in any of the states |a⟩, |b⟩, . . . , |i⟩, . . . (not
necessarily a basis set), and if it has probability p_i of being in state |i⟩,
then the density matrix
$$\hat{\rho} = \sum_i p_i\,|i\rangle\langle i|$$
is associated with the system. Show that the expectation value of the
observable associated with Â is still given by
$$\mathrm{tr}\{\hat{\rho}\hat{A}\}.$$
4.3 Trace of the density matrix
Show that tr{ρ̂} = 1. (This can be either a long and tedious proof, or
a short and insightful one.)
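A small numerical check can make problems 4.2 and 4.3 concrete before you attempt the proofs. The sketch below is only an illustration: the two states, their probabilities, and the matrix standing in for Â are arbitrary choices made here, not values from the text, and plain Python lists play the role of kets and operators.

```python
import math

def outer(ket):
    """Matrix of |ket><ket| as a nested list."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def expectation(ket, A):
    """<ket|A|ket> for a normalized ket."""
    n = len(ket)
    return sum(ket[i].conjugate() * A[i][j] * ket[j]
               for i in range(n) for j in range(n))

# Hypothetical data: two normalized, non-orthogonal states and their probabilities.
s = 1 / math.sqrt(2)
states = [[1 + 0j, 0j], [s + 0j, s * 1j]]
probs = [0.25, 0.75]

# A Hermitian matrix standing in for the observable (arbitrary choice).
A = [[1 + 0j, 2 - 1j], [2 + 1j, -1 + 0j]]

# rho = sum_i p_i |i><i|
rho = [[0j, 0j], [0j, 0j]]
for p, ket in zip(probs, states):
    P = outer(ket)
    for r in range(2):
        for c in range(2):
            rho[r][c] += p * P[r][c]

via_trace = trace(matmul(rho, A))                                   # tr{rho A}
via_average = sum(p * expectation(ket, A) for p, ket in zip(probs, states))
print(via_trace, via_average, trace(rho))
```

The two expectation values agree and the trace of the density matrix is 1, even though the two states are not orthogonal. A numerical agreement is of course not a proof.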
Chapter 5
Time Evolution
5.1 Operator for time evolution
You now are at the point in quantum mechanics where you were when you
first stepped into the door of your classical mechanics classroom: you know
what you’re trying to calculate.
But! How to calculate it? If quantum mechanics is to have a classical
limit, then quantal states have to change with time. We write this time
dependence explicitly as
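$$|\psi(t)\rangle. \tag{5.1}$$
We seek the equations that govern this time evolution, the ones parallel to
the classical time development equations, be they the Newtonian equations
$$\sum \vec{F} = m\vec{a} \tag{5.2}$$
or the Lagrange equations
$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = 0 \tag{5.3}$$
or the Hamilton equations
$$\frac{\partial H}{\partial q_i} = -\dot{p}_i, \qquad \frac{\partial H}{\partial p_i} = \dot{q}_i. \tag{5.4}$$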
Assume the existence of some “time development operator” Û(∆t) such that
$$|\psi(t + \Delta t)\rangle = \hat{U}(\Delta t)|\psi(t)\rangle. \tag{5.5}$$
You might think that this statement is so general that we haven’t assumed
anything — we’ve just said that things are going to change with time. In
fact we’ve made a big assumption: just by our notation we’ve assumed
that the time-development operator Û is linear, independent of the state
|ψ⟩ that’s evolving. That is, we’ve assumed that the same operator will
time-evolve any different state. (The operator will, of course, depend on
which system is evolving in time: the number of particles involved, their
interactions, their masses, the value of the magnetic field in which they
move, and so forth.)
By virtue of the meaning of time, we expect the operator Û (∆t) to have
these four properties:
(1) Û (∆t) is unitary.

(2) Û (∆t2 )Û (∆t1 ) = Û (∆t2 + ∆t1 ).
(3) Û (∆t) is dimensionless.
(4) Û (0) = 1̂.
And it’s also reasonable to assume that the time-development operator
can be expanded in a Taylor series:
$$\hat{U}(\Delta t) = \hat{U}(0) + \hat{A}\,\Delta t + \hat{B}(\Delta t)^2 + \cdots. \tag{5.6}$$
We know that Û(0) = 1̂, and we’ll write the quadratic and higher-order
terms as B̂(∆t)² + · · · = O(∆t²), which is read “terms of order ∆t² and
higher” or just “terms of order ∆t²”. Finally, we’ll write Â in a funny
way so that
$$\hat{U}(\Delta t) = \hat{1} - \frac{i}{\hbar}\hat{H}\,\Delta t + O(\Delta t^2). \tag{5.7}$$
I could just say, “we define Ĥ = iℏÂ” but that just shunts aside the
important question: why is this a useful definition? There are two reasons:
First, the operator Ĥ turns out to be Hermitian. (We will prove this in this
section.) Second, because it’s Hermitian, it can represent a measured
quantity. When we investigate the classical limit, we will see that it
corresponds to the classical energy.¹
¹ For now, you can just use dimensional analysis to see that it has the correct dimensions
for energy.
[[The energy operator is called “the Hamiltonian” and represented by
the letter Ĥ in honor of William Rowan Hamilton, who first pointed out the
central role that energy can play in time development in the formal theory of
classical mechanics. Hamilton (1805–1865) made important contributions
to mathematics, optics, classical mechanics, and astronomy. At the age
of 22 years, while still an undergraduate, he was appointed professor of
astronomy at his university and the Royal Astronomer of Ireland. He was
not related to the American founding father Alexander Hamilton.]]
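Theorem: The operator Ĥ defined above is Hermitian.
Proof: The proof uses the fact that the norm of |ψ(t + ∆t)⟩ equals the
norm of |ψ(t)⟩:
$$|\psi(t+\Delta t)\rangle = |\psi(t)\rangle - \frac{i}{\hbar}\Delta t\,\underbrace{\hat{H}|\psi(t)\rangle}_{\equiv\,|\psi_H(t)\rangle} + O(\Delta t^2). \tag{5.8}$$
Thus
$$\langle\psi(t+\Delta t)|\psi(t+\Delta t)\rangle = \left[\langle\psi(t)| + \frac{i}{\hbar}\Delta t\,\langle\psi_H(t)| + O(\Delta t^2)\right]\left[|\psi(t)\rangle - \frac{i}{\hbar}\Delta t\,|\psi_H(t)\rangle + O(\Delta t^2)\right] \tag{5.9}$$
$$= \langle\psi(t)|\psi(t)\rangle + \frac{i}{\hbar}\Delta t\left[\langle\psi_H(t)|\psi(t)\rangle - \langle\psi(t)|\psi_H(t)\rangle\right] + O(\Delta t^2) \tag{5.10}$$
$$1 = 1 + \frac{i}{\hbar}\Delta t\left[\langle\psi(t)|\psi_H(t)\rangle^* - \langle\psi(t)|\psi_H(t)\rangle\right] + O(\Delta t^2) \tag{5.11}$$
$$0 = \frac{i}{\hbar}\Delta t\left[\langle\psi(t)|\hat{H}|\psi(t)\rangle^* - \langle\psi(t)|\hat{H}|\psi(t)\rangle\right] + O(\Delta t^2). \tag{5.12}$$
This equation has to hold for all values of ∆t, so the quantity in square
brackets must vanish! That is,
$$\langle\psi(t)|\hat{H}|\psi(t)\rangle^* = \langle\psi(t)|\hat{H}|\psi(t)\rangle \tag{5.13}$$
for all vectors |ψ(t)⟩. A simple exercise (problem 2.30, part a) then shows
that operator Ĥ is Hermitian.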
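The role of Hermiticity in this proof can be seen numerically. The following sketch, which assumes units with ℏ = 1 and uses made-up 2 × 2 matrices and a made-up normalized state, applies the first-order step 1̂ − (i/ℏ)Ĥ∆t and reports how far the squared norm drifts from 1.

```python
def norm_sq_after_step(H, psi, dt, hbar=1.0):
    """Squared norm of (1 - (i/hbar) H dt)|psi>, i.e. one first-order time step."""
    n = len(psi)
    stepped = [
        psi[i] - (1j / hbar) * dt * sum(H[i][j] * psi[j] for j in range(n))
        for i in range(n)
    ]
    return sum(abs(c) ** 2 for c in stepped)

# Hypothetical test data: a normalized state and two 2x2 matrices.
psi = [0.6 + 0j, 0.8j]
H_hermitian = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -1.0]]
H_not_hermitian = [[1.0 + 0.3j, 0.5 - 0.2j], [0.5 + 0.2j, -1.0]]

for dt in (1e-2, 1e-3, 1e-4):
    dev_h = norm_sq_after_step(H_hermitian, psi, dt) - 1.0
    dev_n = norm_sq_after_step(H_not_hermitian, psi, dt) - 1.0
    print(f"dt={dt:g}  Hermitian: {dev_h:.3e}  non-Hermitian: {dev_n:.3e}")
```

For the Hermitian matrix the drift shrinks like ∆t², so the term linear in ∆t is absent, as equation (5.12) requires. For the non-Hermitian matrix the drift shrinks only like ∆t, so no unitary time development could have it as generator.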
We have written the time-development equation as
$$|\psi(t+\Delta t)\rangle = |\psi(t)\rangle - \frac{i}{\hbar}\Delta t\,\hat{H}|\psi(t)\rangle + O(\Delta t^2). \tag{5.14}$$
Rearrangement gives
$$\frac{|\psi(t+\Delta t)\rangle - |\psi(t)\rangle}{\Delta t} = -\frac{i}{\hbar}\hat{H}|\psi(t)\rangle + O(\Delta t). \tag{5.15}$$
In the limit ∆t → 0, this gives
$$\frac{d|\psi(t)\rangle}{dt} = -\frac{i}{\hbar}\hat{H}|\psi(t)\rangle \tag{5.16}$$
which is an important result known as the Schrödinger² equation!
² Erwin Schrödinger (1887–1961) was interested in physics, biology, philosophy, and
Eastern religion. Born in Vienna, he held physics faculty positions in Germany, Poland,
and Switzerland. In 1926 he discovered the time-development equation that now bears
his name. This led, in 1927, to a prestigious appointment in Berlin. In 1933, disgusted
with the Nazi regime, he left Berlin for Oxford, England. He held several positions in
various cities before ending up in Dublin. There, in 1944, he wrote a book titled What
is Life? which is widely credited for stimulating interest in what had been a backwater
of science: biochemistry.
Time evolution of projection probabilities
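Theorem: If |φ⟩ is a time-independent state and P̂φ = |φ⟩⟨φ| is its associated
projection operator, then
$$\frac{d}{dt}|\langle\phi|\psi(t)\rangle|^2 = -\frac{i}{\hbar}\langle[\hat{P}_\phi,\hat{H}]\rangle. \tag{5.17}$$
Proof:
$$\frac{d}{dt}|\langle\phi|\psi(t)\rangle|^2 = \frac{d}{dt}\Big(\langle\phi|\psi(t)\rangle\langle\phi|\psi(t)\rangle^*\Big)$$
$$= \left(\langle\phi|\frac{d}{dt}|\psi(t)\rangle\right)\langle\phi|\psi(t)\rangle^* + \langle\phi|\psi(t)\rangle\left(\langle\phi|\frac{d}{dt}|\psi(t)\rangle\right)^*$$
But
$$\langle\phi|\frac{d}{dt}|\psi(t)\rangle = -\frac{i}{\hbar}\langle\phi|\hat{H}|\psi(t)\rangle,$$
so
$$\frac{d}{dt}|\langle\phi|\psi(t)\rangle|^2 = -\frac{i}{\hbar}\Big[\langle\phi|\hat{H}|\psi(t)\rangle\langle\phi|\psi(t)\rangle^* - \langle\phi|\psi(t)\rangle\langle\phi|\hat{H}|\psi(t)\rangle^*\Big]$$
$$= -\frac{i}{\hbar}\Big[\langle\psi(t)|\phi\rangle\langle\phi|\hat{H}|\psi(t)\rangle - \langle\psi(t)|\hat{H}|\phi\rangle\langle\phi|\psi(t)\rangle\Big]$$
$$= -\frac{i}{\hbar}\langle\psi(t)|\Big\{|\phi\rangle\langle\phi|\hat{H} - \hat{H}|\phi\rangle\langle\phi|\Big\}|\psi(t)\rangle$$
$$= -\frac{i}{\hbar}\langle\psi(t)|[\hat{P}_\phi,\hat{H}]|\psi(t)\rangle$$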
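Lemma: Suppose Â and B̂ are commuting Hermitian operators. If |a⟩
is an eigenvector of Â and P̂a = |a⟩⟨a|, then [P̂a, B̂] = 0.
Proof: From the compatibility theorem, there is an eigenbasis {|b_n⟩} of
B̂ with |b_1⟩ = |a⟩. Write B̂ in diagonal form as
$$\hat{B} = \sum_n b_n\,|b_n\rangle\langle b_n|.$$
Then
$$\hat{B}|b_1\rangle\langle b_1| = \sum_n b_n|b_n\rangle\langle b_n|b_1\rangle\langle b_1| = \sum_n b_n|b_n\rangle\delta_{n,1}\langle b_1| = b_1|b_1\rangle\langle b_1|$$
while
$$|b_1\rangle\langle b_1|\hat{B} = \sum_n b_n|b_1\rangle\langle b_1|b_n\rangle\langle b_n| = \sum_n b_n|b_1\rangle\delta_{1,n}\langle b_n| = b_1|b_1\rangle\langle b_1|.$$
The two products are equal, so the commutator vanishes.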
5.2 Working with the Schrödinger equation
Quantal states evolve according to the Schrödinger time-development equation
$$\frac{d}{dt}|\psi(t)\rangle = -\frac{i}{\hbar}\hat{H}|\psi(t)\rangle. \tag{5.18}$$
We have shown that the linear operator Ĥ is Hermitian and has the dimen-
sions of energy. I’ve stated that we are going to show, when we discuss the
classical limit, that the operator Ĥ corresponds to energy, and this justifies
the name “Hamiltonian operator”. That’s still not much knowledge! This
is just as it was in classical mechanics: Time development is governed by
$$\sum F = ma, \tag{5.19}$$
but this doesn’t help you until you know what forces are acting. Similarly,
in quantum mechanics the Schrödinger equation is true but doesn’t help us
until we know how to find the Hamiltonian operator.
We find the Hamiltonian operator in quantum mechanics in the same
way that we find the force function in classical mechanics: by appeal to
experiment, to special cases, to thinking about the system and putting
the pieces together. It’s a creative task to stitch together the hints that
we know to find a Hamiltonian. Sometimes in this book I’ll be able to
guide you down this creative path. Sometimes, as in great art, the creative
process came through a stroke of genius that can only be admired and not
explained.
Representations of the Schrödinger equation
As usual, we become familiar with states through their components,
that is through their representations in a particular basis:
$$|\psi(t)\rangle = \sum_n \psi_n |n\rangle. \tag{5.20}$$
We know that |ψ(t)⟩ changes with time on the left-hand side, so something
has to change with time on the right-hand side. Which is it, the expansion
coefficients ψn or the basis states |n⟩? The choice has nothing to do with
nature: it is purely formal. All our experimental results will depend on
|ψ(t)⟩, and whether we ascribe the time development to the expansion
coefficients or to the basis states is merely a matter of convenience. There are
three common conventions, called “pictures”: In the “Schrödinger picture”,
the expansion coefficients change with time while the basis states don’t. In
the “Heisenberg picture” the reverse is true. In the “interaction picture”
both expansion coefficients and basis states change with time.

time constant     time dependent         name
{|n⟩}             ψn(t)                  Schrödinger picture
ψn                {|n(t)⟩}               Heisenberg picture
nothing           ψn(t), {|n(t)⟩}        interaction picture
This book will use the Schrödinger picture, but be aware that this is mere
convention.
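In the Schrödinger picture, the expansion coefficients ⟨n|ψ(t)⟩ = ψn(t)
change in time according to
$$\frac{d}{dt}\langle n|\psi(t)\rangle = -\frac{i}{\hbar}\langle n|\hat{H}|\psi(t)\rangle = -\frac{i}{\hbar}\sum_m \langle n|\hat{H}|m\rangle\langle m|\psi(t)\rangle, \tag{5.21}$$
or, in other words, according to
$$\frac{d\psi_n(t)}{dt} = -\frac{i}{\hbar}\sum_m H_{n,m}\,\psi_m(t) \quad\text{where, recall, } H_{n,m} = H^*_{m,n}. \tag{5.22}$$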
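Equation (5.22) is a set of coupled ordinary differential equations, so it can be integrated step by step even when no closed-form solution is at hand. The sketch below uses a standard fourth-order Runge-Kutta step (a generic numerical method, not one discussed in this book), takes ℏ = 1, and uses an arbitrary Hermitian matrix chosen only for illustration.

```python
def deriv(H, psi, hbar=1.0):
    """Right-hand side of (5.22): d(psi_n)/dt = -(i/hbar) sum_m H[n][m] psi_m."""
    n = len(psi)
    return [(-1j / hbar) * sum(H[i][m] * psi[m] for m in range(n)) for i in range(n)]

def rk4_step(H, psi, dt, hbar=1.0):
    """Advance the components by one classical Runge-Kutta step of size dt."""
    def shifted(u, k, s):
        return [a + s * b for a, b in zip(u, k)]
    k1 = deriv(H, psi, hbar)
    k2 = deriv(H, shifted(psi, k1, dt / 2), hbar)
    k3 = deriv(H, shifted(psi, k2, dt / 2), hbar)
    k4 = deriv(H, shifted(psi, k3, dt), hbar)
    return [p + dt / 6 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

def evolve(H, psi0, t, steps=2000, hbar=1.0):
    psi = list(psi0)
    dt = t / steps
    for _ in range(steps):
        psi = rk4_step(H, psi, dt, hbar)
    return psi

# Arbitrary Hermitian matrix and initial state (illustrative values only).
H = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -1.0]]
psi_t = evolve(H, [1.0 + 0j, 0j], t=3.0)
norm_sq = sum(abs(c) ** 2 for c in psi_t)
print(psi_t, norm_sq)
```

The squared norm stays at 1 to high accuracy, which is a useful sanity check on any numerical time evolution with a Hermitian Hamiltonian matrix.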
A system with one basis state

Consider a system with one basis state (say, a motionless hydrogen
atom in its electronic ground state, which we call |1⟩). Then
$$|\psi(t)\rangle = \psi_1(t)|1\rangle.$$
If the initial state happens to be
$$|\psi(0)\rangle = |1\rangle,$$
then the time development problem is
Initial condition:
$$\psi_1(0) = 1$$
Differential equation:
$$\frac{d\psi_1(t)}{dt} = -\frac{i}{\hbar}E_g\,\psi_1(t),$$
where E_g = ⟨1|Ĥ|1⟩ is the energy of the ground state.
The solution is straightforward:
$$\psi_1(t) = 1\,e^{-(i/\hbar)E_g t}$$
or, in other words,
$$|\psi(t)\rangle = e^{-(i/\hbar)E_g t}|1\rangle. \tag{5.23}$$
Because two state vectors that differ only in phase represent the same state,
the state doesn’t change even though the coefficient ψ1(t) does change with
time. The system stays always in the ground state.
When I was in high school, my chemistry teacher said that “an atom
is a pulsating blob of probability”. He was thinking of this equation, with
the expansion coefficient ψ1(t) changing in time as
$$e^{-(i/\hbar)E_g t} = \cos((E_g/\hbar)t) - i\sin((E_g/\hbar)t). \tag{5.24}$$
On one hand you know that this function “pulsates”, that is, changes in
time periodically with period 2πℏ/E_g. On the other hand you know also
that this function represents an irrelevant overall phase — for example, it
has no effect on any probability at all. My high school chemistry teacher
was going overboard in ascribing physical reality to the mathematical tools
we use to describe reality.
Exercise: Change energy zero. You know the energy zero is purely con-
ventional so changing the energy zero shouldn’t change anything in the
physics. And indeed it changes only the phase, which is also purely
conventional. In the words of my high school chemistry teacher this
changes the “pulsation” rate — but it doesn’t change anything about
the behavior of the hydrogen atom.
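The point of this exercise can be checked in a few lines. In the sketch below the ground-state energy and the shift of the energy zero are arbitrary illustrative numbers, and units are chosen so that ℏ = 1.

```python
import cmath

def psi1(t, ground_energy, hbar=1.0):
    """Expansion coefficient from (5.23): exp(-(i/hbar) E_g t)."""
    return cmath.exp(-1j * ground_energy * t / hbar)

E_g = -13.6      # illustrative number only
shift = 100.0    # hypothetical change of the energy zero

for t in (0.0, 0.1, 0.5):
    original = psi1(t, E_g)
    shifted = psi1(t, E_g + shift)
    # The coefficients differ, but only by a factor of modulus one.
    print(t, original, shifted, abs(original) ** 2, abs(shifted) ** 2)
```

The coefficient “pulsates” at a different rate after the shift, yet its square modulus is 1 at every time in both cases.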
A system with two basis states: The silver atom

Consider a system with two basis states (say, a silver atom in a uniform
vertical magnetic field). Take the two basis states to be
$$|1\rangle = |z+\rangle \quad\text{and}\quad |2\rangle = |z-\rangle. \tag{5.25}$$
It’s very easy to write down the differential equation
$$\frac{d}{dt}\begin{pmatrix}\psi_1(t)\\ \psi_2(t)\end{pmatrix} = -\frac{i}{\hbar}\begin{pmatrix}H_{1,1} & H_{1,2}\\ H_{2,1} & H_{2,2}\end{pmatrix}\begin{pmatrix}\psi_1(t)\\ \psi_2(t)\end{pmatrix} \tag{5.26}$$
but it’s much harder to see what the elements in the Hamiltonian matrix
should be. That is, it’s hard to guess the Hamiltonian operator.
The classical energy for this system is
$$U = -\boldsymbol{\mu}\cdot\mathbf{B} = -\mu_z B. \tag{5.27}$$
Our guess for the quantum Hamiltonian is simply to change quantities into
operators
$$\hat{H} = -\hat{\mu}_z B \tag{5.28}$$
where
$$\hat{\mu}_z = (+\mu_B)|z+\rangle\langle z+| + (-\mu_B)|z-\rangle\langle z-| \tag{5.29}$$
is the quantum mechanical operator corresponding to the observable µz.
(See equation 3.2.) In this equation B is not an operator but simply a
number, the magnitude of the classical magnetic field in which the silver
atom is immersed. You might think that we should quantize the magnetic
field as well as the atomic magnetic moment, and indeed a full
quantum-mechanical treatment would have to include the quantum theory of
electricity and magnetism. That’s a task for later. For now, we’ll accept the
Hamiltonian (5.28) as a reasonable starting point, and indeed it turns out
to describe this system to high accuracy, although not perfectly.³
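³ If you want perfection, you’ll need to go into some discipline other than science.
It is an easy exercise to show that in the basis
$$\{|z+\rangle, |z-\rangle\} = \{|1\rangle, |2\rangle\},$$
the Hamiltonian operator (5.28) is represented by the matrix
$$\begin{pmatrix}H_{1,1} & H_{1,2}\\ H_{2,1} & H_{2,2}\end{pmatrix} = \begin{pmatrix}-\mu_B B & 0\\ 0 & +\mu_B B\end{pmatrix}. \tag{5.30}$$
Thus the differential equations (5.26) become
$$\frac{d\psi_1(t)}{dt} = -\frac{i}{\hbar}(-\mu_B B)\,\psi_1(t)$$
$$\frac{d\psi_2(t)}{dt} = -\frac{i}{\hbar}(+\mu_B B)\,\psi_2(t).$$
The solutions are straightforward:
$$\psi_1(t) = \psi_1(0)\,e^{-(i/\hbar)(-\mu_B B)t}$$
$$\psi_2(t) = \psi_2(0)\,e^{-(i/\hbar)(+\mu_B B)t}.$$
Stuff about initial state |z+⟩.
Suppose the initial state is
$$|x+\rangle = |z+\rangle\langle z+|x+\rangle + |z-\rangle\langle z-|x+\rangle = |z+\rangle\frac{1}{\sqrt{2}} + |z-\rangle\frac{1}{\sqrt{2}}.$$
Then
$$\psi_1(0) = \frac{1}{\sqrt{2}}, \qquad \psi_2(0) = \frac{1}{\sqrt{2}}$$
so
$$|\psi(t)\rangle = \frac{1}{\sqrt{2}}e^{-(i/\hbar)(-\mu_B B)t}|z+\rangle + \frac{1}{\sqrt{2}}e^{-(i/\hbar)(+\mu_B B)t}|z-\rangle.$$
So the atom is produced in state |x+⟩, then is exposed to a vertical magnetic
field for time t, and ends up in the state mentioned above. If we now
measure µx, what is the probability that it’s +µB again? That probability
is the square of the amplitude
$$\langle x+|\psi(t)\rangle = \frac{1}{\sqrt{2}}e^{-(i/\hbar)(-\mu_B B)t}\langle x+|z+\rangle + \frac{1}{\sqrt{2}}e^{-(i/\hbar)(+\mu_B B)t}\langle x+|z-\rangle$$
$$= \frac{1}{\sqrt{2}}e^{-(i/\hbar)(-\mu_B B)t}\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}e^{-(i/\hbar)(+\mu_B B)t}\frac{1}{\sqrt{2}}$$
$$= \frac{1}{2}\left[e^{-(i/\hbar)(-\mu_B B)t} + e^{-(i/\hbar)(+\mu_B B)t}\right]$$
$$= \frac{1}{2}\left[2\cos((1/\hbar)(\mu_B B)t)\right]$$
$$= \cos\left(\frac{\mu_B B}{\hbar}t\right)$$
The probability is
$$|\langle x+|\psi(t)\rangle|^2 = \cos^2\left(\frac{\mu_B B}{\hbar}t\right) \tag{5.31}$$
which starts at one when t = 0, then goes down to zero, then goes back up
to one, with an oscillation period of
$$\frac{\pi\hbar}{\mu_B B}.$$
This phenomenon is called “Rabi oscillation”. It is responsible for the
workings of atomic clocks.⁴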
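The derivation of (5.31) can be retraced numerically by building the amplitude from the two expansion coefficients instead of using the closed form. This is only a sketch: it takes ℏ = 1, treats the product µB B as a single arbitrary number, and the function name is made up for this illustration.

```python
import cmath
import math

def prob_x_plus(t, mu_b_field, hbar=1.0):
    """|<x+|psi(t)>|^2 for an atom that starts in |x+>; mu_b_field stands for mu_B * B."""
    s = 1 / math.sqrt(2)
    psi1 = s * cmath.exp(-1j * (-mu_b_field) * t / hbar)  # coefficient of |z+>
    psi2 = s * cmath.exp(-1j * (+mu_b_field) * t / hbar)  # coefficient of |z->
    amplitude = s * psi1 + s * psi2  # uses <x+|z+> = <x+|z-> = 1/sqrt(2)
    return abs(amplitude) ** 2

mu_b_field = 1.3                  # arbitrary value of mu_B * B
period = math.pi / mu_b_field     # pi * hbar / (mu_B * B) with hbar = 1
for t in (0.0, period / 4, period / 2, period):
    print(t, prob_x_plus(t, mu_b_field), math.cos(mu_b_field * t) ** 2)
```

The two printed columns agree: the probability starts at one, falls to zero half a period later, and returns to one after a full period.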
Another two-state system: The ammonia molecule
Another system with two basis states is the ammonia molecule NH₃.
If we ignore translation and rotation, and assume that the molecule is
rigid,⁵ then there are still two possible states for the molecule: state |1⟩
with the nitrogen atom pointing up, and state |2⟩ with the nitrogen atom
pointing down. These are states of definite position for the nitrogen atom,
but not states of definite energy (stationary states) because there is some
amplitude for the nitrogen atom to tunnel from the “up” position to the
“down” position. That is, if you start with the atom in state |1⟩, then some
time later it might be in state |2⟩, because the nitrogen atom tunneled
through the plane of hydrogen atoms.
⁴ Isidor Isaac Rabi (1898–1988) won the Nobel Prize for his discovery of nuclear magnetic
resonance, but he also contributed to the invention of the laser and the atomic clock.
No capsule biography suffices because he did so much. If you read Jeremy Bernstein’s
profile of Rabi in the New Yorker (13 and 20 October 1975) you will see that he was a
very clever man.
⁵ That is, ignore vibration. These approximations seem, at first glance, to be absurd.
They are in fact excellent approximations, because the tunneling happens so fast that
the molecule doesn’t have time to translate, rotate, or vibrate to any significant extent
during one cycle of tunneling.
[Figure: sketches of the ammonia molecule in states |1⟩ and |2⟩.]
What is the implication of such tunneling for the Hamiltonian matrix?
The matrix we dealt with in equation (5.30) was diagonal, and hence the
two differential equations split up (“decoupled”) into one involving ψ1(t)
and another involving ψ2(t). These were independent: If a system started
out in the state |1⟩ (i.e. ψ1(t) = e^{−(i/ℏ)H_{1,1}t}, ψ2(t) = 0), then it stayed there
forever. We’ve just said that this is not true for the ammonia molecule, so
the Hamiltonian matrix must not be diagonal.
The Hamiltonian matrix in the {|1⟩, |2⟩} basis has the form
$$\begin{pmatrix}H_{1,1} & H_{1,2}\\ H_{2,1} & H_{2,2}\end{pmatrix} = \begin{pmatrix}E & Ae^{i\phi}\\ Ae^{-i\phi} & E\end{pmatrix}. \tag{5.32}$$
The two off-diagonal elements must be complex conjugates of each other
because the matrix is Hermitian. It’s reasonable that the two on-diagonal
elements are equal because the states |1⟩ and |2⟩ are mirror images and
hence ⟨1|Ĥ|1⟩ = ⟨2|Ĥ|2⟩.
For this Hamiltonian, the Schrödinger equation is
$$\frac{d}{dt}\begin{pmatrix}\psi_1(t)\\ \psi_2(t)\end{pmatrix} = -\frac{i}{\hbar}\begin{pmatrix}E & Ae^{i\phi}\\ Ae^{-i\phi} & E\end{pmatrix}\begin{pmatrix}\psi_1(t)\\ \psi_2(t)\end{pmatrix}. \tag{5.33}$$
It’s hard to see how to solve this pair of differential equations. The matrix
is not diagonal, so the differential equation for ψ1(t) involves the unknown
function ψ2(t), and the differential equation for ψ2(t) involves the unknown
function ψ1(t). However, while it’s hard to solve in this initial basis, it
would be easy to solve in a basis where the matrix is diagonal.
To diagonalize an N × N Hermitian matrix M:
(1) In initial basis, the matrix representation of Â is M. The eigenvectors
of Â satisfy Â|e_n⟩ = λ_n|e_n⟩.
(2) Find N eigenvalues by solving the Nth order polynomial equation
$$\det|M - \lambda I| = 0.$$
(3) Find the representation e_n of the eigenvector |e_n⟩ by solving N
simultaneous linear equations
$$M e_n = \lambda_n e_n.$$
[In the above equation, M is an N × N matrix, e_n is an N × 1 matrix (the
N unknowns), and λ_n is a known number (determined in the previous
step).]
(4) In the basis {|e_1⟩, |e_2⟩, . . . , |e_N⟩}, the matrix representation of Â is
diagonal
$$\begin{pmatrix}\lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda_N\end{pmatrix}.$$
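For N = 2 the steps above can be sketched in a few lines of code. This is a minimal illustration rather than a general routine: it handles only 2 × 2 Hermitian matrices with a nonzero off-diagonal element, the function name is invented here, and the sample numbers for E, A, and φ are arbitrary.

```python
import cmath
import math

def diagonalize_2x2_hermitian(M):
    """Return [(eigenvalue, normalized eigenvector), ...] for a 2x2 Hermitian
    matrix whose off-diagonal element is nonzero."""
    a, d = M[0][0].real, M[1][1].real   # diagonal elements of a Hermitian matrix are real
    b = M[0][1]                         # M[1][0] is assumed to equal b.conjugate()
    if b == 0:
        raise ValueError("matrix is already diagonal")
    # Step 2: det(M - lambda I) = (a - lambda)(d - lambda) - |b|^2 = 0.
    mean = (a + d) / 2
    half_gap = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    result = []
    for lam in (mean + half_gap, mean - half_gap):
        # Step 3: the first row of (M - lam I) e = 0 reads (a - lam) x + b y = 0,
        # which is satisfied by x = b, y = lam - a.
        x, y = b, lam - a
        norm = math.sqrt(abs(x) ** 2 + abs(y) ** 2)
        result.append((lam, [x / norm, y / norm]))
    return result

def residual(M, lam, e):
    """Largest component of M e - lam e; zero for an exact eigenpair."""
    return max(abs(M[i][0] * e[0] + M[i][1] * e[1] - lam * e[i]) for i in range(2))

# Sample matrix of the ammonia form, with arbitrary numbers.
E, A, phi = 2.0, 0.3, 0.7
M = [[E, A * cmath.exp(1j * phi)], [A * cmath.exp(-1j * phi), E]]
(lam1, e1), (lam2, e2) = diagonalize_2x2_hermitian(M)
print(lam1, e1, residual(M, lam1, e1))
print(lam2, e2, residual(M, lam2, e2))
```

The eigenvectors returned here may differ from hand-computed ones by an overall phase factor, which is harmless because any multiple of an eigenvector is again an eigenvector.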
Let’s carry out these steps for the ammonia molecule problem.
1. The Hamiltonian is represented in the initial basis {|1⟩, |2⟩} by
$$M = \begin{pmatrix}E & Ae^{i\phi}\\ Ae^{-i\phi} & E\end{pmatrix}$$
2. Find the eigenvalues.
$$\det\begin{pmatrix}E-\lambda & Ae^{i\phi}\\ Ae^{-i\phi} & E-\lambda\end{pmatrix} = 0$$
$$(E-\lambda)^2 - A^2 = 0$$
$$(E-\lambda)^2 = A^2$$
$$E - \lambda = \pm A$$
$$\lambda = E \pm A$$
$$\lambda_1 = E + A \tag{5.34}$$
$$\lambda_2 = E - A \tag{5.35}$$
As required, the eigenvalues are real.

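3. Find the eigenvectors.
We start with the eigenvector for λ1 = E + A:
$$M e_1 = \lambda_1 e_1$$
$$(M - \lambda_1 I)e_1 = 0$$
$$\begin{pmatrix}E-\lambda_1 & Ae^{i\phi}\\ Ae^{-i\phi} & E-\lambda_1\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}$$
$$\begin{pmatrix}-A & Ae^{i\phi}\\ Ae^{-i\phi} & -A\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}$$
$$\begin{pmatrix}-1 & e^{i\phi}\\ e^{-i\phi} & -1\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}$$
$$-x + e^{i\phi}y = 0$$
$$e^{-i\phi}x - y = 0$$
These two are not independent equations! They cannot be. There are many
eigenvectors because if, say
$$\begin{pmatrix}1\\ 5\end{pmatrix}$$
is an eigenvector, then so are
$$\begin{pmatrix}-1\\ -5\end{pmatrix}, \quad \begin{pmatrix}2\\ 10\end{pmatrix}, \quad\text{and}\quad \begin{pmatrix}3i\\ 15i\end{pmatrix},$$
and infinitely more eigenvectors.
The solution is y = e^{−iφ}x, so
$$e_1 = \begin{pmatrix}x\\ e^{-i\phi}x\end{pmatrix}.$$
Although I could choose any value of x that I wanted, it is most convenient
to work with normalized eigenvectors, for which
$$|x|^2 + |y|^2 = 1$$
$$|x|^2 + |e^{-i\phi}x|^2 = 1$$
$$2|x|^2 = 1$$
This equation has many solutions. I could pick
$$x = \frac{1}{\sqrt{2}} \quad\text{or}\quad x = -\frac{1}{\sqrt{2}} \quad\text{or}\quad x = \frac{i}{\sqrt{2}} \quad\text{or}\quad x = \frac{1+i}{2}$$
but there’s no advantage to picking a solution with all sorts of unneeded
symbols. So I choose the first possibility and write
$$e_1 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\ e^{-i\phi}\end{pmatrix}.$$
This is the representation of |e_1⟩ in the basis {|1⟩, |2⟩}.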
Exercise. Show that an eigenvector associated with λ2 = E − A is
$$|e_2\rangle \doteq e_2 = \frac{1}{\sqrt{2}}\begin{pmatrix}-1\\ e^{-i\phi}\end{pmatrix}.$$
Exercise. Verify that ⟨e_1|e_2⟩ = 0.
In summary,
$$|e_1\rangle = \frac{1}{\sqrt{2}}\left(+|1\rangle + e^{-i\phi}|2\rangle\right)$$
$$|e_2\rangle = \frac{1}{\sqrt{2}}\left(-|1\rangle + e^{-i\phi}|2\rangle\right). \tag{5.36}$$
Exercise. Show that {|e_1⟩, |e_2⟩} constitute a spanning set by building
|1⟩ and |2⟩ out of |e_1⟩ and |e_2⟩. (Answer: |1⟩ = (1/√2)(|e_1⟩ − |e_2⟩),
|2⟩ = (1/√2) e^{iφ}(|e_1⟩ + |e_2⟩).)
What are these states like?
• States |1⟩ and |2⟩ have definite positions for the nitrogen atom, namely
“up” or “down”. But they don’t have definite energies. These states
are sketched on page 128.
• States |e_1⟩ and |e_2⟩ have definite energies, namely E + A or E − A.
But they don’t have definite positions for the nitrogen atom. They
can’t be sketched using classical ink. (For a molecule in this state the
nitrogen atom is like a silver atom passing through “both branches” of
an interferometer: the atom does not have a definite position.)
4. In the basis {|e_1⟩, |e_2⟩}, the matrix representation of the Hamiltonian
is
$$\begin{pmatrix}E+A & 0\\ 0 & E-A\end{pmatrix}.$$
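The four steps can be verified for the ammonia molecule by computing the matrix elements ⟨e_i|Ĥ|e_j⟩ directly from the eigenvectors (5.36). The values of E, A, and φ below are arbitrary numbers chosen for illustration, and the helper name `bracket` is invented here.

```python
import cmath
import math

E, A, phi = 2.0, 0.3, 0.7          # arbitrary sample values
H = [[E, A * cmath.exp(1j * phi)],
     [A * cmath.exp(-1j * phi), E]]  # matrix (5.32) in the {|1>, |2>} basis

s = 1 / math.sqrt(2)
e1 = [+s, s * cmath.exp(-1j * phi)]  # representation of |e1>, from (5.36)
e2 = [-s, s * cmath.exp(-1j * phi)]  # representation of |e2>, from (5.36)

def bracket(bra, X, ket):
    """<bra|X|ket> with all three given in the {|1>, |2>} basis."""
    return sum(bra[i].conjugate() * X[i][j] * ket[j]
               for i in range(2) for j in range(2))

basis = (e1, e2)
H_bar = [[bracket(ei, H, ej) for ej in basis] for ei in basis]
overlap = sum(a.conjugate() * b for a, b in zip(e1, e2))   # <e1|e2>

for row in H_bar:
    print(row)
print("overlap:", overlap)
```

Up to rounding, the diagonal entries are E + A and E − A, the off-diagonal entries vanish, and the overlap ⟨e_1|e_2⟩ is zero for any choice of φ.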
It’s now straightforward to solve the differential equations. Using the
notation
$$|\psi(t)\rangle = \bar{\psi}_1(t)|e_1\rangle + \bar{\psi}_2(t)|e_2\rangle,$$
the time development differential equations are
$$\frac{d\bar{\psi}_1(t)}{dt} = -\frac{i}{\hbar}(E + A)\,\bar{\psi}_1(t)$$
$$\frac{d\bar{\psi}_2(t)}{dt} = -\frac{i}{\hbar}(E - A)\,\bar{\psi}_2(t)$$
with the immediate solutions
$$\bar{\psi}_1(t) = \bar{\psi}_1(0)\,e^{-(i/\hbar)(E+A)t}$$
$$\bar{\psi}_2(t) = \bar{\psi}_2(0)\,e^{-(i/\hbar)(E-A)t}.$$
Thus
$$|\psi(t)\rangle = e^{-(i/\hbar)Et}\left[e^{-(i/\hbar)At}\,\bar{\psi}_1(0)|e_1\rangle + e^{+(i/\hbar)At}\,\bar{\psi}_2(0)|e_2\rangle\right]. \tag{5.37}$$
(It is surprising that this time evolution result, and indeed the result of
any possible experiment, is independent of the phase φ of the off-diagonal
element of the Hamiltonian. This surprise is explained in problem 5.8.)
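Let’s try out this general solution for a particular initial condition.
Suppose the nitrogen atom starts out “up”, that is,
$$|\psi(0)\rangle = |1\rangle, \tag{5.38}$$
and we ask for the probability of finding it “down”, that is, |⟨2|ψ(t)⟩|².
The initial expansion coefficients in the {|e_1⟩, |e_2⟩} basis are (see
equations (5.36))
$$\bar{\psi}_1(0) = \langle e_1|\psi(0)\rangle = \langle e_1|1\rangle = \frac{1}{\sqrt{2}}$$
$$\bar{\psi}_2(0) = \langle e_2|\psi(0)\rangle = \langle e_2|1\rangle = -\frac{1}{\sqrt{2}}$$
so
$$|\psi(t)\rangle = \frac{1}{\sqrt{2}}e^{-(i/\hbar)Et}\left[e^{-(i/\hbar)At}|e_1\rangle - e^{+(i/\hbar)At}|e_2\rangle\right].$$
The amplitude to find the nitrogen atom “down” is
$$\langle 2|\psi(t)\rangle = \frac{1}{\sqrt{2}}e^{-(i/\hbar)Et}\left[e^{-(i/\hbar)At}\langle 2|e_1\rangle - e^{+(i/\hbar)At}\langle 2|e_2\rangle\right]$$
$$= \frac{1}{\sqrt{2}}e^{-(i/\hbar)Et}\left[e^{-(i/\hbar)At}\frac{1}{\sqrt{2}}e^{-i\phi} - e^{+(i/\hbar)At}\frac{1}{\sqrt{2}}e^{-i\phi}\right]$$
$$= \frac{1}{2}e^{-i\phi}e^{-(i/\hbar)Et}\left[e^{-(i/\hbar)At} - e^{+(i/\hbar)At}\right]$$
$$= \frac{1}{2}e^{-i\phi}e^{-(i/\hbar)Et}\left[-2i\sin((1/\hbar)At)\right]$$
$$= -ie^{-i\phi}e^{-(i/\hbar)Et}\sin\left(\frac{A}{\hbar}t\right)$$
and thus the probability of finding the nitrogen atom “down” is
$$|\langle 2|\psi(t)\rangle|^2 = \sin^2\left(\frac{A}{\hbar}t\right). \tag{5.39}$$
GRAPH HERE
This oscillation has period
$$\frac{\pi\hbar}{A} = \frac{2\pi\hbar}{\Delta E}$$
where ∆E represents the energy splitting between the two energy
eigenvalues, E + A and E − A.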
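The route to (5.39) is: expand the initial state in the energy eigenbasis, attach a phase factor to each coefficient, then project onto |2⟩. The sketch below follows that route numerically, assuming ℏ = 1 and arbitrary sample values for E, A, and φ; the function name is invented for this illustration.

```python
import cmath
import math

def prob_down(t, E, A, phi, hbar=1.0):
    """Probability of finding the nitrogen atom 'down' at time t, starting 'up'."""
    s = 1 / math.sqrt(2)
    e1 = [+s, s * cmath.exp(-1j * phi)]   # |e1> in the {|1>, |2>} basis, from (5.36)
    e2 = [-s, s * cmath.exp(-1j * phi)]   # |e2> in the {|1>, |2>} basis, from (5.36)
    psi0 = [1.0, 0.0]                     # initial state |1>
    # Initial coefficients <e1|psi(0)> and <e2|psi(0)>.
    c1 = sum(a.conjugate() * b for a, b in zip(e1, psi0))
    c2 = sum(a.conjugate() * b for a, b in zip(e2, psi0))
    # Each energy eigenstate acquires its own phase factor.
    c1_t = c1 * cmath.exp(-1j * (E + A) * t / hbar)
    c2_t = c2 * cmath.exp(-1j * (E - A) * t / hbar)
    # Project back onto |2>: the second component of c1_t|e1> + c2_t|e2>.
    amplitude = c1_t * e1[1] + c2_t * e2[1]
    return abs(amplitude) ** 2

E, A, phi = 2.0, 0.3, 0.7   # arbitrary sample values
for t in (0.0, 1.0, math.pi / (2 * A), math.pi / A):
    print(t, prob_down(t, E, A, phi), math.sin(A * t) ** 2)
```

The numerical probability matches sin²(At/ℏ) and does not depend on E or φ. It reaches one at t = πℏ/(2A) and returns to zero at t = πℏ/A, the period quoted above.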
This oscillation is at the heart of the MASER (Microwave Amplification
by Stimulated Emission of Radiation).
Problems
5.1 Probability of no change
In equation (5.39) we found the probability that the nitrogen atom be-
gan in the “up” position (equation 5.38) and finished in the “down”
position. Find the amplitude and the probability that the nitrogen
atom will finish in the “up” position, and verify that these two proba-
bilities sum to 1.
5.2 Tunneling for small times
Equation (5.37) solves the time evolution problem completely, for all
time. But it doesn’t give a lot of insight into what’s “really going on”.
This problem provides some of that missing insight.
a. When the time involved is short, we can approximate time evolution
through
$$|\psi(\Delta t)\rangle = \left[\hat{1} - \frac{i}{\hbar}\hat{H}\Delta t + \cdots\right]|\psi(0)\rangle. \tag{5.40}$$
Show that this equation, represented in the {|1⟩, |2⟩} basis, is
$$\begin{pmatrix}\psi_1(\Delta t)\\ \psi_2(\Delta t)\end{pmatrix} \approx \begin{pmatrix}1 - (i/\hbar)E\Delta t & -(i/\hbar)Ae^{i\phi}\Delta t\\ -(i/\hbar)Ae^{-i\phi}\Delta t & 1 - (i/\hbar)E\Delta t\end{pmatrix}\begin{pmatrix}\psi_1(0)\\ \psi_2(0)\end{pmatrix}. \tag{5.41}$$
b. Express the initial condition |ψ(0)⟩ = |1⟩, used above at
equation (5.38), in the {|1⟩, |2⟩} basis, and show that, for small times,
$$\begin{pmatrix}\psi_1(\Delta t)\\ \psi_2(\Delta t)\end{pmatrix} \approx \begin{pmatrix}1 - (i/\hbar)E\Delta t\\ -(i/\hbar)Ae^{-i\phi}\Delta t\end{pmatrix}. \tag{5.42}$$
c. This shows that the system starts with amplitude 1 for being in
state |1⟩, but that amplitude “seeps” (or “diffuses” or “hops”)
from |1⟩ into |2⟩. In fact, the amplitude to be found in |2⟩ after
a small time ∆t has passed is −(i/ℏ)Ae^{−iφ}∆t. What is the
probability of being found in |2⟩? What is the condition for a “small”
time?
d. Show that the same probability results from approximating
result (5.39) for small times.
In a normal diffusion process (such as diffusion of blue dye from one
water cell into an adjacent water cell) the dye spreads out uniformly
and then net diffusion stops. But in this quantal amplitude diffusion,
the amplitude is complex-valued. As such, the diffusion of more
amplitude into the second cell can result, through destructive interference,
in a decreased amplitude in the second cell. This interference gives rise
to the oscillatory behavior demonstrated in equation (5.39).
e. While this approach does indeed provide a lot of insight, it also
raises a puzzle. What, according to equation (5.42), is the
probability of being found in the initial state |1⟩ after a short time has
passed? Conclude that the total probability is greater than 1! We
will resolve this paradox in problem 11.1.
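A numerical companion to this problem, offered only as a sketch: it evaluates the first-order components obtained by applying the matrix in (5.41) to the initial state |1⟩ and compares them with the exact result (5.39). It assumes ℏ = 1, and the values of E, A, and φ are arbitrary.

```python
import cmath
import math

def first_order_state(dt, E, A, phi, hbar=1.0):
    """Components (psi1, psi2) after one first-order step from |1>: the first
    column of the matrix in (5.41)."""
    psi1 = 1 - 1j * E * dt / hbar
    psi2 = -1j * A * cmath.exp(-1j * phi) * dt / hbar
    return psi1, psi2

E, A, phi = 2.0, 0.3, 0.7   # arbitrary sample values
for dt in (0.1, 0.01, 0.001):
    psi1, psi2 = first_order_state(dt, E, A, phi)
    p2_first_order = abs(psi2) ** 2
    p2_exact = math.sin(A * dt) ** 2          # equation (5.39) with hbar = 1
    total = abs(psi1) ** 2 + p2_first_order   # "total probability" at first order
    print(dt, p2_first_order, p2_exact, total)
```

As ∆t shrinks the first-order probability approaches the exact one, while the first-order total stays slightly above 1, which is the puzzle raised in part (e).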
5.3 Ammonia molecule in an electric field

Place an ammonia molecule into an external electric field ℰ perpendicular
to the plane of hydrogen atoms.
[Figure: sketches of the ammonia molecule in states |1⟩ and |2⟩ in the external electric field ℰ.]
Now the states |1⟩ and |2⟩ are no longer symmetric, so we can no longer
assume that ⟨1|Ĥ|1⟩ = ⟨2|Ĥ|2⟩. Indeed, the proper matrix representation
of Ĥ in the {|1⟩, |2⟩} basis is
$$\begin{pmatrix}E + p\mathcal{E} & Ae^{i\phi}\\ Ae^{-i\phi} & E - p\mathcal{E}\end{pmatrix},$$
where p is interpreted as the molecular dipole moment. (Negative
charge migrates toward the nitrogen atom.)
a. Find the eigenvalues e_1 and e_2 of Ĥ. (Check against the
results (5.35) that apply when ℰ = 0.)
b. Find the eigenvectors |e_1⟩ and |e_2⟩ in terms of |1⟩ and |2⟩. (Check
against the results (5.36).)
c. If a molecule is initially in state |1⟩, find the probability that it
will be found in state |2⟩ as a function of time.
5.3 Formal properties of time evolution; Conservation laws
Quantal states evolve according to the Schrödinger time-development equation
$$\frac{d|\psi(t)\rangle}{dt} = -\frac{i}{\hbar}\hat{H}|\psi(t)\rangle. \tag{5.43}$$
The Hamiltonian operator Ĥ is Hermitian, with eigenvectors {|e_n⟩} and
eigenvalues e_n:
$$\hat{H}|e_n\rangle = e_n|e_n\rangle. \tag{5.44}$$
These are called the “energy eigenstates” or “states of definite energy”.
Theorem I: Energy eigenstates are stationary states.
If |ψ(0)⟩ = (number)|e_n⟩, then |ψ(t)⟩ = (number)′|e_n⟩, where both
numbers have square modulus unity.
Because of this result, the energy eigenstates are also called “stationary
states”: once you’re in one of them, you stay.
Proof: A formal proof will be given in the proof of theorem II. This
informal proof provides less rigor and more insight.
Start at time t = 0 and step forward a small amount of time ∆t:
$$\frac{\Delta|\psi\rangle}{\Delta t} \approx -\frac{i}{\hbar}\hat{H}|\psi(0)\rangle \tag{5.45}$$
$$= -\frac{i}{\hbar}\hat{H}(\text{number})|e_n\rangle \tag{5.46}$$
$$= (\text{stuff})|e_n\rangle. \tag{5.47}$$
$$\Delta|\psi\rangle = (\text{stuff})\,\Delta t\,|e_n\rangle. \tag{5.48}$$
That is, the change in the state vector is parallel to the initial state vector,
so the new state vector |ψ(∆t)⟩ = |ψ(0)⟩ + ∆|ψ⟩ is again parallel to the
initial state vector, and all three vectors are parallel to |e_n⟩. Repeat for as
many time steps as desired.
The vector |ψ(∆t)⟩ is not only parallel to the vector |ψ(0)⟩, but it also
has the same norm. (Namely unity.) This can’t happen for regular position
vectors multiplied by real numbers. The only way to multiply a vector by
a number, and get a different vector with the same norm, is to multiply by
a complex number.
Theorem II: Formal solution of the Schrödinger equation.

X X
If |ψ(0)i = ψn (0)|en i, then |ψ(t)i = ψn (0)e−(i/~)en t |en i.
n n
Proof: In component form, the Schrödinger equation is
$$\frac{d\psi_n(t)}{dt} = -\frac{i}{\hbar}\sum_m H_{n,m}\psi_m(t).$$
In the energy eigenbasis,
$$H_{n,m} = \begin{cases} e_n & n = m \\ 0 & n \neq m \end{cases} \;=\; e_n\delta_{n,m}.$$
Thus
$$\frac{d\psi_n(t)}{dt} = -\frac{i}{\hbar}\sum_m e_n\delta_{n,m}\psi_m(t) = -\frac{i}{\hbar}e_n\psi_n(t)$$
and
$$\psi_n(t) = \psi_n(0)\,e^{-(i/\hbar)e_n t}.$$
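As a concrete check of Theorem II, here is a minimal Python sketch that evolves a state of a hypothetical two-level system by attaching the phase factor $e^{-(i/\hbar)e_n t}$ to each energy-eigenbasis component. The energies, the initial components, and the choice of units with $\hbar = 1$ are illustrative assumptions, not values from the text.

```python
import cmath

HBAR = 1.0  # assumption: units chosen so that hbar = 1


def evolve(psi0, energies, t, hbar=HBAR):
    """Return the components psi_n(t) = psi_n(0) * exp(-(i/hbar) e_n t)."""
    return [c * cmath.exp(-1j * e * t / hbar) for c, e in zip(psi0, energies)]


def norm_squared(psi):
    """Sum of |psi_n|^2 over the energy eigenbasis."""
    return sum(abs(c) ** 2 for c in psi)


energies = [1.0, 2.5]   # hypothetical energy eigenvalues e_1, e_2
psi0 = [0.6, 0.8j]      # hypothetical normalized components: 0.36 + 0.64 = 1

t, dt = 0.7, 1e-6
psi_t = evolve(psi0, energies, t)
psi_later = evolve(psi0, energies, t + dt)

# Forward-difference estimate of d(psi_n)/dt, minus the right-hand side
# -(i/hbar) e_n psi_n of the component Schrodinger equation.
residuals = [
    abs((later - now) / dt - (-1j / HBAR) * e * now)
    for now, later, e in zip(psi_t, psi_later, energies)
]

print("norm squared at time t:", norm_squared(psi_t))
print("largest finite-difference residual:", max(residuals))
```

The norm stays at unity because each component is only multiplied by a number of modulus one, and the residual shrinks with the step `dt`, as expected of a forward difference.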
So, this is how states change with time! What about measurements?
We will first find how average values change with time, then look at “the
whole shebang” – not just the average, but also the full distribution.
Definition: The operator ÂB̂ − B̂ Â is called “the commutator of Â and

B̂” and represented by [Â, B̂].
Theorem III: Time evolution of averages.
$$\frac{d\langle\hat{A}\rangle}{dt} = -\frac{i}{\hbar}\langle[\hat{A},\hat{H}]\rangle.$$
Proof: (Using mathematical notation for inner products.)

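$$\begin{aligned}
\frac{d}{dt}\langle\hat{A}\rangle &= \frac{d}{dt}\Big(\psi(t), \hat{A}\psi(t)\Big)\\
&= \Big(\frac{d\psi(t)}{dt}, \hat{A}\psi(t)\Big) + \Big(\psi(t), \hat{A}\frac{d\psi(t)}{dt}\Big)\\
&= \Big(-\frac{i}{\hbar}\hat{H}\psi(t), \hat{A}\psi(t)\Big) + \Big(\psi(t), \hat{A}\Big[-\frac{i}{\hbar}\hat{H}\psi(t)\Big]\Big)\\
&= \frac{i}{\hbar}\Big(\psi(t), \hat{H}\hat{A}\psi(t)\Big) - \frac{i}{\hbar}\Big(\psi(t), \hat{A}\hat{H}\psi(t)\Big) \qquad \text{[[use the fact that } \hat{H} \text{ is Hermitian]]}\\
&= -\frac{i}{\hbar}\Big(\psi(t), [\hat{A}\hat{H} - \hat{H}\hat{A}]\psi(t)\Big)\\
&= -\frac{i}{\hbar}\langle[\hat{A},\hat{H}]\rangle
\end{aligned}$$
Corollary: If $\hat{A}$ commutes with $\hat{H}$, then $\langle\hat{A}\rangle$ is constant.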
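Theorem III can be checked numerically. The sketch below works in the energy eigenbasis of a hypothetical two-state system, where $\hat{H}$ is diagonal, and compares a finite-difference estimate of $d\langle\hat{A}\rangle/dt$ with $-(i/\hbar)\langle[\hat{A},\hat{H}]\rangle$. The matrix `A`, the energies, the initial state, and the units with $\hbar = 1$ are all assumptions chosen for illustration.

```python
import cmath

HBAR = 1.0               # assumption: units with hbar = 1
ENERGIES = [1.0, 2.5]    # hypothetical eigenvalues; H is diagonal in this basis
A = [[0.3, 0.2 - 0.5j],  # hypothetical Hermitian observable, same basis
     [0.2 + 0.5j, -1.1]]
PSI0 = [0.6, 0.8j]       # hypothetical normalized initial components


def evolve(t):
    """Theorem II: attach the phase exp(-(i/hbar) e_n t) to each component."""
    return [c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(PSI0, ENERGIES)]


def expectation(matrix, psi):
    """Inner product (psi, M psi) written out in components."""
    size = len(psi)
    return sum(psi[n].conjugate() * matrix[n][m] * psi[m]
               for n in range(size) for m in range(size))


# Matrix elements of [A, H] = AH - HA when H is diagonal:
# (AH - HA)_{nm} = A_{nm} (e_m - e_n).
COMMUTATOR = [[A[n][m] * (ENERGIES[m] - ENERGIES[n]) for m in range(2)]
              for n in range(2)]

t, dt = 0.4, 1e-5
# Left-hand side: centered finite difference of <A>(t).
lhs = (expectation(A, evolve(t + dt)) - expectation(A, evolve(t - dt))) / (2 * dt)
# Right-hand side: -(i/hbar) <[A, H]>.
rhs = (-1j / HBAR) * expectation(COMMUTATOR, evolve(t))

print("d<A>/dt by finite difference:", lhs.real)
print("-(i/hbar)<[A,H]>            :", rhs.real)
```

Replacing `A` by a diagonal matrix makes it commute with the diagonal $\hat{H}$; both printed numbers then vanish, which is the corollary.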

However, just because the average of a measurement doesn’t change with
time doesn’t necessarily mean that nothing about the measurement changes
with time. To fully specify the results of a measurement, you must also list
the possible results, the eigenvalues $a_n$, and the probability of getting each
result, namely $|\langle a_n|\psi(t)\rangle|^2$. The eigenvalues $a_n$ are constant in time, but how
do the probabilities change with time?
Theorem IV: Time evolution of projection probabilities.

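If $|\phi\rangle$ is a time-independent state and $\hat{P}_\phi = |\phi\rangle\langle\phi|$ is its associated projection operator, then
$$\frac{d}{dt}|\langle\phi|\psi(t)\rangle|^2 = -\frac{i}{\hbar}\langle[\hat{P}_\phi,\hat{H}]\rangle. \qquad (5.49)$$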
Proof:
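$$\begin{aligned}
\frac{d}{dt}|\langle\phi|\psi(t)\rangle|^2 &= \frac{d}{dt}\Big[\langle\phi|\psi(t)\rangle\langle\phi|\psi(t)\rangle^*\Big]\\
&= \Big(\langle\phi|\frac{d}{dt}|\psi(t)\rangle\Big)\langle\phi|\psi(t)\rangle^* + \langle\phi|\psi(t)\rangle\Big(\langle\phi|\frac{d}{dt}|\psi(t)\rangle\Big)^*
\end{aligned}$$
But $\langle\phi|\dfrac{d}{dt}|\psi(t)\rangle = -\dfrac{i}{\hbar}\langle\phi|\hat{H}|\psi(t)\rangle$, so
$$\begin{aligned}
\frac{d}{dt}|\langle\phi|\psi(t)\rangle|^2 &= -\frac{i}{\hbar}\Big[\langle\phi|\hat{H}|\psi(t)\rangle\langle\phi|\psi(t)\rangle^* - \langle\phi|\psi(t)\rangle\langle\phi|\hat{H}|\psi(t)\rangle^*\Big]\\
&= -\frac{i}{\hbar}\Big[\langle\psi(t)|\phi\rangle\langle\phi|\hat{H}|\psi(t)\rangle - \langle\psi(t)|\hat{H}|\phi\rangle\langle\phi|\psi(t)\rangle\Big]\\
&= -\frac{i}{\hbar}\langle\psi(t)|\Big\{|\phi\rangle\langle\phi|\hat{H} - \hat{H}|\phi\rangle\langle\phi|\Big\}|\psi(t)\rangle\\
&= -\frac{i}{\hbar}\langle\psi(t)|[\hat{P}_\phi,\hat{H}]|\psi(t)\rangle
\end{aligned}$$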
Lemma: Suppose $\hat{A}$ and $\hat{B}$ are commuting Hermitian operators. If $|a\rangle$ is
an eigenvector of $\hat{A}$ and $\hat{P}_a = |a\rangle\langle a|$, then $[\hat{P}_a, \hat{B}] = 0$.
Proof: From the compatibility theorem, there is an eigenbasis $\{|b_n\rangle\}$ of
$\hat{B}$ with $|b_1\rangle = |a\rangle$. Write $\hat{B}$ in diagonal form as
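$$\hat{B} = \sum_n b_n|b_n\rangle\langle b_n|.$$
Then
$$\hat{B}|b_1\rangle\langle b_1| = \sum_n b_n|b_n\rangle\langle b_n|b_1\rangle\langle b_1| = \sum_n b_n|b_n\rangle\delta_{n,1}\langle b_1| = b_1|b_1\rangle\langle b_1|$$
while
$$|b_1\rangle\langle b_1|\hat{B} = \sum_n b_n|b_1\rangle\langle b_1|b_n\rangle\langle b_n| = \sum_n b_n|b_1\rangle\delta_{1,n}\langle b_n| = b_1|b_1\rangle\langle b_1|.$$
The two products are equal, so $[\hat{P}_a, \hat{B}] = 0$.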
Corollary: If Â commutes with Ĥ, then nothing about the measurement

of Â changes with time.
Definition: The observable associated with such an operator is said to

be conserved.
Note: All these results apply to time evolution uninterrupted by mea-

surements.
5.4 Magnetic moment in a uniform magnetic field
5.5 The neutral K meson
You know that elementary particles are characterized by their mass and
charge, but that two particles of identical mass and charge can still behave
differently. Physicists have invented characteristics such as “strangeness”
and “charm” to label (not explain!) these differences. For example, the
difference between the electrically neutral K meson $K^0$ and its antiparticle
the $\bar{K}^0$ is described by attributing a strangeness of $+1$ to the $K^0$ and of
$-1$ to the $\bar{K}^0$.
Most elementary particles are completely distinct from their antiparti-
cles: an electron never turns into a positron! Such a change is prohibited
by charge conservation. However this prohibition does not extend to the
neutral K meson precisely because it is neutral. In fact, there is a time-dependent
amplitude for a $K^0$ to turn into a $\bar{K}^0$. We say that the $K^0$
and the $\bar{K}^0$ are the two basis states for a two-state system. This two-state
system has an observable strangeness, represented by an operator, and we
have a $K^0$ when the system is in an eigenstate of strangeness with eigenvalue
$+1$, and a $\bar{K}^0$ when the system is in an eigenstate of strangeness
with eigenvalue $-1$. When the system is in other states it does not have a
definite value of strangeness, and cannot be said to be “a $K^0$” or “a $\bar{K}^0$”.
The two strangeness eigenstates are denoted $|K^0\rangle$ and $|\bar{K}^0\rangle$.
5.4 Strangeness
Write an outer product expression for the strangeness operator $\hat{S}$, and
find its matrix representation in the $\{|K^0\rangle, |\bar{K}^0\rangle\}$ basis. Note that this
matrix is just the Pauli matrix $\sigma_3$.
5.5 Charge Parity
Define an operator $\widehat{CP}$ that turns one strangeness eigenstate into the
other:
$$\widehat{CP}|K^0\rangle = |\bar{K}^0\rangle, \qquad \widehat{CP}|\bar{K}^0\rangle = |K^0\rangle.$$
(CP stands for “charge parity”, although that’s not important here.)
Write an outer product expression and a matrix representation (in the
$\{|K^0\rangle, |\bar{K}^0\rangle\}$ basis) for the $\widehat{CP}$ operator. What is the connection
between this matrix and the Pauli matrices? Show that the normalized
eigenstates of $\widehat{CP}$ are
$$|K_U\rangle = \frac{1}{\sqrt{2}}\big(|K^0\rangle + |\bar{K}^0\rangle\big), \qquad |K_S\rangle = \frac{1}{\sqrt{2}}\big(|K^0\rangle - |\bar{K}^0\rangle\big).$$
(The U and S stand for unstable and stable, but that’s again irrelevant
because we’ll ignore K meson decay.)
5.6 The Hamiltonian
The time evolution of a neutral K meson is governed by the “weak
interaction” Hamiltonian
$$\hat{H} = e\hat{1} + f\,\widehat{CP}.$$
(There is no way for you to derive this. I’m just telling you.) Show
that the numbers $e$ and $f$ must be real.
5.7 Time evolution
Neutral K mesons are produced in states of definite strangeness because
they are produced by the “strong interaction” Hamiltonian that
conserves strangeness. Suppose one is produced at time $t = 0$ in state
$|K^0\rangle$. Solve the Schrödinger equation to find its state for all time afterwards.
Why is it easier to solve this problem using $|K_U\rangle$, $|K_S\rangle$ vectors
rather than $|K^0\rangle$, $|\bar{K}^0\rangle$ vectors? Calculate and plot the probability of
finding the meson in state $|K^0\rangle$ as a function of time.
[[The neutral K meson system is extraordinarily interesting. I have

oversimplified by ignoring decay. More complete treatments can be found in
Ashok Das & Adrian Melissinos, Quantum Mechanics (Gordon and Breach,
New York, 1986) pages 172–173; R. Feynman, R. Leighton, and M. Sands,

The Feynman Lectures on Physics, volume III (Addison-Wesley, Reading,
Massachusetts, 1965) pages 11-12–20; Gordon Baym, Lectures on Quantum
Mechanics (W.A. Benjamin, Reading, Massachusetts, 1969), pages 38–45;
and Harry J. Lipkin, Quantum Mechanics: New Approaches to Selected
Topics (North-Holland, Amsterdam, 1986) chapter 7.]]
5.8 The most general two-state Hamiltonian

We’ve seen a number of two-state systems by now: the spin states
of a spin-1/2 atom, the polarization states of a photon, the CP states
of a neutral K-meson. [[For more two-state systems, see R. Feynman,
R. Leighton, and M. Sands, The Feynman Lectures on Physics, vol-
ume III (Addison-Wesley, Reading, Massachusetts, 1965) chapters 9,
10, and 11.]] This problem investigates the most general possible Hamil-
tonian for any two-state system.
Because the Hamiltonian must be Hermitian, it must be represented
by a matrix of the form
$$\begin{pmatrix} a & c \\ c^* & b \end{pmatrix}$$
where $a$ and $b$ are real, but $c = |c|e^{i\gamma}$ might be complex. Thus the
Hamiltonian is specified through four real numbers: $a$, $b$, magnitude
$|c|$, and phase $\gamma$. This seems at first glance to be the most general
Hamiltonian.
But remember that states can be modified by an arbitrary overall phase.
If the initial basis is $\{|1\rangle, |2\rangle\}$, show that in the new basis $\{|1\rangle, |2'\rangle\}$,
where $|2'\rangle = e^{-i\gamma}|2\rangle$, the Hamiltonian is represented by the matrix
$$\begin{pmatrix} a & |c| \\ |c| & b \end{pmatrix}$$
which is pure real and which is specified through only three real numbers.
Chapter 6
The Quantum Mechanics of Position
6.1 Describing states in continuum systems
At the start of this book we said we’d begin by treating only the magnetic
moment of the atom quantum mechanically, and that once we got some
grounding on the physical concepts and mathematical tools of quantum
mechanics in this situation, we’d move on to the quantal treatment of other
properties of the atom — such as its position. This led us to develop
quantum mechanics for systems with two basis states. This was a very
good thing, and we learned a lot about quantum mechanics, and also about
practical applications like atomic clocks and MASERs.
All good things must come to an end, but in this case we’re ending
one good thing to come onto an even better thing, namely the quantum
mechanics of a continuum system. The system we’ll pick is a particle mov-
ing in one dimension. For the time being we’ll ignore the atom’s magnetic
moment and internal constitution, and focus only on its position. Later in
the book we’ll treat both position and magnetic moment together.
Coarse-grained description
The situation is a point particle moving in one dimension. We start off
with a coarse-grained description of the particle’s position: we divide the
line into an infinite number of bins, each of width ∆x. (We will later
take the limit as the bin width vanishes and the number of bins grows to
compensate.)
[Figure: the $x$ axis divided into bins of width $\Delta x$, labeled . . . , −3, −2, −1, 0, 1, 2, . . .]
If we ask “In which bin is the particle positioned?” the answer might
be “It’s not in any of them. The particle doesn’t have a position.” Not all
states have definite positions. On the other hand, there are some states
that do have definite positions. If the particle has a definite position within
bin 5 we say that it is in state $|5\rangle$.
I maintain that the set of states $\{|n\rangle\}$ with $n = 0, \pm1, \pm2, \pm3, \ldots$ constitutes
a basis, because the set is:
• Orthonormal. If the particle is in one bin, then it’s not in any of the
others. The mathematical expression of this property is
$$\langle n|m\rangle = \delta_{n,m}.$$
• Complete. If the particle does have a position, then it has a position
within one of the bins. The mathematical expression of this property
is
$$\sum_{n=-\infty}^{\infty}|n\rangle\langle n| = \hat{1}.$$
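If the particle has no definite position, then it is in a state $|\psi\rangle$ that is a
superposition of basis states (Have I used this word before?)
$$|\psi\rangle = \sum_{n=-\infty}^{\infty}\psi_n|n\rangle \qquad (6.1)$$
where
$$\psi_n = \langle n|\psi\rangle \quad\text{so}\quad \sum_{n=-\infty}^{\infty}|\psi_n|^2 = 1. \qquad (6.2)$$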
The quantity $|\psi_5|^2$ is the probability that, if the position of the particle
is measured (perhaps by shining a light down the one-dimensional axis),
the particle will be found within bin 5. We should always say
“$|\psi_5|^2$ is the probability of finding the particle in bin 5”,
because the word “finding” suggests the whole story: Right now the particle
has no position, but after you measure the position then it will have a position,
and the probability that this position falls within bin 5 is $|\psi_5|^2$. This
phrase is totally accurate but it’s a real mouthful. Instead one frequently
hears
“$|\psi_5|^2$ is the probability that the particle is in bin 5”.
This is technically wrong. Before the position measurement, when the
particle is in state $|\psi\rangle$, the particle doesn’t have a position. It has no
probability of being in bin 5, or bin 6, or any other bin, just as love doesn’t
have probability 0.5 of being red, 0.3 of being green, and 0.2 of being blue.
Love doesn’t have a color, and the particle in state $|\psi\rangle$ doesn’t have a
position.
Because the second, inaccurate, phrase is shorter than the first, correct,
phrase, it is often used despite its falseness. You may use it too, as long as
you don’t believe it.
Similarly, the most accurate statement is
“$\psi_5$ is the amplitude for finding the particle in bin 5”,
but you will frequently hear the brief and inaccurate
“$\psi_5$ is the amplitude that the particle is in bin 5”
instead.
Successively finer-grained descriptions
Suppose we want a more accurate description of the particle’s position

properties. We can get it using a smaller value for the bin width ∆x.
Still more accurate descriptions come from using still smaller values of ∆x.
Ultimately I can come up with a sequence of ever smaller bins homing in
on the position of interest, say $x_0$. For all values of $\Delta x$, I will call the bin
straddling $x_0$ by the name “bin $k$”. The relevant question seems at first to
be: What is the limit
$$\lim_{\Delta x\to 0}|\psi_k|^2\,?$$
In fact, this is not an interesting question. The answer to that question

is “zero”. For example: Suppose you are presented with a narrow strip
of lawn, 1000 meters long, which contains seven four-leaf clovers, scattered
over the lawn at random. The probability of finding a four-leaf clover within
a 2-meter wide bin is
$$\frac{7}{1000\ \text{m}}(2\ \text{m}) = 0.014.$$
The probability of finding a four-leaf clover within a 1-meter wide bin is
$$\frac{7}{1000\ \text{m}}(1\ \text{m}) = 0.007.$$
The probability of finding a four-leaf clover within a 1-millimeter wide bin
is
$$\frac{7}{1000\ \text{m}}(0.001\ \text{m}) = 0.000007.$$
As the bin width goes to zero, the probability goes to zero as well. (Put
another way, the probability of finding a four-leaf clover at a point along
the strip of lawn is zero, because that probability is
$$\frac{7}{\text{number of points}},$$
and the number of points along the strip is infinite.)
The interesting question concerns not the bin probability, which always
goes to zero, but the probability density, that is, the probability of finding
the particle per length.
Exercise. What is the probability density for finding a four-leaf clover

in the strip of lawn described above? Be sure to include the dimensions
in your answer.
The probability per length of finding the particle at $x_0$, called the probability
density at $x_0$, is the finite quantity
$$\lim_{\Delta x\to 0}\frac{|\psi_k|^2}{\Delta x}. \qquad (6.3)$$
(Remember that the limit goes through a sequence of bins $k$, every one of
which straddles the target point $x_0$.) In this expression both the numerator
and denominator go to zero, but they approach zero in such a way that the
ratio goes to a finite quantity. In other words, for small values of $\Delta x$, we
have
$$|\psi_k|^2 \approx (\text{constant})\,\Delta x, \qquad (6.4)$$
where that constant is the probability density for finding the particle at
point $x_0$.
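The behavior described by equations (6.3) and (6.4) is easy to see numerically. The sketch below assumes, purely for illustration, a Gaussian probability density and bins centered on a hypothetical target point `x0`; it shows the bin probability shrinking toward zero while the ratio of probability to bin width settles on the density at `x0`.

```python
import math


def density(x):
    """Hypothetical probability density: a unit-area Gaussian."""
    return math.exp(-x * x) / math.sqrt(math.pi)


def bin_probability(x0, dx):
    """Exact probability of the bin of width dx centered on x0 for that density."""
    return 0.5 * (math.erf(x0 + dx / 2) - math.erf(x0 - dx / 2))


x0 = 0.5                              # hypothetical target point
widths = [1.0, 0.1, 0.01, 0.001]      # successively finer bins
probabilities = [bin_probability(x0, dx) for dx in widths]
ratios = [p / dx for p, dx in zip(probabilities, widths)]

for dx, p, r in zip(widths, probabilities, ratios):
    print(f"dx = {dx:<6} probability = {p:.6e}  probability/dx = {r:.6f}")
print("density at x0:", density(x0))
```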
We know that amplitudes are more general than probabilities, because
probabilities give the results for measurement experiments, but amplitudes
give the results for both interference and measurement experiments. What
does equation (6.4) say about bin amplitudes? It says that for small values
of $\Delta x$
$$\psi_k \approx (\text{constant})'\sqrt{\Delta x} \qquad (6.5)$$
whence the limit
$$\lim_{\Delta x\to 0}\frac{\psi_k}{\sqrt{\Delta x}}$$
exists. This limit defines the quantity, a function of $x_0$,
$$\lim_{\Delta x\to 0}\frac{\psi_k}{\sqrt{\Delta x}} = \psi(x_0).$$
If I were naming this quantity, I would have named it “amplitude density”.
But for historical reasons it has a different name, namely “the wavefunction”.
The wavefunction evaluated at $x_0$ is often called “the amplitude for
the particle to have position $x_0$”, but that’s not exactly correct, because
an amplitude squared is a probability whereas a wavefunction squared is
a probability density. Instead this phrase is just shorthand for the more
accurate phrase “$\psi(x_0)\sqrt{\Delta x}$ is the amplitude for finding the particle in an
interval of short length $\Delta x$ straddling position $x_0$, when the position is
measured”.
Exercise. Show that the wavefunction for a point particle in one dimension
has the dimensions $1/\sqrt{\text{length}}$.
Working with wavefunctions
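When we were working with discrete systems, we said that the inner product
could be calculated through
$$\langle\phi|\psi\rangle = \sum_n \phi_n^*\psi_n.$$
How does this carry over into continuum systems?
For any particular stage in the sequence of ever-smaller bins, the inner
product is calculated through
$$\langle\phi|\psi\rangle = \sum_{i=-\infty}^{\infty}\phi_i^*\psi_i.$$
Prepare to take the $\Delta x \to 0$ limit by writing
$$\langle\phi|\psi\rangle = \sum_{i=-\infty}^{\infty}\frac{\phi_i^*}{\sqrt{\Delta x}}\frac{\psi_i}{\sqrt{\Delta x}}\,\Delta x.$$
Then
$$\langle\phi|\psi\rangle = \lim_{\Delta x\to 0}\sum_{i=-\infty}^{\infty}\frac{\phi_i^*}{\sqrt{\Delta x}}\frac{\psi_i}{\sqrt{\Delta x}}\,\Delta x = \int_{-\infty}^{+\infty}\phi^*(x)\psi(x)\,dx.$$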
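The passage from the bin sum to the integral can be watched in action. The following sketch builds bin amplitudes as wavefunction times $\sqrt{\Delta x}$ (the small-bin approximation of this section) for two hypothetical Gaussian wavefunctions, sums $\phi_i^*\psi_i$, and compares the result with the closed-form value of the integral for these particular functions. The functions, the parameters `S` and `K`, and the truncation of the line to a finite interval are all assumptions made for the example.

```python
import cmath
import math

S = 1.0   # hypothetical displacement of psi relative to phi
K = 0.5   # hypothetical wavenumber of the phase factor carried by psi


def phi(x):
    """Normalized Gaussian wavefunction centered on the origin."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)


def psi(x):
    """Normalized Gaussian centered on S, multiplied by the phase exp(iKx)."""
    return math.pi ** -0.25 * math.exp(-(x - S) ** 2 / 2) * cmath.exp(1j * K * x)


def inner_product_from_bins(dx, half_width=10.0):
    """Sum phi_i^* psi_i over bins, using bin amplitude = wavefunction * sqrt(dx)."""
    count = int(round(2 * half_width / dx))
    total = 0j
    for i in range(count):
        x_i = -half_width + (i + 0.5) * dx        # center of bin i
        phi_i = phi(x_i) * math.sqrt(dx)          # bin amplitudes, valid for small dx
        psi_i = psi(x_i) * math.sqrt(dx)
        total += phi_i.conjugate() * psi_i
    return total


# Closed form of the integral of phi^*(x) psi(x) for these two Gaussians.
exact = cmath.exp(-S * S / 4 - K * K / 4 + 1j * K * S / 2)

for dx in (1.0, 0.1, 0.01):
    print(dx, inner_product_from_bins(dx), exact)
```

Note that the factors of $\sqrt{\Delta x}$ from the two bin amplitudes multiply to give exactly the $\Delta x$ that turns the sum into a Riemann sum.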
Exercise. What is the normalization condition for a wavefunction?
Basis states
When we went through the process of looking at finer and finer coarse-grainings,
that is, taking $\Delta x \to 0$ and letting the number of bins increase
correspondingly, we were not changing the physical state of the particle.
Instead, we were just obtaining more and more accurate descriptions of
that state. How? By using a larger and larger basis! [1] The sequence of
intervals implies a sequence of basis states $|k\rangle$. What is the limit of that
sequence?
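One way to approach this question is to look at the sequence
$$\lim_{\Delta x\to 0}\psi_k = \lim_{\Delta x\to 0}\langle k|\psi\rangle = \Big[\lim_{\Delta x\to 0}\langle k|\Big]|\psi\rangle. \qquad (6.6)$$
(Where, in the last step, we have acknowledged that the sequence of
finer-grained approximations involves changing the basis states $|k\rangle$, not the
state of the particle $|\psi\rangle$.) This approach is not helpful because the limit
always vanishes.
More useful is to look at the sequence
$$\lim_{\Delta x\to 0}\frac{\psi_k}{\sqrt{\Delta x}} = \lim_{\Delta x\to 0}\frac{\langle k|\psi\rangle}{\sqrt{\Delta x}} = \Big[\lim_{\Delta x\to 0}\frac{\langle k|}{\sqrt{\Delta x}}\Big]|\psi\rangle = \psi(x_0). \qquad (6.7)$$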
[1] You might object that the basis was not really getting bigger: it started out with an
infinite number of bins and at each stage in the process always has an infinite number of
bins. I will reply that in some sense it has a “larger infinity” than it started with. If you
want to make this sense rigorous and precise, take a mathematics course that studies
transfinite numbers.
This sequence motivates the definition of the “position basis state”
$$|x_0\rangle = \lim_{\Delta x\to 0}\frac{|k\rangle}{\sqrt{\Delta x}}. \qquad (6.8)$$
This new entity $|x_0\rangle$ is not quite the same thing as the basis states
like $|k\rangle$ that we’ve seen up to now, just as $\psi(x_0)$ is not quite the same
thing as an amplitude. For example, $|k\rangle$ is dimensionless while $|x_0\rangle$ has the
dimensions of $1/\sqrt{\text{length}}$. Mathematicians call the entity $|x_0\rangle$ not a “basis
state” but a “rigged basis state”. The word “rigged” carries the nautical
connotation (a rigged ship is one outfitted for sailing and ready to move
into action) and not the unsavory connotation (a rigged election is an
unfair one). These are again fascinating mathematical questions [2] but this is
not a mathematics book, so we won’t make a big fuss over the distinction.
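Completeness relation for continuum basis states:
$$\hat{1} = \sum_{i=-\infty}^{\infty}|i\rangle\langle i| = \lim_{\Delta x\to 0}\sum_{i=-\infty}^{\infty}\frac{|i\rangle}{\sqrt{\Delta x}}\frac{\langle i|}{\sqrt{\Delta x}}\,\Delta x = \int_{-\infty}^{+\infty}|x\rangle\langle x|\,dx. \qquad (6.9)$$
Orthogonality relation for continuum basis states:
$$\begin{aligned}
\langle i|j\rangle &= \delta_{i,j}\\
\langle x|y\rangle &= 0 \quad\text{when } x \neq y\\
\langle x|x\rangle &= \lim_{\Delta x\to 0}\frac{\langle i|i\rangle}{\Delta x} = \lim_{\Delta x\to 0}\frac{1}{\Delta x} = \infty\\
\langle x|y\rangle &= \delta(x - y).
\end{aligned}$$
Just as the wavefunction is related to an amplitude but is not a true amplitude,
and a rigged basis state $|x\rangle$ is related to a basis state but is not a true
basis state, so the inner product result $\delta(x - y)$, the Dirac delta function,
is related to a function but is not a true function. Mathematicians call it a
“generalized function” or a “Schwartz distribution”.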
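[2] If you find them interesting, take a course in rigged Hilbert spaces.

Comparison of discrete and continuous basis states
$$\begin{array}{ll}
\textbf{Discrete} & \textbf{Continuous}\\[4pt]
\text{basis states } |n\rangle;\ \text{dimensionless} & \text{basis states } |x\rangle;\ \text{dimensions } 1/\sqrt{\text{length}}\\[4pt]
\psi_n = \langle n|\psi\rangle & \psi(x) = \langle x|\psi\rangle\\[4pt]
\psi_n \text{ is dimensionless} & \psi(x) \text{ has dimensions } 1/\sqrt{\text{length}}\\[4pt]
\sum_n |\psi_n|^2 = 1 & \int_{-\infty}^{+\infty}|\psi(x)|^2\,dx = 1\\[4pt]
\langle n|m\rangle = \delta_{n,m} & \langle x|y\rangle = \delta(x - y)\\[4pt]
\langle\phi|\psi\rangle = \sum_n \phi_n^*\psi_n & \langle\phi|\psi\rangle = \int_{-\infty}^{+\infty}\phi^*(x)\psi(x)\,dx\\[4pt]
\sum_n |n\rangle\langle n| = \hat{1} & \int_{-\infty}^{+\infty}|x\rangle\langle x|\,dx = \hat{1}
\end{array}$$
Exercise: Show that $\langle\phi|\psi\rangle = \int_{-\infty}^{+\infty}\phi^*(x)\psi(x)\,dx$. Hint: $\langle\phi|\psi\rangle = \langle\phi|\hat{1}|\psi\rangle$.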
6.2 How does position amplitude change with time?
In classical mechanics, the equation telling us how position changes with
time is $\vec{F} = m\vec{a}$. It is not possible to derive $\vec{F} = m\vec{a}$, but it is possible to
motivate it.
The role of this section is to uncover the quantal equivalent of $\vec{F} = m\vec{a}$:
namely the equation telling us how position amplitude changes with time.
As with $\vec{F} = m\vec{a}$, it is possible to motivate this equation but not to prove
it. As such, the arguments in this section are suggestive, not definitive.
Indeed, in some circumstances the arguments are false (e.g. for a single
charged particle in a magnetic field, or for a pair of entangled particles).
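[Figure: amplitudes $\psi_{i-1}$, $\psi_i$, $\psi_{i+1}$ in three adjacent bins of width $\Delta x$, and the amplitudes $\psi'_{i-1}$, $\psi'_i$, $\psi'_{i+1}$ in the same bins a time $\Delta t$ later.]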
Normalization requirement
The amplitude for the particle to be within bin $i$ is initially $\psi_i$, and after
time $\Delta t$ it changes to $\psi'_i = \psi_i + \Delta'\psi_i$. (In this section, change with time
is denoted $\Delta'\psi$, while change with space is denoted $\Delta\psi$.) Because the
probability that the particle is in some bin is one, the bin amplitudes are
normalized to
$$\sum_i |\psi_i|^2 = 1$$
and
$$\sum_i |\psi'_i|^2 = 1.$$
The second equation can be written
$$1 = \sum_i \psi'^*_i\psi'_i = \sum_i (\psi_i^* + \Delta'\psi_i^*)(\psi_i + \Delta'\psi_i) = \sum_i \big(\psi_i^*\psi_i + \psi_i^*\Delta'\psi_i + \Delta'\psi_i^*\,\psi_i + \Delta'\psi_i^*\,\Delta'\psi_i\big).$$
The first term on the far right sums to exactly 1, due to initial normalization.
The next two terms are of the form $z + z^* = 2\,\Re e\{z\}$, so
$$0 = \sum_i \Big[2\,\Re e\{\psi_i^*\Delta'\psi_i\} + \Delta'\psi_i^*\,\Delta'\psi_i\Big].$$
When we go to the limit of very small $\Delta t$, then $\Delta'\psi_i$ will be very small,
and $\Delta'\psi_i^*\,\Delta'\psi_i$, as the product of two very small quantities, will be ultra
small. Thus we neglect it and conclude that, due to normalization,
$$\Re e\Big\{\sum_i \psi_i^*\Delta'\psi_i\Big\} = 0. \qquad (6.10)$$
We can change this to a relation about wavefunction rather than bin
amplitude by remembering that, if $x_i$ is the point at the center of bin $i$,
then
$$\psi(x_i) = \lim_{\Delta x\to 0}\frac{\psi_i}{\sqrt{\Delta x}}. \qquad (6.11)$$
For very small bins, equation (6.10) becomes
$$\Re e\Big\{\sum_i \psi^*(x_i)\sqrt{\Delta x}\;\Delta'\psi(x_i)\sqrt{\Delta x}\Big\} = 0$$
or
$$\Re e\Big\{\int_{-\infty}^{+\infty}\psi^*(x)\,\Delta'\psi(x)\,dx\Big\} = 0. \qquad (6.12)$$
The flow of amplitude
Deductions from the preservation of normalization are important but purely
formal: they don’t tell us anything about the physics that’s going on as
time evolves. We begin with a very reasonable surmise:
$$\psi'_i = A_i\psi_{i-1} + B_i\psi_i + C_i\psi_{i+1}.$$
This says nothing more [3] than that the amplitude to be in bin $i$ at the end
of the time interval is the sum of
the amplitude to be in bin $i-1$ initially ($\psi_{i-1}$) times the amplitude
to flow right ($A_i$)
plus
the amplitude to be in bin $i$ initially ($\psi_i$) times the amplitude to
stay in that bin ($B_i$)
plus
the amplitude to be in bin $i+1$ initially ($\psi_{i+1}$) times the amplitude
to flow left ($C_i$).
The only important assumption we’ve made in writing down this surmise
is that only adjacent bins are important: surely a reasonable assumption if
the time interval ∆t is short. (Some people like to call Ai and Ci “hopping
amplitudes” rather than “flow amplitudes”.)
Note that the change amplitudes $A_i$, $B_i$, and $C_i$ are independent of the
position bin amplitudes $\psi_{i-1}$, $\psi_i$, and $\psi_{i+1}$. That is, $A_i$ represents the
amplitude to flow right regardless of what amplitude is originally in bin
$i - 1$. In other words, $A_i$, $B_i$, and $C_i$ depend on the situation (e.g. the mass
of the particle, the forces applied to the particle) but not on the state.
We surmise further that the flow amplitudes are independent of position
and of direction, so all the $A_i$ and $C_i$ are independent of $i$, and equal to each
other. This surmise seems at first to be silly: surely if the particle moves
on a line containing a hill and a valley, the flow will be more likely downhill
than uphill. However, this observation shows only that $A_i\psi_{i-1}$ will differ
from $C_i\psi_{i+1}$, not that $A_i$ will differ from $C_i$. We know that motion can
happen even if there are no hills and valleys (that “a particle in motion
remains in motion in the absence of an external force”) and the flow
amplitudes concern this part of motion, the motion without external force.
(The surmise that left flow amplitude equals right flow amplitude does, in
fact, turn out to be false for a charged particle in a magnetic field.) On
the other hand, the hill vs. valley argument means that $B_i$ will depend on
position.
[3] Compare the rules for combining amplitude on page 58.
Finally, realize that the amplitudes $A$ and $B_i$ will depend on $\Delta x$ and
$\Delta t$: we expect that the flow amplitude $A$ will increase with increasing $\Delta t$
(more time, more flow), and decrease with increasing $\Delta x$ (with fat bins the
flow at boundaries is less significant).
With these surmises in place, we have
$$\psi'_i = A\psi_{i-1} + B_i\psi_i + A\psi_{i+1}. \qquad (6.13)$$
Now, I write $B_i$ in a funny way as $B_i = -2A + 1 + D_i$. I do this so that
the equation will turn into
$$\Delta'\psi_i = \psi'_i - \psi_i = A(\psi_{i-1} - \psi_i) + D_i\psi_i + A(\psi_{i+1} - \psi_i), \qquad (6.14)$$
which emphasizes amplitude differences rather than amplitude totals. In
terms of the differences
$$\Delta\psi_L = \psi_i - \psi_{i-1}, \qquad \Delta\psi_R = \psi_{i+1} - \psi_i$$
this equation is
$$\Delta'\psi_i = -A\,\Delta\psi_L + D_i\psi_i + A\,\Delta\psi_R. \qquad (6.15)$$
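Writing this way, in terms of differences, sets us up for taking derivatives:
$$\Delta\psi_R - \Delta\psi_L = \Delta x\left(\frac{\Delta\psi_R}{\Delta x} - \frac{\Delta\psi_L}{\Delta x}\right).$$
The ratio $\Delta\psi_R/\Delta x$ clearly relates to a spatial derivative taken at the right
boundary of bin $i$. Furthermore
$$\Delta\psi_R - \Delta\psi_L = (\Delta x)^2\left[\frac{\dfrac{\Delta\psi_R}{\Delta x} - \dfrac{\Delta\psi_L}{\Delta x}}{\Delta x}\right]$$
just as clearly relates to a second spatial derivative taken at the center of
bin $i$.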
At some point we need to switch over from talking about bin amplitude
to talking about wavefunction, and this is a convenient point. Divide both
sides of equation (6.15) by $\sqrt{\Delta x}$ and use equation (6.11) to write (in an
approximation that grows increasingly accurate as $\Delta x \to 0$)
$$\Delta'\psi(x_i) \approx -A(\Delta x)^2\left.\frac{\partial^2\psi(x)}{\partial x^2}\right|_{x=x_i} + D_i\psi(x_i).$$
While I have written this equation for the point at the center of bin $i$, of
course it holds for any point. Defining $D(x_i) = D_i$ gives
$$\Delta'\psi(x) \approx -A(\Delta x)^2\frac{\partial^2\psi}{\partial x^2} + D(x)\psi(x). \qquad (6.16)$$
Using the normalization equation
This is a good time to use result (6.12), the consequence of the normalization
requirement. Applying (6.16) in (6.12) shows that
$$\int_{-\infty}^{+\infty}\psi^*(x)\,\Delta'\psi(x)\,dx = -A(\Delta x)^2\int_{-\infty}^{+\infty}\psi^*(x)\frac{\partial^2\psi}{\partial x^2}\,dx + \int_{-\infty}^{+\infty}\psi^*(x)D(x)\psi(x)\,dx \qquad (6.17)$$
is pure imaginary. This requirement holds for all wavefunctions $\psi(x)$, and
for all situations regardless of $D(x)$, so each of the two terms on the right
must be pure imaginary. (That is, we cannot count on a real part in the first
term on the right to cancel a real part in the second term on the right,
because if they happened to cancel for one function $D(x)$, they wouldn’t
cancel for a different function $D(x)$, but the normalization condition has to
hold for all possible functions $D(x)$.)
The first integral on the right-hand side of (6.17) can be performed by
parts:
$$\int_{-\infty}^{+\infty}\psi^*(x)\frac{\partial^2\psi}{\partial x^2}\,dx = \left[\psi^*(x)\frac{\partial\psi}{\partial x}\right]_{x=-\infty}^{+\infty} - \int_{-\infty}^{+\infty}\frac{\partial\psi^*}{\partial x}\frac{\partial\psi}{\partial x}\,dx.$$
The part in square brackets vanishes: otherwise $\psi(x)$ is not normalized.
The remaining integral is of the form
$$\int f^*(x)f(x)\,dx$$
which is pure real. Thus the constant $A$ must be pure imaginary.
The second integral on the right-hand side of (6.17) is
$$\int_{-\infty}^{+\infty}\psi^*(x)D(x)\psi(x)\,dx$$
which must be imaginary for all wavefunctions $\psi(x)$, even wavefunctions
that are pure real. Thus $D(x)$ must be pure imaginary.
We have found that the amplitudes A and D(x) must be pure imaginary,
so we define the pure real quantities a and d(x) through
A = ia and D(x) = id(x).
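The role of a pure imaginary flow amplitude can be seen directly in the discrete bin picture. The sketch below applies one time step of equation (6.14), with $D_i = 0$, to a hypothetical normalized state on a ring of bins, once with a small pure imaginary $A$ and once with a real $A$ of the same size. The ring (periodic ends), the number of bins, and the Gaussian-shaped initial state are assumptions made only to keep the example finite.

```python
import math

N = 32        # number of bins; a ring (periodic ends) is an assumption made here
WIDTH = 2.0   # hypothetical width, in bins, of the initial Gaussian-shaped state


def normalized_initial_state():
    """Bin amplitudes with total probability one."""
    raw = [math.exp(-((i - N // 2) ** 2) / (2 * WIDTH ** 2)) for i in range(N)]
    scale = math.sqrt(sum(r * r for r in raw))
    return [complex(r / scale) for r in raw]


def step(psi, A):
    """One time step of equation (6.14) with D_i = 0."""
    return [psi[i]
            + A * (psi[(i - 1) % N] - psi[i])
            + A * (psi[(i + 1) % N] - psi[i])
            for i in range(N)]


def total_probability(psi):
    return sum(abs(c) ** 2 for c in psi)


psi = normalized_initial_state()
small = 1e-3
change_imaginary = abs(total_probability(step(psi, 1j * small)) - 1.0)
change_real = abs(total_probability(step(psi, small)) - 1.0)

print("A = i * 0.001: change in total probability =", change_imaginary)
print("A = 0.001    : change in total probability =", change_real)
```

With a real $A$ the total probability changes at first order in $A$; with a pure imaginary $A$ only the second-order term that was neglected before equation (6.10) remains.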
The discrete-time amplitude equation (6.16) becomes
$$\Delta'\psi(x) \approx i\left[-a(\Delta x)^2\frac{\partial^2\psi}{\partial x^2} + d(x)\psi(x)\right]. \qquad (6.18)$$
Dimensional analysis
Let’s find more about the quantity $a$, which is dimensionless. It’s not plausible
for the quantity $a$ to depend on the phase of the moon, or the national
debt. It can only depend on $\Delta x$, $\Delta t$, the particle mass $m$, and Planck’s
constant $\hbar$. (It makes sense that $a$ should depend on the inertia of the
particle $m$, as we’ve already pointed out that this part of the Hamiltonian
is involved with flow.)
$$\begin{array}{ll}
\textbf{quantity} & \textbf{dimensions}\\
\Delta x & [\ell]\\
\Delta t & [t]\\
m & [m]\\
\hbar & [m][\ell]^2/[t]
\end{array}$$
The quantity $a(\Delta x)^2$ must be finite in the limit $\Delta x \to 0$, so $a$ must
depend on $\Delta x$ through the proportionality
$$a \propto \frac{1}{(\Delta x)^2} \qquad\text{dimensions of right-hand side: } \frac{1}{[\ell]^2}.$$
To make $a$ dimensionless we’ll need to cancel the dimensions of length. The
only way to do this is through $\hbar$:
$$a \propto \frac{\hbar}{(\Delta x)^2} \qquad\text{dimensions of right-hand side: } \frac{[m]}{[t]}.$$
Now we need to cancel out the dimensions of mass and time. Again there
is only one way to do this:
$$a \propto \frac{\hbar}{(\Delta x)^2}\frac{\Delta t}{m} \qquad\text{dimensions of right-hand side: none.}$$
In short
$$a = n_d\,\frac{\Delta t}{(\Delta x)^2}\frac{\hbar}{m}$$
where $n_d$ is a dimensionless real number. Note that, as anticipated immediately
before equation (6.13), the quantity $a$ increases with $\Delta t$ and decreases
with $\Delta x$.
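The bookkeeping of this dimensional argument can be mechanized. The sketch below stores each quantity’s dimensions as exponents of (mass, length, time), taken from the table above, and adds exponents to find the dimensions of each proposed combination. The dictionary keys and the function name are labels invented for this illustration.

```python
# Dimensions are stored as exponents of (mass, length, time).
DIMENSIONS = {
    "dx":   (0, 1, 0),    # [l]
    "dt":   (0, 0, 1),    # [t]
    "m":    (1, 0, 0),    # [m]
    "hbar": (1, 2, -1),   # [m][l]^2/[t]
}


def dimensions_of(powers):
    """Dimensions of a product of the quantities raised to the given powers."""
    total = [0, 0, 0]
    for name, power in powers.items():
        for k, exponent in enumerate(DIMENSIONS[name]):
            total[k] += power * exponent
    return tuple(total)


# The three stages of the argument in the text.
step_1 = dimensions_of({"dx": -2})                             # 1/(dx)^2
step_2 = dimensions_of({"dx": -2, "hbar": 1})                  # hbar/(dx)^2
step_3 = dimensions_of({"dx": -2, "hbar": 1, "dt": 1, "m": -1})  # hbar dt/((dx)^2 m)

print("1/(dx)^2           ->", step_1)
print("hbar/(dx)^2        ->", step_2)
print("hbar dt/((dx)^2 m) ->", step_3)
```

The final tuple is all zeros, confirming that the last combination is dimensionless.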
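With our new understanding we write equation (6.18) as
$$\Delta'\psi(x) \approx i\left[-\frac{\hbar n_d}{m}\,\Delta t\,\frac{\partial^2\psi}{\partial x^2} + d(x)\psi(x)\right]$$
or
$$\frac{\Delta'\psi(x)}{\Delta t} \approx i\left[-\frac{\hbar n_d}{m}\frac{\partial^2\psi}{\partial x^2} + \frac{d(x)}{\Delta t}\psi(x)\right]$$
which is conventionally written
$$\frac{\Delta'\psi(x)}{\Delta t} \approx -\frac{i}{\hbar}\left[\frac{\hbar^2 n_d}{m}\frac{\partial^2\psi}{\partial x^2} - \frac{\hbar\,d(x)}{\Delta t}\psi(x)\right].$$
(This form has the advantage that the part in square brackets has the
dimensions of energy times the dimensions of $\psi$.)
The function $\hbar\,d(x)/\Delta t$ has the dimensions of energy, and we call it $v(x)$.
Now taking the two limits $\Delta x \to 0$ and $\Delta t \to 0$, we find
$$\frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar}\left[\frac{\hbar^2 n_d}{m}\frac{\partial^2\psi(x,t)}{\partial x^2} - v(x)\psi(x,t)\right]. \qquad (6.19)$$
Exercise: Does it make physical sense that the “stay at home bin amplitude”
$D_i$ (see equation 6.14) should increase with increasing $\Delta t$?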
Classical limit
To complete the specification of this equation, we must find values for $n_d$
and $v(x)$. This can be done by applying the equation to a massive particle
starting with a pretty-well defined position and seeing how that pretty-well
defined position changes with time. In this so-called classical limit,
the results of quantum mechanics must go over to match the results of
classical mechanics. We are not yet equipped to do this, but we will find in
section 6.5 that enforcing the classical limit gives the result that $n_d = -1/2$
and $v(x)$ is the negative of the classical potential energy function $V(x)$.
This latter result astounds me. The classical potential energy function
derives from considering a particle with a definite location. Why should it
have anything to do with quantum mechanics? I don’t know, but it surely
does.
We will see that the first part of the Hamiltonian corresponds to kinetic
energy, and sure enough we’ve been relating it to “flow” or “hopping”.
Again, I am astounded that the quantal expression corresponding to kinetic
energy is so different from the classical expression, just as I am astounded
that the quantal expression corresponding to potential energy is so similar
to the classical expression. Again, it’s true whether I find it astounding or
not.
Conclusion
The wavefunction $\psi(x, t)$ evolves in time according to
$$\frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2\psi(x,t)}{\partial x^2} + V(x)\psi(x,t)\right], \qquad (6.20)$$
where $V(x)$ is the classical potential energy function. This equation was discovered
in a completely different way by the 38-year-old Erwin Schrödinger
during the Christmas season of 1925, at the alpine resort of Arosa, Switzerland,
in the company of “an old girlfriend [from] Vienna”, while his wife
stayed at home in Zürich. It is called the Schrödinger equation, and it plays
the same central role in quantum mechanics that $\vec{F} = m\vec{a}$ plays in classical
mechanics.
Do not think that we have derived the Schrödinger equation: instead
we have taken it to pieces to see how it works.
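To see equation (6.20) at work on something concrete, the sketch below tests a free-particle plane wave $e^{i(kx - \omega t)}$ with $\omega = \hbar k^2/2m$ against the equation with $V(x) = 0$, using finite differences for both derivatives. The plane wave (which is not normalizable and serves only as a local check), the numerical values, and the units with $\hbar = 1$ are assumptions chosen for illustration.

```python
import cmath

HBAR = 1.0   # assumption: units with hbar = 1
MASS = 1.0   # hypothetical particle mass
K = 2.0      # hypothetical wavenumber
OMEGA = HBAR * K * K / (2 * MASS)


def psi(x, t):
    """Plane wave exp(i(kx - omega t)), a free-particle (V = 0) trial solution."""
    return cmath.exp(1j * (K * x - OMEGA * t))


def time_derivative(x, t, dt=1e-5):
    """Centered finite-difference estimate of d(psi)/dt."""
    return (psi(x, t + dt) - psi(x, t - dt)) / (2 * dt)


def right_hand_side(x, t, dx=1e-3):
    """-(i/hbar) [ -(hbar^2 / 2m) d^2 psi/dx^2 + V psi ] with V = 0."""
    second = (psi(x + dx, t) - 2 * psi(x, t) + psi(x - dx, t)) / dx ** 2
    return (-1j / HBAR) * (-(HBAR ** 2) / (2 * MASS) * second)


x, t = 0.3, 1.2
mismatch = abs(time_derivative(x, t) - right_hand_side(x, t))
print("mismatch between the two sides:", mismatch)
```

Changing `OMEGA` to any other value makes the mismatch large, which shows that the equation ties the time dependence of the wave to its spatial shape.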
Bibliography:
R.P. Feynman, R.B. Leighton, and M. Sands, The Feynman Lectures on
Physics, volume 3: Quantum Mechanics (Addison-Wesley, Reading, Mas-
sachusetts, 1965) pages 16-1–16-4.
Gordon Baym, Lectures on Quantum Mechanics (Benjamin, Reading,
Massachusetts, 1969) pages 46–53.
W. Moore, Schrödinger: Life and Thought (Cambridge University Press,
1989) page 194.
6.3 What is wavefunction?
We have introduced the tool of wavefunction (or “amplitude density”).
Wavefunction is sort of like magnetic field in that you can’t touch it or taste
it or smell it, but in fact is even more abstract. For one thing wavefunction
is complex-valued, not real-valued. For another it is determined, to some
extent, by convention. We will see a third soon: The wavefunction for a
single particle moving in three-dimensional space is a function of three-dimensional
space: $\psi(\vec{x})$. But the wavefunction for two particles, A and B,
each moving in three-dimensional space, is NOT $\psi_A(\vec{x})$ plus $\psi_B(\vec{x})$, instead
it is a function $\psi(\vec{x}_A, \vec{x}_B)$. That is, the wavefunction for two particles is
a function in a six-dimensional space! (You might recall from a classical
mechanics course that this space is called configuration space.)
This question has gnawed at people from the very beginnings of quantum
mechanics: In the summer of 1926, Erich Hückel [4] composed the ditty,
presented here in the free translation by Felix Bloch [5]:
Erwin with his ψ can do
Calculations quite a few.
But one thing has not been seen:
Just what does ψ really mean?
Rather than worry about what wavefunction is, I recommend that you
avoid traps of what wavefunction is not. It can’t be measured. It doesn’t
exist in physical space. It is dependent on convention. It is a mathematical
tool like the scalar and vector potentials. ψ is a step in an algorithm: it has
no more physical significance than the intermediates of a multiplication.
4 Erich Hückel (1896–1980) was a German physicist whose work in molecular orbitals
resulted in the first successful treatment of the carbon-carbon double bond.

5 Felix Bloch (1905–1983) was a Jewish-Swiss-American physicist who made contributions
to the quantum theory of solids and to other fields. He won the Nobel Prize for his work
in nuclear magnetic resonance. His memory of this poem comes from his “Reminiscences
of Heisenberg and the early days of quantum mechanics” [Physics Today 29(12) 23–27
(December 1976)].
6.4 Operators and their representations; The momentum basis
The position operator and functions of the position operator
The position operator is called $\hat{x}$. If we know the action of $\hat{x}$ on every
member of the $\{|x\rangle\}$ basis (or any other basis!), then we know everything
about the operator. But we do know that!
\[ \hat{x}|x'\rangle = x'|x'\rangle. \]
Furthermore, we can find the action of $\hat{x}^2$ on every member of the $\{|x\rangle\}$
basis as follows:
\[ \hat{x}^2|x'\rangle = \hat{x}(\hat{x}|x'\rangle) = \hat{x}(x'|x'\rangle) = x'(\hat{x}|x'\rangle) = x'(x'|x'\rangle) = (x')^2|x'\rangle. \]
Similarly, for any integer power $n$,
\[ \hat{x}^n|x'\rangle = (x')^n|x'\rangle. \]
Exercise: Prove this using mathematical induction.
If $f(x)$ is a scalar function with Taylor series
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, x^n, \tag{6.21} \]
then we define the operator $f(\hat{x})$ through
\[ f(\hat{x}) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, \hat{x}^n. \tag{6.22} \]
(This enables us to find operators for quantities like $e^x$.) The upshot is
that for these operators, the position basis states are eigenstates:
\[ f(\hat{x})|x'\rangle = f(x')|x'\rangle. \]
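We’ve been examining the action of operators like $f(\hat{x})$ on position basis
states. What if they act upon some other state? We find out by expanding
the general state $|\psi\rangle$ into position states:
\begin{align*}
f(\hat{x})|\psi\rangle &= f(\hat{x})\hat{1}|\psi\rangle \\
&= f(\hat{x}) \int_{-\infty}^{+\infty} |x'\rangle\langle x'|\,dx'\,|\psi\rangle \\
&= \int_{-\infty}^{+\infty} f(\hat{x})|x'\rangle\langle x'|\psi\rangle\,dx' \\
&= \int_{-\infty}^{+\infty} |x'\rangle f(x')\langle x'|\psi\rangle\,dx'.
\end{align*}
To get a feel for this result, we look for the representation of the state
$f(\hat{x})|\psi\rangle$ in the $\{|x\rangle\}$ basis:
\begin{align*}
\langle x|f(\hat{x})|\psi\rangle &= \int_{-\infty}^{+\infty} \langle x|x'\rangle f(x')\langle x'|\psi\rangle\,dx' \\
&= \int_{-\infty}^{+\infty} \delta(x - x') f(x') \psi(x')\,dx' \\
&= f(x)\psi(x).
\end{align*}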
IMPORTANT RESULT: The representation of an operator $f(\hat{x})$ in the
position basis is
\[ \langle x|f(\hat{x})|\psi\rangle = f(x)\langle x|\psi\rangle. \tag{6.23} \]
And, as we’ve seen, if we know $\langle x|\hat{A}|\psi\rangle$ for general $|\psi\rangle$ and for general $x$,
then we know everything there is to know about the operator.
The relation between a function-of-position operator and its position
basis representation is simple: erase the hats!
\[ |\phi\rangle = f(\hat{x})|\psi\rangle \iff \phi(x) = f(x)\psi(x). \tag{6.24} \]
Another application:
\[ \langle\phi|f(\hat{x})|\psi\rangle = \int_{-\infty}^{+\infty} \langle\phi|x\rangle\langle x|f(\hat{x})|\psi\rangle\,dx = \int_{-\infty}^{+\infty} \phi^*(x) f(x) \psi(x)\,dx. \tag{6.25} \]
So you might think we’re home free. But no, because. . .
There are other operators
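For example, the Hamiltonian operator, defined in terms of its components
in the position basis, is
\[ \langle x|\hat{H}|\psi\rangle = \left[ -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x) \right] \langle x|\psi\rangle. \tag{6.26} \]
The logical definition of the momentum operator is through
\[ \hat{H} = \frac{\hat{p}^2}{2m} + V(\hat{x}), \tag{6.27} \]
so
\[ \langle x|\hat{p}^2|\psi\rangle = -\hbar^2 \frac{\partial^2}{\partial x^2}\langle x|\psi\rangle. \tag{6.28} \]
IMPORTANT RESULT: Define the momentum operator $\hat{p}$ in terms of
its components / representation in the position basis as
\[ \langle x|\hat{p}|\psi\rangle = -i\hbar \frac{\partial}{\partial x}\langle x|\psi\rangle. \tag{6.29} \]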
The operator with “+i” rather than “−i” out in front would have the same
square, but would not have the correct classical limit. (See problem 6.7 and
the second exercise below.)
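Exercise: Would the phase-shifted convention
\[ \langle x|\hat{p}|\psi\rangle = -i\hbar e^{i\delta} \frac{\partial}{\partial x}\langle x|\psi\rangle, \]
where $\delta$ is pure real, be acceptable?
Exercise (sign of the momentum operator): The function $\psi_R(x;t) = A e^{i(+kx-\omega t)}$
represents a wave moving to the right, while $\psi_L(x;t) = A e^{i(-kx-\omega t)}$
represents a wave moving to the left. (Take $k$ to be positive.)
Apply each of our two candidate momentum operators
\[ \hat{p}_1 \doteq -i\hbar\frac{\partial}{\partial x} \quad\text{and}\quad \hat{p}_2 \doteq +i\hbar\frac{\partial}{\partial x} \]
to both of these functions, and show that the first candidate makes more
sense.
Answer:
\begin{align*}
\langle x|\hat{p}_1|\psi_R\rangle &= -i\hbar\frac{\partial}{\partial x} A e^{i(+kx-\omega t)} = -i\hbar(+ik) A e^{i(+kx-\omega t)} = (+\hbar k)\,\psi_R(x;t) \\
\langle x|\hat{p}_1|\psi_L\rangle &= -i\hbar\frac{\partial}{\partial x} A e^{i(-kx-\omega t)} = -i\hbar(-ik) A e^{i(-kx-\omega t)} = (-\hbar k)\,\psi_L(x;t) \\
\langle x|\hat{p}_2|\psi_R\rangle &= +i\hbar\frac{\partial}{\partial x} A e^{i(+kx-\omega t)} = +i\hbar(+ik) A e^{i(+kx-\omega t)} = (-\hbar k)\,\psi_R(x;t) \\
\langle x|\hat{p}_2|\psi_L\rangle &= +i\hbar\frac{\partial}{\partial x} A e^{i(-kx-\omega t)} = +i\hbar(-ik) A e^{i(-kx-\omega t)} = (+\hbar k)\,\psi_L(x;t)
\end{align*}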
Thus the eigenvalues for these four situations are:

candidate      wave                eigenvalue
$\hat{p}_1$    rightward moving    $+\hbar k$
$\hat{p}_1$    leftward moving     $-\hbar k$
$\hat{p}_2$    rightward moving    $-\hbar k$
$\hat{p}_2$    leftward moving     $+\hbar k$

Candidate 1 associates the rightward moving wave with a positive momentum
eigenvalue and the leftward moving wave with a negative momentum
eigenvalue. Candidate 2 does the opposite. Since we intuitively
associate rightward motion with positive momentum, candidate 1 is superior.
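The sign comparison in this exercise can also be checked numerically. The following is a minimal sketch, not part of the original exercise: it assumes units with $\hbar = 1$, amplitude $A = 1$, time $t = 0$, and an arbitrary wavenumber (the names `HBAR`, `K`, and `eigenvalue` are illustrative), and it approximates the derivative by a central difference.

```python
import cmath

HBAR = 1.0   # units with hbar = 1, chosen only for this illustration
K = 2.0      # a positive wavenumber, arbitrary


def psi_right(x):
    """Rightward-moving wave exp(i(+kx - wt)) evaluated at t = 0."""
    return cmath.exp(+1j * K * x)


def psi_left(x):
    """Leftward-moving wave exp(i(-kx - wt)) evaluated at t = 0."""
    return cmath.exp(-1j * K * x)


def eigenvalue(psi, sign, x=0.3, h=1e-5):
    """Return (sign * i * hbar * dpsi/dx) / psi at x, via a central difference.

    sign = -1 gives candidate p1, sign = +1 gives candidate p2.
    """
    dpsi_dx = (psi(x + h) - psi(x - h)) / (2 * h)
    return (sign * 1j * HBAR * dpsi_dx / psi(x)).real


results = {
    ("p1", "right"): eigenvalue(psi_right, -1),
    ("p1", "left"): eigenvalue(psi_left, -1),
    ("p2", "right"): eigenvalue(psi_right, +1),
    ("p2", "left"): eigenvalue(psi_left, +1),
}

for (candidate, wave), value in results.items():
    print(f"{candidate} on {wave}-moving wave: eigenvalue = {value:+.6f} hbar")
```

With these assumed values, only the "$-i$" candidate assigns the positive eigenvalue $+\hbar k$ to the rightward-moving wave.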
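Check on $\hat{p}^2$:
\begin{align*}
\langle x|\hat{p}^2|\psi\rangle &= \langle x|\hat{p}\hat{p}|\psi\rangle \qquad \text{[[define $|\phi\rangle = \hat{p}|\psi\rangle$]]} \\
&= \langle x|\hat{p}|\phi\rangle \\
&= -i\hbar\frac{\partial}{\partial x}\langle x|\phi\rangle \\
&= -i\hbar\frac{\partial}{\partial x}\langle x|\hat{p}|\psi\rangle \\
&= -i\hbar\frac{\partial}{\partial x}\left[-i\hbar\frac{\partial}{\partial x}\langle x|\psi\rangle\right] \\
&= -\hbar^2\frac{\partial^2}{\partial x^2}\langle x|\psi\rangle
\end{align*}
Now that we have the momentum operator, we will of course want to
find its eigenstates $|p\rangle$! (Purists will point out that these are not actually
eigenstates, but rigged eigenstates.)
Find the position representation $\pi(x) = \langle x|p\rangle$ of the momentum eigenstates
\begin{align*}
\hat{p}|p\rangle &= \lambda|p\rangle \\
\langle x|\hat{p}|p\rangle &= \lambda\langle x|p\rangle \\
-i\hbar\frac{\partial}{\partial x}\langle x|p\rangle &= \lambda\langle x|p\rangle \\
-i\hbar\frac{\partial\pi(x)}{\partial x} &= \lambda\pi(x) \\
\frac{\partial\pi(x)}{\partial x} &= i\frac{\lambda}{\hbar}\pi(x) \\
\pi(x) &= C e^{i(\lambda/\hbar)x} \tag{6.30}
\end{align*}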
That’s funny. When we solve an eigenproblem, we expect that only a
few eigenvalues $\lambda$ will result. That’s what happened with ammonia. But
there we had $2 \times 2$ matrices, and got two eigenvalues, whereas here we have
$\infty \times \infty$ matrices, so we get an infinite number of eigenvalues! The eigenvalue
$\lambda$ can be anything. . . positive, negative, even complex! A complex-valued $\lambda$
will result in a probability density looking like this:
[Figure: sketch of $|\pi(x)|^2$ as a function of $x$]
with an infinite pile of probability density off to the right or off to the left.
This seems unphysical. Furthermore, complex values of λ would result in a
non-Hermitian momentum operator, so we reject them. (Remember that in
this section we are not making rigorous mathematical derivations, instead
we are seeking sensible definitions.6 Complex-valued eigenvalues λ for the
momentum operator are not sensible.7 )
The constant $C$ is just an overall normalization constant. The best
convention is (see problem 6.1)
\[ C = \frac{1}{\sqrt{2\pi\hbar}}. \tag{6.31} \]
In summary, the operator $\hat{p}$ has eigenvectors $|p\rangle$ (technically, rigged
vectors) satisfying
\[ \hat{p}|p\rangle = p|p\rangle \tag{6.32} \]
\[ \langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\, e^{i(p/\hbar)x}. \tag{6.33} \]
6 “Here and elsewhere in science, as stressed not least by Henri Poincaré, that view is
out of date which used to say, ‘Define your terms before you proceed.’ All the laws and
theories of physics, including the Lorentz force law [$\vec{F} = q\vec{E} + q\vec{v} \times \vec{B}$], have this deep and
subtle character, that they both define the concepts they use (here $\vec{E}$ and $\vec{B}$) and make
statements about these concepts. Contrariwise, the absence of some body of theory, law,
and principle deprives one of the means properly to define or even to use concepts. Any
forward step in human knowledge is truly creative in this sense: that theory, concept,
law, and method of measurement — forever inseparable — are born into the world in
union.” C.W. Misner, K.S. Thorne, and J.A. Wheeler, Gravitation (W.H. Freeman and
Company, San Francisco, 1973) page 71.
7 In exactly the same way, when you solve this classical trajectory problem —“Carol
stands atop a 96 meter cliff and tosses a baseball at speed 45 m/s and angle 33◦ above
the horizontal. When does the baseball hit the ground?” — you find two solutions: 7.6 s
and −2.6 s. Even though the negative number is a mathematically correct solution, you
reject it on physical grounds.
Exercise: Show that $|p\rangle$ has the dimensions of $1/\sqrt{\text{momentum}}$. What
are the dimensions of $\langle x|p\rangle$?
Problem 6.1 will show that the momentum states are orthonormal
\[ \langle p|p'\rangle = \delta(p - p') \tag{6.34} \]
and complete
\[ \hat{1} = \int_{-\infty}^{+\infty} |p\rangle\langle p|\,dp, \tag{6.35} \]
and hence the set $\{|p\rangle\}$ constitutes a continuum (“rigged”) basis.
Representing states in the momentum basis
We have been dealing with a state $|\psi\rangle$ through its representation in the
position basis, that is, through its wavefunction (or position representation)
\[ \psi(x) = \langle x|\psi\rangle. \tag{6.36} \]
It is equally legitimate to deal with that state through its representation in
the momentum basis, that is, through its so-called momentum wavefunction
(or momentum representation)
\[ \tilde{\psi}(p) = \langle p|\psi\rangle. \tag{6.37} \]
Either representation carries complete information about the state $|\psi\rangle$,
so you can obtain one from the other:
\[ \tilde{\psi}(p) = \langle p|\psi\rangle = \int_{-\infty}^{+\infty} \langle p|x\rangle\langle x|\psi\rangle\,dx = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\,\psi(x)\,dx \tag{6.38} \]
\[ \psi(x) = \langle x|\psi\rangle = \int_{-\infty}^{+\infty} \langle x|p\rangle\langle p|\psi\rangle\,dp = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} e^{+i(p/\hbar)x}\,\tilde{\psi}(p)\,dp. \tag{6.39} \]
In short, the position and momentum wavefunctions are related to each
other through a Fourier transform!
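As a rough numerical illustration of relation (6.38), the sketch below transforms an assumed Gaussian test wavefunction by simple trapezoid quadrature and compares the result with the closed form that the same integral gives analytically. The units ($\hbar = 1$), the width `SIGMA`, the mean momentum `P0`, and the function names are all assumptions made for this example.

```python
import cmath
import math

HBAR = 1.0    # units with hbar = 1 (assumption for illustration)
SIGMA = 1.0   # width of the assumed Gaussian test state
P0 = 1.5      # mean momentum of the assumed test state


def psi(x):
    """Normalized Gaussian wavepacket in the position representation."""
    norm = math.pi ** -0.25 / math.sqrt(SIGMA)
    return norm * math.exp(-x * x / (2 * SIGMA ** 2)) * cmath.exp(1j * P0 * x / HBAR)


def psi_tilde_numeric(p, half_width=12.0, n=4000):
    """Trapezoid-rule estimate of (1/sqrt(2 pi hbar)) * integral of e^{-ipx/hbar} psi(x) dx."""
    dx = 2 * half_width / n
    total = 0j
    for j in range(n + 1):
        x = -half_width + j * dx
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(-1j * p * x / HBAR) * psi(x)
    return total * dx / math.sqrt(2 * math.pi * HBAR)


def psi_tilde_exact(p):
    """Closed-form transform of the Gaussian above (itself a Gaussian centered on P0)."""
    return (math.pi ** -0.25 * math.sqrt(SIGMA / HBAR)
            * math.exp(-(p - P0) ** 2 * SIGMA ** 2 / (2 * HBAR ** 2)))


for p in (0.0, 1.5, 3.0):
    print(p, abs(psi_tilde_numeric(p) - psi_tilde_exact(p)))
```

The printed differences should be at the level of floating-point roundoff, consistent with the two representations carrying the same information.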
Representing operators in the momentum basis
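It is easy to represent momentum-related operators in the momentum basis.
For example, using the fact that $\hat{p}$ is Hermitian,
\[ \langle p|\hat{p}|\psi\rangle = [\langle\psi|\hat{p}|p\rangle]^* = [p\langle\psi|p\rangle]^* = p\langle p|\psi\rangle. \tag{6.40} \]
More generally, for any function of the momentum operator,
\[ \langle p|f(\hat{p})|\psi\rangle = f(p)\langle p|\psi\rangle. \tag{6.41} \]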
It’s a bit more difficult to find the momentum representation of the
position operator, that is, to find $\langle p|\hat{x}|\psi\rangle$. But we can do it, using a slick
trick called “parametric differentiation”.
First, I’ll introduce parametric differentiation in a purely mathematical
context. Suppose you need to evaluate the integral
\[ \int_0^\infty x e^{-kx} \cos x \, dx \]
but you can only remember that
\[ \int_0^\infty e^{-kx}\cos x\,dx = \frac{k}{k^2+1}. \]
You can differentiate both sides with respect to the parameter $k$, finding
\begin{align*}
\frac{\partial}{\partial k}\int_0^\infty e^{-kx}\cos x\,dx &= \frac{\partial}{\partial k}\,\frac{k}{k^2+1} \\
\int_0^\infty \frac{\partial e^{-kx}}{\partial k}\cos x\,dx &= \frac{(k^2+1) - k(2k)}{(k^2+1)^2} \\
\int_0^\infty (-x e^{-kx})\cos x\,dx &= \frac{-k^2+1}{(k^2+1)^2} \\
\int_0^\infty x e^{-kx}\cos x\,dx &= \frac{k^2-1}{(k^2+1)^2}.
\end{align*}
This is a lot easier than any other method I can think of to evaluate this
integral.
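The parametric differentiation result can be sanity-checked numerically. This is a minimal sketch under stated assumptions: the value $k = 2$, the finite upper limit 40 (beyond which $e^{-kx}$ is negligible), and the helper names are choices made here, not part of the text.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def remembered_integral(k):
    """Closed form of the integral of e^{-kx} cos x over [0, infinity)."""
    return k / (k * k + 1)


def target_closed_form(k):
    """Closed form obtained by parametric differentiation."""
    return (k * k - 1) / (k * k + 1) ** 2


k = 2.0
upper = 40.0   # stands in for infinity; e^{-80} is far below roundoff

# Direct quadrature of the "hard" integral.
direct = simpson(lambda x: x * math.exp(-k * x) * math.cos(x), 0.0, upper)

# Minus the k-derivative of the remembered integral, by central difference.
dk = 1e-5
from_derivative = -(remembered_integral(k + dk) - remembered_integral(k - dk)) / (2 * dk)

print(direct, from_derivative, target_closed_form(k))
```

All three printed numbers should agree closely, which is the content of the last line of the derivation.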
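Go back to the problem of finding $\langle p|\hat{x}|\psi\rangle$:
\begin{align*}
\langle p|\hat{x}|\psi\rangle &= \langle p|\hat{x}\hat{1}|\psi\rangle \\
&= \int_{-\infty}^{+\infty} \langle p|\hat{x}|x\rangle\langle x|\psi\rangle\,dx \\
&= \int_{-\infty}^{+\infty} \langle p|x\rangle\, x\, \langle x|\psi\rangle\,dx \\
&= \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} e^{-i(p/\hbar)x}\, x\, \langle x|\psi\rangle\,dx \qquad \text{[[Now use parametric differentiation!]]} \\
&= \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \frac{\hbar}{-i}\frac{\partial}{\partial p}\left[e^{-i(p/\hbar)x}\right] \langle x|\psi\rangle\,dx \\
&= +i\hbar\, \frac{1}{\sqrt{2\pi\hbar}}\, \frac{\partial}{\partial p} \int_{-\infty}^{+\infty} e^{-i(p/\hbar)x} \langle x|\psi\rangle\,dx \\
&= +i\hbar\frac{\partial}{\partial p} \int_{-\infty}^{+\infty} \langle p|x\rangle\langle x|\psi\rangle\,dx \\
&= +i\hbar\frac{\partial}{\partial p}\langle p|\psi\rangle \tag{6.42}
\end{align*}
There’s a nice symmetry to this result, making it easy to remember: The
momentum operator, represented in the position basis, is
\[ \langle x|\hat{p}|\psi\rangle = -i\hbar\frac{\partial}{\partial x}\psi(x) \tag{6.43} \]
while the position operator, represented in the momentum basis, is
\[ \langle p|\hat{x}|\psi\rangle = +i\hbar\frac{\partial}{\partial p}\tilde{\psi}(p). \tag{6.44} \]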
Exercise: Show that
\[ |\psi\rangle = \int_{-\infty}^{+\infty} \psi(x)\,|x\rangle\,dx = \int_{-\infty}^{+\infty} \tilde{\psi}(p)\,|p\rangle\,dp. \tag{6.45} \]
Verify that both of these relations have the correct dimensions.
Problems
6.1 The states $\{|p\rangle\}$ constitute a continuum basis
At equation (6.30) we showed that the inner product $\langle x|p\rangle$ must have
the form
\[ \langle x|p\rangle = C e^{i(p/\hbar)x} \tag{6.46} \]
where $C$ may be chosen for convenience.
a. Show that the operator
\[ \hat{A} = \int_{-\infty}^{\infty} |p\rangle\langle p|\,dp \tag{6.47} \]
is equal to
\[ 2\pi\hbar|C|^2\,\hat{1} \tag{6.48} \]
by evaluating
\[ \langle\phi|\hat{A}|\psi\rangle = \langle\phi|\hat{1}\hat{A}\hat{1}|\psi\rangle \tag{6.49} \]
for arbitrary states $|\psi\rangle$ and $|\phi\rangle$. Hints: Set the first $\hat{1}$ equal to
$\int_{-\infty}^{\infty} |x\rangle\langle x|\,dx$, the second $\hat{1}$ equal to
$\int_{-\infty}^{\infty} |x'\rangle\langle x'|\,dx'$. The identity
\[ \delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk \tag{6.50} \]
(see Griffiths equation [2.144] on page 77) for the Dirac delta function
is useful here. Indeed, this is one of the most useful equations
to be found anywhere!
b. Using the conventional choice $C = 1/\sqrt{2\pi\hbar}$, show that
\[ \langle p|p'\rangle = \delta(p - p'). \tag{6.51} \]
The expression (6.50) is again helpful.
6.2 Peculiarities of continuum basis states
Recall that the elements of a continuum basis set are peculiar in that
they possess dimensions. That is not their only peculiarity. For any
ordinary state $|\psi\rangle$, the wavefunction $\psi(x) = \langle x|\psi\rangle$ satisfies
\[ \int_{-\infty}^{\infty} \psi^*(x)\psi(x)\,dx = 1. \tag{6.52} \]
Show that the states $|x'\rangle$ and $|p\rangle$ cannot obey this normalization.
6.3 Hermiticity of the momentum operator
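Show that the momentum operator is Hermitian over the space of states
$|\psi\rangle$ whose wavefunctions $\psi(x)$ vanish at $x = \pm\infty$. Hint:
\[ \langle\phi|\hat{p}|\psi\rangle = \int_{-\infty}^{\infty} \phi^*(x)\left[-i\hbar\frac{d\psi(x)}{dx}\right] dx. \tag{6.53} \]
Integrate by parts.
6.4 Commutator of $\hat{x}$ and $\hat{p}$
Show that $[\hat{x}, \hat{p}] = i\hbar$ by showing that $\langle\phi|[\hat{x}, \hat{p}]|\psi\rangle = i\hbar\langle\phi|\psi\rangle$ for
arbitrary $|\phi\rangle$ and $|\psi\rangle$. Hints: First evaluate $\langle x|\hat{p}\hat{x}|\psi\rangle$ and $\langle x|\hat{x}\hat{p}|\psi\rangle$. It helps
to define $|\chi\rangle = \hat{x}|\psi\rangle$.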
6.5 Momentum representation of the Schrödinger equation

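You know that the Schrödinger equation
\[ \frac{d|\psi(t)\rangle}{dt} = -\frac{i}{\hbar}\hat{H}|\psi(t)\rangle \tag{6.54} \]
has the position representation
\[ \frac{\partial\langle x|\psi(t)\rangle}{\partial t} = -\frac{i}{\hbar}\langle x|\hat{H}|\psi(t)\rangle \tag{6.55} \]
or
\[ i\hbar\frac{\partial\psi(x;t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi(x;t)}{\partial x^2} + V(x)\psi(x;t). \tag{6.56} \]
In this problem you will uncover the corresponding equation that governs
the time development of
\[ \tilde{\psi}(p;t) = \langle p|\psi(t)\rangle. \tag{6.57} \]
The left hand side of equation (6.54) is straightforward because
\[ \langle p|\frac{d}{dt}|\psi(t)\rangle = \frac{\partial\tilde{\psi}(p;t)}{\partial t}. \tag{6.58} \]
To investigate the right hand side of equation (6.54) write
\[ \hat{H} = \frac{1}{2m}\hat{p}^2 + \hat{V} \tag{6.59} \]
where $\hat{p}$ is the momentum operator and $\hat{V}$ the potential energy operator.
a. Use the Hermiticity of $\hat{p}$ to show that
\[ \langle p|\hat{H}|\psi(t)\rangle = \frac{p^2}{2m}\tilde{\psi}(p;t) + \langle p|\hat{V}|\psi(t)\rangle. \tag{6.60} \]
Now we must investigate $\langle p|\hat{V}|\psi(t)\rangle$.
b. Show that
\[ \langle p|\hat{V}|\psi(t)\rangle = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} e^{-i(p/\hbar)x}\,V(x)\,\psi(x;t)\,dx \tag{6.61} \]
by inserting the proper form of $\hat{1}$ at the proper location.
c. Define the (modified) Fourier transform $\tilde{V}(p)$ of $V(x)$ through
\begin{align*}
\tilde{V}(p) &= \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} e^{-i(p/\hbar)x}\,V(x)\,dx \tag{6.62} \\
&= \int_{-\infty}^{\infty} \langle p|x\rangle V(x)\,dx. \tag{6.63}
\end{align*}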
Note that $\tilde{V}(p)$ has funny dimensions. Show that
\begin{align*}
V(x) &= \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} e^{i(p/\hbar)x}\,\tilde{V}(p)\,dp \tag{6.64} \\
&= \int_{-\infty}^{\infty} \langle x|p\rangle \tilde{V}(p)\,dp. \tag{6.65}
\end{align*}
You may use either forms (6.62) and (6.64), in which case the proof
employs equation (6.50), or forms (6.63) and (6.65), in which case
the proof involves completeness and orthogonality of basis states.
d. Hence show that
\[ \langle p|\hat{V}|\psi(t)\rangle = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \tilde{V}(p - p')\,\tilde{\psi}(p';t)\,dp'. \tag{6.66} \]
(Caution! Your intermediate expressions will probably involve
three distinct variables that you’ll want to call “p”. Put primes
on two of them!)
e. Put everything together to see that $\tilde{\psi}(p;t)$ obeys the
integro-differential equation
\[ \frac{\partial\tilde{\psi}(p;t)}{\partial t} = -\frac{i}{\hbar}\left[\frac{p^2}{2m}\tilde{\psi}(p;t) + \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} dp'\,\tilde{V}(p - p')\,\tilde{\psi}(p';t)\right]. \tag{6.67} \]
This form of the Schrödinger equation is particularly useful in the study
of superconductivity.
6.5 Time evolution of average quantities
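In our general treatment of time evolution we found that for any measurable
with operator $\hat{A}$, the average value $\langle\hat{A}\rangle_t$ changed with time according to
\[ \frac{d\langle\hat{A}\rangle_t}{dt} = -\frac{i}{\hbar}\langle[\hat{A}, \hat{H}]\rangle_t. \tag{6.68} \]
For the systems of this chapter,
\[ \hat{H} = \frac{1}{2m}\hat{p}^2 + V(\hat{x}), \tag{6.69} \]
where $\hat{x}$ and $\hat{p}$ satisfy the commutation relation
\[ [\hat{x}, \hat{p}] = \hat{x}\hat{p} - \hat{p}\hat{x} = i\hbar. \tag{6.70} \]
Knowing this, let’s see how the average position $\langle\hat{x}\rangle_t$ changes with time.
We must find
\[ [\hat{x}, \hat{H}] = \frac{1}{2m}[\hat{x}, \hat{p}^2] + [\hat{x}, V(\hat{x})]. \]
The commutator $[\hat{x}, V(\hat{x})]$ is easy:
\[ [\hat{x}, V(\hat{x})] = \hat{x}V(\hat{x}) - V(\hat{x})\hat{x} = 0. \]
And the commutator $[\hat{x}, \hat{p}^2]$ is not much harder. We use the known
commutator for $[\hat{x}, \hat{p}]$ to write
\[ \hat{x}\hat{p}^2 = (\hat{x}\hat{p})\hat{p} = (\hat{p}\hat{x} + i\hbar)\hat{p} = \hat{p}\hat{x}\hat{p} + i\hbar\hat{p}, \]
and then use it again to write
\[ \hat{p}\hat{x}\hat{p} = \hat{p}(\hat{x}\hat{p}) = \hat{p}(\hat{p}\hat{x} + i\hbar) = \hat{p}^2\hat{x} + i\hbar\hat{p}. \]
Together we have
\[ \hat{x}\hat{p}^2 = \hat{p}^2\hat{x} + 2i\hbar\hat{p} \]
or
\[ [\hat{x}, \hat{p}^2] = 2i\hbar\hat{p}. \]
Plugging these commutators into the time-evolution result, we get
\[ \frac{d\langle\hat{x}\rangle_t}{dt} = -\frac{i}{\hbar}\,\frac{1}{2m}\,2i\hbar\langle\hat{p}\rangle_t \]
or
\[ \frac{d\langle\hat{x}\rangle_t}{dt} = \frac{\langle\hat{p}\rangle_t}{m}, \tag{6.71} \]
a result that stirs our memories of classical mechanics!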
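Meanwhile, what happens for average momentum $\langle\hat{p}\rangle_t$?
\[ [\hat{p}, \hat{H}] = \frac{1}{2m}[\hat{p}, \hat{p}^2] + [\hat{p}, V(\hat{x})] = [\hat{p}, V(\hat{x})]. \]
To evaluate $[\hat{p}, V(\hat{x})]$ we use the familiar idea that if we know $\langle x|\hat{A}|\psi\rangle$ for
arbitrary $|x\rangle$ and $|\psi\rangle$, then we know everything there is to know about the
operator $\hat{A}$. In this way, examine
\begin{align*}
\langle x|[\hat{p}, V(\hat{x})]|\psi\rangle &= \langle x|\hat{p}V(\hat{x})|\psi\rangle - \langle x|V(\hat{x})\hat{p}|\psi\rangle \\
&= -i\hbar\frac{\partial}{\partial x}\langle x|V(\hat{x})|\psi\rangle - V(x)\langle x|\hat{p}|\psi\rangle \\
&= -i\hbar\frac{\partial}{\partial x}\left[V(x)\psi(x)\right] - V(x)\left[-i\hbar\frac{\partial}{\partial x}\psi(x)\right] \\
&= -i\hbar\left[\frac{\partial V(x)}{\partial x}\psi(x) + V(x)\frac{\partial\psi(x)}{\partial x} - V(x)\frac{\partial\psi(x)}{\partial x}\right] \\
&= -i\hbar\left[\frac{\partial V(x)}{\partial x}\psi(x)\right].
\end{align*}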
Now, the derivative of the classical potential energy function has a name.
It’s just (the negative of) the classical force function!
\[ F(x) = -\frac{\partial V(x)}{\partial x}. \tag{6.72} \]
Continuing the evaluation begun above,
\begin{align*}
\langle x|[\hat{p}, V(\hat{x})]|\psi\rangle &= i\hbar\left[F(x)\psi(x)\right] \\
&= i\hbar\langle x|F(\hat{x})|\psi\rangle.
\end{align*}
Because this relation holds for any $|x\rangle$ and for any $|\psi\rangle$, we know that the
operators are related as
\[ [\hat{p}, V(\hat{x})] = i\hbar F(\hat{x}). \tag{6.73} \]
Going back to the time evolution of average momentum,
\[ \frac{d\langle\hat{p}\rangle_t}{dt} = -\frac{i}{\hbar}\langle[\hat{p}, \hat{H}]\rangle_t = -\frac{i}{\hbar}\,i\hbar\langle F(\hat{x})\rangle_t \]
or
\[ \frac{d\langle\hat{p}\rangle_t}{dt} = \langle F(\hat{x})\rangle_t, \tag{6.74} \]
which is suspiciously close to Newton’s second law!
These two results together,
\[ \frac{d\langle\hat{x}\rangle_t}{dt} = \frac{\langle\hat{p}\rangle_t}{m} \tag{6.75} \]
\[ \frac{d\langle\hat{p}\rangle_t}{dt} = \langle F(\hat{x})\rangle_t, \tag{6.76} \]
which tug so strongly on our classical heartstrings, are called the Ehrenfest8
equations. There are two things you should remember about them: First,
they are exact (within the assumptions of our derivation: non-relativistic,
one-dimensional, no frictional or magnetic forces, etc.). Because they do
tug our classical heartstrings, some people get the misimpression that they
apply only in the classical limit. That’s wrong: if you go back over the
derivation you’ll see that we never made any such assumption. Second, they
are incomplete. This is because (1) knowing $\langle\hat{x}\rangle_t$ doesn’t let you calculate
$\langle F(\hat{x})\rangle_t$, because in general $\langle F(\hat{x})\rangle_t \neq F(\langle\hat{x}\rangle_t)$, and because (2) even if you
did know both $\langle\hat{x}\rangle_t$ and $\langle\hat{p}\rangle_t$, that would not give you complete knowledge
of the state.
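The statement that the average force generally differs from the force evaluated at the average position can be illustrated with a short numerical sketch. Everything specific here is an assumption made for illustration: a Gaussian probability density with mean `x0` and standard deviation `s`, a quartic potential $V(x) = x^4$ (so $F(x) = -4x^3$), and a harmonic potential $V(x) = x^2/2$ (so $F(x) = -x$), all in arbitrary units.

```python
import math


def expectation(g, x0, s, n=4000, reach=10.0):
    """Trapezoid estimate of the average of g(x) over a Gaussian probability
    density with mean x0 and standard deviation s."""
    a, b = x0 - reach * s, x0 + reach * s
    dx = (b - a) / n
    total = 0.0
    for j in range(n + 1):
        x = a + j * dx
        weight = 0.5 if j in (0, n) else 1.0
        density = math.exp(-(x - x0) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
        total += weight * g(x) * density
    return total * dx


def force_quartic(x):
    """F = -dV/dx for the assumed potential V(x) = x^4."""
    return -4.0 * x ** 3


def force_harmonic(x):
    """F = -dV/dx for the assumed potential V(x) = x^2 / 2."""
    return -x


x0, s = 1.0, 0.5
mean_x = expectation(lambda x: x, x0, s)

avg_force_quartic = expectation(force_quartic, x0, s)
force_at_mean_quartic = force_quartic(mean_x)

avg_force_harmonic = expectation(force_harmonic, x0, s)
force_at_mean_harmonic = force_harmonic(mean_x)

print("quartic: ", avg_force_quartic, "vs", force_at_mean_quartic)
print("harmonic:", avg_force_harmonic, "vs", force_at_mean_harmonic)
```

For the quartic case the two numbers differ, while for the harmonic case (force linear in $x$) they coincide. This suggests why the Ehrenfest equations close on themselves only for special potentials.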
8 Paul Ehrenfest (1880–1933), Austrian and Dutch theoretical physicist. He was known
particularly for asking probing questions that clarified the essence and delineated the
unsolved problems of any matter under discussion. Particularly in this mode of ques-
tioner, he played a central role in the development of relativity, of quantum mechanics,
and of statistical mechanics. He died tragically by his own hand.
Problems
6.6 Alternative derivation
Derive result (6.73) by expanding $V(x)$ in a Taylor series.
6.7 Choice of sign for momentum operator
If we had taken the opposite sign choice for the momentum operator
at equation (6.29) (call this choice $\hat{p}_2$), then what would have been the
commutator $[\hat{x}, \hat{p}_2]$? What would have been the result (6.71)?
6.8 Quantities in the Hamiltonian
When we derived equation (6.19) we were left with an undetermined
number nd and an undetermined function v(x). Repeat the derivation
of the Ehrenfest equations with this form of the Schrödinger equation to
determine that number and function by demanding the correct classical
limit.
Chapter 7
The Free Particle
7.1 Problems
7.1 Energy eigenstates

In lecture we examined the behavior of a free particle in a state of
definite momentum. Such states have a definite energy, but they are
not the only possible states of definite energy.
a. Show that the state
\[ |\rho(0)\rangle = A|+p_0\rangle + B|-p_0\rangle, \tag{7.1} \]
where $|A|^2 + |B|^2 = 1$, has definite energy $E(p_0) = p_0^2/2m$. (That
is, $|\rho(0)\rangle$ is an energy eigenstate with eigenvalue $p_0^2/2m$.)
b. Show that the “wavefunction” corresponding to $|\rho(t)\rangle$ evolves in
time as
\[ \rho(x;t) = \frac{1}{\sqrt{2\pi\hbar}}\left[A e^{i(+p_0 x - E(p_0)t)/\hbar} + B e^{i(-p_0 x - E(p_0)t)/\hbar}\right]. \tag{7.2} \]
I use the term wavefunction in quotes because $\rho(x;t)$ is
not $\langle x|\text{normal state}\rangle$ but rather a sum of two terms like
$\langle x|\text{continuum basis state}\rangle$.
c. Show that the “probability density” $|\rho(x;t)|^2$ is independent of
time and given by
\[ |\rho(x;t)|^2 = \frac{1}{2\pi\hbar}\left[1 + 2\,\Re e\{A^*B\}\cos\left(\frac{2p_0 x}{\hbar}\right) + 2\,\Im m\{A^*B\}\sin\left(\frac{2p_0 x}{\hbar}\right)\right]. \tag{7.3} \]
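7.2 A useful integral
Using $\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}$, show that
\[ \int_{-\infty}^{\infty} e^{-u^2\alpha^2/2}\, e^{iuy}\,du = \frac{\sqrt{2\pi}}{\alpha}\, e^{-y^2/2\alpha^2} \tag{7.4} \]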
where $\alpha$ may be complex, but $\Re e\{\alpha^2\} > 0$. Hint: Complete the square
by writing
\[ -\frac{u^2\alpha^2}{2} + iuy = -\left(\frac{u\alpha}{\sqrt{2}} - i\frac{y}{\sqrt{2}\,\alpha}\right)^2 - \frac{y^2}{2\alpha^2}. \]
Note: If $c$ is a real number independent of $x$, you know that
\[ \lim_{x\to\infty} (x + c) = \infty. \]
You might think that a different limit would result if the additive constant
$c$ were complex, but in fact, that is not the case:
\[ \lim_{x\to\infty} (x + ic) = \infty. \]
It is not unusual for the limit of a sequence of complex numbers to be
real.
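7.3 A somewhat less useful integral
Given $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$, show that
\[ \int_{-\infty}^{\infty} x^2 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}. \tag{7.5} \]
Hint: $\int_{-\infty}^{\infty} x^2 e^{-x^2}\,dx = 2\int_0^{\infty} x^2 e^{-x^2}\,dx$, then integrate by parts.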
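7.4 Static properties of a Gaussian wavepacket
Consider the wavefunction
\[ \psi(x;0) = \frac{A}{\sqrt{\sigma}}\, e^{-x^2/2\sigma^2}\, e^{i(p_0/\hbar)x}. \tag{7.6} \]
a. Show that the wavefunction is properly normalized when $A = 1/\sqrt[4]{\pi}$.
b. Show that in this state $\langle\hat{x}\rangle = 0$ (trivial), and
$\Delta x = \sqrt{\langle(\hat{x} - \langle\hat{x}\rangle)^2\rangle} = \sigma/\sqrt{2}$ (easy).
c. Use equation (7.4) to show that
\[ \tilde{\psi}(p;0) = A\sqrt{\frac{\sigma}{\hbar}}\, e^{-(p-p_0)^2\sigma^2/2\hbar^2}. \tag{7.7} \]
d. Hence show that $\langle\hat{p}\rangle = p_0$ and $\Delta p = \hbar/(\sqrt{2}\,\sigma)$.
7.5 Force-free motion of a Gaussian wavepacket
A particle with the initial wavefunction given in the previous problem
evolves as
\[ \tilde{\psi}(p;t) = e^{-(i/\hbar)E(p)t}\,\tilde{\psi}(p;0) \tag{7.8} \]
so that
\[ \psi(x;t) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} e^{i(px - E(p)t)/\hbar}\,\tilde{\psi}(p;0)\,dp. \tag{7.9} \]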
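a. Plug in $\tilde{\psi}(p;0)$ and change the integration variable to $k$ where
$\hbar k = p - p_0$ in order to show that
\[ \psi(x;t) = A\sqrt{\frac{\sigma}{2\pi}}\, e^{i(p_0 x - E(p_0)t)/\hbar} \int_{-\infty}^{\infty} e^{-k^2(\sigma^2 + i\frac{\hbar t}{m})/2}\, e^{ik(x - \frac{p_0}{m}t)}\,dk. \tag{7.10} \]
Hint: Change variable first to $p' = p - p_0$, then to $k = p'/\hbar$.
b. Define the complex dimensionless quantity
\[ \beta = 1 + i\frac{\hbar t}{m\sigma^2} \tag{7.11} \]
and evaluate the integral using equation (7.4), giving
\[ \psi(x;t) = A\frac{1}{\sqrt{\sigma\beta}}\, e^{i(p_0 x - E(p_0)t)/\hbar}\, e^{-(x - \frac{p_0}{m}t)^2/2\sigma^2\beta}. \tag{7.12} \]
c. Hence show that
\[ |\psi(x;t)|^2 = \frac{1}{\sqrt{\pi}\,\sigma|\beta|}\, e^{-(x - \frac{p_0}{m}t)^2/\sigma^2|\beta|^2}. \tag{7.13} \]
By comparing $|\psi(x;t)|^2$ with $|\psi(x;0)|^2$, read off the results
\[ \langle\hat{x}\rangle = \frac{p_0}{m}t, \qquad \Delta x = \frac{\sigma|\beta|}{\sqrt{2}} = \frac{\sigma}{\sqrt{2}}\sqrt{1 + \left(\frac{\hbar}{m\sigma^2}t\right)^2}. \tag{7.14} \]
(No computation is required!)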
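To get a feel for the size of the spreading described by equation (7.14), here is a minimal sketch that simply evaluates that formula. The choice of an electron and of an initial width of $10^{-10}$ m is an assumption made for illustration, and `delta_x` and `t_double` are hypothetical names.

```python
import math

HBAR = 1.054571817e-34          # J s
M_ELECTRON = 9.1093837015e-31   # kg (assumed particle)


def delta_x(t, sigma, m=M_ELECTRON):
    """Position uncertainty at time t, evaluated from equation (7.14)."""
    return sigma / math.sqrt(2) * math.sqrt(1 + (HBAR * t / (m * sigma ** 2)) ** 2)


sigma = 1.0e-10   # m, assumed initial width (roughly one atomic diameter)

# Time at which hbar*t/(m*sigma^2) = sqrt(3), so that delta_x has doubled.
t_double = math.sqrt(3) * M_ELECTRON * sigma ** 2 / HBAR

for t in (0.0, t_double, 10 * t_double):
    print(f"t = {t:.3e} s    delta_x = {delta_x(t, sigma):.3e} m")
```

Because the characteristic time scales as $m\sigma^2/\hbar$, a heavier particle or a wider initial packet spreads more slowly.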
Chapter 8
Square Wells
Will include both infinite and finite square wells. See the problem on “Scal-
ing” below.
8.1 The infinite square well

You are no doubt familiar with the energy eigenproblem for the infinite
square well, but a short review is nevertheless in order. Consider a
well of width $L$, and place the origin at the left edge of the well. The
mathematical problem is to find values $E_n$ and functions $\eta_n(x)$ such
that a solution to
\[ -\frac{\hbar^2}{2m}\frac{d^2\eta_n}{dx^2} = E_n\eta_n(x) \tag{8.1} \]
has
\[ \eta_n(0) = \eta_n(L) = 0. \tag{8.2} \]
Show that the eigenvalues are
\[ E_n = n^2\frac{\pi^2\hbar^2}{2mL^2}, \qquad n = 1, 2, 3, 4, \ldots \tag{8.3} \]
and the eigenfunctions are
\[ \eta_n(x) = \sqrt{\frac{2}{L}}\sin\left(n\pi\frac{x}{L}\right). \tag{8.4} \]
Be sure to explain carefully why negative and zero values of $n$ are not
used. Note that $\eta_n(x)$ is even under reflection about the well center for
$n$ odd, odd for $n$ even.
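The stated eigenvalues and eigenfunctions can be explored numerically before attempting the derivation. This sketch only evaluates formulas (8.3) and (8.4) and checks normalization and orthogonality by quadrature. It is not a substitute for the requested proof, and the electron mass and the 1 nm well width are assumptions made for illustration.

```python
import math

HBAR = 1.054571817e-34          # J s
M_ELECTRON = 9.1093837015e-31   # kg (assumed particle)
L_WELL = 1.0e-9                 # m (assumed well width)


def energy(n, m=M_ELECTRON, width=L_WELL):
    """Energy eigenvalue from equation (8.3)."""
    return n ** 2 * math.pi ** 2 * HBAR ** 2 / (2 * m * width ** 2)


def eta(n, x, width=L_WELL):
    """Energy eigenfunction from equation (8.4)."""
    return math.sqrt(2 / width) * math.sin(n * math.pi * x / width)


def overlap(n1, n2, width=L_WELL, steps=2000):
    """Trapezoid estimate of the integral of eta_n1(x) * eta_n2(x) over the well."""
    dx = width / steps
    total = 0.0
    for j in range(steps + 1):
        weight = 0.5 if j in (0, steps) else 1.0
        x = j * dx
        total += weight * eta(n1, x, width) * eta(n2, x, width)
    return total * dx


print("E_2 / E_1 =", energy(2) / energy(1))
print("norm of eta_1 =", overlap(1, 1))
print("overlap of eta_1 and eta_2 =", overlap(1, 2))
```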
8.2 Ground state energy for the infinite square well
Make up a problem like problem 9.1.
8.3 Characteristics of the ground energy level

The ground state energy for the infinite square well is
\[ \frac{\pi^2\hbar^2}{2mL^2}. \]
Does it make sense that. . .
a. . . . this energy vanishes as $\hbar \to 0$? (Clue: Consider the classical
limit.)
b. . . . this energy vanishes as $L \to \infty$? (Clue: Think about the
Heisenberg indeterminacy relation. Compare problem 8.2.)
c. . . . this energy varies as $1/m$?
8.4 Scaling
We’ve seen that a normalizable solution of the energy eigenequation
for a square well relies on a tradeoff between the well width and the
candidate energy E. It makes sense that a suitable change in width
can be offset by a suitable change in E. This problem explores that
tradeoff.
Suppose $\eta(x)$ is a solution to
\[ -\frac{\hbar^2}{2m}\frac{d^2\eta}{dx^2} + V(x)\eta(x) = E\eta(x) \]
with a particular energy $E$. Now consider a different potential energy
function $U(x) = s^2 V(sx)$. For example, if $s = 2$ then $U(x)$ has the
same shape as $V(x)$, but has half the width and four times the height.
Call $\sigma(x) = \eta(sx)$. Show that $\sigma(x)$ solves the energy eigenproblem for
potential energy $U(x)$, but with a different energy. Find that energy.
8.1 What does an electron look like?
One electron is in an infinite square well, in energy eigenstate n = 2, with
a node right in the center of the well. The electron has probability 1/2 of
being found in the left half of the well, and probability 1/2 of being found in
the right half of the well, but no probability density at all of being found at
the center. How can the electron move from the left half to the right half
without passing through the center?
This question betrays a deep-seated misconception about quantum mechanics.
It arises from incorrectly thinking that the electron in energy
eigenstate n = 2 has a definite position that we don’t know (or that is
changing erratically). That’s a wrong picture: the electron doesn’t have a
position.
The “passing through nodes” question doesn’t have an answer because
the question assumes an erroneous picture for the character of an electron.
It is as silly and as unanswerable as the question “If love is blue and passion
is red-hot, how can passionate love exist?”
A similar conundrum is this one: “Suppose I start with an electron in
a state so that it has equal probability of being anywhere in a box. If I
shine a strong light throughout the entire box I will find the electron at
only one point. But what happens if I shine the light on only the left half
of the box, and don’t find the electron? I now know that the electron is
somewhere in the right half. How could the light, shining where the electron
isn’t, affect the electron?” This conundrum (called the Renninger negative-
result experiment) has the same resolution as the “passage through nodes”
conundrum: namely, the conundrum arises from an incorrect visualization
of the electron as a hard, tiny marble. Before the light shining, the electron
didn’t have a position. Instead the amplitude to be found in the left half
had the same magnitude as the amplitude to be found in the right half.
Problems
8.5 Paradox?
a. The year is 1492, and you are discussing with a friend the radical
idea that the earth is round. “This idea can’t be correct,” objects
your friend, “because it contains a paradox. If it were true, then a
traveler moving always due east would eventually arrive back at his
starting point. Anyone can see that that’s not possible!” Convince
your friend that this paradox is not an internal inconsistency in the
round-earth idea, but an inconsistency between the round-earth
idea and the picture of the earth as a plane, a picture which your
friend has internalized so thoroughly that he can’t recognize it as
an approximation rather than the absolute truth.
b. The year is 2092, and you are discussing with a friend the radical
idea of quantal interference. “This idea can’t be correct,” objects
your friend, “because it contains a paradox. If it were true, then
an atom passing through branch a would have to know whether
branch b were open or blocked. Anyone can see that that’s not
possible!” Convince your friend that this paradox is not an internal
inconsistency in quantum mechanics, but an inconsistency
between quantal ideas and the picture of an atom as a hard little
marble that always has a definite position, a picture which your
friend has internalized so thoroughly that he can’t recognize it as
an approximation rather than the absolute truth.
Chapter 9
The Simple Harmonic Oscillator
9.1 Resume of energy eigenproblem
The energy eigenproblem for the simple harmonic oscillator is
\[ -\frac{\hbar^2}{2m}\frac{d^2\eta_n(x)}{dx^2} + \frac{m\omega^2}{2}x^2\eta_n(x) = E_n\eta_n(x). \tag{9.1} \]
This is a second-order linear ordinary differential equation, and the theory
of differential equations assures us that for every value of $E_n$, there are two
linearly independent solutions to this equation.
This does not, however, mean that every $E_n$ is an energy eigenvalue
with two energy eigenfunctions. Nearly all of these solutions turn out to
be unnormalizable,
\[ \int_{-\infty}^{+\infty} \eta^*(x)\eta(x)\,dx = \infty, \]
so they do not represent physical states. The problem of solving the energy
eigenproblem is simply the problem of plowing through the vast haystack
of solutions of (9.1) to find those few needles with finite norm.
9.2 Solution of the energy eigenproblem: Differential equation approach
Problem: Given $m$ and $\omega$, find values $E_n$ such that the corresponding
solutions $\eta_n(x)$ of
\[ -\frac{\hbar^2}{2m}\frac{d^2\eta_n(x)}{dx^2} + \frac{m\omega^2}{2}x^2\eta_n(x) = E_n\eta_n(x) \tag{9.2} \]
are normalizable wavefunctions. Such $E_n$ are the energy eigenvalues, and
the corresponding solutions $\eta_n(x)$ are energy eigenfunctions.
Strategy: The following four-part strategy is effective for most differential
equation eigenproblems:
(1) Convert to dimensionless variable.
(2) Remove asymptotic behavior of solutions.
(3) Find non-asymptotic behavior using the series method.
(4) Invoke normalization to terminate the series as a polynomial.
In this treatment, I’ll play fast and loose with asymptotic analysis. But
everything I’ll do is both reasonable and rigorously justifiable. (See C.M. Bender
and S.A. Orszag, Advanced Mathematical Methods for Scientists and
Engineers, McGraw-Hill, New York, 1978.)
1. Convert to dimensionless variable: The only combination of $m$, $\omega$,
and $\hbar$ with the dimensions of [length] is $\sqrt{\hbar/m\omega}$. Hence define the
dimensionless variable proportional to length
\[ q = \sqrt{\frac{m\omega}{\hbar}}\,x. \tag{9.3} \]
In terms of this variable, the ordinary differential equation (9.2) is
\[ \frac{d^2\eta_n(q)}{dq^2} + \left(\frac{2E_n}{\hbar\omega} - q^2\right)\eta_n(q) = 0. \tag{9.4} \]
Exercise: We’re using this equation merely as a stepping-stone to reach

the full answer, but in fact it contains a lot of information already. For
example, suppose we had two electrons in two far-apart simple harmonic
oscillators, the second one with three times the “stiffness” of the first
(that is, the spring constants are related through k(2) = 3k(1) ). We don’t
yet know the energy of the fourth excited state for either oscillator, yet
we can easily find their ratio. What is it?
2. Remove asymptotic behavior of solutions: Consider the limit as $q^2 \to \infty$.
In this limit, the ODE (9.4) “becomes”
\[ \frac{d^2\eta_n(q)}{dq^2} - q^2\eta_n(q) = 0, \tag{9.5} \]
but it is hard to solve even this simplified equation! Fortunately, it’s not
necessary to find an exact solution, only to find the asymptotic character
of the solutions.
Pick the trial solution
\[ f_n(q) = e^{-q^2/2}. \tag{9.6} \]
When we test to see whether this is a solution, we find
\[ \frac{d^2 f_n(q)}{dq^2} - q^2 f_n(q) = \left[q^2 e^{-q^2/2} - e^{-q^2/2}\right] - q^2 e^{-q^2/2} = -e^{-q^2/2}. \]
So the function (9.6) does not solve the ODE (9.5). On the other hand, the
amount by which it “misses” solving (9.5) is small in the sense that
\[ \lim_{q^2\to\infty} \frac{d^2 f/dq^2 - q^2 f}{q^2 f} = \lim_{q^2\to\infty} \frac{-e^{-q^2/2}}{q^2 e^{-q^2/2}} = \lim_{q^2\to\infty} \frac{-1}{q^2} = 0. \]
A similar result holds for $g_n(q) = e^{+q^2/2}$.
Our conclusion is that, in the limit $q^2 \to \infty$, the solution $\eta_n(q)$ behaves
like
\[ \eta_n(q) \approx A e^{-q^2/2} + B e^{+q^2/2}. \]
If $B \neq 0$, then $\eta_n(q)$ will not be normalizable because the probability
density would become infinite as $q^2 \to \infty$. Thus the solutions we want
(the normalizable solutions) behave like
\[ \eta_n(q) \approx A e^{-q^2/2} \]
in the limit that $q^2$ becomes very large.
The three paragraphs above motivate us to define a new function $v_n(q)$
through
\[ \eta_n(q) = e^{-q^2/2}\, v_n(q). \tag{9.7} \]
(I could have just produced this definition by fiat, without motivation.
But then you wouldn’t know how to come up with the proper motivation
yourself when you’re faced with a new and unfamiliar differential equation.)
In terms of this new function, the exact ODE (9.4) becomes
\[ \frac{d^2 v_n(q)}{dq^2} - 2q\frac{dv_n(q)}{dq} + \left(\frac{2E_n}{\hbar\omega} - 1\right)v_n(q) = 0. \tag{9.8} \]
For brevity we introduce the shorthand notation
\[ e_n = \frac{2E_n}{\hbar\omega} - 1. \tag{9.9} \]
3. Find non-asymptotic behavior using the series method: Okay, but
how are we going to solve equation (9.8) for $v_n(q)$? Through the power
series method!
Try a solution of the form
\begin{align*}
v(q) &= \sum_{k=0}^{\infty} a_k q^k \\
v'(q) &= \sum_{k=0}^{\infty} k a_k q^{k-1}, \qquad q v'(q) = \sum_{k=0}^{\infty} k a_k q^k \\
v''(q) &= \sum_{k=0}^{\infty} k(k-1) a_k q^{k-2} && \text{[[ note that first two terms vanish . . . ]]} \\
&= \sum_{k=2}^{\infty} k(k-1) a_k q^{k-2} && \text{[[ change summation index to $k' = k - 2$ . . . ]]} \\
&= \sum_{k'+2=2}^{\infty} (k'+2)(k'+1) a_{k'+2}\, q^{k'} && \text{[[ rename dummy index $k'$ to $k$ . . . ]]} \\
&= \sum_{k=0}^{\infty} (k+2)(k+1) a_{k+2}\, q^k
\end{align*}
Then equation (9.8) becomes
\[ \sum_{k=0}^{\infty} \left[(k+2)(k+1) a_{k+2} - 2k a_k + e_n a_k\right] q^k = 0. \tag{9.10} \]
All of the terms in square brackets must vanish, whence the recursion relation
\[ a_{k+2} = \frac{2k - e_n}{(k+2)(k+1)}\, a_k, \qquad k = 0, 1, 2, \ldots \tag{9.11} \]
Like any second order ODE, equation (9.8) has two linearly independent
solutions:
• An even solution of equation (9.8) comes by taking $a_0 = 1$, $a_1 = 0$. It is
\[ v^{(e)}(q) = 1 - \frac{e_n}{2!}q^2 + \frac{(e_n - 4)e_n}{4!}q^4 - \frac{(e_n - 8)(e_n - 4)e_n}{6!}q^6 + \cdots. \]
• An odd solution of equation (9.8) comes by taking $a_0 = 0$, $a_1 = 1$. It is
\[ v^{(o)}(q) = q - \frac{e_n - 2}{3!}q^3 + \frac{(e_n - 6)(e_n - 2)}{5!}q^5 - \frac{(e_n - 10)(e_n - 6)(e_n - 2)}{7!}q^7 + \cdots. \]
What is the asymptotic behavior of such solutions $v_n(q)$ as $q^2 \to \infty$? Well,
the large $q$ behavior will be dominated by the high-order terms of the series.
Generally, as $k \to \infty$,
$$\frac{a_{k+2}}{a_k} = \frac{2k - e_n}{(k+2)(k+1)} \to \frac{2}{k}. \tag{9.12}$$
Compare this behavior to the expansion
$$e^{q^2} = b_0 + b_2 q^2 + b_4 q^4 + \cdots \tag{9.13}$$
which has
$$\frac{b_{k+2}}{b_k} = \frac{1}{(k/2) + 1} \to \frac{2}{k}. \tag{9.14}$$
So whenever this happens,
$$v_n(q) \approx e^{q^2} \qquad\text{and}\qquad \eta_n(q) = e^{-q^2/2}\, v_n(q) \approx e^{q^2/2},$$
thus giving us the very same unnormalizable behavior we’ve been trying
so hard to avoid!
Is there no way to repair the situation?
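The runaway growth can be seen numerically. The following is a minimal sketch, not part of the original derivation: it sums the even series generated by recursion relation (9.11) and multiplies by $e^{-q^2/2}$. The function names and the cutoff `max_terms` are illustrative assumptions. For $e_n = 1$ (a non-terminating series) the candidate wavefunction is already large at $q = 4$, while for $e_n = 0$ and $e_n = 4$ the series terminates and the candidate stays small.

```python
import math

def series_sum(q, e_n, parity=0, max_terms=200):
    """Sum the power series for v(q) generated by recursion relation (9.11).

    parity=0 starts from a_0 = 1 (even solution), parity=1 from a_1 = 1
    (odd solution). max_terms is an illustrative cutoff (an assumption).
    """
    k = parity
    term = q**parity            # current term a_k * q^k
    total = 0.0
    for _ in range(max_terms):
        total += term
        # a_{k+2} q^{k+2} = [(2k - e_n) / ((k+2)(k+1))] * q^2 * (a_k q^k)
        term *= (2 * k - e_n) / ((k + 2) * (k + 1)) * q**2
        k += 2
    return total

def eta(q, e_n, parity=0):
    """Candidate wavefunction eta(q) = exp(-q^2/2) * v(q), unnormalized."""
    return math.exp(-q**2 / 2) * series_sum(q, e_n, parity)

for e_n in (0.0, 1.0, 4.0):
    print(e_n, eta(4.0, e_n))
```

Accumulating each term from the previous one avoids forming large powers of $q$ explicitly.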
4. Invoke normalization to terminate the series as a polynomial: The
candidate wavefunction $\eta_n(q)$ is not normalizable when $a_{k+2}/a_k \to 2/k$
(see equation (9.12)). There is only one way to avoid this limit: when the
series for $v_n(q)$ terminates as a polynomial. This termination occurs when,
for some non-negative integer $n$, we have $2n = e_n$ whence (by recursion
relation (9.11)), $a_k = 0$ for all $k > n$. Hence the only physical states
correspond to energies with
$$2n = e_n = \frac{2E_n}{\hbar\omega} - 1.$$
Or, rephrasing,
Energy (eigen)states can exist only if they correspond to the energy
(eigen)values
$$E_n = \hbar\omega\left(n + \tfrac{1}{2}\right) \qquad n = 0, 1, 2, 3, \ldots \tag{9.15}$$
What are the wavefunctions of the energy eigenstates?
For $n$ even, $v_n^{(e)}(q)$ terminates and $v_n^{(o)}(q)$ doesn’t.
For $n$ odd, $v_n^{(o)}(q)$ terminates and $v_n^{(e)}(q)$ doesn’t.
By tradition one defines the Hermite¹ polynomial of $n$th order $H_n(q)$:
$$n \text{ even:} \qquad H_n(q) = (-1)^{n/2}\, \frac{n!}{(n/2)!}\, v_n^{(e)}(q) \tag{9.16}$$
$$n \text{ odd:} \qquad H_n(q) = (-1)^{(n-1)/2}\, \frac{2\,n!}{((n-1)/2)!}\, v_n^{(o)}(q) \tag{9.17}$$
so that
$$\eta_n(x) = C_n e^{-q^2/2} H_n(q) \qquad q = \sqrt{\frac{m\omega}{\hbar}}\, x \tag{9.18}$$
where $C_n$ is a normalization factor.
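As a check on the recursion and on the scale factors in (9.16) and (9.17), here is a short sketch (the function names are illustrative assumptions, not part of the text). It builds the terminating series with $e_n = 2n$ from recursion relation (9.11), rescales it, and compares the result with the physicists' Hermite polynomial coefficients supplied by NumPy's `numpy.polynomial.hermite.herm2poly`.

```python
from math import factorial

import numpy as np
from numpy.polynomial import hermite

def series_coefficients(n):
    """Coefficients a_0..a_n of the terminating series v_n(q), with e_n = 2n."""
    e_n = 2 * n
    a = [0.0] * (n + 1)
    a[n % 2] = 1.0                       # a_0 = 1 (n even) or a_1 = 1 (n odd)
    for k in range(n % 2, n - 1, 2):     # recursion relation (9.11)
        a[k + 2] = (2 * k - e_n) / ((k + 2) * (k + 1)) * a[k]
    return np.array(a)

def hermite_from_series(n):
    """Rescale v_n(q) by the factors in equations (9.16) and (9.17)."""
    if n % 2 == 0:
        scale = (-1) ** (n // 2) * factorial(n) / factorial(n // 2)
    else:
        half = (n - 1) // 2
        scale = (-1) ** half * 2 * factorial(n) / factorial(half)
    return scale * series_coefficients(n)

for n in range(5):
    print(n, hermite_from_series(n))     # coefficients of q^0, q^1, ..., q^n
```

For example, `hermite_from_series(2)` gives the coefficients of $4q^2 - 2$.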
9.3 Solution of the energy eigenproblem: Operator factorization approach
The differential equation approach works. It’s hard. It’s inefficient in that
we find an infinite number of solutions and then throw most of them away.
It’s dependent on a particular representation. Worst of all, it’s hard to use.
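For example, suppose we wanted to find the expected value of the potential
energy in the $n$-th energy eigenstate. We would find
$$\langle \hat U \rangle_n = \frac{m\omega^2}{2} \langle \eta_n | \hat x^2 | \eta_n \rangle = \frac{m\omega^2}{2} \int_{-\infty}^{+\infty} x^2\, \eta_n^2(x)\, dx = \frac{m\omega^2}{2}\, \frac{\hbar}{m\omega}\, \frac{\int_{-\infty}^{+\infty} q^2 e^{-q^2} H_n^2(q)\, dq}{\int_{-\infty}^{+\infty} e^{-q^2} H_n^2(q)\, dq}.$$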
Unless you happen to love integrating the Hermite polynomials, these last
two integrals are intimidating.
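If you would rather not integrate by hand, the ratio of the two integrals can be evaluated numerically. The sketch below is an illustration under stated assumptions (the function name `mean_q_squared` and the choice of node count are mine): it uses Gauss-Hermite quadrature from NumPy, which is exact for a polynomial times the weight $e^{-q^2}$ when enough nodes are used. The ratio comes out as $n + \frac{1}{2}$, so that $\langle \hat U \rangle_n = \frac{1}{2}\hbar\omega(n + \frac{1}{2})$, half of the energy (9.15).

```python
import numpy as np
from numpy.polynomial import hermite

def mean_q_squared(n, nodes=None):
    """Ratio of the two integrals above (dimensionless), by quadrature."""
    if nodes is None:
        nodes = n + 2                    # exact for degree <= 2*nodes - 1
    q, w = hermite.hermgauss(nodes)      # nodes, weights for weight exp(-q^2)
    coeffs = [0] * n + [1]               # Hermite-series coefficients of H_n
    h = hermite.hermval(q, coeffs)       # H_n evaluated at the nodes
    return np.sum(w * q**2 * h**2) / np.sum(w * h**2)

for n in range(4):
    print(n, mean_q_squared(n))
```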
I’ll show you a method, invented by Dirac (or was it Schrödinger?),
which avoids all these problems. On the other hand the method is hard to
motivate: It clearly springs from the mind of genius.
Start with the Hamiltonian
$$\hat H = \frac{1}{2m}\hat p^2 + \frac{m\omega^2}{2}\hat x^2. \tag{9.19}$$
Since we’re in a mathematical mode, it makes sense to define the dimensionless
operators
$$\hat X = \sqrt{\frac{m\omega}{2\hbar}}\, \hat x \qquad\text{and}\qquad \hat P = \frac{1}{\sqrt{2m\hbar\omega}}\, \hat p, \tag{9.20}$$
which satisfy
$$[\hat X, \hat P] = \sqrt{\frac{m\omega}{2\hbar}}\, \frac{1}{\sqrt{2m\hbar\omega}}\, [\hat x, \hat p] = \frac{i}{2}\hat 1, \tag{9.21}$$
and write
$$\hat H = \hbar\omega(\hat X^2 + \hat P^2). \tag{9.22}$$
¹ Biographical information on Charles Hermite is given on page 108.
Now, one of the most fundamental tools of problem solving is to break
something complex into its simpler pieces. (“All Gaul is divided into three
parts.”) If you had an expression like
$$x^2 - p^2$$
you might well break it into simpler pieces as
$$(x - p)(x + p).$$
Slightly less intuitive would be to express
$$x^2 + p^2$$
as
$$(x - ip)(x + ip).$$
But in our case, we’re factoring an operator, and we have to take care with
the order of the factors when we expand the expression
$$(\hat X - i\hat P)(\hat X + i\hat P) = \hat X^2 + i\hat X\hat P - i\hat P\hat X + \hat P^2 = \hat X^2 + i[\hat X, \hat P] + \hat P^2 = \hat X^2 + \hat P^2 - \tfrac{1}{2}\hat 1. \tag{9.23}$$
So we haven’t quite succeeded in factorizing our Hamiltonian (there’s a
bit left over due to non-commuting operators) but the result is
$$\hat H = \hbar\omega\left[(\hat X - i\hat P)(\hat X + i\hat P) + \tfrac{1}{2}\right]. \tag{9.24}$$
From here, define
$$\hat a = \hat X + i\hat P. \tag{9.25}$$
The Hermitian adjoint of $\hat a$ is
$$\hat a^\dagger = \hat X - i\hat P. \tag{9.26}$$
Note that the operators $\hat a$ and $\hat a^\dagger$ are not Hermitian. There is no observable
corresponding to $\hat a$. The commutator is
$$[\hat a, \hat a^\dagger] = \hat 1. \tag{9.27}$$
Exercise: Verify the above commutator.

And in terms of $\hat a$ and $\hat a^\dagger$, the Hamiltonian is
$$\hat H = \hbar\omega\left(\hat a^\dagger \hat a + \tfrac{1}{2}\right). \tag{9.28}$$
Our task: Using only the fact that $[\hat a, \hat a^\dagger] = \hat 1$, where $\hat a^\dagger$ is the Hermitian
adjoint of $\hat a$, solve the energy eigenproblem for $\hat H = \hbar\omega(\hat a^\dagger \hat a + \tfrac{1}{2})$.
We will do this by solving the eigenproblem for the operator $\hat N = \hat a^\dagger \hat a$.
Once this eigenproblem is solved, we can immediately read off the solution of the
eigenproblem for $\hat H$. So, we look for the eigenvectors $|n\rangle$ with eigenvalues
$n$ such that
$$\hat N |n\rangle = n |n\rangle. \tag{9.29}$$
Because $\hat N$ is Hermitian, its eigenvalues are real. Furthermore, they are
non-negative because (where we define the vector $|\phi\rangle$ through $|\phi\rangle = \hat a|n\rangle$)
$$n = \langle n|\hat N|n\rangle = \langle n|\hat a^\dagger \hat a|n\rangle = \langle n|\hat a^\dagger|\phi\rangle = \langle \phi|\hat a|n\rangle^* = \langle \phi|\phi\rangle^* \ge 0. \tag{9.30}$$
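Now I don’t know much about energy state $|n\rangle$, but I do know that at
least one exists. So for this particular one, I can ask “What is $\hat a|n\rangle$?”. Well,
$$\begin{aligned}
\hat a|n\rangle &= \hat 1 \hat a|n\rangle \\
&= (\hat a \hat a^\dagger - \hat a^\dagger \hat a)\hat a|n\rangle \\
&= \hat a \hat N|n\rangle - \hat N \hat a|n\rangle \\
&= n\hat a|n\rangle - \hat N \hat a|n\rangle.
\end{aligned}$$
So if I define $|\phi\rangle = \hat a|n\rangle$ (an unnormalized vector), then
$$|\phi\rangle = n|\phi\rangle - \hat N|\phi\rangle$$
$$\hat N|\phi\rangle = n|\phi\rangle - |\phi\rangle = (n - 1)|\phi\rangle.$$
In other words, the vector $|\phi\rangle$ is an eigenvector of $\hat N$ with eigenvalue $n - 1$.
Wow!
$$|\phi\rangle = C|n - 1\rangle.$$
We need to find the normalization constant $C$:
$$\langle \phi|\phi\rangle = |C|^2 \langle n - 1|n - 1\rangle = |C|^2$$
$$\langle \phi|\phi\rangle = \langle n|\hat a^\dagger \hat a|n\rangle = \langle n|\hat N|n\rangle = n.$$
So $C = \sqrt{n}$ and
$$\hat a|n\rangle = \sqrt{n}\, |n - 1\rangle \tag{9.31}$$
The operator $\hat a$ is called a “lowering operator”.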
So, we started off with one eigenstate $|n\rangle$. We applied $\hat a$ to get another
eigenstate, one with a smaller eigenvalue. We can apply $\hat a$ to this new state
to get yet another eigenstate with an even smaller eigenvalue. But this
seems to raise a paradox. We saw at equation (9.30) that the eigenvalues
were positive or zero. This seems to present a mechanism for getting negative
eigenvalues; in fact, eigenvalues as small as desired! For example if we
started with a state of eigenvalue 2.3, we could lower it to produce a state
of eigenvalue 1.3. We could lower this to produce a state of eigenvalue 0.3,
and we could lower once more to produce a state of eigenvalue −0.7. But
we know there are no states with negative eigenvalues! Thus there can’t be
any states of eigenvalue 2.3 to start off with.
However, if we start with a state of eigenvalue 2, we could lower that to
get $|1\rangle$, lower that to get $|0\rangle$, and what happens when we try to lower $|0\rangle$?
From equation (9.31), we find
$$\hat a|0\rangle = \sqrt{0}\, |{-1}\rangle = 0.$$
When we lower the state $|0\rangle$, we don’t get the state $|{-1}\rangle$. Instead we get
nothing!
In conclusion, there are no fractional eigenvalues. The only eigenvalues
are the non-negative integers.
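We’ve gotten a lot out of the use of $\hat a$. What happens when we use $\hat a^\dagger$?
$$\begin{aligned}
\hat a^\dagger|n\rangle &= \hat a^\dagger \hat 1|n\rangle \\
&= \hat a^\dagger(\hat a \hat a^\dagger - \hat a^\dagger \hat a)|n\rangle \\
&= \hat N \hat a^\dagger|n\rangle - \hat a^\dagger \hat N|n\rangle \\
&= \hat N \hat a^\dagger|n\rangle - n\hat a^\dagger|n\rangle.
\end{aligned}$$
So if I define $|\chi\rangle = \hat a^\dagger|n\rangle$ (an unnormalized vector), then
$$|\chi\rangle = \hat N|\chi\rangle - n|\chi\rangle$$
$$\hat N|\chi\rangle = n|\chi\rangle + |\chi\rangle = (n + 1)|\chi\rangle.$$
In other words, the vector $|\chi\rangle$ is an eigenvector of $\hat N$ with eigenvalue $n + 1$:
$$|\chi\rangle = C|n + 1\rangle.$$
The operator $\hat a^\dagger$ is a “raising operator”!
Exercise: Find the normalization constant $C$ and conclude that
$$\hat a^\dagger|n\rangle = \sqrt{n + 1}\, |n + 1\rangle \tag{9.32}$$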
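The eigenproblem is solved entirely. Given only $[\hat a, \hat a^\dagger] = \hat 1$, where $\hat a^\dagger$ is
the Hermitian adjoint of $\hat a$, the operator
$$\hat H = \hbar\omega\left(\hat a^\dagger \hat a + \tfrac{1}{2}\right)$$
has eigenstates $|0\rangle, |1\rangle, |2\rangle, \ldots$ with eigenvalues $\hbar\omega(\tfrac{1}{2})$, $\hbar\omega(1 + \tfrac{1}{2})$, $\hbar\omega(2 + \tfrac{1}{2})$,
. . . . These eigenstates are related through
$$\hat a|n\rangle = \sqrt{n}\, |n - 1\rangle \qquad \text{“lowering operator”}$$
$$\hat a^\dagger|n\rangle = \sqrt{n + 1}\, |n + 1\rangle \qquad \text{“raising operator”}$$
The operators $\hat a$ and $\hat a^\dagger$ are collectively called “ladder operators” or “elevator
operators”.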
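These relations are easy to explore with finite matrices. The following sketch is an illustration, not the author's method: it represents $\hat a$ in a truncated basis $|0\rangle, \ldots, |5\rangle$ (the truncation size is an arbitrary assumption) with entries taken from equation (9.31). Because the basis is cut off, the commutator equals the identity everywhere except in the last diagonal entry; that defect is an artifact of truncation, not of the physics.

```python
import numpy as np

def lowering_matrix(size):
    """Matrix of a-hat in the truncated basis |0>, |1>, ..., |size-1>.

    Element (n-1, n) is sqrt(n), as in equation (9.31).
    """
    return np.diag(np.sqrt(np.arange(1, size)), k=1)

size = 6                          # illustrative truncation (an assumption)
a = lowering_matrix(size)
a_dag = a.T                       # real matrix, so the adjoint is the transpose
number = a_dag @ a                # N-hat = a-dagger a
commutator = a @ a_dag - a_dag @ a
energies = np.diag(number) + 0.5  # eigenvalues of H-hat in units of hbar*omega

print(np.diag(number))            # 0, 1, 2, ... on the diagonal
print(np.diag(commutator))        # 1, 1, ..., then the truncation defect
print(energies)
```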
9.4 Problems
9.1 Ground state of the simple harmonic oscillator

You may have been surprised that the lowest possible energy for the
simple harmonic oscillator was $E_0 = \tfrac{1}{2}\hbar\omega$ rather than $E_0 = 0$. This
exercise attempts to explain the non-zero ground state energy in seat-
of-the-pants, semiclassical terms rather than in rigorous, formal, math-
ematical terms. It then goes on to use these ideas plus the uncertainty
principle to guess at a value for the ground state energy. You may
abhor such non-rigorous arguments, but you must be able to do them
in order to make informed guesses about the behavior of systems that
are too complicated to yield to rigorous mathematical methods.
In classical mechanics the SHO ground state has zero potential energy
(the particle is at the origin) and zero kinetic energy (it is motionless).
However in quantum mechanics if a particle is localized precisely at the
origin, and hence has zero potential energy, then it has a considerable
spread of momentum values and hence a non-zero kinetic energy (or,
to be precise, a non-zero expectation value for kinetic energy). The
kinetic energy can be reduced by decreasing the spread of momentum
values, but only by increasing the spread of position values and hence
by increasing the (expected value of the) potential energy. The ground
state is the state in which this trade off between kinetic and potential
energies results in a minimum total energy.
Assume that the spread in position extends over some distance $d$ about
the origin (i.e. the particle will very likely be found between $x = -d/2$
and $x = +d/2$). This will result in a potential energy somewhat less
than
$$\frac{1}{2} m\omega^2 \left(\frac{d}{2}\right)^2.$$
This argument is not intended to be rigorous, so let’s forget the “somewhat
less” part of the last sentence. Furthermore, a position spread of
$\Delta x = d$ implies through the uncertainty principle a momentum spread
of $\Delta p \ge \hbar/2d$. (The expected value of the momentum is zero.) Continuing
in our non-rigorous vein, let’s set $\Delta p = \hbar/2d$ and kinetic energy
equal to
$$\frac{1}{2m} \left(\frac{\Delta p}{2}\right)^2.$$
Sketch potential energy, kinetic energy and total energy as a function of
$d$. Find the minimum value of $E(d)$ and compare with the true ground
state energy $E_0 = \tfrac{1}{2}\hbar\omega$. (Note that if $\hbar$ were zero, the energy minimum
would fall at $E(d) = 0$!)
9.2 Expressions for simple harmonic oscillator ladder operators
Show that the lowering operator $\hat a$ has the outer product expression
$$\hat a = \sum_{n=0}^{\infty} \sqrt{n}\, |n - 1\rangle\langle n|$$
and the matrix representation (in the energy basis)
$$\begin{pmatrix}
0 & \sqrt{1} & 0 & 0 & 0 & \\
0 & 0 & \sqrt{2} & 0 & 0 & \\
0 & 0 & 0 & \sqrt{3} & 0 & \cdots \\
0 & 0 & 0 & 0 & \sqrt{4} & \\
0 & 0 & 0 & 0 & 0 & \\
 & & \vdots & & & \ddots
\end{pmatrix}.$$
Write down the outer product expression and matrix representation for
$\hat a^\dagger$.
9.3 Ladder operators for the simple harmonic oscillator
a. Express x̂ and p̂ in terms of â and â† .
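b. Calculate the following simple harmonic oscillator matrix elements:
$$\begin{array}{lll}
\langle m|\hat a|n\rangle & \langle m|\hat p|n\rangle & \langle m|\hat x \hat p|n\rangle \\
\langle m|\hat a^\dagger|n\rangle & \langle m|\hat x^2|n\rangle & \langle m|\hat p \hat x|n\rangle \\
\langle m|\hat x|n\rangle & \langle m|\hat p^2|n\rangle & \langle m|\hat H|n\rangle
\end{array}$$
c. Show that the expectation value of the potential energy in a SHO
energy eigenstate equals the expectation value of the kinetic energy
in that state. (Recall that for a classical simple harmonic
oscillator, the time averaged potential energy equals the time averaged
kinetic energy.)
d. Find $\Delta x$, $\Delta p$, and $\Delta x \Delta p$ for the energy eigenstate $|n\rangle$.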
9.4 Simple harmonic oscillator states
Use scaled variables throughout this problem.
a. Concerning the ground energy state: What is $\eta_0(x)$ at $x = 0.5$?
What is the probability density $\rho_0(x)$ there?
b. Concerning the first excited energy state: What is $\eta_1(x)$ at $x = 0.5$?
What is the probability density $\rho_1(x)$ there?
c. Concerning the “50–50 combination” $\psi_A(x) = (\eta_0(x) + \eta_1(x))/\sqrt{2}$:
What is $\psi_A(x)$ at $x = 0.5$? What is the probability density $\rho_A(x)$
there?
d. Concerning another “50–50 combination” $\psi_B(x) = (\eta_0(x) - \eta_1(x))/\sqrt{2}$:
What is $\psi_B(x)$ at $x = 0.5$? What is the probability
density $\rho_B(x)$ there?
e. Veronica argues that “Probability is central to quantum mechan-
ics, so the probability density of any 50–50 combination of η0 (x)
and η1 (x) will be half-way between ρ0 (x) and ρ1 (x).” Prove Veron-
ica wrong. What phenomenon of quantum mechanics has she ig-
nored?
f. (Optional, for the mathematically inclined.) Prove that for any
50–50 combination of η0 (x) and η1 (x), the probability density at
x will range from ρA (x) to ρB (x). (Clue: Use the triangle inequal-
ity.)
Chapter 10
Qualitative Solution of Energy Eigenproblems
Chapter 11
Perturbation Theory
11.1 The O notation
Approximations are an important part of physics, and an important part
of approximation is to ensure their reliability and consistency. The O notation
(pronounced “the big-oh notation”) is a practical tool for making
approximations reliable and consistent.
The technique is best illustrated through an example. Suppose you
desire an approximation for
$$f(x) = \frac{e^{-x}}{1 - x} \tag{11.1}$$
valid for small values of $x$, that is, for $x \ll 1$. You know that
$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 - \tfrac{1}{6}x^3 + \cdots \tag{11.2}$$
and that
$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots, \tag{11.3}$$
so it seems that reasonable approximations are
$$e^{-x} \approx 1 - x \tag{11.4}$$
and
$$\frac{1}{1 - x} \approx 1 + x, \tag{11.5}$$
whence
$$\frac{e^{-x}}{1 - x} \approx (1 - x)(1 + x) = 1 - x^2. \tag{11.6}$$
Let’s try out this approximation at $x_0 = 0.01$. A calculator shows that
$$\frac{e^{-x_0}}{1 - x_0} = 1.0000503\ldots \tag{11.7}$$
while the value for the approximation is
$$1 - x_0^2 = 0.9999000. \tag{11.8}$$
This is a very poor approximation indeed: the deviation from $f(0) = 1$ is
even of the wrong sign!
Let’s do the problem over again, but this time keeping track of exactly
how much we’ve thrown away while making each approximation. We write
$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 - \tfrac{1}{6}x^3 + \cdots \tag{11.9}$$
as
$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 + O(x^3), \tag{11.10}$$
where the notation $O(x^3)$ stands for the small terms that we haven’t bothered
to write out explicitly. The symbol $O(x^3)$ means “terms that are about
the magnitude of $x^3$, or smaller” and is pronounced “terms of order $x^3$”.
The O notation will allow us to make controlled approximations in which
we keep track of exactly how good the approximation is.
Similarly, we write
$$\frac{1}{1 - x} = 1 + x + x^2 + O(x^3), \tag{11.11}$$
and find the product
\begin{align}
f(x) &= \left[1 - x + \tfrac{1}{2}x^2 + O(x^3)\right] \times \left[1 + x + x^2 + O(x^3)\right] \tag{11.12} \\
&= \left[1 - x + \tfrac{1}{2}x^2 + O(x^3)\right] \tag{11.13} \\
&\quad + \left[1 - x + \tfrac{1}{2}x^2 + O(x^3)\right] x \tag{11.14} \\
&\quad + \left[1 - x + \tfrac{1}{2}x^2 + O(x^3)\right] x^2 \tag{11.15} \\
&\quad + \left[1 - x + \tfrac{1}{2}x^2 + O(x^3)\right] O(x^3). \tag{11.16}
\end{align}
Note, however, that $x \times \tfrac{1}{2}x^2 = O(x^3)$, and that $x^2 \times O(x^3) = O(x^3)$, and
so forth, whence
\begin{align}
f(x) &= 1 - x + \tfrac{1}{2}x^2 + O(x^3) \tag{11.17} \\
&\quad + x - x^2 + O(x^3) \tag{11.18} \\
&\quad + x^2 + O(x^3) \tag{11.19} \\
&\quad + O(x^3) \tag{11.20} \\
&= 1 + \tfrac{1}{2}x^2 + O(x^3). \tag{11.21}
\end{align}
Thus we have the approximation
$$f(x) \approx 1 + \tfrac{1}{2}x^2. \tag{11.22}$$
Furthermore, we know that this approximation is accurate to terms of order
$O(x^2)$ (i.e. that the first neglected terms are of order $O(x^3)$). Evaluating
this approximation at $x_0 = 0.01$ gives
$$1 + \tfrac{1}{2}x_0^2 = 1.0000500, \tag{11.23}$$
far superior to our old approximation.
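The comparison can be repeated at several values of $x$. This is a small sketch with illustrative function names (`naive` for approximation (11.6), `controlled` for approximation (11.22)); it prints the error of each. The error of the controlled approximation shrinks roughly a thousandfold each time $x$ shrinks tenfold, as expected for a neglected term of order $x^3$.

```python
import math

def f(x):
    """The exact function of equation (11.1)."""
    return math.exp(-x) / (1.0 - x)

def naive(x):
    """Uncontrolled approximation, equation (11.6)."""
    return 1.0 - x**2

def controlled(x):
    """Controlled approximation, equation (11.22)."""
    return 1.0 + 0.5 * x**2

for x in (0.1, 0.01, 0.001):
    print(x, f(x) - naive(x), f(x) - controlled(x))
```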
What went wrong on our first try? The $-x^2$ in approximation (11.6)
is the same as the $-x^2$ on line (11.18). However, lines (11.17) and (11.19)
demonstrate that there were other terms of about the same size (i.e. other
“terms of order $x^2$”) that we neglected in our first attempt.
The O notation is superior to the “dot notation” (such as · · · ) in that
dots stand for “a bunch of small terms”, but the dots don’t tell you just
how small they are. The symbol O(x3 ) also stands for “a bunch of small
terms”, but in addition it tells you precisely how small those terms are.
The O notation allows us to approximate in a consistent manner, unlike
the uncontrolled approximations where we ignore a “small term” without
knowing whether we have already retained terms that are even smaller.
Problem
11.1 Tunneling for small times — O notation version
Problem 5.2, part e, raised the paradox that, according to an approx-
imation produced using truncation rather than O notation, the total
probability was greater than 1. This problem resolves the paradox using
O notation.
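a. Approximate time evolution through
$$|\psi(\Delta t)\rangle = \left[\hat 1 - \frac{i}{\hbar}\hat H \Delta t - \frac{1}{2\hbar^2}\hat H^2 (\Delta t)^2 + O(\Delta t^3)\right] |\psi(0)\rangle. \tag{11.24}$$
Find the representation of this equation in the $\{|1\rangle, |2\rangle\}$ basis.
b. Conclude that for initial condition $|\psi(0)\rangle = |1\rangle$,
$$\begin{pmatrix} \psi_1(\Delta t) \\ \psi_2(\Delta t) \end{pmatrix} = \begin{pmatrix} 1 - (i/\hbar)E\Delta t - (1/2\hbar^2)(E^2 + A^2)(\Delta t)^2 + O(\Delta t^3) \\ -(i/\hbar)Ae^{-i\phi}\Delta t - (1/\hbar^2)EAe^{-i\phi}(\Delta t)^2 + O(\Delta t^3) \end{pmatrix}. \tag{11.25}$$
c. Find the resulting probabilities for the system to be found in $|1\rangle$
and in $|2\rangle$, correct to second order in $\Delta t$, and show that these
probabilities sum to 1, correct to second order in $\Delta t$.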
11.2 Perturbation theory for cubic equations
Perturbation theory is any technique for approximately solving one problem,
when an exact solution for a similar problem is available.
It’s a general mathematical technique, applicable to many problems.
(It was first developed in the context of classical mechanics: We have an
exact solution for the problem of two gravitating bodies, such as the ellipse
of the Earth orbiting the Sun. But we don’t have an exact solution for
the problem of three gravitating bodies, such as the Earth plus the Sun
plus Jupiter. Perturbation theory was developed to understand how the
attraction by Jupiter “perturbed” the motion of the Earth away from the
pure elliptical orbit that it would execute if Jupiter didn’t exist.) Before
we apply perturbation theory to quantum mechanics, we’ll apply it in a
simpler, and purely mathematical, context.
I wish to solve the cubic equation
$$x^3 - 4.001\,x + 0.002 = 0. \tag{11.26}$$
There is a formula for finding the three roots of a cubic equation, and
we could use it to solve this problem. On the other hand, that formula is
very complicated and awkward. And while there’s no straightforward exact
solution to the problem as stated, that problem is very close to the problem
$$x^3 - 4x = 0, \tag{11.27}$$
which does have straightforward exact solutions, namely
$$0, \pm 2. \tag{11.28}$$
Can I use the exact solution of this “nearby” problem to find an approximate
solution for the problem of interest?
I’ll write the cubic equation as the sum of a part we can solve plus a
“small” perturbing part, namely
$$x^3 - 4x + (-0.001\,x + 0.002) = 0. \tag{11.29}$$
I place the word “small” in quotes because its meaning is not precisely
clear. On one hand, for a typical value of x, say x = 1, the “big” part is
−3 while the small part is only 0.001. On the other hand, for the value
x = 0, the “big” part is zero and the “small” part is 0.002. So for some
values of x the “small” part is bigger than the “big” part. Mathematicians
spend a lot of time figuring out a precise meaning of “big” versus “small”
in this context, but we don’t need to follow their figurings. It’s enough for
us that the perturbing part is, in some general way, small compared to the
remaining part of the problem, the part that we can solve exactly.
To save space, I’ll introduce the constant $T$ to mean “thousandths”, and
write our problem as
$$x^3 - 4x + T(-x + 2) = 0. \tag{11.30}$$
And now I’ll generalize this problem by inserting a variable $\epsilon$ in front of the
“small” part:
$$x^3 - 4x + \epsilon T(-x + 2) = 0. \tag{11.31}$$
The variable $\epsilon$ enables us to interpolate smoothly from the problem we’re
interested in, with $\epsilon = 1$, to the problem we know how to solve, with $\epsilon = 0$.
Instead of solving one cubic equation, the problem with $\epsilon = 1$, we’re
going to try to solve an infinite number of cubic equations, those with
$0 \le \epsilon \le 1$. For example, I can call the smallest of these solutions $x_1(\epsilon)$. I
don’t know much about $x_1(\epsilon)$ (I know only that $x_1(0) = -2$) but I have
an expectation: I expect that $x_1(\epsilon)$ will behave smoothly as a function of
$\epsilon$, for example something like this
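[Figure: the three roots $x_1(\epsilon)$, $x_2(\epsilon)$, $x_3(\epsilon)$ plotted against $\epsilon$ as smooth curves, with $x_1$ starting at $-2$.]
and I expect that it won’t have jumps or kinks like this
[Figure: the same axes, with the curves $x_1(\epsilon)$, $x_2(\epsilon)$, $x_3(\epsilon)$ showing jumps or kinks.]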
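Because of this expectation, I expect that I can write $x_1(\epsilon)$ as a Taylor
series:
\begin{align}
x_1(\epsilon) &= \sum_{i=0}^{\infty} a_i \epsilon^i \tag{11.32} \\
&= -2 + a_1 \epsilon + a_2 \epsilon^2 + O(\epsilon^3) \tag{11.33}
\end{align}
(where $a_0 = x_1(0) = -2$). This function $x_1(\epsilon)$ has to satisfy
$$x_1^3(\epsilon) - (4 + \epsilon T)\, x_1(\epsilon) + 2\epsilon T = 0. \tag{11.34}$$
I can write the middle term above as an expansion in powers of $\epsilon$ using
equation (11.33):
$$\begin{aligned}
-4 x_1(\epsilon) &= 8 - \epsilon(4a_1) - \epsilon^2(4a_2) + O(\epsilon^3) \\
-\epsilon T x_1(\epsilon) &= \phantom{8} + \epsilon(2T) - \epsilon^2(T a_1) + O(\epsilon^3) \\
-(4 + \epsilon T) x_1(\epsilon) &= 8 + \epsilon(-4a_1 + 2T) + \epsilon^2(-4a_2 - T a_1) + O(\epsilon^3)
\end{aligned}$$
With just a bit more effort, I can work out the left-most term in equation
(11.34) as an expansion:
$$\begin{aligned}
x_1^2(\epsilon) &= 4 - \epsilon(4a_1) + \epsilon^2(-4a_2 + a_1^2) + O(\epsilon^3) \\
x_1^3(\epsilon) &= -8 - \epsilon(-12a_1) + \epsilon^2(12a_2 - 6a_1^2) + O(\epsilon^3)
\end{aligned}$$
So finally, I have worked out the expansion of every term in equation (11.34):
$$\begin{aligned}
x_1^3(\epsilon) &= -8 - \epsilon(-12a_1) + \epsilon^2(12a_2 - 6a_1^2) + O(\epsilon^3) \\
-(4 + \epsilon T) x_1(\epsilon) &= \phantom{-}8 + \epsilon(-4a_1 + 2T) + \epsilon^2(-4a_2 - T a_1) + O(\epsilon^3) \\
2\epsilon T &= \phantom{-8} + \epsilon(2T)
\end{aligned}$$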
Summing the three equations above must, according to equation (11.34),
produce zero:
$$\begin{aligned}
0 &= (-8 + 8) + \epsilon(12a_1 - 4a_1 + 4T) + \epsilon^2(12a_2 - 6a_1^2 - 4a_2 - T a_1) + O(\epsilon^3) \\
0 &= (-8 + 8) + \epsilon(8a_1 + 4T) + \epsilon^2(8a_2 - 6a_1^2 - T a_1) + O(\epsilon^3)
\end{aligned}$$
Now, because the expression on the right must vanish for any value of $\epsilon$, all
the coefficients must vanish. First we must have that $(-8 + 8) = 0$, which
checks out. Then the term linear in $\epsilon$ must vanish, so
$$(8a_1 + 4T) = 0 \qquad\text{whence}\qquad a_1 = -\tfrac{1}{2}T.$$
And the term quadratic in $\epsilon$ must vanish, so
$$(8a_2 - 6a_1^2 - T a_1) = 0 \qquad\text{whence}\qquad a_2 = \tfrac{3}{4}a_1^2 + \tfrac{1}{8}T a_1 = \tfrac{1}{8}T^2.$$
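The expansion for $x_1(\epsilon)$ is thus
$$x_1(\epsilon) = -2 - \tfrac{1}{2}T\epsilon + \tfrac{1}{8}T^2\epsilon^2 + O(\epsilon^3)$$
If we set $\epsilon = 1$ and ignore the terms $O(\epsilon^3)$, we find
$$x_1(1) \approx -2 - 0.0005 + 0.000000125 = -2.000499875$$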
and comparison to the exact solution of the cubic equation (which is much
more difficult to work through) shows that this result is accurate to one
part in a billion.
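The accuracy claim can be checked without the cubic formula. The sketch below (function names are illustrative assumptions) evaluates the second-order expansion with $a_1 = -\frac{1}{2}T$ and $a_2 = \frac{1}{8}T^2$ and compares it with the smallest root returned by NumPy's general polynomial root finder `numpy.roots`, which here stands in for the exact solution.

```python
import numpy as np

T = 0.001   # "thousandths", as in equation (11.30)

def x1_perturbative(eps):
    """Second-order expansion: -2 - (1/2) T eps + (1/8) T^2 eps^2."""
    return -2.0 - 0.5 * T * eps + 0.125 * T**2 * eps**2

def smallest_root(eps):
    """Smallest root of x^3 - (4 + eps*T) x + 2*eps*T = 0, found numerically."""
    coefficients = [1.0, 0.0, -(4.0 + eps * T), 2.0 * eps * T]
    return min(np.roots(coefficients).real)

for eps in (0.0, 0.5, 1.0):
    print(eps, x1_perturbative(eps), smallest_root(eps))
```

At $\epsilon = 1$ the two values agree to better than one part in a billion.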
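11.3 Derivation of perturbation theory for the energy eigenproblem
Approach
To solve the energy eigenproblem for the Hamiltonian $\hat H^{(0)} + \hat H'$, where the
solution
$$\hat H^{(0)} |n^{(0)}\rangle = E_n^{(0)} |n^{(0)}\rangle \tag{11.35}$$
is known and where $\hat H'$ is “small” compared with $\hat H^{(0)}$, we set
$$\hat H(\epsilon) = \hat H^{(0)} + \epsilon \hat H' \tag{11.36}$$
and then find $|n(\epsilon)\rangle$ and $E_n(\epsilon)$ such that
$$\hat H(\epsilon)|n(\epsilon)\rangle = E_n(\epsilon)|n(\epsilon)\rangle \tag{11.37}$$
and
$$\langle n(\epsilon)|n(\epsilon)\rangle = 1. \tag{11.38}$$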
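Intermediate goal
Find $|\bar n(\epsilon)\rangle$ and $E_n(\epsilon)$ such that
$$\hat H(\epsilon)|\bar n(\epsilon)\rangle = E_n(\epsilon)|\bar n(\epsilon)\rangle \tag{11.39}$$
and
$$\langle n^{(0)}|\bar n(\epsilon)\rangle = 1. \tag{11.40}$$
Then our final goal will be
$$|n(\epsilon)\rangle = \frac{|\bar n(\epsilon)\rangle}{\langle \bar n(\epsilon)|\bar n(\epsilon)\rangle^{1/2}}. \tag{11.41}$$
Remarkably, it often turns out to be good enough to reach our intermediate
goal of finding $|\bar n(\epsilon)\rangle$, and one can then invent tricks for extracting
information from these unnormalized eigenstates.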
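Initial assumption
We make the standard perturbation theory guess:
$$|\bar n(\epsilon)\rangle = |n^{(0)}\rangle + \epsilon|\bar n^{(1)}\rangle + \epsilon^2|\bar n^{(2)}\rangle + O(\epsilon^3) \tag{11.42}$$
$$E_n(\epsilon) = E_n^{(0)} + \epsilon E_n^{(1)} + \epsilon^2 E_n^{(2)} + O(\epsilon^3) \tag{11.43}$$
[Note that the set $\{|\bar n^{(1)}\rangle\}$ is not complete, or orthonormal, or any other
good thing.]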
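Consequences of the magnitude choice
The choice $\langle n^{(0)}|\bar n(\epsilon)\rangle = 1$ gives rise to interesting and useful consequences.
First, take the inner product of $|n^{(0)}\rangle$ with equation (11.42)
$$\begin{aligned}
\langle n^{(0)}|\bar n(\epsilon)\rangle &= \langle n^{(0)}|n^{(0)}\rangle + \epsilon\langle n^{(0)}|\bar n^{(1)}\rangle + \epsilon^2\langle n^{(0)}|\bar n^{(2)}\rangle + O(\epsilon^3) \\
1 &= 1 + \epsilon\langle n^{(0)}|\bar n^{(1)}\rangle + \epsilon^2\langle n^{(0)}|\bar n^{(2)}\rangle + O(\epsilon^3)
\end{aligned}$$
Because this relationship holds for all values of $\epsilon$, the coefficient of each $\epsilon^m$
must vanish:
$$\langle n^{(0)}|\bar n^{(m)}\rangle = 0 \qquad m = 1, 2, 3, \ldots. \tag{11.44}$$
Whence
\begin{align}
\langle \bar n(\epsilon)|\bar n(\epsilon)\rangle &= \left[\langle n^{(0)}| + \epsilon\langle \bar n^{(1)}| + \epsilon^2\langle \bar n^{(2)}| + O(\epsilon^3)\right] \left[|n^{(0)}\rangle + \epsilon|\bar n^{(1)}\rangle + \epsilon^2|\bar n^{(2)}\rangle + O(\epsilon^3)\right] \notag \\
&= \langle n^{(0)}|n^{(0)}\rangle + \epsilon\left[\langle \bar n^{(1)}|n^{(0)}\rangle + \langle n^{(0)}|\bar n^{(1)}\rangle\right] \notag \\
&\quad + \epsilon^2\left[\langle \bar n^{(2)}|n^{(0)}\rangle + \langle \bar n^{(1)}|\bar n^{(1)}\rangle + \langle n^{(0)}|\bar n^{(2)}\rangle\right] + O(\epsilon^3) \notag \\
&= 1 + \epsilon\left[0 + 0\right] + \epsilon^2\left[0 + \langle \bar n^{(1)}|\bar n^{(1)}\rangle + 0\right] + O(\epsilon^3) \notag \\
&= 1 + \epsilon^2\langle \bar n^{(1)}|\bar n^{(1)}\rangle + O(\epsilon^3). \tag{11.45}
\end{align}
In other words, while the vector $|\bar n(\epsilon)\rangle$ is not exactly normalized, it is
“nearly normalized”: the norm differs from 1 by small, second-order
terms.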
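Developing the perturbation expansion
What came before was just warming up. We now go and plug our expansion
guesses, equations (11.42) and (11.43) into
$$\hat H(\epsilon)|n(\epsilon)\rangle = E_n(\epsilon)|n(\epsilon)\rangle \tag{11.46}$$
to find
\begin{align}
&\left[\hat H^{(0)} + \epsilon\hat H'\right] \left[|n^{(0)}\rangle + \epsilon|\bar n^{(1)}\rangle + \epsilon^2|\bar n^{(2)}\rangle + O(\epsilon^3)\right] \notag \\
&\quad = \left[E_n^{(0)} + \epsilon E_n^{(1)} + \epsilon^2 E_n^{(2)} + O(\epsilon^3)\right] \left[|n^{(0)}\rangle + \epsilon|\bar n^{(1)}\rangle + \epsilon^2|\bar n^{(2)}\rangle + O(\epsilon^3)\right] \tag{11.47}
\end{align}
Separating out powers of $\epsilon$ gives
\begin{align}
\hat H^{(0)}|n^{(0)}\rangle &= E_n^{(0)}|n^{(0)}\rangle \tag{11.48} \\
\hat H^{(0)}|\bar n^{(1)}\rangle + \hat H'|n^{(0)}\rangle &= E_n^{(1)}|n^{(0)}\rangle + E_n^{(0)}|\bar n^{(1)}\rangle \tag{11.49} \\
\hat H^{(0)}|\bar n^{(2)}\rangle + \hat H'|\bar n^{(1)}\rangle &= E_n^{(2)}|n^{(0)}\rangle + E_n^{(1)}|\bar n^{(1)}\rangle + E_n^{(0)}|\bar n^{(2)}\rangle \tag{11.50}
\end{align}
and so forth.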
Finding the first-order energy shifts
How do we extract useful information from these expansion equations?
Let’s focus on what we know and what we want to find. We know $\hat H^{(0)}$,
$\hat H'$, $|n^{(0)}\rangle$, and $E_n^{(0)}$. From equation (11.49) we will find $E_n^{(1)}$ and $|\bar n^{(1)}\rangle$.
Knowing these, from equation (11.50) we will find $E_n^{(2)}$ and $|\bar n^{(2)}\rangle$. And so
forth.
To find the energy shifts $E_n^{(1)}$, we multiply equation (11.49) by $\langle n^{(0)}|$ to
find
\begin{align}
\langle n^{(0)}|\hat H^{(0)}|\bar n^{(1)}\rangle + \langle n^{(0)}|\hat H'|n^{(0)}\rangle &= E_n^{(1)}\langle n^{(0)}|n^{(0)}\rangle + E_n^{(0)}\langle n^{(0)}|\bar n^{(1)}\rangle \notag \\
E_n^{(0)}\langle n^{(0)}|\bar n^{(1)}\rangle + \langle n^{(0)}|\hat H'|n^{(0)}\rangle &= E_n^{(1)} + E_n^{(0)}\langle n^{(0)}|\bar n^{(1)}\rangle \tag{11.51}
\end{align}
Or,
$$E_n^{(1)} = \langle n^{(0)}|\hat H'|n^{(0)}\rangle. \tag{11.52}$$
Often you need only these energies, not the states, and you can stop here.
But if you do need the states. . .
Finding the first-order state shifts
We will find the state shifts $|\bar n^{(1)}\rangle$ by finding all the components of $|\bar n^{(1)}\rangle$
in the unperturbed basis $\{|m^{(0)}\rangle\}$.
Multiply equation (11.49) by $\langle m^{(0)}|$ ($m \ne n$) to find
\begin{align}
\langle m^{(0)}|\hat H^{(0)}|\bar n^{(1)}\rangle + \langle m^{(0)}|\hat H'|n^{(0)}\rangle &= E_n^{(1)}\langle m^{(0)}|n^{(0)}\rangle + E_n^{(0)}\langle m^{(0)}|\bar n^{(1)}\rangle \notag \\
E_m^{(0)}\langle m^{(0)}|\bar n^{(1)}\rangle + \langle m^{(0)}|\hat H'|n^{(0)}\rangle &= 0 + E_n^{(0)}\langle m^{(0)}|\bar n^{(1)}\rangle \notag \\
\langle m^{(0)}|\hat H'|n^{(0)}\rangle &= (E_n^{(0)} - E_m^{(0)})\langle m^{(0)}|\bar n^{(1)}\rangle \tag{11.53}
\end{align}
Now, if the state $|n^{(0)}\rangle$ is non-degenerate, then $E_m^{(0)} \ne E_n^{(0)}$ and we can
divide both sides to find
$$\langle m^{(0)}|\bar n^{(1)}\rangle = \frac{\langle m^{(0)}|\hat H'|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}} \qquad (m \ne n) \tag{11.54}$$
But we already know, from equation (11.44), that
$$\langle n^{(0)}|\bar n^{(1)}\rangle = 0. \tag{11.55}$$
So now all the projections $\langle m^{(0)}|\bar n^{(1)}\rangle$ are known, and therefore the vector
is known:
$$|\bar n^{(1)}\rangle = \sum_m |m^{(0)}\rangle\langle m^{(0)}|\bar n^{(1)}\rangle \tag{11.56}$$
In conclusion, if $|n^{(0)}\rangle$ is non-degenerate,
$$|\bar n^{(1)}\rangle = \sum_{m \ne n} |m^{(0)}\rangle \frac{\langle m^{(0)}|\hat H'|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}. \tag{11.57}$$
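11.4 Perturbation theory for the energy eigenproblem: Summary of results
Given: Solution for the $\hat H^{(0)}$ eigenproblem:
$$\hat H^{(0)}|n^{(0)}\rangle = E_n^{(0)}|n^{(0)}\rangle \qquad \langle n^{(0)}|n^{(0)}\rangle = 1. \tag{11.58}$$
Find: Solution for the $\hat H^{(0)} + \epsilon\hat H'$ eigenproblem:
$$(\hat H^{(0)} + \epsilon\hat H')|n(\epsilon)\rangle = E_n(\epsilon)|n(\epsilon)\rangle \qquad \langle n(\epsilon)|n(\epsilon)\rangle = 1. \tag{11.59}$$
Define the “matrix elements”
$$\langle n^{(0)}|\hat H'|m^{(0)}\rangle = H'_{nm}. \tag{11.60}$$
The solutions are (provided $|n^{(0)}\rangle$ is not degenerate):
$$E_n(\epsilon) = E_n^{(0)} + \epsilon H'_{nn} + \epsilon^2 \sum_{m \ne n} \frac{H'_{nm} H'_{mn}}{E_n^{(0)} - E_m^{(0)}} + O(\epsilon^3) \tag{11.61}$$
\begin{align}
|n(\epsilon)\rangle &= |n^{(0)}\rangle \notag \\
&\quad + \epsilon \sum_{m \ne n} |m^{(0)}\rangle \frac{H'_{mn}}{E_n^{(0)} - E_m^{(0)}} \notag \\
&\quad + \epsilon^2 \Bigg[ \sum_{m \ne n} \sum_{\ell \ne n} |m^{(0)}\rangle \frac{H'_{m\ell} H'_{\ell n}}{(E_n^{(0)} - E_m^{(0)})(E_n^{(0)} - E_\ell^{(0)})} \notag \\
&\qquad\quad - \sum_{m \ne n} |m^{(0)}\rangle \frac{H'_{nn} H'_{mn}}{(E_n^{(0)} - E_m^{(0)})^2} - \frac{1}{2} |n^{(0)}\rangle \sum_{m \ne n} \frac{H'_{nm} H'_{mn}}{(E_n^{(0)} - E_m^{(0)})^2} \Bigg] \notag \\
&\quad + O(\epsilon^3) \tag{11.62}
\end{align}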
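To see equation (11.61) at work, here is a small numerical sketch. Everything specific in it is an assumption made for illustration: the three unperturbed energies, the symmetric matrix `Hp` standing in for $H'_{nm}$, and the value $\epsilon = 0.1$. It compares the second-order energies with the eigenvalues found by direct diagonalization using `numpy.linalg.eigvalsh`.

```python
import numpy as np

def perturbed_energies(E0, Hp, eps):
    """Energies through second order in eps, equation (11.61).

    E0  : unperturbed energies E_n^(0), assumed non-degenerate
    Hp  : matrix of elements H'_{nm} in the unperturbed basis
    eps : expansion parameter
    """
    E0 = np.asarray(E0, dtype=float)
    Hp = np.asarray(Hp, dtype=float)
    result = np.empty_like(E0)
    for n in range(len(E0)):
        second_order = sum(Hp[n, m] * Hp[m, n] / (E0[n] - E0[m])
                           for m in range(len(E0)) if m != n)
        result[n] = E0[n] + eps * Hp[n, n] + eps**2 * second_order
    return result

# Illustrative numbers only (not from the text).
E0 = [0.0, 1.0, 2.5]
Hp = np.array([[0.20, 0.10, 0.05],
               [0.10, -0.10, 0.30],
               [0.05, 0.30, 0.40]])
eps = 0.1

approx = perturbed_energies(E0, Hp, eps)
exact = np.linalg.eigvalsh(np.diag(E0) + eps * Hp)   # ascending order
print(approx)
print(exact)
print(approx - exact)   # should be of order eps**3
```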
Rules of thumb concerning perturbation theory

• There is no guarantee that the series is convergent, or even asymptotic.
• But experience says “stop at the first non-vanishing energy correction”.
• The wavefunctions produced are notoriously poor. How can the ener-
gies be good when the wavefunctions are poor? See section 16.2.
• The technique is generally useful for many mathematical problems:
classical mechanics, fluid mechanics, etc. Even for solving cubic equa-
tions!
• The technique is never guaranteed to succeed, but it is likely to fail (and
perhaps fail silently!) if there are degenerate energy states. In this case
$E_n^{(0)} = E_m^{(0)}$, so the second-order term perhaps diverges, despite the fact
that the first-order term $\langle n^{(0)}|\hat H'|n^{(0)}\rangle$ looks perfectly fine.
11.5 Problems
11.2 Square well with a bump

An infinite square well of width L (problem 8.1) is perturbed by putting
in a bit of potential of height V and width a in the middle of the
well. Find the first order energy shifts for all the energy eigenstates,
and the first order perturbed wavefunction for the ground state (your
result will be an infinite series). (Note: Many of the required matrix
elements will vanish! Before you integrate, ask yourself whether the
integrand is odd.) When a = L the perturbed problem can be solved
exactly. Compare the perturbed energies with the exact energies and
the perturbed ground state wavefunction with the exact ground state
wavefunction.
11.3 Anharmonic oscillator

a. Show that for the simple harmonic oscillator,
$$\begin{aligned}
\langle m|\hat x^3|n\rangle = \sqrt{\left(\frac{\hbar}{2m\omega}\right)^3} \Big[ &\sqrt{n(n-1)(n-2)}\, \delta_{m,n-3} + 3\sqrt{n^3}\, \delta_{m,n-1} \\
&+ 3\sqrt{(n+1)^3}\, \delta_{m,n+1} + \sqrt{(n+1)(n+2)(n+3)}\, \delta_{m,n+3} \Big].
\end{aligned}$$
b. Recall that the simple harmonic oscillator is always an approximation.
The real problem always has a potential $V(x) = \tfrac{1}{2}kx^2 +
bx^3 + cx^4 + \cdots$. The contributions beyond $\tfrac{1}{2}kx^2$ are called “anharmonic
terms”. Ignore all the anharmonic terms except for $bx^3$.
Show that to leading order the $n$th energy eigenvalue changes by
$$-\frac{b^2}{\hbar\omega} \left(\frac{\hbar}{2m\omega}\right)^3 (30n^2 + 30n + 11). \tag{11.63}$$
Note that these shifts are not “small” when $n$ is large, in which
case it is not appropriate to truncate the perturbation series at
leading order. Explain physically why you don’t expect the shifts
to be small for large $n$.
11.4 Slightly relativistic simple harmonic oscillator
You know that the concept of potential energy is not applicable in rel-
ativistic situations. One consequence of this is that the only fully rela-
tivistic quantum theories possible are quantum field theories. However
there do exist situations where a particle’s motion is “slightly relativis-
tic” (say, v/c ∼ 0.1) and where the force responds quickly enough to the
particle’s position that the potential energy concept has approximate
validity. For a mass on a spring, this situation holds when the spring’s
response time is much less than the period.
a. Show that a reasonable approximate Hamiltonian for such a
“slightly relativistic SHO” is
$$\hat H = \frac{\hat p^2}{2m} + \frac{m\omega^2}{2}\hat x^2 - \frac{1}{8c^2m^3}\hat p^4. \tag{11.64}$$
b. Show that
$$\langle m|\hat p^4|0\rangle = \left(\frac{m\hbar\omega}{2}\right)^2 (3\,\delta_{m,0} - 6\sqrt{2}\,\delta_{m,2} + 2\sqrt{6}\,\delta_{m,4}). \tag{11.65}$$
c. Calculate the leading non-vanishing energy shift of the ground
state due to this relativistic perturbation.
d. Calculate the leading corrections to the ground state eigenvector
|0i.
11.5 Two-state systems
The most general Hamiltonian for a two state system (e.g. spin $\tfrac{1}{2}$,
neutral K meson, ammonia molecule) is represented by
$$a_0 I + a_1\sigma_1 + a_3\sigma_3 \tag{11.66}$$
where $a_0$, $a_1$, and $a_3$ are real numbers and the $\sigma$’s are Pauli matrices.
(See problem 58.)
a. Assume $a_3 = 0$. Solve the energy eigenproblem.
b. Now assume $a_3 \ll a_0 \approx a_1$. Use perturbation theory to find the
leading order shifts in the energy eigenvalues and eigenstates.
c. Find the energy eigenvalues exactly and show that they agree with
the perturbation theory results when $a_3 \ll a_0 \approx a_1$.
11.6 Degenerate perturbation theory in a two-state system
Consider a two state system with a Hamiltonian represented in some
basis by
a0 I + a1 σ1 + a3 σ3 . (11.67)
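We shall call the basis for this representation the “initial basis”. This
exercise shows how to use perturbation theory to solve (approximately)
the energy eigenproblem in the case $a_0 \gg a_1 \approx a_3$.
$$\hat H^{(0)} = \begin{pmatrix} a_0 & 0 \\ 0 & a_0 \end{pmatrix} \qquad \hat H' = \begin{pmatrix} a_3 & a_1 \\ a_1 & -a_3 \end{pmatrix} \tag{11.68}$$
In this case the unperturbed Hamiltonian is degenerate. The initial
basis
$$\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{11.69}$$
is a perfectly acceptable energy eigenbasis (both states have energy $a_0$),
but the basis
$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \tag{11.70}$$
for example, is just as good.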
a. Show that if the non-degenerate formula $E_n^{(1)} = \langle n^{(0)}|\hat H'|n^{(0)}\rangle$
were applied (or rather, misapplied) to this problem, then the formula
would produce different energy shifts depending upon which
basis was used!
Which, if either, are the true energy shifts? The answer comes from
equation (11.53), namely
$$(E_n^{(0)} - E_m^{(0)})\langle m^{(0)}|\bar n^{(1)}\rangle = \langle m^{(0)}|\hat H'|n^{(0)}\rangle \qquad\text{whenever } m \ne n. \tag{11.71}$$
This equation was derived from the fundamental assumption that $|n(\epsilon)\rangle$
and $E_n(\epsilon)$ could be expanded in powers of $\epsilon$. If the unperturbed states
$|n^{(0)}\rangle$ and $|m^{(0)}\rangle$ are degenerate, then $E_n^{(0)} = E_m^{(0)}$ and the above equation
demands that
$$\langle m^{(0)}|\hat H'|n^{(0)}\rangle = 0 \qquad\text{whenever } m \ne n \text{ and } E_n^{(0)} = E_m^{(0)}. \tag{11.72}$$
If this does not apply, then the fundamental assumption must be wrong.
And this answers the question of which basis to use! Consistency demands
the use of a basis in which the perturbing Hamiltonian is diagonal.
(The Hermiticity of $\hat H'$ guarantees that such a basis exists.)
b. Without finding this diagonalizing basis, find the representation
of $\hat H'$ in it.
c. Find the representation of $\hat H^{(0)}$ in the diagonalizing basis. (Trick
question.)
d. What are the energy eigenvalues of the full Hamiltonian $\hat H^{(0)} + \hat H'$?
(Not “correct to some order in perturbation theory,” but the exact
eigenvalues!)
e. Still without explicitly producing the diagonalizing basis, show
that the states in that basis are exact energy eigenstates of the
full Hamiltonian.
f. (Optional) If you’re ambitious, you may now go ahead and show
that the (normalized) diagonalizing basis vectors are
$$\frac{1}{\sqrt{2}\sqrt{a_1^2 + a_3^2 - a_3\sqrt{a_1^2 + a_3^2}}} \begin{pmatrix} +a_1 \\ -a_3 + \sqrt{a_1^2 + a_3^2} \end{pmatrix} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}, \tag{11.73}$$
$$\frac{1}{\sqrt{2}\sqrt{a_1^2 + a_3^2 + a_3\sqrt{a_1^2 + a_3^2}}} \begin{pmatrix} -a_1 \\ +a_3 + \sqrt{a_1^2 + a_3^2} \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}, \tag{11.74}$$
where
$$\tan\theta = \frac{a_1}{a_3 + \sqrt{a_1^2 + a_3^2}}. \tag{11.75}$$
Coda: Note the reasoning of degenerate perturbation theory: We expand
about the basis that diagonalizes $\hat H'$ because expansion about any
other basis is immediately self-contradictory, not because this basis is
guaranteed to produce a sensible expansion. As usual in perturbation
theory, we have no guarantee that this expansion makes sense. We do,
however, have a guarantee that any other expansion does not make
sense.
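A numerical sanity check can be reassuring once the parts above have been worked by hand. The sketch below is not part of the original problem: the values of `a0`, `a1`, `a3` and the helper name `naive_shifts` are assumptions chosen for illustration. It misapplies the non-degenerate formula in the two bases (11.69) and (11.70), then compares with a direct numerical diagonalization of the full Hamiltonian.

```python
import numpy as np

# Hypothetical parameter values with a0 >> a1 ~ a3 (illustration only).
a0, a1, a3 = 10.0, 0.3, 0.4

H0 = a0 * np.eye(2)                      # unperturbed Hamiltonian (degenerate)
Hp = np.array([[a3, a1], [a1, -a3]])     # perturbing Hamiltonian H'

def naive_shifts(basis):
    """Misapplied first-order shifts <n|H'|n>; basis vectors are the columns."""
    return np.array([np.vdot(v, Hp @ v).real for v in basis.T])

initial_basis = np.eye(2)                                  # equation (11.69)
rotated_basis = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # equation (11.70)

print("initial basis shifts:", naive_shifts(initial_basis))
print("rotated basis shifts:", naive_shifts(rotated_basis))

# Exact eigenvalues of the full Hamiltonian, in ascending order.
exact = np.linalg.eigvalsh(H0 + Hp)
print("exact eigenvalues:   ", exact)
```

The two bases give different "shifts", while the exact eigenvalues agree with neither unless the basis happens to diagonalize the perturbation.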
Chapter 12
Quantum Mechanics in Two and Three Dimensions
12.1 More degrees of freedom
Let’s think of the process of adding degrees of freedom.

First consider a spinless particle in one dimension:
(1) The particle’s state is described by a vector $|\psi\rangle$.
(2) The vector has dimension ∞, reflecting the fact that any basis, for example the basis $\{|x\rangle\}$, has ∞ elements. (No basis is better than any other basis: for every statement below concerning position there is a parallel statement concerning momentum, but for concreteness we’ll discuss only position.)
(3) These basis elements are orthonormal,
$$\langle x|x'\rangle = \delta(x - x'), \tag{12.1}$$
and complete
$$\hat 1 = \int_{-\infty}^{+\infty} dx\, |x\rangle\langle x|. \tag{12.2}$$
[[These two equations may seem recondite, formal, and purely mathe-
matical, but in fact they embody the direct, physical results of mea-
surement experiments: Completeness reflects the fact that when the
particle’s position is measured, it is found to have a position. Orthonor-
mality reflects the fact that when the particle’s position is measured, it
is found in only one position. Statement should be refined. Connection
between completeness and interference?]]
(4) The state $|\psi\rangle$ is represented (in the position basis) by the numbers $\langle x|\psi\rangle = \psi(x)$. In symbols
$$|\psi\rangle \doteq \langle x|\psi\rangle = \psi(x). \tag{12.3}$$
(5) When the position is measured, the probability of finding the particle at a position within $dx$ about $x_0$ is
$$|\psi(x_0)|^2\, dx. \tag{12.4}$$
Now consider a spin-$\tfrac12$ particle in one dimension:
(1) The particle’s state is described by a vector $|\psi\rangle$.
(2) The vector has dimension ∞ × 2, reflecting the fact that any basis, for example the basis $\{|x,+\rangle, |x,-\rangle\}$, has ∞ × 2 elements. (No basis is better than any other basis: for every statement below concerning position plus projection on a vertical axis there is a parallel statement concerning momentum plus projection on a horizontal axis, but for concreteness we’ll discuss only position plus projection on a vertical axis.) [[For example, the state $|5,+\rangle$ represents a particle at position 5 with spin +. The state
$$\tfrac{1}{\sqrt2}\left[\,|5,+\rangle - |7,-\rangle\,\right]$$
represents a particle with amplitude $1/\sqrt2$ to be at position 5 with spin + and amplitude $-1/\sqrt2$ to be at position 7 with spin −, but with no amplitude to be at position 5 with spin −, and no amplitude to be at position 6 with any spin.]]
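(3) These basis elements are orthonormal,
$$\begin{aligned}
\langle x,+|x',+\rangle &= \delta(x - x') \\
\langle x,+|x',-\rangle &= 0 \\
\langle x,i|x',j\rangle &= \delta(x - x')\,\delta_{i,j}
\end{aligned} \tag{12.5}$$
and complete
$$\begin{aligned}
\hat 1 &= \int_{-\infty}^{+\infty} dx\, |x,+\rangle\langle x,+| + \int_{-\infty}^{+\infty} dx\, |x,-\rangle\langle x,-| \\
\hat 1 &= \sum_{i=+,-} \int_{-\infty}^{+\infty} dx\, |x,i\rangle\langle x,i|
\end{aligned} \tag{12.6}$$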
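(4) The state $|\psi\rangle$ is represented (in this basis) by the numbers
$$\begin{pmatrix} \langle x,+|\psi\rangle \\ \langle x,-|\psi\rangle \end{pmatrix} = \begin{pmatrix} \psi_+(x) \\ \psi_-(x) \end{pmatrix}. \tag{12.7}$$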
(5) When both the spin projection and the position are measured, the probability of finding the particle with spin up and at a position within $dx$ about $x_0$ is
$$|\psi_+(x_0)|^2\, dx. \tag{12.8}$$
The proper way of expressing the representation of the state $|\psi\rangle$ in the $\{|x,+\rangle, |x,-\rangle\}$ basis is through the so-called “spinor” above, namely
$$|\psi\rangle \doteq \begin{pmatrix} \psi_+(x) \\ \psi_-(x) \end{pmatrix}.$$
Sometimes you’ll see this written instead as
$$|\psi\rangle \doteq \psi_+(x)|+\rangle + \psi_-(x)|-\rangle.$$
Ugh! This is bad notation, because it confuses the state (something like $|\psi\rangle$, a vector) with the representation of a state in a particular basis (something like $\langle x,i|\psi\rangle$, a set of amplitudes). Nevertheless, you’ll see it used.
This example represents the way to add degrees of freedom to a descrip-
tion, namely by using a larger basis set. In this case I’ve merely doubled the
size of the basis set, by including spin. I could also add a second dimension
by adding the possibility of motion in the y direction, and so forth.
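To make "a larger basis set" concrete, here is a minimal numerical sketch of a spinor on a position grid. It is only an illustration: the grid, the two Gaussian amplitudes, and all variable names are assumptions, not anything taken from the text. The state is stored as a 2 × N array of amplitudes, normalized so that the total probability (summed over both spin and position) is one.

```python
import numpy as np

# Position grid standing in for the continuous basis {|x,+>, |x,->}.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

# Hypothetical (unnormalized) amplitudes, chosen only for illustration.
psi_plus = np.exp(-(x - 1.0) ** 2 / 2)
psi_minus = 0.5j * np.exp(-(x + 1.0) ** 2 / 2)

# The "spinor": row 0 holds psi_+(x), row 1 holds psi_-(x).
spinor = np.stack([psi_plus, psi_minus])
spinor = spinor / np.sqrt(np.sum(np.abs(spinor) ** 2) * dx)

# Probability of spin up (at any position) and of spin down (at any position).
prob_up = np.sum(np.abs(spinor[0]) ** 2) * dx
prob_down = np.sum(np.abs(spinor[1]) ** 2) * dx
print(prob_up, prob_down, prob_up + prob_down)
```

Doubling the basis simply doubles the number of stored amplitudes; adding a second spatial dimension would instead turn each row into a two-dimensional array.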
Consider a spinless particle in three dimensions:
(1) The particle’s state is described by a vector $|\psi\rangle$.
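(2) The vector has dimension ∞³, reflecting the fact that any basis, for example the basis $\{|x,y,z\rangle\}$, which is also written as $\{|\mathbf{x}\rangle\}$, has ∞³ elements. (No basis is better than any other basis: for every statement below concerning position there is a parallel statement concerning momentum, but for concreteness we’ll discuss only position.)
(3) These basis elements are orthonormal,
$$\langle x,y,z|x',y',z'\rangle = \delta(x - x')\,\delta(y - y')\,\delta(z - z'), \tag{12.9}$$
which is also written as
$$\langle \mathbf{x}|\mathbf{x}'\rangle = \delta(\mathbf{x} - \mathbf{x}'). \tag{12.10}$$
In addition, the basis elements are complete
$$\hat 1 = \int_{-\infty}^{+\infty} dx \int_{-\infty}^{+\infty} dy \int_{-\infty}^{+\infty} dz\, |x,y,z\rangle\langle x,y,z|, \tag{12.11}$$
which is also written as
$$\hat 1 = \int_{-\infty}^{+\infty} d^3x\, |\mathbf{x}\rangle\langle \mathbf{x}|. \tag{12.12}$$
(4) The state $|\psi\rangle$ is represented (in this basis) by the numbers $\langle \mathbf{x}|\psi\rangle = \psi(\mathbf{x})$ (a complex-valued function of a vector argument).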
(5) When the position is measured, the probability of finding the particle at a position within $d^3x$ about $\mathbf{x}_0$ is
$$|\psi(\mathbf{x}_0)|^2\, d^3x. \tag{12.13}$$
12.2 Vector operators
So much for states. . . what about operators?

The general idea of a vector is that it’s “something like an arrow”. But
in what way like an arrow? If you work with the components of a vector,
how can the components tell you that they represent something that’s “like
an arrow”?
Consider the vector momentum $\mathbf{p}$. If the coordinate axes are $x$ and $y$, the components of the vector $\mathbf{p}$ are $p_x$ and $p_y$. But if the coordinate axes are $x'$ and $y'$, then the components of the vector $\mathbf{p}$ are $p_{x'}$ and $p_{y'}$. It’s the same vector, but it has different components using different coordinate axes.
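[Figure: coordinate axes x and y together with rotated axes x′ and y′; θ is the angle from the x axis to the x′ axis.]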
How are these two sets of coordinates related? It’s not hard to show that they’re related through
$$\begin{aligned}
p_{x'} &= p_x \cos\theta + p_y \sin\theta \\
p_{y'} &= -p_x \sin\theta + p_y \cos\theta
\end{aligned} \tag{12.14}$$
(There’s a similar but more complicated formula for three-dimensional vec-
tors.)
We use this same formula for change of coordinates under rotation
whether it’s a position vector or a velocity vector or a momentum vector,
despite the fact that position, velocity, and momentum are very different
in character. It is in this sense that position, velocity, and momentum are
all “like an arrow” and it is in this way that the components of a vector
show that the entity behaves “like an arrow”.
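A short numerical sketch of equation (12.14) may help. The function name `rotate_components` and the sample vectors are assumptions made for illustration. The same transformation is applied to a "momentum" and to a "position", and quantities that do not depend on the choice of axes (a length, a dot product) come out unchanged.

```python
import numpy as np

def rotate_components(v, theta):
    """Components of the same 2D vector along axes rotated by theta, as in (12.14)."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, s], [-s, c]])
    return rotation @ v

# Two hypothetical vectors of very different character.
p = np.array([3.0, 4.0])    # a momentum
r = np.array([1.0, -2.0])   # a position

theta = 0.7
p_rot = rotate_components(p, theta)
r_rot = rotate_components(r, theta)

print(np.linalg.norm(p), np.linalg.norm(p_rot))   # same length in both frames
print(p @ r, p_rot @ r_rot)                       # same dot product in both frames
```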
Now, what is a “vector operator”? In two dimensions, it’s a set of two operators that transform under rotation just as the two components of a vector do:
$$\begin{aligned}
\hat p_{x'} &= \hat p_x \cos\theta + \hat p_y \sin\theta \\
\hat p_{y'} &= -\hat p_x \sin\theta + \hat p_y \cos\theta
\end{aligned} \tag{12.15}$$
(There’s a similar but more complicated formula for three-dimensional vec-
tor operators.)
Meanwhile, a “scalar operator” is one that doesn’t change when the
coordinate axes are rotated.
For every vector operator there is a scalar operator
$$\hat p^2 = \hat p_x^2 + \hat p_y^2 + \hat p_z^2. \tag{12.16}$$
12.3 Multiple particles
In section 12.1 we considered adding spin and spatial degrees of freedom for a single particle. But the same scheme works for adding additional particles. (There are peculiarities that apply to identical particles; see chapter 15. In this section we’ll consider only non-identical particles.)
Consider a system of two spinless particles (call them red and green)
moving in one dimension:
(1) The system’s state is described by a vector $|\psi\rangle$.
(2) The vector has dimension ∞², reflecting the fact that any basis, for example the basis $\{|x_R, x_G\rangle\}$, has ∞² elements. (No basis is better than any other basis: for every statement below concerning two positions there is a parallel statement concerning two momenta, but for concreteness we’ll discuss only position.)
(3) These basis elements are orthonormal,
$$\langle x_R, x_G|x'_R, x'_G\rangle = \delta(x_R - x'_R)\,\delta(x_G - x'_G). \tag{12.17}$$
In addition, the basis elements are complete
$$\hat 1 = \int_{-\infty}^{+\infty} dx_R \int_{-\infty}^{+\infty} dx_G\, |x_R, x_G\rangle\langle x_R, x_G|. \tag{12.18}$$
(4) The state $|\psi\rangle$ is represented (in this basis) by the numbers $\langle x_R, x_G|\psi\rangle = \psi(x_R, x_G)$ (a complex-valued function of a two-variable argument).
(5) When the positions of both particles are measured, the probability of finding the red particle within a window of width $dx_A$ about $x_A$ and the green particle within a window of width $dx_B$ about $x_B$ is
$$|\psi(x_A, x_B)|^2\, dx_A\, dx_B. \tag{12.19}$$
12.4 The phenomena of quantum mechanics
We started (chapter 1) with the phenomena of quantum mechanics: quantization, probability, interference, and entanglement. We used these phenomena to build up the formalism of quantum mechanics: amplitudes, state vectors, operators, etc. (chapter 2).
We’ve been working at the level of formalism for so long that we’re in
danger of forgetting the phenomena that underlie the formalism: For exam-
ple in this chapter we discussed how the formalism of quantum mechanics
applies to continuum systems in three dimensions. It’s time to return to
the level of phenomena and ask how the phenomena of quantum mechanics
generalize to continuum systems in three dimensions.
Interference
Interference of a particle — experiments of Tonomura:

A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki, and H. Ezawa, “Demon-
stration of single-electron buildup of an interference pattern,” American
Journal of Physics, 57 (1989) 117–120.
http://www.hqrd.hitachi.co.jp/em/doubleslit.cfm
Entanglement
How does one describe the state of a single classical particle moving in one
dimension? It requires two numbers: a position and a momentum (or a
position and a velocity). Two particles moving in one dimension require
merely that we specify the state of each particle: four numbers. Similarly,
specifying the state of three particles requires six numbers and N particles
require 2N numbers. Exactly the same specification counts hold if the
particle moves relativistically.
How, in contrast, does one describe the state of a single quantal par-
ticle moving in one dimension? A problem arises at the very start, here,
because the specification is given through a complex-valued wavefunction
ψ(x). Technically the specification requires an infinite number of numbers!
Let’s approximate the wavefunction through its value on a grid of, say, 100
points. This suggests that a specification requires 200 real numbers, a com-
plex number at each grid point, but one number is taken care of through
the overall phase of the wavefunction, and one through normalization. The
specification actually requires 198 independent real numbers.
How does one describe the state of two quantal particles moving in one dimension? Now the wavefunction is a function of two variables $\psi(x_A, x_B)$. (This wavefunction might factorize into a function of $x_A$ alone times a function of $x_B$ alone, but it might not. If it does factorize, the two particles are unentangled; if it does not, the two particles are entangled. In the general quantal case a two-particle state is not specified by giving the state of each individual particle, because the individual particles might not have states.)
The wavefunction of the system is a function of two-dimensional configu-
ration space, so an approximation of the accuracy established previously
requires a 100 × 100 grid of points. Each grid point carries one complex
number, and again overall phase and normalization reduce the number of
real numbers required by two. For two particles the specification requires
$2 \times (100)^2 - 2 = 19\,998$ independent real numbers.
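Whether a gridded two-particle wavefunction factorizes can be tested numerically. The sketch below is an illustration under stated assumptions, not a method from the text: it treats the 100 × 100 array of amplitudes as a matrix and counts its significant singular values. A product wavefunction is an outer product, so it has exactly one; anything more signals entanglement. The two sample wavefunctions, the tolerance, and the name `count_product_terms` are all made up for the example.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 100)
XA, XB = np.meshgrid(x, x, indexing="ij")

# Hypothetical unentangled state: (function of x_A) times (function of x_B).
product_state = np.exp(-XA ** 2 / 2) * np.exp(-(XB - 1.0) ** 2 / 2)

# Hypothetical entangled state: the particles tend to sit near each other.
correlated_state = np.exp(-(XA - XB) ** 2 / 2 - (XA + XB) ** 2 / 8)

def count_product_terms(psi, tol=1e-10):
    """Number of singular values of the amplitude matrix above a relative tolerance."""
    singular_values = np.linalg.svd(psi, compute_uv=False)
    return int(np.sum(singular_values > tol * singular_values[0]))

print(count_product_terms(product_state))      # one term: factorizes
print(count_product_terms(correlated_state))   # several terms: does not factorize
```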
Similarly, specifying the state of $N$ quantal particles moving in one dimension requires a wavefunction in $N$-dimensional configuration space which (for a grid of the accuracy we’ve been using) is specified through $2 \times (100)^N - 2$ independent real numbers.
The specification of a quantal state not only requires more real numbers
than the specification of the corresponding classical state, but that number
increases exponentially rather than linearly with the number of particles
N.
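The two counting rules are easy to tabulate side by side. This small sketch uses the 100-point grid of the discussion above; the function names are assumptions chosen for illustration.

```python
def classical_count(n_particles):
    """Real numbers needed for N classical particles in one dimension."""
    return 2 * n_particles

def quantal_count(n_particles, grid_points=100):
    """Real numbers needed for N quantal particles on a grid, less phase and normalization."""
    return 2 * grid_points ** n_particles - 2

for n in range(1, 6):
    print(n, classical_count(n), quantal_count(n))
```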
The fact that a quantal state holds more information than a classical
state is the fundamental reason that a quantal computer is (in principle)
faster than a classical computer, and the basis for much of quantum infor-
mation theory.
Relativity is different from classical physics, but no more complicated.
Quantum mechanics, in contrast, is both different from and richer than
classical physics. You may refer to this richness using terms like “splendor”,
or “abounding”, or “intricate”, or “ripe with possibilities”. Or you may
refer to it using terms like “complicated”, or “messy”, or “full of details
likely to trip the innocent”. It’s your choice how to react to this richness,
but you can’t deny it.
Chapter 13
Angular Momentum
13.1 Solution of the angular momentum eigenproblem
We solved the simple harmonic oscillator energy eigenproblem twice: once
using a straightforward but laborious differential equation technique, and
then again using an operator-factorization technique that was much easier
to implement, but which involved unmotivated creative leaps. We’ll do
the same with the angular momentum eigenproblem, but in the opposite
sequence.
Here’s the problem:
Given Hermitian operators $\hat J_x$, $\hat J_y$, $\hat J_z$ obeying
$$[\hat J_x, \hat J_y] = i\hbar \hat J_z, \quad\text{and cyclic permutations} \tag{13.1}$$
find the eigenvalues and eigenvectors for one such operator, say $\hat J_z$.
Any other component of angular momentum, say $\hat J_x$ or $\hat J_{42^\circ}$, will have exactly the same eigenvalues, and eigenvectors with the same structure.
Note that we are to solve the problem using only the commutation relations: we are not to use, say, the expression for the angular momentum operator in the position basis, or the relationship between angular momentum and rotation.
Strangely, our first step is to slightly expand the problem. (I warned
you that the solution would not take a straightforward, “follow your nose”
path.)
Define
$$\hat J^2 = \hat J_x^2 + \hat J_y^2 + \hat J_z^2 \tag{13.2}$$
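and note that
$$[\hat J^2, \hat J_i] = 0 \quad\text{for } i = x, y, z. \tag{13.3}$$
Because $\hat J^2$ and $\hat J_z$ commute, they have a basis of simultaneous eigenvectors. We expand the problem to find these simultaneous eigenvectors $|\beta,\mu\rangle$, which satisfy
$$\hat J^2 |\beta,\mu\rangle = \hbar^2 \beta\, |\beta,\mu\rangle \tag{13.4}$$
$$\hat J_z |\beta,\mu\rangle = \hbar \mu\, |\beta,\mu\rangle \tag{13.5}$$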
We define the values β and µ in this way so that β and µ will be dimensionless. (If this is not obvious to you, show it now. Also, if it’s not obvious that the equations (13.3) follow from the equations (13.1), you should show that, too. What is the commutator $[\hat J^2, \hat J_{28^\circ}]$?)
Start off by noting that
$$(\hat J_x^2 + \hat J_y^2)|\beta,\mu\rangle = (\hat J^2 - \hat J_z^2)|\beta,\mu\rangle = \hbar^2(\beta - \mu^2)|\beta,\mu\rangle. \tag{13.6}$$
Now the first operator $(\hat J_x^2 + \hat J_y^2)$ would be $(\hat J_x - i\hat J_y)(\hat J_x + i\hat J_y)$ if $\hat J_x$ and $\hat J_y$ were numbers. The factorization is not in fact quite that clean, because those operators are not in fact numbers. But we use this to inspire the definitions
$$\hat J_- = \hat J_x - i\hat J_y \quad\text{and}\quad \hat J_+ = \hat J_x + i\hat J_y \tag{13.7}$$
so that
$$\hat J_- \hat J_+ = \hat J_x^2 + \hat J_y^2 + i(\hat J_x \hat J_y - \hat J_y \hat J_x) = \hat J_x^2 + \hat J_y^2 + i[\hat J_x, \hat J_y] = \hat J_x^2 + \hat J_y^2 - \hbar \hat J_z. \tag{13.8}$$
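This tells us that
$$\hat J_- \hat J_+ |\beta,\mu\rangle = (\hbar^2\beta - \hbar^2\mu^2 - \hbar^2\mu)|\beta,\mu\rangle = \hbar^2(\beta - \mu(\mu+1))|\beta,\mu\rangle. \tag{13.9}$$
We have immediately that
$$\langle\beta,\mu|\hat J_- \hat J_+|\beta,\mu\rangle = \hbar^2(\beta - \mu(\mu+1)). \tag{13.10}$$
But if we define
$$|\phi\rangle = \hat J_+|\beta,\mu\rangle \quad\text{then}\quad \langle\phi| = \langle\beta,\mu|\hat J_-$$
then equation (13.10) is just the expression for $\langle\phi|\phi\rangle$, and we know that for any vector $\langle\phi|\phi\rangle \geq 0$. Thus
$$\beta \geq \mu(\mu+1). \tag{13.11}$$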
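With these preliminaries out of the way, we investigate the operator $\hat J_+$. First, its commutation relations:
$$[\hat J^2, \hat J_+] = 0, \tag{13.12}$$
$$[\hat J_z, \hat J_+] = [\hat J_z, \hat J_x] + i[\hat J_z, \hat J_y] = (i\hbar\hat J_y) + i(-i\hbar\hat J_x) = \hbar\hat J_+. \tag{13.13}$$
Then, use the commutation relations to find the effect of $\hat J_+$ on $|\beta,\mu\rangle$. If we again define $|\phi\rangle = \hat J_+|\beta,\mu\rangle$, then
$$\hat J^2|\phi\rangle = \hat J^2\hat J_+|\beta,\mu\rangle = \hat J_+\hat J^2|\beta,\mu\rangle = \hbar^2\beta\,\hat J_+|\beta,\mu\rangle = \hbar^2\beta\,|\phi\rangle, \tag{13.14}$$
$$\hat J_z|\phi\rangle = \hat J_z\hat J_+|\beta,\mu\rangle = (\hat J_+\hat J_z + \hbar\hat J_+)|\beta,\mu\rangle = \hbar\mu\,\hat J_+|\beta,\mu\rangle + \hbar\hat J_+|\beta,\mu\rangle = \hbar(\mu+1)|\phi\rangle. \tag{13.15}$$
That is, the vector $|\phi\rangle$ is an eigenvector of $\hat J^2$ with eigenvalue β and an eigenvector of $\hat J_z$ with eigenvalue µ + 1. In other words,
$$\hat J_+|\beta,\mu\rangle = C\,|\beta,\mu+1\rangle \tag{13.16}$$
where $C$ is a normalization factor to be determined.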
To find $C$, we contrast
$$\langle\phi|\phi\rangle = |C|^2\langle\beta,\mu|\beta,\mu\rangle = |C|^2 \tag{13.17}$$
with the result of equation (13.10), namely
$$\langle\phi|\phi\rangle = \langle\beta,\mu|\hat J_-\hat J_+|\beta,\mu\rangle = \hbar^2(\beta - \mu(\mu+1)). \tag{13.18}$$
From this we may select $C = \hbar\sqrt{\beta - \mu(\mu+1)}$ so that
$$\hat J_+|\beta,\mu\rangle = \hbar\sqrt{\beta - \mu(\mu+1)}\;|\beta,\mu+1\rangle. \tag{13.19}$$
In short, the operator $\hat J_+$ applied to $|\beta,\mu\rangle$ acts as a raising operator: it doesn’t change the value of β, but it increases the value of µ by 1.
Parallel reasoning applied to $\hat J_-$ shows that
$$\hat J_-|\beta,\mu\rangle = \hbar\sqrt{\beta - \mu(\mu-1)}\;|\beta,\mu-1\rangle. \tag{13.20}$$
In short, the operator $\hat J_-$ applied to $|\beta,\mu\rangle$ acts as a lowering operator: it doesn’t change the value of β, but it decreases the value of µ by 1.
At first it might appear that we could use these raising or lowering operators to ascend to infinitely high heavens or to dive to infinitely low depths, but that appearance is incorrect. Equation (13.11),
$$\beta \geq \mu(\mu+1), \tag{13.21}$$
will necessarily be violated for sufficiently high or sufficiently low values of µ. Instead, there must be some maximum value of µ, call it $\mu_{\max}$, such that an attempt to raise $|\beta,\mu_{\max}\rangle$ results not in a vector proportional to $|\beta,\mu_{\max}+1\rangle$, but results instead in 0. It is clear from equation (13.19) that this value of µ satisfies
$$\beta - \mu_{\max}(\mu_{\max}+1) = 0. \tag{13.22}$$
And it’s equally clear from equation (13.20) that there is a minimum value $\mu_{\min}$ satisfying
$$\beta - \mu_{\min}(\mu_{\min}-1) = 0. \tag{13.23}$$
Solving these two equations simultaneously, we find that
$$\mu_{\max} = -\mu_{\min} \quad\text{with}\quad \mu_{\max} \geq 0 \tag{13.24}$$
and that
$$\beta = \mu_{\max}(\mu_{\max}+1). \tag{13.25}$$
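But there’s more. Because we raise or lower µ by 1 with each application of $\hat J_+$ or $\hat J_-$, the value of $\mu_{\max}$ must be an integer above $\mu_{\min}$:
$$\begin{aligned}
\mu_{\max} &= \mu_{\min} + (\text{an integer}) \\
2\mu_{\max} &= (\text{an integer}) \\
\mu_{\max} &= \frac{\text{an integer}}{2} \geq 0
\end{aligned} \tag{13.26}$$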
Common practice is to call the half-integer $\mu_{\max}$ by the name $j$, and the half-integer µ by the name $m$. And common practice is to label the angular momentum state not as $|\beta,\mu\rangle$ but as $|j,m\rangle$, which contains equivalent information. Using these conventions, the solution to the angular momentum eigenvalue problem is:
The eigenvalues of $\hat J^2$ are
$$\hbar^2 j(j+1) \qquad j = 0, \tfrac12, 1, \tfrac32, 2, \ldots. \tag{13.27}$$
For a given $j$, the eigenvalues of $\hat J_z$ are
$$\hbar m \qquad m = -j, -j+1, \ldots, j-1, j. \tag{13.28}$$
The eigenstates $|j,m\rangle$ are related through the operators
$$\hat J_+ = \hat J_x + i\hat J_y \qquad \hat J_- = \hat J_x - i\hat J_y \tag{13.29}$$
by
$$\hat J_+|j,m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\;|j,m+1\rangle \tag{13.30}$$
$$\hat J_-|j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|j,m-1\rangle. \tag{13.31}$$
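The boxed results are enough to write down explicit matrices. The sketch below is an illustration, not part of the derivation: it builds the matrices in the $|j,m\rangle$ basis from equations (13.28) and (13.30), in units where ħ = 1, and confirms numerically that they reproduce the commutation relation (13.1) and the eigenvalue (13.27). The function name, the basis ordering (m = j, j−1, ..., −j), and the parameter name `spin` for j are assumptions.

```python
import numpy as np

def angular_momentum_matrices(spin):
    """Return (Jx, Jy, Jz) in the |j,m> basis, ordered m = j, j-1, ..., -j, with hbar = 1."""
    dim = int(round(2 * spin + 1))
    m = spin - np.arange(dim)
    Jz = np.diag(m).astype(complex)
    Jplus = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        # Equation (13.30): J+ takes |j,m> (column col) to |j,m+1> (row col-1).
        Jplus[col - 1, col] = np.sqrt(spin * (spin + 1) - m[col] * (m[col] + 1))
    Jminus = Jplus.conj().T
    Jx = 0.5 * (Jplus + Jminus)
    Jy = -0.5j * (Jplus - Jminus)    # (J+ - J-)/(2i)
    return Jx, Jy, Jz

for spin in (0.5, 1.0, 1.5):
    Jx, Jy, Jz = angular_momentum_matrices(spin)
    commutator_ok = np.allclose(Jx @ Jy - Jy @ Jx, 1j * Jz)
    J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
    casimir_ok = np.allclose(J2, spin * (spin + 1) * np.eye(len(Jz)))
    print(spin, commutator_ok, casimir_ok)
```

For spin 1/2 the three matrices are one half of the Pauli matrices.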
Problems:
(1) Write out the “parallel reasoning” that results in equation (13.20).
(2) The simultaneous solution of equations (13.22) and (13.23) results in two possible solutions, namely (13.24) and $\mu_{\min} = \mu_{\max} + 1$. Why do we reject this second solution? Why do we, in equation (13.24), insert the proviso $\mu_{\max} \geq 0$?
13.2 Summary of the angular momentum eigenproblem
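Given $[\hat J_x, \hat J_y] = i\hbar\hat J_z$, and cyclic permutations, the eigenvalues of $\hat J^2$ are
$$\hbar^2 j(j+1) \qquad j = 0, \tfrac12, 1, \tfrac32, 2, \ldots.$$
For a given $j$, the eigenvalues of $\hat J_z$ are
$$\hbar m \qquad m = -j, -j+1, \ldots, j-1, j.$$
The eigenstates $|j,m\rangle$ are related through the operators
$$\hat J_+ = \hat J_x + i\hat J_y \qquad \hat J_- = \hat J_x - i\hat J_y$$
by
$$\hat J_+|j,m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\;|j,m+1\rangle$$
$$\hat J_-|j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|j,m-1\rangle.$$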
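13.3 Ordinary differential equations for the $d^{(j)}_{m,m'}(\theta)$
$$d^{(j)}_{m,m'}(\theta) = \langle j,m|e^{-i(\hat J_y/\hbar)\theta}|j,m'\rangle$$
$$\begin{aligned}
\frac{d}{d\theta}\left[d^{(j)}_{m,m'}(\theta)\right] &= \langle j,m|e^{-i(\hat J_y/\hbar)\theta}(-i\hat J_y/\hbar)|j,m'\rangle \\
&= -\frac{1}{2\hbar}\langle j,m|e^{-i(\hat J_y/\hbar)\theta}(\hat J_+ - \hat J_-)|j,m'\rangle \\
&= -\tfrac12\sqrt{j(j+1) - m'(m'+1)}\;\langle j,m|e^{-i(\hat J_y/\hbar)\theta}|j,m'+1\rangle \\
&\quad + \tfrac12\sqrt{j(j+1) - m'(m'-1)}\;\langle j,m|e^{-i(\hat J_y/\hbar)\theta}|j,m'-1\rangle
\end{aligned}$$
$$\frac{d}{d\theta}\left[d^{(j)}_{m,m'}(\theta)\right] = -\tfrac12\sqrt{j(j+1) - m'(m'+1)}\;d^{(j)}_{m,m'+1}(\theta) + \tfrac12\sqrt{j(j+1) - m'(m'-1)}\;d^{(j)}_{m,m'-1}(\theta)$$
For a given $j$ and $m$, these are $2j+1$ coupled first-order ODEs. In matrix form they are
$$\frac{d}{d\theta}\begin{pmatrix}
d^{(j)}_{m,j}(\theta) \\ d^{(j)}_{m,j-1}(\theta) \\ d^{(j)}_{m,j-2}(\theta) \\ d^{(j)}_{m,j-3}(\theta) \\ \vdots \\ d^{(j)}_{m,-j+1}(\theta) \\ d^{(j)}_{m,-j}(\theta)
\end{pmatrix}
= \frac{1}{\sqrt2}\begin{pmatrix}
0 & \sqrt{j} & 0 & 0 & \cdots & 0 & 0 \\
-\sqrt{j} & 0 & \sqrt{2j-1} & 0 & \cdots & 0 & 0 \\
0 & -\sqrt{2j-1} & 0 & \sqrt{3j-3} & \cdots & 0 & 0 \\
0 & 0 & -\sqrt{3j-3} & 0 & \cdots & 0 & 0 \\
\vdots & & & & \ddots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & \sqrt{j} \\
0 & 0 & 0 & 0 & \cdots & -\sqrt{j} & 0
\end{pmatrix}
\begin{pmatrix}
d^{(j)}_{m,j}(\theta) \\ d^{(j)}_{m,j-1}(\theta) \\ d^{(j)}_{m,j-2}(\theta) \\ d^{(j)}_{m,j-3}(\theta) \\ \vdots \\ d^{(j)}_{m,-j+1}(\theta) \\ d^{(j)}_{m,-j}(\theta)
\end{pmatrix}.$$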
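Rather than integrating these ODEs, the amplitudes can also be obtained by exponentiating the matrix for $\hat J_y$ directly. The following is a sketch under assumptions: ħ = 1, the basis is ordered m = j, j−1, ..., −j, and the names `jy_matrix` and `small_d` are invented for the example. The exponential is formed from the eigen-decomposition of the Hermitian matrix, and for j = 1 the output is compared with the closed forms quoted in problem 13.4.

```python
import numpy as np

def jy_matrix(spin):
    """Matrix of J_y (hbar = 1) in the |j,m> basis ordered m = j, ..., -j."""
    dim = int(round(2 * spin + 1))
    m = spin - np.arange(dim)
    Jplus = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        Jplus[col - 1, col] = np.sqrt(spin * (spin + 1) - m[col] * (m[col] + 1))
    return -0.5j * (Jplus - Jplus.conj().T)

def small_d(spin, theta):
    """Matrix with entries <j,m| exp(-i J_y theta) |j,m'>."""
    eigenvalues, V = np.linalg.eigh(jy_matrix(spin))
    # V diag(exp(-i theta w)) V^dagger is the exponential of a Hermitian matrix.
    return (V * np.exp(-1j * theta * eigenvalues)) @ V.conj().T

theta = 0.3
c, s = np.cos(theta), np.sin(theta)
expected = np.array([
    [(1 + c) / 2, -s / np.sqrt(2), (1 - c) / 2],
    [s / np.sqrt(2), c, -s / np.sqrt(2)],
    [(1 - c) / 2, s / np.sqrt(2), (1 + c) / 2],
])
print(np.allclose(small_d(1.0, theta), expected))
```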
13.4 Problems
13.1 Trivial pursuit

a. Show that if an operator commutes with two components of an
angular momentum vector, it commutes with the third as well.
b. If $\hat J_x$ and $\hat J_z$ are represented by matrices with pure real entries (as is conventionally the case, see problem 13.2), show that $\hat J_y$ is represented by a matrix with pure imaginary entries.
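13.2 Matrix representations for spin-$\tfrac12$
If we are interested only in a particle’s angular momentum, and not in its position, momentum, etc., then for a spin-$\tfrac12$ particle the basis $\{|\tfrac12,\tfrac12\rangle, |\tfrac12,-\tfrac12\rangle\}$ spans the relevant states. These states are usually denoted simply $\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$. Recall that the matrix representation of operator $\hat A$ in this basis is
$$\begin{pmatrix}
\langle{\uparrow}|\hat A|{\uparrow}\rangle & \langle{\uparrow}|\hat A|{\downarrow}\rangle \\
\langle{\downarrow}|\hat A|{\uparrow}\rangle & \langle{\downarrow}|\hat A|{\downarrow}\rangle
\end{pmatrix}, \tag{13.32}$$
and recall also that this isn’t always the easiest way to find a matrix representation.
a. Find matrix representations in the $\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$ basis of $\hat S_z$, $\hat S_+$, $\hat S_-$, $\hat S_x$, $\hat S_y$, and $\hat S^2$. Note the reappearance of the Pauli matrices!
b. Find normalized column matrix representations for the eigenstates of $\hat S_x$:
$$\hat S_x|{\rightarrow}\rangle = +\frac{\hbar}{2}|{\rightarrow}\rangle \tag{13.33}$$
$$\hat S_x|{\leftarrow}\rangle = -\frac{\hbar}{2}|{\leftarrow}\rangle. \tag{13.34}$$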
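13.3 Rotations and spin-$\tfrac12$
Verify explicitly that
$$|{\rightarrow}\rangle = e^{-i(\hat S_y/\hbar)(+\pi/2)}|{\uparrow}\rangle, \tag{13.35}$$
$$|{\leftarrow}\rangle = e^{-i(\hat S_y/\hbar)(-\pi/2)}|{\uparrow}\rangle. \tag{13.36}$$
(Problems 2.27 through 2.29 are relevant here.)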
13.4 Spin-1 projection amplitudes
a. (Easy.) Prove that
$$d^{(j)}_{m,m'}(\theta) = [d^{(j)}_{m',m}(-\theta)]^*. \tag{13.37}$$
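b. Show that the $d^{(j)}_{m,m'}(\theta)$ with $j = 1$ are
$$\begin{array}{lll}
d^{(1)}_{1,1}(\theta) = +\tfrac12(\cos\theta + 1) & d^{(1)}_{1,0}(\theta) = -\tfrac{1}{\sqrt2}\sin\theta & d^{(1)}_{1,-1}(\theta) = -\tfrac12(\cos\theta - 1) \\
d^{(1)}_{0,1}(\theta) = +\tfrac{1}{\sqrt2}\sin\theta & d^{(1)}_{0,0}(\theta) = \cos\theta & d^{(1)}_{0,-1}(\theta) = -\tfrac{1}{\sqrt2}\sin\theta \\
d^{(1)}_{-1,1}(\theta) = -\tfrac12(\cos\theta - 1) & d^{(1)}_{-1,0}(\theta) = +\tfrac{1}{\sqrt2}\sin\theta & d^{(1)}_{-1,-1}(\theta) = +\tfrac12(\cos\theta + 1)
\end{array}$$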
Chapter 14
Central Force Motion
14.1 Energy eigenproblem in two dimensions
In one dimension, the energy eigenproblem is
$$-\frac{\hbar^2}{2M}\frac{d^2\eta_n(x)}{dx^2} + V(x)\eta_n(x) = E_n\eta_n(x). \tag{14.1}$$
The generalization to two dimensions is straightforward:
$$-\frac{\hbar^2}{2M}\left[\frac{\partial^2\eta_n(x,y)}{\partial x^2} + \frac{\partial^2\eta_n(x,y)}{\partial y^2}\right] + V(x,y)\eta_n(x,y) = E_n\eta_n(x,y). \tag{14.2}$$
The part in square brackets is called “the Laplacian of $\eta_n(x,y)$” and represented by the symbol “$\nabla^2$” as follows
$$\left[\frac{\partial^2 f(x,y)}{\partial x^2} + \frac{\partial^2 f(x,y)}{\partial y^2}\right] \equiv \nabla^2 f(x,y). \tag{14.3}$$
Thus the “mathematical form” of the energy eigenproblem is
$$\nabla^2\eta_n(\vec r) = -\frac{2M}{\hbar^2}[E_n - V(\vec r)]\eta_n(\vec r). \tag{14.4}$$
Suppose $V(x,y)$ is a function of distance from the origin $r$ only. Then it makes sense to use polar coordinates $r$ and $\theta$ rather than Cartesian coordinates $x$ and $y$. What is the expression for the Laplacian in polar coordinates? This can be uncovered through the chain rule, and it’s pretty hard to do. Fortunately, you can look up the answer:
$$\nabla^2 f(\vec r) = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial f(r,\theta)}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 f(r,\theta)}{\partial\theta^2}. \tag{14.5}$$
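If you would rather test the looked-up formula than derive it, a finite-difference check is quick. The sketch below is an illustration only: the test function $f = x^3 y + y^2$, the sample point, the step size, and the function names are assumptions. It evaluates the polar expression (14.5) numerically, using $\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial f}{\partial r}\right) = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r}$, and compares it with the Cartesian Laplacian computed by hand.

```python
import math

def f_cartesian(x, y):
    """Hypothetical test function; its Cartesian Laplacian is 6xy + 2."""
    return x ** 3 * y + y ** 2

def f_polar(r, theta):
    return f_cartesian(r * math.cos(theta), r * math.sin(theta))

def laplacian_polar(g, r, theta, h=1e-4):
    """Central-difference estimate of the polar Laplacian (14.5)."""
    d_r = (g(r + h, theta) - g(r - h, theta)) / (2 * h)
    d_rr = (g(r + h, theta) - 2 * g(r, theta) + g(r - h, theta)) / h ** 2
    d_tt = (g(r, theta + h) - 2 * g(r, theta) + g(r, theta - h)) / h ** 2
    return d_rr + d_r / r + d_tt / r ** 2

r0, theta0 = 1.3, 0.6
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
print(laplacian_polar(f_polar, r0, theta0), 6 * x0 * y0 + 2)
```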
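Thus, the partial differential equation to be solved is
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\eta_n(r,\theta)}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\eta_n(r,\theta)}{\partial\theta^2} = -\frac{2M}{\hbar^2}[E_n - V(r)]\eta_n(r,\theta) \tag{14.6}$$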
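or
$$\left[\frac{\partial^2}{\partial\theta^2} + r\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{2M}{\hbar^2}r^2[E_n - V(r)]\right]\eta_n(r,\theta) = 0. \tag{14.7}$$
For convenience, we wrap up all the $r$ dependence into one piece by defining the linear operator
$$\mathcal{L}_n(r) \equiv r\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{2M}{\hbar^2}r^2[E_n - V(r)] \tag{14.8}$$
and write the above as
$$\left[\frac{\partial^2}{\partial\theta^2} + \mathcal{L}_n(r)\right]\eta_n(r,\theta) = 0. \tag{14.9}$$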
There are at least two ways to approach the above equation: the Fourier
series method and the separation of variables method. We’ll try each one
in turn.
Fourier series:
Because increasing the angle $\theta$ by $2\pi$ brings you to the same point where you started, the function $\eta_n(r,\theta)$ is periodic in $\theta$ with period $2\pi$. And the theory of Fourier series teaches that any such function can be written in the form
$$a_0 + \sum_{n=1}^{\infty}(a_n\cos n\theta + b_n\sin n\theta). \tag{14.10}$$
But, because
$$\cos n\theta = \frac{e^{in\theta} + e^{-in\theta}}{2} \quad\text{and}\quad \sin n\theta = \frac{e^{in\theta} - e^{-in\theta}}{2i},$$
this series can also be written in the form
$$\sum_{\ell=-\infty}^{\infty} c_\ell\, e^{i\ell\theta}. \tag{14.11}$$
[When dealing with real functions, the form (14.10) has obvious attractions. But the form (14.11) is always more compact, and when dealing with complex-valued functions it’s more natural as well.] The Fourier theorem (14.11) is just another way of saying that the set of functions
$$e^{i\ell\theta} \quad\text{with}\quad \ell = 0, \pm1, \pm2, \pm3, \ldots \tag{14.12}$$
constitute a basis for the set of continuous functions with periodicity $2\pi$.
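The basis claim can be watched in action numerically. In this sketch (an illustration; the periodic test function, the grid size, the cutoff |ℓ| ≤ 20, and the name `fourier_coefficient` are all assumptions) the coefficients $c_\ell = \frac{1}{2\pi}\int_0^{2\pi} f(\theta)e^{-i\ell\theta}\,d\theta$ are estimated by averaging over a uniform grid, and the function is then rebuilt from the form (14.11).

```python
import numpy as np

theta = np.linspace(0.0, 2 * np.pi, 512, endpoint=False)

# Hypothetical smooth function with period 2*pi.
f = np.exp(np.cos(theta)) * (1 + 0.5 * np.sin(3 * theta))

def fourier_coefficient(values, angles, ell):
    """Estimate c_ell as the grid average of f(theta) exp(-i ell theta)."""
    return np.mean(values * np.exp(-1j * ell * angles))

ells = np.arange(-20, 21)
coefficients = np.array([fourier_coefficient(f, theta, ell) for ell in ells])

# Rebuild the function from the complex-exponential series (14.11).
reconstruction = sum(c * np.exp(1j * ell * theta) for c, ell in zip(coefficients, ells))
print(np.max(np.abs(reconstruction - f)))
```

Because this f is real, the coefficients come in conjugate pairs, $c_{-\ell} = c_\ell^*$, which is how the form (14.10) hides inside (14.11).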
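Applying this Fourier theorem to the problem at hand, we write the energy eigenfunction as
$$\eta_n(r,\theta) = \sum_{\ell=-\infty}^{\infty} R_{n,\ell}(r)\,e^{i\ell\theta} \tag{14.13}$$
and then take
$$\begin{aligned}
0 &= \left[\frac{\partial^2}{\partial\theta^2} + \mathcal{L}_n(r)\right]\eta_n(r,\theta) \\
&= \left[\frac{\partial^2}{\partial\theta^2} + \mathcal{L}_n(r)\right]\sum_{\ell=-\infty}^{\infty} R_{n,\ell}(r)\,e^{i\ell\theta} \\
&= \sum_{\ell=-\infty}^{\infty}\left[\frac{\partial^2}{\partial\theta^2} + \mathcal{L}_n(r)\right] R_{n,\ell}(r)\,e^{i\ell\theta} \\
&= \sum_{\ell=-\infty}^{\infty}\left[-\ell^2 R_{n,\ell}(r) + \mathcal{L}_n(r)R_{n,\ell}(r)\right]e^{i\ell\theta}.
\end{aligned} \tag{14.14}$$
Because the basis set $\{e^{i\ell\theta}\}$ is complete (or, equivalently, because Fourier series are unique) the expression in square brackets above must vanish, for every value of $\ell$, whence
$$\mathcal{L}_n(r)R_{n,\ell}(r) - \ell^2 R_{n,\ell}(r) = 0 \quad\text{for}\quad \ell = 0, \pm1, \pm2, \pm3, \ldots. \tag{14.15}$$
We no longer have a partial differential equation. Instead we have an infinite number of ordinary differential equations
$$\left[r\frac{d}{dr}\left(r\frac{d}{dr}\right) + \frac{2M}{\hbar^2}r^2[E_n - V(r)] - \ell^2\right]R_{n,\ell}(r) = 0. \tag{14.16}$$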
Separation of variables:
Our equation
$$\left[\frac{\partial^2}{\partial\theta^2} + \mathcal{L}_n(r)\right]\eta_n(r,\theta) = 0 \tag{14.17}$$
is a linear partial differential equation, so we cast around for solutions knowing that a linear combination of solutions will also be a solution, and hoping that we will cast our net wide enough to catch all the elements of a basis. We cast around using the technique of “separation of variables”, namely by looking for solutions of the form
$$\eta_n(r,\theta) = R(r)\Theta(\theta). \tag{14.18}$$
Plugging this form into the PDE gives
$$\begin{aligned}
R(r)\Theta''(\theta) + \Theta(\theta)\mathcal{L}_n(r)R(r) &= 0 \\
\frac{\Theta''(\theta)}{\Theta(\theta)} + \frac{\mathcal{L}_n(r)R(r)}{R(r)} &= 0
\end{aligned} \tag{14.19}$$
Through the usual separation-of-variables argument, we recognize that if a function of $r$ alone plus a function of $\theta$ alone sum to zero, where $r$ and $\theta$ are independent variables, then both functions must be equal to a constant:
$$\frac{\mathcal{L}_n(r)R(r)}{R(r)} = -\frac{\Theta''(\theta)}{\Theta(\theta)} = \text{const}. \tag{14.20}$$
First, look at the angular part:
$$\Theta''(\theta) = -\text{const}\;\Theta(\theta). \tag{14.21}$$
This is the differential equation for a mass on a spring! The two linearly independent solutions are
$$\Theta(\theta) = \sin(\sqrt{\text{const}}\;\theta) \quad\text{or}\quad \Theta(\theta) = \cos(\sqrt{\text{const}}\;\theta). \tag{14.22}$$
Now, the boundary condition for this ODE is just that the function must come back to itself if $\theta$ increases by $2\pi$:
$$\Theta(\theta) = \Theta(2\pi + \theta). \tag{14.23}$$
If you think about this for a minute, you’ll see that this means $\sqrt{\text{const}}$ must be an integer. The negative integers don’t give us anything new, so we’ll take
$$\sqrt{\text{const}} = \ell \quad\text{where}\quad \ell = 0, 1, 2, \ldots. \tag{14.24}$$
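In summary, the solution to the angular problem is
$$\begin{array}{l|ccccc}
 & \ell = 0 & \ell = 1 & \ell = 2 & \ell = 3 & \cdots \\ \hline
\Theta(\theta) & 1 & \sin\theta \text{ or } \cos\theta & \sin 2\theta \text{ or } \cos 2\theta & \sin 3\theta \text{ or } \cos 3\theta & \cdots
\end{array}$$
Alternatively, we can take linear combinations of the above to produce the set of solutions
$$\Theta(\theta) = e^{i\ell\theta} \quad\text{for}\quad \ell = 0, \pm1, \pm2, \pm3, \ldots. \tag{14.25}$$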
Now examine the radial part of the problem:
$$\frac{\mathcal{L}_n(r)R(r)}{R(r)} = \text{const} = \ell^2. \tag{14.26}$$
Because the radial operator depends on $n$, the solutions $R(r)$ will depend on both $n$ and $\ell$, so we denote them by $R_{n,\ell}(r)$. They solve the equation
$$\mathcal{L}_n(r)R_{n,\ell}(r) - \ell^2 R_{n,\ell}(r) = 0 \quad\text{for}\quad \ell = 0, \pm1, \pm2, \pm3, \ldots. \tag{14.27}$$
We no longer have a partial differential equation. Instead we have an infinite number of ordinary differential equations
$$\left[r\frac{d}{dr}\left(r\frac{d}{dr}\right) + \frac{2M}{\hbar^2}r^2[E_n - V(r)] - \ell^2\right]R_{n,\ell}(r) = 0. \tag{14.28}$$
Two routes converge:

We are at the end of the bifurcation. Both the “Fourier series” route and
the “Separation of variables” route have arrived at the same destination.
This is not a coincidence. The mathematical theory of Sturm-Liouville
problems¹ assures us that for a wide class of partial differential equations,
the result of separation of variables will give rise to a complete set of func-
tions, like the trigonometric functions that form a basis for the continuous
functions with period 2π.
Write the resulting one-variable ODE as
$$\begin{aligned}
\left[\frac{1}{r}\frac{d}{dr}\left(r\frac{d}{dr}\right) + \frac{2M}{\hbar^2}[E_n - V(r)] - \frac{\ell^2}{r^2}\right]R_{n,\ell}(r) &= 0 \\
\left[\frac{1}{r}\frac{d}{dr}\left(r\frac{d}{dr}\right) + \frac{2M}{\hbar^2}\left[E_n - V(r) - \frac{\hbar^2\ell^2}{2M r^2}\right]\right]R_{n,\ell}(r) &= 0
\end{aligned} \tag{14.29}$$
I want to compare this differential equation with another one-variable differential equation, namely the one for the energy eigenvalue problem in one dimension:
$$\left[\frac{d^2}{dx^2} + \frac{2M}{\hbar^2}[E - V(x)]\right]\eta(x) = 0. \tag{14.30}$$
The parts to the right are rather similar, but the parts to the left (the derivatives) are rather different.
In addition, the one-dimensional energy eigenfunction satisfies the normalization
$$\int_{-\infty}^{\infty}|\eta(x)|^2\,dx = 1, \tag{14.31}$$
whereas the two-dimensional energy eigenfunction satisfies the normalization
$$\begin{aligned}
\int |\eta(x,y)|^2\,dx\,dy &= 1 \\
\int_0^{\infty} dr\; r \int_0^{2\pi} d\theta\; |R_{n,\ell}(r)\,e^{i\ell\theta}|^2 &= 1 \\
2\pi\int_0^{\infty} dr\; r\,|R_{n,\ell}(r)|^2 &= 1.
\end{aligned} \tag{14.32}$$
¹ Charles-François Sturm (1803–1855) was a French mathematician who also helped make the first experimental determination of the speed of sound in water. Joseph Liouville (1809–1882), another French mathematician, made contributions in complex analysis, number theory, differential geometry, and classical mechanics. He was elected to the French Constituent Assembly of 1848 which established the Second Republic.
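This suggests that the true analog of the one-dimensional $\eta(x)$ is not $R_{n,\ell}(r)$, but rather
$$u_{n,\ell}(r) = \sqrt{r}\,R_{n,\ell}(r). \tag{14.33}$$
Furthermore,
$$\text{if } u(r) = \sqrt{r}\,R(r), \text{ then } \frac{1}{r}\frac{d}{dr}\left(rR'(r)\right) = \frac{1}{\sqrt r}\left[u''(r) + \frac14\frac{u(r)}{r^2}\right]. \tag{14.34}$$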
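The identity (14.34) takes only a few lines to verify. A sketch of the steps, writing $R = u/\sqrt{r}$ and differentiating:

```latex
\begin{align*}
R(r) &= r^{-1/2}\,u(r), \\
R'(r) &= r^{-1/2}\,u'(r) - \tfrac12 r^{-3/2}\,u(r), \\
r R'(r) &= r^{1/2}\,u'(r) - \tfrac12 r^{-1/2}\,u(r), \\
\frac{d}{dr}\bigl(r R'(r)\bigr) &= r^{1/2}\,u''(r) + \tfrac12 r^{-1/2}\,u'(r) - \tfrac12 r^{-1/2}\,u'(r) + \tfrac14 r^{-3/2}\,u(r)
  = r^{1/2}\,u''(r) + \tfrac14 r^{-3/2}\,u(r), \\
\frac{1}{r}\frac{d}{dr}\bigl(r R'(r)\bigr) &= \frac{1}{\sqrt r}\left[u''(r) + \frac14\frac{u(r)}{r^2}\right].
\end{align*}
```

The first-derivative terms cancel, which is exactly why the substitution $u = \sqrt{r}\,R$ turns the radial derivative into a plain second derivative plus a $1/r^2$ term.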
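Using this change of function, the radial equation (14.29) becomes
$$\begin{aligned}
\left[\frac{d^2}{dr^2} + \frac14\frac{1}{r^2} + \frac{2M}{\hbar^2}\left[E - V(r) - \frac{\hbar^2\ell^2}{2M r^2}\right]\right]u(r) &= 0 \\
\left[\frac{d^2}{dr^2} + \frac{2M}{\hbar^2}\left[E - V(r) - \frac{\hbar^2}{2M}\left(\ell^2 - \frac14\right)\frac{1}{r^2}\right]\right]u(r) &= 0 \\
\left[\frac{d^2}{dr^2} + \frac{2M}{\hbar^2}\left[E - V(r) - \frac{\hbar^2(\ell - \frac12)(\ell + \frac12)}{2M}\frac{1}{r^2}\right]\right]u(r) &= 0
\end{aligned} \tag{14.35}$$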
In this form, the radial equation is exactly like a one-dimensional energy eigenproblem, except that where the one-dimensional problem has the function $V(x)$, the radial problem has the function
$$V(r) + \frac{\hbar^2(\ell - \frac12)(\ell + \frac12)}{2M}\frac{1}{r^2}.$$
These two functions play parallel mathematical roles in the two problems. To emphasize these similar roles, we define an “effective potential energy” for the radial problem, namely
$$V_{\text{eff}}(r) = V(r) + \frac{\hbar^2(\ell - \frac12)(\ell + \frac12)}{2M}\frac{1}{r^2}. \tag{14.36}$$
Don’t read too much into the term “effective potential energy.” No actual potential energy will depend upon $\hbar$, or upon the separation constant $\ell$! I’m not saying that $V_{\text{eff}}(r)$ is a potential energy function, merely that it plays the mathematical role of one in solving this eigenproblem.
Now that the radial equation (14.35) is in exact correspondence with the one-dimensional equation (14.30), we can solve this eigenproblem using the same “curve toward or away from axis” techniques that we developed for the one-dimensional problem in chapter 10. (Or any other technique which works for the one-dimensional problem.) The resulting eigenfunctions and eigenvalues will, of course, depend upon the value of the separation constant $\ell$, because the effective potential depends upon the value of $\ell$. And as always, for each $\ell$ there will be many eigenvalues and eigenfunctions, which we will label by index $n = 1, 2, 3, \ldots$ calling them $u_{n,\ell}(r)$ with eigenvalue $E_{n,\ell}$.
Finally, note that the effective potential energy for $\ell = +5$ is the same as the effective potential energy for $\ell = -5$. Thus the eigenfunctions $u_{n,+5}(r)$ and eigenvalues $E_{n,+5}$ will be identical to the eigenfunctions $u_{n,-5}(r)$ and eigenvalues $E_{n,-5}$.
This is a really charming result. We haven’t yet specified the potential energy function $V(r)$, so we can’t yet determine, say, $E_{7,+5}$ or $E_{7,-5}$. Yet we know that these two energy eigenvalues will be equal! Whenever there are two different eigenfunctions, in this case
$$\frac{u_{n,+5}(r)}{\sqrt r}\,e^{+i5\theta} \quad\text{and}\quad \frac{u_{n,+5}(r)}{\sqrt r}\,e^{-i5\theta},$$
attached to the same eigenvalue, the eigenfunctions are said to be degenerate. I don’t know how such a disparaging term came to be attached to such a charming result, but it has been. [[Consider better placement of this remark.]]
Summary:
To solve the two-dimensional energy eigenproblem for a radially-
symmetric potential energy V (r), namely
$$-\frac{\hbar^2}{2M}\nabla^2\eta(\vec r) + V(r)\,\eta(\vec r) = E\,\eta(\vec r), \tag{14.37}$$
first solve the radial energy eigenproblem
$$-\frac{\hbar^2}{2M}\frac{d^2u(r)}{dr^2} + \left[V(r) + \frac{\hbar^2}{2M}\,\frac{(\ell-\tfrac12)(\ell+\tfrac12)}{r^2}\right]u(r) = E\,u(r) \tag{14.38}$$
for ` = 0, 1, 2, . . .. For a given `, call the resulting energy eigenfunctions and
eigenvalues un,` (r) and En,` for n = 1, 2, 3, . . .. Then the two-dimensional
solutions are
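$$\text{For } \ell = 0:\quad \eta(r,\theta) = \frac{u_{n,0}(r)}{\sqrt{r}} \quad\text{with energy } E_{n,0} \tag{14.39}$$
and
$$\text{For } \ell = 1, 2, 3, \ldots:\quad \eta(r,\theta) = \frac{u_{n,\ell}(r)}{\sqrt{r}}\,e^{+i\ell\theta} \quad\text{and}\quad \eta(r,\theta) = \frac{u_{n,\ell}(r)}{\sqrt{r}}\,e^{-i\ell\theta} \quad\text{with energy } E_{n,\ell}. \tag{14.40}$$
Alternatively, for the last equation we can use
$$\text{For } \ell = 1, 2, 3, \ldots:\quad \eta(r,\theta) = \frac{u_{n,\ell}(r)}{\sqrt{r}}\,\sin(\ell\theta) \quad\text{and}\quad \eta(r,\theta) = \frac{u_{n,\ell}(r)}{\sqrt{r}}\,\cos(\ell\theta) \quad\text{with energy } E_{n,\ell}. \tag{14.41}$$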
Reflection:
So we’ve reduced the two-dimensional problem to a one-dimensional
problem. How did this miracle occur? Two things happened:
• The original eigenvalue problem was of the form

{angular operator + radial operator}ηn (r, θ) = 0. (14.42)
• There was an angular operator eigenbasis {Φ` (θ)} such that
{angular operator}Φ` (θ) = number Φ` (θ). (14.43)
14.2 Energy eigenproblem in three dimensions
Can we get the same miracle to occur in three dimensions?
[Figure: x and y axes, with the angle φ.]
The energy eigenproblem is

$$-\frac{\hbar^2}{2M}\nabla^2\eta_n(\vec x) + V(r)\,\eta_n(\vec x) = E_n\,\eta_n(\vec x), \tag{14.44}$$
and the Laplacian in spherical coordinates is
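$$\begin{aligned}
\nabla^2 &= \frac{1}{r^2}\left[\frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r} + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\,\sin\theta\,\frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right] \\
&= \frac{1}{r^2}\left[\frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r} + \frac{\partial}{\partial\mu}\,(1-\mu^2)\frac{\partial}{\partial\mu} + \frac{1}{1-\mu^2}\frac{\partial^2}{\partial\phi^2}\right].
\end{aligned} \tag{14.45}$$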
Here φ ranges from 0 to 2π, and θ ranges from 0 to π. It is often convenient
to use, in place of θ, the variable
µ = cos θ where µ ranges from −1 to +1. (14.46)
(The situation µ = +1 corresponds to the “north pole” of the spherical
coordinate system – the positive z axis – while the situation µ = −1 corre-
sponds to the “south pole” – the negative z axis.) The energy eigenproblem
is then

$$\left[\nabla^2 + \frac{2M}{\hbar^2}\,[E_n - V(r)]\right]\eta_n(\vec x) = 0, \tag{14.47}$$
or
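$$\left[\frac{\partial}{\partial\mu}\,(1-\mu^2)\frac{\partial}{\partial\mu} + \frac{1}{1-\mu^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r} + \frac{2M}{\hbar^2}\,r^2\,[E_n - V(r)]\right]\eta_n(\vec x) = 0, \tag{14.48}$$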
and this expression is indeed in the desired form (14.42).
Now, is there a basis of angular operator eigenfunctions as required
by (14.43)?
We seek a complete set of functions on the unit sphere {yλ (µ, φ)} such
that
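$$\{\text{angular operator}\}\,y_\lambda(\mu,\phi) = \left[\frac{\partial}{\partial\mu}\,(1-\mu^2)\frac{\partial}{\partial\mu} + \frac{1}{1-\mu^2}\frac{\partial^2}{\partial\phi^2}\right]y_\lambda(\mu,\phi) = \lambda\,y_\lambda(\mu,\phi). \tag{14.49}$$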
These functions, like any function of angle φ, must be periodic in φ with
period 2π, so
$$y_\lambda(\mu,\phi) = \sum_{m=-\infty}^{\infty} p_{\lambda,m}(\mu)\,e^{im\phi}. \tag{14.50}$$
In these terms, the eigenproblem (14.49) becomes

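$$\sum_{m=-\infty}^{\infty}\left\{\left[\frac{\partial}{\partial\mu}\,(1-\mu^2)\frac{\partial}{\partial\mu} - \frac{m^2}{1-\mu^2} - \lambda\right]p_{\lambda,m}(\mu)\right\}e^{im\phi} = 0. \tag{14.51}$$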
Because of the uniqueness of Fourier series (compare the reasoning below
equation (14.14)) each term in curly brackets must vanish individually. For
each m,
$$\left[\frac{d}{d\mu}\,(1-\mu^2)\frac{d}{d\mu} - \frac{m^2}{1-\mu^2} - \lambda\right]p_{\lambda,m}(\mu) = 0, \qquad m = 0, \pm1, \pm2, \ldots. \tag{14.52}$$
This differential equation is called the “generalized Legendre2 equa-
tion.” It can be solved using the power-series method — details are left to
problem X.Y. There are, of course, two linearly independent solutions for
all values of m and λ. However, most of those solutions diverge at either
µ = 1, or µ = −1, or both. (That is, either at the north or south pole, or
both.) Solutions are finite for all µ from −1 to +1, inclusive, if and only if
$$\lambda = -\ell(\ell+1), \qquad \ell = 0, 1, 2, 3, \ldots \tag{14.53}$$
and
$$m = -\ell, -\ell+1, \ldots, 0, \ldots, \ell-1, \ell. \tag{14.54}$$
In these cases, the finite solution is called the “associated Legendre func-
tion” P`m (µ).
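For the special case m = 0 the associated Legendre function reduces to the ordinary Legendre polynomial, and the eigenvalue claim can be checked with exact rational arithmetic. The following is only a sketch under that assumption: the function names are my own, and the polynomials are generated with the standard Bonnet recurrence rather than the power-series method mentioned above.

```python
from fractions import Fraction

def legendre(l):
    """Coefficients (lowest power first) of the Legendre polynomial P_l(mu)."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]  # P_0 and P_1
    if l == 0:
        return p_prev
    for n in range(1, l):
        # Bonnet recurrence: (n+1) P_{n+1} = (2n+1) mu P_n - n P_{n-1}
        nxt = [Fraction(0)] * (n + 2)
        for k, c in enumerate(p):
            nxt[k + 1] += (2 * n + 1) * c
        for k, c in enumerate(p_prev):
            nxt[k] -= n * c
        p_prev, p = p, [c / (n + 1) for c in nxt]
    return p

def derivative(c):
    """Derivative of a polynomial given by its coefficient list."""
    return [k * c[k] for k in range(1, len(c))] or [Fraction(0)]

def times_one_minus_mu2(c):
    """Multiply a polynomial by (1 - mu^2)."""
    out = [Fraction(0)] * (len(c) + 2)
    for k, ck in enumerate(c):
        out[k] += ck
        out[k + 2] -= ck
    return out

def angular_operator(c):
    """Apply d/dmu (1 - mu^2) d/dmu, the m = 0 case of equation (14.52)."""
    return derivative(times_one_minus_mu2(derivative(c)))

def is_eigenfunction(l):
    """True if P_l is an eigenfunction with eigenvalue -l(l+1)."""
    p = legendre(l)
    lhs = angular_operator(p)
    rhs = [-l * (l + 1) * c for c in p]
    n = max(len(lhs), len(rhs))
    pad = lambda c: c + [Fraction(0)] * (n - len(c))
    return pad(lhs) == pad(rhs)

checks = [is_eigenfunction(l) for l in range(8)]
```

Every entry of `checks` comes out true, which is equation (14.53) in action for m = 0. The check says nothing about why other values of λ fail; that is the business of the power-series argument.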
These functions have a lot of interesting properties, but the main point
is that we can use them to define a set of functions on the unit sphere called
the spherical harmonics
$$Y_\ell^m(\theta,\phi) \ \text{ or } \ Y_\ell^m(\mu,\phi) = \sqrt{\frac{2\ell+1}{4\pi}\,\frac{(\ell-m)!}{(\ell+m)!}}\;P_\ell^m(\mu)\,e^{im\phi}. \tag{14.55}$$
(Some authors use different prefactors.)
The spherical harmonics satisfy
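$$\{\text{angular operator}\}\,Y_\ell^m(\mu,\phi) = \left[\frac{\partial}{\partial\mu}\,(1-\mu^2)\frac{\partial}{\partial\mu} + \frac{1}{1-\mu^2}\frac{\partial^2}{\partial\phi^2}\right]Y_\ell^m(\mu,\phi) = -\ell(\ell+1)\,Y_\ell^m(\mu,\phi) \tag{14.56}$$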
and are complete in the sense that
Theorem: If f (µ, φ) is a differentiable function on the unit sphere then

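$$f(\mu,\phi) = \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell} f_{\ell,m}\,Y_\ell^m(\mu,\phi) \quad\text{where}\quad f_{\ell,m} = \int_0^{2\pi}d\phi\int_{-1}^{1}d\mu\,\bigl(Y_\ell^m(\mu,\phi)\bigr)^*f(\mu,\phi). \tag{14.57}$$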
2 Adrien-Marie Legendre (1752–1833) made contributions throughout mathematics. He
originated the “least squares” method of curve fitting. One notable episode from his life
is that the French government denied him the pension he had earned when he refused
to endorse a government-supported candidate for an honor.
The above paragraph is precisely analogous to the Fourier series result
that the “trigonometric” functions $e^{i\ell\theta}$ satisfy
$$\frac{\partial^2}{\partial\theta^2}\,e^{i\ell\theta} = -\ell^2\,e^{i\ell\theta} \tag{14.58}$$
and are complete in the sense that
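Theorem: If f(θ) is a differentiable function on the unit circle (i.e. with
periodicity 2π) then
$$f(\theta) = \sum_{\ell=-\infty}^{\infty} f_\ell\,e^{i\ell\theta} \quad\text{where}\quad f_\ell = \frac{1}{2\pi}\int_0^{2\pi}d\theta\,\bigl(e^{i\ell\theta}\bigr)^*f(\theta). \tag{14.59}$$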
There are a lot of special functions, many of which are used only in very
specialized situations. But the spherical harmonics are just as important
in three dimensional problems as the trigonometric functions are in two di-
mensional problems. Spherical harmonics are used in quantum mechanics,
in electrostatics, in acoustics, in signal processing, in seismology, and in
mapping (to keep track of the deviations of the Earth’s shape from spher-
ical). They are as important as sines and cosines. It’s worth becoming
familiar with them.
Now that we have a complete set of eigenfunctions for the angular op-
erator, we can carry out the “reduction to one dimension” process in three
dimensions just as we did in two dimensions. The energy eigenvalue prob-
lem is
$$\left[-\frac{\hbar^2}{2M}\nabla^2 + V(r) - E_n\right]\eta_n(\vec r) = 0. \tag{14.60}$$
Write
$$\eta_n(r,\mu,\phi) = \sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell} R_{n,\ell,m}(r)\,Y_\ell^m(\mu,\phi) \tag{14.61}$$
to prove that Rn,`,m (r) satisfies

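$$\left[\frac{1}{r^2}\frac{d}{dr}\,r^2\frac{d}{dr} + \frac{2M}{\hbar^2}\left(E_n - V(r) - \frac{\hbar^2}{2M}\,\frac{\ell(\ell+1)}{r^2}\right)\right]R_{n,\ell,m}(r) = 0. \tag{14.62}$$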
Note that the differential equation is independent of m, so the solution
must also be independent of m. Thus we drop the subscript m and write
Rn,` (r).
The energy eigenfunction satisfies the normalization

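$$\begin{aligned}
\int |\eta(x,y,z)|^2\,dx\,dy\,dz &= 1 \\
\int_0^\infty dr\int_{-1}^{1}d\mu\int_0^{2\pi}d\phi\;r^2\,|R_{n,\ell}(r)\,Y_\ell^m(\mu,\phi)|^2 &= 1 \\
\int_0^\infty dr\,r^2\,|R_{n,\ell}(r)|^2 &= 1.
\end{aligned} \tag{14.63}$$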
This suggests that the true analog to a one-dimensional wavefunction is
un,` (r) = rRn,` (r), and sure enough un,` (r) satisfies the equation
$$\left[-\frac{\hbar^2}{2M}\frac{d^2}{dr^2} + V(r) + \frac{\hbar^2}{2M}\,\frac{\ell(\ell+1)}{r^2}\right]u_{n,\ell}(r) = E_{n,\ell}\,u_{n,\ell}(r). \tag{14.64}$$
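The “sure enough” hides one short calculation, written out here as a worked equation. Substituting R(r) = u(r)/r into the radial derivative term of the equation for R gives

```latex
\frac{1}{r^2}\frac{d}{dr}\left[r^2\frac{d}{dr}\left(\frac{u}{r}\right)\right]
= \frac{1}{r^2}\frac{d}{dr}\left[r\,u' - u\right]
= \frac{1}{r^2}\left[u' + r\,u'' - u'\right]
= \frac{u''}{r}.
```

The first-derivative terms cancel, so multiplying the equation for R by $-\hbar^2 r/2M$ leaves a purely one-dimensional equation for u.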
Summary:
To solve the three-dimensional energy eigenproblem for a spherically-
symmetric potential energy V (r), namely
$$-\frac{\hbar^2}{2M}\nabla^2\eta(\vec r) + V(r)\,\eta(\vec r) = E\,\eta(\vec r), \tag{14.65}$$
first solve the radial energy eigenproblem
$$-\frac{\hbar^2}{2M}\frac{d^2u(r)}{dr^2} + \left[V(r) + \frac{\hbar^2}{2M}\,\frac{\ell(\ell+1)}{r^2}\right]u(r) = E\,u(r) \tag{14.66}$$
for ` = 0, 1, 2, . . .. For a given `, call the resulting energy eigenfunctions and
eigenvalues un,` (r) and En,` for n = 1, 2, 3, . . .. Then the three-dimensional
solutions are
$$\eta_{n,\ell,m}(r,\theta,\phi) = \frac{u_{n,\ell}(r)}{r}\,Y_\ell^m(\theta,\phi) \quad\text{with energy } E_{n,\ell}, \tag{14.67}$$
where m takes on the 2` + 1 values −`, −` + 1, . . . , 0, . . . , ` − 1, `. Notice
that the 2` + 1 different solutions for a given n and `, but with different m,
are degenerate.
Problem: Show that the probability density |Y`m (θ, φ)|2 associated with
any spherical harmonic is “axially symmetric,” that is, independent of ro-
tations about the z axis, that is, independent of φ.
14.3 Bound state energy eigenproblem for Coulombic potentials
Problem: Given a (reduced) mass M and a Coulombic potential energy

V (r) = −k/r, find the negative values En,` such that the corresponding
solutions Un,` (r) of
$$\left[-\frac{\hbar^2}{2M}\frac{d^2}{dr^2} - \frac{k}{r} + \frac{\hbar^2}{2M}\,\frac{\ell(\ell+1)}{r^2}\right]U_{n,\ell}(r) = E_{n,\ell}\,U_{n,\ell}(r) \tag{14.68}$$
are normalizable wavefunctions
$$\int_0^\infty |U_{n,\ell}(r)|^2\,dr = 1. \tag{14.69}$$
Strategy: Same as for the simple harmonic oscillator eigenproblem:
(1) Convert to dimensionless variable.

(2) Remove asymptotic behavior of solutions.
(3) Find non-asymptotic behavior using the series method.
(4) Invoke normalization to terminate the series as a polynomial.
1. Convert to dimensionless variable: Only one length can be constructed
from M, k, and ħ. It is
$$a = \frac{\hbar^2}{kM}. \tag{14.70}$$
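For the hydrogen problem
$$M = \frac{m_p m_e}{m_p + m_e} \approx m_e \quad\text{and}\quad k = \frac{e^2}{4\pi\epsilon_0},$$
so this length is approximately
$$\frac{\hbar^2\,4\pi\epsilon_0}{m_e e^2} \equiv a_0 \equiv \text{“the Bohr radius”} = 0.0529\ \text{nm}. \tag{14.71}$$
Convert to the dimensionless variable
$$\rho = \frac{r}{a} \tag{14.72}$$
and the dimensionless wavefunction
$$u_{n,\ell}(\rho) = \sqrt{a}\,U_{n,\ell}(a\rho). \tag{14.73}$$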
The resulting eigenproblem is
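$$\left[-\frac{d^2}{d\rho^2} - \frac{2}{\rho} + \frac{\ell(\ell+1)}{\rho^2}\right]u_{n,\ell}(\rho) = \frac{E_{n,\ell}}{k^2M/2\hbar^2}\,u_{n,\ell}(\rho) \tag{14.74}$$
with
$$\int_0^\infty |u_{n,\ell}(\rho)|^2\,d\rho = 1. \tag{14.75}$$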
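For readers who want to see where the factor of 2 and the energy scale come from, here is the substitution written out as a worked equation. It uses only r = aρ and a = ħ²/(kM) from above.

```latex
\frac{d^2}{dr^2} = \frac{1}{a^2}\frac{d^2}{d\rho^2},
\qquad
\frac{\hbar^2}{2Ma^2} = \frac{k^2M}{2\hbar^2},
\qquad
\frac{k}{r} = \frac{k}{a\rho} = \frac{k^2M}{\hbar^2}\,\frac{1}{\rho}
            = \frac{k^2M}{2\hbar^2}\,\frac{2}{\rho}.
```

Every term of the radial equation thus carries the common factor $k^2M/2\hbar^2$, and dividing through by it gives the dimensionless eigenproblem. The factor $\sqrt{a}$ in the dimensionless wavefunction exists only to preserve the normalization, since $dr = a\,d\rho$.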
It’s clear that the energy
$$\frac{k^2M}{2\hbar^2} \tag{14.76}$$
is the characteristic energy for this problem. For hydrogen, its value is
approximately
$$\left(\frac{e^2}{4\pi\epsilon_0}\right)^2\frac{m_e}{2\hbar^2} \equiv \text{Ry} \equiv \text{“the Rydberg”} = 13.6\ \text{eV}.$$
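The two hydrogen numbers quoted so far, the Bohr radius and the Rydberg, are easy to confirm numerically. This is a minimal sketch: the constants are CODATA 2018 values that I have typed in as an assumption (the text quotes only the results), and the function name is hypothetical.

```python
import math

# CODATA 2018 values in SI units (assumed here, not taken from the text).
HBAR = 1.054571817e-34       # J s
M_E = 9.1093837015e-31       # kg
E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m

def coulomb_length_and_energy(mass, k):
    """Characteristic length hbar^2/(k M) and energy k^2 M/(2 hbar^2)."""
    length = HBAR**2 / (k * mass)
    energy = k**2 * mass / (2 * HBAR**2)
    return length, energy

# Approximate the reduced mass by the electron mass, as in the text.
k_hydrogen = E_CHARGE**2 / (4 * math.pi * EPS0)
a0, rydberg = coulomb_length_and_energy(M_E, k_hydrogen)

a0_nm = a0 * 1e9                 # metres to nanometres
rydberg_eV = rydberg / E_CHARGE  # joules to electron volts
print(f"a0 = {a0_nm:.4f} nm, Ry = {rydberg_eV:.2f} eV")
```

The same function works for any Coulombic problem once the appropriate mass and k are supplied.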
Thus it is reasonable, for brevity, to define the dimensionless energy parameter
$$\mathcal{E}_{n,\ell} = \frac{E_{n,\ell}}{k^2M/2\hbar^2}. \tag{14.77}$$
Furthermore, for the bound state problem $\mathcal{E}_{n,\ell}$ is negative so we define
$$b_{n,\ell}^2 = -\mathcal{E}_{n,\ell} \tag{14.78}$$
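and the eigenproblem becomes
$$\left[\frac{d^2}{d\rho^2} + \frac{2}{\rho} - \frac{\ell(\ell+1)}{\rho^2} - b_{n,\ell}^2\right]u_{n,\ell}(\rho) = 0 \tag{14.79}$$
with
$$\int_0^\infty |u_{n,\ell}(\rho)|^2\,d\rho = 1. \tag{14.80}$$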
2. Remove asymptotic behavior of solutions:
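Note: In this section we will show that as ρ → 0,
$$u_{n,\ell}(\rho) \approx \rho^{\ell+1}, \tag{14.81}$$
and that as ρ → ∞,
$$u_{n,\ell}(\rho) \approx e^{-b_{n,\ell}\rho}, \tag{14.82}$$
so we will set
$$u_{n,\ell}(\rho) = \rho^{\ell+1}\,e^{-b_{n,\ell}\rho}\,v_{n,\ell}(\rho) \tag{14.83}$$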
and then solve an ODE for vn,` (ρ). As far as rigor is concerned we could
have just pulled the change-of-function (14.83) out of a hat. Thus this
section is motivational and doesn’t need to be rigorous.
Because equation (14.79) has problems (or, formally, a “regular singular
point”) at ρ = 0, it pays to find the asymptotic behavior when ρ → 0 as
well as when ρ → ∞.
2A. Find asymptotic behavior as ρ → 0: The ODE is
$$\left[\frac{d^2}{d\rho^2} + \frac{2}{\rho} - \frac{\ell(\ell+1)}{\rho^2} - b_{n,\ell}^2\right]u_{n,\ell}(\rho) = 0. \tag{14.84}$$
As ρ → 0 the term in square brackets is dominated (unless ℓ = 0) by
$-\ell(\ell+1)/\rho^2$. The equation
$$\left[\frac{d^2}{d\rho^2} - \frac{\ell(\ell+1)}{\rho^2}\right]u(\rho) = 0 \tag{14.85}$$
is solved by
$$u(\rho) = A\rho^{\ell+1} + B\rho^{-\ell}. \tag{14.86}$$
However, it’s not healthy to keep factors like $\rho^{-\ell}$ around, because
$$\int_0^{\rho_0}\rho^{-2\ell}\,d\rho = \left[\frac{1}{-2\ell+1}\,\frac{1}{\rho^{2\ell-1}}\right]_0^{\rho_0} = \infty \qquad [\text{for } \ell > \tfrac12], \tag{14.87}$$
so wavefunctions with $\rho^{-\ell}$ prefactors tend to be unnormalizable. (Here ρ0
is just any positive number.) Thus the wavefunction must behave as
$$u(\rho) \approx A\rho^{\ell+1} \tag{14.88}$$
as ρ → 0.
Our arguments have relied upon ℓ ≠ 0, but it turns out that by stupid
good luck the result (14.88) applies when ℓ = 0 as well. However, it’s rather
hard to prove this, and since this section is really just motivation anyway,
I’ll not pursue the matter.
2B. Find asymptotic behavior as ρ → ∞: In this case, the square bracket
term in equation (14.84) is dominated by $-b_{n,\ell}^2$, so the approximate ODE
is
$$\left[\frac{d^2}{d\rho^2} - b_{n,\ell}^2\right]u_{n,\ell}(\rho) = 0 \tag{14.89}$$
with solutions
$$u_{n,\ell}(\rho) = A\,e^{-b_{n,\ell}\rho} + B\,e^{+b_{n,\ell}\rho}. \tag{14.90}$$
Clearly, normalization requires that B = 0, so the wavefunction has the
expected exponential cutoff for large ρ.
In this way, we have justified the definition of $v_{n,\ell}(\rho)$ in equation (14.83).
Plugging (14.83) into ODE (14.79), we find that $v_{n,\ell}(\rho)$ satisfies the ODE
$$\left[\rho\,\frac{d^2}{d\rho^2} + 2\,[\ell + 1 - b_{n,\ell}\rho]\,\frac{d}{d\rho} - 2\,[b_{n,\ell}\ell + b_{n,\ell} - 1]\right]v_{n,\ell}(\rho) = 0 \tag{14.91}$$
3. Find non-asymptotic behavior using the series method: We try out
the solution
$$v_{n,\ell}(\rho) = \sum_{k=0}^{\infty} a_k\,\rho^k \tag{14.92}$$
and readily find that
$$a_{k+1} = \frac{2b_{n,\ell}(k+\ell+1) - 2}{(k+1)(k+2\ell+2)}\,a_k, \qquad k = 0, 1, 2, \ldots \tag{14.93}$$
(Note that because k and ` are both non-negative, the denominator never
vanishes.)
4. Invoke normalization to terminate the series as a polynomial: If the
$a_k$ coefficient never vanishes, then
$$\frac{a_{k+1}}{a_k} \to \frac{2b_{n,\ell}}{k} \quad\text{as } k \to \infty. \tag{14.94}$$
As in the SHO, this leads to $v(\rho) \approx e^{2b_{n,\ell}\rho}$ as ρ → ∞, which is pure disaster.
To avoid catastrophe, we must truncate the series as a kth order polynomial
by demanding
$$b_{n,\ell} = \frac{1}{k+\ell+1}, \qquad k = 0, 1, 2, \ldots \tag{14.95}$$
Thus $b_{n,\ell}$ is always the reciprocal of the integer
$$n = k + \ell + 1 \tag{14.96}$$
and
$$\mathcal{E}_{n,\ell} = -b_{n,\ell}^2 = -\frac{1}{n^2}, \qquad n = 1, 2, 3, \ldots. \tag{14.97}$$
We have found the permissible bound state energies!
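The termination argument can be watched in action with exact fractions. The sketch below runs the recurrence (14.93) starting from a_0 = 1 (an arbitrary choice of overall constant) and then checks, term by term, that the truncated polynomial really does satisfy the ODE (14.91). The function names are hypothetical.

```python
from fractions import Fraction

def series_coefficients(b, l, terms=12):
    """Coefficients a_0 .. a_{terms-1} from recurrence (14.93), with a_0 = 1."""
    a = [Fraction(1)]
    for k in range(terms - 1):
        factor = (2 * b * (k + l + 1) - 2) / Fraction((k + 1) * (k + 2 * l + 2))
        a.append(factor * a[k])
    return a

def ode_residual(a, b, l):
    """Coefficient of rho^k in the left side of (14.91) for v = sum a_k rho^k."""
    residual = []
    for k in range(len(a)):
        a_next = a[k + 1] if k + 1 < len(a) else 0
        residual.append(k * (k + 1) * a_next             # rho v''
                        + 2 * (l + 1) * (k + 1) * a_next  # 2(l+1) v'
                        - 2 * b * k * a[k]                # -2 b rho v'
                        - 2 * (b * (l + 1) - 1) * a[k])   # -2[b l + b - 1] v
    return residual

# n = 3, l = 1: the series should stop after the k = n - l - 1 = 1 term.
n, l = 3, 1
terminating = series_coefficients(Fraction(1, n), l)
polynomial = terminating[: n - l]

# b = 2/5 is not the reciprocal of an integer, so no coefficient vanishes.
runaway = series_coefficients(Fraction(2, 5), 0)
```

For n = 2 and ℓ = 0 the same function gives v(ρ) = 1 − ρ/2, the polynomial part of the familiar hydrogen 2s radial function.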
What are the eigenfunctions? The solution $v_{n,\ell}(\rho)$ that is a polynomial
of order k = n − ℓ − 1 has a name: it is the Laguerre³ polynomial
$$L^{2\ell+1}_{n-\ell-1}((2/n)\rho). \tag{14.98}$$
It would be nicer to have a more direct notation like our own vn,` (ρ), but
Laguerre died before quantum mechanics was born, so he could not have
known how to make his notation convenient for the quantum mechanical
Coulomb problem. The Laguerre polynomials are just one more class of
special functions not worth knowing much about.
All together, the energy eigenfunctions are
$$\eta_{n,\ell,m}(\rho,\theta,\phi) = [\text{constant}]\,\rho^{\ell}\,e^{-\rho/n}\,L^{2\ell+1}_{n-\ell-1}((2/n)\rho)\,Y_\ell^m(\theta,\phi). \tag{14.99}$$
3 Edmond Laguerre (1834–1886), French artillery officer and mathematician, made con-
tributions to analysis and especially geometry.
Degeneracy
Recall that each $v_{n,\ell}(\rho)$ already has an associated (2ℓ + 1)-fold degeneracy.
In addition, each ℓ gives rise to an infinite number of eigenvalues:
$$\mathcal{E}_{n,\ell} = -\frac{1}{(k+\ell+1)^2}, \qquad k = 0, 1, 2, \ldots. \tag{14.100}$$
In tabular form
ℓ = 0   gives   n = 1, 2, 3, 4, ...
ℓ = 1   gives   n = 2, 3, 4, ...
ℓ = 2   gives   n = 3, 4, ...
⋮
So. . .
ℓ = 0   (degeneracy 1)   gives   $\mathcal{E}_{n,\ell}$ = −1, −1/2², −1/3², −1/4², ...
ℓ = 1   (degeneracy 3)   gives   $\mathcal{E}_{n,\ell}$ = −1/2², −1/3², −1/4², ...
ℓ = 2   (degeneracy 5)   gives   $\mathcal{E}_{n,\ell}$ = −1/3², −1/4², ...
⋮
Eigenenergies of −1/n² are associated with n different values of ℓ,
namely ℓ = 0, 1, . . . , n − 1. The total degeneracy is thus
$$\sum_{\ell=0}^{n-1}(2\ell+1) = n^2. \tag{14.101}$$
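If you would rather count than sum, the degeneracy can be found by brute-force enumeration of the allowed (ℓ, m) pairs for each n. This is just an illustrative sketch and the function name is my own.

```python
def states(n):
    """All (l, m) pairs sharing the dimensionless eigenenergy -1/n^2."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

# Count the states for the first few levels.
degeneracies = {n: len(states(n)) for n in range(1, 7)}
print(degeneracies)
```

For n = 2, for example, `states(2)` lists one ℓ = 0 state and three ℓ = 1 states, four in all.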
14.4 Summary of the bound state energy eigenproblem for a Coulombic potential
A complete set of energy eigenfunctions is $\eta_{n,\ell,m}(r,\theta,\phi)$
where                    n = 1, 2, 3, . . .
and for each n           ℓ = 0, 1, 2, . . . , n − 1
and for each n and ℓ     m = −ℓ, −ℓ + 1, . . . , ℓ − 1, ℓ.
This wavefunction represents a state of energy
$$E_n = -\frac{k^2M/2\hbar^2}{n^2},$$
independent of ℓ and m. Thus energy $E_n$ has an n²-fold degeneracy. In
particular, for hydrogen this eigenenergy is nearly
$$E_n = -\frac{\text{Ry}}{n^2}, \qquad \text{Ry} = 13.6\ \text{eV}.$$
In addition, the wavefunction $\eta_{n,\ell,m}(r,\theta,\phi)$ represents a state with an
angular momentum squared of ħ²ℓ(ℓ + 1) and an angular momentum z
component of ħm.
[I recommend that you memorize this summary. . . it’s the sort of thing
that frequently comes up on GREs and physics oral exams.]
14.5 Problems
14.1 Positronium
The “atom” positronium is a bound state of an electron and a positron.
Find the allowed energies for positronium.
14.2 Operator factorization solution of the Coulomb problem
The bound state energy eigenvalues of the hydrogen atom can be found
using the operator factorization method. In reduced units, the radial
wave equation is
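$$\left[-\frac{d^2}{d\rho^2} + \frac{\ell(\ell+1)}{\rho^2} - \frac{2}{\rho}\right]u_{n,\ell}(\rho) \equiv h_\ell\,u_{n,\ell}(\rho) = \mathcal{E}_{n,\ell}\,u_{n,\ell}(\rho). \tag{14.102}$$
Introduce the operators
$$D_\pm^{(\ell)} \equiv \frac{d}{d\rho} \mp \frac{\ell}{\rho} \pm \frac{1}{\ell} \tag{14.103}$$
and show that
$$D_-^{(\ell+1)}D_+^{(\ell+1)} = -h_\ell - \frac{1}{(\ell+1)^2}, \qquad D_+^{(\ell)}D_-^{(\ell)} = -h_\ell - \frac{1}{\ell^2}. \tag{14.104}$$
From this, conclude that
$$h_{\ell+1}\,D_+^{(\ell+1)}u_{n,\ell}(\rho) = \mathcal{E}_{n,\ell}\,D_+^{(\ell+1)}u_{n,\ell}(\rho) \tag{14.105}$$
whence
$$D_+^{(\ell+1)}u_{n,\ell}(\rho) \propto u_{n,\ell+1}(\rho) \tag{14.106}$$
and $\mathcal{E}_{n,\ell}$ is independent of ℓ.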
Argue that for every $\mathcal{E}_{n,\ell} < 0$ there is a maximum ℓ. (Hint: examine
the effective potential for radial motion.) Call this value $\ell_{\max}$, and set
$n = \ell_{\max} + 1$ to show that
$$\mathcal{E}_{n,\ell} = -\frac{1}{n^2}, \qquad \ell = 0, \ldots, n-1. \tag{14.107}$$
14.3 A non-Coulombic central force
The central potential
$$V(r) = -\frac{k}{r} + \frac{c}{r^2} \tag{14.108}$$
is a model (albeit a poor one) for the interaction of the two atoms
in a diatomic molecule. (Arnold Sommerfeld called this the “rotating
oscillator” potential: see his Atomic Structure and Spectral Lines, 3rd
ed., 1922, appendix 17.) Steven A. Klein (class of 1989) investigated
this potential and found that its energy eigenproblem could be solved
exactly.
a. Sketch the potential, assuming that k and c are both positive.
b. Following the method of section 14.3, convert the radial equation
of the energy eigenproblem into
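$$\left[-\frac{d^2}{d\rho^2} - \frac{2}{\rho} + \frac{\gamma + \ell(\ell+1)}{\rho^2}\right]u_{n,\ell}(\rho) = \mathcal{E}_{n,\ell}\,u_{n,\ell}(\rho), \tag{14.109}$$
where γ = 2cM/ħ² and where ρ, $\mathcal{E}_{n,\ell}$, and $u_{n,\ell}(\rho)$ are to be identified.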
c. Find two values of x such that x(x + 1) = γ + `(` + 1). Select
whichever one will be most convenient for later use.
d. Convince yourself that the solution described in section 14.3 does
not depend upon ℓ being an integer, and conclude that the energy
eigenvalues are
$$\mathcal{E}_{n,\ell} = \frac{-1}{\left[n - \ell + \tfrac12\left(-1 + \sqrt{(2\ell+1)^2 + 4\gamma}\right)\right]^2} \tag{14.110}$$
where n = 1, 2, 3, . . . and where for each n, ℓ can take on values
ℓ = 0, 1, 2, . . . , n − 1.
e. Verify that this energy spectrum reduces to the Coulomb limit
when c = 0.
14.4 The quantum mechanical virial theorem
a. Argue that, in an energy eigenstate |η(t)⟩, the expectation value
⟨r̂ · p̂⟩ does not change with time.
b. Hence conclude that ⟨η(t)|[r̂ · p̂, Ĥ]|η(t)⟩ = 0.
c. Show that [r̂ · p̂, p̂²] = 2iħ p̂², while [r̂ · p̂, V(r̂)] = −iħ r̂ · ∇V(r̂),
where V(r) is any scalar function of the vector r. (Hint: For the
second commutator, use an explicit position basis representation.)
d. Suppose the Hamiltonian is
$$\hat H = \frac{1}{2m}\,\hat{\mathbf p}^2 + V(\hat{\mathbf r}) = \hat T + \hat V. \tag{14.111}$$
Define the force function F(r) = −∇V(r) and the force operator
F̂ = F(r̂). Conclude that, for an energy eigenstate,
$$2\langle\hat T\rangle = -\langle\hat{\mathbf r}\cdot\hat{\mathbf F}\rangle. \tag{14.112}$$
This is the “virial theorem.”
e. If V(r) = C/rⁿ, show that 2⟨T̂⟩ = −n⟨V̂⟩ for any energy eigenstate,
and that
$$\langle\hat T\rangle = \frac{n}{n-2}\,E, \qquad \langle\hat V\rangle = \frac{-2}{n-2}\,E, \tag{14.113}$$
for the energy eigenstate with energy E.
14.5 Research project
Discuss the motion of wavepackets in a Coulombic potential. Does the
expectation value of r̂ follow the classical Kepler ellipse? Is it even
restricted to a plane? Does the wavepacket spread out in time (as with
the force-free particle) or remain compact (as with the simple harmonic
oscillator)?
Chapter 15
Identical Particles
Note: Heap algorithm? Permutation groupie things in an appendix?

Identical particles not necessarily interacting, so two particles can be at
same point.
15.1 Many-particle systems in quantum mechanics
[[This section should be in the chapter on “continuum systems” and then

referred to from here.]]
One particle moves in one dimension. (Ignore spin.) How can we
represent this system’s state?
There are several ways: The ordinary wavefunction ψ(x) represents the
state in terms of the position basis. The momentum wavefunction
$$\tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty}dx\,e^{-i(p/\hbar)x}\,\psi(x)$$
represents the state in terms of the momentum basis. The energy expansion
coefficients
$$c_n = \int_{-\infty}^{+\infty}\eta_n^*(x)\,\psi(x)\,dx$$
represent the state in terms of the energy basis. [Meaning that
$$\psi(x) = \sum_n c_n\,\eta_n(x).$$
]
Or we can represent the state in terms of the expectation of position ⟨x⟩,
the expectation of momentum ⟨p⟩, the indeterminacy in position (which
involves ⟨x²⟩), the indeterminacy in momentum (which involves ⟨p²⟩), the
moments ⟨x³⟩ and ⟨p³⟩ and so forth, the correlation functions ⟨xp⟩ and
⟨xp²x³p⟩ and so forth. You can prove (it’s not easy!) that if all these mean
values are known then one can reconstruct the wavefunction.
Suppose we know the ordinary position wavefunction. Then if you mea-
sure the particle’s location, the probability of finding it in a window of width
dxA about position xA is |ψ(xA )|2 dxA . This is not sufficient information
to specify the particle’s state: it tells you everything there is to know about
position, but nothing about momentum or about position-momentum cor-
relation functions.
[Figure: One-particle window. A window of width dxA about position xA on the x axis.]
Meanwhile the amplitude of finding the particle in this window is
$\psi(x_A)\sqrt{dx_A}$. If you know the amplitude at every point xA, then you do
have full information about the state.
The normalization is of course
$$\int_{-\infty}^{+\infty}|\psi(x)|^2\,dx = 1.$$
Variations: If a spin-zero particle moves in three dimensions, the wavefunction
is ψ(x, y, z). If a spin-half particle (sz = ±½) moves in three
dimensions, the wavefunction is ψ(x, y, z, sz), or ψ̃(px, py, pz, sx). In general,
when I say things like “the variable x”, you will have to generalize in
different circumstances to, for example, “the variables px, py, pz, sx”.
Two particles, say an electron and a neutron, move in one dimension.
(Ignore spin.) How can we represent this system’s state?
There is now a wavefunction ψ(xA , xB ) with the interpretation that if
you measure the location of both particles, then the probability of finding
the electron in a window of width dxA about position xA , and the neutron
in a window of width dxB about position xB , is |ψ(xA , xB )|2 dxA dxB .
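[Figure: Two-particle windows. A window of width dxB about position xB and a window of width dxA about position xA, both on the x axis.]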
Note that the letters A and B refer to two different positions, not two differ-
ent particles. The particles are represented by the sequence of arguments:
the first argument pertains to the electron, the second argument pertains
to the neutron.
I particularly emphasize that the wavefunction applying to the system
“electron plus neutron” is one function of two variables, and is not two
functions each of one variable:
ψelectron (xA ) as well as ψneutron (xB ) NO!
The wavefunction of the system might happen to have the factorized form
ψelectron (xA )ψneutron (xB ) PERHAPS
but it does not necessarily have this form.
The difference feeds directly into this question: How many (real) num-
bers does it take to specify a state? In classical mechanics, the answer is
straightforward. The state of any single particle is specified through two
numbers: the position and momentum of that particle. The state of a col-
lection of several particles is specified through the state of each particle. In
summary
particles    real numbers needed to specify classical state
1            2
2            4
3            6
⋮            ⋮
N            2N
In quantum mechanics, the answer is more subtle. To specify the state

of even a single particle, one must give the wavefunction ψ(x). . . an infinite
number of complex numbers! For concreteness suppose we approximate
this function on a computer, using a grid of 100 points. Then we need 100
complex numbers, that is 2(100) real numbers. But one of these numbers is
fixed through the normalization condition, and one is an overall phase that
can be set arbitrarily. The end result is that to specify a single-particle
wavefunction to this degree of accuracy requires 2(100) − 2 = 198 real
numbers.
What about two particles? Now we have a wavefunction on a grid of
100 × 100 points, so specifying a two-particle wavefunction to this degree of
accuracy requires 2(100)² − 2 real numbers. This number (19998) is much
larger than twice 198. To specify the two-particle states, we cannot get
away with just specifying two one-particle states. Just as a particle might
not have a location, so in a two-particle system an individual particle might
not have a state.
In summary
particles    real numbers needed to specify quantal state
1            2(100) − 2 = 198
2            2(100)² − 2 = 19998
3            2(100)³ − 2 = 1999998
⋮            ⋮
N            2(100)ᴺ − 2
Much of the spectacular richness and complexity of the quantum world arises
from this rapid increase of information with particle number. (Design of
quantum computer.)
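The contrast between the two tables is easy to reproduce. In this small sketch the grid size of 100 points is the same illustrative choice made above, and the function names are hypothetical.

```python
def classical_numbers(n_particles):
    """Real numbers specifying a classical state: position and momentum each."""
    return 2 * n_particles

def quantal_numbers(n_particles, grid_points=100):
    """Real numbers specifying a wavefunction sampled on a grid.

    One complex number per grid point of the N-variable wavefunction,
    minus one real number for normalization and one for the overall phase.
    """
    return 2 * grid_points**n_particles - 2

for n in (1, 2, 3, 10):
    print(n, classical_numbers(n), quantal_numbers(n))
```

The classical count grows linearly with the number of particles while the quantal count grows exponentially.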
Two identical particles, say two neutrons, move in one dimension.
(Ignore spin.) How can we represent this system’s state?
Of course, there is a wavefunction ψ(xA , xB ), but the interpretation
is somewhat different. The question is not “What is the probability of
finding neutron α within window A and neutron β within window B?”
These neutrons are identical, so there is no such thing as “neutron α” or
“neutron β.” The question instead is “What is the probability of finding a
neutron within window A and a neutron within window B?” The answer
to this question is
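$$\begin{cases} 2\,|\psi(x_A,x_B)|^2\,dx_A\,dx_B & \text{if the windows don’t overlap} \\ |\psi(x_A,x_B)|^2\,dx_A\,dx_B & \text{if } x_B = x_A \end{cases}$$
The normalization condition is
$$2\int_{-\infty}^{+\infty}dx_A\int_{-\infty}^{x_A}dx_B\,|\psi(x_A,x_B)|^2 = 1 \quad\text{or}\quad \int_{-\infty}^{+\infty}dx_A\int_{-\infty}^{+\infty}dx_B\,|\psi(x_A,x_B)|^2 = 1.$$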
If the two particles are identical, then it’s certainly true that
|ψ(xA , xB )|2 = |ψ(xB , xA )|2 .
But this condition ensures only that the position probabilities are unaffected
if you swap the windows. If the two particles are identical, then the same
holds for momentum probabilities. In other words, the wavefunctions
ψ(xA , xB ) and ψ(xB , xA )
represent the same state, so
ψ(xA , xB ) = sψ(xB , xA ),
where s is a number with modulus unity, not a function of xA or xB . (The
name s comes from “swap”. We’ve swapped the subscripts.) Thus, for
example,
ψ(5, 7) = sψ(7, 5).
But
ψ(7, 5) = sψ(5, 7),
so
ψ(5, 7) = s² ψ(5, 7).
We conclude that
s = ±1.
In other words, when the wavefunction swaps arguments, it either remains
the same or changes sign. In the first case, the wavefunction is called “sym-
metric under swapping,” (or “under exchange,” or “under interchange”)1
in the second, “antisymmetric.”
What if there are three identical particles? The wavefunction is
ψ(xA , xB , xC ) and you can swap either the first and second arguments, or
the second and third arguments, or the first and third arguments. The
arguments of the next three paragraphs will show that the wavefunction
must be either symmetric under each of these three interchanges or else
antisymmetric under each of these three interchanges.
1 I prefer “swap” to emphasize that we’re swapping mathematical windows, not ex-
changing physical particles.

After any swapping, you must produce a wavefunction representing the

same state, so any swapping can introduce at most a constant phase factor.
Thus
$$\begin{aligned}
\psi(x_A,x_B,x_C) &= e^{i\alpha}\,\psi(x_B,x_A,x_C) \\
&= e^{i\beta}\,\psi(x_A,x_C,x_B) \\
&= e^{i\gamma}\,\psi(x_C,x_B,x_A)
\end{aligned}$$
The “double swap” argument above shows that $e^{i\alpha}$ is either +1 or −1,
that $e^{i\beta}$ is either +1 or −1, and that $e^{i\gamma}$ is either +1 or −1. We can gain
more information through repeated swappings that return ultimately to the
initial sequence. For example
$$\begin{aligned}
\psi(x_A,x_B,x_C) &= e^{i\alpha}\,\psi(x_B,x_A,x_C) && \text{[[swapping first and second arguments]]} \\
&= e^{i\alpha}e^{i\beta}\,\psi(x_B,x_C,x_A) && \text{[[swapping second and third arguments]]} \\
&= e^{i\alpha}e^{i\beta}e^{i\gamma}\,\psi(x_A,x_C,x_B) && \text{[[swapping first and third arguments]]} \\
&= e^{i\alpha}e^{i\beta}e^{i\gamma}e^{i\beta}\,\psi(x_A,x_B,x_C) && \text{[[swapping second and third arguments]]}
\end{aligned}$$
We already know that $(e^{i\beta})^2 = 1$, so this argument reveals that $e^{i\alpha}e^{i\gamma} = 1$,
i.e., these two phase factors are either both +1 or both −1.
Further arguments of this type will convince you that the three phase
factors must either be all +1 or else all −1. For suppose that
ψ(xA , xB , xC ) = −ψ(xB , xA , xC )
= +ψ(xA , xC , xB )
= −ψ(xC , xB , xA )
(That is, antisymmetric under swaps of the first and second arguments or
the first and third arguments, symmetric under swaps of the second and
third arguments.) Then we can go from ψ(xA , xB , xC ) to ψ(xB , xC , xA )
via two different swapping routes:
ψ(xA , xB , xC ) = (−1)ψ(xB , xA , xC ) [[swapping first and second arguments]]
= (−1)(+1)ψ(xB , xC , xA ) [[swapping second and third arguments]]
or
ψ(xA , xB , xC ) = (−1)ψ(xC , xB , xA ) [[swapping first and third arguments]]
= (−1)(−1)ψ(xB , xC , xA ) [[swapping first and second arguments]]
The only function that satisfies both of these conditions is ψ(xA, xB, xC) = 0.
The other possible “mixed symmetric and antisymmetric” possibility is
ψ(xA , xB , xC ) = +ψ(xB , xA , xC )
= −ψ(xA , xC , xB )
= +ψ(xC , xB , xA )
but this can be shown impossible by the “two route” argument of the pre-
vious paragraph.
15.1 Problem: Show that the same result applies for functions of four or
more arguments by considering first swaps among the first, second,
and third arguments; then swaps among the first, second, and fourth
arguments; then swaps among the first, second, and fifth arguments;
etc.
In conclusion, a wavefunction for any number of identical particles must
be either “completely symmetric” (every swap introduces a phase factor of
+1) or else “completely antisymmetric” (every swap introduces a phase
factor of −1). This is called the “exchange symmetry” of the wavefunction.
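For three arguments, the “two route” reasoning can be handed to a computer. The sketch below tries all eight ways of assigning +1 or −1 to the three swaps and asks whether every ordering of the arguments then receives one unambiguous phase, whichever swapping route is used to reach it. It is an illustration of the argument rather than a proof for general N, and the names are hypothetical.

```python
from itertools import product

# Argument positions exchanged by each swap:
# first-second, second-third, first-third.
SWAPS = [(0, 1), (1, 2), (0, 2)]

def is_consistent(signs):
    """True if every ordering gets the same phase along every swapping route."""
    start = (0, 1, 2)
    phase = {start: 1}
    frontier = [start]
    while frontier:
        order = frontier.pop()
        for (i, j), sign in zip(SWAPS, signs):
            swapped = list(order)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            swapped = tuple(swapped)
            new_phase = phase[order] * sign
            if swapped not in phase:
                phase[swapped] = new_phase
                frontier.append(swapped)
            elif phase[swapped] != new_phase:
                return False  # two routes disagree, so psi would have to vanish
    return True

allowed = [signs for signs in product((+1, -1), repeat=3) if is_consistent(signs)]
print(allowed)
```

Only two assignments survive: all three signs +1, or all three signs −1.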
15.2 Problem: If there are two particles, there is one possible swap. If
there are three particles, there are three possible swaps. Show that for
four particles there are six possible swaps and that for N particles there
are N (N − 1)/2 possible swaps.
15.3 Problem: Show that the momentum wavefunction has the same in-
terchange symmetry as the position wavefunction (i.e., symmetric or
antisymmetric). How about the energy coefficients? (Exactly what
does that last question mean?)
15.4 Problem: Show that exchange symmetry is conserved: If the system
starts out in a symmetric state it will remain symmetric at all times in
the future, and similarly for antisymmetric.
Given what we’ve said so far, I would guess that a collection of neu-
trons could start out in a symmetric state (in which case they would be
in a symmetric state for all time) or else they could start out in an anti-
symmetric state (in which case they would be in an antisymmetric state
for all time). In fact, however, this is not the case. For suppose you had a
collection of five neutrons in a symmetric state and a different collection of
two neutrons in an antisymmetric state. Just by changing which collection
is under consideration, you could consider this as one collection of seven
neutrons. That collection of seven neutrons would have to be either com-
pletely symmetric or completely antisymmetric, and it wouldn’t be if the
five were in a symmetric state and the two in an antisymmetric state.
So the exchange symmetry has nothing to do with history or with what
you consider to be the extent of the collection, but instead depends only
on the type of particle. Neutrons, protons, electrons, carbon-13 nuclei, and
sigma baryons are always antisymmetric under swapping — they are called
fermions.2 Photons, alpha particles, carbon-12 nuclei, and pi mesons are
always symmetric under swapping — they are called bosons.3
Furthermore, all bosons have integral spin and all fermions have half-
integral spin. There is a mathematical result in relativistic quantum field
theory called “the spin-statistics theorem” that sheds some light on this
astounding fact. (See Pauli and the Spin-Statistics Theorem by Ian Duck
and E.C.G. Sudarshan, and the review of this book by A.S. Wightman in
Am. J. Phys. 67 (August 1999) 742–746.)
Given their obvious importance, it makes sense to spend some time on
the mathematics of completely symmetric and completely anti-
symmetric functions. Given a garden-variety two-variable “seed” func-
tion f (xA , xB ), we can build a symmetric function
fS (xA , xB ) = f (xA , xB ) + f (xB , xA ),
and an antisymmetric function
fA (xA , xB ) = f (xA , xB ) − f (xB , xA ).
Note that these built functions are not necessarily normalized.
2 Enrico Fermi (1901–1954) of Italy excelled in both experimental and theoretical
physics. He directed the building of the first nuclear reactor and produced the first
theory of the weak interaction. The Fermi surface in the physics of metals was named
in his honor. He elucidated the statistics of what are now called fermions in 1926. He
produced so many thoughtful conceptual and estimation problems that such problems
are today called “Fermi problems”. I never met him (he died before I was born) but I
have met several of his students, and all of them speak of him in that rare tone reserved
for someone who is not just a great scientist and a great teacher and a great leader, but
also a great human being.
3 Satyendra Bose (1894–1974) of India made contributions in fields ranging from chem-
istry to school administration, but his signal contribution was elucidating the statistics
of photons. Remarkably, he made this discovery in 1924, two years before Schrödinger
developed the concept of wavefunction.
A generalized process works for three-variable functions: The built functions
are sums over all 3! permutations of arguments. The function
fS (xA , xB , xC ) = f (xA , xB , xC )+f (xA , xC , xB )+f (xC , xA , xB )+f (xC , xB , xA )+f (xB , xC , xA )+f (xB , xA , xC )
is completely symmetric while the function
fA (xA , xB , xC ) = f (xA , xB , xC )−f (xA , xC , xB )+f (xC , xA , xB )−f (xC , xB , xA )+f (xB , xC , xA )−f (xB , xA , xC )
is completely antisymmetric. (These 3! permutations are listed in the se-
quence called “plain changes” or “the Johnson-Trotter sequence”. This
sequence has the benefit that each permutation differs from its predecessor
by a single swap of adjacent letters.) This process of building an symmet-
ric function fS from arbitrary seed function f is called “symmetrization”.
Similarly for “antisymmetrization”. If the function is a wavefunction, the
(anti)symmetrization process is usually understood to include also normal-
izing the resulting wavefunction.
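The building process above can be sketched in a few lines of Python. This is only an illustration, not part of the formal development: the helper names (permutation_sign, symmetrize, antisymmetrize) and the particular seed function are assumptions chosen for the example, and no normalization is attempted.

```python
from itertools import permutations

def permutation_sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def symmetrize(f, n):
    """Build f_S: the sum of f over all n! orderings of its arguments."""
    def f_sym(*args):
        return sum(f(*(args[i] for i in p)) for p in permutations(range(n)))
    return f_sym

def antisymmetrize(f, n):
    """Build f_A: the same sum, with each term weighted by the permutation sign."""
    def f_anti(*args):
        return sum(
            permutation_sign(p) * f(*(args[i] for i in p))
            for p in permutations(range(n))
        )
    return f_anti

def seed(xa, xb, xc):
    """A hypothetical garden-variety seed with no symmetry of its own."""
    return xa * xb**2 * xc**3

f_s = symmetrize(seed, 3)
f_a = antisymmetrize(seed, 3)
```

Swapping any two arguments leaves f_s unchanged and flips the sign of f_a, and f_a vanishes whenever two arguments coincide.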
15.5 Problem: If the seed f (xA , xB , xC ) happens to be symmetric to begin

with, what are the symmetrized and antisymmetrized functions? What
if the seed happens to be antisymmetric to begin with?
15.6 Problem: Show that any two-variable function can be represented as
a sum of a symmetric and an antisymmetric function. Can any three-
variable function be represented as a sum of a completely symmetric
and a completely antisymmetric function?
Exchange symmetry and position correlations
Symmetric implies “huddled together”, antisymmetric implies “spread

apart”. (“cluster” / “avoid”)
This is not a result of repulsion. Two electrons, of course, repel each
other electrically. This electrical repulsion is reflected through a term in
the Hamiltonian of the pair. But the exchange symmetry effect holds even
when there is no interaction term in the Hamiltonian. If the two particles
are “independent” in that there is no interaction term in the Hamiltonian,
in that they don’t interact through a repulsive or attractive force, they
still have a tendency to “huddle together” or “spread apart” through the
exchange symmetry requirement. (A pair of particles in this situation is
said to have “no interaction” in the physics sense of the word “interaction,”
even though they do affect each other in the everyday sense of the word
“interaction”.)
Question: I can see how two electrons, repelling each other through an
electric field, can affect each other. But you’ve just said that two identical
particles which don’t exert a force on each other nevertheless affect each
other. The two particles are not in contact and don’t exert a force. What
is the mechanism through which one affects the other?
Answer: The two particles affect each other through “smelling out”
the various positions available to each. Remember that these particles
don’t have positions.
Although this section is titled “Exchange symmetry and position cor-
relations” remember that the symmetry requirement holds also for the
momentum wavefunction and for the energy coefficients. Antisymmetric
combinations are spread apart in momentum as well as position.
See problem 15.10.
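A rough numerical illustration of “huddled together” versus “spread apart”, offered as a sketch rather than a substitute for the analytic calculation in the problem: it assumes two particles in an infinite square well of width L = 1 built from levels 1 and 2, and estimates the mean-square separation by a midpoint-rule sum on a grid. The function names and the grid size are choices made for this example.

```python
import math

L = 1.0   # well width (assumed unit of length)
N = 200   # grid points per axis for the midpoint rule (assumed)

def eta(n, x):
    """Infinite-square-well energy eigenfunction number n."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def mean_square_separation(n, m, sign):
    """Estimate <(xA - xB)^2> for eta_n(xA) eta_m(xB) + sign * eta_m(xA) eta_n(xB).

    sign = 0 gives the plain product state, +1 the symmetric combination,
    -1 the antisymmetric combination. Dividing by the norm takes care of
    the normalization constant.
    """
    h = L / N
    xs = [(i + 0.5) * h for i in range(N)]
    en = [eta(n, x) for x in xs]
    em = [eta(m, x) for x in xs]
    total = 0.0
    norm = 0.0
    for i, xa in enumerate(xs):
        for j, xb in enumerate(xs):
            psi = en[i] * em[j] + sign * em[i] * en[j]
            weight = psi * psi
            norm += weight
            total += (xa - xb) ** 2 * weight
    return total / norm

dist = mean_square_separation(1, 2, 0)
sym = mean_square_separation(1, 2, +1)
anti = mean_square_separation(1, 2, -1)
```

For this pair of levels the symmetric value comes out below the product-state value and the antisymmetric value above it, even though no interaction term appears anywhere in the calculation.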
Symmetric and antisymmetric bases
If we want to study identical particles, we’ll need to build a basis of
symmetric states and a basis of antisymmetric states. Here’s how.
Start with a single particle subject to a potential, and solve the energy
eigenproblem. Suppose the results are an energy basis of
η1 (x)      ε1
η2 (x)      ε2
  ⋮          ⋮
ηM (x)      εM
In most cases the number of one-particle energy eigenstates M is infinite,
but it’s useful to keep that number as a variable anyway. There might or
might not be some degeneracies in the system . . . it doesn’t matter.
Three non-identical particles. Now suppose there is not one, but
three particles subject to this potential, and that they’re not identical. We
can build a basis of product wavefunctions.
η1 (xA )η1 (xB )η1 (xC )      |1, 1, 1⟩      ε1 + ε1 + ε1
η1 (xA )η1 (xB )η2 (xC )      |1, 1, 2⟩      ε1 + ε1 + ε2
η1 (xA )η2 (xB )η1 (xC )      |1, 2, 1⟩      ε1 + ε2 + ε1
η2 (xA )η1 (xB )η1 (xC )      |2, 1, 1⟩      ε2 + ε1 + ε1
          ⋮                       ⋮               ⋮
η1 (xA )η7 (xB )η3 (xC )      |1, 7, 3⟩      ε1 + ε7 + ε3
η7 (xA )η3 (xB )η1 (xC )      |7, 3, 1⟩      ε7 + ε3 + ε1
          ⋮                       ⋮               ⋮
ηM (xA )ηM (xB )ηM (xC )      |M, M, M⟩      εM + εM + εM
A few remarks: (1) There are M³ elements in the basis. (2) We have a
basis of product wavefunctions, but that doesn’t mean that every state is a
product state, because an arbitrary state is a sum of basis elements. (3) It’s
tiring to write always the form in the left column so we abbreviate it through
the form in the center column. (4) Sequence matters: the state |4, 5, 1⟩ is
different from the state |1, 4, 5⟩. (5) If the three particles don’t interact,
then this is an energy basis with the eigenvalues shown. But even if they
do interact, it’s a basis. (6) If the particles don’t interact, then there is
necessarily degeneracy in this basis. (7) To keep in mind the distinction
between this basis for the three-particle system and the basis for the one-
particle system from which it is built, we often call the three-particle basis
elements “states” and the one-particle basis elements “levels”. The levels
are the building blocks out of which states are constructed.
Building a symmetric basis. Any wavefunction can be expressed as
a sum over the above basis,
\[
\psi(x_A, x_B, x_C) = \sum_{r=1}^{M} \sum_{s=1}^{M} \sum_{t=1}^{M} c_{r,s,t}\, \eta_r(x_A)\eta_s(x_B)\eta_t(x_C) = \sum_{r,s,t} c_{r,s,t}\, |r, s, t\rangle,
\]
but if we have three identical bosons, we’re not interested in any wavefunc-
tion, we’re interested in symmetric wavefunctions. To build a symmetric
wavefunction, we execute the symmetrization process on ψ(xA , xB , xC ).
Doing so, we conclude that this symmetric wavefunction can be expressed
as a sum over the symmetrization of each basis element.
Let’s think a bit about the symmetrization of
ηr (xA )ηs (xB )ηt (xC ) also known as |r, s, t⟩.
When we introduced the symmetrization process, we permuted the variables
(representing position windows) xA , xB , and xC . But if the seed function
happens to be a product like this, it’s obviously the same thing to permute
the level indices r, s, and t. We represent the symmetrization of |r, s, t⟩ as
Ŝ|r, s, t⟩ = const (|r, s, t⟩ + |r, t, s⟩ + |t, r, s⟩ + |t, s, r⟩ + |s, t, r⟩ + |s, r, t⟩)
where “const” is a normalization constant.
If we go through and symmetrize each element of the basis for three
non-identical particles, we will find a basis for symmetric states. Let’s start
with |1, 1, 1⟩. This symmetrizes to itself:
Ŝ|1, 1, 1⟩ = |1, 1, 1⟩.
Next comes |1, 1, 2⟩:
Ŝ|1, 1, 2⟩ = const (|1, 1, 2⟩ + |1, 2, 1⟩ + |2, 1, 1⟩ + |2, 1, 1⟩ + |1, 2, 1⟩ + |1, 1, 2⟩)
= 2 const (|1, 1, 2⟩ + |1, 2, 1⟩ + |2, 1, 1⟩).
It’s clear, then, that
Ŝ|1, 1, 2⟩ = Ŝ|1, 2, 1⟩ = Ŝ|2, 1, 1⟩,
so we must discard two of these three states from our basis. It’s equally
clear that all states built through symmetrizing any three given levels are
the same state. For example
Ŝ|3, 9, 2⟩ = Ŝ|3, 2, 9⟩ = Ŝ|2, 3, 9⟩ = Ŝ|2, 9, 3⟩ = Ŝ|9, 2, 3⟩ = Ŝ|9, 3, 2⟩,
and we must discard five of these six states from our basis.
We are left with a basis of
\[
\frac{M(M + 1)(M + 2)}{3!}
\]
symmetric elements. One of the neat things about these elements is that
they’re long . . . for example one of them is
\[
\frac{1}{\sqrt{3!}} \bigl[ \eta_1(x_A)\eta_7(x_B)\eta_3(x_C) + \eta_1(x_A)\eta_3(x_B)\eta_7(x_C) + \eta_3(x_A)\eta_1(x_B)\eta_7(x_C)
+ \eta_3(x_A)\eta_7(x_B)\eta_1(x_C) + \eta_7(x_A)\eta_3(x_B)\eta_1(x_C) + \eta_7(x_A)\eta_1(x_B)\eta_3(x_C) \bigr]
\]
but to specify it we need only state the three levels that go into building
it (the three “building blocks” that go into making it). [This was not the
case for three non-identical particles.] Consequently one often speaks of
this state as “a particle in level 1, a particle in level 7, and a particle in
level 3”. This phrase is not correct: If a particle were in level 7, then it
could be distinguished as “the particle in level 7” and hence would not
be identical to the other two particles. The correct statement is that the
system is in the symmetric state given above, and that the individual
particles do not have states. On the other hand, the correct statement is a
mouthful and you may use the “balls in buckets” picture as shorthand —
as long as you say it but don’t think it.
Building an antisymmetric basis. We can build a basis of states,
each of which is antisymmetric, in a parallel manner by antisymmetrizing
each element of the basis for non-identical particles and discarding dupli-
cates.
The antisymmetrization of
ηr (xA )ηs (xB )ηt (xC ) also known as |r, s, t⟩,
results in
Â|r, s, t⟩ = const (|r, s, t⟩ − |r, t, s⟩ + |t, r, s⟩ − |t, s, r⟩ + |s, t, r⟩ − |s, r, t⟩)
where “const” is again a normalization constant.
Let’s start with |1, 1, 1⟩. This antisymmetrizes to zero. Same with
|1, 1, 2⟩:
Â|1, 1, 2⟩ = const (|1, 1, 2⟩ − |1, 2, 1⟩ + |2, 1, 1⟩ − |2, 1, 1⟩ + |1, 2, 1⟩ − |1, 1, 2⟩) = 0.
It’s clear, in fact, that any basis element with two indices the same will
antisymmetrize to zero. The only way to avoid antisymmetrization to zero
is for all of the level indices to differ. Furthermore
Â|r, s, t⟩ = −Â|r, t, s⟩ = Â|t, r, s⟩ = −Â|t, s, r⟩ = Â|s, t, r⟩ = −Â|s, r, t⟩
so the six distinct basis elements |1, 7, 3⟩, |7, 3, 1⟩, |3, 7, 1⟩, etc. all
antisymmetrize to the same state, apart from an overall sign.
Discarding the duplicates results in a basis of
\[
\frac{M(M - 1)(M - 2)}{3!}
\]
antisymmetric elements.
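The three basis sizes can be checked by brute-force enumeration. The sketch below leans on the observation in the text that a symmetric basis element is specified by an unordered collection of three levels (repeats allowed), and an antisymmetric one by an unordered collection of three different levels. The function name count_bases and the choice M = 5 are assumptions made for this example.

```python
from itertools import combinations, combinations_with_replacement, product

def count_bases(M):
    """Count three-particle basis elements built from M one-particle levels."""
    levels = range(1, M + 1)
    # Non-identical particles: every ordered triple |r, s, t> is a distinct element.
    distinguishable = sum(1 for _ in product(levels, repeat=3))
    # Bosons: only the unordered collection of levels matters, repeats allowed.
    symmetric = sum(1 for _ in combinations_with_replacement(levels, 3))
    # Fermions: unordered, and all three levels must differ.
    antisymmetric = sum(1 for _ in combinations(levels, 3))
    return distinguishable, symmetric, antisymmetric

M = 5
n_dist, n_sym, n_anti = count_bases(M)
```

With M = 5 the enumeration agrees with M³, M(M + 1)(M + 2)/3!, and M(M − 1)(M − 2)/3!, respectively.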
Once again these states have long expressions like
\[
\frac{1}{\sqrt{3!}} \bigl[ \eta_1(x_A)\eta_7(x_B)\eta_3(x_C) - \eta_1(x_A)\eta_3(x_B)\eta_7(x_C) + \eta_3(x_A)\eta_1(x_B)\eta_7(x_C)
- \eta_3(x_A)\eta_7(x_B)\eta_1(x_C) + \eta_7(x_A)\eta_3(x_B)\eta_1(x_C) - \eta_7(x_A)\eta_1(x_B)\eta_3(x_C) \bigr]
\]
but to specify the three-particle state we need only list the one-particle
building blocks (“levels”) used in its construction. This results in almost the
same “balls in buckets” picture that we drew for symmetric wavefunctions,
with the additional restriction that any bucket can contain only one or zero
balls. Once again you may use the “balls in buckets” picture as a shorthand,
as long as you keep in mind that it conceals a considerably more intricate

process of building and antisymmetrizing.
Generalizations. It is easy to generalize this procedure for building
antisymmetric and symmetric many-particle basis states out of one-particle
levels for any number of particles. The only special case is for two particles,
where the symmetric basis has M(M + 1)/2 elements and the antisymmetric
basis has M(M − 1)/2 elements. Putting these two bases together results in
a full basis of M² elements. This reflects the fact that any function of two
variables can be written as the sum of an antisymmetric and a symmetric
function. The same is not true for systems of three or more particles.
15.7 Problem: Count the elements in the symmetric and antisymmetric
bases for N particles rather than three. (Continue to use M levels.)
Does your expression have the proper limits when N = 1 and when
N = M?
15.8 Problem: Find the normalization constant for Ŝ|7, 3, 7⟩.
15.9 Problem: Any two-variable function may be written as a sum of a
symmetric and an antisymmetric function. Consequently the union of
the symmetric basis and the antisymmetric basis is a basis for the set
of all two-variable functions. Show that neither of these statements is
true for functions of three variables.
15.10 Mean separation
(Be sure to read Griffiths section 5.1.2, “Exchange forces,” before
attempting this problem.) Two noninteracting particles, each of mass
m, are in an infinite square well of width L. The associated one-body
energy eigenstates are ηn (x) and ηm (x), where
\[
\eta_n(x) = \sqrt{\frac{2}{L}} \sin\left(\frac{n\pi}{L} x\right).
\]
Calculate the root-mean-square separation
\[
\sqrt{\langle (x_A - x_B)^2 \rangle}
\]
if these are
a. two non-identical particles, one in state ηn (xA ) and the other in
state ηm (xB )
b. two identical bosons, in state
\[
\frac{1}{\sqrt{2}} [\eta_n(x_A)\eta_m(x_B) + \eta_m(x_A)\eta_n(x_B)]
\]
c. two identical fermions, in state
\[
\frac{1}{\sqrt{2}} [\eta_n(x_A)\eta_m(x_B) - \eta_m(x_A)\eta_n(x_B)]
\]
Do your results always adhere to our general rule of “symmetric means
huddled together; antisymmetric means spread apart”?
d. Bonus, for the mathematically inclined. Solve all three parts above
in one step by computing the root-mean-square separation for the
two-particle state
\[
\cos\theta\, \eta_n(x_A)\eta_m(x_B) + \sin\theta\, \eta_m(x_A)\eta_n(x_B)
\]
where θ is a parameter of no physical significance.
15.11 Building basis states
Suppose you had three particles and three “building block” levels (say
the orthonormal levels η1 (x), η3 (x), and η7 (x)). Construct normalized
three-particle basis states for the case of
a. three non-identical particles
b. three identical bosons
c. three identical fermions
How many states are there in each basis? Repeat for three particles
with four one-particle levels, but in this case simply count and don’t
write down all the three-particle states.
15.2 An antisymmetric basis for the helium problem
Helium: two electrons and one nucleus. The three-body problem! But
wait, the three-body problem hasn’t been solved exactly even in classical
mechanics, there’s no hope for an exact solution in quantum mechanics.
Does this mean we give up? No. If you give up on a problem you can’t solve
exactly, you give up on life.4 Instead, we look for approximate solutions.
4 Can’t find the exact perfect apartment to rent? Can’t find the exact perfect candidate
to vote for? Can’t find the exact perfect friend? Of course you can’t find any of these
things. But we get on with our lives accepting imperfections because we realize that the
alternatives (homelessness, political corruption, friendlessness) are worse.
If we take account of the Coulomb forces, but ignore things like the
finite size of the nucleus, nuclear motion, relativistic motion of the electron,
spin-orbit effects, and so forth, the Hamiltonian for two electrons and one
nucleus is
\begin{align*}
\hat{H} &\doteq -\frac{\hbar^2}{2m_e}\nabla_A^2 - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{r_A} - \frac{\hbar^2}{2m_e}\nabla_B^2 - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{r_B} + \frac{e^2}{4\pi\epsilon_0}\frac{1}{|\vec{r}_A - \vec{r}_B|} \\
&= \underbrace{\widehat{KE}_A + \hat{U}_{nA}}_{\equiv \hat{H}_A} + \underbrace{\widehat{KE}_B + \hat{U}_{nB}}_{\equiv \hat{H}_B} + \underbrace{\hat{U}_{AB}}_{\equiv \hat{H}'}
\end{align*}
Recall that in using the subscripts “A” and “B” we are not labeling the
electrons as “electron A” and “electron B”: the electrons are identical and
can’t be labeled. Instead we are labeling the points in space where an
electron might exist as “point A” and “point B”.
We will look for eigenstates of the partial Hamiltonian ĤA + ĤB . These
are not eigenstates of the full Hamiltonian, but they are a basis, and they
can be used as a place to begin perturbation theory.
One-particle levels
We begin by finding the one-particle levels (or “orbitals”) for just the Hamil-
tonian ĤA . We combine these with levels for ĤB and antisymmetrize the
result.
The problem for ĤA is just the Hydrogen atom Coulomb problem with two
changes:
nuclear mass is 4mp =⇒ very small effect (“ignore nuclear motion”)
nuclear charge is 2e =⇒ the Rydberg is
\[
\mathrm{Ry} = \frac{m_e}{2\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2, \quad\text{so}\quad E^{(A)}_{n_A} = -\frac{4\,\mathrm{Ry}}{n_A^2}.
\]
Similarly, the energy eigenstates for ĤA are represented by familiar
functions like
ηnℓm (r)|↑⟩ or ηnℓm (r)χ+ .
Soon we will need to keep track of ĤA versus ĤB . A notation like
ηnℓm (rA )|↑⟩ is fine for the space part of the eigenstate, but leaves the
spin part ambiguous. We will hence use notation like
ηnℓm (A)χ+ (A)
to mean the same thing.
[[Notice that the eigenstates don’t have to take on the factorized form
of “space part”×“spin part” — for example
\[
\frac{1}{\sqrt{2}} [\eta_{200}(r)\chi_+ + \eta_{210}(r)\chi_-]
\]
is a perfectly good eigenstate — but that the factorized form is particularly
convenient for working with. (If we were to consider spin-orbit coupling,
then the eigenstates could not take the factorized form.)]]
Antisymmetrization
Recall how we build an antisymmetrized wavefunction from a product of
two one-particle levels, ηn (A) and ηm (B):
Âηn (A)ηm (B) = (1/√2) [ηn (A)ηm (B) − ηm (A)ηn (B)]
(The normalization factor 1/√2 holds when ηn (A) and ηm (A) are
orthogonal.)
Two theorems:
• If you antisymmetrize a product of the same two levels, you end up
with zero:
Âηn (A)ηn (B) = 0.
• If you antisymmetrize two levels in the opposite sequence, you end up
with the same state, apart from an overall sign:
Âηn (A)ηm (B) = −Âηm (A)ηn (B).
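Both theorems can be seen in a toy bookkeeping model, sketched here under stated assumptions: a two-particle state is stored as a dictionary mapping (level at A, level at B) to a coefficient, the levels are taken to be orthonormal, and the 1/√2 normalization is left out so that all coefficients stay integers. The function name antisymmetrize_product is hypothetical.

```python
def antisymmetrize_product(n, m):
    """Unnormalized antisymmetrization of eta_n(A) eta_m(B).

    Returns {(level at A, level at B): coefficient}; an empty dictionary
    stands for the zero state.
    """
    state = {}
    for key, coeff in (((n, m), +1), ((m, n), -1)):
        state[key] = state.get(key, 0) + coeff
    # Drop terms whose coefficients cancelled.
    return {key: c for key, c in state.items() if c != 0}

def negate(state):
    """Multiply a state by -1."""
    return {key: -c for key, c in state.items()}

same_level = antisymmetrize_product(3, 3)   # first theorem
forward = antisymmetrize_product(3, 7)      # second theorem, one sequence
reverse = antisymmetrize_product(7, 3)      # second theorem, opposite sequence
```

Here same_level is the empty dictionary (the zero state), and forward equals negate(reverse).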
The ground state
The ground levels of ĤA and of ĤB are both doubly degenerate due to
spin. So if you had distinguishable particles, the ground state of ĤA + ĤB
would be four-fold degenerate:
distinguishable
η100 (A)χ+ (A)η100 (B)χ+ (B)
η100 (A)χ+ (A)η100 (B)χ− (B)
η100 (A)χ− (A)η100 (B)χ+ (B)
η100 (A)χ− (A)η100 (B)χ− (B)
But if you have identical fermions, two of these basis states antisymmetrize
to zero, and the other two antisymmetrize to the same state:
distinguishable                     antisymmetrized
η100(A)χ+(A)η100(B)χ+(B)      0
η100(A)χ+(A)η100(B)χ−(B)      (1/√2) [η100(A)χ+(A)η100(B)χ−(B) − η100(A)χ−(A)η100(B)χ+(B)]
η100(A)χ−(A)η100(B)χ+(B)      −(1/√2) [above]
η100(A)χ−(A)η100(B)χ−(B)      0
Hence the Hamiltonian ĤA + ĤB has a non-degenerate ground state, namely
η100(A)η100(B) (1/√2) [χ+(A)χ−(B) − χ−(A)χ+(B)].
It’s common to hear things like “In the ground state of Helium, one
electron is in one-body level |100⟩ with spin up and the other is in
one-body level |100⟩ with spin down.” This claim is false. The equation makes
it clear that “In the ground state of Helium, one electron is in one-body level
|100⟩, the other is in one-body level |100⟩, and the spins are entangled.” If
the first phrase were correct, then you would be able to distinguish the two
electrons, and they would not be identical. But it’s not correct.
States built from one ground level
Now build a state by combining the ground level of one Hamiltonian with
|nℓm⟩ from the other. If you had distinguishable particles, this
“combination” means a simple multiplication, and there would be eight states
(all with the same energy):
distinguishable
η100(A)χ+(A)ηnℓm(B)χ+(B)
η100(A)χ+(A)ηnℓm(B)χ−(B)
η100(A)χ−(A)ηnℓm(B)χ+(B)
η100(A)χ−(A)ηnℓm(B)χ−(B)
ηnℓm(A)χ+(A)η100(B)χ+(B)
ηnℓm(A)χ+(A)η100(B)χ−(B)
ηnℓm(A)χ−(A)η100(B)χ+(B)
ηnℓm(A)χ−(A)η100(B)χ−(B)
But if you have identical fermions, the “combination” means a
multiplication followed by an antisymmetrization. Because of the second theorem
concerning antisymmetrization, each of the last four products above
antisymmetrizes to the same state (apart from sign) as one of the first four
products. The first four products result in antisymmetrized states as follows:
    distinguishable                     antisymmetrized
(a) η100(A)χ+(A)ηnℓm(B)χ+(B)      (1/√2) [η100(A)χ+(A)ηnℓm(B)χ+(B) − ηnℓm(A)χ+(A)η100(B)χ+(B)]
(b) η100(A)χ+(A)ηnℓm(B)χ−(B)      (1/√2) [η100(A)χ+(A)ηnℓm(B)χ−(B) − ηnℓm(A)χ−(A)η100(B)χ+(B)]
(c) η100(A)χ−(A)ηnℓm(B)χ+(B)      (1/√2) [η100(A)χ−(A)ηnℓm(B)χ+(B) − ηnℓm(A)χ+(A)η100(B)χ−(B)]
(d) η100(A)χ−(A)ηnℓm(B)χ−(B)      (1/√2) [η100(A)χ−(A)ηnℓm(B)χ−(B) − ηnℓm(A)χ−(A)η100(B)χ−(B)]
Antisymmetrized expressions (a) and (d) readily factor into a space part
times a spin part:
(a) =⇒ (1/√2) [η100(A)ηnℓm(B) − ηnℓm(A)η100(B)] χ+(A)χ+(B)
(d) =⇒ (1/√2) [η100(A)ηnℓm(B) − ηnℓm(A)η100(B)] χ−(A)χ−(B)
But expressions (b) and (c) do not factor. One thing to do about this is
nothing — after all, there’s no requirement that the wavefunctions factorize.
But another approach is to look for a simple change of basis (remember,
these four states all have the same energy of ĤA + ĤB ). Someone (I don’t
know who) thought about the favorite change of basis in planar geometry
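— a rotation of the axes by 45°:
\[
\hat{\jmath}' = \frac{1}{\sqrt{2}} [\hat{\jmath} - \hat{\imath}], \qquad \hat{\imath}' = \frac{1}{\sqrt{2}} [\hat{\jmath} + \hat{\imath}]
\]
[Figure: the unit vectors ı̂ and ȷ̂, with the rotated unit vectors ı̂′ and ȷ̂′ drawn at 45° to them.]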
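Applying this transformation to the basis elements (“unit vectors”) given
through (b) and (c) results in the new basis elements
\begin{align*}
\tfrac{1}{\sqrt{2}} [(b) + (c)] &= \tfrac{1}{2} \{ [\eta_{100}(A)\eta_{n\ell m}(B) - \eta_{n\ell m}(A)\eta_{100}(B)]\, \chi_+(A)\chi_-(B) \\
&\qquad + [\eta_{100}(A)\eta_{n\ell m}(B) - \eta_{n\ell m}(A)\eta_{100}(B)]\, \chi_-(A)\chi_+(B) \} \\
&= \tfrac{1}{\sqrt{2}} [\eta_{100}(A)\eta_{n\ell m}(B) - \eta_{n\ell m}(A)\eta_{100}(B)]\; \tfrac{1}{\sqrt{2}} [\chi_+(A)\chi_-(B) + \chi_-(A)\chi_+(B)]
\end{align*}
and
\begin{align*}
\tfrac{1}{\sqrt{2}} [(b) - (c)] &= \tfrac{1}{2} \{ [\eta_{100}(A)\eta_{n\ell m}(B) + \eta_{n\ell m}(A)\eta_{100}(B)]\, \chi_+(A)\chi_-(B) \\
&\qquad - [\eta_{100}(A)\eta_{n\ell m}(B) + \eta_{n\ell m}(A)\eta_{100}(B)]\, \chi_-(A)\chi_+(B) \} \\
&= \tfrac{1}{\sqrt{2}} [\eta_{100}(A)\eta_{n\ell m}(B) + \eta_{n\ell m}(A)\eta_{100}(B)]\; \tfrac{1}{\sqrt{2}} [\chi_+(A)\chi_-(B) - \chi_-(A)\chi_+(B)].
\end{align*}
This process results in an antisymmetric basis of
\begin{align*}
&\tfrac{1}{\sqrt{2}} [\eta_{100}(A)\eta_{n\ell m}(B) - \eta_{n\ell m}(A)\eta_{100}(B)]\; \chi_+(A)\chi_+(B) \\
&\tfrac{1}{\sqrt{2}} [\eta_{100}(A)\eta_{n\ell m}(B) - \eta_{n\ell m}(A)\eta_{100}(B)]\; \tfrac{1}{\sqrt{2}} [\chi_+(A)\chi_-(B) + \chi_-(A)\chi_+(B)] \\
&\tfrac{1}{\sqrt{2}} [\eta_{100}(A)\eta_{n\ell m}(B) - \eta_{n\ell m}(A)\eta_{100}(B)]\; \chi_-(A)\chi_-(B) \\
&\tfrac{1}{\sqrt{2}} [\eta_{100}(A)\eta_{n\ell m}(B) + \eta_{n\ell m}(A)\eta_{100}(B)]\; \tfrac{1}{\sqrt{2}} [\chi_+(A)\chi_-(B) - \chi_-(A)\chi_+(B)].
\end{align*}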
The first three elements are called a “triplet” (with “space antisymmet-
ric, spin symmetric”). The last element is called a “singlet” (with “space
symmetric, spin antisymmetric”). This particular basis has three nice prop-
erties: (1) Every basis element factorizes into a spatial part times a spin
part. (2) Every basis element factorizes into a symmetric part times an an-
tisymmetric part. (3) All three elements in the triplet have identical spatial
parts.
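The factorization of the (b) ± (c) combinations can be checked mechanically. The sketch below is a bookkeeping exercise under explicit assumptions: each one-particle level is a (space label, spin label) pair, a two-particle state is a dictionary from (level at A, level at B) to an integer coefficient, and every normalization factor is dropped. All function names are hypothetical.

```python
def antisym(level_1, level_2):
    """Unnormalized antisymmetrization of level_1(A) level_2(B)."""
    return {(level_1, level_2): 1, (level_2, level_1): -1}

def combine(state_1, state_2, sign):
    """Return state_1 + sign * state_2, dropping terms that cancel."""
    out = dict(state_1)
    for key, c in state_2.items():
        out[key] = out.get(key, 0) + sign * c
    return {key: c for key, c in out.items() if c != 0}

def space_times_spin(space, spin):
    """Multiply a two-particle space part by a two-particle spin part."""
    out = {}
    for (sa, sb), cs in space.items():
        for (pa, pb), cp in spin.items():
            out[((sa, pa), (sb, pb))] = cs * cp
    return out

# Antisymmetrized expressions (b) and (c) from the table, unnormalized.
b = antisym(("100", "+"), ("nlm", "-"))
c = antisym(("100", "-"), ("nlm", "+"))

space_anti = {("100", "nlm"): 1, ("nlm", "100"): -1}
space_sym = {("100", "nlm"): 1, ("nlm", "100"): 1}
spin_sym = {("+", "-"): 1, ("-", "+"): 1}
spin_anti = {("+", "-"): 1, ("-", "+"): -1}

triplet_middle = combine(b, c, +1)   # proportional to (b) + (c)
singlet = combine(b, c, -1)          # proportional to (b) - (c)
```

In this model triplet_middle equals space_times_spin(space_anti, spin_sym) and singlet equals space_times_spin(space_sym, spin_anti), matching the “space antisymmetric, spin symmetric” and “space symmetric, spin antisymmetric” descriptions.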
The third point means that when we take account of electron-electron
repulsion through perturbation theory, we will necessarily find that all three
elements of any triplet remain degenerate even when the effects of the sub-
Hamiltonian ÛAB are considered.
[[This process works for combining an arbitrary level |nℓm⟩ with a
ground state level |100⟩. Since this should work for any |nℓm⟩, what
happens if we take |nℓm⟩ = |100⟩, the situation we first considered? In
particular, what’s up with the normalization?]]
States built from two excited levels
What happens if we carry out the above process but combine an excited
level of one sub-Hamiltonian (say η200 (A)) with an arbitrary level of the
other sub-Hamiltonian (say ηnℓm (B))?
The process goes on in a straightforward way, but it turns out that the
resulting eigenenergies are always so high that the atom is unstable:
it decays rapidly to a positive helium ion plus an ejected electron. Such
electrons are called “Auger electrons” (pronounced “oh-jey” because Pierre
Victor Auger was French) and Auger electron spectroscopy is an important
analytical technique in surface and materials science.
15.12 Electron-electron repulsion

In class we wrote the (approximate) Hamiltonian for helium as ĤA +
ĤB + ÛAB and found antisymmetric energy eigenstates for ĤA + ĤB
that we called 1¹S, 2³S, 2¹P, and so forth. Then we qualitatively
discussed how the energy associated with such a state would change, under
perturbation theory, through the electron-electron repulsion term ÛAB .
Write down expressions for the first-order energy shifts due to ÛAB for
2¹S, 2³S, and 2¹P. (That is, set up the integrals in terms of the
one-particle eigenstates ηnℓm (r). Do not evaluate the integrals.) Bonus:
Argue that the energy shift for 2¹P is greater than the shift for 2¹S.
15.13 Two-electron ions
Apply the techniques of Griffiths, section 7.2, “Ground State of He-
lium,” to the H− and Li+ ions. Each of these ions has two electrons,
like helium, but nuclear charges Z = 1 and Z = 3, respectively. For
each ion find the effective (partially shielded) nuclear charge and de-
termine the best upper bound on the ground state energy.
15.14 The meaning of two-particle wavefunctions (Old)
a. The wavefunction ψ(xA , xB ) describes two non-identical particles
in one dimension. Does
\[
\int_{-\infty}^{\infty} dx_A \int_{-\infty}^{\infty} dx_B \, |\psi(x_A, x_B)|^2 \tag{15.1}
\]
equal one (the usual normalization) or two (the number of parti-
cles)? Write integral expressions for:
i. The probability of finding particle A between x1 and x2 and
particle B between x3 and x4 .
ii. The probability of finding particle A between x1 and x2 , re-
gardless of where particle B is.
b. The wavefunction ψ(xA , xB ) describes two identical particles in
one dimension. Does
\[
\int_{-\infty}^{\infty} dx_A \int_{-\infty}^{\infty} dx_B \, |\psi(x_A, x_B)|^2 \tag{15.2}
\]
equal one or two? Assuming that x1 < x2 < x3 < x4 , write
integral expressions for:
i. The probability of finding one particle between x1 and x2 and

the other between x3 and x4 .
ii. The probability of finding a particle between x1 and x2 .
c. Look up the definition of “configuration space” in a classical me-
chanics book. Does the wavefunction inhabit configuration space
or conventional three-dimensional position space? For discussion:
Does your answer have any bearing upon the question of whether
the wavefunction is “physically real” or a “mathematical conve-
nience”? Does it affect your thoughts concerning measurement
and the “collapse of the wavepacket”?
15.15 Symmetric and close together, antisymmetric and far apart
(Old)
In lecture I argued that symmetric wavefunctions describe particles that
huddle together while antisymmetric wavefunctions describe particles
that avoid one another.
a. Illustrate this principle as follows: Construct symmetric and an-
tisymmetric two-particle wavefunctions out of the single-particle
wavefunctions
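\[
\eta_1(x) = \sqrt{\frac{2}{L}} \sin\left(\pi \frac{x}{L}\right) \quad\text{and}\quad \eta_2(x) = \sqrt{\frac{2}{L}} \sin\left(2\pi \frac{x}{L}\right), \qquad 0 \le x \le L, \tag{15.3}
\]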
which are the first and second energy eigenfunctions for the infinite
square well of width L. For each (anti)symmetrized function make
a plot of xA and xB and shade in regions of high probability
density.
b. Prove that if the two wavefunctions ψ(x) and φ(x) are orthogonal,
then the expectation value of (xA − xB )2 for the antisymmetric
combination of the two wavefunctions is greater than or equal to
that for the symmetric combination.
15.16 Symmetrization and antisymmetrization (mathematical) (Old)
a. Show that any two-variable function can be written as the sum of
a symmetric function and an antisymmetric function.
b. Show that this is not true for functions of three variables. [Hint:
Try the counterexample f (x, y, z) = g(x).]
c. There is a function of three variables that is:
i. Antisymmetric under interchange of the first and second vari-
ables: f (x, y, z) = −f (y, x, z).
ii. Symmetric under interchange of the second and third vari-

ables: f (x, y, z) = f (x, z, y).
iii. Symmetric under interchange of the first and third variables:
f (x, y, z) = f (z, y, x).
Find this function and show that it is unique.
Chapter 16
Breather
Why do we need a breather at this point?

There are no new principles, but lots of applications. The applications
will shed light on the principles and the principles will shed light on the
applications. I will not attempt to fool you: the applications will be hard.
For example, the three-body problem has not been solved in classical me-
chanics. In the richer, more intricate, world of quantum mechanics, we will
not solve it either.
You know from solving problems in classical mechanics that you should
think first, before plunging into a hard problem. You know, for example,
that if you use the appropriate variables, select the most appropriate
coordinate system, or use a symmetry, you can save untold amounts of
labor. (See, for example, George Pólya, How to Solve It (Doubleday,
Garden City, NY, 1957) and Sanjoy Mahajan, Street-Fighting Mathematics (MIT
Press, Cambridge, MA, 2010).) This rule holds even more so in the more
complex world of quantum mechanics.
And that’s the role of this chapter. We’ll take a breather, pull back from
the details, and organize ourselves for facing the difficult problems that lie
before us.
Henry David Thoreau, Walden (1854): “I went to the woods because
I wished to live deliberately, to front only the essential facts of life, and
see if I could not learn what it had to teach, and not, when I came to die,
discover that I had not lived.”
16.1 Scaled variables
Here’s the energy eigenproblem for the hydrogen atom (at the level of ap-
proximation ignoring collisions, radiation, nuclear mass, nuclear size, spin,
magnetic effects, relativity, and the quantum character of the electromag-
netic field):
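\[
\left[ -\frac{\hbar^2}{2m} \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) - \frac{e^2}{4\pi\epsilon_0} \frac{1}{r} \right] \eta(\vec{r}) = E \eta(\vec{r}). \tag{16.1}
\]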
This section uses dimensional analysis to find the characteristic length and
characteristic energy for this problem, then uses scaled variables to express
this equation in a more natural and more easily-worked-with form.
Whatever result comes out of this energy eigenequation, whether the
result be a length, or an energy, or anything else, the result can only
depend on three parameters: ħ, m, and e²/4πε₀. These parameters have the
following dimensions:
parameter      dimensions      base dimensions (mass, length, time)
ħ              [Energy×T]      [ML²/T]
m              [M]             [M]
e²/4πε₀        [Energy×L]      [ML³/T²]
How can we build a quantity with the dimensions of length from these
three parameters? Well, the quantity will have to involve ħ and e²/4πε₀,
because these are the only parameters that include the dimensions of length,
but we’ll have to get rid of those dimensions of time. We can do that by
squaring the first and dividing by the third:
quantity              dimensions
ħ²/(e²/4πε₀)          [ML]
And now there’s only one way to get rid of the dimension of mass (without
reintroducing a dimension of time), namely dividing this quantity by m:
quantity              dimensions
ħ²/(m e²/4πε₀)        [L]
We have uncovered the one and only way to combine these three parameters
to produce a quantity with the dimensions of length. We define the Bohr
radius
\[
a_0 \equiv \frac{\hbar^2}{m\, e^2/4\pi\epsilon_0} \approx 0.05~\text{nm}. \tag{16.2}
\]
This quantity sets the typical scale for any length in a hydrogen atom. For
example, if I ask for the mean distance from the nucleus to an electron in
energy eigenstate η5,4,−3 (r⃗) the answer will be some pure (dimensionless)
number times a0 . If I ask for the uncertainty in x̂ of an electron in state
η2,1,0 (r⃗) the answer will be some pure number times a0 .
Is there a characteristic energy? Yes, it is given through e²/4πε₀ divided
by a0 . The characteristic energy is
\[
E_0 \equiv \frac{m\, (e^2/4\pi\epsilon_0)^2}{\hbar^2} = 2\,\mathrm{Ry}. \tag{16.3}
\]
This characteristic energy doesn’t have its own name, because we just call
it twice the Rydberg energy (the minimum energy required to ionize a
hydrogen atom). It plays the same role for energies that a0 plays for lengths:
Any energy value concerning hydrogen will be a pure number times E0 .
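The dimensional argument amounts to solving three linear equations for three exponents, which the following sketch does with exact fractions and Cramer’s rule before plugging in numbers. The function names are invented for this example, and the numerical constants are standard SI values typed in by hand (an assumption of the sketch, not something taken from this text).

```python
from fractions import Fraction
import math

# Exponents of (mass, length, time) for each parameter, from the table above.
HBAR_DIM = (1, 2, -1)      # [M L^2 / T]
MASS_DIM = (1, 0, 0)       # [M]
COULOMB_DIM = (1, 3, -2)   # e^2/(4 pi epsilon_0): [M L^3 / T^2]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve_exponents(target):
    """Find (a, b, c) so that hbar^a m^b (e^2/4 pi eps0)^c has the target dimensions."""
    columns = (HBAR_DIM, MASS_DIM, COULOMB_DIM)
    matrix = [[columns[j][i] for j in range(3)] for i in range(3)]
    d = det3(matrix)
    solution = []
    for j in range(3):
        replaced = [row[:] for row in matrix]
        for i in range(3):
            replaced[i][j] = target[i]   # Cramer's rule: swap in the target column
        solution.append(Fraction(det3(replaced), d))
    return tuple(solution)

length_exponents = solve_exponents((0, 1, 0))    # dimensions of length
energy_exponents = solve_exponents((1, 2, -2))   # dimensions of energy

# SI values (assumed): hbar in J s, electron mass in kg, charge in C, eps0 in F/m.
HBAR = 1.054571817e-34
M_E = 9.1093837015e-31
E_CHARGE = 1.602176634e-19
EPS0 = 8.8541878128e-12
COULOMB = E_CHARGE**2 / (4 * math.pi * EPS0)

def evaluate(exponents):
    """Numerical value of hbar^a m^b (e^2/4 pi eps0)^c in SI units."""
    a, b, c = (float(x) for x in exponents)
    return HBAR**a * M_E**b * COULOMB**c

a0 = evaluate(length_exponents)                # metres
E0_eV = evaluate(energy_exponents) / E_CHARGE  # electron volts
```

The determinant of the system is nonzero, so each set of exponents is unique: (2, −1, −1) for length, reproducing a0, and (−2, 1, 2) for energy, reproducing E0.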
Now is the time to introduce scaled variables. Whenever I specify a
length, I specify that length in terms of some other length. For example,
when I say the Eiffel tower is 324 meters tall, I mean that the ratio of the
height of the Eiffel tower to the length of the prototype meter bar — that
bar stored in a vault in Sèvres, France — is 324.
Now, what is the relevance of the prototype meter bar to atomic phe-
nomena? None! Instead of measuring atomic lengths relative to the pro-
totype meter, it makes more sense to measure them relative to something
atomic, namely to the Bohr radius. I define the dimensionless “scaled
length” x̃ as
\[
\tilde{x} \equiv \frac{x}{a_0}, \tag{16.4}
\]
and it’s my preference to measure atomic lengths using this standard, rather
than using the prototype meter bar as a standard.
So, what is the energy eigenproblem (16.1) written in terms of scaled
lengths? For any function f (x), the chain rule of calculus tells us that
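\[
\frac{\partial f(x)}{\partial x} = \frac{\partial f(\tilde{x})}{\partial \tilde{x}} \frac{\partial \tilde{x}}{\partial x} = \frac{\partial f(\tilde{x})}{\partial \tilde{x}} \frac{1}{a_0}
\]
and consequently that
\[
\frac{\partial^2 f(x)}{\partial x^2} = \frac{\partial^2 f(\tilde{x})}{\partial \tilde{x}^2} \frac{1}{a_0^2}.
\]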
Consequently the energy eigenproblem (16.1) is
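\[
\left[ -\frac{\hbar^2}{2m} \frac{1}{a_0^2} \left( \frac{\partial^2}{\partial \tilde{x}^2} + \frac{\partial^2}{\partial \tilde{y}^2} + \frac{\partial^2}{\partial \tilde{z}^2} \right) - \frac{e^2}{4\pi\epsilon_0} \frac{1}{a_0 \tilde{r}} \right] \eta(\tilde{\vec{r}}) = E \eta(\tilde{\vec{r}}), \tag{16.5}
\]
which seems like a nightmare, until you realize that
\[
\frac{\hbar^2}{m a_0^2} = \frac{e^2}{4\pi\epsilon_0} \frac{1}{a_0} = E_0.
\]
The eigenproblem (16.5) is thus
\[
\left[ -\frac{1}{2} \left( \frac{\partial^2}{\partial \tilde{x}^2} + \frac{\partial^2}{\partial \tilde{y}^2} + \frac{\partial^2}{\partial \tilde{z}^2} \right) - \frac{1}{\tilde{r}} \right] \eta(\tilde{\vec{r}}) = \frac{E}{E_0} \eta(\tilde{\vec{r}}). \tag{16.6}
\]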
Defining the dimensionless “scaled energy”
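\[
\tilde{E} \equiv \frac{E}{E_0}, \tag{16.7}
\]
we see immediately that the energy eigenproblem, expressed in scaled
variables, is
\[
\left[ -\frac{1}{2} \left( \frac{\partial^2}{\partial \tilde{x}^2} + \frac{\partial^2}{\partial \tilde{y}^2} + \frac{\partial^2}{\partial \tilde{z}^2} \right) - \frac{1}{\tilde{r}} \right] \eta(\tilde{\vec{r}}) = \tilde{E} \eta(\tilde{\vec{r}}) \tag{16.8}
\]
or
\[
\left[ -\frac{1}{2} \tilde{\nabla}^2 - \frac{1}{\tilde{r}} \right] \eta(\tilde{\vec{r}}) = \tilde{E} \eta(\tilde{\vec{r}}). \tag{16.9}
\]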
Whoa! It’s considerably easier to work with the energy eigenproblem
written in this form than it is to work with form (16.1) — there are no ħs
and e²/4πε₀s to keep track of (and to lose through algebra errors).
The only problem is that there are so many tildes to write down. People
get tired of writing tildes, so they just omit them, with the understanding
that they are now working with scaled variables rather than traditional
variables, and the energy eigenproblem becomes

\[
\left[ -\frac{1}{2} \nabla^2 - \frac{1}{r} \right] \eta(\vec{r}) = E \eta(\vec{r}). \tag{16.10}
\]
I like to call this process “using scaled variables”. Others call it
“measuring length and energy in atomic units”. Still others say that we get
equation (16.10) from (16.1) by
“setting ħ = m = e²/4πε₀ = 1”.
This last phrase is particularly opaque, because taken literally it’s absurd.
So you must not take it literally: it’s a code phrase for the more interesting
process of converting to scaled variables and then dropping the tildes.
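As a sanity check on the scaled equation (16.10), the sketch below tests numerically that the familiar hydrogen ground state, which in scaled variables is η(r) = e^(−r) with scaled energy −1/2, satisfies it. That wavefunction and energy are standard results quoted here as assumptions, not derived in this section; the finite-difference step h and the function names are choices made for the example.

```python
import math

def eta(r):
    """Unnormalized hydrogen ground state in scaled variables (assumed form)."""
    return math.exp(-r)

def residual(r, energy=-0.5, h=1e-4):
    """Left side minus right side of (16.10) for a spherically symmetric eta.

    For a function of r alone, the Laplacian is eta'' + (2/r) eta'.
    Derivatives are estimated by central finite differences.
    """
    d1 = (eta(r + h) - eta(r - h)) / (2 * h)
    d2 = (eta(r + h) - 2 * eta(r) + eta(r - h)) / (h * h)
    laplacian = d2 + 2 * d1 / r
    return -0.5 * laplacian - eta(r) / r - energy * eta(r)

residuals = [residual(r) for r in (0.5, 1.0, 2.0, 5.0)]
```

Each residual is zero to within finite-difference error, with no ħ, m, or e²/4πε₀ appearing anywhere in the arithmetic. Multiplying the scaled energy −1/2 by E0 recovers −Ry in conventional units.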
One last point. Some people call this system not “atomic units” but
“natural units”. While these units are indeed the natural system for solving
problems in atomic physics, they are not the natural units for solving problems
in nuclear physics, or in stellar physics, or in cosmology. And they are par-
ticularly unnatural and inappropriate for measuring the heights of towers.
16.2 Variational method for finding the ground state energy
Imagine a gymnasium full of fruits

smallest fruit ≤ smallest cantaloupe.
Similarly
ground state energy $\le \langle\psi|\hat H|\psi\rangle$ for any $|\psi\rangle$.
So try out a bunch of states, turn the crank, find the smallest. Very me-
chanical.
For example, to estimate the ground state energy of a quartic oscillator
$V(x) = \alpha x^4$, you could use as trial wavefunctions the Gaussians
$$\psi(x) = \frac{1}{\sqrt[4]{\pi}\,\sqrt{\sigma}}\, e^{-x^2/2\sigma^2}.$$
Turn the crank to find $\langle\psi|\hat H|\psi\rangle$, then find the value of $\sigma$ that
minimizes that expectation value.
Two things to remember: First, it’s a mathematical technique useful in
many fields, not just in quantum mechanics. Second, it seems merely me-
chanical, but in fact it relies on picking good trial wavefunctions: you have
to gain an intuitive understanding of how the real wavefunction is going to
behave, then pick trial wavefunctions which can mimic that behavior. In
the words of Forman S. Acton (Numerical Methods that Work, 1970, page
252) “In the hands of a Feynman the [variational] technique works like a
Latin charm; with ordinary mortals the result is a mixed bag.”
Sample problem: Variational estimate for the ground state
energy of a quartic oscillator
The trial wavefunction
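$$\psi(x) = \frac{1}{\sqrt[4]{\pi}\,\sqrt{\sigma}}\, e^{-x^2/2\sigma^2}$$
is normalized. (If you don’t know this, you should verify it.) We look for
$$\begin{aligned}
\langle\psi|\hat H|\psi\rangle
&= \frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty} e^{-x^2/2\sigma^2}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}+\alpha x^4\right]e^{-x^2/2\sigma^2}\,dx \\
&= -\frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty} e^{-x^2/2\sigma^2}\left[-\frac{1}{\sigma^2}\left(1-\frac{x^2}{\sigma^2}\right)\right]e^{-x^2/2\sigma^2}\,dx
 + \alpha\frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty} e^{-x^2/2\sigma^2}\,x^4\,e^{-x^2/2\sigma^2}\,dx \\
&= \frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma^3}\int_{-\infty}^{+\infty}\left(1-\frac{x^2}{\sigma^2}\right)e^{-x^2/\sigma^2}\,dx
 + \alpha\frac{1}{\sqrt{\pi}\,\sigma}\int_{-\infty}^{+\infty} x^4\,e^{-x^2/\sigma^2}\,dx \\
&= \frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma^2}\int_{-\infty}^{+\infty}\left(1-\tilde x^2\right)e^{-\tilde x^2}\,d\tilde x
 + \alpha\frac{\sigma^4}{\sqrt{\pi}}\int_{-\infty}^{+\infty}\tilde x^4\,e^{-\tilde x^2}\,d\tilde x.
\end{aligned}$$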
Already, even before evaluating the integrals, we can see that both integrals
are numbers independent of the trial wavefunction width σ. Thus the
expected kinetic energy, on the left, decreases with σ while the expected
potential energy, on the right, increases with σ. Does this make sense to
you?
When you work out (or look up) the integrals, you find
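$$\langle\psi|\hat H|\psi\rangle = \frac{\hbar^2}{2m}\frac{1}{\sqrt{\pi}\,\sigma^2}\left[\sqrt{\pi}-\tfrac{1}{2}\sqrt{\pi}\right] + \alpha\frac{\sigma^4}{\sqrt{\pi}}\,\tfrac{3}{4}\sqrt{\pi} = \frac{\hbar^2}{2m}\frac{1}{2\sigma^2} + \alpha\frac{3\sigma^4}{4}.$$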
If you minimize this energy with respect to $\sigma$, you will find that the minimum value (which is, hence, the best upper bound for the ground state
energy) is
$$\left(9\,\frac{\hbar^2}{2m}\right)^{2/3}\frac{\alpha^{1/3}}{4}.$$
Problem: Show that the width of the minimum-energy wavefunction is
$$\sigma = \left(\frac{\hbar^2}{2m}\,\frac{1}{3\alpha}\right)^{1/6}.$$
Added: You can use the variational technique for other states as well:
for example, in one-dimensional systems, the energy of the first excited state is less than
or equal to $\langle\psi|\hat H|\psi\rangle$ for all states $|\psi\rangle$ with a single node.
16.3 Problems
16.1 Quantal recurrence in the infinite square well

a. Find the period as a function of energy for a classical particle of
mass m in an infinite square well of width L.
b. Show that any wavefunction, regardless of energy, in the same
infinite square well is periodic in time with a period
$$\frac{4mL^2}{\hbar\pi}.$$
(This part can be solved knowing only the energy eigenvalues.)
c. What happens after one-half of this time has passed? (This part
requires some knowledge of the energy eigenfunctions.)
[Note: This problem raises deep questions about the character of quan-
tum mechanics and of its classical limit. See D.F. Styer, “Quantum
revivals versus classical periodicity in the infinite square well,” Ameri-
can Journal of Physics 69 (January 2001) 56–62.]
16.2 Quantal recurrence in the Coulomb problem
Show that in the Coulomb problem, any quantal state consisting of a
superposition of two or more bound energy eigenstates with principal
quantal numbers $n_1, n_2, \ldots, n_r$ evolves in time with a period of
$$\frac{h}{\mathrm{Ry}}\,N^2,$$
where Ry is the Rydberg and the integer $N$ is the least common multiple
of $n_1, n_2, \ldots, n_r$.
16.3 Atomic units
The Schrödinger equation for the Coulomb problem is
$$i\hbar\frac{\partial\Psi(x,y,z,t)}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\Psi(x,y,z,t) - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r}\Psi(x,y,z,t).$$
It is clear that the answer to any physical problem can depend only on
the three parameters $\hbar$, $m$, and $e^2/4\pi\epsilon_0$. In section 16.1, we used these
ideas to show that any problem that asked for a length had to have
an answer which was a dimensionless number times the characteristic
length, the so-called Bohr radius
$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{e^2 m}.$$
a. Show that there is only one characteristic energy, i.e. only one
way to combine the three parameters to produce a quantity with
the dimensions of energy. (Section 16.1 found one way to perform

this combination, but I want you to prove that this is the only
way. Hint: Instead of the conventional base dimensions of length,
mass, and time, use the unconventional base dimensions of length,
mass, and energy.)
b. Find the characteristic time τ0 . What is its numerical value in
terms of femtoseconds?
c. Bonus: Show that, in the Bohr model, the period of the innermost
orbit is 2πτ0 . What is the period of the nth orbit?
d. Estimate the number of heartbeats made in a lifetime by a typical
person. If each Bohr model orbit corresponds to a heartbeat, how
many “lifetimes of hydrogen” pass in a second?
e. Write the time-dependent Schrödinger equation in terms of the
scaled variables
$$\tilde r = \frac{r}{a_0} \qquad \text{“lengths measured in atomic units”}$$
and
$$\tilde t = \frac{t}{\tau_0} \qquad \text{“time measured in atomic units”}.$$
Be sure to use the dimensionless wavefunction
$$\tilde\Psi(\tilde x,\tilde y,\tilde z,\tilde t) = (a_0)^{3/2}\,\Psi(x,y,z,t).$$
16.4 Scaling in the stadium problem

The “stadium” problem is often used as a model chaotic system, in
both classical and quantum mechanics. [See E.J. Heller, “Bound-State
Eigenfunctions of Classically Chaotic Hamiltonian Systems: Scars of
Periodic Orbits” Phys. Rev. Lett., 53, 1515–1518 (1984); S. Tomsovic
and E.J. Heller, “Long-Time Semiclassical Dynamics of Chaos: The
Stadium Billiard” Phys. Rev. E, 47, 282–299 (1993); E.J. Heller and
S. Tomsovic, “Postmodern Quantum Mechanics” Physics Today, 46
(7), 38–46 (July 1993).] This is a two-dimensional infinite well shaped
as a rectangle with semi-circular caps on opposite ends. Suppose one
stadium has the same shape but is exactly three times as large as
another. Show that in the larger stadium, wavepackets move just as
they do in the smaller stadium, but nine times more slowly. (The
initial wavepacket is of course also enlarged three times.) And show
that the energy eigenvalues of the larger stadium are one-ninth the
energy eigenvalues of the smaller stadium.
16.5 Variational principle for the harmonic oscillator

Find the best bound on the ground state energy of the one-dimensional
harmonic oscillator using a trial wavefunction of form
$$\psi(x) = \frac{A}{x^2+b^2},$$
where $A$ is determined through normalization and $b$ is an adjustable
parameter. Hint: Put the integrals within $\langle H\rangle$ into dimensionless form
so that they are independent of $A$ and $b$, and are “just numbers”: call
them $C_K$ and $C_P$. Solve the problem in terms of these numbers, then
evaluate the integrals only at the end.
16.6 Solving the Coulomb problem through operator factorization
Griffiths (section 4.2) finds the bound state energy eigenvalues for
the Coulomb problem using power series solutions of the Schrödinger
equation. Here is another way, based on operator factorization (ladder
operators). In atomic units, the radial wave equation is
$$\left[-\frac{1}{2}\frac{d^2}{dr^2}+\frac{\ell(\ell+1)}{2r^2}-\frac{1}{r}\right]u_{n,\ell}(r) \equiv h_\ell\,u_{n,\ell}(r) = \epsilon_{n,\ell}\,u_{n,\ell}(r)$$
where $u_{n,\ell}(r)$ is $r$ times the radial wavefunction. Introduce the operators
$$D_\pm^{(\ell)} \equiv \frac{d}{dr} \mp \frac{\ell}{r} \pm \frac{1}{\ell}.$$
a. Show that
$$D_+^{(\ell)}D_-^{(\ell)} = -2h_\ell - \frac{1}{\ell^2}$$
and that
$$D_-^{(\ell+1)}D_+^{(\ell+1)} = -2h_\ell - \frac{1}{(\ell+1)^2}.$$
b. Conclude that
$$h_{\ell+1}D_+^{(\ell+1)} = D_+^{(\ell+1)}h_\ell,$$
and apply this operator equation to $u_{n,\ell}(r)$ to show that
$$D_+^{(\ell+1)}u_{n,\ell}(r) \propto u_{n,\ell+1}(r)$$
and that $\epsilon_{n,\ell}$ is independent of $\ell$.
c. Argue that for every $\epsilon_{n,\ell} < 0$ there is a maximum $\ell$. (Hint: Examine the effective potential for radial motion.) Call this $\ell$ value
$\ell_n$.
d. Define $n = \ell_n + 1$ and show that
$$\epsilon_{n,\ell} = -\frac{1}{2n^2} \qquad \text{where } \ell = 0, \ldots, n-1.$$
(One can also continue this game to find the energy eigenfunctions.)
Chapter 17
Hydrogen
Recall the structure of states summarized in section 14.4.
17.1 The Stark effect
The unperturbed Hamiltonian, as represented in the position basis, is
$$\hat H^{(0)} \doteq -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r}. \qquad (17.1)$$
An electric field of magnitude $E$ is applied, and we name the direction
of the electric field the $z$ direction. The perturbing Hamiltonian, again
represented in the position basis, is
$$\hat H' \doteq eEz = eEr\cos\theta. \qquad (17.2)$$
Perturbation theory for the energy eigenvalues tells us that, provided
the unperturbed energy state $|n^{(0)}\rangle$ is non-degenerate,
$$E_n = E_n^{(0)} + \langle n^{(0)}|\hat H'|n^{(0)}\rangle + \sum_{m\neq n}\frac{|\langle m^{(0)}|\hat H'|n^{(0)}\rangle|^2}{E_n^{(0)}-E_m^{(0)}} + \cdots. \qquad (17.3)$$
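Let us apply perturbation theory to the ground state $|n,\ell,m\rangle = |1,0,0\rangle$.
This state is non-degenerate, so equation (17.3) applies without question. A moment’s thought will convince you that $\langle 1,0,0|\hat H'|1,0,0\rangle = eE\langle 1,0,0|\hat z|1,0,0\rangle = 0$, so the result is
$$\begin{aligned}
E_1 &= E_1^{(0)} + \sum_{n=2}^{\infty}\sum_{\ell=0}^{n-1}\sum_{m=-\ell}^{+\ell}\frac{|\langle n,\ell,m|\hat H'|1,0,0\rangle|^2}{E_1^{(0)}-E_n^{(0)}} + \cdots \\
&= -\mathrm{Ry} + \sum_{n=2}^{\infty}\sum_{\ell=0}^{n-1}\sum_{m=-\ell}^{+\ell}\frac{|eE\langle n,\ell,m|\hat z|1,0,0\rangle|^2}{-\mathrm{Ry}+\mathrm{Ry}/n^2} + \cdots \\
&= -\mathrm{Ry} - \frac{e^2E^2}{\mathrm{Ry}}\sum_{n=2}^{\infty}\sum_{\ell=0}^{n-1}\sum_{m=-\ell}^{+\ell}\frac{|\langle n,\ell,m|\hat z|1,0,0\rangle|^2}{1-1/n^2} + \cdots. \qquad (17.4)
\end{aligned}$$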
It would take a lot of work to evaluate the sum here, but one thing is
clear: that sum is just some quantity with the dimensions [length$^2$], and
independent of the field strength E. So when the electric field is turned
on, the ground state energy decreases from the zero-field energy of −Ry,
quadratically with E. Without even evaluating the sum, we get a lot of
important information.
Well, that went well. What if we apply perturbation theory to
the first excited state $|2,0,0\rangle$? My first thought is that, once again
$\langle 2,0,0|\hat H'|2,0,0\rangle = eE\langle 2,0,0|\hat z|2,0,0\rangle = 0$, so we’ll need to go on to second-order perturbation theory, and hence we’ll again find a quadratic Stark effect. The same argument holds for the excited state $|2,1,+1\rangle$, the state
$|7,5,-3\rangle$ and indeed for any energy state.
But that quick and easy argument is wrong. In making it we’ve forgotten that equation (17.3) applies only to non-degenerate energy states.$^1$
The first excited state is four-fold degenerate: the states $|2,0,0\rangle$, $|2,1,+1\rangle$,
$|2,1,0\rangle$, and $|2,1,-1\rangle$ all have the same energy, namely $-\mathrm{Ry}/2^2$. If we were
to try to evaluate the sum, we’d have to look at terms like
$$\frac{|\langle 2,1,0|\hat H'|2,0,0\rangle|^2}{E_{2,0,0}-E_{2,1,0}} = \frac{|\langle 2,1,0|\hat H'|2,0,0\rangle|^2}{0},$$
which equals infinity! In our attempt to “get a lot of important information
without actually evaluating the sum” we have missed the fact that the sum
diverges.
There’s only one escape from this trap. We can avoid infinities by
making sure that, whenever we have a zero in the denominator, we also
have a zero in the numerator. (Author’s note to self: Change chapter 11
to show this more rigorously.) That is, we can’t perform the perturbation
theory expansion using the basis
$$\{|2,0,0\rangle,\ |2,1,+1\rangle,\ |2,1,0\rangle,\ |2,1,-1\rangle\}$$
but we can perform it using some new basis, a linear combination of these
states, such that in this new basis the matrix elements of $\hat H'$ vanish except
on the diagonal. In other words, we must diagonalize the $4\times 4$ matrix of
$\hat H'$, and perform the perturbation expansion using that new basis rather
than the initial basis.
$^1$ This is a favorite trick question in physics oral exams.
The process, in other words, requires three stages: First find the matrix
of Ĥ 0 , then diagonalize it, and finally perform the expansion.
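Start by finding the $4\times 4$ matrix in the initial basis. Each matrix element
will have the form
$$\langle a|\hat H'|b\rangle = eE\langle a|\hat z|b\rangle = eE\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\int_0^{\infty}r^2\,dr\;\eta_a^*(r,\theta,\phi)\,r\cos\theta\,\eta_b(r,\theta,\phi) \qquad (17.5)$$
and they will be arrayed in a matrix like this:
$$\begin{array}{c|cccc}
 & \langle 200| & \langle 211| & \langle 210| & \langle 21\bar 1| \\ \hline
|200\rangle & & & & \\
|211\rangle & & & & \\
|210\rangle & & & & \\
|21\bar 1\rangle & & & &
\end{array}$$
(Here the $m$ value of $-1$ is shown as $\bar 1$ because otherwise it messes up the
spacing.)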
You might think that there are 16 matrix elements to calculate, that
each one is a triple integral, and that the best way to start off is by go-
ing to a bar and getting drunk. Courage! The operator is Hermitian, so
the subdiagonal elements are the complex conjugates of the corresponding
superdiagonal elements — there are only 10 matrix elements to calculate.
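The diagonal elements are all proportional to the expectation values of
$\hat z$, and these expectation values vanish for any of the traditional Coulomb
problem eigenstates $|n,\ell,m\rangle$.
$$\begin{array}{c|cccc}
 & \langle 200| & \langle 211| & \langle 210| & \langle 21\bar 1| \\ \hline
|200\rangle & 0 & & & \\
|211\rangle & & 0 & & \\
|210\rangle & & & 0 & \\
|21\bar 1\rangle & & & & 0
\end{array}$$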
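Remember what the wavefunctions look like:
$$\begin{aligned}
|2,0,0\rangle &\doteq R_{2,0}(r)\,Y_0^0(\theta,\phi) \sim 1 \\
|2,1,+1\rangle &\doteq R_{2,1}(r)\,Y_1^{+1}(\theta,\phi) \sim \sin\theta\,e^{+i\phi} \\
|2,1,0\rangle &\doteq R_{2,1}(r)\,Y_1^{0}(\theta,\phi) \sim \cos\theta \\
|2,1,-1\rangle &\doteq R_{2,1}(r)\,Y_1^{-1}(\theta,\phi) \sim \sin\theta\,e^{-i\phi}
\end{aligned}$$
where $\sim$ means that I’ve written down the angular dependence but not the
radial dependence.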
The leftmost matrix element on the top row is
$$\langle 2,1,+1|\hat H'|2,0,0\rangle = eE\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\int_0^{\infty}r^2\,dr\;R_{2,1}(r)\,Y_1^{+1*}(\theta,\phi)\,r\cos\theta\,R_{2,0}(r)\,Y_0^0(\theta,\phi).$$
There are three integrals here: $r$, $\theta$, and $\phi$. To do the $r$ integral I would
have to look up the expressions for $R_{2,1}(r)$ and $R_{2,0}(r)$, and then do a
gnarly integral. To do the $\theta$ integral I would have to look up the spherical
harmonics and then do an integral not quite so gnarly as the $r$ integral. But
to do the $\phi$ integral is straightforward: The function $Y_1^{+1*}(\theta,\phi)$ contributes
an $e^{-i\phi}$ and that’s it. The $\phi$ integral is
$$\int_0^{2\pi}d\phi\,e^{-i\phi}$$
and this integral is easy to do. . . it’s zero.
$$\begin{array}{c|cccc}
 & \langle 200| & \langle 211| & \langle 210| & \langle 21\bar 1| \\ \hline
|200\rangle & 0 & 0 & & \\
|211\rangle & 0 & 0 & & \\
|210\rangle & & & 0 & \\
|21\bar 1\rangle & & & & 0
\end{array}$$
It’s a good thing we put off doing the difficult r and θ integrals, because
if we had sweated away working them out, and then found that all we
did with those hard-won results was to multiply them by zero, then we’d
really need to visit that bar. When I was a child, my Protestant-work-ethic
parents told me that when faced with two tasks, I should always “be a man”
and do the difficult one first. I’m telling you to do the opposite, because
doing the easy task might make you realize that you don’t have to do the
difficult one.
If you look at the two other matrix elements on the superdiagonal,
$$\langle 2,1,0|\hat H'|2,1,+1\rangle \quad\text{and}\quad \langle 2,1,-1|\hat H'|2,1,0\rangle,$$
you’ll recognize instantly that for each of these two the $\phi$ integral is
$$\int_0^{2\pi}d\phi\,e^{+i\phi} = 0.$$
The same holds for $\langle 2,1,-1|\hat H'|2,0,0\rangle$, so the matrix is shaping up as
$$\begin{array}{c|cccc}
 & \langle 200| & \langle 211| & \langle 210| & \langle 21\bar 1| \\ \hline
|200\rangle & 0 & 0 & & 0 \\
|211\rangle & 0 & 0 & 0 & \\
|210\rangle & & 0 & 0 & 0 \\
|21\bar 1\rangle & 0 & & 0 & 0
\end{array}$$
and we have just two more elements to calculate.

The matrix element
$$\langle 2,1,-1|\hat H'|2,1,1\rangle \sim \int_0^{2\pi}d\phi\,e^{+2i\phi} = 0,$$
so the only hard integral we have to do is
$$\langle 2,1,0|\hat H'|2,0,0\rangle = eE\langle 2,1,0|\hat z|2,0,0\rangle.$$
The matrix element $\langle 2,1,0|\hat z|2,0,0\rangle$ is a length, and any length for the
Coulomb problem must turn out to be a dimensionless number times the
Bohr radius:
$$\langle 2,1,0|\hat H'|2,0,0\rangle = eE\langle 2,1,0|\hat z|2,0,0\rangle = eE\,(\text{number})\,a_0. \qquad (17.6)$$
The only thing that remains to do is to find that dimensionless number.
I ask you to do this yourself in problem 17.1 (part a). The answer is −3.
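If you want a numerical cross-check of that dimensionless number before doing the integrals by hand, the following sketch evaluates the radial and angular integrals with Simpson’s rule in atomic units ($a_0 = 1$). The explicit forms of $R_{2,0}$, $R_{2,1}$, $Y_0^0$, and $Y_1^0$ are the standard textbook expressions and are assumptions here (they are not written out in this section); the sign of the result depends on that phase convention. The cutoff at $r = 60$ and the helper name `simpson` are also choices made for this sketch.

```python
import math

# Hydrogen n = 2 radial functions in atomic units (a0 = 1). These are the
# standard textbook forms, assumed here rather than taken from this section.
def R20(r):
    return (1 / math.sqrt(2)) * (1 - r / 2) * math.exp(-r / 2)

def R21(r):
    return (1 / math.sqrt(24)) * r * math.exp(-r / 2)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Radial part: integral of r^2 * R21 * r * R20 from 0 to "infinity" (cut off at r = 60)
radial = simpson(lambda r: r**3 * R21(r) * R20(r), 0.0, 60.0)

# Angular part: Y10* cos(theta) Y00 over the sphere, with Y00 = 1/sqrt(4 pi)
# and Y10 = sqrt(3/(4 pi)) cos(theta); the phi integral just gives 2 pi.
angular = 2 * math.pi * simpson(
    lambda t: math.sqrt(3) / (4 * math.pi) * math.cos(t) ** 2 * math.sin(t),
    0.0, math.pi)

number = radial * angular   # <2,1,0| z |2,0,0> in units of a0
print(round(number, 6))
```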
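Thus the matrix is
$$-eEa_0 \times \begin{array}{c|cccc}
 & \langle 200| & \langle 211| & \langle 210| & \langle 21\bar 1| \\ \hline
|200\rangle & 0 & 0 & 3 & 0 \\
|211\rangle & 0 & 0 & 0 & 0 \\
|210\rangle & 3 & 0 & 0 & 0 \\
|21\bar 1\rangle & 0 & 0 & 0 & 0
\end{array}$$
and we are done with the first stage of our three-stage problem.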
You will be tempted to rush immediately into the problem of diagonal-
izing this matrix, but “fools rush in where angels fear to tread” (Alexander
Pope). If you think about it for an instant, you’ll realize that it will be a
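lot easier to do the problem if we rearrange the sequence of basis vectors
so that the matrix reads
$$-eEa_0 \times \begin{array}{c|cccc}
 & \langle 200| & \langle 210| & \langle 211| & \langle 21\bar 1| \\ \hline
|200\rangle & 0 & 3 & 0 & 0 \\
|210\rangle & 3 & 0 & 0 & 0 \\
|211\rangle & 0 & 0 & 0 & 0 \\
|21\bar 1\rangle & 0 & 0 & 0 & 0
\end{array}$$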
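Now we start the second stage, diagonalizing the matrix. First, find the
eigenvalues:
$$\begin{aligned}
0 &= \det|M-\lambda I| \\
&= \det\begin{pmatrix} -\lambda & 3 & 0 & 0 \\ 3 & -\lambda & 0 & 0 \\ 0 & 0 & -\lambda & 0 \\ 0 & 0 & 0 & -\lambda \end{pmatrix} \\
&= -\lambda\det\begin{pmatrix} -\lambda & 0 & 0 \\ 0 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix} - 3\det\begin{pmatrix} 3 & 0 & 0 \\ 0 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix} \\
&= \lambda^4 - 3^2\lambda^2 \\
&= \lambda^2(\lambda^2 - 3^2)
\end{aligned}$$
Normally, it’s hard to solve a quartic equation, but in this case we can just
read off the four solutions:
$$\lambda = +3,\ -3,\ 0,\ 0.$$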
The eigenvectors associated with $\lambda = 0$ and $\lambda = 0$ are clearly
$$|2,1,+1\rangle \quad\text{and}\quad |2,1,-1\rangle.$$
The eigenvector associated with $\lambda = 3$ will be a linear combination
$$x|2,0,0\rangle + y|2,1,0\rangle$$
where
$$\begin{pmatrix} 0 & 3 \\ 3 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 3\begin{pmatrix} x \\ y \end{pmatrix}.$$
Any $x = y$ is a solution, but I choose the normalized solution so that the
eigenvector with eigenvalue 3 is
$$\tfrac{1}{\sqrt 2}\left(|2,0,0\rangle + |2,1,0\rangle\right).$$
The parallel process for $\lambda = -3$ reveals the eigenvector
$$\tfrac{1}{\sqrt 2}\left(-|2,0,0\rangle + |2,1,0\rangle\right).$$
[[Why, you will ask, do I use this eigenvector rather than
$$\tfrac{1}{\sqrt 2}\left(|2,0,0\rangle - |2,1,0\rangle\right),$$
which is also an eigenvector but which I can write down with fewer pen
strokes? The answer is simple personal preference. The version I use is the
same one used for geometrical vectors in a plane, where the change
of basis is a 45° rotation. This helps me remember that, even in this
recondite and abstruse situation, the process of matrix diagonalization does
not change the physical situation, it merely changes the basis vectors we
select to help us describe the physical situation.]]
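To summarize, in the basis
$$\left\{\tfrac{1}{\sqrt 2}\left(|2,0,0\rangle + |2,1,0\rangle\right),\ \tfrac{1}{\sqrt 2}\left(-|2,0,0\rangle + |2,1,0\rangle\right),\ |2,1,+1\rangle,\ |2,1,-1\rangle\right\}$$
the matrix representation of the operator $\hat H'$ is
$$-eEa_0\begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & -3 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$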
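The diagonalization above is easy to check by direct multiplication. The following minimal sketch works in units of $-eEa_0$ and uses the reordered basis $\{|200\rangle, |210\rangle, |211\rangle, |21\bar 1\rangle\}$; the helper names `matvec` and `is_eigenvector` are illustrative.

```python
import math

# Matrix of H' in units of -e E a0, in the reordered basis
# {|200>, |210>, |211>, |21-1>}
M = [[0, 3, 0, 0],
     [3, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]

def matvec(A, v):
    """Multiply matrix A (list of rows) by column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenvector(A, v, lam, tol=1e-12):
    """True if A v equals lam * v, component by component."""
    return all(abs(y - lam * x) < tol for y, x in zip(matvec(A, v), v))

s = 1 / math.sqrt(2)
candidates = {
    +3: [s, s, 0, 0],     # (|200> + |210>)/sqrt(2)
    -3: [-s, s, 0, 0],    # (-|200> + |210>)/sqrt(2)
}
candidates_zero = [[0, 0, 1, 0], [0, 0, 0, 1]]   # |211> and |21-1>

for lam, v in candidates.items():
    print(lam, is_eigenvector(M, v, lam))
for v in candidates_zero:
    print(0, is_eigenvector(M, v, 0))
```

Remember that the physical first-order energy shifts are these eigenvalues multiplied by $-eEa_0$.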
And now, for the final stage, executing perturbation theory starting
from this new basis, which I’ll call $\{|a\rangle, |b\rangle, |c\rangle, |d\rangle\}$. The energy value associated with $|a\rangle$ is
$$E_2 = E_2^{(0)} + \langle a|\hat H'|a\rangle + \sum_{m\neq a}\frac{|\langle a|\hat H'|m\rangle|^2}{E_a^{(0)}-E_m^{(0)}} + \cdots$$
The first correction we already know: it is $\langle a|\hat H'|a\rangle = -3eEa_0$. The second
correction (the sum) contains terms like
$$\frac{|\langle a|\hat H'|b\rangle|^2}{E_a^{(0)}-E_b^{(0)}} = \frac{0}{0}$$
and
$$\frac{|\langle a|\hat H'|c\rangle|^2}{E_a^{(0)}-E_c^{(0)}} = \frac{0}{0}$$
and
$$\frac{|\langle a|\hat H'|1,0,0\rangle|^2}{E_a^{(0)}-E_{1,0,0}^{(0)}} = \frac{\text{something}}{\frac{3}{4}\mathrm{Ry}}$$
but it contains no terms where a non-zero number is divided by zero. I will follow
the usual rule-of-thumb for perturbation theory, which is to stop at the first
non-zero correction and ignore the sum altogether.
Similarly, the leading energy correction associated with $|b\rangle$ is $\langle b|\hat H'|b\rangle = 3eEa_0$.
The first-order corrections for $|c\rangle$ and $|d\rangle$ vanish, so these states will
be subject to a quadratic Stark effect, just like the ground state. I could
work them out if I really needed to, but instead I will quote and follow the
age-old dictum (modified from “The Lay of the Last Minstrel” by Walter
Scott):
Breathes there the man, with soul so dead,

Who never to himself hath said
“To hell with it, I’m going to bed.”
17.1 The Stark effect

a. Find the numerical factor in equation (17.6).
b. The “good” energy eigenstates for the $n = 2$ Stark effect (the
states that one should use as unperturbed states in perturbation
theory) are
$$|2,1,+1\rangle, \qquad |2,1,-1\rangle, \qquad \tfrac{1}{\sqrt 2}\left(+|2,0,0\rangle + |2,1,0\rangle\right), \qquad \tfrac{1}{\sqrt 2}\left(-|2,0,0\rangle + |2,1,0\rangle\right).$$
Find the expectation value for position $\langle\vec r\rangle$ in each of these states.
c. The expectation value of position is zero in state $|2,0,0\rangle$ and zero
in state $|2,1,0\rangle$, yet it is non-zero in state $(|2,0,0\rangle + |2,1,0\rangle)/\sqrt 2$.
This might seem like a contradiction: After all, if the average
position vanishes for two probability densities, then it vanishes for
the sum of the two. What great principle of quantum mechanics
allows this fact to escape the curse of contradiction? (Answer in
one sentence.)
d. (Bonus.) Describe these four states qualitatively and explain why
they are the “good” states for use in the Stark effect.
e. Consider the Stark effect for the n = 3 states of hydrogen. There

are initially nine degenerate states. Construct a 9×9 matrix repre-
senting the perturbing Hamiltonian. (Hint: Before actually work-
ing any integrals, use a selection rule to determine the sequence
of basis elements that will produce a block diagonal matrix.)
f. Find the eigenvalues and degeneracies.
17.2 Bonus
In the previous problem, on the Stark effect, we had to calculate a lot
of matrix elements of the form
$$\int_0^{\infty} r^2\,R_{n,\ell}(r)\,r\,R_{n',\ell'}(r)\,dr.$$
This was possible but (to put it mildly) tedious. Can you think of some
easy way to do integrals of this form? Could the operator factorization
technique (problem 16.6) give us any assistance? Can you derive any
inspiration from our proof of Kramers’ relation (problem below)?
17.3 Kramers’ relation
Kramers’ relation states that for any energy eigenstate $\eta_{n\ell m}(\vec r)$ of the
Coulomb problem, the expected values of $r^s$, $r^{s-1}$, and $r^{s-2}$ are related
through
$$\frac{s+1}{n^2}\langle r^s\rangle - (2s+1)\,a_0\,\langle r^{s-1}\rangle + \frac{s}{4}\left[(2\ell+1)^2 - s^2\right]a_0^2\,\langle r^{s-2}\rangle = 0.$$
a. Prove Kramers’ relation. Hints: Use atomic units. Start with the
radial equation in form
$$u''(r) = \left[\frac{\ell(\ell+1)}{r^2} - \frac{2}{r} + \frac{1}{n^2}\right]u(r),$$
and use it to express
$$\int_0^{\infty} u(r)\,r^s\,u''(r)\,dr$$
in terms of $\langle r^s\rangle$, $\langle r^{s-1}\rangle$, and $\langle r^{s-2}\rangle$. Then perform that integral by
parts to find an integral involving $u'(r)$ as the highest derivative.
Show that
$$\int_0^{\infty} u(r)\,r^s\,u'(r)\,dr = -\frac{s}{2}\langle r^{s-1}\rangle$$
and that
$$\int_0^{\infty} u'(r)\,r^s\,u'(r)\,dr = -\frac{2}{s+1}\int_0^{\infty} u''(r)\,r^{s+1}\,u'(r)\,dr.$$
b. Use Kramers’ relation with $s = 0$, $s = 1$, $s = 2$, and $s = 3$ to
find formulas for $\langle r^{-1}\rangle$, $\langle r\rangle$, $\langle r^2\rangle$, and $\langle r^3\rangle$. Note that you could
continue indefinitely to find $\langle r^s\rangle$ for any positive power.
c. However, you can’t use this chain to work downward. Try it for
$s = -1$, and show that you get a relation between $\langle r^{-2}\rangle$ and $\langle r^{-3}\rangle$,
but not either quantity by itself.
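One way to check the formulas you find in part (b) is to run Kramers’ relation as a recursion with exact rational arithmetic. The sketch below is only a numerical check under stated assumptions, not a proof: it works in atomic units ($a_0 = 1$), starts from $\langle r^0\rangle = 1$, and the function name `kramers_moments` is illustrative.

```python
from fractions import Fraction

def kramers_moments(n, l, s_max=3):
    """Return {power: <r^power>} in atomic units (a0 = 1) using Kramers' relation.

    Starts from <r^0> = 1 and applies the relation for s = 0, 1, ..., s_max.
    """
    mom = {0: Fraction(1)}
    # s = 0: (1/n^2)<r^0> - <r^-1> = 0 (the third term carries a factor s = 0)
    mom[-1] = Fraction(1, n * n)
    for s in range(1, s_max + 1):
        # (s+1)/n^2 <r^s> = (2s+1)<r^(s-1)> - (s/4)[(2l+1)^2 - s^2]<r^(s-2)>
        rhs = (2 * s + 1) * mom[s - 1] \
            - Fraction(s, 4) * ((2 * l + 1) ** 2 - s * s) * mom[s - 2]
        mom[s] = rhs * Fraction(n * n, s + 1)
    return mom

ground = kramers_moments(1, 0)
print(ground[-1], ground[1], ground[2], ground[3])
```

For the ground state the output can be compared with direct integration of $r^k$ against $4r^2e^{-2r}$, which is easy to do by hand.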
Chapter 18
Helium
The helium problem is a “three-body problem”. This problem has never

been solved exactly even in classical mechanics, and it is hopeless to expect
an exact solution in the richer and more intricate regime of quantum me-
chanics. Does this mean we should give up? Of course not. Most physics
problems cannot be solved exactly, but some can be solved approximately
well enough to compare theory to experiment, which is itself imperfect. (In
the same way, most problems you have with your parents, or with your
boy/girlfriend, cannot be solved perfectly. But they can often be solved
well enough to continue your relationship.)
18.1 Ground state energy of helium
The role of theory
Jacov Ilich Frenkel (also Yakov Ilich Frenkel or Iakov Ilich Frenkel; 1894–
1952) was a prolific physicist. Among other things he coined the term
“phonon”. In a review article on the theory of metals (quoted by M.E.
Fisher in “The Nature of Critical Points”, Boulder lectures, 1965) he said:
The more complicated the system considered, the more simplified must
its theoretical description be. One cannot demand that a theoretical
description of a complicated atom, and all the more of a molecule or a
crystal, have the same degree of accuracy as of the theory of the simplest
hydrogen atom. Incidentally, such a requirement is not only impossible
to fulfill but also essentially useless. . . . An exact calculation of the con-
stants characterizing the simplest physical system has essential signifi-
cance as a test on the correctness of the basic principles of the theory.
However, once it passes this test brilliantly there is no sense in subject-

ing it to further tests as applied to more complicated systems. The most
ideal theory cannot pass such tests, owing to the practically unsurmount-
able mathematical difficulties unavoidably encountered in applications to
complicated systems. In this case all that is demanded of the theory is a
correct interpretation of the general character of the quantities and laws
pertaining to such a system. The theoretical physicist is in this respect
like a cartoonist, who must depict the original, not in all details like
a photographic camera, but simplify and schematize it in a way as to
disclose and emphasize the most characteristic features. Photographic
accuracy can and should be required only of the description of the sim-
plest system. A good theory of complicated systems should represent
only a good “caricature” of these systems, exaggerating the properties
that are most difficult, and purposely ignoring all the remaining inessen-
tial properties.
Which case is the ground state of He?
1) Fundamental test of symmetrization postulate.

2) Test to see whether QM breaks down for complex systems (An-
thony J. Leggett).
3) Refinements can involve new physical ideas.
4) Physical effects other than ground state energy.
Experiment
Eg = −78.975 eV.
Theory
(Summarizing Griffiths 5.2.1 and 7.2.) If we take account of the Coulomb

forces, but ignore things like the finite size of the nucleus, nuclear mo-
tion, relativistic motion of the electron, spin-orbit effects, and so forth, the
Hamiltonian for two electrons and one nucleus is
$$\hat H = \hat H_A + \hat H_B + \hat U_{AB} \qquad (18.1)$$
where
$$\hat U_{AB} = \frac{e^2}{4\pi\epsilon_0}\frac{1}{|\vec r_A - \vec r_B|}. \qquad (18.2)$$
The ground state wavefunction for H is
$$\eta_{100}(\vec r) = \frac{1}{\sqrt{\pi}\,a_0^{3/2}}\,e^{-r/a_0}. \qquad (18.3)$$
But if the nucleus had charge $+Ze$, this would be
$$\eta_{100}(\vec r) = \frac{Z^{3/2}}{\sqrt{\pi}\,a_0^{3/2}}\,e^{-Zr/a_0}. \qquad (18.4)$$
So the $\hat U_{AB} = 0$ ground state is
$$\eta_{100}(\vec r_A)\,\eta_{100}(\vec r_B) = \frac{Z^3}{\pi a_0^3}\,e^{-Z(r_A+r_B)/a_0} \quad\text{with } Z = 2. \qquad (18.5)$$
This state gives a ground state energy of $E_g = -8\,\mathrm{Ry} = -109$ eV.
Turning on the electron-electron repulsion, perturbation theory finds
$\langle\hat U_{AB}\rangle$ and jacks up $E_g$ to $-75$ eV.
The variational method uses the same wavefunction as above, but considers $Z$ not as 2 but as an adjustable parameter. Interpretation: “shielding”, so expect $1 < Z_{\min} < 2$. And in fact minimizing $\langle H\rangle$ over this
class of trial wavefunctions gives $Z_{\min} = 1.69$ and $E_g = -77.5$ eV. (Sure
enough, an overestimate.) Griffiths stops here and suggests that the rest of
the work is humdrum.
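The three numbers quoted above can be reproduced in a few lines. This sketch assumes the standard closed-form result for the product trial state, $\langle H\rangle = (2Z^2 - \tfrac{27}{4}Z)\,\mathrm{Ry}$, which is not derived in this section, and a Rydberg value of 13.6057 eV. The function name `helium_energy_ry` is illustrative.

```python
RY_EV = 13.6057   # Rydberg energy in eV (assumed value)

def helium_energy_ry(Z):
    """<H> in Rydbergs for the product trial state with effective charge Z.

    Assumes the standard result <H> = (2 Z^2 - 27 Z / 4) Ry, which combines
    kinetic energy 2 Z^2, nuclear attraction -8 Z, and repulsion (5/4) Z.
    """
    return 2 * Z**2 - 27 * Z / 4

Z_min = 27 / 16                               # from d<H>/dZ = 4 Z - 27/4 = 0
E_unperturbed = -8 * RY_EV                    # U_AB ignored, Z = 2
E_first_order = helium_energy_ry(2) * RY_EV   # first-order perturbation theory
E_variational = helium_energy_ry(Z_min) * RY_EV

print(f"Z_min = {Z_min:.4f}")
print(f"E_unperturbed = {E_unperturbed:.1f} eV")
print(f"E_first_order = {E_first_order:.1f} eV")
print(f"E_variational = {E_variational:.1f} eV")
```

Note that evaluating the same function at $Z = 2$ gives the first-order perturbation result, so the variational estimate can only equal or improve on it.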
Further theory
Review: A. Hibbert, Rept. Prog. Phys. 38 (1975) 1222–1225.

Hylleraas (1929): Trial wavefunction of form (atomic units)
$$\psi(\vec r_A, \vec r_B) = e^{-Z(r_A+r_B)}\sum c_{nlm}\,\bigl(Z(r_A+r_B)\bigr)^n\,\bigl(Z(r_A-r_B)\bigr)^{2l}\,\bigl(Z|\vec r_A-\vec r_B|\bigr)^m.$$
[I won’t go into all the reasons why he picked this trial wavefunction,
but. . . ask why only even powers $2l$.] Using $Z$ and six terms in the sum as
variational parameters, he got an energy good to 2 parts in 10,000.
This is a good energy. Is there any point in doing better? Yes. Although
it gives you a good energy, it gives you a poor wavefunction: Think of
a d = 2 landscape with a hidden valley — e.g. a crater, an absolute
minimum. The d = 2 landscape represents two variational parameters —
by coincidence, the exact wavefunction has the form that you guessed. If
you tried just one variational parameter, you’d be walking a line in this
landscape. The line could be quite far from the valley bottom while giving
very good elevation estimates for the valley bottom, because the valley is
flat at the bottom. [Sketch.]
In fact, you can show that no wavefunction of this form, no matter how
many terms you pick, can satisfy the Schrödinger Equation — even if you
picked an infinite number of terms, you’d never hit the wavefunction right
on!
Is there any reason to get the wavefunction right? Yes! For example if
you wanted to calculate Stark or Zeeman effect, or spin-orbit, or whatever,
you’d need those wavefunctions for doing perturbation theory!
Kinoshita (1959): One of the “great fiddlers of physics”. Trial wave-
function of form (atomic units)
2l m
−Z(rA +rB )
X
n rA − rB |rA − rB |
ψ(rA , rB ) = e cnlm (Z(rA +rB )) Z .
|rA − rB | rA + rB
He showed that this could satisfy the Schrödinger Equation exactly if sum
were infinite. Used 80 terms for accuracy 1 part in 100,000.
Pekeris (1962): A different trial wavefunction guaranteed to get the
correct form when both electrons are far from nucleus. Used 1078 terms,
added fine structure and hyperfine structure, got accuracy 1 part in $10^9$.
Schwartz (1962): Added terms like $[Z(r_A+r_B)]^{n/2}$ . . . not smooth.
Got better energies with 189 terms!
Frankowski and Pekeris (1966): Introduced terms like $\ln^k(Z(r_A+r_B))$ . . . not smooth. 246 terms, accuracy 1 part in $10^{12}$.
Kato: (See Drake, page 155.) Looked at condition for two electrons
close, both far from nucleus. In this case it’s like H atom, wavefunction
must have cusp. Allow electrons to show this cusp.
State of art: Gordon W.F. Drake, ed. Atomic, Molecular, and Optical
Physics Handbook page 163. [Reference QC173.A827 1996]
New frontiers: experiment. S.D. Bergeson, et al., “Measurement of the
He ground state Lamb shift”, Phys. Rev. Lett. 80 (1998) 3475–3478.
New frontiers: theory. S.P. Goldman, “Uncoupling correlated calculations in atomic physics: Very high accuracy and ease,” Phys. Rev. A 57
(1998) 677–680. 8066 terms, 1 part in $10^{18}$.
New frontiers: Lithium, metallic Hydrogen.
Sometimes people get the impression that variational calculations are

dry and mechanical: simply add more parameters to your trial wavefunc-
tion, and your results will improve (or at least, they can’t get worse). The
history of the Helium ground state calculation shows how wrong this im-
pression is. Progress is made by deep thinking about the character of the
true wavefunction (What is the character when both electrons are far from
the nucleus and far from each other? What is the character when both elec-
trons are far from the nucleus and close to each other?) and then choosing
trial wavefunctions that can display (or at least mimic) those characteristics
of the true wavefunction.
Chapter 19
Atoms
19.1 Addition of angular momenta
We often have occasion to add angular momenta. For example, an electron

might have orbital angular momentum with respect to the nucleus, but also
spin angular momentum. What is the total angular momentum?
Or again, there might be two electrons in an atom, each with orbital
angular momentum. What is the total orbital angular momentum of the
two electrons?
Or again, there might be an electron with orbital angular momentum
relative to the nucleus, but the nucleus moves relative to some origin. What
is the total angular momentum of the electron relative to the origin?
This section demonstrates how to perform such additions through a
specific example, namely adding angular momentum A with $\ell_A = 1$ to
angular momentum B with $\ell_B = 2$. (For the moment, assume that these
angular momenta belong to non-identical particles. If the two particles are
identical, as in the second example above, then there is an additional
requirement that the sum wavefunction be symmetric or antisymmetric
under swapping/interchange/exchange.)
First, recall the states for a single angular momentum: There are no
states with values of $\hat L_x$, $\hat L_y$, $\hat L_z$, and $\hat L^2 = \hat L_x^2 + \hat L_y^2 + \hat L_z^2$ simultaneously,
reflecting such facts as that $\hat L_x$ and $\hat L_z$ do not commute. However, because
$\hat L^2$ and $\hat L_z$ do commute, there are states (in fact, a basis of states) that
have values of $\hat L^2$ and $\hat L_z$ simultaneously.
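For angular momentum A, with $\ell_A = 1$, these basis states are
$$|1,+1\rangle, \qquad |1,0\rangle, \qquad |1,-1\rangle$$
where
$$\hat L_A^2|\ell_A, m_A\rangle = \hbar^2\ell_A(\ell_A+1)|\ell_A, m_A\rangle = \hbar^2(1)(2)|\ell_A, m_A\rangle$$
and
$$\hat L_{A,z}|\ell_A, m_A\rangle = \hbar m_A|\ell_A, m_A\rangle.$$
These states are called the “$\ell_A = 1$ triplet”.
For angular momentum B, with $\ell_B = 2$, these basis states are
$$|2,+2\rangle, \qquad |2,+1\rangle, \qquad |2,0\rangle, \qquad |2,-1\rangle, \qquad |2,-2\rangle$$
where
$$\hat L_B^2|\ell_B, m_B\rangle = \hbar^2\ell_B(\ell_B+1)|\ell_B, m_B\rangle = \hbar^2(2)(3)|\ell_B, m_B\rangle$$
and
$$\hat L_{B,z}|\ell_B, m_B\rangle = \hbar m_B|\ell_B, m_B\rangle.$$
These states are called the “$\ell_B = 2$ quintet”.
Now, what sort of states can we have for the sum of these two angular
momenta? The relevant total angular momentum operator is
$$\hat{\vec J} = \hat{\vec L}_A + \hat{\vec L}_B$$
so
$$\hat J_z = \hat L_{A,z} + \hat L_{B,z}$$
but
$$\hat J^2 \neq \hat L_A^2 + \hat L_B^2.$$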
We can ask for states with values of $\hat J^2$ and $\hat J_z$ simultaneously, but such
states will not necessarily have values of $\hat L_{A,z}$ and $\hat L_{B,z}$, because $\hat J^2$ and
$\hat L_{A,z}$ do not commute (see problem XXX). For the same reason, we can ask
for states with values of $\hat L_{A,z}$ and $\hat L_{B,z}$ simultaneously, but such states will
not necessarily have values of $\hat J^2$.
For most problems, there are two bases that are natural and useful.
The first consists of states like $|\ell_A, m_A\rangle|\ell_B, m_B\rangle$, simple product states
of the bases we discussed above. The second basis consists of states like
$|j, m_J\rangle$. To find how these are connected, we list states in the first basis
according to their associated$^1$ value of $m_J$:
$$\begin{array}{lll|rrr}
\multicolumn{3}{c|}{|\ell_A, m_A\rangle|\ell_B, m_B\rangle} & \multicolumn{3}{c}{m_J} \\ \hline
|1,+1\rangle|2,+2\rangle & & & +3 & & \\
|1,+1\rangle|2,+1\rangle & |1,0\rangle|2,+2\rangle & & +2 & +2 & \\
|1,+1\rangle|2,0\rangle & |1,0\rangle|2,+1\rangle & |1,-1\rangle|2,+2\rangle & +1 & +1 & +1 \\
|1,+1\rangle|2,-1\rangle & |1,0\rangle|2,0\rangle & |1,-1\rangle|2,+1\rangle & 0 & 0 & 0 \\
|1,+1\rangle|2,-2\rangle & |1,0\rangle|2,-1\rangle & |1,-1\rangle|2,0\rangle & -1 & -1 & -1 \\
 & |1,0\rangle|2,-2\rangle & |1,-1\rangle|2,-1\rangle & & -2 & -2 \\
 & & |1,-1\rangle|2,-2\rangle & & & -3
\end{array}$$
These values of $m_J$ fall into a natural structure:
There is a heptet of seven states with $m_J = +3, +2, +1, 0, -1, -2, -3$. This heptet must be associated with $j = 3$.
There is a quintet of five states with $m_J = +2, +1, 0, -1, -2$. This quintet must be associated with $j = 2$.
There is a triplet of three states with $m_J = +1, 0, -1$. This triplet must be associated with $j = 1$.
So now we know what the values of $j$ are! If you think about this problem
for general values of $\ell_A$ and $\ell_B$, you will see immediately that the values
of $j$ run from $\ell_A + \ell_B$ down to $|\ell_A - \ell_B|$ in integer steps. Often, this is all that’s needed.$^2$ But
sometimes you need more. Sometimes you need to express total-angular-momentum
states like $|j, m_J\rangle$ in terms of individual-angular-momentum
states like $|\ell_A, m_A\rangle|\ell_B, m_B\rangle$.
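The counting argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `allowed_j` is hypothetical (it does not come from the text), and the sketch handles integer angular momenta only. It tallies how many product states share each value of $m_J = m_A + m_B$, then repeatedly peels off the multiplet whose top member has the largest remaining $m_J$.

```python
from collections import Counter

def allowed_j(l_a, l_b):
    """Peel multiplets off the tally of m_J = m_A + m_B values.

    Returns the values of j, largest first. Integer l_a and l_b only.
    (Hypothetical helper written for this illustration.)
    """
    counts = Counter(m_a + m_b
                     for m_a in range(-l_a, l_a + 1)
                     for m_b in range(-l_b, l_b + 1))
    js = []
    while sum(counts.values()) > 0:
        # The largest m_J still present is the top of the next multiplet.
        j = max(m for m, c in counts.items() if c > 0)
        js.append(j)
        # Remove that multiplet's 2j + 1 states, one for each m_J.
        for m in range(-j, j + 1):
            counts[m] -= 1
    return js

print(allowed_j(1, 2))
```

For $\ell_A = 1$, $\ell_B = 2$ the tally is 1, 2, 3, 3, 3, 2, 1 for $m_J = +3, \ldots, -3$, so the peeling produces the heptet, quintet, and triplet found above.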
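$^1$ While the state $|\ell_A, m_A\rangle|\ell_B, m_B\rangle$ doesn’t have a value of $j$, it does have a value of
$m_J$, namely $m_J = m_A + m_B$.
$^2$ In particular, many GRE questions that appear on their face to be deep and difficult
only go this far.
The basic set-up of our problem comes through the table below:
$$\begin{array}{lll|lll}
\multicolumn{3}{c|}{|\ell_A, m_A\rangle|\ell_B, m_B\rangle} & \multicolumn{3}{c}{|j, m_J\rangle} \\ \hline
|1,+1\rangle_A|2,+2\rangle_B & & & |3,+3\rangle_J & & \\
|1,+1\rangle_A|2,+1\rangle_B & |1,0\rangle_A|2,+2\rangle_B & & |3,+2\rangle_J & |2,+2\rangle_J & \\
|1,+1\rangle_A|2,0\rangle_B & |1,0\rangle_A|2,+1\rangle_B & |1,-1\rangle_A|2,+2\rangle_B & |3,+1\rangle_J & |2,+1\rangle_J & |1,+1\rangle_J \\
|1,+1\rangle_A|2,-1\rangle_B & |1,0\rangle_A|2,0\rangle_B & |1,-1\rangle_A|2,+1\rangle_B & |3,0\rangle_J & |2,0\rangle_J & |1,0\rangle_J \\
|1,+1\rangle_A|2,-2\rangle_B & |1,0\rangle_A|2,-1\rangle_B & |1,-1\rangle_A|2,0\rangle_B & |3,-1\rangle_J & |2,-1\rangle_J & |1,-1\rangle_J \\
|1,0\rangle_A|2,-2\rangle_B & |1,-1\rangle_A|2,-1\rangle_B & & |3,-2\rangle_J & |2,-2\rangle_J & \\
|1,-1\rangle_A|2,-2\rangle_B & & & |3,-3\rangle_J & &
\end{array}$$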
Note that we have labeled states like
$|\ell_A, m_A\rangle|\ell_B, m_B\rangle$ as $|\ell_A, m_A\rangle_A|\ell_B, m_B\rangle_B$
and states like
$|j, m_J\rangle$ as $|j, m_J\rangle_J$.
Otherwise we might confuse the state $|2,+1\rangle_B$ on the left side of the second
row with the completely different state $|2,+1\rangle_J$ on the right side of
the third row. (Some authors solve this notation vexation by writing the
states of total angular momentum as $|j, m_J, \ell_A, \ell_B\rangle$, taking advantage of
the fact that $\ell_A$ and $\ell_B$ are the same for all states on the right, and for
all states on the left, for that matter. This means every state on the right
would be written as $|j, m_J, 1, 2\rangle$. For me, it rapidly grows frustrating to
tack a “1,2” on to the end of every such state.)
The second line of this table means that the state $|3,+2\rangle_J$ is some linear
combination of the states $|1,+1\rangle_A|2,+1\rangle_B$ and $|1,0\rangle_A|2,+2\rangle_B$. Similarly
for the state $|2,+2\rangle_J$. [[This is the meaning of the assertion made earlier
that in the state $|3,+2\rangle_J$ there is no value for $m_A$: The state $|3,+2\rangle_J$
is a superposition of a state with $m_A = +1$ and a state with $m_A = 0$,
but the state $|3,+2\rangle_J$ itself has no value for $m_A$.]] Similarly, the state
$|1,+1\rangle_A|2,+1\rangle_B$ is a linear combination of states $|3,+2\rangle_J$ and $|2,+2\rangle_J$.
But what linear combination? We start with the first line of the table.
Because there’s only one state on each side, we write
$$|3,+3\rangle_J = |1,+1\rangle_A|2,+2\rangle_B. \tag{19.1}$$
(We could have inserted an overall phase factor of modulus one, such
as $|3,+3\rangle_J = -|1,+1\rangle_A|2,+2\rangle_B$ or $|3,+3\rangle_J = i|1,+1\rangle_A|2,+2\rangle_B$ or even
$|3,+3\rangle_J = \sqrt{-i}\,|1,+1\rangle_A|2,+2\rangle_B$. But this insertion would have only made
our lives difficult for no reason.)
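Now, to find an expression for $|3,+2\rangle_J$, apply the lowering operator
$$\hat J_- = \hat L_{A,-} + \hat L_{B,-}$$
to both sides of equation (19.1). Remembering that
$$\hat J_-|j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|j,m-1\rangle,$$
this lowering gives
$$\begin{aligned}
\hat J_-|3,+3\rangle_J &= \left[\hat L_{A,-}|1,+1\rangle_A\right]|2,+2\rangle_B + |1,+1\rangle_A\left[\hat L_{B,-}|2,+2\rangle_B\right] \\
\hbar\sqrt{3(4)-3(2)}\;|3,+2\rangle_J &= \left[\hbar\sqrt{1(2)-1(0)}\;|1,0\rangle_A\right]|2,+2\rangle_B + |1,+1\rangle_A\left[\hbar\sqrt{2(3)-2(1)}\;|2,+1\rangle_B\right] \\
\sqrt{6}\;|3,+2\rangle_J &= \left[\sqrt{2}\;|1,0\rangle_A\right]|2,+2\rangle_B + |1,+1\rangle_A\left[\sqrt{4}\;|2,+1\rangle_B\right] \\
|3,+2\rangle_J &= \sqrt{\tfrac{1}{3}}\;|1,0\rangle_A|2,+2\rangle_B + \sqrt{\tfrac{2}{3}}\;|1,+1\rangle_A|2,+1\rangle_B.
\end{aligned} \tag{19.2}$$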
Before, we knew only that if the system were in state $|3,+2\rangle_J$ and we
measured $m_A$, the result might be 0 or it might be +1. Now we know
that the probability of obtaining the result 0 is $\tfrac{1}{3}$, while the probability of
obtaining the result +1 is $\tfrac{2}{3}$.
You can continue this process: lower $|3,+2\rangle_J$ to find an expression for
$|3,+1\rangle_J$, lower $|3,+1\rangle_J$ to find an expression for $|3,0\rangle_J$, and so forth. When
you get to $|3,-2\rangle_J$, you should lower it to find
$$|3,-3\rangle_J = |1,-1\rangle_A|2,-2\rangle_B,$$
and if that’s not the result you get, then you made an error somewhere in
this long chain.
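The lowering chain is mechanical enough to hand to a computer. The following is a minimal sketch, not a definitive implementation: it assumes units with $\hbar = 1$, represents each state as a dictionary mapping the pair $(m_A, m_B)$ to its coefficient, and uses a hypothetical helper named `lower`. It starts from equation (19.1) and lowers all the way down the $j = 3$ heptet.

```python
import math

def lower(state, l_a, l_b, j, m):
    """Apply J_- = L_{A,-} + L_{B,-} to |j, m> and return |j, m-1>.

    `state` maps (m_a, m_b) -> coefficient of |l_a, m_a>_A |l_b, m_b>_B.
    Units with hbar = 1. (Hypothetical helper for this illustration.)
    """
    out = {}
    for (m_a, m_b), c in state.items():
        if m_a > -l_a:  # L_{A,-} acts on the A factor
            key = (m_a - 1, m_b)
            out[key] = out.get(key, 0.0) + c * math.sqrt(l_a * (l_a + 1) - m_a * (m_a - 1))
        if m_b > -l_b:  # L_{B,-} acts on the B factor
            key = (m_a, m_b - 1)
            out[key] = out.get(key, 0.0) + c * math.sqrt(l_b * (l_b + 1) - m_b * (m_b - 1))
    # Divide by the factor that J_- produces when acting on |j, m>.
    factor = math.sqrt(j * (j + 1) - m * (m - 1))
    return {key: value / factor for key, value in out.items()}

l_a, l_b, j = 1, 2, 3
heptet = {3: {(1, 2): 1.0}}            # equation (19.1)
for m in range(3, -3, -1):             # m = 3, 2, ..., -2
    heptet[m - 1] = lower(heptet[m], l_a, l_b, j, m)

print(heptet[2])    # coefficients of |3,+2>_J
print(heptet[-3])   # should be the single product state |1,-1>_A |2,-2>_B
```

The squared coefficients in `heptet[2]` are the probabilities 1/3 and 2/3 quoted above, and the final member of the chain provides the consistency check recommended in the text.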
Now we know how to find expressions for the entire heptet $|3,m\rangle_J$, with
$m$ ranging from +3 to −3. But what about the quintet $|2,m\rangle_J$, with $m$
ranging from +2 to −2? If we knew the top member $|2,+2\rangle_J$, we could
lower away to find the rest of the quintet. But how do we find this starting
point?
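The trick to use here is orthogonality. We know that
$$|2,+2\rangle_J = \alpha\,|1,0\rangle_A|2,+2\rangle_B + \beta\,|1,+1\rangle_A|2,+1\rangle_B,$$
where $\alpha$ and $\beta$ are to be determined, and that
$$|3,+2\rangle_J = \sqrt{\tfrac{1}{3}}\;|1,0\rangle_A|2,+2\rangle_B + \sqrt{\tfrac{2}{3}}\;|1,+1\rangle_A|2,+1\rangle_B$$
and that
$$\langle 3,+2|2,+2\rangle_J = 0.$$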
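We use the orthogonality to find the expansion coefficients $\alpha$ and $\beta$:
$$\begin{aligned}
0 &= \langle 3,+2|2,+2\rangle_J \\
&= \left[\sqrt{\tfrac{1}{3}}\;{}_A\langle 1,0|\,{}_B\langle 2,+2| + \sqrt{\tfrac{2}{3}}\;{}_A\langle 1,+1|\,{}_B\langle 2,+1|\right]\left[\alpha\,|1,0\rangle_A|2,+2\rangle_B + \beta\,|1,+1\rangle_A|2,+1\rangle_B\right] \\
&= \sqrt{\tfrac{1}{3}}\,\alpha\,\langle 1,0|1,0\rangle_A\langle 2,+2|2,+2\rangle_B + \sqrt{\tfrac{1}{3}}\,\beta\,\langle 1,0|1,+1\rangle_A\langle 2,+2|2,+1\rangle_B \\
&\quad + \sqrt{\tfrac{2}{3}}\,\alpha\,\langle 1,+1|1,0\rangle_A\langle 2,+1|2,+2\rangle_B + \sqrt{\tfrac{2}{3}}\,\beta\,\langle 1,+1|1,+1\rangle_A\langle 2,+1|2,+1\rangle_B \\
&= \sqrt{\tfrac{1}{3}}\,\alpha(1) + \sqrt{\tfrac{1}{3}}\,\beta(0) + \sqrt{\tfrac{2}{3}}\,\alpha(0) + \sqrt{\tfrac{2}{3}}\,\beta(1) \\
&= \alpha\sqrt{\tfrac{1}{3}} + \beta\sqrt{\tfrac{2}{3}}.
\end{aligned}$$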
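There are, of course, many solutions to this equation, but you can read off
a normalized solution, namely
$$\alpha = \sqrt{\tfrac{2}{3}}, \qquad \beta = -\sqrt{\tfrac{1}{3}}$$
so that
$$|2,+2\rangle_J = \sqrt{\tfrac{2}{3}}\;|1,0\rangle_A|2,+2\rangle_B - \sqrt{\tfrac{1}{3}}\;|1,+1\rangle_A|2,+1\rangle_B. \tag{19.3}$$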
You could have taken $|2,+2\rangle_J$ to be the negative of the expression above
(or $i$ times the expression above, or $\sqrt{i}$ times the expression above, etc.)
I recommend against this: life is hard enough on its own, don’t go out of
your way to deliberately make difficulties for yourself.
Once the expression for $|2,+2\rangle_J$ is known, we can lower $m_J$ from +2 all
the way to −2 to find expressions for the entire $j = 2$ quintet.
And then one can find the expression for $|1,+1\rangle_J$ by demanding that
it be orthogonal to $|3,+1\rangle_J$ and $|2,+1\rangle_J$. And once that’s found we can
lower to find expressions for the entire $j = 1$ triplet.
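The orthogonality step for $|2,+2\rangle_J$ is easy to check numerically. The sketch below assumes the two-component ordering ($|1,0\rangle_A|2,+2\rangle_B$, $|1,+1\rangle_A|2,+1\rangle_B$) for the $m_J = +2$ subspace; the variable names are illustrative only. In a real two-dimensional space the unit vector orthogonal to $(a, b)$ is $(b, -a)$ up to an overall sign, and picking $\alpha > 0$ is just the phase convention adopted in equation (19.3).

```python
import math

# |3,+2>_J in the basis ( |1,0>_A |2,+2>_B , |1,+1>_A |2,+1>_B ), equation (19.2)
a, b = math.sqrt(1 / 3), math.sqrt(2 / 3)

# Orthogonal partner in two real dimensions: rotate (a, b) by a quarter turn.
# Taking alpha positive is a phase convention, not a physical requirement.
alpha, beta = b, -a

overlap = a * alpha + b * beta      # <3,+2|2,+2>, should vanish
norm = alpha ** 2 + beta ** 2       # <2,+2|2,+2>, should equal 1

print(alpha ** 2, beta ** 2, overlap, norm)
```

For $|1,+1\rangle_J$ the same idea applies in the three-dimensional $m_J = +1$ subspace, where the state must be orthogonal to two known vectors rather than one.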
In summary, the states of these two angular momenta, $\ell_A = 1$ and
$\ell_B = 2$, fall in a Hilbert space with a fifteen-element basis. While there
are, of course, an infinite number of bases, the most natural and most
useful bases are (1) the states of definite individual angular momenta (the
15 states like $|\ell_A, m_A\rangle|\ell_B, m_B\rangle$) or (2) the states of definite total angular
momentum (the 15 states like $|j, m_J\rangle_J$). We now know (in principle) how
to express states of the second basis in terms of states in the first basis.
The coefficients, like $\sqrt{1/3}$ and $\sqrt{2/3}$ in equation (19.2), or $\sqrt{2/3}$ and
$-\sqrt{1/3}$ in equation (19.3), that implement this change of basis are called
Clebsch-Gordan coefficients.$^3$
As you can see, it takes a lot of work to compute Clebsch-Gordan coefficients,
but fortunately you don’t have to do it. There are published tables
of Clebsch-Gordan coefficients. Griffiths explains how to use them.
Problem XXX: Commutators. Show that
$$[\hat J^2, \hat L_{A,z}] = 2i\hbar\,(\hat L_{A,x}\hat L_{B,y} - \hat L_{A,y}\hat L_{B,x}).$$
Without performing any new calculation, find $[\hat J^2, \hat L_{B,z}]$.
3 Alfred Clebsch (1833–1872) and Paul Gordan (1837–1912) were German mathemati-
cians who recognized the importance of these coefficients in the purely mathematical
context of invariant theory in about 1868, years before quantum mechanics was discov-
ered. Gordan went on to serve as thesis advisor for Emmy Noether.
19.2 Hartree-Fock approximation
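For an atom with atomic number $Z$:
(1) Guess some spherically-symmetric potential energy function that
interpolates between
$$\text{for small } r, \quad V(r) \approx -\frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \tag{19.4}$$
and
$$\text{for large } r, \quad V(r) \approx -\frac{1}{4\pi\epsilon_0}\frac{e^2}{r}. \tag{19.5}$$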
(2) Using all the tricks we’ve learned about spherically-symmetric po-
tential energy functions, solve (numerically) the energy eigenproblem for
the lowest Z/2 one-body energy levels. (If Z is odd, round up.)
(3) Use the antisymmetrization machinery to combine those levels into
the Z-body ground state.
(4) From the quantal probability density for electrons in configuration
space, deduce an electrostatic charge density in position space.
(5) Average that charge density over angle to make it spherically sym-
metric.
(6) From this spherically-symmetric charge density, use the shell the-
orem of electrostatics to deduce a spherically-symmetric potential energy
function.
(7) Go to step (2).
You’ll notice that this process never ends. In practice, you repeat until
either you’ve earned a Ph.D. or you can’t stand it any longer.
This is a “mean-field approximation”. An electron is assumed to interact
with the average (mean) of all the other electrons. Even if you go through
this process an infinite number of times, you will never get the fine points of
two electrons interacting far from the nucleus and from the other electrons.
Nevertheless, even two or three cycles through this algorithm can pro-
duce results in close accord with experiment. This has always surprised
me and I think if I understood it I’d discover something valuable about
quantum mechanics.
19.3 Atomic ground states
In addition to the process described above, you have to worry about spin,
and about orbital angular momentum and (when you go on to Hamiltonians
more accurate than the above) their interaction.
Friedrich Hund$^4$ did many such perturbation calculations and noticed
regularities that he codified into “Hund’s rules”. Griffiths talks about them.
Some aspects of an electronic state are described using a particular
notation, called a “term symbol”, which you should know about.
A state will have a particular orbital angular momentum L, spin angular
momentum S, and total angular momentum J. (It will also have values
of Lz , Sz , and Jz , but they are not recorded in this notation.) You would
think these three numbers would be presented as three numbers, but no:
they are conventionally presented as the term symbol
$${}^{2S+1}L_J. \tag{19.6}$$
By further convention, $S$ and $J$ are given as numbers, while $L$ is presented
as a letter using the S, P, D, F encoding. (In this notation, capital not
lower case letters are used. Please don’t ask why.) The ground state of
carbon, for example, happens to have $S = 1$, $L = 1$, and $J = 0$; it is
described as a ${}^3P_0$ state. One last convention: the spin number is written
as a number, but pronounced as a degeneracy. The ground state of carbon
is pronounced “triplet pee zero”. The ground state of sodium, ${}^2S_{1/2}$, is
pronounced “doublet ess one-half”.
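Since the term symbol is pure bookkeeping, a tiny formatter makes the convention concrete. This is a sketch under stated assumptions: the function name `term_symbol` and the flat text output (for example `3P0` rather than a typeset ${}^3P_0$) are choices made for this illustration, and the letters beyond F follow the usual spectroscopic sequence, which continues alphabetically but skips J.

```python
from fractions import Fraction

# Spectroscopic letters for L = 0, 1, 2, ...; the sequence skips the letter J.
L_LETTERS = "SPDFGHIK"

def term_symbol(S, L, J):
    """Return a flat-text term symbol such as '3P0' or '2S1/2'.

    S and J may be integers or half-integers (pass half-integers as Fraction);
    L must be a non-negative integer. (Hypothetical helper for illustration.)
    """
    S, J = Fraction(S), Fraction(J)
    multiplicity = 2 * S + 1   # written as a number, pronounced as a degeneracy
    return f"{multiplicity}{L_LETTERS[L]}{J}"

print(term_symbol(1, 1, 0))                             # carbon ground state
print(term_symbol(Fraction(1, 2), 0, Fraction(1, 2)))   # sodium ground state
```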
The people who write the physics GRE have fallen into the miscon-
ception that this term symbol notation tells us something important about
nature, rather than about human convention. I recommend that you review
the above paragraph the night before you take the GRE.
4 German physicist (1896–1997) who applied quantum mechanics to atoms and
molecules, and who discovered quantum tunneling.

Chapter 20
Molecules
20.1 The hydrogen molecule ion
The hydrogen molecule ion$^1$ is two protons and a single electron: H$_2^+$. If we
had managed to successfully solve the helium atom problem we would also
have solved this one, because it’s just three particles interacting through
$1/r^2$ forces. However, you know that this problem has not been exactly
solved even in the classical limit. Thus we don’t even look for an exact
solution: we look for the approximation most applicable to the case of two
particles much more massive than the third.
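[Figure: the electron e$^-$ at distance $r_\alpha$ from proton $\alpha$ and distance $r_\beta$ from proton $\beta$; the two protons are separated by distance $R$.]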
If we take account of the Coulomb forces, but ignore things like the finite
size of the nucleus, relativistic motion of the electron, spin-orbit effects, and
so forth, the Hamiltonian for one electron and two protons ($\alpha$ and $\beta$) is
$$\hat H = \widehat{KE}_\alpha + \widehat{KE}_\beta + \widehat{KE}_e + \hat U_{\alpha\beta} + \hat U_{\alpha e} + \hat U_{\beta e}. \tag{20.1}$$
This is, of course, also the Hamiltonian for the helium atom, or for any
three-body problem with pair interactions. Now comes the approximation
suitable for the hydrogen molecule ion (but not appropriate for the helium
atom): Assume that the two protons are so massive that they are fixed,
and the interaction between them is treated classically. In equations, this
approximation demands
$$\widehat{KE}_\alpha = 0; \qquad \widehat{KE}_\beta = 0; \qquad \hat U_{\alpha\beta} = U_{\alpha\beta} = \frac{e^2}{4\pi\epsilon_0}\frac{1}{R}. \tag{20.2}$$
$^1$ Technically the hydrogen molecule cation.
The remaining, quantum mechanical, piece of the full Hamiltonian is the
electronic Hamiltonian
$$\hat H_e = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\left(\frac{1}{r_\alpha} + \frac{1}{r_\beta}\right). \tag{20.3}$$
This approximation is called the “Born-Oppenheimer” approximation.
What shall we do with the electronic Hamiltonian? It would be nice to
have an analytic solution of the energy eigenproblem. Then we could do
precise comparisons between these results and the experimental spectrum
of the hydrogen molecule ion, and build on them to study the hydrogen
molecule, in exactly the same way that we built on our exact solution for
He+ to get an approximate solution for He. This goal is hopelessly beyond
our reach. [Check out Gordon W.F. Drake, editor, Atomic, Molecular,
and Optical Physics Handbook (AIP Press, Woodbury, NY, 1996) Refer-
ence QC173.A827 1996. There’s a chapter on high-precision calculations
for helium, but no chapter on high-precision calculations for the hydrogen
molecule ion.] Instead of giving up, we might instead look for an exact
solution to the ground state problem. This goal is also beyond our reach.
Instead of giving up, we use the variational method to look for an approx-
imate ground state.
Before doing so, however, we notice one exact symmetry of the electronic
Hamiltonian that will guide us in our search for approximate solutions.
The Hamiltonian is symmetric under the interchange of symbols α and
β or, what is the same thing, symmetric under inversion about the point
midway between the two nuclei. Any discussion of parity (see, for example,
Gordon Baym Lectures on Quantum Mechanics pages 99–101) shows that
this means the energy eigenfunctions can always be chosen either odd or
even under the interchange of α and β.
Where will we find a variational trial wavefunction? If nucleus $\beta$ did not
exist, the ground state wavefunction would be the hydrogen ground state
wavefunction centered on nucleus $\alpha$:
$$\eta_\alpha(\vec r) = \frac{1}{\sqrt{\pi a_0^3}}\,e^{-r_\alpha/a_0} \equiv |\alpha\rangle. \tag{20.4}$$
Similarly if nucleus $\alpha$ did not exist, the ground state wavefunction would
be
$$\eta_\beta(\vec r) = \frac{1}{\sqrt{\pi a_0^3}}\,e^{-r_\beta/a_0} \equiv |\beta\rangle. \tag{20.5}$$
We take as our trial wavefunction a linear combination of these two wavefunctions.
This trial wavefunction is called a “linear combination of atomic
orbitals” or “LCAO”. So the trial wavefunction is
$$\psi(\vec r) = A\,\eta_\alpha(\vec r) + B\,\eta_\beta(\vec r). \tag{20.6}$$
At first glance, it seems that the variational parameters are the complex
numbers $A$ and $B$, for a total of four real parameters. However, one parameter
is taken up through normalization, and one through overall phase.
Furthermore, because of parity the swapping of $\alpha$ and $\beta$ can result in at
most a change in sign, whence $B = \pm A$. Thus our trial wavefunction is
$$\psi(\vec r) = A_\pm\,[\eta_\alpha(\vec r) \pm \eta_\beta(\vec r)], \tag{20.7}$$
where $A_\pm$ is the normalization constant, selected to be real and positive.
(The notation $A_\pm$ reflects the fact that depending on whether we take the
+ sign or the − sign, we will get a different normalization constant.)
This might seem like a letdown. We have discussed exquisitely precise
variational wavefunctions involving hundreds or even thousands of real parameters.
Here the only variational parameter is the binary choice: + sign
or − sign! Compute $\langle\hat H_e\rangle$ both ways and see which is lower! You don’t even
have to take a derivative at the end! Clearly this is a first attempt and more
accurate calculations are possible. Rather than give in to despair, however,
let’s recognize the limitations and forge on to see what we can discover.
At the very least what we learn here will guide us in selecting better trial
wavefunctions for our next attempt.
There are only two steps: normalize the wavefunction and evaluate
$\langle\hat H_e\rangle$. However, these steps can be done through a frontal assault (which
is likely to get hopelessly bogged down in algebraic details) or through a
more subtle approach recognizing that we already know quite a lot about
the functions $\eta_\alpha(\vec r)$ and $\eta_\beta(\vec r)$, and using this knowledge to our advantage.
Let’s use the second approach.
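Normalization demands that
$$\begin{aligned}
1 &= |A_\pm|^2\,(\langle\alpha| \pm \langle\beta|)(|\alpha\rangle \pm |\beta\rangle) \\
&= |A_\pm|^2\,(\langle\alpha|\alpha\rangle \pm \langle\alpha|\beta\rangle \pm \langle\beta|\alpha\rangle + \langle\beta|\beta\rangle) \\
&= 2|A_\pm|^2\,(1 \pm \langle\alpha|\beta\rangle)
\end{aligned}$$
where in the last step we have used the normalization of $|\alpha\rangle$ and $|\beta\rangle$. The
integral $\langle\alpha|\beta\rangle$ is not easy to calculate, so we set it aside for later by naming
it the overlap integral
$$I(R) \equiv \langle\alpha|\beta\rangle = \int \eta_\alpha(\vec r)\,\eta_\beta(\vec r)\,d^3r. \tag{20.8}$$
In terms of this integral, we can select the normalization to be
$$A_\pm = \frac{1}{\sqrt{2(1 \pm I(R))}}. \tag{20.9}$$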
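Evaluating the electronic Hamiltonian in the trial wavefunction gives
$$\begin{aligned}
\langle\hat H_e\rangle &= \frac{(\langle\alpha| \pm \langle\beta|)\,\hat H_e\,(|\alpha\rangle \pm |\beta\rangle)}{2(1 \pm I(R))} \\
&= \frac{\langle\alpha|\hat H_e|\alpha\rangle \pm \langle\alpha|\hat H_e|\beta\rangle \pm \langle\beta|\hat H_e|\alpha\rangle + \langle\beta|\hat H_e|\beta\rangle}{2(1 \pm I(R))} \\
&= \frac{\langle\alpha|\hat H_e|\alpha\rangle \pm \langle\beta|\hat H_e|\alpha\rangle}{1 \pm I(R)}.
\end{aligned} \tag{20.10}$$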
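But we have already done large parts of these two integrals:
$$\begin{aligned}
\hat H_e|\alpha\rangle &= \left[\widehat{KE} - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r_\alpha} - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r_\beta}\right]|\alpha\rangle \\
&= \left[\widehat{KE} - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r_\alpha}\right]|\alpha\rangle - \frac{e^2}{4\pi\epsilon_0}\frac{1}{r_\beta}|\alpha\rangle \\
&= -\mathrm{Ry}\,|\alpha\rangle - 2\,\mathrm{Ry}\,a_0\,\frac{1}{r_\beta}|\alpha\rangle \\
&= -\mathrm{Ry}\left[|\alpha\rangle + 2\,\frac{a_0}{r_\beta}|\alpha\rangle\right]
\end{aligned} \tag{20.11}$$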
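whence
$$\langle\alpha|\hat H_e|\alpha\rangle = -\mathrm{Ry}\left[1 + 2\left\langle\alpha\left|\frac{a_0}{r_\beta}\right|\alpha\right\rangle\right] \tag{20.12}$$
$$\langle\beta|\hat H_e|\alpha\rangle = -\mathrm{Ry}\left[\langle\beta|\alpha\rangle + 2\left\langle\beta\left|\frac{a_0}{r_\beta}\right|\alpha\right\rangle\right]. \tag{20.13}$$
On the right-hand side we recognize the overlap integral, $I(R) = \langle\beta|\alpha\rangle$, and
two new (dimensionless) integrals, which are called the direct integral
$$D(R) \equiv \left\langle\alpha\left|\frac{a_0}{r_\beta}\right|\alpha\right\rangle \tag{20.14}$$
and the exchange integral
$$X(R) \equiv \left\langle\beta\left|\frac{a_0}{r_\beta}\right|\alpha\right\rangle. \tag{20.15}$$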
These two integrals are not easy to work out (I will assign them as
homework) but once we do them (plus the overlap integral) we can find the
expectation value of the electronic Hamiltonian in the trial wavefunction.
It is
$$\begin{aligned}
\langle\hat H_e\rangle &= -\mathrm{Ry}\,\frac{1 + 2D(R) \pm I(R) \pm 2X(R)}{1 \pm I(R)} \\
&= -\mathrm{Ry}\left[1 + 2\,\frac{D(R) \pm X(R)}{1 \pm I(R)}\right].
\end{aligned} \tag{20.16}$$
This, remember, is only the electronic part of the Hamiltonian. In the
Born-Oppenheimer approximation the nuclear part has no kinetic energy
and Coulombic potential energy
$$\frac{e^2}{4\pi\epsilon_0}\frac{1}{R} = 2\,\mathrm{Ry}\,\frac{a_0}{R}, \tag{20.17}$$
so the upper bound on the total ground state energy is
$$\mathrm{Ry}\left[2\,\frac{a_0}{R} - 1 - 2\,\frac{D(R) \pm X(R)}{1 \pm I(R)}\right]. \tag{20.18}$$
What are the results?

Here the dashed line represents −, the solid line represents +. X means
R/a0 , and the vertical axis is energy in Ry. [When R → ∞, the system is a
hydrogen atom (ground state energy −Ry) and a clamped proton far away
(ground state energy 0).]
20.1.1 Why is + lower energy than −?
20.1.2 Understanding the integrals
How can we understand these integrals? This section uses scaled units (lengths in units of $a_0$).
First, all three integrals are always positive.
The overlap integral: $I(R) = \langle\beta|\alpha\rangle$.
When $R \to \infty$, $I(R)$ approaches zero, exponentially quickly.
When $R = 0$, $I(R) = 1$.
The direct integral: $D(R) = \langle\alpha|1/r_\beta|\alpha\rangle$.
When $R \to \infty$, $D(R) \to 1/R$.
When $R = 0$, $D(R) = \langle 1/r\rangle = 1$.
The exchange integral: $X(R) = \langle\beta|1/r_\beta|\alpha\rangle$.
When $R \to \infty$, $X(R)$ approaches zero even faster than $I(R)$ does.
When $R = 0$, $X(R) = \langle 1/r\rangle = 1$.
Do the analytic expressions bear these limits out?
$$I(R) = e^{-R}\left(1 + R + \tfrac{1}{3}R^2\right) \tag{20.19}$$
$$D(R) = \frac{1}{R} - \left(1 + \frac{1}{R}\right)e^{-2R} \tag{20.20}$$
$$X(R) = e^{-R}\,(1 + R) \tag{20.21}$$
First conclusion: For $R$ positive, $I(R) > X(R)$. Check.
For $R \to \infty$, $I(R)$ and $X(R)$ go to zero exponentially, while $D(R) \to 1/R$. Check.
For $R \to 0$,
$$I(R) \to 1 - \tfrac{1}{6}R^2 + O(R^3) \tag{20.22}$$
$$X(R) \to 1 - \tfrac{1}{2}R^2 + O(R^3) \tag{20.23}$$
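Check, check. But what of $D(R)$? As $R \to 0$, you might say $D(R) \to \infty - (1 + \infty)1$,
and the infinities cancel, so you’re left with $D(R) \to -1$,
but of course that’s silly. . . we’ve already said that $D(R)$ is positive. We
need to do the limit with some care.
$$\begin{aligned}
D(R) &= \frac{1}{R} - \left(1 + \frac{1}{R}\right)e^{-2R} \\
&= \frac{1}{R} - \left(1 + \frac{1}{R}\right)\left[1 + (-2R) + \tfrac{1}{2}(-2R)^2 + \tfrac{1}{6}(-2R)^3 + O((-2R)^4)\right] \\
&= \frac{1}{R} - \left(1 + \frac{1}{R}\right)\left[1 - 2R + 2R^2 - \tfrac{4}{3}R^3 + O(R^4)\right] \\
&= \frac{1}{R} - \left[1 - 2R + 2R^2 - \tfrac{4}{3}R^3 + O(R^4)\right] - \frac{1}{R}\left[1 - 2R + 2R^2 - \tfrac{4}{3}R^3 + O(R^4)\right] \\
&= \frac{1}{R} - \left[1 - 2R + 2R^2 - \tfrac{4}{3}R^3 + O(R^4)\right] - \left[\frac{1}{R} - 2 + 2R - \tfrac{4}{3}R^2 + O(R^3)\right] \\
&= -\left[-1 + \tfrac{2}{3}R^2 + O(R^3)\right] \\
&= 1 - \tfrac{2}{3}R^2 + O(R^3).
\end{aligned} \tag{20.24}$$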
All three integrals start at 1 when $R = 0$. As $R$ increases they all take
off with zero slope, but drop quadratically: $I(R)$ is highest, then $X(R)$,
and $D(R)$ lowest. But at some point $D(R)$ crosses the other two. While
all three approach zero as $R \to \infty$, $D(R)$ does so much more slowly than
the other two.
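With the analytic expressions (20.19) through (20.21) in hand, the energy bounds (20.16) and (20.18) are one-liners. The sketch below works in scaled units (lengths in $a_0$, energies in Ry); the function names and the crude grid search are choices made for this illustration, not part of the text. It locates the minimum of the “+” curve and evaluates the electronic energy very close to $R = 0$.

```python
import math

def overlap(R):
    """I(R), equation (20.19), with R in units of a0."""
    return math.exp(-R) * (1 + R + R ** 2 / 3)

def direct(R):
    """D(R), equation (20.20)."""
    return 1 / R - (1 + 1 / R) * math.exp(-2 * R)

def exchange(R):
    """X(R), equation (20.21)."""
    return math.exp(-R) * (1 + R)

def electronic_energy(R, sign):
    """Equation (20.16) in Ry; sign is +1 or -1."""
    return -(1 + 2 * (direct(R) + sign * exchange(R)) / (1 + sign * overlap(R)))

def total_energy(R, sign):
    """Equation (20.18) in Ry: nuclear repulsion 2/R plus electronic energy."""
    return 2 / R + electronic_energy(R, sign)

# Crude grid search for the minimum of the "+" curve, R from 0.5 to 10.
grid = [0.5 + 0.001 * k for k in range(9501)]
R_min = min(grid, key=lambda R: total_energy(R, +1))

print(R_min, total_energy(R_min, +1))   # separation and energy at the minimum
print(electronic_energy(1e-3, +1))      # electronic energy very near R = 0
```

The “+” curve dips below the separated-atom value of −1 Ry, which is the signature of binding in this approximation.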
20.1.3 Why is H$_2^+$ hard?
Obviously not Pauli exclusion! But if you plot the various contributions,
you see that it’s classical nuclear repulsion, not “Heisenberg hardness”.
20.2 Problems
20.1 The hydrogen molecule ion: Evaluation of integrals
Evaluate the direct and exchange integrals $D(R)$ and $X(R)$. (Hint:
Remember that $\sqrt{x^2} = |x|$.) Plot as a function of $R$ the overlap integral,
$I(R)$, as well as $D(R)$ and $X(R)$.
20.2 The hydrogen molecule ion: Thinking about integrals

For the hydrogen molecule ion, find and plot the expectation values of
nuclear potential energy, total electronic energy, kinetic electronic energy,
and potential electronic energy for the state $\psi_+(\vec r)$, as functions of
$R$. Do these plots shed any light on our initial question of “Why is stuff
hard?” (We gave possible answers of “repulsion hardness,” “Heisenberg
hardness,” and “Pauli hardness.”) Bonus: The hydrogen molecule ion
cannot display Pauli hardness, because it has only one quantal particle.
Can you generalize this discussion to the neutral hydrogen molecule?
20.3 Improved variational wavefunction
Everett Schlawin (‘09) suggested using “shielded” subwavefunctions like
equation (18.4) in place of the subwavefunctions (20.4) and (20.5) that
go into making trial wavefunction (20.7). Then there would be a vari-
ational parameter Z in addition to the binary choice of + or −. I
haven’t tried this, but through the usual variational argument, it can’t
be worse than what we’ve tried so far! (That is, the results can’t be
worse. The amount of labor involved can be far, far worse.) Execute
this suggestion. Show that this trial wavefunction results in the exact
helium ion ground state energy in the case R = 0.
20.3 The hydrogen molecule
When we discussed the helium atom, we had available an exact solution

(that is, exact ignoring fine and hyperfine structure) of the helium ion
problem. We used the one-body levels of the helium ion problem as building
blocks for the two-body helium atom problem. Then we added electron-
electron repulsion. You will recall, for example, that the helium atom
ground state had the form (where “level” refers to a solution of the one-
body helium ion problem)
(two electrons in ground level) × (spin singlet) (20.25)
while the helium atom first excited state had the form
(one electron in ground level, one in first excited level) × (spin triplet).
(20.26)
We will attempt the same strategy for the hydrogen molecule, but we
face a roadblock at the very first step — we lack an exact solution to the
hydrogen molecule ion problem! Using LCAO, we have a candidate for a
ground state, namely
$$\psi_+(\vec r) = A_+\,[\eta_\alpha(\vec r) + \eta_\beta(\vec r)]. \tag{20.27}$$
20.4 Can we do better?
Try out our LCAO upper bound for the electronic ground state energy (20.16)
at $R = 0$, where $D = X = I = 1$: The result is −3 Ry. But for $R = 0$ this is just
the helium ion, for which the exact ground state energy is −4 Ry. Sure
enough, the variational method produces an upper bound, but it’s a poor
one.
We’ve seen before that the trick to getting good variational bounds is
to figure out the qualitative character of the true wavefunction and select
a trial wavefunction that mimics that character. Friedrich Hund, Robert
Mulliken, John C. Slater, and John Lennard-Jones started out by dreaming
up a trial wavefunction that could mimic the character of the true wavefunction
at $R = 0$. Their techniques evolved into what is today called the
“molecular orbital method”. This is only one of several choices of trial wavefunction.
Others are called “valence bond theory” or “the Hückel method”
or “the extended Hückel method”.
Story about Roald Hoffmann.
All these are primitive, but in synthetic chemistry, you don’t need the
spectrum, you don’t need the ground state energy, all you need to know is
which structure has lower energy, and that’s the one you’ll synthesize.
Today, chemists are much more likely to use a completely different
approach, called “density-functional theory”. This was developed by the
physicist Walter Kohn and made readily accessible through the computer
program gaussian written by the mathematician John Pople. When Kohn
and Pople won the Nobel Prize in Chemistry in 1998, I heard some grum-
bling among chemists that Chemistry Nobel laureates should have taken at
least one chemistry course.
Chapter 21
WKB: The Quasiclassical Approximation
When I started learning quantum mechanics, I worked a lot of integrals and

diagonalized a lot of matrices. But I also vaguely wondered “Why is quan-
tum mechanics true?”. For example, why are there limits on our ability to
observe position and momentum? But eventually (while teaching Applied
Quantum Mechanics) I realized that I had the question backwards. The
real question is “We know that interference and entanglement exist. Why
don’t we notice them in daily life?” For example, the Heisenberg indeterminacy
principle answers the question “When is the classical approximation adequate?”
That is, the real question concerns the classical limit of quantum
mechanics. Ehrenfest’s Theorem shows that classical mechanics can be the
limit of quantum mechanics, but not that it has to be. Research on this
topic continues under the name “decoherence”. We approach this subject
through the quasiclassical approximation.
The WKB technique finds approximate solutions to the energy eigen-
problem in one dimension. It is named for three physicists who indepen-
dently discovered it: the German Gregor Wentzel, the Dutchman Hendrik
Kramers, and the Frenchman Léon Brillouin. In the Netherlands it is known
as the KWB approximation, in France as BWK, and in Britain as JWKB
(adding a tribute to the English mathematician Sir Harold Jeffreys, who in
fact discovered the approximation three years before Wentzel, Kramers, and
Brillouin did). In Russia it is known as the quasiclassical approximation,
the name that I prefer.
The fact that this approximation was discovered independently four
times suggests, correctly, that the idea is pretty straightforward.$^1$
$^1$ The same basic idea can be used in many similar situations: for light moving in a
medium where the index of refraction varies slowly, for example.
Focus on a region where the potential energy function $V(x)$ is constant. Within that
region the eigenfunction of energy $E$ is, when $E > V$, given by
$$\eta(x) = Ae^{\pm ikx} \quad\text{where}\quad \hbar k = \sqrt{2m(E - V)}. \tag{21.1}$$
The plus sign indicates positive momentum, the minus sign negative mo-
mentum, and the general solution is of course a linear combination of the
two. The wavefunction is sinusoidal oscillatory, with constant wavelength
$$\lambda = \frac{2\pi\hbar}{\sqrt{2m(E - V)}} \tag{21.2}$$
and constant amplitude. Now suppose that $V(x)$ is not constant, but that
it varies slowly over the length $\lambda$. Then my guess would be that $\eta(x)$ is
almost sinusoidal, but the wavelength and amplitude vary slowly with $x$.
That is, I would seek oscillatory solutions like
$$\eta(x) = A(x)e^{\pm ik(x)x} \quad\text{where}\quad \hbar k(x) = \sqrt{2m(E - V(x))}. \tag{21.3}$$
On the other hand, if the potential energy function $V(x)$ is constant
but $E < V$, then the energy eigenfunction is
$$\eta(x) = Ae^{\pm x/d} \quad\text{where}\quad d = \frac{\hbar}{\sqrt{2m(V - E)}}. \tag{21.4}$$
Here $d$ is the characteristic exponential decay length: If one walks in the
direction of decreasing function, then the function diminishes by a factor
of $1/e$ (about 1/3) every time one steps a distance $d$. However when $V(x)$
is not constant, but varies slowly over the length of $d$, then $\eta(x)$ is almost
exponential, but the decay length and amplitude vary slowly with $x$. That
is, I should seek solutions like
$$\eta(x) = A(x)e^{\pm x/d(x)} \quad\text{where}\quad d(x) = \frac{\hbar}{\sqrt{2m(V(x) - E)}}. \tag{21.5}$$
What we have said so far reinforces the qualitative expectations for energy
eigenfunction sketching established in chapter XX.
There is one place where this entire scheme is guaranteed to fail. If
E = V (x) then λ(x) = d(x) = ∞, and no potential energy function varies
“slowly on the scale of infinity”. The proper handling of these so-called
“classical turning points” is the most difficult facet of deriving the quasi-
classical approximation. However we will find that once the derivation is
done the final result is easy to state and to use.
If you apply these ideas to two- or three-dimensional problems, you
find that the classical turning points are now lines (in two dimensions) or
surfaces (in three dimensions). The matching program at turning points
becomes a matching program over lines or surfaces (called in this context
“caustics”) and the results are neither easy to state nor simple to use. They
are connected with classical chaos, and, remarkably, with the theory of the
rainbow. Such are the nimble abstractions of mathematics. We will not
pursue these avenues in this book.
Define
$$p_c(x) = \sqrt{2m(E - V(x))}. \tag{21.6}$$
This function is the “classical momentum”, that is, the momentum that a
classical particle of energy $E$ would have if it were located at $x$.
21.1 The connection region
We have one formula accurate within the classically allowed region, and
another accurate within the classically prohibited region. But we lack a
formula accurate within the connection region near the classical turning
point, where the quasiclassical approximation fails. The job of this section
is to find a formula accurate in this region.
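[Figure: graph with a horizontal line marked $E$ and a line slanting upward from lower left to upper right representing $V(x) = V_x + s(x - x_x)$. A vertical dashed line marks $x_x$, and an arrow to the right of the dashed line indicates $x' = x - x_x$. Possibly also a qualitative sketch of $\eta(x)$.]
The crossing point is $x_x$ (the subscript x stands for “crossing”), so $V_x = E$. The energy eigenequation
$$-\frac{\hbar^2}{2m}\frac{d^2\eta}{dx^2} + V(x)\eta(x) = E\eta(x) \tag{21.7}$$
becomes, near the crossing point,
$$-\frac{\hbar^2}{2m}\frac{d^2\eta}{dx^2} + [E + s(x - x_x)]\eta(x) = E\eta(x). \tag{21.8}$$
In terms of the new variable $x'$
$$-\frac{\hbar^2}{2m}\frac{d^2\eta}{dx'^2} + sx'\eta(x') = 0. \tag{21.9}$$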
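There are only two parameters: $\hbar^2/2m$ and $s$. What is the characteristic
length for this problem?
$$\begin{array}{ll}
\text{quantity} & \text{dimensions} \\ \hline
\hbar^2/2m & [\text{mass}][\text{length}]^4/[\text{time}]^2 \\
s & [\text{mass}][\text{length}]/[\text{time}]^2
\end{array}$$
Clearly the characteristic length is
$$x_0 = \left(\frac{\hbar^2/2m}{s}\right)^{1/3}. \tag{21.10}$$
Defining the scaled variable
$$\tilde x = x'/x_0 \tag{21.11}$$
we have
$$-\frac{d^2\eta}{d\tilde x^2} + \tilde x\,\eta(\tilde x) = 0. \tag{21.12}$$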
21.2 Why is WKB the “quasiclassical” approximation?
The approximation works when the de Broglie wavelength $h/p$ is much
less than the characteristic length $L_c$ of variations in the potential energy
function:
$$\begin{aligned}
h/p &\ll L_c \\
p &\gg h/L_c.
\end{aligned} \tag{21.13}$$
That is, it works for large (classical) values of momentum. Remember
that when I say large I don’t mean large on a human scale (say by comparing
the momentum of a gnat to the momentum of a semi-truck), I mean large
on the scale of $h/L_c$. So the momentum could be very small on a human
scale yet the WKB approximation would still work very well.
21.3 The “power law” potential
While the quasiclassical approximation is difficult to derive, it is straightforward
to apply. This section applies the approximation to the so-called
“power law” potential energy function,
$$V(x) = \alpha|x|^\nu. \tag{21.14}$$
When $\nu = 2$ this is just the simple harmonic oscillator, which we have
studied extensively. When $\nu > 2$ this potential traces out successively
steeper potential wells as $\nu$ increases:
[Figure: $V(x)$ versus $x$ from −1 to +1, with curves for $\nu = 2$, $\nu = 3$, and $\nu \to \infty$.]
In the limit $\nu \to \infty$, the power law potential approaches an infinite square
well.
Meanwhile, when $\nu < 2$ this potential traces out successively flatter
potential wells as $\nu$ decreases:
[Figure: $V(x)$ versus $x$ from −1 to +1, with curves for $\nu = 2$, $\nu = 1$, and $\nu \to 0$; the value $\alpha$ is marked on the vertical axis.]
In the limit $\nu \to 0$, the power law potential approaches the flat potential
$V(x) = \alpha$.
I don’t know of any physical system that obeys the power law potential
(except for the special cases ν = 0, ν = 2, and ν → ∞), but it’s a good idea
to understand quantum mechanics even in cases where it doesn’t reflect any
physical system.
To apply the quasiclassical approximation, locate the classical turning
points at
$$ x_1 = -(E/\alpha)^{1/\nu} \quad\text{and}\quad x_2 = +(E/\alpha)^{1/\nu}, \tag{21.15} $$
[Figure: V(x) versus x, with the turning points $x_1$ and $x_2$ marked on the x axis.]
and then perform the integration
$$ \int_{x_1}^{x_2} p_c(x)\,dx = (n - \tfrac{1}{2})\pi\hbar \tag{21.16} $$
where
$$ p_c(x) = \sqrt{2m(E - V(x))} = \sqrt{2m(E - \alpha|x|^\nu)}. \tag{21.17} $$
It’s always a good idea to sketch the integrand before executing the
integral, and that’s what I do here:
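[Figure: $p_c(x)$ versus x between $x_1$ and $x_2$, reaching its maximum $\sqrt{2mE}$ at x = 0; curves are shown for ν → ∞ and ν → 0.]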
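So
$$ \begin{aligned}
\int_{x_1}^{x_2} p_c(x)\,dx &= \int_{x_1}^{x_2} \sqrt{2m(E - V(x))}\,dx \\
&= \sqrt{2m}\int_{-(E/\alpha)^{1/\nu}}^{+(E/\alpha)^{1/\nu}} \sqrt{E - \alpha|x|^\nu}\,dx \\
&= 2\sqrt{2m}\int_0^{(E/\alpha)^{1/\nu}} \sqrt{E - \alpha x^\nu}\,dx.
\end{aligned} $$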
How should one execute this integral? I prefer to integrate over dimen-
sionless variables, so as to separate the physical operation of setting up an
integral from the mathematical operation of executing that integral. For
that reason I define the dimensionless variable u through
$$ \alpha x^\nu = E u^\nu, \qquad x = \left(\frac{E}{\alpha}\right)^{1/\nu} u, \qquad u = \left(\frac{\alpha}{E}\right)^{1/\nu} x. $$
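Changing the integral to this variable
$$ \begin{aligned}
\int_{x_1}^{x_2} p_c(x)\,dx &= 2\sqrt{2m}\int_0^1 \sqrt{E - Eu^\nu}\,\left(\frac{E}{\alpha}\right)^{1/\nu} du \\
&= 2\sqrt{2mE}\left(\frac{E}{\alpha}\right)^{1/\nu}\int_0^1 \sqrt{1 - u^\nu}\,du \\
&= \frac{(8m)^{1/2}}{\alpha^{1/\nu}}\,E^{(2+\nu)/2\nu}\int_0^1 \sqrt{1 - u^\nu}\,du
\end{aligned} $$
where the integral here is a numerical function of ν independent of m or E
or α. Let’s call it
$$ I(\nu) = \int_0^1 \sqrt{1 - u^\nu}\,du. \tag{21.18} $$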
If you try to evaluate this integral in terms of polynomials or trig functions
or anything familiar, you will fail. This is a function of ν all right, but
we’re going to have to uncover its properties on our own without recourse
to familiar functions.
Let’s start by graphing the integrand.

[Figure: the integrand $y(u) = \sqrt{1 - u^\nu}$ versus u for 0 ≤ u ≤ 1, with curves for ν → ∞, ν = 2, and ν → 0.]
I(ν) is the area under the curve. You could produce a table of values
through numerical integration, but let’s uncover its properties first. It’s
clear from the graph that I(0) = 0, that as ν → ∞, I(ν) → 1, and that
I(ν) increases monotonically.
When ν = 2, the integrand is $y = \sqrt{1 - u^2}$, so $u^2 + y^2 = 1$: the
integrand traces out a quarter circle of radius 1. The area under this curve
is of course π/4. So my first thought is that the function I(ν) looks like
this:
[Figure: first sketch of I(ν) versus ν, rising from 0 at ν = 0 through π/4 at ν = 2 and approaching 1.]
But I want to investigate one detail further: What is the behavior of
I(ν) for small values of ν? To find this, I need to understand the behavior
of $u^\nu$ for small values of ν.
$$ \begin{aligned}
e^x &= 1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{3!}x^3 + \cdots \\
u^\nu = e^{\nu\ln u} &= 1 + \nu\ln u + \tfrac{1}{2}\nu^2\ln^2 u + \tfrac{1}{6}\nu^3\ln^3 u + \cdots \\
1 - u^\nu &= -\nu\ln u - \tfrac{1}{2}\nu^2\ln^2 u - \tfrac{1}{6}\nu^3\ln^3 u - \cdots \\
\sqrt{1 - u^\nu} &\approx \sqrt{\nu}\,\sqrt{-\ln u}
\end{aligned} $$
At first glance it looks very bad to see that negative sign under the square
root radical, but then you remember that when 0 < u < 1, ln u is negative,
so it’s a good thing that the negative sign is there!
For small values of ν,
$$ I(\nu) \approx \sqrt{\nu}\int_0^1 \sqrt{-\ln u}\,du = \sqrt{\nu}\,(\text{some positive number}). \tag{21.19} $$
Even without knowing the value of that positive number, you know that
I(ν) takes off from ν = 0 with infinite slope, like this:
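[Figure: refined sketch of I(ν) versus ν, leaving ν = 0 with infinite slope, passing through π/4 at ν = 2, and approaching 1.]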
[[You don’t really need the value of “some positive number”, but if you’re
insatiably curious, use the substitution $v = -\ln u$ to find
$$ \int_0^1 \sqrt{-\ln u}\,du = \int_\infty^0 \sqrt{v}\,(-e^{-v})\,dv = \int_0^\infty v^{1/2}e^{-v}\,dv = \Gamma(\tfrac{3}{2}) = \frac{\sqrt{\pi}}{2}, $$
so for small values of ν,
$$ I(\nu) \approx \frac{\sqrt{\pi}}{2}\sqrt{\nu}. $$ ]]
A formal analysis shows that our integral I(ν) can be expressed in terms
of gamma functions as
$$ I(\nu) = \frac{\sqrt{\pi}}{2 + \nu}\,\frac{\Gamma(\frac{1}{\nu})}{\Gamma(\frac{1}{\nu} + \frac{1}{2})}, $$
but the graph actually tells you more than this formal expression does.
When I was an undergraduate only a very few special functions (for example
the Γ function) had been laboriously worked out numerically and tabulated,
so it was important to express your integral of interest in terms of one of
those few that had been worked out. Now numerical integration is a breeze
(your phone is more powerful than the single computer we had on campus
when I was an undergraduate), so it’s more important to be able to tease
information out of the function as we’ve done here.
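Since numerical integration is a breeze, here is a minimal sketch of it for I(ν). It compares a plain midpoint-rule estimate of the integral (21.18) with the gamma-function expression and with the small-ν behavior. The function names, the step count, and the choice of the midpoint rule are my own illustrative assumptions, not anything prescribed above.

```python
import math

def I_numeric(nu, n_steps=100_000):
    """Midpoint-rule estimate of I(nu) = integral from 0 to 1 of sqrt(1 - u**nu) du."""
    h = 1.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        u = (k + 0.5) * h              # midpoint of the k-th subinterval
        total += math.sqrt(1.0 - u**nu)
    return total * h

def I_gamma(nu):
    """Gamma-function form: sqrt(pi)/(2 + nu) * Gamma(1/nu) / Gamma(1/nu + 1/2)."""
    # lgamma is used so that Gamma(1/nu) does not overflow when nu is small
    log_ratio = math.lgamma(1.0 / nu) - math.lgamma(1.0 / nu + 0.5)
    return math.sqrt(math.pi) / (2.0 + nu) * math.exp(log_ratio)

def I_small_nu(nu):
    """Leading small-nu behavior, (sqrt(pi)/2) * sqrt(nu)."""
    return 0.5 * math.sqrt(math.pi * nu)

for nu in (0.5, 2.0, 10.0):
    print(f"nu = {nu:5.1f}   midpoint = {I_numeric(nu):.6f}   gamma form = {I_gamma(nu):.6f}")
```

The two special values worked out in the text make convenient checks: I(2) should be close to π/4, and I(1), which is the elementary integral of √(1 − u), should be close to 2/3.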
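In summary, the energy eigenvalues obtained through the quasiclassical
approximation
$$ (n - \tfrac{1}{2})\pi\hbar = \frac{(8m)^{1/2}}{\alpha^{1/\nu}}\,E^{(2+\nu)/2\nu}\,I(\nu) $$
are
$$ E_n = \left[(n - \tfrac{1}{2})\pi\hbar\,\frac{\alpha^{1/\nu}}{(8m)^{1/2} I(\nu)}\right]^{2\nu/(2+\nu)} \qquad n = 1, 2, 3, \ldots. \tag{21.20} $$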
You could spend a lot of time probing this equation to find out what
it tells us about quantum mechanics. (You could also spend a lot of time
looking at the quasiclassical wavefunctions.) I’ll content myself with exam-
ining the energy eigenvalues for the three special cases ν = 2, ν → ∞, and
ν → 0.
When ν = 2 the power-law potential $V(x) = \alpha x^2$ becomes the simple
harmonic oscillator $V(x) = \tfrac{1}{2}m\omega^2 x^2$. Equation (21.20) becomes
$$ \begin{aligned}
E_n &= (n - \tfrac{1}{2})\pi\hbar\,\frac{\alpha^{1/2}}{(8m)^{1/2} I(2)} \\
&= (n - \tfrac{1}{2})\pi\hbar\,\frac{(\tfrac{1}{2}m\omega^2)^{1/2}}{(8m)^{1/2}\,\pi/4} \\
&= (n - \tfrac{1}{2})\hbar\omega \qquad n = 1, 2, 3, \ldots.
\end{aligned} \tag{21.21} $$
The exact eigenvalues are of course
$$ E_n = (n + \tfrac{1}{2})\hbar\omega \qquad n = 0, 1, 2, 3, \ldots. $$
For the simple harmonic oscillator, the quasiclassical energy eigenvalues are
exactly correct. [[The energy eigenfunctions are not.]]
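When ν → ∞ the power-law potential becomes an infinite square well
of width L = 2. Equation (21.20) becomes
$$ \begin{aligned}
E_n &= \left[(n - \tfrac{1}{2})\pi\hbar\,\frac{\alpha^{1/\infty}}{(8m)^{1/2} I(\infty)}\right]^2 \\
&= \left[(n - \tfrac{1}{2})\pi\hbar\,\frac{1}{(8m)^{1/2}}\right]^2 \\
&= (n - \tfrac{1}{2})^2\,\frac{\pi^2\hbar^2}{8m}.
\end{aligned} \tag{21.22} $$
The exact eigenvalues are (when L = 2)
$$ E_n = \frac{\pi^2\hbar^2}{2mL^2}\,n^2 = \frac{\pi^2\hbar^2}{8m}\,n^2. $$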
Not bad for an approximation.
When ν → 0 the power-law potential becomes the flat, constant poten-
tial V (x) = α. This “free particle” potential admits no bound states. How
will the quasiclassical approximation deal with this?
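$$ \begin{aligned}
E_n &= \left[(n - \tfrac{1}{2})\pi\hbar\,\frac{\alpha^{1/\nu}}{(8m)^{1/2} I(\nu)}\right]^{2\nu/(2+\nu)} \\
&= \frac{\alpha^{2/(2+\nu)}\left[(n - \tfrac{1}{2})\pi\hbar\right]^{2\nu/(2+\nu)}}{(8m)^{\nu/(2+\nu)}\,I(\nu)^{2\nu/(2+\nu)}} \\
&\to \frac{\alpha\left[(n - \tfrac{1}{2})\pi\hbar\right]^0}{(8m)^0\,I(\nu)^\nu} \\
&\to \frac{\alpha}{I(\nu)^\nu}.
\end{aligned} $$
But what is $I(\nu)^\nu$ for small ν? We’ve already seen at equation (21.19) that
it is $\sqrt{\nu}^{\,\nu}\,(\text{some positive number})^\nu$. The right factor goes to 1, and so does the left:
$$ \sqrt{\nu}^{\,\nu} = \nu^{\nu/2} = e^{(\nu\ln\nu)/2} \to e^0 = 1. $$
Thus as ν → 0,
$$ E_n \to \alpha \quad\text{for all values of } n. \tag{21.23} $$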
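The three limits above can be checked numerically. The following is a sketch under stated assumptions: it evaluates equation (21.20) in units where ħ = m = 1 (my choice for illustration), uses the gamma-function form of I(ν), and works with logarithms so that $\alpha^{1/\nu}$ does not overflow for small ν. The function names are hypothetical.

```python
import math

def I_gamma(nu):
    """I(nu) = sqrt(pi)/(2 + nu) * Gamma(1/nu) / Gamma(1/nu + 1/2)."""
    log_ratio = math.lgamma(1.0 / nu) - math.lgamma(1.0 / nu + 0.5)
    return math.sqrt(math.pi) / (2.0 + nu) * math.exp(log_ratio)

def wkb_energy(n, nu, alpha=1.0, m=1.0, hbar=1.0):
    """Quasiclassical eigenvalue E_n of equation (21.20), for n = 1, 2, 3, ... and alpha > 0."""
    # logarithm of the bracketed quantity in (21.20)
    log_bracket = (math.log((n - 0.5) * math.pi * hbar)
                   + math.log(alpha) / nu
                   - 0.5 * math.log(8.0 * m)
                   - math.log(I_gamma(nu)))
    return math.exp(2.0 * nu / (2.0 + nu) * log_bracket)

# nu = 2 with alpha = (1/2) m omega^2 and omega = 1: expect (n - 1/2) hbar omega
print([wkb_energy(n, 2.0, alpha=0.5) for n in (1, 2, 3)])

# large nu: expect roughly (n - 1/2)^2 pi^2 hbar^2 / (8 m)
print(wkb_energy(1, 1.0e4), (0.5 * math.pi) ** 2 / 8.0)

# small nu: expect roughly alpha for every n
print([wkb_energy(n, 1.0e-3, alpha=3.0) for n in (1, 5)])
```

The approach to the ν → 0 limit is slow, because the correction involves ν ln ν, so even at ν = 0.001 the energies differ from α by a fraction of a percent.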
Chapter 22
The Interaction of Matter and Radiation
Two questions:
(1) Our theorem says that atoms stay in an excited energy state forever! Can that be right?
(2) Atoms are said to absorb light of only one frequency . . . what, will an atom absorb light of
wavelength 471.3428 nm but not 471.3427 nm?
Strangely, we start our quest to solve these problems by figuring out
how to solve differential equations.
22.1 Perturbation Theory for the Time Development Problem
By now, you have realized that quantum mechanics is an art of approximations.
I make no apologies for this: After all, physics is an art of approximations.
(The classical “three-body problem” has never been solved
exactly, and never will be.) Indeed, life is an art of approximations. (If
you’re waiting for the perfect boyfriend or girlfriend before making a com-
mitment, you’ll be waiting for a long time — and for some, that long wait
is a poor solution to the problem of life.)
Furthermore, much of the fun and creativity of theoretical physics comes
from finding applicable approximations. If theoretical physics were nothing
but turning a mathematical crank to mechanically grind out solutions, it
would not be exciting. I do not apologize for the fact that, to do theoretical
physics, you have to think!
22.2 Setup
Here’s our problem:
Solve the initial value problem for the Hamiltonian
$$ \hat{H}(t) = \hat{H}^{(0)} + \hat{H}'(t) \tag{22.1} $$
given the solution $\{|n\rangle\}$ of the unperturbed energy eigenproblem
$$ \hat{H}^{(0)}|n\rangle = E_n|n\rangle. \tag{22.2} $$
Here we’re thinking of $\hat{H}'(t)$ as being in some sense “small” compared to
the unperturbed Hamiltonian $\hat{H}^{(0)}$. One common example is a burst of
light shining on an atom. Note also that it doesn’t make sense to solve
the energy eigenproblem for $\hat{H}(t)$, because this Hamiltonian depends upon
time, so it doesn’t have stationary state solutions!
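We solve this problem by expanding the solution $|\psi(t)\rangle$ in the basis
$\{|n\rangle\}$:
$$ |\psi(t)\rangle = \sum_n C_n(t)|n\rangle \quad\text{where}\quad C_n(t) = \langle n|\psi(t)\rangle. \tag{22.3} $$
Once we know the $C_n(t)$, we’ll know the solution $|\psi(t)\rangle$. Now, the state
vector evolves according to
$$ \frac{d}{dt}|\psi(t)\rangle = -\frac{i}{\hbar}\hat{H}|\psi(t)\rangle \tag{22.4} $$
so the expansion coefficients evolve according to
$$ \begin{aligned}
\frac{dC_n(t)}{dt} &= -\frac{i}{\hbar}\langle n|\hat{H}|\psi(t)\rangle \\
&= -\frac{i}{\hbar}\sum_m \langle n|\hat{H}|m\rangle\,C_m(t) \\
&= -\frac{i}{\hbar}\sum_m \left[\langle n|\hat{H}^{(0)}|m\rangle + \langle n|\hat{H}'|m\rangle\right] C_m(t) \\
&= -\frac{i}{\hbar}\sum_m \left[E_m\,\delta_{m,n} + H'_{n,m}\right] C_m(t) \\
&= -\frac{i}{\hbar}\left[E_n C_n(t) + \sum_m H'_{n,m}\,C_m(t)\right].
\end{aligned} \tag{22.5} $$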
This result is exact: we have yet to make any approximation.

Now, if $\hat{H}'(t)$ vanished, the solutions would be
$$ C_n(t) = C_n(0)\,e^{-(i/\hbar)E_n t}, \tag{22.6} $$
which motivates us to define new variables $c_n(t)$ through
$$ C_n(t) = c_n(t)\,e^{-(i/\hbar)E_n t}. \tag{22.7} $$
Because the “bulk of the time development” comes through the $e^{-(i/\hbar)E_n t}$
term, the $c_n(t)$ presumably have “less time dependence” than the $C_n(t)$.
In other words, we expect the $c_n(t)$ to vary slowly with time.
Plugging this definition into the time development equation (22.5) gives
$$ \frac{dc_n(t)}{dt}\,e^{-(i/\hbar)E_n t} + c_n(t)\left(-\frac{i}{\hbar}E_n\right)e^{-(i/\hbar)E_n t}
= -\frac{i}{\hbar}\left[E_n c_n(t)\,e^{-(i/\hbar)E_n t} + \sum_m H'_{n,m}\,c_m(t)\,e^{-(i/\hbar)E_m t}\right] \tag{22.8} $$
or
$$ \frac{dc_n(t)}{dt} = -\frac{i}{\hbar}\sum_m H'_{n,m}\,c_m(t)\,e^{+(i/\hbar)(E_n - E_m)t}. \tag{22.9} $$
Once again, this equation is exact. Its formal solution, given the initial
values $c_n(0)$, is
$$ c_n(t) = c_n(0) - \frac{i}{\hbar}\sum_m \int_0^t H'_{n,m}(t')\,c_m(t')\,e^{+(i/\hbar)(E_n - E_m)t'}\,dt'. \tag{22.10} $$
This set of equations (one for each basis element) is exact, but at first
glance seems useless. The unknown quantities $c_n(t)$ are present on the left-hand
sides, but also on the right-hand sides.
We make progress using our idea that the coefficients $c_n(t)$ are changing
slowly. In a very crude approximation, we can think that they’re not
changing at all. So on the right-hand side of equation (22.10) we plug in
not functions, but the constants $c_m(t') = c_m(0)$, namely the given initial
conditions.
Having made that approximation, we can now perform the integrations
and produce, on the left-hand side of equation (22.10), functions of time
cn (t). These coefficients aren’t exact, because they were based on the crude
approximation that the coefficients were constant in time, but they’re likely
to be better approximations than we started off with.
Now, armed with these more accurate coefficients, we can plug these
into the right-hand side of equation (22.10), perform the integration, and
produce yet more accurate coefficients on the left-hand side. This process
can be repeated over and over, for as long as our stamina lasts.
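[Figure: flowchart of the iteration. Begin with the initial condition; place $c_m(t')$ on the right-hand side; obtain $c_n(t)$ on the left-hand side; ask “tired?”; if yes, stop; if no, feed the new coefficients back into the right-hand side.]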
There is actually a theorem assuring us that this process will converge!
Theorem (Picard¹) If the matrix elements $H'_{n,m}(t)$ are continuous in time
and bounded, and if the basis is finite, then this method converges to
the correct solution.
The theorem does not tell us how many iterations will be needed to reach
a desired accuracy. In practice, one usually stops upon reaching the first
non-zero correction.
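The iteration described above can be tried out on a computer. What follows is a minimal sketch, not a definitive implementation: it applies equation (22.10) repeatedly on a uniform time grid, doing the time integral with the cumulative trapezoid rule. The function name `picard_iterate`, the grid size, the units (ħ = 1), and the two-level test system with a constant coupling V are all my own illustrative assumptions. For that test system the exact transition probability is known to be $(V/\Omega)^2\sin^2(\Omega t)$ with $\Omega = \sqrt{V^2 + \Delta^2/4}$, where Δ is the energy gap, which provides a check.

```python
import cmath
import math

def picard_iterate(energies, h_prime, c0, t_max, n_steps=2000, n_iter=25, hbar=1.0):
    """Iterate equation (22.10) on a uniform time grid.

    energies : list of unperturbed energies E_n
    h_prime  : function (n, m, t) returning the matrix element H'_{n,m}(t)
    c0       : list of initial coefficients c_n(0)
    Returns (times, c), where c[n][j] approximates c_n(times[j]).
    """
    dim = len(energies)
    dt = t_max / n_steps
    times = [j * dt for j in range(n_steps + 1)]
    # zeroth approximation: coefficients frozen at their initial values
    c = [[complex(c0[n])] * (n_steps + 1) for n in range(dim)]
    for _ in range(n_iter):
        new_c = []
        for n in range(dim):
            # integrand of (22.10) at each grid time, built from the previous iterate
            f = [sum(h_prime(n, m, t) * c[m][j]
                     * cmath.exp(1j * (energies[n] - energies[m]) * t / hbar)
                     for m in range(dim))
                 for j, t in enumerate(times)]
            # cumulative trapezoid rule for the integral from 0 to times[j]
            row = [complex(c0[n])]
            integral = 0j
            for j in range(1, n_steps + 1):
                integral += 0.5 * dt * (f[j - 1] + f[j])
                row.append(c0[n] - (1j / hbar) * integral)
            new_c.append(row)
        c = new_c
    return times, c

# Illustrative two-level system: energy gap 1, constant off-diagonal coupling V
V, gap = 0.2, 1.0
coupling = lambda n, m, t: V if n != m else 0.0
times, c = picard_iterate([0.0, gap], coupling, [1.0, 0.0], t_max=10.0)
rabi = math.sqrt(V**2 + gap**2 / 4)
exact = (V / rabi) ** 2 * math.sin(rabi * times[-1]) ** 2
print("iterated:", abs(c[1][-1]) ** 2, "  exact:", exact)
```

Calling the same function with `n_iter=1` gives the first-order result, which for this system is $4V^2\sin^2(\Delta t/2)/\Delta^2$; comparing the two shows how much the higher iterations matter once the transition probability is no longer small.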
In particular, if the initial state is some eigenstate $|a\rangle$ of the unperturbed
Hamiltonian $\hat{H}^{(0)}$, then to first order
$$ c_n(t) = -\frac{i}{\hbar}\int_0^t H'_{n,a}(t')\,e^{+(i/\hbar)(E_n - E_a)t'}\,dt' \quad\text{for } n \neq a \tag{22.11} $$
$$ c_a(t) = 1 - \frac{i}{\hbar}\int_0^t H'_{a,a}(t')\,dt' $$
If the system is in energy state $|a\rangle$ at time zero, then the probability of
finding it in energy state $|b\rangle$ at time t, through the influence of perturbation
$\hat{H}'(t)$, is called the transition probability
$$ P_{a\to b}(t) = |C_b(t)|^2 = |c_b(t)|^2. \tag{22.12} $$
Example: An electron bound to an atom is approximated by a one-dimensional
simple harmonic oscillator of natural frequency $\omega_0$. The oscillator
is in its ground state $|0\rangle$ and then exposed to light of electric field
amplitude $E_0$ and frequency ω for time t. (The light is polarized in the direction
of the oscillations.) What is the probability (in first-order perturbation
theory) of ending up in state $|b\rangle$?
1 Émile Picard (1856–1941) made immense contributions to complex analysis and to
the theory of differential equations. He wrote one of the first textbooks concerning the
theory of relativity, and married the daughter of Charles Hermite.
Solution part A — What is the Hamiltonian? If it were a classical

particle of charge −e exposed to electric field E0 sin ωt, it would experi-
ence a force −eE0 sin ωt and hence have a potential energy of eE0 x sin ωt.
(We can ignore the spatial variation of electric field because the electron
is constrained to move only up and down — that’s our “one dimensional”
assumption. We can ignore magnetic field for the same reason.)
The quantal Hamiltonian is then
$$ \hat{H} = \frac{\hat{p}^2}{2m} + \frac{m\omega_0^2}{2}\hat{x}^2 + eE_0\,\hat{x}\sin\omega t. \tag{22.13} $$
We identify the first two terms as the time-independent Hamiltonian $\hat{H}^{(0)}$
and the last term as the perturbation $\hat{H}'(t)$.
Solution part B — Apply perturbation theory. The matrix element is
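$$ H'_{n,0}(t) = \langle n|\hat{H}'(t)|0\rangle = eE_0\sin\omega t\,\langle n|\hat{x}|0\rangle = eE_0\sin\omega t\,\sqrt{\frac{\hbar}{2m\omega_0}}\,\delta_{n,1}. \tag{22.14} $$
(Remember your raising and lowering operators! See equation (D.31).)
Invoking equations (22.11), we obtain
$$ c_n(t) = 0 \quad\text{for } n \neq 0, 1 \tag{22.15} $$
$$ c_1(t) = -\frac{i}{\hbar}\,eE_0\sqrt{\frac{\hbar}{2m\omega_0}}\int_0^t \sin\omega t'\,e^{i\omega_0 t'}\,dt' \tag{22.16} $$
$$ c_0(t) = 1 \tag{22.17} $$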
We will eventually need to perform the time integral in equation (22.16),
but even before doing so the main qualitative features are clear: First,
probability is not conserved within first order perturbation theory. The
probability of remaining in the ground state is 1, but the probability of
transition to the first excited state is nonzero! Second, to first order transitions
go only to the first excited state. This is an example of a selection rule.
The time integral in equation (22.16) will be evaluated at equation (22.30).
For now, let’s just call it I(t). In terms of this integral,
the transition probabilities are
$$ P_{0\to b}(t) = 0 \quad\text{for } b \neq 0, 1 \tag{22.18} $$
$$ P_{0\to 1}(t) = \frac{e^2E_0^2}{2m\hbar\omega_0}\,I(t)\,I^*(t) \tag{22.19} $$
$$ P_{0\to 0}(t) = 1 \tag{22.20} $$
22.3 Light absorption
How do atoms absorb light?

More specifically, if an electron in atomic energy eigenstate $|a\rangle$ (usually
but not always the ground state) is exposed to a beam of monochromatic,
polarized light for time t, what is the probability of it ending up
in atomic energy eigenstate $|b\rangle$? We answer this question to first order in
time-dependent perturbation theory.
First, we need to find the effect of light on the electron. We’ll treat
the light classically — that is, we’ll ignore the quantization of the electro-
magnetic field (quantum electrodynamics) that gives rise to the concept
of photons. Consider the light wave (polarized in the k̂ direction, with
frequency ω) as an electric field
$$ \vec{E}(\vec{r}, t) = E_0\,\hat{k}\sin(\vec{k}\cdot\vec{r} - \omega t). \tag{22.21} $$
Presumably, the absorption of light by the atom will result in some sort
of diminution of the light beam’s electric field, but we’ll ignore that. (A
powerful beam from a laser will be somewhat diminished when some of
the light is absorbed by a single atom, but not a great deal.) The light
beam has a magnetic field as well as an electric field, but the magnetic
field amplitude is B0 = E0 /c, so the electric force is on the order of eE0
while the magnetic force is on the order of evB0 = e(v/c)E0 . Since the
electron moves at non-relativistic speeds, v/c ≪ 1 and we can ignore the
magnetic effect. Finally, the electric field at one side of the atom differs
from the electric field at the other side of the atom, but the atom is so small
compared to the wavelength of light (atom: about 0.1 nm; wavelength of
violet light: about 400 nm) that we can safely ignore this also.
Using these approximations, the force experienced by an electron due
to the light beam is
$$ \vec{F}(t) = -eE_0\,\hat{k}\sin(\omega t), \tag{22.22} $$
so the associated potential energy is
$$ U(t) = eE_0\,z\sin(\omega t). \tag{22.23} $$
Turning this classical potential energy into a quantal operator gives
$$ \hat{H}'(t) = eE_0\,\hat{z}\sin(\omega t). \tag{22.24} $$
(Note that the hat k̂ in equation (22.22) signifies unit vector, whereas the
hat ẑ in equation (22.24) signifies quantal operator. I’m sorry for any confu-
sion. . . there just aren’t enough symbols in the world to represent everything
unambiguously!)
Now that we have the quantal operator for the perturbation, we can turn
to the time-dependent perturbation theory result (22.11). (Is it legitimate
to use perturbation theory in this case? See the problem.)
For all of the atomic energy states $|a\rangle$ we’ve considered in this book,
$$ H'_{a,a}(t) = \langle a|\hat{H}'(t)|a\rangle = eE_0\langle a|\hat{z}|a\rangle\sin(\omega t) = 0, \tag{22.25} $$
whence $c_a(t) = 1$ and $P_{a\to a} = 1$. Most of the atoms don’t make transitions.
But what about those that do? For these we need to find the matrix
elements
$$ H'_{b,a}(t) = \langle b|\hat{H}'(t)|a\rangle = eE_0\langle b|\hat{z}|a\rangle\sin(\omega t). \tag{22.26} $$
These are just the $z_{b,a}$ matrix elements that we calculated for the Stark
effect. (And after all, what we’re considering here is just the Stark effect
with an oscillating electric field.) The transition amplitudes are
$$ c_b(t) = -\frac{i}{\hbar}\,eE_0\langle b|\hat{z}|a\rangle\int_0^t \sin(\omega t')\,e^{+(i/\hbar)(E_b - E_a)t'}\,dt'. \tag{22.27} $$
It is convenient (and conventional!) to follow the lead of Einstein’s $\Delta E = \hbar\omega$
and define
$$ E_b - E_a = \hbar\omega_0. \tag{22.28} $$
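The time integral is then
$$ \begin{aligned}
\int_0^t \sin(\omega t')\,e^{i\omega_0 t'}\,dt'
&= \int_0^t \frac{e^{+i\omega t'} - e^{-i\omega t'}}{2i}\,e^{i\omega_0 t'}\,dt' \\
&= \frac{1}{2i}\left[\int_0^t e^{i(\omega_0 + \omega)t'}\,dt' - \int_0^t e^{i(\omega_0 - \omega)t'}\,dt'\right] \\
&= \frac{1}{2i}\left[\frac{e^{i(\omega_0 + \omega)t'}}{i(\omega_0 + \omega)} - \frac{e^{i(\omega_0 - \omega)t'}}{i(\omega_0 - \omega)}\right]_0^t \\
&= -\frac{1}{2}\left[\frac{e^{i(\omega_0 + \omega)t} - 1}{\omega_0 + \omega} - \frac{e^{i(\omega_0 - \omega)t} - 1}{\omega_0 - \omega}\right].
\end{aligned} \tag{22.29} $$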
Enrico Fermi thought about this expression and realized that in most cases
it would not be substantial (as reflected in the fact that Pa→a = 1). The
numerators are complex numbers in magnitude between 0 and 2. For light,
we’re thinking of frequencies ω near ZZZ. The only case when this expres-
sion is big, is when ω ≈ ω0 , and when that’s true only the right-hand part
is big. So it’s legitimate to ignore the left-hand part and write
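$$ \begin{aligned}
\int_0^t \sin(\omega t')\,e^{i\omega_0 t'}\,dt'
&\approx -\frac{1}{2}\left[-\frac{e^{i(\omega_0 - \omega)t} - 1}{\omega_0 - \omega}\right] \\
&= \frac{1}{2}\,e^{i(\omega_0 - \omega)t/2}\left[\frac{e^{i(\omega_0 - \omega)t/2} - e^{-i(\omega_0 - \omega)t/2}}{\omega_0 - \omega}\right] \\
&= \frac{1}{2}\,e^{i(\omega_0 - \omega)t/2}\left[\frac{2i\sin((\omega_0 - \omega)t/2)}{\omega_0 - \omega}\right] \\
&= i\,e^{i(\omega_0 - \omega)t/2}\,\frac{\sin((\omega_0 - \omega)t/2)}{\omega_0 - \omega} \\
&= i\,e^{-i(\omega - \omega_0)t/2}\,\frac{\sin((\omega - \omega_0)t/2)}{\omega - \omega_0}.
\end{aligned} \tag{22.30} $$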
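How good is it to drop the left-hand part? The sketch below evaluates the full result (22.29), the near-resonance form (22.30), and a direct trapezoid-rule quadrature of the original integral, so that the three can be compared. The function names and the sample frequencies are illustrative assumptions; both closed forms are undefined at exactly ω = ω0, where the limit must be taken instead.

```python
import cmath
import math

def time_integral_exact(omega, omega0, t):
    """Right-hand side of equation (22.29); requires omega != omega0."""
    plus = (cmath.exp(1j * (omega0 + omega) * t) - 1) / (omega0 + omega)
    minus = (cmath.exp(1j * (omega0 - omega) * t) - 1) / (omega0 - omega)
    return -0.5 * (plus - minus)

def time_integral_resonant(omega, omega0, t):
    """Near-resonance approximation, equation (22.30); requires omega != omega0."""
    d = omega - omega0
    return 1j * cmath.exp(-1j * d * t / 2) * math.sin(d * t / 2) / d

def time_integral_quadrature(omega, omega0, t, n_steps=20_000):
    """Trapezoid-rule estimate of the integral of sin(omega t') exp(i omega0 t') from 0 to t."""
    dt = t / n_steps
    f = lambda tp: math.sin(omega * tp) * cmath.exp(1j * omega0 * tp)
    total = 0.5 * (f(0.0) + f(t))
    for j in range(1, n_steps):
        total += f(j * dt)
    return total * dt

# The dropped term has magnitude at most 1/(omega0 + omega)
omega0, omega, t = 1000.0, 1001.0, 5.0
exact = time_integral_exact(omega, omega0, t)
approx = time_integral_resonant(omega, omega0, t)
print(abs(exact - approx), 1.0 / (omega0 + omega))
```

The difference between (22.29) and (22.30) is exactly the left-hand part, whose magnitude can never exceed 1/(ω0 + ω), while the retained part grows as large as t/2 at resonance.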
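Plugging this approximation for the integral into equation (22.27) produces
$$ c_b(t) = \frac{eE_0\langle b|\hat{z}|a\rangle}{\hbar}\,e^{-i(\omega - \omega_0)t/2}\,\frac{\sin((\omega - \omega_0)t/2)}{\omega - \omega_0}. \tag{22.31} $$
The transition probability is then
$$ P_{a\to b} = \frac{e^2E_0^2\,|\langle b|\hat{z}|a\rangle|^2}{\hbar^2}\,\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2}. \tag{22.32} $$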
This rule, like all rules,2 has limits on its applicability: we’ve already men-
tioned that it applies when the wavelength of light is much larger than an
atom, when the light can be treated classically, when ω ≈ ω0 , etc. Most
importantly, it applies only when the transition probability is small, be-
cause when that probability is large the whole basis of perturbation theory
breaks down. You might think that with all these restrictions, it’s not a
very important result. You’d be wrong. In fact Fermi used it so often that
he called it “the golden rule.”
2 A father needs to leave his child at home for a short time. Concerned for his child’s
safety, he issues the sensible rule “Don’t leave home while I’m away.” While the father
is away, the home catches fire. Should the child violate the rule?
Physical implications of Fermi’s golden rule
We have derived Fermi’s golden rule, but that’s only the start and not the
end of our quest to answer the question of “How do atoms absorb light?”.
What does Fermi’s golden rule say about nature? First, we’ll think of the
formula as a function of frequency ω for fixed time t, then we’ll think of
the formula as a function of time t at fixed frequency ω.
Write the transition probability as
$$ P_{a\to b} = A\,\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2} \tag{22.33} $$
where the value of A is independent of both frequency and time. Clearly,
this expression is always positive or zero (good thing!) and is symmetric
about the natural transition frequency $\omega_0$. The expression never exceeds
the time-independent “envelope function” $A/(\omega - \omega_0)^2$. The transition
probability vanishes when
$$ \omega - \omega_0 = N\pi/t, \qquad N = \pm 2, \pm 4, \pm 6, \ldots $$
while it touches the envelope when
$$ \omega - \omega_0 = N\pi/t, \qquad N = \pm 1, \pm 3, \pm 5, \ldots. $$
What about when $\omega = \omega_0$? Here you may use l’Hôpital’s rule, or the
approximation
$$ \sin\theta \approx \theta \quad\text{for } \theta \ll 1, $$
but either way you’ll find that
$$ \text{when } \omega = \omega_0, \quad P_{a\to b} = At^2/4. \tag{22.34} $$
In short, the transition probability as a function of ω looks like this graph:
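[Figure: transition probability P versus ω, with a central peak of height $At^2/4$ at ω = ω0 and the interval π/t marked on the ω axis.]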
Problem: Show that if the central maximum has value $P_{\max}$, then
the first touching of the envelope (at $\omega - \omega_0 = \pi/t$) has value
$(4/\pi^2)P_{\max} = 0.405\,P_{\max}$, the second touching (at $\omega - \omega_0 = 3\pi/t$) has
value $(4/9\pi^2)P_{\max} = 0.045\,P_{\max}$, and the third (at $\omega - \omega_0 = 5\pi/t$) has
value $(4/25\pi^2)P_{\max} = 0.016\,P_{\max}$. Notice that these ratios are independent
of time.
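A numerical sanity check on these numbers (not a substitute for the derivation the problem asks for) can be sketched as follows. The function name `lineshape` and the sample time t = 3 are arbitrary illustrative choices; the constant A is set to 1 because it cancels in every ratio.

```python
import math

def lineshape(delta, t, A=1.0):
    """P = A sin^2(delta t / 2) / delta^2, with delta = omega - omega0.

    At delta = 0 the limiting value A t^2 / 4 of equation (22.34) is returned.
    """
    if delta == 0.0:
        return A * t * t / 4.0
    return A * math.sin(delta * t / 2.0) ** 2 / delta**2

t = 3.0
p_max = lineshape(0.0, t)
for N in (1, 3, 5):
    ratio = lineshape(N * math.pi / t, t) / p_max
    print(N, ratio, 4.0 / (N * math.pi) ** 2)

# the zeros fall at even multiples of pi/t
print(lineshape(2 * math.pi / t, t))
```

Changing t rescales both the peak height and the frequency axis, but leaves the printed ratios unchanged.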
There are several unphysical elements of this graph. It gives a result even
at ω = 0 . . . indeed, even when ω is negative! But the formula was derived
assuming $\omega \approx \omega_0$, so we don’t expect it to give physically reasonable results
in this regime. In time, the maximum transition probability $At^2/4$ will grow
to be very large, in fact even larger than one! But the formula was derived
assuming a small transition probability, and becomes invalid long before
such an absurdity happens.
This result may help you with a conundrum. You have perhaps been
told something like: “To excite hydrogen from the ground state to the first
excited state, a transition with $\Delta E = \tfrac{3}{4}$ Ry, you must supply a photon
with energy exactly equal to $\tfrac{3}{4}$ Ry, that is with frequency $\omega_0 = \tfrac{3}{4}\,\mathrm{Ry}/\hbar$,
or in other words with wavelength 121.502 273 nm.” You know that no
laser produces light with the exact wavelength of 121.502 273 nm. If the
photon had to have exactly that wavelength, there would almost never be
a transition. But the laser doesn’t need to have exactly that wavelength:
as you can see, there’s some probability of absorbing light that differs a bit
from the natural frequency $\omega_0$.
Problem: Show that the width of the central peak, from zero to zero,
is 4π/t.
One aspect of the transition probability expression is quite natural: The

light most effective at promoting a transition is light with frequency ω equal
to the transition’s natural frequency ω0 . Also natural is that the effective-
ness decreases as ω moves away from ω0 , until the transition probability
vanishes entirely at ω = ω0 ±2π/t. But then a puzzling phenomenon sets in:
as ω moves still further away from ω0 , the transition probability increases.
This increase is admittedly slight, but nonetheless it exists, and I know of
no way to explain it in physical terms. I do point out, however, that this
puzzling phenomenon does not exist for light pulses of Gaussian form: see
problem 22.5, “Gaussian light pulse”.
Now, investigate the formula (22.33) as a function of time t at fixed

light frequency ω. This seems at first to be a much simpler task, because
the graph is trivial:
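[Figure: transition probability versus time t at fixed ω, oscillating between zero and a fixed maximum, with the period 2π/(ω − ω0) marked on the t axis.]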
But now reflect upon the graph. We have a laser set to make transitions
from |ai to |bi. We turn on the laser, and the probability of that transition
increases. So far, so good. Now we keep the laser on, but the probability
decreases! And if we keep it on for exactly the right amount of time, there
is zero probability for a transition. It’s as if we were driving a nail into a
board with a hammer. The first few strikes push the nail into the board,
but with continued strikes the nail backs out of the board, and it eventually
pops out altogether!
How can this be? Certainly, no nail that I’ve hammered has ever behaved
this way! The point is that there are two routes to get from $|a\rangle$ to $|a\rangle$:
You can go from $|a\rangle$ to $|b\rangle$ and then back to $|a\rangle$, or you can stay always in
$|a\rangle$, that is go from $|a\rangle$ to $|a\rangle$ to $|a\rangle$. There is an amplitude associated with
each route. If these two amplitudes interfere constructively, there is a high
probability of remaining in $|a\rangle$ (a low probability of transitioning to $|b\rangle$).
If these two amplitudes interfere destructively, there is a low probability
of remaining in $|a\rangle$ (a high probability of transitioning to $|b\rangle$). This wavy
graph is a result of interference between two routes: not paths in position
space, but routes through energy eigenstates.³
This phenomenon is called “Rabi oscillation”, and it’s the pulse at the
heart of an atomic clock.
3 This point of view is developed extensively in R.P. Feynman and A.R. Hibbs, Quantum
Mechanics and Path Integrals (D.F. Styer, emending editor, Dover Publications,
Mineola, New York, 2010) pages 116–117, 144–147.
22.4 Absorbing incoherent light
For coherent, z-polarized, x-directed, long-wavelength, non-magnetic, classical,
non-diminishing light, in the approximation of first-order time-dependent
perturbation theory, and with $\omega \approx \omega_0$, the transition probability
is
$$ P_{a\to b} = \frac{e^2E_0^2}{\hbar^2}\,|\langle b|\hat{z}|a\rangle|^2\,\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2}. \tag{22.35} $$
The classical energy density (average energy per volume) of an electromagnetic
wave is $u = \epsilon_0 E_0^2/2$, where $\epsilon_0$ is the famous vacuum permittivity that
appears as $1/(4\pi\epsilon_0)$ in Coulomb’s law, so this result is often written
$$ P_{a\to b} = \frac{2e^2 u}{\epsilon_0\hbar^2}\,|\langle b|\hat{z}|a\rangle|^2\,\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2}. \tag{22.36} $$
What if the light is polarized but not coherent? In this case light comes
at varying frequencies. Writing the energy density per frequency as ρ(ω),
the transition probability due to light of frequency ω to ω + dω is
$$ P'_{a\to b} = \frac{2e^2\rho(\omega)\,d\omega}{\epsilon_0\hbar^2}\,|\langle b|\hat{z}|a\rangle|^2\,\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2}, \tag{22.37} $$
whence the total transition probability is
$$ P_{a\to b} = \frac{2e^2}{\epsilon_0\hbar^2}\,|\langle b|\hat{z}|a\rangle|^2\int_0^\infty \rho(\omega)\,\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2}\,d\omega. \tag{22.38} $$
[[We have assumed that the light components at various frequencies are independent,
so that the total transition probability is the sum of the individual
transition probabilities. If instead the light components were completely
correlated, then the total transition amplitude would be the sum of the
individual transition amplitudes. This is the case in problem 22.5, “Gaus-
sian light pulse”. If the light components were incompletely correlated but
not completely independent, then a hybrid approach would be needed.]] If
ρ(ω) is slowly varying relative to the absorption profile (22.33) — which it
almost always is — then it is accurate to approximate
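$$ P_{a\to b} = \frac{2e^2}{\epsilon_0\hbar^2}\,|\langle b|\hat{z}|a\rangle|^2\,\rho(\omega_0)\int_{-\infty}^{+\infty}\frac{\sin^2((\omega - \omega_0)t/2)}{(\omega - \omega_0)^2}\,d\omega, \tag{22.39} $$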
where I have changed the lower integration limit from 0 to −∞, with neg-
ligible change in Pa→b , because the integrand nearly vanishes whenever
ω < 0. Finally, the definite integral
$$ \int_{-\infty}^{+\infty}\frac{\sin^2 x}{x^2}\,dx = \pi $$
gives, for polarized incoherent light,
$$ P_{a\to b} = \frac{\pi e^2}{\epsilon_0\hbar^2}\,|\langle b|\hat{z}|a\rangle|^2\,\rho(\omega_0)\,t. \tag{22.40} $$
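The passage from (22.39) to (22.40) uses the substitution x = (ω − ω0)t/2, which turns the frequency integral into (t/2) times the definite integral quoted above. Here is a rough numerical check of both facts. It is only a sketch: the function names, the midpoint rule, and the finite cutoff `x_max` are my own choices, and the cutoff means the estimate falls short of π by roughly 1/x_max.

```python
import math

def sinc2_integral(x_max=2000.0, n_steps=400_000):
    """Midpoint estimate of the integral of sin^2(x)/x^2 over [-x_max, x_max]."""
    h = x_max / n_steps
    total = 0.0
    for k in range(n_steps):
        x = (k + 0.5) * h            # midpoints never land on x = 0
        total += (math.sin(x) / x) ** 2
    return 2.0 * total * h           # the integrand is even in x

def frequency_integral(t, x_max=2000.0, n_steps=400_000):
    """Integral over omega of sin^2((omega - omega0) t/2) / (omega - omega0)^2.

    With x = (omega - omega0) t/2 this equals (t/2) times the sinc-squared integral.
    """
    return 0.5 * t * sinc2_integral(x_max, n_steps)

print(sinc2_integral(), math.pi)
print(frequency_integral(4.0), math.pi * 4.0 / 2.0)
```

The linear growth with t in (22.40) comes entirely from this factor of t/2: the central peak grows in height like t² while its width shrinks like 1/t.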
The primary thing to note about this formula is the absence of Rabi
oscillations: it gives a far more familiar rate of transition. The second
thing is that the rate from $|b\rangle$ to $|a\rangle$ is equal to the rate from $|a\rangle$ to $|b\rangle$,
which is somewhat unusual: you might think that the rate to lose energy
($|b\rangle$ to $|a\rangle$) should be greater than the rate to gain energy ($|a\rangle$ to $|b\rangle$). [Just
as it’s easier to walk down a staircase than up the same staircase.]
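Finally, what if the light is not coherent, not polarized, and not directed?
(Such as the light in a room, that comes from all directions.) In this case
$$ P_{a\to b} = \frac{\pi e^2}{3\epsilon_0\hbar^2}\left[|\langle b|\hat{x}|a\rangle|^2 + |\langle b|\hat{y}|a\rangle|^2 + |\langle b|\hat{z}|a\rangle|^2\right]\rho(\omega_0)\,t. \tag{22.41} $$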
22.5 Absorbing and emitting light
Qualitative quantum electrodynamics
Of course we want to do better than the treatment above: Instead of treat-

ing a quantum mechanical atom immersed in a classical electromagnetic
field, we want a full quantum-mechanical treatment of the atom and the
light. Such a theory — quantum electrodynamics — has been developed
and it is a beautiful thing. Because light must travel at speed c this theory
is intrinsically relativistic and, while beautiful, also a very difficult thing.
We will not give it a rigorous treatment in this book. But this section
motivates the theory and discusses its qualitative character.
Most of this book discusses the quantum mechanics of atoms: The
Hamiltonian operator $\hat{H}_{\rm atom}$ has energy eigenstates like the ground state $|a\rangle$
and the excited state $|b\rangle$. The system can exist in any linear combination
of these states, such as $(|a\rangle - |b\rangle)/\sqrt{2}$. If the system starts off in one of the
energy states, including the excited state $|b\rangle$, it stays there forever.
You can also write down a Hamiltonian operator $\hat{H}_{\rm EM}$ for the electromagnetic
field. This operator has energy eigenstates. By convention, the
ground state is called $|\text{vacuum}\rangle$, one excited state is called $|1\ \text{photon}\rangle$, an
even more excited state is called $|2\ \text{photons}\rangle$. The field can also exist in
linear combinations such as $(|\text{vacuum}\rangle - |2\ \text{photons}\rangle)/\sqrt{2}$, but this state is
not a stationary state, and it does not have an energy.
You can do the classic things with field energy states: There’s an operator
for energy and an operator for photon position, but they don’t commute.
So in the state $|1\ \text{photon}\rangle$ the photon has an energy but no position.
There are linear combinations of energy states in which the photon does
have a position, but in these position states the electromagnetic field has
no energy.
But there’s even more: There is an operator for electric field at a given
location. And this operator doesn’t commute with either the Hamiltonian
or with the photon position operator.4 So in a state of electric field at some
given point, the photon does not have a position, and does not have an
energy. Anyone thinking of the photon as a “ball of light” — a wavepacket
of electric and magnetic fields — is thinking of a misconception. A photon
might have a “pretty well defined” position and a “pretty well defined”
energy and a “pretty well defined” field, but it can’t have an exact position
and an exact energy and an exact field at the same time.
If the entire Hamiltonian were $\hat{H}_{\rm atom} + \hat{H}_{\rm EM}$, then energy eigenstates
of the atom plus field would have the character of $|a\rangle|2\ \text{photons}\rangle$, or
$|b\rangle|\text{vacuum}\rangle$, and if you started off in such a state you would stay in it
forever. Note particularly the second example: if the atom started in an
excited state, it would never decay to the ground state, emitting light.
But since that process (called “spontaneous emission”) does happen, the
Hamiltonian Ĥatom +ĤEM must not be the whole story. There must be some
additional term in the Hamiltonian that involves both the atom and the
field: This term is called the “interaction Hamiltonian” Ĥint . (Sometimes
called the “coupling Hamiltonian”, because it couples — connects — the
atom and the field.) The full Hamiltonian is $\hat{H}_{\rm atom} + \hat{H}_{\rm EM} + \hat{H}_{\rm int}$. The state
$|b\rangle|\text{vacuum}\rangle$ is not an eigenstate of this full Hamiltonian: If you start off in
$|b\rangle|\text{vacuum}\rangle$, then at a later time there will be some amplitude to remain
in $|b\rangle|\text{vacuum}\rangle$, but also some amplitude to be in $|a\rangle|1\ \text{photon}\rangle$.
4 It’s clear, even without writing down the “EM field Hamiltonian” and the “electric
field at a given point” operators, that they do not commute: any operator that commutes
with the Hamiltonian is conserved, so if these two operators commuted then the electric
field at a given point would never change with time!
Einstein A and B argument
Back in 1916, Einstein wanted to know about both absorption and emission
of light by atoms, and — impatient as always — he didn’t want to wait
until a full theory of quantum electrodynamics was developed. So he came
up with the following argument — one of the cleverest in all of physics.
[Figure: three two-level diagrams, each with upper level |b⟩ and lower level |a⟩, labeled “absorption”, “stimulated emission”, and “spontaneous emission”.]
Einstein said that there were three processes going on, represented
schematically in the figure above. In absorption of radiation the atom
starts in its ground state $|a\rangle$ and ends in excited state $|b\rangle$, while the light
intensity at frequency $\omega_0$ is reduced. Although the reasoning leading to
equation (22.41) hadn’t yet been performed in 1916, Einstein thought it
reasonable that the probability of absorption would be given by some rate
coefficient $B_{ab}$, times the energy density of radiation with the proper frequency
for exciting the atom, times the time:
$$ P_{a\to b} = B_{ab}\,\rho(\omega_0)\,t. \tag{22.42} $$
In stimulated emission the atom starts in excited state $|b\rangle$ and, under
the influence of light, ends in ground state $|a\rangle$. After this happens the light
intensity at frequency $\omega_0$ increases due to the emitted light. In this process
the incoming light of frequency $\omega_0$ “shakes” the atom out of its excited
state. Einstein thought the probability for this process would be
$$ P_{b\to a} = B_{ba}\,\rho(\omega_0)\,t. \tag{22.43} $$
We know, from equation (22.41), that in fact $B_{ba} = B_{ab}$, but Einstein
didn’t know this so his argument doesn’t use this fact.
Finally, in spontaneous emission the atom starts in excited state $|b\rangle$
and ends in ground state $|a\rangle$, but it does so without any incoming light
to “shake” it. After spontaneous emission the light intensity at frequency
$\omega_0$ increases due to the emitted light. Because this process doesn’t rely on
incoming light, the probability of it happening doesn’t depend on $\rho(\omega_0)$.
Instead, Einstein thought, the probability would be simply
$$ P'_{b\to a} = At. \tag{22.44} $$
Einstein knew that this process had to happen, because excited atoms in
the dark can give off light and go to their ground state, but he didn’t have
a theory of quantum electrodynamics that would enable him to calculate
the rate coefficient A.
The coefficients $B_{ab}$, $B_{ba}$, and $A$ are independent of the properties of
the light, the number of atoms in state $|a\rangle$, the number of atoms in state
$|b\rangle$, etc.; they depend only upon the characteristics of the atom.
Now if you have a bunch of atoms, with $N_a$ of them in the ground state
and $N_b$ in the excited state, the rate of change of $N_a$ through these three
processes is
\[ \frac{dN_a}{dt} = -B_{ab}\, \rho(\omega_0)\, N_a + B_{ba}\, \rho(\omega_0)\, N_b + A N_b. \tag{22.45} \]
In equilibrium, by definition,
\[ \frac{dN_a}{dt} = 0. \tag{22.46} \]
In addition, in thermal equilibrium at temperature $T$, the following two
facts are true. The first is called the “Boltzmann distribution”,
\[ \frac{N_b}{N_a} = e^{-(E_b - E_a)/k_B T} = e^{-\hbar\omega_0/k_B T}, \tag{22.47} \]
where $k_B$ is the so-called “Boltzmann constant” that arises frequently in
thermal physics. The second is called the “energy density for light in thermal
equilibrium (blackbody radiation)”,
\[ \rho(\omega) = \frac{\hbar}{\pi^2 c^3}\, \frac{\omega^3}{e^{\hbar\omega/k_B T} - 1}, \tag{22.48} \]
where $c$ is the speed of light. [If you have taken a course in statistical
mechanics, you have certainly seen the first result. You might think you
haven’t seen the second result, but in fact it is a property of the ideal Bose
gas when the chemical potential $\mu$ vanishes.]
You might not yet know these two facts, but Einstein did. He combined
equation (22.46) and equation (22.45), finding
\[ \rho(\omega_0) = \frac{A N_b}{B_{ab} N_a - B_{ba} N_b}. \]
Then he used the Boltzmann distribution (22.47) to produce
\[ \rho(\omega_0) = \frac{A}{B_{ab}\, e^{\hbar\omega_0/k_B T} - B_{ba}} \tag{22.49} \]
and compared that to the blackbody result (22.48), producing
\[ \frac{A}{B_{ab}\, e^{\hbar\omega_0/k_B T} - B_{ba}} = \frac{\hbar}{\pi^2 c^3}\, \frac{\omega_0^3}{e^{\hbar\omega_0/k_B T} - 1}. \]
This result must hold for all temperatures $T$, and the coefficients $B_{ab}$, $B_{ba}$,
and $A$ are independent of $T$. Thus, Einstein reasoned, we must have
\[ B_{ab} = B_{ba} \equiv B \tag{22.50} \]
(which we already knew, but which was a discovery to Einstein) and hence
\[ \frac{A}{B\,(e^{\hbar\omega_0/k_B T} - 1)} = \frac{\hbar}{\pi^2 c^3}\, \frac{\omega_0^3}{e^{\hbar\omega_0/k_B T} - 1} \]
or, with temperature-dependent parts canceling on both sides,
\[ \frac{A}{B} = \frac{\hbar\omega_0^3}{\pi^2 c^3}. \tag{22.51} \]
The result is, of necessity, independent of temperature $T$. Einstein’s
argument uses thermal equilibrium not to discover the macroscopic
properties of matter, but as a vehicle to uncover microscopic details about the
relation between matter and radiation. We have no way to find $A$ from first
principles, but from the fact that thermal equilibrium exists we can find $A$
through
\[ A = \frac{\hbar\omega_0^3}{\pi^2 c^3}\, B = \frac{4h}{\lambda_0^3}\, B. \tag{22.52} \]
I hope you find this argument as astounding, and as beautiful, as I do.
It has the character of Einstein: First, it is not technically difficult, but it
combines the various elements in a way that I never would have thought
of, to produce a result that I thought would require working out the full theory
of quantum electrodynamics. Second, it turns the problem on its head:
The fundamental question is “Will microscopic actions always result in
macroscopic thermal equilibrium? If so, how fast will that equilibrium be
approached?” Einstein skips over the fundamental question and asks “We
know from observation that macroscopic thermal equilibrium does in fact
exist. How can we exploit this fact to find out about microscopic actions?”
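The balance behind Einstein's argument can be checked numerically. The sketch below is only an illustration, not part of the original argument: it assumes SI values for the constants, takes $B_{ab} = B_{ba} = B$ with an arbitrary placeholder value, sets $A$ from equation (22.51), and confirms that the right-hand side of (22.45) vanishes at several temperatures. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J·s
K_B = 1.380649e-23      # Boltzmann constant, J/K
C = 2.99792458e8        # speed of light, m/s


def planck_density(omega, temperature):
    """Blackbody energy density rho(omega) of equation (22.48)."""
    x = HBAR * omega / (K_B * temperature)
    return HBAR * omega**3 / (math.pi**2 * C**3) / math.expm1(x)


def relative_imbalance(omega0, temperature, b_coeff=1.0):
    """Right-hand side of (22.45), divided by the absorption term.

    Uses B_ab = B_ba = B from (22.50), A from (22.51), and N_b/N_a
    from (22.47). b_coeff is an arbitrary placeholder; it cancels.
    """
    a_coeff = HBAR * omega0**3 / (math.pi**2 * C**3) * b_coeff
    n_a = 1.0
    n_b = n_a * math.exp(-HBAR * omega0 / (K_B * temperature))
    rho = planck_density(omega0, temperature)
    absorption = b_coeff * rho * n_a
    rate = -absorption + b_coeff * rho * n_b + a_coeff * n_b
    return rate / absorption


omega0 = 2 * math.pi * C / 633e-9  # angular frequency of 633 nm light
for temperature in (300.0, 3000.0, 30000.0):
    print(temperature, relative_imbalance(omega0, temperature))
```

The printed imbalances should be zero to within floating-point rounding at every temperature, which is the content of the statement that the relation between $A$ and $B$ cannot depend on $T$.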
Numerical example: I would expect the stimulated decay rate $B\rho(\omega_0)$
to exceed the spontaneous emission rate $A$ (just as a jar on a shelf is more
likely to fall off when shaken than when left alone). On the other hand I’ve
found my expectations violated by quantum mechanics so frequently that
I can’t be sure. What is the ratio of $B\rho(\omega_0)$ to $A$ at room temperature for
the transition associated with the red light of a Helium-Neon laser ($\lambda_0 = 633$ nm)?
Use equation (22.49), with $B_{ab} = B_{ba} = B$, to write
\[ \frac{B\rho(\omega_0)}{A} = \frac{1}{e^{\hbar\omega_0/k_B T} - 1}. \tag{22.53} \]
Now at room temperature, $k_B T = \frac{1}{40}$ eV, so
\[ \frac{\hbar\omega_0}{k_B T} = \frac{hc}{\lambda_0 k_B T} = \frac{1240\ \text{eV}\cdot\text{nm}}{(633\ \text{nm})(\frac{1}{40}\ \text{eV})} = 78 \]
resulting in
\[ \frac{B\rho(\omega_0)}{A} = \frac{1}{e^{78} - 1} \approx e^{-78} \approx 10^{-34}. \]
My expectation has been violated: at room temperature the thermal radiation
at this frequency is so feeble that spontaneous emission overwhelmingly
dominates. At what temperature will the stimulated and spontaneous rates be equal?
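The arithmetic of this example is easy to reproduce. The following sketch evaluates equation (22.53) directly; the function name and the choice of 290 K as “room temperature” (so that $k_B T$ is close to 1/40 eV) are assumptions made for illustration.

```python
import math

HC_EV_NM = 1239.84          # h*c in eV·nm (the text rounds this to 1240)
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def stimulated_over_spontaneous(wavelength_nm, temperature_k):
    """Ratio B*rho(omega_0)/A from equation (22.53)."""
    x = HC_EV_NM / (wavelength_nm * K_B_EV_PER_K * temperature_k)
    return 1.0 / math.expm1(x)


# 290 K is an assumed room temperature, for which k_B*T is about 1/40 eV.
ratio = stimulated_over_spontaneous(633.0, 290.0)
print(f"B*rho/A at 290 K: {ratio:.1e}")
```

Calling the same function at other temperatures shows how quickly the ratio grows as the thermal radiation field strengthens.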
22.6 Problems
22.1 On being kicked upstairs

A particle in the ground state of an infinite square well is perturbed
by a transient effect described by the Hamiltonian (in coordinate
representation)
\[ H'(x; t) = A_0 \sin\!\left(\frac{2\pi x}{L}\right) \delta(t), \tag{22.54} \]
where $A_0$ is a constant with the dimensions of action. What is the
probability that after this jolt an energy measurement will find the
system in the first excited state?
22.2 Second-order time-dependent perturbation theory
At equation (22.16) we treated, to first order in perturbation theory,
the problem of a simple harmonic oscillator in its ground state exposed
to a sinusoidal external force (with frequency $\omega$ and amplitude $eE_0$).
We concluded that the only non-vanishing first-order transition amplitudes
were $c_0^{(1)}(t) = 1$ and $c_1^{(1)}(t)$. (Here the superscript (1) denotes
“first-order”.) Show that to second order the non-vanishing transition
amplitudes are:
\[ c_0^{(2)}(t) = 1 - \frac{i}{\hbar} \int_0^t H'_{01}(t')\, e^{-i\omega_0 t'}\, c_1^{(1)}(t')\, dt', \tag{22.55} \]
\[ c_1^{(2)}(t) = -\frac{i}{\hbar} \int_0^t H'_{10}(t')\, e^{+i\omega_0 t'}\, c_0^{(1)}(t')\, dt', \tag{22.56} \]
\[ c_2^{(2)}(t) = -\frac{i}{\hbar} \int_0^t H'_{21}(t')\, e^{+i\omega_0 t'}\, c_1^{(1)}(t')\, dt', \tag{22.57} \]
where
\[ H'_{01}(t) = H'_{10}(t) = eE_0 \sqrt{\frac{\hbar}{2m\omega_0}}\, \sin(\omega t), \tag{22.58} \]
and
\[ H'_{21}(t) = eE_0 \sqrt{\frac{2\hbar}{2m\omega_0}}\, \sin(\omega t). \tag{22.59} \]
The integrals for $c_0^{(2)}(t)$ and $c_2^{(2)}(t)$ are not worth working out, but it
is worth noticing that $c_2^{(2)}(t)$ involves a factor of $(eE_0)^2$ (where $eE_0$ is
in some sense “small”), and that $c_1^{(2)}(t) = c_1^{(1)}(t)$.
22.3 Is light a perturbation?
Is it legitimate to use perturbation theory in the case of light absorbed
by an atom? After all, we’re used to thinking of the light from a
powerful laser as a big effect, not a tiny perturbation. However, whether
an effect is big or small depends on context. Estimate the maximum
electric field due to a laser of XX watts, and the electric field at an
electron due to its nearby nucleus. Conclude that while the laser is
very powerful on a human scale (and you should not stick your eye into
a laser beam), it is nevertheless very weak on an atomic scale.
22.4 Magnitude of transitions
At equation (22.33) we defined
\[ A \equiv \frac{e^2 E_0^2\, |\langle b|\hat{z}|a\rangle|^2}{\hbar^2} \]
and then noted that it was independent of $\omega$ and $t$, but otherwise ignored
it. (Although we used it when we said that the maximum transition
probability was $At^2/4$.) This problem investigates the character
of $A$.
The maximum classical force on the electron due to light is $eE_0$. A
typical force is less, so define the characteristic force due to light as
\[ F_{c,L} \equiv \tfrac{1}{2} eE_0. \]
A typical classical force on the electron due to the nucleus is
\[ F_{c,N} \equiv \frac{e^2}{4\pi\epsilon_0}\, \frac{1}{a_0^2}. \]
Using these two definitions, and taking a typical matrix element $|\langle b|\hat{z}|a\rangle|$
to be $a_0$, show that a typical value of $A$ is
\[ 4 \left(\frac{F_{c,L}}{F_{c,N}}\right)^2 \frac{1}{\tau_0^2}. \]
If this excites you, you may also show that the exact value is
\[ A = 4 \left(\frac{F_{c,L}}{F_{c,N}}\right)^2 \frac{1}{\tau_0^2} \left(\frac{|\langle b|\hat{z}|a\rangle|}{a_0}\right)^2. \]
22.5 Gaussian light pulse

An atom is exposed to a Gaussian packet of light
\[ E(t) = E_0\, e^{-t^2/\tau^2} \sin(\omega t). \tag{22.60} \]
At time $t = -\infty$, the atom was in state $|a\rangle$. Find the amplitude, to
first order in perturbation theory, that at time $t = \infty$ the atom is in
state $|b\rangle$. Clue: Use
\[ \int_{-\infty}^{+\infty} e^{ax^2 + bx}\, dx = \sqrt{\frac{\pi}{-a}}\; e^{-b^2/4a} \quad \text{for } \Re\{a\} \le 0 \text{ but } a \ne 0. \]
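Answer:
\[ c_b = -\frac{eE_0 \langle b|\hat{z}|a\rangle}{\hbar}\, \frac{\sqrt{\pi}\,\tau}{2} \left[ e^{-\tau^2(\omega+\omega_0)^2/4} - e^{-\tau^2(\omega-\omega_0)^2/4} \right]. \]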
Chapter 23
The Territory Ahead
I reckon I got to light out for the territory ahead. . .

— Mark Twain (last sentence of Huckleberry Finn)
This is the last chapter of the book, but not the last chapter of quantum
mechanics. There are many fascinating topics that this book hasn’t even
touched on. Quantum mechanics will — if you allow it — surprise and
delight (and mystify) you for the rest of your life.
How to extend what’s in this book:
• Relativistic quantum mechanics. (Don’t make $t$ an operator; instead
turn $x$ back into a variable and introduce creation and annihilation
operators.)
• Quantum field theory.
• Quantal chaos and the classical limit of quantum mechanics.
• Friction and decay to ground state.
• Atomic, molecular, and solid state physics.
All of these fall solidly within the amplitude framework!
Appendix A
Tutorial on Matrix Diagonalization
You know from as far back as your introductory mechanics course that
some problems are difficult given one choice of coordinate axes and easy
or even trivial given another. (For example, the famous “monkey and
hunter” problem is difficult using a horizontal axis, but easy using an axis
stretching from the hunter to the monkey.) The mathematical field of
linear algebra is devoted, in large part, to systematic techniques for finding
coordinate systems that make problems easy. This tutorial introduces the
most valuable of these techniques. It assumes that you are familiar with
matrix multiplication and with the ideas of the inverse, the transpose, and
the determinant of a square matrix. It is also useful to have a nodding
acquaintance with the inertia tensor.
This presentation is intentionally non-rigorous. A rigorous, formal
treatment of matrix diagonalization can be found in any linear algebra
textbook,1 and there is no need to duplicate that function here. What is
provided here instead is a heuristic picture of what’s going on in matrix di-
agonalization, how it works, and why anyone would want to do such a thing
anyway. Thus this presentation complements, rather than replaces, the log-
ically impeccable (“bulletproof”) arguments of the mathematics texts.
Essential problems in this tutorial are marked by asterisks (∗ ).
A.1 What’s in a name?
There is a difference between an entity and its name. For example, a tree
is made of wood, whereas its name “tree” is made of ink. One way to see
this is to note that in German, the name for a tree is “Baum”, so the name
changes upon translation, but the tree itself does not change. (Throughout
this tutorial, the term “translate” is used as in “translate from one language
to another” rather than as in “translate by moving in a straight line”.)
The same holds for mathematical entities. Suppose a length is rep-
resented by the number “2” because it is two feet long. Then the same
length is represented by the number “24” because it is twenty-four inches
long. The same length is represented by two different numbers, just as the
same tree has two different names. The representation of a length as a
number depends not only upon the length, but also upon the coordinate
system used to measure the length.
A.2 Vectors in two dimensions
One way of describing a two-dimensional vector $\mathbf{V}$ is by giving its $x$ and $y$
components in the form of a $2 \times 1$ column matrix
\[ \begin{pmatrix} V_x \\ V_y \end{pmatrix}. \tag{A.1} \]
Indeed it is sometimes said that the vector $\mathbf{V}$ is equal to the column
matrix (A.1). This is not precisely correct: it is better to say that the vector
is described by the column matrix or represented by the column matrix
or that its name is the column matrix. This is because if you describe
the vector using a different set of coordinate axes you will come up with
a different column matrix to describe the same vector. For example, in
the situation shown below the descriptions in terms of the two different
coordinate systems are related through the matrix equation
\[ \begin{pmatrix} V_{x'} \\ V_{y'} \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} V_x \\ V_y \end{pmatrix}. \tag{A.2} \]
[Figure: the vector $\mathbf{V}$ drawn with the original $(x, y)$ axes and the primed $(x', y')$ axes, which are rotated from the original axes by the angle $\phi$.]
The $2 \times 2$ matrix above is called the “rotation matrix” and is usually
denoted by $\mathsf{R}(\phi)$:
\[ \mathsf{R}(\phi) \equiv \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}. \tag{A.3} \]
One interesting property of the rotation matrix is that it is always invertible,
and that its inverse is equal to its transpose. Such matrices are called
orthogonal.¹ You could prove this by working a matrix multiplication, but
it is easier to realize that the inverse of a rotation by $\phi$ is simply a
rotation by $-\phi$, and to note that
\[ \mathsf{R}^{-1}(\phi) = \mathsf{R}(-\phi) = \mathsf{R}^\dagger(\phi). \tag{A.4} \]
(The dagger represents matrix transposition.)
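For readers who like to see such statements checked by machine, here is a minimal sketch in plain Python that tests equation (A.4) at one arbitrarily chosen angle. The helper names `rotation`, `matmul`, and `transpose` are illustrative and not part of the text.

```python
import math


def rotation(phi):
    """Rotation matrix R(phi) of equation (A.3), as a nested list."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-s, c]]


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(p):
    """Transpose of a 2x2 matrix."""
    return [[p[j][i] for j in range(2)] for i in range(2)]


phi = 0.7  # arbitrary test angle, in radians
r = rotation(phi)
product = matmul(r, transpose(r))    # identity if the inverse is the transpose
print(product)
print(transpose(r), rotation(-phi))  # these two should agree
```

A numerical check at one angle is of course not a proof, but it is a quick guard against sign errors in the matrix.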
There are, of course, an infinite number of column matrix representations
for any vector, corresponding to the infinite number of coordinate axis
rotations with $\phi$ from 0 to $2\pi$. But one of these representations is special:
It is the one in which the $x'$-axis lines up with the vector, so the column
matrix representation is just
\[ \begin{pmatrix} V \\ 0 \end{pmatrix}, \tag{A.5} \]
where $V = |\mathbf{V}| = \sqrt{V_x^2 + V_y^2}$ is the magnitude of the vector. This set of
coordinates is the preferred (or “canonical”) set for dealing with this vector:
one of the two components is zero, the easiest number to deal with, and
the other component is a physically important number. You might wonder
how I can claim that this representation has full information about the
vector: The initial representation (A.1) contains two independent numbers,
whereas the preferred representation (A.5) contains only one. The answer
is that the preferred representation contains one number (the magnitude of
the vector) explicitly while another number (the polar angle of the vector
relative to the initial $x$-axis) is contained implicitly in the rotation needed
to produce the preferred coordinate system.
¹ Although all rotation matrices are orthogonal, there are orthogonal matrices that are
not rotation matrices: see problem A.4.
A.1 Problem: Right angle rotations

Verify equation (A.2) in the special cases $\phi = 90^\circ$, $\phi = 180^\circ$, $\phi = 270^\circ$,
and $\phi = 360^\circ$.
A.2 Problem: The rotation matrix
a. Derive equation (A.2) through purely geometrical arguments.
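b. Express $\hat{\imath}'$ and $\hat{\jmath}'$, the unit vectors of the $(x', y')$ coordinate system,
as linear combinations of $\hat{\imath}$ and $\hat{\jmath}$. Then use
\[ V_{x'} = \mathbf{V} \cdot \hat{\imath}' \quad \text{and} \quad V_{y'} = \mathbf{V} \cdot \hat{\jmath}' \tag{A.6} \]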
to derive equation (A.2).
c. Which derivation do you find easier?
A.3 Problem: Rotation to the preferred coordinate system∗
In the preferred coordinate system, $V_{y'} = 0$. Use this requirement to
show that the preferred system is rotated from the initial system by an
angle $\phi$ with
\[ \tan\phi = \frac{V_y}{V_x}. \tag{A.7} \]
For any value of $V_y/V_x$, there are two angles that satisfy this equation.
What is the representation of $\mathbf{V}$ in each of these two coordinate
systems?
A.4 Problem: A non-rotation orthogonal transformation
In one coordinate system the y-axis is vertical and the x-axis points to
the right. In another the $y'$-axis is vertical and the $x'$-axis points to
the left. Find the matrix that translates vector coordinates from one
system to the other. Show that this matrix is orthogonal but not a
rotation matrix.
A.5 Problem: Other changes of coordinate∗

Suppose vertical distances (distances in the y direction) are measured
in feet while horizontal distances (distances in the x direction) are mea-
sured in miles. (This system is not perverse. It is used in nearly all
American road maps.) Find the matrix that changes the representation
of a vector in this coordinate system to the representation of a vector
in a system where all distances are measured in feet. Find the matrix
that translates back. Are these matrices orthogonal?
A.6 Problem: Other special representations
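At equation (A.5) we mentioned one “special” (or “canonical”) representation
of a vector. There are three others, namely
\[ \begin{pmatrix} 0 \\ -V \end{pmatrix}, \quad \begin{pmatrix} -V \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ V \end{pmatrix}. \tag{A.8} \]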
If coordinate-system rotation angle φ brings the vector representation
into the form (A.5), then what rotation angle will result in these three
representations?
A.3 Tensors in two dimensions
A tensor, like a vector, is a geometrical entity that may be described
(“named”) through components, but a $d$-dimensional tensor requires $d^2$
rather than $d$ components. Tensors are less familiar and more difficult to
visualize than vectors, but they are neither less important nor “less
physical”. We will introduce tensors through the concrete example of the inertia
tensor of classical mechanics (see, for example, reference [2]), but the results
we present will be perfectly general.
Just as the two components of a two-dimensional vector are most easily
kept track of through a $2 \times 1$ matrix, so the four components of a
two-dimensional tensor are most conveniently written in the form of a $2 \times 2$
matrix. For example, the inertia tensor $\mathbf{T}$ of a point particle with mass $m$
located² at $(x, y)$ has components
\[ \mathsf{T} = \begin{pmatrix} m y^2 & -m x y \\ -m x y & m x^2 \end{pmatrix}. \tag{A.9} \]
(Note the distinction between the tensor $\mathbf{T}$ and its matrix of components,
its “name”, $\mathsf{T}$.) As with vector components, the tensor components are
different in different coordinate systems, although the tensor itself does not
change. For example, in the primed coordinate system of the figure in
section A.2, the tensor components are of course
\[ \mathsf{T}' = \begin{pmatrix} m y'^2 & -m x' y' \\ -m x' y' & m x'^2 \end{pmatrix}. \tag{A.10} \]
² Or, to be absolutely precise, the particle located at the point represented by the vector
with components $(x, y)$.
A little calculation shows that the components of the inertia tensor in two
different coordinate systems are related through
\[ \mathsf{T}' = \mathsf{R}(\phi)\, \mathsf{T}\, \mathsf{R}^{-1}(\phi). \tag{A.11} \]
This relation holds for any tensor, not just the inertia tensor. (In fact,
one way to define “tensor” is as an entity with four components that sat-
isfy the above relation under rotation.) If the matrix representing a tensor
is symmetric (i.e. the matrix is equal to its transpose) in one coordinate
system, then it is symmetric in all coordinate systems (see problem A.7).
Therefore the symmetry is a property of the tensor, not of its matrix rep-
resentation, and we may speak of “a symmetric tensor” rather than just “a
tensor represented by a symmetric matrix”.
As with vectors, one of the many matrix representations of a given tensor
is considered special (or “canonical”): It is the one in which the lower left
component is zero. Furthermore if the tensor is symmetric (as the inertia
tensor is) then in this preferred coordinate system the upper right compo-
nent will be zero also, so the matrix will be all zeros except for the diagonal
elements. Such a matrix is called a “diagonal matrix” and the process of
finding the rotation that renders the matrix representation of a symmetric
tensor diagonal is called “diagonalization”.3 We may do an “accounting
of information” for this preferred coordinate system just as we did with
vectors. In the initial coordinate system, the symmetric tensor had three
independent components. In the preferred system, it has two independent
components manifestly visible in the diagonal matrix representation, and
one number hidden through the specification of the rotation.
A.7 Problem: Representations of symmetric tensors∗

Show that if the matrix $\mathsf{S}$ representing a tensor is symmetric, and if $\mathsf{B}$
is any orthogonal matrix, then all of the representations
\[ \mathsf{B}\mathsf{S}\mathsf{B}^\dagger \tag{A.12} \]
are symmetric. (Clue: If you try to solve this problem for rotations in
two dimensions using the explicit rotation matrix (A.3), you will find it
solvable but messy. The clue is that this problem asks you to prove the
result in any number of dimensions, and for any orthogonal matrix $\mathsf{B}$,
not just rotation matrices. This more general problem is considerably
easier to solve.)
³ An efficient algorithm for diagonalization is discussed in section A.8. For the moment,
we are more interested in knowing that a diagonal matrix representation must exist than
in knowing how to most easily find that preferred coordinate system.
A.8 Problem: Diagonal inertia tensor
The matrix (A.9) represents the inertia tensor of a point particle with
mass $m$ located a distance $r$ from the origin. Show that the matrix
is diagonal in four different coordinate systems: one in which the $x'$-axis
points directly toward the particle, one in which the $y'$-axis points
directly away from the particle, one in which the $x'$-axis points directly
away from the particle, and one in which the $y'$-axis points directly
toward the particle. Find the matrix representation in each of these
four coordinate systems.
A.9 Problem: Representations of a certain tensor
Show that a tensor represented in one coordinate system by a diagonal
matrix with equal elements, namely
\[ \begin{pmatrix} d_0 & 0 \\ 0 & d_0 \end{pmatrix}, \tag{A.13} \]
has the same representation in all orthogonal coordinate systems.
A.10 Problem: Rotation to the preferred coordinate system∗
A tensor is represented in the initial coordinate system by
\[ \begin{pmatrix} a & b \\ b & c \end{pmatrix}. \tag{A.14} \]
Show that the tensor is diagonal in a preferred coordinate system which
is rotated from the initial system by an angle $\phi$ with
\[ \tan(2\phi) = \frac{2b}{a - c}. \tag{A.15} \]
This equation has four solutions. Find the rotation matrix for $\phi = 90^\circ$,
then show how the four different diagonal representations are related.
You do not need to find any of the diagonal representations in terms of
$a$, $b$, and $c$; just show what the other three are given that one of them
is
\[ \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}. \tag{A.16} \]
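As a numerical sanity check on equation (A.15), not a substitute for the derivation the problem asks for, the sketch below rotates a symmetric matrix by the angle that (A.15) prescribes and reports the off-diagonal element. The function name `rotate_tensor` and the sample values of $a$, $b$, $c$ are assumptions made for illustration; the component formulas come from multiplying out $\mathsf{R}(\phi)\mathsf{T}\mathsf{R}^\dagger(\phi)$ with the rotation matrix (A.3).

```python
import math


def rotate_tensor(a, b, c, phi):
    """Components of R(phi) T R^T(phi) for the symmetric matrix [[a, b], [b, c]].

    Returns (a_new, b_new, c_new): the two diagonal elements and the
    single independent off-diagonal element in the rotated system.
    """
    cs, sn = math.cos(phi), math.sin(phi)
    a_new = a * cs**2 + 2 * b * cs * sn + c * sn**2
    b_new = b * (cs**2 - sn**2) - (a - c) * cs * sn
    c_new = a * sn**2 - 2 * b * cs * sn + c * cs**2
    return a_new, b_new, c_new


a, b, c = 2.0, 1.0, 3.0                 # arbitrary sample components
phi = 0.5 * math.atan2(2 * b, a - c)    # one solution of tan(2 phi) = 2b/(a - c)
d1, off_diagonal, d2 = rotate_tensor(a, b, c, phi)
print(d1, off_diagonal, d2)
```

Using `atan2` rather than `atan` avoids a division by zero when $a = c$; adding multiples of $90^\circ$ to `phi` gives the other solutions of (A.15).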
A.11 Problem: Inertia tensor in outer product notation

The discussion in this section has emphasized the tensor’s matrix
representation (“name”) $\mathsf{T}$ rather than the tensor $\mathbf{T}$ itself.
a. Define the “identity tensor” $\mathbf{1}$ as the tensor represented in some
coordinate system by
\[ \mathsf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{A.17} \]
Show that this tensor has the same representation in any coordinate
system.
b. Show that the inner product between two vectors results in a
scalar: Namely, if vector $\mathbf{a}$ is represented by $\begin{pmatrix} a_x \\ a_y \end{pmatrix}$ and vector $\mathbf{b}$ is represented by $\begin{pmatrix} b_x \\ b_y \end{pmatrix}$,
then the inner product $\mathbf{a} \cdot \mathbf{b}$ is given through
\[ \begin{pmatrix} a_x & a_y \end{pmatrix} \begin{pmatrix} b_x \\ b_y \end{pmatrix} = a_x b_x + a_y b_y, \]
and this inner product is a scalar. (A $1 \times 2$ matrix times a $2 \times 1$
matrix is a $1 \times 1$ matrix.) That is, the vector $\mathbf{a}$ is represented by
different coordinates in different coordinate systems, and the vector
$\mathbf{b}$ is represented by different coordinates in different coordinate
systems, but the inner product $\mathbf{a} \cdot \mathbf{b}$ is the same in all coordinate
systems.
c. In contrast, show that the outer product of two vectors is a tensor:
Namely
\[ \mathbf{a}\mathbf{b} \doteq \begin{pmatrix} a_x \\ a_y \end{pmatrix} \begin{pmatrix} b_x & b_y \end{pmatrix} = \begin{pmatrix} a_x b_x & a_x b_y \\ a_y b_x & a_y b_y \end{pmatrix}. \]
(A $2 \times 1$ matrix times a $1 \times 2$ matrix is a $2 \times 2$ matrix.) That is,
show that the representation of $\mathbf{a}\mathbf{b}$ transforms from one coordinate
system to another as specified through (A.11).
d. Show that the inertia tensor for a single particle of mass $m$ located
at position $\mathbf{r}$ can be written in coordinate-independent fashion as
\[ \mathbf{T} = m \mathbf{1} r^2 - m \mathbf{r}\mathbf{r}. \tag{A.18} \]
A.4 Tensors in three dimensions
A three-dimensional tensor is represented in component form by a $3 \times 3$
matrix with nine entries. If the tensor is symmetric, there are six independent
elements: three on the diagonal and three off-diagonal. The components
of a tensor in three dimensions change with coordinate system according to
\[ \mathsf{T}' = \mathsf{R}\mathsf{T}\mathsf{R}^\dagger, \tag{A.19} \]
where $\mathsf{R}$ is the $3 \times 3$ rotation matrix.
A rotation in two dimensions is described completely by giving a single
angle. In three dimensions more information is required. Specifically, we
need not only the amount of the rotation, but we must also know the plane
in which the rotation takes place. We can specify the plane by giving the
unit vector perpendicular to that plane. Specifying an arbitrary vector
in three dimensions requires three numbers, but specifying a unit vector
in three dimensions requires only two numbers because the magnitude is
already fixed at unity. Thus three numbers are required to specify a rotation
in three dimensions: two to specify the rotation’s plane, one to specify
the rotation’s size. (One particularly convenient way to specify a three-
dimensional rotation is through the three Euler angles. Reference [3] defines
these angles and shows how to write the 3 × 3 rotation matrix in terms of
these variables. For the purposes of this tutorial, however, we will not need
an explicit rotation matrix; all we need to know is the number of angles
required to specify a rotation.)
In two dimensions, any symmetric tensor (which has three independent
elements) could be represented by a diagonal tensor (with two independent
elements) plus a rotation (one angle). We were able to back up this claim
with an explicit expression for the angle.
In three dimensions it seems reasonable that any symmetric tensor (six
independent elements) can be represented by a diagonal tensor (three in-
dependent elements) plus a rotation (three angles). The three angles just
have to be selected carefully enough to make sure that they cause the off-
diagonal elements to vanish. This supposition is indeed correct, although
we will not pause for long enough to prove it by producing explicit formulas
for the three angles.
A.5 Tensors in d dimensions
A $d$-dimensional tensor is represented by a $d \times d$ matrix with $d^2$ entries. If
the tensor is symmetric, there are $d$ independent on-diagonal elements and
$d(d-1)/2$ independent off-diagonal elements. The tensor components will
change with coordinate system in the now-familiar form
\[ \mathsf{T}' = \mathsf{R}\mathsf{T}\mathsf{R}^\dagger, \tag{A.20} \]
where $\mathsf{R}$ is the $d \times d$ rotation matrix.
How many angles does it take to specify a rotation in d dimensions?
Remember how we went from two dimensions to three: The three-dimensional
rotation took place “in a plane”, i.e. in a two-dimensional subspace.
It required two (i.e. $d - 1$) angles to specify the orientation of the plane
plus one to specify the rotation within the plane, for a total of three angles.
A rotation in four dimensions takes place within a three-dimensional
subspace. It requires 3 = d − 1 angles to specify the orientation of the
three-dimensional subspace, plus, as we found above, three angles to specify
the rotation within the three-dimensional subspace. . . a total of six angles.
A rotation in five dimensions requires 4 = d − 1 angles to specify the
four-dimensional subspace in which the rotation occurs, plus the six angles
that we have just found specify a rotation within that subspace. . . a total
of ten angles.
In general, the number of angles needed to specify a rotation in $d$
dimensions is
\[ A_d = d - 1 + A_{d-1} = d(d-1)/2. \tag{A.21} \]
This is exactly the number of independent off-diagonal elements in a
symmetric tensor. It seems reasonable that we can choose the angles to ensure
that, in the resulting coordinate system, all the off-diagonal elements
vanish. The proof of this result is difficult and proceeds in a very different
manner from the plausibility argument sketched here. (The proof involves
concepts like eigenvectors and eigenvalues, and it gives an explicit recipe
for constructing the rotation matrix. It has the advantage of rigor and the
disadvantage of being so technical that it’s easy to lose track of the fact
that all you’re doing is choosing a coordinate system.)
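The counting argument can be tabulated with a few lines of Python. This sketch assumes the base case $A_2 = 1$ (a single angle in two dimensions, as stated in section A.4) and compares the recursion with the closed form and with a direct count of off-diagonal elements; the function names are illustrative.

```python
def angles_needed(d):
    """Number of angles A_d from the recursion A_d = (d - 1) + A_{d-1}."""
    if d == 2:
        return 1  # a two-dimensional rotation needs a single angle
    return (d - 1) + angles_needed(d - 1)


def off_diagonal_count(d):
    """Independent off-diagonal elements of a symmetric d x d matrix."""
    return sum(1 for i in range(d) for j in range(i + 1, d))


for d in range(2, 9):
    print(d, angles_needed(d), d * (d - 1) // 2, off_diagonal_count(d))
```

The three columns after `d` agree in every row, which is the coincidence the plausibility argument relies on.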
A.12 Problem: Non-symmetric tensors∗

Argue that a non-symmetric tensor can be brought into a “triangular”
representation in which all the elements below the diagonal are equal to
zero and all the elements on and above the diagonal are independent.
(This is indeed the case, although in general some of the non-zero el-
ements remaining will be complex-valued, and some of the angles will
involve rotations into complex-valued vectors.)
A.6 Linear transformations in two dimensions
Section A.3 considered $2 \times 2$ matrices as representations of tensors. This
section gains additional insight by considering $2 \times 2$ matrices as
representations of linear transformations. It demonstrates how diagonalization can
be useful and gives a clue to an efficient algorithm for diagonalization.
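A linear transformation is a function from vectors to vectors that can
be represented in any given coordinate system as
\[ \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.22} \]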
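If the equation above represents (“names”) the transformation in one
coordinate system, what is its representation in some other coordinate system?
We assume that the two coordinate systems are related through an
orthogonal matrix $\mathsf{B}$ such that
\[ \begin{pmatrix} u' \\ v' \end{pmatrix} = \mathsf{B} \begin{pmatrix} u \\ v \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} x' \\ y' \end{pmatrix} = \mathsf{B} \begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.23} \]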
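(For example, if the new coordinate system is the primed coordinate system
of the figure in section A.2, then the matrix $\mathsf{B}$ that translates from the original
to the new coordinates is the rotation matrix $\mathsf{R}(\phi)$.) Given this “translation
dictionary”, we have
\[ \begin{pmatrix} u' \\ v' \end{pmatrix} = \mathsf{B} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.24} \]
But $\mathsf{B}$ is invertible, so
\[ \begin{pmatrix} x \\ y \end{pmatrix} = \mathsf{B}^{-1} \begin{pmatrix} x' \\ y' \end{pmatrix} \tag{A.25} \]
whence
\[ \begin{pmatrix} u' \\ v' \end{pmatrix} = \mathsf{B} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \mathsf{B}^{-1} \begin{pmatrix} x' \\ y' \end{pmatrix}. \tag{A.26} \]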
Thus the representation of the transformation in the primed coordinate
system is
\[ \mathsf{B} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \mathsf{B}^{-1} \tag{A.27} \]
(compare equation A.11). This equation has a very direct physical meaning.
Remember that the matrix $\mathsf{B}$ translates from the old $(x, y)$ coordinates
to the new $(x', y')$ coordinates, while the matrix $\mathsf{B}^{-1}$ translates in the
opposite direction. Thus the equation above says that the representation of
a transformation in the new coordinates is given by translating from new
to old coordinates (through the matrix $\mathsf{B}^{-1}$), then applying the old
representation (the “a matrix”) to those old coordinates, and finally translating
back from old to new coordinates (through the matrix $\mathsf{B}$).
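The “translate, apply, translate back” reading of (A.27) can be tried out concretely. In the sketch below, which is illustrative only, $\mathsf{B}$ is taken to be the rotation matrix (A.3) at an arbitrary angle, the “a matrix” and the input vector are arbitrary sample values, and the helper names are hypothetical. It compares two routes to the primed output: transform in the old coordinates and then translate, or translate first and then apply the new representation.

```python
import math


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, v):
    """Product of a 2x2 matrix with a 2-component column."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


phi = 0.3                                   # arbitrary rotation angle
c, s = math.cos(phi), math.sin(phi)
b = [[c, s], [-s, c]]                       # B = R(phi), old -> new coordinates
b_inv = [[c, -s], [s, c]]                   # B^-1 = R(-phi), new -> old
a_old = [[1.0, 2.0], [3.0, 4.0]]            # arbitrary sample "a matrix"
a_new = matmul(matmul(b, a_old), b_inv)     # representation (A.27)

xy = [0.5, -1.5]                            # arbitrary input, old coordinates
route_1 = apply(b, apply(a_old, xy))        # transform, then translate (A.24)
route_2 = apply(a_new, apply(b, xy))        # translate, then transform (A.26)
print(route_1, route_2)
```

The two routes give the same primed components, as equations (A.24) and (A.26) require.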
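The rest of this section considers only transformations represented by
symmetric matrices, which we will denote by
\[ \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.28} \]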
Let’s try to understand this transformation as something more than a jum-
ble of symbols awaiting a plunge into the calculator. First of all, suppose
the vector V maps to the vector W. Then the vector 5V will be mapped
to vector 5W. In short, if we know how the transformation acts on vectors
with magnitude unity, we will be able to see immediately how it acts on
vectors with other magnitudes. Thus we focus our attention on vectors on
the unit circle:
\[ x^2 + y^2 = 1. \tag{A.29} \]
A brief calculation shows that the length of the output vector is then
\[ \sqrt{u^2 + v^2} = \sqrt{a^2 x^2 + b^2 + c^2 y^2 + 2b(a + c)xy}, \tag{A.30} \]
which isn’t very helpful. Another brief calculation shows that if the input
vector has polar angle $\theta$, then the output vector has polar angle $\varphi$ with
\[ \tan\varphi = \frac{b + c\tan\theta}{a + b\tan\theta}, \tag{A.31} \]
which is similarly opaque and messy.
Instead of trying to understand the transformation in its initial coordinate
system, let’s convert (rotate) to the special coordinate system
in which the transformation is represented by a diagonal matrix. In this
system,
\[ \begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} d_1 x' \\ d_2 y' \end{pmatrix}. \tag{A.32} \]
The unit circle is still
\[ x'^2 + y'^2 = 1, \tag{A.33} \]
so the image of the unit circle is
\[ \left(\frac{u'}{d_1}\right)^2 + \left(\frac{v'}{d_2}\right)^2 = 1, \tag{A.34} \]
namely an ellipse! This result is transparent in the special coordinate
system, but almost impossible to see in the original one.
Note particularly what happens to a vector pointing along the $x'$
coordinate axis. For example, the unit vector in this direction transforms
to
\[ \begin{pmatrix} d_1 \\ 0 \end{pmatrix} = \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{A.35} \]
In other words, when the vector is transformed it changes in magnitude,
but not in direction. Vectors with this property are called eigenvectors. It
is easy to see that any vector on either the $x'$ or $y'$ coordinate axis is an
eigenvector.
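To see the ellipse emerge, one can map a handful of points on the unit circle through a diagonal matrix and test equation (A.34). The values of $d_1$ and $d_2$ below are arbitrary sample numbers, and the function name `image` is an illustrative choice.

```python
import math

D1, D2 = 3.0, 0.5  # arbitrary sample diagonal elements d_1 and d_2


def image(theta):
    """Image (u', v') of the unit vector with polar angle theta, from (A.32)."""
    x, y = math.cos(theta), math.sin(theta)
    return D1 * x, D2 * y


for k in range(8):
    theta = 2 * math.pi * k / 8
    u, v = image(theta)
    # Left-hand side of (A.34); it should equal 1 for every theta.
    print(round(theta, 3), (u / D1) ** 2 + (v / D2) ** 2)

print(image(0.0))  # a vector along the x'-axis keeps its direction
```

The last line illustrates equation (A.35): the unit vector along the $x'$-axis is stretched by $d_1$ but not turned.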
A.7 What does “eigen” mean?
If a vector $\mathbf{x}$ is acted upon by a linear transformation $\mathsf{B}$, then the output
vector
\[ \mathbf{x}' = \mathsf{B}\mathbf{x} \tag{A.36} \]
will usually be skew to the original vector $\mathbf{x}$. However, for some very special
vectors it might just happen that $\mathbf{x}'$ is parallel to $\mathbf{x}$. Such vectors are called
“eigenvectors”. (This is a terrible name because (1) it gives no idea of
what eigenvectors are or why they’re so important and (2) it sounds gross.
However, that’s what they’re called.) We have already seen, in the previous
section, that eigenvectors are related to coordinate systems in which the
transformation is particularly easy to understand.
If x is an eigenvector, then
Bx = λx, (A.37)
where λ is a scalar called “the eigenvalue associated with eigenvector x”.
If x is an eigenvector, then any vector parallel to x is also an eigenvector
with the same eigenvalue. (That is, any vector of the form cx, where c is
any scalar, is also an eigenvector with the same eigenvalue.) Sometimes we
speak of a “line of eigenvectors”.
The vector x = 0 is never considered an eigenvector, because
B0 = λ0, (A.38)
for any value of λ for any linear transformation. On the other hand, if
Bx = 0x = 0 (A.39)
for some non-zero vector x, then x is an eigenvector with eigenvalue λ = 0.
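The definition can be turned into a direct test. The sketch below is an illustration under stated assumptions: the helper names are invented here, and the sample matrix is the symmetric matrix used as the worked example in section A.8. It returns the eigenvalue when a non-zero vector is an eigenvector, and `None` otherwise (including for the zero vector, which is excluded by definition).

```python
def mat_vec(B, x):
    """Multiply the matrix B (list of rows) by the vector x."""
    return [sum(B[i][j] * x[j] for j in range(len(x))) for i in range(len(B))]

def eigenvalue_if_eigenvector(B, x, tol=1e-12):
    """Return lambda if B x = lambda x for a non-zero x, otherwise None."""
    if all(abs(component) < tol for component in x):
        return None  # the zero vector is never considered an eigenvector
    y = mat_vec(B, x)
    # Estimate lambda from the largest component of x, then test every component.
    k = max(range(len(x)), key=lambda i: abs(x[i]))
    lam = y[k] / x[k]
    if all(abs(y[i] - lam * x[i]) < tol for i in range(len(x))):
        return lam
    return None

T = [[7, 3], [3, 7]]       # example matrix from section A.8
S = [[1, 1], [1, 1]]       # hypothetical matrix with a zero eigenvalue
```

For instance, `eigenvalue_if_eigenvector(T, [5, 5])` gives the same eigenvalue as `eigenvalue_if_eigenvector(T, [1, 1])`, which illustrates the "line of eigenvectors", and `eigenvalue_if_eigenvector(S, [1, -1])` returns zero rather than `None`, in line with equation (A.39).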
A.13 Problem: Plane of eigenvectors

Suppose x and y are two non-parallel vectors with the same eigenvalue.
(In this case the eigenvalue is said to be “degenerate”, which sounds
like an aspersion cast upon the morals of the eigenvalue but which is
really just a poor choice of terminology again.) Show that any vector of
the form c1 x + c2 y is an eigenvector with the same eigenvalue.
A.8 How to diagonalize a symmetric matrix
We saw in section A.3 that for any 2 × 2 symmetric matrix, represented in
its initial basis by, say,
\[ \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \tag{A.40} \]
a simple rotation of axes would produce a new coordinate system in which
the matrix representation is diagonal:
\[ \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}. \tag{A.41} \]
These two matrices are related through
\[
\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}
= \mathsf{R}(\phi) \begin{pmatrix} a & b \\ b & c \end{pmatrix} \mathsf{R}^{-1}(\phi), \tag{A.42}
\]
where R(φ) is the rotation matrix (A.3). Problem A.10 gave a direct way
to find the desired rotation. However this direct technique is cumbersome
and doesn’t generalize readily to higher dimensions. This section presents
a different technique, which relies on eigenvalues and eigenvectors, that is
more efficient and that generalizes readily to complex-valued matrices and
to matrices in any dimension, but that is somewhat sneaky and conceptually
roundabout.
We begin by noting that any vector lying along the x′-axis (of the preferred
coordinate system) is an eigenvector. For example, the vector 5î′ is
represented (in the preferred coordinate system) by
\[ \begin{pmatrix} 5 \\ 0 \end{pmatrix}. \tag{A.43} \]
Multiplying this vector by the matrix in question gives
\[
\begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix} \begin{pmatrix} 5 \\ 0 \end{pmatrix}
= d_1 \begin{pmatrix} 5 \\ 0 \end{pmatrix}, \tag{A.44}
\]
so 5î′ is an eigenvector with eigenvalue d1. The same holds for any scalar
multiple of î′, whether positive or negative. Similarly, any scalar multiple
of ĵ′ is an eigenvector with eigenvalue d2. In short, the two elements on the
diagonal in the preferred (diagonal) representation are the two eigenvalues,
and the two unit vectors î′ and ĵ′ of the preferred coordinate system are
two of the eigenvectors.
Thus finding the eigenvectors and eigenvalues of a matrix gives you the
information needed to diagonalize that matrix. The unit vectors î′ and ĵ′
constitute an “orthonormal basis of eigenvectors”. The eigenvectors even
give the rotation matrix directly, as described in the next paragraph.
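Let’s call the rotation matrix
\[ \mathsf{B} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}, \tag{A.45} \]
so that the inverse (transpose) matrix is
\[ \mathsf{B}^{-1} = \mathsf{B}^\dagger = \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix}. \tag{A.46} \]
The representation of î′ in the preferred basis is
\[ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{A.47} \]
so its representation in the initial basis is (see equation A.2)
\[
\mathsf{B}^\dagger \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} b_{11} \\ b_{12} \end{pmatrix}. \tag{A.48}
\]
Similarly, the representation of ĵ′ in the initial basis is
\[
\mathsf{B}^\dagger \begin{pmatrix} 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} b_{21} \\ b_{22} \end{pmatrix}. \tag{A.49}
\]
Thus the rotation matrix is
\[
\mathsf{B} = \begin{pmatrix}
\text{initial rep. of } \hat{\imath}' \text{, on its side} \\
\text{initial rep. of } \hat{\jmath}' \text{, on its side}
\end{pmatrix}. \tag{A.50}
\]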
Example
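Suppose we need to find a diagonal representation for the matrix
\[ \mathsf{T} = \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix}. \tag{A.51} \]
First we search for the special vectors, the eigenvectors, such that
\[
\begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= \lambda \begin{pmatrix} x \\ y \end{pmatrix}. \tag{A.52}
\]
At the moment, we don’t know either the eigenvalue λ or the associated
eigenvector (x, y). Thus it seems that (bad news) we are trying to solve
two equations for three unknowns:
\[
\begin{aligned}
7x + 3y &= \lambda x \\
3x + 7y &= \lambda y
\end{aligned} \tag{A.53}
\]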
Remember, however, that there is not one single eigenvector: any multiple
of an eigenvector is also an eigenvector. (Alternatively, any vector on the
line that extends the eigenvector is another eigenvector.) We only need one
of these eigenvectors, so let’s take the one that has x = 1 (i.e. the vector
on the extension line where it intersects the vertical line x = 1). (This
technique will fail if we have the bad luck that our actual eigenvector is
vertical and hence never passes through the line x = 1.) So we really have
two equations in two unknowns:
7 + 3y = λ
3 + 7y = λy
but note that they are not linear equations. . . the damnable product λy
in the lower right corner means that all our techniques for solving linear
equations go right out the window. We can solve these two equations for
λ and y, but there’s an easier, if somewhat roundabout, approach.
Finding eigenvalues
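Let’s go back to equation (A.52) and write it as
\[
\begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
- \lambda \begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{A.54}
\]
Then
\[
\begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
- \lambda \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{A.55}
\]
or
\[
\begin{pmatrix} 7 - \lambda & 3 \\ 3 & 7 - \lambda \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{A.56}
\]
Let’s think about this. It says that for some matrix M = T − λ1, we have
\[ \mathsf{M} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{A.57} \]
You know right away one vector (x, y) that satisfies this equation, namely
(x, y) = (0, 0). And most of the time, this is the only vector that satisfies
the equation, because
\[
\begin{pmatrix} x \\ y \end{pmatrix}
= \mathsf{M}^{-1} \begin{pmatrix} 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{A.58}
\]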
We appear to have reached a dead end. The solution is (x, y) = (0, 0),
but the zero vector is not, by definition, considered an eigenvector of any
transformation. (Because it satisfies equation (A.38) with any value of λ,
for any transformation.)
However, if the matrix M is not invertible, then there will be other
solutions to
\[ \mathsf{M} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{A.59} \]
in addition to the trivial solution (x, y) = (0, 0). Thus we must look for
those special values of λ such that the so-called characteristic matrix M
is not invertible. These values come if and only if the determinant of M
vanishes. For this example, we have to find values of λ such that
\[ \det \begin{pmatrix} 7 - \lambda & 3 \\ 3 & 7 - \lambda \end{pmatrix} = 0. \tag{A.60} \]
This is a quadratic equation in λ
\[ (7 - \lambda)^2 - 3^2 = 0 \tag{A.61} \]
called the characteristic equation. Its two solutions are
\[ 7 - \lambda = \pm 3 \tag{A.62} \]
or
\[ \lambda = 7 \pm 3 = 10 \text{ or } 4. \tag{A.63} \]
We have found the two eigenvalues of our matrix!
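For a general 2 × 2 symmetric matrix with rows (a, b) and (b, c), the same determinant condition gives the quadratic λ² − (a + c)λ + (ac − b²) = 0. The sketch below solves that quadratic directly; the function name is an invention for this illustration, and this hand method is shown only to mirror the paper calculation, not as a recommended numerical algorithm.

```python
import math

def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of [[a, b], [b, c]] from its characteristic equation."""
    trace = a + c
    det = a * c - b * b
    # The discriminant equals (a - c)^2 + 4 b^2, so it is never negative.
    disc = trace * trace - 4 * det
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)

# The example matrix T of equation (A.51).
eigenvalues_T = symmetric_2x2_eigenvalues(7, 3, 7)
```

Because the discriminant is a sum of squares, a real symmetric 2 × 2 matrix always has real eigenvalues, and they coincide only when a = c and b = 0.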
Finding eigenvectors
Let’s look now for the eigenvector associated with λ = 4. Equation (A.53)
7x + 3y = λx
3x + 7y = λy
still holds, but no longer does it look like two equations in three unknowns,
because we are now interested in the case λ = 4:
7x + 3y = 4x
3x + 7y = 4y
Following our nose gives
3x + 3y = 0
3x + 3y = 0
and when we see this our heart skips a beat or two. . . a degenerate system of
equations! Relax and rest your heart. This system has an infinite number of
solutions and it’s supposed to have an infinite number of solutions, because
any multiple of an eigenvector is also an eigenvector. The eigenvectors
associated with λ = 4 are any multiple of
\[ \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \tag{A.64} \]
An entirely analogous search for the eigenvectors associated with λ = 10
finds any multiple of
\[ \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{A.65} \]
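The degenerate pair of equations can be solved once and for all in the 2 × 2 symmetric case. The following sketch (function names are hypothetical) uses the first row of the characteristic matrix, (a − λ)x + by = 0, which is satisfied by (x, y) = (b, λ − a) whenever b is non-zero, and then normalizes the result.

```python
import math

def eigenvector_2x2_symmetric(a, b, c, lam):
    """Unit eigenvector of [[a, b], [b, c]] for a known eigenvalue lam."""
    if b != 0:
        x, y = b, lam - a          # solves (a - lam) x + b y = 0
    elif lam == a:
        x, y = 1.0, 0.0            # matrix already diagonal: x-axis
    else:
        x, y = 0.0, 1.0            # matrix already diagonal: y-axis
    norm = math.hypot(x, y)
    return (x / norm, y / norm)

v4 = eigenvector_2x2_symmetric(7, 3, 7, 4)     # parallel to (1, -1)
v10 = eigenvector_2x2_symmetric(7, 3, 7, 10)   # parallel to (1, 1)
dot_product = v4[0] * v10[0] + v4[1] * v10[1]  # vanishes: the two are orthogonal
```

The overall sign of each returned vector is arbitrary, which reflects the freedom discussed in the "Tidying up" step.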
Tidying up
We have the two sets of eigenvectors, but which shall we call î′ and which
ĵ′? This is a matter of individual choice, but my choice is usually to make
the transformation be a rotation (without reflection) through a small positive
angle. Our new, preferred coordinate system is related to the original
coordinates by a simple rotation of 45° if we choose
\[
\hat{\imath}' = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\quad \text{and} \quad
\hat{\jmath}' = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{A.66}
\]
(Note that we have also “normalized the basis”, i.e. selected the basis
vectors to have magnitude unity.) Given this choice, the orthogonal rotation
matrix that changes coordinates from the original to the preferred system
is (see equation A.50)
\[ \mathsf{B} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \tag{A.67} \]
and the diagonalized matrix (or, more properly, the representation of the
matrix in the preferred coordinate system) is
\[ \begin{pmatrix} 10 & 0 \\ 0 & 4 \end{pmatrix}. \tag{A.68} \]
You don’t believe me? Then multiply out
\[ \mathsf{B} \begin{pmatrix} 7 & 3 \\ 3 & 7 \end{pmatrix} \mathsf{B}^\dagger \tag{A.69} \]
and see for yourself.
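If you would rather let a machine do the multiplying, here is a small sketch that carries out the product in (A.69) with plain lists. The helper names are made up for this illustration.

```python
import math

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*X)]

s = 1 / math.sqrt(2)
B = [[s, s], [-s, s]]      # rotation matrix of equation (A.67)
T = [[7, 3], [3, 7]]       # original matrix of equation (A.51)

# B T B-dagger; for a real orthogonal B the dagger is just the transpose.
T_prime = mat_mul(mat_mul(B, T), transpose(B))
```

Up to floating-point rounding, `T_prime` has 10 and 4 on the diagonal and zeros elsewhere, in agreement with (A.68).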
Problems
A.14 Problem: Diagonalize a 2 × 2 matrix∗
Diagonalize the matrix
\[ \begin{pmatrix} 26 & 12 \\ 12 & 19 \end{pmatrix}. \tag{A.70} \]
a. Find its eigenvalues.
b. Find its eigenvectors, and verify that they are orthogonal.
c. Sketch the eigenvectors, and determine the signs and sequence
most convenient for assigning axes. (That is, should the first
eigenvector you found be called î′, −î′, or ĵ′?)
d. Find the matrix that translates from the initial basis to the basis
of eigenvectors produced in part (c.).
e. Verify that the matrix produced in part (d.) is orthogonal.
f. Verify that the representation of the matrix above in the basis of
eigenvectors is diagonal.
g. (Optional.) What is the rotation angle?
A.15 Problem: Eigenvalues of a 2 × 2 matrix
Show that the eigenvalues of
\[ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \tag{A.71} \]
are
\[ \lambda = \tfrac{1}{2} \left[ (a + c) \pm \sqrt{(a - c)^2 + 4b^2} \right]. \tag{A.72} \]
Under what circumstances is an eigenvalue complex valued? Under
what circumstances are the two eigenvalues the same?
A.16 Problem: Diagonalize a 3 × 3 matrix
Diagonalize the matrix
\[
\frac{1}{625} \begin{pmatrix} 1182 & -924 & 540 \\ -924 & 643 & 720 \\ 540 & 720 & -575 \end{pmatrix}. \tag{A.73}
\]
a. Find its eigenvalues by showing that the characteristic equation is
\[ \lambda^3 - 2\lambda^2 - 5\lambda + 6 = (\lambda - 3)(\lambda + 2)(\lambda - 1) = 0. \tag{A.74} \]
b. Find its eigenvectors, and verify that they are orthogonal.
c. Show that the translation matrix can be chosen to be
\[
\mathsf{B} = \frac{1}{25} \begin{pmatrix} 20 & -15 & 0 \\ 9 & 12 & -20 \\ 12 & 16 & 15 \end{pmatrix}. \tag{A.75}
\]
Why did I use the phrase “the translation matrix can be chosen
to be” rather than “the translation matrix is”?
A.17 Problem: A 3 × 3 matrix eigenproblem
Find the eigenvalues and associated eigenvectors for the matrix
\[ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{pmatrix}. \tag{A.76} \]
A.9 A glance at computer algorithms
Anyone who has worked even one of the problems in section A.8 knows that
diagonalizing a matrix is no picnic: there’s a lot of mundane arithmetic
involved and it’s very easy to make mistakes. This is a problem ripe for
computer solution. One’s first thought is to program a computer to solve
the problem using the same technique that we used to solve it on paper:
first find the eigenvalues through the characteristic equation, then find the
eigenvectors through a degenerate set of linear equations.
This turns out to be a very poor algorithm for automatic computation.

The effective algorithm is to choose a matrix B such that the off-diagonal
elements of
BAB−1 (A.77)
are smaller than the off-diagonal elements of A. Then choose another, and
another. Go through this process again and again until the off-diagonal
elements have been ground down to machine zero. There are many strategies
for choosing the series of B matrices. These are well described in any
edition of Numerical Recipes [4].
When you need to diagonalize matrices numerically, I urge you to look at
Numerical Recipes to see what’s going on, but I urge you not to code these
algorithms yourself. These algorithms rely in an essential way on the fact
that computer arithmetic is approximate rather than exact, and hence they
are quite tricky to implement. Instead of coding the algorithms yourself,
I recommend that you use the implementations in either LAPACK [5] (the
Linear Algebra PACKage) or EISPACK [6]. These packages are probably the
finest computer software ever written, and they are free. They can be
obtained through the “Guide to Available Mathematical Software” (GAMS)
at http://gams.nist.gov.
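To make the idea of "grinding down" the off-diagonal elements concrete, here is a deliberately naive sketch of one such strategy, the cyclic Jacobi rotation method for real symmetric matrices. The choice of this particular strategy, the function names, and the tolerances are assumptions made for illustration. In keeping with the advice above, it is meant only to show what is going on, not to replace a library routine.

```python
import math

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def off_diagonal_norm(A):
    """Square root of the sum of squares of the off-diagonal elements."""
    n = len(A)
    return math.sqrt(sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j))

def jacobi_eigenvalues(A, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by repeated plane rotations."""
    n = len(A)
    A = [[float(x) for x in row] for row in A]
    for _ in range(max_sweeps):
        if off_diagonal_norm(A) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Rotation angle that makes the (p, q) element vanish.
                theta = 0.5 * math.atan2(2 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                # B is the identity except for a rotation in the p-q plane.
                B = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
                B[p][p], B[q][q] = c, c
                B[p][q], B[q][p] = -s, s
                # B is orthogonal, so B A B^{-1} = B A B^T, as in (A.77).
                A = mat_mul(mat_mul(B, A), transpose(B))
    return sorted(A[i][i] for i in range(n))

small = jacobi_eigenvalues([[7, 3], [3, 7]])
big = jacobi_eigenvalues([[1182 / 625, -924 / 625, 540 / 625],
                          [-924 / 625, 643 / 625, 720 / 625],
                          [540 / 625, 720 / 625, -575 / 625]])
```

Each rotation zeroes one off-diagonal pair but may disturb pairs zeroed earlier, which is why the sweep is repeated until the off-diagonal norm falls below the tolerance. The second call uses the matrix of problem A.16, whose eigenvalues are stated in equation (A.74).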
A.10 A glance at non-symmetric matrices and the Jordan form
Many of the matrices that arise in applications are symmetric and hence
the results of the previous sections are the only ones needed. But every
once in a while you do encounter a non-symmetric matrix and this section
gives you a guide to treating them. It is just an introduction and treats
only 2 × 2 matrices.
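Given a non-symmetric matrix, the first thing to do is rotate the axes to
make the matrix representation triangular, as discussed in problem A.12:
\[ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}. \tag{A.78} \]
Note that b ≠ 0 because otherwise the matrix would be symmetric and we
would already be done. In this case vectors on the x-axis are eigenvectors
because
\[
\begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= a \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{A.79}
\]
Are there any other eigenvectors? The equation
\[
\begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= \lambda \begin{pmatrix} x \\ y \end{pmatrix} \tag{A.80}
\]
tells us that
\[
\begin{aligned}
ax + by &= \lambda x \\
cy &= \lambda y
\end{aligned}
\]
whence (for an eigenvector off the x-axis, so that y ≠ 0) λ = c and the
eigenvector has polar angle θ where
\[ \tan\theta = \frac{c - a}{b}. \tag{A.81} \]
Note that if c = a (the “degenerate” case: both eigenvalues are the same)
then θ = 0 or θ = π. In this case all of the eigenvectors are on the x-axis.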
Diagonal form
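We already know that a rotation of orthogonal (Cartesian) coordinates
will not diagonalize this matrix. We must instead transform to a skew
coordinate system in which the axes are not perpendicular.

[Figure: a vector V drawn with skew axes. The x′-axis coincides with the
x-axis, the y′-axis makes angle ϕ with it, and the components Vx′ and Vy′
are marked along the two skew axes.]
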
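Note that with oblique axes, the coordinates are given by
\[ \mathbf{V} = V_{x'} \hat{\imath}' + V_{y'} \hat{\jmath}' \tag{A.82} \]
but, because î′ and ĵ′ are not perpendicular, it is not true that
\[ V_{x'} = \mathbf{V} \cdot \hat{\imath}'. \quad \text{NO!} \tag{A.83} \]
A little bit of geometry will convince you that the name of the vector
V changes according to
\[
\begin{pmatrix} V_{x'} \\ V_{y'} \end{pmatrix}
= \mathsf{B} \begin{pmatrix} V_x \\ V_y \end{pmatrix}, \tag{A.84}
\]
where
\[
\mathsf{B} = \frac{1}{\sin\varphi} \begin{pmatrix} \sin\varphi & -\cos\varphi \\ 0 & 1 \end{pmatrix}. \tag{A.85}
\]
This matrix is not orthogonal. In fact its inverse is
\[ \mathsf{B}^{-1} = \begin{pmatrix} 1 & \cos\varphi \\ 0 & \sin\varphi \end{pmatrix}. \tag{A.86} \]
Finally, note that we cannot have ϕ = 0 or ϕ = π, because then both
Vx′ and Vy′ would give information about the horizontal component of the
vector, and there would be no information about the vertical component of
the vector.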
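What does this say about the representations of tensors (or, equivalently,
of linear transformations)? The “name translation” argument of
equation (A.27) still applies, so
\[ \mathsf{T}' = \mathsf{B}\mathsf{T}\mathsf{B}^{-1}. \tag{A.87} \]
Using the explicit matrices already given, this says
\[
\mathsf{T}' = \frac{1}{\sin\varphi}
\begin{pmatrix} \sin\varphi & -\cos\varphi \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}
\begin{pmatrix} 1 & \cos\varphi \\ 0 & \sin\varphi \end{pmatrix}
= \begin{pmatrix} a & (a - c)\cos\varphi + b\sin\varphi \\ 0 & c \end{pmatrix}. \tag{A.88}
\]
To make this diagonal, we need only choose a skew coordinate system where
the angle ϕ gives
\[ (a - c)\cos\varphi + b\sin\varphi = 0, \tag{A.89} \]
that is, one with
\[ \tan\varphi = \frac{c - a}{b}. \tag{A.90} \]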
Comparison with equation (A.81) shows that this simply means that the
skew coordinate system should have its axes pointing along two eigenvectors.
We have once again found an intimate connection between diagonal
representations and eigenvectors, a connection which is exploited fully in
abstract mathematical treatments of matrix diagonalization.
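The skew change of coordinates can be checked numerically. This sketch builds the matrices of equations (A.85) and (A.86) for the angle given by (A.90) and forms the product in (A.87). The function names and the sample values a = 2, b = 1, c = 3 are assumptions chosen for illustration.

```python
import math

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def skew_change_of_basis(phi):
    """Return (B, B_inverse) for a y'-axis at angle phi from the x-axis."""
    s, c = math.sin(phi), math.cos(phi)
    B = [[1.0, -c / s], [0.0, 1.0 / s]]     # equation (A.85)
    B_inv = [[1.0, c], [0.0, s]]            # equation (A.86)
    return B, B_inv

def diagonalize_upper_triangular(a, b, c):
    """Skew angle and new representation of [[a, b], [0, c]], with a != c and b != 0."""
    if a == c or b == 0:
        raise ValueError("need distinct eigenvalues and a non-zero b")
    phi = math.atan2(c - a, b)              # tan(phi) = (c - a) / b
    B, B_inv = skew_change_of_basis(phi)
    T = [[a, b], [0.0, c]]
    return phi, mat_mul(mat_mul(B, T), B_inv)

phi, T_prime = diagonalize_upper_triangular(2.0, 1.0, 3.0)
```

For these sample values tan ϕ = 1, so the y′-axis sits at 45° to the x-axis, and `T_prime` comes out diagonal with the eigenvalues 2 and 3 on the diagonal (up to rounding).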
Once again we can do an accounting of information. In the initial co-
ordinate system, the four elements of the matrix contain four independent
pieces of information. In the diagonalizing coordinate system, two of those
pieces are explicit in the matrix, and two are implicit in the two axis rota-
tion angles needed to implement the diagonalization.
This procedure works almost all the time. But, if a = c, then it would
involve ϕ = 0 or ϕ = π, and we have already seen that this is not an
acceptable change of coordinates.
Degenerate case
Suppose our matrix has equal eigenvalues, a = c, so that it reads
\[ \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}. \tag{A.91} \]
If b = 0, then the matrix is already diagonal. (Indeed, in this case all
vectors are eigenvectors with eigenvalue a, and the linear transformation is
simply multiplication of each vector by a.)
But if b ≠ 0, then, as we have seen, the only eigenvectors are on the
x-axis, and it is impossible to make a basis of eigenvectors. Only one thing
can be done to make the matrix representation simpler than it stands in
equation (A.91), and that is a shift in the scale used to measure the y-axis.
For example, suppose that in the (x, y) coordinate system, the y-axis is
calibrated in inches. We wish to switch to the (x′, y′) system in which the
y′-axis is calibrated in feet. There is no change in axis orientation or in the
x-axis. It is easy to see that the two sets of coordinates are related through
\[
\begin{pmatrix} x' \\ y' \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1/12 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 12 \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix}. \tag{A.92}
\]
This process is sometimes called a “stretching” or a “scaling” of the y-axis.
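The transformation represented by matrix (A.91) in the initial coordinate
system is represented in the new coordinate system by
\[
\begin{pmatrix} 1 & 0 \\ 0 & 1/12 \end{pmatrix}
\begin{pmatrix} a & b \\ 0 & a \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 12 \end{pmatrix}
= \begin{pmatrix} a & 12b \\ 0 & a \end{pmatrix}. \tag{A.93}
\]
The choice of what to do now is clear. Instead of scaling the y-axis by a
factor of 12, we can scale it by a factor of 1/b, and produce a new matrix
representation of the form
\[ \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix}. \tag{A.94} \]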
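The effect of rescaling the y-axis is easy to confirm by machine. In this sketch the function name and the sample entries a = 5, b = 3 are hypothetical; the parameter `k` plays the role of the factor 12 in equation (A.93).

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rescale_y_axis(T, k):
    """Representation of T after the change y' = y / k, i.e. B T B^{-1}."""
    B = [[1.0, 0.0], [0.0, 1.0 / k]]
    B_inv = [[1.0, 0.0], [0.0, k]]
    return mat_mul(mat_mul(B, T), B_inv)

T = [[5.0, 3.0], [0.0, 5.0]]            # the form (A.91) with a = 5, b = 3

in_feet = rescale_y_axis(T, 12.0)       # off-diagonal element becomes 12 b
jordan = rescale_y_axis(T, 1.0 / 3.0)   # k = 1/b gives the form (A.94)
```

Only the off-diagonal element responds to the rescaling. The diagonal elements, being eigenvalues, are untouched by any change of coordinates.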
Where is the information in this case? In the initial coordinate system,
the four elements of the matrix contain four independent pieces of information.
In the new coordinate system, two of those pieces are explicit in the
matrix, one is implicit in the rotation angle needed to implement the initial
triangularization, and one is implicit in the y-axis scale transformation.
The Jordan form
Remarkably, the situation discussed above for 2 × 2 matrices covers all
the possible cases for n × n matrices. That is, in n dimensional space,
the proper combination of rotations, skews, and stretches of coordinate
axes will bring the matrix representation (the “name”) of any tensor or
linear transformation into a form where every element is zero except on
the diagonal and on the superdiagonal. The elements on the diagonal are
eigenvalues, and each element on the superdiagonal is either zero or one:
zero if the two adjacent eigenvalues differ, either zero or one if they are the
same. The warning of problem A.12 applies here as well: The eigenvalues
on the diagonal may well be complex valued, and the same applies for the
elements of the new basis vectors.
References
[1] For example, Kenneth Hoffman and Ray Kunze, Linear Algebra, second
edition (Prentice-Hall, Englewood Cliffs, New Jersey, 1971).
[2] For example, Jerry Marion and Stephen Thornton, Classical Dynamics
of Particles and Systems, fourth edition (Saunders College Publishing, Fort
Worth, Texas, 1995) section 11.2.
[3] For example, Jerry Marion and Stephen Thornton, Classical Dynamics
of Particles and Systems, fourth edition (Saunders College Publishing, Fort
Worth, Texas, 1995) section 11.7.
[4] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical
Recipes (Cambridge University Press, Cambridge, U.K., 1992).
[5] E. Anderson, et al., LAPACK Users’ Guide (SIAM, Philadelphia,
1992).
[6] B.T. Smith, et al., Matrix Eigensystem Routines - EISPACK Guide
(Springer-Verlag, Berlin, 1976).
Appendix B
The Spherical Harmonics
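A “function on the unit sphere” is a function f(θ, φ). Another convenient
variable is µ = cos θ. “Integration over the unit sphere” means
\[
\int d\Omega\, f(\theta,\phi)
= \int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\phi\, f(\theta,\phi)
= \int_{-1}^{+1} d\mu \int_0^{2\pi} d\phi\, f(\theta,\phi).
\]
\[ \nabla^2 Y_\ell^m(\theta,\phi) = -\frac{1}{r^2}\,\ell(\ell+1)\, Y_\ell^m(\theta,\phi) \tag{B.1} \]
\[ \int Y_{\ell'}^{m'*}(\theta,\phi)\, Y_\ell^m(\theta,\phi)\, d\Omega = \delta_{\ell',\ell}\,\delta_{m',m} \tag{B.2} \]
\[ f(\theta,\phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} f_{\ell,m}\, Y_\ell^m(\theta,\phi) \quad \text{where} \tag{B.3} \]
\[ f_{\ell,m} = \int Y_\ell^{m*}(\theta,\phi)\, f(\theta,\phi)\, d\Omega \tag{B.4} \]
In the table, square roots are always taken to be positive.
\[
\begin{aligned}
Y_0^0(\mu,\phi) &= \left(\frac{1}{2^2\pi}\right)^{1/2} \\
Y_1^0(\mu,\phi) &= \left(\frac{3}{2^2\pi}\right)^{1/2} \mu
  = \left(\frac{3}{2^2\pi}\right)^{1/2} \frac{z}{r} \\
Y_1^{\pm 1}(\mu,\phi) &= \mp\left(\frac{3}{2^3\pi}\right)^{1/2} \sqrt{1-\mu^2}\; e^{\pm i\phi}
  = \mp\left(\frac{3}{2^3\pi}\right)^{1/2} \frac{1}{r}(x \pm iy) \\
Y_2^0(\mu,\phi) &= \left(\frac{5}{2^4\pi}\right)^{1/2} (3\mu^2 - 1)
  = \left(\frac{5}{2^4\pi}\right)^{1/2} \left(3\frac{z^2}{r^2} - 1\right) \\
Y_2^{\pm 1}(\mu,\phi) &= \mp\left(\frac{3\cdot 5}{2^3\pi}\right)^{1/2} \mu\sqrt{1-\mu^2}\; e^{\pm i\phi}
  = \mp\left(\frac{3\cdot 5}{2^3\pi}\right)^{1/2} \frac{z}{r^2}(x \pm iy) \\
Y_2^{\pm 2}(\mu,\phi) &= \left(\frac{3\cdot 5}{2^5\pi}\right)^{1/2} (1-\mu^2)\, e^{\pm 2i\phi}
  = \left(\frac{3\cdot 5}{2^5\pi}\right)^{1/2} \frac{1}{r^2}(x \pm iy)^2 \\
Y_3^0(\mu,\phi) &= \left(\frac{7}{2^4\pi}\right)^{1/2} (5\mu^3 - 3\mu)
  = \left(\frac{7}{2^4\pi}\right)^{1/2} \left(5\frac{z^3}{r^3} - 3\frac{z}{r}\right) \\
Y_3^{\pm 1}(\mu,\phi) &= \mp\left(\frac{3\cdot 7}{2^6\pi}\right)^{1/2} (5\mu^2 - 1)\sqrt{1-\mu^2}\; e^{\pm i\phi}
  = \mp\left(\frac{3\cdot 7}{2^6\pi}\right)^{1/2} \left(5\frac{z^2}{r^2} - 1\right) \frac{1}{r}(x \pm iy) \\
Y_3^{\pm 2}(\mu,\phi) &= \left(\frac{3\cdot 5\cdot 7}{2^5\pi}\right)^{1/2} \mu(1-\mu^2)\, e^{\pm 2i\phi}
  = \left(\frac{3\cdot 5\cdot 7}{2^5\pi}\right)^{1/2} \frac{z}{r^3}(x \pm iy)^2 \\
Y_3^{\pm 3}(\mu,\phi) &= \mp\left(\frac{5\cdot 7}{2^6\pi}\right)^{1/2} (1-\mu^2)\sqrt{1-\mu^2}\; e^{\pm 3i\phi}
  = \mp\left(\frac{5\cdot 7}{2^6\pi}\right)^{1/2} \frac{1}{r^3}(x \pm iy)^3
\end{aligned}
\]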
Appendix C
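Radial Wavefunctions for the Coulomb Problem
Based on Griffiths, page 154, but with scaled variables and with integers
factorized.
\[
\begin{aligned}
R_{10}(r) &= 2\, e^{-r} \\
R_{20}(r) &= \frac{1}{\sqrt{2}} \left(1 - \frac{1}{2} r\right) e^{-r/2} \\
R_{21}(r) &= \frac{1}{\sqrt{2^3\cdot 3}}\; r\, e^{-r/2} \\
R_{30}(r) &= \frac{2}{\sqrt{3^3}} \left(1 - \frac{2}{3} r + \frac{2}{3^3} r^2\right) e^{-r/3} \\
R_{31}(r) &= \frac{2^3}{3^3\sqrt{2\cdot 3}} \left(1 - \frac{1}{2\cdot 3} r\right) r\, e^{-r/3} \\
R_{32}(r) &= \frac{2^2}{3^4\sqrt{2\cdot 3\cdot 5}}\; r^2 e^{-r/3} \\
R_{40}(r) &= \frac{1}{2^2} \left(1 - \frac{3}{2^2} r + \frac{1}{2^3} r^2 - \frac{1}{2^6\cdot 3} r^3\right) e^{-r/4} \\
R_{41}(r) &= \frac{\sqrt{5}}{2^4\sqrt{3}} \left(1 - \frac{1}{2^2} r + \frac{1}{2^4\cdot 5} r^2\right) r\, e^{-r/4} \\
R_{42}(r) &= \frac{1}{2^6\sqrt{5}} \left(1 - \frac{1}{2^2\cdot 3} r\right) r^2 e^{-r/4} \\
R_{43}(r) &= \frac{1}{2^8\cdot 3\sqrt{5\cdot 7}}\; r^3 e^{-r/4}
\end{aligned}
\]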
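A quick numerical sanity check of entries in this table is straightforward. The sketch below assumes the usual normalization convention for radial wavefunctions, namely that the integral of R(r)² r² over all r equals one in the scaled variables, and uses a hand-written Simpson rule. The function names, the cutoff radius, and the number of intervals are choices made for this illustration.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def R10(r):
    return 2.0 * math.exp(-r)

def R20(r):
    return (1 / math.sqrt(2)) * (1 - r / 2) * math.exp(-r / 2)

def R21(r):
    return (1 / math.sqrt(2 ** 3 * 3)) * r * math.exp(-r / 2)

def radial_overlap(Ra, Rb, r_max=80.0):
    """Integral of Ra(r) Rb(r) r^2 from 0 to r_max (the tail beyond is negligible)."""
    return simpson(lambda r: Ra(r) * Rb(r) * r * r, 0.0, r_max)

norm_10 = radial_overlap(R10, R10)
norm_20 = radial_overlap(R20, R20)
norm_21 = radial_overlap(R21, R21)
cross_10_20 = radial_overlap(R10, R20)   # same l, different n: should vanish
```

Note that radial functions with different ℓ, such as R10 and R21, need not be orthogonal under this integral, because the orthogonality of the full wavefunctions is then carried by the spherical harmonics.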
Appendix D
Quantum Mechanics Cheat Sheet
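Delta functions:
\[ \int_{-\infty}^{+\infty} e^{ikx}\, dk = 2\pi\,\delta(x) \tag{D.1} \]
\[ \int_{-\infty}^{+\infty} e^{i(p/\hbar)x}\, dp = 2\pi\hbar\,\delta(x) \tag{D.2} \]
\[ \int_{-\infty}^{+\infty} e^{i\omega t}\, d\omega = 2\pi\,\delta(t) \tag{D.3} \]
Fourier transforms:
\[ \tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \psi(x)\, e^{-i(p/\hbar)x}\, dx \tag{D.4} \]
\[ \psi(x) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \tilde\psi(p)\, e^{+i(p/\hbar)x}\, dp \tag{D.5} \]
\[ \tilde f(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt \tag{D.6} \]
\[ f(t) = \int_{-\infty}^{+\infty} \tilde f(\omega)\, e^{+i\omega t}\, \frac{d\omega}{2\pi} \tag{D.7} \]
Gaussian integrals:
\[ \int_{-\infty}^{+\infty} e^{ax^2 + bx}\, dx = \sqrt{\frac{\pi}{-a}}\; e^{-b^2/4a} \qquad \Re e\{a\} \le 0 \tag{D.8} \]
\[
\frac{\displaystyle\int_{-\infty}^{+\infty} x^2 e^{-x^2/2\sigma^2}\, dx}
     {\displaystyle\int_{-\infty}^{+\infty} e^{-x^2/2\sigma^2}\, dx} = \sigma^2 \tag{D.9}
\]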
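Time development:
\[ \frac{d|\psi(t)\rangle}{dt} = -\frac{i}{\hbar}\, \hat H |\psi(t)\rangle \tag{D.10} \]
\[ \frac{\partial\psi(x,t)}{\partial t} = -\frac{i}{\hbar} \left[ -\frac{\hbar^2}{2m}\nabla^2 + V(x) \right] \psi(x,t) \tag{D.11} \]
\[ |\psi(t)\rangle = \sum_n e^{-(i/\hbar)E_n t}\, c_n |\eta_n\rangle \tag{D.12} \]
\[ \frac{d\langle\hat A\rangle}{dt} = -\frac{i}{\hbar}\, \langle[\hat A, \hat H]\rangle \tag{D.13} \]
Momentum:
\[ \hat p \iff -i\hbar\, \frac{\partial}{\partial x} \tag{D.14} \]
\[ [\hat x, \hat p] = i\hbar \tag{D.15} \]
\[ \langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\; e^{i(p/\hbar)x} \tag{D.16} \]
Dimensions:
\[ \psi(x) \text{ has dimensions } [\text{length}]^{-1/2} \tag{D.17} \]
\[ \psi(x_1, x_2) \text{ has dimensions } [\text{length}]^{-6/2} \tag{D.18} \]
\[ \tilde\psi(p) \text{ has dimensions } [\text{momentum}]^{-1/2} \tag{D.19} \]
\[ \hbar \text{ has dimensions } [\text{length} \times \text{momentum}] \text{ or } [\text{energy} \times \text{time}] \tag{D.20} \]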
Energy eigenfunction sketching: (one dimension)
nth excited state has n nodes (D.21)
if classically allowed: regions of high V(x) have large amplitude and long wavelength (D.22)
if classically forbidden: regions of high V(x) have faster cutoff (D.23)
Infinite square well: (width L)
\[ \eta_n(x) = \sqrt{2/L}\, \sin kx \qquad k = n\pi/L \qquad n = 1, 2, 3, \ldots \tag{D.24} \]
\[ E_n = \frac{\hbar^2 k^2}{2m} = n^2\, \frac{\pi^2\hbar^2}{2mL^2} \tag{D.25} \]
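Simple harmonic oscillator: (V(x) = ½Kx², ω = √(K/m))
\[ E_n = (n + \tfrac{1}{2})\hbar\omega \qquad n = 0, 1, 2, \ldots \tag{D.26} \]
\[ [\hat a, \hat a^\dagger] = \hat 1 \tag{D.27} \]
\[ \hat H = \hbar\omega\, (\hat a^\dagger \hat a + \tfrac{1}{2}) \tag{D.28} \]
\[ \hat a |n\rangle = \sqrt{n}\; |n-1\rangle \tag{D.29} \]
\[ \hat a^\dagger |n\rangle = \sqrt{n+1}\; |n+1\rangle \tag{D.30} \]
\[ \hat x = \sqrt{\hbar/2m\omega}\; (\hat a + \hat a^\dagger) \tag{D.31} \]
\[ \hat p = -i\sqrt{m\hbar\omega/2}\; (\hat a - \hat a^\dagger) \tag{D.32} \]
Coulomb problem:
\[ E_n = -\frac{\mathrm{Ry}}{n^2} \qquad \mathrm{Ry} = \frac{m_e (e^2/4\pi\epsilon_0)^2}{2\hbar^2} = 13.6\ \text{eV} \tag{D.33} \]
\[ a_0 = \frac{(e^2/4\pi\epsilon_0)}{2\,\mathrm{Ry}} = 0.0529\ \text{nm} \quad \text{(Bohr radius)} \tag{D.34} \]
\[ \tau_0 = \frac{\hbar}{2\,\mathrm{Ry}} = 0.0242\ \text{fsec} \quad \text{(characteristic time)} \tag{D.35} \]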
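Angular momentum:
\[ [\hat J_x, \hat J_y] = i\hbar \hat J_z, \text{ and cyclic permutations} \tag{D.36} \]
The eigenvalues of Ĵ² are
\[ \hbar^2 j(j+1) \qquad j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, 2, \ldots . \tag{D.37} \]
For a given j, the eigenvalues of Ĵz are
\[ \hbar m \qquad m = -j, -j+1, \ldots, j-1, j. \tag{D.38} \]
The raising and lowering operators
\[ \hat J_+ = \hat J_x + i\hat J_y \qquad \hat J_- = \hat J_x - i\hat J_y \tag{D.39} \]
act by
\[ \hat J_+ |j,m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\; |j,m+1\rangle \tag{D.40} \]
\[ \hat J_- |j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\; |j,m-1\rangle. \tag{D.41} \]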
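The ladder coefficients in (D.40) and (D.41) are simple to tabulate. The following sketch, with invented function names and working in units where ħ = 1, lists the allowed m values for a given j and evaluates the square-root factor. It confirms, for example, that raising the top state or lowering the bottom state gives zero.

```python
import math

def m_values(j):
    """Allowed m for integer or half-integer j: -j, -j + 1, ..., j."""
    count = int(round(2 * j)) + 1
    return [-j + k for k in range(count)]

def ladder_coefficient(j, m, raising=True):
    """Factor multiplying |j, m +/- 1> in (D.40) or (D.41), in units of hbar."""
    step = 1 if raising else -1
    value = j * (j + 1) - m * (m + step)
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(value, 0.0))

spin_half_up = ladder_coefficient(0.5, -0.5)            # raise m = -1/2 to +1/2
top_of_ladder = ladder_coefficient(1, 1)                # nothing above m = j
bottom_of_ladder = ladder_coefficient(1, -1, raising=False)
middle = ladder_coefficient(1, 0)                       # square root of 2
```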
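Spherical harmonics:
A “function on the unit sphere” is a function f(θ, φ). Another convenient
variable is µ = cos θ. “Integration over the unit sphere” means
\[
\int d\Omega\, f(\theta,\phi)
= \int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\phi\, f(\theta,\phi)
= \int_{-1}^{+1} d\mu \int_0^{2\pi} d\phi\, f(\theta,\phi).
\]
\[ \nabla^2 Y_\ell^m(\theta,\phi) = -\frac{1}{r^2}\,\ell(\ell+1)\, Y_\ell^m(\theta,\phi) \tag{D.42} \]
\[ \int Y_{\ell'}^{m'*}(\theta,\phi)\, Y_\ell^m(\theta,\phi)\, d\Omega = \delta_{\ell',\ell}\,\delta_{m',m} \tag{D.43} \]
\[ f(\theta,\phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} f_{\ell,m}\, Y_\ell^m(\theta,\phi) \quad \text{where} \tag{D.44} \]
\[ f_{\ell,m} = \int Y_\ell^{m*}(\theta,\phi)\, f(\theta,\phi)\, d\Omega \tag{D.45} \]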
Index
↑↓ symbols, 77, 82 characteristic equation, 367

characteristic matrix, 367
absorption of radiation, 343 classical limit of quantum mechanics,
action at a distance, spooky, 44 2
Aharonov, Yakir, 32 Clebsch, Alfred, 303
Aharonov-Bohm effect, 31–32 collapse of the state vector, 78
ambivate, 25 complex unit, 73
amplitude, 57 configuration space, 158
peculiarities of, 61, 64, 67, 72 continuous, 5
pronunciation of, 59 controlled approximation, 197
symbol for, 59 conundrum of projections, 10–19
analogy, 1, 72 Coulomb gauge, analogous to quantal
analyzer loop, 22 state, 79
analyzer, Stern-Gerlach, 12
definite value
approximation
use of, 180
controlled, 197
degenerate, 233
uncontrolled, 197
delight, 3
atomic units, 272–275
despair, 3, 6
dimensional analysis, 272–275
baum, 351 Dirac notation, 56
Bell’s Theorem, 45 Dirac, P.A.M., 56
Bell, John, 45 disparaging term attached to
blackbody radiation, 5 charming result, 233
Bloch, Felix, 158
Bohm, David, 32 effective potential energy function,
Bohr magneton, 9 232
Bohr radius, 273 Ehrenberg, Werner, 32
Born, Max, 46 Ehrenfest, Paul, 171
Bose, Satyendra, 254 eigenvalues, 363, 365
boson, 254 eigenvectors, 363, 365
Einstein A and B argument, 343–345 Laguerre, Edmond, 242

Einstein, Albert, 36, 37, 44, 53, 71, language, 2, 18, 25–26, 34, 44, 47, 57,
343–345 62, 64
entanglement, 44–47, 50–52, 77–81 for amplitude, 59
the word, 44, 47 swap v. exchange v. interchange,
EPR, 37, 44 251
exchange, 251 Legendre, Adrien-Marie, 236
Liouville, Joseph, 231
Fermi, Enrico, 254, 336 love, 179
fermion, 254 color of, 18, 25, 78
find-the-flaw problems, 29–30
football, 26 marble, 46, 179, 180
Frank–Hertz experiment, 6 matrix mathematics, 351–376
Franz, Walter, 32 measurement, 34
measurement disturbs the system”, 18
Galileo Galilei, 45 metaphor, 18, 35, 46
Gerlach, Walter, 7 misconceptions, 178
global phase freedom, 73, 75 a “wheels and gears” mechanism
Gordan, Paul, 303 undergirds quantum
mechanics, 46
Hückel, Erich, 158 all states are energy states, 4
Hamilton, Alexander, 121 amplitude is physically “real”, 4,
Hamilton, William Rowan, 120 61–62, 67, 78–79
Hermite, Charles, 108 atom can absorb light only if
Hilbert space, 71 ~ω = ∆E, 338
Hilbert, David, 71 collapse of the quantal state”
history of quantum mechanics, 32, 82 involves (or permits)
Holy Office of the Inquisition, 45 instantaneous
Hund’s rules, 305 communication, 78–79
Hund, Friedrich, 305 electron is a small, hard marble, 25
indeterminate quantity exists but
indeterminacy, 18, 25 changes rapidly, 18
Inquisition, Holy Office of the, 45 indeterminate quantity exists but
interchange, 251 changes unpredictably, 18
interference, 25, 339 indeterminate quantity exists but
constructive, 26 is disturbed upon
the physical manifestation of measurement, 18
superposition, 72 indeterminate quantity exists but
intuition regarding quantum knowledge is lacking, 4, 18,
mechanics, 45–47, 82 44, 62
indeterminate quantity exists in
Johnson-Trotter sequence of random shares, 18
permutations, 255 magnetic moment behaves like a
classical arrow, 18
Kelvin, 45 photon as ball of light, 46, 342
ket, 56 photon is a small, hard marble, 46
quantum mechanics applies only to reversal conjugation theorem, 64

small things, 2 richness, 1, 3
state of system given through Rosen, Nathan, 37
states of each constituent, rotation matrix, 353
78
misconceptions, catalog of, 4 scaled variables, 272–275
Schrödinger, Erwin, 44, 121
natural units, 275 shorthand, dangerous: say it but
Noether, Emmy, 303 don’t think it, 259
Siday, Raymond, 32
observation, 34 solutions, exact vs. approximate, 261
orthogonal matrix, 353 spin, 82
outer product, 358 spin- 21 systems, 52, 82
overall phase factor, 73, 75 spontaneous emission of radiation,
342, 343
perfection, 261 spooky action at a distance, 44
permutations, 255 state
phase factor, 73
definition, 56
phase freedom, global, 73, 75
of entangled system, 78
phase, overall, 73, 75
peculiarities of, 72, 76
photon, 341–342
state vector, 71
Picard, Émile, 332
Stern, Otto, 7, 36
plain changes sequence of
Stern-Gerlach analyzer, 12
permutations, 255
stimulated emission of radiation, 343
Planck constant, 5
Planck, Max, 5, 6 Sturm, Charles-François, 231
Podolsky, Boris, 37 Sturm-Liouville problems, 231
politics, 10 superposition, 72, 86
Pope, Alexander, 285 the mathematical reflection of
probability, 19 interference, 72
probability amplitude, 57 Susskind, Leonard, 78
PS swap, 251
test and reflect on your solution,
29–30 tentative character of science, 57–58
Thomson, William, 45
quantization, 5, 9, 10 tree, 351
quantum computer, 83 triangle inequality, 65
qubit systems, 52, 82 two-state systems, 52, 82
Rabi, I.I., 127 uncontrolled approximation, 197

Renninger negative-result
experiment, 35, 179 vacuum state, 341
replicator, 31 visualization, 18, 25, 46, 180
