Advanced Quantum Mechanics: 6ccm436a/7ccmms31
Neil Lambert∗
N.B. These notes are not particularly original but rather based on the lecture notes of
Profs. N. Drukker and G. Watts.
∗
email: [email protected]
Contents
0.1 Introduction
0.2 Plan
2 Angular Momentum
2.1 Symmetries
2.2 A Spherically Symmetric Potential
2.3 Angular Momentum Operators
6 Perturbation Theory
6.1 A Simple Example
6.2 First Order Non-Degenerate Perturbation Theory
6.3 Second Order Non-Degenerate Perturbation Theory
6.4 Third Order Non-Degenerate Perturbation Theory
6.5 Degenerate Perturbation Theory
0.1 Introduction
Quantum Mechanics arguably presents our most profound and revolutionary understanding of objective reality. In the classical world, as studied for example in “Classical
Dynamics” (5ccm231a), the physically observable world consists of knowing the positions and momenta (velocities) of every single particle at a moment in time. Then the
dynamical equations can be used to evolve the system so as to know the positions and
momenta in the future leading to a causal and highly predictive theory. You could
forgive the physicists of the 19th century for thinking that they were almost done. Ev-
erything seemed to work perfectly and there were just a few loose strings to get under
control.
It turned out that by pulling on those loose strings the whole structure of classical
physics and our understanding of objective reality fell apart. We now have a more
predictive and precise theory in the form of Quantum Mechanics. But no one has got
to grips with what it truly means.
The classic experiment, although there are many, that led to the unravelling of
classical mechanics relates to the question of whether light is a particle or
a wave. If you consider a coherent light source and send a beam of light through two
small slits and see what happens you will find an interference pattern. This is what
you would expect if light were a wave and the two slits produced two wave sources that
could add constructively or cancel out destructively. Thus an interference pattern
is observed. Fine, light is a wave. Let’s do the same with electrons. We find the same
thing! Shooting an electron beam at a double slit also leads to an interference pattern.
So electrons, which otherwise appear as point like particles with fixed electric charges
also behave like waves (and similarly there are experiments such as the photoelectric
effect which show that light behaves like a particle). This interference pattern even
exists if you slow the beam down so that one electron at a time is released.
What has happened? Nothing that has a classical interpretation. We can think of
electrons as waves with profile ψ. After passing through the slits we have two wave-
functions ψhole1 and ψhole2 . Quantum mechanics tells us that the system is described by
ψhole1 + ψhole2 but in this case the two wave profiles can constructively or destructively interfere (i.e. add or cancel out). Classically there isn’t really an explanation but
one would expect the two electron beams coming out from each of the slits to behave
independently. Experiment tells us which is true (see figure 0.1.1). Needless to say there
have been countless experiments since which confirm the quantum picture and refute
any classical explanation (most notably via Bell’s inequalities).
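The two descriptions can be compared numerically. The following sketch (assuming numpy; the wavelength, slit separation and screen distance are illustrative choices, not data from any particular experiment) models the slits as two coherent point sources and compares |ψhole1 + ψhole2 |2 with |ψhole1 |2 + |ψhole2 |2 :

```python
import numpy as np

# Two coherent point sources (the slits) a distance d apart, observed on a
# screen at distance L.  All lengths are in arbitrary illustrative units.
wavelength = 1.0
k = 2 * np.pi / wavelength              # wavenumber
d, L = 5.0, 100.0                       # slit separation, slit-to-screen distance

x = np.linspace(-40, 40, 2001)          # positions on the screen
r1 = np.sqrt(L**2 + (x - d / 2) ** 2)   # path length from slit 1
r2 = np.sqrt(L**2 + (x + d / 2) ** 2)   # path length from slit 2

psi1 = np.exp(1j * k * r1) / r1         # spherical-wave profiles
psi2 = np.exp(1j * k * r2) / r2

quantum = np.abs(psi1 + psi2) ** 2                  # |psi1 + psi2|^2: cross term present
classical = np.abs(psi1) ** 2 + np.abs(psi2) ** 2   # no cross term, no fringes

# The quantum intensity oscillates between ~0 (dark fringes) and roughly
# twice the classical sum (bright fringes).
print(quantum.min() / classical.max(), quantum.max() / classical.max())
```

The cross term 2 Re(ψ1∗ ψ2 ) is exactly what the classical sum of probabilities lacks.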
There are further revelations that we won’t get to speak much of. Relativity
came and told us that time is not absolute. Combining this with Quantum Mechanics
means that there really aren’t any particles at all, just local excitations of fields that
permeate spacetime. Why, after all, is every electron identical to every other? This is
the topic of Quantum Field Theory. However exploring Quantum Mechanics beyond an
introductory module such as “Introductory Quantum Theory” (6cmm332c) is crucial
Figure 0.1.1: After 100, 200, 500 and 1000 electrons hit the screen the interference
picture is formed (on top). Compare this to the probability distribution computed from
the wave function of the electron (bottom). In particular the quantum probability (left)
is obtained from |ψhole1 + ψhole2 |2 vs. the classical sum of probabilities |ψhole1 |2 + |ψhole2 |2
(right).
0.2 Plan
We hope to cover each chapter in a week - but this could fluctuate, it’s a course on
Quantum Mechanics after all. The main lectures, where new material is presented, are
on Mondays 4-6pm. There will also be in-person discussion sessions on Tuesdays from
11am-12pm. These are intended to discuss issues from the previous week’s lectures and
give further examples or calculations and are meant for the level 6 students (but anyone
can attend).
In addition there will be weekly problem sets found on Keats. You are strongly
urged to do the problems - no one ever understood Quantum Mechanics based on their
personal experience or by only listening to lectures. There are weekly tutorials on
Thursday morning which will discuss the problem set from the previous week.
There is a reading week starting October 30 where there will not be any lectures,
discussion sessions or tutorials.
Chapter 1
Review Of One-Dimensional
Quantum Mechanics
Let’s jump head first into the formulation of Quantum Mechanics. In this section we
will just consider one-dimensional systems, meaning one spatial dimension and time.
Conceptually the extension to higher dimensions is easy but more technically involved
and is the subject of much of this module. One can also consider quantum systems with
no spatial dimension. We will consider these in Chapter 8. It is expected that what is
said here is a review.
In the Hamiltonian formulation of Classical Mechanics the state of a system corre-
sponds to a point in phase space. One specifies the positions and momenta of all the
particles and then one knows everything there is to know about the system: the relevant
observable physical quantities, such as energy and pressure, are known functions of the
positions and momenta. Thus if you know the state of the system then you in principle
know all the observable physical quantities. Furthermore given the state of the system
at some initial time t = 0 one can use Hamilton’s equations to predict the state of the
system at a later time. In general Hamilton’s equations are non-linear so this is a complicated problem but for suitable classes of the Hamiltonian it is well-posed. (One can
also use the Lagrangian description with momentum replaced by velocity and Hamilton’s
equations by the Euler-Lagrange equations.) With this you can do experiments, build
fancy machines to fly you around the world or go to the moon. Fantastic. But as more
detailed experiments have shown over the past 100 years or more this is simply not true
fundamentally, on microscopic scales. Rather it is a coarse grained approximation to
reality.
inner product
〈ψ1 |ψ2 〉 = ∫ ψ1∗ ψ2 dx (1.2)
Showing this is a separable Hilbert space is one of those “better leave it to our analysis
friends” types of question. In cases like these one often refers to the state |ψ〉 as the
wavefunction ψ(x) and often writes
ψ(x) = 〈x|ψ〉 (1.3)
although this isn’t really defined as 〈x| doesn’t exist, even by physicists’ standards. We
will often use the terms state and wavefunction interchangeably.
Formally Hilbert spaces are technically very involved and our treatment hides many
subtleties. Complete means that all Cauchy sequences converge. Separable means that
the space contains a countable dense subset (it is ensured if there is a countable basis
|e1 〉, |e2 〉, |e3 〉, ... which we will
assume to be the case). As far as a physicist is concerned a Hilbert space is an infinite
dimensional complex vector space that behaves like CN but with N = ∞. In this
module we will behave as physicists and assume that all definitions and theorems can
be rigorously stated and proved without affecting our use of them; trust me (or more
precisely trust our colleagues) they can.
The statement that the state of the system is an element of a Hilbert space is already
profoundly weird as vector spaces allow you to add vectors. Thus if |ψ1 〉 and |ψ2 〉 are
two orthogonal states then
|ψ〉 = (1/√2) (|ψ1 〉 + |ψ2 〉) (1.4)
is also a state. There is no classical analogue of this. It sounds like madness. For
example Schrödinger famously exploited this by superposing the states of an alive and
a dead cat, thereby introducing a morass of philosophical confusion to the world.2
Definition The space of linear maps from H to C is called the dual space H∗ .
1
Hilbert himself did not know what a Hilbert space was and is reported to have entered a seminar
where it was being discussed saying “Hilbert space? What is that?”
2
To quote Stephen Hawking “whenever I hear Schrödinger’s cat I reach for my gun”.
Here we are using Dirac notation where we denote vectors by “kets” |ψ〉. Elements
of the dual space are denoted by “bras” 〈ψ| such that
〈ψ1 | (|ψ2 〉) = 〈ψ1 |ψ2 〉 (1.5)
There is a theorem which states that H∗ is also a Hilbert space and is isomorphic to H.
This means that for each |ψ〉 there is a unique 〈ψ| and vice versa. Dirac was a hugely
important figure in 20th Century physics, second only to Einstein. His text book on
Quantum Mechanics was foundational and remains one of the best to this day. We will
see more from him later.3
Hilbert spaces admit linear maps (the analogue of matrices in finite dimensions)
O:H→H (1.6)
The adjoint O† of such a map is defined by 〈ψ1 |Oψ2 〉 = 〈O† ψ1 |ψ2 〉
where, just as in finite dimensions, (O† )mn = O∗nm .
1.2 Observables
In addition to a Hilbert space of states Quantum Mechanics exploits the fact that there
exists a preferred class of linear maps which are self-adjoint (aka Hermitian) in the sense
that
〈ψ1 |Oψ2 〉 = 〈Oψ1 |ψ2 〉 (1.7)
for all vectors |ψ1 〉 and |ψ2 〉. Such maps are called Observables and they lead to an
important theorem: the eigenvalues of an observable are real and its eigenstates with
distinct eigenvalues are orthogonal. To see this suppose that O|ψ1,2 〉 = λ1,2 |ψ1,2 〉.
First we suppose that |ψ1 〉 = |ψ2 〉 and hence λ2 = λ1 . Thus we have, since |ψ1 〉 must
have non-zero norm, λ1 = λ∗1 . Next we suppose that |ψ1 〉 ≠ |ψ2 〉. Since we now know
that λ1 and λ2 are real we have
Thus if λ2 ≠ λ1 then 〈ψ1 |ψ2 〉 = 0. If there are degeneracies, meaning that there are
many eigenstates with the same eigenvalue then one can arrange to find an orthogonal
basis by a Gram-Schmidt process. But we can’t conclude that two such eigenstates are
orthogonal without extra work. Whereas we know immediately that if two eigenstates
of a self-adjoint operator have different eigenvalues then they are orthogonal.
If an operator is self-adjoint then there is a choice of basis |en 〉 so that (see problem
set 1)
O = ∑n λn |en 〉〈en | (1.13)
The above theorem tells us that we can find an orthonormal basis of eigenvectors of
an observable O. Thus any state |ψ〉 can be written as
|ψ〉 = ∑n cn |ψn 〉 (1.15)
where O|ψn 〉 = λn |ψn 〉. Intuitively the physical idea is that in order to measure an
observable one must probe the state you are observing, e.g. to find the position of an
electron you must look for it which means hitting it with light which will then impart
momentum and change it. Thus the act of measurement changes the system and hence
one doesn’t know what the system is doing after the measurement. Eigenstates are
special as they are, in some sense, unaffected by a particular measurement (but not
others).
Looking at |ψ〉 we can compute the expectation value of O to find
〈ψ|Oψ〉 = ∑m,n c∗m cn 〈ψm |Oψn 〉 = ∑m,n λn c∗m cn 〈ψm |ψn 〉 = ∑n λn c∗n cn = ∑n λn pn (1.16)
where we think of pn = c∗n cn as a probability distribution since the unit norm of |ψ〉
implies that
1 = 〈ψ|ψ〉 = ∑m,n c∗m cn 〈ψm |ψn 〉 = ∑n c∗n cn = ∑n pn (1.17)
The spread of measurements is captured by the variance (∆O)2 = 〈O2 〉 − 〈O〉2 (1.18).
Note that the right hand side can be written as 〈(O − 〈O〉I)2 〉 and hence is positive.
Another theorem asserts that if two operators commute:
[O1 , O2 ] = O1 O2 − O2 O1 = 0 (1.19)
then one can find an orthonormal basis of states which are simultaneously eigenstates
of both operators.
On the other hand for pairs of operators that don’t commute we find the famous
Heisenberg uncertainty principle:
∆O1 ∆O2 ≥ (1/2) |〈i[O1 , O2 ]〉| (1.20)
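A quick numerical sketch of the inequality, using the spin operators σx and σy on C2 (an illustrative choice, assuming numpy):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)

rng = np.random.default_rng(1)

def spread(O, psi):
    """Delta O = sqrt(<O^2> - <O>^2) in the normalised state psi."""
    m = np.vdot(psi, O @ psi).real
    m2 = np.vdot(psi, O @ O @ psi).real
    return np.sqrt(max(m2 - m**2, 0.0))

# Check Delta O1 Delta O2 >= (1/2)|<i[O1,O2]>| on many random states.
comm = 1j * (sx @ sy - sy @ sx)            # i[O1, O2] is itself self-adjoint
for _ in range(1000):
    psi = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    psi /= np.linalg.norm(psi)
    bound = 0.5 * abs(np.vdot(psi, comm @ psi).real)
    assert spread(sx, psi) * spread(sy, psi) >= bound - 1e-12
print("uncertainty bound holds on 1000 random states")
```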
The classic examples of this are the position and momentum operators:
x̂ψ(x) = xψ(x) , p̂ψ(x) = −iℏ ∂ψ(x)/∂x (1.21)
so that [x̂, p̂] = iℏ and hence
∆x̂ ∆p̂ ≥ ℏ/2 (1.22)
The ‘canonical’ way to quantise this system is to consider the Hilbert space L2 (RN ) of
functions of qi and make the replacements4
qi → q̂i , pi → p̂i = −iℏ ∂/∂qi (1.23)
which gives [q̂i , p̂j ] = iℏ δij
as required (and used above). Note that this corresponds to a particular choice of coordinates qi and conjugate momenta pi . However Hamiltonian systems admit canonical
4
In general we will avoid putting hats on operators but we need to do so here as qi have very specific
meanings as real numbers.
transformations that mix these and hence there are several ways to quantise correspond-
ing to choosing a ‘polarisation’, i.e. a choice of what is a q and what is a p.
However this is really putting the cart before the horse as the classical world emerges
from the quantum world and not the other way around. Indeed there are quantum
systems with no classical limit at all. They obey the rules of quantum mechanics that we
are outlining here but they need not come from “quantising” some classical Hamiltonian
system.
1.4 Time Evolution
Under quantisation the classical equation of motion
dO/dt = {O, H} (1.27)
becomes
dOH /dt = −(i/ℏ) [OH , H] (1.28)
Here I have introduced a subscript H on O to indicate that we are in the so-called
Heisenberg picture where operators evolve in time and states are constant. Equation
(1.28) is known as the Heisenberg equation.
A more familiar picture is the Schrödinger picture where
OS = e−iHt/ℏ OH eiHt/ℏ
|ΨS 〉 = e−iHt/ℏ |ΨH 〉 (1.29)
iℏ ∂/∂t |ΨS 〉 = H|ΨS 〉 (1.31)
This follows by thinking of the Hamiltonian operator H as the energy operator Ê and
making the identification
Ê = iℏ ∂/∂t (1.32)
In what follows we drop the subscript S and will always work in the Schrödinger picture.
One then sees that eigenstates of H with eigenvalue En evolve rather trivially (for
simplicity here we assume that H and hence its eigenvectors and eigenvalues are time
independent):
|Ψ(t)〉 = e−iEn t/ℏ |ψn 〉 (1.33)
I will try to stick to a convention where the time dependent wavefunction is |Ψ〉 and
the time independent one, meaning it is an eigenstate of the Hamiltonian, |ψ〉. The
difference then is just the time-dependent phase.
More generally given any state |Ψ〉 we can expand it in an energy eigenstate basis:
|Ψ〉 = ∑n cn (t)|ψn 〉 with cn (t) = e−iEn t/ℏ cn (0) (1.34)
and the cn (0) can be found by expanding |Ψ(0)〉 in the energy eigenstate basis. Of course
finding the eigenstates and eigenvalues of a typical Hamiltonian is highly non-trivial.
But in principle solving Quantum Mechanics is down to (infinite-dimensional) linear
algebra.
A central point is that time evolution is unitary, meaning that it preserves the
inner-product between two states. It is crucial for the consistency of the theory as
otherwise the probabilities we discussed above will fail to sum to unity.
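This can be illustrated numerically (a sketch assuming numpy and scipy, in units with ℏ = 1; the dimension and random Hamiltonian are arbitrary choices): time evolution by U (t) = e−iHt is unitary and agrees with evolving the eigenbasis coefficients cn (t) = e−iEn t cn (0):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)

# A random self-adjoint Hamiltonian on a 5-state system.
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
H = (A + A.conj().T) / 2

# U(t) = exp(-iHt) is unitary, so evolution preserves inner products
# and in particular the total probability.
t = 0.7
U = expm(-1j * H * t)
assert np.allclose(U.conj().T @ U, np.eye(5))     # unitarity

psi = rng.standard_normal(5) + 1j * rng.standard_normal(5)
psi /= np.linalg.norm(psi)

# Equivalently, evolve the eigenbasis coefficients by phases.
E, V = np.linalg.eigh(H)
c0 = V.conj().T @ psi
psi_t = V @ (np.exp(-1j * E * t) * c0)

assert np.allclose(psi_t, U @ psi)                # same evolution
print(np.linalg.norm(psi_t))                      # norm stays 1
```

Solving Quantum Mechanics really is (infinite-dimensional) linear algebra: once H is diagonalised, evolution is just phases.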
For a particle in one-dimension subject to a potential V (x) the quantum Hamiltonian
is
H = (1/2m) p̂2 + V (x̂) = −(ℏ2 /2m) ∂ 2 /∂x2 + V (x) (1.37)
Following the canonical quantization described above leads to the most familiar form of
the Schrödinger equation:
iℏ ∂Ψ/∂t = −(ℏ2 /2m) ∂ 2 Ψ/∂x2 + V (x)Ψ (1.38)
Here we have taken the Hilbert space to be L2 (R) and x̂, p̂ as above. In principle one
can find a basis of eigenvectors of H which obey
−(ℏ2 /2m) ∂ 2 ψn /∂x2 + V (x)ψn = En ψn . (1.39)
This is some differential equation and one typically finds a discrete spectrum of eigenval-
ues En and wavefunctions ψn (x). This is the origin of quantum in Quantum Mechanics.
We can then compute various expectation values and probabilities.
For example to normalize the wavefunction we require that
∫ |ψ|2 dx = 1 (1.40)
Given a region R ⊂ ℝ the interpretation of ψ is that the probability to find the particle
in R is
P (R) = ∫R |ψ|2 dx (1.41)
i.e. this is the fraction of times you will observe the particle in R after many measurements of its position.
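One way to see the discrete spectrum emerge is to discretise equation (1.39) on a grid (a sketch assuming numpy, in units ℏ = m = 1, for the harmonic potential V = x2 /2 with k = 1, whose exact eigenvalues turn out to be n + 1/2):

```python
import numpy as np

# Discretise H = -(1/2) d^2/dx^2 + x^2/2 on a grid (hbar = m = k = 1).
N, xmax = 1000, 8.0
x, dx = np.linspace(-xmax, xmax, N, retstep=True)

# Second derivative by central differences -> a tridiagonal matrix.
main = np.full(N, -2.0) / dx**2
off = np.ones(N - 1) / dx**2
H = -0.5 * (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) + np.diag(0.5 * x**2)

# The low-lying eigenvalues come out discrete, close to n + 1/2.
E = np.linalg.eigvalsh(H)[:5]
print(E)   # approximately 0.5, 1.5, 2.5, 3.5, 4.5
```

The discreteness is not an artifact of the grid: refining dx changes the eigenvalues only at order dx², and they converge to the exact quantised spectrum.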
This all sounds great and wonderful as we have reduced all the complexities of the
world down to linear algebra. Who knew?! But we seem to be nowhere near the classical
world we experience and you could be forgiven for thinking of Quantum Mechanics as
lying in the realm of crackpot science. But it’s not and it is in fact incredibly rich and
predictive with no known conflict with experiment.
So in a sense the classical world emerges as the most probable set of observations.
In particular there is the so-called correspondence principle which asserts that we
expect to find agreement with the classical theory in the limit of large quantum numbers
(i.e. large n in the case of equation 1.39). Okay so we are back in the so-called real world.
Whew.
1.5 The (Non)-Collapse of the Wavefunction
The most controversial statement in all of physics is the infamous Copenhagen interpretation which asserts that following a measurement corresponding to an eigenvalue
λ of an observable O the system then collapses to the associated eigenstate. This is
where indeterminacy enters as the system is evolving in time as some state |ψ(t)〉 as
dictated by the Schrödinger equation and then after the measurement it is suddenly
projected onto an eigenstate of O in a random and unpredictable way, not governed by
the Schrödinger equation but somehow chosen by god or the experimenter. In particular
it could be projected onto any of the eigenstates. It then evolves again according to the
Schrödinger equation until the next measurement.
No one believes this. However we all do it in practice as it works and the alternative
will keep you awake at night.
What people typically believe is that there is no classical world and no classical ob-
server who can interfere with the evolution of the state to make it suddenly collapse to
a particular eigenstate. Rather the world is a very large quantum system consisting of
a single wavefunction of many many variables and there are subtle correlations between
the various subsystems such that the wavefunction becomes strongly peaked around
certain eigenvalues and vanishingly small away from others. In particular it never col-
lapses to one eigenstate, there are no “measurements”, and it always evolves causally in
time according to the Schrödinger equation. The classical world appears because this
wavefunction is so strongly peaked around a particular set of classical outcomes that it
leads to an emergent classical history.
The disturbing thing about this is that there is no reason to think that the wave-
function will peak around a unique classical world. Rather there must be many, possibly
all, classical worlds that emerge from a single wavefunction. Different branches of the
wavefunction must surely exist and correspond to distinct classical histories and we just
get to live in one of these. Sleep well because in some branch of the wavefunction your
mirror self will have a very bad dream.
1.6 The Quantum Harmonic Oscillator
The harmonic oscillator potential V = kx2 /2 gives the Schrödinger equation
iℏ ∂Ψ/∂t = −(ℏ2 /2m) ∂ 2 Ψ/∂x2 + (1/2) kx2 Ψ (1.45)
We could proceed by finding the general solution to this differential equation. However
this is a difficult task and a much better analysis can be done using algebra.
Again our Hilbert space is H = L2 (R). We can write the Schrödinger equation as
Ê|Ψ〉 = ( (1/2m) p̂2 + (k/2) x̂2 ) |Ψ〉 (1.46)
Let us introduce the operators
â = (1/√(2ℏ)) ( (mk)−1/4 p̂ − i (mk)1/4 x̂ ) , ↠= (1/√(2ℏ)) ( (mk)−1/4 p̂ + i (mk)1/4 x̂ ) (1.47)
These satisfy
â↠= (1/(2ℏ√mk)) p̂2 + (√mk/2ℏ) x̂2 − (i/2ℏ)[x̂, p̂] = (1/(2ℏ√mk)) p̂2 + (√mk/2ℏ) x̂2 + 1/2
↠â = (1/(2ℏ√mk)) p̂2 + (√mk/2ℏ) x̂2 + (i/2ℏ)[x̂, p̂] = (1/(2ℏ√mk)) p̂2 + (√mk/2ℏ) x̂2 − 1/2
(1.48)
and hence
[â, ↠] = 1 , H = ℏω ( ↠â + 1/2 ) , ω ≡ √(k/m) (1.49)
We know that the normalisable modes have energies which are bounded below by
zero. In particular there is a lowest energy state that we call the ground state |0〉. Let
us introduce another operator N̂ called the number operator
N̂ = ↠â (1.52)
Thus it follows that the lowest energy eigenstate |0〉 must satisfy
N̂ |0〉 = 0 (1.54)
â|0〉 = 0 (1.55)
hence Ĥ|0〉 = (ℏ/2)√(k/m) |0〉, i.e. the ground state energy is
E0 = ℏω/2 (1.56)
We can create new states by acting with ↠:
|n〉 ∝ (↠)n |0〉 with N̂ |n〉 = n|n〉
and
N̂ â|n〉 = ([N̂ , â] + âN̂ )|n〉 = (−â + nâ)|n〉 = (n − 1)â|n〉
and
N̂ ↠|n〉 = ([N̂ , ↠] + ↠N̂ )|n〉 = (↠+ n↠)|n〉 = (n + 1)↠|n〉 (1.62)
The operators ↠and â are therefore known as raising and lowering operators respec-
tively. From this we see that
H|n〉 = ℏω ( N̂ + 1/2 ) |n〉 = ℏω ( n + 1/2 ) |n〉 (1.63)
So we have completely solved the system without ever solving a differential equation.
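The ladder-operator algebra can be realised concretely with truncated matrices on the number basis (a sketch assuming numpy, in units ℏ = ω = 1; the truncation size is an arbitrary choice):

```python
import numpy as np

# Truncated matrix representation of the lowering operator on the number
# basis |0>, ..., |N-1>:  a|n> = sqrt(n)|n-1>.
N = 60
a = np.diag(np.sqrt(np.arange(1, N)), 1)
ad = a.T                      # a^dagger: a^dagger|n> = sqrt(n+1)|n+1>

num = ad @ a                  # number operator N = a^dagger a
H = num + 0.5 * np.eye(N)     # H = omega (N + 1/2) with omega = 1

# [a, a^dagger] = 1 holds exactly away from the truncation edge.
comm = a @ ad - ad @ a
assert np.allclose(comm[:-1, :-1], np.eye(N - 1))

print(np.diag(H)[:4])         # energies 0.5, 1.5, 2.5, 3.5
```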
Furthermore this approach also allows us to explicitly construct the corresponding
energy eigenstate wavefunctions. For example the ground state satisfies
â|0〉 = (1/√(2ℏ)) ( (mk)−1/4 p̂ − i (mk)1/4 x̂ ) |0〉 = 0 (1.65)
where Ψ0 (x, t) = e−iE0 t/ℏ ψ0 (x) is the explicit element of L2 (R) that represents |0〉.
Rewriting this equation gives
dψ0 /dx = −(√mk/ℏ) x ψ0 (1.67)
The solution to this differential equation is simply ψ0 ∝ e−√mk x2 /2ℏ and hence
Ψ0 = N0 e−iE0 t/ℏ e−√mk x2 /2ℏ (1.68)
These have the form of a polynomial times the Gaussian e−√mk x2 /2ℏ , and we can explicitly construct them by acting repeatedly with ↠.
It would have been very difficult to find all these solutions to the Schrödinger equation
by brute force and also to use them to calculate quantities such as 〈x̂〉 and 〈x̂2 〉.
We can also calculate various expectation values in this formalism. For example, in
the state |n〉, we have
〈x̂〉 = 〈n|x̂n〉 = i√(ℏ/2) (mk)−1/4 〈n|(â − ↠)n〉 = i√(ℏ/2) (mk)−1/4 ( 〈n|ân〉 − 〈n|↠n〉 ) = 0 (1.71)
since â|n〉 ∝ |n − 1〉 and ↠|n〉 ∝ |n + 1〉 are orthogonal to |n〉. Similarly
〈x̂2 〉 = 〈n|x̂2 n〉 = −(ℏ/2√mk) 〈n|(â − ↠)2 n〉 = −(ℏ/2√mk) ( 〈n|â2 n〉 + 〈n|(↠)2 n〉 − 〈n|â↠n〉 − 〈n|↠ân〉 ) (1.73)
Now again (â)2 |n〉 ∝ |n − 2〉 and (↠)2 |n〉 ∝ |n + 2〉 so that the first two terms give zero.
However the last two terms give
〈x̂2 〉 = (ℏ/2√mk) ( 〈↠n|↠n〉 + 〈ân|ân〉 ) (1.74)
Now we have already seen that
|n + 1〉 = (1/√(n + 1)) ↠|n〉 , (1.75)
so that 〈↠n|↠n〉 = n + 1 and similarly 〈ân|ân〉 = n, giving 〈x̂2 〉 = (ℏ/√mk)(n + 1/2).
Alternatively one can solve the time-independent Schrödinger equation directly with the ansatz ψ = f (x) e−αx2 . Then
dψ/dx = ( df /dx − 2αxf ) e−αx2
d2 ψ/dx2 = ( d2 f /dx2 − 4αx df /dx − 2αf + 4α2 x2 f ) e−αx2 (1.81)
and
−(ℏ2 /2m) ( d2 f /dx2 − 4αx df /dx − 2αf + 4α2 x2 f ) + (1/2) kx2 f = Ef (1.82)
Choosing α = √mk/2ℏ cancels the x2 terms, leaving
d2 f /dx2 − 4αx df /dx − 2αf = −(2mE/ℏ2 ) f (1.83)
This equation will have polynomial solutions. Indeed if we write f = xn + . . . then
matching the leading order xn terms gives
−4αn − 2α = −2mE/ℏ2 (1.84)
From here we read off that
E = (2ℏ2 α/m) (n + 1/2) = ℏ√(k/m) (n + 1/2) (1.85)
as before.
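In the rescaled variable u = √(2α) x, equation (1.83) with the quantised energy (1.85) reduces to Hermite’s equation f ′′ − 2uf ′ + 2nf = 0, so the polynomial factors f are the (physicists’) Hermite polynomials. A quick check with numpy’s polynomial module:

```python
import numpy as np
from numpy.polynomial import Polynomial, Hermite

# Verify that the physicists' Hermite polynomials H_n solve
# H'' - 2u H' + 2n H = 0, which is (1.83) in the variable u = sqrt(2 alpha) x
# with E given by (1.85).
u = Polynomial([0, 1])
for n in range(6):
    Hn = Hermite.basis(n).convert(kind=Polynomial)
    residual = Hn.deriv(2) - 2 * u * Hn.deriv() + 2 * n * Hn
    assert np.allclose(residual.coef, 0)    # identically zero polynomial
print("H_0 .. H_5 verified")
```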
Chapter 2
Angular Momentum
Let us start our exploration of Quantum Mechanics by looking at more realistic three-
dimensional models. In Classical Dynamics the simplest, but also most common, sys-
tems have rotational symmetry which leads via Noether’s theorem to conserved angular
momentum. This in turn means that the system can typically be solved. The classic
example is the Kepler problem of a planet moving around the Sun.
It’s ridiculous to think of the quantum problem of something so macroscopic as a
planet moving around something even bigger such as the Sun. The correspondence
principle tells us that we should just reproduce the classical results with incredible
accuracy. But happily there is a suitable quantum analogue which is of a negatively
charged particle (an electron) moving around a heavier positively charged particle (a
nucleus) as found in atoms! But before we tackle this problem head on and see Quantum
Mechanics really working for us and matching experiment we should step back and think
of symmetries, and angular momentum in particular, in Quantum Mechanics.
2.1 Symmetries
In classical dynamics symmetries play an important role. Simply put these are transfor-
mations of the system that leave the dynamics unchanged. In a Hamiltonian formulation
these correspond to canonical transformations that leave the Hamiltonian invariant:
qi′ = qi + εTi + . . .
p′i = pi + εUi + . . . (2.3)
here the ellipsis denote higher order powers of ε and Ti , Ui are some functions on phase
space. It is known that for this to be a canonical transformation there must exist a
function Q on phase space such that
Ti = ∂Q/∂pi , Ui = −∂Q/∂qi (2.4)
This in turn means that
[H, Q] = 0 (2.8)
What does this mean? Well since they commute and are self-adjoint there is a basis of
eigenstates of both Q and H:
H|ψn 〉 = En |ψn 〉 , Q|ψn 〉 = qn |ψn 〉 (2.9)
As we have seen a state that is an eigenstate of H will remain an eigenstate under time
evolution. Thus if it is also an eigenstate of Q it will remain an eigenstate of Q. However
a typical state will not be an eigenstate of either H or Q. Nevertheless it is also easy
to see that the expectation value of Q in any state will be time independent (assuming
that Q is time independent):
d/dt 〈ψ|Q|ψ〉 = 〈(d/dt)ψ|Q|ψ〉 + 〈ψ|Q|(d/dt)ψ〉
= 〈−(i/ℏ)Hψ|Q|ψ〉 + 〈ψ|Q| − (i/ℏ)Hψ〉
= (i/ℏ) 〈ψ|H † Q − QH|ψ〉
= (i/ℏ) 〈ψ|[H, Q]|ψ〉
= 0 (2.10)
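A small numerical illustration (a sketch assuming numpy and scipy, with ℏ = 1; the matrices are arbitrary, built to share an eigenbasis so that [H, Q] = 0 by construction): the expectation value of Q is the same at every time, even though the state itself changes:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)

# Build H and Q as functions of the same unitary eigenbasis -> [H, Q] = 0.
U = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))[0]
H = U @ np.diag([0.0, 1.0, 1.0, 2.5]) @ U.conj().T
Q = U @ np.diag([1.0, -1.0, 2.0, 0.0]) @ U.conj().T
assert np.allclose(H @ Q - Q @ H, 0)           # they commute

psi = rng.standard_normal(4) + 1j * rng.standard_normal(4)
psi /= np.linalg.norm(psi)

# <Q> in the evolved state psi(t) = exp(-iHt) psi, at several times.
expQ = [np.vdot(expm(-1j * H * t) @ psi, Q @ (expm(-1j * H * t) @ psi)).real
        for t in (0.0, 0.5, 1.0, 2.0)]
print(expQ)    # four (numerically) equal numbers
```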
2.2 A Spherically Symmetric Potential
Consider a particle moving in a spherically symmetric potential V (r). It is natural to use spherical coordinates
x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ (2.13)
In which case (you can show this by a tedious calculation just using the chain rule, see
the problem sets, it’s easier if you know Riemannian geometry)
∇2 = (1/r2 ) ∂/∂r ( r2 ∂/∂r ) + (1/r2 sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/r2 sin2 θ) ∂ 2 /∂φ2 (2.14)
Furthermore we look for energy eigenstates Hψ = Eψ and separate variables as
ψ = u(r) Y (θ, φ)
where the separation constant is λ. In particular the second equation knows nothing about the
potential and hence nothing about the problem at hand:
( (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin2 θ) ∂ 2 /∂φ2 + 2mλ/ℏ2 ) Y = 0 (2.20)
Indeed the equation for Y (θ, φ) can be completely solved and the solutions are the
spherical harmonics Yl,m (θ, φ), labelled by two integers l, m with l ≥ 0 and |m| ≤ l.
On the other hand the first equation is a single ordinary differential equation for
u(r) and we have some hope to solve it:
−(1/r2 ) d/dr ( r2 du/dr ) + (2m/ℏ2 ) ( V (r) + λ/r2 − E ) u = 0 (2.21)
This is the only place which is sensitive to the particular problem at hand through the
choice of potential V (r).
Thus we have reduced the problem to solving two independent differential equations.
And the only one specific to the particular problem is a second order ordinary linear
differential equation. One can proceed with brute force (the Y ′ s can be found by sep-
aration of variables once more). You can find them on Wikipedia or Mathematica so
we won’t say more about them here - come on Tuesday! Although we will say that one
finds
λ = ℏ2 l(l + 1)/2m (2.22)
for some non-negative integer l. Later we will construct them in an insightful way. Indeed
spherical harmonics arise from some very important physics (angular momentum) and
mathematics (Lie algebras) underlying the system. So let us explore...
2.3 Angular Momentum Operators
Classically the angular momentum is
L = x × p (2.23)
or in components Li = εijk xj pk (2.24)
where i = 1, 2, 3 and we use the convention that repeated indices are summed over.
Therefore, following canonical quantisation, we find the operator
Li = −iℏ εijk xj ∂/∂xk (2.25)
for example
L1 = −iℏ x2 ∂/∂x3 + iℏ x3 ∂/∂x2 (2.26)
To gain some intuition let us look at
[L1 , L2 ] = L1 L2 − (1 ↔ 2)
= −ℏ2 ε1kl ε2mn xk ∂/∂xl ( xm ∂/∂xn ) − (1 ↔ 2)
= −ℏ2 ε1kl ε2mn ( xk δlm ∂/∂xn + xk xm ∂ 2 /∂xl ∂xn ) − (1 ↔ 2)
= −ℏ2 ε1kl ε2ln xk ∂/∂xn − ℏ2 ε1kl ε2mn xk xm ∂ 2 /∂xl ∂xn − (1 ↔ 2) (2.27)
Let’s look at the second term. We see that it is symmetric under k ↔ m and l ↔ n.
Therefore it is symmetric under 1 ↔ 2 and is cancelled when we subtract 1 ↔ 2. Thus
we have
[L1 , L2 ] = −ℏ2 ε1kl ε2ln xk ∂/∂xn − (1 ↔ 2)
= −ℏ2 ε123 ε231 x2 ∂/∂x1 − (1 ↔ 2)
= −ℏ2 x2 ∂/∂x1 − (1 ↔ 2)
= iℏ L3 (2.28)
where we have used the fact that l ≠ 1 in ε1kl and l ≠ 2 in ε2ln so the only non-zero
term comes from l = 3 and hence k = 2. This is a general fact which follows from the
identity1
Does this look familiar? If not it will by the end of the semester. If we rescale Li = ℏJi
then we find
[Ji , Jj ] = iεijk Jk
which is the Lie-algebra of su(2) - the simplest example of a Lie-algebra. There will be
more on Lie algebras later. This structure is powerful enough for us to deduce more
or less anything we need to know about states with angular momentum from purely
algebraic considerations.
1
Check it for yourself or use Mathematica. I don’t know any “smart” way to do it, just think of
the cases and note that the left hand side is anti-symmetric in i, j.
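The algebra can also be verified directly for the spin-1/2 realisation Ji = τi /2 with the Pauli matrices (a sketch assuming numpy):

```python
import numpy as np

# Pauli matrices and J_i = tau_i / 2.
tau = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
J = [t / 2 for t in tau]

# Levi-Civita symbol eps_ijk.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1, -1

# Verify [J_i, J_j] = i eps_ijk J_k for every index pair.
for i in range(3):
    for j in range(3):
        lhs = J[i] @ J[j] - J[j] @ J[i]
        rhs = sum(1j * eps[i, j, k] * J[k] for k in range(3))
        assert np.allclose(lhs, rhs)
print("su(2) relations hold for J_i = tau_i / 2")
```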
Chapter 3
All About SU(2)
3.1 Representations
We have seen that the angular momentum operators (with factors of ℏ stripped off)
satisfy [Ji , Jj ] = iεijk Jk but it is also not hard to see that
Ji = (1/2) τi (3.1)
where τi are the Pauli matrices also satisfy [Ji , Jj ] = iεijk Jk . So what else does?
The commutation relation satisfied by the J ′ s is an example of a Lie-algebra. Roughly
speaking (and there is a whole module devoted to this) a Lie algebra is what you get
when you look at infinitesimally small group transformations (more on this shortly). The
formal definition is that a Lie-algebra is a vector space G (often taken to be complex)
with a bi-linear map
[ · , · ]:G×G →G (3.2)
which is antisymmetric: [A, B] = −[B, A] and satisfies the Jacobi identity:
[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0 (3.3)
This last relation is automatically obeyed by matrices if we take [A, B] = AB − BA
since matrix multiplication is associative (A(BC) = (AB)C). If we let Ta be a basis
for G where a = 1, . . . , dim G then it must be that there exist constants, called structure
constants, such that
[Ta , Tb ] = ifab c Tc (3.4)
where there is a sum over c and the Ta are known as generators of the Lie-algebra.
The factor of i requires some explanation. It is often not there in the mathematics
literature. However in Physics we like our generators to be Hermitian (just like the
angular momentum operators). If the Ta are Hermitian then
([Ta , Tb ])† = Tb† Ta† − Ta† Tb†
= Tb Ta − Ta Tb
= −[Ta , Tb ] (3.5)
Thus the factor of i ensures that fab c are real. In Mathematics one often drops the factor
of i and takes Ta = −Ta† .
Next we wish to construct all finite dimensional unitary representations of su(2).
Formally a representation is a linear map
Π : G → End(V ) (3.6)
such that
Π([A, B]) = [Π(A), Π(B)] (3.7)
Here V is a vector space (typically complex) and End(V ) are the set of endomorphisms
of V (linear maps from V to V ). In physics language V = CN and End(V ) are N × N
matrices.
We are particularly interested in irreducible representations. These are representations with no non-trivial invariant subspaces. That is, there are no proper non-zero
vector subspaces of V that are mapped to themselves by all the Π(A).
Let us suppose that we are given matrices Ji that satisfy [Ji , Jj ] = iεijk Jk . Since we
want a unitary representation we assume that Ji† = Ji but we do not know anything else
yet and we certainly don’t assume that they are 2 × 2 matrices or differential operators.
Note that we have both the dimension of the Lie-algebra dim G and the dimension of
the representation N . The dimension of su(2) is three (the generators are J1 , J2 , J3 ) but
we will see that we can construct N × N Hermitian matrices Ji , i.e. N -dimensional
representations of su(2), for any N = 1, 2, 3, . . . .
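As a concrete illustration of this claim (a sketch assuming numpy; it uses the standard ladder-operator matrix elements, anticipating the representation theory developed below), one can build N × N matrices satisfying the su(2) relations for several N :

```python
import numpy as np

def su2_irrep(N):
    """Return N x N Hermitian J1, J2, J3 with [Ji, Jj] = i eps_ijk Jk.

    Here N = 2j + 1 for spin j = 1/2, 1, 3/2, ...; the matrices are built
    from the standard ladder matrix elements of J+ and J- with
    J3 = diag(j, j-1, ..., -j).
    """
    j = (N - 1) / 2
    m = j - np.arange(N)                   # J3 eigenvalues j, j-1, ..., -j
    J3 = np.diag(m).astype(complex)
    # J+|j,m> = sqrt(j(j+1) - m(m+1)) |j,m+1>; raising m moves the row index up.
    Jp = np.diag(np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), 1).astype(complex)
    J1 = (Jp + Jp.conj().T) / 2            # J1 = (J+ + J-)/2
    J2 = (Jp - Jp.conj().T) / (2 * 1j)     # J2 = (J+ - J-)/(2i)
    return J1, J2, J3

for N in (2, 3, 4, 7):
    J1, J2, J3 = su2_irrep(N)
    assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)
    assert np.allclose(J2 @ J3 - J3 @ J2, 1j * J1)
    assert np.allclose(J3 @ J1 - J1 @ J3, 1j * J2)
print("Hermitian representations of dimension 2, 3, 4 and 7 check out")
```

For N = 2 this reproduces Ji = τi /2.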
3.2 Groups vs Algebras
Near the identity a group element can be written as
g = I + iθi Ji + . . . (3.8)
with θi ≪ 1 some arbitrarily small parameters. To obtain representations of the group
we simply exponentiate these matrices:
g = exp( iθi Ji ) (3.9)
For su(2) with Ji = τi /2 we write θi = θni where n is a unit vector, i.e.
θ = θn (3.11)
(n · τ )2 = ni nj τi τj
= ni nj (δij I + iεijk τk )
= n · nI
=I (3.12)
This means that (in · τ )2 = −I. Thus we have (the proof is the same as for eiθ =
cos θ + i sin θ)
g = exp( (i/2) θ · τ ) = cos(θ/2) I + i n · τ sin(θ/2) (3.13)
and hence
SO(3) ≅ SU (2)/Z2 (3.17)
where Z2 is generated by the centre of SU (2) - that is to say the set of all elements
of SU (2) that commute with every element of SU (2). This means that they are
multiples of the identity and this leaves just ±I. In particular
R = I + θ · K + ... (3.19)
with (rows separated by semicolons)
K1 = ( 0 0 0 ; 0 0 1 ; 0 −1 0 ),  K2 = ( 0 0 1 ; 0 0 0 ; −1 0 0 ),  K3 = ( 0 1 0 ; −1 0 0 ; 0 0 0 )  (3.20)
which is indeed an infinitesimal rotation around the z axis (for K3). One can see explicitly that if we set
Ji = iKi  (3.23)
then we find [Ji, Jj] = iεijk Jk, which is su(2). Note that the Ki's are not Hermitian so we can't diagonalise them, but the Ji's are so we could (though the physical significance of this is obscure!) and we'd find the l = 1 representation that we constructed above (it is easy to see that the eigenvalues of J3 = iK3 are −1, 0, 1).
Exponentiating these gives a relatively complicated expression but we can just consider rotations about z:
g = e^{θ3 K3} = ( 0 0 0 ; 0 0 0 ; 0 0 1 ) + cos θ3 ( 1 0 0 ; 0 1 0 ; 0 0 0 ) + sin θ3 ( 0 1 0 ; −1 0 0 ; 0 0 0 )
  = ( cos θ3  sin θ3  0 ; −sin θ3  cos θ3  0 ; 0 0 1 )  (3.24)
Here we have used the fact that K3, and all powers of K3, splits into a non-trivial 2 × 2 bit that squares to minus the identity and a trivial part with only zeros. Thus we have rotations with θ3 ∈ [0, 2π].
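As an aside (not part of the notes' derivation), one can check (3.24) numerically by exponentiating K3 with SciPy's matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# generator of rotations about the z axis, as in (3.20)
K3 = np.array([[0., 1., 0.],
               [-1., 0., 0.],
               [0., 0., 0.]])

theta3 = 0.8
R = expm(theta3 * K3)  # e^{theta3 K3}

# closed form (3.24)
R_expected = np.array([[np.cos(theta3),  np.sin(theta3), 0.],
                       [-np.sin(theta3), np.cos(theta3), 0.],
                       [0., 0., 1.]])

assert np.allclose(R, R_expected)
assert np.allclose(R.T @ R, np.eye(3))  # R is orthogonal
```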
Thus the homomorphism Φ : SU(2) → SO(3) is
Φ( e^{(i/2)θ·τ} ) = e^{θ·K}  (3.25)
Along these lines we see that any representation of su(2) will exponentiate to give a representation of SO(3) if
e^{2πiJ3} = I  (3.26)
A general element of SU(2) can be written as
g = AI + iB · τ  (3.27)
with A, B real and A² + B · B = 1, so SU(2) is topologically the three-sphere S³ and in particular simply connected. The image in SO(3) of a path in SU(2) from I to −I is closed in SO(3) but can't be continuously deformed to a point. But curiously enough going around this loop twice gives a loop that is deformable to a point, i.e. π1(SO(3)) = Z2.
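The closed form (3.13) for SU(2) exponentials can also be verified numerically. The following sketch (using SciPy, with an arbitrary rotation vector) checks it and confirms that the result is unitary with unit determinant:

```python
import numpy as np
from scipy.linalg import expm

# Pauli matrices tau_1, tau_2, tau_3
tau = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]], dtype=complex)]

theta_vec = np.array([0.3, -1.1, 0.7])   # arbitrary rotation vector
theta = np.linalg.norm(theta_vec)
n = theta_vec / theta                     # unit axis
n_tau = sum(ni * ti for ni, ti in zip(n, tau))

g_exp = expm(0.5j * theta * n_tau)        # e^{(i/2) theta n.tau}
g_closed = np.cos(theta / 2) * np.eye(2) + 1j * np.sin(theta / 2) * n_tau

assert np.allclose(g_exp, g_closed)                     # formula (3.13)
assert np.allclose(g_exp.conj().T @ g_exp, np.eye(2))   # unitary
assert np.isclose(np.linalg.det(g_exp), 1.0)            # det = 1, so g is in SU(2)
```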
More general representations arise by considering tensors Tµ1...µn over C² for su(2) or R³ for SO(3). The group elements act on each of the µi indices in the natural way. In general this does not give an irreducible representation. For larger groups such as SU(N) and SO(N) taking Tµ1...µn to be totally anti-symmetric does lead to an irreducible representation. So does taking it totally symmetric and traceless on any pair of indices.
What happens in Nature? We saw when constructing the spherical harmonics that l was an integer, but we have also seen that representations with half-integer l exist. The spherical harmonics arise because there is a fundamental SO(3) rotational symmetry of space. Later we will see that actually the symmetry of space is SU(2) and that there are particles, Fermions, which transform under SU(2) and not just SO(3). Sometimes one writes SU(2) = Spin(3) where Spin(d) is the simply connected cover of SO(d). This is critical as Fermions must satisfy the Pauli exclusion principle, which means that no two Fermions can be in the same state. This is ultimately what makes atoms and matter stable enough for us to exist.
Furthermore with relativity the symmetry group of spacetime is enhanced to SO(1, 3). But again there are Fermions and the actual symmetry group is Spin(1, 3) ≅ SL(2, C), whose complexified Lie algebra is su(2) ⊕ su(2). By a happy coincidence everything is still described by SU(2). This is a fluke of being in a relatively low dimension. In higher dimensions the spacetime groups and algebras are more complicated than those of SU(2). However many physicists can happily spend their lives only looking at SU(2).
We note that
J² = (J1)² + (J2)² + (J3)²,  (3.30)
is a Casimir. That means it commutes with all the generators:
[J², Ji] = Σ_j [Jj², Ji]
 = Σ_j ( Jj [Jj, Ji] + [Jj, Ji] Jj )
 = Σ_{jk} ( Jj iεjik Jk + iεjik Jk Jj )
 = Σ_{jk} iεjik ( Jj Jk + Jk Jj )
 = 0 ,  (3.31)
since εjik is anti-symmetric in j and k while Jj Jk + Jk Jj is symmetric.
There is a famous theorem known as Schur’s lemma which states that any such Casimir
must act as a multiple of the identity in an irreducible representation. This means that
J 2 = λI in any irreducible representation. In practical terms if J 2 commutes with all
other operators then nothing will change the eigenvalue of J 2 .
Since the Ji are Hermitian we can choose to diagonalise one, but only one since su(2) has rank 1, say J3. Thus the representation has a basis of states labelled by eigenvalues
of J3 and J 2 :
J3 |λ, m〉 = m|λ, m〉 J 2 |λ, m〉 = λ|λ, m〉. (3.32)
In analogy to the harmonic oscillator we swap J1 and J2 for the operators
J± = J1 ± iJ2
Notice that
[J3, J±] = ±J± ,  [J+, J−] = 2J3
Therefore we have
J+|λ, m⟩ = cm |λ, m + 1⟩ ,  J−|λ, m⟩ = dm |λ, m − 1⟩
where the constants cm and dm are chosen to ensure that the states are normalized (we are assuming for simplicity that the eigenspaces of J3 are one-dimensional - we will return to this shortly).
To calculate cm we evaluate
|cm|² = ⟨λ, m|J−J+|λ, m⟩ = ⟨λ, m|( J² − J3² − J3 )|λ, m⟩ = λ − m² − m
so that cm = √(λ − m² − m). Similarly for dm:
|dm|² = ⟨λ, m|J+J−|λ, m⟩ = ⟨λ, m|( J² − J3² + J3 )|λ, m⟩ = λ − m² + m
So that
dm = √(λ − m² + m) .  (3.40)
Thus we see that any irrep of su(2) is labelled by λ and has states with J3 eigenvalues m, m ± 1, m ± 2, . . .. If we look for finite dimensional representations then there must be a highest value mh and a lowest value ml of the J3-eigenvalue. Furthermore the corresponding states must satisfy
J+|λ, mh⟩ = 0 ,  J−|λ, ml⟩ = 0
which imply λ = mh(mh + 1) = ml(ml − 1). This is a quadratic equation for ml as a function of mh and hence has two solutions. Simple inspection tells us that
ml = −mh  or  ml = mh + 1 .  (3.45)
The second solution is impossible since ml ≤ mh and hence the spectrum of J3 eigenvalues is:
mh , mh − 1, ..., −mh + 1, −mh , (3.46)
with a single state assigned to each eigenvalue. Furthermore there are 2mh + 1 such
eigenvalues and hence the representation has dimension 2mh + 1. This must be an
integer so we learn that
2mh = 0, 1, 2, 3.... . (3.47)
We return to the issue of whether or not the eigenspaces of J3 can be more than one-dimensional. If the eigenspace with m = mh is N-dimensional then acting with J− produces an N-dimensional eigenspace for each eigenvalue m. This would lead to a reducible representation where one could simply take one-dimensional subspaces of each eigenspace. Let us then suppose that there is only a one-dimensional eigenspace for m = mh, spanned by |λ, mh⟩. It is then clear that acting with J− produces all states and each eigenspace of J3 is one-dimensional, spanned by |λ, m⟩ ∝ (J−)^n |λ, mh⟩ for some n = 0, 1, ..., 2mh.
In summary, and changing notation slightly to match the standard conventions, we have obtained a (2l + 1)-dimensional unitary representation for each l = 0, 1/2, 1, 3/2, ..., with Casimir J² = l(l + 1)I (in terms of what we had before, l = mh). The states can be labelled by |l, m⟩ where m = −l, −l + 1, ..., l − 1, l.
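The construction above is completely explicit, so one can build the spin-l matrices for any l and verify the algebra and the Casimir numerically. A sketch (not from the notes; the function name su2_irrep is ours, and the matrix elements use cm, dm from (3.40) with λ = l(l + 1)):

```python
import numpy as np

def su2_irrep(l):
    """Spin-l matrices J1, J2, J3 in the basis |l,m>, m = l, l-1, ..., -l."""
    dim = int(round(2 * l)) + 1
    m = l - np.arange(dim)                       # J3 eigenvalues, descending
    J3 = np.diag(m)
    # <l,m+1| J+ |l,m> = sqrt(l(l+1) - m(m+1)), placed on the superdiagonal
    Jp = np.diag(np.sqrt(l * (l + 1) - m[1:] * (m[1:] + 1)), k=1)
    Jm = Jp.T                                    # J- = (J+)^dagger (real entries)
    J1 = (Jp + Jm) / 2
    J2 = (Jp - Jm) / 2j
    return J1, J2, J3

for l in (0.5, 1, 1.5, 2):
    J1, J2, J3 = su2_irrep(l)
    # su(2) commutation relation [J1, J2] = i J3
    assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)
    # Casimir is l(l+1) times the identity, as Schur's lemma demands
    Jsq = J1 @ J1 + J2 @ J2 + J3 @ J3
    assert np.allclose(Jsq, l * (l + 1) * np.eye(int(round(2 * l)) + 1))
```

For l = 1/2 this reproduces exactly the matrices (3.49)-(3.54) below.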
Let us look at some examples.
l = 0: Here we have just one state |0, 0〉 and the matrices Ji act trivially. This is the
trivial representation.
l = 1/2: Here we have 2 states:
|1/2, 1/2⟩ = ( 1 ; 0 ) ,  |1/2, −1/2⟩ = ( 0 ; 1 ) .  (3.48)
By construction J3 is diagonal:
J3 = ( 1/2  0 ; 0  −1/2 ) .  (3.49)
We can determine J+ through
J+|1/2, −1/2⟩ = √(3/4 − 1/4 + 1/2) |1/2, 1/2⟩ = |1/2, 1/2⟩ ,  J+|1/2, 1/2⟩ = 0  (3.50)
so that
J+ = ( 0 1 ; 0 0 ) .  (3.51)
And we can determine J− through
J−|1/2, 1/2⟩ = √(3/4 − 1/4 + 1/2) |1/2, −1/2⟩ = |1/2, −1/2⟩ ,  J−|1/2, −1/2⟩ = 0  (3.52)
so that
J− = ( 0 0 ; 1 0 ) .  (3.53)
Or alternatively
J1 = (J+ + J−)/2 = (1/2)( 0 1 ; 1 0 )
J2 = (J+ − J−)/2i = (1/2)( 0 −i ; i 0 )  (3.54)
Let us return to thinking about angular momentum more concretely. Having constructed all three operators
Li = −iℏ εijk x^j ∂/∂x^k  (4.1)
it is helpful to introduce the quadratic Casimir
L² = L1² + L2² + L3²  (4.2)
Please do not confuse the Casimir L² with the second component L2! We have the result that
[L², Li] = 0  (4.3)
Thus it follows that we can find states which are eigenstates of L² and one of the Li's but no more. The usual choice by far is to take this to be L3. So our states are
eigenstates of L2 (which we take to all have the same eigenvalue) and L3 . This is such
a well-studied system that the treatment is very standard. Let us introduce raising and
lowering operators
L+ = L1 + iL2 ,  L− = L1 − iL2  (4.4)
which satisfy L− = L+†. The commutation relations are now
[L3, L+] = [L3, L1 + iL2] = iℏL2 + ℏL1 = ℏL+
[L3, L−] = [L3, L1 − iL2] = iℏL2 − ℏL1 = −ℏL−
[L+, L−] = [L1 + iL2, L1 − iL2] = −2i[L1, L2] = 2ℏL3  (4.5)
Finally we note that
L² = ( (L+ + L−)/2 )² + ( (L+ − L−)/2i )² + L3²
 = (1/4)( L+² + L−² + L+L− + L−L+ − L+² − L−² + L+L− + L−L+ ) + L3²
 = (1/2)( L+L− + L−L+ ) + L3²
 = L−L+ + (1/2)[L+, L−] + L3²
 = L−L+ + ℏL3 + L3²  (4.6)
42 CHAPTER 4. BACK TO ANGULAR MOMENTUM
Everything we have done here is based on the simple algebraic fact that
[Li, Lj] = iℏ εijk Lk
where Li = ℏJi. Nothing has relied on the explicit differential form of the operators Li.
However now it might be helpful to see what the previous analysis looks like in terms
of these differential operators.
Li = −iℏ εijk x^j ∂/∂x^k  (4.8)
As written and interpreted the Li act on the space of differential functions on R3 . This
is an infinite dimensional vector space.
It's not surprising then that this is a reducible representation. For example consider the set of functions of the form ψ = f(x² + y² + z²); then
Li ψ = −iℏ εijk x^j ∂ψ/∂x^k = −2iℏ εijk x^j x^k f′ = 0  (4.9)
by the anti-symmetry of εijk, so these span an invariant subspace.
To continue it is wise to return to spherical coordinates
x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ (4.10)
The derivatives then become
∂/∂x = (∂r/∂x) ∂/∂r + (∂θ/∂x) ∂/∂θ + (∂φ/∂x) ∂/∂φ
     = sin θ cos φ ∂/∂r + (cos θ cos φ / r) ∂/∂θ − (sin φ / (r sin θ)) ∂/∂φ
∂/∂y = (∂r/∂y) ∂/∂r + (∂θ/∂y) ∂/∂θ + (∂φ/∂y) ∂/∂φ
     = sin θ sin φ ∂/∂r + (cos θ sin φ / r) ∂/∂θ + (cos φ / (r sin θ)) ∂/∂φ
∂/∂z = (∂r/∂z) ∂/∂r + (∂θ/∂z) ∂/∂θ + (∂φ/∂z) ∂/∂φ
     = cos θ ∂/∂r − (sin θ / r) ∂/∂θ  (4.13)
Thus
L1 = −iℏ( y ∂/∂z − z ∂/∂y )
  = −iℏ r sin θ sin φ ( cos θ ∂/∂r − (sin θ/r) ∂/∂θ )
   + iℏ r cos θ ( sin θ sin φ ∂/∂r + (cos θ sin φ/r) ∂/∂θ + (cos φ/(r sin θ)) ∂/∂φ )
  = iℏ( sin φ ∂/∂θ + cot θ cos φ ∂/∂φ )  (4.14)
L2 = −iℏ( z ∂/∂x − x ∂/∂z )
  = −iℏ r cos θ ( sin θ cos φ ∂/∂r + (cos θ cos φ/r) ∂/∂θ − (sin φ/(r sin θ)) ∂/∂φ )
   + iℏ r sin θ cos φ ( cos θ ∂/∂r − (sin θ/r) ∂/∂θ )
  = iℏ( −cos φ ∂/∂θ + cot θ sin φ ∂/∂φ )  (4.15)
L3 = −iℏ( x ∂/∂y − y ∂/∂x )
  = −iℏ r sin θ cos φ ( sin θ sin φ ∂/∂r + (cos θ sin φ/r) ∂/∂θ + (cos φ/(r sin θ)) ∂/∂φ )
   + iℏ r sin θ sin φ ( sin θ cos φ ∂/∂r + (cos θ cos φ/r) ∂/∂θ − (sin φ/(r sin θ)) ∂/∂φ )
  = −iℏ ∂/∂φ  (4.16)
L+ = L1 + iL2 = iℏ e^{iφ}( −i ∂/∂θ + cot θ ∂/∂φ )
L− = L1 − iL2 = iℏ e^{−iφ}( i ∂/∂θ + cot θ ∂/∂φ )  (4.17)
We've seen these equations before: |l, m⟩ = Yl,m are the spherical harmonics
L3 Yl,m = ℏ m Yl,m
L² Yl,m = ℏ² l(l + 1) Yl,m  (4.19)
The set {Yl,m | m = −l, ..., +l} therefore provides an irreducible, (2l + 1)-dimensional representation of su(2) inside the space of all differentiable functions on S² (viewed as the unit sphere in R³).
In particular we see that the 3-dimensional Hamiltonian with a spherically symmetric potential can be written as
H = −(ℏ²/2m) (1/r²) ∂/∂r ( r² ∂/∂r ) + Veff(r)  (4.20)
with
Veff(r) = V(r) + L²/(2mr²) = V(r) + ℏ² l(l + 1)/(2mr²)  (4.21)
where in the second line we have assumed that we are looking at angular momentum eigenstates, i.e. eigenstates of L². It's an easy exercise to see that L² does indeed commute with H.
Let us look for the eigenfunctions corresponding to the states |l, m⟩. We can construct them relatively easily. They satisfy
L3 |l, m⟩ = −iℏ ∂/∂φ |l, m⟩ = ℏm |l, m⟩  (4.22)
Thus
|l, m⟩ = e^{imφ} Θl,m(θ)  (4.23)
for some function Θl,m(θ). We start with the highest weight state |l, l⟩. We know that
0 = L+|l, l⟩ = iℏ e^{iφ}( −i ∂/∂θ + cot θ ∂/∂φ ) e^{ilφ} Θl,l(θ)
which reduces to
0 = ( −i d/dθ + i l cot θ ) Θl,l(θ)  (4.24)
This is easily solved:
dΘl,l/Θl,l = l cot θ dθ = l d ln sin θ  =⇒  Θl,l(θ) = Cl (sin θ)^l  (4.25)
for some constant Cl. The remaining states then follow by repeatedly applying L−. Note that we haven't worried here about the normalisation but one can. In practice these functions are given in books or built into Mathematica.
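The highest-weight conditions (4.24)-(4.25) can be checked symbolically. The following sketch (using SymPy, with ℏ = 1 and a concrete choice l = 3; not part of the notes) verifies that e^{ilφ}(sin θ)^l is annihilated by L+ and has L3 eigenvalue l:

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')
l = 3  # any non-negative integer works here

# highest-weight wavefunction Y_{l,l} ~ e^{i l phi} sin^l(theta), up to normalisation
Y = sp.exp(sp.I * l * phi) * sp.sin(theta)**l

# L+ = e^{i phi} (d/dtheta + i cot(theta) d/dphi), in units with hbar = 1
Lplus_Y = sp.exp(sp.I * phi) * (sp.diff(Y, theta)
                                + sp.I * sp.cot(theta) * sp.diff(Y, phi))
assert sp.simplify(Lplus_Y) == 0       # highest-weight state is annihilated

# L3 = -i d/dphi gives eigenvalue l
L3_Y = -sp.I * sp.diff(Y, phi)
assert sp.simplify(L3_Y - l * Y) == 0
```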
Consider now a system of two particles, with wavefunction
Ψ(t, x1, x2)  (4.27)
We can act with angular momentum operators L^(1)_i and L^(2)_i on the x1 and x2 dependence separately, i.e. each is a representation of su(2). It is easy to see that we have four commuting observables:
(L^(1))², L3^(1), (L^(2))², L3^(2)  (4.30)
where (L^(A))² = (L1^(A))² + (L2^(A))² + (L3^(A))², and hence we have four eigenvalues determined by (l1, m1; l2, m2). Each of these has
−l1 ≤ m1 ≤ +l1 ,  −l2 ≤ m2 ≤ +l2  (4.31)
Alternatively we can consider the total angular momentum Li = L^(1)_i + L^(2)_i and the four operators L², L3, (L^(1))², (L^(2))². It's easy to see that these all commute with each other and therefore we can find a basis of eigenstates with eigenvalues determined by l, m, l1, l2. Here l and m relate to the total angular momentum-squared and the total angular momentum around the z-axis. This amounts to a change of basis from |l1, m1; l2, m2⟩ to a basis labelled by |l, m, l1, l2⟩ and takes the form
|l, m, l1, l2⟩ = Σ_{m1=−l1}^{l1} Σ_{m2=−l2}^{l2} C^{l1,l2;l}_{m1,m2;m} |l1, m1; l2, m2⟩  (4.34)
where the coefficients C^{l1,l2;l}_{m1,m2;m} are known as Clebsch-Gordan coefficients.
More abstractly, consider two Hilbert spaces H1 and H2 carrying representations of su(2) (and spanned by spherical harmonics with fixed lA) with dimensions 2l1 + 1 and 2l2 + 1 respectively. In such a finite dimensional Hilbert space the angular momentum operators simply behave as matrices forming a representation of su(2) (recall the definition of a representation):
Π1(J^(1)_i) ,  Π2(J^(2)_i)  (4.36)
Here we have switched from L^(A)_i to J^(A)_i so we can drop ugly factors of ℏ. We can lift this to a representation on H1 ⊗ H2 by taking
ΠT(Ji) = Π1(J^(1)_i) ⊗ I + I ⊗ Π2(J^(2)_i)  (4.37)
It’s easy to check that this is a representation. But it is generically reducible, i.e. not
irreducible. Rather it will be a direct sum of irreducible representations.
Without loss of generality we can take l1 ≥ l2 . It can be shown that it decomposes
into a direct sum over 2l2 + 1 irreducible representations with
l = l1 − l2 , l1 − l2 + 1, l1 − l2 + 2, ...., l1 + l2 (4.38)
where the bold face means that we are talking about a whole representation not just a
number. As a check on this we can see that the dimensions agree. The dimension on
the left is (2l1 + 1)(2l2 + 1) whereas on the right we have
Σ_{k=0}^{2l2} ( 2(l1 − l2 + k) + 1 ) = (2(l1 − l2) + 1)(2l2 + 1) + 2 · (2l2)(2l2 + 1)/2
 = (2l2 + 1)( 2(l1 − l2) + 1 + 2l2 )
 = (2l1 + 1)(2l2 + 1)  (4.40)
as required.
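The dimension count (4.40) is easy to check for many values of l1, l2 at once; a small sketch (our own helper name):

```python
def dims_agree(l1, l2):
    # l1 >= l2, both integer or half-integer; compare (2l1+1)(2l2+1) with
    # the sum of dimensions of the irreps l = l1-l2, l1-l2+1, ..., l1+l2
    lhs = int(round((2 * l1 + 1) * (2 * l2 + 1)))
    rhs = sum(int(round(2 * (l1 - l2 + k) + 1)) for k in range(int(round(2 * l2)) + 1))
    return lhs == rhs

assert all(dims_agree(l1, l2)
           for l1 in (0.5, 1, 1.5, 2, 2.5)
           for l2 in (0.5, 1, 1.5, 2, 2.5) if l2 <= l1)
```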
Let us be more specific and look at the simplest non-trivial case of l1 = l2 = 1/2
(this doesn’t exactly correspond to angular momentum as it is non-integer but it does
correspond to the spin of an electron - more on this later). In the tensor product we
have the following basis of states (we drop the l1 = l2 = 1/2 to clean up the notation
since it is common to all states):
|1/2, 1/2⟩ = |1/2⟩ ⊗ |1/2⟩
|1/2, −1/2⟩ = |1/2⟩ ⊗ |−1/2⟩
|−1/2, 1/2⟩ = |−1/2⟩ ⊗ |1/2⟩
|−1/2, −1/2⟩ = |−1/2⟩ ⊗ |−1/2⟩  (4.41)
These are eigenstates of J3^(1) + J3^(2):
(J3^(1) + J3^(2)) |1/2, 1/2⟩ = J3^(1)|1/2⟩ ⊗ |1/2⟩ + |1/2⟩ ⊗ J3^(2)|1/2⟩ = (1/2 + 1/2) |1/2, 1/2⟩ = |1/2, 1/2⟩
(J3^(1) + J3^(2)) |1/2, −1/2⟩ = (1/2 − 1/2) |1/2, −1/2⟩ = 0
(J3^(1) + J3^(2)) |−1/2, 1/2⟩ = (−1/2 + 1/2) |−1/2, 1/2⟩ = 0
(J3^(1) + J3^(2)) |−1/2, −1/2⟩ = (−1/2 − 1/2) |−1/2, −1/2⟩ = −|−1/2, −1/2⟩  (4.42)
Thus we find the J3^(T) eigenvalues (−1, 0, 0, +1). This doesn't correspond to any irreducible representation of su(2) but we can split it as (0) and (−1, 0, 1), which are the eigenvalues of the l = 0 and l = 1 representations. Thus we expect to find
2⊗2=1⊕3 (4.43)
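One can confirm 2 ⊗ 2 = 1 ⊕ 3 directly by diagonalising the total Casimir on the four-dimensional tensor product. A numerical sketch (with ℏ = 1, using NumPy's Kronecker product for (4.37); not part of the notes):

```python
import numpy as np

# spin-1/2 matrices (hbar = 1)
J1 = np.array([[0, 0.5], [0.5, 0]], dtype=complex)
J2 = np.array([[0, -0.5j], [0.5j, 0]])
J3 = np.array([[0.5, 0], [0, -0.5]], dtype=complex)
I2 = np.eye(2)

# total angular momentum on the tensor product: J_i^T = J_i x I + I x J_i
JT = [np.kron(J, I2) + np.kron(I2, J) for J in (J1, J2, J3)]

# J3^T eigenvalues are (-1, 0, 0, +1), as in (4.42)
assert np.allclose(sorted(np.linalg.eigvalsh(JT[2]).real), [-1, 0, 0, 1])

# the Casimir (J^T)^2 has eigenvalue l(l+1) = 0 once and 2 three times: 2 x 2 = 1 + 3
JTsq = sum(J @ J for J in JT)
assert np.allclose(sorted(np.linalg.eigvalsh(JTsq).real), [0, 2, 2, 2])
```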
Chapter 5
The Hydrogen Atom
Let us put what we learnt above to use in solving for an electron moving around a positively charged nucleus. Here we can be a bit general and consider it as a two-body problem involving the position of the electron re and of the nucleus rn. Thus our wavefunction is
ψ(re, rn)  (5.1)
Furthermore the potential arising from the electrostatic force between them is
V = −Ze² / |rn − re|  (5.2)
where Z is the atomic number (the number of protons) and e the charge of an electron in appropriate units.
Thus the time-independent Schrödinger equation is
Eψ = −(ℏ²/2me) ∇²_{re} ψ − (ℏ²/2mn) ∇²_{rn} ψ − ( Ze²/|rn − re| ) ψ  (5.3)
where me and mn are the masses of the electron and nucleus respectively.
50 CHAPTER 5. THE HYDROGEN ATOM
To proceed we change coordinates to the centre of mass position R = (me re + mn rn)/M and the relative position r12 = re − rn, where M = me + mn and µ = me mn/M.
Thus our time-independent Schrödinger equation is
Eψ = −(ℏ²/2M) ∇²_R ψ − (ℏ²/2µ) ∇²_{r12} ψ − ( Ze²/|r12| ) ψ  (5.7)
Now we can use separation of variables and write ψ = ψCoM(R) ψrel(r12) to obtain
εCoM ψCoM = −(ℏ²/2M) ∇²_R ψCoM
εrel ψrel = −(ℏ²/2µ) ∇²_{r12} ψrel − ( Ze²/|r12| ) ψrel  (5.9)
with E = εCoM + εrel.
The first equation is solved by plane waves ψCoM = e^{ik·R} with εCoM = ℏ²|k|²/2M. These just describe a free particle, the atom, in a basis where the linear momentum is fixed. We can then construct wave packets through
ψCoM(R) = ∫ ( d³k/(2π)³ ) e^{ik·R} χ(k)  (5.11)
which will not be energy eigenstates but can be localized in position to an arbitrary
accuracy.
The more interesting part is to solve for ψrel(r12). Here we can switch from r12 to spherical coordinates and use separation of variables yet again:
ψrel = ul(r) Yl,m(θ, φ)
where r = |r12| etc.. We already know what the Yl,m's are and we know that ul satisfies:
−(1/r²) d/dr( r² dul/dr ) + (2µ/ℏ²)( −Ze²/r + ℏ²l(l + 1)/(2µr²) − εrel ) ul = 0  (5.13)
So solving the Hydrogen atom (which corresponds to Z = 1 but we can be more general
without causing any more pain) comes down to solving this equation and finding ul , E
and ψrel and ψ(re , rn ).
To continue we write ul(r) = r^l fl(r) to find
−d²fl/dr² − ( 2(l + 1)/r ) dfl/dr − ( 2µZe²/ℏ² )(1/r) fl − ( 2µεrel/ℏ² ) fl = 0  (5.15)
This substitution is relatively standard in spherically symmetric examples as it removes the l(l + 1) term from the equation.
To continue we look at the large r limit where only the first and last terms are important. If εrel > 0 then we find oscillating solutions:
f ∼ C1 cos( √(2µεrel) r/ℏ ) + C2 sin( √(2µεrel) r/ℏ )  (5.16)
which will not be normalisable. Solutions with εrel = 0 will also not be normalisable. Thus we conclude that εrel < 0. In this case we expect solutions, at large r, to be of the form
f ∼ C1 e^{−√(−2µεrel) r/ℏ} + C2 e^{+√(−2µεrel) r/ℏ}  (5.17)
Writing fl as e^{−√(−2µεrel) r/ℏ} times a power series in r and demanding that the series terminates at order n leads to the condition
−2µZe²/ℏ² + 2n√(−2µεrel)/ℏ + 2(l + 1)√(−2µεrel)/ℏ = 0  (5.19)
We can rearrange this to give
εrel = −( µZ²e⁴ / 2ℏ² ) ( 1/(n + l + 1) )²  (5.20)
Substituting back in simply leads to a recursion relation for the series coefficients Ck, obtained from setting the coefficient of each power of r to zero. Taking k = 0 gives C−1 = 0 and hence the series indeed terminates, giving a polynomial. These polynomials are known as Laguerre Polynomials. We haven't shown here that these are the only normalizable solutions but this does turn out to be the case.
In summary our solutions are of the form ψ = ψCoM(R) ψrel(r12) with ψrel = r^l fl(r) Yl,m(θ, φ), and ψCoM is a generic free wave packet. It's fun to plot |ψrel|² for various choices of n, l
and m. For example look here https://en.wikipedia.org/wiki/Atomic_orbital#
/media/File:Hydrogen_Density_Plots.png On the other hand there are actual pic-
tures of atoms such as here: https://www.nature.com/articles/498009d
We can also reproduce the famous formula postulated by Bohr in the earliest days of Quantum Mechanics (before all the formalism that we have discussed was formulated):
EBohr = −( µZ²e⁴ / 2ℏ² ) (1/N²)  (5.26)
for some integer N = n + l + 1 = 1, 2, 3, .... In particular since a proton is about 2000 times more massive than an electron we have mn ≫ me and so µ ≈ me to very high accuracy.
Thus one finds (for Hydrogen where Z = 1) that in a transition from level N1 to level N2 < N1 the emitted photon has energy
Ephoton = −( me e⁴ / 2ℏ² )(1/N1²) + ( me e⁴ / 2ℏ² )(1/N2²) = R ( 1/N2² − 1/N1² )  (5.28)
where R = me e⁴/2ℏ² is the Rydberg constant.
• N = 1: n = l = 0 1 state
• N = 2: n = 1, l = 0 or n = 0, l = 1 1+3=4 states
• N = 3: n = 2, l = 0 or n = 1, l = 1 or n = 0, l = 2 1+3+5 = 9 states
In fact we find N 2 states for each energy level (the sum of the first N odd numbers is
N 2 ).
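The claim that the sum of the first N odd numbers is N² is a one-liner to verify:

```python
# degeneracy of the N-th hydrogen level: the states have n + l + 1 = N,
# so l = 0, 1, ..., N-1, each contributing 2l + 1 values of m
for N in range(1, 10):
    degeneracy = sum(2 * l + 1 for l in range(N))
    assert degeneracy == N**2
```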
However, what are the lowest energy states with multiple electrons? In fact there are twice as many states as this, as each electron can be spin up or spin down (more on this later). This is just an additional quantum number (which only takes two values: up or down) and corresponds to the fact that the correct rotation group of Nature isn't SO(3)
but SU (2) = Spin(3). Note also that the terminology of spin up and spin down has
no physical interpretation in terms of up and down: if you take an electron of spin up
to Australia it does not become spin down. It has more to do with the fact that we
write vectors as column matrices and the up state are on top and then down state on
the bottom. We must know another fact: no two electrons can be in the same state
(the multi-electron wavefunction must be anti-symmetric - odd - under interchange of
any two electrons). Thus the degeneracies of the low energy multi-electron states are (ignoring inter-electron interactions, which become increasingly important)
• 2: N = 1, i.e. n = l = 0
• 8: N = 2
• 18: N = 3
This pattern is evident in the periodic table, whose rows have 2, 8 and 18 elements, which we have now predicted based on some crazy idea that states are vectors in a Hilbert space and observables are self-adjoint linear maps.
Chapter 6
Perturbation Theory
Next we must face the fact that solving the Schrödinger equation in general is too complicated and we need to make approximations to make progress (for spherically symmetric systems one is on better ground but still one has to solve a second order ODE). The idea is to find a system you can solve exactly, for example a free, non-interacting system, or a system you have solved exactly such as a single electron in an atom, and then imagine that you perturb it (say by adding in another electron or putting the atom in a small background magnetic field). This is done by adding an interaction whose strength is controlled by a parameter g which we can make as small as we like, such that setting g = 0 reduces us to the problem we can solve exactly. One then computes the physically meaningful quantities in a power series:
E = E0 + gE1 + g²E2 + . . .  (6.1)
Such constants g are referred to as coupling constants - there can be several in any given problem. The contributions at order g are called first-order, those at g² second order, etc.. Computing each term in this expansion is known as perturbation theory. Much, almost all, of Physics is done this way.
Of course in Nature g is not a free parameter but some constant that you determine
from experiment. If g is small then we say the theory is weakly coupled whereas if g
is not small then it is strongly coupled. For example in electromagnetism the relevant
coupling constant is
α = e²/ℏc ∼ 1/137  (6.2)
This is known as the fine structure constant. It is so named as it leads to corrections to
the Hydrogen atom spectrum that correspond to “fine”, e.g. small, corrections to what
we found above. This is indeed small. Nature was kind to Physicists as this meant
that accurate predictions from quantum electrodynamics (QED) could be made using
perturbation theory. Indeed in some cases theory and experiment agree to 13 decimal
place accuracy. This is like knowing the distance to the moon to within the width of a
human hair.
On the other hand in Quantum Chromodynamics (QCD), the theory that describes
quarks inside protons and neutrons, the relevant coupling is
αs ∼ 1/2  (6.3)
In fact neither α nor αs are constant. αs becomes larger at longer distances whereas α
gets smaller. The values I gave here are the approximate values of αs at the distance of
a proton and α at infinite distance. So we are not lucky with QCD (superconductivity
is another system that is strongly coupled and perturbation theory fails). At distances
around the size of a proton, QCD is strongly coupled and computations in perturbation
theory are next to useless (but we can use perturbation theory well at very short, sub
proton, distances where they behave in a weakly coupled way). This is asymptotic
freedom, meaning that at short distances the theory is weakly coupled. Pulling quarks apart increases the force between them, somewhat like a spring, whereas the electric force between two charges gets weaker at large distances. Quarks are confined into protons
(and other baryons such as neutrons) but this is still poorly understood largely because
we can’t compute much.
It is important to ask what small means. It is meaningless to say that a quantity
that is dimensionful is small or large. By dimensionful we mean that it is measured in
some units. For example is a fighter jet fast? A typical speed for a fighter jet is 2000
kph and that is very fast compared to driving a car (say 100 kph). But compared to the speed of light, roughly 10⁹ kph, it is tiny: 2000/10⁹ ≈ 2 × 10⁻⁶. So no, in this sense a fighter jet is very very slow. Thus when we do perturbation theory
we need to identify a small dimensionless number. Small then means g 2 << g, so that
higher order terms in the expansion are indeed smaller (although their coefficients might
grow). Note that this requires us to compare g and g 2 so they must have the same units
and hence g must be dimensionless.
Lastly we note that even if we have a small, even tiny, dimensionless coupling con-
stant perturbation theory can still fail. This is because not all functions admit a Taylor
expansion about g = 0. The classic example is
f(g) = e^{−1/g²} for g ≠ 0 ,  f(0) = 0  (6.6)
This function is smooth but all of its derivatives vanish at g = 0, so its Taylor series about g = 0 is identically zero, i.e. perturbation theory misses all the information in f(g). This may seem like an esoteric
example but actually functions of this form arise all the time in quantum theories as
one can see in the path integral formulation.
In practice in quantum theories the perturbative series of the form 6.1 do not even
converge! They become more accurate as we include higher order terms for a while but
then they get worse if you include too many and then would ultimately diverge if one
could do infinitely many. In QED where g = α ≈ 1/137 one expects the series to start failing around the 137-th term. Such a series is known as an asymptotic series. The full theory is not divergent and one expects the complete answer to take the form
E = (E0,0 + E0,1 g + E0,2 g² + . . .) + (E1,0 + E1,1 g + E1,2 g² + . . .) e^{−1/g²} + . . .  (6.8)
6.1 A Simple Example
Consider a two-state system with Hamiltonian
H = ( E1  ε ; ε  E2 )
for some small real parameter ε. The exact eigenvalues are
E1/2 = (1/2)( E1 + E2 ± √( (E1 − E2)² + 4ε² ) )
Not nice but solved exactly! Note that for E1 ≠ E2 we can smoothly take ε → 0 as
E1/2 = (1/2)( E1 + E2 ± |E1 − E2| √( 1 + 4ε²/(E1 − E2)² ) )
  = (1/2)( E1 + E2 ± |E1 − E2| ) ± ε²/|E1 − E2| + . . .
  = E1/2 ± ε²/|E1 − E2| + O(ε⁴)  (6.14)
where
H^(0) = ( E1^(0)  0 ; 0  E2^(0) ) ,  |E1^(0)⟩ = ( 1 ; 0 ) ,  |E2^(0)⟩ = ( 0 ; 1 )  (6.17)
and we have relabelled En = En^(0). Next we expand to lowest order in ε. In principle this involves an infinite number of terms but let's start with just the first order terms:
H = H^(0) + εH^(1) + . . .
En = En^(0) + εEn^(1) + . . .
|En⟩ = |En^(0)⟩ + ε|En^(1)⟩ + . . .  (6.18)
The terms of order ε⁰ cancel as we have solved the unperturbed problem. At order ε we find
H^(1)|En^(0)⟩ + H^(0)|En^(1)⟩ = En^(1)|En^(0)⟩ + En^(0)|En^(1)⟩
Taking the matrix element with ⟨En^(0)| gives the first order energy shift En^(1) = ⟨En^(0)|H^(1)|En^(0)⟩, while for m ≠ n we find
⟨Em^(0)|En^(1)⟩ = ( 1/(En^(0) − Em^(0)) ) ⟨Em^(0)|H^(1)|En^(0)⟩  (6.25)
Here is where the non-degeneracy is important. On the other hand we want orthonormal states |En⟩: expanding ⟨En|En⟩ = 1 to first order in ε gives
⟨En^(1)|En^(0)⟩ = ( ⟨En^(0)|En^(1)⟩ )* = −⟨En^(0)|En^(1)⟩  (6.27)
so that ⟨En^(0)|En^(1)⟩ is purely imaginary, and by adjusting the phase of |En⟩ it can be set to zero. This tells us that the first order correction is orthogonal to the unperturbed eigenvector.
So finally
|En^(1)⟩ = Σ_{m≠n} ⟨Em^(0)|En^(1)⟩ |Em^(0)⟩ = Σ_{m≠n} ( ⟨Em^(0)|H^(1)|En^(0)⟩ / (En^(0) − Em^(0)) ) |Em^(0)⟩  (6.29)
as required to keep the eigenvectors orthonormal. Note that in this derivation, although we started with a two-dimensional Hilbert space, we didn't use this and our answer is true more generally.
Let's look at the simple example above where
H^(1) = ( 0 1 ; 1 0 )  (6.31)
Then the first order energy shifts vanish: En^(1) = ⟨En^(0)|H^(1)|En^(0)⟩ = 0, since H^(1) is off-diagonal. This agrees with our exact result, where the energy shift started at order ε², but it will not be true in general.
However |En^(1)⟩ will be non-zero as:
⟨E1^(0)|H^(1)|E2^(0)⟩ = ( 1 0 ) ( 0 1 ; 1 0 ) ( 0 ; 1 ) = 1
⟨E2^(0)|H^(1)|E1^(0)⟩ = ( 0 1 ) ( 0 1 ; 1 0 ) ( 1 ; 0 ) = 1  (6.34)
and so
|E1⟩ = |E1^(0)⟩ + ε ( ⟨E2^(0)|H^(1)|E1^(0)⟩ / (E1^(0) − E2^(0)) ) |E2^(0)⟩ = ( 1 ; ε/(E1^(0) − E2^(0)) )
|E2⟩ = |E2^(0)⟩ + ε ( ⟨E1^(0)|H^(1)|E2^(0)⟩ / (E2^(0) − E1^(0)) ) |E1^(0)⟩ = ( −ε/(E1^(0) − E2^(0)) ; 1 )  (6.36)
which agrees with what we found above.
However to see the shift in energy we will have to go to second order in perturbation
theory!
But we can assume that the zeroth and first order equations have been solved, so we find, at second order,
H^(2)|En^(0)⟩ + H^(1)|En^(1)⟩ + H^(0)|En^(2)⟩ = En^(0)|En^(2)⟩ + En^(1)|En^(1)⟩ + En^(2)|En^(0)⟩  (6.40)
Remember that the unknowns are En^(2) and |En^(2)⟩; everything else is known. So we take matrix elements again:
⟨Em^(0)|H^(2)|En^(0)⟩ + ⟨Em^(0)|H^(1)|En^(1)⟩ + ⟨Em^(0)|H^(0)|En^(2)⟩ = En^(0)⟨Em^(0)|En^(2)⟩ + En^(1)⟨Em^(0)|En^(1)⟩ + En^(2)⟨Em^(0)|En^(0)⟩  (6.41)
which becomes
⟨Em^(0)|H^(2)|En^(0)⟩ + ⟨Em^(0)|H^(1)|En^(1)⟩ + Em^(0)⟨Em^(0)|En^(2)⟩ = En^(0)⟨Em^(0)|En^(2)⟩ + En^(1)⟨Em^(0)|En^(1)⟩ + En^(2)δnm  (6.42)
Taking m = n gives
⟨En^(0)|H^(2)|En^(0)⟩ + ⟨En^(0)|H^(1)|En^(1)⟩ + En^(0)⟨En^(0)|En^(2)⟩ = En^(0)⟨En^(0)|En^(2)⟩ + En^(2)  (6.43)
where we have used ⟨En^(0)|En^(1)⟩ = 0. The ⟨En^(0)|En^(2)⟩ terms cancel so this tells us the second order correction to En:
En^(2) = ⟨En^(0)|H^(2)|En^(0)⟩ + ⟨En^(0)|H^(1)|En^(1)⟩  (6.44)
Since we can compute everything on the right hand side we now know En^(2). It is left as an exercise to compute |En^(2)⟩ (just do it for H^(2) = 0).
So let us try to compute the correction to the energy in our simple example. Here we have H^(2) = 0 and so (using the matrix elements we computed before in 6.34)
E1^(2) = ⟨E1^(0)|H^(1)|E2^(0)⟩⟨E2^(0)|H^(1)|E1^(0)⟩ / (E1^(0) − E2^(0)) = 1/(E1 − E2)
E2^(2) = ⟨E2^(0)|H^(1)|E1^(0)⟩⟨E1^(0)|H^(1)|E2^(0)⟩ / (E2^(0) − E1^(0)) = −1/(E1 − E2)  (6.45)
which agrees with what we found above from the exact solution 6.14.
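The agreement between exact diagonalization and the second-order formula is easy to see numerically. A sketch (our own parameter choices; the tolerance reflects the neglected O(ε⁴) terms):

```python
import numpy as np

E1, E2, eps = 1.0, 3.0, 0.01     # non-degenerate levels, small perturbation
H = np.array([[E1, eps], [eps, E2]])

exact = np.linalg.eigvalsh(H)    # exact eigenvalues, sorted ascending

# second order perturbation theory, as in (6.45): shifts +-eps^2/(E1 - E2)
pert = sorted([E1 + eps**2 / (E1 - E2),
               E2 + eps**2 / (E2 - E1)])

# agreement up to the neglected O(eps^4) corrections
assert np.allclose(exact, pert, rtol=0, atol=1e-8)
```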
Let us summarise what we found in a slightly cleaner notation. Typically (but not always) the perturbation to the Hamiltonian comes from the potential, so that the unperturbed problem is
En^(0)ψn = −(ℏ²/2m) ∇²ψn + V^(0)ψn  (6.49)
and we perturb V^(0) → V^(0) + εV. Writing Vmn = ⟨m|V|n⟩ and |n⟩ = |En^(0)⟩ for the unperturbed eigenstates, our formulae are (see the problem set)
En = En^(0) + εVnn + ε² Σ_{m≠n} Vnm Vmn / (En^(0) − Em^(0)) + . . .
|En⟩ = |n⟩ + ε Σ_{m≠n} ( Vmn / (En^(0) − Em^(0)) ) |m⟩
  + ε² [ Σ_{m≠n} Σ_{p≠n} ( Vmp Vpn / ((En^(0) − Em^(0))(En^(0) − Ep^(0))) ) |m⟩
   − Σ_{m≠n} ( Vnn Vmn / (En^(0) − Em^(0))² ) |m⟩
   − (1/2) Σ_{m≠n} ( Vmn Vnm / (En^(0) − Em^(0))² ) |n⟩ ] + . . .  (6.50)
where sgn(ε) = ε/|ε|. Here we see the problem: the answer is not analytic in ε and so a naive Taylor series expansion will fail. In addition taking ε → 0± does not give us back the unperturbed eigenvectors. To see how to deal with this let us return to the expansion
H = H^(0) + εH^(1) + . . .
En = En^(0) + εEn^(1) + . . .
|En⟩ = |En^(0)⟩ + ε|En^(1)⟩ + . . .  (6.55)
which led to
⟨Em^(0)|H^(1)|En^(0)⟩ = En^(1)δnm + (En^(0) − Em^(0))⟨Em^(0)|En^(1)⟩  (6.56)
So this formula remains the same. But now in the degenerate case we have, for some subset of states labelled by n′, m′ with degenerate energies En′^(0) = Em′^(0),
⟨Em′^(0)|H^(1)|En′^(0)⟩ = En′^(1) δn′m′  (6.58)
This is a constraint on the choice of |En′^(0)⟩. In particular we need to choose the |En′^(0)⟩ to be eigenstates of H^(1) with eigenvalue En′^(1). This is possible since H^(1) is self-adjoint and H^(0) is proportional to the identity when acting on the degenerate subspace.
We then find that the degeneracy is lifted by the eigenvalues of H^(1) in the degenerate subspace:
En′ = En′^(0) + ε⟨En′^(0)|H^(1)|En′^(0)⟩ + . . . = En′^(0) + εEn′^(1) + . . .  (6.59)
where H^(1)|En′^(0)⟩ = En′^(1)|En′^(0)⟩. Whereas for non-degenerate eigenvalues we find
En′′ = En′′^(0) + ε⟨En′′^(0)|H^(1)|En′′^(0)⟩ + . . .  (6.60)
where n′′ labels the non-degenerate eigenstates. We then solve for the eigenvectors by
|En′′^(1)⟩ = Σ_{m≠n′′} ( ⟨Em^(0)|H^(1)|En′′^(0)⟩ / (En′′^(0) − Em^(0)) ) |Em^(0)⟩
|En′^(1)⟩ = Σ_{m′′} ( ⟨Em′′^(0)|H^(1)|En′^(0)⟩ / (En′^(0) − Em′′^(0)) ) |Em′′^(0)⟩  (6.61)
Thus looking at our simple two-dimensional example we should start from the basis
|E1⟩ = (1/√2) ( 1 ; 1 ) + O(ε²) ,  |E2⟩ = (1/√2) ( −1 ; 1 ) + O(ε²)  (6.62)
In fact in this simple case the perturbation series ends at first order but this won't be true in general.
Of course it could be that there are still degeneracies, but then one simply goes to the next order. In quantum theories one expects that the only degeneracies that persist to all orders in perturbation theory are those that are protected by a symmetry. That is, there exists an observable Q that commutes with the full Hamiltonian, [Q, H] = 0, and hence the states |En⟩ and Q|En⟩ have the same energy to all orders in perturbation theory:
H ( Q|En⟩ ) = Q H|En⟩ = En ( Q|En⟩ )
One then works with energy eigenstates that are also eigenstates of Q.
6.6 Time Dependent Perturbation Theory
Next we consider perturbations that depend on time: H = H^(0) + gH^(1)(t). We expand the state as
|ψ(t)⟩ = Σ_n cn(t) e^{−iEnt/ℏ} |ψn^(0)⟩
where |ψn^(0)⟩ are a basis of eigenstates of H^(0):
H^(0)|ψn^(0)⟩ = En|ψn^(0)⟩
If g = 0 the cn's are just constants that characterise the state at t = 0.
Next we use the fact that the ψn satisfy the unperturbed time-independent Schrödinger equation, so
iℏ Σ_n (dcn/dt) e^{−iEnt/ℏ} |ψn^(0)⟩ = g Σ_n cn(t) e^{−iEnt/ℏ} H^(1)(t) |ψn^(0)⟩  (6.70)
Note that again we can still use the fact that the |ψn^(0)⟩ are an orthonormal basis of the Hilbert space. Thus we can take the inner product of this equation with |ψm^(0)⟩ to obtain
iℏ (dcm/dt) e^{−iEmt/ℏ} = g Σ_n cn e^{−iEnt/ℏ} ⟨ψm^(0)|H^(1)(t)|ψn^(0)⟩
dcm/dt = −(ig/ℏ) Σ_n cn e^{i(Em−En)t/ℏ} ⟨ψm^(0)|H^(1)(t)|ψn^(0)⟩  (6.71)
For the usual case where the |ψn^(0)⟩ are realised by functions on R³ we have
⟨ψm^(0)|H^(1)(t)|ψn^(0)⟩ = ∫ ( ψm^(0)(x) )* H^(1)(t) ψn^(0)(x) d³x  (6.72)
Thus we obtain a first order differential equation for each cm(t), but it is coupled to all the other c's, and the cm(t) are determined by the initial conditions cn(0).
So far we haven’t made any approximation. But we have an infinite set of coupled
differential equations! At lowest order cn is constant so let us expand
The interpretation is that |c_m(t)|² gives the probability that after the perturbation the system will lie in the m-th energy state (m ≠ k) at time t. c_m^{(1)}(t) is called the first order transition amplitude. Note that we don't expect to find
$$\sum_n|c_n(t)|^2 = 1 \qquad (6.78)$$
The interpretation is that the perturbation has added energy to (or removed energy from) the system, which then gets redistributed.
So let's do an example! Consider a harmonic oscillator
$$H^{(0)} = \frac{\hat p^2}{2m} + \frac12k\hat x^2 \qquad (6.79)$$
with states |n⟩, n = 0, 1, 2, ... and energies E_n = ℏω(n + ½). Next we perturb it by
$$H^{(1)} = -\hat x\,e^{-t^2/\tau^2} \qquad (6.80)$$
Classically this corresponds to an applied force F = g in the x-direction but only for a period of time of order 2τ. In this case we assume that the system is in the ground state at t → −∞. Thus we need to evaluate (we have changed the initial time to t = −∞)
$$c_n^{(1)}(t) = -\frac{i}{\hbar}\int_{-\infty}^{t}e^{i\omega nt'}\langle n|H^{(1)}(t')|0\rangle\,dt' = -\frac{i}{\hbar}\langle n|\hat x|0\rangle\int_{-\infty}^{t}e^{i\omega nt' - t'^2/\tau^2}\,dt' \qquad (6.81)$$
There is no closed form for this integral but we can look at late times so
$$c_n^{(1)}(\infty) = -\frac{i}{\hbar}\langle n|\hat x|0\rangle\int_{-\infty}^{\infty}e^{i\omega nt' - t'^2/\tau^2}\,dt' = -\frac{i\tau\sqrt\pi}{\hbar}\,e^{-\tau^2n^2\omega^2/4}\,\langle n|\hat x|0\rangle \qquad (6.82)$$
where we have used the integral
$$\int_{-\infty}^{\infty}e^{-at'^2+bt'}\,dt' = \sqrt{\frac{\pi}{a}}\,e^{b^2/4a} \qquad (6.83)$$
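The Gaussian integral (6.83) with complex b can be checked numerically; the values of τ, ω, n below are arbitrary:

```python
import numpy as np
from scipy.integrate import quad

# Check (6.83) for the case needed in (6.82): a = 1/tau^2, b = i*omega*n.
tau, omega, n = 1.3, 0.7, 1
a, b = 1.0 / tau**2, 1j * omega * n

f = lambda t: np.exp(-a * t**2 + b * t)
re, _ = quad(lambda t: f(t).real, -np.inf, np.inf)
im, _ = quad(lambda t: f(t).imag, -np.inf, np.inf)

# sqrt(pi/a)*exp(b^2/4a) = tau*sqrt(pi)*exp(-omega^2 n^2 tau^2/4)
exact = np.sqrt(np.pi / a) * np.exp(b**2 / (4 * a))
assert np.isclose(re + 1j * im, exact)
```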
Lastly we need to compute
$$\langle n|\hat x|0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,\langle n|\hat a+\hat a^\dagger|0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,\langle n|\hat a^\dagger|0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,\delta_{n,1} \qquad (6.84)$$
More generally it's easy to see that if we started in |N⟩ then we could only jump to |N ± 1⟩, with the same dependence on τ.
To make predictions it is necessary to see what happens to the ground state. The
equation for c0 is
$$c_0(t) = 1 + g\,c_0^{(1)}(t) + \ldots \qquad (6.86)$$
with
$$c_0^{(1)}(\infty) = -\frac{i}{\hbar}\int_{-\infty}^{\infty}\langle 0|H^{(1)}(t')|0\rangle\,dt' = 0 \qquad (6.87)$$
since ⟨0|x̂|0⟩ = 0.
Chapter 7

Let us now look at another type of approximation. We expect that for large systems, whatever that means, classical results should emerge and much of the quantum-ness of quantum theory disappears. This happens because the wavefunctions typically oscillate so quickly that the quantum effects cancel out and only the dominant classical configurations remain.
7.1 WKB
This motivates the idea that we should consider wave-functions of the form (let’s just
think of one dimension)
$$\psi(x) \sim e^{i\sigma(x)/\hbar} \qquad (7.1)$$
where we aren’t worrying about normalizations at this time. We can fix that later. To
proceed we find the following (restricting to one dimension)
$$\frac{d\psi}{dx} = \frac{i}{\hbar}\frac{d\sigma}{dx}\,e^{i\sigma(x)/\hbar}$$
$$\frac{d^2\psi}{dx^2} = \frac{i}{\hbar}\frac{d^2\sigma}{dx^2}\,e^{i\sigma(x)/\hbar} - \frac{1}{\hbar^2}\left(\frac{d\sigma}{dx}\right)^2e^{i\sigma(x)/\hbar} \qquad (7.2)$$
The time-independent Schrödinger equation is therefore
$$E\psi = -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi \quad\Longrightarrow\quad E = -\frac{i\hbar}{2m}\frac{d^2\sigma}{dx^2} + \frac{1}{2m}\left(\frac{d\sigma}{dx}\right)^2 + V(x) \qquad (7.3)$$
We are now in a position to make a semi-classical expansion:
$$\sigma = \sigma^{(0)} + \hbar\sigma^{(1)} + \ldots \qquad (7.4)$$
which gives at zeroth and first order (we assume there is no ℏ in V)
$$E = \frac{1}{2m}\left(\frac{d\sigma^{(0)}}{dx}\right)^2 + V(x)$$
$$0 = -\frac{i}{2}\frac{d^2\sigma^{(0)}}{dx^2} + \frac{d\sigma^{(0)}}{dx}\frac{d\sigma^{(1)}}{dx} \qquad (7.5)$$
The semi-classical limit is, roughly, "ℏ → 0". I put this in quotes as ℏ has units (and can be set to one by a choice of units). What we mean is that the first term is small compared to the second:
$$\hbar\left|\frac{d^2\sigma}{dx^2}\right| \ll \left(\frac{d\sigma}{dx}\right)^2 \qquad (7.6)$$
To take the semi-classical limit we therefore ignore the first term, which in a formal sense corresponds to ℏ = 0. Thus at zeroth order we find a familiar formula:
$$\frac{d\sigma^{(0)}}{dx} = \pm\sqrt{2m(E-V(x))} \qquad (7.7)$$
which is solved by
$$\sigma^{(0)}(x) = \pm\int_{x_0}^{x}p(y)\,dy\,,\qquad p(y) = \sqrt{2m(E-V(y))} \qquad (7.8)$$
Thus we have determined the wavefunction (in principle, and assuming the condition (7.6)). In other words our semi-classical wavefunction is
$$\psi_{semiclassical} = A\,e^{\frac{i}{\hbar}\int^xp(y)dy} + B\,e^{-\frac{i}{\hbar}\int^xp(y)dy} \qquad (7.9)$$
Assuming that V is well behaved, i.e. dV/dx bounded, we find the semi-classical approximation breaks down at the 'turning points' where V = E. Here the momentum p vanishes, so the left hand side of (7.6) cannot be small compared to it.
Next we can easily solve for the first order term:
$$\frac{i}{2}\frac{d^2\sigma^{(0)}}{dx^2} = \frac{d\sigma^{(0)}}{dx}\frac{d\sigma^{(1)}}{dx}
\quad\Longrightarrow\quad
\frac{i}{2}\frac{d}{dx}\ln\left(\frac{d\sigma^{(0)}}{dx}\right) = \frac{d\sigma^{(1)}}{dx}
\quad\Longrightarrow\quad
\frac{i}{2}\ln p = \sigma^{(1)} + \text{const} \qquad (7.12)$$
We can absorb the constant into the normalization coefficients A, B and hence, to first order,
$$\psi_{WKB}(x) = \frac{A}{\sqrt{p(x)}}\,e^{\frac{i}{\hbar}\int^xp(y)dy} + \frac{B}{\sqrt{p(x)}}\,e^{-\frac{i}{\hbar}\int^xp(y)dy} \qquad (7.13)$$
This is known as the WKB approximation after Wentzel, Kramers and Brillouin. It
captures a surprisingly large amount of quantum information.
7.2 Particle in a Box
$$\psi(x) = \frac{A}{\sqrt p}\,e^{\frac{i}{\hbar}\int^xp\,dy} + \frac{B}{\sqrt p}\,e^{-\frac{i}{\hbar}\int^xp\,dy} = A'e^{\frac{i}{\hbar}px} + B'e^{-\frac{i}{\hbar}px} \qquad (7.14)$$
where p = √(2mE) is constant and we have absorbed √p into the coefficients A', B'. Next
we need to impose the boundary conditions ψ(0) = ψ(L) = 0, which leads to ψ ∝ sin(px/ℏ) with sin(pL/ℏ) = 0. So we find p = nπℏ/L and E = n²π²ℏ²/2mL² for n = 1, 2, .... This agrees perfectly with the exact answer, for example see the discussion session in week 3 (because d²σ^{(0)}/dx² = dp/dx = 0). Next!
7.3 Turning Points, Airy and Bohr-Sommerfeld Quantization
$$\frac{d^2\psi}{dz^2} - z\psi = 0 \qquad (7.19)$$
The solution to this equation which decays at large z is known as the Airy function1 :
ψ = cAi(z) (7.20)
for some constant c. It is named after George Airy who was an Astronomer Royal at
the Greenwich Observatory. He features in their museum if you have ever been. The
Airy function is the poster-child of difficult functions to understand using perturbative
techniques due to its bi-polar characteristics of oscillating and decaying. But we love
it none the less as it is a thing of beauty with important and varied applications (and
with modern computer techniques it can be evaluated numerically with high precision).
It can be defined by
$$Ai(z) = \frac{1}{\sqrt\pi}\int_0^\infty\cos\left(\tfrac13u^3+zu\right)du \qquad (7.21)$$
From here you can check that
∞
′ 1
Ai (z) = − √ u sin( 13 u3 + zu)du
π 0
∞
′′ 1
Ai (z) = − √ u2 cos( 13 u3 + zu)du (7.22)
π 0
so
$$Ai''(z) - zAi(z) = -\frac{1}{\sqrt\pi}\int_0^\infty(u^2+z)\cos\left(\tfrac13u^3+zu\right)du = -\frac{1}{\sqrt\pi}\int_0^\infty\frac{d}{du}\sin\left(\tfrac13u^3+zu\right)du = 0 \qquad (7.23)$$
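One can verify the Airy equation numerically using scipy's implementation (a sketch: `scipy.special.airy` returns Ai, Ai′, Bi, Bi′, and we take a finite-difference second derivative):

```python
import numpy as np
from scipy.special import airy

# Check Ai''(z) = z*Ai(z) on a grid using a central second difference.
z = np.linspace(-10, 5, 2001)
Ai, Aip, Bi, Bip = airy(z)

h = z[1] - z[0]
Aipp = (Ai[2:] - 2 * Ai[1:-1] + Ai[:-2]) / h**2
assert np.allclose(Aipp, z[1:-1] * Ai[1:-1], atol=1e-3)
```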
(one must be careful with the boundary term at infinity which is oscillating wildly). A
plot of Ai(z) is given below. It oscillates to the left of the turning point (where E > F x)
and then dies off exponentially to the right (where E < F x). Therefore, near a turning
point, this is what we expect the wavefunction to look like. In particular the asymptotic
form of Ai(z) is known:
$$Ai(z) \sim \begin{cases}\dfrac{1}{z^{1/4}}\,e^{-\frac23z^{3/2}} & z\to\infty\\[2mm]\dfrac{1}{|z|^{1/4}}\sin\left(\tfrac23|z|^{3/2}+\pi/4\right) & z\to-\infty\end{cases} \qquad (7.24)$$
It’s worth mentioning here the fact that Ai(z) and hence the wavefunction is not zero
to the right of the turning point where the potential energy V = F x is greater than
the total energy: V > E. Thus there is a non-zero probability to find the particle in a
region which is strictly forbidden in the classical world.
So let us try this with the WKB approximation. The idea is that WKB should
be good away from the turning points and near a turning point we can use the Airy
function. So it’s a question of patching together the various wavefunctions.
¹There is a second solution that grows at large z.
[Plot of Ai(z): oscillating for z < 0, decaying exponentially for z > 0.]
So let's do the WKB procedure for V = E + F(x − b). This is a potential that rises to the right (F > 0) with the turning point at x = b:
$$p(x) = \sqrt{2m(Fb-Fx)} \qquad (7.25)$$
and hence (we pick the lower bound of integration to make things simple, as the effect is just a constant that can be absorbed elsewhere)
$$\sigma^{(0)} = \int_b^x\sqrt{2m(Fb-Fy)}\,dy = -\frac23\sqrt{2mF}\,(b-x)^{3/2} \qquad (7.26)$$
Thus our WKB wavefunction is
$$\psi = \frac{A}{\sqrt{p(x)}}\,e^{\frac{i}{\hbar}\frac23\sqrt{2mF}(b-x)^{3/2}} + \frac{B}{\sqrt{p(x)}}\,e^{-\frac{i}{\hbar}\frac23\sqrt{2mF}(b-x)^{3/2}} \qquad (7.27)$$
but we don't trust it near x = b, and in particular p = 0 there. Rather, for x > b, we take the following:
$$\psi = \frac{A_r}{\sqrt{|p(x)|}}\,e^{-\frac{2}{3\hbar}\sqrt{2mF}(x-b)^{3/2}} + \frac{B_r}{\sqrt{|p(x)|}}\,e^{+\frac{2}{3\hbar}\sqrt{2mF}(x-b)^{3/2}} \qquad x>b$$
In other words the WKB solution agrees with the asymptotic regions of the exact Airy
function but not the region where it changes from oscillating to damping.
We can also imagine a turning point, rising to the left, located at x = a < b. The
analysis is the same but with x − b → a − x.
$$\psi = \begin{cases}\dfrac{c}{\sqrt{p(x)}}\sin\left(-\dfrac{2}{3\hbar}\sqrt{2mF}\,(x-a)^{3/2}-\pi/4\right) & x>a\\[2mm]c\,Ai\left((2mF/\hbar^2)^{1/3}(a-x)\right) & x\sim a\\[2mm]\dfrac{c}{\sqrt{|p(x)|}}\,e^{-\frac{2}{3\hbar}\sqrt{2mF}\,(a-x)^{3/2}} & x<a\end{cases} \qquad (7.31)$$
Let us now consider a more general situation where there is a potential V with turning points on the left and right. In the middle, a < x < b, we need to match the two solutions we find (this time assuming a general potential):
$$\psi_b = \frac{c}{\sqrt{p(x)}}\sin\left(-\frac1\hbar\int_b^xp(y)dy+\pi/4\right) \qquad x<b$$
$$\psi_a = \frac{c}{\sqrt{p(x)}}\sin\left(-\frac1\hbar\int_a^xp(y)dy-\pi/4\right) \qquad x>a \qquad (7.32)$$
These won't agree unless
$$-\frac1\hbar\int_b^xp(y)dy+\pi/4 = -\frac1\hbar\int_a^xp(y)dy-\pi/4+n\pi \qquad (7.33)$$
for some integer n. Rearranging this gives the so-called Bohr-Sommerfeld quantization rule:
$$\int_a^bp(y)\,dy = \pi\hbar\,(n-1/2)\,,\qquad n\in\mathbb Z \qquad (7.34)$$
where a and b are the two turning points. This was more or less guessed by Bohr and Sommerfeld in the early days of quantum mechanics, when they conjectured that n = 1, 2, ....
The associated wavefunctions will not be the ones we found before: i.e. a polynomial
times an exponential suppression. But that’s okay, we don’t expect to land on the exact
answer in any non-trivial case.
Let us see what we can do. At large x we have σ^{(0)} ∼ ±(i/2)√(mk) x² and hence we do see an exponential suppression in ψ (for the right choice of sign; we must discard the wrong choice due to normalizability). We can apply the Bohr-Sommerfeld quantization condition:
$$\int_a^bp(x)\,dx = \int_{-\sqrt{2E/k}}^{\sqrt{2E/k}}\sqrt{2mE-mkx^2}\,dx = \pi E\sqrt{m/k} = \pi E/\omega \qquad (7.36)$$
Setting this to πℏ(n − 1/2) indeed gives us the correct spectrum (n = 1, 2, ...): E = ℏω(n − 1/2).
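The action integral (7.36) can be checked numerically; the values of m, k, E below are arbitrary:

```python
import numpy as np
from scipy.integrate import quad

# Check (7.36): the integral between the turning points equals pi*E*sqrt(m/k).
m, k, E = 1.0, 2.0, 1.7
b = np.sqrt(2 * E / k)                                 # turning points at +-b

# clip guards against tiny negative values from rounding at the endpoints
p = lambda x: np.sqrt(np.clip(2 * m * E - m * k * x**2, 0.0, None))

I, _ = quad(p, -b, b)
assert np.isclose(I, np.pi * E * np.sqrt(m / k))
```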
Chapter 8

Let's get back to the fun stuff. We have seen that, in the Copenhagen interpretation, there is an inherently probabilistic aspect to quantum mechanics. There can also be probabilistic predictions in classical mechanics, for example in thermal or statistical physics, which describes systems with a large number of degrees of freedom. Here one doesn't expect to predict the motion of every particle. Rather one is after averages, but this introduces probabilities: if one measures a given particle it will only have the average properties "on average"; any particular measurement could give a different result.
So are these the same uncertainties? Let us look at an example. Consider a two
state system (using yet another notation):
$$|\psi\rangle = \frac{1}{\sqrt2}\left(|\uparrow\rangle+|\downarrow\rangle\right) \qquad (8.1)$$
$$\langle\psi|\tau_3|\psi\rangle = \frac12(\langle\uparrow|+\langle\downarrow|)\,\tau_3\,(|\uparrow\rangle+|\downarrow\rangle) = \frac12(\langle\uparrow|+\langle\downarrow|)(|\uparrow\rangle-|\downarrow\rangle) = \frac12(1-1) = 0 \qquad (8.2)$$
This means that if we make repeated measurements of τ3 in the state |ψ〉 we will find
+1 half the time and −1 the other half leading to an expected outcome of 0.
We can compare this to a more classical picture where we don't know the underlying system but we suppose that half the time the system is in the state |↑⟩ and half the time in |↓⟩. Measuring τ₃ then still has an expectation value of zero.
But these are not the same. To probe the difference lets measure another operator
that doesn’t commute with τ3 . So let’s measure τ1 . If we look at the state |ψ〉 we find
$$\tau_1|\psi\rangle = \frac{1}{\sqrt2}(\tau_1|\uparrow\rangle+\tau_1|\downarrow\rangle) = \frac{1}{\sqrt2}(|\downarrow\rangle+|\uparrow\rangle) = |\psi\rangle \qquad (8.3)$$
so 〈ψ|τ1 |ψ〉 = 1.
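In components (with |↑⟩ = (1, 0) and |↓⟩ = (0, 1)) the computations (8.1)-(8.3) are easily verified with numpy:

```python
import numpy as np

# Pauli matrices tau_3 and tau_1 acting on the two-state system.
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
tau3 = np.array([[1.0, 0.0], [0.0, -1.0]])
tau1 = np.array([[0.0, 1.0], [1.0, 0.0]])

psi = (up + down) / np.sqrt(2)
assert np.isclose(psi @ tau3 @ psi, 0.0)   # (8.2)
assert np.isclose(psi @ tau1 @ psi, 1.0)   # tau_1|psi> = |psi>
```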
On the other hand let's measure τ₁ in the classical model, where half the time we are measuring |↑⟩ and half the time |↓⟩. In the first case we have
$$\rho = |\psi\rangle\langle\psi| \qquad (8.6)$$
Given any observable O we can compute its expectation value in the state |ψ⟩ by computing a trace:
$$\mathrm{tr}(O\rho) = \sum_n\langle e_n|O\rho|e_n\rangle = \sum_n\langle e_n|O|\psi\rangle\langle\psi|e_n\rangle \qquad (8.7)$$
Next we expand
$$|\psi\rangle = \sum_mc_m|e_m\rangle \iff c_m = \langle e_m|\psi\rangle = (\langle\psi|e_m\rangle)^* \qquad (8.8)$$
so, taking the |e_m⟩ to be eigenstates of O with eigenvalues λ_m,
$$\mathrm{tr}(O\rho) = \sum_{n,m}c_m\langle e_n|O|e_m\rangle c_n^* = \sum_{n,m}c_m\lambda_m\langle e_n|e_m\rangle c_n^* = \sum_n\lambda_n|c_n|^2 = \langle\psi|O|\psi\rangle \qquad (8.9)$$
In particular taking O = I gives
$$\mathrm{tr}(\rho) = \sum_n|c_n|^2 = 1 \qquad (8.10)$$
Thus we can swap a state for a density matrix. Such a state is called a pure state,
meaning that it is equivalent to a single state in the more traditional formulation.
So what's the point? We can consider more general density matrices:
$$\rho = \sum_ip_i|e_i\rangle\langle e_i|\,,\qquad p_i\ge0\,,\quad\mathrm{tr}(\rho)=1$$
Note that this is with respect to some basis; if we choose a different basis then ρ may not take such a diagonal form. The second condition translates into
$$\sum_ip_i = 1 \qquad (8.12)$$
In general these are called mixed states when they can't be written as
$$\rho = |\psi\rangle\langle\psi| \qquad (8.13)$$
for a single state |ψ⟩. A mixed state allows us to introduce a statistical notion of uncertainty, in the sense that we don't know what the quantum state is (but maybe we could if we did further experiments). For example we can compute
$$\mathrm{tr}(O\rho) = \sum_ip_i\langle e_i|O|e_i\rangle$$
The expectation value then has the interpretation of a classical statistical average over the individual quantum expectation values.
If we go back to the simple quantum superposition system we started with, we have
$$\rho_1 = |\psi\rangle\langle\psi| = \frac12(|\uparrow\rangle+|\downarrow\rangle)(\langle\uparrow|+\langle\downarrow|) = \frac12\left(|\uparrow\rangle\langle\uparrow|+|\uparrow\rangle\langle\downarrow|+|\downarrow\rangle\langle\uparrow|+|\downarrow\rangle\langle\downarrow|\right) \qquad (8.15)$$
while for the classical picture
$$\rho_2 = \frac12\left(|\uparrow\rangle\langle\uparrow|+|\downarrow\rangle\langle\downarrow|\right) \qquad (8.16)$$
Let us compute
$$\mathrm{tr}(\tau_1\rho_2) = \frac12\langle\uparrow|\tau_1|\uparrow\rangle+\frac12\langle\downarrow|\tau_1|\downarrow\rangle = 0$$
which agrees with what we saw before but is different from tr(τ₁ρ₁) = 1. Note that although the density matrix ρ₂ may look simpler than ρ₁ it is a mixed state whereas ρ₁ is pure.
How can we tell in general whether or not a density matrix corresponds to a pure or mixed state? Well, a pure state means ρ = |ψ⟩⟨ψ| for some state |ψ⟩ and hence
$$\rho^2 = |\psi\rangle\langle\psi|\psi\rangle\langle\psi| = \rho$$
So in particular tr(ρ²) = tr(ρ) = 1. But for a mixed state (we assume the |ψ_n⟩ form an orthonormal basis)
$$\rho = \sum_np_n|\psi_n\rangle\langle\psi_n| \quad\Longrightarrow\quad \rho^2 = \sum_{n,m}p_np_m|\psi_n\rangle\langle\psi_n|\psi_m\rangle\langle\psi_m| = \sum_np_n^2|\psi_n\rangle\langle\psi_n| \qquad (8.20)$$
Since 0 ≤ p_n ≤ 1 we have p_n² ≤ p_n and hence
$$\mathrm{tr}(\rho^2) \le 1 \qquad (8.22)$$
with equality iff ρ represents a pure state. For example in our classical density matrix above we have p₁ = p₂ = 1/2 and so tr(ρ) = 1 as required, but tr(ρ²) = 1/4 + 1/4 = 1/2 < 1, which tells us it is indeed a mixed state.
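The purity criterion is easy to test numerically for ρ₁ and ρ₂:

```python
import numpy as np

# tr(rho^2): pure rho1 = |psi><psi| versus the mixed rho2 of (8.15)-(8.16).
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
psi = (up + down) / np.sqrt(2)

rho1 = np.outer(psi, psi)
rho2 = 0.5 * (np.outer(up, up) + np.outer(down, down))

assert np.isclose(np.trace(rho1 @ rho1), 1.0)   # pure
assert np.isclose(np.trace(rho2 @ rho2), 0.5)   # mixed: tr(rho^2) < 1
```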
8.2 Thermal States
$$\rho_{thermal} = \frac1Z\sum_ne^{-E_n/k_BT}|E_n\rangle\langle E_n| \qquad (8.23)$$
where |E_n⟩ are the energy eigenstates, T is the temperature and k_B is Boltzmann's constant, k_B = 1.380649 × 10⁻²³ J/K, which converts temperature into energy.¹ Clearly this is a mixed state and is known as the Boltzmann distribution.
To determine the normalization Z we need to impose
$$1 = \mathrm{tr}(\rho) = \frac1Z\sum_{n,m}e^{-E_n/k_BT}\langle E_m|E_n\rangle\langle E_n|E_m\rangle = \frac1Z\sum_ne^{-E_n/k_BT} \qquad (8.24)$$
Thus
$$Z = \sum_ne^{-E_n/k_BT} \qquad (8.25)$$
This is known as the partition function and plays a very central role. It counts the number of states available at each energy E_n:
$$Z = \sum_{E_n}d(n)\,e^{-\beta E_n} \qquad (8.26)$$
where d(n) counts the degeneracy of states at energy level E_n and β = 1/k_BT is the inverse temperature.
At low temperatures, where T → 0, the density matrix will strongly peak around the lowest energy state:
$$\lim_{T\to0}\rho_{thermal} = |E_0\rangle\langle E_0|$$
(assuming a unique ground state) and hence becomes pure. However at high temperature, T → ∞, all the energy states contribute more or less equally and so
$$\lim_{T\to\infty}\rho_{thermal} = \frac{1}{\dim\mathcal H}\sum_n|E_n\rangle\langle E_n| \qquad (8.28)$$
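A small numerical sketch of the thermal state for a (truncated) harmonic oscillator spectrum, in units where ℏω = k_B = 1, illustrates both limits:

```python
import numpy as np

# Thermal density matrix (8.23) on a truncated spectrum E_n = n + 1/2.
# Low T: nearly the pure ground state. High T: spread over many levels.
n = np.arange(200)
E = n + 0.5

def thermal_p(T):
    w = np.exp(-E / T)
    return w / w.sum()          # diagonal of rho_thermal; Z = w.sum()

purity = lambda T: np.sum(thermal_p(T)**2)   # tr(rho^2) for diagonal rho

assert np.isclose(thermal_p(0.01)[0], 1.0)   # nearly pure ground state
assert purity(0.01) > 0.99
assert purity(50.0) < 0.05                   # strongly mixed at high T
```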
¹This value of k_B is exact; it's a definition.
Figure 8.2.1: The Planck Curve: N.B. The horizontal axis is wavelength λ ∼ 1/ω
This is the famous Planck formula for the spectrum of light emitted from a so-called black body of temperature T, albeit in just one dimension. In three dimensions there is a 4πω³ in the numerator. The extra 4πω² comes from counting modes in three dimensions; basically it arises from the growth of the size of a sphere, as ω = |k| where k is the spatial momentum. A plot of the three-dimensional curve is given in Fig. 8.2.1. In particular it was observed that all bodies radiate with a certain spectrum that depends on their temperature. Humans (living ones) radiate in the infra-red. If a body is hotter you can see it as visible light, such as an oven hob (not an induction one!). Even the empty space of the Universe emits radiation in the form of the cosmic microwave background (CMB), corresponding to a temperature of about −270°C. Planck's formula for E_thermal matches experiment, but deriving it was one of those problems the nineteenth century physicists thought was just a loose string. Rather, Planck had to introduce the notion of discrete energies and, yes, ℏ to derive it. And that was the beginning of the end for Classical Physics.
8.3 Entropy
To continue we need to talk about the idea of entropy. Entropy is a measure of disorder.
Or more precisely it measures how many microscopic states lead to the same macroscopic
one. All your various charger cables are always tangled up as there are so many more
tangled possibilities than untangled ones. It’s not that life is against you its just that
being organised is an exponentially unlikely state of affairs: organisation requires some
8.4. ENTANGLEMENT 85
organiser and quite some effort2 . The most basic definition of entropy is it is the log of
the number of states with the same energy. So
This requires a little motivation. Firstly to compute the Log of an operator the easiest
way is to diagonalise it so:
SvN = − pn ln pn (8.37)
n
In fact there is a theorem that SvN ≤ ln(dim H). A mixed state with SvN = ln(dim H)
is said to be maximally mixed. For example a maximally mixed state arises when
pn = 1/N for all n, where N = dim H. Such a mixed state is equally mixed between all
pure states.
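The von Neumann entropy and the maximally mixed bound can be checked directly:

```python
import numpy as np

# S_vN = -sum p_n ln p_n, computed from the eigenvalues of rho.
def von_neumann(rho):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]            # drop zero eigenvalues: 0*ln(0) = 0
    return -np.sum(p * np.log(p))

N = 4
rho_pure = np.diag([1.0, 0.0, 0.0, 0.0])
rho_max = np.eye(N) / N                    # maximally mixed: p_n = 1/N

assert np.isclose(von_neumann(rho_pure), 0.0)
assert np.isclose(von_neumann(rho_max), np.log(N))
```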
8.4 Entanglement
Perhaps the most mystical and celebrated feature of Quantum Mechanics is the existence
of entangled states. What do we mean precisely? First let’s look at the quintessential
example of two particles in the |0, 0〉 state that we saw in section 4.2:
$$|0,0\rangle = \frac{1}{\sqrt2}\left(|\tfrac12,\tfrac12\rangle\otimes|\tfrac12,-\tfrac12\rangle - |\tfrac12,-\tfrac12\rangle\otimes|\tfrac12,\tfrac12\rangle\right) \qquad (8.39)$$
This is a state that consists of two particles, each of which carries ±ℏ/2 units of angular momentum around the z-axis, in such a way that the total angular momentum vanishes.
²Of course life could still be against you; even though everyone's charger cables get tangled, yours are more tangled than others and always at the worst time.
³One could say that the entropy of entropy definitions is non-zero.
But we don’t know if the first one is spinning clockwise and the second anti-clockwise
or vice-versa.
Now these two particles might be very far from each other. Perhaps they were created
in some experiment as a pair and then flew away across the universe, like endless rain
into a paper cup. Suppose sometime later we measure one of the particles, say the first, and find it in the state |½,½⟩. Then we know, with absolute certainty, that the second particle, wherever it is in the Universe, is in the |½,−½⟩ state. This is very counterintuitive but it doesn't contradict anything we know about the world (and indeed has
been verified experimentally). For example we can’t use it to send messages faster than
light as we can’t control how the first particle’s wavefunction ‘collapses’.
So how can we understand this more generally? The basic idea is that the total
Hilbert space of the system can be viewed as a direct product of two Hilbert spaces:
H = H1 ⊗ H2 (8.40)
Of course it could be a direct product of many separate Hilbert spaces, but this is enough for us. In the example above H₁,₂ are the Hilbert spaces of the individual particles. Formally all this means is that a general state of the system can be written as
$$|\Psi\rangle = \sum_{n',n''}c_{n'n''}|e_{n'}\rangle\otimes|e_{n''}\rangle \qquad (8.41)$$
with |e_{n'}⟩ a basis of H₁ and |e_{n''}⟩ a basis of H₂. Note that a pure state takes the form
$$\rho = |\Psi\rangle\langle\Psi| = \sum_{n',n'',m',m''}c_{n'n''}c^*_{m'm''}\,|e_{n'}\rangle\langle e_{m'}|\otimes|e_{n''}\rangle\langle e_{m''}| \qquad (8.42)$$
Next we admit that we have no idea what is going on in the second Hilbert space H₂. Thus we construct a density matrix where we sum (trace) over all the options for H₂:
$$\rho_{reduced} = \sum_{n''}\langle e_{n''}|\rho|e_{n''}\rangle = \sum_{n',m'}a_{n'm'}|e_{n'}\rangle\langle e_{m'}|\,,\qquad a_{n'm'} = \sum_{n''}c_{n'n''}c^*_{m'n''} \qquad (8.44)$$
Furthermore the pn′ , which are the eigenvalues of an′ m′ , are real and in fact positive
(because, roughly, a ∼ cc† is positive definite) and less than one (because |Ψ〉 is nor-
malised).
Thus we find a density matrix, called the reduced density matrix, in the first Hilbert space. Even though we started from a pure state, from the point of view of the first Hilbert space ρ_reduced is in general a mixed state. Furthermore we can evaluate its von Neumann entropy:
$$S_{EE} = -\mathrm{tr}\left(\rho_{reduced}\ln\rho_{reduced}\right)$$
which is generally referred to as the entanglement entropy. Note that there is a theorem
that, had we traced over H1 and computed the entanglement entropy of the resulting
density matrix in H2 then we would find the same value for SEE . In other words the
state is equally entangled between the two sub-Hilbert spaces.
So let's go back to our two particle system and first consider the state
$$|\Psi\rangle = |\uparrow\rangle\otimes|\downarrow\rangle \qquad (8.47)$$
Here ρ_reduced = |↑⟩⟨↑| is pure and indeed S_EE = 0. The reason is that the unknown part of the state arising from H₂ is just one state, so averaging over it doesn't really lose any information.
However suppose we start with
$$|\Psi\rangle = \frac{1}{\sqrt2}\left(|\uparrow\rangle\otimes|\downarrow\rangle + |\downarrow\rangle\otimes|\uparrow\rangle\right) \qquad (8.50)$$
then tracing over the second Hilbert space the middle two terms drop out and we find
$$\rho_{reduced} = \frac12\left(|\uparrow\rangle\langle\uparrow|+|\downarrow\rangle\langle\downarrow|\right) \qquad (8.52)$$
which agrees with the ρ₂ that we had above. We also see that
$$S_{EE} = -\frac12\ln\frac12-\frac12\ln\frac12 = \ln2 \qquad (8.53)$$
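The partial trace computation can be sketched numerically by viewing the coefficients c_{n'n''} of (8.41) as a matrix:

```python
import numpy as np

# Reduced density matrix of (8.50): trace over the second factor of
# rho = |Psi><Psi|, with Psi stored as the coefficient matrix c[n', n''].
c = np.zeros((2, 2))
c[0, 1] = c[1, 0] = 1.0 / np.sqrt(2)          # (|ud> + |du>)/sqrt(2)

rho_reduced = np.einsum('ab,cb->ac', c, c)    # a_{n'm'} = sum_{n''} c c*
p = np.linalg.eigvalsh(rho_reduced)

assert np.allclose(rho_reduced, 0.5 * np.eye(2))   # (8.52)
S_EE = -np.sum(p * np.log(p))
assert np.isclose(S_EE, np.log(2))                 # (8.53)
```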
as before. The lesson is that the ‘Classical’ density matrix we considered above arises
in a quantum theory by tracing over hidden degrees of freedom. In other words the
quantum description is more refined, i.e. contains more information.
Chapter 9

Relativistic Quantum Mechanics
Let us consider the Schrödinger equation for a collection of N particles with positions
xa , a = 1, ..., N :
$$i\hbar\frac{\partial\Psi}{\partial t} = -\sum_a\frac{\hbar^2}{2m_a}\nabla_a^2\Psi + V(x_a)\Psi \qquad (9.1)$$
where Ψ = Ψ(t, x₁, ..., x_N). We don't expect there to be a preferred point or direction in space, so let us suppose that the potential only depends on the distances between the positions of the particles:
$$V = V(|x_a-x_b|)$$
Schrödinger's equation is then rotationally invariant. Next consider a Galilean boost:
$$x_a \to x'_a = x_a + vt \qquad (9.4)$$
Clearly V is invariant as xa − xb = x′a − x′b . And the Laplacian terms don’t change.
However if we write Ψ'(t, x_a) = Ψ(t, x'_a) then
$$\frac{\partial\Psi'}{\partial t} = \frac{\partial\Psi}{\partial t} + v\cdot\sum_a\nabla_a\Psi \qquad (9.5)$$
Thus we find an extra term on the left hand side and the Schrödinger equation is no longer invariant. However this can be compensated for by taking instead
$$\Psi'(t,x_a) = e^{-\frac{i}{\hbar}\left(v\cdot\sum_am_ax_a+\frac12\sum_am_a|v|^2t\right)}\,\Psi(t,x'_a) \qquad (9.6)$$
i.e. we include an extra phase factor in the wavefunction. As a result we find (see the
problem set)
$$i\hbar\frac{\partial\Psi'}{\partial t} = -\sum_a\frac{\hbar^2}{2m_a}\nabla_a^2\Psi' + V(x'_a)\Psi' \qquad (9.7)$$
Thus we have a symmetry under what are known as Galilean boosts, corresponding to a notion of Galilean relativity. In particular, in Galilean relativity time is absolute: everyone agrees on t and on an infinitesimal tick of the clock dt. So too do they all agree on spatial lengths dx² + dy² + dz². Thus observers may move in reference frames which are related by rotations and boosts of the form (9.4).
That's great, but as Einstein showed in 1905 Galilean boosts are not symmetries of space and time. Rather Lorentzian boosts are, and we need to consider Special Relativity! In Galilean relativity the speed of light is not a constant because under a boost velocities are shifted:
$$\dot x \to \dot x + v \qquad (9.8)$$
But Einstein’s great insight (well one of them) was to insist that the speed of light is a
universal constant. I won’t go into why except to mention that Maxwell’s equations for
electromagnetism don’t allow you to go to a frame where c = 0. They are not invariant
under the Galilean boosts we saw above. They are instead invariant under Lorentz
transformations which include rotations and Lorentz boosts which we now describe.
So what is a Lorentz boost? In Special Relativity we have that the invariant notion of length is, for an infinitesimal displacement,¹
$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2$$
Here c is the speed of light and we only require that ds² is the same for all observers (not dt and dx² + dy² + dz² separately). Thus we are allowed to consider a transformation of the form
$$ct' = \gamma(ct + \beta\cdot x)\,,\qquad x' = \gamma(x + \beta ct)$$
We need ds'² = ds² and hence (we could also take the other sign but that would flip the direction of time)
$$\gamma = \frac{1}{\sqrt{1-|\beta|^2}} \qquad (9.12)$$
Thus to recover the Galilean boost x' = x + vt we identify β = v/c. Small β means |v| ≪ c. Then we see that t' ≈ t and we have recovered absolute time. These transformations are called Lorentzian boosts and together with spatial rotations they form the Lorentz group.
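A quick numerical check (in units c = 1, one spatial dimension) that a Lorentz boost preserves ds² and reduces to a Galilean boost at small β:

```python
import numpy as np

# 1d Lorentz boost; preserves ds^2 = t^2 - x^2 (units c = 1).
def boost(t, x, beta):
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    return gamma * (t + beta * x), gamma * (x + beta * t)

rng = np.random.default_rng(0)
t, x, beta = rng.normal(), rng.normal(), 0.6
tp, xp = boost(t, x, beta)

assert np.isclose(tp**2 - xp**2, t**2 - x**2)     # ds'^2 = ds^2
# small beta recovers the Galilean boost x' ~ x + v t:
tp2, xp2 = boost(t, x, 1e-6)
assert np.isclose(xp2, x + 1e-6 * t)
```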
$$E^2 = |p|^2c^2 + m^2c^4$$
(setting p = 0 immediately gives a famous formula). But this allows for both positive and negative energy.
The other way to go, and one of those oh so clever moments of discovery, is to find an equation that is first order in time and space. This is what Dirac did, and it's beautiful. So let us try something like
$$\left(\frac{i\hbar}{c}\gamma^0\frac{\partial}{\partial t} - i\hbar\gamma\cdot\nabla + mc\right)\Psi = 0 \qquad (9.16)$$
We will not be very specific yet about what γ⁰ and γ^i are. We want this equation to imply the Klein-Gordon equation, so we square it (and change the sign of i for good measure):
$$\left(-\frac{i\hbar}{c}\gamma^0\frac{\partial}{\partial t}+i\hbar\gamma\cdot\nabla+mc\right)\left(\frac{i\hbar}{c}\gamma^0\frac{\partial}{\partial t}-i\hbar\gamma\cdot\nabla+mc\right)\Psi = 0$$
$$\left(\frac{\hbar^2}{c^2}(\gamma^0)^2\frac{\partial^2}{\partial t^2}+\frac{\hbar^2}{2}\{\gamma^i,\gamma^j\}\frac{\partial^2}{\partial x^i\partial x^j}-\frac{\hbar^2}{c}\{\gamma^0,\gamma^i\}\frac{\partial^2}{\partial t\partial x^i}+m^2c^2\right)\Psi = 0 \qquad (9.17)$$
For this to reproduce the Klein-Gordon equation we need
$$(\gamma^0)^2 = 1\,,\qquad \{\gamma^0,\gamma^i\} = 0\,,\qquad \{\gamma^i,\gamma^j\} = -2\delta^{ij} \qquad (9.18)$$
Clearly this can't happen if the γ's are just numbers. But maybe if they are matrices? Some trial and error gives the following solution (unique up to conjugation γ → UγU⁻¹):
$$\gamma^0 = \begin{pmatrix}0&I\\I&0\end{pmatrix}\qquad \gamma^i = \begin{pmatrix}0&\tau_i\\-\tau_i&0\end{pmatrix} \qquad (9.19)$$
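The algebra (9.18) can be verified directly for this representation with numpy:

```python
import numpy as np

# Build the gamma matrices (9.19) from the Pauli matrices and check (9.18).
I2, Z2 = np.eye(2), np.zeros((2, 2))
tau = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]])]

g0 = np.block([[Z2, I2], [I2, Z2]])
g = [np.block([[Z2, t], [-t, Z2]]) for t in tau]

anti = lambda A, B: A @ B + B @ A
assert np.allclose(g0 @ g0, np.eye(4))                        # (gamma^0)^2 = 1
for i in range(3):
    assert np.allclose(anti(g0, g[i]), 0)                     # {gamma^0, gamma^i} = 0
    for j in range(3):
        assert np.allclose(anti(g[i], g[j]), -2 * (i == j) * np.eye(4))
```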
where I is the 2 × 2 unit matrix and τ_i the Pauli matrices. Thus Ψ must be a complex four-component vector:
$$\Psi = \begin{pmatrix}\psi_1\\\psi_2\\\psi_3\\\psi_4\end{pmatrix} \qquad (9.20)$$
$$x^0 = ct\,,\quad x^1 = x\,,\quad x^2 = y\,,\quad x^3 = z \qquad (9.21)$$
where
$$\eta_{\mu\nu} = \begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix} \qquad (9.23)$$
and repeated indices are summed over. Note that in such sums one index is always up
and one down. This ensures that any quantity that has no left-over indices is a Lorentz
invariant. It is also helpful to introduce the matrix inverse to η_µν, which is denoted by
$$\eta^{\mu\nu} = \eta^{-1}_{\mu\nu} = \begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix} \qquad (9.24)$$
Beautiful.
9.3 Spinors
What about Ψ? It now has four components so we could introduce an index Ψ_α, α = 1, 2, 3, 4, in which case the γ-matrices also pick up indices:
$$\gamma^\mu = (\gamma^\mu)_\alpha{}^\beta \qquad (9.26)$$
In this notation the relations (9.18) become
$$\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}I \iff \{\gamma^\mu,\gamma^\nu\}_\alpha{}^\beta = 2\eta^{\mu\nu}\delta_\alpha{}^\beta \qquad (9.27)$$
This is called a Clifford algebra². Ψ is called a spinor and the α, β indices spinor indices. This is a new kind of object for us, so let's see the consequences.
Under a Lorentz transformation:
x′µ = Λµ ν xν (9.28)
In Relativity we adopt a notation where contracting an index with η_µν lowers it: ω^µ{}_ρη_{µσ} = ω_{ρσ}. Similarly one can raise an index by contracting with η^{µν}. Thus the Lie algebra consists of antisymmetric ω_{ρσ} with generators
$$(M_{\rho\sigma})^\mu{}_\nu = \delta^\mu{}_\rho\eta_{\sigma\nu} - \delta^\mu{}_\sigma\eta_{\rho\nu}$$
These are just 4 × 4 matrices with a single ±1 somewhere in the upper triangle and a ∓1 in the corresponding lower triangle to make them antisymmetric. For example:
$$(M_{12})^\rho{}_\sigma = \delta^\rho{}_1\eta_{2\sigma} - \delta^\rho{}_2\eta_{1\sigma} = \begin{pmatrix}0&0&0&0\\0&0&-1&0\\0&1&0&0\\0&0&0&0\end{pmatrix} \qquad (9.35)$$
[Mµν , Mλρ ] = Mµρ ηνλ − Mνρ ηµλ + Mνλ ηµρ − Mµλ ηνρ (9.36)
The extra factor of 1/2 comes from the over counting due to the anti-symmetry. To find
a finite transformation we exponentiate:
$$S = e^{\frac12\omega^{\mu\nu}\Sigma_{\mu\nu}} \qquad (9.39)$$
$$\partial_\mu\Psi = \frac{\partial x'^\nu}{\partial x^\mu}\frac{\partial\Psi}{\partial x'^\nu} = \Lambda^\nu{}_\mu\,\partial'_\nu\Psi \qquad (9.41)$$
We want to show
$$-i\hbar\gamma^\mu\partial_\mu\Psi' + mc\Psi' = S\left(-i\hbar\gamma^\mu\partial'_\mu\Psi(x') + mc\Psi(x')\right) = 0 \qquad (9.42)$$
which requires
$$S^{-1}\Lambda^\mu{}_\nu\gamma^\nu S = \gamma^\mu \qquad (9.43)$$
At the infinitesimal level this becomes
$$-\frac12\omega^{\rho\sigma}\Sigma_{\rho\sigma}\gamma^\mu + \omega^\mu{}_\nu\gamma^\nu + \frac12\gamma^\mu\omega^{\rho\sigma}\Sigma_{\rho\sigma} = 0$$
$$\omega^{\rho\sigma}\left(-\frac14\gamma_{\rho\sigma}\gamma^\mu + \frac12\delta_\rho{}^\mu\gamma_\sigma - \frac12\delta_\sigma{}^\mu\gamma_\rho + \frac14\gamma^\mu\gamma_{\rho\sigma}\right) = 0 \qquad (9.44)$$
This is in fact true as you can check. But for now just try some cases: if µ, ρ, σ are all
distinct then γ µ commutes with γρσ and the left hand side vanishes. So does the right
hand side. If µ = σ and ρ ∕= σ then the two terms on the left hand side will add to give
a factor 2 which agrees with the right hand side.
Ψ′ = −Ψ . (9.48)
Thus under a rotation of 2π spinors come back with a minus sign. This is very important.
It turns out that this means particles described by spinors must have wavefunctions which change sign under swapping any two particles. Such particles are called Fermions (particles that don't have such a sign, such as photons, are called Bosons). This in turn implies the Pauli exclusion principle: no two spinor particles can be in the same state. This is what makes matter hard: when you add electrons to a Hydrogen atom to build heavier elements they must fill out different energy levels.
How can we construct a Lorentz scalar from a spinor? For a vector we form the contraction, e.g. ds² = η_{µν}dx^µdx^ν = dx_µdx^µ. For a spinor we need to consider something like
$$\Psi_1^\dagger C\Psi_2$$
for some spinor matrix C. Now under a Lorentz transformation δΨ_{1,2} = ¼ω_{µν}γ^{µν}Ψ_{1,2} and hence
$$\delta\Psi_1^\dagger = \frac14\Psi_1^\dagger\omega_{\mu\nu}(\gamma^{\mu\nu})^\dagger = \frac14\Psi_1^\dagger\omega_{\mu\nu}(\gamma^\nu)^\dagger(\gamma^\mu)^\dagger \qquad (9.50)$$
Now it is easy to check that
$$(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0 \implies (\gamma^\nu)^\dagger(\gamma^\mu)^\dagger = \gamma^0\gamma^\nu\gamma^\mu\gamma^0 \qquad (9.51)$$
thus
$$\delta\Psi_1^\dagger = \frac14\Psi_1^\dagger\omega_{\mu\nu}\gamma^0\gamma^{\nu\mu}\gamma^0 = -\frac14\Psi_1^\dagger\omega_{\mu\nu}\gamma^0\gamma^{\mu\nu}\gamma^0 \qquad (9.52)$$
What we need is
$$0 = \delta(\Psi_1^\dagger C\Psi_2) = \delta\Psi_1^\dagger\,C\Psi_2 + \Psi_1^\dagger C\,\delta\Psi_2 = \frac14\Psi_1^\dagger\omega_{\mu\nu}\left(-\gamma^0\gamma^{\mu\nu}\gamma^0C + C\gamma^{\mu\nu}\right)\Psi_2 \qquad (9.53)$$
Thus we can simply take C = γ 0 . This defines the Dirac conjugate:
$$\bar\Psi = \Psi^\dagger\gamma^0 \qquad (9.54)$$
and hence Ψ̄₁Ψ₂ is a Lorentz scalar. Thus C = γ⁰ plays the same role for spinor indices that η^{µν} plays for spacetime indices. If we put indices on it we have C^{αβ} = (γ⁰)_α{}^β.
A final comment is that Ψ is called a Dirac spinor. This is not an irreducible representation of so(1, 3). One can impose the constraint
$$\gamma_5\Psi = -i\gamma^0\gamma^1\gamma^2\gamma^3\Psi = \pm\Psi \qquad (9.55)$$
The eigenstates of γ₅ are known as Weyl spinors and are often called left or right handed spinors. Given the form of the γ-matrices above we have
$$\gamma_5 = \begin{pmatrix}I&0\\0&-I\end{pmatrix} \qquad (9.56)$$
so that Ψ splits as Ψ = (Ψ_L, Ψ_R)ᵀ, where Ψ_{L/R} are the left and right handed spinors. These form irreducible representations of so(1, 3) and each is a complex 2-vector in spinor space.
So we still have negative energy states! But we do have a positive definite density.
Indeed we can identify a 4-current
$$J^\mu = \bar\Psi\gamma^\mu\Psi \qquad (9.59)$$
which is conserved:
$$\partial_\mu J^\mu = \partial_\mu\bar\Psi\,\gamma^\mu\Psi + \bar\Psi\gamma^\mu\partial_\mu\Psi = i\frac{mc}{\hbar}\bar\Psi\Psi - i\frac{mc}{\hbar}\bar\Psi\Psi = 0 \qquad (9.60)$$
as you can show in the problem set, along with showing that Ψ̄γ^µΨ is a Lorentz vector. This means that the time component can be used to define a conserved quantity:
$$\frac1c\frac{d}{dt}\int J^0\,d^3x = \int\partial_0J^0\,d^3x = -\int\partial_iJ^i\,d^3x = 0 \qquad (9.62)$$
And furthermore
$$J^0 = \bar\Psi\gamma^0\Psi = \Psi^\dagger\Psi \qquad (9.63)$$
is positive definite.
So we’ve discovered something beautiful and new but haven’t solved all our problems.
We still have negative energy states and not all particles are described by the Dirac
equation: just Fermions such as electrons and quarks. Dirac’s solution to the negative
energy states was to use the Pauli exclusion principle to argue that the ground state
has all negative energy states filled. This however implies that you should be able to
knock a state with negative energy out of the vacuum. Such an excitation has the same
energy (mass) but opposite charge to a positive energy state. In so doing he predicted
anti-particles, which were subsequently observed. We now know that the Higgs boson (as with other Bosons) satisfies a Klein-Gordon-like equation and that these too have anti-particles. The full solution comes by allowing the number of particles to change, i.e. one allows for the creation and annihilation of particles, and leads to quantum field theory. But that's another module....
Chapter 10

The Feynman Path Integral
Our last topic is to present another way of formulating quantum mechanics known as
the path integral formulation. It is perhaps the most popular way as one can incorporate
symmetries such as Lorentz invariance from the outset. It also allows us to make contact
with the principle of least action and hence the Classical world.
Let us start by considering a general quantum mechanical system with coordinate q̂ and its conjugate momentum p̂. Let |a⟩ be the eigenstates of q̂ and |k⟩ the eigenstates of p̂, realised by the wavefunctions ψ_k(q) = e^{ikq/ℏ}. Then
$$\langle a_1|a_2\rangle = \delta(a_2-a_1)$$
$$\langle k_1|k_2\rangle = \int\psi_{k_1}(q)^*\psi_{k_2}(q)\,dq = \int e^{i(k_2-k_1)q/\hbar}\,dq = 2\pi\hbar\,\delta(k_2-k_1) \qquad (10.6)$$
In a suitable sense (don't ask too many questions) the |a⟩ form a basis. We can write the identity operator as
$$I = \int da\,|a\rangle\langle a| \qquad (10.7)$$
$$K(a_2,t_2;a_1,t_1) = \langle a_2|\,Ie^{i\hat H(t_2-t_{N+2})/\hbar}\,Ie^{i\hat H(t_{N+2}-t_{N+1})/\hbar}\,Ie^{i\hat H(t_{N+1}-t_N)/\hbar}\cdots Ie^{i\hat H(t_3-t_1)/\hbar}\,|a_1\rangle \qquad (10.16)$$
Next we replace each occurrence of I by (10.8) followed by (10.7) alternately:
$$K(a_2,t_2;a_1,t_1) = \langle a_2|\int\frac{dk_{N+2}}{2\pi\hbar}|k_{N+2}\rangle\langle k_{N+2}|e^{i\hat H(t_N-t_{N-1})/\hbar}\int da_{N+1}|a_{N+1}\rangle\langle a_{N+1}|e^{i\hat H(t_{N-1}-t_{N-2})/\hbar}\cdots|a_1\rangle$$
$$= \prod_{i=3}^{N+2}\int\frac{dk_i}{2\pi\hbar}\int da_i\;\langle a_2|k_i\rangle\langle k_i|e^{i\hat H(t_i-t_{i-1})/\hbar}|a_i\rangle \qquad (10.17)$$
We further suppose that each pair t_i − t_{i−1} = (t₂ − t₁)/(N + 1) = δt, so that
$$K(a_2,t_2;a_1,t_1) = \prod_{i=3}^{N+2}\int\frac{dk_i}{2\pi\hbar}\int da_i\;e^{\frac{i}{\hbar}\sum_i\left(k_i(a_{i+1}-a_i)+H(k_i,a_i)\delta t\right)} = \prod_{i=3}^{N+2}\int\frac{dk_i}{2\pi\hbar}\int da_i\;e^{\frac{i}{\hbar}\sum_i\left(k_i\frac{a_{i+1}-a_i}{\delta t}+H(k_i,a_i)\right)\delta t} \qquad (10.18)$$
In the large N limit we can recognise the object appearing in the exponential as an integral:
$$\lim_{N\to\infty}\sum_i\left(k_i\frac{a_{i+1}-a_i}{\delta t}+H(k_i,a_i)\right)\delta t = \int_{t_1}^{t_2}\left(k(t)\dot a(t)+H(k(t),a(t))\right)dt \qquad (10.19)$$
where a_{i+1} − a_i = ȧ(t_{i+1} − t_i) = ȧδt, with a dot denoting differentiation with respect to time. Thus we find an example of a path-integral:
$$K(a_2,t_2;a_1,t_1) = \int\left[\frac{dk}{2\pi\hbar}\right][da]\;e^{\frac{i}{\hbar}\int\left(k(t)\dot a(t)+H(k(t),a(t))\right)dt} \qquad (10.20)$$
where k(t) and a(t) are arbitrary paths such that a(t₁) = a₁ and a(t₂) = a₂. This is an infinite dimensional integral: we are integrating over all paths k(t) and a(t). It is not well understood mathematically but is very powerful in physics.
where N is an infinite but irrelevant constant that we can absorb. Again we recognise the exponent as an integral:
$$\lim_{N\to\infty}\sum_i\left(\frac{m(a_{i+1}-a_i)^2}{2(\delta t)^2}-V(a_i)\right)\delta t = \int_{t_1}^{t_2}\left(\frac12m\dot a^2 - V(a)\right)dt = S[a] \qquad (10.24)$$
Thus
$$K(a_2,t_2;a_1,t_1) = N\int[da]\,e^{-\frac{i}{\hbar}S[a(t)]} \qquad (10.25)$$
This is known as the Schrödinger propagator. You can check that, for t₂ ≠ t₁,
$$i\hbar\frac{\partial K_S}{\partial t_2} = -\frac{\hbar^2}{2m}\frac{\partial^2K_S}{\partial a_2^2}\,. \qquad (10.30)$$
This object is similar to the partition function that we saw above only in imaginary
temperature β = −i/ and weighted by the action not the energy. In fact the relation
between the partition function of a quantum theory, as defined by (10.33), and the
partition function we saw in our discussion of thermal statistical physics is quite deep.
Thus
$$S = -\frac12\sum_{n,m}q_nq_m\int e_n(t)\,E\,e_m(t)\,dt \qquad (10.37)$$
Let us choose the e_n(t) to be orthonormal eigenstates of E with eigenvalues λ_n:
$$S = -\frac12\sum_{n,m}q_nq_m\int e_n(t)\,\lambda_m\,e_m(t)\,dt = -\frac12\sum_n\lambda_nq_n^2 \qquad (10.38)$$
Thus
$$\int[dq]\,e^{-\frac{i}{\hbar}S[q]} = \prod_n\int dq_n\;e^{\frac{i}{2\hbar}\sum_m\lambda_mq_m^2} = \prod_n\int dq_n\;e^{\frac{i}{2\hbar}\lambda_nq_n^2} = \prod_n\sqrt{\frac{2\pi i\hbar}{\lambda_n}} = \frac{N}{\sqrt{\det E}}\,,\qquad \det E = \prod_n\lambda_n \qquad (10.39)$$
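The relation between the Gaussian integral and det E can be illustrated in the Euclidean analogue (real exponent, finite dimensions), where the integrals are absolutely convergent; the matrix E below is an arbitrary positive definite example:

```python
import numpy as np
from scipy.integrate import dblquad

# Euclidean analogue: integral of exp(-q.E.q/2) = (2*pi)^(n/2)/sqrt(det E)
#                                               = prod_n sqrt(2*pi/lambda_n).
E = np.array([[2.0, 0.5], [0.5, 1.0]])

integrand = lambda y, x: np.exp(-0.5 * np.array([x, y]) @ E @ np.array([x, y]))
val, _ = dblquad(integrand, -10, 10, lambda x: -10, lambda x: 10)

lam = np.linalg.eigvalsh(E)
assert np.isclose(val, np.prod(np.sqrt(2 * np.pi / lam)), rtol=1e-4)
assert np.isclose(np.prod(lam), np.linalg.det(E))
```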
where the determinant is the infinite product of the eigenvalues of the operator E. More generally still, if one has
$$S = -\int\left(\frac12q_aE_{ab}q_b + J_aq_a\right)dt \qquad (10.42)$$
we find
$$\int\prod_a[dq_a]\;e^{-\frac{i}{\hbar}S[q_1,\ldots,q_n]} = \frac{N}{\sqrt{\det E_{ab}}}\,e^{J_aE^{-1}_{ab}J_b} \qquad (10.43)$$
¹Some of you might be worrying that these integrals are not well defined and that this is all just hocus-pocus. Don't; just relax and don't ask too many questions.