
LECTURE NOTES ADVANCED QUANTUM MECHANICS

AP3051

Jos Thijssen

Autumn 2013

Copyright © 2013 by TU Delft


An electronic version of these notes is available at
http://blackboard.tudelft.nl/.
PREFACE
These notes are intended to be used for the course “Advanced Quantum Mechanics”, which is part of the master programme in Applied Physics at Delft University of Technology. They are the result of teaching activities in the field of quantum mechanics over a range of years and at different levels.

Some relatively elementary topics are covered in the first few chapters of these lecture notes. After some review of complex analysis and bachelor-level quantum mechanics, we shall dig into the formalism of quantum mechanics a bit more deeply, thereby uncovering interesting relations with classical mechanics. Further topics such as variational calculus and the WKB approximation are covered, before entering into the exciting world of Green's functions, which have applications in scattering theory, open quantum systems and interacting many-body systems. After applying the Born approximation in scattering theory, we enter the main topics of this lecture course: the quantization of harmonic fields such as lattice vibrations and the electromagnetic field, the formalism of second quantization, open quantum systems and relativistic quantum mechanics. Some of these topics are taught in an optional add-on course.

Elaborate exercises are an integral part of the process of learning quantum mechanics,
and the material can simply not be mastered without going through these exercises.

Over the years, I have learnt a lot from students and from co-teachers. I want to
thank them all, and mention here in particular Leo Di Carlo whose many insightful remarks
and comments have contributed significantly to this version.

Quantum mechanics at this level has two faces; it is hard material and requires a lot of
effort. But it is also great fun; after spending long hours trying to find the solution to the
problems, the satisfaction lets you forget all the misery you’ve gone through. I hope that the
fun part will persist in the memory of the students after following this course.

Delft, August 2013

CONTENTS

Preface

1 A complex function theory survival guide

2 A quantum mechanics survival guide
   2.1 Spin-1/2 and the Bloch sphere
   2.2 Schrödinger and Heisenberg pictures
   2.3 Problems

3 Formal quantum mechanics and the path integral
   3.1 The postulates of quantum mechanics
   3.2 Relation with classical mechanics
   3.3 The path integral: from classical to quantum mechanics
   3.4 The path integral: from quantum mechanics to classical mechanics
   3.5 Summary
   3.6 Problems

4 The variational method for the Schrödinger equation
   4.1 Variational calculus
   4.2 Linear variational calculations
      4.2.1 The infinitely deep potential well
      4.2.2 Variational calculation for the hydrogen atom
      4.2.3 Exploiting symmetry
   4.3 Examples of non-linear variational calculus
   4.4 Summary
   4.5 Problems

5 The WKB approximation
   5.1 Introduction
   5.2 The WKB Ansatz
   5.3 The WKB Ansatz II
   5.4 The WKB Ansatz III
   5.5 Tunnelling in the WKB approximation
   5.6 The connection formulae
   5.7 Problems

6 Green's functions in quantum mechanics
   6.1 Introduction
   6.2 Definition of the Green's function
   6.3 Green's functions and perturbations
      6.3.1 Systems with discrete spectra
      6.3.2 Systems with continuous spectra
      6.3.3 Discrete spectra revisited
   6.4 Green's functions and boundaries
   6.5 Summary
   6.6 Problems

7 Scattering in classical and in quantum mechanics
   7.1 Classical analysis of scattering
   7.2 Quantum scattering
   7.3 The optical theorem
   7.4 Summary
   7.5 Problems

8 Systems of harmonic oscillators – phonons and photons
   8.1 Creating and annihilating quanta in the harmonic oscillator
      8.1.1 Coherent states
   8.2 Quantization of the linear chain of atoms
   8.3 The quantum theory of electromagnetic radiation
      8.3.1 Classical electromagnetism
      8.3.2 Quantization of the electromagnetic field
      8.3.3 Some properties of the electromagnetic field
      8.3.4 The Casimir effect
   8.4 Summary
   8.5 Problems

9 Second quantisation
   9.1 Introduction
   9.2 Moving around in Fock space – creation and annihilation operators
      9.2.1 Many-boson systems
      9.2.2 Many-fermion systems
   9.3 Interacting particle systems
   9.4 Change of basis – field operators
   9.5 Examples of many-body systems
      9.5.1 Many non-relativistic particles in a box
      9.5.2 The Heisenberg model and the Jordan-Wigner transformation
   9.6 Summary
   9.7 Problems

10 Electrons and phonons
   10.1 Theory of the electron gas
   10.2 Electron-phonon coupling
   10.3 Problems

11 Superconductivity
   11.1 Introduction
   11.2 Cooper pairs
   11.3 The BCS wave function
   11.4 The BCS Hamiltonian
   11.5 Summary of BCS theory
   11.6 Landau-Ginzburg theory and the London equations
   11.7 Problems

12 Density operators — Quantum information theory
   12.1 Introduction
   12.2 The density operator
   12.3 Entanglement
   12.4 The EPR paradox and Bell's theorem
   12.5 No cloning theorem
   12.6 Dense coding
   12.7 Quantum computing and Shor's factorisation algorithm
   12.8 Problems

13 Open quantum systems
   13.1 Coupling to an environment
      13.1.1 Example: The damping channel
      13.1.2 The operator-sum representation
      13.1.3 Example: qubit depolarization
   13.2 Direct quantum measurements
      13.2.1 System evolution conditioned on the result of direct measurement
      13.2.2 Unconditioned system evolution under direct measurement
      13.2.3 Measurement statistics
      13.2.4 Example: projective measurement of a quantum bit
   13.3 Indirect quantum measurements
      13.3.1 System evolution conditioned on the result of indirect measurement
      13.3.2 Unconditioned system evolution under indirect measurement
      13.3.3 What does an indirect quantum measurement actually measure?
      13.3.4 POVMs
      13.3.5 Example: weak quantum measurement of a qubit
   13.4 Repeated measurements
   13.5 Lindblad representation
      13.5.1 Unconditioned weak measurements in Lindblad form
   13.6 Problems

14 Time evolution of the density operator
   14.1 The Born-Markov master equation
   14.2 Examples
      14.2.1 The damped harmonic oscillator
      14.2.2 Spontaneous emission from an electronic excitation in an atom
   14.3 Problems

15 (More than a) survival guide to special relativity
   15.1 History and Einstein's postulates
   15.2 The Lorentz transformation
   15.3 More about the Lorentz transformation
   15.4 Energy and momentum
   15.5 Mathematical structure of space-time
   15.6 Electromagnetic fields and relativity
   15.7 Relativistic dynamics
   15.8 Summary
   15.9 Problems

16 The Klein-Gordon and Maxwell equations
   16.1 The Klein-Gordon equation
   16.2 Analogy with the Maxwell equations
   16.3 Source terms
   16.4 Static solutions for the propagators
   16.5 Scattering as an exchange of virtual particles
   16.6 Problems

17 The Dirac equation
   17.1 Improving on the Klein-Gordon equation
   17.2 The probability density
   17.3 A new form for the Dirac equation
   17.4 Spin-1/2 for the Dirac particle
   17.5 The hydrogen atom
   17.6 Interaction with an electromagnetic field
   17.7 Problems

18 Second quantization for relativistic particles
   18.1 Introduction
   18.2 Second quantization for Klein-Gordon particles
   18.3 Second quantization for Dirac particles
   18.4 A physical realization of a Dirac field theory: graphene
   18.5 Problems
1 A COMPLEX FUNCTION THEORY SURVIVAL GUIDE

Not every student may have had enough complex analysis to appreciate the manipulations
we must carry out when – for example – calculating Green’s functions in this course; others
may have forgotten most of what they’ve learnt about this subject. Hence, I list the most
important results of complex function theory without proof.
An analytic function is a complex function which can be differentiated an infinite number of times. It turns out that a complex function which is differentiable satisfies the Cauchy-Riemann equations. In order to formulate these equations, we first introduce some notation. A point in the complex plane is given as

z = x + iy.

A complex function f(z) may then be written as

f(z) = u(x, y) + iv(x, y).

Here u is the real, and v the imaginary part of the complex function. The Cauchy-Riemann equations are then

∂u/∂x = ∂v/∂y;    ∂u/∂y = −∂v/∂x.
It turns out that this condition is sufficiently strong to ensure infinite differentiability: a com-
plex function which can be differentiated once, can be differentiated an infinite number of
times. Functions that satisfy this requirement are called analytic.
We often deal with integrations over closed curves (‘contours’). It can be shown that the
integral of an analytic function taken over such a contour gives zero:

∮_Γ f(z) dz = 0,

where Γ denotes the contour. We adopt the convention that in complex integration, the con-
tour is always traversed in the anti-clockwise direction. Reversing the direction reverses the
sign of the result (which in the case of an analytic function has no effect, as the result of the
integration is 0).
We often deal with functions having singularities. Point-like singularities are called poles.
We say that a function f has a pole of order n in the point a on the complex plane if (z − a)^n f(z) is analytic in a, but (z − a)^{n−1} f(z) is not. Now suppose we expand f around a as a series expansion in z − a, including negative powers:

f(z) = Σ_{n=−∞}^{∞} (z − a)^n f_n(a).


The residue of f in a, denoted as res_a f, is defined as the coefficient of (z − a)^{−1} in this expansion. For a pole of order one, also called a simple pole, the residue of f is given as lim_{z→a} (z − a) f(z). In general, for an isolated pole of order n, the residue is defined as

res_a f(z) = 1/(n−1)! lim_{z→a} d^{n−1}/dz^{n−1} [ (z − a)^n f(z) ].

The most important result of complex analysis that we shall be using frequently is about functions with a set of isolated poles a_1, a_2, . . . , a_k within the closed contour Γ. We then have

∮_Γ f(z) dz = 2πi Σ_{j=1}^{k} res_{a_j} f.

This is the so-called residue theorem.


Let us consider an example. We calculate the integral

∮ 1/(1 + z²) dz

over a circle of radius 2 around the origin. The contour contains the points ±i. We note that

1/(1 + z²) = (i/2) · 1/(z + i) − (i/2) · 1/(z − i).

Both terms yield a standard integral with a simple pole, and working them out using the residue theorem yields the value

∮ 1/(1 + z²) dz = (2πi) i/2 − (2πi) i/2 = 0.
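As a quick numerical aside (not part of the original text), the following Python sketch checks this result by parametrising the circle |z| = 2 and summing the integrand; the number of sample points is an arbitrary choice.

    import numpy as np

    # Parametrise the contour |z| = 2, traversed anti-clockwise: z = 2 e^{it}.
    t = np.linspace(0.0, 2.0 * np.pi, 20001)
    z = 2.0 * np.exp(1j * t)
    dz_dt = 2j * np.exp(1j * t)              # dz/dt

    integrand = dz_dt / (1.0 + z**2)
    dt = t[1] - t[0]
    integral = np.sum(integrand[:-1]) * dt   # simple Riemann sum over the closed contour

    print(integral)                          # close to 0, in agreement with the residue theorem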

Another result is important for the cases we will be dealing with. Consider a semi-circle with radius R in the upper half of the complex plane, centred around 0. We call this semi-circle Γ+. Then Jordan's lemma says that if k > 0 and the function f(z) tends uniformly to zero on Γ+ as R → ∞, the integral

∫_{Γ+} e^{ikz} f(z) dz

approaches zero in that limit.


As an example, we calculate

∫_{−∞}^{∞} e^{ikx}/(x − a) dx,

where a is a complex number with a positive imaginary part and k is real and positive. We now evaluate the integral over the closed contour Γ shown in figure 1.1, which consists of the real axis and the semi-circle Γ+. Splitting the closed contour into these two pieces gives

∫_{−∞}^{∞} e^{ikx}/(x − a) dx = ∮_Γ e^{ikx}/(x − a) dx − ∫_{Γ+} e^{ikx}/(x − a) dx.

Jordan's lemma says that the second term vanishes for R → ∞, and the integral over the closed contour can be evaluated using the residue theorem:

∮_Γ e^{ikx}/(x − a) dx = 2πi e^{ika}.

FIGURE 1.1: The contour for evaluating an integral along the real axis.

Finally, we consider the integral

∫ f(x)/x dx

over the real axis. This integral is not well defined at x = 0 unless f vanishes there (and is
continuously differentiable at x = 0). Now let us consider the same integral but running just
above the real axis:
∫ f(x + iε)/(x + iε) dx,

where ε is small. Supposing again that f is regular, the small ε does not change its value significantly, but we have avoided the singularity at x = 0. For x not very close to zero, the integral is approximately equal to the one running over the real axis, and we should focus on what happens near x = 0. Let us work out the imaginary part of 1/(x + iε):

Im[ 1/(x + iε) ] = 1/(2i) [ 1/(x + iε) − 1/(x − iε) ] = −ε/(x² + ε²).

The right-hand side is a narrow peak centred around x = 0, and its integral over x is −π. Therefore, for small ε, it behaves as −πδ(x). All in all, we see that we can write

1/(x + iε) = 1/x (for x away from zero) − iπδ(x).

The first part of this expression is called the principal value and denoted as P:

1/(x + iε) = P(1/x) − iπδ(x).

More precisely,

P(1/x) = 1/x for |x| > ε,  ε → 0.

For completeness, we write down the principal value integral over a function f(x) having a singularity in a:

P ∫ f(x) dx = lim_{ε↓0} [ ∫_{−∞}^{a−ε} f(x) dx + ∫_{a+ε}^{∞} f(x) dx ].
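The identity 1/(x + iε) = P(1/x) − iπδ(x) can be made plausible numerically. The sketch below is my own illustration (not from the notes), with the test function f(x) = exp(−x²) and an arbitrary small ε; it compares the integral just above the real axis with the principal value minus iπ f(0).

    import numpy as np

    eps = 1e-2
    x = np.linspace(-20.0, 20.0, 400001)
    dx = x[1] - x[0]
    f = np.exp(-x**2)                      # a smooth test function with f(0) = 1

    # integral of f(x)/(x + i eps) just above the real axis
    lhs = np.sum(f / (x + 1j * eps)) * dx

    # principal value: leave out a symmetric interval around x = 0
    mask = np.abs(x) > eps
    pv = np.sum(f[mask] / x[mask]) * dx
    rhs = pv - 1j * np.pi * 1.0            # P-integral - i pi f(0)

    print(lhs, rhs)                        # both are approximately -i pi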
2 A QUANTUM MECHANICS SURVIVAL GUIDE

In this chapter we review standard quantum mechanics, highlighting important results. Quan-
tum mechanics describes the evolution of the state of a mechanical system. The state is a
vector in a special kind of complex vector space, the Hilbert space. This is a vector space in
which an inner product is defined.
We represent the state of the system by a ket-vector, such as |ψ⟩. The inner product between vectors |ψ⟩ and |φ⟩ is denoted ⟨φ|ψ⟩. For a quantum mechanical system for which we want to calculate the time evolution, the vector describing the system becomes itself time-dependent: |ψ(t)⟩.
The equation determining the time evolution of a known state vector |ψ(0)⟩ at t = 0 is known as the time-dependent Schrödinger equation:

iħ ∂/∂t |ψ(t)⟩ = Ĥ |ψ(t)⟩.    (2.1)

Here, Ĥ is an operator acting on vectors in the Hilbert space. Ĥ is Hermitian, which means it
is equal to its Hermitian conjugate. The Hermitian conjugate  † of an operator  is defined
as follows: Â† must be such that for any two vectors |ψ⟩ and |φ⟩ it must hold that

⟨φ|Â|ψ⟩ = ( ⟨ψ|Â†|φ⟩ )*.

We summarise:

 † =  means that  is Hermitian.


In particular, the Hamiltonian Ĥ is Hermitian: Ĥ † = Ĥ .

Hermitian operators have important properties:

• The eigenvalues λ of a Hermitian operator are all real: λ = λ∗ .

• Eigenvectors |φ_λ⟩ and |φ_μ⟩ belonging to different eigenvalues λ and μ are always mutually orthogonal:

  ⟨φ_λ|φ_μ⟩ = 0;    λ ≠ μ.

• Degenerate eigenvectors can be chosen orthogonal.

• All eigenvectors span the Hilbert space.


The fact that the eigenvectors form a basis of the Hilbert space leads to the often-used resolution of the identity, in which the unit operator 1 is written as

1 = Σ_λ |φ_λ⟩⟨φ_λ|,

where the sum is over all the eigenvectors.
The solution to the time-dependent Schrödinger equation (2.1) is easy to find: it is just

|ψ(t)⟩ = e^{−iĤt/ħ} |ψ(0)⟩.

This solution is verified by substituting it back into that equation. However, working out this solution is very difficult, as it contains the exponential of an operator. The easiest way to handle this exponent is by diagonalising the operator. Suppose we have a complete set of eigenvectors |φ_n⟩ and eigenvalues E_n for Ĥ:

Ĥ |φ_n⟩ = E_n |φ_n⟩,    (2.2)

and that we know how to expand the initial state |ψ(0)⟩ into these eigenstates. In the basis of these eigenvectors, the exponential of the operator is the diagonal operator with the exponentials of the eigenvalues on its diagonal (I write this here for a finite-dimensional matrix):

e^{−itĤ/ħ} = diag( e^{−iE_1 t/ħ}, e^{−iE_2 t/ħ}, . . . , e^{−iE_N t/ħ} ),

so we obtain

|ψ(t)⟩ = Σ_{j=1}^{N} c_j e^{−iE_j t/ħ} |φ_j⟩,

where

c_j = ⟨φ_j|ψ(0)⟩.
Eq. (2.2) is called the stationary Schrödinger equation.
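As an illustration (not part of the notes), the following Python sketch carries out this programme for an arbitrary 2 × 2 Hermitian matrix, with ħ set to 1: diagonalise Ĥ, expand |ψ(0)⟩ in the eigenvectors, attach the phase factors e^{−iE_j t/ħ}, and compare with the matrix exponential.

    import numpy as np
    from scipy.linalg import expm

    hbar = 1.0
    H = np.array([[1.0, 0.5], [0.5, -1.0]])        # some Hermitian Hamiltonian
    E, phi = np.linalg.eigh(H)                     # eigenvalues E_j, eigenvectors as columns

    psi0 = np.array([1.0, 0.0], dtype=complex)     # initial state |psi(0)>
    c = phi.conj().T @ psi0                        # c_j = <phi_j|psi(0)>

    def psi(t):
        """|psi(t)> expanded in the energy eigenbasis."""
        return phi @ (c * np.exp(-1j * E * t / hbar))

    t = 2.7
    print(np.allclose(psi(t), expm(-1j * H * t / hbar) @ psi0))   # True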


In case you have forgotten how to diagonalise a matrix, I recall that, since for an eigenvector |φ⟩ of an operator Â,

Â |φ⟩ = λ |φ⟩,

|φ⟩ should be a non-zero vector for which

(Â − λ1) |φ⟩ = 0.

From your first lecture on linear algebra, you should know that this can only be true if the determinant of the matrix Â − λ1 vanishes. This leads to an algebraic equation for λ. As an example, consider

Ĥ = ( 0  1
      1  0 ).

The determinant condition is

| −λ   1 |
|  1   −λ | = 0,

which leads to

λ² = 1,
so λ = ±1. This could be anticipated as you may have recognised Ĥ as the Pauli matrix σx
and know that the Pauli matrices all have eigenvalues ±1. Obviously, the larger the matrix,
the higher the order of the equation for λ and the more work it takes to find the eigenvalues.

Once you have the eigenvalue, you may find the corresponding eigenvector by solving linear equations. Calling the eigenvector (a, b), we have, for the +1 eigenvalue,

b = a;    a = b,

and the normalised eigenvector becomes (1, 1)/√2. Similarly we find (1, −1)/√2 for the eigenvector with eigenvalue −1. These eigenvectors could also have been guessed if you let yourself be guided by the symmetric structure of the matrix you are diagonalising. Another helpful fact is that, for a Hermitian matrix (operator), the eigenvectors belonging to different eigenvalues are orthogonal, and that for equal eigenvalues (degeneracy), all eigenvectors can always be chosen orthogonal. The matrix eigenvalue problem can be solved analytically only if the matrix size is modest (typically smaller than or equal to 3) or if the matrix has a simple and/or very regular structure. In all other cases we use numerical routines for solving the eigenvalue problem.
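For completeness, here is a short symbolic check of the determinant condition for this example; this is my own illustration with sympy, and the names used are arbitrary.

    import sympy as sp

    lam = sp.symbols('lambda')
    H = sp.Matrix([[0, 1], [1, 0]])

    char_poly = (H - lam * sp.eye(2)).det()   # the determinant condition
    print(sp.solve(char_poly, lam))           # [-1, 1]
    print(H.eigenvects())                     # eigenvectors proportional to (1, -1) and (1, 1)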

We see that the time evolution of a wave function is determined by the time evolution operator Û(t) = e^{−itĤ/ħ}. From the hermiticity of Ĥ, it is easy to see that Û(t) satisfies

Û(t) Û†(t) = Û†(t) Û(t) = 1,

where 1 is the unit operator. An operator satisfying this equation is called unitary. We see that unitarity of the time evolution operator directly follows from the hermiticity of the Hamiltonian. Interestingly, this unitarity also guarantees that the norm of the wave function is conserved. To see this, we use

|ψ(t)⟩ = Û(t) |ψ(0)⟩;    ⟨ψ(t)| = ⟨ψ(0)| Û†(t),

to evaluate what happens to the norm as time evolves:

⟨ψ(t)|ψ(t)⟩ = ⟨ψ(0)|Û†(t)Û(t)|ψ(0)⟩ = ⟨ψ(0)|ψ(0)⟩,

which shows that the norm is indeed preserved.
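A small numerical sketch (not from the notes; ħ = 1 and a randomly generated Hermitian matrix) confirming that Û(t) = e^{−iĤt/ħ} is unitary and conserves the norm:

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    H = (A + A.conj().T) / 2                       # Hermitian by construction

    U = expm(-1j * H * 1.3)                        # hbar = 1, t = 1.3
    print(np.allclose(U @ U.conj().T, np.eye(4)))  # True: U U† = 1

    psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
    psi_t = U @ psi0
    print(np.vdot(psi0, psi0).real, np.vdot(psi_t, psi_t).real)   # equal squared norms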


The Schrödinger equation is quite a general equation and does not specify the structure
of the Hilbert space, nor the specific form of the Hamiltonian. Physicists have guessed both
in the first decades of the twentieth century and good guesses have turned out to yield results
for physical measurements in excellent agreement with experiment. Here we list a few.
• Spinless point particle in one dimension. Hilbert space: class of square integrable functions (L²) on the real axis. Hamiltonian:

  Ĥ = −ħ²/(2m) d²/dx² + V(x).

• Spinless point particle in three dimensions. Hilbert space: class of square integrable functions (L²) in R³. Hamiltonian:

  Ĥ = −ħ²/(2m) ∇² + V(r).

• Particles with spin 1/2, neglecting their motion. Hilbert space: two-dimensional vector space. Hamiltonian:

  Ĥ = (eB/m) σ_z,

  where B is a magnetic field along the z-axis and σ_z is the Pauli matrix

  ( 1   0
    0  −1 ).

It is easy to extend this list with numerous other cases.


For any physical quantity A, appropriate for the system at hand, there exists a Hermitian operator Â whose eigenvalues λ_n are the possible values of A found in a measurement. These values occur with probability |⟨φ_n|ψ⟩|², where |φ_n⟩ is the eigenvector corresponding to λ_n and |ψ⟩ is the state of the system. The expectation value of A in a system in quantum state |ψ⟩ is given by ⟨ψ|Â|ψ⟩.

We now concentrate on electrons in 3D, moving in the field of a radial potential depending only on the distance r to the origin: V ≡ V(r), r = |r|. This is a special example of a system exhibiting a symmetry. If there is a symmetry, there usually is degeneracy, meaning that two or more eigenvalues of the Hamiltonian (i.e., the energies) have the same values – we shall return to this point in chapter 4. It turns out that if there is symmetry, there are one or more operators that commute with the Hamiltonian. The expectation values of these operators (which are assumed to have no explicit time dependence) then remain unchanged in time, as can easily be checked from the time evolution:

⟨A⟩_t = ⟨ψ_0| e^{itH/ħ} A e^{−itH/ħ} |ψ_0⟩.

Taking the time derivative of this expression yields the commutator [H, A], which vanishes by assumption. It is possible to find a set of vectors that are eigenvectors of all independent operators which commute with H. To be specific, states whose energy eigenvalues are degenerate may have different eigenvalues for an operator Â other than Ĥ. In order to identify all the states uniquely, we need in addition to Ĥ a set of operators Â, B̂, Ĉ, . . . such that each (simultaneous) eigenvector of this set of operators has a unique set of eigenvalues E_n, a_j, b_k, c_l, . . . . The set of all independent operators which commute with H, including H itself, is called an observation maximum:

An observation maximum is the set of all independent operators, including H, that commute amongst themselves and with H. The eigenvalues of all these operators label the simultaneous eigenvectors of all the operators of the observation maximum. They form a basis of the Hilbert space.

Degeneracy is related to symmetry which is present in the Hamiltonian (this is formally substantiated by the quantum mechanical version of Noether's theorem, which we shall not go into here). This relation is the reason why the energies of a 3D system which is spherically symmetric (that is, a system with a radial potential) are degenerate. The operators commuting with the Hamiltonian for a spinless particle in a radial potential are the angular momentum operators L² and L_z. These have the eigenvalues ħ² l(l + 1) and ħm respectively, where l is 0, 1, 2, . . . and m is an integer running from −l to l. The energy eigenvalues depend on l and an additional quantum number, n – they are written as E_nl. For each l, m runs from −l to l in integer steps. So there are 2l + 1 m-values for each l. As the energy eigenvalues do not depend on m, they are (at least) (2l + 1)-fold degenerate. If the particles have spin 1/2, there are additional quantum numbers: s, which always takes on the value 1/2 (as we are dealing with spin-1/2 particles), and m_s, which takes the value ±1/2. All in all, the states of an electron in a radial potential are denoted |n, l, m, s, m_s⟩. The quantum number s, being always 1/2 for an electron, is often left out. If we include the spin, each level with quantum number l is 2(2l + 1)-fold degenerate. Figure 2.1 shows a schematic representation of a spectrum of a particle in a radial potential.
The Coulomb potential is a special case: this potential has some hidden symmetry¹ which causes several of the E_nl to coincide. Whereas we normally label the energy eigenvalues for each l by n = 1, 2, . . ., this degeneracy allows us to label the states as in Figure 2.2. The quantum number along the vertical axis is called the principal quantum number. We see that for each principal quantum number n, we have states with l-values between 0 and n − 1.

¹ This symmetry is related to the four-dimensional rotational group O(4).
FIGURE 2.1: The spectrum of an electron in a radial potential, grouped according to the l-value. In parentheses, the degeneracy of the levels (including the two-fold spin degeneracy) is given.

This is a special degeneracy of the hydrogen atom – the degeneracies of the energy levels can
be found by adding the degeneracies for each l = 0, . . . , n − 1. These degeneracies correspond
to the so-called ‘noble gas’ atoms (for higher n, deviations from this series occur due to effects
not taken into account here).
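The degeneracies quoted in these figures follow from simple counting; the snippet below (my own illustration, not part of the notes) adds the 2(2l + 1) states per l for l = 0, . . . , n − 1 and reproduces the numbers 2, 8, 18.

    # each l contributes 2(2l+1) states (including spin); for the Coulomb potential
    # l runs from 0 to n-1, so each principal quantum number carries 2n^2 states
    for n in range(1, 4):
        degeneracy = sum(2 * (2 * l + 1) for l in range(n))
        print(n, degeneracy)        # 1 2, 2 8, 3 18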

2.1 SPIN-1/2 AND THE BLOCH SPHERE


In this section, we elaborate a bit on the states of a particle whose Hilbert space is two-
dimensional; this is the simplest non-trivial example of a Hilbert space. The standard ex-
ample of such a system is a spin-1/2 particle, but many other realisations of systems with a
two-dimensional Hilbert space are possible. Systems with such a Hilbert space are denoted
as ‘two-level systems’ (TLS).
Any Hermitian operator in this space can be represented as a 2 × 2 Hermitian matrix. This means that such a matrix has in principle 4 degrees of freedom (the diagonal elements must be real, and the off-diagonal elements must be each other's complex conjugate). This means that any Hermitian operator can be written as a linear combination of four basis operators. These are taken to be the unit matrix and the three Pauli matrices:

1 = ( 1  0
      0  1 ),

σ_x = ( 0  1
        1  0 ),

σ_y = ( 0  −i
        i   0 ),

σ_z = ( 1   0
        0  −1 ).

The Pauli matrices satisfy the properties

{σ_j, σ_k} = σ_j σ_k + σ_k σ_j = 2δ_jk,

where the indices j and k stand for x, y or z. The braces {, } are generally used for the anti-commutator in these notes. From this it follows in particular that σ_j² = 1.
FIGURE 2.2: The spectrum of an electron in a Coulomb potential, grouped according to the l-value. The numbers along the vertical axis are the principal quantum numbers. The degeneracies per l-value are the same as for Figure 2.1. Adding the degeneracies for each principal quantum number n gives the degeneracies in parentheses on the right-hand side.

If we apply a space rotation, the spin also changes. This change is expressed by the rotation operator
exp (iα · σ/2), which represents a rotation over an angle |α| about an axis with direction α.
Although this looks like a complicated expression to work out, the property σ2j = 1 enables us
to turn it into a simple expression. For a rotation over an angle α about the z-axis, we have

exp (iασz /2) = cos(α/2)1 + i sin(α/2)σz ,

where the equality can be verified from the Taylor expansions of the exponential function
and of the sine and cosine.
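The following short check (my own, with an arbitrary angle and ħ = 1) verifies this identity by comparing the matrix exponential with the cosine/sine form.

    import numpy as np
    from scipy.linalg import expm

    sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
    alpha = 0.83

    lhs = expm(1j * alpha * sigma_z / 2)
    rhs = np.cos(alpha / 2) * np.eye(2) + 1j * np.sin(alpha / 2) * sigma_z
    print(np.allclose(lhs, rhs))     # True, because sigma_z^2 = 1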
A vector in the spin-1/2 Hilbert space is represented by a vector

|ψ⟩ = ( a
        b ),   with |a|² + |b|² = 1.

This vector is characterised by three real numbers (two complex numbers contain four real numbers, minus one because of the normalisation condition). Furthermore, when we multiply this vector by a phase factor exp(iγ), the state does not change. Therefore by choosing, say, a to be real, there are only two numbers left. These can be represented by a point on a 3D sphere (which itself is a two-dimensional manifold) – see figure 2.3. The point is defined by the two polar angles ϑ and ϕ. The point on the sphere is simply given by the expectation value of the three Pauli matrices:

⟨ψ|σ_x|ψ⟩ = a*b + ab*

and similar for y and z. The point with these coordinates is the polarisation.
Up to an overall phase factor, the relation between the components a and b and the polar angles ϑ and ϕ is given by

a = exp(−iϕ/2) cos(ϑ/2);
b = exp(iϕ/2) sin(ϑ/2),

as can be verified (see problem 9). The sphere of polarisation points is called the Bloch sphere.
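As a sketch of this correspondence (not part of the notes; the angles chosen are arbitrary), the snippet below builds a and b from ϑ and ϕ and checks that the expectation values of the Pauli matrices are the Cartesian coordinates (sin ϑ cos ϕ, sin ϑ sin ϕ, cos ϑ) of the corresponding point on the sphere.

    import numpy as np

    theta, phi = 0.7, 1.9
    a = np.exp(-1j * phi / 2) * np.cos(theta / 2)
    b = np.exp(+1j * phi / 2) * np.sin(theta / 2)
    psi = np.array([a, b])

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)

    polarisation = [np.vdot(psi, s @ psi).real for s in (sx, sy, sz)]
    expected = [np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)]
    print(np.allclose(polarisation, expected))     # True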
FIGURE 2.3: The Bloch sphere.

2.2 SCHRÖDINGER AND HEISENBERG PICTURES


In quantum mechanics, experimental results are expressed in terms of matrix elements of operators. In general, for an operator Â (which we assume to be time-independent), such a matrix element is

⟨φ|Â|ψ⟩.

These matrix elements change in time as the wave functions are time-dependent:

|ψ(t)⟩ = e^{−iHt/ħ} |ψ⟩,

where |ψ⟩ on the right hand side is the wave function at t = 0. Therefore, we can write

⟨φ(t)|Â|ψ(t)⟩ = ⟨φ| e^{iHt/ħ} Â e^{−iHt/ħ} |ψ⟩.

From this formulation, it is immediately clear that we can take two viewpoints:

• We take Â independent of time and let |ψ⟩ and |φ⟩ evolve in time according to the time-dependent Schrödinger equation, or

• We take the wave functions |ψ⟩ and |φ⟩ fixed and introduce a time-dependent operator Â(t):

  Â(t) = e^{iHt/ħ} Â e^{−iHt/ħ}.

  In that case, the matrix element is written as

  ⟨φ|Â(t)|ψ⟩.

The first viewpoint is called the Schrödinger picture and the second the Heisenberg picture. For systems where the Hamiltonian can be split into an 'easy' and a 'difficult' part, it makes sense to use a third picture, called the interaction picture. This will be covered later in these notes.
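The equivalence of the two pictures is easy to verify numerically. The sketch below is my own illustration (ħ = 1; the Hamiltonian, observable and states are arbitrary) and computes the same matrix element both ways.

    import numpy as np
    from scipy.linalg import expm

    H = np.array([[0.0, 0.4], [0.4, 1.0]])        # some Hermitian Hamiltonian
    A = np.array([[1.0, 0.0], [0.0, -1.0]])       # a time-independent observable
    psi0 = np.array([1.0, 0.0], dtype=complex)
    phi0 = np.array([0.6, 0.8], dtype=complex)

    t = 1.7
    U = expm(-1j * H * t)

    schroedinger = np.vdot(U @ phi0, A @ (U @ psi0))   # <phi(t)| A |psi(t)>
    A_t = U.conj().T @ A @ U                            # A(t) = e^{iHt} A e^{-iHt}
    heisenberg = np.vdot(phi0, A_t @ psi0)              # <phi| A(t) |psi>
    print(np.allclose(schroedinger, heisenberg))        # True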

2.3 PROBLEMS
1. An electron in a hydrogen atom finds itself in a state

   |ψ⟩ = |1, 0, 0⟩ + (1/2)|2, 1, 1⟩ + (1/√2)|2, 1, 0⟩ + (i/√2)|2, 1, −1⟩.

   We neglect spin and the states are labeled as |n, l, m⟩.

   (a) Normalise this state.
   (b) Calculate the probability of finding the electron with energy E_2.
   (c) Calculate the probability of finding the electron with angular momentum component L_z = 0.
   (d) Calculate the probability of finding L_x = ħ in a measurement.
   (e) The hydrogen atom is subject to a field which adds a term αL_x to the Hamiltonian. Give |ψ(t)⟩, if the state given above is the state at t = 0.

2. Consider a spin-1/2 particle in a spherically symmetric potential. The state of the particle is denoted |ψ⟩. L is the orbital angular momentum and S the spin operator. The functions ψ+ and ψ− are defined by

   ψ±(r) = ⟨r, ±|ψ⟩,    (2.3)

   where the second argument in the bra-vector on the right-hand side denotes the spin and where

   ψ+(r) = R(r) [ Y_0^0(θ, φ) + (1/√3) Y_0^1(θ, φ) ];    (2.4a)
   ψ−(r) = R(r)/√3 [ Y_1^1(θ, φ) − Y_0^1(θ, φ) ];    (2.4b)

   with R some given function of r and Y_m^l(θ, φ) the eigenfunctions of the angular momentum operators L², L_z.

   (a) Which condition must be satisfied by R(r) in order for |ψ⟩ to be normalised?
   (b) A measurement of the spin z-component S_z is performed. What are the possible results with respective probabilities? Same questions for L_z and S_x.
   (c) A value 0 is measured for the quantity L². What is the state of the particle immediately after the measurement?

3. An electron subject to a magnetic field in the z direction evolves under the Hamiltonian

   H = −(1/2) ∆_1 S_z.

   At t = 0, the electron spin points along the positive x direction.

   (a) Calculate the time-dependent wavefunction.
   (b) Formulate the equations of motion of the expectation values of the spin components, ⟨S_x⟩, ⟨S_y⟩ and ⟨S_z⟩.

   From now on, we consider the stationary behaviour rather than the dynamics.

   (c) Consider now a second electron. This second electron experiences a field in the z direction, but with different magnitude. The two electrons interact via a weak transverse coupling. The total Hamiltonian is thus given by

   H = −(1/2) ∆_1 S_z1 − (1/2) ∆_2 S_z2 + J (S_x1 S_x2 + S_y1 S_y2),

   where J ≪ ∆_1, ∆_2, |∆_1 − ∆_2|.
   Use first-order perturbation theory to estimate the energies and corresponding eigenstates for this Hamiltonian.
   (d) Use second-order perturbation theory to estimate the energies and corresponding eigenstates for this Hamiltonian.
   (e) Find the exact solution for the energies and eigenstates.

4. Consider two spin-1/2 particles subject to a Heisenberg-type interaction

   H_0 = −α S_1 · S_2,

   where S_1 and S_2 are the spin operators for particle 1 and particle 2, respectively.

   (a) Find the energies and expand the corresponding eigenstates in the basis |s, m_s⟩, where s and m_s denote the quantum numbers for the total momentum operators S = S_1 + S_2 and S_z = S_z1 + S_z2, respectively.
   (b) Expand the eigenstates in the basis |m_s1, m_s2⟩, where m_s1 and m_s2 are the quantum numbers of the operators S_z1 and S_z2.
   (c) Consider now the modified Hamiltonian

   H = H_0 + β S_z².

   Give the exact energies and corresponding eigenstates.
   (d) Determine whether the ground state of the system is entangled. Note: an entangled state cannot be written as a product of states for particle 1 and particle 2.

5. Consider two quantum dots in close proximity. A quantum dot is a small structure that
can be occupied by an electron. We consider the case where only one state is available
in each dot. In this problem, we assume that exactly one electron is present in the
   system. Denote the energies of the states as ε_i, i = 1, 2, where i labels the dot. To these levels correspond the states |φ_i⟩, i = 1, 2. As the dots are placed close together, they are coupled. The coupling constant is (the complex number) τ:

   τ = ⟨φ_1|H|φ_2⟩,

   where H is the Hamiltonian.

   (a) Write the Hamiltonian of the system in the form of a 2 × 2 matrix.
   (b) Find the spectrum of this Hamiltonian. Plot the spectrum as a function of ε_1 − ε_2 for fixed τ. Also find the eigenfunctions.
   (c) Suppose we place an electron in dot 1 at t = 0. Give the time evolution of the wave function and show that the probability to find the electron in dot 2 as a function of time has the form

   P(t) = C [1 − cos(ωt)].

   Determine ω. Hint: write the quantum state |ψ⟩ as

   |ψ⟩ = a |φ_1⟩ + b |φ_2⟩

   and use the spectrum found in (b) together with the time-dependent solution of the Schrödinger equation.

6. A particle is located at the origin of a line where the potential V is zero. At t = 0, the
particle is released. Find the wave function in the x-representation at time t > 0.

7. (a) Consider the three spin triplet states |1m⟩ (in the standard notation |s, m_s⟩) and the singlet state |00⟩, which are constructed out of two spin-1/2 particles, A and B. The vector σ_A = (σ_Ax, σ_Ay, σ_Az) has the three Pauli spin matrices as its components, and σ_B is defined in a similar way. Show that

   (σ_A · σ_B) |1m⟩ = +|1m⟩

   and

   (σ_A · σ_B) |00⟩ = −3 |00⟩.

   Obtain the eigenvalues of (σ_A · σ_B)^n.

   (b) A system consisting of two spin-1/2 particles is described by the Hamiltonian

   H = λ (σ_A · σ_B) for t ≥ 0,   and   H = 0 for t < 0.

   Assume |ψ(t = 0)⟩ = |↓↑⟩, i.e. particle A has spin along the −z axis, and particle B has its spin oriented along the +z axis. Express H in terms of the vector operator σ = σ_A + σ_B. Obtain |ψ(t)⟩ for t > 0 and determine the probability for finding the system in the states |↑↑⟩, |↓↓⟩ and |↑↓⟩.

8. We consider some properties of the Pauli matrices σ_x, σ_y and σ_z.

   (a) Show that σ_i² = 1, for i = x, y, z.
   (b) Show that [σ_i, σ_j] = 2i ε_ijk σ_k, where ε_ijk = 1 whenever i, j, k is an even permutation of x, y, z and ε_ijk = −1 when i, j, k is an odd permutation of x, y, z.
   (c) For angular momentum quantum number l = 1, we have three possible states |1, m⟩, where m = 1, 0, −1. Write down the matrix form of L_z in the basis of these states.
   (d) Show that the matrices

   L_x = (ħ/√2) ( 0  1  0
                  1  0  1
                  0  1  0 )
   and
   L_y = (ħ/√2) ( 0  −i   0
                  i   0  −i
                  0   i   0 ),

   together with the matrix found in (c), satisfy the angular momentum commutation relations

   [L_i, L_j] = iħ ε_ijk L_k.

   (e) Calculate the eigenvalues and eigenvectors of L_x and L_y.

9. It may be useful to refer to the previous problem to refresh your knowledge concerning
the Pauli matrices!
The state of a quantum two-level system (also called quantum bit, qubit or Q-bit) can
be written as
|ψ〉 = α |0〉 + β |1〉,
where the coefficients α and β are complex, and properly normalised: |α|² + |β|² = 1.

   (a) Explain why, without loss of generality, we can also write the state as

   |ψ⟩ = cos(θ/2) |0⟩ + e^{iφ} sin(θ/2) |1⟩,

with θ ranging from 0 to π and φ ranging from 0 to 2π.


   (b) There is a one-to-one correspondence between the state |ψ⟩ and a unit vector in 3D, whose orientation is defined by polar and azimuthal angles θ and φ, respectively (see figure 2.3). Such a vector is called the Bloch vector associated with |ψ⟩. Draw the Bloch vectors associated with the states |0⟩, |1⟩, (1/√2)|0⟩ + (1/√2)|1⟩, and (1/2)|0⟩ − i(√3/2)|1⟩.
   (c) Find ⟨ψ|σ_x|ψ⟩, ⟨ψ|σ_y|ψ⟩, and ⟨ψ|σ_z|ψ⟩ as a function of θ and φ. Here, σ_x, σ_y and σ_z are the Pauli matrices.
   (d) Show that ⟨σ_x⟩² + ⟨σ_y⟩² + ⟨σ_z⟩² = 1.
   (e) How are ⟨σ_x⟩, ⟨σ_y⟩ and ⟨σ_z⟩ related to the cartesian coordinates of the Bloch vector?
   (f) Unitary operations on one qubit correspond to rotations in the Bloch sphere. What kind of rotations correspond to the operators σ_x, σ_y and σ_z? Specify the axis of rotation, and the rotation angle for each.
   (g) What does the operator R̂_x(α) ≡ cos(α/2) 1 − i sin(α/2) σ_x do?
   (h) What does the operator R̂_n(α) ≡ cos(α/2) 1 − i sin(α/2) n · (σ̂_x, σ̂_y, σ̂_z) do? Here, n is a 3D unit vector. (You only need to show that R̂_n leaves the state with Bloch vector n invariant.)

10. The non-relativistic Hamiltonian that describes the interaction of a charged particle with the electromagnetic field is

   H = 1/(2m) ( p − (q/c) A(r) )² + q φ(r).

   (a) Assume that the wave function is changed by a constant phase

   ψ(r, t) → ψ′(r, t) = e^{iα} ψ(r, t).

   Show that ψ′(r, t) satisfies the original Schrödinger equation with the original vector potential A(r, t).
   (b) Assume that the wave function is changed by a non-constant phase

   ψ(r, t) → ψ′(r, t) = e^{iα(r,t)} ψ(r, t).

   Show that ψ′ does not satisfy the original Schrödinger equation with the original vector potential A(r, t).
   (c) Show that ψ′(r, t) does satisfy the original Schrödinger equation but with a new vector potential

   A′(r, t) = A(r, t) + χ(r, t),

   where χ is a vector field. How is the gauge term χ(r, t) related to the phase term α(r, t)? How does the scalar potential change:

   φ(r, t) → φ′(r, t) = φ(r, t) + ?


3 FORMAL QUANTUM MECHANICS AND THE PATH INTEGRAL

When we consider classical mechanics, we start from Newton’s laws and derive the behaviour
of moving bodies subject to forces from these laws. This is a nice approach as we always like
to see a structured presentation of the world surrounding us. We should however not forget
that people have thought for thousands of years about motion and forces before Newton’s
compact formulation of the underlying principles was found. It is not justified to pretend that
physics only consists of understanding and predicting phenomena from a limited set of laws.
The ‘dirty’ process of walking in the dark and trying to find a comprehensive formulation for
the phenomena under consideration is an essential part of physics.
This also holds for quantum mechanics, although it was developed in a substantially
shorter amount of time than classical mechanics. In fact, quantum mechanics started at the
beginning of the twentieth century, and its formulation was more or less complete around
1930.
The previous chapter contained a brief review of quantum mechanics at a level where
you should feel comfortable. Now we make a step partly into new material by considering
quantum mechanics from a formal viewpoint. Part of this material can be found in Griffiths’
book (chapter 3), in particular for section 3.1, where we introduce quantum mechanics by
formulating the postulates on which the quantum theory is based. In sections 3.2, 3.3 and 3.4,
we establish the link between classical mechanics and quantum mechanics via Poisson brackets and via the path integral.

3.1 THE POSTULATES OF QUANTUM MECHANICS


Quantum theory can be formulated in terms of a set of postulates which however do not
have a canonised form similar to Newton’s laws: most books have their own version of these
postulates and even their number varies.
We now present a particular formulation of these postulates.

1. The state of a physical system at any time t is given by the wave function of the system at that time. This wave function is an element of the Hilbert space of the system. The evolution of the system in time is determined by the Schrödinger equation:

   iħ ∂/∂t |ψ(t)⟩ = Ĥ |ψ(t)⟩.

Here Ĥ is an Hermitian operator, called the Hamiltonian.

2. Any physical quantity Q is represented by an Hermitian operator Q̂.


When we perform a measurement of the quantity Q, we will always find one of the


   eigenvalues of the operator Q̂. For a system in the state |ψ(t)⟩, the probability of finding a particular eigenvalue λ_i, with an associated eigenvector |φ_i⟩ of Q̂, is given by

   P_i = |⟨φ_i|ψ(t)⟩|² / ( ⟨ψ(t)|ψ(t)⟩ ⟨φ_i|φ_i⟩ ).

   Immediately after the measurement, the system will find itself in the state |φ_i⟩ corresponding to the value λ_i which was found in the measurement.


Several remarks can be made.

1. The wave function contains the maximum amount of information we can have about
the system. In practice, we often do not know the wave function of the system.

2. Note that Q̂ being Hermitian implies that the eigenstates |φ_i⟩ always form a basis of the Hilbert space of the system under consideration. Thus the state |ψ(t)⟩ of the system before the measurement can always be written in the form

   |ψ(t)⟩ = Σ_i c_i(t) |φ_i⟩.

   The probability to find in a measurement the value λ_i is therefore given by

   P_i = |c_i|² / Σ_j |c_j|²,

   where we have omitted the time-dependence of the c_i. For a normalised state |ψ(t)⟩ it holds that, if the eigenvectors |φ_i⟩ are orthonormal,

   Σ_i |c_i|² = 1.

   In that case

   P_i = |c_i|².

3. So far we have suggested in our notation that the eigenvalues and eigenvectors form
a discrete set. In reality, not only discrete, but also continuous spectra are possible. In
those cases, the sums are replaced by integrals.

4. In understanding quantum mechanics, it helps to make a clear distinction between


   the formalism which describes the evolution of the wave function (the Schrödinger equation, postulate 1) and the interpretation scheme. We see that the wave func-
tion contains the information we need to predict the outcome of measurements, using
the measurement postulate (number 2).

It now seems that we have arrived at a formulation of quantum mechanics which is simi-
lar to that of classical mechanics: a limited set of laws (prescriptions) from which everything
can be derived, provided we know the form of the Hamiltonian (this is analogous to the situ-
ation in classical mechanics, where Newton’s laws do not tell us what the form of the forces
is).
However there is an important difference: the classical laws of motion can be understood
by using our everyday experience so that we have some intuition for their meaning and con-
tent. In quantum mechanics, however, our laws are formulated as mathematical statements
concerning objects (vectors and operators) for which we do not have a natural intuition. This
is the reason why quantum mechanics is so difficult in the beginning (although its mathe-
matical structure as such is rather simple). You should not despair when quantum mechanics
seems difficult: many people find it difficult, and the workings of the measurement process

are still the object of intensive debate. Sometimes you must switch your intuition off and use
the rules of linear algebra to solve problems.
Above, we have mentioned that quantum mechanics does not prescribe the form of the
Hamiltonian. In fact, although the Schrödinger equation, quite unlike the classical equa-
tion of motion, is a linear equation, which allows us to make ample use of linear algebra,
the structure of quantum mechanics is richer than that of classical mechanics because in
principle any type of Hilbert space can occur in Nature. In classical mechanics, the space of
all possible states of an N -body system is a 6N dimensional space (we have 3N space and
3N momentum coordinates). We may extend this with 2 angles per particle if the particles
carry a magnetic or electric dipole moment. In quantum mechanics, wave functions can be

part of infinite-dimensional spaces (like the wave functions of a particle moving along a one-
dimensional axis) but they can also lie in a finite-dimensional space (for example in the case
of spin, which has no classical analogue).

3.2 RELATION WITH CLASSICAL MECHANICS


In order to see whether we can guess the structure of the Hamiltonian for systems which have
a classical analogue, we consider the time evolution of a physical quantity Q. We assume that
Q does not depend on time explicitly. However, the expectation value of Q may vary in time
due to the change of the wave function in the course of time. For normalised wave functions,

d/dt ⟨ψ(t)|Q̂|ψ(t)⟩ = ( ∂/∂t ⟨ψ(t)| ) Q̂ |ψ(t)⟩ + ⟨ψ(t)| Q̂ ( ∂/∂t |ψ(t)⟩ ).

Using the time-dependent Schrödinger equation and its Hermitian conjugate,

−iħ ∂/∂t ⟨ψ(t)| = ⟨ψ(t)| Ĥ,

(note the minus sign on the left-hand side which results from the Hermitian conjugate) we obtain

iħ d/dt ⟨ψ(t)|Q̂|ψ(t)⟩ = ⟨ψ(t)|Q̂Ĥ − ĤQ̂|ψ(t)⟩ = ⟨ψ(t)|[Q̂, Ĥ]|ψ(t)⟩,

where [Q̂, Ĥ] is the commutator. We see that the time derivative of the expectation value of Q̂ is related to the com-
where Q̂, Ĥ is the commutator. We see that the time derivative of Q̂ is related to the com-
mutator between Q̂ and Ĥ . This should wake you up or ring a bell. Perhaps you have seen a
derivation of the time derivative of any physical quantity Q in classical mechanics, based on
the Hamilton equations of classical mechanics. These equations read

ṗ_j = −∂H/∂q_j;    q̇_j = ∂H/∂p_j.

Note that H is expressed as a function of the generalised coordinates q j and p j .


As a side note, we provide a different form of this equation. Introducing the two-dimensional coordinates z_j = (p_j, q_j), we may write the Hamilton equations in the form

∂z_j/∂t = J ∇_j H,

where ∇_j = (∂/∂p_j, ∂/∂q_j) and

J = ( 0  −1
      1   0 ).

The time derivative of Q(p_j, q_j) is given by

dQ/dt = Σ_j ( ∂Q/∂q_j ∂H/∂p_j − ∂Q/∂p_j ∂H/∂q_j ) ≡ {Q, H}.

We see that this equation is very similar to that obtained above for the time derivative of the expectation value of the operator Q̂! The differences consist of replacing the Poisson bracket by the commutator and including a factor 1/(iħ). Note that the commutator and the Poisson bracket both are anti-symmetric under exchange of the two quantities involved:

[A, B] = −[B, A];    {A, B} = −{B, A}.

It seems that classical and quantum mechanics are not that different after all. Could this perhaps be a guide to formulate quantum mechanics for systems for which we already have a classical version? This turns out to be the case. At this stage, we summarise the result by writing down the time evolution for Q̂ and its classical version:

iħ d⟨Q̂⟩/dt = ⟨[Q̂, Ĥ]⟩    (quantum);
dQ/dt = {Q, H}    (classical),

which shows the striking similarity between the two formulations. These equations are both called the Liouville equation – the first is however usually denoted as the quantum Liouville equation.
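Before turning to the example, here is a small numerical sketch of the quantum Liouville equation (my own illustration, not part of the notes; ħ = 1, and the two-level Hamiltonian, observable and state are arbitrary), comparing iħ d⟨Q̂⟩/dt, obtained by a finite difference, with ⟨[Q̂, Ĥ]⟩.

    import numpy as np
    from scipy.linalg import expm

    H = np.array([[1.0, 0.3], [0.3, -1.0]])
    Q = np.array([[0.0, -1j], [1j, 0.0]])          # a Hermitian observable (here sigma_y)
    psi0 = np.array([0.8, 0.6], dtype=complex)

    def expectation(t):
        psi_t = expm(-1j * H * t) @ psi0           # |psi(t)> with hbar = 1
        return np.vdot(psi_t, Q @ psi_t)

    t, dt = 0.9, 1e-5
    lhs = 1j * (expectation(t + dt) - expectation(t - dt)) / (2 * dt)   # i hbar d<Q>/dt
    psi_t = expm(-1j * H * t) @ psi0
    rhs = np.vdot(psi_t, (Q @ H - H @ Q) @ psi_t)                       # <[Q, H]>
    print(np.allclose(lhs, rhs, atol=1e-6))                             # True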
As an example, we start by considering a one-dimensional system for which the relevant classical observables are the position x and the momentum p. Classically, we have

{x, p} = (∂x/∂x)(∂p/∂p) − (∂x/∂p)(∂p/∂x) = 1.

The second term in the expression vanishes because x and p are to be considered as independent coordinates. From this, and from the factor iħ occurring in the quantum evolution equation above, we may guess the quantum version of this Poisson relation:

[x, p] = iħ,

which should look familiar (if it does not, please review the second year quantum mechanics
course). It seems that our recipe for making quantum mechanics out of classical mechanics
makes sense! Therefore we can now state the following rule:

If the Hamiltonian of some classical system is known, we can use the same form in quantum mechanics, taking into account the fact that the coordinates q_j and p_j become Hermitian operators and that their commutation relations are:

[q_j, q_k] = 0;    [p_j, p_k] = 0;    [q_j, p_k] = iħ δ_jk.

You can verify these extended commutation relations easily by working out the correspond-
ing classical Poisson brackets.
In the second year, you have learned that

p̂ = (ħ/i) d/dx.
What about this relation? It was not mentioned here so far. The striking message is that this
relation can be derived from the commutation relation. In order to show this, we must discuss
another object you might have missed too in the foregoing discussion: the wave function
written in the form ψ(r) (for a particle in 3D). It is important to study the relation between this and the state |ψ⟩. Consider a vector a in two dimensions. This vector can be represented
by two numbers a 1 and a 2 , which are the components of the vector a. However, the actual
values of the components depend on how we have chosen our basis vectors. The vector a is
an arrow in a two dimensional space. In that space, a has a particular length and a particular
3.2. R ELATION WITH CLASSICAL MECHANICS 21

orientation. By changing the basis vectors, we do not change the object a, but we do change
the numbers a 1 and a 2 .
In the case of the Hilbert space of a one-dimensional particle, we can use as basis vectors
the states in which the particle is localised at a particular position x. We call these states |x〉.
They are eigenvectors of the position operator x̂ with eigenvalue x:

x̂ |x〉 = x |x〉 .

The states |x〉 are properly normalised: {


3
{
x|x 0 = δ(x − x 0 ),
­ ®

where δ(x − x 0 ) is the Dirac delta function. We now can define ψ(x):

ψ(x) = x|ψ ,
­ ®

that is, ψ(x) are the ‘components’ of the ‘vector’ ¯ψ with respect to the basis |x〉. For three
¯ ®

dimensions, we have a wave function which is expressed with respect to the basis |r〉.
In order to derive the representation of the momentum operator, p̂ = ×i ddx , we first calcu-
late the matrix element of the commutator:

x|[x̂, p̂]|x 0 = x|x̂ p̂ − p̂ x̂|x 0 = (x − x 0 ) x|p̂|x 0 .


­ ® ­ ® ­ ®

The last expression¯ is obtained by having x̂ in the first term act on the bra-vector 〈x| on its
left, and on the ket ¯x 0 on the right in the second term.
®

On the other hand, using the commutation relation, we know that

x|[x̂, p̂]|x 0 = i× x|x 0 .


­ ® ­ ®

This is an even function of x − x 0 , as interchanging x and x 0 does not change the matrix el-
ement on the right-hand side. Since this function is equal to (x − x 0 ) x|p̂|x 0 , we know that
­ ®

x|p̂|x 0 must be an odd function of x − x 0 .


­ ®
­ ®
Now we evaluate the matrix element x|p̂|ψ . We recall from linear algebra that, since |x〉
are the eigenstates of an Hermitian operator, they form a complete set, that is:
Z
1 = |x〉 〈x| d x,

where 1 is the unit operator. Then we can write


Z
x|p̂|x 0 x 0 |ψ d x 0 .
­ ® ­ ®­ ®
x|p̂|ψ =

Now we perform a Taylor expansion around x in order to rewrite x 0 |ψ :


­ ®

d ­ ® (x 0 − x)2 d 2 ­
x 0 |ψ = x|ψ + (x 0 − x)
­ ® ­ ® ®
x|ψ + x|ψ +···
dx 2! d x2
We then obtain

d ­ ® (x 0 − x)2 d 2 ­
Z · ¸
x|p|x 0 x|ψ + (x 0 − x) d x 0.
­ ® ­ ® ­ ® ®
x|p̂|ψ = x|ψ + x|ψ + · · ·
dx 2! d x2

The first term in the square brackets gives a zero after integration, as it is multiplied by
x|p̂|x 0 , which was an odd function of x − x 0 . The second term gives
­ ®

d ­ d ­
Z
x|p̂|x 0 (x 0 − x) x|ψ d x 0 = −i×
­ ® ® ®
x|ψ ,
dx dx
22 3. F ORMAL QUANTUM MECHANICS AND THE PATH INTEGRAL

where we have used the relation

x|p̂|x 0 (x 0 − x) = −i×δ(x 0 − x).


­ ®

We use the same relation for the second term. But then we obtain a term of the form

(x 0 − x)δ(x 0 − x)

{ in the integral over d x 0 . This obviously yields a zero. The same holds for all higher order
3 { terms, so we are left with
­ ® × d ­ ®
x|p̂|ψ = x|ψ ,
i dx
which is the required result.
Having obtained this, we can analyse the form of the eigenstates of the momentum oper-
ator: ¯ ® ¯ ®
p̂ ¯p = p ¯p .
¯ ® ­ ®
The states ¯p can be represented in the basis 〈x|; the components then are x|p . We can
find the form of these functions by using the eigenvalue equation and the representation of
the momentum operator as a derivative:
­ ® ­ ®
x|p̂|p = p x|p and
­ ® × d ­ ®
x|p̂|p = x|p .
i dx
¯ ®
The first of these equation expresses the fact that ¯p is an eigenstate of the operator p̂, and
the second one follows directly from the fact that the momentum operator acts as a derivative
in the x-representation. Combining these two we obtain a simple differential equation

× d ­ ® ­ ®
x|p = p x|p ,
i dx
with a normalised solution:
1
e ipx/× .
­ ®
x|p = p
2π×

¯ ψ
This allows us to find any state
®
in the momentum representation, that is, the represen-
tation in which we use the states ¯p as basis states:

1
Z Z
ψ(p) = p|ψ = e −ipx/× ψ(x) d x.
­ ® ­ ®­ ®
p|x x|ψ d x =
2π×

This is nothing but the Fourier transform, the specific form of which we already encountered
at the end of the previous chapter. The analysis presented here for a one-dimensional particle
can be generalised to three or more dimensions in a natural way.

3.3 T HE PATH INTEGRAL : FROM CLASSICAL TO QUANTUM MECHANICS


The path integral is a very powerful concept for connecting classical and quantum mechan-
ics. Moreover, this formulation renders the connection between quantum mechanics and
statistical mechanics very explicit. We shall restrict ourselves here to a discussion of the path
integral in quantum mechanics. The reader is advised to consult the excellent book by Feyn-
man and Hibbs (Quantum Mechanics and Path Integrals, McGraw-Hill, 1965) for more de-
tails.
The path integral formulation can be derived from the following heuristics, based on the
analogy between particles and waves:
3.3. T HE PATH INTEGRAL : FROM CLASSICAL TO QUANTUM MECHANICS 23

x0
{
3
{
x1

t0 t1

F IGURE 3.1: A possible path running from an initial position x i at time t i to a final position x f at time t f . The time
is divided up into many identical slices.

• A point particle which moves with momentum p at energy E can also be viewed as a
wave with a phase ϕ given by
ϕ = k · r − ωt
where p = ×k and E = ×ω.

• For a single path, these phases are additive, i.e., to find the phase of the entire path, the
phases for different segments of the path should be added.

• The probablity to find a particle which at t = t 0 was at r0 , at position r1 at time t = t 1 ,


is given by the absolute square of the sum of the phase factors exp(iϕ) of all possible
paths leading from (r0 , t 0 ) to (r1 , t 1 ):
¯ ¯2
¯ X ¯
P (r0 , t 0 ; r1 , t 1 ) = ¯ e iϕpath ¯ .
¯ ¯
¯all paths ¯

This probability is defined up to a constant which can be fixed by normalisation (i.e.


the term within the absolute bars must reduce to a delta-function in r1 − r0 ).

These heuristics are the analog of the Huygens principle in wave optics.
To analyse the consequences of these heuristics, we chop the time interval between t 0
and t 1 into many identical time slices (see Fig. 3.1) and consider one such slice. Within this
slice we take the path to be linear. To simplify the analysis we consider one-dimensional
motion. We first consider the contribution of kx to the phase difference. If the particle moves
in a time ∆t over a distance ∆x, we know that its k-vector is given by
mv m∆x
k= = .
× ×∆t
The phase change resulting from the displacement of the particle can therefore be given as

m∆x 2
∆ϕ = k∆x = .
×∆t
We still must add the contribution of ω∆t to the phase. The frequency ω is related to the
energy; we have
p2 ×2 k 2
×ω = + V (x) = + V (x).
2m 2m
24 3. F ORMAL QUANTUM MECHANICS AND THE PATH INTEGRAL

Neglecting the potential energy we obtain

m∆x 2 ×2 k 2 m∆x 2
∆ϕ = − ∆t = .
×∆t 2m× 2×∆t
The potential also enters through the ω∆t term, to give the result:

m∆x 2 V (x)
∆ϕ = − ∆t .
2×∆t ×
{ For x occurring in the potential we may choose any value between the begin and end point –
3{ the most accurate result is obtained by substituting the average of the value at the beginning
and at the end of the time interval.
If we now use the fact that phases are additive, we see that for the entire path the phases
are given by
1 X m x(t j +1 ) − x(t j ) 2 V [x(t j )] + V [x(t j +1 )]
( · ¸ )
ϕ= − ∆t .
× j 2 ∆t 2
This is nothing but the discrete form of the classical action of the path! Taking the limit ∆t → 0
we obtain
1 t1 m ẋ 2 1 t1
Z · ¸ Z
ϕ= − V (x) d t = L(x, ẋ) d t .
× t0 2 × t0
We therefore conclude that the probability to go from r0 at time t 0 to r1 at time t 1 is given by
¯ · Z t1 ¸¯¯2
¯ i
P (r0 , t 0 ; r1 , t 1 ) = ¯N
X
exp L(x, ẋ) d t ¯
¯ ¯
¯ all paths × t0 ¯

where N is the normalisation factor


r
m
N = .
2πi∆t ×
This now is the path integral formulation of quantum mechanics. Let us spend a moment
to study this formulation. First note the large prefactor 1/× in front of the exponent. If the
phase factor varies when varying the path, this large prefactor will cause the exponential to
vary wildly over the unit circle in the complex plane. The joint contribution to the probability
will therefore become very small. If on the other hand there is a region in phase space (or
‘path space’) where the variation of the phase factor with the path is zero or very small, the
phase factors will add up to a significant amount. Such regions are those where the action
is stationary, that is, we recover the classical paths as those giving the major contribution to
the phase factor. For × → 0 (the classical case), only the stationary paths remain, whereas for
small ×, small fluctuations around these paths are allowed: these are the quantum fluctua-
tions.
You may not yet recognise how this formulation is related to the Schrödinger equation.
On the other hand, we may identify the expression within the absolute signs in the last ex-
pression for P with a matrix element of the time evolution operator since both have the same
meaning: · Z t1 ¸
i
N exp
­ ® X
x 1 |Û (t 1 − t 0 )|x 0 = L(x, ẋ) d t .
all paths × t0
This form of the time evolution operator is sometimes called the propagator. Let us now
evaluate this form of the time evolution operator acting for a small time interval ∆t on the
wave function ψ(x, t ):
Z
ψ(x 1 , t 1 ) =
­ ¯ ¯ ®­ ®
x 1 ¯Û (t 1 − t 0 )¯ x 0 x 0 |ψ d x 0 =

ẋ 2 (t )
Z ∞ ½ Z t1 · ¸ ¾
i
Z
N D[x(t )] exp m − V [x(t )] d t ψ(x 0 , t 0 ) d x 0 .
−∞ × t0 2
3.3. T HE PATH INTEGRAL : FROM CLASSICAL TO QUANTUM MECHANICS 25

R
The notation D[x(t )] indicates an integral over all possible paths from (x 0 , t 0 ) to (x 1 , t 1 ). We
first approximate the integral over time in the same fashion as above, taking t 1 very close to
t 0 , and assuming a linear variation of x(t ) from x 0 to x 1 :

∞ (x 1 − x 0 )2 V (x 0 ) + V (x 1 )
½ · ¸ ¾
i
Z
ψ(x 1 , t 1 ) = N exp m − ∆t ψ(x 0 , t 0 ) d x 0 .
−∞ × 2∆t 2 2

A similar argument as used above to single out paths close to stationary ones can be used
here to argue that the (imaginary) Gaussian factor will force x 0 to be very close to x 1 . The {
allowed range for x 0 is 3
{
×∆t
(x 1 − x 0 )2 ¿ .
m
As ∆t is taken very small, we may expand the exponent with respect to the V ∆t term:

∞ i (x 1 − x 0 )2 i[V (x 0 ) + V (x 1 )]
Z · ¸· ¸
ψ(x 1 , t 1 ) = N exp m 1− ∆t ψ(x 0 , t 0 ) d x 0 .
−∞ × 2∆t 2×
V (x 0 )+V (x 1 )
As x 0 is close to x 1 we may approximate 2× by V (x 1 )/×. We now change the integra-
tion variable from x 0 to u = x 0 − x 1 :

u2 ∞
¶ µ
i
Z
ψ(x 1 , t 1 ) = N exp m [1 − i/×V (x 1 )∆t ] ψ(x 1 + u, t 0 ) d u.
−∞ × 2∆t

As u must be small, we can expand ψ(x) about x 1 and obtain

∞ im u 2 ∂ u 2 ∂2
µ ¶· ¸· ¸
i
Z
ψ(x 1 , t 1 ) = N exp 1 − V (x 1 )∆t ψ(x 1 , t 0 ) + u ψ(x 1 , t 0 ) + ψ(x 1 , t 0 ) d u.
−∞ × 2∆t × ∂x 2 ∂x 2

Note that the second term in the Taylor expansion of ψ leads to a vanishing integral as the
integrand is an antisymmetric function of u. All in all, after evaluating the Gaussian integrals,
we are left with

i∆t i×∆t ∂2
ψ(x 1 , t 1 ) = ψ(x 1 , t 0 ) − V (x 1 )ψ(x 1 , t 0 ) + ψ(x 1 , t 0 ).
× 2m ∂x 2
Using
ψ(x 1 , t 1 ) − ψ(x 1 , t 0 ) ∂
≈ ψ(x 1 , t 1 ),
∆t ∂t
we obtain the time-dependent Schrödinger equation for a particle moving in one dimension:

∂ ×2 ∂2
· ¸
i× ψ(x, t ) = − + V (x) ψ(x, t ).
∂t 2m ∂x 2

You may have found this derivation a bit involved. It certainly is not the easiest way to
arrive at the Schrödinger equation, but it has two attractive features;

• Everything was derived from simple heuristics which were based on viewing a particle
as a wave and allowing for interference between the waves;

• The formulation shows that the classical path is obtained from quantum mechanics
when we let × → 0.
26 3. F ORMAL QUANTUM MECHANICS AND THE PATH INTEGRAL

3.4 T HE PATH INTEGRAL : FROM QUANTUM MECHANICS TO CLASSICAL


MECHANICS
In the previous section we have considered how we can arrive from classical mechanics at the
Schrödinger equation. This formalism can be generalised in the sense that for each system
for which we can write down a Lagrangian, we have a way to find a quantum formulation
in terms of the path integral. Whether a Schrödinger-like equation can be found is not sure:
sometimes we run into problems which are beyond the scope of these notes. In this section
{ we assume that we have a system described by some Hamiltonian and show that the time-
3{ evolution operator has the form of a path integral as found in the previous section.
The starting point is the time evolution operator, or propagator, which, for a time-independent
Hamiltonian, takes the form
D ¯ i ¯ E
U (rf , t f ; ri , t i ) = rf ¯e − × (tf −ti )Ĥ ¯ ri .
¯ ¯

The matrix element is difficult to evaluate – the reason is that the Hamiltonian which, for a
particle in one dimension, takes the form

×2 d 2
Ĥ = − + V (x)
2m d x 2
is the sum of two noncommuting operators. Although it is possible to evaluate the exponents
of the separate terms occurring in the Hamiltonian, the exponent of the sum involves an infi-
nite series of increasingly complicated commutators. For any two noncommuting operators
 and B̂ we have

e Â+B̂ = e  e B̂ e −1/2[ Â,B̂ ]−1/12([ Â,[ Â,B̂ ]]+[B̂ ,[B̂ , Â]])+1/24[ Â,[B̂ ,[ Â,B̂ ]]]+...

This is the so-called Campbell–Baker–Hausdorff (CBH) formula. The cumbersome commu-


tators occurring on the right can only be neglected if the operators A and B are small in some
sense. We can try to arrive at an expression involving small commutators by applying the
time slicing procedure of the previous section:
i i i i
e − × (tf −ti )Ĥ = e − × ∆t Ĥ e − × ∆t Ĥ e − × ∆t Ĥ . . .

Note that no CBH commutators occur because ∆t Ĥ commutes with itself.


Having this, we can rewrite the propagator as (we omit the hat from the operators)
Z D ED E D E
U (x f , t f ; x i , t i ) = d x 1 . . . d x N −1 x f |e −i∆t H /× |x N −1 x N −1 |e −i ∆t H /× |x N −2 · · · x 1 |e −i∆t H /× |x i .

Now that the operators occurring in the exponents can be made arbitrarily small by taking ∆t
very small, we can evaluate the matrix elements explicitly:
® D i∆t 2
E D i∆t 2
E
x j |e −i∆t H |x j +1 = x j |e − × [p /(2m)+V (x)] |x j +1 = e −i∆tV (x j )/× x j |e − × p /(2m) |x j +1 .
­

The last matrix element can be evaluated


¯ ® by inserting two unit operators formulated in terms
of integrals over the complete sets ¯p :
D i∆t
E Z Z ­ ® ­ ¯ i∆t 2
p 2 /(2m)
x|e − |x 0 = x|p p ¯ e − × p̂ /(2m) ¯p 0 p 0 |x 0 d p d p 0 .
¯ ®­ ®
×

p
We have seen that x|p = exp(ipx/×)/ 2π×. Realising that the operator exp − i∆t 2
­ ® £ ¤
× p̂ /(2m)
is diagonal in p space, we find, after integrating over p:
r
D
− i∆t p 2 /(2m)
E m
0
exp im(x − x 0 )2 /(2∆t ×) .
£ ¤
x|e × |x =
2π×∆t
3.5. S UMMARY 27

All in all we have


r
D
−i∆t H /×
E m
e −i∆tV (x j )/× exp mi(x − x 0 )2 /(2∆t ×) .
£ ¤
x j |e |x j +1 =
2π×∆t

Note that we have evaluated matrix elements of operators. The result is expressed completely
in terms of numbers, and we no longer have to bother about commutation relations. Collect-
ing all terms together we obtain

³ m ´(N −1)/2 Z
(
N
"
m(x j +1 − x j )2
# ) {
i X
U (x f , t f ; x i , t i ) = d x 1 . . . d x N −1 exp − V (x j ) ∆t . 3
{
2π×∆t × j =0 2

The expression in the exponent is the discrete form of the Lagrangian; the integral over all
intermediate values x j is the sum over all paths. We therefore have shown that the time evo-
lution operator from x i to x f is equivalent to the sum of the phase factors of all possible paths
from x i to x f .
In conclusion, we have seen that the idea that the probability to a find a particle starting
off from x 0 at t 0 at x 1 at time t 1 is given by a sum of all the phase factors corresponding to
all paths from the starting to the end point, gives us the quantum theory. On the other hand,
once we have the Hamiltonian of a quantum theory, we can reformulate this as a path integral
involving the corresponding Lagrangian.

3.5 S UMMARY
In this chapter, you have hopefully learned to appreciate the close relation between classical
and quantum mechanics. In fact, the mathematical structure of both theories is more similar
than you may have thought after your introductory courses on quantum mechanics. Let us
list some of the relations we have found.

• Both classical and quantum mechanics allow us – in principle – to follow the evolution
of a mechanical system by solving a first-order differential equation in the time. In
both, the Hamiltonian is the agent governing this evolution.
In classical mechanics, the time evolution equations read

∂p j ∂H
=− ;
∂t ∂q j
∂q j ∂H
= .
∂t ∂p j

Writing z j = (p j , q j ), we may write this in the form

∂z j
= J∇j H,
∂t
where ∇ j = (∂/∂p j , ∂/∂q j ) and
µ ¶
0 −1
J= .
1 0

In quantum mechanics, the Hamilton equations are replaced by the time-dependent


Schrödinger equation:
∂ ¯
i× ¯ψ(t ) = Ĥ ¯ψ(t ) .
® ¯ ®
∂t
• In quantum mechanics, measurements disrupt the time evolution and bring stochastic
elements into the theory.
28 3. F ORMAL QUANTUM MECHANICS AND THE PATH INTEGRAL

• The Poisson bracket of classical mechanics is replaced by the commutator in quantum


mechanics:
£ ¤
{A, B } → −i× Â, B̂ .

• Th path integral allows us to work out the quantum mechanical time evolution as an
infinite sum over paths, where each path is weighted by the factor
¡ ¢
exp iS path /× ,
{
3{ where S path is the classical action of the path:
Z
S path = L(q j , q̇ j )d t ,
path

with L the classical Lagrangian of the path.

• The smallness of × ensures that the paths with a value of the action deviating not more
than O (×) from the stationary value(s) of the action, yield a major contribution to the
path integral (time evolution). For × → 0, only the stationary (i.e. the classical) path(s)
survive(s).

• The time-dependent Schrödinger equation can be derived from the path integral for-
malism.

3.6 P ROBLEMS
1. The product of the exponentials of two non-commuting operators X and Y can be writ-
ten as
e X e Y = exp(X + Y + [X , Y ]/2 + . . .)
This is the so-called Campbell Baker Hausdorff formula (or CBH formula). The aim of
this problem is to derive this formula.

(a) First, show that log(1 + x) ≈ x − x 2 /2 + x 3 /3 − . . ..


(b) Now expand the formula log(e X e Y ) to second order in X and Y to derive the CBH
formula.
(c) If you’re brave, you may try to find the third order expansion to find the next term.

2. We consider a particle moving in one dimension. The state of the particle is ¯ψ .


¯ ®

(a) Show that


® D ¯¯ ¯ E
x + a|ψ = x ¯e ia p̂/× ¯ ψ
­ ¯
¯ ®
by inserting a unit operator using completeness of the momentum basis ¯p .
(b) Demonstrate the same result by writing p̂ as the operator −i×d /d x and Taylor
expanding ψ around x.
(c) Show that for a wave function in three dimensions:
­ 0 ® D iαL̂ /× E
r |ψ = r|e z |ψ ,

where r0 is the vector r rotated about an angle α around the z-axis.


(d) (The following two parts were already addressed in problem 9).
Show that the operator exp(iασz /2) rotates a spin-1/2 state about an angle α
around the z-axis.
(e) Give the rotation operator for the wave function describing a spin-1/2 particle.
3.6. P ROBLEMS 29

3. The Lagrangian of a harmonic oscillator is L = (m/2)(ẋ 2 −ω2 x 2 ). Show that the classical
action is:
mω £ 2
(x i + x f2 ) cos ωT − 2x i x f ,
¤
S cl = (3.1)
2 sin ωT
where T = t f − t i
Rt
Hint: the definition of the action is: S = ti f L(ẋ, x, t ) d t . Make use of a classical tra-
jectory of the harmonic oscillator: x(t ) = A cos ωt + B sin ωt , which satisfies boundary
conditions: x(t i ) = x i and x(t f ) = x f .
{
4. The path integral formalism expresses the probability to move from a point x i at time 3
{
t i to a point x f at time t f in terms of a sum over all paths:
X
K (x f , t f ; x i , t i ) = exp(iS path /×).
Paths

We write the path as a sum of the classical path x class (t ) and a fluctuation:

x(t ) = x class (t ) + δx(t ),

where δx(t i ) = δx(t f ) = 0, since the positions at the beginning and at the end of the path
are fixed.
We consider a free particle.

(a) Give the action for the classical path for a free particle (that is, V (x) = 0 for all x).
(b) Show that the action for a general path can be written as
m
Z
S = S [x class ] + δẋ 2 (t ) d t .
2
Also show that · ¸
i
K (x f , t f ; x i , t i ) = K (0, t f ; 0, t i ) exp S(x class ) .
×
(c) The properly normalised form of the discretised path integral is given by
" #
³ m ´(N +1)/2 Z NY
+1 i XN
K (0, t f ; 0, t i ) = lim d x n exp ² L(x, ẋ) .
N →∞ 2πi²× n=1 × n=1

In the exponent, ẋ means (x n − x n−1 )/²; ² = (t f − t i )/(N + 1). Furthermore, x 0 ≡ x i


and x N +1 ≡ x f .
The integration over all possible paths δx can be performed in the discretised path
integral: Ã !
Z NY +1 ² X N m ẋ 2
IN = d x n exp i .
i =1 × n=1 2
It can be shown (using e.g. induction) that

2πi²× N /2
µ ¶
1
IN = p .
N +1 m
If you’re in for a challenge, you may try to prove this, but that is not required here.
Show, using this result, that
s
m× im (x f − x i )2
µ ¶
K (x f , t f ; x i , t i ) = exp .
2πi(t f − t i ) 2× (t f − t i )

(d) Find this propagator directly, using


D 2
E
K (x f , t f ; x i , t i ) = x f |e −i(tf −ti )p /2m× |x i .

Hint: insert two unit operators in this expression, formulated as integrals over p.
4
T HE VARIATIONAL METHOD FOR THE
S CHRÖDINGER EQUATION

4.1 VARIATIONAL CALCULUS


Quantum systems are governed by the Schrödinger equation. In particular, the solutions to
the stationary form of this equation determine many physical properties of the system at
hand. The stationary Schrödinger equation can be solved analytically in a very restricted
number of cases – examples include the free particle, the harmonic oscillator and the hy-
drogen atom. In most cases we must resort to computers to determine the solutions to this
equation. It is of course possible to integrate the Schrödinger equation using discretisation
methods but in realistic electronic structure calculations for example we would need a huge
number of grid points, leading to important computer time and memory requirements. On
the other hand, the variational method enables us to solve the Schrödinger equation much
more efficiently in many cases. In this chapter we introduce the variational method for solv-
ing the Schrödinger equation.
In the variational method, the possible solutions are restricted to a subspace of the Hilbert
space, and in this subspace we seek the best possible solution (below we shall define what
is to be understood by the ‘best’ solution). To see how this works, we first show that the
stationary Schrödinger equation can be derived by a stationarity condition of the functional

d X ψ∗ (X )H ψ(X ) ψ|H |ψ
R ­ ®
E [ψ] = R = ­ ® , (4.1)
d X ψ∗ (X )ψ(X ) ψ|ψ

which is recognised as the expectation value of the energy for a stationary state ψ (in order
to keep the analysis general we are not specific about the form of the generalised coordinate
X – it may include the space and spin coordinates of a collection of particles). The stationary
states of this energy functional are defined by postulating that if such a state is changed by an
arbitrary δψ, the corresponding change in E vanishes to first order. Formally, this means that
¯
d ¡ £
E ψ + λδψ − E [ψ] ¯¯
¤ ¢¯
≡0 (4.2)
dλ λ=0
for all normalised vectors δψ. Defining

P = ψ|H |ψ
­ ®
and
(4.3)
Q = ψ|ψ ,
­ ®

31
32 4. T HE VARIATIONAL METHOD FOR THE S CHRÖDINGER EQUATION

we can write the change δE in the energy to first order in δψ as

ψ + δψ|H |ψ + δψ ψ|H |ψ
­ ® ­ ®
δE = ® − ­
ψ + δψ|ψ + δψ ψ|ψ
­ ®
® P­ ® P­
δψ|H |ψ − Q δψ|ψ ψ|H |δψ − Q ψ|δψ
­ ® ­ ®
≈ + . (4.4)
Q Q

As this should vanish for an arbitrary but small change in ψ, we find, using E = P /Q:

H ψ = E ψ, (4.5)

together with the Hermitian conjugate of this equation, which is equivalent.


{ In variational calculus, stationary states of the energy functional are found within a sub-
4 { space of the Hilbert space. An important example is¯linear variational calculus, in which the
subspace is spanned by a finite set of basis vectors ¯χp , p = 1, . . . , N , that we take to be or-
®

thonormal at first, that is,


χp |χq = δpq ,
­ ®
(4.6)
where δpq is the Kronecker delta-function which is 0 unless p = q, in which case it is 1.
For a state
N
¯ψ = C p ¯χp ,
¯ ® X ¯ ®
(4.7)
p=1

the energy functional is given by


PN ∗
p,q=1 C p C q H pq
E= PN , (4.8)
p,q=1 C p C q δpq

with
H pq = χp |H |χq .
­ ®
(4.9)
The stationary states follow from the condition that the derivative of this functional with re-
spect to the C p vanishes, which leads to

N ¡
H pq − E δpq C q = 0
X ¢
for p = 1, . . . , N . (4.10)
q=1

Equation (4.10) is an eigenvalue problem which can be written in matrix notation:

HC = E C. (4.11)

This is the Schrödinger equation formulated for a finite, orthonormal basis.


Linear parametrisations are often used because the resulting method is simple, allowing
for numerical matrix diagonalisation techniques to be used. The lowest eigenvalue of (4.11)
is always higher than or equal to the ground state energy of Eq. (4.5), as the ground state is the
minimal value assumed by the energy functional over the full Hilbert space. If we restrict our-
selves to a part of this space, then the minimum value of the energy functional must always
be higher than or equal to the ground state of the full Hilbert space. Adding more basis func-
tions to our set, the subspace becomes larger, and consequently the minimum of the energy
functional will decrease (or stay the same). For the specific case of linear variational calculus,
this result can be generalised to stationary states at higher energies: the higher eigenvalues
are always higher than the equivalent solution to the full problem, but approximate the latter
better with increasing basis set size. The formal statement of this is the Hylleraas-Undin-
MacDonald theorem (see for example Springer Handbook of Atomic, Molecular, and Optical
Physics, Volume 1, Gordon Drake (ed.), 2006). The behaviour of the spectrum found by solv-
ing (4.11) with increasing basis size is depicted in Figure 4.1.
4.2. L INEAR VARIATIONAL CALCULATIONS 33

(4)
(5)
E 4 E4

(4)
(5)
E4
E 3 E3
(4) E3
E 2
(5) E2
E2
(4)
E 1
E1
(5)
E1

F IGURE 4.1: The behaviour of the spectrum of Eq. (4.11) with increasing basis set size in linear variational calcu-
lus. The upper index is the number of states in the basis set, and the lower index labels the spectral levels. {
4
{
We now describe how to proceed when the basis consists of non-orthonormal basis func-
tions, as is often the case in practical calculations. In that case, we must reformulate (4.11),
taking care of the fact that the overlap matrix S, whose elements S pq are given by

S pq = χp |χq
­ ®
(4.12)

is not the unit matrix. This means that in Eq. (4.8) the matrix elements δpq of the unit matrix,
occurring in the denominator, have to be replaced by S pq , and we obtain

HC = E SC. (4.13)

This looks like an ordinary eigenvalue equation, the only difference being the matrix S in
the right hand side. It is called a generalised eigenvalue equation and there exist computer
programs for solving such a problem.

4.2 L INEAR VARIATIONAL CALCULATIONS


In this section, we describe two quantum mechanical problems that can be analyzed nu-
merically with a linear variational calculation. In both cases, a generalised matrix eigenvalue
problem (4.13) must be solved, which can easily be done using a program like MATLAB.

4.2.1 T HE INFINITELY DEEP POTENTIAL WELL


The potential well with infinite barriers is given by
½
∞ for |x| > |a|,
V (x) = (4.14)
0 for |x| ≤ |a|.

It forces the wave function to vanish at the boundaries of the well (x = ±a). The exact solu-
tion for this problem is known and treated in every textbook on quantum mechanics (see for
example Griffiths). Here we discuss a linear variational approach to be compared with the
exact solution. We take a = 1 and use natural units such that ×2 /2m = 1.
As basis functions we take simple polynomials that vanish on the boundaries of the well:

ψn (x) = x n (x − 1)(x + 1), n = 0, 1, 2, . . . (4.15)

The reason for choosing this particular form of basis functions is that the relevant matrix el-
ements can easily be calculated analytically. We start with the matrix elements of the overlap
matrix, defined by
Z 1
S mn = ψn |ψm = ψn (x)ψm (x)d x.
­ ®
(4.16)
−1
There is no complex conjugate with the ψn in the integral because we use real basis functions.
Working out the integral gives
2 4 2
S mn = − + (4.17)
n +m +5 n +m +3 n +m +1
34 4. T HE VARIATIONAL METHOD FOR THE S CHRÖDINGER EQUATION

TABLE 4.1: Energy levels of the infinitely deep potential well. The first four columns show the variational en-
ergy levels for various numbers of basis states N . The last column shows the exact values. The exact levels are
approached from above as in Figure 4.1.

N =5 N =8 N = 12 N = 16 Exact
2.4674 2.4674 2.4674 2.4674 2.4674
9.8754 9.8696 9.8696 9.8696 9.8696
22.2934 22.2074 22.2066 22.2066 22.2066
50.1246 39.4892 39.4784 39.4784 39.4784
87.7392 63.6045 61.6862 61.6850 61.6850

{
4{
for n + m even; otherwise S mn = 0.
We can also calculate the Hamilton matrix elements – you can check that they are given
by:
d2
Z 1 µ ¶
2
Hmn = ψn |p |ψm = ψn (x) − 2 ψm (x)d x
­ ®
−1 dx
(4.18)
1 − m − n − 2mn
· ¸
= −8
(m + n + 3)(m + n + 1)(m + n − 1)
for m + n even, else Hmn = 0.
The exact solutions are given by
½
cos(k n x) n odd
ψn (x) = (4.19)
sin(k n x) n even and positive

with k n = nπ/2, n = 1, 2, . . ., with corresponding energies

n 2 π2
E n = k n2 = . (4.20)
4

For each eigenvector C, the function N p=1 C p χp (x) should approximate an eigenfunction
P

(4.19). The variational levels are shown in table 4.1, together with the analytical results.

4.2.2 VARIATIONAL CALCULATION FOR THE HYDROGEN ATOM


As we shall see in further on in this course, one of the main problems of electronic structure
calculations is the treatment of electron–electron interactions. Here we develop a scheme
for solving the Schrödinger equation for an electron in a hydrogen atom for which the many-
electron problem does not arise, so that a direct variational treatment of the problem is pos-
sible which can be compared to the analytic solution.
The electronic Schrödinger equation for the hydrogen atom reads:

×2 2 e2 1
· ¸
− ∇ − ψ(r) = E ψ(r) (4.21)
2m 4π²0 r
where the second term in the square brackets is the Coulomb attraction potential of the nu-
cleus. The mass m is the reduced mass of the proton–electron system which is approximately
equal to the electron mass. The ground state is found at energy
¶2
m e2
µ
E =− 2 ≈ −13.6058 eV (4.22)
× 4π²0
and the wave function is given by
1
ψ(r) = p e −r /a0 , (4.23)
a 03/2 π
4.2. L INEAR VARIATIONAL CALCULATIONS 35

in which a 0 is the Bohr radius,

4π²0 ×2
a0 = ≈ 0.529 18 Å. (4.24)
me 2
When performing a calculation for such an equation, it is convenient to use units such
that equations take on a simple form, involving only coefficients of order 1. Standard units in
electronic structure physics are so-called atomic units: the unit of distance is the Bohr radius
a 0 , masses are expressed in the electron mass m e and the charge is measured in unit charges
(e). The energy is finally given in ‘Hartrees’ (E H ), given by m e c 2 α2 (α is the fine-structure
constant and m e is the electron mass) which is roughly equal to 27.212 eV. In these units, the
Schrödinger equation for the hydrogen atom assumes the following simple form:
{
4
· ¸
1 2 1 {
− ∇ − ψ(r) = E ψ(r). (4.25)
2 r

We try to approximate the ground state energy and wave function of the hydrogen atom
in a linear variational procedure. We use Gaussian basis functions. For the ground state, we
only need angular momentum l = 0 functions (s-functions) – they have the form:
2
χp (r ) = e −αp r (4.26)

centred on the nucleus (which is thus placed at the origin). We have to specify the values of
the exponents αp . A large αp defines a basis function which is concentrated near the nucleus,
whereas small αp characterises a function with a long tail. Optimal values for the exponents
αp can be found by solving the non-linear variational problem including the linear coeffi-
cients C p and the exponents αp . Several numerical methods for solving such non-linear op-
timisation problem exist and the solutions can be found in textbooks or documentation of
quantum chemical software packages.
We shall use known, fixed values of the exponents:

α1 = 13.007 73
α2 = 1.962 079
α3 = 0.444 529 (4.27)
α4 = 0.121 949 2,

but relax the values of the coefficients C p . The wave function therefore has the form

4 2
C p e −αp r
­ ® X
r|ψ =
p=1

with the αp listed above. We now discuss how to find the best values of the linear coefficients
C p . To this end, we need the overlap and Hamiltonian matrix. The advantage of using Gaus-
sian basis functions is that analytic expressions for these matrices can be found. In particular,
it is not so difficult to show that the elements of the overlap matrix S, the kinetic energy matrix
T and the Coulomb matrix A are given by:
¶3/2
π
Z µ
2 2
S pq = d 3 r e −αp r e −αq r = ,
αp + αq
1 αp αq π3/2
Z
2 2
T pq = − d 3 r e −αp r ∇2 e −αq r = 3 , (4.28)
2 (αp + αq )5/2
2 1 2π
Z
2
A pq = − d 3 r e −αp r e −αq r = − .
r αp + αq
36 4. T HE VARIATIONAL METHOD FOR THE S CHRÖDINGER EQUATION

STO-4G
STO-3G
STO-2G
0.5 STO-G
Exact

0.25

0
0 1 2 3

F IGURE 4.2: The best fit of a Gaussian basis set consisting of 1, 2, 3 and 4 basis functions, to the exact Slater
{ solution e −r /a of the hydrogen atom. The mnemonic STO-nG denotes a fit using n Gaussian functions to a Slater
4{ Type Orbital (STO).

These expressions can be put into a computer program which solves the generalised eigen-
value problem. The resulting ground state energy is −0.499278 Hartree, which is amazingly
close to the exact value of −1/2 Hartree, which is −13.6058 eV. We conclude that four Gaus-
sian functions can be linearly combined into a form which is surprisingly close to the exact
ground state wave function which is known to have the so-called Slater-type form exp(−r /a)
rather than a Gaussian! This is shown in figure 4.2.

4.2.3 E XPLOITING SYMMETRY


We have seen that the solution of a stationary quantum problem using linear variational cal-
culus, in the end, boils down to solving a (generalised) matrix eigenvalue problem. Finding
the eigenvalues (and eigenvectors) of an N × N matrix, or solving a generalised eigenvalue
problem, requires a number of floating point operations in the computer proportional to N 3 .
This means that if we double the size of the basis set used in the variational analysis, the com-
puter time goes up by a factor of 8. As it turns out, we are often interested in problems having
some symmetry. We shall now briefly sketch how this can be used to significantly reduce the
computer time for variational calculations.
In subsection 4.2.1, we considered a problem having a very simple symmetry: replacing
x by −x does not change the potential, and therefore the Hamiltonian is insensitive to this
transformation. Let us denote the operation x → −x by R. Because flipping the sign of x
twice leaves the space invariant, we have
R 2 = 1,
where 1 is as usual the identity operator. Let us consider the eigenvalues λ of this operator.
From R 2 = 1 we have that λ2 = 1. Therefore, λ = ±1. Furthermore, R commutes with the
Hamiltonian:
RH − H R = 0,
since the Hamiltonian is not affected by R. We know (or should know!) that if an operator
commutes with H we can always find eigenvalues which are eigenvalues of H and of that
operator. This means that we can divide the eigenfunctions of H into two classes: one of
symmetric eigenfunctions (symmetric meaning having eigenvalue λ = +1 when acting on it
with R) and one of antisymmetric eigenfunctions (λ = −1).
Now suppose we construct our variational basis set such that it can be divided into two
classes, that of symmetric and that of anti-symmetric basis functions. Let us calculate the
inner product of a symmetric and an anti-symmetric eigenfunction. Using antisymmetry
of the product of the two may immediately convince you that¯ this vanishes.
¯ ® To illustrate
the more general procedure, we consider two eigenfunctions, ¯φ1 and ¯φ2 with different
®

eigenvalues λ1 and λ2 for the symmetry operation R. Then we can write


φ1 |R|φ2 = λ1 φ1 |φ2 = λ2 φ1 |φ2 ,
­ ® ­ ® ­ ®
4.3. E XAMPLES OF NON - LINEAR VARIATIONAL CALCULUS 37

where we first let R act on the left, and then on the right function. The fact that λ1 6= λ2
leads to the well-known theorem saying that two eigenvectors of a Hermitian operator with
different eigenvalues, are orthogonal (if the eigenvalues are the same, the wave functions are
either identical or they can be chosen orthogonal).
The key result is that a similar conclusion can be drawn for the expectation value of the
Hamiltonian.

φ1 |RH | φ2 = λ1 φ1 |H | φ2 = φ1 |H R| φ2 = λ2 φ1 |H | φ2 ,
­ ® ­ ® ­ ® ­ ®

which, as λ1 6= λ2 directly gives


φ1 |H |φ2 = 0.
­ ®

For an orthonormal basis set, we see that, if we would order the basis functions in our
{
set with respect to their eigenvalue of R, the Hamiltonian becomes block-diagonal. ¯ For our 4
{
denoting the symmetric basis functions by φps and
®
simple reflection-symmetric ¯ example,
¯
the antisymmetric ones by ¯φpa where p runs from 1 to M = N /2, we have
®

 
H1s,1s . . . H1s,M s 0... 0
 .. .. .. .. .. .. 

 . . . . . . 

 H
 M s,1s . . . H M s,M s 0... 0 
H =

0 ... 0 H1a,1a . . . H1a,M a 


.. .. .. .. .. ..
 
. .
 
 . . . . 
0 ... 0 H M a,1a . . . H M a,M a

We can diagonalise the two blocks on the diagonal independently. This takes 2(N /2)3 steps
(up to a multiplicative constant), which is 4 times less than N 3 (up to the same constant)
required to diagonalise the full Hamiltonian! If there are additional symmetries, they can be
used to reduce the work required even further.
If the basis is non-orthogonal, it still holds that basis functions having different eigen-
values under the symmetry-operator R are orthogonal and that the matrix elements of the
Hamiltonian between them vanishes. This means that the Hamiltonian matrix and the over-
lap matrix have the same block-diagonal structure. Therefore, the respective generalised
eigenvalues for the blocks can be dealt with independently of each other and we achieve the
same speed-up.
What we have touched upon is an example of the application of group theory in physics,
which is an important topic on its own.

4.3 E XAMPLES OF NON - LINEAR VARIATIONAL CALCULUS


Linear variational calculus leads to a (generalised) matrix eigenvalue problem to be solved.
Variational methods form a much wider class, including trial functions which depend non-
linearly on the variational parameters. Numerically this is quite complicated to solve, but
several analytic non-linear variational calculations exist which give quite good results even if
one only one or two variational parameters are used. A nice example is¯ the hydrogen atom,
which we try to solve using a variational wave function (‘trial function’) ¯ψT of the form
®

r|ψT ∝ e −r /a .
­ ®

You may note that this is the form of the exact ground state of the hydrogen atom. However,
to illustrate the variational method, we first relax the value of the parameter a and then vary
it to minimise the expectation value of the energy. We should then find the exact ground state
wave function and energy.
The Schrödinger equation was already given in atomic units in the previous section:
· ¸
1 1
− ∇2 − ψ(r) = E ψ(r). (4.29)
2 r
38 4. T HE VARIATIONAL METHOD FOR THE S CHRÖDINGER EQUATION

It is useful to first normalise the trial wave function:


Z
4π r 2 e −2r /a d r = πa 3 ,

so that we have
1
e −r /a .
­ ®
r|ψT = p
πa 3
Its is now easy to calculate the expectation value of the kinetic energy. Using the fact that
for a function ψ(r) in 3D which only depends on r ,
1 d 2 d
µ ¶
2
∇ ψ(r) = 2 r ψ(r ) ,
r dr dr
{ we have
4{ 2 1
µ
1 2

∇ ψT (r) = p − e −r /a ,
πa 3 a 2 ar
we find, after some calculation that
¿ ¯ ¯ À Z µ ¶
¯ 1 2¯ 1 1 2 1
− ψT ¯ ∇ ¯ ψ T =
¯ ¯
3
− 2+ e −2r /a 4πr 2 d r = 2 .
2 2πa a ar 2a
For the potential energy, we find
1 1
Z
− e −2r /a 4πr d r = − .
πa 3 a
Therefore, the expectation value of the energy for the trial wave function is given by
1 1
ET = 2
− .
2a a
The minimum of this expression is found at a = 1 and yields an energy of E T = −1/2 in units
of 27.212 eV, which is the correct ground state energy of −13.6058 eV.
Now we turn to a more complicated problem: the helium atom, which (when the nucleus
is considered not to move because of its large mass) is described by the Hamiltonian

p 12 p 22 2e 2 2e 2 e2
H= + − − + .
2m 2m 4π²0 r 1 4π²0 r 2 4π²0 |r1 − r2 |
In atomic units, this becomes, with r 12 = |r1 − r2 |:

1 d2 1 d2 2 2 1
H =− 2
− 2
− − + .
2 d r 1 2 d r 2 r 1 r 2 r 12

For the trial wave function we use the form


­ ®
r1 , r2 |ψ = exp [−2(r 1 + r 2 )/a] .

This function is chosen such that it yields the ground state of two noninteracting electrons
moving in the field of the helium nucleus (the nuclear charge Z = 2 leads to a scaling of 2
in the exponent). So the trial wave function is simply the successful trial wave function (in
the sense that it contains the exact solution) for the independent-electron case. In particular,
when an electron approaches the nucleus, its behaviour is properly described by this wave
function as the electron-nucleus interaction largely dominates the electron-electron interac-
tion in that case.
We have taken the wave function to be symmetric in the coordinates r1 and r2 . The two
electrons should however form an antisymmetric wave function as they are fermions. The
antisymmetry is taken care of by the spin wave function
1
p (|↑↓〉 − |↓↑〉) .
2
4.3. E XAMPLES OF NON - LINEAR VARIATIONAL CALCULUS 39

Later we shall go much deeper into the structure of many-body wave functions – here we just
mention that, as the Hamiltonian does not contain any spin dependence, we can forget about
the spin part of the wave function and can safely assume that the ground state wave function
is symmetric in r1 and r2 .
We first must normalise this solution. This can be done for r1 and r2 independently and
we obtain
πa 3
Z
4π r 2 d r e −4r /a = ,
8
so that the normalised solution reads
­ ® 8
r1 r2 |ψ = exp [−2(r 1 + r 2 )/a] . (4.30)
πa 3
{
In order to find the expectation value of the Hamiltonian for this wave function, it is conve- 4
{
nient to write it in the form
1
H = H1 + H2 + ,
|r1 − r2 |
where
p i2 2
Hi = − , i = 1, 2.
2m ri
This is the Hamiltonian for an electron moving in the helium potential.
We calculate the kinetic energy following the method used for the hydrogen atom. The
result is (for the two electrons together)

4
EK = .
a2

A quick way to arrive at this result is by taking the kinetic energy 1/(2a 2 ) for the hydrogen
atom, replacing a → a/2 and multiplying by 2 because we now have two electrons. The po-
tential energy due to the attraction between the nucleus and the electrons is found to be

−8
E n-e = ,
a
as can be verified by taking the hydrogen result 1/a, replacing a → a/2 and multiplying by 2
because the nuclear charge is twice as large as for the hydrogen atom and again by 2 because
we now have two electrons.
Now we must add to this the contribution from the electron repulsion:

64e −4(r 1 +r 2 )/a 1


Z
d 3r1 d 3r2.
π a
2 6 |r1 − r2 |

If we fix r1 , we can evaluate the integral over r2 . Choosing the z-axis to be the direction of r1 ,
we have (without the prefactor):

1
Z
e −4(r 1 +r 2 )/a q 2π sin θd θr 22 d r 2 .
r 12 + r 22 − 2r 1 r 2 cos θ
p
We first evaluate the integral over θ. Choosing cos θ = u, this is of the form d u/ p − qu and
R

we are left with


µ ¶
1 q 2
Z q
−4(r 1 +r 2 )/a
2π e r 1 + r 2 + 2r 1 r 2 − r 1 + r 2 − 2r 1 r 2 r 22 d r 2 =
2 2 2
r1r2
· ¸
1
Z
−4(r 1 +r 2 )/a
2π e (r 1 + r 2 − |r 1 − r 2 |) r 22 d r 2 (4.31)
r1r2
40 4. T HE VARIATIONAL METHOD FOR THE S CHRÖDINGER EQUATION

The term in square brackets equals 2/ max(r 1 , r 2 ) where the function max(x, y) returns the
largest of the two numbers x and y. Therefore, we need to split the integral over r 2 into a part
running from 0 to r 1 and a part running from r 1 to ∞:
· Z r1 Z ∞ ¸
1
4πe −4r 1 /a e −4r 2 /a r 22 d r 2 + e −4r 2 /a r 2 d r 2 =
r1 0 r1
3
πa −4r 1 /a h r1 i
e 1 − e −4r 1 /a − 2 e −4r 1 /a .
8r 1 a
Now it remains to multiply this by the normalisation factor 64/(π2 a 6 ) and by exp(−4r 1 /a),
and then integrate over 4πr 12 d r 1 (there is no dependence on the two angular variables for r1 ).
All the integrals are straightforward and the final result is
{ ¿ ¯ ¯ À
1 ¯ ψT = 5
4{ ψT ¯
¯
¯
¯
|r1 − r2 | ¯ 4a
in units of Hartree = 27.212 eV. So the total energy is given by
4 −8 5 4 27
ET = − + = − .
a2 a 4a a 2 4a
Taking a = 1, i.e. assuming that both electrons are in the ground state of the atom with the
electron-electron interaction switched off, yields an energy of

4 − 27/4 = −11/4 Hartree = 74.833 eV.

The experimental value is −79 eV, so although our result is not extremely bad, it is not im-
pressively accurate either.
Now let us relax a and find the minimum of the trial energy. This is found at a = 32/27 =
1.1815 Bohr radii. The energy is then found at −2.8477 Hartree= 77.49 eV.

4.4 S UMMARY
In this chapter, we have studied a successful method for approximating the ground state so-
lution of a complicated quantum mechanical problem: variational calculus. This method
simply consists of finding the minimum of the expectation value of the Hamiltonian in a re-
stricted space, where the term ‘restricted’ indicates that the space is a subset of the full Hilbert
space of the quantum problem. It is trivial to see that this variational solution yields an energy
equal to or larger than the exact value.
The space in which we search for the solution usually is a set which we can represent
by parametrised trial solutions, ψT = ψT (α1 , α2 , . . . , αN ). The parameters α j enter in a pos-
sibly non-linear way into the wave function. Solving non-linear minimisation problems is
non-trivial, but numerical routines are available for this. Analytical non-linear variational
problems are usually restricted to one or two variational parameters with respect to which
the expectation value of the Hamiltonian is to be minimised.
A special case is the one in which the trial wave functions depend linearly on the varia-
tional parameters, which we now call C p . That is, we can write the trial wave function as
N
¯ψT = C p ¯χp .
¯ ® X ¯ ®
p=1

In that case, the minimisation problem of the energy reduces to solving the generalised eigen-
value problem:
HC = E SC,
where C stands for the vector with elements C p and the matrices H and S have elements

H pq = χp |H | χq ;
­ ®

S pq = χp |χq .
­ ®
4.5. P ROBLEMS 41

For the special case of an orthonormal basis χp , S pq = δpq , the Dirac delta-function, and the
generalised eigenvalue problem reduces to an ordinary eigenvalue problem:

HC = E C.

4.5 P ROBLEMS
1. (a) We consider the ground state of an electron in the hydrogen atom. Approximate
the ground-state wave function by
2
r|ψ = e −αr
­ ®

and find an upper bound to the ground-state energy using variational calculus.
{
(b) Approximate the ground state of the one-dimensional harmonic oscillator using 4
{
a trial wave function (
­ ® α2 − x 2 for |x| ≤ α,
x|ψ =
0 elsewhere.

2. The attractive potential felt by an electron in an atom is sometimes taken to be the


screened potential
Ae 2 −r /ξ
V (r ) = e .
r
Consider trial wave functions of the form

ψ(r) = exp(−r /ρ).

Minimise the variational energy for this wave function.

3. Consider a one-dimensional potential


(
λx for 0 < x < a;
V (x) =
∞ for x ≤ 0 and x ≥ a.

Find the ground state for this Hamiltonian using variational calculus. Take a second or-
der polynomial as a trial function. The polynomial should obviously satisfy the correct
boundary conditions at x = 0 and x = a.

4. We consider a particle moving in one dimension in a ‘quartic potential’. The Hamilto-


nian is given as
×2 d 2 b
H =− 2
+ x 4,
2m d x 4
with b some positive constant. Now take a trial wave function of the form
1 2 2
x|ψ = pp e −x /(2σ ) .
­ ®
πσ

Calculate the variational energy and minimise it to obtain the variational ground state
energy.

5. Consider a system of two particles with equal masses m and momenta p1 and p2 , in-
teracting via a potential V (r 12 ) where r 12 = |r1 − r2 |.

(a) Write the Hamiltonian Ĥ of the system in terms of the momenta

P = p1 + p2 ; (4.32)
p1 − p2
p= (4.33)
2
42 4. T HE VARIATIONAL METHOD FOR THE S CHRÖDINGER EQUATION

(all momentum vectors are operators) and of r 12 . Also use the total mass M +m 1 +
m 2 and the reduced mass µ = m 1 m 2 /(m 1 + m 2 ).
Show that Ĥ can be written in the form
P2
Ĥ = + Ĥ12 .
2M

(b) We denote by E (2) the ground state energy of Ĥ12 . Give the expression for E (2)
when V (r ) = −b 2 /r and for V (r ) = κr 2 /2.
(c) Consider a system of three particles of equal mass m with pairwise interactions:

V = V (r 12 ) + V (r 23 ) + V (r 31 ) .
{
4 { Show that
¢2 ¡ ¢2 ¡ ¢2 ¡ ¢2
3(p 12 + p 22 + p 32 ) = p1 + p2 + p3 + p1 − p2 + p2 − p3 + p3 − p1 ,
¡

and that the Hamiltonian of the three-body Hamiltonian can be written as

P2 (3)
H (3) = + Hrel ,
6m
where
(3)
Hrel = H12 + H23 + H31 ,
P = p1 + p 2 + p 3
and where Hi j contains a kinetic part with a reduced mass µ0 . Express µ0 in terms
of m.
(d) Check whether the Hi j commute amongst each other. What can you say about
the energy spectrum if this were the case?
(e) Show that the three-body ground state energy E (3) is related to the ground state
energy E (2) of the two-body problem described by Hi j by the inequality

E (3) ≥ 3E (2) .

Note that the latter depends on µ0 calculated in (c).


Hint: write E (3) as the expectation value of H (3) for the ground state |Ω〉. Then use
the fact that Ω ¯ H ¯ Ω ≥ E (2) .
­ ¯ (2) ¯ ®

(f) Give the lower bounds for the case where V (r ) = −b/r 2 and where V (r ) = κr 2 /2.
How does this lower bound for the first case compare with the numerical result

E (3) ≈ −1.067mb 4 /×2 ?

6. Linear variational calculus for the Cooper-pair box.


The Cooper-pair box consists of a small piece of superconductor (called the island)
coupled to a larger piece (called the reservoir) via a Josephson junction: a thin, insu-
lating layer. Superconductivity will be addressed later in this course, and Josephson
junctions not at all. Detailed knowledge however is not needed to do this exercise. In
a superconductor, Cooper pairs, pairs of electrons with opposite spin and momentum,
form a condensate which requires a finite energy to create excitations. The Joseph-
son junction is thin enough for allowing Cooper pairs to tunnel through it, and this
tunneling generates a particular coupling between the two superconducting volumes
connected by the junction.
We use the charge basis |n〉, where n denotes the number of Cooper pairs that have
tunneled through the junction, and therefore the number of charges that have moved
4.5. P ROBLEMS 43

to the small superconductor (see figure below). This island has an electrostatic capacity
C , which means that every electron contributes an amount EC = e 2 /2C to the energy.
The Hamiltonian reads
EJ X
(n − n g )2 |n〉 〈n| −
X
H = 4EC (|n + 1〉 〈n| + |n〉 〈n + 1|) ,
n 2 n

with n = ... − 1, 0, 1, ... and n g is some charge offset. The first term reflects the electro-
static charging, and the second reflects tunneling (also called hopping). The parameter
E J is the so-called Josephson energy. Crucially, note that n g is a constant (not an opera-
tor) under the control of the experimentalist. Note that the charge basis is orthonormal,
〈n|m〉 = δn,m .
{
We will now use linear variational calcultus to estimate the ground and first-excited
state energies and wave functions. Caution: please do not attempt to solve this prob-
4
{

lem analytically. Rather, use your favorite mathematical software: Mathematica, Mat-
lab, anything you like! Please print out your code.

(a) Write the matrix H q p in the charge basis. The matrix is of course infinite dimen-
sional, show just a subset of it, revealing its basic structure.
(b) Write the matrix S q p also in the charge basis.
(c) Consider EC = 10E J . Restrict your trial functions to the subspace n = −N , ..., 0, ...+
N . Plot the ground and first-excited state energies as a function of n g in the range
n g ∈ [−1, 1]. Do this for N = 1, N = 2 and N = 3.
(d) Repeat for E J = 10EC .
(e) For E J = 10EC and n g = 0.5, plot the ground and first-excited state energies as a
function of N in the range N ∈ [1, 5]. Confirm McDonald’s theorem: show that the
energies decrease monotonically as you increase N .
(f) For E J = 10EC , plot the ground and first-excited state wave functions in the charge
basis. That is, plot the coefficient c n in the expansion

N
X
|Ψ〉 = c n |n〉
n=−N

Do this for n g = 0 and n g = 0.5. What choice of N would you say is accurate
enough?

island

junction

reservoir

F IGURE 4.3: The Cooper-pair box


5
T HE WKB APPROXIMATION

5.1 INTRODUCTION
The WKB approximation provides a simple way of obtaining reasonable values for different
aspects of the solution to a quantum problem in one dimension. In this chapter we briefly
describe how this approximation works. You are advised to read the material up in other
books containing more extensive treatments. A good reference is chapter 8 of Griffiths’ book.

5.2 T HE WKB A NSATZ


When electrons move in one dimension through some constant potential, their state can be
written as
­ ®
x|ψ = A exp(ikx)
with a wave vector s
2m(E − V )
k= .
×2
The wave runs towards the right (the positive x-direction), as can be seen by looking at the
full time-dependent wave function:
x|ψ = A exp [i (kx − ωt )] ,
­ ®

with ω = E /×. A wave running towards the left would then be described by the stationary
wave function
­ ®
x|ψ = A exp(−ikx).
What happens if the wave is incident on a non-constant potential, such as a step, a well
or barrier? Your first guess would probably be: part of the wave is reflected, and another
part is transmitted. That is correct as demonstrated by figure 5.1(a), which shows the time
evolution of a Gaussian wave packet incident on a rectangular potential barrier. However,
the rectangular barrier is special in the sense that at it varies abruptly from 0 to a finite value,
and then as abruptly back again.
Now let us consider a wave packet incident on a smooth barrier, described by a Gaussian
shape – see figure 5.1(b). The shape of the well is chosen such that its rise from 0 to its max-
imum value extends over a few wavelengths of the packet (of course, the packet contains a
continuum of wavelengths, but they are close to an average value). We see that the packet is
just transmitted and not reflected! So we conclude that

• A wave incident on a potential which varies substantially over length scales


smaller than a wave length, is reflected and transmitted.
• If the potential is smooth, i.e. it varies substantially on length scales (much)
larger than a wave length, an incident wave is not reflected, just transmitted.

45
46 5. T HE WKB APPROXIMATION

{
5{

F IGURE 5.1: (a) Reflection and transmission of a wave packet by a rectangular potential well. Periodic bound-
ary conditions are used, so the transmitted part disappearing off the right edge reappears at the left edge. (b)
Transmission of a wave packet by a Gaussian potential well. Reflection is virtually absent. Periodic boundary
conditions are applied.

This notion is the basis of the Wentzel-Kramers-Brillouin, or WKB approximation. In this


section, we introduce this approximation in a hand-waving way.
We assume that E > V (x), so that running waves are possible. Any wave function is then
characterised by an amplitude and a phase:

x|ψ = A(x)e iϕ(x) ,


­ ®

where both A and ϕ are real. We focus on the phase ϕ. Moving over a distance ∆x, the phase
of the wave function changes by k∆x where
r
2m
k= (E − V )
×2
where we assume that the potential V is constant on the interval ∆x. This assumption is
valid provided ∆x is taken sufficiently small. Now consider the following interval where the
potential has a different value V 0 . Then the total phase picked up by the wave function on the
two intervals is
k∆x + k 0 ∆x = (k + k 0 )∆x,
p
where obviously k 0 = (2m/×2 ) (E − V 0 ). Note that this only holds if we only allow right-
moving waves, i.e. we neglect left movers, arising from scattering. This assumption is only
valid when the potential is smooth. Dividing up a larger interval into N small ones ∆x, a
wave function which is real at x 0 will have the form
à !
NX −1
r
2m
k(x i )∆x ; x i = x 0 + i ∆x; k(x) =
­ ®
x|ψ ≈ A(x) exp i (E − V (x)).
i =0 ×2

Taking the limit N → ∞, we may write this as


µZ x ¶
k(x 0 )d x 0 .
­ ®
x|ψ ≈ A(x) exp i
x0
5.3. T HE WKB A NSATZ II 47

This fixes the phase factor. In order to find the amplitude A(x), we use a conservation prin-
ciple. Remember that the absolute square of the wave function represents the probability to
find the particle in a particular quantum state, which in our context means a particle located
at a specific position. For our wave function, the density is given by |A(x)|2 . The speed at
which the particles travel at position x is (approximately) given by

$v = \frac{\hbar k(x)}{m},$

therefore the flux is given as

$j(x) = |A(x)|^2\,\frac{\hbar k(x)}{m}.$
The crucial step is now that, since this wave function represents a stationary flow, the flux
must be constant through space. This implies that

$A(x) \propto \frac{1}{\sqrt{k(x)}}.$

This then leads to the WKB wave function Ansatz:

$\langle x|\psi\rangle \approx \frac{1}{\sqrt{k(x)}}\exp\left(i\int_{x_0}^{x}k(x')\,dx'\right); \qquad k(x) = \sqrt{\frac{2m}{\hbar^2}\left(E - V(x)\right)},$

which fixes the form of the wave function up to a suitable normalization constant.
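As a quick numerical illustration (a sketch in Python, not part of the original notes; the Gaussian barrier and all parameter values are illustrative assumptions), one can build the WKB wave function on a grid by accumulating the phase integral and check that the flux ℏk(x)|A(x)|²/m is indeed constant:

import numpy as np

hbar, m = 1.0, 1.0          # work in units with hbar = m = 1
E, V0, w = 2.0, 1.0, 5.0    # energy above the barrier top, smooth barrier

x = np.linspace(-30.0, 30.0, 2001)
V = V0 * np.exp(-(x / w) ** 2)
k = np.sqrt(2.0 * m * (E - V)) / hbar          # local wave number k(x)

# phase = integral of k(x') dx' from x[0] to x (cumulative trapezoid rule)
phase = np.concatenate(([0.0], np.cumsum(0.5 * (k[1:] + k[:-1]) * np.diff(x))))
psi_wkb = np.exp(1j * phase) / np.sqrt(k)      # WKB Ansatz

# the flux hbar*k(x)*|A(x)|^2/m should be constant along the grid
flux = hbar * k * np.abs(1.0 / np.sqrt(k)) ** 2 / m
print(flux.min(), flux.max())                  # both equal to hbar/m

The constant flux is precisely the conservation principle used above to fix $A(x) \propto 1/\sqrt{k(x)}$.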

5.3 THE WKB ANSATZ II


In the previous section we have considered the WKB wave function, which was based on the
notion that a wave incident on a smooth potential is not scattered back, so that the phase is
simply the sum of the infinitesimal phases picked up on many short intervals. In this section
we start from the form

$\langle x|\psi\rangle = A(x)\exp[i\varphi(x)], \qquad (5.1)$
for the solution of the one-dimensional Schrödinger equation, where both A and ϕ are real
functions. Inserting this into the Schrödinger equation, we derive the form for ϕ and A under
the assumption that V varies slowly. Note that any one-dimensional wave function can be
written in this form, so we have not imposed any restriction here.
Let us assume that in the interval where we describe the wave function by this form, E >
V(x). Defining

$k(x) = \sqrt{\frac{2m(E - V(x))}{\hbar^2}},$

we can write the Schrödinger equation in the form (using $\psi(x) = \langle x|\psi\rangle$):

$\psi''(x) = -k^2(x)\psi(x),$

where $\psi''$ is the second derivative of ψ with respect to x.


Now we put our wave function (5.1) into this equation. This leads straightforwardly to

$A'' + 2iA'\varphi' + iA\varphi'' - A\varphi'^2 = -k^2 A.$

Note that this yields two real equations, one following from the imaginary, and one from the
real part. They are, respectively:

$2A'\varphi' + A\varphi'' = 0$

and

$A'' - A\varphi'^2 = -k^2 A.$

The first equation can be recast into the form

$\frac{1}{A}\frac{d}{dx}\left(A^2\varphi'\right) = 0,$

from which we immediately have

$A = \frac{\mathrm{Const}}{\sqrt{\varphi'}}. \qquad (5.2)$

For a slowly varying potential, we anticipate that $\varphi(x) \approx k(x)x$, so that $k(x) \approx \varphi'(x)$. From
the relation (5.2) between A and ϕ, we can then infer that $A''/A \propto [V'/(E-V)]^2,\ V''/(E-V)$.
We now neglect these terms, i.e. we set $A'' \ll k^2 A$. We then have

$\varphi'(x) = \pm k(x),$

and we write the solution in the form

$\varphi(x) = \pm\int_{x_0}^{x}k(x')\,dx'.$

This is the WKB approximation. Note that the approximation essentially consists of neglecting the term $A''$ in comparison with the term $\varphi'^2 \approx k^2$, i.e. we neglect variations of the potential on length scales comparable to or smaller than the wavelength $1/k(x)$ that the particles have due to their kinetic energy.
Remember that in this analysis we have required that E > V and that the variation of V
over a few wavelengths is small in comparison with E −V (x). If V (x) gets close to E , the wave-
length grows larger and larger, and this condition no longer holds. We shall see in section 5.6
how this regime can be dealt with. Another regime is that where the energy E is substantially
smaller than V (x). This can be dealt with in the same way as above. Now we can choose the
wave function to be real (this can also be done in the E > V regime, but there it is inconve-
nient). Defining

$\kappa(x) = \sqrt{\frac{2m(V(x) - E)}{\hbar^2}},$
the Schrödinger equation turns into
$\psi'' = \kappa^2\psi.$

Plugging a solution

$A e^{\pm\varphi(x)},$

where $\varphi'(x) = \kappa(x)$, into this equation yields

$\left(A'' \pm 2A'\kappa \pm A\kappa' + \kappa^2 A\right)e^{\pm\varphi(x)} = \kappa^2 A\, e^{\pm\varphi(x)}.$

Neglecting again the term proportional to $A''$, we are left with the condition that the second
and third terms in the brackets on the left hand side vanish. Casting this into the form

$\frac{1}{A}\frac{d}{dx}\left(A^2\kappa\right) = 0,$

we find

$A(x) = \frac{1}{\sqrt{\kappa(x)}},$

leading to the WKB wave function:

$\psi_{\mathrm{WKB}}(x) = \frac{1}{\sqrt{\kappa(x)}}\, e^{\pm\int_{x_0}^{x}\kappa(x')\,dx'}.$

Summary so far
We have considered an approximation to the solution of the stationary, one-
dimensional Schrödinger equation in those regions where the wave function oscil-
lates much more rapidly than the potential; in formula:

$\frac{\psi''}{\psi} \gg \frac{V''}{V},\ \left(\frac{V'}{V}\right)^2.$

The solution to the stationary Schrödinger equation in those regions is approximated by

$\psi_{\mathrm{WKB}}(x) = \sqrt{\left|\frac{1}{k(x)}\right|}\, e^{\pm\int_{x_0}^{x}k(x')\,dx'},$

where

$k(x) = \pm\sqrt{\frac{2m\left[V(x) - E\right]}{\hbar^2}}.$

For E < V(x), the WKB solution represents a decaying or exponentially growing wave function; for E > V(x), k(x) is purely imaginary, and we obtain an oscillatory wave function.

5.4 THE WKB ANSATZ III


In this section we consider the WKB approximation once again, but using an alternative
derivation. This derivation is performed in Problem 8.2 of Griffiths’ book.
We consider again a Schrödinger equation in one dimension with a slowly varying poten-
tial. Slow means that the potential does not vary significantly on the scale of a wavelength of
the solution.

$-\frac{\hbar^2}{2m}\psi''(x) + V(x)\psi(x) = E\psi(x).$

Using $\kappa(x) = \sqrt{\frac{2m}{\hbar^2}\left[V(x) - E\right]}$, we can write this in the form

$\psi''(x) = \kappa^2(x)\psi(x).$

Note that $\kappa^2$ can be positive or negative. When it is positive, we are in the tunnelling region; if
it is negative, we are in the classical region (the region where classical mechanics allows the
particle to be).
We write the wave function in the form

$\langle x|\psi\rangle = \exp[u(x)],$

where u(x) can be complex. The Schrödinger equation can now be worked out for this form
in terms of u:

$u'' + u'^2 = \kappa^2$

(note that κ and u are functions of x). Note that, so far, no approximation has been made.
In order to make progress, we define a reference solution $u_0$ by

$u_0'(x) = \pm\kappa(x).$

It can easily be seen that for κ independent of x, the reference solution satisfies the Schrödinger equation:

$u_0' = \pm\kappa; \qquad u_0'' = 0 \quad \text{for } \kappa \text{ constant}.$
Now we write the exact solution u as

$u(x) = u_0(x) + \delta u(x).$



Putting this into the Schrödinger equation (for u as formulated above), we obtain

$u'' + u'^2 = u_0'' + \delta u'' + \left(u_0'\right)^2 + 2u_0'\delta u' + \left(\delta u'\right)^2 = \kappa^2(x).$

Now we use $u_0' = \pm\kappa(x)$ and neglect the second and the last term in the second expression as
they are second order in the inverse wavelength of the variation of the potential and hence
expected to be much smaller than the remaining terms. We then obtain:

$u_0'' + 2u_0'\delta u' = 0;$

hence

$\delta u' = -\frac{1}{2}\frac{d}{dx}\ln u_0' = -\frac{1}{2}\frac{d}{dx}\ln\kappa$
as can easily be verified.
Now we have

$u(x) = u_0(x) + \delta u(x) = \pm\int_{x_0}^{x}\kappa(x')\,dx' + \ln\frac{1}{\sqrt{\kappa(x)}}.$

Translating this back into the original wave function ψ, we obtain

$\langle x|\psi\rangle = \frac{1}{\sqrt{\kappa(x)}}\exp\left(\pm\int_{x_0}^{x}\kappa(x')\,dx'\right)$

up to a normalisation constant. This is the WKB wave function.


In the classical region, $\kappa^2 < 0$, and we take $\kappa(x) = ik(x)$, recovering the same solution as found at
the beginning of the previous section. Similarly, for the tunnelling region, taking κ(x) real we
obtain the same result as in the previous section for E < V(x).

5.5 TUNNELLING IN THE WKB APPROXIMATION


The WKB is often used to solve tunnelling problems. The standard way in which such prob-
lems are formulated is by considering a one-dimensional potential as in figure 5.2.


FIGURE 5.2: A typical tunnelling problem: A wave incident from the left splits into a part which bounces back and one which tunnels through the classically forbidden region.

Consider a wave exp(ikx) incident from the left, where the potential is zero, at energy E.
The wave vector is $k = \sqrt{2m\left[E - V(x)\right]/\hbar^2}$. The wave is incident on a potential barrier between
$x_L$ and $x_R$ where V(x) > E, that is, classically the particle cannot enter this region. Part of
the wave bounces back from this barrier and part tunnels through the classically forbidden
potential. On the right hand side, the transmitted wave is given by exp(ikx). All in all the wave
function is

$\langle x|\psi\rangle = \begin{cases} e^{ikx} + Re^{-ikx} & \text{for } x < x_L \\ u(x) & \text{for } x_L \le x \le x_R \\ Te^{ikx} & \text{for } x > x_R. \end{cases}$

FIGURE 5.3: The Airy functions Ai and Bi. These are the solutions to the Schrödinger equation with a crossing point at x = 0. Left of this crossing point we see the oscillatory behaviour characteristic for E > V(x) – the classically allowed region. Right of that point, the curve either decays (Ai) or it increases indefinitely (Bi).

For the solution in the classically forbidden region we use the WKB approximation:

$u(x) = \frac{A}{\sqrt{\kappa(x)}}\, e^{\int_{x_L}^{x}\kappa(x')\,dx'} + \frac{B}{\sqrt{\kappa(x)}}\, e^{-\int_{x_L}^{x}\kappa(x')\,dx'}.$

The matching conditions lead straightforwardly to a set of equations connecting A, B , T and


R. The analysis simplifies considerably if the tunneling amplitude is small. This happens
when the potential is wide and/or much higher than the energy of the incident wave. The
wave function is then much smaller in amplitude at the right edge of the barrier than it is on
the left. The exponent $\gamma = \int_{x_L}^{x_R}\kappa(x')\,dx'$ is mainly responsible for this difference – the prefactor
$1/\sqrt{\kappa(x)}$ varies much less than the exponent. We therefore neglect the prefactor. It is then
easy to see that matching |T| to the solution yields $|T| \propto \exp(-\gamma)$.
The ratio between the tunnelling and the incident currents is proportional to $|T|^2$ as the
velocity of the particles at the left and the right of the barrier is the same, so the current is
dominated by the density, which is proportional to the square of the amplitude of the wave
function. This ratio is usually denoted as the transmission T. This transmission therefore
satisfies

$T \propto e^{-2\gamma}; \qquad \gamma = \int_{x_L}^{x_R}\kappa(x')\,dx'.$
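For a concrete barrier the integral γ is easily evaluated numerically. The following sketch (my own illustrative example with a Gaussian barrier; not taken from the notes) computes γ and the resulting estimate T ≈ exp(−2γ):

import numpy as np

hbar, m = 1.0, 1.0
V0, w, E = 5.0, 2.0, 2.0            # illustrative barrier height, width and energy

def V(x):
    return V0 * np.exp(-(x / w) ** 2)

# classical turning points x_L, x_R where V(x) = E (known analytically here)
xR = w * np.sqrt(np.log(V0 / E))
xL = -xR

# gamma = integral of kappa(x) dx between the turning points
x = np.linspace(xL, xR, 4001)
kappa = np.sqrt(2.0 * m * (V(x) - E)) / hbar
gamma = np.trapz(kappa, x)

print("gamma =", gamma, "  T ~ exp(-2*gamma) =", np.exp(-2.0 * gamma))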

5.6 THE CONNECTION FORMULAE


The WKB approximation is valid only when the wavelength of the solution is (much) smaller
than the length scale on which the potential varies. Now consider a problem in which the potential varies
continuously. Then at the ‘classical turning point’ $x_t$, where $E = V(x_t)$, the wavelength 1/κ(x)
diverges! Hence the WKB approximation fails miserably in this case. So what should we do?
Well, we know that the exact solution behaves smoothly at and close to the turning point.
Let us therefore look at the exact solution close to the turning point and integrate from the
turning point to the left and to the right. Then we match this left and right solution onto the
WKB approximation when we are at some distance from the turning point. Now we describe
how this is done. Close to the turning point, which we take at x t = 0, we approximate the
potential by

$V(x) = E + \frac{\hbar^2}{2m}\alpha^3 x.$

The factor $\alpha^3$ sets the slope of the potential at the turning point (up to the factor $\hbar^2/2m$). We have taken it to be $\alpha^3$
just for convenience. Then the Schrödinger equation has the following form near the turning
point:

$\psi'' = \alpha^3 x\,\psi.$
There are two solutions to this equation (as it is a second order differential equation): they
are called the Airy functions Ai (αx) and Bi (αx) – they are shown in figure 5.3. They are not
known in terms of standard functions, but it is known that for large positive and negative
arguments they assume the forms:

$Ai(z) \approx \begin{cases} \dfrac{1}{\sqrt{\pi}(-z)^{1/4}}\sin\left[\tfrac{2}{3}(-z)^{3/2} + \tfrac{\pi}{4}\right] & \text{for } z \ll 0 \\[2mm] \dfrac{1}{2\sqrt{\pi}\,z^{1/4}}\exp\left(-\tfrac{2}{3}z^{3/2}\right) & \text{for } z \gg 0; \end{cases}$

$Bi(z) \approx \begin{cases} \dfrac{1}{\sqrt{\pi}(-z)^{1/4}}\cos\left[\tfrac{2}{3}(-z)^{3/2} + \tfrac{\pi}{4}\right] & \text{for } z \ll 0 \\[2mm] \dfrac{1}{\sqrt{\pi}\,z^{1/4}}\exp\left(\tfrac{2}{3}z^{3/2}\right) & \text{for } z \gg 0. \end{cases}$
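These limiting forms are easy to verify numerically. The sketch below (an illustration, assuming SciPy is available; scipy.special.airy returns Ai, Ai′, Bi, Bi′ in that order) compares Ai with its two asymptotic expressions:

import numpy as np
from scipy.special import airy

def ai_asym(z):   # large positive argument
    return np.exp(-2.0 / 3.0 * z ** 1.5) / (2.0 * np.sqrt(np.pi) * z ** 0.25)

def ai_osc(z):    # large negative argument
    return np.sin(2.0 / 3.0 * (-z) ** 1.5 + np.pi / 4.0) / (np.sqrt(np.pi) * (-z) ** 0.25)

for z in (5.0, 10.0):
    Ai, _, Bi, _ = airy(z)
    print(z, Ai, ai_asym(z))     # the two columns agree to high accuracy
for z in (-5.0, -10.0):
    Ai, _, Bi, _ = airy(z)
    print(z, Ai, ai_osc(z))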
The WKB form, which should also be valid once we are far enough from the turning point,
is given by (α > 0):

$\psi_{\mathrm{WKB}}(x<0) = \frac{1}{\sqrt{|k(x)|}}\left[A\,e^{-i\int_x^0 k(x')\,dx'} + B\,e^{i\int_x^0 k(x')\,dx'}\right]$

and

$\psi_{\mathrm{WKB}}(x>0) = \frac{1}{\sqrt{|k(x)|}}\left[C\,e^{-\int_0^x k(x')\,dx'} + D\,e^{\int_0^x k(x')\,dx'}\right].$
Both forms can be worked out analytically for $k(x) = \sqrt{\alpha^3|x|}$:

$\psi_{\mathrm{WKB}}(x<0) = \frac{1}{\alpha^{3/4}(-x)^{1/4}}\left[A\,e^{-i\frac{2}{3}(-\alpha x)^{3/2}} + B\,e^{i\frac{2}{3}(-\alpha x)^{3/2}}\right]$

and

$\psi_{\mathrm{WKB}}(x>0) = \frac{1}{\alpha^{3/4}x^{1/4}}\left[C\,e^{-\frac{2}{3}(\alpha x)^{3/2}} + D\,e^{\frac{2}{3}(\alpha x)^{3/2}}\right].$

We see that the forms of the WKB and the Airy functions for large argument are the same (as
it should) – we just need to match the coefficients.
A salient feature of the Airy functions is the factor 2 occurring in the denominator of Ai
for positive arguments, and the lack of this factor of 2 in Bi . This shows up in the matching
for positive x: writing

$\langle x|\psi\rangle = a\,Ai(\alpha x) + b\,Bi(\alpha x),$

we must have from the matching at positive x:

$a = 2\sqrt{\pi/\alpha}\,C \qquad\text{and}\qquad b = \sqrt{\pi/\alpha}\,D.$

For negative argument we therefore have a WKB solution

$\frac{2C}{\sqrt{|k(x)|}}\sin\left(\int_x^0 k(x')\,dx' + \frac{\pi}{4}\right) + \frac{D}{\sqrt{|k(x)|}}\cos\left(\int_x^0 k(x')\,dx' + \frac{\pi}{4}\right).$

We now express the matching condition for a turning point $x_t$ as follows:

$\psi(x \ll x_t) = \frac{2C}{\sqrt{|k(x)|}}\sin\left(\int_x^{x_t} k(x')\,dx' + \frac{\pi}{4}\right) + \frac{D}{\sqrt{|k(x)|}}\cos\left(\int_x^{x_t} k(x')\,dx' + \frac{\pi}{4}\right) \;\leftrightarrow\;$

$\frac{C}{\sqrt{|k(x)|}}\exp\left(-\int_{x_t}^{x} k(x')\,dx'\right) + \frac{D}{\sqrt{|k(x)|}}\exp\left(\int_{x_t}^{x} k(x')\,dx'\right); \qquad [x \gg x_t],$

FIGURE 5.4: The matching procedure for a particle of unit mass, moving in a potential −cos(x/4). The red curve is an accurate numerical solution. The bound state energy, found as −0.17891517, is shown as the black horizontal line – it crosses the potential at the turning point (black vertical line). The green line is a WKB solution. The part left of the turning point was calculated starting off with the correct boundary condition ψ(x = 0) = 0 at x = 0. The part right of the turning point started off near x = 10 as a decaying solution. Both WKB parts fail close to the turning point. The blue curve is the Airy function solution matched to the exact solution at the turning point. It describes the solution well near that point but deviates from it far away from the turning point.

with the appropriate expression for k(x) which is always real.


For a turning point with a classical region on the right rather than on the left we have:

$\psi(x \ll x_t) = \frac{C}{\sqrt{|k(x)|}}\exp\left(-\int_x^{x_t} k(x')\,dx'\right) + \frac{D}{\sqrt{|k(x)|}}\exp\left(\int_x^{x_t} k(x')\,dx'\right) \;\leftrightarrow\;$

$\frac{2C}{\sqrt{|k(x)|}}\sin\left(\int_{x_t}^{x} k(x')\,dx' + \frac{\pi}{4}\right) + \frac{D}{\sqrt{|k(x)|}}\cos\left(\int_{x_t}^{x} k(x')\,dx' + \frac{\pi}{4}\right); \qquad [x \gg x_t].$
In summary, we have a WKB solution which is accurate far away from the turning point. The
Airy function solution is accurate close to the turning point. The exact solution satisfies the
two limiting cases (close to and far away from) the turning point. The situation is represented
in figure 5.4.
Now we can apply these matching expressions to several potentials. Suppose we have a
potential well which is bounded by an infinite potential barrier on the left hand side, and a
continuous turning point on the right hand side. For large x, we do not want the solution to
explode, hence we have D = 0 there. So we are left with the solution

$\frac{2C}{\sqrt{|k(x)|}}\sin\left(\int_x^{x_t} k(x')\,dx' + \frac{\pi}{4}\right)$

within the well and

$\frac{C}{\sqrt{|k(x)|}}\exp\left(-\int_{x_t}^{x} k(x')\,dx'\right)$

outside. Note that these solutions need not match at x = x t as these are the approximate WKB
forms which should only hold far enough from the turning point. Now we must require the
solution to vanish near the infinite wall. There, the WKB solution must be valid (within the
WKB approximation). This boundary condition directly leads to

$\int_{x}^{x_t} k(x')\,dx' + \pi/4 = n\pi,$

where x here denotes the position of the infinite wall.

If we have a potential well bounded by a turning point on the left and one on the right
hand side, the solution inside the well is given by

$\frac{2C}{\sqrt{|k(x)|}}\sin\left(\int_x^{x_R} k(x')\,dx' + \frac{\pi}{4}\right)$

and by

$\frac{D}{\sqrt{|k(x)|}}\cos\left(\int_{x_L}^{x} k(x')\,dx' + \frac{\pi}{4}\right),$

where the first expression is based on the right turning point $x_R$ and the second one on the
left turning point $x_L$. Requiring both solutions to be identical leads to

$\int_x^{x_R} k(x')\,dx' + \frac{\pi}{4} = \pm\int_{x_L}^{x} k(x')\,dx' + \frac{\pi}{4} + (n + 1/2)\pi,$

which follows directly from the notion that sin x = cos(π/2 − x). The last equation should hold
for all x and this can be the case only for the + sign before the first term on the right hand side.
We then obtain the condition

$\int_{x_L}^{x_R} k(x')\,dx' = (n + 1/2)\pi.$
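As a check of this condition (a sketch with illustrative parameters, not part of the notes; SciPy is assumed for the root finder), one can solve ∫k dx = (n + 1/2)π numerically for the harmonic oscillator V(x) = ½mω²x², for which the WKB condition happens to reproduce the exact levels ℏω(n + 1/2):

import numpy as np
from scipy.optimize import brentq

hbar, m, omega = 1.0, 1.0, 1.0

def action(E):
    """Integral of k(x) between the two turning points of the harmonic oscillator."""
    xt = np.sqrt(2.0 * E / (m * omega ** 2))      # turning points at +/- xt
    x = np.linspace(-xt, xt, 20001)
    k = np.sqrt(np.maximum(2.0 * m * (E - 0.5 * m * omega ** 2 * x ** 2), 0.0)) / hbar
    return np.trapz(k, x)

for n in range(4):
    # solve action(E) = (n + 1/2)*pi for E; the bracket is chosen generously
    E_n = brentq(lambda E: action(E) - (n + 0.5) * np.pi, 0.01, 20.0)
    print(n, E_n)    # close to hbar*omega*(n + 1/2)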

Final Summary
The WKB approximation leads to practical schemes for approximating tunneling
amplitudes and bound state energies. For tunnelling, we have found that the trans-
mission probability T for tunnelling through a barrier V(x), is given as

$T = e^{-2\gamma},$

where

$\gamma = \int_{x_L}^{x_R}\kappa(x')\,dx', \qquad \kappa(x) = \sqrt{\frac{2m}{\hbar^2}\left(V(x) - E\right)},$
and x L,R are the left and right classical turning points where E = V (x).
We have concluded that the WKB form, which has a limited domain of validity, can
be extended with an Airy function solution close to the points where E = V (x). This
leads to rather simple conditions for finding the bound state energies of quantum
particles in one dimension (see the figure below).

1. For a particle in a well with two vertical walls (i.e. jumps in the potential at $x_L$ and $x_R$, both crossing the value E), the condition for having a bound state is

$\int_{x_L}^{x_R} k(x)\,dx = n\pi, \quad n = 1, 2, 3, \ldots, \qquad k(x) = \sqrt{\frac{2m}{\hbar^2}\left(E - V(x)\right)}.$

2. For a particle in a well with a single vertical wall (i.e. a jump in the potential
energy at x L crossing the value E , and a slope which reaches the value E at
$x = x_R$), the WKB condition for a bound state is:

$\int_{x_L}^{x_R} k(x)\,dx = \left(n - \frac{1}{4}\right)\pi, \quad n = 1, 2, 3, \ldots$

3. For a particle in a well with two sloping walls (both crossing the value E, at $x_L$ and $x_R$), the condition for having a bound state is

$\int_{x_L}^{x_R} k(x)\,dx = \left(n - \frac{1}{2}\right)\pi, \quad n = 1, 2, 3, \ldots$

[Sketch: the three well types discussed above, V(x) with the energy E indicated in each panel – (1) two vertical walls, (2) one vertical and one sloping wall, (3) two sloping walls.]

5.7 PROBLEMS
1. Consider a one-dimensional potential

$V(x) = \begin{cases} \lambda x & \text{for } 0 < x < a; \\ \infty & \text{for } x \le 0 \text{ and } x \ge a. \end{cases}$

Show that the ground state energy of a particle with mass m in this potential is, in the
WKB approximation, given as the solution to the implicit equation

$E^{3/2} - (E - E_0)^{3/2} = \frac{3}{2}\frac{\pi\hbar\lambda}{\sqrt{2m}},$

where $E_0 = \lambda a$. Note that for $E_0 = 1 = \frac{3}{2}\frac{\pi\hbar\lambda}{\sqrt{2m}}$, the solution to this equation is given by
E = 1. Please comment on the suitability of the WKB method in this limit. Compare
this result with that obtained in problem 3 of chapter 4.

2. Consider a particle moving in one dimension in a ‘quartic potential’. The Hamiltonian


is given as

$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{b}{4}x^4,$
with b some positive constant.
Solve the spectrum using the WKB approximation. Compare your result with that of
problem 4 of chapter 4.

3. Using the WKB approximation, derive a formula for the energies of the bound s-states
of a particle of mass m in a potential $V(r) = -V_0\exp(-r/R)$ with $V_0$ and R
both positive.
4. Use the WKB approximation to find the allowed energies ($E_n$) of an infinite square well
with a ‘shelf’ of height $V_0$, extending half-way across:

$V(x) = \begin{cases} V_0 & \text{if } 0 < x < a/2, \\ 0 & \text{if } a/2 < x < a, \\ \infty & \text{otherwise}. \end{cases}$

Express your answer in terms of $V_0$ and $E_n^0 \equiv (n\pi\hbar)^2/(2ma^2)$ (the $n$th allowed energy
for the ‘unperturbed’ infinite square well, with no shelf). Assume that $E_1^0 > V_0$, but do
not assume that $E_n \gg V_0$. Compare your result with that for the same problem in first-order
perturbation theory:

$E_n = E_n^0 + \frac{V_0}{2}.$

Note that they are in agreement if either $V_0$ is very small (perturbative regime) or when
n is very large (semi-classical WKB regime).

5. In WKB, when we have a system with two infinite walls at x L and x R with a classical
regime [E > V(x)] in between, we have the quantisation condition

$\int_{x_L}^{x_R} p(x)\,dx = n\pi\hbar.$

For a system with an infinite wall at $x_L$ and a single classical turning point at $x_R$, this
quantisation condition is replaced by

$\int_{x_L}^{x_R} p(x)\,dx = (n - 1/4)\pi\hbar$

for positive, integer n.

Now consider a potential given by

$V(x) = \begin{cases} mgx & \text{if } x > 0, \\ \infty & \text{if } x < 0. \end{cases}$

This describes a ball of mass m moving along a straight sloped track, against a wall on
the left end of the track. Find the turning point for an energy E. Find the energy levels
from the WKB quantisation condition. Give the first three levels in units of $(mg^2\hbar^2)^{1/3}$.
Compare your results with the exact values (units of $(mg^2\hbar^2)^{1/3}$):

$E_0 = 1.8558; \quad E_1 = 3.2446 \quad\text{and}\quad E_2 = 4.3817.$



6. The graph below shows the relation between the resistance and the length, measured in
numbers of carbon atoms, of alkane chains. In the experimental setups used, molecules
of different lengths were connected to two electrodes.

[Graph: measured resistance of the molecular junctions versus alkane chain length, in numbers of carbon atoms.]

A small bias voltage was then applied across the electrodes, and the current flow was
recorded.

(a) Explain the fact that for each experiment, a straight line connects the data in this
graph.
(b) The tunneling takes place via a so-called ‘molecular orbital’: this is an electronic
state on the (uncoupled) molecule at a definite energy.
Find approximate values of the location of the molecular orbitals in the experi-
ment with respect to the Fermi energy of the gold.
You will need the length per CH2 unit in an alkane chain. The carbon atoms are ar-
ranged in a zigzag pattern with an angle of about 109° between successive atoms.
The distance between those atoms is 1.5 Å. A good guess for the distance is there-
fore about 2 Bohr radii.
The literature value for the difference between the potential and the fermi energy
is 0.2 eV. If your value differs, can you give arguments why this would be the case?
6
GREEN’S FUNCTIONS IN QUANTUM MECHANICS

6.1 INTRODUCTION
Green’s functions are the workhorses of theoretical quantum mechanics. They are used in
many subfields of quantum mechanics because they are very powerful. Nevertheless, to
many researchers, Green’s functions often seem abstract and difficult. Sometimes this is
right, but the use of Green’s functions for quantum systems in which the interactions be-
tween particles are not explicitly considered is not so complicated. And they can be useful
even for such noninteracting systems. In this chapter, we shall explain what the Green’s func-
tion is and how it can be used for analysing different types of problems. In chapter 7 we shall
use Green’s functions when discussing scattering theory.

6.2 DEFINITION OF THE GREEN’S FUNCTION


The Green’s function of a system described by a Hamiltonian H is defined as

(z − H )G = 1, (6.1)

where the right hand side is the unit operator. This operator can have different forms, de-
pending on the structure of the Hilbert space. If that space is a finite-dimensional vector
space, the unit operator can be written as the unit matrix

$\mathbb{1} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$

In the case where the Hilbert space consists of the normalizable complex functions defined
on the real axis, the matrix elements of the unit operator are given in terms of a delta function

$\langle x|\mathbb{1}|x'\rangle = \delta(x - x').$

For the Hilbert space of particles moving in 3D, i.e. the space consisting of normalizable functions
($L^2$-functions) on $\mathbb{R}^3$, the matrix elements are given by the three-dimensional delta-function $\delta^{(3)}(\mathbf{r} - \mathbf{r}')$.
The Green’s function may seem a rather arbitrary object – it is not immediately clear why
this function could be useful in any way. Moreover, it is defined as the inverse of an operator,
and that is usually difficult to find. It is in particular not obvious what information we could
obtain from this inverse, while we could instead diagonalize H (which is, numerically, equally


difficult as inversion). In order to give some insight into these questions, we must recall an
important result from complex function theory (see the end of chapter 1):

$\lim_{\epsilon\downarrow 0}\frac{1}{x + i\epsilon} = P\left(\frac{1}{x}\right) - i\pi\delta(x).$
This turns out very useful as can be seen by expanding the Green's function for a system with
a discrete spectrum in the basis consisting of the eigenstates $|\phi_n\rangle$ of H:

$G(z) = \frac{1}{z - H} = \sum_n |\phi_n\rangle\frac{1}{z - E_n}\langle\phi_n|.$
Here, z can in principle be any complex number, but we decide to choose it close to, but
above the real axis. We then obtain the retarded Green's function

$G^r(E) = \frac{1}{E - H + i\eta},$
where E is real and η is considered to be small and positive. We only give E as an argument
of the Green’s function; the superscript ‘r’ indicates that we have moved the energy slightly
upward (i.e. to the positive imaginary part) in the complex plane.
We have

$G^r(E) = \sum_n |\phi_n\rangle\, P\left(\frac{1}{E - E_n}\right)\langle\phi_n| - i\pi\sum_n |\phi_n\rangle\,\delta(E - E_n)\,\langle\phi_n|.$

This Green's function is an operator depending on E (or more generally, on z), and we would
like to work with a simpler object. We therefore study the trace of the Green’s function. The
trace of an operator is defined as the sum over the diagonal elements of that operator:

$\mathrm{Tr}(\hat{A}) = \sum_n \langle\phi_n|\hat{A}|\phi_n\rangle,$

where the vectors $|\phi_n\rangle$ form an orthonormal basis. It can be shown that the trace is independent of the particular basis chosen – for the Green's function, we take the basis consisting of
the eigenstates of the Hamiltonian and find:

$\mathrm{Tr}\,G(z) = \sum_n \frac{1}{z - E_n}.$
We see that the trace of the Green’s function has a simple pole on the real axis at every energy
eigenvalue E n .
We have learned two important things: (i) the trace of the Green’s function is a complex
function which has poles on the real axis which correspond to the eigenvalues of H and (ii)
at these poles the imaginary part of the Green's function (not its trace) is proportional to
$|\phi_n\rangle\langle\phi_n|$, which is a projection operator onto the corresponding eigenstate $\phi_n$. We see that

having the Green’s function is equally useful as having the eigenstates and eigenfunctions of
the Hamiltonian. The reason why we often use Green’s functions is that it is often possible to
obtain them for systems for which the Hamiltonian cannot be diagonalised. An example is
formed by a closed rather than an open system, as we shall see below.
We can also conclude that the trace of the imaginary part of the (retarded) Green’s func-
tion gives a series of δ-functions, one for each energy:
$\lim_{\eta\downarrow 0}\mathrm{Tr}\left[\mathrm{Im}\,G(E + i\eta)\right] = -\pi\sum_n\delta(E - E_n).$
This is an example of a general result which says that the imaginary part of the trace of the
Green’s function is proportional to the density of states of a system.
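This connection is easy to verify for a small matrix Hamiltonian. The following sketch (my own toy example, not from the notes) evaluates −Im Tr G(E + iη)/π on an energy grid and shows that it is a sum of Lorentzians of width η centred at the eigenvalues:

import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
H = (A + A.T) / 2                      # Hermitian (real symmetric) Hamiltonian
E_n = np.linalg.eigvalsh(H)

eta = 0.05
E_grid = np.linspace(E_n.min() - 1, E_n.max() + 1, 1000)
dos = []
for E in E_grid:
    G = np.linalg.inv((E + 1j * eta) * np.eye(6) - H)   # G(z) = (z - H)^(-1)
    dos.append(-np.trace(G).imag / np.pi)

# the broadened density of states peaks at the eigenvalues; its integral is Tr(1) = 6
print(E_n)
print(np.trapz(dos, E_grid))           # close to 6 (up to the Lorentzian tails)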
The Green’s function is often powerful in studying quantum systems, as we already men-
tioned in the introduction. To be specific, (i) the Green’s function is useful for working out
perturbation series, (ii) it plays a major role in scattering theory (again when scattering is formulated as a perturbative problem) and (iii) the Green's function has a local character: we can
evaluate it for a particular region, and it encodes the influence which this region has on adja-
cent regions. In this chapter we shall briefly go into these applications of Green’s functions.

6.3 GREEN’S FUNCTIONS AND PERTURBATIONS


There exists a very important equation that is quite simple but turns out very powerful for
perturbative problems. To obtain this equation, let us first formulate a perturbative quantum
problem by splitting its Hamiltonian as

H = H0 + V,

where V is ‘small’ in some sense. Usually we mean small with respect to the typical distance
between the energy eigenvalues of H0 or, in the case of a continuous spectrum, small with
respect to the typical eigenenergy of H0 measured with respect to the ground state energy.
We define G 0 as the Green’s function of the unperturbed Hamiltonian H0 :

(z − H0 )G 0 = 1, (6.2)

where the unit operator on the right hand side is the same as above, i.e. its form depends on
the Hilbert space of the system.
Now it is very easy to obtain the following result:

$z - H = G^{-1} = z - H_0 - V = G_0^{-1} - V.$

Multiplying the second and the fourth form of this equation from the left with $G_0$ and from
the right with G, we obtain, after some reorganisation:

$G = G_0 + G_0 V G. \qquad (6.3)$

It seems that we have not made much progress, as this is an implicit equation for the Green’s
function G. However, V is small, and this inspires us to take the expression for G and plug it
into the right hand side of this equation. We then obtain:

$G = G_0 + G_0 V G_0 + G_0 V G_0 V G.$

The second term on the right hand side contains one V , and the third terms contains two V ’s.
This means that for small V , the third term is a lot smaller than the second one, and if we
neglect this very small term, we obtain an explicit equation for G:

$G = G_0 + G_0 V G_0.$

We can iterate further and further, each time replacing the G on the right hand side by the full
right hand side, and thus obtain an infinite perturbation series:

$G = G_0 + G_0 V G_0 + G_0 V G_0 V G_0 + G_0 V G_0 V G_0 V G_0 + \ldots \qquad (6.4)$

The terms on the right hand side contain increasingly higher-order contributions in V to the
unperturbed Green’s function. Eq. (6.3) is the famous Dyson equation. It is a very important
equation which is used in many fields of physics. Eq. (6.4) is called the Born series. Cutting
this off after the first-order term (in V) is called the first Born approximation, after the second
order term in V it is the second Born approximation, etcetera.
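The convergence of the Born series is easily illustrated with matrices. The sketch below (toy matrices of my own choosing, not taken from the notes) compares the exact $G = (z - H_0 - V)^{-1}$ with the truncated series; each extra term $G_0(VG_0)^n$ reduces the difference, provided V is small enough.

import numpy as np

z = 2.0 + 0.1j
H0 = np.diag([0.0, 1.0, 3.0, 4.0])
V = 0.1 * np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)

G0 = np.linalg.inv(z * np.eye(4) - H0)
G_exact = np.linalg.inv(z * np.eye(4) - H0 - V)

G_born = G0.copy()
term = G0.copy()
for order in range(1, 5):                 # add terms G0 (V G0)^order
    term = term @ V @ G0
    G_born = G_born + term
    print(order, np.max(np.abs(G_born - G_exact)))   # error shrinks with the order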

6.3.1 SYSTEMS WITH DISCRETE SPECTRA


In order to clarify the relation between the Born series and standard perturbation theory
which you have learned in your previous quantum mechanics course, let us have a look at the
first-order equation and check whether this reproduces the results of standard perturbation
theory which is set up without Green’s functions (see e.g. Griffiths Chapter 6). For simplicity,
we take a nondegenerate Hamiltonian which has the unperturbed Green's function:

$G_0(z) = \sum_n |\phi_n\rangle\frac{1}{z - E_n}\langle\phi_n|.$

Now we switch on the perturbation V . The first order approximation to the Green’s function
is
$G(z) = G_0 + G_0 V G_0.$

We can easily evaluate the trace of this first order Green's function, using the basis of the
eigenstates $|\phi_n\rangle$ of the unperturbed Hamiltonian:

$\mathrm{Tr}\left[G(z)\right] = \sum_n\frac{1}{z - E_n} + \sum_n\frac{1}{z - E_n}\langle\phi_n|V|\phi_n\rangle\frac{1}{z - E_n}.$

Here we have used that $G_0$ is diagonal in the basis $|\phi_n\rangle$. For small V, we note that

$\frac{1}{z - E_n - \langle\phi_n|V|\phi_n\rangle} \approx \frac{1}{z - E_n} + \frac{1}{(z - E_n)^2}\langle\phi_n|V|\phi_n\rangle + \frac{1}{(z - E_n)^3}\langle\phi_n|V|\phi_n\rangle^2 + \ldots, \qquad (6.5)$

showing that, to first order in V, the trace of the Green's function may be written as the left
hand side of the last equation, which has poles at $E_n + \langle\phi_n|V|\phi_n\rangle$. This is equivalent to the
result of standard first order perturbation theory which says that the energy correction to $E_n$
due to a perturbative potential V is given by $\langle\phi_n|V|\phi_n\rangle$ (see also Desai Eq. (16.26) on page
280, or Griffiths, Ch. 6).


Now we consider the second order expansion:

$G = G_0 + G_0 V G_0 + G_0 V G_0 V G_0.$

Taking the trace gives (we use the obvious notation $\langle\phi_n|V|\phi_m\rangle \equiv V_{nm}$):

$\mathrm{Tr}\,G(z) = \sum_n\frac{1}{z - E_n} + \sum_n\frac{1}{z - E_n}V_{nn}\frac{1}{z - E_n} + \sum_{nm}\frac{1}{z - E_n}V_{nm}\frac{1}{z - E_m}V_{mn}\frac{1}{z - E_n}.$

We first note that for n = m in the sum in the rightmost term we obtain the second order term
of the expansion of $1/(z - E_n - V_{nn})$; see Eq. (6.5). Therefore, we only need to add the contributions $n \neq m$ in the last term to first order in a Taylor expansion of the Green's function, and
obtain:

$\mathrm{Tr}\,G(z) = \sum_n\frac{1}{z - E_n - V_{nn} - V_{nn}^{(2)}}$

with

$V_{nn}^{(2)} = \sum_{m\neq n}\frac{V_{nm}V_{mn}}{z - E_m}.$

The poles of the Green’s function will occur at E n plus a first order correction in V . So, we
can replace z in the last expression by E n and obtain the result that the pole will be shifted in
second order by

$\sum_{m\neq n}\frac{V_{nm}V_{mn}}{E_n - E_m}.$

This is then recognised as the second order correction known from standard perturbation
theory (see e.g. Desai, Eq. (16.28) or Griffiths, Ch. 6).
We see that the Born series allows us to easily find higher-order corrections to the ener-
gies. To find the corrections to the corresponding wave functions is a bit more complicated.
We return to that problem in section 6.3.3.
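As a numerical sanity check (a sketch with a toy Hamiltonian of my own choosing, not part of the notes), the first plus second order shifts $V_{nn} + \sum_{m\neq n}V_{nm}V_{mn}/(E_n - E_m)$ can be compared directly with exact diagonalisation:

import numpy as np

E0 = np.array([0.0, 1.0, 2.5, 4.0])           # unperturbed, nondegenerate levels
rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
V = 0.05 * (B + B.T)                          # small Hermitian perturbation

exact = np.linalg.eigvalsh(np.diag(E0) + V)

for n in range(4):
    shift2 = sum(V[n, m] * V[m, n] / (E0[n] - E0[m]) for m in range(4) if m != n)
    E_pert = E0[n] + V[n, n] + shift2
    print(n, E_pert, exact[n])                # agree up to terms of order V^3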

6.3.2 SYSTEMS WITH CONTINUOUS SPECTRA


Now we turn to a system with a continuous spectrum. The operator H0 is chosen such as to
be easily diagonalizable:
$H_0|\phi_k\rangle = E_k|\phi_k\rangle,$

where k is now a continuous index.



We take the same energy $E_k$ and define the eigenstate of the full Hamiltonian at that energy, $|\psi_k\rangle$:

$H|\psi_k\rangle = E_k|\psi_k\rangle.$

Note that we assume that the energy E k is in the continuous spectrum of both the unper-
turbed and the perturbed Hamiltonian. This is the case for systems which are of interest to
us.
The unperturbed Hamiltonian $H_0$ was assumed to be diagonalized, and we anticipate
that in general the $|\psi_k\rangle$ are difficult to find. We use perturbation theory for this problem. We
can write the last equation in the form

$(E_k - H_0)|\psi_k\rangle = V|\psi_k\rangle$

and combine this with


$(E_k - H_0)|\phi_k\rangle = 0.$

Now we can write

$|\psi_k\rangle = |\phi_k\rangle + (E_k - H_0)^{-1}V|\psi_k\rangle.$

(From now on, we leave out the subscript k to the wave functions ψ and φ.) We have obtained an implicit equation for the wave function $|\psi\rangle$, similar to the Dyson equation found
in the previous section. We recognize the Green's function of the unperturbed system as the
operator in front of V on the right hand side of this equation:

$|\psi\rangle = |\phi\rangle + G_0 V|\psi\rangle.$

Note that the second term on the right hand side is the perturbation of the wave function
$|\phi\rangle$ at a fixed eigenenergy due to the presence of the perturbation V. Similar to the approach
taken for the Dyson equation, we can try to solve this equation iteratively. The lowest order
approximation is obtained by replacing ψ on the right hand side by φ:

$|\psi\rangle = |\phi\rangle + G_0 V|\phi\rangle.$

This is the so-called first Born approximation, often abbreviated as ‘Born approximation’. We
may also insert the full expression for ψ into the equation and obtain:

$|\psi\rangle = |\phi\rangle + G_0 V|\phi\rangle + G_0 V G_0 V|\psi\rangle,$

etcetera. This expression just given is the second Born approximation. This Born approxima-
tion is frequently used in scattering theory.

6.3.3 DISCRETE SPECTRA REVISITED


Now that we have analysed the energy corrections for discrete spectra in section 6.3.1 and the
wave function corrections for continuous spectra in the previous section, we combine
the techniques used in both in order to analyse the change in the wave functions for discrete
problems.
We start from the unperturbed Schrödinger equation

$H_0|\psi_n^{(0)}\rangle = E_n^{(0)}|\psi_n^{(0)}\rangle$

and its ‘full’ version:

$H|\psi_n\rangle = E_n|\psi_n\rangle, \qquad (6.6)$

where $H = H_0 + V$. The energy has shifted: $E_n = E_n^{(0)} + \delta E_n$ and the wave function has too:
$|\psi_n\rangle = |\psi_n^{(0)}\rangle + |\delta\psi_n\rangle$.

Therefore, Eq. (6.6) can be written in the form

$(H_0 + V)\left(|\psi_n^{(0)}\rangle + |\delta\psi_n\rangle\right) = \left(E_n^{(0)} + \delta E_n\right)\left(|\psi_n^{(0)}\rangle + |\delta\psi_n\rangle\right).$

Using the unperturbed Schrödinger equation, this can directly be reworked to give

$\left(E_n^{(0)} - H_0\right)\left(|\psi_n\rangle - |\psi_n^{(0)}\rangle\right) = (V - \delta E_n)|\psi_n\rangle.$

Multiplying this equation by $G_0 = \left(E_n^{(0)} - H_0 + i\eta\right)^{-1}$ directly gives

$|\psi_n\rangle = |\psi_n^{(0)}\rangle + G_0(V - \delta E_n)|\psi_n\rangle.$

The right hand side can then be expanded as a Born series:

$|\psi_n\rangle = |\psi_n^{(0)}\rangle + G_0(V - \delta E_n)|\psi_n^{(0)}\rangle + G_0(V - \delta E_n)G_0(V - \delta E_n)|\psi_n^{(0)}\rangle + \ldots$
Now we can use the first order correction to the eigenvalue, $\delta E_n = \langle\psi_n^{(0)}|V|\psi_n^{(0)}\rangle$, found in
section 6.3.1, in order to find the first order correction to $|\psi_n^{(0)}\rangle$:

$|\psi_n\rangle = |\psi_n^{(0)}\rangle + G_0(V - \delta E_n)|\psi_n^{(0)}\rangle = |\psi_n^{(0)}\rangle + \sum_m |\psi_m^{(0)}\rangle\frac{1}{E_n - E_m + i\eta}\langle\psi_m^{(0)}|(V - \delta E_n)|\psi_n^{(0)}\rangle.$
First note that the term for m = n vanishes (why?). For all other terms, the part proportional
to δE vanishes too (why?) and we are left with

$|\psi_n\rangle = |\psi_n^{(0)}\rangle + \sum_{m\neq n}|\psi_m^{(0)}\rangle\frac{\langle\psi_m^{(0)}|V|\psi_n^{(0)}\rangle}{E_n - E_m}.$

This result is well known from stationary perturbation theory.

6.4 GREEN’S FUNCTIONS AND BOUNDARIES


Green's functions are very useful for calculations on very complex systems, containing bulk-like
and microscopic parts. As an illustration, we consider an atom near a crystal surface, where
the atom has a single energy level $\epsilon_a$. The energy spectrum of the bulk crystal is continuous

H = HB + Ha + HT ,

where HB is the Hamiltonian of the bulk crystal, Ha is that of the atom and HT is the coupling
Hamiltonian between the atom and the crystal. The Hilbert space of the problem can be
written as
H = H a ⊕ H B,
where H a is the one-dimensional space spanned by the atomic eigenstate. In matrix form,
this Hamiltonian reads:

$H = \begin{pmatrix} \epsilon_a & -\tau \\ -\tau^\dagger & H_B \end{pmatrix}.$

Here, τ is the coupling between atom and bulk, and we take for the energy of the isolated
atom the value $\epsilon_a$. More generally, we can write

$H = \begin{pmatrix} H_S & -\tau \\ -\tau^\dagger & H_B \end{pmatrix},$

where the subscript S stands for ‘system’. This Hamiltonian is supposed to be small enough
that we can diagonalize it. Denoting the dimension of the Hilbert spaces of the system and
the bulk system by D S and D B , we see that HB is a D B ×D B matrix, HS is a D S ×D S -sized matrix
and τ is of size D S × D B .
The solution to this problem seems very difficult. Perhaps we can solve for the bulk spec-
trum and states using Bloch’s theorem, and the atom by itself is easy (only a single level with
a known energy). But when we couple the two, the problem seems to get hopelessly compli-
cated. However, we shall now show that it is possible to find the atomic part of the Green’s
function of the coupled system!
We write the Green's function of the coupled system as the matrix

$G(z) = \begin{pmatrix} G_S & A^\dagger \\ A & G_B \end{pmatrix}.$
All the submatrices depend on z but we refrain from indicating that explicitly.
The Green's function is the inverse of z − H so we can write:

$\begin{pmatrix} z - H_S & \tau \\ \tau^\dagger & z - H_B \end{pmatrix}\begin{pmatrix} G_S & A^\dagger \\ A & G_B \end{pmatrix} = \begin{pmatrix} \mathbb{1}_S & 0 \\ 0 & \mathbb{1}_B \end{pmatrix}.$
We extract the following two equations from this:

$(z - H_S)G_S + \tau A = \mathbb{1}_S,$

$\tau^\dagger G_S + (z - H_B)A = 0.$

(Note that the 0 on the right hand side is a $D_B \times D_S$ matrix). The second of these equations
allows us to eliminate A:

$A = -(z - H_B)^{-1}\tau^\dagger G_S,$
which we can use to remove A from the first equation to obtain

$G_S = (z - H_S - \Sigma)^{-1},$

where

$\Sigma = \tau g_B \tau^\dagger$

and

$g_B = (z - H_B)^{-1}.$
By considering the sizes of the matrices $g_B$ ($D_B \times D_B$) and τ ($D_S \times D_B$) it is found that Σ is a
matrix of size $D_S \times D_S$. This matrix encodes all the influence of the bulk onto the system.
The operator Σ is called the self-energy. This operator is not Hermitian. It can be written
as
Σ = Λ + iΓ,
where Λ and Γ are Hermitian. This is very important to interpret how the presence of a bulk
system influences the discrete spectrum of a system. The effect of Λ is to shift the discrete
energy levels. The effect of Γ is to broaden the discrete level into a Lorentzian peak. To see
this, let us consider a system with a one-dimensional Hilbert space, e.g. an atom with a single
energy level $\epsilon_a$. The Green's function of the atom close to the bulk system is then

$G_S = \left(z - \epsilon_a - \lambda - i\gamma\right)^{-1},$

FIGURE 6.1: The Lorentz curve for $\epsilon_a + \lambda = 1$ and different values of γ (γ = 0.4, 0.2, 0.1).

where the use of the lower case letters λ and γ indicates that they correspond to the small
system. Calculating the density of states yields

$\mathrm{DOS}(E) = \frac{1}{\pi}\mathrm{Im}\left[\frac{1}{E - \epsilon_a - \lambda - i\gamma}\right] = \frac{1}{\pi}\frac{\gamma}{(E - \epsilon_a - \lambda)^2 + \gamma^2}.$

Figure 6.1 shows this function for various values of γ.
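The Lorentzian shape is trivial to reproduce numerically. The sketch below (with illustrative values of ε_a, λ and γ chosen to match the figure; not part of the notes) evaluates the density of states directly from the Green's function of the single level:

import numpy as np

eps_a, lam = 0.8, 0.2          # level energy and level shift, so eps_a + lam = 1
E = np.linspace(-0.5, 2.5, 601)

for gamma in (0.4, 0.2, 0.1):
    G_S = 1.0 / (E - eps_a - lam - 1j * gamma)
    dos = G_S.imag / np.pi                      # the Lorentzian derived above
    print(gamma, E[np.argmax(dos)], dos.max())  # peak at eps_a + lam, height 1/(pi*gamma)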

6.5 SUMMARY
In this chapter, we have encountered different uses of Green’s functions. First, they are con-
venient for analysing different forms of perturbation theory. We have used Green’s functions
for calculating energy and wave function corrections for discrete spectra, and analysed the
wave functions for continuous spectra. All these methods were based on the Dyson equa-
tion, which relates the Green’s function G for a system described by a Hamiltonian

H = H0 + V

to the Green’s function G 0 of the unperturbed system:

$G = G_0 + G_0 V G.$

Iterating this equation gives the Born series

$G = G_0 + G_0 V G_0 + G_0 V G_0 V G_0 + \ldots$

Finally we have used the Green’s function concept to study the behaviour of a system
which is embedded into another, usually larger, system. We have seen that the effect of the
environment on a system which is coupled to that environment, can be captured by a self-
energy Σ, which is in general a non-hermitian operator. The Green’s function of the embed-
ded system S is then given as
$G(z) = (z - H_S - \Sigma)^{-1},$

where, for an environment with Green's function $g_B$, Σ is given by

$\Sigma = \tau g_B \tau^\dagger.$

Here τ is the part of the Hamiltonian which contains the coupling between system and bath.

For a single level, it is easy to see that the self-energy has two effects: its real part shifts
the energy, and its imaginary part broadens the level to a Lorentzian density of states.
Green’s functions are very important in the study of interacting many-body systems. In
that case, they are defined in a more general way. The interaction is treated in a perturbative
way and the equations satisfied by the Green’s functions can be illustrated in a more or less
straightforward way by diagrams – these are called Feynman diagrams.

6.6 PROBLEMS
1. Consider a system of N noninteracting particles in a box. The particle states in the box
are denoted $|\phi_n\rangle$, n = 1, 2, . . .. The eigenenergies of these states are (in ascending order)
$\epsilon_n$, n = 1, 2, . . .. We assume that these energies are nondegenerate.

(a) Write the Green's function of the one-particle system in terms of the single particle
states $\phi_n$ and their eigenenergies $\epsilon_n$.
(b) The ground state of the N-particle system is obtained by filling the lowest N eigenstates $\phi_n(\mathbf{r})$. The density is then given by

$n(\mathbf{r}) = \sum_{n=1}^{N}\left|\phi_n(\mathbf{r})\right|^2.$
Now consider the Green’s function G(z). A famous result from complex analysis
(the residue theorem, see ch. 1 of the lecture notes) tells us that the line integral
over a closed contour Γ in the complex plane of a complex function f (z) is given
by

$\oint_\Gamma f(z)\,dz = 2\pi i\sum_k \mathrm{res}_{a_k} f(z),$

where $a_k$ are all the singularities (‘poles’) of f(z) within the contour Γ and $\mathrm{res}_a f(z) = \lim_{z\to a}(z - a)f(z)$. Using the residue theorem, show that the density can be writ-
ten as
$\frac{1}{2\pi i}\oint_\Gamma \langle\mathbf{r}|G(z)|\mathbf{r}\rangle\,dz = n(\mathbf{r}),$

where the contour Γ encloses the energy eigenvalues $\epsilon_n$, n = 1, . . . , N on the real
axis. This equation is frequently used in quantum transport.

2. We define a Green's function

$\langle k|e^{-iH(t - t_0)/\hbar}|j\rangle = iG(k, t; j, t_0),$

where |k〉 and | j 〉 are basis functions.

(a) How would you interpret this expression?


(b) Show that the Green’s function generates the time evolution, which, for a particle
moving in three dimensions, using |r〉 as basis states, reads:

$\psi(\mathbf{r}', t') = i\int G(\mathbf{r}', t'; \mathbf{r}, t)\psi(\mathbf{r}, t)\,d^3r.$

(c) Using the fact that ψ(r, t ) satisfies Schrödinger’s equation, show that the Green’s
function satisfies
$(i\hbar\partial_{t'} - H)G(\mathbf{r}', t'; \mathbf{r}, t) = 0 \quad\text{for } t' > t$

and that

$iG(\mathbf{r}', t; \mathbf{r}, t) = \delta^3(\mathbf{r} - \mathbf{r}').$
68 6. G REEN ’ S FUNCTIONS IN QUANTUM MECHANICS

(d) Show that for

$H = -\frac{\hbar^2}{2m}\nabla^2,$

the Green's function is

$G(\mathbf{r}', t'; \mathbf{r}, t) = -i\left[\frac{m}{2\pi i\hbar(t' - t)}\right]^{3/2}\exp\left(\frac{im\left|\mathbf{r}' - \mathbf{r}\right|^2}{2\hbar(t' - t)}\right).$

7
SCATTERING IN CLASSICAL AND IN QUANTUM MECHANICS

Scattering experiments are perhaps the most important tool for obtaining detailed informa-
tion on the structure of matter, in particular the interaction between particles. Examples of
scattering techniques include neutron and X-ray scattering for liquids, atoms scattering from
crystal surfaces, elementary particle collisions in accelerators. In most of these scattering ex-
periments, a beam of incident particles hits a target which also consists of many particles.
The distribution of scattering particles over the different directions is then measured, for dif-
ferent energies of the incident particles. This distribution is the result of many individual
scattering events. Quantum mechanics enables us, in principle, to evaluate for an individual
event the probabilities for the incident particles to be scattered off in different directions; and
this probability is identified with the measured distribution.
Suppose we have an idea of what the potential between the particles involved in the
scattering process might look like, for example from quantum mechanical energy calcula-
tions (programs for this purpose will be discussed in the next few chapters). We can then
parametrise the interaction potential, i.e. we write it as an analytic expression involving a set
of constants: the parameters. If we evaluate the scattering probability as a function of the
scattering angles for different values of these parameters, and compare the results with ex-
perimental scattering data, we can find those parameter values for which the agreement be-
tween theory and experiment is optimal. Of course, it would be nice if we could evaluate the
scattering potential directly from the scattering data (this is called the inverse problem), but
this is unfortunately very difficult (if not impossible) as many different interaction potentials
can have similar scattering properties as we shall see below.
Many different motivations for obtaining accurate interaction potentials can be given.
One is that we might use the interaction potential to make predictions about the behaviour
of a system consisting of many interacting particles, such as a dense gas or a liquid.
Scattering might be elastic or inelastic. In the former case the energy is conserved, in
the latter energy disappears. This means that energy transfer takes place from the scattered
particles to degrees of freedom which are not included explicitly in the system (inclusion of
these degrees of freedom would cause the energy to be conserved). In this chapter we shall
consider elastic scattering.

7.1 CLASSICAL ANALYSIS OF SCATTERING


A well known problem in classical mechanics is that of the motion of two bodies attracting
each other by a gravitational force whose value decays with increasing separation r as 1/r 2 .
This analysis is also correct for opposite charges which feel an attractive force of the same
form (Coulomb’s law). When the force is repulsive, the solution remains the same – we only
have to change the sign of the parameter A which defines the interaction potential according


to V (r ) = A/r . One of the key experiments in physics which led to the notion that atoms
consist of small but heavy kernels, surrounded by a cloud of light electrons, is Rutherford
scattering. In this experiment, a thin gold sheet was bombarded with α-particles (i.e. helium-
4 nuclei) and the scattering of the latter was analysed using detectors behind the gold film. In
this section, we shall first formulate some new quantities for describing scattering processes
and then calculate those quantities for the case of Rutherford scattering.
Rutherford scattering is chosen as an example here – scattering problems can be stud-
ied more generally; see Griffiths, chapter 11, section 11.1.1 for a nice description of classical
scattering.
We consider scattering of particles incident on a so-called ‘scattering centre’, which may
be another particle. The scattering centre is supposed to be at rest. This might not always be
justified in a real experiment, but in the standard approach of classical mechanics, the full
two-body problem is reduced to a one-body problem with a reduced mass, which is
the present case (in problem 1 we will perform the same procedure for a quantum two-body
system). The incident particles interact with the scattering centre located at r = 0 by the
usual scalar two-point potential V (r ) which satisfies the requirements of Newton’s third law.
Suppose we have a beam of incident particles parallel to the z-axis. The beam has a homo-
geneous density close to that axis, and we can define a flux, which is the number of particles
passing a unit area perpendicular to the beam, per unit time. Usually, particles close to the
z-axis will be scattered more strongly than particles far from the z-axis, as the interaction
potential between the incident particles and scattering centre falls off with their separation
r . An experimentalist cannot analyse the detailed orbits of the individual particles – instead
a detector is placed at a large distance from the scattering centre and this detector counts
the number of particles arriving at each position. You may think of this detector as a photo-
graphic plate which changes colour to an extent related to the number of particles hitting it.
The theorist wants to predict what the experimentalist measures, starting from the interac-
tion potential V (r ) which governs the interaction process.
In figure 7.1, the geometry of the process is shown. In addition a small cone, spanned
by the spherical polar angles d ϑ and d ϕ, is displayed. It is assumed here that the scattering
takes place in a small neighbourhood of the scattering centre, and for the detector the orbits
of the scattered particles all seem to be directed radially outward from the scattering centre.
The surface d A of the intersection of the cone with a sphere of radius R around the scattering
centre is given by $dA = R^2\sin\vartheta\,d\vartheta\,d\varphi$. The quantity $\sin\vartheta\,d\vartheta\,d\varphi$ is called the solid angle and is
usually denoted by d Ω. This d Ω defines a cone like the one shown in figure 7.1. Now consider
the number of particles which will hit the detector within this small area per unit time. This
number, divided by the total incident flux (see above) is called the differential scattering cross
section, d σ/d Ω:
$\frac{d\sigma(\Omega)}{d\Omega} = \frac{\text{Number of particles leaving the scattering centre through the cone } d\Omega \text{ per unit time}}{\text{Flux of incident beam}}. \qquad (7.1)$
The differential cross section has the dimension of area (length2 ).
First we note that the problem is symmetric with respect to rotations around
the z-axis, so the differential scattering cross section only depends on ϑ. The only two rel-
evant parameters of the incoming particle then are its velocity and its distance b from the
z-axis. This distance is called the impact parameter – it is also shown in figure 7.1.
We first calculate the scattering angle ϑ as a function of the impact parameter b. We per-
form this calculation for the example of Rutherford scattering, for which we have the standard
Kepler solution which is now a hyperbola (see your classical mechanics lecture course). The
potential for the Kepler problem is V (r ) = −A/r . The orbits are given by specifying r (t ), ϑ(t ).
However, for the Kepler problem it is more convenient to focus on the shape of the orbits,
which is given as

$r = \lambda\frac{1 + \epsilon}{\epsilon\cos(\vartheta - C) - 1} \qquad (7.2)$


FIGURE 7.1: Geometry of the scattering process. b is the impact parameter and ϕ and ϑ are the angles of the orbit of the outcoming particle.

with

$\epsilon = \sqrt{1 + \frac{2E\ell^2}{\mu A^2}}; \qquad (7.3)$

this parameter is called the eccentricity – for a hyperbola, we have $|\epsilon| > 1$. Here, ℓ is the angular momentum and μ the reduced mass. The integration constant C reappears in the cosine
lar momentum and µ the reduced mass. The integration constant C reappears in the cosine
because we have not chosen ϑ = 0 at the perihelion – the closest approach occurs when the
particle crosses the dashed line in figure 7.1 which bisects the in- and outgoing particle di-
rection.
We know that for the incoming particles, for which ϑ = π, r → ∞, we have

$\cos(\pi - C) = 1/\epsilon. \qquad (7.4)$

Because of the fact that cosine is even [cos x = cos(−x)] we can infer that the other value of
ϑ for which r goes to infinity, and which corresponds to the outgoing direction, occurs when
the argument of the cosine is C − π, so that we find

$\vartheta_\infty - C = C - \pi, \qquad (7.5)$

or $\vartheta_\infty = 2C - \pi$. The subscript ∞ indicates that this value corresponds to t → ∞. From the
last two equations we find the following relation between the scattering angle $\vartheta_\infty$ and ε:

$\sin(\vartheta_\infty/2) = \cos(\pi/2 - \vartheta_\infty/2) = \cos(\pi - C) = 1/\epsilon. \qquad (7.6)$

We want to know $\vartheta_\infty$ as a function of b rather than ε however. To this end we note that
the angular momentum is given as

$\ell = \mu v_{\mathrm{inc}} b, \qquad (7.7)$

where ‘inc’ stands for ‘incident’, and the total energy as

$E = \frac{\mu}{2}v_{\mathrm{inc}}^2, \qquad (7.8)$

so that the impact parameter can be found as

$b = \frac{\ell}{\sqrt{2\mu E}}. \qquad (7.9)$

Using Eq. (7.3) and the fact that $\cot(x) = \sqrt{1 - \sin^2(x)}/\sin(x)$, we can finally write (7.6) in the
form:

$\cot(\vartheta_\infty/2) = \sqrt{\epsilon^2 - 1} = \frac{2Eb}{|A|}. \qquad (7.10)$
From the relation between b and ϑ∞ we can find the differential scattering cross section.
The particles scattered with angle between ϑ and ϑ + dϑ must have approached the scattering
centre with impact parameters between particular boundaries b and b + db. The number of
particles flowing per unit time through the ring segment with radius b and width db is given
as $j\,2\pi b\,db$, where j is the incident flux. We consider a segment dϕ of this ring. Hence:

$d\sigma(\Omega) = b(\vartheta)\,db\,d\varphi. \qquad (7.11)$

Relation (7.10) can be used to express the right hand side in terms of $\vartheta_\infty$:

$d\sigma(\Omega) = \left(\frac{A}{2E}\right)^2\cot(\vartheta/2)\,d\cot(\vartheta/2)\,d\varphi = \left(\frac{A}{2E}\right)^2\cot(\vartheta/2)\,\frac{d\cot(\vartheta/2)}{d\vartheta}\,\frac{d\vartheta}{d\cos\vartheta}\,d\cos\vartheta\,d\varphi. \qquad (7.12)$

This can be worked out straightforwardly to yield:

$\frac{d\sigma(\Omega)}{d\Omega} = \left(\frac{A}{4E}\right)^2\frac{1}{\sin^4\vartheta/2}. \qquad (7.13)$

This is known as the famous Rutherford formula.
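A quick numerical cross-check of this result (a sketch with arbitrary units, not part of the notes) is to start from the relation b(ϑ) of Eq. (7.10), compute dσ/dΩ = (b/sin ϑ)|db/dϑ| by numerical differentiation and compare with Eq. (7.13):

import numpy as np

A, E = 1.0, 1.0
theta = np.linspace(0.3, np.pi - 0.01, 400)

b = (abs(A) / (2.0 * E)) / np.tan(theta / 2.0)      # b = (|A|/2E) cot(theta/2)
db_dtheta = np.gradient(b, theta)                   # numerical derivative of b(theta)
dsigma_numeric = b / np.sin(theta) * np.abs(db_dtheta)
dsigma_rutherford = (A / (4.0 * E)) ** 2 / np.sin(theta / 2.0) ** 4

print(np.max(np.abs(dsigma_numeric / dsigma_rutherford - 1.0)))  # small relative error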


7.2 QUANTUM SCATTERING
In quantum scattering, we know parts of the wave function describing the particles that are
scattered off some other particles. Just as for classical scattering, we consider a two-particle
collision, which we can reduce to a single particle problem in which an incident particle scat-
ters off a static potential which is nonzero only near the origin – see problem 1. Assuming that
the potential vanishes indeed outside of a sphere with radius r max centered at the origin, we
know the wave functions for the incident beam along the z-axis and the scattered waves. So,
outside of the sphere, we have
e ikr
ψ ∝ e ikz + f (ϑ, ϕ) .
r
We shall justify the form of the second term below. The main message for us at this stage is
that we have an incident wave and a scattered wave. The amplitude of the scattered wave
(the second term) depends on the polar angles ϑ, ϕ that are defined with respect to the z-axis
along which the incident particles approach the scattering centre.
The detection of particles far from the scattering centre provides information about the
amplitude f (ϑ, ϕ). Knowing that the outgoing wave e i kr /r represents a unit flux per solid
angle d Ω = sin ϑd ϑd ϕ, we conclude that

d σ ¯¯ ¯2
= f (θ, ϕ)¯ .
dΩ
Therefore, we need to find f (ϑ, ϕ) in order to determine the experimentally relevant quantity
d σ/d Ω.
In order to get a handle on this quantity, we start from the Schrödinger equation:

$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r}) = E\psi(\mathbf{r}).$

For V (r) ≡ 0, the solution to this equation would be an incoming plane wave. It turns out pos-
sible to write the solution to the Schrödinger equation with a potential formally as an integral
expression. This is done using the Green’s function formalism, discussed in chapter 6.4. The

Green's function is an operator – in this case that means that it depends on two positions $\mathbf{r}$
and $\mathbf{r}'$ – it is defined by

$\left[E + \frac{\hbar^2}{2m}\nabla^2 - V(\mathbf{r})\right]G(\mathbf{r},\mathbf{r}') = \delta(\mathbf{r} - \mathbf{r}').$
In fact, these are the matrix elements of the Green's function operator:

$G(\mathbf{r},\mathbf{r}') = \langle\mathbf{r}|\hat{G}|\mathbf{r}'\rangle.$

You may view the delta function on the right hand side as a unit operator, so that G may be
called the inverse of the operator $E\mathbb{1} - \hat{H}$, where $\mathbb{1}$ is the unit operator, all in line with the theory
of chapter 6.4. Now we want to calculate the unperturbed Green’s function, i.e. the one for
$V(\mathbf{r}) \equiv 0$, which we denote as usual as $G_0$:

$\left[E + \frac{\hbar^2}{2m}\nabla^2\right]G_0(\mathbf{r},\mathbf{r}') = \delta(\mathbf{r} - \mathbf{r}').$
Before calculating G 0 let us assume we have it at our disposal. We then may write the
solution to the full Schrödinger equation, i.e. including the potential V , in terms of a solution
$\phi(\mathbf{r})$ to the ‘bare’ Schrödinger equation, that is, the Schrödinger equation with potential V ≡ 0:

$\psi(\mathbf{r}) = \phi(\mathbf{r}) + \int G_0(\mathbf{r},\mathbf{r}')V(\mathbf{r}')\psi(\mathbf{r}')\,d^3r', \qquad (7.14)$

which is an explicit form of the equation we met in chapter 6.4:

$|\psi\rangle = |\phi\rangle + G_0 V|\psi\rangle.$

For this case, Eq. (7.14) can easily be checked by substituting the solution into the full Schrödinger equation and using the fact that $E\mathbb{1} - \hat{H}_0$, acting on the Green's function $\hat{G}_0$, gives a
delta-function:

$(E\mathbb{1} - H_0)\psi(\mathbf{r}) = 0 + \int\delta(\mathbf{r} - \mathbf{r}')V(\mathbf{r}')\psi(\mathbf{r}')\,d^3r' = V(\mathbf{r})\psi(\mathbf{r}),$

showing that ψ(r) satisfies the Schrödinger equation for the full Hamiltonian $H_0 + V$.
Now we consider the scattering problem with an incoming beam of the form $\phi(\mathbf{r}) = \exp(i\mathbf{k}_i\cdot\mathbf{r})$ (the subscript ‘i’ denotes the incoming wave vector; do not confuse it with $i = \sqrt{-1}$!). We
see from Eq. (7.14) that this wave persists but that it is accompanied by a scattering term
which is the integral on the right hand side. Now the wavefunction ψ(r) is still very difficult
to find, as it occurs in Eq. (7.14) in an implicit form. We can make the equation explicit if we
assume that the potential V (r) is small, so that the scattered part of the wave is much smaller
than the wavefunction of the incoming beam. In a first approximation we might then replace
ψ(r′) on the right hand side of Eq. (7.14) by φ(r′) which is a plane wave:

    ψ(r) = φ(r) + ∫ G₀(r, r′) V(r′) φ(r′) d³r′ = e^{ik_i·r} + ∫ G₀(r, r′) V(r′) e^{ik_i·r′} d³r′.

The key to the scattering amplitude is given by the notion that it must always be possible to
write the solution (7.14) in the form:
    ψ(r) = e^{ik_i·r} + f(ϑ, ϕ) e^{ikr}/r.
At this moment we hardly recognise this form in the expression obtained for the wavefunc-
tion. We first must find the explicit expression for the Green’s function G 0 . Without going
through the derivation (see for example Griffiths, section 11.4¹) we give it here:

    G₀(r, r′) = −(2m/ħ²) e^{ik|r−r′|}/(4π|r − r′|),

¹ The free Schrödinger equation is also known as the Helmholtz equation. You will probably derive the Green’s function for this in your math course under that name.



with k = √(2mE/ħ²).
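As a quick consistency check, which is not part of the original derivation, the following sympy sketch verifies that this Green’s function solves the free (Helmholtz) equation away from the origin; the symbols and the radial form of the Laplacian are the only assumptions made.

```python
# Check (illustration only): G0(r) = -(2m/hbar^2) e^{ikr}/(4 pi r) satisfies
# (nabla^2 + k^2) G0 = 0 for r != 0, i.e. (E + hbar^2/(2m) nabla^2) G0 = 0.
import sympy as sp

r, k, m, hbar = sp.symbols('r k m hbar', positive=True)
G0 = -(2*m/hbar**2) * sp.exp(sp.I*k*r) / (4*sp.pi*r)

# radial Laplacian of a spherically symmetric function: (1/r) d^2/dr^2 (r f)
laplacian_G0 = sp.diff(r*G0, r, 2) / r
print(sp.simplify(laplacian_G0 + k**2*G0))   # -> 0
```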
We are interested in the wave function at the detector – this position is r. Therefore we
can take r far from the origin. As the range of the potential is finite, we know that only contributions with r′ ≪ r have to be taken into account. Taylor expanding the exponent occurring in the Green’s function:

    |r − r′| = √(r² − 2r·r′ + r′²) ≈ r − r·r′/r
leads to
    G₀(r, r′) = −(2m/ħ²) (e^{ikr}/4πr) e^{−ik r·r′/r}.
The denominator does not have to be taken into account as it gives a much smaller contribu-
tion to the result for r À 1/k. Now we define kf = kr/r , i.e. kf is a wave vector corresponding
to an outgoing wave from the scattering centre to the point r at the detector. We have

    ψ(r) = φ(r) − (2m/ħ²) (e^{ikr}/4πr) ∫ V(r′) e^{−ik_f·r′} e^{ik_i·r′} d³r′.

This is precisely of the required form provided we set

    f(ϑ, ϕ) = −(m/2πħ²) ∫ V(r′) e^{i(k_i−k_f)·r′} d³r′.

This is the so-called first Born approximation. It is valid for weak scattering – higher order
approximations can be made by iterative substitution for ψ(r′) in the integral occurring in
Eq. (7.14). In the first order Born approximation, the scattering amplitude f(ϑ, ϕ) is in fact a
Fourier transform of the scattering potential.
For future reference, we note that the exact expression for the scattering amplitude is
given by
    f(ϑ, ϕ) = −(m/2πħ²) ∫ V(r′) e^{−ik_f·r′} ψ(r′) d³r′,    (7.15)

where k_f is the outgoing wave vector with polar angles ϑ, ϕ. This is the equation one would obtain from (7.14) without the Born approximation.
As an example, we consider the Coulomb potential which is strictly speaking not weak,
but we pretend that we can use the Born approximation for this. The Coulomb potential has
the form
    V(r) = (q₁q₂/4πε₀) (1/r).
The Fourier transform of this potential reads

    V(k) = (q₁q₂/ε₀) (1/k²).

Therefore, we immediately find for f (ϑ):


    f(ϑ) = − m q₁q₂ / [ 2πε₀ ħ² (k_i − k_f)² ].

The angle ϑ is hidden in k_i − k_f, the norm of which is equal to 2k sin(ϑ/2). The result therefore is, using E = ħ²k²/(2m):

    dσ/dΩ = [ q₁q₂ / (16πε₀ E sin²(ϑ/2)) ]².
This is precisely the classical Rutherford formula, which also turns out to be the correct quan-
tum mechanical result. This could not possibly be anticipated beforehand, but it is a happy
coincidence.
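The following numerical sketch, which is an illustration and not part of the notes, evaluates the Born amplitude for a screened Coulomb (Yukawa) potential V(r) = A e^{−μr}/r and compares the cross section for a small screening parameter μ with the Rutherford formula; the parameter values and units (ħ = m = 1, A standing for q₁q₂/4πε₀) are arbitrary choices.

```python
# Born amplitude for a spherically symmetric potential:
# f(theta) = -(2m/hbar^2) (1/q) Int_0^inf r V(r) sin(q r) dr, q = 2k sin(theta/2).
import numpy as np
from scipy.integrate import quad

hbar = m = 1.0
A = 1.0                 # plays the role of q1 q2/(4 pi eps0)
mu = 1e-3               # screening; mu -> 0 recovers the pure Coulomb potential
k = 2.0
E = hbar**2*k**2/(2*m)

def f_born(theta):
    q = 2*k*np.sin(theta/2)
    # Fourier-sine integral Int_0^inf A e^{-mu r} sin(q r) dr
    I, _ = quad(lambda r: A*np.exp(-mu*r), 0, np.inf, weight='sin', wvar=q)
    return -(2*m/hbar**2)*I/q

theta = 0.5
print(f_born(theta)**2)                             # Born cross section
print((A/(4*E*np.sin(theta/2)**2))**2)              # Rutherford; nearly equal for small mu
```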

7.3 T HE OPTICAL THEOREM


We conclude this chapter by deriving an important theorem which results from the conserva-
tion of matter. This is the optical theorem, which also exists in classical optics (there it derives
from total energy conservation). To study the optical theorem, we should therefore study the
conservation of matter in quantum mechanics. For the Schrödinger equation we can derive
a material conservation law as follows. Suppose we have a volume V in three-dimensional
space which is bounded by a surface Γ. We calculate the rate of change of the amount of
material in that space. That amount of material is given by
    Q = ∫_V |ψ|² d³r.

We then find for the rate of change (the dot indicates derivative with respect to time):
    Q̇ = ∫_V ⟨ψ̇|ψ⟩ d³r + ∫_V ⟨ψ|ψ̇⟩ d³r.

The time-derivatives of the wave function are given by the time-dependent Schrödinger equa-
tion:

    iħ ∂ψ(r, t)/∂t = Ĥ ψ(r, t)

and its complex conjugate

    −iħ ∂ψ*(r, t)/∂t = Ĥ ψ*(r, t).
Taking for the Hamiltonian the form −ħ²/(2m)∇² + V(r), we obtain

    Q̇ = −(ħ/2im) ∫_V [ ψ*(r)∇²ψ(r) − ψ(r)∇²ψ*(r) ] d³r
(note that the terms containing V cancel). Using Green’s second identity, the volume integral
can be transformed into a surface integral:

    Q̇ = −(ħ/2im) ∮_Γ [ ψ*(r)∇ψ(r) − ψ(r)∇ψ*(r) ] · da.
Here, d a is a vector, the norm of which is that of a small surface segment, and directed along
the outward surface normal. The change in material can in this case only be caused by flow
of material through the boundary. In particular, we have
    Q̇ = −∮_Γ j · da,

so that we see that the expression for the particle flux is

    j = (ħ/2im) [ ψ*(r)∇ψ(r) − ψ(r)∇ψ*(r) ].
Note that, using the divergence theorem, the equation relating the material change to the
surface integral can be related to the continuity equation:
    Q̇ = ∫_V ρ̇ d³r = −∮_Γ j · da = −∫_V ∇·j d³r.

As the volume is arbitrary, this equation can only hold when

ρ̇ + ∇ · j = 0.
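As a small illustration, not part of the original text, the sympy sketch below evaluates the one-dimensional version of the probability current for a superposition of counter-propagating plane waves; the resulting j = ħk(|A|² − |B|²)/m is exactly the quantity needed again in problem 2(b).

```python
# Probability current j = hbar/(2 i m)(psi* psi' - psi psi'*) for
# psi(x) = A e^{ikx} + B e^{-ikx}; the oscillating cross terms cancel.
import sympy as sp

x, k, hbar, m = sp.symbols('x k hbar m', real=True, positive=True)
A, B = sp.symbols('A B')          # complex amplitudes

psi = A*sp.exp(sp.I*k*x) + B*sp.exp(-sp.I*k*x)
j = hbar/(2*sp.I*m)*(sp.conjugate(psi)*sp.diff(psi, x)
                     - psi*sp.diff(sp.conjugate(psi), x))
print(sp.simplify(sp.expand(j)))  # hbar*k*(|A|^2 - |B|^2)/m, up to sympy's ordering
```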

Let’s now turn again to the scattering problem, where we are dealing with an incoming
and an outgoing wave:
ψ = φin + φout .

The total current in this wave is then given as


    j = (ħ/2im) [ (φ*_in + φ*_out)∇(φ_in + φ_out) − (φ_in + φ_out)∇(φ*_in + φ*_out) ].
2im
As this expression consists of a complex number minus its complex conjugate, we can write
this in the form
    j = (ħ/m) Im[ (φ*_in + φ*_out)∇(φ_in + φ_out) ].
This can be rewritten as
    j = j_in + j_out + (ħ/m) Im( φ*_in∇φ_out + φ*_out∇φ_in ) = j_in + j_out + (ħ/m) Im( φ*_in∇φ_out − φ_out∇φ*_in ).
In a scattering problem we look at the scattered flux generated by a stationary flux of in-
coming particles. In the stationary limit, there is no generation or absorption of new matter:
the particles flowing into some sphere centred around the scattering potential should also
come out at the same rate. Therefore we must have:
    ∮_S j · da = 0,

as this expression calculates the total matter flux through the sphere’s surface.
As we have written the total flux as a sum over an outgoing, an ingoing and a mixing
term, we can also divide the total flux through the surface up into these three contributions.
The flux of the incoming wave turns out to be zero! This is because the beam described by
exp(ikz) gives an incoming flux on one side of the sphere, and an equal outgoing flux on the
other side. The outgoing flux through the sphere is given by
    ∮_S j_out · da = ∫ |f(ϑ)|² (ħk/m) (1/r²) 2πr² sin ϑ dϑ = (ħk/m) σ_tot.
Therefore, we obtain

    −kσ_tot = Im ∮_S ( φ*_in∇φ_out − φ_out∇φ*_in ) · da.

Using Green’s theorem again, the expression on the right hand side can be reworked:
    −kσ_tot = Im ∫_V ( φ*_in∇²φ_out − φ_out∇²φ*_in ) d³r.

Writing ψ = φ_in + φ_out, this can be rewritten as

    −kσ_tot = Im ∫_V [ φ*_in∇²(ψ − φ_in) − (ψ − φ_in)∇²φ*_in ] d³r = Im ∫_V [ φ*_in∇²ψ − ψ∇²φ*_in ] d³r.    (7.16)
Now we use that the full wave function satisfies the full Schrödinger equation
    ∇²ψ = −k²ψ + (2mV(r)/ħ²) ψ,
and the incoming wave satisfies the Schrödinger equation with potential 0:

∇2 φin = −k 2 φin .

Putting this into (7.16) gives:


    −kσ_tot = Im ∫_V ( −φ*_in k² ψ + φ*_in (2mV(r)/ħ²) ψ + ψ k² φ*_in ) d³r.
The first and the third term in the integral cancel and we have
    −kσ_tot = Im ∫_V φ*_in (2mV(r)/ħ²) ψ d³r.

The last term is recognised as the exact scattering amplitude for θ = 0 (up to a negative pre-
factor) – see Eq. (7.15).
Therefore, we find the optical theorem:
    σ_tot = (4π/k) Im f(ϑ = 0).
If the wave is scattered, we see an attenuation in the forward direction compared to the case
where the incoming particles would not be scattered. The forward scattering amplitude f (0)
is therefore related to the scattering of the particles (i.e. σtot ). It is important to realise that the
optical theorem holds exactly; in the first Born approximation, the theorem does not hold.

7.4 S UMMARY
In this chapter, we have analysed scattering of particles by a potential localised in a finite
region around some point, which we take as the origin. The starting point of the quantum
mechanical analysis is the wave function far from the scattering centre, which reads:
    ψ(r) = e^{ik_i·r} + f(ϑ, ϕ) e^{ikr}/r.
The first term represents the incoming wave (the wave vector ki is usually taken along the z-
axis) and the second term represents the scattered wave whose amplitude can be measured
as a function of ϑ and ϕ at a detector – this amplitude is given as the differential cross section:
    dσ/dΩ = |f(ϑ, ϕ)|².

The shape of the function f(ϑ, ϕ) is determined by the interaction potential V(r), which is
often taken to be spherically symmetric: V (r) = V (r ). The total cross section is the integral of
the differential cross section:

    σ_tot = ∫ (dσ/dΩ) sin ϑ dϑ dϕ.
The calculation of f (ϑ, ϕ) from the scattering potential V can be performed quite easily
provided the interaction potential is weak. In that case, we can use the exact Green’s function
solution:

    ψ(r) = φ(r) + ∫ G₀(r, r′) V(r′) ψ(r′) d³r′

and its first Born approximation:


    ψ(r) = φ(r) + ∫ G₀(r, r′) V(r′) φ(r′) d³r′.

From this last equation, the form of f (ϑ, ϕ) can readily be derived:
    f(ϑ, ϕ) = −(m/2πħ²) ∫ V(r′) e^{i(k_i−k_f)·r′} d³r′.
We usually set q = ki − kf . For a spherically symmetric potential, the integral depends only on
the length q of this vector. This length is given as
q = 2k sin(ϑ/2),
where k is the length of the wave vector of the in- and outgoing waves, and ϑ is the scattering
angle.
The Born approximation yields the exact result for scattering off a Coulomb potential (the
Rutherford formula):

    dσ/dΩ = [ q₁q₂ / (16πε₀ E sin²(ϑ/2)) ]².
Finally, we have derived the optical theorem as a consequence of the conservation of mat-
ter:
    σ_tot = (4π/k) Im f(ϑ = 0).

7.5 P ROBLEMS
1. Show that for two identical particles, 1 and 2, with coordinates r₁ and r₂, the kinetic energy can be written as

       T = −(ħ²/2m)(∇₁² + ∇₂²) = −(ħ²/2M)∇_R² − (ħ²/2µ)∇_r²,

   where ∇_R denotes a gradient with respect to R = (r₁ + r₂)/2 and ∇_r a gradient with respect to r = r₁ − r₂. Finally, M = 2m and µ = m/2.
   The Schrödinger equation then becomes

       [ −(ħ²/2M)∇_R² − (ħ²/2µ)∇_r² + V(r) ] ψ(r, R) = E ψ(r, R).
Also show that ψ(r, R) can be written as φ(r)χ(R). Find suitable eigen-equations for φ
and χ (this is separation of variables).

2. Consider the 1-D scattering problem illustrated in the figure below, with an arbitrary
localised potential (without any particular spatial symmetry) in Region II, and V (x) = 0
in Regions I and III.

   [Figure: a localised potential V(x) in Region II, with V(x) = 0 in Regions I and III; in Region I the wave has components Ae^{ikx} and Be^{−ikx}, in Region III it has components Ce^{ikx} and De^{−ikx}.]

   In Regions I and III, the solutions to the time-independent Schrödinger equation take on the form

       ⟨x|ψ⟩ = Ae^{ikx} + Be^{−ikx}   in Region I,
       ⟨x|ψ⟩ = Ce^{ikx} + De^{−ikx}   in Region III,

   where k = √(2mE)/ħ. For scattering from left to right, {A, B, C, D} = {1, S_LL, S_RL, 0}. For scattering from right to left, {A, B, C, D} = {0, S_LR, S_RR, 1}.
   (a) Show that, if ⟨x|ψ⟩ is a solution to the stationary Schrödinger equation, so is its complex conjugate ⟨x|ψ⟩*. This is a consequence of time reversal invariance.
(b) Calculate the flux for a wave function ψ(x) = exp(ikx). Show that particle number
conservation implies that

       |A|² − |B|² = |C|² − |D|².

   (c) Show that S_LL S*_RL = −S*_RR S_LR (Hint: use the linearity of the Schrödinger equation).
(d) Using the linearity of the Schrödinger equation and the conservation law found in
(b), show that S RL =S LR . Thus, the transmission amplitude through the potential
is symmetric, even though the potential has no left-right symmetry.
(e) Show that the scattering matrix, defined by
       S = [ S_LL  S_LR
             S_RL  S_RR ],
is unitary.

(f) Are (c) and (d) necessary or simply sufficient conditions for S to be unitary?

3. Obtain the low-energy cross-section for the potential given by


       V(r) = −V₀ for r < a,  V(r) = 0 for r > a.

4. Consider scattering off a spherical cage (such as a bucky ball), which is described by a
δ-function
V (r ) = g δ(r − a).
Find the scattering amplitude.

5. In time-dependent perturbation theory, we can calculate the probability to move from


some initial state to some final state as a result of a perturbation which was turned on
for a finite time. Specifically, we write the Hamiltonian in the form

H = H0 + H 0 (t ),

where H0 is the unperturbed Hamiltonian with eigenstates φn and eigenenergies E n .


   The time-dependent solutions of H₀ have the form

       |ψ₀(t)⟩ = Σ_n c_n e^{−iE_n t/ħ} |φ_n⟩,

   where the c_n are time-independent expansion coefficients.
   For the full Hamiltonian, we write the time-dependent solutions in the form

       |ψ(t)⟩ = Σ_n c′_n(t) e^{−iE_n t/ħ} |φ_n⟩.

   Note that the expansion coefficients c′_n(t) are now time-dependent. We have solved the time-dependent problem if we know the explicit time-dependence of these coefficients.

   (a) From now on, we will drop the prime from the coefficients c of the full solution. Show that the c_n satisfy the equation

       iħ ċ_k = Σ_n ⟨φ_k|H′|φ_n⟩ e^{iω_kn t} c_n = Σ_n H′_kn e^{iω_kn t} c_n,    (7.17)

   where ω_kn = (E_k − E_n)/ħ.


   (b) Now suppose that we start off at t → −∞ in a single state |φ_l⟩. We want to know the probability to end up in some other state |φ_m⟩ with m ≠ l. Show that the coefficient can be found by replacing c_n on the right hand side of (7.17) by δ_nl. Which condition on H′ would justify this approach? Show that now

       c_m = (1/iħ) ∫_{−∞}^{t} H′_ml(t′) e^{iω_ml t′} dt′.
   (c) Now suppose that the Hamiltonian H′ is switched on at −T and switched off again at T, where T ≫ 1/ω_ml, and that it can be assumed to be constant in between these two times. Show that then

       c_m = (2/iħ) H′_ml sin(ω_ml T)/ω_ml.

   From the fact that

       lim_{x→∞} sin²(ax)/(a²x) = π δ(a),

   derive that the rate at which the probability to find the particle in state m increases is given by

       (2π/ħ²) |H′_ml|² δ(ω_ml).
(d) Now consider a scattering problem, where the unperturbed states are given by
plane waves with wave vector k, which are as usual denoted by |k〉. The transition
rate from an initial state |kI 〉 to any final state |kF 〉 is then given by
       R = (2π/ħ) ∫ |⟨k_F| H′ |k_I⟩|² δ(E_I − E_F) d³k_F.
   Show from this that the differential cross section for scattering is given by

       dσ/dΩ = (2π)⁴ m²/ħ⁴ |⟨k_F| H′ |k_I⟩|²,

i.e., we have recovered the first Born approximation. In the expressions, we as-
sume the normalisation
       ⟨k|r⟩ = e^{−ik·r}/(2π)^{3/2}.

6. Calculate the differential cross section dσ/dΩ in the first Born approximation for a potential of the form

       V(r) = −V₀ e^{−r/R}.

   Hint: ∫₀^∞ e^{−ar} r dr = 1/a². Note that a may be complex in this expression!
7. Scattering off a charge distribution
We consider scattering of particles with charge e off a charge distribution ρ(r) located
near the origin. The electrostatic potential felt by the incoming particles is given as the
solution to the Poisson equation (in SI units):

−∇2 ϕ(r) = ρ(r)/²0 .

(a) Show that in Fourier space, the Poisson equation takes the form

k 2 ϕ̃(k) = ρ̃(k)/²0 ,

where ϕ̃ and ρ̃ denote the Fourier transforms of ϕ(r) and ρ(r).


The potential energy is given by eϕ(r).
(b) Show that the scattering amplitude is given by

       f(ϑ, ϕ) = −(me/2πħ²) ρ̃(q)/(ε₀ q²).

8. In the Born approximation, the forward-scattering amplitude (the scattering angle ϑ = 0) is real, yielding σ_tot = 0, and therefore it appears to be in contradiction with the optical theorem.

(a) Resolve this contradiction.


   (b) Show that the second-order Born approximation yields a perturbative correction to the scattering amplitude which, for ϑ = 0, is given by

       f^(2)(ϑ = 0) = (m/2πħ²)² ∫∫ e^{ik·(r−r′)} V(r) (e^{ik|r−r′|}/|r − r′|) V(r′) d³r d³r′.

(c) Show that the optical theorem is satisfied to second order in the potential V .
8
S YSTEMS OF H ARMONIC OSCILLATORS –
P HONONS AND P HOTONS
In this chapter, we start with one of the main topics of this course: dealing with many-particle
systems. These systems are particularly hard to treat when the particles interact. In this chap-
ter, we will stick to systems of non-interacting particles. Furthermore, we shall restrict our-
selves to systems with boson character. Notwithstanding these restrictions, you will find the
material sometimes complicated. It may therefore be helpful to always keep the general idea
behind the analysis in mind: we try to decompose the systems at hand into normal modes,
which leads us to view them as composed of non-interacting harmonic oscillators. A good
understanding of the quantum description of the harmonic oscillator is therefore the start-
ing point from which we proceed – this description is given in the next section. We shall
apply the general approach to two types of systems: phonons (vibrational modes occurring
in solids) and photons (vibrational excitation of the electromagnetic field). The last system
leads us to consider an exciting observable quantum phenomenon: the Casimir effect.

8.1 C REATING AND ANNIHILATING QUANTA IN THE H ARMONIC OSCIL -


LATOR
In this section we review the analysis of a well-known quantum mechanical problem: the
harmonic oscillator in one dimension, which is described by the Hamiltonian

    H = p²/(2m) + (1/2) mω² x².
To solve for the spectrum, we introduce the operators
    a = √(mω/2ħ) (x + ip/mω);   a† = √(mω/2ħ) (x − ip/mω).

It is easy to see that a † is indeed the hermitian conjugate of a as the notation suggests. The
commutation relation between x and p:

    [x, p] = iħ

can be used to derive the commutation relation between the a’s:

    [a, a†] = 1   (check this!).

Furthermore, the Hamiltonian can be formulated in terms of the a’s:


    H = (a†a + 1/2) ħω,


which can also be checked by substituting the expressions for the a-operators in terms of x
and p.
The spectrum can now be found from the commutation relations between the a’s. Suppose we have an eigenstate |ψ_n⟩ with energy ħω(n + 1/2). The notation here suggests that n is integer, but we leave this open for the moment. We now show that a|ψ_n⟩ is proportional to another eigenstate of the Hamiltonian with energy ħω(n − 1/2):

    H a|ψ_n⟩ = ħω (a†a + 1/2) a|ψ_n⟩ = a ħω (a†a − 1/2) |ψ_n⟩,

where we have used the commutation relation between the a’s in order to move the a in front of the ket vector to the left. This decreases the term in parentheses by 1. In a similar fashion, one can show that a†|ψ_n⟩ is an eigenstate of H with eigen-energy ħω(n + 3/2). Thus, a has the effect of decreasing the energy by ħω and a† will increase the energy when acting on an
eigenstate.
In order to find the spectrum, we use a physical argument. The spectrum must be bounded from below as the potential does not assume infinitely negative values. Therefore, if we start with some |ψ_E⟩ and act successively on it with the lowering operator a, we must have at some point:

    aⁿ|ψ_E⟩ = 0,    (8.1)

because otherwise the spectrum would not be bounded from below. Let us call aⁿ⁻¹|ψ_E⟩ = |ψ_G⟩. Then a|ψ_G⟩ = 0. Therefore,

    a†a|ψ_G⟩ = 0,    (8.2)

that is, |ψ_G⟩ is an eigenstate of n̂ = a†a with eigenvalue n = 0, or

    |ψ_G⟩ = |ψ₀⟩.

This eigenstate has energy eigenvalue ħω/2, which is found when acting on it with the Hamiltonian. Acting with a† on |ψ₀⟩ we obtain an eigenstate |ψ₁⟩ (up to a constant) with energy eigenvalue 3ħω/2, etc. Acting n times with a† on |ψ₀⟩, we obtain an eigenstate |ψ_n⟩ (up to a constant) with energy ħω(n + 1/2).


The operator a†a is called the number operator, denoted by n̂, and H can now be written as ħω(n̂ + 1/2). |ψ_n⟩ is an eigenstate of n̂ with eigenvalue n. The norm of a†|ψ_n⟩ can be expressed in that of |ψ_n⟩:

    ⟨a†ψ_n|a†ψ_n⟩ = ⟨ψ_n|aa†ψ_n⟩ = ⟨ψ_n|(a†a + 1)ψ_n⟩ = (n + 1)⟨ψ_n|ψ_n⟩.    (8.3)

Therefore, if |ψ_n⟩ is normalised, a†|ψ_n⟩/√(n+1) is normalised too, and normalised states |ψ_n⟩ can be constructed from a normalised state |ψ₀⟩ according to:

    |ψ_n⟩ = (1/√(n!)) (a†)ⁿ |ψ₀⟩.    (8.4)

Using the commutation relations for a, a†, it is also possible to show that states belonging to different energy levels are mutually orthogonal:

    ⟨ψ_n|ψ_m⟩ ∝ ⟨ψ₀| aⁿ (a†)^m |ψ₀⟩.    (8.5)

Moving the a’s to the right and the a†’s to the left by application of the commutation relations leads to a form involving the lowering operator a acting on |ψ₀⟩ and/or ⟨ψ₀| followed by a†’s, which both vanish.
Exercise: show that ⟨ψ₂|ψ₃⟩ vanishes indeed.
­ ®

We have succeeded in finding the energy spectrum but it might seem that we have not made any progress in finding the form of the eigenfunctions |ψ_n⟩. However, we have a simple differential equation defining the ground state |ψ₀⟩:

    √(mω/2ħ) (x + ip/mω) ⟨x|ψ₀⟩ = 0.    (8.6)
We now transform to the coordinates
    x̃ = √(mω/ħ) x;   p̃ = p/√(mħω),
and find that the operator p̃ can be related to x̃ as follows:

    p̃ = (1/√(mωħ)) (ħ/i) d/dx = (1/i) d/dx̃.

We have

    a ⟨x̃|ψ₀⟩ = (1/√2) (x̃ + ip̃) ⟨x̃|ψ₀⟩ = 0,    (8.7)

or:

    (x̃ + d/dx̃) ψ₀(x̃) = 0.    (8.8)

The solution can immediately be found as:

    ψ₀(x̃) = Const. e^{−x̃²/2},    (8.9)

in accordance with the result obtained by the direct method which can be found in any quantum mechanics textbook. The normalisation constant is found as

    Const. = (mω/ħπ)^{1/4}    (8.10)

(check this for yourself!).


Using (8.4), we can find the solution for general n from:

    ⟨x̃|ψ_n⟩ = (mω/ħπ)^{1/4} (1/√(n! 2ⁿ)) (x̃ − ip̃)ⁿ e^{−x̃²/2}.    (8.11)
We have seen that the harmonic oscillator has an equidistant spectrum, and if the en-
ergy is given as ×ω(n + 1/2), we say that there are n quanta in the system. However, there is
nothing which prevents us from viewing those quanta as particles, each of which carry the
same energy ×ω. The operators a and a † annihilate, respectively create a particle. In the next
chapter we shall see that we can derive creation and annihilation operators for electrons and
other massive particles, within an algebraic structure which is very similar to that of one of
more harmonic oscillators.
The fact that we do not distinguish between particles and energy quanta is the reason
why in physics we often mix the two: photons and phonons are sometimes viewed as energy
quanta related to harmonic oscillator excitations, but often we speak about them as if they
were particles.
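The ladder-operator algebra above can be made very concrete numerically. The sketch below, which is an illustration and not part of the notes, builds truncated matrices for a and a† in the number basis and checks the commutator and the spectrum; the truncation size N is an arbitrary choice.

```python
# Truncated Fock-space matrices for a and a^dagger, with a|n> = sqrt(n)|n-1>.
import numpy as np

N = 8
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
adag = a.conj().T

hbar = omega = 1.0
H = hbar*omega*(adag @ a + 0.5*np.eye(N))

print(np.diag(adag @ a))                    # number operator: 0, 1, ..., N-1
print(np.linalg.eigvalsh(H))                # energies hbar*omega*(n + 1/2)
print((a @ adag - adag @ a)[:N-1, :N-1])    # [a, a^dagger] = 1 away from the truncation edge
```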
In the next sections we consider the two major examples of harmonic oscillator systems,
lattice vibrations and the electromagnetic field, in some detail. We shall see that lattice vibra-
tions are conveniently analysed in terms of normal modes: vibrational excitations in which
all the degrees of freedom oscillate at the same frequency. These are the well-known normal
modes. From classical mechanics, we know that the Hamiltonian can be written as a sum of
independent harmonic oscillators, one for each normal mode. Similarly, the Hamiltonian of
the electromagnetic field in vacuum can be viewed as a sum of independent harmonic oscil-
lators. The programme of this chapter is then to turn all these classical harmonic oscillators
84 8. S YSTEMS OF H ARMONIC OSCILLATORS – P HONONS AND P HOTONS

into quantum oscillators which we then can formulate in terms of their creation and annihi-
lation operators. The energy quanta of the lattice vibrations are called phonons, and those of
the electromagnetic field are the photons.
First, however we consider particular states of the quantum harmonic oscillator which
can be related to classical oscillatory excitations.

8.1.1 C OHERENT STATES


The quantum solutions of the harmonic oscillator seem completely unrelated to those we
know from classical mechanics. In this section, we briefly describe eigenfunctions that are
closer to the classical solutions. These coherent states are not eigenstates of the Hamiltonian,
but more general: they can (for each time t ) be expanded in terms of these eigenstates.
A coherent state |α〉 is by definition an eigenstate of the annihilation operator a:

a |α〉 = α |α〉

where it is important to note that a is an operator and α a number.


The properties of coherent states will be addressed in problems 1 and 2. Here we sum-
marise a few:

• Coherent states are minimum-uncertainty states. Minimum-uncertainty states are states satisfying

    Δx Δp = ħ/2,

  i.e. they match the lower bound of the Heisenberg uncertainty relation.

• They represent minimal-uncertainty wave packets that oscillate back and forth in time,
just like a classical harmonic oscillator.
• If external driving is applied, a system evolves towards a coherent state. In problem 3 it will be shown that a coherent state solves the time-dependent Schrödinger equation for the harmonic oscillator. A numerical sketch of the first two properties is given below.
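The following sketch, an illustration rather than part of the notes, constructs a coherent state in a truncated Fock space and verifies that it is an eigenstate of the lowering operator and that it saturates the uncertainty bound; the value of α and the truncation are arbitrary choices.

```python
# |alpha> = exp(-|alpha|^2/2) sum_n alpha^n/sqrt(n!) |n>, checked numerically.
import numpy as np
from math import factorial

N = 40
alpha = 1.5 + 0.5j
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
adag = a.conj().T

state = np.exp(-abs(alpha)**2/2) * np.array(
    [alpha**n/np.sqrt(factorial(n)) for n in range(N)])

print(np.linalg.norm(a @ state - alpha*state))   # ~ 0: eigenstate of a

# quadratures in units hbar = m = omega = 1
x = (a + adag)/np.sqrt(2)
p = (a - adag)/(1j*np.sqrt(2))
def var(op):
    mean = state.conj() @ op @ state
    return (state.conj() @ op @ op @ state - mean**2).real
print(np.sqrt(var(x)*var(p)))                    # ~ 0.5 = hbar/2: minimum uncertainty
```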

8.2 Q UANTIZATION OF THE LINEAR CHAIN OF ATOMS


In this section, we study phonon systems. Rather than doing this in detail for crystals with
different symmetries, we study a homogeneous one-dimensional chain of particles, placed
at a distance d from each other and moving along the chain. The particles are coupled by
springs. If we take the distance between these particles to be small while reducing the mass
accordingly, we can study the continuum limit, which is the elastic string.
To be specific, we formulate the Hamiltonian of this system

    H = Σ_{n=0}^{N−1} p_n²/(2m) + (K/2) Σ_{n=0}^{N−1} (q_{n+1} − q_n)²,

where q n denotes the deviation from the equilibrium position of oscillator n. We assume
periodic boundary conditions, so that q N ≡ q 0 . The distance between the oscillators is a. We
first consider this problem classically.
A Fourier basis is formed by the waves

    u_{k_j n} = (1/√N) e^{ik_j na};   k_j = 2πj/(Na),   j = 1, …, N,

and this is used to define (sums over n and m run from 0 to N − 1):
    q_{k_j} = (1/√N) Σ_n e^{ik_j na} q_n.

[Figure 8.1 (plot of E(k) versus k, in units of 1/a): The dispersion relation for the harmonic chain.]

From the definition of the waves u_{k_j n}, it follows that u*_{k_j n} = u_{−k_j, n} = u_{k_j, −n}. We can define the inverse Fourier transform (assuming that N is even) as

    q_n = (1/√N) Σ_{j=−N/2}^{N/2−1} e^{−ik_j na} q_{k_j} = Σ_{j=−N/2}^{N/2−1} u*_{k_j n} q_{k_j} = Σ_{j=−N/2}^{N/2−1} u_{−k_j, n} q_{k_j}.

Here we have used that u_{k_j n} = u_{k_{j+mN}, n} for any integer m in order to let j run from −N/2 to N/2−1 rather than from 0 to N−1. We may verify that applying the Fourier transform and its inverse does not change q_n:

    q_n = (1/√N) Σ_j e^{−ik_j na} q_{k_j} = (1/N) Σ_j e^{−ik_j na} Σ_m e^{ik_j ma} q_m = (1/N) Σ_m Σ_j e^{ik_j (m−n)a} q_m = q_n.

Here we have used

    (1/N) Σ_{j=−N/2}^{N/2−1} e^{ik_j na} = δ_{n,0}

in the last equality. From now on we shall use the short hand notation

    Σ_{j=−N/2}^{N/2−1} f(k_j) ≡ Σ_k f(k).

The kinetic energy T = Σ_n p_n²/(2m) can then be written as Σ_k p_k p_{−k}/(2m) by applying the same Fourier transformation to p_n as we have done for q_n (exercise!). The potential energy V = (K/2) Σ_{n=0}^{N−1} (q_{n+1} − q_n)² can furthermore be written as (another exercise!)

    V = 2K Σ_k sin²(ka/2) q_k q_{−k},

so that the full Hamiltonian can be written in the form

    H = Σ_k [ p_k p_{−k}/(2m) + (mω_k²/2) q_k q_{−k} ],

where ω_k = 2√(K/m) sin(|ka|/2). This relation between the frequency ω_k and k is called the dispersion relation. It is represented in figure 8.1.
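This dispersion relation is easy to check numerically. The sketch below, which is illustrative and not part of the notes, diagonalizes the classical dynamical matrix of a small periodic chain and compares the normal-mode frequencies with 2√(K/m) sin(|ka|/2); the chain length and parameters are arbitrary choices.

```python
# Normal modes of the periodic harmonic chain versus the analytic dispersion.
import numpy as np

N, K, m, a = 12, 1.0, 1.0, 1.0

# dynamical matrix D with m*qddot_n = -sum_n' D_{nn'} q_{n'}
D = 2*K*np.eye(N)
for n in range(N):
    D[n, (n+1) % N] -= K
    D[n, (n-1) % N] -= K

omega_numeric = np.sort(np.sqrt(np.clip(np.linalg.eigvalsh(D/m), 0, None)))

k = 2*np.pi*np.arange(-N//2, N//2)/(N*a)
omega_analytic = np.sort(2*np.sqrt(K/m)*np.abs(np.sin(k*a/2)))

print(np.allclose(omega_numeric, omega_analytic))   # True
```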
Now we consider the quantum version of this harmonic chain. This means that we treat the moving masses as quantum (point) particles. Their coordinates p_n, q_n then become operators with commutation relations [p_n, p_{n′}] = [q_n, q_{n′}] = 0; [p_n, q_{n′}] = −iħ δ_{nn′}.

From these commutation relations, their counterparts for the Fourier transforms can be formulated:

    [p_k, q_{k′}] = −iħ δ_{k,−k′}.

To obtain this result, we write out p_k and q_{k′} as Fourier sums:

    [p_k, q_{k′}] = (1/N) Σ_{nm} e^{ikna} e^{ik′ma} [p_n, q_m] = −(iħ/N) Σ_n e^{i(k+k′)na} = −iħ δ_{k,−k′}.    (8.12)

Now we define new operators a_k and a_k† as follows:

    a_k = √(mω_k/2ħ) q_k + i √(1/(2ħmω_k)) p_k    (8.13)

and

    a_k† = √(mω_k/2ħ) q_{−k} − i √(1/(2ħmω_k)) p_{−k}.    (8.14)
It is readily seen that the Hamiltonian can be written as

    H = Σ_{k=−π/a}^{π/a} (a_k† a_k + 1/2) ħω_k,

where the operators a k and a k† satisfy commutation relations that can easily be derived from
(8.12):
    [a_k, a_{k′}†] = δ_{kk′}.
From these relations we find

    [H, a_k†] = ħω_k a_k†;   [H, a_k] = −ħω_k a_k.

Denoting by |0⟩ the ground state of the chain (no quanta in any mode), any state can then be constructed by acting on the ground state with the creation operators a_k†:

    |{n_k}⟩ = Π_k (1/√(n_k!)) (a_k†)^{n_k} |0⟩.

We see that for the simplest possible elastic medium, i.e. a one-dimensional harmonic
chain, quantization leads to a formulation in terms of creation and annihilation operators.
It turns out that the standard quantum field theories can all be formulated along these lines.
We shall however not go into details here.
Note that the creation and annihilation operators create and annihilate quantum excita-
tions of the system, respectively. These excitations can be viewed as particles. Such particles
are called phonons. We can also study phonons in three-dimensional grids. In that case, we
have not only a quantum number k which now has become a three dimensional vector, in-
dicating the direction in which the phonon wave travels, but also three components of the
excitations: two transversal, and one longitudinal mode. This means that the creation and
annihilation operators must have three components, corresponding to the polarization (i.e.
the direction in which the atoms move) in addition to k: we therefore have operators a k,α and

a k,α , where two of the α’s correspond to the two transverse, and the third one to the longitu-
dinal mode. We can then also denote these operators as vector operators ak , a†k , where the
boldface indicates that, for example, ak contains three components α. We shall come back to
phonons when dealing with electron-phonon coupling in the next chapter.

8.3 T HE QUANTUM THEORY OF ELECTROMAGNETIC RADIATION


In previous courses, you have covered the classical theory of electromagnetism, which is
summarized in the Maxwell equations. On the other hand you know that light has quantum
properties, as is demonstrated by the existence of light quanta – the photons. In this section,
we first review the classical theory of light in vacuum and then we formulate a quantum the-
ory for this. As we shall see, the procedure is quite analogous to that which was applied to
crystal vibrations, the phonons, in the previous section.

8.3.1 C LASSICAL ELECTROMAGNETISM


The Maxwell equations read (in SI units):

    ∇·E = ρ/ε₀;
    ∇·B = 0;
    ∇×B − µ₀ε₀ ∂E/∂t = µ₀ j;
    ∇×E + ∂B/∂t = 0.
For our purposes, it is convenient not to start from the fields E and B but from the vector and
scalar potentials, A and φ. Note that fields and potentials all depend on position and time,
e.g. A = A(r, t ). The electric and magnetic fields are obtained from the potentials as follows:

    B = ∇ × A;
    E = −∇φ − ∂A/∂t.
The potentials enjoy a gauge freedom: changing them according to

    A → A + ∇χ;   φ → φ − ∂χ/∂t
leaves the electric and magnetic fields unchanged. We choose χ such that

∇ · A = 0,

which is known as the Coulomb or transverse gauge. Substituting the expression for the elec-
tric field in the first of the Maxwell equations given above, and using the Coulomb gauge
condition, we obtain:
    ∇·( −∇φ − ∂A/∂t ) = −∇²φ = ρ/ε₀.

We want to describe the electromagnetic field in vacuum, so ρ ≡ 0 and therefore φ must be constant in space and time.
Now we use the expressions for both E and B in the third Maxwell equation, to obtain
    ∇ × (∇ × A) − µ₀ε₀ ∂/∂t ( −∂A/∂t ) = 0,
where we have used the fact that φ is constant. The first term on the left hand side can be
rewritten as ∇(∇·A)−∇2 A so that, applying the gauge condition once again, we obtain a wave
equation for A, which, using µ₀ε₀ = 1/c², takes the form:

    ∇²A − (1/c²) ∂²A/∂t² = 0.
We know the solutions to such a wave equation: they read

    A = d_k e^{i(k·r−ωt)}   and   A = d′_k e^{i(k·r+ωt)},



where, always, ω_k = c|k| (in vacuum). The first solution describes a wave running in the direction of k and the second a wave running in the opposite direction. The pre-factors d_k and d′_k specify the direction of the vector-potential field. Because of the Coulomb gauge ∇·A = 0, we have

    k·d_k = k·d′_k = 0,

which says that the vector potential is oriented perpendicular to the propagation direction. Therefore we can write both pre-factors as a linear combination of two basis vectors perpendicular to k:

    d_k = Σ_{α=1,2} d_{α,k} ε̂_α

and similar for d′_k. Let’s keep this in the back of our minds for later on.
The general solution can be written as a linear superposition of the monochromatic waves:

    A = Σ_k ( d_k e^{i(k·r−ω_k t)} + d′_k e^{i(k·r+ω_k t)} ).

The two terms in the sum are related by the fact that A is a real field. Its complex conjugate is given by

    A* = Σ_k ( d*_k e^{−i(k·r−ω_k t)} + d′*_k e^{−i(k·r+ω_k t)} ).

We focus on the time dependence of the different terms in A and A* and try to equate those. This leads to

    Σ_k d_k e^{ik·r} = Σ_k d′*_k e^{−ik·r}

and the complex conjugate of this equality. At first sight, the left and right hand sides seem incompatible as the equality of A and A* should hold throughout space. However, realizing that ω_k = ω_{−k} we can replace k on the right hand side by −k, which directly leads to the conclusion

    d′_k = d*_{−k}.
Therefore, A can be written as

    A = Σ_k ( d_k e^{−iω_k t} e^{ik·r} + d*_{−k} e^{iω_k t} e^{ik·r} ) = Σ_k ( d_k(t) + d*_{−k}(t) ) e^{ik·r},

where, in the last expression, we have incorporated the time dependence of the waves into the d_k(t)’s. In the sequel, we shall need the time derivative of A, and we therefore note that

    ḋ_k = −iω_k d_k   and   ḋ*_{−k} = iω_k d*_{−k},

which we should also keep in our short-term memory. In an infinite volume, the k form a
continuous set, and the sum over them becomes an integral:

    A(r, t) = (1/(2π)^{3/2}) ∫ [ d(k, t) + d*(−k, t) ] e^{ik·r} d³k.

The classical Hamiltonian is found as the integral over the energy density of an electromagnetic field:

    H = (1/2) ∫ ( ε₀ E² + B²/µ₀ ) d³r.
As we have seen, the electric field can be found as −Ȧ (remember φ ≡ 0) and the magnetic
field as B = ∇ × A. We can therefore write the electric field as:
    E(r, t) = (i/(2π)^{3/2}) ∫ ω_k [ d(k, t) − d*(−k, t) ] e^{ik·r} d³k.

Using the fact that E is real, we can write it also in the form

    E(r, t) = −(i/(2π)^{3/2}) ∫ ω_k [ d*(k, t) − d(−k, t) ] e^{−ik·r} d³k.

For the magnetic field, we obtain the expression:

    B(r, t) = (i/(2π)^{3/2}) ∫ k × [ d(k, t) + d*(−k, t) ] e^{ik·r} d³k
            = −(i/(2π)^{3/2}) ∫ k × [ d*(k, t) + d(−k, t) ] e^{−ik·r} d³k.

Using these equations for the electric and magnetic field, we can write the Hamiltonian in
terms of the coefficients d. For the first term, defined in terms of the electric field, we obtain,
using the orthogonality of the exp(ik · r):

    (ε₀/2) ∫ E·E* d³r
      = (ε₀/2)(1/(2π)³) ∫∫∫ ω_k [ d(k, t) − d*(−k, t) ] e^{ik·r} · ω_{k′} [ d*(k′, t) − d(−k′, t) ] e^{−ik′·r} d³k d³k′ d³r
      = (ε₀/2) ∫ ω_k² [ 2|d(k, t)|² − d(k, t)·d(−k, t) − d*(k, t)·d*(−k, t) ] d³k.

For the term involving B we find a similar result, using that [k × a]·[k × b] = k² (a·b) whenever a and b are both perpendicular to k (see above):

    (1/2µ₀) ∫ B·B* d³r = (1/2µ₀) ∫ k² [ 2|d(k, t)|² + d(k, t)·d(−k, t) + d*(k, t)·d*(−k, t) ] d³k.
Adding the two contributions to the Hamiltonian gives, with ω_k = c|k| and ε₀µ₀ = 1/c²:

    H = 2ε₀ ∫ ω_k² |d(k, t)|² d³k = 2ε₀ Σ_{α=1,2} ∫ ω_k² |d_α(k, t)|² d³k = Σ_{α=1,2} ∫ H_α(k) d³k.    (8.15)

8.3.2 Q UANTIZATION OF THE ELECTROMAGNETIC FIELD


The form of the Hamiltonian we have arrived at suggests that it can be written as a sum of
harmonic oscillators as each oscillator d evolves in time as exp(iωk t ). However, there is an
important difference: a classical harmonic oscillator is described in terms of real variables x
and p. Our d α (k, t ) are complex, so they cannot play the role of an x and p. We know however
how to arrive from the d at real variables proportional to cos(ωt ) or sin(ωt ): we take the real
or imaginary part of the d, with additional proportionality factors. These factors are chosen
such as to arrive at a suitable form of the Hamiltonian, as we shall see shortly:
    x_α(k, t) = √ε₀ [ d_α(k, t) + d_α*(k, t) ],
    p_α(k, t) = −iω_k √ε₀ [ d_α(k, t) − d_α*(k, t) ].

These relations can simply be inverted, to yield:

p α (k, t )
· ¸
1
d α (k, t ) = p x α (k, t ) + i . (8.16)
2 ²0 ωk

Furthermore, the Hamiltonian for each harmonic oscillator

    H_α(k) = (1/2) p_α²(k, t) + (1/2) ω_k² x_α²(k, t),

precisely equals H_α(k) occurring in the last expression of (8.15).

The foregoing analysis may seem a bit arbitrary and unnatural: why would x α and p α
be defined this way? To convince yourself that the resulting expressions make sense, let’s see
how x and p are related, by using the Maxwell equations for the classical fields. Let’s therefore
consider the expressions for E and B in terms of plane waves again. The electric field contains
a term (up to a constant)
iωk ²̂α d α e ik·r
and the same mode in the B field is (up to the same constant):

iωk n̂ × ²̂α d α e ik·r .

Now let us apply the Maxwell equation

    ∇ × E = −∂B/∂t
to these modes. We then find

    i² ω_k k × ε̂_α ( ω_k x_α(k, t) + i p_α(k, t) ) e^{ik·r} = −i k × ε̂_α ( ẋ_α(k, t) + i ṗ_α(k, t) ) e^{ik·r}.

Equating the imaginary and real parts on the left and right hand side, we immediately see
that
    ∂x_α/∂t = p_α;   ∂p_α/∂t = −ω_k² x_α.    (8.17)
These equations demonstrate that the coordinates x and p which we have introduced obey
the classical equations of motion for the harmonic oscillator! So it seems they really deserve
their names x and p, and we conclude again that the electromagnetic field is correctly de-
scribed by a superposition of harmonic oscillators.
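This little consistency check is easy to reproduce symbolically. The sketch below, illustrative and not from the notes, inserts the time dependence d(t) = d₀ e^{−iωt} into the definitions of x and p and confirms the classical oscillator equations.

```python
# With d(t) = d0 exp(-i omega t), x = sqrt(eps0)(d + d*) and
# p = -i omega sqrt(eps0)(d - d*) obey x' = p and p' = -omega^2 x.
import sympy as sp

t, omega, eps0 = sp.symbols('t omega epsilon_0', positive=True)
d0 = sp.symbols('d0')   # complex mode amplitude at t = 0

d = d0*sp.exp(-sp.I*omega*t)
x = sp.sqrt(eps0)*(d + sp.conjugate(d))
p = -sp.I*omega*sp.sqrt(eps0)*(d - sp.conjugate(d))

print(sp.simplify(sp.diff(x, t) - p))            # -> 0
print(sp.simplify(sp.diff(p, t) + omega**2*x))   # -> 0
```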
We can now readily write down the quantum version of the Hamiltonian: we just impose
{
8{ a commutation relation between the x α (k) and p α (k):

[x α (k), p α0 (k0 )] = i×δ(k − k0 )δαα0 .

The usual creation and annihilation operators can now be defined in terms of these:

    a_α(k, t) = √(ω_k/2ħ) [ x_α(k, t) + i p_α(k, t)/ω_k ],
    a_α†(k, t) = √(ω_k/2ħ) [ x_α(k, t) − i p_α(k, t)/ω_k ],
and the Hamiltonian can then be formulated as

    H = Σ_{α=1,2} ∫ (ħω_k/2) [ a_α†(k, t) a_α(k, t) + a_α(k, t) a_α†(k, t) ] d³k
      = Σ_{α=1,2} ∫ ħω_k [ a_α†(k, t) a_α(k, t) + 1/2 ] d³k.    (8.18)

Interestingly, we note that the definition of the a-operators in terms of the x and p operators is very similar to the relation between the d’s and the x- and p operators [(8.16)] – the only difference is the constant factor √(ħ/(2ε₀ω_k)):

    d_α(k, t) ↔ √(ħ/(2ε₀ω_k)) a_α(k, t).

We can always use the classical expressions derived in the previous subsection and then perform this transformation and impose the standard commutation relation for the operators a_α(k) and a_α†(k):

    [a_α(k), a_{α′}†(k′)] = δ_{αα′} δ(k − k′)

to arrive at the quantum version of electromagnetic field theory.


We see that the modes are characterized by the wave vector k and by their polarization
direction ²̂, which represents two quantum numbers labelled by α. The energy quanta are
called photons. Inspection of the expression of the Hamiltonian (8.18) reveals a peculiar fea-
ture: even when there are no photons present, there is an energy due to the factor 1/2. This
energy is infinite as we must integrate over all possible wave vectors k! We may of course
make it finite by placing he system in a finite box and imposing a maximum on the possi-
ble energies (i.e. a short-wavelength cut-off). A finite offset energy is no problem as physical
processes are driven only by energy differences, and not the energy values themselves. There-
fore the term 1/2 is neglected when calculating the energy of the electromagnetic field. It is
however important to realize that it is there, and why it is there: as the field is composed of
infinitely many modes, and each mode has a lowest energy of ×ωk /2 due to the fact that there
are always quantum fluctuations present in each mode, even at the lowest possible energies,
we have to accept that there is an infinite offset energy.
We conclude this section by providing the electric and magnetic fields and vector potential, expressed in terms of the creation and annihilation operators. This procedure involves replacing d_α(k, t) by a_α(k, t) √(ħ/(2ε₀ω_k)) and similar for the Hermitian conjugates. Collecting the vector operator Σ_α ε̂_α a_α into a vector a, and similarly for a†, we find the expansions:

    A(r, t) = (1/(2π)^{3/2}) ∫ √(ħ/(2ε₀ω_k)) [ a(k, t) + a†(−k, t) ] e^{ik·r} d³k,

    E(r, t) = −∂A/∂t = −(i/(2π)^{3/2}) ∫ √(ħω_k/(2ε₀)) [ a(k, t) − a†(−k, t) ] e^{ik·r} d³k,

    B(r, t) = ∇ × A = (i/(2π)^{3/2}) ∫ √(ħ/(2ε₀ω_k)) k × [ a(k, t) + a†(−k, t) ] e^{ik·r} d³k.

8.3.3 S OME PROPERTIES OF THE ELECTROMAGNETIC FIELD


Using the expressions for the fields, we can calculate the total momentum of the field. The ex-
pression for this quantity is in classical electrodynamics given as an integral over the Poynting
vector:
    P = ε₀ ∫ E × B* d³r
      = (ħ/(2(2π)³)) ∫ √(ω_k/ω_{k′}) [ a(k, t) − a†(k, t) ] × ( k′ × [ a(−k′, t) + a†(k′, t) ] ) e^{i(k−k′)·r} d³k d³k′ d³r
      = (ħ/2) ∫ k [ a(k, t)·a†(k, t) − a†(−k, t)·a(−k, t) + a(k, t)·a(−k, t) − a†(k, t)·a†(−k, t) ] d³k.
In the last step, we have used the fact that the a’s are all perpendicular to k. Note that the third and the fourth term are odd under the substitution k → −k, so they do not contribute. On the other hand, under the same substitution the first term turns into the second, but also the k in front of the square bracket changes sign, so we are left with
    P = (ħ/2) ∫ k Σ_α ε̂_α·ε̂_α [ a_α†(k, t) a_α(k, t) + a_α(k, t) a_α†(k, t) ] d³k.

Using the commutation relations for the operators to swap the last product in this integral
yields a term 1 inside the square brackets, similar to that occurring in the Hamiltonian. How-
ever, in this case that term is not dangerous as the integral vanishes due to the antisymmetry
of k. The final result is thus

    P = ∫ ħk Σ_α N_α(k) d³k.

We could have guessed this form: we sum the photon momenta over each occupied quantum. Indeed, it is well known from classical electrodynamics that the Poynting vector equals the product of the electromagnetic energy density (which is N_α(k) ħω_k) and the speed of light. For a mode k, α, indeed:

    S = Σ_α N_α(k) ħω_k c k̂ = c² ħk Σ_α N_α(k).

Given the fact that the momentum density is given by the Poynting vector E×B/(4πc), we
also have an expression for the angular momentum of the field:
Z
J = ²0 r × (E × B)d 3 r.

We calculate the i -th component of this field:


    J_i = ε₀ ∫ d³r Σ_{jklm} ε_{ijk} r_j ε_{klm} E_l B_m.

Using the fact that B = ∇ × A, and the formula

    Σ_k ε_{klm} ε_{kij} = δ_{li} δ_{mj} − δ_{lj} δ_{mi},

we obtain

    J_i = ε₀ ∫ Σ_{jkl} ε_{ijk} ( r_j E_l ∂_k A_l − r_j E_l ∂_l A_k ) d³r
        = ε₀ ∫ [ Σ_l E_l (r×∇)_i A_l − Σ_{jkl} ε_{ijk} ( ∂_l (r_j E_l A_k) − (∂_l r_j) E_l A_k − r_j (∂_l E_l) A_k ) ] d³r.    (8.19)
fields at infinity, we can neglect this term. The last term vanishes as a result of ∇ · E = 0. In
conclusion, we can write J = L + S, where
Z X
L = ²0 E l (r × ∇)A l d 3 r,
l

and Z
S = ²0 E × A d 3 r.

The term L has the form of


i
r × p,
×
acting as a diagonal operator between the components of the E and A fields. The term r × p is
recognized as the mechanical orbital momentum.
Writing the second term with the same prefactor between the electric field and vector
potential components, we obtain the form

    (S_i)_{jk} = −iħ ε_{ijk}.

It can be shown that S is the component of J along k. It is easy to show that the eigenvalues
of this operator are ±1. Note that the last result tells us that the photon is a spin one particle.
The two possibilities ±1 correspond to the two circular polarizations.

8.3.4 T HE C ASIMIR EFFECT


In the previous sections we have seen that the electromagnetic field can be seen as a collec-
tion of simple harmonic oscillators, one for each polarization α and wave vector k. We also
noted that a harmonic oscillator has a zero-point energy ×ωk /2, which is the ground state en-
ergy of that oscillator. For an infinitely large volume, there are infinitely many k-vectors, so
the ground state of the electromagnetic field is infinite. We do not care about this in general,
as we know that physical processes are driven by energy differences and not by the actual
values of the energies involved.
The Casimir effect is a rather dramatic manifestation of the existence of the vacuum en-
ergy. This statement already should be puzzling: we just argued that the vacuum energy is
not noticeable as it is only the energy differences we should care about. The Casimir effect
however is a manifestation of the difference between vacuum energies, that is, it shows us
a difference between infinity and infinity! It was first postulated based on theoretical argu-
ments in 1948 by H. B. G. Casimir, and then (somewhat tentatively) demonstrated experimen-
tally for the first time by M. Sparnaay in 1958. Currently, there is great interest in measuring
the Casimir effect using nanotechnology, see for example G. Bressi et al., Phys. Rev. Lett. 88
041804 (2002).
So let us see how we can find a force from the vacuum. We consider a segment of space
enclosed between two parallel planes. We take the z-axis perpendicular to these planes. We
introduce some cut-off to make the vacuum energy between these two planes finite. Let us
first write up the vacuum energy in a box of size L × L × L with periodic boundary conditions.
The k-vectors in such a box are 2π(n x , n y , n z )/L. This means that the volume per k-point is
(2π/L)3 . Using this, we can turn the sum over all k-points into an integral, in addition to a
sum over the two possible polarizations α:
    E₀ = (1/2) Σ_α Σ_k ħω_k = Σ_k ħc √(k_x² + k_y² + k_z²) = (L/2π)³ ∫ ħck d³k,

with c the speed of light (the sum over the two polarizations α cancels the prefactor 1/2). The
cut-off is introduced with a parameter λ, the inverse of which sets the scale up to which the
k-modes contribute to the energy:
    E₀(λ) = (L/2π)³ ∫ ħck e^{−λk} d³k.

After completing the calculation we shall put λ → 0 to remove the cut-off – physically useful
results should then be independent of λ. The integral can be worked out:

    E₀(λ) = (V/2π²) cħ ∫₀^∞ k³ e^{−λk} dk = 3Vcħ/(π²λ⁴).

This implies a vacuum energy per volume of

    e₀(λ) = E₀(λ)/V = 3ħc/(π²λ⁴).
So if we consider the space contained between two infinite parallel planes, separated by a
distance a, then the vacuum energy contained in between those two planes per surface area
A is
    E₀/A = 3aħc/(π²λ⁴).
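The regularized integral above is quick to verify. The sketch below is an illustration (not part of the notes) that reproduces the result 3Vħc/(π²λ⁴) with sympy.

```python
# E0(lambda) = (L/2pi)^3 Int hbar c k exp(-lambda k) d^3k = 3 V hbar c/(pi^2 lambda^4).
import sympy as sp

k, lam, hbar, c, V = sp.symbols('k lambda hbar c V', positive=True)

# angular integration gives 4*pi*k^2 dk; (L/2pi)^3 = V/(2*pi)^3
E0 = V/(2*sp.pi)**3 * 4*sp.pi * sp.integrate(hbar*c*k**3*sp.exp(-lam*k), (k, 0, sp.oo))
print(sp.simplify(E0))     # -> 3*V*c*hbar/(pi**2*lambda**4)
```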
Now we calculate the same energy in the case where these planes are perfect conductors.
This has a significant influence on the vacuum energy contained between them, as a result of
the fact that the field should vanish at the conducting planes. This means that the field must
94 8. S YSTEMS OF H ARMONIC OSCILLATORS – P HONONS AND P HOTONS

be proportional to sin(nπz/a), so that only the values k z = nπ/a are allowed, and the modes
along the z-direction now become discrete:
    (L/2π)³ ∫ d³k  →  (L/2π)² Σ_{k_z} ∫ dk_x dk_y.

The last sum can no longer be replaced by an integral when a is not large. The new vacuum
energy per surface area now becomes

    E₀′/A = (1/2) Σ_α (ħc/2π) Σ_{n_z=0}^∞ ∫ k_∥ dk_∥ √(k_∥² + (πn_z/a)²) e^{−λ√(k_∥² + (πn_z/a)²)}.

Wait a minute! The sum on the right hand side does not take into account that there are two polarizations only for n_z ≠ 0. This can be seen by actually writing the expressions for the electric field:

    E_x = E_x^(0) cos(k_x x) sin(k_y y) sin(n_z πz/a);
    E_y = E_y^(0) sin(k_x x) cos(k_y y) sin(n_z πz/a);
    E_z = E_z^(0) sin(k_x x) sin(k_y y) cos(n_z πz/a).

Two of the components E x , E y , E z can be chosen freely, but the third is fixed by the Maxwell
equations, as are the three magnetic field components. However, for n z = 0 we see that only
the amplitude of the z-component can be chosen freely, hence there is only one mode possi-
ble in that case.
Therefore it is useful to split off the n z = 0 term:
" #
{ ∞ q q
1 ×c
Z
0 −λk k
X
2 2 −λ k k2 +(πn z /a)2
8{ E 0 /A =
2 2π
kk d kk kk e +2
n z =1
k k + (πn z /a) e .

The first part of the integral can easily be evaluated:

    (ħc/4π) ∫ k_∥² e^{−λk_∥} dk_∥ = ħc/(2πλ³).
To evaluate the second part of the integral on the right hand side, we substitute
    y = √(k_∥² + (πn_z/a)²),

which turns the integral for n_z into

    ∫ k_∥ dk_∥ √(k_∥² + (πn_z/a)²) e^{−λ√(k_∥² + (πn_z/a)²)} = ∫_{πn_z/a}^∞ y² e^{−λy} dy = (d²/dλ²) [ (1/λ) e^{−n_z πλ/a} ].

All in all, the result for the energy is now

    E₀′/A = ħc/(2πλ³) + (ħc/2π) (d²/dλ²) Σ_{n_z=1}^∞ (1/λ) e^{−n_z πλ/a}.

Carrying out the sum on the right hand side, which contains a geometric series, we obtain

    E₀′/A = ħc/(2πλ³) + (ħc/2π) (d²/dλ²) [ (1/λ) 1/(e^{πλ/a} − 1) ].
The right hand side diverges for λ → 0 as

    E′_divergent/A = 3aħc/(π²λ⁴),

which is precisely equal to the divergent vacuum energy in the absence of the two conductors.
Expanding the right hand side further in powers of λ gives

    1/(e^z − 1) = 1/z − 1/2 + z/12 − z³/720 + … .
We find that the next non-vanishing term is independent of λ and

    (E₀′ − E₀)/A = −cħπ²/(720a³).
This implies that the two conductors attract each other with a force per unit area:

    F/A = −cħπ²/(240a⁴).
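The λ → 0 expansion leading to the finite Casimir energy can also be checked symbolically. The sketch below is illustrative only; it expands the regularized energy per area and confirms that, after subtracting the divergent free-space contribution, the finite remainder is −ħcπ²/(720a³).

```python
# Finite part of E0'/A - 3 a hbar c/(pi^2 lambda^4) in the limit lambda -> 0.
import sympy as sp

lam, a, hbar, c = sp.symbols('lambda a hbar c', positive=True)

E_per_area = hbar*c/(2*sp.pi*lam**3) \
    + hbar*c/(2*sp.pi)*sp.diff(1/(lam*(sp.exp(sp.pi*lam/a) - 1)), lam, 2)

expansion = sp.series(E_per_area, lam, 0, 1).removeO()
finite_part = expansion - 3*a*hbar*c/(sp.pi**2*lam**4)
print(sp.simplify(finite_part))   # -> -pi**2*c*hbar/(720*a**3)
```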
The principle of the Casimir effect seems general enough to apply in other cases where
waves are confined in space. Indeed, it has been claimed that two ships would be attracted to
each other by a Casimir effect of the water waves in between them. This conjecture however
does not seem to have a firm ground.

8.4 SUMMARY
In this chapter we have analysed systems that can be considered as collections of harmonic
oscillators. We started by formulating the harmonic oscillator Hamiltonian

    H = p²/(2m) + (m/2) ω² x²

in terms of creation and annihilation operators, which are defined as:

    a = √(mω/2ħ) (x + ip/mω);   a† = √(mω/2ħ) (x − ip/mω).

These operators satisfy the commutation relation

[a, a † ] = 1.

In terms of the creation and annihilation operators, the Hamiltonian takes the form
    H = ħω (a†a + 1/2).

These ingredients lead to the following conclusions:

• The operator n̂ = a†a takes on non-negative integer values – it is called the number operator.

• The eigenstates of the Hamiltonian are eigenstates |ψ_n⟩ of the number operator, with eigenvalue n = 0, 1, 2, … for that operator. For the Hamiltonian, they have eigenvalues

    E_n = ħω (n + 1/2).

• The creation and annihilation operators have the following effect when acting on a state |ψ_n⟩:

    a|ψ_n⟩ = √n |ψ_{n−1}⟩;   a†|ψ_n⟩ = √(n+1) |ψ_{n+1}⟩.

Coherent states are eigenstates of the annihilation operator a. They represent minimum-
uncertainty wave packets and lead to an oscillating wave packet when propagated in time.
Phonon and photon fields can be written as superpositions of independent harmonic oscillators. This implies that we can readily quantise them. The phonon modes are labelled by their wave vector k and are characterised by their frequency, which is given in terms of k by the so-called dispersion relation. For a linear chain with nearest neighbour couplings of strength K, the longitudinal modes have the dispersion relation

    ω_k = 2√(K/m) sin(|ka|/2).

The Hamiltonian is given by


    H_Chain = ∫ ħω_k ( a_k† a_k + 1/2 ) dk.

Also for the electromagnetic field, we have demonstrated that the Hamiltonian can be
written as a sum over independent modes. These have dispersion relation ωk = c|k|. The
Hamiltonian of the electromagnetic field is
    H_EM = Σ_α ∫ (ħω_k/2) [ a_α†(k, t) a_α(k, t) + a_α(k, t) a_α†(k, t) ] d³k.


Here, the a_α(k, t) and a_α†(k, t) are the time-dependent (Heisenberg-picture) operators

    a_α(k, t) = e^{−iω_k t} a_α(k),

with a_α(k) the time-independent annihilation operator, and similar for the hermitian conjugate (creation operator). The subscript α denotes the two polarization directions perpendicular to the propagation direction k. The expressions used are always valid within the transverse gauge ∇·A = 0. The energy quanta of this Hamiltonian are called photons.
We can also formulate the quantum expressions for the fields in terms of these creation
and annihilation operators. These are:
    A(r, t) = (1/(2π)^{3/2}) ∫ √(ħ/(2ε₀ω_k)) [ a(k, t) + a†(−k, t) ] e^{ik·r} d³k,

    E(r, t) = −∂A/∂t = −(i/(2π)^{3/2}) ∫ √(ħω_k/(2ε₀)) [ a(k, t) − a†(−k, t) ] e^{ik·r} d³k,

    B(r, t) = ∇ × A = (i/(2π)^{3/2}) ∫ √(ħ/(2ε₀ω_k)) k × [ a(k, t) + a†(−k, t) ] e^{ik·r} d³k.
Finally, we can formulate the total momentum of the field, obtained from the Poynting vector which represents the energy flow of the EM field, as

    P = ∫ ħk Σ_α N_α(k) d³k,

where N_α(k) = a_α†(k) a_α(k) counts the number of photons with polarization α.
The quantum expression for the angular momentum carried by the electromagnetic field
splits into two contributions: the orbital angular momentum and the spin.

J = L + S,

where

    L = ε₀ ∫ Σ_l E_l (r × ∇) A_l d³r,

and

    S = ε₀ ∫ E × A d³r.

Finally, we have seen that the electromagnetic field energy density always tends to infinity, due to the fact that the field is composed of an infinite number of harmonic oscillators, each contributing a zero-point energy ħω_k/2. We never ‘see’ this infinite energy density, as processes are driven only by energy differences. The infinite energy density manifests itself however through the dramatic Casimir effect: two ideal, parallel conductors with separation a deform the field in between them due to the zero tangential field boundary condition at their surface, leading to a finite energy difference per surface area. This energy difference can be calculated to be

    (E₀′ − E₀)/A = −cħπ²/(720a³).

From this, we see that the two conductors attract each other with a force per unit area:

    F/A = −cħπ²/(240a⁴).
This is the famous Casimir force.

8.5 P ROBLEMS
1. (a) Derive the equations of motion for the operators x̂ and p̂ for the harmonic oscil-
lator, described by the Hamiltonian
        H = p²/(2m) + (1/2) mω² x².
(b) Transforming the variables according to
        X(t) = x(t) √(mω/ħ);   P(t) = p(t)/√(ħmω),
find the equations of motion for X and P .
(c) Now we search for the eigenstates of the annihilation, or ‘lowering’ operator a.
What is the form of this operator, expressed in terms of x and p? And in terms of
X and P ?
The eigenstates of a satisfy
a |α〉 = α |α〉 .
We write |α〉 in the form

        |α⟩ = Σ_{n=0}^∞ c_n |n⟩,

    where |n⟩ is an eigenstate of the Hamiltonian with energy eigenvalue ħω(n + 1/2). Show that the fact that this is an eigenstate of the lowering operator leads to the condition

        c_{n+1} = (α/√(n+1)) c_n,

    and show from this that

        c_n = (αⁿ/√(n!)) c₀.
    (d) Find the normalization condition, and, assuming that c₀ is real, show that

        c₀² = 1 / Σ_{n=0}^∞ (|α|^{2n}/n!) = e^{−|α|²},

    so that

        |α⟩ = e^{−|α|²/2} Σ_n (αⁿ/√(n!)) |n⟩.
These states are called coherent states.

    (e) Show that this can be written as

        |α⟩ = e^{−|α|²/2} e^{αa†} |0⟩.

    Hint: remember that

        |n⟩ = ((a†)ⁿ/√(n!)) |0⟩.
2. We continue with the coherent states considered in the previous problem. We now
focus on the time dependence.

(a) Show that the Heisenberg equation for a has as its solution

a(t ) = e −iωt a(0).

(b) Show that the eigenstate of a with eigenvalue α at t = 0 will remain an eigenstate
of a with eigenvalue α(t ) = exp(−iωt )α.
    (c) From X(t) = [a(t) + a†(t)]/√2, show that the expectation value of X, for a system starting off in the state |α⟩, varies in time as

        ⟨X⟩(t) = X_m cos(ωt − ϕ).

Express X m and ϕ in terms of α. Find a similar equation for P (t ).


    (d) Calculate ⟨α|Δ²X|α⟩ and ⟨α|Δ²P|α⟩. How do these quantities evolve in time?

    (e) According to the minimum uncertainty principle, what is the lower bound on the product ⟨Δ²X⟩⟨Δ²P⟩? (The expression ⟨Δ²X⟩ denotes the variance ⟨X²⟩ − ⟨X⟩², and similar for ⟨Δ²P⟩.) Are coherent states thus minimum uncertainty states?

(f) The coherent state $|\alpha\rangle$ is related to the ground state of the harmonic oscillator via
\[
|\alpha\rangle = D(\alpha)|0\rangle,
\]
where
\[
D(\alpha) = e^{\alpha a^\dagger - \alpha^* a}.
\]
Prove this.
(g) Show that for a general operator A and a unitary operator U we have the relation

\[
U^\dagger e^{A} U = e^{U^\dagger A U},
\]

and use that to calculate the operator

U0† (t )D(α)U0 (t ),

where U0 (t ) is the time evolution operator for the harmonic oscillator.


Use this result to calculate the time evolution of an initial state |α〉. Show that the
initial state remains a coherent state.

3. In this problem, we show that a driven harmonic oscillator, starting off in a coherent
state, remains in a coherent state. The Hamiltonian of the driven harmonic oscillator
reads
\[
H = \frac{p^2}{2m} + \frac{m\omega^2}{2} x^2 + \sqrt{2m\omega^3\hbar}\, f(t)\, x.
\]
(a) Formulate this Hamiltonian in terms of creation and annihilation operators, and
show that the equation of motion for a(t ) reads

ȧ = −iωa − iω f .

Write the solution to this equation as an integral involving f .



(b) Now consider a system starting off in a coherent state |α(0)〉. Work out

a(t ) |α(0)〉 .

Show that the time-dependent state is a coherent state

|α(t )〉

where
\[
\alpha(t) = \alpha(0)\, e^{-i\omega t} - i\omega \int_0^t e^{-i\omega(t - t')} f(t')\, dt'.
\]

4. Consider the harmonic oscillator in two dimensions (in convenient units):


\[
H = \frac{1}{2}\left(p_x^2 + p_y^2\right) + \frac{1}{2}\left(x^2 + y^2\right). \tag{8.20}
\]
The units furthermore imply $\hbar \equiv 1$, so that $[x_\alpha, p_\beta] = i\delta_{\alpha\beta}$, for $\alpha, \beta = 1, 2$.
Define annihilation operators a 1 and a 2 as follows:

\[
a_1 = \frac{1}{\sqrt{2}}(x + i p_x) \quad \text{and} \quad a_2 = \frac{1}{\sqrt{2}}(y + i p_y). \tag{8.21}
\]

Creation operators a 1† and a 2† are the Hermitian conjugates of these.



(a) Calculate all commutation relations for $a_\alpha$ and $a_\alpha^\dagger$, $\alpha = 1, 2$.
(b) Show that $N_1 = a_1^\dagger a_1$ and $N_2 = a_2^\dagger a_2$ form a complete set of commuting observables, i.e. a maximal set of commuting operators. (Capital $N$'s are used for operators, and lower case $n$'s for their eigenvalues.) Show furthermore that
\[
|n_1 n_2\rangle = \frac{\left(a_1^\dagger\right)^{n_1}}{\sqrt{n_1!}} \frac{\left(a_2^\dagger\right)^{n_2}}{\sqrt{n_2!}}\, |0\rangle \tag{8.22}
\]
are the corresponding eigenstates and give their degeneracies.


Now we define
\[
J_x = \frac{1}{2}\left(a_2^\dagger a_1 + a_1^\dagger a_2\right), \tag{8.23a}
\]
\[
J_y = \frac{i}{2}\left(a_2^\dagger a_1 - a_1^\dagger a_2\right), \tag{8.23b}
\]
\[
J_z = \frac{1}{2}\left(a_1^\dagger a_1 - a_2^\dagger a_2\right). \tag{8.23c}
\]

(c) Show that $[J_k, J_l] = i\epsilon_{klm} J_m$ and that $J^2 = \frac{N}{2}\left(\frac{N}{2} + 1\right)$, with $N = N_1 + N_2$.


(d) Show that $J^2$ and $J_z$ form a complete set of commuting observables and show that
\[
|jm\rangle = \frac{\left(a_1^\dagger\right)^{j+m}}{\sqrt{(j+m)!}} \frac{\left(a_2^\dagger\right)^{j-m}}{\sqrt{(j-m)!}}\, |0\rangle. \tag{8.24}
\]
What are the eigenvalues of the Hamiltonian and their degeneracies?


(e) For a spin-less particle in a magnetic field B = (0, 0, B z ) the Hamiltonian can be
written as
H = H1 + H2 + H3 (8.25)
with H1 the Hamiltonian of the two dimensional harmonic oscillator in the x y
plane, H2 the Hamiltonian for a free particle along the z axis and H3 = −qB L z /2m.
Show that [Hi , H j ] = 0, i , j = 1, 2, 3.

(f) Show that $L_z$ can be expressed in terms of $J_y$.


(g) How can the states of H be labelled? Give energy levels and degeneracies.

5. In this problem, we study the mean and variance of the electric field in the photonic
ground state, which we denote as |0〉.

(a) Show that 〈0 | E (r, t ) | 0〉 = 0.


(b) Show that, written as a sum over the modes in an $L \times L \times L$ cavity with periodic boundary conditions,
\[
\left\langle 0 \left| E^2(\mathbf{r},t) \right| 0 \right\rangle = \sum_{\mathbf{k},\alpha} \frac{\hbar\omega_k}{2\epsilon_0 L^3}.
\]

(c) Show, using the integral representation and introducing a cut-off wave vector k c ,
that the uncertainty diverges as k c4 .
(d) In practice, we never measure the field at a single point but in a finite volume. If
the linear size of that volume is `, how does the uncertainty of the field scale with
`?

9 SECOND QUANTISATION

9.1 INTRODUCTION
The quantum mechanics you have been confronted with so far usually deals with single par-
ticles moving in a potential which is caused by the environment (e.g. gravity, an electric or
a magnetic field). If we have more than one particle, each of them feels the other particles
through some interaction (gravity, Coulomb interaction, spin-orbit interaction, . . . ). When
the system is close to equilibrium, we can decompose its Lagrangian into its normal modes,
and these can be described formally by independent harmonic oscillators – this is the ap-
proach we used in the previous chapter to analyse the phonons in a chain. Generally, how-
ever, this analysis is not possible, and the problem becomes tremendously difficult. As in-
teracting many-body systems are so difficult to analyse thoroughly, we shall only scratch the
surface of this interesting topic in this lecture course. However, treating the particles in a
many-body system as moving independently in a single potential that reflects the effect of the interactions with all other particles has proven to be extremely useful, and many phenomena in, for example, solids can be explained using this ‘independent particle’ picture. Even with-
out treating the interactions explicitly, many-body systems are interesting, if only because of
the statistics, which, for indistinguishable fermions, gives rise to a pressure in the degenerate
limit (high density and low temperature), originating from the fact that two fermions having
the same spin do not want to occupy the same position in space.
In the previous chapter we have dealt with phonons and photons. We know that these
‘particles’ (‘excitations’ would be a better term for them) do not preserve their number as
phonons and photons are continuously created and annihilated. This happens for example
in processes where electromagnetic radiation interacts with electrons, and it is a direct con-
sequence of the fact that there is no ‘mass energy’ $E = mc^2$ to be paid for creating a photon – just its momentum determines the energy via $E = \hbar\omega = \hbar c k$, and this momentum may be
(very) small.
The varying number of photons or phonons is an example of the general phenomenon
that interactions cause the particle numbers of different species in the system to vary. There-
fore we need a formalism in which the number of particles is not constant, but allowed to
change. It will turn out that this description is also the most convenient one in a mathemat-
ical sense. We therefore shall be working in a space which is a direct sum of Hilbert spaces
with fixed numbers of particles. This space is called the Fock space F :

F = H (1) ⊕ H (2) ⊕ H (3) ⊕ H (4) . . .

which is a direct sum of the Hilbert spaces H (n) for n particles, where n assumes any positive
integer value. For completeness, we state that if the Hamiltonian of a one-particle system is


ĥ, then for an n-particle non-interacting system, it is given as


\[
\hat H^{(n)} = \hat h(1) + \hat h(2) + \cdots + \hat h(n) = \sum_{i=1}^{n} \hat h(i).
\]

Here, i , as in ĥ(i ), denotes the appropriate quantum number(s) of particle i . Examples are
the position ri of a spin-less particle, or, if the particle has spin, (ri , s i ) where s i is the spin
quantum number (corresponding to, say, the z-component of the spin) of the particle. An
example of ĥ is the kinetic energy:
\[
\hat h(i) = \frac{p_i^2}{2m}.
\]
From now on, we will omit the hat from operators when there is no ambiguity. If all particles
feel the same external potential U (r), ĥ becomes

\[
\hat h(i) = \frac{p_i^2}{2m} + U(\mathbf{r}_i).
\]
The particles are all assumed to be of the same species for simplicity. For interacting par-
ticles, we have additional terms in the Hamiltonian. An important example is a two-particle
interaction v:
\[
H^{(n)} = \sum_{i=1}^{n} h(i) + \sum_{i < i' \le n} v(\mathbf{r}_i - \mathbf{r}_{i'}).
\]

We need a basis of the Fock space. We start from some basis $|\phi_j\rangle$, $j = 1, \ldots$ of the one-particle Hilbert space. It is important to distinguish the particles, labelled by $i$, $i'$ etc., from the one-particle basis states, which are labelled by $j$, $k$, $m$ etc. A basis for the two-particle Hilbert space is then $|\phi_j \phi_k\rangle$, where the first particle is in the one-particle state $j$ and the second in $k$. However, this is not a suitable basis when the particles are indistinguishable. In that case, the Hamiltonian is invariant under exchange of any pair of particles. This implies that $H$ commutes with the exchange operator $P_{jk}$ for particles $j$ and $k$:
\[
[P_{jk}, H] = 0.
\]

From linear algebra we conclude that we can find eigenstates of H which are at the same time
eigenfunctions of P j k , since they are two commuting Hermitian operators. Furthermore we
know that $P_{jk}^2 = 1$. Combined with the fact that $P_{jk}$, being Hermitian, has real eigenvalues, we see that only the eigenvalues $+1$ and $-1$ of $P_{jk}$ are allowed, and therefore it is possible to have eigenstates of the Hamiltonian which satisfy $P_{jk}|\psi\rangle = \pm|\psi\rangle$ for each $|\psi\rangle$ which is a

many body state including particles j and k. It turns out that a particle has either a +1 or
a −1 as its eigenvalue of any exchange operator, i.e. it can not switch from being a particle
with +1 under exchange, to −1. Particles belonging to the former class are called bosons and
those belonging to the latter are fermions. The so-called spin-statistics theorem states that
bosons have integer spin, whereas fermions have half-integer spin. This theorem was proven
by M. Fierz (1939) and W. Pauli (1940) but the argument is rather tricky. Recently, alternative
proofs have been formulated – we do not give the details in these notes.
We have:

There are two types of particles. The first type is called bosons; these particles are
symmetric under particle exchange; they have integer spin. The second type of
particles, carrying half-integer spin, are called fermions. These particles are anti-
symmetric under particle exchange.

Note that a general permutation of any number of particles, can always be constructed from
a sequence of particle exchanges (pair-wise swaps). Although that sequence is not unique,
the parity of the number of such exchanges in the sequence is always the same for a given

permutation – therefore we distinguish even and odd permutations. The eigenvalue of the
wave function for an operator which permutes the particles according to an even permuta-
tion is +1. For an odd permutation, the fermion particle wave function has then eigenvalue
−1 (the boson wave function always has eigenvalue +1).
The requirement imposed by these symmetry properties implies that our basis functions should have the appropriate symmetry. For bosons, we can construct symmetric basis functions for $n$ particles starting from single-particle states $|\phi_j\rangle$, as follows:
\[
\langle 1, 2, 3, 4, \ldots, n | \psi_S \rangle = \mathcal{N} \sum_P \phi_1(P_1)\, \phi_2(P_2) \ldots \phi_n(P_n),
\]

where the sum is over all permutations P of the sequence 1, 2, . . . , n:

P (1, 2, 3, . . .) = (P 1 , P 2 , P 3 , . . .)

and where N is an appropriate normalisation factor. Note that P is used as an operator,


working on the set of numbers (1, 2, . . . , N ), whereas the P 1 , P 2 . . . , are numbers between 1
and N . For two particles we only have the two permutations

1, 2 → 1, 2 and 1, 2 → 2, 1.

For two bosons, occupying two different one-particle states $|\phi_1\rangle$ and $|\phi_2\rangle$, we can construct the symmetric two-particle states as follows:
\[
\langle 1, 2 | \psi_S \rangle = \frac{1}{\sqrt{2}} \left[ \langle 1 | \phi_1 \rangle \langle 2 | \phi_2 \rangle + \langle 2 | \phi_1 \rangle \langle 1 | \phi_2 \rangle \right]. \tag{9.1}
\]
It can easily be verified that this state is symmetric under exchange of particles 1 and 2. If the particles occupy the same one-particle state ($\phi_1 = \phi_2$) we have $\langle 1 | \phi_1 \rangle \langle 2 | \phi_1 \rangle$ as the two-particle state.
For fermions the situation is different. The antisymmetric wave function for two fermions is
\[
\langle 1, 2 | \psi_{AS} \rangle = \frac{1}{\sqrt{2}} \left[ \langle 1 | \phi_1 \rangle \langle 2 | \phi_2 \rangle - \langle 2 | \phi_1 \rangle \langle 1 | \phi_2 \rangle \right].
\]
We see that this vanishes when $\phi_1$ and $\phi_2$ are the same. Indeed, for two fermions in the same state $\phi$, we see that the antisymmetric property says:

\[
\langle 1 | \phi \rangle \langle 2 | \phi \rangle = - \langle 2 | \phi \rangle \langle 1 | \phi \rangle = 0.
\]

For more than two (say, $n$) fermions, occupying a set of one-particle states $|\phi_1\rangle, |\phi_2\rangle, \ldots, |\phi_n\rangle$, the antisymmetric wave function is given by a so-called Slater determinant
\[
\langle 1, 2, 3, 4, \ldots, n | \psi_{AS} \rangle = \frac{1}{\sqrt{n!}}
\begin{vmatrix}
\langle 1 | \phi_1 \rangle & \langle 2 | \phi_1 \rangle & \cdots & \langle n | \phi_1 \rangle \\
\langle 1 | \phi_2 \rangle & \langle 2 | \phi_2 \rangle & \cdots & \langle n | \phi_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle 1 | \phi_n \rangle & \langle 2 | \phi_n \rangle & \cdots & \langle n | \phi_n \rangle
\end{vmatrix}.
\]

This wave function can be written in a form similar to the one used for bosons:
\[
\langle 1, 2, 3, 4, \ldots, n | \psi_{AS} \rangle = \frac{1}{\sqrt{n!}} \sum_P \mathrm{sgn}(P)\, \langle P_1 | \phi_1 \rangle \langle P_2 | \phi_2 \rangle \ldots \langle P_n | \phi_n \rangle,
\]

where sgn(P ) is the sign of the permutation: it is +1 if the permutation can be written as a
product of an even number of transpositions (exchange operations) and −1 otherwise.
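As a concrete numerical illustration (added here, not part of the original text), the antisymmetric amplitude is just the determinant of the matrix of overlaps $\langle i | \phi_j \rangle$ divided by $\sqrt{n!}$. The sketch below uses 1D particle-in-a-box orbitals purely as an example choice:

```python
import numpy as np
from math import factorial, sqrt, pi

def orbital(j, x, L=1.0):
    """Example single-particle orbitals: 1D box eigenstates on [0, L]."""
    return sqrt(2.0 / L) * np.sin((j + 1) * pi * x / L)

def slater_amplitude(positions, n_orbitals):
    """Antisymmetric amplitude <x_1 ... x_n | psi_AS> as a determinant."""
    n = len(positions)
    assert n == n_orbitals
    M = np.array([[orbital(j, x) for j in range(n)] for x in positions])  # M[i,j] = <x_i|phi_j>
    return np.linalg.det(M) / sqrt(factorial(n))

print(slater_amplitude([0.2, 0.5, 0.7], 3))        # generic, non-zero value
print(slater_amplitude([0.2, 0.2, 0.7], 3))        # two fermions at the same point: ~0 (Pauli)
```

Two identical rows make the determinant vanish, which is the Pauli principle appearing automatically.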
For a symmetric boson wave function in which the one-particle orbital φ1 occurs n 1
times, φ2 n 2 times etcetera, we use the notation

\[
|\psi_S\rangle = |n_1 n_2 \ldots\rangle. \tag{9.2}
\]

We can do the same for a fermion system, for which n j can take on the values 0 or 1 only. 1
To conclude this section, we consider the normalisation of the boson states. In princi-
ple, we can sum over all permutations, which would yield n! terms. However, if several states
are multiply occupied, any permutation involving a reshuffling of particles within the same
single particle state does not lead to a new state. This leads us to consider two different for-
mulations for the symmetric boson states. The first involves a sum over all permutations,
leading to an overcounting of identical states related by permutations of the particles within
each level. This is compensated for in the normalisation factor:
\[
\mathcal{N} = \frac{1}{\sqrt{n!\, n_1!\, n_2! \ldots}},
\]

where there are n 1 particles in the single particle state 1, n 2 in the single particle state 2 and
so on. Alternatively, we can sum over the unique permutations only, i.e. we take only one
representative of each set of states obtained from each other by permutations of the particles
within each level. Then the normalisation factor becomes:
\[
\mathcal{N}_U = \sqrt{\frac{n_1!\, n_2! \ldots}{n!}}.
\]
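As a small worked example (added for illustration), take $n = 3$ bosons with two of them in orbital 1 and one in orbital 2, i.e. $n_1 = 2$, $n_2 = 1$:
\[
\mathcal{N} = \frac{1}{\sqrt{3!\,2!\,1!}} = \frac{1}{\sqrt{12}}, \qquad
\mathcal{N}_U = \sqrt{\frac{2!\,1!}{3!}} = \frac{1}{\sqrt{3}}.
\]
The first factor normalises the sum over all $3! = 6$ permutations, in which each distinct product of orbitals appears $n_1! n_2! = 2$ times; the second normalises the sum over the 3 distinct terms only.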

The formulation of a quantum theory which describes many-particle states in the Fock
space is called second quantisation.

9.2 MOVING AROUND IN FOCK SPACE – CREATION AND ANNIHILATION OPERATORS
In the previous section we have introduced the Fock space – the direct sum of Hilbert spaces
of different particle numbers. Using this space obviously only makes sense if we can change
the particle numbers – otherwise, we could have used the N -particle Hilbert space. In the
previous chapter we have encountered operators which can change energy quanta: these are
the creation and annihilation operators. We also pointed out that the difference between
quanta and particles only has an interpretational character to it; on the formal level, they are the same. We now extend the notion of creation and annihilation operators to any type of
particle, massive or massless, fermions or bosons. We start with bosons.

9.2.1 MANY-BOSON SYSTEMS


For many-boson systems, annihilation operators remove a particle from one of the single-
particle states in a many-body state of the form (9.1) (or (9.2)); the creation operators add a
particle. The definition of these operators is tied to the set of one-particle states we work with.
Often, these single-particle states are denoted as orbitals, or spin-orbitals when the particles
have spin-degrees of freedom. The annihilation and creation operators are thus defined by
their action on the many-body basis states which are conveniently used in the occupation
number representation:
\[
a_j |n_1 n_2 \ldots n_j \ldots\rangle = \sqrt{n_j}\, |n_1 n_2 \ldots n_j - 1 \ldots\rangle
\]
and
\[
a_j^\dagger |n_1 n_2 \ldots n_j \ldots\rangle = \sqrt{n_j + 1}\, |n_1 n_2 \ldots n_j + 1 \ldots\rangle.
\]

It is then easy to see that a †j a j is a Hermitian operator with eigenvalue n j :

\[
a_j^\dagger a_j |n_1 n_2 \ldots n_j \ldots\rangle = a_j^\dagger \sqrt{n_j}\, |n_1 n_2 \ldots n_j - 1 \ldots\rangle = n_j\, |n_1 n_2 \ldots n_j \ldots\rangle.
\]

1 Some textbooks use a special notation instead of the ‘ket’ vector, such as |. . .) or |. . .}, on the right hand side in

order to indicate that the state is specified by the ‘occupation numbers’ n 1 , n 2 ,. . . rather than the labels of the
occupied basis states.

Note that we have only defined the creation and annihilation operators: they are not a natural
consequence of an analysis of a harmonic oscillator as in the previous chapter. In fact, the
physics of our bosons may have no relation to the harmonic oscillator at all.
We can also construct a normalised state containing n j particles in orbital j by acting on
the vacuum sufficiently often with the creation operator:
\[
|0\,0\,0\, n_j\, 0 \ldots\rangle = \frac{(a_j^\dagger)^{n_j}}{\sqrt{n_j!}}\, |0\,0 \ldots 0\,0\rangle.
\]

Interestingly, with this choice of the definition of the creation and annihilation operators,
we have commutation relations for the creation and annihilation operators reminding us of
those for the harmonic oscillator:
\[
\left[ a_j, a_j^\dagger \right] = 1.
\]

This can be seen by acting with this operator on an arbitrary state with n j particles in orbital
j:
\[
\left( a_j a_j^\dagger - a_j^\dagger a_j \right) |n_1 \ldots n_j \ldots\rangle = a_j \sqrt{n_j + 1}\, |n_1 \ldots n_j + 1 \ldots\rangle - a_j^\dagger \sqrt{n_j}\, |n_1 \ldots n_j - 1 \ldots\rangle = (n_j + 1) |n_1 \ldots n_j \ldots\rangle - n_j |n_1 \ldots n_j \ldots\rangle = |n_1 \ldots n_j \ldots\rangle.
\]

Moreover, following similar steps, it can be checked that


\[
\left[ a_j, a_k \right] = \left[ a_j^\dagger, a_k^\dagger \right] = 0,
\]
and that
\[
\left[ a_j, a_k^\dagger \right] = 0 \quad \text{for } k \neq j.
\]

Proving these relations is a useful exercise.
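These relations can also be checked numerically. The sketch below (an illustration, not part of the notes) builds the matrix of a single-mode annihilation operator in a truncated number basis and verifies $[a, a^\dagger] = 1$ on all but the highest truncated level; the cutoff dimension is an arbitrary choice:

```python
import numpy as np

def annihilation(dim):
    """Matrix of a in the truncated number basis |0>, ..., |dim-1>: a|n> = sqrt(n)|n-1>."""
    a = np.zeros((dim, dim))
    for n in range(1, dim):
        a[n - 1, n] = np.sqrt(n)
    return a

dim = 8
a = annihilation(dim)
adag = a.T
comm = a @ adag - adag @ a
# Exactly the identity, except in the last row/column (an artefact of the truncation).
print(np.allclose(comm[:-1, :-1], np.eye(dim - 1)))  # True
```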

9.2.2 MANY-FERMION SYSTEMS


We can define similar operators for the fermion case. Note that no two fermions can be in the same orbital: in that case, the antisymmetric wave function is always zero. To show that handling the antisymmetry of the many-body wave function requires a lot of care, we compare the states

|n 1 = 1, n 2 = 1, n 3 = 1〉 and |n 2 = 1, n 1 = 1, n 3 = 1〉 .

Note the swap of the labels 1 and 2. By the convention imposed by the Slater determinant,
one of the terms in the wave function on the left hand side is a product of the orbitals of
particle 1, 2 and 3 in the same order as they appear in the ket vector, i.e. particle 1 is in orbital $|\phi_1\rangle$; particle 2 is in orbital $|\phi_2\rangle$ and particle 3 is in orbital $|\phi_3\rangle$. This term occurs with a plus

(+) sign. In the state on the right hand side, the first two particles are swapped, and the same
term (particles 1, 2 and 3 in orbitals 1, 2 and 3 respectively) occurs with a minus (−) sign.
All other terms have their sign swapped too; therefore, due to the anti-symmetry of the wave
function, the second wave function has a minus sign with respect to the first:

|n 1 = 1, n 2 = 1, n 3 = 1〉 = − |n 2 = 1, n 1 = 1, n 3 = 1〉 .

We want to introduce again an annihilation operator. We may try to define this as:
\[
a_j |n_1 \ldots n_j = 1 \ldots\rangle = |n_1 \ldots n_j = 0 \ldots\rangle
\]
and
\[
a_j |n_1 \ldots n_j = 0 \ldots\rangle = 0.
\]
This however leads to a problem. The point is that

a 2 |n 1 = 1, n 2 = 1, n 3 = 1〉 = |n 1 = 1, n 3 = 1〉

and
a 2 |n 2 = 1, n 1 = 1, n 3 = 1〉 = |n 1 = 1, n 3 = 1〉 .
where the states on the left hand side in these two equalities differ by a factor −1! So it appears that, acting with $a_2$ on two states differing by a minus sign, we get the same result. This can
obviously not be right, and we must introduce a more subtle definition for the annihilation
operator:
\[
a_j |n_1 \ldots n_j = 1 \ldots\rangle = (-)^{\Sigma_j} |n_1 \ldots n_j = 0 \ldots\rangle,
\]
where
\[
\Sigma_j = \sum_{k=1}^{j-1} n_k,
\]

i.e. Σ j counts the number of particles in the orbitals occurring before j in the occupation
number state. It is now easy to check that the minus-sign problem is resolved with this defi-
nition. We can formulate the definition of a j for the two cases where n j = 0 or 1 in a concise
way as follows:
\[
a_j |n_1 \ldots n_j \ldots\rangle = n_j\, (-)^{\Sigma_j} |n_1 \ldots 1 - n_j \ldots\rangle.
\]

A similar sign problem has to be dealt with when defining a creation operator. We may
naively define
\[
a_j^\dagger |n_1 \ldots n_j = 0 \ldots\rangle = |n_1 \ldots n_j = 1 \ldots\rangle
\]

and require a †j to give zero when acting on a state with n j = 1. In order to let this operator be
the Hermitian conjugate of the correct annihilation operator, we must however have

\[
a_j^\dagger |n_1 \ldots n_j \ldots\rangle = (1 - n_j)(-)^{\Sigma_j} |n_1 \ldots 1 - n_j \ldots\rangle.
\]

The definitions introduced here have a nice consequence, as can be seen by calculating the anti-commutation relation between, say, two annihilation operators for orbitals $j$ and $l$ ($j \neq l$). First, we note that the result of the anti-commutator acting on the state is zero
if at least one of n j and n l is zero. If they are both 1, we have:
\[
(a_j a_l + a_l a_j) |\ldots n_j = 1 \ldots n_l = 1 \ldots\rangle = a_j (-)^{\Sigma_l} |\ldots n_j = 1 \ldots 0 \ldots\rangle + a_l (-)^{\Sigma_j} |\ldots 0 \ldots n_l = 1 \ldots\rangle.
\]

In working out the last expression, we should realise that there is a difference of
1 between the Σl in the first term and the one arising from a l in the second term, as in the
latter, the particle in orbital j has already been removed. Therefore, we obtain two terms
with opposite signs and the commutator vanishes. For j = l , the anti-commutator obviously
vanishes, as we try to remove two particles from the same state. So we have found for each
j,l:
\[
\{ a_j, a_l \} = a_j a_l + a_l a_j = 0,
\]
where {·, ·} denotes the anti-commutator.
A similar anti-commutation relation can be found for the creation operators:
\[
\{ a_j^\dagger, a_l^\dagger \} = 0.
\]

Now we analyse the anti-commutator between a creation and an annihilation operator:

\[
(a_j a_j^\dagger + a_j^\dagger a_j) |\ldots n_j \ldots\rangle = a_j (1 - n_j)(-)^{\Sigma_j} |\ldots 1 - n_j \ldots\rangle + a_j^\dagger n_j (-)^{\Sigma_j} |\ldots 1 - n_j \ldots\rangle = \left[ (1 - n_j)^2 + n_j^2 \right] |\ldots n_j \ldots\rangle,
\]
where, for $n_j = 0$ or 1, the result is always $|\ldots n_j \ldots\rangle$ preceded by 1, leading to
\[
\{ a_j, a_j^\dagger \} = 1.
\]

It can be shown that, for $j \neq l$, $\{ a_j, a_l^\dagger \} = 0$, so that we have
\[
\{ a_j, a_l^\dagger \} = \delta_{jl}.
\]

All in all we have:

For bosons, we can define annihilation and creation operators for removing, respectively adding, particles from or to orbitals. These annihilation and creation operators satisfy the following commutator algebra:
\[
[a_j, a_l] = [a_j^\dagger, a_l^\dagger] = 0, \qquad [a_j, a_l^\dagger] = \delta_{jl}.
\]
For fermions, we can define similar operators. However, in that case the commutation relations should be replaced by anti-commutation relations:
\[
\{ a_j, a_l \} = \{ a_j^\dagger, a_l^\dagger \} = 0; \qquad \{ a_j, a_l^\dagger \} = \delta_{jl}.
\]
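The fermionic sign structure can likewise be checked numerically. The sketch below (illustrative only, not from the notes) builds the matrices of $a_j$ on the $2^M$ occupation-number basis of $M$ orbitals, including the $(-)^{\Sigma_j}$ string, and verifies the anti-commutation relations; $M = 3$ is an arbitrary choice:

```python
import numpy as np
from itertools import product

M = 3                                    # number of orbitals (arbitrary)
basis = list(product([0, 1], repeat=M))  # occupation-number states (n_1, ..., n_M)
index = {state: i for i, state in enumerate(basis)}

def annihilation(j):
    """Matrix of a_j: a_j|..n_j..> = n_j (-1)^(n_1+...+n_{j-1}) |..n_j - 1..>."""
    A = np.zeros((len(basis), len(basis)))
    for state in basis:
        if state[j] == 1:
            sign = (-1) ** sum(state[:j])
            new = list(state)
            new[j] = 0
            A[index[tuple(new)], index[state]] = sign
    return A

a = [annihilation(j) for j in range(M)]
anti = lambda x, y: x @ y + y @ x
ok = all(np.allclose(anti(a[j], a[l].T), np.eye(len(basis)) * (j == l))
         and np.allclose(anti(a[j], a[l]), 0)
         for j in range(M) for l in range(M))
print(ok)  # True: the operators obey the canonical anti-commutation relations
```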

9.3 INTERACTING PARTICLE SYSTEMS


We consider a many-particle system with the Hamiltonian introduced in section 9.1:

\[
H = \sum_{j=1}^{N} \frac{p_j^2}{2m} + \sum_{j=1}^{N} V(\mathbf{r}_j) + \frac{1}{2} \sum_{j,l=1}^{N} v(\mathbf{r}_j - \mathbf{r}_l) = \sum_{j=1}^{N} h(j) + \frac{1}{2} \sum_{j,l=1}^{N} v(\mathbf{r}_j - \mathbf{r}_l),
\]
where $h(j) = \frac{p_j^2}{2m} + V(\mathbf{r}_j)$ is the single-particle Hamiltonian. This system describes point par-
ticles moving in an external potential V and interacting via a two-body potential v. In the
latter, the factor 1/2 is included with the omission of the condition that j < l in the sum to
compensate for double counting. We implicitly assume that v = 0 for j = l (if v 6= 0 for j = l ,
we could include it into the external potential V ). We now formulate this Hamiltonian in
terms of creation and annihilation operators.
We therefore evaluate the matrix element of the Hamiltonian for two basis states which,

for fermion systems, by convention, are two Slater determinants of the form:

\[
\langle 1, 2, \ldots, N | \psi_{AS} \rangle = \frac{1}{\sqrt{N!}} \sum_P \mathrm{sgn}(P)\, \langle 1 | \phi_{P_1} \rangle \langle 2 | \phi_{P_2} \rangle \langle 3 | \phi_{P_3} \rangle \cdots \langle N | \phi_{P_N} \rangle.
\]

We first consider the matrix elements of the one-body potential h for two such basis states.
In view of the indistinguishability of the particles, we may consider just $h(1)$ and multiply the result by $N$. We call the Slater determinants $|\psi_{AS}^A\rangle$ and $|\psi_{AS}^B\rangle$. If we expand the two Slater determinants in products of orbitals $\phi_j$, we obtain terms of the form

\[
\langle \phi_{A1} \phi_{A2} \cdots \phi_{AN} |\, h(1)\, | \phi_{B1} \phi_{B2} \cdots \phi_{BN} \rangle,
\]

where anti-symmetrisation is not implicit in the bra- and ket states. Now note that for this
expression not to vanish, we must have A2 = B 2, A3 = B 3 etcetera, as the set of orbitals φ j
is orthonormal. For a non-vanishing matrix element, only A1 and B 1 may be different, as
they occur on both sides of h(1). We conclude that matrix elements of h between two Slater
determinants are non-zero only if the orbitals from which the two Slater determinants have been composed differ at most by one orbital. If indeed an orbital $j$ occurs only in $|\psi_{AS}^A\rangle$ and, likewise, orbital $l$ only occurs in $|\psi_{AS}^B\rangle$, while all the other orbitals in both Slater determinants are pairwise equivalent, the matrix element will be

\[
\langle \phi_j |\, h\, | \phi_l \rangle.
\]

The pre-factor is correct, since the orbitals j and l must both be occupied by particle 1; the
other N − 1 orbitals must be in the same order in both A and B , but that still leaves room for
(N −1)! permutations. Together with the factor N arising from the fact that we have looked at
h(1) only, this cancels
¯ the 1/N ! from the normalisation
¯ B of the two Slater determinants.
A
If the orbitals in ψAS are the same as those in ψAS , we obtain
® ®
¯ ¯

φ j |h|φ j
X­ ®
j

(note that the sum over j is the same for the set of orbitals in A as in B ). Again, we can assign
all the N orbitals j in the sets A and B to the first place (hence the sum over j ) and for each
choice we can permute the other orbitals in (N − 1)! ways, which, together with the prefactor
N , cancels the normalisation of 1/N !.
Now we claim that the operator

\[
\sum_{jl} h_{jl}\, a_j^\dagger a_l \quad \text{with} \quad h_{jl} = \langle \phi_j | h | \phi_l \rangle, \tag{9.3}
\]

where the sum is over all possible indices j and l , is the correct matrix representation for this
one-particle Hamiltonian. This can easily be seen by studying its matrix elements between
two Slater determinants, which we shall now write in the occupation number representation.
When the two states on the left- and right hand side contain exactly the same set of occupied
orbitals, we have
\[
\sum_{jl} h_{jl} \left\langle n_1 n_2 \ldots \left| a_j^\dagger a_l \right| n_1 n_2 \ldots \right\rangle = \sum_{j \in A} h_{jj} = \sum_j \langle \phi_j |\, h\, | \phi_j \rangle.
\]

Now suppose that in the ket, N −1 occupied orbitals are identical to orbitals in the bra-vector
so that these two vectors differ in only one orbital which is the orbital k in the bra, and m in
the ket. Our operator gives again the right result:
\[
\sum_{jl} h_{jl} \left\langle n_1 n_2 \ldots n_k = 1 \ldots \left| a_j^\dagger a_l \right| n_1 n_2 \ldots n_m = 1 \ldots \right\rangle = h_{km} = \langle \phi_k |\, h\, | \phi_m \rangle.
\]

It is also easily seen that, when more than one orbital differs in the bra and the ket, the matrix element vanishes. As the operator $\sum_{jl} h_{jl}\, a_j^\dagger a_l$ gives the correct matrix elements between all possible basis vectors, it is the correct representation of the single particle Hamiltonian.
Now we turn to the two-particle interaction, which we evaluate again between $|\psi_{AS}^A\rangle$ and $|\psi_{AS}^B\rangle$. Similar to the analysis of the single-particle operator $h$, we can evaluate the action of this interaction between particles 1 and 2, and multiply the result by $N(N-1)/2$, which is the
total number of distinct pairs we can make. Similar to the case of the single particle operator,
we can argue that we should have two identical sets of orbitals in the bra- and ket vector,
which should be occupied by particles 3 to N ; only the orbitals of particle 1 and 2 are free in
both. Interestingly, a similar analysis as above, tells us that for identical orbital sets in the two
vectors the matrix element reduces to
\[
\frac{1}{2} \sum_{jk \in A} \left( \langle jk|v|jk \rangle - \langle jk|v|kj \rangle \right),
\]

where
\[
\langle jk|v|lm \rangle = \int d^3r_1 \int d^3r_2\, \phi_j^*(\mathbf{r}_1) \phi_k^*(\mathbf{r}_2)\, v(\mathbf{r}_1 - \mathbf{r}_2)\, \phi_l(\mathbf{r}_1) \phi_m(\mathbf{r}_2).
\]

The pre-factor requires some care. We can select N (N − 1) orbital pairs j k. This should then
be multiplied by the pre-factor 1/N ! deriving from the normalisation factors of the many
body wave functions. Furthermore, the remaining N −2 states (i.e. not j and k) can be ordered
in (N − 2)! ways. Together with the prefactor of 1/2 in the Hamiltonian this multiplies to

exactly 1/2. Note that the sum over j and k is unrestricted: each pair occurs twice in this
sum.
Now we consider the case where the two many-particle states differ in one orbital. In that
case, the two different orbitals (which we denote $k \in A$ and $m \in B$) should involve particles 1 and 2, and we can choose one additional orbital (which we denote $|j\rangle$) out of the set of $N-1$ shared orbitals. The matrix element then becomes
\[
\frac{1}{2} \sum_{j \in A,\, j \neq k} \left( \langle jk|v|jm \rangle - \langle jk|v|mj \rangle \right).
\]

The combinatorics: k and m are fixed, and the sum over j takes care of all the other orbitals.
Once j is chosen, N − 2 orbitals are left, which can be ordered in (N − 2)! ways. Swapping the
order of the two orbitals in both the left and right many body wave function gives a factor of 2,
and combining all this with the pre-factor and the normalisation of the Slater determinants,
we see that the correct pre-factor is 1.
Finally, if two orbitals on the bra have no counterpart in the ket, we have

\[
\frac{1}{2} \left( \langle jk|v|lm \rangle - \langle jk|v|ml \rangle \right),
\]
where j , k are the two labels of the bra differing from l m in the ket. All other matrix elements
give zero.
We now show that the operator
\[
\frac{1}{2} \sum_{jklm} \langle jk|v|lm \rangle\, a_j^\dagger a_k^\dagger a_m a_l
\]

gives the same matrix elements in the occupation number representation. Let us consider
the case where the orbitals in bra and ket are all identical. Then it is easy to see that when we
put our operator between the two states, either j = l and k = m or j = m and k = l , so we are
left with

\[
\frac{1}{2} \sum_{jk \in A} \Big( \langle jk|v|jk \rangle \left\langle \ldots n_j = 1 \ldots n_k = 1 \ldots \left| a_j^\dagger a_k^\dagger a_k a_j \right| \ldots n_j = 1 \ldots n_k = 1 \ldots \right\rangle + \langle jk|v|kj \rangle \left\langle \ldots n_j = 1 \ldots n_k = 1 \ldots \left| a_j^\dagger a_k^\dagger a_j a_k \right| \ldots n_j = 1 \ldots n_k = 1 \ldots \right\rangle \Big),
\]

which directly yields


\[
\frac{1}{2} \sum_{jk \in A} \left( \langle jk|v|jk \rangle - \langle jk|v|kj \rangle \right),
\]

where the minus sign on the right hand side follows from the anti-commutation relations of
the fermion creation and annihilation operators. The correctness for the two other cases can
be checked in a similar way, and is left as an exercise to the reader. We summarise the very
important results obtained in this section:

For a many-body system described by an $N$-particle Hamiltonian
\[
H = \sum_{j=1}^{N} \frac{p_j^2}{2m} + \sum_{j=1}^{N} V(\mathbf{r}_j) + \frac{1}{2} \sum_{j,l=1}^{N} v(\mathbf{r}_j - \mathbf{r}_l) = \sum_{j=1}^{N} h(j) + \frac{1}{2} \sum_{j,l=1}^{N} v(\mathbf{r}_j - \mathbf{r}_l),
\]
in Fock space this Hamiltonian is:
\[
\hat H = \sum_{jk} h_{jk}\, a_j^\dagger a_k + \frac{1}{2} \sum_{jklm} \langle jk|v|lm \rangle\, a_j^\dagger a_k^\dagger a_m a_l. \tag{9.4}
\]
This is the formulation of the Hamiltonian in second quantisation.
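To make Eq. (9.4) concrete, the sketch below (added for illustration, with arbitrary example numbers for $h_{jk}$ and $\langle jk|v|lm \rangle$) assembles the Fock-space matrix of such a Hamiltonian for a few fermionic orbitals, using the same occupation-number matrices for $a_j$ as in the earlier sketch:

```python
import numpy as np
from itertools import product

M = 3                                    # number of orbitals (arbitrary)
basis = list(product([0, 1], repeat=M))
index = {s: i for i, s in enumerate(basis)}

def annihilation(j):
    A = np.zeros((len(basis), len(basis)))
    for s in basis:
        if s[j] == 1:
            t = list(s); t[j] = 0
            A[index[tuple(t)], index[s]] = (-1) ** sum(s[:j])
    return A

a = [annihilation(j) for j in range(M)]
ad = [x.T for x in a]

rng = np.random.default_rng(0)
h = rng.normal(size=(M, M)); h = (h + h.T) / 2            # example one-body matrix h_jk
v = rng.normal(size=(M, M, M, M)) * 0.1                    # example <jk|v|lm>
v = (v + v.transpose(2, 3, 0, 1)) / 2                      # symmetrised so that H is Hermitian

H = sum(h[j, k] * ad[j] @ a[k] for j in range(M) for k in range(M))
H += 0.5 * sum(v[j, k, l, m] * ad[j] @ ad[k] @ a[m] @ a[l]
               for j in range(M) for k in range(M)
               for l in range(M) for m in range(M))
print(H.shape)  # (8, 8): one matrix acting on the whole Fock space of 3 orbitals
```

Diagonalising this single matrix gives the energies of all particle-number sectors at once, which is exactly the convenience the Fock-space formulation buys.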



9.4 CHANGE OF BASIS – FIELD OPERATORS


We have seen that the creation and annihilation operators a †j and a j create or annihilate
particles in an orbital $|\phi_j\rangle$. Now suppose we want creation and annihilation operators which create and annihilate particles in orbitals taken from a different set, called $|u_\alpha\rangle$. The orbital sets $|\phi_j\rangle$ and $|u_\alpha\rangle$ are both basis sets of the same single particle Hilbert space $\mathcal{H}$. In order to work out the form of a creation operator $a_\alpha^\dagger$ creating a particle in the orbital $u_\alpha$, we write

the action of this operator in a particular way. When a creation operator $a_\alpha^\dagger$ acts on an $N$-particle Slater determinant $|\psi^{(N)}\rangle$ which itself does not contain the orbital $|u_\alpha\rangle$, it yields an $(N+1)$-particle state:
\[
a_\alpha^\dagger |\psi^{(N)}\rangle = \hat{A} \left( |u_\alpha\rangle\, |\psi^{(N)}\rangle \right),
\]
where we have introduced the anti-symmetrisation operator Â, which constructs a Slater de-
terminant consisting of the orbitals in $|\psi^{(N)}\rangle$ and $u_\alpha$. Now we can write
\[
a_\alpha^\dagger |\psi^{(N)}\rangle = \hat{A} \sum_j \left( |\phi_j\rangle \langle \phi_j | u_\alpha \rangle \right) |\psi^{(N)}\rangle.
\]

As $\langle \phi_j | u_\alpha \rangle$ is a number, we can move it in front of the anti-symmetrisation operator:
\[
a_\alpha^\dagger |\psi^{(N)}\rangle = \sum_j \langle \phi_j | u_\alpha \rangle\, \hat{A} \left( |\phi_j\rangle |\psi^{(N)}\rangle \right) = \sum_j \langle \phi_j | u_\alpha \rangle\, a_j^\dagger |\psi^{(N)}\rangle,
\]

showing that
\[
a_\alpha^\dagger = \sum_j \langle \phi_j | u_\alpha \rangle\, a_j^\dagger. \tag{9.5}
\]

For the annihilation operators we have the transformation rule

\[
a_\alpha = \sum_j \langle u_\alpha | \phi_j \rangle\, a_j, \tag{9.6}
\]

which is just the Hermitian conjugate of (9.5).
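As a simple worked example (added for illustration): for two orbitals and the rotated pair $|u_\pm\rangle = (|\phi_1\rangle \pm |\phi_2\rangle)/\sqrt{2}$ (bonding/antibonding-type combinations), rule (9.6) gives
\[
a_\pm = \frac{1}{\sqrt{2}} \left( a_1 \pm a_2 \right),
\]
and one checks directly from the (anti)commutation relations of $a_1$ and $a_2$ that the $a_\pm$ again satisfy the canonical algebra, e.g. $\{a_+, a_+^\dagger\} = \tfrac{1}{2}\left( \{a_1, a_1^\dagger\} + \{a_2, a_2^\dagger\} \right) = 1$ for fermions.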


Finally, we introduce the concept of field operators. These are creation and annihilation operators of a particle located at $\mathbf{r}$. Using the rule we have just derived, we see that the field operators should be defined as
\[
\psi^\dagger(\mathbf{r}) = \sum_j a_j^\dagger \langle \phi_j | \mathbf{r} \rangle; \qquad \psi(\mathbf{r}) = \sum_j a_j \langle \mathbf{r} | \phi_j \rangle.
\]

These operators create or annihilate a particle in a state |r〉.


We now write the Hamiltonian we have discussed in section 9.3 in terms of these field
operators. Using the basis |r〉, the one-particle Hamiltonian can directly be written as
\[
H^{(1)} = \sum_i h(i) = \int d^3r \int d^3r'\, \psi^\dagger(\mathbf{r}) \langle \mathbf{r}|h|\mathbf{r}' \rangle \psi(\mathbf{r}').
\]

For a local potential, V (r) in h, the central matrix element is diagonal and we obtain
\[
V = \int d^3r\, \psi^\dagger(\mathbf{r}) V(\mathbf{r}) \psi(\mathbf{r}).
\]

For the kinetic energy T , we have

\[
T = \sum_i \frac{p_i^2}{2m} = \frac{1}{2m} \int d^3r \int d^3r'\, \psi^\dagger(\mathbf{r}) \langle \mathbf{r}|\hat p^2|\mathbf{r}' \rangle \psi(\mathbf{r}').
\]

Fourier expanding to transform the integrals over r we obtain


\[
T = \frac{1}{2m} \int d^3p \int d^3r \int d^3r'\, \psi^\dagger(\mathbf{r}) \langle \mathbf{r}|\mathbf{p} \rangle\, p^2\, \langle \mathbf{p}|\mathbf{r}' \rangle \psi(\mathbf{r}') = \frac{1}{2m} \int d^3p\, \psi^\dagger(\mathbf{p})\, p^2\, \psi(\mathbf{p}),
\]

where we have used the inverse Fourier transform


\[
\psi(\mathbf{r}) = \int \langle \mathbf{r}|\mathbf{p} \rangle \langle \mathbf{p}|\psi \rangle\, d^3p.
\]

The expression $p^2 \psi(\mathbf{p})$ is recognised as the Fourier transform of $-\hbar^2 \nabla^2 \psi(\mathbf{r})$, and transforming back, we have
\[
T = -\frac{\hbar^2}{2m} \int d^3r\, \psi^\dagger(\mathbf{r}) \nabla^2 \psi(\mathbf{r}).
\]
All in all we therefore have
\[
H^{(1)} = \int d^3r\, \psi^\dagger(\mathbf{r})\, \hat h\, \psi(\mathbf{r}).
\]

Now the interaction term can be analysed in a similar way, leading to the result

\[
\hat v = \frac{1}{2} \int d^3r\, d^3r'\, \psi^\dagger(\mathbf{r}) \psi^\dagger(\mathbf{r}') v(\mathbf{r} - \mathbf{r}') \psi(\mathbf{r}') \psi(\mathbf{r}).
\]

9.5 EXAMPLES OF MANY-BODY SYSTEMS

9.5.1 MANY NON-RELATIVISTIC PARTICLES IN A BOX
The particle in a box is a very common problem which is covered in textbooks on elemen-
tary quantum mechanics. For convenience, we use a big box here with periodic boundary
conditions, so that the solutions of the one-particle Hamiltonian are of the form

\[
\frac{1}{\sqrt{V}} \exp(i\mathbf{k} \cdot \mathbf{r}),
\]

where
\[
\mathbf{k} = 2\pi \left( \frac{n_x}{L_x}, \frac{n_y}{L_y}, \frac{n_z}{L_z} \right);
\]
$L_x$, $L_y$ and $L_z$ define the size of the rectangular box. The prefactor $1/\sqrt{V}$, where $V = L_x L_y L_z$,
ensures proper normalisation of the plane wave inside that box. Inside the box, the potential
is constant – we take it to be 0. It is natural to introduce creation and annihilation operators $a^\dagger(\mathbf{p})$ and $a(\mathbf{p})$, which create and annihilate a particle with momentum $\mathbf{p} = \hbar\mathbf{k}$.
Now suppose we want to count the number of particles in the box. The state of the sys-
tem may be a superposition of Slater determinants of different sizes and involving different
orbitals, but the number of particles is always found using the operator

\[
N = \sum_{\mathbf{p}} a_{\mathbf{p}}^\dagger a_{\mathbf{p}}.
\]

For a large box, the sum over p can be turned into an integral as we precisely know which
vectors k are summed over:
\[
\sum_{\mathbf{k}} \to \frac{V}{(2\pi)^3} \int d^3k,
\]
where $V = L_x L_y L_z$, so, using $\mathbf{p} = \hbar\mathbf{k}$, we have

\[
N = \frac{V}{(2\pi\hbar)^3} \int d^3p\, a^\dagger(\mathbf{p}) a(\mathbf{p}). \tag{9.7}
\]

Note that a p is used for the discrete set of p’s which fit into the periodic box, whereas the
notation a(p) is used for continuous p’s.
We can do the same for the Hamiltonian

\[
H = \frac{V}{(2\pi\hbar)^3} \int d^3p\, \frac{p^2}{2m}\, a^\dagger(\mathbf{p}) a(\mathbf{p}).
\]

Instead of plane wave orbitals, we can also use orbitals describing particles localised at r:
the corresponding operators are the field operators introduced in section 9.4. We can con-
struct those operators from our orthonormal plane wave basis set. This gives
\[
\psi^\dagger(\mathbf{r}) = \sum_{\mathbf{p}} \frac{e^{-i\mathbf{k}\cdot\mathbf{r}}}{\sqrt{V}}\, a_{\mathbf{p}}^\dagger = \frac{\sqrt{V}}{(2\pi\hbar)^3} \int d^3p\, e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\, a^\dagger(\mathbf{p})
\]
and
\[
\psi(\mathbf{r}) = \frac{\sqrt{V}}{(2\pi\hbar)^3} \int d^3p\, e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}\, a(\mathbf{p}).
\]
We can now write the total number of particles as
\[
N = \int d^3r\, \psi^\dagger(\mathbf{r}) \psi(\mathbf{r}),
\]

and we show that this expression is identical to (9.7). Writing out the expressions for ψ† and
ψ, we obtain:
\[
N = \frac{V}{(2\pi\hbar)^6} \int d^3r \int d^3p \int d^3p'\, e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\, a^\dagger(\mathbf{p})\, e^{i\mathbf{p}'\cdot\mathbf{r}/\hbar}\, a(\mathbf{p}').
\]
Integrating over $\mathbf{r}$ yields $(2\pi\hbar)^3 \delta^{(3)}(\mathbf{p} - \mathbf{p}')$, and the expression indeed reduces to (9.7).
The Hamiltonian becomes

\[
H = \int d^3r\, \psi^\dagger(\mathbf{r}) \frac{-\hbar^2}{2m} \nabla^2 \psi(\mathbf{r}).
\]
You may ask at this stage what we can learn from the formalism just presented. In this
form, it is not directly clear what the use of this many-body formulation of free particles is. In fact, this formalism is very powerful when dealing with interacting particles. We will go into this a bit in chapter 10.

9.5.2 THE HEISENBERG MODEL AND THE JORDAN-WIGNER TRANSFORMATION

In this section we shall illustrate how a second-quantised form arises when analysing a one-dimensional chain of spin-1/2 particles. Consider such a chain of length N, where there is
a spin-1/2 particle sitting on each site of the chain, and where there is a nearest neighbour
interaction:
\[
H = -\sum_i \left[ J_\perp \left( S_i^x S_{i+1}^x + S_i^y S_{i+1}^y \right) + J_z\, S_i^z S_{i+1}^z \right].
\]

In this section, sums run from site 1 to $N$; periodic boundary conditions are assumed: $1 \equiv N + 1$. The operators $S_i^{x,y,z}$ are the spin operators acting on site $i$. They satisfy the usual commutation relations
\[
[S_i^\alpha, S_i^\beta] = i \epsilon_{\alpha\beta\gamma} S_i^\gamma,
\]
where $\alpha$, $\beta$ and $\gamma$ run over the Cartesian coordinates $x, y, z$ and $\epsilon_{\alpha\beta\gamma}$ is the anti-symmetric tensor.
Let us turn this Hamiltonian into a fermion chain by identifying the spin-down state |↓〉
with ‘empty’ (no particle) and |↑〉 with ‘occupied’ (one particle):

|↑〉 ≡ |1〉
|↓〉 ≡ |0〉 .

Note that we have left out the site index i for the time being. In the spin picture, we can switch
between up and down via the raising and lowering operators

S + = S x + iS y
S − = S x − iS y ,

which have the form (in the basis |↑〉, |↓〉):


\[
S^+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}; \qquad S^- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.
\]

In the language of the occupation numbers, these operators take the form of creation and
annihilation operators
S+ ↔ c† S− ↔ c
where c and c † are the usual creation and annihilation operators:

c † |0〉 = |1〉 , c |1〉 = |0〉 ;


c † |1〉 = 0, c |0〉 = 0.

It seems that we can now just replace the S i+ by the c i† and similarly S i− → c i . However, this
poses a problem, as the spin operators at different sites commute, whereas, if the $c_i$, $c_i^\dagger$ were genuine fermion operators, they should anti-commute at different sites. A small modification in the definition of the $c_i$ and $c_i^\dagger$ however fixes this. Using $\Sigma_i \equiv \sum_{j<i} n_j$, where $n_j = 1$ for the occupied, and $n_j = 0$ for the unoccupied sites, we obtain correct fermion operators $c_i$ and $c_i^\dagger$ according to
\[
c_i^\dagger = (-)^{\Sigma_i} S_i^+; \qquad c_i = (-)^{\Sigma_i} S_i^-.
\]

You should verify for yourself that the c i and c i† all anti-commute at different sites!
Using the fact that $S_i^z = S_i^+ S_i^- - 1/2$ (we take $\hbar \equiv 1$), we can now formulate the Hamiltonian in terms of the $c_i$:
\[
H = -\sum_i \left[ \frac{J_\perp}{2} \left( c_i^\dagger c_{i+1} + c_{i+1}^\dagger c_i \right) + J_z \left( \frac{1}{4} - c_i^\dagger c_i + c_i^\dagger c_i\, c_{i+1}^\dagger c_{i+1} \right) \right].
\]
We have arrived at a formulation of the Heisenberg Hamiltonian in terms of a fermion chain. Note that the fermions have no spin. The first part of the Hamiltonian describes hopping:
the ability of fermions to move from site i to i + 1 and back. The second term contains a
contribution proportional to n i n i +1 . This is interpreted as a ‘Coulomb interaction’ between
neighbouring sites: it is non-zero only when the two sites are occupied by a fermion.
In problem 9 we shall solve the spectrum for the case where J z = 0.
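As a numerical illustration of this mapping (a sketch added here; the coupling, chain length and the choice of open boundary conditions — which avoids the boundary subtleties of the Jordan-Wigner string — are all arbitrary), one can diagonalise the spin Hamiltonian for $J_z = 0$ directly in the $2^N$-dimensional spin space and compare its ground-state energy with that of the corresponding free-fermion hopping matrix:

```python
import numpy as np
from functools import reduce

N, Jp = 8, 1.0                       # chain length and J_perp (arbitrary choices)
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
I2 = np.eye(2)

def site_op(op, i):
    """Embed a single-site spin operator at site i of the N-site chain."""
    ops = [I2] * N
    ops[i] = op
    return reduce(np.kron, ops)

# XY chain (J_z = 0) with open boundary conditions, diagonalised brute force
H_spin = sum(-Jp * (site_op(sx, i) @ site_op(sx, i + 1)
                    + site_op(sy, i) @ site_op(sy, i + 1)) for i in range(N - 1))
E_spin = np.linalg.eigvalsh(H_spin)[0]

# Equivalent free fermions: hopping -J_perp/2 on nearest-neighbour bonds
h = np.zeros((N, N))
for i in range(N - 1):
    h[i, i + 1] = h[i + 1, i] = -Jp / 2
eps = np.linalg.eigvalsh(h)
E_fermion = eps[eps < 0].sum()       # fill all negative-energy modes

print(E_spin, E_fermion)             # the two ground-state energies agree
```

The exponentially large spin problem thus collapses to diagonalising an $N \times N$ hopping matrix, which is the practical payoff of the Jordan-Wigner transformation.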

9.6 SUMMARY
In this chapter, we have introduced the Fock space, defined as

F = H (1) ⊕ H (2) ⊕ H (3) ⊕ H (4) . . .

Here, H (N ) is the Hilbert space for N -particles. Within each Hilbert space, the states are
either symmetric under particle exchange – the particles are then called bosons – or anti-
symmetric – then they are fermions. The spin statistics theorem tells us that particles with
integer spin are bosons and those with half-integer spin are fermions. A convenient way to
denote states with n j particles in level j is the occupation number representation:

\[
|\psi\rangle = |n_1, n_2, \ldots\rangle.
\]

These states are symmetric or anti-symmetric many-body states for bosons / fermions re-
spectively, with single-particle orbitals occupied with occupation numbers n j .
Creation and annihilation operators move us from the Hilbert space H (N ) to H (N + 1)
and H (N − 1) respectively. They are defined as follows:

• For bosons:
\[
a_j |n_1 n_2 \ldots n_j \ldots\rangle = \sqrt{n_j}\, |n_1 n_2 \ldots n_j - 1 \ldots\rangle
\]
and
\[
a_j^\dagger |n_1 n_2 \ldots n_j \ldots\rangle = \sqrt{n_j + 1}\, |n_1 n_2 \ldots n_j + 1 \ldots\rangle.
\]
These operators satisfy the commutation relations
\[
[a_j, a_l] = [a_j^\dagger, a_l^\dagger] = 0, \qquad [a_j, a_l^\dagger] = \delta_{jl}.
\]

• For fermions, we use $\Sigma_j = \sum_{k=1}^{j-1} n_k$ in the definition of the annihilation operator:
\[
a_j |n_1 \ldots n_j \ldots\rangle = n_j\, (-)^{\Sigma_j} |n_1 \ldots 1 - n_j \ldots\rangle,
\]
and in that of the creation operator:
\[
a_j^\dagger |n_1 \ldots n_j \ldots\rangle = (1 - n_j)(-)^{\Sigma_j} |n_1 \ldots 1 - n_j \ldots\rangle.
\]
They satisfy the anti-commutation relations:
\[
\{ a_j, a_l \} = \{ a_j^\dagger, a_l^\dagger \} = 0; \qquad \{ a_j, a_l^\dagger \} = \delta_{jl}.
\]

Moving from an orbital basis $|\phi_j\rangle$ to a new orbital basis $|u_\alpha\rangle$ changes the creation and annihilation operators as follows:
\[
a_\alpha^\dagger = \sum_j \langle \phi_j | u_\alpha \rangle\, a_j^\dagger
\]
and
\[
a_\alpha = \sum_j \langle u_\alpha | \phi_j \rangle\, a_j.
\]
The Hamiltonian of a many-particle system can be formulated in terms of creation and annihilation operators as follows:
\[
H = \sum_{jk} h_{jk}\, a_j^\dagger a_k + \frac{1}{2} \sum_{jklm} \langle jk|v|lm \rangle\, a_j^\dagger a_k^\dagger a_m a_l.
\]

9.7 PROBLEMS
1. In this problem, c and c † are fermion operators; a and a † are boson operators.

(a) Calculate, or try to write in the most compact way (i.e., using the smallest number
of operators):
[a, aa † ]
[c, a]
[c, c † c]
{c, c † c}
[c a, a † ]

(b) Show that
\[
e^{c}\, e^{c^\dagger} = 1 + c + c^\dagger + c\, c^\dagger.
\]

2. Consider a system with a one-particle Hilbert space of dimension N .



(a) Denote by NS the dimension of the Hilbert space for a system of two identical
bosonic particles. Find NS .
(b) Denote by NAS the dimension of the Hilbert space for a system of two identical
fermionic particles. Find NAS .
(c) Show NS + NAS = N 2 .
(d) Make a plot or table of NS and NAS as a function of N .
(e) Now consider a system where single electrons can occupy two orbital levels. What
is the dimension N of the one-particle Hilbert space (don’t forget that the electron
is a spin-1/2 particle!)? What is the dimension NAS of the two-electron Hilbert
space?
(f) Denote the two orbital states $|\phi_1\rangle$ and $|\phi_2\rangle$. One can construct a basis (we’ll call

it basis 1) for the two-electron system by considering all possible combinations of


symmetric (antisymmetric) orbital and antisymmetric (symmetric) spin states, so
that each basis state is overall antisymmetric. Construct this basis.
(g) Another basis (basis 2) for the two-electron system is obtained by constructing all
possible Slater determinants, starting from the four wave functions

\[
|\phi_1, +\rangle,\ |\phi_2, +\rangle,\ |\phi_1, -\rangle,\ |\phi_2, -\rangle.
\]

Find basis 2.
(h) Express the two-electron wave functions in basis 1 in terms of those in basis 2.
(i) Express the two-electron wave functions in basis 1 and in basis 2 in second quan-
tisation. i.e., using creation operators acting on the vacuum state.

3. Consider now a system of n identical particles with single-particle Hilbert space of di-
mension N .

(a) Find NS (n, N ).


(b) Find NAS (n, N ). Under what conditions is NAS (n, N ) > 0?
(c) In the previous problem, you showed that two-particle systems satisfy NS (2, N ) +
NAS (2, N ) = N 2 . Show that NS (n, N )+ NAS (n, N ) = N 2 only for n = 2. What can you
say for n > 2?

4. Consider a chain of fermions described by the Hamiltonian

\[
H = -t \sum_{j=1}^{N} \left( c_j^\dagger c_{j+1} + c_{j+1}^\dagger c_j \right),
\]

where periodic boundary conditions impose N + 1 ≡ 1.


Use the Heisenberg equation of motion

Ȧ = i [H , A]

to find the time derivative of the operators ċ j (t ) and ċ j +1 (t ).

5. We consider the scattering of particles at a beam splitter with input ports A and B and
output ports C and D (see Figure). We assume throughout that no interactions take place at the beam splitter. Scattering is described by matrix elements $\langle \psi_c | U | \psi_a \rangle = r$, $\langle \psi_d | U | \psi_a \rangle = t$, $\langle \psi_c | U | \psi_b \rangle = t$, and $\langle \psi_d | U | \psi_b \rangle = -r$, where real coefficients $r$ and $t$ satisfy $t^2 + r^2 = 1$. The operator $U$ describes the time evolution between the beginning
satisfy t 2 + r 2 = 1. The operator U describes the time evolution between the beginning
of the experiment (when the particles are still in ports A and/or B) and the end (when
the particles are in C and/or D).

[Figure: schematic of a beam splitter, with input ports A and B and output ports C and D.]

(a) Show that the matrix
\[
S = \begin{pmatrix} r & t \\ t & -r \end{pmatrix}
\]
is unitary. Why is this required?
(b) Consider two photons of identical polarisation and frequency, one in A and the
other in B, incident on the beam splitter. Write the two-particle input wave func-
tion. Don’t forget to symmetrise!
(c) Calculate the output wave function. What can you say for $r = t = 1/\sqrt{2}$? This
quantum phenomenon is called bunching. What would you expect classically?
If you wonder what we mean by classical, imagine two billiard balls scattering
independently.
(d) Similarly, consider now the case of two electrons, one at each input, and both spin
up. Write the two-electron input wave function. Don’t forget to anti-symmetrise!
(e) Calculate the output wave function. Explain why this result is different from what
you would expect classically. This phenomenon, you might have already guessed,
is known as anti-bunching.
(f) Finally, consider two incident electrons, again one on each arm, but now in a spin
singlet configuration. Write the two-electron input wave function, and calculate
the output wave function. Do the electrons bunch or anti-bunch? Therefore, does
the overall particle symmetry (fermion, boson) dictate whether the two particles
will bunch or anti-bunch upon scattering at a beam splitter?

6. We will now repeat the previous exercise, but using second quantisation. Define $a$ and $b$ as the annihilation operators of incident states $|\psi_a\rangle$ and $|\psi_b\rangle$, respectively. Similarly, define $c$ and $d$ as the annihilation operators of outgoing states $|\psi_c\rangle$ and $|\psi_d\rangle$.

(a) Using second quantisation, write the input state for the case of two identical pho-
tons. Use a, b, c and d for the annihilation operators for particles in port A, B , C
and D respectively.
(b) In the Heisenberg picture, the final output operators are related to the initial input operators by
\[
\begin{pmatrix} c \\ d \end{pmatrix} = S \begin{pmatrix} a \\ b \end{pmatrix}.
\]
Invert to find expressions for a and b in terms of c and d .
Invert to find expressions for a and b in terms of c and d .
(c) Use this result and the bosonic commutation relations to calculate the output
state. You should get the same result as in 5(c).

(d) Also using second quantisation, write the input state for the case of two electrons
with identical spin. Calculate the output state by exploiting only the fermionic
commutation relations. You should get the same result as in 5(e).
(e) Now consider the two-electron case in which the electrons are initially in a spin

singlet. Note that for creating such a state, you need creation operators such as $a_\sigma^\dagger$, where $\sigma = \uparrow, \downarrow$ denotes the spin. Write the input state in second quantisation. Calculate the output state. You should get the same result as in 5(f).

7. In this problem, we consider the scattering of coherent states at a beamsplitter. It is


advised to use the operator formalism of the previous problem!

(a) Consider a coherent state |α〉 incident in A and vacuum |0〉 incident in B. Calculate
the output state. Show that it consists of a product of coherent states in C and D.
(b) Consider now coherent states incident at both inputs, $|\alpha\rangle$ in A, and $|\beta\rangle$ incident

in B. Calculate the output state. Is there entanglement produced between the two
output beams in this case?

8. Consider a particle moving on a periodic line of length L. The particle is subject to a


(periodic) potential V (x).

(a) Write up an equation for the expectation value of the energy of a state $|\psi\rangle$ which is normalised on the periodic line.


(b) Now we discretise the particle positions to a very narrowly spaced grid with grid
constant h = L/N . Give the representation of the Laplace operator on that grid.
Write the expectation value of the energy now in terms of a sum over the sites of
the dense grid.
(c) Show that a Hamiltonian of the form
\[
H = \sum_{i=0}^{N-1} \left[ A \left( a_i^\dagger a_{i+1} + a_{i+1}^\dagger a_i \right) + B_i\, a_i^\dagger a_i \right]
\]

gives the same expectation for the energy of a single particle for particular values
of A and B i . Calculate these values.

9. Consider the hopping Hamiltonian


\[
H = -\sum_j \frac{J_\perp}{2} \left( c_j^\dagger c_{j+1} + c_{j+1}^\dagger c_j \right).
\]

Now introduce the Fourier transform of the c-operators

\[
c_k = \frac{1}{\sqrt{N}} \sum_j e^{ikj}\, c_j,
\]

where the sum runs over the sites j and k takes on the values 2πn/N . Show that the
Hamiltonian can be rewritten as
\[
H = \sum_k \frac{J_\perp}{2}\, \omega_k\, c_k^\dagger c_k.
\]

Calculate the dispersion relation ωk .



10. For theoretically studying properties of electrons in a solid, often the so-called Hubbard
model is considered. This model was solved exactly by Lieb and Wu in the late sixties. It predicts for some parameter values a transition between a conductor and an insulator.
The Hubbard model describes electrons that hop from atom to atom. Only nearest
neighbour hopping is allowed. In the one-dimensional version, the hopping is de-
scribed by the following term in the Hamiltonian:
\[
T = \sum_{i,s} \left( \tau\, c_{i,s}^\dagger c_{i+1,s} + \tau^*\, c_{i+1,s}^\dagger c_{i,s} \right),
\]

where i labels the sites on a one-dimensional chain of atoms (sites) where the electrons
can reside. It runs from 1 to N and it is periodic, i.e. 1 ≡ N + 1. The label s is for the
spin and can be + or −. The operators c and c † are fermion creation and annihilation
operators.

(a) Explain why this term describes hopping along the chain.

The second term in the Hamiltonian provides an energy penalty U for two electrons to
be at the same atom:
\[
V = U \sum_i n_{i,-}\, n_{i,+}.
\]

(b) Show that the particle number is conserved.


(c) Let $S_{i,\mu} = \frac{\hbar}{2} \sum_{ss'} c_{i,s}^\dagger (\sigma_\mu)_{ss'} c_{i,s'}$ be the spin operator at site $i$, with $\sigma_\mu$ the Pauli matrices. Compute
\[
|\mathbf{S}_i|^2 = \sum_{\mu=x,y,z} \left( S_{i,\mu} \right)^2 = \hbar^2 \left[ \frac{3}{4} \left( n_{i,+} + n_{i,-} \right) - \frac{3}{2}\, n_{i,+} n_{i,-} \right]
\]
($n_{i,s}$ are number operators) and show that the Hubbard Hamiltonian can be expressed in terms of spin operators (for real $\tau$) as
\[
H_{\text{Hubbard}} = -\tau \sum_i \sum_s \left( c_{i,s}^\dagger c_{i+1,s} + c_{i+1,s}^\dagger c_{i,s} \right) - \frac{2U}{3\hbar^2} \sum_i |\mathbf{S}_i|^2 + \frac{U}{2} \hat N,
\]
where $\hat N = \sum_i n_i$.
where N̂ = i n i .

11. Su-Schrieffer-Heeger model for polyacetylene

In this problem we study polyacetylene chains. Polyacetylene is a chain of C–H groups, arranged in a zig-zag form with angles of 120° between successive bonds. The electronic properties are, just as in graphene, determined by the electrons in the $p_z$ orbitals, where $z$ is the direction perpendicular to the plane of the chain (see figure).

A particular feature of these chains is that some bonds may shrink a bit, whereas others
stretch. We therefore include the displacement u n of the atoms into the Hamiltonian.
Here positive u n is the displacement along the bond right of atom n, and negative u n
denotes a displacement along the left bond. We neglect the electron spin. The full
Hamiltonian can be written as
\[
H = -\sum_{n=0}^{N-2} \left\{ \left[ t - \alpha (u_{n+1} - u_n) \right] c_n^\dagger c_{n+1} + \text{h.c.} \right\} + \frac{\kappa}{2} \sum_{n=0}^{N-2} (u_{n+1} - u_n)^2,
\]
where h.c. stands for ‘hermitian conjugate’ as usual. N is odd.
The average length of the bonds is a, but the chains may dimerise: the double bonds
will contract whereas the single bonds stretch a little bit. Some contemplation should
convince you that this can be represented as
u n = u(−1)n ,
where it is assumed that the left atoms of a double bond have even n, the atoms right
of the double bond have odd n. The leftmost bond is taken to be a double bond.

(a) Show that the Hamiltonian can now be rewritten as


\[
H = -\sum_{n=0}^{N-2} \left\{ \left[ t + 2\alpha u (-1)^n \right] \left( c_n^\dagger c_{n+1} + \text{h.c.} \right) - 2\kappa u^2 \right\}.
\]

(b) We now perform an important step by considering two electrons adjacent to one
double bound as being part of one unit cell. We label these cells by m (and we
assume that N is even). The left atoms of the double bond are labelled m A and
the right atoms mB . Show that with this notation, the Hamiltonian reads
\[
H = -\sum_{m=0}^{N/2-1} \left[ t \left( c_{m,A}^\dagger c_{m,B} + c_{m+1,A}^\dagger c_{m,B} + \text{h.c.} \right) + 2\alpha u \left( c_{m,A}^\dagger c_{m,B} - c_{m+1,A}^\dagger c_{m,B} + \text{h.c.} \right) - 4\kappa u^2 \right] - 2\kappa u^2.
\]
Setting $2u\alpha = t\Delta$ and neglecting the term $4\kappa u^2$, this can be rewritten in the form
\[
H = -\sum_{m=0}^{N/2-1} \left[ t(1+\Delta) \left( c_{m,A}^\dagger c_{m,B} + \text{h.c.} \right) + t(1-\Delta) \left( c_{m+1,A}^\dagger c_{m,B} + \text{h.c.} \right) \right].
\]

(c) Solve this Hamiltonian by trying the solution


\[
\psi_m = \begin{pmatrix} u_A \\ u_B \end{pmatrix} e^{2ikma}.
\]
This leads to a $2 \times 2$ matrix for each $k$. Diagonalise this matrix in order to show that
\[
E(k) = \pm 2t \sqrt{1 + \left( \Delta^2 - 1 \right) \sin^2(ka)}.
\]

Does this solution match the boundary conditions? If not, how can you construct
a solution which does match the correct boundary condition?
(d) Analyse the spectrum for ∆ = 0 and show that the Hamiltonian describes massless
fermions in that case.
(e) Calculate the total energy by integrating over all eigenvalues. You may use the
small-∆ approximation:
\[
\int_{-\pi/2}^{\pi/2} \sqrt{1 - (1 - \Delta^2)\sin^2(x)}\, dx \approx 2 + \left( a_1 - b_1 \ln \Delta^2 \right) \Delta^2,
\]
where a 1 and b 1 are (unspecified) numerical constants. Find the values for ∆ for
which a long chain is maximally unstable towards dimerisation. Hint: now you
should include the term 4κu 2 !

12. In this problem we consider a spin-1/2 particle in terms of fermions operators a, a † ,


along the lines of section 9.5.2. To this end, we identify a spin-‘up’ state as a particle,
and a spin-‘down’ state as the vacuum state:

|↑〉 = |1〉 = a † |0〉


|↓〉 = |0〉 = a |1〉 .

In this representation, the spin-raising and lowering operator can be written as

σ+ = a † ; σ− = a.

In addition:
σz = a † a − 1/2.

(a) From their definitions in terms of the creation and annihilation operators, show
that these operators satisfy the commutation relations

[σ+ , σ− ] = 2σz .

Now we consider a chain of fermions that are coupled to each other. The spin-operators
therefore get a site label j , as do the fermion operators. So we have operators like a j and
σ+j etcetera. Suppose we again represent the spins with our fermion operators. These
operators anti-commute when they act on different sites, whereas the spin-operators
(the σ’s) on different sites commute. We therefore adjust the relation between the σ’s
and the a’s to take this into account.

(b) We define
\[
\sigma_j^+ = a_j^\dagger \exp\left[ i\pi \sum_{j' < j} a_{j'}^\dagger a_{j'} \right],
\]
\[
\sigma_j^- = \exp\left[ -i\pi \sum_{j' < j} a_{j'}^\dagger a_{j'} \right] a_j,
\]
\[
\sigma_j^z = a_j^\dagger a_j - 1/2.
\]

Verify, using these definitions, that

σ+j σ−j +1 = a †j a j +1 .

(c) The Hamiltonian of the anisotropic spin-1/2 chain reads:


\[
H = -\sum_j \left[ J_z\, \sigma_j^z \sigma_{j+1}^z + J_x \left( \sigma_j^x \sigma_{j+1}^x + \sigma_j^y \sigma_{j+1}^y \right) \right].
\]

Rewrite this in terms of the creation and annihilation operators a † and a.

13. Consider a single species of bosons with annihilation and creation operators a and a †
respectively. The Hamiltonian operator for this quantum many-body system is
\[
\hat H = \omega \left( a^\dagger a + 1/2 \right) + \frac{\Delta}{2} \left( a^\dagger a^\dagger + a a \right).
\]
We take $\hbar = 1$ throughout this problem. The following transformation is useful to gain
insight into the properties of this quantum system:

b = λa + µa †
b † = λ∗ a † + µ∗ a.

where λ and µ are complex numbers.



(a) Show that this transformation preserves the usual commutation relations (but now for $b$ and $b^\dagger$) provided that $|\lambda|^2 - |\mu|^2 = 1$. In the remainder of this problem,
you may write λ = cosh u, µ = sinh u if you’re at ease with hyperbolic functions.
(b) Assuming λ and µ to be real and using the result of (a), show that, for a particular
value of u, the transformation brings the Hamiltonian into the form
\[
H = \tilde\omega \left( b^\dagger b + \frac{1}{2} \right).
\]

Find an equation for λ and µ for which this form is obtained. You do not have
to solve explicitly for ω̃, but it is necessary that the correct form of the Hamiltonian
is obtained.
(c) If the bosons characterised by a and a † are considered as excitations of a har-
monic oscillator with Hamiltonian

\[
H = \frac{\hat p^2}{2m} + \frac{m\omega^2}{2} \hat x^2,
\]
then $a$ is given by
\[
a = \frac{1}{\sqrt{2}} \left( \sqrt{m\omega}\, \hat x + \frac{i\hat p}{\sqrt{m\omega}} \right).
\]
Express the Hamiltonian in terms of x̂ and p̂ for the special case ∆ = ω. How would
you interpret this result physically?

14. Fermions and Majorana fermions


Consider an electrically neutral solid at T = 0. The states up to the Fermi energy are
filled, those above the Fermi energy are empty. An electron is described by fermion
creation and annihilation operators a † and a, respectively, satisfying the usual anti-
commutation relations. An electron creation operator associated with an unoccupied
orbital will put an electron in that orbital. An annihilation operator associated with an
occupied orbital will remove an electron from that orbital. In solid state physics, we often say that removing an electron from an occupied orbital is equivalent to creating
a hole. In a proper relativistic description of electrons in vacuum, the same structure is
recovered: an operator which creates an electron can equivalently be viewed as an op-
erator annihilating a positron and vice-versa. The positron is called the ‘anti-particle’
of the electron.
Starting from fermion creation and annihilation operators a † and a, we define two new
operators c 1 and c 2 as follows:

\[
c_1 = a + a^\dagger, \qquad c_2 = \frac{a - a^\dagger}{i}.
\]
Show that these operators satisfy the relations

c α = c α† and c α c β + c β c α = 2δαβ .

The first condition is often formulated as ‘the particle described by c α is its own anti-
particle’. Such a particle is generally called a Majorana fermion.
10 ELECTRONS AND PHONONS

10.1 THEORY OF THE ELECTRON GAS


If we want to understand the behaviour of the electrons in a solid, we face a formidable prob-
lem. We have to deal with the electrons, the nuclei and the interactions between all of these.
A non-relativistic Hamiltonian which does not include magnetic interactions is already quite
complicated. For a finite number (N ) of electrons moving in the Coulomb potential field of
K nuclei with charges Zn e, the Hamiltonian reads

$$
H = \sum_{i=1}^N \frac{p_i^2}{2m} + \sum_{n=1}^K \frac{P_n^2}{2M_n}
 + \frac{1}{2} \sum_{i,j=1,\, i\neq j}^N \frac{e^2}{|\mathbf r_i - \mathbf r_j|}
 - \sum_{i=1}^N \sum_{n=1}^K \frac{Z_n e^2}{|\mathbf r_i - \mathbf R_n|}
 + \frac{1}{2} \sum_{n,m=1,\, n\neq m}^K \frac{Z_n Z_m e^2}{|\mathbf R_n - \mathbf R_m|} .
$$

Here, the pi are momenta of the electrons, Pn those of the nuclei, ri are the positions of the
electrons and Rn those of the nuclei, which have masses M n and charges Zn – the electrons
have mass m. Needless to say, this Hamiltonian is impossible to solve if N and K are not very
small, even on a powerful computer.
In order to make progress and at least partly understand the physics of this system, we
must make approximations. A sensible approximation is the Born-Oppenheimer (BO) ap-
proximation which is based on the observation that the electron mass is at least about 2000
times smaller than the nuclear mass. If the kinetic energy is more or less evenly distributed
over the electrons and the nuclei, this implies that the nuclei move much more slowly than
the electrons, and these can therefore adapt their wave function at any time to the nuclear
configuration as if that were stationary. The BO approximation can be formalised, but we
restrict ourselves to this descriptive definition. This then leaves the (still formidable) task of
calculating the wave function for the electrons with the nuclei standing still. Varying then the
positions of the nuclei, we see that the ground state energy of the electrons varies, and the ex-
pression of the total energy as a function of the positions of the nuclei is called the potential
energy surface (PES).
Even for stationary nuclei, the problem remains enormously difficult. This is due to the
interactions between the electrons. In fact, the Hamiltonian within the BO approximation
can be written as
$$
H_{\mathrm{BO}} = \sum_{i=1}^N h(i) + \frac{1}{2} \sum_{i \neq j} v(\mathbf r_i - \mathbf r_j)
$$

where h(i ) = p i2 /(2m) + v ext (ri ), with v ext the electric potential energy felt by each electron
individually and caused by the nuclei, and v is the electrostatic repulsion between the elec-
trons. If that interaction were not there, the one-electron problem for h could be solved, and
the total energy would simply be a sum of the energies of the occupied one-electron states,
and the wave function would be a Slater determinant composed of those one-electron states
(spin-orbitals).


In order to make further progress, we neglect the discrete structure of the nuclei and make
the rather drastic assumption that their charge is smeared out evenly over space. Consider-
ing a solid in the thermodynamic limit, this means that we have a constant positive nuclear
charge density, and hence a constant contribution to the potential felt by the electrons. It is
as if the nuclei are transformed into a uniformly charged jelly, hence the name jellium model
for this approximation.
We assume that the total system is electrically neutral, so that in any large volume V ,
the charge −Ne of electrons in that volume is compensated by the positive jellium charge in
that volume, which corresponds to a jellium charge density $e n_b$, with number density $n_b = N/V$. The Hamiltonian for the
jellium model can now be written as

$$
H = \sum_{i=1}^N \frac{p_i^2}{2m} + \frac{1}{2} \sum_{i \neq j=1}^N \frac{e^2}{|\mathbf r_i - \mathbf r_j|} + \sum_i V_{e-b}(\mathbf r_i) ,
$$

where the letter b stands for ‘background’. The potential representing the interaction be-
tween the electrons and the background is given by

$$
V_{e-b}(\mathbf r) = -e^2 \int \frac{n_b}{|\mathbf r - \mathbf r'|} \, d^3 r' .
$$

Furthermore, the background carries its own energy which is given by

$$
E_{b-b} = \frac{e^2}{2} \int \frac{n_b^2}{|\mathbf r - \mathbf r'|} \, d^3 r \, d^3 r' .
$$

Both of these terms are constants tending to infinity, which makes them delicate to evaluate.
Let us therefore screen the potential with a screening length 1/λ:

$$
\frac{e^2}{r} \;\to\; \frac{e^2 \exp(-\lambda r)}{r} ,
$$
where we can let λ → 0 at the end of the calculation. The Fourier transform of this potential can be directly
found:
$$
v_q = \frac{4\pi e^2}{V (q^2 + \lambda^2)} .
$$
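This Fourier transform is the standard screened-Coulomb (Yukawa) integral; as a small sketch (added here for illustration, using sympy), one can verify it by doing the angular integration by hand, $\int d\Omega\, e^{-i\mathbf q\cdot\mathbf r} = 4\pi \sin(qr)/(qr)$, and leaving the radial integral to the computer. The factor 1/V comes from the plane-wave normalisation used in the matrix elements.
\begin{verbatim}
# Symbolic check (sketch): integral of e^2 exp(-lambda r)/r * exp(-i q.r) over all
# space equals 4 pi e^2/(q^2 + lambda^2).
import sympy as sp

r, q, lam, e = sp.symbols('r q lam e', positive=True)
# angular part of exp(-i q.r) integrates to 4 pi sin(q r)/(q r)
integrand = e**2 * sp.exp(-lam*r)/r * 4*sp.pi*sp.sin(q*r)/(q*r) * r**2
print(sp.simplify(sp.integrate(integrand, (r, 0, sp.oo))))
# -> 4*pi*e**2/(lam**2 + q**2)
\end{verbatim}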
In the calculation we first take the limit of the volume to infinity and then that of the screening
length. This means that we can consider the screening length always small with respect to the
volume. We can then split the integral into one over r and a second one over ∆r = r′ − r. Note
that for this to be possible we indeed need $1/\lambda^3 \ll V$. We now have, with $n_b = N/V$:

$$
E_{b-b} = \frac{e^2 N^2}{2V}\, 4\pi \int_0^\infty \frac{e^{-\lambda \Delta r}}{\Delta r}\, \Delta r^2 \, d\Delta r = \frac{e^2 N^2}{2V} \frac{4\pi}{\lambda^2} . \tag{10.1}
$$

Now let us calculate the background potential Ve−b felt by the electrons. This term is
independent of r and will therefore lead to a constant contribution to the energy of Ve−b N . It
is easy to evaluate this term:

$$
V_{e-b} = -n_b e^2 \int \frac{e^{-\lambda r}}{r} \, d^3 r = -\frac{4\pi e^2 N}{V \lambda^2} .
$$

This term gives a constant contribution to the total energy of

$$
E_{e-b} = N V_{e-b} = -\frac{4\pi e^2 N^2}{V \lambda^2} , \tag{10.2}
$$
which scales with N and V the same as E b−b , but it is negative and twice as large.

FIGURE 10.1: Electron-electron diagram: incoming momenta k and k′ scatter into k − q and k′ + q.

Now we consider the electron-electron interaction. The Hamiltonian can again be written
in its usual form $H = \sum_i h(i) + \frac{1}{2}\sum_{i\neq j} v(i,j)$. We have seen that in a many-body field theory,
this Hamiltonian can be represented as an operator in Fock space. It is convenient to use a
plane wave basis and work with the creation and annihilation operators for these
basis functions. We neglect the spin for now to keep the analysis simple:
$$
H = \sum_{\mathbf k} \epsilon_k\, a_{\mathbf k}^\dagger a_{\mathbf k} + \frac{1}{2} \sum_{\mathbf k \mathbf q \mathbf k' \mathbf q'} a_{\mathbf k}^\dagger a_{\mathbf q}^\dagger\, v_{\mathbf k \mathbf q \mathbf k' \mathbf q'}\, a_{\mathbf q'} a_{\mathbf k'} ,
$$

where $\epsilon_k = \hbar^2 k^2/(2m)$. This form is precisely the one we formulated in the previous chapter
[(9.4)] – we have used the plane wave labels k, q etcetera instead of j , k, l in that chapter.
Let us formulate the matrix element of v explicitly:

$$
v_{\mathbf k \mathbf q \mathbf k' \mathbf q'} = \frac{1}{V^2} \int e^{-i(\mathbf k - \mathbf k')\cdot \mathbf r_1}\, e^{-i(\mathbf q - \mathbf q')\cdot \mathbf r_2}\, \frac{e^2}{|\mathbf r_1 - \mathbf r_2|} \, d^3 r_1 \, d^3 r_2 .
$$

Now we change to coordinates $\mathbf r = \mathbf r_1 - \mathbf r_2$ and $\mathbf R = \frac{1}{2}(\mathbf r_1 + \mathbf r_2)$ and the integral transforms into
$$
v_{\mathbf k \mathbf q \mathbf k' \mathbf q'} = \frac{1}{V^2} \int e^{-i(\mathbf k - \mathbf k' + \mathbf q - \mathbf q')\cdot \mathbf R} \, d^3 R \int e^{-\frac{1}{2} i(\mathbf k - \mathbf k' - \mathbf q + \mathbf q')\cdot \mathbf r}\, \frac{e^2}{r} \, d^3 r .
$$
The integral over R gives a delta function
$$
V \delta^{(3)}(\mathbf k - \mathbf k' + \mathbf q - \mathbf q') ,
$$
and using this in the second integral leads to the form
$$
v_{\mathbf k \mathbf q \mathbf k' \mathbf q'} = \frac{1}{V} \delta^{(3)}(\mathbf k - \mathbf k' + \mathbf q - \mathbf q') \int e^{-i(\mathbf k - \mathbf k')\cdot \mathbf r}\, \frac{e^2}{r} \, d^3 r .
$$
The integral is the Fourier transform $V v_{\mathbf k - \mathbf k'}$. The delta-function expresses momentum con-
servation: the total ‘incoming’ momentum k′ + q′ equals the ‘outgoing’ momentum k + q.
Using k and k′ for the incoming momenta, we can write the outgoing ones in the form k − q
and k′ + q without loss of generality. Using these definitions, we can write the interaction term of
the Hamiltonian in the form
$$
\frac{1}{2} \sum_{\mathbf k, \mathbf k', \mathbf q} a_{\mathbf k - \mathbf q}^\dagger a_{\mathbf k' + \mathbf q}^\dagger\, v_q\, a_{\mathbf k'} a_{\mathbf k} .
$$

This term can be represented pictorially, see figure 10.1. The final form of the Hamiltonian,
now with the spin quantum numbers included, becomes:

$$
H = \sum_\sigma \sum_{\mathbf k} \frac{\hbar^2 k^2}{2m}\, a_{\mathbf k,\sigma}^\dagger a_{\mathbf k,\sigma}
 + \frac{1}{2} \sum_{\sigma,\sigma'} \sum_{\mathbf k, \mathbf k', \mathbf q} a_{\mathbf k - \mathbf q,\sigma}^\dagger a_{\mathbf k' + \mathbf q,\sigma'}^\dagger\, v_q\, a_{\mathbf k',\sigma'} a_{\mathbf k,\sigma} .
$$

At low temperatures, the electron gas will be in its ground state, so let’s calculate the low-
est possible energy. That is a difficult task – we can make progress by assuming the electron-
electron interaction to be relatively weak so that we can use perturbation theory. We start

by neglecting the interaction term altogether and minimize the kinetic energy. This can be
done by putting the electrons by pairs in the lowest available momentum states. We fill up
those states until we have exhausted all the electrons. The requirement that we fill the low-
est momentum states means that the momenta are the set of points closest to the origin in
reciprocal space – that is, they fill a sphere in reciprocal space; this is the Fermi sphere. The
number of k-points inside the sphere is the volume of the sphere divided by the volume per
k-points which, for a large L × L × L volume, is (2π/L)3 . We therefore see that in the sphere
with a radius which we call k F we can store

$$
N = 2\, \frac{L^3}{(2\pi)^3}\, \frac{4\pi}{3} k_F^3
$$

electrons. Note the factor of 2 which accounts for the two spin states. We see that the k F can
be calculated from the density n = N /L 3 :

$$
k_F = (3\pi^2 n)^{1/3} .
$$

The subscript ‘F’ stands for ‘Fermi’ and k F is called the ‘Fermi momentum’. It is straightfor-
ward to calculate the total ground state energy if we neglect the interactions:

$$
E_G^{(0)} = 2\, \frac{V}{(2\pi)^3} \int_{k < k_F} \frac{\hbar^2 k^2}{2m} \, d^3 k .
$$

Evaluating the integral and dividing by the number of particles N , we obtain

$$
\frac{E_G^{(0)}}{N} = \frac{3}{5} \frac{\hbar^2 k_F^2}{2m} = \frac{3}{5}\epsilon_F ,
$$
where $\epsilon_F = \hbar^2 k_F^2/2m$ is the Fermi energy.
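To get a feeling for the magnitudes involved, the following sketch (added here; the density is an assumed, illustrative value of the order of the conduction-electron density of a simple metal) evaluates $k_F$, $\epsilon_F$ and the kinetic energy per electron:
\begin{verbatim}
# Free-electron gas numbers (sketch): k_F = (3 pi^2 n)^(1/3), eps_F, E/N = (3/5) eps_F.
import numpy as np
from scipy import constants as c

n = 8.5e28                                   # electron density in m^-3 (illustrative value)
k_F = (3 * np.pi**2 * n) ** (1/3)
eps_F = c.hbar**2 * k_F**2 / (2 * c.m_e)     # Fermi energy in joule
print(f"k_F   = {k_F:.3e} 1/m")
print(f"eps_F = {eps_F / c.e:.2f} eV")
print(f"E/N   = {3/5 * eps_F / c.e:.2f} eV per electron")
\end{verbatim}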


The next step is to take the electron interaction into account. We consider this term as
a perturbation. Standard perturbation theory tells us that the first order correction to the
ground state energy, due to a perturbation W to some Hamiltonian H0 for which we know
the ground state Φ, is given by
∆E = 〈Φ|W |Φ〉 .
We use this result, taking for H0 the kinetic energy:
$$
H_0 = \sum_\sigma \sum_{\mathbf k} \frac{\hbar^2 k^2}{2m}\, a_{\mathbf k,\sigma}^\dagger a_{\mathbf k,\sigma} ,
$$

for which the ground state |Φ〉 is a Slater determinant built from all k-vectors inside the Fermi
sphere, and W is the electron-electron interaction

$$
W = \frac{1}{2} \sum_{\mathbf k, \mathbf k', \mathbf q} a_{\mathbf k - \mathbf q}^\dagger a_{\mathbf k' + \mathbf q}^\dagger\, v_q\, a_{\mathbf k'} a_{\mathbf k} .
$$

So:
$$
\Delta E = \frac{1}{2} \sum_{\sigma,\sigma'} \sum_{\mathbf k \mathbf k' \mathbf q} \big\langle \Phi \big| a_{\mathbf k - \mathbf q,\sigma}^\dagger a_{\mathbf k' + \mathbf q,\sigma'}^\dagger\, v_q\, a_{\mathbf k',\sigma'} a_{\mathbf k,\sigma} \big| \Phi \big\rangle .
$$

Now we consider which combinations of k, k0 and q yield nonzero matrix elements. The
two annihilation operators a k and a k0 remove orbitals k and k0 from the Fermi sphere. Any
creation of orbitals outside the Fermi sphere gives zero, since 〈Φ| on the left does not contain
these orbitals. Since all orbitals within the Fermi sphere, except for k and k0 , are occupied,
we cannot create orbitals other than these two there. So two possibilities remain:

• We have q = 0, i.e. we annihilate k and then create it again, and similarly for k0 .

FIGURE 10.2: Electron-electron processes that conserve the momenta: either q = 0 (both momenta unchanged), or k − q = k′ and k′ + q = k (the two incoming momenta are exchanged).

• The two incoming momenta are swapped by the interaction term, i.e.:
$$
\mathbf k - \mathbf q = \mathbf k' .
$$
These two processes are represented in figure 10.2. In the first process, q = 0, and we first
analyze these q = 0 terms. These lead to
$$
\frac{1}{2} \sum_{\sigma,\sigma'} \sum_{\mathbf k, \mathbf k'} a_{\mathbf k,\sigma}^\dagger a_{\mathbf k',\sigma'}^\dagger\, v_0\, a_{\mathbf k',\sigma'} a_{\mathbf k,\sigma} .
$$

We would like to move the rightmost operator two places to the left, as we can then recognise
two number operators in the expression. However, the anti-commutation relations lead to
an extra 1 for the case where k = k0 and σ = σ0 . We split these extra contributions off from the
sum to obtain
$$
\frac{1}{2} \left[ \sum_{\sigma,\sigma'} \sum_{\mathbf k,\mathbf k'} a_{\mathbf k,\sigma}^\dagger a_{\mathbf k,\sigma} a_{\mathbf k',\sigma'}^\dagger a_{\mathbf k',\sigma'}\, v_0 - \sum_\sigma \sum_{\mathbf k} a_{\mathbf k,\sigma}^\dagger a_{\mathbf k,\sigma}\, v_0 \right]
= \frac{1}{2} \left[ \sum_{\sigma,\sigma'} \sum_{\mathbf k,\mathbf k'} n_{\mathbf k,\sigma} n_{\mathbf k',\sigma'}\, v_0 - \sum_\sigma \sum_{\mathbf k} n_{\mathbf k,\sigma}\, v_0 \right] . \tag{10.3}
$$
Using $\sum_\sigma \sum_{\mathbf k} n_{\mathbf k,\sigma} = N$ and substituting
$$
v_0 = \frac{4\pi e^2}{V \lambda^2} ,
$$
we obtain for the first term:
$$
E_{e-e} = \frac{2\pi e^2 N^2}{V \lambda^2} ,
$$
which is the ‘classical’ electron-electron interaction. Previously we have obtained the self-interaction
E b−b of the positive background charge in Eq. (10.1) and the interaction energy of the elec-
trons and the background in Eq. (10.2). We see that all these terms cancel: E b−b + E e−b +
E e−e = 0. The second term of (10.3) results in an energy

$$
\frac{2\pi e^2 N}{V \lambda^2} ,
$$
which yields a zero contribution per particle for V → ∞ (note the importance of taking first
the limit for V to infinity and then λ → 0). We conclude that the q = 0 term in the interaction
potential cancels the background energies.
The terms in the interaction energy with q 6= 0 should give us a more interesting contri-
bution from the electron-electron interaction. It is
$$
\Delta E = \frac{1}{2} \sum_{\sigma,\sigma'} \sum_{\mathbf k \mathbf k'} \big\langle \Phi \big| a_{\mathbf k',\sigma}^\dagger a_{\mathbf k,\sigma'}^\dagger\, v_{\mathbf k - \mathbf k'}\, a_{\mathbf k',\sigma'} a_{\mathbf k,\sigma} \big| \Phi \big\rangle .
$$

Now suppose σ 6= σ0 . Then a k,σ removes an electron with wave vector k and spin σ from the

ground state. The spin-orbital k, σ0 remains occupied. So the creation operator a k,σ0 will give

zero. So, only terms with σ = σ0 give a non-zero result. If we now move in the operator a k0 ,σ
one place to the left, the anti-commutation relation gives us the expression

$$
\Delta E = -\frac{1}{2} \sum_\sigma \sum_{\mathbf k \mathbf k'} \big\langle \Phi \big| n_{\mathbf k',\sigma}\, n_{\mathbf k,\sigma}\, v_{\mathbf k - \mathbf k'} \big| \Phi \big\rangle .
$$

Using the fact that for |k| < k F all the states are occupied, we obtain
$$
\Delta E = -\sum_{\mathbf k, \mathbf k'}^{(F)} v_{\mathbf k - \mathbf k'} ,
$$
where $\sum^{(F)}$ denotes a sum over k and k′ inside the Fermi sphere (the factor 1/2 is cancelled by the sum over spin). The correction can now be

calculated in the continuum limit. The expression for ∆E then reads

$$
\Delta E = -\frac{V}{(2\pi)^6} \int \frac{4\pi e^2}{|\mathbf k - \mathbf k'|^2} \, d^3 k \, d^3 k' .
$$

We write this in the form:


$$
\Delta E = -\frac{V}{(2\pi)^3} \int \epsilon(k) \, d^3 k ,
$$
where
$$
\epsilon(k) = \frac{1}{(2\pi)^3} \int_{|\mathbf k'| < k_F} \frac{4\pi e^2}{|\mathbf k - \mathbf k'|^2} \, d^3 k'
 = \frac{e^2 k_F}{\pi} \int_0^1 dx\, x^2 \int_{-1}^{1} \frac{du}{x^2 + y^2 - 2x y u} .
$$

Here we have put $x = k'/k_F$ and $y = k/k_F$, and u is the cosine of the angle between k and k′.
The integral can be done and we obtain:
$$
\epsilon(k) = \frac{2 e^2 k_F}{\pi}\, f\!\left( \frac{k}{k_F} \right) ,
$$

with
$$
f(x) = \frac{1}{2} + \frac{1 - x^2}{4x} \ln \left| \frac{1 + x}{1 - x} \right| .
$$
If we then sum (integrate) over the k, we obtain the total energy correction, which turns
out to be
$$
\Delta E = -\frac{V}{4\pi^3} e^2 k_F^4 .
$$

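As a consistency check (a sketch added here, not part of the original text), one can verify numerically that integrating $\epsilon(k) = (2e^2k_F/\pi) f(k/k_F)$ over the Fermi sphere indeed yields the quoted result; this amounts to checking that $\int_0^1 y^2 f(y)\, dy = 1/4$:
\begin{verbatim}
# Numerical check (sketch): int_0^1 y^2 f(y) dy = 1/4, so that
# Delta E = -(V/(2 pi)^3) * (2 e^2 k_F/pi) * 4 pi k_F^3 * (1/4) = -V e^2 k_F^4/(4 pi^3).
import numpy as np
from scipy.integrate import quad

def f(x):
    return 0.5 + (1 - x**2) / (4 * x) * np.log(abs((1 + x) / (1 - x)))

value, _ = quad(lambda y: y**2 * f(y), 0, 1)
print(value)          # -> 0.25 up to numerical accuracy
\end{verbatim}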
Let’s discuss the results obtained. The correction term is negative. This may seem sur-
prising as it is derived from the repulsive electron-electron energy. However, it is easy to
understand if we realize that the q = 0 term considered above takes the major part of the
electron-electron interaction into account and the effect of the correction above is to reduce
that large positive result. Why is it reduced? The calculation above showed that the mecha-
nism responsible for the correction is momentum exchange. That is why this term is called
exchange interaction. The negative sign is due to the anti-commutation relation between the
creation and annihilation operators, and this in turn is due to the Fermi statistics obeyed by
the electrons that are exchanged. Two electrons with the same spin cannot occupy the same
position – the anti-symmetry of the wave function keeps them apart. The very fact that the
electrons are kept apart by the anti-symmetry of the wave function causes a reduction of the
electrostatic energy as this increases when the electrons are close together. The first-order
perturbation analysis of the interacting electron gas is called the Hartree-Fock theory.
Several predictions of the Hartree-Fock theory are in conflict with experiments on metals:
(i) the band width is predicted to be larger than that of free electrons, whereas in experiments
it’s usually found to be lower; (ii) the density of single-electron states is predicted to vanish
logarithmically at the Fermi wave vector k = k F . In reality, a value close to the free-electron
result is found. The reason for these failures of the Hartree-Fock is the fact that polarization
has not been included: we have used unperturbed ground state orbitals to evaluate the ex-
change energy. In reality, the orbitals occupied by the electrons will deform as a result of the

Coulomb interaction. This has a major effect on the energies. Such effects can be taken into
account within the random phase approximation (RPA), which yields much better values.
The same approach as the one followed here is often used in quantum chemistry to calcu-
late the energies and to predict the excitation energies of molecules. It gives quite good results
for those cases, indicating that polarization effects are more important in semi-conductors
than in molecules.
Another approach is based on writing the exchange energy as a function of the local den-
sity which we shall now briefly sketch without going into details. Translating the exchange
energy into a function of the density, we can derive a potential Vxc from it, which has the
form
$$
V_{\mathrm{xc}} = -2.95\, (a_0^3 n)^{1/3} \ \text{Ry}
$$
(1 Ry = 13.6 eV). This form has been used very often in calculations for electrons in solids. In
these calculations, the Hartree-Fock form for the exchange energy is made a local function
by replacing the average density n by the local density n(r). Moreover, some extra terms are
incorporated into the one-electron Hamiltonian, in particular the electrostatic energy of the
electrons among themselves and the interaction with the nuclei, so that we arrive at the one-
electron Hamiltonian:

$$
H = \frac{p^2}{2m} + \sum_n \frac{-Z_n e^2}{|\mathbf r - \mathbf R_n|} + \int \frac{e^2 n(\mathbf r')}{|\mathbf r - \mathbf r'|} \, d^3 r' + V_{\mathrm{xc}}(\mathbf r) .
$$

The third term on the right hand side represents the electrostatic energy resulting from the
electron cloud with density n(r). In 1964, Hohenberg and Kohn proved a theorem which says
that a similar (but not identical) Hamiltonian, with the same interaction with the nuclei and
all other terms depending on the electron density only, gives the exact ground state energy.
Although the exact form of the other terms is as yet unknown, many approximations to it
exist, and allow for accurate ground state energy calculations of atoms, molecules and solids.
The theory based on the Hohenberg-Kohn theorem is called density functional theory.
To systematically improve on the Hartree-Fock analysis, we may consider the second-
order perturbation term. We only sketch briefly how this works. If we start from the ground
state, |Φ〉, we obtain for the energy correction the form
$$
\Delta E^{(2)} = \sum_I \frac{\langle \Phi | v | \Phi_I \rangle \langle \Phi_I | v | \Phi \rangle}{E_G - E_I} ,
$$
where |ΦI 〉 is any intermediate state in which one or two electrons may be excited with re-
spect to the ground state – these intermediate states are summed over. Figure 10.3 shows a
picture of typical terms occurring in this expansion. Similar to the first order case, the ini-
tial and final momenta can either remain the same or they will be exchanged, whereas the
intermediate states can have electrons excited to outside the Fermi sphere.

FIGURE 10.3: Second order electron-electron diagrams with momentum conservation.

Summary of the electron gas


In this section we have considered a gas of interacting electrons in a medium with a
positive background charge. We have used perturbation theory to analyse this sys-
tem, using the electron-electron interaction as a small term in the Hamiltonian. If
the electron-electron interaction is neglected, we find a ground state consisting of a
sphere of occupied states in reciprocal space, the Fermi sphere. In the ground states,
all k-orbitals within the Fermi sphere are occupied by two electrons, one with spin-
up and one with spin-down. The k-points outside the Fermi sphere are unoccupied
in the ground state.
The first order correction to the ground state gives two contributions. The first is one
which does not involve momentum exchange and represents the static repulsion
between the electrons. This term cancels the interaction between the electrons and
the positive background plus the self-energy of the positive background.
The second first-order contribution represents the effect of exchanging the
particles with two different momenta (but having the same spin) in the Fermi
sphere. This term has the effect of lowering the energy of the ground state by an
amount
$$
\Delta E = -\frac{V}{4\pi^3} e^2 k_F^4 .
$$
The lowering can be explained from the fact that this term is a direct consequence
of the fermion character of the wave function, which effectively keeps the electrons
apart so that their interaction energy is reduced. First-order perturbation theory is
called Hartree-Fock theory.
A systematic way of improving on the Hartree-Fock theory is to include higher or-
der terms in the analysis, which quickly becomes a lot more tricky due to the large
amount of contributions needed. Other approximations which give better results
than Hartree-Fock are density functional theory, as well as the random phase ap-
proximation.

10.2 ELECTRON-PHONON COUPLING


In the beginning of this chapter we mentioned the Born-Oppenheimer (BO) approximation
in which the nuclei are standing still, and the electronic state can be calculated for any config-
uration of nuclei. This then gives rise to a potential felt by the nuclei, the so-called potential
energy surface (PES). Given a PES, a classical calculation predicts the motion of the nuclei.
This approach is called the adiabatic approximation as the nuclear positions are considered
as external parameters which change very slowly in time. In the adiabatic approximation,
the electrons will therefore always remain in the ground state for the actual configuration
of nuclei. Now we want to improve on this approximation by introducing energy exchange

between electrons and phonons more explicitly into the Hamiltonian. We have already stud-
ied the example of a simple phonon system in section 8.2: a linear chain of which we have
considered the longitudinal modes. This analysis can be generalised straightforwardly to 3D
lattices and elastic waves in the longitudinal and the two transverse directions. All in all, this
leads, for a monatomic Bravais lattice, to the following Hamiltonian determining the motion
of the nuclei:
$$
H = \sum_\alpha \sum_{\mathbf q} \left( \frac{p_{\mathbf q\alpha}\, p_{-\mathbf q\alpha}}{2m} + \frac{m \omega_{\mathbf q\alpha}^2\, y_{\mathbf q,\alpha}\, y_{-\mathbf q\alpha}}{2} \right) .
$$
Here q is a vector inside the Brillouin zone, and α denotes the longitudinal and transverse di-
rections. The coordinates y qα are the displacements of the nuclei with respect to their equi-
librium positions Rn . The nuclear mass is m.
Now we want to emphasise that in this section, we use the notation:

• q, q0 : wave vectors used in the Fourier transforms of nuclear displacements


(phonons). For a monatomic Bravais lattice, these are inside the Brillouin
zone.
• k, k0 : wave vectors used for Fourier transforms of the orbital electron wave
functions. They are inside the reciprocal lattice and are usually decomposed
into a wave vector k inside the Brillouin zone plus a vector K of the reciprocal
lattice.
• The operators $a_{\mathbf k}^\dagger$ and $a_{\mathbf k}$ create and annihilate electrons in a state $|\mathbf k\rangle$.

• Similarly, the operators $d_{\mathbf q\alpha}^\dagger$ and $d_{\mathbf q\alpha}$ create and annihilate phonons.

The operator $d_{\mathbf q\alpha}$ is defined as
$$
d_{\mathbf q,\alpha} = \sqrt{\frac{m \omega_{\mathbf q,\alpha}}{2\hbar}}\, y_{\mathbf q,\alpha} + \frac{i}{\sqrt{2 m \hbar \omega_{\mathbf q\alpha}}}\, p_{\mathbf q\alpha}
$$
and
$$
d_{\mathbf q,\alpha}^\dagger = \sqrt{\frac{m \omega_{\mathbf q,\alpha}}{2\hbar}}\, y_{-\mathbf q,\alpha} - \frac{i}{\sqrt{2 m \hbar \omega_{\mathbf q\alpha}}}\, p_{-\mathbf q\alpha} .
$$
With these definitions, we obtain
$$
H = \sum_{\mathbf q\alpha} \hbar \omega_{\mathbf q\alpha} \left( d_{\mathbf q\alpha}^\dagger d_{\mathbf q\alpha} + 1/2 \right) .
$$

In the previous section we have already considered the Hamiltonian for the electrons.
Here we shall consider the electrons in the independent particle approximation, in which
the electrons move only in an external potential. This external potential may, in some ap-
proximation schemes, be generated (at least in part) by all the electrons in the system. The
electron Hamiltonian is therefore given as

$$
H_{\mathrm{el}} = \sum_{\mathbf k} \epsilon_k\, a_{\mathbf k}^\dagger a_{\mathbf k} .
$$

With a homogeneous background, the sum runs over all k-vectors in reciprocal space.
The fact that we use a homogeneous background may seem strange, since the electron-
nucleus interaction is locally very strong. However, we may focus on the valence electrons
in a metal, and they see the nuclei as screened by the core electrons, resulting in a weak and
smooth potential. Which of the phonon states are occupied depends on the temperature (in
equilibrium). In a periodic solid, the sum runs over the vectors k of the Brillouin zone and
perhaps band labels, or other quantum numbers.

The electrons feel the attractive potential of the nuclei, and this potential is determined
by the positions of the latter. We can write the interaction as

$$
H_I = \sum_{\mathbf k, \mathbf k', n} \big\langle \mathbf k \big| v_{\text{e-n}}(\mathbf r - \mathbf R_n - \mathbf y_n) \big| \mathbf k' \big\rangle\, a_{\mathbf k}^\dagger a_{\mathbf k'} ,
$$

where the index n runs over the nuclei, Rn is the equilibrium position of nucleus n, v e-n is the
electron-nucleus interaction (which will be Coulomb-like), and yn is the displacement of the
nucleus with respect to equilibrium, which is supposed to be small. We expand the potential
in terms of the small displacement yn :

v e−n (r − Rn − yn ) = v e−n (r − Rn ) − yn · ∇v e−n (r − Rn ).

We see that the effect of the nuclei on the electrons is two-fold: the first effect is a shift of
the potential due to all the nuclei when they are at equilibrium, and the second is due to the
displacements yn of the nuclei with respect to that equilibrium. We consider both terms as
external potentials for the electrons and use first order perturbation theory. This tells us that
the first term results in a shift of the total energy given by

$$
H_B = \sum_{\mathbf k, \mathbf k', n} \big\langle \mathbf k \big| v_{\text{e-n}}(n) \big| \mathbf k' \big\rangle\, a_{\mathbf k}^\dagger a_{\mathbf k'} ,
$$

where v e−n (n) denotes the potential energy due to nucleus n. We can write out the matrix
element:
$$
H_B = \sum_{\mathbf k, \mathbf k', n} \frac{1}{V} \int e^{i(\mathbf k' - \mathbf k)\cdot \mathbf r}\, v_{\text{e-n}}(\mathbf r - \mathbf R_n) \, d^3 r \; a_{\mathbf k}^\dagger a_{\mathbf k'} .
$$
We now make the substitution r → r + Rn in the integral to obtain
$$
H_B = \sum_{\mathbf k, \mathbf k', n} \exp\!\big[ i(\mathbf k' - \mathbf k)\cdot \mathbf R_n \big]\, \frac{1}{V} \int e^{i(\mathbf k' - \mathbf k)\cdot \mathbf r}\, v_{\text{e-n}}(\mathbf r) \, d^3 r \; a_{\mathbf k}^\dagger a_{\mathbf k'} .
$$

Carrying out the sum over Rn results in a ‘modified’ delta-function: since the sum is only
over the discrete vectors Rn , the argument of the ‘delta’-function only forces the component
inside the Brillouin zone to be equal – hence the argument may still be a reciprocal lattice
vector rather than 0. The formal result is $\sum_n e^{i\mathbf k\cdot \mathbf R_n} = \sum_m \delta(\mathbf k + \mathbf K_m)$, where $\mathbf K_m$ are the reciprocal
lattice vectors. We therefore have
$$
H_B = N \sum_{\mathbf k, m} v_{\text{e-n}}(\mathbf K_m)\, a_{\mathbf k + \mathbf K_m}^\dagger a_{\mathbf k} .
$$

This shift can be absorbed in the single-electron energies ²k – it is called the Bloch term.
The second term depends on the nuclear displacements. It reads
$$
H_{\text{e-ph}} = -\sum_n \sum_{\mathbf k,\mathbf k'} \mathbf y_n \cdot \big\langle \mathbf k \big| \nabla_n v_{\text{e-n}}(n) \big| \mathbf k' \big\rangle\, a_{\mathbf k}^\dagger a_{\mathbf k'}
 = -\sum_n \sum_{\mathbf k,\mathbf k'} \mathbf y_n \cdot \frac{1}{V} \int e^{i(\mathbf k' - \mathbf k)\cdot \mathbf r}\, \nabla v_{\text{e-n}}(\mathbf r - \mathbf R_n)\, d^3 r \; a_{\mathbf k}^\dagger a_{\mathbf k'} .
$$

We now focus on the integral in the right hand side. Using partial integration in the last inte-
gral, we obtain

$$
\frac{1}{V} \int e^{i(\mathbf k' - \mathbf k)\cdot \mathbf r}\, \nabla v_{\text{e-n}}(\mathbf r - \mathbf R_n)\, d^3 r
 = -\frac{i}{V} (\mathbf k' - \mathbf k) \int e^{i(\mathbf k' - \mathbf k)\cdot \mathbf r}\, v_{\text{e-n}}(\mathbf r - \mathbf R_n)\, d^3 r
 = -\frac{i}{V} (\mathbf k' - \mathbf k) \exp\!\big[ i(\mathbf k' - \mathbf k)\cdot \mathbf R_n \big] \int e^{i(\mathbf k' - \mathbf k)\cdot \mathbf r}\, v_{\text{e-n}}(\mathbf r)\, d^3 r ,
$$
where we have made the substitution r → r + Rn again in order to obtain the last form. The
integral is recognized as the Fourier transform of v e−n , and we can write the electron-phonon
Hamiltonian as

$$
H_{\text{e-ph}} = i \sum_{\mathbf k, \mathbf k', n} \mathbf y_n \cdot (\mathbf k' - \mathbf k)\, \exp\!\big[ i(\mathbf k' - \mathbf k)\cdot \mathbf R_n \big]\, v_{\text{e-n}}(\mathbf k - \mathbf k')\, a_{\mathbf k}^\dagger a_{\mathbf k'} .
$$

Now we substitute for yn its Fourier transform


$$
\mathbf y_n = \frac{1}{\sqrt N} \sum_{\mathbf q} \mathbf y_{\mathbf q}\, e^{i\mathbf q\cdot \mathbf R_n}
$$
to obtain
$$
H_{\text{e-ph}} = \frac{i}{\sqrt N} \sum_{\mathbf k, \mathbf k', \mathbf q, n} e^{i(\mathbf k' - \mathbf k + \mathbf q)\cdot \mathbf R_n}\, (\mathbf k' - \mathbf k)\cdot \mathbf y_{\mathbf q}\; v_{\text{e-n}}(\mathbf k - \mathbf k')\, a_{\mathbf k}^\dagger a_{\mathbf k'} .
$$

Carrying out the sum over n forces q = k − k0 and we have


$$
H_{\text{e-ph}} = i \sqrt N \sum_{\mathbf k, \mathbf k'} (\mathbf k' - \mathbf k)\cdot \mathbf y_{\mathbf k - \mathbf k'}\; v_{\text{e-n}}(\mathbf k - \mathbf k')\, a_{\mathbf k}^\dagger a_{\mathbf k'} .
$$

The result of the dot-product depends on the polarization of the phonon (longitudinal or
transverse) – we shall not go into details here. It is important to again realize that the delta-
function we have obtained in the sum over n needs to be interpreted with some care: if k − k0
lies inside the Brillouin zone, it is correct as given here; if k − k0 lies outside the first Brillouin
zone, we have q = k0 + k − K where K is a reciprocal lattice vector. The processes where this
is the case are called ‘Umklapp’ processes – they have a noticeable influence on the temper-
ature dependence of the resistance. Note that the deviation yq has a polarization. In a ho-
mogeneous elastic medium, we can take two transverse and one longitudinal polarization.
For a crystalline solid, this is in general not always possible, but we shall not treat that
case here. The term q · yq in the electron-phonon Hamiltonian shows that only longitudinal
modes can interact with the electrons. The polarization can be along the direction q̂ or −q̂,
and this obviously has its effect on the electron-phonon interactions, as a phonon can either
hit an electron ‘in the back’ or ‘head-on’. In the sum over k and k0 , both cases will occur,
which is necessary for the Hamiltonian to be Hermitian. From now on, we only consider the
longitudinal component of the displacement and therefore we consider y q as the amplitude
of the displacement along q: y q is therefore no longer a vector.
The major step is now to express y in terms of $d_{\mathbf q}^\dagger$ and $d_{\mathbf q}$. Now we use Eqs. (8.13) and
(8.14) to find
$$
y_{\mathbf q} = \sqrt{\frac{\hbar}{2 m \omega_q}}\, \big( d_{-\mathbf q}^\dagger + d_{\mathbf q} \big) ,
$$
which, using the definition
$$
M_{\alpha,\mathbf q} = i \sqrt{\frac{N \hbar}{2 m \omega_q}}\; \big| \mathbf q \big|\; v_{\text{e-n}}(\mathbf q) ,
$$
leads to

$$
H_{\text{e-ph}} = \sum_{\mathbf k, \mathbf k', \alpha} M_\alpha(\mathbf k - \mathbf k')\, \big( d_{-\mathbf q,\alpha}^\dagger + d_{\mathbf q,\alpha} \big)\, a_{\mathbf k}^\dagger a_{\mathbf k'} ,
$$

where q = k − k0 , reduced to the first Brillouin zone. Now the total Hamiltonian can be given:

$$
H = \sum_{\mathbf k} \epsilon_k\, a_{\mathbf k}^\dagger a_{\mathbf k} + \sum_{\mathbf q} \hbar \omega_q\, d_{\mathbf q}^\dagger d_{\mathbf q}
 + \sum_{\mathbf k, \mathbf k'} M(\mathbf k - \mathbf k')\, \big( d_{-\mathbf q}^\dagger + d_{\mathbf q} \big)\, a_{\mathbf k}^\dagger a_{\mathbf k'} .
$$

The last term has a very nice interpretation which can be visualised using diagrams: the

term containing d −q describes a process in which the electron momentum k0 is changed into
k under emission of a phonon. The term containing d q describes the absorption of a phonon.
The total momentum and energy is to be conserved in these processes. They are represented
in figure 10.4.
We can now evaluate the effect on the total energy of electron-phonon interactions. We
again use perturbation theory for this purpose, just as in the case of the electron gas. The
perturbation to second order is

$$
E = E_0 + \big\langle \Phi \big| H_{\text{e-ph}} \big| \Phi \big\rangle + \big\langle \Phi \big| H_{\text{e-ph}}\, (E_0 - H_0)^{-1}\, H_{\text{e-ph}} \big| \Phi \big\rangle .
$$

FIGURE 10.4: Electron-phonon diagrams: phonon absorption and phonon emission.

FIGURE 10.5: Second order electron-phonon diagrams.

Here |Φ〉 is the ground state of a system of electrons and phonons. The first term in the per-
turbation expansion vanishes, as it changes the momentum of a single electron, and |Φ〉 con-
tains a Slater determinant composed of all k-waves within the Fermi sphere – changing one
of the k-vectors gives a new state which is perpendicular to |Φ〉. The second term contains
intermediate states that may be perpendicular to the ground state and we have
$$
E - E_0 = \Big\langle \Phi \Big| \sum_{\mathbf k,\mathbf k'} \big| M(\mathbf k - \mathbf k') \big|^2
\Big[ d_{-\mathbf q}^\dagger a_{\mathbf k}^\dagger a_{\mathbf k'}\, (E_0 - H_0)^{-1}\, d_{-\mathbf q}\, a_{\mathbf k'}^\dagger a_{\mathbf k}
 + d_{\mathbf q}\, a_{\mathbf k}^\dagger a_{\mathbf k'}\, (E_0 - H_0)^{-1}\, d_{\mathbf q}^\dagger a_{\mathbf k'}^\dagger a_{\mathbf k} \Big] \Big| \Phi \Big\rangle .
$$

The two processes in the right hand side can be visualised as in figure 10.5. Let us analyze
the first term in this expression. Acting with $d_{-\mathbf q}\, a_{\mathbf k'}^\dagger a_{\mathbf k}$ on the ground state gives a state with
energy $E_0 - \hbar\omega_{-\mathbf q} + \epsilon_{k'} - \epsilon_k$. Therefore the term $(E_0 - H_0)^{-1}$ gives $1/(\hbar\omega_{\mathbf q} + \epsilon_k - \epsilon_{k'})$. If we
furthermore move the creation and annihilation operators through this expression, we find,
with a similar treatment of the second term:
$$
E - E_0 = \sum_{\mathbf k, \mathbf k'} \big| M(\mathbf k - \mathbf k') \big|^2\, \big\langle n_{\mathbf k}(1 - n_{\mathbf k'}) \big\rangle
\left( \frac{\langle n_{-\mathbf q} \rangle}{\epsilon_k - \epsilon_{k'} + \hbar\omega_{-\mathbf q}}
 + \frac{\langle n_{\mathbf q} + 1 \rangle}{\epsilon_k - \epsilon_{k'} - \hbar\omega_{\mathbf q}} \right) .
$$

Note that the n k and n k0 are electron occupations, whereas n ±q denote phonon occupations.
Also note that the +1 in the numerator of the second term represents an emitted phonon
during the interaction.
Now we focus on the term in the interaction which has the form of a two-electron inter-
action:
$$
E_{\text{e-ph}} = -\sum_{\mathbf k, \mathbf k'} \big| M(\mathbf k - \mathbf k') \big|^2\, \langle n_{\mathbf k} n_{\mathbf k'} \rangle
\left( \frac{\langle n_{\mathbf q} \rangle}{\epsilon_k - \epsilon_{k'} + \hbar\omega_{\mathbf q}}
 + \frac{\langle n_{\mathbf q} + 1 \rangle}{\epsilon_k - \epsilon_{k'} - \hbar\omega_{\mathbf q}} \right) .
$$
Note that we have replaced −q → q: in equilibrium, there is no preference for the direction of
­ ®
the phonon propagation. The term proportional to n q can be cast into the form
$$
E_{\text{e-ph}}^{(1)} = \sum_{\mathbf k, \mathbf k'} \big| M(\mathbf k - \mathbf k') \big|^2\, \langle n_{\mathbf k} n_{\mathbf k'} \rangle\, \langle n_{\mathbf q} \rangle\,
\frac{2 \hbar\omega_{\mathbf q}}{(\epsilon_k - \epsilon_{k'})^2 - (\hbar\omega_{\mathbf q})^2} .
$$
The term which is not proportional to $\langle n_{\mathbf q} \rangle$ has the form
$$
E_{\text{e-ph}}^{(2)} = -\sum_{\mathbf k, \mathbf k'} \big| M(\mathbf k - \mathbf k') \big|^2\, \frac{\langle n_{\mathbf k} n_{\mathbf k'} \rangle}{\epsilon_k - \epsilon_{k'} - \hbar\omega_{\mathbf q}} ,
$$

which can be cast into a similar form as the first:

$$
E_{\text{e-ph}}^{(2)} = -\frac{1}{2} \sum_{\mathbf k, \mathbf k'} \big| M(\mathbf k - \mathbf k') \big|^2\, \langle n_{\mathbf k} n_{\mathbf k'} \rangle\,
\frac{2 \hbar\omega_{\mathbf q}}{(\epsilon_k - \epsilon_{k'})^2 - (\hbar\omega_{\mathbf q})^2} ,
$$
where we have used the fact that
$$
\sum_{\mathbf k, \mathbf k'} (\epsilon_k - \epsilon_{k'})\, A(\mathbf k - \mathbf k') = 0
$$
when A is symmetric in $\mathbf k - \mathbf k'$.
Now we study the possible signs of these terms. Although the electron energies are usually
much higher than the phonon energies, differences between electron energies may become
of the same order as the phonon energies. If that happens, we may get very large values from
the denominator in the expressions for the electron-electron interaction, with a positive or a negative
sign. This contribution may become arbitrarily large as this denominator may even vanish!
If the electrons are free to arrange themselves to minimize their energy, it is likely that they will
try to profit from this negative (i.e. attractive) interaction. This is the mechanism behind the
formation of Cooper pairs, which play a crucial role in superconductivity.
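The sign structure discussed here is easy to see numerically. The sketch below (added for illustration; the energies are arbitrary dimensionless values) evaluates the factor $2\hbar\omega_{\mathbf q}/[(\epsilon_k - \epsilon_{k'})^2 - (\hbar\omega_{\mathbf q})^2]$ appearing in the expressions above and shows that it is negative (attractive) whenever the electron energy difference is smaller than the phonon energy:
\begin{verbatim}
# Sign of the phonon-mediated factor 2 hbar w_q / ((eps_k - eps_k')^2 - (hbar w_q)^2).
import numpy as np

hw_q = 1.0                                             # phonon energy (arbitrary units)
delta_eps = np.array([0.2, 0.5, 0.9, 1.1, 2.0, 5.0])   # |eps_k - eps_k'|
factor = 2 * hw_q / (delta_eps**2 - hw_q**2)
for d, v in zip(delta_eps, factor):
    print(f"|eps_k - eps_k'| = {d:3.1f}  ->  {v:+8.3f}")
# negative (attractive) for the first three values, positive for the last three
\end{verbatim}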

Summary for electron-phonon interactions


In this section, we have seen that the electron-phonon interaction can be repre-
sented by the following term in the Hamiltonian:

$$
H_{\text{e-ph}} = \sum_{\mathbf k, \mathbf k', \alpha} M_\alpha(\mathbf k - \mathbf k')\, \big( d_{-\mathbf q,\alpha}^\dagger + d_{\mathbf q,\alpha} \big)\, a_{\mathbf k}^\dagger a_{\mathbf k'} ,
$$
where q = k − k′, reduced to the first Brillouin zone. The amplitude $M_{\alpha,\mathbf q}$ appearing
in this expression is given by:
$$
M_{\alpha,\mathbf q} = i \sqrt{\frac{N \hbar}{2 m \omega_q}}\; \big| \mathbf q \big|\; v_{\text{e-n}}(\mathbf q) .
$$

The fact that the interaction Hamiltonian contains single operators a k and a k† means
that it vanishes when calculated in first-order perturbation theory: the interaction
either fills a state outside the Fermi sphere with an electron, or it empties a state
within the Fermi sphere. Both processes yield a new many-body state which is per-
pendicular to the ground state. The second order contribution however will in gen-
eral be non-zero. This contribution can assume positive or negative values, indi-
cating that the electron-phonon coupling can increase or lower the energy of the electron
system.

10.3 PROBLEMS
1. (a) Give the kinetic and the potential energy of the homogeneous electron gas within
the Hartree-Fock approximation, both in terms of the Fermi wave vector k F .
(b) Calculate the density at which these two energies have the same (absolute) value
in units of electrons/Å3 .
(c) Which of these two energies dominates for higher densities?

2. Consider a system consisting of a large number N of spinless interacting fermions in a


large one-dimensional box of length L. There are periodic boundary conditions. The
particles interact via a delta-function potential, so the Hamiltonian is

$$
H = \sum_k A k^2\, a_k^\dagger a_k + \frac{V}{L} \sum_{k, k', q} a_{k-q}^\dagger a_{k'+q}^\dagger a_{k'} a_k
$$

with A and V constants. The sums run over all permitted values of k, k 0 and q, includ-
ing q = 0.

(a) Show that the Hamiltonian given above is correct for particles with delta-function
interaction.
(b) Calculate the ground state energy of the noninteracting system.
(c) Calculate the ground state energy of the interacting system within the Hartree-
Fock approximation.
(d) Explain why the solution found in (c) must be the exact solution to the problem.

3. We have seen that the total energy of a system consisting of electrons occupying energy
levels ²k and coupled to phonons with frequency spectrum ωk , is given by
à ­ ® ­ ® !
X¯ 0 ¯2
¯ n −q nq + 1
E − E0 = ¯ M (k − k ) 〈n k (1 − n k0 )〉 + .
k,k0 ²k − ²k0 + ×ω−q ²k − ²k0 − ×ωq

In the sum, k0 = k − q and E 0 is the ground state energy for a system consisting of elec-
trons and phonons:
$$
H_0 = \sum_{\mathbf k} \epsilon_k\, n_{\mathbf k} + \sum_{\mathbf q} \hbar\omega_{\mathbf q}\, n_{\mathbf q} ,
$$

where the n k is the number operator for electrons, and n q that of the phonons.

(a) Explain that if the electrons are in the ground state of H0 , we only find contribu-
tions when k is inside, and k0 outside the Fermi sphere.
(b) Use $\omega_{\mathbf q} = \omega_{-\mathbf q}$ and $\langle n_{\mathbf q} \rangle = \langle n_{-\mathbf q} \rangle$ to rearrange this equation to find
$$
E - E_0 = \sum_{\mathbf k, \mathbf k'} \big| M(\mathbf k - \mathbf k') \big|^2\, \langle n_{\mathbf k} \rangle
\left[ \frac{2 (\epsilon_k - \epsilon_{k'})\, \langle n_{\mathbf q} \rangle}{(\epsilon_k - \epsilon_{k'})^2 - (\hbar\omega_{-\mathbf q})^2}
 + \frac{\langle 1 - n_{\mathbf k'} \rangle}{\epsilon_k - \epsilon_{k'} - \hbar\omega_{\mathbf q}} \right] ,
$$
where the term proportional to $\langle n_{\mathbf k} n_{\mathbf k'} \rangle$ has vanished due to antisymmetry in the
sum over k and k′.
(c) The last expression can be used to find the chemical potential defined by the en-
ergy difference associated with adding or subtracting a phonon with wave vector
q from the phonon bath. Calling this chemical potential $\hbar\omega_{\mathbf q}^{(p)}$, show that
$$
\hbar\omega_{\mathbf q}^{(p)} = \hbar\omega_{\mathbf q} + \sum_{\mathbf k} \big| M(\mathbf k - \mathbf k') \big|^2\,
\frac{2 \langle n_{\mathbf k} \rangle (\epsilon_k - \epsilon_{k'})}{(\epsilon_k - \epsilon_{k'})^2 - (\hbar\omega_{-\mathbf q})^2}
$$

where, still, k0 = k − q. We see that the electron-phonon energy renormalizes the


phonon frequencies.
(d) Note that for ²k = ²k0 we may have a singularity in the phonon spectrum. Usually
this divergence disappears after integrating over k and k0 . This singularity is how-
ever at its ‘worst’ when k and k0 are diametrically opposite at the surface of the
Fermi sphere.
We assume the sound velocity to be so small that the term $\hbar\omega$ in the denominator
can be neglected with respect to the ²k . Take q along the z-axis and show that the
integral over k on the Fermi sphere in the expression for the renormalized phonon
energy is proportional to
$$
\int \frac{dy}{q^2 - 2 k_F q y} .
$$
Show that this integral gives a divergence when q = 2k F .

4. (a) Consider a Hamiltonian describing a one-dimensional superconductor:

$$
H = -t \sum_{j=1}^{N-1} \Big( a_j^\dagger a_{j+1} + a_{j+1}^\dagger a_j \Big) - \mu \sum_{j=1}^{N} a_j^\dagger a_j
 + \sum_{j=1}^{N-1} \Big( \Delta\, a_j a_{j+1} + \Delta^*\, a_{j+1}^\dagger a_j^\dagger \Big) .
$$

Here, coefficients t and µ are real-valued and ∆ is complex. Check that this Hamil-
tonian is Hermitian.
(b) We now set ∆ = |∆| e iθ (this is the superconducting gap). We define Majorana
fermions c j as follows:

c A j = e iθ/2 a j + e −iθ/2 a †j ,

c B j = −ie iθ/2 a j + ie −iθ/2 a †j .

Show that the operators c A j and c B j describe Majorana fermions. From now on,
we take θ = 0.
(c) Show that, in the special case µ = 0 and t = ∆, the Hamiltonian, expressed in terms
of Majorana operators, becomes
$$
H = i t \sum_{j=1}^{N-1} c_{B j}\, c_{A\, j+1} .
$$

Note that this Hamiltonian does not depend on c A1 or c B N .


(d) Now we define new fermionic operators:

$$
b_j = \frac{1}{2}\big( c_{B j} + i c_{A\, j+1} \big), \quad j = 1, \dots, N-1, \qquad \text{and} \qquad b_N = \frac{1}{2}\big( c_{B N} + i c_{A 1} \big) .
$$
Check that these operators satisfy the conventional fermion anti-commutation
relations. Express the operators b j in terms of the original fermion operators a j .
(e) Show that H can be formulated in terms of the b’s as follows:
NX
−1 µ 1

H = 2t b i† b i − .
2
i =1 {
(f) Discuss the eigenstates and the spectrum of H , paying particular attention to the
10
{

degeneracy of the ground state. Argue whether this degeneracy is very sensitive
to changes of the parameters away from µ = 0, ∆ = t .

5. In chapter 9, we have seen that the Hamiltonian of a many-body system can be written
as
$$
H = \sum_{jk} a_j^\dagger a_k\, h_{jk} + \frac{1}{2} \sum_{jklm} a_j^\dagger a_k^\dagger\, v_{jklm}\, a_l a_m ,
$$

where
$$
h_{jk} = \big\langle \psi_j \big| h \big| \psi_k \big\rangle
$$
and
$$
v_{jklm} = \int d^3 r \, d^3 r' \; \psi_j^\dagger(\mathbf r)\, \psi_k^\dagger(\mathbf r')\, \frac{1}{|\mathbf r - \mathbf r'|}\, \psi_l(\mathbf r)\, \psi_m(\mathbf r') .
$$
Note that in the integral on the right hand side, the inner product of the spin-parts of
the one-particle states is implicitly assumed, i.e. the spin of the states labelled j and l
must be equal, and the same holds for the spins of states k and m. The Hamiltonian
however does not affect the spin!

(a) Show that, for two particles in the same orbital $\langle \mathbf r | \phi \rangle$ and with opposite spin, the
ground state energy for a wave function which has the form of a Slater determinant is
given by
$$
E_0 = 2 \big\langle \phi \big| h \big| \phi \big\rangle + \int d^3 r \, d^3 r' \; \frac{|\phi(\mathbf r)|^2\, |\phi(\mathbf r')|^2}{|\mathbf r - \mathbf r'|} .
$$
Give a physical interpretation of the second term of this energy.
(b) For φ(r) = exp(−ar ), find an analytic expression for the ground state energy for
two electrons in the helium atom. In atomic units, these electrons are described
by the Hamiltonian
$$
H = -\frac{1}{2} \nabla_1^2 - \frac{1}{2} \nabla_2^2 - \frac{2}{r_1} - \frac{2}{r_2} + \frac{1}{|\mathbf r_1 - \mathbf r_2|} .
$$
In atomic units, distances are calculated in units of the Bohr radius a 0 , the elec-
tron mass and the charge are both 1, and the energy is expressed in Hartrees, 1
Hartree is 27.212 eV.
To find the expression for the energy, you must calculate quite a few integrals.
There is one which is nontrivial:
$$
N^2 \int d^3 r_1 \, d^3 r_2 \; e^{-2 a r_1}\, \frac{1}{|\mathbf r_1 - \mathbf r_2|}\, e^{-2 a r_2} .
$$

The prefactor N 2 ensures normalisation of the orbitals exp(−ar ). The result of


this integral (including that prefactor) is 5a/8.
Minimize the ground state energy with respect to a and compare the result with the
known value for the ground state energy of a helium atom, which is −78.975 eV .

6. Consider a many-body system consisting of N identical spin-1/2 particles with Hamil-


ton operator:
$$
H = \sum_{i=1}^N h(i) \, ; \qquad h(i) = \frac{p_i^2}{2m} + B \sigma_x^{(i)} .
$$
Use the states |kσ〉 as a basis (as usual, $\mathbf p = \hbar \mathbf k$, and σ is the eigenvalue of $\sigma_z$ and there-
fore takes on the values +1 or −1).

(a) Show that the many-body energy operator H in the kσ representation is of the
following form and determine f , g + and g − :
$$
H = \sum_{\mathbf k,\sigma} f\, a_{\mathbf k\sigma}^\dagger a_{\mathbf k\sigma} + \sum_{\mathbf k} \Big( g_+\, a_{\mathbf k,+}^\dagger a_{\mathbf k,-} + g_-\, a_{\mathbf k,-}^\dagger a_{\mathbf k,+} \Big) .
$$
(b) Compute $[H, a_{\mathbf k,\sigma}]$.
(c) Calculate the time-dependent operators c k (t ) = a k,+ +a k,− and d k (t ) = a k,+ −a k,− .
Give the physical meaning of these operators.

7. Consider the jellium model for electrons of density n in three dimensions. Assume that
the interaction between two electrons has the form U (r ) = e 2 /r as a function of the
distance r between them. The ground state energy in the Hartree-Fock approximation
is given by
$$
E = 2 \sum_{\mathbf k} \frac{\hbar^2 k^2}{2m} + \frac{1}{2} \sum_{\mathbf k, \mathbf k', \sigma} \Big\langle a_{\mathbf k,\sigma}^\dagger a_{\mathbf k',\sigma}^\dagger\, U_{\mathbf k - \mathbf k'}\, a_{\mathbf k,\sigma} a_{\mathbf k',\sigma} \Big\rangle .
$$
The sums in this expression are all over the Fermi sphere. The second term contains an
expectation value with respect to the ground state of the noninteracting system (i.e. a
filled Fermi sphere). This second term has a form ∝ k F4 – it is the exchange energy of
the electron gas.
We first consider the ground state with equal numbers of spin-up and -down electrons.

(a) Show that the kinetic energy contribution to E (per unit volume) is of the form
$$
A n^\alpha .
$$
What is α? What is A?
(b) Show that the exchange contribution to E (per unit volume) is of the form
$$
\frac{E_{\text{ex}}}{V} = B n^\beta .
$$
What is β? How does B depend on V ? What do you know about the sign of B ?
(c) We now consider a spin-polarized system where all spins are pointing up. Find
an equation for the density in terms of A and B where the spin-polarized state
becomes stable with respect to the unpolarized state.

11 SUPERCONDUCTIVITY

11.1 INTRODUCTION
Superconductivity is the phenomenon that the electrical resistance drops to zero below a threshold
temperature, accompanied by a complete expulsion of the magnetic field from the interior
of the superconductor (the ‘Meissner effect’). In this chapter we shall consider theoretical
approaches to this phenomenon. Note however, that we mainly discuss superconductivity as
an illustration and culmination point of the formalism developed in the previous chapters,
that is, you should not expect a course on superconductivity here.
The emphasis in this chapter is on the BCS theory of superconductivity. In the next sec-
tion, we first introduce Cooper pairs which are held responsible for the phenomenon of su-
perconductivity. In section 11.3 we shall discuss the form of the superconducting ground
state wave function. This then allows us to formulate a Hamiltonian which is reduced to con-
tain only the interaction terms relevant in the BCS wave function in section 11.4. This Hamil-
tonian will then be diagonalised by writing the wave function in the form of section 11.3.
Finally, we give a brief account of the Landau-Ginzburg description of a superconductor in
section 11.6.

11.2 COOPER PAIRS


In the previous chapter, we learned that the interaction between electrons and phonons can
lead to an attractive effective interaction between electrons. This followed from the Hamilto-
nian
$$
H = H_0 + \sum_{\mathbf k \mathbf k' \mathbf q} \big| M_{\mathbf q} \big|^2\, \frac{\hbar\omega_{\mathbf q}}{(\epsilon_k - \epsilon_{k-q})^2 - (\hbar\omega_{\mathbf q})^2}\; a_{\mathbf k'+\mathbf q}^\dagger a_{\mathbf k-\mathbf q}^\dagger a_{\mathbf k} a_{\mathbf k'} .
$$

Lumping all the scalar (non-operator) terms in this interaction into a coupling constant G q ,
and realizing that we are dealing with an interaction which only depends on the distance
between the electrons, we can write the Hamiltonian, now including spin, as [see (9.4)]:

$$
H = H_0 + \sum_{\sigma,\sigma'} \sum_{\mathbf k \mathbf k' \mathbf q} G_{\mathbf q}\; a_{\mathbf k'+\mathbf q,\sigma'}^\dagger a_{\mathbf k-\mathbf q,\sigma}^\dagger\, a_{\mathbf k,\sigma} a_{\mathbf k',\sigma'} .
$$

The Hamiltonian H0 describes non-interacting particles:



$$
H_0 = \sum_{\mathbf k,\sigma} \epsilon_k\, a_{\mathbf k,\sigma}^\dagger a_{\mathbf k,\sigma} .
$$

In principle, the full Hamiltonian H preserves the number of particles. Therefore, we can
work in a N -electron Hilbert space. However, in the context of superconductivity it turns
out useful to relax the constraint on the number of particles and control their number using
a chemical potential, µ. This is defined as the total energy needed to add a particle to the


system. From statistical mechanics we know that two systems in equilibrium at some tem-
perature T with the possibility to exchange particles, have equal chemical potentials. If one
of these systems (the ‘bath’) is much bigger than the other, the smaller system will lose or
gain particles until its chemical potential is the same as that of the big system.
At zero temperature, if we neglect interactions, the ground state consists of a filled Fermi
sphere, with the Fermi energy equalling the chemical potential µ. In the following, we take µ
as the zero of energy, i.e. we measure particle energies with respect to the chemical potential:
$$
\epsilon_k \to \epsilon_k - \mu \equiv \tilde\epsilon_k .
$$

Neglecting the interactions, we see that the energy can be lowered by adding particles with
negative energy, i.e. particles with ²k < µ, therefore µ is easily seen to control the particle
number.
Superconductivity is the dramatic consequence of the fact that the electron-phonon cou-
pling causes electrons to bind in pairs. The fact that the electron-phonon coupling plays
an essential role in superconductivity was first suggested by the experimental discovery of
the so-called isotope effect: the dependence of the critical temperature, above which super-
conductivity vanishes, on the mass of the nuclei. The electron-phonon coupling is the only
mechanism by which nuclear mass can affect electronic behavior. Superconductivity was
discovered in 1911 by Kamerlingh Onnes in Leiden, three years after he succeeded in liquefying
helium for the first time. Following the 1933 discovery of the Meissner effect, which refers to
the expulsion of a magnetic field inside a superconductor, and the description of this effect
by the London equations in 1935, Landau and Ginzburg formulated in 1948 a successful phe-
nomenological theory of superconductivity. Remarkably, it was not until 1957 that the first
microscopic theory explaining the basic phenomena associated with superconductivity ap-
peared. Named BCS theory after its creators Bardeen, Cooper and Schrieffer, we shall discuss
it in some detail below.
We first want to gain insight into the structure of the ground state of the interacting sys-
tem, which is the equilibrium state at zero absolute temperature. We start with a collection of
particles occupying momentum states inside the Fermi sphere, and then add two particles to
this system. The total energy of the system may be lowered when the particles form a bound
state. We then expect these particles to have zero total momentum in order to minimize their
centre of mass kinetic energy – hence the two particles are composed of plane waves with
momenta k and −k, both outside the Fermi sphere (the states inside the Fermi sphere are
already occupied). In view of the overall anti-symmetry, either the orbital or the spin com-
ponents of the two-particle wave function must be antisymmetric and the other symmetric.
We anticipate that the symmetric orbital wave function will give us the lowest energy as the
other option forbids the particles to approach each other and thereby take advantage of their
attraction. These considerations lead to a wave function of the form
$$
\Psi(x_1, x_2) = \sum_{k > k_F} A_{\mathbf k} \cos\big( \mathbf k \cdot (\mathbf r_1 - \mathbf r_2) \big)\, \frac{1}{\sqrt 2} \big( |{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle \big) ,
$$

where $x_i$ denotes the combined orbital and spin coordinates $(\mathbf r_i, \sigma_i)$. The presence of the co-
sine, and not the sine, in the sum is a consequence of the orbital symmetry. We can replace
the cosine by an exponential function, and require A k ≡ A −k . We omit the spin part of the
wave function in the following.
Inserting this Ψ into the Schrödinger equation, we arrive at
$$
(E - 2\epsilon_k + 2\mu) A_{\mathbf k} = \sum_{|\mathbf k'| > k_F} G_{\mathbf k - \mathbf k'} A_{\mathbf k'} .
$$

The coupling G k−k0 is induced by phonons, and the typical maximum phonon frequency is
the Debye frequency ωD . The wave vector associated with these phonons is usually much
smaller than the Fermi wave vector, and the electron-phonon coupling is relevant only when

$|\epsilon_k - \epsilon_{k'}| < \hbar\omega_D$. Within this range we approximate $G_{\mathbf q}$ by a constant, −G. We then have,
replacing $\epsilon_k$ by $\tilde\epsilon_k + \mu$:
$$
A_{\mathbf k} = G\, \frac{\sum_{\mathbf k'} A_{\mathbf k'}}{2\tilde\epsilon_k - E} ,
$$
where the sum over k′ is understood to be in the narrow range corresponding to $\hbar\omega_D$ and
outside the Fermi sphere. Summing both sides over the same set of k’s, we obtain
$$
\frac{1}{G} = \sum_{\mathbf k} \frac{1}{2\tilde\epsilon_k - E} .
$$

Supposing that NF , the density of states near the Fermi level, is approximately constant over
an energy range ωD , we can replace the sum over k by an energy integral:
$$
\frac{1}{G} = N_F \int_0^{\hbar\omega_D} \frac{d\tilde\epsilon}{2\tilde\epsilon - E} = \frac{N_F}{2} \ln\left( \frac{E - 2\hbar\omega_D}{E} \right) .
$$

Assuming that the coupling is weak, $N_F G \ll 1$, we can then write
$$
E \approx -2\hbar\omega_D\, e^{-2/(N_F G)} .
$$
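The quality of this weak-coupling estimate is easily checked numerically. The sketch below (added here; the values of $N_F G$ are arbitrary illustrative choices) solves the pair equation $1/G = (N_F/2)\ln[(E-2\hbar\omega_D)/E]$ for E and compares with the estimate above:
\begin{verbatim}
# Cooper-pair binding energy (sketch): exact root of the pair equation vs. the
# weak-coupling estimate E ~ -2 hbar w_D exp(-2/(N_F G)). Energies in units of hbar w_D.
import numpy as np
from scipy.optimize import brentq

hw_D = 1.0
for NF_G in (0.1, 0.2, 0.3):
    func = lambda E: 0.5 * NF_G * np.log((E - 2*hw_D) / E) - 1.0
    E_exact = brentq(func, -10*hw_D, -1e-15)
    E_weak = -2 * hw_D * np.exp(-2 / NF_G)
    print(f"N_F G = {NF_G:.1f}:  E = {E_exact:.3e},  weak coupling {E_weak:.3e}")
\end{verbatim}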

We see that, no matter how weak the electron-phonon interaction is, the two new electrons
have a negative energy – hence they are in a bound state that will be absorbed into the ground
state. This shows that the electrons form pairs, held together by the electron-phonon inter-
action – these are the Cooper pairs. Such pairs behave more like bosons than fermions, and
for bosons it is known that above a certain density, they macroscopically occupy the ground
state as in a Bose-Einstein condensate. We shall however not use that picture in these notes.

11.3 THE BCS WAVE FUNCTION


In the previous section, we have seen that the Fermi sphere is unstable against the formation
of a bound (Cooper) pair of electrons in a spin-singlet state (antisymmetric spin state) and
a symmetric orbital state. But what happens when we create a second Cooper pair, a third,
etcetera? It may be that after creating a large number of Cooper pairs, the Fermi sea gets
distorted and new Cooper pairs are no longer favourable. This turns out to be the scenario
indeed, and we will end up with a Fermi sphere which is smaller than the sphere for non-
interacting electrons, plus a ‘layer’ of Cooper pairs. This picture is conceivable and turns
out to reflect the actual situation, but a quantitative description seems rather complicated.
The description can be simplified by anticipating a particular form of the wave function, or
BCS wave function [for the original paper, see J. Bardeen, L. N. Cooper and J. R. Schrieffer,
Phys. Rev. 108, 1175 (1957)]. Excellent descriptions can also be found in the books by De
Gennes (Superconductivity of Metals and Alloys, Benjamin, New York, 1966) and Tinkham (In-
Gennes (Superconductivity of Metals and Alloys, Benjamin, New York, 1966) and Tinkham (In-
troduction to Superconductivity, McGraw Hill, 1975). We shall construct the BCS form from
a mean-field approximation applied to the interaction term of the Hamiltonian. Let us fi-
nally anticipate the physical picture corresponding to the superconducting electron state.
For small k, the electrons are still in the normal state which is appropriately described by a
Slater determinant of independent plane waves. Near the Fermi wave vector, we have a layer
of Cooper-pair states. Outside that layer, the electron states are unoccupied.
In the previous section we have seen that the electrons gain energy by forming Cooper
pairs. This suggests choosing a particular form of wave function

ΦN = Aϕ(r1 − r2 )ϕ(r3 − r4 ) . . . ϕ(rN −1 − rN ) (↑↓↑↓ . . . ↑↓) .

Here, A in front of the expression is the anti-symmetrisation operator which ensures that
the wave function has the appropriate fermion exchange anti-symmetry. This form deserves
some explanation. First of all, we see that it does not have the form of a Slater determinant,

as the electrons are correlated in pairs like ϕ(r1 − r2 ) etc. There is no way to recast this into
a single Slater-determinant form. Now remember that the Slater determinants represented
the eigenstates of a Hamiltonian of non-interacting particles, then it is clear that the wave
function given here carries the fingerprint of the electron-electron interactions – it is an in-
teracting wave function.
The spins are collected together in the last part of this expression for the wave function. In
this part, the spins are assumed to correspond to the particle ordering: 123. . . . From this we
see that each orbital pair function ϕ(ri − r j ) is multiplied by a wave function of two opposite
spins.
Let us now write
$$
\varphi(\mathbf r) = \sum_{\mathbf k} g_{\mathbf k}\, e^{i\mathbf k\cdot \mathbf r} .
$$

The total wave function Φ can then be written as

$$
\Phi_N = \sum_{\mathbf k_1, \dots, \mathbf k_{N/2}} g_{\mathbf k_1} g_{\mathbf k_2} \cdots g_{\mathbf k_{N/2}}\;
\mathcal A\, \Big[ e^{i\mathbf k_1\cdot(\mathbf r_1 - \mathbf r_2)} |1\!\uparrow; 2\!\downarrow\rangle \cdots e^{i\mathbf k_{N/2}\cdot(\mathbf r_{N-1} - \mathbf r_N)} |(N-1)\!\uparrow; N\!\downarrow\rangle \Big] .
$$

We see that in this wave function, the spin-orbital |k, ↑〉 is paired with |−k; ↓〉. It is the job of
the anti-symmetrisation operator A to turn this into an anti-symmetrized product. Note that
each term in the sum over the wave vectors can be written as Slater determinant. This is not
in contradiction with what was said above about |ΦN 〉, as we have written this wave function
as a superposition of Slater determinants. The anti-symmetrisation operation yields a Slater
determinant of the form
|k1 , ↑; −k1 ↓; . . . kN /2 ↑; −kN /2 ↓〉F ,
where the subscript ‘F’ with the ket means that this is a state in Fock space, i.e. the anti-
symmetrisation has been taken care of. We see that the k-vector and the spin have become
entangled in this form: it is no longer possible to write these states as a product of a k part
and a spin part.
We can write this state also in a different form, using fermion creation operators:

$$
|\mathbf k_1, \uparrow; -\mathbf k_1, \downarrow; \dots; \mathbf k_{N/2}, \uparrow; -\mathbf k_{N/2}, \downarrow\rangle_F
 = a_{\mathbf k_1,\uparrow}^\dagger a_{-\mathbf k_1,\downarrow}^\dagger \cdots a_{\mathbf k_{N/2},\uparrow}^\dagger a_{-\mathbf k_{N/2},\downarrow}^\dagger |0\rangle . \tag{11.1}
$$

The BCS wave function is a superposition of these Slater determinants with expansion coef-
ficients g k1 · · · g kN /2 . Our job is now to vary N and the g k in order to minimize the quantity
E − µN , which is the free energy of the system.
It turns out rather difficult to handle the wave functions in the form given above due to
the fact that it is a state for a fixed particle number, which requires minimization with respect
to the g coefficients for any particle number N . A variable number is therefore preferred,
however keeping the pair-wise coupling! It turns out that a more convenient form can be
used where we replace the g k by pairs of real numbers u k , v k :
$$
\big| \tilde\Phi \big\rangle = \prod_{\mathbf k} \Big( u_{\mathbf k} + v_{\mathbf k}\, a_{\mathbf k,\uparrow}^\dagger a_{-\mathbf k,\downarrow}^\dagger \Big) |0\rangle . \tag{11.2}
$$

Expanding the product on the right hand side generates all possible wave functions with all
possible particle numbers of the form (11.1), provided
$$
\frac{v_{\mathbf k}}{u_{\mathbf k}} = g_{\mathbf k} \qquad \text{and} \qquad u_{\mathbf k}^2 + v_{\mathbf k}^2 = 1 ,
$$
the latter condition ensuring proper normalisation of $|\tilde\Phi\rangle$. It is easy to see that for this wave
function, the particle number is


$$
\langle N \rangle = \sum_{\mathbf k} 2 v_{\mathbf k}^2 ,
$$
as each combination $a_{\mathbf k,\uparrow}^\dagger a_{-\mathbf k,\downarrow}^\dagger$ generates the two particles of a Cooper pair.

Let us pause for a second and summarize where we stand right now. We have first
constructed the interacting wave function |Φ〉, in which the electrons are pair-wise corre-
lated. Then we have expanded this function in terms of Slater determinants and seen that in
these determinants, the spin-orbitals |k, ↑〉 and |−k, ↓〉 always occur in pairs. Realising that
this would urge us to solve the complicated problem of calculating first the energy for arbi-
trary N , followed by minimizing the free energy E − µN with respect to both the expansion
coefficients g k and N , we have changed to a different form, (11.2), which encapsulates all
expansions of our Slater determinants for arbitrary particle numbers.
Let us now carry out the remainder of our programme and evaluate the free energy for
the state of (11.2). The Hamiltonian has the form [see (9.4)]

$$
H = \sum_{\mathbf k,\sigma} \epsilon_k\, a_{\mathbf k,\sigma}^\dagger a_{\mathbf k,\sigma}
 + \sum_{\sigma,\sigma'} \sum_{\mathbf k \mathbf k' \mathbf q} G_{\mathbf k,\mathbf k',\mathbf q}\; a_{\mathbf k'+\mathbf q,\sigma'}^\dagger a_{\mathbf k-\mathbf q,\sigma}^\dagger\, a_{\mathbf k,\sigma} a_{\mathbf k',\sigma'} . \tag{11.3}
$$

Note that the interaction term depends on three wave vectors, and not only on the momen-
tum transfer q. This is because the interaction depends on the lattice vibrations and is there-
fore not translationally invariant (we will return to this shortly). As already discussed above,
the fact that we should minimize the free energy E − µN can be taken care of by replacing

²k → ²k − µ ≡ ²̃k

and we define ³ ´
† †
²̃k a k,↑
X
H0 = a k,↑ + a −k,↓ a −k,↓ .
k

The contribution of this term to the free energy is easily seen to be

$$
\big\langle \tilde\Phi \big| H_0 \big| \tilde\Phi \big\rangle = 2 \sum_{\mathbf k} v_{\mathbf k}^2\, \tilde\epsilon_k .
$$

The interaction term is more difficult to analyse. This term removes two particles from
$|\tilde\Phi\rangle$ and then creates two particles, possibly in different states. We obviously have the two
possibilities that we already used in the Hartree-Fock theory (see chapter 10): (i) q = 0 or (ii)
q = k−k0 . The first term just calculates the interaction between the two particle distributions
and the second one results from the anti-symmetry of the wave function. It turns out that
these contributions can be incorporated into the single-particle energies ²k , hence they do
not cause Cooper pairing.
As the ground state does not have a fixed set of filled states, there is another possibility:
the annihilation operators may remove a pair k ↑; −k, ↓ and replace it by another pair l ↑; −l, ↓,
where k 6= l. Because in the state on the right hand side, the l pair should be empty and the
k pair occupied, we have a term proportional to $u_{\mathbf l}$ and to $v_{\mathbf k}$ from the wave function on the
right hand side of the expectation value (the ‘ket’-part). Then we have a contribution with
v l and u k from the left hand wave function (the ’bra’ part), as in that state the l pair must be
occupied and the k pair empty. The contribution obtained should therefore be proportional
to u_k v_k u_l v_l. Furthermore, we need k = −k′ and q = k − l. We then obtain a contribution in (11.3)

⟨Φ̃| H_int |Φ̃⟩ = Σ_{k,l} G_{k,l} u_k v_k u_l v_l .

So we obtain for the total free energy

⟨Φ̃| H − µN |Φ̃⟩ = 2 Σ_k v_k² ε̃_k + Σ_{k,l} G_{k,l} u_k v_k u_l v_l .    (11.4)

We must find the minimum of this function with respect to the u k and v k , subject to the
condition that u k2 + v k2 = 1. We realise this constraint by parametrising u k and v k by a single
variable θk :
u k = sin θk ; v k = cos θk .

In terms of θk , the expectation value of the free energy now reads (using sin θ cos θ = sin(2θ)/2):

⟨Φ̃| H − µN |Φ̃⟩ = 2 Σ_k ε̃_k cos² θ_k + (1/4) Σ_{k,l} sin(2θ_k) sin(2θ_l) G_{k,l} .

The minimum of this free energy is found by putting its derivatives with respect to the θk to
zero:
0 = ∂/∂θ_k ⟨Φ̃| H − µN |Φ̃⟩ = −2 ε̃_k sin(2θ_k) + cos(2θ_k) Σ_l sin(2θ_l) G_{k,l} ,

which we can reduce to


ε̃_k tan(2θ_k) = (1/2) Σ_l G_{k,l} sin(2θ_l) .
We now introduce the quantities:

∆_k = − Σ_l G_{k,l} u_l v_l ;

E_k = √( ε̃_k² + ∆_k² ) .

Using these, we can simplify the minimum condition to

tan(2θ_k) = − ∆_k / ε̃_k ,

and this in turn enables us to write


2 u_k v_k = sin(2θ_k) = ∆_k / E_k ,    (11.5)
−u_k² + v_k² = cos(2θ_k) = − ε̃_k / E_k .    (11.6)

Using the first of these equations in the definition for ∆k above, we arrive at an implicit equa-
tion for the ∆k :
∆_k = − Σ_l G_{k,l} ∆_l / (2 E_l) .    (11.7)
Let us now try to solve this self-consistency equation. The first solution we immediately
recognise is ∆k = 0 for all k. This means that either v k = 0 or u k = 0. As in that case E k = ²̃k for
²̃k > 0 and E k = −²̃k for ²̃k < 0, we see from (11.6) that

v_k = 1  for ε̃_k < 0 ;      v_k = 0  for ε̃_k > 0 .    (11.8)

This is precisely a filled Fermi sphere, and this does not give us superconductivity.
In order to take advantage of the correlated structure of the BCS wave function to lower
the energy, we must have a non-zero ∆k , requiring a non-zero interaction. BCS chose:
G_{k,l} = −G  for |ε̃_k|, |ε̃_l| < ħω_D ;   G_{k,l} = 0  otherwise,

where ω_D denotes the Debye frequency as usual. Now the self-consistency equation (11.7)
can be solved. First we realise that

∆_k = 0  for |ε̃_k| > ħω_D ;   ∆_k = ∆ (independent of k)  for |ε̃_k| < ħω_D .

Putting this into (11.7) and replacing sums by integrals by means of the density of states (see
section 11.2) gives
∆ = N_F G ∆ ∫_{−ħω_D}^{ħω_D} 1 / ( 2 √( ∆² + ε̃² ) ) dε̃ ,

from which we find


1/(N_F G) = ∫_0^{ħω_D} 1 / √( ∆² + ε̃² ) dε̃ = sinh⁻¹( ħω_D / ∆ ) .

Note that for this equation to hold, the interaction should be attractive, i.e. G > 0. We rework
the result analogously to section 11.2 to obtain

∆ = 2 ħω_D e^{−1/(N_F G)}

when N_F G ≪ 1. In practice, N_F G only rarely exceeds 0.3.
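To get a feeling for the numbers, the self-consistency condition can also be solved numerically. The sketch below (in Python; the values N_F G = 0.3 and ħω_D = 30 meV are purely illustrative and not tied to a particular material) finds ∆ by root finding and compares it with the weak-coupling formula above:

import numpy as np
from scipy.optimize import brentq
from scipy.integrate import quad

hw_D = 30.0   # hbar*omega_D in meV (illustrative value)
NFG  = 0.3    # dimensionless coupling N_F * G (illustrative value)

def gap_residual(delta):
    # residual of  1/(N_F G) = int_0^{hw_D} d eps / sqrt(eps^2 + delta^2)
    integral, _ = quad(lambda eps: 1.0 / np.sqrt(eps**2 + delta**2), 0.0, hw_D)
    return integral - 1.0 / NFG

delta_numeric = brentq(gap_residual, 1e-6, hw_D)      # numerical root
delta_exact   = hw_D / np.sinh(1.0 / NFG)             # from sinh^-1(hw_D/Delta) = 1/(N_F G)
delta_weak    = 2.0 * hw_D * np.exp(-1.0 / NFG)       # weak-coupling formula
print(delta_numeric, delta_exact, delta_weak)

All three numbers come out close to 2.1 meV for these parameters, illustrating how accurate the weak-coupling formula already is at N_F G = 0.3.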


It remains to check whether the energy is reduced with respect to that of the non-interacting
wave function (i.e. a Slater determinant corresponding to a filled Fermi sphere). For this
purpose, it is useful to calculate the u k and v k . Combining (11.6) with the normalisation
u k2 + v k2 = 1, it is easy to see that for the BCS state:
 
u_k² = (1/2) ( 1 + ε̃_k / √( ε̃_k² + ∆_k² ) )

and

v_k² = (1/2) ( 1 − ε̃_k / √( ε̃_k² + ∆_k² ) ) ,

whereas the normal state is characterised by (11.8). These can be used directly in the expres-
sion for the total energy (11.4) to obtain
 
E_BCS = Σ_k ε̃_k ( 1 − ε̃_k / √( ε̃_k² + ∆_k² ) ) + Σ_{k,l} G_{k,l} ∆_k ∆_l / ( 4 √( ε̃_k² + ∆_k² ) √( ε̃_l² + ∆_l² ) ) .    (11.9)

In problem 1 it is shown that the energy difference results in

⟨Φ_N| Ĥ |Φ_N⟩ − ⟨Φ_BCS| Ĥ |Φ_BCS⟩ = N_F ∆² / 2 .

This means that the ground state energy of the BCS state is lower than that of the normal state. The
BCS state however only wins by a small amount: the gain is only of the order of ∆²/E_F ≈ 10⁻² K per
particle.
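This estimate can be checked with a small numerical sketch that evaluates the energies inside the interaction shell, reusing the illustrative parameters of the previous code fragment; the gap equation is used to rewrite the interaction double sum, and the density of states is assumed constant (N_F), as everywhere in this section:

import numpy as np
from scipy.integrate import quad

NF, hw_D, NFG = 1.0, 30.0, 0.3            # density of states, cutoff, coupling (illustrative)
G     = NFG / NF
Delta = hw_D / np.sinh(1.0 / NFG)
E     = lambda eps: np.sqrt(eps**2 + Delta**2)

# BCS energy inside the shell: kinetic part of (11.9) plus the interaction part,
# where the gap equation turns the double sum into -Delta^2/G
kinetic, _  = quad(lambda eps: NF * eps * (1.0 - eps / E(eps)), -hw_D, hw_D)
E_BCS_shell = kinetic - Delta**2 / G

# normal state: every pair state below the Fermi level occupied by two electrons
E_N_shell = 2.0 * NF * quad(lambda eps: eps, -hw_D, 0.0)[0]

print(E_N_shell - E_BCS_shell, NF * Delta**2 / 2.0)   # nearly equal for Delta << hbar*omega_D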

11.4 THE BCS HAMILTONIAN


From now on we shall denote single particle states by k which denotes the momentum k and
the spin s. A Cooper pair can then be denoted as (k, −k), which means that we have two
particles with opposite spin and opposite momentum. The Hamiltonian containing only the
interaction terms representing the replacement of one Cooper pair by another then has the
form

H = Σ_k ε_k ( a†_k a_k + a†_{−k} a_{−k} ) − Σ_{kl} G_{kl} a†_l a†_{−l} a_{−k} a_k .

Finding the ground state for our Hamiltonian is impossible without making approxima-
tions. A classic approximation scheme that predicts the behaviour of standard superconduc-
tors remarkably well is the mean field approximation. In this approximation, we neglect the

deviations – to a certain order – of the operators a_{−k} a_k and a†_k a†_{−k} from their expectation
values in the ground state (which still must be determined). To be specific, we write

a†_l a†_{−l} a_{−k} a_k = ( a†_l a†_{−l} − ⟨a†_l a†_{−l}⟩ + ⟨a†_l a†_{−l}⟩ ) ( a_{−k} a_k − ⟨a_{−k} a_k⟩ + ⟨a_{−k} a_k⟩ )

and expand the products, neglecting the second-order fluctuation term ( a†_l a†_{−l} − ⟨a†_l a†_{−l}⟩ )( a_{−k} a_k − ⟨a_{−k} a_k⟩ ):

H_MFA = Σ_k ε_k ( a†_k a_k + a†_{−k} a_{−k} ) − Σ_{kl} G_{kl} ( a†_l a†_{−l} ⟨a_{−k} a_k⟩ + a_{−k} a_k ⟨a†_l a†_{−l}⟩ − ⟨a_{−k} a_k⟩ ⟨a†_l a†_{−l}⟩ ) .

Noticing that the expectation values are scalars, we replace them by numbers C_k = ⟨a_{−k} a_k⟩
and C*_k = ⟨a†_k a†_{−k}⟩. Furthermore, calling

∆_k = Σ_l G_{kl} C_l ,

the Hamiltonian assumes the form

H = Σ_k ε_k ( a†_k a_k + a†_{−k} a_{−k} ) + Σ_k ∆_k C*_k − Σ_k ( ∆*_k a_{−k} a_k + ∆_k a†_k a†_{−k} ) .

It turns out that the phase of ∆k can be chosen arbitrarily (to show this, we need arguments
related to gauge invariance which are beyond the scope of these notes) – we choose ∆k to be
real.
The resulting Hamiltonian is quadratic (in the sense that every term contains at most two
operators). If the Hamiltonian contained only products of a creation and an annihilation
operator, the diagonalisation would be straightforward, as H could then be com-
pletely formulated in terms of number operators. In our case, however, the products of two
annihilation and two creation operators cause problems. Related to this is that the Hamilto-
nian in this form does not conserve particle number, although the original Hamiltonian did.
This lack of particle number conservation is a consequence of the mean field approximation.
There is a trick for transforming the mean field Hamiltonian into a diagonalizable one.
This trick is essentially a linear transformation of the operators a k and a k† which removes the
‘difficult’ terms (i.e. the terms with two annihilation or two creation operators) and leaves
only number operators. This transformation, which is closely related to the construction of
the BCS wave function discussed in the previous section, is called the ‘Bogoliubov-Valatin’
transformation. The most general linear transformation of the a-operators is

α_k = u_k a_k − v_k a†_{−k} ,
β_k = u_k a_{−k} + v_k a†_k .

Note that the u k and v k are numbers; the a k , αk , βk and their Hermitian conjugates are op-
erators. We would like the operators αk and βk to obey the usual anti-commutation relations
for fermion operators. This imposes a condition on the coefficients u k and v k :

u k2 + v k2 = 1.

Inverting the transformation leads to

a k = u k αk + v k β†k ,
a −k = u k βk − v k α†k .

Now we use these transformations to recast the original Hamiltonian into a form containing
only the αk and βk (and their Hermitian conjugates). The non-interacting part then yields
a†_k a_k + a†_{−k} a_{−k} = 2 v_k² + ( u_k² − v_k² )( α†_k α_k + β†_k β_k ) + 2 u_k v_k ( α†_k β†_k + β_k α_k ) .

For the terms deriving from the interaction, we have


a_{−k} a_k + a†_k a†_{−k} = 2 u_k v_k − 2 u_k v_k ( α†_k α_k + β†_k β_k ) + ( u_k² − v_k² )( α†_k β†_k + β_k α_k ) .

Note that except for the normalization condition for the u k and v k , we still have the freedom
to choose these parameters. We choose them such as to make the coefficient of the terms
containing two annihilation or two creation operators zero. Such terms arising from the non-
interacting and interacting parts of the Hamiltonian combine into
Σ_k [ 2 ε_k u_k v_k − ∆_k ( u_k² − v_k² ) ] ( α†_k β†_k + β_k α_k ) .

Therefore we impose

2 ε_k u_k v_k − ∆_k ( u_k² − v_k² ) = 0 .

Defining g k = v k /u k , we find that the positive value of g k is given by

g_k = ( E_k − ε_k ) / ∆_k ,

with

E_k = √( ε_k² + ∆_k² ) .

Combining this with u k2 + v k2 = 1, we have, similar to the result obtained in the previous sec-
tion:  
u_k² = (1/2) ( 1 + ε_k / E_k ) = (1/2) ( 1 + ε_k / √( ε_k² + ∆_k² ) )

and

v_k² = (1/2) ( 1 − ε_k / E_k ) = (1/2) ( 1 − ε_k / √( ε_k² + ∆_k² ) ) .
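Before proceeding, a quick numerical sanity check of these expressions (the values of ε_k and ∆_k below are arbitrary, chosen only for illustration):

import numpy as np

eps, delta = 0.7, 0.3                       # single-particle energy and gap (arbitrary units)
E  = np.hypot(eps, delta)                   # quasi-particle energy
u2 = 0.5 * (1.0 + eps / E)
v2 = 0.5 * (1.0 - eps / E)
u, v = np.sqrt(u2), np.sqrt(v2)

print(np.isclose(u**2 + v**2, 1.0))                          # normalisation
print(np.isclose(2*eps*u*v - delta*(u**2 - v**2), 0.0))      # off-diagonal terms vanish
print(np.isclose(v/u, (E - eps)/delta))                      # g_k = v_k/u_k as derived above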

The part of the Hamiltonian that survives is


H_MFA = 2 Σ_k ε_k v_k² − Σ_k ∆_k u_k v_k + Σ_k [ ε_k ( u_k² − v_k² ) + 2 ∆_k u_k v_k ] ( α†_k α_k + β†_k β_k ) .

This has the form

H_MFA = Σ_k [ M_k + N_k ( n_{α,k} + n_{β,k} ) ] ,

where n_{α,k} and n_{β,k} are the number operators for the α and β particles; M_k and N_k are
numbers. The α and β ‘particles’ are in fact the excitations of the ground state with definite energy.
This brings us to the notion of a ‘quasi-particle’. If we were to add an electron to the sys-
tem without changing the state of the ‘resident’ particles, we obtain a state in the Fock space
which is not an eigenstate of the Hamiltonian. Only judicious combinations of real particles
give excitations which are eigenstates of the Hamiltonian – these are the α and β particles. As
they are not actual electrons or holes, they are called quasi-particles.
The vacuum state of the electron system satisfies

a k |0〉 = 0

for each k. Similarly, the quasi-particle ground state is defined as the state satisfying

αk |0〉quasi = βk |0〉quasi = 0

for all k. From this, and using the definition of the quasi-particle operators αk and βk , we can
express the quasi-particle vacuum state in terms of the electron vacuum state:
|0⟩_quasi = ∏_k ( u_k + v_k a†_k a†_{−k} ) |0⟩ .    (11.10)

The fact that this is the quasi-particle vacuum can be verified by acting with αk and βk on it.
As the vacuum state is unique, the given expression must be correct.
Let us have a closer look at (11.10). We run over all possible k values. With probability
|v_k|² we create a pair of states (k, −k) and with probability |u_k|² we do nothing, i.e. we leave
an empty state empty. That is, the relative sizes of u k and v k are parameters controlling the
particle number. For u k close to 1 (and, consequently, v k close to 0), there are no electrons
with momentum k. For u k close to zero, hence v k close to one, we have ‘ordinary’ electron
pairs.
Let us return to the definition of ∆k :

∆_k = Σ_l G_{kl} C_l ,

with C_l = ⟨a_{−l} a_l⟩. Let us work out this expectation value for the ground state of the α and β
particles:

C_l = quasi⟨0| a_{−l} a_l |0⟩_quasi = ⟨0| ∏_{k,k′} ( u_{k′} + v_{k′} a_{−k′} a_{k′} ) a_{−l} a_l ( u_k + v_k a†_k a†_{−k} ) |0⟩ = u_l v_l .

In obtaining the right-hand side, we made use of the fact that for k ≠ l, the terms of the quasi-particle
ground state in the ket vector are orthonormal to the terms in the bra vector. This
forces the k′-s to be equal to the k-s. Thus we have

∆_k = Σ_l G_{kl} u_l v_l = (1/2) Σ_l G_{kl} ∆_l / √( ε̃_l² + ∆_l² ) .

This is an implicit equation for ∆k – it is known as the gap equation.


The quasi-particle Hamiltonian can now be written as
H = 2 Σ_{k>0} ε̃_k v_k² − Σ_k ∆_k u_k v_k + Σ_{k>0} E_k ( α†_k α_k + β†_k β_k ) .

We can simplify the analysis considerably by neglecting the kl dependence of G: G kl ≡ G.


Under this approximation, ∆ becomes a constant independent of k and we can write:

H = 2 Σ_{k>0} ε̃_k v_k² − ∆²/G + Σ_{k>0} E_k ( α†_k α_k + β†_k β_k ) .

The gap ∆ is now given by the implicit equation

1 = (G/2) Σ_k 1 / √( ε̃_k² + ∆² ) .

Now that we have cast the Hamiltonian into a diagonalisable form, we can study the structure
of its ground-state wave function. Let us first consider the states at low energies. We
assume that ∆ is small with respect to ε_F. Low energy means that ε̃ is close to −ε_F. Then we
find that

u_k² = (1/2) ( 1 + ε̃_k / √( ε̃_k² + ∆² ) )

[Figure 11.1 appears here.]

FIGURE 11.1: Variation of the coefficients |u_k|² and |v_k|² with energy (in units of the Fermi energy). The width of the transition region is determined by the parameter ∆ (here, ∆ = 0.1 ε_F).

is close to zero and

v_k² = (1/2) ( 1 − ε̃_k / √( ε̃_k² + ∆² ) )

is close to one. As already mentioned, this creates independent (k, −k) pairs, indistinguish-
able from ordinary electrons in the independent particle picture. These particles are said to
be in the normal state.
At high energies, we have u k close to 1 and v k close to 0 – therefore, we have no particles.
For an energy range ∆ near ²k = ²F , we see that u k and v k vary between 0 and 1: this is the re-
gion where we find quasi-particles. Figure 11.1 shows the variation of u k and v k with energy,
illustrating the behaviour described.
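The curves of figure 11.1 are easy to reproduce. The following sketch (with ∆ = 0.1 ε_F, the same illustrative choice as in the figure) plots |u_k|² and |v_k|² as a function of energy:

import numpy as np
import matplotlib.pyplot as plt

eF    = 1.0
Delta = 0.1 * eF                          # same choice as in figure 11.1
eps   = np.linspace(0.0, 2.0 * eF, 400)   # single-particle energy
et    = eps - eF                          # energy measured from the Fermi level
E     = np.sqrt(et**2 + Delta**2)

plt.plot(eps / eF, 0.5 * (1.0 + et / E), label='|u|^2')
plt.plot(eps / eF, 0.5 * (1.0 - et / E), label='|v|^2')
plt.xlabel('Energy (units of the Fermi energy)')
plt.legend()
plt.show()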
The ground state energy is given by

E_BCS = 2 Σ_k ε̃_k v_k² − Σ_k ∆_k u_k v_k .

Substituting the expressions for u k and v k , we can write the total energy as
E_BCS = − Σ_k ( E_k − ε̃_k )² / ( 2 E_k ) < 0 .

Creating a quasi-particle excitation α†_k |0⟩_quasi requires an energy E_k = √( ε̃_k² + ∆_k² ), which at the
Fermi energy (ε̃_k = 0) is equal to ∆. This gap is responsible for the exponential temperature
dependence of the specific heat below the critical temperature.
The operator a k† a k counts the number of electrons in the momentum-spin state k. The
number of electrons is therefore given by
N = Σ_k quasi⟨0| a†_k a_k + a†_{−k} a_{−k} |0⟩_quasi = 2 Σ_k v_k² .

We finally note that the equation for the gap leads to the same relation between G and ∆
as found for the single Cooper pair:

∆ = 2 ħω_D e^{−1/(N_F G)} .

11.5 SUMMARY OF BCS THEORY


In the previous sections, we have seen that an attractive interaction leads to bound states of
electron pairs, which have opposite k and opposite spin – these are the Cooper pairs. In
section 11.2, we have seen that each Cooper pair lowers the energy by an amount

∆E = −ħω_D e^{−1/(N_F G)} .

Here, NF is the density of states near the Fermi energy and G is the effective coupling strength
by which electrons are attracted: its physical background is the electron-phonon interaction,
which explains the presence of the pre-factor ωD which is the Debye frequency, a cut-off of
the phonon frequency.
In section 11.3, we have analysed the structure of the many-body wave function which
contains two-particle correlations due to the Cooper pair interaction. The BCS wave function
is given as

|Φ̃⟩ = ∏_k ( u_k + v_k a†_{k,↑} a†_{−k,↓} ) |0⟩ ,
which contains wave functions for all possible particle numbers. These wave functions fur-
thermore contain entanglement of the wave vector and spin 1 . It turns out that the corre-
sponding Hamiltonian containing the terms which are exploited by the BCS wave function to
lower the energy, reads
H = Σ_k ε_k ( a†_k a_k + a†_{−k} a_{−k} ) − Σ_{kl} G_{kl} a†_l a†_{−l} a_{−k} a_k .

Here we have used the notation k ≡ (k, ↑) and −k = (−k, ↓).


This Hamiltonian was analysed in section 11.4 using mean-field theory. In this theory,
products of two creation operators are replaced by their expectation values:

a −k a k → 〈a −k a k 〉 ≡ C k

and similar for creation operators. Fluctuations from these expectation values to second or-
der are then removed from the Hamiltonian. This leads to a quadratic Hamiltonian (that is,
a Hamiltonian containing products of at most two creation/annihilation operators) which
still contains the unknown values C k and their complex conjugates C k∗ . These have to be
solved for self-consistently. Diagonalisation proceeds using the Bogoliubov-Valatin transformation,
which leads to quasi-particles destroyed by the operators α and β (and created by their
Hermitian conjugates). The quasi-particles carry energies
E_k = √( ε̃_k² + ∆² ) .

Here, ∆ is the superconducting gap; its value is given by

∆_k = Σ_l G_{kl} u_l v_l .

11.6 LANDAU-GINZBURG THEORY AND THE LONDON EQUATIONS

In the previous section we have considered the BCS ground state and analyzed excitations
from this. Seven years before BCS theory was developed, there already existed a successful
phenomenological description of superconductivity: the Landau Ginzburg theory. A detailed
discussion of this theory is beyond the scope of these notes. Instead, we shall give a short
description of it. Landau and Ginzburg have written down a phenomenological expression
for the free energy difference ∆F between a superconductor and the normal state. This ex-
pression is

∆F = ∫ [ (1/(2m*)) ψ*(r) ( (ħ/i) ∇ − (e*/c) A )² ψ(r) + α |ψ(r)|² + (β/2) |ψ(r)|⁴ ] d³r .
Here, ψ(r) is not an operator, but rather a complex field, the phase of which represents the
phase of the wave function, and its modulus squared gives the ‘density of superconducting electrons’, n_s:

n_s(r) = |ψ(r)|² .

1 Entanglement will be formally defined in the next section; for now it means that the wave function of two elec-

trons cannot be written as the product of a k-part and a spin-part



The form of this free energy can be understood as follows. The free energy will depend on the
field ψ and its gradients. As the free energy is an extensive quantity, we can think of it as a
sum of the free energies within small boxes into which we have divided our large system. This
explains the integral over quantities depending only on r. Assuming that ψ and its gradients
are small, the integrand can be expanded in a Taylor expansion of the field ψ and its gradient.
This expression must be real, and it should be symmetric under reversing the sign of ψ (for
an isotropic system). This explains the powers of 2 and 4. In principle, higher powers will be
present, but the ones used here are sufficient for describing the main features.
The free energy reaches its minimum value in equilibrium. We find this minimum energy
by varying the field ψ:

(1/(2m*)) ( (ħ/i) ∇ − (e*/c) A )² ψ(r) + β |ψ(r)|² ψ(r) = −α ψ(r) .

One might ask why we have taken a complex field, and not a real one. The reason is that
we want to describe currents, and currents are related to the variation of the phase of the
wavefunction. From the fact that |ψ(r)|2 gives the superconducting density, the current must
be given by
j = (e*/(2m*)) [ ψ*(r) ( (ħ/i) ∇ − (e*/c) A ) ψ(r) + ψ(r) ( −(ħ/i) ∇ − (e*/c) A ) ψ*(r) ] .
Putting e ∗ = 2e and m ∗ = 2m, as the ‘particles’ described by ψ are Cooper pairs, we obtain
from the last equation:

j_s = (e/(4m)) [ ψ*(r) ( (ħ/i) ∇ − (2e/c) A ) ψ(r) + ψ(r) ( −(ħ/i) ∇ − (2e/c) A ) ψ*(r) ] ,

where the subscript ’s’ denotes that this is the current of the superconducting fraction of
electrons. Neglecting the r-dependence of ψ with respect to that of A, we then obtain

j_s = − ( e² / (mc) ) n_s A ,
which, after taking the curl in the left and right hand side, gives

∇ × j_s = − ( e² / (mc) ) n_s B .
This equation is well known for superconductors – it is called the London equation. Note that
the approximation in which we have neglected the variation of ψ is justified in the case where
the superconductor is homogeneous and the field is weak: in that case, ψ is approximately
the equilibrium value with a small, space-dependent perturbation on top of it.
We now want to show that the London equation leads to the expulsion of magnetic fields
from the superconductor. Using the Maxwell equation


∇ × B = (4π/c) j ,
we have
∇ × ∇ × B = −k 2 B,
with
k² = 4π e² n_s / ( m c² ) .
But we also know from vector calculus that

∇ × ∇ × B = −∇2 B,

so that we have
∇2 B = k 2 B.
For a field which, at the x − y surface of a superconductor, is homogeneous, the solution is

B = e −kz B0 ,

which tells us that the field decays inside the superconductor over a penetration depth
1/k = √( m c² / ( 4π e² n_s ) ) .
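To get a feeling for the length scale involved, the sketch below evaluates the penetration depth for an assumed carrier density n_s (the value 10²⁸ m⁻³ is illustrative, not taken from a specific material). It uses the SI form λ = √( m/(μ₀ n_s e²) ), which is equivalent to the Gaussian expression above; note that conventions differ on whether n_s counts electrons or Cooper pairs, which only shifts λ by a factor √2.

import numpy as np
from scipy.constants import m_e, e, mu_0

n_s = 1e28                                   # assumed carrier density in m^-3 (illustrative)
lam = np.sqrt(m_e / (mu_0 * n_s * e**2))     # penetration depth 1/k in SI form

z = np.linspace(0.0, 5.0 * lam, 6)
B = 1.0 * np.exp(-z / lam)                   # decay of a unit surface field inside the material

print(f"penetration depth = {lam*1e9:.1f} nm")
print(B)

For this density the penetration depth comes out around 50 nm, the right order of magnitude for conventional superconductors.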

11.7 PROBLEMS
1. In this problem we consider the energy difference between the normal and the super-
conducting state.

(a) Show that equation (11.9) can be written in the form


E_BCS = ∫_{−∞}^{∞} N(ε̃) ε̃ ( 1 − ε̃ / √( ε̃² + ∆²(ε̃) ) ) dε̃ − V ∬ N(ε̃) N(ε̃′) ∆(ε̃) ∆(ε̃′) / ( 4 √( ε̃² + ∆²(ε̃) ) √( ε̃′² + ∆²(ε̃′) ) ) dε̃ dε̃′ .

Here, N (²̃) is the density of states.


(b) This expression can be strongly simplified by realising that ∆(ε̃) = 0 for |ε̃| > ħω_D
and ∆(ε̃) = ∆ for |ε̃| < ħω_D. Use this to arrive at

E_BCS = 2 ∫_{−∞}^{0} N(ε̃) ε̃ dε̃ + N_F ∫_{−ħω_D}^{ħω_D} ( |ε̃| − ε̃² / √( ε̃² + ∆² ) ) dε̃ − V ∬_{−ħω_D}^{ħω_D} N(ε̃) N(ε̃′) ∆² / ( 4 √( ε̃² + ∆² ) √( ε̃′² + ∆² ) ) dε̃ dε̃′ .

(c) Simplify this further by using N (²̃) ≈ NF , to arrive at

E_BCS = 2 ∫_{−∞}^{0} N(ε̃) ε̃ dε̃ − N_F ∆² / 2 .

Hint: use the integrals

∫ 1/√(x² + a²) dx = sinh⁻¹(x/a) ;      ∫ x²/√(x² + a²) dx = (x/2)√(x² + a²) − (a²/2) sinh⁻¹(x/a) .
2. In the lectures, we derived the BCS ground state

|Ψ⁰_BCS⟩ = ∏_k ( u_k + v_k a†_k a†_{−k} ) |0⟩ .

We also defined the Bogoliubov-Valatin operators



α_k = u_k a_k − v_k a†_{−k} ,
β_k = u_k a_{−k} + v_k a†_k .
(a) Show that α_k |Ψ⁰_BCS⟩ = 0 for any k. Similarly, show β_k |Ψ⁰_BCS⟩ = 0 for any k.

(b) We also showed that α†_k |Ψ⁰_BCS⟩ is an excited state of the BCS Hamiltonian. Evaluate
that state explicitly and explain what it represents. Explain in particular what
has happened to a Cooper pair by creating this excitation.

(c) Repeat this calculation for β†_k |Ψ⁰_BCS⟩.

3. Hubbard model and superconductivity


The Hubbard model in two dimensions with negative U serves as a simple model for a
superconductor. The Hamiltonian reads
H = −t Σ_{⟨nm⟩,σ} c†_{nσ} c_{mσ} + U Σ_n c†_{n↑} c_{n↑} c†_{n↓} c_{n↓}

with U < 0. The first sum in the Hamiltonian is over nearest neighbour lattice sites
〈n, m〉.

(a) Derive the Hamiltonian in momentum space. Use the notation k ≡ (k, ↑) and −k ≡
(−k, ↓). Decouple the interaction term in a mean field approximation by assuming
that only the expectation values ⟨c†_k c†_{−k}⟩ and ⟨c_{−k} c_k⟩ are nonzero, while other
products of two fermion operators vanish.
We introduce

∆ = −(U/V) Σ_k ⟨c_{−k} c_k⟩ .
Derive the resulting mean-field Hamiltonian HBCS .
(b) Express HBCS in terms of new operators αk , βk which are defined by the Bogoli-
ubov transformation

c k = u k αk + v k β†k ; (11.11)

c −k = −v k∗ αk + u k∗ β†k . (11.12)

(c) Show that, for α and β to be fermion annihilation operators, we must have |u k |2 +
|v k |2 = 1.
(d) Derive the condition on u k and v k such that HBCS becomes diagonal in the oper-
ators αk and βk .
(e) Derive the energy spectrum of the quasi-particles described by the operators αk
and βk and their hermitian conjugates.

4. Quasiparticle excitations in a superconductor


In the lecture, we derived the BCS ground state
|Ψ⁰_BCS⟩ = ∏_k ( u_k + v_k a†_k a†_{−k} ) |0⟩

and showed that the Bogoliubov-Valatin operators

α_k = u_k a_k − v_k a†_{−k} ,
β_k = u_k a_{−k} + v_k a†_k

satisfy α_k |Ψ⁰_BCS⟩ = 0 for any k.

(a) Similarly, show β_k |Ψ⁰_BCS⟩ = 0 for any k.

(b) We also showed that α†_k |Ψ⁰_BCS⟩ is an excited state of the BCS Hamiltonian. Evaluate
that state explicitly and explain what it represents. Explain in particular what
has happened to a Cooper pair by creating this excitation.

(c) Repeat this calculation for β†_k |Ψ⁰_BCS⟩.

5. Anderson’s pseudospin formulation of the BCS theory starts by transforming from the
pair creation operators b†_k = a†_k a†_{−k} to operators defined by 2s_x(k) = b†_k + b_k ; 2s_y(k) =
i(b†_k − b_k) ; 2s_z(k) = 1 − n_k − n_{−k}. Here we use the notation of Desai’s book (and of the
lecture notes) where k denotes (k, ↑) and −k is (−k, ↓). Verify that in the units where
ħ ≡ 1 these operators obey the commutation relations of spins as defined by:

[s x , s y ] = is z ; [s y , s z ] = is x ; [s z , s x ] = is y

or, more concisely,


s × s = i s .

6. Verify that when the transformation of the previous problem is substituted in the BCS
Hamiltonian:
H_BCS = Σ_k ε_k ( a†_k a_k + a†_{−k} a_{−k} ) − (1/2) Σ_{k,l} G_{kl} a†_k a†_{−k} a_{−l} a_l

one finds a result of the form:

H_BCS = − Σ_k H(k) · s(k)

where H is an "effective pseudomagnetic field" given by

H(k) = ( Σ_{k′} G_{kk′} s_x(k′), Σ_{k′} G_{kk′} s_y(k′), 2 ε̃_k ) .

Note that we have put l = (k0 , σ), G depends only on the relative wave vectors: G kl =
G k,k0 . [The energy of this system can now be minimized by arguing in analogy with the
theory of the domain walls in ferromagnets].

12 DENSITY OPERATORS — QUANTUM INFORMATION THEORY

12.1 INTRODUCTION
In this chapter, we extend the formalism of quantum physics to include non-isolated systems.
Such systems can be described by an object called the density operator. Density operators can
be used to capture the influence of the outside world on a particle without keeping track of
the outside world’s degrees of freedom. In the simplest case, the outside world might be just
another particle. We shall see that this influence may lead to quantum states that do not
have a classical analogue – these states are called entangled. Entanglement is used in novel
technologies based on quantum mechanics. The most spectacular realization of this trend is
quantum computing, which we briefly discuss at the end of this chapter.

12.2 THE DENSITY OPERATOR


Up to this point, we have dealt exclusively with quantum systems that are isolated, i.e., the
Hilbert space in which we formulate our quantum system is considered to be all there is –
no interactions with other systems are, or have ever been, present. Obviously this scenario is
hardly ever met in experiments: interactions with the outside world are unavoidable, whether
this is a different system, or degrees of freedom of our experimental apparatus that we would
like to neglect. For example, we often treat the electrons in a solid as our quantum system.
However, the electrons interact with the nuclei, and these form a quantum system by them-
selves, and the two are coupled by the electron-phonon interaction.
As we already know, the proper description of the state of an isolated system is the wave
function. We now ask, What is the most complete description of a non-isolated system?
To begin answering this question, let us consider a system S that is coupled to an envi-
ronment E. The system and environment together form the universe U (see figure). Do not
take these names too literally – they just represent the roles of the different elements at stake
quite well.

[Figure: the system S is embedded in an environment E; together they form the universe U = S + E.]

The universe U is itself an isolated system – hence it is described by a wave function |ψ^U⟩.
We define the orthonormal basis sets |φ^S_j⟩ in S and |χ^E_q⟩ in E. A basis of U is then formed by
combining these two bases:

|φ^S_j⟩ ⊗ |χ^E_q⟩ .    (12.1)
¯

The symbol ⊗ denotes a tensor product. This should not be considered mysterious or difficult,
it simply denotes that we combine the two basis sets into one. In fact, we have done this all
the time: when we write |n, l , m l , m s 〉 to denote a basis state for an electron in the hydrogen
atom, we mean by this the formal expression:

|n〉 ⊗ |l m l 〉 ⊗ |m s 〉 . (12.2)

In fact we could also denote the new basis as


|φ^S_j χ^E_q⟩ .    (12.3)

The reason why we dwell a little on this point is that tensor product notation is regularly used
in this field, and it is worth familiarising yourself with it.
Back to the problem. We seek a means to describe the state of S without including degrees
of freedom of E. What is the ‘best’ description? That is a description which enables us to
calculate the expectation value of any operator acting solely on S. For example, if we consider
S to be the electrons in a solid, we would like to have a description of the electrons enabling
us to calculate the expectation value of any physical quantity of the electrons, such as total
spin, density, energy, . . . . The degrees of freedom of the nuclei should not be included in the
description.
So, let us take an operator Â acting on S. This means that the matrix elements ⟨φ_j| Â |φ_k⟩
fully represent the operator Â. The expectation value of the operator Â is known from the
wave function of the universe:

⟨A⟩ = ⟨ψ^U| Â |ψ^U⟩ .    (12.4)
This formal result however includes the state |ψ^U⟩, which lives partly on E, and we want to
restrict ourselves to S only. To make progress, we expand |ψ^U⟩ in terms of the basis which we
have introduced above:

|ψ^U⟩ = Σ_{jq} C_{jq} |φ^S_j χ^E_q⟩ .    (12.5)

We then obtain for the expectation value:


⟨A⟩ = Σ_{jq} Σ_{kr} C*_{jq} C_{kr} ⟨φ^S_j χ^E_q| Â |φ^S_k χ^E_r⟩ .    (12.6)

Now remember that A only acts on the degrees of freedom of the system S – hence we can
write

⟨A⟩ = Σ_{jq} Σ_{kr} C*_{jq} C_{kr} ⟨φ^S_j| Â |φ^S_k⟩ ⟨χ^E_q|χ^E_r⟩ .    (12.7)

The orthonormality of the basis of E turns the last inner product into δqr , and we have
⟨A⟩ = Σ_{jk} Σ_q C*_{jq} C_{kq} ⟨φ^S_j| Â |φ^S_k⟩ .    (12.8)

Note that we have now arrived at a description involving only basis vectors of S. In order to
cast this result into a more convenient form, we define
√(p_q) |η^S_q⟩ = Σ_j C_{jq} |φ^S_j⟩ .    (12.9)

The real and positive factor p_q is chosen so as to normalize |η^S_q⟩: first the right hand side of
this definition is worked out, and then √(p_q) is the norm of the result.
Now we can write the expectation value in a convenient form:
⟨A⟩ = Σ_q p_q ⟨η^S_q| Â |η^S_q⟩ .    (12.10)

This is our final result. We have succeeded in finding a description of the state of system S in
terms of vectors living only in S. Note that this description has a simple statistical interpretation.
Suppose we are given a set of normalized states |η^S_q⟩, together with a probability p_q for
every such state to occur. After some meditation, it will be clear that the final expression for
the expectation value of A is precisely the one given in (12.10).
You may now think that the result we have obtained can be formulated as a superposition
state:

|φ^S⟩ = Σ_q √(p_q) |η^S_q⟩ .  WRONG!    (12.11)

Note that writing out the expectation value of A for such a state would include matrix elements
⟨η^S_q| Â |η^S_{q′}⟩ for q ≠ q′. Such elements do not occur in the correct expression (12.10)
obtained above.

It turns out that there is no description in terms of a single quantum state defined
on S which gives the correct expectation value of an arbitrary operator  acting on S.
The system S can only be described by a set (also called ensemble) of quantum states
|η^S_q⟩ with classical probabilities p_q.

This is the main result of this section. We can no longer describe the system S by a single
quantum state. Instead, we can describe it as a set of quantum states that occur with a certain
classical probability p q .
To emphasize the difference between a so-called mixed state, consisting of an ensemble
of states, and a pure state characterized by a single wave function, let us look at the simplest
possible nontrivial system, described by a two-dimensional Hilbert space, e.g. a spin-1/2 par-
ticle. Suppose someone, Charlie, gives us an electron but he does not know its spin state. He
does however know that there is no reason for the spin to be preferably up or down, so the
probability to measure spin ‘up’ or ‘down’ is 1/2 for both. In the language used above, the p q
are 1/2 for both q =‘up’ and q =‘down’. Can we describe this situation by a single quantum
state? Well, you might guess that the state is

|ψ⟩ = (1/√2) ( |↑⟩ + |↓⟩ ) ,    (12.12)

but why couldn’t it be


|ψ⟩ = (1/√2) ( |↑⟩ − |↓⟩ ) ?    (12.13)
In fact the state of the system could be anything of the form

|ψ⟩ = (1/√2) ( |↑⟩ + e^{iϕ} |↓⟩ ) ,    (12.14)

for any real ϕ. We see that we cannot assign a unique state to the spins we get from Charlie.
Although we do not know the wave function, we can evaluate the expectation value of the
z-component of the spin: as we find ħ/2 and −ħ/2 with equal probabilities, the expectation
value is 0. More generally, if we have a spin which is in the spin-up state with probability p
and in the down state with probability 1 − p, the expectation value of the z-component of the
spin is ħ(p − 1/2). So expectation values can still be found, although we do not have complete

information about the state of the system. This raises the question whether we may describe
the ensemble of states we get from Charlie by some quantum wave function which would
yield the same expectation values for measurements.
To answer this question we introduce the following states:

|ψ_1⟩ = |↑⟩ ;    (12.15a)
|ψ_2⟩ = |↓⟩ ;    (12.15b)
|ψ_3⟩ = (1/√2) ( |↑⟩ + |↓⟩ ) ;    (12.15c)
|ψ_4⟩ = (1/√2) ( |↑⟩ − |↓⟩ ) ;    (12.15d)
|ψ_5⟩ = (1/√2) ( |↑⟩ + i |↓⟩ ) ;    (12.15e)
|ψ_6⟩ = (1/√2) ( |↑⟩ − i |↓⟩ ) .    (12.15f)

These states are recognized as the spin-up and -down states for the z, x and y directions.
States |ψ_3⟩ to |ψ_6⟩ all give spin-up and -down with probabilities 1/2 along the z-axis.
Let us consider the most general quantum state which gives spin-up and -down along z
with equal probabilities. This is the state ( |↑⟩ + e^{iϕ} |↓⟩ )/√2. We now calculate the probability
of finding, in a measurement, this particle in the state |ψ_3⟩:

| (1/2) ( ⟨↑| + ⟨↓| ) ( |↑⟩ + e^{iϕ} |↓⟩ ) |² = | ( 1 + e^{iϕ} )/2 |² = (1/2) ( 1 + cos ϕ ) .    (12.16)

If we evaluate the probability to find the particle in the state |ψ_3⟩ in the case it was, before
the measurement, in a so-called mixed state which is given with equal probabilities to be |↑⟩
and |↓⟩, we find 1/2, as can easily be verified. Calculating the probabilities for a particle to be
found in the states |ψ_1⟩ to |ψ_6⟩, we find the following results.

State      ( |↑⟩ + e^{iϕ} |↓⟩ )/√2      Equal mixture of |↑⟩ and |↓⟩
|ψ_1⟩      1/2                          1/2
|ψ_2⟩      1/2                          1/2
|ψ_3⟩      (1 + cos ϕ)/2                1/2
|ψ_4⟩      (1 − cos ϕ)/2                1/2
|ψ_5⟩      (1 + sin ϕ)/2                1/2
|ψ_6⟩      (1 − sin ϕ)/2                1/2

We see that there is no ϕ, i.e. no pure state, which leads to the same probabilities for all
possible measurements.
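As a check of the entries in this table, the following sketch computes both columns numerically for an arbitrary phase ϕ (the value 0.7 is just an example):

import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
tests = [up, down,
         (up + down)/np.sqrt(2), (up - down)/np.sqrt(2),
         (up + 1j*down)/np.sqrt(2), (up - 1j*down)/np.sqrt(2)]   # |psi_1> ... |psi_6>

phi  = 0.7                                                       # arbitrary example phase
pure = (up + np.exp(1j*phi)*down)/np.sqrt(2)

for i, t in enumerate(tests, start=1):
    p_pure  = abs(np.vdot(t, pure))**2                           # |<psi_i|psi>|^2
    p_mixed = 0.5*abs(np.vdot(t, up))**2 + 0.5*abs(np.vdot(t, down))**2
    print(f"psi_{i}: pure state {p_pure:.3f}, mixed state {p_mixed:.3f}")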
Let us summarize what we have learned:

A system can be either in a pure or mixed state. In the first case, the state of the
system is completely described by a wave function. In the second case, we are not
sure about the state, but we can ascribe a classical probability for the system to be
in any state within a set. This situation can occur if our system is coupled to another
system (although this does not exclude the possibility of having a pure state in that
case). A mixed state cannot be represented by a single wave function.

Note that the uncertainty about the state of the particle is classical. Charlie can, for ex-
ample, flip a coin and, depending on whether the result is head or tails, send us a spin-up

or -down. We only know that with probability 1/2, the particle that we receive is in quan-
tum state ’up’, and similarly for the quantum state ’down’. This classical uncertainty in the
quantum state of the spin should not be confused with the quantum uncertainty inherent to
measurement outcomes.
We now turn to the general case of a non-isolated system that can be in either one of a
set of normalized, but not necessarily orthogonal, states |ψ_i⟩ (we leave out the superscript
S in the sequel and replace the subscript q by i). The probability for the system to be in the
state |ψ_i⟩ is p_i, with obviously Σ_i p_i = 1. Suppose the expectation value of some operator Â
in state |ψ_i⟩ is given by A_i. Then the expectation value of Â for the system at hand is given by

⟨A⟩ = Σ_i p_i A_i = Σ_i p_i ⟨ψ_i| Â |ψ_i⟩ .    (12.17)

We now introduce the density operator, which is in some sense the ‘optimal’ specification
of the system. The density operator is defined as

ρ̂ = Σ_i p_i |ψ_i⟩⟨ψ_i| .    (12.18)

Suppose the set |φ_n⟩ forms a basis of the Hilbert space of the system under consideration.
Then the expectation value of the operator Â can be rewritten after inserting the identity
operator 1 = Σ_n |φ_n⟩⟨φ_n| as

⟨A⟩ = Σ_i p_i ⟨ψ_i| Â |ψ_i⟩ = Σ_i p_i ⟨ψ_i| Σ_n |φ_n⟩⟨φ_n| Â |ψ_i⟩ = Σ_n ⟨φ_n| [ Σ_i p_i |ψ_i⟩⟨ψ_i| ] Â |φ_n⟩ = Σ_n ⟨φ_n| ρ̂ Â |φ_n⟩ = Tr( ρ̂ Â ) .    (12.19)

Here we have used the trace, Tr , which sums all diagonal terms of an operator. For a general
operator Q̂:
Tr Q̂ = Σ_n ⟨φ_n| Q̂ |φ_n⟩ .    (12.20)

The trace is independent of the basis used — it is invariant under a basis transformation. We
omit the hat from operators unless confusion may arise. Another property of the trace is

Tr( |ψ⟩⟨χ| ) = ⟨χ|ψ⟩ ,    (12.21)

which is easily verified by writing out the trace with respect to a basis |φ_n⟩.
If a system is in a well-defined quantum state |ψ⟩, we say that the system is in a pure state.
In that case the density operator is

ρ = |ψ⟩⟨ψ| .    (12.22)
If the system is not in a pure state, but if only the statistical weights p i of the states ¯ψi are
¯ ®

known, we say that the system is in a mixed state. How can you assess if a system described
by a given density operator is in a pure or mixed state? For a pure state we have ρ² = ρ, which
means that ρ is a projection operator¹:

ρ² = |ψ⟩⟨ψ|ψ⟩⟨ψ| = |ψ⟩⟨ψ| = ρ ,    (12.23)

where we have used the fact that ¯ψ is normalized.


¯ ®

For a mixed state, such as

ρ = α ¯ψ ψ¯ + β ¯φ φ¯ ,
¯ ®­ ¯ ¯ ®­ ¯
(12.24)

where ψ|φ = 0, we have


­ ®

ρ 2 = α2 ¯ψ ψ¯ + β2 ¯φ φ¯ 6= ρ.
¯ ®­ ¯ ¯ ®­ ¯
(12.25)
1 Recall that a projection operator P is an Hermitian operator satisfying P 2 = P .

Although we have considered a particular example here, it holds in general that, for a mixed
state, ρ is not a projection operator.
¯ ® way to see this is to look at the eigenvalues of ρ. For a pure state, ρ = ψ ψ .
¯ ®­ ¯
Another ¯ ¯
Clearly, ψ is an eigenstate of ρ with eigenvalue 1, and all other ¯eigenvalues are 0 (you can
¯
verify this by using that all other eigenstates are orthogonal to ¯ψ ). These values for the
®

eigenvalues are the only ones allowed by a projection operator. As


X
Trρ = p i = 1, (12.26)
i

we have
λi = 1.
X
(12.27)
i

Now let us evaluate


®¯2
φ¯ ρ ¯φ = p i ¯ ψi |φ ¯ ≤ 1,
­ ¯ ¯ ® X ¯­
(12.28)
i

where the fact that ¯ ψi |φ ¯ ≤ 1, combined with i p i = 1 leads to the inequality. The condi-
¯­ ®¯ P

tion i λi = 1 means that either one of the eigenvalues is 1 and the rest are 0, or they are all
P

strictly less than 1. Thus, for an eigenstate φ of the density operator, we have

φ¯ ρ ¯φ = φ|λ|φ = λ < 1.
­ ¯ ¯ ® ­ ®
(12.29)

We see that a density operator has eigenvalues between 0 and 1.


For a pure state we know that ρ̂ 2 = ρ̂. In view of Tr ρ̂ = 1 we therefore have for a pure
state that Tr ρ̂ 2 = 1. If the state is not pure, Tr ρ̂ 2 < 1. The quantity Tr ρ̂ 2 is therefore called the
purity of the density matrix ρ̂.
In summary:

The sum of the eigenvalues of the density operator is 1. The special case where only
one of these eigenvalues is 1 and the rest are 0 corresponds to a pure state.
If there are eigenvalues 0 < λ < 1, then we are dealing with a mixed state.
The purity of a state can be quantified as Tr ρ². This quantity is smaller than or equal to 1.
The value 1 is reached for a pure state.

To summarize, if a system
¯ ® is in a mixed state, it can be characterized by a distribution of
possible wave functions ¯ψi with associated classical probabilities p i . But a more compact
way of representing our knowledge of the system is the density operator, which can be con-
structed when we know the possible states ψi and their probabilities p i [see Eq. (12.18)]. The
density operator can be used to calculate expectation values using the trace, see Eq. (12.19).
Let us consider an example. Take again the scenario where Charlie sends us a spin-up or
-down particle with equal probabilities. For convenience, we denote these two states as |0〉
(spin up) and |1⟩ (spin down). Then the density operator can be evaluated as

ρ = (1/2) |0⟩⟨0| + (1/2) |1⟩⟨1| .    (12.30)
This operator works in a two-dimensional Hilbert space – therefore it can be represented as
a 2 × 2 matrix:

ρ = [[1/2, 0], [0, 1/2]] .    (12.31)

The matrix elements are evaluated as follows. The upper-left element is

⟨0| ρ |0⟩ = (1/2) ⟨0|0⟩⟨0|0⟩ + (1/2) ⟨0|1⟩⟨1|0⟩ = 1/2    (12.32)

as follows from (12.30) and from the orthogonality of the two basis states. The upper-right
element is given by
⟨0| ρ |1⟩ = (1/2) ⟨0|0⟩⟨0|1⟩ + (1/2) ⟨0|1⟩⟨1|1⟩ = 0    (12.33)

as a result of orthogonality. The lower left element ⟨1|ρ|0⟩ and the lower right ⟨1|ρ|1⟩ are
found similarly. Another interesting way to find the density matrix (i.e. the matrix represen-
tation of the density operator) is by directly using the vector representation of the states |0〉
and |1⟩:

ρ = (1/2) (1, 0)ᵀ (1, 0) + (1/2) (0, 1)ᵀ (0, 1) = [[1/2, 0], [0, 1/2]] .    (12.34)
Note the somewhat unusual order in which we encounter column and row vectors: the result
is not a number, but an operator.
Another day, Charlie decides to send us particles which are either "up" or "down" along
the x-axis. As you might remember, the eigenstates are

(1/√2) ( |0⟩ + |1⟩ )    (12.35)

for spin-up (along x) and

(1/√2) ( |0⟩ − |1⟩ )    (12.36)

for spin-down. You recognize these states as the states |ψ_3⟩ and |ψ_4⟩ given above. Now let us
work out the density operator:

ρ = (1/4) (1, 1)ᵀ (1, 1) + (1/4) (1, −1)ᵀ (1, −1) = [[1/2, 0], [0, 1/2]] .    (12.37)

We see that we obtain the same density matrix! Apparently, the particular axis used by Charlie
does not affect what we measure at our end.
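The fact that the two preparation procedures are indistinguishable is easily verified numerically; the sketch below (the helper `proj` is just for this illustration) builds the density matrix for both ensembles and compares them:

import numpy as np

ket0 = np.array([[1.0], [0.0]])            # |0>
ket1 = np.array([[0.0], [1.0]])            # |1>
proj = lambda psi: psi @ psi.conj().T      # |psi><psi|

rho_z = 0.5*proj(ket0) + 0.5*proj(ket1)    # mixture of up/down along z
plus, minus = (ket0 + ket1)/np.sqrt(2), (ket0 - ket1)/np.sqrt(2)
rho_x = 0.5*proj(plus) + 0.5*proj(minus)   # mixture of up/down along x

print(np.allclose(rho_z, rho_x))           # True: both preparations give the same density matrix
print(np.trace(rho_z @ rho_z))             # purity Tr(rho^2) = 1/2 for this mixed state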
Another question we frequently ask ourselves when dealing with quantum systems is:
What is the probability to find the system in a state |φ⟩ in a measurement?
The answer for a system in a pure state |ψ⟩ is:

P_φ = | ⟨φ|ψ⟩ |² .    (12.38)

If the system can be in either one of a set of states |ψ_i⟩ with respective probabilities p_i, the
answer is

P_φ = Σ_i p_i | ⟨φ|ψ_i⟩ |² .    (12.39)

Another way to obtain the expression on the right hand side is by using the density operator:

⟨φ| ρ |φ⟩ = Σ_i p_i | ⟨φ|ψ_i⟩ |² = P_φ .    (12.40)

This equation follows directly from the definition of the density operator.
Important examples of systems in a mixed state are statistical systems connected to a
heat bath. Loosely speaking, the actual state of the system without the bath varies with time,
and we do not know that state when we perform a measurement. We know however from
statistical physics that the probability for the system to be in a state with energy E is given by
the Boltzmann factor exp[−E /(k B T )], so the density operator can be written as
ρ = N Σ_i |ψ_i⟩ e^{−E_i/(k_B T)} ⟨ψ_i|    (12.41)

where the |ψ_i⟩ are eigenstates of the Hamiltonian. The prefactor N is adjusted such that
N Σ_i e^{−E_i/(k_B T)} = 1 in order to guarantee that Tr ρ = 1. The density operator can also be writ-

ten as
ρ = N e −Ĥ /(kB T ) , (12.42)
as can be verified as follows:
e^{−Ĥ/(k_B T)} = Σ_i |ψ_i⟩⟨ψ_i| e^{−Ĥ/(k_B T)} Σ_j |ψ_j⟩⟨ψ_j| = Σ_i |ψ_i⟩ e^{−E_i/(k_B T)} ⟨ψ_i| .    (12.43)

Any expectation value can now in principle be evaluated. For example, consider a spin-
1/2 particle connected to a heat bath of temperature T in a magnetic field B pointing in the
z-direction. The Hamiltonian is given by

H = −γB S z . (12.44)

Then the expectation value of the z-component of the spin can be calculated as

〈S z 〉 = Tr (ρS z ). (12.45)

We can evaluate ρ. Using the notation β = 1/(k B T ) it reads:

ρ = ( 1 / ( e^{βγħB/2} + e^{−βγħB/2} ) ) [[ e^{βγħB/2}, 0 ], [ 0, e^{−βγħB/2} ]] .    (12.46)

Now the expectation value ⟨S_z⟩ can immediately be found, using S_z = ħσ_z/2, where σ_z is the
Pauli matrix:

⟨S_z⟩ = Tr( ρ S_z ) = (ħ/2) tanh( βγħB/2 ) .    (12.47)
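As a quick numerical illustration of (12.42)–(12.47), the sketch below builds the thermal density matrix of the spin and compares Tr(ρ S_z) with the closed-form tanh expression; the numerical values of γ, B and k_B T are arbitrary choices made only for the illustration:

import numpy as np
from scipy.linalg import expm

hbar, gamma, B, kT = 1.0, 1.0, 1.0, 0.5                 # arbitrary illustrative values
Sz = 0.5 * hbar * np.array([[1.0, 0.0], [0.0, -1.0]])
H  = -gamma * B * Sz

rho = expm(-H / kT)
rho /= np.trace(rho)                                    # normalisation, Tr(rho) = 1

print(np.trace(rho @ Sz))                               # <S_z> from the density matrix
print(0.5 * hbar * np.tanh(gamma * hbar * B / (2*kT)))  # closed-form result (12.47)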
Considering systems of non-interacting particles, the density operator can be used to de-
rive the average occupation of energy levels, leading to the well-known Fermi-Dirac distribu-
tion for fermions, and the Bose-Einstein distribution for bosons. This derivation is however
beyond the scope of this course — it is treated in your statistical mechanics course.
To conclude this section, we return to the systems S, E and U (U is S and E taken together).
Recalling the form of the ensemble states
√(p_q) |η^S_q⟩ = Σ_j C_{jq} |φ^S_j⟩ ,    (12.48)

we can formulate the density matrix on S as


ρ^S = Σ_q p_q |η^S_q⟩⟨η^S_q| = Σ_q Σ_{jk} C_{jq} C*_{kq} |φ^S_j⟩⟨φ^S_k| .    (12.49)

Now that we have some experience with traces, it is interesting to note that the density
matrix of S can be obtained by taking the partial trace of the density matrix of U over the
degrees of freedom of E. By this we mean the following. Remember that the |χ^E_q⟩ form a basis of E. The
density matrix of U is given by

ρ^U = |ψ^U⟩⟨ψ^U| .    (12.50)
Taking the partial trace of this over E is defined as
Tr_E( ρ^U ) = Σ_q ⟨χ^E_q| ρ^U |χ^E_q⟩ .    (12.51)

Note that the resulting object is an operator acting in S. Also note the similarity with the ex-
pression for the full trace: the difference is that we restrict ourselves to the basis states of
E.

Writing ρ^U in terms of |ψ^U⟩ and expanding the latter as

|ψ^U⟩ = Σ_{jq} C_{jq} |φ^S_j χ^E_q⟩ ,    (12.52)

and using the orthonormality of the basis on E, we obtain


Tr_E( ρ^U ) = Σ_q Σ_{jk} C_{jq} C*_{kq} |φ^S_j⟩⟨φ^S_k| .    (12.53)

We see that the resulting expression is precisely the density matrix of S:

ρ^S = Tr_E( ρ^U ) .    (12.54)

In the next section we come back to the use of partial traces.

Summary
We have seen that systems coupled to an environment cannot be described by a
single quantum state from the Hilbert space of that system. Instead, the system is
described by the density operator or density matrix ρ̂. The density matrix has the
form

ρ̂ = Σ_j p_j |ψ_j⟩⟨ψ_j| .

The numbers p_j are the probabilities for finding the system in the state |ψ_j⟩. They
therefore add up to 1. If one p_j = 1 and the remaining ones are 0, the state is called
pure. A measure for the purity of a state is Tr ρ̂².
Having ρ̂, we can calculate statistical properties of measurements. In particular, for
a general Hermitian operator Â, the expectation value for the measurement of that
operator in a system described by a density matrix ρ̂ is given by

⟨Â⟩ = Tr( ρ̂ Â ) .

Furthermore, the probability to find the system in a particular state |ψ⟩ is given by

P_ψ = ⟨ψ| ρ̂ |ψ⟩ .

A density matrix satisfies several mathematical requirements:


• It is Hermitian: ρ̂ = ρ̂ † .
• It is positive definite, i.e. ⟨ψ| ρ̂ |ψ⟩ ≥ 0 for all |ψ⟩. This is equivalent to saying
that all its eigenvalues are non-negative.


• The trace is unity: Tr ρ = 1.
For two coupled systems, one of the systems can be described by a reduced density
matrix:
ρ̂ A = Tr B ρ̂
where ρ̂ is the density matrix of the combined system A+B.

12.3 ENTANGLEMENT
Entanglement is a phenomenon which can occur when two or more quantum systems are
coupled. We have seen in the previous section that coupling between a system S and its
environment E may lead to a mixed state of S which is impossible to characterize by a single
wave function. If the influence of the environment precludes the possibility of describing S
by a pure state, we say that S and E are entangled:

Consider a system consisting of two or more subsystems. If the state of the system is
such that the subsystems cannot be described by a pure state, then the subsystems
are called entangled.

We shall focus on the simplest nontrivial system exhibiting entanglement: two particles,
A and B, whose degrees of freedom span a two-dimensional Hilbert space (as usual, you may
think of two spin-1/2 particles). The basis states for each particle are denoted |0〉 and |1〉.
Therefore, the possible states of the two-particle system are linear combinations of the states

|00〉 , |01〉 , |10〉 , and |11〉 . (12.55)


(the first number denotes the state of particle A and the second one that of particle B). We use
these states (in this order) as a basis of the four-dimensional Hilbert space, that is, we may
identify

|00⟩ ⇔ (1, 0, 0, 0)ᵀ    (12.56)
and so on.
Suppose the combined system is in the state
|ψ⟩ = (1/2) ( |00⟩ + |01⟩ + |10⟩ + |11⟩ ) .    (12.57)
In vector notation, this state is represented as:

ψ = (1/2) (1, 1, 1, 1)ᵀ .    (12.58)

Note that this state is normalized.


We can find the density matrix of particle A by tracing out the degrees of freedom of parti-
cle B. This is the procedure that we followed at the very end of Section 12.2, and we can copy
the result of that section simply by using C j q = 1/2 for our example ( j , q both assume values
0 or 1). The result is then
ρ^S = (1/2) ( |0⟩⟨0| + |0⟩⟨1| + |1⟩⟨0| + |1⟩⟨1| ) .    (12.59)
This can also be written as
ρ^S = |ψ_3⟩⟨ψ_3| ,    (12.60)

with

|ψ_3⟩ = (1/√2) ( |0⟩ + |1⟩ ) .    (12.61)
We see that ρ S is the density operator of a pure state – hence the two particles are not entan-
gled.
Now we want to study the same problem from a different viewpoint. Suppose we perform
measurements of the first spin only. More specifically, we measure the probabilities for a
system to be in the states

|ψ_1⟩ = |0⟩ ,  |ψ_2⟩ = |1⟩ ,  |ψ_3⟩ = (1/√2) ( |0⟩ + |1⟩ ) ,  or  |ψ_4⟩ = (1/√2) ( |0⟩ − |1⟩ ) .    (12.62)
The resulting probabilities are (check this!):

P 1 = P 2 = 1/2; (12.63a)
P 3 = 1; P 4 = 0. (12.63b)

These are precisely the same results as those for a single particle in the state |ψ_3⟩ = ( |0⟩ + |1⟩ )/√2,
that is, if we want to predict measurements on the first particle, we can forget about
the second particle. The reason for this is that we can write the state (12.57) as

(1/2) ( |0⟩_A + |1⟩_A ) ⊗ ( |0⟩_B + |1⟩_B ) .    (12.64)
The fact that (12.57) can be written as a (tensor) product of pure states of the two subsystems
A and B is responsible for the fact that the second particle does not ‘interfere’ with the
first one.
We see that entanglement can be defined in three ways, which can be shown to be com-
pletely equivalent:

• Two systems A and B are entangled when the density matrix of one of them describes a
mixed state. This is usually not the easiest definition to be used in problems!

• Two systems A and B are entangled when it is impossible to write the state as a single
product of a state on A and a state on B. This definition usually is the easiest to check
whether a state is entangled or not.

• Two systems A and B are entangled when the outcome of some measurement on one
of them influences the probability of measurements on the other (see below).

In order to use the first definition, the use of partial traces is necessary. It is convenient
to learn how a partial trace works out on a matrix representation of the density matrix. Let us
first work out the density matrix of the combined system AB in the previous example:
   
ρ^{AB} = |ψ⟩⟨ψ| = (1/4) (1, 1, 1, 1)ᵀ (1, 1, 1, 1) = (1/4) [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]] .    (12.65)

Taking the trace over the states of B corresponds to taking four traces over the submatrices
formed by the elements in which the system A has the same values. Some inspection leads to
the submatrices indicated in the following equation:
 
Tr_B [[a, b, p, q], [c, d, r, s], [α, β, η, ζ], [γ, δ, χ, ξ]] = [[a + d, p + s], [α + δ, η + ξ]] .    (12.66)

We see that we should view the full, 4×4 density matrix as consisting of four 2×2 submatrices.
Of each of these submatrices, we take the trace and put this as a number in the resulting 2 × 2
matrix.

In the case where we want to take the partial trace over A rather than over B, we proceed
as follows:

Tr_A [[a, b, p, q], [c, d, r, s], [α, β, η, ζ], [γ, δ, χ, ξ]] = [[a + η, b + ζ], [c + χ, d + ξ]] .    (12.67)
We see that we divide the large matrix again into four submatrices, but these submatrices
have their elements two rows and/or columns apart rather than one, like in the previous case.
Whether we take the trace over A or over B, the ‘reduced’ density matrix reads
ρ_A = ρ_B = (1/2) [[1, 1], [1, 1]] .    (12.68)

This matrix has two eigenvectors:


(1/√2) (1, 1)ᵀ   and   (1/√2) (1, −1)ᵀ ,    (12.69)

with eigenvalues 1 and 0, respectively. This implies that the so-called reduced density matrix
that is obtained after taking a partial trace over the full density matrix, is that of a pure state
(see the previous section). Therefore, we conclude, again, that the state of the combined
system AB is not an entangled state.
We now consider another example, defined by the state

|ψ_E⟩ = (1/√2) ( |00⟩ + |11⟩ ) .    (12.70)
In vector form, this state reads

(1/√2) (1, 0, 0, 1)ᵀ .    (12.71)
The density operator is (in matrix form):
   
ρ = (1/2) (1, 0, 0, 1)ᵀ (1, 0, 0, 1) = (1/2) [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]] .    (12.72)

Taking the trace over either A or B gives the same result:


ρ_A = ρ_B = (1/2) [[1, 0], [0, 1]] .    (12.73)

We immediately see two eigenvalues, 1/2 and 1/2, and we conclude that the reduced density
matrix is not that of a pure state, so the state of AB we started from was entangled (see the
previous section).
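The partial-trace recipe of (12.66)–(12.67) is easily implemented numerically. The sketch below (the function names partial_trace and purity are ours, not from any library) applies it to the product state (12.57) and to the entangled state (12.70) and compares the purity of the reduced density matrix:

import numpy as np

def partial_trace(rho, keep):
    r = rho.reshape(2, 2, 2, 2)                          # indices (j_A, j_B, k_A, k_B)
    return np.einsum('jiki->jk', r) if keep == 'A' else np.einsum('ijik->jk', r)

purity = lambda rho: np.trace(rho @ rho).real

product = 0.5 * np.array([1.0, 1.0, 1.0, 1.0])           # state (12.57)
bell    = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)    # state (12.70)

for psi in (product, bell):
    rho_AB = np.outer(psi, psi.conj())
    rho_A  = partial_trace(rho_AB, keep='A')
    print(purity(rho_A))        # 1.0 for the product state, 0.5 for the entangled state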
Now we analyze the entanglement of the state according to the other criteria discussed
above. First, it is easy to see that there is no way to write the state as product state of two
states of system A and B respectively. This is as usual the easiest way to determine whether
the state is entangled or not (in this case, entangled).
Now consider the effect of measuring an aspect of one subsystem. If we measure for
the first spin the value 0, then a measurement of the second spin will also give 0. The same
holds for measuring 1 for both. We see that the measurement of one particle, influences the
measurement results of the other – hence the state is entangled.
Let us now describe this more formally. We perform measurements on particle A and on
B, checking whether these particles are found in state 1 or 0. For our entangled state (12.70)
we find

P 00 = P 11 = 1/2, (12.74a)
P 10 = P 01 = 0, (12.74b)

where P 01 is the probability to find particle A in state 0 and particle B in state 1 etcetera. We
see that in terms of classical probabilities, the system A is strongly correlated with system B.
It turns out that this correlation remains complete even when the measurement is performed
with respect to another basis (see exercises):

Entanglement gives rise to correlation of probabilities, and this correlation cannot


be lifted by a basis transformation.

An interesting question is whether a non-entangled system may become entangled in the


course of time. We therefore take a system which is in a non-entangled state — it might for ex-
ample be in the state (12.57). We assume that the system evolves according to a Hamiltonian
H which, in the basis |00〉, |01〉, |10〉, |11〉, has the following form:
 
[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, −1]] ,    (12.75)

The time evolution operator is given by T = exp(−i t Ĥ/ħ) — at t = πħ/2 it has the form

[[−i, 0, 0, 0], [0, −i, 0, 0], [0, 0, −i, 0], [0, 0, 0, i]] ,    (12.76)

so that we find

|ψ(t = πħ/2)⟩ = −(i/2) ( |00⟩ + |01⟩ + |10⟩ − |11⟩ ) ,    (12.77)
which is an entangled state (you will find no way to write it as a tensor product of two pure
states of A and B). Thus we see that when a system starts in a non-entangled state, it might
evolve into an entangled state in the course of time.
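This evolution can also be followed numerically: the sketch below propagates the state (12.57) with the Hamiltonian (12.75) (in units where its entries are dimensionless, as in the text) and tracks the purity of the reduced density matrix of qubit A, which drops from 1 (product state) to 1/2 (maximally entangled) at t = πħ/2.

import numpy as np
from scipy.linalg import expm

hbar = 1.0
H    = np.diag([1.0, 1.0, 1.0, -1.0])                    # Hamiltonian (12.75)
psi0 = 0.5 * np.array([1.0, 1.0, 1.0, 1.0])              # non-entangled state (12.57)

def purity_A(psi):
    r = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)
    rho_A = np.einsum('jiki->jk', r)                     # trace out qubit B
    return np.trace(rho_A @ rho_A).real

for t in (0.0, np.pi*hbar/4, np.pi*hbar/2):
    psi_t = expm(-1j * H * t / hbar) @ psi0
    print(t, purity_A(psi_t))    # purity drops from 1 to 1/2 as entanglement builds up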
We have seen that non-entangled states lead to a reduced density matrix describing a
pure state. The lowest value for the purity Tr ρ̂ 2 of the reduced density matrix is obtained
when the density matrix has two eigenvalues 1/2:
ρ̂_A = [[1/2, 0], [0, 1/2]] .

The two-qubit state leading to this density matrix is therefore called a ‘most entangled’ state.
For two qubits, the space of most entangled states is spanned by the so-called Bell states:

|ψ_1⟩ = (1/√2) ( |00⟩ + |11⟩ ) ;
|ψ_2⟩ = (1/√2) ( |00⟩ − |11⟩ ) ;
|ψ_3⟩ = (1/√2) ( |01⟩ + |10⟩ ) ;
|ψ_4⟩ = (1/√2) ( |01⟩ − |10⟩ ) .

These states are generally referred to as Bell states.

12.4 THE EPR PARADOX AND BELL'S THEOREM


In 1935, Einstein, Podolsky and Rosen (EPR) published a thought experiment, which demon-
strated that quantum mechanics is not compatible with some obvious ideas which we tacitly
apply when describing phenomena. In particular the notions of a reality existing indepen-
dently of experimental measurements and of locality cannot both be reconciled with quan-
tum mechanics. Locality is used here to denote the idea that events cannot have an effect at a
distance before information has travelled from that event to another place where its effect is
noticed. Together, the notions of reality and locality are commonly denoted as ‘local realism’.
From the failure of quantum mechanics to comply with local realism, EPR concluded that
quantum mechanics is not a complete theory.

The EPR paradox is quite simple to explain. At some point in space, a stationary parti-
cle with spin 0 decays into two spin-1/2 particles which fly off in opposite directions (mo-
mentum conservation forces the directions to be opposite). During the decay
process, angular momentum is conserved which implies that the two particles must have
opposite spin: when one particle is found to have spin ‘up’ along some measuring axis, the
other particle must have spin ‘down’ along the same axis. Obviously, we are dealing with an
entangled state.
Suppose Alice and Bob both receive an outcoming particle from the same decay event.
Alice measures the spin of the particle along the z direction, and Bob does the same with his
particle. Superficially, we can say that they would both have the same probability to find ei-
ther +ħ/2 or −ħ/2. However, if quantum mechanics is correct, these measurements should
be strongly correlated: if Alice has measured spin up, then Bob’s particle must have spin down
along the z-axis, so the measurement results are fully correlated. According to the ‘orthodox’,
or ‘Copenhagen’ interpretation of quantum mechanics, if Alice is the first one to measure
the spin, the particular value measured by her is decided at the very moment of that mea-
surement. But this means that at the same moment the spin state of Bob’s particle is
determined. But Bob could be lightyears away from Alice, and perform his measurement
immediately after her. According to the orthodox interpretation, his measurement would be
influenced by Alice’s. But this was inconceivable to Einstein, who maintained that the in-
formation about Alice’s measurement could not reach Bob’s particle instantaneously, as
the speed of light is a limiting factor for communication. In Einstein’s view, the outcome of
the measurements of the particles is determined at the moment when they leave the source,
and he believed that a more complete theory could be found which would unveil the ‘hidden
variables’ which determine the outcomes of Alice and Bob’s measurements when the parti-
cles left the source. These hidden variables would then represent some “reality” which exists
irrespectively of the measurement.
The EPR puzzle remained unsettled for a long time, until, in 1965, John Bell formulated
a theorem which would allow one to distinguish between Einstein’s scenario and the orthodox
quantum mechanical interpretation. We shall now derive Bell’s theorem. Suppose we count
in an audience the numbers of people having certain properties, such as ‘red hair’ or ‘wearing
yellow socks’, ‘taller than 1.70 m’. We take three such properties, called A, B and C . If we select
one person from the audience, he or she will either comply with each of these properties or not.
We denote this by a person being ‘in the state’ A + , B − ,C + for example. The number of people
in the state A + , B − ,C + is denoted N (A + , B − ,C + ). We now write

N (A + , B − ) = N (A + , B − ,C + ) + N (A + , B − ,C − ) (12.78)

which is a rather obvious relation. We use similar relations in order to rewrite this as

N(A+, B−) = N(A+, C−) − N(A+, B+, C−) + N(B−, C+) − N(A−, B−, C+) ≤ N(A+, C−) + N(B−, C+).    (12.79)
This is Bell’s inequality, which can also be formulated in terms of probabilities [P (A + , B − )
instead of N (A + , B − ) etcetera]. We have used everyday-life examples in order to emphasise
that there is nothing mysterious, let alone quantum mechanical, about Bell’s inequality. But
let us now turn to quantum mechanics, and spin determination in particular.
FIGURE 12.1: The measuring axes a, b and c for a spin.

Consider the three axes a, b and c shown in the figure. A+ is now identified with a spin-
up measurement along a etcetera. We can now evaluate P(A+, C−). Measuring A+ happens
with probability 1/2, but after this measurement, the particle is in the spin-up state along the
a-axis. If the spin is then measured along the c direction, we have a probability sin²(π/8) to
find C− (see problem 16 of the exercises). The combined probability P(A+, C−) is therefore
(1/2) sin²(π/8). Similarly, P(B−, C+) is also equal to (1/2) sin²(π/8), and P(A+, B−) is 1/4. Inserting

these numbers into Bell’s inequality gives:

1/4 ≤ sin²(π/8) = (1/2) (1 − (1/2)√2),    (12.80)

which is obviously wrong. Therefore, we see that quantum mechanics does not obey Bell’s
inequality.
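The violation is easy to reproduce numerically. The sketch below (Python/numpy; the axis angles 0, π/4 and π/2 for a, c and b are an assumption read off Fig. 12.1, chosen so that the probabilities quoted above come out) computes the three joint probabilities from sequential spin-1/2 projective measurements on an unpolarised spin:

    import numpy as np

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)

    def proj(theta, sign):
        # Projector on spin 'sign' (+1/-1) along an axis at angle theta in the x-z plane.
        return (np.eye(2) + sign * (np.cos(theta) * sz + np.sin(theta) * sx)) / 2

    def p_joint(theta1, s1, theta2, s2):
        # P(first result s1 along theta1, then result s2 along theta2), unpolarised initial spin.
        rho = np.eye(2) / 2
        P1, P2 = proj(theta1, s1), proj(theta2, s2)
        return np.trace(P2 @ P1 @ rho @ P1 @ P2).real

    P_ab = p_joint(0.0, +1, np.pi/2, -1)        # P(A+, B-) = 1/4
    P_ac = p_joint(0.0, +1, np.pi/4, -1)        # P(A+, C-) = (1/2) sin^2(pi/8)
    P_bc = p_joint(np.pi/2, -1, np.pi/4, +1)    # P(B-, C+) = (1/2) sin^2(pi/8)
    print(P_ab, P_ac + P_bc)                    # 0.25 versus about 0.146: the inequality fails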
Now what does this have to do with the EPR paradox? Well, first of all, the EPR paradox
allows us to measure the spin in two different directions at virtually the same moment. But,
more importantly, if the particles would leave the origin with predefined probabilities, Bell’s
inequality would unambiguously hold. The only way to violate Bell’s inequality is by accept-
ing that Alice’s measurement reduces the entangled wave function of the two-particle system,
which is also noticed by Bob instantaneously. So, there is some ‘action at a distance’, in con-
trast to what we usually have in physics, where every action is mediated by particles such as
photons, mesons, . . . .
In 1982, Aspect, Dalibard and Roger performed experiments with photons emerging from
decaying atoms in order to check whether Bell’s inequality holds or not. Since then, several
other groups have redone this experiment, sometimes with different setups. The conclu-
sion is now generally accepted that Bell’s inequality does not hold for quantum mechanical
probabilities. The implications of this conclusion for our view of Nature are enormous: some-
how actions can be performed without intermediary particles, so that the speed of light is
not a limiting factor for this kind of communication. ‘Communication’ is however a danger-
ous term to use in this context, as it suggests that information can be transmitted instanta-
neously. However, the ‘information’ which is transmitted from Alice to Bob or vice versa is
purely probabilistic, since neither Bob nor Alice can predict the outcome of their measurements. So
far, no schemes have been invented or realised which would allow us to send over a Mozart
symphony at speeds faster than the speed of light.

12.5 NO CLONING THEOREM
In recent years, much interest has arisen in quantum information processing. In this field,
people try to exploit quantum mechanics in order to process information in a way com-
pletely different from classical methods. We have already encountered one example of these
attempts: quantum cryptography, where a random encryption key can be shared between
Bob and Alice without Eve being capable of eavesdropping. Another very important appli-
cation, which unfortunately is still far from a realisation, is the quantum computer. When I
speak of a quantum computer, you should not forget that I mean a machine which exists only
on paper, not in reality. A quantum computer is a quantum machine in which qubits evolve
in time. A qubit is a quantum system with a 2-dimensional Hilbert space. It can always be
denoted

|φ⟩ = a |0⟩ + b |1⟩,    (12.81)

where a and b are complex constants satisfying |a|² + |b|² = 1. The states |0⟩ and |1⟩ form a basis
in the Hilbert space. A quantum computer manipulates several qubits in parallel. A system
consisting of n qubits has a 2^n-dimensional Hilbert space. A quantum computation consists
of a preparation of the qubits in some well-defined state, followed by an autonomous evolu-
tion of the qubit system, and concluded by reading out the state of the qubits. As the system
is autonomous, it is described by a (Hermitian) Hamiltonian. The time-evolution operator
U = exp(−i t H/ħ) is then a unitary operator, so the quantum computation between ini-
tialisation and reading out the results can be described in terms of a sequence of unitary
transformations applied to the system. In this section we shall derive a general theorem for
such an evolution, the no-cloning theorem:

An unknown quantum state cannot be cloned.

By cloning we mean that we can copy the state of some quantum system into some other
system without losing the state of our original system.
Before proceeding with the proof of this theorem, let us assume that cloning would be
possible. In that case, communication at speeds faster than light would in principle be pos-
sible. To see this, imagine Alice and Bob have a pair of entangled qubits. Alice performs a
measurement on her qubit either along the axis defined by |0⟩ or along the axis defined by |0⟩ + |1⟩. After this, Bob makes many clones
of his qubit. As Bob has many clones, he can find out which measurement Alice performed
without ambiguity (how?). So the no-cloning theorem is essential in making communication
at speeds faster than the speed of light impossible.
The proof of the no-cloning theorem for qubit systems proceeds as follows. Cloning for a
qubit pair means that we have a unitary evolution U with the following effect on a qubit pair:

U |α0〉 = |αα〉 . (12.82)


The evolution U should work for any state |α⟩, and therefore it cannot depend on |α⟩. Hence, for
some other state |β⟩ we must have

U |β0⟩ = |ββ⟩.    (12.83)

Now let us operate with U on the state |γ0⟩ with |γ⟩ = (|α⟩ + |β⟩)/√2:

U |γ0⟩ = (|αα⟩ + |ββ⟩)/√2 ≠ |γγ⟩,    (12.84)

which completes the proof.

12.6 DENSE CODING

In this section, I describe a way of sending over more information than one bit per transmitted qubit. This sounds com-
pletely impossible, but, again, quantum mechanics is in principle able to realise the impossi-
ble. It is however difficult to implement, as it is based on Bob and Alice having an entangled
pair of qubits, in the state
|00〉 + |11〉 . (12.85)
From now on, we shall adopt the convention in this field to omit normalisation factors in
front of the wave functions. We can imagine this state to be realised by having an entan-
gled pair generator midway between Alice and Bob, sending entangled particles in opposite
directions as in the EPR setup.
Note that the following qubit operations are all unitary:

I |φ⟩ = |φ⟩,    (12.86a)
X |0⟩ = |1⟩,    (12.86b)
X |1⟩ = |0⟩,    (12.86c)
Z |0⟩ = |0⟩,    (12.86d)
Z |1⟩ = − |1⟩,    (12.86e)
Y |0⟩ = |1⟩,    (12.86f)
Y |1⟩ = − |0⟩,  with  Y = X Z.    (12.86g)

The operator I is the identity; X is called the NOT operator. We assume that Alice has a de-
vice with which she can perform any of the four transformations (I , X , Y , Z ) on her member
(i.e. the first) of the entangled qubit pair. The resulting perpendicular states for these four
transformations are:

I (|00〉 + |11〉) = (|00〉 + |11〉) (12.87a)


X (|00〉 + |11〉) = (|10〉 + |01〉) (12.87b)
Y (|00〉 + |11〉) = (|10〉 − |01〉) (12.87c)
Z (|00〉 + |11〉) = (|00〉 − |11〉) (12.87d)

Alice does not perform any measurement — she performs one of these four transformations
and then she sends her qubit to Bob. Bob then measures in which of the four possible states
the entangled pair is; in other words, he now knows which transformation Alice applied. This
information is ‘worth’ two bits, but Alice had to send only one qubit to Bob!
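That Bob can indeed tell the four cases apart follows from the fact that the four states in (12.87) are mutually orthogonal, which the following small numpy check confirms (normalisation is omitted, as agreed above):

    import numpy as np

    I2 = np.eye(2)
    X = np.array([[0, 1], [1, 0]])
    Z = np.diag([1, -1])
    Y = X @ Z                                      # the convention Y = XZ of Eq. (12.86g)

    bell = np.array([1, 0, 0, 1], dtype=complex)               # |00> + |11>, unnormalised
    states = [np.kron(op, I2) @ bell for op in (I2, X, Y, Z)]  # Alice acts on the first qubit

    overlaps = np.array([[np.vdot(a, b) for b in states] for a in states])
    print(np.round(np.abs(overlaps), 3))           # 2 on the diagonal (norm^2), 0 elsewhere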

12.7 QUANTUM COMPUTING AND SHOR'S FACTORISATION ALGORITHM
A quantum computer is a device containing one or more sets of qubits (called registers),
which can be initialised without ambiguity, and which can evolve in a controlled way un-
der the influence of unitary transformations and which can be measured after completion of
this evolution.
The most general single-qubit transformation is a four-parameter family. For more than
one qubit, it can be shown that every nontrivial unitary transformation can be generated by
a single-qubit transformation of the form

U(θ, φ) = (       cos(θ/2)            −i e^{−iφ} sin(θ/2) )
          ( −i e^{iφ} sin(θ/2)             cos(θ/2)       )    (12.88)

and one additional unitary transformation involving more than a single qubit, the so-called 2-qubit
XOR. This transformation acts on a qubit pair and has the following effect:
XOR (|00⟩) = |00⟩    (12.89a)
XOR (|01⟩) = |01⟩    (12.89b)
XOR (|10⟩) = |11⟩    (12.89c)
XOR (|11⟩) = |10⟩    (12.89d)

We see that the first qubit is left unchanged and the second one is the eXclusive OR of the two
input bits. Unitary transformations are realised by hardware elements called gates.
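In matrix form (using the basis |00⟩, |01⟩, |10⟩, |11⟩ for the two-qubit gate) both gates can be written down and checked for unitarity directly; the short numpy sketch below does just that, with arbitrarily chosen angles θ and φ:

    import numpy as np

    def U1(theta, phi):
        # The single-qubit rotation of Eq. (12.88).
        return np.array([[np.cos(theta/2),                    -1j*np.exp(-1j*phi)*np.sin(theta/2)],
                         [-1j*np.exp(1j*phi)*np.sin(theta/2),  np.cos(theta/2)]])

    # The 2-qubit XOR (CNOT) of Eq. (12.89) in the basis |00>, |01>, |10>, |11>.
    XOR = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=complex)

    U = U1(0.7, 1.3)
    print(np.allclose(U.conj().T @ U, np.eye(2)),
          np.allclose(XOR.conj().T @ XOR, np.eye(4)))   # both True: the gates are unitary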
Several proposals for building quantum computers exist. In the ion trap, an array of ions
which can be in either the ground state (|0⟩) or the excited state (|1⟩) is controlled by laser
pulses. Coupling of neighbouring ions in order to realise an XOR gate is achieved through a
controlled momentum transfer to displacement excitations (phonons) of the chain.
Here in Delft, activities focus on arrays of Josephson junctions. Josephson junctions are
very thin layers of ordinary conductors separating two superconductors. Current can flow
through these junctions in either the clockwise or anti-clockwise direction (interpreted as
0 and 1 respectively). Other initiatives include NMR devices and optical cavities. With this
technique it has become possible recently to factorise the number 15. Realisation of a work-
ing quantum computer will take at least a few decades — if it will come at all.
A major problem in realising a working quantum computer is to ensure a unitary evolu-
tion. In practice, the system will always be coupled to the outside world. Quantum comput-
ing hinges upon the possibility to have controlled, coherent superpositions. Coherent super-
positions are linear combinations of quantum states that together form another, pure state. As we have seen
in the previous section, coupling to the environment may lead to entanglement which would
cause the quantum computer to be described by a density operator rather than by a pure
state. In particular, any phase relation between constitutive parts of a phase-coherent super-
position is destroyed by coupling to the environment. We shall now treat this phenomenon
in more detail.
Consider a qubit which interacts with its environment. We denote the state of the envi-
ronment by the ket |m〉. The interaction is described by the following prescription:

|0⟩ |m⟩ → |0⟩ |m_0⟩;    (12.90a)
|1⟩ |m⟩ → |1⟩ |m_1⟩.    (12.90b)

In this interaction, the qubit itself does not change — if this would be the case, our computer
would be useless to start with.
Suppose we start with a state
|0〉 + e i φ |1〉 (12.91)
which is coupled to the environment. This coupling will induce the transition
³ ´
|0〉 + e i φ |1〉 |m〉 → |0〉 |m 0 〉 + e i φ |1〉 |m 1 〉 . (12.92)

Suppose this qubit is then fed into a so-called Hadamard gate, which has the effect

1
H |0〉 = p (|0〉 + |1〉) ; (12.93a)
2
1
H |1〉 = p (|0〉 − |1〉) . (12.93b)
2

Then the outcome is


e^{iφ/2} [ |0⟩ (e^{−iφ/2} |m_0⟩ + e^{iφ/2} |m_1⟩) + |1⟩ (e^{−iφ/2} |m_0⟩ − e^{iφ/2} |m_1⟩) ].    (12.94)

If we suppose that ⟨m_0|m_1⟩ is real, we find for the probabilities to measure the qubit in the
state |0⟩ or |1⟩ (after normalisation):

P_0 = (1 + ⟨m_0|m_1⟩ cos φ)/2,    (12.95a)
P_1 = (1 − ⟨m_0|m_1⟩ cos φ)/2.    (12.95b)
If there is no coupling, m 0 = m 1 = m, and we recognise the phase relation between the two
states in the probabilities. On the other hand, if 〈m 0 |m 1 〉 = 0, then we find for both probabil-
ities 1/2, and the phase relation has disappeared completely.
It is interesting to construct a density operator for the qubit in the final state (12.94). Con-
sider a qubit
α |0⟩ + β |1⟩    (12.96)

which has interacted with its environment, so that we have the combined state

α |0〉 |m 0 〉 + β |1〉 |m 1 〉 . (12.97)

We can arrive at a density operator for the qubit only by performing the trace over the m-
system only. Using (12.21) we find

ρ_qubit = (        |α|²           αβ* ⟨m_1|m_0⟩ )
          ( α*β ⟨m_0|m_1⟩             |β|²      ).    (12.98)

The eigenvalues of this matrix are

λ = 1/2 ± (1/2) √[ (|α|² − |β|²)² + 4 |α|² |β|² |⟨m_0|m_1⟩|² ]    (12.99)
and these lie between 0 and 1, where the value 1 is reached only for 〈m 0 |m 1 〉 = 1. The terms
coherence/decoherence derive from the name coherence which is often used for the matrix
element 〈m 0 |m 1 〉.
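As a numerical illustration (a minimal sketch with the arbitrary choice α = β = 1/√2 and a real overlap), the eigenvalues of (12.98) interpolate between a pure and a maximally mixed qubit as the coherence ⟨m_0|m_1⟩ is reduced:

    import numpy as np

    alpha = beta = 1/np.sqrt(2)                 # illustrative choice
    for overlap in (1.0, 0.5, 0.0):             # the coherence <m0|m1>, taken real
        rho = np.array([[abs(alpha)**2,               alpha*np.conj(beta)*overlap],
                        [np.conj(alpha)*beta*overlap, abs(beta)**2]])
        print(overlap, np.round(np.linalg.eigvalsh(rho), 3))
    # overlap 1.0 -> eigenvalues 0 and 1 (pure state)
    # overlap 0.0 -> eigenvalues 1/2 and 1/2 (coherence completely lost)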
Now let us return to the very process of quantum computing itself. The most impressive
algorithm, which was developed in 1994 by Peter Shor, is that of factorising large integers, an
important problem in the field of encryption and code-breaking. We shall not describe this
algorithm in detail, but present a brief sketch of an important sub-step, finding the period of
an integer function f . It is assumed here that all unitary transformations used can be realised
with a limited number of gates.
The algorithm works with two registers, both containing n qubits. These registers are
described by a 2^n-dimensional Hilbert space. As basis states we use the bit-sequences of the
integers between 0 and 2^n − 1. The basis state corresponding to such an integer x is denoted
|x〉n . Now we perform the Hademard gate (12.93) to all bits of the state |0〉n . This yields
H |0⟩_n ≡ |w⟩_n = 2^{−n/2} Σ_{x=0}^{2^n−1} |x⟩_n.    (12.100)

It is possible (but we shall not describe the method here) to construct, for any function f
which maps the set of numbers 0 to 2^n − 1 onto itself, a unitary transformation U_f which has
the effect

U_f |x⟩_n |0⟩_n = |x⟩_n |f(x)⟩_n    (12.101)
using a limited number of gates.
Now we are ready for the big trick in quantum computing. If we let U f act on the state
|w〉n then we obtain
U_f |w⟩_n |0⟩_n = 2^{−n/2} Σ_{x=0}^{2^n−1} |x⟩_n |f(x)⟩_n.    (12.102)

We see that the new state contains f(x) for all possible values of x. In other words, applying
the gates U_f to our state |w⟩_n |0⟩_n, we have evaluated the function f for 2^n different argu-
ments. This feature is called quantum parallelism and it is this feature which is responsible
for the (theoretical) performance of quantum computing.
Of course, if we were to read out the results of the computation for each x-value, we would
have not gained much, as this would take 2n operations. In general, however, the final result
that we are after consists of only few data, so a useful problem does not consist of simply
calculating f for all of its possible arguments. As an example we consider the problem of
finding the periodicity of the function f , which is an important step in Shor’s algorithm. This
is done by reading out only one particular value of the result in the second register, f (x) = u,
say. The first register is then the sum of all x-states for which it holds that f (x) = u. If f has
a period r , we will find that these x-values lie a distance r apart from each other. Now we
act with a (unitary) Fourier transform operator on this register, and the result will be a linear
combination of the registers corresponding to the period(s) of the function f . If there is only
one period, we can read this out straightforwardly.
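The period-finding step can be mimicked on a classical computer for small registers. The sketch below (Python/numpy, with an artificial periodic function f(x) = x mod 4 on n = 5 qubits — a toy choice, not Shor's modular exponentiation) carries out exactly the sequence described above: uniform superposition, application of U_f, measurement of the second register, and a Fourier transform of the first register:

    import numpy as np

    n, r = 5, 4                      # register size and the (hidden) period of the toy function
    N = 2**n
    f = np.arange(N) % r             # toy periodic function

    # Uniform superposition in register 1, then U_f: amplitudes labelled by the pair (x, f(x)).
    amps = np.zeros((N, r), dtype=complex)
    amps[np.arange(N), f] = 1/np.sqrt(N)

    # Measure register 2; suppose the outcome is u = 1. Register 1 collapses onto x with f(x) = u.
    u = 1
    reg1 = amps[:, u]
    reg1 /= np.linalg.norm(reg1)

    # Fourier transform register 1; the weight concentrates on multiples of N/r, revealing r.
    spectrum = np.abs(np.fft.fft(reg1))**2
    print(np.nonzero(spectrum > 1e-9)[0])   # -> [0 8 16 24], spacing N/r = 8, so the period is r = 4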
It has been said already that finding the period of some function is an important step
in the factorising algorithm. Shor’s algorithm is able to factorise an n-bit integer in about
300n³ steps. A very rough estimate of the size of the number to be factorised at which a quantum
computer starts outperforming a classical machine is about 10^130.

12.8 PROBLEMS
1. For the operators given below, determine whether they are admissible as density matri-
ces and, if yes, whether they describe a pure or a mixed state. Also, provide the density
operators in matrix form (i.e., the density matrices). The two states |ψ_1⟩ and |ψ_2⟩ are
linearly independent and normalised.

(a) ρ = 12 ¯ψ1 ψ1 ¯ + 12 ¯ψ2 ψ2 ¯ ,


¯ ®­ ¯ ¯ ®­ ¯

(b) ρ = 12 ¯ψ1 ψ1 ¯ + ¯ψ1 ψ2 ¯ + ¯ψ2 ψ1 ¯ + ¯ψ2 ψ2 ¯ ,


£¯ ® ­ ¯ ¯ ® ­ ¯ ¯ ® ­ ¯ ¯ ® ­ ¯¤

(c) ρ = 12 ¯ψ1 ψ1 ¯ + i ¯ψ1 ψ2 ¯ − i ¯ψ2 ψ1 ¯ + ¯ψ2 ψ2 ¯ ,


£¯ ® ­ ¯ ¯ ® ­ ¯ ¯ ® ­ ¯ ¯ ® ­ ¯¤

(d) ρ = 1 ¯ψ1 ψ1 ¯ + ¯ψ1 ψ2 ¯ − ¯ψ2 ψ1 ¯ + ¯ψ2 ψ2 ¯ .


£¯ ® ­ ¯ ¯ ® ­ ¯ ¯ ® ­ ¯ ¯ ® ­ ¯¤
2

2. Determine which of the following states are entangled:


|00〉+|01〉
(a) p ,
2
−|01〉+|11〉
(b) p ,
2
|00〉+|11〉
(c) p ,
2
i |01〉+|00〉
(d) p ,
2
|00〉−i|01〉+|10〉−i|11〉
(e) 2 ,
|00〉−|01〉+i |10〉−i|11〉
(f) 2 ,
|000〉−|100〉+|001〉−|101〉
(g) 2 ,
|010〉+|110〉+|011〉−|111〉
(h) 2 .

3. We consider a two-qubit system prepared in the quantum state

|ψ(t = 0)⟩ = (1/2) (|00⟩ + |01⟩ + |10⟩ + |11⟩).

We assume that this system evolves with Hamiltonian Ĥ, which in the basis

{|00⟩, |01⟩, |10⟩, |11⟩}

takes the following form:

    ( 1   0   0   0 )
    ( 0  −1   0   0 )
    ( 0   0  −1   0 ).
    ( 0   0   0   1 )
(a) Is the state |ψ(t = 0)⟩ entangled?
(b) Show that at t = πħ/2, the wave function |ψ(t)⟩ is not entangled.
(c) Does this hold for every time t ? If not, can you invent a (non-trivial) Hamiltonian
for which this would hold (change a few signs in the Hamiltonian).
(d) Show that if the Hamiltonian can be written as the sum of a Hamiltonian acting
only on the first particle, and one only acting on the second particle, the particles
will never become entangled.
4. Alice and Bob have two qubits, which together are in the quantum state

(1/√2) (α |00⟩ + β |01⟩ + α |11⟩ − β |10⟩),

where |α|2 + |β|2 = 1. The first qubit in the ket-vector is always that of Alice.

(a) Give the density matrix for this state in the form of a 4 × 4 matrix. Clearly indicate
which basis you choose.
(b) What is the probability for Bob to find the value 0 when he measures his qubit?
(c) Calculate the reduced density matrix for Bob in the form of a 2×2 matrix, by taking
the trace over Alice’s Hilbert space. Answer part (b) again based on the result.
(d) Charlie prepares this state many times for Alice and Bob. Each time, they mea-
sure their qubits in the measurement basis |0〉 , |1〉. Finally they get together and
compare their results. Can they find out the state of the system by combining
their measurements? Can they characterise the state after their measurement by
a density matrix? Give that density matrix.

5. Alice possesses a qubit in the quantum state

|ψ⟩ = a |0⟩ + b |1⟩,

with |a|² + |b|² = 1. Bob and Charlie know this state too – that is, they know a and b.

(a) Bob measures this qubit. What are the probabilities for finding |0⟩ and |1⟩ respec-
tively?
(b) Following his measurement, Bob sends the qubit to Charlie, without communi-
cating the measurement result. How would Charlie characterize the state of the
qubit (i.e. is it a mixed or a pure state)? Give this state.
(c) Now both Alice and Bob have a qubit. That of Alice is in the state |ψ⟩. Bob’s qubit

is in a different state given as

|φ⟩ = c |0⟩ + d |1⟩.

These qubits both traverse a controlled-NOT gate. This unitary operation has the
following effect:

00 → 00
01 → 01
10 → 11
11 → 10
After this operation, Alice’s qubit is measured with respect to the basis |0〉, |1〉.
Give the possible two-qubit states after this measurement.
(d) Now the qubits of Bob and of Alice are in the state

(1/√2) (|00⟩ + |11⟩),

where the first qubit is that of Alice, and the second that of Bob.
What are now the possible states after the controlled-NOT operation, followed by
a measurement of Alice’s qubit? Charlie does not know the result of the measure-
ment. Give the state for Charlie.
(e) Take the trace over Alice’s Hilbert space of the density operator found in (d) to
obtain the reduced density operator for Bob’s qubit. Is Bob’s qubit in a pure or
a mixed state? Same questions for Alice’s qubit after tracing over Bob’s Hilbert
space.

6. In this problem we analyse Bell’s inequality in a different form as presented in the lec-
ture. Alice and Bob receive particles which can each be described by a two-dimensional
Hilbert space with basis |0〉 and |1〉. Alice and Bob perform measurements on these
particles. Alice can perform measurements for her particle corresponding to the Pauli
matrices σz , σx etc. or linear combinations of these. Bob can do the same. If Alice finds
in a measurement of σz the value 1 (corresponding to the state |0〉), then Bob finds in a
similar measurement for his particle a value −1 (corresponding to |1〉) and vice-versa.
Both possibilities are equally probable.

(a) Give the most general wave function |ψ⟩ with which you can describe the state
of the two particles. Also give the state when the wave function is antisymmetric
under particle exchange (fermion behaviour). We assume that this antisymmetry
requirement is satisfied.

Now Alice and Bob perform measurements for physical quantities corresponding to the
operators σz , σθ = cos θσz + sin θσx and σ−θ = cos θσz − sin θσx . Alice measures either
σz or σθ . Bob measures either σz or σ−θ . This choice is made at random with equal
probabilities. In all cases they find either a value 1 or −1. We denote the measurement
results by S_z^A, S_z^B, S_θ^A and S_−θ^B. The upper index A or B indicates whether we are dealing
with Alice or Bob, and the lower index indicates whether it is σz which is measured or
σ±θ .
In this problem, we consider the operator

ĝ = σ_z^A σ_z^B + σ_θ^A σ_z^B + σ_z^A σ_−θ^B − σ_θ^A σ_−θ^B.

(b) Show that the expectation values ⟨σ_z^A⟩, ⟨σ_z^B⟩, ⟨σ_θ^A⟩ and ⟨σ_−θ^B⟩ all give the value 0.
(c) Give the matrices of the operators σ_z^A σ_z^B, σ_z^A σ_x^B, σ_x^A σ_z^B and σ_x^A σ_x^B with respect to
the basis
{|00〉 , |01〉 , |10〉 , |11〉} .

(d) Calculate ⟨ĝ⟩ as a function of θ.
(e) Sketch ⟨ĝ⟩ as a function of θ and show that it is larger than 2 (in absolute value)
for 0 < θ < π/2.
(f) Suggest a procedure for measuring ⟨ĝ⟩ in an experiment where both particles are
only measured once.

Now we try to ‘invent’ a classical probabilistic process, which generates possible mea-
surement outcomes S_z^A, S_−θ^B, etcetera, and which would reproduce the values found in
the experiment.
We therefore consider the number

g = S_z^A S_z^B + S_θ^A S_z^B + S_z^A S_−θ^B − S_θ^A S_−θ^B.

(g) Argue why the average of the numbers S_z^A etcetera must be 0.
(h) Show that for all possibilities for the four values S_z^A, S_−θ^B, etcetera, g = ±2. Hint:
show that the last term in the expression for g is equal to the product of the first
three terms.
From the fact that g = ±2 for each pair measured by Alice and Bob it follows that:

|ḡ| ≤ 2,

where ḡ is the average of g taken over many measurements. Verify this. Note
that this is always satisfied for any probabilities for the combinations of the four
numbers S_z^A, S_z^B, S_θ^A, S_−θ^B.
(i) How does the result for ⟨ĝ⟩ change if the phase relation between the two terms in
the wavefunction changes? (Only the last term in the expression for ⟨ĝ⟩ changes).

7. Consider a density matrix ρ S of a system with a two-dimensional Hilbert space.

(a) Show that 1/2 ≤ Tr(ρ_S²) ≤ 1.
What is Tr(ρ_S²) for a pure state?

Now consider a universe U=S+E in which the system S and environment E are each a
2-dimensional Hilbert space (i.e., they are both qubits). The state of the universe can
be written as the wave function

|ψ_U⟩ = α_00 |00⟩ + α_01 |01⟩ + α_10 |10⟩ + α_11 |11⟩.

We would like to quantify the degree of entanglement between S and E using the con-
currence, C :
C(|ψ_U⟩) = √[ 2 (1 − Tr(ρ_S²)) ].

(b) Show that C is given in terms of the coefficients α_ij by

C = 2 |α_00 α_11 − α_10 α_01|.

(c) Show that 0 ≤ C ≤ 1. What can you say about |ψ_U⟩ when C = 0?

(d) Find the concurrence for the following wave functions:


• (1/2) (|00⟩ + |01⟩ − |10⟩ − |11⟩),
• (1/2) (|00⟩ − |01⟩ + |10⟩ + |11⟩),
• (1/√3) (|00⟩ + |01⟩ + |11⟩),
• (1/√2) (|01⟩ + |10⟩).

8. The fidelity F of a state (pure or mixed, characterized by density operator ρ) to a target


state |ψ_target⟩ is defined as

F ≡ ⟨ψ_target| ρ |ψ_target⟩.

(a) Show that, for the case of a pure state characterized by the wave function |ψ⟩,

F = |⟨ψ|ψ_target⟩|².

(b) Consider a system consisting of two qubits. Show that any separable (un-entangled,
product) state of the two qubits cannot have a fidelity greater than 50% to the Bell
state |ψ_target⟩ = (1/√2) (|01⟩ + |10⟩).
Hint: use the Bloch vector representation of the two spins, with polar angles θ, φ
and θ′, φ′ respectively.
(c) Show that the converse is not true. That is, F ≤ 0.5 to the Bell state does not
guarantee that the state |ψ⟩ is un-entangled. Hint: the easiest way to show this is
to give a counterexample.

9. The reduced density operator for one qubit (regardless of the size of the environment)
can be expanded in the basis of Pauli operators:
ρ_S = (v_1/2) 1 + (v_x/2) σ_x + (v_y/2) σ_y + (v_z/2) σ_z.

(a) Why must the coefficients v_1, v_x, v_y, and v_z be real valued?
(b) Show v_1 = 1.
(c) Express the purity Tr(ρ_S²) in terms of v_x, v_y and v_z.
(d) Show that v² = v_x² + v_y² + v_z² ≤ 1, with equality only for pure states.
(e) Calculate ⟨σ_x⟩, ⟨σ_y⟩, ⟨σ_z⟩ in terms of the numbers v_x, v_y and v_z.

10. Consider the time-dependent state of problem 4. This described a state of a two-spin
system as a function of time.
Evaluate the trace over the Hilbert space of the second spin to obtain the density matrix
of the first spin. Then evaluate the expectation value of the z-component of the first
spin as a function of time.

13 OPEN QUANTUM SYSTEMS
In this chapter we shall discuss the time evolution of open quantum systems. In the previ-
ous chapter, we have encountered open quantum systems – in particular, we have seen that
these systems are described by a density matrix rather than by a single wave function. In this
chapter we focus on the time evolution of the density matrix.
So far, the time evolution that we have mostly dealt with is that of closed systems, which
are fully described by a wave function |ψ⟩ evolving deterministically according to the time-
dependent Schrödinger equation:

|ψ̇⟩ = (1/iħ) H |ψ⟩.

Solving this differential equation lets us relate the state at time t to that at time 0 by

|ψ(t)⟩ = U(t) |ψ(0)⟩,

where U(t) = e^{−iHt/ħ} is the time-evolution operator – see chapter 2.


We have also seen that such a closed system can be equivalently described by a density
operator, ρ = |ψ⟩⟨ψ|. Using the time evolution of the bra- and the ket wave functions occur-
ring in this density operator, we find that it evolves according to

ρ(t ) = U (t )ρ(0)U † (t ).

Taking the time derivative, we obtain the following important differential equation:

iħ ρ̇ = [H, ρ].

It is interesting to note that this differential equation differs from the usual time evolution
for a quantum mechanical operator as it occurs in the Heisenberg picture (see section 2.2):

iħ (∂/∂t) Ô(t) = iħ (∂/∂t) [U†(t) Ô U(t)] = [Ô(t), H].
We now turn to the evolution of an open quantum system. Using the notation of sec-
tion 12.2, we label the system with S and the environment with E. The environment may
consist of another quantum system (often much larger) interacting with our system. It can
also represent a measurement apparatus that may be either under our control or controlled
by another party that does not communicate with us. The central question that concerns us
is: how does the system’s reduced density operator ρ S evolve in these different scenarios? Is
there an analog differential equation, or evolution operator for ρ S ? Apart from giving insight
into what would really happen to the quantum systems in our labs, these questions touch
upon the measurement process in quantum mechanics.


FIGURE 13.1: A quantum system and an environment, initially in a product state |ψ_U(0)⟩ = |ψ_S(0)⟩ ⊗ |ψ_E(0)⟩,
evolve under a globally unitary operation U. We are not interested in the state of the environment E. How is the
reduced density matrix of the system, initially ρ_S(0) = |ψ_S(0)⟩⟨ψ_S(0)|, transformed by U?

13.1 COUPLING TO AN ENVIRONMENT
Analogously to the approach of section 12.2, let us first consider a closed universe consisting
of our system and an environment evolving in a unitary fashion (see Fig. 13.1). At any time,
the state of the universe is described by a state |ψU (t )〉 and, equivalently, by a pure density
matrix
ρ U (t ) = |ψU (t )〉 〈ψU (t )| .
Clearly, ρ U (t ) = U (t )ρ U (0)U † (t ), and the reduced density matrix for the system at any time is
given by
ρ S (t ) = TrE [ρ U (t )].
The answer to how the reduced density matrix of the system transforms under the unitary
evolution is simple to write down formally:
h i
ρ S (0) = TrE ρ U (0) → ρ S (t ) = TrE U (t )ρ U (0)U † (t ) .
£ ¤

But what is the form of the transformation E that evolves the density matrix in time:

ρ S (t ) = E (ρ S (0))?

To begin answering this question, let us look at a simple example. In it, and throughout this
chapter, we assume that the system and environment are initially in an unentangled, or prod-
uct state |ψU (0)〉 = |ψS (0)〉 ⊗ |ψE (0)〉.

13.1.1 EXAMPLE: THE DAMPING CHANNEL
Imagine that we have a qubit (our system) initially in a pure state |ψS (0)〉 = α |0〉+β |1〉 and an
environment initially in |ψE (0)〉:

|ψ_U(0)⟩ = (α |0⟩ + β |1⟩) |ψ_E(0)⟩.
The state |0⟩ denotes the ground state and |1⟩ the excited state. Imagine that for the system
in the state |0⟩, the environment evolves over a certain amount of time from the state |ψ_E⟩
to the (normalized) state |ψ′_E⟩, which is not necessarily orthogonal to |ψ_E⟩; for short time
evolutions, these states will be almost the same:

|0⟩ |ψ_E⟩ → |0⟩ |ψ′_E⟩.

This equation expresses the fact that when the system is in its ground state, the environment
will not drive it out of that state.
We allow however for the excited state |1〉 to decay to the ground state during the evolu-
tion. We assume that if there is no interaction, the system stays in |1〉 and the environment
evolves in the same way as for the system in the ground state, that is, it ends up in |ψ′_E⟩.
However, the excited state partly evolves into the ground state as a result of interactions with
the environment, and in that case, the environment ends up in a state |ψ″_E⟩ which we take
perpendicular to |ψ′_E⟩:

|1⟩ ⊗ |ψ_E⟩ → √(1 − p) |1⟩ ⊗ |ψ′_E⟩ + √p |0⟩ ⊗ |ψ″_E⟩.

All environment states are normalized. The two states after the evolution are however not
necessarily orthogonal to the initial environment state.
All in all, this leads to the time evolution of the state |ψU (0)〉:
|ψ_U(0)⟩ = (α |0⟩ + β |1⟩) |ψ_E⟩  →  |ψ_U(t)⟩ = α |0⟩ |ψ′_E⟩ + √(1 − p) β |1⟩ |ψ′_E⟩ + √p β |0⟩ |ψ″_E⟩.

You may check that with this choice of prefactors, the evolution preserves the norm of the
state. Under this evolution, there is a probability amplitude √p that an excited qubit will lose
its excitation to the environment. The final state of the universe can be re-expressed as
a linear combination of two states with perpendicular environmental components:
|ψ_U(t)⟩ = (α |0⟩ + √(1 − p) β |1⟩) ⊗ |ψ′_E⟩ + √p β |0⟩ ⊗ |ψ″_E⟩
         = M_0 |ψ_S(0)⟩ |ψ′_E⟩ + M_1 |ψ_S(0)⟩ |ψ″_E⟩,
where, in the |0〉, |1〉 basis, the operators M 0 and M 1 are the matrices
M_0 = ( 1       0      )          ( 0   √p )
      ( 0    √(1−p)  )   and  M_1 = ( 0    0  ).    (13.1)

These matrices tell us what has happened to the system state if we find the environment in
either the state |ψ′_E⟩ or |ψ″_E⟩.
The final reduced density matrix for the system is found by tracing the density matrix of
the universe over the environment:

ρ_S(t) = Tr_E (|ψ_U(t)⟩⟨ψ_U(t)|)
       = M_0 |ψ_S(0)⟩⟨ψ_S(0)| M_0† + M_1 |ψ_S(0)⟩⟨ψ_S(0)| M_1†,

where we have made use of the orthogonality between |ψ′_E⟩ and |ψ″_E⟩. Substituting

ρ_S(0) = |ψ_S(0)⟩⟨ψ_S(0)|

above, we find

ρ_S(t) = Σ_i M_i ρ_S(0) M_i†.

So, once we have found the matrices M i , this expression gives our sought-after connection
between the initial and final reduced density matrices for the system, expressed as a trans-
formation only in the Hilbert space of S. This specific form is known as the operator-sum
representation of the transformation E . The M i are called operation elements. Note that they
act solely on the qubit Hilbert space, and that they are not unitary. You can easily check this
for M_0 and M_1 above. We have found the following important result:

The state of a system that is initially in a pure state, and coupled to an environ-
ment, will generally evolve into a mixed state. The evolution of the density matrix
is encoded in the so-called operator elements M i which, for each final state i of the
environment, map the initial state of the system to a final state.
Viewing the M i as matrices, their dimension is that of the Hilbert space of the sys-
tem, but we have a number of them equal to the number of possible final environ-
ment states.
But how general is the operator-sum representation? Can it describe the evolution of
the reduced density matrix for any system in interaction with an environment in a closed
universe? We shall address this general question in the following section.
Before concluding this example, we note that, for a single qubit, E is easy to visualize on
the Bloch sphere. Using
ρ_S = (1/2) ( 1 + ⟨Z⟩      ⟨X⟩ − i⟨Y⟩ )
            ( ⟨X⟩ + i⟨Y⟩    1 − ⟨Z⟩   ),    (13.2)

we can relate the final and initial Bloch vectors:

( ⟨X⟩(t) )   ( √(1−p)     0       0   ) ( ⟨X⟩(0) )   ( 0 )
( ⟨Y⟩(t) ) = (   0      √(1−p)    0   ) ( ⟨Y⟩(0) ) + ( 0 ).    (13.3)
( ⟨Z⟩(t) )   (   0        0      1−p  ) ( ⟨Z⟩(0) )   ( p )

This represents a motion of the point towards Z = 1 and a compression within the X Y plane.
Anticipating that p would grow in time, the state will therefore end up in its ground state
Z = 1.
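This Bloch-sphere picture is easily confirmed numerically. The following sketch (Python/numpy, with an arbitrary test state and the illustrative choice p = 0.3) applies the operation elements (13.1) to a pure state and compares the resulting Bloch vector with the map (13.3):

    import numpy as np

    p = 0.3
    M0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)
    M1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)

    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]])
    Z = np.diag([1.0, -1.0]).astype(complex)

    def bloch(rho):
        return np.real([np.trace(rho @ P) for P in (X, Y, Z)])

    psi = np.array([np.cos(0.4), np.exp(0.7j)*np.sin(0.4)])     # arbitrary pure test state
    rho0 = np.outer(psi, psi.conj())
    rho1 = M0 @ rho0 @ M0.conj().T + M1 @ rho0 @ M1.conj().T    # operator-sum evolution

    x, y, z = bloch(rho0)
    print(np.round(bloch(rho1), 4))
    print(np.round([np.sqrt(1-p)*x, np.sqrt(1-p)*y, (1-p)*z + p], 4))   # identical, cf. (13.3)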

13.1.2 THE OPERATOR-SUM REPRESENTATION
In this section we consider the general case of a unitary system-environment interaction,
starting from an unentangled state

|ψU (0)〉 = |ψS (0)〉 ⊗ |ψE (0)〉 .

Following the interaction, the universe state becomes


|ψ_U(t)⟩ = U(t) (|ψ_S(0)⟩ ⊗ |ψ_E(0)⟩).

We can expand the unitary operator U(t) in terms of a basis of the universe. We use orthonor-
mal basis vectors |j⟩ for the system S, and |i⟩ for the environment E, so |j, i⟩ forms an (or-
thonormal) basis of the universe, and this is the one we shall use:

U(t) = Σ_{j,j′}^{N_S} Σ_{i,i′}^{N_E} U_{ji,j′i′}(t) |j, i⟩⟨j′, i′|.

Thus,
|ψ_U(t)⟩ = Σ_{j,j′}^{N_S} Σ_{i,i′}^{N_E} U_{ji,j′i′}(t) |j, i⟩⟨j′, i′| (|ψ_S(0)⟩ ⊗ |ψ_E(0)⟩)
         = Σ_{j,j′}^{N_S} Σ_{i,i′}^{N_E} U_{ji,j′i′}(t) ⟨i′|ψ_E(0)⟩ |j⟩⟨j′|ψ_S(0)⟩ ⊗ |i⟩
         = Σ_{i}^{N_E} ( Σ_{j,j′}^{N_S} Σ_{i′}^{N_E} c_{i′} U_{ji,j′i′}(t) |j⟩⟨j′| ) |ψ_S(0)⟩ ⊗ |i⟩
         = Σ_{i}^{N_E} M_i |ψ_S(0)⟩ ⊗ |i⟩,    (13.4)

where c_{i′} = ⟨i′|ψ_E(0)⟩ and M_i ≡ Σ_{j,j′}^{N_S} Σ_{i′}^{N_E} c_{i′} U_{ji,j′i′}(t) |j⟩⟨j′|. We can write M_i succinctly
as

M_i = ⟨i| U(t) |ψ_E(0)⟩.
M i = 〈i |U (t ) |ψE (0)〉 .
At this point it is very important to realize that the M i ’s are operators acting in the system
Hilbert space – this may not be obvious from the notation. We can write the reduced density
operator for the system following the interaction:

ρ_S(t) = Tr_E (|ψ_U(t)⟩⟨ψ_U(t)|)
       = Tr_E [ ( Σ_{i}^{N_E} M_i |ψ_S(0)⟩ ⊗ |i⟩ ) ( Σ_{i′}^{N_E} ⟨ψ_S(0)| M_{i′}† ⊗ ⟨i′| ) ]
       = Σ_i M_i |ψ_S(0)⟩⟨ψ_S(0)| M_i†
       = Σ_i M_i ρ_S(0) M_i†.    (13.5)

We have arrived, without loss of generality, at an operator-sum representation for the trans-
formation ρ S (0) → ρ S (t ) that involves "sandwiching" the initial system reduced density op-
erator by operation elements M i . The detailed form of these operation elements depends on
the initial state of the environment, the global evolution operator U , and our choice of or-
thonormal basis for the environment. Crucially, however, they are independent of the initial
system state |ψS (0)〉. It is important to highlight that the number of nonzero operation ele-
ments can equal the dimensionality N_E of the environment. Thus, their number can greatly
exceed the dimensionality NS of the system Hilbert space, because most often the environ-
ment is much larger than the system. Our second example will illustrate this point. Before
moving to it, we note a property of the operation elements M i arising from the unitarity of
the global evolution operator.

1 = ⟨ψ_U(t)|ψ_U(t)⟩
  = ( Σ_i ⟨ψ_S(0)| M_i† ⊗ ⟨i| ) ( Σ_{i′} M_{i′} |ψ_S(0)⟩ ⊗ |i′⟩ )
  = Σ_i ⟨ψ_S(0)| M_i† M_i |ψ_S(0)⟩
  = ⟨ψ_S(0)| ( Σ_i M_i† M_i ) |ψ_S(0)⟩.

As this must hold for any |ψ_S(0)⟩, it follows that

Σ_i M_i† M_i = I_S,
where I S is the identity operator in the system Hilbert space. We say that the operation ele-
ments form a decomposition of unity.
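The whole construction can be checked numerically for a small 'universe'. The sketch below (Python/numpy; the system is a qubit, the environment is taken 3-dimensional, and the global unitary and the states are random or arbitrary — all purely illustrative choices) builds the operation elements M_i = ⟨i|U|ψ_E(0)⟩ and verifies both the decomposition of unity and that the operator-sum form reproduces the partial trace:

    import numpy as np

    rng = np.random.default_rng(1)
    NS, NE = 2, 3

    A = rng.normal(size=(NS*NE, NS*NE)) + 1j*rng.normal(size=(NS*NE, NS*NE))
    U, _ = np.linalg.qr(A)                       # a random global unitary

    psiS = np.array([0.6, 0.8], dtype=complex)   # initial system state
    psiE = np.eye(NE, dtype=complex)[0]          # initial environment state |0>_E

    # Operation elements M_i = <i|U|psi_E(0)>; with basis ordering |j,i> = |j>_S (x) |i>_E,
    # the composite index is j*NE + i, so U reshapes into indices (j, i, j', i').
    Ut = U.reshape(NS, NE, NS, NE)
    Ms = [Ut[:, i, :, :] @ psiE for i in range(NE)]

    print(np.allclose(sum(M.conj().T @ M for M in Ms), np.eye(NS)))    # decomposition of unity

    rhoS0 = np.outer(psiS, psiS.conj())
    psiU0 = np.kron(psiS, psiE)
    rhoU = U @ np.outer(psiU0, psiU0.conj()) @ U.conj().T
    rhoS_exact = rhoU.reshape(NS, NE, NS, NE).trace(axis1=1, axis2=3)  # Tr_E of the evolved state
    rhoS_kraus = sum(M @ rhoS0 @ M.conj().T for M in Ms)
    print(np.allclose(rhoS_exact, rhoS_kraus))                         # operator sum = partial trace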

13.1.3 EXAMPLE: QUBIT DEPOLARIZATION
We consider a qubit-environment interaction that transforms the initially unentangled state
|ψU (0)〉 = |ψS (0)〉 |ψE (0)〉 to

|ψ_U(t)⟩ = √(1 − 3p/4) |ψ_S(0)⟩ |ψ′_E⟩ + √(p/4) X |ψ_S(0)⟩ |ψ″_E⟩
         + √(p/4) Y |ψ_S(0)⟩ |ψ‴_E⟩ + √(p/4) Z |ψ_S(0)⟩ |ψ⁗_E⟩,

where the four environment states on the right-hand side are orthonormal. Under this inter-
action, the qubit undergoes a bit flip (X operation), phase flip (Z operation) and bit-phase
flip (Y operation), each with a probability amplitude √(p/4). The final reduced density matrix
for the system is given by

ρ_S(t) = Σ_{i=1}^{4} M_i ρ_S(0) M_i†,

with operation elements

M_1 = √(1 − 3p/4) I,  M_2 = √(p/4) X,  M_3 = √(p/4) Y,  and  M_4 = √(p/4) Z.    (13.6)
The dimension 2 of the system’s Hilbert space is reflected in the fact that the matrices M i are
2 × 2. We have four independent 2 × 2 matrices in this example. By noting that the square of
every Pauli operator is the identity, it is easy to show that these operation elements indeed
form a decomposition of unity.
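For instance, a quick numerical check (with an arbitrary value of p) confirms this:

    import numpy as np

    p = 0.37                                    # arbitrary illustrative value
    I2 = np.eye(2, dtype=complex)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]])
    Z = np.diag([1.0, -1.0]).astype(complex)

    Ms = [np.sqrt(1 - 3*p/4)*I2, np.sqrt(p/4)*X, np.sqrt(p/4)*Y, np.sqrt(p/4)*Z]
    print(np.round(sum(M.conj().T @ M for M in Ms), 10))   # -> the 2x2 identity matrix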
The effect of this process is easy to visualize in the Bloch sphere:

( ⟨X⟩(t) )           ( ⟨X⟩(0) )
( ⟨Y⟩(t) ) = (1 − p) ( ⟨Y⟩(0) ).
( ⟨Z⟩(t) )           ( ⟨Z⟩(0) )

The interaction with the environment reduces the magnitude of the Bloch vector without
changing its direction. For p = 1, every Bloch vector collapses to the origin. Quite generally,
a process under which the purity of the reduced density matrix decreases is said to cause
decoherence of the system.

13.2 DIRECT QUANTUM MEASUREMENTS
We now turn our attention to the evolution of a system that undergoes measurement. We first
focus on direct quantum measurements (Fig. 13.2). Despite their perhaps unfamiliar name,
these measurements are exactly the type considered in standard introductory courses. Such
measurements follow Born’s rule: With every such measurement is associated an operator Ô
in the system Hilbert space. The act of measurement collapses the system state |ψ_S⟩ to one of the
eigenstates |i⟩ of Ô, and the measurement result, which we denote by m, is the corresponding
eigenvalue λ_i. The probability of measuring λ_i and collapsing onto eigenstate |i⟩ is given by
the square of the overlap between |ψ_S⟩ and |i⟩:

p_i = |⟨i|ψ_S⟩|².

13.2.1 SYSTEM EVOLUTION CONDITIONED ON THE RESULT OF DIRECT MEASUREMENT

Suppose we prepare the system in a pure state |ψ_S⟩. According to the Born rule, the post-
measurement state of the system is simply the eigenstate |i⟩ of Ô corresponding to the mea-
surement result λ_i. Note that Ô can be written in terms of its eigenvalues and eigenvectors
as Ô = Σ_i λ_i |i⟩⟨i|. The evolution under a direct quantum measurement with result m = λ_i is

ρ_S = |ψ_S⟩⟨ψ_S|  →  ρ_{S|m=λ_i} = |i⟩⟨i|.

Another, admittedly more complicated, way to write the post-measurement density matrix
conditioned on m = λi is
ρ_{S|m=λ_i} = (1/p_i) Π_i ρ_S Π_i,    (13.7)

where Π_i ≡ |i⟩⟨i| is a projection operator. This last expression puts the transformation in
operator-sum representation. In this case, the only operation element is Π_i/√p_i, which is
obviously not a decomposition of unity.

13.2.2 UNCONDITIONED SYSTEM EVOLUTION UNDER DIRECT MEASUREMENT

In the previous section we have considered the evolution of the system when the measure-
ment result is known. We call this the conditional evolution. Now we consider the same
measurement, but for the case that the result is not known or discarded. For example, this
may arise when the system is measured by someone else (a lab partner perhaps) who does
not communicate the result to us. We can ask ourselves whether the system density matrix
undergoes a nontrivial evolution in this situation. Indeed, this turns out to be the case. We
call the evolution of the density matrix in this process unconditional.
FIGURE 13.2: A direct quantum measurement is associated with an Hermitian operator Ô. The measurement result
m is an eigenvalue λ_i of Ô. The post-measurement state of the system is the corresponding eigenvector |i⟩.
The probability of measuring m = λ_i is given by p_i = |⟨i|ψ_S⟩|².

Following the measurement, the system is left in a statistical mixture of the eigenstates of
Ô , and is described by the density matrix:

ρ′_S = Σ_i p_i |i⟩⟨i|
     = Σ_i |⟨i|ψ_S⟩|² |i⟩⟨i|
     = Σ_i |i⟩⟨i|ψ_S⟩⟨ψ_S|i⟩⟨i|
     = Σ_i Π_i |ψ_S⟩⟨ψ_S| Π_i
     = Σ_i Π_i ρ_S Π_i.

In the final line, we have arrived at an expression for ρ′_S in operator-sum representation,


where the operation elements M i are the projectors Πi onto the eigenstates of Ô . Because
these eigenstates form an orthonormal basis for the system Hilbert space, the operation ele-
ments clearly form a decomposition of unity.
In summary, we have seen that the evolution of the reduced density matrix of a sys-
tem that undergoes quantum measurement can be written in operator-sum representation,
wherein all operator elements are projection operators.

13.2.3 MEASUREMENT STATISTICS
Before turning to an example, we consider the statistics of the measurement. The expectation
value of the measurement is

⟨m⟩ = Σ_i λ_i p_i
    = Σ_i λ_i |⟨i|ψ_S⟩|²
    = Σ_i λ_i ⟨ψ_S|i⟩⟨i|ψ_S⟩
    = ⟨ψ_S| ( Σ_i λ_i |i⟩⟨i| ) |ψ_S⟩
    = ⟨ψ_S| Ô |ψ_S⟩,
a result that should be familiar to us.

13.2.4 EXAMPLE: PROJECTIVE MEASUREMENT OF A QUANTUM BIT
We consider the direct measurement of Z for a qubit. As illustrated in Fig. 13.3, it is easy to
visualize the effect of the operation elements M + = Π0 = |0〉 〈0| and M − = Π1 = |1〉 〈1| on the
Bloch vector. For a measurement result m = +1, the Bloch vector is collapsed onto the north
pole. For m = −1, it is collapsed onto the south pole. When we ignore the measurement
FIGURE 13.3: Illustration of the effect of the projective operation elements in the direct quantum measurement
of Z for a qubit. Depending on the measurement result, the qubit Bloch vector is collapsed to the north (m = +1)
or south pole (m = −1). Under unconditioned evolution, the Bloch vector x and y components vanish, while the
z component remains unchanged.

FIGURE 13.4: Anatomy of an indirect quantum measurement.

result, the post-measurement density matrix evolves as


ρ_S = ( ρ_00   ρ_01 )   →   ρ′_S = M_+ ρ_S M_+† + M_− ρ_S M_−† = ( ρ_00    0   )
      ( ρ_10   ρ_11 )                                             (  0    ρ_11 ).
Thus, we see that the unconditioned evolution preserves the z component of the Bloch vec-
tor, and zeroes out its x and y components.
13.3 INDIRECT QUANTUM MEASUREMENTS
We now turn to the evolution of a quantum system that is subject to a different kind of mea-
surement, known as indirect quantum measurement. An indirect measurement consists of
three steps, as illustrated in Fig. 13.4. First, a measurement apparatus (itself a quantum sys-
tem with a Hilbert space of dimension NA ) is initialized in a pure state |ψA 〉. The system and
apparatus then undergo a known joint unitary evolution U , which in general entangles the
two. Finally, a direct measurement of the apparatus observable  is performed.
Let us take a moment to consider the wealth of choices available within indirect measure-
ments:
• dimensionality NA of the measurement apparatus.

• initial state |ψA 〉 of the measurement apparatus.

• unitary time evolution U corresponding to a Hamiltonian that contains an interaction
between system and apparatus.

• operator  for direct measurement of the apparatus.

These choices make for a wide variety of indirect measurements. Indeed, there is a whole
zoology of such measurements. The interested reader will find a thorough presentation in
Ch. 1 of Wiseman and Milburn.
Let us now consider the evolution of the system under such measurements. Like for direct
measurements, we consider two scenarios: one in which we are aware of the measurement
result (because we perform the measurement), and another in which we are ignorant of it
(because another party performs the measurement and does not tell us the result).

13.3.1 SYSTEM EVOLUTION CONDITIONED ON THE RESULT OF INDIRECT MEASUREMENT
We first analyze what happens to the state of the system in an indirect measurement. The
system and apparatus are initially in a product state |ψU (0)〉 = |ψS 〉 |ψA 〉. Following the unitary
evolution, the joint state is
|ψ_U⟩ = Σ_{i}^{N_A} M_i |ψ_S⟩ |i⟩,
where M i = 〈i |U |ψA 〉 – see Eq. (13.4). We again emphasize that, although this form of M i
seems to give a number, it is actually an operator in S, as U acts on the combined Hilbert
space of the system and the apparatus, and 〈i | and |ψA 〉 only act within the Hilbert space of
the apparatus – see the discussion following Eq. (13.4) in section 13.1.2. We choose for orthonormal basis
|i 〉 of the apparatus Hilbert space the eigenstates of the operator Â, Â |i 〉 = a i |i 〉. This choice
simplifies the next step. Upon a measurement giving result m = a i , the joint system collapses
to the product state
|ψ_{U|m=a_i}⟩ = N [ M_i |ψ_S⟩ ⊗ |i⟩ ].

Here, N = 1/√(⟨ψ_S| M_i† M_i |ψ_S⟩) is a normalization factor. Note that, by Born’s rule, ⟨ψ_S| M_i† M_i |ψ_S⟩
is precisely the probability p_i of getting measurement result m = a_i. Thus, we can write the
post-measurement reduced density matrix conditioned on m = a_i as

ρ_{S|m=a_i} = (1/p_i) M_i ρ_S M_i†.
Note that this form is the same as Eq. (13.7), except for a crucial distinction: M i is not neces-
sarily a projection operator!

13.3.2 UNCONDITIONED SYSTEM EVOLUTION UNDER INDIRECT MEASUREMENT
If the indirect measurement is performed by another party that does not reveal the measure-
ment result, the post-measurement reduced density matrix of the system is mixed. It is given
by a weighted sum of the conditioned density matrices,

ρ′_S = Σ_i p_i ρ_{S|m=a_i}    (13.8)
     = Σ_i p_i (1/p_i) M_i |ψ_S⟩⟨ψ_S| M_i†    (13.9)
     = Σ_i M_i |ψ_S⟩⟨ψ_S| M_i†    (13.10)
     = Σ_i M_i ρ_S M_i†.    (13.11)
Thus, the transformation E (ρ S ) of the reduced density matrix of a system undergoing an in-
direct quantum measurement can also be written in operator-sum representation. When the
measurement result is known, there is only one operation element. When the result is un-
known (or ignored) the sum includes NA operator elements.

13.3.3 WHAT DOES AN INDIRECT QUANTUM MEASUREMENT ACTUALLY MEASURE?
Before considering our first example, we briefly discuss the statistics of the measurement
result. The average value of the measurement is

⟨m⟩ = Σ_{i=1}^{N_A} a_i p_i
    = Σ_{i=1}^{N_A} a_i ⟨ψ_S| M_i† M_i |ψ_S⟩
    = ⟨ψ_S| ( Σ_{i=1}^{N_A} a_i M_i† M_i ) |ψ_S⟩
    = ⟨ψ_S| Ô |ψ_S⟩,
where Ô = Σ_{i=1}^{N_A} a_i M_i† M_i is an Hermitian operator acting on the system Hilbert space. Let us
take a moment to appreciate how this system operator is determined. It is given by a weighted
sum of N_A Hermitian operators built from the operation elements M_i (themselves deter-
mined by the initial apparatus state and the evolution U). The weighting coefficients are given
by the eigenvalues of the direct measurement of the apparatus operator Â.
Evidently, like the observable in a direct measurement, Ô is an Hermitian operator. How-
ever, note the crucial distinction. The measurement results a_i and corresponding post-measurement
states |ψ_{S|m=a_i}⟩ are categorically not eigenvalues and eigenvectors of Ô!

13.3.4 POVMs
We have placed a lot of emphasis on the post-measurement state of the system. However, it
is often the case in experiment that a measurement is performed only as the final step, and
the post-measurement state of the system is really not of interest. Rather, only the properties
of the pre-measurement state revealed by the measurement are of interest. In such a case,
it suffices to write the operator as Ô = Σ_{i=1}^{N_A} a_i E_i. The E_i are called effects, and are related
to the operation elements by E_i = M_i† M_i. Note that knowing the effects E_i lets us calculate
all the statistics of the measurement result, but precludes us from saying anything about the
post-measurement state. The effects add up to the identity operator in the system Hilbert
space,

Σ_i E_i = I,

and are positive operators:

⟨ψ_S| E_i |ψ_S⟩ ≥ 0  for any |ψ_S⟩.
Measurements that are specified by their effects, rather than their operation elements, are
known as positive-operator-valued measures, or POVMs for short.
13.3.5 EXAMPLE: WEAK QUANTUM MEASUREMENT OF A QUBIT
One of the simplest and most beautiful examples of an indirect quantum measurement is the
so-called weak measurement of a qubit. This process is illustrated in Fig. 13.5. The apparatus
consists of another qubit (often called the ancilla1 ) initialized in state |0〉. The interaction
between the qubit S and the ancilla A leads to the unitary evolution

U = e^{−i Z_S Y_A θ},

¹ Ancilla is the Latin word for a slave or servant.
FIGURE 13.5: Weak measurement of a qubit by indirect quantum measurement. The apparatus contains
an ancilla qubit initialized in |0⟩. The ancilla and qubit undergo an entangling unitary U = e^{−iZYθ}, where Z
acts on the qubit and Y on the ancilla. Following this interaction, a direct measurement of X is performed on
the ancilla. The strength of the measurement is controlled by θ. The choice θ = π/4 realizes a fully projective
measurement of the qubit in the qubit Z basis. The choice θ = 0 implies no measurement at all. For in-between
values of θ, a weak measurement is performed where the operation elements only rotate the qubit toward the
north and south poles of the Bloch sphere.

where ZS acts on the qubit and YA on the ancilla and θ ∈ [0, π/4]. Finally, a direct measure-
ment of the operator X A is performed on the ancilla. By choosing θ small, the act of measure-
ment distorts the initial state only slightly - in this sense, it is (or can be) a weak measurement.

The unitary evolution is easy to visualize: the ancilla undergoes a rotation by ±2θ around
the y axis of the Bloch sphere, with the sign depending on the state of the qubit:

U (α |0⟩_S + β |1⟩_S) |0⟩_A = α |0⟩_S e^{−iYθ} |0⟩_A + β |1⟩_S e^{iYθ} |0⟩_A
                            = α |0⟩_S (cos θ |0⟩_A + sin θ |1⟩_A) + β |1⟩_S (cos θ |0⟩_A − sin θ |1⟩_A)
                            = cos θ (α |0⟩_S + β |1⟩_S) |0⟩_A + sin θ (α |0⟩_S − β |1⟩_S) |1⟩_A.

Writing this state in the |±⟩_A = (|0⟩_A ± |1⟩_A)/√2 basis for the ancilla, dropping the subscripts
and using C_θ = cos(θ) and S_θ = sin(θ) for conciseness, we obtain

U (α |0⟩ + β |1⟩) |0⟩ = C_θ (α |0⟩ + β |1⟩) (|+⟩ + |−⟩)/√2 + S_θ (α |0⟩ − β |1⟩) (|+⟩ − |−⟩)/√2
                     = [(C_θ + S_θ) α |0⟩ + (C_θ − S_θ) β |1⟩]/√2 |+⟩ + [(C_θ − S_θ) α |0⟩ + (C_θ + S_θ) β |1⟩]/√2 |−⟩.

From this, the operation elements are seen to be

M_+ = (C_θ/√2) I + (S_θ/√2) Z   and   M_− = (C_θ/√2) I − (S_θ/√2) Z.

In the qubit |0〉 , |1〉 basis, they are represented by the matrices
à C θ +S θ ! à C θ −S θ !
p 0 p 0
M+ = 2 and M − = 2 .
0
C θ −S θ
p 0
C θ +S θ
p {
2 2 13
{
Let us take a moment to visualize the effect of these operation elements on the qubit state.
For θ = π/4, the operation elements become projection operators onto |0〉 and |1〉. This is
identical to a direct measurement of Z . For θ < π/4, the measurement is no longer projec-
tive, and in this sense is called weak. Rather than collapse the Bloch vector onto the poles, a
measurement result m = +1 (m = −1) only rotates the Bloch vector toward the north (south)
pole. This rotation preserves the azimuthal angle of the Bloch vector. The rotation angle
magnitude depends on θ, on the latitude of the Bloch vector, and on the measurement result.
FIGURE 13.6: Illustration of the operation elements for weak indirect measurement of a qubit. (a) For a measure-
ment result m = +1 (m = −1), a Bloch vector initially on the equator is rotated toward the north (south) pole by a
change of polar angle of ∓2θ. The azimuthal angle of the Bloch vector is preserved. (b) In general, the change of
polar angle is a function of θ, the latitude of the Bloch vector, and the measurement result.

When the Bloch vector lies on the equator, the rotation angle has magnitude (2θ) for both
measurement results, as illustrated in Fig. 13.6.
The unconditioned post-measurement density matrix

ρ′_S = M_+ ρ_S M_+† + M_− ρ_S M_−†

is easily visualized on the Bloch sphere. It is straightforward to show that

( ⟨X⟩′ )   ( C_2θ    0     0 ) ( ⟨X⟩ )
( ⟨Y⟩′ ) = (  0     C_2θ   0 ) ( ⟨Y⟩ ).
( ⟨Z⟩′ )   (  0      0     1 ) ( ⟨Z⟩ )

The effects are


E_+ = M_+† M_+ = (1/2) I + (S_2θ/2) Z   and   E_− = M_−† M_− = (1/2) I − (S_2θ/2) Z.

The measurement operator Ô is

Ô = +1M +† M + − 1M −† M −
S 2θ S 2θ
µ ¶ µ ¶
1 1
= +1 I + Z −1 I − Z
2 2 2 2
= S 2θ Z .

13.4 R EPEATED MEASUREMENTS


It is natural to ask, both for direct and indirect quantum measurements, how ρ S evolves
under repeated measurements. After N measurements resulting in a measurement record
{m 1 , m 2 , m 3 , ....m N } = {i 1 , i 2 , i 3 , ..., i N }, ρ S transforms into

1¡ N ¡ N ¢†
ρ 0S = Πn=1 M i n ρ S (0) Πn=1
¢
Mi n ,
{ p
13{ ¡ N ¡ N ¢†
where p = Πn=1 M i n ρ S (0) Πn=1
¢
M i n is the probability of getting the measurement record.
For direct measurements, the evolution of the ρ S is not particularly rich: it is frozen after
the first measurement! This is because the operation elements M i are orthogonal projectors,
N
and thus all subsequent measurements equal the first, and Πn=1 M i n = M iN = M i 1 . The two
1
possible trajectories for the Bloch vector of a qubit undergoing a series of direct measure-
ments of Z are shown in Fig. 13.7(a,b). Remember that the conditional evolution is subject to
a known measurement outcome. The evolution can be much richer for weak indirect quan-
tum measurements. Here, we imagine that the apparatus state is re-initialized to a fixed |ψA 〉
13.4. R EPEATED MEASUREMENTS 193

Direct measurement Weak indirect measurement


1.0 1.0
(a) (c)
Bloch vector coordinate

0.5 0.5
conditioned trajectory conditioned trajectory

0.0 0.0

<X> <X>
-0.5 <Y> -0.5 <Y>
<Z> <Z>
-1.0 -1.0
1.0 1.0
(b) (d)
Bloch vector coordinate

0.5 0.5
conditioned trajectory conditioned trajectory

0.0 0.0

<X> <X>
-0.5 <Y> -0.5 <Y>
<Z> <Z>
-1.0 -1.0
1.0 1.0
(e) (f)
Bloch vector coordinate

0.5 0.5
unconditioned trajectory unconditioned trajectory

0.0 0.0

<X> <X>
-0.5 <Y> -0.5 <Y>
<Z> <Z>
-1.0 -1.0
0 2 4 6 8 10 0 50 100 150 200 250 300
Measurement number Measurement number

F IGURE 13.7: Sample quantum trajectories of a qubit undergoing repeated direct measurements of Z (a,b) and
repeated weak indirect measurements with θ = π/40 (c,d). (e) The unconditioned trajectory of the Bloch vector
under repeated direct measurement. (f) The unconditioned ¡trajectory p p under repeated
¢ weak indirect measure-
ments. The initial Bloch vector is in all cases (〈X 〉 , 〈Y 〉 , 〈Z 〉) = 3/8, − 3/8, 1/2 .

following each measurement. Note that otherwise the operation elements M i would change
from measurement to measurement! Example trajectories for the Bloch vector of a qubit un-
dergoing weak indirect measurements with θ = π/40 are shown in Fig. 13.7(c,d).
If the measurement results are concealed (or disregarded), the reduced density matrix of
the system becomes mixed. After N measurements,
{
ρ 0S =
NA X
X NA
...
NA ¡
X N
Πn=1
¢ ¡ N
M i n ρ S Πn=1
¢†
Mi n .
13
{
i i =1 i 2 =1 i N =1

Note that for direct measurements, the projective character of the M i also stops the evolution
of the unconditioned density matrix after the first measurement (Fig. 13.7(e)). The evolution
of the unconditioned density matrix for a qubit undergoing weak indirect measurements with
θ = π/40 is shown in Fig. 13.7(f).
194 13. O PEN QUANTUM SYSTEMS

13.5 L INDBLAD REPRESENTATION


So far we have described transformations ρ S → ρ 0S = E (ρ S ) using the operator-sum, or Kraus,
representation. In all of our examples in this chapter - coupling to environment, direct and
indirect quantum measurements - the final reduced density matrix is given by "sandwiching"
the initial matrix ρ S by operation elements2 M i . In this section, we will derive a general ex-
pression for the change in ρ S , that is, for E (ρ S ) − ρ S . We will show that this change can always
be represented in a compact form known as Lindblad form.
Our starting point is to expand all operation elements M i using an orthonormal basis
F 1 , F 2 , ..., F N 2 for operators in S. By orthonormal, we mean
S

h i
TrS F i† F j = δi j . (13.12)

For convenience, we always define F N 2 as the normalized identity operator,


S

1
FN 2 ≡ p IS.
S NS

Note that this choice of F N 2 combined with the orthonormality condition make F 1 , ..., F N 2 −1
S S
traceless,
TrS [F i ] = 0 for i = 1, .., NS2 − 1.
Expanding the operation elements in this basis, we have
2
NS
αi j F j ,
X
Mi =
j =1

h i
with coefficients αi j = TrS F j† M i .
Example: Orthonormal bases of operators.
For a two-dimensional S, a basis of orthonormal operators is
½ ¾
1 1
p Z , σ− , σ+ , p I , (13.13)
2 2

where µ ¶ µ ¶
0 0 0 1
σ+ = and σ− = .
1 0 0 0
Another possibility is ½ ¾
1 1 1 1
p X, p Y, p Z, p I . (13.14)
2 2 2 2
You should check that each of these bases satisfies the orthonormality condition (13.12). Us-
ing Eq. (13.5),

E (ρ S ) = M i ρ S M i†
X
i
{
à ! à !
13 αi j F j ρ S α∗i k F k†
X X X
{ =
i j k

c j k F j ρ S F k† ,
X
=
j ,k

where
αi j α∗i k .
X
cjk ≡
i

2 Operation elements are also called Kraus operators in the literature


13.5. L INDBLAD REPRESENTATION 195

Separating the terms involving F N 2 , we arrive at


S

cN 2 N 2 NS2 −1 µ c cN 2 j ¶ NS −1 2
j NS2 †
E (ρ S ) = ρS + p F j ρS + p ρSF j + c j k F j ρ S F k†
S S
X S
X
NS j =1 NS NS j ,k=1

Defining
PNS2
F = p1 c 2F ,
NS i =2 i NS i

and from it the Hermitian operators


1
¡ † ¢
H = 2i F −F ,

c N 2 N 2 −NS
I + 12 F † + F ,
S S
¡ ¢
G = 2NS

the equation can be written as


2
ª NX
S −1
E (ρ S ) − ρ S = −i H , ρ S + G, ρ S + c j k F j ρ S F k† .
£
¤ ©
j ,k=1

The transformation must conserve the (unity) trace of any ρ S . The condition

TrS [E (ρ S ) − ρ S ] = 0

implies
2
NS −1
1 X
G =− c j k F k† F j ,
2 j ,k=1

and thus
NS2 −1 µ o¶
† 1n †
E (ρ S ) − ρ S = −i H , ρ S c j k F j ρ S Fk − F F j , ρS .
£ ¤ X
+ (13.15)
j ,k=1 2 k

This compact expression is said to be in first standard form.


The (NS2 − 1) × (NS2 − 1) matrix
 
c 1,1 ··· c 2,N 2 −1
S
 .. .. .. 
C ≡
 . . .


c N 2 −1,1 ··· c N 2 −1,N 2 −1
S S S

is Hermitian and positive semi-definite. It may be diagonalized by a unitary transformation


U:
γ1 0 · · ·
 
0
 0 γ ··· 0 
2

 
UCU =   .. ,
 0 0 . 0


0 0 · · · γN 2 −1
S

with diagonal elements γi ≥ 0. {


Introducing a new set of operators A k : 13
{

NS2 −1
X
Fj = uk j A k ,
k=1

and plugging into Eq. (13.15), we arrive finally at

NS2 −1 µ o¶
1n †
E (ρ S ) − ρ S = −i H , ρ S + γi A i ρ S A †i − A Ai , ρS ,
£ ¤ X
i =1 2 i
196 13. O PEN QUANTUM SYSTEMS

which can be written very compactly as

NS2 −1
E (ρ S ) − ρ S = −i H , ρ S + γi D[A i ]ρ S ,
£ ¤ X
i =1

with
1 1
D[A]ρ S ≡ Aρ S A † − A † Aρ S − ρ S A † A. (13.16)
2 2
This final expression is in so-called Lindblad form. D bears the name of dissipation super-
operator. It warrants being called a super-operator because it transforms operators to op-
erators. The operator A associated with D is called a Lindblad operator. It is important to
remark that, in general, the Hermitian operator H is not the free Hamiltonian HS of S in the
full Hamiltonian HU = HS + HE + HI .
Example: The damping channel.
Let us write the damping transformation in Lindblad form. In the basis (13.13), the two op-
eration elements are
1 1
M 0 = α01 p Z + α04 p I and M 1 = α13 σ− ,
2 2
p p
1+ 1−p 1− 1−p p ¢2
with α04 = p , α01 = p , and α13 = p (all other αi j = 0). These give c 11 = 1 − 1 − p /2,
¡ p
2 2
c 33 = p and c j k = 0 for all other j , k. Since F ∝ Z and Z = Z † , it follows that H = 0. The matrix
C is already diagonal, and we thus arrive at
c 11
E (ρ S ) − ρ S = c 33 D[σ− ]ρ S + D[Z ]ρ S .
2
Example: The depolarization channel.
We can proceed similarly to write the depolarization transformation in Lindblad form. The
four operation elements in (13.6) are already expanded in the operator basis (13.14). It is easy
to see that F = 0 and thus also H = 0. The matrix C is already diagonal. We leave it as an
exercise to arrive at
p p p
E (ρ S ) − ρ S = D[X ]ρ S + D[Y ]ρ S + D[Z ]ρ S .
4 4 4

13.5.1 U NCONDITIONED WEAK MEASUREMENTS IN L INDBLAD FORM


Let us now consider a weak indirect quantum measurement in which the interaction between
system and apparatus is given by the Hamiltonian

HU = HI = B S ⊗ A A ,

with Hermitian system operator B S and apparatus operator A A . The time-evolution operator
for an interaction time τ is given by

ρ 0U = U (τ)ρ UU (τ)† ,

where
{ U = e −iHU τ/ħ .
13
{
Because we assume the interaction to be weak, we can approximate U by an expansion
up to second order in τ:
τ τ2
U = 1 + B S A A − 2 B S2 A 2A + . . .
iħ ħ
The final universe density matrix is thus
³ ´ ³ ´
τ τ2 2 2 τ τ2 2 2
U ρ UU † = I + iħ B A − 2ħ 2B A ρ U I − iħ B A − 2ħ 2B A

τ
B Aρ U − ρ U B A + ħτ2 B Aρ U B A − 12 B 2 A 2 ρ U − 12 ρ U B 2 A 2 + O(τ3 ).
2 ¡
= ρ U + iħ
¡ ¢ ¢
13.6. P ROBLEMS 197

Tracing over the apparatus and taking into consideration that initially the system is unentan-
gled with the apparatus,
ρ U = ρ S ⊗ |ψA 〉 〈ψA | ,
we find
E (ρ S ) = TrA [U ρ UU † ]
τ
〈A〉[B, ρ S ] + 〈A 2 〉 ħτ2 B ρ S B − 12 {B 2 , ρ S } + O(τ3 ),
2 ¡
= ρ S + iħ
¢

with 〈A〉 ≡ 〈ψA | A |ψA 〉 and 〈A 2 〉 ≡ 〈ψA | A 2 |ψA 〉.


We have arrived at a Lindblad-form expression for the change in ρ S :

τ τ2
E (ρ S ) − ρ S = 〈A〉[B, ρ S ] + 〈A 2 〉 2 D[B ]ρ S + O(τ3 ). (13.17)
iħ ħ
This expression has only one dissipator with associated Lindblad operator B S , and an effec-
tive Hamiltonian term proportional to B S . The latter induces systematic backaction on the
system. Notice that the strength of these two different terms depends on the initial state of
the apparatus through 〈A〉 and 〈A 2 〉.
Example: Weak indirect measurement of a qubit.
Let us take a moment to relate this result to the example of weak measurement of a qubit
from above. Linking the notation, we have B S = Z and A A = Y , θ = τ/ħ, and |ψA 〉 = |0A 〉. Thus,
we have 〈A A 〉 = 0 and 〈A 2A 〉 = 1. The first tells us that there is no systematic backaction on the
measured qubit, and that all dynamics of ρ S arises from the single dissipator.
An interesting prediction of Eq. (13.17) is that initializing the ancilla qubit in a state |ψA 〉
with non-zero expectation value for YA (〈ψA | YA |ψA 〉 6= 0)) will produce systematic backaction
of the measured qubit. Note that A 2A = Y 2 = I , so the dissipative term is independent of the
initial ancilla state. We explore this in Fig. 13.8 by initializing the ancilla qubit on the y-z
plane with polar angle ranging from 0 to π/2.

13.6 P ROBLEMS
1. Measurement-induced qubit dephasing
In lecture, we discussed an indirect quantum measurement of a qubit using a second,
ancillary qubit. This two-step process consists of an interaction

U = e −iZS YA θ ,

followed by direct measurement of the ancilla with operator X A . Here, we will explore
the consequences of projecting the ancilla onto different bases. We adopt the usual
convention where X , Y and Z stand for the Pauli matrices – they are used instead of
σx , σ y and σz .

(a) Consider an ancilla measurement in YA . Derive the two operation elements in


this case. What does each of these operation elements do to the qubit, i.e., what
is their backaction on the qubit? Check that these operation elements form a de-
composition of unity.
(b) Consider an ancilla measurement in ZA . Derive the two operation elements in {
this case. What does each of these operation elements do to the qubit? Check that 13
{
these operation elements form a decomposition of unity.
(c) Show that in both cases, the unconditioned post-measurement density matrix is
the same as that derived in class for measurement in X A .
Note: This result is quite general. When an indirect measurement is performed
on a system, but the measurement result ignored, the post-measurement sys-
tem density matrix is independent of the projective measurement chosen for the
probe.
198 13. O PEN QUANTUM SYSTEMS

z
ψA
a ancilla initialization
b
y
c

unconditioned trajectories
1.0
(a)

0.5

0.0

-0.5 <X>
<Y>
<Z>
-1.0

1.0
(b)
Bloch vector coordinate

0.5

0.0

-0.5 <X>
<Y>
-1.0
<Z>

1.0 (c)

0.5

0.0

-0.5 <X>
<Y>
-1.0
<Z>
0 50 100 150 200 250 300
Measurement number

F IGURE 13.8: Unconditioned evolution of a qubit undergoing weak quantum measurement with the ancilla ini-
tialized in three different states on the y-z plane of the Bloch sphere: (a) |ψA 〉 = |0〉, (b) |ψA 〉 = C π/8 |0〉 + i S π/8 |1〉,
(c) and |ψA 〉 = p1 (|0〉 + i |1〉). The initial condition of the ancilla affects the systematic backaction on the qubit.
2
Consistent with Eq. (13.17), the systematic backaction in this example is a rotation about the z axis. In (a), there
is no systematic backaction, and all dynamics arises from the dissipator. In (c), the systematic backaction term
dominates the evolution.

2. Consider the problem of discriminating between non-orthogonal polarization states


{ of a single photon using projective measurements. Imagine a source of single photons
13{ that prepares them in either of the following two non-orthogonal states:
¯ψ 0
¯ ®
¯ ® = cos(α/2) |H 〉 + sin(α/2) |V 〉 ,
¯ψ1 = cos(α/2) |H 〉 − sin(α/2) |V 〉 ,

where |H 〉 and |V 〉 correspond to horizontal and vertical polarization. We denote


¯ ®the
probability that the source sends ¯ψ0 as p 0 and the probability that it sends ¯ψ1 as
¯ ®

p1 = 1 − p0.

(a) Calculate the overlap ψ0 |ψ1 .


­ ®
13.6. P ROBLEMS 199

(b) If ¯ψ0 and ¯ψ1 were orthogonal, we would


¯ ® ¯ ®
¯ ® discriminate perfectly by performing
¯ψ0 , ¯ψ1 . For non-orthogonal ¯ψ0 and
¯ ® ¯ ®
projective measurements in the basis
¯ψ1 , what is the measurement basis which minimizes the probability for a mis-
¯ ®

take in deciding which state we received? Hint: Parametrize the measurement


basis by an angle β:

|Φ0 〉 = cos(β/2) |H 〉 + sin(β/2) |V 〉 ,


|Φ1 〉 = sin(β/2) |H 〉 − cos(β/2) |V 〉 ,

and write the probability of error as a function of p 0 , p 1 , α and β. Minimize with


respect to β.
(c) Show that the minimum error probability is attained for

tan(βopt ) = tan(α)/(p 0 − p 1 ).

(d) Find an expression for the minimum error probability, p min . This minimum error
probability is known as the Helstrom lower bound.
(e) Make a plot of p min as a function of α for p 0 = p 1 = 1/2.

3. In this problem, we consider a different strategy altogether for discriminating between


the two non-orthogonal states. Here, we will devise a measurement scheme which
yields the possible results:

• We know for sure that the state is ¯ψ0 .


¯ ®

• We know for sure that the state is ¯ψ1 .


¯ ®

• Inconclusive: we do not know which state was sent to us.

Consider a POVM with the three effects:

E 0 = M 0† M 0 = γ0 |Ψ2 〉 〈Ψ2 | ,
E 1 = M 1† M 1 = γ1 |Ψ3 〉 〈Ψ3 | ,
E 2 = M 2† M 2 = 1 − E 0 − E 1 ,

where
¯ψ2 = sin(α/2) |H 〉 − cos(α/2) |V 〉 ,
¯ ®

¯ψ3 = sin(α/2) |H 〉 + cos(α/2) |V 〉 ,


¯ ®

γi are real and satisfy γi > 0. Note that ψ2 |ψ0 = ψ3 |ψ1 = 0, with
­ ® ­ ®
and
¯ ®the coefficients
¯ψ0 and ¯ψ1 as defined in the previous problem.
¯ ®

(a) Determine the conditions for this POVM to be valid (the effects are positive op-
erators, and form a decomposition of unity). That is, what constraint does this
impose on γ0 and γ1 ?
(b) Show that¯ the probability of getting measurement result 0 given that the photon
sent is in ¯ψ0 is 0.
®
{
(c) Calculate¯the probability of getting measurement result 1 given that the photon 13
{
sent is in ¯ψ0 .
®

(d) Calculate the total probability of getting measurement result 2. Express this as a
function of γ0 , γ1 , p 0 , and p 1 .
(e) The strategy
¯ for discrimination proposed is: If we measure 0, declare
¯ ® that the state
sent was ψ1 . If we measure 1, declare that state sent was ¯ψ0 . If we measure
®
¯
2, report Inconclusive. Optimize over the γ to minimize the probability that our
result will be inconclusive.
200 13. O PEN QUANTUM SYSTEMS

4. (10 pts) Consider the indirect weak measurement of a quantum bit S using another an-
cillary qubit A, as covered in class and lecture notes. As before, the ancilla is initialized
in |0〉 A . However, on this occasion the chosen interaction between S and A is

H = ZS X A ,

which is on for a time t = ħθ.

(a) Show that the operation elements corresponding to ancilla measurement along
the Y A axis with results m = +1 and m = −1 are (in the |0〉S , |1〉S basis):
à C θ −S θ ! à C θ +S θ !
p 0 p 0
M +1 = 2 and M −1 = 2
C θ +S θ C θ −S θ
0 p 0 p
2 2

where the usual notation C θ = cos(θ), S θ = sin(θ) is used.


(b) Consider now performing two such weak measurements (measurement a fol-
lowed by measurement b). It is understood that the ancilla qubit is re-initialized
to |0〉 A between measurements. What is the operation element corresponding to
the measurement record {m a , m b } = {+1, +1}.
(c) Repeat for {m a , m b } = {+1, −1}.
(d) For what measurement records can you say that the post-measurements’ state of
the qubit is the same as the initial state?
(e) Generalize to the case of a series of N measurements. Consider the cases where N
is odd or even. For what measurement records does the qubit return to the initial
state after the measurements are done?

5. (10 pts) Entanglement by measurement


In this problem we consider the possibility of entangling two qubits (Q 1 and Q 2 ) by an
indirect measurement using a third, ancillary qubit A. Consider the scheme below.

U1
Z mA = 1
Ψ 0A ?
U2

Q 1 and Q 2 are initially in a maximal superposition state

1
|Ψ〉 = (|01 02 〉 + |01 12 〉 + |11 02 〉 + |11 12 〉) .
2

{ The interaction between Q i and A results in the unitary evolution:


13
{
Ui = |0i 〉 〈0i | e −iθY A + |1i 〉 〈1i | e +iθY A ,

where as usual Y A is the Pauli-Y A operator, and θ is a fixed angle, somewhere between
0 and π/2, equal for the two interactions.

(a) Demonstrate that the initial state |Ψ〉 has no entanglement between Q 1 and Q 2 .

Be careful: Do not try to solve the following parts using elaborate calculations. Try to
visualize the action of the two Ui operators on the ancilla.
13.6. P ROBLEMS 201

(a) Work out the combined action of U1 and U2 on the state |01 12 〉 |0 A 〉. Do the same
for the other possible initial states of the Q1-Q2 system.
(b) What is the probability of obtaining the measurement result m A = +1? What is the
post-measurement state in that case? Does this state represent entangled qubits
Q 1 and Q 2 ? If yes, are the qubits maximally entangled?
(c) What is the post-measurement state of the Q 1 -Q 2 system when the measurement
result is m A = −1? Are the two qubits entangled in this case? If so, are they maxi-
mally entangled?

6. H Consider a qubit whose state is described by a known reduced density matrix ρ s . We


now hand this qubit over to Jos, who performs a projective measurement in the |0〉 , |1〉
basis. Jos then returns the qubit to us, without telling us the result of his measurement.

(a) How is ρ s transformed by this process? Write down this process using operator-
sum representation:
ρ 0s = M i ρ s M i† .
X
i

Specify the two operation elements M 0 and M 1 (each one as a 2 × 2 matrix in the
|0〉 , |1〉 basis).
(b) How does this process transform the Bloch vector representing the qubit state?
Hint: (〈X 〉 , 〈Y 〉 , 〈Z 〉) → (?, ?, ?).
(c) Now consider another process, known as the phase-flip channel. With probability
1/2, nothing happens. With probability 1/2, the qubit acquires a phase shift:

α |0〉 + β |1〉 → α |0〉 − β |1〉 .

What are the two operation elements M 00 and M 10 for this process? How does the
Bloch vector transform in this case?
(d) Compare results for the transformation of the Bloch vector obtained in (b) and
in (c). These transformations should be the same! This simple example shows
that different processes can lead to the same quantum operation on a quantum
system. Show that the operation elements M 0 and M 1 are related to the operation
elements M 00 and M 10 by a unitary transformation

M 00
µ ¶ µ ¶
M0
=S ,
M 10 M1
where S is a 2 × 2 matrix.

{
13
{
14
T IME EVOLUTION OF THE DENSITY
OPERATOR

In this chapter, we revisit the material covered in the previous chapter from a different view-
pont: we calculate the time evolution of the density operator for a system coupled to a bath.
We shall discuss the different assumptions necessary to arrive at the final result in detail:
these boil down to requiring that the system under consideration is coupled sufficiently weakly
to the bath that the latter can always be considered to be in equilibrium.

14.1 T HE B ORN -M ARKOV MASTER EQUATION


In this section, we develop the formal equation describing the evolution of the density op-
erator of a system coupled to an environment or bath – the terms ‘bath’ and ‘environment’
have the same meaning in this section. The equation we search for is the analogue for an
open system of the Schrödinger equation for a closed quantum system. Of course, different
environments give rise to different evolutions of the density matrix and the same holds for
different couplings to the same environment. Therefore, we should anticipate that our equa-
tions will depend on the model(s) used for both the environment and the coupling. On the
other hand, one may ask the question what is the most general form of this equation. We
shall address this issue towards the end of this chapter.
The material discussed in this and the following sections can be found in many books.
The discussion here is based on Quantum Measurement and Control, by H. M. Wiseman and
G. J. Milburn, Cambridge University Press, 2010. Another useful text is The Theory of Open
Quantum Systems, by H. P. Breuer and F. Petruccione, Oxford University Press, 2002.
To find the equation of motion for the system’s density operator is straightforward: we
start by writing up the time evolution of the ‘Universe’ and then we trace out the environment.
However, in order to arrive at a useful and convenient result, we must make two important
approximations as we shall see.
The Hamiltonian of the universe can be written as

H = HS + HE + VSE

where HS and HE are the Hamiltonians of the system and the environment without interac-
tion, and VSE is the coupling between the two.
Now we step back. In section 2.2, we have introduced the Schrödinger and Heisenberg
pictures of quantum mechanics. Let us summarise them here:

203
204 14. T IME EVOLUTION OF THE DENSITY OPERATOR

• Schrödinger picture:

The states evolve in time according to the time evolution operator U (t ) =


exp (−it H /×):
¯ψ(t ) = U (t ) ¯ψ(0) .
¯ ® ¯ ®

Equivalently, the time evolution of the states is given by the time-dependent


Schrödinger equation:

∂ ¯¯
ψ(t ) = H ¯ψ(t ) .
® ¯ ®

∂t
The density operator evolves in time as

ρ(t ) = U (t )ρ(0)U † (t ).

The operators do not evolve in time.

• Heisenberg picture:

The operators Ô evolve in time according to

Ô H (t ) = e iH t /× Ô e −iH t /× = U † (t )ÔU (t ).

Equivalently, they evolve according to the Heisenberg equation of motion:

d i
O H (t ) = [H , O H (t )] .
dt ×
The states do not depend on time in this picture and the same holds for the
density operator ρ H .

It should be clear from this overview that the density operator evolves in a way different from
a ‘usual’ operator: it is time dependent in the Schrödinger picture, where usual operators are
time-independent, and in the Heisenberg picture we have the opposite situation.
For our problem where we have a non-interacting system described by the Hamiltonian
H0 = HS + HE and an interaction VSE , it turns out convenient to introduce a third picture: the
‘interaction picture’, in which the states and the operators are time dependent. The states are
defined by
¯ψI (t ) = e it H0 /× ¯ψ(t ) = e it H0 /× e −it (H0 +VSE ) ¯ψ0 .
¯ ® ¯ ® ¯ ®

Note that the two factors it H0 in the exponent do not cancel due to the fact that H0 does not
commute with VSE . The operators also evolve in the interaction picture – they are defined as

Ô I (t ) = e iH0 t /× Ô e −iH0 t /×

and the density matrix is in this picture defined as:

ρ I (t ) = e iH0 t /× ρ(t )e −iH0 t /× ,

where ρ(t ) is the density matrix in the Schrödinger picture. This directly gives the inverse
relation
ρ(t ) = e −iH0 t /× ρ I (t )e iH0 t /× . (14.1)
{ Armed with this new picture, the time evolution of the density matrix can now be cast in
14{ a compact form. We now set
×≡1
for convenience, and start with the general relation (in the Schrödinger picture):

ρ̇ = −i H , ρ .
£ ¤
14.1. T HE B ORN -M ARKOV MASTER EQUATION 205

Rewriting this equation in the interaction picture gives after some calculation:

ρ̇ I (t ) = −i VSE, I (t ), ρ I (t ) .
£ ¤
(14.2)

This equation can be solved implicitly:


Z t
ρ I (t ) = ρ I (0) − i VSE, I (t 0 ), ρ I (t 0 ) d t 0 .
£ ¤
0

This equation is then put back into (14.2), which then leads to
Z t
ρ̇ I (t ) = −i VSE, I (t ), ρ I (0) − VSE, I (t ), VSE, I (t 0 ), ρ I (t 0 ) d t 0 .
£ ¤ £ £ ¤¤
0

We may iterate further but this does not turn out to be necessary for our purposes. Such an
equation requires an initial condition. Without fully specifying the initial density matrix, we
assume that it is of the form:
ρ I (0) = ρ S (0) ⊗ ρ E (0),
i.e. at t = 0, there are no correlations between the system and the environment. This may
be realised in practice by preparing a system and an equilibrated environment, and bringing
them into contact at t = 0.
We want to take the trace of the density matrix over the environment in order to arrive at
a density matrix for the system only. Doing this for the uncorrelated density operator at t = 0,
we usually find
£ ¤
Tr E VSE, I (0)ρ I (0) = 0.
If this is not the case, we simply add a constant to H0 :

H0 → H0 + aI S ⊗ I E

where I S,E are the unit operators within the respective Hilbert spaces and a is chosen such as
to make the above trace zero. We then have
Z t
Tr E ρ̇ I (t ) = − Tr E VSE, I (t ), VSE, I (t 0 ), ρ I (t 0 ) d t 0 .
¡£ £ ¤¤¢
0

We can make this equation manageable if we realise that the coupling VSE between the
environment and the system is weak. Subsequent iterations lead to higher and higher powers
of VSE in the integral. So we may replace the ρ I in that integral by a simpler form without
spoiling the ρ̇ I on the left hand side too much. The approximation we make for ρ I is based on
the physical assumption that the environment contains so many degrees of freedom that it is
hardly perturbed by the system, and that it does not build up extensive correlations with the
latter. This means that we assume the environment to be at equilibrium at all times. In that
case we may put
ρ I (t 0 ) = ρ S,I (t 0 ) ⊗ ρ E,I ,
so that the evolution of the density operator is determined by the equation
Z t
ρ̇ S,I = − Tr E VSE, I (t ), VSE, I (t 0 ), ρ S,I (t 0 ) ⊗ ρ E,I (0) d t 0 .
¡£ £ ¤¤¢
0

Note that this approximation is useful if we are mainly interested in the effect of the bath on
the system – there is also a reverse effect of the system on the bath, which becomes important {
when the environment carries away information concerning the system, as happens during 14
{
a measurement. The approximation that the environment is not influenced by the system is
called the Born approximation. It is strongly related to the Born approximation in scattering,
where we replace the full wave function, rather than the density matrix, by an unperturbed
one. Finally, note that ρ S,I and ρ E,I are both calculated in the interaction picture.
206 14. T IME EVOLUTION OF THE DENSITY OPERATOR

The evolution equation for the system density operator still has a major disadvantage: the
left hand side is a function of t , whereas the right hand side contains an integral over t 0 for
which we need all the density matrices at times between 0 and t . Noting that

VSE, I (t ) = e it (HS +HE )VSE e −it (HS +HE )


0
we can assume that, for a large bath, the terms with phase factor e i(t −t )HE oscillate so quickly
in the integral that their contribution averages out to zero except for t ≈ t 0 . This is the Markov
condition, which can be formulated as the requirement that

Tr E ρ EVSE, I (t )VSE, I (t 0 ) ≡ Γ(t − t 0 ),


¡ ¢

which is recognized as the autocorrelation function of the environment, is sharply peaked


around t = t 0 . In physical terms, it says that the environment decorrelates so quickly that all
contributions in which t and t 0 are separated more than a minimum time τ, average out to
zero; τ is furthermore supposed to be much smaller than the time scale at which the system
changes. The evolution of the system is now governed by the equation
Z t
ρ̇ S (t ) = − Tr E VSE, I (t ), VSE, I (t 0 ), ρ S,I (t ) ⊗ ρ E (0) d t 0 ,
¡£ £ ¤¤¢
(14.3)
0

where we have omitted the subscript ‘I’ with ρ E (0) as the different pictures are identical at
t = 0. Given the fact that the integrand of the original integral is peaked around t ≈ t 0 , we can
set the lower bound of the integral to −∞, which leads to
Z t
ρ̇ S,I (t ) = − Tr E VSE, I (t ), VSE, I (t 0 ), ρ S, I (t ) ⊗ ρ E (0) d t 0 .
¡£ £ ¤¤¢
−∞

This is the so-called Born-Markov equation for the density matrix.

Summary so far We have seen that a convenient picture for describing the evolution
of a system coupled to a bath is the interaction picture, defined by

¯ψI (t ) = e it H0 ¯ψ(t ) ;
¯ ® ¯ ®

ρ I (t ) = e it H0 ρ(t )e −iH0 t ;
O I (t ) = e it H0 O e −it H0 .

In this picture, the evolution of a system can be derived if we make the following two
assumptions:
• The environment is considered to remain in the initial state and does not
evolve significantly from its state at t = 0 (Born approximation).
• The interactions with the environment decorrelate quickly in time (Markov ap-
proximation).
The result is therefore called the Born-Markov equation:
Z t
ρ̇ S, I (t ) = − Tr E VSE, I (t ), VSE, I (t 0 ), ρ S, I (t ) ⊗ ρ E (0) d t 0 .
¡£ £ ¤¤¢
(14.4)
−∞

{ The name Redfield or Bloch-Redfield equation is often given to this or to a very sim-
14{ ilar equation.

We shall now consider some concrete examples to illustrate the ideas and to derive some
physically relevant results.
14.2. E XAMPLES 207

14.2 E XAMPLES
14.2.1 T HE DAMPED HARMONIC OSCILLATOR
We consider a harmonic oscillator S with frequency ω0 coupled to a bath E of other harmonic
oscillators with frequencies ωk , one for each k-vector in a cavity. The Hamiltonian is

H = HS + HE + VSE ,

where ³ ´ X³ ´
HS = ω0 a † a + 1/2 ; HE = b k† b k + 1/2 ;
k
³ ´³ ´
g k a † + a b k† + b k .
X
VSE =
k

The environment, described by the creation and annihilation operators b, b † ’s, could for ex-
ample be a 1D chain of N harmonic oscillators like the one studied in section 8.2 (a quick
revision of the harmonic oscillator is strongly recommended!). Note that the last term repre-
sents a coupling which, for a 1D system, would be of the form x a x 0 , as x a ∝ a + a † , and the
position x 0 of the zeroeth oscillator of the environment would be

1 X X³ † ´
x0 = xk ∝ bk + bk .
N k k

The bath has a density matrix

1
ρE = exp(−βHE ),
Z
where β = 1/(k B T ) and. Note that this can also be written as

1Y
ρE = |n k 〉 〈n k | e −βωk (nk +1/2) ,
Z k

with Y hX i
Z= )n k = 0∞ e −βωk (nk +1/2) .
k

In the interaction picture, the a and b k become time-dependent – their time dependence
is caused by the unperturbed Hamiltonian and we obtain:
X ³ ´³ ´
VSE, I (t ) = g k a † e iω0 t + ae −iω0 t b k e −iωk t + b k† e iωk t
k

where the time dependence has been made explicit in the form of the phase factors. All ω’s
are positive. The time-dependent phase factors rotate in the complex plane, and their average
effect is expected to decay rapidly with time. An exception occurs when ωk ≈ ω0 , and we
usually only keep the terms which are expected to give a non-negligible transition rate:
X ³ ´
VSE, I (t ) = g k a † b k e i(ω0 −ωk )t + b k† ae −i(ω0 −ωk )t .
k

This expression will give significant contributions when ωk is close to ω0 . This approximation
which consists of eliminating the rapidly oscillating phase factors is called the rotating wave
approximation. {
Expanding the double commutator occurring in the Born-Markov equation seems cum- 14
{
bersome: it gives rise to no less than 16 terms. However, half of these vanish since the trace
over the terms b k b k0 and b k† b k† 0 vanishes (why?). The other two possible combinations give
(see problem 1) ³ ´
Tr E ρ E b k† b k0 = δ(k − k0 )n k
208 14. T IME EVOLUTION OF THE DENSITY OPERATOR

and ³ ´
Tr E ρ E b k b k† 0 = δ(k − k0 ) n k + 1 .
¡ ¢

Apart from pre-factors, sums over k and the integral over t , but after having taken the trace
over the environment Hilbert space, we are then left with the following:
0 0
aa † ρ S,I n k e −i(ω0 −ωk )(t −t ) + a † aρ S,I (n k + 1)e i(ω0 −ωk )(t −t ) −
0 0
aρ S,I a † n k e −i(ω0 −ωk )(t −t ) − a † ρ S,I a(n k + 1)e i(ω0 −ωk )(t −t ) −
0 0
a † ρ S,I a † n k e −i(ω0 −ωk )(t −t ) − aρ S,I a † (n k + 1)e i(ω0 −ωk )(t −t ) +
0 0
ρ S,I aa † n k e i(ω0 −ωk )(t −t ) + ρ S,I a † a(n k + 1)e −i(ω0 −ωk )(t −t ) .

Two operations remain to be performed: the sum over the modes k and the integral over
the time t . It can be shown that
Z 0 Z ∞ µ ¶
1
e ±iωt = e ∓iωt = πδ(ω) ∓ iP .
−∞ 0 ω
It turns out that the imaginary contribution shifts the frequency over a small amount – we
neglect this effect. Replacing the sum over k by an integral over the energies ω:
X Z
→ η(ω)d ω
k
¯ ¯2
where η(ω) represents the density of states, we finally obtain, putting γ = 2πη(ω) ¯g k ¯ :

γ ³ ´ γ¡ ¢³ ´
ρ̇ S, I = n(ω0 ) 2a † ρ S,I a − aa † ρ S,I − ρ S,I aa † + n(ω0 ) + 1 2aρ S,I a † − a † aρ S,I − ρ S,I a † a
2 2
≡ L ρ S,I , (14.5)

where we have introduced the Lindblad operator L , acting on ρ S .


We now want to evaluate the time evolution of different physical quantities (represented
by Hermitian operators). We first calculate the time dependence of the expectation value of
a:
d d
Tr S a I (t )ρ S, I (t ) = −i 〈ω0 a I 〉 + Tr S L ρ S,I a I ,
¡ ¢ ¡ ¢
〈a〉 (t ) =
dt dt
where the first term on the right hand side derives from ȧ I (t ). The rightmost term gives six
contributions (see the form of L ). The first of these has the form:
γ ³ ´ γ ³ ´ γ ³ ´
n(ω0 )Tr S 2a † ρ S,I aa I (t ) = n(ω0 )Tr S 2a † ρ S,I aae −iωt = n(ω0 )Tr S 2ρ S,I a 2 a † e −iωt .
2 2 2
Working out the next two terms in a similar fashion and adding them, yields the term propor-
tional to n(ω0 ):
γ ¢ γ
n(ω0 )Tr S ρ S,I a I (t ) = n(ω0 ) 〈a〉 .
¡
2 2
The next three terms (proportional to n(ω0 ) + 1) combine into:
γ£ ¤
− n(ω0 ) + 1 〈a〉 ,
2
so that, collecting all terms, we obtain

{ d γ
〈a〉 (t ) = −iω 〈a〉 (t ) − 〈a〉 (t ).
14
{ dt 2
The first term on the right hand side arises from the commutator with the (unperturbed)
Hamiltonian; the Lindblad operator yields the damping term γ. We see that after a long time,
the expectation value of a reduces to a simple oscillation.
We can also work out the relaxation of the energy. This is addressed in problem 2.
14.2. E XAMPLES 209

14.2.2 S PONTANEOUS EMISSION FROM AN ELECTRONIC EXCITATION IN AN ATOM


The environment is in this case composed of the photons of the field. The Hamiltonian of the
environment can be written as
HE = ωk a k† a k
X
k

where we use k as a general index containing information about the wave vector k and the
polarization (there are two transverse polarization modes in vacuum) – the sum is therefore
over all independent modes. We have furthermore put × ≡ 1. The creation and annihilation
operators a k† and a k satisfy the boson commutation relation

[a k , a l† ] = δkl .

The interaction between the electron and the electromagnetic field follows from the Hamil-
tonian of a charged particle in an electromagnetic field:

1 ¡ ¢2
H= p + eA(r, t ) − eϕ(r, t ).
2m
We use the dipole approximation which takes the wavelength of electromagnetic waves to be
much larger than the size of the atom. This is realistic, as the size of the atom is of the order of
Angstroms, whereas the wavelength for light inducing an atomic transition is typically three
orders of magnitude larger.
We thus take
A(r, t ) → A(t ).
The term proportional to A(t )2 is thus a constant oscillating field. We neglect the influence of
this term, which is typically very small. We are then left with the interaction term
e
VSE (t ) = A(t ) · p.
m
and the atomic Hamiltonian is
p2
HS = − eφ(r).
2m
¯ ®
Expressed in terms of the ground state ¯g and the excited state |e〉 of the atom, this Hamilto-
nian takes the form
ωa
HS = σz ,
2
where σz is¯ the Pauli-matrix operator which works in the two-dimensional Hilbert space
spanned by ¯g and |e〉. It is a diagonal operator with eigenvalues 〈e | σz | e〉 = 1 and g ¯ σz ¯ g =
® ­ ¯ ¯ ®

−1. The energy difference between ground and excited state is ωa .


For the vector potential occurring in VSE we have, putting the atom at r = 0:

1 h

i
²̂α a k,α (t ) + a −k,α
X
A(t ) = (t ) ,
2V ²0 ωk,α
p
k

with a k,α (t ) = e −iωk t a(0) (see section 8.3.2).


p Note that we have replaced the integral over the
k-modes by a sum, including the factor 1/ V .
The other part of the interaction Hamiltonian VSE is the momentum. We find a suitable
form for this via a trick. We note that
{
p = im [HS , r] 14
{
¯ ®
as can easily be verified. We therefore find, expressed in the basis ¯g , |e〉
­ ¯ ¯ ® ­ ¯ ¯ ® ­ ¯ ¯ ®
g ¯ p ¯ e = im g ¯ [HS , r] ¯ e = −imωa g ¯ r ¯ e
210 14. T IME EVOLUTION OF THE DENSITY OPERATOR

with ωa = E e −E g . The diagonal matrix elements vanish due to anti-symmetry of the integral:
¯2
r ¯ψ(r)¯ d 3 r = 0 due to the fact that ψ(r) = ±ψ(−r). All in all we see that the operator p can
R ¯

be written as
p = imωa (σ− − σ+ ) g ¯ r ¯ e ,
­ ¯ ¯ ®

where µ ¶ µ ¶
0 1 0 0
σ+ = σ− = .
0 0 1 0
A trivial basis transformation:
¯ ® ¯ ®
¯g → ¯g
|e〉 → i |e〉

transforms the expression for p into

p = mωa (σ− + σ+ ) g ¯ r ¯ e .
­ ¯ ¯ ®

­ ¯ ¯ ®
Note that we have not changed the number g ¯ r ¯ e ; we have only changed the operator
representation. We envisage that the relevant modes which excite or de-excite the atom have
frequencies close to the transition energy: ωk ≈ ωa .
Note that, so far, we have taken the A-field to be time dependent, whereas the σ± were
taken time-independent. In the interaction picture, they vary with time as

σ+ (t ) = σ+ e iωa t ;
σ− (t ) = σ− e −iωa t .

All in all, we find that the interaction Hamiltonian takes the form (with the factors × restored):
s
X ×ωk ¢³ ´
²̂α · g ¯ r ¯ e σ− e −iωa t + σ+ e iωa t a k e −iωk t + a k† e iωk t .
­ ¯ ¯ ®¡
VSE, I (t ) =
k,α 2²0V

Lumping all prefactors into a coupling constant g k , we obtain:

g k (a k e −iωk t + a k† e iωk t )(σ+ e iωa t + σ− e −iωa t );


X
VSE,I =
k

with s
×ωk
²̂α · d,
X
gk = (14.6)
α 2²0V
where d is the matrix element of the dipole moment between the ground and excited state.
Working out the product in the sum gives terms with exp [±i(ωk + ωa )t ] and terms with
exp [±i(ωk − ωa )t ]. Just as in the previous section, we neglect the terms of the first form as
they give a negligible contribution (rotating wave approximation). In this approximation, the
interaction reads: X h i
VSE,I = g k a k σ+ e i(ωa −ωk )t + a k† σ− e −i(ωa −ωk )t .
k

The Hamiltonian describing the system with this interaction is called the Jaynes-Cummings
Hamiltonian. It is omnipresent in systems where bosons interact with fermions.
{ We now have all the ingredients for formulating the Born-Markov equation for this case.
14{ We do this for very low temperatures, in which the electromagnetic field is in its ground state
(all occupations 0). In addition to terms with two creation or two boson annihilation opera-
tors, the expectation value D ¯ ¯ E
¯ †
0 ¯ b k,α b k,α ¯ 0 = 0.
¯
14.2. E XAMPLES 211

Working out all the terms in the Born-Markov equation leads to (we leave out the subscripts
‘I’): Z t
ρ̇ S = − Γ(t − t 0 ) σ+ σ− ρ S (t 0 ) − σ− ρ S (t 0 )σ+ d t 0 + h.c.
£ ¤
(14.7)
−∞
Here h.c. denotes Hermitian conjugate and

Γ(τ) = g k2 e −i(ωk −ωa )τ .


X
k

You are strongly advised to verify this (see problem 1).


In analogy with the harmonic oscillator problem of the previous section, we have

γ
Z ∞
Γ(τ)d τ = − ∆ωa .
0 2

We then obtain the Born-Markov equation:

∆ωa £ γ¡
ρ̇ S (t ) = −i σz , ρ S (t ) + γσ− ρ S (t )σ+ − σ+ σ− ρ S (t ) + ρ S (t )σ+ σ− .
¤ ¢
2 2
We shall neglect the term with ∆ω – this is called the Lamb shift. From the definition of Γ(τ) it
can be seen that the Lamb shift only occurs when g is not symmetrically distributed around
ωk : if there are for example more states in the bath with frequency ωk > ωa than the other way
round, the effective frequency of the atom is slightly shifted upward. Calculating the Lamb
shift is quite difficult; it is one of the major exercises of quantum electrodynamics. We refrain
from going into details here.
The term on the right hand side turns out to generate a non-unitary time evolution. For a
closed quantum system, we expect only unitary time evolutions. The fact that this is not the
case here reflects the leak of information to the environment. Let us make this more explicit
by calculating the time evolution of the state of the atom. This is found from

d
〈σz 〉 = Tr S ρ̇ S σz .
¡ ¢
dt
Using the above equation for ρ̇ S we obtain, with z ≡ 〈σz 〉:

ż = −γ(z + 1),

which decays from a starting value in the excited state z = 1 to −1¯ with ¯2 a decay time 1/γ.
The value of the emission rate can be calculated: it is given as ¯g k ¯ η(ω), with η the density
of states. We can calculate the latter straightforwardly:

V V ω2
Z Z
d 3k = d ω,
X
=
k (2π)3 2π2 c3

so that
V ω2
η(ω) =
.
2π2 c 3
¯2
We furthermore need to calculate ¯ g ¯ d ¯ e · ²̂α ¯ which occurs in the expression for g k see
¯­ ¯ ¯ ®

eq. 14.6. Now we use a symmetry argument to evaluate this expression:


X ¯­ ¯ ¯ ® ¯2 2 ¯­ ¯ ¯ ®¯2 {
α
¯ g ¯ d ¯ e ²̂α ¯ = ¯ g ¯ d ¯ e ¯ .
3
14
{

The factor 2/3 arises because there are


¯­ 2¯ polarization
¯ ®¯2 directions, and for each direction we
consider the component of the vector ¯ g ¯ d ¯ e ¯ along that polarization direction. Averaging
212 14. T IME EVOLUTION OF THE DENSITY OPERATOR

over all polarization directions of the field then leads to 1/3 because isotropy. All in all we
obtain, restoring the factor ×:

2 ωa ¯¯­ ¯¯ ¯¯ ®¯¯2 ω3 |d|2


γ = 2πη g d e = a .
3 2²0 3πײ0 c 3
This is the spontaneous emission rate. ¯ ®
It is instructive to work out the Lindblad operator in the space of the states ¯g and |e〉. In
that space, the density operator is a 2×2 matrix. Using the forms of σ+ and σ− we directly see
that
1 ρ ++ ρ +− ρ −−
µ ¶ µ ¶
−ρ +− /2
ρ̇ S = =γ .
d t ρ +− ρ −− −ρ −+ /2 −ρ −−
From this we can easily calculate the time evolution of the three components of the Bloch
vector. We call these components X , Y and Z :
d γ
〈X 〉 = − 〈X 〉 ,
dt 2
d γ
〈Y 〉 = − 〈Y 〉 ,
dt 2
d
〈Z 〉 = −γ 〈Z 〉 + γ.
dt
Fortunately, these equations are independent and we can solve them at once, finding

〈X 〉 (t ) = e −γt /2 〈X 〉 (0),
〈Y 〉 (t ) = e −γt /2 〈Y 〉 (0),
〈Z 〉 (t ) = e −γt 〈Z 〉 (0) + 1 − e −γt .

14.3 P ROBLEMS
1. Consider a boson bath described by a density matrix ρ B . The expectation value for the
number of bosons in mode k is given by
³ ´
〈n k 〉 = Tr ρ B b k† b k .
³ ´
(a) Show that Tr b k ρ B b k† = 〈n k 〉.
³ ´
(b) Show that Tr b k† ρ B b k = 〈n k 〉 + 1.
(c) Derive Eq. (14.7) of the lecture notes. This equation is derived for the vacuum
state 〈n k 〉 = 0.
(d) Derive the same equation if the system is not in the vacuum state.

2. Energy damping in the damped harmonic oscillator In section 14.2.1, we have derived
the evolution equation for the density matrix of the harmonic oscillator coupled to a
bath of other oscillators. From the result, Eq. 14.5 calculate the time evolution of the
expectation value of the operator a † a.

3. In honour of the Physics Nobel prize 2012 (Serge Haroche and David Wineland), we
consider a cavity quantum electrodynamics system. We consider two atoms (A and
B ) and one single-mode cavity, all with matching transition frequencies. We realise an
indirect quantum measurement of the two-atom system by interacting each atom in
{ sequence (first A, then B ) with the cavity and then measuring the photon number n in
14{ the cavity. We work in the interaction picture that makes the non-interacting Hamilto-
nian terms disappear, leaving us only with the interaction term (in the rotating-wave
approximation):
³ ´ ³ ´
H I = γ A (t ) a † σ−,A + aσ+,A + γB (t ) a † σ−,B + aσ+,B .
14.3. P ROBLEMS 213

Initially, the two-atom system is in state ¯ψ , and the cavity in the n = 0 Fock state. To
¯ ®

make atom A interact with the cavity, we turn on γ A to γ at t = 0, for a time τ – γB is


kept at zero. At t = τ, we turn γ A off and turn γB to γ for another time τ. At t = 2τ, we
turn γB off. After these interactions, we perform a photon-number measurement.

(a) We choose τ so that an excitation in A is fully transferred to the cavity. Please


express τ in terms of γ.
(b) Consider ¯ψ = ¯g A g B (both atoms in ground state). What is the state of the
¯ ® ¯ ®

atoms+cavity universe after the two interaction steps?


(c) Repeat for ¯ψ = ¯g A e B (atom B excited).
¯ ® ¯ ®

(d) Repeat for ¯ψ = ¯e A g B (atom A excited).


¯ ® ¯ ®

(e) Finally, repeat for ¯ψ = |e A e B 〉 (both atoms excited).


¯ ®

(f) Write the three operation elements M n corresponding to measuring n = 0, 1 or 2


photons in the cavity. Verify that these operation elements form a decomposition
of unity.
(g) What is the measurement operator acting in the two-atom Hilbert space giving
the expected value of the measurement
¯ result¯n? Please write this operator as a
® ¯ ® ®
matrix expressed in the basis {¯g A g B , ¯g A e B , ¯e A g B , |e A e B 〉}.

4. Collective coupling in cavity QED


In this problem we return to the Nobel-prize winning field of Cavity QED. We consider
the problem of N identical two-level atoms coupling with equal strength γ to a single
mode of a cavity. The cavity mode frequency is resonant with the atomic transition.
Working in the interaction picture, the effective Hamiltonian in the rotating-wave ap-
proximation is given by

N ³ ´
HI = γ a † σ−,i + aσ+,i ,
X
i =1

where as usual a † and a are creation and annihilation operators for the cavity mode,
and σ+,i and σ−,i are
¯ ®raising and lowering operators for atom i . We denote ¯ ®the ground
state of atom i by ¯g i and the excited state by |e i 〉. Written in the basis {¯g i , |e i 〉},
µ ¶ µ ¶
0 0 0 1
σ+,i = and σ−,i = .
1 0 0 0

The initial state (t = 0) of the system is a 1-photon Fock state of the cavity, with all atoms
in the ground state: ¯ ®
|Ψ(t = 0)〉 = |n = 1〉 ⊗ ¯g 1 ...g N .

(a) Calculate the action of H I2 on ¯ψ(t = 0) .


¯ ®

(b) Give an explicit expression for ¯ψ(t ) by acting with the time evolution operator
¯ ®

exp(−it H I /×) on the initial state. Show that the state of the atoms+cavity oscil-
lates coherently between the initial state and another (normalized) state. Specify
this other state and the oscillation frequency as a function of γ and N .
(c) Imagine instead that the system is initially in the state {
14
{
N
(−1)i ¯g 1 ...g i −1 e i g i +1 ...g N
X ¯ ®
|Ψ(t = 0)〉 = |n = 0〉 ⊗
i =1

and suppose N is even. Describe the time evolution in this case.


214 14. T IME EVOLUTION OF THE DENSITY OPERATOR

5. In this problem we consider the interaction between one particular


¯ ® mode of an electro-
magnetic field with an atom that can be in the ground state ¯g or in the excited state
|e〉. The electric field mode can in practice be realized using a cavity, whose dimensions
precisely accommodate the ¯ single mode, which we assume to have a frequency ω0 . The
®
set of orthonormal states g , n and |e, n〉 forms a basis of the Hilbert space of the cavity
¯
+ atom. Here |n〉 denotes a cavity state containing n photons. In this problem, we as-
sume that the cavity is tuned to¯ the frequency corresponding to the energy difference
®
between the two atomic states ¯g and |e〉. The Hamiltonian of the atom is

×ω A
HA = − σz ,
2
where σz is the Pauli matrix which operates in the Hilbert space spanned by
µ ¶ µ ¶
¯ ®
¯g ≡ 1 0
, |e〉 ≡
0 1

and ω0 = ω A . The interaction between the field and the cavity is given by
³ ´
W = γ σ+ a + σ− a † ,

where coefficient γ sets the coupling strength, a and a † are the creation and annihila-
tion operators for photons in the cavity, and σ± are the operators which move the atom
from ground to excited state (σ+ ) and viceversa (σ− ).

(a) Explain the two processes described by the Hamiltonian W .


(b) Let H0 be the Hamiltonian of the atom plus the field, without the interaction. Give
the eigenstates and eigenvalues of H0 . Give the degeneracies of the eigenstates.
(c) Determine the eigenstates of H = H0 + W and the corresponding energies. Show
that this problem reduces to the diagonalisation of a set of 2 × 2 matrices. You
p
should find the values ×ω0 n ± γ n for the energy eigenvalues.
(d) Now we assume that at t = 0, the cavity is in a coherent state α, and the atom is
in its excited state |e〉, i.e., cavity plus atom are in the state |α〉 ⊗
¯ |e〉. Calculate the
®
probability of finding, at time T , the atom in the ground state g . The result is a
¯
series expansion in n. Plot the series, cutting it off for different values of n.

{
14{
15
(M ORE THAN A ) SURVIVAL GUIDE TO
SPECIAL RELATIVITY

This chapter is more extensive due to the fact that I had these lecture notes lying around from
about ten years ago. It contains more than you need on special relativity. What you need can
be found in chapter 31 of Desai.

15.1 H ISTORY AND E INSTEIN ’ S POSTULATES


The theory of classical mechanics based on Newton’s laws gives an excellent description of
everyday life systems. At speeds of the order of the speed of light, relativistic effects come into
play and necessitate a different formulation of mechanics. It was Einstein who in 1905 gave
a definite physical interpretation to the mathematics which was already known in essence.
This is the special theory of relativity.
The theory of special relativity is based on two fundamental postulates, one of which
holds for classical mechanics as well, and which is known as Galilei invariance. This postulate
pertains to inertial frames. We shall frequently use the name reference frame for the same
thing. An inertial frame is a system which moves at constant velocity, i.e. bodies standing
still in this frame are not subject to any force (that is, the forces acting on it add up to zero).
Obviously the same holds for uniformly moving objects in that frame. Galilei invariance for a
physical theory can be formulated as follows:

• It is impossible to determine the absolute velocity of an inertial frame. Only the relative
velocity of two inertial frames can be determined. Physical laws are the same in all
inertial frames.

Galilei invariance holds for classical mechanics, but not if classical mechanics is combined
with Maxwell’s equations, which describe the phenomena of electricity and magnetism. In
fact, Maxwell’s equations predict electromagnetic waves to move (in vacuum) at the speed of
light. It is not clear from Maxwell’s equations how waves emitted by a source moving with ve-
locity v can be described, as in the context of classical mechanics, these waves should move at
a speed c k̂+v (the unit vector k̂ defines the direction of the radiation) which would mean that
we obtain a speed different from that of light, but this behaviour is not obtained by Maxwell’s
equations. The solution to this problem comes from the notion (or rather the experimental
fact) that the speed of light is independent of the velocity of the source. This is formulated in
the second postulate:

• The speed of light is the same for observers in different inertial frames.

This postulate is a very counter-intuitive one. If you drive a very fast car, the headlights emit
EM radiation at the speed of light. But by an observer standing still at the road that radiation

215
216 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

is also perceived as moving the speed of light! The validity of this postulate has been firmly
established through experiments, of which the Michaelson-Morley experiment stands out as
one of the landmarks of experimental physics. From Einstein’s two postulates the full theory
of special relativity can be derived.

15.2 T HE L ORENTZ TRANSFORMATION


The Lorentz transformation directly follows from the fact that the speed of light is indepen-
dent of the reference frame you are in. The Lorentz transformation for systems with one
space and one time dimension relates the space coordinates, x and x 0 , and the time coordi-
nates, t and t 0 , where the ‘primed’ system is moving with a velocity v = βc with respect to the
‘unprimed’ system. The first Lorentz equation is

x 0 = x/γ − βt 0 , (15.1)

or
x = γ(x 0 + βt 0 ). (15.2)
Here we have introduced γ = (1 − v 2 /c 2 )−1/2 . Another notational convention is β = v/c. Fi-
nally, c is usually taken equal to 1. From now on, we shall conform to these units and this
notation, unless stated otherwise:

β = v/c; c ≡1 (15.3a)
p q
γ = 1/ 1 − (v/c)2 = 1/ 1 − β2 . (15.3b)

A similar formula which relates t to x 0 and t 0 is given by

t 0 = γ(t − βx) (15.4)

which, together with (15.2) gives


t = γ(t 0 + βx 0 ). (15.5)
The conclusion is that for a fixed time t 0 in the observer’s frame, moving clocks which are
synchronised in their rest frame, indicate different times! In other words: synchronous time
has a meaning in one and the same reference frame, but it is not an invariant condition for
different reference frames.
These are the Lorentz equations, which give the relation between space-time points x, t
in the primed and unprimed frame:

t = γ(t 0 + βx 0 ); (15.6a)
0 0
x = γ(x + βt ). (15.6b)

It is clear that the inverse is obtained by swapping primed and unprimed quantities and set-
ting β → −β:

t 0 = γ(t − βx); x 0 = γ(x − βt ). (15.7a)

{ Restoring the factors c in these equations, they read:


15{
15.3. M ORE ABOUT THE L ORENTZ TRANSFORMATION 217

t 0 = γ(t − βx/c); (15.8a)


0
x = γ(x − βc t ). (15.8b)

Having these transformation equations, we can derive a transformation equation for the
velocity u = d r/d t . Consider two reference frames moving with respect to each other. The
relative velocity is oriented along the x-axis. We now calculate the velocity components along
the x and y axis of a particle which moves in the primed system with a velocity u0 .1 It is easy
to see that
d x d γ(x 0 + βt 0 ) u x0 + β
ux = = = . (15.9)
dt d γ(t 0 + βx 0 ) 1 + βu x0
In ‘full units’, i.e. without putting c = 1, this reads:

u x0 + v
ux = . (15.10)
vu x0
1+ c2

Now consider the y-component. This transforms according to


0
d y0 1 uy
uy = = (15.11)
d γ(t 0 + βx 0 ) γ 1 + βu x0

which can be written out as


q u 0y
2
uy = 1 − (v/c) . (15.12)
vu x0
1+ c2
We see that the velocity does not transform according to the Lorentz transformation.

15.3 M ORE ABOUT THE L ORENTZ TRANSFORMATION


The second postulate of special relativity directly leads to the Lorentz transformation. It can
easily be shown that for any two points in space time separated by a distance ∆t in the time
direction and by ∆x in the space direction, Lorentz transformation leaves the quantity

∆s 2 = c 2 ∆t 2 − ∆x 2 (15.13)

invariant. For three spatial dimensions plus one temporal dimension, the invariant quantity
∆s is q
∆s 2 = c 2 ∆t 2 − ∆r 2 , ∆r = ∆x 2 + ∆y 2 + ∆z 2 . (15.14)
We have only given the Lorentz transformation for the case where one frame moves with
respect to the other with a relative speed which is directed along the x-direction. The general
Lorentz transformation is represented by a 4 × 4 matrix. It not only describes relative mo-
tions in different directions, but also rotations and reflections in the spatial part, as this also
preserves the quantity ∆s 2 . We shall not give the full matrix expression here.
The Lorentz transformation can be represented in a graphical way – see figure 15.1. The
left picture shows the transformation starting from a Cartesian (x, t ) space-time, and the right
hand shows the inverse, i.e., starting from the (x 0 , t 0 ) system. In the right picture, a heavy
line is shown at some time t in the un-primed system. The dashed lines show the previous
position in the un-primed system if the line does not move in the unprimed system. In the
moving (primed) system, the line moves in the −x direction and it is contracted: the heavy
dash-dotted line represents this line at some time in the unprimed system.
The fact that ∆s 2 = c 2 ∆t 2 −c 2 ∆r 2 is invariant under the Lorentz transformation enables us
to write physical equations in space-time in a very elegant form. This form involves scalars, {
1 Particle velocities will be denoted by u from now on. For relative velocities between inertial frames we use v.
15
{
218 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

t’ t’
t t

x’

x x’

F IGURE 15.1: Graphical representation of the Lorentz transformation.

vectors and tensors, objects which should not be considered as arrays of numbers only, but as
objects with certain transformation properties. As an example, we introduce the four vector

x µ = (x 0 , x 1 , x 2 , x 3 ) = (t , x, y, z) (15.15)

of which we know that it transforms according to the Lorentz transformation represented by


µ
a matrix L ν :
3
µ
x0 = Lµν x ν ≡ Lµν x ν.
X
(15.16)
ν=0

In this equation, we have introduced the notational convention that repeated upper and
lower indices are summed over. This is the Einstein summation convention.
The quantity ∆s 2 defined in (15.14) looks like a norm in 4-dimensional space-time. The
only difference with the well known norm from Euclidean vector spaces is the minus-sign
in front of the space-components. This minus-sign makes it useful to introduce the metric
tensor:  
1 0 0 0
 0 −1 0 0 
g µν = g µν =  . (15.17)
 
 0 0 −1 0 
0 0 0 −1

Let us now consider the quantity s 2 = c 2 t 2 − x 2 , which is the same as ∆s 2 when one of the two
points is at the origin of space-time. Using the definition of g µν , we can write this as

s 2 = x µ g µν x ν . (15.18)

Note that the Einstein summation convention has been used here. We define this inner prod-
uct as the interval in space-time.
For a Lorentz transformation represented by the linear operator L µ ν , invariance of the
interval under Lorentz transformation gives:
µ ν
s 0 = x 0 g µν x 0 = L µ ν x ν g µρ L ρ σ x σ = x µ g µν x ν .
2
(15.19)

It follows that L µ ν must satisfy


L ρ µ g ρσ L σ ν = g µν . (15.20)
This can be taken as the definition of a Lorentz transformation. Note that the operator can
¡ ¢ ρ
also be written as L T µ (the superscript T denotes the transpose). The equation can there-
fore be written as
{
15{ LT g L = g . (15.21)
15.4. E NERGY AND MOMENTUM 219

In order to avoid having to put g µν and the like in all the equations, we define

x µ = g µν x ν = (x 0 , −x 1 , −x 2 , −x 3 ), (15.22)

so that the interval can now simply be written as

s 2 = x µ xµ . (15.23)

From now on we shall use this notation.


Now suppose we have a vector a µ which transforms according to

a0µ = Lµν aν (15.24)

then a µ transforms as
µ
a 0 = g µν a 0 ν = g µν L ν ρ a ρ = g µν L ν ρ g ρσ a σ ≡ M µ σ a σ . (15.25)

A vector like a µ with a lower index is called covariant. We see that a covariant vector trans-
forms according to M which is related to the inverse Lorentz transformation for a covariant
vector as
M = g Lg . (15.26)
From a covariant vector, we can construct a contravariant vector.ÂăThis is a vector with an
upper index – it is related to a covariant vector by

a µ = g µν a ν .

Below we shall see that for a covariant vector x µ , transforming according to the Lorentz trans-
formation, ∂/∂x µ transforms as a contravariant vector. Therefore we can write ∂/∂x µ ≡ ∂µ ,
expressing the fact that this is a contravariant vector.
Finally, a few remarks about naming conventions. The components 1, 2 and 3 of a four-
vector form its spatial part, the component 0 is the temporal part. These names are also used
when the four-vector is not (t , x) (we shall encounter other examples of four-vectors below).
A four-vector s µ is called time-like when (s 0 )2 > (s 1 )2 + (s 2 )2 + (s 3 )2 , and space-like when this
is not the case.

15.4 E NERGY AND MOMENTUM


In nonrelativistic classical mechanics, physical laws are usually phrased in terms of expres-
sions involving scalar quantities (such as mass or energy) or vector quantities (position, mo-
mentum, angular momentum). Vector quantities are usually derived from the position vec-
tor, and they transform accordingly. For example, a rotation of the three dimensional Eu-
clidean space, acts exactly the same on position vectors as on the momentum. In relativistic
mechanics we want physical laws to be invariant under relative displacements at uniform
speed, which means that we require the equations to assume the same form in any inertial
frame. This can be done in two ways: we can formulate the equations as equalities between
scalars, which should be invariant, or as vector equalities. As the physical vector quantities
we are interested in are assumed to be derived from the positions of the particles involved, it
is natural to assume that these quantities should transform according to the Lorentz transfor-
mations, just as the rotational transformations in nonrelativistic mechanics – i.e., they should
be four-vectors.
In the light of this, when looking for physical laws, it is natural to find these laws as ex-
pressions in terms of four-vectors or scalars. Scalars can be constructed by taking the inner
product of two four-vectors, analogous to Eqs. (15.19) and (15.23). This leads to some scalar
{
quantity q 2 : 15
{
q 2 = a µ g µν b ν = a µ b µ = a 0 b 0 − a · b. (15.27)
220 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

u~1
uy
-u x
11111
00000
-u x 00000
11111
~ u’y
-u y u’1
u1 -u’x 1111
0000
-u’x 0000-u’
1111
u1 y

uy u2 u
~
11111
00000 u2 2
00000
11111
00000
11111
ux
1111
0000
ux
~
u2
0000
1111
-u y
(a)
(b)

F IGURE 15.2: Collision of two particles, (a) in the CM frame, and (b) in the rest frame of particle 1.

Let us now consider conservation of momentum and energy in more detail. The proce-
dure is to start with the nonrelativistic definition of momentum of a point particle of mass m
moving at velocity u:
p = mu. (15.28)
In this expression, the velocity u is defined as the time-derivative of position, and we have
already seen how this quantity transforms [see Eq. (15.10) and (15.12)]:
u x0 + v
ux = (15.29a)
vu 0
1 + c 2x
q u 0y
u y = 1 − (v/c)2 (15.29b)
vu x0
1+ c2

where the relative velocity v of the frames is directed along the x-axis.
The aim is now to construct a four-vector starting from the definition of nonrelativistic
momentum. We know that in the absence of external forces, momentum is conserved in clas-
sical nonrelativistic mechanics. We now want to find a relativistic analogue of momentum
which is also conserved. As we do not yet know how to treat forces in relativistic mechanics,
we consider elastic collisions, where the forces only act at the moment of the collision, and
where energy and momentum conservation forms a general framework for describing the
physics.
We consider the collision shown in figure 15.2. The collision involves two particles, 1 and
2, of equal mass m. In the left hand part, the collision is shown in the frame in which the
centre of mass is at rest (CM frame), whereas the right hand part shows the same collision in
the frame in which particle 2 moves along the y-axis. Note that the right hand frame moves at
velocity u x x̂ with respect to the CM frame. The particle velocities in the CM frame are equal in
magnitude (but with opposite directions). We can now evaluate the velocities of the particles
in the right hand frame by the velocity transformation law. A tilde denotes the velocity after
the collision.
0 0 −2u x
ũ 1x = u 1x = ; (15.30a)
1 + u x2
0 0
ũ 2x = u 2x = 0; (15.30b)
0 0 1 −u y
−ũ 1y = u 1y = ; (15.30c)
γ 1 + u x2
{ 1 uy
15{ 0
−ũ 2y 0
= u 2y =
γ 1 − u x2
. (15.30d)
15.4. E NERGY AND MOMENTUM 221

It is clear that in the CM frame, the total nonrelativistic momentum is conserved (it van-
ishes!). This is however no longer the case in the right hand frame, where the y-component of
the total momentum is different before and after the collision. We search for a modified def-
inition of momentum, which is conserved in relativistic mechanics. The simplest possibility
is to add an extra velocity-dependent factor to the momentum:

p = f (u)mu. (15.31)

If the momentum is conserved we have in the frame in which particle 2 moves along the
y-axis:
f (u 10 )m∆u 1y
0
= f (u 20 )m∆u 2y
0
. (15.32)
where ∆u 1y
0
is the change in velocity due to the collision. It is easy to find expressions for
these changes, using the transformed velocities [Eq. (15.30)]:
2u y 2u y
f (u 10 )m = f (u 20 )m (15.33)
γ(1 + u x2 ) γ(1 − u x2 )
so that
f (u 10 ) 1 + u x2
= . (15.34)
f (u 20 ) 1 − u x2
We now express the right hand side in terms of u 10 and u 20 , which are given by

2
4u x2 + u 2y (1 − u x2 )
u01 = ; (15.35a)
(1 + u x2 )2
2
u 2y
u02 = . (15.35b)
1 − u x2
Eliminating u y , this can be cast in the form
2
1 − u01 (1 − u x2 )2
= . (15.36)
1 − u 0 22 (1 + u x2 )2
The conclusion is therefore that
p
f (u) = Const/ 1 − u 2 = Const · γu (15.37)

where we have introduced γu to distinguish it from γv which is determined by the relative ve-
locity of the two frames rather than the speed of a particle in one reference frame or another.
As we want the momentum to coincide with the nonrelativistic momentum for small speeds
(γ ≈ 1), we have
prel = γu mu. (15.38)
Often, the factor γ is included in the mass:

m(u) = γu m. (15.39)

At u = 0, this relativistic mass assumes the value m, which we call the rest mass.
We now know the expression for the three-dimensional momentum. If the momentum
is a physical quantity involved in acceptable physical laws, it should be part of a four-vector.
Let us assume then that this is the case. What could the zeroth element of this four-vector
be? Consider a particle moving in some (primed) reference frame in the x-direction with
momentum p x0 = mγu x0 . Consider the frame in which this particle is at rest. This rest frame
moves at speed −u x0 with respect to our reference frame. In the rest frame, the momentum
p x is zero, so according to the Lorentz transformation we must have:2

p x = 0 = γ(p x0 − u x0 p 00 ), (15.40) {
2 note that as v = −u x̂, γ = γ , so their distinction is not made in the following.
15
{
x v u
222 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

that is, p 00 = γm. So we take


p µ = (p, p 4 ) = γ(mu, mc). (15.41)
What is the meaning of the first term? Let’s expand this term for low velocities:

(u/c)2 mc 2 + 12 mu 2
µ ¶
γmc = mc 1 + = . (15.42)
2 c

The second term on the right hand side of this expression is the kinetic energy, and the first
term is a constant. In classical mechanics, a constant added to the energy does not change
the physics. Therefore, the first component is the relativistic expression for the energy. For a
particle at rest, we have
E = mc 2 . (15.43)
This is the famous equation which is usually presented as Einstein’s main achievement. Note
that this expression for the energy only holds for particles at rest. For moving particles, the
energy is q
E = γmc 2 = m 2 c 4 + p2 c 2 . (15.44)

The last expression can easily be derived from E = γmc 2 .


It is useful to explicitly show that

E
µ ¶
p µ = (p, p 4 ) = p, (15.45)
c

is indeed a four vector. This is left as an exercise to the reader.


It is possible to formulate the foregoing in a different way, using the concept of proper
time. The proper time is the time as measured in the particle’s rest frame. Because of time
dilatation, we know that when we observe a particle move at speed u, its proper time τ is
related to our time t by
d t = γd τ. (15.46)
Observers in different reference frames have non-synchronised clocks. However, they can
both relate their time to the proper time which itself is independent of the speed of an ob-
server’s reference frame, and is therefore an invariant. In fact, for a point particle not sub-
jected to external forces, the only two relativistic invariants we know are the mass and the
proper time. It now is immediately clear that the vector quantity

u µ = (d t /d τ, d x/d τ) = (γ, γd x/d t ) = γ(1, v) (15.47)

transforms as a covariant four-vector. Multiplying this by the rest mass, we obtain the four-
momentum:
p µ = (p, E /c) = mu µ = (γmv, γm). (15.48)
In the previous section we have shown that the inner product of any two four-vectors is
an invariant quantity. For the four-momentum we find that the inner product of this vector
with itself is:
p µ p µ = p 02 − p2 = γ2 m 2 (1 − v 2 ) = m 2 , (15.49)
and the rest mass m is obviously invariant.

15.5 M ATHEMATICAL STRUCTURE OF SPACE - TIME


We have encountered two different examples of vectors which transform according to a Lorentz
transformation: x µ and p µ . We also have seen that for such a vector a µ , the product:
{
15{ ¡ ¢2 ¡ ¢2 ¡ ¢2 ¡ ¢2
s 2 = a µ aµ ≡ + a 0 − a 1 − a 3 − a 3 (15.50)
15.5. M ATHEMATICAL STRUCTURE OF SPACE - TIME 223

is invariant under the Lorentz transformation. Recall that the components of a µ are:

a 0 = a 0 ; a 1 = −a 1 ; a 2 = −a 2 ; a 3 = −a 3 . (15.51)

The vectors a µ are called covariant and their counterparts a µ are contravariant. Going from
co- to contravariant means swapping the sign of the space-like components of the vectors.
The invariance of the quantity s 2 fixes the form of the possible Lorentz transformation up
to a sign. The Lorentz transformation corresponding to a relative velocity v = βc along the
x-axis has the form
γ
 
0 0 −γβ
 0 1 0 0 
Lµν =  (15.52)
 
 0 0 1 0 

−γβ 0 0 γ
The inverse transformation is obtained by moving in the opposite direction: β → −β. This
last result is immediately clear on physical grounds, and it can be checked explicitly for the
matrix form.
Note that in the form given, the Lorenz transform describes what happens to a covariant
vector:
x 0 µ = L µ ν xν . (15.53)
We can find the form of the Lorentz transform of a contravariant vector by swapping upper-
and lower indices, using the metric tensor g µν = g µν , as we have seen above:
µ
a0 = M µν aν; (15.54a)
µ µρ σ
M ν =g L ρ g ρσ . (15.54b)

In fact, as L for contravariant vectors always occurs with subscripts L µ ν and M with M µ ν , we
can also replace M µ ν by L µ ν without risking confusion. Moreover, this convention allows us
to use g to turn lower indices into upper ones and vice versa, as Eq. 15.54b) shows.
In physics, we often deal with space-time derivatives. An example is the continuity equa-
tion:
∂ρ
∇·j+ = 0. (15.55)
∂t
which involves the spatial gradient ∇ and the time derivative ∂/∂t . It turns out that the deriva-
tives with respect to covariant vectors are contravariant and vice versa. We shall show this
now explicitly.
Let us investigate this for the covariant vector x µ . Suppose we have a function f depend-
ing on x 2 = x µ x µ :
f = f (s 2 ) = f (x µ x µ ). (15.56)
We calculate the gradient of the function f with respect to the vector x µ , applying the
chain rule:
∂ ∂
f (s 2 ) = f 0 (s 2 ) (x ν x ν ) = f 0 (s 2 )2x µ . (15.57)
∂x µ ∂x µ
From the fact that f (s 2 ) is invariant, we conclude that ∂/∂x µ transforms as a contravariant
vector.
We have obtained a few important results:

• We can consider two types of relativistically transforming four-vectors: covari-


ant and contravariant. They are related by a sign change of their spatial com-
ponents.
• The Lorentz transformations for co- and contravariant vectors are each other’s
inverse.
• The gradient with respect to a covariant vector transforms as a contravariant {
vector and vice versa. 15
{
224 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

The last remark leads to the notational convention:


∂ ∂
≡ ∂µ ; ≡ ∂µ . (15.58)
∂x µ ∂x µ

15.6 E LECTROMAGNETIC FIELDS AND RELATIVITY


In section 8.3.1, we have seen that the Maxwell equations in the absence of sources can be
formulated in terms of the vector and scalar potential. In the presence of sources (charge and
current), the equations for the potentials read:

1 ∂2 A(r, t )
−∇2 A(r, t ) + = µ0 j(r, t ); (15.59a)
c 2 ∂t 2
1 ∂2 φ(r, t )
−∇2 φ(r, t ) + 2 = ρ(r, t )/²0 . (15.59b)
c ∂t 2
The electric and magnetic fields can be obtained from these potentials:

B = ∇ × A;
∂A
E = −∇φ −
∂t
where we have left out the position and time dependences for sake of brevity.
The electric and magnetic fields transform in a complicated way under a Lorentz transfor-
mation. The potentials are convenient vehicles for expressing the behaviour of electromag-
netic fields under a Lorentz transformation. In section 15.5, we have seen that the gradient
with respect to the contravariant space-time transforms as a covariant vector and vice versa.
Hence, we can write
1 ∂2
− ∇2 = ∂µ ∂µ ,
c 2 ∂t 2
which is therefore invariant.
We can write the potential formulation of the Maxwell equations, Eqs. 15.59a in the form

∂µ ∂µ A = µ0 j; (15.60a)
µ
C H EC K SIG N !!!!∂µ ∂ φ = ρ/²0 . (15.60b)

It is clear that we can turn this into a relativistically invariant expression if we combine φ, A
and ρ, j into four-vectors. In order to get an idea to do this in a sensible way, note that ρ and j
should satisfy the continuity equation:

∂ρ
∇·j+ = 0,
∂t
which, after introducing the four current J µ :

J µ = (cρ, j),

can also be written as


∂µ J µ = 0.
If we now write A µ = (φ/c, A), then we can write

∂µ ∂µ A ν = µ0 J νC H EC K !!!

which clearly exhibits the relativistically invariant form of the Maxwell equations. In the next
{ chapter, we shall comment further on the relativistically invariant formulation of Maxwell’s
15
{
equations.
15.7. R ELATIVISTIC DYNAMICS 225

15.7 R ELATIVISTIC DYNAMICS


In classical mechanics, we start from Newton’s law of motion and find the appropriate expres-
sion for the force. We could follow a similar approach in relativistic mechanics by replacing
Newton’s law by some relativistically invariant analogue. Newton’s law is a vector equality, so
we want to replace it by a relation between four-vectors. It is now important to realise that
although p µ is a four-vector, d p µ /d t is not. This is because the time t is not a relativistic in-
variant. Therefore, we use the four vector d p µ /d τ, which is an invariant. Newtons’ law now
reads:
d pµ
Kµ = . (15.61)

K µ is a generalisation of the force – in the nonrrelativistic limit, its spatial part should reduce
to the classical force.
The approach described here is used in many textbooks. A problem with it however is
that obtaining the correct expression for K µ requires guessing and hand-waving. Therefore,
we shall follow here a more consistent approach which ties in closely with the first part of this
course as it is based on constructing a relativistically invariant action. Again, we want to find
the dynamic trajectory of a particle as the one which minimises some functional. The action
is a scalar, and if we want to derive a general principle, we should construct it as a relativistic
invariant. This invariant should reduce to the classical action in the norelativistic limit. In
the following derivation we leave factors c in the expressions for clarity.
We start with a particle not subject to a force. We have only two relativistic invariants: the
proper time and the mass. As the classical action is written as an integral over time, we write
it now as an integral over proper time:
Z B
S= d τλ, (15.62)
A

where λ may depend on the rest mass m. Using (15.46), we can write S as
Z B dt
S= λ. (15.63)
A γ

A Taylor expansion of the integrand gives


s
λ v2 v2
µ ¶
= λ 1− 2 ≈ λ 1− 2 . (15.64)
γ c 2c

This is seen to equal the kinetic energy plus a constant (which does not affect the stationary
solutions of S) if
λ = −mc 2 . (15.65)
So we have found Z B dt
2
S = −mc . (15.66)
A γ
as the relativistically invariant expression for the action. In analogy to the classical expression
for the action, we can say that the Lagrangian is given as
s
v2
L = −mc 2 1− . (15.67)
c2

The canonical momentum, defined in section 2.7, is given by

∂L mv
p= =q , (15.68) {
∂v 2
1 − vc 2 15
{
226 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

which corresponds precisely to the expression for the relativistic momentum, found earlier.
The Euler-Lagrange equation for a particle not subject to external forces therefore reads
dp
= 0. (15.69)
dt
The Hamiltonian is found as
H = p · v − L = γmc 2 , (15.70)
which, as we have seen above, is indeed the energy.
Now we want to include a force. As an example we consider the force experienced by
a particle in an electromagnetic field. Strictly speaking, we cannot derive this because we
have not investigated the behaviour of electromagnetic fields in the relativistic limit, which
is a topic beyond the scope of this course. Therefore we shall quote the main result of this
analysis:
The vector (ϕ, A) formed by the electric potential ϕ and the vector potential A,
transforms as a four vector. This four-vector is usually denoted as A µ .
So our task is to construct a relativistically invariant analogue to the classical Lagrangian of a
charged particle in an electromagnetic field:
1
L = mṙ2 + q ṙ · A(r, t ) − qφ(r, t ). (15.71)
2
From the classical expression we immediately guess
s
v2 q
L = −mc 2 1 − 2 − pµ Aµ. (15.72)
c γm
In that case, the action reads:
Z B ³ q ´
S= d τ −mc 2 − p µ A µ , (15.73)
A m
which is immediately seen to be relativistically invariant.
From the Lagrangian, we derive the momentum as
p = γmv − qA. (15.74)
The Euler-Lagrange equations read:
d mv
µ ¶
p − qA = −q∇ϕ + q∇(v · A). (15.75)
dt 1 − v 2 /c 2
Except for the first term on the left hand side, everything is exactly the same as in the classical
derivation, and we immediately find:
d
(γmv) = qE + q(v × B). (15.76)
dt
It is interesting to study the case of a constant, homogeneous electric field in the x-
direction. Then
d
(γmv x ) = qE x . (15.77)
dt
We find
γmv x = qE x t , (15.78)
from which it follows that
qcE x t
vx = q . (15.79)
m 2 c 2 + q 2 E x2 t 2
The acceleration cannot be constant, as the velocity saturates at the value c. In fact, the
acceleration is given as
{ d v qE x
· µ ¶ ¸−3/2
qE x t 2
15{ a=
dt
=
m
1+
mc
. (15.80)
15.8. S UMMARY 227

15.8 S UMMARY
At the end of this chapter, it seems useful to summarize those results that are necessary for
the remainder of these lecture notes.
We have introduced several four-vectors:

• Space-time: x µ = (r, ct );

• Energy-momentum: p µ = (p, E /c);

• Four-current: j µ = (j, cρ);

• Four-vector potential: A µ = (A, φ).

These vectors transform under a Lorentz transformation as follows. For a vector a µ :

a 0µ = L µ ν a ν ,

where a 0µ is the transformed vector and where we have used the Einstein summation conven-
tion:

In the Einstein summation convention, we sum over indices that occur twice in an
expression, once as an upper index and once as a lower index.

The Lorentz transformation is defined by the requirement that it leaves

∆r2 − c 2 ∆t 2

invariant. It includes rotations of the space vectors and ‘boosts’, resulting from changing
the reference frame from the original, ‘unprimed’ one to a second, ’primed’ frame which is
moving with velocity v with respect to the unprimed one. If the relative velocity of the primed
frame with respect to the unprimed frame is v along the x-axis, the Lorentz transformation is

γ
 
0 0 −γβ
 0 1 0 0 
Lµν = 
 
 0 0 1 0 

−γβ 0 0 γ
p
where β = v/c and γ = 1/ 1 − v 2 /c 2 .
All the vectors mentioned so far in this summary had an upper index. We define for such
a vector a µ another vector with a lower index, which has its space part reversed. For the
space-time vector
x µ = (r, c t ), x µ = (−r, c t ).

A vector with an upper index is called covariant. A vector with a lower index is called
contravariant.

A covariant vector transforms into a contravariant vector through the so-called metric tensor,
g:  
−1 0 0 0
 0 −1 0 0 
g µν = g µν =  .
 
 0 0 −1 0 
0 0 0 1
We have, for an arbitrary vector a µ :

a µ = g µν a ν ; a µ = g µν a ν . {
15
{
Here, the Einstein summation convention is obviously used.
228 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

A transformation L of four-vectors is a Lorentz transformation if and only of it satisfies

LT g L = g ,

which should be read as an equation of 4 × 4 matrices.


The beauty of the Lorentz transformation is that it leaves any product of the form a µ b µ
invariant. Such a quantity is an invariant scalar. Also, any four-vector equality a µ = b µ or
a µ = b µ is left invariant under a Lorentz transformation.
Finally, we introduce the four-derivative


∂µ = ,
∂x µ
which has the form (∇, 1/c∂/∂t ). This transforms as a contravariant vector – hence the nota-
tion ∂µ . Likewise,

∂µ = ,
∂x µ
which has the form (−∇, 1/c∂/∂t ), transforms as a covariant vector.

15.9 P ROBLEMS
1. An equation which holds in any reference frame is relativistically invariant. We also
use the term covariant for this property, which is not to be confused with ‘covariant’ as
in covariant/contravariant. Which of the following are covariant equations (give brief
explanations?)?

(a)
∂φ
= Aµ.
∂x µ
(b)
∂φ
= a(c 2 t 2 − r 2 ).
∂x µ
(c)
T µν A ν = B µ .

(d)
A µ = B µνC µ .

(e)
A µ = B µνC ν .

(f)
∂µ A µ = C µ .

(g)
∂µ T ρσµ = 1.

Now prove the following:


µ ¡ µ¢ µ
(h) For a tensor Tν , show that Tr Tν = Tm u is an invariant scalar.
(i) Knowing that the number K = A µνC µν is an invariant scalar for any tensor C µν ,
show that A µν transforms as a tensor.

{
15{
15.9. P ROBLEMS 229

2. In this problem we consider the relativistically invariant formulation of the Maxwell


equations. Physical systems are described by a Lagrangian, and an advanced example
of this idea is the electromagnetic field. The field is described by a four vector which
encapsulates the vector and scalar potential as explained in the lecture notes. The four
vector is A µ = (A, φ/c).
We introduce the electromagnetic tensor F µν :

F µν = ∂µ A ν − ∂ν A µ .

Note that this tensor is anti-symmetric: F µν = −F νµ . We construct the Lagrangian di-


rectly from this tensor:
1
L = − F µν F µν − J µ A µ .
4
The field equations which minimize this Lagrangian are obtained through the general-
ized Euler-Lagrange equations:

∂L ∂L
∂β ¡ ¢= .
∂ ∂β A α ∂A α

(a) Show that


∂F µν
¢ = δµβ δνα − δνβ δµα ,
∂ ∂β A α
¡

(b) Show that the generalized Euler-Lagrange equation leads to

∂β F βα = J α .

(c) Show that this leads directly to the Maxwell equations, formulated in terms of the
potentials:
∂µ ∂µ A ν = J ν .

3. This problem is about Compton scattering.


Individual photons behave like particles of rest mass equal to zero. When a photon
scatters off a free electron, the electron can recoil, taking up both energy and momen-
tum. There is a unique relationship between the energy of the scattered photon, E γ0
and the scattering angle from the incident photon, θ 0 . This relation also depends on
the electron rest mass m e and on the incident photon energy, E γ , see figure. If the
four-momentum of the incident and the scattered photons are denoted by k µ , k 0µ re-
µ
spectively, the statement of energy and momentum conservation is k µ + p e = k 0µ + p e ,

µ 0µ
where p e and p e are the initial and final electron energy-momentum four vectors of
the electron.

E’γ’
Eγ θ’

α
e− {
15
{
230 15. (M ORE THAN A ) SURVIVAL GUIDE TO SPECIAL RELATIVITY

(a) Prove, using k µ k µ = k 0µ k µ0 = 0, that the conservation laws of energy and momen-
tum require that

E γ0 (θ 0 ) = E γ (1−cos θ 0 )
.
1 + m c2
e

(b) Using the above, prove that the maximum wavelenth shift of the photon is given
by
h
λ0 − λ = 2 ,
me c
where h/(m e c) is the Compton wavelength of the electron.

4. (a) Derive, starting from the Maxwell equations as given in section 8.3.1 a second
order equation for the electric and magnetic fields separately. For example, take
the curl of the equation involving ∇×E and use the identity ∇×(∇×E) = ∇(∇·E)−
∇2 E. You should find

1 ∂2 1 ∂ j
µ ¶ · µ ¶ ¸
2
− ∇ E = −4π + ∇ρ .
c 2 ∂t 2 c ∂t c

(b) The operator on the left-hand side of the last equation is relativistically invari-
ant. Show that on the right-hand side, we have an expression proportional to
∂µ J ν − ∂ν J µ with µ = 0 and ν = i = 1, 2, 3. From this it follows that E must be three
components of a tensor F µν .
(c) The tensor F µν has the form
 
0 −B z By Ex
 Bz 0 −B x Ey 
.
 
−B y Bx 0 Ez

 
−E x −E y −E z 0

By deriving an equation for B similar to that for E, verify that

∂ρ ∂ρ F µν = C ∂µ j ν − ∂ν j µ ,
¡ ¢

where C is a constant.

{
15{
16
T HE K LEIN -G ORDON AND M AXWELL
EQUATIONS

16.1 T HE K LEIN -G ORDON EQUATION


In an attempt to generalise the Schrödinger equation to relativistic problems, one may use
the standard representation of the momentum operator in quantum mechanics

p = −iħ∇,

together with the ‘energy operator’, to construct a four-vector, analogous to what we have
in special relativity. What would be a suitable form of the energy operator? We can use two
ideas for guiding us: first, from the fact that p involves a spatial derivative, we see that the
fourth component must be a time-derivative because the gradient with respect to a covariant
space–time vector four vector is a contravariant vector. All in all, this leads to the four vector

p µ = iħ∂µ


where, as usual, ∂µ = (∇, 1c ∂t ).
Another idea may be the fact that for an eigenstate of the Schrödinger equation with def-
inite energy E , we know that
∂ ¯
iħ ¯ψ(t ) = E ¯ψ(t )
® ¯ ®
∂t

which suggests the identification iħ ∂t with the fourth component of the energy-momentum
four vector.
Now we use the general result
p µ pµ = m2c 2
of special relativity to write down the so-called Klein-Gordon equation:
" µ ¶2 #
2 Ê ¯ψ = m 2 c 2 ¯ψ .
¯ ® ¯ ®
−p̂ +
c

This is an analogue of the Schrödinger equation for free particle at relativistic speed.
In section 7.3, we have analyzed the mass conservation problem and concluded that for
a density given by ¯2
ρ(r, t ) = ¯ψ(r, t )¯
¯

and a current
ħ £ ∗
ψ (r)∇ψ(r) − ψ(r)∇ψ∗ (r) ,
¤
j=
2im

231
232 16. T HE K LEIN -G ORDON AND M AXWELL EQUATIONS

the continuity equation


∂ρ
+∇·j = 0
∂t
holds.
We now would like to play the same game using the Klein-Gordon equation. It turns out
that this fails as this equation does not involve the first time derivative like the Schrödinger
equation, but the second time derivative. The best option seems to preserve ¯ the ¯2 expression
for the flux and see what we then obtain for the material density (which was ¯ψ(r)¯ in the case
of Schrödinger quantum mechanics). This can be done by taking the Klein-Gordon equation
and its conjugate, and subtracting the two;

×2 ×2 £ ∗ 2 ×2 £ ∗ 2
∇ · ψ∗ ∇ψ − ψ∇ψ∗ = ψ ∇ ψ − ψ∇2 ψ∗ = ψ ∂t ψ − ψ∂2t ψ∗ = −∂t ρ,
£ ¤ ¤ ¤
∇·j = 2
2im 2im 2imc
which then leads to
∂ψ(r, t ) ∂ψ∗ (r, t )
· ¸
× ∗
ρ=− ψ (r, t ) − ψ(r, t ) .
2imc 2 ∂t ∂t
For a free particle state ψ(r, t ) = C exp [i (k · r − ωt )], this leads to
E
ρ= |C |2 .
mc 2
This forms nicely a four-vector with the current
p
j= |C |2 .
m
This four-vector is then

jµ =
|C |2 .
m
The expression for the current density suggests, by analogy to the nonrelativistic case,
that an ‘inner product’ can be defined as follows:


Z
ψ1 |ψ2 = i ψ∗1 (x) ∂ t ψ2 (x)d 3 r.
­ ®

In this expression, x denotes the space-time four vector, and


∂ψ1
¶ µ ∗¶
∗ ∂ψ2
µ
∗ ←→
ψ1 (x) ∂ t ψ2 (x) = ψ1 − ψ2 .
∂t ∂t
From this inner product, we can determine the normalisation of a plane wave state. This
p
turns out to be 1/ 2ωk , with ×ωk = E . With proper units we have
1 1
ψ(x) = p e −ip·x . (16.1)
2ωk (2π×)3/2
It is important to appreciate the consequences of the fact that the energy operator occurs
with a square in the Klein-Gordon equation, whereas it occurs with power 1 in the Schrödin-
ger equation. The difference is clear from the dispersion relation:
q
E = ± p 2c 2 + m2c 4,

showing that E occurs with two signs: for every positive energy, there exists also a negative
energy. This differs quite significantly from the situation in the Schrödinger equation: as
the momentum p 2 is not bounded from above, the energy can become infinitely negative.
This means that there is no well-defined ground state, and that we have an ‘energy hole’, into
which energy can flow ad inifinitum. This is obviously not physical.
Another problem follows from the fact that the density is proportional to the energy.
Therefore, negative densities are possible, which does not seem sensible either.

{
16{
16.2. A NALOGY WITH THE M AXWELL EQUATIONS 233

16.2 A NALOGY WITH THE M AXWELL EQUATIONS


The structure of the Maxwell equations is quite analogous to that of the Klein-Gordon equa-
tion as we have seen in the previous chapter. Here we shall discuss the relativistic formula-
tion of Maxwell’s equations more in-depth. We start from the relativistic formulation of the
Maxwell equations. Note that we can express the electric and magnetic field completely in
terms of the scalar and vector potential, which together form a four vector:

A µ = (A, ϕ/c).

The fields are found from

B = ∇ × A;
1 ∂A
E = −∇ϕ − .
c ∂t
The Maxwell equations in the absence of electric charges and currents can conveniently be
written in terms of the tensor
F µν = ∂µ A ν − ∂ν A µ . (16.2)
Note that this expression does not change under a gauge transformation

A µ → A µ − g ∂µ χ(x).

The Maxwell equations read:


∂µ F µν = 0
as we shall now show.
Working our this formula in terms of the four-vector potential gives

∂µ ∂µ A ν + ∂ν ∂µ A µ = 0.
¡ ¢

An appropriate gauge transformation (called the Lorenz gauge), ensuring ∂ν A ν to vanish,


eliminates the second term in this formulation of Maxwell’s equations in the absence of cur-
rents and charges, leaving
∂ν ∂ν A µ = 0.
This now looks remarkably similar to the Klein-Gordon equation, if this is written in the form

∂ν ∂ν φ = m 2 φ.

The fact that the right hand side vanishes in the case of the Maxwell equations shows that the
particles described by these equations have zero mass. These particles are the photons.
The operator ∂ν ∂ν = c12 ∂2t − ∇2 is usually written as ä.
Note that the Maxwell equations are treated here completely on the classical level! The
Laplace operator in the Klein-Gordon equation comes about as a result of the quantisation
of the momentum operator p = −i×∇. The classical Maxwell equations have this operator on
board without quantization. This difference is also clear from the fact that the Klein-Gordon
equation contains Planck’s constant, whereas the Maxwell equations do not.
Another important remark is that a right hand side containing current and charge sources
does not introduce a mass in the Maxwell equations – a mass term would have the form m A ν ,
which is essentially different from an external source term.

{
16
{
234 16. T HE K LEIN -G ORDON AND M AXWELL EQUATIONS

16.3 S OURCE TERMS


The Maxwell equations in the presence of charges and sources read

äA µ (x) = J µ (x).

Such an inhomogeneous equation can be solved in terms of the Green’s function D F which is
defined as
äD F = δ(c t )δ(3) (r) = δ(4) (x).
With this definition, it can be verified that
Z
µ
µ
A = A 0 (x) + D F (x − x 0 )J µ (x 0 )d 4 x 0

µ
where A 0 (x) is a solution of the Maxwell equations without sources, solves this equation with
sources.
The Green’s function (or free propagator) D F can be found by Fourier-transforming:

1
Z
D F (x) = − 4 D F (p)e −ipx d 4 p

where p µ = (p, E /c) is the momentum four vector. Transformations from p to x and vice versa
always hinge upon the expression for the delta-function:

1
Z
δ(4) (x) = e −ipx d 4 p.
(2π)4

Considering the Fourier transform φ̃(p) of a function φ(x), where these two are related
through:
1
Z
φ(x) = − 4 φ̃(p)e −ipx d 4 p,

we see that ∂µ φ(x), gives for the Fourier transform the result i×p µ φ̃(p). Extending this result
to the present case, we see that the free propagator for space-time can be found as

1
D F (p) = .
p2

Note that we do not use a tilde f̃or denoting the Fourier transform of D F – the argument
tells us whether we deal with the function in direct or in reciprocal space. This expression is
singular at the origin, which we usually get rid off by writing

1
D F (p) = ,
p 2 + i²

which shifts the singularity slightly off the real axis. In that case we can use the tricks from
chapter 1 in order to calculate the Green’s function:

1 e −ipx 4
Z
D F (x) = − 4 d p.
2π p 2 + i²

We can also add a source term to the Klein-Gordon equation. This does not immediately
have a physical interpretation, but the propagator can play a useful role in solving the prob-
lem of a Klein-Gordon particle in a (weak) potential.
For the Klein-Gordon equation, an analysis similar to that of the Maxwell equations leads
to
1 e −ipx
Z
∆F (x) = − 4 d 4 p.
2π p 2 − m 2 + i²

{
16{
16.4. S TATIC SOLUTIONS FOR THE PROPAGATORS 235

We see that the Maxwell propagator is a special case of the Klein-Gordon propagator with
m = 0.
The Klein-Gordon propagator can be reworked to
" Z −ip·(x−x 0 ) Z ip·(x−x 0 ) #
i e e
∆F (x − x 0 ) = θ(t − t 0 ) d 3 p + θ(−t + t 0 ) d 3p .
(2π)3 2ωp 2ωp

where ωp =
p
p 2 + m 2 . We leave this form for future reference.

16.4 S TATIC SOLUTIONS FOR THE PROPAGATORS


It is useful at this point to recall the forms of the propagators in the static limit (no t -dependence).
The equation for the Maxwell propagator:

∇2 φe (r) = δ(3) (r)

is the Poisson equation for a point charge (up to a factor of 4π). We have used the notation
φe to distinguish the static propagator from the dynamic one D F . The solution is given by

1
φe (k) = .
k2
In real space, this becomes
1 1
φe (r) = .
4π r
In 1D, this would be φe (x) = |x| and in 2D, it is φe (r) = 2π ln(r ).
Now we turn to the static Klein-Gordon equation. This has the form
¡ 2
∇ − m 2 φ(r) = δ(3) (r).
¢

This is a Helmholtz equation. The solution is given by

1
φ(k) = .
k 2 + m2
In real space, this becomes
1 e −mr
φ(r) ,
4π r
which is consistent with the solution of the Poisson equation for m → 0.
We see that zero-mass gives a power-law decay of the potential, whereas the effect of the
mass is to set a finite range beyond which the potential decays exponentially to zero.

16.5 S CATTERING AS AN EXCHANGE OF VIRTUAL PARTICLES


The Green’s function formulation of scattering results in a nice interpretation of scattering
processes as we shall explain in this section. Remember that in the Born approximation, the
scattering amplitude is given as

m
Z
f (θ) = − e −iq·rV (r0 )d 3 r 0 .
2π×2

Here, q = k − k0 where k is the incoming, and k0 the outgoing momentum. Note that in the
elastic case, |k| = |k0 |; therefore we have

q = 2k sin(θ/2),

where θ is the angle between k and k0 .

{
16
{
236 16. T HE K LEIN -G ORDON AND M AXWELL EQUATIONS

In the case of Rutherford scattering (V (r ) = −Z e 2 /r ), the scattering amplitude becoms


¶2
2mZ e 2
µ
1
f (θ) = ,
×2 q2
which is reminiscent of the propagator for the zero-mass Klein-Gordon equation (or for the
Maxwell equations), which makes sense as the scattering amplitude in the Born approxima-
tion is just the Fourier transform of the interaction potential. The fact that this interaction is
described as a Klein-Gordon solution with zero mass, suggests an interpretation of the term
D(q) = 1/q 2 in the expression as a massless particle, the photon.
In the case of a Yukawa potential

Z e 2 e −mr
V (r) = −
r ,
the factor 1/q 2 is replaced by 1/(q 2 + m 2 ) which derives from a Klein-Gordon equation with
mass m. In this case the ‘virtual particles’ responsible for the effect of the interaction carry
a mass m. An example are mesons, elementary particles that are responsible for the interac-
tions between protons and neutrons.

16.6 P ROBLEMS
1. The nonrelativistic limit of the Klein-Gordon equation
The Klein-Gordon equation is compatible with an energy-momentum relation

E 2 = p 2c 2 + m2c 4.

(a) Show that, in the non-relativistic limit, this reduces to the sum of the rest energy
and the classical expression for the kinetic energy.

In order to study the non-relativistic limit of the Klein-Gordon equation, we write the
2
solution to that equation in the form ψe −imc t /× .

(b) Show that this leads to the Schrödinger equation to be obeyed by ψ.

2. Klein-Gordon equation and Coulomb potential


It is interesting te study the ’Klein-Gordon hydrogen atom’, where we use the Coulomb
potential in this relativistic wave equation:
¸2
∂ Z e2
·
i× + ψ(r, t ) = −×2 c 2 ∇2 ψ(r, t ) + m 2 c 4 ψ(r, t ).
∂t r

(a) Write ψ(r, t ) = R(r)Ylm (θ, φ)e −iE t /× and show that this leads to the following equa-
tion for R(r ):
2 2
³ ´
·
1 d
µ
d

l (l + 1)
¸ E + Zre − m2c 4
2
r + R(r ) = .
r 2 dr dr r2 ×2 c 2
Here, you may look into the analysis of the Schrödinger equation for a spherically-
symmetric potential (e.g. Griffiths Ch. 4) where the analysis of the Laplace opera-
tor via a separation of variables using Ylm is done in a similar way.
(b) Writing
Z e2 4 m2c 4 − E 2 2E γ
¡ ¢
2
γ= ; α = 2 2
; λ= ,
×c × c ×cα
and ρ = αr , show that the radial Klein Gordon equation of part (a) can be written
as
1 d 2 d λ 1 (l (l + 1) − γ2
½ µ ¶ · ¸¾
ρ + − − R(ρ) = 0.
ρ2 d ρ dρ ρ 4 ρ2

{
16{
16.6. P ROBLEMS 237

(c) Now write R in the form R(ρ) = f (ρ)g (ρ)v(ρ), where f and g are the correct solu-
tions for R for small and large ρ respectively:
(
f (ρ) = ρ s for ρ → 0;
R(ρ) → −ρ/2
g (ρ) = g (ρ) = e .

Prove this and give an equation for s in terms of l and γ. We now have

R(ρ) = ρ s e −ρ/2 v(ρ).

(d) As usual, we expand v(ρ) as a power series ∞ a ρ k . By putting this into the
P
k=0 k
radial Klein-Gordon equation, derive a recurion relation for the a k . Requiring
that for k ≥ N , a k = 0, derive that
¶−1/2
γ2
µ
2
E n = mc 1+ 2 ,
n

where n = N + s + 1.
(e) Expand the energy found in (d) to fourth order in γ2 and show that the zeroth
order gives the rest energy, and the second order the Rydberg series. Calculate a
formula for the lowest relativistic correction to the Rydberg formula. You should
find
mc 2 ³ n ´
−γ4 4 − 3/2 .
n l + 1/2
Hint: First show that s = l − γ2 /(2l + 1).

{
16
{
17
T HE D IRAC EQUATION

17.1 I MPROVING ON THE K LEIN -G ORDON EQUATION


In the previous chapter we have seen that the problem with generalising the Schrödinger
equation to the relativistic case is two-fold:
• For every positive energy, there is also a negative emergy of the same modulus, so that
the spectrum is not bounded from below.

• The density may become negative.


Dirac tried to solve these issues by postulating a linear form of the Hamiltonian operator
in terms of the momentum and energy:
∂ψ
µ ¶
×
i× = α · ∇ + βm ψ = H ψ.
∂t i
The αi and β are fixed by satisfying the following physical requirements.
• The free-particle wave functions ψ should be a plane wave with a wave vector satisfying
E 2 = p2 c 2 + m 2 c 4 .

• There exists a four-vector current density j µ whose fourth component should be a pos-
itive density.
Before we continue we adopt the usual convention in relativistic quantum mechanics to put
× = c = 1. This defines particular units in which we work. Any physically meaningful result
which involves a physical dimension like energy, time etcetera, can be transformed into con-
ventional units by multiplying with suitable powers of c and ×.
Let us work out the first condition. We work this out, interpreting the Hamiltonian as the
energy, and operating with H 2 on ψ. We then find conditions for αi and β from the require-
ment that the following equation must hold:

p 2 + m2 = αi p i α j p j + β2 m 2 + αi p i β + β αj p j ,
X¡ ¢X¡ ¢ X¡ ¢ X¡ ¢
(17.1)
i j i j

where the indices i , j run from 1 to 3. I have deliberately respected the order in which the
different terms occur in the products as it turns out that commuting objects αi and β do not
lead to the desired result. This can directly be seen from the following conditions that should
be obeyed in order to satisfy the equation (17.1) (note that {a, b} denotes an anti-commutator
ab + ba):

αi , α j = 0 for i 6= j ;
© ª

αi , β = 0;
© ª

α2i = β2 = 1.

239
240 17. T HE D IRAC EQUATION

The anti-commutation rules immediately block the possibility to use scalars for the αi
{
and β. Moreover, they suggest to involve the Pauli matrices, which also anti-commute. How-
17
{
ever, together with the unit matrix, the Pauli matrices form an independent set of four objects,
but the unit matrix fails to anti-commute with the Pauli matrices.
The simplest option to satisfy the conditions, is to choose four-dimensional matrices.
Then, several acceptable forms of these matrices can be found – for example:

σi
µ ¶
0
αi = ;
σi 0
1 0
µ ¶
β=
0 −1

where 1 denotes the 2 × 2 unit matrix.


This form of the αi and the β has as a consequence that the wave function becomes four-
dimensional. Note that this four-dimensionality has no direct relation to space-time being
four-dimensional. The free-particle wave function ψ can now be written as
µ ¶
u1
ψ(r) = e −ip·x ,
u2

where u 1 and u 2 are two-dimensional vectors. They are chosen because of the block structure
of the 4 × 4 matrices αi and the β. Writing the Dirac equation for the components of ψ, we
obtain for u 1 and u 2 :

σ · p u 2 = 0;
(E − m) u 1 −σ (17.2)
σ · p u 1 + (E + m)u 2 = 0.
−σ (17.3)

It is important to realise that the relativistic equation for the energy, E 2 = p 2 + m 2 still allows
for both positive and negative energies. Let us first consider the energy to be positive, and
the momentum to be small (this is the nonrelativistic limit). In that case, we see that the size
of u 2 is much smaller than that of u 1 . So the positive energy solution causes – in the weakly
relativistic limit – one component of the four-spinor ψ to be much smaller than the other.
For negative energies, the roles are reversed and u 2 is the large component and u 1 the small
one. We now change notation and speak henceforth of the large component u L and the small
component u S .
A second important note is that we can combine the two equations for u L and u S into a
σ · p)(σ
single one for u L . Noting that (σ σ · p) = p 2 (verify this!) we find

p2
µ ¶
|E | − m − u L = 0.
|E | + m

In the nonrelativistic limit and positive energy, E , m À |E − m|, we see that

p2
u L ≈ (E − m)u L ,
2m
so that, as required, we recover the Schrödinger equation with energy E − m.
As for both u L and u S , their nonrelativistic Hamiltonian commutes with the Pauli matrix
σz , we can assign a spin-value to u L and to u S . The two components of u L are identified with
the two spin states along the z-axis, corresponding to the eigenstates of σz :
µ ¶ µ ¶
1 0
u L+ = ; u L− = ,
0 1
17.2. T HE PROBABILITY DENSITY 241

and similar for u S± . We use the Dirac equation in the form (17.2) to find the full (non-normalized)
{
four-vectors 
1
 17
{
 0 
u + (p) = C  p z (17.4a)
 

 |E |+m 
p x +ip y
|E |+m

and  
0
 1 
u − (p) = C  . (17.4b)
 
p x −ip y
|E |+m
 
−p z
|E |+m

For the negative energy solutions we can play the same game to arrive at
 −p z 
|E |+m
p x +ip y
− |E |+m
 
u + (p) = C  (17.4c)
 

 1 
0

and  p x −ip y 
− |E |+m
 pz 
u − (p) = C  |E |+m . (17.4d)
 
 0 
1

17.2 T HE PROBABILITY DENSITY


One problem with the Klein-Gordon equation was the non-positiveness of the density. We
now construct a density for the Dirac equation by requiring again that

∂ρ
+ ∇ · j = 0,
∂t

where ρ = ψ† ψ (which is always positive). The wave functions satisfy the Dirac equation

∂ψ
i α · ∇ψ + βmψ
= H ψ = −iα
∂t
and its Hermitian conjugate is

∂ψ† ³ ´
−i = i ∇ ψ† ·α
α + βmψ† ,
∂t
which directly leads to
∂ρ ³ ´
= −∇ · ψ†α ψ ,
∂t
so that we identify j = ψ†α ψ.
It is very important that we have succeeded in finding a four-current with a density ρ
which is always positive. This means that we have solved the problem of the sometimes neg-
ative densities of the Klein-Gordon equation! The problem of the negative energy states still
remains. We shall come back to this later in this course.
242 17. T HE D IRAC EQUATION

{
17.3 A NEW FORM FOR THE D IRAC EQUATION
17
{ The Dirac equation can quite trivially be reformulated by introducing the so-called gamma
matrices, γµ . These are 4 × 4 matrices that are expressed in terms of the α j and β matrices:
γ4 = β; γ j = βα j ,
from which we easily find the explicit form of these matrices:
 
0 0 0 1
 0 0 1 0 
γ1 =  ,
 
 0 −1 0 0 
−1 0 0 0
 
0 0 0 −i
 0 0 i 0 
γ2 =  ,
 
 0 i 0 0 
−i 0 0 0
 
0 0 1 0
 0 0 0 −1 
γ3 =  ,
 
 −1 0 0 0 
0 1 0 0
 
1 0 0 0
 0 1 0 0 
γ4 =  .
 
 0 0 −1 0 
0 0 0 −1
Note that the first three of these have the form
σj
µ ¶
0
.
−σ j 0
The Dirac equation now takes the simple form
¡ µ
γ p µ − m ψ(r, t ) = 0.
¢

In this equation, p µ is the operator i×∂µ (time-dependent form) or (E , −i×∂ j ), j = 1, 2, 3 in the


stationary case (when ψ is a solution at energy E ).
We now make a very important remark: Although this equation looks relativistically in-
variant, this is not immediately clear from its form! It should be emphasised that the γµ do
NOT transform under a Lorentz transformation, only the p µ and the ψ do. The transforma-
tion of the four vector ψ is necessary to obtain a relativistically invariant equation. We shall
not go into details here (see section 35.7 of Desai’s book) but it is necessary for guaranteeing
covariance, that under a Lorentz transformation
µ µ
x 0 = Lν x ν,
ψ transforms according to a 4 × 4 matrix S:
ψ0 (x 0 ) = Sψ(x)
where S obeys the requirement that
S −1 γµ S = L µ ν γν .
We list some properties of the gamma matrices:
¡ 4 ¢2
γ = 1;

© µ νª
γ , γ = 2g µν ; (17.5)
and
¡ µ ¢†
γ = γ4 γµ γ4 .
17.4. S PIN -1/2 FOR THE D IRAC PARTICLE 243

17.4 S PIN -1/2 FOR THE D IRAC PARTICLE {


The angular momentum operator can be viewed as the ‘generator of rotations’. With this 17
{
statement, we mean that, for a rotation around the z-axis for example, the rotation operator
about an angle α can be written as

R = exp(i αL z /×).

Note that we have restored the × in the equation – this will turn out useful in this section.
This means that any component of the angular momentum operator commutes with any
spherically symmetric operator. Note that angular momentum comes in two ‘flavors’: the
orbital angular momentum is defined as

L = r×p

and an additional type of angular momentum which has no analogue in classical mechanics:
the spin.
We consider a spherically symmetric problem, defined by a potential V (r ) and evaluate
the commutator between the orbital angular momentum component L j and the Hamilto-
nian
α · ∇ + βm + V (r ).
H = −iα
Due to the spherical symmetry, we immediately verify that the only non-vanishing com-
mutator is
3
[L j , H ] = α · [L j , p] = αk [L j , p k ].
X
k=1

The expression on the right hand side can be evaluated from and from [p j , r k ] = −i×δ j k , and
leads to the result
α × p).
[L, H ] = i×(α
We see that the orbital angular momentum does not commute with the Hamiltonian,
which is quite strange in view of the rotational symmetry the Hamiltonian should have. The
solution to this apparent paradox is the notion that we have neglected the spin, which is
an essential part of the total angular momentum. A natural form of the spin is the vector
operator with components
σj 0
µ ¶
0
σj = .
0 σj
Here j = x, y or z and σ0j is a 4 × 4 matrix containing two Pauli spin matrices on the diagonal.
Note that the prime with σ indicates that we are dealing with a 4 × 4 matrix containing two
Pauli spin matrices. Note the difference with the boldface σ which denotes a vector operator
with the three 2 × 2 matrices σx , σ y and σz as its components. The vector operator σ 0 has the
three 4 × 4 matrices σ0j , j = x, y, z as its components. Using the relation [σ j , σk ] = 2i² j kl σl ,
the commutator of this operator with H can be worked out straightforwardly and the result
is
σ0 , H ] = 2i(p ×α
[σ α).
Combining this with the commutation relation for the orbital angular momentum operator,
we obtain ·µ ¶ ¸
×
L + σ 0 , H = 0,
2
i.e. the Hamiltonian does commute with the total angular momentum, provided we assume
σ0 /2. The total angular momen-
that the particles have spin-1/2, described by the operator ×σ
tum
×
J = L + σ0
2
therefore commutes with H .
244 17. T HE D IRAC EQUATION

{
17.5 T HE HYDROGEN ATOM
17
{ In section 17.1 we have found the solutions of the Dirac equation with a constant potential:
they are plane waves with a four-dimensional amplitude vector. The constant potential is
also a standard problem that we can solve for in the Schrödinger equation case. Other prob-
lems for which the Schrödinger equation can be solved exactly are the harmonic oscillator
and the hydrogen atom. It turns out that the hydrogen atom can also be solved for in the
case of the Dirac equation. This turns out to be a very useful exercise as the Dirac equation
contains the spin, which should give us important parts of the solution such as the spin-orbit
coupling. For the full solution, we refer to Desai, sections 34.3-6. Here we shall restrict our-
selves to considering the weakly relativistic limit, and show that all the fine structure terms of
the hydrogen atom naturally emerge from the Dirac equation.
Similar to the standard approach one usually follows to obtain a stationary Schrödinger
equation, we construct a stationary Dirac equation by writing

ψ(r, t ) = exp(−iE t )φ(r)

which, after plugging this into the Dirac equation leads to the following stationary equation:

α · p − βm − V (r ) φ(r) = 0,
£ ¤
E −α

where we have replaced E by E − V (r ) to account for the (spherically symmetric) potential.


Note that the solution to this equation is a four vector. Writing this solution as

φL
µ ¶
φ= ,
φS

where S and L stand for small and large respectively, and plugging this into the Dirac equa-
tion, we obtain
σ · pφS = 0;
(E − m − V ) φL −σ
σ · pφL + (E + m − V ) φS = 0.
−σ
We now show that the in the weakly relativistic limit the fine structure terms emerge from this
equation.
To this end, we take E − m ¿ m. We define E T = E − m to rewrite the two equations we
just derived in the form
(E T − V ) φL = σ · pφS ;
(2m + E T − V ) φS = σ · pφL .
Now we use the second of these equations to eliminate φS in the first:
1
(E T − V ) φL = σ · p σ · p φL .
¡ ¢ ¡ ¢
2m + E T − V
It would seem that we have a weakly relativistic Dirac equation for the spinor represented
by the large component which has the form similar to the stationary Schrödinger equation.
However, there is a subtle problem with this equation. Usually when we solve an equation like
this, we search for a solution which is normalized to 1. To see this, imagine that we minimize
the expectation value of the energy:

ψ |H | ψ
­ ®
〈E 〉 = ­
ψ|ψ
®

with respect to ψ. This is most easily done by varying ψ → ψ+δψ and then requiring that the
first-order Taylor terms of vanish. Some calculation leads to two equations. The first is

ψ|H |ψ ¯ ®
­ ®
H ψ = ­ ® ¯ψ
¯ ®
¯
ψ|ψ
17.5. T HE HYDROGEN ATOM 245

and the second is the Hermitian conjugate of this equation (which does not yield new infor-
{
ψ|H ψ|ψ
­ ® ­ ®
mation). Replacing then |ψ / by E , the standard stationary Schrödinger equation
17
{
H ψ = E ψ is recovered. Obviously, the problem can also be solved by finding the mini-
¯ ® ¯ ®
¯ ¯
mum of the numerator under the condition that ψ|ψ is normalized to 1. However, in the
­ ®

present case, it is φ, and not φL which is normalized to 1. Using

σ · p σ · p = σ j p j σk p k = p2 ,
¡ ¢¡ ¢

we can write, using m À E ,V :


¿ ¯ 2 ¯ À
¯ p ¯
φS |φS = φL ¯¯ ¯ φL ,
­ ®
4m 2 ¯
so that
2
† p
Z Z · ¸
† 3
¯ ¯2
φ φd r = 1 = ¯φL + φL
¯ φL d 3 r.
4m 2
Therefore we now introduce a two-spinor function ψ in terms of φL :

p2
µ ¶
ψ ≈ 1+ φL
8m 2
¯ ¯
which is properly normalized to 1 for ¯p/m ¯ ¿ 1. Inverting this relation we have

p2
µ ¶
φL ≈ 1 − ψ.
8m 2

We shall consistently neglect corrections of the order p 4 /m 4 . Rephrasing our weakly relativis-
tic Dirac equation as an equation for the properly normalized function ψ, we obtain

p2 p2
µ ¶ µ ¶
1
ψ σ σ ψ.
¡ ¢ ¡ ¢
(E T − V ) 1 − = · p · p 1 −
8m 2 2m + E T − V 8m 2
Approximating the term on the right hand side within our consistent approach:
1 1 ET − V
≈ − ,
2m + E T − V 2m 4m 2
we obtain after some calculation

p2 σ ·p V σ ·p
· 2
p4 p2
¡ ¢ ¡ ¢¸
p
µ ¶
(E T − V ) 1 − ψ= − − ET + ψ.
8m 2 2m 16m 3 4m 2 4m 2

Reorganising this equation further and multiplying left and right hand side by 1 − p 2 /(8m 2 )
we obtain:
σ ·p V σ ·p
· 2
p4 p 2V + V p 2
¡ ¢ ¡ ¢¸
p
ETψ = − +V − + ψ.
2m 8m 3 8m 2 4m 2
This equation contains the somewhat intricate term σ · p V σ · p . This can be reworked
¡ ¢ ¡ ¢

using
σ · a) (σ
(σ σ · b) = a · b + iσ
σ · (a × b)
to
σ · pV × pψ .
σ · p V σ · p ψ = pV · pψ + iσ
¡ ¢ ¡ ¢ ¡ ¢ ¡ ¢ £¡ ¢ ¡ ¢¤

Working out all the terms, one obtains, after some algebra:

H ψ = E T ψ,

where
p2 p4 ∇2V σ ∇V × p
¡ ¢
H= − +V + + .
2m 8m 3 8m 2 4m 2
246 17. T HE D IRAC EQUATION

In addition to the nonrelativistic Hamiltonian


{
17
{ p2
Hnonrel = + V,
2m
we see that this equation contains corrections which we shall now discuss.
The term p 4 /(8m 3 ) is a correction due to the kinetic energy and does not depend on the
potential V . Two more terms do depend on V . The term ∇2V /(8m 2 ) is called the Darwin
term. Using the fact that ∇V = 4πδ(r) for the specific case of the Coulomb potential of a
charge located at the origin, it can be formulated as a correction term located only at the
origin.
The last term can also be reformulated. Using the fact that V is spherically symmetric, we
have
r dV
∇V = ,
r dr
so that we can write
¢ 1 dV ¢ 1 dV 1 dV
σ · ∇V × p = σ · r×p = σ · (r × p) = σ · L.
¡ ¡
r dr r dr r dr
This is the famous spin-orbit coupling.

17.6 I NTERACTION WITH AN ELECTROMAGNETIC FIELD


The Dirac equation can easily be formulated for a particle moving in an electromagnetic field.
Such a field contains an electric potential φ and a vector potential A. We follow the same rules
as in the ordinary Schrödinger equation:

p → p − eA

and
E → E − eφ.
This is a relativistically correct procedure as we can write this transformation as

\[
p^\mu \to p^\mu - eA^\mu.
\]
The energy-momentum relation in this case is replaced by the condition
\[
(E - e\phi)^2 = \left(\mathbf{p} - e\mathbf{A}\right)^2 + m^2,
\]
which, in the nonrelativistic limit, yields (by taking the square root of the last equation):
\[
E - e\phi \approx m + \frac{\left(\mathbf{p} - e\mathbf{A}\right)^2}{2m}.
\]
The Dirac equation now reads:
\[
\left[\gamma^\mu\left(p_\mu - eA_\mu\right) - m\right]\psi = 0.
\]
Multiplying this from the left by $\gamma^\nu\left(p_\nu - eA_\nu\right) + m$ yields
\[
\left[\gamma^\nu\left(p_\nu - eA_\nu\right)\gamma^\mu\left(p_\mu - eA_\mu\right) - m^2\right]\psi = 0.
\]
We now consider the factor $\gamma^\mu\gamma^\nu$, which we write in the form
\[
\gamma^\mu\gamma^\nu = \frac{1}{2}\left(\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu\right) + \frac{1}{2}\left(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu\right).
\]

The first term is an anti-commutator which is equal to $g^{\mu\nu}$ (see Eq. (17.5)). The second term is defined as
\[
\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu = -2i\Sigma^{\mu\nu}.
\]
As $\Sigma^{\mu\nu}$ is a commutator, the diagonal elements ($\mu = \nu$) are zero. The elements $\Sigma^{4i}$ are equal to
\[
\Sigma^{4i} = i\begin{pmatrix}0 & \sigma^i\\ \sigma^i & 0\end{pmatrix}, \qquad i = x, y, z.
\]
The remaining nonvanishing terms can be shown to be
\[
\Sigma^{ij} = \epsilon_{ijk}\begin{pmatrix}\sigma^k & 0\\ 0 & \sigma^k\end{pmatrix}, \qquad i, j, k = x, y, z.
\]

All in all, we have
\[
\left[\left(p^\mu - eA^\mu\right)\left(p_\mu - eA_\mu\right) - i\Sigma^{\mu\nu}\left(p_\mu - eA_\mu\right)\left(p_\nu - eA_\nu\right) - m^2\right]\psi(x) = 0.
\]

Using the anti-symmetry of $\Sigma$ (i.e. $\Sigma^{\mu\nu} = -\Sigma^{\nu\mu}$), we can write the second term as
\[
-i\Sigma^{\mu\nu}\left(p_\mu - eA_\mu\right)\left(p_\nu - eA_\nu\right)\psi = -\frac{e}{2}\Sigma^{\mu\nu}\left(\partial_\mu A_\nu - \partial_\nu A_\mu\right)\psi,
\]
where in the last form, the partial derivatives only act on the components of $A$ and not on $\psi$, as can be seen by carefully writing out all terms. The term in parentheses is recognised as the Maxwell tensor (see Eq. (16.2)):
\[
-\frac{e}{2}\Sigma^{\mu\nu}\left(\partial_\mu A_\nu - \partial_\nu A_\mu\right) = -\frac{e}{2}\Sigma^{\mu\nu}F_{\mu\nu}.
\]
Using the fact that the electric field components are $E_i = F^{0i}$ and that the magnetic field is given as $B_i = -\frac{1}{2}\epsilon_{ijk}F^{jk}$, we have
\[
\frac{e}{2}\Sigma^{\mu\nu}F_{\mu\nu} = e\left(\boldsymbol{\sigma}'\cdot\mathbf{B} - i\boldsymbol{\alpha}\cdot\mathbf{E}\right).
\]
All in all, the Dirac equation for an electron in an electromagnetic field becomes
\[
\left[(E - e\phi)^2 - \left(\mathbf{p} - e\mathbf{A}\right)^2 + e\boldsymbol{\sigma}'\cdot\mathbf{B} + ie\boldsymbol{\alpha}\cdot\mathbf{E} - m^2\right]\psi = 0.
\]

It can be shown that, in the weakly relativistic limit, the term involving the electric field becomes much smaller than the other ones. The term involving the magnetic field is the natural coupling between the spin and the magnetic field. If we restore the prefactors and give it a dimension of energy, it reads
\[
\frac{e\hbar}{2mc}\boldsymbol{\sigma}';
\]
the prefactor is called the Bohr magneton. Restricting ourselves to the large component of the spinor in the nonrelativistic limit, we obtain
\[
i\hbar\frac{\partial\psi_L}{\partial t} = \left[\frac{1}{2m}\left(i\hbar\nabla + e\mathbf{A}\right)^2 - \frac{e\hbar}{2mc}\boldsymbol{\sigma}\cdot\mathbf{B} + e\phi\right]\psi_L.
\]

This equation can also be derived in a more direct way for a plane wave solution. Starting from the Dirac equation for this plane wave we have
\[
\begin{pmatrix}E - e\phi - m & -\boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right)\\ -\boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right) & E - e\phi + m\end{pmatrix}\begin{pmatrix}u_L\\ u_S\end{pmatrix} = 0,
\]
which yields the two equations
\[
(E - e\phi - m)u_L - \boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right)u_S = 0
\]

and
\[
(E - e\phi + m)u_S - \boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right)u_L = 0.
\]
In the nonrelativistic approximation, $|E - m| \ll m$ and $e\phi \ll m$, and we can write the second equation as
\[
u_S = \frac{\boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right)}{E - e\phi + m}\,u_L \approx \frac{\boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right)}{2m}\,u_L.
\]
Inserting this into the first equation gives
\[
(E - e\phi - m)u_L = \boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right)\frac{\boldsymbol{\sigma}\cdot\left(\mathbf{p} - e\mathbf{A}\right)}{2m}\,u_L.
\]
This then yields, after some algebra,
\[
\left(E - m - e\phi - \frac{1}{2m}\left(\mathbf{p} - e\mathbf{A}\right)^2 + \frac{e\hbar}{2mc}\boldsymbol{\sigma}\cdot\mathbf{B}\right)u_L = 0.
\]
This equation is in accordance with the result of our more general calculation, but the derivation is less general, since we have assumed a plane wave form for the solution.
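As a quick numerical aside (not in the notes): the prefactor $e\hbar/(2mc)$ above is the Gaussian-units form of the Bohr magneton; in SI units it reads $e\hbar/(2m_e)$, and its value can be checked against the CODATA reference number with scipy:

```python
# Numerical value of the Bohr magneton (SI form e*hbar/(2*m_e);
# the e*hbar/(2*m*c) in the text is the Gaussian-unit expression).
from scipy.constants import e, hbar, m_e, physical_constants

mu_B = e * hbar / (2 * m_e)
print(mu_B)                                        # ~9.274e-24 J/T
print(physical_constants['Bohr magneton'][0])      # CODATA reference value
```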

17.7 PROBLEMS
1. Establish the following properties of the $\gamma$ matrices:

(a) $\gamma_\nu^2 = g^{\nu\nu}\,\mathbf{1}$ (no summation over $\nu$).
(b) $g_{\mu\nu}\gamma^\mu\gamma^\nu = 4\cdot\mathbf{1}$.
(c) $\gamma^\lambda\gamma^\nu\gamma_\lambda = -2\gamma^\nu$.
(d) $\gamma^\lambda\gamma^\nu\gamma^\kappa\gamma_\lambda = 4\delta_{\kappa\nu}$.
(e) $\gamma^\lambda\gamma^\nu\gamma^\kappa\gamma^\mu\gamma_\lambda = -2\gamma^\mu\gamma^\kappa\gamma^\nu$.
(f) $\mathrm{Tr}\left(\gamma^\kappa\gamma^\nu\right) = 4\delta_{\kappa\nu}$.
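As an editorial aside (not part of the exercise), identities (c) and (e) and the $\Sigma^{\mu\nu}$ blocks used in section 17.6 can be spot-checked numerically. The sketch below assumes the Dirac representation with $\gamma^4 = \beta$ and $\gamma^i = \beta\alpha_i$, which appears to be the convention of these notes; the check itself only relies on the Clifford algebra.

```python
# Numerical spot-check of gamma-matrix identities and of the Sigma^{mu nu} blocks,
# assuming the Dirac representation with gamma^4 = beta and gamma^i = beta alpha_i.
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in (sx, sy, sz)]
gamma = [beta @ a for a in alpha] + [beta]          # gamma^x, gamma^y, gamma^z, gamma^4
g = np.diag([-1.0, -1.0, -1.0, 1.0])                # metric, same index ordering
g_dn = [g[i, i] * gamma[i] for i in range(4)]       # lowered-index gammas

# {gamma^mu, gamma^nu} = 2 g^{mu nu}
print(all(np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], 2 * g[m, n] * np.eye(4))
          for m in range(4) for n in range(4)))

# (c): gamma^lambda gamma^nu gamma_lambda = -2 gamma^nu
print(np.allclose(sum(gamma[l] @ gamma[1] @ g_dn[l] for l in range(4)), -2 * gamma[1]))

# (e): gamma^lambda gamma^nu gamma^kappa gamma^mu gamma_lambda = -2 gamma^mu gamma^kappa gamma^nu
lhs = sum(gamma[l] @ gamma[0] @ gamma[1] @ gamma[2] @ g_dn[l] for l in range(4))
print(np.allclose(lhs, -2 * gamma[2] @ gamma[1] @ gamma[0]))

# Sigma^{mu nu} defined through gamma^mu gamma^nu - gamma^nu gamma^mu = -2i Sigma^{mu nu}
Sigma = lambda m, n: (gamma[m] @ gamma[n] - gamma[n] @ gamma[m]) / (-2j)
print(np.allclose(Sigma(3, 0), 1j * np.block([[Z2, sx], [sx, Z2]])))   # Sigma^{4x}
print(np.allclose(Sigma(0, 1), np.block([[sz, Z2], [Z2, sz]])))        # Sigma^{xy}
```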

2. The Dirac equation with a central potential, Part 1.

We consider a particle in a radial potential. We define the operator
\[
K = \beta\left(\frac{\boldsymbol{\sigma}'\cdot\mathbf{L}}{\hbar} + 1\right),
\]
and the operator
\[
p_r = -i\hbar\,\frac{1}{r}\frac{\partial}{\partial r}r = \frac{1}{r}\left(\mathbf{r}\cdot\mathbf{p} - i\hbar\right).
\]
This is the radial momentum operator. Also, we define
\[
\alpha_r = \frac{1}{r}\left(\boldsymbol{\alpha}\cdot\mathbf{r}\right).
\]
(a) Show that
\[
\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}
\]
for the vector $\boldsymbol{\sigma}$ which has the Pauli matrices as its components. (This result can be generalized to an arbitrary angular momentum operator $\mathbf{J}$.)
(b) Show that
\[
\left(\boldsymbol{\alpha}\cdot\mathbf{r}\right)\left(\boldsymbol{\alpha}\cdot\mathbf{p}\right) = \mathbf{r}\cdot\mathbf{p} + i\boldsymbol{\sigma}'\cdot\mathbf{L} = r\,p_r + i\left(\boldsymbol{\sigma}'\cdot\mathbf{L} + \hbar\right).
\]
(c) Multiplying this from the left by $\alpha_r/r$, show that
\[
\boldsymbol{\alpha}\cdot\mathbf{p} = \alpha_r\left[p_r + \frac{i\hbar}{r}\beta K\right].
\]
This leads to the following form of the Dirac Hamiltonian:
\[
H = c\alpha_r p_r + i\hbar\frac{c}{r}\alpha_r\beta K + \beta mc^2 + V(r).
\]

3. Dirac equation with a spherical potential, Part 2

We proceed with the Dirac Hamiltonian of problem 2.

(a) Show that $[K, H] = 0$.

(b) The matrix operator $K$ is block diagonal; hence it operates separately on the two spinor wave functions $\phi_L$ and $\phi_S$. Take as a basis in the space of spinor wave functions $Y_{lm}\chi_\pm$, where
\[
\chi_+ = \begin{pmatrix}1\\0\end{pmatrix}; \qquad \chi_- = \begin{pmatrix}0\\1\end{pmatrix}
\]
are the spin-up and spin-down spinors, respectively, and $Y_{lm}$ is the usual spherical harmonic function.
Show that these basis functions are all eigenstates of $\hbar K$ and that the eigenvalues are $\hbar\kappa$, where $\kappa = \pm(j + 1/2)$.

4. H We consider a system described by the Dirac Hamiltonian
\[
H = -i\boldsymbol{\alpha}\cdot\nabla + \beta m.
\]
We take $\hbar = c = 1$.

(a) Show that the Heisenberg equation
\[
-i\frac{\partial Q}{\partial t}(t) = \left[H, Q\right],
\]
applied to the position operator $\mathbf{x}(t)$, gives
\[
\frac{\partial x}{\partial t}(t) = \alpha_x.
\]
So $\alpha_x$ can be seen as a velocity (measured in units of $c$) in the $x$-direction.
(b) Show that $\alpha_x(t)$ by itself satisfies the equation of motion
\[
\frac{\partial\alpha_x}{\partial t}(t) = 2i\left[p_x - \alpha_x H\right].
\]
(c) Solve the two equations of motion for $\alpha_x$ and $x$ to obtain
\[
\alpha_x(t) = Ae^{-2iHt} + p_xH^{-1}
\]
and express $A$ in terms of $\alpha_x(0)$. Hint: use the time independence of $H$ and of $p_x$.
(d) Verify that
\[
x(t) = x(0) + p_xH^{-1}t + \frac{1}{2}iH^{-1}\left[\alpha_x(0) - p_xH^{-1}\right]\left(e^{-2iHt} - 1\right).
\]
We see that the last term contains an oscillatory contribution. It can be shown that it arises from an interference between positive- and negative-energy waves. This phenomenon is known as the Zitterbewegung.

5. H In 2010, the Nobel prize was awarded to A. Geim and K. Novoselov for their research
into the physical properties of electrons in graphene, which is a single layer of carbon
atoms, ordered in a hexagonal grid. One of the striking properties of these electrons is
that they behave as massless spin-1/2 particles in two dimensions. In 1929, Klein had
analyzed the properties of Dirac particles when they scatter off a potential barrier, and
in 2006, Katsnelson, Geim and Novoselov published a paper in Nature Physics on an


experimental verification in graphene of this process. The so-called Klein paradox is addressed in this problem.
We consider electrons incident on a potential barrier
\[
V(z) = \begin{cases}V_0 & \text{for } z > 0;\\ 0 & \text{for } z < 0.\end{cases}
\]
For $z < 0$, the Dirac equation reads
\[
\left(\alpha p + \beta m\right)\psi = E\psi.
\]
For $z > 0$ we have in turn:
\[
\left(\alpha p + \beta m\right)\psi = (E - V_0)\psi.
\]
Note that $\alpha$ is the matrix $\begin{pmatrix}0 & \sigma_z\\ \sigma_z & 0\end{pmatrix}$ with $\sigma_z$ the Pauli matrix. Also, $p = p_z$. From the Dirac theory, we know that an incident spin-up particle is described by
\[
\psi_{\mathrm{inc}} = \begin{pmatrix}1\\ 0\\ p/(E+m)\\ 0\end{pmatrix}e^{ipz}.
\]
As the potential step is not expected to change the spin of the particle, we expect the reflected and transmitted wave to only have spin-up character. The reflected wave has the form
\[
\psi_{\mathrm{refl}} = a\begin{pmatrix}1\\ 0\\ -p/(E+m)\\ 0\end{pmatrix}e^{-ipz}
\]
and the transmitted wave is
\[
\psi_{\mathrm{trans}} = b\begin{pmatrix}1\\ 0\\ p'/(E - V_0 + m)\\ 0\end{pmatrix}e^{ip'z}.
\]
Show that matching the solution at $z = 0$ gives the following equations:
\[
1 + a = b; \qquad 1 - a = \frac{p'}{p}\,\frac{E + m}{E - V_0 + m}\,b \equiv rb.
\]

We distinguish between three cases for the value of $V_0$:
\[
\begin{array}{ll}
V_0 < E - m & \text{case I};\\
E - m \le V_0 < E + m & \text{case II};\\
V_0 \ge E + m & \text{case III}.
\end{array}
\]

Show that in case I,
\[
r = \sqrt{\frac{(V_0 - E + m)(E + m)}{(V_0 - E - m)(E - m)}},
\]
whereas for case II,
\[
r = i\sqrt{\frac{(V_0 - E + m)(E + m)}{(E - V_0 + m)(E - m)}},
\]
and, in case III:
\[
r = -\sqrt{\frac{(V_0 - E + m)(E + m)}{(V_0 - E - m)(E - m)}}.
\]
Also show that the reflected current versus the incident current is given by
\[
\frac{j_{\mathrm{refl}}}{j_{\mathrm{inc}}} = |a|^2 = \left|\frac{1 - r}{1 + r}\right|^2.
\]
Show that in case I the reflected flux is less than 1, that in case II it equals 1, and that in case III it exceeds 1!
Find an explanation for this unexpected result.
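As a quick numerical illustration (not part of the problem text), the expressions above can simply be evaluated for sample parameters; $m = 1$ and $E = 2$ below are arbitrary choices. The output shows a reflected flux below 1 in case I, equal to 1 in case II and above 1 in case III, which is the Klein paradox.

```python
# Evaluate the reflection coefficient |a|^2 = |(1-r)/(1+r)|^2 for the three
# regimes of the barrier height V0 (units with m = 1; E = 2 chosen as example).
import numpy as np

m, E = 1.0, 2.0

def refl(V0):
    p = np.sqrt(E**2 - m**2)
    pp = np.lib.scimath.sqrt((E - V0)**2 - m**2)   # becomes imaginary in case II
    r = (pp / p) * (E + m) / (E - V0 + m)
    a = (1 - r) / (1 + r)
    return abs(a)**2

for V0, label in [(0.5, 'I:   V0 < E - m'),
                  (2.0, 'II:  E - m <= V0 < E + m'),
                  (5.0, 'III: V0 >= E + m')]:
    print(label, refl(V0))
```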
18 SECOND QUANTIZATION FOR RELATIVISTIC PARTICLES

18.1 INTRODUCTION
In this chapter, we apply the second quantization formalism to relativistic particles. As we have seen in the previous chapters, we need to describe the particles with the appropriate relativistic quantum equation: the Klein-Gordon equation for relativistic bosons and the Dirac equation for relativistic fermions. We start this chapter by considering these two cases in some detail. Then we analyze an important example of a fermion system: electrons in graphene.

18.2 SECOND QUANTIZATION FOR KLEIN-GORDON PARTICLES

We first formulate the procedure of second quantization for the Klein-Gordon field. Let us first recall this equation:
\[
\left(-\nabla^2 + \partial_t^2 + m^2\right)\phi(\mathbf{r}, t) = 0.
\]

Looking at the solutions to this equation, we concluded in chapter 16 that there are two differences in comparison with the Schrödinger particles considered in chapter 9:

• We have particles with positive and with negative energies.

• We have a different expression for the density, which is given by
\[
\rho(\mathbf{r}, t) = i\phi^\dagger(\mathbf{r}, t)\overset{\leftrightarrow}{\partial_t}\phi(\mathbf{r}, t) = i\left[\phi^\dagger(\mathbf{r}, t)\,\partial_t\phi(\mathbf{r}, t) - \left(\partial_t\phi^\dagger(\mathbf{r}, t)\right)\phi(\mathbf{r}, t)\right].
\]

The first point leads to the conclusion that the wavefunctions have the form
\[
Ce^{-i(\omega t - \mathbf{p}\cdot\mathbf{r})} \quad\text{or}\quad Ce^{i(\omega t + \mathbf{p}\cdot\mathbf{r})},
\]
where $\omega^2 = \mathbf{p}^2 + m^2$ is always positive; therefore $E = \omega$ (first case) or $E = -\omega$ (second case).


The normalization condition gives us the constant $C$. Evaluating the inner product of two plane waves with positive energies, and momenta $\mathbf{p}$ and $\mathbf{p}'$, we obtain, using the Klein-Gordon density:
\[
\int d^3r\; i\left[e^{i(\omega_p t - \mathbf{p}\cdot\mathbf{r})}\left(-i\omega_{p'}\right)e^{-i(\omega_{p'} t - \mathbf{p}'\cdot\mathbf{r})} - \left(i\omega_p\right)e^{i(\omega_p t - \mathbf{p}\cdot\mathbf{r})}e^{-i(\omega_{p'} t - \mathbf{p}'\cdot\mathbf{r})}\right] = 2\omega_p(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{p}').
\]
Also, performing a similar calculation for a positive and a negative energy state, we obtain
\[
\int d^3r\; i\left[e^{-i(\omega_p t - \mathbf{p}\cdot\mathbf{r})}\left(-i\omega_{p'}\right)e^{-i(\omega_{p'} t - \mathbf{p}'\cdot\mathbf{r})} - \left(-i\omega_p\right)e^{-i(\omega_p t - \mathbf{p}\cdot\mathbf{r})}e^{-i(\omega_{p'} t - \mathbf{p}'\cdot\mathbf{r})}\right] = 0.
\]


So we see that properly normalized positive-energy wavefunctions have the form
\[
\phi^{(+)}(\mathbf{r}, t) = \frac{1}{(2\pi)^{3/2}\sqrt{2\omega_p}}\,e^{-i(\omega_p t - \mathbf{p}\cdot\mathbf{r})}.
\]
For negative-energy wave functions, the wavefunction
\[
\phi^{(-)}(\mathbf{r}, t) = \frac{1}{(2\pi)^{3/2}\sqrt{2\omega_p}}\,e^{i(\omega_p t + \mathbf{p}\cdot\mathbf{r})}
\]
is normalized to $-1$, as can be checked by a calculation similar to that for $\phi^{(+)}$. Furthermore, $\phi^{(+)}$ and $\phi^{(-)}$ are orthogonal.
It is now possible to formulate field operators by a similar expansion as in chapter 9:
\[
\hat\phi(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\int\frac{d^3p}{\sqrt{2\omega_p}}\left[e^{-i(\omega t - \mathbf{p}\cdot\mathbf{r})}a(\mathbf{p}) + e^{i(\omega t + \mathbf{p}\cdot\mathbf{r})}b(\mathbf{p})\right].
\]
First we realize that within the second integral we can replace $\mathbf{p}$ by $-\mathbf{p}$, and, using $p\cdot x = \omega t - \mathbf{p}\cdot\mathbf{r}$, we obtain
\[
\hat\phi(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\int\frac{d^3p}{\sqrt{2\omega_p}}\left[e^{-ip\cdot x}a(\mathbf{p}) + e^{ip\cdot x}b(-\mathbf{p})\right].
\]
If we now require that the field operators are Hermitian fields ($\hat\phi(\mathbf{r}, t) = \hat\phi^\dagger(\mathbf{r}, t)$), we have $a^\dagger(\mathbf{p}) = b(-\mathbf{p})$. We then have
\[
\hat\phi(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\int\frac{d^3p}{\sqrt{2\omega_p}}\left[e^{-ip\cdot x}a(\mathbf{p}) + e^{ip\cdot x}a^\dagger(\mathbf{p})\right].
\]

Let us consider this form. We see that the field operators are composed of creation operators of particles with momentum $\mathbf{p}$ and of annihilation operators with momentum $\mathbf{p}$, which are equivalent to creation operators for momentum states $-\mathbf{p}$. Both types of particles are assumed to have positive energies – therefore the Hamiltonian will have the form
\[
H = \frac{1}{2}\int d^3p\;\hbar\omega\left[a^\dagger(\mathbf{p})a(\mathbf{p}) + a(\mathbf{p})a^\dagger(\mathbf{p})\right].
\]
Of course you will now question how we can write this form from the information we had so far – to be honest, we cannot. A more sophisticated field-theoretical description however confirms that this is the right form (see for example Bjorken and Drell, Relativistic Quantum Fields).
Using the Boson commutation relations for the $a(\mathbf{p})$ and $a^\dagger(\mathbf{p})$:
\[
\left[a(\mathbf{p}), a^\dagger(\mathbf{p})\right] = 1,
\]

we can rewrite the Hamiltonian as
\[
H = \int d^3p\;\hbar\omega\left[a^\dagger(\mathbf{p})a(\mathbf{p}) + 1/2\right].
\]
We see that the ground state energy is infinite! This is another example of an infinity occurring in field theory – previously we encountered this when discussing the Casimir effect.
In order to resolve the 'infinite energy problem', we should realize that the values of the energies do not matter – it is energy differences that drive physical processes. Therefore, the physics does not change if we subtract the ground state energy from all energies. This leads to the 'renormalized' Hamiltonian
\[
H = \int d^3p\;\hbar\omega\, a^\dagger(\mathbf{p})a(\mathbf{p}).
\]
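A toy illustration (not from the notes) of this renormalization: for a single bosonic mode truncated to a finite number of levels, the Hamiltonian $\omega(a^\dagger a + 1/2)$ has ground-state energy $\omega/2$, while the normal-ordered form $\omega\, a^\dagger a$ has ground-state energy zero. The truncation size below is an arbitrary illustrative choice.

```python
# Single bosonic mode truncated to n_max levels: the zero-point energy omega/2
# disappears once the constant 1/2 is dropped (the 'renormalized' Hamiltonian).
import numpy as np

n_max, omega = 20, 1.0
a = np.diag(np.sqrt(np.arange(1, n_max)), k=1)    # truncated annihilation operator
ad = a.conj().T

H = omega * (ad @ a + 0.5 * np.eye(n_max))
H_renorm = omega * (ad @ a)

print(np.linalg.eigvalsh(H)[0])         # 0.5 = zero-point energy
print(np.linalg.eigvalsh(H_renorm)[0])  # 0.0
```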

In addition to this, we can count the number of particles as
\[
N = \int d^3p\; a^\dagger(\mathbf{p})a(\mathbf{p}).
\]
These definitions lead to a field theory with positive energies and particle numbers (or densities). The fact that we have not properly shown that our expressions for the Hamiltonian and particle number are correct may be somewhat unsatisfactory – for Dirac particles the situation is better, so we quickly move on to that case.

18.3 SECOND QUANTIZATION FOR DIRAC PARTICLES

When finding plane wave solutions for the Dirac equation, we distinguish between the solutions with positive and those with negative energy. We write spinors with positive energy as
\[
\left\langle x|\psi_s\right\rangle = u_s e^{-ip\cdot x},
\]
where
\[
\left(\gamma\cdot p - m\right)u_s(\mathbf{p}) = 0,
\]
where the fourth component of $p$ in the left hand side is $E_p = \sqrt{\mathbf{p}^2 + m^2}$, and $s = \pm$ denotes the spin. Just as in the case of the Klein-Gordon equation, we also have negative energy solutions of the form:
\[
\psi(\mathbf{r}, t) = v_s e^{ip\cdot x}.
\]
Putting this into the Dirac equation leads to
\[
\left(\gamma\cdot p + m\right)v_s(\mathbf{p}) = 0.
\]

The question is now what the spinors $u_s$ and $v_s$ look like. We have already encountered them in chapter 17. They have the form
\[
u_+(\mathbf{p}) = C\begin{pmatrix}1\\ 0\\ \frac{p_z}{|E|+m}\\ \frac{p_x + ip_y}{|E|+m}\end{pmatrix}
\quad\text{and}\quad
u_-(\mathbf{p}) = C\begin{pmatrix}0\\ 1\\ \frac{p_x - ip_y}{|E|+m}\\ -\frac{p_z}{|E|+m}\end{pmatrix};
\]
\[
v_+(\mathbf{p}) = C\begin{pmatrix}\frac{p_z}{|E|+m}\\ \frac{p_x + ip_y}{|E|+m}\\ 1\\ 0\end{pmatrix}
\quad\text{and}\quad
v_-(\mathbf{p}) = C\begin{pmatrix}\frac{p_x - ip_y}{|E|+m}\\ -\frac{p_z}{|E|+m}\\ 0\\ 1\end{pmatrix}.
\]
The important question now is what the $C$'s are in order for $u$ and $v$ to be normalized. We have seen in section 17.2 that the correct four-vector form of the current is
\[
(\mathbf{j}, \rho) = \left(\psi^\dagger\boldsymbol{\alpha}\psi,\ \psi^\dagger\psi\right).
\]



This can also be written in the more elegant form
\[
j^\mu = \psi^\dagger\gamma^4\gamma^\mu\psi.
\]
The quantity
\[
\psi^\dagger(x)\gamma^4
\]
occurs so often that a special notation is used for it:
\[
\bar\psi = \psi^\dagger(x)\gamma^4.
\]
Now there is a subtle problem. The norm of the wavefunction should always be 1, i.e. it should be a relativistically invariant scalar. But the density transforms as the 4-th component of a four-vector, and is therefore not an invariant scalar itself! It turns out that the quantity $\bar\psi\psi$ is invariant (see Bjorken and Drell; this result is also used in Desai, sec. 35.7).
For $\mathbf{p} = 0$, the forms of $u_s$ and $v_s$ are taken to be
\[
u_+ = \begin{pmatrix}1\\0\\0\\0\end{pmatrix}; \qquad u_- = \begin{pmatrix}0\\1\\0\\0\end{pmatrix}
\]
and
\[
v_+ = \begin{pmatrix}0\\0\\1\\0\end{pmatrix}; \qquad v_- = \begin{pmatrix}0\\0\\0\\1\end{pmatrix}.
\]
For $\mathbf{p} \ne 0$ our norm $\bar u_s u_{s'}$ remains constant. So we use the relativistically invariant normalization conditions:
\[
\bar u_s u_{s'} = -\bar v_s v_{s'} = \delta_{ss'}, \qquad \bar u_s v_{s'} = \bar v_s u_{s'} = 0.
\]
Note the minus sign in the normalization for the $v$'s: it is due to the $\gamma^4$ appearing in the definition of $\bar\psi$. These normalisation conditions fix $C$ for $u_s$ to
\[
|C|^2\left(1 - \frac{\mathbf{p}^2}{(\omega + m)^2}\right) = 1
\]
(note that the minus sign is caused by the $\gamma^4$ in the definition of the norm). Therefore, we see that
\[
u_s^\dagger u_{s'} = |C|^2\left(1 + \frac{\mathbf{p}^2}{(\omega + m)^2}\right)\delta_{ss'},
\]
which, after some calculus, can be seen to yield
\[
u_s^\dagger u_{s'} = \frac{\omega}{m}\,\delta_{ss'}.
\]
The same holds for the $v_s$:
\[
v_s^\dagger v_{s'} = \frac{\omega}{m}\,\delta_{ss'}.
\]
Interestingly, there is a 'skewed' orthogonality condition between the $u$'s and the $v$'s:
\[
u_s^\dagger(\mathbf{p})\,v_{s'}(-\mathbf{p}) = v_s^\dagger(\mathbf{p})\,u_{s'}(-\mathbf{p}) = 0. \tag{18.1}
\]
This is natural, as can be seen by writing out the normalization conditions including the plane wave part:
\[
\int d^3r\; u_s^\dagger(\mathbf{p})e^{ip\cdot x}\,v_s(\mathbf{p}')e^{ip'\cdot x}.
\]

The integral over r forces p = −p0 , and for this situation, the above orthogonality condition
works.
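These normalization and orthogonality statements can be verified numerically. The sketch below (not part of the notes) builds the spinors quoted above for a random momentum, with $C$ fixed by the condition $|C|^2\left(1 - \mathbf{p}^2/(\omega+m)^2\right) = 1$ and $\gamma^4 = \mathrm{diag}(1,1,-1,-1)$ assumed for the Dirac representation.

```python
# Check of the spinor normalizations: u_bar u = 1, u^dag u = omega/m, v_bar v = -1,
# and the 'skewed' orthogonality u_s^dag(p) v_s'(-p) = 0 of Eq. (18.1).
import numpy as np

m = 1.0
rng = np.random.default_rng(1)
p = rng.normal(size=3)
omega = np.sqrt(p @ p + m**2)
C = 1.0 / np.sqrt(1.0 - (p @ p) / (omega + m)**2)
g4 = np.diag([1.0, 1.0, -1.0, -1.0])   # gamma^4 in the Dirac representation (assumed)

def spinors(px, py, pz):
    d = omega + m
    u = [C * np.array([1, 0, pz / d, (px + 1j * py) / d]),
         C * np.array([0, 1, (px - 1j * py) / d, -pz / d])]
    v = [C * np.array([pz / d, (px + 1j * py) / d, 1, 0]),
         C * np.array([(px - 1j * py) / d, -pz / d, 0, 1])]
    return u, v

u, v = spinors(*p)
u_rev, v_rev = spinors(*(-p))                         # spinors at -p (same omega)

print(np.isclose(u[0].conj() @ g4 @ u[0], 1.0))       # u_bar u = 1
print(np.isclose(u[0].conj() @ u[0], omega / m))      # u^dag u = omega/m
print(np.isclose(v[0].conj() @ g4 @ v[0], -1.0))      # v_bar v = -1
print(all(np.isclose(us.conj() @ vr, 0.0)             # skewed orthogonality (18.1)
          for us in u for vr in v_rev))
```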
We see that a properly orthonormalized set of wavefunctions is given by
\[
\sqrt{\frac{m}{\omega_p}}\,e^{-ip\cdot x}u_s(\mathbf{p}); \qquad \sqrt{\frac{m}{\omega_p}}\,e^{ip\cdot x}v_s(\mathbf{p}).
\]
From this, we can immediately write down the field operator
\[
\psi(x) = \sum_{s=\pm}\frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\;\sqrt{\frac{m}{E_p}}\left[u_s(\mathbf{p})a_s(\mathbf{p})e^{-ip\cdot x} + v_s(\mathbf{p})b_s^\dagger(\mathbf{p})e^{ip\cdot x}\right],
\]

where the $a_s(\mathbf{p})$, $a_s^\dagger(\mathbf{p})$ annihilate and create particles with positive energy, and the operators $b_s(\mathbf{p})$, $b_s^\dagger(\mathbf{p})$ create or annihilate a particle with negative energy. Obviously, these operators satisfy the anti-commutation relations:
\[
\left\{a_s(\mathbf{p}), a_{s'}^\dagger(\mathbf{q})\right\} = \delta_{ss'}\delta^{(3)}(\mathbf{p} - \mathbf{q}); \qquad \left\{a_s(\mathbf{p}), a_{s'}(\mathbf{q})\right\} = \left\{a_s^\dagger(\mathbf{p}), a_{s'}^\dagger(\mathbf{q})\right\} = 0;
\]
\[
\left\{b_s(\mathbf{p}), b_{s'}^\dagger(\mathbf{q})\right\} = \delta_{ss'}\delta^{(3)}(\mathbf{p} - \mathbf{q}); \qquad \left\{b_s(\mathbf{p}), b_{s'}(\mathbf{q})\right\} = \left\{b_s^\dagger(\mathbf{p}), b_{s'}^\dagger(\mathbf{q})\right\} = 0.
\]
All anti-commutators involving an $a$ and a $b$ vanish.


The Hermitian conjugate of the field operator is
\[
\psi^\dagger(x) = \sum_{s=\pm}\frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\;\sqrt{\frac{m}{E_p}}\left[u_s^\dagger(\mathbf{p})a_s^\dagger(\mathbf{p})e^{ip\cdot x} + v_s^\dagger(\mathbf{p})b_s(\mathbf{p})e^{-ip\cdot x}\right].
\]

Let us again calculate the particle number:
\[
N(t) = \int\psi^\dagger(\mathbf{r}, t)\psi(\mathbf{r}, t)\,d^3r.
\]
Using the normalization conditions for the basis functions, we see that we get a number of positive-energy particles
\[
N^+ = \sum_s\int d^3p\; a_s^\dagger(\mathbf{p})a_s(\mathbf{p}).
\]

For the negative energies, we have
\[
N^- = \sum_s\int d^3p\; b_s(\mathbf{p})b_s^\dagger(\mathbf{p}) = \sum_s\int d^3p\left[1 - b_s^\dagger(\mathbf{p})b_s(\mathbf{p})\right].
\]
It is a convention to neglect the infinite constant $\sum_s\int d^3p$ from this number, so that we have a negative density left. This is common practice in quantum field theory: for quantities like energy and particle number, we use the normal order procedure: we move all creation operators to the left, and the annihilation operators to the right. The notation for this procedure is two colons: $:\ldots:$. For example
\[
:\sum_s\int d^3p\; b_s(\mathbf{p})b_s^\dagger(\mathbf{p}): \;=\; -\sum_s\int d^3p\; b_s^\dagger(\mathbf{p})b_s(\mathbf{p}).
\]

As you can see, the re-ordering has consequences for the sign in front of the expression. Finally, cross-terms such as
\[
\sum_{ss'}\frac{1}{(2\pi\hbar)^3}\int d^3r\int d^3p\;\sqrt{\frac{m}{E_p}}\,a_s^\dagger(\mathbf{p})\int d^3q\;\sqrt{\frac{m}{E_q}}\,u_s^\dagger(\mathbf{p})v_{s'}(\mathbf{q})\,b_{s'}^\dagger(\mathbf{q})\,e^{ip\cdot x}e^{iq\cdot x}
\]
vanish as a result of the orthogonality condition (18.1).
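The sign produced by normal ordering can be made concrete with a single fermionic mode represented by $2\times2$ matrices (a minimal sketch, not part of the notes):

```python
# Single fermionic mode: {b, b^dag} = 1 implies b b^dag = 1 - b^dag b,
# so normal ordering gives : b b^dag : = - b^dag b.
import numpy as np

b = np.array([[0, 1], [0, 0]])     # fermionic annihilation operator
bd = b.T

print(np.allclose(b @ bd + bd @ b, np.eye(2)))      # anticommutator equals 1
print(np.allclose(b @ bd, np.eye(2) - bd @ b))      # b b^dag = 1 - b^dag b
```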



We see that the negative-energy particles contribute with a minus sign to the particle number. This can be related back to the negative density problem which always seems to pop up in relativistic quantum mechanics. A solution may be to interpret this quantity not so much as the number density, but rather as the charge density.
We may also formulate the Hamiltonian of the system, which proceeds in a way similar to the particle density:
\[
H = \int d^3r\;\psi^\dagger(x)\left(-i\boldsymbol{\alpha}\cdot\nabla + \beta m\right)\psi(x).
\]
We can rewrite this after insertion of $\gamma^4$, and using $\bar\psi = \psi^\dagger\gamma^4$, as
\[
H = \int d^3r\;\bar\psi(x)\left(-i\boldsymbol{\gamma}\cdot\nabla - m\right)\psi(x).
\]

Inserting the expansion of the field operators in terms of the $a(\mathbf{p})$ and $b(\mathbf{p})$ operators and their Hermitian conjugates gives
\[
H = \int d^3p\; E_p\sum_s\left[a_s^\dagger(\mathbf{p})a_s(\mathbf{p}) - b_s(\mathbf{p})b_s^\dagger(\mathbf{p})\right].
\]
The right hand side again contains an infinite energy, as can be seen by writing $bb^\dagger = 1 - b^\dagger b$. Removing this infinite offset then gives
\[
H = \int d^3p\; E_p\sum_s\left[a_s^\dagger(\mathbf{p})a_s(\mathbf{p}) + b_s^\dagger(\mathbf{p})b_s(\mathbf{p})\right],
\]

which leads to the surprising conclusion that the negative energy particles contribute a pos-
itive amount to the total energy!
At this stage it is useful to enter the interpretation of the Dirac theory. We have an equation which describes the behaviour of electrons extremely successfully (remember our analysis of the hydrogen atom). However, it allows for negative energies, which by themselves may not be a problem, unless we allow the electron to interact with an electromagnetic field. If this is done, then the electron may lower its energy by emitting a photon. If it continues to do so, it can lower its energy ad infinitum. An electron moving at constant momentum through space will not emit any radiation according to electrodynamics, but if it is orbiting a nucleus in a hydrogen atom, the notorious process of losing energy will continue to happen. This is not in agreement with observations – therefore these transitions should somehow be 'forbidden'. Dirac realized this by invoking Pauli's principle: he stated that all the negative energy states were already filled with negative-energy electrons. Moving a positive energy electron to a negative energy state would then be impossible. Outrageous as the suggestion may seem, it actually works very well. In fact, in condensed matter we use this picture all the time: the states below the Fermi energy are filled with electrons, and this prevents electrons with energies above the Fermi energy from 'falling down' and occupying the states below.
We also know from condensed matter physics that electrons can be excited from below the Fermi energy to states above it, when they absorb a photon. Suppose that the negative-energy electrons in a 'vacuum' could do the same. Then we would observe the excited electron as a particle with negative charge and positive energy. However, that is not all: we also see a hole in the Fermi sea, which, due to the absence of an electron with negative charge there, acts as a positively charged particle. In condensed matter physics, we indeed call such a particle a hole. Furthermore, as we are 'missing' an electron in the sea of negative energy states, the hole carries a positive energy.
In observations, there is no distinction between a hole and a positively charged particle. As the suggestion that the vacuum consists of an infinite amount of negative-energy electrons, which should somehow be compensated for by a positive background charge, is not the most elegant option, we usually choose the interpretation that the $a$-operators create and annihilate negatively charged particles, whereas the $b$'s create and annihilate positively
charged particles. These particles are called positrons. The energy to create an electron-
positron pair (or, in the Dirac sea language, to excite a negative energy electron to a positive
energy state) requires an energy of at least 2 electron masses (one electron mass corresponds
to 0.51 MeV).
18.4 A PHYSICAL REALIZATION OF A DIRAC FIELD THEORY: GRAPHENE
In 2010, A. Geim and K. Novoselov were awarded the Nobel prize for the isolation of and
research on graphene: a single layer of carbon atoms, arranged in an hexagonal lattice – see
figure 18.1. The nearest-neighbour distance between two carbon atoms in graphene is about $a = 1.42$ Å.

FIGURE 18.1: The hexagonal graphene lattice.

The hexagonal lattice is spanned by two unit vectors:


\[
\mathbf{a}_1 = \frac{a}{2}\left(3, \sqrt{3}\right); \qquad \mathbf{a}_2 = \frac{a}{2}\left(3, -\sqrt{3}\right).
\]
Each unit cell contains two atoms: we say that the hexagonal lattice is a lattice with a basis, where the basis is the set of (in this case) two atoms in the unit cell. The basis vectors of the reciprocal lattice are
\[
\mathbf{b}_1 = \frac{2\pi}{3a}\left(1, \sqrt{3}\right); \qquad \mathbf{b}_2 = \frac{2\pi}{3a}\left(1, -\sqrt{3}\right).
\]
In this lattice, we can define the Brillouin zone as usual: it is the set of points which are closer to the origin than to any reciprocal lattice point. We see that the Brillouin zone is again a hexagon. Two of the corner points of the hexagon lie in the first Brillouin zone; we call those points $\mathbf{K}$ and $\mathbf{K}'$. They are given as
\[
\mathbf{K} = \frac{2\pi}{3a}\left(1, \frac{1}{\sqrt{3}}\right); \qquad \mathbf{K}' = \frac{2\pi}{3a}\left(1, -\frac{1}{\sqrt{3}}\right).
\]
The lattice and the Brillouin zone are graphically illustrated in figure 18.2.
As you can see in the figure, we can divide up the lattice into two different kinds of points: $A$ points and $B$ points. Each $A$ point has three $B$ points as its neighbours and vice versa. The vectors connecting an $A$ point to its three neighbours are
\[
\mathbf{d}_1 = \frac{a}{2}\left(1, \sqrt{3}\right), \qquad \mathbf{d}_2 = \frac{a}{2}\left(1, -\sqrt{3}\right), \qquad \mathbf{d}_3 = -a\left(1, 0\right).
\]
The Hamiltonian for electrons moving on this lattice contains as usual a potential and a kinetic energy term. We use wave functions localized near the nuclei as basis states.$^1$ The operators $a^\dagger_{\sigma,n_A}$ and $a_{\sigma,n_A}$ are the creation and annihilation operators for an electron with spin $\sigma = \pm1/2$ occupying the orbital of the $A$ atom at site $n_A$. Similarly, we use $b^\dagger_{\sigma,m_B}$ and $b_{\sigma,m_B}$ for the electrons in the orbitals of the $B$-atoms. The potential energy of an electron when it is in such a localized state is $\epsilon$:
\[
H_{\mathrm{pot}} = \epsilon\sum_{\sigma=\pm1/2}\left(\sum_{n_A}a^\dagger_{\sigma,n_A}a_{\sigma,n_A} + \sum_{m_B}b^\dagger_{\sigma,m_B}b_{\sigma,m_B}\right).
\]

1 These orbitals are the $p_z$ orbitals which stick out from two sides of the graphene sheet.

FIGURE 18.2: Hexagonal lattice with its unit cell. The vectors $\mathbf{d}_j$, $j = 1, 2, 3$ connect the $A$-sites of the lattice to its three nearest $B$-neighbours. On the right the reciprocal lattice is shown with the hexagonal Brillouin zone with some of the points with special symmetry indicated.

Note that this can be written as
\[
H_{\mathrm{pot}} = \epsilon\sum_{\sigma=\pm1/2}\left(\sum_{n_A}n_{\sigma,n_A} + \sum_{m_B}n_{\sigma,m_B}\right),
\]
where $n_{\sigma,n_A}$ and $n_{\sigma,m_B}$ are the number operators.


The electrons may hop to a localized orbital of another atom. This hopping process is most prominent for nearest-neighbour orbitals. In fact, we cut off possible hops to farther than nearest neighbours and apply a hopping parameter $t$ to arrive at a neighbouring point, and obtain the following kinetic energy:
\[
H_{\mathrm{kin}} = -t\sum_{\sigma=\pm1/2}\sum_{\langle n_A, m_B\rangle}\left(a^\dagger_{\sigma,n_A}b_{\sigma,m_B} + b^\dagger_{\sigma,m_B}a_{\sigma,n_A}\right).
\]
The angular brackets $\langle n_A, m_B\rangle$ indicate that the sum is only over nearest neighbours. The particles do not change their spin in the hopping process. Note that this Hamiltonian is Hermitian.
The velocity of electrons near the Fermi energy is 106 m/s in graphene. This means that
the electrons can safely be described as nonrelativistic (their velocity is much smaller than
the speed of light). Therefore, and as the two spin directions do not occur explicitly in the
Hamiltonian, we can neglect the electron spin: spin-up electrons behave the same as spin-
down electrons, and there is no coupling between the two.
In order to find the energies, we make use of Bloch's theorem, which tells us that the eigenstates of a periodic Hamiltonian can be written as
\[
\left\langle\mathbf{r}|\psi_\mathbf{k}\right\rangle = u(\mathbf{r})e^{i\mathbf{k}\cdot\mathbf{r}},
\]
where $u$ is a function which has the periodicity of the lattice, and $\mathbf{k}$ can be chosen in the first Brillouin zone. This theorem is formulated for continuum space. Here we are however dealing with a discrete space. In that case, Bloch's theorem reads:
\[
\left\langle n_c|\psi\right\rangle = u(n_c)e^{i\mathbf{k}\cdot\mathbf{r}_{n_c}},
\]
where $n_c$ denotes a unit cell of the lattice, located at $\mathbf{r}_{n_c}$. As we have two orbitals in each unit cell, one on an $A$ atom and another one on a $B$ atom, all wave functions can be written as two-vectors:
\[
\left\langle n_c|\psi\right\rangle = \begin{pmatrix}\phi_A(n_c)\\ \phi_B(n_c)\end{pmatrix}. \tag{18.2}
\]

The periodic function $u$ also has this form:
\[
u = \begin{pmatrix}u_A\\ u_B\end{pmatrix}.
\]
The phase factor $\exp(i\mathbf{k}\cdot\mathbf{r})$, however, is obviously just a scalar. Note that we have left out the argument $n_c$, which is justified since $u$ is a periodic function: it has the same value in each cell.
The Schrödinger equation for such a wave function with Bloch wave-vector $\mathbf{k}$ yields an equation for $u_A$ and $u_B$:
\[
E_\mathbf{k}u_A = \epsilon u_A - t u_B\left(e^{-i\mathbf{k}\cdot(\mathbf{a}_1+\mathbf{a}_2)} + e^{-i\mathbf{k}\cdot\mathbf{a}_1} + e^{-i\mathbf{k}\cdot\mathbf{a}_2}\right).
\]
Similarly,
\[
E_\mathbf{k}u_B = \epsilon u_B - t u_A\left(e^{i\mathbf{k}\cdot(\mathbf{a}_1+\mathbf{a}_2)} + e^{i\mathbf{k}\cdot\mathbf{a}_1} + e^{i\mathbf{k}\cdot\mathbf{a}_2}\right).
\]

These are two homogeneous, linear equations with two unknowns. A nontrivial solution to these only exists when the determinant of this system of equations vanishes:
\[
\begin{vmatrix}E - \epsilon & t\left(e^{-i\mathbf{k}\cdot(\mathbf{a}_1+\mathbf{a}_2)} + e^{-i\mathbf{k}\cdot\mathbf{a}_1} + e^{-i\mathbf{k}\cdot\mathbf{a}_2}\right)\\ t\left(e^{i\mathbf{k}\cdot(\mathbf{a}_1+\mathbf{a}_2)} + e^{i\mathbf{k}\cdot\mathbf{a}_1} + e^{i\mathbf{k}\cdot\mathbf{a}_2}\right) & E - \epsilon\end{vmatrix} = 0,
\]
which leads to
\[
E - \epsilon = \pm t\sqrt{3 + 2\cos(\mathbf{k}\cdot\mathbf{a}_1) + 2\cos(\mathbf{k}\cdot\mathbf{a}_2) + 2\cos\left[\mathbf{k}\cdot(\mathbf{a}_2 - \mathbf{a}_1)\right]}.
\]
The basis vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ have been given above, so we easily find
\[
E - \epsilon = \pm t\sqrt{3 + 2\cos\left(\frac{3a}{2}k_x + \frac{a\sqrt{3}}{2}k_y\right) + 2\cos\left(\frac{3a}{2}k_x - \frac{a\sqrt{3}}{2}k_y\right) + 2\cos\left(\sqrt{3}k_ya\right)},
\]
which can be rewritten as
\[
E - \epsilon = \pm t\sqrt{3 + 2\cos\left(\sqrt{3}k_ya\right) + 4\cos\left(\frac{3a}{2}k_x\right)\cos\left(\frac{a\sqrt{3}}{2}k_y\right)}. \tag{18.3}
\]

A picture of this dispersion relation is given in figure 18.3. We see that the band structure has positive and negative values, as is immediately clear from the $\pm$ sign in Eq. (18.3). Furthermore we observe that the positive and negative values 'touch' each other at special points in the Brillouin zone with a cone-like shape. These cones are described by the mathematical expression
\[
E \propto \pm|\mathbf{q}|,
\]
where $\mathbf{q}$ is the point in reciprocal space relative to the cone position. Comparing this with the relativistic expression for the energy
\[
E = \pm\sqrt{p^2 + m^2},
\]
and identifying $\mathbf{p}$ with $\mathbf{q}$, we see that these cones describe particles with zero mass. In fact, the relation $\omega = c|\mathbf{k}|$ is well known for photons, the carriers of light, which have zero mass and travel at the speed of light. The $\mathbf{k}$ points where the cones touch are $\mathbf{K}$ and $\mathbf{K}'$, which are shown in figure 18.2. Writing $\mathbf{k} = \mathbf{K} + \mathbf{q}$ or $\mathbf{k} = \mathbf{K}' + \mathbf{q}$, it is easy to verify that for small vectors $\mathbf{q}$ the dispersion relation reads:
\[
E(\mathbf{q}) = v_F|\mathbf{q}| + O(q^2).
\]
The points $\mathbf{K}$ and $\mathbf{K}'$ are called 'Dirac points' for reasons that will become clear later on.
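A numerical look at Eq. (18.3) confirms the cone picture. The sketch below is not part of the notes; the hopping parameter $t = 2.8$ eV is a typical literature value and is not given in the text, while $a = 1.42$ Å is.

```python
# Illustrative check of the dispersion (18.3): the bands touch at K and the
# energy grows linearly around it with slope 3ta/2.
import numpy as np

t, a = 2.8, 1.42e-10                         # eV and m; t is an assumed literature value

def band(kx, ky):
    f = (3.0 + 2.0 * np.cos(np.sqrt(3.0) * ky * a)
         + 4.0 * np.cos(1.5 * kx * a) * np.cos(0.5 * np.sqrt(3.0) * ky * a))
    return t * np.sqrt(np.abs(f))            # abs() guards against tiny negative rounding at K

K = (2 * np.pi / (3 * a)) * np.array([1.0, 1.0 / np.sqrt(3.0)])
print(band(*K))                              # ~0: the two bands touch at the Dirac point

q = 1e-3 * np.linalg.norm(K)                 # small displacement from K along k_x
slope = band(K[0] + q, K[1]) / q             # dE/d|q| near the Dirac point, in eV m
print(slope, 1.5 * t * a)                    # both ~ 3ta/2
print(slope / 6.582e-16)                     # divided by hbar (eV s): ~1e6 m/s, the Fermi velocity
```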
FIGURE 18.3: Band structure of graphene. Shown are the energies as a function of the two-dimensional vectors $\mathbf{k}$.

Now we need some additional piece of information which comes from chemistry. Carbon has a nuclear charge $Z = 6$ and therefore it has 6 electrons. Two of these are in the low-lying 1s orbital, which has spherical symmetry and which is strongly localized near the nucleus. This leaves four electrons, three of which are used to bind to the three neighbouring atoms. This leaves one electron in each atomic $p_z$ orbital, and as we have two atoms per unit cell in the hexagonal lattice, we have two electrons per cell. The band structure which we have calculated is only for these $p_z$ electrons, and for each $\mathbf{k}$ point we should fill the energy values with two electrons. At zero temperature, we fill the lowest possible energy values. As we can put two electrons with opposite spin in the negative energy states, we conclude that these states are filled at zero temperature, whereas the positive energy states are empty. The picture we now have obtained is that of a filled Dirac sea of negative energy states, and for energies close to zero (i.e. close to the Fermi energy) we have a dispersion relation
\[
E = v_F|\mathbf{q}| = c|\mathbf{p}|,
\]
where in the second expression we have emphasised the similarity to massless particles in relativity.
We want to show that the electrons in graphene satisfy a Dirac equation in two dimensions. Let us therefore first analyze such a Dirac theory. In section 17.1, Eq. (17.1), we have seen that we need 'objects' $\alpha_i$, $i = x, y$ and $\beta$, satisfying
\[
\left\{\alpha_i, \alpha_j\right\} = 2\delta_{ij}, \qquad \left\{\alpha_i, \beta\right\} = 0,
\]
and
\[
\alpha_i^2 = \beta^2 = 1.
\]

The fact that we now only need two $\alpha_i$'s makes it much easier to find a solution than in the four-dimensional case we studied before: we choose $\alpha_i = \sigma_i$, and $\beta = \sigma_z$! The Dirac Hamiltonian should then read
\[
H = -i\boldsymbol{\sigma}\cdot\nabla - \beta m,
\]
with $\boldsymbol{\sigma} = (\sigma_x, \sigma_y)$ and $\beta = \sigma_z$, and where the dot product is of course two-dimensional.
In order to show that in graphene the electrons are indeed described by such a Hamiltonian, recall that the wavefunction could be written in the form of a two-spinor, see Eq. (18.2).
Let us return to the Hamiltonian in the Fock space which describes particles hopping to nearest-neighbour positions:
\[
H = -t\sum_{\langle n_Am_B\rangle}\left(a^\dagger_{n_A}b_{m_B} + b^\dagger_{m_B}a_{n_A}\right),
\]

where $\langle n_Am_B\rangle$ denotes nearest-neighbour pairs on the lattice as usual. Note that the points $n_A$ and $m_B$ both belong to some unit cell of the hexagonal lattice. These cells are indicated by a position vector $\mathbf{R}_n$, which may be the position of the $A$-atom in that cell – other conventions, such as the point midway of a 'horizontal' point pair in figure 18.2, are possible. We know that the dispersion relation has Dirac cones located near the points $\mathbf{K}$ and $\mathbf{K}'$ in that figure. We want to describe the electrons with wave vectors close to these points. To this end, we consider the Fourier transforms of the operators $a_{n_A}$ etcetera:
\[
a_\mathbf{k} = \frac{1}{\sqrt{N_c}}\sum_n e^{i\mathbf{k}\cdot\mathbf{R}_n}a_{n_A},
\]
where the sum is over the $N_c$ cells, $n_A$ indicates the $A$-point within the cell located at $\mathbf{R}_n$, and $\mathbf{k}$ is a vector inside the Brillouin zone. The inverse transform is
\[
a_{n_A} = \frac{1}{\sqrt{N_c}}\sum_{\mathbf{k}\in\mathrm{BZ}}a(\mathbf{k})e^{-i\mathbf{k}\cdot\mathbf{R}_n}.
\]
Furthermore
\[
a^\dagger_\mathbf{k} = \frac{1}{\sqrt{N_c}}\sum_n e^{-i\mathbf{k}\cdot\mathbf{R}_n}a^\dagger_{n_A},
\]
and we have similar transforms for the $b_{n_B}$.
Now we make an important step: as we want to describe the behaviour near $\mathbf{K}$ and $\mathbf{K}'$, we write
\[
a_{n_A} = \frac{1}{\sqrt{N_c}}\sum_{\mathbf{q}\ \mathrm{small}}\left[a(\mathbf{K} + \mathbf{q})e^{-i(\mathbf{K}+\mathbf{q})\cdot\mathbf{R}_n} + a(\mathbf{K}' + \mathbf{q})e^{-i(\mathbf{K}'+\mathbf{q})\cdot\mathbf{R}_n}\right],
\]
i.e., we focus on the regions near the two Dirac points. We now define
\[
a_{1,n_A} = \frac{1}{\sqrt{N_c}}\sum_\mathbf{q}a(\mathbf{K} + \mathbf{q})e^{-i\mathbf{q}\cdot\mathbf{R}_n}
\]
and
\[
a_{2,n_A} = \frac{1}{\sqrt{N_c}}\sum_\mathbf{q}a(\mathbf{K}' + \mathbf{q})e^{-i\mathbf{q}\cdot\mathbf{R}_n},
\]
so that
\[
a_{n_A} = e^{-i\mathbf{K}\cdot\mathbf{R}_n}a_{1,n_A} + e^{-i\mathbf{K}'\cdot\mathbf{R}_n}a_{2,n_A}.
\]
A similar approach for the $B$ particles gives
\[
b_{n_B} = e^{-i\mathbf{K}\cdot\mathbf{R}_n}b_{1,n_B} + e^{-i\mathbf{K}'\cdot\mathbf{R}_n}b_{2,n_B}.
\]

Note that the Fourier expansions of the $a$ and $b$ operators both contain an $\mathbf{R}_n$: the $a$-operators automatically pertain to the $A$-point of that cell, and the $b$-operators to the $B$-point. As the $a_{i,n_A}$ and the $b_{i,n_B}$ are expanded in terms of Fourier components with small $\mathbf{q}$, their spatial variation is small.

The Hamiltonian now reads
\[
H = -t\sum_{\langle n_Am_B\rangle}\left(e^{i\mathbf{K}\cdot\mathbf{R}_n}a^\dagger_{1,n_A} + e^{i\mathbf{K}'\cdot\mathbf{R}_n}a^\dagger_{2,n_A}\right)\left(e^{-i\mathbf{K}\cdot\mathbf{R}_m}b_{1,m_B} + e^{-i\mathbf{K}'\cdot\mathbf{R}_m}b_{2,m_B}\right) + \mathrm{h.c.},
\]
where 'h.c.' denotes the Hermitian conjugate of the term written down.
Let us analyse the term
\[
H_{11} = -t\sum_{\langle n_Am_B\rangle}e^{i\mathbf{K}\cdot\mathbf{R}_n}a^\dagger_{1,n_A}e^{-i\mathbf{K}\cdot\mathbf{R}_m}b_{1,m_B} + \mathrm{h.c.} = -t\sum_{\langle n_Am_B\rangle}e^{i\mathbf{K}\cdot(\mathbf{R}_n - \mathbf{R}_m)}a^\dagger_{1,n_A}b_{1,m_B} + \mathrm{h.c.}
\]

Careful inspection of Fig. 18.2 leads to the conclusion that for an $A$-point in cell $\mathbf{R}_n$, the three neighbouring $B$-points are in the cells with $\mathbf{R}_m = \mathbf{R}_n - \mathbf{a}_1$, $\mathbf{R}_m = \mathbf{R}_n - \mathbf{a}_2$ and $\mathbf{R}_m = \mathbf{R}_n - \mathbf{a}_1 - \mathbf{a}_2$. This implies that if we act on a wave function $|\psi\rangle$ with components
\[
\left\langle n_c|\psi\right\rangle = \begin{pmatrix}\phi_A(n_c)\\ \phi_B(n_c)\end{pmatrix},
\]
where $n_c$ denotes, as usual, a cell, we see that, after acting with the Hamiltonian on this wavefunction, the new upper component, corresponding to $\phi_A(n)$, becomes
\[
-t\left(e^{i\mathbf{K}\cdot\mathbf{a}_1}\phi_B(\mathbf{R}_n - \mathbf{a}_1) + e^{i\mathbf{K}\cdot\mathbf{a}_2}\phi_B(\mathbf{R}_n - \mathbf{a}_2) + e^{i\mathbf{K}\cdot(\mathbf{a}_1+\mathbf{a}_2)}\phi_B(\mathbf{R}_n - \mathbf{a}_1 - \mathbf{a}_2)\right).
\]
We work this out and Taylor-expand the $\phi_B$ around $\mathbf{R}_n$ (this is possible because all our $\phi$'s are slowly varying functions of the position) to obtain as a new upper component
\[
\frac{3ta}{2}\left(-\partial_x + i\partial_y\right)\phi_B(\mathbf{R}_n). \tag{18.4}
\]
Now we consider the Hermitian conjugate of this term in $H_{11}$:
\[
-t\sum_{\langle n_Bm_A\rangle}e^{i\mathbf{K}\cdot(\mathbf{R}_n - \mathbf{R}_m)}b^\dagger_{1,n_B}a_{1,m_A}.
\]
This term acts on the wavefunction on the neighbouring sites $m_A$ of $n_B$ and it uses the values on those sites to fill the new value of the wavefunction on the site $n_B$. These $A$-neighbours are located in cells at a relative distance $-\mathbf{a}_1$, $-\mathbf{a}_2$ and $0$ with respect to $\mathbf{R}_n$. Writing this out we obtain a value of the $B$-component of $H|\psi\rangle$ at $n_B$:
\[
\frac{3ta}{2}\left(\partial_x + i\partial_y\right)\phi_A(\mathbf{R}_n). \tag{18.5}
\]
The two equations (18.4) and (18.5) can be written as
\[
\left\langle n_c|H|\phi\right\rangle = \frac{3ta}{2}\begin{pmatrix}0 & -\partial_x + i\partial_y\\ \partial_x + i\partial_y & 0\end{pmatrix}\begin{pmatrix}\phi_A(\mathbf{R}_n)\\ \phi_B(\mathbf{R}_n)\end{pmatrix}.
\]
This can also be written as
\[
\left\langle n_c|H|\phi\right\rangle = -\frac{3ta}{2}\,i\left(-\partial_x\sigma_y + \partial_y\sigma_x\right)\left\langle n_c|\psi\right\rangle,
\]
where $\sigma_x$ and $\sigma_y$ are Pauli matrices. These matrices have a fixed form, but the choice of the $x$ and $y$ axes is of course arbitrary. Swapping these two ($x \to -y$, $y \to x$) we can write the Hamiltonian in the form
\[
H = -\frac{3ita}{2}\,\boldsymbol{\sigma}\cdot\nabla,
\]
which is recognised as the Dirac Hamiltonian in two dimensions for massless particles.
The terms coupling the $a_2$ and $b_2$ operators give the same Hamiltonian, whereas the coupling terms involving $a_1$ and $b_2$ contain quickly oscillating terms which vanish in the sum over the lattice. We see that we have two independent systems of Dirac Hamiltonians for massless particles. The two systems correspond to the labels 1 and 2, which can ultimately be related to the two Dirac points $\mathbf{K}$ and $\mathbf{K}'$. The two components of the wavefunctions correspond to $A$ and $B$ and are therefore related to the fact that graphene has a hexagonal lattice with two points in the unit cell.
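The same linearization can be done numerically (a sketch under the same assumptions as before, with $t = 2.8$ eV again assumed): the off-diagonal Bloch matrix element $f(\mathbf{k}) = e^{-i\mathbf{k}\cdot(\mathbf{a}_1+\mathbf{a}_2)} + e^{-i\mathbf{k}\cdot\mathbf{a}_1} + e^{-i\mathbf{k}\cdot\mathbf{a}_2}$ vanishes at $\mathbf{K}$ and its gradient there has components of magnitude $3ta/2$, which is the structure behind Eqs. (18.4) and (18.5).

```python
# Linearize the off-diagonal Bloch matrix element f(k) around the Dirac point K:
# -t f(K + q) is linear in q with coefficients of magnitude 3ta/2.
import numpy as np

t, a = 2.8, 1.42                 # eV and Angstrom; t is an assumed literature value
a1 = (a / 2) * np.array([3.0,  np.sqrt(3.0)])
a2 = (a / 2) * np.array([3.0, -np.sqrt(3.0)])
K  = (2 * np.pi / (3 * a)) * np.array([1.0, 1.0 / np.sqrt(3.0)])

def f(k):
    return sum(np.exp(-1j * (k @ d)) for d in (a1 + a2, a1, a2))

eps = 1e-6
dfdx = (f(K + [eps, 0]) - f(K - [eps, 0])) / (2 * eps)
dfdy = (f(K + [0, eps]) - f(K - [0, eps])) / (2 * eps)

print(abs(f(K)))                 # ~0 at the Dirac point
print(-t * dfdx, -t * dfdy)      # ~ i*(3ta/2) and -(3ta/2): both of magnitude 3ta/2
print(1.5 * t * a)
```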

18.5 PROBLEMS
1. (20 pts) In this problem, we study electrons in graphene subject to a magnetic field perpendicular to the graphene sheet. In the lectures, the Hamiltonian for electrons (without a magnetic field) was shown to be
\[
-iv_F\,\boldsymbol{\sigma}\cdot\nabla\psi(\mathbf{r}) = E\psi(\mathbf{r}).
\]
Here, $\boldsymbol{\sigma}$ is a vector whose components are the $2\times2$ Pauli matrices $\sigma_x$ and $\sigma_y$, $\psi(\mathbf{r})$ is a two-spinor and $\mathbf{r}$ is a two-dimensional position vector on the sheet, which is taken to be in the $xy$ plane.
The effect of the magnetic field is accounted for by replacing
\[
-i\nabla \to -i\nabla + e\mathbf{A}/c.
\]
We use the 'Landau gauge':
\[
\mathbf{A} = B(-y, 0, 0).
\]

(a) Show that this leads to a magnetic field of magnitude $B$ along the positive $z$-axis.
(b) Write out the two components of the Dirac equation for electrons in graphene in the presence of a magnetic field. Write $\psi(\mathbf{r}) = e^{ikx}\phi(y)$ ($\phi(y)$ is then a two-spinor too) and show that the Dirac equation can be cast into the form
\[
\omega_c\begin{pmatrix}0 & \mathcal{O}\\ \mathcal{O}^\dagger & 0\end{pmatrix}\phi(\xi) = E\phi(\xi),
\]
where
\[
\xi = \frac{y}{l_B} - l_Bk, \qquad l_B = \sqrt{\frac{c}{eB}}.
\]
The parameter $l_B$ is called the 'magnetic length'. Here
\[
\mathcal{O} = \frac{1}{\sqrt{2}}\left(\partial_\xi + \xi\right).
\]
Give $\omega_c$ and the form of $\mathcal{O}^\dagger$.

This Dirac equation can also be written as
\[
\left(\mathcal{O}\sigma_+ + \mathcal{O}^\dagger\sigma_-\right)\phi = \frac{2E}{\omega_c}\phi.
\]

(c) Show that $[\mathcal{O}, \mathcal{O}^\dagger] = 1$, i.e., the operators $\mathcal{O}^\dagger$ and $\mathcal{O}$ are boson creation and annihilation operators, respectively. Show that, given a ground state wave function
\[
\mathcal{O}\chi_0(\xi) = 0,
\]
all solutions can be found as
\[
\phi_{N,\pm} = \begin{pmatrix}\chi_{N-1}(\xi)\\ \pm\chi_N(\xi)\end{pmatrix},
\]
with $\chi_{-1}(\xi) = 0$, and energies
\[
E_\pm = \pm\omega_c\sqrt{N}.
\]
Oscillations with this particular structure have been observed for graphene.
Oscillations with this particular structure have been observed for graphene.

(d) The existence of a zero-energy state in graphene, observed in experiments as a resistance peak (see figure), is called the 'anomalous integer quantum Hall effect'. Here, $n$ on the horizontal axis denotes the electron density, i.e. $n = 0$ means charge-neutral graphene. What can you say about the Fermi energy in that case in view of the band structure of graphene?
Explain why the observation in the graph is called 'anomalous' in view of the density of states of graphene at zero magnetic field.
