Notes of College Physics

Notes of College Physics, including classical mechanics, classical field theory, general relativity, quantum mechanics, quantum field theory, and statistical physics.

Uploaded by songshengyuyang
Copyright © All Rights Reserved

Physics

Summary is the best way to say “Good Bye”

Editor: Yuyang Songsheng
Date: August 1, 2021
Email: [email protected]

Version: 1.00
Contents

I Classical Mechanics 13

1 The Formulation of Classical Mechanics 14


1.1 Lagrangian formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2 Symmetry and Conservation Laws (1) . . . . . . . . . . . . . . . . . . . . . . 15
1.3 Hamiltonian formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.1 Poisson Brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3.2 Canonical transformations . . . . . . . . . . . . . . . . . . . . . . . 17
1.3.3 Evolution as canonical transformations . . . . . . . . . . . . . . . . 18
1.3.4 Liouville’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4 Symmetry and Conservation Laws (2) . . . . . . . . . . . . . . . . . . . . . . 19
1.5 Hamilton-Jacobi equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.6 Symmetry and Conservation Laws (3) . . . . . . . . . . . . . . . . . . . . . . 20

2 Two-Body Problem 21


2.1 Reduced mass and central field . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Kepler Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3 Disintegration and collisions of particles . . . . . . . . . . . . . . . . . . . . 24
2.4 Scattering and cross section . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3 Small Oscillation 30
3.1 Small oscillation in one dimension . . . . . . . . . . . . . . . . . . . . . . . 30
3.2 Forced oscillation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3 Non-linear oscillation and perturbation theory . . . . . . . . . . . . . . . . . 31
3.4 Oscillations of systems with more than one degree of freedom . . . . . . . . 33

4 Motion of a Rigid Body 35


4.1 Angular velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2 Dynamics of rigid body . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 Euler angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

II Classical Field Theory 41

5 Special Relativity 42
5.1 The principle of relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.2 Relativistic Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43


5.3 Relativistic Scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.3.1 Distribution function . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.3.2 Invariant cross section . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.3.3 Elastic scattering between two particles . . . . . . . . . . . . . . . . 46

6 Classical Field Theory 48


6.1 Lagrangian formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.2 Symmetry and conservation law . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.3 Functional derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6.4 Hamiltonian formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.4.1 Poisson bracket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.4.2 Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.4.3 Angular momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

7 Classical Electrodynamics 54
7.1 The formulation of classical electrodynamics . . . . . . . . . . . . . . . . . . 54
7.1.1 Maxwell’s equations and Lorentz force . . . . . . . . . . . . . . . . . 54
7.1.2 Lorentz transformation of fields . . . . . . . . . . . . . . . . . . . . 55
7.1.3 Energy-momentum tensor . . . . . . . . . . . . . . . . . . . . . . . 55
7.1.4 Charged particles in a given EM field . . . . . . . . . . . . . . . . . . 57
7.2 Constant electromagnetic field . . . . . . . . . . . . . . . . . . . . . . . . . 59
7.2.1 Coulomb’s law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
7.2.2 Multipole moments . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
7.2.3 Biot-Savart law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.2.4 Magnetic moment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.3 Electromagnetic waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
7.3.1 Electromagnetic waves . . . . . . . . . . . . . . . . . . . . . . . . . 62
7.3.2 Monochromatic wave . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.3.3 Partially polarized light . . . . . . . . . . . . . . . . . . . . . . . . . 65
7.4 The field of moving charges . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
7.4.1 Retarded potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
7.4.2 Spectral resolution of the retarded potentials . . . . . . . . . . . . . 68
7.5 Radiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
7.5.1 Far field approximation . . . . . . . . . . . . . . . . . . . . . . . . . 69
7.5.2 Low velocity approximation . . . . . . . . . . . . . . . . . . . . . . . 70
7.5.3 Radiation from a rapidly moving charge . . . . . . . . . . . . . . . . 71
7.6 The interaction between charged particles and EM field . . . . . . . . . . . . 73
7.6.1 Radiation reaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.6.2 Scattering by free charges . . . . . . . . . . . . . . . . . . . . . . . . 74

III General Relativity 76


8 Elementary Differential Geometry 77
8.1 Fundamental concepts of differential manifolds . . . . . . . . . . . . . . . 77


8.2 Multilinear algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8.3 Vector Bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
8.4 Tangent vector field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.5 Exterior differential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.6 Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
8.7 Riemannian manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

9 A Geometrical Description of Newtonian Theory 98


9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
9.2 Geometric structure of Newtonian spacetime . . . . . . . . . . . . . . . . . . 98
9.3 Geometric formulation of Newtonian gravity . . . . . . . . . . . . . . . . . . 99
9.4 Standard formulation of Newtonian gravity . . . . . . . . . . . . . . . . . . . 100
9.5 Galilean coordinate system . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
9.6 Coordinate transformation in space . . . . . . . . . . . . . . . . . . . . . . . 101

10 More on the Geometry of Spacetime 102


10.1 Hodge dual operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
10.2 Metric-induced properties of Riemann curvature tensor . . . . . . . . . . . . 103
10.3 Killing vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
10.4 The coordinates of observer . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
10.5 Hypersurfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
10.5.1 Description of hypersurfaces . . . . . . . . . . . . . . . . . . . . . . 106
10.5.2 Integration on hypersurfaces . . . . . . . . . . . . . . . . . . . . . . 107
10.5.3 Differentiation of tangent vector fields . . . . . . . . . . . . . . . . . 109

11 Formulation of General Relativity 111


11.1 Basic assumptions of general relativity . . . . . . . . . . . . . . . . . . . . . 111
11.2 Lagrangian formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
11.2.1 Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
11.2.2 Field Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
11.2.3 General relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
11.3 Hamiltonian formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
11.3.1 3+1 decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
11.3.2 Field theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
11.3.3 General relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

12 Perturbation Theory and Gravitational Radiation 117


12.1 The linearized theory of gravity . . . . . . . . . . . . . . . . . . . . . . . . . 117
12.2 Nearly Newtonian gravitational fields . . . . . . . . . . . . . . . . . . . . . . 118
12.3 Gravitational wave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
12.4 Nonlinear effects in gravitational waves . . . . . . . . . . . . . . . . . . . . . 122
12.4.1 The shortwave approximation . . . . . . . . . . . . . . . . . . . . . . 122
12.4.2 Effect of background curvature on wave propagation . . . . . . . . . 124
12.4.3 Stress-energy tensor for gravitational waves . . . . . . . . . . . . . . 125
12.5 Conservation laws for 4-momentum and angular momentum . . . . . . . . . 125


12.5.1 4-momentum and angular momentum . . . . . . . . . . . . . . . . . 125
12.5.2 Conservation laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
12.6 Production of gravitational waves . . . . . . . . . . . . . . . . . . . . . . . . 128

13 Black Holes 130


13.1 Schwarzschild black holes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
13.1.1 Schwarzschild metric . . . . . . . . . . . . . . . . . . . . . . . . . . 130
13.1.2 Geodesics of Schwarzschild spacetime . . . . . . . . . . . . . . . . . 131
13.1.3 Penrose diagram and event horizon . . . . . . . . . . . . . . . . . . 134
13.1.4 The maximally extended Schwarzschild solution . . . . . . . . . . . 135
13.2 Reissner-Nordström black holes . . . . . . . . . . . . . . . . . . . . . . . . . 138
13.3 Kerr black holes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
13.3.1 Kerr metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
13.3.2 Static limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
13.3.3 Penrose process and the area of the event horizon . . . . . . . . . . . 143
13.3.4 Particle orbits in the Kerr metric . . . . . . . . . . . . . . . . . . . . 144

14 Geometry of the Universe 146


14.1 Friedmann–Lemaître–Robertson–Walker metric . . . . . . . . . . . . . . . . 146
14.2 Observable quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

IV Quantum Mechanics 149

15 Linear Algebra 150


15.1 Linear Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
15.2 Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
15.3 Self-Adjoint operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
15.4 Rigged Hilbert space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
15.5 Unitary operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
15.6 Antiunitary operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

16 Formulation of Quantum Mechanics 160


16.1 Axioms of quantum mechanics . . . . . . . . . . . . . . . . . . . . . . . . . 160
16.2 Transformations of States . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
16.3 Schrödinger equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
16.4 Position operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
16.5 Momentum operators and canonical quantization . . . . . . . . . . . . . . . 162
16.6 Momentum operators and translation of states . . . . . . . . . . . . . . . . . 163
16.7 Angular momentum operators and rotation of states . . . . . . . . . . . . . . 164
16.8 Heisenberg picture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
16.9 Symmetries and conservation laws . . . . . . . . . . . . . . . . . . . . . . . 165

17 Coordinate and Momentum Representation 166


17.1 Coordinate representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166


17.2 Galilei transformation of Schrödinger equation . . . . . . . . . . . . . . . . 167
17.3 Probability flux and conditions on wave functions . . . . . . . . . . . . . . . 167
17.4 Path integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
17.5 Momentum representation . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
17.6 Harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
17.6.1 Algebraic solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
17.6.2 Solution in coordinate representation . . . . . . . . . . . . . . . . . 172
17.6.3 Path integral solution . . . . . . . . . . . . . . . . . . . . . . . . . . 173
17.7 Quantum mechanics in a classical electromagnetic field . . . . . . . . . . . . . 174
17.7.1 General discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
17.7.2 Motion in a uniform static magnetic field . . . . . . . . . . . . . . . 175
17.7.3 The Aharonov-Bohm effect . . . . . . . . . . . . . . . . . . . . . . . 176

18 Angular Momentum 178


18.1 Eigenvalues of angular momentum operators . . . . . . . . . . . . . . . . . . 178
18.2 Orbital Angular Momentum and Spin . . . . . . . . . . . . . . . . . . . . . 179
18.3 Rotation operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
18.4 Addition of angular momentum . . . . . . . . . . . . . . . . . . . . . . . . . 184
18.5 Tensor operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
18.6 Spherical potential well . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

19 Discrete Symmetries 192


19.1 Space inversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
19.2 Time reversal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

20 Approximation Method 196


20.1 Time-independent perturbation theory . . . . . . . . . . . . . . . . . . . . . 196
20.1.1 Brillouin-Wigner perturbation theory . . . . . . . . . . . . . . . . . 196
20.1.2 Nondegenerate perturbation theory . . . . . . . . . . . . . . . . . . 197
20.1.3 Degenerate perturbation theory . . . . . . . . . . . . . . . . . . . . 198
20.2 Application of time-independent perturbation theory to the hydrogen atom . . . 199
20.2.1 Stark effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
20.2.2 Fine structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
20.2.3 Zeeman effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
20.2.4 Hyperfine structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
20.3 Time-dependent perturbation theory . . . . . . . . . . . . . . . . . . . . . . 208
20.3.1 Dyson series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
20.3.2 Constant and harmonic perturbation . . . . . . . . . . . . . . . . . . 209
20.3.3 Transition probability . . . . . . . . . . . . . . . . . . . . . . . . . . 210
20.4 Atomic radiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
20.5 Variational method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
20.5.1 The formulation of variational method . . . . . . . . . . . . . . . . . 212
20.5.2 Bound states and the virial theorem . . . . . . . . . . . . . . . . . . 213
20.6 WKB method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213


20.6.1 The semi-classical expansion . . . . . . . . . . . . . . . . . . . . . . 214
20.6.2 A linear potential and the Airy function . . . . . . . . . . . . . . . . 215
20.6.3 Bound state spectrum . . . . . . . . . . . . . . . . . . . . . . . . . . 216
20.7 Slowly changing Hamiltonians . . . . . . . . . . . . . . . . . . . . . . . . . . 218
20.7.1 The adiabatic approximation . . . . . . . . . . . . . . . . . . . . . . 218
20.7.2 Berry phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
20.7.3 The Born-Oppenheimer approximation . . . . . . . . . . . . . . . . 220

21 Many-Body Problem 222


21.1 Identical particles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
21.2 Non-relativistic quantum field theory . . . . . . . . . . . . . . . . . . . . . . 224
21.2.1 Motivation and formulation of quantum field theory . . . . . . . . . 224
21.2.2 Particles in quantum field theory . . . . . . . . . . . . . . . . . . . . 225
21.2.3 Momentum space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
21.2.4 Fermions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228

22 Scattering Theory 229


22.1 Scattering in one dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
22.1.1 Reflection and transmission amplitudes . . . . . . . . . . . . . . . . 230
22.1.2 S-matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
22.1.3 Bound states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
22.1.4 Resonances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
22.2 Lippmann–Schwinger equation . . . . . . . . . . . . . . . . . . . . . . . . . 234
22.3 Born approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
22.4 Partial wave analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
22.4.1 Partial wave expansion . . . . . . . . . . . . . . . . . . . . . . . . . 238
22.4.2 Hard sphere scattering . . . . . . . . . . . . . . . . . . . . . . . . . 240
22.4.3 Attractive potential . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
22.5 Resonance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
22.5.1 Delta-shell potential . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
22.5.2 General description of resonances . . . . . . . . . . . . . . . . . . . 244
22.6 Two-to-two scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
22.7 Time-dependent formulation of scattering theory . . . . . . . . . . . . . . . 245

V Quantum Field Theory 247


23 Elementary Group Theory 248
23.1 Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
23.2 Representation theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
23.3 Representations of the symmetric groups . . . . . . . . . . . . . . . . . . . . 253
23.4 Lie Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
23.4.1 Lie groups in general . . . . . . . . . . . . . . . . . . . . . . . . . . 258
23.4.2 SO(N) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
23.4.3 SU(N) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260


23.5 Lie algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
23.5.1 Lie algebra in general . . . . . . . . . . . . . . . . . . . . . . . . . . 261
23.5.2 Compact Lie algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
23.5.3 SO(N) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
23.5.4 SU(N) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
23.5.5 Sp(2l) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
23.5.6 Classification of compact Lie algebras . . . . . . . . . . . . . . . . . 268
23.6 Spinor representations of orthogonal algebras . . . . . . . . . . . . . . . . . 270

24 From Classical Field to Quantum Field 274


24.1 Canonical quantization of classical field . . . . . . . . . . . . . . . . . . . . . 274
24.2 Lorentz invariance in quantum field theory . . . . . . . . . . . . . . . . . . . 274
24.3 Symmetry and conservation law . . . . . . . . . . . . . . . . . . . . . . . . . 275
24.4 Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
24.5 Angular Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
24.6 Anticommutation relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277

25 Scalar Field 278


25.1 Klein-Gordon field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
25.2 Canonical quantization formulation . . . . . . . . . . . . . . . . . . . . . . 278
25.3 Perturbation theory for canonical quantization . . . . . . . . . . . . . . . . . 280
25.3.1 Perturbation expansion of correlation functions . . . . . . . . . . . . 280
25.3.2 Feynman diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
25.4 Path integral formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
25.4.1 Frameworks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
25.4.2 Free field theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
25.4.3 Interacting field theory . . . . . . . . . . . . . . . . . . . . . . . . . 283
25.4.4 Symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
25.5 Scattering matrix and cross section . . . . . . . . . . . . . . . . . . . . . . . 285
25.6 LSZ reduction formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
25.6.1 Field strength renormalization . . . . . . . . . . . . . . . . . . . . . 287
25.6.2 LSZ reduction formula . . . . . . . . . . . . . . . . . . . . . . . . . 289
25.7 Renormalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
25.7.1 Counting of ultraviolet divergence . . . . . . . . . . . . . . . . . . . 291
25.7.2 Renormalized perturbation theory . . . . . . . . . . . . . . . . . . . 291
25.7.3 Techniques for evaluating loop diagrams . . . . . . . . . . . . . . . . 293
25.7.4 One-loop structure of ϕ⁴ theory . . . . . . . . . . . . . . . . . . . . 294
25.7.5 General renormalization theory . . . . . . . . . . . . . . . . . . . . 296
25.8 Renormalization group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
25.8.1 Modified minimal-subtraction scheme . . . . . . . . . . . . . . . . . 299
25.8.2 Equations of the renormalization group . . . . . . . . . . . . . . . . 300
25.8.3 Running of coupling constants . . . . . . . . . . . . . . . . . . . . . 302
25.9 Spontaneous symmetry breaking . . . . . . . . . . . . . . . . . . . . . . . . 304
25.9.1 Effective action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304


25.9.2 Calculation of the effective action . . . . . . . . . . . . . . . . . . . 305
25.9.3 The effective action as a generating functional . . . . . . . . . . . . . 306
25.9.4 Renormalization and symmetry . . . . . . . . . . . . . . . . . . . . 307
25.9.5 Goldstone’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
25.10 Linear sigma model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
25.10.1 Symmetry breaking . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
25.10.2 Renormalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
25.10.3 Effective action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
25.11 Optical theorem and unstable particles . . . . . . . . . . . . . . . . . . . . . 314
25.11.1 Optical theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
25.11.2 Unstable Particles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
25.12 Non-relativistic limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
25.12.1 Complex Klein-Gordon field . . . . . . . . . . . . . . . . . . . . . . 316
25.12.2 Non-relativistic limit . . . . . . . . . . . . . . . . . . . . . . . . . . 317

26 Spinor Field 318


26.1 Representation of the Lorentz group . . . . . . . . . . . . . . . . . . . . . . 318
26.2 Spin-statistics theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
26.3 Spinor field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
26.4 Dynamics of spinor fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
26.5 Canonical quantization formulation . . . . . . . . . . . . . . . . . . . . . . . 326
26.5.1 Canonical quantization of left-handed Weyl field . . . . . . . . . . . 326
26.5.2 Canonical quantization of Dirac field . . . . . . . . . . . . . . . . . . 327
26.5.3 Canonical quantization of Majorana field . . . . . . . . . . . . . . . 331
26.6 Parity, time reversal and charge conjugation . . . . . . . . . . . . . . . . . . 331
26.7 Perturbation theory for canonical quantization . . . . . . . . . . . . . . . . . 334
26.8 Path integral formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
26.8.1 Grassmann numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 336
26.8.2 Path integral formulation for free Dirac field . . . . . . . . . . . . . . 337
26.8.3 Perturbation theory for path integral quantization . . . . . . . . . . . 338
26.9 LSZ reduction formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
26.10 Functional determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341

27 Vector Field 343


27.1 Electromagnetic field and gauge invariance . . . . . . . . . . . . . . . . . . . 343
27.2 Canonical quantization of EM field . . . . . . . . . . . . . . . . . . . . . . . 344
27.2.1 Canonical quantization in Coulomb gauge . . . . . . . . . . . . . . . 344
27.2.2 Canonical quantization in Lorenz gauge . . . . . . . . . . . . . . . . 346
27.3 Perturbation theory for canonical quantization . . . . . . . . . . . . . . . . . 349
27.3.1 Coulomb gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
27.3.2 Lorenz gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
27.4 Path integral quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
27.4.1 Path integral formulation for free EM field . . . . . . . . . . . . . . . 352
27.4.2 Perturbation theory for path integral quantization . . . . . . . . . . . 353


27.4.3 Ward-Takahashi identity (1) . . . . . . . . . . . . . . . . . . . . . . 354
27.5 Exact photon propagator in QED . . . . . . . . . . . . . . . . . . . . . . . . 355
27.5.1 Photon self-energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
27.5.2 Masslessness of photon in QED . . . . . . . . . . . . . . . . . . . . . 356
27.6 LSZ reduction formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
27.6.1 LSZ reduction formula and Feynman rules . . . . . . . . . . . . . . 357
27.6.2 Ward-Takahashi identity (2) . . . . . . . . . . . . . . . . . . . . . . 358
27.7 Renormalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
27.7.1 Renormalized quantum electrodynamics . . . . . . . . . . . . . . . . 359
27.7.2 One loop structure of QED . . . . . . . . . . . . . . . . . . . . . . . 362
27.7.3 Renormalization group . . . . . . . . . . . . . . . . . . . . . . . . . 364
27.7.4 Magnetic dipole moment . . . . . . . . . . . . . . . . . . . . . . . . 366
27.7.5 Lamb shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367

28 Gauge Field 369


28.1 Nonabelian gauge theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
28.1.1 Nonabelian symmetries . . . . . . . . . . . . . . . . . . . . . . . . . 369
28.1.2 Nonabelian gauge theory . . . . . . . . . . . . . . . . . . . . . . . . 371
28.1.3 Group representations . . . . . . . . . . . . . . . . . . . . . . . . . . 372
28.2 Quantization of nonabelian gauge theory . . . . . . . . . . . . . . . . . . . . 376
28.2.1 The path integral for nonabelian gauge theory . . . . . . . . . . . . . 376
28.2.2 The Feynman rules for nonabelian gauge theory . . . . . . . . . . . . 377
28.3 Renormalization of nonabelian gauge theory . . . . . . . . . . . . . . . . . . 378
28.4 Chiral gauge theories and anomalies . . . . . . . . . . . . . . . . . . . . . . 381
28.4.1 Anomalies in local symmetries . . . . . . . . . . . . . . . . . . . . . 381
28.4.2 Anomalies in global symmetries . . . . . . . . . . . . . . . . . . . . 383
28.4.3 Anomalies and the path integral for fermions . . . . . . . . . . . . . 385
28.5 Spontaneous breaking of gauge symmetries . . . . . . . . . . . . . . . . . . . 387
28.6 Quantization of spontaneously broken gauge theory . . . . . . . . . . . . . . 389
28.6.1 Spontaneously broken abelian gauge theory . . . . . . . . . . . . . . 389
28.6.2 Spontaneously broken nonabelian gauge theory . . . . . . . . . . . . 390
28.7 The Standard Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
28.7.1 Gauge and Higgs sector . . . . . . . . . . . . . . . . . . . . . . . . . 393
28.7.2 Lepton sector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
28.7.3 Quark sector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397

VI Statistical Mechanics And Field Theory 401

29 Thermodynamics 402
29.1 Central problem of thermodynamics . . . . . . . . . . . . . . . . . . . . . . 402
29.2 Entropy formulation of thermodynamics . . . . . . . . . . . . . . . . . . . . 403
29.2.1 Properties of the entropy function . . . . . . . . . . . . . . . . . . . . . . 403
29.2.2 Simple problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404


29.2.3 Heat and work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
29.3 Thermodynamic potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
29.3.1 Energy scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
29.3.2 Intensive variables and thermodynamic potentials . . . . . . . . . . . 407
29.3.3 Free energy and Maxwell relations . . . . . . . . . . . . . . . . . . . 408
29.3.4 Gibbs free energy and enthalpy . . . . . . . . . . . . . . . . . . . . . 409
29.3.5 Other thermodynamic potentials . . . . . . . . . . . . . . . . . . . . 410
29.3.6 The Euler and Gibbs-Duhem equations . . . . . . . . . . . . . . . . 410
29.4 Thermodynamic systems with multi-components . . . . . . . . . . . . . . . 411
29.4.1 Chemical reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
29.4.2 Phase coexistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
29.4.3 The Clausius-Clapeyron equation . . . . . . . . . . . . . . . . . . . . 412
29.4.4 Gibbs phase rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
29.4.5 The Critical Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414

30 Principles of Statistical Mechanics and Ensembles 415


30.1 Density matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
30.2 Statistical ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
30.2.1 Micro canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . 416
30.2.2 Canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
30.2.3 Grand canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . 417
30.3 Fluctuations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
30.3.1 Canonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
30.3.2 Grand canonical ensemble . . . . . . . . . . . . . . . . . . . . . . . 418

31 Interaction-free Systems 419


31.1 General discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
31.1.1 Bose-Einstein Statistics . . . . . . . . . . . . . . . . . . . . . . . . . 419
31.1.2 Fermi-Dirac Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 420
31.1.3 Boltzmann Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
31.2 Ideal Boltzmann Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
31.2.1 Molecules without internal motion . . . . . . . . . . . . . . . . . . . 421
31.2.2 Molecules with internal motion . . . . . . . . . . . . . . . . . . . . . 422
31.3 Ideal Bose Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
31.4 Ideal Fermi systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
31.5 Thermodynamics of the blackbody radiation . . . . . . . . . . . . . . . . . . 429

32 Quantum Field Theory in Statistical Physics 431


32.1 Superfluidity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
32.1.1 Experimental facts of Helium at low temperatures . . . . . . . . . . . 431
32.1.2 Mechanism of superfluidity . . . . . . . . . . . . . . . . . . . . . . . 432
32.2 Finite temperature perturbation theory . . . . . . . . . . . . . . . . . . . . . 435
32.3 Path integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435

33 Phase Transitions and the Renormalization Group 437


33.1 Order parameter and phase transition . . . . . . . . . . . . . . . . . . . . . . 437
33.2 Landau theory of phase transitions . . . . . . . . . . . . . . . . . . . . . . . 438
33.3 Renormalization group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
33.3.1 One-dimensional Ising model . . . . . . . . . . . . . . . . . . . . . 441
33.3.2 Gell-Mann–Low equations . . . . . . . . . . . . . . . . . . . . . . . 443
33.3.3 Analysis of the Gell-Mann–Low equation . . . . . . . . . . . . . . . 445
33.4 Critical exponents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
33.5 RG analysis of the ferromagnetic transition . . . . . . . . . . . . . . . . . . . 449
33.5.1 Preliminary dimensional analysis . . . . . . . . . . . . . . . . . . . . 449
33.5.2 Landau mean field theory . . . . . . . . . . . . . . . . . . . . . . . . 450
33.5.3 Gaussian model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
33.5.4 Renormalization group analysis . . . . . . . . . . . . . . . . . . . . 451
Part I

Classical Mechanics
Chapter 1
The Formulation of Classical Mechanics

One of the fundamental concepts of mechanics is that of a particle. By this we mean a body
whose dimensions may be neglected in describing its motion. The position of a particle in
space is defined by its radius vector r, whose components are its Cartesian coordinates x, y, z.
The derivative v = dr/dt of r with respect to the time t is called the velocity of the particle,
and the second derivative d2 r/dt2 is its acceleration.

To define the position of a system of $N$ particles in space, it is necessary to specify $N$ radius
vectors, i.e., $3N$ coordinates. The number of independent quantities which must be specified in
order to define uniquely the position of any system is called the number of degrees of freedom.
Any $s$ quantities $q_1, \cdots, q_s$ which completely define the position of a system with $s$ degrees of
freedom are called generalised coordinates of the system, and the derivatives $\dot q_i$ are called its
generalised velocities. The subscript index will be omitted except for emphasis.

If all the coordinates and velocities are simultaneously specified, the state of the system is
completely determined and its subsequent motion can be calculated. Mathematically, if all the
coordinates $q$ and velocities $\dot q$ are given at some instant, the accelerations $\ddot q$ at that instant are
uniquely defined. The relations between the accelerations, velocities and coordinates are called
the equations of motion. They are second-order differential equations for the functions $q(t)$,
and their integration makes possible the determination of these functions and so of the path
of the system.

1.1 Lagrangian formulation


The essence of classical mechanics can be summarized by the existence of a Lagrangian function
$L$ from which all the dynamics of a system can be derived. For a set of interacting particles,
this Lagrangian depends on the generalized coordinates and their derivatives. The dynamics
of this system can be obtained by requiring the action
$$ S = \int_{t_1}^{t_2} L(q_i, \dot q_i, t)\, dt \tag{1.1} $$
to be stationary under small perturbations of the evolution. (The perturbations of the coordinates
must vanish at the boundary of the integration domain.) The resulting equation of motion is
$$ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0. \tag{1.2} $$
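The stationarity condition can be checked numerically. The following is a minimal sketch, assuming a one-dimensional harmonic oscillator with $m = k = 1$ (an illustrative choice, not taken from the text) and hypothetical helper names `action` and `make_path`. It perturbs the true path $q = \cos t$ by a bump that vanishes at both endpoints and shows that the action changes only at second order in the perturbation.

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Discretised action: sum of (m v^2 / 2 - k q^2 / 2) dt over all segments."""
    s = 0.0
    for q0, q1 in zip(path, path[1:]):
        v = (q1 - q0) / dt              # velocity on the segment
        q_mid = 0.5 * (q0 + q1)         # midpoint coordinate
        s += (0.5 * m * v * v - 0.5 * k * q_mid * q_mid) * dt
    return s

def make_path(eps, n=2000, t_end=1.0):
    """True path cos(t) plus eps times a bump that vanishes at t = 0 and t = t_end."""
    dt = t_end / n
    path = []
    for i in range(n + 1):
        t = i * dt
        true_q = math.cos(t)                       # solves q'' + q = 0
        bump = math.sin(math.pi * t / t_end)       # zero at both endpoints
        path.append(true_q + eps * bump)
    return path, dt

def action_of_eps(eps):
    path, dt = make_path(eps)
    return action(path, dt)

s0 = action_of_eps(0.0)
s_plus = action_of_eps(1e-2)
s_minus = action_of_eps(-1e-2)

# Part of the change that is odd in eps: vanishes (up to discretisation error)
# when the unperturbed path satisfies the Euler-Lagrange equation.
first_order = (s_plus - s_minus) / 2
# Part that is even in eps: the genuine second-order change.
second_order = (s_plus + s_minus) / 2 - s0
```

Here `first_order` is close to zero while `second_order` is positive, so for this short time interval the true path is a local minimum of the action.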

1. If we transform the coordinates $q$ to $Q$ with $q = q(Q, t)$, the new Lagrangian will be
$$ L(Q, \dot Q, t) \equiv L\big(q(Q, t), \dot q(Q, \dot Q, t), t\big). \tag{1.3} $$
We can verify that
$$ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot Q}\right) - \frac{\partial L}{\partial Q} = 0. \tag{1.4} $$

2. If $L' = L + df(q, t)/dt$, then $L$ and $L'$ are equivalent and generate the same dynamical
equations.
Example:
1. The Lagrangian for an isolated system of particles in an inertial frame is
$$ L = \sum_a \frac{1}{2} m_a v_a^2 - U(\mathbf r_1, \mathbf r_2, \cdots). \tag{1.5} $$
The equation of motion is
$$ m_i \ddot{\mathbf r}_i = -\nabla_{\mathbf r_i} U. \tag{1.6} $$
To get this form of the Lagrangian for a system of interacting particles, we must assume:
• Space and time are homogeneous and isotropic in an inertial frame;
• Galileo’s relativity principle and the Galilean transformation;
• Instantaneous interaction between particles.
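2. Consider a reference frame $K$. Suppose $K$ is moving with velocity $\mathbf V(t)$ and
rotating with angular velocity $\mathbf\Omega$ relative to an inertial reference frame. We use the
coordinates of the mass point in $K$ as generalized coordinates, i.e., $\mathbf r = (x_k, y_k, z_k)$. Then
the Lagrangian of the mass point will be
$$ L = \frac12 m v^2 + m(\mathbf v + \mathbf V)\cdot(\mathbf\Omega\times\mathbf r) + \frac m2 (\mathbf\Omega\times\mathbf r)^2 - m\dot{\mathbf V}\cdot\mathbf r - U. \tag{1.7} $$
The equation of motion will be
$$ m\frac{d\mathbf v}{dt} = -\frac{\partial U}{\partial\mathbf r} - m(\dot{\mathbf V} + \mathbf\Omega\times\mathbf V) + m(\mathbf r\times\dot{\mathbf\Omega}) + 2m(\mathbf v\times\mathbf\Omega) + m[\mathbf\Omega\times(\mathbf r\times\mathbf\Omega)]. \tag{1.8} $$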

1.2 Symmetry and Conservation Laws(1)


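Theorem 1.1 Noether’s theorem

For $q_i \to q_i + \delta q_i$ and $L \to L + \delta L$, if $\delta L = df(q_i, \dot q_i, t)/dt$, we get
$$ \frac{d}{dt}\left(\sum_i p^i \delta q_i - f\right) = 0 \quad\text{where}\quad p^i \equiv \frac{\partial L}{\partial \dot q_i}. \tag{1.9} $$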

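Example: For an isolated system of particles in an inertial frame:

$\delta L = 0$ under the translation $\mathbf r_i \to \mathbf r_i + \delta\mathbf a$, so
$$ \frac{d}{dt}\left(\sum_i \mathbf p_i\right) = 0. \tag{1.10} $$

$\delta L = 0$ under the rotation $\mathbf r_i \to \mathbf r_i + \mathbf r_i\times\delta\boldsymbol\theta$, so
$$ \frac{d}{dt}\left(\sum_i \mathbf r_i\times\mathbf p_i\right) = 0. \tag{1.11} $$

Homogeneity of time: If $\partial L/\partial t = 0$, then we get
$$ \frac{dE}{dt} = 0 \quad\text{where}\quad E \equiv \sum_i \dot q_i p_i - L. \tag{1.12} $$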

1.3 Hamiltonian formulation


A mechanical system can also be characterized by a function called the Hamiltonian $H(q_i, p_i, t)$,
where $p_i$ is the generalized momentum conjugate to the coordinate $q_i$. The generalized coordinates and
generalized momenta together form the phase space of the system. If the coordinates of
the system are fixed at the instants $t_1$ and $t_2$, the evolution of the system will be determined by
requiring the integral
$$ \int_{t_1}^{t_2} \left[\sum_i p_i \dot q_i - H(q_i, p_i, t)\right] dt \tag{1.13} $$
to be stationary under small perturbations. We can derive the equations of motion
$$ \dot p_i = -\frac{\partial H}{\partial q_i}, \qquad \dot q_i = \frac{\partial H}{\partial p_i}. \tag{1.14} $$
If the Lagrangian of the system is given, the Hamiltonian of the system can be obtained from
$$ p_i = \frac{\partial L}{\partial\dot q_i}, \qquad H = \sum_i p_i \dot q_i - L, \tag{1.15} $$
ensuring that the two formulations are equivalent.


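Example: For an isolated system of particles in an inertial frame, the Hamiltonian is given by
$$ H = \sum_i \frac{p_i^2}{2m_i} + U(\mathbf r_1, \mathbf r_2, \cdots). \tag{1.16} $$
The equations of motion are therefore
$$ \dot{\mathbf p}_i = -\nabla_{\mathbf r_i} U, \qquad \dot{\mathbf r}_i = \frac{\mathbf p_i}{m_i}. \tag{1.17} $$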
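Hamilton’s equations of this form lend themselves to direct numerical integration. The sketch below is only an illustration: it assumes a single particle in one dimension with the made-up potential $U(q) = q^2/2 + q^4/4$ and uses a kick-drift-kick (leapfrog) scheme, a standard integrator chosen here for convenience rather than anything prescribed by the text. The names `grad_U`, `hamiltonian` and `leapfrog` are hypothetical.

```python
def grad_U(q):
    """dU/dq for the illustrative potential U(q) = q^2/2 + q^4/4."""
    return q + q ** 3

def hamiltonian(q, p, m=1.0):
    """H = p^2 / (2m) + U(q)."""
    return p * p / (2.0 * m) + 0.5 * q * q + 0.25 * q ** 4

def leapfrog(q, p, dt, steps, m=1.0):
    """Integrate dp/dt = -dU/dq, dq/dt = p/m with kick-drift-kick steps."""
    for _ in range(steps):
        p -= 0.5 * dt * grad_U(q)   # half kick
        q += dt * p / m             # full drift
        p -= 0.5 * dt * grad_U(q)   # half kick
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=1e-3, steps=10_000)

# H has no explicit time dependence, so it should stay (nearly) constant.
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

The small value of `energy_drift` reflects the conservation of $E$ for a time-independent Hamiltonian; the residual comes from the finite step size.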

1.3.1 Poisson Brackets


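Given two functions $f$, $g$ that depend on phase space and time, their Poisson bracket $\{f, g\}$
is another function that depends on phase space and time. The following rules hold for any
three functions $f$, $g$, $h$ of phase space and time:
$$ \text{Anticommutativity:}\quad \{f, g\} = -\{g, f\}; \tag{1.18a} $$
$$ \text{Bilinearity:}\quad \{af + bg, h\} = a\{f, h\} + b\{g, h\}, \quad a, b \in \mathbb R; \tag{1.18b} $$
$$ \text{Leibniz’s rule:}\quad \{fg, h\} = f\{g, h\} + \{f, h\}g; \tag{1.18c} $$
$$ \text{Jacobi identity:}\quad \{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0. \tag{1.18d} $$
If we assume that
$$ \{q_i, p_k\} = \delta_{ik}, \qquad \{q_i, q_k\} = 0, \qquad \{p_i, p_k\} = 0, \tag{1.19} $$
we can derive that
$$ \{f, g\} = \sum_k\left(\frac{\partial f}{\partial q_k}\frac{\partial g}{\partial p_k} - \frac{\partial f}{\partial p_k}\frac{\partial g}{\partial q_k}\right). \tag{1.20} $$
Hamilton’s equations can therefore be rewritten as
$$ \dot p_i = \{p_i, H\}, \qquad \dot q_i = \{q_i, H\}. \tag{1.21} $$
We can further get
$$ \frac{df}{dt} = \{f, H\} + \frac{\partial f}{\partial t}, \qquad \frac{d\{f,g\}}{dt} = \left\{\frac{df}{dt}, g\right\} + \left\{f, \frac{dg}{dt}\right\}. \tag{1.22} $$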

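Example: For a particle in an inertial frame, we have
$$ \{r_a, p_b\} = \delta_{ab}. \tag{1.23} $$
If we define the angular momentum $l_a \equiv \epsilon_{abc} r_b p_c$, we can obtain
$$ \{l_a, r_b\} = \epsilon_{abc} r_c, \qquad \{l_a, p_b\} = \epsilon_{abc} p_c, \qquad \{l_a, l_b\} = \epsilon_{abc} l_c. \tag{1.24} $$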
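The bracket relations for the angular momentum can be spot-checked numerically. The sketch below evaluates the canonical Poisson bracket of two phase-space functions by central finite differences at an arbitrarily chosen point; the helper `poisson_bracket` and the sample point are hypothetical and serve only as an illustration.

```python
def poisson_bracket(f, g, r, p, h=1e-5):
    """{f, g} = sum_k (df/dr_k dg/dp_k - df/dp_k dg/dr_k), by central differences."""
    def partial(func, which, k):
        r_plus, p_plus = list(r), list(p)
        r_minus, p_minus = list(r), list(p)
        if which == "r":
            r_plus[k] += h
            r_minus[k] -= h
        else:
            p_plus[k] += h
            p_minus[k] -= h
        return (func(r_plus, p_plus) - func(r_minus, p_minus)) / (2.0 * h)

    total = 0.0
    for k in range(3):
        total += partial(f, "r", k) * partial(g, "p", k)
        total -= partial(f, "p", k) * partial(g, "r", k)
    return total

# Components of l = r x p.
def l_x(r, p):
    return r[1] * p[2] - r[2] * p[1]

def l_y(r, p):
    return r[2] * p[0] - r[0] * p[2]

def l_z(r, p):
    return r[0] * p[1] - r[1] * p[0]

def r_y(r, p):
    return r[1]

r0 = [0.3, -1.2, 0.7]    # arbitrary sample point in phase space
p0 = [1.1, 0.4, -0.5]

bracket_lx_ly = poisson_bracket(l_x, l_y, r0, p0)   # expected to equal l_z
bracket_lx_ry = poisson_bracket(l_x, r_y, r0, p0)   # expected to equal r_z
```

Because the functions involved are at most bilinear in the phase-space variables, the central differences are exact up to round-off, so the comparison with $l_z$ and $r_z$ is tight.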

1.3.2 Canonical transformations


Consider a transformation in phase space $Q_i = Q_i(q_k, p_k, t)$, $P^i = P^i(q_k, p_k, t)$. If there exists
a function $H'(Q_i, P^i, t)$ satisfying
$$ \dot Q_i = \frac{\partial H'}{\partial P_i}, \qquad \dot P_i = -\frac{\partial H'}{\partial Q_i}, \tag{1.25} $$
the transformation is called a canonical transformation. In Hamiltonian mechanics, a
canonical transformation is a change of canonical coordinates that preserves the form of
Hamilton’s equations.
If $(q_i, p_i, H) \to (Q_i, P^i, H')$ is a canonical transformation, there will be a generating function
$F(q_i, Q_i, t)$ satisfying
$$ \sum_i p_i\dot q_i - H(p_i, q_i) = \sum_i P^i\dot Q_i - H'(Q_i, P^i) + \frac{dF}{dt}. \tag{1.26} $$
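Applying Legendre transformations, we can get four kinds of generating function.

1. Notice that $dF/dt = \sum_i p_i\dot q_i - \sum_i P^i\dot Q_i + (H' - H)$. Taking $\Phi(q_i, Q_i, t) = F$, we
have
$$ p_i = \frac{\partial\Phi}{\partial q_i}, \qquad P_i = -\frac{\partial\Phi}{\partial Q_i}, \qquad H' = H + \frac{\partial\Phi}{\partial t}. \tag{1.27a} $$

2. Notice that $d(F + \sum_i P^i Q_i)/dt = \sum_i p_i\dot q_i + \sum_i Q_i\dot P^i + (H' - H)$. Taking
$\Phi(q_i, P^i, t) = F + \sum_i P^i Q_i$, we have
$$ p_i = \frac{\partial\Phi}{\partial q_i}, \qquad Q_i = \frac{\partial\Phi}{\partial P^i}, \qquad H' = H + \frac{\partial\Phi}{\partial t}. \tag{1.27b} $$

3. Notice that $d(F - \sum_i p_i q_i)/dt = -\sum_i q_i\dot p_i - \sum_i P^i\dot Q_i + (H' - H)$. Taking
$\Phi(p_i, Q_i, t) = F - \sum_i p_i q_i$, we have
$$ q_i = -\frac{\partial\Phi}{\partial p_i}, \qquad P_i = -\frac{\partial\Phi}{\partial Q_i}, \qquad H' = H + \frac{\partial\Phi}{\partial t}. \tag{1.27c} $$

4. Notice that $d(F + \sum_i P^i Q_i - \sum_i p_i q_i)/dt = -\sum_i q_i\dot p_i + \sum_i Q_i\dot P^i + (H' - H)$.
Taking $\Phi(p_i, P^i, t) = F + \sum_i P^i Q_i - \sum_i p_i q_i$, we have
$$ q_i = -\frac{\partial\Phi}{\partial p_i}, \qquad Q_i = \frac{\partial\Phi}{\partial P^i}, \qquad H' = H + \frac{\partial\Phi}{\partial t}. \tag{1.27d} $$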

The Poisson bracket is invariant under canonical transformations. Suppose that $(q, p, H) \to (Q, P, H')$
is a canonical transformation and $F(Q, P, t) = f(q, p, t)$, $G(Q, P, t) = g(q, p, t)$. We will
have
$$ \{f, g\}_{q,p} = \{F, G\}_{Q,P}. \tag{1.28} $$
A necessary and sufficient condition for a transformation to be canonical is
$$ \{Q_i, Q_j\}_{q,p} = 0, \qquad \{P^i, P^j\}_{q,p} = 0, \qquad \{Q_i, P^j\}_{q,p} = \delta_i^j. \tag{1.29} $$

1.3.3 Evolution as canonical transformations


Let $q_t$, $p_t$ be the values of the canonical variables at time $t$, and $q_{t+\tau}$, $p_{t+\tau}$ their values at another
time $t + \tau$. The latter are functions of the former:
$$ q_{t+\tau} = q(q_t, p_t, t, \tau), \qquad p_{t+\tau} = p(q_t, p_t, t, \tau). \tag{1.30} $$
If these formulae are regarded as a transformation from the variables $q_t$, $p_t$ to $q_{t+\tau}$, $p_{t+\tau}$, then
this transformation is canonical. This is evident from the expression
$$ dS = -p_t\,dq_t + p_{t+\tau}\,dq_{t+\tau} - (H_{t+\tau} - H_t)\,dt \tag{1.31} $$
for the differential of the action $S(q_t, q_{t+\tau}, t, \tau)$, taken along the true path passing through
the points $q_t$ and $q_{t+\tau}$ at times $t$ and $t + \tau$ for a given $\tau$. $-S$ is the generating function of the
transformation. As a result, we also have the following bracket relations:
$$ \{q_{i,t+\tau}, q_{j,t+\tau}\}_{q_t,p_t} = 0, \qquad \{p^i_{t+\tau}, p^j_{t+\tau}\}_{q_t,p_t} = 0, \qquad \{q_{i,t+\tau}, p^j_{t+\tau}\}_{q_t,p_t} = \delta_i^j. \tag{1.32} $$

1.3.4 Liouville’s theorem


Lemma 1

The determinant of the Jacobian matrix of a canonical transformation is 1.

Theorem 1.2 Liouville’s theorem

The phase-space distribution function is constant along the trajectories of the system.

Proof: The phase volume is invariant under canonical transformations, and the change in $p$ and $q$ during
the motion can be regarded as a canonical transformation. Suppose that each point in a region of
phase space moves in the course of time in accordance with the equations of motion of the mechanical
system. The region as a whole therefore moves also, but its volume remains unchanged. □
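The unit Jacobian can be illustrated numerically. The sketch below assumes a pendulum Hamiltonian $H = p^2/2 - \cos q$ (an illustrative choice) and approximates its time evolution with leapfrog steps, each of which is itself an area-preserving map, so it is a stand-in for the exact flow rather than the flow itself. The Jacobian of the resulting map $(q, p) \to (Q, P)$ is estimated by central differences; the function names are hypothetical.

```python
import math

def flow(q, p, dt=0.01, steps=500):
    """Approximate time evolution over dt * steps for H = p^2/2 - cos(q)."""
    for _ in range(steps):
        p -= 0.5 * dt * math.sin(q)   # dp/dt = -dH/dq = -sin(q)
        q += dt * p                   # dq/dt =  dH/dp = p
        p -= 0.5 * dt * math.sin(q)
    return q, p

def jacobian_determinant(q, p, h=1e-6):
    """det d(Q, P)/d(q, p) of the map (q, p) -> flow(q, p), by central differences."""
    q_a, p_a = flow(q + h, p)
    q_b, p_b = flow(q - h, p)
    q_c, p_c = flow(q, p + h)
    q_d, p_d = flow(q, p - h)
    dQ_dq = (q_a - q_b) / (2 * h)
    dP_dq = (p_a - p_b) / (2 * h)
    dQ_dp = (q_c - q_d) / (2 * h)
    dP_dp = (p_c - p_d) / (2 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq

det = jacobian_determinant(1.0, 0.5)
```

A value of `det` equal to 1 within numerical error means that a small phase-space area element keeps its size under the evolution, which is the content of the lemma in one degree of freedom.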

1.4 Symmetry and Conservation Laws(2)


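Suppose $g$ is a function of $p$ and $q$. If an infinitesimal transformation of $q$ and $p$ can be described
as
$$ q \to q + \epsilon\{q, g\}, \qquad p \to p + \epsilon\{p, g\}, \qquad\text{where } \epsilon \to 0, \tag{1.33} $$
we can prove that
$$ H \to H + \epsilon\{H, g\}. \tag{1.34} $$
If $H$ is invariant under the transformation, we have $\{H, g\} = 0$. Since $g$ does not depend
explicitly on time, (1.22) then gives $dg/dt = \{g, H\} = 0$, i.e., $g$ is a
conserved quantity of the motion.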

1.5 Hamilton-Jacobi equation


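Define
$$ S(q, t) \equiv \left.\int_{q_0,t_0}^{q,t} L\,dt\,\right|_{\text{extremum}}. \tag{1.35} $$
We can prove that
$$ p = \frac{\partial S}{\partial q}, \qquad H = -\frac{\partial S}{\partial t}. \tag{1.36} $$
Therefore, we have
$$ -\frac{\partial S}{\partial t} = H\left(q, \frac{\partial S}{\partial q}, t\right). \tag{1.37} $$
This is called the Hamilton-Jacobi equation.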

Suppose the complete integral of the Hamilton-Jacobi equation is
$$ S = f(t, q_1, \cdots, q_s; \alpha_1, \cdots, \alpha_s) + A, \tag{1.38} $$
where $\alpha_1, \cdots, \alpha_s$ and $A$ are arbitrary constants. We effect a canonical transformation from
the variables $q_i$, $p_i$ to new variables, taking the function $f(t, q_i, \alpha_i)$ as the generating function
and the quantities $\alpha_1, \cdots, \alpha_s$ as the new momenta. Let the new coordinates be $\beta_1, \cdots, \beta_s$;
by (1.27b) we have
$$ p_i = \frac{\partial f}{\partial q_i}, \qquad \beta_i = \frac{\partial f}{\partial\alpha_i}, \qquad H' = H + \frac{\partial f}{\partial t} = 0. \tag{1.39} $$
Therefore,
$$ \alpha_i = \text{constant}, \qquad \beta_i = \text{constant}. \tag{1.40} $$
By means of the $s$ equations $\beta_i = \partial f/\partial\alpha_i$, the $s$ coordinates $q_i$ can be expressed in terms of
the time and the $2s$ constants. This gives the general integral of the equations of motion.

1.6 Symmetry and Conservation Laws(3)


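For $q_i \to q_i + \delta q_i$ and $L \to L + \delta L$, if $\delta L = df(q_i, \dot q_i, t)/dt$, we have
$$ \delta S = \left.\sum_i p^i\delta q_i\,\right|_{q_0,t_0}^{q,t} = \left. f\,\right|_{q_0,t_0}^{q,t}. \tag{1.41} $$
As a result, $\sum_i p^i\delta q_i - f$ is a conserved quantity.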


Chapter 2
Two Body Problem

2.1 Reduced mass and central field


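The Lagrangian for a two-body system is
$$ L = \frac12 m_1\dot{\mathbf r}_1^2 + \frac12 m_2\dot{\mathbf r}_2^2 - U(|\mathbf r_1 - \mathbf r_2|). \tag{2.1} $$
Let $\mathbf r \equiv \mathbf r_1 - \mathbf r_2$ be the relative position vector and let the origin be at the centre of mass, i.e.,
$m_1\mathbf r_1 + m_2\mathbf r_2 = 0$. These two equations give
$$ \mathbf r_1 = \frac{m_2}{m_1 + m_2}\mathbf r, \qquad \mathbf r_2 = -\frac{m_1}{m_1 + m_2}\mathbf r. \tag{2.2} $$
Then we have
$$ L = \frac12 m\dot{\mathbf r}^2 - U(r) \quad\text{where}\quad m \equiv \frac{m_1 m_2}{m_1 + m_2} \text{ is called the reduced mass.} \tag{2.3} $$
The Lagrangian is formally identical with the Lagrangian of a particle of mass $m$ moving in an
external field $U(r)$ which is symmetrical about a fixed origin.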
$L$ is isotropic, so angular momentum is conserved, i.e., $\mathbf M = \mathbf r \times \mathbf p$ is constant. Since $\mathbf r$ is
always perpendicular to $\mathbf M$, the path of the particle lies in one plane. Using polar coordinates,
we have
$$ L = \frac12 m(\dot r^2 + r^2\dot\phi^2) - U(r). \tag{2.4} $$
It is easy to notice that
$$ M = mr^2\dot\phi = \text{const.}, \qquad E = \frac12 m\dot r^2 + \frac{M^2}{2mr^2} + U(r) = \text{const.} \tag{2.5} $$
Therefore,
$$ \frac{dr}{dt} = \sqrt{\frac{2[E - U(r)]}{m} - \frac{M^2}{m^2r^2}}, \tag{2.6a} $$
$$ \frac{d\phi}{dr} = \frac{M}{r^2\sqrt{2m[E - U(r)] - M^2/r^2}}. \tag{2.6b} $$

The radial part of the motion can be regarded as taking place in one dimension in a field where
the effective potential energy is
$$ U_{\text{eff}} = U(r) + \frac{M^2}{2mr^2}. \tag{2.7} $$
The values of $r$ for which
$$ U(r) + \frac{M^2}{2mr^2} = E \tag{2.8} $$
determine the limits of the motion as regards distance from the centre. When (2.8) is satisfied,
the radial velocity ṙ is zero. This does not mean that the particle comes to rest as in true
one-dimensional motion, since the angular velocity is not zero. The value ṙ = 0 indicates a
turning point of the path, where r(t) begins to decrease instead of increasing, or vice versa. If
the range in which r may vary is limited only by the condition r ≥ rmin , the motion is infinite:
the particle comes from, and returns to, infinity. If the range of r has two limits rmin and
rmax , the motion is finite and the path lies entirely within the annulus bounded by the circles
r = rmin and r = rmax . This does not mean that the path must be a closed curve. During the
time in which $r$ varies from $r_{\min}$ to $r_{\max}$ and back, the radius vector turns through an angle
$$ \Delta\phi = 2\int_{r_{\min}}^{r_{\max}} \frac{M\,dr}{r^2\sqrt{2m[E - U(r)] - M^2/r^2}}. \tag{2.9} $$
The condition for the path to be closed is that this angle should be a rational fraction of 2π.
There are only two types of central field in which all finite motions take place in closed paths.
They are those in which the potential energy of the particle varies as 1/r or as r2 . The presence
of the centrifugal energy when M ̸= 0, which becomes infinite as 1/r2 when r → 0, generally
renders it impossible for the particle to reach the centre of the field, even if the field is an
attractive one. A fall of the particle to the centre is possible only if the potential energy tends
sufficiently rapidly to $-\infty$ as $r \to 0$. From the inequality
$$ \frac12 m\dot r^2 = E - U(r) - \frac{M^2}{2mr^2} > 0, \tag{2.10} $$
it follows that $r$ can take values tending to zero only if
$$ \left[r^2 U(r)\right]_{r\to0} < -\frac{M^2}{2m}. \tag{2.11} $$

2.2 Kepler Problem


An important class of central fields is formed by those in which the potential energy is
inversely proportional to $r$. They include the fields of Newtonian gravitational attraction and
of Coulomb electrostatic interaction; the latter may be either attractive or repulsive.
Let us first consider an attractive field, where
$$ U = -\frac{\alpha}{r} \tag{2.12} $$
with $\alpha$ a positive constant. The effective potential energy is
$$ U_{\text{eff}} = -\frac{\alpha}{r} + \frac{M^2}{2mr^2}. \tag{2.13} $$
As $r \to 0$, $U_{\text{eff}}$ tends to $+\infty$, and as $r \to \infty$ it tends to zero from negative values; for
$r = M^2/m\alpha$ it has a minimum value
$$ U_{\text{eff,min}} = -\frac{m\alpha^2}{2M^2}. \tag{2.14} $$

The motion is finite for $-m\alpha^2/2M^2 \le E < 0$ and infinite for $E \ge 0$. The shape of the path is
$$ \frac{p}{r} = 1 + e\cos\phi \quad\text{where}\quad p = \frac{M^2}{m\alpha}, \quad e = \sqrt{1 + \frac{2EM^2}{m\alpha^2}}. \tag{2.15} $$
This is the equation of a conic section with one focus at the origin; 2p is called the latus rectum
of the orbit and e the eccentricity. Our choice of the origin is such that the point where ϕ = 0
is the point nearest to the origin (called the perihelion).

Figure 2.1: (a) Attractive Kepler orbit with e < 1; (b) Attractive Kepler orbit with e > 1; (c)
Repulsive Kepler orbit.

If $E < 0$, the orbit is an ellipse and the motion is finite, as shown in Figure 2.1(a). The major
and minor semi-axes of the ellipse are
$$ a = \frac{p}{1 - e^2} = \frac{\alpha}{2|E|}; \qquad b = \frac{p}{\sqrt{1 - e^2}} = \frac{M}{\sqrt{2m|E|}}. \tag{2.16} $$
The least and greatest distances from the centre of the field (the focus of the ellipse) are
$$ r_{\min} = \frac{p}{1 + e} = a(1 - e); \qquad r_{\max} = \frac{p}{1 - e} = a(1 + e). \tag{2.17} $$
The period of revolution in an elliptical orbit is
$$ T = \frac{\pi ab}{r^2\dot\phi/2} = 2\pi a^{3/2}\sqrt{\frac{m}{\alpha}} = \pi\alpha\sqrt{\frac{m}{2|E|^3}}. \tag{2.18} $$
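The relations (2.15) to (2.18) can be collected into a small helper for bound orbits. This is a minimal sketch: the function name `kepler_ellipse`, the dictionary layout and the sample values of $E$, $M$, $m$, $\alpha$ are illustrative assumptions, not part of the text.

```python
import math

def kepler_ellipse(E, M, m, alpha):
    """Orbit parameters for U = -alpha / r with E < 0, following (2.15)-(2.18)."""
    if not E < 0:
        raise ValueError("a bound orbit requires E < 0")
    p = M * M / (m * alpha)                                      # semi-latus rectum
    e = math.sqrt(1.0 + 2.0 * E * M * M / (m * alpha * alpha))   # eccentricity
    a = p / (1.0 - e * e)                                        # major semi-axis
    b = p / math.sqrt(1.0 - e * e)                               # minor semi-axis
    return {
        "p": p,
        "e": e,
        "a": a,
        "b": b,
        "r_min": p / (1.0 + e),
        "r_max": p / (1.0 - e),
        "T": 2.0 * math.pi * a ** 1.5 * math.sqrt(m / alpha),
    }

def effective_potential(r, M, m, alpha):
    """U_eff(r) = -alpha / r + M^2 / (2 m r^2), eq. (2.13)."""
    return -alpha / r + M * M / (2.0 * m * r * r)

orbit = kepler_ellipse(E=-0.4, M=1.0, m=1.0, alpha=1.0)
```

As a consistency check, the effective potential evaluated at `orbit["r_min"]` and `orbit["r_max"]` should reproduce the energy $E$, since these are the turning points defined by (2.8).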

If $E > 0$, the path is a hyperbola with the origin as internal focus, as shown in Figure 2.1(b).
The distance of the perihelion from the focus is
$$ r_{\min} = \frac{p}{1 + e} = a(e - 1), \tag{2.19} $$
where $a = p/(e^2 - 1) = \alpha/2E$ is the semi-axis of the hyperbola.
If $E = 0$, the eccentricity $e = 1$, and the particle moves in a parabola with perihelion distance
$r_{\min} = p/2$. This case occurs if the particle starts from rest at infinity.
Let us now consider motion in a repulsive field, where
$$ U = \frac{\alpha}{r} \quad (\alpha > 0). \tag{2.20} $$

Here the effective potential energy is
$$ U_{\text{eff}} = \frac{\alpha}{r} + \frac{M^2}{2mr^2}, \tag{2.21} $$
and decreases monotonically from $+\infty$ to zero as $r$ varies from zero to infinity. The energy
of the particle must be positive, and the motion is always infinite. The calculations are exactly
similar to those for the attractive field. The path is a hyperbola, as shown in Figure 2.1(c):
$$ \frac{p}{r} = -1 + e\cos\phi. \tag{2.22} $$
The perihelion distance is
$$ r_{\min} = \frac{p}{e - 1} = a(e + 1). \tag{2.23} $$

There is an integral of the motion which exists only in fields $U = \alpha/r$. It is easy to verify by
direct calculation that the quantity
$$ \mathbf v \times \mathbf M + \frac{\alpha\mathbf r}{r} \tag{2.24} $$
is constant. The direction of the conserved vector is along the major axis from the focus to
the perihelion, and its magnitude is $\alpha e$. This is most simply seen by considering its value at
perihelion.
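The constancy of (2.24) can also be observed numerically. The sketch below integrates planar motion in a field $U = \alpha/r$ with a leapfrog scheme and compares the vector at the start and end of the run. The value `ALPHA = -1.0` (an attractive field, so that the orbit is bound), the unit mass, the initial conditions and the function names are all illustrative assumptions.

```python
import math

ALPHA = -1.0   # U = ALPHA / r; a negative value gives an attractive field
MASS = 1.0

def acceleration(x, y):
    """a = F / m with F = -grad U = ALPHA * r_vec / r^3."""
    r3 = (x * x + y * y) ** 1.5
    return ALPHA * x / (MASS * r3), ALPHA * y / (MASS * r3)

def conserved_vector(x, y, vx, vy):
    """In-plane components of v x M + ALPHA r_vec / r, with M = MASS (r x v) along z."""
    M = MASS * (x * vy - y * vx)
    r = math.hypot(x, y)
    return vy * M + ALPHA * x / r, -vx * M + ALPHA * y / r

def evolve(x, y, vx, vy, dt=1e-3, steps=5000):
    """Kick-drift-kick integration of the equations of motion."""
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return x, y, vx, vy

start = (1.0, 0.0, 0.0, 1.2)          # x, y, vx, vy
A_start = conserved_vector(*start)
A_end = conserved_vector(*evolve(*start))
drift = math.hypot(A_end[0] - A_start[0], A_end[1] - A_start[1])
```

For these initial data the eccentricity from (2.15) is $e = 0.44$, and the length of `A_start` equals $|\alpha| e$; the small `drift` is due to the finite step size only.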

2.3 Disintegration and collisions of particles


Let us consider a spontaneous disintegration of a particle into two constituent parts. This
process is most simply described in a frame of reference in which the particle is at rest before
the disintegration. The law of conservation of momentum shows that the sum of the momenta
of the two particles formed in the disintegration is then zero; that is, the particles move apart
with equal and opposite momenta. The magnitude $p_0$ of either momentum is given by the law
of conservation of energy:
$$ E_i = E_{1i} + \frac{p_0^2}{2m_1} + E_{2i} + \frac{p_0^2}{2m_2}, \tag{2.25} $$
where m1 and m2 are the masses of the particles, E1i and E2i , their internal energies, and Ei
the internal energy of the original particle. If ϵ is the disintegration energy, i.e., the difference
ϵ = Ei −E1i −E2i which must obviously be positive, then ϵ = p20 /2m, where m is the reduced
mass of the two particles.
Let us now change to a frame of reference in which the primary particle moves with velocity
V before the break-up. This frame is usually called the laboratory system, or L system, in
contradistinction to the centre-of-mass system, or C system, in which the total momentum is
zero. Let us consider one of the resulting particles, and let v and v0 be its velocities in the L
and C frame respectively. The relation between the angles $\theta$ and $\theta_0$ in the L and C systems is
evidently
$$ \tan\theta = \frac{v_0\sin\theta_0}{V + v_0\cos\theta_0}, \tag{2.26} $$
as shown in Figure 2.2.

Figure 2.2: Disintegration in L and C frame.

In physical applications we are usually concerned with the disintegration
of not one but many similar particles, and this raises the problem of the distribution of
the resulting particles in direction, energy, etc. We shall assume that the primary particles are
randomly oriented in space, i.e., isotropically on average.
In the C system, every resulting particle has the same energy, and their directions of motion are
isotropically distributed. The fraction of particles entering a solid angle element $do$ is $do/4\pi$.
Thus the distribution with respect to the angle $\theta_0$ is
$$ \frac12\sin\theta_0\,d\theta_0. \tag{2.27} $$
The corresponding distributions in the L system are obtained by an appropriate transformation.
For example, let us work out the kinetic energy distribution in the L system. Since
$$ v^2 = V^2 + v_0^2 + 2Vv_0\cos\theta_0, \tag{2.28} $$
we have $d(v^2) = 2Vv_0\,d(\cos\theta_0)$. Because the distribution (2.27) is uniform in $\cos\theta_0$, the kinetic
energy is distributed uniformly between
$T_{\min} = m(v_0 - V)^2/2$ and $T_{\max} = m(v_0 + V)^2/2$.
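The uniform energy distribution can be illustrated with a short deterministic experiment. In the sketch below, isotropy in the C system is represented by equally spaced values of $\cos\theta_0$ on $[-1, 1]$ (standing in for random draws), and the resulting L-system kinetic energies are sorted into equal-width bins. The function names and the sample values of $m$, $V$, $v_0$ are assumptions made for illustration.

```python
def lab_kinetic_energy(m, V, v0, cos_theta0):
    """T = m v^2 / 2 with v^2 taken from (2.28)."""
    return 0.5 * m * (V * V + v0 * v0 + 2.0 * V * v0 * cos_theta0)

def energy_histogram(m, V, v0, samples=10_000, bins=5):
    """Counts of L-system kinetic energies in equal-width bins between T_min and T_max."""
    t_min = 0.5 * m * (v0 - V) ** 2
    t_max = 0.5 * m * (v0 + V) ** 2
    counts = [0] * bins
    for i in range(samples):
        # Isotropy means cos(theta0) is uniform on [-1, 1]; midpoints of equal
        # sub-intervals are used here instead of random numbers.
        c = -1.0 + 2.0 * (i + 0.5) / samples
        t = lab_kinetic_energy(m, V, v0, c)
        k = min(int((t - t_min) / (t_max - t_min) * bins), bins - 1)
        counts[k] += 1
    return counts

counts = energy_histogram(m=1.0, V=0.3, v0=1.0)
```

Every bin receives the same number of samples, which is the discrete counterpart of the statement that the kinetic energy is uniformly distributed between $T_{\min}$ and $T_{\max}$.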
A collision between two particles is said to be elastic if it involves no change in their internal
state. The collision is most simply described in the C system. The velocities of the
particles before the collision are related to their velocities $\mathbf v_1$ and $\mathbf v_2$ in the L system by
$\mathbf v_{10} = m_2\mathbf v/(m_1 + m_2)$, $\mathbf v_{20} = -m_1\mathbf v/(m_1 + m_2)$, where $\mathbf v = \mathbf v_1 - \mathbf v_2$. Because of the law of
conservation of momentum, the momenta of the two particles remain equal and opposite after the
collision, and are also unchanged in magnitude, by the law of conservation of energy. Thus,
in the C system the collision simply rotates the velocities, which remain opposite in direction
and unchanged in magnitude. If $\hat{\mathbf n}_0$ is a unit vector along the velocity of particle 1 after the
collision, the velocities of the two particles after the collision are
$$ \mathbf v'_{10} = \frac{m_2 v\hat{\mathbf n}_0}{m_1 + m_2}, \qquad \mathbf v'_{20} = -\frac{m_1 v\hat{\mathbf n}_0}{m_1 + m_2}. \tag{2.29} $$
The velocities in the L system after the collision are therefore
$$ \mathbf v'_1 = \frac{m_2 v\hat{\mathbf n}_0}{m_1 + m_2} + \frac{m_1\mathbf v_1 + m_2\mathbf v_2}{m_1 + m_2}, \qquad \mathbf v'_2 = -\frac{m_1 v\hat{\mathbf n}_0}{m_1 + m_2} + \frac{m_1\mathbf v_1 + m_2\mathbf v_2}{m_1 + m_2}. \tag{2.30} $$
Multiplying these equations by $m_1$ and $m_2$ respectively, we obtain
$$ \mathbf p'_1 = mv\hat{\mathbf n}_0 + \frac{m_1(\mathbf p_1 + \mathbf p_2)}{m_1 + m_2}, \qquad \mathbf p'_2 = -mv\hat{\mathbf n}_0 + \frac{m_2(\mathbf p_1 + \mathbf p_2)}{m_1 + m_2}. \tag{2.31} $$

Figure 2.3: Collision in L and C frame.

Let us consider in more detail the case where one of the particles (m2 , say) is at rest before the
collision. In Figure 2.4, the distance OB = m2 p1 /(m1 + m2 ) = mv is equal to the radius.
The vector AB⃗ is equal to the momentum p1 of the particle m1 before the collision.

Figure 2.4: Collision with 2 at rest.

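The angles $\theta_1$ and $\theta_2$ can be expressed in terms of $\chi$ by
$$ \tan\theta_1 = \frac{m_2\sin\chi}{m_1 + m_2\cos\chi}, \qquad \theta_2 = \frac{\pi - \chi}{2}. \tag{2.32} $$
The magnitudes of the velocities of the two particles after the collision in terms of $\chi$ are
$$ v'_1 = \frac{\sqrt{m_1^2 + m_2^2 + 2m_1m_2\cos\chi}}{m_1 + m_2}\,v, \qquad v'_2 = \frac{2m_1 v}{m_1 + m_2}\sin\frac{\chi}{2}. \tag{2.33} $$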
If $m_1 < m_2$, the velocity of $m_1$ after the collision can have any direction. If $m_1 > m_2$, this
particle can be deflected only through an angle not exceeding $\theta_{\max}$ from its original direction.
Evidently,
$$ \sin\theta_{\max} = \frac{m_2}{m_1}. \tag{2.34} $$
The collision of two particles of equal mass, of which one is initially at rest, is especially simple.
In this case both B and A lie on the circle, as shown in Figure 2.4(c), and
$$ \theta_1 = \frac{\chi}{2}, \qquad \theta_2 = \frac{\pi - \chi}{2}; \tag{2.35} $$
$$ v'_1 = v\cos\frac{\chi}{2}, \qquad v'_2 = v\sin\frac{\chi}{2}. \tag{2.36} $$
After the collision the particles move at right angles to each other.
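The angle relations (2.32) to (2.35) are easy to evaluate directly. The sketch below is an illustration with hypothetical function names: it computes the L-system angles from the C-system angle $\chi$ and estimates the largest deflection of the incident particle by scanning $\chi$ over $(0, \pi)$.

```python
import math

def lab_angles(m1, m2, chi):
    """L-system angles (theta1, theta2) from the C-system angle chi, eq. (2.32)."""
    theta1 = math.atan2(m2 * math.sin(chi), m1 + m2 * math.cos(chi))
    theta2 = 0.5 * (math.pi - chi)
    return theta1, theta2

def max_deflection(m1, m2, samples=20_000):
    """Largest theta1 found by scanning chi over (0, pi) on a uniform grid."""
    return max(
        lab_angles(m1, m2, math.pi * (i + 0.5) / samples)[0]
        for i in range(samples)
    )

# Equal masses: theta1 = chi / 2 and the two particles separate at a right angle.
theta1, theta2 = lab_angles(1.0, 1.0, 1.0)

# Heavier projectile (m1 = 2 m2): the deflection is bounded by asin(m2 / m1).
theta_max = max_deflection(2.0, 1.0)
```

Using `atan2` rather than `atan` keeps $\theta_1$ in the correct quadrant when $m_1 < m_2$ and the denominator of (2.32) becomes negative.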

2.4 Scattering and cross section


Scattering is a general physical process where some forms of radiation, such as light, sound, or
moving particles, are forced to deviate from a straight trajectory by one or more paths due to
localized non-uniformities in the medium through which they pass. In classical mechanics,
scattering generally refers to particle-particle collisions. The cross section of a
scattering process is defined as
$$ \sigma \equiv \frac{\text{Number of events per target}}{\text{Time} \times \text{Incident flux}}. \tag{2.37} $$
Here, the incident flux is measured in the frame of the target particle. Recall that when we reduce
the two-body problem to a one-body problem, $\mathbf r = \mathbf r_1 - \mathbf r_2$ is the coordinate of the projectile in the
frame of the target. Thus the scattering process can be represented by the reduced mass moving
in the central field.

Figure 2.5: Scattering in central field.

According to Figure 2.5, the scattering angle can be obtained through
$$ \phi_0 = \int_{r_{\min}}^{\infty} \frac{M\,dr}{r^2\sqrt{2m[E - U(r)] - M^2/r^2}}, \qquad \chi = |\pi - 2\phi_0|. \tag{2.38} $$
Since
$$ E = \frac12 mv_\infty^2, \qquad M = m\rho v_\infty, \tag{2.39} $$
we can get the relation between $\chi$ and the impact parameter $\rho$. Suppose the number density of the particles is $n$. The
incident flux is therefore $nv_\infty$. The number of events in which particles are scattered into the solid
angle $do = \sin\chi\,d\chi\,d\phi$ at $(\chi, \phi)$ in time $T$ is
$$ \rho(\chi)\,d\rho\,d\phi\,nv_\infty T. \tag{2.40} $$
Thus, we have
$$ d\sigma = \rho(\chi)\,d\rho\,d\phi = \frac{\rho(\chi)}{\sin\chi}\left|\frac{d\rho}{d\chi}\right| do. \tag{2.41} $$
In the C system, we have $\mathbf r_1 = m_2\mathbf r/(m_1 + m_2)$, so the scattering angle of particle 1 is simply
$\chi$. In the L system (particle 2 at rest before scattering), we must make the corresponding
transformation to get the right expression for the cross section.

Rutherford’s formula
One of the most important applications of the formulae derived above is to the scattering of
charged particles in a Coulomb field. As $U = \alpha/r$, we have
$$ \phi_0 = \arccos\frac{\alpha/mv_\infty^2\rho}{\sqrt{1 + (\alpha/mv_\infty^2\rho)^2}}. \tag{2.42} $$
Recalling that $\phi_0 = (\pi - \chi)/2$, we can obtain
$$ \rho^2 = \frac{\alpha^2}{m^2v_\infty^4}\cot^2\frac{\chi}{2} \tag{2.43} $$
and
$$ d\sigma = \left(\frac{\alpha}{2mv_\infty^2}\right)^2\frac{do}{\sin^4(\chi/2)}. \tag{2.44} $$
This is Rutherford’s formula. It may be noted that the effective cross-section is independent of
the sign of $\alpha$, so that the result is equally valid for repulsive and attractive Coulomb fields.
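The step from (2.43) to (2.44) can be verified numerically. The sketch below differentiates $\rho(\chi)$ by a central difference, inserts it into the general expression $d\sigma/do = (\rho/\sin\chi)\,|d\rho/d\chi|$, and compares the result with Rutherford’s closed form. The function names and the sample parameter values are illustrative assumptions.

```python
import math

def impact_parameter(chi, alpha, m, v_inf):
    """rho(chi) from (2.43): rho = |alpha| / (m v_inf^2) * cot(chi / 2)."""
    return abs(alpha) / (m * v_inf ** 2) / math.tan(0.5 * chi)

def cross_section_numeric(chi, alpha, m, v_inf, h=1e-5):
    """d(sigma)/do = rho / sin(chi) * |d rho / d chi|, derivative by central difference."""
    d_rho = (
        impact_parameter(chi + h, alpha, m, v_inf)
        - impact_parameter(chi - h, alpha, m, v_inf)
    ) / (2.0 * h)
    return impact_parameter(chi, alpha, m, v_inf) / math.sin(chi) * abs(d_rho)

def cross_section_rutherford(chi, alpha, m, v_inf):
    """Closed form (2.44): (alpha / (2 m v_inf^2))^2 / sin^4(chi / 2)."""
    return (alpha / (2.0 * m * v_inf ** 2)) ** 2 / math.sin(0.5 * chi) ** 4

numeric = cross_section_numeric(1.0, alpha=2.0, m=1.5, v_inf=3.0)
closed_form = cross_section_rutherford(1.0, alpha=2.0, m=1.5, v_inf=3.0)
```

Replacing `alpha` by `-alpha` leaves both results unchanged, in line with the remark that the cross-section does not depend on the sign of $\alpha$.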

The formula above gives the effective cross-section in the frame of reference in which the centre of
mass of the colliding particles is at rest. The transformation to the laboratory system is effected
by means of
$$ \tan\theta_1 = \frac{m_2\sin\chi}{m_1 + m_2\cos\chi}, \qquad \theta_2 = \frac{\pi - \chi}{2}. \tag{2.45} $$
For particles initially at rest, we have
$$ d\sigma_2 = \left(\frac{\alpha}{mv_\infty^2}\right)^2\frac{do_2}{\cos^3\theta_2}. \tag{2.46} $$

The same transformation for the incident particles leads, in general, to a very complex formula,
and we shall merely note two particular cases.

If the mass $m_2$ of the scattering particle is large compared with the mass $m_1$ of the scattered
particle, then $\chi = \theta_1$ and $m = m_1$, so that
$$ d\sigma_1 = \left(\frac{\alpha}{4E_1}\right)^2\frac{do_1}{\sin^4(\theta_1/2)}, \tag{2.47} $$
where $E_1 = m_1v_\infty^2/2$ is the energy of the incident particle. If the masses of the two particles
are equal, then by $\theta_1 = \chi/2$, we have
$$ d\sigma_1 = \left(\frac{\alpha}{E_1}\right)^2\frac{\cos\theta_1}{\sin^4\theta_1}\,do_1. \tag{2.48} $$

If the particles are entirely identical, that which was initially at rest cannot be distinguished
after the collision. The total effective cross-section for all particles is obtained by adding $d\sigma_1$
and $d\sigma_2$ and replacing $\theta_1$ and $\theta_2$ by their common value $\theta$, so
$$ d\sigma = \left(\frac{\alpha}{E_1}\right)^2\left(\frac{1}{\sin^4\theta} + \frac{1}{\cos^4\theta}\right)\cos\theta\,do. \tag{2.49} $$

Let us return to the general formula and use it to determine the distribution of the scattered
particles with respect to the energy lost in the collision. When the masses of the scattered and
scattering particles are arbitrary, the velocity acquired by the latter is given by
$$ v'_2 = \frac{2m_1}{m_1 + m_2}\,v_\infty\sin\frac{\chi}{2}. \tag{2.50} $$
The energy acquired by particle 2 and lost by particle 1 is therefore
$$ \epsilon = \frac{2m^2}{m_2}\,v_\infty^2\sin^2\frac{\chi}{2}. \tag{2.51} $$
Expressing $\sin(\chi/2)$ in terms of $\epsilon$, we obtain
$$ d\sigma = 2\pi\frac{\alpha^2}{m_2v_\infty^2}\frac{d\epsilon}{\epsilon^2}. \tag{2.52} $$
This is the required formula: it gives the effective cross-section as a function of the energy loss
$\epsilon$, which takes values from zero to $\epsilon_{\max} = 2m^2v_\infty^2/m_2$.
Chapter 3
Small Oscillation

3.1 Small oscillation in one-dimension


Let us consider motion in one dimension, where the potential energy of the particle is $V = V(q)$.
If we choose the coordinate of the equilibrium point as $q = 0$, then $\partial V/\partial q\,|_{q=0} = 0$. Expanding $V(q)$
around $q = 0$, we have
$$ V(q) = V(0) + \frac12\left.\frac{\partial^2 V}{\partial q^2}\right|_{q=0} q^2 + \cdots \tag{3.1} $$
If the equilibrium point is stable, we have
$$ \left.\frac{\partial^2 V}{\partial q^2}\right|_{q=0} \equiv V''(0) > 0. \tag{3.2} $$
For small oscillations, we can neglect the higher orders of $q$ and the Lagrangian can be written
as
$$ L = \frac12 m\dot q^2 - \frac12 V''(0)q^2. \tag{3.3} $$
The Euler-Lagrange equation gives
$$ \ddot q + \omega_0^2 q = 0 \quad\text{where}\quad \omega_0^2 = \frac{V''(0)}{m}. \tag{3.4} $$
The general solution is
$$ q = A\cos(\omega_0 t + \phi), \tag{3.5} $$
where $A$ and $\phi$ depend on the initial conditions.
If there is a damping force proportional to the velocity of the particle, then we have
$$ \ddot q + \frac{1}{Q}\dot q + \omega_0^2 q = 0. \tag{3.6} $$
If $Q > 1/(2\omega_0)$, we have under-damped oscillation:
$$ q = Ae^{-t/(2Q)}\cos(\omega t + \phi) \quad\text{where}\quad \omega = \sqrt{\omega_0^2 - \frac{1}{4Q^2}}. \tag{3.7} $$
If $Q < 1/(2\omega_0)$, we have over-damped oscillation:
$$ q = Ae^{\lambda_+ t} + Be^{\lambda_- t} \quad\text{where}\quad \lambda_\pm = -\frac{1}{2Q} \pm \sqrt{\frac{1}{4Q^2} - \omega_0^2}. \tag{3.8} $$
If $Q = 1/(2\omega_0)$, we have critically damped oscillation:
$$ q = C(1 + Dt)e^{-\omega_0 t}. \tag{3.9} $$

3.2 Forced oscillation


The equation of motion for forced oscillation is
$$ \ddot q + \frac{1}{Q}\dot q + \omega_0^2 q = F(t). \tag{3.10} $$
The solution has the form $q = q_s + q_g$, where $q_g$ is the general solution of the homogeneous
equation and $q_s$ is an arbitrary particular solution of the full equation. In order to get $q_s$, we consider
the equation
$$ \ddot G + \frac{1}{Q}\dot G + \omega_0^2 G = \delta(t - t'), \tag{3.11} $$
whose solution is $G(t, t')$. Then we have
$$ q_s = \int_{-\infty}^{\infty} F(t')G(t, t')\,dt'. \tag{3.12} $$
For under-damped oscillation, we have
$$ G(t, t') = \begin{cases} \exp\left(-\dfrac{t - t'}{2Q}\right)\dfrac{\sin[\omega(t - t')]}{\omega}, & t > t' \\ 0, & t < t' \end{cases} \tag{3.13} $$
and so
$$ q_s = \int_0^{\infty} F(t - t')\exp\left(-\frac{t'}{2Q}\right)\frac{\sin\omega t'}{\omega}\,dt'. \tag{3.14} $$
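For the special case where
$$ F(t) = F_0\cos\Omega t, \tag{3.15} $$
we have
$$ q(t) = \frac{F_0}{\sqrt{[\Omega^2 - \omega_0^2 + 1/(2Q^2)]^2 + \omega^2/Q^2}}\cos(\Omega t + \phi), \tag{3.16} $$
where
$$ \tan\phi = -\frac{\Omega/Q}{\omega_0^2 - \Omega^2}. \tag{3.17} $$
When
$$ \Omega = \sqrt{\omega_0^2 - \frac{1}{2Q^2}}, \tag{3.18} $$
the amplitude reaches its maximum value
$$ q_{\max} = \frac{QF_0}{\omega}. \tag{3.19} $$
This is called resonance.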
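The resonance condition can be checked with a few lines of code. The sketch below writes the steady-state amplitude in the equivalent standard form $F_0/\sqrt{(\omega_0^2 - \Omega^2)^2 + \Omega^2/Q^2}$, which agrees with the denominator of (3.16) once $\omega^2 = \omega_0^2 - 1/(4Q^2)$ is inserted, and locates its maximum on a frequency grid. The function name and the values $\omega_0 = 2$, $Q = 3$ are illustrative assumptions.

```python
import math

def steady_amplitude(Omega, omega0, Q, F0=1.0):
    """Steady-state amplitude for q'' + q'/Q + omega0^2 q = F0 cos(Omega t)."""
    return F0 / math.sqrt((omega0 ** 2 - Omega ** 2) ** 2 + (Omega / Q) ** 2)

omega0, Q = 2.0, 3.0
omega = math.sqrt(omega0 ** 2 - 1.0 / (4.0 * Q ** 2))       # damped frequency, (3.7)
Omega_res = math.sqrt(omega0 ** 2 - 1.0 / (2.0 * Q ** 2))   # resonance frequency, (3.18)

# Brute-force search for the driving frequency with the largest amplitude.
grid = [0.5 + 3.0 * i / 30_000 for i in range(30_001)]
Omega_best = max(grid, key=lambda w: steady_amplitude(w, omega0, Q))

peak = steady_amplitude(Omega_res, omega0, Q)   # should equal Q * F0 / omega
```

The grid maximum `Omega_best` coincides with `Omega_res` up to the grid spacing, and `peak` reproduces $QF_0/\omega$ from (3.19) with $F_0 = 1$.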

3.3 Non-linear oscillation and perturbation theory


Consider the equation of motion
$$ \ddot q + \omega_0^2 q + \epsilon q^3 = 0. \tag{3.20} $$
Suppose that $\epsilon$ is very small. We have the expansions
$$ q = q_0 + \epsilon q_1 + \epsilon^2 q_2 + \cdots \tag{3.21a} $$
$$ \omega = \omega_0 + \epsilon\omega_1 + \epsilon^2\omega_2 + \cdots \tag{3.21b} $$
The equation of motion can be written as
$$ \ddot q + \omega^2 q = (\omega^2 - \omega_0^2)q - \epsilon q^3. \tag{3.22} $$
Let $\tau \equiv \omega t$ and $q' \equiv dq/d\tau = \dot q/\omega$. We have
$$ q'' + q = \left(1 - \frac{\omega_0^2}{\omega^2}\right)q - \frac{\epsilon}{\omega^2}q^3. \tag{3.23} $$
Solving order by order in $\epsilon$, we get
$$ q_0'' + q_0 = 0, \tag{3.24a} $$
$$ q_1'' + q_1 = -\frac{q_0^3}{\omega_0^2} + \frac{2\omega_1}{\omega_0}q_0, \tag{3.24b} $$
and so on. When carrying out the perturbation expansion, we must adjust the $\omega_i$ to avoid resonant (secular) terms in the solution.
The details are omitted here.
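As a hedged illustration of the omitted step: with $q_0 = A\cos\tau$, the right-hand side of (3.24b) contains a term proportional to $\cos\tau$ with coefficient $-3A^3/(4\omega_0^2) + 2\omega_1 A/\omega_0$, and requiring it to vanish gives the standard first-order shift $\omega_1 = 3A^2/(8\omega_0)$. This value is not stated in the text and is used here as an assumption of the sketch. The code below integrates (3.20) with a fourth-order Runge-Kutta scheme, measures the period from the first zero crossing, and compares the resulting frequency with $\omega_0 + \epsilon\omega_1$; all names and parameter values are illustrative.

```python
import math

def quarter_period(omega0, eps, A, dt=1e-4):
    """Time for q to fall from A (starting at rest) to zero under
    q'' = -omega0^2 q - eps q^3, integrated with classical RK4."""
    def acc(q):
        return -omega0 ** 2 * q - eps * q ** 3

    q, v, t = A, 0.0, 0.0
    while True:
        k1q, k1v = v, acc(q)
        k2q, k2v = v + 0.5 * dt * k1v, acc(q + 0.5 * dt * k1q)
        k3q, k3v = v + 0.5 * dt * k2v, acc(q + 0.5 * dt * k2q)
        k4q, k4v = v + dt * k3v, acc(q + dt * k3q)
        q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        if q_new <= 0.0:
            # Linear interpolation to the zero crossing inside the last step.
            return t + dt * q / (q - q_new)
        q, v, t = q_new, v_new, t + dt

omega0, eps, A = 1.0, 0.05, 1.0
omega_numeric = 2.0 * math.pi / (4.0 * quarter_period(omega0, eps, A))
omega_first_order = omega0 + eps * 3.0 * A ** 2 / (8.0 * omega0)
```

The two frequencies agree up to terms of order $\epsilon^2$, while both differ clearly from the unperturbed $\omega_0$.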

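Now let us consider non-linear oscillation with a driving force. The equation of motion is
$$ \ddot q + \frac{1}{Q}\dot q + \omega_0^2 q + \epsilon q^3 = F_0\cos\omega t. \tag{3.25} $$
It can be rewritten as
$$ \ddot q + \omega^2 q = -\frac{1}{Q}\dot q + (\omega^2 - \omega_0^2)q - \epsilon q^3 + F_0\cos\omega t. \tag{3.26} $$
We treat the right-hand side of the equation as a perturbation. We multiply it by a parameter $\mu$, which is
set to 1 at the end:
$$ \ddot q + \omega^2 q = \mu\left[-\frac{1}{Q}\dot q + (\omega^2 - \omega_0^2)q - \epsilon q^3 + F_0\cos\omega t\right]. \tag{3.27} $$
To account for the phase lag, we redefine the “time” as
$$ \tau \equiv \omega t - \delta, \tag{3.28} $$
so
$$ q'' + q = \mu\left[-\frac{1}{Q\omega}q' + \left(1 - \frac{\omega_0^2}{\omega^2}\right)q - \frac{\epsilon}{\omega^2}q^3 + \frac{F_0}{\omega^2}\cos(\tau + \delta)\right]. \tag{3.29} $$
The expansions of $q$ and $\delta$ are
$$ q = q_0 + \mu q_1 + \mu^2 q_2 + \cdots \tag{3.30a} $$
$$ \delta = \delta_0 + \mu\delta_1 + \mu^2\delta_2 + \cdots \tag{3.30b} $$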
3.4 Oscillations of systems with more than one degree of freedom –33/453–

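Solving order by order in µ, we get
$$ q_0'' + q_0 = 0, \tag{3.31a} $$
$$ q_1'' + q_1 = -\frac{1}{Q\omega} q_0' + \left(1 - \frac{\omega_0^2}{\omega^2}\right) q_0 - \frac{\epsilon}{\omega^2} q_0^3 + \frac{F_0}{\omega^2}\cos(\tau + \delta_0), \tag{3.31b} $$
and so on. The solution of the zeroth-order equation is
$$ q_0 = A_0 \cos\tau. \tag{3.32} $$
Substituting it into the first-order equation, we have
$$ q_1'' + q_1 = \left(\frac{A_0}{Q\omega} - \frac{F_0}{\omega^2}\sin\delta_0\right)\sin\tau + \left[\left(1 - \frac{\omega_0^2}{\omega^2}\right) A_0 - \frac{3\epsilon A_0^3}{4\omega^2} + \frac{F_0}{\omega^2}\cos\delta_0\right]\cos\tau - \frac{\epsilon A_0^3}{4\omega^2}\cos 3\tau. \tag{3.33} $$
To avoid secular (unbounded) terms in $q_1$, the coefficients of sin τ and cos τ must vanish:
$$ \sin\delta_0 = \frac{A_0\omega}{F_0 Q} \quad\text{and}\quad \left(1 - \frac{\omega_0^2}{\omega^2}\right) A_0 - \frac{3\epsilon A_0^3}{4\omega^2} + \frac{F_0}{\omega^2}\cos\delta_0 = 0, \tag{3.34} $$
leading to
$$ A_0 = \frac{F_0}{\sqrt{\left[(\omega^2 - \omega_0^2) - 3\epsilon A_0^2/4\right]^2 + \omega^2/Q^2}}. \tag{3.35} $$
If we define
$$ x \equiv \frac{\omega}{\omega_0}, \quad y \equiv \frac{A_0\omega_0^2}{F_0}, \tag{3.36} $$
equation 3.35 can be rewritten as
$$ y^2 = \frac{1}{(x^2 - 1 - a y^2)^2 + b x^2} \quad\text{where}\quad a = \frac{3\epsilon F_0^2}{4\omega_0^6}, \quad b = \frac{1}{\omega_0^2 Q^2}. \tag{3.37} $$
Expressing x in terms of y, we have
$$ x^2 = \frac{2 + 2ay^2 - b \pm \sqrt{(b - 2 - 2ay^2)^2 - 4\left(a^2y^4 + 2ay^2 + 1 - 1/y^2\right)}}{2}. \tag{3.38} $$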

As shown in Figure 3.1, the resonance curve has two branches. When the frequency of the driving force increases from the left, the amplitude of oscillation becomes larger and larger. But when it reaches the turning point of the upper branch, the amplitude drops to the lower-right part of the curve. When the frequency of the driving force decreases from the right, the amplitude of oscillation also becomes larger and larger. When it reaches the turning point of the lower branch, the amplitude jumps to the upper-left part of the curve. This effect is called hysteresis.
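The two branches can be traced numerically by solving equation 3.37 for $x^2$ at a given height y, which is a quadratic in $x^2$. The following is a minimal sketch under that reading; the function name `resonance_branches` and the sample values of a and b are assumptions chosen only for illustration.

```python
import math

def resonance_branches(y, a, b):
    """Return the real x = omega/omega0 values lying on the curve (3.37) at height y."""
    s = 1.0 + a * y * y
    # (3.37) rearranged: X^2 - (2s - b) X + s^2 - 1/y^2 = 0 with X = x^2.
    disc = (b - 2.0 * s) ** 2 - 4.0 * (s * s - 1.0 / (y * y))
    if disc < 0.0:
        return []  # y lies above the peak of the resonance curve
    roots = [(2.0 * s - b + sign * math.sqrt(disc)) / 2.0 for sign in (1.0, -1.0)]
    return sorted(math.sqrt(r) for r in roots if r >= 0.0)

a, b = 0.05, 0.04  # illustrative values, e.g. omega0 = 1 and Q = 5 for b
for y in (1.0, 3.0, 10.0):
    print(y, resonance_branches(y, a, b))
```

Sweeping y from small values up to the peak and plotting both roots reproduces a curve of the kind shown in Figure 3.1.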

3.4 Oscillations of systems with more than one degree of freedom

Let's look at a system with many degrees of freedom; we have
$$ L = \frac{1}{2}\sum_{i,j} T_{ij}\dot{q}_i\dot{q}_j - V(q_1, \ldots, q_n). \tag{3.39} $$

Figure 3.1: Resonance curve ($A_0\omega_0^2/F_0$ plotted against $\omega/\omega_0$).

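Let $q_{0,i}$ be an equilibrium position and expand about this point: $q_i = q_{0,i} + \eta_i$, and so $\dot{q}_i = \dot{\eta}_i$. We can expand the potential energy to give
$$ V(q_1, \ldots, q_n) = V(q_{0,1}, \ldots, q_{0,n}) + \sum_i \left(\frac{\partial V}{\partial q_i}\right)_{q_0} \eta_i + \frac{1}{2}\sum_{i,j}\left(\frac{\partial^2 V}{\partial q_i \partial q_j}\right)_{q_0}\eta_i\eta_j + \cdots \tag{3.40} $$

The first term is constant with respect to $\eta_i$, and constant terms do not affect the motion. The second term is zero, because $q_{0,i}$ is a point of equilibrium. So we are left with
$$ L = \frac{1}{2}\sum_{i,j}\left(T_{ij}\dot{\eta}_i\dot{\eta}_j - V_{ij}\eta_i\eta_j\right), \tag{3.41} $$
where
$$ T_{ij} = T_{ij}(q_{0,1}, \ldots, q_{0,n}), \quad V_{ij} = \left(\frac{\partial^2 V}{\partial q_i\partial q_j}\right)_{q_0}, \tag{3.42} $$
yielding the equations of motion
$$ \sum_j \left(T_{ij}\ddot{\eta}_j + V_{ij}\eta_j\right) = 0. \tag{3.43} $$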

This is a system of linear differential equations with constant coefficients. We can try the ansatz
$$ \eta_i = C a_i e^{-i\omega t}, \tag{3.44} $$
leading to
$$ \sum_j \left(V_{ij} a_j - \omega^2 T_{ij} a_j\right) = 0. \tag{3.45} $$

The equation has nonzero solutions only if $\det[V_{ij} - \omega^2 T_{ij}] = 0$. This gives an nth-degree polynomial to solve for $\omega^2$. We will get n solutions for $\omega^2$ that we can substitute into the matrix equation and solve for $a_j$.
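For two degrees of freedom the determinant condition is just a quadratic in $\omega^2$, which can be solved directly. The sketch below does this for symmetric 2×2 matrices T and V; the function name and the example system (two unit masses joined to walls and to each other by unit springs) are illustrative assumptions, not taken from the notes.

```python
import math

def normal_mode_frequencies_2x2(T, V):
    """Solve det(V - w^2 T) = 0 for symmetric 2x2 matrices T and V; returns sorted w."""
    # Expanding the determinant gives a*lam^2 + b*lam + c = 0 with lam = w^2.
    a = T[0][0] * T[1][1] - T[0][1] ** 2
    b = -(V[0][0] * T[1][1] + V[1][1] * T[0][0] - 2.0 * V[0][1] * T[0][1])
    c = V[0][0] * V[1][1] - V[0][1] ** 2
    disc = b * b - 4.0 * a * c
    lams = sorted([(-b - math.sqrt(disc)) / (2.0 * a), (-b + math.sqrt(disc)) / (2.0 * a)])
    return [math.sqrt(lam) for lam in lams]

# Example: T = m * identity, V = [[2k, -k], [-k, 2k]] with m = k = 1.
T = [[1.0, 0.0], [0.0, 1.0]]
V = [[2.0, -1.0], [-1.0, 2.0]]
print(normal_mode_frequencies_2x2(T, V))
```

For this example the two modes are the in-phase motion ($\omega^2 = k/m$) and the out-of-phase motion ($\omega^2 = 3k/m$). For larger n one would normally hand the generalized eigenvalue problem to a linear-algebra library.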
Chapter 4
Motion of a Rigid Body

4.1 Angular velocity


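Suppose there are two coordinate frames, with frame 2 rotating with respect to frame 1. If the coordinates of a particle in frame 2 are $r_2 = (x_2, y_2, z_2)$, then the coordinates of the particle in frame 1 are
$$ r_1 = O(t) r_2 \quad\text{where}\quad O^\intercal O = I. \tag{4.1} $$

Differentiating $O^\intercal O = I$ with respect to time, we can derive that
$$ \Omega + \Omega^\intercal = 0 \quad\text{where}\quad \Omega \equiv O^\intercal \frac{dO}{dt}. \tag{4.2} $$
Thus, we have
$$ v_1 \equiv \frac{dr_1}{dt} = \frac{dO}{dt} r_2 + O\frac{dr_2}{dt} = O(\Omega r_2 + v_2) \quad\text{where}\quad v_2 \equiv \frac{dr_2}{dt}. \tag{4.3} $$
Since Ω is antisymmetric, it can be written explicitly as
$$ \Omega = \begin{pmatrix} 0 & -\omega_{2z} & \omega_{2y} \\ \omega_{2z} & 0 & -\omega_{2x} \\ -\omega_{2y} & \omega_{2x} & 0 \end{pmatrix}. \tag{4.4} $$

We then have
$$ v_1 = O(\omega_2 \times r_2 + v_2). \tag{4.5} $$

Define
$$ \omega_1 \equiv O\omega_2. \tag{4.6} $$

We can get
$$ v_1 = \omega_1 \times r_1 + O v_2. \tag{4.7} $$

Note: ω is the so-called angular velocity. $\omega_1$ is independent of the base vectors we choose for frame 2. If we choose frame 1 differently, $\omega_1$ will transform like a vector.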

4.2 Dynamics of rigid body


Inertia tensor of rigid body

Suppose there is a frame attached to the rigid body. The coordinates of every mass point of the rigid body must be constant in this frame, i.e.,
$$ r_1 = r_0(t) + O(t) r_2 \quad\text{where $r_2$ is a constant}. \tag{4.8} $$

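The velocity in frame 1 is
$$ v_1 = V + \omega_1 \times (r_1 - r_0) = V + O(\omega_2 \times r_2) \quad\text{where}\quad V \equiv \frac{dr_0}{dt}. \tag{4.9} $$

Writing r for the position of a mass point relative to the origin of frame 2, the kinetic energy of the rigid body is
$$ T = \sum \frac{m}{2}(V + \omega \times r)^2 = \sum \frac{m}{2}V^2 + \sum m V\cdot(\omega\times r) + \sum\frac{m}{2}(\omega\times r)^2. \tag{4.10} $$
If we choose the origin of frame 2 to be the center of mass of the rigid body, the cross term vanishes because $\sum m r = 0$, and we have
$$ T = \frac{\mu V^2}{2} + \frac{1}{2}\sum m\left[\omega^2 r^2 - (\omega\cdot r)^2\right], \tag{4.11} $$
where $\mu = \sum m$ is the total mass. If we define the inertia tensor as
$$ I_{ik} = \sum m\left(x_l^2\delta_{ik} - x_i x_k\right), \tag{4.12} $$
the kinetic energy of the rigid body can be rewritten as
$$ T = \frac{\mu V^2}{2} + \frac{1}{2} I_{ik}\omega_i\omega_k, \tag{4.13} $$
and the Lagrangian of the rigid body is
$$ L = \frac{\mu V^2}{2} + \frac{1}{2}I_{ik}\omega_i\omega_k - U. \tag{4.14} $$
If the body is regarded as continuous, the sum becomes an integral over the volume of the body:
$$ I_{ik} = \int \rho\left(x_l^2\delta_{ik} - x_i x_k\right) dV. \tag{4.15} $$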

Like any symmetrical tensor of rank two, the inertia tensor can be reduced to diagonal form by an appropriate choice of the directions of the axes $x_1$, $x_2$, $x_3$. These directions are called the principal axes of inertia, and the corresponding values of the diagonal components of the tensor are called the principal moments of inertia; we shall denote them by $I_1$, $I_2$, $I_3$. When the axes $x_1$, $x_2$, $x_3$ are so chosen, the kinetic energy of rotation takes the very simple form
$$ T_{\text{rot}} = \frac{1}{2}\left(I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2\right). \tag{4.16} $$

A body whose three principal moments of inertia are all different is called an asymmetrical top. If two are equal ($I_1 = I_2 \neq I_3$), we have a symmetrical top. In this case the direction of one of the principal axes in the $x_1x_2$-plane may be chosen arbitrarily. If all three principal moments of inertia are equal, the body is called a spherical top, and the three axes of inertia may be chosen arbitrarily as any three mutually perpendicular axes.

The determination of the principal axes of inertia is much simplified if the body is symmetrical,
for it is clear that the position of the centre of mass and the directions of the principal axes must
have the same symmetry as the body. For example, if the body has a plane of symmetry, the
centre of mass must lie in that plane, which also contains two of the principal axes of inertia,
while the third is perpendicular to the plane. If a body has an axis of symmetry of any order,
the centre of mass must lie on that axis, which is also one of the principal axes of inertia, while
the other two are perpendicular to it. If the axis is of order higher than the second, the body is
a symmetrical top. For any principal axis perpendicular to the axis of symmetry can be turned
through an angle different from π about the latter, i.e., the choice of the perpendicular axes is
not unique, and this can happen only if the body is a symmetrical top.

Finally, we may note one further result concerning the calculation of the inertia tensor. Although this tensor has been defined with respect to a system of coordinates whose origin is at the centre of mass, it may sometimes be more conveniently found by first calculating a similar tensor,
$$ I'_{ik} = \sum m\left(x'^2_l\delta_{ik} - x'_i x'_k\right), \tag{4.17} $$
defined with respect to some other origin O′. If the distance OO′ is represented by a vector a, i.e., $r = r' + a$, we have
$$ I'_{ik} = I_{ik} + \mu\left(a^2\delta_{ik} - a_i a_k\right). \tag{4.18} $$

Using this formula, we can easily find $I_{ik}$ if $I'_{ik}$ is known.
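Equation 4.18 is easy to check numerically for a handful of point masses. The sketch below builds the inertia tensor from the definition 4.12 and applies the shift formula; the function names and the sample masses and positions are illustrative assumptions.

```python
def inertia_tensor(masses, positions):
    """I_ik = sum m (x_l^2 delta_ik - x_i x_k) for point masses at the given positions."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(x * x for x in r)
        for i in range(3):
            for k in range(3):
                tensor[i][k] += m * ((r2 if i == k else 0.0) - r[i] * r[k])
    return tensor

def shifted_inertia_tensor(tensor_cm, total_mass, a):
    """Apply (4.18): I'_ik = I_ik + mu (a^2 delta_ik - a_i a_k), with I_ik taken about the centre of mass."""
    a2 = sum(x * x for x in a)
    return [[tensor_cm[i][k] + total_mass * ((a2 if i == k else 0.0) - a[i] * a[k])
             for k in range(3)] for i in range(3)]

masses = [1.0, 2.0, 3.0]
raw = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0)]
mu = sum(masses)
com = [sum(m * p[i] for m, p in zip(masses, raw)) / mu for i in range(3)]
r_cm = [tuple(p[i] - com[i] for i in range(3)) for p in raw]   # positions r about the centre of mass
a = (0.5, -1.0, 2.0)
r_prime = [tuple(p[i] - a[i] for i in range(3)) for p in r_cm]  # r' = r - a
print(inertia_tensor(masses, r_prime))
print(shifted_inertia_tensor(inertia_tensor(masses, r_cm), mu, a))
```

The two printed tensors should coincide up to rounding, which is the content of equation 4.18.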

Angular momentum

The value of the angular momentum of systems depends on the point with respect to which
it is defined. In the mechanics of a rigid body, the most appropriate point to choose for this
purpose is the origin of the moving system of coordinates, i.e., the centre of mass of the body.
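Then we have
$$ M = \sum m\, r\times(\omega\times r + V) = \sum m\left[r^2\omega - (\omega\cdot r) r\right], \tag{4.19} $$
where the term containing V vanishes because $\sum m r = 0$; or, in tensor notation,
$$ M_i = I_{ik}\omega_k. \tag{4.20} $$
If the axes $x_1$, $x_2$ and $x_3$ are the same as the principal axes of inertia, we have
$$ M_1 = I_1\omega_1, \quad M_2 = I_2\omega_2, \quad M_3 = I_3\omega_3. \tag{4.21} $$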

Equation of motion

Since a rigid body has, in general, six degrees of freedom, the general equations of motion must be six in number. They can be put in a form which gives the time derivatives of two vectors, the momentum and the angular momentum of the body. The first equation is obtained by simply summing the equations $\dot{p} = f$ for each particle in the body. In terms of the total momentum of the body
$$ P = \sum p = \mu V, \tag{4.22} $$
and the total force acting on it, $F = \sum f$, we have
$$ \frac{dP}{dt} = F. \tag{4.23} $$
Although F has been defined as the sum of all the forces f acting on the various particles, including the forces due to other particles, F actually includes only external forces: the forces of interaction between the particles composing the body must cancel out.
Let us now derive the second equation of motion, which gives the time derivative of the angular momentum M. To simplify the derivation, it is convenient to choose the fixed (inertial) frame of reference in such a way that the centre of mass is at rest in that frame at the instant considered. We have
$$ \dot{M} = \frac{d}{dt}\sum r\times p = \sum \dot{r}\times p + \sum r\times\dot{p}. \tag{4.24} $$
Our choice of the frame of reference (with V = 0) means that the vectors $\dot{r}$ and $p = mv$ are parallel, so $\dot{r}\times p = 0$. We have
$$ \frac{dM}{dt} = K \quad\text{where}\quad K = \sum r\times f. \tag{4.25} $$
Since M has been defined as the angular momentum about the centre of mass, it is unchanged when we go from one inertial frame to another. We can therefore deduce that the equation of motion, though derived for a particular frame of reference, is valid in any other inertial frame, by Galileo's relativity principle. The vector r × f is called the moment of the force f, and so K is the total torque, i.e., the sum of the moments of all the forces acting on the body. Like the total force, $\sum r\times f$ need include only the external forces.

Euler’s equations
Let dA/dt be the rate of change of any vector A with respect to the fixed system of coordinates. We have
$$ \frac{dA}{dt} = \frac{d'A}{dt} + \omega\times A, \tag{4.26} $$
where d′A/dt is the rate of change of the components of A in the body system of coordinates. Therefore,
$$ \frac{d'M}{dt} + \omega\times M = K. \tag{4.27} $$
Suppose the principal axes of inertia are $x_1$, $x_2$, $x_3$. We have
$$ I_1\frac{d\omega_1}{dt} + (I_3 - I_2)\omega_2\omega_3 = K_1; \tag{4.28a} $$
$$ I_2\frac{d\omega_2}{dt} + (I_1 - I_3)\omega_1\omega_3 = K_2; \tag{4.28b} $$
$$ I_3\frac{d\omega_3}{dt} + (I_2 - I_1)\omega_1\omega_2 = K_3. \tag{4.28c} $$
These are called Euler's equations.
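As a sanity check on the signs in equations 4.28, one can integrate them for a free top (K = 0) and watch the rotational energy 4.16 and $M^2 = \sum (I_i\omega_i)^2$, both of which should stay constant. The sketch below uses a plain RK4 step; the function names, the moments of inertia and the initial angular velocity are illustrative assumptions.

```python
def euler_rhs(w, I):
    """Time derivatives of (w1, w2, w3) from Euler's equations (4.28) with zero torque."""
    I1, I2, I3 = I
    return ((I2 - I3) * w[1] * w[2] / I1,
            (I3 - I1) * w[2] * w[0] / I2,
            (I1 - I2) * w[0] * w[1] / I3)

def integrate_free_top(w0, I, dt, steps):
    """Advance the body-frame angular velocity with classical RK4."""
    w = tuple(w0)
    for _ in range(steps):
        k1 = euler_rhs(w, I)
        k2 = euler_rhs(tuple(w[i] + 0.5 * dt * k1[i] for i in range(3)), I)
        k3 = euler_rhs(tuple(w[i] + 0.5 * dt * k2[i] for i in range(3)), I)
        k4 = euler_rhs(tuple(w[i] + dt * k3[i] for i in range(3)), I)
        w = tuple(w[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(3))
    return w

def rotational_energy(w, I):
    return 0.5 * sum(I[i] * w[i] ** 2 for i in range(3))

def angular_momentum_squared(w, I):
    return sum((I[i] * w[i]) ** 2 for i in range(3))

I = (1.0, 2.0, 3.0)
w_start = (1.0, 0.1, 0.5)
w_end = integrate_free_top(w_start, I, dt=1e-3, steps=10000)
print(rotational_energy(w_start, I), rotational_energy(w_end, I))
print(angular_momentum_squared(w_start, I), angular_momentum_squared(w_end, I))
```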
4.3 Eulerian angle


The motion of a rigid body can be described by means of the three coordinates of its centre of mass and any three angles which determine the orientation of the axes $x_1$, $x_2$, $x_3$ in the moving system of coordinates relative to the fixed system X, Y, Z. These angles may often be conveniently taken as what are called Eulerian angles. The moving $x_1x_2$-plane intersects the fixed XY-plane in some line ON, called the line of nodes. This line is evidently perpendicular to both the Z-axis and the $x_3$-axis; we take its positive direction as that of the vector product $\hat{Z}\times\hat{x}_3$ of the unit vectors along the Z-axis and the $x_3$-axis. We take, as the quantities defining the position of the axes $x_1$, $x_2$, $x_3$ relative to the axes X, Y, Z, the angle θ between the Z and $x_3$ axes, the angle ϕ between the X-axis and ON, and the angle ψ between the $x_1$-axis and ON.

Figure 4.1: Eulerian angle.

Let us now express the components of the angular velocity vector ω along the moving axes $x_1$, $x_2$, $x_3$ in terms of the Eulerian angles and their derivatives. To do this, we must find the components along those axes of the angular velocities $\dot\theta$, $\dot\phi$, $\dot\psi$. The angular velocity $\dot\theta$ is along the line of nodes ON. The angular velocity $\dot\phi$ is along the Z-axis. The angular velocity $\dot\psi$ is along the $x_3$-axis. Collecting the components along each axis, we have
$$ \omega_1 = \dot\phi\sin\theta\sin\psi + \dot\theta\cos\psi; $$
$$ \omega_2 = \dot\phi\sin\theta\cos\psi - \dot\theta\sin\psi; $$
$$ \omega_3 = \dot\phi\cos\theta + \dot\psi. $$

These formulae can be simplified for a symmetrical top by using the fact that the choice of directions of the principal axes $x_1$, $x_2$ is then arbitrary. If the $x_1$-axis is taken along the line of nodes ON, i.e., ψ = 0, the components of the angular velocity are simply
$$ \omega_1 = \dot\theta, \quad \omega_2 = \dot\phi\sin\theta, \quad \omega_3 = \dot\phi\cos\theta + \dot\psi. \tag{4.29} $$

For the free motion of a symmetrical top, we take the Z-axis of the fixed system of coordinates in the direction of the constant angular momentum M of the top. The $x_3$-axis of the moving system is along the axis of the top; let the $x_1$-axis coincide with the line of nodes at the instant considered. Then the components of the vector M are
$$ M_1 = I_1\dot\theta, \quad M_2 = I_2\dot\phi\sin\theta, \quad M_3 = I_3(\dot\phi\cos\theta + \dot\psi). \tag{4.30} $$

Since the $x_1$-axis is perpendicular to the Z-axis, we have
$$ M_1 = 0, \quad M_2 = M\sin\theta, \quad M_3 = M\cos\theta. \tag{4.31} $$

Comparison gives
$$ \dot\theta = 0, \quad \dot\phi = \frac{M}{I_1}, \quad \dot\phi\cos\theta + \dot\psi = \frac{M\cos\theta}{I_3}. \tag{4.32} $$
The first of these equations gives θ = constant, i.e., the angle between the axis of the top and the direction of M is constant. The second equation gives the angular velocity of precession $\dot\phi = M/I_1$. Finally, the third equation gives the angular velocity with which the top rotates about its own axis, $\omega_3 = M\cos\theta/I_3$.
Part II

Classical Field Theory


Chapter 5
Special Relativity

5.1 The principle of relativity


First, we assume that the velocity of propagation of interactions has an upper limit c. (In the following, we adopt the unit system in which c = 1.) Second, we assume that all inertial reference frames are equivalent in describing the laws of physics. Then, we can obtain the invariant interval when transforming from one inertial reference frame to another:
$$ ds^2 = -dt^2 + dx^2 + dy^2 + dz^2. \tag{5.1} $$

A transformation preserving the invariant interval is called a Lorentz transformation, which can be written as
$$ x'^\mu = \Lambda^\mu{}_\nu x^\nu. \tag{5.2} $$
The invariant symbol of the vector representation of the Lorentz transformation is $\eta_{\mu\nu}$. We have
$$ \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma \eta_{\mu\nu} = \eta_{\rho\sigma} \quad\text{where}\quad \eta_{\mu\nu} \equiv \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1\end{pmatrix}. \tag{5.3} $$
The inverse of $\eta_{\mu\nu}$ is denoted as $\eta^{\mu\nu}$. We can use $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$ to raise and lower vector indices:
$$ x_\mu \equiv \eta_{\mu\nu}x^\nu, \quad x^\mu = \eta^{\mu\nu}x_\nu. \tag{5.4} $$

We can also verify the following identities:
$$ \Lambda^\mu{}_\rho\Lambda_\nu{}^\rho = \delta^\mu_\nu; \tag{5.5a} $$
$$ x'_\mu = \Lambda_\mu{}^\nu x_\nu; \tag{5.5b} $$
$$ \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma\eta^{\rho\sigma} = \eta^{\mu\nu}. \tag{5.5c} $$

In the special case where the new reference frame moves along the $\hat{1}$ direction with velocity β, we have
$$ x'^0 = \gamma x^0 - \gamma\beta x^1, \quad x'^1 = -\gamma\beta x^0 + \gamma x^1 \quad\text{where}\quad \gamma\equiv(1-\beta^2)^{-1/2}. \tag{5.6} $$

Defining the rapidity η by β = tanh η, the transformation can be rewritten as
$$ \begin{pmatrix} x'^0 \\ x'^1\end{pmatrix} = e^{-Z\eta}\begin{pmatrix}x^0\\x^1\end{pmatrix} \quad\text{where}\quad Z = \begin{pmatrix}0&1\\1&0\end{pmatrix}. \tag{5.7} $$

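Generally, Lorentz transformations for vectors can be represented by
$$ \Lambda = \exp\left(\frac{1}{2}\theta_{\mu\nu}\Sigma_V^{\mu\nu}\right) \quad\text{where}\quad (\Sigma_V^{\mu\nu})^\rho{}_\sigma = \eta^{\mu\rho}\delta^\nu_\sigma - \eta^{\nu\rho}\delta^\mu_\sigma, \quad \theta_{0i} = -\eta_i, \quad \theta_{ij} = -\epsilon_{ijk}\theta_k, \tag{5.8} $$
or explicitly,
$$ \Lambda = \exp\begin{pmatrix} 0&\eta_1&\eta_2&\eta_3\\ \eta_1&0&-\theta_3&\theta_2\\ \eta_2&\theta_3&0&-\theta_1\\ \eta_3&-\theta_2&\theta_1&0\end{pmatrix}. \tag{5.9} $$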

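Some physical quantities behave like tensors (vectors, scalars) when transforming from one inertial frame to another. For example,

scalar: proper time dτ, mass m, electrical charge e.

vector: four velocity $u^\mu \equiv dx^\mu/d\tau$, four momentum $p^\mu\equiv mu^\mu$, four acceleration $a^\mu\equiv du^\mu/d\tau$, four force $f^\mu \equiv ma^\mu$.

We can also define some three vectors.

three velocity: $\hat{v}^i \equiv dx^i/dt$

The three velocity relates to the four velocity through
$$ u^0 = \gamma, \quad u^i = \gamma\hat v^i, \tag{5.10} $$
where γ here is evaluated for the particle's own velocity, $\gamma = (1-\hat v^2)^{-1/2}$.

If the new reference frame moves along the $\hat 1$ direction with velocity β, we have (with $\gamma = (1-\beta^2)^{-1/2}$)
$$ \hat v'^1 = \frac{\hat v^1 - \beta}{1 - \hat v^1\beta}, \quad \hat v'^2 = \frac{\hat v^2}{\gamma(1-\hat v^1\beta)}, \quad \hat v'^3 = \frac{\hat v^3}{\gamma(1-\hat v^1\beta)}. \tag{5.11} $$

three momentum: $\hat p^i \equiv p^i$

three acceleration: $\hat a^i\equiv d\hat v^i/dt$

three force: $\hat f^i \equiv d\hat p^i/dt$

The three force relates to the four force through
$$ f^i = \gamma\hat f^i. \tag{5.12} $$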
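The velocity transformation 5.11 can be tried out numerically. The sketch below implements it directly (with c = 1) and checks two expected consequences: a light ray keeps unit speed, and for collinear motion the rapidities simply subtract. The function name `boost_velocity` and the sample velocities are illustrative assumptions.

```python
import math

def boost_velocity(v, beta):
    """Three velocity seen from a frame moving along the 1-direction with velocity beta (eq. 5.11, c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    denom = 1.0 - v[0] * beta
    return ((v[0] - beta) / denom, v[1] / (gamma * denom), v[2] / (gamma * denom))

# A light ray travelling obliquely still has unit speed in the new frame.
light = boost_velocity((0.6, 0.8, 0.0), 0.5)
print(math.sqrt(sum(c * c for c in light)))

# Collinear case: rapidity of the result equals atanh(v) - atanh(beta).
v_new = boost_velocity((0.8, 0.0, 0.0), 0.5)
print(math.atanh(v_new[0]), math.atanh(0.8) - math.atanh(0.5))
```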

5.2 Relativistic Mechanics


For a free particle, we have the following equation of motion:
$$ \frac{dp^\mu}{d\tau} = 0. \tag{5.13} $$

It can be derived in several ways.

Lagrangian formulation
The action for a free particle is given by
$$ S = \int_{t_1}^{t_2} L\,dt = -m\int_a^b d\tau \quad\text{where}\quad L = -m\sqrt{1 - \dot x^i\dot x_i}. \tag{5.14} $$

The action is stationary under perturbations satisfying the constraints $\delta x^\mu(a) = \delta x^\mu(b) = 0$. We can derive the equation of motion
$$ m\frac{du^\mu}{d\tau} = 0. \tag{5.15} $$

Hamiltonian formulation
The canonical momenta and Hamiltonian for a free particle are
$$ \pi^i = \frac{\partial L}{\partial\dot x_i} = \gamma m\eta^{ij}\dot x_j, \quad H = \pi^i\dot x_i - L = \gamma m = \sqrt{m^2 + \pi^i\pi_i}. \tag{5.16} $$
Thus, Hamilton's equations for a free particle are
$$ \dot\pi^i = 0, \quad \dot x_i = \eta_{ij}\frac{\pi^j}{\sqrt{m^2+\pi^k\pi_k}}. \tag{5.17} $$

Hamilton-Jacobi equation
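The Hamilton-Jacobi equation for a free particle is
$$ \left(\frac{\partial S}{\partial t}\right)^2 = m^2 + \left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2. \tag{5.18} $$
Notice that $p^0 = H = -\partial_t S$ and $p^i = \pi^i = \partial_i S$. We therefore have $p^\mu = \partial^\mu S$.

Non-free particle
For a non-free particle, we have the revised Newton's second law:
$$ f^\mu = \frac{dp^\mu}{d\tau}. \tag{5.19} $$

It can also be written in terms of three vectors as
$$ \hat f^i = \gamma m\hat a^i + \gamma^3(\hat a^j\hat v_j)m\hat v^i. \tag{5.20} $$

If the system consists of more than one particle interacting with each other, conservation laws follow from the symmetries of the action.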

Translational symmetry and conservation of momentum


The action is invariant under the transformation $x'^\mu = x^\mu + \delta x^\mu$. So we have
$$ \delta S = \sum p_\mu\delta x^\mu\Big|_a^b = 0, \tag{5.21} $$
i.e., $\sum p^\mu$ is conserved.

Rotational symmetry and conservation of angular momentum

The action is invariant under the transformation $\bar x^\mu = x^\mu + x_\nu\delta\omega^{\mu\nu}$, where I + δω is an infinitesimal Lorentz transformation. So we have
$$ \delta S = \sum p^\mu x^\nu\delta\omega_{\mu\nu}\Big|_a^b = 0. \tag{5.22} $$
Since δω is antisymmetric, $\sum M^{\mu\nu}$ is conserved, where $M^{\mu\nu}\equiv x^\mu p^\nu - x^\nu p^\mu$.

5.3 Relativistic Scattering


5.3.1 Distribution function
The number of particles with coordinates between r and r + dr and momenta between p and p + dp is $f(p, r)\,dp_x\,dp_y\,dp_z\,dx\,dy\,dz$. The function f(p, r) is called the distribution function.
We first determine the properties of the "volume element" $dp_x\,dp_y\,dp_z$ with respect to Lorentz transformations. If we introduce a four-dimensional coordinate system, on whose axes are marked the components of the four-momentum of a particle, then $dp_x\,dp_y\,dp_z$ can be considered as the zeroth component of an element of the hypersurface defined by the equation $p^\mu p_\mu + m^2 = 0$. The element of hypersurface is a four-vector directed along the normal to the hypersurface; in our case the direction of the normal obviously coincides with the direction of the four-vector $p^\mu$. From this it follows that the ratio $dp_x\,dp_y\,dp_z/E$ is an invariant quantity, since it is the ratio of corresponding components of two parallel four-vectors.
We also notice that dV dt is invariant under Lorentz transformations and dt = E dτ/m. We can infer that E dx dy dz is an invariant quantity. Putting everything together, we see that the phase-space volume $dp_x\,dp_y\,dp_z\,dx\,dy\,dz$ is invariant. Therefore, we have
$$ f(r, p) = f'(r', p') \tag{5.23} $$
under Lorentz transformations.

5.3.2 Invariant cross section


Recall the definition of the cross section,
$$ \sigma\equiv\frac{\text{Number of events per target}}{\text{Time}\times\text{Incident flux}}, \tag{5.24} $$
where the incident flux and time are measured in the rest frame of the target particle.
Suppose that we have two colliding beams; we denote by $n_1$ and $n_2$ the particle densities in them and by $v_1$ and $v_2$ the velocities of the particles. In the reference system in which particle 2 is at rest, we are dealing with the collision of the beam of particles 1 with a stationary target. Then according to the usual definition of the cross-section σ, the number of collisions occurring in volume dV in time dt is
$$ dN = \sigma v_{\text{rel}} n_1 n_2\,dV\,dt, \tag{5.25} $$
where $v_{\text{rel}}$ is the velocity of particle 1 in the rest system of particle 2 (which is just the definition of the relative velocity of two particles in relativistic mechanics).
The number dN is by its very nature an invariant quantity. We would like to express it in a form which is applicable in any reference system:
$$ dN = A n_1 n_2\,dV\,dt, \tag{5.26} $$
where A is a number to be determined, for which we know that its value in the rest frame of one of the particles is $v_{\text{rel}}\sigma$. We shall always mean by σ precisely the cross-section in the rest frame of one of the particles, i.e., by definition, an invariant quantity. From its definition, the relative velocity $v_{\text{rel}}$ is also invariant. The product dV dt is an invariant. Therefore the product $A n_1 n_2$ must also be an invariant. The law of transformation of the particle density n is
$$ n = \frac{n_0}{\sqrt{1-v^2}} = n_0E/m, \tag{5.27} $$
where $n_0$ is the density in the rest frame of the particle. Thus we can construct A in an arbitrary frame as
$$ A = -\sigma v_{\text{rel}}\frac{p_1^\mu p_{2\mu}}{E_1E_2}. \tag{5.28} $$
Notice that
$$ -p_1^\mu p_{2\mu} = \frac{m_1m_2}{\sqrt{1-v_{\text{rel}}^2}} = m_1m_2\frac{1-v_1\cdot v_2}{\sqrt{(1-v_1^2)(1-v_2^2)}}. \tag{5.29} $$
We can get the following expression for $v_{\text{rel}}$:
$$ v_{\text{rel}} = \frac{\sqrt{(v_1-v_2)^2 - (v_1\times v_2)^2}}{1-v_1\cdot v_2}. \tag{5.30} $$
Finally, we have
$$ dN = \sigma\sqrt{(v_1-v_2)^2-(v_1\times v_2)^2}\,n_1n_2\,dV\,dt. \tag{5.31} $$
If the velocities $v_1$ and $v_2$ are collinear, then we have
$$ dN = \sigma|v_1-v_2|n_1n_2\,dV\,dt. \tag{5.32} $$

If we have only one target, then we have
$$ dN = \sigma|v_1-v_2|n_1\,dt. \tag{5.33} $$
5.3.3 Elastic scattering between two particles


The energies and momenta of the two particles before and after scattering are $(E_1, p_1, E_2, p_2)$ and $(E_1', p_1', E_2', p_2')$ respectively. In the lab frame (L frame),
$$ E_2 = m_2, \quad p_2 = 0. \tag{5.34} $$

From the conservation of four momentum, we can derive that
$$ \cos\theta_1 = \frac{E_1'(E_1+m_2) - E_1m_2 - m_1^2}{p_1p_1'}, \quad \cos\theta_2 = \frac{(E_1+m_2)(E_2'-m_2)}{p_1p_2'}, \tag{5.35} $$
where $\theta_1$ ($\theta_2$) is the angle between $p_1'$ ($p_2'$) and $p_1$. In particular, if $m_1 = 0$, we have
$$ E_1' = \frac{E_1}{(1-\cos\theta_1)E_1/m_2 + 1}. \tag{5.36} $$

Define $x\equiv p_1'\cos\theta_1$, $y \equiv p_1'\sin\theta_1$. We have
$$ \frac{(x-c)^2}{a^2} + \frac{y^2}{b^2} = 1, \tag{5.37} $$
where
$$ a \equiv \frac{p_1(E_1m_2+m_2^2)}{m_1^2+m_2^2+2m_2E_1}, \quad b\equiv\frac{m_2p_1}{\sqrt{m_1^2+m_2^2+2m_2E_1}} = a\sqrt{1-V^2}, \quad c\equiv\frac{p_1(E_1m_2+m_1^2)}{m_1^2+m_2^2+2m_2E_1}, \tag{5.38} $$
where $V\equiv p_1/(E_1+m_2)$ is the speed of particle 2 before scattering in the center of mass frame (C frame). It is easy to see that $a + c = p_1$. The scattering in the L frame is illustrated in Figure 5.1. We note that if $m_1 > m_2$, the scattering angle $\theta_1$ cannot exceed a certain maximum value, which is given by $\sin\theta_{1\max} = m_2/m_1$.

Figure 5.1: Relativistic scattering.

Suppose the scattering angle in the C frame is χ. We can derive that
$$ E_1' = E_1-\Delta E, \quad E_2' = m_2+\Delta E \quad\text{where}\quad \Delta E\equiv\frac{m_2(E_1^2-m_1^2)}{m_1^2+m_2^2+2m_2E_1}(1-\cos\chi). \tag{5.39} $$
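One way to see where the ellipse 5.37 and the energy transfer 5.39 come from is to scatter by an angle χ in the C frame and boost the result back to the L frame. The sketch below follows that route (c = 1); the function names and the sample masses and energy are illustrative assumptions, and the boost construction is our own working rather than a derivation given in the notes.

```python
import math

def lab_scattering(m1, m2, E1, chi):
    """Return (x, y, E1_after) in the L frame for a C-frame scattering angle chi."""
    s = m1 ** 2 + m2 ** 2 + 2.0 * m2 * E1      # squared total energy in the C frame
    p1 = math.sqrt(E1 ** 2 - m1 ** 2)          # incident momentum in the L frame
    V = p1 / (E1 + m2)                         # velocity of the C frame relative to the L frame
    gamma = (E1 + m2) / math.sqrt(s)
    p0 = m2 * p1 / math.sqrt(s)                # momentum of either particle in the C frame
    E10 = math.sqrt(p0 ** 2 + m1 ** 2)         # energy of particle 1 in the C frame
    x = gamma * (p0 * math.cos(chi) + V * E10) # longitudinal component is boosted
    y = p0 * math.sin(chi)                     # transverse component is unchanged
    E1_after = gamma * (E10 + V * p0 * math.cos(chi))
    return x, y, E1_after

def ellipse_parameters(m1, m2, E1):
    """Semi-axes a, b and centre offset c from equation (5.38)."""
    s = m1 ** 2 + m2 ** 2 + 2.0 * m2 * E1
    p1 = math.sqrt(E1 ** 2 - m1 ** 2)
    return p1 * (E1 * m2 + m2 ** 2) / s, m2 * p1 / math.sqrt(s), p1 * (E1 * m2 + m1 ** 2) / s

m1, m2, E1 = 1.0, 2.0, 3.0
a, b, c = ellipse_parameters(m1, m2, E1)
for chi in (0.3, 1.2, 2.5):
    x, y, E1_after = lab_scattering(m1, m2, E1, chi)
    print(((x - c) / a) ** 2 + (y / b) ** 2, E1 - E1_after)
```

The first printed number should be 1 for every χ, and the second should match ΔE in equation 5.39.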
Chapter 6
Classical Field Theory

In physics, a field is a physical quantity, represented by a number or tensor, that has a value
for each point in space and time. A classical field theory is a physical theory that predicts how
one or more physical fields interact with matter through field equations.

6.1 Lagrangian formulation


Similar to classical mechanics, a field theory tends to be expressed mathematically by using Lagrangians and the action principle. Given a field $\phi_a(x)$, a quantity called the Lagrangian density $\mathcal{L}$ can be constructed from $\phi_a(x)$ and its derivatives. From this density, the action functional can be constructed by integrating over spacetime,
$$ S = \int_{t_1}^{t_2}\mathcal{L}(\phi_a,\dot\phi_a,\nabla\phi_a)\,d^4x. \tag{6.1} $$

Keeping the end points fixed and requiring S to be stationary under small perturbations, we obtain the field equations
$$ \partial_\mu\left(\frac{\partial\mathcal L}{\partial(\partial_\mu\phi_a)}\right) - \frac{\partial\mathcal L}{\partial\phi_a} = 0. \tag{6.2} $$

A key feature of all theories of nature is the property of locality. Locality requires that there are no terms in the Lagrangian coupling ϕ(x, t) directly to ϕ(y, t) with x ≠ y. The closest we get for the x label is the coupling between ϕ(x, t) and ϕ(x + δx, t) through the gradient term ∇ϕ.
Modern formulations of classical field theories also require Lorentz covariance, as the laws of nature are relativistic. The field can behave like a scalar or a vector, while the Lagrangian density must be a scalar, or more loosely, the action must be invariant under Lorentz transformations.
Scalar fields: Under the Lorentz transformation x′ = Λx, we have
$$ \phi'(x') = \phi(\Lambda^{-1}x'). \tag{6.3} $$

Vector fields: Under the Lorentz transformation x′ = Λx, we have
$$ A'^\mu(x') = \Lambda^\mu{}_\nu A^\nu(\Lambda^{-1}x'), \quad A'_\mu(x') = \Lambda_\mu{}^\nu A_\nu(\Lambda^{-1}x'). \tag{6.4} $$

The derivative of a scalar field behaves like a vector field,
$$ \partial'_\mu\phi'(x') = (\Lambda^{-1})^\nu{}_\mu\partial_\nu\phi(\Lambda^{-1}x') = \Lambda_\mu{}^\nu\partial_\nu\phi(\Lambda^{-1}x'). \tag{6.5} $$

6.2 Symmetry and conservation law


Theorem 6.1 Noether’s theorem

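Every continuous symmetry of the Lagrangian density gives rise to a conserved current $j^\mu(x)$ such that the equations of motion imply $\partial_\mu j^\mu = 0$. Suppose that the infinitesimal transformation gives
$$ \phi_a\to\phi_a+\delta\phi_a, \quad \mathcal L\to\mathcal L+\delta\mathcal L. \tag{6.6} $$
If $\delta\mathcal L = \partial_\mu K^\mu$, the conserved current is
$$ j^\mu = -\frac{\partial\mathcal L}{\partial(\partial_\mu\phi_a)}\delta\phi_a + K^\mu. \tag{6.7} $$

Under an infinitesimal spacetime translation, we have $\phi_a(x^\mu)\to\phi_a(x^\mu-a^\mu) = \phi_a - a^\mu\partial_\mu\phi_a + O(a^2)$ and $\mathcal L(x^\mu)\to\mathcal L(x^\mu-a^\mu) = \mathcal L - a^\mu\partial_\mu\mathcal L+O(a^2)$. The conserved current is
$$ j^\mu = -a_\nu T^{\mu\nu} \quad\text{where}\quad T^{\mu\nu}\equiv -\frac{\partial\mathcal L}{\partial(\partial_\mu\phi_a)}\partial^\nu\phi_a + \eta^{\mu\nu}\mathcal L. \tag{6.8} $$

The arbitrariness of $a^\mu$ implies that $\partial_\mu T^{\mu\nu} = 0$. If we define
$$ P^\mu\equiv\int T^{0\mu}\,d^3x, \tag{6.9} $$
we obtain the law of momentum conservation:
$$ \frac{dP^\mu}{dt} = 0. \tag{6.10} $$

Under a Lorentz transformation, the field generally transforms as
$$ \phi'_a(x') = S_a{}^b\phi_b(\Lambda^{-1}x'). \tag{6.11} $$

In the limit of an infinitesimal Lorentz transformation, we have $\Lambda^\mu{}_\nu = \delta^\mu_\nu+\delta\omega^\mu{}_\nu+O(\delta\omega^2)$, where
$$ \delta\omega^\mu{}_\nu\equiv\begin{pmatrix}0&\beta_1&\beta_2&\beta_3\\ \beta_1&0&-\theta_3&\theta_2\\ \beta_2&\theta_3&0&-\theta_1\\ \beta_3&-\theta_2&\theta_1&0\end{pmatrix}. \tag{6.12} $$
The matrix S can be approximated by
$$ S_a{}^b = \delta_a^b + \frac12\delta\omega_{\alpha\beta}(\Sigma^{\alpha\beta})_a{}^b + O(\delta\omega^2). \tag{6.13} $$
We can derive the conserved current
$$ j^\mu = \frac12 M^{\mu\nu\rho}\delta\omega_{\nu\rho} \quad\text{where}\quad M^{\mu\nu\rho}\equiv x^\nu T^{\mu\rho} - x^\rho T^{\mu\nu} - \frac{\partial\mathcal L}{\partial(\partial_\mu\phi_a)}(\Sigma^{\nu\rho})_a{}^b\phi_b. \tag{6.14} $$

Since $\delta\omega_{\mu\nu}$ is arbitrary except for its antisymmetry, $\partial_\mu M^{\mu\nu\rho} = 0$. If we define
$$ M^{\nu\rho}\equiv\int M^{0\nu\rho}\,d^3x, \tag{6.15} $$
we obtain the law of angular momentum conservation,
$$ \frac{dM^{\nu\rho}}{dt} = 0. \tag{6.16} $$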

6.3 Functional derivatives


Definition 6.1 Functional derivatives

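Given a manifold M representing (continuous/smooth) functions ρ (with certain boundary conditions etc.), and a functional F defined as
$$ F: M\to\mathbb R \quad\text{or}\quad F: M\to\mathbb C, \tag{6.17} $$
the functional derivative of F[ρ], denoted δF/δρ, is defined by
$$ \int\frac{\delta F}{\delta\rho(x)}\phi(x)\,dx = \lim_{\epsilon\to0}\frac{F[\rho+\epsilon\phi]-F[\rho]}{\epsilon} = \left[\frac{d}{d\epsilon}F[\rho+\epsilon\phi]\right]_{\epsilon=0}, \tag{6.18} $$
where ϕ is an arbitrary function. The quantity ϵϕ is called the variation of ρ.

Like the derivative of a function, the functional derivative satisfies the following properties, where F[ρ] and G[ρ] are functionals:

Linearity
$$ \frac{\delta(\lambda F+\mu G)[\rho]}{\delta\rho(x)} = \lambda\frac{\delta F[\rho]}{\delta\rho(x)} + \mu\frac{\delta G[\rho]}{\delta\rho(x)}, \tag{6.19} $$
where λ, µ are constants.

Product rule
$$ \frac{\delta(FG)[\rho]}{\delta\rho(x)} = \frac{\delta F[\rho]}{\delta\rho(x)}G[\rho] + F[\rho]\frac{\delta G[\rho]}{\delta\rho(x)}. \tag{6.20} $$

Chain rules: If F is a functional and G an operator, then
$$ \frac{\delta F[G[\rho]]}{\delta\rho(x)} = \int dy\,\frac{\delta F[G[\rho]]}{\delta G[\rho](y)}\frac{\delta G[\rho](y)}{\delta\rho(x)}. \tag{6.21} $$

If G is an ordinary differentiable function g, then this reduces to
$$ \frac{\delta F[g(\rho)]}{\delta\rho(x)} = \frac{\delta F[g(\rho)]}{\delta g(\rho(x))}\frac{dg}{d\rho}. \tag{6.22} $$

Proposition 6.1 Properties of functional derivatives

$$ \frac{\delta F}{\delta\rho(x)} = \lim_{\epsilon\to0}\frac{1}{\epsilon}\left\{F[\rho+\epsilon\delta_x]-F[\rho]\right\} \quad\text{where}\quad \delta_x\equiv\delta(y-x). \tag{6.23} $$
$$ \frac{\delta f(y)}{\delta f(x)} = \delta(y-x), \quad \frac{\delta f'(y)}{\delta f(x)} = \frac{d\delta(y-x)}{dy}. \tag{6.24} $$
$$ \frac{\delta}{\delta f(x)}\int g(f(t))\,dt = g'(f(x)). \tag{6.25} $$
$$ \frac{\delta}{\delta f(x)}\int g(f'(t))\,dt = -\frac{d}{dx}\left[g'(f'(x))\right]. \tag{6.26} $$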
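The defining limit in equation 6.23 can be imitated on a grid: perturb the sampled function at one node by ε times a discrete delta function (height 1/h) and take a difference quotient. The sketch below applies this to the two cases 6.25 and 6.26 with $g(u) = u^3$ and $g(u) = u^2/2$ respectively; the grid, the function names and the central-difference variant are illustrative assumptions.

```python
import math

def functional_derivative(F, f, j, h, eps=1e-6):
    """Central-difference estimate of dF/df(x_j) for samples f on a grid of spacing h."""
    bump = eps / h  # eps times a discrete delta function of height 1/h at node j
    plus = list(f)
    plus[j] += bump
    minus = list(f)
    minus[j] -= bump
    return (F(plus, h) - F(minus, h)) / (2.0 * eps)

def local_functional(f, h):
    """F[f] = integral of f(t)^3 dt, approximated by a Riemann sum."""
    return sum(v ** 3 for v in f) * h

def gradient_functional(f, h):
    """F[f] = integral of f'(t)^2 / 2 dt, using forward differences for f'."""
    return sum(0.5 * ((f[i + 1] - f[i]) / h) ** 2 for i in range(len(f) - 1)) * h

h = 0.01
f = [math.sin(i * h) for i in range(201)]  # f(t) = sin t sampled on [0, 2]
j = 100                                    # interior node at t = 1
print(functional_derivative(local_functional, f, j, h), 3.0 * math.sin(j * h) ** 2)
print(functional_derivative(gradient_functional, f, j, h), math.sin(j * h))
```

The first pair compares with $g'(f(x)) = 3f(x)^2$ from 6.25, and the second with $-f''(x) = \sin x$ from 6.26, up to discretization error.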

6.4 Hamiltonian formulation


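A field theory can also be expressed by using canonical momenta and a Hamiltonian. Given the field and Lagrangian density, the canonical momentum and Hamiltonian density are
$$ \pi^a = \frac{\partial\mathcal L}{\partial\dot\phi_a}, \quad \mathcal H(\phi_a,\nabla\phi_a,\pi^a) = \pi^a\dot\phi_a-\mathcal L. \tag{6.27} $$
The Hamiltonian can be constructed by integrating over space,
$$ H = \int\mathcal H\,d^3x. \tag{6.28} $$

By applying the action principle, we can derive Hamilton's equations
$$ \dot\phi_a(x) = \frac{\partial\mathcal H}{\partial\pi^a} = \frac{\delta H}{\delta\pi^a(x)}, \quad \dot\pi^a(x) = -\frac{\partial\mathcal H}{\partial\phi_a} + \left(\frac{\partial\mathcal H}{\partial\phi_{a,i}}\right)_{,i} = -\frac{\delta H}{\delta\phi_a(x)}. \tag{6.29} $$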

6.4.1 Poisson bracket


The Poisson bracket of two functionals obeys the same rules (anticommutativity, bilinearity, Leibniz's rule and the Jacobi identity) as for two classical quantities. We further demand that differentiation and integration in space commute with the bracket operation. If we assume that
$$ \{\phi_a(x),\phi_b(y)\} = 0, \quad \{\pi^a(x),\pi^b(y)\} = 0, \quad \{\phi_a(x),\pi^b(y)\} = \delta_a^b\delta(x-y), \tag{6.30} $$
we can derive that
$$ \{F[\phi(x),\pi(x)], G[\phi(x),\pi(x)]\} = \int d^3x\left(\frac{\delta F}{\delta\phi(x)}\frac{\delta G}{\delta\pi(x)} - \frac{\delta F}{\delta\pi(x)}\frac{\delta G}{\delta\phi(x)}\right). \tag{6.31} $$
Hamilton's equations can therefore be rewritten as
$$ \dot\phi_a(x) = \{\phi_a(x),H\}, \quad \dot\pi^a(x) = \{\pi^a(x),H\}. \tag{6.32} $$

We can further get
$$ \frac{dO(\phi,\pi,t)}{dt} = \{O,H\}+\frac{\partial O}{\partial t}, \quad \frac{d\{A,B\}}{dt} = \left\{A,\frac{dB}{dt}\right\} + \left\{\frac{dA}{dt},B\right\}. \tag{6.33} $$

6.4.2 Momentum
Using equations 6.8 and 6.9, we can derive that
$$ P^0 = H, \quad P^i = \int -\pi^a\partial^i\phi_a\,d^3x. \tag{6.34} $$

We now obtain the following Poisson brackets:
$$ \{\phi_a,P^\mu\} = -\partial^\mu\phi_a, \quad \{\pi^a,P^\mu\} = -\partial^\mu\pi^a, \quad \{P^\mu,P^\nu\} = 0. \tag{6.35} $$

6.4.3 Angular momentum


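Using equations 6.14 and 6.15, we have
$$ M^{\mu\nu} = \int\left(x^\mu T^{0\nu} - x^\nu T^{0\mu} - \pi^a(\Sigma^{\mu\nu})_a{}^b\phi_b\right)d^3x. \tag{6.36} $$

If we define
$$ M_L^{\mu\nu}\equiv\int\left(x^\mu T^{0\nu}-x^\nu T^{0\mu}\right)d^3x, \quad M_S^{\mu\nu}\equiv\int\left(-\pi^a(\Sigma^{\mu\nu})_a{}^b\phi_b\right)d^3x, \tag{6.37} $$
we can get the following Poisson brackets:
$$ \{\phi_a,M_L^{\mu\nu}\} = (L^{\mu\nu})_a{}^b\phi_b, \quad \{\phi_a,M_S^{\mu\nu}\} = (S^{\mu\nu})_a{}^b\phi_b, \tag{6.38} $$
where
$$ (L^{\mu\nu})_a{}^b\equiv-(x^\mu\partial^\nu-x^\nu\partial^\mu)\delta_a^b, \quad (S^{\mu\nu})_a{}^b\equiv-(\Sigma^{\mu\nu})_a{}^b. \tag{6.39} $$
Because $dM^{\mu\nu}/dt = 0$, $M^{\mu\nu}$ commutes with d/dt. Derivatives with respect to spatial coordinates also commute with the bracket operation by definition. As a result, we have
$$ \{\{\phi(x),M^{\mu\nu}\},M^{\rho\sigma}\} = (L^{\mu\nu}+S^{\mu\nu})(L^{\rho\sigma}+S^{\rho\sigma})\phi(x). \tag{6.40} $$

We can further derive that
$$ \{\phi(x),\{M^{\mu\nu},M^{\rho\sigma}\}\} = \left(L^{\mu\nu}L^{\rho\sigma}-L^{\rho\sigma}L^{\mu\nu}+S^{\mu\nu}S^{\rho\sigma}-S^{\rho\sigma}S^{\mu\nu}\right)\phi(x). \tag{6.41} $$

Notice that
$$ L^{\mu\nu}L^{\rho\sigma}-L^{\rho\sigma}L^{\mu\nu} = -\eta^{\nu\rho}L^{\mu\sigma}+\eta^{\sigma\mu}L^{\rho\nu}+\eta^{\mu\rho}L^{\nu\sigma}-\eta^{\sigma\nu}L^{\rho\mu}. \tag{6.42} $$

If we demand that
$$ S^{\mu\nu}S^{\rho\sigma}-S^{\rho\sigma}S^{\mu\nu} = -\eta^{\nu\rho}S^{\mu\sigma}+\eta^{\sigma\mu}S^{\rho\nu}+\eta^{\mu\rho}S^{\nu\sigma}-\eta^{\sigma\nu}S^{\rho\mu}, \tag{6.43} $$
we can obtain the following Poisson bracket
$$ \{M^{\mu\nu},M^{\rho\sigma}\} = -\eta^{\nu\rho}M^{\mu\sigma}+\eta^{\sigma\mu}M^{\rho\nu}+\eta^{\mu\rho}M^{\nu\sigma}-\eta^{\sigma\nu}M^{\rho\mu}, \tag{6.44} $$
up to the possibility of a term on the right-hand side that commutes with ϕ(x) and its derivatives.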

We now define $J_i\equiv\epsilon_{ijk}M^{jk}/2$ and $K_i\equiv M^{i0}$, or explicitly,
$$ M^{\mu\nu} = \begin{pmatrix}0&-K_1&-K_2&-K_3\\K_1&0&J_3&-J_2\\K_2&-J_3&0&J_1\\K_3&J_2&-J_1&0\end{pmatrix}. \tag{6.45} $$
Equation 6.44 can then be rewritten as
$$ \{J_i,J_j\} = \epsilon_{ijk}J_k, \quad \{J_i,K_j\} = \epsilon_{ijk}K_k, \quad \{K_i,K_j\} = -\epsilon_{ijk}J_k. \tag{6.46} $$

By a similar method, we can also derive that
$$ \{P^\mu,M^{\rho\sigma}\} = \eta^{\mu\sigma}P^\rho-\eta^{\mu\rho}P^\sigma. \tag{6.47} $$

It can be rewritten as
$$ \{J_i,H\}=0, \quad \{J_i,P_j\}=\epsilon_{ijk}P_k, \quad \{K_i,H\}=P_i, \quad \{K_i,P_j\}=\delta_{ij}H. \tag{6.48} $$

Finally, we define $L_i\equiv\epsilon_{ijk}M_L^{jk}/2$ and $S_i\equiv\epsilon_{ijk}M_S^{jk}/2$. We can derive that
$$ \{L_i,S_j\}=0, \quad \{S_i,P_j\}=0, \quad \{L_i,P_j\}=\epsilon_{ijk}P_k. \tag{6.49} $$


Chapter 7
Classical Electrodynamics

7.1 The formulation of classical electrodynamics


7.1.1 Maxwell’s equations and Lorentz force
The Lagrangian density of the electromagnetic field (EM field) $A^\mu$ coupled to a current is
$$ \mathcal L = -\frac14F^{\mu\nu}F_{\mu\nu}+j^\mu A_\mu \quad\text{where}\quad F_{\mu\nu}\equiv\partial_\mu A_\nu-\partial_\nu A_\mu, \quad j^\mu\equiv\rho_{e0}u^\mu. \tag{7.1} $$
Here $\rho_{e0}$ is the charge density measured in the rest frame of the charge. The field equation of the EM field can be derived as
$$ \partial_\nu F^{\mu\nu} = j^\mu. \tag{7.2} $$
The charge conservation equation then follows directly from the field equation and the antisymmetry of $F^{\mu\nu}$,
$$ \partial_\mu j^\mu = \partial_\mu\partial_\nu F^{\mu\nu} = 0. \tag{7.3} $$

For a charged particle moving in the EM field with trajectory $x^\mu(\tau)$, we have
$$ j^\mu = e_a\delta(r-r_a(t))\sqrt{1-v_a^2}\,u_a^\mu. \tag{7.4} $$

It follows that
$$ \int dV\,dt\,j^\mu A_\mu = \int dx^\mu\,e_aA_\mu(x^\mu(\tau)). \tag{7.5} $$

The action for a charged particle coupled to the EM field is therefore
$$ S = -m\int d\tau + e\int dx^\mu A_\mu(x^\mu(\tau)). \tag{7.6} $$

The equation of motion for the particle can be derived as
$$ ma_\mu = eF_{\mu\nu}u^\nu. \tag{7.7} $$

Note: The Hamiltonian formulation of electrodynamics will be discussed in detail in the Hamiltonian formulation of general relativity and the canonical quantization formulation of quantum electrodynamics.

The electric and magnetic fields are defined as
$$ E^i\equiv F^{0i} = -\dot A^i-\partial_iA^0, \quad B^i\equiv\frac12\epsilon_{ijk}F^{jk} = \epsilon_{ijk}\partial_jA^k. \tag{7.8} $$

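We also define $\rho_e\equiv j^0$ and $J^i\equiv j^i$. The field equation can be rewritten as the so-called Maxwell's equations:
$$ \nabla\times B = \frac{\partial E}{\partial t}+J, \quad \nabla\times E = -\frac{\partial B}{\partial t}, \quad \nabla\cdot E = \rho_e, \quad \nabla\cdot B = 0. \tag{7.9} $$
The equations of motion for the charged particle can be rewritten as the so-called Lorentz force equations:
$$ \frac{dp}{dt} = e(E+v\times B), \quad \frac{d\mathcal{E}}{dt} = eE\cdot v, \tag{7.10} $$
where $\mathcal{E}$ denotes the energy of the particle.
We also notice that $A_\mu$ cannot be completely determined by Maxwell's equations and the Lorentz force equations. If we make the transformation $A_\mu\to A_\mu+\partial_\mu\xi(x)$, $F^{\mu\nu}$ is invariant and $\mathcal L$ changes only by a total derivative (because the current is conserved), so Maxwell's equations and the Lorentz force equations remain valid. This freedom in choosing ξ is called gauge invariance. This topic will be discussed in detail in QED.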

7.1.2 Lorentz transformation of fields


The EM field $A^\mu$ transforms as a vector under Lorentz transformations. In the special case where the new reference frame moves along the $\hat{1}$ direction with velocity $\beta$, we have

\[ A'^0 = \gamma A^0 - \gamma\beta A^1, \qquad A'^1 = -\gamma\beta A^0 + \gamma A^1. \tag{7.11} \]

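The fields $\mathbf{E}$ and $\mathbf{B}$ transform as
\[ E_1' = E_1, \quad E_2' = \gamma E_2 - \gamma\beta B_3, \quad E_3' = \gamma E_3 + \gamma\beta B_2, \tag{7.12a} \]
\[ B_1' = B_1, \quad B_2' = \gamma B_2 + \gamma\beta E_3, \quad B_3' = \gamma B_3 - \gamma\beta E_2. \tag{7.12b} \]

Equation 7.12 can be generalized to the case where the direction of $\boldsymbol{\beta}$ is arbitrary. We have
\[ \mathbf{E}' = \gamma(\mathbf{E}_\perp + \boldsymbol{\beta} \times \mathbf{B}) + \mathbf{E}_\parallel, \qquad \mathbf{B}' = \gamma(\mathbf{B}_\perp - \boldsymbol{\beta} \times \mathbf{E}) + \mathbf{B}_\parallel, \tag{7.13} \]
where $\parallel$ and $\perp$ denote the components parallel and perpendicular to $\boldsymbol{\beta}$.

If $\beta \ll 1$, we have
\[ \mathbf{E}' = \mathbf{E} + \boldsymbol{\beta} \times \mathbf{B}, \qquad \mathbf{B}' = \mathbf{B} - \boldsymbol{\beta} \times \mathbf{E}. \tag{7.14} \]
We notice that $F_{\mu\nu} F^{\mu\nu}$ and $\epsilon_{\mu\nu\rho\sigma} F^{\mu\nu} F^{\rho\sigma}$ are invariant under Lorentz transformations, i.e.,
\[ E^2 - B^2 = \text{inv}, \qquad \mathbf{E} \cdot \mathbf{B} = \text{inv}. \tag{7.15} \]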
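The invariance stated in 7.15 can be checked numerically. The following is a minimal sketch, not part of the original notes: it applies the general boost of Eq. 7.13 (with c = 1) to an arbitrarily chosen pair of field vectors and compares the two invariants before and after. The helper names `boost_fields` and `invariants` and the sample numbers are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def boost_fields(E, B, beta):
    """Transform (E, B) to a frame moving with velocity beta (c = 1), Eq. (7.13)."""
    b2 = dot(beta, beta)
    if b2 == 0.0:
        return tuple(E), tuple(B)
    gamma = 1.0 / math.sqrt(1.0 - b2)

    def split(F):
        # Components of F parallel and perpendicular to beta.
        par = tuple(dot(F, beta) / b2 * x for x in beta)
        perp = tuple(f - p for f, p in zip(F, par))
        return par, perp

    E_par, E_perp = split(E)
    B_par, B_perp = split(B)
    bxB = cross(beta, B)
    bxE = cross(beta, E)
    E_new = tuple(gamma * (ep + c) + el for ep, c, el in zip(E_perp, bxB, E_par))
    B_new = tuple(gamma * (bp - c) + bl for bp, c, bl in zip(B_perp, bxE, B_par))
    return E_new, B_new

def invariants(E, B):
    """Return (E^2 - B^2, E.B), the two invariants of Eq. (7.15)."""
    return dot(E, E) - dot(B, B), dot(E, B)

# Arbitrary sample fields and boost velocity (illustrative values only).
E, B = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
E_boosted, B_boosted = boost_fields(E, B, (0.3, -0.2, 0.4))
print(invariants(E, B))
print(invariants(E_boosted, B_boosted))
```

The two printed pairs should agree up to rounding, while the individual components of the fields change.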

7.1.3 Energy-momentum tensor


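The energy-momentum tensor of the free EM field is
\[ T_f^{\mu\nu} \equiv -\frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\rho)} \partial^\nu A_\rho + \eta^{\mu\nu} \mathcal{L} = \partial^\nu A_\rho F^{\mu\rho} - \frac{1}{4} \eta^{\mu\nu} F_{\rho\sigma} F^{\rho\sigma}. \tag{7.16} \]

The energy-momentum tensor defined above is not symmetric. So we define a modified energy-momentum tensor by adding the term $-\partial_\rho A^\nu F^{\mu\rho}$, i.e.,
\[ T_{f'}^{\mu\nu} = F^{\nu\rho} F^{\mu}{}_{\rho} - \frac{1}{4} \eta^{\mu\nu} F_{\rho\sigma} F^{\rho\sigma}. \tag{7.17} \]

For a free EM field, $\partial_\rho F^{\mu\rho} = 0$, so the added term is a total divergence, $\partial_\rho A^\nu F^{\mu\rho} = \partial_\rho (A^\nu F^{\mu\rho})$. As a result,
\[ \partial_\mu T_{f'}^{\mu\nu} = 0, \qquad P_{f'}^\mu = P_f^\mu. \tag{7.18} \]

From now on, we will use $T_{f'}$ as the energy-momentum tensor of the EM field and omit the prime for simplicity. The four-momentum of the free EM field is
\[ P_f^0 = \int dV\, \frac{E^2 + B^2}{2} \equiv \int dV\, w, \qquad \mathbf{P}_f = \int dV\, \mathbf{E} \times \mathbf{B} \equiv \int dV\, \mathbf{S}. \tag{7.19} \]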

If charged particles, i.e., the sources of the EM field, are also present in the system, we must include the energy-momentum tensor of the particles to get the right conservation equation. The energy-momentum tensor of the particles is defined as
\[ T_p^{\mu\nu} \equiv \sum_a m_a\, \delta(\mathbf{r} - \mathbf{r}_a) \sqrt{1 - v_a^2}\; u_a^\mu u_a^\nu. \tag{7.20} \]

From this definition, we can get the four-momentum of all particles,
\[ P_p^0 = \sum_a \frac{m_a}{\sqrt{1 - v_a^2}}, \qquad \mathbf{P}_p = \sum_a \frac{m_a}{\sqrt{1 - v_a^2}}\, \mathbf{v}_a. \tag{7.21} \]

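Similarly to the electric current, the mass current of the particles is
\[ j_m^\mu = \sum_a m_a\, \delta(\mathbf{r} - \mathbf{r}_a) \sqrt{1 - v_a^2}\; u_a^\mu. \tag{7.22} \]

If there is no creation or annihilation of particles, the contribution $j_{ma}^\mu$ of each particle $a$ satisfies the mass conservation equation
\[ \partial_\mu j_{ma}^\mu = 0. \tag{7.23} \]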

The Lorentz force equation 7.7 can also be written as
\[ \rho_{m0a} \frac{du_{a\mu}}{d\tau} = F_{\mu\nu}\, j_{ea}^\nu, \tag{7.24} \]
where $\rho_{m0a} \equiv m_a\, \delta(\mathbf{r} - \mathbf{r}_a) \sqrt{1 - v_a^2}$ is the mass density measured in the rest frame of the particle. On the one hand, we have
\[ \partial_\mu T_p^{\mu\nu} = \sum_a j_{ma}^\mu \frac{\partial u_a^\nu}{\partial x^\mu} = \sum_a \rho_{m0a} \frac{du_a^\nu}{d\tau} = F^{\nu\mu} j_{e\mu}. \tag{7.25} \]

On the other hand, we can derive that
\[ \partial_\mu T_f^{\mu\nu} = -F^{\nu\mu} j_{e\mu} \tag{7.26} \]
by using Maxwell's equations. Define
\[ T^{\mu\nu} \equiv T_f^{\mu\nu} + T_p^{\mu\nu}. \tag{7.27} \]

We have the conservation equation
\[ \partial_\mu T^{\mu\nu} = 0. \tag{7.28} \]

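If we define the Maxwell stress tensor as
\[ f^{ij} \equiv -T_f^{ij} = E^i E^j + B^i B^j - w\, \delta^{ij}, \tag{7.29} \]
the conservation law of energy and momentum can be written as
\[ \frac{d}{dt}\left( P_p^0 + \int w\, dV \right) = -\oint \mathbf{S} \cdot d\boldsymbol{\sigma}, \qquad \frac{d}{dt}\left( \mathbf{P}_p + \int \mathbf{S}\, dV \right) = \oint \mathbf{f} \cdot d\boldsymbol{\sigma}, \tag{7.30} \]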

assuming that there is no particle crossing the boundary.

The symmetry of T µν implies

∂µ (xν T µρ − xρ T µν ) = 0. (7.31)

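Thus, we can obtain the conservation law of angular momentum as
\[ \frac{d}{dt}\left( \mathbf{L}_p + \int \mathbf{r} \times \mathbf{S}\, dV \right) = \oint \mathbf{r} \times \mathbf{f} \cdot d\boldsymbol{\sigma} \quad\text{where}\quad \mathbf{L}_p \equiv \sum_a \frac{m_a}{\sqrt{1 - v_a^2}}\, \mathbf{r}_a \times \mathbf{v}_a. \tag{7.32} \]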

7.1.4 Charged particles in a given EM field


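Suppose that the EM field is given, i.e., we neglect the EM field generated by the test charged particle. The action for the test particle is
\[ S = \int_{t_1}^{t_2} \left( -m\sqrt{1 - v^2} + e\mathbf{A} \cdot \mathbf{v} - e\phi \right) dt. \tag{7.33} \]

The Lagrangian of the particle is
\[ L = -m\sqrt{1 - v^2} + e\mathbf{A} \cdot \mathbf{v} - e\phi. \tag{7.34} \]

The canonical momentum is
\[ \boldsymbol{\pi} = \frac{\partial L}{\partial \mathbf{v}} = \gamma m \mathbf{v} + e\mathbf{A}. \tag{7.35} \]
The Hamiltonian of the particle is
\[ H = \boldsymbol{\pi} \cdot \mathbf{v} - L = \gamma m + e\phi = \sqrt{m^2 + (\boldsymbol{\pi} - e\mathbf{A})^2} + e\phi. \tag{7.36} \]

In the limit $v \ll 1$, dropping the constant rest energy $m$, we have
\[ L = \frac{m v^2}{2} + e\mathbf{A} \cdot \mathbf{v} - e\phi, \qquad \boldsymbol{\pi} = m\mathbf{v} + e\mathbf{A}, \qquad H = \frac{(\boldsymbol{\pi} - e\mathbf{A})^2}{2m} + e\phi. \tag{7.37} \]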

If the EM field is static, we have ∇ × E = 0. We could choose the gauge in which

Ȧ = 0, E = −∇ϕ. (7.38)

In this case, ∂L/∂t = 0, and so γm + eϕ is conserved during motion.



Motion in a uniform and constant electric field

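Suppose the electric field points along $\hat{x}$ and the orbit lies in the $x$-$y$ plane. The equations of motion are
\[ \dot{p}_x = eE, \qquad \dot{p}_y = 0. \tag{7.39} \]

The solution is
\[ x = \frac{1}{eE} \sqrt{\mathcal{E}_0^2 + (eEt)^2}, \qquad y = \frac{p_0}{eE} \operatorname{arcsinh} \frac{eEt}{\mathcal{E}_0}, \tag{7.40} \]
assuming $p_x = 0$, $p_y = p_0$ at $t = 0$ and $\mathcal{E}_0 \equiv \sqrt{p_0^2 + m^2}$. The trajectory of the particle is
\[ x = \frac{\mathcal{E}_0}{eE} \cosh \frac{eEy}{p_0}. \tag{7.41} \]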
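As a small consistency check (a sketch, not part of the original notes), the parametric solution 7.40 can be compared against the catenary-shaped trajectory 7.41 at a few times. The function names and the sample parameter values below are illustrative assumptions; units are those of the text (c = 1).

```python
import math

def position(t, m, e, E, p0):
    """Parametric solution (x(t), y(t)) of Eq. (7.40)."""
    energy0 = math.sqrt(p0 ** 2 + m ** 2)
    x = math.sqrt(energy0 ** 2 + (e * E * t) ** 2) / (e * E)
    y = p0 / (e * E) * math.asinh(e * E * t / energy0)
    return x, y

def trajectory_x(y, m, e, E, p0):
    """Trajectory x(y) of Eq. (7.41)."""
    energy0 = math.sqrt(p0 ** 2 + m ** 2)
    return energy0 / (e * E) * math.cosh(e * E * y / p0)

# Illustrative values: m = 1, e = 1, E = 0.5, p0 = 2.
for t in (0.0, 0.5, 2.0, 10.0):
    x, y = position(t, 1.0, 1.0, 0.5, 2.0)
    print(t, x, trajectory_x(y, 1.0, 1.0, 0.5, 2.0))
```

The second and third printed columns should coincide, since $\cosh(\operatorname{arcsinh} s) = \sqrt{1 + s^2}$.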

Motion in a uniform and constant magnetic field

Suppose the direction of magnetic field is ẑ. Notice that particle’s kinetic energy E = γm is
constant if there is no electric field. We can derive the equation of motion

v̇x = ωvy , v̇y = −ωvx , v̇z = 0, (7.42)

where ω = eB/γm. The solution is

x = x0 + r sin(ωt + α), y = y0 + r cos(ωt + α), z = z0 + v0z t. (7.43)

where x0 , y0 , z0 , r, α, v0z should be determined by initial condition.

Motion in a uniform and constant EM field

We only focus on the case where the velocity of particle is much smaller than light speed.
Suppose the direction of magnetic field is ẑ and the direction of electric field is within y − z
plane. The equation of motion is

mẍ = eB ẏ, mÿ = eEy − eB ẋ, mz̈ = eEz . (7.44)

The solution is
\[ \dot{x} = a \cos \omega t + \frac{E_y}{B}, \qquad \dot{y} = -a \sin \omega t, \qquad \dot{z} = v_{0z} + \frac{eE_z}{m} t, \tag{7.45} \]
where $\omega = eB/m$, and $a$ and $v_{0z}$ are determined by the initial conditions. Since we assume $v \ll 1$, we must have
\[ a \ll 1, \qquad v_{0z} \ll 1, \qquad \frac{eE_z t}{m} \ll 1, \qquad \frac{E_y}{B} \ll 1. \tag{7.46} \]

7.2 Constant electromagnetic field

7.2.1 Coulomb’s law


For constant electric field, Maxwell’s equations take the form

∇ · E = ρe , ∇ × E = 0. (7.47)

Therefore, we have
E = −∇ϕ, ∇2 ϕ = −ρe . (7.48)

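The solution is
\[ \phi(\mathbf{r}) = \int \frac{\rho_e(\mathbf{r}')}{4\pi |\mathbf{r} - \mathbf{r}'|}\, dV'. \tag{7.49} \]

If $\rho_e(\mathbf{r}') = Q\delta(\mathbf{r}')$, we have
\[ \phi(\mathbf{r}) = \frac{Q}{4\pi |\mathbf{r}|}, \qquad \mathbf{E}(\mathbf{r}) = \frac{Q\mathbf{r}}{4\pi |\mathbf{r}|^3}. \tag{7.50} \]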

For a system of static charged particles, the total energy is
\[ U = \frac{1}{2} \int E^2\, dV = \frac{1}{2} \int \rho\phi\, dV = \frac{1}{2} \sum_a e_a \phi_a + \frac{1}{2} \sum_a e_a \Phi_a. \tag{7.51} \]

Here, $\phi_a$ is the electric potential at the position of $e_a$ produced by $e_a$ itself, while $\Phi_a$ is the potential produced there by the other charges. The self-energy $U_{\text{self}} = \sum_a e_a \phi_a / 2$ is infinite, indicating that classical electrodynamics is no longer valid at small distances. This problem is resolved in quantum electrodynamics: the measured mass of a charged particle is already renormalized to include the electromagnetic self-energy. Subtracting the self-energy, we have
\[ U = \frac{1}{2} \int E^2\, dV - U_{\text{self}} = \frac{1}{2} \sum_{a \neq b} \frac{e_a e_b}{4\pi R_{ab}} \quad\text{where}\quad R_{ab} = |\mathbf{r}_a - \mathbf{r}_b|. \tag{7.52} \]

If the charged particle moves with a constant velocity $\mathbf{v}$, we can derive the field it produces by a Lorentz transformation. The final result is
\[ \mathbf{E} = \frac{e\mathbf{r}}{4\pi r^3} \frac{1 - v^2}{(1 - v^2 \sin^2\theta)^{3/2}}, \qquad \mathbf{B} = \mathbf{v} \times \mathbf{E}, \tag{7.53} \]
where $\mathbf{r}$ is the vector pointing from the particle to the point where the field is measured, and $\theta$ is the angle between $\mathbf{r}$ and $\mathbf{v}$. If $v \sim 1$, the electric field is concentrated in the directions perpendicular to $\mathbf{v}$. If $v \ll 1$, we have
\[ \mathbf{E} = \frac{e\mathbf{r}}{4\pi r^3}, \qquad \mathbf{B} = \frac{e\mathbf{v} \times \mathbf{r}}{4\pi r^3}. \tag{7.54} \]

7.2.2 Multipole moments


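For a system of charged particles, the potential it produces at $\mathbf{r}$ is
\[ \phi = \sum_a \frac{e_a}{4\pi |\mathbf{r} - \mathbf{r}_a|}. \tag{7.55} \]

If $r > r_a$, we can expand the expression around $\mathbf{r}_a = 0$. Generally, we have
\[ \frac{1}{|\mathbf{r} - \mathbf{r}_a|} = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \frac{r_a^l}{r^{l+1}} \frac{4\pi}{2l+1}\, Y_{lm}^*(\theta, \varphi)\, Y_{lm}(\theta_a, \varphi_a). \tag{7.56} \]

The potential can be decomposed into $\phi = \sum_l \phi^{(l)}$, where
\[ \phi^{(l)} \equiv \frac{1}{4\pi r^{l+1}} \sum_{m=-l}^{l} \sqrt{\frac{4\pi}{2l+1}}\, Q_m^{(l)}\, Y_{lm}^*(\theta, \varphi), \qquad Q_m^{(l)} \equiv \sum_a e_a r_a^l \sqrt{\frac{4\pi}{2l+1}}\, Y_{lm}(\theta_a, \varphi_a). \tag{7.57} \]

We list the leading terms in the decomposition:
\[ \phi^{(0)} = \frac{Q}{4\pi r}, \qquad \mathbf{E}^{(0)} = \frac{Q\hat{\mathbf{n}}}{4\pi r^2}; \tag{7.58} \]
\[ \phi^{(1)} = \frac{\mathbf{d} \cdot \hat{\mathbf{n}}}{4\pi r^2}, \qquad \mathbf{E}^{(1)} = \frac{3(\mathbf{d} \cdot \hat{\mathbf{n}})\hat{\mathbf{n}} - \mathbf{d}}{4\pi r^3}; \tag{7.59} \]
\[ \phi^{(2)} = \frac{\hat{\mathbf{n}} \cdot \mathsf{D} \cdot \hat{\mathbf{n}}}{8\pi r^3}, \qquad \mathbf{E}^{(2)} = \frac{5(\hat{\mathbf{n}} \cdot \mathsf{D} \cdot \hat{\mathbf{n}})\hat{\mathbf{n}} - (\hat{\mathbf{n}} \cdot \mathsf{D} + \mathsf{D} \cdot \hat{\mathbf{n}})}{8\pi r^4}; \tag{7.60} \]
where
\[ \hat{\mathbf{n}} \equiv \frac{\mathbf{r}}{r}, \qquad Q \equiv \sum_a e_a, \qquad \mathbf{d} \equiv \sum_a e_a \mathbf{r}_a, \qquad \mathsf{D} \equiv \sum_a e_a (3\mathbf{r}_a \mathbf{r}_a - r_a^2\, \mathsf{I}). \tag{7.61} \]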
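To see how quickly the expansion converges, the sketch below (not part of the original notes) compares the exact potential 7.55 with the sum of the monopole, dipole and quadrupole terms 7.58 to 7.60 for a small cluster of charges observed from far away. The charge configuration, the observation point and the function names are illustrative assumptions.

```python
import math

def exact_potential(charges, r):
    """Sum of Coulomb potentials e_a / (4 pi |r - r_a|), Eq. (7.55)."""
    return sum(e / (4 * math.pi * math.dist(r, ra)) for e, ra in charges)

def multipole_terms(charges, r):
    """Monopole, dipole and quadrupole potentials of Eqs. (7.58)-(7.60)."""
    R = math.sqrt(sum(x * x for x in r))
    n = [x / R for x in r]
    Q = sum(e for e, _ in charges)
    d = [sum(e * ra[i] for e, ra in charges) for i in range(3)]
    # Quadrupole tensor D_ij = sum_a e_a (3 r_ai r_aj - r_a^2 delta_ij), Eq. (7.61).
    D = [[sum(e * (3 * ra[i] * ra[j] - (i == j) * sum(x * x for x in ra))
              for e, ra in charges)
          for j in range(3)] for i in range(3)]
    d_n = sum(d[i] * n[i] for i in range(3))
    nDn = sum(n[i] * D[i][j] * n[j] for i in range(3) for j in range(3))
    return (Q / (4 * math.pi * R),
            d_n / (4 * math.pi * R ** 2),
            nDn / (8 * math.pi * R ** 3))

# Three charges within about 0.1 of the origin, observed at distance 13.
charges = [(1.0, (0.1, 0.0, 0.0)), (-1.0, (-0.1, 0.0, 0.0)), (0.5, (0.0, 0.05, 0.1))]
point = (3.0, 4.0, 12.0)
print(exact_potential(charges, point))
print(sum(multipole_terms(charges, point)))
```

The residual difference is of the order of the neglected octupole term, i.e. suppressed by a further factor of $r_a / r$.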

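Now turn to a system of charged particles in an external electric potential $\phi(\mathbf{r})$. If all the particles are near $\mathbf{r} = 0$, we can make the expansion
\[ \phi(\mathbf{r}) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} r^l \sqrt{\frac{4\pi}{2l+1}}\, a_{lm}\, Y_{lm}(\theta, \varphi). \tag{7.62} \]

Thus, the potential energy of the system can be decomposed as
\[ U = \sum_{l=0}^{\infty} U^{(l)}, \qquad U^{(l)} = \sum_{m=-l}^{l} a_{lm}\, Q_m^{(l)}. \tag{7.63} \]

The force exerted on the system can be obtained by taking the derivatives of the potential energy. We list the leading terms in the decomposition:
\[ U^{(0)} = Q\phi_0, \qquad \mathbf{F}^{(0)} = Q\mathbf{E}_0, \tag{7.64} \]
\[ U^{(1)} = -\mathbf{d} \cdot \mathbf{E}_0, \qquad \mathbf{F}^{(1)} = (\nabla \mathbf{E}_0) \cdot \mathbf{d}, \qquad \mathbf{M}^{(1)} = \mathbf{d} \times \mathbf{E}_0. \tag{7.65} \]
Here $\phi_0$ and $\mathbf{E}_0$ are the external potential and field at $\mathbf{r} = 0$, and $\mathbf{M}^{(1)}$ is the torque on the dipole.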

7.2.3 Biot-Savart law


Let us consider the magnetic field produced by charges which perform a finite motion, in which the particles always stay within a finite region of space and their momenta always remain finite. Consider the time-averaged magnetic field $\langle \mathbf{B} \rangle$ produced by the charges; this field is a function only of the coordinates and not of the time. We take the time average of Maxwell's equations
\[ \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{B} = \frac{\partial \mathbf{E}}{\partial t} + \mathbf{j}. \tag{7.66} \]
Notice that the average value of the derivative ∂E/∂t , like the derivative of any quantity which
varies over a finite range, is zero. Thus, we have

∇ · ⟨B⟩ = 0, ∇ × ⟨B⟩ = ⟨j⟩ . (7.67)

Recall that B = ∇ × A. Imposing the gauge condition ∇ · A = 0, we can get

∇2 ⟨A⟩ = − ⟨j⟩ . (7.68)

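The solution is
\[ \langle \mathbf{A} \rangle = \frac{1}{4\pi} \int \frac{\langle \mathbf{j} \rangle}{|\mathbf{r} - \mathbf{r}'|}\, dV' = \frac{1}{4\pi} \sum_a \left\langle \frac{e_a \mathbf{v}_a}{|\mathbf{r} - \mathbf{r}_a|} \right\rangle. \tag{7.69} \]
The magnetic field is
\[ \langle \mathbf{B} \rangle = \frac{1}{4\pi} \int \frac{\langle \mathbf{j} \rangle \times (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\, dV'. \tag{7.70} \]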

7.2.4 Magnetic moment


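For a system of charged particles, the time-averaged vector potential it produces at $\mathbf{r}$ is
\[ \langle \mathbf{A} \rangle = \frac{1}{4\pi} \sum_a \left\langle \frac{e_a \mathbf{v}_a}{|\mathbf{r} - \mathbf{r}_a|} \right\rangle. \tag{7.71} \]

If $r \gg r_a$, we can expand the expression around $\mathbf{r}_a = 0$ to first order,
\[ 4\pi \langle \mathbf{A} \rangle = \frac{1}{r} \sum_a e_a \langle \mathbf{v}_a \rangle - \sum_a \left\langle e_a \mathbf{v}_a \left( \mathbf{r}_a \cdot \nabla \frac{1}{r} \right) \right\rangle. \tag{7.72} \]

Firstly,
\[ \sum_a e_a \langle \mathbf{v}_a \rangle = \left\langle \frac{d}{dt} \sum_a e_a \mathbf{r}_a \right\rangle = 0. \tag{7.73} \]
Secondly,
\[ -\sum_a \left\langle e_a \mathbf{v}_a \left( \mathbf{r}_a \cdot \nabla \frac{1}{r} \right) \right\rangle = \frac{1}{r^3} \sum_a \langle e_a \mathbf{v}_a (\mathbf{r}_a \cdot \mathbf{r}) \rangle. \tag{7.74} \]

Notice that
\[ \sum_a e_a \mathbf{v}_a (\mathbf{r}_a \cdot \mathbf{r}) = \frac{1}{2} \frac{d}{dt} \sum_a e_a \mathbf{r}_a (\mathbf{r}_a \cdot \mathbf{r}) + \frac{1}{2} \left( \sum_a e_a \mathbf{r}_a \times \mathbf{v}_a \right) \times \mathbf{r}. \tag{7.75} \]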

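Define the magnetic moment as
\[ \mathbf{m} \equiv \frac{1}{2} \sum_a e_a \mathbf{r}_a \times \mathbf{v}_a. \tag{7.76} \]

Since the time average of a total time derivative of a bounded quantity vanishes, we get
\[ \langle \mathbf{A} \rangle = \frac{\langle \mathbf{m} \rangle \times \mathbf{r}}{4\pi r^3}, \qquad \langle \mathbf{B} \rangle = \frac{3\hat{\mathbf{n}} (\langle \mathbf{m} \rangle \cdot \hat{\mathbf{n}}) - \langle \mathbf{m} \rangle}{4\pi r^3}. \tag{7.77} \]
If all the particles have the same charge-to-mass ratio $e/m$, and the velocities of all the particles are much smaller than that of light, we have
\[ \mathbf{m} = \frac{e}{2m} \sum_a m\, \mathbf{r}_a \times \mathbf{v}_a = \frac{e}{2m} \mathbf{M}, \tag{7.78} \]
where $\mathbf{M}$ is the angular momentum of the system.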

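Now focus on a system of charges in an external constant uniform magnetic field. The time average of the force acting on the system is
\[ \langle \mathbf{F} \rangle = \sum_a e_a \langle \mathbf{v}_a \times \mathbf{B} \rangle = \left\langle \frac{d}{dt} \sum_a e_a \mathbf{r}_a \times \mathbf{B} \right\rangle = 0. \tag{7.79} \]

The average value of the moment of the force is
\[ \langle \mathbf{K} \rangle = \sum_a e_a \langle \mathbf{r}_a \times (\mathbf{v}_a \times \mathbf{B}) \rangle. \tag{7.80} \]

We can derive that
\[ \langle \mathbf{K} \rangle = \langle \mathbf{m} \rangle \times \mathbf{B}. \tag{7.81} \]
Let us consider the change in the average angular momentum $\langle \mathbf{M} \rangle$ of the system. According to a well-known equation of mechanics, the derivative of $\mathbf{M}$ is equal to the moment $\mathbf{K}$ of the forces acting on the system. Thus, we have
\[ \frac{d \langle \mathbf{M} \rangle}{dt} = \langle \mathbf{m} \rangle \times \mathbf{B}. \tag{7.82} \]
If the charge-to-mass ratio is the same for all particles of the system, the angular momentum and magnetic moment are proportional to one another, and we find:
\[ \frac{d \langle \mathbf{M} \rangle}{dt} = -\boldsymbol{\Omega} \times \langle \mathbf{M} \rangle, \qquad \boldsymbol{\Omega} = \frac{e}{2m} \mathbf{B}. \tag{7.83} \]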
This equation states that the vector ⟨M ⟩ rotates with angular velocity −Ω around the direction
of the field, while its absolute magnitude and the angle which it makes with this direction
remain fixed. This motion is called the Larmor precession.
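The precession described by 7.83 can be illustrated by integrating the equation numerically. The following is a rough sketch, not part of the original notes, using a hand-written fourth-order Runge-Kutta step; the function names, step count and initial vector are illustrative assumptions.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rhs(M, Omega):
    """Right-hand side of d<M>/dt = -Omega x <M>, Eq. (7.83)."""
    c = cross(Omega, M)
    return (-c[0], -c[1], -c[2])

def rk4_step(M, Omega, dt):
    """One classical Runge-Kutta step for the precession equation."""
    def shifted(k, h):
        return tuple(m + h * ki for m, ki in zip(M, k))
    k1 = rhs(M, Omega)
    k2 = rhs(shifted(k1, dt / 2), Omega)
    k3 = rhs(shifted(k2, dt / 2), Omega)
    k4 = rhs(shifted(k3, dt), Omega)
    return tuple(m + dt / 6 * (a + 2 * b + 2 * c + d)
                 for m, a, b, c, d in zip(M, k1, k2, k3, k4))

def precess(M, Omega, t_end, steps=2000):
    """Integrate the precession from t = 0 to t_end."""
    dt = t_end / steps
    for _ in range(steps):
        M = rk4_step(M, Omega, dt)
    return M

# Field along z with Omega = 2; integrate over a quarter of the period 2 pi / Omega.
Omega = (0.0, 0.0, 2.0)
M0 = (1.0, 0.0, 0.5)
M_quarter = precess(M0, Omega, 0.25 * 2 * math.pi / 2.0)
print(M_quarter)
```

After a quarter period the vector has turned from the $x$ direction to the $-y$ direction while $M_z$ and $|\mathbf{M}|$ stay fixed, i.e. it rotates with angular velocity $-\boldsymbol{\Omega}$ about the field.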

7.3 Electromagnetic waves


7.3.1 Electromagnetic waves
Electromagnetic fields occurring in vacuum in the absence of charges are called electromagnetic waves. We choose the Coulomb gauge,
\[ \phi = 0, \qquad \nabla \cdot \mathbf{A} = 0. \tag{7.84} \]

Consequently, we have
\[ \mathbf{E} = -\frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla \times \mathbf{A}. \tag{7.85} \]
From Maxwell's equations we can derive that
\[ \nabla^2 \mathbf{A} - \frac{\partial^2 \mathbf{A}}{\partial t^2} = 0. \tag{7.86} \]
This is the equation which determines the potentials of electromagnetic waves. We can verify
that the electric and magnetic field E and B satisfy the same wave equation.

We consider the special case of electromagnetic waves in which the fields depend on only one coordinate, say $x$. Such waves are said to be plane. In this case the equation of the field becomes
\[ \frac{\partial^2 f}{\partial t^2} - \frac{\partial^2 f}{\partial x^2} = 0, \tag{7.87} \]
where $f$ stands for any component of the vectors $\mathbf{A}$, $\mathbf{E}$ and $\mathbf{B}$. The solution is
\[ f(t, x) = f_1(t - x) + f_2(t + x). \tag{7.88} \]

$f_1(t - x)$ represents a plane wave moving in the positive direction along the $x$ axis, and $f_2(t + x)$ represents a plane wave moving in the negative direction along the $x$ axis. For a wave moving in the positive direction, the Coulomb gauge implies that $A_x = 0$, and we obtain

E = −A′ , B = −n̂ × A′ = n̂ × E. (7.89)

where the prime denotes differentiation with respect to t − x and n̂ is a unit vector along the
direction of propagation of the wave. We see that the electric and magnetic fields E and B of
a plane wave are directed perpendicular to the direction of propagation of the wave. For this
reason, electromagnetic waves are said to be transverse. The energy density and flux of the
plane waves are
W = E 2 , S = W n̂. (7.90)

7.3.2 Monochromatic wave


A very important special case of electromagnetic waves is a wave in which the field is a simply periodic function of the time. Such a wave is said to be monochromatic. All quantities (potentials, field components) in a monochromatic wave depend on the time through a factor of the form $\cos(\omega t + \alpha)$. The quantity $\omega$ is called the cyclic frequency of the wave (we shall simply call it the frequency). For the monochromatic wave, the wave equation becomes
\[ \frac{\partial^2 f}{\partial x^2} + \omega^2 f = 0. \tag{7.91} \]
The vector potential of such a wave is most conveniently written as the real part of a complex expression
\[ \mathbf{A} = \operatorname{Re}\left\{ \mathbf{A}_0\, e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \right\}, \qquad \mathbf{k} = \omega \hat{\mathbf{n}}. \tag{7.92} \]

The time average of the product of two field quantities with complex amplitudes $X_0$ and $Y_0$ can be worked out as
\[ \langle XY \rangle = \frac{1}{2} \operatorname{Re}\{ X_0 Y_0^* \}. \tag{7.93} \]
The electric and magnetic field are

E = iωA, B = ik × A. (7.94)

And we can verify that (ω, k) transforms like a four-vector.


Generally, the electric field can be written as

Ey = A cos(ϕ), Ez = B cos(ϕ + δ), ϕ = kx − ωt, −π < δ ≤ π. (7.95)

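The tip of the vector $\mathbf{E}$ traces out an ellipse in the $y$-$z$ plane,
\[ \frac{E_y^2}{A^2} + \frac{E_z^2}{B^2} - \frac{2 E_y E_z \cos\delta}{AB} = \sin^2\delta. \tag{7.96} \]
The magnitudes of the semiaxes of the polarization ellipse are
\[ \frac{1}{2} \left( \sqrt{A^2 + B^2 + 2AB \sin\delta} \pm \sqrt{A^2 + B^2 - 2AB \sin\delta} \right). \tag{7.97} \]
The angle $\theta$ between the major axis and the $y$ axis satisfies the equation
\[ \tan 2\theta = \frac{2AB \cos\delta}{A^2 - B^2}. \tag{7.98} \]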
• If −π/2 < δ < π/2, major axis is in the first and third quadrant.
• If δ > π/2 or δ < −π/2, major axis is in the second and fourth quadrant.
• If δ = ±π/2 and A > B, the major axis is y axis. If δ = ±π/2 and A < B, the major
axis is z axis.
• If δ = ±π/2 and A = B, the ellipse becomes a circle.
• If 0 < δ < π, the rotation is positive in the direction of x axis (right handed, or levoro-
tatory).
• If −π < δ < 0, the rotation is negative in the direction of x axis (left handed, or dex-
trorotatory).
• If δ = 0, π, the ellipse becomes a line.
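The semiaxis formula 7.97 can be cross-checked by brute force. The sketch below (not part of the original notes) samples the field 7.95 over one period, takes the largest and smallest $|\mathbf{E}|$, and compares them with 7.97. It uses $|\sin\delta|$ so that the result is a non-negative length for either sign of $\delta$; that convention, the function names and the sample amplitudes are assumptions made for this illustration.

```python
import math

def semiaxes_formula(A, B, delta):
    """Major and minor semiaxes from Eq. (7.97), using |sin(delta)|."""
    s = 2 * A * B * abs(math.sin(delta))
    plus = math.sqrt(A * A + B * B + s)
    minus = math.sqrt(max(0.0, A * A + B * B - s))
    return 0.5 * (plus + minus), 0.5 * (plus - minus)

def semiaxes_sampled(A, B, delta, samples=100000):
    """Largest and smallest |E| over one period of Eq. (7.95)."""
    radii = []
    for k in range(samples):
        phase = 2 * math.pi * k / samples
        Ey = A * math.cos(phase)
        Ez = B * math.cos(phase + delta)
        radii.append(math.hypot(Ey, Ez))
    return max(radii), min(radii)

# Illustrative amplitudes A = 2, B = 1 and a few phase differences.
for delta in (math.pi / 3, -math.pi / 3, math.pi / 2):
    print(delta, semiaxes_formula(2.0, 1.0, delta), semiaxes_sampled(2.0, 1.0, delta))
```

For $\delta = \pm\pi/2$ the semiaxes reduce to $A$ and $B$ themselves, consistent with the list above.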
Any field is expandable in a Fourier integral containing a continuous or discrete distribution of different frequencies. Such an expansion has the form
\[ f(t) = \int_{-\infty}^{\infty} f_\omega\, e^{-i\omega t}\, \frac{d\omega}{2\pi}, \tag{7.99} \]
where the Fourier components are given in terms of the function $f(t)$ by the integrals
\[ f_\omega = \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt. \tag{7.100} \]

Because $f(t)$ must be real, we have
\[ f_{-\omega} = f_\omega^*. \tag{7.101} \]
The total intensity of the wave is
\[ \int_{-\infty}^{\infty} f^2\, dt = \int_{-\infty}^{\infty} |f_\omega|^2\, \frac{d\omega}{2\pi} = 2 \int_0^{\infty} |f_\omega|^2\, \frac{d\omega}{2\pi}. \tag{7.102} \]

In the special case where $f(t)$ is a periodic function with angular frequency $\omega_0$ and period $T = 2\pi/\omega_0$, $f(t)$ can be expanded as
\[ f(t) = \sum_{n=-\infty}^{\infty} f_n\, e^{-in\omega_0 t}, \tag{7.103} \]
where
\[ f_n \equiv \frac{1}{T} \int_0^T f(t)\, e^{in\omega_0 t}\, dt. \tag{7.104} \]

The average intensity of the wave is
\[ \frac{1}{T} \int_0^T f^2\, dt = \sum_{n=-\infty}^{\infty} |f_n|^2. \tag{7.105} \]

Generally, we have the relation
\[ f_\omega = \sum_{n=-\infty}^{\infty} 2\pi f_n\, \delta(\omega - n\omega_0). \tag{7.106} \]
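The relations 7.101 and 7.105 for a periodic signal can be verified on a simple example. The following sketch (not part of the original notes) evaluates the coefficients of 7.104 by a Riemann sum, which is exact up to rounding for a trigonometric polynomial sampled finely enough. The test signal, the sample count and the function name are illustrative assumptions.

```python
import cmath
import math

def fourier_coefficient(f, n, T, samples=64):
    """f_n = (1/T) * integral_0^T f(t) exp(i n w0 t) dt, Eq. (7.104), as a Riemann sum."""
    w0 = 2 * math.pi / T
    dt = T / samples
    return sum(f(k * dt) * cmath.exp(1j * n * w0 * k * dt) for k in range(samples)) / samples

T = 2.0
w0 = 2 * math.pi / T

def signal(t):
    # Real periodic test signal containing harmonics n = 0, +-1, +-2.
    return 1.0 + 2.0 * math.cos(w0 * t) + math.sin(2 * w0 * t)

coeffs = {n: fourier_coefficient(signal, n, T) for n in range(-4, 5)}
sum_of_squares = sum(abs(c) ** 2 for c in coeffs.values())
mean_square = sum(signal(k * T / 64) ** 2 for k in range(64)) / 64
print(sum_of_squares, mean_square)
```

Both numbers equal $1 + 1 + 1 + \tfrac14 + \tfrac14 = 3.5$ for this signal, and the coefficients satisfy $f_{-n} = f_n^*$.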

7.3.3 Partially polarized light


Every monochromatic wave is necessarily polarized. However we usually have to deal with
waves which are only approximately monochromatic, and which contain frequencies in a small
interval δω. We consider such a wave, and let ω be some average frequency for it. Then its field
at a fixed point in space can be written in the form

E0 (t)e−iωt , (7.107)

where the complex amplitude E0 is some slowly varying function of the time. Since E0 deter-
mines the polarization of the wave, this means that at each point of the wave, its polarization
changes with time, such a wave is said to be partially polarized.

The polarization properties of electromagnetic waves are observed experimentally by passing the light to be investigated through various bodies and then observing the intensity of the transmitted light. From the mathematical point of view this means that we draw conclusions concerning the polarization properties of the light from the values of certain quadratic functions of its field. Here of course we are considering the time averages of such functions.

Quadratic functions of the field are made up of terms proportional to the products $E_\alpha E_\beta$, $E_\alpha^* E_\beta^*$ or $E_\alpha^* E_\beta$. Products of the form $E_\alpha E_\beta$ and $E_\alpha^* E_\beta^*$ contain the rapidly oscillating factors $e^{\mp 2i\omega t}$ and give zero when the time average is taken. Thus, we see that the polarization properties of the light are completely characterized by the tensor
\[ J_{\alpha\beta} = \langle E_{0\alpha} E_{0\beta}^* \rangle. \tag{7.108} \]

The trace of the tensor
\[ J \equiv J_{\alpha\alpha} = \langle \mathbf{E}_0 \cdot \mathbf{E}_0^* \rangle \tag{7.109} \]
determines the intensity of the wave, as measured by the energy flux density. To eliminate this quantity, which is not directly related to the polarization properties, we introduce the tensor
\[ \rho_{\alpha\beta} = \frac{J_{\alpha\beta}}{J}, \tag{7.110} \]
called the polarization tensor.
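Generally, the polarization tensor can be expressed as
\[ \rho = \frac{1}{2} \begin{pmatrix} 1 + p_3 & p_1 - i p_2 \\ p_1 + i p_2 & 1 - p_3 \end{pmatrix}. \tag{7.111} \]

If we introduce the Pauli matrices,
\[ \sigma_1 \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 \equiv \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{7.112} \]
we have
\[ \rho = \frac{1}{2}(1 - P)\, I + \frac{1}{2} P\, (I + \hat{\mathbf{n}} \cdot \boldsymbol{\sigma}), \tag{7.113} \]
where
\[ P \equiv \sqrt{p_1^2 + p_2^2 + p_3^2}, \qquad \hat{\mathbf{n}} \equiv \left( \frac{p_1}{P}, \frac{p_2}{P}, \frac{p_3}{P} \right). \tag{7.114} \]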
For a monochromatic light with polarization state |E⟩ = (cos(θ/2)e−iϕ/2 , sin(θ/2)eiϕ/2 ), the
polarization tensor is |E⟩⟨E|. We can verify that

P = 1, n̂ = (sin θ cos ϕ, sin θ sin ϕ, cos θ). (7.115)

For arbitrary light, we have
\[ \rho = (1 - P)\rho_n + P\rho_p \quad\text{where}\quad \rho_n = \frac{1}{2} I, \quad \rho_p = \frac{1}{2}(I + \hat{\mathbf{n}} \cdot \boldsymbol{\sigma}). \tag{7.116} \]
Thus, we call $P$ the degree of polarization.
Suppose there is a polarizing filter which completely transmits light in the polarization state
\[ |D\rangle = \left( \cos\frac{\theta}{2}\, e^{-i\phi/2},\; \sin\frac{\theta}{2}\, e^{i\phi/2} \right). \tag{7.117} \]

If light with polarization tensor $\rho$ passes through the device, the relative intensity becomes
\[ \langle D | \rho | D \rangle = \frac{1}{2} + \frac{1}{2}\, \mathbf{p} \cdot \hat{\mathbf{m}}, \tag{7.118} \]
where $\mathbf{p} \equiv (p_1, p_2, p_3)$, $\hat{\mathbf{m}} \equiv (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta)$.
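The statements 7.115, 7.116 and 7.118 can be checked directly with 2 by 2 complex matrices. The sketch below (not part of the original notes) builds a partially polarized tensor from a pure state, reads off $\mathbf{p}$ using the parametrization 7.111, and evaluates the transmitted fraction through a filter. All function names and the numerical angles are illustrative assumptions.

```python
import cmath
import math

def jones(theta, phi):
    """Normalized polarization state (cos(theta/2) e^{-i phi/2}, sin(theta/2) e^{i phi/2})."""
    return (math.cos(theta / 2) * cmath.exp(-0.5j * phi),
            math.sin(theta / 2) * cmath.exp(0.5j * phi))

def pure_tensor(E):
    """rho_ab = E_a E_b^* for a normalized state, i.e. |E><E|."""
    return [[E[a] * E[b].conjugate() for b in range(2)] for a in range(2)]

def mix(P, rho_p):
    """rho = (1 - P) I/2 + P rho_p, Eq. (7.116)."""
    return [[(1 - P) * 0.5 * (a == b) + P * rho_p[a][b] for b in range(2)]
            for a in range(2)]

def p_vector(rho):
    """Read (p1, p2, p3) off the parametrization of Eq. (7.111)."""
    return (2 * rho[1][0].real, 2 * rho[1][0].imag, (rho[0][0] - rho[1][1]).real)

def transmitted(rho, D):
    """Relative intensity <D|rho|D> behind a filter passing state D."""
    return sum(D[a].conjugate() * rho[a][b] * D[b]
               for a in range(2) for b in range(2)).real

theta, phi = 1.1, 0.7
rho = mix(0.6, pure_tensor(jones(theta, phi)))
p = p_vector(rho)
print(math.sqrt(sum(x * x for x in p)))   # degree of polarization P
print(transmitted(rho, jones(0.4, 2.0)))
```

The first printed number recovers the mixing weight 0.6 used to build the tensor, which is the sense in which $P$ measures the degree of polarization.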

In optics, the Stokes parameters are defined as

I ≡ ⟨Ex2 ⟩ + ⟨Ey2 ⟩
= ⟨Ea2 ⟩ + ⟨Eb2 ⟩
= ⟨E+2 ⟩ + ⟨E−2 ⟩,
Q ≡ ⟨Ex2 ⟩ − ⟨Ey2 ⟩,
U ≡ ⟨Ea2 ⟩ − ⟨Eb2 ⟩,
V ≡ ⟨E+2 ⟩ − ⟨E−2 ⟩,

where the subscripts refer to three different bases of the space of Jones vectors: the standard
Cartesian basis x̂, ŷ, a Cartesian basis rotated by 45° â, b̂, and a circular basis +̂, −̂. The
symbols ⟨·⟩ represent expectation values. It is easy to verify that

Q = Ip3 , U = Ip1 , V = Ip2 . (7.119)

7.4 The field of moving charges


7.4.1 Retarded potential
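This time we impose the Lorenz gauge $\partial_\mu A^\mu = 0$. Maxwell's equations become
\[ \partial^2 A^\mu = -j^\mu. \tag{7.120} \]

We can rewrite it in three-dimensional form,
\[ \nabla^2 \mathbf{A} - \frac{\partial^2 \mathbf{A}}{\partial t^2} = -\mathbf{J}, \qquad \nabla^2 \phi - \frac{\partial^2 \phi}{\partial t^2} = -\rho. \tag{7.121} \]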
To find the particular solution, we divide the whole space into infinitely small regions and
determine the field produced by the charges located in one of these volume elements. Because
of the linearity of the field equations, the actual field will be the sum of the fields produced
by all such elements. The charge e in a given volume element is a function of the time. If we
choose the origin of coordinates in the volume element under consideration, then the charge
density is e(t)δ(R), where R is the distance from the origin. Thus, we must solve the equation

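\[ \nabla^2 \phi - \frac{\partial^2 \phi}{\partial t^2} = -e(t)\, \delta(\mathbf{R}). \tag{7.122} \]
The particular solution is
\[ \phi = \frac{e(t - R)}{4\pi R}. \tag{7.123} \]
For an arbitrary distribution of charges $\rho(\mathbf{r}', t)$, we have
\[ \phi(\mathbf{r}, t) = \int \frac{\rho(\mathbf{r}', t - |\mathbf{r} - \mathbf{r}'|)}{4\pi |\mathbf{r} - \mathbf{r}'|}\, dV'. \tag{7.124} \]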

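Similarly we have for the vector potential
\[ \mathbf{A}(\mathbf{r}, t) = \int \frac{\mathbf{J}(\mathbf{r}', t - |\mathbf{r} - \mathbf{r}'|)}{4\pi |\mathbf{r} - \mathbf{r}'|}\, dV'. \tag{7.125} \]
The particular solution above is called the retarded potential solution.
Now we consider a charged particle moving along the trajectory $\mathbf{r} = \mathbf{r}_0(t)$. We have
\[ \rho(\mathbf{r}, t) = e\, \delta[\mathbf{r} - \mathbf{r}_0(t)] \tag{7.126} \]
and
\[ \phi(\mathbf{r}, t) = \int \frac{e\, \delta[\mathbf{r}' - \mathbf{r}_0(t - |\mathbf{r} - \mathbf{r}'|)]}{4\pi |\mathbf{r} - \mathbf{r}'|}\, dV' = \frac{e}{4\pi R^* (1 - \hat{\mathbf{n}}^* \cdot \mathbf{v}^*)}, \tag{7.127} \]
where
\[ \hat{\mathbf{n}}^* = \frac{\mathbf{R}^*}{R^*}, \qquad \mathbf{R}^* = \mathbf{r} - \mathbf{r}_0(t^*), \qquad \mathbf{v}^* = \mathbf{v}_0(t^*), \qquad t^* = t - R^*. \tag{7.128} \]
Similarly, we have
\[ \mathbf{A}(\mathbf{r}, t) = \frac{e \mathbf{v}^*}{4\pi R^* (1 - \hat{\mathbf{n}}^* \cdot \mathbf{v}^*)}. \tag{7.129} \]
These are called the Liénard-Wiechert potentials. Notice that
\[ \frac{\partial t^*}{\partial t} = \frac{1}{1 - \hat{\mathbf{n}}^* \cdot \mathbf{v}^*}, \qquad \nabla t^* = -\frac{\hat{\mathbf{n}}^*}{1 - \hat{\mathbf{n}}^* \cdot \mathbf{v}^*}. \tag{7.130} \]
We can work out the corresponding electric and magnetic fields,
\[ \mathbf{E} = \frac{e}{4\pi (1 - \hat{\mathbf{n}}^* \cdot \mathbf{v}^*)^3} \left[ \frac{(1 - v^{*2})(\hat{\mathbf{n}}^* - \mathbf{v}^*)}{R^{*2}} + \frac{\hat{\mathbf{n}}^* \times [(\hat{\mathbf{n}}^* - \mathbf{v}^*) \times \mathbf{a}^*]}{R^*} \right]; \tag{7.131a} \]
\[ \mathbf{B} = \hat{\mathbf{n}}^* \times \mathbf{E}. \tag{7.131b} \]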
The electric field consists of two parts of different type. The first term depends only on the
velocity of the particle (and not on its acceleration) and varies at large distances like 1/R2 .
The second term depends on the acceleration, and for large R it varies like 1/R. This latter
term is related to the electromagnetic waves radiated by the particle.
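In practice the retarded time $t^*$ in 7.128 is defined only implicitly and has to be found numerically. The following sketch (not part of the original notes) solves $t^* = t - |\mathbf{r} - \mathbf{r}_0(t^*)|$ by bisection, which works because the left-over function is monotonic whenever the particle moves slower than light, and then evaluates the potentials 7.127 and 7.129. The function names, the bound `R_max` on the size of the orbit and the sample circular trajectory are illustrative assumptions.

```python
import math

def retarded_time(r, t, r0, R_max, iterations=200):
    """Solve t* = t - |r - r0(t*)| by bisection; R_max must bound |r0| at all times."""
    def f(ts):
        return t - ts - math.dist(r, r0(ts))
    lo = t - (math.sqrt(sum(x * x for x in r)) + R_max)  # here f(lo) >= 0
    hi = t                                                # here f(hi) <= 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lw_potentials(r, t, e, r0, v0, R_max):
    """Return (phi, A, t*) from Eqs. (7.127)-(7.129), with c = 1."""
    ts = retarded_time(r, t, r0, R_max)
    src = r0(ts)
    R_vec = [a - b for a, b in zip(r, src)]
    R = math.sqrt(sum(x * x for x in R_vec))
    n = [x / R for x in R_vec]
    v = v0(ts)
    denom = 4 * math.pi * R * (1 - sum(a * b for a, b in zip(n, v)))
    return e / denom, tuple(e * vi / denom for vi in v), ts

# Sample trajectory: unit circle in the x-y plane with speed 0.5.
w = 0.5
orbit = lambda t: (math.cos(w * t), math.sin(w * t), 0.0)
velocity = lambda t: (-w * math.sin(w * t), w * math.cos(w * t), 0.0)
phi, A, t_star = lw_potentials((3.0, 4.0, 1.0), 10.0, 1.0, orbit, velocity, 1.0)
print(t_star, phi, A)
```

For a charge at rest the routine reduces to the Coulomb potential $e / 4\pi R$ with vanishing vector potential.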

7.4.2 Spectral resolution of the retarded potentials


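Suppose
\[ \rho(\mathbf{r}, t) = \int_{-\infty}^{\infty} \rho_\omega(\mathbf{r})\, e^{-i\omega t}\, \frac{d\omega}{2\pi}, \qquad \phi(\mathbf{r}, t) = \int_{-\infty}^{\infty} \phi_\omega(\mathbf{r})\, e^{-i\omega t}\, \frac{d\omega}{2\pi}. \tag{7.132} \]
We can derive that
\[ \phi_\omega(\mathbf{r}) = \int dV' \int_{-\infty}^{\infty} dt\, \frac{\rho(\mathbf{r}', t)}{4\pi R}\, e^{i\omega(R + t)}, \tag{7.133} \]
where $R = |\mathbf{r} - \mathbf{r}'|$. If there is just one point charge $\rho = e\, \delta[\mathbf{r} - \mathbf{r}_0(t)]$, we can obtain
\[ \phi_\omega(\mathbf{r}) = \int_{-\infty}^{\infty} dt\, \frac{e}{4\pi R(t)}\, e^{i\omega[R(t) + t]}, \qquad R(t) = |\mathbf{r} - \mathbf{r}_0(t)|. \tag{7.134} \]
Similarly, for the vector potential, we have
\[ \mathbf{A}_\omega(\mathbf{r}) = \int_{-\infty}^{\infty} dt\, \frac{e \mathbf{v}_0(t)}{4\pi R(t)}\, e^{i\omega[R(t) + t]}, \qquad R(t) = |\mathbf{r} - \mathbf{r}_0(t)|. \tag{7.135} \]
The electric and magnetic fields are given by
\[ \mathbf{E}_\omega = -\nabla \phi_\omega + i\omega \mathbf{A}_\omega, \qquad \mathbf{B}_\omega = \nabla \times \mathbf{A}_\omega. \tag{7.136} \]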

7.5 Radiation
7.5.1 Far field approximation
We consider the field produced by a system of moving charges at distances large compared
with the dimensions of the system. We choose the origin of coordinates O anywhere in the
interior of the system of charges. The radius vector from O to the point P , where we determine
the field, we denote by r, and the unit vector in this direction by n̂. Let the radius vector of
the charge element be r ′ , and the radius vector from charge to the point P be R. At large
distances from the system of charges, r ≫ r′ , and we have approximately,

R ≈ r − n̂ · r ′ . (7.137)

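We substitute this into the retarded potentials. In the denominator of the integrands we can neglect $\hat{\mathbf{n}} \cdot \mathbf{r}'$ compared with $r$. In the retarded time $t - r + \hat{\mathbf{n}} \cdot \mathbf{r}'$, however, whether this term can be neglected is determined by how much the quantities $\rho$ and $\mathbf{J}$ change during the time $\hat{\mathbf{n}} \cdot \mathbf{r}'$. The potentials of the field at large distances from the system of charges are
\[ \phi(\mathbf{r}, t) = \frac{1}{4\pi r} \int \rho(\mathbf{r}', t - r + \hat{\mathbf{n}} \cdot \mathbf{r}')\, dV', \tag{7.138a} \]
\[ \mathbf{A}(\mathbf{r}, t) = \frac{1}{4\pi r} \int \mathbf{J}(\mathbf{r}', t - r + \hat{\mathbf{n}} \cdot \mathbf{r}')\, dV'. \tag{7.138b} \]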
At sufficiently large distances from the system of charges, the field over small regions of space
can be considered to be a plane wave. For this it is necessary that the distance be large com-
pared not only with the dimensions of the system, but also with the wavelength of the electro-
magnetic waves radiated by the system. We refer to this region of space as the wave zone of
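the radiation. In the wave zone, we have
\[ \mathbf{B} = \frac{\partial \mathbf{A}}{\partial t} \times \hat{\mathbf{n}}, \qquad \mathbf{E} = \left( \frac{\partial \mathbf{A}}{\partial t} \times \hat{\mathbf{n}} \right) \times \hat{\mathbf{n}}. \tag{7.139} \]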
The energy flux is given by the Poynting vector which, for a plane wave, is

S = B 2 n̂. (7.140)

The power of radiation into the element of solid angle do is

dP = B 2 r2 do . (7.141)

Since the field is inversely proportional to $r$, we see that the amount of energy radiated by the system in unit time into the element of solid angle $do$ is the same at all distances. For the radiation produced by a single arbitrarily moving point charge, it turns out to be convenient to use the Liénard-Wiechert potentials. At large distances, we have
\[ \mathbf{A}(\mathbf{r}, t) = \frac{e \mathbf{v}(t')}{4\pi r [1 - \hat{\mathbf{n}} \cdot \mathbf{v}(t')]}, \tag{7.142} \]
where
\[ t' - \hat{\mathbf{n}} \cdot \mathbf{r}_0(t') = t - r. \tag{7.143} \]

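Now we turn to the spectral resolution of the field of the waves radiated by the system. For the vector potential, we can derive that
\[ \mathbf{A}_\omega(\mathbf{r}) = \frac{e^{ikr}}{4\pi r} \int \mathbf{J}_\omega(\mathbf{r}')\, e^{-i\mathbf{k} \cdot \mathbf{r}'}\, dV', \tag{7.144} \]
where $\mathbf{k} \equiv \omega \hat{\mathbf{n}}$. In the wave zone, we have
\[ \mathbf{B}_\omega = i\mathbf{k} \times \mathbf{A}_\omega, \qquad \mathbf{E}_\omega = \frac{i}{\omega} (\mathbf{k} \times \mathbf{A}_\omega) \times \mathbf{k}. \tag{7.145} \]
Suppose $d\mathcal{E}_{\omega\hat{\mathbf{n}}}$ is the energy radiated into the element of solid angle $do$ in the form of waves with frequencies in the interval $d\omega$. Applying 7.102 to the energy flux $B^2 r^2\, do$, we have
\[ d\mathcal{E}_{\omega\hat{\mathbf{n}}} = 2 |\mathbf{B}_\omega|^2 r^2\, do\, \frac{d\omega}{2\pi}. \tag{7.146} \]

For the radiation produced by a single arbitrarily moving point charge, we can derive that
\[ \mathbf{A}_\omega(\mathbf{r}) = \frac{e}{4\pi r}\, e^{ikr} \int_{-\infty}^{\infty} e^{i\omega(t - \hat{\mathbf{n}} \cdot \mathbf{r}_0)}\, d\mathbf{r}_0, \qquad \mathbf{B}_\omega(\mathbf{r}) = \frac{ie\omega}{4\pi r}\, e^{ikr}\, \hat{\mathbf{n}} \times \int_{-\infty}^{\infty} e^{i\omega(t - \hat{\mathbf{n}} \cdot \mathbf{r}_0)}\, d\mathbf{r}_0. \tag{7.147} \]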

7.5.2 Low velocity approximation


If the charge density (current density) distribution changes little during the time $\mathbf{r}' \cdot \hat{\mathbf{n}}$, we can expand $f(\mathbf{r}', t - r + \mathbf{r}' \cdot \hat{\mathbf{n}})$ in powers of $\mathbf{r}' \cdot \hat{\mathbf{n}}$. In this case, if the typical angular frequency and scale of the motion are $\omega$ and $a$ respectively, we would have
\[ a \ll T \sim \frac{2\pi}{\omega} \sim \lambda \quad\text{or}\quad v \sim \frac{a}{T} \ll 1, \tag{7.148} \]
which means that either the scale of the system is much smaller than the wavelength or, equivalently, the velocity of the particles is much smaller than that of light.
As for the zeroth order approximation, we just drop the $\mathbf{r}' \cdot \hat{\mathbf{n}}$ in the retarded time,
\[ \mathbf{A}(\mathbf{r}, t) = \frac{1}{4\pi r} \int \mathbf{J}(\mathbf{r}', t - r)\, dV'. \tag{7.149} \]
For the radiation produced by arbitrarily moving point charges, 7.149 becomes
\[ \mathbf{A}(\mathbf{r}, t) = \frac{1}{4\pi r} \sum_a e_a \mathbf{v}_a(t - r) = \frac{1}{4\pi r}\, \dot{\mathbf{d}}. \tag{7.150} \]

As a result, we have
\[ \mathbf{B} = \frac{1}{4\pi r}\, \ddot{\mathbf{d}} \times \hat{\mathbf{n}}, \qquad \mathbf{E} = \frac{1}{4\pi r}\, (\ddot{\mathbf{d}} \times \hat{\mathbf{n}}) \times \hat{\mathbf{n}}. \tag{7.151} \]
Radiation of this kind is called dipole radiation. We notice that a closed system of particles, for all of which the ratio of charge to mass is the same, cannot radiate by dipole radiation. The power of the dipole radiation is
\[ dP = \frac{\ddot{d}^2}{16\pi^2} \sin^2\theta\; do, \tag{7.152} \]
where $\theta$ is the angle between $\ddot{\mathbf{d}}$ and $\hat{\mathbf{n}}$. Integrating over all directions, we have
\[ P = \frac{\ddot{d}^2}{6\pi}. \tag{7.153} \]
If we have just one charge moving in the external field, we have
\[ P = \frac{e^2 w^2}{6\pi}, \tag{7.154} \]
where w is the acceleration of the charge.
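The step from the angular distribution 7.152 to the total power 7.153 is just the integral of $\sin^2\theta$ over the sphere, which equals $8\pi/3$. The sketch below (not part of the original notes) performs this integration numerically with the midpoint rule; the function name and the sample value of $\ddot{d}$ are illustrative assumptions.

```python
import math

def total_dipole_power(d_ddot, n_theta=2000):
    """Integrate dP = d_ddot^2 sin^2(theta) / (16 pi^2) do over all directions."""
    h = math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * h
        # do = 2 pi sin(theta) d(theta) after the trivial azimuthal integration.
        solid_angle = 2 * math.pi * math.sin(theta) * h
        total += d_ddot ** 2 * math.sin(theta) ** 2 / (16 * math.pi ** 2) * solid_angle
    return total

print(total_dipole_power(3.0))      # numerical angular integral
print(3.0 ** 2 / (6 * math.pi))     # closed form of Eq. (7.153)
```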
For the spectral resolution of the intensity of dipole radiation, we have
\[ d\mathcal{E}_\omega = \frac{\omega^4}{3\pi} |\mathbf{d}_\omega|^2\, \frac{d\omega}{2\pi}. \tag{7.155} \]
More details on dipole radiation during collisions and Coulomb interaction can be found in
section 68, 69 and 70 of The classical theory of fields (L.D.Landau & E.M.Lifshitz).
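If we keep the terms of first order in $\hat{\mathbf{n}} \cdot \mathbf{r}'$, the vector potential is
\[ \mathbf{A} = \frac{\dot{\mathbf{d}}}{4\pi r} + \frac{\ddot{\mathsf{D}} \cdot \hat{\mathbf{n}}}{24\pi r} + \frac{\dot{\mathbf{m}} \times \hat{\mathbf{n}}}{4\pi r}. \tag{7.156} \]
We can further get
\[ \mathbf{B} = \frac{1}{4\pi r} \left[ \ddot{\mathbf{d}} \times \hat{\mathbf{n}} + \frac{1}{6} (\dddot{\mathsf{D}} \cdot \hat{\mathbf{n}}) \times \hat{\mathbf{n}} + (\ddot{\mathbf{m}} \times \hat{\mathbf{n}}) \times \hat{\mathbf{n}} \right]; \tag{7.157a} \]
\[ \mathbf{E} = \frac{1}{4\pi r} \left[ (\ddot{\mathbf{d}} \times \hat{\mathbf{n}}) \times \hat{\mathbf{n}} + \frac{1}{6} [(\dddot{\mathsf{D}} \cdot \hat{\mathbf{n}}) \times \hat{\mathbf{n}}] \times \hat{\mathbf{n}} + \hat{\mathbf{n}} \times \ddot{\mathbf{m}} \right]; \tag{7.157b} \]
\[ P = \frac{1}{6\pi} \ddot{d}^2 + \frac{1}{720\pi} \dddot{D}^2 + \frac{1}{6\pi} \ddot{m}^2. \tag{7.157c} \]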
The total radiation consists of three independent parts: dipole, quadrupole, and magnetic
dipole radiation. The details of the derivation can be found in section 71 of The classical the-
ory of fields (L.D.Landau & E.M.Lifshitz).

7.5.3 Radiation from a rapidly moving charge


Firstly, consider the reference frame in which the particle is at rest at a given moment; in this frame we can apply the low velocity approximation. Here, the energy and momentum radiated during $dt$ are
\[ d\mathcal{E} = \frac{e^2 w^2}{6\pi}\, dt, \qquad d\mathbf{P} = 0. \tag{7.158} \]
Therefore, in an arbitrary reference frame, we have
\[ dP^\mu = \frac{e^2}{6\pi} \frac{du^\nu}{d\tau} \frac{du_\nu}{d\tau}\, u^\mu\, d\tau. \tag{7.159} \]
Using the Lorentz force equation, the total four-momentum radiated during the passage of the particle through a given electromagnetic field is equal to
\[ \Delta P^\mu = \frac{e^4}{6\pi m^2} \int F_{\nu\rho} u^\rho F^{\nu\sigma} u_\sigma\, dx^\mu. \tag{7.160} \]
In particular, we have
\[ \Delta\mathcal{E} = \frac{e^2}{6\pi} \int \frac{w^2 - (\mathbf{v} \times \mathbf{w})^2}{(1 - v^2)^3}\, dt = \frac{e^4}{6\pi m^2} \int \frac{(\mathbf{E} + \mathbf{v} \times \mathbf{B})^2 - (\mathbf{E} \cdot \mathbf{v})^2}{1 - v^2}\, dt. \tag{7.161} \]
It is clear that for velocities close to that of light, the total energy radiated per unit time is proportional to the square of the energy of the moving particle. The only exception is motion in an electric field, along the direction of the field. In this case the factor $(1 - v^2)$ standing in the denominator is cancelled by an identical factor in the numerator, and the radiation does not depend on the energy of the particle.
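Now we discuss the angular distribution of the radiation from a rapidly moving charge. The radiation field is
\[ \mathbf{E} = \frac{e}{4\pi R} \frac{\hat{\mathbf{n}} \times [(\hat{\mathbf{n}} - \mathbf{v}) \times \mathbf{w}]}{(1 - \hat{\mathbf{n}} \cdot \mathbf{v})^3}, \qquad \mathbf{B} = \hat{\mathbf{n}} \times \mathbf{E}, \tag{7.162} \]
where all the quantities on the right sides of the equations refer to the retarded time $t' = t - R$. The power radiated into the solid angle $do$ is
\[ dP = \frac{e^2}{16\pi^2} \left[ \frac{2(\hat{\mathbf{n}} \cdot \mathbf{w})(\mathbf{v} \cdot \mathbf{w})}{(1 - \hat{\mathbf{n}} \cdot \mathbf{v})^5} + \frac{w^2}{(1 - \hat{\mathbf{n}} \cdot \mathbf{v})^4} - \frac{(1 - v^2)(\hat{\mathbf{n}} \cdot \mathbf{w})^2}{(1 - \hat{\mathbf{n}} \cdot \mathbf{v})^6} \right] do. \tag{7.163} \]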
If we want to determine the angular distribution of the total radiation throughout the whole
motion of the particle, we must integrate the intensity over the time. In doing this, it is im-
portant to remember that the integrand is a function of t′ ; therefore we must write
dt = (1 − n̂ · v) dt′ (7.164)
after which the integration over t′ is immediately done.
In the ultrarelativistic case, the intensity is large within the narrow range of angles in which
1 − n̂ · v is small. Thus an ultrarelativistic particle radiates mainly along the direction of its
own motion, within the small range of angles around the direction of its velocity. We also point
out that, for arbitrary velocity and acceleration of the particle, there are always two directions
for which the radiated intensity is zero. These are the directions for which the vector n̂ − v is
parallel to the vector w.
If the velocity and acceleration of the particle are parallel, we have
\[ \mathbf{B} = \frac{e}{4\pi R} \frac{\mathbf{w} \times \hat{\mathbf{n}}}{(1 - \hat{\mathbf{n}} \cdot \mathbf{v})^3}, \qquad dP = \frac{e^2}{16\pi^2} \frac{w^2 \sin^2\theta}{(1 - v\cos\theta)^6}\, do. \tag{7.165} \]
The distribution is, naturally, symmetric around the common direction of $\mathbf{v}$ and $\mathbf{w}$, and vanishes along ($\theta = 0$) and opposite to ($\theta = \pi$) the direction of the velocity. In the ultrarelativistic case, the intensity as a function of $\theta$ has a sharp double maximum near the direction of $\mathbf{v}$, with a steep drop to zero at $\theta = 0$.
If the velocity and acceleration are perpendicular to one another, we have
\[ dP = \frac{e^2 w^2}{16\pi^2} \left[ \frac{1}{(1 - v\cos\theta)^4} - \frac{(1 - v^2) \sin^2\theta \cos^2\phi}{(1 - v\cos\theta)^6} \right] do, \tag{7.166} \]
where $\theta$ is again the angle between $\mathbf{v}$ and $\hat{\mathbf{n}}$, and $\phi$ is the azimuthal angle of the vector $\hat{\mathbf{n}}$ relative to the plane passing through $\mathbf{v}$ and $\mathbf{w}$.
The discussion of synchrotron radiation (magnetic bremsstrahlung) can be found in section 74 of The classical theory of fields (L.D.Landau & E.M.Lifshitz).

7.6 The interaction between charged particles and EM field


7.6.1 Radiation reaction
If a charged particle accelerates, it radiates away energy. This means that if an external force is
applied to a charge, not all of the energy transferred to the charge by the force is converted to the
kinetic energy of the charge; some of the energy is radiated away in the form of electromagnetic
waves. From Newton’s law F = ma, the net force on the charge must be less than the applied
external force. In effect, the fields surrounding the charge exert a recoil or reaction force on
the charge.
The fields of a moving point charge are given by 7.131. It is only those terms that go as 1/R
that contribute to radiation energy. The other term falls off as 1/R2 so contributes nothing to
the integral of the Poynting vector over a large sphere. This term is called the velocity field and,
although it doesn’t contribute to the energy radiated away by the EM field, it does store energy.
So some of the energy imparted by the force that gets the charge moving must be siphoned off
to create these velocity fields. These velocity fields are curious beasts, however, for they contain
energy that is never actually lost to the charge. If a charge is accelerated to some velocity, the
velocity fields are constructed around the moving charge, but if the charge is then decelerated
to rest again, the velocity fields disappear without having radiated away any energy. It would
seem to be reabsorbed by the charge as it slows down.
Consider a charged particle that starts in some state, undergoes an acceleration followed by a
deceleration, and finally returns to the state it started from. The velocity fields are then the
same at the end as they were at the start, so over this period the only energy truly lost from
the particle is the energy that is radiated away.
In the non-relativistic case, we have
$$\int_{t_1}^{t_2} \mathbf{F}_{\rm rad}\cdot\mathbf{v}\, dt = -\int_{t_1}^{t_2} P\, dt = -\frac{e^2}{6\pi}\int_{t_1}^{t_2} a^2\, dt . \tag{7.167}$$

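Since
$$\int_{t_1}^{t_2} a^2\, dt = \int_{t_1}^{t_2} \dot{\mathbf{v}}\cdot\dot{\mathbf{v}}\, dt = \mathbf{v}\cdot\dot{\mathbf{v}}\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2} \mathbf{v}\cdot\ddot{\mathbf{v}}\, dt = -\int_{t_1}^{t_2} \mathbf{v}\cdot\ddot{\mathbf{v}}\, dt , \tag{7.168}$$
where the boundary term vanishes because the particle returns to its initial state, we obtain
$$\mathbf{F}_{\rm rad} = \frac{e^2}{6\pi}\dot{\mathbf{a}} . \tag{7.169}$$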

This is known as the Abraham-Lorentz formula for radiation reaction. It is applicable only
when the frequency and the intensity of the EM field are not too large, i.e.,
$$\lambda \gg \frac{e^2}{m}, \qquad B \ll \frac{m^2}{e^3} . \tag{7.170}$$
The details can be found in section 75 of The classical theory of fields (L.D.Landau & E.M.Lifshitz).
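The energy balance behind (7.167)-(7.169) can be checked numerically. The following is a minimal sketch, assuming a hypothetical one-dimensional periodic motion $x(t) = A\cos\omega t$ and units with $e = 1$; the function name and parameters are illustrative and not part of the notes. Over one full period the particle returns to its initial state, so the work done by $F_{\rm rad} = (e^2/6\pi)\dot a$ should equal minus the radiated energy.

```python
import math

E2_OVER_6PI = 1.0 / (6.0 * math.pi)  # e^2 / (6 pi), assuming e = 1


def energy_balance(amplitude=1.0, omega=2.0, steps=20000):
    """Return (work done by F_rad, minus the radiated energy) over one period.

    Assumed motion: x(t) = amplitude * cos(omega * t).
    """
    period = 2.0 * math.pi / omega
    dt = period / steps
    work = 0.0
    radiated = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt  # midpoint rule
        v = -amplitude * omega * math.sin(omega * t)
        a = -amplitude * omega**2 * math.cos(omega * t)
        a_dot = amplitude * omega**3 * math.sin(omega * t)
        work += E2_OVER_6PI * a_dot * v * dt   # F_rad . v, with F_rad from (7.169)
        radiated += E2_OVER_6PI * a * a * dt   # radiated power P = e^2 a^2 / (6 pi)
    return work, -radiated


work, minus_radiated = energy_balance()
```

For the default parameters both numbers equal $-e^2A^2\omega^4 T/(12\pi) = -4/3$, with $T = 2\pi/\omega$, up to rounding. The instantaneous powers $F_{\rm rad}\cdot v$ and $-P$ differ at generic times; only their integrals over the closed cycle agree.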
We now derive the relativistic expression for the radiation damping for a single charge, which
is applicable also to motion with velocity comparable to that of light. This force is now a
four-vector $g^\mu$, which must be included in the equation of motion of the charge, written in
four-dimensional form:
$$m\frac{du^\mu}{d\tau} = eF^{\mu\nu}u_\nu + g^\mu . \tag{7.171}$$

To determine $g^\mu$ we notice that for $v \ll 1$, its three space components must go over into
the components of the vector $e^2\dot{\mathbf{a}}/6\pi$. It is easy to see that the vector
$(e^2/6\pi)\, d^2u^\mu/d\tau^2$ has this property. However, it does not satisfy the identity
$g^\mu u_\mu = 0$, which is valid for any force four-vector. In order to satisfy this condition,
we must add to the expression given a certain auxiliary four-vector, made up from the
four-velocity $u^\mu$ and its derivatives. The three space components of this vector must become
zero in the limiting case $v = 0$. As a result we find
$$g^\mu = \frac{e^2}{6\pi}\left(\frac{d^2u^\mu}{d\tau^2} + u^\mu u^\nu\frac{d^2u_\nu}{d\tau^2}\right) . \tag{7.172}$$
This is called the Abraham–Lorentz–Dirac force.
The integral of the four-force $g^\mu$ over the world line of the motion of a charge, passing through
a given field, must coincide (except for opposite sign) with the total four-momentum $\Delta P^\mu$ of
the radiation from the charge. The first term in the equation above goes to zero on performing the
integration, since at infinity the particle has no acceleration. We integrate the second term by
parts and get
$$-\int g^\mu\, d\tau = \frac{e^2}{6\pi}\int \frac{du^\nu}{d\tau}\frac{du_\nu}{d\tau}\, u^\mu\, d\tau = \Delta P^\mu . \tag{7.173}$$

7.6.2 Scattering by free charges


If an electromagnetic wave falls on a system of charges, then under its action the charges are
set in motion. This motion in turn produces radiation in all directions; there occurs, we say, a
scattering of the original wave. The scattering is most conveniently characterized by the ratio
of the amount of energy emitted by the scattering system in a given direction per unit time, to
the energy flux density of the incident radiation. This ratio clearly has dimensions of area, and
is called the effective scattering cross-section. Let $d\langle P\rangle$ be the time-averaged energy
radiated by the system into the solid angle $do$ per second for an incident wave with time-averaged
Poynting vector $\langle S\rangle$. Then the effective cross-section for scattering (into the solid
angle $do$) is
$$d\sigma = \frac{d\langle P\rangle}{\langle S\rangle} . \tag{7.174}$$
The integral σ of dσ over all directions is the total scattering cross-section.
Let us consider the scattering produced by a free charge at rest. Suppose there is incident on
this charge an approximately plane monochromatic wave (partially polarized light). We shall
assume that the velocity acquired by the charge under the influence of the incident wave is
small compared with that of light. Then we can neglect the force exerted by magnetic field.
We also assume the wavelength of the EM field is much larger than the displacement of the
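charge during its vibrations. Therefore, the dipole moment $\mathbf{d} = e\mathbf{r}$ satisfies
$$\ddot{\mathbf{d}} = e\ddot{\mathbf{r}} = \frac{e^2}{m}\mathbf{E}_0(t)e^{-i\omega t} . \tag{7.175}$$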
Now we assume the incident direction of the EM wave is $\hat{x}$, and the scattered direction of the
EM wave is $\hat{n}' = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$. The dipole radiation is
$$d\langle P\rangle = \frac{1}{16\pi^2}\left\langle[\mathrm{Re}(\ddot{\mathbf{d}}\times\hat{n}')]^2\right\rangle do = \frac{e^4}{32\pi^2m^2}|\mathbf{E}_0\times\hat{n}'|^2\, do , \tag{7.176}$$
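where, for $\mathbf{E}_0 = (0, E_{0y}, E_{0z})$ transverse to $\hat{x}$,
$$|\mathbf{E}_0\times\hat{n}'|^2 = |E_{0y}|^2 + |E_{0z}|^2\sin^2\theta - |E_{0y}|^2\sin^2\theta\sin^2\phi - (E_{0y}E_{0z}^* + E_{0y}^*E_{0z})\cos\theta\sin\phi\sin\theta . \tag{7.177}$$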
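Notice that
$$\langle S\rangle = \langle \mathrm{Re}(\mathbf{E})\cdot\mathrm{Re}(\mathbf{E})\rangle = \frac{1}{2}\langle \mathbf{E}_0\cdot\mathbf{E}_0^*\rangle . \tag{7.178}$$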
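The effective cross-section for scattering can be obtained as
$$d\sigma = \frac{e^4}{16\pi^2m^2}\left[\rho_{11} + \rho_{22}\sin^2\theta - \rho_{11}\sin^2\theta\sin^2\phi - (\rho_{12} + \rho_{21})\cos\theta\sin\phi\sin\theta\right] do , \tag{7.179}$$
where $\rho_{ab} = \langle E_{0a}E_{0b}^*\rangle/\langle\mathbf{E}_0\cdot\mathbf{E}_0^*\rangle$ is the normalized
polarization tensor of the incident wave, with the indices 1, 2 referring to the $y$ and $z$ components.
The total cross section is
$$\sigma = \frac{8\pi}{3}\left(\frac{e^2}{4\pi m}\right)^2 . \tag{7.180}$$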
If the incident light is completely linearly polarized in the $\hat{z}$ direction, then we have
$$d\sigma = \frac{e^4}{16\pi^2m^2}\sin^2\theta\, do . \tag{7.181}$$
If the incident light is unpolarized, we have
$$d\sigma = \frac{e^4}{32\pi^2m^2}\left(1 + \cos^2\Theta\right) do , \tag{7.182}$$
where $\cos\Theta = \cos\phi\sin\theta$; i.e., $\Theta$ is the angle between the direction of the incident
light and that of the scattered light.
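As a consistency check, integrating either differential cross-section (7.181) or (7.182) over all directions should reproduce the total cross section (7.180). The sketch below does this with a simple midpoint rule; the helper names and the choice $e = m = 1$ are assumptions made for illustration only.

```python
import math


def integrate_over_sphere(f, n_theta=400, n_phi=400):
    """Midpoint-rule integral of f(theta, phi) over the unit sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += f(theta, phi) * math.sin(theta)  # do = sin(theta) dtheta dphi
    return total * d_theta * d_phi


def sigma_linear(e=1.0, m=1.0):
    """Integral of (7.181): incident light linearly polarized along z."""
    prefactor = e**4 / (16.0 * math.pi**2 * m**2)
    return prefactor * integrate_over_sphere(lambda th, ph: math.sin(th) ** 2)


def sigma_unpolarized(e=1.0, m=1.0):
    """Integral of (7.182), with cos(Theta) = cos(phi) * sin(theta)."""
    prefactor = e**4 / (32.0 * math.pi**2 * m**2)
    return prefactor * integrate_over_sphere(
        lambda th, ph: 1.0 + (math.cos(ph) * math.sin(th)) ** 2
    )


def sigma_total(e=1.0, m=1.0):
    """Closed form (7.180)."""
    return (8.0 * math.pi / 3.0) * (e**2 / (4.0 * math.pi * m)) ** 2
```

Both integrals agree with `sigma_total()`, i.e. with $e^4/(6\pi m^2)$, to within the accuracy of the quadrature, as expected since the total cross section cannot depend on the polarization of the incident wave.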

Scattering by bound charges


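The equation of motion of the bound charge is
$$\ddot{\xi} = \frac{e}{m}E_0e^{-i\omega t} - \omega_0^2\xi + \frac{e^2}{6\pi m}\dddot{\xi} . \tag{7.183}$$
Suppose $\xi = \xi_0e^{-i\omega t}$. We can get
$$\xi = \frac{eE_0}{m(\omega_0^2 - \omega^2 - i\omega\gamma)}e^{-i\omega t} \quad\text{where}\quad \gamma \equiv \frac{e^2\omega^2}{6\pi m} . \tag{7.184}$$
We can show that
$$\sigma = \sigma_0\frac{\omega^4}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2} , \tag{7.185}$$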
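where $\sigma_0$ is the total cross section when the EM wave is scattered by free charges. When
$\omega \gg \omega_0$, $\sigma \approx \sigma_0$ is independent of $\omega$. When $\omega \ll \omega_0$,
$\sigma \approx \sigma_0(\omega/\omega_0)^4$ is proportional to $\omega^4$; this is called Rayleigh
scattering.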
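The two limits of (7.185) are easy to verify numerically. The following sketch evaluates $\sigma/\sigma_0$; the parameter `tau`, standing for $e^2/(6\pi m)$ so that $\gamma = \tau\omega^2$, is a name introduced here for illustration, and the numerical values are arbitrary.

```python
def cross_section_ratio(omega, omega0, tau):
    """sigma / sigma_0 from (7.185), with gamma = tau * omega**2 and tau = e^2 / (6 pi m)."""
    gamma = tau * omega**2
    return omega**4 / ((omega0**2 - omega**2) ** 2 + (omega * gamma) ** 2)


# Rayleigh regime (omega << omega0): sigma / sigma_0 ~ (omega / omega0)^4
low = cross_section_ratio(0.01, 1.0, 1e-6)
low_doubled = cross_section_ratio(0.02, 1.0, 1e-6)

# Free-charge regime (omega >> omega0, with gamma still << omega): sigma / sigma_0 ~ 1
high = cross_section_ratio(100.0, 1.0, 1e-6)
```

Doubling the frequency in the low-frequency regime increases the cross section by very nearly a factor of 16, while in the high-frequency regime the ratio stays close to 1.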
Part III

General Relativity
Chapter 8

Elementary Differential Geometry

8.1 Fundamental concepts of differentiable manifolds

Definition 8.1 Manifold

Manifold: Formally, a topological manifold is a second countable Hausdorff space that
is locally homeomorphic to Euclidean space.
Differentiable manifold: In formal terms, a differentiable manifold is a topological
manifold with a globally defined differential structure.
Tangent space: We can define a tangent vector as an equivalence class of curves passing
through p while being tangent to each other at p.
Cotangent space: Two functions f and g have the same first order behaviour near p if
and only if the derivative of the function f − g vanishes at p. The cotangent space
will then consist of all the possible first-order behaviors of a function near p.

Definition 8.2 Submanifold

Submanifold: A submanifold of a manifold M is a subset S which itself has the structure
of a manifold.
Immersed submanifolds: An immersed submanifold of a manifold M is the image S of
an immersion map f : N → M; in general this image will not be a submanifold
as a subset, and an immersion map need not even be injective (one-to-one); it
can have self-intersections.
Injective immersed submanifolds: More narrowly, one can require that the map f :
N → M be an injection, and define an immersed submanifold to be the image
subset S together with a topology and differential structure such that S is a manifold
and the inclusion f is a diffeomorphism. This is just the topology on N,
which in general will not agree with the subset topology. In general the subset S
is not a submanifold of M, in the subset topology.

Definition 8.3 Embedded submanifold

An embedded submanifold (also called a regular submanifold) is an immersed submanifold
for which the inclusion map is a topological embedding. That is, the submanifold
topology on S is the same as the subspace topology.

Theorem 8.1

A submanifold $(f, N)$ of a smooth manifold $M$ is a regular submanifold if and only if
for any point $p \in N$, there exists a coordinate chart $(V; v^\alpha)$, $v^\alpha(q) = 0$ at the point
$q = f(p)$ in $M$, such that $f(N) \cap V$ is defined by
$$v^{n+1} = v^{n+2} = \cdots = v^m = 0 , \tag{8.1}$$
where $n$ and $m$ are the dimensions of $N$ and $M$.

8.2 Multilinear algebra

Definition 8.4 Tensor

Vector space: A vector space is a collection of objects called vectors, which may be
added together and multiplied (“scaled”) by numbers.
Dual space: Any vector space $V$ has a corresponding dual vector space $V^*$
consisting of all linear functionals on $V$ together with a naturally induced linear
structure. For a vector space of finite dimension, we have $V = (V^*)^*$.
Tensor product: The tensor product $V \otimes W$ of two vector spaces $V$ and
$W$ is the vector space generated by the symbols $v \otimes w$, with $v \in V$ and $w \in W$,
in which the relations of bilinearity are imposed for the product operation $\otimes$,
and no other relations are assumed to hold. It is equivalent to the vector space
consisting of all bilinear functionals on $V^*$ and $W^*$. It is also the dual vector space
of $V^* \otimes W^*$.
Tensor: Suppose $V$ is an $n$-dimensional vector space over $F$ with dual space $V^*$. The
elements in the tensor product
$$V^r_s = \underbrace{V \otimes \cdots \otimes V}_{r\ \text{terms}} \otimes \underbrace{V^* \otimes \cdots \otimes V^*}_{s\ \text{terms}} \tag{8.2}$$
are called $(r, s)$ type tensors. Suppose $\{e_i\}_{1\le i\le n}$ and $\{e^{*i}\}_{1\le i\le n}$ are dual bases in
$V$ and $V^*$, respectively. An $(r, s)$ type tensor $x$ can be uniquely expressed as
$$x = x^{i_1\cdots i_r}{}_{k_1\cdots k_s}\, e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{*k_1} \otimes \cdots \otimes e^{*k_s} . \tag{8.3}$$


Definition 8.5 Symmetric and antisymmetric tensor

Permutation: For any permutation operator $\sigma \in \mathcal{P}(r)$ and tensor $x$,
$$\sigma x(v^{*1}, \cdots, v^{*r}) = x(v^{*\sigma(1)}, \cdots, v^{*\sigma(r)}) . \tag{8.4}$$
Symmetric tensor: $\sigma x = x$ for any $\sigma \in \mathcal{P}(r)$.
Antisymmetric tensor: $\sigma x = \mathrm{sgn}\,\sigma \cdot x$ for any $\sigma \in \mathcal{P}(r)$.
Symmetrization operator:
$$S_r(x) = \frac{1}{r!}\sum_{\sigma\in\mathcal{P}(r)} \sigma x . \tag{8.5}$$
Antisymmetrization operator:
$$A_r(x) = \frac{1}{r!}\sum_{\sigma\in\mathcal{P}(r)} \mathrm{sgn}\,\sigma \cdot \sigma x . \tag{8.6}$$

Definition 8.6 Exterior vector space

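Exterior vector space:
$$\Lambda^r(V) \equiv A_r(T^r(V)), \quad \Lambda^0(V) \equiv F, \quad \Lambda^1(V) \equiv V . \tag{8.7}$$
Wedge product:
$$\xi \wedge \eta \equiv \frac{(k + l)!}{k!\,l!}A_{k+l}(\xi \otimes \eta) \quad\text{where}\quad \xi \in \Lambda^k(V),\ \eta \in \Lambda^l(V) . \tag{8.8}$$
Pull-back mapping: Suppose $f : V \to W$ is a linear mapping. We define $f^* : \Lambda^r(W^*) \to \Lambda^r(V^*)$ as
$$f^*\phi(v_1, \cdots, v_r) = \phi(f(v_1), \cdots, f(v_r)) \quad\text{where}\quad \phi \in \Lambda^r(W^*),\ v_i \in V . \tag{8.9}$$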

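Proposition 8.1 Properties of Wedge product

For $\xi \in \Lambda^k(V)$, $\eta \in \Lambda^l(V)$ and $\zeta \in \Lambda^h(V)$,
$$(\xi_1 + \xi_2) \wedge \eta = \xi_1 \wedge \eta + \xi_2 \wedge \eta ; \tag{8.10}$$
$$\xi \wedge (\eta_1 + \eta_2) = \xi \wedge \eta_1 + \xi \wedge \eta_2 ; \tag{8.11}$$
$$\xi \wedge \eta = (-1)^{kl}\, \eta \wedge \xi ; \tag{8.12}$$
$$(\xi \wedge \eta) \wedge \zeta = \xi \wedge (\eta \wedge \zeta) = \frac{(k + l + h)!}{k!\,l!\,h!}A_{k+l+h}(\xi \otimes \eta \otimes \zeta) ; \tag{8.13}$$
$$f^*(\phi \wedge \psi) = f^*\phi \wedge f^*\psi . \tag{8.14}$$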

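Proposition 8.2 Properties of exterior space

$$e_{i_1} \wedge \cdots \wedge e_{i_r}(v^{*1}, \cdots, v^{*r}) = \det\langle e_{i_\alpha}, v^{*\beta}\rangle ; \tag{8.15}$$
$$e_{i_1} \wedge \cdots \wedge e_{i_r}(e^{*j_1}, \cdots, e^{*j_r}) = \det\langle e_{i_\alpha}, e^{*j_\beta}\rangle = \delta^{j_1\cdots j_r}_{i_1\cdots i_r} ; \tag{8.16}$$
$$\Lambda^r(V) = \mathrm{Span}\{e_{i_1} \wedge \cdots \wedge e_{i_r},\ 1 \le i_1 < \cdots < i_r \le n\} ; \tag{8.17}$$
$$(\Lambda^r(V))^* = \Lambda^r(V^*) . \tag{8.18}$$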
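The definition (8.8) and the determinant formula (8.15) can be illustrated with a small computation. This is only a sketch under stated assumptions: alternating forms are represented as plain Python functions of vectors (so the code works with elements of $\Lambda^r(V^*)$ acting on $V$ rather than with $\Lambda^r(V)$ acting on $V^*$), and the names `sign`, `wedge` and `covector` are introduced here for illustration.

```python
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation given as a tuple of 0..n-1 (parity of the inversion count)."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def wedge(xi, k, eta, l):
    """Wedge product of a k-form xi and an l-form eta, following (8.8)."""
    def product(*vectors):
        total = 0.0
        for perm in permutations(range(k + l)):
            args = [vectors[p] for p in perm]
            total += sign(perm) * xi(*args[:k]) * eta(*args[k:])
        # (k+l)!/(k! l!) * A_{k+l} equals 1/(k! l!) times the signed sum over permutations
        return total / (factorial(k) * factorial(l))
    return product


def covector(i):
    """Dual basis element: returns component i of a vector."""
    return lambda v: v[i]


e0, e1, e2 = covector(0), covector(1), covector(2)
v, w, u = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 10.0)

e01 = wedge(e0, 1, e1, 1)       # two-form e0 ^ e1
e10 = wedge(e1, 1, e0, 1)       # e1 ^ e0, expected to equal -(e0 ^ e1) by (8.12)
e012 = wedge(e01, 2, e2, 1)     # three-form (e0 ^ e1) ^ e2
```

Here `e01(v, w)` evaluates to the 2×2 determinant `v[0]*w[1] - v[1]*w[0]`, and `e012(v, w, u)` to the 3×3 determinant of the components of `v`, `w`, `u`, in line with (8.15).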

8.3 Vector Bundle

Definition 8.7 Fiber bundle

Fiber bundle: A fiber bundle is a space that is locally a product space, but globally
may have a different topological structure. Specifically, the similarity between
a space $E$ and a product space $B \times F$ is defined using a continuous surjective
map $\pi : E \to B$ that in small regions of $E$ behaves just like a projection from
corresponding regions of $B \times F$ to $B$. The map $\pi$, called the projection or submersion
of the bundle, is regarded as part of the structure of the bundle. The
space $E$ is known as the total space of the fiber bundle, $B$ as the base space, and
$F$ the fiber.
Vector bundle: A vector bundle is a topological construction that makes precise the
idea of a family of vector spaces parameterized by another space $X$: to every
point $x$ of the space $X$ we associate a vector space $V(x)$ in such a way that these
vector spaces fit together to form another space of the same kind as $X$, which is
then called a vector bundle over $X$.
Tangent bundle: In differential geometry, the tangent bundle of a differentiable manifold
$M$ is a manifold $TM$, which assembles all the tangent vectors in $M$. As a
set, it is given by the disjoint union of the tangent spaces of $M$. That is,
$$TM = \bigsqcup_{x\in M} T_xM = \bigcup_{x\in M} \{x\} \times T_xM = \bigcup_{x\in M} \{(x, y)\,|\,y \in T_xM\} , \tag{8.19}$$
where $T_xM$ denotes the tangent space to $M$ at the point $x$. Therefore, an element
of $TM$ can be thought of as a pair $(x, v)$, where $x$ is a point in $M$ and $v$ is a
tangent vector to $M$ at $x$. There is a natural projection $\pi : TM \to M$ defined
by $\pi(x, v) = x$. This projection maps each tangent space $T_xM$ to the single
point $x$. A section of $TM$ is a vector field on $M$, and the dual bundle to $TM$ is
the cotangent bundle, which is the disjoint union of the cotangent spaces of $M$.
The cotangent bundle and tensor bundles are defined similarly.

8.4 Tangent vector field

Theorem 8.2

Suppose $X$ is a tangent vector field on a manifold $M$. For an arbitrary local coordinate
system $(U; x^i)$ on $M$, $X$ is smooth on $U$ if and only if its component functions with
respect to this chart are smooth.

Theorem 8.3

Suppose $X$ is a smooth tangent vector field on a manifold $M$. Then $X : C^\infty(M) \to C^\infty(M)$
satisfies
1. $\forall f, g \in C^\infty(M)$, $X(f + g) = X(f) + X(g)$;
2. $\forall f \in C^\infty(M)$, $\alpha \in \mathbb{R}$, $X(\alpha f) = \alpha \cdot X(f)$;
3. $\forall f, g \in C^\infty(M)$, $X(f \cdot g) = f \cdot X(g) + g \cdot X(f)$.
Conversely, if $v : C^\infty(M) \to C^\infty(M)$ satisfies the three conditions above, there exists a unique
smooth tangent vector field $X$ on $M$ such that $X(f) = v(f)$ for all $f \in C^\infty(M)$.

Theorem 8.4

If $X$ and $Y$ are smooth tangent vector fields on $M$, then
$$[X, Y] \equiv X \circ Y - Y \circ X \tag{8.20}$$
is a smooth tangent vector field on $M$ as well.

Proposition 8.3

$$[aX + bY, Z] = a[X, Z] + b[Y, Z] ; \tag{8.21}$$
$$[Z, aX + bY] = a[Z, X] + b[Z, Y] ; \tag{8.22}$$
$$[X, Y] = -[Y, X] ; \tag{8.23}$$
$$[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0 ; \tag{8.24}$$
$$[X, Y]|_U = \left[X|_U, Y|_U\right] = (X^iY^j{}_{,i} - Y^iX^j{}_{,i})\,\partial_j ; \tag{8.25}$$
$$f_*[X, Y] = [f_*X, f_*Y] . \tag{8.26}$$
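The component formula (8.25) can be evaluated numerically. The sketch below is a minimal illustration, assuming vector fields on $\mathbb{R}^n$ given as Python functions returning component lists and approximating the partial derivatives by central differences; the function names and the two example fields are hypothetical.

```python
def lie_bracket(X, Y, point, h=1e-5):
    """Components X^i d_i Y^j - Y^i d_i X^j of [X, Y] at `point`, as in (8.25)."""
    n = len(point)

    def partial(field, i):
        # Central difference of all components of `field` along coordinate i
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        fp, fm = field(plus), field(minus)
        return [(fp[j] - fm[j]) / (2.0 * h) for j in range(n)]

    x_val, y_val = X(point), Y(point)
    dX = [partial(X, i) for i in range(n)]  # dX[i][j] approximates d_i X^j
    dY = [partial(Y, i) for i in range(n)]  # dY[i][j] approximates d_i Y^j
    return [
        sum(x_val[i] * dY[i][j] - y_val[i] * dX[i][j] for i in range(n))
        for j in range(n)
    ]


def rotation(p):
    """X = -y d_x + x d_y, the generator of rotations in the plane."""
    return [-p[1], p[0]]


def translation(p):
    """Y = d_x, the generator of translations along x."""
    return [1.0, 0.0]


bracket = lie_bracket(rotation, translation, [0.3, -1.2])
reverse = lie_bracket(translation, rotation, [0.3, -1.2])
```

For these two fields the formula gives $[X, Y] = -\partial_y$ at every point, so `bracket` is approximately `[0, -1]`, and `reverse` is its negative, in agreement with (8.23).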

Definition 8.8 One parameter differentiable transformation group

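Suppose $M$ is a smooth manifold and $\phi : \mathbb{R} \times M \to M$ is a smooth mapping, and
for all $(t, p) \in \mathbb{R} \times M$ denote $\phi_t(p) = \phi(t, p)$. If $\phi$ satisfies
1. $\phi_0 = \mathrm{id} : M \to M$;
2. $\forall s, t \in \mathbb{R}$, $\phi_s \circ \phi_t = \phi_{s+t}$;
then $\phi$ is called a one parameter differentiable transformation group acting on $M$.
• Trajectory of $\phi$ through $p$ on $M$: $\gamma_p(t) = \phi(t, p)$.
• Tangent vector field induced by $\phi$: $X_p(f) = \langle\gamma_p, f\rangle \equiv \frac{d}{dt}f(\gamma_p(t))\big|_{t=0}$.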

Definition 8.9 Lie derivative

Suppose $X$ is a smooth tangent vector field on $M$ and $\phi_t$ is the one parameter differentiable
transformation group inducing it. Denote the trajectory of $\phi_t$ through $x$ by $\gamma_x(t)$.
Thus we have the linear isomorphisms
$$(\phi_t^{-1})_* = (\phi_{-t})_* : T_{\gamma_x(t)}M \to T_xM ; \tag{8.27}$$
$$(\phi_t)^* : T^*_{\gamma_x(t)}M \to T^*_xM . \tag{8.28}$$
They induce the linear isomorphism
$$\Phi_t : T^p_q(\gamma_x(t)) \to T^p_q(x) . \tag{8.29}$$
If $S$ and $T$ are smooth tensor fields on $M$, we have
1. for all sufficiently small $t$, $\Phi_tS$ is a smooth tensor field on $M$ of the
same type as $S$, and $\lim_{t\to 0}\Phi_t(S(\gamma_p(t))) = S(p)$, $\forall p \in M$;
2. $\Phi_t(S \otimes T) = \Phi_tS \otimes \Phi_tT$;
3. $\Phi_t(C^a_b(S)) = C^a_b(\Phi_t(S))$, where $C^a_b$ denotes contraction.
Therefore, we can define the Lie derivative of a smooth tensor field $\tau$ on $M$ as
$$L_X(\tau) = \lim_{t\to 0}\frac{\Phi_t(\tau) - \tau}{t} . \tag{8.30}$$
In local coordinates, we have
$$(L_X\tau)^{\mu_1\cdots\mu_p}_{\nu_1\cdots\nu_q} = X^\alpha\partial_\alpha\tau^{\mu_1\cdots\mu_p}_{\nu_1\cdots\nu_q} - \sum_{i=1}^{p}\tau^{\mu_1\cdots\alpha\cdots\mu_p}_{\nu_1\cdots\nu_q}\,\partial_\alpha X^{\mu_i} + \sum_{j=1}^{q}\tau^{\mu_1\cdots\mu_p}_{\nu_1\cdots\alpha\cdots\nu_q}\,\partial_{\nu_j}X^\alpha . \tag{8.31}$$
In particular, the Lie derivative of a tangent vector field $Y$ with respect to $X$ is $L_X(Y) = [X, Y]$;
the Lie derivative of a smooth function $f$ with respect to $X$ is $L_X(f) = X(f)$.

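Proposition 8.4

$$L_X(\tau_1 + \lambda\tau_2) = L_X\tau_1 + \lambda L_X\tau_2 ; \tag{8.32}$$
$$L_X(\tau_1 \otimes \tau_2) = L_X\tau_1 \otimes \tau_2 + \tau_1 \otimes L_X\tau_2 ; \tag{8.33}$$
$$C^r_s(L_X\tau) = L_X(C^r_s(\tau)) ; \tag{8.34}$$
$$(L_X\omega)(Y) = X(\omega(Y)) - \omega([X, Y]) ; \tag{8.35}$$
$$L_{[X,Y]} = L_X \circ L_Y - L_Y \circ L_X ; \tag{8.36}$$
$$L_{X+Y} = L_X + L_Y . \tag{8.37}$$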

Theorem 8.5

Suppose $X$ is a smooth tangent vector field on a manifold $M$. If $X_p \neq 0$ at a point
$p \in M$, then there exists a local coordinate system $(W; w^i)$ such that $X|_W = \partial/\partial w^1$.

Definition 8.10 Distribution

Let $M$ be a $C^\infty$ manifold of dimension $m$, and let $n \le m$. Suppose that for each $x \in M$,
we assign an $n$-dimensional subspace $\Delta_x \subset T_x(M)$ of the tangent space in such a way
that for a neighbourhood $N_x \subset M$ of $x$ there exist $n$ linearly independent smooth
tangent vector fields $X_1, \ldots, X_n$ such that for any point $y \in N_x$, $X_1(y), \ldots, X_n(y)$
span $\Delta_y$. We let $\Delta$ refer to the collection of all the $\Delta_x$ for all $x \in M$ and we then call
$\Delta$ a distribution of dimension $n$ on $M$, or sometimes a $C^\infty$ $n$-plane distribution on $M$.
The set of smooth tangent vector fields $\{X_1, \ldots, X_n\}$ is called a local basis of $\Delta$.

Definition 8.11 Involutive distributions

We say that a distribution $\Delta$ on $M$ is involutive if for every point $x \in M$ there exists a
local basis $\{X_1, \ldots, X_n\}$ of the distribution in a neighbourhood of $x$ such that for all
$1 \le i, j \le n$, $[X_i, X_j]$ is in the span of $\{X_1, \ldots, X_n\}$, that is, $[X_i, X_j]$ is a linear
combination of $\{X_1, \ldots, X_n\}$. Normally this is written as $[\Delta, \Delta] \subset \Delta$.

Theorem 8.6 Frobenius Theorem

If an $h$-dimensional distribution $\Delta$ on $M$ is involutive, then for every $p \in M$ there exists a
local coordinate system $(V; x^i)$ with $p \in V$ such that $\Delta|_V = \mathrm{Span}\{\partial_1, \cdots, \partial_h\}$.

Definition 8.12 Integral manifold

Suppose $\Delta^h$ is a smooth $h$-dimensional distribution on $M$. If $\phi : N \to M$ is an injective
immersed submanifold, and for all $p \in N$, $\phi_*(T_pN) \subset \Delta^h(\phi(p))$, then $(\phi, N)$ is called an
integral manifold of $\Delta^h$.
If for all $q \in M$ there is an $h$-dimensional integral manifold of $\Delta^h$ through it, we say
that $\Delta^h$ is completely integrable.

8.5 Exterior differential

Definition 8.13 Exterior form space

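For a smooth exterior form field $\tau \in \Lambda^r(M)$, we have
$$\tau|_U = \frac{1}{r!}\tau_{i_1\cdots i_r}\, dx^{i_1} \wedge \cdots \wedge dx^{i_r} = \tau_{|i_1\cdots i_r|}\, dx^{i_1} \wedge \cdots \wedge dx^{i_r} , \tag{8.38}$$
where
$$\tau_{i_1\cdots i_r} = \tau\left(\frac{\partial}{\partial x^{i_1}}, \cdots, \frac{\partial}{\partial x^{i_r}}\right) . \tag{8.39}$$
$\tau$ is an $r$-multilinear mapping which is $C^\infty(M)$-linear in every argument, and
$$\tau(v_1, \cdots, v_r)|_U = \tau_{|i_1\cdots i_r|}\begin{vmatrix} v_1^{i_1} & \cdots & v_r^{i_1} \\ \vdots & & \vdots \\ v_1^{i_r} & \cdots & v_r^{i_r} \end{vmatrix} . \tag{8.40}$$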

Proposition 8.5 Pullback mapping

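If $f : M \to N$ is a smooth mapping and $f^* : \Lambda^r(T^*_{f(p)}N) \to \Lambda^r(T^*_pM)$ is the pull-back
mapping it generates, then for $\phi \in \Lambda^r(N)$ we have
$$f^*\phi|_U = \frac{1}{r!}(\phi_{\alpha_1\cdots\alpha_r} \circ f) \cdot \frac{\partial f^{\alpha_1}}{\partial x^{i_1}} \cdots \frac{\partial f^{\alpha_r}}{\partial x^{i_r}}\, dx^{i_1} \wedge \cdots \wedge dx^{i_r} ; \tag{8.41}$$
and
$$f^*(\phi \wedge \psi) = f^*\phi \wedge f^*\psi . \tag{8.42}$$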

Definition 8.14 Exterior differential

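Suppose $M$ is an $m$-dimensional smooth manifold. There exists a unique mapping
$d : \Lambda(M) \to \Lambda(M)$ satisfying
1. $d(\Lambda^r(M)) \subset \Lambda^{r+1}(M)$;
2. $\forall\omega_1, \omega_2 \in \Lambda(M)$, $d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2$;
3. if $\omega_1 \in \Lambda^r(M)$, then $d(\omega_1 \wedge \omega_2) = d\omega_1 \wedge \omega_2 + (-1)^r\omega_1 \wedge d\omega_2$;
4. for $f \in \Lambda^0(M)$, $df$ is just the differential of $f$;
5. $\forall f \in \Lambda^0(M)$, $d(df) = 0$.
$d$ is called the exterior differential.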

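Proposition 8.6

$\forall\omega \in \Lambda^1(M)$, $X, Y \in T(M)$,
$$d\omega(X, Y) = X\langle Y, \omega\rangle - Y\langle X, \omega\rangle - \langle[X, Y], \omega\rangle . \tag{8.43}$$
$\forall\omega \in \Lambda^r(M)$, $X_1, \cdots, X_{r+1} \in T(M)$,
$$\begin{aligned} d\omega(X_1, \cdots, X_{r+1}) = {} & \sum_{i=1}^{r+1}(-1)^{i+1}X_i\left(\langle X_1 \wedge \cdots \wedge \hat{X}_i \wedge \cdots \wedge X_{r+1}, \omega\rangle\right) \\ & + \sum_{1\le i<j\le r+1}(-1)^{i+j}\langle[X_i, X_j] \wedge \cdots \wedge \hat{X}_i \wedge \cdots \wedge \hat{X}_j \wedge \cdots \wedge X_{r+1}, \omega\rangle . \end{aligned} \tag{8.44}$$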

Proposition 8.7

$$f^*(d\omega) = d(f^*\omega) . \tag{8.45}$$

Lemma 1 Poincare Lemma

1. $d^2 = 0$.
2. Suppose $U = B_0(r)$ is a spherical neighbourhood in $\mathbb{R}^n$ with centre at the origin $O$ and
radius $r$. Then for every closed form $\omega \in \Lambda^r(U)$, i.e. $d\omega = 0$, there exists
$\tau \in \Lambda^{r-1}(U)$ such that $\omega = d\tau$.

Definition 8.15 Pfaff equations

Suppose $\omega^\alpha \in \Lambda^1(U)$ $(1 \le \alpha \le r)$, where $U$ is an open set of an $m$-dimensional smooth
manifold $M$. The set of differential equations $\omega^\alpha = 0$ is called Pfaff equations.

Definition 8.16 Integral manifold of Pfaff equations

If there is an injective immersed submanifold $\phi : N \to U$ satisfying $\phi^*\omega^\alpha = 0$,
then $(\phi, N)$ is called an integral manifold of the Pfaff equations.

Proposition 8.8 Partial differential equations and Pfaff equations

Consider a set of first order partial differential equations
$$\frac{\partial y^\alpha}{\partial x^i} = f^\alpha_i(x^1, \cdots, x^m, y^1, \cdots, y^n) \quad (1 \le i \le m,\ 1 \le \alpha \le n) , \tag{8.46}$$
where the $f^\alpha_i(x, y)$ are smooth functions on the open set $U \times V \subset \mathbb{R}^m \times \mathbb{R}^n$. The
equations can be written as Pfaff equations on $U \times V$:
$$\omega^\alpha \equiv dy^\alpha - f^\alpha_i(x, y)\, dx^i = 0 . \tag{8.47}$$
If the partial differential equations have a solution
$$y^\alpha = g^\alpha(x^1, \ldots, x^m) , \tag{8.48}$$
then the submanifold $\phi : U \to U \times V$,
$$\phi(x^1, \ldots, x^m) = (x^1, \ldots, x^m, g^1(x), \ldots, g^n(x)) , \tag{8.49}$$
is an integral manifold of the Pfaff equations, i.e. $\phi^*\omega^\alpha = 0$.

Proposition 8.9 Distribution and Pfaff equations

Pfaff equations $\omega^\alpha = 0$ of rank $r$ on an open set $V \subset M$ are locally equivalent to an
$h = m - r$ dimensional smooth distribution
$$\Delta^h(p) = \{v \in T_pM : \omega^\alpha(v) = 0,\ 1 \le \alpha \le r\} . \tag{8.50}$$
If $\phi : N \to V$ is an integral manifold of $\omega^\alpha = 0$, then for all $X \in T_pN$ we have
$\omega^\alpha(\phi_*X) = (\phi^*\omega^\alpha)(X) = 0$. Thus $\phi_*X \in \Delta^h(\phi(p))$ and $\phi : N \to V$ is an integral
manifold of $\Delta^h$.

Definition 8.17 Completely integrable

Suppose $\omega^\alpha$ is a set of $r$ linearly independent 1-forms defined on an open set $U \subset M$.
If for every $p \in U$ the Pfaff equations
$$\omega^\alpha = 0 \quad (1 \le \alpha \le r) \tag{8.51}$$
have an $h = m - r$ dimensional integral manifold $\phi : N \to V$ through $p$, where $V \subset U$ is a
neighbourhood of $p$, then the Pfaff equations are called completely integrable.

Theorem 8.7 Frobenius theorem

Pfaff equations $\omega^\alpha = 0$ $(1 \le \alpha \le r)$ satisfying the Frobenius condition
$$d\omega^\alpha \wedge \omega^1 \wedge \cdots \wedge \omega^r = 0 \tag{8.52}$$
are completely integrable.

Definition 8.18 Orientation of manifold

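Suppose $\alpha : [0, 1] \to M$ is a path on $M$. For every $t \in [0, 1]$, assign an orientation of
$T_{\alpha(t)}M$, denoted by $\mu_t$. If for every $t_0 \in [0, 1]$ there is a local coordinate system $(U; x^i)$
around $\alpha(t_0)$ and a neighbourhood $[t_0 - \delta_1, t_0 + \delta_2]$ of $t_0$ such that
$$\alpha([t_0 - \delta_1, t_0 + \delta_2]) \subset U \tag{8.53}$$
and
$$\left(\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^m}\right)\bigg|_{\alpha(t)} \in \mu_t, \quad \forall t \in [t_0 - \delta_1, t_0 + \delta_2] , \tag{8.54}$$
then $\mu$ is called a continuous topological orientation of $\alpha$.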

Definition 8.19 The propagation of orientation

Suppose $p, q \in M$ and $\alpha : [0, 1] \to M$ is a path connecting $p$ and $q$. Assign an orientation
$\lambda$ of $T_pM$. If there is a continuous topological orientation $\mu$ of $\alpha$ satisfying $\mu_0 = \lambda$,
then the orientation $\mu_1$ of $T_qM$ is called the propagation of the orientation $\lambda$ along $\alpha$. The
orientation $\mu_1$ is unique.

Definition 8.20 Orientable manifold

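Suppose $M$ is an $m$-dimensional smooth manifold. If there is an atlas
$\mathcal{A}_0 = \{(U_\alpha, \phi_\alpha)\}$ such that, whenever $U_\alpha \cap U_\beta \neq \emptyset$, the Jacobian of
$$\phi_\beta \circ \phi_\alpha^{-1} : \phi_\alpha(U_\alpha \cap U_\beta) \to \phi_\beta(U_\alpha \cap U_\beta) \tag{8.55}$$
is positive, then $M$ is called an orientable manifold.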

Proposition 8.10

Suppose $M$ is an orientable connected manifold. Fix a point $p \in M$ and assign an orientation
$\lambda$ for $T_pM$. Then for all $q \in M$, the propagation of $\lambda$ along an arbitrary path from $p$ to $q$
defines a unique orientation $\mu$ for $T_qM$.

Definition 8.21 Manifold with boundary

A topological manifold with boundary is a Hausdorff space in which every point has a
neighbourhood homeomorphic to an open subset of Euclidean half-space (for a fixed $n$):
$$\mathbb{R}^n_+ = \{(x_1, \ldots, x_n) \in \mathbb{R}^n : x_n \ge 0\} . \tag{8.56}$$

Definition 8.22 Boundary and interior

Suppose $M$ is a manifold with boundary. The interior of $M$, denoted $\mathrm{Int}\,M$, is the set of
points in $M$ which have neighbourhoods homeomorphic to an open subset of $\mathbb{R}^n$. The
boundary of $M$, denoted $\partial M$, is the complement of $\mathrm{Int}\,M$ in $M$. The boundary points
can be characterized as those points which land on the boundary hyperplane ($x_n = 0$)
of $\mathbb{R}^n_+$ under some coordinate chart. If $M$ is a manifold with boundary of dimension
$n$, then $\mathrm{Int}\,M$ is a manifold (without boundary) of dimension $n$ and $\partial M$ is a manifold
(without boundary) of dimension $n - 1$.

Theorem 8.8

Suppose $M$ is an $m$-dimensional smooth manifold with boundary. The differential structure of
$M$ induces a differential structure on $\partial M$, making $\partial M$ an $(m - 1)$-dimensional smooth
manifold such that the inclusion map $i : \partial M \to M$ is an embedding. If $M$ is orientable, then
$\partial M$ is also orientable.

Definition 8.23 Induced orientation

Suppose $M$ is an orientable $m$-dimensional smooth manifold with boundary, and $\mathcal{A}$ is the
orientation of $M$. For local coordinates $(U; x^i) \in \mathcal{A}$, when
$$\tilde{U} = U \cap \partial M = \{(x^1, \ldots, x^m) \in U : x^m = 0\} \neq \emptyset , \tag{8.57}$$
assign the local coordinate system $((-1)^m \cdot x^1, x^2, \ldots, x^{m-1})$ on $\tilde{U}$. The orientation
defined by this local coordinate system is called the induced orientation of $\partial M$.

Definition 8.24 Support set

Suppose $M$ is an $m$-dimensional orientable smooth manifold. For $\omega \in \Lambda^r(M)$, the
support set of $\omega$ is defined as the closure
$$\mathrm{supp}\,\omega = \overline{\{p \in M : \omega(p) \neq 0\}} . \tag{8.58}$$
The set of all $r$-forms with compact support is denoted by $\Lambda^r_0(M)$.

Theorem 8.9 Partition of unity

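Suppose $\Sigma$ is an open cover of $M$. Then there exists a family of smooth functions $g_\alpha$ on $M$
such that
1. $\forall\alpha$, $0 \le g_\alpha \le 1$, $\mathrm{supp}\,g_\alpha$ is compact, and there is an open set $W_\alpha \in \Sigma$ such that
$\mathrm{supp}\,g_\alpha \subset W_\alpha$;
2. every $p \in M$ has a neighbourhood $U$ which intersects only finitely many $\mathrm{supp}\,g_\alpha$;
3. $\sum_\alpha g_\alpha = 1$.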

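Definition 8.25 Integral of differential form with compact support

$$\begin{aligned} \int_M \phi &= \int_M \Big(\sum_\alpha g_\alpha\Big) \cdot \phi = \int_M \sum_\alpha (g_\alpha \cdot \phi) = \sum_\alpha \int_M g_\alpha \cdot \phi \\ &= \sum_\alpha \int_{W_\alpha} g_\alpha \cdot \phi = \sum_\alpha \int_{W_\alpha} f(w^1, \cdots, w^m)\, dw^1 \wedge \cdots \wedge dw^m \\ &\equiv \sum_\alpha \int_{W_\alpha} f(w^1, \cdots, w^m)\, dw^1 \cdots dw^m . \end{aligned} \tag{8.59}$$
Here $\{g_\alpha\}$ is a partition of unity subordinate to coordinate neighbourhoods $W_\alpha$ with local
coordinates $w^i$, and $f\, dw^1 \wedge \cdots \wedge dw^m$ denotes the local expression of $g_\alpha \cdot \phi$ on $W_\alpha$.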

Theorem 8.10 Stokes Theorem

Suppose $M$ is an orientable $m$-dimensional smooth manifold with boundary and
$\omega \in \Lambda^{m-1}_0(M)$. Then
$$\int_M d\omega = \int_{\partial M} i^*\omega . \tag{8.60}$$
Here, $\partial M$ has the orientation induced by $M$ and $i : \partial M \to M$ is the embedding mapping.

8.6 Connection

Definition 8.26 Connection

Suppose $M$ is a smooth manifold and $E$ is a $q$-dimensional real vector bundle on $M$.
$\Gamma(E)$ is the set of all smooth sections of $E$ on $M$. A connection on $E$ is a mapping
$$D : \Gamma(E) \to \Gamma(T^*M \otimes E) \tag{8.61}$$
satisfying
1. $\forall s_1, s_2 \in \Gamma(E)$, $D(s_1 + s_2) = Ds_1 + Ds_2$;
2. $\forall s \in \Gamma(E)$ and $\alpha \in C^\infty(M)$, $D(\alpha s) = d\alpha \otimes s + \alpha Ds$.
If $X$ is a smooth tangent vector field on $M$ and $s \in \Gamma(E)$, then $D_Xs \equiv \langle X, Ds\rangle$ is called
the absolute derivative of $s$ along $X$.

Proposition 8.11

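The local representation of a connection is
$$Ds_\alpha = \sum_{1\le i\le m,\ 1\le\beta\le q} \Gamma^\beta{}_{\alpha i}\, du^i \otimes s_\beta = \sum_{\beta=1}^{q} \omega_\alpha{}^\beta \otimes s_\beta , \tag{8.62}$$
where
$$\omega_\alpha{}^\beta \equiv \sum_{1\le i\le m} \Gamma^\beta{}_{\alpha i}\, du^i . \tag{8.63}$$
Writing $S$ for the column of local frame sections $s_\alpha$, this can be written compactly as
$$DS = \omega \otimes S . \tag{8.64}$$
If we use a new frame $S' = A \cdot S$, we have
$$DS' = dA \otimes S + A \cdot DS = (dA \cdot A^{-1} + A \cdot \omega \cdot A^{-1}) \otimes S' ; \tag{8.65}$$
$$\omega' = dA \cdot A^{-1} + A \cdot \omega \cdot A^{-1} . \tag{8.66}$$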

Theorem 8.11

A connection always exists on a vector bundle.

Theorem 8.12

Suppose $D$ is a connection on a vector bundle $E$, and $p \in M$. Then there exists a local
frame field $S$ in a coordinate neighborhood of $p$ such that the corresponding connection
matrix $\omega$ is zero at $p$.

Definition 8.27 Curvature matrix

$$\Omega \equiv d\omega - \omega \wedge \omega . \tag{8.67}$$

Proposition 8.12 Transformation law of curvature matrix

$$\Omega' = A \cdot \Omega \cdot A^{-1} . \tag{8.68}$$

Definition 8.28 Curvature operator

Suppose $X, Y$ are two arbitrary tangent vector fields on $M$. Suppose $s$ can be expressed
as $s = \sum_{\alpha=1}^{q} \lambda^\alpha s_\alpha|_p$ using the local frame. The curvature operator is defined as
$$R(X, Y)s \equiv \sum_{\alpha,\beta=1}^{q} \lambda^\alpha\, \Omega_\alpha{}^\beta(X, Y)\, s_\beta|_p . \tag{8.69}$$
The transformation law of the curvature matrix ensures that the curvature operator is independent
of the choice of local frame field.

Proposition 8.13

$$R(X, Y) = D_XD_Y - D_YD_X - D_{[X,Y]} . \tag{8.70}$$

Theorem 8.13 Bianchi identity

The curvature matrix $\Omega$ satisfies the Bianchi identity
$$d\Omega = \omega \wedge \Omega - \Omega \wedge \omega . \tag{8.71}$$

Definition 8.29 Induced connection

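The induced connections on the dual bundle, on direct sums and on tensor products are defined
through
$$d\langle s, s^*\rangle = \langle Ds, s^*\rangle + \langle s, Ds^*\rangle ; \tag{8.72}$$
$$D(s_1 \oplus s_2) \equiv Ds_1 \oplus Ds_2 ; \tag{8.73}$$
$$D(s_1 \otimes s_2) \equiv Ds_1 \otimes s_2 + s_1 \otimes Ds_2 . \tag{8.74}$$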

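Definition 8.30 Affine connection

$$D\frac{\partial}{\partial u^i} \equiv \omega_i{}^j \otimes \frac{\partial}{\partial u^j} \equiv \Gamma^j{}_{ik}\, du^k \otimes \frac{\partial}{\partial u^j} . \tag{8.75}$$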

Proposition 8.14

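The transformation law for the Christoffel symbols $\Gamma$ under a coordinate transformation
$u \to w$ is
$$\Gamma'^j{}_{ik} = \Gamma^q{}_{pr}\frac{\partial w^j}{\partial u^q}\frac{\partial u^p}{\partial w^i}\frac{\partial u^r}{\partial w^k} + \frac{\partial^2u^p}{\partial w^i\partial w^k}\frac{\partial w^j}{\partial u^p} . \tag{8.76}$$
The covariant derivatives of a tangent vector field $X$ and a cotangent vector field $\alpha$ are
$$DX = (X^i{}_{,j} + X^k\Gamma^i{}_{kj})\, du^j \otimes \frac{\partial}{\partial u^i} = X^i{}_{;j}\, du^j \otimes \frac{\partial}{\partial u^i} ; \tag{8.77}$$
$$D\alpha = (\alpha_{i,j} - \alpha_k\Gamma^k{}_{ij})\, du^j \otimes du^i = \alpha_{i;j}\, du^j \otimes du^i . \tag{8.78}$$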

Definition 8.31 Geodesic equation

$$\frac{d^2u^i}{dt^2} + \Gamma^i{}_{jk}\frac{du^j}{dt}\frac{du^k}{dt} = 0 . \tag{8.79}$$

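Definition 8.32 Curvature tensor

$$\Omega_i{}^j = \frac{1}{2}R^j{}_{ikl}\, du^k \wedge du^l . \tag{8.80}$$
$$R \equiv R^j{}_{ikl}\, \frac{\partial}{\partial u^j} \otimes du^i \otimes du^k \otimes du^l . \tag{8.81}$$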

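Proposition 8.15

$$R^i{}_{jkl} = \Gamma^i{}_{jl,k} - \Gamma^i{}_{jk,l} + \Gamma^i{}_{hk}\Gamma^h{}_{jl} - \Gamma^i{}_{hl}\Gamma^h{}_{jk} ; \tag{8.82}$$
$$R(\alpha_X, Y, Z, W) = \langle\alpha_X, R(Z, W)Y\rangle ; \tag{8.83}$$
$$R^i{}_{jkl} = \left\langle R\left(\frac{\partial}{\partial u^k}, \frac{\partial}{\partial u^l}\right)\frac{\partial}{\partial u^j},\ du^i\right\rangle . \tag{8.84}$$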

Definition 8.33 Torsion tensor

$$T^j{}_{ik} \equiv \Gamma^j{}_{ki} - \Gamma^j{}_{ik}, \qquad T \equiv T^j{}_{ik}\, \frac{\partial}{\partial u^j} \otimes du^i \otimes du^k . \tag{8.85}$$

Proposition 8.16

$$T(X, Y) = T^k{}_{ij}X^iY^j\frac{\partial}{\partial u^k} = D_XY - D_YX - [X, Y] . \tag{8.86}$$

Theorem 8.14

Suppose $D$ is an affine connection without torsion on $M$. For every $p \in M$, there is a local
coordinate system in which $\Gamma^i{}_{jk}(p)$ vanishes.

Theorem 8.15

Suppose $D$ is an affine connection without torsion on $M$. Then we have the Bianchi identity
$$R^i{}_{jkl;h} + R^i{}_{jhk;l} + R^i{}_{jlh;k} = 0 . \tag{8.87}$$

8.7 Riemannian manifold

Definition 8.34 Riemannian manifold

If an $m$-dimensional smooth manifold $M$ is given a smooth, everywhere nondegenerate
symmetric covariant tensor field of rank 2, $G$, then $M$ is called a generalized Riemannian
manifold, and $G$ is called a fundamental tensor or metric tensor of $M$. If $G$ is
positive definite, then $M$ is called a Riemannian manifold.

Theorem 8.16

There exists a Riemannian metric on any $m$-dimensional smooth manifold $M$.

Definition 8.35 Index lifting

If $X \in T_p(M)$, let $\alpha_X(Y) \equiv G(X, Y)$ for $Y \in T_p(M)$. Then $\alpha_X$ is a linear functional on
$T_p(M)$. Since $G$ is nondegenerate, any element of $T^*_p(M)$ can be expressed in the form
$\alpha_X$. Componentwise, if $X = X^i\partial_i$ and $\alpha_X = X_i\, du^i$, then we obtain that
$$X_i = g_{ij}X^j, \qquad X^j = g^{ij}X_i , \tag{8.88}$$
where $g^{ij}$ is the inverse of the matrix $g_{ij}$.

Definition 8.36 Adapted connection

Suppose $(M, G)$ is an $m$-dimensional generalized Riemannian manifold, and $D$ is an
affine connection on $M$. If $DG = 0$, then $D$ is called a metric-compatible connection
on $(M, G)$.

Theorem 8.17 Fundamental Theorem of Riemannian Geometry

Suppose $M$ is an $m$-dimensional generalized Riemannian manifold. Then there exists
a unique torsion-free and metric-compatible connection on $M$, called the Levi-Civita
connection of $M$. In local coordinates, we have
$$\Gamma^k{}_{ij} = \frac{1}{2}g^{kl}(g_{il,j} + g_{jl,i} - g_{ij,l}) . \tag{8.89}$$
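Formula (8.89) lends itself to a direct numerical evaluation. The sketch below is an illustration only: it assumes the unit 2-sphere with coordinates $(\theta, \phi)$ and metric $\mathrm{diag}(1, \sin^2\theta)$ as an example, approximates the metric derivatives by central differences, and uses function names introduced here for that purpose.

```python
import math


def sphere_metric(u):
    """Metric g_ij of the unit 2-sphere in coordinates u = (theta, phi)."""
    theta = u[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, u, h=1e-5):
    """gamma[k][i][j] = 1/2 g^{kl} (g_{il,j} + g_{jl,i} - g_{ij,l}) for a 2D metric, as in (8.89)."""
    n = 2

    def d_metric(l):
        # Central difference of the metric along coordinate l
        plus, minus = list(u), list(u)
        plus[l] += h
        minus[l] -= h
        gp, gm = metric(plus), metric(minus)
        return [[(gp[a][b] - gm[a][b]) / (2.0 * h) for b in range(n)] for a in range(n)]

    dg = [d_metric(l) for l in range(n)]  # dg[l][a][b] approximates g_{ab,l}
    g_inv = inverse_2x2(metric(u))
    return [
        [
            [
                0.5 * sum(
                    g_inv[k][l] * (dg[j][i][l] + dg[i][j][l] - dg[l][i][j])
                    for l in range(n)
                )
                for j in range(n)
            ]
            for i in range(n)
        ]
        for k in range(n)
    ]


gamma = christoffel(sphere_metric, [1.0, 0.5])
```

With index 0 for $\theta$ and 1 for $\phi$, the only non-vanishing components are `gamma[0][1][1]`, approximately $-\sin\theta\cos\theta$, and `gamma[1][0][1] = gamma[1][1][0]`, approximately $\cot\theta$, which are the familiar Levi-Civita coefficients of the sphere.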

Proposition 8.17 Properties of curvature tensor

$$R_{ijkl} = -R_{jikl} = -R_{ijlk} ; \tag{8.90}$$
$$R_{ijkl} + R_{iklj} + R_{iljk} = 0 ; \tag{8.91}$$
$$R_{ijkl} = R_{klij} . \tag{8.92}$$

Definition 8.37 Geodesic

Suppose $M$ is an $m$-dimensional Riemannian manifold. If a parametrized curve $C$ is
a geodesic curve in $M$ with respect to the Levi-Civita connection, then $C$ is called a
geodesic of the Riemannian manifold $M$. It is easy to show that the parameter for a
geodesic curve in a Riemannian manifold must be a linear function of the arc length $s$,
i.e. $t = \lambda s + \mu$, where $\lambda$ and $\mu$ are constants.

Definition 8.38 Normal coordinates

Let a given point in a Riemannian manifold be $O$ and consider some nearby point $P$. If
$P$ is close enough to $O$ then there exists a unique geodesic joining $O$ to $P$. Let $X^i$ be the
components of the unit tangent vector to this geodesic at $O$ and let $s$ be the geodesic arc
length measured from $O$ to $P$. Then the Riemann normal coordinates of $P$ are defined
to be $u^i = sX^i$. One trivial consequence of this definition is that all geodesics through
$O$ are of the form $u^i(s) = sX^i$ and that the $X^i$ are constant along each geodesic.

Proposition 8.18

In Riemann normal coordinates, we have
$$\Gamma^i{}_{jk} = 0, \qquad R_{ijkl} = g_{jk,il} - g_{ik,jl} \tag{8.93}$$
at the origin. The Taylor series expansion of the metric around the origin is
$$g_{ij}(u) = g_{ij} - \frac{1}{3}R_{ikjl}u^ku^l + O\left(u^3\right) . \tag{8.94}$$

Theorem 8.18

Suppose $U$ is a normal coordinate neighborhood of the point $O$. Then there exists a
positive number $\epsilon$ such that, for any $0 < \delta < \epsilon$, the hypersphere
$$\Sigma_\delta = \left\{p \in U\ \Big|\ \sum_{i=1}^{m}(u^i(p))^2 = \delta^2\right\} \tag{8.95}$$
has the following properties:
1. Every point on $\Sigma_\delta$ can be connected to $O$ by a unique shortest geodesic curve in $U$.
2. Any geodesic curve tangent to $\Sigma_\delta$ is strictly outside $\Sigma_\delta$ in a neighborhood of the
tangent point.

Theorem 8.19

There exists an $\eta$-ball neighborhood $W$ at any point $p$ in a Riemannian manifold $M$,
where $\eta$ is a sufficiently small positive number, such that any two points in $W$ can be
connected by a unique geodesic curve.

Definition 8.39 Sectional Curvature

Define
$$R(X, Y, Z, W) \equiv R_{ijkl}X^iY^jZ^kW^l , \tag{8.96}$$
$$G(X, Y, Z, W) \equiv G(X, Z)G(Y, W) - G(X, W)G(Y, Z) . \tag{8.97}$$
Suppose $E$ is a 2-dimensional subspace of $T_p(M)$ and $X, Y$ are two linearly independent
tangent vectors of $E$. Then
$$K(E) = \frac{R(X, Y, X, Y)}{G(X, Y, X, Y)} \tag{8.98}$$
is a function of $E$ independent of the choice of $X, Y$ in $E$. We call it the sectional
curvature of $M$ at $(p, E)$.

Theorem 8.20

The curvature tensor of a Riemannian manifold $M$ at a point $p$ is uniquely determined by the sectional curvatures of all the 2-dimensional tangent subspaces at $p$.

Definition 8.40 Constant curvature Riemannian manifold

Suppose $M$ is a Riemannian manifold. If the sectional curvature $K(E)$ at the point $p$ is a constant, i.e. independent of the 2-dimensional subspace $E \subset T_p(M)$, then we say that $M$ is wandering (isotropic) at $p$. If $M$ is wandering at every point and the sectional curvature $K(p)$ is a constant function on $M$, then $M$ is called a constant curvature space.

Theorem 8.21 F.Schur’s theorem

Suppose $M$ is a connected $m$-dimensional Riemannian manifold that is everywhere wandering. If $m \ge 3$, then $M$ is a constant curvature space.
Chapter 9
A Geometrical Description of Newtonian Theory

9.1 Introduction
In Newtonian theory, there is a global coordinate system $(t, x, y, z)$ for the whole spacetime, where $t$ is the time coordinate and $(x, y, z)$ are Euclidean space coordinates. The equation of motion of a particle is
$$ \frac{d^2 t}{d\lambda^2} = 0, \qquad \frac{d^2 x^i}{d\lambda^2} + \frac{\partial\Phi}{\partial x^i}\left(\frac{dt}{d\lambda}\right)^2 = 0. \tag{9.1} $$
We can endow the spacetime manifold with a connection whose only nonvanishing components are $\Gamma^i{}_{00} = \Phi_{,i}$. Then the equation of motion can be written as a geodesic equation,
$$ \frac{d^2 x^\alpha}{d\lambda^2} + \Gamma^\alpha{}_{\beta\gamma} \frac{dx^\beta}{d\lambda}\frac{dx^\gamma}{d\lambda} = 0. \tag{9.2} $$
The Riemann tensor of the given connection is
$$ R^i{}_{0j0} = -R^i{}_{00j} = \frac{\partial^2\Phi}{\partial x^i \partial x^j}, \tag{9.3} $$
and all other components vanish. The Ricci tensor is defined as the contraction of the first and third indices of the Riemann curvature tensor, i.e., $R_{\mu\nu} \equiv R^\alpha{}_{\mu\alpha\nu}$. For Newtonian theory, we have
$$ R_{00} = \Phi_{,ii}, \tag{9.4} $$
and all other components vanish. As a result, Newton's law of universal gravitation can be expressed as
$$ R_{00} = 4\pi\rho. \tag{9.5} $$
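As a quick consistency check, the first equation in (9.1) can be integrated and substituted into the second to recover the usual Newtonian equation of motion. This is only a sketch; the integration constants $a$ and $b$ are introduced here for illustration and do not appear in the text.

```latex
\begin{align*}
\frac{d^2 t}{d\lambda^2} = 0 \;&\Longrightarrow\; t = a\lambda + b, \qquad \frac{dt}{d\lambda} = a \neq 0, \\
\frac{d^2 x^i}{d\lambda^2} = a^2 \frac{d^2 x^i}{dt^2} \;&\Longrightarrow\;
a^2\left(\frac{d^2 x^i}{dt^2} + \frac{\partial\Phi}{\partial x^i}\right) = 0
\;\Longrightarrow\; \frac{d^2 x^i}{dt^2} = -\frac{\partial\Phi}{\partial x^i}.
\end{align*}
```

The affine parameter $\lambda$ is therefore universal time up to a linear rescaling.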

9.2 Geometry structure of Newtonian spacetime


Stratification of spacetime

Regard absolute time t as a scalar field defined once and for all in Newtonian spacetime t =
t(P). The layers of spacetime are the slices of constant t – the “space slices” – each of which
has an identical geometric structure: the old “absolute space”.

Flat Euclidean space


A given space slice is endowed with basis vectors $e_i = \partial_i$, and this basis has vanishing connection coefficients, $\Gamma^i{}_{jk} = 0$. Consequently, the geometry of each space slice is completely flat. Absolute space is Euclidean in its geometry. Each space slice is endowed with a three-dimensional metric, and its Galilean coordinate basis is orthonormal, $e_i \cdot e_j = \delta_{ij}$.

Curvature of spacetime
Parallel transport a vector around a closed curve lying entirely in a space slice; it will return to its starting point unchanged. But transport it forward in time by $\Delta t$, northerly in space by $\Delta x^k$, back in time by $-\Delta t$, and southerly by $-\Delta x^k$ to its starting point; it will return changed by
$$ \delta A = -R\left(\Delta t\,\frac{\partial}{\partial t},\, \Delta x^k \frac{\partial}{\partial x^k}\right) A. \tag{9.6} $$
Geodesics of a space slice (Euclidean straight lines) that are initially parallel always remain parallel. But geodesics of spacetime (trajectories of freely falling particles) that are initially parallel get pried apart or pushed together by spacetime curvature,
$$ \nabla_u \nabla_u n + R(n, u)u = 0. \tag{9.7} $$

9.3 Geometry formulation of Newtonian gravity


1. There exists a function t called “universal time”, and a symmetric covariant derivative
∇.

2. The 1-form $dt$ is covariantly constant, i.e.,
$$ \nabla_u\, dt = 0 \quad \text{for all } u. \tag{9.8} $$
Note: if $w$ is a spatial vector field, then $\nabla_u w$ is also spatial for every $u$.

3. Spatial vectors are unchanged by parallel transport around infinitesimal closed curves; i.e.,
$$ R(n, u)w = 0 \quad \text{if } w \text{ is spatial, for every } u \text{ and } n. \tag{9.9} $$

4. All vectors are unchanged by parallel transport around infinitesimal, spatial, closed curves; i.e.,
$$ R(v, w) = 0 \quad \text{for every spatial } v \text{ and } w. \tag{9.10} $$

5. The Ricci curvature tensor has the form

Ricci = 4πρ d t ⊗ d t, (9.11)

where ρ is the density of mass.



6. There exists a metric $\cdot$ defined on spatial vectors only, which is compatible with the covariant derivative in this sense: for any spatial $w$ and $v$, and for any $u$ whatsoever,
$$ \nabla_u (w \cdot v) = (\nabla_u w) \cdot v + w \cdot (\nabla_u v). \tag{9.12} $$
Note: Axioms (1), (2), and (3) guarantee that such a spatial metric can exist.

7. The Jacobi curvature operator $J(u, n)$, defined for any vectors $u$, $n$, $p$ by
$$ J(u, n)p = \frac{1}{2}\left[R(p, n)u + R(p, u)n\right], \tag{9.13} $$
is "self-adjoint" when operating on spatial vectors, i.e.,
$$ v \cdot [J(u, n)w] = w \cdot [J(u, n)v] \quad \text{for all spatial } v, w \text{ and for any } u, n. \tag{9.14} $$

8. “Ideal rods” measure the lengths that are calculated with the spatial metric; “ideal clocks”
measure universal time t; and “freely falling particles” move along geodesics of ∇.

9.4 Standard formulation of Newtonian gravity


1. There exist a universal time t, a set of Cartesian space coordinates xi (called “Galilean
coordinates”), and a Newtonian gravitational potential Φ.

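2. The density of mass $\rho$ generates the Newtonian potential by Poisson's equation,
$$ \frac{\partial^2\Phi}{\partial x^i \partial x^i} = 4\pi\rho. \tag{9.15} $$

3. The equation of motion for a freely falling particle is
$$ \frac{d^2 x^i}{dt^2} + \frac{\partial\Phi}{\partial x^i} = 0. \tag{9.16} $$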

4. “Ideal rods” measure the Galilean coordinate lengths; “ideal clocks” measure universal
time.
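To see the standard formulation at work, the equation of motion (9.16) can be integrated numerically. The sketch below is illustrative only: it assumes a point-mass potential $\Phi = -M/r$ in units with $G = 1$ (which solves (9.15) outside the source), restricts the motion to a plane, and uses a kick-drift-kick leapfrog scheme; the function names are hypothetical and not part of the notes.

```python
import math

def acceleration(pos, mass=1.0):
    """Return -grad(Phi) for the assumed point-mass potential Phi = -mass / r."""
    x, y = pos
    r3 = (x * x + y * y) ** 1.5
    return (-mass * x / r3, -mass * y / r3)

def energy(pos, vel, mass=1.0):
    """Specific energy v^2/2 + Phi, conserved along exact solutions of (9.16)."""
    r = math.hypot(pos[0], pos[1])
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) - mass / r

def leapfrog(pos, vel, dt, steps, mass=1.0):
    """Integrate d^2 x^i / dt^2 = -dPhi/dx^i with the kick-drift-kick scheme."""
    ax, ay = acceleration(pos, mass)
    for _ in range(steps):
        # Half kick: advance the velocity by half a time step.
        vel = (vel[0] + 0.5 * dt * ax, vel[1] + 0.5 * dt * ay)
        # Drift: advance the position by a full time step.
        pos = (pos[0] + dt * vel[0], pos[1] + dt * vel[1])
        # Second half kick with the acceleration at the new position.
        ax, ay = acceleration(pos, mass)
        vel = (vel[0] + 0.5 * dt * ax, vel[1] + 0.5 * dt * ay)
    return pos, vel

# Circular orbit of radius 1 around a unit mass: speed 1, period 2*pi.
pos0, vel0 = (1.0, 0.0), (0.0, 1.0)
steps = 10000
pos1, vel1 = leapfrog(pos0, vel0, 2 * math.pi / steps, steps)
print("energy drift:", energy(pos1, vel1) - energy(pos0, vel0))
print("position after one period:", pos1)
```

After one period the particle should return close to its starting point with nearly unchanged energy, which is a simple sanity check on both the integrator and the sign conventions in (9.16).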

9.5 Galilean coordinate system


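The features of Galilean coordinate systems are
$$ x^0(\mathcal{P}) = t(\mathcal{P}), \qquad \frac{\partial}{\partial x^i} \cdot \frac{\partial}{\partial x^j} = \delta_{ij}, \tag{9.17} $$
$$ \Gamma^j{}_{00} = \Phi_{,j} \ \text{for some scalar field } \Phi, \text{ and all other components vanish.} \tag{9.18} $$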

Consider the following coordinate transformation:



1. $x'^0 = x^0 = t$: both time coordinates must be universal time;

2. at fixed $t$, both sets of space coordinates must be Euclidean, so they must be related by a rotation and a translation,
$$ x'^i = A_{ij}(t)\, x^j + a'^i(t). \tag{9.19} $$

We can derive that
$$ \Gamma'^i{}_{0j} = \Gamma'^i{}_{j0} = A_{il}\dot{A}_{jl}, \tag{9.20} $$
$$ \Gamma'^i{}_{00} = \frac{\partial\Phi}{\partial x'^i} + A_{ik}\left(\ddot{A}_{lk}\, x'^l - \ddot{a}_k\right), \quad \text{where } a_k \equiv A_{jk}\, a'^j, \tag{9.21} $$
and all other components of the Christoffel symbols vanish. Therefore, the new coordinates have the standard Galilean form if and only if
$$ \dot{A}_{ij} = 0, \qquad \Phi' = \Phi - \ddot{a}_i x^i + \text{constant}. \tag{9.22} $$

Were all the matter in the universe concentrated in a finite region of space and surrounded by emptiness ("island universe"), then one could impose the global boundary condition $\Phi \to 0$ as $r \equiv (x^i x^i)^{1/2} \to \infty$. This would single out a subclass of Galilean coordinates ("absolute" Galilean coordinates), with a unique, common Newtonian potential. The transformation from one absolute Galilean coordinate system to any other is called a Galilean transformation.
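The connection coefficients (9.20) and (9.21) can be recovered directly from the equation of motion (9.16). The following sketch assumes that $A_{ij}$ is orthogonal, $A_{ik}A_{jk} = \delta_{ij}$, and uses the inverse transformation $x^k = A_{lk}x'^l - a_k$ with $a_k \equiv A_{jk}a'^j$; dots denote $d/dt$.

```latex
\begin{align*}
\ddot{x}^k &= \ddot{A}_{lk}\,x'^l + 2\dot{A}_{lk}\,\dot{x}'^l + A_{lk}\,\ddot{x}'^l - \ddot{a}_k, \\
A_{ik}\ddot{x}^k &= \ddot{x}'^i + 2A_{ik}\dot{A}_{lk}\,\dot{x}'^l + A_{ik}\left(\ddot{A}_{lk}\,x'^l - \ddot{a}_k\right), \\
A_{ik}\ddot{x}^k &= -A_{ik}\frac{\partial\Phi}{\partial x^k} = -\frac{\partial\Phi}{\partial x'^i}, \\
0 &= \ddot{x}'^i
   + \underbrace{\frac{\partial\Phi}{\partial x'^i} + A_{ik}\left(\ddot{A}_{lk}\,x'^l - \ddot{a}_k\right)}_{\Gamma'^i{}_{00}}
   + 2\underbrace{A_{ik}\dot{A}_{lk}}_{\Gamma'^i{}_{0l}}\,\dot{x}'^l .
\end{align*}
```

Comparing the last line with the geodesic equation $\ddot{x}'^i + \Gamma'^i{}_{00} + 2\Gamma'^i{}_{0l}\dot{x}'^l = 0$ (taking $\lambda = t$) gives the stated components.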

9.6 Coordinate transformation in space


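We now consider a coordinate transformation of a Galilean coordinate system purely in space, without any time dependence,
$$ x'^i = x'^i(x^j), \qquad t' = t. \tag{9.23} $$
The components of the connection in the new coordinate system are
$$ \Gamma'^i{}_{00} = \frac{\partial x'^i}{\partial x^j}\,\Gamma^j{}_{00}, \qquad \Gamma'^i{}_{jk} = \frac{\partial^2 x^p}{\partial x'^j \partial x'^k}\frac{\partial x'^i}{\partial x^p}. \tag{9.24} $$
The equation of motion in these coordinates is
$$ \frac{d^2 t'}{d\lambda^2} = 0, \qquad \frac{d^2 x'^i}{d\lambda^2} + \Gamma'^i{}_{00}\frac{dt'}{d\lambda}\frac{dt'}{d\lambda} + \Gamma'^i{}_{jk}\frac{dx'^j}{d\lambda}\frac{dx'^k}{d\lambda} = 0, \tag{9.25} $$
or compactly,
$$ \frac{d^2 x'^i}{dt^2} + \Gamma'^i{}_{00} + \Gamma'^i{}_{jk}\frac{dx'^j}{dt}\frac{dx'^k}{dt} = 0. \tag{9.26} $$
We can derive that
$$ \Gamma'^i{}_{jk} = \frac{1}{2}\, g'^{il}\left(\frac{\partial g'_{lj}}{\partial x'^k} + \frac{\partial g'_{lk}}{\partial x'^j} - \frac{\partial g'_{jk}}{\partial x'^l}\right), \qquad \Gamma'^i{}_{00} = g'^{ij}\frac{\partial\Phi}{\partial x'^j}, \tag{9.27} $$
where $g'$ is the spatial metric in the new coordinate system.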
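As an illustration of the first formula in (9.27), the spatial connection can be computed numerically from the metric. The sketch below is a hedged example rather than part of the notes: it assumes plane polar coordinates $(r, \theta)$ with $g' = \mathrm{diag}(1, r^2)$, approximates metric derivatives by central finite differences, and uses hypothetical helper names.

```python
def metric_polar(q):
    """Spatial metric g'_{ij} of the Euclidean plane in polar coordinates q = (r, theta)."""
    r, _ = q
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def metric_derivative(metric, q, k, eps=1e-5):
    """Central finite difference of the metric with respect to coordinate k."""
    qp, qm = list(q), list(q)
    qp[k] += eps
    qm[k] -= eps
    gp, gm = metric(qp), metric(qm)
    n = len(q)
    return [[(gp[i][j] - gm[i][j]) / (2 * eps) for j in range(n)] for i in range(n)]

def christoffel(metric, q):
    """Return gamma[i][j][k] = 1/2 g^{il} (g_{lj,k} + g_{lk,j} - g_{jk,l})."""
    n = len(q)
    ginv = inverse_2x2(metric(q))
    dg = [metric_derivative(metric, q, k) for k in range(n)]  # dg[k][i][j] = g_{ij,k}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                gamma[i][j][k] = 0.5 * sum(
                    ginv[i][l] * (dg[k][l][j] + dg[j][l][k] - dg[l][j][k])
                    for l in range(n)
                )
    return gamma

gamma = christoffel(metric_polar, (2.0, 0.3))
print("Gamma^r_{theta theta} =", gamma[0][1][1])   # analytic value: -r
print("Gamma^theta_{r theta} =", gamma[1][0][1])   # analytic value: 1/r
```

The two printed components can be compared with the familiar analytic values $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = 1/r$, which are the terms responsible for the centrifugal and Coriolis-like contributions in (9.26).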


Chapter 10
More on the Geometry of Spacetime

10.1 Hodge dual operator


Definition 10.1 Hodge dual operator

The Hodge star operator on a vector space $V$ with a non-degenerate symmetric bilinear form (herein referred to as the inner product) is a linear operator on the exterior algebra of $V$, mapping $k$-vectors to $(n-k)$-vectors, where $n = \dim V$ and $0 \le k \le n$. It has the following property, which defines it completely: given two $k$-vectors $\alpha$, $\beta$,
$$ \alpha \wedge (\star\beta) = \langle\alpha, \beta\rangle\,\omega, \tag{10.1} $$
where $\langle\cdot,\cdot\rangle$ denotes the inner product on $k$-vectors and $\omega$ is the preferred unit $n$-vector. The inner product $\langle\cdot,\cdot\rangle$ on $k$-vectors is extended from that on $V$ by requiring that
$$ \langle\alpha, \beta\rangle = \det\left[\langle\alpha_i, \beta_j\rangle\right] \tag{10.2} $$
for any decomposable $k$-vectors $\alpha = \alpha_1 \wedge \cdots \wedge \alpha_k$ and $\beta = \beta_1 \wedge \cdots \wedge \beta_k$. The unit $n$-vector $\omega$ is unique up to a sign. The preferred choice of $\omega$ defines an orientation on $V$.

Given an orthonormal basis $(e_1, \cdots, e_n)$ ordered such that $\omega = e_1 \wedge \cdots \wedge e_n$, we see that
$$ \star\,(e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k}) = e_{i_{k+1}} \wedge e_{i_{k+2}} \wedge \cdots \wedge e_{i_n}, \tag{10.3} $$
where $(i_1, i_2, \cdots, i_n)$ is an even permutation of $\{1, 2, \cdots, n\}$. Of these $n!/2$ relations, only $C_n^k = \binom{n}{k}$ are independent. The first one in the usual lexicographical order reads
$$ \star\,(e_1 \wedge e_2 \wedge \cdots \wedge e_k) = e_{k+1} \wedge e_{k+2} \wedge \cdots \wedge e_n. \tag{10.4} $$
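The rule (10.3) is easy to tabulate by machine. The sketch below is an illustration under stated assumptions: the inner product is taken to be positive definite (so no extra signs from the signature appear), basis $k$-vectors are labelled by increasing index tuples, and `hodge_basis` and `permutation_sign` are hypothetical helper names.

```python
from itertools import combinations

def permutation_sign(perm):
    """Sign of a permutation of distinct integers, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def hodge_basis(indices, n):
    """Return (sign, complement) such that *(e_{i1}^...^e_{ik}) = sign * e_{complement}.

    `indices` is an increasing tuple of basis labels from 1..n, and the
    complement is returned in increasing order.
    """
    complement = tuple(i for i in range(1, n + 1) if i not in indices)
    return permutation_sign(tuple(indices) + complement), complement

# Tabulate the Hodge dual of every basis 2-vector in three dimensions.
for k_vector in combinations(range(1, 4), 2):
    print(k_vector, "->", hodge_basis(k_vector, 3))
```

For $n = 3$ and $k = 2$ there are $C_3^2 = 3$ independent relations; for example $(1, 3, 2)$ is an odd permutation, so $\star(e_1 \wedge e_3) = -e_2$, consistent with $\alpha \wedge \star\alpha = \omega$ in (10.1).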

Definition 10.2 Levi-Civita tensor


$$ \varepsilon_{i_1 \cdots i_n} \equiv \sqrt{|g|}\,\epsilon_{i_1 \cdots i_n}, \tag{10.5} $$
where $\epsilon$ is the Levi-Civita symbol and $g$ is the determinant of the metric matrix.

Proposition 10.1
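$$ \varepsilon^{i_1 \cdots i_n} = g^{i_1 j_1} \cdots g^{i_n j_n}\, \varepsilon_{j_1 \cdots j_n} = \frac{\sqrt{|g|}}{g}\,\epsilon^{i_1 \cdots i_n} = \frac{\operatorname{sgn}(g)}{\sqrt{|g|}}\,\epsilon^{i_1 \cdots i_n}. \tag{10.6} $$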

Using tensor index notation, the Hodge dual is obtained by contracting the indices of a k-form
with the n-dimensional completely antisymmetric Levi-Civita tensor.

Proposition 10.2

$$ (\star\eta)_{i_1 \cdots i_{n-k}} = \frac{1}{(n-k)!}\,\eta^{j_1 \cdots j_k}\,\varepsilon_{j_1 \cdots j_k\, i_1 \cdots i_{n-k}}, \tag{10.7} $$
where $\eta$ is an arbitrary antisymmetric tensor in $k$ indices.

10.2 Metric-induced properties of Riemann curvature tensor


1. In an $n$-dimensional manifold with a torsion-free affine connection, the number of independent components of the Riemann tensor is
$$ \frac{n^3(n-1)}{2} - \frac{n^2(n-1)(n-2)}{6} = \frac{(n^2-1)n^2}{3}. \tag{10.8} $$
In an $n$-dimensional Riemannian manifold, the number of independent components of the Riemann tensor is
$$ \left[\frac{n(n-1)}{2}\right]^2 - \frac{n^2(n-1)(n-2)}{6} = \frac{(n^2-1)n^2}{12}. \tag{10.9} $$

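2. The double dual of the Riemann tensor is defined as
$$ \bar{G}^{\alpha\beta}{}_{\gamma\delta} \equiv \frac{1}{2}\,\varepsilon^{\alpha\beta\mu\nu} R_{\mu\nu}{}^{\rho\sigma}\,\frac{1}{2}\,\varepsilon_{\rho\sigma\gamma\delta} = -\frac{1}{4}\,\delta^{\alpha\beta\mu\nu}_{\rho\sigma\gamma\delta}\, R_{\mu\nu}{}^{\rho\sigma}. \tag{10.10} $$
It contains precisely the same amount of information as the Riemann tensor, and satisfies precisely the same set of symmetries.

3. The Einstein curvature tensor, which is symmetric, is defined as
$$ G^\beta{}_\delta \equiv \bar{G}^{\mu\beta}{}_{\mu\delta}. \tag{10.11} $$

4. The Bianchi identity takes a particularly simple form when rewritten in terms of the double dual:
$$ \bar{G}^{\alpha\beta\gamma\delta}{}_{;\delta} = 0, \tag{10.12} $$
and it has the obvious consequence
$$ G^{\beta\delta}{}_{;\delta} = 0. \tag{10.13} $$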

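5. The Ricci curvature tensor $R^\beta{}_\delta = R^{\mu\beta}{}_{\mu\delta}$, which is symmetric, and the curvature scalar $R = R^\beta{}_\beta$ are related to the Einstein tensor by
$$ G^\beta{}_\delta = R^\beta{}_\delta - \frac{1}{2}\,\delta^\beta{}_\delta\, R. \tag{10.14} $$

6. The Weyl conformal tensor (written here for four dimensions)
$$ C^{\alpha\beta}{}_{\gamma\delta} = R^{\alpha\beta}{}_{\gamma\delta} - 2\,\delta^{[\alpha}{}_{[\gamma} R^{\beta]}{}_{\delta]} + \frac{1}{3}\,\delta^{[\alpha}{}_{[\gamma}\,\delta^{\beta]}{}_{\delta]}\, R \tag{10.15} $$
possesses the same symmetries as the Riemann tensor. The Weyl tensor is completely "trace-free"; i.e., the contraction of $C_{\alpha\beta\gamma\delta}$ on any pair of slots vanishes. Thus, $C_{\alpha\beta\gamma\delta}$ can be regarded as the trace-free part of the Riemann tensor, and $R_{\alpha\beta}$ can be regarded as its trace. The Riemann tensor is determined entirely by its trace-free part $C_{\alpha\beta\gamma\delta}$ and its trace $R_{\alpha\beta}$.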
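The counting identities (10.8) and (10.9) from item 1 can be checked for small $n$ with a few lines of integer arithmetic. This is a hedged sketch: the function names are hypothetical, and the last assertion compares (10.9) with the commonly quoted count "symmetric matrix on index pairs minus $\binom{n}{4}$ cyclic constraints", which is not stated in the notes.

```python
from math import comb

def riemann_components_torsion_free(n):
    """Left-hand side of (10.8) for a torsion-free affine connection."""
    return n**3 * (n - 1) // 2 - n**2 * (n - 1) * (n - 2) // 6

def riemann_components_metric(n):
    """Left-hand side of (10.9) for a Riemannian manifold."""
    return (n * (n - 1) // 2) ** 2 - n**2 * (n - 1) * (n - 2) // 6

for n in range(1, 7):
    # Right-hand sides of (10.8) and (10.9).
    assert riemann_components_torsion_free(n) == (n**2 - 1) * n**2 // 3
    assert riemann_components_metric(n) == (n**2 - 1) * n**2 // 12
    # Alternative count: symmetric matrix on antisymmetric pairs minus C(n, 4).
    pairs = comb(n, 2)
    assert riemann_components_metric(n) == pairs * (pairs + 1) // 2 - comb(n, 4)
    print(n, riemann_components_torsion_free(n), riemann_components_metric(n))
```

For $n = 4$ this gives 80 components for a general torsion-free connection and 20 for a metric connection, the latter splitting into 10 for the Ricci tensor and 10 for the Weyl tensor.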

10.3 Killing vector


A vector field that satisfies the Killing equation
$$ \mathcal{L}_\xi\, g = 0 \tag{10.16} $$
is called a Killing vector field; it keeps the metric invariant and therefore corresponds to a space-time symmetry. A few lines of algebra show that a Killing vector field can equivalently be defined by
$$ \xi_{\mu;\nu} + \xi_{\nu;\mu} = 0. \tag{10.17} $$
Second derivatives of Killing vectors are related to the Riemann tensor by
$$ \xi^\rho{}_{;\sigma\mu} = R^\rho{}_{\sigma\mu\nu}\, \xi^\nu. \tag{10.18} $$
This shows that the values of $\xi^\mu$ and $\xi_{\mu;\nu}$ at a given point determine the Killing vector field uniquely. One must specify $N$ values for $\xi^\mu$ and $N(N-1)/2$ values for the antisymmetric $\xi_{\mu;\nu}$, so that there are at most $N + N(N-1)/2 = N(N+1)/2$ linearly independent Killing vector fields in $N$ dimensions. For $N = 4$, there are at most 10 Killing vectors, which is precisely the dimension of the Poincaré group of Minkowski space.
A space-time with the maximum number of Killing vector fields is called a maximally symmetric space-time. It can be shown from the Killing equations that the Riemann tensor must then satisfy
$$ R^\rho{}_{\sigma\mu\nu} = \frac{R}{N(N-1)}\left(\delta^\rho{}_\mu\, g_{\sigma\nu} - \delta^\rho{}_\nu\, g_{\sigma\mu}\right). \tag{10.19} $$
After contraction, we have
$$ R_{\sigma\nu} = \frac{R}{N}\, g_{\sigma\nu}. \tag{10.20} $$
Also, the Ricci scalar must be a constant. Maximally symmetric spaces are thus spaces of constant curvature. For $N = 4$, there are three maximally symmetric space-times: Minkowski, de Sitter and anti-de Sitter.

Now, consider a geodesic with tangent vector $u^\mu$. We have
$$ u^\nu (u^\mu \xi_\mu)_{;\nu} = 0. \tag{10.21} $$
Therefore, the quantity $u^\mu \xi_\mu$ is constant along the trajectory. Similarly, if $T^{\mu\nu}{}_{;\nu} = 0$ and $T^{\mu\nu} = T^{\nu\mu}$, we have
$$ (T^{\mu\nu} \xi_\nu)_{;\mu} = 0. \tag{10.22} $$
This relation has an important implication. Define $I^\mu \equiv T^{\mu\nu}\xi_\nu$. The Gauss-Stokes theorem implies that
$$ \oint_{\partial V} I^\mu n_\mu \sqrt{h}\, d^3y = 0. \tag{10.23} $$
Thus
$$ Q = \int_{\Sigma_t} I^\mu n_\mu \sqrt{h}\, d^3y \tag{10.24} $$
is a conserved charge.
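Both (10.21) and (10.22) follow in one line from the Killing equation (10.17). A short sketch, using only the geodesic equation $u^\nu u^\mu{}_{;\nu} = 0$, the symmetry and conservation of $T^{\mu\nu}$, and the antisymmetry of $\xi_{\mu;\nu}$:

```latex
\begin{align*}
u^\nu (u^\mu \xi_\mu)_{;\nu}
  &= \underbrace{u^\nu u^\mu{}_{;\nu}}_{=0}\,\xi_\mu + u^\mu u^\nu \xi_{\mu;\nu}
   = \tfrac{1}{2}\, u^\mu u^\nu \left(\xi_{\mu;\nu} + \xi_{\nu;\mu}\right) = 0, \\
(T^{\mu\nu}\xi_\nu)_{;\mu}
  &= \underbrace{T^{\mu\nu}{}_{;\mu}}_{=0}\,\xi_\nu + T^{\mu\nu}\xi_{\nu;\mu}
   = \tfrac{1}{2}\, T^{\mu\nu}\left(\xi_{\nu;\mu} + \xi_{\mu;\nu}\right) = 0 .
\end{align*}
```

In each case a symmetric object ($u^\mu u^\nu$ or $T^{\mu\nu}$) is contracted with the antisymmetric $\xi_{\mu;\nu}$, so the result vanishes.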

10.4 The coordinates of observer


The proper reference frame of an accelerated observer
1. Let $\tau$ be proper time as measured by the accelerated observer's clock. Let $\mathcal{P} = \mathcal{P}_0(\tau)$ be the observer's world line.
2. The observer carries with himself an orthonormal tetrad $\{e_{\hat\alpha}\}$ with
$$ e_{\hat 0} = u \equiv \frac{d\mathcal{P}_0}{d\tau} \quad\text{and}\quad e_{\hat\alpha} \cdot e_{\hat\beta} = \eta_{\hat\alpha\hat\beta}. \tag{10.25} $$
3. The tetrad changes from point to point along the observer's world line, relative to parallel transport:
$$ \nabla_u e_{\hat\alpha} = -\Omega \cdot e_{\hat\alpha}, \quad\text{where}\quad \Omega^{\mu\nu} = a^\mu u^\nu - u^\mu a^\nu + u_\alpha\, \omega_\beta\, \varepsilon^{\alpha\beta\mu\nu}. \tag{10.26} $$
Here $a \equiv \nabla_u u$ is the acceleration of the observer, $\omega$ is the angular velocity of rotation of the spatial tetrad vectors, and we have
$$ u \cdot a = u \cdot \omega = 0. \tag{10.27} $$
If $\omega$ were zero, the observer would be Fermi-Walker-transporting his tetrad (gyroscope-type transport). If both $a$ and $\omega$ were zero, he would be freely falling (geodesic motion) and would be parallel-transporting his tetrad.
4. The observer constructs his proper reference frame in a manner analogous to the Riemann-normal construction. From each event $\mathcal{P}_0(\tau)$ on his world line, he sends out purely spatial geodesics (geodesics orthogonal to $u$), with affine parameter equal to proper length,
$$ \mathcal{P} = \mathcal{G}[\tau, n, s], \tag{10.28} $$
where $\tau$ is proper time, telling the "starting point" of the geodesic; $n$ is the tangent vector to the geodesic at the starting point, telling "which" geodesic; and $s$ is proper length along the geodesic from the starting point, telling "where" on the geodesic. The tangent vector has unit length, because the chosen affine parameter is proper length.

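5. Each event near the observer's world line is intersected by precisely one of the geodesics $\mathcal{G}[\tau, n, s]$. Far away, this is not true; the geodesics may cross, either because of the observer's acceleration or because of the curvature of spacetime.

6. Pick an event $\mathcal{P}$ near the observer's world line. The geodesic through it originated on the observer's world line at a specific time $\tau$, had original direction $n = n^{\hat j} e_{\hat j}$, and needed to extend a distance $s$ before reaching $\mathcal{P}$. Hence, the four numbers
$$ (x^{\hat 0}, x^{\hat 1}, x^{\hat 2}, x^{\hat 3}) \equiv (\tau, s n^{\hat 1}, s n^{\hat 2}, s n^{\hat 3}) \tag{10.29} $$
are a natural way of identifying the event $\mathcal{P}$. These are the coordinates of $\mathcal{P}$ in the observer's proper reference frame.

Along the world line of the observer, we have
$$ g_{\hat\alpha\hat\beta,\hat 0} = 0, \quad g_{\hat j\hat k,\hat l} = 0, \quad g_{\hat 0\hat 0,\hat j} = -2a_{\hat j}, \quad g_{\hat 0\hat j,\hat k} = -\epsilon_{\hat j\hat k\hat l}\,\omega^{\hat l}; \tag{10.30} $$
$$ \Gamma^{\hat 0}{}_{\hat j\hat 0} = \Gamma^{\hat j}{}_{\hat 0\hat 0} = a^{\hat j}, \quad \Gamma^{\hat j}{}_{\hat k\hat 0} = -\omega^{\hat i}\epsilon_{\hat i\hat j\hat k}, \quad \Gamma^{\hat\alpha}{}_{\hat j\hat k} = 0. \tag{10.31} $$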

Fermi normal coordinates


We can introduce coordinates $x^\alpha = (t, x^a)$ such that, near a geodesic $\gamma$, the metric can be expressed as
$$ \begin{aligned}
g_{tt} &= -1 - R_{tatb}(t)\, x^a x^b + O(x^3), \\
g_{ta} &= -\frac{2}{3} R_{tbac}(t)\, x^b x^c + O(x^3), \\
g_{ab} &= \delta_{ab} - \frac{1}{3} R_{acbd}(t)\, x^c x^d + O(x^3).
\end{aligned} \tag{10.32} $$
These coordinates are known as Fermi normal coordinates, and $t$ is proper time along the geodesic $\gamma$, on which the spatial coordinates $x^a$ are all zero. The components of the Riemann tensor here are evaluated on $\gamma$, and they depend on $t$ only. Since the expansion contains no terms linear in $x^a$, it implies $g_{\alpha\beta,\gamma} = 0$ and $\Gamma^\mu{}_{\alpha\beta} = 0$ on $\gamma$. Local flatness therefore holds everywhere on the geodesic. The proof can be found in section 1.11 of A Relativist's Toolkit (Eric Poisson).

10.5 Hypersurfaces
10.5.1 Description of hypersurfaces

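Note: We only discuss timelike and spacelike hypersurfaces in this section.

Normal vector

The unit normal $n^\alpha$ of the hypersurface $\Sigma$ is normalized as
$$ n^\alpha n_\alpha = \epsilon \equiv \begin{cases} -1 & \text{if } \Sigma \text{ is spacelike} \\ +1 & \text{if } \Sigma \text{ is timelike} \end{cases} \tag{10.33} $$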

Induced metric

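Suppose that the hypersurface is parametrized by the equations $x^\alpha = x^\alpha(y^a)$. Then
$$ e^\alpha_a = \frac{\partial x^\alpha}{\partial y^a}. \tag{10.34} $$
For displacements within $\Sigma$, we have
$$ ds^2_\Sigma = g_{\alpha\beta}\, dx^\alpha dx^\beta = g_{\alpha\beta}\left(\frac{\partial x^\alpha}{\partial y^a} dy^a\right)\left(\frac{\partial x^\beta}{\partial y^b} dy^b\right) = h_{ab}\, dy^a dy^b, \tag{10.35} $$
where $h_{ab} \equiv g_{\alpha\beta}\, e^\alpha_a e^\beta_b$. The completeness relation can be written as
$$ g^{\alpha\beta} = \epsilon\, n^\alpha n^\beta + h^{ab} e^\alpha_a e^\beta_b. \tag{10.36} $$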

10.5.2 Integration on hypersurfaces


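The positive volume element of the whole spacetime is $dx^0 \wedge \cdots \wedge dx^{m-1}$, and the positive volume element of the hypersurface is $dy^1 \wedge \cdots \wedge dy^{m-1}$. Suppose that the coordinates on the hypersurface are compatible with the coordinates of the whole spacetime, which means that $-dy^{\rm in} \wedge dy^1 \wedge \cdots \wedge dy^{m-1}$ has the same orientation as $dx^0 \wedge \cdots \wedge dx^{m-1}$. Then we have
$$ \epsilon_{\alpha_x \alpha_1 \cdots \alpha_{m-1}}\, \frac{\partial x^{\alpha_x}}{\partial y^{\rm in}}\, e^{\alpha_1}_1 \cdots e^{\alpha_{m-1}}_{m-1} < 0. \tag{10.37} $$
If we demand that the direction of $n^\alpha$ is opposite to that of $\partial x^\alpha / \partial y^{\rm in}$, then we have
$$ \epsilon_{\alpha_x \alpha_1 \cdots \alpha_{m-1}}\, n^{\alpha_x}\, e^{\alpha_1}_1 \cdots e^{\alpha_{m-1}}_{m-1} > 0. \tag{10.38} $$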

Surface element

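We define the surface element of a hypersurface as
$$ d\Sigma_\mu = \varepsilon_{\mu\alpha\beta\gamma}\, e^\alpha_1 e^\beta_2 e^\gamma_3\, dy^1 \wedge dy^2 \wedge dy^3. \tag{10.39} $$
It is easy to verify that
$$ f^*\!\left(\sqrt{-g}\, dx^1 \wedge dx^2 \wedge dx^3\right) = d\Sigma_0, \qquad f^*\!\left(-\sqrt{-g}\, dx^0 \wedge dx^2 \wedge dx^3\right) = d\Sigma_1, \tag{10.40} $$
and so on. We can demonstrate that
$$ d\Sigma_\mu = \epsilon\, n_\mu \sqrt{|h|}\, dy^1 \wedge dy^2 \wedge dy^3. \tag{10.41} $$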

Element of two-surface

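Within the hypersurface $\Sigma$, we can define a two-surface $S$, which is parametrized by $y^a = y^a(\theta^A)$. Let $r^a$ denote the unit normal to $S$ within $\Sigma$, with $r^a r_a = \epsilon_r$ and $r^\alpha \equiv r^a e^\alpha_a$. We have
$$ e^a_A = \frac{\partial y^a}{\partial\theta^A}, \qquad e^\alpha_A = \frac{\partial x^\alpha}{\partial\theta^A} = e^\alpha_a e^a_A; \tag{10.42} $$
$$ \sigma_{AB} = h_{ab}\, e^a_A e^b_B = g_{\alpha\beta}\, e^\alpha_A e^\beta_B; \tag{10.43} $$
$$ h^{ab} = \epsilon_r\, r^a r^b + \sigma^{AB} e^a_A e^b_B; \tag{10.44} $$
$$ g^{\alpha\beta} = \epsilon_n\, n^\alpha n^\beta + \epsilon_r\, r^\alpha r^\beta + \sigma^{AB} e^\alpha_A e^\beta_B. \tag{10.45} $$
If we demand that the direction of $r^a$ is opposite to that of $\partial y^a / \partial\theta^{\rm in}$, then the condition of compatibility can be written as
$$ \varepsilon_{\mu\nu\beta\gamma}\, n^\mu r^\nu e^\beta_2 e^\gamma_3 > 0. \tag{10.46} $$
We define the surface element of a two-surface as
$$ dS_{\mu\nu} = \varepsilon_{\mu\nu\beta\gamma}\, e^\beta_2 e^\gamma_3\, d\theta^2 \wedge d\theta^3. \tag{10.47} $$
It is easy to verify that
$$ f^*\!\left(\sqrt{-g}\, dx^2 \wedge dx^3\right) = dS_{01}, \tag{10.48} $$
and so on. We can demonstrate that
$$ dS_{\alpha\beta} = \epsilon_n \epsilon_r\, (n_\alpha r_\beta - n_\beta r_\alpha) \sqrt{|\sigma|}\, d\theta^2 \wedge d\theta^3. \tag{10.49} $$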

Gauss-Stokes theorem

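Linear algebra tells us that
$$ \frac{\partial \det A}{\partial A_{ab}} = (\det A)\, \left(A^{-1}\right)_{ba}. \tag{10.50} $$
Then we can deduce that
$$ \frac{\partial g}{\partial x^\mu} = g\, g^{\alpha\beta} g_{\alpha\beta,\mu}. \tag{10.51} $$
If $A^\mu$ is a vector, we have
$$ A^\mu{}_{;\mu} \sqrt{-g} = \left(A^\mu \sqrt{-g}\right)_{,\mu}. \tag{10.52} $$
If $B^{\mu\nu}$ is an antisymmetric tensor, we have
$$ B^{\mu\nu}{}_{;\nu} \sqrt{-g} = \left(B^{\mu\nu} \sqrt{-g}\right)_{,\nu}. \tag{10.53} $$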
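The step from (10.51) to (10.52) can be filled in as follows. This is a sketch that assumes the Levi-Civita connection, for which the contracted Christoffel symbol is $\Gamma^\mu{}_{\mu\nu} = \tfrac{1}{2} g^{\alpha\beta} g_{\alpha\beta,\nu}$:

```latex
\begin{align*}
\Gamma^\mu{}_{\mu\nu} &= \frac{1}{2}\, g^{\alpha\beta} g_{\alpha\beta,\nu}
  = \frac{1}{2g}\frac{\partial g}{\partial x^\nu}
  = \frac{(\sqrt{-g})_{,\nu}}{\sqrt{-g}}, \\
A^\mu{}_{;\mu}\sqrt{-g} &= \left(A^\mu{}_{,\mu} + \Gamma^\mu{}_{\mu\nu}A^\nu\right)\sqrt{-g}
  = A^\mu{}_{,\mu}\sqrt{-g} + A^\nu (\sqrt{-g})_{,\nu}
  = \left(A^\mu\sqrt{-g}\right)_{,\mu}.
\end{align*}
```

For an antisymmetric $B^{\mu\nu}$ the divergence contains the additional term $\Gamma^\mu{}_{\alpha\nu}B^{\alpha\nu}$, which vanishes because the connection is symmetric in its lower indices; the same argument then gives (10.53).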

Apply the general Stokes theorem in differential geometry,
$$ \int_M d\omega = \int_{\partial M} i^*\omega, \tag{10.54} $$
and we can derive the following theorem:



Theorem 10.1 Gauss-Stokes theorem

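$$ \int_V A^\alpha{}_{;\alpha} \sqrt{-g}\, d^4x = \oint_{\partial V} A^\alpha\, d\Sigma_\alpha = \oint_{\partial V} \epsilon\, A^\alpha n_\alpha \sqrt{|h|}\, d^3y. \tag{10.55} $$
$$ \int_\Sigma B^{\alpha\beta}{}_{;\beta}\, d\Sigma_\alpha = \frac{1}{2} \oint_{\partial\Sigma} B^{\alpha\beta}\, dS_{\alpha\beta} = \oint_{\partial\Sigma} \epsilon_n \epsilon_r\, B^{\alpha\beta} n_\alpha r_\beta \sqrt{|\sigma|}\, d^2\theta. \tag{10.56} $$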

10.5.3 Differentiation of tangent vector fields


Tangent tensor field
Given a hypersurface $\Sigma$, tensor fields $A^{\alpha\beta\cdots}$ that are defined only on $\Sigma$ and that are purely tangent to the hypersurface admit the following decomposition:
$$ A^{\alpha\beta\cdots} = A^{ab\cdots}\, e^\alpha_a e^\beta_b \cdots. \tag{10.57} $$
We can show that
$$ A_{\alpha\beta\cdots}\, e^\alpha_a e^\beta_b \cdots = A_{ab\cdots} = h_{am} h_{bn} \cdots A^{mn\cdots}. \tag{10.58} $$

Intrinsic covariant derivative


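Suppose $A_a$ is a vector field on the hypersurface. We define the covariant derivative of $A_a$ as
$$ A_{a|b} \equiv \frac{DA_\alpha}{Dy^b}\, e^\alpha_a = A_{\alpha;\beta}\, e^\alpha_a e^\beta_b, \tag{10.59} $$
where $A^\alpha = A^a e^\alpha_a$. We can demonstrate that
$$ A_{a|b} = A_{a,b} - \Gamma^c{}_{ab} A_c, \tag{10.60} $$
where the connection $\Gamma^c{}_{ab}$ is compatible with $h_{ab}$.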

Extrinsic curvature
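The extrinsic curvature of the hypersurface is defined as
$$ K_{ab} \equiv \frac{Dn_\alpha}{Dy^b}\, e^\alpha_a = n_{\alpha;\beta}\, e^\alpha_a e^\beta_b. \tag{10.61} $$
In terms of this, we have
$$ A^\alpha{}_{;\beta}\, e^\beta_b = A^a{}_{|b}\, e^\alpha_a - \epsilon\, A^a K_{ab}\, n^\alpha. \tag{10.62} $$
If $e^\alpha_a$ is substituted in place of $A^\alpha$, we obtain
$$ e^\alpha_{a;\beta}\, e^\beta_b = \Gamma^c{}_{ab}\, e^\alpha_c - \epsilon\, K_{ab}\, n^\alpha. \tag{10.63} $$
This is known as the Gauss-Weingarten equation. We can prove that $K_{ab}$ is a symmetric tensor. Thus, we have
$$ K_{ab} = \frac{1}{2} \left(\mathcal{L}_n g_{\alpha\beta}\right) e^\alpha_a e^\beta_b. \tag{10.64} $$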

We also note the notation
$$ K \equiv h^{ab} K_{ab} = n^\alpha{}_{;\alpha}. \tag{10.65} $$
Finally, we have the following Gauss-Codazzi theorem, which relates the intrinsic and extrinsic curvature of the hypersurface. The proof can be found in section 3.5 of A Relativist's Toolkit: The Mathematics of Black-Hole Mechanics (Eric Poisson).

Theorem 10.2 Gauss-Codazzi theorem

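$$ R^\mu{}_{\alpha\beta\gamma}\, e^\alpha_a e^\beta_b e^\gamma_c = {}^{(3)}R^m{}_{abc}\, e^\mu_m + \epsilon\,(K_{ab|c} - K_{ac|b})\, n^\mu + \epsilon\, K_{ab}\, n^\mu{}_{;\gamma}\, e^\gamma_c - \epsilon\, K_{ac}\, n^\mu{}_{;\beta}\, e^\beta_b. \tag{10.66} $$
$$ -2\epsilon\, G_{\alpha\beta}\, n^\alpha n^\beta = {}^{3}R + \epsilon\,(K^{ab}K_{ab} - K^2), \qquad G_{\alpha\beta}\, e^\alpha_a n^\beta = K^b{}_{a|b} - K_{,a}. \tag{10.67} $$
$$ R = {}^{3}R + \epsilon\,(K^2 - K^{ab}K_{ab}) + 2\epsilon\left(n^\alpha{}_{;\beta}\, n^\beta - n^\alpha n^\beta{}_{;\beta}\right)_{;\alpha}. \tag{10.68} $$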
Chapter 11
Formulation of General Relativity

Give the fields that generate mass-energy, and their time-rates of change, and give 3-geometry
of space and its time-rate of change, all at one time, and solve for the 4-geometry of spacetime
at that one time. Four of the ten components of Einstein’s law connect the curvature of space
here and now with the distribution of mass-energy here and now, and the other six equations
tell how the geometry as thus determined then proceeds to evolve.

11.1 Basic assumptions of general relativity


1. Spacetime is a four-dimensional pseudo-Riemannian manifold.
2. The metric of the manifold is governed by Einstein's equations
$$ G = 8\pi T. \tag{11.1} $$

3. All special relativistic laws of physics are valid in local Lorentz frames of the metric.

11.2 Lagrangian formulation


11.2.1 Mechanics
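Given the action
$$ S[x] = \int_{\tau_1}^{\tau_2} L\left(x^\alpha, \frac{dx^\alpha}{d\tau}\right) d\tau \tag{11.2} $$
and boundary conditions
$$ \delta x^\alpha(\tau_1) = 0, \qquad \delta x^\alpha(\tau_2) = 0, \tag{11.3} $$
the variational principle gives the equation of motion of the test particle as
$$ \frac{d}{d\tau}\frac{\partial L}{\partial u^\alpha} - \frac{\partial L}{\partial x^\alpha} = 0, \qquad u^\alpha \equiv \frac{dx^\alpha}{d\tau}. \tag{11.4} $$
Example: The Lagrangian for a charged particle in spacetime is
$$ L = -m\left(-g_{\mu\nu} u^\mu u^\nu\right)^{1/2} + e A_\mu u^\mu. \tag{11.5} $$
The resulting equation of motion is
$$ m a_\alpha = e F_{\alpha\mu} u^\mu, \quad\text{where}\quad a_\alpha = \frac{du_\alpha}{d\tau} - \frac{1}{2}\, g_{\mu\nu,\alpha}\, u^\mu u^\nu \quad\text{and}\quad F_{\alpha\mu} = A_{\mu;\alpha} - A_{\alpha;\mu}. \tag{11.6} $$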

11.2.2 Field Theory


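Given the action
$$ S[\phi] = \int_V L(\phi, \phi_{;\alpha}) \sqrt{-g}\, d^4x \tag{11.7} $$
and boundary conditions
$$ \delta\phi\,\big|_{\partial V} = 0, \tag{11.8} $$
the variational principle gives the field equation as
$$ \left(\frac{\partial L}{\partial \phi_{;\alpha}}\right)_{;\alpha} - \frac{\partial L}{\partial\phi} = 0. \tag{11.9} $$
Example: The Lagrangian density for the electromagnetic field in spacetime is
$$ L = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + A_\mu j^\mu. \tag{11.10} $$
The resulting equation of motion is
$$ F^{\mu\nu}{}_{;\nu} = j^\mu. \tag{11.11} $$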

11.2.3 General relativity


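The action of general relativity is composed of the Hilbert term, the boundary term, the nondynamical term and the matter term:
$$ S_H[g] = \frac{1}{16\pi} \int_V R \sqrt{-g}\, d^4x; \tag{11.12a} $$
$$ S_B[g] = \frac{1}{8\pi} \oint_{\partial V} \epsilon K \sqrt{|h|}\, d^3y; \tag{11.12b} $$
$$ S_0 = \frac{1}{8\pi} \oint_{\partial V} \epsilon K_0 \sqrt{|h|}\, d^3y; \tag{11.12c} $$
$$ S_M[\phi; g] = \int_V L(\phi, \phi_{;\alpha}; g_{\alpha\beta}) \sqrt{-g}\, d^4x. \tag{11.12d} $$
Here $R$ is the curvature scalar of the spacetime, $K$ is the extrinsic curvature scalar of $\partial V$ and $K_0$ is the extrinsic curvature scalar of $\partial V$ when embedded in flat spacetime. The variation of the Hilbert term is given by
$$ (16\pi)\, \delta S_H = \int_V G_{\alpha\beta}\, \delta g^{\alpha\beta} \sqrt{-g}\, d^4x - \oint_{\partial V} \epsilon\, h^{\alpha\beta}\, \delta g_{\alpha\beta,\mu}\, n^\mu \sqrt{|h|}\, d^3y. \tag{11.13} $$
The variation of the boundary term is given by
$$ (16\pi)\, \delta S_B = \oint_{\partial V} \epsilon\, h^{\alpha\beta}\, \delta g_{\alpha\beta,\mu}\, n^\mu \sqrt{|h|}\, d^3y. \tag{11.14} $$
The variation of the nondynamical term is zero; it is used to remove the divergence of the boundary term. The variation of the matter term is
$$ \delta S_M = \int_V \left(\frac{\partial L}{\partial g^{\alpha\beta}} - \frac{1}{2} L\, g_{\alpha\beta}\right) \delta g^{\alpha\beta} \sqrt{-g}\, d^4x. \tag{11.15} $$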

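Define
$$ T_{\alpha\beta} \equiv -2\, \frac{\partial L}{\partial g^{\alpha\beta}} + L\, g_{\alpha\beta}. \tag{11.16} $$
The variational principle then leads to Einstein's equations
$$ G_{\alpha\beta} = 8\pi T_{\alpha\beta}. \tag{11.17} $$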
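To make the last step explicit, the variations (11.13) to (11.15) can be added together. This is a sketch that assumes the total action is $S = S_H + S_B - S_0 + S_M$ (the sign of $S_0$ is an assumption here and does not affect the result, since $\delta S_0 = 0$); the boundary integrals in (11.13) and (11.14) cancel, and (11.16) is used to trade the matter variation for $T_{\alpha\beta}$.

```latex
\begin{align*}
\delta S &= \frac{1}{16\pi}\int_V G_{\alpha\beta}\,\delta g^{\alpha\beta}\sqrt{-g}\,d^4x
  + \int_V\left(\frac{\partial L}{\partial g^{\alpha\beta}} - \frac{1}{2}L\, g_{\alpha\beta}\right)\delta g^{\alpha\beta}\sqrt{-g}\,d^4x \\
 &= \frac{1}{16\pi}\int_V\left(G_{\alpha\beta} - 8\pi T_{\alpha\beta}\right)\delta g^{\alpha\beta}\sqrt{-g}\,d^4x = 0
 \quad\Longrightarrow\quad G_{\alpha\beta} = 8\pi T_{\alpha\beta}.
\end{align*}
```

The role of the boundary term $S_B$ is thus to cancel the derivative-of-variation term in $\delta S_H$, so that fixing $\delta g_{\alpha\beta}$ on $\partial V$ is sufficient.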

11.3 Hamiltonian formulation


11.3.1 3+1 decomposition

Figure 11.1: Foliation of spacetime.

The spacetime can be foliated by spacelike hypersurfaces $\Sigma_t$ that are described by a scalar function $t(x^\alpha)$, as shown in Figure 11.1. Here $t$ is a single-valued function, and the unit normal to the hypersurfaces, $n_\alpha \propto \partial_\alpha t$, is a future-directed timelike vector field.
Consider a congruence of curves $\gamma$ intersecting $\Sigma_t$. We use $t$ as a parameter on the curves, and the vector $t^\alpha$ is tangent to the congruence ($t^\alpha \partial_\alpha t = 1$). Install coordinates $y^a$ on $\Sigma_t$ and impose $y^a(P'') = y^a(P') = y^a(P)$, so $y^a$ is held constant on each member of the congruence. This construction defines a coordinate system $(t, y^a)$ in $V$. Base vectors of the frame $(t, y^a)$ are given by
$$ t^\alpha = \left(\frac{\partial x^\alpha}{\partial t}\right)_{y^a}, \qquad e^\alpha_a = \left(\frac{\partial x^\alpha}{\partial y^a}\right)_t. \tag{11.18} $$
The normal vector of the hypersurface is given through
$$ n_\alpha = -N \partial_\alpha t \quad\text{and}\quad n^\alpha n_\alpha = -1, \tag{11.19} $$
and we have $n_\alpha e^\alpha_a = 0$. The base vector $t^\alpha$ can be decomposed as
$$ t^\alpha = N n^\alpha + N^a e^\alpha_a, \tag{11.20} $$
where $N$ is the lapse and $N^a$ is the shift. The metric of the spacetime is
$$ \begin{aligned}
g_{\alpha\beta}\, dx^\alpha dx^\beta &= g_{\alpha\beta}\, (t^\alpha dt + e^\alpha_a dy^a)(t^\beta dt + e^\beta_b dy^b) \\
&= -N^2 dt^2 + h_{ab}\, (dy^a + N^a dt)(dy^b + N^b dt).
\end{aligned} \tag{11.21} $$
The determinant of the metric matrix in $(t, y^a)$ coordinates is $-N^2 h$, so that $\sqrt{-g} = N\sqrt{h}$.
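The determinant statement can be checked numerically by assembling the components of (11.21), namely $g_{tt} = -N^2 + h_{ab}N^aN^b$, $g_{ta} = h_{ab}N^b$ and $g_{ab} = h_{ab}$. The sketch below uses arbitrary illustrative values for the lapse, shift and spatial metric; the helper names are hypothetical.

```python
def determinant(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total

def adm_metric(lapse, shift, h):
    """Components of the metric (11.21) in coordinates (t, y^a)."""
    # Lower the shift index with the spatial metric: N_a = h_ab N^b.
    shift_low = [sum(h[a][b] * shift[b] for b in range(3)) for a in range(3)]
    g_tt = -lapse ** 2 + sum(shift_low[a] * shift[a] for a in range(3))
    g = [[g_tt] + shift_low]
    for a in range(3):
        g.append([shift_low[a]] + list(h[a]))
    return g

# Arbitrary sample values, chosen only for illustration.
lapse = 1.3
shift = [0.2, -0.1, 0.4]
h = [[2.0, 0.3, 0.0],
     [0.3, 1.5, 0.1],
     [0.0, 0.1, 1.2]]

g = adm_metric(lapse, shift, h)
print("det g     =", determinant(g))
print("-N^2 det h =", -lapse ** 2 * determinant(h))
```

The two printed numbers agree up to rounding for any choice of shift, reflecting the fact that the shift only mixes rows and columns without changing the determinant.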



11.3.2 Field theory


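The canonical momentum of the field coordinate $\phi$ is
$$ \pi = \frac{\partial(\sqrt{-g}\, L)}{\partial\dot\phi}, \quad\text{where}\quad \dot\phi = \frac{\partial\phi}{\partial t}. \tag{11.22} $$
The Hamiltonian density of the field is
$$ \mathcal{H}(\phi, \phi_{,a}, \pi) = \pi\dot\phi - \sqrt{-g}\, L. \tag{11.23} $$
The total Hamiltonian on the hypersurface $\Sigma_t$ is given by the integral
$$ H = \int_{\Sigma_t} \mathcal{H}(\phi, \phi_{,a}, \pi)\, d^3y. \tag{11.24} $$
Extremizing the action
$$ S = \int_{t_1}^{t_2} dt \int_{\Sigma_t} (\pi\dot\phi - \mathcal{H})\, d^3y, \tag{11.25} $$
we obtain Hamilton's equations,
$$ \dot\pi = -\frac{\partial\mathcal{H}}{\partial\phi} + \frac{\partial}{\partial y^a}\left(\frac{\partial\mathcal{H}}{\partial\phi_{,a}}\right), \qquad \dot\phi = \frac{\partial\mathcal{H}}{\partial\pi}. \tag{11.26} $$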

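Example: For the electromagnetic field in 3+1 form, we define the electric and magnetic fields by
$$ E_a \equiv F_{\alpha\beta}\, e^\alpha_a n^\beta, \qquad \epsilon_{abc} B^c \equiv F_{\alpha\beta}\, e^\alpha_a e^\beta_b. \tag{11.27} $$
With this definition, the equation of motion of a particle in an electromagnetic field can be written as
$$ m a_a = \gamma e \left[N E_a + \epsilon_{abc}\, (v^b + N^b) B^c\right], \tag{11.28} $$
where $a_a \equiv a_\alpha e^\alpha_a$, $\gamma \equiv [N^2 - (N^b + v^b)(N_b + v_b)]^{-1/2}$ and $v^a \equiv dy^a/dt$. If we adopt the coordinates $(t, y^a)$ from the start, it is easy to verify that
$$ E^a \equiv h^{ab} E_b = N F^{0a}, \qquad B^a = \frac{1}{2}\, \varepsilon^{abc} F_{bc}. \tag{11.29} $$
We further define the densitized fields
$$ \mathcal{E}^a \equiv \sqrt{h}\, E^a, \quad \mathcal{B}^a \equiv \sqrt{h}\, B^a, \quad \phi \equiv -A_0, \quad \rho_e \equiv -j^\alpha n_\alpha = N j^0, \quad J^a \equiv N j^a. \tag{11.30} $$
If we notice that
$$ F_{0a} = -h_{ab} N^2 F^{0b} - F_{ab} N^b, \qquad \epsilon_{abc}\, \epsilon_{ijk}\, h^{ai} h^{bj} = \frac{2 h_{ck}}{h}, \tag{11.31} $$
we can express the Lagrangian density $L = A_{\mu,\nu} F^{\mu\nu} + F^{\mu\nu} F_{\mu\nu}/4 + A_\mu j^\mu$ as
$$ \sqrt{-g}\, L = -\mathcal{E}^a \dot{A}_a + \phi\, \mathcal{E}^a{}_{,a} - \frac{1}{2\sqrt{h}}\, N h_{ab} \left(\mathcal{E}^a \mathcal{E}^b + \mathcal{B}^a \mathcal{B}^b\right) + \epsilon_{abc}\, N^a \mathcal{E}^b \mathcal{B}^c - \sqrt{h}\, \phi \rho_e + \sqrt{h}\, A_a J^a. \tag{11.32} $$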

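Using $\pi^a = -\mathcal{E}^a$, we can obtain the Hamiltonian density
$$ \mathcal{H} = \phi\, \pi^a{}_{,a} + \frac{N h_{ab}}{2\sqrt{h}} \left(\pi^a \pi^b + \mathcal{B}^a \mathcal{B}^b\right) + \epsilon_{abc}\, N^a \pi^b \mathcal{B}^c + \sqrt{h}\, \phi \rho_e - \sqrt{h}\, A_a J^a. \tag{11.33} $$
The resulting Hamilton's equations are
$$ \dot{A}_a = -\phi_{,a} - N E_a - \varepsilon_{abc}\, N^b B^c; \tag{11.34a} $$
$$ \dot\pi^a = -\epsilon^{abc} (N B_c)_{,b} + \epsilon^{abc} \left(\varepsilon_{ijc}\, N^i E^j\right)_{,b} + \sqrt{h}\, J^a. \tag{11.34b} $$
The constraint equation $\pi^a{}_{,a} + \sqrt{h}\, \rho_e = 0$ can be obtained as well. After some simplification, we can write down Maxwell's equations in curved spacetime,
$$ \frac{1}{\sqrt{h}} \frac{\partial}{\partial t} \left(\sqrt{h}\, \mathbf{E}\right) = \nabla \times (N\mathbf{B} - \mathbf{N} \times \mathbf{E}) - \mathbf{J}; \tag{11.35a} $$
$$ \frac{1}{\sqrt{h}} \frac{\partial}{\partial t} \left(\sqrt{h}\, \mathbf{B}\right) = -\nabla \times (N\mathbf{E} + \mathbf{N} \times \mathbf{B}); \tag{11.35b} $$
$$ \nabla \cdot \mathbf{E} = \rho_e; \tag{11.35c} $$
$$ \nabla \cdot \mathbf{B} = 0. \tag{11.35d} $$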

11.3.3 General relativity


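Using the Gauss-Codazzi theorem, the action of gravity can be rewritten as
$$ S_G = \frac{1}{16\pi} \int_{t_1}^{t_2} dt \left\{ \int_{\Sigma_t} \left({}^3R + K^{ab} K_{ab} - K^2\right) N \sqrt{h}\, d^3y + 2 \oint_{S_t} (k - k_0)\, N \sqrt{\sigma}\, d^2\theta \right\}, \tag{11.36} $$
where $k$ is the extrinsic curvature of $S_t$ embedded in $\Sigma_t$ and $k_0$ is the extrinsic curvature of $S_t$ embedded in flat space. The time derivative of the metric on the hypersurface is
$$ \dot{h}_{ab} \equiv \mathcal{L}_t h_{ab} = \mathcal{L}_t \left(g_{\alpha\beta}\, e^\alpha_a e^\beta_b\right) = 2N K_{ab} + N_{a|b} + N_{b|a}, \tag{11.37} $$
or equivalently,
$$ K_{ab} = \frac{1}{2N} \left(\dot{h}_{ab} - N_{a|b} - N_{b|a}\right). \tag{11.38} $$
The corresponding canonical momenta of $h_{ab}$ are
$$ p^{ab} = \frac{\partial(\sqrt{-g}\, L_G)}{\partial \dot{h}_{ab}} = \frac{\sqrt{h}}{16\pi} \left(K^{ab} - K h^{ab}\right), \tag{11.39} $$
or equivalently,
$$ \sqrt{h}\, K^{ab} = 16\pi \left(p^{ab} - \frac{1}{2}\, p\, h^{ab}\right). \tag{11.40} $$
The Hamiltonian on the hypersurface is given by
$$ \begin{aligned}
16\pi H_G = {} & \int_{\Sigma_t} \left[ N \left(K^{ab} K_{ab} - K^2 - {}^3R\right) - 2 N_a \left(K^{ab} - K h^{ab}\right)_{|b} \right] \sqrt{h}\, d^3y \\
& - 2 \oint_{S_t} \left[ N (k - k_0) - N_a \left(K^{ab} - K h^{ab}\right) r_b \right] \sqrt{\sigma}\, d^2\theta.
\end{aligned} \tag{11.41} $$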

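The variation of the gravitational Hamiltonian is
$$ \delta H_G = \int_{\Sigma_t} \left(P^{ab}\, \delta h_{ab} + H_{ab}\, \delta p^{ab} - C\, \delta N - 2 C_a\, \delta N^a\right) d^3y, \tag{11.42a} $$
where
$$ \begin{aligned}
16\pi P^{ab} = {} & N \sqrt{h}\, G^{ab} - \sqrt{h} \left(N^{|ab} - h^{ab} N^{|c}{}_{|c}\right)
 + 16\pi \left[ 2 p^{c(a} N^{b)}{}_{|c} - \sqrt{h} \left(\frac{1}{\sqrt{h}}\, p^{ab} N^c\right)_{|c} \right] \\
& + (16\pi)^2 \left[ \frac{2N}{\sqrt{h}} \left(p^a{}_c\, p^{bc} - \frac{1}{2}\, p\, p^{ab}\right) - \frac{N}{2\sqrt{h}} \left(p^{cd} p_{cd} - \frac{1}{2}\, p^2\right) h^{ab} \right],
\end{aligned} \tag{11.42b} $$
$$ H_{ab} = \frac{32\pi N}{\sqrt{h}} \left(p_{ab} - \frac{1}{2}\, p\, h_{ab}\right) + 2 N_{(a|b)}, \tag{11.42c} $$
$$ C = \frac{\sqrt{h}}{16\pi} \left({}^3R + K^2 - K^{ab} K_{ab}\right), \tag{11.42d} $$
$$ C_a = \frac{\sqrt{h}}{16\pi} \left(K^b{}_a - K \delta^b{}_a\right)_{|b}. \tag{11.42e} $$
Similarly, we can also get the variation of the electromagnetic Hamiltonian,
$$ \delta H_E = \int_{\Sigma_t} \left(-\frac{1}{2}\, N \sqrt{h}\, I^{ab}\, \delta h_{ab} + \sqrt{h}\, \rho\, \delta N - \sqrt{h}\, s_a\, \delta N^a\right) d^3y, \tag{11.43a} $$
where
$$ I^{ab} = \frac{1}{2} \left(E^c E_c + B^c B_c\right) h^{ab} - E^a E^b - B^a B^b, \tag{11.43b} $$
$$ \rho = \frac{1}{2} \left(E^c E_c + B^c B_c\right), \tag{11.43c} $$
$$ s_a = \epsilon_{abc}\, E^b B^c. \tag{11.43d} $$
Now, we can write down Hamilton's equations and the corresponding constraint equations for general relativity,
$$ \dot{h}_{ab} = H_{ab}; \tag{11.44a} $$
$$ \dot{p}^{ab} = -P^{ab} + \frac{1}{2}\, N \sqrt{h}\, I^{ab}; \tag{11.44b} $$
$$ {}^3R + K^2 - K^{ab} K_{ab} = 16\pi\rho; \tag{11.44c} $$
$$ \left(K^b{}_a - K \delta^b{}_a\right)_{|b} = -8\pi s_a. \tag{11.44d} $$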
Chapter 12
Perturbation Theory and Gravitational Radiation

12.1 The linearized theory of gravity


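In a weak-field situation
$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1, \tag{12.1} $$
one can expand the field equations in powers of $h_{\mu\nu}$ using a coordinate frame where $|h_{\mu\nu}| \ll 1$ holds; and without much loss of accuracy, one can keep only linear terms. The resulting formalism is often called "the linearized theory of gravity". The resulting connection coefficients, when linearized in the metric perturbation $h_{\mu\nu}$, read
$$ \Gamma^\mu{}_{\alpha\beta} = \frac{1}{2}\, \eta^{\mu\nu} \left(h_{\alpha\nu,\beta} + h_{\beta\nu,\alpha} - h_{\alpha\beta,\nu}\right) \equiv \frac{1}{2} \left(h_\alpha{}^\mu{}_{,\beta} + h_\beta{}^\mu{}_{,\alpha} - h_{\alpha\beta}{}^{,\mu}\right). \tag{12.2} $$
Whenever one expands in powers of $h_{\mu\nu}$, indices of $h_{\mu\nu}$ are raised and lowered using $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$, not $g^{\mu\nu}$ and $g_{\mu\nu}$. A similar linearization of the Ricci tensor yields
$$ R_{\mu\nu} = \Gamma^\alpha{}_{\mu\nu,\alpha} - \Gamma^\alpha{}_{\mu\alpha,\nu} = \frac{1}{2} \left(h_\mu{}^\alpha{}_{,\nu\alpha} + h_\nu{}^\alpha{}_{,\mu\alpha} - h_{\mu\nu,\alpha}{}^\alpha - h_{,\mu\nu}\right), \quad\text{where}\quad h \equiv \eta^{\mu\nu} h_{\mu\nu}. \tag{12.3} $$
After a further contraction to form $R \equiv g^{\mu\nu} R_{\mu\nu} \approx \eta^{\mu\nu} R_{\mu\nu}$, one finds that Einstein's equations read
$$ h_{\mu\alpha,\nu}{}^\alpha + h_{\nu\alpha,\mu}{}^\alpha - h_{\mu\nu,\alpha}{}^\alpha - h_{,\mu\nu} - \eta_{\mu\nu} \left(h_{\alpha\beta}{}^{,\alpha\beta} - h_{,\alpha}{}^\alpha\right) = 16\pi T_{\mu\nu}. \tag{12.4} $$
Define
$$ \bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu} h. \tag{12.5} $$
The linearized field equations become
$$ H^{\mu\alpha\nu\beta}{}_{,\alpha\beta} = 16\pi T^{\mu\nu}, \tag{12.6} $$
where
$$ -H^{\mu\alpha\nu\beta} \equiv \bar{h}^{\mu\nu} \eta^{\alpha\beta} + \bar{h}^{\alpha\beta} \eta^{\mu\nu} - \bar{h}^{\alpha\nu} \eta^{\mu\beta} - \bar{h}^{\mu\beta} \eta^{\alpha\nu}. \tag{12.7} $$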

Two different types of coordinate transformations connect nearly globally Lorentz systems to each other: global Lorentz transformations, and infinitesimal coordinate transformations. As for global Lorentz transformations, we can verify that $h_{\mu\nu}$ and $\bar{h}_{\mu\nu}$ transform like components of a tensor in flat spacetime. For infinitesimal coordinate transformations
$$ x^\mu_{\rm new} = x^\mu_{\rm old} + \xi^\mu, \tag{12.8} $$



where $\xi^\mu$ are four arbitrary functions small enough to leave $|h_{\mu'\nu'}| \ll 1$, we can verify that the metric perturbation functions in the new ($x^\mu_{\rm new}$) and old ($x^\mu_{\rm old}$) coordinate systems are related by
$$ h^{\rm new}_{\mu\nu} = h^{\rm old}_{\mu\nu} - \xi_{\mu,\nu} - \xi_{\nu,\mu}. \tag{12.9} $$
The functional forms of all other scalars, vectors, and tensors that are of order $O(h)$, such as $R_{\mu\nu}$, $T_{\mu\nu}$ and $R$, are unaltered, to within the precision of linearized theory.
For any physical situation, one can specialize the gauge so that
$$ \bar{h}^{\mu\alpha}{}_{,\alpha} = 0, \tag{12.10} $$
which is called the Lorentz gauge. The Lorentz gauge is not fixed uniquely: the gauge condition is left unaffected by any gauge transformation for which
$$ \Box\xi^\alpha \equiv \xi^{\alpha,\beta}{}_\beta = 0. \tag{12.11} $$
The field equations then become
$$ \Box\bar{h}_{\mu\nu} \equiv \bar{h}_{\mu\nu,\alpha}{}^\alpha = -16\pi T_{\mu\nu}. \tag{12.12} $$

Once the gauge has been fixed by fiat for a given system, one can regard $h_{\mu\nu}$ as components of a tensor in flat spacetime; and one can regard the field equations and the chosen gauge conditions as geometric, coordinate-independent equations in flat spacetime. This viewpoint allows one to use curvilinear coordinates, if one wishes. But in doing so, one must everywhere replace the Lorentz components of the metric $\eta_{\mu\nu}$ by the metric's components $g^{\rm flat}_{\mu\nu}$ in the flat-spacetime curvilinear coordinate system; and one must replace all ordinary derivatives in the field equations and gauge conditions by covariant derivatives whose connection coefficients come from $g^{\rm flat}_{\mu\nu}$.

12.2 Nearly Newtonian gravitational fields


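The general solution to the linearized field equations in Lorentz gauge lends itself to expression as a retarded integral of the form familiar from electromagnetic theory:
$$ \bar{h}_{\mu\nu}(t, \mathbf{x}) = \int \frac{4 T_{\mu\nu}(t - r, \mathbf{x}')}{r}\, d^3x', \qquad r = |\mathbf{x} - \mathbf{x}'|. \tag{12.13} $$
Here we focus attention on a nearly Newtonian source: $T_{00} \gg |T_{0i}|$ and $T_{00} \gg |T_{ij}|$, with velocities slow enough that retardation is negligible. In this case, we have
$$ \bar{h}_{00} = -4\Phi, \quad \bar{h}_{0i} = \bar{h}_{ij} = 0, \quad\text{where}\quad \Phi(t, \mathbf{x}) = -\int \frac{T_{00}(t, \mathbf{x}')}{r}\, d^3x'. \tag{12.14} $$
The corresponding metric is
$$ \begin{aligned}
ds^2 &= -(1 + 2\Phi)\, dt^2 + (1 - 2\Phi)(dx^2 + dy^2 + dz^2) \\
&\approx -\left(1 - \frac{2M}{r}\right) dt^2 + \left(1 + \frac{2M}{r}\right)(dx^2 + dy^2 + dz^2),
\end{aligned} \tag{12.15} $$
where the second line holds far from a source of total mass $M$, for which $\Phi \approx -M/r$ with $r$ the distance from the source.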

For a test particle whose velocity $v \ll 1$, the geodesic equation becomes

$$\frac{dv^i}{dt} + \Gamma^i{}_{00} = \frac{dv^i}{dt} + \Phi_{,i} = 0. \tag{12.16}$$
Therefore, we reproduce the classical Newtonian theory of gravitation.

Now let us consider the path of a photon through this geometry; in other words, solve the per-
turbed geodesic equation for a null trajectory xµ (λ). (We parametrize the trajectory with λ to
ensure that pµ = dxµ /dλ.) Recall that our philosophy is to consider the metric perturbation
as a field defined on a flat background spacetime. Similarly, we can decompose the geodesic
into a background path plus a perturbation,

$$x^\mu(\lambda) = x^\mu_{(0)}(\lambda) + x^\mu_{(1)}(\lambda), \tag{12.17}$$

where $x^\mu_{(0)}(\lambda)$ solves the geodesic equation in the background. We then evaluate all quantities
along the background path, to solve for $x^\mu_{(1)}(\lambda)$. For this procedure to make sense, we need to
assume that the potential $\Phi$ is not appreciably different along the background and true
geodesics; this condition amounts to requiring that $x^i_{(1)}\partial_i\Phi \ll \Phi$. For convenience we denote the
wave vector of the background path as $k^\mu \equiv dx^\mu_{(0)}/d\lambda$ and the derivative of the deviation vector as
$l^\mu \equiv dx^\mu_{(1)}/d\lambda$. The condition that a path be null is of course

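$$(\eta_{\mu\nu} + h_{\mu\nu})(l^\mu + k^\mu)(l^\nu + k^\nu) = 0. \tag{12.18}$$

At zeroth order we simply have
$$\eta_{\mu\nu}k^\mu k^\nu = 0, \tag{12.19}$$
or equivalently,
$$k^2 \equiv (k^0)^2 = \mathbf k\cdot\mathbf k. \tag{12.20}$$
At first order we obtain
$$2\eta_{\mu\nu}k^\mu l^\nu + h_{\mu\nu}k^\mu k^\nu = 0, \tag{12.21}$$
or equivalently,
$$-kl^0 + \mathbf l\cdot\mathbf k = 2k^2\Phi. \tag{12.22}$$
We now turn to the perturbed geodesic equation. The zeroth-order geodesic equation simply
tells us that $x^\mu_{(0)}(\lambda)$ is a straight trajectory, while at first order we have
$$\frac{dl^\mu}{d\lambda} = -\Gamma^\mu{}_{\rho\sigma}k^\rho k^\sigma. \tag{12.23}$$
It follows that
$$\frac{dl^0}{d\lambda} = -2k(\mathbf k\cdot\nabla\Phi), \qquad \frac{d\mathbf l}{d\lambda} = -2k^2\nabla_\perp\Phi, \tag{12.24}$$
where $\nabla_\perp \equiv \nabla - k^{-2}\,\mathbf k\,(\mathbf k\cdot\nabla)$ is the gradient transverse to the background path.

Assuming $l^0 = 0$ when $\Phi = 0$, the integration gives
$$l^0 = -2k\Phi, \qquad \mathbf l\cdot\mathbf k = 0, \qquad \mathbf l = -2k^2\int\nabla_\perp\Phi\,d\lambda. \tag{12.25}$$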

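The redshift of a photon propagating to infinity from a gravitating source is
$$z \equiv \frac{-(k + l^0)u_0}{k} - 1 = -\Phi, \tag{12.26}$$
where $u_0 = g_{00}u^0 \approx -(1 + \Phi)$ is the covariant time component of the four-velocity of a static emitter.
The deflection angle of a photon passing by a gravitating source is
$$\boldsymbol\alpha = -\frac{\mathbf l}{k} = 2k\int\nabla_\perp\Phi\,d\lambda. \tag{12.27}$$
In particular, for $\Phi = -M/r$, we get
$$z = \frac{M}{r}, \qquad \alpha = \frac{4M}{b}, \tag{12.28}$$
where $b$ is the impact parameter, i.e. the distance of closest approach of the unperturbed light ray to the source.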
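As a numerical illustration of (12.27) and (12.28), the sketch below integrates the transverse gradient of $\Phi = -M/r$ along a straight background ray and compares the result with $4M/b$. The function name `deflection_angle`, the midpoint quadrature, and the solar values (in geometrized units, metres) are assumptions introduced here for illustration; they are not part of the notes.

```python
import math

def deflection_angle(M, b, n=20000):
    """Evaluate alpha = 2 * integral of |grad_perp Phi| ds for Phi = -M/r.

    The unperturbed ray is a straight line at impact parameter b, with path
    length s = k * lambda. Substituting s = b*tan(theta) turns the integrand
    M*b/(b^2 + s^2)^(3/2) ds into (M/b)*cos(theta) dtheta.
    """
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * dtheta  # midpoint rule
        total += (M / b) * math.cos(theta) * dtheta
    return 2.0 * total

# Illustrative values (assumed): the Sun in geometrized units.
M_SUN = 1476.6    # G*M_sun/c^2 in metres
R_SUN = 6.957e8   # solar radius in metres, used as the impact parameter

alpha = deflection_angle(M_SUN, R_SUN)
alpha_arcsec = math.degrees(alpha) * 3600.0
print(alpha, 4 * M_SUN / R_SUN, alpha_arcsec)
```

For a ray grazing the solar limb this gives a deflection of roughly 1.75 arcseconds, the classic light-bending value.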

12.3 Gravitational wave


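Let us decompose $h_{\mu\nu}$ as

$$h_{00} = -2A, \qquad h_{0i} = \partial_iB + \bar B_i, \qquad h_{ij} = 2C\delta_{ij} + 2\partial_i\partial_jE + \partial_i\bar E_j + \partial_j\bar E_i + \tilde E_{ij}, \tag{12.29}$$

with
$$\partial_i\bar B_i = 0, \qquad \partial_i\bar E_i = 0, \qquad \partial_i\tilde E_{ij} = 0, \qquad \tilde E_{ii} = 0. \tag{12.30}$$
Then we decompose the displacement vector of the gauge transformation as

$$\xi^0 = -T, \qquad \xi^i = -\partial^iL - \bar L^i \quad\text{with}\quad \partial_i\bar L^i = 0. \tag{12.31}$$

Under such a coordinate transformation, the metric perturbations transform as

$$A \to A + \dot T, \quad B \to B + \dot L - T, \quad C \to C, \quad E \to E + L \quad\text{for scalar modes;} \tag{12.32a}$$
$$\bar B_i \to \bar B_i + \dot{\bar L}_i, \quad \bar E_i \to \bar E_i + \bar L_i \quad\text{for vector modes;} \tag{12.32b}$$
$$\tilde E_{ij} \to \tilde E_{ij} \quad\text{for tensor modes.} \tag{12.32c}$$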

The tensor modes are therefore gauge invariant, since they do not depend on the choice of the
coordinate system. This is not the case for the vector and scalar modes. However, we can
define combinations of these modes that are gauge invariant. For the scalar modes, we define

$$\Phi \equiv A + \dot B - \ddot E, \qquad \Psi \equiv -C, \tag{12.33}$$

and for the vector modes we define

$$\bar\Psi_i \equiv \dot{\bar E}_i - \bar B_i. \tag{12.34}$$

Thus, we have defined 2 scalar quantities and 1 transverse vector quantity (2 degrees of freedom) which
are gauge invariant. Together with the 2 degrees of freedom of the tensor modes, we are left with 6 degrees
of freedom once the 4 arbitrary degrees of freedom related to the gauge choice have been absorbed.
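The gauge invariance of the combinations (12.33) and (12.34) follows by direct substitution of the transformation rules (12.32). The short check below is worked out here as an illustration; it introduces no new quantities.

```latex
\begin{align*}
\Phi = A + \dot B - \ddot E
  &\;\to\; (A + \dot T) + (\dot B + \ddot L - \dot T) - (\ddot E + \ddot L) = \Phi, \\
\Psi = -C &\;\to\; -C = \Psi, \\
\bar\Psi_i = \dot{\bar E}_i - \bar B_i
  &\;\to\; (\dot{\bar E}_i + \dot{\bar L}_i) - (\bar B_i + \dot{\bar L}_i) = \bar\Psi_i .
\end{align*}
```

The counting works out as $10 = 4 + 4 + 2$ components (scalars $A, B, C, E$; two transverse vectors; one transverse-traceless tensor), of which the gauge functions $T$, $L$ and $\bar L_i$ remove $1 + 1 + 2 = 4$.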

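In terms of gauge invariant quantities, the Einstein tensor can be written as

$$G_{00} = 2\nabla^2\Psi,$$
$$G_{0i} = 2\partial_i\dot\Psi + \frac12\nabla^2\bar\Psi_i,$$
$$G_{ij} = (\delta_{ij}\nabla^2 - \partial_i\partial_j)(\Phi - \Psi) + \frac12\left(\partial_i\dot{\bar\Psi}_j + \partial_j\dot{\bar\Psi}_i\right) + 2\delta_{ij}\ddot\Psi - \frac12\Box\tilde E_{ij}.$$
We note again that the Einstein tensor (and likewise the Ricci tensor) is gauge invariant.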

Now we consider the solution of the linearised Einstein equations in vacuum. For the scalar
modes, the Einstein equations $G_{\mu\nu} = 0$ imply that

$$\nabla^2\Psi = 0, \qquad \nabla^2(\Phi - \Psi) = 0. \tag{12.35}$$

The only regular solutions are

$$\Phi = \Psi = 0, \tag{12.36}$$

which means that no scalar mode can propagate.

For the vector modes, we have

$$\nabla^2\bar\Psi_i = 0. \tag{12.37}$$

The only regular solution is $\bar\Psi_i = 0$. Just as for scalar modes, no vector modes can propagate.

For tensor modes, we have

$$\Box\tilde E_{ij} = 0. \tag{12.38}$$

Therefore, the only perturbations that can propagate in a Minkowski space-time are the
gravitational waves, and they satisfy

$$\Box\tilde E_{ij} = 0, \qquad \partial_i\tilde E_{ij} = 0, \qquad \tilde E_{ii} = 0. \tag{12.39}$$

The three conditions $\Phi = \Psi = \bar\Psi_i = 0$ define a gauge equivalence class. We can choose a
gauge in this family by imposing some conditions on the perturbations. For instance, setting
$E = B = 0$ and $\bar E_i = 0$, we define what is called the transverse and traceless (TT) gauge, in
which the metric is completely determined. In this case, the only non-vanishing component
of $h^{\rm TT}_{\mu\nu}$ is $\tilde E_{ij}$. A particularly useful set of solutions to this wave equation are the plane waves,
given by
$$h^{\rm TT}_{\mu\nu} = C_{\mu\nu}e^{ik_\sigma x^\sigma}, \tag{12.40}$$

where $C_{\mu\nu}$ is a constant, symmetric, traceless and purely spatial tensor. The wave equation and the
transversality condition now become
$$k^\sigma k_\sigma = 0, \qquad k^\mu C_{\mu\nu} = 0. \tag{12.41}$$

Our solution can be made more explicit by choosing spatial coordinates such that the wave is
travelling in the $z$ direction. A little algebra shows that

$$h^{\rm TT}_{ij} = \left(C_+\epsilon^+_{ij} + C_\times\epsilon^\times_{ij}\right)e^{-i\omega(t - z)}, \tag{12.42}$$

Figure 12.1: Effects of a gravitational plane wave propagating along the axis z on a ring of
particles located in the plane xy, depending on the wave polarization.

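with the two polarization tensors being defined by

$$\epsilon^+ \equiv \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \epsilon^\times \equiv \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{12.43}$$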

Consider the geodesic equation of a test particle in the gravitational field of a gravitational wave
in the TT gauge. Since to leading order in the perturbation we have $\Gamma^i{}_{00} = 0$, a particle initially
at rest will remain at rest. Of course, this does not mean that nothing happens, but rather
that the frame of reference is comoving with the test particle. To see if anything happens, we
should look at the relative motion of two neighbouring particles, which can be done using the
geodesic deviation equation. The relative acceleration is given by

$$\nabla_{\mathbf u}\nabla_{\mathbf u}\mathbf n = -\mathbf R(\mathbf n, \mathbf u)\mathbf u. \tag{12.44}$$

To leading order in $v$, we have $u^\mu = (1, \mathbf 0)$, $n^\mu = (0, n^i)$ and $R_{i00j} = \frac12\ddot h^{\rm TT}_{ij}$. Thus we have

$$\frac{d^2n^i}{dt^2} = \frac12\ddot h^{\rm TT}_{ij}n^j. \tag{12.45}$$
For $+$ modes, we have
$$\frac{d^2n^x}{dt^2} = -\frac12\omega^2C_+n^xe^{-i\omega(t - z)}, \qquad \frac{d^2n^y}{dt^2} = \frac12\omega^2C_+n^ye^{-i\omega(t - z)}. \tag{12.46}$$
For $\times$ modes, we have
$$\frac{d^2n^x}{dt^2} = -\frac12\omega^2C_\times n^ye^{-i\omega(t - z)}, \qquad \frac{d^2n^y}{dt^2} = -\frac12\omega^2C_\times n^xe^{-i\omega(t - z)}. \tag{12.47}$$

12.4 Nonlinear effects in gravitational waves


12.4.1 The shortwave approximation
Consider gravitational waves propagating through a background spacetime. Let $R$ be the
typical radius of curvature of the background; let $ƛ \equiv \lambda/2\pi$ and $A$ be the typical reduced
wavelength and amplitude of the waves; and demand both $A \ll 1$ and $ƛ/R \ll 1$. The background

curvature might be due entirely to the waves, or partly to waves and partly to nearby matter
and nongravitational fields.

The analysis uses a coordinate system closely “tuned” to spacetime in the sense that the metric
coefficients can be split into “background” coefficients plus perturbations,
$$g_{\mu\nu} = g^{(B)}_{\mu\nu} + h_{\mu\nu}, \tag{12.48}$$

with these properties:

1. the amplitude of the perturbation is $A$:
$$h_{\mu\nu} \lesssim (\text{typical value of } g^{(B)}_{\mu\nu})\cdot A; \tag{12.49a}$$

2. the scale on which $g^{(B)}_{\mu\nu}$ varies is $\gtrsim R$:
$$g^{(B)}_{\mu\nu,\alpha} \lesssim (\text{typical value of } g^{(B)}_{\mu\nu})/R; \tag{12.49b}$$

3. the scale on which $h_{\mu\nu}$ varies is $\sim ƛ$:
$$h_{\mu\nu,\alpha} \sim (\text{typical value of } h_{\mu\nu})/ƛ. \tag{12.49c}$$

Such coordinates are called “steady”.

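A rather long computation shows that the Ricci tensor for an expanded metric is
$$R_{\mu\nu} = \underbrace{R^{(B)}_{\mu\nu}}_{?} + \underbrace{R^{(1)}_{\mu\nu}}_{A/ƛ^2} + \underbrace{R^{(2)}_{\mu\nu}}_{A^2/ƛ^2} + \underbrace{\text{error}}_{A^3/ƛ^2}. \tag{12.50}$$

Here a marker has been placed under each term to show its typical order of magnitude; $R^{(B)}_{\mu\nu}$ is
the Ricci tensor for the background metric $g^{(B)}_{\mu\nu}$; and $R^{(1)}_{\mu\nu}$ and $R^{(2)}_{\mu\nu}$ are expressions defined by

$$R^{(1)}_{\mu\nu} \equiv \frac12\left(h_{\alpha\mu|\nu}{}^{\alpha} + h_{\alpha\nu|\mu}{}^{\alpha} - h_{\mu\nu|\alpha}{}^{\alpha} - h_{|\mu\nu}\right); \tag{12.51a}$$
$$R^{(2)}_{\mu\nu} \equiv \frac12\left[\frac12h_{\alpha\beta|\mu}h^{\alpha\beta}{}_{|\nu} + h^{\alpha\beta}\left(h_{\mu\nu|\alpha\beta} + h_{\alpha\beta|\mu\nu} - h_{\alpha\mu|\nu\beta} - h_{\alpha\nu|\mu\beta}\right) + h_\nu{}^{\alpha|\beta}\left(h_{\alpha\mu|\beta} - h_{\beta\mu|\alpha}\right) - \left(h^{\alpha\beta}{}_{|\beta} - \frac12h^{|\alpha}\right)\left(h_{\alpha\mu|\nu} + h_{\alpha\nu|\mu} - h_{\mu\nu|\alpha}\right)\right]. \tag{12.51b}$$

In these expressions and everywhere below, indices are raised and lowered with $g^{(B)}_{\mu\nu}$, and an
upright line denotes a covariant derivative with respect to $g^{(B)}_{\mu\nu}$.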

At the heart of the shortwave formalism is its method for solving the vacuum field equations
$R_{\mu\nu} = 0$. One begins by selecting out the part linear in the amplitude of the wave $A$, and
setting it equal to zero. The action of the waves to curve up the background is a nonlinear
phenomenon; so $R^{(B)}_{\mu\nu}$ cannot be linear in $A$. Hence, $R^{(1)}_{\mu\nu}$ is the only linear term, and it must
vanish by itself:
$$R^{(1)}_{\mu\nu}(h) = 0. \tag{12.52a}$$

Of course $h_{\mu\nu}$ may contain nonlinear correction terms (call them $j_{\mu\nu}$) of order $A^2$, which
must not be constrained by this linear equation.

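One next splits the remainder of $R_{\mu\nu}$ into a part that varies only on scales far larger than $ƛ$,
and a second part that contains the fluctuations. This split can be accomplished by averaging
over several wavelengths (denoted by $\langle\cdot\rangle$):
$$\underbrace{R^{(B)}_{\mu\nu}}_{?} + \underbrace{\langle R^{(2)}_{\mu\nu}(h)\rangle}_{A^2/ƛ^2} + \underbrace{\text{error}}_{A^3/ƛ^2} = 0 \quad [\text{smooth part}]; \tag{12.52b}$$
$$\underbrace{R^{(1)}_{\mu\nu}(j)}_{A^2/ƛ^2} + \underbrace{R^{(2)}_{\mu\nu}(h)}_{A^2/ƛ^2} - \underbrace{\langle R^{(2)}_{\mu\nu}(h)\rangle}_{A^2/ƛ^2} + \underbrace{\text{error}}_{A^3/ƛ^2} = 0 \quad [\text{fluctuating part}]. \tag{12.52c}$$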

The smooth part (12.52b) shows how the stress-energy in the waves creates the background curvature. It
can be rewritten in the more suggestive form

$$G^{(B)}_{\mu\nu} \equiv R^{(B)}_{\mu\nu} - \frac12R^{(B)}g^{(B)}_{\mu\nu} = 8\pi T^{(\rm GW)}_{\mu\nu} \quad\text{in vacuum}, \tag{12.53}$$
where
$$T^{(\rm GW)}_{\mu\nu} \equiv -\frac{1}{8\pi}\left\{\langle R^{(2)}_{\mu\nu}(h)\rangle - \frac12g^{(B)}_{\mu\nu}\langle R^{(2)}(h)\rangle\right\} \tag{12.54}$$
is the stress-energy tensor for the gravitational waves. Equation 12.53 can be generalized to
the case where matter and other fields are present,
$$G^{(B)}_{\mu\nu} = 8\pi\left(T^{(\rm GW)}_{\mu\nu} + T^{(\rm matter)}_{\mu\nu} + T^{(\rm other\ fields)}_{\mu\nu}\right). \tag{12.55}$$

Notice that the typical magnitude of the components $R^{(B)}_{\mu\nu}$ is $R^{-2}$, while by (12.52b) the waves
contribute a curvature of order $A^2/ƛ^2$, which cannot exceed it. We can deduce that the
dimensionless numbers $A$ and $ƛ/R$ are related by $A \lesssim ƛ/R$.

Finally, the fluctuating part (12.52c) shows how the gravitational waves generate nonlinear corrections $j_{\mu\nu}$
to themselves (wave-wave scattering, harmonics of the fundamental frequency, etc.).

12.4.2 Effect of background curvature on wave propagation


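Focus attention on the propagation equation $R^{(1)}_{\mu\nu}(h) = 0$. Define $\bar h_{\mu\nu} = h_{\mu\nu} - \frac12g^{(B)}_{\mu\nu}h$. The
propagation equation can be rewritten as

$$\bar h_{\mu\nu|\alpha}{}^{\alpha} + g^{(B)}_{\mu\nu}\bar h^{\alpha\beta}{}_{|\beta\alpha} - 2\bar h_{\alpha(\mu}{}^{|\alpha}{}_{\nu)} + 2R^{(B)}_{\alpha\mu\beta\nu}\bar h^{\alpha\beta} - 2R^{(B)}_{\alpha(\mu}\bar h_{\nu)}{}^{\alpha} = 0. \tag{12.56}$$

The propagation equation can be simplified by a special choice of gauge. An infinitesimal
coordinate transformation $x^\mu_{\rm new}(\mathcal P) = x^\mu_{\rm old}(\mathcal P) + \xi^\mu(\mathcal P)$ induces a first-order change in the
functional forms of the metric coefficients given by

$$h^{\rm new}_{\mu\nu}(x^\mu_{\rm new}) = h^{\rm old}_{\mu\nu}(x^\mu_{\rm new}) - \xi_{\mu|\nu} - \xi_{\nu|\mu}. \tag{12.57}$$

By an appropriate choice of the four functions $\xi^\mu$, one can enforce the transverse and traceless
gauge conditions
$$\bar h_\mu{}^{\alpha}{}_{|\alpha} = 0, \qquad \bar h = 0. \tag{12.58}$$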

The propagation equation will become

$$h_{\mu\nu|\alpha}{}^{\alpha} + 2R^{(B)}_{\alpha\mu\beta\nu}h^{\alpha\beta} - 2R^{(B)}_{\alpha(\mu}h_{\nu)}{}^{\alpha} = 0. \tag{12.59}$$

The propagation equation is accurate to first order in the amplitude; and its accuracy is in-
dependent of the ratio λ/R. Thus, it can be applied whenever the waves are weak even if the
wavelength is large. All nonlinear interactions of the wave with itself are neglected in this first-
order propagation equation. Actually contained in the propagation equation are all effects due
to the linear action of the background curvature on the propagating wave.

12.4.3 Stress-energy tensor for gravitational waves


Turn now to an evaluation of the effective stress-energy tensor $T^{(\rm GW)}_{\mu\nu}$. The evaluation requires
averaging various quantities over several wavelengths. Useful rules for manipulating quantities
inside the averaging brackets $\langle\cdot\rangle$ are the following:
1. Covariant derivatives commute; e.g., $\langle hh_{\mu\nu|\alpha\beta}\rangle = \langle hh_{\mu\nu|\beta\alpha}\rangle$. The fractional errors
made by freely commuting are $\sim (ƛ/R)^2$.
2. Gradients average out to zero; e.g., $\langle(h_{|\alpha}h_{\mu\nu})_{|\beta}\rangle = 0$. Fractional errors made here are
$\sim ƛ/R$.
3. As a corollary, one can freely integrate by parts, flipping derivatives from one $h$ to the
other; e.g., $\langle hh_{\mu\nu|\alpha\beta}\rangle = \langle -h_{|\beta}h_{\mu\nu|\alpha}\rangle$.
A straightforward but long calculation using these rules and the propagation equation yields
$$T^{(\rm GW)}_{\mu\nu} = \frac{1}{32\pi}\left\langle\bar h_{\alpha\beta|\mu}\bar h^{\alpha\beta}{}_{|\nu} - \frac12\bar h_{|\mu}\bar h_{|\nu} - 2\bar h^{\alpha\beta}{}_{|\beta}\bar h_{\alpha(\mu|\nu)}\right\rangle. \tag{12.60}$$
In transverse and traceless gauge, we have
$$T^{(\rm GW)}_{\mu\nu} = \frac{1}{32\pi}\left\langle h_{\alpha\beta|\mu}h^{\alpha\beta}{}_{|\nu}\right\rangle. \tag{12.61}$$
The dominant errors in $T^{(\rm GW)}_{\mu\nu}$ are $\sim ƛ/R$. $T^{(\rm GW)}_{\mu\nu}$ is on an equal footing with any other
stress-energy tensor. It plays the same role in producing background curvature. One can also show
that this stress-energy tensor is divergence-free in vacuum,

$$T^{(\rm GW)\mu\nu}{}_{|\nu} = 0 + \text{error}, \tag{12.62}$$

where the error $\sim (ƛ/R)(T^{(\rm GW)\mu\nu}/R)$ is negligible in the shortwave approximation.

12.5 Conservation laws for 4-momentum and angular momentum
12.5.1 4-momentum and angular momentum
Suppose that the source is isolated, and that spacetime becomes asymptotically flat far away
from it. Use a coordinate system that becomes asymptotically Lorentz as rapidly as spacetime

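curvature permits, when one moves radially outward from the source toward infinity.
Everywhere in this coordinate system, even inside the source, where $|h_{\mu\nu}| \ll 1$ may break down,
we still define formally
$$h_{\mu\nu} \equiv g_{\mu\nu} - \eta_{\mu\nu}. \tag{12.63}$$
The $h_{\mu\nu}$ are clearly not the components of a tensor. Neither is $\eta_{\mu\nu}$ the true metric tensor.
Nevertheless, one is free to raise and lower indices on $h_{\mu\nu}$ with $\eta_{\mu\nu}$ and to define

$$\bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac12\eta_{\mu\nu}h, \qquad h \equiv h_{\alpha\beta}\eta^{\alpha\beta}, \tag{12.64}$$
and
$$-H^{\mu\alpha\nu\beta} \equiv \bar h^{\mu\nu}\eta^{\alpha\beta} + \bar h^{\alpha\beta}\eta^{\mu\nu} - \bar h^{\alpha\nu}\eta^{\mu\beta} - \bar h^{\mu\beta}\eta^{\alpha\nu}. \tag{12.65}$$
Then we can define the effective energy-momentum pseudotensor by
$$16\pi T^{\mu\nu}_{\rm eff} \equiv H^{\mu\alpha\nu\beta}{}_{,\alpha\beta}. \tag{12.66}$$
Since $H^{\mu\alpha\nu\beta}$ is antisymmetric in $\nu$ and $\beta$, it follows that
$$T^{\mu\nu}_{\rm eff}{}_{,\nu} = 0. \tag{12.67}$$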
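We further define the total 4-momentum and angular momentum for both strong and weak
sources by
$$P^\mu \equiv \int T^{\mu0}_{\rm eff}\,d^3x = \frac{1}{16\pi}\oint_S H^{\mu\alpha0j}{}_{,\alpha}\,dS_j, \tag{12.68}$$
$$J^{\mu\nu} \equiv \int\left(x^\mu T^{0\nu}_{\rm eff} - x^\nu T^{0\mu}_{\rm eff}\right)d^3x = \frac{1}{16\pi}\oint_S\left(x^\mu H^{\nu\alpha0j}{}_{,\alpha} - x^\nu H^{\mu\alpha0j}{}_{,\alpha} + H^{\mu j0\nu} - H^{\nu j0\mu}\right)dS_j, \tag{12.69}$$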

where the closed surface of integration $S$ is in the asymptotically flat region surrounding the
source. Naturally, the gravitational energy-momentum pseudotensor is defined as

$$16\pi t^{\mu\nu} \equiv H^{\mu\alpha\nu\beta}{}_{,\alpha\beta} - 2G^{\mu\nu}, \tag{12.70}$$

leading to $T^{\mu\nu}_{\rm eff} = T^{\mu\nu} + t^{\mu\nu}$.
All the quantities $H^{\mu\alpha\nu\beta}$, $t^{\mu\nu}$ and $T^{\mu\nu}_{\rm eff}$ depend for their definition and existence on the choice
of coordinates. Nevertheless, there is adequate invariance under general coordinate
transformations to give the values $P^\mu$ and $J^{\mu\nu}$ coordinate-free significance in the asymptotically flat
region far outside the source.

Although this invariance is hard to see in the volume integrals themselves, it is clear from the
surface-integral forms that no coordinate transformation which changes the coordinates only
inside some spatially bounded region can influence the values of the integrals. For coordinate
changes in the distant, asymptotically flat regions, linearized theory guarantees that under
Lorentz transformations the integrals for P µ and J µν will transform like special relativistic
tensors, and that under infinitesimal coordinate transformations (gauge changes) they will be
invariant.

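Knowing $P^\mu$ and $J^{\mu\nu}$, one can figure out the source’s total mass-energy $M$ and intrinsic
angular momentum $S_\rho$ by

$$M \equiv (-P^\mu P_\mu)^{1/2}, \qquad Y^\mu \equiv -\frac{J^{\mu\nu}P_\nu}{M^2}, \qquad S_\rho \equiv \frac12\varepsilon_{\mu\nu\sigma\rho}\frac{(J^{\mu\nu} - Y^\mu P^\nu + Y^\nu P^\mu)P^\sigma}{M}. \tag{12.71}$$
It is clear that any quantities $H^{\mu\alpha\nu\beta}_{\rm new}$ which agree with the original $H^{\mu\alpha\nu\beta}$ in the asymptotic
weak-field region will give the same values as $H^{\mu\alpha\nu\beta}$ does for the $P^\mu$ and $J^{\mu\nu}$ surface integrals.
One especially convenient choice is that of Landau and Lifshitz, who define
$$H^{\mu\alpha\nu\beta}_{\rm L-L} = \mathfrak g^{\mu\nu}\mathfrak g^{\alpha\beta} - \mathfrak g^{\alpha\nu}\mathfrak g^{\mu\beta}, \quad\text{where}\quad \mathfrak g^{\mu\nu} \equiv (-g)^{1/2}g^{\mu\nu}. \tag{12.72}$$

Einstein’s equations can be written in the form

$$H^{\mu\alpha\nu\beta}_{\rm L-L}{}_{,\alpha\beta} = 16\pi(-g)(T^{\mu\nu} + t^{\mu\nu}_{\rm L-L}), \tag{12.73}$$

where the Landau-Lifshitz pseudotensor components

$$(-g)t^{\alpha\beta}_{\rm L-L} = \frac{1}{16\pi}\Big\{\mathfrak g^{\alpha\beta}{}_{,\lambda}\mathfrak g^{\lambda\mu}{}_{,\mu} - \mathfrak g^{\alpha\lambda}{}_{,\lambda}\mathfrak g^{\beta\mu}{}_{,\mu} + \frac12g^{\alpha\beta}g_{\lambda\mu}\mathfrak g^{\lambda\nu}{}_{,\rho}\mathfrak g^{\rho\mu}{}_{,\nu} - \left(g^{\alpha\lambda}g_{\mu\nu}\mathfrak g^{\beta\nu}{}_{,\rho}\mathfrak g^{\mu\rho}{}_{,\lambda} + g^{\beta\lambda}g_{\mu\nu}\mathfrak g^{\alpha\nu}{}_{,\rho}\mathfrak g^{\mu\rho}{}_{,\lambda}\right) + g_{\lambda\mu}g^{\nu\rho}\mathfrak g^{\alpha\lambda}{}_{,\nu}\mathfrak g^{\beta\mu}{}_{,\rho} + \frac18\left(2g^{\alpha\lambda}g^{\beta\mu} - g^{\alpha\beta}g^{\lambda\mu}\right)\left(2g_{\nu\rho}g_{\sigma\tau} - g_{\rho\sigma}g_{\nu\tau}\right)\mathfrak g^{\nu\tau}{}_{,\lambda}\mathfrak g^{\rho\sigma}{}_{,\mu}\Big\} \tag{12.74}$$

are precisely quadratic in the first derivatives of the metric. It follows that
$$T^{\mu\nu}_{\rm L-L,eff} \equiv (-g)(T^{\mu\nu} + t^{\mu\nu}_{\rm L-L}) \tag{12.75}$$
has all the properties of the $T^{\mu\nu}_{\rm eff}$ defined by 12.66.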

12.5.2 Conservation laws


Surround an asymptotically flat system by a two-dimensional surface $S$ that is at rest in some
asymptotic Lorentz frame. The 4-momentum and angular momentum inside $S$ change at a
rate (as measured in $S$’s rest frame) given by
$$\frac{dP^\mu}{dt} = -\oint_S T^{\mu j}_{\rm eff}\,dS_j \quad\text{and}\quad \frac{dJ^{\mu\nu}}{dt} = -\oint_S\left(x^\mu T^{\nu j}_{\rm eff} - x^\nu T^{\mu j}_{\rm eff}\right)dS_j. \tag{12.76}$$
Although the pseudotensor $t^{\mu\nu}$ in the interbody region and outside the system contributes
negligibly to the total 4-momentum and angular momentum, its contribution via gravitational
waves to the time derivatives can be important when added up over astronomical periods of
time. Thus, one must not ignore it in the flux integrals 12.76.

In evaluating these flux integrals, it is especially convenient to use the Landau-Lifshitz form
of $T^{\mu\nu}_{\rm eff}$, since that form contains no second derivatives of the metric. Only those portions of
$t^{\mu\nu}_{\rm L-L}$ that die out as $1/r^2$ or $1/r^3$ at large $r$ can contribute to the flux integrals 12.76. For static
solutions $g_{\mu\nu} \sim \text{const.} + O(1/r)$, $t^{\mu\nu}_{\rm L-L}$ dies out as $1/r^4$. Hence, the only contributions come

from dynamic parts of the metric, which, at these large distances, are entirely in the form of
gravitational waves.
When $t^{\mu\nu}_{\rm L-L}$ is averaged over several wavelengths, it becomes the stress-energy tensor $T^{(\rm GW)\mu\nu}$
for the waves. Moreover, averaging $t^{\mu\nu}_{\rm L-L}$ over several wavelengths before evaluating the flux
integrals 12.76 cannot affect the values of the integrals. Therefore, one can freely make in
these integrals the replacement
$$T^{\mu\nu}_{\rm eff} = T^{(\rm GW)\mu\nu} + T^{\mu\nu}. \tag{12.77}$$

12.6 Production of gravitational wave


Consider an isolated source in an asymptotically flat spacetime, where asymptotically Minkowskian
coordinates are used. One can further specialize the coordinates so that the four conditions

$$\bar h^{\mu\alpha}{}_{,\alpha} = 0 \tag{12.78}$$

are exactly satisfied everywhere, including the interior of the source. The exact Einstein field
equations can be written in terms of $\bar h^{\mu\nu}$ as

$$\bar h^{\mu\nu}{}_{,\alpha\beta}\eta^{\alpha\beta} = -16\pi(T^{\mu\nu} + t^{\mu\nu}). \tag{12.79}$$

Einstein’s equations 12.79, augmented by an outgoing-wave boundary condition, are
equivalent to the integral equations
$$\bar h^{\mu\nu}(t, \mathbf x) = 4\int\frac{[T^{\mu\nu} + t^{\mu\nu}]_{\rm ret}}{|\mathbf x - \mathbf x'|}\,d^3x', \tag{12.80}$$
where the subscript “ret” means the quantity is to be evaluated at the retarded spacetime point
$(t' = t - |\mathbf x - \mathbf x'|, \mathbf x')$.
Now we introduce the slow-motion assumption $R/ƛ \sim v \ll 1$, where $R$ now denotes the size of the
source. Place the origin of spatial coordinates inside the source. For slow-motion systems, the only
significant contributions to the retarded integrals 12.80 come from deep inside the near zone (from a
region of size $L \sim R \ll ƛ$). Confine attention to field points far outside this source region,
$|\mathbf x| \equiv r \gg L \gtrsim |\mathbf x'|$, and expand the retarded integral 12.80 in powers of $x'/r$. The result is
$$\bar h^{\mu\nu}(t, \mathbf x) = \frac4r\int\left[T^{\mu\nu}(t - r, \mathbf x') + t^{\mu\nu}(t - r, \mathbf x')\right]d^3x' + O\left(\frac{x^j}{r^2ƛ}\int x'^j\left[T^{\mu\nu}(t - r, \mathbf x') + t^{\mu\nu}(t - r, \mathbf x')\right]d^3x'\right). \tag{12.81}$$

Of the ten components of $\bar h^{\mu\nu}$, only the six spatial ones are of interest, since only they are
needed in projecting out the transverse-traceless radiation field $h^{\rm TT}_{jk}$. Applying $(T^{\mu\nu} + t^{\mu\nu})_{,\nu} = 0$,
we can derive that

$$\left[(T^{00} + t^{00})x^jx^k\right]_{,00} = \left[(T^{lm} + t^{lm})x^jx^k\right]_{,lm} - 2\left[(T^{lj} + t^{lj})x^k + (T^{lk} + t^{lk})x^j\right]_{,l} + 2(T^{jk} + t^{jk}), \tag{12.82}$$

whence
$$\int(T^{jk} + t^{jk})\,d^3x = \frac12\frac{d^2I_{jk}}{dt^2}, \quad\text{where}\quad I_{jk}(t) \equiv \int\left[T^{00}(t, \mathbf x) + t^{00}(t, \mathbf x)\right]x^jx^k\,d^3x. \tag{12.83}$$
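The step from (12.82) to (12.83) is an integration over all space. The sketch below spells it out under the assumption that $T^{\mu\nu} + t^{\mu\nu}$ falls off fast enough at large distances for the surface terms to be dropped; that assumption is made here for the illustration and is not stated explicitly in the notes.

```latex
\begin{align*}
\frac{d^2}{dt^2}\int (T^{00}+t^{00})\,x^j x^k\,d^3x
 &= \int \big[(T^{lm}+t^{lm})x^j x^k\big]_{,lm}\,d^3x
  - 2\int \big[(T^{lj}+t^{lj})x^k + (T^{lk}+t^{lk})x^j\big]_{,l}\,d^3x \\
 &\quad + 2\int (T^{jk}+t^{jk})\,d^3x \\
 &= 2\int (T^{jk}+t^{jk})\,d^3x .
\end{align*}
```

The first two integrals on the right are total spatial divergences, so by Gauss's theorem they reduce to surface integrals at infinity and vanish, leaving the relation (12.83).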
Now introduce the nearly Newtonian assumption. It guarantees that gravitation contributes
only a small fraction of the total energy, $t^{00} \sim (\Phi_{,j})^2 \sim |\Phi|T^{00} \ll T^{00}$, hence
$$I_{jk} = \int T^{00}(t, \mathbf x)x^jx^k\,d^3x. \tag{12.84}$$

The quantity $I_{jk}$ is thus the second moment of the mass distribution, from which the quadrupole
moment is built. By combining equations 12.83 and 12.81, and by noticing that inside the source
$|t^{jk}| \sim |\Phi_{,j}\Phi_{,k}| \sim T^{00}|\Phi|$, one obtains
$$\bar h^{jk}(t, \mathbf x) = \frac2r\frac{d^2I_{jk}(t - r)}{dt^2}\left[1 + O\left(\frac{|T^{jk}|}{T^{00}}\frac{ƛ}{R} + |\Phi|\right)\right]. \tag{12.85}$$
Notice that $|\Phi| \sim v^2 \sim (R/ƛ)^2$ and $|T^{jk}|/T^{00} \sim v^2 \sim (R/ƛ)^2$. Thus, terms of higher order
can be neglected.

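$h^{\rm TT}_{jk}$ can be obtained by first lowering indices, using $\eta_{lm} = \delta_{lm}$, and then projecting out the TT
part using the projection operator for radially traveling waves:
$$P_{lm} = \delta_{lm} - n_ln_m, \qquad n_l \equiv \frac{x^l}{r}. \tag{12.86}$$
The result is
$$h^{\rm TT}_{jk} = \frac2r\frac{d^2I^{\rm TT}_{jk}}{dt^2}, \quad\text{where}\quad I^{\rm TT}_{jk} = P_{jl}I_{lm}P_{mk} - \frac12P_{jk}P_{ml}I_{lm}. \tag{12.87}$$
The effective stress-energy tensor for these outgoing waves is
$$T^{\rm GW}_{00} = -T^{\rm GW}_{0r} = T^{\rm GW}_{rr} = \frac{1}{32\pi}\left\langle h^{\rm TT}_{jk,0}h^{\rm TT}_{jk,0}\right\rangle = \frac{1}{8\pi r^2}\left\langle\dddot{Ɨ}_{jk}\dddot{Ɨ}_{jk} - 2n_j\dddot{Ɨ}_{jl}\dddot{Ɨ}_{lk}n_k + \frac12\left(n_j\dddot{Ɨ}_{jk}n_k\right)^2\right\rangle, \tag{12.88}$$
where $Ɨ_{jk} \equiv I_{jk} - \delta_{jk}I/3$ (with $I \equiv I_{ll}$) is the reduced quadrupole moment of the mass distribution. The
total power crossing a sphere of radius $r$ at time $t$ is
$$L_{\rm GW}(t, r) = \int T^{(\rm GW)0r}r^2\,d\Omega = \frac15\left\langle\dddot{Ɨ}_{jk}(t - r)\dddot{Ɨ}_{jk}(t - r)\right\rangle. \tag{12.89}$$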

Example: One case of special interest is the gravitational radiation emitted by a binary star.
For simplicity let us consider two stars of mass $m_1$ and $m_2$ in a circular orbit. We will treat
the motion of the stars in the Newtonian approximation. The angular frequency of the orbit
is $\omega = (M/d^3)^{1/2}$, where $d$ is the separation of the two stars and $M \equiv m_1 + m_2$ is the total mass.
The power that they radiate as gravitational waves is

$$L_{\rm GW} = \frac{32}{5}\frac{\mu^2M^3}{d^5}, \tag{12.90}$$
where $\mu \equiv m_1m_2/(m_1 + m_2)$ is the reduced mass of the system.
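The result (12.90) can be cross-checked numerically against the quadrupole formula (12.89). The sketch below places two point masses on a Newtonian circular orbit (units $G = c = 1$), builds the reduced quadrupole moment, takes its third time derivative by central finite differences, and compares the power with (12.90). The function names, the masses, the separation and the step size are assumptions chosen for illustration. No time average is taken, because for a circular orbit the sum over components turns out to be constant in time.

```python
import math

def reduced_quadrupole(t, m1, m2, d):
    """Trace-free second moment of two point masses on a circular orbit (G = c = 1)."""
    M = m1 + m2
    omega = math.sqrt(M / d**3)
    r1, r2 = m2 * d / M, m1 * d / M          # distances from the centre of mass
    c, s = math.cos(omega * t), math.sin(omega * t)
    bodies = [(m1, (r1 * c, r1 * s, 0.0)), (m2, (-r2 * c, -r2 * s, 0.0))]
    I = [[sum(m * x[j] * x[k] for m, x in bodies) for k in range(3)] for j in range(3)]
    trace = I[0][0] + I[1][1] + I[2][2]
    return [[I[j][k] - (trace / 3.0 if j == k else 0.0) for k in range(3)]
            for j in range(3)]

def quadrupole_power(t, m1, m2, d):
    """(1/5) * sum over j,k of (third time derivative of the reduced quadrupole)^2."""
    M = m1 + m2
    h = 1e-3 * 2 * math.pi / math.sqrt(M / d**3)   # step: 0.1% of the orbital period
    Q = {k: reduced_quadrupole(t + k * h, m1, m2, d) for k in (-2, -1, 1, 2)}
    power = 0.0
    for j in range(3):
        for k in range(3):
            # central-difference estimate of the third derivative
            d3 = (Q[2][j][k] - 2 * Q[1][j][k] + 2 * Q[-1][j][k] - Q[-2][j][k]) / (2 * h**3)
            power += d3 * d3
    return power / 5.0

m1, m2, d = 1.0, 2.0, 100.0                  # illustrative values
mu, M = m1 * m2 / (m1 + m2), m1 + m2
formula = 32.0 / 5.0 * mu**2 * M**3 / d**5   # equation (12.90)
numeric = quadrupole_power(0.3, m1, m2, d)
print(numeric, formula)
```

To convert the geometrized power to SI units one multiplies by $c^5/G$.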
Chapter 13
Black Holes

13.1 Schwarzschild black holes


13.1.1 Schwarzschild metric
In Einstein’s theory of general relativity, the Schwarzschild metric is the solution to the
Einstein field equations that describes the gravitational field outside a spherical mass, on the
assumption that the electric charge of the mass, angular momentum of the mass, and universal
cosmological constant are all zero. In spherical coordinates $\{t, r, \theta, \phi\}$, the metric is given by
$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 - \frac{2M}{r}\right)^{-1}dr^2 + r^2\,d\Omega^2, \tag{13.1}$$

where $d\Omega^2$ is the metric on a unit two-sphere,

$$d\Omega^2 = d\theta^2 + \sin^2\theta\,d\phi^2. \tag{13.2}$$

Theorem 13.1 Birkhoff’s theorem

Schwarzschild metric is the unique vacuum solution with spherical symmetry. ♣

The proof can be found in section 5.2 of Spacetime and Geometry (Sean Carroll). Any spher-
ically symmetric vacuum metric possesses a timelike Killing vector. A metric that possesses
a Killing vector that is timelike near infinity is called stationary. A metric is called static if it
possesses a timelike Killing vector that is orthogonal to a family of hypersurfaces. An alter-
native definition of “static” is stationary, and invariant under time reversal. We should think
of stationary as meaning “doing exactly the same thing at every time,” while static means “not
doing anything at all.”
The Schwarzschild metric coefficients become infinite at $r = 0$ and $r = 2M$. Since the metric
coefficients are coordinate-dependent quantities, it is certainly possible to have a coordinate
singularity that results from a breakdown of a specific coordinate system rather than of the
underlying manifold. Direct calculation reveals that

$$R^{\mu\nu\rho\sigma}R_{\mu\nu\rho\sigma} = \frac{48M^2}{r^6}. \tag{13.3}$$

This is enough to convince us that r = 0 represents an honest singularity.

As for $r = 2M$, the Schwarzschild radius, we can check that none of the curvature
invariants blows up there. We therefore begin to think that it is actually not singular, and we have
simply chosen a bad coordinate system. The surface $r = 2M$ is locally well-behaved, but it is
globally significant: it demarcates the event horizon of a black hole.

13.1.2 Geodesics of Schwarzschild spacetime


There are four Killing vectors in Schwarzschild spacetime: three for the spherical symmetry,
and one for time translations. Each of these will lead to a constant of the motion for a free
particle. If $K_\mu$ is a Killing vector, we know that $K_\mu\,dx^\mu/d\lambda$ is a constant along the geodesic.
Here, we choose the affine parameter $\lambda$ of geodesics to make $dx^\mu/d\lambda$ the four-velocity for massive
particles and the four-momentum for massless particles. In addition, the quantity

$$\epsilon = -g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} \tag{13.4}$$
is constant along the path. For massive particles $\epsilon = 1$, while for massless particles $\epsilon = 0$.

We can think of the angular momentum as a three-vector with a magnitude and direction.
Conservation of the direction of angular momentum means that the particle will move in a
plane. We can choose this to be the equatorial plane $\theta = \pi/2$ of our coordinate system. The
two remaining Killing vectors correspond to energy and the magnitude of angular momentum.
The energy arises from the timelike Killing vector
$$K^\mu = (\partial_t)^\mu = (1, 0, 0, 0), \qquad K_\mu = \left(-\left(1 - \frac{2M}{r}\right), 0, 0, 0\right). \tag{13.5}$$

The Killing vector whose conserved quantity is the magnitude of the angular momentum is

$$R^\mu = (\partial_\phi)^\mu = (0, 0, 0, 1), \qquad R_\mu = (0, 0, 0, r^2\sin^2\theta). \tag{13.6}$$

The two conserved quantities are

$$E = -K_\mu\frac{dx^\mu}{d\lambda} = \left(1 - \frac{2M}{r}\right)\frac{dt}{d\lambda}, \qquad L = R_\mu\frac{dx^\mu}{d\lambda} = r^2\frac{d\phi}{d\lambda}. \tag{13.7}$$

For massless particles, these can be thought of as the conserved energy and angular
momentum, while for massive particles they are the conserved energy and angular momentum per
unit mass of the particle.

After some algebraic manipulations, we can get a single equation for $r(\lambda)$,
$$\frac12\left(\frac{dr}{d\lambda}\right)^2 + V(r) = \mathcal E, \tag{13.8}$$

where
$$V(r) = \frac12\epsilon - \epsilon\frac{M}{r} + \frac{L^2}{2r^2} - \frac{ML^2}{r^3} \quad\text{and}\quad \mathcal E = \frac12E^2. \tag{13.9}$$
2 r 2r r 2

[Figure: effective potential V(r) against r/M for L = M, 2M, 3M, 4M, 5M; left panel: massless particles; right panel: massive particles.]

Figure 13.1: Effective potentials for particles in Schwarzschild spacetime. There is an
innermost circular orbit at a radius greater than or equal to 3M, and any geodesic that falls
inside this radius continues to r = 0.

In general relativity, at $r = 2M$ the potential is always zero; inside this radius is the black
hole. For massless particles there is always a barrier (except for $L = 0$, for which the potential
vanishes identically), but a sufficiently energetic photon will nevertheless go over the barrier
and be dragged inexorably down to the center. At the top of the barrier are unstable circular
orbits at $r_c = 3M$.
For massive particles, the circular orbits are at
$$r_c = \frac{L^2 \pm \sqrt{L^4 - 12M^2L^2}}{2M}. \tag{13.10}$$
For large $L$ there will be two circular orbits, one stable and one unstable. In the $L \to \infty$ limit
their radii are given by $r_c = (L^2/M, 3M)$. In this limit the stable circular orbit becomes farther
away, while the unstable one approaches $3M$, behaviour that parallels the massless case. As
we decrease $L$, the two circular orbits come closer together; they coincide when $L = \sqrt{12}M$,
for which $r_c = 6M$. We have therefore found that the Schwarzschild solution possesses stable
circular orbits for $r > 6M$ and unstable circular orbits for $3M < r < 6M$. It is important to
remember that these are only the geodesics; there is nothing to stop an accelerating particle
from dipping below $r = 3M$ and emerging, as long as it stays beyond $r = 2M$.
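The statements about circular orbits can be checked numerically from the effective potential (13.9) and the roots (13.10). The sketch below is a minimal illustration in units where $M = 1$ by default; the function names and the small tolerance used to absorb rounding at the marginal case $L = \sqrt{12}M$ are assumptions made here.

```python
import math

def V(r, L, M=1.0, eps=1.0):
    """Effective potential (13.9); eps = 1 for massive, 0 for massless particles."""
    return 0.5 * eps - eps * M / r + L**2 / (2 * r**2) - M * L**2 / r**3

def dV(r, L, M=1.0, eps=1.0, h=1e-5):
    """Central-difference estimate of dV/dr."""
    return (V(r + h, L, M, eps) - V(r - h, L, M, eps)) / (2 * h)

def circular_orbits(L, M=1.0):
    """Radii (stable, unstable) of massive-particle circular orbits from (13.10).

    Returns None when L^2 < 12 M^2, i.e. when no circular orbit exists.
    """
    disc = L**4 - 12 * M**2 * L**2
    if disc < -1e-9 * L**4:              # tolerance for floating-point rounding
        return None
    root = math.sqrt(max(disc, 0.0))
    return (L**2 + root) / (2 * M), (L**2 - root) / (2 * M)

r_stable, r_unstable = circular_orbits(4.0)      # L = 4M
r_isco, _ = circular_orbits(math.sqrt(12.0))     # marginal case
print(r_stable, r_unstable, r_isco, V(2.0, 4.0), dV(3.0, 4.0, eps=0.0))
```

With $L = 4M$ the two roots are $12M$ (a minimum of $V$) and $4M$ (a maximum), and the massless potential has its extremum at $3M$ for any $L$.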
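As for a general non-circular orbit of a massive particle ($\epsilon = 1$), the equation of the orbit satisfies
$$\left(\frac{dr}{d\phi}\right)^2 + \frac{1}{L^2}r^4 - \frac{2M}{L^2}r^3 + r^2 - 2Mr = \frac{2\mathcal E}{L^2}r^4. \tag{13.11}$$

Define $x \equiv L^2/(Mr)$. It follows that

$$\left(\frac{dx}{d\phi}\right)^2 + \frac{L^2}{M^2} - 2x + x^2 - \frac{2M^2}{L^2}x^3 = \frac{2\mathcal EL^2}{M^2}. \tag{13.12}$$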

We take the derivative of 13.12 with respect to $\phi$ to get

$$\frac{d^2x}{d\phi^2} - 1 + x = \frac{3M^2}{L^2}x^2. \tag{13.13}$$

In a Newtonian calculation, the last term would be absent, and we could solve for $x$ exactly;
here, we suppose the orbit is far from $r = 2M$ and treat the last term as a perturbation. We
expand $x$ into a Newtonian solution plus a small deviation: $x = x_0 + x_1$. The solution of the
zeroth-order equation is
$$x_0 = 1 + e\cos\phi. \tag{13.14}$$
Then the first-order equation becomes

$$\frac{d^2x_1}{d\phi^2} + x_1 = \frac{3M^2}{L^2}(1 + e\cos\phi)^2. \tag{13.15}$$

Keeping only the part of the solution that grows secularly with $\phi$, we get

$$x = 1 + e\cos[(1 - \alpha)\phi], \quad\text{where}\quad \alpha = \frac{3M^2}{L^2}. \tag{13.16}$$
During each orbit, the perihelion advances by an angle

$$\Delta\phi = 2\pi\alpha = \frac{6\pi M^2}{L^2}. \tag{13.17}$$
An ordinary ellipse satisfies
$$r = \frac{(1 - e^2)a}{1 + e\cos\phi}, \tag{13.18}$$
where $a$ is the semi-major axis. Comparing to our zeroth-order solution and the definition of
$x$, we see that
$$L^2 \approx M(1 - e^2)a. \tag{13.19}$$
Hence, we have
$$\Delta\phi = \frac{6\pi M}{(1 - e^2)a}. \tag{13.20}$$
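As a worked number for (13.20), the sketch below evaluates the perihelion advance of Mercury. The orbital elements and the solar mass in geometrized units are standard approximate values supplied here as assumptions; they do not appear in the notes.

```python
import math

# Approximate values, assumed for illustration.
GM_SUN_OVER_C2 = 1476.6   # G*M_sun/c^2 in metres
A_MERCURY = 5.7909e10     # semi-major axis in metres
E_MERCURY = 0.2056        # orbital eccentricity
PERIOD_DAYS = 87.969      # orbital period in days

def perihelion_advance_per_orbit(M, a, e):
    """Equation (13.20): advance in radians per orbit, with M and a in the same length unit."""
    return 6 * math.pi * M / ((1 - e**2) * a)

dphi = perihelion_advance_per_orbit(GM_SUN_OVER_C2, A_MERCURY, E_MERCURY)
orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_century = math.degrees(dphi * orbits_per_century) * 3600.0
print(dphi, arcsec_per_century)
```

The result is close to 43 arcseconds per century, the relativistic contribution to Mercury's precession.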

Figure 13.2: Perihelion of Mercury.



13.1.3 Penrose diagram and event horizon


In theoretical physics, a Penrose diagram is a two-dimensional diagram capturing the causal
relations between different points in spacetime. It is an extension of a Minkowski diagram
where the vertical dimension represents time, and the horizontal dimension represents space,
and slanted lines at an angle of 45◦ correspond to light rays. The biggest difference is that lo-
cally, the metric on a Penrose diagram is conformally equivalent to the actual metric in space-
time. The conformal factor is chosen such that the entire infinite spacetime is transformed
into a Penrose diagram of finite size. For spherically symmetric spacetime, every point in the
diagram corresponds to a 2-sphere.
Take the Minkowski spacetime as an example. The metric of Minkowski spacetime is

$$ds^2 = -dt^2 + dr^2 + r^2\,d\Omega^2. \tag{13.21}$$

Introduce new coordinates $\psi$ and $\xi$ by

$$t + r = \tan\frac{\psi + \xi}{2}, \qquad t - r = \tan\frac{\psi - \xi}{2}. \tag{13.22}$$
The metric in the new coordinates will be
$$ds^2 = \frac{-d\psi^2 + d\xi^2}{4\cos^2\frac{\psi + \xi}{2}\cos^2\frac{\psi - \xi}{2}} + r^2(\psi, \xi)\,d\Omega^2. \tag{13.23}$$

The ranges of the new coordinates are $0 \le \xi < \pi$ and $|\psi| + \xi < \pi$. The corresponding Penrose
diagram of Minkowski spacetime is shown in Figure 13.3.
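The compactification (13.22) is easy to experiment with. The sketch below inverts it with the arctangent and checks that far-away events land inside the stated finite range; the function name `penrose_coordinates` and the sample events are assumptions for illustration.

```python
import math

def penrose_coordinates(t, r):
    """Map Minkowski (t, r >= 0) to (psi, xi) by inverting (13.22)."""
    U = math.atan(t + r)   # equals (psi + xi)/2
    V = math.atan(t - r)   # equals (psi - xi)/2
    return U + V, U - V

samples = [(0.0, 0.0), (0.5, 2.0), (0.0, 1e9), (1e6, 0.0), (-1e6, 3.0)]
for t, r in samples:
    psi, xi = penrose_coordinates(t, r)
    # every event satisfies 0 <= xi < pi and |psi| + xi < pi
    print((t, r), (psi, xi), abs(psi) + xi < math.pi)
```

Events at large $r$ with $t$ fixed approach $(\psi, \xi) = (0, \pi)$, and events at late times with $r$ fixed approach $(\pi, 0)$, so the infinite spacetime fits in a finite triangle.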

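[Figure: Penrose diagram of Minkowski spacetime, with i+, i0, i−, I+, I− and the line r = 0 marked, together with curves of t = constant and r = constant.]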

Figure 13.3: Penrose diagram of Minkowski spacetime.

It is obvious that light cones are still 45 degree lines in this Penrose diagram. What is more,
points and regions located at infinite distances in the original coordinates have now been

brought to a finite distance. The figure indicates a set of different kinds of “infinity” that is
useful in the discussion of physical phenomena. The following list gives the definition of each
of these:

$i^+$ = future timelike infinity ($\psi = \pi$, $\xi = 0$)
$i^0$ = spatial infinity ($\psi = 0$, $\xi = \pi$)
$i^-$ = past timelike infinity ($\psi = -\pi$, $\xi = 0$)
$\mathscr I^+$ = future null infinity ($\psi = \pi - \xi$, $0 < \xi < \pi$)
$\mathscr I^-$ = past null infinity ($\psi = -\pi + \xi$, $0 < \xi < \pi$)

Black holes are characterized by the fact that you can enter them, but never exit. Thus, their
most important feature is actually not the singularity at the center, but the event horizon at
the boundary. An event horizon is a hypersurface separating those spacetime points that are
connected to infinity by a timelike path from those that are not. In general relativity, the global
structure of spacetime can take many different forms, with correspondingly different notions
of infinity. But to think about black holes in the real universe, we use infinity as a proxy for
“well outside the black hole,” and imagine that spacetime sufficiently far away from the hole
can be approximated by Minkowski space.

[Figure: Penrose diagram with the future event horizon, the past event horizon, and the rest of the spacetime labelled.]

Figure 13.4: Penrose diagram of event horizon.

From Penrose diagram 13.4, the future event horizon is the surface beyond which timelike
curves cannot escape to infinity. Recalling that the causal past $J^-$ of a region is the set of all points we
can reach from that region by moving along past-directed causal (timelike or null) paths, the event horizon
can be equivalently defined as the boundary of $J^-(\mathscr I^+)$, the causal past of future null infinity.
Analogous definitions hold for the past horizon. From the definition, it is clear that the event
horizon is a null hypersurface.

13.1.4 The maximally extended Schwarzschild solution


Let us consider radial null curves, those for which θ and ϕ are constant and ds² = 0. We get
\[
\frac{dt}{dr} = \pm\left(1 - \frac{2M}{r}\right)^{-1}. \tag{13.24}
\]

For large r the slope is ±45°, as it would be in flat space, while as we approach r = 2M we get
dt/dr = ±∞, and the light cones “close up”, as shown in Figure 13.5. Thus a light ray that
approaches r = 2M never seems to get there, at least in this coordinate system; instead it
seems to asymptote to this radius.
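
The divergence can be made concrete with a small numerical sketch. It assumes the closed-form integral of (13.24) for an outgoing ray, t = r + 2M ln(r − 2M) + constant, which is a standard result not derived in these notes; the function names are illustrative only.

```python
import math


def null_ray_dt_dr(r, M=1.0):
    """Slope dt/dr of an outgoing radial null ray, equation (13.24) with the + sign."""
    return 1.0 / (1.0 - 2.0 * M / r)


def coordinate_time_between(r_in, r_out, M=1.0):
    """Coordinate time for an outgoing radial null ray to go from r_in to r_out.

    Closed form of the integral of dt/dr = r / (r - 2M), valid for 2M < r_in < r_out.
    """
    return (r_out - r_in) + 2.0 * M * math.log((r_out - 2.0 * M) / (r_in - 2.0 * M))


if __name__ == "__main__":
    # The closer the starting point is to r = 2M, the longer the coordinate time.
    for eps in (1e-1, 1e-3, 1e-6):
        print(eps, null_ray_dt_dr(2.0 + eps), coordinate_time_between(2.0 + eps, 10.0))
```

The coordinate time grows only logarithmically as the starting radius approaches 2M, which is why the trajectory in the t − r plane asymptotes to the horizon instead of crossing it.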

Figure 13.5: In Schwarzschild coordinates, light cones appear to close up as we approach
r = 2M (the event horizon).

If we stayed outside while an intrepid observational general relativist dove into the black hole,
sending back signals all the time, we would simply see the signals reach us more and more
slowly, as shown in Figure 13.6.

Figure 13.6: A beacon falling freely into a black hole emits signals at intervals of constant
proper time. An observer at fixed r receives the signals at successively longer time intervals.

The fact that we never see the infalling observer reach r = 2M is a meaningful statement, but
the fact that their trajectory in the t − r plane never reaches there is not. It is highly dependent
on our coordinate system, and we would like to ask a more coordinate-independent question
(such as, “Does the observer reach this radius in a finite amount of their proper time?”). The
best way to do this is to change coordinates to a system that is better behaved at r = 2M .

We omit the intermediate steps of the coordinate transformation and move to Kruskal coordinates
directly. The new coordinates are given by
\[
T = \pm\left(\frac{r}{2M} - 1\right)^{1/2} e^{r/4M} \sinh\left(\frac{t}{4M}\right), \quad
R = \pm\left(\frac{r}{2M} - 1\right)^{1/2} e^{r/4M} \cosh\left(\frac{t}{4M}\right) \tag{13.25}
\]
for r > 2M and
\[
T = \pm\left(1 - \frac{r}{2M}\right)^{1/2} e^{r/4M} \cosh\left(\frac{t}{4M}\right), \quad
R = \pm\left(1 - \frac{r}{2M}\right)^{1/2} e^{r/4M} \sinh\left(\frac{t}{4M}\right) \tag{13.26}
\]
for r < 2M. The metric in the new coordinates becomes
\[
ds^2 = \frac{32M^3}{r} e^{-r/2M}\left(-dT^2 + dR^2\right) + r^2 d\Omega^2. \tag{13.27}
\]

We can now draw a spacetime diagram in the T − R plane, known as a Kruskal diagram,
representing the maximal extension of the Schwarzschild geometry, as shown in Figure 13.7.
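
As a rough numerical check of the transformation, the sketch below evaluates the + branch of (13.25) and (13.26) and confirms that T² − R² = (1 − r/2M) e^{r/2M}, which is below 1 for every r > 0. It also inverts the compactification T ± R = tan((ψ ± ξ)/2) used for the Penrose diagram. The function names are illustrative choices, not part of the notes.

```python
import math


def kruskal(t, r, M=1.0):
    """Kruskal coordinates (T, R) on the + branch of equations (13.25) and (13.26)."""
    if r <= 0.0 or r == 2.0 * M:
        raise ValueError("need r > 0 and r != 2M")
    amplitude = math.sqrt(abs(r / (2.0 * M) - 1.0)) * math.exp(r / (4.0 * M))
    s = math.sinh(t / (4.0 * M))
    c = math.cosh(t / (4.0 * M))
    if r > 2.0 * M:
        return amplitude * s, amplitude * c  # exterior, equation (13.25)
    return amplitude * c, amplitude * s      # interior, equation (13.26)


def penrose(T, R):
    """Solve T + R = tan((psi + xi)/2), T - R = tan((psi - xi)/2) for (psi, xi)."""
    u = math.atan(T + R)
    v = math.atan(T - R)
    return u + v, u - v


if __name__ == "__main__":
    for t, r in [(0.3, 3.0), (-1.2, 1.0)]:
        T, R = kruskal(t, r)
        print((t, r), (T, R), T * T - R * R, penrose(T, R))
```

Curves of constant r are hyperbolae T² − R² = constant in the Kruskal plane, and the singularity r = 0 sits at T² − R² = 1.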

Figure 13.7: The Kruskal diagram – the Schwarzschild solution in Kruskal coordinates, where
all light cones are at ±45°. Labels in the figure: r = 0, the lines r = 2M with t = ±∞, and
curves of constant r and constant t.

We can further transform the coordinates to bring them into a finite range and get the Penrose
diagram of Schwarzschild spacetime. The transformation is given by
\[
T + R = \tan\frac{\psi + \xi}{2}, \quad T - R = \tan\frac{\psi - \xi}{2}. \tag{13.28}
\]
The range of ψ and ξ is constrained by
\[
\tan\frac{\psi + \xi}{2}\,\tan\frac{\psi - \xi}{2} = T^2 - R^2 < 1. \tag{13.29}
\]

The diagram is shown in Figure 13.8.


Figure 13.8: The Penrose diagram for the Schwarzschild spacetime. The diagonal lines in the
figure are labelled “Horizon”.

13.2 Reissner-Nordström black holes


In physics and astronomy, the Reissner–Nordström (RN) metric is a static solution to the
Einstein–Maxwell field equations, which corresponds to the gravitational field of a charged,
non-rotating, spherically symmetric body of mass M. The solution can be obtained by using
the fact that $T^{\mu}{}_{\mu} = 0$ and $T^{\mu\alpha} T_{\alpha\nu} \propto \delta^{\mu}_{\nu}$ always hold for the EM field. The metric in
spherical coordinates {t, r, θ, ϕ} is
\[
ds^2 = -\Delta\, dt^2 + \Delta^{-1} dr^2 + r^2 d\Omega^2 \quad \text{where} \quad \Delta = 1 - \frac{2M}{r} + \frac{Q^2}{4\pi r^2}. \tag{13.30}
\]
The EM field is
\[
A_\alpha = \left(-\frac{Q}{4\pi r}, 0, 0, 0\right). \tag{13.31}
\]
The RN metric has a true curvature singularity at r = 0, as can be checked by computing
the curvature invariant $R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma}$. The event horizon of the RN metric is determined by
$g^{rr} = \Delta = 0$. The Penrose diagram for the RN black hole is shown in Figure 13.9.
If M² > Q²/4π, the metric coefficient Δ is positive at large r and small r, and negative between
the two vanishing points $r_\pm = M \pm \sqrt{M^2 - Q^2/4\pi}$. The metric has coordinate singularities
at both r₊ and r₋; in both cases these could be removed by a change of coordinates as we did
with Schwarzschild. The surfaces defined by r± are both null, and they are both event horizons.
The singularity at r = 0 is a timelike line, not a spacelike surface as in Schwarzschild. If you are
an observer falling into the black hole from far away, r+ is just like 2M in the Schwarzschild
metric; at this radius r switches from being a spacelike coordinate to a timelike coordinate, and
you necessarily move in the direction of decreasing r. Witnesses outside the black hole also
see the same phenomena that they would outside an uncharged hole – the infalling observer
is seen to move more and more slowly, and is increasingly redshifted.
But the inevitable fall from r+ to ever-decreasing radii only lasts until you reach the null surface
r = r− , where r switches back to being a spacelike coordinate and the motion in the direction
of decreasing r can be arrested. You can choose either to continue on to r = 0, or begin to
move in the direction of increasing r back through the null surface at r = r− . Then r will
once again be a timelike coordinate, but with reversed orientation; you are forced to move in
the direction of increasing r. You will eventually be spit out past r = r+ once more, which is
like emerging from a white hole into the rest of the universe. From here you can choose to go
back into a different hole and repeat the voyage as many times as you like.

Figure 13.9: Penrose diagram of RN spacetime, with regions labelled I, II and III. Panel titles
as printed: (a) Ayón-Beato-García black hole spacetime, (b) Extremal Ayón-Beato-García,
(c) No black hole.

If M² = Q²/4π, the extremal black holes have Δ = 0 at a single radius, r = M. This
represents an event horizon, but the r coordinate is never timelike; it becomes null at r = M,
but is spacelike on either side. The singularity at r = 0 is a timelike line, as in the other cases.
Thus for this black hole you can again avoid the singularity and continue to move to the future
to extra copies of the asymptotically flat region, but the singularity is always “to the left”.

If M² < Q²/4π, Δ is always positive and the metric is completely regular in the coordinates
(t, r, θ, ϕ) all the way down to r = 0. The coordinate t is always timelike, and r is always
spacelike. But still there is the singularity at r = 0, which is now a timelike line. Since there
is no event horizon, there is no obstruction to an observer traveling to the singularity and
returning to report on what was observed. This is a naked singularity. A careful analysis of
the geodesics reveals that the singularity is repulsive: timelike geodesics never intersect r = 0;
instead they approach and then reverse course and move away. Null geodesics can reach the
singularity, as can nongeodesic timelike curves. As r → ∞ the solution approaches flat
spacetime, and as we have just seen the causal structure seems normal everywhere. The
conformal diagram will therefore be just like that of Minkowski space, except that now r = 0
is a singularity.
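
The three cases can be told apart by the discriminant of r²Δ = r² − 2Mr + Q²/4π. The sketch below is a minimal illustration; the function names and the tolerance `tol` used to detect the extremal case are assumptions made for this example.

```python
import math


def delta_rn(r, M, Q):
    """Metric function Delta of equation (13.30)."""
    return 1.0 - 2.0 * M / r + Q * Q / (4.0 * math.pi * r * r)


def rn_horizons(M, Q, tol=1e-12):
    """Radii where Delta = 0, i.e. the real roots of r^2 - 2 M r + Q^2/(4 pi) = 0.

    Returns () for a naked singularity, (M,) for the extremal case and
    (r_minus, r_plus) otherwise. The tolerance is a hypothetical choice.
    """
    disc = M * M - Q * Q / (4.0 * math.pi)
    if disc < -tol:
        return ()
    if disc <= tol:
        return (M,)
    root = math.sqrt(disc)
    return (M - root, M + root)


if __name__ == "__main__":
    print(rn_horizons(1.0, 0.0))                          # Schwarzschild limit
    print(rn_horizons(1.0, 2.0))                          # two horizons
    print(rn_horizons(1.0, math.sqrt(4.0 * math.pi)))     # extremal
    print(rn_horizons(1.0, 10.0))                         # naked singularity
```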

13.3 Kerr black holes


13.3.1 Kerr metric
The Kerr metric describes the geometry of empty spacetime around a rotating, uncharged,
axially symmetric black hole with a quasispherical event horizon. The metric is
\[
\begin{aligned}
ds^2 = {} & -\left(1 - \frac{2Mr}{\rho^2}\right) dt^2 - \frac{2Mar\sin^2\theta}{\rho^2}\,(dt\, d\phi + d\phi\, dt) + \frac{\rho^2}{\Delta}\, dr^2 \\
& + \rho^2 d\theta^2 + \frac{\sin^2\theta}{\rho^2}\left[(r^2 + a^2)^2 - a^2 \Delta \sin^2\theta\right] d\phi^2,
\end{aligned}
\]
where
\[
\Delta(r) = r^2 - 2Mr + a^2, \quad \rho^2(r, \theta) = r^2 + a^2\cos^2\theta. \tag{13.32}
\]
We can show that a is the angular momentum per unit mass of the black hole.

It is straightforward to check that as a → 0 the coordinates (t, r, θ, ϕ) reduce to Schwarzschild
coordinates. If we keep a fixed and let M → 0, we recover flat spacetime but not in ordinary
polar coordinates. The metric becomes
\[
ds^2 = -dt^2 + \frac{r^2 + a^2\cos^2\theta}{r^2 + a^2}\, dr^2 + (r^2 + a^2\cos^2\theta)\, d\theta^2 + (r^2 + a^2)\sin^2\theta\, d\phi^2, \tag{13.33}
\]
and we recognize the spatial part of this as flat space in ellipsoidal coordinates. They are related
to Cartesian coordinates in Euclidean 3-space by
\[
x = (r^2 + a^2)^{1/2}\sin\theta\cos\phi, \quad y = (r^2 + a^2)^{1/2}\sin\theta\sin\phi, \quad z = r\cos\theta. \tag{13.34}
\]
The surface r = 0 is a two-dimensional disk; the intersection of r = 0 with θ = π/2 is the ring
at the boundary of this disk.
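
The claim that (13.34) describes flat space can be checked numerically. The sketch below pulls the Euclidean metric back through the map (13.34) with a finite-difference Jacobian and compares the result with the diagonal components g_rr = ρ²/(r² + a²), g_θθ = ρ², g_ϕϕ = (r² + a²) sin²θ, which are the M → 0 limit of the Kerr components. The step size `h` and all function names are assumptions of this example.

```python
import math


def cartesian(q, a):
    """Map ellipsoidal coordinates q = (r, theta, phi) to (x, y, z), equation (13.34)."""
    r, th, ph = q
    s = math.sqrt(r * r + a * a)
    return (s * math.sin(th) * math.cos(ph), s * math.sin(th) * math.sin(ph), r * math.cos(th))


def induced_metric(q, a, h=1e-5):
    """Pull back the Euclidean metric using a central-difference Jacobian."""
    columns = []
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        x_plus, x_minus = cartesian(q_plus, a), cartesian(q_minus, a)
        columns.append([(x_plus[k] - x_minus[k]) / (2.0 * h) for k in range(3)])
    return [[sum(columns[i][k] * columns[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def expected_metric(q, a):
    """Spatial part of the M -> 0 Kerr metric in the same coordinates."""
    r, th, _ = q
    rho2 = r * r + a * a * math.cos(th) ** 2
    return [[rho2 / (r * r + a * a), 0.0, 0.0],
            [0.0, rho2, 0.0],
            [0.0, 0.0, (r * r + a * a) * math.sin(th) ** 2]]


if __name__ == "__main__":
    point = (1.3, 0.7, 2.1)
    print(induced_metric(point, a=0.8))
    print(expected_metric(point, a=0.8))
```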

From the form of the Kerr metric, it is obvious that the metric becomes ill-defined at ρ = 0
and at Δ = 0. The calculation of scalar invariants of the curvature tensor shows that ρ = 0 is
indeed a physical singularity. The condition ρ = 0 corresponds to
\[
\rho^2 = r^2 + a^2\cos^2\theta = 0, \tag{13.35}
\]
which can only be satisfied with θ = π/2 and r = 0. Hence we have a ring-like singularity
in the case of a Kerr metric. The curvature invariants are well behaved at Δ = 0. The condition
Δ(r) = 0 is equivalent to $g^{rr} = 0$, so Δ(r) = 0 defines a null surface. The quadratic equation
Δ = 0 has two roots if |a| < M,
\[
r_h = M \pm \sqrt{M^2 - a^2}, \tag{13.36}
\]
representing inner and outer horizons in Kerr spacetime. The corresponding Penrose diagram
is shown in Figure 13.10.

Figure 13.10: The Penrose diagram for the Kerr spacetime for the case M > |a|. Labels in the
figure: closed timelike curve, anti-universe, ring singularity, future internal event horizon.

There are two Killing vectors of the metric, $K^\mu = (\partial_t)^\mu$ and $R^\mu = (\partial_\phi)^\mu$. The norms of these
Killing vectors are scalar quantities with a coordinate-free, geometrical interpretation. This
allows us to represent three of the metric coefficients in the form
\[
\begin{aligned}
K_\mu K^\mu &= g_{tt} = -\frac{\Delta - a^2\sin^2\theta}{\rho^2}, \\
K_\mu R^\mu &= g_{t\phi} = \frac{a\sin^2\theta\,(\Delta - a^2 - r^2)}{\rho^2}, \\
R_\mu R^\mu &= g_{\phi\phi} = \frac{\left[(r^2 + a^2)^2 - \Delta a^2\sin^2\theta\right]\sin^2\theta}{\rho^2}.
\end{aligned} \tag{13.37}
\]

Let us consider the surface defined by the quadratic equation $g_{tt} = 0$. Consider a class of
observers with four-velocity $u^\mu$ in the direction of the timelike Killing vector $K^\mu$. For any
photon with four-momentum $p^\mu$ propagating in this spacetime, the observer with four-velocity
$u^\mu$ will attribute to it a frequency
\[
\omega = -p_\mu u^\mu = -\frac{p_\mu K^\mu}{\sqrt{-K^\mu K_\mu}} = \frac{E}{\sqrt{-g_{tt}}}, \tag{13.38}
\]
where $E = -p_\mu K^\mu$ is the conserved “energy” of the photon. Thus it is easy to see that the
surface with $g_{tt} = 0$ corresponds to infinite redshift. For the Kerr metric, the equation
$g_{tt} = 0$ also has two solutions, r = r±, given by
\[
r_\pm = M \pm \sqrt{M^2 - a^2\cos^2\theta}. \tag{13.39}
\]

Physically, this corresponds to a surface of infinite redshift usually called an ergosurface. The
region between outer ergosurface and horizon is called the ergosphere. The geometrical struc-
ture of Kerr black hole is shown in Figure 13.11.
Figure 13.11: Schematic picture showing the geometrical structure of the Kerr spacetime.
Labels in the figure: inner ergosurface, outer ergosurface, outer event horizon, inner event
horizon, ergoregion, ring singularity.
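
The relative positions of these surfaces follow directly from (13.36) and (13.39). The sketch below is an illustration with assumed function names; it evaluates both pairs of radii and checks that g_tt vanishes on the outer ergosurface, which touches the outer horizon at the poles and reaches r = 2M on the equator.

```python
import math


def kerr_horizons(M, a):
    """Inner and outer horizon radii, equation (13.36); requires |a| <= M."""
    root = math.sqrt(M * M - a * a)
    return M - root, M + root


def kerr_ergosurfaces(M, a, theta):
    """Inner and outer ergosurface radii at polar angle theta, equation (13.39)."""
    root = math.sqrt(M * M - (a * math.cos(theta)) ** 2)
    return M - root, M + root


def g_tt(r, theta, M, a):
    """Kerr metric component g_tt = -(1 - 2 M r / rho^2)."""
    rho2 = r * r + (a * math.cos(theta)) ** 2
    return -(1.0 - 2.0 * M * r / rho2)


if __name__ == "__main__":
    M, a = 1.0, 0.9
    print("horizons:", kerr_horizons(M, a))
    for theta in (0.0, math.pi / 4, math.pi / 2):
        print(theta, kerr_ergosurfaces(M, a, theta))
```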

13.3.2 Static limit


An observer in the Kerr spacetime moving with a constant angular velocity and having fixed
values for r, θ will see the geometry to be unchanging. Such observers are called stationary
observers. If the observers also have a fixed value for ϕ, they are called static observers located
at fixed spatial coordinates.

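Consider a stationary observer with an angular velocity Ω in the Kerr spacetime with
\[
\Omega = \frac{d\phi}{dt} = \frac{u^\phi}{u^t}. \tag{13.40}
\]
Such an observer has a four-velocity $u^\mu = (u^t, 0, 0, \Omega u^t)$. The normalization $u_\mu u^\mu = -1$
reads $(u^t)^2 (g_{tt} + 2 g_{t\phi}\Omega + g_{\phi\phi}\Omega^2) = -1$, so we have
\[
g_{tt} + 2 g_{t\phi}\Omega + g_{\phi\phi}\Omega^2 < 0. \tag{13.41}
\]
This condition limits the range of values allowed for the angular velocity to
\[
\Omega_{\min} < \Omega < \Omega_{\max}, \tag{13.42}
\]
where
\[
\Omega_{\min} = \omega - \sqrt{\omega^2 - g_{tt}/g_{\phi\phi}}, \quad \Omega_{\max} = \omega + \sqrt{\omega^2 - g_{tt}/g_{\phi\phi}}, \tag{13.43}
\]
with
\[
\omega = -\frac{g_{\phi t}}{g_{\phi\phi}} = \frac{2Mar}{(r^2 + a^2)^2 - \Delta a^2\sin^2\theta}. \tag{13.44}
\]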

First, far away from the black hole, we have rΩ_min → −1 and rΩ_max → +1 in the equatorial
plane, which corresponds to the standard result that motion should be at a speed less than that
of light. Second, as one moves closer to the black hole, Ω_min increases due to the dragging of
the inertial frames. Eventually, Ω_min reaches zero at the surface on which $g_{tt} = 0$, which is
the ergosurface. Therefore, inside the ergosphere, all stationary observers must orbit the black
hole with Ω > 0, and hence static observers can exist only outside the ergosurface. Finally, as
one crosses the ergosurface and moves towards the event horizon, the allowed range of angular
velocities becomes ever more positive and narrows down. At the event horizon, Ω_min and
Ω_max coincide and all timelike worldlines point inwards. The limiting angular velocity is
given by
\[
\Omega_H \equiv \omega(r_h, \theta) = \frac{a}{2M r_h}. \tag{13.45}
\]
This limiting angular velocity is sometimes called the angular velocity of the horizon.
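
The behaviour of the allowed range can be traced numerically. The following sketch evaluates (13.43) and (13.44) from the Kerr metric components of (13.37); the function names are illustrative, and the square root is clamped at zero to absorb rounding error on the horizon, which is a choice made for this example.

```python
import math


def kerr_components(r, theta, M, a):
    """Return (g_tt, g_tphi, g_phiphi) of the Kerr metric, equation (13.37)."""
    s2 = math.sin(theta) ** 2
    rho2 = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    g_tt = -(1.0 - 2.0 * M * r / rho2)
    g_tphi = -2.0 * M * a * r * s2 / rho2
    g_phiphi = s2 * ((r * r + a * a) ** 2 - a * a * delta * s2) / rho2
    return g_tt, g_tphi, g_phiphi


def omega_range(r, theta, M, a):
    """Allowed angular velocities (Omega_min, Omega_max), equation (13.43)."""
    g_tt, g_tphi, g_phiphi = kerr_components(r, theta, M, a)
    omega = -g_tphi / g_phiphi
    disc = omega * omega - g_tt / g_phiphi
    root = math.sqrt(max(disc, 0.0))  # clamp tiny negative rounding errors
    return omega - root, omega + root


if __name__ == "__main__":
    M, a = 1.0, 0.9
    r_h = M + math.sqrt(M * M - a * a)
    for r in (1e6, 10.0, 2.0, 1.7, r_h):
        print(r, omega_range(r, math.pi / 2, M, a))
```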

13.3.3 Penrose process and the area of the event horizon


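Since the Kerr metric has a Killing vector field $K^\mu$ that is timelike at large r, any particle
moving on a geodesic will have a conserved energy given by $E = -p_\mu K^\mu = -p_t$. Outside the
ergosphere $K^\mu$ is timelike, so E is positive for every future-directed timelike $p^\mu$. Inside the
ergosphere $K^\mu$ is spacelike ($g_{tt} > 0$) and E can take either sign. Hence the orbit must be
inside the ergosphere if the energy is to be negative.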

This result can be used to extract energy from the Kerr black hole in several ways, of which the
simplest one is the following. Consider, for example, a particle A moving in the ergosphere
which breaks into two particles B and C. We let particle B fall into the black hole and let
particle C escape to infinity. All this can be done using suitable timelike trajectories. The
conservation of four-momentum requires that

EA = EB + EC . (13.46)

Since the particle A can fall into the ergosphere from infinity, we have EA > m. We can arrange
the trajectory of B making EB < 0. It immediately follows that EC > EA . When the particle
C goes back to infinity, it will have more energy than the original particle had. Thus, using the
existence of negative energy orbits in the ergosphere and the local conservation of energy for
processes taking place in the ergoregion, one can extract energy from the black hole.

The Penrose process decreases both the mass and the angular momentum of the Kerr black
hole by an amount equal to the (negative of) the energy and the angular momentum of the
particle B that falls into the black hole. Consider the dot product of $p^\mu$ with the vector
$K^\mu + \Omega_H R^\mu$. Since $K^\mu + \Omega_H R^\mu$ is future-directed and becomes null on the horizon, and
we want particle B to fall into the horizon, it is necessary that this dot product is negative.
Using $p^\mu K_\mu = -E$, $p^\mu R_\mu = L$, where E and L are the conserved energy and angular
momentum of particle B, we get the condition $-E + \Omega_H L < 0$. When the particle B falls into
the black hole, the angular momentum and mass of a Kerr black hole will change by δJ = L
and δM = E. Hence the above bound translates into the result
\[
\delta M > \Omega_H\, \delta J. \tag{13.47}
\]

The surface area of the event horizon of a Kerr black hole is
\[
A = 4\pi (r_h^2 + a^2). \tag{13.48}
\]
We can verify that
\[
\delta A = \frac{8\pi a}{\Omega_H \sqrt{M^2 - a^2}}\,(\delta M - \Omega_H\, \delta J) > 0. \tag{13.49}
\]
This can be manipulated to read
\[
\delta M = \frac{\kappa}{8\pi}\, \delta A + \Omega_H\, \delta J, \tag{13.50}
\]
where
\[
\kappa = \frac{\sqrt{M^2 - a^2}}{2M\left(M + \sqrt{M^2 - a^2}\right)} \tag{13.51}
\]
is the surface gravity.
Quantum field theory in curved spacetime would show that equations 13.49 and 13.50 can be
given a thermodynamic interpretation with κ/2π acting as the temperature of the black hole
and A/4 acting as the entropy of the black hole.
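
Relation (13.50) can be tested with a finite-difference sketch. It assumes J = aM, as stated for the Kerr metric, and takes r_h to be the outer horizon; the function names and the sample perturbations are illustrative only.

```python
import math


def horizon_quantities(M, J):
    """Return (area, Omega_H, kappa) of a Kerr black hole with mass M and spin J = a M."""
    a = J / M
    root = math.sqrt(M * M - a * a)
    r_h = M + root                                   # outer horizon, equation (13.36)
    area = 4.0 * math.pi * (r_h * r_h + a * a)       # equation (13.48)
    omega_h = a / (2.0 * M * r_h)                    # equation (13.45)
    kappa = root / (2.0 * M * r_h)                   # equation (13.51)
    return area, omega_h, kappa


def first_law_residual(M, J, dM, dJ):
    """dM - (kappa/8pi) dA - Omega_H dJ for a small change (dM, dJ); should be second order."""
    area0, omega_h, kappa = horizon_quantities(M, J)
    area1, _, _ = horizon_quantities(M + dM, J + dJ)
    return dM - (kappa / (8.0 * math.pi) * (area1 - area0) + omega_h * dJ)


if __name__ == "__main__":
    print(first_law_residual(1.0, 0.5, 1e-6, 2e-6))
    # A Penrose-like step: mass and spin both decrease, yet the area grows
    # because dM > Omega_H dJ.
    before = horizon_quantities(1.0, 0.5)[0]
    after = horizon_quantities(1.0 - 1e-6, 0.5 - 1e-5)[0]
    print(after - before)
```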

13.3.4 Particle orbits in the Kerr metric


The orbits of particles in the Kerr metric can be studied by the Hamilton–Jacobi method.
However, the lack of spherical symmetry makes the nature of the orbits very complicated, and
closed-form solutions are not available in general. It is clear that radial motion will now be
possible only along the axis of symmetry and even planar motion will be possible only in the
equatorial plane. A derivation of the equations that govern the particle trajectories in the Kerr
metric is given in subsection 8.5.4 of Gravitation: Foundations and Frontiers (T. Padmanabhan).
The existence of stable circular orbits in the equatorial plane is of some practical interest in
astrophysics. It is generally believed that the matter in the accretion disks around astrophysical
black holes will be able to move towards the black hole in a series of approximately circular
orbits in the equatorial plane. In that case, the radius of the stable circular orbit closest to the
black hole and its energy are of interest in the astrophysics of accretion disks. For the motion in
the equatorial plane one can introduce an effective potential as in the case of the Schwarzschild
metric along the following lines. Setting θ = π/2 in the equations of motion gives the equation
for the radial motion,
\[
m^2 \left(\frac{dr}{d\tau}\right)^2 = \frac{[(r^2 + a^2)E - aL]^2}{r^4} - \frac{\Delta\,[(aE - L)^2 + m^2 r^2]}{r^4}. \tag{13.52}
\]
We can now define an effective potential U(r) such that the right-hand side of the above
equation vanishes when E = U. Hence the effective potential is the solution to the equation
\[
[(r^2 + a^2)U(r) - aL]^2 - \Delta\,[(aU(r) - L)^2 + m^2 r^2] = 0. \tag{13.53}
\]

The radii of stable circular orbits are determined by the minima of U(r); that is, by the
simultaneous solution to the equations E = U(r), U′(r) = 0. Among all the stable circular
orbits, we are interested in the innermost one. A fairly lengthy but straightforward calculation
shows that this orbital radius is the solution to the equation
\[
r^2 - 6Mr - 3a^2 + 8a\sqrt{Mr} = 0, \tag{13.54}
\]
which is quartic in √r. When a = 0 we get the standard result that r = 6M; as the rotation
parameter increases, the radius of the circular orbit decreases for the co-rotating orbit. This
shows that one can have stable circular orbits very close to the black hole in the case of rotating
black holes. The quantity (m − E)/m represents the fraction of the rest energy that can be
released when a particle falls from the innermost stable circular orbit into the black hole. In
the extreme case of a = M, this fraction is 1 − 1/√3, which is about 42 percent, while the
corresponding value for orbits in the Schwarzschild metric is only about 5.7 percent. This
higher efficiency could be of use in certain astrophysical scenarios.
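
The quoted numbers can be reproduced with a short bisection sketch on (13.54). The energy formula E/m = √(1 − 2M/(3r)) for the innermost stable circular orbit is a standard result that is assumed here and not derived in these notes; the function names and the bracket [M, 6M] are likewise choices made for this example.

```python
import math


def isco_radius(M, a, iterations=200):
    """Co-rotating root of r^2 - 6 M r - 3 a^2 + 8 a sqrt(M r) = 0 for 0 <= a <= M.

    The left-hand side is <= 0 at r = M and >= 0 at r = 6M, so bisection applies.
    """
    def f(r):
        return r * r - 6.0 * M * r - 3.0 * a * a + 8.0 * a * math.sqrt(M * r)

    lo, hi = M, 6.0 * M
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def accretion_efficiency(M, a):
    """Fraction (m - E)/m released at the ISCO, using the assumed formula for E/m."""
    r = isco_radius(M, a)
    return 1.0 - math.sqrt(1.0 - 2.0 * M / (3.0 * r))


if __name__ == "__main__":
    for a in (0.0, 0.5, 0.9, 1.0):
        print(a, isco_radius(1.0, a), accretion_efficiency(1.0, a))
```

Near a = M the root of (13.54) becomes a triple root in √r, so the bisection result is only accurate to a few parts in 10⁵ there; that is ample for the percentages quoted in the text.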
Chapter 14
Geometry of the Universe

14.1 Friedmann–Lemaître–Robertson–Walker metric


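Assuming a homogeneous and isotropic universe, the metric of spacetime will be of the form
\[
ds^2 = -dt^2 + a(t)^2 \left[\frac{dr^2}{1 - Kr^2} + r^2 d\Omega^2\right]. \tag{14.1}
\]
If we define
\[
\chi \equiv
\begin{cases}
\sin^{-1}(\sqrt{K}\, r)/\sqrt{K}, & K > 0 \\
r, & K = 0 \\
\sinh^{-1}(\sqrt{-K}\, r)/\sqrt{-K}, & K < 0
\end{cases}
\quad \text{and} \quad \eta \equiv \int \frac{dt}{a(t)}, \tag{14.2}
\]
the metric can then be rewritten as
\[
ds^2 = a(\eta)^2 \left[-d\eta^2 + d\chi^2 + r(\chi)^2 d\Omega^2\right]. \tag{14.3}
\]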

In coordinates (t, r, θ, ϕ), we have
\[
G_{tt} = \frac{3(\dot a^2 + K)}{a^2}, \quad G^{\mu}{}_{\mu} = -\frac{6(\dot a^2 + a\ddot a + K)}{a^2}. \tag{14.4}
\]
Suppose that the matter in the universe can be modeled as an ideal fluid, whose
energy-momentum tensor is $T^{\mu\nu} = (\rho + p)u^\mu u^\nu + p g^{\mu\nu}$. Contraction of the energy-momentum
tensor gives $T^{\mu}{}_{\mu} = -\rho + 3p$. If the matter is comoving with the coordinates, we
have $T_{tt} = \rho$. Einstein’s equations now take the form
\[
\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi}{3}\rho - \frac{K}{a^2}, \quad \frac{\ddot a}{a} = -\frac{4\pi}{3}(\rho + 3p), \tag{14.5}
\]
called Friedmann’s equations. Taking the time derivative of the first one, we get
\[
\frac{\ddot a}{a} = -\frac{4\pi}{3}\left(-\frac{d\rho}{d\ln a} - 2\rho\right). \tag{14.6}
\]
Comparing with the second Friedmann equation, it follows that
\[
\frac{d\rho}{d\ln a} = -3(p + \rho). \tag{14.7}
\]
If we have the equation of state p = kρ, the evolution of density will be $\rho \propto a^{-3(k+1)}$. Our
universe is mainly composed of dark energy, cold dark matter, baryons, and radiation. Their
properties are listed here:

radiation (photons, neutrinos): $p_R = \rho_R/3$ and $\rho_R \propto a^{-4}$

cold matter (baryons, cold dark matter): $p_{CM} = 0$ and $\rho_{CM} \propto a^{-3}$

dark energy: $p_{DE} = -\rho_{DE}$ and $\rho_{DE}$ = constant

The first Friedmann equation now becomes
\[
\frac{H^2}{H_0^2} = \Omega_{R0}\, a^{-4} + \Omega_{CM0}\, a^{-3} + \Omega_{DE0} + \Omega_{K0}\, a^{-2}. \tag{14.8}
\]
The subscript 0 denotes today’s value of a parameter, and we scale the coordinates to make
$a_0 = 1$. Here $H \equiv \dot a/a$, and $H_0$ is called Hubble’s constant. $\Omega_0$ is the ratio between the density
$\rho_0$ of some substance and the critical density $\rho_{c0} \equiv 3H_0^2/8\pi$. In particular, $\rho_{K0} = -3K/8\pi$ is
called the density of curvature energy, and we have $\Omega_{K0} = 1 - \Omega_{DE0} - \Omega_{CM0} - \Omega_{R0}$.
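
A small sketch of the two relations above: the scaling exponent implied by (14.7) for p = kρ, and the right-hand side of (14.8) with the curvature term fixed by the closure relation. The function names and the density parameters in the usage example are illustrative values, not measurements quoted in these notes.

```python
def density_exponent(k):
    """Exponent n in rho ∝ a**n for the equation of state p = k * rho."""
    return -3.0 * (1.0 + k)


def hubble_ratio_squared(a, omega_r0, omega_cm0, omega_de0):
    """(H/H0)^2 from equation (14.8); Omega_K0 follows from the closure relation."""
    omega_k0 = 1.0 - omega_r0 - omega_cm0 - omega_de0
    return omega_r0 * a ** -4 + omega_cm0 * a ** -3 + omega_de0 + omega_k0 * a ** -2


if __name__ == "__main__":
    print(density_exponent(1.0 / 3.0), density_exponent(0.0), density_exponent(-1.0))
    # Hypothetical parameter values, chosen only to exercise the function.
    for a in (1.0, 0.5, 0.1):
        print(a, hubble_ratio_squared(a, 1e-4, 0.3, 0.7))
```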

14.2 Observable quantities


Redshift
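Suppose there is a photon emitted from r = 0 at time t. The world line of the photon will
satisfy
\[
\left(\frac{dt}{dr}\right)^2 = \frac{a^2}{1 - Kr^2}, \quad \theta = \text{constant}, \quad \phi = \text{constant}. \tag{14.9}
\]
In terms of four-momentum, we have
\[
\left(\frac{p^t}{p^r}\right)^2 = \frac{a^2}{1 - Kr^2}. \tag{14.10}
\]
On the other hand, we have the geodesic equation
\[
\frac{dp^t}{d\lambda} + \Gamma^t_{\alpha\beta}\, p^\alpha p^\beta = \frac{dp^t}{d\lambda} + \frac{a\dot a}{1 - Kr^2}\,(p^r)^2 = 0. \tag{14.11}
\]
Eliminating $p^r$ with (14.10) and using $p^t = dt/d\lambda$, it follows that
\[
\frac{dp^t}{p^t} + \frac{da}{a} = 0. \tag{14.12}
\]
Notice that $a_0 = 1$ and so $p^t_0 = a p^t$. Defining the cosmological redshift z by $p^t_0 = p^t/(1 + z)$,
we have the relation
\[
a = \frac{1}{1 + z}. \tag{14.13}
\]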

Luminosity
Suppose there is an object with intrinsic luminosity L at r = 0 and time t, and that the photons
it emits reach r = r₀ (our position) at time t₀ (now). In coordinates (η, χ), a radial light ray
satisfies dη = dχ. Hence
\[
\chi_0(z) = \int_t^{t_0} \frac{dt'}{a} = \int_0^z \frac{dz'}{H(z')}, \tag{14.14}
\]
where z is the redshift of the object, and
\[
r_0 \equiv
\begin{cases}
\sin\left(\sqrt{K}\,\chi_0\right)/\sqrt{K}, & K > 0 \\
\chi_0, & K = 0 \\
\sinh\left(\sqrt{-K}\,\chi_0\right)/\sqrt{-K}, & K < 0.
\end{cases} \tag{14.15}
\]
The area of the two-surface t = t₀, r = r₀ is $4\pi r_0^2$. In a time interval Δt, the object emits
ΔN photons, so L = ϵΔN/Δt, where ϵ is the energy of each photon. The time interval for the
receiver is Δt₀ = Δt/a (it is easy to verify in coordinates (η, χ) that Δη₀ = Δη). Taking into
account the redshift of the photons, the flux we measure is
\[
f = \frac{\epsilon\,\Delta N}{(1 + z)\,\Delta t_0\, 4\pi r_0^2} = \frac{L}{4\pi d_L^2} \quad \text{where} \quad d_L \equiv (1 + z)\, r_0. \tag{14.16}
\]

Size
Suppose there is an object with intrinsic size Δl at time t. Now we put ourselves at r = 0,
so the object lies on the two-surface with metric $d\sigma^2 = a(t)^2 r_0^2\, d\Omega^2$. The angle it subtends
relative to us satisfies $a r_0 \Delta\theta = \Delta l$. Hence we have
\[
\Delta\theta = \frac{\Delta l}{d_A} \quad \text{where} \quad d_A \equiv \frac{r_0}{1 + z}. \tag{14.17}
\]
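
The distance measures above can be evaluated with a short numerical sketch. It works in units where H₀ = 1 and a₀ = 1, uses K = −Ω_K0 H₀² (which follows from comparing (14.5) with (14.8)), and integrates (14.14) with Simpson's rule. The function names, the number of integration steps, and the sample parameters are assumptions of this example.

```python
import math


def hubble_over_h0(z, omega_r0, omega_cm0, omega_de0):
    """H(z)/H0 from equation (14.8) with a = 1/(1 + z)."""
    omega_k0 = 1.0 - omega_r0 - omega_cm0 - omega_de0
    x = 1.0 + z
    return math.sqrt(omega_r0 * x ** 4 + omega_cm0 * x ** 3 + omega_de0 + omega_k0 * x ** 2)


def comoving_chi(z, omega_r0, omega_cm0, omega_de0, steps=2000):
    """chi_0(z) in units of 1/H0, equation (14.14), by Simpson's rule (steps must be even)."""
    h = z / steps
    total = 1.0 / hubble_over_h0(0.0, omega_r0, omega_cm0, omega_de0)
    total += 1.0 / hubble_over_h0(z, omega_r0, omega_cm0, omega_de0)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble_over_h0(i * h, omega_r0, omega_cm0, omega_de0)
    return total * h / 3.0


def transverse_r(chi, omega_k0):
    """r_0 in units of 1/H0, equation (14.15), with K = -omega_k0."""
    K = -omega_k0
    if K > 0.0:
        return math.sin(math.sqrt(K) * chi) / math.sqrt(K)
    if K < 0.0:
        return math.sinh(math.sqrt(-K) * chi) / math.sqrt(-K)
    return chi


def distances(z, omega_r0, omega_cm0, omega_de0):
    """Return (d_L, d_A) in units of 1/H0, equations (14.16) and (14.17)."""
    omega_k0 = 1.0 - omega_r0 - omega_cm0 - omega_de0
    r0 = transverse_r(comoving_chi(z, omega_r0, omega_cm0, omega_de0), omega_k0)
    return (1.0 + z) * r0, r0 / (1.0 + z)


if __name__ == "__main__":
    # Matter-only flat universe, for which chi_0 = 2 (1 - 1/sqrt(1 + z)) exactly.
    print(distances(3.0, 0.0, 1.0, 0.0))
    # Hypothetical parameter values, for illustration only.
    print(distances(1.0, 1e-4, 0.3, 0.7))
```

For any parameters the two distances are related by d_L = (1 + z)² d_A, which follows directly from (14.16) and (14.17).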
Part IV

Quantum Mechanics
Chapter 15

Linear Algebra

15.1 Linear Vector Space

Definition 15.1 Linear vector space

A linear vector space is a set of elements, called vectors, which is closed under addition
and multiplication by scalars. That is to say, if ϕ and ψ are vectors then so is aϕ + bψ,
where a and b are arbitrary scalars. If the scalars belong to the field of complex (real)
numbers, we speak of a complex (real) linear vector space. Henceforth the scalars will
be complex numbers unless otherwise stated.

Example:

1. Discrete vectors, which may be represented as columns of complex numbers.

2. Spaces of functions of some type, for example, the space of all differentiable functions

Definition 15.2 Linear independence

A set of vectors {ϕₙ} is said to be linearly independent if no non-trivial linear combination
of them sums to zero; that is to say, if the equation $\sum_n c_n \phi_n = 0$ can hold only
when cₙ = 0 for all n. If this condition does not hold, the set of vectors is said to be
linearly dependent, in which case it is possible to express a member of the set as a linear
combination of the others.

Definition 15.3 Dimension

The maximum number of linearly independent vectors in a space is called the dimension

of the space.

Definition 15.4 Basis

A maximal set of linearly independent vectors is called a basis for the space. Any vector

in the space can be expressed as a linear combination of the basis vectors.

Definition 15.5 Inner product

An inner product (or scalar product) for a linear vector space associates a scalar (ϕ, ψ)
with every ordered pair of vectors. It must satisfy the following properties:
1. (ϕ, ψ) = a complex number.

2. (ϕ, ψ) = (ψ, ϕ)∗ .
3. (ϕ, c1 ψ1 + c2 ψ2 ) = c1 (ϕ, ψ1 ) + c2 (ϕ, ψ2 ).
4. (ϕ, ϕ) ≥ 0, with equality holding if and only if ϕ = 0.

Example:

1. If ψ is the column vector with elements a₁, a₂, · · · , and ϕ is the column vector with
elements b₁, b₂, · · · , then
\[
(\psi, \phi) = a_1^* b_1 + a_2^* b_2 + \cdots \tag{15.1}
\]
2. If ψ and ϕ are functions of x, then
\[
(\phi, \psi) = \int \phi^*(x)\, \psi(x)\, w(x)\, dx, \tag{15.2}
\]
where w(x) is some non-negative weight function.

Definition 15.6 Norm


\[
\|\phi\| \equiv \sqrt{(\phi, \phi)}. \tag{15.3}
\]

Theorem 15.1 Schwarz’s inequality

\[
|(\psi, \phi)|^2 \le (\psi, \psi)(\phi, \phi). \tag{15.4}
\]

Theorem 15.2 Triangle inequality

\[
\|\phi + \psi\| \le \|\phi\| + \|\psi\|. \tag{15.5}
\]
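
For column vectors of complex numbers, the inner product, norm, and both inequalities can be illustrated in a few lines. The sketch represents vectors as plain Python lists; the helper names and the sample vectors are choices made for this example.

```python
import math


def inner(phi, psi):
    """Inner product (phi, psi) = sum_i conj(phi_i) * psi_i, as in example (15.1)."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))


def norm(phi):
    """Norm ||phi|| = sqrt((phi, phi)), equation (15.3)."""
    return math.sqrt(inner(phi, phi).real)


if __name__ == "__main__":
    phi = [1 + 2j, 3j]
    psi = [2, 1 - 1j]
    total = [p + q for p, q in zip(phi, psi)]
    print(inner(phi, psi), inner(psi, phi))                           # complex conjugates
    print(abs(inner(psi, phi)) ** 2, inner(psi, psi) * inner(phi, phi))  # Schwarz
    print(norm(total), norm(phi) + norm(psi))                          # triangle
```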



Definition 15.7 Orthonormal

A set of vectors {ϕₙ} is said to be orthonormal if the vectors are pairwise orthogonal
and of unit norm; that is to say, their inner products satisfy (ϕₘ, ϕₙ) = δₘₙ.

Definition 15.8 Dual vector

Corresponding to any linear vector space V there exists the dual space of linear functionals
on V. A linear functional F assigns a scalar F(ϕ) to each vector ϕ, such that

F(aϕ + bψ) = aF(ϕ) + bF(ψ) (15.6)

for any vectors ϕ and ψ, and any scalars a and b. The set of linear functionals may
itself be regarded as forming a linear space V′ if we define the sum of two functionals
as
(F₁ + F₂)(ϕ) ≡ F₁(ϕ) + F₂(ϕ). (15.7)

Theorem 15.3 Riesz theorem

There is a one-to-one correspondence between linear functionals F in V′ and vectors
f in V, such that all linear functionals have the form

F(ϕ) = (f, ϕ), (15.8)

f being a fixed vector, and ϕ being an arbitrary vector. Thus the spaces V and V′ are
essentially isomorphic.

In Dirac’s notation, which is very popular in quantum mechanics, the vectors in V are called
ket vectors, and are denoted as |ϕ⟩. The linear functionals in the dual space V ′ are called bra
vectors, and are denoted as ⟨F |. The numerical value of the functional is denoted as

F (ϕ) = ⟨F |ϕ⟩ . (15.9)

According to the Riesz theorem, there is a one-to-one correspondence between bras and kets.
Therefore we can use the same alphabetic character for the functional (a member of V ′ ) and
the vector (in V ) to which it corresponds, relying on the bra, ⟨F |, or ket, |F ⟩, notation to
determine which space is referred to. It follows that

⟨F |ϕ⟩ = (F, ϕ). (15.10)

Notice that the Riesz theorem establishes, by construction, an antilinear correspondence be-
tween bras and kets. If ⟨F | ↔ |F ⟩, we will have the correspondence

c∗1 ⟨F1 | + c∗2 ⟨F2 | ↔ c1 |F1 ⟩ + c2 |F2 ⟩ . (15.11)



15.2 Linear Operators

Definition 15.9 Linear operators

An operator on a vector space maps vectors onto vectors. A linear operator satisfies that

A |c1 ψ1 + c2 ψ2 ⟩ = c1 A |ψ1 ⟩ + c2 A |ψ2 ⟩ . (15.12)

Define the sum and product of operators as

(A + B) |ψ⟩ ≡ A |ψ⟩ + B |ψ⟩ , (AB) |ψ⟩ ≡ A(B |ψ⟩). (15.13)

Define their action to the left on bra vectors as

(⟨ϕ| A) |ψ⟩ ≡ ⟨ϕ| (A |ψ⟩) for any ψ. (15.14)

According to the Riesz theorem there must exist a vector χ such that

(⟨ϕ| A) |ψ⟩ = ⟨χ|ψ⟩ for any ψ. (15.15)

If we define operator A† as
A† |ϕ⟩ = |χ⟩ . (15.16)
we will have
⟨ϕ|A|ψ⟩ = ⟨A† ϕ|ψ⟩ = ⟨ψ|A† |ϕ⟩∗ . (15.17)

Definition 15.10 Outer product

The outer product of vector |ϕ⟩ and |ψ⟩ is an operator defined as



(|ψ⟩⟨ϕ|) |λ⟩ ≡ |ψ⟩ (⟨ϕ|λ⟩). (15.18)

Definition 15.11 Trace

The trace of an operator is defined as
\[
\operatorname{Tr} A \equiv \sum_i \langle u_i | A | u_i \rangle, \tag{15.19}
\]
where {uᵢ} may be any orthonormal basis. It can be shown that the value of Tr A is
independent of the particular orthonormal basis that is chosen for its evaluation.

Proposition 15.1

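\[
(cA)^\dagger = c^* A^\dagger, \quad (A + B)^\dagger = A^\dagger + B^\dagger, \quad (AB)^\dagger = B^\dagger A^\dagger, \quad (|\psi\rangle\langle\phi|)^\dagger = |\phi\rangle\langle\psi|. \tag{15.20}
\]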
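
In a finite-dimensional space with an orthonormal basis, operators become matrices and the adjoint becomes the conjugate transpose, so these identities can be checked directly. The sketch below uses nested Python lists for matrices; all helper names and sample matrices are illustrative.

```python
def dagger(A):
    """Adjoint of a matrix: transpose and complex-conjugate."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]


def matmul(A, B):
    """Matrix product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def outer(psi, phi):
    """Matrix of the outer product |psi><phi|, equation (15.18)."""
    return [[p * q.conjugate() for q in phi] for p in psi]


def sandwich(phi, A, psi):
    """Matrix element <phi|A|psi>."""
    return sum(phi[i].conjugate() * A[i][j] * psi[j]
               for i in range(len(phi)) for j in range(len(psi)))


def trace(A):
    """Trace in the basis in which the matrix is written, equation (15.19)."""
    return sum(A[i][i] for i in range(len(A)))


if __name__ == "__main__":
    A = [[1, 2j], [3, 4 + 1j]]
    B = [[0, 1], [1j, 2]]
    psi, phi = [1 + 1j, 2], [3, 1j]
    print(dagger(matmul(A, B)) == matmul(dagger(B), dagger(A)))
    print(dagger(outer(psi, phi)) == outer(phi, psi))
    print(sandwich(phi, A, psi), sandwich(psi, dagger(A), phi).conjugate())
    print(trace(outer(psi, phi)), sandwich(phi, [[1, 0], [0, 1]], psi))
```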

15.3 Self-Adjoint operators

Definition 15.12 Self-Adjoint operators

An operator A that is equal to its adjoint A† is called self-adjoint. This means that it
satisfies
⟨ϕ|A|ψ⟩ = ⟨ψ|A|ϕ⟩∗ (15.21)
and that the domain of A coincides with the domain of A†. An operator that only
satisfies the above equation is called Hermitian.

Theorem 15.4

If ⟨ψ|A|ψ⟩ = ⟨ψ|A|ψ⟩∗ for all |ψ⟩, it would follow that ⟨ϕ1 |A|ϕ2 ⟩ = ⟨ϕ2 |A|ϕ1 ⟩∗ for

all |ϕ1 ⟩ and |ϕ2 ⟩, and hence that A = A† .

Definition 15.13 Eigenvalues and eigenvectors

If an operator acting on a certain vector produces a scalar multiple of that same vector,
\[
A |\phi\rangle = a |\phi\rangle, \tag{15.22}
\]
we call the vector |ϕ⟩ an eigenvector and the scalar a an eigenvalue of the operator A.
The antilinear correspondence between bras and kets, and the definition of the adjoint
operator A†, imply the left-handed eigenvalue equation
\[
\langle\phi| A^\dagger = a^* \langle\phi|. \tag{15.23}
\]

Theorem 15.5

If A is a Hermitian operator, all of its eigenvalues must be real.

Theorem 15.6

Eigenvectors corresponding to distinct eigenvalues of a Hermitian operator must be



orthogonal.

If the orthonormal set of vectors {ϕᵢ} is complete, then we can expand an arbitrary vector |v⟩
in terms of it:
\[
|v\rangle = \sum_i |\phi_i\rangle \left(\langle\phi_i | v\rangle\right) = \left(\sum_i |\phi_i\rangle\langle\phi_i|\right) |v\rangle. \tag{15.24}
\]
Therefore,
\[
\sum_i |\phi_i\rangle\langle\phi_i| = I. \tag{15.25}
\]
If A |ϕᵢ⟩ = aᵢ |ϕᵢ⟩ and the eigenvectors form a complete orthonormal set, then the operator
can be reconstructed in a useful diagonal form in terms of its eigenvalues and eigenvectors:
\[
A = \sum_i a_i\, |\phi_i\rangle\langle\phi_i|. \tag{15.26}
\]
We can define a function of an operator
\[
f(A) = \sum_i f(a_i)\, |\phi_i\rangle\langle\phi_i|. \tag{15.27}
\]
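
A two-dimensional example makes the completeness relation, the diagonal form, and the function of an operator concrete. The sketch below uses the matrix [[0, 1], [1, 0]], whose eigenvalues are ±1 with eigenvectors (1, ±1)/√2; this choice of operator and all helper names are assumptions of the example. For this matrix the exponential should come out as cosh(1) on the diagonal and sinh(1) off the diagonal.

```python
import math


def projector(v):
    """Matrix of the projector |v><v| for a normalized vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]


def weighted_sum(weights, vectors):
    """Return sum_i w_i |v_i><v_i| as a matrix."""
    n = len(vectors[0])
    total = [[0j] * n for _ in range(n)]
    for w, v in zip(weights, vectors):
        P = projector(v)
        for i in range(n):
            for j in range(n):
                total[i][j] += w * P[i][j]
    return total


def apply_function(f, eigenvalues, eigenvectors):
    """f(A) = sum_i f(a_i) |phi_i><phi_i|, equation (15.27)."""
    return weighted_sum([f(a) for a in eigenvalues], eigenvectors)


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    eigenvalues = [1.0, -1.0]
    eigenvectors = [[s, s], [s, -s]]
    print(weighted_sum([1.0, 1.0], eigenvectors))              # identity, (15.25)
    print(weighted_sum(eigenvalues, eigenvectors))             # the operator, (15.26)
    print(apply_function(math.exp, eigenvalues, eigenvectors))  # exp(A), (15.27)
```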

The Hermitian operators in a finite N-dimensional vector space have complete sets of
eigenvectors. But this statement does not carry over to infinite-dimensional spaces. A Hermitian
operator in an infinite-dimensional vector space may or may not possess a complete set of
eigenvectors, depending upon the precise nature of the operator and the vector space. Instead,
we have the following spectral theorem.

Theorem 15.7

To each self-adjoint operator A there corresponds a unique family of projection operators,
E(λ), for real λ, with the properties:
1. If λ₁ < λ₂ then E(λ₁)E(λ₂) = E(λ₂)E(λ₁) = E(λ₁).
2. If ϵ > 0, then E(λ + ϵ) |ψ⟩ → E(λ) |ψ⟩ as ϵ → 0.
3. E(λ) |ψ⟩ → 0 as λ → −∞.
4. E(λ) |ψ⟩ → |ψ⟩ as λ → ∞.
5. $\int_{-\infty}^{\infty} \lambda\, dE(\lambda) = A$.

We can define a function of an operator by
\[
f(A) = \int_{-\infty}^{\infty} f(\lambda)\, dE(\lambda). \tag{15.28}
\]

Following Dirac’s pioneering formulation, it has become customary in quantum mechanics to
write a formal eigenvalue equation for an operator such as Q that has a continuous spectrum,
\[
Q |q\rangle = q |q\rangle. \tag{15.29}
\]

The orthonormality condition for the continuous case takes the form

⟨q|q ′ ⟩ = δ(q − q ′ ). (15.30)

Evidently the norm of these formal eigenvectors is infinite, since ⟨q|q⟩ → ∞. Instead of the
spectral theorem for Q, Dirac would write
\[
Q = \int_{-\infty}^{\infty} q\, |q\rangle\langle q|\, dq. \tag{15.31}
\]
Dirac’s formulation does not fit into the mathematical theory of Hilbert space, which admits
only vectors of finite norm. The projection operator formally given by
\[
E(\lambda) = \int_{-\infty}^{\lambda} |q\rangle\langle q|\, dq \tag{15.32}
\]
is well defined in Hilbert space, but its derivative does not exist within the Hilbert space
framework.

Theorem 15.8

If A and B are self-adjoint operators, each of which possesses a complete set of eigenvectors,
and if AB = BA, then there exists a complete set of vectors which are eigenvectors
of both A and B.

Definition 15.14 complete commuting set of operators

Let (A, B, · · · ) be a set of mutually commutative operators that possess a complete set
of common eigenvectors. Corresponding to a particular eigenvalue for each operator,
there may be more than one eigenvector. If, however, there is no more than one
eigenvector (apart from the arbitrary phase and normalization) for each set of eigenvalues
(aₙ, bₘ, · · · ), then the operators (A, B, · · · ) are said to be a complete commuting set of
operators.

Theorem 15.9

Any operator that commutes with all members of a complete commuting set must be a
function of the operators in that set.

15.4 Rigged Hilbert space


Definition 15.15 Rigged Hilbert space

Formally, a rigged Hilbert space consists of a Hilbert space H, together with a subspace
Φ which carries a finer topology, that is, one for which the natural inclusion Φ ⊆ H
is continuous. It is no loss to assume that Φ is dense in H for the Hilbert norm. We
consider the inclusion of the conjugate space $H^\times$ in $\Phi^\times$, where $\Phi^\times$ is the space of
$\tau_\Phi$-continuous antilinear functionals on Φ. For any ϕ ∈ Φ, F ∈ $\Phi^\times$, we define

⟨F|ϕ⟩ ≡ F(ϕ), ⟨ϕ|F⟩ ≡ [F(ϕ)]∗. (15.33)

Now by applying the Riesz representation theorem we can identify $H^\times$ with H. Therefore,
the definition of rigged Hilbert space is in terms of a sandwich:

Φ ⊆ H ⊆ $\Phi^\times$. (15.34)

There may or may not exist any solutions to the eigenvalue equation A |an ⟩ = an |an ⟩ for a self-
adjoint operator A on an infinite-dimensional vector space. However, the generalized spectral
theorem asserts that if A is self-adjoint in H then a complete set of eigenvectors exists in the
extended space Φ^×. The precise conditions for the proof of this theorem are rather technical,
so the interested reader can refer to Gel’fand and Vilenkin (1964) for further details.
There are many examples of rigged-Hilbert-space triplets. A Hilbert space H is formed by those functions that are square-integrable. That is, H consists of those functions ψ(x) for which
$$\langle\psi|\psi\rangle = \int_{-\infty}^{\infty} |\psi(x)|^2 \, dx \ \text{ is finite}. \tag{15.35}$$

A nuclear space Φ is made up of functions ψ(x) which satisfy the infinite set of conditions,
$$\int_{-\infty}^{\infty} |\psi(x)|^2 (1+|x|)^m \, dx \ \text{ is finite for } m = 0, 1, 2, \cdots \tag{15.36}$$

The functions ψ(x) which make up Φ must vanish more rapidly than any inverse power of x in the limit |x| → ∞. The extended space Φ^×, which is conjugate to Φ, consists of those functions χ(x) for which
$$\langle\chi|\psi\rangle = \int_{-\infty}^{\infty} \chi^*(x)\psi(x) \, dx \ \text{ is finite for any } \psi \text{ in } \Phi. \tag{15.37}$$

In addition to the functions of finite norm, which also lie in H, Φ^× will contain functions that are unbounded at infinity provided the divergence is no worse than a power of x. Hence Φ^× contains e^{ikx}, which is an eigenfunction of the operator D = i d/dx. It also contains the Dirac delta function, δ(x − λ), which is an eigenfunction of the operator X, defined by Xψ(x) = xψ(x). These two examples suffice to show that rigged Hilbert space seems to be a more natural mathematical setting for quantum mechanics than is Hilbert space.

15.5 Unitary operators

Definition 15.16 Unitary operator

A unitary operator U is a bounded linear operator on a Hilbert space H that satisfies U U† = U†U = I, where U† is the adjoint of U, and I is the identity operator.

Consider a family of unitary operators, U (s), that depend on a single continuous parameter s.
Let U (0) = I be the identity operator, and let U (s1 + s2 ) = U (s1 )U (s2 ). We can demonstrate
that
$$\left.\frac{dU}{ds}\right|_{s=0} = iK \quad \text{with } K = K^\dagger. \tag{15.38}$$

The Hermitian operator K is called the generator of the family of unitary operators because it
determines U (s), not only for infinitesimal s, but for all s. This can be shown by differentiating

U (s1 + s2 ) = U (s1 )U (s2 ) (15.39)

with respect to s2 at s2 = 0,
$$\left.\frac{dU}{ds}\right|_{s=s_1} = U(s_1)\, iK. \tag{15.40}$$

This first-order differential equation with initial condition U(0) = I has the unique solution

U(s) = e^{iKs}. (15.41)
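The relation between a Hermitian generator and its unitary family can be checked numerically. The following is a minimal sketch, assuming an arbitrarily chosen 2×2 Hermitian matrix K and a truncated Taylor series for the exponential (the helper names are illustrative, not part of the notes); it checks unitarity and the composition law U(s1 + s2) = U(s1)U(s2).

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def expm_i(K, s, terms=40):
    """Return exp(i*K*s) for a 2x2 matrix K via its truncated Taylor series."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        # term_n = term_{n-1} * (i K s) / n
        term = [[x * 1j * s / n for x in row] for row in matmul(term, K)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Hypothetical Hermitian generator (K equals its own adjoint).
K = [[1.0 + 0j, 0.5 - 0.2j], [0.5 + 0.2j, -0.3 + 0j]]
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
s1, s2 = 0.7, -1.3

U1, U2, U12 = expm_i(K, s1), expm_i(K, s2), expm_i(K, s1 + s2)
unitarity_error = max_diff(matmul(U1, dagger(U1)), identity)  # U U^dagger = I
group_error = max_diff(matmul(U1, U2), U12)                   # U(s1) U(s2) = U(s1 + s2)
```

Both errors should be at the level of floating-point round-off for parameters of order one; the Taylor truncation is only adequate for such modest values of s.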

15.6 Antiunitary operators

Definition 15.17 Antiunitary operator

In mathematics, an antiunitary transformation is a bijective antilinear map U : H1 → H2 between two complex Hilbert spaces such that

⟨Ux|Uy⟩ = ⟨x|y⟩* (15.42)

for all x and y in H1. If additionally one has H1 = H2, then U is called an antiunitary operator.

Proposition 15.2

1. When U is antiunitary, U² is unitary. This follows from

⟨U²x|U²y⟩ = ⟨Ux|Uy⟩* = ⟨x|y⟩. (15.43)

2. For a unitary operator V, the operator VK, where K is the complex conjugation operator, is antiunitary. The converse is also true: for an antiunitary U, the operator UK is unitary.
3. For an antiunitary U, the definition of the adjoint operator U† is changed into

⟨U†x|y⟩ = ⟨x|Uy⟩*. (15.44)

4. The adjoint of an antiunitary U is also antiunitary, and U U† = U†U = I.
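Items 1 and 2 of the proposition can be illustrated on two-component vectors. This is a sketch under the assumption that V is the Pauli matrix σ_y (any unitary matrix would do) and that K conjugates components in the chosen basis; the function names are hypothetical.

```python
def apply_antiunitary(V, x):
    """Apply U = V K: complex-conjugate x first (K), then multiply by the unitary V."""
    xc = [c.conjugate() for c in x]
    return [sum(V[i][k] * xc[k] for k in range(2)) for i in range(2)]

def inner(x, y):
    """Inner product <x|y>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

V = [[0j, -1j], [1j, 0j]]          # sigma_y, chosen here as an example unitary
x = [1 + 2j, 0.5 - 1j]
y = [-0.3 + 0.1j, 2 + 0j]

Ux, Uy = apply_antiunitary(V, x), apply_antiunitary(V, y)
lhs = inner(Ux, Uy)                 # <Ux|Uy>
rhs = inner(x, y).conjugate()       # <x|y>*

# Applying U twice gives a map that preserves the inner product itself.
UUx, UUy = apply_antiunitary(V, Ux), apply_antiunitary(V, Uy)
lhs_squared = inner(UUx, UUy)

# Antilinearity: U(c x) = c* U(x).
c = 2 - 3j
U_cx = apply_antiunitary(V, [c * xi for xi in x])
antilinear_error = max(abs(u - c.conjugate() * v) for u, v in zip(U_cx, Ux))
```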


Chapter 16
Formulation of Quantum Mechanics

16.1 Axioms of quantum mechanics


1. The properties of a quantum system are completely defined by specification of its state
vector |ψ⟩. The state vector is an element of a complex Hilbert space H called the space
of states.

2. With every physical property (energy, position, momentum, angular momentum, ...)
there exists an associated linear, Hermitian operator A (usually called observable), which
acts in the space of states. The eigenvalues of the operator are the possible values of the
physical properties.

3. • If |ψ⟩ is the vector representing the state of a system and if |ϕ⟩ represents another physical state, there exists a probability P(|ψ⟩, |ϕ⟩) of finding |ψ⟩ in state |ϕ⟩, which is given by the squared modulus of the scalar product on H: P(|ψ⟩, |ϕ⟩) = |⟨ψ|ϕ⟩|².

• If A is an observable with eigenvalues a_k and eigenstates |k⟩, given a system in the state |ψ⟩, the probability of obtaining a_k as the outcome of the measurement of A is |⟨k|ψ⟩|². After the measurement the system is left in the state projected on the subspace of the eigenvalue a_k.

4. The evolution of a closed system is unitary. The state vector |ψ(t)⟩ at time t is derived
from the state vector |ψ(t0 )⟩ at time t0 by applying a unitary operator U (t, t0 ), called
the evolution operator: |ψ(t)⟩ = U (t, t0 ) |ψ(t0 )⟩.
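The measurement axiom can be made concrete for a two-level system. The sketch below assumes, purely for illustration, the observable σ_x with eigenvalues ±1 and a state parametrized by two arbitrary angles; it computes the outcome probabilities |⟨k|ψ⟩|² and compares the resulting mean with ⟨ψ|σ_x|ψ⟩.

```python
import cmath
import math

def inner(x, y):
    """Inner product <x|y>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

r = 1 / math.sqrt(2)
# Eigenvalues and normalized eigenvectors of sigma_x (the example observable).
eigenpairs = [(+1.0, [r + 0j, r + 0j]), (-1.0, [r + 0j, -r + 0j])]

# A normalized state with arbitrarily chosen angles.
psi = [math.cos(0.4) + 0j, cmath.exp(0.9j) * math.sin(0.4)]

# Probability of each outcome a_k is |<k|psi>|^2.
probs = {a: abs(inner(k, psi)) ** 2 for a, k in eigenpairs}
mean_from_probs = sum(a * p for a, p in probs.items())

# Direct expectation value <psi|sigma_x|psi> = 2 Re(psi_0* psi_1).
mean_direct = (psi[0].conjugate() * psi[1] + psi[1].conjugate() * psi[0]).real
```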

16.2 Transformations of States


A transformation of states can be described by |ψ⟩ → U (τ ) |ψ⟩ ≡ |ψ ′ ⟩. We demand that

|⟨ϕ|ψ⟩| = |⟨ϕ′ |ψ ′ ⟩|. (16.1)

We have the following theorem:



Theorem 16.1 Wigner Theorem

Any mapping of the vector space onto itself that preserves the value of |⟨ϕ|ψ⟩| may be implemented by an operator U with U being either unitary (linear) or antiunitary (antilinear).

Continuous transformation

Only linear operators can describe continuous transformations because every continuous trans-
formation has a square root. Suppose, for example, that U (l) describes a displacement through
the distance l. This can be done by two displacements of U (l/2), and hence U (l) = U (l/2)U (l/2).
The product of two antilinear operators is linear, since the second complex conjugation nul-
lifies the effect of the first. Thus, regardless of the linear or antilinear character of U (l/2), it
must be the case that U (l) is linear. A continuous operator cannot change discontinuously
from linear to antilinear as a function of l, so the operator must be linear for all l.

Transformations of observables

For observable Q and transformation |ψ ′ ⟩ = U |ψ⟩, we have

⟨ψ′|Q|ψ′⟩ = ⟨ψ|U^{−1}QU|ψ⟩. (16.2)

If U is continuous and U(τ)^{−1} Q(q) U(τ) = Q(q + τ), where Q(q + τ)|q⟩ = (q + τ)|q⟩, we can derive that
U|q⟩ = |q + τ⟩. (16.3)

16.3 Schrödinger equation


The axioms of quantum mechanics state that the evolution operator U (t, t0 ) is unitary and
U (t2 , t0 ) = U (t2 , t1 )U (t1 , t0 ). We define a Hermitian operator H(t0 ), called Hamiltonian
operator, as
$$\left.\frac{dU(t,t_0)}{dt}\right|_{t=t_0} = -iH(t_0). \tag{16.4}$$

It follows that
$$\left.\frac{dU(t,t_0)}{dt}\right|_{t=t_1} = -iH(t_1)U(t_1,t_0). \tag{16.5}$$

The formal solution of the differential equation is
$$U(t,t_0) = I + \sum_{n=1}^{\infty} (-i)^n \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2 \cdots \int_{t_0}^{t_{n-1}} dt_n \, H(t_1)H(t_2)\cdots H(t_n). \tag{16.6}$$

If T denotes time ordering, which places all operators evaluated at later times to the left, equation 16.6 can be written as
$$U(t,t_0) = I + \sum_{n=1}^{\infty} \frac{(-i)^n}{n!} \int_{t_0}^{t} dt_1 \int_{t_0}^{t} dt_2 \cdots \int_{t_0}^{t} dt_n \, \mathrm{T}\{H(t_1)H(t_2)\cdots H(t_n)\} \equiv \exp\left[-i\,\mathrm{T}\int_{t_0}^{t} H(t')\,dt'\right]. \tag{16.7}$$

If the Hamiltonian operator H is time-dependent but the H’s at different times commute, equation 16.7 can be simplified to
$$U(t,t_0) = \exp\left[-i\int_{t_0}^{t} H(t')\,dt'\right]. \tag{16.8}$$

If H is time-independent, 16.8 can be further simplified into

U (t, t0 ) = exp[−iH(t − t0 )]. (16.9)

Since |ψ(t)⟩ = U (t, t0 ) |ψ(t0 )⟩, we can derive the Schrödinger equation

$$\frac{d|\psi(t)\rangle}{dt} = -iH(t)|\psi(t)\rangle. \tag{16.10}$$
The expectation value of an observable Q is ⟨ψ|Q|ψ⟩, denoted by ⟨Q⟩. We then have
$$\frac{d\langle Q\rangle}{dt} = -i\langle[Q,H]\rangle + \left\langle\frac{\partial Q}{\partial t}\right\rangle, \tag{16.11}$$

where the commutator is defined as [Q, H] ≡ QH − HQ.
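Equation 16.11 can be checked numerically for a time-independent observable. The following sketch assumes a hypothetical two-level Hamiltonian, the observable σ_z, and a Taylor-series propagator; it compares a central finite difference of ⟨Q⟩(t) with −i⟨[Q, H]⟩.

```python
def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def expect(op, psi):
    """Expectation value <psi|op|psi>."""
    return sum(p.conjugate() * q for p, q in zip(psi, matvec(op, psi)))

def evolve(H, psi0, t, terms=60):
    """Return exp(-i H t) psi0 for a time-independent H via a truncated Taylor series."""
    result, term = list(psi0), list(psi0)
    for n in range(1, terms):
        term = [(-1j * t / n) * c for c in matvec(H, term)]
        result = [r + c for r, c in zip(result, term)]
    return result

H = [[1.0, 0.4], [0.4, -0.5]]    # hypothetical Hermitian Hamiltonian
Q = [[1.0, 0.0], [0.0, -1.0]]    # sigma_z, with no explicit time dependence
psi0 = [0.6 + 0j, 0.8j]          # normalized initial state
t, h = 0.9, 1e-5

# Left-hand side: d<Q>/dt by a central finite difference.
rate_numeric = (expect(Q, evolve(H, psi0, t + h))
                - expect(Q, evolve(H, psi0, t - h))).real / (2 * h)

# Right-hand side: -i <[Q, H]> evaluated at time t.
QH, HQ = matmul(Q, H), matmul(H, Q)
comm = [[QH[i][j] - HQ[i][j] for j in range(2)] for i in range(2)]
rate_formula = (-1j * expect(comm, evolve(H, psi0, t))).real
```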

16.4 Position operators


In three-dimensional space, for a particle, we have three operators corresponding to the ob-
servation of its position in space, X = (X1 , X2 , X3 ). If the particle has some other internal
degrees of freedom, X plus some other observables S’s will form a complete commuting set
of operators. The eigenstates will be denoted by |x, s⟩, satisfying

Xi |x, s⟩ = xi |x, s⟩ . (16.12)

It describes a particle located at x with internal state s. We normalize |x, s⟩ so that

⟨x′, s′|x, s⟩ = δ_{ss′} δ(x − x′). (16.13)

16.5 Momentum operators and canonical quantization


Since X plus some other observables S form a complete commuting set of operators, the momentum operators cannot be independent of them. Numerous experiments show that the position and momentum of particles cannot be measured simultaneously, so we would expect [X, P] ≠ 0.
For a system which has a classical correspondence, the classical equation of motion of a particle
is

ẋ = {x, HC (x, p, t)}, ṗ = {p, HC (x, p, t)}, (16.14)

where { } is the Poisson bracket in classical mechanics.


In quantum mechanics, meanwhile, we have
$$\frac{d\langle X\rangle}{dt} = -i\langle[X,H]\rangle, \qquad \frac{d\langle P\rangle}{dt} = -i\langle[P,H]\rangle. \tag{16.15}$$
If we assume that the classical equation of motion of a particle is an approximation of quantum
mechanics, we may expect
[ ] = i{ }. (16.16)
Since the Poisson bracket in classical mechanics and the commutator in quantum mechanics have the same algebraic structure, we only need to demand that

[Xi , Xj ] = 0, [Pi , Pj ] = 0, [Xi , Pj ] = iδij , (16.17)

and
H = HC (X, P, t). (16.18)

For a more general system, we would use

[Xi , Pj ] = iδij (16.19)

as a definition for the momentum operator. The form of H cannot be given a priori; it may be specified by hints from classical theory and experiments.

16.6 Momentum operators and translation of states


We have the following lemma:

Lemma 1 Baker-Hausdorff lemma

$$\exp(iG\lambda)\,A\,\exp(-iG\lambda) = A + i\lambda[G,A] + \cdots + \frac{i^n\lambda^n}{n!}[G,[G,[G,\cdots[G,A]]]\cdots] + \cdots \tag{16.20}$$

If we define T(a) ≡ e^{−iP·a}, we can get

T(a)^{−1} X T(a) = X + a, T(a)|x⟩ = |x + a⟩. (16.21)

Thus, T (a) is the space translation operator and we can also define the momentum operator
as the generator of space translation.
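As a worked check of equation 16.21, the lemma can be applied with G = P·a and λ = 1. The steps below are a sketch that uses only the canonical commutator [X_i, P_j] = iδ_ij of equation 16.19 and the unitarity of T(a); summation over the repeated index j is assumed.

```latex
\begin{aligned}
T(a)^{-1} X_i\, T(a) &= e^{iP\cdot a}\, X_i\, e^{-iP\cdot a}
  = X_i + i[P\cdot a, X_i] + \frac{i^2}{2!}[P\cdot a,[P\cdot a, X_i]] + \cdots \\
[P\cdot a, X_i] &= a_j [P_j, X_i] = -i a_i
  \quad\Longrightarrow\quad i[P\cdot a, X_i] = a_i , \\
[P\cdot a,[P\cdot a, X_i]] &= 0 \quad \text{(because } a_i \text{ is a number)} , \\
T(a)^{-1} X_i\, T(a) &= X_i + a_i , \\
X_i\, T(a)|x\rangle &= T(a)\,(X_i + a_i)|x\rangle = (x_i + a_i)\, T(a)|x\rangle .
\end{aligned}
```

The last line shows that T(a)|x⟩ is an eigenstate of X with eigenvalue x + a, which is consistent with T(a)|x⟩ = |x + a⟩ up to a choice of phase.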

16.7 Angular momentum operators and rotation of states


We define the angular momentum operators J as the generator of rotation, i.e.,

R(θ) ≡ e^{−iJ·n̂θ}. (16.22)

If the operator M = (M1, M2, M3) is a vector in configuration space and can be rotated by R, we can derive that
[J_i, M_j] = iϵ_{ijk} M_k. (16.23)
In particular, we have
[J_i, J_j] = iϵ_{ijk} J_k. (16.24)

Orbital angular momentum

Orbital angular momentum of a particle is defined as L = X × P. It is the generator of the rotation of the particle’s position, since

[L_i, X_j] = iϵ_{ijk} X_k, [L_i, P_j] = iϵ_{ijk} P_k, [L_i, L_j] = iϵ_{ijk} L_k. (16.25)

Spin angular momentum

Experiments show that some microscopic particles possess a property called spin. The state of
the spin is denoted by |s⟩. The corresponding observables are S = [S1 , S2 , S3 ], which measure
the spin along the 1, 2, 3 direction. Spin operator is the generator of rotation of the spin of the
particle. Thus we have
[Si , Sj ] = iϵijk Sk . (16.26)
The rotations of position and spin are independent. It follows that

[Li , Sj ] = 0. (16.27)
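The spin commutation relations 16.26 can be verified explicitly in the simplest case. This sketch assumes the spin-1/2 representation S_i = σ_i/2 built from the Pauli matrices, which is one example of operators satisfying the algebra and is not derived in the text above.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# Spin-1/2 operators S_i = sigma_i / 2 (assumed representation).
S = [
    [[0j, 0.5 + 0j], [0.5 + 0j, 0j]],
    [[0j, -0.5j], [0.5j, 0j]],
    [[0.5 + 0j, 0j], [0j, -0.5 + 0j]],
]

# Largest deviation of [S_i, S_j] from i * epsilon_ijk * S_k over all i, j.
max_error = 0.0
for i in range(3):
    for j in range(3):
        lhs = commutator(S[i], S[j])
        rhs = [[sum(1j * levi_civita(i, j, k) * S[k][r][c] for k in range(3))
                for c in range(2)] for r in range(2)]
        err = max(abs(lhs[r][c] - rhs[r][c]) for r in range(2) for c in range(2))
        max_error = max(max_error, err)
```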

Total angular momentum

The total angular momentum of the particle is

J = L + S. (16.28)

It is the generator of the rotation of the entire system, which is equivalent to the rotation of the
coordinates in opposite direction.

16.8 Heisenberg picture


Observables in Heisenberg picture are defined as

QH (t) ≡ U † (t, t0 )QU (t, t0 ). (16.29)



The evolution of Q_H is given by
$$\frac{dQ_H}{dt} = -i[Q_H(t), H_H(t)] + \left(\frac{\partial Q}{\partial t}\right)_H. \tag{16.30}$$

If the state of the system at t0 is |ψ0 ⟩, we can deduce that

⟨Q⟩ (t) ≡ ⟨ψ(t)|Q|ψ(t)⟩ = ⟨ψ0 |QH (t)|ψ0 ⟩ . (16.31)

If |q⟩ is the eigenstate of Q with eigenvalue q, then |q_H(t)⟩ ≡ U†(t, t0)|q⟩ is the eigenstate of Q_H with eigenvalue q. Thus, the probability amplitude for obtaining q in a measurement of the observable Q at time t, whose squared modulus gives the probability distribution, is

⟨q|ψ(t)⟩ = ⟨q_H(t)|ψ0⟩. (16.32)
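The equivalence of the two pictures in equation 16.31 can be seen in a small example. The sketch assumes a diagonal two-level Hamiltonian with arbitrarily chosen energies, so that U(t, 0) is diagonal, and takes Q = σ_x.

```python
import cmath

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def expect(op, psi):
    return sum(p.conjugate() * q for p, q in zip(psi, matvec(op, psi)))

energies = [0.3, 1.1]            # hypothetical eigenvalues of a diagonal H
t = 2.0
U = [[cmath.exp(-1j * energies[0] * t), 0j],
     [0j, cmath.exp(-1j * energies[1] * t)]]   # U(t, 0) = exp(-i H t)
Q = [[0j, 1 + 0j], [1 + 0j, 0j]]               # sigma_x
psi0 = [0.6 + 0j, 0.8j]

# Schrodinger picture: evolve the state, keep the operator fixed.
schrodinger = expect(Q, matvec(U, psi0))

# Heisenberg picture: evolve the operator, keep the state fixed.
Q_H = matmul(dagger(U), matmul(Q, U))
heisenberg = expect(Q_H, psi0)
```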

16.9 Symmetries and conservation laws


Let U = e^{iKs} be a continuous unitary transformation with generator K. To say that the Hamiltonian operator H is invariant under this transformation means that

U(s)^{−1} H(t) U(s) = H(t). (16.33)

Then we can deduce that


[K, H(t)] = 0. (16.34)
Usually, K does not depend on time explicitly. Thus in Heisenberg picture, we have

KH (t) = K, |kH (t)⟩ = |k⟩ . (16.35)

It follows that
⟨K⟩ (t) = ⟨K⟩ (t0 ), ⟨k|ψ(t)⟩ = ⟨k|ψ0 ⟩ . (16.36)
The expectation value and probability distribution of the measurement of the observable K will not change with time for an arbitrary initial state. We then say that K is a constant of motion.

Note: The concept of a constant of motion should not be confused with the concept of a stationary state.
Suppose that the Hamiltonian operator H is independent of t, and that the initial state vector is an eigenstate
of H, |ψ0 ⟩ = |En ⟩ with H |En ⟩ = En |En ⟩. This describes a state having a unique value of energy
En . The evolution of the state is
|ψ(t)⟩ = e^{−iE_n t} |ψ0⟩. (16.37)
From this result it follows that the expectation value of any dynamical variable R

⟨ψ(t)|R|ψ(t)⟩ = ⟨En |R|En ⟩ (16.38)

is independent of t for such a state. By considering functions of R we can further show that the probability
distribution is independent of time. In a stationary state the averages and probabilities of all dynamical
variables are independent of time, whereas a constant of motion has its average and probabilities independent
of time for all states.
Chapter 17
Coordinate and Momentum Representation

17.1 Coordinate representation


To form a representation of an abstract linear vector space, one chooses a complete orthonormal set of basis vectors {|u_i⟩} and represents an arbitrary vector |ψ⟩ by its expansion coefficients {c_i}, where |ψ⟩ = Σ_i c_i |u_i⟩. The array of coefficients ⟨u_i|ψ⟩ can be regarded as a
column vector (possibly of infinite dimensions), provided the basis set is discrete. Coordinate
representation is obtained by choosing as the basis set the eigenstates {|x⟩} of the position
operator. Since this is a continuous set, the expansion coefficients define a function of a con-
tinuous variable,
ψ(x) ≡ ⟨x|ψ⟩ . (17.1)

The inner product of two state vectors in coordinate representation is
$$\langle\phi|\psi\rangle = \int \phi^*(x)\psi(x)\,d^3x. \tag{17.2}$$

It is a matter of taste whether one says that the set of functions forms a representation of the
vector space, or that the vector space consists of the functions ψ(x). The action of an operator
A on the function space is related to its action on the abstract vector space by the rule

Aψ(x) ≡ ⟨x|A|ψ⟩ . (17.3)

The action of the position operator in coordinate representation is

Xψ(x) = xψ(x). (17.4)

The action of the momentum operator in coordinate representation is

P ψ(x) = −i∇ψ(x). (17.5)

For a spinless particle in the scalar potential W(x), the Hamiltonian is H = P²/2m + W(X). The equation of motion in the coordinate representation is
$$\left[-\frac{1}{2m}\nabla^2 + W(x)\right]\psi(x,t) = i\frac{\partial\psi(x,t)}{\partial t}. \tag{17.6}$$

17.2 Galilei transformation of Schrödinger equation


For simplicity we shall treat only one spatial dimension. Let us consider two frames of refer-
ence: F with coordinates x and t, and F ′ with coordinates x′ and t′ . F ′ is moving uniformly
with velocity v relative to F , and so

x = x′ + vt′ , t = t′ . (17.7)

The potential energy is given by W (x, t) in F , and by W ′ (x′ , t′ ) in F ′ , with

W (x, t) = W ′ (x′ , t′ ). (17.8)

Requiring invariance under Galilei transformations, we expect the Schrödinger equation in F′ to have the form
$$\left[-\frac{1}{2m}\frac{\partial^2}{\partial x'^2} + W'(x',t')\right]\psi'(x',t') = i\frac{\partial\psi'(x',t')}{\partial t'}, \tag{17.9}$$
where ψ ′ (x′ , t′ ) is the wave function in F ′ . The probability density at a point in spacetime must
be the same in the two frames of reference,

|ψ(x, t)|2 = |ψ ′ (x′ , t′ )|2 , (17.10)

and hence we must have

ψ(x, t) = e^{if} ψ′(x′, t′), (17.11)
where f is a real function of the coordinates. We can derive that
$$f(x,t) = mvx - \frac{1}{2}mv^2 t, \tag{17.12}$$
apart from an irrelevant constant term.
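The phase 17.12 can be checked on the simplest case. The sketch below assumes a free particle (W = 0) described in F′ by a plane wave of wave number k, with arbitrarily chosen m, v and k; multiplying by e^{if} should give a plane wave in F with the boosted momentum k + mv and the corresponding kinetic energy.

```python
import cmath

m, v, k = 1.3, 0.7, 2.1   # arbitrary mass, frame velocity, wave number in F'

def psi_prime(xp, tp):
    """Free plane wave in the moving frame F'."""
    return cmath.exp(1j * (k * xp - k * k * tp / (2 * m)))

def f(x, t):
    """Phase of equation 17.12."""
    return m * v * x - 0.5 * m * v * v * t

def psi(x, t):
    """Wave function in F built from psi' via equation 17.11, with x' = x - v t."""
    return cmath.exp(1j * f(x, t)) * psi_prime(x - v * t, t)

def boosted_plane_wave(x, t):
    """Plane wave in F with momentum k + m v and energy (k + m v)^2 / 2m."""
    K = k + m * v
    return cmath.exp(1j * (K * x - K * K * t / (2 * m)))

samples = [(0.0, 0.0), (1.5, 0.4), (-2.2, 3.1), (0.3, -1.7)]
max_error = max(abs(psi(x, t) - boosted_plane_wave(x, t)) for x, t in samples)
```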

17.3 Probability flux and conditions on wave functions


The probability flux vector is defined as
$$J(x,t) \equiv \frac{1}{m}\operatorname{Im}(\psi^*\nabla\psi). \tag{17.13}$$
We can obtain a continuity equation
$$\frac{\partial}{\partial t}|\psi(x,t)|^2 + \nabla\cdot J(x,t) = 0. \tag{17.14}$$
Applying the divergence theorem, we have
$$\frac{\partial}{\partial t}\int_{\Omega} |\psi(x,t)|^2\,d^3x = -\oint_{\sigma} J\cdot dS. \tag{17.15}$$

The equations of continuity require that the probability flux J (x, t) be continuous across any
surface, since otherwise the surface would contain sources or sinks. Although this condition
applies to all surfaces, implying that J (x, t) must be everywhere continuous, its practical ap-
plications are mainly to surfaces separating regions in which the potential has different analytic
forms. Usually, we have the following conditions,

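1.
$$\psi(x)\big|_{x+0} = \psi(x)\big|_{x-0}, \qquad \left.\frac{d\psi}{dx}\right|_{x+0} = \left.\frac{d\psi}{dx}\right|_{x-0}. \tag{17.16}$$

2.
$$\psi(x)\big|_{x+0} = \psi(x)\big|_{x-0} = 0, \qquad \left.\frac{d\psi}{dx}\right|_{x+0} - \left.\frac{d\psi}{dx}\right|_{x-0} \ \text{is finite}. \tag{17.17}$$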

Consider next the behavior at a singular point, assumed for convenience to be the origin of
coordinates. Let S be a small sphere of radius r surrounding the singularity. The probability
that the particle is inside S must be finite. Suppose that ψ = u/r^α, where u is a smooth function that does not vanish at r = 0. Then we must have |ψ|² r³ convergent at the origin, which implies that α < 3/2.
The net outward flow through the surface S is F = ∮_S J·dS. It must vanish in the limit r → 0, since otherwise the origin would be a point source or sink. One has ∂ψ/∂r = r^{−α} ∂u/∂r − αu r^{−α−1}. The second term does not contribute to the flux, so we obtain
$$F = r^{2-2\alpha}\oint\left(\frac{-i}{2m}\right)\left(u^*\frac{\partial u}{\partial r} - u\frac{\partial u^*}{\partial r}\right)d\Omega, \tag{17.18}$$

where the integration is over solid angle. If the integral does not vanish, we must have α < 1
in order for F to vanish in the limit r → 0. This is a stronger condition than that derived from
the probability density.

Since |ψ|² is a probability density, it must vanish sufficiently rapidly at infinity so that its integral over all configuration space is convergent and equal to 1. The conditions that we have discussed apply to wave functions ψ(x) which represent physically realizable states, but they need not apply to the eigenfunctions of operators that represent observables. Those eigenfunctions, χ(x), which play the role of filter functions in computing probabilities, are only required to lie in the extended space, Φ^×, of the rigged-Hilbert-space triplet. It has been suggested that
ψ(x) be restricted to the nuclear space Φ, rather than merely to the Hilbert space H. In many
cases this would amount to requiring that ψ(x) should vanish at infinity more rapidly than
any inverse power of the distance.

17.4 Path integrals


In the path integral formulation of quantum mechanics, Gaussian integration plays an important role.

Theorem 17.1 Gaussian integration


$$\int dx\, e^{-\frac{1}{2}ax^2 + Jx} = \left(\frac{2\pi}{a}\right)^{1/2} e^{\frac{J^2}{2a}}. \tag{17.19}$$
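Formula 17.19 is easy to confirm numerically for real parameters. The sketch below assumes a > 0 and real J (in the path integral the formula is used with complex a by analytic continuation, which this check does not cover) and uses a simple midpoint rule on a finite interval.

```python
import math

def gaussian_integral_numeric(a, J, half_width=12.0, steps=20000):
    """Midpoint-rule estimate of the integral of exp(-a x^2 / 2 + J x) over [-L, L]."""
    h = 2 * half_width / steps
    total = 0.0
    for n in range(steps):
        x = -half_width + (n + 0.5) * h
        total += math.exp(-0.5 * a * x * x + J * x)
    return total * h

def gaussian_integral_exact(a, J):
    """Closed form of equation 17.19, valid here for real a > 0."""
    return math.sqrt(2 * math.pi / a) * math.exp(J * J / (2 * a))

a, J = 1.7, 0.6   # arbitrary test values
numeric = gaussian_integral_numeric(a, J)
exact = gaussian_integral_exact(a, J)
```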

The time evolution of a quantum state vector, |ψ(t)⟩ = U (t, t0 ) |ψ0 ⟩, can be regarded as the
propagation of an amplitude in configuration space,
$$\psi(x,t) = \int G(x,t;x',t_0)\,\psi(x',t_0)\,dx', \tag{17.20}$$

where
G(x, t; x′ , t0 ) = ⟨x|U (t, t0 )|x′ ⟩ (17.21)
is often called the propagator. Making use of the multiplicative property of the time develop-
ment operator, it follows that the propagator can be written as
$$G(x,t;x_0,t_0) = \int\cdots\int G(x,t;x_N,t_N)\cdots G(x_1,t_1;x_0,t_0)\,dx_N\cdots dx_1. \tag{17.22}$$

The N -fold integration is equivalent to a sum over zigzag paths that connect the initial point
(x0 , t0 ) to the final point (x, t). If we now pass to the limit of N → ∞ and ∆t = ti −ti−1 → 0,
we will have the propagator expressed as a sum (or, rather, as an integral) over all paths that
connect the initial point to the final point.
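For H = P²/2m + V(X), it can be shown that
$$\langle x|e^{-iH\Delta t}|x'\rangle = \sqrt{\frac{m}{2i\pi\Delta t}}\,\exp\left\{i\left[\frac{m(x-x')^2}{2\Delta t^2} - V(x)\right]\Delta t\right\} \quad \text{as } \Delta t\to 0. \tag{17.23}$$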

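Then we can derive that
$$G(x,t;x_0,t_0) = \lim_{N\to\infty}\int\cdots\int\left(\frac{m}{2i\pi\Delta t}\right)^{\frac{N+1}{2}}\exp\left\{i\sum_{j=0}^{N}\left[\frac{m(x_{j+1}-x_j)^2}{2\Delta t^2} - V(x_{j+1})\right]\Delta t\right\}dx_1\cdots dx_N. \tag{17.24}$$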
The propagator can be expressed formally as
$$G(x,t;x_0,t_0) = \int \mathcal{D}[x(\tau)]\, e^{iS[x(\tau)]}, \quad \text{where } S[x(\tau)] = \int_{x(\tau)} L(x,\dot{x})\,d\tau. \tag{17.25}$$

The integral can be regarded as a functional integration over all paths x(τ) which connect the initial point (x0, t0) to the final point (x, t).
To conclude this section, let us generalize our path-integral formula to more complicated systems. Consider a very general quantum system, described by an arbitrary set of coordinates q_i, conjugate momenta p_i, and Hamiltonian H(q, p). We can show that
$$\langle q_{k+1}|e^{-i\epsilon H}|q_k\rangle = \left(\prod_i\int\frac{dp_{i,k}}{2\pi}\right)\exp\left[-i\epsilon H\left(\frac{q_{k+1}+q_k}{2}, p_k\right)\right]\exp\left[i\sum_i p_{i,k}(q_{i,k+1}-q_{i,k})\right] \tag{17.26}$$
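and so
$$\langle q_N|U(t,t_0)|q_0\rangle = \left(\prod_{i,k}\int dq_{i,k}\int\frac{dp_{i,k}}{2\pi}\right)\exp\left[i\sum_k\left(\sum_i p_{i,k}(q_{i,k+1}-q_{i,k}) - \epsilon H\left(\frac{q_{k+1}+q_k}{2}, p_k\right)\right)\right]. \tag{17.27}$$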

There is one momentum integral for each k from 0 to N, and one coordinate integral for each k from 1 to N. The propagator can also be expressed formally as
$$\langle q_N|U(t,t_0)|q_0\rangle = \left(\prod_i\int\mathcal{D}q(t)\,\mathcal{D}p(t)\right)\exp\left[i\int_0^T dt\left(\sum_i p_i\dot{q}_i - H(q,p)\right)\right], \tag{17.28}$$

where the functions q(t) are constrained at the endpoints, but p(t) are not. The details of this generalization can be found in chapter 9.1 of An Introduction to Quantum Field Theory (M. E. Peskin & D. V. Schroeder).

17.5 Momentum representation


Momentum representation is obtained by choosing as the basis set the eigenstates {|p⟩} of the
momentum operator. The orthonormality condition takes the form

⟨p|p′ ⟩ = δ(p − p′ ). (17.29)

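It can be deduced that
$$\langle x|p\rangle = \frac{1}{(2\pi)^{3/2}}\,e^{ip\cdot x} \tag{17.30}$$
and
$$\phi(p) \equiv \langle p|\psi\rangle = \frac{1}{(2\pi)^{3/2}}\int e^{-ip\cdot x}\langle x|\psi\rangle\,d^3x. \tag{17.31}$$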
The action of the position operator in momentum representation is

Xϕ(p) = i∇ϕ(p). (17.32)

Bloch’s Theorem
A crystal is unchanged by translation through a vector displacement of the form

Rn = n1 a1 + n2 a2 + n3 a3 , (17.33)

where n1, n2 and n3 are integers, and a1, a2 and a3 form the edges of a unit cell of the crystal.
Corresponding to such a translation, there is a unitary operator, U (Rn ) = exp(−iP · Rn ),
which leaves the Hamiltonian of the crystal invariant:

U^{−1}(R_n) H U(R_n) = H. (17.34)

These unitary operators for translations commute with each other, as well as with H, so there
must exist a complete set of common eigenstates for all of these operators,

H |ψ⟩ = E |ψ⟩ , U (Rn ) |ψ⟩ = c(Rn ) |ψ⟩ . (17.35)

By the composition relation of the translation operators, we can deduce that

c(Rn ) = exp(−ik · Rn ). (17.36)



In coordinate representation, we have

ψk (x − Rn ) = U (Rn )ψk (x) = exp(−ik · Rn )ψk (x). (17.37)

The vector k is called the Bloch wave vector of the state. A function of the Bloch form can be expanded in a series of plane waves as
$$\psi_k(x) = \sum_{k'} a(k')\,e^{ik'\cdot x}. \tag{17.38}$$

It can be shown that for any Rn , we have

exp[i(k′ − k) · Rn ] = 1. (17.39)

The vector G_m ≡ k′ − k is called a vector of the reciprocal lattice. The expansion can now be written as
$$\psi_k(x) = \sum_{G_m} a(k+G_m)\,e^{i(k+G_m)\cdot x}. \tag{17.40}$$
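The Bloch property 17.37 of the expansion 17.40 can be illustrated in one dimension. This sketch assumes a lattice constant a, reciprocal lattice vectors G_m = 2πm/a, and a few arbitrarily chosen coefficients a(k + G_m); none of these values come from the notes.

```python
import cmath
import math

a = 1.5      # lattice constant (assumed)
k = 0.8      # Bloch wave number (assumed)
# Hypothetical expansion coefficients a(k + G_m), indexed by the integer m.
coeffs = {-1: 0.2 - 0.1j, 0: 1.0 + 0j, 2: 0.3j}

def psi_k(x):
    """Bloch function as a sum of plane waves with wave numbers k + 2 pi m / a."""
    return sum(c * cmath.exp(1j * (k + 2 * math.pi * m / a) * x)
               for m, c in coeffs.items())

# Check psi_k(x - n a) = exp(-i k n a) psi_k(x) at several points and lattice shifts.
max_error = 0.0
for x in (0.0, 0.37, -1.9, 4.2):
    for n in (-2, 1, 3):
        shifted = psi_k(x - n * a)
        expected = cmath.exp(-1j * k * n * a) * psi_k(x)
        max_error = max(max_error, abs(shifted - expected))
```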

17.6 Harmonic oscillator


A harmonic oscillator is an object that is subject to a quadratic potential energy, which pro-
duces a restoring force against any displacement from equilibrium that is proportional to
the displacement. The Hamiltonian for such an object whose motion is confined to one-
dimension is
$$H = \frac{1}{2m}P^2 + \frac{m\omega^2}{2}Q^2, \tag{17.41}$$
where P is the momentum, Q is the position, and m is the mass.

17.6.1 Algebraic solution


First, we have the commutation relation [Q, P] = i and the self-adjointness of the operators P and Q. Define the dimensionless position and momentum operators as

p ≡ (mω)^{−1/2} P, q ≡ (mω)^{1/2} Q. (17.42)

It follows that
$$[q,p] = i, \qquad H = \frac{1}{2}\omega(p^2+q^2). \tag{17.43}$$
Define the annihilation operator as
$$a \equiv \frac{q+ip}{\sqrt{2}}. \tag{17.44}$$
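We can deduce that
$$[a,a^\dagger] = 1, \qquad H = \frac{1}{2}\omega(aa^\dagger + a^\dagger a) = \omega\left(aa^\dagger - \frac{1}{2}\right) = \omega\left(a^\dagger a + \frac{1}{2}\right). \tag{17.45}$$
Introducing the number operator N ≡ a†a, we obtain

[N, a] = −a, [N, a†] = a†. (17.46)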

Suppose |ν⟩ is the normalized eigenstate of N with eigenvalue ν. Consequently, we have

N a |ν⟩ = a(N − 1) |ν⟩ = (ν − 1)a |ν⟩ . (17.47)

If ν ≠ 0, a|ν⟩ cannot be 0 since ‖a|ν⟩‖² = ⟨ν|N|ν⟩ = ν ≠ 0, and so a|ν⟩ must be an eigenstate of N with eigenvalue ν − 1. Since the norm must be nonnegative, it follows that ν ≥ 0, and thus an eigenvalue cannot be negative. By applying the operator a repeatedly, it would appear that one could construct an indefinitely long sequence of eigenstates having the eigenvalues ν − 1, ν − 2, ν − 3 and so on, which would eventually become negative. The contradiction can be avoided only if the sequence terminates with the value ν = 0 and a|0⟩ = 0.
Similarly, we also have

N a† |ν⟩ = a† (N + 1) |ν⟩ = (ν + 1)a† |ν⟩ . (17.48)


Since ‖a†|ν⟩‖² = ⟨ν|N + 1|ν⟩ = ν + 1 > 0, a†|ν⟩ must be an eigenstate of N with eigenvalue ν + 1. By repeatedly applying the operator a†, one can construct an unlimited sequence
of eigenstates, each having an eigenvalue one unit greater than that of its predecessor. The
sequence begins with the eigenvalue ν = 0. The spectrum of N consists of the nonnegative
integers, ν = n.
Now the orthonormal eigenstates of N will be denoted as |n⟩, and we can derive that

|n⟩ = n^{−1/2} a†|n − 1⟩ = (n!)^{−1/2} (a†)^n |0⟩. (17.49)

The matrix elements of a† and a are

⟨n′|a†|n⟩ = (n + 1)^{1/2} δ_{n′,n+1}, ⟨n′|a|n⟩ = n^{1/2} δ_{n′,n−1}. (17.50)

The eigenvalues and eigenstates of the harmonic oscillator Hamiltonian are
$$H|n\rangle = E_n|n\rangle, \qquad E_n = \left(n+\frac{1}{2}\right)\omega. \tag{17.51}$$
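The algebraic results can be reproduced with finite matrices. The sketch below assumes a truncation of the number basis to the first D states, which is an approximation: N = a†a and the energies come out exactly, but the commutator [a, a†] equals the identity only away from the last basis state.

```python
import math

D = 6          # number of basis states |0>, ..., |D-1> kept (truncation, assumed)
omega = 2.0    # arbitrary oscillator frequency

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# Matrix elements from equation 17.50: <n'|a|n> = sqrt(n) delta_{n', n-1}.
a = [[math.sqrt(n) if row == n - 1 else 0.0 for n in range(D)] for row in range(D)]
adag = [[a[j][i] for j in range(D)] for i in range(D)]   # real matrix, so adjoint = transpose

N = matmul(adag, a)
H = [[omega * (N[i][j] + (0.5 if i == j else 0.0)) for j in range(D)] for i in range(D)]
energies = [H[n][n] for n in range(D)]

# Commutator [a, a^dagger]; its last diagonal entry is spoiled by the truncation.
aad, ada = matmul(a, adag), matmul(adag, a)
comm_diag = [aad[n][n] - ada[n][n] for n in range(D)]
off_diag_H = max(abs(H[i][j]) for i in range(D) for j in range(D) if i != j)
```

The diagonal of H reproduces E_n = (n + 1/2)ω, and the commutator check is restricted to the first D − 1 diagonal entries for the reason given above.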

17.6.2 Solution in coordinate representation


In the coordinate representation, the Schrödinger equation for harmonic oscillator is
$$-\frac{1}{2m}\frac{d^2\psi}{dx^2} + \frac{m\omega^2x^2}{2}\psi = E\psi. \tag{17.52}$$
Define
$$q \equiv (m\omega)^{1/2}x, \qquad \lambda \equiv \frac{2E}{\omega}. \tag{17.53}$$
Writing u(q) ≡ ψ(x), the Schrödinger equation becomes
$$\frac{d^2u}{dq^2} + (\lambda - q^2)u = 0. \tag{17.54}$$
When q → ±∞, u(q) ∼ e^{q²/2} or e^{−q²/2}. The first of these is unacceptable, because it diverges so severely as to be outside of both Hilbert space and rigged Hilbert space. We would seek solutions of the form
$$u(q) = H(q)\,e^{-\frac{1}{2}q^2}. \tag{17.55}$$

Substituting 17.55 into 17.54, we would obtain equation


H ′′ − 2qH ′ + (λ − 1)H = 0. (17.56)
It is the well-known Hermite differential equation. When λ − 1 = 2n with n a nonnegative integer, we have regular solutions. The solutions are given by Hermite polynomials, and will be denoted as H_n(q). After appropriate normalization, we have
$$\psi_n(x) = \left[\frac{\alpha}{\pi^{1/2}2^n n!}\right]^{1/2} H_n(\alpha x)\,e^{-\frac{1}{2}\alpha^2x^2}, \qquad E_n = \left(n+\frac{1}{2}\right)\omega, \tag{17.57}$$
where α ≡ (mω)^{1/2}.
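The normalization in equation 17.57 can be tested numerically. The sketch assumes the standard three-term recurrence H_{n+1}(q) = 2qH_n(q) − 2nH_{n−1}(q) for the (physicists') Hermite polynomials, an arbitrary value of α, and a midpoint rule on a finite interval; it checks that the first few ψ_n are orthonormal.

```python
import math

def hermite(n, q):
    """Physicists' Hermite polynomial H_n(q) from H_{j+1} = 2 q H_j - 2 j H_{j-1}."""
    h_prev, h = 0.0, 1.0          # placeholder for H_{-1}, and H_0 = 1
    for j in range(n):
        h_prev, h = h, 2 * q * h - 2 * j * h_prev
    return h

def psi(n, x, alpha):
    """Normalized oscillator eigenfunction of equation 17.57."""
    norm = math.sqrt(alpha / (math.sqrt(math.pi) * 2 ** n * math.factorial(n)))
    return norm * hermite(n, alpha * x) * math.exp(-0.5 * alpha * alpha * x * x)

def overlap(n, m, alpha, steps=4000):
    """Midpoint-rule estimate of the integral of psi_n * psi_m over the real line."""
    half_width = 12.0 / alpha     # the integrand is negligible beyond this range
    h = 2 * half_width / steps
    return h * sum(psi(n, -half_width + (i + 0.5) * h, alpha)
                   * psi(m, -half_width + (i + 0.5) * h, alpha)
                   for i in range(steps))

alpha = 1.3   # arbitrary value of sqrt(m * omega)
gram = [[overlap(n, m, alpha) for m in range(4)] for n in range(4)]
```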

17.6.3 Path integral solution


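The propagator of a harmonic oscillator in terms of path integral is
$$G(x_b,t_b;x_a,t_a) = \int\mathcal{D}[x(t)]\,\exp\left[i\int_{t_a}^{t_b}\left(\tfrac{1}{2}m\dot{x}^2 - \tfrac{1}{2}m\omega^2x^2\right)dt\right] = \lim_{N\to\infty}\int\left(\frac{m}{2i\pi\Delta t}\right)^{\frac{N+1}{2}}\exp\left\{i\sum_{j=0}^{N}\left[\frac{m(x_{j+1}-x_j)^2}{2\Delta t^2} - \frac{1}{2}m\omega^2x_{j+1}^2\right]\Delta t\right\}dx_1\cdots dx_N, \tag{17.58}$$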
where x0 = xa , xN +1 = xb and ∆t = (tb − ta )/(N + 1). Suppose xc (t) is the classical path
of the harmonic oscillator and δx(t) ≡ x(t) − xc (t) is the deviation from the classical path.
Substituting x = x_c + δx into 17.58, terms which are linear in δx can be dropped since
$$\left.\frac{\delta S}{\delta x}\right|_{x(t)=x_c(t)} = 0. \tag{17.59}$$

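Noticing that δx_0 = δx_{N+1} = 0, equation 17.58 can be simplified into
$$G(x_b,t_b;x_a,t_a) = e^{iS_c}\lim_{N\to\infty}\left(\frac{m}{2i\pi\Delta t}\right)^{\frac{N+1}{2}}\int\exp\left\{i\sum_{j,k=1}^{N}\delta x_j S_{jk}\,\delta x_k\right\}d\delta x_1\cdots d\delta x_N = e^{iS_c}\lim_{N\to\infty}\left(\frac{m}{2i\pi\Delta t}\right)^{\frac{N+1}{2}}\sqrt{\frac{\pi^N}{\det(-iS)}}, \tag{17.60}$$
where
$$-iS = \frac{m}{2i\Delta t}\begin{pmatrix} 2-\omega^2\Delta t^2 & -1 & 0 & \cdots \\ -1 & 2-\omega^2\Delta t^2 & -1 & \cdots \\ 0 & -1 & 2-\omega^2\Delta t^2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}. \tag{17.61}$$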
Further simplification of equation 17.60 is omitted here but can be found in section 2.1.4 of Quantum Field Theory of Many-body Systems (Xiao-Gang Wen). The final result is
$$G(x_b,t;x_a,0) = \left(\frac{m\omega}{2\pi i\sin\omega t}\right)^{1/2}\exp\left\{\frac{im\omega}{2\sin\omega t}\left[(x_a^2+x_b^2)\cos\omega t - 2x_ax_b\right]\right\}. \tag{17.62}$$
In the limit ω → 0, we have
$$G(x_b,t;x_a,0) = \left(\frac{m}{2\pi it}\right)^{1/2}\exp\left[\frac{im}{2t}(x_b-x_a)^2\right], \tag{17.63}$$
which is the propagator of a free particle.
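The free-particle limit can be checked by direct evaluation. The sketch below assumes arbitrary test values with 0 < ωt < π, so that sin ωt > 0 and the principal branch of the complex square root applies to both expressions; it compares equation 17.62 at a small ω with equation 17.63.

```python
import cmath
import math

def g_oscillator(xb, xa, t, m, omega):
    """Harmonic-oscillator propagator of equation 17.62 (assumes 0 < omega * t < pi)."""
    s = math.sin(omega * t)
    prefactor = cmath.sqrt(m * omega / (2j * math.pi * s))
    phase = 1j * m * omega / (2 * s) * ((xa * xa + xb * xb) * math.cos(omega * t)
                                        - 2 * xa * xb)
    return prefactor * cmath.exp(phase)

def g_free(xb, xa, t, m):
    """Free-particle propagator of equation 17.63 (assumes t > 0)."""
    return cmath.sqrt(m / (2j * math.pi * t)) * cmath.exp(1j * m * (xb - xa) ** 2 / (2 * t))

m, t, xa, xb = 1.0, 1.0, 0.3, -0.8      # arbitrary test values
small_omega = 1e-4
difference = abs(g_oscillator(xb, xa, t, m, small_omega) - g_free(xb, xa, t, m))
```

The difference is expected to shrink like (ωt)² as ω decreases, so it is small but not zero at any finite ω.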

17.7 Quantum mechanics in classical electromagnetic field


17.7.1 General discussion
In classical electrodynamics, the Hamiltonian of a charged particle whose velocity is much
smaller than that of light in a given EM field is

$$H = \frac{(\pi - eA)^2}{2m} + e\phi. \tag{17.64}$$
The Hamiltonian operator in corresponding quantum theory will be

$$H = \frac{[P - eA(X)]^2}{2m} + e\phi(X). \tag{17.65}$$
In Heisenberg picture, the equation of motion is

$$\frac{dX}{dt} = -i[X,H] = \frac{1}{m}(P - eA). \tag{17.66}$$
Define kinetic momentum K by
K ≡ P − eA. (17.67)
It follows that
[Ki , Kj ] = ie(∂i Aj − ∂j Ai ) = ieϵijk Bk . (17.68)
Hence,
$$m\frac{d^2X}{dt^2} = -i[K,H] + \frac{\partial K}{\partial t} = e\left[E + \frac{1}{2}\left(\frac{dX}{dt}\times B - B\times\frac{dX}{dt}\right)\right]. \tag{17.69}$$

In coordinate representation of Schrödinger picture, the equation of motion is

$$\frac{1}{2m}[-i\nabla - eA]\cdot[-i\nabla - eA]\,\psi(x,t) + e\phi(x)\psi(x,t) = i\frac{\partial\psi(x,t)}{\partial t}. \tag{17.70}$$
Define the probability current of the particle as

$$j \equiv \frac{1}{m}\operatorname{Re}(\psi^*K\psi) = \frac{1}{m}\operatorname{Im}(\psi^*\nabla\psi) - \frac{e}{m}A|\psi|^2. \tag{17.71}$$
We can verify that, with ρ ≡ |ψ|²,
$$\nabla\cdot j + \frac{\partial\rho}{\partial t} = 0. \tag{17.72}$$
The gauge transformation
$$\phi \to \phi - \frac{\partial\Lambda}{\partial t}, \qquad A \to A + \nabla\Lambda \tag{17.73}$$
will leave E and B unchanged. In classical electrodynamics, gauge transformation will not
change the trajectory of particles, which is the only thing we can observe in experiment. In
quantum theory, suppose the state vector |ψ⟩ will transform as

|ψ(t)⟩ → U (t) |ψ(t)⟩ (17.74)



under gauge transformation, where U (t) is a unitary operator. If the Schrödinger equation is
always satisfied, we should demand that

$$H'U - UH = i\frac{\partial U}{\partial t}, \tag{17.75}$$
where H ′ is the Hamiltonian operator after gauge transformation. Generally, we have

U (t) = exp [ieΛ(X, t)] . (17.76)

It follows that

U^{−1} X U = X, U^{−1} P U = P + e∇Λ, U^{−1}(P − eA′)U = P − eA. (17.77)

The expectation value of X and K is invariant under gauge transformation. We can also verify
that j is also invariant under gauge transformation.
One special case of gauge transformation is

ϕ → ϕ + ϕ0 (t), A → A. (17.78)

In this case, we have
$$U(t) = \exp\left[-i\int_{t_0}^{t} e\phi_0(t')\,dt'\right]. \tag{17.79}$$

If ϕ0 is a constant, we have
U(t) = exp[−ieϕ0(t − t0)]. (17.80)

17.7.2 Motion in a uniform static magnetic field


Let the magnetic field have magnitude B in the ẑ direction. The Hamiltonian is H = H_xy + H_z with H_xy = (K_x² + K_y²)/2m and H_z = K_z²/2m. Since B_x = B_y = 0, K_z commutes with K_x and K_y. Hence the operators H_xy and H_z commute, and every eigenvalue of H is just the sum of an eigenvalue of H_xy and an eigenvalue of H_z. Define

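$$Q' \equiv \frac{K_x}{\gamma}, \qquad P' \equiv \frac{K_y}{\gamma}, \qquad \gamma \equiv \sqrt{|eB|}. \tag{17.81}$$
We have
$$H_{xy} = \frac{1}{2}\frac{|eB|}{m}(Q'^2 + P'^2) \quad \text{with } [Q',P'] = i \text{ or } -i. \tag{17.82}$$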
Thus, the eigenvalues of Hxy must be equal to (n + 1/2)|eB|/m, where n is any non-negative
integer.
The spectrum of Kz is gauge invariant. Because the magnetic field is uniform and in the ẑ
direction, it is possible to choose the vector potential such that Az = 0. Thus, the spectrum of
Kz is continuous from −∞ to ∞, like that of Pz . The energy eigenvalues for a charged particle
in a uniform static magnetic field B are therefore

$$E_n(p_z) = \frac{(n+1/2)|eB|}{m} + \frac{p_z^2}{2m}. \tag{17.83}$$

The motion parallel to the magnetic field is not coupled to the transverse motion, and is unaf-
fected by the field. The classical motion in the plane perpendicular to the field is in a circular
orbit with angular frequency ωc = eB/m, and it is well known that periodic motions corre-
spond to discrete energy levels whose separation is ωc .
We can also derive the energy spectrum in coordinate representation. Let us choose the vector
potential to be Ax = −yB, Ay = Az = 0. The Hamiltonian now becomes

$$H = \frac{(P_x + yeB)^2 + P_y^2 + P_z^2}{2m}. \tag{17.84}$$
Px and Pz commute with H, so it is possible to construct a complete set of common eigenstates
of H, Px and Pz . In coordinate representation, the eigenvalue equation now takes the form

$$-\frac{1}{2m}\nabla^2\psi - \frac{ieB}{m}\,y\,\frac{\partial\psi}{\partial x} + \frac{e^2B^2}{2m}\,y^2\psi = E\psi. \tag{17.85}$$
Suppose
ψ(x, y, z) = exp(i k_x x + i k_z z) ϕ(y). (17.86)
The equation becomes
$$-\frac{1}{2m}\frac{d^2\phi(y)}{dy^2} + \left[\frac{m\omega_c^2}{2}(y-y_0)^2 - E'\right]\phi(y) = 0, \tag{17.87}$$

where E ′ = E − kz2 /2m is the energy associated with motion in the xy plane. This is just the
energy eigenvalue equation for a simple harmonic oscillator with angular frequency ω = |ωc |.
The energies for the charged particle in the magnetic field must be E = (n+1/2)|ωc |+kz2 /2m.
Apart from a normalization constant, the eigenfunction is
 
\[
\psi = \exp(ik_x x + ik_z z)\, H_n[\alpha(y - y_0)]\, \exp\left[-\frac{1}{2}\alpha^2 (y - y_0)^2\right], \tag{17.88}
\]
with $\alpha = \sqrt{m\omega} = \sqrt{|eB|}$ and $y_0 = -k_x/eB$.
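The oscillator spectrum claimed for (17.87) can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using illustrative parameter values (m, e, B, k_x and the grid are hypothetical choices, not from the text). It discretizes (17.87) with a second-order finite difference and compares the lowest eigenvalues E′ with (n + 1/2)|ω_c|.

```python
import numpy as np

def transverse_levels(m=1.0, e=1.0, B=2.0, kx=0.5, n_grid=800, half_width=8.0):
    """Lowest eigenvalues E' of -(1/2m) phi'' + (m w^2/2)(y - y0)^2 phi = E' phi."""
    omega = abs(e * B) / m            # |omega_c| in units with hbar = 1
    y0 = -kx / (e * B)                # orbit centre, as in (17.88)
    y = np.linspace(y0 - half_width, y0 + half_width, n_grid)
    h = y[1] - y[0]
    # Central-difference kinetic term; the wave function is set to zero at the ends.
    diag = 1.0 / (m * h**2) + 0.5 * m * omega**2 * (y - y0) ** 2
    off = -0.5 / (m * h**2) * np.ones(n_grid - 1)
    H = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:4], omega

levels, omega = transverse_levels()
expected = (np.arange(4) + 0.5) * omega   # (n + 1/2)|omega_c| for n = 0..3
print(levels, expected)
```

The eigenvalues do not depend on k_x, which only shifts the orbit centre y_0; this is the origin of the degeneracy discussed next.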
For fixed n and kz , the energy eigenvalue is highly degenerate. For convenience, we assume
that the system is confined to a rectangle of dimension Dx ×Dy and subject to periodic bound-
ary conditions. The allowed values of kx are kx = 2πnx /Dx , with nx = 0, ±1, · · · . The orbit
center coordinate y0 = −2πnx /Dx eB must lie in the range [0, Dy ]. In the limit as Dx and Dy
become large, we may ignore problems associated with orbits lying near the boundary, since
they will be a negligible fraction of the total. In this limit the number of degenerate states
corresponding to fixed n and kz will be

\[
\frac{D_x D_y |eB|}{2\pi} = \frac{e}{2\pi}\,\Phi. \tag{17.89}
\]

17.7.3 The Aharonov-Bohm effect


As shown in Figure 17.1, a long solenoid is placed perpendicular to the plane of the figure,
so that a magnetic field can be created inside the solenoid while the region external to the

Figure 17.1: The Aharonov–Bohm experiment.

solenoid remains field-free. The solenoid is located in the unilluminated shadow region so
that no particles will reach it, and moreover it may be surrounded by a cylindrical shield that
is impenetrable to the charged particles. Nevertheless it can be shown that the interference
pattern depends upon the magnetic flux through the cylinder.
Let $\Psi^{(0)}(x, t)$ be the solution of the Schrödinger equation with the boundary conditions of this problem for the case in which the vector potential is everywhere zero. Now let us consider the case in which the magnetic field is non-zero inside the cylinder but zero outside of it. The vector potential $A$ will not vanish everywhere in the exterior region, even though $B = 0$ outside of the cylinder. This follows by applying Stokes' theorem to any path surrounding the cylinder:
\[
\oint A \cdot dx = \iint (\nabla \times A) \cdot dS = \iint B \cdot dS = \Phi. \tag{17.90}
\]

If the flux through the cylinder is not zero, the vector potential must be nonzero on every path that encloses the cylinder. However, in any simply connected region outside of the cylinder, it is possible to express the vector potential as the gradient of a scalar, $A = \nabla\Lambda$, so that the solution there can be obtained from the zero-potential solution by means of a gauge transformation, $\Psi = \Psi^{(0)} \exp(ie\Lambda)$.
In region L, which contains the slit on the left, the wave function can be written as $\Psi_L = \Psi_L^{(0)} \exp(ie\Lambda_1)$, where $\Psi_L^{(0)}$ is the zero-potential solution in region L, and $\Lambda_1(x, t) = \int A \cdot dx$, with the integral taken along a path within region L. A similar form can be written for the wave function in the region R, which contains the slit on the right. At the point b, in the overlap of regions L and R, the wave function is a superposition of contributions from both slits. Hence we have
\[
\Psi_b = \Psi_L^{(0)} \exp(ie\Lambda_1) + \Psi_R^{(0)} \exp(ie\Lambda_2). \tag{17.91}
\]
The interference pattern depends on exp[ie(Λ1 − Λ2 )] = exp(ieΦ). Therefore the interfer-
ence pattern is sensitive to the magnetic flux inside of the cylinder, even though the particles
never pass through the region in which the magnetic field is nonzero. The AB (Aharonov-
Bohm) effect is a topological effect, in that the effect depends on the flux encircled by the
paths available to the particle, even though the paths may never approach the region of the
flux.
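The statement (17.90), that the circulation of A equals the enclosed flux even where B = 0, can be illustrated numerically. The sketch below assumes an idealized, infinitely thin solenoid on the z axis and the particular gauge A = Φ/(2πr) φ̂ outside it; both the gauge and the loop shapes are illustrative assumptions, not taken from the text.

```python
import math

def vector_potential(x, y, flux):
    """Exterior potential of an ideal solenoid on the z axis: A = flux/(2 pi r) phi_hat."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))

def loop_integral(cx, cy, ax, ay, flux, steps=20000):
    """Midpoint-rule value of the closed line integral of A around an ellipse
    centred at (cx, cy) with semi-axes ax, ay."""
    total = 0.0
    for k in range(steps):
        t0 = 2 * math.pi * k / steps
        t1 = 2 * math.pi * (k + 1) / steps
        tm = 0.5 * (t0 + t1)
        x, y = cx + ax * math.cos(tm), cy + ay * math.sin(tm)
        dx = ax * (math.cos(t1) - math.cos(t0))
        dy = ay * (math.sin(t1) - math.sin(t0))
        Ax, Ay = vector_potential(x, y, flux)
        total += Ax * dx + Ay * dy
    return total

flux = 0.7
enclosing = loop_integral(0.3, -0.2, 3.0, 2.0, flux)  # ellipse that surrounds the solenoid
excluding = loop_integral(5.0, 0.0, 1.0, 1.5, flux)   # loop that does not surround it
e = 1.0                                                # hypothetical unit charge
relative_phase = complex(math.cos(e * enclosing), math.sin(e * enclosing))  # exp(ie*Phi)
print(enclosing, excluding, relative_phase)
```

Only loops that encircle the solenoid pick up the phase exp(ieΦ), which is the topological character of the effect described above.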
Chapter 18
Angular Momentum

18.1 Eigenvalues of angular momentum operators


Commutation relations of angular momentum operators are given by

[Ji , Jj ] = iϵijk Jk . (18.1)

Introduce the operator $J^2 \equiv J_x^2 + J_y^2 + J_z^2$. We can deduce that $[J^2, J] = 0$. Thus, there exists a complete set of common eigenstates of $J^2$ and any one component of $J$. In particular, we have the following eigenvalue equations,

J 2 |β, m⟩ = β |β, m⟩ , Jz |β, m⟩ = m |β, m⟩ . (18.2)

Noticing that

⟨β, m|J 2 − Jz2 |β, m⟩ = ⟨β, m|Jx2 |β, m⟩ + ⟨β, m|Jy2 |β, m⟩ ≥ 0, (18.3)

the inequality m2 ≤ β is obtained immediately. For a fixed value of β, there must be maximum
and minimum values for m.
Define
J+ ≡ Jx + iJy , J− ≡ Jx − iJy . (18.4)
It follows that
[Jz , J+ ] = J+ , [Jz , J− ] = −J− , [J+ , J− ] = 2Jz . (18.5)
Thus, we have

Jz J+ |β, m⟩ = J+ (Jz + 1) |β, m⟩ = (m + 1)J+ |β, m⟩ . (18.6)

Either J+ |β, m⟩ is an eigenstate of Jz with the raised eigenvalue m + 1, or J+ |β, m⟩ = 0. For


fixed β, there is a maximum value of m, which we shall denote as j. It must be the case that

J+ |β, j⟩ = 0. (18.7)

Noticing that
J− J+ = J 2 − Jz2 − Jz , (18.8)
we can derive that β = j(j+1). By a similar method, we can show that the minimum eigenvalue
k of Jz for fixed β satisfies β = k(k − 1). Consequently, we have k = −j. Now, we have shown

the existence of a set of eigenstates corresponding to integer spaced m values in the range
−j ≤ m ≤ j. Since the difference between the maximum value j and the minimum value −j
must be an integer, it follows that j = integer/2. Henceforth we shall adopt the common and
more convenient notation of labeling the eigenstates by j instead of by β. The vector that was
previously denoted as |β, m⟩ will now be denoted as |j, m⟩.
To find the matrix element of angular momentum operators, we notice that

⟨j, m|J− J+ |j, m⟩ = j(j + 1) − m(m + 1). (18.9)

It can be seen that


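\[
J_+ |j, m\rangle = \sqrt{(j + m + 1)(j - m)}\, |j, m + 1\rangle, \qquad J_- |j, m\rangle = \sqrt{(j - m + 1)(j + m)}\, |j, m - 1\rangle. \tag{18.10}
\]
The matrix elements of $J_+$, $J_-$ and $J_z$ are
\[
\langle j', m'|J_+|j, m\rangle = \sqrt{(j + m + 1)(j - m)}\, \delta_{jj'}\delta_{m',m+1}; \tag{18.11a}
\]
\[
\langle j', m'|J_-|j, m\rangle = \sqrt{(j - m + 1)(j + m)}\, \delta_{jj'}\delta_{m',m-1}; \tag{18.11b}
\]
\[
\langle j', m'|J_z|j, m\rangle = m\, \delta_{jj'}\delta_{m',m}. \tag{18.11c}
\]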
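The matrix elements (18.11) are enough to build explicit matrices for any j. The following is a minimal sketch, assuming NumPy and the basis ordering m = j, j−1, ..., −j (the function name and ordering are illustrative choices); it can be used to confirm the commutation relations (18.1) and the eigenvalue j(j+1) of J².

```python
import numpy as np

def angular_momentum_matrices(j):
    """Return (Jx, Jy, Jz) in the basis |j, m>, ordered m = j, j-1, ..., -j."""
    m = np.arange(j, -j - 1, -1)
    dim = len(m)
    Jplus = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        # <j, m+1| J+ |j, m> = sqrt((j + m + 1)(j - m)), from (18.11a)
        Jplus[col - 1, col] = np.sqrt((j + m[col] + 1) * (j - m[col]))
    Jminus = Jplus.conj().T
    Jx = (Jplus + Jminus) / 2.0
    Jy = (Jplus - Jminus) / (2.0 * 1j)
    Jz = np.diag(m).astype(complex)
    return Jx, Jy, Jz

Jx, Jy, Jz = angular_momentum_matrices(1.5)
commutator = Jx @ Jy - Jy @ Jx            # should equal i Jz
casimir = Jx @ Jx + Jy @ Jy + Jz @ Jz     # should equal j(j+1) times the identity
print(np.allclose(commutator, 1j * Jz), np.allclose(casimir, 1.5 * 2.5 * np.eye(4)))
```

For j = 1/2 the same function returns half the Pauli matrices.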

18.2 Orbital Angular Momentum and Spin


Let ψ(x) be a one-component state function in coordinate representation. When it is subjected
to a rotation, it is transformed into

Rψ(x) = ψ(R−1 x), (18.12)

where R is the rotation operator generated by R = exp(−iJ · n̂θ). For a rotation through an
infinitesimal angle ϵ about z axis, we have
 
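\[
R_z(\epsilon)\psi(x, y, z) = \psi(x + \epsilon y, y - \epsilon x, z) = \psi(x, y, z) + \epsilon\left(y\frac{\partial\psi}{\partial x} - x\frac{\partial\psi}{\partial y}\right). \tag{18.13}
\]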

Noticing that
Rz (ϵ) = I − iϵJz , (18.14)
we have Jz = −i(x∂y − y∂x ), which is the z component of the orbital angular momentum
operator L = X × P in coordinate representation.
For a multicomponent state function, we have
   
\[
R \begin{pmatrix} \psi_1(x) \\ \psi_2(x) \\ \vdots \end{pmatrix} = D \begin{pmatrix} \psi_1(R^{-1}x) \\ \psi_2(R^{-1}x) \\ \vdots \end{pmatrix}. \tag{18.15}
\]

The general form of the rotation operator will be

Rn̂ (θ) = e−iL·n̂θ Dn̂ (θ). (18.16)



The two factors commute because the first acts only on the coordinate and the second acts only
on the components of the column vector. The matrix D must be unitary, and it can be written
as
Dn̂ (θ) = e−iS·n̂θ , (18.17)
where the $S_i$ are finite-dimensional Hermitian matrices satisfying the commutation relations $[S_i, S_j] = i\epsilon_{ijk} S_k$. The angular momentum operator $J$ now takes the form

J =L+S (18.18)

and [Lα , Sβ ] = 0. L and S are called the orbital and spin part of the angular momentum
operator.

Orbital angular momentum


The gradient operator in spherical coordinates is

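\[
\nabla = \hat{e}_r \frac{\partial}{\partial r} + \hat{e}_\theta \frac{1}{r}\frac{\partial}{\partial\theta} + \hat{e}_\phi \frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}. \tag{18.19}
\]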

The orbital angular momentum operator now becomes


 
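\[
L = r\hat{e}_r \times (-i\nabla) = (-i)\left(\hat{e}_\phi \frac{\partial}{\partial\theta} - \hat{e}_\theta \frac{1}{\sin\theta}\frac{\partial}{\partial\phi}\right). \tag{18.20}
\]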

It follows that
   
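\[
L_z = L \cdot \hat{e}_z = -i\frac{\partial}{\partial\phi}, \qquad L^2 = L \cdot L = -\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right]. \tag{18.21}
\]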

Apart from normalization, solutions of the eigen equations,

Lz Ylm (θ, ϕ) = mYlm (θ, ϕ) and L2 Ylm (θ, ϕ) = l(l + 1)Ylm (θ, ϕ) (18.22)

are $Y_l^m(\theta, \phi) = e^{im\phi} P_l^m(\cos\theta)$, where $P_l^m$ is an associated Legendre function. If we assume that the solutions must be single-valued under rotation, it will follow that $m$ must be an integer. If we further assume that the solutions must be nonsingular at θ = 0 and θ = π, from the standard theory of the Legendre equation it will follow that $l$ must be a nonnegative integer in the range $l \geq |m|$. The normalized solutions that result from these assumptions are the well-known spherical harmonics
\[
Y_l^m(\theta, \phi) = (-1)^{(m+|m|)/2} \left[\frac{(2l + 1)(l - |m|)!}{4\pi(l + |m|)!}\right]^{1/2} e^{im\phi} P_l^{|m|}(\cos\theta). \tag{18.23}
\]

Spin
The eigenvalue equations for S 2 and Sz ,

S 2 |s, m⟩ = s(s + 1) |s, m⟩ , Sz |s, m⟩ = m |s, m⟩ (18.24)



have solutions for m = s, s − 1, · · · , −s with s being any nonnegative integer or half-integer.


Because a particular species of particle is characterized by a set of quantum numbers that
includes the value of its spin s, it is often sufficient to treat the spin operator as acting on the
space of dimension 2s + 1 that is spanned by the eigenstates of equation 18.24 for a fixed value
of s.
If s = 1/2, we have
     
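\[
S_x = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad S_y = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad S_z = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{18.25}
\]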
The spin operator in direction n̂ = (sin θ cos ϕ, sin θ sin ϕ, cos θ) is
 
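\[
S_{\hat{n}} = \frac{1}{2}\begin{pmatrix} \cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & -\cos\theta \end{pmatrix}, \tag{18.26}
\]
with eigenstates
\[
\begin{pmatrix} e^{-i\phi/2}\cos(\theta/2) \\ e^{i\phi/2}\sin(\theta/2) \end{pmatrix}, \qquad \begin{pmatrix} -e^{-i\phi/2}\sin(\theta/2) \\ e^{i\phi/2}\cos(\theta/2) \end{pmatrix}, \tag{18.27}
\]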
corresponding to eigenvalues 1/2 and −1/2 respectively.
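If s = 1, we have
\[
S_x = \sqrt{\frac{1}{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad S_y = \sqrt{\frac{1}{2}}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \qquad S_z = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{18.28}
\]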
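The spin operator in direction n̂ is
\[
S_{\hat{n}} = \begin{pmatrix} \cos\theta & \sqrt{\frac{1}{2}}\sin\theta\, e^{-i\phi} & 0 \\ \sqrt{\frac{1}{2}}\sin\theta\, e^{i\phi} & 0 & \sqrt{\frac{1}{2}}\sin\theta\, e^{-i\phi} \\ 0 & \sqrt{\frac{1}{2}}\sin\theta\, e^{i\phi} & -\cos\theta \end{pmatrix}, \tag{18.29}
\]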

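with eigenstates
\[
\begin{pmatrix} (1 + \cos\theta)e^{-i\phi}/2 \\ \sqrt{1/2}\,\sin\theta \\ (1 - \cos\theta)e^{i\phi}/2 \end{pmatrix}, \qquad \begin{pmatrix} -\sqrt{1/2}\,\sin\theta\, e^{-i\phi} \\ \cos\theta \\ \sqrt{1/2}\,\sin\theta\, e^{i\phi} \end{pmatrix}, \qquad \begin{pmatrix} (1 - \cos\theta)e^{-i\phi}/2 \\ -\sqrt{1/2}\,\sin\theta \\ (1 + \cos\theta)e^{i\phi}/2 \end{pmatrix}, \tag{18.30}
\]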

corresponding to eigenvalues 1, 0 and −1 respectively.


Notice that the spin-1 representation of rotations is equivalent to the vector representation, where $(S_i)_{jk} = -i\epsilon_{ijk}$. The two are related by a change of basis.

18.3 Rotation operator


Three parameters are required to describe an arbitrary rotation. A common parameterization
is by the Euler angles. As shown in Figure 18.1, from the fixed system of axes Oxyz, a new
rotated set of axes Ox′ y ′ z ′ is produced in three steps:

• Rotate through angle α about Oz, carrying Oy into Ou.

• Rotate through angle β about Ou, carrying Oz into Oz ′ .

• Rotate through angle γ about Oz′, carrying Ou into Oy′.

Figure 18.1: Euler angles.

The net rotation is

R(α, β, γ) = Rz′ (γ)Ru (β)Rz (α) = e−iγJz′ e−iβJu e−iαJz . (18.31)

Since Ju = Rz (α)Jy Rz (−α), we have Ru (β) = Rz (α)Ry (β)Rz (−α). Similarly, we can also
get Rz′ (γ) = Ru (β)Rz (γ)Ru (−β). The rotation operator now becomes

R(α, β, γ) = Rz (α)Ry (β)Rz (γ) = e−iαJz e−iβJy e−iγJz . (18.32)
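The equality of the moving-axis form (18.31) and the fixed-axis form (18.32) can be checked on ordinary 3×3 rotation matrices. The sketch below assumes NumPy and uses Rodrigues' formula for an active rotation about an arbitrary unit axis; the angle values and helper name are illustrative.

```python
import numpy as np

def rotation(axis, angle):
    """3x3 matrix of an active rotation by `angle` about `axis` (Rodrigues' formula)."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

alpha, beta, gamma = 0.4, 1.1, -0.7
ex, ey, ez = np.eye(3)
Rz_alpha = rotation(ez, alpha)
u = Rz_alpha @ ey                  # first step carries Oy into Ou
Ru_beta = rotation(u, beta)
z_new = Ru_beta @ ez               # second step carries Oz into Oz'
moving_axes = rotation(z_new, gamma) @ Ru_beta @ Rz_alpha                      # form (18.31)
fixed_axes = rotation(ez, alpha) @ rotation(ey, beta) @ rotation(ez, gamma)   # form (18.32)
print(np.allclose(moving_axes, fixed_axes))
```

The same reordering holds for the operators on state space because it relies only on the conjugation identities quoted above.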

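The matrix representation of the rotation operator in the basis $|j, m\rangle$,
\[
\langle j', m'|R(\alpha, \beta, \gamma)|j, m\rangle = \delta_{jj'} D^{(j)}_{m'm}(\alpha, \beta, \gamma), \tag{18.33}
\]
gives rise to the rotation matrices,
\[
D^{(j)}_{m'm}(\alpha, \beta, \gamma) \equiv \langle j, m'|e^{-i\alpha J_z} e^{-i\beta J_y} e^{-i\gamma J_z}|j, m\rangle = e^{-i(\alpha m' + \gamma m)} d^{(j)}_{m'm}(\beta), \tag{18.34}
\]
where
\[
d^{(j)}_{m'm}(\beta) \equiv \langle j, m'|e^{-i\beta J_y}|j, m\rangle. \tag{18.35}
\]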
For the case of j = 1/2, we have Jy = σy /2 and σy2 = I. We can obtain
 
\[
d^{(1/2)}(\beta) = \begin{pmatrix} \cos(\beta/2) & -\sin(\beta/2) \\ \sin(\beta/2) & \cos(\beta/2) \end{pmatrix}. \tag{18.36}
\]

Notice that this matrix is periodic in β with period 4π, but it changes sign when 2π is added to
β. This double-valuedness under rotation by 2π is a characteristic of the full rotation matrix
whenever j is a half odd-integer. The matrix is single-valued under rotation by 2π whenever
j is an integer.
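The sign change under a 2π rotation can be seen directly by exponentiating σ_y/2. The sketch below assumes NumPy and evaluates the exponential through the eigen-decomposition of the Hermitian generator (an implementation choice made here, not a method prescribed by the text).

```python
import numpy as np

def d_half(beta):
    """exp(-i beta sigma_y / 2), built from the eigen-decomposition of sigma_y / 2."""
    sigma_y = np.array([[0.0, -1.0j], [1.0j, 0.0]])
    w, v = np.linalg.eigh(sigma_y / 2.0)
    # v @ diag(exp(-i beta w)) @ v^dagger
    return (v * np.exp(-1j * beta * w)) @ v.conj().T

beta = 0.9
closed_form = np.array([[np.cos(beta / 2), -np.sin(beta / 2)],
                        [np.sin(beta / 2), np.cos(beta / 2)]])
print(np.allclose(d_half(beta), closed_form))
print(np.allclose(d_half(beta + 2 * np.pi), -d_half(beta)))   # sign flip after 2 pi
print(np.allclose(d_half(beta + 4 * np.pi), d_half(beta)))    # period 4 pi
```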

Rotation of angular momentum eigenstates now can be written as


\[
R(\alpha, \beta, \gamma)\, |j, m\rangle = \sum_{m'} D^{(j)}_{m'm}(\alpha, \beta, \gamma)\, |j, m'\rangle. \tag{18.37}
\]

When it comes to spherical harmonics, we have


\[
Y_l^m(\theta', \phi') = [R(\alpha, \beta, \gamma)]^{-1} Y_l^m(\theta, \phi) = \sum_{m'} Y_l^{m'}(\theta, \phi)\, [D^{(l)}_{mm'}(\alpha, \beta, \gamma)]^*, \tag{18.38}
\]

where the rotation R(α, β, γ) takes a vector in the direction (θ, ϕ) into the direction (θ′ , ϕ′ ).
By putting β = γ = 0 we obtain
\[
Y_l^m(\theta, \phi + \alpha) = \sum_{m'} Y_l^{m'}(\theta, \phi)\, [D^{(l)}_{mm'}(\alpha, 0, 0)]^* = e^{i\alpha m}\, Y_l^m(\theta, \phi). \tag{18.39}
\]

Setting ϕ = 0 then yields


Ylm (θ, α) = eiαm Ylm (θ, 0). (18.40)
Since the direction θ = 0 is the polar axis, continuity of the spherical harmonic requires that
Ylm (0, α) be independent of α. We must have Ylm (0, 0) = 0 for m ̸= 0, i.e.,

Ylm (0, 0) = cl δm0. (18.41)

As $R(\phi, \theta, \gamma)$ carries a vector parallel to the z axis into the direction (θ, ϕ), we have
\[
Y_l^m(\theta, \phi) = \sum_{m'} Y_l^{m'}(0, 0)\, [D^{(l)}_{mm'}(\phi, \theta, \gamma)]^* = c_l\, [D^{(l)}_{m0}(\phi, \theta, \gamma)]^* \tag{18.42}
\]

for arbitrary γ, thus obtaining a simple relation between the spherical harmonics and the ro-
tation matrices. Conventional normalization is obtained if we put
 1/2
\[
c_l = \left[\frac{2l + 1}{4\pi}\right]^{1/2}. \tag{18.43}
\]

The operator for a rotation through 2π about an axis along the unit vector n̂ is Rn̂ (2π) =
e−2πin̂·J . Its effect on the standard angular momentum eigenstates is

\[
R_{\hat{n}}(2\pi)\, |j, m\rangle = (-1)^{2j}\, |j, m\rangle. \tag{18.44}
\]

We assume a rotation through 2π as a trivial operation that leaves everything unchanged, i.e.,
all dynamical variables are invariant under 2π rotation:

R(2π)AR−1 (2π) = A, (18.45)

where A may represent any physical observable.


The operator R(2π) divides the vector space into two subspaces. A typical vector in the first
subspace, denoted as |+⟩, has the property R(2π) |+⟩ = |+⟩, whereas a typical vector in the
second subspace, denoted as |−⟩, has the property R(2π) |−⟩ = − |−⟩. Now, if A represents
any physical observable, we have ⟨+|R(2π)A|−⟩ = ⟨+|AR(2π)|−⟩, leading to

⟨+|A|−⟩ = 0. (18.46)

No physical observable can have nonvanishing matrix elements between states with integer
angular momentum and states with half odd-integer angular momentum. This fact forms the
basis of a superselection rule: There is no observable distinction among the state vectors of the
form
|Ψω ⟩ = |+⟩ + eiω |−⟩ (18.47)
for different values of the phase ω.

18.4 Addition of angular momentum


Let us consider a two-component system, each component of which has angular momentum
degrees of freedom. Basis vectors for the composite system can be formed from the basis
vectors of the components by taking all binary products of a vector from each set

|j1 , j2 , m1 , m2 ⟩ = |j1 , m1 ⟩(1) |j2 , m2 ⟩(2) . (18.48)

These vectors are common eigenstates of the four commuting operators $J^{(1)} \cdot J^{(1)}$, $J^{(2)} \cdot J^{(2)}$, $J_z^{(1)}$, and $J_z^{(2)}$. It is often desirable to form eigenstates of the total angular momentum operators, $J \cdot J$ and $J_z$, where the total angular momentum vector operator is

J = J (1) ⊗ 1 + 1 ⊗ J (2) . (18.49)

This is useful when the system is invariant under rotation as a whole, but not under rotation of
the two components separately. The eigenstates of J · J and Jz may be denoted as |α, J, M ⟩.
It is easy to verify that the four operators J (1) · J (1) , J (2) · J (2) , J · J and Jz are mutually
commutative, and hence they possess a complete set of common eigenstates. Since the set of
product vectors and the new set of total angular momentum eigenstates are both eigenstates of
J (1) · J (1) and J (2) · J (2) , the eigenvalues j1 and j2 will be constant in both sets. Therefore we
may confine our attention to the vector space of dimension (2j1 + 1)(2j2 + 1) that is spanned
by product vectors with fixed values of j1 and j2 .

Now the 2J + 1 vectors |α, J, M ⟩, with M in the range −J ≤ M ≤ J, span an irreducible


subspace. If the vector |α, J, M ⟩, for a particular value of M , can be constructed in the space
under consideration, so can the entire set of 2J + 1 vectors. For a particular value of J, it
might be possible to construct one such set of vectors, two or more linearly independent sets,
or none at all. Let $N(J)$ denote the number of independent sets that can be constructed. Let $n(M)$ be the degree of degeneracy, in this space, of the eigenvalue $M$. The relation between these two quantities is
\[
n(M) = \sum_{J \geq |M|} N(J), \tag{18.50}
\]

and hence
N (J) = n(J) − n(J + 1). (18.51)
The product vectors |j1 , m1 ⟩ |j2 , m2 ⟩ are eigenstates of the operator Jz , with eigenvalue m1 +
m2 , and the degree of degeneracy n(M ) is equal to the number of pairs (m1 , m2 ) such that
M = m1 + m2 .
Figure 18.2: Possible values of M = m1 + m2, illustrated for j1 = 3, j2 = 2.

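As shown in Figure 18.2, we have
\[
n(M) = \begin{cases} 0, & |M| > j_1 + j_2 \\ j_1 + j_2 + 1 - |M|, & |j_1 - j_2| \leq |M| \leq j_1 + j_2 \\ 2j_{\min} + 1, & 0 \leq |M| \leq |j_1 - j_2| \end{cases} \tag{18.52}
\]
where $j_{\min}$ is the smaller of $j_1$ and $j_2$.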

It then follows that
\[
N(J) = \begin{cases} 1, & |j_1 - j_2| \leq J \leq j_1 + j_2 \\ 0, & \text{otherwise} \end{cases} \tag{18.53}
\]
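The counting argument behind (18.50) to (18.53) can be reproduced by brute force. The sketch below uses only the Python standard library; it enumerates the pairs (m1, m2), tallies n(M), and applies N(J) = n(J) − n(J+1). The function name is an illustrative choice.

```python
from collections import Counter
from fractions import Fraction

def multiplicities(j1, j2):
    """Return {J: N(J)} for J >= 0, using N(J) = n(J) - n(J + 1) from (18.51)."""
    j1, j2 = Fraction(j1), Fraction(j2)
    n = Counter()
    m1 = -j1
    while m1 <= j1:
        m2 = -j2
        while m2 <= j2:
            n[m1 + m2] += 1        # one product state per pair (m1, m2)
            m2 += 1
        m1 += 1
    # Counter returns 0 for values of M that never occur.
    return {J: n[J] - n[J + 1] for J in sorted(n) if J >= 0}

result = multiplicities(3, 2)      # the case drawn in Figure 18.2
print(result)
```

For j1 = 3 and j2 = 2 every J from 1 to 5 appears exactly once, and the dimensions add up to (2j1 + 1)(2j2 + 1) = 35.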
It has turned out that $N(J)$ is never greater than 1, and so the vectors $|\alpha, J, M\rangle$ can be uniquely labeled by the eigenvalues of the four operators $J^{(1)} \cdot J^{(1)}$, $J^{(2)} \cdot J^{(2)}$, $J \cdot J$ and $J_z$. Henceforth these total angular momentum eigenstates will be denoted as $|j_1, j_2, J, M\rangle$. And we have the unitary transformation
\[
|j_1, j_2, J, M\rangle = \sum_{m_1, m_2} |j_1, j_2, m_1, m_2\rangle\, \langle j_1, j_2, m_1, m_2|j_1, j_2, J, M\rangle. \tag{18.54}
\]

The coefficients of this transformation are called the Clebsch–Gordan coefficients, denoted
as (j1 , j2 , m1 , m2 |J, M ). The phases of the CG coefficients are not yet defined because of the
indeterminacy of the relative phases of the vectors |j1 , j2 , J, M ⟩. For different values of M but
fixed J we adopt the usual phase convention that led to
\[
J_+ |j_1, j_2, J, M\rangle = \sqrt{(J + M + 1)(J - M)}\, |j_1, j_2, J, M + 1\rangle. \tag{18.55}
\]

This leaves one arbitrary phase for each J value, which we fix by requiring that (j1 , j2 , j1 , J −
j1 |J, J) be real and positive. It can be shown that all of the CG coefficients are now real. We
can also prove that CG coefficients vanish unless following conditions are satisfied:
• m1 + m2 = M .
• |j1 − j2 | ≤ J ≤ |j1 + j2 |.
• j1 + j2 + J = an integer.

It is possible to work out the values of the CG coefficients by successive application of the
raising or lowering operator to
\[
|j_1, j_2, J, M\rangle = \sum_{m_1, m_2} |j_1, j_2, m_1, m_2\rangle\, (j_1, j_2, m_1, m_2|J, M). \tag{18.56}
\]

The details of the calculation can be found in section 7.7 of Quantum mechanics – a modern development (Leslie E. Ballentine). Tables of CG coefficients and CG coefficient calculators are available on the internet. A special case of angular momentum addition is spin-orbit coupling of spin 1/2 particles, and we list the corresponding CG coefficients $(l, 1/2, M - m_s, m_s|J, M)$ in Table 18.1.
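\[
\begin{array}{c|cc}
 & J = l + 1/2 & J = l - 1/2 \\ \hline
m_s = 1/2 & \left[\dfrac{l + M + 1/2}{2l + 1}\right]^{1/2} & -\left[\dfrac{l - M + 1/2}{2l + 1}\right]^{1/2} \\
m_s = -1/2 & \left[\dfrac{l - M + 1/2}{2l + 1}\right]^{1/2} & \left[\dfrac{l + M + 1/2}{2l + 1}\right]^{1/2}
\end{array}
\]

Table 18.1: Spin-orbit coupling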
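The entries of Table 18.1 can be checked without any CG machinery. The sketch below uses only the standard library; it writes J² = L² + S² + 2L_zS_z + L_+S_- + L_-S_+ as a 2×2 matrix in the basis {|m = M − 1/2, m_s = +1/2⟩, |m = M + 1/2, m_s = −1/2⟩} (matrix elements from (18.11)) and tests that each tabulated column is an eigenvector with eigenvalue J(J + 1). The helper names are illustrative.

```python
import math

def table_column(l, M, label):
    """Coefficients (ms = +1/2, ms = -1/2) from Table 18.1 for J = l + 1/2 or l - 1/2."""
    a = math.sqrt((l + M + 0.5) / (2 * l + 1))
    b = math.sqrt((l - M + 0.5) / (2 * l + 1))
    return (a, b) if label == "l+1/2" else (-b, a)

def j_squared_block(l, M):
    """2x2 matrix of J^2 in the basis {|M-1/2, +1/2>, |M+1/2, -1/2>}."""
    base = l * (l + 1) + 0.75                       # L^2 + S^2
    off = math.sqrt((l + M + 0.5) * (l - M + 0.5))  # from L- S+ and L+ S-
    return [[base + (M - 0.5), off], [off, base - (M + 0.5)]]

def residual(l, M, label):
    """Largest component of (J^2 - J(J+1)) acting on the tabulated state."""
    J = l + 0.5 if label == "l+1/2" else l - 0.5
    v = table_column(l, M, label)
    H = j_squared_block(l, M)
    return max(abs(H[i][0] * v[0] + H[i][1] * v[1] - J * (J + 1) * v[i]) for i in range(2))

print(residual(2, 0.5, "l+1/2"), residual(2, 0.5, "l-1/2"))
```

The check applies for |M| ≤ l − 1/2, where both values of J occur.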

Now let us consider the relation between CG coefficients and rotation matrices. On the one
hand, we have
\[
\langle j_1, j_2, m_1, m_2|R|j_1, j_2, m'_1, m'_2\rangle = D^{(j_1)}_{m_1 m'_1}(R)\, D^{(j_2)}_{m_2 m'_2}(R). \tag{18.57}
\]

On the other hand, we have


\[
\begin{aligned}
&\langle j_1, j_2, m_1, m_2|R|j_1, j_2, m'_1, m'_2\rangle \\
&= \sum_{J, M, J', M'} (j_1, j_2, m_1, m_2|J, M)\,(j_1, j_2, m'_1, m'_2|J', M')\, \langle j_1, j_2, J, M|R|j_1, j_2, J', M'\rangle \\
&= \sum_{J, M, M'} (j_1, j_2, m_1, m_2|J, M)\,(j_1, j_2, m'_1, m'_2|J, M')\, D^{(J)}_{MM'}(R).
\end{aligned}
\]

Therefore, we can get


\[
D^{(j_1)}_{m_1 m'_1}(R)\, D^{(j_2)}_{m_2 m'_2}(R) = \sum_{J, M, M'} (j_1, j_2, m_1, m_2|J, M)\,(j_1, j_2, m'_1, m'_2|J, M')\, D^{(J)}_{MM'}(R). \tag{18.58}
\]
This is called the Clebsch–Gordan series. Recall that
\[
Y_l^m(\theta, \phi) = \left[\frac{2l + 1}{4\pi}\right]^{1/2} [D^{(l)}_{m0}(\phi, \theta, 0)]^*. \tag{18.59}
\]

We then have
\[
Y_{l_1}^{m_1}(\theta, \phi)\, Y_{l_2}^{m_2}(\theta, \phi) = \sum_{l, m} \sqrt{\frac{(2l_1 + 1)(2l_2 + 1)}{4\pi(2l + 1)}}\, (l_1, l_2, m_1, m_2|l, m)\,(l_1, l_2, 0, 0|l, 0)\, Y_l^m(\theta, \phi). \tag{18.60}
\]
The orthonormality of the spherical harmonics then implies that
\[
\int d\Omega\, Y_l^{m*}(\theta, \phi)\, Y_{l_1}^{m_1}(\theta, \phi)\, Y_{l_2}^{m_2}(\theta, \phi) = \sqrt{\frac{(2l_1 + 1)(2l_2 + 1)}{4\pi(2l + 1)}}\, (l_1, l_2, m_1, m_2|l, m)\,(l_1, l_2, 0, 0|l, 0). \tag{18.61}
\]

18.5 Tensor operators


Suppose the state of the system is |ψ⟩. The state after rotation is R |ψ⟩, denoted as |ψ ′ ⟩. An
operator S is called scalar operator if and only if

⟨ψ ′ |S|ψ ′ ⟩ = ⟨ψ|S|ψ⟩ , (18.62)

which is equivalent to
R−1 SR = S. (18.63)
Taking the case of infinitesimal rotation, we can derive that

[J , S] = 0. (18.64)

A group of operators V is called vector operator if and only if

⟨ψ ′ |V |ψ ′ ⟩ = R ⟨ψ|V |ψ⟩ , (18.65)

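which is equivalent to
\[
R^{-1} V R = RV, \tag{18.66}
\]
where on the right-hand side $R$ stands for the 3×3 rotation matrix acting on the vector index, i.e., $R^{-1} V_i R = \sum_j R_{ij} V_j$.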
Taking the case of infinitesimal rotation, we can derive that

[Ji , Vj ] = iϵijk Vk . (18.67)

If $V$ and $W$ are vector operators, we can prove that $V \cdot W$ is a scalar operator and $V \times W$ is a vector operator.
Similarly, tensor operators are defined by
\[
R^{-1} T_{ij\cdots k} R = \sum_{i' \cdots} R_{ii'} R_{jj'} \cdots R_{kk'}\, T_{i'j'\cdots k'}. \tag{18.68}
\]

Such a tensor is known as a Cartesian tensor. The trouble with a Cartesian tensor is that it is reducible, i.e., it can be decomposed into objects that transform independently under rotations. For example, the trace of a tensor transforms like a scalar under rotations. Thus, we would like to define spherical tensor operators, which are irreducible under rotations. A spherical tensor operator of rank $k$ with $(2k + 1)$ components is defined by
\[
R^{-1} T_q^{(k)} R = \sum_{q' = -k}^{k} [D^{(k)}_{qq'}(R)]^*\, T_{q'}^{(k)}, \tag{18.69}
\]

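or equivalently
\[
R\, T_q^{(k)} R^{-1} = \sum_{q' = -k}^{k} D^{(k)}_{q'q}(R)\, T_{q'}^{(k)}, \tag{18.70}
\]
where $D^{(k)}_{qq'}$ is the rotation matrix. Taking the case of infinitesimal rotation, we can derive that
\[
\left[J_\pm, T_q^{(k)}\right] = \sqrt{(k \mp q)(k \pm q + 1)}\, T_{q \pm 1}^{(k)}, \qquad \left[J_z, T_q^{(k)}\right] = q\, T_q^{(k)}. \tag{18.71}
\]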

For example, spherical components of a vector operator V ,

\[
V_{-1} = \frac{V_x - iV_y}{\sqrt{2}}, \qquad V_0 = V_z, \qquad V_1 = -\frac{V_x + iV_y}{\sqrt{2}}, \tag{18.72}
\]

satisfy the commutation relations above, so they form a spherical tensor of rank 1. Generally, if $V$ is a vector operator, $Y_l^m(V)$ will be a spherical tensor of rank $l$.

Spherical tensors can be formed as products of other spherical tensors. We have the following
theorem:
Theorem 18.1
If $X_{q_1}^{(k_1)}$ and $Z_{q_2}^{(k_2)}$ are irreducible spherical tensors of rank $k_1$ and $k_2$, then
\[
T_q^{(k)} = \sum_{q_1, q_2} (k_1, k_2, q_1, q_2|k, q)\, X_{q_1}^{(k_1)} Z_{q_2}^{(k_2)} \tag{18.73}
\]
will be an irreducible spherical tensor of rank $k$.

The proof can be found in section 3.10 of Modern Quantum Mechanics (J.J.Sakurai).

Example: Suppose $V$ and $U$ are spherical tensors of rank 1. It follows that
\[
T_0^{(0)} = \sqrt{\frac{1}{3}}\,(U_{-1}V_1 + U_1V_{-1} - U_0V_0) = -\sqrt{\frac{1}{3}}\,(U_xV_x + U_yV_y + U_zV_z), \tag{18.74}
\]

which is a spherical tensor of rank 0.
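The identity (18.74) is easy to test with ordinary numbers standing in for the operator components. The sketch below uses only the standard library; treating the components as commuting c-numbers is an assumption made purely for illustration, and the coefficients used are (1, 1, q, −q|0, 0) = (−1)^{1−q}/√3.

```python
import math
import random

def spherical_components(v):
    """Map Cartesian (vx, vy, vz) to (v_{-1}, v_0, v_{+1}) following (18.72)."""
    vx, vy, vz = v
    return ((vx - 1j * vy) / math.sqrt(2), vz, -(vx + 1j * vy) / math.sqrt(2))

def rank_zero_product(u, v):
    """T^(0)_0 of (18.74), built from the rank-1 spherical components of u and v."""
    um, u0, up = spherical_components(u)
    vm, v0, vp = spherical_components(v)
    return (um * vp + up * vm - u0 * v0) / math.sqrt(3)

random.seed(0)
u = [random.uniform(-1, 1) for _ in range(3)]
v = [random.uniform(-1, 1) for _ in range(3)]
dot = sum(a * b for a, b in zip(u, v))
print(rank_zero_product(u, v), -dot / math.sqrt(3))
```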

Another important theorem on tensor operators is the Wigner–Eckart theorem:

Theorem 18.2 Wigner–Eckart theorem
The matrix elements of tensor operators with respect to angular-momentum eigenstates satisfy
\[
\langle \tau', j', m'|T_q^{(k)}|\tau, j, m\rangle = (j, k, m, q|j', m')\, \frac{\langle \tau', j'||T^{(k)}||\tau, j\rangle}{\sqrt{2j + 1}}, \tag{18.75}
\]
where the double-bar matrix element is independent of $m$, $m'$ and $q$.

The proof can also be found in section 3.10 of Modern Quantum Mechanics (J.J.Sakurai).

For a scalar operator $S$, we have
\[
\langle \tau', j', m'|S|\tau, j, m\rangle = \delta_{jj'}\delta_{mm'}\, \frac{\langle \tau', j'||S||\tau, j\rangle}{\sqrt{2j + 1}}. \tag{18.76}
\]

For a spherical tensor of rank 1, we have
\[
\langle \tau', j', m'|V_q|\tau, j, m\rangle = (j, 1, m, q|j', m')\, \frac{\langle \tau', j'||V||\tau, j\rangle}{\sqrt{2j + 1}}. \tag{18.77}
\]

It would vanish unless

m′ − m = q, j ′ − j = 0, 1, −1, j and j ′ are not both 0. (18.78)

For $j = j'$, the Wigner–Eckart theorem, when applied to a vector operator, takes a particularly simple form:
\[
\langle \tau', j, m'|V_q|\tau, j, m\rangle = \frac{\langle \tau', j, m|J \cdot V|\tau, j, m\rangle}{j(j + 1)}\, \langle j, m'|J_q|j, m\rangle. \tag{18.79}
\]

Example: The magnetic moment operator for an atom has the form
\[
\mu = \frac{-e}{2m_e}\,(g_L L + g_S S). \tag{18.80}
\]
The parameters $g_L$ and $g_S$ have approximately the values $g_L = 1$ and $g_S = 2$. The former is a generalization of the magnetic moment we worked out in classical electrodynamics for a system of charged particles. The latter will be discussed in quantum field theory. We define the effective Landé factor by
\[
\langle \tau, J, M'|\mu|\tau, J, M\rangle = g_{\text{eff}}\, \frac{-e}{2m_e}\, \langle J, M'|J|J, M\rangle. \tag{18.81}
\]
Hence we have
\[
g_{\text{eff}} = \frac{\langle \tau, J, M|g_L L \cdot J + g_S S \cdot J|\tau, J, M\rangle}{J(J + 1)} = 1 + \frac{J(J + 1) - L(L + 1) + S(S + 1)}{2J(J + 1)}. \tag{18.82}
\]
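The step from the projection theorem to the closed form in (18.82) uses L·J = [J(J+1) + L(L+1) − S(S+1)]/2 and S·J = [J(J+1) − L(L+1) + S(S+1)]/2, which follow from squaring S = J − L and L = J − S. A minimal sketch with exact rational arithmetic (standard library only; the function name and the default g values are illustrative):

```python
from fractions import Fraction

def lande_g(L, S, J, gL=1, gS=2):
    """Effective g factor from the first expression in (18.82)."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    LJ = (J * (J + 1) + L * (L + 1) - S * (S + 1)) / 2   # eigenvalue of L.J
    SJ = (J * (J + 1) - L * (L + 1) + S * (S + 1)) / 2   # eigenvalue of S.J
    return (gL * LJ + gS * SJ) / (J * (J + 1))

def lande_closed_form(L, S, J):
    """Second expression in (18.82), valid for gL = 1 and gS = 2."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    return 1 + (J * (J + 1) - L * (L + 1) + S * (S + 1)) / (2 * J * (J + 1))

half = Fraction(1, 2)
print(lande_g(1, half, 3 * half), lande_g(1, half, half), lande_g(0, half, half))
```

For a single electron with L = 1 this gives 4/3 for J = 3/2 and 2/3 for J = 1/2, and a pure spin state gives g = 2.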

18.6 Spherical potential well


The stationary states of a particle in a spherical potential well are determined by

\[
-\frac{1}{2m}\nabla^2\Psi + W(r)\Psi = E\Psi. \tag{18.83}
\]
In spherical coordinates, we have
   
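\[
\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{18.84}
\]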

Therefore, the eigenvalue equation becomes


 
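\[
-\frac{1}{2m}\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\Psi}{\partial r}\right) + \frac{L^2}{2mr^2}\Psi + W(r)\Psi = E\Psi. \tag{18.85}
\]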

Suppose eigenfunctions have the factored form


\[
\Psi(r, \theta, \phi) = Y_l^m(\theta, \phi)\, \frac{u(r)}{r}. \tag{18.86}
\]
The radial function then satisfies the equation
\[
-\frac{1}{2m}\frac{d^2u(r)}{dr^2} + \left[\frac{l(l + 1)}{2mr^2} + W(r)\right]u(r) = Eu(r). \tag{18.87}
\]
The radial function also satisfies the boundary condition $u(0) = 0$, since $\Psi(r, \theta, \phi)$ would otherwise have an $r^{-1}$ singularity at the origin. The normalization $\langle\Psi|\Psi\rangle = 1$ implies that
\[
\int_0^\infty |u(r)|^2\, dr = 1. \tag{18.88}
\]

The hydrogen atom


The hydrogen atom is a two-particle system consisting of an electron and a proton. The Hamil-
tonian of the system is
\[
H = \frac{P_e^2}{2m_e} + \frac{P_p^2}{2m_p} - \frac{e^2}{4\pi|Q_e - Q_p|}. \tag{18.89}
\]
We take as independent variables the center of mass and relative coordinates of the particles,
\[
Q_c = \frac{m_e Q_e + m_p Q_p}{m_e + m_p}, \qquad Q_r = Q_e - Q_p. \tag{18.90}
\]
The corresponding momentum operators are
\[
P_c = P_e + P_p, \qquad P_r = \frac{m_p P_e - m_e P_p}{m_e + m_p}. \tag{18.91}
\]
We can verify that

[Qcα , Pcβ ] = [Qrα , Prβ ] = iδαβ , [Qcα , Prβ ] = [Qrα , Pcβ ] = 0. (18.92)

The Hamiltonian becomes


\[
H = \frac{P_c^2}{2(m_e + m_p)} + \frac{P_r^2}{2\mu} - \frac{e^2}{4\pi|Q_r|}, \tag{18.93}
\]
where µ is called the reduced mass, defined by µ ≡ me mp /(me + mp ). The center of mass
behaves as a free particle, and its motion is not coupled to the relative coordinate. We shall
confine our attention to the internal degrees of freedom described by the relative coordinate
Qr . The energy eigenvalue equation in coordinate representation is

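\[
-\frac{1}{2\mu}\nabla^2\Psi(r) - \frac{e^2}{4\pi r}\Psi(r) = E\Psi(r). \tag{18.94}
\]
Writing $\Psi(r, \theta, \phi) = Y_l^m(\theta, \phi)\,u(r)/r$, we have
\[
-\frac{1}{2\mu}\frac{d^2u(r)}{dr^2} + \left[\frac{l(l + 1)}{2\mu r^2} - \frac{e^2}{4\pi r}\right]u(r) = Eu(r). \tag{18.95}
\]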

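Define
\[
\rho \equiv \alpha r, \qquad \alpha \equiv \sqrt{8\mu|E|}, \qquad \lambda \equiv \frac{e^2}{4\pi}\sqrt{\frac{\mu}{2|E|}}. \tag{18.96}
\]
For bound states ($E < 0$), the eigenvalue equation becomes
\[
\frac{d^2u}{d\rho^2} + \left[-\frac{1}{4} + \frac{\lambda}{\rho} - \frac{l(l + 1)}{\rho^2}\right]u = 0. \tag{18.97}
\]
As $\rho \to \infty$, we have $u \sim e^{-\rho/2}$, and as $\rho \to 0$, we have $u \sim \rho^{l+1}$. Therefore, we suppose
\[
u(\rho) = \rho^{l+1} e^{-\rho/2}\, v(\rho). \tag{18.98}
\]
It follows that
\[
\rho\frac{d^2v}{d\rho^2} + (2l + 2 - \rho)\frac{dv}{d\rho} + (\lambda - l - 1)v = 0. \tag{18.99}
\]
This is the well-known confluent hypergeometric differential equation. It has regular solutions when $\lambda - l - 1 = n_r$, with $n_r$ a non-negative integer. These solutions are the associated Laguerre polynomials, denoted $L^{2l+1}_{n-l-1}(\rho)$, where $n = n_r + l + 1$. Since $\lambda = n$, the energy levels are
\[
E_n = -\frac{\mu e^4}{32\pi^2 n^2}. \tag{18.100}
\]
The degeneracy of an eigenvalue $E_n$ is
\[
\sum_{l=0}^{n-1} (2l + 1) = n^2. \tag{18.101}
\]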


Note: The degeneracy of an energy level of a hydrogen atom is greater than this by a factor of 4, which
arises from the two­fold orientational degeneracies of the electron and proton spin states. This four­fold
degeneracy is modified by the hyperfine interaction between the magnetic moments of the electron and the
proton.

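The orthonormal energy eigenfunctions for the hydrogen atom are
\[
\Psi_{nlm}(r, \theta, \phi) = \left[\frac{4(n - l - 1)!}{(na_0)^3\, n\,[(n + l)!]^3}\right]^{1/2} \rho^l\, L^{2l+1}_{n-l-1}(\rho)\, e^{-\rho/2}\, Y_l^m(\theta, \phi), \tag{18.102}
\]
where $a_0 \equiv 4\pi/\mu e^2$ is a characteristic length for the atom, known as the Bohr radius, and $\rho = 2r/(na_0)$. The wave function of the ground state ($n = 1$, $l = m = 0$) is
\[
\Psi_{100} = (\pi a_0^3)^{-1/2}\, e^{-r/a_0}. \tag{18.103}
\]
A measure of the spatial extent of the bound states of hydrogen is given by the averages of various powers of the distance $r$:
\[
\langle r\rangle = n^2 a_0\left\{1 + \frac{1}{2}\left[1 - \frac{l(l + 1)}{n^2}\right]\right\}, \tag{18.104a}
\]
\[
\langle r^2\rangle = n^4 a_0^2\left\{1 + \frac{3}{2}\left[1 - \frac{l(l + 1) - 1/3}{n^2}\right]\right\}, \tag{18.104b}
\]
\[
\left\langle\frac{1}{r}\right\rangle = \frac{1}{n^2 a_0}. \tag{18.104c}
\]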
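As a quick consistency check, the averages (18.104) can be compared with direct integration over the ground-state density $|\Psi|^2 = e^{-2r/a_0}/(\pi a_0^3)$. The sketch below uses only the standard library and Simpson's rule; the cutoff radius and step count are illustrative numerical choices, and lengths are measured in units where a_0 = 1 by default.

```python
import math

def radial_average(power, a0=1.0, r_max=40.0, steps=40000):
    """<r^power> in the ground state, integrating 4 r^(power+2) exp(-2r/a0) / a0^3."""
    h = r_max * a0 / steps           # steps must be even for Simpson's rule

    def integrand(r):
        return 4.0 / a0**3 * r ** (power + 2) * math.exp(-2.0 * r / a0)

    total = integrand(0.0) + integrand(r_max * a0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0

def closed_form(power, n, l, a0=1.0):
    """Closed forms (18.104a)-(18.104c)."""
    if power == 1:
        return n**2 * a0 * (1 + 0.5 * (1 - l * (l + 1) / n**2))
    if power == 2:
        return n**4 * a0**2 * (1 + 1.5 * (1 - (l * (l + 1) - 1 / 3) / n**2))
    if power == -1:
        return 1 / (n**2 * a0)
    raise ValueError("only powers 1, 2 and -1 are tabulated")

for p in (1, 2, -1):
    print(p, radial_average(p), closed_form(p, n=1, l=0))
```

Only the n = 1, l = 0 state is integrated here; excited states would need the Laguerre polynomials of (18.102).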
Chapter 19
Discrete Symmetries

19.1 Space inversion


The space inversion transformation brings x to −x. The corresponding operator on state space
is usually called parity operator, denoted as P . By definition, the parity operator reverses the
signs of position operator and momentum operators,

P −1 XP = −X, P −1 P P = −P . (19.1)

It follows that the orbital angular momentum, L = X × P , is unchanged by the parity


transformation. This property is extended, by definition, to any angular momentum operators,
i.e.,
P −1 J P = J . (19.2)

We can verify that P must be linear by applying space inversion to the commutation relation
[Xi , Pi ] = i. Hence the parity operator is a unitary operator rather than an anti-unitary oper-
ator. Since two consecutive space inversions produce no change at all, it follows that the states
described by |ψ⟩ and by P 2 |ψ⟩ must be the same. The operator P 2 can differ from the identity
operator by at most a phase factor. This phase factor is left arbitrary. It is most convenient to
choose that phase factor to be unity, and hence we have

P = P −1 = P † . (19.3)

Furthermore, we can derive that


P |x⟩ = |−x⟩ . (19.4)

The effect of P on a wave function is therefore

P ψ(x) ≡ ⟨x|P |ψ⟩ = ⟨−x|ψ⟩ = ψ(−x). (19.5)

From the fact that P 2 = 1, it follows that P has eigenvalues ±1. Any even function, ψe (x) =
ψe (−x), is an eigenfunction of P with eigenvalue 1, and any odd function, ψo (x) = −ψo (−x),
is an eigenfunction of P with eigenvalue −1. A function corresponding to parity +1 is also
said to be of even parity, and a function corresponding to parity −1 is said to be of odd parity.

If the parity of operator K is p, i.e., P KP = pK, and the parities of the state |ψ1 ⟩ and |ψ2 ⟩
are p1 and p2 respectively, ⟨ψ1 |K|ψ2 ⟩ would vanish unless p = p1 p2 .

Example: Under space inversion x → −x, the spherical harmonic undergoes the transfor-
mation
Ylm (θ, ϕ) → Ylm (π − θ, ϕ + π) = (−1)l Ylm (θ, ϕ). (19.6)
Hence the single particle orbital angular momentum eigenstate |l, m⟩ is also an eigenstate of
parity, with parity (−1)l . A total orbital angular momentum eigenstate for a two electron atom
is of the form
\[
|l_1, l_2, L, M\rangle = \sum_{m_1, m_2} \langle l_1, l_2, m_1, m_2|l_1, l_2, L, M\rangle\, |l_1, m_1\rangle \otimes |l_2, m_2\rangle. \tag{19.7}
\]
It is apparent that
\[
P\, |l_1, l_2, L, M\rangle = (-1)^{l_1 + l_2}\, |l_1, l_2, L, M\rangle. \tag{19.8}
\]
We see that, in general, the parity of an angular momentum state is not determined by its total
angular momentum.

If the parity operator P commutes with the Hamiltonian H, parity eigenvalue ±1 will be a
conserved quantity. In that case an even parity state can never acquire an odd parity com-
ponent, and an odd parity state can never acquire an even parity component. If |ψ(t)⟩ is a
possible evolution of a system with Hamiltonian H, satisfying Schrödinger equation

\[
H|\psi\rangle = i\frac{\partial|\psi\rangle}{\partial t}, \tag{19.9}
\]
we can derive that
\[
HP|\psi\rangle = i\frac{\partial P|\psi\rangle}{\partial t}, \tag{19.10}
\]
i.e., the space inversion of |ψ(t)⟩, P |ψ(t)⟩, can also be a possible physical process of the system.
Experiments have shown that parity in β decay is not conserved.

19.2 Time reversal


The effect of the time reversal operator T is to reverse the linear and angular momentum while
leaving the position unchanged. We require, by definition,

T −1 XT = X, T −1 P T = −P , T −1 J T = −J . (19.11)

We can verify that T must be antilinear by applying time reversal to the commutation relation [Xi , Pi ] = i. Thus, the time reversal operator is an anti-unitary operator.
Suppose that the evolution of a system satisfies Schrödinger equation

\[
H|\psi(t)\rangle = i\frac{\partial|\psi(t)\rangle}{\partial t}. \tag{19.12}
\]
If $TH = HT$, we can derive that
\[
HT|\psi(-t)\rangle = i\frac{\partial T|\psi(-t)\rangle}{\partial t}, \tag{19.13}
\]

indicating that T |ψ(−t)⟩ is also a possible evolution of the system.

In coordinate representation, the Schrödinger equation takes the form


 
\[
\left[-\frac{1}{2m}\nabla^2 + W(x)\right]\psi(x, t) = i\frac{\partial\psi(x, t)}{\partial t}. \tag{19.14}
\]

Its complex conjugate is


 
\[
\left[-\frac{1}{2m}\nabla^2 + W^*(x)\right]\psi^*(x, t) = -i\frac{\partial\psi^*(x, t)}{\partial t}. \tag{19.15}
\]

The condition for the Hamiltonian to be invariant under complex conjugation is that the po-
tential be real: W = W ∗ . In that case, if ψ(x, −t) is a solution, so will be ψ ∗ (x, t). This
suggests that we may identify the time reversal operator with the complex conjugation oper-
ator in this representation,
T = K0 , (19.16)

where, by definition, $K_0\,\psi(x,t) = \psi^{*}(x,t)$. In this case T is its own inverse.
The formal expression for an arbitrary state in coordinate representation is $|\psi\rangle = \int \psi(x)\,|x\rangle\, d^3x$, where the basis vector $|x\rangle$ is an eigenstate of the position operator. Since T is equal to the complex conjugation operator, its effect is simply $T|\psi\rangle = \int \psi^{*}(x)\,|x\rangle\, d^3x$, with $T|x\rangle = |x\rangle$.

In momentum representation, an arbitrary state can be decomposed as


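$$|\psi\rangle = \int \psi(p)\,|p\rangle\, d^3p. \tag{19.17}$$

Since
$$T|p\rangle = \int \langle x|p\rangle^{*}\,|x\rangle\, d^3x = |{-p}\rangle,$$
we have
$$T|\psi\rangle = \int \psi^{*}(p)\,|{-p}\rangle\, d^3p = \int \psi^{*}(-p)\,|p\rangle\, d^3p. \tag{19.18}$$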

The time reversal operator must reverse the angular momentum. For spin operator, we have

T −1 ST = −S. (19.19)

In the standard representation of the spin operators, Sx and Sz are real, while Sy is imaginary.
The time reversal operator T cannot be equal to the complex conjugation operator K0 in this
representation, since the effect of the latter is

$$K_0 S_x K_0 = S_x, \qquad K_0 S_y K_0 = -S_y, \qquad K_0 S_z K_0 = S_z. \tag{19.20}$$

Let us write the time reversal operator as T = Y K0 , where Y is a linear operator. Y must have
the following properties:

$$Y^{-1} S_x Y = -S_x, \qquad Y^{-1} S_y Y = S_y, \qquad Y^{-1} S_z Y = -S_z. \tag{19.21}$$



Moreover, Y must operate only on the spin degrees of freedom. A reasonable choice is $Y = e^{-i\pi S_y}$, whose effect is to rotate spin (and only spin) through the angle π about the y axis. Therefore the explicit form of the time reversal operator in this representation is

$$T = e^{-i\pi S_y} K_0. \tag{19.22}$$

Two successive applications of the time reversal transformation must leave the physical situation unchanged. It follows that
$$T^2 |\psi\rangle = c\,|\psi\rangle, \tag{19.23}$$
where |c| = 1. Noticing that

T 2 (T |ψ⟩) = T (T 2 |ψ⟩) = T (c |ψ⟩) = c∗ T |ψ⟩ , (19.24)

we have
T 2 (|ψ⟩ + T |ψ⟩) = c |ψ⟩ + c∗ T |ψ⟩ = c′ (|ψ⟩ + T |ψ⟩). (19.25)
It follows that c′ = c∗ = c, and so c = ±1, leading to

T 2 |ψ⟩ = ± |ψ⟩ . (19.26)

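In the particular representation (19.22), we have

$$T^2 = e^{-i\pi S_y} K_0\, e^{-i\pi S_y} K_0 = e^{-i2\pi S_y}, \tag{19.27}$$

because the matrix $e^{-i\pi S_y}$ is real in the standard representation and $K_0^2 = I$; or equivalently,
$$T^2 = e^{-i2\pi J_y} = R(2\pi), \tag{19.28}$$
since $e^{-i2\pi L_y} = I$.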

Kramers' theorem
Let us consider the energy eigenvalue equation, H |ψ⟩ = E |ψ⟩, for a time-reversal-invariant Hamiltonian. Since HT |ψ⟩ = T H |ψ⟩ = ET |ψ⟩, both |ψ⟩ and T |ψ⟩ are eigenstates with energy eigenvalue E. There are two possibilities: (a) |ψ⟩ and T |ψ⟩ are linearly dependent, and so describe the same state; (b) |ψ⟩ and T |ψ⟩ are linearly independent, and so describe two degenerate states.
Suppose that (a) is true, in which case we must have T |ψ⟩ = a |ψ⟩ with |a| = 1. Because T is antilinear, a second application of T yields $T^2|\psi\rangle = T(a|\psi\rangle) = a^{*}\,T|\psi\rangle = a^{*}a\,|\psi\rangle = |\psi\rangle$. Thus for those states that satisfy $T^2|\psi\rangle = -|\psi\rangle$ it is necessarily true that |ψ⟩ and T |ψ⟩ are linearly independent, degenerate states. This result is known as Kramers' theorem: any system for which $T^2|\psi\rangle = -|\psi\rangle$ has only degenerate energy levels.
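
The spin-1/2 case can be checked numerically. The following is a minimal sketch, assuming ℏ = 1 and the standard representation, in which $e^{-i\pi S_y} = -i\sigma_y$ is the real matrix written out below; the helper names (`matvec`, `time_reverse`, `inner`) and the test spinor are illustrative choices, not part of the notes.

```python
# Time reversal for spin 1/2 (hbar = 1): T = Y K0 with Y = exp(-i*pi*S_y) = -i*sigma_y.
# Spinors are 2-component lists of complex numbers; operators are 2x2 nested lists.

Y = [[0, -1], [1, 0]]            # exp(-i*pi*S_y) for spin 1/2 (a real matrix)

SX = [[0, 0.5], [0.5, 0]]        # spin operators in the standard representation
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def time_reverse(psi):
    """Apply T = Y K0: complex-conjugate the components, then act with Y."""
    return matvec(Y, [c.conjugate() for c in psi])


def inner(a, b):
    """Inner product <a|b>."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]


psi = [0.6 + 0.0j, 0.0 + 0.8j]   # an arbitrary normalized spinor (illustrative)
t_psi = time_reverse(psi)        # T|psi>
tt_psi = time_reverse(t_psi)     # T^2|psi>, expected to equal -|psi>

print("T^2 psi =", tt_psi)
print("<psi|T psi> =", inner(psi, t_psi))
```

For this spinor T²|ψ⟩ = −|ψ⟩ and ⟨ψ|Tψ⟩ = 0, so |ψ⟩ and T|ψ⟩ are linearly independent, which is the situation covered by Kramers' theorem.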
Chapter 20
Approximation Method

20.1 Time-independent perturbation theory


20.1.1 Brillouin-Wigner perturbation theory
We consider an unperturbed Hamiltonian H0 with eigenvalues ϵk and eigenstates |kα⟩, where
α is an index introduced to resolve degeneracy, so that

H0 |kα⟩ = ϵk |kα⟩ . (20.1)

We pick one of these levels ϵn for study, so the index n will be fixed for the following discussion.
We denote the eigenspace of the unperturbed system corresponding to eigenvalue $\epsilon_n$ by $\mathcal{H}_n$, so that the unperturbed eigenstates {|nα⟩ , α = 1, 2, · · · } form a basis in this space.
We take the perturbed Hamiltonian to be H = H0 + λH1 , where λ is a formal expansion
parameter that we allow to vary between 0 and 1 to interpolate between the unperturbed and
perturbed system. When the perturbation is turned on, the unperturbed energy level ϵn may
split and shift. We denote one of the exact energy levels that grows out of ϵn by E. We let |ψ⟩
be an exact energy eigenstate corresponding to energy E, so that

H |ψ⟩ = (H0 + λH1 ) |ψ⟩ = E |ψ⟩ . (20.2)

Both E and |ψ⟩ are understood to be functions of λ; as λ → 0, E approaches $\epsilon_n$ and |ψ⟩ approaches some state lying in $\mathcal{H}_n$. We break the Hilbert space into the subspace $\mathcal{H}_n$ and its orthogonal complement, which we denote by $\mathcal{H}_n^{\perp}$. The components of |ψ⟩ parallel and perpendicular to $\mathcal{H}_n$ are conveniently expressed in terms of the projector P onto the subspace $\mathcal{H}_n$ and the orthogonal projector Q, defined by
$$P \equiv \sum_{\alpha} |n\alpha\rangle\langle n\alpha|, \qquad Q \equiv \sum_{k\neq n,\,\alpha} |k\alpha\rangle\langle k\alpha|. \tag{20.3}$$

These projectors satisfy

$$P^2 = P, \quad Q^2 = Q, \quad PQ = QP = 0, \quad P + Q = I, \quad [P, H_0] = [Q, H_0] = 0. \tag{20.4}$$

The component P |ψ⟩ is a linear combination of the known unperturbed eigenstates {|nα⟩ , α =
1, 2, · · · }, and is easily characterized. The orthogonal component Q |ψ⟩ is harder to find. It
turns out it is possible to write a neat power series expansion for this solution. Firstly, we have

(E − H0 ) |ψ⟩ = λH1 |ψ⟩ . (20.5)



Now we define a new operator R,

$$R \equiv \sum_{k\neq n,\,\alpha} \frac{|k\alpha\rangle\langle k\alpha|}{E - \epsilon_k}. \tag{20.6}$$


Note: If there are other unperturbed energy levels ϵk lying close to ϵn , then the perturbation could push
the exact energy E near to or past some of these other levels, and then other small denominators would
make R ill defined. This will certainly happen if the perturbation is large enough. For the time being we
will assume this does not happen, so that R is free of small denominators. When this is not the case we
shall refer to “nearly degenerate perturbation theory”, which is discussed later.

The operator R satisfies

$$PR = RP = 0, \qquad QR = RQ = R, \qquad R(E - H_0) = (E - H_0)R = Q. \tag{20.7}$$

Then we have

R(E − H0 ) |ψ⟩ = Q |ψ⟩ = λRH1 |ψ⟩ and |ψ⟩ = P |ψ⟩ + λRH1 |ψ⟩ . (20.8)

Thus |ψ⟩ can be expanded as a series in terms of P |ψ⟩:

$$|\psi\rangle = \frac{1}{1 - \lambda R H_1}\,P|\psi\rangle = P|\psi\rangle + \lambda R H_1 P|\psi\rangle + \lambda^2 R H_1 R H_1 P|\psi\rangle + \cdots. \tag{20.9}$$

20.1.2 Nondegenerate perturbation theory


In nondegenerate perturbation theory the level ϵn of H0 is nondegenerate. Then the index α is
not needed for the level ϵn , and we can write simply |n⟩ for the corresponding eigenstate. We
retain α for the other levels k ̸= n, since these may still be degenerate. We assume that P |ψ⟩
is normalized rather than |ψ⟩ so that

P |ψ⟩ = |n⟩ . (20.10)

With this normalization convention, we have

⟨n|ψ⟩ = 1. (20.11)

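Now the series becomes

$$|\psi\rangle = |n\rangle + \lambda \sum_{k\neq n,\,\alpha} |k\alpha\rangle \frac{\langle k\alpha|H_1|n\rangle}{E - \epsilon_k} + \lambda^2 \sum_{k\neq n,\,\alpha}\ \sum_{k'\neq n,\,\alpha'} |k\alpha\rangle \frac{\langle k\alpha|H_1|k'\alpha'\rangle \langle k'\alpha'|H_1|n\rangle}{(E - \epsilon_k)(E - \epsilon_{k'})} + \cdots. \tag{20.12}$$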
To find an equation for E, we notice that

⟨n|E − H0 |ψ⟩ = E − ϵn = λ ⟨n|H1 |ψ⟩ . (20.13)



It follows that

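$$\begin{aligned}
E &= \epsilon_n + \lambda \langle n|H_1|n\rangle + \lambda^2 \langle n|H_1 R H_1|n\rangle + \lambda^3 \langle n|H_1 R H_1 R H_1|n\rangle + \cdots \\
&= \epsilon_n + \lambda \langle n|H_1|n\rangle + \lambda^2 \sum_{k\neq n,\,\alpha} \frac{\langle n|H_1|k\alpha\rangle \langle k\alpha|H_1|n\rangle}{E - \epsilon_k} \\
&\quad + \lambda^3 \sum_{k\neq n,\,\alpha}\ \sum_{k'\neq n,\,\alpha'} \frac{\langle n|H_1|k\alpha\rangle \langle k\alpha|H_1|k'\alpha'\rangle \langle k'\alpha'|H_1|n\rangle}{(E - \epsilon_k)(E - \epsilon_{k'})} + \cdots
\end{aligned} \tag{20.14}$$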

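Since $E - \epsilon_n = O(\lambda)$, we may replace E by $\epsilon_n$ in the denominators at lowest order. This gives E up to $O(\lambda^3)$,

$$E = \epsilon_n + \lambda \langle n|H_1|n\rangle + \lambda^2 \sum_{k\neq n,\,\alpha} \frac{\langle n|H_1|k\alpha\rangle \langle k\alpha|H_1|n\rangle}{\epsilon_n - \epsilon_k} + O(\lambda^3), \tag{20.15}$$

and |ψ⟩ up to $O(\lambda^2)$,

$$|\psi\rangle = |n\rangle + \lambda \sum_{k\neq n,\,\alpha} |k\alpha\rangle \frac{\langle k\alpha|H_1|n\rangle}{\epsilon_n - \epsilon_k} + O(\lambda^2). \tag{20.16}$$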

Higher-order corrections can be found on Wikipedia.
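
As an illustration of (20.14) and (20.15), the sketch below compares three energies for a two-level toy model: the exact eigenvalue, the second-order result (20.15), and the Brillouin-Wigner equation (20.14) solved by fixed-point iteration. The model is an assumption made for the example: a single other level $\epsilon_k$, real matrix elements, and ⟨k|H1|k⟩ = 0, in which case (20.14) terminates at second order. All function names and numbers are illustrative.

```python
import math


def exact_level(eps_n, eps_k, v_nn, v_nk, lam):
    """Exact eigenvalue of H0 + lam*H1 that tends to eps_n as lam -> 0 (v_kk = 0)."""
    a = eps_n + lam * v_nn                      # perturbed diagonal entry of level n
    mean = 0.5 * (a + eps_k)
    root = math.hypot(0.5 * (a - eps_k), lam * v_nk)
    return mean - root if eps_n < eps_k else mean + root


def second_order(eps_n, eps_k, v_nn, v_nk, lam):
    """Eq. (20.15): E replaced by eps_n in the denominator."""
    return eps_n + lam * v_nn + lam**2 * v_nk**2 / (eps_n - eps_k)


def brillouin_wigner(eps_n, eps_k, v_nn, v_nk, lam, iterations=50):
    """Eq. (20.14) with one other level, solved for E by fixed-point iteration."""
    energy = eps_n
    for _ in range(iterations):
        energy = eps_n + lam * v_nn + lam**2 * v_nk**2 / (energy - eps_k)
    return energy


params = dict(eps_n=0.0, eps_k=1.0, v_nn=0.4, v_nk=0.6, lam=0.5)  # illustrative values
e_exact = exact_level(**params)
e_rs2 = second_order(**params)
e_bw = brillouin_wigner(**params)
print(e_exact, e_rs2, e_bw)
```

With these numbers the exact level is 0.1 and the second-order estimate is 0.11, while the iterated Brillouin-Wigner equation converges to the exact value, because in this toy model the implicit equation for E is exact.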

20.1.3 Degenerate perturbation theory


In the case that the unperturbed energy level ϵn is degenerate, we have
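$$P|\psi\rangle = \sum_{\alpha} |n\alpha\rangle\, c_\alpha \quad\text{and}\quad \langle n\alpha|P|\psi\rangle = \langle n\alpha|\psi\rangle = c_\alpha. \tag{20.17}$$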

Then we can obtain an equation for the cα ,

⟨nα|E − H0 |ψ⟩ = cα (E − ϵn ) = λ ⟨nα|H1 |ψ⟩ , (20.18)

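or in expanded form,
$$\begin{aligned}
(E - \epsilon_n)\,c_\alpha &= \lambda \sum_{\beta} \langle n\alpha|H_1|n\beta\rangle\, c_\beta + \lambda^2 \sum_{\beta} \langle n\alpha|H_1 R H_1|n\beta\rangle\, c_\beta + \cdots \\
&= \lambda \sum_{\beta} \langle n\alpha|H_1|n\beta\rangle\, c_\beta + \lambda^2 \sum_{\beta}\ \sum_{k\neq n,\,\gamma} \frac{\langle n\alpha|H_1|k\gamma\rangle \langle k\gamma|H_1|n\beta\rangle}{E - \epsilon_k}\, c_\beta + \cdots
\end{aligned} \tag{20.19}$$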

This equation must be solved simultaneously for the eigenvalues E and the unknown expan-
sion coefficients cα . If we truncate the series at first order, we see that the corrections E − ϵn to
the energies are determined as the eigenvalues of the matrix ⟨nα|H1 |nβ⟩, and the coefficients
cα are the corresponding eigenvectors. This determines the energies to first order, but the co-
efficients cα only to zeroth order. Then P |ψ⟩ becomes known to zeroth order and Q |ψ⟩ to
first order. The first order matrix may or may not have degeneracies itself. If it does not, then
all degeneracies are lifted at first order; if it does, the remaining degeneracies may be lifted at
a higher order, or may persist to all orders. Degeneracies that persist to all orders are almost
always due to some symmetry of the system, which can usually be recognized at the outset.
The higher order corrections can be worked out step by step, which will not be listed here.
Now let us consider the case in which the unperturbed levels of H0 , while not technically
degenerate, are close to one another. Suppose to be specific that two levels, say, ϵn and ϵm , are
close enough to one another that first order perturbations will push the exact level E close to
or onto the unperturbed level ϵm . In this case we choose some energy, call it ϵ̄, which is close
to ϵn and ϵm . Then let us take the original unperturbed Hamiltonian and perturbation and
rearrange them in the form,
H = H0 + H1 = H0′ + H1′ , (20.20a)
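where
$$H_0 = \sum_{k,\,\alpha} \epsilon_k\, |k\alpha\rangle\langle k\alpha|, \tag{20.20b}$$
$$H_0' = \sum_{k\neq m,n;\,\alpha} \epsilon_k\, |k\alpha\rangle\langle k\alpha| + \bar\epsilon \sum_{k=m,n;\,\alpha} |k\alpha\rangle\langle k\alpha|, \tag{20.20c}$$
$$H_1' = H_1 + \sum_{k=m,n;\,\alpha} (\epsilon_k - \bar\epsilon)\, |k\alpha\rangle\langle k\alpha|. \tag{20.20d}$$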

Then standard degenerate perturbation theory may be applied. We will call this procedure
“nearly degenerate perturbation theory.”
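
A small numerical sketch of this rearrangement, assuming only two nondegenerate levels $\epsilon_n$ and $\epsilon_m$ coupled by a real off-diagonal element v (with vanishing diagonal elements of H1): the naive nondegenerate formula (20.15) develops a small denominator, whereas diagonalizing H1′ inside the two-dimensional subspace stays finite. The function names and numbers are illustrative assumptions.

```python
import math


def naive_second_order(eps_n, eps_m, v):
    """Nondegenerate second-order energy of level n; singular as eps_m -> eps_n."""
    return eps_n + v**2 / (eps_n - eps_m)


def nearly_degenerate_levels(eps_n, eps_m, v, eps_bar=None):
    """First-order degenerate perturbation theory applied to H0' and H1'.

    Inside the {n, m} subspace H0' = eps_bar * I, and H1' is the 2x2 matrix
    [[eps_n - eps_bar, v], [v, eps_m - eps_bar]], which is diagonalized exactly.
    """
    if eps_bar is None:
        eps_bar = 0.5 * (eps_n + eps_m)
    a = eps_n - eps_bar
    d = eps_m - eps_bar
    mean = 0.5 * (a + d)
    root = math.hypot(0.5 * (a - d), v)
    return (eps_bar + mean - root, eps_bar + mean + root)


eps_n, eps_m, v = 0.0, 0.01, 0.1          # gap much smaller than the coupling
print(naive_second_order(eps_n, eps_m, v))        # about -1: not trustworthy
print(nearly_degenerate_levels(eps_n, eps_m, v))  # about (-0.0951, 0.1051)
```

The result does not depend on the particular choice of ϵ̄, as long as it is taken inside the subspace spanned by the two nearby levels.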

20.2 Application of time-independent perturbation theory in hydrogen atom
20.2.1 Stark effect
The Stark effect concerns the behavior of atoms in external electric fields. We choose the hydrogen atom because it is a single-electron atom. The hydrogen atom will be modeled with the central force Hamiltonian
$$H_0 = \frac{p^2}{2m} - \frac{e^2}{4\pi r}. \tag{20.21}$$
In this Hamiltonian we ignore spin and other small effects such as relativistic corrections,
hyperfine effects and Lamb shift. These effects cause splitting and shifting of the energy levels
of our simplified model, as well as the introduction of new quantum numbers and new degrees
of freedom. But these effects are all small, and if the applied electric field is strong enough, it
will overwhelm them and the physical consequences will be much as we shall describe them
with our simplified model. The unperturbed energy levels in hydrogen are
$$E_n = -\frac{1}{2n^2}\,\frac{e^2}{4\pi a_0}, \tag{20.22}$$
where $a_0$ is the Bohr radius. These levels are $n^2$-fold degenerate. As for the perturbation, let us write F for the external electric field, and let us take it to lie in the ẑ-direction. Thus, the perturbing potential has the form
$$V_1 = -(-e)\,F\cdot x = eFz. \tag{20.23}$$

For small z, the attractive Coulomb field dominates the total potential and we have the usual
Coulomb well that supports atomic bound states. However, for large negative z, the unper-
turbed potential goes to zero, while the perturbing potential becomes large and negative. At
intermediate values of negative z, the competition between the two potentials gives a maxi-
mum in the total potential. The electric force on the electron is zero at the maximum of the
potential. Given the relative weakness of the applied field, the maximum must occur at a dis-
tance from the nucleus that is large in comparison to the Bohr radius a0 . Atomic states with
small principal quantum numbers n lie well inside this radius. The perturbation analysis we
shall perform applies to these states.

The bound states of the unperturbed system are able to tunnel through the potential barrier.
When an external electric field is turned on, the bound states of the atom cease to be bound in
the strict sense, and become resonances. Electrons that tunnel through the barrier and emerge
into the classically allowed region at large negative z will accelerate in the external field, leaving
behind an ion. This is the phenomenon of field ionization. This effect can be neglected if the
external field is weak enough and the lifetime of the “bound state” is long enough.

In the case of hydrogen, the ground state is |100⟩. The first order shift in the ground state
energy level is given by
$$\Delta E^{(1)}_{\mathrm{gnd}} = \langle 100|eFz|100\rangle = 0, \tag{20.24}$$
which vanishes because the parity of z is odd, but ⟨100| and |100⟩ have the same parity. For the
excited states of hydrogen, according to first order degenerate perturbation theory, the shifts
in the energy levels En are given by the eigenvalues of the n2 × n2 matrix,

⟨nlm|eF z|nl′ m′ ⟩ (20.25)

According to the Wigner-Eckart theorem and parity, the matrix elements vanish unless l−l′ =
±1 and m = m′ . Consider, for example, the case n = 2. The four degenerate states are
|2, 0, 0⟩, |2, 1, −1⟩, |2, 1, 0⟩ and |2, 1, 1⟩. Only the states |2, 0, 0⟩ and |2, 1, 0⟩ are connected by
the perturbation. Therefore of the 16 matrix elements, the only nonvanishing ones are

⟨2, 0, 0|eF z|2, 1, 0⟩ ≡ −W = −3eF a0 (20.26)

and its complex conjugate. The matrix connecting the two states |2, 0, 0⟩ and |2, 1, 0⟩ is
$$\begin{pmatrix} 0 & -W \\ -W & 0 \end{pmatrix}, \tag{20.27}$$

and its eigenvalues are the first order energy shifts in the n = 2 level,
$$\Delta E^{(1)}_{2} = \pm W. \tag{20.28}$$

In addition, the two states |2, 1, −1⟩ and |2, 1, 1⟩ do not shift their energies at first order. The
perturbed eigenstates are

$$|{+W}\rangle = \frac{|2,0,0\rangle - |2,1,0\rangle}{\sqrt{2}}, \qquad |{-W}\rangle = \frac{|2,0,0\rangle + |2,1,0\rangle}{\sqrt{2}}. \tag{20.29}$$

These are the zeroth-order parts of the exact eigenstates.
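
The value −3a0 of the matrix element in (20.26) can be checked by direct integration. The sketch below assumes the standard normalized hydrogen radial functions $R_{20}$ and $R_{21}$ in units with $a_0 = 1$, and the angular factor ⟨Y00|cos θ|Y10⟩ = 1/√3; the Simpson integrator, the cutoff radius, and the field strength used for the numerical estimate are illustrative choices.

```python
import math

A0_M = 5.29177e-11        # Bohr radius in metres


def r20(r):
    """Hydrogen radial function R_20(r) with a0 = 1."""
    return (1.0 / math.sqrt(2.0)) * (1.0 - r / 2.0) * math.exp(-r / 2.0)


def r21(r):
    """Hydrogen radial function R_21(r) with a0 = 1."""
    return (1.0 / (2.0 * math.sqrt(6.0))) * r * math.exp(-r / 2.0)


def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# <2,0,0| z |2,1,0> = (radial integral of r^3 R20 R21) * <Y00|cos(theta)|Y10>
radial = simpson(lambda r: r**3 * r20(r) * r21(r), 0.0, 60.0, 6000)
angular = 1.0 / math.sqrt(3.0)
z_element = radial * angular            # in units of a0, expected to be -3

field = 1.0e7                            # hypothetical field strength in V/m
w_ev = 3.0 * field * A0_M                # W = 3 e F a0, expressed in eV
print(z_element, (-w_ev, +w_ev))
```

For the assumed field of 10^7 V/m the first-order splitting ±W is of order 10^-3 eV, small compared with the spacing between the n = 2 and n = 3 levels, which is consistent with treating the field as a perturbation.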

Now let us look at the exact symmetries of the full, perturbed Hamiltonian H = H0 + H1 ,
without doing perturbation theory at all. Since [H, Lz ] = 0 the exact eigenstates of H can be
chosen to be eigenstates of Lz as well. Denote these by |γm⟩, where γ is an additional index
needed to specify an energy eigenstate. Thus, we have

Lz |γm⟩ = m |γm⟩ , H |γm⟩ = Eγm |γm⟩ , (20.30)

where Eγm is allowed to depend on m since the full rotational symmetry is broken. As for
time reversal, the state T |γm⟩ must be an eigenstate of energy with eigenvalue Eγm since
T H = HT . But because T −1 Lz T = −Lz , it also follows that T |γm⟩ is an eigenstate of Lz
with eigenvalue −m. If m ̸= 0, we must have a degeneracy of at least two. The only energy
levels that can be non-degenerate are those with m = 0. In the example above, even higher
order corrections cannot separate |2, 1, −1⟩ and |2, 1, 1⟩.

20.2.2 Fine structure


Fine structure of atoms concerns the effects of relativity and spin on the dynamics of the elec-
tron. Both these effects are of the same order of magnitude, and must be treated together in
any realistic treatment of the atomic structure. The fine structure terms account for relativistic
effects through order v 2 , and have the effect of enlarging the Hilbert space by the inclusion of
the spin degrees of freedom, introducing new quantum numbers, and shifting and splitting the
energy levels of the electrostatic model. The splitting in particular means that spectral lines
that appear as singlets under low resolution become closely spaced multiplets under higher
resolution.

The derivation of the exact form of the relativistic corrections to the Hamiltonian in quantum mechanics cannot be made very rigorous and requires some reasonable guesses. The details of the derivation can be found in the lecture notes on quantum mechanics by Robert G. Littlejohn. Here we just list the results. The fine structure correction to the Hamiltonian consists of three parts:

HFS = HRKE + HD + HSO . (20.31)

The term
$$H_{\mathrm{RKE}} = -\frac{p^4}{8m^3} \tag{20.32}$$
is the second-order term in the expansion $E = \sqrt{p^2 + m^2} = m + \frac{p^2}{2m} - \frac{p^4}{8m^3} + \cdots$. (The first-order term is just the kinetic energy in nonrelativistic quantum mechanics.) The term

$$H_{\mathrm{D}} = \frac{1}{8m^2}\,\nabla^2 V \tag{20.33}$$

comes out as a result of the virtual process $e^- \to e^- + e^- + e^+$ in the region whose scale is smaller than the Compton wavelength $\lambda_C = 1/m = \alpha a_0$ of the electron. Such virtual states appear in perturbation theory when one sums over intermediate states, which derive ultimately from a resolution of the identity. The effect is to smear out the position of the atomic electron over a distance of order $\lambda_C$. The term

$$H_{\mathrm{SO}} = \frac{1}{2m^2}\,\frac{1}{r}\frac{dV}{dr}\, L\cdot S \tag{20.34}$$
arises because the electric field of the nucleus generates a magnetic field in the rest frame of the electron.

The unperturbed energy levels in hydrogen are given by equation 20.22. When the spin of the electron is taken into account, these levels are $2n^2$-fold degenerate. One choice of basis is $|n l m_l m_s\rangle$, the simultaneous eigenstates of $L^2$, $L_z$ and $S_z$. However, $L_z$ and $S_z$ do not commute with $H_{\mathrm{SO}}$. A better choice of basis is $|n l j m_j\rangle$, the simultaneous eigenstates of $L^2$, $J^2$ and $J_z$. $H_{\mathrm{RKE}}$, $H_{\mathrm{D}}$ and $H_{\mathrm{SO}}$ all commute with $L^2$, $J^2$ and $J_z$. Thus $\langle n l' j' m_j'|H_{\mathrm{FS}}|n l j m_j\rangle$ vanishes unless $l' = l$, $j' = j$ and $m_j' = m_j$. We can figure out that
 
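$$\langle n l j m_j|H_{\mathrm{RKE}}|n l j m_j\rangle = -\frac{1}{n^2}\left(\frac{3}{4} - \frac{n}{l + 1/2}\right)\alpha^2 E_n, \tag{20.35}$$
$$\langle n l j m_j|H_{\mathrm{D}}|n l j m_j\rangle = -\frac{1}{n}\,\delta_{l0}\,\alpha^2 E_n, \tag{20.36}$$
$$\langle n l j m_j|H_{\mathrm{SO}}|n l j m_j\rangle = -\frac{1}{2n}\,\frac{j(j+1) - l(l+1) - 3/4}{l(l + 1/2)(l + 1)}\,\alpha^2 E_n. \tag{20.37}$$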

The total energy shift due to the fine structure is


 
$$\Delta E_{\mathrm{FS}} = -\frac{1}{n^2}\left(\frac{3}{4} - \frac{n}{j + 1/2}\right)\alpha^2 E_n. \tag{20.38}$$

It is independent of the orbital angular momentum quantum number l, although each of the individual terms does depend on l. However, the total energy shift does depend on j in addition to the principal quantum number n, so when we take into account the fine structure corrections, the energy levels of the hydrogen atom have the form $E_{nj}$.
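
The cancellation of the l dependence can be verified with exact rational arithmetic. The sketch below encodes the coefficients of $\alpha^2 E_n$ in (20.35) to (20.38) and checks that the three contributions add up to the total for every allowed (n, l, j) up to n = 6. It assumes that the spin-orbit term is taken to be zero for l = 0 (where L · S vanishes) and that the Darwin term contributes only for l = 0; the function names are illustrative.

```python
from fractions import Fraction as Fr


def rke(n, l):
    """Coefficient of alpha^2 * E_n in Eq. (20.35)."""
    return -Fr(1, n**2) * (Fr(3, 4) - Fr(n) / (l + Fr(1, 2)))


def darwin(n, l):
    """Coefficient of alpha^2 * E_n in Eq. (20.36); nonzero only for s states."""
    return -Fr(1, n) if l == 0 else Fr(0)


def spin_orbit(n, l, j):
    """Coefficient of alpha^2 * E_n in Eq. (20.37); zero for l = 0."""
    if l == 0:
        return Fr(0)
    x = j * (j + 1) - l * (l + 1) - Fr(3, 4)
    return -Fr(1, 2 * n) * x / (l * (l + Fr(1, 2)) * (l + 1))


def total(n, j):
    """Coefficient of alpha^2 * E_n in Eq. (20.38)."""
    return -Fr(1, n**2) * (Fr(3, 4) - Fr(n) / (j + Fr(1, 2)))


def check_all(n_max=6):
    """Return True if the three terms sum to Eq. (20.38) for all allowed states."""
    for n in range(1, n_max + 1):
        for l in range(n):
            js = [l + Fr(1, 2)] if l == 0 else [l - Fr(1, 2), l + Fr(1, 2)]
            for j in js:
                if rke(n, l) + darwin(n, l) + spin_orbit(n, l, j) != total(n, j):
                    return False
    return True


print(check_all())
print(total(2, Fr(1, 2)), total(2, Fr(3, 2)))   # n = 2 levels with j = 1/2 and j = 3/2
```

For n = 2 the coefficients are 5/16 for j = 1/2 (shared by 2s1/2 and 2p1/2) and 1/16 for j = 3/2; since $E_n$ is negative, both levels are shifted downward.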

Besides the fine structure, the remaining important effects causing energy shifts are hyperfine effects and the Lamb shift. The Lamb shift is a shift in the energy levels due to the interaction of the electron with the vacuum fluctuations of the quantized electromagnetic field. It mainly affects the s states (l = 0) of hydrogen, by a small amount, thereby introducing a dependence of the energy levels on l. Thus, including the Lamb shift, the energy levels in hydrogen have the form $E_{nlj}$, and the only degeneracy is that due to rotational invariance. It will be further discussed in quantum electrodynamics. Hyperfine effects are caused by the interaction between the electron spin and the nuclear spin, and will be discussed later.

20.2.3 Zeeman effect


The Zeeman effect concerns the interaction of atomic systems with external magnetic fields.
The Hamiltonian for the electron in hydrogen atom is

$$H = \frac{(P + eA)^2}{2m} - \frac{e^2}{4\pi r} + H_{\mathrm{FS}} + g_e \mu_B\, S\cdot B, \tag{20.39}$$

where $g_e \approx 2$ and $\mu_B \equiv e/2m$. We assume a uniform magnetic field $B = B\hat{z}$. Taking the gauge
$$A = \frac{1}{2}\, B\times r, \tag{20.40}$$
we have ∇ · A = 0, implying that

P · A = A · P. (20.41)

Hence the cross terms in the expansion of the kinetic energy can be written in either order.
Noticing that
$$P\cdot A = \frac{1}{2}\, P\cdot(B\times r) = \frac{1}{2}\, B\cdot L, \tag{20.42}$$
we thus have
H = Ha + HZ + HB + HFS , (20.43)
where
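$$H_a = \frac{p^2}{2m} - \frac{e^2}{4\pi r}, \qquad H_Z = \frac{e}{2m}(L_z + 2S_z)B, \qquad H_B = \frac{e^2}{8m}\,B^2(x^2 + y^2). \tag{20.44}$$
Denote the typical energy of the term $H_i$ as $E_i$. We have $E_a \sim m e^4/32 n^2\pi^2\hbar^2\epsilon_0^2$, $E_Z \sim n e\hbar B/2m$ and $E_B \sim 2n^4\pi^2\epsilon_0^2\hbar^4 B^2/m^3 e^2$, leading to
$$\frac{E_Z}{E_a} \sim \frac{16\pi^2 n^3\hbar^3\epsilon_0^2}{m^2 e^3}\,B \sim \frac{n^3 B}{2\times 10^5\ \mathrm{T}} \tag{20.45}$$
and
$$\frac{E_B}{E_a} \sim \left(\frac{8\pi^2 n^3\hbar^3\epsilon_0^2}{m^2 e^3}\,B\right)^2 \sim \left(\frac{n^3 B}{4\times 10^5\ \mathrm{T}}\right)^2. \tag{20.46}$$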
Under usual experimental conditions, we have

EB ≪ EZ ≪ Ea , (20.47)

and so term HB can be neglected. Recall that


$$\frac{E_{\mathrm{FS}}}{E_a} \sim \frac{3\alpha^2}{4n^2} \sim \frac{1}{2.5\times 10^4\, n^2}. \tag{20.48}$$
Whether EFS /EZ is much larger than 1, much smaller than 1 or close to 1 depends on the
value of B and n.
If HFS is much smaller than HZ and can be neglected, we have
$$H = H_a + \frac{e}{2m}(L_z + 2S_z)B. \tag{20.49}$$
The eigenstates of H are $|n l m_l m_s\rangle$ with eigenvalues $E = E_n + \mu_B B(m_l + 2m_s)$. The energy levels for n = 2 are shown in Figure 20.1.
If $H_{\mathrm{FS}}$ is much smaller than $H_Z$ but cannot be neglected, we may treat it as a perturbation. For simplicity, we only take $H_{\mathrm{SO}}$ into account. Up to first order, we consider the matrix element
$$\langle n l m_l m_s|\, f(r)\, L\cdot S\, |n l' m_l' m_s'\rangle. \tag{20.50}$$

Figure 20.1: Zeeman effect for n = 2 in Hydrogen atom.

Since [HSO , L2 ] = 0, the matrix element vanishes unless l = l′ . Thus we focus on the matrix in
the subspace l = l′ . Take the 2p orbits of hydrogen as an example. There is a 2-fold degeneracy
between |2, 1, −1, 1/2⟩ and |2, 1, 1, −1/2⟩. This makes one 2 × 2 matrix. Let us look at the
off-diagonal element
⟨2, 1, −1, 1/2|f (r)L · S|2, 1, 1, −1/2⟩ . (20.51)

To be non-vanishing, the operator in the middle of the matrix element must connect states with $\Delta m_l = 2$. But since the operator $L\cdot S = (L_+ S_- + L_- S_+)/2 + L_z S_z$ permits only $\Delta m_l = 0, \pm 1$, the off-diagonal matrix element vanishes and the energy shift is determined by the diagonal elements. As

∆E = ⟨nlml ms |f (r)L · S|nlml ms ⟩ ∝ ml ms , (20.52)

the degeneracy remains under the first order perturbation.

The final case we shall examine is the weak field limit, in which $H_Z \ll H_{\mathrm{FS}}$ and we treat $H_Z$ as the perturbation.

Note: In the case of hydrogen, one should also consider the Lamb shift for a realistic treatment. For example,
in the n = 2 levels of hydrogen, the Lamb shift is about 10 times smaller than the fine structure energy
shifts, indicating that we really should question how the Lamb shift compares to the Zeeman term which
is also (by our assumptions) much smaller than the fine structure term.

The eigenstates of $H_a + H_{\mathrm{FS}}$ are $|n l j m_j\rangle$ with eigenvalues $E_{nj}$. Up to first order, the matrix elements we need have the form

$$\langle n l' j m_j'|\, H_Z\, |n l j m_j\rangle. \tag{20.53}$$

Since [HZ , L2 ] = 0 and [HZ , Jz ] = 0, off-diagonal matrix element vanishes automatically. The
energy shift is
∆E = µB B ⟨nljmj | Lz + 2Sz |nljmj ⟩ = geff µB Bmj , (20.54)

where
$$g_{\mathrm{eff}} = 1 + \frac{j(j+1) - l(l+1) + s(s+1)}{2j(j+1)}. \tag{20.55}$$
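
A short sketch evaluating (20.55) for the n = 2 levels of hydrogen, together with the weak-field shifts (20.54) in units of $\mu_B B$. It assumes s = 1/2 and $g_e = 2$, as in the text; the function and variable names are illustrative.

```python
from fractions import Fraction as Fr


def g_eff(j, l, s=Fr(1, 2)):
    """Effective g-factor of Eq. (20.55), assuming g_e = 2."""
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))


def weak_field_shifts(j, l):
    """Energy shifts of Eq. (20.54) in units of mu_B * B, for m_j = -j, ..., +j."""
    g = g_eff(j, l)
    m_values = [-j + k for k in range(int(2 * j) + 1)]
    return [g * m for m in m_values]


levels = {
    "2s1/2": (Fr(1, 2), 0),
    "2p1/2": (Fr(1, 2), 1),
    "2p3/2": (Fr(3, 2), 1),
}
for name, (j, l) in levels.items():
    print(name, g_eff(j, l), weak_field_shifts(j, l))
```

The three g-factors come out as 2, 2/3 and 4/3, so in a weak field each fine structure level splits into 2j + 1 equally spaced sublevels with a level-dependent spacing.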

20.2.4 Hyperfine structure


The nucleus of an atom contains localized charge and current distributions, which produce
electric and magnetic fields that can be decomposed into multipole fields much as in classical
electrostatics or magnetostatics. The first of the multipole moments, the electric monopole,
is of course the Coulomb electrostatic field that holds the electrons in their orbits and pro-
duces the gross structure of the atom. The higher order multipole moments produce small
corrections to the atomic structure that are known generally as hyperfine effects.

Multipole moments for a system of charges have been discussed in the part on classical electrodynamics. In quantum mechanics, we recall that the intrinsic magnetic moment operator of
an electron is defined in the space of electron spin. We may infer that multipole moments of
nuclei are defined in the space of nuclear spin I. Suppose that I 2 has eigenvalues i(i + 1). The
nuclear Hilbert space will be a (2i + 1)-dimensional space in which the standard basis is |mi ⟩
with −i ≤ mi ≤ i.

Not all the multipole fields that occur classically are allowed in the case of nuclei. There are
two rules governing the allowed multipole moments of the nucleus. The first is that electric
multipoles of odd k and magnetic multipoles of even k are forbidden. For example, if the
nucleus had an electric dipole moment, the perturbing Hamiltonian would be

$$H_1 = -e\,\frac{d\cdot r}{r^3}. \tag{20.56}$$
And just like µ, d must be proportional to the spin, because all vector operators on a single
irreducible subspace are proportional (Wigner-Eckart theorem). Thus, we have

$$H_1 = -\kappa e\,\frac{I\cdot r}{r^3}. \tag{20.57}$$
We find that H1 violates time reversal and parity

T H1 T † = −H1 , P H1 P † = −H1 . (20.58)

The weak interactions do violate parity, and we do know that time reversal is violated at a very
small level in certain decay processes, so it is possible that the terms forbidden by this rule
actually exist at a small level. For example, the neutron or the electron may have an electric
dipole moment, but if such moments exist, they are certainly very small and can be neglected
in our discussion.

The second rule states that a $2^k$-pole can occur only if k ≤ 2i. For example, the proton with i = 1/2 can possess an electric monopole moment and a magnetic dipole moment, but not an electric quadrupole moment. Lying behind this rule is the fact that the operator representing the $2^k$-pole on the nuclear Hilbert space is, in fact, an order k irreducible tensor operator. But the maximum order of an irreducible tensor operator on the nuclear Hilbert space with spin i is k = 2i.

For the hydrogen atom, whose nuclear spin is i = 1/2, the only term we need to consider is the magnetic dipole moment. A point magnetic dipole of moment µ situated at the origin of the coordinates produces a magnetic field B = ∇ × A, where

$$A(r) = \frac{\mu\times r}{r^3}. \tag{20.59}$$
To avoid the singularity at the origin, we modify A(r) as
$$A(r) = \mu\times r\,\begin{cases} \dfrac{1}{a^3}, & r < a \\[2mm] \dfrac{1}{r^3}, & r > a \end{cases}, \tag{20.60}$$

taking into account the finite size of the nucleus. By taking the curl we compute the magnetic field
$$B(r) = \begin{cases} \dfrac{2\mu}{a^3}, & r < a \\[2mm] \mu\cdot\mathsf{T}, & r > a \end{cases}, \tag{20.61}$$

where Tij = (3xi xj − r2 δij )/r5 .

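Define
$$\Delta(r) \equiv \begin{cases} \dfrac{1}{a^3}, & r < a \\[2mm] 0, & r > a \end{cases} \qquad\text{and}\qquad f(r) \equiv \begin{cases} 0, & r < a \\ 1, & r > a \end{cases}. \tag{20.62}$$
Then we can write
$$A(r) = \mu\times r\left[\Delta(r) + \frac{f(r)}{r^3}\right] \qquad\text{and}\qquad B(r) = \mu\cdot\left[2\Delta(r)\,\mathsf{I} + f(r)\,\mathsf{T}\right]. \tag{20.63}$$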

In the limit a → 0, we have



$$\lim_{a\to 0}\Delta(r) = \frac{4\pi}{3}\,\delta(r), \qquad \lim_{a\to 0} f(r) = 1. \tag{20.64}$$

The Hamiltonian for the atomic electron is


$$H = \frac{(P + eA)^2}{2m_e} - \frac{e^2}{4\pi r} + H_{\mathrm{FS}} + H_{\mathrm{Lamb}} + \frac{e}{m_e}\, S\cdot B. \tag{20.65}$$

The expressions 20.63 for A and B are the fields of a classical magnetic dipole at the origin,
but now for use in the Hamiltonian we must reinterpret µ as an operator acting on the nuclear
Hilbert space, given in terms of the nuclear spin by

µ = gp µp I, (20.66)

where $g_p$ is the g-factor of the proton and $\mu_p \equiv e/2m_p$. Thus, the Hamiltonian must be interpreted as an operator acting on the total Hilbert space

H = Helec ⊗ Hnucl . (20.67)

For $\mathcal{H}_{\mathrm{elec}}$ the obvious basis is $|n l j m_j\rangle$ with energies $E_{nlj}$ when there are no hyperfine terms.
In hydrogen energies depend on l because of the Lamb shift. The obvious basis in Hnucl is
|imi ⟩. Thus we define the basis states in H as |nljmj mi ⟩ (we suppress the index i since it is a
constant). We call this the uncoupled basis. Now we expand the kinetic energy in Hamiltonian
and neglect the term in A2 , writing the result as H = H0 + H1 , where

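$$H_0 = \frac{p^2}{2m_e} - \frac{e^2}{4\pi r} + H_{\mathrm{FS}} + H_{\mathrm{Lamb}} \qquad\text{and}\qquad H_1 = 2\mu_B\,(P\cdot A + S\cdot B). \tag{20.68}$$
Using
$$P\cdot(I\times r) = I\cdot(r\times P) = I\cdot L, \tag{20.69}$$
we can get
$$H_{1,\mathrm{orbi}} \equiv 2\mu_B\,(P\cdot A) = k\,(I\cdot L)\left[\Delta(r) + \frac{f(r)}{r^3}\right], \tag{20.70}$$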
where k ≡ ge gp µB µp . As for spin part, we obtain

H1,spin ≡ 2µB S · B = k [2∆(r)I · S + f (r)I · T · S] . (20.71)

Taking the limit a → 0, we have


   
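$$H_{1,\mathrm{orbi}} = k\,(I\cdot L)\left[\frac{4\pi}{3}\,\delta(r) + \frac{1}{r^3}\right] \qquad\text{and}\qquad H_{1,\mathrm{spin}} = k\left[\frac{8\pi}{3}\,\delta(r)\, I\cdot S + I\cdot\mathsf{T}\cdot S\right]. \tag{20.72}$$
Since the coupling terms (e.g. I · S) are not invariant under either electronic rotations alone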
or under nuclear rotations alone, the uncoupled basis is not the best one for carrying out the
perturbation calculation. The dot products in question, however, are invariant under total
rotations of the system, electronic plus nuclear, which are generated by the total angular mo-
mentum of the system defined by

F ≡ I + J = I + L + S. (20.73)

This suggests that we couple together J and I to create eigenstates of F and Fz . We will call
the result the “coupled basis”, denoted by |nljf mf ⟩.
In the coupled basis the matrix elements we need to consider for degenerate perturbation theory are $\langle n l j f m_f|\,H_1\,|n l j f' m_f'\rangle$. Since $[F, H_1] = 0$, the energy shift caused by $H_1$ is simply given by the diagonal matrix elements, i.e.,

∆E = ⟨nljf mf |H1 |nljf mf ⟩ . (20.74)

We can figure out that

$$\Delta E = \frac{g_e g_p \mu_B \mu_p}{4\pi a_0^3}\,\frac{1}{n^3}\,\frac{f(f+1) - j(j+1) - i(i+1)}{j(j+1)(2l+1)}. \tag{20.75}$$

The energy levels now have the form $E_{nljf}$. The energy eigenstates are $|n l j f m_f\rangle$, and are (2f + 1)-fold degenerate. The fine structure levels of hydrogen thus split, giving rise to hyperfine multiplets. For example, the ground state level $|n l j\rangle = |1, 0, 1/2\rangle$ splits into two levels, f = 0 and f = 1. The f = 0 level is the true ground state of hydrogen. It is nondegenerate. The f = 1 level is 3-fold degenerate, and lies above the ground state by an energy of approximately 1.42 GHz in frequency units, or 21 cm in wavelength units.
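
The quoted numbers can be reproduced from (20.75). The sketch below assumes natural Heaviside-Lorentz units, in which $e^2 = 4\pi\alpha$ and $a_0 = 1/(m_e\alpha)$, so that $\mu_B\mu_p/(4\pi a_0^3) = \alpha^4 m_e^2/(4m_p)$. The constants are rounded reference values entered by hand, $g_e$ is set to exactly 2, and reduced-mass and radiative corrections are ignored, so agreement at the level of a fraction of a percent is all that should be expected.

```python
ALPHA = 1.0 / 137.035999      # fine structure constant
ME_EV = 510998.95             # electron rest energy in eV
MP_OVER_ME = 1836.15267       # proton-to-electron mass ratio
G_E = 2.0                     # electron g-factor (anomaly neglected)
G_P = 5.5856947               # proton g-factor
H_EV_S = 4.135667696e-15      # Planck constant in eV*s
C_M_S = 299792458.0           # speed of light in m/s


def hyperfine_shift_ev(n, l, j, f, i=0.5):
    """Energy shift of Eq. (20.75) in eV for the level |n l j f>."""
    prefactor = G_E * G_P * ALPHA**4 * ME_EV / (4.0 * MP_OVER_ME)
    angular = (f * (f + 1) - j * (j + 1) - i * (i + 1)) / (j * (j + 1) * (2 * l + 1))
    return prefactor * angular / n**3


# Ground state 1s_{1/2}: the f = 1 and f = 0 levels
splitting_ev = hyperfine_shift_ev(1, 0, 0.5, 1) - hyperfine_shift_ev(1, 0, 0.5, 0)
frequency_hz = splitting_ev / H_EV_S
wavelength_m = C_M_S / frequency_hz
print(splitting_ev, frequency_hz, wavelength_m)
```

With these inputs the splitting is about 5.9 × 10^-6 eV, corresponding to roughly 1.42 GHz and 21 cm.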

20.3 Time-dependent perturbation theory


20.3.1 Dyson series
Time-dependent perturbation theory applies to Hamiltonians of the form
H = H0 + H1 (t), (20.76)
where H0 is solvable and H1 is treated as a perturbation. In time-dependent perturbation
theory, we are usually interested in time-dependent transitions between eigenstates of the un-
perturbed system induced by the perturbation H1 . Time-dependent transitions are usually
described by the transition amplitude, defined as the quantity
⟨f |U (t)|i⟩ , (20.77)
where U (t) is the exact time evolution operator for the Hamiltonian, and where |i⟩ and |f ⟩
are two eigenstates of the unperturbed Hamiltonian H0 . Let us denote the unperturbed time-
evolution operator by U0 (t) and the exact one by U (t). These operators satisfy the evolution
equations
$$i\,\frac{\partial U_0(t)}{\partial t} = H_0 U_0(t) \qquad\text{and}\qquad i\,\frac{\partial U(t)}{\partial t} = H U(t). \tag{20.78}$$
Since $H_0$ is independent of time, we have $U_0 = e^{-iH_0 t}$.
Suppose the state in Schrödinger picture is |ψS (t)⟩. We define the state in interaction picture
as
|ψI (t)⟩ ≡ U0† (t) |ψS (t)⟩ . (20.79)
Similarly, we define the operator in interaction picture as
AI (t) ≡ U0† (t)AS (t)U0 (t). (20.80)
Let us define W (t) as the operator that evolves kets in the interaction picture forward from
time 0 to final time t, i.e.,
|ψI (t)⟩ = W (t) |ψI (0)⟩ . (20.81)
We can verify that
W (t) = U0† U. (20.82)
The time evolution of W (t) is given by
$$i\,\frac{\partial W}{\partial t} = H_{1I}(t)\,W(t). \tag{20.83}$$
The formal solution is
$$W(t) = I + \sum_{n=1}^{\infty} (-i)^n \int_0^{t} dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{n-1}} dt_n\; H_{1I}(t_1) H_{1I}(t_2) \cdots H_{1I}(t_n). \tag{20.84}$$

Let us assume for simplicity that H0 has a discrete spectrum H0 |n⟩ = En |n⟩. We assume that
the system is initially in an eigenstate of the unperturbed system, what we will call the “initial”
state |i⟩ with energy Ei . The evolution of the state in interaction picture is
|ψI (t)⟩ = W (t) |i⟩ . (20.85)

Let us expand the exact solution of the Schrödinger equation in the interaction picture in the unperturbed eigenstates as
$$|\psi_I(t)\rangle = \sum_{n} c_n(t)\,|n\rangle. \tag{20.86}$$

We then have
$$c_n(t) = \langle n|W(t)|i\rangle = \langle n|U_0^{\dagger}(t)\,U(t)|i\rangle = e^{iE_n t}\,\langle n|U(t)|i\rangle. \tag{20.87}$$

Thus, the transition amplitudes in the interaction picture and those in the Schrödinger picture
are related by a simple phase factor. The transition probabilities are the squares of the ampli-
tudes and are the same in either case. The perturbation expansion of the transition amplitude
cn (t) is
$$c_n(t) = \delta_{ni} + c_n^{(1)}(t) + \cdots, \tag{20.88}$$
where
$$c_n^{(1)}(t) = \frac{1}{i}\int_0^{t} dt'\, \langle n|H_{1I}(t')|i\rangle = \frac{1}{i}\int_0^{t} dt'\, e^{i(E_n - E_i)t'}\, \langle n|H_1(t')|i\rangle. \tag{20.89}$$

20.3.2 Constant and harmonic perturbation


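If $H_1$ is time-independent, we have
$$c_n^{(1)}(t) = \frac{2}{i}\, e^{i\omega_{ni}t/2}\,\frac{\sin(\omega_{ni}t/2)}{\omega_{ni}}\,\langle n|H_1|i\rangle, \tag{20.90}$$
where $\omega_{ni} \equiv E_n - E_i$. Up to first order, the transition probability is
$$P_n(t) = 4\,\frac{\sin^2(\omega_{ni}t/2)}{\omega_{ni}^2}\,|\langle n|H_1|i\rangle|^2, \qquad (n \neq i). \tag{20.91}$$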
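
The range of validity of (20.91) can be seen by comparing it with an exactly solvable case. The sketch below assumes a two-level system with a constant real coupling v = ⟨n|H1|i⟩ and vanishing diagonal elements of H1, for which the exact transition probability is given by the Rabi formula; the function names and the sample numbers are illustrative.

```python
import math


def first_order_probability(v, omega, t):
    """Eq. (20.91) with omega = E_n - E_i; the omega -> 0 limit is v^2 t^2."""
    if omega == 0.0:
        return v**2 * t**2
    return 4.0 * v**2 * math.sin(omega * t / 2.0) ** 2 / omega**2


def exact_probability(v, omega, t):
    """Rabi formula for H = [[E_i, v], [v, E_n]] starting in state |i>."""
    big = math.sqrt(omega**2 + 4.0 * v**2)
    return 4.0 * v**2 / big**2 * math.sin(big * t / 2.0) ** 2


# Weak coupling, off resonance: first order is accurate.
print(first_order_probability(0.01, 1.0, 2.0), exact_probability(0.01, 1.0, 2.0))

# On resonance at long times: first order exceeds 1 and has clearly broken down.
print(first_order_probability(0.5, 0.0, 4.0), exact_probability(0.5, 0.0, 4.0))
```

In the first case the two results agree to better than one part in a thousand; in the second the first-order expression gives 4, while the exact probability stays below 1.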
Another case that is important in practice is when H1 has a periodic time dependence of the
form
H1 (t) = Ke−iω0 t + K † eiω0 t . (20.92)
We can get
 
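$$c_n^{(1)}(t) = \frac{2}{i}\left[ e^{i(\omega_{ni} - \omega_0)t/2}\,\frac{\sin[(\omega_{ni} - \omega_0)t/2]}{\omega_{ni} - \omega_0}\,\langle n|K|i\rangle + e^{i(\omega_{ni} + \omega_0)t/2}\,\frac{\sin[(\omega_{ni} + \omega_0)t/2]}{\omega_{ni} + \omega_0}\,\langle n|K^{\dagger}|i\rangle \right]. \tag{20.93}$$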
Often, we are most interested in those final states to which most of the probability goes, which
are the states for which one or the other of the two denominators is small. For these states we
have
En ≈ Ei ± ω0 . (20.94)
We call these two cases absorption and stimulated emission, respectively. Taking the case of
absorption, and looking only at final states that are near resonance, we can write the transition
probability to first order of perturbation theory as

$$P_n(t) = 4\,\frac{\sin^2[(\omega_{ni} - \omega_0)t/2]}{(\omega_{ni} - \omega_0)^2}\,|\langle n|K|i\rangle|^2. \tag{20.95}$$

20.3.3 Transition probability


Let us fix the final state and examine how the probability develops as a function of time in
first order time-dependent perturbation theory. Obviously Pn (0) = 0. At later times we
see that $P_n(t)$ oscillates at frequency $\omega_{ni}$ between 0 and a maximum proportional to $1/\omega_{ni}^2$.
The frequency ωni measures how far the final state is off resonance, that is, how much it fails
to conserve energy. If this frequency is large, the probability oscillates rapidly between zero
and a small maximum. But as we move the state closer to the initial state in energy, ωni gets
smaller, the period of oscillations becomes longer, and the amplitude grows. If there is a final
state degenerate in energy with the initial state, then ωni = 0 and the time-dependent factor
takes on its limiting value t2 /4. In this case, first order perturbation theory predicts that the
probability grows without bound. This is an indication of the fact that at sufficiently long times
first order perturbation theory breaks down and we must take into account higher order terms
in the perturbation expansion. But at short times it is correct that Pn for a state on resonance
grows as t2 .
Now let us fix the time $t$ and examine how the expression for $P_n(t)$ in first order perturbation
theory depends on the energy of the final state. To do this we focus on $\sin^2(\omega t/2)/\omega^2$ as a
function of $\omega$. The curve of the function consists of oscillations under the envelope $1/\omega^2$, with
zeroes at $\omega = 2n\pi/t$ for integer $n \neq 0$. The central lobe has height $t^2/4$ and width proportional to $1/t$,
so the area of the central lobe is proportional to $t$. We can derive that
\[ \lim_{t\to\infty} \frac{1}{t}\,\frac{\sin^2(\omega t/2)}{\omega^2} = \frac{\pi}{2}\,\delta(\omega). \tag{20.96} \]
The δ-function enforces energy conservation in the limit t → ∞. But at finite times, transi-
tions take place to states in a range of energies about the initial energy. This width is of order
1/t. This is an example of the energy-time uncertainty relation, ∆t∆E ∼ 1, indicating that a
system that is isolated (not subjected to a measurement) over a time interval ∆t has an energy
that is uncertain by an amount ∆E ∼ 1/∆t.
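The normalization $\pi/2$ in (20.96) can be checked numerically: the area under $\sin^2(\omega t/2)/\omega^2$, divided by $t$, should approach $\pi/2$. The following is a minimal sketch using a plain midpoint rule; the cutoff, the number of steps and the function names are assumptions chosen for illustration.

```python
import math


def lobe(omega, t):
    """sin^2(omega t / 2) / omega^2, using the limit t^2 / 4 at omega = 0."""
    if abs(omega) < 1e-12:
        return t * t / 4.0
    return math.sin(omega * t / 2.0) ** 2 / omega**2


def area_over_t(t, cutoff=200.0, steps=200_000):
    """Midpoint-rule estimate of (1/t) * integral of lobe over [-cutoff, cutoff]."""
    d_omega = 2.0 * cutoff / steps
    total = 0.0
    for k in range(steps):
        omega = -cutoff + (k + 0.5) * d_omega
        total += lobe(omega, t)
    return total * d_omega / t


estimates = {t: area_over_t(t) for t in (10.0, 50.0)}
```

The truncated tails beyond the cutoff contribute roughly $1/(\text{cutoff}\cdot t)$ to the estimate, so the agreement with $\pi/2$ improves as $t$ or the cutoff grows.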
It is conventional to define the transition rate as the transition probability per unit time,

\[ \Gamma(i \to f) \equiv \lim_{t\to\infty} \frac{P(i \to f)}{t}. \tag{20.97} \]
Up to the first order, we have
\[ \Gamma(i \to f) = 2\pi\,\delta(E_i - E_f)\,|V_{fi}|^2, \tag{20.98} \]
where $V_{fi} = \langle f|H_1|i\rangle$.

20.4 Atomic radiation


We apply time-dependent perturbation theory to the interaction of an atomic electron with a
classical radiation field. The basic Hamiltonian, with the $A^2$ term omitted, is
\[ H = \frac{p^2}{2m} - e\phi(x) + \frac{e}{m}\, A\cdot P. \tag{20.99} \]

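For a polarized monochromatic plane wave, we have
\[ A = 2A_0\,\epsilon\cos(k\cdot x - \omega t) = A_0\,\epsilon\left[e^{ik\cdot x}e^{-i\omega t} + e^{-ik\cdot x}e^{i\omega t}\right]. \tag{20.100} \]
The term $(e/m)A\cdot P$ is a harmonic perturbation with $K = (eA_0/m)\,e^{ik\cdot x}(\epsilon\cdot P)$. For the
absorption of radiation, the transition rate is
\[ \Gamma_{\rm abs}(1\to 2) = 2\pi\,\frac{e^2|A_0|^2}{m^2}\,\left|\langle 2|e^{ik\cdot x}(\epsilon\cdot P)|1\rangle\right|^2\,\delta(E_f - E_i - \omega). \tag{20.101} \]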
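Notice that the average energy density of the radiation field is
\[ \rho = \frac{1}{2}\,\overline{E^2 + B^2} = 2\omega^2|A_0|^2. \tag{20.102} \]
The transition rate can be rewritten as
\[ \Gamma_{\rm abs}(1\to 2) = \frac{\pi e^2}{m^2\omega_{12}^2}\,\left|\langle 2|e^{ik\cdot x}(\epsilon\cdot P)|1\rangle\right|^2\,\rho(\omega_{12}), \tag{20.103} \]
where $\rho(\omega) \equiv \rho\,\delta(E_2 - E_1 - \omega)$ is the average energy density of the EM field per unit angular
frequency.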
The electric dipole approximation is based on the fact that the wavelength of the radiation field is
far longer than the atomic dimension. The series
\[ e^{ik\cdot x} = 1 + ik\cdot x + \cdots \tag{20.104} \]
can be approximated by its leading term 1. Using $P = (m/i)[x, H_0]$, it follows that
\[ \langle 2|e^{ik\cdot x}(\epsilon\cdot P)|1\rangle \approx \frac{m}{i}\,\epsilon\cdot\langle 2|[x, H_0]|1\rangle = im(E_2 - E_1)\,\langle 2|\epsilon\cdot x|1\rangle, \tag{20.105} \]
where $H_0 = p^2/2m - e\phi(x)$. The Wigner-Eckart theorem gives the following selection rules:
• If the EM wave is linearly polarized in the $\hat z$ direction, the matrix element vanishes
unless $\Delta m = 0$ and $\Delta l = \pm 1$ ($\Delta l$ cannot be 0 because of parity).
• If the EM wave is circularly polarized, we have $\Delta m = \pm 1$.
• If the fine structure is taken into account, we have $\Delta j = 0, \pm 1$ and $\Delta l = \pm 1$.
• If the hyperfine structure is taken into account, we have $\Delta f = 0, \pm 1$, $\Delta j = 0, \pm 1$
and $\Delta l = \pm 1$.
In the dipole approximation, the absorption rate simplifies to
\[ \Gamma_{\rm abs}(1\to 2) = \pi\,|\langle 2|\epsilon\cdot d|1\rangle|^2\,\rho(\omega_{21}), \tag{20.106} \]
where $d \equiv ex$ is the electric dipole operator. Assuming the direction and polarization of the
incident light are totally random, we have
\[ \overline{|\langle 2|\epsilon\cdot d|1\rangle|^2} = \frac{1}{3}\,d_{21}^2 \quad\text{where}\quad d_{21}^2 \equiv |\langle 2|d|1\rangle|^2. \tag{20.107} \]

If we take the degeneracy of state $|2\rangle$ into account, we have
\[ \Gamma_{\rm abs}(1\to 2) = B_{1\to 2}\,\rho(\omega_{21}) \quad\text{where}\quad B_{1\to 2} \equiv \frac{\pi g_2 d_{21}^2}{3}. \tag{20.108} \]
Similarly, the rate of stimulated emission is
\[ \Gamma_{\rm em}(2\to 1) = B_{2\to 1}\,\rho(\omega_{21}) \quad\text{where}\quad B_{2\to 1} \equiv \frac{\pi g_1 d_{12}^2}{3}. \tag{20.109} \]
Usually $B_{1\to 2}$ and $B_{2\to 1}$ are called the Einstein coefficients of absorption and stimulated emission.
The relation $g_2 B_{2\to 1} = g_1 B_{1\to 2}$ follows directly from these definitions. However, spontaneous emission
cannot be explained unless we quantize the EM field as well. A complete treatment of atomic
radiation by quantum field theory can be found in section 4.5 of Theoretical Astrophysics,
Volume 1 (T. Padmanabhan).

20.5 Variational method


20.5.1 The formulation of variational method
Let H be a Hamiltonian which is assumed to have some bound states. Let the discrete (bound
state) energy eigenvalues be ordered as E0 < E1 < · · · . The eigenvalues are allowed to be
degenerate. There may also be a continuous spectrum above some energy, as often happens in
practice. Let $|\psi\rangle$ be any normalizable state. The variational theorem states that
\[ \frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle} \geq E_0. \tag{20.110} \]
The state |ψ⟩ is chosen to be an approximation to the ground state, based on physical intuition
or other criteria, and the upper bound on E0 that is obtained is often quite useful. In practice
we often choose a continuous family of trial wave functions. Let λ be a continuous parameter,
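and let us write the family as $|\psi(\lambda)\rangle$. We define a function of $\lambda$, really an energy function,
\[ F(\lambda) \equiv \frac{\langle\psi(\lambda)|H|\psi(\lambda)\rangle}{\langle\psi(\lambda)|\psi(\lambda)\rangle}. \tag{20.111} \]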
We minimize this by finding the root λ0 of ∂F /∂λ = 0 so that the best estimate to the ground
state wave function out of the family is |ψ(λ0 )⟩ and the best estimate for the ground state
energy is F (λ0 ). Of course we must check that the root is actually a minimum.
In practice the normalization denominators are often inconvenient. One possibility is simply
to normalize each member of the set of trial wave functions so those denominators are not
present. But often it is easier to enforce normalization by using Lagrange multipliers. We
introduce the function,

F (λ, β) = ⟨ψ(λ)|H|ψ(λ)⟩ − β(⟨ψ(λ)|ψ(λ)⟩ − 1), (20.112)

and then require


\[ \frac{\partial F}{\partial\lambda} = 0, \qquad \frac{\partial F}{\partial\beta} = 0. \tag{20.113} \]
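As an illustration of the one-parameter procedure, the sketch below applies it to a hypothetical example that is not taken from the notes: the quartic oscillator $H = p^2/2 + x^4$ (with $m = \hbar = 1$) and the normalized Gaussian trial family $\psi_\lambda(x) \propto e^{-\lambda x^2/2}$. For this family $\langle p^2\rangle/2 = \lambda/4$ and $\langle x^4\rangle = 3/(4\lambda^2)$, so $F(\lambda)$ is known in closed form and only the minimization is done numerically. The golden-section search is just one convenient choice of minimizer.

```python
import math


def trial_energy(lam):
    """F(lambda) for H = p^2/2 + x^4 with a normalized Gaussian trial state."""
    kinetic = lam / 4.0                # <p^2>/2 = lambda / 4
    potential = 3.0 / (4.0 * lam**2)   # <x^4> = 3 / (4 lambda^2)
    return kinetic + potential


def minimize_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


lam0 = minimize_golden(trial_energy, 0.1, 10.0)
best_energy = trial_energy(lam0)
```

Setting $dF/d\lambda = 0$ by hand gives $\lambda_0 = 6^{1/3}$ and $F(\lambda_0) = \tfrac{3}{8}\,6^{1/3} \approx 0.681$, which lies slightly above the numerically known ground state energy of about 0.668, as the variational theorem requires.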

20.5.2 Bound states and the virial theorem


Suppose that the potential scales as $V(\lambda x) = \lambda^n V(x)$. We will assume that there is a normalized
ground state with wave function $\psi_0(x)$. The ground state energy is
\[ E_0 = \int d^dx \left[ \frac{1}{2m}|\nabla\psi_0(x)|^2 + V(x)|\psi_0(x)|^2 \right] \equiv \langle T\rangle_0 + \langle V\rangle_0. \tag{20.114} \]
Now consider the trial wave function $\psi(x) = \alpha^{d/2}\psi_0(\alpha x)$, where the prefactor ensures that
$\psi(x)$ continues to be normalized. From the scaling property of the potential, it is simple to
show that
\[ E(\alpha) = \alpha^2\langle T\rangle_0 + \alpha^{-n}\langle V\rangle_0. \tag{20.115} \]
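The scaling step can be made explicit. The following is a sketch of the change of variables $y = \alpha x$, assuming only the homogeneity $V(\lambda x) = \lambda^n V(x)$ stated above; the subscript $\alpha$ on the expectation values is notation introduced here for the trial state.

```latex
\begin{aligned}
\langle T\rangle_\alpha
  &= \int d^dx\, \frac{\alpha^{d}}{2m}\,\bigl|\nabla_x \psi_0(\alpha x)\bigr|^2
   = \alpha^{2}\int d^dy\, \frac{1}{2m}\,\bigl|\nabla_y \psi_0(y)\bigr|^2
   = \alpha^{2}\,\langle T\rangle_0, \\
\langle V\rangle_\alpha
  &= \int d^dx\, \alpha^{d}\, V(x)\,|\psi_0(\alpha x)|^2
   = \int d^dy\, V(y/\alpha)\,|\psi_0(y)|^2
   = \alpha^{-n}\,\langle V\rangle_0 .
\end{aligned}
```

Here $d^dx = \alpha^{-d}\,d^dy$ and $\nabla_x = \alpha\nabla_y$, and the last equality in the second line uses $V(y/\alpha) = \alpha^{-n}V(y)$.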

The minimum of $E(\alpha)$ satisfies
\[ \frac{dE}{d\alpha} = 2\alpha\langle T\rangle_0 - n\alpha^{-n-1}\langle V\rangle_0 = 0. \tag{20.116} \]
But this minimum must sit at $\alpha = 1$ since, by construction, this is the true ground state. We
learn that for homogeneous potentials, we have
\[ 2\langle T\rangle_0 = n\langle V\rangle_0, \tag{20.117} \]

called the virial theorem.

Example: For Coulomb potential, we have V ∝ −1/r. The virial theorem tells us that E0 =
⟨T ⟩0 + ⟨V ⟩0 = − ⟨T ⟩0 < 0. In other words, we proved what we already know: the Coulomb
potential has bound states.


Note: Nowhere in our argument of the virial theorem did we state that the potential must be attractive.
Our conclusion above would seem to hold for repulsive potential, yet this is clearly wrong: the repulsive
potential V ∼ +1/r has no bound states. It is because we assumed at the beginning of the argument
that the ground state ψ0 was normalisable. For repulsive potentials this is not true: all states are asymptotically
plane waves of the form $e^{ikx}$. The virial theorem is not valid for repulsive potentials of this kind.

There is another exact and rather pretty result that holds for particles moving in one dimension.
Consider a particle moving in a potential $V(x)$ such that $V(x) = 0$ for $|x| > L$. A bound state
exists whenever $\int dx\, V(x) < 0$. In other words, a bound state exists whenever the potential
is “mostly attractive”. However, the converse to this statement does not hold. The proof can
be found in subsection 2.1.3 of Topics in Quantum Mechanics (David Tong).

20.6 WKB method



Note: In this section, we will write ℏ explicitly in our equations.

20.6.1 The semi-classical expansion


Consider the one-dimensional time-independent Schrödinger equation
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi = E\psi. \tag{20.118} \]
We will look for solutions of the form
\[ \psi(x) = e^{iW(x)/\hbar}. \tag{20.119} \]
Plugging this ansatz into 20.118 leaves us with the differential equation
\[ i\hbar\frac{d^2W}{dx^2} - \left(\frac{dW}{dx}\right)^2 + p^2(x) = 0, \tag{20.120} \]
where $p^2 = 2m(E - V)$. Here we’ll look for solutions where the second derivative is
small, meaning
\[ \hbar\left|\frac{d^2W}{dx^2}\right| \ll \left|\frac{dW}{dx}\right|^2. \tag{20.121} \]
We refer to this as the semi-classical limit. Roughly speaking, it can be thought of as the $\hbar \to 0$
limit. Indeed, mathematically, it makes sense to attempt to solve the Schrödinger equation using a power
series in $\hbar$. We treat $p(x)$ as the background potential which we will take to be of the order of
$|dW/dx|$. We expand our solution as
\[ W(x) = W_0(x) + \hbar W_1(x) + \cdots \tag{20.122} \]
Plugging this ansatz into equation 20.120 gives
\[ (-W_0'^2 + p^2) + \hbar\,(iW_0'' - 2W_0'W_1') + O(\hbar^2) = 0. \tag{20.123} \]
We have the solution
\[ W_0(x) = \pm\int^x dx'\, p(x'), \qquad W_1(x) = \frac{i}{2}\log p(x) + \text{constant}. \tag{20.124} \]
Putting these together gives us the WKB approximation to the wave function:
\[ \psi(x) \approx \frac{A}{\sqrt{p(x)}}\exp\left(\pm\frac{i}{\hbar}\int^x dx'\, p(x')\right). \tag{20.125} \]
To leading order, our requirement reads
\[ \hbar\left|\frac{dp}{dx}\right| \ll |p|^2 \quad\text{or}\quad \left|\frac{d\lambda}{dx}\right| \ll 2\pi, \tag{20.126} \]
where $\lambda \equiv 2\pi\hbar/p$ is the de Broglie wavelength. This is the statement that the de Broglie
wavelength of the particle does not change considerably over distances comparable to its
wavelength. Alternatively, we can phrase this as a condition on the potential as
\[ \lambda\left|\frac{dV}{dx}\right| \ll \frac{|p|^2}{2m}, \tag{20.127} \]
which says that the change of the potential energy over a de Broglie wavelength should be
much less than the kinetic energy.

The WKB approximation provides a solution in regions where $E \gg V(x)$ and, correspondingly,
$p(x)$ is real. This is the case in the middle of the potential, where the wave function
oscillates. The WKB approximation also provides a solution when $E \ll V(x)$, where $p(x)$ is
imaginary. This is the case to the far left and far right, where the wave function suffers either
exponential decay or growth,
\[ \psi(x) \approx \frac{A}{[2m(V - E)]^{1/4}}\exp\left(\pm\frac{1}{\hbar}\int^x dx'\,\sqrt{2m(V - E)}\right). \tag{20.128} \]
The choice of ± is typically fixed by normalisability requirements. In the region near
$E = V(x)$, the WKB approximation is never valid. The point $x_0$ where $p(x_0) = 0$ is the classical
turning point. The key idea that makes the WKB approximation work is matching. This
means that we use the WKB approximation where it is valid. But in the neighbourhood of
any turning point we will instead find a different solution. This will then be matched onto our
WKB solution. In the vicinity of $x_0$, we expand the potential energy, keeping only the linear
term
\[ V(x) \approx E + C(x - x_0) + \cdots \tag{20.129} \]
The Schrödinger equation then becomes
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + Cx\psi = Cx_0\psi. \tag{20.130} \]

20.6.2 A linear potential and the Airy function


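To solve differential equation 20.130, we define the dimensionless displacement as
\[ u \equiv \left(\frac{2mC}{\hbar^2}\right)^{1/3}(x - x_0) \approx \left(\frac{2m}{\hbar^2C^2}\right)^{1/3}(V - E). \tag{20.131} \]
Equation 20.130 then becomes
\[ \frac{d^2\psi}{du^2} - u\psi = 0, \tag{20.132} \]
known as the Airy equation. The solution that we need is the Airy function, defined by the integral
\[ {\rm Ai}(u) \equiv \frac{1}{\pi}\int_0^\infty dt\,\cos\left(\frac{t^3}{3} + ut\right). \tag{20.133} \]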

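The asymptotic behavior of the Airy function is
\[ {\rm Ai}(u) \sim \frac{1}{2}\left(\frac{1}{\pi\sqrt{u}}\right)^{1/2}\exp\left(-\frac{2}{3}u^{3/2}\right), \quad u \gg 0, \tag{20.134} \]
and
\[ {\rm Ai}(u) \sim \left(\frac{1}{\pi\sqrt{-u}}\right)^{1/2}\cos\left(\frac{2}{3}u\sqrt{-u} + \frac{\pi}{4}\right), \quad u \ll 0. \tag{20.135} \]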

The main purpose in introducing the Airy function is to put it to work in the WKB approxi-
mation. The asymptotic behavior is exactly what we need to match onto the WKB solution.
First consider the case where $u \ll 0$. Here $E > V(x)$ and we have the oscillatory solution:
\[ \psi(x) \sim \left[\frac{(2mC\hbar)^{1/3}}{\pi\sqrt{2m(E - V)}}\right]^{1/2}\cos\left(\frac{1}{\hbar}\int_{x_0}^{x} dx'\,{\rm sgn}(C)\sqrt{2m(E - V)} + \frac{\pi}{4}\right). \tag{20.136} \]
This takes the same oscillatory form as the WKB solution. The two solutions can be patched
together simply by picking an appropriate normalisation factor and phase for the WKB solution.
Similarly, in the region where $u \gg 0$, we have the exponentially decaying solution:
\[ \psi(x) \sim \frac{1}{2}\left[\frac{(2mC\hbar)^{1/3}}{\pi\sqrt{2m(V - E)}}\right]^{1/2}\exp\left(-\frac{1}{\hbar}\int_{x_0}^{x} dx'\,{\rm sgn}(C)\sqrt{2m(V - E)}\right). \tag{20.137} \]
This too has the same form as the exponentially decaying WKB solution. This is how we piece
together solutions. In regions where E > V (x), the WKB approximation gives oscillating
solutions. In regimes where E < V (x), it gives exponentially decaying solutions. The Airy
function interpolates between these two regimes.

20.6.3 Bound state spectrum


[Figure 20.2: One-dimensional potential $V(x)$ plotted against $x$, with the points $a$ and $b$ marked on the $x$ axis.]


As shown in Figure 20.2, we first split the potential into three regions where the WKB approx-
imation can be trusted:
• Region 1: x ≪ a.
• Region 2: a ≪ x ≪ b.
• Region 3: b ≪ x.
We’ll start in the left-most Region 1. Here the WKB approximation tells us that the solution
dies exponentially as
\[ \psi_1(x) \approx \frac{A}{[2m(V - E)]^{1/4}}\exp\left(-\frac{1}{\hbar}\int_x^a dx'\,\sqrt{2m(V - E)}\right). \tag{20.138} \]

As we approach $x = a$, the potential takes the linear form and this coincides with the asymptotic
form of the Airy function. We then follow this Airy function through to Region 2 where
we have
\[ \psi_2(x) \approx \frac{2A}{[2m(E - V)]^{1/4}}\cos\left(\frac{1}{\hbar}\int_a^x dx'\,\sqrt{2m(E - V)} - \frac{\pi}{4}\right). \tag{20.139} \]

The Airy function takes this form close to x = a where V (x) is linear. But we can extend this
solution throughout Region 2 where it coincides with the WKB approximation.

We now repeat this procedure to match Regions 2 and 3. When $x \gg b$, the WKB approximation
tells us that the wave function is
\[ \psi_3(x) \approx \frac{A'}{[2m(V - E)]^{1/4}}\exp\left(-\frac{1}{\hbar}\int_b^x dx'\,\sqrt{2m(V - E)}\right). \tag{20.140} \]
Matching to the Airy function across the turning point $x = b$, we have
\[ \psi_2(x) \approx \frac{2A'}{[2m(E - V)]^{1/4}}\cos\left(\frac{1}{\hbar}\int_b^x dx'\,\sqrt{2m(E - V)} + \frac{\pi}{4}\right). \tag{20.141} \]

We’re left with two expressions for the wave function in Region 2. Clearly these must agree.
Equating the two tells us that $|A| = |A'|$, but they may differ by a sign, since this can be
compensated by the cosine function. Insisting that the two cosine functions agree, up to sign,
gives us the condition
\[ \int_a^b dx\,\sqrt{2m(E - V)} = \left(n + \frac{1}{2}\right)\hbar\pi. \tag{20.142} \]

The WKB approximation underlies an important piece of history from the pre-Schrödinger
era of quantum mechanics. We can rewrite the quantisation condition as
\[ \oint dx\, p = \left(n + \frac{1}{2}\right)h, \qquad h \equiv 2\pi\hbar, \tag{20.143} \]
where $\oint$ means that we take a closed path in phase space which, in this one-dimensional
example, is from $x_{\min}$ to $x_{\max}$ and back again. In the old days of quantum mechanics, Bohr and
Sommerfeld introduced an ad-hoc method of quantisation. They suggested that one should
impose the condition
\[ \oint dx\, p = nh \tag{20.144} \]

with n an integer. They didn’t include the factor of 1/2. They made this guess because it turns
out to correctly describe the spectrum of the hydrogen atom. The WKB approximation pro-
vides an a-posteriori justification of the Bohr-Sommerfeld quantisation rule. More generally,
“Bohr-Sommerfeld quantisation” means packaging up a $2d$-dimensional phase space of the
system into small parcels of volume $h^d$ and assigning a quantum state to each. It is, at best, a
crude approximation to the correct quantisation treatment.
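The WKB quantisation condition (20.142) can be solved numerically for the energy levels. The sketch below works in units $m = \hbar = 1$ and assumes a symmetric single-well potential that increases monotonically for $x > 0$, so that the turning points are $\pm b$. The helper names, the search brackets and the quadrature (a midpoint rule after the substitution $x = b\sin\theta$, which removes the square-root endpoint behaviour) are illustrative choices.

```python
import math


def turning_point(V, E, x_max=50.0):
    """Right turning point b with V(b) = E, found by bisection on [0, x_max]."""
    lo, hi = 0.0, x_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if V(mid) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def action_integral(V, E, b, n=2000):
    """Integral of sqrt(2 (E - V)) over [-b, b], using x = b sin(theta)."""
    d_theta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2.0 + (k + 0.5) * d_theta
        x = b * math.sin(theta)
        p_squared = max(2.0 * (E - V(x)), 0.0)   # guard against rounding below zero
        total += math.sqrt(p_squared) * b * math.cos(theta) * d_theta
    return total


def wkb_energy(V, n, E_lo=1e-6, E_hi=100.0):
    """Solve the condition (20.142), integral = (n + 1/2) * pi, by bisection in E."""
    target = (n + 0.5) * math.pi
    for _ in range(100):
        E = 0.5 * (E_lo + E_hi)
        if action_integral(V, E, turning_point(V, E)) < target:
            E_lo = E
        else:
            E_hi = E
    return 0.5 * (E_lo + E_hi)


def harmonic(x):
    """Harmonic potential x^2 / 2, used here as a test case."""
    return 0.5 * x * x


levels = [wkb_energy(harmonic, n) for n in range(4)]
```

For the harmonic oscillator the action integral equals $\pi E$, so the condition reproduces $E_n = n + 1/2$, which happens to be exact in this special case. For other potentials the result is only the semi-classical estimate.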

20.7 Slowly changing Hamiltonians


20.7.1 The adiabatic approximation
Consider a Hamiltonian $H(\lambda)$ which depends on some number of parameters $\lambda^i$. For simplicity,
we will assume that $H$ has a discrete spectrum. We write the instantaneous eigenstates as
\[ H(\lambda)|n(\lambda)\rangle = E_n(\lambda)|n(\lambda)\rangle. \tag{20.145} \]

Let’s place ourselves in one of these energy eigenstates. Now vary the parameters $\lambda^i$. The
adiabatic theorem states that if $\lambda^i$ are changed suitably slowly, then the system will cling to
the energy eigenstate $|n[\lambda(t)]\rangle$ that we started off in. To see this, we want to solve the
time-dependent Schrödinger equation
\[ i\frac{\partial|\psi(t)\rangle}{\partial t} = H(\lambda)|\psi(t)\rangle. \tag{20.146} \]
We expand the solution in a basis of instantaneous energy eigenstates as
\[ |\psi(t)\rangle = \sum_m a_m(t)\,e^{i\xi_m(t)}|m(\lambda)\rangle. \tag{20.147} \]
Here $a_m(t)$ are coefficients that we wish to determine, while $\xi_m(t)$ is the usual energy-dependent
phase defined as
\[ \xi_m(t) \equiv -\int_0^t dt'\, E_m(t'). \tag{20.148} \]
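To proceed, we substitute our ansatz 20.147 into the Schrödinger equation to find
\[ \sum_m\left[\dot a_m e^{i\xi_m}|m(\lambda)\rangle + a_m e^{i\xi_m}\frac{\partial|m(\lambda)\rangle}{\partial\lambda^i}\dot\lambda^i\right] = 0. \tag{20.149} \]
Taking the inner product with $\langle n(\lambda)|$ gives
\[ \dot a_n = i a_n A_i(\lambda)\dot\lambda^i - \sum_{m\neq n} a_m e^{i(\xi_m - \xi_n)}\left\langle n(\lambda)\left|\frac{\partial}{\partial\lambda^i}\right|m(\lambda)\right\rangle\dot\lambda^i, \tag{20.150} \]
where $A_i(\lambda) \equiv i\langle n(\lambda)|\partial/\partial\lambda^i|n(\lambda)\rangle$ is called the Berry connection.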


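First, we need to deal with the second term in the equation above. We will argue that this is small.
To see this, we return to our original definition of $|m(\lambda)\rangle$ and differentiate it with respect to $\lambda$:
\[ \frac{\partial H}{\partial\lambda^i}|m\rangle + H\frac{\partial|m\rangle}{\partial\lambda^i} = \frac{\partial E_m}{\partial\lambda^i}|m\rangle + E_m\frac{\partial|m\rangle}{\partial\lambda^i}. \tag{20.151} \]
Now take the inner product with $\langle n|$ where $n \neq m$ to find
\[ \left\langle n\left|\frac{\partial}{\partial\lambda^i}\right|m\right\rangle\dot\lambda^i = \left\langle n\left|\frac{\partial H}{\partial\lambda^i}\right|m\right\rangle\frac{\dot\lambda^i}{E_m - E_n}. \tag{20.152} \]
The adiabatic theorem holds when the change of parameters $\dot\lambda^i$ is much smaller than the
splitting of energy levels $E_m - E_n$. In this limit, we can ignore this term. We’re then left with
\[ \dot a_n = i a_n A_i\dot\lambda^i. \tag{20.153} \]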


This is easily solved to give
\[ a_n = a_n(t = 0)\exp\left(i\int_{\lambda(0)}^{\lambda(t)} A_i(\lambda)\,d\lambda^i\right). \tag{20.154} \]
If we start at time $t = 0$ with $a_m = \delta_{mn}$, so the system is in a definite energy eigenstate $|n\rangle$,
the system will remain in the state $|n(\lambda)\rangle$ as we vary $\lambda$. This is true as long as $\dot\lambda^i \ll \Delta E$. In
particular, this means that when we vary the parameters $\lambda$, we should be careful to avoid level
crossing, where another state becomes degenerate with the $|n(\lambda)\rangle$ that we’re sitting in. In this
case, we will have $E_m = E_n$ for some $|m\rangle$ and all bets are off: when the states separate again,
there is no simple way to tell which linear combination of the states we now sit in.

20.7.2 Berry phase


As we vary the parameters $\lambda$, the phase of the state $|n(\lambda)\rangle$ changes but there are two
contributions, rather than one. The first is the usual $e^{-iEt}$ phase that we expect for an energy eigenstate.
But there is also a second contribution to the phase due to the Berry connection. Suppose that we
vary the parameters $\lambda$ but finally put them back to their starting values. This means that
we trace out a closed path $C$ in the space of parameters. The second contribution can now be
written as
\[ e^{i\gamma} \equiv \exp\left(i\oint_C d\lambda^i\, A_i(\lambda)\right). \tag{20.155} \]

In contrast to the energy-dependent phase, this does not depend on the time taken to make the
journey in parameter space. Instead, it depends only on the path we take through parameter
space. It is known as the Berry phase.
As with the gauge potential in electromagnetism, there is a redundancy in the information
contained in the Berry connection $A_i(\lambda)$. This follows from the arbitrary choice we
made in fixing the phase of the reference states $|n(\lambda)\rangle$. We could pick a different phase for
every choice of parameters $\lambda$,
\[ |n'(\lambda)\rangle = e^{i\omega(\lambda)}|n(\lambda)\rangle \tag{20.156} \]
for any function $\omega(\lambda)$. If we compute the Berry connection arising from this new choice, we
have
\[ A_i' = A_i - \frac{\partial\omega}{\partial\lambda^i}. \tag{20.157} \]
This takes the same form as a gauge transformation.
Following the analogy with electromagnetism, we might expect that the physical information
in the Berry connection can be found in the gauge invariant field strength which, mathematically,
is known as the curvature of the connection,
\[ F_{ij}(\lambda) = \frac{\partial A_i}{\partial\lambda^j} - \frac{\partial A_j}{\partial\lambda^i}. \tag{20.158} \]
It is certainly true that $F$ contains some physical information about our quantum system, but
it is not the only gauge invariant quantity of interest. In the present context, the most natural
thing to compute is the Berry phase. Importantly, this too is independent of the arbitrariness
arising from the gauge transformation. This is because $\oint \partial_i\omega\, d\lambda^i = 0$. Indeed, we have already
seen this same expression in the context of electromagnetism: it is the Aharonov-Bohm phase.
In fact, it is possible to write the Berry phase in terms of the field strength using the
higher-dimensional version of Stokes’ theorem:
\[ e^{i\gamma} = \exp\left(i\oint_C d\lambda^i\, A_i(\lambda)\right) = \exp\left(i\int_S dS^{ij}\, F_{ij}\right), \tag{20.159} \]

where S is a two-dimensional surface in the parameter space bounded by the path C. A stan-
dard example of the application of Berry phase can be found in section 6.3.5 of Applications of
Quantum Mechanics (David Tong).
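A Berry phase can also be computed numerically from a discretized loop. The sketch below uses a textbook example that is assumed here for illustration and is not worked out in these notes: a spin-1/2 state aligned with a direction $(\theta, \phi)$, parametrized as $(\cos\frac{\theta}{2},\, e^{i\phi}\sin\frac{\theta}{2})$, carried once around a circle of fixed $\theta$. With the convention $A_i = i\langle n|\partial_i|n\rangle$ of (20.150), neighbouring overlaps satisfy $\langle n_k|n_{k+1}\rangle \approx e^{-iA_i\,d\lambda^i}$, so $\gamma$ is minus the phase of the product of overlaps around the loop.

```python
import cmath
import math


def spin_up_state(theta, phi):
    """Spin-1/2 state aligned with the direction (theta, phi), in one gauge choice."""
    return (complex(math.cos(theta / 2.0)),
            cmath.exp(1j * phi) * math.sin(theta / 2.0))


def inner(u, v):
    """Inner product <u|v> of two-component states."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def berry_phase(theta, steps=2000):
    """Berry phase for one loop in phi at fixed theta, from discretized overlaps."""
    product = 1.0 + 0.0j
    for k in range(steps):
        phi_a = 2.0 * math.pi * k / steps
        phi_b = 2.0 * math.pi * (k + 1) / steps
        product *= inner(spin_up_state(theta, phi_a), spin_up_state(theta, phi_b))
    # With A_i = i <n|d_i n>, the product of overlaps approximates exp(-i * gamma).
    return -cmath.phase(product)


gamma = berry_phase(math.pi / 3.0)
```

In this gauge $A_\phi = -\sin^2(\theta/2)$, so the expected result is $\gamma = -\pi(1 - \cos\theta)$, that is, minus half the solid angle enclosed by the loop. The product of overlaps is gauge invariant for a closed loop, which is why the discretized formula needs no smooth phase choice.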

20.7.3 The Born-Oppenheimer approximation


The Born-Oppenheimer approximation is an approach to solving quantum mechanical problems
in which there is a hierarchy of scales. The standard example is a bunch of nuclei, each
with position $R_\alpha$, mass $M_\alpha$ and charge $Z_\alpha e$, interacting with a bunch of electrons, each with
position $r_i$, mass $m$ and charge $-e$. The Hamiltonian of the system is
\[ H = -\sum_\alpha\frac{\nabla_\alpha^2}{2M_\alpha} - \sum_i\frac{\nabla_i^2}{2m} + \frac{e^2}{4\pi}\left(\sum_{ij}\frac{1}{|r_i - r_j|} + \sum_{\alpha\beta}\frac{Z_\alpha Z_\beta}{|R_\alpha - R_\beta|} - \sum_{i\alpha}\frac{Z_\alpha}{|r_i - R_\alpha|}\right). \tag{20.160} \]
The hierarchy of scales in the Hamiltonian above arises because of the mass difference be-
tween the nuclei and the electrons. The nuclei are cumbersome and slow, while the electrons
are nimble and quick. Relatedly, the nuclei wave functions are much more localised than the
electron wave functions. This motivates us to first fix the positions of the nuclei and look at
the electron Hamiltonian, and only later solve for the nuclei dynamics. This is the essence of
the Born-Oppenheimer approximation. To this end, we write

H = Hnucl + Hel , (20.161)

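where
\[ H_{\rm nucl} \equiv -\sum_\alpha\frac{\nabla_\alpha^2}{2M_\alpha} + \frac{e^2}{4\pi}\sum_{\alpha\beta}\frac{Z_\alpha Z_\beta}{|R_\alpha - R_\beta|}, \tag{20.162} \]
and
\[ H_{\rm el} \equiv -\sum_i\frac{\nabla_i^2}{2m} + \frac{e^2}{4\pi}\left(\sum_{ij}\frac{1}{|r_i - r_j|} - \sum_{i\alpha}\frac{Z_\alpha}{|r_i - R_\alpha|}\right). \tag{20.163} \]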
We then solve for the eigenstates of Hel , where the nuclei positions R are viewed as parame-
ters which, as in the adiabatic approximation, will subsequently vary slowly. For fixed R, the
instantaneous electron wave functions are

Hel ϕn (r; R) = ϵn (R)ϕn (r; R). (20.164)

In what follows, we will assume that the energy levels are non-degenerate. We then make the
ansatz for the wave function of the full system
\[ \Psi(r; R) = \sum_n \Phi_n(R)\,\phi_n(r; R). \tag{20.165} \]

We would like to write down an effective Hamiltonian which governs the nuclei wave functions
Φ(R). This is straightforward. The wave function Ψ obeys

(Hnucl + Hel )Ψ = EΨ. (20.166)

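Switching to bra-ket notation for the electron eigenstates, we can write this as
\[ \sum_n\langle\phi_m|H_{\rm nucl}\Phi_n|\phi_n\rangle + \epsilon_m(R)\Phi_m = E\Phi_m. \tag{20.167} \]
Now $H_{\rm nucl}$ contains the kinetic term $\nabla_R^2$, which acts both on the nuclei wave function and
on the electron wave function, where the nuclei positions sit as parameters. We have
\[ \langle\phi_m|\nabla_R^2\,\Phi_n|\phi_n\rangle = \sum_k\left(\delta_{mk}\nabla_R + \langle\phi_m|\nabla_R|\phi_k\rangle\right)\left(\delta_{kn}\nabla_R + \langle\phi_k|\nabla_R|\phi_n\rangle\right)\Phi_n. \tag{20.168} \]
We can show that
\[ \langle\phi_m|\nabla_R|\phi_k\rangle = \frac{\langle\phi_m|(\nabla_R H_{\rm el})|\phi_k\rangle}{\epsilon_k - \epsilon_m} \quad\text{if } m \neq k. \tag{20.169} \]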
In the spirit of the adiabatic approximation, $\langle\phi_m|\nabla_R|\phi_k\rangle$ with $m \neq k$ can be neglected as long as the
motion of the nuclei is slow compared with the splitting of the electron energy levels. In this limit, we get
a simple effective Hamiltonian for the nuclei. The Hamiltonian depends on the state $|\phi_n\rangle$ that
the electrons sit in, and is given by
\[ H_n^{\rm eff} = -\sum_\alpha\frac{1}{2M_\alpha}\left(\nabla_\alpha + iA_{n,\alpha}\right)^2 + \frac{e^2}{4\pi}\sum_{\alpha\beta}\frac{Z_\alpha Z_\beta}{|R_\alpha - R_\beta|} + \epsilon_n(R). \tag{20.170} \]
We see that the electron energy level $\epsilon_n(R)$ acts as an effective potential for the nuclei. The
Berry connection
\[ A_{n,\alpha} = -i\langle\phi_n|\nabla_{R_\alpha}|\phi_n\rangle \tag{20.171} \]
acts as an effective vector potential, and hence an effective magnetic field, in which the nuclei move.
The idea of the Born-Oppenheimer approximation is that we can first solve for the fast-moving
degrees of freedom, to find an effective action for the slow-moving degrees of freedom. We
sometimes say that we have “integrated out” the electron degrees of freedom, language which
really comes from the path integral formulation of quantum mechanics. This is a very powerful
idea, and one which becomes increasingly important as we progress in theoretical physics.
Indeed, this simple idea underpins the Wilsonian renormalization group which we will meet
in later chapters.
Chapter 21
Many Body Problem

21.1 Identical particles


If two particles are indistinguishable, their exchange must not change physical quantities, so
we have
\[ |\cdots\psi_j\cdots\psi_i\cdots\rangle = E_{ij}|\cdots\psi_i\cdots\psi_j\cdots\rangle = e^{i\theta}|\cdots\psi_i\cdots\psi_j\cdots\rangle, \tag{21.1} \]
where $E_{ij}$ is the operator that exchanges particles $i$ and $j$, and $\psi_i$ denotes the quantum numbers
describing the state of particle $i$. In the coordinate representation, it can be expressed as
\[ \Psi(\cdots x_j\cdots x_i\cdots) = e^{i\theta}\,\Psi(\cdots x_i\cdots x_j\cdots). \tag{21.2} \]

In three-dimensional space, the value of $e^{i\theta}$ can only be $\pm 1$. If the spin of the particle is
integer, the phase factor must be 1 and the particle is called a boson. If the spin of the particle
is half-integer, the phase factor must be $-1$ and the particle is called a fermion. This is called the
spin-statistics theorem and can only be proved by relativistic quantum field theory. In
two-dimensional space, the phase $e^{i\theta}$ can be anything, and the particles that obey quantum statistics
of this sort are called anyons. A brief introduction can be found in sections 12.1 and 12.2 of
Quantum Field Theory and the Standard Model (Matthew D. Schwartz).
Not all vectors in the space $H_1\otimes\cdots\otimes H_n$ are physical states. Physical states must be eigenvectors
of all $E_{ij}$ with eigenvalue 1 ($-1$) for bosons (fermions). The space composed of physical
states is called Fock space. For example, suppose there are three particles in different states
$a, b, c$. $|abc\rangle$ is not a physical state. The physical state for bosons is
\[ \frac{1}{\sqrt{3!}}\left(|abc\rangle + |acb\rangle + |bca\rangle + |bac\rangle + |cab\rangle + |cba\rangle\right). \tag{21.3} \]
The physical state for fermions is
\[ \frac{1}{\sqrt{3!}}\left(|abc\rangle - |acb\rangle + |bca\rangle - |bac\rangle + |cab\rangle - |cba\rangle\right). \tag{21.4} \]
In general, for $N$ particles filling $N$ distinct states, there are $N!$ states to start with, but there
is only one totally symmetric state and one totally anti-symmetric state, and the remaining
$N! - 2$ states are thrown out. Therefore quantum statistics reduces the size of the Hilbert space
quite dramatically. Furthermore, not all Hermitian operators are physical observables. A
nonphysical operator is one that takes a state in the physical subspace of the Hilbert space (one
that satisfies the right symmetry under exchange), and maps it into a nonphysical state (one
that does not have the right symmetry). An example is the operator $X_1$. We might call this the
operator corresponding to the measurement of the position of particle 1. The problem with
this operator from a physical standpoint is that you cannot measure the position of particle 1.
You can select a region of space, and ask whether there is a particle in that region. But if you
find one, you cannot say whether it is particle 1 or particle 2, since they are indistinguishable.
A physical observable $O$ must obey
\[ [E_{ij}, O] = 0 \quad\text{for all } i, j. \tag{21.5} \]

Figure 21.1: The path integral for three identical fermions.

The generalization of the path integral formulation of quantum mechanics to the $N$-particle case
is
\[ \langle x_{1f}\cdots x_{Nf}, t_f|x_{1i}\cdots x_{Ni}, t_i\rangle = \int \mathcal{D}x_1(t)\cdots x_N(t)\; e^{i\int_{t_i}^{t_f} dt\, L(t)}. \tag{21.6} \]
Here, particle 1 at the initial position $x_{1i}$ moves to the final position $x_{1f}$, particle 2 moves from
the initial position $x_{2i}$ to $x_{2f}$, etc., and we sum over all possible paths. When the particles
are identical, however, we need to introduce proper (anti-)symmetrization of the state. For
fermions, we introduce the anti-symmetrized position bra
\[ \langle[x_1\cdots x_N]| = \frac{1}{\sqrt{N!}}\sum_\sigma(-1)^\sigma\langle x_{\sigma(1)}\cdots x_{\sigma(N)}|. \tag{21.7} \]
Notice that the Lagrangian for identical particles must be invariant under the exchange of
particles. We can prove that
\[ \langle[x_{1f}\cdots x_{Nf}], t_f|[x_{1i}\cdots x_{Ni}], t_i\rangle = \sum_\sigma(-1)^\sigma\langle x_{\sigma(1)f}\cdots x_{\sigma(N)f}, t_f|x_{1i}\cdots x_{Ni}, t_i\rangle. \tag{21.8} \]
In other words, the path integral sums over all possible paths, allowing the positions at the final
time slice to be interchanged in all possible ways relative to the positions at the initial time
slice. A diagrammatic representation of the path integral is shown in Figure 21.1. The case for
bosons can be obtained easily by dropping all minus signs.

21.2 Non-relativistic quantum field theory


21.2.1 Motivation and formulation of quantum field theory
The multi-body Schrödinger wave function has some limitations. Firstly, when the number
of particles is large, it becomes cumbersome. Secondly, it is incapable of describing processes
where the number of particles changes.
The aim of quantum field theory is to come up with a formalism which is completely equivalent
to multi-body Schrödinger equations but just better: it allows you to consider a variable
number of particles all within the same framework and can even describe the change in the
number of particles. It also gives totally symmetric or anti-symmetric multi-body wave functions
automatically, and it allows a systematic way of organizing perturbation theory in terms
of Feynman diagrams. It is particularly suited to multi-body problems.
In quantum mechanics, you start with classical particle Hamiltonian mechanics, with no concept
of wave or interference. After quantizing it, we introduce the Schrödinger wave function,
and the concepts of waves and their interference emerge. In quantum field theory, you start
with a classical wave equation, with no concept of particle. After quantizing it, we find a particle
interpretation of the excitations of the system.
Let us consider a classical field equation
\[ \left(i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m}\right)\psi(x, t) = 0. \tag{21.9} \]
A solution to this field equation is that of a plane wave
\[ \psi(x, t) = e^{ik\cdot x - i\omega t} \quad\text{where}\quad \omega = \frac{k^2}{2m}. \tag{21.10} \]
This classical field equation can be derived from the action
\[ S = \int dt\, dx\, L \quad\text{where}\quad L = \psi^*\left(i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m}\right)\psi. \tag{21.11} \]
We can even add a non-linear term in the action, for instance
\[ L = \psi^*\left(i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m}\right)\psi - \frac{1}{2}\lambda\,\psi^{*2}\psi^2. \tag{21.12} \]
The least action principle gives a non-linear field equation
\[ \left(i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m} - \lambda\psi^*\psi\right)\psi(x, t) = 0. \tag{21.13} \]
Now we quantize the Schrödinger field by the canonical quantization method. A more formal
discussion of the motivation for canonical quantization will be given in relativistic quantum
field theory. The canonically conjugate momentum of $\psi(x)$ is
\[ \pi(x) = \frac{\partial L}{\partial\dot\psi(x)} = i\psi^\dagger(x). \tag{21.14} \]

We now introduce the canonical commutation relation

[ψ(x), π(y)] = iδ(x − y). (21.15)

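It is equivalent to
\[ \left[\psi(x), \psi^\dagger(y)\right] = \delta(x - y). \tag{21.16} \]
We can regard $\psi(x)$ as the annihilation operator and $\psi^\dagger(x)$ as the creation operator of a boson at position
$x$. The Hamiltonian of the system is
\[ H = \int dx\left(-\psi^\dagger\frac{\nabla^2}{2m}\psi + \frac{1}{2}\lambda\,\psi^{\dagger 2}\psi^2\right). \tag{21.17} \]
We can figure out that
\[ \left[H, \psi^\dagger\right] = -\frac{\nabla^2}{2m}\psi^\dagger + \lambda\,\psi^{\dagger 2}\psi. \tag{21.18} \]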
21.2.2 Particles in quantum field theory


We define the vacuum |0⟩ which is annihilated by the annihilation operator

ψ(x) |0⟩ = 0, (21.19)

and construct the Fock space by
\[ |x_1\cdots x_N\rangle = \frac{1}{\sqrt{N!}}\,\psi^\dagger(x_1)\cdots\psi^\dagger(x_N)|0\rangle. \tag{21.20} \]
The state $|x_1\cdots x_N\rangle$ is an $N$-particle state of identical bosons in the position eigenstate at
$x_1, \ldots, x_N$.

Let us look at the one-particle state

|x⟩ = ψ † (x) |0⟩ . (21.21)

We can derive that


⟨y|x⟩ = δ(x − y). (21.22)
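The derivation is one line once the commutator (21.16) and the vacuum condition (21.19) are used. The following sketch assumes the vacuum is normalized, $\langle 0|0\rangle = 1$, which the notes do not state explicitly:

```latex
\begin{aligned}
\langle y|x\rangle
  &= \langle 0|\psi(y)\,\psi^\dagger(x)|0\rangle \\
  &= \langle 0|\bigl[\psi(y),\psi^\dagger(x)\bigr]|0\rangle
     + \langle 0|\psi^\dagger(x)\,\psi(y)|0\rangle \\
  &= \delta(y - x)\,\langle 0|0\rangle + 0
   = \delta(x - y).
\end{aligned}
```

The second term vanishes because $\psi(y)|0\rangle = 0$.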
Therefore, this state is normalized in the same way as the one-particle position eigenstate in
quantum mechanics. We define a general one-particle state in the quantum field theory as
\[ |\Psi(t)\rangle \equiv \int dx\, \Psi(x, t)\,\psi^\dagger(x)|0\rangle. \tag{21.23} \]
$\Psi(x, t)$ is a c-number function which determines a particular superposition of the position
eigenstates $|x\rangle$ and corresponds to the Schrödinger wave function in particle quantum
mechanics. The Schrödinger equation in quantum field theory is
\[ i\frac{\partial|\Psi(t)\rangle}{\partial t} = H|\Psi(t)\rangle. \tag{21.24} \]

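Since
\[ H|\Psi(t)\rangle = \int dx\, \Psi(x, t)\left[H, \psi^\dagger(x)\right]|0\rangle = \int dx\left(-\frac{\nabla^2\Psi(x, t)}{2m}\right)|x\rangle, \tag{21.25} \]
equation 21.24 reduces to
\[ i\frac{\partial\Psi(x, t)}{\partial t} = -\frac{\nabla^2\Psi(x, t)}{2m}, \tag{21.26} \]
which is exactly the Schrödinger equation for one isolated particle in the coordinate representation
of particle quantum mechanics.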

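Let us next study the two-particle state
$$|x_1 x_2\rangle = \frac{1}{\sqrt{2}}\,\psi^\dagger(x_1)\psi^\dagger(x_2)|0\rangle. \tag{21.27}$$
We can derive that
$$\langle x_1 x_2|y_1 y_2\rangle = \frac{1}{2}\left[\delta(x_1-y_1)\delta(x_2-y_2) + \delta(x_1-y_2)\delta(x_2-y_1)\right]. \tag{21.28}$$
This normalization suggests that we are dealing with a two-particle state of identical particles, because the norm is non-vanishing when x1 = y1 and x2 = y2, but also when x1 = y2 and x2 = y1. A general two-particle state is constructed by
$$|\Psi(t)\rangle \equiv \frac{1}{\sqrt{2}}\int dx_1\,dx_2\,\Psi(x_1,x_2,t)\,\psi^\dagger(x_1)\psi^\dagger(x_2)|0\rangle. \tag{21.29}$$
Because [ψ†(x1), ψ†(x2)] = 0, the operator product in the integrand is symmetric under the interchange of x1 and x2, so only the symmetric part of Ψ contributes and we can take Ψ(x1, x2, t) = Ψ(x2, x1, t). The symmetry under the exchange suggests that we are dealing with identical bosons. Since
$$\begin{aligned} H|\Psi(t)\rangle &= \frac{1}{\sqrt{2}}\int dx_1\,dx_2\,\Psi(x_1,x_2,t)\left\{[H,\psi^\dagger(x_1)]\,\psi^\dagger(x_2) + \psi^\dagger(x_1)\,[H,\psi^\dagger(x_2)]\right\}|0\rangle \\ &= \frac{1}{\sqrt{2}}\int dx_1\,dx_2 \left[\left(-\frac{\nabla_1^2}{2m} - \frac{\nabla_2^2}{2m} + \lambda\delta(x_1-x_2)\right)\Psi(x_1,x_2,t)\right]\psi^\dagger(x_1)\psi^\dagger(x_2)|0\rangle, \end{aligned} \tag{21.30}$$
equation 21.24 reduces to
$$i\frac{\partial\Psi(x_1,x_2,t)}{\partial t} = \left(-\frac{\nabla_1^2}{2m} - \frac{\nabla_2^2}{2m} + \lambda\delta(x_1-x_2)\right)\Psi(x_1,x_2,t), \tag{21.31}$$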

which is the Schrödinger equation for a two-particle wave function with delta potential as the
interaction between them. Therefore, the Fock space with two creation operators correctly
describes the two-particle quantum mechanics.

If we want a general interaction potential between them, the action must be modified to
$$S = \int dt \left[ \int dx\, \psi^\dagger(x)\left( i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m} \right)\psi(x) - \frac{1}{2}\int dx\,dy\, \psi^\dagger(x)\psi^\dagger(y)V(x-y)\psi(x)\psi(y) \right]. \tag{21.32}$$
The corresponding Hamiltonian is
$$H = \int dx\, \psi^\dagger\,\frac{-\nabla^2}{2m}\,\psi + \frac{1}{2}\int dx\,dy\, \psi^\dagger(x)\psi^\dagger(y)V(x-y)\psi(x)\psi(y). \tag{21.33}$$
Following exactly the same steps as above, we can derive that
$$i\frac{\partial\Psi(x_1,x_2,t)}{\partial t} = \left(-\frac{\nabla_1^2}{2m} - \frac{\nabla_2^2}{2m} + V(x_1-x_2)\right)\Psi(x_1,x_2,t). \tag{21.34}$$
In general, an n-particle state can be constructed as
$$|\Psi(t)\rangle = \frac{1}{\sqrt{n!}}\int dx_1\cdots dx_n\,\Psi(x_1,\cdots,x_n,t)\,\psi^\dagger(x_1)\cdots\psi^\dagger(x_n)|0\rangle. \tag{21.35}$$
The Schrödinger equation reduces to
$$i\frac{\partial\Psi(x_1,\cdots,x_n,t)}{\partial t} = \left[-\sum_i \frac{\nabla_i^2}{2m} + \sum_{i<j} V(x_i-x_j)\right]\Psi(x_1,\cdots,x_n,t). \tag{21.36}$$

We may also be interested in a system in a background potential. A good example is the electrons in an atom, all of which move in the background Coulomb potential due to the nucleus. In this case, the correct field-theory Hamiltonian is
$$H = \int dx\, \psi^\dagger(x)\left(\frac{-\nabla^2}{2m} - \frac{Ze^2}{4\pi|x|}\right)\psi(x) + \frac{1}{2}\int dx\,dy\, \psi^\dagger(x)\psi^\dagger(y)\,\frac{e^2}{4\pi|x-y|}\,\psi(x)\psi(y). \tag{21.37}$$

The total number of particles is an eigenvalue of the operator
$$N \equiv \int dx\, \psi^\dagger(x)\psi(x). \tag{21.38}$$
It follows that
$$[N, \psi] = -\psi, \qquad [N, \psi^\dagger] = \psi^\dagger. \tag{21.39}$$
Thus we can derive that
$$N|x_1,\cdots,x_n\rangle = n\,|x_1,\cdots,x_n\rangle. \tag{21.40}$$

21.2.3 Momentum space


Creation and annihilation operators in the momentum space are defined as
$$a(p) \equiv (2\pi)^{-3/2}\int dx\, \psi(x)e^{-ip\cdot x}, \qquad a^\dagger(p) \equiv (2\pi)^{-3/2}\int dx\, \psi^\dagger(x)e^{ip\cdot x}. \tag{21.41}$$
It follows that
$$[a(p), a^\dagger(q)] = \delta(p-q), \qquad [a(p), a(q)] = [a^\dagger(p), a^\dagger(q)] = 0. \tag{21.42}$$

We can rewrite the Hamiltonian in the momentum space. The free part of the Hamiltonian is
$$H_0 \equiv \int dx\, \psi^\dagger\,\frac{-\nabla^2}{2m}\,\psi = \int dp\, \frac{p^2}{2m}\,a^\dagger(p)a(p). \tag{21.43}$$
It simply counts the number of particles in a given momentum state and assigns the energy p²/2m accordingly. The interaction part of the Hamiltonian is
$$\begin{aligned} \Delta H &\equiv \frac{1}{2}\int dx\,dy\, \psi^\dagger(x)\psi^\dagger(y)V(x-y)\psi(x)\psi(y) \\ &= \frac{1}{2}\int dp\,dq\,dp'\,dq'\, V(p-q)\,a^\dagger(p)a^\dagger(p')a(q)a(q')\,\delta(p+p'-q-q'), \end{aligned} \tag{21.44}$$
where
$$V(p-q) = \frac{1}{(2\pi)^3}\int dx\, V(x)e^{-i(p-q)\cdot x}. \tag{21.45}$$
The delta function represents momentum conservation in the scattering process due to the potential V. The potential term of the Hamiltonian causes scattering by annihilating two particles in momentum states q, q′ and creating them in different momentum states p, p′ with the amplitude V(p − q).

21.2.4 Fermions
We have seen that the quantized Schrödinger field gives multi-body states of identical bosons. For fermions, we should use anti-commutation relations rather than commutation relations:
$$\{\psi(x), \psi^\dagger(y)\} = \delta(x-y), \qquad \{\psi(x), \psi(y)\} = \{\psi^\dagger(x), \psi^\dagger(y)\} = 0. \tag{21.46}$$
One noteworthy point is that ψ†(x)ψ†(x) = {ψ†(x), ψ†(x)}/2 = 0. What this means is that one cannot create two particles at the same position, an expression of Pauli’s exclusion principle for fermions.
Consider a two-particle state
$$|\Psi(t)\rangle = \frac{1}{\sqrt{2}}\int dx_1\,dx_2\,\Psi(x_1,x_2,t)\,\psi^\dagger(x_1)\psi^\dagger(x_2)|0\rangle. \tag{21.47}$$
From the anti-commutation relation {ψ†(x), ψ†(y)} = 0, we have
$$\Psi(x_1,x_2) = -\Psi(x_2,x_1). \tag{21.48}$$

Such a state indeed describes identical fermions.


Similarly to the commutator identity [A, BC] = [A, B]C + B[A, C], we find
$$[A, BC] = \{A, B\}C - B\{A, C\}, \qquad [AB, C] = A\{B, C\} - \{A, C\}B. \tag{21.49}$$
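The two identities in (21.49) hold for arbitrary operators, so they can be spot-checked with random matrices. The following is a minimal sketch in plain Python; the helper names (`matmul`, `comm`, `anti`, and so on) are illustrative and not part of the notes.

```python
import random

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B, sign=1):
    """Entry-wise A + sign * B."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return add(matmul(A, B), matmul(B, A), -1)

def anti(A, B):
    """Anti-commutator {A, B} = AB + BA."""
    return add(matmul(A, B), matmul(B, A), +1)

def rand_matrix(n, rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

rng = random.Random(0)
A, B, C = (rand_matrix(3, rng) for _ in range(3))

# [A, BC] = {A, B} C - B {A, C}
err1 = max_abs_diff(comm(A, matmul(B, C)),
                    add(matmul(anti(A, B), C), matmul(B, anti(A, C)), -1))
# [AB, C] = A {B, C} - {A, C} B
err2 = max_abs_diff(comm(matmul(A, B), C),
                    add(matmul(A, anti(B, C)), matmul(anti(A, C), B), -1))
print(err1, err2)  # both should be at rounding-error level
```

Both residuals vanish up to floating-point rounding, because each identity is obtained by adding and subtracting a single term (BAC in the first, ACB in the second).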

With the help of 21.49, we can derive that
$$[H, \psi^\dagger(x)] = -\frac{\nabla^2}{2m}\psi^\dagger(x) + \int dy\, \psi^\dagger(x)\psi^\dagger(y)V(x-y)\psi(y). \tag{21.50}$$
It is the same commutation relation as that of bosons! The Schrödinger equation of Ψ(x1, x2, t) we obtained for fermions will be the same as that for bosons.
Chapter 22
Scattering Theory

22.1 Scattering in one-dimension


The basic idea behind scattering theory is simple: there is an object that you want to under-
stand. So you throw something at it. By analysing how that something bounces off, you can
glean information about the object itself.
We start by considering a quantum particle moving along a line. The object that we want to understand is some potential V(x). Importantly, the potential is localised to some region of space, which means that V(x) → 0 as x → ±∞. A quantum particle moving along the line is governed by the Schrödinger equation
$$-\frac{1}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi = E\psi. \tag{22.1}$$
For any potential, there are essentially two different kinds of states.
• Bound States are states that are localised in some region of space. The wave functions are normalisable and have profiles that drop off exponentially far from the potential:
$$\psi(x) \to e^{-\lambda|x|} \quad \text{as } |x| \to \infty. \tag{22.2}$$
Because the potential vanishes in the asymptotic region, the Schrödinger equation relates the asymptotic fall-off to the energy of the state,
$$E = -\frac{\lambda^2}{2m}. \tag{22.3}$$
In particular, bound states have E < 0. Indeed, it is this property which ensures that the particle is trapped within the potential and cannot escape to infinity. Bound states are rather special. In the absence of a potential, a solution which decays exponentially to the left will grow exponentially to the far right. But, for the state to be normalisable, the potential has to turn this behaviour around, so that the wave function decreases at both x → −∞ and x → +∞. This will only happen for specific values of λ. Ultimately, this is why the spectrum of bound states is discrete.
• Scattering states are not localised in space and the wave functions are not normalisable. Instead, asymptotically, far from the potential, scattering states take the form of plane waves. In one dimension, there are two possibilities,
$$\text{Right moving: } \psi \sim e^{ikx}, \qquad \text{Left moving: } \psi \sim e^{-ikx}, \qquad \text{where } k > 0. \tag{22.4}$$
Solving the Schrödinger equation in the asymptotic region gives the energy
$$E = \frac{k^2}{2m}. \tag{22.5}$$
Scattering states have E > 0. Note that nothing special has to happen to find scattering
solutions. We expect to find solutions for any choice of k.

22.1.1 Reflection and transmission amplitudes


When solving the Schrödinger equation for the scattering states, we expect that there are two independent solutions for each value of k. Suppose that we throw the particle in from the left. When it hits the potential, it can bounce back, or it can pass straight through. Mathematically, this means that we are looking for a solution which asymptotically takes the form
$$\psi_R(x) \sim \begin{cases} e^{ikx} + r e^{-ikx}, & x \to -\infty \\ t e^{ikx}, & x \to +\infty \end{cases}. \tag{22.6}$$

The coefficient r is called the reflection amplitude. The coefficient t is called the transmission amplitude. The probabilities for reflection R and transmission T are given by the usual quantum mechanics rule:
$$R = |r|^2, \qquad T = |t|^2. \tag{22.7}$$
Given a solution ψ(x) to the Schrödinger equation, we can construct a conserved probability current
$$J(x) = \frac{-i}{2m}\left(\psi^*\frac{d\psi}{dx} - \psi\frac{d\psi^*}{dx}\right), \tag{22.8}$$
which obeys dJ/dx = 0. This means that J(x) is constant. For our scattering solution ψR, the probability current as x → −∞ is given by
$$J(x) = \frac{k}{m}\left(1 - |r|^2\right). \tag{22.9}$$
Meanwhile, as x → +∞, we have
$$J(x) = \frac{k}{m}|t|^2. \tag{22.10}$$
Equating the two gives R + T = 1.
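Both asymptotic currents are special cases of a single short computation. As a sketch, take a generic superposition ψ = A e^{ikx} + B e^{−ikx} with constant coefficients A and B (symbols introduced here only for illustration). Then
$$\psi^*\frac{d\psi}{dx} = ik\left(|A|^2 - |B|^2 - A^*B\,e^{-2ikx} + AB^*e^{2ikx}\right),$$
and ψ dψ*/dx is its complex conjugate, so the oscillating cross terms cancel in the difference:
$$J = \frac{-i}{2m}\cdot 2ik\left(|A|^2 - |B|^2\right) = \frac{k}{m}\left(|A|^2 - |B|^2\right).$$
Setting (A, B) = (1, r) gives the current on the left, and (A, B) = (t, 0) gives the current on the right.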

Now we throw the particle in from the right. We are now looking for solutions which take the asymptotic form
$$\psi_L(x) \sim \begin{cases} t' e^{-ikx}, & x \to -\infty \\ e^{-ikx} + r' e^{ikx}, & x \to +\infty \end{cases}. \tag{22.11}$$
Because the potential V(x) is a real function, if ψR is a solution, so will be ψR*. By linearity, (ψR* − r*ψR)/t* is also a solution, with asymptotic behavior
$$\frac{\psi_R^*(x) - r^*\psi_R(x)}{t^*} \sim \begin{cases} t e^{-ikx}, & x \to -\infty \\ e^{-ikx} - \dfrac{r^* t}{t^*}\,e^{ikx}, & x \to +\infty \end{cases}. \tag{22.12}$$
Comparing 22.12 with 22.11, we can deduce that
$$t' = t, \qquad r' = -\frac{r^* t}{t^*}. \tag{22.13}$$

Example: Let us compute r and t for a simple potential, given by
$$V(x) = \begin{cases} -V_0, & -a/2 < x < a/2 \\ 0, & |x| > a/2 \end{cases}, \tag{22.14}$$
with V0 > 0. We can figure out that
$$r = \frac{(k^2-q^2)\sin(qa)\,e^{-ika}}{(q^2+k^2)\sin(qa) + 2iqk\cos(qa)}, \qquad t = \frac{2iqk\,e^{-ika}}{(q^2+k^2)\sin(qa) + 2iqk\cos(qa)}, \tag{22.15}$$
where q² = 2mV0 + k². In the limit k → 0, we have r → −1 and t → 0. This means that if you throw the particle very softly, it doesn’t make it through the potential; it’s guaranteed to bounce back. Conversely, in the limit k → ∞, we have r → 0. By unitarity we must have |t| → 1 and the particle is guaranteed to pass through. This is what you might expect; if you throw the particle hard enough, it barely notices that the potential is there.
We can repeat the calculation above for scattering from the right. In fact, for our pothole potential, the result is exactly the same and we have r = r′. This arises because V(x) = V(−x), so it’s no surprise that scattering from the left and right are the same.
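The amplitudes in (22.15) are easy to evaluate numerically, which gives a quick check of R + T = 1, of the limits k → 0 and k → ∞, and of the relation r′ = −r* t/t* with r′ = r. The sketch below uses units with ħ = 1 as in the text; the function name and the default m = 1 are assumptions made for the example.

```python
import cmath
import math

def square_well_amplitudes(k, V0, a, m=1.0):
    """Return (r, t) from (22.15) for a well of depth V0 > 0 and width a."""
    q = math.sqrt(2.0 * m * V0 + k * k)          # wave number inside the well
    denom = (q * q + k * k) * math.sin(q * a) + 2j * q * k * math.cos(q * a)
    phase = cmath.exp(-1j * k * a)
    r = (k * k - q * q) * math.sin(q * a) * phase / denom
    t = 2j * q * k * phase / denom
    return r, t

for k in (1e-6, 0.5, 2.0, 1e3):
    r, t = square_well_amplitudes(k, V0=1.0, a=1.0)
    R, T = abs(r) ** 2, abs(t) ** 2
    print(f"k = {k:g}:  R = {R:.6f}  T = {T:.6f}  R + T = {R + T:.6f}")
```

For generic parameters the slow particle is almost entirely reflected and the fast one almost entirely transmitted. The exception is sin(qa) = 0, where r vanishes identically and the well is transparent.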

22.1.2 S-matrix
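We have two ingoing asymptotic wave functions, one from the left and one from the right,
$$I_R(x) = e^{ikx},\ x \to -\infty, \qquad I_L(x) = e^{-ikx},\ x \to +\infty. \tag{22.16}$$
Similarly, there are two outgoing asymptotic wave functions,
$$O_R(x) = e^{ikx},\ x \to +\infty, \qquad O_L(x) = e^{-ikx},\ x \to -\infty. \tag{22.17}$$
The two asymptotic solutions ψR and ψL can then be written as
$$\begin{pmatrix} \psi_R \\ \psi_L \end{pmatrix} = \begin{pmatrix} I_R \\ I_L \end{pmatrix} + S \begin{pmatrix} O_R \\ O_L \end{pmatrix}, \qquad S \equiv \begin{pmatrix} t & r \\ r' & t' \end{pmatrix}. \tag{22.18}$$
We can show that SS† = I.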


For symmetric potentials, with V (x) = V (−x), we have [P, H] = 0 which means that eigen-
states of the Hamiltonian can be chosen so that they are also eigenstates of parity. Thus eigen-
states of the Hamiltonian are either even functions or odd functions. Scattering eigenstates
ψR and ψL are neither odd nor even. Instead, for a symmetric potential, they are related by
ψL (x) = ψR (−x). If we want to work with the parity eigenstates, we take

ψ+ (x) = ψL (x) + ψR (x), ψ− (x) = ψL (x) − ψR (x), (22.19)



which obey P ψ± (x) = ±ψ± (x).

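We can also think about the S-matrix using our new basis of states. The asymptotic ingoing modes are even and odd functions, given at |x| → ∞ by
$$I_+(x) = e^{-ik|x|}, \qquad I_-(x) = \operatorname{sign}(x)\,e^{-ik|x|}. \tag{22.20}$$
The two asymptotic outgoing modes are
$$O_+(x) = e^{ik|x|}, \qquad O_-(x) = -\operatorname{sign}(x)\,e^{ik|x|}. \tag{22.21}$$
These are related to our earlier modes by a simple change of basis
$$\begin{pmatrix} I_+ \\ I_- \end{pmatrix} = M \begin{pmatrix} I_R \\ I_L \end{pmatrix}, \qquad \begin{pmatrix} O_+ \\ O_- \end{pmatrix} = M \begin{pmatrix} O_R \\ O_L \end{pmatrix}, \qquad M = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. \tag{22.22}$$
The S-matrix with respect to this parity basis is defined through
$$\begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix} = \begin{pmatrix} I_+ \\ I_- \end{pmatrix} + S^P \begin{pmatrix} O_+ \\ O_- \end{pmatrix}. \tag{22.23}$$
It follows that
$$S^P = \begin{pmatrix} t+r & 0 \\ 0 & t-r \end{pmatrix} \equiv \begin{pmatrix} e^{2i\delta_+(k)} & 0 \\ 0 & e^{2i\delta_-(k)} \end{pmatrix}. \tag{22.24}$$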
For scattering off a symmetric potential, all the information is encoded in two momentum-
dependent phase shifts, which tell us how the phases of the outgoing waves are changed with
respect to the ingoing waves.

22.1.3 Bound states


We now look at the bound states, which have energy E < 0 and are localised inside
the potential. It turns out that the information about these bound states can be extracted
from the S-matrix, which we constructed purely from knowledge of the scattering states. We
take our scattering solutions, which depend on momentum k ∈ R, and extend them to the
complex momentum plane. First notice that the solutions with k ∈ C still obey our original
Schrödinger equation since, at no point in any of our derivation did we assume that k ∈ R.
The only difficulty comes when we look at how the wave functions behave asymptotically. In
particular, any putative solution will, in general, diverge exponentially as x → +∞ or x →
−∞, rendering the wave function non-normalisable. However, there are certain solutions that
survive.

For simplicity, assume that we have a symmetric potential. This means that there is no mixing between the parity-even and parity-odd wave functions. We start by looking at the parity-even states. The general solution takes the form
$$\psi_+(x) = I_+(x) + S_{++}O_+(x) = \begin{cases} e^{ikx} + S_{++}e^{-ikx}, & x \to -\infty \\ e^{-ikx} + S_{++}e^{ikx}, & x \to +\infty \end{cases}. \tag{22.25}$$
Suppose that we make k pure imaginary and write k = iκ with κ > 0. Then we get
$$\psi_+(x) = \begin{cases} e^{-\kappa x} + S_{++}e^{\kappa x}, & x \to -\infty \\ e^{\kappa x} + S_{++}e^{-\kappa x}, & x \to +\infty \end{cases}. \tag{22.26}$$
Both terms proportional to S++ decay asymptotically, but the other terms diverge. The wave function above is normalisable whenever we can find a κ such that S++(k = iκ) → ∞. So poles in the complex momentum plane that lie on the positive imaginary axis correspond to bound states. This information also tells us the energy of the bound state,
$$E = -\frac{\kappa^2}{2m}. \tag{22.27}$$
We could also have set k = −iκ, with κ > 0. In this case, it is the terms proportional to S++ which diverge and the wave function is normalisable only if S++(k = −iκ) = 0. However, since S++ is a phase, this is guaranteed to be true whenever S++(k = iκ) has a pole, and simply gives us back the solution above.

Finally, exactly the same arguments hold for parity-odd wave functions. There is a bound state
whenever S−− (k) has a pole at k = iκ with κ > 0.

Example: We can illustrate this with the example of the square well of depth V0 and width a. Using the amplitudes 22.15, we have
$$S_{++} = r + t = -e^{-ika}\,\frac{q\tan(qa/2) - ik}{q\tan(qa/2) + ik}, \tag{22.28}$$
where q² = 2mV0 + k². Setting k = iκ, we see that this has a pole when
$$\kappa = q\tan\left(\frac{qa}{2}\right) \quad \text{with } \kappa^2 + q^2 = 2mV_0. \tag{22.29}$$
These are the usual equations that you have to solve when finding parity-even bound states in a square well. Similarly, if we look at the parity-odd wave functions, we have
$$S_{--} = t - r = e^{-ika}\,\frac{q + ik\tan(qa/2)}{q - ik\tan(qa/2)}, \tag{22.30}$$
which has a pole at k = iκ when
$$\kappa = -q\cot\left(\frac{qa}{2}\right) \quad \text{with } \kappa^2 + q^2 = 2mV_0. \tag{22.31}$$
This too reproduces the equations that we found when searching for bound states in a square well.
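The pole condition for the parity-even states can be solved numerically and cross-checked against the common denominator of r and t in (22.15), which must vanish at k = iκ. The following is a minimal sketch using bisection; the function names, the choice m = 1, and the restriction to the lowest even state are assumptions of the example.

```python
import cmath
import math

def even_bound_state(V0, a, m=1.0):
    """Lowest parity-even bound state: kappa = q tan(qa/2), kappa^2 + q^2 = 2 m V0."""
    K = math.sqrt(2.0 * m * V0)

    def g(q):
        return q * math.tan(q * a / 2.0) - math.sqrt(K * K - q * q)

    # g is negative near q = 0 and positive just below min(K, pi/a).
    lo = 1e-12
    hi = min(K, math.pi / a) * (1.0 - 1e-12)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    q = 0.5 * (lo + hi)
    kappa = math.sqrt(K * K - q * q)
    return kappa, q

def amplitude_denominator(k, V0, a, m=1.0):
    """Common denominator of r and t in (22.15), continued to complex k."""
    q = cmath.sqrt(2.0 * m * V0 + k * k)
    return (q * q + k * k) * cmath.sin(q * a) + 2j * q * k * cmath.cos(q * a)

kappa, q = even_bound_state(V0=1.0, a=2.0)
energy = -kappa ** 2 / 2.0                     # E = -kappa^2 / 2m with m = 1
pole_residual = abs(amplitude_denominator(1j * kappa, V0=1.0, a=2.0))
print(f"kappa = {kappa:.6f}, q = {q:.6f}, E = {energy:.6f}, residual = {pole_residual:.2e}")
```

The residual is at rounding-error level, confirming that the bound-state momentum found from the textbook matching condition sits exactly on a pole of the scattering amplitudes.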

22.1.4 Resonances
Let us consider the example shown in Figure 22.1. On the one hand, we know that there can be no bound states in such a trap because they will have E > 0. Any particle that we place in the trap will ultimately tunnel out. On the other hand, if the walls of the trap are very large then we might expect that the particle stays there for a long time before it eventually escapes. In this situation, we talk of a resonance. These are also referred to as unstable or metastable states.

Figure 22.1: Example of trap potential.

Suppose that S++ has a pole that lies on the complex momentum plane at position k = k0 − iγ. We note that the energy is also complex,
$$E = E_0 - i\frac{\Gamma}{2} \quad \text{where } E_0 \equiv \frac{k_0^2 - \gamma^2}{2m}, \quad \Gamma \equiv \frac{2\gamma k_0}{m}. \tag{22.32}$$
Recall that the time dependence of the wave function is given by
$$e^{-iEt} = e^{-iE_0 t}\,e^{-\Gamma t/2}. \tag{22.33}$$
For γ > 0, the overall form of the wave function decays exponentially with time. This is the characteristic behaviour of unstable states. A wave function that is initially supported inside the trap will be very small there at times much larger than τ = 1/Γ. Here τ is called the lifetime of the state, while Γ is usually referred to as the width of the state. Including the time dependence, when S++ → ∞, the solution takes the asymptotic form
$$\psi_+(x,t) = \begin{cases} e^{-iE_0 t}\,e^{-ik_0 x}\,e^{-\gamma x - \Gamma t/2}, & x \to -\infty \\ e^{-iE_0 t}\,e^{ik_0 x}\,e^{\gamma x - \Gamma t/2}, & x \to +\infty \end{cases}. \tag{22.34}$$
The final factor varies as
$$e^{\pm\gamma(x \mp vt)}, \qquad v = \frac{\Gamma}{2\gamma} = \frac{k_0}{m}. \tag{22.35}$$
This has the interpretation of a particle moving with momentum k0. This, of course, is the particle which has escaped the trap. The upshot of this discussion is that poles of the S-matrix in the lower-half complex plane correspond to resonances. It is often useful to write S++ as a function of energy rather than momentum. Since S++ is a phase, close to a resonance it necessarily takes the form
$$S_{++} = \frac{E - E_0 - i\Gamma/2}{E - E_0 + i\Gamma/2}. \tag{22.36}$$

22.2 Lippmann–Schwinger equation


Imagine a particle coming in and getting scattered by a short-ranged potential V(x) located around the origin x ∼ 0. The time-independent Schrödinger equation is simply
$$(H_0 + V)|\psi\rangle = E|\psi\rangle, \tag{22.37}$$
where H0 ≡ P²/2m is the free-particle Hamiltonian operator. The solution can be written formally as
$$|\psi^{(\pm)}\rangle = \frac{1}{E - H_0 \pm i\epsilon}\,V|\psi^{(\pm)}\rangle + |\phi\rangle \quad \text{where } H_0|\phi\rangle = E|\phi\rangle. \tag{22.38}$$
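In coordinate representation, we have
$$\psi^{(\pm)}(\mathbf{x}) = \phi(\mathbf{x}) + \int d^3x' \left\langle \mathbf{x}\left|\frac{1}{E - H_0 \pm i\epsilon}\right|\mathbf{x}'\right\rangle V(\mathbf{x}')\,\psi^{(\pm)}(\mathbf{x}'), \tag{22.39}$$
where ϕ(x) = e^{ik·x}/(2π)^{3/2}. Define the Green function as
$$G_\pm(\mathbf{x},\mathbf{x}') \equiv \frac{1}{2m}\left\langle \mathbf{x}\left|\frac{1}{E - H_0 \pm i\epsilon}\right|\mathbf{x}'\right\rangle. \tag{22.40}$$
It follows that
$$G_\pm(\mathbf{x},\mathbf{x}') = -\frac{1}{4\pi}\frac{e^{\pm ik|\mathbf{x}-\mathbf{x}'|}}{|\mathbf{x}-\mathbf{x}'|} \quad \text{where } k \equiv \sqrt{2mE}. \tag{22.41}$$
We can verify that
$$(\nabla^2 + k^2)\,G_\pm(\mathbf{x},\mathbf{x}') = \delta(\mathbf{x}-\mathbf{x}'). \tag{22.42}$$
Now solution 22.39 becomes
$$\psi^{(\pm)}(\mathbf{x}) = \frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{(2\pi)^{3/2}} - 2m\int d^3x'\,\frac{1}{4\pi}\frac{e^{\pm ik|\mathbf{x}-\mathbf{x}'|}}{|\mathbf{x}-\mathbf{x}'|}\,V(\mathbf{x}')\,\psi^{(\pm)}(\mathbf{x}'). \tag{22.43}$$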

We can interpret ψ^(+)(x) as a superposition of the incident plane wave and a scattered wave which propagates outward from the scatterer, and we denote it simply as ψ(x).
The experiment is typically done by placing the detector far away from the scatterer, i.e., |x| ≫ a where a is the “size” of the scatterer. The integration over x′, on the other hand, is limited to within the “size” of the scatterer because of the V(x′) factor. Therefore, we are in the situation |x′| ≪ |x|, and hence can use the approximation
$$|\mathbf{x}-\mathbf{x}'| \approx |\mathbf{x}| - \frac{\mathbf{x}'\cdot\mathbf{x}}{|\mathbf{x}|}. \tag{22.44}$$
In this limit, we have
$$\psi(\mathbf{x}) = \frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{(2\pi)^{3/2}} - \frac{m\,e^{ikr}}{2\pi r}\int d^3x'\,e^{-i\mathbf{k}'\cdot\mathbf{x}'}\,V(\mathbf{x}')\,\psi(\mathbf{x}') \quad \text{where } r \equiv |\mathbf{x}|, \quad \mathbf{k}' \equiv k\,\frac{\mathbf{x}}{r}. \tag{22.45}$$
It is customary to write equation 22.45 in the form
$$\psi(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\left[e^{i\mathbf{k}\cdot\mathbf{x}} + f(\mathbf{k},\mathbf{k}')\,\frac{e^{ikr}}{r}\right] \quad \text{where } f(\mathbf{k},\mathbf{k}') \equiv -\frac{m}{2\pi}(2\pi)^3\langle\mathbf{k}'|V|\psi\rangle. \tag{22.46}$$
Recall the definition of the cross section
$$\sigma \equiv \frac{\text{Number of Events}}{\text{Time} \times \text{Incident Flux}}. \tag{22.47}$$
The differential cross section for being scattered into solid angle dΩ is
$$d\sigma = \frac{|j_{\rm scatt}|\,r^2\,d\Omega}{|j_{\rm inc}|} = |f(\mathbf{k},\mathbf{k}')|^2\,d\Omega, \tag{22.48}$$
where j_inc and j_scatt are the probability fluxes of the incident and scattered wave functions.
In a more realistic situation, we should use wave packets to describe the scattering process. The basic picture is that a free wave packet approaches the scattering center. After a long time, we have both the original wave packet moving in the original direction and a spherical wave front that moves outward. The details can be found in section 3 of the lecture notes Scattering Theory I (Hitoshi Murayama).
Furthermore, we require that the normalization of the wave function satisfy ∫d³x |ψ(x)|² = 1 for any t, as guaranteed by the unitarity of the time evolution operator. This requirement leads to a constraint on the scattered wave, and hence on f(k, k′), from which we can derive the optical theorem.

Theorem 22.1 Optical theorem

In physics, the optical theorem is a general law of wave scattering theory, which relates the forward scattering amplitude to the total cross section of the scatterer. It is usually written in the form
$$\operatorname{Im} f(0) = \frac{k\,\sigma_{\rm tot}}{4\pi}, \tag{22.49}$$
where f(0) is the scattering amplitude with an angle of zero, and σtot is the total cross section of the scatterer.

The meaning of this theorem is clear. Because the scattered wave takes the probability away to
different directions, the total probability for the particle to go to the forward direction (unscat-
tered) should decrease. This decrease is caused by the interference between the unscattered
and scattered waves and hence is proportional to Imf (0). On the other hand, the amount of
decrease in the forward direction should equal the total probability at other directions, which
is proportional to the total cross section. The proof can be found in the section 4 of the lecture
notes Scattering Theory I (Hitoshi Murayama).

22.3 Born approximation


If |ψ⟩ = |ϕ⟩ + O(V) is close to |ϕ⟩, we can solve the Lippmann-Schwinger equation by perturbation theory. The lowest order approximation is
$$|\psi\rangle = \frac{1}{E - H_0 + i\epsilon}\,V|\phi\rangle + |\phi\rangle. \tag{22.50}$$
This is called the Born approximation. In coordinate representation, we have
$$f^{(1)}(\mathbf{k},\mathbf{k}') = -\frac{m}{2\pi}\int d^3x\, V(\mathbf{x})\,e^{i\mathbf{q}\cdot\mathbf{x}}, \tag{22.51}$$
where q ≡ k − k′. If the potential is central, we can derive that
$$f^{(1)}(\mathbf{k},\mathbf{k}') = -\frac{2m}{q}\int_0^\infty dr\, r\,V(r)\sin(qr). \tag{22.52}$$
Yukawa potential
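For the Yukawa potential
$$V = \frac{\alpha}{r}\,e^{-\mu r}, \tag{22.53}$$
we can derive that
$$f(\theta) = -\frac{2m\alpha}{q^2 + \mu^2}. \tag{22.54}$$
The differential cross section is
$$\frac{d\sigma}{d\Omega} = (2m\alpha)^2\,\frac{1}{\left[2k^2(1-\cos\theta) + \mu^2\right]^2}. \tag{22.55}$$
The total cross section is obtained by integrating over dΩ,
$$\sigma = (2m\alpha)^2\,\frac{4\pi}{4k^2\mu^2 + \mu^4}. \tag{22.56}$$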
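The Yukawa results can be cross-checked by evaluating the radial integral (22.52) and the angular integral of (22.55) numerically. The sketch below uses a hand-written composite Simpson rule so that it needs only the standard library; the parameter values, the cutoff `r_max`, and all function names are assumptions chosen for illustration.

```python
import math

def simpson(fn, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * fn(lo + i * h)
    return total * h / 3.0

def born_amplitude_central(r_times_V, q, m=1.0, r_max=100.0):
    """First Born amplitude (22.52). r_times_V(r) returns r * V(r), finite at r = 0."""
    integral = simpson(lambda r: r_times_V(r) * math.sin(q * r), 0.0, r_max)
    return -2.0 * m * integral / q

alpha, mu, m, k = 0.3, 0.7, 1.0, 1.1          # illustrative parameters

def yukawa_r_times_V(r):
    return alpha * math.exp(-mu * r)          # r * (alpha / r) * exp(-mu r)

def yukawa_amplitude(theta):
    q = 2.0 * k * math.sin(theta / 2.0)       # |k - k'| for elastic scattering
    return -2.0 * m * alpha / (q * q + mu * mu)

theta = 1.0
q = 2.0 * k * math.sin(theta / 2.0)
f_numeric = born_amplitude_central(yukawa_r_times_V, q, m)
f_closed = yukawa_amplitude(theta)

# Integrate |f|^2 over the sphere: d(Omega) = 2 pi sin(theta) d(theta).
sigma_numeric = simpson(
    lambda th: 2.0 * math.pi * math.sin(th) * yukawa_amplitude(th) ** 2,
    0.0, math.pi)
sigma_closed = (2.0 * m * alpha) ** 2 * 4.0 * math.pi / (4.0 * k**2 * mu**2 + mu**4)
print(f_numeric, f_closed)
print(sigma_numeric, sigma_closed)
```

The radial integrand decays like e^{−μr}, so truncating at `r_max = 100` is harmless for μ = 0.7. For a longer-ranged potential the cutoff would have to grow accordingly.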
Coulomb potential
The Coulomb potential can be obtained by taking the limit µ → 0 in the Yukawa potential. For the Coulomb potential, we have
$$f(\theta) = -\frac{2m\alpha}{q^2}, \qquad \frac{d\sigma}{d\Omega} = \left(\frac{\alpha}{4E}\right)^2\frac{1}{\sin^4(\theta/2)}. \tag{22.57}$$
However, the total cross section diverges. The divergence comes from the region θ → 0 of the angular integral. In other words, the divergence occurs for small momentum transfer q → 0, which corresponds to large distances. The total cross section diverges because the Coulomb potential is a long-range force. No matter how far the incident particles are from the charge, there is always an effect on the motion of the particles and they get scattered.

Form factor
If the source of the Coulomb potential has a distribution ρ(x), the potential becomes
$$V(\mathbf{x}) = \int d^3x'\,\frac{\alpha}{|\mathbf{x}-\mathbf{x}'|}\,\rho(\mathbf{x}'). \tag{22.58}$$
Notice that the potential is mathematically a convolution of the Coulomb potential and the probability density. Since the first Born amplitude is nothing but the Fourier transform of the potential, the convolution becomes a product of Fourier transforms, one for the Coulomb potential and the other for the probability density. Thus we have
$$f(\theta) = f(\theta)_{\rm pointlike}\,F(\mathbf{q}), \tag{22.59}$$
where
$$F(\mathbf{q}) \equiv \int d^3x\,\rho(\mathbf{x})\,e^{i\mathbf{q}\cdot\mathbf{x}} \tag{22.60}$$
is called the form factor.

Born expansion
Define the T-matrix by
$$V|\psi\rangle = T|\phi\rangle. \tag{22.61}$$
Using the definition of the T-matrix, we find that
$$f(\mathbf{k},\mathbf{k}') = -\frac{m}{2\pi}(2\pi)^3\langle\mathbf{k}'|T|\mathbf{k}\rangle. \tag{22.62}$$
Multiplying both sides of the Lippmann-Schwinger equation 22.38 by V from the left, we get
$$T|\phi\rangle = V\frac{1}{E - H_0 + i\epsilon}\,T|\phi\rangle + V|\phi\rangle. \tag{22.63}$$
A formal solution for the T-matrix is
$$T = \frac{1}{1 - V(E - H_0 + i\epsilon)^{-1}}\,V. \tag{22.64}$$
The expansion of T as a geometric series is
$$T = V + V\frac{1}{E - H_0 + i\epsilon}V + V\frac{1}{E - H_0 + i\epsilon}V\frac{1}{E - H_0 + i\epsilon}V + \cdots \tag{22.65}$$
Thus we have
$$|\psi\rangle = \left[1 + \frac{1}{E - H_0 + i\epsilon}V + \frac{1}{E - H_0 + i\epsilon}V\frac{1}{E - H_0 + i\epsilon}V + \cdots\right]|\phi\rangle. \tag{22.66}$$
The first term is the wave which did not get scattered. The second term is the wave that gets scattered at a point in the potential and then propagates outwards by the propagator. In the third term, the wave gets scattered at a point in the potential, propagates for a while, gets scattered again at another point in the potential, and propagates outwards. In the (n + 1)-th term, the wave is scattered n times before it propagates outwards.

22.4 Partial wave analysis


22.4.1 Partial wave expansion
When the potential is central, angular momentum is conserved due to Noether’s theorem. Therefore, we can expand the wave function in the eigenstates of the angular momentum. The resulting waves with definite angular momentum are called partial waves. We can solve the scattering problem for each partial wave separately, and then in the end put them together to obtain the full scattering amplitude. The plane wave can be expanded as
$$e^{ikz} = \sum_{l=0}^{\infty}(2l+1)\,i^l\,j_l(kr)\,P_l(\cos\theta), \tag{22.67}$$
where j_l(kr) is the spherical Bessel function of the first kind. The asymptotic behaviour of j_l(kr) at large r is
$$j_l(kr) \sim \frac{\sin\left(kr - \frac{l\pi}{2}\right)}{kr}. \tag{22.68}$$
So we have
$$e^{ikz} \sim \frac{1}{2ikr}\sum_{l=0}^{\infty}(2l+1)\left(e^{ikr} - (-1)^l e^{-ikr}\right)P_l(\cos\theta) \quad \text{at large } r. \tag{22.69}$$
The scattering amplitude f can be expanded as
$$f(\theta) = \sum_{l=0}^{\infty} f_l\,(2l+1)\,P_l(\cos\theta). \tag{22.70}$$
The cross section can be expressed through the expansion coefficients of f as
$$\sigma = 4\pi\sum_l (2l+1)|f_l|^2. \tag{22.71}$$
Notice that
$$\operatorname{Im} f(0) = \sum_l (2l+1)\operatorname{Im} f_l. \tag{22.72}$$
Because angular momentum is conserved, the optical theorem must hold for each partial wave separately, and we find that
$$|f_l|^2 = \frac{1}{k}\operatorname{Im} f_l. \tag{22.73}$$
It follows that
$$|1 + 2ikf_l|^2 = 1. \tag{22.74}$$
We can define a phase δl by
$$1 + 2ikf_l = e^{2i\delta_l}. \tag{22.75}$$
The asymptotic behaviour of the wave function 22.46 is then
$$\psi(\mathbf{x}) \sim \frac{1}{2ikr}\sum_l (2l+1)\,P_l(\cos\theta)\left[e^{ikr}e^{2i\delta_l} - (-1)^l e^{-ikr}\right]. \tag{22.76}$$
Compare it to the case of the plane wave without scattering. What this equation says is that the wave converging on the scatterer has the well-defined phase factor −(−1)^l, the same as in the case without scattering, while the wave that emerges from the scatterer has an additional phase factor e^{2iδl}. All that scattering does is shift the phase of the emerging wave by 2δl. The reason why this is merely a phase factor is the conservation of probability: what converged on the origin must come out with the same strength. But this shift in the phase makes the interference among the partial waves different from the case without phase shifts, and the result is not a plane wave but contains the scattered wave. In terms of the phase shifts, the cross section is given by
$$\sigma = \frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l. \tag{22.77}$$
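The step from the definition of the phase shift to the cross-section formula is short enough to spell out. Solving 1 + 2ik f_l = e^{2iδ_l} for the partial-wave amplitude gives
$$f_l = \frac{e^{2i\delta_l} - 1}{2ik} = \frac{e^{i\delta_l}\sin\delta_l}{k}, \qquad |f_l|^2 = \frac{\sin^2\delta_l}{k^2}, \qquad \operatorname{Im} f_l = \frac{\sin^2\delta_l}{k},$$
so that σ = 4π Σ_l (2l + 1)|f_l|² reduces to (4π/k²) Σ_l (2l + 1) sin²δ_l, and the partial-wave form of the optical theorem, |f_l|² = Im f_l / k, is satisfied identically.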
The actual calculation of the phase shifts amounts to solving the radial Schrödinger equation
$$\left[-\frac{1}{r}\frac{d^2}{dr^2}\,r + \frac{l(l+1)}{r^2} + 2mV(r)\right]R_l(r) = k^2 R_l(r) \tag{22.78}$$
for each partial wave. After solving the equation, we take the asymptotic limit r → ∞ and write R_l(r), up to normalization, as the linear combination j_l(kr) cos δl − n_l(kr) sin δl. The relative coefficient of j_l and n_l determines the phase shift δl, and hence the cross section.

22.4.2 Hard sphere scattering


The potential for hard sphere scattering is
$$V = \begin{cases} 0, & r > a \\ \infty, & r < a \end{cases}. \tag{22.79}$$
The infinite potential corresponds to the boundary condition R_l(a) = 0. We first analyze the S-wave (l = 0). The Schrödinger equation is simply
$$-\frac{d^2(rR_0)}{dr^2} = k^2\,rR_0. \tag{22.80}$$
The solution is
$$rR_0 = c\sin[k(r-a)] = \frac{c\,e^{ika}}{2i}\left[e^{i(kr-2ka)} - e^{-ikr}\right]. \tag{22.81}$$
It follows that δ0 = −ka. The reason behind the phase shift is that the wave cannot penetrate into r < a, so it is pushed outwards by a distance a, which corresponds to a shift of −ka in the phase. The cross section from the S-wave scattering is
$$\sigma_0 = \frac{4\pi}{k^2}\sin^2 ka. \tag{22.82}$$
Let us generalize the discussion to the case of a slightly penetrable potential
$$V = \begin{cases} 0, & r > a \\ V_0, & r < a \end{cases}. \tag{22.83}$$
Define K ≡ √(2mV0). If k > K, we have, up to normalization in each region,
$$rR_0 = \begin{cases} \sin\left(\sqrt{k^2-K^2}\,r\right), & r < a \\ \sin(kr + \delta_0), & r > a \end{cases}. \tag{22.84}$$
By matching the logarithmic derivatives of the wave function at r = a, we find that
$$\delta_0 = \tan^{-1}\left[\frac{k}{\sqrt{k^2-K^2}}\tan\left(\sqrt{k^2-K^2}\,a\right)\right] - ka. \tag{22.85}$$
For k ≫ K, one can neglect K and the phase shift vanishes. The energy is too large for the particle to notice the weak potential, and there is no scattering any more.
If k < K, we have
$$rR_0 = \begin{cases} \sinh\left(\sqrt{K^2-k^2}\,r\right), & r < a \\ \sin(kr + \delta_0), & r > a \end{cases}. \tag{22.86}$$
The phase shift is obtained as
$$\delta_0 = \tan^{-1}\left[\frac{k}{\sqrt{K^2-k^2}}\tanh\left(\sqrt{K^2-k^2}\,a\right)\right] - ka. \tag{22.87}$$
For k ≪ K, we have
$$\delta_0 \sim ka\left(\frac{\tanh Ka}{Ka} - 1\right). \tag{22.88}$$

The phase shift δ0 always starts linearly with k at small momentum, and the slope is negative. This is a completely general result for a repulsive potential, and the convenient quantity
$$a_0 = \lim_{k\to 0}\left(-\frac{d\delta_0}{dk}\right) \tag{22.89}$$
is called the scattering length, as it has the dimension of length. This quantity basically measures how big the scatterer is. The cross section in the k → 0 limit is then given by 4πa0². For the hard sphere potential, the scattering length is indeed the size of the sphere.
For the hard sphere problem, the phase shifts for higher partial waves can be worked out similarly. We have
$$\tan\delta_l = \frac{j_l(ka)}{n_l(ka)}. \tag{22.90}$$
The cross section is then given by
$$\sigma = \frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l = \sum_{l=0}^{\infty}\frac{4\pi(2l+1)}{k^2}\,\frac{[j_l(ka)]^2}{[j_l(ka)]^2 + [n_l(ka)]^2}. \tag{22.91}$$
For small momentum k ≪ a⁻¹, we can use the power expansion of the spherical Bessel functions
$$j_l(r) \sim \frac{r^l}{(2l+1)!!}, \qquad n_l(r) \sim -\frac{(2l-1)!!}{r^{l+1}}, \tag{22.92}$$
and find that
$$\delta_l \sim (ka)^{2l+1}. \tag{22.93}$$
Thus phase shift (and so cross section) is smaller for higher partial waves. This is easy to
understand. When k is small, the centrifugal barrier does not allow the particle to reach r = a
classically. Therefore the effect of the potential is extremely suppressed.
At high momentum, sin2 δl oscillates between 0 and 1 as a function of l up to l ∼ ka. Above
this value, the phase shift drops rapidly to zero. This makes sense from the classical physics
intuition. When l > ka, the impact parameter is larger than the size of the target and there
should not be any scattering.
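The hard-sphere formulas (22.90) and (22.91) can be evaluated directly. The sketch below builds the spherical Bessel functions j_l and n_l from their closed forms for l = 0, 1 and the standard upward recurrence f_{l+1}(x) = (2l+1) f_l(x)/x − f_{l−1}(x). The upward recurrence loses accuracy for j_l when l greatly exceeds x, but in that regime sin²δ_l is negligible, so the cross section is unaffected at the precision shown. Function names and the default cutoff `l_max` are assumptions of the example.

```python
import math

def spherical_jn_nn(l_max, x):
    """Lists j[0..l_max], n[0..l_max] of spherical Bessel functions at x > 0."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    n = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for l in range(1, l_max):
        j.append((2 * l + 1) / x * j[l] - j[l - 1])
        n.append((2 * l + 1) / x * n[l] - n[l - 1])
    return j[: l_max + 1], n[: l_max + 1]

def hard_sphere_phase_shift(l, k, a):
    """Phase shift from tan(delta_l) = j_l(ka) / n_l(ka), taken in (-pi/2, pi/2)."""
    j, n = spherical_jn_nn(l, k * a)
    return math.atan(j[l] / n[l])

def hard_sphere_cross_section(k, a, l_max=None):
    """Total cross section (22.91), summed up to l_max (default: ka + 10)."""
    x = k * a
    if l_max is None:
        l_max = int(x) + 10
    j, n = spherical_jn_nn(l_max, x)
    total = 0.0
    for l in range(l_max + 1):
        sin2 = j[l] ** 2 / (j[l] ** 2 + n[l] ** 2)   # sin^2(delta_l)
        total += (2 * l + 1) * sin2
    return 4.0 * math.pi / k**2 * total

a = 1.0
print(hard_sphere_phase_shift(0, 0.5, a))                       # equals -ka
print(hard_sphere_cross_section(0.01, a) / (4 * math.pi * a**2))  # close to 1 at low energy
```

At low momentum only the S-wave contributes and the cross section approaches 4πa², four times the geometric cross section, in line with the scattering-length result quoted above.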

22.4.3 Attractive potential


For the attractive potential
$$V = \begin{cases} 0, & r > a \\ -V_0, & r < a \end{cases}, \tag{22.94}$$
the phase shift for the S-wave is
$$\delta_0 = \tan^{-1}\left[\frac{k}{\sqrt{k^2+K^2}}\tan\left(\sqrt{k^2+K^2}\,a\right)\right] - ka. \tag{22.95}$$
The scattering length is
$$a_0 = a\left(1 - \frac{\tan Ka}{Ka}\right). \tag{22.96}$$
For small K, the scattering length is negative. This is easy to understand because the wave is pulled into the potential rather than pushed out, unlike the repulsive case. However, once we make the potential more attractive (larger K), the magnitude of the scattering length grows and even becomes infinite at Ka = π/2.
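The closed-form scattering lengths can be checked against a finite-difference estimate of −dδ0/dk at small k, using the S-wave phase shifts for the repulsive barrier (22.87) and the attractive well (22.95). This is a sketch with hypothetical function names; the step k = 10⁻⁶ is an arbitrary small value.

```python
import math

def delta0_attractive(k, K, a):
    """S-wave phase shift (22.95) for the attractive well, K = sqrt(2 m V0)."""
    kp = math.sqrt(k * k + K * K)
    return math.atan(k / kp * math.tan(kp * a)) - k * a

def delta0_repulsive(k, K, a):
    """S-wave phase shift (22.87) for the repulsive barrier, valid for k < K."""
    kap = math.sqrt(K * K - k * k)
    return math.atan(k / kap * math.tanh(kap * a)) - k * a

def scattering_length(delta0, K, a, k=1e-6):
    """Estimate -d(delta0)/dk at k -> 0, using delta0(0) = 0."""
    return -delta0(k, K, a) / k

a = 1.0
for K in (0.5, 1.0, 1.55):
    numeric = scattering_length(delta0_attractive, K, a)
    closed = a * (1.0 - math.tan(K * a) / (K * a))
    print(f"attractive, Ka = {K * a}: numeric = {numeric:.6f}, closed form = {closed:.6f}")

for K in (2.0, 50.0):
    numeric = scattering_length(delta0_repulsive, K, a)
    closed = a * (1.0 - math.tanh(K * a) / (K * a))
    print(f"repulsive,  Ka = {K * a}: numeric = {numeric:.6f}, closed form = {closed:.6f}")
```

The attractive values are negative for Ka < π/2 and grow rapidly in magnitude as Ka approaches π/2, while the repulsive values are positive and approach the hard-sphere result a0 = a as K → ∞.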

Let us study the analytic structure of the scattering amplitude more carefully. Notice that
$$e^{2i\delta_0} = e^{-2ika}\,\frac{1 + i\dfrac{k}{\sqrt{k^2+K^2}}\tan\left(\sqrt{k^2+K^2}\,a\right)}{1 - i\dfrac{k}{\sqrt{k^2+K^2}}\tan\left(\sqrt{k^2+K^2}\,a\right)}. \tag{22.97}$$
It can have a pole if
$$1 - i\frac{k}{\sqrt{k^2+K^2}}\tan\left(\sqrt{k^2+K^2}\,a\right) = 0. \tag{22.98}$$
This equation is impossible to satisfy for real k, but it can be satisfied on the complex k plane. For a pure imaginary k = iκ, the equation becomes
$$\kappa = -\frac{\sqrt{K^2-\kappa^2}}{\tan\left(\sqrt{K^2-\kappa^2}\,a\right)}. \tag{22.99}$$
This is the condition for bound states. The scattering wave e^{ikr} becomes e^{−κr}, which is trapped by the scatterer. As K is decreased from a sufficiently large value with bound states, the bound state energies E = −κ²/2m move up. When Ka = (n + 1/2)π, we have tan Ka → ∞, and we find a bound state approaching κ = k = 0. The infinite scattering cross section at k = 0 happens because there is a bound state exactly at k = 0.

This can also be seen on the complex k plane in the following manner. The lower half plane
is unphysical as it corresponds to an exponentially growing wave function at the infinity for
the scattered wave. When there are bound states, we see poles along the positive imaginary
axis. By decreasing K, the poles along the positive imaginary axis go down, and a pole reaches
the origin. By further decreasing K, the pole goes below the origin into the unphysical region.
However, the existence of a pole just below the origin makes the scattering amplitude at k → 0
large and results in an anomalously large cross section.

22.5 Resonance

22.5.1 Delta-shell potential


There are few examples of potentials that can be worked out simply and exhibit resonances.
Here we discuss an idealized potential called the “delta-shell” potential,

V = γδ(r − a). (22.100)

This potential leads to a true bound state if γ is sufficiently negative. On the other hand, for
γ → ∞, the regions inside r < a and outside r > a are decoupled and one finds a tower of
states confined inside the shell. The fate of these states for finite γ is very interesting.
22.5 Resonance –243/453–

The phase shift for the S-wave can be worked out analytically,
k
2iδ0 −2ika
sin ka + 2mγ
eika
e =e . (22.101)
sin ka + k
2mγ
e−ika

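We now look for the poles of $e^{2i\delta_0}$, i.e., the zeros of the denominator. Multiplying $\sin ka + (k/2m\gamma)e^{-ika} = 0$ by $2ie^{ika}$, the condition can be rewritten as

\[ e^{2ika} = 1 - i\,\frac{k}{m\gamma}. \tag{22.102} \]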

When γ is large, we have
\[ k \approx \frac{n\pi}{a + 1/(2m\gamma)} - i\left(\frac{1}{2m\gamma}\right)^{2}\frac{n^{2}\pi^{2}}{a^{3}} + O\!\left(\gamma^{-3}\right). \tag{22.103} \]

The poles are in the unphysical lower half plane. But when γ is large, the poles are very close to
the real axis, and the scattering amplitude receives a large enhancement due to these poles. In
the limit of γ → ∞, or in other words in the limit of no coupling between the regions inside
and outside the shell, they become poles along the real axis. They are the discrete states inside
the shell in this limit. By making γ finite, we introduce coupling between the discrete states inside the shell and the continuum states outside the shell.
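The pole positions can also be checked numerically. The sketch below assumes ħ = 1 and the illustrative values a = m = 1, γ = 50; it runs a Newton iteration on F(k) = e^{2ika} − 1 + ik/(mγ), whose zeros are the zeros of the denominator of (22.101), starting from the large-γ estimate (22.103). The function names are hypothetical.

```python
import cmath
import math

def approx_pole(n, a, m, gamma):
    """Large-gamma estimate of the n-th pole, Eq. (22.103)."""
    eps = 1.0 / (2.0 * m * gamma)
    return n * math.pi / (a + eps) - 1j * eps ** 2 * (n * math.pi) ** 2 / a ** 3

def exact_pole(n, a, m, gamma, steps=50):
    """Newton iteration on F(k) = exp(2ika) - 1 + i k / (m gamma)."""
    k = approx_pole(n, a, m, gamma)
    for _ in range(steps):
        e = cmath.exp(2j * k * a)
        F = e - 1.0 + 1j * k / (m * gamma)
        dF = 2j * a * e + 1j / (m * gamma)
        k -= F / dF
    return k

for n in (1, 2, 3):
    k_num = exact_pole(n, a=1.0, m=1.0, gamma=50.0)
    k_est = approx_pole(n, a=1.0, m=1.0, gamma=50.0)
    # The imaginary part is negative: the pole lies in the lower half plane.
    print(n, k_num, abs(k_num - k_est))
```

Increasing `gamma` in this sketch moves the poles toward the real axis, in line with the decoupling limit discussed above.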
It is instructive to solve the Schrödinger equation for the values of k that correspond to the locations of the poles. Because the outgoing wave $e^{ikr}$ is enhanced relative to the incoming wave $e^{-ikr}$ by an infinite amount due to the pole, the boundary condition is that the solution is “purely outgoing”, i.e.,

\[ rR_0 = \begin{cases} \sin kr, & r < a \\ \sin ka\; e^{ik(r-a)}, & r > a \end{cases}, \qquad \mathrm{Re}(k) > 0. \tag{22.104} \]

Because the factor $e^{ik(r-a)}$ grows exponentially at large r due to the negative imaginary part of k, the solution is not a regular normalizable solution. In the large γ limit, $\sin ka \sim O(\gamma^{-1})$ is small. Therefore the wave function almost vanishes at the shell. Outside the shell, the wave function oscillates with the small amplitude sin ka, which nevertheless grows exponentially with r because of the factor $e^{ik(r-a)}$.
We now put the time dependence in. For k = k0 − iκ, we have

\[ E = E_0 - i\,\frac{\Gamma}{2} = \frac{k_0^2}{2m} - i\,\frac{k_0\kappa}{m} + O\!\left(\kappa^2\right). \tag{22.105} \]
The time dependence of the wave function is simply
\[ rR_0(r,t) = rR_0(r)\,e^{-iE_0 t}e^{-\Gamma t/2} = \begin{cases} \sin kr\; e^{-iE_0 t}e^{-\Gamma t/2}, & r < a \\ \sin ka\; e^{ik(r-a)}\,e^{-iE_0 t}e^{-\Gamma t/2}, & r > a \end{cases}. \tag{22.106} \]

Inside the shell, the probability density decays exponentially in time, uniformly over space. Outside the shell, the probability density is $|rR_0|^2 \propto e^{2\kappa r - \Gamma t}$, which shows the probability flowing out to infinity with speed $\Gamma/2\kappa = k_0/m$, nothing but the velocity of the particle itself. In other words, the wave function describes a “bound state” inside the shell decaying into a continuum state outside the shell that moves away at the expected velocity. The resonances can be viewed as quasi-bound states which decay into continuum states. The lifetime of the quasi-bound states is τ = 1/Γ. A more rigorous treatment of resonances using wave packets can be found in section 5 of the lecture notes Scattering Theory III (Hitoshi Murayama).

22.5.2 General description of resonances


In general, once we know that there is a pole just below the real axis, we can approximate the
phase shift by the contribution from the pole only. Then as a function of the energy, the phase
shift is approximated as
\[ e^{2i\delta_l} \sim \frac{g(E)}{E - E_0 + i\Gamma/2}. \tag{22.107} \]
Because of unitarity, $|e^{2i\delta_l}|^2 = 1$, we immediately conclude that

\[ e^{2i\delta_l} \sim e^{2i\theta}\,\frac{E - E_0 - i\Gamma/2}{E - E_0 + i\Gamma/2}. \tag{22.108} \]

Ignoring the continuum contribution $e^{2i\theta}$, we have

\[ \sigma_l \propto \sin^2\delta_l = \frac{\Gamma^2/4}{(E - E_0)^2 + \Gamma^2/4}. \tag{22.109} \]

At $E = E_0$, it saturates the unitarity limit $\sin^2\delta_l = 1$, and its shape in energy is called Lorentzian or Breit-Wigner. Γ is nothing but the FWHM (Full Width at Half Maximum) of the Lorentzian peak in $\sin^2\delta_l$. Comparing the discussion of the decaying probability density with a run-away wave and the dependence of the cross section on the energy, we have established the relationship between the lifetime of the quasi-bound state and the FWHM of the resonance as τΓ = 1. This is an explicit manifestation of the energy-time uncertainty relation ∆E∆t ∼ 1.
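The step from the pole form of the phase shift to the Lorentzian can be checked numerically. The sketch below assumes the continuum phase is dropped (θ = 0) and uses the illustrative values E0 = 1 and Γ = 0.2; it extracts δ from the phase of $e^{2i\delta_l}$ and compares $\sin^2\delta_l$ with the Breit-Wigner shape.

```python
import cmath
import math

def sin2_delta(E, E0, Gamma):
    """sin^2(delta) from exp(2 i delta) = (E - E0 - i Gamma/2) / (E - E0 + i Gamma/2)."""
    s = (E - E0 - 0.5j * Gamma) / (E - E0 + 0.5j * Gamma)
    delta = 0.5 * cmath.phase(s)  # delta is defined modulo pi, which sin^2 ignores
    return math.sin(delta) ** 2

def breit_wigner(E, E0, Gamma):
    """Lorentzian (Gamma^2/4) / ((E - E0)^2 + Gamma^2/4)."""
    return (Gamma ** 2 / 4) / ((E - E0) ** 2 + Gamma ** 2 / 4)

E0, Gamma = 1.0, 0.2
for E in (0.5, 0.9, 1.0, 1.1, 1.5):
    print(E, sin2_delta(E, E0, Gamma), breit_wigner(E, E0, Gamma))
```

Evaluating either function at E = E0 ± Γ/2 gives one half of the peak value, which is the statement that Γ is the full width at half maximum.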

22.6 Two-to-two scattering


As in the discussion of the hydrogen atom, we take as independent variables the center-of-mass and relative coordinates of the particles,
\[ X = \frac{m_1 x_1 + m_2 x_2}{m_1 + m_2}, \qquad x = x_1 - x_2. \tag{22.110} \]
The corresponding momentum operators are
\[ P = p_1 + p_2, \qquad p = \frac{m_2 p_1 - m_1 p_2}{m_1 + m_2}. \tag{22.111} \]
The Hamiltonian becomes
\[ H = \frac{P^2}{2M} + \frac{p^2}{2\mu} + V(|x|), \tag{22.112} \]
where $M = m_1 + m_2$ is the total mass. The problem now reduces to the potential scattering problem for a particle of mass $\mu = m_1 m_2/(m_1 + m_2)$.

If the two scattering particles are identical, as in electron-electron scattering or the scattering of two identical atoms, the symmetry of the wave function needs to be considered. Under the interchange of the two particles, the center-of-mass motion is not affected, but the relative coordinate changes sign. If the particles have spin, their spins need to be interchanged at the same time.

If the two particles are identical spinless bosons, there are no spin degrees of freedom and the interchange of particles is simply x → −x in the wave function. Because they are bosons, the wave function should not change under the interchange of particles, and hence the wave function must be an even function of x. Therefore the asymptotic form of the wave function must be changed to
\[ \psi(x) \to e^{ikz} + e^{-ikz} + \left[f(\theta) + f(\pi - \theta)\right]\frac{e^{ikr}}{r}. \tag{22.113} \]
The differential cross section is then found to be
\[ \frac{d\sigma}{d\Omega} = |f(\theta) + f(\pi - \theta)|^2. \tag{22.114} \]
Note that one should not integrate over the entire solid angle to obtain the total cross section, because (θ, ϕ) and (π − θ, ϕ + π) correspond to an identical state.

For two spin-1/2 fermions, there are two possible spin wave functions, symmetric (S = 1) and antisymmetric (S = 0). Depending on the spin wave function, we therefore have either an antisymmetric or a symmetric spatial wave function, respectively. In particular, the differential cross section is the same as that of spinless bosons for the antisymmetric spin wave function S = 0, while it is
\[ \frac{d\sigma}{d\Omega} = |f(\theta) - f(\pi - \theta)|^2 \tag{22.115} \]
for the symmetric spin wave function S = 1. In the latter case, the differential cross section vanishes identically at θ = π/2.

22.7 Time-dependent formulation of scattering theory


In the time-dependent perturbation theory, the rate of the initial state |i⟩ transforming into
the final state |f ⟩ (up to the first order) is

\[ \Gamma(i \to f) = 2\pi\,\delta(E_i - E_f)\,|V_{fi}|^2. \tag{22.116} \]

When applied to the scattering problem, an additional issue is to define how we sum over the
final states. In particular, we would like to sum over the continuum plane-wave states, and
we must make the sum well-defined. To define the sum over the continuum states, it is useful
to consider the system in a cube of size L. We impose the periodic boundary condition. The
plane-wave solutions in this box are given by

\[ \psi_{n_x,n_y,n_z}(x) = \frac{1}{L^{3/2}}\,e^{2\pi i (n_x x + n_y y + n_z z)/L}. \tag{22.117} \]

In the limit L → ∞, we have


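\[ \sum_{n_x,n_y,n_z} = \left(\frac{L}{2\pi}\right)^{3}\sum_{n_x,n_y,n_z}\left(\frac{2\pi}{L}\right)^{3} \to \left(\frac{L}{2\pi}\right)^{3}\int d^3p = \int \frac{d^3x\,d^3p}{(2\pi)^3}. \tag{22.118} \]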

Coming back to the scattering problem, we sum over the final states to define the rate of the
outgoing particle to go into various momentum states
\[ \sum_f \Gamma(i \to f) = \int \frac{L^3\,d^3p}{(2\pi)^3}\; 2\pi\,\delta(E_i - E_f)\,|V_{fi}|^2, \tag{22.119} \]

where
\[ V_{fi} = \int d^3x\,\frac{e^{-ip_f\cdot x}}{L^{3/2}}\,V(x)\,\frac{e^{ip_i\cdot x}}{L^{3/2}} = \frac{1}{L^3}\int d^3x\,V(x)\,e^{iq\cdot x}, \qquad q = p_i - p_f. \tag{22.120} \]

The incident flux of the incoming particle is $v/L^3$. It follows that

\[ \sigma = \frac{L^3}{v}\sum_f \Gamma(i \to f) = \frac{m}{p_i}\int \frac{d^3p}{(2\pi)^3}\; 2\pi\,\delta(E_i - E_f)\left|\int d^3x\,V(x)\,e^{iq\cdot x}\right|^2. \tag{22.121} \]

Notice that $E = p^2/2m$ and $\delta(E_i - E_f) = m\,\delta(p_f - p_i)/p_i$. Equation 22.121 can be simplified into
\[ \sigma = \int d\Omega \left|\frac{m}{2\pi}\int d^3x\,V(x)\,e^{iq\cdot x}\right|^2. \tag{22.122} \]

This is nothing but the Born approximation for the scattering cross section.
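The intermediate step can be written out explicitly. The following is a sketch of the radial momentum integral, assuming $d^3p = p_f^2\,dp_f\,d\Omega$ for the final-state momentum and ħ = 1:

```latex
\begin{align*}
\sigma &= \frac{m}{p_i}\int \frac{p_f^2\,dp_f\,d\Omega}{(2\pi)^3}\;
          2\pi\,\frac{m}{p_i}\,\delta(p_f - p_i)
          \left|\int d^3x\,V(x)\,e^{iq\cdot x}\right|^2 \\
       &= \frac{m^2}{4\pi^2}\int d\Omega
          \left|\int d^3x\,V(x)\,e^{iq\cdot x}\right|^2
        = \int d\Omega \left|\frac{m}{2\pi}\int d^3x\,V(x)\,e^{iq\cdot x}\right|^2 .
\end{align*}
```

The delta function sets $|p_f| = |p_i|$, so the factors of $p_i$ cancel and only the angular integral remains; the quantity inside the absolute value is the Born scattering amplitude up to a conventional sign.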
Part V

Quantum Field Theory


Chapter 23
Elementary Group Theory

23.1 Group
Definition 23.1 Group

Group A group G is a set of elements with a multiplication rule that assigns to every (ordered) pair of elements f, g a product f g, satisfying
• If f, g ∈ G, then f g ∈ G.
• For f, g, h ∈ G, f (gh) = (f g)h.
• There is an identity element, e, such that for all f ∈ G, ef = f e = f .
• Every element f ∈ G has an inverse, $f^{-1}$, such that $f f^{-1} = f^{-1} f = e$.

Abelian group An abelian group is one in which the multiplication of any two elements is commutative.
Finite group A group is finite if it has a finite number of elements. The number of elements in a finite group G is called the order of G, denoted by N(G). A finite group with n elements can be characterized by its multiplication table. In each row or column, every group element appears once and only once.
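The four axioms and the once-per-row property of the multiplication table can be tested by brute force on small examples. The sketch below is illustrative only: it represents permutations of {0, 1, 2} as tuples, composes them from right to left, and uses a hypothetical helper `is_group`.

```python
from itertools import permutations, product

def is_group(elements, mul):
    """Brute-force check of closure, associativity, identity and inverses."""
    elements = list(elements)
    if any(mul(f, g) not in elements for f, g in product(elements, repeat=2)):
        return False  # closure fails
    if any(mul(f, mul(g, h)) != mul(mul(f, g), h)
           for f, g, h in product(elements, repeat=3)):
        return False  # associativity fails
    identities = [e for e in elements
                  if all(mul(e, f) == f == mul(f, e) for f in elements)]
    if len(identities) != 1:
        return False  # no identity element
    e = identities[0]
    return all(any(mul(f, g) == e == mul(g, f) for g in elements)
               for f in elements)

def compose(p, q):
    """(p q)(i) = p(q(i)): apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))
print(is_group(S3, compose))                              # S_3 is a group
print(is_group(range(4), lambda x, y: (x + y) % 4))       # Z_4 under addition
print(is_group(range(4), lambda x, y: (x * y) % 4))       # fails: 0 has no inverse

# Each row of the multiplication table is a rearrangement of the group.
rows_ok = all(sorted(compose(g, h) for h in S3) == sorted(S3) for g in S3)
abelian = all(compose(g, h) == compose(h, g) for g in S3 for h in S3)
print(rows_ok, abelian)
```

The cubic cost of the associativity check makes this approach practical only for very small groups.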

Definition 23.2 Subgroup

Subgroup A subset H of a group G is called a subgroup of G if H also forms a group under the multiplication of G.

Conjugacy class In a group G, two elements g and g′ are said to be conjugate (g ∼ g′) if there exists another element f such that $g' = f^{-1} g f$. The set of all elements conjugate to g is the conjugacy class of g.
Invariant subgroup A subgroup H of a group G is called invariant if H and $g^{-1}Hg$ are the same for all g ∈ G.
Center of a group The center of a group G is the set of all elements of G that commute with all elements of G. The center is an abelian invariant subgroup of G.
Derived subgroup Define $\langle a, b\rangle \equiv a^{-1}b^{-1}ab$. Denote by $\{x_1, x_2, \cdots\}$ the objects ⟨a, b⟩ as a and b range over all the elements in the group. These objects, together with the products of these objects with each other, constitute a subgroup of G, known as the derived subgroup D. D is an invariant subgroup of G.
Simple group A group is called simple if it has no non-trivial invariant subgroup.

Theorem 23.1 Lagrange’s theorem

Let a group G with n elements have a subgroup H with m elements. Then m is a factor of n. In other words, n/m is an integer.

Definition 23.3 Cosets

If H is a subgroup of G, the left cosets are given by {gH | g ∈ G}, where gH ≡ {gh | h ∈ H}. If H is an invariant subgroup, we have gH = Hg. In this case, we can define a multiplication rule for left cosets as $(g_a H)(g_b H) \equiv (g_a g_b)H$. The left cosets then form a group, known as the quotient group and written as Q = G/H. If G is finite, we have N(Q) = N(G)/N(H).

Definition 23.4 Symmetric group

The set of all bijections {1, · · · , n} → {1, · · · , n}, called permutations, forms a finite group under composition of maps. We call this group the symmetric group of degree n and denote it by $S_n$. An element σ ∈ $S_n$ can be represented by (1, · · · , n) → (σ(1), · · · , σ(n)). We take the convention of composing permutations from right to left, so for π, σ ∈ $S_n$ we have π · σ = (1, · · · , n) → (π(σ(1)), · · · , π(σ(n))).

Definition 23.5

Partition A partition of n is a sequence $\lambda = (\lambda_1, \cdots, \lambda_k)$ of positive integers such that $\lambda_1 + \cdots + \lambda_k = n$ and $\lambda_1 \ge \cdots \ge \lambda_k$. The set of partitions of n is denoted by $P_n$. Setting $v_i := |\{j = 1, \ldots, k : \lambda_j = i\}|$, we write $\lambda = (n^{v_n}, \ldots, 1^{v_1})$.
Cycle type Every σ ∈ $S_n$ can be written as a product of disjoint cycles. Take σ ∈ $S_n$ with $\sigma = \sigma_1 \cdots \sigma_k$, where $\sigma_1, \cdots, \sigma_k$ are disjoint cycles of lengths $\lambda_1, \cdots, \lambda_k$. We may assume that $\lambda_1 \ge \cdots \ge \lambda_k$. Then $\lambda(\sigma) := (\lambda_1, \cdots, \lambda_k) \in P_n$ is called the cycle type of σ.

Proposition 23.1

1. Given a cycle type characterized by $(n^{v_n}, \ldots, 1^{v_1})$, the number of permutations in $S_n$ with this cycle type is $n!/\bigl(\prod_j j^{v_j}\, v_j!\bigr)$.
2. Permutations σ, π ∈ $S_n$ are conjugate if and only if λ(σ) = λ(π).
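The counting formula can be verified by enumerating a small symmetric group. The sketch below, with hypothetical helper names, computes the cycle type of every permutation of {0, 1, 2, 3} and compares the class sizes with $n!/\prod_j j^{v_j} v_j!$.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Cycle lengths of a permutation of 0..n-1 (tuple form), in decreasing order."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def class_size(partition):
    """n! / prod_j (j**v_j * v_j!), where v_j counts the cycles of length j."""
    denom = 1
    for j, v in Counter(partition).items():
        denom *= j ** v * factorial(v)
    return factorial(sum(partition)) // denom

counts = Counter(cycle_type(p) for p in permutations(range(4)))
for lam, count in sorted(counts.items(), reverse=True):
    print(lam, count, class_size(lam))
```

Since conjugate permutations share a cycle type, the keys of `counts` label the conjugacy classes of $S_4$ and the values are the class sizes.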

23.2 Representation theory

Definition 23.6 Representation

A representation of a group G is a mapping between the elements g ∈ G and a set of linear operators D(g) with the properties D(e) = 1 and $D(g_1)D(g_2) = D(g_1 g_2)$. The dimension of a representation is the dimension of the linear space on which the operators in the representation act. If the map D is injective, that is, one-to-one, then the representation is faithful. The representation D(g) = 1 for all g ∈ G is called the trivial representation.

Definition 23.7 Regular representation

The regular representation of a group is constructed by taking the group elements $\{g_1, g_2, \cdots\}$ themselves as the orthonormal basis vectors $\{|g_1\rangle, |g_2\rangle, \cdots\}$ of the representation space,
\[ D_{\mathrm{reg}}(g_1)\,|g_2\rangle = |g_1 g_2\rangle. \tag{23.1} \]
Hence, $[D_{\mathrm{reg}}(g)]_{ij} = \langle g_i|D_{\mathrm{reg}}(g)|g_j\rangle = \langle g_i|g g_j\rangle$. The dimension of $D_{\mathrm{reg}}$ is the order of the group G.

Definition 23.8 Character

Given a representation D(g), we define the character χ(g) of the representation by χ(g) ≡ Tr D(g). If $g_1$ and $g_2$ belong to the same conjugacy class c, then $\chi(c) \equiv \chi(g_1) = \chi(g_2)$.

Definition 23.9 Equivalent representation

Two representations, D(g) and D′(g), are equivalent representations if they are related by a similarity transformation $D'(g) = S^{-1}D(g)S$. Equivalent representations have the same characters, χ′(c) = χ(c).

Theorem 23.2 Unitarity of representations of finite groups

A representation of a group G is unitary if all the matrices D(g) are unitary. All representations of finite groups are equivalent to unitary representations.

As a corollary, if a class c of a finite group contains the inverses of its members, χ(c) must be real.

Definition 23.10 Reducible representation

A representation is reducible if it has a nontrivial invariant subspace. A representation is completely reducible if it is equivalent to a representation whose matrices have the following block diagonal form:
\[ D(g) = \begin{pmatrix} D_1(g) & 0 & \cdots \\ 0 & D_2(g) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}, \qquad \forall g \in G, \tag{23.2} \]
where the $D_j(g)$ are irreducible representations of G for all subscripts j. A representation D in block diagonal form is said to be the direct sum of the sub-representations $D_j$.

Proposition 23.2

Every representation of a finite group is completely reducible.

Definition 23.11

Given two representations of dimensions $d_r$ and $d_s$, with representation matrices $D_r(g)$ and $D_s(g)$, respectively, the direct product matrices $D(g) = D_r(g) \otimes D_s(g)$, namely, the $d_r d_s$-by-$d_r d_s$ matrices given by

\[ D(g)_{a\alpha,\,b\beta} = D_r(g)_{ab}\,D_s(g)_{\alpha\beta}, \tag{23.3} \]

form a representation called the direct product representation.

Lemma 1 Schur’s Lemma

1. Suppose $D_r$ and $D_s$ are irreducible representations of a group G on finite-dimensional vector spaces $V_r$ and $V_s$, and A is a linear map from $V_r$ to $V_s$. If for any g ∈ G,

\[ D_s(g)\,A = A\,D_r(g) \tag{23.4} \]

holds, then $D_r$ and $D_s$ must be equivalent unless A = 0.

2. Suppose D is a representation of a group G on a finite-dimensional vector space V and there exists a nonzero matrix A satisfying

\[ D(g)\,A = A\,D(g), \qquad \forall g \in G. \tag{23.5} \]

If D is irreducible, equation 23.5 holds if and only if A = λI for some number λ.

Theorem 23.3 The Great Orthogonality theorem

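Given two irreducible representations $D_r(g)$ and $D_s(g)$ of a finite group G, with dimensions $d_r$ and $d_s$, we have

\[ \sum_g D_r^\dagger(g)_{ij}\,D_s(g)_{kl} = \frac{N(G)}{d_r}\,\delta_{rs}\,\delta_{li}\,\delta_{jk}. \tag{23.6} \]

If we take the trace of the representation matrices, we obtain the following column orthogonality for the character table $\chi_r(c)$:
\[ \sum_c n_c\,\chi_r^*(c)\,\chi_s(c) = N(G)\,\delta_{rs}, \tag{23.7} \]

where $n_c$ is the number of elements in class c. The character table also satisfies row orthogonality,
\[ \sum_r \chi_r^*(c)\,\chi_r(c') = \frac{N(G)}{n_c}\,\delta_{cc'}. \tag{23.8} \]

A corollary of the column and row orthogonality of the character table is that N(C) = N(R), i.e., the number of conjugacy classes equals the number of inequivalent irreducible representations. The identity element e is a class by itself and $\chi_r(e) = d_r$, so we have
\[ \sum_r d_r^2 = N(G). \tag{23.9} \]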

Proposition 23.3 A test for reducibility


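If D is a reducible representation of G and $\chi(c) = \sum_r n_r \chi_r(c)$, where $n_r$ is the number of times the irreducible representation r appears in D, then we have
\[ \sum_c n_c\,\chi(c)^*\chi(c) = N(G)\sum_r (n_r)^2, \qquad \sum_c n_c\,\chi_r^*(c)\,\chi(c) = N(G)\,n_r. \tag{23.10} \]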
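These relations can be checked on the smallest non-abelian example. The sketch below assumes the standard character table of $S_3$ (classes: identity, the three transpositions, the two 3-cycles; irreducible representations: trivial, sign, and the 2-dimensional one), and then applies the reducibility test to the 3-dimensional permutation representation, whose character counts fixed points.

```python
class_sizes = [1, 3, 2]          # identity, transpositions, 3-cycles of S_3
chars = {
    "trivial":  [1, 1, 1],
    "sign":     [1, -1, 1],
    "standard": [2, 0, -1],
}
N = sum(class_sizes)             # order of the group, N(G) = 6

def inner(chi_a, chi_b):
    """sum_c n_c chi_a(c)^* chi_b(c); the characters here are real."""
    return sum(n * x * y for n, x, y in zip(class_sizes, chi_a, chi_b))

# Column orthogonality (23.7).
for r, chi_r in chars.items():
    for s, chi_s in chars.items():
        print(r, s, inner(chi_r, chi_s))

# Row orthogonality (23.8): sum over irreps at fixed classes c, c'.
row = [[sum(chi[c] * chi[d] for chi in chars.values()) for d in range(3)]
       for c in range(3)]
print(row)

# Reducibility test (23.10) for the permutation representation of S_3.
chi_perm = [3, 1, 0]             # number of fixed points in each class
multiplicities = {r: inner(chi_r, chi_perm) // N for r, chi_r in chars.items()}
print(inner(chi_perm, chi_perm) // N, multiplicities)
```

The permutation representation gives $\sum_r n_r^2 = 2$, so it contains exactly two distinct irreducible representations, the trivial one and the 2-dimensional one.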

Definition 23.12 Real, pseudoreal, complex Representations

• Suppose D(g) is a representation of the group G. Then $D^*(g)$ is also a representation of G, called the conjugate representation. If D(g) is equivalent to $D^*(g)$, then D(g) is a non-complex representation. Otherwise it is a complex representation.

• For a non-complex representation D(g), if D(g) is equivalent to a representation whose matrix elements are real for any g, then D(g) is a real representation. Otherwise it is a pseudoreal representation.

Proposition 23.4

1. If D(g) is a non-complex irreducible representation of the group G and $D^*(g) = S D(g) S^{-1}$, then S is either symmetric or antisymmetric.
2. If S is symmetric, the representation is real. If S is antisymmetric, the representation is pseudoreal.
3. Up to an overall constant, S is unitary.
4. An invariant bilinear $y^\intercal S x$ exists if and only if the irreducible representation is real or pseudoreal.

Proposition 23.5 The reality checker





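For an irreducible representation r of a finite group G,
\[ \sum_{g\in G}\chi_r(g^2) = \eta_r\,N(G), \qquad \text{with } \eta_r = \begin{cases} 1 & \text{if real}, \\ -1 & \text{if pseudoreal}, \\ 0 & \text{if complex}. \end{cases} \tag{23.11} \]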

23.3 Representations of the symmetric groups


Definition 23.13 Defining representation of Sn

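The defining representation of $S_n$ is an n-dimensional representation defined by $D(\sigma)\,|j\rangle = |\sigma(j)\rangle$, where j = 1, · · · , n. It is reducible for n ≥ 2, since the vector $\sum_j |j\rangle$ is invariant under every permutation.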

Definition 23.14 Parity of a permutation

The parity of a permutation σ ∈ $S_n$ with cycle type $(\lambda_1, \cdots, \lambda_k)$ is defined as $(-1)^\sigma \equiv \prod_i (-1)^{\lambda_i - 1} = (-1)^{n-k}$. If $(-1)^\sigma = 1$, σ is called an even permutation. If $(-1)^\sigma = -1$, σ is called an odd permutation.

Proposition 23.6

1. Every $S_n$ has a non-trivial invariant subgroup, the alternating group $A_n$, which is composed of all even permutations.
2. Every $S_n$ has two one-dimensional representations:
• D(σ) = 1 for any σ ∈ $S_n$;
• $D(\sigma) = (-1)^\sigma$ for any σ ∈ $S_n$.

Definition 23.15 Young Diagrams

A partition $(\lambda_1, \cdots, \lambda_k)$ is represented graphically by a Young diagram of n squares arranged in k rows, the jth of which contains $\lambda_j$ squares.

Corollary 1

The number of distinct Young diagrams with n squares = the number of classes in $S_n$ = the number of inequivalent irreducible representations of $S_n$.

Definition 23.16 Young Tableau

A tableau is a Young diagram in which each square is filled with a distinct number from (1, · · · , n).

Young tableau Numbers filled in no particular order
Normal tableau Numbers filled consecutively from left to right and top to bottom
Standard tableau Numbers increasing from left to right in each row and from top to bottom in each column
The symmetric group $S_n$ acts entrywise on the set of all Young tableaux. The normal tableau of shape λ is denoted by $\Theta_\lambda$. An arbitrary Young tableau of shape λ can be obtained through a permutation, $\sigma \cdot \Theta_\lambda$, denoted as $\Theta_\lambda^\sigma$.

Theorem 23.4 Hook length formula

For each cell of the Young diagram with coordinates (i, j) (that is, the cell in the ith row and jth column), the hook $H_\lambda(i, j)$ is the set of cells (a, b) such that a = i and b ≥ j, or a ≥ i and b = j. The hook length $h_\lambda(i, j)$ is the number of cells in the hook $H_\lambda(i, j)$. The hook-length formula expresses the number of standard Young tableaux of shape λ, sometimes denoted by $d_\lambda$, as
\[ d_\lambda = \frac{n!}{\prod_{(i,j)} h_\lambda(i, j)}. \tag{23.12} \]

Example: For a Young diagram of shape (2, 1), the hook length of each cell is given by

    3 1
    1                                   (23.13)

So the number of standard Young tableaux of shape (2, 1) is

\[ d_\lambda = \frac{3!}{3 \cdot 1 \cdot 1} = 2. \tag{23.14} \]
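The hook-length formula is easy to automate. The sketch below, with hypothetical function names, stores a shape as a tuple of row lengths and computes each hook length as arm + leg + 1.

```python
from math import factorial

def hook_lengths(shape):
    """Hook length of every cell of the Young diagram with the given row lengths."""
    # Height of column j = number of rows long enough to reach it.
    col_heights = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    return [[(shape[i] - j - 1) + (col_heights[j] - i - 1) + 1
             for j in range(shape[i])]
            for i in range(len(shape))]

def num_standard_tableaux(shape):
    """d_lambda = n! / (product of all hook lengths)."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product

print(hook_lengths((2, 1)))            # [[3, 1], [1]]
print(num_standard_tableaux((2, 1)))   # 2
# Dimensions of the irreducible representations of S_3, one per partition of 3.
print([num_standard_tableaux(s) for s in [(3,), (2, 1), (1, 1, 1)]])
```

For $S_3$ the three values are 1, 2 and 1, and their squares sum to 6, the order of the group, as required by equation 23.9.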

Definition 23.17 Horizontal and vertical permutations

Let $\Theta_\lambda^\sigma$ be any tableau. A horizontal (vertical) permutation $h_\lambda^\sigma$ ($v_\lambda^\sigma$) is a permutation that does not exchange numbers between different rows (columns). Each cycle in $h_\lambda^\sigma$ ($v_\lambda^\sigma$) must contain numbers that appear in the same row (column).

Definition 23.18
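Symmetrizer $s_\lambda^\sigma \equiv \sum_h h_\lambda^\sigma$
Anti-symmetrizer $a_\lambda^\sigma \equiv \sum_v (-1)^v\, v_\lambda^\sigma$
Irreducible symmetrizer $e_\lambda^\sigma \equiv s_\lambda^\sigma a_\lambda^\sigma = \sum_{h,v} (-1)^v\, h_\lambda^\sigma v_\lambda^\sigma$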

Example: For the normal tableau

    1 2
    3                                   (23.15)

the horizontal and vertical permutations are $h_\lambda = \{e, (12)\}$ and $v_\lambda = \{e, (13)\}$, respectively. The symmetrizer, anti-symmetrizer and irreducible symmetrizer are $s_\lambda = e + (12)$, $a_\lambda = e - (13)$ and $e_\lambda = e + (12) - (13) - (321)$.

For the standard tableau

    1 3
    2                                   (23.16)

the horizontal and vertical permutations are $h_\lambda^{(23)} = \{e, (13)\}$ and $v_\lambda^{(23)} = \{e, (12)\}$, respectively. The symmetrizer, anti-symmetrizer and irreducible symmetrizer are $s_\lambda^{(23)} = e + (13)$, $a_\lambda^{(23)} = e - (12)$ and $e_\lambda^{(23)} = e - (12) + (13) - (123)$.

Theorem 23.5

$e_\lambda$ generates an irreducible representation of $S_n$.

Example: For tableau 23.15, define $e_2 \equiv e_\lambda = e + (12) - (13) - (321)$. We have

\[ e\,e_2 = e_2, \quad (12)\,e_2 = e_2, \quad (23)\,e_2 = (23) + (132) - (123) - (12) \equiv r_2, \]
\[ (31)\,e_2 = -e_2 - r_2, \quad (123)\,e_2 = -e_2 - r_2, \quad (321)\,e_2 = r_2. \tag{23.17} \]

Here (132) and (321) denote the same cycle. So $e_2$ generates an irreducible 2-dimensional representation of $S_3$, spanned by $e_2$ and $r_2$.

Theorem 23.6

1. The irreducible representations generated by $e_\lambda$ and $e_\lambda^\sigma$ are equivalent.
2. $e_\lambda$ and $e_\mu$ generate inequivalent irreducible representations if λ ≠ µ.
3. The set $\{e_\lambda\}$ of all normal tableaux generates all inequivalent irreducible representations.
4. The subspaces $L_\lambda^\sigma \equiv \{\rho \cdot e_\lambda^\sigma \mid \rho \in S_n\}$ associated with distinct standard tableaux $\Theta_\lambda^\sigma$ are linearly independent, and the group algebra of $S_n$ has the direct sum decomposition $\bigoplus_{\lambda,\sigma} L_\lambda^\sigma$.
5. The dimension of the representation generated by $e_\lambda$ is equal to the number of standard Young tableaux of shape λ.

Definition 23.19 General linear group

Let $V_m$ be an m-dimensional vector space with basis {|i⟩, i = 1, · · · , m}. The general linear group GL(m) consists of all invertible linear transformations on $V_m$. The natural m-dimensional representation of GL(m) on $V_m$ is

\[ g\,|i\rangle = |j\rangle\, g_{ji}. \tag{23.18} \]

The natural $m^n$-dimensional representation of GL(m) on the tensor space $V_m^n \equiv \bigotimes_{j=1}^{n} V_m$ is
\[ g\,|i_1 \cdots i_n\rangle = |j_1 \cdots j_n\rangle\, g_{j_1 i_1} \cdots g_{j_n i_n}. \tag{23.19} \]
The action of $S_n$ on $V_m^n$ is defined as

\[ \sigma\,|i_1 \cdots i_n\rangle = |i_{\sigma^{-1}(1)} \cdots i_{\sigma^{-1}(n)}\rangle. \tag{23.20} \]

Theorem 23.7

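1. $T_\lambda(\alpha)$, spanned by $\{\sigma e_\lambda |\alpha\rangle \mid \sigma \in S_n\}$, is an irreducible invariant subspace of $S_n$.

2. $T_\lambda^\sigma$, spanned by $\{e_\lambda^\sigma |\alpha\rangle \mid |\alpha\rangle \in V_m^n\}$, is an irreducible invariant subspace of GL(m). We have the direct sum decomposition $V_m^n = \bigoplus_{\lambda,\sigma} T_\lambda^\sigma$, where every term is associated with a distinct standard tableau.

3. The dimension of $T_\lambda^\sigma$ is
\[ \dim T_\lambda^\sigma = \prod_{(i,j)} \frac{m + j - i}{h_\lambda(i, j)}. \tag{23.21} \]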

Suppose we have two irreducible representations of GL(m). Their product representation is generally reducible. We can decompose the product representation into a direct sum of irreducible representations by the following procedure.

1. In the smaller tableau, assign the same symbol, say a, to all boxes in the first row, the same symbol b to all boxes in the second row, etc.

2. Attach the boxes labeled by a to the second tableau in all possible ways, subject to the rules that no two a's appear in the same column and that the resulting graph is still a Young diagram (i.e., the length of the rows does not increase from top to bottom, there are no more than m rows, etc.). Repeat this process with the b's, etc.

3. After all symbols have been added to the tableau, read the added symbols from right to left in the first row, then in the second row in the same order, and so forth. The resulting sequence of symbols (e.g., aabbac) must form a lattice permutation: to the left of any position in the sequence there are no fewer a's than b's, no fewer b's than c's, etc.

4. Read the added symbols again from top to bottom in the last column, then in the last-but-one column in the same order, and so forth. This sequence of symbols must also form a lattice permutation.

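Example: Let us consider the product representation (2, 1) ⊗ (2, 1) of GL(3), where each Young diagram is written as its list of row lengths.

Stage 0: label the boxes of the second diagram,

    (2, 1) ⊗ [a a / b]                  (23.22)

Stage 1: attach the first a,

    (3, 1), (2, 2), (2, 1, 1)           (23.23)

Stage 2: attach the second a,

    (4, 1), (3, 2), (3, 1, 1), (2, 2, 1)            (23.24)

Stage 3: attach b, discarding diagrams that violate the rules or have more than three rows,

    (4, 2), (4, 1, 1), (3, 3), (3, 2, 1), (3, 2, 1), (2, 2, 2)          (23.25)

So we have

    (2, 1) ⊗ (2, 1) = (4, 2) ⊕ (4, 1, 1) ⊕ (3, 3) ⊕ (3, 2, 1) ⊕ (3, 2, 1) ⊕ (2, 2, 2).          (23.26)

Calculating the dimension of each irreducible representation, we can check that

    8 × 8 = 27 + 10 + 10 + 8 + 8 + 1.   (23.27)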
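The dimension count can be reproduced with the formula $\dim = \prod_{(i,j)} (m + j - i)/h_\lambda(i,j)$ of equation 23.21. The sketch below assumes the six diagrams in the decomposition have row lengths (4,2), (4,1,1), (3,3), (3,2,1), (3,2,1) and (2,2,2); the function name `gl_dimension` is hypothetical.

```python
from fractions import Fraction

def gl_dimension(shape, m):
    """Dimension of the GL(m) irreducible representation with Young diagram `shape`."""
    shape = [r for r in shape if r > 0]
    col_heights = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    dim = Fraction(1)
    for i, row_len in enumerate(shape):
        for j in range(row_len):
            hook = (row_len - j - 1) + (col_heights[j] - i - 1) + 1
            # (m + j - i) is unchanged by the shift from 1-based to 0-based indices.
            dim *= Fraction(m + j - i, hook)
    return int(dim)

shapes = [(4, 2), (4, 1, 1), (3, 3), (3, 2, 1), (3, 2, 1), (2, 2, 2)]
dims = [gl_dimension(s, 3) for s in shapes]
print(dims, sum(dims), gl_dimension((2, 1), 3) ** 2)
```

A diagram with more than m rows gets a vanishing factor, which is why shapes such as (3, 1, 1, 1) drop out for GL(3).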

23.4 Lie Group

23.4.1 Lie groups in general


Definition 23.20 Lie Group

Lie groups G are groups whose elements g ∈ G depend smoothly on a set of continuous real parameters, g = g(α), where $\alpha = \{\alpha_a \mid 1 \le a \le N\}$. In general, we choose the parameters $\{\alpha_a\}$ so that the identity can be expressed as e = g(0). If we find a representation D(G), we have similarly 1 = D(0).

Definition 23.21 Invariant measures and compact groups

To integrate functions F(g) of the group elements over the group G, we need an integration measure dµ(g) to formulate such integrals as $\int_G d\mu(g)\,F(g)$. The measure should be invariant under the group action, i.e., dµ(g) = dµ(g′) where $g' = g_1 g$. For a specific parametrization $\alpha_a$ of the group manifold, we can write $d\mu(g) = \rho(\alpha_a)\,d\alpha_1 \cdots d\alpha_N$. The invariance of the measure requires that $\rho(\alpha_a)\,d\alpha_1 \cdots d\alpha_N = \rho(\alpha'_a)\,d\alpha'_1 \cdots d\alpha'_N$. The group is compact if $\int_G d\mu(g)$ is finite. Most theorems on finite groups also hold for compact groups if we replace the summation $\sum_g$ by the integral $\int_G d\mu(g)$.

Example: The rotation group in 2-dimensional space can be parametrized by the angle of rotation θ:
\[ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad \text{where } 0 \le \theta < 2\pi. \tag{23.28} \]
If R(θ′) = R(ϕ)R(θ), we have θ′ = ϕ + θ. Since ρ(θ) dθ = ρ(θ′) dθ′, we can derive that

\[ \rho(\theta) = \rho(\theta'). \tag{23.29} \]

Setting θ = 0 gives ρ(ϕ) = ρ(0). The measure is determined only up to an overall constant, and so we might as well set ρ(0) = 1. Noting that
\[ \int_0^{2\pi} d\theta = 2\pi, \tag{23.30} \]

we see that the rotation group is compact. We can check that the representation R(θ) of the rotation group is equivalent to a unitary representation, like that of a finite group.

Example: The Lorentz group in 1 + 1 spacetime can be parametrized by the velocity v:

\[ L(v) = \begin{pmatrix} (1 - v^2)^{-1/2} & v(1 - v^2)^{-1/2} \\ v(1 - v^2)^{-1/2} & (1 - v^2)^{-1/2} \end{pmatrix}, \qquad \text{where } -1 < v < 1. \tag{23.31} \]

If L(v′) = L(u)L(v), we have v′ = (v + u)/(1 + uv). Since ρ(v) dv = ρ(v′) dv′, we can derive that
\[ \rho(v) = \rho(v')\,\frac{1 - u^2}{(1 + uv)^2}. \tag{23.32} \]
Setting v = 0 gives $\rho(u) = \rho(0)/(1 - u^2)$. We might as well set ρ(0) = 1. Noting that
\[ \int_{-1}^{+1} dv\,\frac{1}{1 - v^2} = \infty, \tag{23.33} \]

we see that the Lorentz group is not compact.
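The invariance of the measure $dv/(1 - v^2)$ can be checked numerically. The sketch below estimates the Jacobian dv′/dv by a central difference (the step size `h` is an arbitrary choice) and also evaluates the group volume up to a cutoff |v| ≤ V, which equals 2 artanh V.

```python
import math

def compose_velocity(u, v):
    """Velocity addition v' = (v + u) / (1 + u v)."""
    return (v + u) / (1.0 + u * v)

def rho(v):
    """Invariant density with the normalization rho(0) = 1."""
    return 1.0 / (1.0 - v * v)

def jacobian(u, v, h=1e-6):
    """Central-difference estimate of dv'/dv at fixed u."""
    return (compose_velocity(u, v + h) - compose_velocity(u, v - h)) / (2.0 * h)

for u, v in [(0.3, 0.5), (-0.7, 0.2), (0.9, -0.4)]:
    v_prime = compose_velocity(u, v)
    # Invariance: rho(v) dv = rho(v') dv'.
    print(rho(v), rho(v_prime) * jacobian(u, v))

def volume(cutoff):
    """Integral of rho(v) dv over |v| <= cutoff, in closed form."""
    return 2.0 * math.atanh(cutoff)

print([volume(1.0 - 10.0 ** (-k)) for k in (2, 4, 8)])
```

The volume grows without bound as the cutoff approaches 1, which is the non-compactness stated above.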

23.4.2 SO(N )
We define SO(N) as the group of all N-by-N real matrices R satisfying $R^\intercal R = I$ and det R = 1. The elements of SO(N) are represented, by definition, by the N-by-N matrices transforming the N unit basis vectors $\hat{e}_1, \cdots, \hat{e}_N$ into one another. More precisely, the N-dimensional irreducible representation of SO(N) is furnished by a vector. A vector is defined by how it transforms under a rotation:

\[ V'^{\,i} = R^{ij}\,V^j \qquad \text{with } i, j = 1, \cdots, N. \tag{23.34} \]

The group SO(N) can also be represented on tensor space:

\[ T'^{\,i_1 \cdots i_n} = R^{i_1 j_1} \cdots R^{i_n j_n}\,T^{j_1 \cdots j_n}. \tag{23.35} \]

Tensor representations can be reduced into several invariant subspaces by requiring the tensors to have definite symmetry properties under permutation of their indices.
The Kronecker delta $\delta^{ij}$ is invariant under SO(N). So $T^{(ij)\cdots}$ can be further decomposed into its trace $T^{(ij)\cdots}\delta^{ij}$ and its traceless part $T^{(ij)\cdots} - \delta^{ij}\,[T^{(kl)\cdots}\delta^{kl}/N]$.
The Levi-Civita symbol $\epsilon^{i_1 \cdots i_N}$ is also invariant under SO(N). So the representation $T^{[i_1 \cdots i_{N-1}]\cdots}$ is equivalent to $T^{i\cdots}$, $T^{[i_1 \cdots i_{N-2}]\cdots}$ is equivalent to $T^{[ij]\cdots}$, etc.
The rotation group SO(2n) enjoys the additional feature of self-dual and anti-self-dual tensors. Consider the antisymmetric tensor with n indices, $A^{i_1 \cdots i_n}$. Construct the tensor $B^{i_1 \cdots i_n} \equiv i^n\,\epsilon^{i_1 \cdots i_n i_{n+1} \cdots i_{2n}} A^{i_{n+1} \cdots i_{2n}}/n!$ dual to A, denoted as B ∼ ϵA. Then A is dual to B, i.e., A ∼ ϵB. It follows that the two tensors $T_\pm^{i_1 \cdots i_n} \equiv A^{i_1 \cdots i_n} \pm B^{i_1 \cdots i_n}$ are self-dual and anti-self-dual, respectively. Schematically, $\epsilon T_\pm \sim \epsilon(A \pm B) \sim \epsilon A \pm \epsilon B \sim B \pm A \sim \pm(A \pm B) \sim \pm T_\pm$. Clearly, under an SO(2n) transformation, $T_+$ transforms into a linear combination of $T_+$, while $T_-$ transforms into a linear combination of $T_-$. The two tensors correspond to two irreducible representations with dimension $(2n)!/(2(n!)^2)$, not $(2n)!/(n!)^2$.
Example: For SO(3), a pair of antisymmetric indices can always be traded for a single index. The irreducible representations are furnished by totally symmetric traceless tensors carrying n indices, with n an arbitrary positive integer, that is, by a tensor $S^{i_1 \cdots i_n}$ that remains unchanged on the interchange of any pair of indices and that vanishes when any two indices are contracted. The dimension of $S^{i_1 \cdots i_n}$ is 2n + 1.

23.4.3 SU(N )
We define SU(N) as the group of all N-by-N complex matrices U satisfying $U^\dagger U = I$ and det U = 1. By definition, the fundamental representation is furnished by a vector:
\[ V'^{\,i} = U^{i}{}_{j}\,V^{j}. \tag{23.36} \]
The conjugate representation is furnished by the complex conjugate of a vector:
\[ V'_{i} = V_{j}\,(U^\dagger)^{j}{}_{i} \qquad \text{where } V_i \equiv (V^{i})^{*}. \tag{23.37} \]
Their product representations are thus furnished by tensors with upper and lower indices:
\[ V'^{\,i_1 \cdots i_m}_{j_1 \cdots j_n} = U^{i_1}{}_{k_1} \cdots U^{i_m}{}_{k_m}\; V^{k_1 \cdots k_m}_{l_1 \cdots l_n}\; (U^\dagger)^{l_1}{}_{j_1} \cdots (U^\dagger)^{l_n}{}_{j_n}. \tag{23.38} \]
Tensor representations can be reduced by requiring the tensors to have definite symmetry properties under permutation of their upper indices and under permutation of their lower indices.
The Kronecker delta $\delta^i_j$ is invariant under SU(N). So $T^{i\cdots}_{j\cdots}$ can be further decomposed into its trace $T^{i\cdots}_{i\cdots}$ and its traceless part $T^{i\cdots}_{j\cdots} - \delta^i_j\,(T^{k\cdots}_{k\cdots}/N)$.

The Levi-Civita symbols $\epsilon^{i_1 \cdots i_N}$ and $\epsilon_{i_1 \cdots i_N}$ are also invariant under SU(N). Using the two antisymmetric symbols, we can move indices on SU(N) tensors upstairs and downstairs.
Example: For SU(2), because the antisymmetric symbols $\epsilon^{ij}$ and $\epsilon_{ij}$ carry two indices, we can in fact remove all lower indices. Furthermore, it suffices to consider only tensors with all upper indices symmetrized. The irreducible representations are furnished by totally symmetric tensors carrying n upper indices. The dimension of the representation is n + 1.
Since $\epsilon^{ij}\psi_j$ transforms in exactly the same way as $\psi^i$ under SU(2) and $\epsilon^{ij}$ is antisymmetric, the fundamental representation of SU(2) is pseudoreal.
Any hermitean and traceless 2-by-2 matrix X can be written as a linear combination of the three Pauli matrices:
\[ X = x_1\sigma_1 + x_2\sigma_2 + x_3\sigma_3 = \begin{pmatrix} x_3 & x_1 - i x_2 \\ x_1 + i x_2 & -x_3 \end{pmatrix}. \tag{23.39} \]
The determinant of X is $\det X = -\mathbf{x}^2 = -(x_1^2 + x_2^2 + x_3^2)$. Pick an arbitrary element U of SU(2). Consider $X' \equiv U^\dagger X U$. Since X′ is also hermitean and traceless, we can write it as $X' = x'_i \sigma_i$, and x → x′ is a linear transformation. Since det X′ = det X, x′ and x have the same length, thus defining a rotation R. In other words, we can associate an element R of SO(3) with any element U of SU(2). This map f : U → R of SU(2) into SO(3) is actually 2-to-1, since U and −U are mapped into the same R. The unitary group SU(2) is said to double cover the orthogonal group SO(3).
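The map U → R can be made explicit: taking the trace of $\sigma_i X'$ and using $\mathrm{Tr}(\sigma_i\sigma_k) = 2\delta_{ik}$ gives $R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i U^\dagger \sigma_j U)$. The sketch below uses plain nested tuples for 2-by-2 matrices; the axis-angle parametrization of U and the sample values are assumptions chosen for illustration.

```python
import math

# Pauli matrices as 2x2 nested tuples.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def matmul(A, B):
    """Product of two 2x2 complex matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(A):
    """Hermitian conjugate of a 2x2 matrix."""
    return tuple(
        tuple(complex(A[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def su2_element(theta, axis):
    """U = cos(theta/2) 1 - i sin(theta/2) n.sigma for a unit vector n."""
    nx, ny, nz = axis
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (
        (c - 1j * s * nz, -1j * s * (nx - 1j * ny)),
        (-1j * s * (nx + 1j * ny), c + 1j * s * nz),
    )

def rotation_from_su2(U):
    """R_ij = (1/2) Tr(sigma_i U^dagger sigma_j U), so that x'_i = R_ij x_j."""
    Ud = dagger(U)
    R = []
    for i in range(3):
        row = []
        for j in range(3):
            M = matmul(SIGMA[i], matmul(Ud, matmul(SIGMA[j], U)))
            row.append(0.5 * (M[0][0] + M[1][1]).real)
        R.append(row)
    return R

def det3(R):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))

U = su2_element(0.7, (0.6, 0.0, 0.8))
minus_U = tuple(tuple(-z for z in row) for row in U)
R = rotation_from_su2(U)
R_minus = rotation_from_su2(minus_U)
print(det3(R))
print(max(abs(R[i][j] - R_minus[i][j]) for i in range(3) for j in range(3)))
```

Because R is quadratic in U, the sign of U drops out, which is the 2-to-1 property in concrete form.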

Example: For SU(3), a pair of antisymmetric upper (lower) indices can always be traded for a single lower (upper) index. The irreducible representations are furnished by traceless tensors $\phi^{i_1 \cdots i_m}_{j_1 \cdots j_n}$ with all upper indices symmetrized and all lower indices symmetrized. The dimension of the representation is (m + 1)(n + 1)(m + n + 2)/2. It can be denoted by a Young diagram with m + n boxes in the first row and n boxes in the second row.
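The dimension formula can be cross-checked against the Young diagram description. The sketch below compares (m + 1)(n + 1)(m + n + 2)/2 with the hook-type product $\prod (3 + j - i)/h(i,j)$ for the diagram with rows (m + n, n); both function names are hypothetical.

```python
from fractions import Fraction

def su3_dimension(m, n):
    """Dimension of the SU(3) tensor with m upper and n lower symmetrized indices."""
    return (m + 1) * (n + 1) * (m + n + 2) // 2

def diagram_dimension(shape, size=3):
    """Product over cells of (size + j - i) / hook length, for row lengths `shape`."""
    shape = [r for r in shape if r > 0]
    if not shape:
        return 1  # empty diagram: the trivial representation
    col_heights = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    dim = Fraction(1)
    for i, row_len in enumerate(shape):
        for j in range(row_len):
            hook = (row_len - j - 1) + (col_heights[j] - i - 1) + 1
            dim *= Fraction(size + j - i, hook)
    return int(dim)

for m, n in [(1, 0), (0, 1), (1, 1), (3, 0), (2, 2)]:
    print((m, n), su3_dimension(m, n), diagram_dimension((m + n, n)))
```

The pairs (1, 0), (1, 1), (3, 0) and (2, 2) give the familiar 3, 8, 10 and 27.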

23.5 Lie algebra

23.5.1 Lie algebra in general

Definition 23.22 Generators of Lie group

In some neighborhood of the identity, the elements of a Lie group G or its representation D(G) can be Taylor expanded as

\[ D(d\alpha) = 1 + i\,d\alpha_a X^a + O\!\left(d\alpha^2\right), \tag{23.40} \]

where
\[ X^a = -i\,\left.\frac{\partial D(\alpha)}{\partial \alpha_a}\right|_{\alpha = 0} \tag{23.41} \]

are called the generators of the group G in its representation D(G). The $X^a$ are independent of one another. The representation of the group elements for finite parameters $\alpha = \{\alpha_a\}$ can be defined as $D(\alpha) = \exp(i\alpha_a X^a)$. This procedure is called the exponential mapping.

Definition 23.23 Lie algebra

The generators of the Lie group G form a closed algebra under the Lie bracket [A, B] = AB − BA. It is called the Lie algebra. Lie algebras are generally written as
\[ \left[X^a, X^b\right] = i f^{abc} X^c, \tag{23.42} \]

where the coefficients $f^{abc}$ are real and are known as the structure constants of the Lie group G.

Proposition 23.7

1. $f^{abc} = -f^{bac}$.
2. The generators of a unitary representation of a Lie group are hermitian matrices.
3. The structure constants satisfy the so-called Jacobi identity,

\[ f^{abd} f^{dcg} + f^{bcd} f^{dag} + f^{cad} f^{dbg} = 0. \tag{23.43} \]

Definition 23.24 Adjoint representation

Define $(T^a)^{bd} \equiv -i f^{abd}$. We have $[T^a, T^c] = i f^{acd}\,T^d$ from the Jacobi identity. Thus $T^a$ is a representation of the Lie algebra, called the adjoint representation.

Definition 23.25 Simple Lie algebra

Invariant subalgebra An invariant subalgebra is a set of generators $H = \{X^a\}$ which goes into itself under Lie brackets with any element $Y^b$ of the whole algebra. The subgroup generated by an invariant subalgebra is an invariant subgroup. The whole algebra and the null set are two trivial invariant subalgebras.
Simple Lie algebra A non-abelian Lie algebra which has no nontrivial invariant subalgebras is called a simple Lie algebra. A simple Lie algebra generates a simple Lie group, which does not have nontrivial connected invariant subgroups.


Note: A simple Lie group may contain discrete invariant subgroups, hence being a simple Lie group is
different from being simple as an abstract group.

Definition 23.26 Semisimple Lie algebra

Abelian invariant subalgebra: An abelian invariant subalgebra consists of a single generator which commutes with all of the generators of the Lie group G. We call such a subalgebra a U(1) factor of the group.
Semisimple Lie algebra: Lie algebras without abelian invariant subalgebras are called semisimple Lie algebras.

Definition 23.27 Cartan metric

The Cartan metric of a Lie algebra is defined as

$$ g^{ab} \equiv \mathrm{Tr}(T^a T^b) = -f^{acd} f^{bdc}. \tag{23.44} $$

Proposition 23.8

The following conditions are equivalent:


• The Lie algebra is semisimple.
• The Lie algebra is a direct sum of simple Lie algebras.
• The Cartan metric of the Lie algebra is non-degenerate.
• Every representation of the Lie algebra is completely reducible.

For a semisimple algebra, the real symmetric object $g_{ab}$ and its inverse $g^{ab}$ can be used to raise and lower indices and to take scalar products, e.g., $f^{abc} \equiv g^{dc} f^{abd} = -i\,\mathrm{Tr}(T^a T^b T^c - T^a T^c T^b)$. Clearly, $f^{abc}$ is totally antisymmetric.

23.5.2 Compact Lie algebra


Definition 23.28 Compact Lie algebra

A Lie algebra is compact if the Cartan metric on it is positive definite.


Note:

• A compact Lie algebra must be the Lie algebra of a compact semisimple Lie group.

• The Cartan metric on the Lie algebra of a compact Lie group is positive semidefinite, not positive
definite in general.

• The Cartan metric on the Lie algebra of a non­compact Lie group may be positive semidefinite as
well.

Definition 23.29 Cartan subalgebra

In any Lie group, a maximal set of mutually commuting generators $H^i$ ($i = 1, 2, \cdots, l$) generates an abelian subalgebra which is called the Cartan subalgebra. The number of generators in the Cartan subalgebra is the rank of the corresponding Lie algebra. The remaining $(n - l)$ generators are denoted by $E^a$.

From now on, we choose a basis of the compact Lie algebra such that $g^{ab} = \delta^{ab}$, so that all $T^a$ are hermitean. The matrices $T^i$ commute with one another by definition, and hence these $l$ matrices can be simultaneously diagonalized by choosing a new basis for the $E^a$. Denote the diagonal elements of $T^i$ by $-\beta^i(a)$. These $l$ matrices are thus given by

$$ (T^i)_{ab} = -\beta^i(a)\,\delta^a_b. \tag{23.45} $$

Note that $\beta^i(a) = 0$ for $a \le l$.

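The Lie bracket of $H^i$ and $X^a$ is

$$ [H^i, X^a] = i f^{iab} X^b = \beta^i(a) X^a. \tag{23.46} $$

Suppose the vector $|\lambda\rangle$ is a simultaneous eigenvector of $(H^1, \cdots, H^l)$ with eigenvalues $(\lambda^1, \cdots, \lambda^l)$. It follows that

$$ X^a |\lambda\rangle = |\lambda + \beta(a)\rangle \quad \text{or} \quad X^a |\lambda\rangle = 0. \tag{23.47} $$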

If X a ∈ E, we can say β(a) is a root of the Lie algebra and X a can be denoted as Eβ . Vectors
λ depend on the representation of the Lie algebra, called weight vectors. Clearly, roots of a Lie
algebra are the nonzero weights of its adjoint representation.

If [H i , Eβ ] = β i Eβ , we have [H i , Eβ† ] = −β i Eβ† . So −β is also a root of the Lie algebra.



Given two roots α and β, we have

$$ [H^i, [E_\alpha, E_\beta]] = (\alpha^i + \beta^i)[E_\alpha, E_\beta]. \tag{23.48} $$

We have three cases:


• α + β is a root vector of Lie algebra, and so [Eα , Eβ ] = Nα,β Eα+β .
• α + β = 0 and so [Eα , E−α ] = αi H i .
• α + β is not a root vector of Lie algebra, and so [Eα , Eβ ] = 0.
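It can be shown that $\alpha_i = \alpha^i$ if we take the normalization $\mathrm{Tr}(E_\alpha E_{-\alpha}) = 1$.
Now, we conclude that a general compact Lie algebra is defined by

$$ [H^i, H^j] = 0, \quad [H^i, E_\alpha] = \alpha^i E_\alpha, \quad [E_\alpha, E_\beta] = N_{\alpha,\beta} E_{\alpha+\beta}, \quad [E_\alpha, E_{-\alpha}] = \alpha_i H^i. \tag{23.49} $$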
Note that Nα,β = 0 if and only if α + β is not a root.
A more general discussion of the root system of a semisimple Lie algebra can be found on Wikipedia.

23.5.3 SO(N )
In the fundamental representation of SO(N ), the Lie algebra consists of all N -by-N antisym-
metric hermitean matrices. We choose the following bases for the Lie algebra:

(Jmn )ij = −i(δ mi δ nj − δ mj δ ni ) where m < n. (23.50)

The commutator of the Lie algebra is

[Jmn , Jpq ] = i(δmp Jnq + δnq Jmp − δnp Jmq − δmq Jnp ). (23.51)
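
The commutator (23.51) can be verified directly from the basis (23.50). The sketch below does this numerically; the function names `so_generator` and `check_so_algebra` are illustrative, indices are 0-based, and $J_{mn}$ is extended to all index pairs through $J_{nm} = -J_{mn}$ so that every term of (23.51) is defined.

```python
import itertools
import numpy as np

def so_generator(m, n, N):
    """(J_mn)_ij = -i (delta_mi delta_nj - delta_mj delta_ni), with 0-based indices."""
    J = np.zeros((N, N), dtype=complex)
    J[m, n] += -1j
    J[n, m] += 1j      # for m == n the two terms cancel, giving the zero matrix
    return J

def check_so_algebra(N):
    """Return True if equation (23.51) holds for all index combinations."""
    d = np.eye(N)
    J = [[so_generator(m, n, N) for n in range(N)] for m in range(N)]
    for m, n, p, q in itertools.product(range(N), repeat=4):
        lhs = J[m][n] @ J[p][q] - J[p][q] @ J[m][n]
        rhs = 1j * (d[m, p] * J[n][q] + d[n, q] * J[m][p]
                    - d[n, p] * J[m][q] - d[m, q] * J[n][p])
        if not np.allclose(lhs, rhs):
            return False
    return True

print(check_so_algebra(4), check_so_algebra(5))
```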

Of the N (N − 1)/2 generators, {J12 , J34 , · · · , J2l−1,2l } form a maximal subset of mutually
commuting generators (N = 2l or 2l + 1). Diagonalize them simultaneously, and call them
H 1 , H 2 , · · · , H l respectively.
For N = 2l, we have

H 1 = diag(1, −1, 0, 0, · · · , 0, 0), ···, H l = diag(0, 0, · · · , 0, 0, 1, −1), (23.52)

from which we read off the 2l weights for the fundamental representation:

$$ w^1 = (1, 0, \cdots, 0), \quad w^2 = (-1, 0, \cdots, 0), \quad \cdots, \quad w^{2l-1} = (0, \cdots, 0, 1), \quad w^{2l} = (0, \cdots, 0, -1). \tag{23.53} $$

Let us write the 2l weights more compactly as ±ei in terms of the l unit vectors ei , for i =
1, · · · , l. The root vectors take us from one state to another, and hence they are given by the
differences between the weights, namely,

± ei ± ej (signs uncorrelated) (i < j). (23.54)



We take the positive roots to be ei ± ej (a negative root must be opposite to a positive root). A
simple root is a positive root that cannot be written as a sum of two positive roots with positive
coefficients. The simple roots are then

ei−1 − ei , el−1 + el where i = 2, · · · , l. (23.55)
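
The selection of simple roots can be reproduced by brute force. The following sketch assumes the positive roots $e^i \pm e^j$ ($i < j$) stated above, stores them as integer tuples, and keeps those that are not the sum of two positive roots; the function names are illustrative.

```python
import itertools

def positive_roots_so_even(l):
    """Positive roots e^i +/- e^j (i < j) of SO(2l), as integer tuples of length l."""
    roots = []
    for i, j in itertools.combinations(range(l), 2):
        for sign in (+1, -1):
            v = [0] * l
            v[i], v[j] = 1, sign
            roots.append(tuple(v))
    return roots

def simple_roots(positive):
    """Keep the positive roots that cannot be written as a sum of two positive roots."""
    sums = {tuple(a + b for a, b in zip(r, s))
            for r in positive for s in positive}
    return [r for r in positive if r not in sums]

for root in simple_roots(positive_roots_so_even(4)):
    print(root)
```

For $l = 4$ this leaves four roots, in agreement with (23.55): three of the form $e^{i-1} - e^i$ and the single root $e^3 + e^4$.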

Figure 23.1: Weight diagram (in fundamental representation) and root diagram of SO(4).
Note that no root takes the weight w1 into w2 ; this is because rotations cannot transform
x1 ± ix2 into each other. Similarly for x3 ± ix4 .

For N = 2l + 1, we have

H 1 = diag(1, −1, 0, 0, · · · , 0, 0, 0), ···, H l = diag(0, 0, · · · , 0, 0, 1, −1, 0), (23.56)

from which we read off the 2l + 1 weights for the fundamental representation:

w1 = (1, 0, · · · , 0), w2 = (−1, 0, · · · , 0), ···,


w2l−1 = (0, · · · , 0, 1), w2l = (0, · · · , 0, −1), w2l+1 = (0, · · · , 0, 0). (23.57)

Figure 23.2: Weight diagram (in fundamental representation) and root diagram of SO(5).

In terms of ei , the roots are given by

± ei ± ej (signs uncorrelated) (i < j), ±ei . (23.58)

We take the positive roots to be ei ± ej , (i < j) and ei . The simple roots are then

ei−1 − ei , el where i = 2, · · · , l. (23.59)



23.5.4 SU(N )
In the fundamental representation of SU(N ), the Lie algebra consists of all N -by-N traceless
hermitean matrices. Evidently, there are l = N − 1 traceless N -by-N matrices that commute
with one another and hence can be simultaneously diagonalized. They are
$$ H^1 = \mathrm{diag}(1, -1, 0, \cdots, 0)/\sqrt{2}, \quad H^2 = \mathrm{diag}(1, 1, -2, 0, \cdots, 0)/\sqrt{6}, \quad \cdots, $$
$$ H^{l-1} = \mathrm{diag}(1, 1, 1, \cdots, -(l-1), 0)/\sqrt{(l-1)l}, \quad H^l = \mathrm{diag}(1, 1, 1, \cdots, 1, -l)/\sqrt{l(l+1)}. \tag{23.60} $$

The weights of the N = l + 1 different states in the fundamental representation are


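$$ w^1 = \sqrt{2}\left(\frac{1}{2}, \frac{1}{2\sqrt{3}}, \cdots, \frac{1}{\sqrt{2l(l-1)}}, \frac{1}{\sqrt{2l(l+1)}}\right), \quad w^2 = \sqrt{2}\left(-\frac{1}{2}, \frac{1}{2\sqrt{3}}, \cdots, \frac{1}{\sqrt{2l(l-1)}}, \frac{1}{\sqrt{2l(l+1)}}\right), $$
$$ w^3 = \sqrt{2}\left(0, -\frac{1}{\sqrt{3}}, \cdots, \frac{1}{\sqrt{2l(l-1)}}, \frac{1}{\sqrt{2l(l+1)}}\right), \quad \cdots, $$
$$ w^l = \sqrt{2}\left(0, 0, \cdots, -\frac{l-1}{\sqrt{2l(l-1)}}, \frac{1}{\sqrt{2l(l+1)}}\right), \quad w^{l+1} = \sqrt{2}\left(0, 0, \cdots, 0, -\frac{l}{\sqrt{2l(l+1)}}\right). \tag{23.61} $$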

The root vectors are given by $w^m - w^n$, $m, n = 1, \cdots, N = l + 1$. We take the positive roots to be $w^m - w^n$ ($m < n$). The simple roots are then $w^m - w^{m+1}$ for $m = 1, 2, \cdots, N - 1$.

Figure 23.3: Weight diagram (in fundamental representation) and root diagram of SU(3).

We find that the simple roots of SU(l + 1) satisfy

$$ \alpha_i^2 = 2, \; i = 1, \cdots, l \quad \text{and} \quad \alpha_i \cdot \alpha_{i+1} = -1, \; i = 1, \cdots, l - 1. \tag{23.62} $$

The simple roots of SU(l + 1) can be written in a more elegant form by going to a space one dimension higher. Let $e^i$ ($i = 1, \cdots, l + 1$) denote unit vectors living in $(l + 1)$-dimensional space. Then $(e^i - e^{i+1})^2 = 2$, and $(e^i - e^{i+1}) \cdot (e^j - e^{j+1}) = -1$ if $j = i \pm 1$ and 0 otherwise. The $l$ simple roots of SU(l + 1) are then given by

$$ \alpha_i = e^i - e^{i+1}, \quad i = 1, \cdots, l. \tag{23.63} $$

Note that the simple roots live in the $l$-dimensional hyperplane perpendicular to the vector $\sum_j e^j$.
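
The inner products (23.62) and the orthogonality to $\sum_j e^j$ follow immediately from the embedding (23.63). A short numerical sketch (the function name `su_simple_roots` is illustrative) for SU(4), i.e. $l = 3$:

```python
import numpy as np

def su_simple_roots(l):
    """Rows are alpha_i = e^i - e^{i+1}, i = 1..l, embedded in R^{l+1}."""
    alphas = np.zeros((l, l + 1))
    for i in range(l):
        alphas[i, i], alphas[i, i + 1] = 1.0, -1.0
    return alphas

alphas = su_simple_roots(3)           # simple roots of SU(4)
gram = alphas @ alphas.T              # matrix of inner products alpha_i . alpha_j
overlap = alphas @ np.ones(4)         # inner products with the vector sum_j e^j
print(gram)
print(overlap)
```

The diagonal of `gram` holds $\alpha_i^2$, the first off-diagonals hold $\alpha_i \cdot \alpha_{i+1}$, and `overlap` vanishes because every simple root lies in the hyperplane.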

23.5.5 Sp(2l)
We define Sp(2l) as the group of all 2l-by-2l complex matrices $U$ satisfying $U^\dagger U = I$ and $U^T J U = J$, where

$$ J \equiv \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}_{2l \times 2l}. \tag{23.64} $$

In the fundamental representation, the Lie algebra of Sp(2l) consists of all 2l-by-2l hermitean matrices satisfying $H^T = JHJ$. Thus, the general form of the generators is given by

$$ \begin{pmatrix} P & W^* \\ W & -P^T \end{pmatrix}, \tag{23.65} $$
where $P$ is hermitean and $W$ is symmetric. The generators can also be represented in the direct product notation as

$$ iA \otimes I, \quad S_1 \otimes \sigma_1, \quad S_2 \otimes \sigma_2, \quad S_3 \otimes \sigma_3, \tag{23.66} $$

where $A$ is an arbitrary real l-by-l antisymmetric matrix and $S_1$, $S_2$, and $S_3$ are three arbitrary real l-by-l symmetric matrices.
Of the l(2l + 1) generators, a maximal subset of mutually commuting generators could be
{u1 ⊗ σ3 , · · · , ul ⊗ σ3 }, where ui denotes the l-by-l diagonal matrix with a single entry equal
to 1 in the ith row and ith column. So the weights of the 2l different states in the fundamental
representation are

$$ w^1 = (1, 0, \cdots, 0, 0), \quad w^2 = (0, 1, \cdots, 0, 0), \quad \cdots, \quad w^l = (0, 0, \cdots, 0, 1), $$
$$ w^{l+1} = (-1, 0, \cdots, 0, 0), \quad w^{l+2} = (0, -1, \cdots, 0, 0), \quad \cdots, \quad w^{2l} = (0, 0, \cdots, 0, -1). \tag{23.67} $$

Figure 23.4: Weight diagram (in fundamental representation) and root diagram of Sp(4).
The root diagram of Sp(4) is the same as that of SO(5), indicating the local isomorphism
Sp(4) ≃ SO(5).

In terms of ei , the roots are given by

± ei ± ej (signs uncorrelated) (i < j), ±2ei . (23.68)

We take the positive roots to be ei ± ej , (i < j) and 2ei . The simple roots are then

ei−1 − ei , 2el where i = 2, · · · , l. (23.69)



23.5.6 Classification of compact Lie algebras


Given a general Lie algebra, consider the sequence of nested commutators $[E_\alpha, [E_\alpha, \cdots [E_\alpha, E_\beta] \cdots]]$ with $\alpha \neq \beta$. We encounter $E_{\alpha+\beta}$, $E_{2\alpha+\beta}$ and so on. Eventually, we must reach 0, since the algebra has a finite number of generators. Denote by $p$ the maximum number of $E_\alpha$'s in this chain before it vanishes. Similarly, we can consider the sequence $[E_{-\alpha}, [E_{-\alpha}, \cdots [E_{-\alpha}, E_\beta] \cdots]]$. Denote by $q$ the maximum number of $E_{-\alpha}$'s.

Define
M (k, α, β) ≡ Nα,kα+β N−α,(k+1)α+β . (23.70)
From Jacobi identity

[Ekα+β , [Eα , E−α ]] + [Eα , [E−α , Ekα+β ]] + [E−α , [Ekα+β , Eα ]] = 0, (23.71)

we can get
M (k − 1, α, β) = M (k, α, β) + kα · α + α · β. (23.72)
From
[Eα , Epα+β ] = 0, [E−α , E−qα+β ] = 0, (23.73)
we have
M (p, α, β) = 0, M (−q − 1, α, β) = 0. (23.74)
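Using equations 23.72 and 23.74, we obtain

$$ M(p - s, \alpha, \beta) = s\left[\alpha \cdot \alpha \left(p - \frac{1}{2}(s - 1)\right) + \alpha \cdot \beta\right]. \tag{23.75} $$

Choosing $s = p + q + 1$, we find that

$$ \frac{\alpha \cdot \beta}{\alpha \cdot \alpha} = \frac{q - p}{2} \equiv \frac{n}{2}. \tag{23.76} $$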
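
Relation (23.76) can be tested on a concrete root system. The sketch below assumes the SO(5) roots quoted in (23.58) for $l = 2$, counts how many times $\alpha$ can be added to ($p$) or subtracted from ($q$) a root $\beta$ while staying inside the root system, and compares with $\alpha \cdot \beta / \alpha \cdot \alpha$. The function names are illustrative.

```python
import itertools
from fractions import Fraction

def so5_roots():
    """Roots of SO(5): +/- e^1 +/- e^2 and +/- e^i, as integer tuples."""
    long_roots = [(s, t) for s in (1, -1) for t in (1, -1)]
    short_roots = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    return long_roots + short_roots

def chain_lengths(alpha, beta, roots):
    """Return (p, q): how many times alpha can be added to / subtracted from beta."""
    root_set = set(roots)

    def steps(sign):
        k = 0
        while tuple(b + sign * (k + 1) * a for a, b in zip(alpha, beta)) in root_set:
            k += 1
        return k

    return steps(+1), steps(-1)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

roots = so5_roots()
checks = []
for alpha, beta in itertools.permutations(roots, 2):
    if beta == tuple(-a for a in alpha):
        continue                      # the chain argument excludes beta = -alpha
    p, q = chain_lengths(alpha, beta, roots)
    checks.append(Fraction(dot(alpha, beta), dot(alpha, alpha)) == Fraction(q - p, 2))

print(len(checks), all(checks))
```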

Next, we can repeat the same argument with the roles of α and β interchanged, leading to

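$$ \frac{\alpha \cdot \beta}{\beta \cdot \beta} = \frac{q' - p'}{2} \equiv \frac{m}{2}. \tag{23.77} $$

As a result, root vectors satisfy

$$ \cos^2 \theta_{\alpha\beta} = \frac{(\alpha \cdot \beta)^2}{(\alpha \cdot \alpha)(\beta \cdot \beta)} = \frac{mn}{4} \le 1, \quad \rho_{\alpha\beta} \equiv \frac{\alpha \cdot \alpha}{\beta \cdot \beta} = \frac{m}{n}. \tag{23.78} $$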

There are only four possible angles between α and β (we can always take θαβ to be acute, by
flipping α if necessary):

• θαβ = π/2 and ραβ is indeterminate

• θαβ = π/3 and ραβ = 1

• θαβ = π/4 and ραβ = 2



• θαβ = π/6 and ραβ = 3
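
The four cases listed above are simply the integer solutions of (23.78). A small enumeration, assuming the conventions $m \ge n \ge 0$ (that is, $\theta_{\alpha\beta}$ not obtuse and $\alpha$ the longer root) and excluding $mn = 4$, which would make $\beta$ parallel to $\alpha$:

```python
import math
from fractions import Fraction

def allowed_angles():
    """Enumerate (m, n, theta, rho) with cos^2(theta) = m*n/4 < 1 and rho = m/n."""
    results = []
    for m in range(0, 5):
        for n in range(0, m + 1):          # take m >= n, i.e. |alpha| >= |beta|
            if (m == 0) != (n == 0):       # alpha . beta = 0 forces m = n = 0
                continue
            if m * n >= 4:                 # cos^2 = 1 would mean beta parallel to alpha
                continue
            theta = math.acos(math.sqrt(m * n) / 2)
            rho = Fraction(m, n) if n else None   # None marks the indeterminate ratio
            results.append((m, n, theta, rho))
    return results

for m, n, theta, rho in allowed_angles():
    print(m, n, round(math.degrees(theta)), rho)
```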


We have following theorems about roots.

Proposition 23.9

1. If α and β are roots, vector β ′ ≡ β − 2α(α · β)/(α · α) must also be a root.


2. The chain β + kα can contain at most four roots.
3. A root diagram contains at most two different lengths.
4. A rank l algebra has l simple roots, which can be used as the basis vectors for the space the root vectors live in.
5. The angle between two simple roots has to be obtuse or right.
6. If α and β are two simple roots and θαβ = π/2, [Eα , Eβ ] = 0.

The Dynkin diagram of the Lie algebra can be drawn as follows.


1. Draw a small circle for each simple root.
2. Connect the two circles corresponding to two simple roots by one, two, or three lines if the angle between them is 2π/3, 3π/4, or 5π/6, respectively.
3. Do not connect the two circles if the angle between them is π/2.
4. Fill the circle (that is, darken it with ink) of the short root.
Dynkin diagrams of SU(l + 1), SO(2l + 1), Sp(2l) and SO(2l) are shown in Figure 23.5.

Figure 23.5: Dynkin diagrams of classical Lie algebras.

From Figure 23.5, we find following local isomorphisms:


SO(3) ≃ SU(2) ≃ Sp(2), SO(4) ≃ SU(2)⊗SU(2), SO(5) ≃ Sp(4),
SO(6) ≃ SU(4).
(23.79)
The Dynkin diagram of simple Lie algebra must be connected. So SO(4) is semisimple, not
simple.

We have following theorems about Dynkin diagrams.

Theorem 23.8

The no-loop theorem Loops are not allowed in Dynkin diagrams.


The no-more-than-three lines theorem The number of lines coming out of a small circle in a Dynkin diagram cannot be more than three.
The shrinking theorem Shrinking a linear chain of circles connected to one another
by a single line to just one circle leads to a valid Dynkin diagram.

Using these theorems, we can enumerate all exceptional compact simple Lie algebras, as shown
in Figure 23.6. The details can be found in section VI.5 of Group Theory in a Nutshell for
Physicists (A.Zee).

Figure 23.6: Dynkin diagrams of exceptional Lie algebras.

23.6 Spinor representations of orthogonal algebras


Spinor Representations
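Define

$$ \gamma^{(1)}_1 \equiv \sigma_1, \quad \gamma^{(1)}_2 \equiv \sigma_2 \tag{23.80} $$

and

$$ \gamma^{(n+1)}_j \equiv \gamma^{(n)}_j \otimes \sigma_3 \; (j = 1, \cdots, 2n), \quad \gamma^{(n+1)}_{2n+1} \equiv I \otimes \sigma_1, \quad \gamma^{(n+1)}_{2n+2} \equiv I \otimes \sigma_2. \tag{23.81} $$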

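So for every $n$, we have $2n$ hermitean matrices of size $2^n$-by-$2^n$,

$$ \gamma_{2k-1} = 1 \otimes 1 \otimes \cdots \otimes \sigma_1 \otimes \sigma_3 \otimes \cdots \otimes \sigma_3, \quad \gamma_{2k} = 1 \otimes 1 \otimes \cdots \otimes \sigma_2 \otimes \sigma_3 \otimes \cdots \otimes \sigma_3, \tag{23.82} $$

with 1 appearing $k - 1$ times and $\sigma_3$ appearing $n - k$ times. The anticommutators of these gamma matrices are

$$ \{\gamma_i, \gamma_j\} = 2\delta_{ij} I. \tag{23.83} $$

Define

$$ \sigma_{ij} \equiv -\frac{i}{2}[\gamma_i, \gamma_j]. \tag{23.84} $$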

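These sigma matrices are hermitean and satisfy

$$ [\sigma_{ij}, \gamma_k] = 2i(\delta_{ik}\gamma_j - \delta_{jk}\gamma_i) \tag{23.85} $$

and

$$ [\sigma_{ij}, \sigma_{kl}] = 2i(\delta_{ik}\sigma_{jl} - \delta_{jk}\sigma_{il} - \delta_{il}\sigma_{jk} + \delta_{jl}\sigma_{ik}). \tag{23.86} $$

As a result, the matrices $\sigma_{ij}/2$ define a $2^n$-dimensional representation of the Lie algebra SO(2n). The sigma matrices can also be obtained by induction:

$$ \sigma^{(n+1)}_{ij} = \sigma^{(n)}_{ij} \otimes 1, \quad \sigma^{(n+1)}_{i,2n+1} = \gamma^{(n)}_i \otimes \sigma_2, \quad \sigma^{(n+1)}_{i,2n+2} = -\gamma^{(n)}_i \otimes \sigma_1, \quad \sigma^{(n+1)}_{2n+1,2n+2} = I \otimes \sigma_3, \tag{23.87} $$

where $\sigma^{(1)}_{12} = \sigma_3$.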
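
The recursive construction lends itself to a direct numerical check. The sketch below builds the gamma matrices with `numpy.kron` following (23.80) and (23.81), then tests the Clifford algebra (23.83) and the commutator (23.86). The function names are illustrative, and indices are 0-based in the code.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def gamma_matrices(n):
    """Build the 2n gamma matrices of size 2^n by the recursion (23.80)-(23.81)."""
    gammas = [s1, s2]
    for k in range(1, n):
        identity = np.eye(2 ** k)
        gammas = ([np.kron(g, s3) for g in gammas]
                  + [np.kron(identity, s1), np.kron(identity, s2)])
    return gammas

def sigma(gammas, i, j):
    """sigma_ij = -(i/2) [gamma_i, gamma_j]."""
    return -0.5j * (gammas[i] @ gammas[j] - gammas[j] @ gammas[i])

def check_clifford(n):
    """Return True if {gamma_i, gamma_j} = 2 delta_ij I for all i, j."""
    g = gamma_matrices(n)
    return all(
        np.allclose(g[i] @ g[j] + g[j] @ g[i], 2 * (i == j) * np.eye(2 ** n))
        for i in range(2 * n) for j in range(2 * n)
    )

def check_sigma_algebra(n):
    """Return True if equation (23.86) holds for all index combinations."""
    g = gamma_matrices(n)
    N = 2 * n
    d = np.eye(N)
    s = [[sigma(g, i, j) for j in range(N)] for i in range(N)]
    for i in range(N):
        for j in range(N):
            for k in range(N):
                for l in range(N):
                    lhs = s[i][j] @ s[k][l] - s[k][l] @ s[i][j]
                    rhs = 2j * (d[i, k] * s[j][l] - d[j, k] * s[i][l]
                                - d[i, l] * s[j][k] + d[j, l] * s[i][k])
                    if not np.allclose(lhs, rhs):
                        return False
    return True

print(check_clifford(3), check_sigma_algebra(3))
```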
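Consider the unitary transformation

$$ \psi \to e^{i\omega_{ij}\sigma_{ij}/4}\psi \tag{23.88} $$

with $\omega_{ij} = -\omega_{ji}$ a set of real numbers. The conjugate of $\psi$ transforms as $\psi^\dagger \to \psi^\dagger e^{-i\omega_{ij}\sigma_{ij}/4}$. So for infinitesimal $\omega_{ij}$, we have

$$ \psi^\dagger \gamma_k \psi \to \psi^\dagger \gamma_k \psi + \omega_{ki}\psi^\dagger \gamma_i \psi, \tag{23.89} $$

i.e., $v_k = \psi^\dagger \gamma_k \psi$ transforms like a vector in 2n-dimensional space. Under a complete rotation of the vector through 2π, we have $\psi \to -\psi$. Thus, the spinor $\psi$ furnishes a representation of the double covering of the group SO(2n).
Define the hermitean matrix

$$ \gamma_F \equiv (-i)^n \gamma_1 \cdots \gamma_{2n} = \sigma_3 \otimes \sigma_3 \otimes \cdots \otimes \sigma_3. \tag{23.90} $$

It anticommutes with all the gamma matrices. Define

$$ P_\pm = \frac{1}{2}(I \pm \gamma_F). \tag{23.91} $$

We have $P_\pm^2 = P_\pm$, $P_+ P_- = P_- P_+ = 0$ and $P_+ + P_- = I$. Thus $P_\pm$ are projection matrices. Now we decompose the spinor into a left handed part $\psi_L = P_- \psi$ and a right handed part $\psi_R = P_+ \psi$. We deduce that

$$ \psi_L \to P_- e^{i\omega_{ij}\sigma_{ij}/4}\psi_L = e^{i\omega_{ij}\sigma_{ij}/4}\psi_L, \quad \psi_R \to P_+ e^{i\omega_{ij}\sigma_{ij}/4}\psi_R = e^{i\omega_{ij}\sigma_{ij}/4}\psi_R. \tag{23.92} $$

In other words, $\psi_L$ and $\psi_R$ transform separately and just like $\psi$. The irreducible spinor representation of SO(2n) has dimension $2^{n-1}$ with representation matrices given by $e^{i\omega_{ij}\sigma_{ij}/4}P_\pm$.
In the spinor representation, the weights of the Lie algebra SO(2n) are

$$ |\epsilon_1 \epsilon_2 \cdots \epsilon_n\rangle \quad \text{where} \quad \sigma_{2k-1,2k}|\epsilon_1 \epsilon_2 \cdots \epsilon_n\rangle = \epsilon_k |\epsilon_1 \epsilon_2 \cdots \epsilon_n\rangle. \tag{23.93} $$

Each of the $\epsilon$'s takes on the values ±1. It can be checked that

$$ \gamma_F |\epsilon_1 \epsilon_2 \cdots \epsilon_n\rangle = \left(\prod_{i=1}^{n} \epsilon_i\right) |\epsilon_1 \epsilon_2 \cdots \epsilon_n\rangle. \tag{23.94} $$

Thus, right handed spinors $S^+$ consist of states with $\prod_{i=1}^{n} \epsilon_i = 1$, and left handed spinors $S^-$ consist of states with $\prod_{i=1}^{n} \epsilon_i = -1$.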
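
The properties of the chirality matrix can be confirmed numerically. The sketch below rebuilds the gamma matrices from the recursion given earlier in this section, forms $(-i)^n \gamma_1 \cdots \gamma_{2n}$, and obtains the dimensions of the two handed subspaces as traces of the projectors. All function names are illustrative.

```python
import functools
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def gamma_matrices(n):
    """Gamma matrices of size 2^n built by tensoring with sigma_3, sigma_1, sigma_2."""
    gammas = [s1, s2]
    for k in range(1, n):
        identity = np.eye(2 ** k)
        gammas = ([np.kron(g, s3) for g in gammas]
                  + [np.kron(identity, s1), np.kron(identity, s2)])
    return gammas

def chirality(n):
    """gamma_F = (-i)^n gamma_1 ... gamma_{2n}."""
    product = functools.reduce(np.matmul, gamma_matrices(n))
    return (-1j) ** n * product

def handed_dimensions(n):
    """Return (dim S^+, dim S^-) as the traces of the projectors P_+ and P_-."""
    gF = chirality(n)
    identity = np.eye(2 ** n)
    p_plus, p_minus = (identity + gF) / 2, (identity - gF) / 2
    return round(np.trace(p_plus).real), round(np.trace(p_minus).real)

print(handed_dimensions(3))
```

Since $\gamma_F$ is diagonal in this basis, its diagonal entries are exactly the products $\prod_i \epsilon_i$ of the weight labels.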

Complex conjugation
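If a matrix $C$ satisfies

$$ \sigma_{ij}^T C + C \sigma_{ij} = 0, \tag{23.95} $$

then $\zeta^T C \psi$ will be invariant under rotation. Specifically, we can choose $C_1 = i\sigma_2$ and

$$ C_{n+1} = \begin{pmatrix} 0 & C_n \\ (-1)^{n+1} C_n & 0 \end{pmatrix} = \begin{cases} C_n \otimes \sigma_1 & \text{if } n \text{ is odd} \\ C_n \otimes i\sigma_2 & \text{if } n \text{ is even} \end{cases} \tag{23.96} $$

It can be verified that

$$ C_n^T = (-1)^{n(n+1)/2} C_n, \quad \gamma_F^{(n)} C_n = (-1)^n C_n \gamma_F^{(n)}, \quad \gamma_i^{(n)} C_n = (-1)^{n+i+1} C_n \gamma_i^{(n)}. \tag{23.97} $$

Define

$$ \psi_c \equiv C^{-1} \psi^*. \tag{23.98} $$

Note that

$$ C^{-1} \sigma_{ij}^* \, \frac{1}{2}(1 \pm \gamma_F)^* \, C = -\sigma_{ij} \, \frac{1}{2}[1 \pm (-1)^n \gamma_F]. \tag{23.99} $$

We have

$$ C^{-1} e^{-i\omega_{ij}\sigma_{ij}^*/4} P_\pm^* = e^{i\omega_{ij}\sigma_{ij}/4} P_\pm C^{-1} \quad \text{if } n \text{ is even} \tag{23.100} $$

and

$$ C^{-1} e^{-i\omega_{ij}\sigma_{ij}^*/4} P_\pm^* = e^{i\omega_{ij}\sigma_{ij}/4} P_\mp C^{-1} \quad \text{if } n \text{ is odd}. \tag{23.101} $$

In other words, $\psi_{c,\pm}$ will transform like $\psi_\pm$ if $n$ is even and like $\psi_\mp$ if $n$ is odd. Thus, the spinor representation of SO(4k + 2) is complex while that of SO(4k) is non-complex.
Note that $C = C^T$ if $n = 4k$ and $C = -C^T$ if $n = 4k + 2$. Thus, the spinor representation of SO(8k) is real while that of SO(8k + 4) is pseudoreal.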

Multiplying spinor representations together


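Define the tensors

$$ T_{++} \equiv \psi_+^T C \Gamma_\kappa \psi_+, \quad T_{+-} \equiv \psi_+^T C \Gamma_\kappa \psi_-, \quad T_{--} \equiv \psi_-^T C \Gamma_\kappa \psi_-, \tag{23.102} $$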

where
Γκ = γi1 · · · γiκ , i1 , · · · , iκ are all different. (23.103)
If n is even and κ is odd, or κ is even and n is odd, T++ and T−− will vanish. If n and κ are
both even or both odd, T+− and T−+ will vanish.
If T does not vanish, T would transform like a totally antisymmetric rank κ tensor in 2n
dimensional vector space. So the product representation of spinor representations can be re-
duced to the direct sum of antisymmetric tensor representation.
Take SO(4) as an example. We have

2+ ⊗ 2+ = [0] ⊕ [2] = 1 ⊕ 3, 2+ ⊗ 2− = [1] = 4. (23.104)

Note that [2] is a selfdual totally antisymmetric rank 2 tensor in 4 dimensional vector space.

Embedding unitary groups into orthogonal groups


Consider the fundamental representation $(x_1, \cdots, x_n, y_1, \cdots, y_n)$ of SO(2n). The vector $(x + iy, x - iy)$ will transform as $n \oplus n^*$ under SU(n), which is embedded naturally in SO(2n). In the adjoint representation of SO(2n), the embedding will be

$$ 2n \otimes_A 2n \to (n \oplus n^*) \otimes_A (n \oplus n^*) = (n^2 - 1) \oplus 1 \oplus \frac{n(n-1)}{2} \oplus \left(\frac{n(n-1)}{2}\right)^*. \tag{23.105} $$

In the spinor representation $|\epsilon_1 \cdots \epsilon_n\rangle$ of SO(2n), $(\epsilon_i + 1)/2$ can be seen as the occupation number of fermions in energy level $i$. The generators of SO(2n) are given by all possible bilinear operators of the form $f_i f_j$, $f_i^\dagger f_j^\dagger$ and $f_i^\dagger f_j$, where $f_i$ and $f_i^\dagger$ are annihilation and creation operators for fermions in level $i$. The generators of the U(n) subgroup are then the operators that conserve the number of fermions, namely, $f_i^\dagger f_j$. The diagonal U(1) is just the total fermion number $\sum_{i=1}^{n} f_i^\dagger f_i$.
When restricted on SU(n), the spinor can be decomposed as follows:
• For n odd,

S + → [0] ⊕ [2] ⊕ · · · ⊕ [n − 1], S − → [1] ⊕ [3] ⊕ · · · ⊕ [n]. (23.106)

• For n even,

S + → [0] ⊕ [2] ⊕ · · · ⊕ [n], S − → [1] ⊕ [3] ⊕ · · · ⊕ [n − 1]. (23.107)

Here, [k] is the fermion number of the spinor.
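
A quick dimension count supports this decomposition. In the sketch below, $[k]$ is taken to be the rank-$k$ antisymmetric tensor of SU(n), of dimension $\binom{n}{k}$ (the number of ways to occupy $k$ of the $n$ levels); the function name is illustrative. Both the even-$k$ and the odd-$k$ sums should equal $2^{n-1}$, the dimension of an irreducible spinor.

```python
from math import comb

def spinor_branching_dimensions(n):
    """Dimensions of the antisymmetric SU(n) tensors [k], grouped by parity of k."""
    even = [comb(n, k) for k in range(0, n + 1, 2)]
    odd = [comb(n, k) for k in range(1, n + 1, 2)]
    return even, odd

for n in (3, 4, 5):
    even, odd = spinor_branching_dimensions(n)
    print(n, even, sum(even), odd, sum(odd), 2 ** (n - 1))
```

For $n = 5$ the two lists read 1, 10, 5 and 5, 10, 1, each adding up to 16.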

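Table 23.1: The isomorphism between SO(6) and SU(4).

SO(6)             SU(4)                        Dimensions
$S^+$             $V_\alpha$                   4
$S^-$             $\bar{V}^\alpha$             4
$V_i$             $A_{[\alpha\beta]}$          6
$A_{[ij]}$        $T^\alpha_\beta$             15
$S_{\{ij\}}$      $A^{[\alpha\beta]}_\gamma$   20
$D_+^{[ijk]}$     $S_{\alpha\beta}$            10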
Chapter 24
From Classical Field to Quantum Field

24.1 Canonical quantization of classical field


The state of a quantum field is represented by an element |ψ⟩ in Hilbert space. The measure-
ment of the field is realized by an operator field ϕa (x). The expectation value of the measure-
ment evolves as
$$ \frac{d\langle\psi|\phi_a(x)|\psi\rangle}{dt} = -i\langle\psi|[\phi_a(x), H_S]|\psi\rangle. \tag{24.1} $$
If $[\phi_a(x), H_S] = i\{\phi_a(x), H_S\}_{\text{Poisson}}$, classical field theory will be an average effect of quantum field theory. Notice that the algebraic structure of commutator brackets is the same as that of Poisson brackets. So to quantize a classical field, all we need are the canonical commutation relations

$$ [\phi_a(x), \phi_b(y)] = 0, \quad [\pi^a(x), \pi^b(y)] = 0, \quad [\phi_a(x), \pi^b(y)] = i\delta_a^b\,\delta(x - y) \tag{24.2} $$

and the same definitions of $L$, $\pi^a$ and $H$ as those in the corresponding classical theory.


Operators in Heisenberg picture are defined as

AH ≡ U −1 (t, t0 )AS U (t, t0 ). (24.3)

Especially, we have

ϕa (x) ≡ U −1 (t, t0 )ϕa (x)U (t, t0 ), π a (x) ≡ U −1 (t, t0 )π a (x)U (t, t0 ). (24.4)

Commutators for field operators at any time are

$$ [\phi_a(x, t), \phi_b(y, t)] = 0, \quad [\pi^a(x, t), \pi^b(y, t)] = 0, \quad [\phi_a(x, t), \pi^b(y, t)] = i\delta_a^b\,\delta(x - y). \tag{24.5} $$

The dynamics of the quantum field can be described by the Heisenberg equations

$$ \frac{d\phi_a(x)}{dt} = -i[\phi_a(x), H_H], \quad \frac{d\pi^a(x)}{dt} = -i[\pi^a(x), H_H], \tag{24.6} $$

whose form is identical to that of the classical field equations.

24.2 Lorentz invariance in quantum field theory


Under a Lorentz transformation x′ = Λx, the state |ψ⟩ of the quantum field transforms as

|ψ ′ ⟩ = U (Λ) |ψ⟩ . (24.7)



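For scalar fields, we have

$$ \langle\psi'|\phi(x)|\psi'\rangle = \langle\psi|\phi(\Lambda^{-1}x)|\psi\rangle, \quad U^{-1}(\Lambda)\phi(x)U(\Lambda) = \phi(\Lambda^{-1}x). \tag{24.8} $$

For vector fields, we have

$$ \langle\psi'|A^\mu(x)|\psi'\rangle = \langle\psi|\Lambda^\mu{}_\nu A^\nu(\Lambda^{-1}x)|\psi\rangle, \quad U^{-1}(\Lambda)A^\mu(x)U(\Lambda) = \Lambda^\mu{}_\nu A^\nu(\Lambda^{-1}x). \tag{24.9} $$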
Lorentz invariance means that Lagrangian density must be a scalar, or more loosely, action
must be invariant under Lorentz transformation.

24.3 Symmetry and conservation law


Thanks to canonical quantization, most equations in classical field theory will hold automat-
ically in quantum field theory as long as we interpret the field in equations as operators in
Heisenberg picture. As a result, Noether’s theorem holds in quantum field theory as well. If
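the infinitesimal transformation of the field $\phi_a \to \phi_a + \delta\phi_a$, $\mathcal{L} \to \mathcal{L} + \delta\mathcal{L}$ satisfies the condition that $\delta\mathcal{L} = \partial_\mu K^\mu$, we will have the conserved current

$$ j^\mu = -\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\delta\phi_a + K^\mu, \tag{24.10} $$

satisfying the continuity equation $\partial_\mu j^\mu = 0$. Integrating the continuity equation over a volume $V$, large enough to have no net current through its surface, leads to the conservation law

$$ \frac{dQ}{dt} = 0 \quad \text{where} \quad Q = \int_V j^0\, dV. \tag{24.11} $$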

Furthermore, it can be shown that Q is also the generator of the infinitesimal transformation
δϕa , i.e.,
U † ϕa U = ϕa + δϕa where U ≡ eiQ ≈ I + iQ. (24.12)
However, in some cases a classical conservation law fails to hold in quantum field theory. Such failures are called anomalies and will be discussed later.

24.4 Momentum
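The conserved currents for the infinitesimal spacetime translation $x'^\mu = x^\mu + a^\mu$ are

$$ j^\mu = -a_\nu T^{\mu\nu} \quad \text{where} \quad T^{\mu\nu} \equiv -\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\partial^\nu\phi_a + \eta^{\mu\nu}\mathcal{L}. \tag{24.13} $$

The corresponding conserved charges, called four-momentum, are

$$ P^0 \equiv \int T^{00}\, d^3x = H, \quad P^i \equiv \int T^{0i}\, d^3x = \int -\pi^a \partial_i\phi_a\, d^3x. \tag{24.14} $$

We can derive the following commutation relations:

$$ [\phi_a, P^\mu] = -i\partial^\mu\phi_a, \quad [\pi^a, P^\mu] = -i\partial^\mu\pi^a, \quad [P^\mu, P^\nu] = 0. \tag{24.15} $$

Four-momentum is also the generator of translations. We have

$$ T^{-1}(s)\phi_a(x)T(s) = \phi_a(x - s) \quad \text{where} \quad T(s) = e^{-iP^\mu s_\mu}. \tag{24.16} $$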

24.5 Angular Momentum


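The conserved currents for the infinitesimal Lorentz transformation $x'^\mu = x^\mu + \delta\omega^{\mu\nu}x_\nu$ are

$$ j^\mu = \frac{1}{2}M^{\mu\nu\rho}\delta\omega_{\nu\rho} \quad \text{where} \quad M^{\mu\nu\rho} \equiv x^\nu T^{\mu\rho} - x^\rho T^{\mu\nu} - \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}(\Sigma^{\nu\rho})_{ab}\phi_b. \tag{24.17} $$

The corresponding conserved charges, called angular momentum, are

$$ M^{\nu\rho} \equiv \int M^{0\nu\rho}\, d^3x = \int \left(x^\nu T^{0\rho} - x^\rho T^{0\nu} - \pi^a(\Sigma^{\nu\rho})_{ab}\phi_b\right) d^3x. \tag{24.18} $$

Define

$$ M_L^{\mu\nu} \equiv \int (x^\mu T^{0\nu} - x^\nu T^{0\mu})\, d^3x, \quad M_S^{\mu\nu} \equiv \int \left(-\pi^a(\Sigma^{\mu\nu})_{ab}\phi_b\right) d^3x. \tag{24.19} $$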

We can derive the following commutation relations:

[ϕa , MLµν ] = (Lµν )ab ϕb , [ϕa , MSµν ] = (Sµν )ab ϕb , (24.20a)


[M µν , M ρσ ] = i(−η νρ M µσ + η σµ M ρν + η µρ M νσ − η σν M ρµ ), (24.20b)

where
(Lµν )ab ≡ −i(xµ ∂ ν − xν ∂ µ )δab , (Sµν )ab ≡ −i(Σµν )ab . (24.21)
We now define Ji ≡ ϵijk M jk /2 and Ki ≡ M i0 . So equation 24.20b can be rewritten as

[Ji , Jj ] = iϵijk Jk , [Ji , Kj ] = iϵijk Kk , [Ki , Kj ] = −iϵijk Jk . (24.22)
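
The algebra (24.22) can be illustrated with explicit 4-by-4 matrices acting on $(t, x, y, z)$. The matrices below are an assumption made for this sketch and are not taken from the notes: $(J_k)_{ab} = -i\epsilon_{kab}$ on the spatial block and $K_k = i(E_{0k} + E_{k0})$, where $E_{ab}$ is the matrix with a single 1 in row $a$ and column $b$. The overall sign of $K_k$ does not affect the three commutators.

```python
import itertools
import numpy as np

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def rotation_generators():
    """J_k acting on (t, x, y, z): (J_k)_{ab} = -i eps_{kab} on the spatial block."""
    J = np.zeros((3, 4, 4), dtype=complex)
    for k, a, b in itertools.product(range(3), repeat=3):
        J[k, a + 1, b + 1] = -1j * levi_civita(k, a, b)
    return J

def boost_generators():
    """K_k = i (E_{0k} + E_{k0}), an assumed form of the boost generator along axis k."""
    K = np.zeros((3, 4, 4), dtype=complex)
    for k in range(3):
        K[k, 0, k + 1] = K[k, k + 1, 0] = 1j
    return K

def lorentz_algebra_holds():
    """Return True if the three commutators of (24.22) hold for these matrices."""
    J, K = rotation_generators(), boost_generators()
    comm = lambda A, B: A @ B - B @ A
    for i, j in itertools.product(range(3), repeat=2):
        eps_J = sum(levi_civita(i, j, k) * J[k] for k in range(3))
        eps_K = sum(levi_civita(i, j, k) * K[k] for k in range(3))
        ok = (np.allclose(comm(J[i], J[j]), 1j * eps_J)
              and np.allclose(comm(J[i], K[j]), 1j * eps_K)
              and np.allclose(comm(K[i], K[j]), -1j * eps_J))
        if not ok:
            return False
    return True

print(lorentz_algebra_holds())
```

The rotation matrices are hermitean while the boost matrices are anti-hermitean, which reflects the fact that this finite-dimensional representation of the Lorentz group is not unitary.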

Commutators between momentum and angular momentum are

[P µ , M ρσ ] = i(η µσ P ρ − η µρ P σ ), (24.23)

or equivalently,

[Ji , H] = 0, [Ji , Pj ] = iϵijk Pk , [Ki , H] = iPi , [Ki , Pj ] = iδij H. (24.24)

Finally, we define Li ≡ ϵijk MLjk /2 and Si ≡ ϵijk MSjk /2. We can derive that

[Li , Sj ] = 0, [Si , Pj ] = 0, [Li , Pj ] = iϵijk Pk . (24.25)

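Angular momentum is also the generator of Lorentz transformations. We have

$$ U^{-1}(\Lambda)\phi_a(x)U(\Lambda) = S_a{}^b \phi_b(\Lambda^{-1}x) \tag{24.26} $$

and

$$ U^{-1}(\Lambda)P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu, \quad U^{-1}(\Lambda)M^{\mu\nu}U(\Lambda) = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma M^{\rho\sigma}, \tag{24.27} $$

where

$$ U(\Lambda) = \exp\left(\frac{i}{2}\theta_{\mu\nu}M^{\mu\nu}\right), \quad S = \exp\left(\frac{i}{2}\theta_{\mu\nu}S^{\mu\nu}\right). \tag{24.28} $$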

24.6 Anticommutation relation


The anticommutator of two operators is defined as $\{A, B\} \equiv AB + BA$. Suppose that the field operator and its canonical momentum operator have the following anticommutation relations

$$ \{\phi_a(x, t), \phi_b(y, t)\} = 0, \quad \{\pi^a(x, t), \pi^b(y, t)\} = 0, \quad \{\phi_a(x, t), \pi^b(y, t)\} = i\delta_a^b\,\delta(x - y). \tag{24.29} $$

If the operator $A$ is composed of terms of the form $\pi^a E_a{}^b \phi_b$, it can be shown that the values of $[\phi_a, A]$ and $[\pi^a, A]$ are the same as those in the theory quantized with commutation relations.
It is easy to verify that P i and MSµν have the required form. The form of H is determined
by the specific theory. As we can see later, the Hamiltonian of Dirac field has the required
form. When it is quantized with anticommutation relations, the commutation relations be-
tween field operators, momentum operators and angular momentum operators discussed in
previous sections will hold automatically.
Chapter 25
Scalar Field

25.1 Klein-Gordon field


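The Lagrangian density of the free Klein-Gordon field is

$$ \mathcal{L} = -\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{1}{2}m^2\phi^2 + \Omega_0, \tag{25.1} $$

from which we can derive the field equation

$$ (\partial^\mu\partial_\mu - m^2)\phi = 0. \tag{25.2} $$

The Hamiltonian of the field is

$$ H = \int \mathcal{H}\, d^3x \quad \text{where} \quad \mathcal{H} = \frac{1}{2}\pi^2 + \frac{1}{2}(\nabla\phi)^2 + \frac{1}{2}m^2\phi^2 - \Omega_0, \quad \pi = \dot{\phi}. \tag{25.3} $$

The momentum and angular momentum of the field are

$$ P^0 = H, \quad P^i = \int -\pi\nabla^i\phi\, d^3x, \quad J_i = \epsilon_{ijk}\int x^j\left(-\pi\nabla^k\phi\right) d^3x. \tag{25.4} $$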

25.2 Canonical quantization Formulation


Canonical quantization requires that

[ϕ(x, t), ϕ(y, t)] = 0, [π(x, t), π(y, t)] = 0, [ϕ(x, t), π(y, t)] = iδ(x − y). (25.5)

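Using the field equation, the field operators can be expanded as

$$ \phi(x, t) = \int \widetilde{dp}\left[a(\mathbf{p})e^{ipx} + a^\dagger(\mathbf{p})e^{-ipx}\right], \quad \pi(x, t) = -i\int \widetilde{dp}\,\omega\left[a(\mathbf{p})e^{ipx} - a^\dagger(\mathbf{p})e^{-ipx}\right], \tag{25.6} $$

where $\omega = E = \sqrt{\mathbf{p}^2 + m^2}$, $px \equiv \mathbf{p} \cdot \mathbf{x} - Et$ and $\widetilde{dp} \equiv d^3p/[(2\pi)^3 2E]$. It follows that

$$ a(\mathbf{p}) = \int d^3x\, e^{-ipx}(i\pi + E\phi), \quad a^\dagger(\mathbf{p}) = \int d^3x\, e^{ipx}(-i\pi + E\phi). \tag{25.7} $$

The commutators 25.5 can be rewritten in terms of $a$ and $a^\dagger$ as

$$ [a(\mathbf{p}), a(\mathbf{q})] = 0, \quad [a^\dagger(\mathbf{p}), a^\dagger(\mathbf{q})] = 0, \quad [a(\mathbf{p}), a^\dagger(\mathbf{q})] = (2\pi)^3 2E\,\delta(\mathbf{p} - \mathbf{q}). \tag{25.8} $$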

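Momentum operators can also be represented by $a$ and $a^\dagger$ as

$$ H = \int \widetilde{dp}\, E_p\, a^\dagger(\mathbf{p})a(\mathbf{p}) + (E_0 - \Omega_0)V \quad \text{where} \quad E_0 = \frac{1}{2}(2\pi)^{-3}\int d^3p\, E_p \tag{25.9} $$

and

$$ P^i = \int \widetilde{dp}\, p^i\, a^\dagger(\mathbf{p})a(\mathbf{p}). \tag{25.10} $$

The commutation relations between the momentum operators and $a$, $a^\dagger$ are

$$ [H, a(\mathbf{p})] = -E_p a(\mathbf{p}), \quad [H, a^\dagger(\mathbf{p})] = E_p a^\dagger(\mathbf{p}) \tag{25.11} $$

and

$$ [P^i, a(\mathbf{p})] = -p^i a(\mathbf{p}), \quad [P^i, a^\dagger(\mathbf{p})] = p^i a^\dagger(\mathbf{p}). \tag{25.12} $$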
Define |p⟩ ≡ a† (p) |0⟩. We have

H |p⟩ = Ep |p⟩ , P i |p⟩ = pi |p⟩ . (25.13)

Therefore, we interpret the state |p⟩ as the momentum eigenstate of a single particle of mass m.
We can also show that Ji |p = 0⟩ = 0. So the particle carries no internal angular momentum.

The amplitude for a particle to propagate from $y$ to $x$ is $\langle 0|\phi(x)\phi(y)|0\rangle$, denoted by $D(x - y)$. We find that

$$ D(x - y) = \int \widetilde{dp}\, e^{ip(x-y)} \tag{25.14} $$

and

$$ [\phi(x), \phi(y)] = D(x - y) - D(y - x). \tag{25.15} $$
If x − y is spacelike, a continuous Lorentz transformation can take (x − y) to −(x − y),
leading to [ϕ(x), ϕ(y)] = 0. As a result, a measurement performed at one point can not affect
a measurement at another point whose separation is spacelike.

The retarded Green function of the Klein-Gordon field is defined as

$$ D_R(x - y) \equiv \theta(x^0 - y^0)\langle 0|[\phi(x), \phi(y)]|0\rangle = \int \frac{d^4p}{(2\pi)^4}\frac{-i}{p^2 + m^2}e^{ip(x-y)}, \tag{25.16} $$

where the integration over $p^0$ is performed in the way shown by Figure 25.1. The retarded Green function also satisfies the equation

$$ (\partial^2 - m^2)D_R(x - y) = i\delta(x - y). \tag{25.17} $$

The Feynman Green function is defined as

$$ D_F(x - y) \equiv \langle 0|T\phi(x)\phi(y)|0\rangle = \int \frac{d^4p}{(2\pi)^4}\frac{-i}{p^2 + m^2 - i\epsilon}e^{ip(x-y)}, \tag{25.18} $$

where $T$ stands for time ordering, placing all operators evaluated at later times to the left, and the integration over $p^0$ is performed in the way shown by Figure 25.2.

Figure 25.1: Retarded Green Function.

Figure 25.2: Feynman Green Function.

25.3 Perturbation theory for canonical quantization

25.3.1 Perturbation expansion of correlation functions


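The Lagrangian density of the Klein-Gordon field with $\phi^4$ interaction is

$$ \mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}m_0^2\phi^2 - \frac{\lambda_0}{4!}\phi^4. \tag{25.19} $$

The Hamiltonian of the field is

$$ H = H_0 + H_{\text{int}} \quad \text{where} \quad H_{\text{int}} = \int d^3x\, \frac{\lambda_0}{4!}\phi^4(x). \tag{25.20} $$

The ground states of the interacting field theory and the free field theory are denoted by $|\Omega\rangle$ and $|0\rangle$ respectively. The zero of energy is fixed by $H_0|0\rangle = 0$.

Define

$$ \phi_I(t, \mathbf{x}) \equiv e^{iH_0(t - t_0)}\phi(t_0, \mathbf{x})e^{-iH_0(t - t_0)}, \quad H_I \equiv \int d^3x\, \frac{\lambda_0}{4!}\phi_I^4. \tag{25.21} $$

The correlation function of the interacting field is given by

$$ \langle\Omega|T\{\phi(x)\phi(y)\}|\Omega\rangle = \lim_{T \to \infty(1 - i\epsilon)} \frac{\left\langle 0\left|T\left\{\phi_I(x)\phi_I(y)\exp\left[-i\int_{-T}^{T}dt\, H_I\right]\right\}\right|0\right\rangle}{\left\langle 0\left|T\left\{\exp\left[-i\int_{-T}^{T}dt\, H_I\right]\right\}\right|0\right\rangle}. \tag{25.22} $$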

The derivation can be found in section 4.2 of An introduction to quantum field theory (M.E.Peskin
& D.V.Schroeder).

To evaluate the right hand side of equation 25.22, we need the following theorem.

Theorem 25.1 Wick’s theorem

$$ T\{\phi_I(x_1) \cdots \phi_I(x_n)\} = N\{\phi_I(x_1) \cdots \phi_I(x_n) + \text{all possible contractions}\}. \tag{25.23} $$

N means normal order, in which all the a’s are to the right of all the a† s.

The proof can be found in section 4.3 of An introduction to quantum field theory (M.E.Peskin
& D.V.Schroeder).

Example:

⟨0|T {ϕI (x1 )ϕI (x2 )ϕI (x3 )ϕI (x4 )}|0⟩ = DF (x1 − x2 )DF (x3 − x4 )
+ DF (x1 − x3 )DF (x2 − x4 ) + DF (x1 − x4 )DF (x2 − x3 ) (25.24)
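
In a vacuum expectation value only the fully contracted terms of Wick's theorem survive, so the result is a sum over all ways of pairing the points. The sketch below enumerates these pairings recursively; the generator name `pairings` and the string labels are illustrative.

```python
def pairings(points):
    """Yield every way of splitting `points` (a list) into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# Each pairing corresponds to one product of Feynman propagators.
for p in pairings(["x1", "x2", "x3", "x4"]):
    print(" ".join(f"D_F({a} - {b})" for a, b in p))
```

For four points this yields the three terms of (25.24); for $2k$ points the number of pairings is $(2k - 1)!! = (2k-1)(2k-3)\cdots 1$.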

25.3.2 Feynman diagram


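Expanding $\left\langle 0\left|T\left\{\phi_I(x)\phi_I(y)\exp\left[-i\int_{-T}^{T}dt\, H_I\right]\right\}\right|0\right\rangle$ to first order in $\lambda_0$, we have

$$ \left\langle 0\left|T\left\{\phi_I(x)\phi_I(y)\left(\frac{-i\lambda_0}{4!}\right)\int d^4z\, \phi_I(z)\phi_I(z)\phi_I(z)\phi_I(z)\right\}\right|0\right\rangle $$
$$ = 3 \cdot \left(\frac{-i\lambda_0}{4!}\right) D_F(x - y)\int d^4z\, D_F(z - z)D_F(z - z) + 12 \cdot \left(\frac{-i\lambda_0}{4!}\right)\int d^4z\, D_F(x - z)D_F(y - z)D_F(z - z). \tag{25.25} $$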

It can be represented by the following Feynman diagrams.

Figure 25.3: Feynman diagram representation of the perturbation expansion. The symmetry factors of the two diagrams above are S = 4!/3 = 8 and S = 4!/12 = 2 respectively.
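
The multiplicities 3 and 12 in (25.25) can be recovered by counting contractions. The sketch below labels the four fields at the vertex as `z1` to `z4` (labels chosen here for illustration), enumerates all full contractions of the six fields, and sorts them by whether $x$ is contracted with $y$.

```python
from collections import Counter

def pairings(points):
    """Yield every way of splitting `points` (a list) into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def first_order_contraction_counts():
    """Count the full contractions of phi(x) phi(y) phi(z)^4 by diagram type."""
    fields = ["x", "y", "z1", "z2", "z3", "z4"]
    counts = Counter()
    for p in pairings(fields):
        if ("x", "y") in p:
            counts["disconnected"] += 1   # D_F(x-y) D_F(z-z) D_F(z-z)
        else:
            counts["connected"] += 1      # D_F(x-z) D_F(y-z) D_F(z-z)
    return counts

counts = first_order_contraction_counts()
print(counts["disconnected"], counts["connected"])
print(24 // counts["disconnected"], 24 // counts["connected"])   # symmetry factors 4!/count
```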

The Feynman rules for evaluating Feynman diagrams in $\phi^4$ theory are:

1. For each propagator, multiply by $P = D_F(x - y)$;

2. For each vertex, multiply by $V = (-i\lambda_0)\int d^4z$;

3. For each external point, multiply by $E = 1$;

4. Divide by the symmetry factor $S$.


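Finally, it can be shown that

$$ \langle\Omega|T\{\phi_I(x_1)\phi_I(x_2) \cdots \phi_I(x_n)\}|\Omega\rangle = \text{sum of all E-connected diagrams with } n \text{ external points}, \tag{25.26} $$

where a piece of a diagram is "E-disconnected" if it is disconnected from all external points; such pieces are called "vacuum bubbles". The vacuum bubbles in $\left\langle 0\left|T\left\{\phi_I(x)\phi_I(y)\exp\left[-i\int_{-T}^{T}dt\, H_I\right]\right\}\right|0\right\rangle$ are all canceled by the denominator $\left\langle 0\left|T\left\{\exp\left[-i\int_{-T}^{T}dt\, H_I\right]\right\}\right|0\right\rangle$.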

25.4 Path integral formulation


25.4.1 Frameworks
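Generalizing the path integral formulation of quantum mechanics, we have

$$ \langle\phi_b(\mathbf{x})|e^{-iHT}|\phi_a(\mathbf{x})\rangle = \int \mathcal{D}\phi\,\mathcal{D}\pi \exp\left[i\int_0^T d^4x\left(\pi\dot{\phi} - \frac{1}{2}\pi^2 - \frac{1}{2}(\nabla\phi)^2 - V(\phi)\right)\right], \tag{25.27} $$

where $|\phi_b(\mathbf{x})\rangle$ is the eigenstate of $\phi_S(\mathbf{x}) = \phi_H(\mathbf{x}, 0)$ with eigenvalue $\phi_b(\mathbf{x})$ at time $t = T$, and $|\phi_a(\mathbf{x})\rangle$ the eigenstate with eigenvalue $\phi_a(\mathbf{x})$ at time $t = 0$. Since the exponent is quadratic in $\pi$, we can complete the square and evaluate the $\mathcal{D}\pi$ integral to obtain

$$ \langle\phi_b(\mathbf{x})|e^{-iHT}|\phi_a(\mathbf{x})\rangle = \int \mathcal{D}\phi \exp\left[i\int_0^T d^4x\, \mathcal{L}\right]. \tag{25.28} $$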


Note: We emphasize that in this section, ϕH denotes the operator-valued field in the Heisenberg picture, ϕS the field in the Schrödinger picture, and ϕ(x) the classical field, whose values are ordinary numbers.

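The correlation function of the field is given by

$$ \langle\Omega|T\phi_H(x_1)\phi_H(x_2)|\Omega\rangle = \lim_{T \to \infty(1 - i\epsilon)} \frac{\int \mathcal{D}\phi\, \phi(x_1)\phi(x_2)\exp\left[i\int_{-T}^{T}d^4x\, \mathcal{L}\right]}{\int \mathcal{D}\phi\, \exp\left[i\int_{-T}^{T}d^4x\, \mathcal{L}\right]}. \tag{25.29} $$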

The proof can be found in section 9.2 of An introduction to quantum field theory (M.E.Peskin
& D.V.Schroeder).
The generating functional of the correlation functions is defined as
$$Z[J] \equiv \int\mathcal{D}\phi\,\exp\left[i\int d^4x\,\big(\mathcal{L} + J(x)\phi(x)\big)\right]. \tag{25.30}$$

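It can be shown that
$$\langle\Omega|T\phi_H(x_1)\cdots\phi_H(x_n)|\Omega\rangle = \frac{1}{Z[0]}\left(-i\frac{\delta}{\delta J(x_1)}\right)\cdots\left(-i\frac{\delta}{\delta J(x_n)}\right)Z[J]\,\bigg|_{J=0}, \tag{25.31}$$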

where Z[0] ≡ Z[J = 0].



25.4.2 Free field theory


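For the free Klein-Gordon field, we have
$$\int d^4x\,(\mathcal{L}_0 + J\phi) = \int d^4x\left[\frac{1}{2}\phi(\partial^2 - m^2 + i\epsilon)\phi + J\phi\right]. \tag{25.32}$$
Define
$$\phi'(x) \equiv \phi(x) + \int d^4y\,\big(-iD_F(x-y)\big)J(y). \tag{25.33}$$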

Noticing that $(\partial^2 - m^2)D_F(x-y) = i\delta(x-y)$, we can derive that
$$\int d^4x\,(\mathcal{L}_0 + J\phi) = \int d^4x\,\frac{1}{2}\phi'(\partial^2 - m^2 + i\epsilon)\phi' - \frac{1}{2}\int d^4x\,d^4y\,J(x)[-iD_F(x-y)]J(y). \tag{25.34}$$
It follows that
$$Z[J] = Z[0]\exp\left[-\frac{1}{2}\int d^4x\,d^4y\,J(x)D_F(x-y)J(y)\right]. \tag{25.35}$$
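Thus, we have
$$\langle 0|T\phi_H(x_1)\phi_H(x_2)|0\rangle = -\frac{1}{Z[0]}\frac{\delta}{\delta J(x_1)}\frac{\delta}{\delta J(x_2)}\left\{Z[0]\exp\left[-\frac{1}{2}\int d^4x\,d^4y\,J D_F J\right]\right\}\bigg|_{J=0} = D_F(x_1 - x_2). \tag{25.36}$$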
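The step from (25.35) to (25.36) can be illustrated with a finite-dimensional analogue. The sketch below is an assumption-laden toy, not the field-theory computation: the continuous source J(x) is replaced by a two-component vector, the propagator by an arbitrary symmetric 2×2 matrix `D`, and the functional derivatives by central finite differences. All names are hypothetical.

```python
import math

# Toy "propagator": a symmetric 2x2 matrix standing in for D_F(x_i - x_j).
D = [[0.7, 0.3],
     [0.3, 0.5]]

def z_ratio(J):
    """Z[J]/Z[0] = exp(-1/2 * J^T D J), the discrete analogue of (25.35)."""
    quad = sum(J[i] * D[i][j] * J[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad)

def two_point(i, j, h=1e-3):
    """-(d/dJ_i)(d/dJ_j) [Z[J]/Z[0]] at J = 0, by central finite differences."""
    def shifted(si, sj):
        J = [0.0, 0.0]
        J[i] += si * h
        J[j] += sj * h
        return z_ratio(J)
    mixed = (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return -mixed

for i in range(2):
    for j in range(2):
        print(i, j, round(two_point(i, j), 6), D[i][j])
```

Up to the O(h²) discretization error, each numerical second derivative reproduces the corresponding matrix element of `D`, mirroring the result ⟨0|Tϕ(x1)ϕ(x2)|0⟩ = DF(x1 − x2).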

25.4.3 Interacting field theory


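Define
$$\mathcal{L}_0 \equiv -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}m_0^2\phi^2, \qquad \mathcal{L}_1 = -\frac{\lambda_0}{4!}\phi^4. \tag{25.37}$$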
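The generating functional of the $\phi^4$ theory can be expanded as
$$\begin{aligned}
Z[J] &= \int\mathcal{D}\phi\,e^{i\int d^4x\,(\mathcal{L}_0+\mathcal{L}_1+J\phi)} = e^{i\int d^4y\,\mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(y)}\right)}\int\mathcal{D}\phi\,e^{i\int d^4x\,(\mathcal{L}_0+J\phi)} \\
&= Z_0[0]\,e^{i\int d^4x\,\mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)}\exp\left[-\frac{1}{2}\int d^4y\,d^4z\,J(y)D_F(y-z)J(z)\right] \\
&= Z_0[0]\sum_{V=0}^{\infty}\sum_{P=0}^{\infty}\frac{1}{V!}\left[\frac{-i\lambda_0}{4!}\int d^4x\left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)^4\right]^V\frac{1}{P!}\left[-\frac{1}{2}\int d^4y\,d^4z\,J(y)D_F(y-z)J(z)\right]^P.
\end{aligned} \tag{25.38}$$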

If we focus on a term with particular values of V and P , the number of surviving sources (after
we take all the functional derivatives) will be E = 2P − 4V . The 4V functional derivatives
can act on the 2P sources in (2P )!/(2P − 4V )! different combinations. However, many of
the resulting expressions are algebraically identical.
To organize them, we introduce Feynman diagrams similar to those in the perturbation theory of canonical quantization. In these diagrams, a line segment stands for a propagator $D_F(x-y)$, a filled circle at one end of a line segment for a source $i\int d^4x\,J(x)$, and a vertex joining four line segments for $-i\lambda_0\int d^4z$.

For each diagram, we can assign a symmetry factor similar to that in perturbation theory for canonical quantization. Because some external sources are identical here, the symmetry factors in the two formalisms generally differ. However, when calculating a correlation function, the freedom to exchange the order of the functional derivatives acting on identical sources removes the difference.
It can be shown that
$$Z[J] = Z_0[0]\exp\left(\sum_I C_I\right), \tag{25.39}$$
where $C_I$ stands for a particular connected diagram, including its symmetry factor. We can define W[J] by
$$Z[J] = Z[0]\exp(-iW[J]). \tag{25.40}$$
It follows from W[0] = 0 that
$$-iW[J] = \sum_{I\neq\{0\}} C_I. \tag{25.41}$$

The notation I ̸= {0} means that the vacuum diagrams are omitted from the sum. The detailed
discussion can be found in section 9 of Quantum field theory (M. Srednicki).

25.4.4 Symmetries
Equations of motion
The equation of motion in classical field theory is given by
$$\frac{\delta S}{\delta\phi(x)} = 0. \tag{25.42}$$
In quantum field theory, we derive the equation of motion by requiring that the path integral be invariant under an infinitesimal shift of the field, i.e., ϕ(x) → ϕ(x) + ϵ(x). Define
$$Z[\phi(x_1),\cdots,\phi(x_n)] \equiv \int\mathcal{D}\phi\,e^{iS}\phi(x_1)\cdots\phi(x_n). \tag{25.43}$$
It follows that
$$\delta Z = \int\mathcal{D}\phi\,e^{iS}\int d^4x\,\epsilon(x)\left[i\frac{\delta S}{\delta\phi(x)}\phi(x_1)\cdots\phi(x_n) + \delta(x-x_1)\phi(x_2)\cdots\phi(x_n) + \cdots\right], \tag{25.44}$$
leading to
$$\left\langle\frac{\delta S}{\delta\phi(x)}\phi(x_1)\cdots\phi(x_n)\right\rangle = i\sum_{i=1}^{n}\langle\phi(x_1)\cdots\delta(x-x_i)\cdots\phi(x_n)\rangle. \tag{25.45}$$

Example: For the free Klein-Gordon field, the variation of S gives
$$\frac{\delta S}{\delta\phi(x)} = (\partial^2 - m^2)\phi(x). \tag{25.46}$$
Thus we have
$$(\partial^2 - m^2)\langle 0|T\phi(x)\phi(x_1)|0\rangle = i\delta(x - x_1). \tag{25.47}$$

Conservation laws
Consider a local field theory of a set of fields ϕa (x), governed by a Lagrangian density L(ϕ).
An infinitesimal symmetry transformation on the fields ϕa is of the form

ϕa (x) → ϕa (x) + ϵ∆ϕa (x). (25.48)

If ϵ is a constant, the action will be invariant under this transformation, i.e., the Lagrangian
density must be invariant up to a total divergence,

L[ϕ] → L[ϕ] + ϵ∂µ K µ . (25.49)

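If ϵ depends on x, the variation of the Lagrangian will be
$$\mathcal{L}[\phi] \to \mathcal{L}[\phi] + (\partial_\mu\epsilon)\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\Delta\phi_a + \epsilon\,\partial_\mu K^\mu. \tag{25.50}$$
Thus we have
$$\frac{\delta S}{\delta\epsilon(x)} = \partial_\mu j^\mu \quad\text{where}\quad j^\mu = -\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\Delta\phi_a + K^\mu. \tag{25.51}$$
If the measure $\mathcal{D}\phi$ is invariant under the transformation, we can derive that
$$\langle\partial_\mu j^\mu(x)\,\phi(x_1)\cdots\phi(x_n)\rangle = \sum_{i=1}^{n}\langle\phi(x_1)\cdots[i\Delta\phi(x_i)\delta(x-x_i)]\cdots\phi(x_n)\rangle. \tag{25.52}$$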

25.5 Scattering matrix and cross section


Scattering matrix (S-matrix) elements ⟨f |S|i⟩ are defined as the transition amplitudes be-
tween the asymptotically defined in states |i⟩ and out states |f ⟩ of definite momentum. If the
particles do not interact at all, S is simply the identity operator. Even if the theory contains
interactions, the particles have some probability of simply missing one another. To isolate the
interesting part of the S-matrix – that is, the part due to interactions – we define the T-matrix
by
S = 1 + iT. (25.53)
Next, the matrix elements of S should reflect 4-momentum conservation. Thus S or T should always contain a factor $\delta^{(4)}(\sum p_f - \sum p_i)$. Extracting this factor, we define the invariant matrix element $\mathcal{T}$ by
$$\langle f|iT|i\rangle = i\mathcal{T}\,(2\pi)^4\,\delta^{(4)}\Big(\sum p_f - \sum p_i\Big). \tag{25.54}$$
The cross section σ can be constructed from $\mathcal{T}$. For a relativistic collinear scattering process, we have
$$dN = \sigma\,|v_1 - v_2|\,n_1\,dt. \tag{25.55}$$
Consider a 2 → n process p1 + p2 → {pj}. Suppose that the volume of the space in which the scattering process takes place is V and the duration of the scattering process is T. The number density of the incident particle is then
$$n_1 = \frac{1}{V} \tag{25.56}$$

and the number of events in the final-state phase-space volume dΠ is
$$dN = \frac{|\langle f|iT|i\rangle|^2}{\langle i|i\rangle\langle f|f\rangle}\,d\Pi \quad\text{where}\quad d\Pi = \prod_j\frac{V}{(2\pi)^3}\,d^3p_j. \tag{25.57}$$

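Using the fact that
$$\delta^{(3)}(0) = \frac{V}{(2\pi)^3}, \qquad \delta^{(4)}(0) = \frac{VT}{(2\pi)^4}, \qquad \langle p|p\rangle = (2\pi)^3\,2\omega\,\delta^{(3)}(0), \tag{25.58}$$
we can derive that
$$d\sigma = \frac{1}{(2E_1)(2E_2)|v_1 - v_2|}\,|\mathcal{T}|^2\,d\Pi_{\mathrm{LIPS}}, \tag{25.59}$$
where
$$d\Pi_{\mathrm{LIPS}} = \prod_j\frac{d^3p_j}{(2\pi)^3}\frac{1}{2E_j}\,(2\pi)^4\,\delta^{(4)}\Big(\sum p_j - p_1 - p_2\Big) \tag{25.60}$$
is called the Lorentz-invariant phase space (LIPS). Since all the factors of V and T have dropped out, it is now trivial to take V → ∞ and T → ∞.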
The decay rate is defined as
$$\Gamma \equiv \frac{\text{Number of events}}{\text{Time}}. \tag{25.61}$$
Consider a 1 → n process p1 → {pj}. We can derive the differential decay rate similarly to obtain
$$d\Gamma = \frac{1}{2E_1}\,|\mathcal{T}|^2\,d\Pi_{\mathrm{LIPS}}. \tag{25.62}$$

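Example: For a 2 → 2 scattering p1 + p2 → p3 + p4 in the center-of-mass frame, the LIPS is given by
$$d\Pi_{\mathrm{LIPS}} = \frac{d^3p_3}{(2\pi)^3}\frac{1}{2E_3}\,\frac{d^3p_4}{(2\pi)^3}\frac{1}{2E_4}\,(2\pi)^4\,\delta^{(4)}(p_3 + p_4 - p_1 - p_2). \tag{25.63}$$
Integrating over p4 gives
$$d\Pi_{\mathrm{LIPS}} = \frac{1}{16\pi^2}\,d\Omega\int dp_f\,\frac{p_f^2}{E_3E_4}\,\delta(E_3 + E_4 - E_{CM}), \tag{25.64}$$
where $p_f \equiv |\mathbf{p}_3| = |\mathbf{p}_4|$ and $E_{CM} \equiv E_1 + E_2$. Define $x(p_f) \equiv E_3 + E_4 - E_{CM}$. We have
$$\frac{dx}{dp_f} = \frac{(E_3 + E_4)\,p_f}{E_3E_4}. \tag{25.65}$$
It follows that
$$d\Pi_{\mathrm{LIPS}} = \frac{1}{16\pi^2}\frac{p_f}{E_{CM}}\,d\Omega\;\theta(E_{CM} - m_3 - m_4) \tag{25.66}$$
and
$$\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2E_{CM}^2}\frac{|\mathbf{p}_f|}{|\mathbf{p}_i|}\,|\mathcal{T}|^2\,\theta(E_{CM} - m_3 - m_4). \tag{25.67}$$
If all the masses are equal, we will have
$$\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2E_{CM}^2}\,|\mathcal{T}|^2. \tag{25.68}$$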

25.6 LSZ reduction formula


25.6.1 Field strength renormalization
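The completeness of the Klein-Gordon field indicates that
$$1 = |\Omega\rangle\langle\Omega| + \sum_\lambda\int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_p}\,|\lambda_p\rangle\langle\lambda_p| \quad\text{where}\quad E_p \equiv \sqrt{m_\lambda^2 + \mathbf{p}^2}. \tag{25.69}$$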

Figure 25.4: Particle’s energy-momentum relation. (Labels in the figure: multiparticle continuum; one particle in motion; bound state; one particle at rest.)

Assume for now x0 > y 0 and define connected two point function as

⟨Ω|ϕ(x)ϕ(y)|Ω⟩C ≡ ⟨Ω|ϕ(x)ϕ(y)|Ω⟩ − ⟨Ω|ϕ(x)|Ω⟩ ⟨Ω|ϕ(y)|Ω⟩ . (25.70)

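The term ⟨Ω|ϕ(x)|Ω⟩ is usually zero by symmetry; for higher spin fields, it is zero by Lorentz invariance. From the completeness of the Klein-Gordon field, we have
$$\langle\Omega|\phi(x)\phi(y)|\Omega\rangle_C = \sum_\lambda\int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_p}\,\langle\Omega|\phi(x)|\lambda_p\rangle\langle\lambda_p|\phi(y)|\Omega\rangle. \tag{25.71}$$
Since
$$\langle\Omega|\phi(x)|\lambda_p\rangle = \langle\Omega|\phi(0)|\lambda_0\rangle\,e^{ipx}\big|_{p^0=E_p}, \tag{25.72}$$
we can obtain
$$\langle\Omega|\phi(x)\phi(y)|\Omega\rangle_C = \sum_\lambda\int\frac{d^4p}{(2\pi)^4}\frac{-i}{p^2 + m_\lambda^2 - i\epsilon}\,e^{ip(x-y)}\,|\langle\Omega|\phi(0)|\lambda_0\rangle|^2. \tag{25.73}$$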

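Analogous expressions also hold when $y^0 > x^0$, and both cases can be summarized as
$$\langle\Omega|T\phi(x)\phi(y)|\Omega\rangle_C = \int_0^\infty\frac{dM^2}{2\pi}\,\rho(M^2)\,D_F(x-y;M^2), \tag{25.74}$$
where
$$\rho(M^2) \equiv \sum_\lambda(2\pi)\,\delta(M^2 - m_\lambda^2)\,|\langle\Omega|\phi(0)|\lambda_0\rangle|^2. \tag{25.75}$$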

Figure 25.5: The structure of the spectral density function ρ(M 2 ). (Labels in the figure: 1-particle states; bound states; 2-particle states.)

The one-particle state contributes an isolated delta function to the spectral density function. It follows that
$$\rho(M^2) = 2\pi\,\delta(M^2 - m^2)\cdot Z + (\text{nothing else until } M^2 \gtrsim 4m^2), \tag{25.76}$$
where $Z = |\langle\Omega|\phi(0)|\lambda_0\rangle|^2$ is called the field-strength renormalization and m is the physical mass of a single particle. The Fourier transform of the two-point function is then
$$\int d^4x\,e^{-ipx}\,\langle\Omega|T\phi(x)\phi(0)|\Omega\rangle_C = \frac{-iZ}{p^2 + m^2 - i\epsilon} + \int_{\sim 4m^2}^{\infty}\frac{dM^2}{2\pi}\,\rho(M^2)\,\frac{-i}{p^2 + M^2 - i\epsilon}. \tag{25.77}$$

Figure 25.6: The structure of the two point function in Fourier space. (Labels in the figure: isolated pole; poles from bound states; branch cut.)

25.6.2 LSZ reduction formula


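To evaluate the scattering amplitude of interacting particles, we need the following reduction formula.
$$\begin{aligned}
&\prod_{i=1}^{n}\int d^4x_i\,e^{-ip_ix_i}\prod_{j=1}^{m}\int d^4y_j\,e^{ik_jy_j}\,\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\phi(y_1)\cdots\phi(y_m)\}|\Omega\rangle \\
&\qquad\underset{p_i^0\to E_{p_i},\;k_j^0\to E_{k_j}}{\sim}\left(\prod_{i=1}^{n}\frac{-i\sqrt{Z}}{p_i^2 + m^2 - i\epsilon}\right)\left(\prod_{j=1}^{m}\frac{-i\sqrt{Z}}{k_j^2 + m^2 - i\epsilon}\right)\langle p_1\cdots p_n|S|k_1\cdots k_m\rangle.
\end{aligned} \tag{25.78}$$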
The ∼ means that the two sides of the expression share the same singular structure around
p0i → Epi , ki0 → Eki . The proof can be found in section 7.2 of An introduction to quantum
field theory (M.E.Peskin & D.V.Schroeder).
To express 25.78 in the language of Feynman diagrams, we consider 2 → 2 scattering as an example. Notice that disconnected diagrams should be disregarded because they do not have the singularity structure, a product of four poles, indicated by the right-hand side of the LSZ reduction formula. The exact four-point function
$$\prod_{i=1}^{2}\int d^4x_i\,e^{-ip_ix_i}\prod_{j=1}^{2}\int d^4y_j\,e^{ik_jy_j}\,\langle\Omega|T\{\phi(x_1)\phi(x_2)\phi(y_1)\phi(y_2)\}|\Omega\rangle \tag{25.79}$$

can be represented by the Feynman diagram shown in Figure 25.7.

Figure 25.7: Feynman diagram for four point function. (Label in the figure: Amputated.)

Let $-iM^2(p^2)$ denote the sum of all one-particle-irreducible (1PI) insertions into the scalar propagator. Here 1PI refers to diagrams that are still connected after any one line is cut, as shown in Figure 25.8.


Figure 25.8: Diagram representation of 1PI propagator.




Figure 25.9: Diagram representation of exact propagator.

The exact propagator can be written as a geometric series of 1PI propagators, as shown in
Figure 25.9.

If we expand each re-summed propagator about the physical particle pole, we see that each external leg of the four-point amplitude contributes
$$\frac{-i}{p^2 + m_0^2 + M^2} \;\underset{p^0\to E_p}{\sim}\; \frac{-iZ}{p^2 + m^2} + (\text{regular}). \tag{25.80}$$
Thus, the sum of diagrams of the four-point function contains a product of four poles, which is exactly the singularity on the second line of 25.78. Comparing the coefficients of this product of poles, we find the relation shown in Figure 25.10.
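The statement that the exact propagator is a geometric series of 1PI insertions can be checked numerically at a fixed momentum. The sketch below is illustrative only: it treats the 1PI insertion $-iM^2$ as a constant number rather than a function of p², picks arbitrary values for which the series converges, and uses hypothetical function names.

```python
def bare_propagator(p2, m0_sq):
    """Tree-level propagator -i/(p^2 + m0^2) in the mostly-plus metric."""
    return -1j / (p2 + m0_sq)

def resummed_propagator(p2, m0_sq, M_sq, n_terms):
    """Partial sum: P0 + P0(-i M^2)P0 + P0(-i M^2)P0(-i M^2)P0 + ..."""
    p0 = bare_propagator(p2, m0_sq)
    ratio = (-1j * M_sq) * p0      # one more 1PI insertion followed by one more propagator
    total, term = 0j, p0
    for _ in range(n_terms):
        total += term
        term *= ratio
    return total

def exact_propagator(p2, m0_sq, M_sq):
    """Closed form of the series: -i/(p^2 + m0^2 + M^2)."""
    return -1j / (p2 + m0_sq + M_sq)

# Illustrative values with |M^2 / (p^2 + m0^2)| < 1, so the series converges.
print(resummed_propagator(2.0, 1.0, 0.6, 60), exact_propagator(2.0, 1.0, 0.6))
```

The ratio of successive terms is $-M^2/(p^2+m_0^2)$, so the partial sums approach $-i/(p^2+m_0^2+M^2)$, the left-hand side of (25.80).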


Figure 25.10: Feynman diagram representation of LSZ reduction formula.

After Fourier transforming the n-point function to momentum space and amputating the external legs, the Feynman diagram can be evaluated as follows:

1. For each propagator, P = −i/(p2 + m20 − iϵ);

2. For each vertex, V = −iλ0 ;

3. For each external point, E = 1;

4. Impose momentum conservation at each vertex;

5. Integrate over each undetermined loop momentum, $\int d^4p/(2\pi)^4$;

6. Divide by the symmetry factor.

The T-matrix of the scattering is then given by $i\mathcal{T} = (\sqrt{Z})^{n_i+n_f}\,i\mathcal{M}$, where $n_i$ and $n_f$ are the numbers of initial- and final-state particles and $i\mathcal{M}$ is the sum of all connected diagrams contributing to the n-point function.

25.7 Renormalization
25.7.1 Counting of ultraviolet divergence
Renormalization is the procedure in quantum field theory by which the divergent parts of a calculation, which would otherwise lead to nonsensical infinite results, are absorbed by redefinition into a few measurable quantities, thereby yielding finite answers.

Consider a pure scalar theory in d dimensions with a $\phi^n$ interaction term. The corresponding Lagrangian density is
$$\mathcal{L} = -\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{n!}\phi^n. \tag{25.81}$$
Let N be the number of external lines in one Feynman diagram, P the number of propagators, and V the number of vertices. The number of loops in the diagram is L = P − V + 1. There are n lines meeting at each vertex, so nV = 2P + N. Loosely speaking, each loop has an integral $\int d^dp$, while each propagator contributes a factor $p^{-2}$. Thus the superficial degree of divergence is
$$D = dL - 2P = d + \left[n\left(\frac{d-2}{2}\right) - d\right]V - \left(\frac{d-2}{2}\right)N. \tag{25.82}$$
Naively, we expect a diagram to have a divergence proportional to ΛD , where Λ is a momentum
cutoff, when D > 0. We expect a divergence of the form log Λ when D = 0, and no divergence
when D < 0.
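The counting behind (25.82) is easy to mechanize. The sketch below (hypothetical helper names, exact rational arithmetic via `fractions`) computes D both directly from D = dL − 2P and from the closed form, and confirms that the two agree; for ϕ⁴ theory in d = 4 it returns D = 4 − N independently of V.

```python
from fractions import Fraction

def loops_and_propagators(n, V, N):
    """Solve nV = 2P + N and L = P - V + 1 for a phi^n diagram."""
    if n * V < N or (n * V - N) % 2:
        raise ValueError("no diagram with these n, V, N")
    P = (n * V - N) // 2
    return P - V + 1, P

def degree_direct(d, n, V, N):
    """D = d*L - 2*P: d powers of momentum per loop, -2 per propagator."""
    L, P = loops_and_propagators(n, V, N)
    return d * L - 2 * P

def degree_closed_form(d, n, V, N):
    """Closed form of eq. (25.82)."""
    half = Fraction(d - 2, 2)
    return d + (n * half - d) * V - half * N

for V, N in [(1, 2), (2, 4), (3, 6), (4, 2)]:
    print(V, N, degree_direct(4, 4, V, N), degree_closed_form(4, 4, V, N))
```

For example, the one-loop four-point diagram (V = 2, N = 4) has D = 0, i.e. a logarithmic divergence, while the one-loop self-energy (V = 1, N = 2) has D = 2.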

According to the superficial degree of divergence of the diagrams, there are three possible types of ultraviolet behavior of quantum field theories. We will refer to them as follows:

Super-renormalizable Only a finite number of Feynman diagrams superficially diverge.

Renormalizable Only a finite number of amplitudes superficially diverge; however, divergences occur at all orders in perturbation theory.

Non-renormalizable All amplitudes are divergent at a sufficiently high order in perturbation theory.

For $\phi^4$ theory in 4-dimensional spacetime, we have D = 4 − N, so it is a renormalizable theory.

25.7.2 Renormalized perturbation theory


Since the $\phi^4$ theory is invariant under ϕ → −ϕ, all amplitudes with an odd number of external legs vanish. The only divergent amplitudes are shown in Figure 25.11.

Ignoring the vacuum diagram, these amplitudes contain three infinite constants. Our goal
is to absorb these constants into the three unobservable parameters of the theory: the bare
mass m0 , the bare coupling constant λ0 , and the field strength Z. To accomplish this goal, it
is convenient to reformulate the perturbation expansion so that these unobservable quantities
do not appear explicitly in the Feynman rules.

(unobservable vacuum energy shift)

Figure 25.11: Divergence of ϕ4 theory.

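Define
$$\phi_r \equiv Z^{-1/2}\phi, \qquad \delta_Z \equiv Z - 1, \qquad \delta_m \equiv Zm_0^2 - m^2, \qquad \delta_\lambda \equiv \lambda_0Z^2 - \lambda, \tag{25.83}$$
where m is the physical mass and λ the physical coupling constant, defined by on-shell (OS) renormalization conditions, as shown in Figure 25.12. The Lagrangian density then becomes
$$\mathcal{L} = -\frac{1}{2}\partial^\mu\phi_r\partial_\mu\phi_r - \frac{1}{2}m^2\phi_r^2 - \frac{\lambda}{4!}\phi_r^4 - \frac{1}{2}\delta_Z\,\partial^\mu\phi_r\partial_\mu\phi_r - \frac{1}{2}\delta_m\phi_r^2 - \frac{\delta_\lambda}{4!}\phi_r^4. \tag{25.84}$$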
The last three terms, known as counterterms, have absorbed the infinite but unobservable
shifts between the bare parameters and the physical parameters.


Figure 25.12: OS renormalization conditions.

We can use Feynman rules shown in Figure 25.13 to compute any amplitude in ϕ4 theory. The
procedure is as follows. Compute the desired amplitude as the sum of all possible diagrams
created from the propagator and vertices shown in Figure 25.13. The loop integrals in the di-
agrams will often diverge, so one must introduce a regulator. The result of this computation
will be a function of the three unknown parameters δZ , δm , and δλ . Adjust (or “renormalize”) these three parameters as necessary to maintain the renormalization conditions shown
in Figure 25.12. After this adjustment, the expression for the amplitude should be finite and in-
dependent of the regulator. This procedure, using Feynman rules with counterterms, is known
as renormalized perturbation theory.

Figure 25.13: Feynman rules for renormalized perturbation theory.

Mandelstam variables
In theoretical physics, the Mandelstam variables are numerical quantities that encode the energy, momentum, and angles of particles in a scattering process in a Lorentz-invariant fashion.

They are used for scattering processes of two particles to two particles. The Mandelstam vari-
ables s, t, u are defined as

s ≡ −(p1 + p2 )2 , t ≡ −(p1 − p3 )2 , u ≡ −(p1 − p4 )2 , (25.85)

where p1 and p2 are the four-momenta of the incoming particles and p3 and p4 the four-
momenta of the outgoing particles. s is known as the square of the center-of-mass energy
(invariant mass) and t the square of the four-momentum transfer. We can verify that

$$s + t + u = m_1^2 + m_2^2 + m_3^2 + m_4^2. \tag{25.86}$$
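The sum rule s + t + u = m1² + m2² + m3² + m4² can be verified numerically for arbitrary 2 → 2 kinematics. The following sketch builds center-of-mass momenta with the mostly-plus metric used in (25.85); the masses, energy, scattering angle, and function names are illustrative choices made here, not values from the text.

```python
import math

def minkowski_sq(p):
    """p^2 with signature (-,+,+,+) for p = (E, px, py, pz)."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2

def combine(p, q, sign=1):
    """Componentwise p + sign*q."""
    return tuple(a + sign * b for a, b in zip(p, q))

def two_body(e_cm, m_a, m_b, theta):
    """Back-to-back on-shell momenta in the CM frame, at angle theta in the x-z plane."""
    s = e_cm ** 2
    p = math.sqrt((s - (m_a + m_b) ** 2) * (s - (m_a - m_b) ** 2)) / (2 * e_cm)
    px, pz = p * math.sin(theta), p * math.cos(theta)
    return ((math.sqrt(m_a ** 2 + p ** 2), px, 0.0, pz),
            (math.sqrt(m_b ** 2 + p ** 2), -px, 0.0, -pz))

def mandelstam(p1, p2, p3, p4):
    """s, t, u as defined in eq. (25.85)."""
    s = -minkowski_sq(combine(p1, p2))
    t = -minkowski_sq(combine(p1, p3, -1))
    u = -minkowski_sq(combine(p1, p4, -1))
    return s, t, u

masses = (0.5, 1.0, 0.7, 1.2)                       # illustrative m1..m4
p1, p2 = two_body(4.0, masses[0], masses[1], 0.0)   # incoming pair along z
p3, p4 = two_body(4.0, masses[2], masses[3], 0.8)   # outgoing pair at angle 0.8 rad
s, t, u = mandelstam(p1, p2, p3, p4)
print(s + t + u, sum(m ** 2 for m in masses))
```

Changing the scattering angle changes t and u individually but leaves their sum with s fixed.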

25.7.3 Techniques for evaluating loop diagrams


Feynman’s formula
Feynman’s formula states that
$$\frac{1}{A_1\cdots A_n} = \int dF_n\,(x_1A_1 + \cdots + x_nA_n)^{-n}, \tag{25.87}$$
where the integration measure over the Feynman parameters $x_i$ is
$$\int dF_n = (n-1)!\int_0^1 dx_1\cdots dx_n\,\delta(x_1 + \cdots + x_n - 1). \tag{25.88}$$
This measure is normalized so that
$$\int dF_n = 1. \tag{25.89}$$
A generalization of Feynman’s formula is
$$\frac{1}{A_1^{\alpha_1}\cdots A_n^{\alpha_n}} = \frac{\Gamma(\sum_i\alpha_i)}{\prod_i\Gamma(\alpha_i)}\,\frac{1}{(n-1)!}\int dF_n\,\frac{\prod_ix_i^{\alpha_i-1}}{\left(\sum_ix_iA_i\right)^{\sum_i\alpha_i}}. \tag{25.90}$$
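Feynman's formula (25.87) can be sanity-checked by direct numerical integration for n = 2 and n = 3. The sketch below uses a plain midpoint rule with the delta function already solved for the last parameter; the function names, the sample values of A, B, C, and the grid sizes are arbitrary choices for illustration.

```python
def feynman_two(A, B, n=2000):
    """1! * integral_0^1 dx [x A + (1 - x) B]^(-2), midpoint rule."""
    h = 1.0 / n
    return sum(h / (x * A + (1.0 - x) * B) ** 2
               for x in ((k + 0.5) * h for k in range(n)))

def feynman_three(A, B, C, n=300):
    """2! * integral over x + y + z = 1 of (x A + y B + z C)^(-3), midpoint rule."""
    total = 0.0
    hx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / n          # y runs over [0, 1 - x]
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * hy
            z = 1.0 - x - y         # fixed by the delta function
            inner += hy / (x * A + y * B + z * C) ** 3
        total += hx * inner
    return 2.0 * total

print(feynman_two(1.0, 2.0), 1.0 / (1.0 * 2.0))
print(feynman_three(1.0, 2.0, 3.5), 1.0 / (1.0 * 2.0 * 3.5))
```

Both quadratures agree with 1/(A1 ··· An) to within the discretization error of the midpoint rule.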

Wick rotation

Figure 25.14: Wick rotation.


For an integral $\int d^dq\,f(q^2 - i\epsilon)$, think of the integral over $q^0$ from −∞ to +∞ as a contour
integral in the complex q 0 plane. If the integrand vanishes fast enough as |q 0 | → ∞, we can
rotate this contour by π/2, as shown in Figure 25.14, so that it runs from −i∞ to i∞. In
making this Wick rotation, the contour does not pass over any poles. Thus the value of the
integral is unchanged. It is now convenient to define a Euclidean d-dimensional vector q̄ via
q 0 = iq̄d and qj = q̄j ; then q 2 = q̄ 2 , where

q̄ 2 = q̄12 + · · · + q̄d2 . (25.91)

Also, $d^dq = i\,d^d\bar{q}$. Therefore, in general,
$$\int d^dq\,f(q^2 - i\epsilon) = i\int d^d\bar{q}\,f(\bar{q}^2), \tag{25.92}$$

as long as f (q̄ 2 ) → 0 faster than 1/q̄ d as q̄ → ∞.

Dimensional regularization
Dimensional regularization is a method for regularizing integrals in the evaluation of Feynman diagrams. For example, suppose one wishes to evaluate a loop integral that is logarithmically divergent in four dimensions, such as
$$\int\frac{d^d\bar{q}}{(2\pi)^d}\frac{1}{(\bar{q}^2 + m^2)^2}. \tag{25.93}$$
One first rewrites the integral so that the number of variables integrated over does not depend on d, and then formally varies the parameter d to include non-integer values such as d = 4 − ϵ. The integral 25.93 then becomes
$$\begin{aligned}
\int_0^\infty\frac{d\bar{q}}{(2\pi)^{4-\epsilon}}\frac{2\pi^{(4-\epsilon)/2}}{\Gamma(2-\epsilon/2)}\frac{\bar{q}^{3-\epsilon}}{(\bar{q}^2 + m^2)^2} &= \frac{2^{\epsilon-4}\pi^{\epsilon/2-1}}{\sin(\pi\epsilon/2)\,\Gamma(1-\epsilon/2)}\,m^{-\epsilon} \\
&= \frac{1}{8\pi^2\epsilon} - \frac{1}{16\pi^2}\left[\ln\frac{m^2}{4\pi} + \gamma\right] + O(\epsilon).
\end{aligned} \tag{25.94}$$

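The general formula for calculating the loop integral is
$$\int\frac{d^d\bar{q}}{(2\pi)^d}\frac{(\bar{q}^2)^a}{(\bar{q}^2 + D)^b} = \frac{\Gamma(b-a-d/2)\,\Gamma(a+d/2)}{(4\pi)^{d/2}\,\Gamma(b)\,\Gamma(d/2)}\,D^{-(b-a-d/2)}. \tag{25.95}$$
If a = 0, the formula will be
$$\int\frac{d^d\bar{q}}{(2\pi)^d}\frac{1}{(\bar{q}^2 + D)^b} = \frac{\Gamma(b-d/2)}{(4\pi)^{d/2}\,\Gamma(b)}\,D^{-(b-d/2)}. \tag{25.96}$$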
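The general formula (25.95) and the expansion (25.94) can both be checked numerically. The sketch below is a minimal illustration with hypothetical function names: it compares the Gamma-function expression with a direct radial quadrature in two convergent four-dimensional cases, and compares its value at d = 4 − ϵ with the pole-plus-finite-part expansion for small ϵ. The substitution u = q/(1 + q) used to map the radial integral to a finite interval is a choice made here.

```python
import math

EULER_GAMMA = 0.5772156649015329

def loop_integral(a, b, d, D):
    """Right-hand side of eq. (25.95)."""
    return (math.gamma(b - a - d / 2) * math.gamma(a + d / 2)
            / ((4 * math.pi) ** (d / 2) * math.gamma(b) * math.gamma(d / 2))
            * D ** (-(b - a - d / 2)))

def radial_quadrature(a, b, d, D, n=20000):
    """Left-hand side of (25.95) for convergent cases, midpoint rule in u = q/(1+q)."""
    solid_angle = 2 * math.pi ** (d / 2) / math.gamma(d / 2)
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        q = u / (1.0 - u)
        jacobian = 1.0 / (1.0 - u) ** 2
        total += h * jacobian * q ** (d - 1 + 2 * a) / (q * q + D) ** b
    return solid_angle * total / (2 * math.pi) ** d

def log_divergent_expansion(eps, m_sq):
    """Pole and finite part of eq. (25.94), dropping O(eps) terms."""
    return (1 / (8 * math.pi ** 2 * eps)
            - (math.log(m_sq / (4 * math.pi)) + EULER_GAMMA) / (16 * math.pi ** 2))

print(loop_integral(0, 3, 4, 2.0), radial_quadrature(0, 3, 4, 2.0))
print(loop_integral(1, 4, 4, 2.0), radial_quadrature(1, 4, 4, 2.0))
print(loop_integral(0, 2, 4 - 1e-4, 1.5), log_divergent_expansion(1e-4, 1.5))
```

The first two lines should agree to quadrature accuracy, and the last pair should differ only by a term of order ϵ.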

25.7.4 One-loop structure of ϕ4 theory


First consider the basic two-particle scattering amplitude. The expansion of the scattering
amplitude to one-loop level is shown in Figure 25.15.


Figure 25.15: Feynman diagram representation of two-particle scattering to one-loop.

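Define $p \equiv p_1 + p_2$. The value of the second diagram in the perturbation series is
$$\frac{(-i\lambda)^2}{2}\int\frac{d^4k}{(2\pi)^4}\frac{-i}{k^2 + m^2}\frac{-i}{(k+p)^2 + m^2} \equiv (-i\lambda)^2\,iV(p^2). \tag{25.97}$$
Thus, the total amplitude can be written as
$$i\mathcal{M} = -i\lambda + (-i\lambda)^2\left[iV(-s) + iV(-t) + iV(-u)\right] - i\delta_\lambda + O(\lambda^3). \tag{25.98}$$
To keep λ dimensionless in dimensional regularization, we can make the transformation $\lambda \to \lambda\tilde{\mu}^\epsilon$, where $\tilde{\mu}$ is an arbitrary number with mass dimension 1 and $\epsilon \equiv 4 - d$. We can figure out that
$$V(p^2) = -\frac{1}{32\pi^2}\int_0^1 dx\left[\frac{2}{\epsilon} + \ln\left(\frac{\mu^2}{D(p^2)}\right)\right], \tag{25.99}$$
where $\mu^2 \equiv 4\pi e^{-\gamma}\tilde{\mu}^2$ and $D(p^2) = x(1-x)p^2 + m^2$. The OS renormalization conditions imply that
$$\delta_\lambda = -\lambda^2\left[V(-4m^2) + 2V(0)\right] + O(\lambda^3). \tag{25.100}$$
It then follows that
$$i\mathcal{M} = -i\lambda - \frac{i\lambda^2}{32\pi^2}\int_0^1 dx\left[\ln\left(\frac{D(-s)}{D(-4m^2)}\right) + \ln\left(\frac{D(-t)}{D(0)}\right) + \ln\left(\frac{D(-u)}{D(0)}\right)\right] + O(\lambda^3). \tag{25.101}$$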
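To determine δZ and δm we must calculate the two-point function
$$\frac{-i}{p^2 + m^2 + M^2(p^2)}, \tag{25.102}$$
where $-iM^2(p^2)$ is the sum of all 1PI insertions into the propagator. The OS renormalization conditions require that the pole of this full propagator occur at $p^2 = -m^2$ and have residue 1, i.e.,
$$M^2(p^2)\big|_{p^2=-m^2} = 0, \qquad \frac{dM^2(p^2)}{dp^2}\bigg|_{p^2=-m^2} = 0. \tag{25.103}$$
It can be worked out that
$$-iM^2(p^2) = \frac{i\lambda}{32\pi^2}\left[\frac{2}{\epsilon} + \ln\left(\frac{\mu^2}{m^2}\right) + 1\right]m^2 - i(p^2\delta_Z + \delta_m), \tag{25.104}$$
thus leading to
$$\delta_Z = O(\lambda^2), \qquad \delta_m = \frac{\lambda}{32\pi^2}\left[\frac{2}{\epsilon} + \ln\left(\frac{\mu^2}{m^2}\right) + 1\right]m^2 + O(\lambda^2), \qquad M^2(p^2) = O(\lambda^2). \tag{25.105}$$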
The details of the calculation in this subsection can be found in section 10.2 of An introduction
to quantum field theory (M.E.Peskin & D.V.Schroeder).

Perturbation theory to all orders


We begin by summing all one-particle irreducible diagrams with two external lines; this gives
us the self-energy M 2 . We next sum all amputated diagrams with four external lines; this gives
us the four-point vertex function V4 (k1 , k2 , k3 , k4 ). (Amputated diagrams with four external
lines in ϕ4 theory must be 1PI). Order by order in λ, we must adjust the value of the Lagrangian
coefficients δZ , δm , and δλ to maintain the conditions M 2 (p2 = −m2 ) = 0, dM 2 /dp2 (p2 =
−m2 ) = 0, and V4 (s = 4m2 ) = 0.
Next we will construct the n-point vertex functions Vn with 6 ≤ n ≤ E, where E is the num-
ber of external lines in the process of interest. We compute these using a skeleton expansion.
This means that we draw all the contributing 1PI diagrams, but omit diagrams that include
either propagator or four-point vertex corrections. That is, we include only diagrams that are
not only 1PI, but also 2PI and 4PI: they remain connected when any one, two, or four lines
are cut. (Cutting four lines may isolate a single tree-level vertex, but nothing more complicated.) Then we take the propagators and vertices in these diagrams to be given by the exact propagator $-i/[p^2 + m^2 + M^2(p^2)]$ and vertex V4 (k1 , k2 , k3 , k4 ), rather than by the tree-level propagator and vertex. We then sum these skeleton diagrams to get Vn for 4 < n ≤ E. Or-
der by order in λ, this procedure is equivalent to computing Vn by summing the usual set of
contributing 1PI diagrams.
Next we draw all tree-level diagrams that contribute to the process of interest (which has E
external lines), including not only four-point vertices, but also n-point vertices. Then we eval-
uate these diagrams using the exact propagator for internal lines, and the exact 1PI vertices
Vn ; external lines are assigned a factor of one. We sum these tree diagrams to get the scatter-
ing amplitude. Order by order in λ, this procedure is equivalent to computing the scattering
amplitude by summing the usual set of contributing diagrams. Thus we now know how to
compute an arbitrary scattering amplitude to arbitrarily high order. The procedure is the same
in any quantum field theory; only the form of the propagators and vertices change, depending
on the spins of the fields.

25.7.5 General renormalization theory


Recall some of the major results and methods of previous subsections.
1. In perturbation theory, bare and physical quantities are related by ultraviolet-divergent
expressions
mphys = m0 + ∆m , (25.106)
where mphys is finite, ∆m is ultraviolet-divergent, and so m0 is necessarily ultraviolet-
divergent.
2. We express the Lagrangian in terms of physical quantities, and separate it into

L = L0 + LI + LCT , (25.107)

where L0 is the canonically normalized free Lagrangian for physical fields and masses,
LI contains the interaction, again in terms of physical parameters, and LCT contains the

counterterms with ultraviolet-divergent coefficients. From L0 , we obtain the propagators of the physical fields. LI and LCT give interaction vertices.

3. At the one-loop level, the self-energy is given by the effective two-point vertices: the
1PI two-point vertex of the interaction and the counter-term two-point vertex. The
counterterms absorb ultraviolet divergences, and the finite parts of the counterterms
are determined by renormalization conditions, which ensure the quantities in L0 + LI
are physical. The conditions constrain the self-energy and the effective vertices, and give
a finite, uniquely-determined value for the counterterms.

Now, for a general theory in d-dimensional spacetime, the field content is given by $\phi_f$, f = 1, 2, · · · , where f labels the field type. The (mass) dimension of the field is $[\phi_f] = \Delta_f$ and we have $\Delta_f > 0$ in all physical theories. We have interaction vertices of type i, i = 1, 2, · · · , contributing a term of the form
$$\lambda_i\,\partial^{n_i}\prod_f\phi_f^{n_{if}}, \tag{25.108}$$
where $\lambda_i$ is the coupling constant, with dimension
$$[\lambda_i] = \kappa_i = d - n_i - \sum_f n_{if}\Delta_f. \tag{25.109}$$
Now consider a 1PI diagram in such a theory. On the one hand, the value of the diagram is $\mathcal{M} \sim \Lambda^D\prod_i\lambda_i^{V_i}$, where $V_i$ is the number of vertices of type i, D the superficial degree of divergence and Λ a high-momentum cutoff, leading to $[\mathcal{M}] = D + \sum_iV_i\kappa_i$.
On the other hand, the diagram could arise from an interaction term $\lambda'\prod_f\phi_f^{E_f}$, where $E_f$ is the number of external lines of $\phi_f$, resulting in $[\mathcal{M}] = [\lambda'] = d - \sum_fE_f\Delta_f$. It follows that the superficial degree of divergence of the diagram is
$$D = d - \sum_fE_f\Delta_f - \sum_iV_i\kappa_i. \tag{25.110}$$
Diagrams that are superficially ultraviolet divergent (D ≥ 0) therefore satisfy
$$\sum_fE_f\Delta_f + \sum_iV_i\kappa_i \le d. \tag{25.111}$$

We can now divide all theories into

super-renormalizable theory all κi > 0;

renormalizable theory all κi ≥ 0;

non-renormalizable theory at least one κi < 0.
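The classification above only requires the mass dimensions of the couplings. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the scalar field dimension Δ = (d − 2)/2 is the standard canonical value, which the text uses implicitly in (25.82). It evaluates κ from (25.109) for a few scalar interactions and applies the three-way classification.

```python
from fractions import Fraction

def scalar_dimension(d):
    """Canonical mass dimension of a scalar field in d spacetime dimensions (assumed)."""
    return Fraction(d - 2, 2)

def coupling_dimension(d, n_derivatives, field_powers):
    """kappa = d - n_i - sum_f n_if * Delta_f, eq. (25.109).

    `field_powers` is a list of (Delta_f, n_if) pairs.
    """
    return d - n_derivatives - sum(delta * power for delta, power in field_powers)

def classify(kappas):
    """Apply the classification by the sign of the coupling dimensions."""
    if all(k > 0 for k in kappas):
        return "super-renormalizable"
    if all(k >= 0 for k in kappas):
        return "renormalizable"
    return "non-renormalizable"

def phi_n_theory(d, n):
    """Classification of a single phi^n interaction without derivatives."""
    kappa = coupling_dimension(d, 0, [(scalar_dimension(d), n)])
    return kappa, classify([kappa])

for d, n in [(4, 3), (4, 4), (4, 6), (3, 6), (6, 3)]:
    print(d, n, *phi_n_theory(d, n))
```

With these assumptions, ϕ³ in d = 4 comes out super-renormalizable, ϕ⁴ in d = 4, ϕ⁶ in d = 3 and ϕ³ in d = 6 renormalizable, and ϕ⁶ in d = 4 non-renormalizable.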

Consider a generic divergent diagram $\mathcal{M}$ of degree D, that is,
$$\mathcal{M} = \int^{\Lambda}ds\,s^{D-1}, \tag{25.112}$$

where all loop momenta are taken proportional to s. Generally, internal propagators have the form
$$\frac{1}{(as + p)^{\alpha}\cdots} \sim \frac{1}{s^{\alpha}} \tag{25.113}$$
for large s, where a is a numerical constant and p is a combination of the external momenta. Differentiating $\mathcal{M}$ n times with respect to p gives a term proportional to
$$\frac{1}{(as + p)^{\alpha+n}} \sim \frac{1}{s^{\alpha+n}}. \tag{25.114}$$

Thus taking D + 1 derivatives with respect to the external momenta makes M finite. This means that we have the expansion

M(p) = M0 + M1 p + · · · + MD pD + finite terms, (25.115)

where the argument p of the function represents the collection of external momenta. We have suppressed the index structure, and $\mathcal{M}_0, \mathcal{M}_1, \cdots, \mathcal{M}_D$ are potentially divergent constants. Suppose that $\mathcal{M}$ has $E_f$ external lines of the field $\phi_f$. Then the divergence of $\mathcal{M}(p)$ can be canceled by counterterms of the form
$$\sum_{j=0}^{D}A_j\,(\partial)^j\prod_f\phi_f^{E_f}, \tag{25.116}$$
where the $A_j$ are divergent coefficients chosen to cancel the divergences in $\mathcal{M}_j$. The index structure in $A_j\partial^j$ should match the suppressed index structure of $\mathcal{M}(p)$.

However, a more difficult situation occurs when we have nested or overlapping divergences,
that is, when two divergent loops share a propagator. Terms like log p2 log Λ2 would appear,
contradicting our naive argument, based on the criterion of the superficial degree of diver-
gence, that the divergent terms of a Feynman integral are always simple polynomials in p. We
will refer to divergences multiplying only polynomials in p as local divergences, since their
Fourier transforms back to position space are delta functions or derivatives of delta functions.
We will call the new, nonpolynomial, term a nonlocal divergence. It is a local divergence sur-
rounded by an ordinary, nondivergent, quantum field theory process.

Fortunately, the BPHZ theorem states that, for a general renormalizable quantum field theory,
to any order in perturbation theory, all divergences are removed by the counterterm vertices
corresponding to superficially divergent amplitudes. In other words, any superficially renor-
malizable quantum field theory is in fact rendered finite when one performs renormalized
perturbation theory with the complete set of counterterms.

A more detailed discussion of the appearance and cancellation of nonlocal divergences can be found in sections 10.4 and 10.5 of An introduction to quantum field theory (M.E.Peskin & D.V.Schroeder).

25.8 Renormalization group


25.8.1 Modified minimal-subtraction scheme
In the minimal-subtraction renormalization scheme, we do not demand that m be the physical mass of the particle or that ϕ create a normalized one-particle state. Instead we choose δZ , δm and δλ to cancel the infinities, and nothing more; i.e., δZ , δm and δλ have no finite parts. This choice is called the modified minimal-subtraction ($\overline{\mathrm{MS}}$) scheme. (“Modified” because we introduced µ via $\lambda \to \lambda\tilde{\mu}^\epsilon$, with $\mu \equiv \sqrt{4\pi}\,e^{-\gamma/2}\tilde{\mu}$; had we set $\mu = \tilde{\mu}$ instead, the scheme would be just plain minimal subtraction.)
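For the one-loop corrections to the propagator, we have
$$\delta_Z = O(\lambda^2), \qquad \delta_m = \left[\frac{\lambda}{16\pi^2\epsilon} + O(\lambda^2)\right]m^2, \qquad M^2(p^2) = \frac{\lambda}{32\pi^2}\left[\ln\left(\frac{m^2}{\mu^2}\right) - 1\right]m^2 + O(\lambda^2). \tag{25.117}$$
In the $\overline{\mathrm{MS}}$ scheme, the propagator will no longer have a pole at $p^2 = -m^2$. However, by definition, the actual physical mass $m_{ph}$ of the particle is determined by the location of this pole: $p^2 = -m_{ph}^2$. The relation between m and $m_{ph}$ is given by
$$m_{ph}^2 = M^2(-m_{ph}^2) + m^2 = \left\{1 + \frac{\lambda}{32\pi^2}\left[\ln\left(\frac{m^2}{\mu^2}\right) - 1\right] + O(\lambda^2)\right\}m^2. \tag{25.118}$$
Because $m_{ph}$ is independent of µ, i.e., $dm_{ph}/d\mu = 0$, it can be derived that
$$\frac{dm}{d\ln\mu} = \left[\frac{\lambda}{32\pi^2} + O(\lambda^2)\right]m. \tag{25.119}$$
The residue R of the propagator’s pole is no longer one either. In $\phi^4$ theory, we have
$$R = 1 + O(\lambda^2). \tag{25.120}$$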

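For the one-loop corrections to the vertex, we have
$$\begin{aligned}
\delta_\lambda &= \left[\frac{3\lambda^2}{16\pi^2} + O(\lambda^3)\right]\frac{1}{\epsilon}, \\
i\mathcal{M} &= -i\lambda - \frac{i\lambda^2}{32\pi^2}\int_0^1 dx\left[\ln\left(\frac{D(-s)}{\mu^2}\right) + \ln\left(\frac{D(-t)}{\mu^2}\right) + \ln\left(\frac{D(-u)}{\mu^2}\right)\right] + O(\lambda^3).
\end{aligned} \tag{25.121}$$
The T-matrix element for 2 → 2 scattering would be
$$i\mathcal{T} = R^2\,i\mathcal{M}. \tag{25.122}$$
Requiring that $\mathcal{T}$ be independent of the choice of µ, we can derive that
$$\frac{d\lambda}{d\ln\mu} = \frac{3\lambda^2}{16\pi^2} + O(\lambda^3). \tag{25.123}$$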

For a scattering process with $p^2 \gg m^2$, we have $D \sim x(1-x)p^2$. In the OS scheme, the one-loop correction to the propagator or vertex generally includes a factor $\ln(D/D_0) \sim \ln(p^2/m^2)$, so the perturbation expansion is no longer a good approximation when $p^2 \gg m^2$. In the $\overline{\mathrm{MS}}$ scheme, introducing µ allows us to address this problem: if we choose $\mu^2 \sim p^2$, no such large logarithm arises. If we choose µ appropriately, i.e., comparable to the energy scale of the physical process, we can improve our perturbation expansion. Thus λ(µ) and m(µ) can be considered scale-dependent coupling constants. The reason we get large logarithmic terms in the OS scheme is that we are trying to use coupling constants defined at one scale to describe physics at very different scales.
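To make the scale dependence concrete, the one-loop equation dλ/d ln µ = 3λ²/(16π²) can be integrated numerically and compared with its closed-form solution λ(µ) = λ0 / [1 − (3λ0/16π²) ln(µ/µ0)]. The sketch below is illustrative only: it drops the O(λ³) terms, the starting value λ0 = 0.5 and the range of ln(µ/µ0) are arbitrary, and the function names are hypothetical.

```python
import math

B0 = 3.0 / (16.0 * math.pi ** 2)   # one-loop coefficient of d(lambda)/d(ln mu)

def beta_one_loop(lam):
    """One-loop truncation of the flow: 3 lambda^2 / (16 pi^2)."""
    return B0 * lam * lam

def run_coupling_rk4(lam0, log_ratio, steps=1000):
    """Integrate d(lambda)/d(ln mu) from ln(mu/mu0) = 0 to log_ratio with RK4."""
    h = log_ratio / steps
    lam = lam0
    for _ in range(steps):
        k1 = beta_one_loop(lam)
        k2 = beta_one_loop(lam + 0.5 * h * k1)
        k3 = beta_one_loop(lam + 0.5 * h * k2)
        k4 = beta_one_loop(lam + h * k3)
        lam += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return lam

def run_coupling_exact(lam0, log_ratio):
    """Closed-form solution of the one-loop equation."""
    return lam0 / (1.0 - B0 * lam0 * log_ratio)

print(run_coupling_rk4(0.5, 5.0), run_coupling_exact(0.5, 5.0))
```

Because the one-loop coefficient is positive, the coupling grows slowly as µ increases, and the closed form shows that the truncated equation breaks down where the denominator approaches zero.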

25.8.2 Equations of the renormalization group


In the previous subsection, we used the fact that physical observables must be independent of the fake parameter µ to figure out how the Lagrangian parameters m and λ must change with µ.
In this subsection we rederive these results from a much more formal point of view. Equations
that tell us how the Lagrangian parameters vary with µ are collectively called the equations of
the renormalization group.
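The Lagrangian density of $\phi^4$ theory
$$\mathcal{L} = -\frac{1}{2}\partial^\mu\phi_0\partial_\mu\phi_0 - \frac{1}{2}m_0^2\phi_0^2 - \frac{\lambda_0}{4!}\phi_0^4 \tag{25.124}$$
can also be written as
$$\mathcal{L} = -\frac{1}{2}Z_\phi\,\partial^\mu\phi\,\partial_\mu\phi - \frac{1}{2}Z_m m^2\phi^2 - Z_\lambda\frac{\lambda}{4!}\tilde{\mu}^\epsilon\phi^4, \tag{25.125}$$
with
$$\phi_0 = Z_\phi^{1/2}\phi, \qquad m_0 = Z_\phi^{-1/2}Z_m^{1/2}m, \qquad \lambda_0 = Z_\phi^{-2}Z_\lambda\lambda\tilde{\mu}^\epsilon. \tag{25.126}$$
After using dimensional regularization, the infinities coming from loop integrals take the form of inverse powers of ϵ. In the $\overline{\mathrm{MS}}$ scheme, we choose the Zs to cancel off these powers of 1/ϵ, and nothing more. Therefore the Zs can be written as
$$Z_\phi = 1 + \sum_{n=1}^{\infty}\frac{a_n(\lambda)}{\epsilon^n}, \qquad Z_m = 1 + \sum_{n=1}^{\infty}\frac{b_n(\lambda)}{\epsilon^n}, \qquad Z_\lambda = 1 + \sum_{n=1}^{\infty}\frac{c_n(\lambda)}{\epsilon^n}. \tag{25.127}$$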

In ϕ4 theory, we have a1 = O(λ2 ), b1 = λ/16π 2 + O(λ2 ) and c1 = 3λ/16π 2 + O(λ2 ).


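Define
$$G(\lambda,\epsilon) \equiv \ln\left(Z_\phi^{-2}Z_\lambda\right) = \sum_{n=1}^{\infty}\frac{G_n(\lambda)}{\epsilon^n}. \tag{25.128}$$
We can work out that $G_1 = c_1 - 2a_1 = 3\lambda/16\pi^2 + O(\lambda^2)$. As $\ln\lambda_0 = G + \ln\lambda + \epsilon\ln\tilde{\mu}$ and $d\lambda_0/d\mu = 0$, we can derive that
$$\left(1 + \frac{\lambda G_1'}{\epsilon} + \frac{\lambda G_2'}{\epsilon^2} + \cdots\right)\frac{d\lambda}{d\ln\mu} + \epsilon\lambda = 0. \tag{25.129}$$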
dλ/d ln µ is the rate at which λ must change to compensate for a small change in ln µ. In
a renormalizable theory, this rate should be finite in the ϵ → 0 limit. Therefore, the beta
function, defined via

β(λ) ≡ , (25.130)
d ln µ
25.8 Renormalization group –301/453–

is given by
3λ2 
β(λ) = −ϵλ + λ2 G′1 (λ) = 2
+ O λ3 . (25.131)
16π
The first term, −ϵλ, is fixed by matching the O(ϵ) terms in 25.129. The second term, λ^2 G_1′, is
similarly determined by matching the O(ϵ^0) terms. Terms that are higher-order in 1/ϵ must
also cancel, and this determines all the other G_n′(λ) in terms of G_1′(λ). These relations among
the G_n′(λ)s can be checked order by order in perturbation theory.
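
As a rough numerical illustration (not part of the original derivation), the one-loop truncation of the flow can be integrated directly. The sketch below assumes the ϵ → 0 beta function β(λ) = 3λ²/16π² with all higher orders dropped; the function names, the starting value λ = 0.5, and the closed form used for comparison (obtained by separating variables in the truncated equation) are illustrative choices.

```python
import math

B0 = 3.0 / (16.0 * math.pi ** 2)  # one-loop coefficient: beta(lam) = B0 * lam**2


def beta(lam):
    """One-loop beta function of phi^4 theory in the eps -> 0 limit (truncated)."""
    return B0 * lam ** 2


def run_coupling(lam0, t, steps=1000):
    """Integrate d(lam)/d(ln mu) = beta(lam) from ln(mu/mu0) = 0 to t with RK4."""
    h = t / steps
    lam = lam0
    for _ in range(steps):
        k1 = beta(lam)
        k2 = beta(lam + 0.5 * h * k1)
        k3 = beta(lam + 0.5 * h * k2)
        k4 = beta(lam + h * k3)
        lam += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return lam


def exact_one_loop(lam0, t):
    """Closed-form solution of the truncated equation (separation of variables)."""
    return lam0 / (1.0 - B0 * lam0 * t)


lam0 = 0.5
for t in (-5.0, 0.0, 5.0):  # t = ln(mu/mu0)
    print(t, run_coupling(lam0, t), exact_one_loop(lam0, t))
```

Since β > 0 here, the coupling grows toward the ultraviolet (t > 0) and shrinks toward the infrared (t < 0), which is the first of the three behaviours discussed in the next subsection.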

Define
\[ M(\lambda,\epsilon) \equiv \ln\left(Z_m^{1/2}Z_\phi^{-1/2}\right) = \sum_{n=1}^{\infty}\frac{M_n(\lambda)}{\epsilon^n}. \tag{25.132} \]

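We can work out that M_1 = b_1/2 − a_1/2 = λ/32π^2 + O(λ^2). As ln m_0 = M(λ, ϵ) + ln m
and d ln m_0/d ln µ = 0, we can derive that
\[ \frac{d\ln m}{d\ln\mu} = -\frac{\partial M(\lambda,\epsilon)}{\partial\lambda}\frac{d\lambda}{d\ln\mu} = \left(\epsilon\lambda - \lambda^2 G_1'\right)\sum_{n=1}^{\infty}\frac{M_n'(\lambda)}{\epsilon^n} = \lambda M_1'(\lambda) + \cdots, \tag{25.133} \]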

where the ellipses stand for terms with powers of 1/ϵ. In a renormalizable theory, d ln m/d ln µ
should be finite in the ϵ → 0 limit, and so these terms must actually all be zero. Therefore, the
anomalous dimension of the mass, defined via
\[ \gamma_m(\lambda) \equiv \frac{d\ln m}{d\ln\mu}, \tag{25.134} \]

is given by
\[ \gamma_m(\lambda) = \lambda M_1'(\lambda) = \frac{\lambda}{32\pi^2} + O(\lambda^2). \tag{25.135} \]

Let us now consider the n-point Green function in the MS renormalization scheme. The bare
Green function should be independent of µ. The bare and renormalized Green functions are
related by G_0^{(n)} = Z_ϕ^{n/2} G^{(n)}. Taking the logarithm and differentiating with respect to ln µ, we
get
\[ \left(\frac{\partial}{\partial\ln\mu} + \frac{d\lambda}{d\ln\mu}\frac{\partial}{\partial\lambda} + \frac{dm}{d\ln\mu}\frac{\partial}{\partial m} + \frac{n}{2}\frac{d\ln Z_\phi}{d\ln\mu}\right)G^{(n)}(\lambda,m,\mu) = 0. \tag{25.136} \]
We can write
\[ \ln Z_\phi = \frac{a_1}{\epsilon} + \frac{a_2 - a_1^2/2}{\epsilon^2} + \cdots \tag{25.137} \]
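Then we have
\[ \frac{d\ln Z_\phi}{d\ln\mu} = \frac{\partial\ln Z_\phi}{\partial\lambda}\frac{d\lambda}{d\ln\mu} = \left(\frac{a_1'}{\epsilon} + \cdots\right)\left[-\epsilon\lambda + \lambda^2 G_1'(\lambda)\right] = -\lambda a_1' + \cdots \tag{25.138} \]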

where the ellipses in the last expression stand for terms with powers of 1/ϵ. Since G^{(n)} should vary
smoothly with µ in the ϵ → 0 limit, these must all be zero. Therefore, the anomalous dimension
of the field, defined via
\[ \gamma_\phi(\lambda) \equiv \frac{1}{2}\frac{d\ln Z_\phi}{d\ln\mu}, \tag{25.139} \]

is given by
\[ \gamma_\phi(\lambda) = -\frac{1}{2}\lambda a_1' = O(\lambda^2). \tag{25.140} \]
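Equation 25.136 can now be written as
\[ \left(\frac{\partial}{\partial\ln\mu} + \beta(\lambda)\frac{\partial}{\partial\lambda} + \gamma_m(\lambda)\,m\frac{\partial}{\partial m} + n\gamma_\phi(\lambda)\right)G^{(n)}(\lambda,m,\mu) = 0 \tag{25.141} \]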

in the ϵ → 0 limit. This is the Callan–Symanzik equation for the Green function.

25.8.3 Running of coupling constants


Three behaviours are possible for β(λ) in the region of small λ:

1. β(λ) > 0;

2. β(λ) = 0;

3. β(λ) < 0.

In theories of the first class, the running coupling constant goes to zero in the infra-red, lead-
ing to definite predictions about the small-momentum behavior of the theory. However, the
running coupling constant becomes large in the region of high momenta. Thus the short-
distance behavior of the theory cannot be computed using Feynman diagram perturbation
theory. A Feynman diagram analysis is useful in such theories if one is mainly interested in
large-distance or macroscopic behavior.

In theories of the second class, the coupling constant does not flow. In these theories, the
running coupling constant is independent of the momentum scale, and thus equal to the bare
coupling. This means that there can be no ultraviolet divergences in the relation of coupling
constants. The only possible ultraviolet divergences in such theories are those associated with
field rescaling, which automatically cancel in the computation of S-matrix elements.

In theories of the third class, the running coupling constant becomes large in the large-distance
regime and becomes small at large momenta or short distances. Such theories are called
asymptotically free. In theories of this class, the short-distance behavior is completely solv-
able by Feynman diagram methods. Though ultraviolet divergences appear in every order of
perturbation theory, the renormalization group tells us that the sum of these divergences is
completely harmless.

In the region of strong coupling, the approximation we have made, ignoring the higher-order
terms in the β function, is no longer valid. It is a logical possibility that the leading order term
is positive while the higher terms of the β function are negative, so that the β function has the
form shown in Figure 25.16(a). In this case the β function has a zero at a non-zero value λ∗ .
When λ approaches this value, the renormalization group flow slows to a halt; thus λ = λ∗
would be a non-trivial fixed point of the renormalization group.

Figure 25.16: Possible forms of the β function with nontrivial zeros.

For a β function of the form of Figure 25.16(a), the β function behaves in the vicinity of the
fixed point as β ≈ −B(λ − λ∗), where B is a positive constant. For λ near λ∗,
\[ \frac{d\lambda}{d\ln\mu} \approx -B(\lambda - \lambda_*). \tag{25.142} \]

The solution of this equation is
\[ \lambda(\mu) = \lambda_* + C\left(\frac{\mu_0}{\mu}\right)^{B}. \tag{25.143} \]

Thus, λ indeed tends to λ∗ as µ → ∞, and the rate of approach is governed by the slope of the
β function at the fixed point.
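
The power-law approach to the fixed point can be checked numerically. The following sketch is only an illustration under assumed values: LAM_STAR, B, C and MU0 are arbitrary numbers chosen for the check, and the derivative with respect to ln µ is estimated by a central difference rather than computed analytically.

```python
import math

# Illustrative values only; none of these numbers come from a specific theory.
LAM_STAR, B, C, MU0 = 2.0, 0.7, 0.5, 1.0


def lam(mu):
    """Candidate solution lam(mu) = lam_* + C * (mu0/mu)**B near the fixed point."""
    return LAM_STAR + C * (MU0 / mu) ** B


def dlam_dlnmu(mu, h=1e-5):
    """Central-difference estimate of d(lam)/d(ln mu)."""
    return (lam(mu * math.exp(h)) - lam(mu * math.exp(-h))) / (2.0 * h)


for mu in (1.0, 10.0, 1e3, 1e6):
    lhs = dlam_dlnmu(mu)                # left-hand side of the linearized flow
    rhs = -B * (lam(mu) - LAM_STAR)     # right-hand side, -B (lam - lam_*)
    print(mu, lam(mu), lhs, rhs)
```

The two printed columns agree to the accuracy of the finite difference, and λ(µ) − λ∗ shrinks as µ^(−B), so a steeper slope B means a faster approach.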

For a massless scalar field with a fixed point, the solution of the Callan–Symanzik equation for
the propagator at the fixed point is
\[ G^{(2)}(p) = \frac{C(\lambda_*)}{p^2}\left(\frac{\mu^2}{p^2}\right)^{-\gamma_\phi(\lambda_*)}, \tag{25.144} \]

where C(λ∗ ) is an integration constant. Thus the two-point correlation function returns to
the form of a simple scaling law, but with a power law different from that expected by dimen-
sional analysis. At the fixed point we have a scale-invariant quantum field theory in which the
interactions of the theory affect the law of rescaling.

A similar behavior is possible in an asymptotically free theory. If the β function has the form
shown in Figure 25.16(b), the running coupling constant will tend to a fixed point λ∗ as µ → 0.
The two-point correlation function of fields will tend to a power law for asymptotically small
momenta. The two cases shown in Figure 25.16(a) and (b) are called, respectively, ultraviolet-
stable and infrared-stable fixed points.

In higher orders of perturbation theory, β and γ depend on the specific renormalization con-
ventions. However, the existence of a zero of the β function, the slope B at the zero, and the
value of the anomalous dimension at the fixed point should all be independent of the conven-
tions used to compute β and γ.

25.9 Spontaneous symmetry breaking


25.9.1 Effective action
Consider a quantum field ϕ in the presence of an external source J. We define an energy
functional E[J] by
\[ Z[J] = e^{-iE[J]} = \int\mathcal{D}\phi\,\exp\left[i\int d^4x\,\left(\mathcal{L}[\phi] + J\phi\right)\right]. \tag{25.145} \]

We define the quantity ϕcl (x), called the classical field, by

ϕcl (x) ≡ ⟨Ω|ϕ(x)|Ω⟩J . (25.146)

It follows that
\[ \frac{\delta}{\delta J(x)}E[J] = -\phi_{cl}(x). \tag{25.147} \]
The effective action is defined as the Legendre transform of E[J]:
\[ \Gamma[\phi_{cl}] \equiv -E[J] - \int d^4y\,J(y)\phi_{cl}(y). \tag{25.148} \]

If L is invariant under the transformation U , i.e., L(U ϕ) = L(ϕ), it can be shown that the
effective action Γ is also invariant under transformation U , i.e., Γ(U ϕcl ) = Γ(ϕcl ).
Thanks to the properties of the Legendre transformation, we get
\[ \frac{\delta}{\delta\phi_{cl}(x)}\Gamma[\phi_{cl}] = -J(x). \tag{25.149} \]
If the external source is set to zero, we will have
\[ \frac{\delta}{\delta\phi_{cl}(x)}\Gamma[\phi_{cl}] = 0. \tag{25.150} \]
The solutions of this equation are the values of ⟨ϕ(x)⟩ in the vacuum states of the theory.
From here on we will assume, for the field theories we consider, that the possible vacuum states
are invariant under translations and Lorentz transformations. Then, for each possible vacuum
state, the corresponding solution ϕcl (x) will be a constant. Furthermore, we know that Γ is an
extensive quantity. If T is the time extent of the region and V is its three dimensional volume,
we can write
Γ[ϕcl ] = −(V T ) · Veff (ϕcl ). (25.151)
The coefficient Veff is called the effective potential. The condition that Γ[ϕcl] has an extremum
then reduces to the simple equation
\[ \frac{\partial}{\partial\phi_{cl}}V_{\text{eff}}(\phi_{cl}) = 0. \tag{25.152} \]
A system with spontaneously broken symmetry will have several minima of Veff, all with
the same energy by virtue of the symmetry. The choice of one among these vacua is the
spontaneous symmetry breaking.

25.9.2 Calculation of the effective action


Decompose the Lagrangian density into a piece depending on renormalized parameters and
one containing the counterterms,
L = Lr + Lct . (25.153)
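Define J_r and J_ct by
\[ \left.\frac{\delta S_r}{\delta\phi}\right|_{\phi=\phi_{cl}} + J_r(x) = 0, \qquad J(x) = J_r(x) + J_{ct}(x), \qquad \text{where } S_r \equiv \int d^4x\,\mathcal{L}_r. \tag{25.154} \]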

It follows that
\[ e^{-iE[J]} = \int\mathcal{D}\phi\; e^{i\int d^4x\,(\mathcal{L}_r + J_r\phi)}\, e^{i\int d^4x\,(\mathcal{L}_{ct} + J_{ct}\phi)}. \tag{25.155} \]

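Define η ≡ ϕ − ϕcl. We have
\[
\begin{aligned}
\int d^4x\,(\mathcal{L}_r + J_r\phi) ={}& \int d^4x\,(\mathcal{L}_r[\phi_{cl}] + J_r\phi_{cl}) + \int d^4x\,\eta(x)\left(\left.\frac{\delta S_r}{\delta\phi}\right|_{\phi=\phi_{cl}} + J_r\right) \\
&+ \frac{1}{2}\int d^4x\,d^4y\,\eta(x)\eta(y)\left.\frac{\delta^2 S_r}{\delta\phi(x)\delta\phi(y)}\right|_{\phi=\phi_{cl}} \\
&+ \frac{1}{3!}\int d^4x\,d^4y\,d^4z\,\eta(x)\eta(y)\eta(z)\left.\frac{\delta^3 S_r}{\delta\phi(x)\delta\phi(y)\delta\phi(z)}\right|_{\phi=\phi_{cl}} + \cdots
\end{aligned}
\tag{25.156}
\]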

The term linear in η vanishes by definition of Jr . Put back the effects of the counterterm
Lagrangian, writing it as

(Lct [ϕcl ] + Jct ϕcl ) + (Lct [ϕcl + η] − Lct [ϕcl ] + Jct η). (25.157)

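Define
\[ \mathcal{L}_\eta \equiv \left[\frac{1}{3!}\int d^4x\,d^4y\,d^4z\,\eta(x)\eta(y)\eta(z)\frac{\delta^3 S_r}{\delta\phi(x)\delta\phi(y)\delta\phi(z)} + \cdots\right] + \left(\mathcal{L}_{ct}[\phi_{cl}+\eta] - \mathcal{L}_{ct}[\phi_{cl}] + J_{ct}\eta\right). \tag{25.158} \]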
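We have
\[ e^{-iE[J]} = Z_1\,\exp\left[i\int\mathcal{L}_\eta\!\left(\frac{1}{i}\frac{\delta}{\delta I}\right)\right]\int\mathcal{D}\eta\,\exp\left[i\int\left(\frac{1}{2}\eta\frac{\delta^2 S_r}{\delta\phi\,\delta\phi}\eta + I\eta\right)\right]\Bigg|_{I=0}, \tag{25.159} \]
where
\[ Z_1 \equiv \exp\left[i\int d^4x\,\left(\mathcal{L}_r[\phi_{cl}] + J_r\phi_{cl} + \mathcal{L}_{ct}[\phi_{cl}] + J_{ct}\phi_{cl}\right)\right]. \tag{25.160} \]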

Define the Feynman propagator D_F as
\[ D_F \equiv i\left(\frac{\delta^2 S_r}{\delta\phi\,\delta\phi}\right)^{-1}. \tag{25.161} \]
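We have
\[ Z[J] = e^{-iE[J]} = Z_1 Z_2\,\exp\left[i\int\mathcal{L}_\eta\!\left(\frac{1}{i}\frac{\delta}{\delta I}\right)\right]\exp\left[-\frac{1}{2}\int I\,D_F\,I\right]\Bigg|_{I=0}, \tag{25.162} \]
where
\[ Z_2 \equiv \int\mathcal{D}\eta\,\exp\left[\frac{i}{2}\int\eta\frac{\delta^2 S_r}{\delta\phi\,\delta\phi}\eta\right]. \tag{25.163} \]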

A perturbative expansion for −iE[J] can be obtained using connected Feynman diagrams as
\[ -iE[J] = i\int d^4x\,\left(\mathcal{L}_r[\phi_{cl}] + J_r\phi_{cl} + \mathcal{L}_{ct}[\phi_{cl}] + J_{ct}\phi_{cl}\right) + \log(Z_2) + (\text{connected diagrams}). \tag{25.164} \]

Therefore, the effective action is
\[ \Gamma[\phi_{cl}] = \int d^4x\,\mathcal{L}_r[\phi_{cl}] - i\log(Z_2) - i\,(\text{connected diagrams}) + \int d^4x\,\mathcal{L}_{ct}[\phi_{cl}]. \tag{25.165} \]

Notice that there are no terms remaining that depend explicitly on J; thus, Γ is expressed as a
function of ϕcl , as it should be. The Feynman diagrams contributing to Γ[ϕcl ] have no external
lines, and the simplest ones turn out to have two loops. The lowest-order quantum correction
to Γ is given by the functional determinant Z2 . The last term provides a set of counterterms
that can be used to satisfy the renormalization conditions on Γ and, in the process, to cancel
divergences that appear in the evaluation of the functional determinant and the diagrams. The
renormalization conditions will determine all of the counterterms in Lct .
The formalism we have constructed contains a new counterterm Jct , whose value is deter-
mined by ⟨η⟩ = 0. Our adjustment of Jct to keep ⟨η⟩ = 0 means that the sum of all connected
diagrams with an external line is zero. Consider now that same infinite set of diagrams, but
replace the external line in each of them with some other subdiagram. Here is the point: no
matter what this replacement subdiagram is, the sum of all these diagrams is still zero. There-
fore, we need not bother to compute any of them. The rule is this: ignore any diagram that
falls into two parts when a single line is cut. All of these diagrams (known as tadpoles) are
canceled by the Jct counterterm, no matter what subdiagram they are attached to.

25.9.3 The effective action as a generating functional


E[J] is the generating functional of connected correlation functions,
\[ \frac{\delta^n E[J]}{\delta J(x_1)\cdots\delta J(x_n)} = i^{\,n+1}\langle\phi(x_1)\cdots\phi(x_n)\rangle_{\text{conn}}. \tag{25.166} \]
The effective action Γ[ϕcl] is the generating functional of 1PI correlation functions,
\[ \frac{\delta^2\Gamma[\phi_{cl}]}{\delta\phi_{cl}(x)\,\delta\phi_{cl}(y)} = iD^{-1}(x,y), \qquad \text{where } D(x,y) = \langle\phi(x)\phi(y)\rangle_{\text{conn}}, \tag{25.167} \]
\[ \frac{\delta^n\Gamma[\phi_{cl}]}{\delta\phi_{cl}(x_1)\cdots\delta\phi_{cl}(x_n)} = -i\langle\phi(x_1)\cdots\phi(x_n)\rangle_{\text{1PI}}, \qquad \text{where } n \ge 3. \tag{25.168} \]
A detailed proof can be found in section 10.2 of An introduction to quantum field theory
(M.E.Peskin & D.V.Schroeder).
As a result, the effective action can also be defined constructively as
\[
\begin{aligned}
\Gamma[\phi] \equiv{}& \Gamma[\phi_{cl,0}] + \frac{1}{2}\int\frac{d^dk}{(2\pi)^d}\,\tilde{\eta}(-k)\left(-k^2 - m^2 - M^2(k^2)\right)\tilde{\eta}(k) \\
&+ \sum_{n\ge 3}\frac{1}{n!}\int\frac{d^dk_1}{(2\pi)^d}\cdots\frac{d^dk_n}{(2\pi)^d}\,(2\pi)^d\delta(k_1+\cdots+k_n)\,V_n(k_1,\cdots,k_n)\,\tilde{\eta}(k_1)\cdots\tilde{\eta}(k_n),
\end{aligned}
\]
where η̃(k) = ∫ d^d x e^{−ikx} η(x), η = ϕ − ϕcl,0, and iV_n(k_1, · · · , k_n) equals the value of the 1PI
Feynman diagrams in momentum space. The effective action has the property that the tree-
level Feynman diagrams it generates give the complete scattering amplitude of the original
theory. A detailed discussion is provided by section 21 of Quantum field theory (M. Srednicki).
The effective action contains the complete set of physical predictions of the quantum field theory.
The vacuum state of the field theory is identified as the minimum of the effective potential. The
location of the minimum determines whether the symmetries of the Lagrangian are preserved
or spontaneously broken. The second derivative of Γ is the inverse propagator. The poles of the
propagator, or the zeros of the inverse propagator, give the values of the particle masses. The
higher derivatives of Γ are the one-particle-irreducible amplitudes. These can be connected by
full propagators and joined together to construct four- and higher-point connected amplitudes,
which give the S-matrix elements. Thus, from the knowledge of Γ, we can reconstruct the
qualitative behavior of the quantum field theory, its pattern of symmetry-breaking, and then
the quantitative details of its particles and their interactions.

25.9.4 Renormalization and symmetry


Consider first the computation of the effective potential for constant classical fields, in a field
theory with an arbitrary number of fields ϕ^i. The effective potential has mass dimension 4, so
we expect that Veff(ϕcl) will have divergent terms up to Λ^4. To understand these divergences,
expand Veff(ϕcl) in a Taylor series:
\[ V_{\text{eff}}(\phi_{cl}) = A_0 + A_2^{ij}\phi_{cl}^i\phi_{cl}^j + A_4^{ijkl}\phi_{cl}^i\phi_{cl}^j\phi_{cl}^k\phi_{cl}^l + \cdots \tag{25.169} \]

In theories without a symmetry of ϕ → −ϕ, there might also be terms linear and cubic in ϕ^i;
we omit these for simplicity. The coefficients A_0, A_2, A_4 have mass dimension, respectively,
4, 2, and 0; thus we expect them to contain Λ^4, Λ^2, and log Λ divergences, respectively. The
power-counting analysis predicts that all higher terms in the Taylor series expansion should
be finite.
The constant term A0 is independent of ϕcl ; it has no physical significance. However, the di-
vergences in A2 and A4 appear in physical quantities, since these coefficients enter the inverse
propagator and the irreducible four-point function and therefore appear in the computation of
S-matrix elements. There is one further coefficient in the effective action that has non-negative
mass dimension by power counting; this is the coefficient of the term quadratic in ∂µ ϕcl , which
appears when the effective action is evaluated for a non-constant background field:
\[ \Delta\Gamma[\phi_{cl}] = \int d^4x\,B_2^{ij}\,\partial_\mu\phi_{cl}^i\,\partial^\mu\phi_{cl}^j. \tag{25.170} \]

All other coefficients in the Taylor expansion of the effective action in powers of ϕcl are finite
by power counting.
We can now argue that the counterterms of the original Lagrangian suffice to remove the di-
vergences that might appear in the computation of Γ[ϕcl ]. The argument proceeds in two steps.
We first use the BPHZ theorem to argue that the divergences of Green’s functions can be re-
moved by adjusting a set of counterterms corresponding to the possible operators that can be

added to the Lagrangian with coefficients of mass dimension greater than or equal to zero.
The coefficients of these counterterms are in 1-to-1 correspondence with the coefficients A2 ,
A4 , and B2 of the effective action. Next, we use the fact that the effective action is manifestly
invariant to the original symmetry group of the model. This is true even if the vacuum state
of the model has spontaneous symmetry breaking, since the method we presented for com-
puting the effective action is manifestly invariant to the original symmetry of the Lagrangian.
Combining these two results, we conclude that the effective action can always be made finite by
adjusting the set of counterterms that are invariant to the original symmetry of the theory, even
if this symmetry is spontaneously broken. By using the results of previous subsection, which
explain how to construct the Green’s functions of the theory from the functional derivatives
of the effective action, this conclusion of renormalizability extends to all the Green’s functions
of the theory.

25.9.5 Goldstone’s theorem


Theorem 25.2 Goldstone’s theorem

Goldstone’s theorem examines a generic continuous symmetry which is spontaneously
broken; i.e., its currents are conserved, but the ground state is not invariant under the
action of the corresponding charges. Then, necessarily, new massless (or light, if the
symmetry is not exact) scalar particles appear in the spectrum of possible excitations.
There is one scalar particle – called a Nambu-Goldstone boson – for each generator of
the symmetry that is broken, i.e., that does not preserve the ground state.

Proof: A general continuous symmetry transformation has the form

ϕa → ϕa + α∆a (ϕ), (25.171)

where α is an infinitesimal parameter and ∆^a is some function of all the ϕ’s. Specialize to constant
fields; then the derivative terms in L vanish and the potential alone must be invariant. This condition
can be written as
\[ V(\phi^a) = V(\phi^a + \alpha\Delta^a(\phi)) \qquad\text{or}\qquad \Delta^a(\phi)\frac{\partial}{\partial\phi^a}V(\phi) = 0. \tag{25.172} \]
The effective potential Veff encapsulates the full solution to the theory, including all orders of quantum
corrections. At the same time, it satisfies the general properties of the classical potential: It is invariant
to the symmetries of the theory, and its minimum gives the vacuum expectation value of ϕcl. Thus
\[ \Delta^a(\phi)\frac{\partial}{\partial\phi^a}V_{\text{eff}}(\phi) = 0. \tag{25.173} \]

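Now differentiate with respect to ϕ^b, and set ϕ = ϕcl:
\[ 0 = \left(\frac{\partial\Delta^a}{\partial\phi^b}\right)_{\phi_{cl}}\left(\frac{\partial V_{\text{eff}}}{\partial\phi^a}\right)_{\phi_{cl}} + \Delta^a(\phi_{cl})\left(\frac{\partial^2 V_{\text{eff}}}{\partial\phi^a\,\partial\phi^b}\right)_{\phi_{cl}}. \tag{25.174} \]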

The first term vanishes since ϕcl is a minimum of Veff, so the second term must also vanish. If the
transformation leaves ϕcl unchanged (i.e., if the symmetry is respected by the ground state), then
∆^a(ϕcl) = 0 and this relation is trivial. A spontaneously broken symmetry is precisely one for which
∆^a(ϕcl) ≠ 0; in this case ∆^a(ϕcl) is an eigenvector of the matrix ∂²Veff/∂ϕ^a∂ϕ^b with eigenvalue zero.

The second functional derivative of the effective action is the inverse propagator,
\[ i\tilde{D}^{-1}_{ij}(p^2) = \int d^4x\,e^{-ip(x-y)}\left.\frac{\delta^2\Gamma}{\delta\phi^i\,\delta\phi^j}(x,y)\right|_{\phi=\phi_{cl}}. \tag{25.175} \]
A particle of mass 0 corresponds to a zero eigenvalue of this matrix at p² = 0. Now set p = 0.
The condition becomes that (δ²Γ/δϕ^i δϕ^j)(x, y), integrated over x, has a zero eigenvalue, which is
equivalent to ∂²Veff/∂ϕ^i_cl ∂ϕ^j_cl having a zero eigenvalue, as was shown above. This completes
the proof of Goldstone’s theorem. □

25.10 Linear sigma model


25.10.1 Symmetry breaking
The Lagrangian density of the linear sigma model is
\[ \mathcal{L} = -\frac{1}{2}\partial_\mu\phi^i\,\partial^\mu\phi^i + \frac{1}{2}\mu^2(\phi^i)^2 - \frac{\lambda}{4}\left[(\phi^i)^2\right]^2, \qquad i = 1,\cdots,N, \tag{25.176} \]
which is invariant under the rotation
\[ \phi^i \to R^{ij}\phi^j \tag{25.177} \]
for any N × N orthogonal matrix R. In the classical theory, the lowest-energy
configuration is a constant field ϕ_0^i, whose value is chosen to minimize the potential
\[ V = -\frac{1}{2}\mu^2(\phi^i)^2 + \frac{\lambda}{4}\left[(\phi^i)^2\right]^2. \tag{25.178} \]
The potential is minimized for any ϕ_0^i satisfying
\[ (\phi_0^i)^2 = \frac{\mu^2}{\lambda}. \tag{25.179} \]
This condition determines only the length of the vector ϕ_0^i; its direction is arbitrary. It is
conventional to choose coordinates so that ϕ_0^i points in the Nth direction,
\[ \phi_0^i = (0, 0, \cdots, 0, v), \qquad v \equiv \frac{\mu}{\sqrt{\lambda}}. \tag{25.180} \]
We can now define a set of shifted fields by writing
ϕi = (π k , v + σ), k = 1, · · · , N − 1. (25.181)
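It is now straightforward to rewrite the Lagrangian in terms of the π and σ fields. The result is
\[
\begin{aligned}
\mathcal{L} ={}& -\frac{1}{2}(\partial_\mu\pi^k)^2 - \frac{1}{2}(\partial_\mu\sigma)^2 - \frac{1}{2}(2\mu^2)\sigma^2 \\
&- \sqrt{\lambda}\,\mu\,\sigma^3 - \sqrt{\lambda}\,\mu\,(\pi^k)^2\sigma - \frac{\lambda}{4}\sigma^4 - \frac{\lambda}{2}(\pi^k)^2\sigma^2 - \frac{\lambda}{4}\left[(\pi^k)^2\right]^2.
\end{aligned}
\tag{25.182}
\]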
We obtain a massive σ field and also a set of N − 1 massless π fields. The original O(N )
symmetry is hidden when we choose a specific ϕi0 for vacuum state, leaving only the subgroup
O(N − 1), which rotates the π fields among themselves.
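
The shifted form of the potential can be cross-checked numerically. The sketch below is an illustration rather than part of the original notes: it evaluates the potential 25.178 at ϕ = (π, v + σ) and compares it with the non-derivative terms read off from 25.182, restoring the field-independent constant −µ⁴/4λ that the shifted Lagrangian omits. The function names and the random test points are assumptions made for the check.

```python
import math
import random


def v_original(pi, sigma, mu, lam):
    """Potential (25.178) evaluated at phi = (pi^1, ..., pi^(N-1), v + sigma)."""
    v = mu / math.sqrt(lam)
    phi_sq = sum(p * p for p in pi) + (v + sigma) ** 2
    return -0.5 * mu ** 2 * phi_sq + 0.25 * lam * phi_sq ** 2


def v_shifted(pi, sigma, mu, lam):
    """Minus the non-derivative terms of (25.182)."""
    pi_sq = sum(p * p for p in pi)
    g = math.sqrt(lam) * mu  # cubic coupling sqrt(lambda) * mu
    return (0.5 * (2.0 * mu ** 2) * sigma ** 2   # sigma mass term, m_sigma^2 = 2 mu^2
            + g * sigma ** 3
            + g * pi_sq * sigma
            + 0.25 * lam * sigma ** 4
            + 0.5 * lam * pi_sq * sigma ** 2
            + 0.25 * lam * pi_sq ** 2)           # no pi mass term: pions are massless


def vacuum_energy(mu, lam):
    """Value of the potential at its minimum, dropped in (25.182)."""
    return -mu ** 4 / (4.0 * lam)


random.seed(0)
for _ in range(3):
    mu, lam = random.uniform(0.5, 2.0), random.uniform(0.1, 1.5)
    pi = [random.uniform(-1.0, 1.0) for _ in range(3)]  # N - 1 = 3 pion fields
    sigma = random.uniform(-1.0, 1.0)
    print(v_original(pi, sigma, mu, lam),
          v_shifted(pi, sigma, mu, lam) + vacuum_energy(mu, lam))
```

The two printed numbers agree at every sampled point, which confirms that no term quadratic in π survives the shift.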

25.10.2 Renormalization
From this expression of the Lagrangian written in terms of shifted fields, we can read off the
Feynman rules for the linear sigma model, as shown in Figure 25.17. Then we can compute
tree-level amplitudes without difficulty.

Figure 25.17: Feynman rules for the linear sigma model.

Diagrams with loops, however, will often diverge. For the amplitude with Ne external legs, the
superficial degree of divergence is
D = 4 − Ne . (25.183)
The linear sigma model has eight different superficially divergent amplitudes and several of
these have D > 0 and therefore can contain more than one infinite constant, as shown in
Figure 25.18.

Figure 25.18: Divergent amplitudes in the linear sigma model.

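Yet we have only three counterterms to absorb these infinities,
\[ \mathcal{L}_{ct} = -\frac{1}{2}\delta_Z\,\partial_\mu\phi^i\,\partial^\mu\phi^i - \frac{1}{2}\delta_\mu(\phi^i)^2 - \frac{\delta_\lambda}{4}\left[(\phi^i)^2\right]^2. \tag{25.184} \]
Written in terms of σ and π fields, it takes the form
\[
\begin{aligned}
\mathcal{L}_{ct} ={}& -\frac{\delta_Z}{2}(\partial_\mu\pi^k)^2 - \frac{1}{2}(\delta_\mu + \delta_\lambda v^2)(\pi^k)^2 - \frac{\delta_Z}{2}(\partial_\mu\sigma)^2 - \frac{1}{2}(\delta_\mu + 3\delta_\lambda v^2)\sigma^2 \\
&- (\delta_\mu v + \delta_\lambda v^3)\sigma - \delta_\lambda v\,\sigma(\pi^k)^2 - \delta_\lambda v\,\sigma^3 - \frac{\delta_\lambda}{4}\left[(\pi^k)^2\right]^2 - \frac{\delta_\lambda}{2}\sigma^2(\pi^k)^2 - \frac{\delta_\lambda}{4}\sigma^4.
\end{aligned}
\tag{25.185}
\]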

Figure 25.19: Feynman rules for counterterm vertices in the linear sigma model.

The Feynman rules associated with these counterterms are shown in Figure 25.19.
Three renormalization parameters, δZ , δµ and δλ , can be adjusted to satisfy the renormalization
conditions shown in Figure 25.20.


Figure 25.20: Renormalization conditions for linear sigma model.

Conclusions from subsection 25.9.4 make sure that these three parameters are able to absorb
all the infinities arising in the divergent amplitudes shown in Figure 25.18. No new symmetry-
breaking terms are needed to make this theory renormalizable. The statement is also verified
up to one-loop level in section 11.2 of An introduction to quantum field theory (M.E.Peskin &
D.V.Schroeder). As an aside, the calculation also shows that π particles remain massless after
one-loop corrections.

25.10.3 Effective action


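The effective action can be calculated using equation 25.165. For the linear sigma model, we have
\[ \mathcal{L}_r = -\frac{1}{2}\partial_\mu\phi^i\,\partial^\mu\phi^i + \frac{1}{2}\mu^2(\phi^i)^2 - \frac{\lambda}{4}\left[(\phi^i)^2\right]^2. \tag{25.186} \]
It follows that
\[ \left.\frac{\delta^2 S_r}{\delta\phi^i(x)\,\delta\phi^j(y)}\right|_{\phi=\phi_{cl}} = \left[\partial^2\delta^{ij} + \mu^2\delta^{ij} - \lambda\left(\phi_{cl}^k\phi_{cl}^k\delta^{ij} + 2\phi_{cl}^i\phi_{cl}^j\right)\right]\delta(x-y). \tag{25.187} \]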

Let us orient the coordinates so that ϕcl points in the N th direction,

ϕicl = (0, · · · , ϕcl ). (25.188)

Then the operator 25.187 is just equal to the Klein-Gordon operator (∂² − m_i²), where
\[ m_i^2 = \begin{cases} \lambda\phi_{cl}^2 - \mu^2, & \text{acting on } \eta^1,\cdots,\eta^{N-1}, \\ 3\lambda\phi_{cl}^2 - \mu^2, & \text{acting on } \eta^N. \end{cases} \tag{25.189} \]

As a result, Z_2 is given by
\[ Z_2 = \prod_{i=1}^{N}Z_i = \prod_{i=1}^{N}\int\mathcal{D}\eta\,\exp\left[\frac{i}{2}\int\eta\left(\partial^2 - m_i^2\right)\eta\right]. \tag{25.190} \]

Treating −m_i²η²/2 as a perturbation, we have
\[ Z_i \propto \exp\left[-\frac{i m_i^2}{2}\int d^4x\left(\frac{1}{i}\frac{\delta}{\delta I(x)}\right)^2\right]\exp\left[-\frac{1}{2}\int d^4x\,d^4y\,I(x)D_F(x-y)I(y)\right]\Bigg|_{I=0}, \qquad D_F(x-y) = \int\frac{d^4p}{(2\pi)^4}\frac{-i}{p^2}e^{ip(x-y)}. \tag{25.191} \]
The proportionality factor can be dropped, as it is independent of ϕcl and contributes only a
constant shift to the effective potential. The Feynman rules to evaluate 25.191 are:

• a line from x to y is associated with DF (x − y);


• a vertex joining two lines at x is associated with −i m_i² ∫ d⁴x.

And we have
\[ \log Z_i = \sum_I C_I, \tag{25.192} \]
where C_I represents a connected diagram without external sources, as shown in Figure 25.21.

Figure 25.21: Connected Feynman diagram without external source.

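The value of diagram C_n is
\[ C_n = \frac{1}{2n}\int\prod_{k=1}^{n}\left[\frac{d^4p_k\,d^4x_k}{(2\pi)^4}\left(-\frac{m_i^2}{p_k^2}\right)e^{ip_k(x_k - x_{k+1})}\right] = \frac{1}{2n}\,\delta(0)\int d^4p\left(-\frac{m_i^2}{p^2}\right)^n, \tag{25.193} \]
with x_{n+1} ≡ x_1 and (2π)⁴δ(0) = VT, leading to
\[ \log Z_i = -\frac{1}{2}VT\int\frac{d^4p}{(2\pi)^4}\sum_{n=1}^{\infty}\left[-\frac{1}{n}\left(-\frac{m_i^2}{p^2}\right)^n\right] = -\frac{1}{2}VT\int\frac{d^4p}{(2\pi)^4}\log\left(1 + \frac{m_i^2}{p^2}\right). \tag{25.194} \]

By Wick rotation and dimensional regularization, we can work out that
\[ \log Z_i = \frac{i}{2}\frac{\Gamma(-\frac{d}{2})}{(4\pi)^{d/2}}\,(m_i^2)^{d/2}\,VT. \tag{25.195} \]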

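So up to one-loop corrections, the effective potential is
\[ V_{\text{eff}} = -\frac{1}{2}\mu^2\phi_{cl}^2 + \frac{\lambda}{4}\phi_{cl}^4 - \frac{1}{2}\frac{\Gamma(-\frac{d}{2})}{(4\pi)^{d/2}}\left[(N-1)\left(\lambda\phi_{cl}^2 - \mu^2\right)^{d/2} + \left(3\lambda\phi_{cl}^2 - \mu^2\right)^{d/2}\right] + \frac{1}{2}\delta_\mu\phi_{cl}^2 + \frac{1}{4}\delta_\lambda\phi_{cl}^4. \tag{25.196} \]
To make terms involving ϕcl finite, we must have
\[ \delta_\lambda = \frac{2\lambda^2(N+8)}{(4\pi)^2(4-d)} + \text{finite terms}, \qquad \delta_\mu = -\frac{2\lambda\mu^2(N+2)}{(4\pi)^2(4-d)} + \text{finite terms}. \tag{25.197} \]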
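
The pole cancellation behind 25.197 can be verified with exact rational arithmetic. The sketch below is an illustrative check under stated assumptions: it uses Γ(−d/2) ≈ 1/(4 − d) near d = 4, replaces (m²)^{d/2} by (m²)² in the pole term, and tracks only the coefficient of 1/((4π)²(4 − d)). The function name and sample values are hypothetical.

```python
from fractions import Fraction


def pole_coefficient(N, lam, mu2, phi2):
    """Coefficient of 1/((4 pi)^2 (4 - d)) in V_eff: one-loop term plus counterterms.

    mu2 stands for mu^2 and phi2 for phi_cl^2.
    """
    # One-loop term of (25.196) with Gamma(-d/2) -> 1/(4 - d) and (m^2)^(d/2) -> (m^2)^2.
    loop = -Fraction(1, 2) * ((N - 1) * (lam * phi2 - mu2) ** 2
                              + (3 * lam * phi2 - mu2) ** 2)
    # Divergent parts of the counterterms from (25.197), same overall factor stripped.
    delta_lam = 2 * lam ** 2 * (N + 8)
    delta_mu = -2 * lam * mu2 * (N + 2)
    counter = Fraction(1, 2) * delta_mu * phi2 + Fraction(1, 4) * delta_lam * phi2 ** 2
    return loop + counter


N, lam, mu2 = 4, Fraction(1, 3), Fraction(2)
for phi2 in (Fraction(0), Fraction(5, 7), Fraction(3)):
    # The result should not depend on phi2: only a field-independent constant remains.
    print(phi2, pole_coefficient(N, lam, mu2, phi2))
```

What remains is −Nµ⁴/2, a constant shift of the vacuum energy that carries no dependence on ϕcl.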

Functional determinants

Equation 25.190 can also be evaluated formally using functional determinants. Recall the
Gaussian integral
\[ \int_{-\infty}^{\infty}d^nx\,\exp\left(-\frac{i}{2}\sum_{i,j=1}^{n}A_{ij}x_ix_j\right) = \sqrt{\frac{(-2\pi i)^n}{\det A}}. \tag{25.198} \]
Formally we have
\[ \log Z_i = -\frac{1}{2}\log\det\left[\left(-\partial_x^2 + m_i^2\right)\delta(x-y)\right]. \tag{25.199} \]
Define

M (x, y) ≡ (−∂x2 + m2i )δ(x − y), (25.200a)


M0 (x, y) ≡ −∂x2 δ(x − y), (25.200b)
M1 (y, z) ≡ δ(y − z) + im2i DF (y − z). (25.200c)

It follows that
\[ M(x,z) = \int d^4y\,M_0(x,y)\,M_1(y,z). \tag{25.201} \]
Thus, we have
\[ \log\det M = \log\det M_0 + \log\det M_1 \to \log\det M_1. \tag{25.202} \]
The term log det M_0 is dropped because it is independent of ϕcl.

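Since M_1 = I − G, where I = δ(x − y) is the identity matrix and G = −i m_i² D_F, we can get
\[ \log\det M_1 = \operatorname{Tr}\log M_1 = \operatorname{Tr}\log(I - G) = -\sum_{n=1}^{\infty}\frac{1}{n}\operatorname{Tr}G^n, \tag{25.203} \]
where
\[ \operatorname{Tr}G^n = (-im_i^2)^n\int dx_1\cdots dx_n\,D_F(x_1 - x_2)\cdots D_F(x_n - x_1). \tag{25.204} \]
It can be verified that
\[ C_n = \frac{1}{2n}\operatorname{Tr}G^n. \tag{25.205} \]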
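
The trace-log expansion used above holds for any matrix whose powers shrink fast enough, so it can be illustrated on a finite matrix. The sketch below is a toy check only: the 2×2 complex matrix G is an arbitrary stand-in for the operator −i m_i² D_F, and the helper names are hypothetical.

```python
import cmath


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def log_det_one_minus(g):
    """log det(I - G) for a 2x2 matrix, from the explicit determinant."""
    det = (1 - g[0][0]) * (1 - g[1][1]) - g[0][1] * g[1][0]
    return cmath.log(det)


def minus_trace_series(g, n_max=80):
    """-sum_{n=1}^{n_max} Tr(G^n)/n, the truncated series of the trace-log expansion."""
    power = [[1, 0], [0, 1]]
    total = 0
    for n in range(1, n_max + 1):
        power = matmul(power, g)  # power now holds G^n
        total -= trace(power) / n
    return total


# Arbitrary small matrix (toy stand-in, not derived from the field theory).
G = [[0.10 + 0.05j, 0.02], [-0.03j, 0.07 - 0.04j]]
print(log_det_one_minus(G), minus_trace_series(G))
```

Multiplying the series by −1/2 reproduces the sum of the ring diagrams, each weighted by its symmetry factor 1/2n.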
2n

25.11 Optical theorem and unstable particles


25.11.1 Optical theorem
The optical theorem is a straightforward consequence of the unitarity of the S-matrix: S † S =
1. Inserting S = 1 + iT , we have

− i(T − T † ) = T † T. (25.206)
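
The operator identity can be checked on a small example. The sketch below assumes a two-channel toy S-matrix, parametrized as a generic SU(2) matrix; the function names and angles are illustrative and not taken from the text.

```python
import cmath
import math


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix stored as nested lists."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def unitary_s_matrix(theta, alpha, beta):
    """A generic SU(2) matrix, standing in for a two-channel S-matrix."""
    a = math.cos(theta) * cmath.exp(1j * alpha)
    b = math.sin(theta) * cmath.exp(1j * beta)
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def optical_theorem_residual(s):
    """Largest entry of -i(T - T^dagger) - T^dagger T, with T defined by S = 1 + iT."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    t = [[(s[i][j] - identity[i][j]) / 1j for j in range(2)] for i in range(2)]
    t_dag = dagger(t)
    lhs = [[-1j * (t[i][j] - t_dag[i][j]) for j in range(2)] for i in range(2)]
    rhs = matmul(t_dag, t)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


S = unitary_s_matrix(0.3, 0.8, -1.1)
print(optical_theorem_residual(S))  # vanishes up to rounding for a unitary S
```

For a matrix that is not unitary, such as S = 1/2, the residual is of order one, so the identity really is a statement about unitarity.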

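Using the fact that
\[ \langle f|iT|i\rangle = i\mathcal{T}(i\to f)\,(2\pi)^4\delta\Big(\sum p_f - \sum p_i\Big) \tag{25.207} \]
and
\[ \langle f|T^\dagger T|i\rangle = \sum_n\int\prod_{k=1}^{n}\widetilde{dq}_k\,\langle f|T^\dagger|\{q\}\rangle\langle\{q\}|T|i\rangle, \tag{25.208} \]
we can obtain
\[ -i\left[\mathcal{T}(i\to f) - \mathcal{T}^*(f\to i)\right] = \sum_n\int\prod_{k=1}^{n}\widetilde{dq}_k\,\mathcal{T}(i\to\{q\})\,\mathcal{T}^*(f\to\{q\})\,(2\pi)^4\delta\Big(\sum p_f - \sum_k q_k\Big). \tag{25.209} \]
Here 𝒯 denotes the amplitude with the momentum-conserving delta function stripped off.
Let us abbreviate this identity as
\[ -i\left[\mathcal{T}(i\to f) - \mathcal{T}^*(f\to i)\right] = \sum_m\int d\Pi_m\,\mathcal{T}(i\to m)\,\mathcal{T}^*(f\to m), \tag{25.210} \]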

where the sum runs over all possible sets of particles and i and f could be one-particle or
multi-particle asymptotic states. For the important special case of forward scattering, we can
set i = f to obtain a simpler identity,
\[ \operatorname{Im}\mathcal{T}(i\to i) = \frac{1}{2}\sum_m\int d\Pi_m\,|\mathcal{T}(i\to m)|^2. \tag{25.211} \]
Supplying the kinematic factors required to build a cross section, we obtain the standard form
of the optical theorem,
\[ \operatorname{Im}\mathcal{T}(k_1k_2\to k_1k_2) = 2E_{cm}\,p_{cm}\,\sigma_{tot}(k_1k_2\to\text{anything}), \tag{25.212} \]

where Ecm is the total center-of-mass energy and pcm is the momentum of either particle in the
center-of-mass frame. This equation relates the forward scattering amplitude to the total cross
section for production of all final states. Since the imaginary part of the forward scattering
amplitude gives the attenuation of the forward-going wave as the beam passes through the
target, it is natural that this quantity should be proportional to the probability of scattering.

25.11.2 Unstable Particles


The generalized optical theorem is true not only for S-matrix elements, but for any amplitudes
𝒯 that we can define in terms of Feynman diagrams. It has been proved to all orders
in perturbation theory using cutting rules by Cutkosky. A brief introduction can be found
in section 7.3 of An introduction to quantum field theory (M.E.Peskin & D.V.Schroeder). This
fact is extremely useful for dealing with unstable particles, which never appear in asymptotic
states.

The exact two-point function for a scalar particle has the form
\[ \frac{-i}{p^2 + m^2 + M^2(p^2)}. \tag{25.213} \]
We defined the quantity −iM²(p²) as the sum of all 1PI insertions into the boson propagator,
but we can equally well think of it as the sum of all amputated diagrams for 1-particle → 1-
particle scattering. In the OS renormalization scheme, the LSZ formula would imply
\[ \mathcal{T} = -M^2(p^2). \tag{25.214} \]

If the scalar boson is stable, there will be no possible final state that can contribute to the
right-hand side of equation 25.211 and so M 2 (p2 ) must be real. Renormalization condition
M 2 (−m2 ) = 0 can be realized by a real-valued m, which is the physical mass of the stable
particle. The pole of the propagator lies on the real p2 axis, below the multiparticle branch cut.

Often, however, a particle can decay into two or more lighter particles. In this case, M²(p²) will
acquire an imaginary part and the renormalization condition must be modified to Re M²(−m²) =
0. The pole in the propagator would be displaced from the real axis.

If this propagator appears in the s channel of a Feynman diagram, the cross section one
computes, in the vicinity of the pole, will have the form
\[ \sigma \propto \left|\frac{1}{s - m^2 - i\operatorname{Im}M^2(-s)}\right|^2, \qquad \text{where } s = -p^2,\ p = p_1 + p_2. \tag{25.215} \]

If Im M²(−m²) is small, so that the resonance is narrow, we can approximate Im M²(−s) as
Im M²(−m²) over the width of the resonance. In this case, the FWHM of the resonance curve
will be
\[ \Delta E = -\frac{\operatorname{Im}M^2(-m^2)}{E_p}, \qquad \text{where } E_p = \sqrt{(\mathbf{p}_1 + \mathbf{p}_2)^2 + m^2}. \tag{25.216} \]
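
The width formula can be tested numerically on the resonance curve itself. The sketch below is an illustration under stated assumptions: Im M² is treated as a constant (the narrow-resonance approximation), s = E² − P² with P the magnitude of the total three-momentum, E_p = √(P² + m²), and the half-maximum points are located by bisection. All names and numerical values are hypothetical.

```python
import math


def line_shape(E, P, m, im_M2):
    """|1 / (s - m^2 - i Im M^2)|^2 with s = E^2 - P^2 and constant Im M^2."""
    s = E * E - P * P
    return 1.0 / ((s - m * m) ** 2 + im_M2 ** 2)


def bisect(f, a, b, iterations=100):
    """Root of f on [a, b], assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(iterations):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if (fa > 0) == (fm > 0):
            a, fa = mid, fm  # root lies in the upper half
        else:
            b = mid          # root lies in the lower half
    return 0.5 * (a + b)


def numerical_fwhm(P, m, im_M2):
    """Full width at half maximum of the line shape as a function of E."""
    E_p = math.sqrt(P * P + m * m)            # peak position
    half = 0.5 * line_shape(E_p, P, m, im_M2)
    f = lambda E: line_shape(E, P, m, im_M2) - half
    span = 10.0 * abs(im_M2) / E_p            # generous bracket around the peak
    upper = bisect(f, E_p, E_p + span)
    lower = bisect(f, E_p - span, E_p)
    return upper - lower


P, m, im_M2 = 0.5, 1.0, -0.01
E_p = math.sqrt(P * P + m * m)
print(numerical_fwhm(P, m, im_M2), -im_M2 / E_p)
```

The two numbers agree up to corrections of relative order (Im M²/E_p²)², which is why the estimate is reliable only for a narrow resonance.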
With the optical theorem, the imaginary part of M²(p²) is given by
\[ \operatorname{Im}M^2(p^2) = -\operatorname{Im}\mathcal{T} = -\frac{1}{2}\sum_m\int d\Pi_m\,|\mathcal{T}(p\to m)|^2. \tag{25.217} \]
For p on-shell, we have
\[ \operatorname{Im}M^2(p^2 = -m^2) = -E_p\,\Gamma_{tot}(p\to\text{anything}), \tag{25.218} \]
where Γtot is the total decay rate of the intermediate particle. As a result, the width of the
resonance and the lifetime ∆τ = 1/Γtot of the intermediate particle are related by
\[ \Delta E\,\Delta\tau = 1. \tag{25.219} \]

We stress once again that our derivation of this equation applies only to the case of a long-
lived unstable particle, so that Γ ≪ m. For a broad resonance, the full energy dependence of
M 2 (p2 ) must be taken into account.

To get a more physical understanding of this result, recall that in non-relativistic quantum
mechanics, a metastable state with energy E_0 and angular momentum quantum number l
shows up as a resonance in the partial-wave scattering amplitude,
\[ f_l \sim \frac{1}{E - E_0 + i\Gamma/2}. \tag{25.220} \]
If we imagine convolving this amplitude with a wave packet ψ̃(E)e^{−iEt}, we will find a time
dependence
\[ \psi(t) \sim \int dE\,\frac{1}{E - E_0 + i\Gamma/2}\,\tilde{\psi}(E)\,e^{-iEt} \sim e^{-iE_0t - \Gamma t/2}. \tag{25.221} \]
Therefore |ψ(t)|² ∼ e^{−Γt}, and we identify Γ as the inverse lifetime of the metastable state.

25.12 Non-relativistic limit


25.12.1 Complex Klein-Gordon field
The Lagrangian density of free complex Klein-Gordon field is

L = −∂ µ Φ† ∂µ Φ − m2 Φ† Φ. (25.222)

The canonical momentum of the field operator is

π = Φ̇† . (25.223)

Canonical quantization requires that

[Φ(x, t), π(y, t)] = iδ(x − y). (25.224)

Given the equation of motion


(∂ 2 − m2 )Φ = 0, (25.225)
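field operators can be expanded as
\[ \Phi = \int\widetilde{dp}\,\left[b(p)e^{ipx} + c^\dagger(p)e^{-ipx}\right], \qquad \Phi^\dagger = \int\widetilde{dp}\,\left[b^\dagger(p)e^{-ipx} + c(p)e^{ipx}\right], \qquad \text{where } p^2 + m^2 = 0. \tag{25.226} \]
It follows that
\[ [b(p), b^\dagger(q)] = (2\pi)^3\,2\omega\,\delta^3(\mathbf{p} - \mathbf{q}), \qquad [c(p), c^\dagger(q)] = (2\pi)^3\,2\omega\,\delta^3(\mathbf{p} - \mathbf{q}). \tag{25.227} \]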

Working out the commutation relations between H, P and b, b† , c, c† , we can conclude that
b† (p) / c† (p) creates a b / c particle with momentum p, while b(p) / c(p) annihilates a b / c
particle with momentum p. They share the same mass m.

We notice that L is invariant under the transformation Φ → Φe^{iα}. Noether’s theorem implies that
the complex Klein-Gordon field has a conserved charge
\[ Q = i\int d^3x\,\left(\dot{\Phi}^\dagger\Phi - \Phi^\dagger\dot{\Phi}\right) = \int\widetilde{dp}\,\left[c^\dagger(p)c(p) - b^\dagger(p)b(p)\right] = N_c - N_b. \tag{25.228} \]
We would like to interpret the c-particle as the antiparticle of the b-particle. The number of
anti-particles minus the number of particles is a conserved quantity, i.e., particles and anti-particles
must be created and annihilated in pairs.

25.12.2 Non-relativistic limit


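The complex Klein-Gordon field can be decomposed as
\[ \Phi(x) = \frac{1}{\sqrt{2m}}\,e^{-imt}\,\psi(x). \tag{25.229} \]
In the non-relativistic limit |p| ≪ m, we have |ψ̇| ≪ m|ψ|, leading to
\[ \frac{\partial\Phi^\dagger}{\partial t}\frac{\partial\Phi}{\partial t} - m^2\Phi^\dagger\Phi \approx \frac{i}{2}\left(\psi^\dagger\frac{\partial\psi}{\partial t} - \frac{\partial\psi^\dagger}{\partial t}\psi\right). \tag{25.230} \]
Integrating by parts, the Lagrangian density of the complex Klein-Gordon field would become
\[ \mathcal{L} = \psi^\dagger\left(i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m}\right)\psi, \tag{25.231} \]
which is exactly the Schrödinger field in non-relativistic quantum field theory.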
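
At the level of single-particle energies, stripping off the phase e^{−imt} amounts to measuring energies relative to the rest mass. The short sketch below illustrates how well the Schrödinger dispersion p²/2m approximates √(p² + m²) − m when |p| ≪ m; the function names and sample momenta are illustrative assumptions.

```python
import math


def kinetic_energy_relativistic(p, m):
    """E - m for a free particle: what remains once the rest energy is removed."""
    return math.sqrt(p * p + m * m) - m


def kinetic_energy_schrodinger(p, m):
    """Dispersion relation of the Schrodinger field, p^2 / (2 m)."""
    return p * p / (2.0 * m)


m = 1.0
for p in (0.3, 0.1, 0.01):
    rel = kinetic_energy_relativistic(p, m)
    nr = kinetic_energy_schrodinger(p, m)
    # Relative error; it falls off roughly like (p/m)^2 / 4.
    print(p, rel, nr, abs(rel - nr) / nr)
```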


Chapter 26
Spinor Field

26.1 Representation of the Lorentz group


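Under a Lorentz transformation x′ = Λx, operator fields transform as
\[ U(\Lambda)^{-1}\phi_a(x)U(\Lambda) = S_a{}^{b}\,\phi_b(\Lambda^{-1}x), \qquad \text{where } S = \exp\left(\frac{i}{2}\theta_{\mu\nu}S^{\mu\nu}\right), \tag{26.1} \]
and the matrices S^{µν} satisfy
\[ [S^{\mu\nu}, S^{\rho\sigma}] = i\left(-\eta^{\nu\rho}S^{\mu\sigma} + \eta^{\sigma\mu}S^{\rho\nu} + \eta^{\mu\rho}S^{\nu\sigma} - \eta^{\sigma\nu}S^{\rho\mu}\right). \tag{26.2} \]
Define S_i ≡ ½ ϵ_{ijk} S^{jk}, K_i ≡ S^{i0}. It follows that
\[ [S_i, S_j] = i\epsilon_{ijk}S_k, \qquad [S_i, K_j] = i\epsilon_{ijk}K_k, \qquad [K_i, K_j] = -i\epsilon_{ijk}S_k. \tag{26.3} \]
Under an infinitesimal transformation, we have
\[ S_a{}^{b} = \delta_a{}^{b} - i\,\delta\theta_i\,(S_i)_a{}^{b} + i\,\delta\beta_i\,(K_i)_a{}^{b}. \tag{26.4} \]
We further define N_i^+ ≡ ½(S_i − iK_i) and N_i^− ≡ ½(S_i + iK_i). The commutation relations now
become
\[ [N_i^+, N_j^+] = i\epsilon_{ijk}N_k^+, \qquad [N_i^-, N_j^-] = i\epsilon_{ijk}N_k^-, \qquad [N_i^+, N_j^-] = 0. \tag{26.5} \]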

We see that we have two different SU(2) Lie algebras that are exchanged by hermitian con-
jugation. A representation of the SU(2) Lie algebra is specified by an integer or half integer;
we therefore conclude that a representation of the Lie algebra of the Lorentz group in four
spacetime dimensions is specified by two integers or half-integers n and n′ .

We will label these representations as (2n + 1, 2n′ + 1); the number of components of a rep-
resentation is then (2n + 1)(2n′ + 1). Different components within a representation can also
be labeled by their angular momentum representations. Since Si = Ni+ + Ni− , deducing the
allowed values of j given n and n′ becomes a standard problem in the addition of angular mo-
menta. The general result is that the allowed values of j are |n − n′ |, |n − n′ | + 1, · · · , n + n′ ,
and each of these values appears exactly once.
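The counting above can be made concrete with a short sketch that lists the spins $j$ contained in the representation $(2n + 1, 2n' + 1)$ and confirms that the multiplicities $2j + 1$ add up to $(2n + 1)(2n' + 1)$. The function names are illustrative only.

```python
from fractions import Fraction

def spin_content(n, n_prime):
    """Allowed j for the (2n+1, 2n'+1) representation: |n-n'|, ..., n+n'."""
    n, n_prime = Fraction(n), Fraction(n_prime)
    spins = []
    j = abs(n - n_prime)
    while j <= n + n_prime:
        spins.append(j)
        j += 1
    return spins

def dimension(n, n_prime):
    """Number of components, (2n+1)(2n'+1)."""
    return int((2 * Fraction(n) + 1) * (2 * Fraction(n_prime) + 1))

def dimensions_match(n, n_prime):
    """Check that sum over j of (2j+1) equals (2n+1)(2n'+1)."""
    return sum(2 * j + 1 for j in spin_content(n, n_prime)) == dimension(n, n_prime)

half = Fraction(1, 2)
print(spin_content(half, 0))     # (2, 1): a single spin-1/2
print(spin_content(half, half))  # (2, 2): j = 0 and j = 1, four components
```

For example, the $(2, 2)$ representation has four components and contains $j = 0$ and $j = 1$, which is the content of a four-vector.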

26.2 Spin-statistics theorem


Theorem 26.1 Spin-statistics theorem

States with identical particles of integer spin are symmetric under the interchange of
the particles, while states with identical particles of half-integer spin are antisymmet-
ric under the interchange of the particles. This is equivalent to the statement that the
creation and annihilation operators for integer spin particles satisfy canonical commutation relations, while creation and annihilation operators for half-integer spin particles
satisfy canonical anti-commutation relations. Particles quantized with canonical com-
mutation relations are called bosons, and satisfy Bose–Einstein statistics, and particles
quantized with canonical anti-commutation relations are called fermions, and satisfy
Fermi–Dirac statistics.

Roughly speaking, one way to interchange two particles is to rotate them around their midpoint by π. For a particle of spin s, this rotation introduces a phase factor of $e^{i\pi s}$. Thus, a two-particle state with identical particles, both of spin s, picks up a factor of $e^{i2\pi s}$. For half-integer s this gives a factor of −1; for integer s it gives a factor of +1. Therefore, the creation and annihilation operators for integer spin particles satisfy canonical commutation relations, while creation and annihilation operators for half-integer spin particles satisfy canonical anti-commutation relations. The detailed proof can be found in sections 12.1 and 12.2 of Quantum Field Theory and the Standard Model (Matthew D. Schwartz).
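The heuristic exchange factor $e^{i2\pi s}$ can be tabulated directly. The sketch below only evaluates this phase for integer and half-integer spins; it illustrates the rough argument above and is not a substitute for the proof. The function name is an assumption made for illustration.

```python
import cmath
from fractions import Fraction

def exchange_phase(spin):
    """Return exp(2*pi*i*s) as +1 or -1 for integer or half-integer spin s."""
    s = Fraction(spin)
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("spin must be a non-negative integer or half-integer")
    phase = cmath.exp(2j * cmath.pi * float(s))
    return round(phase.real)

for s in (0, Fraction(1, 2), 1, Fraction(3, 2), 2):
    kind = "boson" if exchange_phase(s) == 1 else "fermion"
    print(s, exchange_phase(s), kind)
```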

26.3 Spinor field


Consider a left-handed spinor field $\psi_a(x)$, also known as a left-handed Weyl field, which is in the (2, 1) representation of the Lie algebra of the Lorentz group. Here the index a is a left-handed spinor index that takes on two possible values. Under a Lorentz transformation, we have
\[ U(\Lambda)^{-1}\psi_a(x)U(\Lambda) = L_a{}^b(\Lambda)\,\psi_b(\Lambda^{-1}x), \tag{26.6} \]
where
\[ L = \exp\left( \frac{i}{2}\theta_{\mu\nu}S_L^{\mu\nu} \right), \qquad S_L^{ij} = \epsilon^{ijk}(N_k^+ + N_k^-)_{2,1} = \frac{1}{2}\epsilon^{ijk}\sigma_k, \qquad S_L^{k0} = i(N_k^+ - N_k^-)_{2,1} = \frac{1}{2}i\sigma_k, \tag{26.7} \]
and the $\sigma_k$ are the Pauli matrices. Explicitly, the transformation matrix can be written as
\[ L = \exp\left( -\frac{i}{2}\theta_i\sigma_i - \frac{1}{2}\eta_i\sigma_i \right). \tag{26.8} \]

Similarly, a right-handed spinor field is in the (1, 2) representation of the Lie algebra of the Lorentz group, where
\[ S_R^{ij} = \epsilon^{ijk}(N_k^+ + N_k^-)_{1,2} = \frac{1}{2}\epsilon^{ijk}\sigma_k, \qquad S_R^{k0} = i(N_k^+ - N_k^-)_{1,2} = -\frac{1}{2}i\sigma_k. \tag{26.9} \]

The transformation matrix is given by
\[ R = \exp\left( -\frac{i}{2}\theta_i\sigma_i + \frac{1}{2}\eta_i\sigma_i \right). \tag{26.10} \]

The hermitian conjugate of the left-handed spinor field also furnishes a representation of the Lorentz group. We will distinguish the indices of the conjugate field from those of the original field by putting dots over them. Thus, we write
\[ [\psi_a(x)]^\dagger = \psi^\dagger_{\dot a}(x). \tag{26.11} \]

Under a Lorentz transformation, we have
\[ U(\Lambda)^{-1}\psi^\dagger_{\dot a}(x)U(\Lambda) = (L^*)_{\dot a}{}^{\dot b}\,\psi^\dagger_{\dot b}(\Lambda^{-1}x). \tag{26.12} \]

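Define
\[ \epsilon_{ab} \equiv \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{26.13} \]
Using the fact that $\det L = 1$, we have
\[ L_a{}^c L_b{}^d\, \epsilon_{cd} = \epsilon_{ab}, \tag{26.14} \]
so that $\epsilon_{ab}$ is an invariant symbol of the Lorentz group. The inverse of $\epsilon_{ab}$ is denoted by $\epsilon^{ab}$. We can use $\epsilon^{ab}$ and $\epsilon_{ab}$ to raise and lower left-handed spinor indices,
\[ \psi^a(x) \equiv \epsilon^{ab}\psi_b(x), \qquad \psi_a(x) = \epsilon_{ab}\psi^b(x). \tag{26.15} \]

We also notice the minus sign that appears when we move a contracted pair of indices,
\[ \psi^a\chi_a = -\psi_a\chi^a. \tag{26.16} \]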

It can be verified that

Lab Lac = −δbc , Lac Lbd ϵcd = ϵab , U (Λ)−1 ψ a (x)U (Λ) = −Lab (Λ)ψ b (Λ−1 x). (26.17)

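For the conjugate field, there is also an invariant symbol $\epsilon_{\dot a\dot b}$, which is numerically equal to $\epsilon_{ab}$. We can use $\epsilon^{\dot a\dot b}$ and $\epsilon_{\dot a\dot b}$ to raise and lower conjugate spinor indices.
Using the fact that $\sigma_2\sigma_i^*\sigma_2 = -\sigma_i$ and $\epsilon^{\dot a\dot b} = i(\sigma_2)^{\dot a\dot b}$, we can show that
\[ R^{\dot a}{}_{\dot b} \equiv -(L^*)^{\dot a}{}_{\dot b} = \epsilon^{\dot a\dot c}(L^*)_{\dot c}{}^{\dot d}\epsilon_{\dot d\dot b} = \left[ \exp\left( -\frac{i}{2}\theta_i\sigma_i + \frac{1}{2}\eta_i\sigma_i \right) \right]^{\dot a}{}_{\dot b}. \tag{26.18} \]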

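Since
\[ U(\Lambda)^{-1}\psi^{\dagger\dot a}(x)U(\Lambda) = R^{\dot a}{}_{\dot b}\,\psi^{\dagger\dot b}(\Lambda^{-1}x), \tag{26.19} \]
the conjugate field $\psi^{\dagger\dot a}$ is actually a right-handed spinor field.
Define
\[ \sigma^\mu_{a\dot a} \equiv (I, \sigma_1, \sigma_2, \sigma_3). \tag{26.20} \]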

It is an invariant symbol in the product representation (2, 1) ⊗ (1, 2) ⊗ (2, 2). The properties of invariant symbols can be used to derive the following equations.

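Proposition 26.1

1.
\[ \sigma^\mu_{a\dot a}\sigma_{\mu b\dot b} = -2\epsilon_{ab}\epsilon_{\dot a\dot b}, \qquad \epsilon^{ab}\epsilon^{\dot a\dot b}\sigma^\mu_{a\dot a}\sigma^\nu_{b\dot b} = -2\eta^{\mu\nu}. \tag{26.21} \]
2.
\[ (S_L^{\mu\nu})_{ab} = (S_L^{\mu\nu})_{ba}, \qquad (S_R^{\mu\nu})_{\dot a\dot b} = (S_R^{\mu\nu})_{\dot b\dot a}. \tag{26.22} \]
3. Define
\[ \bar\sigma^{\mu\dot a a} \equiv \epsilon^{ab}\epsilon^{\dot a\dot b}\sigma^\mu_{b\dot b}. \tag{26.23} \]

Numerically, we have
\[ \bar\sigma^{\mu\dot a a} = (I, -\sigma_1, -\sigma_2, -\sigma_3). \tag{26.24} \]

It can be shown that
\[ (S_L^{\mu\nu})_a{}^b = \frac{i}{4}(\sigma^\mu\bar\sigma^\nu - \sigma^\nu\bar\sigma^\mu)_a{}^b, \qquad (S_R^{\mu\nu})^{\dot a}{}_{\dot b} = \frac{i}{4}(\bar\sigma^\mu\sigma^\nu - \bar\sigma^\nu\sigma^\mu)^{\dot a}{}_{\dot b}. \]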

We adopt the following convention: a missing pair of contracted, undotted indices is understood to be written as ${}^c{}_c$, and a missing pair of contracted, dotted indices is understood to be written as ${}_{\dot c}{}^{\dot c}$. Thus, if χ and ψ are two left-handed Weyl fields, we have
\[ \chi\psi = \chi^a\psi_a = \chi_2\psi_1 - \chi_1\psi_2, \qquad \chi^\dagger\psi^\dagger = \chi^\dagger_{\dot a}\psi^{\dagger\dot a} = \chi^\dagger_1\psi^\dagger_2 - \chi^\dagger_2\psi^\dagger_1. \tag{26.25} \]

We expect Weyl fields to describe spin-one-half particles, and by the spin-statistics theorem, these particles must be fermions. Therefore the corresponding fields must anticommute, rather than commute. That is, we should have
\[ \chi_a(x)\psi_b(y) = -\psi_b(y)\chi_a(x). \tag{26.26} \]

Thus, we get
\[ \chi\psi = \psi\chi. \tag{26.27} \]

Using the convention above, we can derive the following propositions:

Proposition 26.2

1.
(χψ)† = ψ † χ† .

2.
[ψ † σ̄ µ χ]† = χ† σ̄ µ ψ.

26.4 Dynamics of spinor fields


Weyl field

The Lagrangian density for a Weyl field is

L = iψ † σ̄ µ ∂µ ψ. (26.28)

It is a scalar by construction. Furthermore, we have

(iψ † σ̄ µ ∂µ ψ)† = iψ † σ̄ µ ∂µ ψ − ∂µ (iψ † σ̄ µ ψ). (26.29)

The second term is a total divergence, and vanishes (with suitable boundary conditions on the fields at infinity) when we integrate it over $d^4x$ to get the action S. Thus $i\psi^\dagger\bar\sigma^\mu\partial_\mu\psi$ has the hermiticity properties necessary for a term in $\mathcal{L}$.

The field equation can be obtained from the principle of least action,

σ̄ µ ∂µ ψ = 0. (26.30)

Majorana field

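If we add mass terms to the Lagrangian density 26.28, we have
\[ \mathcal{L} = i\psi^\dagger\bar\sigma^\mu\partial_\mu\psi - \frac{1}{2}m\psi\psi - \frac{1}{2}m\psi^\dagger\psi^\dagger. \tag{26.31} \]
The field equations are
\[ -i\bar\sigma^\mu\partial_\mu\psi + m\psi^\dagger = 0, \qquad -i\sigma^\mu\partial_\mu\psi^\dagger + m\psi = 0. \tag{26.32} \]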

We can write the equation of motion more compactly by introducing the gamma matrices
\[ \gamma^\mu \equiv \begin{pmatrix} 0 & \sigma^\mu_{a\dot c} \\ \bar\sigma^{\mu\dot a c} & 0 \end{pmatrix}. \tag{26.33} \]

It can be shown that


{γ µ , γ ν } = −2η µν . (26.34)

We also introduce a four-component Majorana field as
\[ \Psi \equiv \begin{pmatrix} \psi_c \\ \psi^{\dagger\dot c} \end{pmatrix}. \tag{26.35} \]

Then equation 26.32 becomes


(−iγ µ ∂µ + m)Ψ = 0. (26.36)

Dirac field

A Dirac field is composed of two left-handed spinor fields with a U(1) symmetry. The Lagrangian density is given by
\[ \mathcal{L} = i\chi^\dagger\bar\sigma^\mu\partial_\mu\chi + i\xi^\dagger\bar\sigma^\mu\partial_\mu\xi - m\chi\xi - m\xi^\dagger\chi^\dagger, \tag{26.37} \]
which is invariant under the transformation
\[ \chi \to e^{-i\alpha}\chi, \qquad \xi \to e^{i\alpha}\xi. \tag{26.38} \]

Define a four-component Dirac field as
\[ \Psi \equiv \begin{pmatrix} \chi_c \\ \xi^{\dagger\dot c} \end{pmatrix}. \tag{26.39} \]

The field equation written in terms of Ψ is

(−iγ µ ∂µ + m)Ψ = 0. (26.40)

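We can write the Lagrangian density in terms of the Dirac field. First we take the hermitian conjugate of Ψ to get
\[ \Psi^\dagger = (\chi^\dagger_{\dot a}, \xi^a). \tag{26.41} \]

Introduce the matrix
\[ \beta \equiv \begin{pmatrix} 0 & \delta^{\dot a}{}_{\dot c} \\ \delta_a{}^c & 0 \end{pmatrix}. \tag{26.42} \]
Given β, we define
\[ \bar\Psi \equiv \Psi^\dagger\beta = (\xi^a, \chi^\dagger_{\dot a}). \tag{26.43} \]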

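We can work out that
\[ \bar\Psi\Psi = \xi\chi + \chi^\dagger\xi^\dagger, \qquad \bar\Psi\gamma^\mu\partial_\mu\Psi = \chi^\dagger\bar\sigma^\mu\partial_\mu\chi + \xi^\dagger\bar\sigma^\mu\partial_\mu\xi + \partial_\mu(\xi\sigma^\mu\xi^\dagger). \tag{26.44} \]

Therefore, up to an irrelevant total divergence, we have
\[ \mathcal{L} = i\bar\Psi\gamma^\mu\partial_\mu\Psi - m\bar\Psi\Psi. \tag{26.45} \]

This form of the Lagrangian density is invariant under the U(1) transformation
\[ \Psi \to e^{-i\alpha}\Psi, \qquad \bar\Psi \to e^{i\alpha}\bar\Psi. \tag{26.46} \]

The Noether current associated with this symmetry is
\[ j^\mu = \bar\Psi\gamma^\mu\Psi = \chi^\dagger\bar\sigma^\mu\chi - \xi^\dagger\bar\sigma^\mu\xi. \tag{26.47} \]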

Charge conjugation
Charge conjugation simply exchanges ξ and χ. We can define a unitary charge conjugation
operator C that implements this:

C −1 ξa (x)C = χa (x), C −1 χa (x)C = ξa (x). (26.48)

We then have C −1 L(x)C = L(x).


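To express charge conjugation in terms of the Dirac field, we first introduce the charge conjugation matrix (written $\mathcal{C}$ to distinguish it from the operator C)
\[ \mathcal{C} \equiv \begin{pmatrix} \epsilon_{ab} & 0 \\ 0 & \epsilon^{\dot a\dot b} \end{pmatrix}. \tag{26.49} \]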

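Next we notice that, if we take the transpose of $\bar\Psi$, we get
\[ \bar\Psi^{\mathsf T} = \begin{pmatrix} \xi^a \\ \chi^\dagger_{\dot a} \end{pmatrix}. \tag{26.50} \]

Then, if we multiply by the charge conjugation matrix $\mathcal{C}$, we get a field that we will call $\Psi^C$, the charge conjugate of Ψ,
\[ \Psi^C \equiv \mathcal{C}\bar\Psi^{\mathsf T} = \begin{pmatrix} \xi_a \\ \chi^{\dagger\dot a} \end{pmatrix}. \tag{26.51} \]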

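Therefore, for a Dirac field, we have
\[ C^{-1}\Psi(x)C = \Psi^C(x). \tag{26.52} \]

The charge conjugation matrix $\mathcal{C}$ has a number of useful properties. As a numerical matrix, it obeys
\[ \mathcal{C}^{\mathsf T} = \mathcal{C}^\dagger = \mathcal{C}^{-1} = -\mathcal{C}, \qquad \mathcal{C}^{-1}\gamma^\mu\mathcal{C} = -(\gamma^\mu)^{\mathsf T}. \tag{26.53} \]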

Now let us return to the Majorana field. It is obvious that a Majorana field is its own charge conjugate, that is, $\Psi^C \equiv \mathcal{C}\bar\Psi^{\mathsf T} = \Psi$, where $\mathcal{C}$ is the charge conjugation matrix. This condition is analogous to the condition $\phi^\dagger = \phi$ that is satisfied by a real scalar field. A Dirac field, with its U(1) symmetry, is analogous to a complex scalar field, while a Majorana field, which has no U(1) symmetry, is analogous to a real scalar field.
Using the fact that $\bar\Psi = \Psi^{\mathsf T}\mathcal{C}$, the Lagrangian density of a Majorana field in terms of Ψ is given by
\[ \mathcal{L} = \frac{i}{2}\Psi^{\mathsf T}\mathcal{C}\gamma^\mu\partial_\mu\Psi - \frac{1}{2}m\Psi^{\mathsf T}\mathcal{C}\Psi. \tag{26.54} \]

Projection matrix
We can also recover the Weyl components of a Dirac or Majorana field by means of a suitable projection matrix. Define
\[ \gamma_5 \equiv \begin{pmatrix} -\delta_a{}^c & 0 \\ 0 & \delta^{\dot a}{}_{\dot c} \end{pmatrix}. \tag{26.55} \]

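Then we can define left and right projection matrices
\[ P_L \equiv \frac{1}{2}(1 - \gamma_5) = \begin{pmatrix} \delta_a{}^c & 0 \\ 0 & 0 \end{pmatrix}, \qquad P_R \equiv \frac{1}{2}(1 + \gamma_5) = \begin{pmatrix} 0 & 0 \\ 0 & \delta^{\dot a}{}_{\dot c} \end{pmatrix}. \tag{26.56} \]

Thus we have, for a Dirac field,
\[ P_L\Psi = \begin{pmatrix} \chi_c \\ 0 \end{pmatrix}, \qquad P_R\Psi = \begin{pmatrix} 0 \\ \xi^{\dagger\dot c} \end{pmatrix}. \tag{26.57} \]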

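The matrix $\gamma_5$ can also be expressed as
\[ \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \frac{i}{24}\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma, \qquad \text{where } \epsilon^{0123} = -1. \tag{26.58} \]
The matrix $\gamma_5$ also has the following properties:
\[ (\gamma_5)^\dagger = \gamma_5, \qquad (\gamma_5)^2 = 1, \qquad \{\gamma_5, \gamma^\mu\} = 0. \tag{26.59} \]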
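The algebraic statements 26.34 and 26.55 to 26.59 can be verified numerically. The sketch below builds the gamma matrices from the block form 26.33 with $\sigma^\mu = (I, \sigma_i)$ and $\bar\sigma^\mu = (I, -\sigma_i)$, and assumes the mostly-plus metric $\eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ implied by $p^2 + m^2 = 0$. The helper names are illustrative, and plain nested lists are used so that no external library is needed.

```python
def matmul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices of equal shape."""
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A[0])))

def scaled_identity(c, n=4):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def negate(M):
    return [[-x for x in row] for row in M]

def dagger(M):
    return [[complex(M[j][i]).conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

SIGMA = [I2] + PAULI                          # sigma^mu
SIGMA_BAR = [I2] + [negate(s) for s in PAULI]  # sigma-bar^mu
GAMMA = [blocks(Z2, s, sb, Z2) for s, sb in zip(SIGMA, SIGMA_BAR)]
ETA = [-1, 1, 1, 1]                            # diagonal of eta^{mu nu} (assumed signature)

def anticommutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] + BA[i][j] for j in range(4)] for i in range(4)]

def clifford_algebra_holds():
    """Check {gamma^mu, gamma^nu} = -2 eta^{mu nu} for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            expected = scaled_identity(-2 * ETA[mu] if mu == nu else 0)
            if not close(anticommutator(GAMMA[mu], GAMMA[nu]), expected):
                return False
    return True

def gamma5():
    """Return i gamma^0 gamma^1 gamma^2 gamma^3."""
    prod = scaled_identity(1j)
    for g in GAMMA:
        prod = matmul(prod, g)
    return prod

G5 = gamma5()
print(clifford_algebra_holds())
print(close(G5, [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]))
```

With these conventions the product $i\gamma^0\gamma^1\gamma^2\gamma^3$ comes out as $\mathrm{diag}(-1, -1, 1, 1)$, in agreement with the block form 26.55.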

The behaviour of Dirac field under Lorentz transformation

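Define
\[ S^{\mu\nu} \equiv \begin{pmatrix} (S_L^{\mu\nu})_a{}^b & 0 \\ 0 & (S_R^{\mu\nu})^{\dot a}{}_{\dot b} \end{pmatrix}. \tag{26.60} \]

Numerically, we have
\[ S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu], \qquad S^{i0} = \frac{i}{2}\begin{pmatrix} \sigma^i & 0 \\ 0 & -\sigma^i \end{pmatrix}, \qquad S^{ij} = \frac{1}{2}\epsilon^{ijk}\begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix}. \tag{26.61} \]

Then, for either a Dirac or Majorana field Ψ, we can write
\[ U(\Lambda)^{-1}\Psi(x)U(\Lambda) = D(\Lambda)\Psi(\Lambda^{-1}x), \qquad \text{where } D(\Lambda) = \exp\left( \frac{i}{2}\theta_{\mu\nu}S^{\mu\nu} \right). \tag{26.62} \]

We can also verify that
\[ U(\Lambda)^{-1}\bar\Psi(x)U(\Lambda) = \bar\Psi(\Lambda^{-1}x)[D(\Lambda)]^{-1}. \tag{26.63} \]

From the identities
\[ [\gamma^\mu, S^{\rho\sigma}] = i(\eta^{\mu\sigma}\gamma^\rho - \eta^{\mu\rho}\gamma^\sigma), \qquad [\gamma_5, S^{\mu\nu}] = 0, \tag{26.64} \]
it follows that
\[ D^{-1}\gamma^\mu D = \Lambda^\mu{}_\nu\gamma^\nu, \qquad D^{-1}\gamma_5 D = \gamma_5, \tag{26.65} \]
which means that $\bar\Psi\gamma^\mu\Psi$ behaves like a vector under Lorentz transformations, while $\bar\Psi\gamma_5\Psi$ behaves like a scalar.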

26.5 Canonical quantization formulation


26.5.1 Canonical quantization of left-handed Weyl field
Consider a left-handed Weyl field ψ with Lagrangian density given by 26.28. The canonically conjugate momentum to the field $\psi_a(x)$ is then
\[ \pi^a(x) = i\psi^\dagger_{\dot a}(x)\bar\sigma^{0\dot a a}. \tag{26.66} \]

The Hamiltonian of the field is
\[ H = \int \mathcal{H}\, d^3x, \qquad \text{where } \mathcal{H} = -i\psi^\dagger\bar\sigma^i\partial_i\psi. \tag{26.67} \]

The momentum and spin angular momentum of the field are
\[ P^i = -\int \pi^a\partial_i\psi_a\, d^3x, \qquad S^i = -\frac{i}{2}\int \pi^a(\sigma^i)_a{}^b\psi_b\, d^3x. \tag{26.68} \]
2

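The appropriate canonical anticommutation relations for the Weyl field are
\[ \{\psi_a(\mathbf{x}, t), \psi_c(\mathbf{y}, t)\} = \{\pi^a(\mathbf{x}, t), \pi^c(\mathbf{y}, t)\} = 0, \qquad \{\psi_a(\mathbf{x}, t), \pi^c(\mathbf{y}, t)\} = i\delta_a{}^c\,\delta(\mathbf{x} - \mathbf{y}). \tag{26.69} \]
Using the field equation, ψ(x) can be expanded as
\[ \psi_a = \int \widetilde{dp}\,\left[ b(p)w_a(p)e^{ipx} + d^\dagger(p)w_a(p)e^{-ipx} \right], \qquad \text{where } p^2 = 0, \quad (\hat{\mathbf{p}}\cdot\boldsymbol{\sigma} + 1)w(p) = 0. \tag{26.70} \]

Choosing the normalization $w^\dagger(p)w(p) = 2E_p = 2|\mathbf{p}|$, we can derive that
\[ \{b(p), b^\dagger(q)\} = \{d(p), d^\dagger(q)\} = (2\pi)^3\, 2E_p\, \delta(\mathbf{p} - \mathbf{q}), \tag{26.71} \]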

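and all other anticommutation brackets between $b$, $b^\dagger$, $d$ and $d^\dagger$ vanish. In terms of $b$, $b^\dagger$, $d$ and $d^\dagger$, the Hamiltonian of the field is
\[ H = \int \widetilde{dp}\,|\mathbf{p}|\left[ b^\dagger(p)b(p) + d^\dagger(p)d(p) \right] - 2E_0V, \qquad \text{where } E_0 = \frac{1}{2}(2\pi)^{-3}\int d^3p\,|\mathbf{p}|, \tag{26.72} \]
the momentum is
\[ \mathbf{P} = \int \widetilde{dp}\,\mathbf{p}\left[ b^\dagger(p)b(p) + d^\dagger(p)d(p) \right], \tag{26.73} \]
and the spin angular momentum is
\[ \mathbf{S} = \int \widetilde{dp}\,\left[ -\frac{1}{2}\hat{\mathbf{p}}\, b^\dagger(p)b(p) + \frac{1}{2}\hat{\mathbf{p}}\, d^\dagger(p)d(p) \right] + \cdots \tag{26.74} \]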
Notice that the omitted terms vanish when $\mathbf{S}$ is projected onto $\hat{\mathbf{p}}$. The component of the spin angular momentum in the direction of the three-momentum is called the helicity. A fermion with helicity +1/2 is said to be right-handed, and a fermion with helicity −1/2 left-handed. So $b^\dagger(p)$ creates a left-handed massless fermion, while $d^\dagger(p)$ creates a right-handed massless anti-fermion.
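The link between the constraint $(\hat{\mathbf{p}}\cdot\boldsymbol{\sigma} + 1)w(p) = 0$ and helicity −1/2 can be checked on explicit two-component spinors. The sketch below solves the constraint for a given unit vector and evaluates the expectation value of $\frac{1}{2}\hat{\mathbf{p}}\cdot\boldsymbol{\sigma}$. The explicit form chosen for $w$ and the function names are assumptions made for illustration; any overall phase or normalization would do equally well.

```python
import math

def p_dot_sigma(p_hat):
    """The 2x2 matrix p_hat . sigma for a unit vector p_hat = (px, py, pz)."""
    px, py, pz = p_hat
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]

def left_handed_w(p_hat):
    """A normalized solution of (p_hat . sigma + 1) w = 0."""
    px, py, pz = p_hat
    w = [-(px - 1j * py), 1 + pz]
    if abs(w[0]) + abs(w[1]) < 1e-12:  # p_hat = (0, 0, -1): the formula degenerates
        w = [1, 0]
    norm = math.sqrt(sum(abs(c) ** 2 for c in w))
    return [c / norm for c in w]

def helicity(p_hat, w):
    """Expectation value of (1/2) p_hat . sigma in the spinor w."""
    M = p_dot_sigma(p_hat)
    Mw = [M[0][0] * w[0] + M[0][1] * w[1], M[1][0] * w[0] + M[1][1] * w[1]]
    num = sum(complex(a).conjugate() * b for a, b in zip(w, Mw))
    den = sum(abs(c) ** 2 for c in w)
    return 0.5 * (num / den).real

for p_hat in [(0, 0, 1), (1, 0, 0), (0.6, 0, 0.8), (0, 0, -1)]:
    print(p_hat, helicity(p_hat, left_handed_w(p_hat)))
```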

26.5.2 Canonical quantization of Dirac field


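Consider a Dirac field Ψ with Lagrangian density given by 26.45. The canonically conjugate momentum to the field Ψ(x) is then
\[ \Pi \equiv \frac{\partial\mathcal{L}}{\partial(\partial_0\Psi)} = i\bar\Psi\gamma^0 = i\Psi^\dagger, \qquad (\bar\Psi = -i\Pi\gamma^0, \quad \Psi^\dagger = -i\Pi). \tag{26.75} \]
The Hamiltonian of the field is
\[ H = \int \mathcal{H}\, d^3x, \qquad \text{where } \mathcal{H} = -\Pi(\boldsymbol{\alpha}\cdot\nabla + i\beta m)\Psi, \quad \alpha^i = \gamma^0\gamma^i, \quad \beta = \gamma^0. \tag{26.76} \]

The momentum and spin angular momentum of the field are
\[ P^i = -\int \Pi^a\partial_i\Psi_a\, d^3x, \qquad J^i = -\frac{i}{2}\int \Pi\Sigma^i\Psi\, d^3x, \qquad \text{where } \Sigma^i = \begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix}. \tag{26.77} \]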

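Using the field equation, Ψ(x) can be expanded as
\[ \Psi(x) = \sum_{s=\pm}\int \widetilde{dp}\,\left[ b_s(p)u_s(p)e^{ipx} + d_s^\dagger(p)v_s(p)e^{-ipx} \right], \tag{26.78} \]
where
\[ (\not{p} + m)u(p) = 0, \qquad (-\not{p} + m)v(p) = 0 \qquad (p^2 + m^2 = 0). \tag{26.79} \]
Here, we introduce the Feynman slash: given any four-vector $a^\mu$, we define $\not{a} \equiv a_\mu\gamma^\mu$. Each of the equations in 26.79 has two linearly independent solutions, which we label via s = + and s = −.
For m ≠ 0, it is easiest to analyze equation 26.79 in the rest frame, where $\mathbf{p} = 0$. Two linearly independent solutions for u(p) can be chosen as
\[ u_s(0) = \sqrt{m}\begin{pmatrix} \xi_s \\ \xi_s \end{pmatrix}, \qquad \text{where } \xi_+^\dagger\xi_+ = 1, \quad \xi_- = -i\sigma_2\xi_+^*. \tag{26.80} \]

It can be shown that $\xi_s^\dagger\xi_{s'} = \delta_{ss'}$. We also choose the solutions for v(p) as
\[ v_+(0) = \sqrt{m}\begin{pmatrix} \xi_- \\ -\xi_- \end{pmatrix}, \qquad v_-(0) = \sqrt{m}\begin{pmatrix} -\xi_+ \\ \xi_+ \end{pmatrix}. \tag{26.81} \]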

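We can now find the spinors corresponding to an arbitrary three-momentum $\mathbf{p}$ by applying to $u_s(0)$ and $v_s(0)$ the matrix D(Λ) that corresponds to an appropriate boost. Generally, we have
\[ u_s(p) = \exp(i\eta\,\hat{\mathbf{p}}\cdot\mathbf{K})\,u_s(0), \qquad v_s(p) = \exp(i\eta\,\hat{\mathbf{p}}\cdot\mathbf{K})\,v_s(0), \tag{26.82} \]
where
\[ K^i = S^{i0} = \frac{i}{4}[\gamma^i, \gamma^0] = \frac{i}{2}\begin{pmatrix} \sigma^i & 0 \\ 0 & -\sigma^i \end{pmatrix}, \qquad \eta = \sinh^{-1}(|\mathbf{p}|/m). \tag{26.83} \]

Define the barred spinors
\[ \bar u_s(p) \equiv u_s^\dagger(p)\beta, \qquad \bar v_s(p) \equiv v_s^\dagger(p)\beta, \qquad (\beta = \gamma^0). \tag{26.84} \]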



They can be obtained by

ūs (p) = ūs (0) exp(−iη p̂ · K), v̄s (p) = v̄s (0) exp(−iη p̂ · K). (26.85)

This follows from K̄j ≡ βKj† β = Kj . In particular, it turns out that γ µ , Sµν , iγ5 , γ µ γ5 and
iγ5 Sµν all satisfy Ā = A.

The barred spinors satisfy the equations
\[ \bar u_s(p)(\not{p} + m) = 0, \qquad \bar v_s(p)(-\not{p} + m) = 0. \tag{26.86} \]

Some useful identities of spinors are summarized in the following proposition. The proofs can be found in section 38 of Quantum Field Theory (Mark Srednicki).

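Proposition 26.3

1.
\[ \bar u_{s'}(p)u_s(p) = 2m\delta_{ss'}, \qquad \bar v_{s'}(p)v_s(p) = -2m\delta_{ss'}, \qquad \bar u_{s'}(p)v_s(p) = \bar v_{s'}(p)u_s(p) = 0. \tag{26.87} \]

2.
\[ 2m\,\bar u_{s'}(p')\gamma^\mu u_s(p) = \bar u_{s'}(p')\left[ (p' + p)^\mu - 2iS^{\mu\nu}(p' - p)_\nu \right]u_s(p), \]
\[ -2m\,\bar v_{s'}(p')\gamma^\mu v_s(p) = \bar v_{s'}(p')\left[ (p' + p)^\mu - 2iS^{\mu\nu}(p' - p)_\nu \right]v_s(p). \tag{26.88} \]

3.
\[ \bar u_{s'}(p)\gamma^\mu u_s(p) = \bar v_{s'}(p)\gamma^\mu v_s(p) = 2p^\mu\delta_{ss'}, \qquad \bar u_{s'}(p)\gamma^0 v_s(-p) = \bar v_{s'}(p)\gamma^0 u_s(-p) = 0. \tag{26.89} \]

4.
\[ \sum_{s=\pm} u_s(p)\bar u_s(p) = -\not{p} + m, \qquad \sum_{s=\pm} v_s(p)\bar v_s(p) = -\not{p} - m. \tag{26.90} \]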

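Working out the boost matrix for spinors explicitly, we have
\[ \exp(i\eta\,\hat{\mathbf{p}}\cdot\mathbf{K}) = \begin{pmatrix} e^{\eta/2}\left( \frac{1 - \hat{\mathbf{p}}\cdot\boldsymbol{\sigma}}{2} \right) + e^{-\eta/2}\left( \frac{1 + \hat{\mathbf{p}}\cdot\boldsymbol{\sigma}}{2} \right) & 0 \\ 0 & e^{\eta/2}\left( \frac{1 + \hat{\mathbf{p}}\cdot\boldsymbol{\sigma}}{2} \right) + e^{-\eta/2}\left( \frac{1 - \hat{\mathbf{p}}\cdot\boldsymbol{\sigma}}{2} \right) \end{pmatrix}. \tag{26.91} \]

In the extreme relativistic limit, we have $\eta \to \infty$ and $me^\eta \to 2E$. Choosing $\xi_s$ as the eigenvector of $\hat{\mathbf{p}}\cdot\boldsymbol{\sigma}$ with eigenvalue s, we have
\[ u_+(p) = v_-(p) = \sqrt{2E}\begin{pmatrix} 0 \\ \xi_+ \end{pmatrix}, \qquad u_-(p) = v_+(p) = \sqrt{2E}\begin{pmatrix} \xi_- \\ 0 \end{pmatrix}. \tag{26.92} \]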

Thus, in this limit, $P_L\Psi$ and $P_R\Psi$ become left-handed and right-handed Weyl fields, respectively.


For our discussion of parity, time reversal, and charge conjugation, we will need a number of
relationships among the u and v spinors. First βus (0) = +us (0) and βvs (0) = −vs (0). Also,
βKj = −Kj β. We then have

us (−p) = βus (p), vs (−p) = −βvs (p). (26.93)

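Next, we need the charge conjugation matrix 26.49, denoted $\mathcal{C}$ below. We can show that $\mathcal{C}\bar u_s^{\mathsf T}(0) = v_s(0)$ and $\mathcal{C}\bar v_s^{\mathsf T}(0) = u_s(0)$. Also, equation 26.53 implies $\mathcal{C}^{-1}K_j\mathcal{C} = -(K_j)^{\mathsf T}$. From this we conclude that
\[ \mathcal{C}\bar u_s^{\mathsf T}(p) = v_s(p), \qquad \mathcal{C}\bar v_s^{\mathsf T}(p) = u_s(p). \tag{26.94} \]
Taking the complex conjugate of 26.94, we get
\[ u_s^*(p) = \mathcal{C}\beta v_s(p), \qquad v_s^*(p) = \mathcal{C}\beta u_s(p). \tag{26.95} \]

Next, we note that $\gamma_5 u_s(0) = +s\,v_{-s}(0)$ and $\gamma_5 v_s(0) = -s\,u_{-s}(0)$, and that $\gamma_5 K_j = K_j\gamma_5$. Therefore
\[ \gamma_5 u_s(p) = +s\,v_{-s}(p), \qquad \gamma_5 v_s(p) = -s\,u_{-s}(p). \tag{26.96} \]

Combining equations 26.93, 26.95 and 26.96 results in
\[ u_{-s}^*(-p) = -s\,\mathcal{C}\gamma_5 u_s(p), \qquad v_{-s}^*(-p) = -s\,\mathcal{C}\gamma_5 v_s(p). \tag{26.97} \]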

We will need equation 26.93 in our discussion of parity, equation 26.94 in our discussion of
charge conjugation, and 26.97 in our discussion of time reversal.
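From equation 26.78, we can derive that
\[ b_s(p) = \int d^3x\, e^{-ipx}\,\bar u_s(p)\gamma^0\Psi(x), \qquad b_s^\dagger(p) = \int d^3x\, e^{ipx}\,\bar\Psi(x)\gamma^0 u_s(p), \]
\[ d_s(p) = \int d^3x\, e^{-ipx}\,\bar\Psi(x)\gamma^0 v_s(p), \qquad d_s^\dagger(p) = \int d^3x\, e^{ipx}\,\bar v_s(p)\gamma^0\Psi(x). \tag{26.98} \]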

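The appropriate canonical anticommutation relations for the Dirac field are
\[ \{\Psi_a(\mathbf{x}, t), \Psi_c(\mathbf{y}, t)\} = \{\Pi_a(\mathbf{x}, t), \Pi_c(\mathbf{y}, t)\} = 0, \qquad \{\Psi_a(\mathbf{x}, t), \Pi_c(\mathbf{y}, t)\} = i\delta_{ac}\,\delta(\mathbf{x} - \mathbf{y}). \tag{26.99} \]
In terms of $b$, $b^\dagger$, $d$ and $d^\dagger$, the anticommutation relations are
\[ \{b_s(p), b_{s'}^\dagger(q)\} = \{d_s(p), d_{s'}^\dagger(q)\} = (2\pi)^3\, 2E_p\,\delta_{ss'}\,\delta(\mathbf{p} - \mathbf{q}), \tag{26.100} \]
and all other anticommutation brackets vanish.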


Define
\[ N^+(p, s) = b_s^\dagger(p)b_s(p), \qquad N^-(p, s) = d_s^\dagger(p)d_s(p). \tag{26.101} \]
The Hamiltonian of the Dirac field can be rewritten as
\[ H = \sum_{s=\pm}\int \widetilde{dp}\, E_p\left[ N^+(p, s) + N^-(p, s) \right] - 4E_0V, \tag{26.102} \]

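the momentum as
\[ \mathbf{P} = \sum_{s=\pm}\int \widetilde{dp}\,\mathbf{p}\left[ N^+(p, s) + N^-(p, s) \right], \tag{26.103} \]
and the spin angular momentum as
\[ \mathbf{S} = \sum_{s=\pm}\int \widetilde{dp}\,\frac{s}{2}\,\hat{\mathbf{p}}\left[ N^+(p, s) + N^-(p, s) \right] + \cdots \tag{26.104} \]
In equation 26.104, we choose $\xi_s$ as the eigenvector of $\hat{\mathbf{p}}\cdot\boldsymbol{\sigma}$ with eigenvalue s. (If $\mathbf{p} = 0$, $\hat{\mathbf{p}}$ can be chosen arbitrarily.) The omitted terms vanish when $\mathbf{S}$ is projected onto $\hat{\mathbf{p}}$.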
The anti-commutation relation for Ψ(x) at arbitrary spacetime points is given by
\[ \{\Psi_a(x), \bar\Psi_c(y)\} = (i\not\partial_x + m)_{ac}\, i\Delta(x - y), \tag{26.105} \]
where
\[ i\Delta(x - y) \equiv \int \widetilde{dp}\,\left[ e^{ip(x-y)} - e^{-ip(x-y)} \right]. \tag{26.106} \]

For $(x - y)^2 > 0$, $\Delta(x - y) = 0$ and so $\{\Psi_a(x), \bar\Psi_c(y)\} = 0$. It follows that
\[ \left[ \bar\Psi_a(x)\Psi_b(x), \bar\Psi_c(y)\Psi_d(y) \right] = 0 \qquad \text{if } (x - y)^2 > 0. \tag{26.107} \]
Thus, microscopic causality is satisfied for any physical observable, such as the charge density or the momentum density.
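The two-point correlation functions for the Dirac field are given by
\[ \langle 0|\Psi_a(x)\bar\Psi_c(y)|0\rangle = (i\not\partial_x + m)_{ac}\int \widetilde{dp}\, e^{ip(x-y)} \tag{26.108} \]
and
\[ \langle 0|\bar\Psi_c(y)\Psi_a(x)|0\rangle = -(i\not\partial_x + m)_{ac}\int \widetilde{dp}\, e^{ip(y-x)}. \tag{26.109} \]
The retarded Green function for the Dirac field is defined via
\[ S_R(x - y)_{ac} \equiv \theta(x^0 - y^0)\,\langle 0|\{\Psi_a(x), \bar\Psi_c(y)\}|0\rangle. \tag{26.110} \]
It is easy to verify that
\[ S_R(x - y) = (i\not\partial_x + m)D_R(x - y) = \int \frac{d^4p}{(2\pi)^4}\,\frac{i(\not{p} - m)}{p^2 + m^2}\, e^{ip(x-y)} \tag{26.111} \]
and
\[ (i\not\partial_x - m)S_R(x - y) = i\delta(x - y)\cdot 1_{4\times 4}. \tag{26.112} \]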
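Now, we introduce the time-ordered product for fermion fields,
\[ T\eta(x)\eta(y) \equiv \theta(x^0 - y^0)\eta(x)\eta(y) - \theta(y^0 - x^0)\eta(y)\eta(x). \tag{26.113} \]
The Feynman Green function for the Dirac field is defined as
\[ S_F(x - y) \equiv \langle 0|T\Psi(x)\bar\Psi(y)|0\rangle = \int \frac{d^4p}{(2\pi)^4}\,\frac{i(\not{p} - m)}{p^2 + m^2 - i\epsilon}\, e^{ip(x-y)}. \tag{26.114} \]
It follows that
\[ \langle 0|T\bar\Psi_a(x)\Psi_c(y)|0\rangle = -\langle 0|T\Psi_c(y)\bar\Psi_a(x)|0\rangle = -S_F(y - x)_{ca}. \tag{26.115} \]
We also have $\langle 0|T\Psi(x)\Psi(y)|0\rangle = \langle 0|T\bar\Psi(x)\bar\Psi(y)|0\rangle = 0$ for the Dirac field.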

26.5.3 Canonical quantization of Majorana field


Consider a Majorana field Ψ with Lagrangian density given by 26.54. The equation of motion for Ψ is once again the Dirac equation, and so the general solution is once again given by 26.78 and 26.79. However, Ψ must also obey the Majorana condition $\Psi = \mathcal{C}\bar\Psi^{\mathsf T}$, where $\mathcal{C}$ is the charge conjugation matrix. Using equation 26.94, we can derive that
\[ \mathcal{C}\bar\Psi^{\mathsf T} = \sum_{s=\pm}\int \widetilde{dp}\,\left[ d_s(p)u_s(p)e^{ipx} + b_s^\dagger(p)v_s(p)e^{-ipx} \right]. \tag{26.116} \]

So the Majorana condition implies that $d_s(p) = b_s(p)$, and a free Majorana field can be expanded as
\[ \Psi(x) = \sum_{s=\pm}\int \widetilde{dp}\,\left[ b_s(p)u_s(p)e^{ipx} + b_s^\dagger(p)v_s(p)e^{-ipx} \right]. \tag{26.117} \]

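The anticommutation relations for a Majorana field in two-component form are the same as those for a Weyl field, given by equation 26.69. Translating into four-component form, we have
\[ \{\Psi_a(\mathbf{x}, t), \Psi_c(\mathbf{y}, t)\} = (\mathcal{C}\gamma^0)_{ac}\,\delta(\mathbf{x} - \mathbf{y}), \qquad \{\Psi_a(\mathbf{x}, t), \bar\Psi_c(\mathbf{y}, t)\} = (\gamma^0)_{ac}\,\delta(\mathbf{x} - \mathbf{y}), \tag{26.118} \]
which can be used to show that
\[ \{b_s(p), b_{s'}(q)\} = 0, \qquad \{b_s(p), b_{s'}^\dagger(q)\} = (2\pi)^3\, 2E\,\delta_{ss'}\,\delta(\mathbf{p} - \mathbf{q}), \tag{26.119} \]
as we would expect.
The Hamiltonian for the Majorana field Ψ in terms of $b$ and $b^\dagger$ is
\[ H = \sum_{s=\pm}\int \widetilde{dp}\, E_p\, b_s^\dagger(p)b_s(p) - 2E_0V. \tag{26.120} \]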

The Majorana Lagrangian density has no U (1) symmetry. Thus there is no associated charge,
and only one kind of particle (with two possible spin states).
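The Feynman Green function $\langle 0|T\Psi(x)\bar\Psi(y)|0\rangle$ for the Majorana field is the same as that for the Dirac field. With the Majorana condition, we also have
\[ \langle 0|T\Psi(x)\Psi(y)|0\rangle = S_F(x - y)\mathcal{C}^{-1} \qquad \text{and} \qquad \langle 0|T\bar\Psi(x)\bar\Psi(y)|0\rangle = \mathcal{C}^{-1}S_F(x - y), \tag{26.121} \]
both of which would vanish for a Dirac field.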

26.6 Parity, time reversal and charge conjugation


Parity
Under the parity transformation, we require that

P −1 b†s (p)P = ηb†s (−p), P −1 d†s (p)P = ηd†s (−p) (26.122)



where η is a possible phase factor that should satisfy $\eta^2 = \pm 1$. We could assign different phase factors to the b and d operators, but we choose them to be the same so that the parity transformation is compatible with the Majorana condition $d_s(p) = b_s(p)$ when applied to Majorana fermions.
Using 26.122, we can derive that
\[ P^{-1}\mathbf{P}P = -\mathbf{P}, \qquad P^{-1}\mathbf{S}P = \mathbf{S}. \tag{26.123} \]

Thus a parity transformation reverses the three-momentum while leaving the spin direction unchanged.
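It can also be derived that
\[ P^{-1}\Psi(x)P = \sum_{s=\pm}\int \widetilde{dp}\,\left[ \eta^* b_s(p)\beta u_s(p)e^{ip\mathcal{P}x} - \eta\, d_s^\dagger(p)\beta v_s(p)e^{-ip\mathcal{P}x} \right], \tag{26.124} \]
where $\mathcal{P}^\mu{}_\nu = \mathrm{diag}(1, -1, -1, -1)$. An acceptable choice is η = −i, resulting in
\[ P^{-1}\Psi(x)P = i\beta\Psi(\mathcal{P}x), \qquad P^{-1}\bar\Psi(x)P = -i\bar\Psi(\mathcal{P}x)\beta. \tag{26.125} \]

So fermion bilinears of the form $\bar\Psi A\Psi$ transform as
\[ P^{-1}(\bar\Psi A\Psi)P = \bar\Psi(\beta A\beta)\Psi. \tag{26.126} \]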

Recalling that
\[ \Psi = \begin{pmatrix} \chi_a \\ \xi^{\dagger\dot a} \end{pmatrix}, \tag{26.127} \]
we see from equation 26.124 that
\[ P^{-1}\chi_a(x)P = i\xi^{\dagger\dot a}(\mathcal{P}x), \qquad P^{-1}\xi^{\dagger\dot a}(x)P = i\chi_a(\mathcal{P}x). \tag{26.128} \]

Thus a parity transformation exchanges a left-handed field for a right-handed one. If we take the hermitian conjugate of equation 26.128, then raise the index on one side while lowering it on the other, we get
\[ P^{-1}\chi^{\dagger\dot a}(x)P = i\xi_a(\mathcal{P}x), \qquad P^{-1}\xi_a(x)P = i\chi^{\dagger\dot a}(\mathcal{P}x). \tag{26.129} \]

Comparing equations 26.128 and 26.129, we see that they are compatible with the Majorana condition $\chi_a(x) = \xi_a(x)$.

Time reversal
In quantum theory, the time reversal operator is antiunitary, i.e., $T^{-1}iT = -i$. Under the time reversal transformation, we require that
\[ T^{-1}b_s^\dagger(p)T = \zeta_s\, b_{-s}^\dagger(-p), \qquad T^{-1}d_s^\dagger(p)T = \zeta_s\, d_{-s}^\dagger(-p). \tag{26.130} \]

It follows that
\[ T^{-1}\mathbf{P}T = -\mathbf{P}, \qquad T^{-1}\mathbf{S}T = -\mathbf{S}. \tag{26.131} \]

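Thus a time reversal transformation reverses both the three-momentum and the spin direction.
Applying it to the field operator, we obtain
\[ T^{-1}\Psi(x)T = \sum_{s=\pm}\int \widetilde{dp}\,(-s)\,\mathcal{C}\gamma_5\left[ \zeta_{-s}^*\, b_s(p)u_s(p)e^{ip\mathcal{T}x} + \zeta_{-s}\, d_s^\dagger(p)v_s(p)e^{-ip\mathcal{T}x} \right], \tag{26.132} \]
where $\mathcal{T}^\mu{}_\nu = \mathrm{diag}(-1, 1, 1, 1)$ and $\mathcal{C}$ is the charge conjugation matrix. An acceptable choice is $\zeta_s = s$, leading to
\[ T^{-1}\Psi(x)T = \mathcal{C}\gamma_5\Psi(\mathcal{T}x), \qquad T^{-1}\bar\Psi(x)T = \bar\Psi(\mathcal{T}x)\gamma_5\mathcal{C}^{-1}. \tag{26.133} \]

Fermion bilinears then transform as
\[ T^{-1}(\bar\Psi A\Psi)T = \bar\Psi(\gamma_5\mathcal{C}^{-1}A^*\mathcal{C}\gamma_5)\Psi. \tag{26.134} \]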

Considering the effect of time reversal on the Weyl fields, we find that
\[ T^{-1}\chi_a(x)T = \chi^a(\mathcal{T}x), \qquad T^{-1}\xi^{\dagger\dot a}(x)T = -\xi^\dagger_{\dot a}(\mathcal{T}x). \tag{26.135} \]

Thus left-handed Weyl fields transform into left-handed Weyl fields (and right-handed into right-handed) under time reversal. If we take the hermitian conjugate of equation 26.135, then raise the index on one side while lowering it on the other, we get
\[ T^{-1}\chi^{\dagger\dot a}(x)T = -\chi^\dagger_{\dot a}(\mathcal{T}x), \qquad T^{-1}\xi_a(x)T = \xi^a(\mathcal{T}x), \tag{26.136} \]
which is compatible with the Majorana condition $\chi_a(x) = \xi_a(x)$.

Charge conjugation
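Under charge conjugation, we have (with C the unitary operator and $\mathcal{C}$ the charge conjugation matrix)
\[ C^{-1}\Psi(x)C = \mathcal{C}\bar\Psi^{\mathsf T}(x), \qquad C^{-1}\bar\Psi(x)C = \Psi^{\mathsf T}(x)\mathcal{C}. \tag{26.137} \]

Fermion bilinears transform as
\[ C^{-1}(\bar\Psi A\Psi)C = \bar\Psi(\mathcal{C}^{-1}A^{\mathsf T}\mathcal{C})\Psi. \tag{26.138} \]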

Summary
The transformation properties of the various fermion bilinears under C, P and T are summarized in table 26.1.

Table 26.1: Transformation properties of fermion bilinears under discrete symmetries. Here, we use the shorthand (−1)^µ ≡ +1 for µ = 0 and (−1)^µ ≡ −1 for µ = 1, 2, 3.

         Ψ̄Ψ      iΨ̄γ5Ψ     Ψ̄γ^µΨ      Ψ̄γ^µγ5Ψ
P        +1      −1        (−1)^µ     −(−1)^µ
T        +1      −1        (−1)^µ     (−1)^µ
C        +1      +1        −1         +1
CPT      +1      +1        −1         −1

We see that $\bar\Psi\Psi$ and $i\bar\Psi\gamma_5\Psi$ are both even under CPT, while $\bar\Psi\gamma^\mu\Psi$ and $\bar\Psi\gamma^\mu\gamma_5\Psi$ are both odd. These are examples of a more general rule: a fermion bilinear with n vector indices (and no uncontracted spinor indices) is even (odd) under CPT if n is even (odd). This also applies if we allow derivatives acting on the fields, since each component of $\partial_\mu$ is odd under the combination PT and even under C.
For scalar and vector fields, it is always possible to choose the phase factors in the C, P, and T transformations so that, overall, they obey the same rule: a hermitian combination of fields and derivatives is even or odd depending on the total number of uncontracted vector indices. Putting this together with our result for fermion bilinears, we have the following CPT theorem.
Theorem 26.2 CPT theorem

Any hermitian combination of any set of fields (scalar, vector, Dirac, Majorana) and
their derivatives that is a Lorentz scalar (and so carries no indices) is even under CP T .

Since the Lagrangian must be formed out of such combinations, we have $\mathcal{L}(x) \to \mathcal{L}(-x)$ under CPT, and so the action $S = \int d^4x\,\mathcal{L}$ is invariant.
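The CPT row of table 26.1 can be cross-checked by multiplying the C, P and T entries column by column. In the sketch below, each entry is encoded as a pair (sign, k) standing for sign × [(−1)^µ]^k; this encoding and the dictionary keys are illustrative choices, not notation from the notes.

```python
# Each entry is (sign, k), meaning sign * ((-1)^mu)^k, with k taken modulo 2.
TABLE = {
    "scalar":       {"P": (+1, 0), "T": (+1, 0), "C": (+1, 0)},  # Psi-bar Psi
    "pseudoscalar": {"P": (-1, 0), "T": (-1, 0), "C": (+1, 0)},  # i Psi-bar gamma5 Psi
    "vector":       {"P": (+1, 1), "T": (+1, 1), "C": (-1, 0)},  # Psi-bar gamma^mu Psi
    "axial_vector": {"P": (-1, 1), "T": (+1, 1), "C": (+1, 0)},  # Psi-bar gamma^mu gamma5 Psi
}

# Number of vector indices carried by each bilinear.
VECTOR_INDICES = {"scalar": 0, "pseudoscalar": 0, "vector": 1, "axial_vector": 1}

def cpt(entry):
    """Multiply the C, P and T transformation factors of one bilinear."""
    sign, power = 1, 0
    for op in ("C", "P", "T"):
        s, k = entry[op]
        sign *= s
        power = (power + k) % 2  # ((-1)^mu)^2 = 1
    return sign, power

for name, entry in TABLE.items():
    print(name, cpt(entry), (-1) ** VECTOR_INDICES[name])
```

In every column the factors of (−1)^µ cancel between P and T, and the remaining sign equals (−1)^n with n the number of vector indices, as the general rule states.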

26.7 Perturbation theory for canonical quantization


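The Lagrangian density of a Dirac field and a Klein-Gordon field with Yukawa interaction is
\[ \mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}M_0^2\phi^2 + i\bar\Psi\gamma^\mu\partial_\mu\Psi - m_0\bar\Psi\Psi - g_0\bar\Psi\Psi\phi. \tag{26.139} \]
The Hamiltonian is
\[ H = H_0 + H_{\mathrm{int}}, \qquad \text{where } H_{\mathrm{int}} = \int d^3x\, g_0\bar\Psi(x)\Psi(x)\phi(x). \tag{26.140} \]

The correlation function of the Yukawa theory can be expanded perturbatively using
\[ \langle\Omega|T\{\Psi(x)\bar\Psi(y)\phi(z)\}|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)} \frac{\langle 0|T\left\{ \Psi_I(x)\bar\Psi_I(y)\phi_I(z)\exp\left[ -i\int_{-T}^{T}dt\, H_I \right] \right\}|0\rangle}{\langle 0|T\left\{ \exp\left[ -i\int_{-T}^{T}dt\, H_I \right] \right\}|0\rangle}, \tag{26.141} \]
where the definitions of $\phi_I$, $\Psi_I$ and $H_I$ are similar to those in $\phi^4$ theory.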

The right hand side of equation 26.141 can be evaluated using Wick’s theorem. Before we state
the Wick’s theorem for Yukawa theory, we must note the following conventions:

1. The time-ordered product picks up one minus sign for each interchange of operators
that is necessary to put the fields in time order.

2. The normal-ordered product picks up one minus sign for each interchange of operators
that is necessary to put the fields in normal order.
3. Define contractions under the normal-ordering symbol to include minus signs for op-
erator interchanges.
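With these conventions, Wick's theorem takes the same form as before:
\[ T\left\{ \Psi_I(x_1)\bar\Psi_I(x_2)\Psi_I(x_3)\cdots \right\} = N\left\{ \Psi_I(x_1)\bar\Psi_I(x_2)\Psi_I(x_3)\cdots + \text{all possible contractions} \right\}. \tag{26.142} \]
Example:
\[ \langle 0|T\left\{ \Psi_{Ia}(x_1)\bar\Psi_{Ib}(x_2)\Psi_{Ic}(x_3)\bar\Psi_{Id}(x_4) \right\}|0\rangle = S_F(x_1 - x_2)_{ab}S_F(x_3 - x_4)_{cd} - S_F(x_1 - x_4)_{ad}S_F(x_3 - x_2)_{cb}. \tag{26.143} \]

Expanding $\langle 0|T\left\{ \Psi_{Ia}(x)\bar\Psi_{Ib}(y)\phi_I(z)\exp\left[ -i\int_{-T}^{T}dt\, H_I \right] \right\}|0\rangle$ to first order in $g_0$, we have
\[ \langle 0|T\left\{ \Psi_{Ia}(x)\bar\Psi_{Ib}(y)\phi_I(z)\,(-ig_0)\int d^4w\,\bar\Psi_I(w)\Psi_I(w)\phi_I(w) \right\}|0\rangle \]
\[ = -(-ig_0)\,S_F(x - y)_{ab}\int d^4w\, D_F(z - w)\,\mathrm{Tr}[S_F(w - w)] + (-ig_0)\int d^4w\,[S_F(x - w)S_F(w - y)]_{ab}\,D_F(w - z). \]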

It can be represented by the following Feynman diagrams.

Figure 26.1: Feynman diagram representation of perturbation expansion.

The Feynman rules for evaluating Feynman diagrams in Yukawa theory are:
1. For each fermion propagator from y to x, $P = S_F(x - y)$.
2. For each scalar propagator, $P = D_F(x - y)$.
3. For each vertex, $V = (-ig_0)\int d^4w$.
4. For each external point, E = 1.
5. Divide by the symmetry factor.

26.8 Path integral formulation


26.8.1 Grassmann numbers
Formal definition
Grassmann numbers are individual elements or points of the exterior algebra generated by a
set of n Grassmann variables or Grassmann directions or supercharges {θi } with n possibly
being infinite. The Grassmann variables are the basis vectors of a vector space (of dimension
n). They form an algebra over a field, with the field usually being taken to be the complex
numbers, although one could contemplate other fields, such as the reals. The algebra is a unital
algebra, and the generators are anti-commuting:
θi θj = −θj θi . (26.144)
Since the θi form a vector space over the complex numbers, it is trivial that they commute with
the complex numbers; this is by definition. That is, for complex x, one has
θi x = xθi . (26.145)
The squares of the generators vanish:
θi θi = −θi θi = 0. (26.146)
In other words, a Grassmann variable is a non-zero square-root of zero. Let V denote this
n-dimensional vector space of Grassmann variables. Note that it is independent of the choice
of basis. The corresponding exterior algebra is defined as
Λ = C ⊕ V ⊕ (V ∧ V ) ⊕ (V ∧ V ∧ V ) ⊕ · · · (26.147)
where ∧ is the exterior product and ⊕ is the direct sum. The individual elements of this algebra
are then called Grassmann numbers. It is standard to completely omit the wedge symbol ∧
when writing a Grassmann number; it is used here only to clearly illustrate how the exterior
algebra is built up out of the Grassmann variables. Thus, a completely general Grassmann number can be written as
\[ z = \sum_{k=0}^{\infty}\sum_{i_1, i_2, \cdots, i_k} c_{i_1 i_2\cdots i_k}\,\theta_{i_1}\theta_{i_2}\cdots\theta_{i_k}, \tag{26.148} \]
where the c's are complex numbers, or, equivalently, $c_{i_1 i_2\cdots i_k}$ is a complex-valued, completely antisymmetric tensor of rank k. Again, the $\theta_i$ can be clearly seen here to be playing the role of a basis vector of a vector space. Observe that the Grassmann algebra generated by n linearly independent Grassmann variables has dimension $2^n$; this follows from the binomial theorem applied to the above sum, and the fact that the (n + 1)-fold product of variables must vanish, by the anti-commutation relations above. In other words, for n variables, the sum terminates:
\[ \Lambda = \mathbb{C} \oplus \Lambda^1(V) \oplus \Lambda^2(V) \oplus \cdots \oplus \Lambda^n(V), \tag{26.149} \]
where $\Lambda^k(V)$ is the k-fold alternating product. The dimension of $\Lambda^k(V)$ is given by n choose k, the binomial coefficient. The special case of n = 1 is called a dual number, and was introduced by William Clifford in 1873.

Integral over Grassmann number


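Single-variable integration:
\[ \int d\theta\,(A + B\theta) \equiv B. \tag{26.150} \]

Multi-variable integration:
\[ \int d\theta\, d\eta\;\eta\theta \equiv 1. \tag{26.151} \]

Complex conjugation:
\[ (\theta\eta)^* \equiv \eta^*\theta^* = -\theta^*\eta^*, \qquad \int d\theta^* d\theta\;\theta\theta^* \equiv 1. \tag{26.152} \]

Gaussian integral:
\[ \int d\theta^* d\theta\; e^{-\theta^* b\theta} = b, \qquad \int d\theta^* d\theta\;\theta\theta^*\, e^{-\theta^* b\theta} = 1. \tag{26.153} \]

Linear transformation:
\[ \prod_i \theta_i' = (\det A)\Big( \prod_i \theta_i \Big), \qquad \text{where } \theta_i' = A_{ij}\theta_j. \tag{26.154} \]

For a general integral
\[ \Big( \prod_i \int d\theta_i^* d\theta_i \Big) f(\theta), \tag{26.155} \]
the only term of f(θ) that survives has exactly one factor of each $\theta_i$ and $\theta_i^*$; it is proportional to $(\prod_i \theta_i)(\prod_i \theta_i^*)$. If we replace θ by Uθ, where U is a unitary transformation, this term acquires a factor of $\det U \det U^* = 1$, so the integral is unchanged under the unitary transformation.
Using this invariance, we can derive the Gaussian integrals over multiple complex Grassmann numbers:
\[ \Big( \prod_i \int d\theta_i^* d\theta_i \Big) e^{-\theta_i^* B_{ij}\theta_j} = \det B, \qquad \Big( \prod_i \int d\theta_i^* d\theta_i \Big)\theta_k\theta_l^*\, e^{-\theta_i^* B_{ij}\theta_j} = (B^{-1})_{kl}\det B. \tag{26.156} \]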
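The first Gaussian integral in 26.156 can be verified by brute force with a small Grassmann algebra. The sketch below stores a Grassmann number as a dictionary from sorted tuples of generator indices to coefficients, and implements the Berezin integral as a left derivative, which reproduces the conventions 26.150 to 26.152. The representation, the generator numbering ($\theta_i \to 2i$, $\theta_i^* \to 2i + 1$) and all function names are choices made for this illustration.

```python
from fractions import Fraction

# A Grassmann number is stored as {sorted tuple of generator indices: coefficient}.

def gmul(a, b):
    """Product of two Grassmann numbers, using theta_i theta_j = -theta_j theta_i."""
    out = {}
    for ka, ca in a.items():
        for kb, cb in b.items():
            if set(ka) & set(kb):
                continue  # a repeated generator squares to zero
            seq, sign = list(ka + kb), 1
            for i in range(len(seq)):  # bubble sort, one sign flip per swap
                for j in range(len(seq) - 1 - i):
                    if seq[j] > seq[j + 1]:
                        seq[j], seq[j + 1] = seq[j + 1], seq[j]
                        sign = -sign
            key = tuple(seq)
            out[key] = out.get(key, 0) + sign * ca * cb
    return out

def gadd(a, b):
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + c
    return out

def gscale(a, c):
    return {key: c * v for key, v in a.items()}

def gexp(a, n_generators):
    """exp(a) for an element without constant term; the series terminates."""
    result, term = {(): Fraction(1)}, {(): Fraction(1)}
    for k in range(1, n_generators + 1):
        term = gscale(gmul(term, a), Fraction(1, k))
        result = gadd(result, term)
    return result

def berezin(f, i):
    """Integrate over generator i: anticommute theta_i to the front, then drop it."""
    out = {}
    for key, c in f.items():
        if i in key:
            pos = key.index(i)
            rest = key[:pos] + key[pos + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** pos * c
    return out

def gaussian_integral(B):
    """Evaluate prod_i int dtheta_i^* dtheta_i exp(-theta_i^* B_ij theta_j)."""
    n = len(B)
    exponent = {}
    for i in range(n):
        for j in range(n):
            pair = gmul({(2 * i + 1,): 1}, {(2 * j,): 1})  # theta_i^* theta_j
            exponent = gadd(exponent, gscale(pair, -B[i][j]))
    f = gexp(exponent, 2 * n)
    for i in reversed(range(n)):
        f = berezin(f, 2 * i)      # d theta_i sits next to the integrand and acts first
        f = berezin(f, 2 * i + 1)  # then d theta_i^*
    return f.get((), 0)

print(gaussian_integral([[4]]))             # single variable: returns b
print(gaussian_integral([[2, 1], [3, 5]]))  # equals det B = 2*5 - 1*3
```

Note that the result is det B rather than the $1/\det B$ familiar from ordinary complex Gaussian integrals, which is the origin of the relative sign between bosonic and fermionic loop contributions.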

26.8.2 Path integral formulation for free Dirac field


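A Grassmann field is a function of space-time whose values are Grassmann numbers. The classical Dirac field being used to evaluate the path integral is a Grassmann field. The time-ordered two-point correlation function for a free Dirac field is given by
\[ \langle\Omega|T\Psi_H(x_1)\bar\Psi_H(x_2)|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)} \frac{\int \mathcal{D}\bar\Psi\mathcal{D}\Psi\,\exp\left[ i\int_{-T}^{T}d^4x\,\bar\Psi(i\not\partial - m)\Psi \right]\Psi(x_1)\bar\Psi(x_2)}{\int \mathcal{D}\bar\Psi\mathcal{D}\Psi\,\exp\left[ i\int_{-T}^{T}d^4x\,\bar\Psi(i\not\partial - m)\Psi \right]}. \tag{26.157} \]
The generating functional of the correlation functions is
\[ Z[\bar\eta, \eta] = \int \mathcal{D}\bar\Psi\mathcal{D}\Psi\,\exp\left[ i\int d^4x\,\left( \bar\Psi(i\not\partial - m)\Psi + \bar\eta\Psi + \bar\Psi\eta \right) \right], \tag{26.158} \]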

where η(x) is a Grassmann-valued source field.


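Define
$$\Psi'(x) \equiv \Psi(x) - i\int d^4y\; S_F(x-y)\eta(y). \tag{26.159}$$

It follows that
$$\bar\Psi'(x) = \bar\Psi(x) - i\int d^4y\; \bar\eta(y)S_F(y-x). \tag{26.160}$$

Noticing that $(i\slashed{\partial}_x - m)S_F(x-y) = i\delta(x-y)$ and $S_F(y-x)(i\slashed{\partial}_x + m) = -i\delta(x-y)$, we
can derive that
$$\int d^4x\,\left[\bar\Psi(i\slashed{\partial} - m)\Psi + \bar\eta\Psi + \bar\Psi\eta\right] = \int d^4x\; \bar\Psi'(i\slashed{\partial} - m)\Psi' + i\int d^4x\, d^4y\; \bar\eta(x)S_F(x-y)\eta(y). \tag{26.161}$$
So we have
$$Z[\bar\eta,\eta] = Z[0]\exp\left[-\int d^4x\, d^4y\; \bar\eta(x)S_F(x-y)\eta(y)\right]. \tag{26.162}$$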

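Adopting the convention that
$$\frac{d}{d\eta}\,\theta\eta = -\frac{d}{d\eta}\,\eta\theta = -\theta, \tag{26.163}$$
the correlation function can be calculated as
$$\langle 0|\mathrm{T}\,\Psi_H(x_1)\bar\Psi_H(x_2)|0\rangle = Z[0]^{-1}\left(-i\frac{\delta}{\delta\bar\eta(x_1)}\right)\left(i\frac{\delta}{\delta\eta(x_2)}\right) Z[\bar\eta,\eta]\,\Big|_{\bar\eta,\eta=0} = S_F(x_1-x_2). \tag{26.164}$$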

26.8.3 Perturbation theory for path integral quantization


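Define
$$\mathcal{L}_0 = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}M_0^2\phi^2 + i\bar\Psi\gamma^\mu\partial_\mu\Psi - m_0\bar\Psi\Psi, \qquad \mathcal{L}_1 = -g_0\bar\Psi\Psi\phi. \tag{26.165}$$
The generating functional of the Yukawa theory can be expanded as
$$\begin{aligned}
Z[J] &= \int \mathcal{D}\phi\,\mathcal{D}\bar\Psi\,\mathcal{D}\Psi\; e^{i\int d^4x\,[\mathcal{L}_0 + \mathcal{L}_1 + J\phi + \bar\eta\Psi + \bar\Psi\eta]} \\
&= e^{i\int d^4x\,\mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(x)},\ \frac{1}{i}\frac{\delta}{\delta\bar\eta(x)},\ -\frac{1}{i}\frac{\delta}{\delta\eta(x)}\right)} \int \mathcal{D}\phi\,\mathcal{D}\bar\Psi\,\mathcal{D}\Psi\; e^{i\int d^4y\,[\mathcal{L}_0 + J\phi + \bar\eta\Psi + \bar\Psi\eta]} \\
&\propto e^{i\int d^4x\,\mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(x)},\ \frac{1}{i}\frac{\delta}{\delta\bar\eta(x)},\ -\frac{1}{i}\frac{\delta}{\delta\eta(x)}\right)} \exp\left\{-\int d^4y\, d^4z\,\left[\frac{1}{2}J(y)D_F(y-z)J(z) + \bar\eta(y)S_F(y-z)\eta(z)\right]\right\} \\
&= \sum_{V=0}^{\infty}\frac{1}{V!}\left[-ig_0\int d^4x\,\left(\frac{1}{i}\frac{\delta}{\delta J(x)}\right)\left(-\frac{1}{i}\frac{\delta}{\delta\eta(x)}\right)\left(\frac{1}{i}\frac{\delta}{\delta\bar\eta(x)}\right)\right]^{V} \\
&\quad\times \sum_{P_1=0}^{\infty}\frac{1}{P_1!}\left[-\frac{1}{2}\int d^4y_1\, d^4z_1\; J(y_1)D_F(y_1-z_1)J(z_1)\right]^{P_1} \\
&\quad\times \sum_{P_2=0}^{\infty}\frac{1}{P_2!}\left[-\int d^4y_2\, d^4z_2\; \bar\eta(y_2)S_F(y_2-z_2)\eta(z_2)\right]^{P_2}.
\end{aligned} \tag{26.166}$$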

If we focus on a term with particular $V$, $P_1$ and $P_2$, the number of surviving scalar sources will
be $E_1 = 2P_1 - V$ and fermion sources $E_2 = 2P_2 - 2V$. Those terms can be organized using
Feynman diagrams. In these diagrams, a dashed line segment stands for a scalar propagator
$D_F(x-y)$, a line with an arrow pointing from $y$ to $x$ for a fermion propagator $S_F(x-y)$, a
filled circle at one end of a dashed line segment for a scalar source $i\int d^4x\, J(x)$, a filled circle at
the start of a line with an arrow for a fermion source $i\int d^4x\, \eta(x)$, a filled circle at the end of a
line with an arrow for an anti-fermion source $i\int d^4x\, \bar\eta(x)$, and a vertex joining three line segments
for $-ig_0\int d^4x$.
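
The counting rules $E_1 = 2P_1 - V$ and $E_2 = 2P_2 - 2V$ can be inverted to list which terms of the expansion contribute to a correlation function with a given number of external sources. The sketch below is an illustration only; the function name `terms_with_sources` and the cutoff `max_vertices` are assumptions introduced here.

```python
def terms_with_sources(E1, E2, max_vertices):
    """List (V, P1, P2) with E1 = 2*P1 - V and E2 = 2*P2 - 2*V, for V <= max_vertices."""
    if E2 % 2:
        return []  # fermion sources always survive in (eta-bar, eta) pairs
    out = []
    for V in range(max_vertices + 1):
        if (E1 + V) % 2 == 0:  # P1 = (E1 + V) / 2 must be an integer
            out.append((V, (E1 + V) // 2, (E2 + 2 * V) // 2))
    return out


# Fermion two-point function (no scalar sources, one eta-bar and one eta):
print(terms_with_sources(E1=0, E2=2, max_vertices=2))
```

For the fermion two-point function up to two vertices this returns the free propagator term $(V, P_1, P_2) = (0, 0, 1)$ and the order-$g_0^2$ term $(2, 1, 3)$; the latter includes disconnected diagrams as well as the one-loop self-energy.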

26.9 LSZ reduction formula


Using the completeness of Dirac field, we can derive the structure of the exact propagator,
$$\int d^4x\; e^{-ipx}\, \langle\Omega|\mathrm{T}\,\Psi(x)\bar\Psi(0)|\Omega\rangle_C = \frac{iZ_2(\slashed{p} - m)}{p^2 + m^2 - i\epsilon} + \cdots \tag{26.167}$$

where $m$ is the physical mass of the fermion, and $Z_2$ is the probability for the quantum field
to create or annihilate an exact one-particle eigenstate of $H$, defined through
$$\langle\Omega|\Psi(0)|p,s,b\rangle = \sqrt{Z_2}\,u_s(p), \qquad \langle p,s,d|\Psi(0)|\Omega\rangle = \sqrt{Z_2}\,v_s(p). \tag{26.168}$$

In equation 26.167 we have dropped the terms without isolated poles in the $p^2$ plane.

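The scattering amplitude of interacting fermions and antifermions can be evaluated using the following
reduction formula.
$$\begin{aligned}
&\langle p_1\cdots p_n\,\bar p_1\cdots\bar p_{\bar n}|S|k_1\cdots k_m\,\bar k_1\cdots\bar k_{\bar m}\rangle \\
&= \left(\frac{i}{\sqrt{Z_2}}\right)^{m+\bar m+n+\bar n}\prod_{i=1}^{n}\int d^4x_i\, e^{-ip_ix_i}\prod_{i=1}^{\bar n}\int d^4\bar x_i\, e^{-i\bar p_i\bar x_i}\prod_{j=1}^{m}\int d^4y_j\, e^{ik_jy_j}\prod_{j=1}^{\bar m}\int d^4\bar y_j\, e^{i\bar k_j\bar y_j} \\
&\quad\times [\bar u_{s_1}(p_1)(\slashed{p}_1 + m)]\cdots[\bar u_{s_n}(p_n)(\slashed{p}_n + m)]\times[\bar v_{\bar r_1}(\bar k_1)(\slashed{\bar k}_1 - m)]\cdots[\bar v_{\bar r_{\bar m}}(\bar k_{\bar m})(\slashed{\bar k}_{\bar m} - m)] \\
&\quad\times \langle\Omega|\mathrm{T}\{\Psi(x_1)\cdots\Psi(x_n)\,\bar\Psi(\bar x_1)\cdots\bar\Psi(\bar x_{\bar n})\,\bar\Psi(y_1)\cdots\bar\Psi(y_m)\,\Psi(\bar y_1)\cdots\Psi(\bar y_{\bar m})\}|\Omega\rangle \\
&\quad\times [(\slashed{k}_1 + m)u_{r_1}(k_1)]\cdots[(\slashed{k}_m + m)u_{r_m}(k_m)]\times[(\slashed{\bar p}_1 - m)v_{\bar s_1}(\bar p_1)]\cdots[(\slashed{\bar p}_{\bar n} - m)v_{\bar s_{\bar n}}(\bar p_{\bar n})].
\end{aligned} \tag{26.169}$$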

From equation 26.169, we can see that the scattering amplitude would vanish unless $n - \bar n = m - \bar m$,
implying the conservation of charge. Terms like $e^{ipx}$ would impose the condition
of momentum conservation, and terms like $\slashed{p} \pm m$ would remove external legs in Feynman
diagrams. A formal derivation of the reduction formula for fermions can be found in section 2.7
of Advanced Quantum Field Theory (Jorge Crispim Romão).

Now we can list the Feynman rules of Yukawa theory which can be used to evaluate scattering
amplitudes.

1. For each incoming electron, draw a solid line with an arrow pointed towards the vertex,
and label it with the electron’s four-momentum, ki .

2. For each outgoing electron, draw a solid line with an arrow pointed away from the ver-
tex, and label it with the electron’s four-momentum, pi .

3. For each incoming positron, draw a solid line with an arrow pointed away from the
vertex, and label it with minus the positron’s four-momentum, −k̄i .

4. For each outgoing positron, draw a solid line with an arrow pointed towards the vertex,
and label it with minus the positron’s four-momentum, −p̄i .

5. For each incoming scalar, draw a dashed line with an arrow pointed towards the vertex,
and label it with the scalar’s four-momentum, qi .

6. For each outgoing scalar, draw a dashed line with an arrow pointed away from the vertex,
and label it with the scalar’s four-momentum, qi′ .

7. The only allowed vertex joins two solid lines, one with an arrow pointing towards it and
one with an arrow pointing away from it, and one dashed line (whose arrow can point
in either direction). Using this vertex, join up all the external lines, including extra
internal lines as needed. In this way, draw all possible diagrams that are topologically
inequivalent.

8. Assign each internal line its own four-momentum. Think of the four-momenta as flow-
ing along the arrows, and conserve four-momentum at each vertex. For a tree diagram,
this fixes the momenta on all the internal lines.

9. The value of a diagram consists of the following factors:

• for each incoming or outgoing scalar, $1$;

• for each incoming electron, $u_r(k)$;

• for each outgoing electron, $\bar u_s(p)$;

• for each incoming positron, $\bar v_r(k)$;

• for each outgoing positron, $v_s(p)$;

• for each vertex, $-ig_0$;

• for each internal scalar, $-i/(p^2 + M^2 - i\epsilon)$;

• for each internal fermion, $i(\slashed{p} - m)/(p^2 + m^2 - i\epsilon)$.

10. Spinor indices are contracted by starting at one end of a fermion line: specifically, the
end that has the arrow pointing away from the vertex. The factor associated with the
external line is either ū or v̄. Go along the complete fermion line, following the arrows
backwards, and write down (in order from left to right) the factors associated with the
vertices and propagators that you encounter. The last factor is either a u or v. Repeat
this procedure for the other fermion lines, if any.

11. The overall sign of a tree diagram is determined by drawing all contributing diagrams
in a standard form: all fermion lines horizontal, with their arrows pointing from left to
right, and with the left endpoints labeled in the same fixed order (from top to bottom); if
the ordering of the labels on the right endpoints of the fermion lines in a given diagram
is an even (odd) permutation of an arbitrarily chosen fixed ordering, then the sign of
that diagram is positive (negative).
12. Each closed fermion loop contributes an extra minus sign.
13. The value of $i\mathcal{M}$ is given by the sum of the values of the contributing diagrams.
14. $\langle f|iT|i\rangle = (Z_1)^{n_{\rm sca}/2}(Z_2)^{n_{\rm fer}/2}\; i\mathcal{M}\;\delta\big(\sum p_f - \sum p_i\big)$.
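
The sign rule in item 11 amounts to computing the parity of a permutation of the right-endpoint labels. A minimal sketch of that bookkeeping follows; the function name `permutation_sign` and the momentum labels are illustrative assumptions, not notation from the rules above.

```python
def permutation_sign(labels, reference):
    """Return +1 if `labels` is an even permutation of `reference`, -1 if odd."""
    positions = [reference.index(x) for x in labels]
    sign = 1
    # each inverted pair contributes one transposition
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if positions[i] > positions[j]:
                sign = -sign
    return sign


# Two diagrams for electron-electron scattering whose right endpoints
# (outgoing momenta) are exchanged carry a relative minus sign:
reference = ["p1", "p2"]
print(permutation_sign(["p1", "p2"], reference), permutation_sign(["p2", "p1"], reference))
```

Only the relative sign between diagrams is meaningful, since the reference ordering is arbitrary.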
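When evaluating closed fermion loops, we need to calculate the trace of the product of $n$
gamma matrices. Here, we list some frequently-used formulas:
$$\mathrm{Tr}[\text{odd no. of } \gamma^\mu\text{'s}] = 0, \qquad \mathrm{Tr}[\gamma_5\,(\text{odd no. of } \gamma^\mu\text{'s})] = 0; \tag{26.170a}$$
$$\mathrm{Tr}[\gamma^\mu\gamma^\nu] = -4\eta^{\mu\nu}, \qquad \mathrm{Tr}[\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma] = 4\left(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho}\right); \tag{26.170b}$$
$$\mathrm{Tr}\,\gamma_5 = 0, \qquad \mathrm{Tr}[\gamma_5\gamma^\mu\gamma^\nu] = 0, \qquad \mathrm{Tr}[\gamma_5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma] = -4i\varepsilon^{\mu\nu\rho\sigma}. \tag{26.170c}$$

Another category of gamma matrix combinations that we will eventually encounter is $\gamma^\mu\slashed{a}\cdots\gamma_\mu$
in $d$ dimensions. We also quote some useful results:
$$\begin{aligned}
\gamma^\mu\gamma_\mu &= -d, \qquad \gamma^\mu\slashed{a}\gamma_\mu = (d-2)\slashed{a}, \qquad \gamma^\mu\slashed{a}\slashed{b}\gamma_\mu = 4(ab) - (d-4)\slashed{a}\slashed{b}, \\
\gamma^\mu\slashed{a}\slashed{b}\slashed{c}\gamma_\mu &= 2\slashed{c}\slashed{b}\slashed{a} + (d-4)\slashed{a}\slashed{b}\slashed{c}.
\end{aligned} \tag{26.171}$$
The last relation follows from the previous one by anticommuting $\gamma^\mu$ through $\slashed{a}$ with $\{\gamma^\mu, \slashed{a}\} = -2a^\mu$ and then using $\{\slashed{b}, \slashed{c}\} = -2(bc)$.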
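
The four-dimensional trace identities can be verified numerically with any explicit representation obeying $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$ for the mostly-plus metric. The sketch below assumes the chiral-type representation $\gamma^0 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$, $\gamma^i = \begin{pmatrix}0 & \sigma_i\\ -\sigma_i & 0\end{pmatrix}$; this choice, and all helper names, are assumptions made for illustration. It checks the two- and four-matrix traces in equation 26.170b and $\gamma^\mu\gamma_\mu = -d$ at $d = 4$.

```python
def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def trace(A):
    return sum(A[i][i] for i in range(4))


def block(a, b, c, d):
    """4x4 matrix built from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def neg(m):
    return [[-x for x in row] for row in m]


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# assumed representation, satisfying {gamma^mu, gamma^nu} = -2 eta^{mu nu}
gamma = [block(ZERO, ONE, ONE, ZERO)] + [block(ZERO, s, neg(s), ZERO) for s in SIGMA]
eta = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # mostly-plus metric


def check_identities():
    idx = range(4)
    for m in idx:
        for n in idx:
            gg = matmul(gamma[m], gamma[n])
            if trace(gg) != -4 * eta[m][n]:
                return False
            for r in idx:
                for s in idx:
                    lhs = trace(matmul(gg, matmul(gamma[r], gamma[s])))
                    rhs = 4 * (eta[m][n] * eta[r][s] - eta[m][r] * eta[n][s]
                               + eta[m][s] * eta[n][r])
                    if lhs != rhs:
                        return False
    # gamma^mu gamma_mu = -d with d = 4 (the diagonal metric is its own inverse)
    squares = [matmul(g, g) for g in gamma]
    contracted = [[sum(eta[m][m] * squares[m][i][j] for m in idx) for j in idx] for i in idx]
    return all(contracted[i][j] == (-4 if i == j else 0) for i in idx for j in idx)


print(check_identities())
```

All matrix entries are $0$, $\pm 1$ or $\pm i$, so the comparisons are exact. The $\gamma_5$ trace in equation 26.170c is not checked, because its sign depends on conventions for $\gamma_5$ and $\varepsilon^{0123}$ that are fixed elsewhere in the notes.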

26.10 Functional determinant


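We consider a theory of a Dirac fermion $\Psi$ with
$$\mathcal{L} = i\bar\Psi\gamma^\mu\partial_\mu\Psi - m\bar\Psi\Psi - g\bar\Psi\Psi\phi, \tag{26.172}$$
where $\phi$ is a real scalar background field. We define the path integral
$$Z(\phi) = \int \mathcal{D}\bar\Psi\,\mathcal{D}\Psi\; e^{i\int d^4x\,\mathcal{L}}. \tag{26.173}$$

Recall that if we have $n$ complex Grassmann variables $\theta_i$, then we can evaluate Gaussian integrals
by the general formula
$$\int d^n\theta^*\, d^n\theta\; \exp[-i\theta_i^* M_{ij}\theta_j] \propto \det M. \tag{26.174}$$

In the case of the functional integral in equation 26.173, the “matrix” $M$ becomes
$$M_{\alpha\beta}(x,y) = [-i\slashed{\partial}_x + m + g\phi(x)]_{\alpha\beta}\,\delta(x-y). \tag{26.175}$$

Define
$$M_{0\alpha\beta}(x,y) \equiv (-i\slashed{\partial}_x + m)_{\alpha\beta}\,\delta(x-y), \qquad M_{1\beta\gamma}(y,z) \equiv \delta_{\beta\gamma}\delta(y-z) + igS_F(y-z)_{\beta\gamma}\phi(z). \tag{26.176}$$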

It can be shown that
$$M_{\alpha\gamma}(x,z) = \int d^4y\; M_{0\alpha\beta}(x,y)\,M_{1\beta\gamma}(y,z). \tag{26.177}$$

Furthermore, we have $M_1 = I - G$, where
$$I = \delta_{\alpha\beta}\delta(x-y), \qquad G_{\alpha\beta}(x,y) = -igS_F(x-y)_{\alpha\beta}\phi(y). \tag{26.178}$$

It follows that
$$\log Z[\phi] = \log\det(M_0M_1) = -\sum_{n=1}^{\infty}\frac{1}{n}\,\mathrm{Tr}\,G^n + \text{constant}, \tag{26.179}$$
where
$$\mathrm{Tr}\,G^n = (-ig)^n\int d^4x_1\cdots d^4x_n\; \mathrm{tr}\, S_F(x_1-x_2)\phi(x_2)\cdots S_F(x_n-x_1)\phi(x_1). \tag{26.180}$$

To better understand what this means, we rederive it in a different way. Consider treating the
$-g\phi\bar\Psi\Psi$ term in $\mathcal{L}$ as an interaction. This leads to a vertex that connects two $\Psi$ propagators;
the associated vertex factor is $-ig\phi(x)$. Then $\log Z[\phi]$ is given by
$$\log Z[\phi] = \sum_I C_I, \tag{26.181}$$
where the $C_I$ are the connected diagrams without external sources. The only connected diagrams
we can draw with these Feynman rules are closed fermion loops with $n$ vertices, where $n \ge 1$.
The diagram with $n$ vertices has an $n$-fold cyclic symmetry, leading to a symmetry factor of
$S = n$. The closed fermion loop implies a trace over the spinor indices. Thus the value of the
$n$-vertex diagram is
$$\frac{1}{n}(-ig)^n\int d^4x_1\cdots d^4x_n\; \mathrm{tr}\, S_F(x_1-x_2)\phi(x_2)\cdots S_F(x_n-x_1)\phi(x_1). \tag{26.182}$$
Summing up these diagrams, we find that we are missing the overall minus sign in equation
26.179. The appropriate conclusion is that we must associate an extra minus sign with each
closed fermion loop.
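
The expansion in equation 26.179 rests on the matrix identity $\log\det(I - G) = -\sum_{n\ge1}\mathrm{Tr}\,G^n/n$. A finite-dimensional numerical check is sketched below; the $2\times2$ matrix is an arbitrary illustrative choice with small eigenvalues so that the series converges, and the function names are assumptions.

```python
import math


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def log_det_series(G, terms=200):
    """-sum_{n>=1} Tr(G^n)/n; converges when every eigenvalue of G has modulus < 1."""
    n = len(G)
    power = [row[:] for row in G]  # holds G^k
    total = 0.0
    for k in range(1, terms + 1):
        total -= sum(power[i][i] for i in range(n)) / k
        power = matmul(power, G)
    return total


def log_det_exact_2x2(G):
    """log det(I - G) for a 2x2 matrix, computed directly."""
    return math.log((1 - G[0][0]) * (1 - G[1][1]) - G[0][1] * G[1][0])


G = [[0.2, 0.1], [0.05, -0.3]]
print(log_det_series(G), log_det_exact_2x2(G))
```

In the field-theory setting the matrix indices become the spinor index together with the space-time point, and $\mathrm{Tr}$ becomes the trace in equation 26.180.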
Chapter 27
Vector Field

27.1 Electromagnetic field and gauge invariance


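Under a Lorentz transformation, a vector field $A^\mu(x)$ transforms as
$$U(\Lambda)^{-1}A^\mu(x)U(\Lambda) = \Lambda^\mu{}_\nu A^\nu(\Lambda^{-1}x), \tag{27.1}$$
where
$$\Lambda = \exp\left(\frac{i}{2}\theta_{\rho\sigma}S_V^{\rho\sigma}\right), \qquad (S_V^{\rho\sigma})^\mu{}_\nu \equiv -i\left(\eta^{\rho\mu}\delta^\sigma_\nu - \eta^{\sigma\mu}\delta^\rho_\nu\right). \tag{27.2}$$
The vector representation of the Lorentz group is equivalent to the $(2,2)$ representation, as $(2,2)$
contains $j = 0$ and $1$, which is just right for a four-vector, whose time component is a scalar
under spatial rotations, and whose space components are a three-vector.
The electromagnetic field is a vector field. The Lagrangian density of the free EM field is
$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} \quad\text{where}\quad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu, \quad A^\mu \equiv (\phi, \mathbf{A}), \tag{27.3}$$
leading to the field equation
$$\partial_\mu F^{\mu\nu} = 0. \tag{27.4}$$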
EM field Aµ has 4 components, which would naively seem to tell us that it has 4 degrees of
freedom. But there are two related comments which will ensure that quantizing the gauge
field Aµ gives rise to 2 degrees of freedom, rather than 4:
• The field $A_0$ has no kinetic term $\dot A_0$ in the Lagrangian: it is not dynamical. This means
that if we are given some initial data $A_i$ and $\dot A_i$ at a time $t_0$, the field $A_0$ will be fully
determined by $\partial_\mu F^{\mu0} = 0$, which, expanding out, reads
$$\nabla^2A_0 = \nabla\cdot\frac{\partial\mathbf{A}}{\partial t}. \tag{27.5}$$
Thus $A_0$ is not independent: we do not get to specify $A_0$ on the initial time slice.
• The Lagrangian density of EM field is invariant under the gauge transformation

Aµ → Aµ + ∂µ λ(x). (27.6)

The seemingly infinite number of symmetries, one for each function $\lambda(x)$, is to be viewed
as a redundancy in our description. That is, two states related by a gauge symmetry are
to be identified: they are the same physical state. One way to see that this interpretation
is necessary is to notice that the field equation is not sufficient to specify the evolution of
$A_\mu$. The equation reads
$$(\eta_{\mu\nu}\partial^2 - \partial_\mu\partial_\nu)A^\nu = 0. \tag{27.7}$$

But the operator (ηµν ∂ 2 − ∂µ ∂ν ) is not invertible: it annihilates any function of the form
∂µ λ. This means that given any initial data, we have no way to uniquely determine Aµ
at a later time since we can not distinguish between Aµ and Aµ + ∂µ λ. This would
be problematic if we thought that Aµ is a physical object. However, if we are happy to
identify Aµ and Aµ +∂µ λ as corresponding to the same physical state, then our problems
disappear.
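
The non-invertibility of the operator in equation 27.7 is easy to see in momentum space, where $\partial_\mu \to ik_\mu$ and a pure-gauge configuration $\partial_\mu\lambda$ is proportional to $k_\mu$. The sketch below drops overall signs and factors of $i$, and uses an arbitrary integer momentum; the function names are illustrative assumptions.

```python
ETA = [-1, 1, 1, 1]  # diagonal of the mostly-plus metric


def kinetic_matrix(k_up):
    """K_{mu nu} = eta_{mu nu} k^2 - k_mu k_nu for a four-vector k^mu."""
    k_low = [ETA[m] * k_up[m] for m in range(4)]
    k2 = sum(k_low[m] * k_up[m] for m in range(4))
    return [[(ETA[m] * k2 if m == n else 0) - k_low[m] * k_low[n] for n in range(4)]
            for m in range(4)]


def apply(K, v_up):
    """Contract K_{mu nu} with an upper-index vector v^nu."""
    return [sum(K[m][n] * v_up[n] for n in range(4)) for m in range(4)]


k = [3, 1, -2, 5]
print(apply(kinetic_matrix(k), k))  # the pure-gauge direction A^nu ~ k^nu is annihilated
```

Because $k^\nu$ is a null vector of $K_{\mu\nu}$ for every $k$, the matrix has no inverse, which is the momentum-space statement that $A_\mu$ and $A_\mu + \partial_\mu\lambda$ cannot be distinguished by the field equation.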

The picture that emerges for the theory of electromagnetism is of an enlarged phase space,
foliated by gauge orbits. All states that lie along a given gauge orbit can be reached by a gauge
transformation and are identified. To make progress, we pick a representative from each gauge
orbit. It does not matter which representative we pick; after all, they are all physically equivalent.
But we should make sure that we pick a “good” gauge, in which we cut the orbits. Here
we will look at two different gauges:

• Lorenz Gauge: ∂ µ Aµ = 0.

We can always pick a representative configuration satisfying $\partial^\mu A_\mu = 0$. In fact this
condition does not pick a unique representative from the gauge orbit. We are always
free to make further gauge transformations with $\partial^\mu\partial_\mu\lambda = 0$, which also has non-trivial
solutions.

• Coulomb Gauge: ∇ · A = 0.

We can make use of the residual gauge transformations in Lorenz gauge to pick ∇·A =
0. Since A0 is fixed by equation 27.5, we have as a consequence A0 = 0. (A0 = 0 will no
longer hold in Coulomb gauge in the presence of charged matter.) The 3 components
of A satisfy a single constraint ∇ · A = 0, leaving behind just 2 degrees of freedom.
These will be identified with the two polarization states of the photon.

27.2 Canonical quantization of EM field

27.2.1 Canonical quantization in Coulomb gauge


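The canonical momentum of the EM field $A_\mu(x)$ in Coulomb gauge is
$$\pi^0 = \frac{\partial\mathcal{L}}{\partial\dot A_0} = 0, \qquad \pi^i = \frac{\partial\mathcal{L}}{\partial(\partial_0A_i)} = \dot A^i = -E^i. \tag{27.8}$$

The Hamiltonian of the field is
$$H = \int\mathcal{H}\, d^3x \quad\text{where}\quad \mathcal{H} = \frac{1}{2}(\boldsymbol{\pi}^2 + \mathbf{B}^2). \tag{27.9}$$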

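The momentum and spin angular momentum of the field are
$$P_i = -\int\pi^j\,\partial_iA_j\; d^3x, \qquad S^i = -i\int\pi^j(S_V^i)_j{}^kA_k\; d^3x \quad\text{where}\quad (S_V^i)_j{}^k = -i\epsilon^{ijk}. \tag{27.10}$$

Three pairs of $A_i$ and $\pi^i$ are not independent from each other. They must satisfy the constraint
equations
$$\nabla\cdot\mathbf{A} = 0, \qquad \nabla\cdot\boldsymbol{\pi} = 0. \tag{27.11}$$
The appropriate commutation relations for EM field are
$$\begin{aligned}
[A_i(\mathbf{x},t), A_j(\mathbf{y},t)] &= 0, \qquad [\pi^i(\mathbf{x},t), \pi^j(\mathbf{y},t)] = 0, \\
[A_i(\mathbf{x},t), \pi^j(\mathbf{y},t)] &= i\left(\delta_i^j - \frac{\partial_i\partial^j}{\nabla^2}\right)\delta(\mathbf{x}-\mathbf{y}) \equiv i\int\frac{d^3k}{(2\pi)^3}\left(\delta_i^j - \frac{k_ik^j}{\mathbf{k}^2}\right)e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}.
\end{aligned} \tag{27.12}$$
It can be verified that
$$\dot A_i = -i[A_i(\mathbf{x},t), H] = \pi_i(\mathbf{x},t), \qquad \dot\pi^i = -i[\pi^i(\mathbf{x},t), H] = \nabla^2A^i(\mathbf{x},t), \tag{27.13}$$
which is consistent with the field equation.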
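Using the field equation and gauge condition, EM field can be expanded as
$$\mathbf{A}(x) = \sum_{r=\pm}\int\widetilde{dp}\,\left[a_r(\mathbf{p})\boldsymbol{\epsilon}_r(\mathbf{p})e^{ipx} + a_r^\dagger(\mathbf{p})\boldsymbol{\epsilon}_r^*(\mathbf{p})e^{-ipx}\right] \quad\text{where}\quad p^2 = 0,\ \ \boldsymbol{\epsilon}\cdot\mathbf{p} = 0. \tag{27.14}$$
We will stick to the normalization
$$\boldsymbol{\epsilon}_r\cdot\boldsymbol{\epsilon}_s^* = \delta_{rs}. \tag{27.15}$$
The completeness relation for the polarization vectors is
$$\sum_{r=\pm}\epsilon_r^i(\mathbf{p})\,\epsilon_r^{*j}(\mathbf{p}) = \delta^{ij} - \frac{p^ip^j}{|\mathbf{p}|^2}. \tag{27.16}$$

Usually, we choose $\boldsymbol{\epsilon}_s$ to be the eigenvector of the space-part of $\hat{\mathbf{p}}\cdot\mathbf{S}_V$ with eigenvalue $s$. As a
result, $\boldsymbol{\epsilon}_+$ rotates anticlockwise in the direction of momentum, while $\boldsymbol{\epsilon}_-$ rotates clockwise.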
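
The normalization 27.15, the transversality condition and the completeness relation 27.16 can be checked numerically for an arbitrary momentum. The sketch below builds circular polarization vectors $(\mathbf{e}_1 \pm i\mathbf{e}_2)/\sqrt{2}$ from a right-handed orthonormal triad $(\mathbf{e}_1, \mathbf{e}_2, \hat{\mathbf{p}})$; the overall phases and the function names are assumptions made for illustration.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def unit(a):
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a]


def polarization_vectors(p):
    """Helicity +1 and -1 polarization vectors for three-momentum p."""
    p_hat = unit(p)
    # any seed vector not parallel to p works here
    seed = [1.0, 0.0, 0.0] if abs(p_hat[0]) < 0.9 else [0.0, 1.0, 0.0]
    e1 = unit(cross(seed, p_hat))
    e2 = cross(p_hat, e1)  # (e1, e2, p_hat) is right-handed
    s = 1 / math.sqrt(2)
    return {+1: [s * (e1[i] + 1j * e2[i]) for i in range(3)],
            -1: [s * (e1[i] - 1j * e2[i]) for i in range(3)]}


def completeness_error(p):
    """Largest deviation from sum_r eps_r^i eps_r^{*j} = delta^{ij} - p^i p^j / |p|^2."""
    eps = polarization_vectors(p)
    p_hat = unit(p)
    worst = 0.0
    for i in range(3):
        for j in range(3):
            lhs = sum(eps[r][i] * eps[r][j].conjugate() for r in (+1, -1))
            rhs = (1.0 if i == j else 0.0) - p_hat[i] * p_hat[j]
            worst = max(worst, abs(lhs - rhs))
    return worst


print(completeness_error([1.0, 2.0, 2.0]))
```

The right-hand side of 27.16 is the transverse projector that also appears in the commutator 27.12, so the same check confirms that $k_i(\delta^{ij} - k^ik^j/\mathbf{k}^2) = 0$.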
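Using equations 27.12 and 27.14, it can be shown that
$$[a_r(\mathbf{p}), a_s(\mathbf{q})] = [a_r^\dagger(\mathbf{p}), a_s^\dagger(\mathbf{q})] = 0, \qquad [a_r(\mathbf{p}), a_s^\dagger(\mathbf{q})] = (2\pi)^3\,2|\mathbf{p}|\,\delta_{rs}\,\delta(\mathbf{p}-\mathbf{q}). \tag{27.17}$$
In terms of $a_r(\mathbf{p})$ and $a_r^\dagger(\mathbf{q})$, we have
$$H = \sum_{s=\pm}\int\widetilde{dp}\;|\mathbf{p}|\,a_s^\dagger(\mathbf{p})a_s(\mathbf{p}) + 2E_0V \quad\text{where}\quad E_0 = \frac{1}{2}(2\pi)^{-3}\int d^3p\;|\mathbf{p}|, \tag{27.18}$$
and
$$\mathbf{P} = \sum_{s=\pm}\int\widetilde{dp}\;\mathbf{p}\,a_s^\dagger(\mathbf{p})a_s(\mathbf{p}), \qquad \mathbf{S} = \sum_{s=\pm}\int\widetilde{dp}\;s\,\hat{\mathbf{p}}\,a_s^\dagger(\mathbf{p})a_s(\mathbf{p}) + \cdots \tag{27.19}$$

Notice that the omitted terms vanish when $\mathbf{S}$ is projected onto $\hat{\mathbf{p}}$.
Finally, we can derive the Feynman propagator for the EM field in Coulomb gauge,
$$G_F(x-y)_{ij} \equiv \langle0|\mathrm{T}A_i(x)A_j(y)|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,\frac{-i}{p^2 - i\epsilon}\left(\delta_{ij} - \frac{p_ip_j}{|\mathbf{p}|^2}\right)e^{ip(x-y)}. \tag{27.20}$$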

27.2.2 Canonical quantization in Lorenz gauge


Modify the Lagrangian density of the EM field by introducing a new term,
$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu A^\mu)^2. \tag{27.21}$$
The field equation is now
$$\partial^2A^\mu - \left(1 - \frac{1}{\xi}\right)\partial^\mu(\partial\cdot A) = 0. \tag{27.22}$$

The conjugate momentum of the EM field is
$$\pi^0 = \frac{1}{\xi}\,\partial\cdot A = \frac{1}{\xi}(-\dot A_0 + \partial_iA^i), \qquad \pi^i = \dot A^i + \nabla^iA^0 = -E^i. \tag{27.23}$$
The Hamiltonian of the field is
$$H = \int\mathcal{H}\, d^3x \quad\text{where}\quad \mathcal{H} = \frac{1}{2}(\boldsymbol{\pi}^2 + \mathbf{B}^2 - \xi\pi^0\pi^0) + (\boldsymbol{\pi}\cdot\nabla)A_0 + \pi^0(\nabla\cdot\mathbf{A}). \tag{27.24}$$
We remark that the above Lagrangian density and equations of motion reduce to Maxwell’s
theory in the gauge $\partial\cdot A = 0$. So our choice corresponds to a class of Lorenz gauges with
parameter $\xi$. With this abuse of language (in fact we are not setting $\partial\cdot A = 0$, otherwise the
problems would come back) the value $\xi = 1$ is known as the Feynman gauge and $\xi = 0$ as
the Landau gauge. From now on we will work in the Feynman gauge, $\xi = 1$.
Then the equations of motion coincide with those of Maxwell theory in the Lorenz gauge.
As we no longer have $\pi^0 = 0$, we can impose the canonical commutation relations at
equal times
$$[A_\mu(\mathbf{x},t), A_\nu(\mathbf{y},t)] = 0, \qquad [\pi^\mu(\mathbf{x},t), \pi^\nu(\mathbf{y},t)] = 0, \qquad [A_\mu(\mathbf{x},t), \pi^\nu(\mathbf{y},t)] = i\delta_\mu^\nu\,\delta(\mathbf{x}-\mathbf{y}). \tag{27.25}$$
It follows that
$$[\dot A_\mu(\mathbf{x},t), \dot A_\nu(\mathbf{y},t)] = 0, \qquad [A_\mu(\mathbf{x},t), \dot A_\nu(\mathbf{y},t)] = i\eta_{\mu\nu}\,\delta(\mathbf{x}-\mathbf{y}). \tag{27.26}$$

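Using the field equation, EM field can be expanded as
$$A(x) = \sum_{\lambda=0}^{3}\int\widetilde{dp}\,\left[a_\lambda(p)\epsilon_\lambda(p)e^{ipx} + a_\lambda^\dagger(p)\epsilon_\lambda^*(p)e^{-ipx}\right], \tag{27.27}$$
where $p^2 = 0$ and $\epsilon_\lambda^\mu$ are a set of four independent 4-vectors. We choose $\epsilon_1^\mu$ and $\epsilon_2^\mu$ orthogonal
to $k^\mu$ and $n^\mu = (1,0,0,0)$, such that
$$\epsilon_{\lambda\mu}\,\epsilon^{*\mu}_{\lambda'} = \delta_{\lambda\lambda'}, \qquad \lambda,\lambda' = 1,2. \tag{27.28}$$

Then, we choose $\epsilon_3^\mu$ in the plane $(k^\mu, n^\mu)$ and perpendicular to $n^\mu$ such that
$$\epsilon_{3\mu}n^\mu = 0, \qquad \epsilon_{3\mu}\,\epsilon_3^{*\mu} = 1. \tag{27.29}$$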

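Finally we choose $\epsilon_0^\mu = n^\mu$. The vectors $\epsilon_1^\mu$ and $\epsilon_2^\mu$ are called transverse polarizations, while
$\epsilon_3^\mu$ and $\epsilon_0^\mu$ are the longitudinal and scalar polarizations, respectively. In general we can show that
$$\epsilon_\lambda\cdot\epsilon^*_{\lambda'} = \eta_{\lambda\lambda'}, \qquad \eta^{\lambda\lambda'}\epsilon_{\lambda\mu}\,\epsilon^*_{\lambda'\nu} = \eta_{\mu\nu}. \tag{27.30}$$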
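
Both relations in equation 27.30 can be verified exactly for a concrete set of polarization vectors. The sketch below uses real polarization vectors and an illustrative momentum direction $\hat{\mathbf{k}} = (1,2,2)/3$, chosen only because it admits a rational orthonormal triad; these specific numbers are assumptions, not values from the text.

```python
from fractions import Fraction as F

ETA = [-1, 1, 1, 1]  # diagonal of the mostly-plus metric


def dot(a, b):
    """Minkowski product eta_{mu nu} a^mu b^nu."""
    return sum(ETA[m] * a[m] * b[m] for m in range(4))


k_hat = [F(1, 3), F(2, 3), F(2, 3)]  # illustrative direction of the photon momentum
eps = [
    [F(1), F(0), F(0), F(0)],            # lambda = 0: scalar polarization, equal to n
    [F(0), F(2, 3), F(1, 3), F(-2, 3)],  # lambda = 1: transverse
    [F(0), F(2, 3), F(-2, 3), F(1, 3)],  # lambda = 2: transverse
    [F(0)] + k_hat,                      # lambda = 3: longitudinal
]

# eps_lambda . eps_lambda' should equal eta_{lambda lambda'}
gram = [[dot(eps[a], eps[b]) for b in range(4)] for a in range(4)]

# eta^{lambda lambda'} eps_lambda^mu eps_lambda'^nu should equal eta^{mu nu}
completeness = [[sum(ETA[a] * eps[a][m] * eps[a][n] for a in range(4)) for n in range(4)]
                for m in range(4)]

print(gram[0][0], completeness[0][0])
```

The entry `gram[0][0]` is $-1$: the scalar polarization has negative Minkowski norm, which is the origin of the wrong-sign commutator for $\lambda = 0$.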

Inserting the plane wave expansion we get
$$[a_\lambda(p), a_{\lambda'}(q)] = [a_\lambda^\dagger(p), a_{\lambda'}^\dagger(q)] = 0, \qquad [a_\lambda(p), a_{\lambda'}^\dagger(q)] = (2\pi)^3\,2|\mathbf{p}|\,\eta_{\lambda\lambda'}\,\delta(\mathbf{p}-\mathbf{q}), \tag{27.31}$$
showing that the quanta associated with $\lambda = 0$ have a commutation relation with the wrong
sign.
To see the problem with the sign we construct the one-particle state with scalar polarization,
that is
$$|1\rangle = \int\widetilde{dp}\; f(p)\,a_0^\dagger(p)\,|0\rangle, \tag{27.32}$$
and work out its norm
$$\langle1|1\rangle = -\langle0|0\rangle\int\widetilde{dp}\;|f(p)|^2. \tag{27.33}$$

The state $|1\rangle$ has a negative norm.


To solve this problem we note that we are not working anymore with the classical Maxwell
theory because we modified the Lagrangian. What we would like to do is to impose the con-
dition ∂ · A = 0, but that is impossible as an equation for operators. We can, however, require
that condition on a weaker form, as a condition only to be verified by the physical states. More
specifically, we require that the part of ∂ · A that contains the annihilation operator (positive
frequencies) annihilates the physical states,

$$\partial^\mu A_\mu^+\,|\psi\rangle = 0. \tag{27.34}$$

The states $|\psi\rangle$ can be written in the form
$$|\psi\rangle = |\psi_T\rangle\,|\phi\rangle, \tag{27.35}$$
where $|\psi_T\rangle$ is obtained from the vacuum with creation operators with transverse polarization
and $|\phi\rangle$ with scalar and longitudinal polarization. To understand the consequences it is enough
to analyze the states $|\phi\rangle$, as $\partial^\mu A_\mu^+$ contains only scalar and longitudinal polarizations:
$$\partial^\mu A_\mu^+ = i\sum_{\lambda=0,3}\int\widetilde{dp}\; a_\lambda(p)\,(p\cdot\epsilon_\lambda(p))\,e^{ipx}. \tag{27.36}$$

Therefore the previous condition becomes
$$\sum_{\lambda=0,3}[p\cdot\epsilon_\lambda(p)]\,a_\lambda(p)\,|\phi\rangle = 0. \tag{27.37}$$

The condition is equivalent to
$$[a_0(p) - a_3(p)]\,|\phi\rangle = 0. \tag{27.38}$$



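We can construct $|\phi\rangle$ as a linear combination of states $|\phi_n\rangle$ with $n$ scalar or longitudinal photons:
$$|\phi\rangle = C_0|\phi_0\rangle + C_1|\phi_1\rangle + \cdots \quad\text{where}\quad |\phi_0\rangle \equiv |0\rangle. \tag{27.39}$$
The states $|\phi_n\rangle$ are eigenstates of the number operator for scalar or longitudinal photons,
$$N'|\phi_n\rangle = n|\phi_n\rangle \quad\text{where}\quad N' \equiv \int\widetilde{dp}\,\left[a_3^\dagger(p)a_3(p) - a_0^\dagger(p)a_0(p)\right]. \tag{27.40}$$

Since $n\langle\phi_n|\phi_n\rangle = \langle\phi_n|N'|\phi_n\rangle = 0$, we have $\langle\phi_n|\phi_n\rangle = \delta_{n0}$, i.e., for $n \neq 0$, the state $|\phi_n\rangle$ has
zero norm. So the norm for the general physical state $|\phi\rangle$ is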

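$$\langle\phi|\phi\rangle = |C_0|^2 \ge 0. \tag{27.41}$$

Define
$$N_{LS}(p) \equiv a_3^\dagger(p)a_3(p) - a_0^\dagger(p)a_0(p), \qquad N_T(p) \equiv a_1^\dagger(p)a_1(p) + a_2^\dagger(p)a_2(p). \tag{27.42}$$

For physical states $|\psi\rangle = |\psi_T\rangle|\phi\rangle$, we have
$$\langle\psi|N_{LS}(p)|\psi\rangle = 0, \qquad \langle\psi|N_T(p)|\psi\rangle = |C_0|^2\,\langle\psi_T|N_T(p)|\psi_T\rangle. \tag{27.43}$$

In terms of $a$ and $a^\dagger$, the Hamiltonian of the field is
$$H = \int\widetilde{dp}\;|\mathbf{p}|\,[N_{LS}(p) + N_T(p)] + 2E_0V. \tag{27.44}$$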
We can see that ⟨ψ|H|ψ⟩ / ⟨ψ|ψ⟩ = ⟨ψT |HT |ψT ⟩ / ⟨ψT |ψT ⟩. The arbitrariness of Ci of the
physical states does not affect the physical observables. Only the physical transverse polariza-
tions contribute to the result.

It is important to note that although for the average values of the physical observables only the
transverse polarizations contribute, the scalar and longitudinal polarizations are necessary for
the consistency of the theory. In particular they show up when we consider complete sums
over the intermediate states.

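The Feynman propagator for EM field in Feynman gauge is
$$G_F(x-y)_{\mu\nu} \equiv \langle0|\mathrm{T}A_\mu(x)A_\nu(y)|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,\frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon}\,e^{ip(x-y)}. \tag{27.45}$$

It is easy to verify that $G_F(x-y)_{\mu\nu}$ is the Green’s function of the field equation,
$$\partial^2G_F(x-y)_{\mu\nu} = i\eta_{\mu\nu}\,\delta(x-y). \tag{27.46}$$

The propagator in an arbitrary $\xi$ gauge is
$$G_F(x-y)_{\mu\nu} = \int\frac{d^4p}{(2\pi)^4}\left[\frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon} + i(1-\xi)\frac{p_\mu p_\nu}{(p^2 - i\epsilon)^2}\right]e^{ip(x-y)}. \tag{27.47}$$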
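
That the $\xi$-gauge propagator 27.47 is the Green's function of the field equation 27.22 can be checked algebraically in momentum space. With $\partial_\mu \to ip_\mu$, the operator in 27.22 becomes $K^{\mu\nu} = -p^2\eta^{\mu\nu} + (1 - 1/\xi)p^\mu p^\nu$, and writing $G_F = i\,g$ the claim is $K^{\mu\nu}g_{\nu\rho} = \delta^\mu_\rho$. The sketch below drops the $i\epsilon$ and evaluates this at an arbitrary off-shell momentum with exact rational arithmetic; the function name and the sample values are assumptions.

```python
from fractions import Fraction as F

ETA = [-1, 1, 1, 1]  # diagonal of the mostly-plus metric


def kinetic_times_propagator(p_up, xi):
    """Return K^{mu nu} g_{nu rho}, where G_F = i g and K is the operator of eq. 27.22."""
    p_up = [F(x) for x in p_up]
    xi = F(xi)
    p_low = [ETA[m] * p_up[m] for m in range(4)]
    p2 = sum(p_low[m] * p_up[m] for m in range(4))  # must be non-zero (off-shell)
    K = [[(-p2 * ETA[m] if m == n else 0) + (1 - 1 / xi) * p_up[m] * p_up[n]
          for n in range(4)] for m in range(4)]
    g = [[(-F(ETA[n]) / p2 if n == r else 0) + (1 - xi) * p_low[n] * p_low[r] / p2 ** 2
          for r in range(4)] for n in range(4)]
    return [[sum(K[m][n] * g[n][r] for n in range(4)) for r in range(4)] for m in range(4)]


result = kinetic_times_propagator([1, 2, 0, 1], xi=3)
print(result == [[1 if m == r else 0 for r in range(4)] for m in range(4)])
```

The cross terms proportional to $p^\mu p_\rho$ cancel for every $\xi \neq 0$, which is why the gauge-fixing term makes the kinetic operator invertible.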

27.3 Perturbation theory for canonical quantization


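The Lagrangian density for quantum electrodynamics (QED) of Dirac fermion $\Psi$ is
$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\Psi(i\slashed{\partial} - m_0)\Psi + e_0j^\mu A_\mu \quad\text{where}\quad j^\mu \equiv \bar\Psi\gamma^\mu\Psi. \tag{27.48}$$
Usually, we also define a covariant derivative
$$D_\mu\Psi \equiv \partial_\mu\Psi - ie_0A_\mu\Psi. \tag{27.49}$$

So the Lagrangian density can also be written as
$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\Psi(i\slashed{D} - m_0)\Psi. \tag{27.50}$$
The Lagrangian density is invariant under the gauge transformation
$$A_\mu(x) \to A_\mu(x) + \frac{1}{e_0}\partial_\mu\alpha(x), \qquad \Psi(x) \to e^{i\alpha(x)}\Psi(x). \tag{27.51}$$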

27.3.1 Coulomb gauge


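In Coulomb gauge, we have the constraint equation
$$\nabla\cdot\mathbf{A} = 0, \qquad \nabla^2A_0 = e_0j^0. \tag{27.52}$$

The solution for $A_0$ is
$$A_0(\mathbf{x},t) = -e_0\int d^3x'\;\frac{j^0(\mathbf{x}',t)}{4\pi|\mathbf{x}-\mathbf{x}'|}. \tag{27.53}$$
The Hamiltonian of the QED is
$$H = H_D + H_M + H_{\rm int}, \tag{27.54}$$
where
$$\begin{aligned}
H_D &\equiv \int d^3x\,\left[-\Pi(\vec\alpha\cdot\nabla + i\beta m)\Psi\right], \qquad H_M \equiv \int d^3x\;\frac{1}{2}(\boldsymbol{\pi}^2 + \mathbf{B}^2), \\
H_{\rm int} &\equiv \int d^3x\,\left[-e_0\,\mathbf{j}\cdot\mathbf{A} + \frac{e_0^2}{2}\int d^3x'\;\frac{j^0(\mathbf{x})\,j^0(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\right].
\end{aligned} \tag{27.55}$$
The perturbation expansion of correlation function is given by
$$\langle\Omega|\mathrm{T}\{\Psi(x)\bar\Psi(y)A(z)\}|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\left\langle0\left|\mathrm{T}\left\{\Psi_I(x)\bar\Psi_I(y)A_I(z)\exp\left[-i\int_{-T}^{T}dt\,H_I\right]\right\}\right|0\right\rangle}{\left\langle0\left|\mathrm{T}\left\{\exp\left[-i\int_{-T}^{T}dt\,H_I\right]\right\}\right|0\right\rangle}, \tag{27.56}$$
where $\int_{-T}^{T}dt\,H_I$ can be written as
$$-\int d^4x\; e_0\,\bar\Psi_I\boldsymbol{\gamma}\Psi_I\cdot\mathbf{A}_I + \int d^4x\, d^4x'\;\frac{e_0^2\,\delta(t-t')}{8\pi|\mathbf{x}-\mathbf{x}'|}\,\bar\Psi_I(\mathbf{x},t)\gamma^0\Psi_I(\mathbf{x},t)\,\bar\Psi_I(\mathbf{x}',t')\gamma^0\Psi_I(\mathbf{x}',t'). \tag{27.57}$$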

Wick’s theorem for photons is given by

T {AI (x1 )AI (x2 )AI (x3 ) · · · } = N {AI (x1 )AI (x2 )AI (x3 ) · · · + all possible contractions} .
(27.58)

Example:

⟨0|T {AIi (x1 )AIj (x2 )AIk (x3 )AIl (x4 )}|0⟩
= GF (x1 − x2 )ij GF (x3 − x4 )kl + GF (x1 − x3 )ik GF (x2 − x4 )jl + GF (x1 − x4 )il GF (x2 − x3 )jk .

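Now we can derive the Feynman rule for QED theory. Firstly, we evaluate the term
$$\left\langle0\left|\mathrm{T}\left\{\Psi_{Ia}(x)\bar\Psi_{Ib}(y)A_{Ii}(z)\,(ie_0)\int d^4w\;\bar\Psi_I(w)\boldsymbol{\gamma}\Psi_I(w)\cdot\mathbf{A}_I(w)\right\}\right|0\right\rangle. \tag{27.59}$$

After contraction, it takes the form of
$$\begin{aligned}
&-(ie_0)\,S_F(x-y)_{ab}\int d^4w\; G_F(z-w)_{ik}\,\mathrm{Tr}[\gamma^kS_F(w-w)] \\
&+(ie_0)\int d^4w\; G_F(w-z)_{ik}\,[S_F(x-w)\gamma^kS_F(w-y)]_{ab}
\end{aligned} \tag{27.60}$$
and can be represented by the diagram 27.1.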

Figure 27.1: Feynman diagram representation of perturbation expansion.

Secondly, we evaluate the term
$$\left\langle0\left|\mathrm{T}\left\{\Psi_{Ia}(x)\bar\Psi_{Ib}(y)\int d^4w\, d^4w'\;\frac{-ie_0^2\,\delta(w^0-w'^0)}{4\pi|\mathbf{w}-\mathbf{w}'|}\,\frac{1}{2}\,\bar\Psi_I(w)\gamma^0\Psi_I(w)\,\bar\Psi_I(w')\gamma^0\Psi_I(w')\right\}\right|0\right\rangle. \tag{27.61}$$
After contraction, it becomes
$$\begin{aligned}
&-(ie_0)^2\int d^4w\, d^4w'\;\frac{i\delta(w^0-w'^0)}{4\pi|\mathbf{w}-\mathbf{w}'|}\,[S_F(x-w)\gamma^0S_F(w-y)]_{ab}\,\mathrm{Tr}[\gamma^0S_F(w'-w')] \\
&+(ie_0)^2\int d^4w\, d^4w'\;\frac{i\delta(w^0-w'^0)}{4\pi|\mathbf{w}-\mathbf{w}'|}\,[S_F(x-w)\gamma^0S_F(w-w')\gamma^0S_F(w'-y)]_{ab} \\
&+\frac{1}{2}(ie_0)^2\,S_F(x-y)_{ab}\int d^4w\, d^4w'\;\frac{i\delta(w^0-w'^0)}{4\pi|\mathbf{w}-\mathbf{w}'|}\,\mathrm{Tr}[\gamma^0S_F(w-w)]\,\mathrm{Tr}[\gamma^0S_F(w'-w')] \\
&-\frac{1}{2}(ie_0)^2\,S_F(x-y)_{ab}\int d^4w\, d^4w'\;\frac{i\delta(w^0-w'^0)}{4\pi|\mathbf{w}-\mathbf{w}'|}\,\mathrm{Tr}[\gamma^0S_F(w-w')\gamma^0S_F(w'-w)],
\end{aligned} \tag{27.62}$$
and can be represented by the diagram 27.2.

Figure 27.2: Feynman diagram representation of perturbation expansion.

It seems that Feynman rules in Coulomb gauge would be rather messy. However, the offending
non-local interaction comes from the $A_0$ component of the gauge field, and we could try to
redefine the propagator to include a $G_F(x-y)_{00}$ piece which will capture this term. Since
$$\frac{i\delta(w^0-w'^0)}{4\pi|\mathbf{w}-\mathbf{w}'|} = \int\frac{d^4p}{(2\pi)^4}\,\frac{i\,e^{ip(w-w')}}{|\mathbf{p}|^2}, \tag{27.63}$$

we can combine the non-local interaction with the transverse photon propagator by defining
a new photon propagator
$$G_F(p)_{\mu\nu} \equiv \begin{cases}\dfrac{i}{|\mathbf{p}|^2}, & \mu,\nu = 0, \\[2ex] \dfrac{-i}{p^2 - i\epsilon}\left(\delta_{ij} - \dfrac{p_ip_j}{|\mathbf{p}|^2}\right), & \mu = i \neq 0,\ \nu = j \neq 0, \\[2ex] 0, & \text{otherwise}.\end{cases} \tag{27.64}$$

With this propagator, the wavy photon line now carries a $\mu,\nu = 0,1,2,3$ index, with the extra
$\mu = 0$ component taking care of the instantaneous interaction.

The Feynman rules for QED in Coulomb gauge are:

1. For each fermion propagator from $y$ to $x$, multiply by $P = S_F(x-y)$.

2. For each vector propagator, multiply by $P = G_F(x-y)$.

3. For each vertex, multiply by $V = (ie_0\gamma^\mu)\int d^4w$.

4. For each external point, multiply by $E = 1$.

5. Divide by the symmetry factor.

27.3.2 Lorenz gauge


In Lorenz gauge, we have
$$H_{\rm int} = -e_0\bar\Psi\gamma^\mu\Psi A_\mu. \tag{27.65}$$
The Feynman rules for QED in Lorenz gauge are the same as those in Coulomb gauge, except
that the photon propagator is
$$G_F(p)_{\mu\nu} = \frac{-i\eta_{\mu\nu}}{p^2 - i\epsilon} + i(1-\xi)\frac{p_\mu p_\nu}{(p^2 - i\epsilon)^2}. \tag{27.66}$$

27.4 Path integral quantization


27.4.1 Path integral formulation for free EM field
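The time-ordered two point correlation function for free EM field is given by
$$\langle\Omega|\mathrm{T}A_H(x_1)A_H(x_2)|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}A\,\exp\left[i\int_{-T}^{T}d^4x\,\left(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}\right)\right]A(x_1)A(x_2)}{\int\mathcal{D}A\,\exp\left[i\int_{-T}^{T}d^4x\,\left(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}\right)\right]}. \tag{27.67}$$
The generating functional of the correlation function is
$$Z[J] = \int\mathcal{D}A\,\exp\left[i\int d^4x\,\left(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu} + J^\mu A_\mu\right)\right]. \tag{27.68}$$
The action for free EM field can be put into the quadratic form of $A_\mu$ as
$$S \equiv \int d^4x\,\left(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}\right) = \frac{1}{2}\int d^4x\; A_\mu(x)\left(\partial^2\eta^{\mu\nu} - \partial^\mu\partial^\nu\right)A_\nu(x). \tag{27.69}$$
Notice that the operator $(\partial^2\eta^{\mu\nu} - \partial^\mu\partial^\nu)$ is singular, since for any $\alpha(x)$,
$$(\partial^2\eta^{\mu\nu} - \partial^\mu\partial^\nu)\,\partial_\mu\alpha(x) = 0. \tag{27.70}$$

This difficulty is due to gauge symmetry. The functional integral is badly defined because we are
redundantly integrating over a continuous infinity of physically equivalent field configurations.
To fix the problem, we would like to isolate the interesting part of the functional integral, which
counts each physical configuration only once. Let $G(A)$ be some function that we wish to set
equal to zero as a gauge-fixing condition. We could constrain the functional integral to cover
only the configurations with $G(A) = 0$ by inserting a functional delta function, $\delta[G(A)]$.
To do so, we insert 1 in the path integral:
$$1 = \int\mathcal{D}\alpha(x)\;\delta\{G[A(\alpha)]\}\,\det\left(\frac{\delta G}{\delta\alpha}\right) \quad\text{where}\quad A_\mu[\alpha(x)] = A_\mu(x) + \frac{1}{e_0}\partial_\mu\alpha(x). \tag{27.71}$$
We set the gauge fixing function as $G(A) = \partial^\mu A_\mu - \omega(x)$, so that
$$G[A(\alpha)] = \partial^\mu A_\mu + \frac{1}{e_0}\partial^2\alpha - \omega(x). \tag{27.72}$$
Since $\det(\delta G/\delta\alpha) = \det(\partial^2)/e_0$ is independent of $A_\mu(x)$ and $\alpha(x)$, we have
$$Z[0] = \det\left(\frac{\delta G}{\delta\alpha}\right)\int\mathcal{D}\alpha\int\mathcal{D}A\; e^{iS[A]}\,\delta\{G[A(\alpha)]\}. \tag{27.73}$$
Now change the integration variable from $A$ to $A(\alpha)$. This is a simple shift, so $\mathcal{D}A = \mathcal{D}A(\alpha)$.
Also, by gauge invariance, $S[A] = S[A(\alpha)]$. Since $A(\alpha)$ is now just a dummy integration
variable, we can rename it back to $A$, leading to
$$Z[0] = \det\left(\frac{\delta G}{\delta\alpha}\right)\int\mathcal{D}\alpha\int\mathcal{D}A\; e^{iS[A]}\,\delta[\partial^\mu A_\mu - \omega(x)]. \tag{27.74}$$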

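Notice that equation 27.74 holds for any $\omega(x)$. So we have
$$\begin{aligned}
Z[0] &= N(\xi)\int\mathcal{D}\omega\,\exp\left[-i\int d^4x\;\frac{\omega^2}{2\xi}\right]\det\left(\frac{\delta G}{\delta\alpha}\right)\int\mathcal{D}\alpha\int\mathcal{D}A\; e^{iS[A]}\,\delta[\partial^\mu A_\mu - \omega(x)] \\
&\propto \int\mathcal{D}\alpha\int\mathcal{D}A\; e^{iS[A]}\exp\left[-i\int d^4x\;\frac{1}{2\xi}(\partial^\mu A_\mu)^2\right] \\
&\propto \int\mathcal{D}A\,\exp\left\{\frac{i}{2}\int d^4x\; A_\mu\left(\partial^2\eta^{\mu\nu} - \partial^\mu\partial^\nu + \frac{1}{\xi}\partial^\mu\partial^\nu\right)A_\nu\right\}.
\end{aligned} \tag{27.75}$$
The generating functional therefore becomes
$$Z[J] \propto \int\mathcal{D}A\,\exp\left\{i\int d^4x\,\left[\frac{1}{2}A_\mu\left(\partial^2\eta^{\mu\nu} - \partial^\mu\partial^\nu + \frac{1}{\xi}\partial^\mu\partial^\nu\right)A_\nu + J^\mu A_\mu\right]\right\}. \tag{27.76}$$
Using
$$\left(\partial^2\eta^{\mu\nu} - \partial^\mu\partial^\nu + \frac{1}{\xi}\partial^\mu\partial^\nu\right)G_F(x-y)_{\nu\rho} = i\delta^\mu_\rho\,\delta(x-y), \tag{27.77}$$
it can be derived that
$$Z[J] = Z[0]\exp\left[-\frac{1}{2}\int d^4x\, d^4y\; J^\mu(x)\,G_F(x-y)_{\mu\nu}\,J^\nu(y)\right]. \tag{27.78}$$
The correlation function can be calculated as
$$\langle0|\mathrm{T}A_H(x_1)A_H(x_2)|0\rangle = Z[0]^{-1}\left(-i\frac{\delta}{\delta J(x_1)}\right)\left(-i\frac{\delta}{\delta J(x_2)}\right)Z[J]\,\Big|_{J=0} = G_F(x_1-x_2). \tag{27.79}$$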

27.4.2 Perturbation theory for path integral quantization


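We use QED as an example. The Lagrangian density of QED and the path integral measure
$\mathcal{D}\bar\Psi\,\mathcal{D}\Psi$ are both invariant under gauge transformation. So it can be shown that
$$\begin{aligned}
Z[0] &\equiv \int\mathcal{D}A\,\mathcal{D}\bar\Psi\,\mathcal{D}\Psi\; e^{iS[A,\bar\Psi,\Psi]} \\
&\propto \int\mathcal{D}A\,\mathcal{D}\bar\Psi\,\mathcal{D}\Psi\; e^{iS[A,\bar\Psi,\Psi]}\exp\left[-i\int d^4x\;\frac{1}{2\xi}(\partial^\mu A_\mu)^2\right].
\end{aligned} \tag{27.80}$$

Define
$$\mathcal{L}_0 \equiv -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial^\mu A_\mu)^2 + \bar\Psi(i\slashed{\partial} - m_0)\Psi, \qquad \mathcal{L}_1 \equiv e_0\bar\Psi\gamma^\mu\Psi A_\mu. \tag{27.81}$$
We have
$$\begin{aligned}
Z[J] &\propto \int\mathcal{D}A\,\mathcal{D}\bar\Psi\,\mathcal{D}\Psi\; e^{i\int d^4x\,[\mathcal{L}_0 + \mathcal{L}_1 + JA + \bar\eta\Psi + \bar\Psi\eta]} \\
&\propto e^{i\int d^4x\,\mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(x)},\ \frac{1}{i}\frac{\delta}{\delta\bar\eta(x)},\ -\frac{1}{i}\frac{\delta}{\delta\eta(x)}\right)}\int\mathcal{D}\bar\Psi\,\mathcal{D}\Psi\,\mathcal{D}A\; e^{i\int d^4y\,(\mathcal{L}_0 + JA + \bar\eta\Psi + \bar\Psi\eta)} \\
&\propto e^{i\int d^4x\,\mathcal{L}_1\left(\frac{1}{i}\frac{\delta}{\delta J(x)},\ \frac{1}{i}\frac{\delta}{\delta\bar\eta(x)},\ -\frac{1}{i}\frac{\delta}{\delta\eta(x)}\right)}\exp\left\{-\int d^4y\, d^4z\,\left[\frac{1}{2}J(y)G_F(y-z)J(z) + \bar\eta(y)S_F(y-z)\eta(z)\right]\right\},
\end{aligned} \tag{27.82}$$
which can be further expanded into
$$\begin{aligned}
Z[J] \propto{} &\sum_{V=0}^{\infty}\frac{1}{V!}\left[ie_0\int d^4x\,\left(\frac{1}{i}\frac{\delta}{\delta J^\mu(x)}\right)\left(-\frac{1}{i}\frac{\delta}{\delta\eta(x)}\right)\gamma^\mu\left(\frac{1}{i}\frac{\delta}{\delta\bar\eta(x)}\right)\right]^{V} \\
&\times\sum_{P_1=0}^{\infty}\frac{1}{P_1!}\left[-\frac{1}{2}\int d^4y_1\, d^4z_1\; J(y_1)G_F(y_1-z_1)J(z_1)\right]^{P_1} \\
&\times\sum_{P_2=0}^{\infty}\frac{1}{P_2!}\left[-\int d^4y_2\, d^4z_2\; \bar\eta(y_2)S_F(y_2-z_2)\eta(z_2)\right]^{P_2}.
\end{aligned} \tag{27.83}$$

If we focus on a term with particular values of $V$, $P_1$ and $P_2$, the number of surviving vector
sources will be $E_1 = 2P_1 - V$ and fermion sources $E_2 = 2P_2 - 2V$. Those terms can
be organized using Feynman diagrams. In these diagrams, a wavy line segment stands for a
vector propagator $G_F(x-y)$, a line with an arrow pointing from $y$ to $x$ for a fermion propagator
$S_F(x-y)$, a filled circle at one end of a wavy line segment for a vector source $i\int d^4x\, J(x)$, a
filled circle at the start of a line with an arrow for a fermion source $i\int d^4x\, \eta(x)$, a filled circle
at the end of a line with an arrow for an anti-fermion source $i\int d^4x\, \bar\eta(x)$, and a vertex joining
three line segments for $ie_0\gamma^\mu\int d^4x$.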

27.4.3 Ward-Takahashi identity (1)


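The Noether current of the symmetry transformation $\Psi \to e^{i\alpha}\Psi \approx \Psi + i\alpha\Psi$ is $j^\mu = \bar\Psi\gamma^\mu\Psi$.
Since the path integral measure is invariant under the transformation, we can apply equation
25.52, leading to
$$\begin{aligned}
&ie_0\,\partial_\mu\langle\Omega|\mathrm{T}j^\mu(x)\Psi(x_1)\bar\Psi(x_2)|\Omega\rangle \\
&= -ie_0\,\delta(x-x_1)\,\langle\Omega|\mathrm{T}\Psi(x_1)\bar\Psi(x_2)|\Omega\rangle + ie_0\,\delta(x-x_2)\,\langle\Omega|\mathrm{T}\Psi(x_1)\bar\Psi(x_2)|\Omega\rangle.
\end{aligned} \tag{27.84}$$

The term $ie_0\langle\Omega|\mathrm{T}j^\mu(x)\Psi(x_1)\bar\Psi(x_2)|\Omega\rangle$ can be represented by the diagram 27.3.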

Figure 27.3: Feynman diagram representation of correlation function.

Using Feynman diagrams, we can derive that
$$\langle A_\nu(y)\rangle = \int d^4x\; G_F(x-y)_{\mu\nu}\,ie_0\langle j^\mu(x)\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{-ipy}\,G_F(p)_{\mu\nu}\int d^4x\; e^{ipx}\,ie_0\langle j^\mu(x)\rangle. \tag{27.85}$$
It follows that
$$\int d^4y\;\langle A_\nu(y)\rangle\,e^{iky} = G_F(k)_{\mu\nu}\int d^4x\; e^{ikx}\,ie_0\langle j^\mu(x)\rangle. \tag{27.86}$$

Now compute the Fourier transform of equation 27.84 by applying
$$\int d^4x\; e^{ikx}\int d^4x_1\; e^{-iqx_1}\int d^4x_2\; e^{ipx_2}. \tag{27.87}$$

Using diagram 27.3 and equation 27.86, we obtain an identity represented by diagram 27.4,
called the Ward-Takahashi identity.

Figure 27.4: Feynman diagram representation of Ward identity. Notice that the external photon leg
is cut off, while the external fermion legs remain.

Diagram 27.4 can be further generalized to the case with $n$ external fermions. Another proof of
the Ward-Takahashi identity, by analyzing Feynman diagrams directly, can be found in section 7.4
of An Introduction to Quantum Field Theory (M. E. Peskin & D. V. Schroeder).

27.5 Exact photon propagator in QED


27.5.1 Photon self-energy
The exact photon propagator in QED, G(x)µν ≡ ⟨Ω|TAµ (x)Aν (0)|Ω⟩C , can be represented
by diagram 27.5.

Figure 27.5: Feynman diagram representation of exact photon propagator.

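Define $i\Pi^{\mu\nu}$ to be the sum of all 1-particle-irreducible insertions into the photon propagator.
So we have
$$G(k) = G_F(k) + G_F(k)[i\Pi(k)]G_F(k) + \cdots = G_F(k)\,\frac{1}{1 - i\Pi(k)G_F(k)}. \tag{27.88}$$
It follows that
$$(iG)^{-1} = (iG_F)^{-1} - \Pi. \tag{27.89}$$
Recall that
$$iG_F(k)_{\mu\nu} = \frac{1}{k^2 - i\epsilon}\left(P^T_{\mu\nu} + \xi P^L_{\mu\nu}\right) \quad\text{where}\quad P^T_{\mu\nu} \equiv \eta_{\mu\nu} - \frac{k_\mu k_\nu}{k^2}, \quad P^L_{\mu\nu} \equiv \frac{k_\mu k_\nu}{k^2}. \tag{27.90}$$
We can derive that
$$(iG_F)^{-1}_{\mu\nu} = k^2\left(P^T_{\mu\nu} + \frac{1}{\xi}P^L_{\mu\nu}\right). \tag{27.91}$$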

We may also decompose $\Pi^{\mu\nu}$ into a transverse part and a longitudinal part,
\[
\Pi^{\mu\nu} = P_T^{\mu\nu} f_T(k^2) + P_L^{\mu\nu} f_L(k^2) = \eta^{\mu\nu} f_T + \frac{k^\mu k^\nu}{k^2}(f_L - f_T) . \tag{27.92}
\]
Using equations 27.89, 27.91 and 27.92, we can work out the decomposition of $G(k)_{\mu\nu}$ as
\[
G(k)_{\mu\nu} = \frac{-i}{k^2 - f_T(k^2)}\,P^T_{\mu\nu} + \frac{-i}{k^2/\xi - f_L(k^2)}\,P^L_{\mu\nu} . \tag{27.93}
\]
So if $f_{T,L}(k^2 = 0) \neq 0$, a mass will be generated for the photon. Because $\Pi(k)$ comes from
1PI diagrams, it should not be singular at $k^2 = 0$, and so we have $f_L - f_T = O(k^2)$.
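
The step from equation 27.89 to equation 27.93 can be spelled out with the projector algebra. The following worked sketch assumes only the relations $P_T^2 = P_T$, $P_L^2 = P_L$ and $P_TP_L = 0$ (indices contracted with $\eta$), which follow from the definitions in equation 27.90.

```latex
\begin{align*}
(iG)^{-1}_{\mu\nu} &= (iG_F)^{-1}_{\mu\nu} - \Pi_{\mu\nu}
  = \left[k^2 - f_T(k^2)\right] P^T_{\mu\nu}
  + \left[\frac{k^2}{\xi} - f_L(k^2)\right] P^L_{\mu\nu}, \\
iG_{\mu\nu} &= \frac{1}{k^2 - f_T(k^2)}\, P^T_{\mu\nu}
  + \frac{1}{k^2/\xi - f_L(k^2)}\, P^L_{\mu\nu}.
\end{align*}
```

Contracting the two lines gives $P^T + P^L = \eta$, which confirms the inversion; multiplying the second line by $-i$ reproduces equation 27.93.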

27.5.2 Masslessness of photon in QED


The generating functional for connected diagrams is given by
\[
E[J, \eta, \bar\eta] = i\log Z[J, \eta, \bar\eta] . \tag{27.94}
\]
The exact photon propagator can then be obtained as
\[
G(x-y)_{\mu\nu} = i \left.\frac{\delta^2 E[J, \eta, \bar\eta]}{\delta J^\mu(x)\,\delta J^\nu(y)}\right|_{J,\eta,\bar\eta=0} . \tag{27.95}
\]

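Under an infinitesimal gauge transformation, we have $\delta A_\mu = \partial_\mu\lambda$, $\delta\Psi = ie_0\lambda\Psi$ and $\delta\bar\Psi = -ie_0\lambda\bar\Psi$. So the variation of the action $S = \int d^4x\,(\mathcal{L}_0 + \mathcal{L}_1 + JA + \bar\eta\Psi + \bar\Psi\eta)$ is
\[
\delta S = -\frac{1}{\xi}\int d^4x\, \partial_\mu A^\mu\, \partial^2\lambda + \int d^4x \left( J^\mu\partial_\mu\lambda + ie_0\bar\eta\Psi\lambda - ie_0\bar\Psi\eta\lambda \right) . \tag{27.96}
\]
This change of variables does not change the path integral $Z[J, \eta, \bar\eta] \propto \int \mathcal{D}A\,\mathcal{D}\Psi\,\mathcal{D}\bar\Psi\, e^{iS}$,
since the measure is invariant under the gauge transformation. Hence, we must have
\[
\int d^4x\, \lambda(x) \int \mathcal{D}A\,\mathcal{D}\Psi\,\mathcal{D}\bar\Psi\, e^{iS} \left[ -\frac{1}{\xi}\partial^2\partial_\mu A^\mu - \partial_\mu J^\mu + ie_0(\bar\eta\Psi - \bar\Psi\eta) \right] = 0 . \tag{27.97}
\]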
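Notice that
\[
\langle A_\mu(x)\rangle_{J,\eta,\bar\eta} = -\frac{\delta E}{\delta J^\mu(x)}, \quad
\langle\Psi(x)\rangle_{J,\eta,\bar\eta} = -\frac{\delta E}{\delta\bar\eta(x)}, \quad
\langle\bar\Psi(x)\rangle_{J,\eta,\bar\eta} = \frac{\delta E}{\delta\eta(x)} . \tag{27.98}
\]
Equation 27.97 can be written as
\[
\frac{1}{\xi}\partial^2\partial^\mu\frac{\delta E}{\delta J^\mu(x)} - \partial_\mu J^\mu(x) - ie_0\left( \bar\eta\,\frac{\delta E}{\delta\bar\eta(x)} + \frac{\delta E}{\delta\eta(x)}\,\eta \right) = 0 . \tag{27.99}
\]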
Differentiating equation 27.99 with respect to $J^\nu(y)$ and substituting equation 27.95, we have
\[
\frac{i}{\xi}\partial^2\partial^\mu G(x-y)_{\mu\nu} + \partial_\nu\delta(x-y) = 0, \tag{27.100}
\]
which can be transformed into momentum space as
\[
-\frac{i}{\xi}k^2k^\mu G(k)_{\mu\nu} + k_\nu = -\frac{k^2}{k^2 - \xi f_L(k^2)}\,k_\nu + k_\nu = 0 . \tag{27.101}
\]
As a result, we have $f_L(k^2) = 0$ and $f_T(k^2) = O(k^2)$. The exact photon propagator becomes
\[
G(k)_{\mu\nu} = \frac{-i}{k^2[1 - \Pi(k^2)]}\,P^T_{\mu\nu} + \frac{-i\xi}{k^2}\,P^L_{\mu\nu}, \quad\text{where}\quad \Pi(k^2) \equiv f_T(k^2)/k^2 . \tag{27.102}
\]

The photon remains massless after quantum corrections in QED theory.



27.6 LSZ reduction formula


27.6.1 LSZ reduction formula and Feynman rules
Using the completeness of the EM field and the masslessness of the photon after quantum corrections, we can derive the structure of the exact photon propagator in Feynman gauge,
\[
\int d^4x\, e^{-ipx}\langle\Omega|\mathrm{T}A_\mu(x)A_\nu(0)|\Omega\rangle_C = \frac{-iZ_3\eta_{\mu\nu}}{p^2 - i\epsilon} + \cdots \tag{27.103}
\]
where $Z_3$ is the probability for the quantum field to create or annihilate an exact one-particle
eigenstate of $H$, defined through
\[
\langle\Omega|A(0)|p, \lambda\rangle = \sqrt{Z_3}\,\epsilon_\lambda(p) . \tag{27.104}
\]

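Scattering amplitudes of interacting photons and charged fermions can be evaluated using the following reduction formula:
\begin{align*}
\langle p_1\cdots p_n|S|k_1\cdots k_m\rangle ={}& \prod_{i=1}^{n}\int d^4x_i\, e^{-ip_ix_i} \prod_{j=1}^{m}\int d^4y_j\, e^{ik_jy_j} \\
&\times \left(\frac{i}{\sqrt{Z_3}}\right)^{m+n} [p_1^2\epsilon^{*\mu_1}_{\lambda_1}(p_1)]\cdots[p_n^2\epsilon^{*\mu_n}_{\lambda_n}(p_n)]\,[k_1^2\epsilon^{\nu_1}_{\lambda'_1}(k_1)]\cdots[k_m^2\epsilon^{\nu_m}_{\lambda'_m}(k_m)] \\
&\times \langle\Omega|\mathrm{T}\{A_{\mu_1}(x_1)\cdots A_{\mu_n}(x_n)A_{\nu_1}(y_1)\cdots A_{\nu_m}(y_m)\}|\Omega\rangle . \tag{27.105}
\end{align*}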

Given the structure of exact propagator and LSZ reduction formula, we can list the Feynman
rules of QED theory which can be used to evaluate scattering amplitudes.
1. For each incoming electron, draw a solid line with an arrow pointed towards the vertex, and label it with the electron's four-momentum, $p_i$.
2. For each outgoing electron, draw a solid line with an arrow pointed away from the vertex, and label it with the electron's four-momentum, $p'_i$.
3. For each incoming positron, draw a solid line with an arrow pointed away from the vertex, and label it with minus the positron's four-momentum, $-p_i$.
4. For each outgoing positron, draw a solid line with an arrow pointed towards the vertex, and label it with minus the positron's four-momentum, $-p'_i$.
5. For each incoming photon, draw a wavy line with an arrow pointed towards the vertex, and label it with the photon's four-momentum, $k_i$.
6. For each outgoing photon, draw a wavy line with an arrow pointed away from the vertex, and label it with the photon's four-momentum, $k'_i$.
7. The only allowed vertex joins two solid lines, one with an arrow pointing towards it and one with an arrow pointing away from it, and one wavy line. Using this vertex, join up all the external lines, including extra internal lines as needed. In this way, draw all possible diagrams that are topologically inequivalent.

8. Assign each internal line its own four-momentum. Think of the four-momenta as flowing along the arrows, and conserve four-momentum at each vertex.

9. The value of a diagram consists of the following factors:

• for each incoming photon, $\epsilon^\mu_\lambda(k)$; for each outgoing photon, $\epsilon^{*\mu}_\lambda(k)$;

• for each incoming electron, $u_r(k)$; for each outgoing electron, $\bar u_s(p)$;

• for each incoming positron, $\bar v_r(k)$; for each outgoing positron, $v_s(p)$;

• for each vertex, $ie_0\gamma^\mu$; for each internal photon, $G_F(p)$; for each internal fermion, $S_F(p)$.

10. Spinor indices are contracted by starting at one end of a fermion line: specifically, the
end that has the arrow pointing away from the vertex. The factor associated with the
external line is either ū or v̄. Go along the complete fermion line, following the arrows
backwards, and write down (in order from left to right) the factors associated with the
vertices and propagators that you encounter. The last factor is either a u or v. Repeat
this procedure for the other fermion lines, if any. The vector index on each vertex is
contracted with the vector index on either the photon propagator (if the attached photon
line is internal) or the photon polarization vector (if the attached photon line is external).

11. The overall sign of a tree diagram is determined by drawing all contributing diagrams
in a standard form: all fermion lines horizontal, with their arrows pointing from left to
right, and with the left endpoints labeled in the same fixed order (from top to bottom); if
the ordering of the labels on the right endpoints of the fermion lines in a given diagram
is an even (odd) permutation of an arbitrarily chosen fixed ordering, then the sign of
that diagram is positive (negative).

12. Each closed fermion loop contributes an extra minus sign.

13. Value of iM is given by a sum over the values of the contributing diagrams.
14. $\langle f|iT|i\rangle = (Z_2)^{n_{\rm fer}/2}(Z_3)^{n_{\rm pho}/2}\, i\mathcal{M}\,\delta\!\left(\sum p_f - \sum p_i\right)$.

27.6.2 Ward-Takahashi identity (2)


Suppose the invariant matrix element for a scattering process is $\mathcal{M}$. If we replace the polarization vector $\epsilon^\mu_\lambda$ (or $\epsilon^{*\mu}_\lambda$) of one incoming (or outgoing) photon with its momentum vector $k^\mu$, we have
\[
k^\mu\mathcal{M}_\mu = 0 . \tag{27.106}
\]

Proof: Without loss of generality, we can consider a physical process with a single incoming and a single outgoing fermion line. The Ward identity represented by diagram 27.4 then states that
\[
-ik_\mu F^\mu(k; p, q) = ie_0\left[F_0(p+k, q) - F_0(p, q-k)\right] . \tag{27.107}
\]
Here, $F$ keeps the external fermion legs but cuts the external photon lines. According to the LSZ reduction formula, from each diagram we get the contribution to an S-matrix element by taking the coefficient of the product of poles
\[
\left(\frac{-i}{\slashed{p} + m}\right)\left(\frac{-i}{\slashed{q} + m}\right) . \tag{27.108}
\]
Each term on the right-hand side contains one of these poles, but neither contains both. Thus they contribute nothing to the S-matrix, leading to $k^\mu\mathcal{M}_\mu = 0$. $\square$

When calculating invariant matrix element M, the value of photon propagator depends on
the gauge we used. In Coulomb gauge, the photon propagator is given by equation 27.64. In
Lorenz gauge, the photon propagator becomes equation 27.66.
A general scattering process can be represented by Figure 27.6.

Figure 27.6: Feynman diagram representation of a QED process.

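The value of the diagram is
\[
\mathcal{M} = \mathcal{M}_1^\mu\, G_F(k)_{\mu\nu}\, \mathcal{M}_2^\nu . \tag{27.109}
\]
Since $k_\mu\mathcal{M}_1^\mu = 0$ and $k_\nu\mathcal{M}_2^\nu = 0$, the factor $\xi$ in the photon propagator for Lorenz gauge is irrelevant to the value of $\mathcal{M}$. As for Coulomb gauge, denote $\mathcal{M}_1^\mu$ by $\alpha^\mu$ and $\mathcal{M}_2^\mu$ by $\beta^\mu$. We have
\[
\mathcal{M} = \alpha^\mu G_F(k)_{\mu\nu}\beta^\nu = i\left[ -\frac{\boldsymbol{\alpha}\cdot\boldsymbol{\beta}}{k^2} + \frac{(\boldsymbol{\alpha}\cdot\mathbf{k})(\boldsymbol{\beta}\cdot\mathbf{k})}{k^2\,\mathbf{k}^2} + \frac{\alpha^0\beta^0}{\mathbf{k}^2} \right] . \tag{27.110}
\]
Using the fact that $\boldsymbol{\alpha}\cdot\mathbf{k} = -\alpha^0k_0$ and $\boldsymbol{\beta}\cdot\mathbf{k} = -\beta^0k_0$, we can verify that
\[
\alpha^\mu G_F(k)_{\mu\nu}\beta^\nu = \alpha^\mu\left(-\frac{i\eta_{\mu\nu}}{k^2}\right)\beta^\nu . \tag{27.111}
\]
So the different photon propagators in Lorenz gauge and Coulomb gauge lead to the same matrix
element $\mathcal{M}$.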
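
The gauge independence claimed in equations 27.110 and 27.111 can be checked numerically. The sketch below is illustrative only: it assumes the mostly-plus metric used in this chapter, reads the bold symbols as spatial three-vectors, strips the common factor $i$, and uses hypothetical helper names (`minkowski_dot`, `transverse_four_vector`, `coulomb_bracket`, `covariant_bracket`) that do not appear in the notes.

```python
import random


def minkowski_dot(a, b):
    """Four-vector product in the mostly-plus metric: a.b = -a^0 b^0 + a_vec . b_vec."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def transverse_four_vector(k, rng):
    """Return a random four-vector a^mu (upper index) with k_mu a^mu = 0."""
    spatial = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    k_dot_a = sum(ki * ai for ki, ai in zip(k[1:], spatial))
    # k_mu a^mu = -k^0 a^0 + k_vec . a_vec = 0 fixes the time component.
    return [k_dot_a / k[0]] + spatial


def coulomb_bracket(alpha, beta, k):
    """Square bracket of equation 27.110 (overall factor i stripped)."""
    k_sq = minkowski_dot(k, k)
    k_vec_sq = sum(x * x for x in k[1:])
    a_dot_b = sum(x * y for x, y in zip(alpha[1:], beta[1:]))
    a_dot_k = sum(x * y for x, y in zip(alpha[1:], k[1:]))
    b_dot_k = sum(x * y for x, y in zip(beta[1:], k[1:]))
    return (-a_dot_b / k_sq
            + a_dot_k * b_dot_k / (k_sq * k_vec_sq)
            + alpha[0] * beta[0] / k_vec_sq)


def covariant_bracket(alpha, beta, k):
    """Right-hand side of equation 27.111 with the same factor i stripped."""
    return -minkowski_dot(alpha, beta) / minkowski_dot(k, k)


rng = random.Random(0)
k = [2.0, 0.3, -0.5, 0.7]  # arbitrary off-shell momentum (k^0, k_vec)
alpha = transverse_four_vector(k, rng)
beta = transverse_four_vector(k, rng)
difference = coulomb_bracket(alpha, beta, k) - covariant_bracket(alpha, beta, k)
print(abs(difference) < 1e-12)
```

The two brackets agree only because both $\alpha^\mu$ and $\beta^\mu$ are orthogonal to $k_\mu$; dropping that condition in `transverse_four_vector` makes the difference nonzero.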

27.7 Renormalization
27.7.1 Renormalized quantum electrodynamics
The superficial degree of divergence of a Feynman diagram in QED is
\[
D = 4 - N_\gamma - \frac{3}{2}N_e, \tag{27.112}
\]
where $N_\gamma$ is the number of external photons and $N_e$ is the number of external (anti-)fermions.
Only seven types of diagrams have $D \geq 0$, including the vacuum term. However, symmetries
can cause certain terms to cancel, and the divergence of a diagram may be reduced or even
eliminated.
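
The count of seven superficially divergent amplitude types can be reproduced by direct enumeration of equation 27.112. This is a minimal sketch; the function names and the cutoff `max_legs` are illustrative choices, and the only physics input assumed is that external fermion legs come in pairs.

```python
from fractions import Fraction


def superficial_degree(n_photon, n_fermion):
    """Equation 27.112: D = 4 - N_gamma - (3/2) N_e."""
    return 4 - n_photon - Fraction(3, 2) * n_fermion


def divergent_amplitudes(max_legs=8):
    """List (N_gamma, N_e, D) for all amplitudes with D >= 0."""
    found = []
    for n_fermion in range(0, max_legs + 1, 2):  # fermion legs come in pairs
        for n_photon in range(max_legs + 1):
            degree = superficial_degree(n_photon, n_fermion)
            if degree >= 0:
                found.append((n_photon, n_fermion, degree))
    return found


for n_photon, n_fermion, degree in divergent_amplitudes():
    print(f"N_gamma={n_photon}, N_e={n_fermion}, D={degree}")
```

The list contains the vacuum term, the photon amplitudes with one to four external legs, the fermion two-point function and the vertex.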

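Under charge conjugation, we have $C|\Omega\rangle = |\Omega\rangle$ and $Cj^\mu(x)C^\dagger = -j^\mu(x)$, leading to
\[
\langle\Omega|j^\mu(x)|\Omega\rangle = \langle\Omega|C^\dagger Cj^\mu(x)C^\dagger C|\Omega\rangle = -\langle\Omega|j^\mu(x)|\Omega\rangle = 0 . \tag{27.113}
\]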

According to equation 27.85, the amplitude with Nγ = 1 and Ne = 0 must vanish. Similarly,
the amplitude with Nγ = 3 and Ne = 0 also vanishes.

Consider the scattering amplitude with $N_\gamma = 4$. The Ward identity requires that if we
replace any external photon by its momentum vector, the amplitude vanishes: $p_\mu\mathcal{M}^{\mu\nu\sigma\rho} = 0$.
By exhaustion one can show that this condition is satisfied only if the amplitude is proportional
to $\eta^{\mu\nu}p^\sigma - \eta^{\mu\sigma}p^\nu$, with a similar factor for each of the other three legs. Each of these factors
involves one power of momentum, so all terms with fewer than four powers of momentum in the
Taylor series of this amplitude must vanish. The remaining nonvanishing terms have $D = 0 - 4 = -4$,
and therefore this amplitude is finite.

As discussed in section 27.5, the 1PI photon self-energy is purely transverse, that is, proportional to
$(\eta_{\mu\nu}p^2 - p_\mu p_\nu)$. Viewing this expression as a Taylor series in $p$, we see that the constant and
linear terms both vanish, lowering the superficial degree of divergence from 2 to 0. The divergence is only logarithmic.

Neglecting the vacuum term, there are only three divergent amplitude terms left, as shown in
Figure 27.7. We need four counterterms to eliminate all the divergence.

Figure 27.7: Feynman diagram representation of divergent amplitude in QED.

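Define
\begin{align*}
A_r &\equiv Z_3^{-1/2}A, \quad \Psi_r \equiv Z_2^{-1/2}\Psi, \quad \delta_3 \equiv Z_3 - 1, \quad \delta_2 \equiv Z_2 - 1, \\
\delta_m &\equiv Z_2m_0 - m, \quad \delta_1 \equiv Z_1 - 1 \equiv (e_0/e)Z_2Z_3^{1/2} - 1, \tag{27.114}
\end{align*}
where $m$ is the physical mass of the fermion and $e$ the physical electric charge. The Lagrangian
density then becomes
\[
\mathcal{L} = \mathcal{L}_1 + \mathcal{L}_{ct}, \tag{27.115}
\]
where
\begin{align*}
\mathcal{L}_1 &= -\frac{1}{4}F_r^{\mu\nu}F_{r\mu\nu} + i\bar\Psi_r\gamma^\mu\partial_\mu\Psi_r - m\bar\Psi_r\Psi_r + e\bar\Psi_r\gamma^\mu\Psi_rA_{r\mu}, \\
\mathcal{L}_{ct} &= -\frac{1}{4}\delta_3F_r^{\mu\nu}F_{r\mu\nu} + i\delta_2\bar\Psi_r\gamma^\mu\partial_\mu\Psi_r - \delta_m\bar\Psi_r\Psi_r + e\delta_1\bar\Psi_r\gamma^\mu\Psi_rA_{r\mu} . \tag{27.116}
\end{align*}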

Figure 27.8: Feynman rules for counterterms in QED.

The Feynman rules for counterterms are shown in Figure 27.8.

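Denote the renormalized 1PI component of the exact photon propagator by $i(\eta^{\mu\nu}q^2 - q^\mu q^\nu)\Pi_r(q^2)$,
the renormalized 1PI component of the exact fermion propagator by $-i\Sigma_r(\slashed{p})$, and the renormalized
exact amputated photon-fermion-antifermion vertex by $ie\Gamma^\mu_r(p, p')$. We should adjust $\delta_1$,
$\delta_2$, $\delta_3$ and $\delta_m$ as necessary to maintain the following renormalization conditions:
\[
\Sigma_r(\slashed{p} = -m) = 0, \quad \left.\frac{d\Sigma_r}{d\slashed{p}}\right|_{\slashed{p}=-m} = 0, \quad \Pi_r(q^2 = 0) = 0, \quad ie\Gamma^\mu_r(p - p' = 0) = ie\gamma^\mu . \tag{27.117}
\]
As an aside, the renormalized exact fermion and photon propagators are
\[
S_r(p) = \frac{-i}{\slashed{p} + m + \Sigma_r(\slashed{p})}, \quad G_r(q)_{\mu\nu} = \frac{-i}{q^2\left(1 - \Pi_r(q^2)\right)}\,P^T_{\mu\nu} . \tag{27.118}
\]

Rewriting equation 27.84 in terms of renormalized quantities, we have
\begin{align*}
& ieZ_2\,\partial_\mu\langle\Omega|\mathrm{T}\,j_r^\mu(x)\Psi_r(x_1)\bar\Psi_r(x_2)|\Omega\rangle \\
&\quad = -ie\,\delta(x - x_1)\langle\Omega|\mathrm{T}\,\Psi_r(x_1)\bar\Psi_r(x_2)|\Omega\rangle + ie\,\delta(x - x_2)\langle\Omega|\mathrm{T}\,\Psi_r(x_1)\bar\Psi_r(x_2)|\Omega\rangle . \tag{27.119}
\end{align*}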

After Fourier transformation, we obtain
\[
-k_\mu Z_2Z_1^{-1}S_r(p+k)\left[ie\Gamma^\mu_r(p+k, p)\right]S_r(p) = e\left[S_r(p+k) - S_r(p)\right], \tag{27.120}
\]
which can be further simplified into
\[
Z_2Z_1^{-1}k_\mu\Gamma^\mu_r(p+k, p) = \slashed{k} + \Sigma_r(\slashed{k} + \slashed{p}) - \Sigma_r(\slashed{p}) . \tag{27.121}
\]

Since both $\Gamma_r$ and $\Sigma_r$ are finite, $Z_2Z_1^{-1}$ must be finite as well.

Taking the limit $k \to 0$ at the on-shell point $\slashed{p} = -m$, we immediately get
\[
Z_1 = Z_2 \tag{27.122}
\]
in the OS renormalization scheme, leading to $e = Z_3^{1/2}e_0$.

Since the relation between the bare and renormalized electric charge depends only on the EM
field strength renormalization, not on quantities particular to the fermions, there is a universal
electric charge that has the same value for all species.

In the following subsections, we omit the subscript r unless it is necessary to emphasize
the difference between bare fields and renormalized fields.

27.7.2 One loop structure of QED


Photon propagator
The quantum corrections to the photon propagator up to one-loop order are represented by
Figure 27.9.

Figure 27.9: The one-loop and counterterm corrections to the photon propagator.

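Using the Feynman rules, we can work out that
\[
\Pi(k^2) = -\frac{e^2}{\pi^2}\int_0^1 dx\, x(1-x)\left[\frac{1}{\epsilon} - \frac{1}{2}\ln\frac{D}{\mu^2}\right] - \delta_3 + O(e^4), \tag{27.123}
\]
where $D = x(1-x)k^2 + m^2 - i\epsilon$ and $\mu^2 = 4\pi e^{-\gamma}\tilde\mu^2$. Imposing the OS renormalization
condition $\Pi(0) = 0$, we have
\[
\delta_3 = -\frac{e^2}{6\pi^2}\left[\frac{1}{\epsilon} - \ln\frac{m}{\mu}\right] + O(e^4), \quad
\Pi(k^2) = \frac{e^2}{2\pi^2}\int_0^1 dx\, x(1-x)\ln\frac{D}{m^2} + O(e^4) . \tag{27.124}
\]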
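
The value of $\delta_3$ follows in one line from equation 27.123. The worked step below is a sketch that assumes only $D = m^2$ at $k^2 = 0$ (dropping the $i\epsilon$) and the elementary integral $\int_0^1 dx\, x(1-x) = 1/6$.

```latex
\begin{align*}
0 = \Pi(0) &= -\frac{e^2}{\pi^2}\cdot\frac{1}{6}\left[\frac{1}{\epsilon} - \frac{1}{2}\ln\frac{m^2}{\mu^2}\right] - \delta_3 + O(e^4) \\
\Rightarrow\quad \delta_3 &= -\frac{e^2}{6\pi^2}\left[\frac{1}{\epsilon} - \ln\frac{m}{\mu}\right] + O(e^4), \\
\Pi(k^2) = \Pi(k^2) - \Pi(0) &= \frac{e^2}{2\pi^2}\int_0^1 dx\, x(1-x)\left[\ln\frac{D}{\mu^2} - \ln\frac{m^2}{\mu^2}\right] + O(e^4).
\end{align*}
```

The $1/\epsilon$ pole and the dependence on $\mu$ cancel in the difference, which leaves the finite integral quoted in equation 27.124.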

Fermion propagator
The exact renormalized fermion propagator in OS renormalization can be written as
\[
iS(\slashed{p}) = \frac{1}{\slashed{p} + m - i\epsilon} + \int_{m_{th}}^{\infty} ds\, \frac{\rho_\Psi(s)}{\slashed{p} + \sqrt{s} - i\epsilon} . \tag{27.125}
\]
We see that the first term has a pole at p/ = −m with residue one. This residue corresponds to
the field normalization that is needed for the validity of the LSZ formula. There is a problem,
however: in quantum electrodynamics, the threshold mass mth is m, corresponding to the
contribution of a fermion and a zero energy photon. Thus the second term has a branch point
at p/ = −m. The pole in the first term is therefore not isolated, and its residue is ill defined.
This is a reflection of an underlying infrared divergence, associated with the massless photon.
To deal with it, we must impose an infrared cutoff that moves the branch point away from
the pole. The most direct method is to change the denominator of the photon propagator
from k 2 to k 2 + m2γ , where mγ is a fictitious photon mass. Ultimately, we must deal with this
issue by computing cross sections that take into account detector inefficiencies. In quantum
electrodynamics, we must specify the lowest photon energy ωmin that can be detected. Only
after computing cross sections with extra undetectable photons, and then summing over them,
is it safe to take the limit mγ → 0.
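The quantum corrections to the fermion propagator up to one-loop order are described by Figure 27.10.

Figure 27.10: The one-loop and counterterm corrections to the fermion propagator.

Using the Feynman rules, we can work out that
\[
\Sigma(\slashed{p}) = \frac{e^2}{8\pi^2}\int_0^1 dx\, \left[(2-\epsilon)(1-x)\slashed{p} + (4-\epsilon)m\right]\left[\frac{1}{\epsilon} - \frac{1}{2}\ln\frac{D}{\mu^2}\right] + \delta_2\slashed{p} + \delta_m + O(e^4), \tag{27.126}
\]
where $D = x(1-x)p^2 + xm^2 + (1-x)m_\gamma^2$. The finiteness of $\Sigma(\slashed{p})$ requires that
\[
\delta_2 = -\frac{e^2}{8\pi^2}\left(\frac{1}{\epsilon} + \text{finite}\right) + O(e^4) \quad\text{and}\quad \delta_m/m = -\frac{e^2}{2\pi^2}\left(\frac{1}{\epsilon} + \text{finite}\right) + O(e^4) . \tag{27.127}
\]

Imposing the OS renormalization conditions $\Sigma(-m) = 0$ and $\Sigma'(-m) = 0$, we find that
\[
\Sigma(\slashed{p}) = -\frac{e^2}{8\pi^2}\left[\int_0^1 dx\, \left((1-x)\slashed{p} + 2m\right)\ln\frac{D}{D_0} + \kappa_2(\slashed{p} + m)\right] + O(e^4), \tag{27.128}
\]
where $D_0 = x^2m^2 + (1-x)m_\gamma^2$ and $\kappa_2 = -2\ln(m/m_\gamma) + 1$.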

Vertex
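The quantum corrections to the vertex up to one-loop order are shown in Figure 27.11.

Figure 27.11: The one-loop and counterterm corrections to the photon-fermion-fermion vertex.

Using the Feynman rules, we can work out that
\[
\Gamma^\mu(p, p') = (1 + \delta_1)\gamma^\mu + \frac{e^2}{8\pi^2}\left\{ \left[\frac{1}{\epsilon} - 1 - \frac{1}{2}\int dF_3\, \ln\frac{D}{\mu^2}\right]\gamma^\mu + \frac{1}{4}\int dF_3\, \frac{\tilde N^\mu}{D} \right\} + O(e^4), \tag{27.129}
\]
where
\begin{align*}
\int dF_3 &= 2\int_0^1 dx_1\, dx_2\, dx_3\, \delta(x_1 + x_2 + x_3 - 1), \\
D &= x_1(1-x_1)p^2 + x_2(1-x_2)p'^2 - 2x_1x_2\, p\cdot p' + (x_1 + x_2)m^2 + x_3m_\gamma^2, \\
\tilde N^\mu &= \gamma_\nu\left[x_1\slashed{p} - (1-x_2)\slashed{p}' + m\right]\gamma^\mu\left[-(1-x_1)\slashed{p} + x_2\slashed{p}' + m\right]\gamma^\nu . \tag{27.130}
\end{align*}

Finiteness of $\Gamma^\mu$ requires that
\[
\delta_1 = -\frac{e^2}{8\pi^2}\left(\frac{1}{\epsilon} + \text{finite}\right) + O(e^4) . \tag{27.131}
\]

Imposing the OS renormalization condition $\Gamma^\mu_r(p = p', p^2 = -m^2) = \gamma^\mu$, we obtain
\[
\Gamma^\mu(p, p') = \gamma^\mu - \frac{e^2}{16\pi^2}\int dF_3\left[\left(\ln\frac{D}{D_0} + 2\kappa_1\right)\gamma^\mu - \frac{\tilde N^\mu}{2D}\right] + O(e^4), \tag{27.132}
\]
where $D_0 = (1-x_3)^2m^2 + x_3m_\gamma^2$ and $\kappa_1 = -2\ln(m/m_\gamma) + 5/2$.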


In order to compute the fermion-fermion scattering amplitude, we must evaluate $\bar u_{s'}(p')\Gamma^\mu(p', p)u_s(p)$
with $p^2 = p'^2 = -m^2$, but with $q^2 = (p' - p)^2$ arbitrary. Up to one-loop order, we can work
out that
\[
\bar u_{s'}(p')\Gamma^\mu(p', p)u_s(p) = \bar u_{s'}(p')\left[F_1(q^2)\gamma^\mu - \frac{i}{m}F_2(q^2)S^{\mu\nu}q_\nu\right]u_s(p), \tag{27.133}
\]
where
\begin{align*}
F_1(q^2) ={}& 1 - \frac{e^2}{16\pi^2}\int dF_3\, \ln\left(1 + \frac{x_1x_2q^2/m^2}{(1-x_3)^2}\right) \\
& - \frac{e^2}{16\pi^2}\int dF_3\left[\frac{1 - 4x_3 + x_3^2}{(1-x_3)^2 + x_3m_\gamma^2/m^2} + \frac{(x_3 + x_1x_2)q^2/m^2 - (1 - 4x_3 + x_3^2)}{x_1x_2q^2/m^2 + (1-x_3)^2 + x_3m_\gamma^2/m^2}\right] + O(e^4), \\
F_2(q^2) ={}& \frac{e^2}{8\pi^2}\int_0^1 \frac{dy}{1 + y(1-y)q^2/m^2} + O(e^4) . \tag{27.134}
\end{align*}

In particular, we have $F_1(0) = 1$ and $F_2(0) = \alpha/2\pi + O(\alpha^2)$, where $\alpha = e^2/4\pi$ is the fine-structure constant.

27.7.3 Renormalization group


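Imposing the MS renormalization scheme, we have to one-loop order
\[
Z_1 = Z_2 = 1 - \frac{e^2}{8\pi^2}\frac{1}{\epsilon} + O(e^4), \quad
Z_3 = 1 - \frac{e^2}{6\pi^2}\frac{1}{\epsilon} + O(e^4), \quad
Z_m = 1 - \frac{e^2}{2\pi^2}\frac{1}{\epsilon} + O(e^4) . \tag{27.135}
\]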

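The Lagrangian density of QED
\[
\mathcal{L} = -\frac{1}{4}F_0^{\mu\nu}F_{0\mu\nu} + \bar\Psi_0(i\slashed{\partial} - m_0)\Psi_0 + e_0\bar\Psi_0\gamma^\mu\Psi_0A_{0\mu} \tag{27.136}
\]
can be rewritten as
\[
\mathcal{L} = -\frac{1}{4}Z_3F_{\mu\nu}F^{\mu\nu} + \bar\Psi(iZ_2\slashed{\partial} - Z_mm)\Psi + Z_1e\bar\Psi\gamma^\mu\Psi A_\mu, \tag{27.137}
\]
where
\[
\Psi_0 = Z_2^{1/2}\Psi, \quad A_0 = Z_3^{1/2}A, \quad m_0 = Z_2^{-1}Z_mm, \quad e_0 = Z_3^{-1/2}e\tilde\mu^{\epsilon/2} . \tag{27.138}
\]

After using dimensional regularization, the infinities coming from loop integrals take the form
of inverse powers of $\epsilon$. In the MS renormalization scheme, we choose the $Z$s to cancel off these
powers of $1/\epsilon$, and nothing more. Therefore the $Z$s can be expanded as
\[
Z_1 = 1 + \sum_{n=1}^{\infty}\frac{a_n(e)}{\epsilon^n}, \quad
Z_2 = 1 + \sum_{n=1}^{\infty}\frac{b_n(e)}{\epsilon^n}, \quad
Z_3 = 1 + \sum_{n=1}^{\infty}\frac{c_n(e)}{\epsilon^n}, \quad
Z_m = 1 + \sum_{n=1}^{\infty}\frac{d_n(e)}{\epsilon^n} . \tag{27.139}
\]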

Using equation 27.135, we can get a1 = b1 = −e2 /8π 2 + O(e4 ), c1 = −e2 /6π 2 + O(e4 ) and
d1 = −e2 /2π 2 + O(e4 ).
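Define
\[
E(e, \epsilon) \equiv \ln\left(Z_3^{-1/2}\right) = \sum_{n=1}^{\infty}\frac{E_n(e)}{\epsilon^n} . \tag{27.140}
\]
We can work out that $E_1 = -c_1/2 = e^2/12\pi^2 + O(e^4)$. As $\ln e_0 = E + \ln e + \epsilon\ln\tilde\mu/2$ and
$de_0/d\mu = 0$, we can derive that
\[
\left(1 + \frac{eE_1'}{\epsilon} + \frac{eE_2'}{\epsilon^2} + \cdots\right)\frac{de}{d\ln\mu} + \frac{\epsilon}{2}e = 0 . \tag{27.141}
\]

In a renormalizable theory, $de/d\ln\mu$ should be finite in the $\epsilon \to 0$ limit. Therefore, the beta
function for the charge is
\[
\beta(e) \equiv \frac{de}{d\ln\mu} = -\frac{\epsilon}{2}e + \frac{e^2}{2}E_1'(e) = -\frac{\epsilon}{2}e + \frac{e^3}{12\pi^2} + O(e^5) . \tag{27.142}
\]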
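
To see what equation 27.142 implies for the running of the coupling, one can set $\epsilon \to 0$ and rewrite it for $\alpha = e^2/4\pi$, which gives $d\alpha/d\ln\mu = 2\alpha^2/3\pi$ and hence $1/\alpha(\mu) = 1/\alpha(\mu_0) - (2/3\pi)\ln(\mu/\mu_0)$. The sketch below compares a numerical integration with this closed form. It assumes a single charged Dirac fermion and one-loop accuracy; the starting value of $\alpha$ and the function names are illustrative.

```python
import math


def beta_alpha(alpha):
    """One-loop d(alpha)/d(ln mu) obtained from beta(e) = e^3 / (12 pi^2)."""
    return 2.0 * alpha ** 2 / (3.0 * math.pi)


def run_alpha_numeric(alpha0, t_end, steps=1000):
    """Integrate d(alpha)/dt = beta_alpha(alpha), t = ln(mu/mu0), with RK4."""
    h = t_end / steps
    alpha = alpha0
    for _ in range(steps):
        k1 = beta_alpha(alpha)
        k2 = beta_alpha(alpha + 0.5 * h * k1)
        k3 = beta_alpha(alpha + 0.5 * h * k2)
        k4 = beta_alpha(alpha + h * k3)
        alpha += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return alpha


def run_alpha_exact(alpha0, t_end):
    """Closed-form solution 1/alpha(t) = 1/alpha0 - 2 t / (3 pi)."""
    return 1.0 / (1.0 / alpha0 - 2.0 * t_end / (3.0 * math.pi))


alpha0 = 1.0 / 137.036   # assumed low-energy value, for illustration only
t_end = 10.0             # ln(mu/mu0)
print(run_alpha_numeric(alpha0, t_end), run_alpha_exact(alpha0, t_end))
```

The positive sign of the beta function means the coupling grows slowly with the scale $\mu$.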

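Define
\[
M(e, \epsilon) \equiv \ln\left(Z_mZ_2^{-1}\right) = \sum_{n=1}^{\infty}\frac{M_n(e)}{\epsilon^n} . \tag{27.143}
\]
We can work out $M_1 = d_1 - b_1 = -3e^2/8\pi^2 + O(e^4)$. As $\ln m_0 = M(e, \epsilon) + \ln m$ and
$d\ln m_0/d\ln\mu = 0$, we can derive that
\[
\frac{d\ln m}{d\ln\mu} = -\frac{\partial M(e, \epsilon)}{\partial e}\frac{de}{d\ln\mu} = \frac{1}{2}\left(\epsilon e - e^2E_1'\right)\sum_{n=1}^{\infty}\frac{M_n'(e)}{\epsilon^n} = \frac{e}{2}M_1' + \cdots \tag{27.144}
\]

In a renormalizable theory, $d\ln m/d\ln\mu$ should be finite in the $\epsilon \to 0$ limit, and so the terms
represented by the ellipsis, which carry inverse powers of $\epsilon$, must all vanish. Therefore, the anomalous dimension of the mass is
\[
\gamma_m(e) \equiv \frac{1}{m}\frac{dm}{d\ln\mu} = \frac{e}{2}M_1' = -\frac{3e^2}{8\pi^2} + O(e^4) . \tag{27.145}
\]

By a similar procedure, we can also derive the anomalous dimensions of the fermion field and
the EM field:
\begin{align*}
\gamma_2(e) &\equiv \frac{1}{2}\frac{d\ln Z_2}{d\ln\mu} = -\frac{1}{4}e\frac{db_1}{de} = \frac{e^2}{16\pi^2} + O(e^4), \tag{27.146} \\
\gamma_3(e) &\equiv \frac{1}{2}\frac{d\ln Z_3}{d\ln\mu} = -\frac{1}{4}e\frac{dc_1}{de} = \frac{e^2}{12\pi^2} + O(e^4) . \tag{27.147}
\end{align*}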
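
The one-loop coefficients in equations 27.142 and 27.145 to 27.147 all come from the same three numbers $b_1$, $c_1$, $d_1$. The bookkeeping can be checked with exact rational arithmetic; in the sketch below every quantity is stored as its coefficient in units of $e^2/\pi^2$ (or $e^3/\pi^2$ for the beta function), and the helper name `e_times_derivative` is illustrative.

```python
from fractions import Fraction

# One-loop MS-scheme residues from equation 27.135, in units of e^2 / pi^2.
B1 = Fraction(-1, 8)   # Z_2 (and Z_1)
C1 = Fraction(-1, 6)   # Z_3
D1 = Fraction(-1, 2)   # Z_m


def e_times_derivative(coeff):
    """For f(e) = coeff * e^2, return the coefficient of e^2 in e * df/de."""
    return 2 * coeff


E1 = -C1 / 2                              # E_1 = -c_1 / 2
M1 = D1 - B1                              # M_1 = d_1 - b_1

beta_coeff = e_times_derivative(E1) / 2   # (e^2/2) E_1' -> coefficient of e^3 / pi^2
gamma_m = e_times_derivative(M1) / 2      # (e/2) M_1'
gamma_2 = -e_times_derivative(B1) / 4     # -(1/4) e db_1/de
gamma_3 = -e_times_derivative(C1) / 4     # -(1/4) e dc_1/de

print(beta_coeff, gamma_m, gamma_2, gamma_3)
```

Working with coefficients rather than floating-point values keeps the comparison with the fractions in the text exact.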

27.7.4 Magnetic dipole moment


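Suppose there is a static (time-independent) classical electromagnetic background field $A^{cl}_\mu(\mathbf{x})$. Then we have
a new term
\[
e\bar\Psi\gamma^\mu\Psi A^{cl}_\mu(\mathbf{x}) \tag{27.148}
\]
in the Lagrangian density. In a Feynman diagram, the term joins two fermion lines, and the
value of the vertex is
\[
ie\gamma^\mu\int d^4x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\, e^{i(E'-E)t}A^{cl}_\mu(\mathbf{x}) = ie\gamma^\mu\tilde A^{cl}_\mu(\mathbf{q})(2\pi)\delta(E' - E), \tag{27.149}
\]
where
\[
\mathbf{q} \equiv \mathbf{p}' - \mathbf{p}, \quad \tilde A^{cl}_\mu(\mathbf{q}) \equiv \int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}A^{cl}_\mu(\mathbf{x}) . \tag{27.150}
\]

If an electron is scattered by the background field and its momentum changes from $p$ to $p'$, the
scattering matrix element will be
\[
\langle p'|iT|p\rangle = ie\,\bar u_{s'}(p')\Gamma^\mu(p', p)u_s(p)\,\tilde A^{cl}_\mu(\mathbf{q})(2\pi)\delta(E' - E) . \tag{27.151}
\]

If there is only a magnetic field, we have
\[
A^{cl}_\mu(\mathbf{x}) = (0, A^{cl}_i), \quad \mathbf{B}^{cl} = \nabla\times\mathbf{A}^{cl}, \quad \tilde{\mathbf{B}}^{cl} = i\mathbf{q}\times\tilde{\mathbf{A}}^{cl} . \tag{27.152}
\]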

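Using equations 26.80, 26.82 and 26.91, we can work out that
\[
u(p) = \sqrt{m}\begin{pmatrix} (1 - \mathbf{p}\cdot\boldsymbol{\sigma}/2m)\,\xi \\ (1 + \mathbf{p}\cdot\boldsymbol{\sigma}/2m)\,\xi \end{pmatrix} \tag{27.153}
\]
in the non-relativistic limit. Now we can derive that
\[
\bar u(p')\gamma^iu(p) = 2m\,\xi'^\dagger\left[\frac{p_i + p'_i}{2m} - \frac{i}{2m}\epsilon_{ijk}q_j\sigma_k\right]\xi . \tag{27.154}
\]
The first term in the bracket is spin-independent. The second is the magnetic moment interaction
we are seeking. Thus we retain only the latter term in the following discussion.
We can also derive that
\[
\bar u(p')\left[-\frac{i}{m}S^{i\nu}q_\nu\right]u(p) = 2m\,\xi'^\dagger\left[-\frac{i}{2m}\epsilon_{ijk}q_j\sigma_k\right]\xi . \tag{27.155}
\]
Using equation 27.133, we have
\begin{align*}
\langle p'|iT|p\rangle &= 2ime\,\xi'^\dagger\left[-\frac{i}{2m}\epsilon_{ijk}q_j\sigma_k\left[F_1(0) + F_2(0)\right]\right]\xi\,\tilde A^{cl}_i(\mathbf{q})(2\pi)\delta(E - E') \\
&= i(2m)\,\xi'^\dagger\left\{\frac{e}{m}\left[F_1(0) + F_2(0)\right]\frac{\sigma_k}{2}\right\}\xi\,\tilde B^{cl}_k(\mathbf{q})(2\pi)\delta(E - E') . \tag{27.156}
\end{align*}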
m 2
Recall that in quantum mechanics, the transition matrix given by first-order time-dependent
perturbation theory is
\[
\langle p'|iT|p\rangle = -i\int_0^t dt\, e^{i(E'-E)t}\langle p'|V(\mathbf{x})|p\rangle = -2\pi i\,\delta(E' - E)\tilde V(\mathbf{q}), \tag{27.157}
\]
where $V(\mathbf{x})$ is the perturbing potential applied to free particles.

Notice that in quantum field theory, we normalize the momentum eigenvectors as $\langle p'|p\rangle =
(2\pi)^32E\,\delta(\mathbf{p} - \mathbf{p}')$ rather than $\langle p'|p\rangle = (2\pi)^3\delta(\mathbf{p} - \mathbf{p}')$. So there is an extra factor $2m$ in
equation 27.156 when compared with equation 27.157, resulting in
\[
\tilde V(\mathbf{q}) = -\xi'^\dagger\left\{\frac{e}{m}\left[F_1(0) + F_2(0)\right]\frac{\sigma_k}{2}\right\}\xi\,\tilde B^{cl}_k(\mathbf{q}) . \tag{27.158}
\]
Thus, the magnetic moment of the electron is given by
\[
\boldsymbol{\mu} = g_e\frac{e}{2m}\mathbf{S}, \quad\text{where}\quad g_e = 2F_1(0) + 2F_2(0) = 2 + \frac{\alpha}{\pi} + O(\alpha^2) . \tag{27.159}
\]
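
As a quick numerical illustration of equation 27.159, the one-loop value of $g_e$ follows from $F_1(0) = 1$ and $F_2(0) = \alpha/2\pi$. The sketch below assumes $\alpha \approx 1/137.036$ as an input value; the function names are illustrative and higher-order terms are ignored.

```python
import math

ALPHA = 1.0 / 137.035999  # assumed value of the fine-structure constant


def form_factor_f2_at_zero(alpha):
    """One-loop F_2(0) = alpha / (2 pi)."""
    return alpha / (2.0 * math.pi)


def g_factor_one_loop(alpha):
    """g_e = 2 F_1(0) + 2 F_2(0) with F_1(0) = 1."""
    return 2.0 * 1.0 + 2.0 * form_factor_f2_at_zero(alpha)


g_e = g_factor_one_loop(ALPHA)
print(f"g_e = {g_e:.7f}, anomaly (g_e - 2)/2 = {(g_e - 2.0) / 2.0:.7f}")
```

The anomaly $(g_e - 2)/2$ equals $F_2(0)$ at this order.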

27.7.5 Lamb shift


Let us consider the scattering of electron by proton in non-relativistic limit.


Figure 27.12: Scattering of electron by proton in non-relativistic limit. Here, we have |p| ∼
|p′ | ∼ |q| ≪ me and |k| ∼ |k′ | ≪ mp . We should also keep in mind that me ≪ mp so that
loops formed by proton propagators can be neglected.

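Up to tree level, we have
\[
i\mathcal{M} = \bar u(k')(-ie\gamma^\mu)u(k)\,\frac{-i\eta_{\mu\nu}}{q^2}\,\bar u(p')(ie\gamma^\nu)u(p) \approx -i\,\frac{-e^2}{\mathbf{q}^2}\,2m_e\delta_{ss'}\,2m_p\delta_{rr'} . \tag{27.160}
\]

Comparing equation 27.160 to the Born approximation in quantum mechanics, we find that
\[
\tilde V(\mathbf{q}) = \frac{-e^2}{\mathbf{q}^2}, \quad V(r) = \frac{-e^2}{4\pi r} . \tag{27.161}
\]

So the scattering is described by the Coulomb potential.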

Next let us examine how the electron loop modifies the electromagnetic interaction. Using the
exact photon propagator, we have the modified potential
\[
\tilde V(q) = \frac{-e^2}{q^2\left[1 - \Pi(q^2)\right]}, \tag{27.162}
\]
where
\[
\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\ln\left(\frac{m_e^2}{m_e^2 + x(1-x)q^2}\right) + O(\alpha^2) . \tag{27.163}
\]

Since $|q^2| \ll m_e^2$, we can derive that
\[
\Pi(q^2) = \frac{\alpha}{15\pi}\frac{q^2}{m_e^2}, \quad \tilde V(q) = -\frac{4\pi\alpha}{q^2} - \frac{4\alpha^2}{15m_e^2} . \tag{27.164}
\]

Transforming the potential to position space, we find that
\[
V(r) = -\frac{\alpha}{r} - \frac{4\alpha^2}{15m_e^2}\,\delta^3(\mathbf{x}) . \tag{27.165}
\]
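
The small-$q^2$ expansion in equation 27.164 can be checked against a direct numerical evaluation of the integral in equation 27.163. The sketch below uses Simpson's rule from scratch; the function names, the step count and the sample value $q^2/m_e^2 = 10^{-3}$ are illustrative choices.

```python
import math


def pi_one_loop(q2_over_m2, alpha, n=2000):
    """Equation 27.163 evaluated with Simpson's rule (n must be even)."""
    def integrand(x):
        w = x * (1.0 - x)
        return w * math.log(1.0 / (1.0 + w * q2_over_m2))

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return -(2.0 * alpha / math.pi) * total * h / 3.0


def pi_small_q(q2_over_m2, alpha):
    """Leading term of equation 27.164: alpha q^2 / (15 pi m_e^2)."""
    return alpha * q2_over_m2 / (15.0 * math.pi)


alpha = 1.0 / 137.036      # assumed value, for illustration
ratio = pi_one_loop(1e-3, alpha) / pi_small_q(1e-3, alpha)
print(ratio)
```

The ratio differs from one by a relative amount of order $q^2/m_e^2$, which is the size of the next term in the expansion of the logarithm.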

The correction term indicates that the electromagnetic force becomes much stronger at small
distances. This effect can be measured in the hydrogen atom, where the energy levels are
shifted by
\[
\Delta E = -\frac{4\alpha^2}{15m_e^2}\,|\psi(0)|^2 . \tag{27.166}
\]
The wave function is non-zero at the origin only for s-wave states. For the 2S state, the shift is
about $-1.123\times10^{-7}$ eV. This modified potential causes a splitting of degenerate levels of different
$l$. This is a (small) part of the Lamb shift splitting.
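
The size of the 2S shift can be estimated directly from equation 27.166. The sketch below assumes natural units, the hydrogen result $|\psi_{nS}(0)|^2 = (\alpha m_e)^3/(\pi n^3)$ with the reduced-mass correction ignored, and rounded input values for $\alpha$ and $m_e$; the function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.035999   # assumed fine-structure constant
M_E_EV = 510998.95         # assumed electron mass in eV


def psi_squared_at_origin(n, alpha, mass):
    """|psi_nS(0)|^2 = (alpha m)^3 / (pi n^3) for hydrogen in natural units."""
    return (alpha * mass) ** 3 / (math.pi * n ** 3)


def vacuum_polarization_shift(n, alpha, mass):
    """Equation 27.166: Delta E = -(4 alpha^2 / 15 m^2) |psi(0)|^2."""
    return -4.0 * alpha ** 2 / (15.0 * mass ** 2) * psi_squared_at_origin(n, alpha, mass)


shift_2s = vacuum_polarization_shift(2, ALPHA, M_E_EV)
print(f"Delta E(2S) = {shift_2s:.4e} eV")
```

For $n = 2$ the expression reduces to $-\alpha^5m_e/30\pi$, which lands close to the value quoted above; the small difference depends on the input constants and on whether the reduced mass is used.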
A more precise correction at distances $r \gg m_e^{-1}$ is given by the Uehling potential
\[
\delta V(r) = -\frac{\alpha^2}{4\sqrt{\pi}\,r}\,\frac{e^{-2m_er}}{(m_er)^{3/2}} . \tag{27.167}
\]

Thus the range of the correction is roughly the electron's Compton wavelength, $m_e^{-1}$. Since
hydrogen wave functions are nearly constant on this scale, the delta function was a good approximation.
We can interpret the correction as being due to screening. At $r > m_e^{-1}$, virtual
$e^+e^-$ pairs make the vacuum a dielectric medium in which the apparent charge is less than
the true charge. At smaller distances we begin to penetrate the polarization cloud and see the
bare charge. This phenomenon is known as vacuum polarization.
Chapter 28
Gauge Field

28.1 Nonabelian gauge theory


28.1.1 Nonabelian symmetries
Consider the theory of $N$ real scalar fields $\phi_i$ with
\[
\mathcal{L} = -\frac{1}{2}\partial_\mu\phi_i\partial^\mu\phi_i - \frac{1}{2}m^2\phi_i\phi_i - \frac{1}{16}\lambda(\phi_i\phi_i)^2 . \tag{28.1}
\]
The Lagrangian density is invariant under the SO(N ) transformation

ϕi (x) → Rij ϕj (x), (28.2)

where R is an orthogonal matrix with a positive determinant: R⊺ = R−1 and det R = +1.
Consider an infinitesimal SO(N ) transformation

Rij = δij + θij + O(θ2 ). (28.3)

Orthogonality of $R_{ij}$ implies that $\theta_{ij}$ is real and antisymmetric. It is convenient to express $\theta_{ij}$
in terms of a basis set of hermitian matrices $(T^a)_{ij}$. The index $a$ runs from 1 to $N(N-1)/2$,
the number of linearly independent, hermitian, antisymmetric, $N\times N$ matrices. Commonly,
these matrices obey the normalization condition
\[
\mathrm{Tr}\left(T^aT^b\right) = 2\delta^{ab} . \tag{28.4}
\]

In terms of them, we can write
\[
\theta_{ij} = -i\theta^a(T^a)_{ij}, \tag{28.5}
\]
where $\theta^a$ is a set of $N(N-1)/2$ real, infinitesimal parameters.
The $T^a$s are the generator matrices of SO($N$). The product of any two SO($N$) transformations
is another SO($N$) transformation; this implies that the commutator of any two generator matrices
must be a linear combination of generator matrices,
\[
\left[T^a, T^b\right] = if^{abc}T^c . \tag{28.6}
\]

The numerical factors f abc are the structure coefficients of the group. If f abc = 0, the group is
abelian. Otherwise, it is nonabelian.

If we multiply equation 28.6 on the right by $T^d$, take the trace, and use equation 28.4, we find
\[
f^{abc} = -\frac{i}{2}\mathrm{Tr}\left(\left[T^a, T^b\right]T^c\right) . \tag{28.7}
\]

Using the cyclic property of the trace, we find that $f^{abc}$ must be completely antisymmetric.
Taking the complex conjugate of equation 28.7, we find that $f^{abc}$ must be real.

Example: The simplest nonabelian group is SO(3). In this case, we can choose $(T^a)_{ij} =
-i\epsilon^{aij}$. The commutation relations become
\[
\left[T^a, T^b\right] = i\epsilon^{abc}T^c . \tag{28.8}
\]
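
The SO(3) example can be verified by brute force. The sketch below builds the three matrices $(T^a)_{ij} = -i\epsilon^{aij}$ with plain Python lists and checks both the commutation relations 28.8 and the normalization 28.4; all helper names are illustrative.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def generator(a):
    """3x3 matrix (T^a)_ij = -i * epsilon_aij."""
    return [[-1j * levi_civita(a, i, j) for j in range(3)] for i in range(3)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(3)] for i in range(3)]


def trace(x):
    return sum(x[i][i] for i in range(3))


def so3_relations_hold(tol=1e-12):
    """Check [T^a, T^b] = i eps^{abc} T^c and Tr(T^a T^b) = 2 delta^{ab}."""
    gens = [generator(a) for a in range(3)]
    for a in range(3):
        for b in range(3):
            lhs = commutator(gens[a], gens[b])
            for i in range(3):
                for j in range(3):
                    rhs = sum(1j * levi_civita(a, b, c) * gens[c][i][j]
                              for c in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
            expected_trace = 2.0 if a == b else 0.0
            if abs(trace(matmul(gens[a], gens[b])) - expected_trace) > tol:
                return False
    return True


print(so3_relations_hold())
```

The same matrices also realize the adjoint representation of SU(2), since its structure coefficients are again $\epsilon^{abc}$.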

Consider now the theory of $N$ complex scalar fields $\phi_i$, and a Lagrangian density
\[
\mathcal{L} = -\partial_\mu\phi_i^\dagger\partial^\mu\phi_i - m^2\phi_i^\dagger\phi_i - \frac{1}{4}\lambda(\phi_i^\dagger\phi_i)^2 . \tag{28.9}
\]

This Lagrangian is clearly invariant under the U (N ) transformation

ϕi (x) → Uij ϕj (x), (28.10)

where $U$ is a unitary matrix, $U^\dagger = U^{-1}$. We can write $U_{ij} = e^{-i\theta}\tilde U_{ij}$, where $\theta$ is a real
parameter and $\det\tilde U = 1$; $\tilde U_{ij}$ is called a special unitary matrix. Clearly the product of two
special unitary matrices is another special unitary matrix; the $N\times N$ special unitary matrices
form the group SU($N$). The group U($N$) is the direct product of the group U(1) and the group
SU($N$).

Consider an infinitesimal SU($N$) transformation
\[
\tilde U_{ij} = \delta_{ij} - i\theta^a(T^a)_{ij} + O(\theta^2), \tag{28.11}
\]
where $\theta^a$ is a set of real, infinitesimal parameters. Unitarity of $\tilde U$ implies that the generator
matrices $T^a$ are hermitian, and $\det\tilde U = 1$ implies that each $T^a$ is traceless. The index $a$ runs
from 1 to $N^2 - 1$, the number of linearly independent, hermitian, traceless, $N\times N$ matrices.
We can choose these matrices to obey the normalization condition
\[
\mathrm{Tr}\left(T^aT^b\right) = \frac{1}{2}\delta^{ab} . \tag{28.12}
\]

Example: For SU(2), we can choose $(T^a)_{ij} = (\sigma^a)_{ij}/2$. The commutation relations become
\[
\left[T^a, T^b\right] = i\epsilon^{abc}T^c . \tag{28.13}
\]

28.1.2 Nonabelian gauge theory


Consider a Lagrangian with $N$ scalar or spinor fields $\phi_i(x)$ that is invariant under a continuous
SU($N$) symmetry,
\[
\phi_i(x) \to U_{ij}\phi_j(x) . \tag{28.14}
\]
It is called a global symmetry transformation, because the matrix $U$ does not depend on the
space-time label $x$. If we want to generalize the symmetry to the local transformation
\[
\phi_i(x) \to U_{ij}(x)\phi_j(x), \tag{28.15}
\]
terms with derivatives, such as $\partial^\mu\phi_i^\dagger\partial_\mu\phi_i$, will not remain invariant under the local transformation.
Thus we must include a traceless hermitian $N\times N$ matrix of fields $A_\mu(x)$, and promote
ordinary derivatives $\partial_\mu$ to covariant derivatives $D_\mu = \partial_\mu - igA_\mu$ to ensure that
\[
D_\mu\phi \to UD_\mu\phi . \tag{28.16}
\]

As a result, the gauge field $A_\mu(x)$ must transform as
\[
A_\mu(x) \to U(x)A_\mu(x)U^\dagger(x) + \frac{i}{g}U(x)\partial_\mu U^\dagger(x) . \tag{28.17}
\]

Replacing all ordinary derivatives in $\mathcal{L}$ with covariant derivatives renders $\mathcal{L}$ gauge invariant.
We can write $U(x)$ in terms of the generator matrices as $\exp[-ig\Gamma^a(x)T^a]$. If the structure
constants $f^{abc} \neq 0$, we have a nonabelian gauge theory.

We still need a kinetic term for $A_\mu(x)$. Let us define the field strength as
\[
F_{\mu\nu}(x) \equiv \frac{i}{g}\left[D_\mu, D_\nu\right] = \partial_\mu A_\nu - \partial_\nu A_\mu - ig\left[A_\mu, A_\nu\right] . \tag{28.18}
\]

We can verify that the field strength transforms as
\[
F_{\mu\nu}(x) \to U(x)F_{\mu\nu}(x)U^\dagger(x) . \tag{28.19}
\]

A reasonable kinetic term would be
\[
\mathcal{L}_{kin} = -\frac{1}{2}\mathrm{Tr}\left(F^{\mu\nu}F_{\mu\nu}\right) . \tag{28.20}
\]
Since we have taken $A_\mu$ to be hermitian and traceless, we can expand it in terms of the generator
matrices:
\[
A_\mu(x) = A^a_\mu(x)T^a . \tag{28.21}
\]
Similarly, we have
\[
F_{\mu\nu}(x) = F^a_{\mu\nu}(x)T^a, \quad\text{where}\quad F^c_{\mu\nu} = \partial_\mu A^c_\nu - \partial_\nu A^c_\mu + gf^{abc}A^a_\mu A^b_\nu . \tag{28.22}
\]

The kinetic term now becomes
\[
\mathcal{L}_{kin} = -\frac{1}{4}F^{c\mu\nu}F^c_{\mu\nu} . \tag{28.23}
\]

Everything we have just said about SU($N$) also goes through for SO($N$), with unitary replaced
by orthogonal, and traceless replaced by antisymmetric. There is also another class of compact
nonabelian groups called Sp($2N$), and five exceptional compact groups: G(2), F(4), E(6),
E(7) and E(8). Compact means that $\mathrm{Tr}(T^aT^b)$ is a positive definite matrix. Nonabelian
gauge theory must be based on a compact group, because otherwise some of the terms in $\mathcal{L}_{kin}$
would have the wrong sign, leading to a Hamiltonian that is unbounded below.
As a specific example, let us consider quantum chromodynamics (QCD), which is based on
the gauge group SU(3). There are several Dirac fields corresponding to quarks. Each quark
comes in three colors; these are the values of the SU(3) index. There are also six flavours: up,
down, strange, charm, bottom, and top. Thus we consider the Dirac field $\Psi_{iI}(x)$, where $i$ is
the color index and $I$ is the flavour index. The Lagrangian density is
\[
\mathcal{L} = i\bar\Psi_{iI}\slashed{D}_{ij}\Psi_{jI} - m_I\bar\Psi_I\Psi_I - \frac{1}{2}\mathrm{Tr}\left(F^{\mu\nu}F_{\mu\nu}\right) . \tag{28.24}
\]
The different quark flavours have different masses, ranging from a few MeV for the up and
down quarks to 174 GeV for the top quark. The covariant derivative in equation 28.24 is

Dµij = δij ∂µ − igAaµ (T a )ij . (28.25)

The index a on Aaµ runs from 1 to 8, and the corresponding massless spin-one particles are the
eight gluons.
In a nonabelian gauge theory in general, we can consider scalar or spinor fields in different
representations of the group. A representation of a compact nonabelian group is a set of finite-dimensional
hermitian matrices $T_R^a$ that obey the same commutation relations as the original
generator matrices $T^a$. Given such a set of $D(R)\times D(R)$ matrices, and a field $\phi(x)$ with
$D(R)$ components, we can write its covariant derivative as $D_\mu = \partial_\mu - igA^a_\mu T_R^a$. Under a
gauge transformation, $\phi(x) \to U_R(x)\phi(x)$. The theory will be gauge invariant provided that
\[
A^c_\mu \to A^c_\mu + g\theta^aA^b_\mu f^{abc} - \partial_\mu\theta^c \tag{28.26}
\]
under an infinitesimal transformation, which is independent of the representation.

28.1.3 Group representations


Given the structure coefficients $f^{abc}$ of a compact nonabelian group, a representation of that
group is specified by a set of $D(R)\times D(R)$ traceless hermitian matrices $T_R^a$ that obey the same
commutation relations as the original generator matrices $T^a$. The number $D(R)$ is the dimension
of the representation. The original $T^a$s correspond to the fundamental or defining
representation. If $T_R^a$ is a representation of the group, $T_{\bar R}^a = -(T_R^a)^*$ will also be a representation,
called the complex conjugate representation of $T_R^a$.
• If $T_R^a = -(T_R^a)^*$, or if we can find a unitary transformation $T_R^a \to U^{-1}T_R^aU$ that makes
$-(T_R^a)^* = T_R^a$ for every $a$, the representation $R$ is real.
• If such a unitary transformation does not exist, but we can find a unitary matrix $V \neq I$
such that $V^{-1}T_R^aV = -(T_R^a)^*$ for every $a$, the representation $R$ is pseudoreal.
• If such a unitary matrix also does not exist, the representation $R$ is complex.
• One way to prove that a representation is complex is to show that at least one generator
matrix $T_R^a$ (or a real linear combination of them) has eigenvalues that do not come in
plus-minus pairs.
Example:
• The fundamental representation of SO($N$) is real.
• The fundamental representation of SU(2) is pseudoreal.
• The fundamental representation of SU($N$) with $N \geq 3$ is complex.

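Notice that
\[
\mathrm{Tr}\, T^e\left( \left[\left[T^a, T^b\right], T^c\right] + \left[\left[T^b, T^c\right], T^a\right] + \left[\left[T^c, T^a\right], T^b\right] \right) = 0 . \tag{28.27}
\]

We can derive the Jacobi identity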

f abd f dce + f bcd f dae + f cad f dbe = 0. (28.28)

It can be rearranged as

(−if abd )(−if cde ) − (−if cbd )(−if ade ) = if acd (−if dbe ). (28.29)

Define
(TAa )bc ≡ −if abc . (28.30)
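Equation 28.29 is precisely the commutation relation $[T_A^a, T_A^c] = if^{acd}T_A^d$, so $T_A^a$ is a new representation of the group, called the adjoint representation. Its dimension is equal to the number of generators. Because the structure constants are real, $-(T_A^a)^* = T_A^a$, and so the adjoint representation is real.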
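As a concrete check of these statements, the following sketch builds the adjoint generators for SU(2), assuming the standard structure constants $f^{abc} = \epsilon^{abc}$ (an assumption made here for illustration). It verifies the commutation relations and the reality condition, and reads off the trace $\mathrm{Tr}\,T_A^1T_A^1$.

```python
def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def close(x, y, tol=1e-12):
    """Entrywise comparison within a tolerance."""
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))


# Adjoint generators (T_A^a)^{bc} = -i f^{abc}, with f^{abc} = eps^{abc} for SU(2).
T_ADJ = [[[-1j * eps(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]


def obeys_algebra(gens):
    """Check [T^a, T^b] = i f^{abc} T^c for all a, b."""
    for a in range(3):
        for b in range(3):
            ab, ba = matmul(gens[a], gens[b]), matmul(gens[b], gens[a])
            lhs = [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]
            rhs = [[sum(1j * eps(a, b, c) * gens[c][i][j] for c in range(3))
                    for j in range(3)] for i in range(3)]
            if not close(lhs, rhs):
                return False
    return True


def is_real(gens):
    """Check -(T^a)^* = T^a for every generator."""
    return all(close(t, [[-x.conjugate() for x in row] for row in t]) for t in gens)


def trace_of_square(gens):
    """Tr(T^1 T^1), which equals the index of the representation."""
    prod = matmul(gens[0], gens[0])
    return sum(prod[i][i] for i in range(3)).real


print(obeys_algebra(T_ADJ), is_real(T_ADJ), trace_of_square(T_ADJ))
```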
Two related numbers usefully characterize a representation: the index $T(R)$ and the quadratic Casimir $C(R)$. The index is defined via

Tr TRa TRb ≡ T (R)δ ab . (28.31)

The matrix TRa TRa commutes with every generator, and so must be a number times the identity
matrix. This number is the quadratic Casimir C(R). It is easy to show that

T (R)D(A) = C(R)D(R). (28.32)

With the standard normalization conventions for the generators, we have T (N) = 1/2 for the
fundamental representation of SU(N ) and T (N) = 2 for the fundamental representation of
SO(N ). Using equation 28.32, it follows that C(N) = (N 2 − 1)/2N for SU(N ) and C(N) =
N − 1 for SO(N ).
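The arithmetic behind these two Casimir values can be sketched as follows. The function names are illustrative, and the adjoint dimensions $D(A) = N^2-1$ for SU($N$) and $D(A) = N(N-1)/2$ for SO($N$) are standard facts assumed here rather than quoted from the text.

```python
from fractions import Fraction


def casimir(index, dim_adjoint, dim_rep):
    """Solve T(R) D(A) = C(R) D(R) (equation 28.32) for C(R)."""
    return Fraction(index) * dim_adjoint / dim_rep


def casimir_su_fundamental(n):
    """C(N) for SU(n): T(N) = 1/2, D(A) = n^2 - 1, D(N) = n."""
    return casimir(Fraction(1, 2), n * n - 1, n)


def casimir_so_fundamental(n):
    """C(N) for SO(n): T(N) = 2, D(A) = n(n-1)/2, D(N) = n."""
    return casimir(2, n * (n - 1) // 2, n)


print(casimir_su_fundamental(3))  # SU(3) fundamental
print(casimir_so_fundamental(5))  # SO(5) fundamental
```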
A representation R is reducible if there is a unitary transformation TRa → U −1 TRa U that puts
all the nonzero entries into the same diagonal blocks for each a; otherwise it is irreducible.
Consider a reducible representation R whose generators can be put into two blocks, with the
blocks forming the generators of the irreducible representations R1 and R2 . Then R is the
direct sum representation R = R1 ⊕ R2 , and we have

D(R1 ⊕ R2 ) = D(R1 ) + D(R2 ), T (R1 ⊕ R2 ) = T (R1 ) + T (R2 ). (28.33)

Suppose we have a field ϕiI that carries two group indices, one for the representation R1 and
one for the representation R2 , denoted by i and I respectively. This field is in the direct product
representation R1 ⊗ R2 . The corresponding generator matrix is

(TRa1 ⊗R2 )iI,jJ = (TRa1 )ij δIJ + δij (TRa2 )IJ . (28.34)

We then have

D(R1 ⊗ R2 ) = D(R1 )D(R2 ), T (R1 ⊗ R2 ) = T (R1 )D(R2 ) + T (R2 )D(R1 ). (28.35)

Consider a field $\phi$ in the complex representation $R$. We will adopt the convention that such a field carries a down index: $\phi_i$, where $i = 1,\cdots,D(R)$. Hermitian conjugation changes the representation from $R$ to $\bar R$, and we will adopt the convention that this also raises the index on the field, $(\phi_i)^\dagger = \phi^{\dagger i}$. Thus a down index corresponds to the representation $R$, and an up index to $\bar R$. Indices can be contracted only if one is up and one is down. Generator matrices for $R$ are then written with the first index down and the second index up: $(T_R^a)_i{}^j$. An infinitesimal group transformation of $\phi_i$ takes the form

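$$\phi_i \to \phi_i - i\theta^a (T_R^a)_i{}^j \phi_j. \tag{28.36}$$

An infinitesimal group transformation of $\phi^{\dagger i}$ takes the form

$$\phi^{\dagger i} \to \phi^{\dagger i} - i\theta^a (T_{\bar R}^a)^i{}_j \phi^{\dagger j} = \phi^{\dagger i} + i\theta^a (T_R^a)_j{}^i \phi^{\dagger j}. \tag{28.37}$$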

Consider the Kronecker delta symbol with one index down and one up: $\delta_i{}^j$. Under a group transformation, we have

$$\delta_i{}^j \to (1 + i\theta^a T_R^a)_i{}^k (1 + i\theta^a T_{\bar R}^a)^j{}_l\, \delta_k{}^l = \delta_i{}^j + O(\theta^2). \tag{28.38}$$

So $\delta_i{}^j$ is an invariant symbol of the group. The existence of this invariant symbol, which carries one index for $R$ and one for $\bar R$, tells us that the product of the representations $R$ and $\bar R$ must contain the singlet representation $\mathbf 1$, specified by $T_{\mathbf 1}^a = 0$. We therefore can write

$$R \otimes \bar R = \mathbf 1 \oplus \cdots \tag{28.39}$$

The generator matrix $(T_R^a)_i{}^j$, which carries one index for $R$, one for $\bar R$, and one for the adjoint representation $A$, is also an invariant symbol, which implies that

$$R \otimes \bar R \otimes A = \mathbf 1 \oplus \cdots \tag{28.40}$$

If we now multiply both sides by $A$, and use $A \otimes A = \mathbf 1 \oplus \cdots$ (since $A$ is real), we find $R \otimes \bar R = A \oplus \cdots$. Finally, we have

$$R \otimes \bar R = \mathbf 1 \oplus A \oplus \cdots \tag{28.41}$$

That is, the product of a representation with its complex conjugate is always reducible into a
sum that includes at least the singlet and adjoint representations. Notably, for the fundamental
representation of SU(N ), we have

$$\mathbf N \otimes \bar{\mathbf N} = \mathbf 1 \oplus A. \tag{28.42}$$

Using equations 28.33 and 28.35, we can find that T (A) = N .
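A short worked version of this step may help. It is a sketch that assumes $T(\bar{\mathbf N}) = T(\mathbf N) = 1/2$ (the conjugate generators have the same traces) and $T(\mathbf 1) = 0$ (because $T_{\mathbf 1}^a = 0$); neither assumption is spelled out above.

```latex
\begin{align*}
D(\mathbf N\otimes\bar{\mathbf N}) &= N^2 = 1 + (N^2-1) = D(\mathbf 1) + D(A), \\
T(\mathbf N\otimes\bar{\mathbf N}) &= T(\mathbf N)D(\bar{\mathbf N}) + T(\bar{\mathbf N})D(\mathbf N)
  = \tfrac12 N + \tfrac12 N = N, \\
T(\mathbf 1\oplus A) &= T(\mathbf 1) + T(A) = T(A), \\
\Rightarrow\quad T(A) &= N .
\end{align*}
```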


Consider now a real representation $R$. It follows from equation 28.41 and $\bar R = R$ that

R ⊗ R = 1 ⊕ A ⊕ ··· (28.43)

The singlet on the right-hand side implies the existence of an invariant symbol with two R
indices; this symbol is the Kronecker delta δij . The fact that δij = δji implies that the singlet
on the right-hand side of the equation above appears in the symmetric part of this product of
two identical representations. Remarkably, for the fundamental representation of SO(N ), we
have
N ⊗ N = 1S ⊕ AA ⊕ SS . (28.44)
The representation S corresponds to a field with a symmetric traceless pair of fundamental
indices.
Consider now a pseudoreal representation R. Since R is equivalent to its complex conjugate,
up to a change of basis, equation 28.43 still holds. However, we cannot identify δij as the
corresponding invariant symbol, because then R would have to be real, rather than pseudoreal.
From the perspective of the direct product, the only alternative is to have the singlet appear
in the antisymmetric part of the product, rather than the symmetric part. The corresponding
invariant symbol must then be antisymmetric on exchange of its two R indices.
An example is the fundamental representation of SU(2). For SU($N$) in general, another invariant symbol is the Levi-Civita tensor $\epsilon^{i_1\cdots i_N}$, which carries $N$ fundamental indices and is completely antisymmetric. For SU(2), the Levi-Civita symbol is $\epsilon^{ij} = -\epsilon^{ji}$; this is the two-index invariant symbol that corresponds to the singlet in the product $\mathbf 2\otimes\mathbf 2 = \mathbf 1_A \oplus \mathbf 3_S$, where $\mathbf 3$ is the adjoint representation.
The structure constants f abc are another invariant symbol. This follows from (TAa )bc = −if abc ,
since we have seen that generator matrices in any representation are invariant. Alternatively,
given the generator matrices in a representation $R$, we can write
$$T(R)\,f^{abc} = -i\,\mathrm{Tr}\left(T_R^a\,[T_R^b, T_R^c]\right). \tag{28.45}$$

Since the right-hand side is invariant, the left-hand side must be as well. If we use an anticommutator in place of the commutator, we get another invariant symbol,
$$A(R)\,d^{abc} = \tfrac12\,\mathrm{Tr}\left(T_R^a\,\{T_R^b, T_R^c\}\right), \tag{28.46}$$
where $A(R)$ is the anomaly coefficient of the representation. The cyclic property of the trace implies that $d^{abc}$ is symmetric on exchange of any pair of indices. Using $(T_{\bar R}^a)^i{}_j = -(T_R^a)_j{}^i$, we can see that
$$A(\bar R) = -A(R). \tag{28.47}$$

Thus, if R is real or pseudoreal, A(R) = 0. We also have

A(R1 ⊕ R2 ) = A(R1 ) + A(R2 ), A(R1 ⊗ R2 ) = A(R1 )D(R2 ) + A(R2 )D(R1 ). (28.48)

We normalize the anomaly coefficient so that it equals one for the smallest complex represen-
tation. In particular, for SU(N ) with N ≥ 3, the smallest complex representation is the fun-
damental, and A(N) = 1. For SU(2), all representations are real or pseudoreal, and A(R) = 0
for all of them.

28.2 Quantization of nonabelian gauge theory


28.2.1 The path integral for nonabelian gauge theory
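We wish to evaluate the path integral for nonabelian gauge theory:
$$Z[0] = \int \mathcal{D}A\, e^{iS_{\rm YM}} = \int \mathcal{D}A\, e^{i\int d^4x\, \mathcal{L}_{\rm YM}} \quad\text{where}\quad \mathcal{L}_{\rm YM} \equiv -\tfrac14 F^{a\mu\nu}F^a_{\mu\nu}. \tag{28.49}$$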
After an infinitesimal gauge transformation θ → θ + dθ, the gauge field would be

Aaµ (θ + dθ) = Aaµ (θ) − Dµac (θ) dθc where Dµac ≡ δ ac ∂µ − igAbµ (TAb )ac . (28.50)

Introducing a gauge fixing function

Ga [A(θ)] ≡ ∂ µ Aaµ (θ) − ω a (x), (28.51)

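we have
$$1 = \int \mathcal{D}\theta\, \delta(G)\, \det\!\left(\frac{\delta G}{\delta\theta}\right) \quad\text{where}\quad \frac{\delta G^a(x)}{\delta\theta^b(y)} = -\partial^\mu D_\mu^{ab}(\theta)\,\delta^4(x-y). \tag{28.52}$$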
As shown in section 26.10, a functional determinant can be written as a path integral over
complex Grassmann variables. Let us introduce the complex Grassmann field ca (x), and its
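hermitian conjugate $\bar c^a(x)$, called Faddeev–Popov ghosts. Then we can write
$$\det\!\left(\frac{\delta G}{\delta\theta}\right) \propto \int \mathcal{D}c\,\mathcal{D}\bar c\; e^{iS_{\rm gh}} = \int \mathcal{D}c\,\mathcal{D}\bar c\; e^{i\int d^4x\,\mathcal{L}_{\rm gh}[A(\theta)]}, \tag{28.53}$$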
where
Lgh [A(θ)] ≡ c̄a ∂ µ Dµab (θ)cb = −∂ µ c̄a ∂µ ca + gf abc Acµ (θ)∂ µ c̄a cb . (28.54)
We see that ca (x) has the standard kinetic term for a complex scalar field. The ghost field is
also a Grassmann field, and so a closed loop of ghost lines in a Feynman diagram carries an
extra factor of minus one. Since the particles associated with the ghost field do not in fact exist
(and would violate the spin-statistics theorem if they did), it must be that the amplitude to
produce them in any scattering process is zero.
Combining equations 28.49, 28.52 and 28.53, the path integral for nonabelian gauge theory becomes
$$Z[0] \propto \int\mathcal{D}\theta\,\delta[G(\theta)] \int\mathcal{D}c\,\mathcal{D}\bar c\; e^{iS_{\rm gh}(\theta)} \int\mathcal{D}A\; e^{iS_{\rm YM}}. \tag{28.55}$$

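We can change the integration variable $A$ in equation 28.55 to $A(\theta)$; the measure is invariant under gauge transformations, $\mathcal{D}A = \mathcal{D}A(\theta)$, and by gauge invariance $S_{\rm YM}[A] = S_{\rm YM}[A(\theta)]$. Since $A(\theta)$ is now just a dummy integration variable, we can rename it back to $A$. Equation 28.55 then becomes
$$Z[0] \propto \int\mathcal{D}\theta \int \mathcal{D}c\,\mathcal{D}\bar c\,\mathcal{D}A\; e^{iS_{\rm YM}+iS_{\rm gh}}\,\delta[\partial^\mu A^a_\mu - \omega^a(x)]. \tag{28.56}$$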

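Since equation 28.56 holds for any $\omega(x)$, we can average over $\omega$ with a gaussian weight and absorb the field-independent factor $\int\mathcal{D}\theta$ into the normalization:
$$\begin{aligned} Z[0] &\propto \int\mathcal{D}\omega\, \exp\left(-\frac{i}{2\xi}\int d^4x\,\omega^a\omega^a\right) \int \mathcal{D}c\,\mathcal{D}\bar c\,\mathcal{D}A\; e^{iS_{\rm YM}+iS_{\rm gh}}\,\delta[\partial^\mu A^a_\mu - \omega^a(x)] \\ &= \int \mathcal{D}c\,\mathcal{D}\bar c\,\mathcal{D}A\; e^{iS_{\rm YM}+iS_{\rm gh}+iS_{\rm gf}} \quad\text{where}\quad S_{\rm gf} = \int d^4x\,\mathcal{L}_{\rm gf} \equiv \int d^4x\left(-\frac{1}{2\xi}\right)\partial^\mu A^a_\mu\,\partial^\nu A^a_\nu. \end{aligned} \tag{28.57}$$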

28.2.2 The Feynman rules for nonabelian gauge theory


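The Yang-Mills term in the Lagrangian density can be expanded into
$$\mathcal{L}_{\rm YM} = -\tfrac12\partial^\mu A^{a\nu}\partial_\mu A^a_\nu + \tfrac12\partial^\mu A^{a\nu}\partial_\nu A^a_\mu - g f^{abe}A^{a\mu}A^{b\nu}\partial_\mu A^e_\nu - \tfrac14 g^2 f^{abe}f^{cde}A^{a\mu}A^{b\nu}A^c_\mu A^d_\nu. \tag{28.58}$$
Adding the gauge-fixing term and doing some integrations by parts in the quadratic terms, we find
$$\mathcal{L}_{\rm YM}+\mathcal{L}_{\rm gf} = \tfrac12 A^{e\mu}\left[\eta_{\mu\nu}\partial^2 - \left(1-\tfrac1\xi\right)\partial_\mu\partial_\nu\right]A^{e\nu} - g f^{abe}A^{a\mu}A^{b\nu}\partial_\mu A^e_\nu - \tfrac14 g^2 f^{abe}f^{cde}A^{a\mu}A^{b\nu}A^c_\mu A^d_\nu. \tag{28.59}$$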
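Inverting the quadratic operator in momentum space, the gluon propagator is
$$G_F^{ab}(k)_{\mu\nu} = \frac{-i\delta^{ab}}{k^2 - i\epsilon}\left[\eta_{\mu\nu} - (1-\xi)\frac{k_\mu k_\nu}{k^2}\right]. \tag{28.60}$$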

Figure 28.1: The three-gluon and four-gluon vertices in nonabelian gauge theory.

The three-gluon and four-gluon vertex factors, as shown in Figure 28.1, are given by
$$iV^{abc}_{\mu\nu\rho}(p,q,r) = g f^{abc}\left[(q-r)_\mu g_{\nu\rho} + (r-p)_\nu g_{\rho\mu} + (p-q)_\rho g_{\mu\nu}\right] \tag{28.61}$$
and
$$iV^{abcd}_{\mu\nu\rho\sigma} = -ig^2\left[f^{abe}f^{cde}(g_{\mu\rho}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\rho}) + f^{ace}f^{dbe}(g_{\mu\sigma}g_{\rho\nu} - g_{\mu\nu}g_{\rho\sigma}) + f^{ade}f^{bce}(g_{\mu\nu}g_{\sigma\rho} - g_{\mu\rho}g_{\sigma\nu})\right]. \tag{28.62}$$

For loop calculations, we need to include the ghosts, with Lagrangian density

Lgh = −∂ µ c̄c ∂µ cc + gf abc Aaµ ∂ µ c̄b cc . (28.63)

The ghost propagator is
$$D_F^{ab}(k) = \frac{-i\delta^{ab}}{k^2 - i\epsilon}. \tag{28.64}$$
Because the ghosts are complex scalars, their propagators carry a charge arrow.

The ghost-ghost-gluon vertex factor, as shown in Figure 28.2(a), is given by

iVµabc (q, r) = igf abc (−iqµ ) = gf abc qµ . (28.65)

Figure 28.2: (a) The ghost-ghost-gluon vertex in nonabelian gauge theory; (b) The quark-
quark-gluon vertex in nonabelian gauge theory.

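If we include a quark coupled to the gluons, we have the quark Lagrangian density
$$\mathcal{L}_{\rm quark} = i\bar\Psi_i \slashed{D}_{ij}\Psi_j - m\bar\Psi_i\Psi_i = i\bar\Psi_i\slashed{\partial}\Psi_i - m\bar\Psi_i\Psi_i + gA^a_\mu\bar\Psi_i\gamma^\mu (T^a)_{ij}\Psi_j. \tag{28.66}$$
The quark propagator is
$$S_F(p)_{ij} = \frac{i(\slashed{p} - m)\delta_{ij}}{p^2+m^2-i\epsilon}. \tag{28.67}$$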
The quark-quark-gluon vertex factor, as shown in Figure 28.2(b), is given by

iVijµa = igγ µ (T a )ij . (28.68)

28.3 Renormalization of nonabelian gauge theory


Rescaling the fields to the renormalized field strengths by extracting the factors Z2 , Z3 and
Z2c for the fermions, gauge bosons, and ghosts, and shifting the coupling to the renormalized
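coupling $g$, the counterterm Lagrangian density then takes the form
$$\begin{aligned}\mathcal{L}_{\rm ct} ={}& -\tfrac14\delta_3 F^{a\mu\nu}F^a_{\mu\nu} + \bar\Psi(i\delta_2\slashed{\partial} - \delta_m)\Psi + \delta_2^c\,\bar c^a\partial^2 c^a \\ & + g\delta_1 A^a_\mu\bar\Psi\gamma^\mu T^a\Psi - g\delta_1^{3g} f^{abe}A^{a\mu}A^{b\nu}\partial_\mu A^e_\nu - \tfrac14 g^2\delta_1^{4g} f^{abe}f^{cde}A^{a\mu}A^{b\nu}A^c_\mu A^d_\nu \\ & + g\delta_1^c f^{abc}A^a_\mu\partial^\mu\bar c^b c^c, \end{aligned}\tag{28.69}$$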

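with the counterterms determined by
$$\begin{gathered}\delta_2 = Z_2 - 1,\quad \delta_3 = Z_3 - 1,\quad \delta_2^c = Z_2^c - 1,\quad \delta_m = Z_2 m_0 - m,\\ \delta_1 = \frac{g_0}{g}Z_2(Z_3)^{1/2} - 1,\quad \delta_1^{3g} = \frac{g_0}{g}(Z_3)^{3/2} - 1,\quad \delta_1^{4g} = \frac{g_0^2}{g^2}(Z_3)^2 - 1,\quad \delta_1^c = \frac{g_0}{g}Z_2^c(Z_3)^{1/2} - 1.\end{gathered}\tag{28.70}$$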

Notice that these eight counterterms depend on five underlying parameters ($Z_2$, $Z_3$, $Z_2^c$, $m_0$ and $g_0$); thus, there are three relations among them. The situation is very similar to that for the scalar theories with spontaneously broken symmetry that we studied before. The underlying symmetry of the theory, local gauge invariance, implies relations among the divergent amplitudes of the theory and among the counterterms required to cancel them. In the present case, a set of five renormalization conditions uniquely specifies all of the counterterms in a way that removes all divergences from the theory. The rigorous proof is omitted here.

Quark propagator
The quantum corrections to the quark propagator up to one-loop order are represented by Figure 28.3.

Figure 28.3: The one-loop and counterterm corrections to the quark propagator.

In the $\overline{\rm MS}$ renormalization scheme and Feynman gauge, we can work out that
$$Z_2 = 1 - C(R)\frac{g^2}{8\pi^2}\frac1\epsilon + O(g^4), \qquad Z_m = 1 - C(R)\frac{g^2}{2\pi^2}\frac1\epsilon + O(g^4). \tag{28.71}$$

Quark-quark-gluon vertex
The quantum corrections to the quark-quark-gluon vertex up to one-loop order are described by Figure 28.4.

Figure 28.4: The one-loop and counterterm corrections to the quark-quark-gluon vertex.

Imposing the $\overline{\rm MS}$ renormalization scheme and Feynman gauge, we find that
$$Z_1 = 1 - [C(R) + T(A)]\frac{g^2}{8\pi^2}\frac1\epsilon + O(g^4). \tag{28.72}$$

Figure 28.5: The one-loop and counterterm corrections to the gluon propagator.

Gluon propagator

The quantum corrections to the gluon propagator up to one-loop order are shown in Figure 28.5.

Using the $\overline{\rm MS}$ renormalization scheme and Feynman gauge, we get
$$Z_3 = 1 + \left[\tfrac53 T(A) - \tfrac43 n_F T(R)\right]\frac{g^2}{8\pi^2}\frac1\epsilon + O(g^4). \tag{28.73}$$

Note: The term $5T(A)/3$ in the square bracket of equation 28.73 comes from the gluon loop and the ghost loop in Figure 28.5, while the term $4n_FT(R)/3$ comes from the quark loop.

Beta function

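We define $\alpha \equiv g^2/4\pi$. Then we have
$$\alpha_0 = \frac{Z_1^2}{Z_2^2 Z_3}\,\alpha\,\tilde\mu^\epsilon. \tag{28.74}$$
Let us write
$$\ln\left(Z_3^{-1}Z_2^{-2}Z_1^2\right) = \sum_{n=1}^\infty \frac{G_n(\alpha)}{\epsilon^n}. \tag{28.75}$$
Then we have
$$\ln\alpha_0 = \sum_{n=1}^\infty \frac{G_n(\alpha)}{\epsilon^n} + \ln\alpha + \epsilon\ln\tilde\mu. \tag{28.76}$$
From equations 28.71, 28.72 and 28.73, we get
$$G_1(\alpha) = -\left[\tfrac{11}{3}T(A) - \tfrac43 n_F T(R)\right]\frac{\alpha}{2\pi} + O(\alpha^2). \tag{28.77}$$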

Since $d\alpha_0/d\mu = 0$ and $d\alpha/d\ln\mu$ should be finite in the $\epsilon\to0$ limit, it can be derived that
$$\beta(\alpha) \equiv \frac{d\alpha}{d\ln\mu} = -\epsilon\alpha + \alpha^2 G_1'(\alpha) = -\epsilon\alpha - \left[\tfrac{11}{3}T(A) - \tfrac43 n_F T(R)\right]\frac{\alpha^2}{2\pi} + O(\alpha^3). \tag{28.78}$$

The gauge group for quantum chromodynamics is SU(3), and quarks are in its fundamental
representation. Thus we have T (A) = 3 and T (R) = 1/2, and the factor in square brackets
is 11 − 2nF /3. As long as nF ≤ 16, the beta function will be negative: the gauge coupling in
quantum chromodynamics gets weaker at high energies, and stronger at low energies.
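The flavor counting can be checked with a few lines of exact arithmetic. This is a sketch only: the function names are illustrative, and the inputs $T(A) = 3$, $T(R) = 1/2$ are the QCD values quoted above.

```python
from fractions import Fraction


def beta_bracket(t_adjoint, t_rep, n_flavors):
    """The factor [11/3 T(A) - 4/3 n_F T(R)] multiplying -alpha^2/(2 pi) in equation 28.78."""
    return Fraction(11, 3) * t_adjoint - Fraction(4, 3) * n_flavors * Fraction(t_rep)


def qcd_bracket(n_flavors):
    """Specialize to SU(3) with quarks in the fundamental representation."""
    return beta_bracket(3, Fraction(1, 2), n_flavors)


def is_asymptotically_free(n_flavors):
    """The one-loop beta function (at epsilon = 0) is negative when the bracket is positive."""
    return qcd_bracket(n_flavors) > 0


for n_f in (6, 16, 17):
    print(n_f, qcd_bracket(n_f), is_asymptotically_free(n_f))
```

With six flavors the bracket equals 7; it stays positive up to sixteen flavors and changes sign at seventeen.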
This has dramatic physical consequences. Perturbation theory cannot serve as a reliable guide
to the low-energy physics. And indeed, in nature we do not see isolated quarks or gluons. The
appropriate conclusion is that color is confined: all finite-energy states are invariant under
a global SU(3) transformation. This has not yet been rigorously proven, but it is the only
hypothesis that is consistent with all of the available theoretical and experimental information.
The detailed calculations omitted in this section can be found in section 73 of Quantum Field Theory (Mark Srednicki).

28.4 Chiral gauge theories and anomalies


28.4.1 Anomalies in local symmetries
A Dirac field $\Psi$ can be written in terms of two left-handed Weyl fields $\chi$ and $\xi$ as
$$\Psi = \begin{pmatrix}\chi \\ \xi^\dagger\end{pmatrix}. \tag{28.79}$$
If $\Psi$ is in a representation $R$ of the gauge group, then $\chi$ and $\xi^\dagger$ must be as well. Equivalently, $\chi$ must be in the representation $R$, and $\xi$ must be in the complex conjugate representation $\bar R$.
Now suppose that we have a single left-handed Weyl field $\psi$ in a complex representation $R$. Such a gauge theory is said to be chiral. It is automatically parity violating, because the right-handed hermitian conjugate of the left-handed Weyl field is in an inequivalent representation of the gauge group. The Lagrangian density is
$$\mathcal{L} = i\psi^\dagger\bar\sigma^\mu D_\mu\psi - \tfrac14 F^{a\mu\nu}F^a_{\mu\nu}, \tag{28.80}$$
where Dµ = ∂µ − igAaµ TRa . Since TRa is a hermitian matrix, iψ † σ̄ µ Dµ ψ is hermitian up to a
total divergence. We cannot include a Majorana mass term for ψ, because ψψ transforms as
R ⊗ R, and R ⊗ R does not contain a singlet if R is complex. Thus, ψψ is not gauge invariant.
Consider a U(1) theory with a single Weyl field $\psi$ with charge $+1$. The Lagrangian density can be written as
$$\mathcal{L} = i\bar\Psi\gamma^\mu(\partial_\mu - igA_\mu)P_L\Psi - \tfrac14F^{\mu\nu}F_{\mu\nu} \quad\text{where}\quad P_L\Psi = \begin{pmatrix}\psi\\0\end{pmatrix}. \tag{28.81}$$
We can easily read the Feynman rules from the Lagrangian density. In particular, the fermion propagator in momentum space is $iP_L\slashed{p}/p^2$ and the fermion-fermion-photon vertex is $ig\gamma^\mu P_L$.
Now consider the correction to the photon propagator. It can be shown that at the one-loop
level, the contribution to Πµν (k) of a single charged Weyl field is half that of a Dirac field. This
is physically reasonable, since a Dirac field is equivalent to two charged Weyl fields.
Nothing interesting happens in the one-loop and counterterm corrections to the fermion
propagator, or the fermion-fermion-photon vertex. There is simply an extra factor of PL along
the fermion line, which can be moved to the far right. Except for this factor, the results exactly
duplicate those of spinor electrodynamics. All of this implies that a single Weyl field makes
half the contribution of a Dirac field to the leading term in the beta function for the gauge
coupling.

Next we turn to diagrams with three external photons, and no external fermions, shown in
Figure 28.6. In spinor electrodynamics, the fact that the vector potential is odd under charge
conjugation implies that the sum of these diagrams must vanish. For the present case of a
single Weyl field, there is no charge-conjugation symmetry, and so we must evaluate these
diagrams.

Figure 28.6: One-loop contributions to the three-photon vertex.

The second diagram in Figure 28.6 is the same as the first, with $p\leftrightarrow q$ and $\mu\leftrightarrow\nu$. Thus we have
$$iV^{\mu\nu\rho}(p,q,r) = (-1)(ig)^3\int\frac{d^4l}{(2\pi)^4}\,\frac{i^3 N^{\mu\nu\rho}}{(l-p)^2\,l^2\,(l+q)^2} + (p,\mu\leftrightarrow q,\nu) + O(g^5), \tag{28.82}$$
where
$$N^{\mu\nu\rho} = \mathrm{Tr}\left[(\slashed{l} - \slashed{p})\gamma^\mu\,\slashed{l}\,\gamma^\nu(\slashed{l}+\slashed{q})\gamma^\rho P_L\right]. \tag{28.83}$$
The term in equation 28.83 with $P_L\to1/2$ simply yields half the result that we get in spinor electrodynamics with a Dirac field, which gives a vanishing contribution to $V^{\mu\nu\rho}(p,q,r)$. Hence, we can make the replacement $P_L\to-\gamma_5/2$ in equation 28.83.

After a lengthy calculation, we find that
$$r_\rho V^{\mu\nu\rho}(p,q,r) = 0,\qquad p_\mu V^{\mu\nu\rho}(p,q,r) = \frac{ig^3}{8\pi^2}\epsilon^{\alpha\nu\beta\rho}p_\alpha q_\beta,\qquad q_\nu V^{\mu\nu\rho}(p,q,r) = \frac{ig^3}{8\pi^2}\epsilon^{\alpha\rho\beta\mu}q_\alpha p_\beta. \tag{28.84}$$
Thus, the three-photon vertex is not gauge invariant.

Equations 28.84 also show that the three-photon vertex does not exhibit the expected symmetry among the external lines. The reason is that the integral in equation 28.82 is linearly divergent, so shifting the loop momentum changes its value. Shifting the loop momentum appropriately can restore the symmetry among the external lines, but the anomaly cannot be eliminated completely in any regularization scheme.

Consider now a U(1) gauge theory with several left-handed Weyl fields $\psi_i$, with charges $Q_i$, so that the covariant derivative of $\psi_i$ is $\partial_\mu - igQ_iA_\mu$. Then each of these fields circulates in the loop in Figure 28.6, and each vertex has an extra factor of $Q_i$. The right-hand sides of equations 28.84 are now multiplied by $\sum_i Q_i^3$. If $\sum_i Q_i^3$ happens to be zero, then gauge invariance is restored. The simplest possibility is to have the $\psi$s come in pairs with equal and opposite charges.
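The cancellation condition is simple enough to check by hand, but a small sketch makes the bookkeeping explicit. The charge assignments below are illustrative examples chosen here, not field contents taken from the notes.

```python
from fractions import Fraction


def cubic_anomaly(charges):
    """Sum of Q_i^3 over the left-handed Weyl fields of a U(1) gauge theory."""
    return sum(Fraction(q) ** 3 for q in charges)


single_field = [1]                   # one Weyl field of charge +1
opposite_pair = [1, -1]              # equal and opposite charges
unpaired_solution = [1] * 8 + [-2]   # eight fields of charge +1, one of charge -2

for name, charges in [("single", single_field),
                      ("pair", opposite_pair),
                      ("unpaired", unpaired_solution)]:
    print(name, cubic_anomaly(charges))
```

The last example shows that opposite-charge pairs are the simplest way to make the sum vanish, but not the only one: $8\cdot1^3 + (-2)^3 = 0$.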

All of this has a straightforward generalization to nonabelian gauge theories. Suppose we have a single Weyl field in a (possibly reducible) representation $R$ of the gauge group. Then we must attach an extra factor of $\mathrm{Tr}(T_R^aT_R^bT_R^c)$ to the first diagram in Figure 28.6, and a factor of $\mathrm{Tr}(T_R^aT_R^cT_R^b)$ to the second; here the group indices $a, b, c$ go along with the momenta $p, q, r$, respectively. Repeating our analysis shows that the diagrams with $P_L\to1/2$ come with an extra factor of $\mathrm{Tr}([T_R^a,T_R^b]T_R^c)/2$; these contribute to the renormalization of the tree-level three-gluon vertex. Diagrams with $P_L\to-\gamma_5/2$ come with an extra factor of
$$\tfrac12\,\mathrm{Tr}\left(\{T_R^a,T_R^b\}T_R^c\right) = A(R)\,d^{abc}, \tag{28.85}$$
where $d^{abc}$ is a completely symmetric tensor that is independent of the representation, and $A(R)$ is the anomaly coefficient of $R$. In order for this theory to exist, we must have $A(R) = 0$.

For SU(2) and SO($N$), $N\ne2,6$, all representations have $A(R) = 0$. For SU($N$) with $N\ge3$, the fundamental representation has $A(\mathbf N) = 1$, and most complex SU($N$) representations $R$ have $A(R)\ne0$. Notice that $A(R) + A(\bar R) = 0$; thus a theory whose left-handed Weyl fields come in $R\oplus\bar R$ pairs is automatically anomaly free.

More generally, consider a theory with a nonabelian gauge symmetry and also a U(1) gauge symmetry. The theory contains left-handed Weyl fields in the representations $(R_i, Q_i)$, where $R_i$ is the representation of the nonabelian group, and $Q_i$ is the U(1) charge. For this theory to be anomaly free, we must demand that $\mathrm{Tr}(\{T^a,T^b\}T^c)/2 = 0$, where each $T^a$ is either a generator of the nonabelian group in the representation $R_1\oplus\cdots\oplus R_n$, or the generator $Q$ of the abelian group. The nonabelian generators are block diagonal, with blocks given by $T^a_{R_i}$, and $Q$ is diagonal with $d(R_1)$ entries $Q_1$, $d(R_2)$ entries $Q_2$, etc.
• If all three generators are nonabelian, we have $\mathrm{Tr}(\{T^a,T^b\}T^c)/2 = \sum_i A(R_i)\,d^{abc}$, and so we must have $\sum_i A(R_i) = 0$.
• If one generator is the abelian generator $Q$, we have $\mathrm{Tr}(\{T^a,T^b\}Q)/2 = \sum_i T(R_i)Q_i\,\delta^{ab}$, and so we must have $\sum_i T(R_i)Q_i = 0$.
• If two generators are abelian, we have $\mathrm{Tr}(Q^2T^c) = \sum_i Q_i^2\,\mathrm{Tr}\,T^c_{R_i} = 0$, since nonabelian generators are always traceless.
• If all three generators are abelian, we have $\mathrm{Tr}\,Q^3 = \sum_i d(R_i)Q_i^3$, and this must also vanish.
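The three nontrivial conditions in the list above can be collected into one small checker. This is a sketch under stated assumptions: the `WeylField` record and the two example field contents are hypothetical, and the SU(3) values $d = 3$, $T = 1/2$, $A = \pm1$ for the fundamental and antifundamental are the ones quoted earlier in this chapter.

```python
from fractions import Fraction
from typing import NamedTuple


class WeylField(NamedTuple):
    """Group-theory data of one left-handed Weyl field (hypothetical container)."""
    dim: int            # d(R_i)
    index: Fraction     # T(R_i)
    anomaly: Fraction   # A(R_i)
    charge: Fraction    # U(1) charge Q_i


def anomaly_sums(fields):
    """Return the three sums that must vanish for an anomaly-free theory."""
    return {
        "nonabelian^3": sum(f.anomaly for f in fields),
        "nonabelian^2 x U(1)": sum(f.index * f.charge for f in fields),
        "U(1)^3": sum(f.dim * f.charge ** 3 for f in fields),
    }


def is_anomaly_free(fields):
    """True when every sum returned by anomaly_sums is zero."""
    return all(value == 0 for value in anomaly_sums(fields).values())


half = Fraction(1, 2)
# SU(3) x U(1) examples: a fundamental with charge +1 and an antifundamental with charge -1.
vector_like = [WeylField(3, half, Fraction(1), Fraction(1)),
               WeylField(3, half, Fraction(-1), Fraction(-1))]
chiral = [WeylField(3, half, Fraction(1), Fraction(1))]

print(anomaly_sums(vector_like), is_anomaly_free(vector_like))
print(anomaly_sums(chiral), is_anomaly_free(chiral))
```

The first example is the $R\oplus\bar R$ pairing with opposite charges, for which all three sums cancel term by term; the second fails all three conditions.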

28.4.2 Anomalies in global symmetries


Anomalies in global symmetries can arise in gauge theories that are free of anomalies in the
local symmetries. Consider electrodynamics with a massless Dirac field. Writing Ψ in terms
of two left-handed Weyl fields χ and ξ, the Lagrangian density becomes

L = iχ† σ̄ µ (∂µ − igAµ )χ + iξ † σ̄ µ (∂µ + igAµ )ξ. (28.86)

Because the fermion field is massless, the Lagrangian is invariant under a global symmetry in
which χ and ξ transform with the same phase:

χ(x) → eiα χ(x), ξ(x) → eiα ξ(x). (28.87)

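In terms of $\Psi$, this is

$$\Psi(x)\to e^{-i\alpha\gamma_5}\Psi(x),\qquad \bar\Psi(x)\to\bar\Psi(x)\,e^{-i\alpha\gamma_5}. \tag{28.88}$$

This is called axial U(1) symmetry, because the associated Noether current

$$j_A^\mu(x) = \bar\Psi(x)\gamma^\mu\gamma_5\Psi(x) \tag{28.89}$$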

is an axial vector. Noether’s theorem leads us to expect that this current is conserved. However,
the axial current actually has an anomalous divergence.
Consider the matrix element $\langle p,q|j_A^\rho(z)|0\rangle$, where $\langle p,q|$ is a state of two outgoing photons with four-momenta $p$ and $q$, and polarization vectors $\epsilon_\mu$ and $\epsilon'_\nu$, respectively. Using the LSZ formula for photons, we have
$$\langle p,q|j_A^\rho(z)|0\rangle = (ig)^2\epsilon_\mu\epsilon'_\nu\int d^4x\,d^4y\;e^{-i(px+qy)}\langle0|\mathrm{T}j^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle, \tag{28.90}$$
where $j^\mu = \bar\Psi\gamma^\mu\Psi$ is the Noether current corresponding to the U(1) gauge symmetry. Since both $j^\mu(x)$ and $j_A^\mu(x)$ are Noether currents, we expect the Ward identities
$$\begin{gathered}\frac{\partial}{\partial x^\mu}\langle0|\mathrm{T}j^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle = 0,\qquad \frac{\partial}{\partial y^\nu}\langle0|\mathrm{T}j^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle = 0,\\ \frac{\partial}{\partial z^\rho}\langle0|\mathrm{T}j^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle = 0\end{gathered}\tag{28.91}$$
to be satisfied. Note that there are no contact terms in equations 28.91, because both $j^\mu(x)$ and $j_A^\mu(x)$ are invariant under both U(1) transformations.
Let us define $C^{\mu\nu\rho}(p,q,r)$ via
$$(2\pi)^4\delta^4(p+q+r)\,C^{\mu\nu\rho}(p,q,r) \equiv \int d^4x\,d^4y\,d^4z\;e^{-i(px+qy+rz)}\langle0|\mathrm{T}j^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle. \tag{28.92}$$
Then we have
$$\langle p,q|j_A^\rho(z)|0\rangle = -g^2\epsilon_\mu\epsilon'_\nu\,C^{\mu\nu\rho}(p,q,r)\,e^{irz}\Big|_{r=-p-q}. \tag{28.93}$$
Taking the divergence of the current yields
$$\langle p,q|\partial_\rho j_A^\rho(z)|0\rangle = -ig^2\epsilon_\mu\epsilon'_\nu\,r_\rho C^{\mu\nu\rho}(p,q,r)\,e^{irz}\Big|_{r=-p-q}. \tag{28.94}$$
The expected Ward identities become
$$p_\mu C^{\mu\nu\rho}(p,q,r) = 0,\qquad q_\nu C^{\mu\nu\rho}(p,q,r) = 0,\qquad r_\rho C^{\mu\nu\rho}(p,q,r) = 0. \tag{28.95}$$

To check equations 28.95, we compute $C^{\mu\nu\rho}(p,q,r)$ with Feynman diagrams. At the one-loop level, the contributing diagrams are exactly those we computed in the previous subsection, except that the three vertex factors are now $\gamma^\mu$, $\gamma^\nu$ and $\gamma^\rho\gamma_5$ instead of $ig\gamma^\mu P_L$, $ig\gamma^\nu P_L$ and $ig\gamma^\rho P_L$. But the three $P_L$s can be combined into just one at the last vertex, and then this one can be replaced by $-\frac12\gamma_5$. Thus, the vertex function $iV^{\mu\nu\rho}(p,q,r)$ is related to $C^{\mu\nu\rho}(p,q,r)$ by
$$iV^{\mu\nu\rho}(p,q,r) = -\tfrac12(ig)^3C^{\mu\nu\rho}(p,q,r) + O(g^5). \tag{28.96}$$
In order to preserve the conservation of the current coupled to the gauge field, we should choose the regularization scheme in which the first two equations in 28.95 are satisfied, resulting in
$$r_\rho C^{\mu\nu\rho}(p,q,r) = -\frac{i}{2\pi^2}\epsilon^{\mu\nu\alpha\beta}p_\alpha q_\beta + O(g^2). \tag{28.97}$$
Using this in equation 28.94, we find
$$\langle p,q|\partial_\rho j_A^\rho(z)|0\rangle = -\frac{g^2}{2\pi^2}\epsilon^{\mu\nu\alpha\beta}p_\alpha q_\beta\,\epsilon_\mu\epsilon'_\nu\,e^{-i(p+q)z} + O(g^4). \tag{28.98}$$

28.4.3 Anomalies and the path integral for fermions


We begin with the path integral over the Dirac field, with the gauge field treated as a fixed
background, to be integrated later. We have
$$Z(A) \equiv \int\mathcal{D}\Psi\,\mathcal{D}\bar\Psi\;e^{iS(A)} \quad\text{where}\quad S(A)\equiv\int d^4x\,\bar\Psi\,i\slashed{D}\,\Psi. \tag{28.99}$$

Here Aµ is either the U(1) gauge field, or the matrix-valued nonabelian gauge field, depending
on the theory under consideration.
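Now consider an axial U(1) transformation of the Dirac field, but with a space-time dependent parameter $\alpha(x)$:
$$\Psi(x)\to e^{-i\alpha(x)\gamma_5}\Psi(x),\qquad\bar\Psi(x)\to\bar\Psi(x)\,e^{-i\alpha(x)\gamma_5}. \tag{28.100}$$
The corresponding change in the action is
$$S(A)\to S(A)+\int d^4x\,j_A^\mu(x)\,\partial_\mu\alpha(x) = S(A)-\int d^4x\,\alpha(x)\,\partial_\mu j_A^\mu(x). \tag{28.101}$$

If we assume that the measure $\mathcal{D}\Psi\,\mathcal{D}\bar\Psi$ is invariant under the axial U(1) transformation, then we have
$$Z(A)\to\int\mathcal{D}\Psi\,\mathcal{D}\bar\Psi\;e^{iS(A)}\,e^{-i\int d^4x\,\alpha(x)\partial_\mu j_A^\mu(x)}. \tag{28.102}$$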

This must be equal to the original expression for $Z(A)$. This implies that $\partial_\mu j_A^\mu(x) = 0$ holds inside quantum correlation functions, up to contact terms, as discussed in subsection 25.4.4.
However, the assumption that the measure $\mathcal{D}\Psi\,\mathcal{D}\bar\Psi$ is invariant under the axial U(1) transformation must be examined more closely. The change of variable in equations 28.100 is implemented by the functional matrix

$$J(x,y) = \delta(x-y)\,e^{-i\alpha(x)\gamma_5}. \tag{28.103}$$

Because the path integral is over fermionic variables (rather than bosonic), we get a jacobian factor of $(\det J)^{-1}$ (rather than $\det J$) for each of the transformations, so that we have

$$\mathcal{D}\Psi\,\mathcal{D}\bar\Psi\to(\det J)^{-2}\,\mathcal{D}\Psi\,\mathcal{D}\bar\Psi. \tag{28.104}$$

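Using $\log\det J = \mathrm{Tr}\log J$, we can write
$$(\det J)^{-2} = \exp\left[2i\int d^4x\,\alpha(x)\,\mathrm{Tr}\,\delta(x-x)\gamma_5\right], \tag{28.105}$$
where the explicit trace is over spin and group indices.

To regularize equation 28.105, we replace the delta function with a gaussian
$$\delta(x-y)\to e^{(i\slashed{D}_x)^2/M^2}\,\delta(x-y). \tag{28.106}$$
Then we can derive that
$$\mathrm{Tr}\,\delta(x-x)\gamma_5\to-\frac{g^2}{32\pi^2}\epsilon^{\mu\nu\rho\sigma}\,\mathrm{Tr}\,F_{\mu\nu}F_{\rho\sigma}. \tag{28.107}$$
Including the transformation of the measure, equation 28.102 should be modified as
$$Z(A)\to\int\mathcal{D}\Psi\,\mathcal{D}\bar\Psi\;e^{iS(A)}\exp\left\{-i\int d^4x\,\alpha(x)\left[\frac{g^2}{16\pi^2}\epsilon^{\mu\nu\rho\sigma}\,\mathrm{Tr}\,F_{\mu\nu}F_{\rho\sigma}+\partial_\mu j_A^\mu(x)\right]\right\}. \tag{28.108}$$
This must be equal to the original expression for $Z(A)$, leading to
$$\partial_\mu j_A^\mu = -\frac{g^2}{16\pi^2}\epsilon^{\mu\nu\rho\sigma}\,\mathrm{Tr}\,F_{\mu\nu}F_{\rho\sigma}, \tag{28.109}$$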
which holds inside quantum correlation functions, up to possible contact terms. This result is
known as the Adler–Bardeen theorem, and it is the exact version of equation 28.98.
The method can also be used to find the anomaly in the chiral gauge theories that we studied in section 28.4.1, but the analysis is more involved. Here we will quote only the final result. Consider a left-handed Weyl field in a (possibly reducible) representation $R$ of the gauge group. We define the chiral gauge current $j^{a\mu}\equiv\bar\Psi T_R^a\gamma^\mu P_L\Psi$. Its covariant divergence is given by
$$D_\mu^{ab}j^{b\mu} = \frac{g^2}{24\pi^2}\epsilon^{\mu\nu\rho\sigma}\partial_\mu\mathrm{Tr}\left[T_R^a\left(A_\nu\partial_\rho A_\sigma-\tfrac12igA_\nu A_\rho A_\sigma\right)\right]. \tag{28.110}$$
The right-hand side of equation 28.110 is not gauge invariant. The anomaly spoils gauge invariance in chiral gauge theories, unless this right-hand side happens to vanish for group-theoretic reasons. It can be shown that this occurs if and only if $A(R) = 0$.
Finally, a related but more subtle problem, known as the Witten anomaly, arises for theories with an odd number of Weyl fields in a pseudoreal representation. In this case, every gauge field configuration $A_\mu$ can be smoothly deformed into another gauge field configuration $A'_\mu$ that has the same action, but has $Z(A') = -Z(A)$. Thus, when we integrate over $A$, the contribution from $A'$ cancels the contribution from $A$, and the result is zero. Since its path integral is trivial, this theory does not exist.

28.5 Spontaneous breaking of gauge symmetries


Consider scalar electrodynamics, specified by the Lagrangian density
$$\mathcal{L} = -(D^\mu\phi)^\dagger D_\mu\phi - V(\phi) - \tfrac14F^{\mu\nu}F_{\mu\nu}, \tag{28.111}$$
where $\phi$ is a complex scalar field, $D_\mu = \partial_\mu - igA_\mu$, and
$$V(\phi) = m^2\phi^\dagger\phi+\tfrac14\lambda(\phi^\dagger\phi)^2. \tag{28.112}$$
Let us consider $m^2 < 0$. Classically, the field has a nonzero vacuum expectation value (VEV for short), given by
$$\langle0|\phi(x)|0\rangle = \frac{1}{\sqrt2}\,v, \tag{28.113}$$
where we have made a global U(1) transformation to set the phase of the VEV to zero, and
$$v = \sqrt{\frac{4|m|^2}{\lambda}}. \tag{28.114}$$
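We therefore write
$$\phi(x) = \frac1{\sqrt2}[v+\rho(x)]\,e^{-i\chi(x)}, \tag{28.115}$$
where $\rho(x)$ and $\chi(x)$ are real scalar fields. The scalar potential depends only on $\rho$ and, up to an additive constant, is given by
$$V = \tfrac14\lambda v^2\rho^2+\tfrac14\lambda v\rho^3+\tfrac1{16}\lambda\rho^4. \tag{28.116}$$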
Since χ does not appear in the potential, it is massless; it is the Goldstone boson for the spon-
taneously broken U(1) symmetry.
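The expansion of the potential around the VEV can be verified with exact rational arithmetic. The sketch below substitutes $\phi^\dagger\phi = (v+\rho)^2/2$ and $m^2 = -\lambda v^2/4$ (from equation 28.114 with $m^2<0$) into equation 28.112 and compares with equation 28.116 after subtracting the constant vacuum value; the function names and sample values are illustrative.

```python
from fractions import Fraction


def potential(phi_sq, m_sq, lam):
    """V = m^2 (phi^dagger phi) + (lambda/4) (phi^dagger phi)^2, equation 28.112."""
    return m_sq * phi_sq + lam * phi_sq ** 2 / 4


def expanded_potential(rho, v, lam):
    """Right-hand side of equation 28.116."""
    return lam * v ** 2 * rho ** 2 / 4 + lam * v * rho ** 3 / 4 + lam * rho ** 4 / 16


def shifted_potential(rho, v, lam):
    """V evaluated at phi = (v + rho)/sqrt(2), minus its value at rho = 0."""
    m_sq = -lam * v ** 2 / 4  # from v^2 = 4|m|^2 / lambda with m^2 < 0
    return potential((v + rho) ** 2 / 2, m_sq, lam) - potential(v ** 2 / 2, m_sq, lam)


rho, v, lam = Fraction(3, 7), Fraction(2), Fraction(5, 3)
print(shifted_potential(rho, v, lam) == expanded_potential(rho, v, lam))
```

The absence of a linear term in $\rho$ confirms that $\rho = 0$ is the minimum, and the quadratic coefficient gives $m_\rho^2 = \lambda v^2/2$.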

In the gauge theory, we can make a gauge transformation that shifts the phase of ϕ(x) by an
arbitrary spacetime function. We can use this gauge freedom to set χ(x) = 0; this choice is
called unitary gauge. Now we have
$$-(D^\mu\phi)^\dagger D_\mu\phi = -\tfrac12\partial^\mu\rho\,\partial_\mu\rho-\tfrac12g^2(v+\rho)^2A^\mu A_\mu. \tag{28.117}$$
Expanding out the last term, we see that the gauge field now has a mass

M = gv. (28.118)

This is the Higgs mechanism: the Goldstone boson disappears, and the gauge field acquires a
mass. Note that this leaves the counting of particle spin states unchanged: a massless spin-one
particle has two spin states, but a massive one has three. The Goldstone boson has become the
third or longitudinal state of the now-massive gauge field. A scalar field whose VEV breaks a
gauge symmetry is generically called a Higgs field.

This generalizes in a straightforward way to a nonabelian gauge theory. Consider a complex scalar field $\phi$ in a representation $R$ of the gauge group. The kinetic term for $\phi$ is $-(D^\mu\phi)^\dagger D_\mu\phi$, where the covariant derivative is $(D_\mu\phi)_i = \partial_\mu\phi_i - igA^a_\mu(T_R^a)_{ij}\phi_j$, and the indices $i$ and $j$ run from 1 to $d(R)$. We assume that $\phi$ acquires a VEV
$$\langle0|\phi_i(x)|0\rangle = \frac1{\sqrt2}\,v_i. \tag{28.119}$$

If we replace $\phi$ by its VEV in $-(D^\mu\phi)^\dagger D_\mu\phi$, we find a mass term for the gauge fields
$$\mathcal{L}_{\rm mass} = -\tfrac12(M^2)^{ab}A^a_\mu A^{b\mu}, \tag{28.120}$$
where the mass-squared matrix is
$$(M^2)^{ab} = \tfrac12g^2\,v_i^*\{T_R^a,T_R^b\}_{ij}\,v_j. \tag{28.121}$$
A generator T a is spontaneously broken if (TRa )ij vj ̸= 0. We see that gauge fields correspond-
ing to broken generators get a mass, while those corresponding to unbroken generators do not.
The unbroken generators (if any) form a gauge group with massless gauge fields. The massive
gauge fields (and all other fields) form representations of this unbroken group.
Consider the gauge group SU($N$), with a complex scalar field $\phi$ in the fundamental representation. We can make a global SU($N$) transformation to bring the VEV entirely into the last component, and furthermore make it real. Any generator $(T_R^a)_{ij}$ that does not have a nonzero entry in the last column will remain unbroken. These generators form an unbroken SU($N-1$) gauge group. There are three classes of broken generators: those with $(T_R^a)_{iN} = 1/2$ for $i\ne N$ (there are $N-1$ of these); those with $(T_R^a)_{iN} = -i/2$ for $i\ne N$ (there are also $N-1$ of these); and finally the single generator $T^{N^2-1} = [2N(N-1)]^{-1/2}\,\mathrm{diag}(1,1,\cdots,-N+1)$. The gauge fields corresponding to the generators in the first two classes get a mass $M = gv/2$. We can group them into a complex vector field that transforms in the fundamental representation of the unbroken SU($N-1$) subgroup. The gauge field corresponding to $T^{N^2-1}$ gets a mass $M = [(N-1)/2N]^{1/2}gv$; it is a singlet of SU($N-1$).
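The counting in this example can be cross-checked quickly. The sketch below compares the number of broken generators obtained from the group dimensions with the three classes listed above, and evaluates the two mass formulas; the function names are illustrative, and the dimension formula $\dim\mathrm{SU}(N) = N^2-1$ is a standard fact assumed here.

```python
import math


def broken_generator_count(n):
    """dim SU(n) - dim SU(n-1): generators broken by a fundamental VEV."""
    return (n * n - 1) - ((n - 1) ** 2 - 1)


def broken_generators_by_class(n):
    """(n-1) real off-diagonal + (n-1) imaginary off-diagonal + 1 diagonal generator."""
    return (n - 1) + (n - 1) + 1


def gauge_boson_masses(n, g, v):
    """Masses of the complex vector (fundamental of SU(n-1)) and of the singlet."""
    return {
        "complex_vector": g * v / 2,
        "singlet": math.sqrt((n - 1) / (2 * n)) * g * v,
    }


for n in (2, 3, 5):
    print(n, broken_generator_count(n), gauge_boson_masses(n, 1.0, 1.0))
```

For $N = 2$ the unbroken group is trivial and both formulas give $M = gv/2$, so all three gauge bosons are degenerate, as expected for a doublet VEV in SU(2).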


Consider the gauge group SO(N ), with a real scalar field in the fundamental representation.
We can make a global SO(N ) transformation to bring the VEV entirely into the last compo-
nent. Any generator (TRa )ij that does not have a nonzero entry in the last column will remain
unbroken. These generators form an unbroken SO(N − 1) subgroup. There are N − 1 broken
generators, those with (TRa )iN = −i for i ̸= N . The corresponding gauge fields get a mass
M = gv; they form a fundamental representation of the unbroken SO(N − 1) subgroup.
Consider the gauge group SU(N ), with a real scalar field Φ in the adjoint representation. It will
prove more convenient to work with the N × N matrix-valued field Φ = ϕa T a ; the covariant
derivative of Φ is $D_\mu\Phi = \partial_\mu\Phi - igA^a_\mu[T^a, \Phi]$, and the VEV of Φ is a traceless hermitian N × N matrix V. Thus the mass-squared matrix for the gauge fields is
\[ (M^2)^{ab} = -\frac{1}{2} g^2\, \mathrm{Tr}\,\{[T^a, V], [T^b, V]\}. \tag{28.122} \]
We can make a global SU(N) transformation to bring V into diagonal form. Suppose the diagonal entries consist of $N_1$ $v_1$'s, followed by $N_2$ $v_2$'s, etc., where $v_1 < v_2 < \cdots$ and $\sum_i N_i v_i = 0$. Then all generators whose nonzero entries lie entirely within the ith block commute with V, and hence form an unbroken SU($N_i$) subgroup. Furthermore, the generator that is proportional to V also commutes with V, and generates a U(1) subgroup. Thus the unbroken gauge group is SU($N_1$) × SU($N_2$) × ··· × U(1). The gauge coupling constants for the different groups are all the same, and equal to the original SU(N) gauge coupling constant.

28.6 Quantization of spontaneously broken gauge theory


28.6.1 Spontaneously broken abelian gauge theory
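Consider scalar electrodynamics, specified by the Lagrangian density
\[ \mathcal{L} = -(D_\mu\phi)^\dagger D^\mu\phi - V(\phi) - \frac{1}{4}F^{\mu\nu}F_{\mu\nu}, \quad\text{where}\quad V(\phi) = \frac{1}{4}\lambda\left(\phi^\dagger\phi - \frac{1}{2}v^2\right)^2. \tag{28.123} \]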

Using a cartesian basis for ϕ, we write
\[ \phi = \frac{1}{\sqrt{2}}(v + h + ib), \tag{28.124} \]
where h and b are real scalar fields. In terms of h and b, the potential is
\[ V = \frac{1}{4}\lambda v^2 h^2 + \frac{1}{4}\lambda v h(h^2 + b^2) + \frac{1}{16}\lambda(h^2 + b^2)^2. \tag{28.125} \]
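The expansion (28.125) can be spot-checked numerically. This is a small sketch under the stated parametrization ϕ = (v + h + ib)/√2; the function names `potential_exact` and `potential_expanded` and the sample values are illustrative assumptions.

```python
def potential_exact(h, b, lam, v):
    """V(phi) = (lam/4) (phi^dagger phi - v^2/2)^2 with phi = (v + h + i b)/sqrt(2)."""
    phi_dagger_phi = 0.5 * ((v + h) ** 2 + b ** 2)
    return 0.25 * lam * (phi_dagger_phi - 0.5 * v * v) ** 2

def potential_expanded(h, b, lam, v):
    """Right-hand side of (28.125), written out term by term."""
    r2 = h * h + b * b
    quadratic = 0.25 * lam * v * v * h * h
    cubic = 0.25 * lam * v * h * r2
    quartic = lam * r2 * r2 / 16.0
    return quadratic + cubic + quartic

# Compare the two forms at a few arbitrary field values.
for h, b in [(0.3, -0.2), (-1.1, 0.7), (2.0, 2.5)]:
    difference = potential_exact(h, b, lam=0.8, v=1.7) - potential_expanded(h, b, lam=0.8, v=1.7)
    print(h, b, abs(difference) < 1e-9)
```

Only h appears in the quadratic term, which is the statement that b is massless at this stage.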
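The kinetic term can be expanded as
\[ \begin{aligned} -(D_\mu\phi)^\dagger D^\mu\phi = {} & -\frac{1}{2}\partial_\mu h\,\partial^\mu h - \frac{1}{2}\partial_\mu b\,\partial^\mu b - \frac{1}{2}g^2v^2A^\mu A_\mu + gvA^\mu\partial_\mu b \\ & + gA^\mu(h\partial_\mu b - b\partial_\mu h) - g^2vhA^\mu A_\mu - \frac{1}{2}g^2(h^2 + b^2)A^\mu A_\mu. \end{aligned} \tag{28.126} \]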
The first line contains all the terms that are quadratic in the fields. The first two are the kinetic
terms for the h and b fields. The third is the mass term for the vector field. The fourth is an
annoying cross term between the vector field and the derivative of b.
In abelian gauge theory, in the absence of spontaneous symmetry breaking, we fix gauge by
adding to L the gauge-fixing and ghost terms

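\[ \mathcal{L}_{\text{gf}} + \mathcal{L}_{\text{gh}} = -\frac{G^2}{2\xi} - \bar{c}\,\frac{\delta G}{\delta\theta}\,c, \tag{28.127} \]
where $G = \partial^\mu A_\mu$, and θ(x) parametrizes an infinitesimal gauge transformation

\[ A_\mu \to A_\mu - \partial_\mu\theta, \qquad \phi \to \phi - ig\theta\phi. \tag{28.128} \]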

Since δG/δθ = −∂ 2 , the ghost fields have no interactions, and can be ignored.
In the presence of spontaneous symmetry breaking, we choose instead

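\[ G = \partial^\mu A_\mu - \xi g v b, \tag{28.129} \]
which reduces to $\partial^\mu A_\mu$ when v = 0. Multiplying out $G^2$ and integrating by parts in the cross term, we have
\[ \mathcal{L}_{\text{gf}} = -\frac{1}{2\xi}\partial^\mu A_\mu\,\partial^\nu A_\nu - gvA_\mu\partial^\mu b - \frac{1}{2}\xi g^2v^2b^2. \tag{28.130} \]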
Note that the second term cancels the annoying cross term between the vector field and the derivative of b. Also, the last term gives a mass $\xi^{1/2}M$ to the b field (with M = gv).
We must still evaluate Lgh . To do so, we note under gauge transformation,

h → h + gθb, b → b − gθ(v + h). (28.131)

Then we have
\[ \frac{\delta G}{\delta\theta} = -\partial^2 + \xi g^2 v(v + h). \tag{28.132} \]
The ghost Lagrangian is

Lgh = −∂ µ c̄∂µ c − ξg 2 v 2 c̄c − ξg 2 vhc̄c. (28.133)

We see from the second term that the ghost has acquired the same mass as the b field.
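Now let us examine the vector field. Including $\mathcal{L}_{\text{gf}}$, the terms in $\mathcal{L}$ that are quadratic in the vector field can be written as
\[ \mathcal{L}_0 = -\frac{1}{2}A_\mu\left[\eta^{\mu\nu}(-\partial^2 + M^2) + (1 - \xi^{-1})\partial^\mu\partial^\nu\right]A_\nu. \tag{28.134} \]
The propagator for the vector field is
\[ S_F(k) = \frac{-iP^{\mu\nu}}{k^2 + M^2 - i\epsilon} + \frac{-i\xi k^\mu k^\nu/k^2}{k^2 + \xi M^2 - i\epsilon}, \quad\text{where}\quad P^{\mu\nu} = \eta^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}. \tag{28.135} \]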
We see that the transverse components of the vector field propagate with mass M, while the longitudinal component propagates with the same mass as the b and ghost fields, $\xi^{1/2}M$. Since their masses depend on ξ, the ghosts, the b field, and the longitudinal component of the vector field must all represent unphysical particles that do not appear in incoming or outgoing states. In the limit ξ → ∞, we recover the unitary gauge.
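That the propagator (28.135) inverts the quadratic operator in (28.134) can be verified in momentum space. The sketch below is an illustration under stated assumptions: the metric is taken as diag(−1, +1, +1, +1), the overall factor −i and the iϵ are dropped, derivatives are replaced by ∂ → ik, and the function name `rxi_propagator_check` is hypothetical.

```python
def rxi_propagator_check(k_up, M, xi):
    """Largest deviation of K_{mu rho} Delta^{rho nu} from the identity, where
    K_{mu nu} = eta_{mu nu}(k^2 + M^2) - (1 - 1/xi) k_mu k_nu is the momentum-space
    kinetic operator and Delta^{mu nu} is the R_xi propagator without the factor -i."""
    eta = [-1.0, 1.0, 1.0, 1.0]                       # diagonal mostly-plus metric
    k_dn = [eta[m] * k_up[m] for m in range(4)]       # lower the index
    k2 = sum(k_up[m] * k_dn[m] for m in range(4))

    def delta(r, n):
        eta_up = eta[r] if r == n else 0.0            # inverse metric equals the metric here
        transverse = eta_up - k_up[r] * k_up[n] / k2
        longitudinal = k_up[r] * k_up[n] / k2
        return transverse / (k2 + M * M) + xi * longitudinal / (k2 + xi * M * M)

    def kinetic(m, r):
        eta_dn = eta[m] if m == r else 0.0
        return eta_dn * (k2 + M * M) - (1.0 - 1.0 / xi) * k_dn[m] * k_dn[r]

    worst = 0.0
    for m in range(4):
        for n in range(4):
            entry = sum(kinetic(m, r) * delta(r, n) for r in range(4))
            target = 1.0 if m == n else 0.0
            worst = max(worst, abs(entry - target))
    return worst

# A spacelike test momentum keeps the check away from the poles.
print(rxi_propagator_check([0.3, 1.0, -2.0, 0.5], M=1.5, xi=3.0) < 1e-9)
```

The check passes for any positive ξ, which makes the ξ-dependence of the longitudinal pole explicit: only the second term of the propagator changes.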

28.6.2 Spontaneously broken nonabelian gauge theory


We decompose complex scalar fields into pairs of real ones, and organize all the real scalar
fields into a big list ϕi , i = 1, · · · , N . These real scalar fields form a (possibly reducible)
representation R of the gauge group. Let T a be the gauge-group generator matrices; they are
linear combinations of the generators of the SO(N ) group that rotates all components of ϕi
into each other. Because these generators are hermitian and antisymmetric, so are the T a s.
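The Lagrangian density for our theory can now be written as
\[ \mathcal{L} = -\frac{1}{2}D^\mu\phi_i D_\mu\phi_i - V(\phi) - \frac{1}{4}F^{a\mu\nu}F^a_{\mu\nu}, \quad\text{where}\quad (D_\mu\phi)_i = \partial_\mu\phi_i - ig_aA^a_\mu(T^a)_{ij}\phi_j. \tag{28.136} \]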
Now we suppose that the potential is minimized when ϕ has a VEV

⟨0|ϕi |0⟩ = vi . (28.137)



A generator T a is unbroken if (T a )ij vj = 0, and broken if (T a )ij vj ̸= 0.

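Each broken generator results in a massless Goldstone boson. We note that the potential must be invariant under a global gauge transformation. It follows that

\[ \frac{\partial V}{\partial\phi_j}(T^a)_{jk}\phi_k = 0. \tag{28.138} \]

We differentiate equation 28.138 with respect to $\phi_i$ to get

\[ \frac{\partial^2 V}{\partial\phi_i\partial\phi_j}(T^a)_{jk}\phi_k + \frac{\partial V}{\partial\phi_j}(T^a)_{ji} = 0. \tag{28.139} \]

Now set $\phi_i = v_i$; then $\partial V/\partial\phi_i$ vanishes. Also, we can identify

\[ (m^2)_{ij} = \left.\frac{\partial^2 V}{\partial\phi_i\partial\phi_j}\right|_{\phi_i = v_i} \tag{28.140} \]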

as the mass-squared matrix for the scalars after spontaneous symmetry breaking. Thus equa-
tion 28.139 becomes
(m2 )ij (T a v)j = 0. (28.141)
We see that if T a v ̸= 0, then T a v is an eigenvector of the mass-squared matrix with eigenvalue
zero. Thus there is a zero eigenvalue for every linearly independent broken generator.
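The zero-eigenvalue statement (28.141) can be illustrated with a concrete example. The sketch below assumes an SO(3)-invariant potential V = (λ/4)(ϕ·ϕ − v²)², which is a choice made for this illustration and is not specified in the notes, and uses the real antisymmetric matrices ε_{ajk} as rotation generators. The names `hessian`, `so3_generators_real` and `goldstone_check` are hypothetical.

```python
import math

def hessian(phi, lam, v):
    """Second derivatives of the assumed potential V = (lam/4)(phi.phi - v^2)^2."""
    n = len(phi)
    phi2 = sum(x * x for x in phi)
    return [[lam * (phi2 - v * v) * (1.0 if i == j else 0.0) + 2.0 * lam * phi[i] * phi[j]
             for j in range(n)] for i in range(n)]

def so3_generators_real():
    """Real antisymmetric generators with entries epsilon_{ajk}."""
    def eps(a, j, k):
        return (a - j) * (j - k) * (k - a) / 2
    return [[[eps(a, j, k) for k in range(3)] for j in range(3)] for a in range(3)]

def goldstone_check(vev, lam=0.7):
    """For each generator return (is_broken, max |(m^2 T v)_i|)."""
    v = math.sqrt(sum(x * x for x in vev))
    m2 = hessian(vev, lam, v)
    results = []
    for tau in so3_generators_real():
        tv = [sum(tau[i][j] * vev[j] for j in range(3)) for i in range(3)]
        broken = any(abs(x) > 1e-12 for x in tv)
        m2_tv = [sum(m2[i][j] * tv[j] for j in range(3)) for i in range(3)]
        results.append((broken, max(abs(x) for x in m2_tv)))
    return results

for broken, residual in goldstone_check([0.0, 0.0, 2.0]):
    print("broken" if broken else "unbroken", residual)
```

With the VEV along the third axis, two generators are broken and each broken direction T v is annihilated by the mass-squared matrix, matching the count of N − 1 broken generators for SO(N) quoted earlier.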

Let us write
ϕi (x) = vi + χi (x). (28.142)
The covariant derivative of ϕ becomes

(Dµ ϕ)i = ∂µ χi − igAaµ (T a )ij (v + χ)j . (28.143)

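It is now convenient to define a set of real antisymmetric matrices

\[ (\tau^a)_{ij} \equiv ig(T^a)_{ij} \tag{28.144} \]

and the real rectangular matrix

\[ F^a_i \equiv (\tau^a)_{ij}v_j. \tag{28.145} \]
We can now write
\[ (D_\mu\phi)_i = \partial_\mu\chi_i - A^a_\mu(F^a + \tau^a\chi)_i. \tag{28.146} \]
Using the antisymmetry of $\tau^a$, the kinetic term for ϕ becomes
\[ \begin{aligned} -\frac{1}{2}D^\mu\phi_i D_\mu\phi_i = {} & -\frac{1}{2}\partial^\mu\chi_i\partial_\mu\chi_i - \frac{1}{2}(F^a_iF^b_i)A^{a\mu}A^b_\mu + F^a_iA^a_\mu\partial^\mu\chi_i \\ & - A^a_\mu\chi_i\tau^a_{ij}\partial^\mu\chi_j - A^{a\mu}A^b_\mu F^a_i\tau^b_{ij}\chi_j + \frac{1}{2}A^{a\mu}A^b_\mu\chi_i(\tau^a\tau^b)_{ij}\chi_j. \end{aligned} \tag{28.147} \]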
We see that the mass-squared matrix for the vector fields is

(M 2 )ab = F ai F bi = (F F ⊺ )ab . (28.148)



A theorem of linear algebra states that every real rectangular matrix can be written as

\[ F^a_i = S^{ac}(M^c\delta^c_j)R_{ji}, \tag{28.149} \]

where S and R are orthogonal matrices, and the diagonal entries $M^c$ are real and nonnegative. We see that these diagonal entries are the masses of the vector fields. The vector fields of definite mass are then given by $\tilde{A}^a_\mu = S^{ba}A^b_\mu$.
Now we are ready to fix $R_\xi$ gauge. To do so, we add to $\mathcal{L}$ the gauge-fixing and ghost terms
\[ \mathcal{L}_{\text{gf}} + \mathcal{L}_{\text{gh}} = -\frac{1}{2\xi}G^aG^a - \bar{c}^a\frac{\delta G^a}{\delta\theta^b}c^b, \quad\text{where}\quad G^a = \partial^\mu A^a_\mu - \xi F^a_i\chi_i. \tag{28.150} \]
Then we have
\[ \mathcal{L}_{\text{gf}} = -\frac{1}{2\xi}\partial^\mu A^a_\mu\,\partial^\nu A^a_\nu - F^a_iA^a_\mu\partial^\mu\chi_i - \frac{1}{2}\xi(F^a_iF^a_j)\chi_i\chi_j. \tag{28.151} \]
The last term makes a contribution to the mass-squared matrix for the χ fields,

\[ \xi(M^2)_{ij} = \xi F^a_iF^a_j = \xi(F^{\intercal}F)_{ij}. \tag{28.152} \]

The nonzero eigenvalues of this matrix are $\xi(M^a)^2$, where $M^a$ are the vector-boson masses. The mass-squared matrix $\xi M^2$ should be added to the mass-squared matrix $m^2$ that we get from the potential. Note that
\[ (m^2)_{ij}(\xi M^2)_{jk} = 0. \tag{28.153} \]
Thus these two contributions to the mass-squared matrix of the scalar fields live in orthogonal
subspaces. The m2 subspace consists of the physical, massive scalars, and the ξM 2 subspace
consists of the unphysical Goldstone bosons; these are the fields that would be set to zero in
unitary gauge.
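We must still evaluate $\mathcal{L}_{\text{gh}}$. Under an infinitesimal gauge transformation, we have

\[ A^a_\mu \to A^a_\mu - D^{ab}_\mu\theta^b, \qquad \chi_i \to \chi_i - \theta^a\tau^a_{ij}(v + \chi)_j. \tag{28.154} \]

It follows that
\[ \frac{\delta G^a}{\delta\theta^b} = -\partial^\mu D^{ab}_\mu + \xi(M^2)^{ab} + \xi F^a_j\tau^b_{jk}\chi_k, \tag{28.155} \]
so the ghost Lagrangian density is

\[ \mathcal{L}_{\text{gh}} = -\partial^\mu\bar{c}^a D^{ab}_\mu c^b - \xi(M^2)^{ab}\bar{c}^ac^b - \xi F^a_j\tau^b_{jk}\chi_k\bar{c}^ac^b. \tag{28.156} \]

The ghost fields of definite mass are $\tilde{c}^a = S^{ba}c^b$ and $\tilde{\bar{c}}^a = S^{ba}\bar{c}^b$.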
The complete gauge-fixed Lagrangian density is now given by equations 28.136, 28.147, 28.151
and 28.156. We can rewrite it in terms of the fields of definite mass. This results in the replace-
ments

F ai → M a δia , (τ a )ij → S ab (R⊺ τ b R)ij , f abc → S ad S be S cg f deg (28.157)

throughout L. The Feynman rules then follow in the usual way.



28.7 The Standard Model


The Standard Model of elementary particles is the complete (except for gravity) quantum field
theory that appears to describe our world. It can be succinctly specified as a gauge theory
with gauge group SU(3) × SU(2) × U(1), with left-handed Weyl fields in three copies of the
representation (1, 2, −1/2) ⊕ (1, 1, +1) ⊕ (3, 2, +1/6) ⊕ (3̄, 1, −2/3) ⊕ (3̄, 1, +1/3), and
a complex scalar field in the representation (1, 2, −1/2). Here the last entry of each triplet
gives the value of the U(1) charge, known as hypercharge. The Lagrangian density includes all
terms of mass dimension four or less that are allowed by the gauge symmetries and Lorentz
invariance.

28.7.1 Gauge and Higgs sector


We begin with the electroweak part of the gauge group, SU(2) × U(1), and the complex scalar
field ϕ, known as the Higgs field, in the representation (2, −1/2). The Higgs field acquires a
non-zero VEV that spontaneously breaks SU(2) × U(1) to U(1); the unbroken U(1) is iden-
tified as electromagnetism.

The covariant derivative of the Higgs field ϕ is

(Dµ ϕ)i = ∂µ ϕi − i(g2 Aaµ T a + g1 Bµ Y )ij ϕj , (28.158)

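where $T^a = \sigma^a/2$ and $Y = -I/2$; Y is the hypercharge generator. It will prove useful to write out $g_2A^a_\mu T^a + g_1B_\mu Y$ in matrix form,
\[ \frac{1}{2}\begin{pmatrix} g_2A^3_\mu - g_1B_\mu & g_2(A^1_\mu - iA^2_\mu) \\ g_2(A^1_\mu + iA^2_\mu) & -g_2A^3_\mu - g_1B_\mu \end{pmatrix}. \tag{28.159} \]

Now suppose that ϕ has a potential

\[ V(\phi) = \frac{1}{4}\lambda\left(\phi^\dagger\phi - \frac{1}{2}v^2\right)^2. \tag{28.160} \]

This potential gives ϕ a non-zero VEV. We can make a global gauge transformation to bring this VEV entirely into the first component, and furthermore make it real, so that
\[ \langle 0|\phi|0\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} v \\ 0 \end{pmatrix}. \tag{28.161} \]

The kinetic term for ϕ is $-(D_\mu\phi)^\dagger D^\mu\phi$. After replacing ϕ by its VEV, we find a mass term for the gauge fields,
\[ \mathcal{L}_{\text{mass}} = -\frac{1}{8}v^2\begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} g_2A^3_\mu - g_1B_\mu & g_2(A^1_\mu - iA^2_\mu) \\ g_2(A^1_\mu + iA^2_\mu) & -g_2A^3_\mu - g_1B_\mu \end{pmatrix}^2\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{28.162} \]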

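Define the weak mixing angle

\[ \theta_W \equiv \tan^{-1}(g_1/g_2) \tag{28.163} \]

and the fields

\[ W^\pm_\mu \equiv \frac{1}{\sqrt{2}}(A^1_\mu \mp iA^2_\mu), \qquad Z_\mu \equiv c_WA^3_\mu - s_WB_\mu, \qquad A_\mu \equiv s_WA^3_\mu + c_WB_\mu, \tag{28.164} \]
where $c_W \equiv \cos\theta_W$ and $s_W \equiv \sin\theta_W$. In terms of these fields, we have
\[ \mathcal{L}_{\text{mass}} = -M_W^2W^{+\mu}W^-_\mu - \frac{1}{2}M_Z^2Z^\mu Z_\mu, \tag{28.165} \]
where we have identified
\[ M_W \equiv \frac{g_2v}{2}, \qquad M_Z \equiv \frac{M_W}{\cos\theta_W}. \tag{28.166} \]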
Note that the Aµ field remains massless; this signifies that there is an unbroken U(1) subgroup.
We will identify this unbroken U(1) with the gauge group of electromagnetism.
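The relations (28.163) and (28.166) can be turned into a short numerical sketch. The input values g₁ = 0.36, g₂ = 0.65 and v = 246 GeV are rough illustrative assumptions rather than numbers from the notes, and the function names are hypothetical. The second function diagonalizes the neutral mass-squared matrix read off from (28.162) in the (A³, B) basis, to confirm that one eigenvalue vanishes (the photon) and the other equals M_Z².

```python
import math

def electroweak_parameters(g1, g2, v):
    """Mixing angle, gauge-boson masses and electromagnetic coupling from g1, g2, v."""
    theta_w = math.atan2(g1, g2)            # tan(theta_W) = g1 / g2
    s_w, c_w = math.sin(theta_w), math.cos(theta_w)
    m_w = g2 * v / 2.0
    return {
        "theta_W": theta_w,
        "sin2_theta_W": s_w * s_w,
        "M_W": m_w,
        "M_Z": m_w / c_w,
        "e": g2 * s_w,                      # the coupling of the massless field A_mu
    }

def neutral_mass_eigenvalues(g1, g2, v):
    """Eigenvalues of the 2x2 mass-squared matrix of (A^3, B), from
    -(1/8) v^2 (g2 A^3 - g1 B)^2 = -(1/2) (A^3, B) M^2 (A^3, B)^T."""
    a = g2 * g2 * v * v / 4.0
    b = -g1 * g2 * v * v / 4.0
    d = g1 * g1 * v * v / 4.0
    trace, det = a + d, a * d - b * b
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace - disc) / 2.0, (trace + disc) / 2.0

params = electroweak_parameters(g1=0.36, g2=0.65, v=246.0)
print(params)
print(neutral_mass_eigenvalues(0.36, 0.65, 246.0))
```

With these assumed inputs the masses come out at roughly 80 and 91 GeV, and the vanishing eigenvalue corresponds to the combination s_W A³ + c_W B.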
The two complex components of the ϕ field yield four real scalar fields; three of these become
the longitudinal components of the W ± and Z 0 . The remaining scalar field must be able to
account for shifts in the overall scale of ϕ. Thus we can write, in unitary gauge,
\[ \phi(x) = \frac{1}{\sqrt{2}}\begin{pmatrix} v + H(x) \\ 0 \end{pmatrix}, \tag{28.167} \]

where H is a real scalar field; the corresponding particle is the Higgs boson. The potential now reads
\[ V = \frac{1}{4}\lambda v^2H^2 + \frac{1}{4}\lambda vH^3 + \frac{1}{16}\lambda H^4. \tag{28.168} \]
We see that the mass of the Higgs boson is given by m2H = λv 2 /2. The kinetic term for H
comes from the kinetic term for ϕ, and is the usual one for a real scalar field, −∂µ H∂ µ H/2.
Finally, recall that the mass term for the gauge fields is proportional to v 2 . Hence it should be
multiplied by a factor of (1 + H/v)2 .
Now we have to work out the kinetic terms for the gauge fields:
\[ \mathcal{L} = -\frac{1}{4}F^{a\mu\nu}F^a_{\mu\nu} - \frac{1}{4}B^{\mu\nu}B_{\mu\nu}. \tag{28.169} \]
We find
\[ \frac{1}{\sqrt{2}}(F^1_{\mu\nu} - iF^2_{\mu\nu}) = D_\mu W^+_\nu - D_\nu W^+_\mu, \qquad \frac{1}{\sqrt{2}}(F^1_{\mu\nu} + iF^2_{\mu\nu}) = D^\dagger_\mu W^-_\nu - D^\dagger_\nu W^-_\mu, \tag{28.170} \]
where we have defined a covariant derivative that acts on $W^+$,

\[ D_\mu \equiv \partial_\mu - ig_2A^3_\mu = \partial_\mu - ig_2(s_WA_\mu + c_WZ_\mu). \tag{28.171} \]

If we identify $A_\mu$ as the electromagnetic vector potential, and assign electric charge Q = +1 to the $W^+$, then we must identify the electromagnetic coupling constant e as

\[ e \equiv g_2s_W. \tag{28.172} \]

Here we are adopting the convention that e is positive. (In our treatment of quantum electro-
dynamics, we used the convention that e is negative, but that is less convenient in the present
context.)
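We also have
\[ F^3_{\mu\nu} = s_WF_{\mu\nu} + c_WZ_{\mu\nu} - ig_2(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu), \qquad B_{\mu\nu} = c_WF_{\mu\nu} - s_WZ_{\mu\nu}, \tag{28.173} \]

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the usual electromagnetic field strength, and $Z_{\mu\nu} \equiv \partial_\mu Z_\nu - \partial_\nu Z_\mu$ is the abelian field strength associated with the $Z_\mu$ field.
Now we can assemble all of this into the complete Lagrangian density for the electroweak gauge fields and the Higgs boson in unitary gauge. We will express $g_2$ in terms of e and $\theta_W$, and λ in terms of $m_H$ and v. We ultimately get
\[ \begin{aligned} \mathcal{L} = {} & -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \frac{1}{4}Z^{\mu\nu}Z_{\mu\nu} - D^{\dagger\mu}W^{-\nu}D_\mu W^+_\nu + D^{\dagger\mu}W^{-\nu}D_\nu W^+_\mu + ie(F^{\mu\nu} + \cot\theta_W Z^{\mu\nu})W^+_\mu W^-_\nu \\ & - \frac{1}{2}\frac{e^2}{\sin^2\theta_W}\left(W^{+\mu}W^-_\mu W^{+\nu}W^-_\nu - W^{+\mu}W^+_\mu W^{-\nu}W^-_\nu\right) - \left(M_W^2W^{+\mu}W^-_\mu + \frac{1}{2}M_Z^2Z^\mu Z_\mu\right)\left(1 + \frac{H}{v}\right)^2 \\ & - \frac{1}{2}\partial_\mu H\partial^\mu H - \frac{1}{2}m_H^2H^2 - \frac{1}{2}m_H^2v^{-1}H^3 - \frac{1}{8}m_H^2v^{-2}H^4, \end{aligned} \tag{28.174} \]
where $D_\mu = \partial_\mu - ie(A_\mu + \cot\theta_W Z_\mu)$.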

28.7.2 Lepton sector


Leptons are spin-one-half particles that are singlets of the color group. There are six different
flavours of lepton. The six flavours are naturally grouped into three families or generations: e
and νe , µ and νµ , τ and ντ .
Let us begin by describing a single lepton family, the electron and its neutrino. We introduce
left-handed Weyl fields l and ē in the representations (2, −1/2) and (1, +1) of SU(2) × U(1).
Here the bar over the e in the field ē is part of the name of the field, and does not denote any
sort of conjugation. The covariant derivatives of these fields are

(Dµ l)i = ∂µ li − ig2 Aaµ (T a )ij lj − ig1 (−1/2)Bµ li , Dµ ē = ∂µ ē − ig1 (+1)Bµ ē, (28.175)

and their kinetic terms are

Lkin = il†i σ̄ µ (Dµ l)i + iē† σ̄ µ Dµ ē. (28.176)

We cannot write down a mass term involving l and (or) ē because there is no gauge-group
singlet contained in any of the products

(2, −1/2) ⊗ (2, −1/2), (2, −1/2) ⊗ (1, +1), (1, +1) ⊗ (1, +1). (28.177)

However, we are able to write down a Yukawa coupling of the form

LYuk ≡ −yϵij ϕi lj ē + h.c., (28.178)

where ϕ is the Higgs field in the (2, −1/2) representation, and y is the Yukawa coupling con-
stant. A gauge-invariant Yukawa coupling is possible because there is a singlet on the right-
hand side of
(2, −1/2) ⊗ (2, −1/2) ⊗ (1, +1) = (1, 0) ⊕ (3, 0). (28.179)
In unitary gauge, we replace $\phi_1$ with $(v + H)/\sqrt{2}$, where H is the real scalar field representing the physical Higgs boson, and $\phi_2$ with zero. The Yukawa term becomes
\[ \mathcal{L}_{\text{Yuk}} = -\frac{1}{\sqrt{2}}y(v + H)(l_2\bar{e} + \text{h.c.}). \tag{28.180} \]
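It is now convenient to assign new names to the SU(2) components of l,
\[ l = \begin{pmatrix} \nu \\ e \end{pmatrix}. \tag{28.181} \]
Thus we have
\[ \mathcal{L}_{\text{Yuk}} = -\frac{1}{\sqrt{2}}y(v + H)\overline{E}E, \tag{28.182} \]
where we have defined a Dirac field for the electron
\[ E \equiv \begin{pmatrix} e \\ \bar{e}^\dagger \end{pmatrix}. \tag{28.183} \]

We see that the electron has acquired a mass $m_e \equiv yv/\sqrt{2}$, while the neutrino has remained massless. For neutrinos, it is more convenient to work with
\[ N_L \equiv P_LN = \begin{pmatrix} \nu \\ 0 \end{pmatrix}. \tag{28.184} \]
We can think of $N_L$ as a Dirac field; for example, the neutrino kinetic term can be written as $i\overline{N}_L\,\partial\!\!\!/\,N_L$.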
Now we express the covariant derivatives in terms of the $W^\pm_\mu$, $Z_\mu$, and $A_\mu$ fields. From our results in the previous subsection, we have
\[ g_2A^1_\mu T^1 + g_2A^2_\mu T^2 = \frac{g_2}{\sqrt{2}}\begin{pmatrix} 0 & W^+_\mu \\ W^-_\mu & 0 \end{pmatrix} \tag{28.185} \]
and
g2 A3µ T 3 + g1 Bµ Y = e(T 3 + Y )Aµ + e(cot θW T 3 − tan θW Y )Zµ . (28.186)
Since we identify Aµ as the electromagnetic field and e as the electromagnetic coupling con-
stant, we identify
Q = T3 + Y (28.187)
as the generator of electric charge. Then we see that

Qν = 0, Qe = −e, Qē = +ē. (28.188)

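It is convenient to replace Y with $Q - T^3$. We find

\[ g_2A^3_\mu T^3 + g_1B_\mu Y = eQA_\mu + \frac{e}{s_Wc_W}(T^3 - s_W^2Q)Z_\mu. \tag{28.189} \]
Using equations 28.185 and 28.189 in equations 28.175 and 28.176, we find the couplings of the gauge fields to the leptons,
\[ \mathcal{L}_{\text{int}} = \frac{1}{\sqrt{2}}g_2W^+_\mu J^{-\mu} + \frac{1}{\sqrt{2}}g_2W^-_\mu J^{+\mu} + \frac{e}{s_Wc_W}Z_\mu J^\mu_Z + eA_\mu J^\mu_{\text{EM}}, \tag{28.190} \]

where

\[ \begin{aligned} J^{+\mu} &\equiv \overline{E}_L\gamma^\mu N_L, \qquad J^{-\mu} \equiv \overline{N}_L\gamma^\mu E_L, \qquad J^\mu_Z \equiv J^\mu_3 - s_W^2J^\mu_{\text{EM}}, \\ J^\mu_3 &\equiv \frac{1}{2}\overline{N}_L\gamma^\mu N_L - \frac{1}{2}\overline{E}_L\gamma^\mu E_L, \qquad J^\mu_{\text{EM}} \equiv -\overline{E}\gamma^\mu E. \end{aligned} \tag{28.191} \]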

Having worked out the interactions of a single lepton generation, we now examine what hap-
pens when there is more than one of them. Let us consider the fields liI and ēI , where I =
1, 2, 3 is a generation index. The kinetic term for all these fields is

Lkin = ilI†i σ̄ µ (Dµ )ij lj I + iē†I σ̄ µ Dµ ēI . (28.192)

The most general Yukawa term we can write down now reads

LYuk = −ϵij ϕi lj I yIJ ēJ + h.c., (28.193)

where yIJ is a complex 3 × 3 matrix, and the generation indices are summed. We can make
unitary transformations in generation space on the fields: lI → LIJ lJ and ēI → E IJ ēJ , where
L and E are independent unitary matrices. The kinetic terms are unchanged, and the Yukawa
matrix y is replaced with $L^{\intercal}yE$. We can choose L and E so that $L^{\intercal}yE$ is diagonal with positive real entries $y_I$. The charged leptons then have masses $m_I = y_Iv/\sqrt{2}$, and the neutrinos remain massless. In the currents, we simply add a generation index I to each field, and sum over it.

28.7.3 Quark sector


Quarks are spin-one-half particles that are triplets of the color group. There are six different
flavours of quark. The six flavours are naturally grouped into three families or generations: u
and d, c and s, t and b.
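Let us begin by describing a single quark family, the up and down quarks. We introduce left-handed Weyl fields $q$, $\bar{u}$, and $\bar{d}$ in the representations (3, 2, +1/6), (3̄, 1, −2/3), and (3̄, 1, +1/3) of SU(3) × SU(2) × U(1). The covariant derivatives of these fields are

\[ (D_\mu q)_{\alpha i} = \partial_\mu q_{\alpha i} - ig_3A^a_\mu(T^a_3)_\alpha{}^\beta q_{\beta i} - ig_2A^a_\mu(T^a_2)_i{}^j q_{\alpha j} - ig_1(+1/6)B_\mu q_{\alpha i}, \tag{28.194a} \]
\[ (D_\mu\bar{u})^\alpha = \partial_\mu\bar{u}^\alpha - ig_3A^a_\mu(T^a_{\bar{3}})^\alpha{}_\beta\bar{u}^\beta - ig_1(-2/3)B_\mu\bar{u}^\alpha, \tag{28.194b} \]
\[ (D_\mu\bar{d})^\alpha = \partial_\mu\bar{d}^\alpha - ig_3A^a_\mu(T^a_{\bar{3}})^\alpha{}_\beta\bar{d}^\beta - ig_1(+1/3)B_\mu\bar{d}^\alpha. \tag{28.194c} \]

We rely on context to distinguish the SU(3) gauge fields from the SU(2) gauge fields. The kinetic terms for $q$, $\bar{u}$, and $\bar{d}$ are

\[ \mathcal{L}_{\text{kin}} = iq^{\dagger\alpha i}\bar{\sigma}^\mu(D_\mu q)_{\alpha i} + i\bar{u}^\dagger_\alpha\bar{\sigma}^\mu(D_\mu\bar{u})^\alpha + i\bar{d}^\dagger_\alpha\bar{\sigma}^\mu(D_\mu\bar{d})^\alpha. \tag{28.195} \]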

We cannot write down a mass term involving q, ū, and (or) d¯ because there is no gauge-group
singlet contained in any of the products of their representations. But we are able to write down
Yukawa couplings of the form

LYuk = −y ′ ϵij ϕi qαj d¯α − y ′′ ϕ†i qαi ūα + h.c., (28.196)



where ϕ is the Higgs field in the (1, 2, −1/2) representation, and y ′ and y ′′ are the Yukawa
coupling constants.

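In unitary gauge, we replace $\phi_1$ with $(v + H)/\sqrt{2}$, where H is the real scalar field representing the physical Higgs boson, and $\phi_2$ with zero. The Yukawa term becomes
\[ \mathcal{L}_{\text{Yuk}} = -\frac{1}{\sqrt{2}}y'(v + H)q_{\alpha 2}\bar{d}^\alpha - \frac{1}{\sqrt{2}}y''(v + H)q_{\alpha 1}\bar{u}^\alpha + \text{h.c.} \tag{28.197} \]
It is now convenient to assign new names to the SU(2) components of q,
\[ q = \begin{pmatrix} u \\ d \end{pmatrix}. \tag{28.198} \]

Then we have
\[ \mathcal{L}_{\text{Yuk}} = -\frac{1}{\sqrt{2}}y'(v + H)\overline{D}^\alpha D_\alpha - \frac{1}{\sqrt{2}}y''(v + H)\overline{U}^\alpha U_\alpha, \tag{28.199} \]
where we have defined Dirac fields for the down and up quarks,
\[ D_\alpha \equiv \begin{pmatrix} d_\alpha \\ \bar{d}^\dagger_\alpha \end{pmatrix}, \qquad U_\alpha \equiv \begin{pmatrix} u_\alpha \\ \bar{u}^\dagger_\alpha \end{pmatrix}. \tag{28.200} \]

We see that the up and down quarks have acquired masses

\[ m_d \equiv \frac{y'v}{\sqrt{2}}, \qquad m_u \equiv \frac{y''v}{\sqrt{2}}. \tag{28.201} \]

Since $Q = T^3 + Y$, we have
\[ Qu = +\frac{2}{3}u, \qquad Qd = -\frac{1}{3}d, \qquad Q\bar{u} = -\frac{2}{3}\bar{u}, \qquad Q\bar{d} = +\frac{1}{3}\bar{d}. \tag{28.202} \]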
This is just the set of electric charge assignments that we expect for the up and down quarks.
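The charge assignments follow mechanically from Q = T³ + Y applied to each component. The table below is a small sketch using exact rational arithmetic; the field labels and the helper name `electric_charges` are illustrative, while the representations and hypercharges are those listed at the start of section 28.7.

```python
from fractions import Fraction

# (label, T^3 eigenvalues of the SU(2) components, hypercharge Y)
MULTIPLETS = [
    ("l (nu, e)", [Fraction(1, 2), Fraction(-1, 2)], Fraction(-1, 2)),
    ("e_bar", [Fraction(0)], Fraction(1)),
    ("q (u, d)", [Fraction(1, 2), Fraction(-1, 2)], Fraction(1, 6)),
    ("u_bar", [Fraction(0)], Fraction(-2, 3)),
    ("d_bar", [Fraction(0)], Fraction(1, 3)),
    ("phi (Higgs)", [Fraction(1, 2), Fraction(-1, 2)], Fraction(-1, 2)),
]

def electric_charges(multiplets=MULTIPLETS):
    """Electric charge Q = T^3 + Y of every SU(2) component of every multiplet."""
    return {label: [t3 + y for t3 in t3_values] for label, t3_values, y in multiplets}

for label, charges in electric_charges().items():
    print(label, [str(q) for q in charges])
```

The upper component of the Higgs doublet is neutral, which is why giving it a VEV leaves the electromagnetic U(1) unbroken.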
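Using equations 28.185 and 28.189 in equations 28.194 and 28.195, we find the couplings of the gauge fields to the quarks,
\[ \mathcal{L}_{\text{int}} = \frac{1}{\sqrt{2}}g_2W^+_\mu J^{-\mu} + \frac{1}{\sqrt{2}}g_2W^-_\mu J^{+\mu} + \frac{e}{s_Wc_W}Z_\mu J^\mu_Z + eA_\mu J^\mu_{\text{EM}}, \tag{28.203} \]

where we have defined the currents

\[ \begin{aligned} J^{+\mu} &\equiv \overline{D}_L\gamma^\mu U_L, \qquad J^{-\mu} \equiv \overline{U}_L\gamma^\mu D_L, \qquad J^\mu_Z \equiv J^\mu_3 - s_W^2J^\mu_{\text{EM}}, \\ J^\mu_3 &\equiv \frac{1}{2}\overline{U}_L\gamma^\mu U_L - \frac{1}{2}\overline{D}_L\gamma^\mu D_L, \qquad J^\mu_{\text{EM}} \equiv +\frac{2}{3}\overline{U}\gamma^\mu U - \frac{1}{3}\overline{D}\gamma^\mu D. \end{aligned} \tag{28.204} \]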

Having worked out the interactions of a single quark generation, we now examine what hap-
pens when there is more than one of them. Let us consider the fields qαiI , ūI and d¯I , where
I = 1, 2, 3 is a generation index. The kinetic term for all these fields is

Lkin = iq †αiI σ̄ µ (Dµ )αiβj qβj I + iū†αI σ̄ µ (Dµ )αβ ūβI + id¯†αI σ̄ µ (Dµ )αβ d¯βI , (28.205)
where the repeated generation index is summed. The most general Yukawa term we can write down now reads
\[ \mathcal{L}_{\text{Yuk}} = -\epsilon^{ij}\phi_i q_{\alpha jI}\,y'_{IJ}\,\bar{d}^\alpha_J - \phi^{\dagger i}q_{\alpha iI}\,y''_{IJ}\,\bar{u}^\alpha_J + \text{h.c.}, \tag{28.206} \]
where $y'_{IJ}$ and $y''_{IJ}$ are complex 3 × 3 matrices. In unitary gauge, this becomes
\[ \mathcal{L}_{\text{Yuk}} = -\frac{1}{\sqrt{2}}(v + H)d_{\alpha I}\,y'_{IJ}\,\bar{d}^\alpha_J - \frac{1}{\sqrt{2}}(v + H)u_{\alpha I}\,y''_{IJ}\,\bar{u}^\alpha_J + \text{h.c.} \tag{28.207} \]
We can make unitary transformations in generation space on the fields: $d_I \to D_{IJ}d_J$, $\bar{d}_I \to \bar{D}_{IJ}\bar{d}_J$, $u_I \to U_{IJ}u_J$ and $\bar{u}_I \to \bar{U}_{IJ}\bar{u}_J$, where $U$, $\bar{U}$, $D$, $\bar{D}$ are independent unitary matrices. The kinetic terms are unchanged (except for the couplings to the $W^\pm$, as we will discuss momentarily), and the Yukawa matrices $y'$ and $y''$ are replaced with $D^{\intercal}y'\bar{D}$ and $U^{\intercal}y''\bar{U}$. We can choose $U$, $\bar{U}$, $D$, $\bar{D}$ so that $D^{\intercal}y'\bar{D}$ and $U^{\intercal}y''\bar{U}$ are diagonal with positive real entries $y'_I$ and $y''_I$. The down quarks $d_I$ then have masses $m_{d_I} = y'_Iv/\sqrt{2}$, and the up quarks $u_I$ have masses $m_{u_I} = y''_Iv/\sqrt{2}$. In the neutral currents, we simply add a generation index I to each field, and sum over it. The charged currents are more complicated, however; they become

\[ J^{+\mu} = \overline{D}_{LI}(V^\dagger)_{IJ}\gamma^\mu U_{LJ}, \qquad J^{-\mu} = \overline{U}_{LI}V_{IJ}\gamma^\mu D_{LJ}, \tag{28.208} \]

where $V \equiv U^\dagger D$ is the Cabibbo-Kobayashi-Maskawa matrix (CKM matrix).


A 3 × 3 unitary matrix has nine real parameters. However, we are still free to make the independent phase rotations $D_I \to e^{i\alpha_I}D_I$ and $U_I \to e^{i\beta_I}U_I$, as these leave the kinetic and mass terms invariant. These phase changes allow us to make the first row and column of $V_{IJ}$ real, eliminating five of the nine parameters. The remaining four can be chosen as $\theta_1$, $\theta_2$, $\theta_3$ and δ, where
\[ V = \begin{pmatrix} c_1 & +s_1c_3 & +s_1s_3 \\ -s_1c_2 & c_1c_2c_3 - s_2s_3e^{i\delta} & c_1c_2s_3 + s_2c_3e^{i\delta} \\ -s_1s_2 & c_1s_2c_3 + c_2s_3e^{i\delta} & c_1s_2s_3 - c_2c_3e^{i\delta} \end{pmatrix}, \tag{28.209} \]

and ci = cos θi and si = sin θi . Note that the charged currents now have some terms with
a phase factor eiδ , and some without. Since the time-reversal operator T is antiunitary, the
charged currents do not transform in a simple way under time reversal. This implies that the
charged current terms in Lint are not time-reversal invariant; hence the electroweak interac-
tions violate time-reversal symmetry. Since CP T is always a good symmetry, time-reversal
violation is equivalent to CP violation; δ is therefore sometimes called the CP violating phase.
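As a consistency check on the parametrization (28.209), the sketch below builds V from three angles and a phase and measures how far V V† is from the identity. The function names `km_matrix` and `unitarity_defect` and the sample angles are assumptions made for illustration; the angles are not physical values.

```python
import cmath
import math

def km_matrix(theta1, theta2, theta3, delta):
    """Mixing matrix in the parametrization of (28.209)."""
    c1, c2, c3 = (math.cos(t) for t in (theta1, theta2, theta3))
    s1, s2, s3 = (math.sin(t) for t in (theta1, theta2, theta3))
    phase = cmath.exp(1j * delta)
    return [
        [c1, s1 * c3, s1 * s3],
        [-s1 * c2, c1 * c2 * c3 - s2 * s3 * phase, c1 * c2 * s3 + s2 * c3 * phase],
        [-s1 * s2, c1 * s2 * c3 + c2 * s3 * phase, c1 * s2 * s3 - c2 * c3 * phase],
    ]

def unitarity_defect(V):
    """Largest absolute entry of V V^dagger - 1."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(V[i][k] * complex(V[j][k]).conjugate() for k in range(3))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst

V = km_matrix(0.2, 0.5, 1.1, 0.7)
print(unitarity_defect(V) < 1e-12)
```

The first row and first column are real by construction, so the phase δ appears only in the lower-right 2 × 2 block, which is the block responsible for the time-reversal violation discussed above.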
Finally, we show that the Standard Model is anomaly free. The representation of the left-handed Weyl fields is three copies of (1, 2, −1/2) ⊕ (1, 1, +1) ⊕ (3, 2, +1/6) ⊕ (3̄, 1, −2/3) ⊕ (3̄, 1, +1/3).
The 3-3-3 anomaly cancels if there are equal numbers of 3’s and 3̄’s; in doing this counting, each
SU(2) component counts separately. We see that each generation has two 3’s from (3, 2, +1/6)
and two 3̄’s from (3̄, 1, −2/3)⊕(3̄, 1, +1/3); thus the 3-3-3 anomaly cancels. There is no 2-2-2
anomaly because the 2 is a pseudoreal representation.
For mixed anomalies such as 3-3-1 and 2-2-1, we require $\sum_i T(R_i)Q_i$ to vanish. For 3-3-1, each SU(2) component counts separately. Setting T(3) = T(3̄) = 1, we have 2·(+1/6) + (+1/3) + (−2/3) = 0. For 2-2-1, each SU(3) component counts separately. Setting T(2) = 1, we have −1/2 + 3(+1/6) = 0.
For 1-1-1, we require $\sum_i Q_i^3$ to vanish, where the sum counts each SU(2) and SU(3) component separately. We have 1·2·(−1/2)³ + 1·1·(+1)³ + 3·2·(+1/6)³ + 3·1·(−2/3)³ + 3·1·(+1/3)³ = 0. Other possible combinations, such as 1-2-3 or 2-2-3, always involve the trace of a single SU(2) or SU(3) generator, and this vanishes. Finally, the global SU(2) anomaly is absent if there is an even number of 2’s; we have 1 + 3 = 4 2’s.
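The anomaly sums above are simple enough to verify with exact fractions. The following sketch encodes one generation of left-handed Weyl fields and evaluates each coefficient discussed in the text, with T(3) = T(3̄) = T(2) = 1 as in the text. The tuple layout and the names `GENERATION`, `colour_dim` and `anomaly_coefficients` are choices made for this illustration.

```python
from fractions import Fraction

# One generation: (label, colour representation, SU(2) dimension, hypercharge)
GENERATION = [
    ("l", "1", 2, Fraction(-1, 2)),
    ("e_bar", "1", 1, Fraction(1)),
    ("q", "3", 2, Fraction(1, 6)),
    ("u_bar", "3bar", 1, Fraction(-2, 3)),
    ("d_bar", "3bar", 1, Fraction(1, 3)),
]

def colour_dim(rep):
    """Number of colour components of a representation label."""
    return 1 if rep == "1" else 3

def anomaly_coefficients(fields=GENERATION):
    """Anomaly sums for one generation; all must vanish except the doublet count."""
    n_triplets = sum(d2 for _, c, d2, _ in fields if c == "3")
    n_antitriplets = sum(d2 for _, c, d2, _ in fields if c == "3bar")
    return {
        "3-3-3": n_triplets - n_antitriplets,
        "3-3-1": sum(d2 * y for _, c, d2, y in fields if c != "1"),
        "2-2-1": sum(colour_dim(c) * y for _, c, d2, y in fields if d2 == 2),
        "1-1-1": sum(colour_dim(c) * d2 * y ** 3 for _, c, d2, y in fields),
        "number of doublets": sum(colour_dim(c) for _, c, d2, _ in fields if d2 == 2),
    }

for name, value in anomaly_coefficients().items():
    print(name, value)
```

Every coefficient vanishes within a single generation, so the cancellation does not rely on having three copies; the doublet count of 4 is even, as required for the global SU(2) anomaly.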

Figure 28.7: Standard Model of Particle Physics. The diagram shows the elementary particles
of the Standard Model (the Higgs boson, the three generations of quarks and leptons, and the
gauge bosons), including their names, masses, spins, charges, chiralities, and interactions with
the strong, weak and electromagnetic forces. It also depicts the crucial role of the Higgs boson
in electroweak symmetry breaking, and shows how the properties of the various particles differ
in the (high-energy) symmetric phase (top) and the (low-energy) broken-symmetry phase
(bottom).
Part VI

Statistical Mechanics And Field Theory


Chapter 29
Thermodynamics

29.1 Central problem of thermodynamics


A thermodynamic system is a macroscopic system whose behaviour is identified by a small and
finite number of quantities - the thermodynamic properties. One considers only a restricted
set of manipulations on thermodynamic systems. In practice, one allows them to be put in
contact with one another, or one acts upon them by changing a few macroscopic properties
such as their volume or the electric or magnetic field in which they are immersed. One then
identifies a number of properties such that, if they are known before the manipulation, their
values after the manipulation can be predicted. The smallest set of properties that allows one
to successfully perform such a prediction can be selected as the basis for a thermodynamic
description of the system.

If the state of a thermodynamic system can be fully characterized by the values of the ther-
modynamic variables, and if these values are invariant over time, one says that it is in a state
of thermodynamic equilibrium. Thermodynamic equilibrium occurs when all fast processes
have already occurred, while the slow ones have yet to take place. Clearly the distinction be-
tween fast and slow processes is dependent on the observation time τ that is being considered.
A system can be in equilibrium if the observation time is fairly short, while it is no longer
possible to consider it in equilibrium for longer observation times. A more curious situation
is that the same system can be considered in equilibrium, but with different properties, for
different observation times.

Let us consider two thermodynamic systems, 1 and 2, that can be made to interact with one
another. Variables like the volume V , the number of particles N , and the internal energy U ,
whose value (relative to the total system) is equal to the sum of the values they assume in
the single systems, are called additive or extensive. Strictly speaking, internal energy is not
extensive, unless the interaction between 1 and 2 can be neglected.

The fundamental hypothesis of thermodynamics is that it should be possible to characterize the state of a thermodynamic system by specifying the values of a certain set $(X_0, X_1, \cdots, X_r)$ of extensive variables. For example, $X_0$ could be the internal energy U, $X_1$ the number of particles N, $X_2$ the volume V of the system, and so on. The central problem of thermodynamics is the following: given the initial states of equilibrium of several thermodynamic systems that are allowed to interact, determine the final thermodynamic state of equilibrium.

The interaction between thermodynamic systems is usually represented by idealized walls that allow the passage of one (or more) extensive quantities from one system to the other. Among the various possibilities, the following are usually considered:

Thermally conductive walls: These allow the passage of energy, but not of volume or particles.

Semipermeable walls: These allow the passage of particles belonging to a given chemical species.

The space of possible states of equilibrium (compatible with constraints and initial conditions)
is called the space of virtual states. The initial state is obviously a (specific) virtual state. The
central problem of thermodynamics can obviously be restated as follows: Characterize the
actual state of equilibrium among all virtual states.

29.2 Entropy formulation of thermodynamics

29.2.1 Property of entropy function


There exists a function S of the extensive variables (X0 , X1 , · · · , Xr ), called the entropy, that
assumes the maximum value for a state of equilibrium among all virtual states and that pos-
sesses the following properties:

1. Extensivity: If 1 and 2 are thermodynamic systems, then

S (1∪2) = S (1) + S (2) . (29.1)

2. Convexity: If X 1 = (X01 , X11 , · · · , Xr1 ) and X 2 = (X02 , X12 , · · · , Xr2 ) are two thermo-
dynamic states of the same system, then for any α between 0 and 1, one obtains

S[(1 − α)X 1 + αX 2 ] ≥ (1 − α)S(X 1 ) + αS(X 2 ). (29.2)

   From this expression, if we take the derivative with respect to α at α = 0, we obtain

\[ \sum_{i=0}^{r}\left.\frac{\partial S}{\partial X_i}\right|_{X^1}(X_i^2 - X_i^1) \geq S(X^2) - S(X^1), \tag{29.3} \]

   which expresses the fact that the surface $S(X_0, X_1, \cdots, X_r)$ is always below the plane that is tangent to each of its points. (We adopt the convention that convex means upper convex.)

3. Monotonicity: $S(U, X_1, \cdots, X_r)$ is a monotonically increasing function of the internal energy U:
\[ \left.\frac{\partial S}{\partial U}\right|_{X_1,\cdots,X_r} = \frac{1}{T} > 0. \tag{29.4} \]

The entropy postulate allows one to solve the central problem of thermodynamics, by refer-
ring it back to the solution of a constrained extremum problem: The equilibrium state cor-
responds to the maximum entropy compatible with the constraints.

29.2.2 Simple problems


Thermal contact
Let us consider two systems, 1 and 2, that are in contact by means of a thermally conductive wall. The virtual state space is therefore defined by the relations:
\[ U^{(1)} + U^{(2)} = U = \text{const}, \qquad X_i^{(1)} = \text{const}, \qquad X_i^{(2)} = \text{const}, \qquad i = 1, \cdots, r. \tag{29.5} \]

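Let us look for the maximum of S as a function of $U^{(1)}$:

\[ \frac{\partial S}{\partial U^{(1)}} = \left.\frac{\partial S^{(1)}}{\partial U^{(1)}}\right|_{U^{(1)}} - \left.\frac{\partial S^{(2)}}{\partial U^{(2)}}\right|_{U - U^{(1)}}. \tag{29.6} \]

We will denote the value of $U^{(1)}$ at equilibrium by $U^{(1)}_{\text{eq}}$. One therefore has

\[ \left.\frac{\partial S^{(1)}}{\partial U^{(1)}}\right|_{U^{(1)}_{\text{eq}}} = \left.\frac{\partial S^{(2)}}{\partial U^{(2)}}\right|_{U^{(2)}_{\text{eq}}}. \tag{29.7} \]

Due to entropy’s convexity, we can derive that

\[ \left[\left.\frac{\partial S^{(1)}}{\partial U^{(1)}}\right|_{U^{(1)}_{\text{ini}}} - \left.\frac{\partial S^{(2)}}{\partial U^{(2)}}\right|_{U^{(2)}_{\text{ini}}}\right]\left(U^{(1)}_{\text{eq}} - U^{(1)}_{\text{ini}}\right) \geq 0. \tag{29.8} \]

Let us introduce the quantity

\[ T = \left(\frac{\partial S}{\partial U}\right)^{-1}. \tag{29.9} \]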
According to our hypotheses, this quantity is positive. We obtained the following results:
• At equilibrium, T is the same in all subsystems that are in reciprocal contact by means
of thermally conductive walls.
• In order to reach equilibrium, energy shifts from systems with higher values of T toward
systems with lower values of T .
Later, we will show that T is the absolute temperature of the system.
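The two results listed above can be illustrated with a toy model. The sketch below assumes each system has entropy S = C ln U with a constant C, so that T = U/C; this model, the bisection solver, and the names `equilibrium_energy` and `temperatures` are assumptions made for this example and do not come from the notes.

```python
def equilibrium_energy(c1, c2, u_total):
    """Maximize S = c1 ln(U1) + c2 ln(U - U1) by bisection on dS/dU1,
    which is a decreasing function of U1 because S is concave."""
    lo, hi = 0.0, u_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        slope = c1 / mid - c2 / (u_total - mid)   # dS/dU1 = 1/T1 - 1/T2
        if slope > 0:
            lo = mid      # entropy still increases with U1: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def temperatures(c1, c2, u1, u_total):
    """Temperatures T = (dS/dU)^(-1) = U / C of the two systems."""
    return u1 / c1, (u_total - u1) / c2

c1, c2, u_total, u1_initial = 1.5, 3.0, 9.0, 6.0
u1_eq = equilibrium_energy(c1, c2, u_total)
print(temperatures(c1, c2, u1_initial, u_total))   # initial temperatures
print(temperatures(c1, c2, u1_eq, u_total))        # equal at equilibrium
print(u1_eq - u1_initial)                          # negative: energy leaves the hotter system
```

In this example system 1 starts hotter, and the maximization moves energy out of it until both temperatures coincide, in line with inequality (29.8).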

A thermally conductive and mobile wall


In this case, the two systems can also exchange volume V, in addition to internal energy U. If we introduce the quantity p by
\[ \frac{p}{T} = \frac{\partial S}{\partial V}, \tag{29.10} \]
the equilibrium conditions will be

T (1) = T (2) , p(1) = p(2) . (29.11)

One can easily prove that between two systems, both initially at the same temperature, volume
is initially released by the system in which p is lower to the system in which p is higher. Later,
we will show that p is the pressure of the system.

A semipermeable wall
Let us consider a system composed of several chemical species, and let us introduce the numbers of molecules $N_1, \cdots, N_r$ belonging to the chemical species that constitute it as part of the thermodynamic variables. Let us suppose that two systems of this type are separated by a wall that only allows the k-th chemical species to pass. Clearly, it is impossible for the exchange of molecules to occur without an exchange of energy. If we introduce the quantity $\mu_i$ by

\[ \frac{\mu_i}{T} = -\frac{\partial S}{\partial N_i}, \tag{29.12} \]
the equilibrium conditions will be

\[ T^{(1)} = T^{(2)}, \qquad \mu_k^{(1)} = \mu_k^{(2)}. \tag{29.13} \]

We call $\mu_i$ the chemical potential of species i.

29.2.3 Heat and work


From mechanics (and from electromagnetism), we can derive an expression for the infinitesi-
mal mechanical work performed on the system by varying the extensive quantities. One usu-
ally adopts a sign convention according to which work is considered positive if the system per-
forms work on the outside. Following this convention, the expression of infinitesimal work is
given by
\[ \delta W = -\sum_{i=1}^{r} f_i\, dX_i. \tag{29.14} \]

On the one hand, we have

\[ dS = \frac{dU}{T} + \sum_{i=1}^{r} \left.\frac{\partial S}{\partial X_i}\right|_{U,\cdots,X_r} dX_i. \tag{29.15} \]

On the other hand, we have


dU = δQ − δW . (29.16)
Finally, we get
\[ \delta Q = T\,dS, \qquad \left.\frac{\partial S}{\partial X_i}\right|_{U,\cdots,X_r} = -\frac{f_i}{T}. \tag{29.17} \]

Temperature
Let us consider a system made up of a thermal engine and two heat reservoirs with T1 > T2 .
A heat reservoir is a system for which T is independent of U . The whole compound system is
enclosed in a container that allows it to exchange energy with the environment only in a purely
mechanical way.
Let the system evolve from an initial equilibrium condition, in which the first heat reservoir has
internal energy U1 , the second has internal energy U2 , and the thermal engine is in some equi-
librium state, to a final equilibrium state in which the first heat reservoir has internal energy
U1′ , the second has U2′ . Thus the work performed by the system is W = (U1 + U2 ) − (U1′ + U2′ ),
and the thermal engine is back to its initial state. By definition, the efficiency of the engine is
given by η = W/(U1 − U1′ ).
In a transformation of this kind, the total entropy of the compound system cannot become
smaller:
S (1) (U1 ) + S (2) (U2 ) ≤ S (1) (U1′ ) + S (2) (U2′ ). (29.18)
Since we are dealing with heat reservoirs, we have
\[ S^{(i)}(U_i') = S^{(i)}(U_i) + \frac{U_i' - U_i}{T_i}, \qquad i = 1, 2. \tag{29.19} \]
It follows from equations 29.18 and 29.19 that
\[ \frac{U_1 - U_1'}{T_1} \le \frac{U_2' - U_2}{T_2}. \tag{29.20} \]
Using the definition of the efficiency, we get
\[ \eta \le 1 - \frac{T_2}{T_1}. \tag{29.21} \]
Comparing this bound with the maximum efficiency obtained in elementary thermodynamics, we
conclude that T is the absolute temperature, up to an overall factor, which can be fixed to 1 by
rescaling S.
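The bound in equation 29.21 can be checked numerically. The sketch below is only an illustration: the reservoir temperatures and the heat drawn from the hot reservoir are made-up values in arbitrary units, and the function names are not from the text. It evaluates the largest admissible work and confirms that the total reservoir entropy change of equations 29.18 and 29.19 vanishes exactly at that bound and turns negative beyond it.

```python
def max_work(T1: float, T2: float, Q1: float) -> float:
    """Upper bound on the work when heat Q1 = U1 - U1' leaves the hot reservoir (eq. 29.21)."""
    if not (T1 > T2 > 0):
        raise ValueError("need T1 > T2 > 0")
    return Q1 * (1.0 - T2 / T1)


def entropy_change(T1: float, T2: float, Q1: float, W: float) -> float:
    """Total entropy change of the two reservoirs (eqs. 29.18 and 29.19).

    The hot reservoir loses Q1; the cold reservoir gains Q1 - W.
    """
    return -Q1 / T1 + (Q1 - W) / T2


# Illustrative values only (arbitrary units, not taken from the text)
T1, T2, Q1 = 500.0, 300.0, 1000.0
W_max = max_work(T1, T2, Q1)
eta_max = W_max / Q1

print("maximum work:", W_max)
print("maximum efficiency:", eta_max)
print("entropy change at the bound:", entropy_change(T1, T2, Q1, W_max))
```

Any attempt to extract more work than `W_max` makes `entropy_change` negative, which is exactly what inequality 29.18 forbids.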

Pressure
Let us consider an infinitesimal variation of V. In this case, mechanics tells us that the work
performed by the system is given by δW = P dV. Thus we have
\[ \left.\frac{\partial S}{\partial V}\right|_{U,\cdots,X_r} = \frac{P}{T}. \tag{29.22} \]

This allows us to identify the pressure P with the quantity p we defined previously.

The fundamental equation


The equation
S = S(X0 = U, X1 , · · · , Xr ) (29.23)
is called the fundamental equation, and it represents a complete description of the thermody-
namics of the system being considered.

29.3 Thermodynamic potential


29.3.1 Energy scheme
We can also use a different (but equivalent) formulation of the fundamental principle of ther-
modynamics, in which entropy assumes the role of an independent variable, while energy
becomes a dependent variable that satisfies a variational principle. This formalism is known
as the energy scheme.
Let ∆X be a virtual variation of the extensive variables (excluding internal energy U ) with
respect to the equilibrium value Xeq . Then

∆S = S(U, Xeq + ∆X) − S(U, Xeq ) ≤ 0. (29.24)

Since S is a monotonically increasing function of U , there exists a value U ′ > U such that
S(U ′ , Xeq + ∆X) = S(U, Xeq ). Therefore, if S is kept constant, as the system moves out
of equilibrium, U cannot but increase. Thus, in energy formalism, the maximum entropy
principle is replaced by the principle of minimum internal energy: Among all states with a
specific entropy value, the state of equilibrium is that in which internal energy is minimal.
The fundamental equation in the energy scheme is U = U (S, X1 , · · · , Xr ). Its differential
assumes the form
\[ dU = T\,dS + \sum_{i=1}^{r} f_i\,dX_i. \tag{29.25} \]
Furthermore, it can be derived that

U [(1 − α)Y 1 + αY 2 ] ≤ (1 − α)U (Y 1 ) + αU (Y 2 ) where Y = {S, X1 , · · · , Xr }, (29.26)

which implies that the internal energy function is concave in the convention of these notes (convex in the standard mathematical sense).

29.3.2 Intensive variables and thermodynamic potentials


The derivatives $f_i = \partial U/\partial X_i$ of the internal energy U with respect to the extensive quantities S,
$\{X_i\}$ are called intensive quantities. For uniformity’s sake, we set $X_0 \equiv S$ and define $f_0 \equiv \partial U/\partial X_0 = T$. A
given quantity $f_i$ is called the conjugate of the corresponding variable $X_i$, and vice versa.
Since both U and Xi are extensive, intensive variables are not dependent on system size. More-
over, if a system is composed of several subsystems that can exchange the extensive quantity
Xi , the corresponding intensive variable fi assumes the same value in those subsystems that
are in contact at equilibrium. We now want to identify the state of equilibrium among all states
that exhibit a given value of an intensive variable fi .
Specifically, for i = 0, we are confronted with the case of a system with a fixed temperature. Let
us now define the Helmholtz free energy F(T, X) by the relation

F (T, X) = U (S(T, X), X) − T S(T, X), (29.27)

where $X = \{X_1, \cdots, X_r\}$. It can be shown that thermodynamic equilibrium under these
conditions is characterized by the following variational principle: The value of the Helmholtz
free energy is minimal for the equilibrium state among all virtual states at the given
temperature T.
Let us now consider more generally the Legendre transform of the internal energy U with
respect to the extensive variable $X_1$, which trades it for its conjugate intensive variable $f_1$:

Φ(S, f1 , X2 , · · · , Xr ) = U (S, X1 , X2 , · · · , Xr ) − f1 X1 , (29.28)



where $X_1(S, f_1, X_2, \cdots, X_r)$ is determined by the condition

\[ f_1 = \left.\frac{\partial U}{\partial X_1}\right|_{S,X_2,\cdots,X_r}. \tag{29.29} \]

Then, the state of equilibrium is specified by the following criterion: Among all the states
that have the same value of $f_1$, the state of equilibrium is that which corresponds to the
minimum value of Φ.
The partial derivative of Φ with respect to $f_1$, with the other variables
kept fixed, yields the value of the extensive variable $X_1$:
\[ \left.\frac{\partial \Phi}{\partial f_1}\right|_{S,X_2,\cdots,X_r} = -X_1(S, f_1, X_2, \cdots, X_r). \tag{29.30} \]

Considering two intensive variables T and f1 , we would introduce the thermodynamic po-
tential Φ(T, f1 , X2 , · · · , Xr ), obtained as a Legendre transform of U with respect to S and
X1 :
Φ(T, f1 , X2 , · · · , Xr ) = U − T S − f1 X1 . (29.31)
This thermodynamic potential assumes at equilibrium the minimum value among all the states
with the same values of T and f1 .
We can obtain a whole series of thermodynamic potentials by using a Legendre transform with
respect to the extensive variables Xi . However, we cannot eliminate all extensive variables in
this manner. We will see later that if we did this, the resulting thermodynamic potential would
identically vanish. A general thermodynamic potential

\[ \Phi(T, f_1, \cdots, f_k, X_{k+1}, \cdots, X_r) = U - TS - \sum_{i=1}^{k} f_i X_i \tag{29.32} \]

is concave as a function of the remaining extensive variables, for fixed values of the intensive
variables $f_1, \cdots, f_k$. On the other hand, Φ is convex as a function of the intensive variables $f_i$
when the extensive variables are fixed. The concavity and convexity are connected to the stability
of thermodynamic equilibrium. For example, the specific heat is always positive, whatever
constraint is placed on the system, since

\[ C \equiv T\frac{\partial S}{\partial T} = -T\frac{\partial^2 \Phi}{\partial T^2} > 0. \tag{29.33} \]
If this were not the case, a small fluctuation in temperature that might make one of the sys-
tem’s regions colder would lead to this system claiming heat from surrounding regions. With
negative specific heat, this energy would make the temperature of the region diminish further,
and in this manner, the energy fluctuation would be amplified.

29.3.3 Free energy and Maxwell relations


The natural variables of F are the temperature T and the extensive variables $X_1, \cdots, X_r$,
entropy excluded. The entropy is obtained from the derivative of F with respect to T, namely
$S = -\partial F/\partial T$. The expression for the differential of F is

\[ dF = -S\,dT + \sum_{i=1}^{r} f_i\,dX_i. \tag{29.34} \]

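More specifically, by setting $X_1 = V$, one has

\[ -P = \left.\frac{\partial F}{\partial V}\right|_{T,X_2,\cdots,X_r}. \tag{29.35} \]

If we now take the derivative of P with respect to T and use the equality of mixed
derivatives, we obtain
\[ \left.\frac{\partial P}{\partial T}\right|_{V,X_2,\cdots,X_r} = -\frac{\partial^2 F}{\partial T\,\partial V} = \left.\frac{\partial S}{\partial V}\right|_{T,X_2,\cdots,X_r}. \tag{29.36} \]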

These relations between thermodynamic derivatives that derive from the equality of mixed
derivatives of thermodynamic potentials are called Maxwell relations.
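As a rough numerical illustration of relation 29.36, the sketch below assumes the free energy of a monatomic ideal gas, F = −N T [ln(V/(Nλ³)) + 1] with λ = sqrt(2π/(mT)), in units where ħ = k_B = 1. This specific model, the sample point, and the step size are assumptions made for the example. Both sides of the Maxwell relation are estimated by central finite differences; since the two nested differences use the same stencil, the informative check is the comparison with the analytic value N/V.

```python
import math


def free_energy(T, V, N=1.0, m=1.0):
    """Assumed model: F = -N T [ln(V / (N lambda^3)) + 1], lambda = sqrt(2 pi / (m T))."""
    lam = math.sqrt(2.0 * math.pi / (m * T))
    return -N * T * (math.log(V / (N * lam**3)) + 1.0)


def d_dT(f, T, V, h=1e-4):
    """Central finite difference with respect to T at fixed V."""
    return (f(T + h, V) - f(T - h, V)) / (2.0 * h)


def d_dV(f, T, V, h=1e-4):
    """Central finite difference with respect to V at fixed T."""
    return (f(T, V + h) - f(T, V - h)) / (2.0 * h)


def pressure(T, V):
    return -d_dV(free_energy, T, V)   # P = -dF/dV at fixed T (eq. 29.35)


def entropy(T, V):
    return -d_dT(free_energy, T, V)   # S = -dF/dT at fixed V


T0, V0 = 2.0, 3.0                      # arbitrary sample point
dP_dT = d_dT(pressure, T0, V0)         # left-hand side of eq. 29.36
dS_dV = d_dV(entropy, T0, V0)          # right-hand side of eq. 29.36

print(dP_dT, dS_dV, 1.0 / V0)          # all three should agree closely
```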
The free energy designation is derived from the following property. If a system is put in contact
with a reservoir at temperature T , the maximum quantity of work Wmax that it can perform
on its environment is equal to the variation in free energy between the initial and final states.

29.3.4 Gibbs free energy and enthalpy


Transforming F according to Legendre with respect to V , we obtain a new thermodynamic
potential, called the Gibbs free energy:

G(T, P, X2 , · · · , Xr ) = F + P V = U − T S + P V. (29.37)

The variational principle satisfied by the Gibbs free energy is the following: Among all states
that have the same temperature and pressure values, the state of equilibrium is that in
which the Gibbs free energy assumes the minimum value.
G’s differential is expressed as follows:
\[ dG = -S\,dT + V\,dP + \sum_{i=2}^{r} f_i\,dX_i. \tag{29.38} \]

If a system is brought toward equilibrium while temperature and pressure are kept constant, the
maximum work that can be performed on its environment is given precisely by the difference
between the initial and final values of G.
If we Legendre transform the internal energy U with respect to V , we obtain a new thermo-
dynamic potential, usually denoted by H and called enthalpy:

H(S, P, X2 , · · · , Xr ) = U + P V. (29.39)

Enthalpy governs the equilibrium of adiabatic processes that occur while pressure is constant:
Among all states that have the same entropy and pressure values, the state of equilibrium
is the one that corresponds to the minimum value of enthalpy.

If a system relaxes toward equilibrium while the pressure is kept constant, the maximum heat
that can be produced by the system is equal to its variation in enthalpy. For this reason,
enthalpy is also called free heat. The differential of H is given by
\[ dH = T\,dS + V\,dP + \sum_{i=2}^{r} f_i\,dX_i. \tag{29.40} \]

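The equality of the mixed derivatives of G and H yields two more Maxwell relations:

\[ \left.\frac{\partial S}{\partial P}\right|_{T,X_2,\cdots,X_r} = -\left.\frac{\partial V}{\partial T}\right|_{P,X_2,\cdots,X_r}, \qquad \left.\frac{\partial T}{\partial P}\right|_{S,X_2,\cdots,X_r} = \left.\frac{\partial V}{\partial S}\right|_{P,X_2,\cdots,X_r}. \tag{29.41} \]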

29.3.5 Other thermodynamic potentials


The Legendre transform of F with respect to N produces a thermodynamic potential (often
written as Ω) that depends on T , on volume V , on chemical potential µ, and on the other
extensive variables:
Ω(T, V, µ) = F − µN. (29.42)
Its differential is expressed as follows:

dΩ = −S dT − P dV − N dµ . (29.43)

If one transforms U instead, one obtains a rarely used potential that depends on S, V , and µ,
which we will designate as Φ(S, V, µ) = U − µN . Its differential is given by dΦ = T dS −
P dV − N dµ.

29.3.6 The Euler and Gibbs-Duhem equations


By taking the derivative of U (λS, λX1 , · · · , λXr ) = λU (S, X1 , · · · , Xr ) with respect to λ
and setting λ = 1, we obtain the Euler equation
\[ TS + \sum_{i=1}^{r} f_i X_i = U. \tag{29.44} \]

More particularly, for simple fluids, one obtains

U = T S − P V + µN, (29.45)

which implies that

µ = (U − T S + P V )/N = G/N, Ω = U − T S − µN = −P V. (29.46)

From the Euler equation, it follows that the Legendre transform of U with respect to all exten-
sive variables vanishes identically.

Note: The interpretation of the chemical potential as a per particle density of Gibbs free energy is valid
only in the case of simple fluids – in the case of a mixture of several chemical species, it is no longer valid.

If we take the differential of the Euler equation and subtract from it the usual expression
for dU, we obtain the Gibbs-Duhem equation:
\[ S\,dT + \sum_{i=1}^{r} X_i\,df_i = 0. \tag{29.47} \]

In the case of simple fluids, for example, one arrives at S dT − V dP + N dµ = 0. Dividing
by the number of particles N, one obtains the Gibbs-Duhem equation in the form

dµ = v dP − s dT , (29.48)

where v represents volume per particle and s entropy per particle.


Relations between the densities and the intensive variables are called equations of state. If, for
example, we consider the Gibbs free energy for a simple fluid, we arrive at
\[ v = \frac{V}{N} = \frac{1}{N}\left.\frac{\partial G}{\partial P}\right|_{T,N} = v(T, P), \tag{29.49} \]

where we have made use of the fact that G is extensive, and therefore proportional to N. In
the case of the simple fluid, we also have the equation of state s = s(T, P), which expresses the
entropy per particle s as a function of P and T. The two equations of state are not
completely independent, because of the Maxwell relation
\[ \left.\frac{\partial s}{\partial P}\right|_{T} = -\left.\frac{\partial v}{\partial T}\right|_{P}. \tag{29.50} \]

29.4 Thermodynamic systems with multi-components


29.4.1 Chemical reactions
Let us now consider a mixture of r chemical species, $A_1, \cdots, A_r$, which can be transformed
into one another by a reaction of the following type:

\[ \nu_1 A_1 + \cdots + \nu_k A_k \rightleftharpoons \nu_{k+1} A_{k+1} + \cdots + \nu_r A_r. \tag{29.51} \]

We can conventionally assign negative stoichiometric coefficients $\nu_i$ to the products, so as to
write this formula as a formal equation:
\[ \sum_{i=1}^{r} \nu_i A_i = 0. \tag{29.52} \]

If temperature and pressure are kept fixed, the variation of Gibbs free energy for a certain
variation in the number of particles due to the reaction will be
\[ \delta G = \sum_i \left.\frac{\partial G}{\partial N_i}\right|_{P,T} \delta N_i \propto \sum_i \left.\frac{\partial G}{\partial N_i}\right|_{P,T} \nu_i = \sum_i \mu_i \nu_i. \tag{29.53} \]

Since at equilibrium one must have δG = 0 for any virtual variation of the $N_i$, one will have
\[ \sum_i \mu_i \nu_i = 0. \tag{29.54} \]

29.4.2 Phase coexistence


It frequently happens that two systems characterized by different thermodynamic density val-
ues can maintain thermodynamic equilibrium even in the absence of constraints on the mutual
exchange of extensive quantities. This situation is called phase coexistence.

In the case of a simple fluid, it is realized, for example, when a liquid coexists with its vapor
inside a container. In this case, the intensive variables assume the same value in both sys-
tems, while densities assume different values. In these cases, we refer to each of the coexisting
homogeneous systems as a phase.

One can describe phase coexistence by saying that the equation of state v = v(P, T ) does
not admit of a unique solution, but instead allows for at least the two solutions v = vliq and
v = vvap which correspond to the liquid and vapor, respectively. Since the liquid and vapor
coexist and can exchange particles, the chemical potential of the liquid has to be equal to that
of the vapor:
µliq (P, T ) = µvap (P, T ). (29.55)

On the other hand, we know that for a simple fluid, the chemical potential is equal to the Gibbs
free energy per particle. Thus, the Gibbs free energy in the total system does not depend on
the number of particles that make up the liquid and the vapor system:

G = Gliq + Gvap = Nliq µliq + Nvap µvap = (Nliq + Nvap )µ = N µ. (29.56)

The volume per particle of the system is given by

\[ v = \frac{V_{\rm liq} + V_{\rm vap}}{N} = \frac{N_{\rm liq} v_{\rm liq} + N_{\rm vap} v_{\rm vap}}{N} = x_{\rm liq} v_{\rm liq} + x_{\rm vap} v_{\rm vap}, \tag{29.57} \]
where xliq is the fraction of particles in the liquid and xvap = 1 − xliq that of the particles in
the vapor. As a consequence, the value of v lies somewhere between vliq and vvap .
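Equation 29.57 can be inverted to give the fraction of particles in each phase for a given overall volume per particle (the lever rule). The short sketch below does this; the function name and the numerical values are illustrative assumptions, not taken from the text.

```python
def liquid_fraction(v: float, v_liq: float, v_vap: float) -> float:
    """Solve v = x_liq * v_liq + (1 - x_liq) * v_vap for x_liq (eq. 29.57)."""
    if not (v_liq < v_vap):
        raise ValueError("expected v_liq < v_vap")
    if not (v_liq <= v <= v_vap):
        raise ValueError("v lies outside the coexistence interval")
    return (v_vap - v) / (v_vap - v_liq)


# Illustrative numbers in arbitrary units
v_liq, v_vap, v = 1.0, 11.0, 4.0
x_liq = liquid_fraction(v, v_liq, v_vap)
x_vap = 1.0 - x_liq

print("x_liq =", x_liq, "x_vap =", x_vap)
# Consistency check: recombining the two phases reproduces v
print("reconstructed v =", x_liq * v_liq + x_vap * v_vap)
```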

In the equation of state P = P (v, T ), phase coexistence appears as a horizontal segment, for
a given value of T , and for values of v between vliq and vvap , as shown in Figure 29.1 (a).

Consider the Helmholtz free energy F . The pressure P is obtained as the derivative of −F
with respect to V at a given value of T . The isotherm curve F (V ) exhibits a straight segment
with slope −Pt , lying between Vliq = N vliq and Vvap = N vvap , as shown in Figure 29.1 (b).

The Gibbs free energy has a turning point (a discontinuity in slope) at $P_t$. The two slopes that meet at the turning
point correspond to $V_{\rm liq}$ and $V_{\rm vap}$, respectively, as shown in Figure 29.1 (c).

29.4.3 The Clausius-Clapeyron equation


If we consider a thermodynamic system as a function of its intensive variables, we can identify
some regions in which the thermodynamic properties vary regularly with variations of their
values. These regions represent thermodynamically stable phases and are limited by curves
that represent phase transitions. The phase transitions can be discontinuous, like the phase
coexistence we just discussed, or continuous. In the first case, the densities present a
discontinuity at the transition (first order transitions), while in the second, they vary continuously,
even though their derivatives can exhibit some singularities (second order transitions).

Figure 29.1: (a) Isotherm P as a function of the volume per particle v for a simple liquid; (b)
Isotherm of the Helmholtz free energy F as a function of volume V; (c) Isotherm of the Gibbs
free energy G as a function of pressure P. The black and red dashed lines in each curve represent
metastable and unstable states, respectively.

In the case of a simple fluid, it is possible to identify the transition curve within the plane of
the intensive variables (P, T ), as shown in Figure 29.2 (a), from the condition of equality of
the chemical potential µ between the two coexisting phases:

µliq (Pt (T ), T ) = µvap (Pt (T ), T ). (29.58)

Taking the total derivative of equation 29.58 with respect to T , along the transition line Pt (T ),
we can obtain the Clausius–Clapeyron equation for phase coexistence:
\[ \frac{dP_t}{dT} = \frac{s_{\rm vap} - s_{\rm liq}}{v_{\rm vap} - v_{\rm liq}}. \tag{29.59} \]
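The differentiation step can be spelled out with the Gibbs-Duhem form dµ = v dP − s dT (equation 29.48), applied to each phase separately along the transition line. The following is a sketch of that intermediate algebra:

```latex
\begin{align*}
\frac{d}{dT}\,\mu_{\rm liq}\bigl(P_t(T),T\bigr) &= v_{\rm liq}\,\frac{dP_t}{dT} - s_{\rm liq}, \\
\frac{d}{dT}\,\mu_{\rm vap}\bigl(P_t(T),T\bigr) &= v_{\rm vap}\,\frac{dP_t}{dT} - s_{\rm vap}, \\
\text{equating the two:}\qquad
(v_{\rm vap}-v_{\rm liq})\,\frac{dP_t}{dT} &= s_{\rm vap}-s_{\rm liq}.
\end{align*}
```

Writing the entropy jump in terms of a latent heat per particle, $\ell = T(s_{\rm vap} - s_{\rm liq})$ (a symbol introduced here only for illustration), the same result reads $dP_t/dT = \ell / [T (v_{\rm vap} - v_{\rm liq})]$.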

We can also represent the phase diagram in the plane (v, T ), as shown in Figure 29.2 (b). In
this manner, phase coexistence is represented by the existence of a forbidden region vliq (T ) <
v < vvap (T ) in the plane. Outside this region, it is possible to obtain any given value of v in
a homogeneous system. Within this region, instead, the system separates into a liquid and a
vapor phase.

29.4.4 Gibbs phase rule


Let us now consider a mixture of particles belonging to r different chemical species. Suppose
that we are looking for the coexistence of q phases. At equilibrium, all the intensive variables
must assume the same value in the coexisting phases. We will therefore have a specific value for
the pressure and the temperature, and in addition the chemical potential of each species will
have to assume the same value in all the different phases. If we denote the chemical potential
of species i in phase α as $\mu_i^\alpha$, we will have

\[ \mu_i^\alpha = \mu_i, \qquad i = 1, \cdots, r, \quad \alpha = 1, \cdots, q. \tag{29.60} \]

Figure 29.2: (a) The transition curve within the plane (P, T); (b) Coexistence curve for a
simple fluid. The critical point corresponds to $(v_c, T_c, P_c)$.

In this equation, $\mu_i$ is the shared value taken by the chemical potential of species i. We thus
obtain r(q − 1) equations for q(r − 1) + 2 unknown values. These unknown values are P,
T, and the q(r − 1) independent densities $x_i^\alpha$ of species i in phase α. Generically speaking,
f = 2 − q + r free parameters remain. For f = 0, coexistence will occur at isolated points of
the phase diagram, for f = 1, along a line, and so on. The quantity f is called the variance.
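The counting argument above is easy to mechanize. The sketch below (the function name and error handling are choices made for this example) computes the variance as unknowns minus equations, exactly as in the text, and evaluates it for a few familiar cases.

```python
def variance(r: int, q: int) -> int:
    """Free parameters f = 2 - q + r for q coexisting phases of r species."""
    if r < 1 or q < 1:
        raise ValueError("need at least one species and one phase")
    unknowns = q * (r - 1) + 2     # P, T and the independent densities x_i^alpha
    equations = r * (q - 1)        # equal chemical potentials across the phases
    f = unknowns - equations
    if f < 0:
        raise ValueError("more phases than can generically coexist")
    return f


print(variance(r=1, q=1))   # single phase of a pure substance: a region of the (P, T) plane
print(variance(r=1, q=2))   # liquid-vapor coexistence: a line P_t(T)
print(variance(r=1, q=3))   # three coexisting phases: an isolated (triple) point
print(variance(r=2, q=2))   # binary mixture with two phases
```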

29.4.5 The Critical Point


For a simple fluid, the coexistence of liquid and vapour cannot be obtained for temperatures
higher than a certain temperature $T_c$, called the critical temperature. The transition curve
ends at a point $(P_c, T_c)$, where $P_c$ is the critical pressure, as shown in Figure 29.2 (a). For
$T < T_c$, the difference $v_{\rm vap} - v_{\rm liq}$ tends continuously toward zero as T approaches $T_c$; in
other words, the transition goes from being discontinuous to being continuous (and finally
disappears at higher temperatures).
The critical point is a thermodynamic state with exceptional properties. For example, since for
$T < T_c$ and $P = P_t(T)$ one has $(\partial P/\partial V)_T = 0$ within the coexistence curve, this relation
must ultimately be valid also at the critical point. Thus, the system’s compressibility diverges:

\[ \chi = -\frac{1}{V}\left.\frac{\partial V}{\partial P}\right|_{T} \to \infty \quad \text{for } T \to T_c. \tag{29.61} \]

Various thermodynamic properties exhibit analogous singularities.


Chapter 30
Principles of Statistical Mechanics and Ensembles

30.1 Density matrix


In quantum mechanics, the state of a system is a vector in Hilbert space, denoted as |ψ⟩. A
physical observable is an operator on this Hilbert space, denoted as O. The expectation value
of the measurement of the observable is ⟨ψ|O|ψ⟩. It is easy to verify that

⟨ψ|O|ψ⟩ = Tr(|ψ⟩⟨ψ| O). (30.1)

Now, consider a system whose space of states is the tensor product of two Hilbert spaces, i.e.,

H = HA ⊗ HB . (30.2)

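An arbitrary state can be decomposed as (summation over repeated indices is implied)

\[ |\psi\rangle = C_{iI}\, |i\rangle_A \otimes |I\rangle_B. \tag{30.3} \]

Thus, we have

\[ |\psi\rangle\langle\psi| = C_{iI} C^*_{jJ}\, |i, I\rangle\langle j, J|. \tag{30.4} \]
We define the partial trace of |ψ⟩⟨ψ| on B as
\[ {\rm Tr}_B(|\psi\rangle\langle\psi|) \equiv \sum_I C_{iI} C^*_{jI}\, |i\rangle\langle j|. \tag{30.5} \]

It is an operator on $H_A$. Now, suppose there is an observable which acts only on A, i.e.,

\[ \langle i, I|O|j, J\rangle = \delta_{IJ} \langle i|O_A|j\rangle, \tag{30.6} \]

where $O_A$ is an operator on $H_A$. It follows that

\[ \langle\psi|O|\psi\rangle = {\rm Tr}(|\psi\rangle\langle\psi|\, O) = {\rm Tr}_A[{\rm Tr}_B(|\psi\rangle\langle\psi|)\, O_A]. \tag{30.7} \]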

Now, if we take A as the system and B as the environment, a practical observable acts only
on the system. For any system which is coupled to an environment, its state can be described by an
operator
\[ \rho = {\rm Tr}_{\rm env}(|\psi\rangle\langle\psi|). \tag{30.8} \]
Thus the expectation value of a measurement on the system is

\[ {\rm Tr}[\rho\, O_{\rm sys}]. \tag{30.9} \]

It can be verified that


Tr ρ = 1, ρ† = ρ, (30.10)
and any eigenvalue of ρ must lie between 0 and 1. Suppose ρ is diagonalized as
\[ \rho = \sum_i p_i\, |i\rangle\langle i|. \tag{30.11} \]

We have
\[ {\rm Tr}[\rho O] = \sum_i p_i \langle i|O|i\rangle. \tag{30.12} \]
It is reasonable to interpret $p_i$ as the (classical) probability of finding the system in the (pure) state |i⟩. One
fundamental postulate of statistical mechanics is that the entropy operator of the system is
Ŝ = − ln ρ. (30.13)

Therefore, the expectation value of the entropy is

S = Tr[−ρ ln ρ]. (30.14)
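A small worked example may help make the partial trace concrete. The sketch below uses plain Python lists and a hypothetical two-qubit state cos(t)|0,0⟩ + sin(t)|1,1⟩ (the state, the angle, and the choice of σ_z as the observable on A are assumptions for illustration). It builds the reduced density matrix of equation 30.5, compares the two sides of equation 30.7, and evaluates the entropy of equation 30.14.

```python
import math


def reduced_density_matrix(C):
    """rho_ij = sum_I C[i][I] * conj(C[j][I]): partial trace over B (eq. 30.5)."""
    dim_a, dim_b = len(C), len(C[0])
    return [[sum(C[i][I] * C[j][I].conjugate() for I in range(dim_b))
             for j in range(dim_a)] for i in range(dim_a)]


def expectation_full(C, O_A):
    """<psi| O_A (x) 1_B |psi>, computed directly from the coefficients C[i][I]."""
    dim_a, dim_b = len(C), len(C[0])
    return sum(C[i][I].conjugate() * O_A[i][j] * C[j][I]
               for i in range(dim_a) for j in range(dim_a) for I in range(dim_b))


def expectation_reduced(rho, O_A):
    """Tr[rho O_A] (eq. 30.9)."""
    n = len(rho)
    return sum(rho[i][j] * O_A[j][i] for i in range(n) for j in range(n))


def eigenvalues_2x2_hermitian(rho):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return (tr - disc) / 2.0, (tr + disc) / 2.0


def entropy(probabilities):
    """S = -sum_i p_i ln p_i, with the convention 0 ln 0 = 0 (eq. 30.14)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


# Hypothetical entangled state |psi> = cos(t)|0,0> + sin(t)|1,1>
t = math.pi / 6
C = [[complex(math.cos(t)), 0j], [0j, complex(math.sin(t))]]
sigma_z = [[1 + 0j, 0j], [0j, -1 + 0j]]

rho = reduced_density_matrix(C)
p = eigenvalues_2x2_hermitian(rho)
S = entropy(p)

print("Tr rho =", (rho[0][0] + rho[1][1]).real)
print("eigenvalues of rho:", p)
print("<O> from full state:", expectation_full(C, sigma_z).real)
print("<O> from rho:       ", expectation_reduced(rho, sigma_z).real)
print("entropy S =", S)
```

Although the full state is pure, the reduced state has two nonzero eigenvalues and therefore a nonzero entropy, which is the sense in which coupling to an environment produces a mixed state.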

30.2 Statistical ensemble


30.2.1 Microcanonical ensemble
The microcanonical ensemble describes a system which is weakly coupled to the environment.
The volume V and the number of particles N are fixed. The energy of the system lies in
a narrow range between E − ∆E and E + ∆E. The total number of distinct microstates
accessible to the system is denoted by Γ(V, N, E; ∆E) and, by assumption, any
one of these microstates is as likely to occur as any other. Accordingly, the density matrix in
the energy representation will be of the form

\[ \rho_{mn} = \rho_n \delta_{mn}, \tag{30.15} \]

with
\[ \rho_n = \begin{cases} 1/\Gamma & \text{for each of the accessible states}, \\ 0 & \text{for all other states}. \end{cases} \tag{30.16} \]
The entropy of the system is
S = ln Γ. (30.17)

30.2.2 Canonical ensemble


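The canonical ensemble describes a system which can exchange energy with the environment. The
density matrix of the system is
\[ \rho = \frac{e^{-\beta H}}{{\rm Tr}[e^{-\beta H}]}. \tag{30.18} \]
Now, we define
\[ Z(\beta, V, N) \equiv {\rm Tr}\left[e^{-\beta H}\right], \qquad F(\beta, V, N) \equiv -\frac{1}{\beta}\ln Z. \tag{30.19} \]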

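The energy of the system is given by

\[ U = {\rm Tr}[\rho H] = -\left.\frac{\partial \ln Z}{\partial \beta}\right|_{V,N} = F - T\left.\frac{\partial F}{\partial T}\right|_{V,N} \quad \text{where} \quad T \equiv \frac{1}{\beta}. \tag{30.20} \]

The entropy of the system is given by

\[ S = {\rm Tr}[-\rho \ln \rho] = -\left.\frac{\partial F}{\partial T}\right|_{V,N} = \frac{U - F}{T}. \tag{30.21} \]

It can be derived that

\[ \left.\frac{\partial U}{\partial S}\right|_{V,N} = T. \tag{30.22} \]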

Thus, we can identify T as the absolute temperature and F as the Helmholtz free energy.
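The identities 30.20 and 30.21 can be verified numerically for any discrete spectrum. The sketch below assumes a hypothetical three-level spectrum and an arbitrary temperature, in units with k_B = 1; it computes F, U and S directly from the Boltzmann weights and compares them with finite-difference derivatives of ln Z and F.

```python
import math


def log_Z(beta, levels):
    """ln Z with Z = sum_n exp(-beta * E_n) for a discrete spectrum (eq. 30.19)."""
    return math.log(sum(math.exp(-beta * E) for E in levels))


def canonical_averages(T, levels):
    """Return (F, U, S) at temperature T from the Boltzmann probabilities."""
    beta = 1.0 / T
    weights = [math.exp(-beta * E) for E in levels]
    Z = sum(weights)
    probs = [w / Z for w in weights]
    F = -T * math.log(Z)
    U = sum(p * E for p, E in zip(probs, levels))
    S = -sum(p * math.log(p) for p in probs if p > 0.0)
    return F, U, S


levels = [0.0, 1.0, 2.5]      # hypothetical spectrum, arbitrary energy units
T, h = 0.8, 1e-5
F, U, S = canonical_averages(T, levels)

# Central finite differences for the derivatives in eqs. 30.20 and 30.21
beta = 1.0 / T
U_from_lnZ = -(log_Z(beta + h, levels) - log_Z(beta - h, levels)) / (2 * h)
S_from_F = -(canonical_averages(T + h, levels)[0]
             - canonical_averages(T - h, levels)[0]) / (2 * h)

print("U:", U, "vs -d(ln Z)/d(beta):", U_from_lnZ)
print("S:", S, "vs -dF/dT:", S_from_F, "vs (U - F)/T:", (U - F) / T)
```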

30.2.3 Grand canonical ensemble


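The grand canonical ensemble describes a system which can exchange energy and particles with
the environment. The density matrix of the system is

\[ \rho = \frac{e^{-\beta(H-\mu N)}}{{\rm Tr}[e^{-\beta(H-\mu N)}]}. \tag{30.23} \]
Now, we define
\[ Z_\Omega(\beta, V, \mu) \equiv {\rm Tr}\left[e^{-\beta(H-\mu N)}\right], \qquad \Omega(\beta, V, \mu) \equiv -\frac{1}{\beta}\ln Z_\Omega. \tag{30.24} \]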

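The particle number and energy of the system are given by

\[ N = \frac{1}{\beta}\left.\frac{\partial \ln Z_\Omega}{\partial \mu}\right|_{V,\beta} = -\left.\frac{\partial \Omega}{\partial \mu}\right|_{V,T}, \qquad U - \mu N = -\left.\frac{\partial \ln Z_\Omega}{\partial \beta}\right|_{V,\mu} = \Omega - T\left.\frac{\partial \Omega}{\partial T}\right|_{V,\mu}. \tag{30.25} \]

The entropy of the system is given by

\[ S = {\rm Tr}[-\rho \ln \rho] = -\left.\frac{\partial \Omega}{\partial T}\right|_{V,\mu} = \frac{U - \mu N - \Omega}{T}. \tag{30.26} \]

It can be derived that

\[ \left.\frac{\partial U}{\partial N}\right|_{V,S} = \mu. \tag{30.27} \]

Thus, we can identify µ as the chemical potential and Ω as the grand canonical potential.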

30.3 Fluctuations
30.3.1 Canonical Ensemble
For the canonical ensemble, we have
\[ \left.\frac{\partial \rho}{\partial \beta}\right|_{N,V} = -\rho H + \rho\, {\rm Tr}[\rho H]. \tag{30.28} \]

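Since U = Tr[ρH], it follows that

\[ \left.\frac{\partial U}{\partial \beta}\right|_{N,V} = -{\rm Tr}\left[\rho H^2\right] + ({\rm Tr}[\rho H])^2 = -\langle E^2\rangle + \langle E\rangle^2 = -\langle(\Delta E)^2\rangle. \tag{30.29} \]

Thus, the relative fluctuation of energy in the canonical ensemble is

\[ \frac{\sqrt{\langle(\Delta E)^2\rangle}}{\langle E\rangle} = \frac{T}{U}\sqrt{\left.\frac{\partial U}{\partial T}\right|_{N,V}} = \frac{T\sqrt{C_V}}{U} \sim O(N^{-1/2}). \tag{30.30} \]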

For large N the relative fluctuation in the values of E is quite negligible.
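To see the N^{-1/2} scaling in a concrete case, the sketch below assumes a toy model that is not part of the text: N independent two-level systems with levels 0 and eps, in units with k_B = 1. It checks equation 30.29 by comparing a finite-difference estimate of −∂U/∂β with the exact variance, and prints the relative fluctuation for increasing N.

```python
import math


def mean_energy(beta, N, eps=1.0):
    """U for N independent two-level systems with levels 0 and eps."""
    return N * eps / (math.exp(beta * eps) + 1.0)


def energy_variance(beta, N, eps=1.0):
    """<(Delta E)^2> = N * eps^2 * p * (1 - p), p being the excited-state probability."""
    p = 1.0 / (math.exp(beta * eps) + 1.0)
    return N * eps**2 * p * (1.0 - p)


def relative_fluctuation(beta, N, eps=1.0):
    """sqrt(<(Delta E)^2>) / <E>, the left-hand side of eq. 30.30."""
    return math.sqrt(energy_variance(beta, N, eps)) / mean_energy(beta, N, eps)


beta, h = 1.3, 1e-5
for N in (10, 1000, 100000):
    # Central finite difference for dU/dbeta, to compare with eq. 30.29
    dU_dbeta = (mean_energy(beta + h, N) - mean_energy(beta - h, N)) / (2 * h)
    print(N, -dU_dbeta, energy_variance(beta, N), relative_fluctuation(beta, N))
```

Each hundredfold increase in N reduces the relative fluctuation by a factor of ten, as expected from the O(N^{-1/2}) estimate.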

30.3.2 Grand canonical ensemble


Similarly, it can be derived that the fluctuation of density in the grand canonical ensemble is

\[ \frac{\langle(\Delta n)^2\rangle}{\langle n\rangle^2} = \frac{T}{V}\kappa_T, \tag{30.31} \]

where n = N/V is the number density and $\kappa_T = -(1/v)(\partial v/\partial P)_T$ is the isothermal
compressibility of the system. Thus, the relative root-mean-square fluctuation in the particle density of
the given system is ordinarily $O(N^{-1/2})$ and, hence, negligible.
However, in situations accompanying phase transitions, the compressibility of a given sys-
tem can become excessively large. In the region of phase transitions, especially at the critical
points, we encounter unusually large fluctuations in the particle density of the system. Such
fluctuations indeed exist and account for phenomena like critical opalescence. It is clear that
under these circumstances the formalism of the grand canonical ensemble could, in principle,
lead to results that are not necessarily identical to the ones following from the correspond-
ing canonical ensemble. In such cases, it is the formalism of the grand canonical ensemble
that will have to be preferred because only this one will provide a correct picture of the actual
physical situation.
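The energy fluctuation in the grand canonical ensemble is
\[ \langle(\Delta E)^2\rangle = T^2 C_V + \left(\left.\frac{\partial U}{\partial N}\right|_{T,V}\right)^2 \langle(\Delta N)^2\rangle. \tag{30.32} \]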

The mean-square fluctuation in the energy of a system in the grand canonical ensemble is equal
to the value it would have in the canonical ensemble plus a contribution arising from the fact
that now the particle number N is also fluctuating. Again, under ordinary circumstances, the
relative root-mean-square fluctuation in the energy density of the system would be practically
negligible. However, in the region of phase transitions, unusually large fluctuations in the
value of this variable can arise by virtue of the second term in the formula.
Chapter 31
Interaction-free Systems

31.1 General discussion


31.1.1 Bose-Einstein Statistics
If the system is composed of interaction-free bosons, the state of the system can be denoted as

|n1 , n2 , · · · , ni , · · ·⟩ , (31.1)

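where $n_i$ is the number of particles in state |i⟩. Here, we choose |i⟩ as the energy eigenstate
with energy $\epsilon_i$. Adopting the grand canonical ensemble, we have

\[ Z_\Omega = {\rm Tr}\left[e^{-\beta(H-\mu N)}\right] = \sum_{n_1,\cdots,n_i,\cdots} e^{-\beta\sum_i n_i(\epsilon_i-\mu)} = \prod_i \sum_{n_i=0}^{\infty} \left[e^{-\beta(\epsilon_i-\mu)}\right]^{n_i} = \prod_i \frac{1}{1 - e^{-\beta(\epsilon_i-\mu)}}. \tag{31.2} \]
Thus, the grand canonical potential of the system is
\[ \Omega = -\beta^{-1}\ln Z_\Omega = T\sum_i \ln\left[1 - e^{-\beta(\epsilon_i-\mu)}\right]. \tag{31.3} \]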

The chemical potential of the system must satisfy $\mu < \epsilon_0$, where $\epsilon_0$ is the energy of the
ground state. To derive further results, we introduce a parameter z, called the
fugacity of the system, defined by the relation

\[ z \equiv e^{\beta\mu}. \tag{31.4} \]

The expectation value of the particle number is

\[ N = -\left.\frac{\partial \Omega}{\partial \mu}\right|_{T,V} = \sum_i \frac{1}{z^{-1}e^{\beta\epsilon_i} - 1}. \tag{31.5} \]

The expectation value of the energy is

\[ U = \sum_i \frac{\epsilon_i}{z^{-1}e^{\beta\epsilon_i} - 1}. \tag{31.6} \]

The expectation value of the particle number on level i is

\[ \langle n_i\rangle = \frac{1}{z^{-1}e^{\beta\epsilon_i} - 1}. \tag{31.7} \]

31.1.2 Fermi-Dirac Statistics


If the system is composed of interaction-free fermions, the state of the system can be denoted
as
|n1 , n2 , · · · , ni , · · ·⟩ , (31.8)
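where $n_i$ is the number of particles in state |i⟩ and $n_i = 0$ or 1. Adopting the grand canonical
ensemble, we have
\[ Z_\Omega = {\rm Tr}\left[e^{-\beta(H-\mu N)}\right] = \sum_{n_1,\cdots,n_i,\cdots} e^{-\beta\sum_i n_i(\epsilon_i-\mu)} = \prod_i \sum_{n_i=0}^{1} \left[e^{-\beta(\epsilon_i-\mu)}\right]^{n_i} = \prod_i \left[1 + e^{-\beta(\epsilon_i-\mu)}\right]. \tag{31.9} \]
Thus, the grand canonical potential of the system is
\[ \Omega = -\beta^{-1}\ln Z_\Omega = -T\sum_i \ln\left[1 + e^{-\beta(\epsilon_i-\mu)}\right]. \tag{31.10} \]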

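The expectation value of the particle number is

\[ N = \sum_i \frac{1}{z^{-1}e^{\beta\epsilon_i} + 1}. \tag{31.11} \]

The expectation value of the energy is

\[ U = \sum_i \frac{\epsilon_i}{z^{-1}e^{\beta\epsilon_i} + 1}. \tag{31.12} \]

The expectation value of the particle number on level i is

\[ \langle n_i\rangle = \frac{1}{z^{-1}e^{\beta\epsilon_i} + 1}. \tag{31.13} \]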

31.1.3 Boltzmann Statistics


If the system is composed of distinguishable interaction-free particles, the state of the system
can be denoted as
|i1 , i2 , · · · , iN ⟩ , (31.14)
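where $i_k$ is the state of the k-th particle and N is the particle number of the system. Adopting
the canonical ensemble, we have
\[ Z = {\rm Tr}\, e^{-\beta H} = \sum_{i_1,\cdots,i_N} e^{-\beta\sum_{k=1}^{N}\epsilon_{i_k}} = \prod_{k=1}^{N}\sum_i e^{-\beta\epsilon_i} = (Z_1)^N \quad \text{where} \quad Z_1 \equiv \sum_i e^{-\beta\epsilon_i}. \tag{31.15} \]
For a particle in a box, we have $Z_1 \propto V \propto N$ (at fixed density). So the free energy of the system is

\[ F = -NT\ln Z_1 = -NT\ln N + \cdots. \tag{31.16} \]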

However, the term −N T ln N is not extensive. This is called the Gibbs paradox, and it arises because
our assumption of distinguishable particles is wrong. It can be amended if we demand
that
\[ Z = \frac{Z_1^N}{N!}. \tag{31.17} \]
Then the non-extensive term is eliminated. In the grand canonical ensemble, we have

\[ Z_\Omega = \sum_{N=0}^{\infty} \frac{z^N (Z_1)^N}{N!} = \exp[Z_1 z]. \tag{31.18} \]

Thus, the grand canonical potential of the system is

\[ \Omega = -\beta^{-1}\ln Z_\Omega = -T\sum_i e^{-\beta(\epsilon_i-\mu)}. \tag{31.19} \]

The expectation value of the particle number is

\[ N = \sum_i \frac{1}{z^{-1}e^{\beta\epsilon_i}}. \tag{31.20} \]

The expectation value of the energy is

\[ U = \sum_i \frac{\epsilon_i}{z^{-1}e^{\beta\epsilon_i}}. \tag{31.21} \]

The expectation value of the particle number on level i is

\[ \langle n_i\rangle = \frac{1}{z^{-1}e^{\beta\epsilon_i}}. \tag{31.22} \]
As we can see, in the limit
\[ e^{\beta\epsilon_i} \ge e^{\beta\epsilon_0} \gg z, \tag{31.23} \]
Bose-Einstein, Fermi-Dirac and Boltzmann statistics are identical.
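The convergence of the three statistics at small fugacity can be seen by evaluating equations 31.7, 31.13 and 31.22 side by side. In the sketch below the level energy, the inverse temperature and the chemical potentials are arbitrary illustrative values, and the function name is an assumption of the example.

```python
import math


def occupation(eps, beta, mu, kind):
    """Mean occupation of a level of energy eps (eqs. 31.7, 31.13, 31.22)."""
    x = math.exp(beta * (eps - mu))      # equals exp(beta * eps) / z
    if kind == "bose":
        if eps <= mu:
            raise ValueError("Bose-Einstein statistics requires mu < eps")
        return 1.0 / (x - 1.0)
    if kind == "fermi":
        return 1.0 / (x + 1.0)
    if kind == "boltzmann":
        return 1.0 / x
    raise ValueError(f"unknown statistics: {kind}")


beta, eps = 1.0, 0.5
for mu in (0.0, -2.0, -8.0):             # decreasing fugacity z = exp(beta * mu)
    n = {k: occupation(eps, beta, mu, k) for k in ("bose", "fermi", "boltzmann")}
    print(f"z = {math.exp(beta * mu):.2e}", n)
```

For every z the Boltzmann value lies between the Fermi-Dirac and Bose-Einstein values, and the three coincide to within a relative error of order z e^{-βε} as z decreases.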

31.2 Ideal Boltzmann Gas


31.2.1 Molecules without internal motion
Let us study the thermodynamic properties of an ideal gas. Our basic assumptions are:

• The temperature is high enough that the interaction between molecules can be neglected,
i.e., $|e^{-V_{\rm int}/T} - 1| \ll 1$.

• Bose-Einstein or Fermi-Dirac statistics of the molecules can be well approximated by
Boltzmann statistics, i.e., z ≪ 1.

Suppose the side length of the box is L. The momentum of the particle would be

\[ \mathbf{p} = \frac{2\pi}{L}(n_x, n_y, n_z), \tag{31.24} \]
where $n_x$, $n_y$ and $n_z$ are integers. Thus we have
\[ Z_1 = \sum_{n_x,n_y,n_z} \exp\left[-\frac{\beta}{2m}\left(\frac{2\pi}{L}\right)^2 (n_x^2 + n_y^2 + n_z^2)\right]. \tag{31.25} \]

If the difference between adjacent energy levels is much smaller than $\beta^{-1}$, the summation can be
approximated by an integral. In SI units, this condition can be written explicitly as
\[ T \gg \frac{h^2}{2 m k_B L^2} \sim 10^{-17}\,{\rm K} \left(\frac{m}{m_p}\right)^{-1} \left(\frac{L}{1\,{\rm m}}\right)^{-2}, \tag{31.26} \]

which is always satisfied in practice. Thus, we have

\[ Z_1 = V\int \frac{d^3p}{(2\pi)^3}\, e^{-\beta p^2/2m} = \frac{V}{\lambda^3} \quad \text{where} \quad \lambda \equiv \sqrt{\frac{2\pi}{mT}}. \tag{31.27} \]

Note: When calculating the integral above, the following formula may be useful:
\[ \int_0^\infty x^n e^{-a x^2}\,dx = \frac{\Gamma\left(\frac{n+1}{2}\right)}{2 a^{\frac{n+1}{2}}}. \tag{31.28} \]

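The free energy of the system is

\[ F = -NT\ln Z_1 + T\ln N! = -NT\ln\left(\frac{V}{N\lambda^3}\right) - NT. \tag{31.29} \]

Values of other thermodynamic quantities are given by

\[ S = -\left.\frac{\partial F}{\partial T}\right|_{V,N} = N\ln\frac{V}{N\lambda^3} + \frac{5}{2}N, \qquad U = F + TS = \frac{3}{2}NT, \]
\[ P = -\left.\frac{\partial F}{\partial V}\right|_{T,N} = \frac{NT}{V}, \qquad \mu = \left.\frac{\partial F}{\partial N}\right|_{T,V} = -T\ln\frac{V}{N\lambda^3}, \qquad C_V = \left.\frac{\partial U}{\partial T}\right|_{V,N} = \frac{3}{2}N. \tag{31.30} \]

Note that we must have $z = e^{\beta\mu} = \lambda^3/v \ll 1$ to ensure the validity of Boltzmann statistics.
In SI units, the condition can be written explicitly as

\[ \frac{T^{3/2}}{\rho} \gg \frac{h^3}{(2\pi k_B)^{3/2}\, m^{5/2}} \sim O(1)\, \frac{{\rm K}^{3/2}}{{\rm kg/m^3}}. \tag{31.31} \]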

Usually, before the condition is violated, the interaction between molecules becomes impor-
tant and the gas may transform to liquid already.
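The two SI-unit conditions 31.26 and 31.31 can be evaluated directly. The sketch below is an order-of-magnitude illustration only: the choice of a nitrogen-like molecule of mass 28 m_p at 300 K and 1 atm is an assumption for the example, and the function names are not from the text.

```python
import math

H = 6.62607015e-34        # Planck constant, J s (exact SI value)
K_B = 1.380649e-23        # Boltzmann constant, J/K (exact SI value)
M_P = 1.67262192e-27      # proton mass, kg (rounded CODATA value)


def thermal_wavelength(m, T):
    """lambda = h / sqrt(2 pi m k_B T), the SI form of lambda in eq. 31.27."""
    return H / math.sqrt(2.0 * math.pi * m * K_B * T)


def fugacity_estimate(m, T, number_density):
    """z ~ n * lambda^3, meaningful only while the result is much smaller than 1."""
    return number_density * thermal_wavelength(m, T) ** 3


def level_spacing_temperature(m, L):
    """h^2 / (2 m k_B L^2), the scale on the right-hand side of eq. 31.26."""
    return H**2 / (2.0 * m * K_B * L**2)


# Illustrative case: nitrogen-like molecules (m ~ 28 m_p) at 300 K and 1 atm
m = 28.0 * M_P
T = 300.0
n = 101325.0 / (K_B * T)                 # ideal-gas number density, m^-3
z = fugacity_estimate(m, T, n)
T_spacing = level_spacing_temperature(M_P, 1.0)

print(f"thermal wavelength: {thermal_wavelength(m, T):.3e} m")
print(f"fugacity estimate z: {z:.3e}")
print(f"level-spacing temperature for m = m_p, L = 1 m: {T_spacing:.3e} K")
```

Under these assumed conditions z is many orders of magnitude below 1, so Boltzmann statistics is an excellent approximation, and the level-spacing temperature is far below any attainable temperature.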

31.2.2 Molecules with internal motion


If the internal motion of molecules is taken into account, we have
\[ Z_1 = \frac{V}{\lambda^3}\, j(T) \quad \text{where} \quad j(T) \equiv \sum_i g_i e^{-\beta\epsilon_i}. \tag{31.32} \]

Here, $\epsilon_i$ is the energy associated with a state of internal motion, while $g_i$ is the multiplicity of
that state. The contributions made by the internal motions of the molecules, over and above the
translational degrees of freedom, follow straightforwardly from the function j(T). Explicitly,
we have
\[ F_{\rm int} = -NT\ln j, \qquad S_{\rm int} = N\left(\ln j + T\frac{\partial \ln j}{\partial T}\right), \qquad U_{\rm int} = NT^2\frac{\partial \ln j}{\partial T}, \]
\[ \mu_{\rm int} = -T\ln j, \qquad (C_V)_{\rm int} = N\frac{\partial}{\partial T}\left(T^2\frac{\partial \ln j}{\partial T}\right). \tag{31.33} \]
Now the central problem is to derive an explicit expression for the function j(T) from a knowledge
of the internal states of the molecules. For this, we note that the internal state of a molecule is
determined by
• the electronic state
• the state of the nuclei
• the vibrational state
• the rotational state.
Rigorously speaking, these four modes of excitation mutually interact; in many cases, however,
they can be treated independently of one another. We can then write

j(T ) = jelec (T )jnucl (T )jvib (T )jrot (T ), (31.34)

with the result that the net contribution made by the internal motions to the various thermo-
dynamic properties of the system is given by a simple sum of the four respective contributions.
A detailed discussion on gaseous systems composed of molecules with internal motion can be
found in section 6.5 from Statistical Mechanics (R.K.Pathria & Paul D.Beale).

31.3 Ideal Bose Systems


Bose-Einstein condensation
For an ideal Bose gas, we have

$$ \ln Z_\Omega = \frac{PV}{T} = -\sum_i \ln\left(1 - ze^{-\beta\epsilon_i}\right), \qquad N = \sum_i \frac{1}{z^{-1}e^{\beta\epsilon_i} - 1}. \tag{31.35} $$

For large V , the spectrum of the single-particle states is almost a continuous one, so the sum-
mations may be replaced by integrations. However, by replacing summation by integration,
we are inadvertently giving a weight zero to the energy level ϵ = 0. This is wrong because in a
quantum mechanical treatment we must give a statistical weight unity to each non-degenerate
single-particle state in the system. It is, therefore, advisable to take this particular state out
of the sum in question before carrying out the integration; a rigorous justification of this un-
usual step can be found in Appendix F of Statistical Mechanics (R.K.Pathria & Paul D.Beale).
We thus obtain
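$$ \frac{P}{T} = -\frac{2\pi}{(2\pi)^3}(2m)^{3/2}\int_0^\infty \epsilon^{1/2}\ln\left(1 - ze^{-\beta\epsilon}\right)d\epsilon - \frac{1}{V}\ln(1-z) \tag{31.36} $$

and

$$ \frac{N}{V} = \frac{2\pi}{(2\pi)^3}(2m)^{3/2}\int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{z^{-1}e^{\beta\epsilon} - 1} + \frac{1}{V}\frac{z}{1-z}. \tag{31.37} $$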

For $z \ll 1$, which corresponds to situations not far from the classical limit, the last terms of equations 31.36 and 31.37 are of order $1/N$ and, therefore, negligible.

However, as $z$ increases and assumes values close to unity, the term $V^{-1}z/(1-z)$, which is identically equal to $N_0/V$ ($N_0$ being the number of particles in the ground state), can well become a significant fraction of the quantity $N/V$; this accumulation of a macroscopic fraction of the particles into a single state leads to the phenomenon of Bose-Einstein condensation.

Nevertheless, since $z = N_0/(N_0+1)$, the term $-V^{-1}\ln(1-z)$ is equal to $V^{-1}\ln(N_0+1)$, which is at most $O(N^{-1}\ln(N+1))$; this term is, therefore, negligible for all values of $z$ and hence may be dropped altogether. Thus, we have

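$$ \frac{P}{T} = \frac{1}{\lambda^3}g_{5/2}(z), \qquad \frac{N - N_0}{V} = \frac{1}{\lambda^3}g_{3/2}(z), \tag{31.38} $$

where $g_\nu(z)$ are Bose-Einstein functions defined by

$$ g_\nu(z) \equiv \frac{1}{\Gamma(\nu)}\int_0^\infty \frac{x^{\nu-1}\,dx}{z^{-1}e^x - 1} = z + \frac{z^2}{2^\nu} + \cdots. \tag{31.39} $$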
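The power series in 31.39 is easy to evaluate directly for $z < 1$. The sketch below is one way to do it, assuming a fixed number of terms is enough; the function name and the term count are illustrative choices.

```python
import math

def bose_einstein_g(nu, z, terms=2000):
    """Partial sum of g_nu(z) = sum_{l>=1} z^l / l^nu, used here only for 0 <= z < 1."""
    if not 0.0 <= z < 1.0:
        raise ValueError("this plain series is only used here for 0 <= z < 1")
    return sum(z ** l / l ** nu for l in range(1, terms + 1))

# For nu = 1 the series sums to -ln(1 - z), which gives an independent check
g1 = bose_einstein_g(1.0, 0.5)
# For small z the function approaches z itself (the Boltzmann limit)
g_small = bose_einstein_g(1.5, 1e-3)
```

Close to $z = 1$ the series converges slowly, so a different method is preferable there.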

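The internal energy of the system is given by

$$ U = -\left(\frac{\partial\ln Z_\Omega}{\partial\beta}\right)_{z,V} = \frac{3}{2}T\frac{V}{\lambda^3}g_{5/2}(z) = \frac{3}{2}PV = \frac{3g_{5/2}(z)}{2g_{3/2}(z)}(N - N_0)T. \tag{31.40} $$

When the fugacity of the system is close to 1, we have $N_0 = z/(1-z) \gg 1$ and

$$ \frac{N - N_0}{V} = \frac{1}{\lambda^3}\zeta(3/2), \quad\text{where}\quad \zeta(\nu) \equiv g_\nu(1). \tag{31.41} $$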
This curious phenomenon of a macroscopically large number of particles accumulating in a single quantum state is generally referred to as the phenomenon of Bose-Einstein condensation. The condition for the onset of Bose-Einstein condensation is

$$ T < T_c \equiv \frac{2\pi}{m}\left(\frac{N}{V\zeta(3/2)}\right)^{2/3}. \tag{31.42} $$

Here, Tc denotes a characteristic temperature that depends on the particle mass m and the
particle density N/V in the system. Accordingly, for T < Tc , the system may be looked on as
a mixture of two “phases”:

• a normal phase, consisting of $N_e = N(T/T_c)^{3/2}$ particles distributed over the excited states;

• a condensed phase, consisting of $N_0 = N - N_e$ particles accumulated in the ground state.
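The transition temperature 31.42 and the two-phase split above can be put into numbers as follows. This is a sketch in SI units, with $\hbar$ and $k_B$ restored; the helium-4 mass and the liquid-like number density are assumed illustration values, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
ZETA_3_2 = 2.612375      # zeta(3/2), quoted to six decimals

def bec_temperature(mass, number_density):
    """SI form of (31.42): T_c = (2 pi hbar^2 / (m k_B)) (n / zeta(3/2))^(2/3)."""
    return (2 * math.pi * HBAR ** 2 / (mass * K_B)) * (number_density / ZETA_3_2) ** (2 / 3)

def condensate_fraction(temperature, t_c):
    """N_0 / N = 1 - (T / T_c)^(3/2) below T_c, and 0 above it."""
    return max(0.0, 1.0 - (temperature / t_c) ** 1.5)

# Assumed illustration: helium-4 atoms at a liquid-like density of about 2.18e28 m^-3
t_c = bec_temperature(6.6465e-27, 2.18e28)
fraction_at_half = condensate_fraction(0.5 * t_c, t_c)
```

The result is an ideal-gas estimate only; a real liquid is strongly interacting, so agreement with a measured transition temperature should not be expected beyond the order of magnitude.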

Pressure
Next, we examine the variation of $P$ with $T$, keeping $v$ fixed. When $T < T_c$, the pressure is

$$ P = \frac{T}{\lambda^3}\zeta(5/2), \tag{31.43} $$

which is proportional to $T^{5/2}$ but independent of $v$, implying infinite compressibility. At the transition point the value of the pressure is

$$ P(T_c) = \frac{\zeta(5/2)}{\zeta(3/2)}\frac{NT_c}{V} \approx 0.5134\,\frac{NT_c}{V}. \tag{31.44} $$

Thus, the pressure exerted by the particles of an ideal Bose gas at the transition temperature is about one-half of that exerted by the particles of an equivalent Boltzmannian gas. When $T > T_c$, the pressure is

$$ P = \frac{g_{5/2}(z)}{g_{3/2}(z)}\frac{NT}{V}. \tag{31.45} $$

As $T \to \infty$, the pressure approaches the classical value.

Specific heat
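When $T < T_c$, the specific heat is

$$ \frac{C_V}{N} = \frac{3V}{2N}\,\zeta(5/2)\,\frac{d}{dT}\left(\frac{T}{\lambda^3}\right) = \frac{15}{4}\zeta(5/2)\frac{v}{\lambda^3}, \tag{31.46} $$

which is proportional to $T^{3/2}$. At $T = T_c$, we have $C_V(T_c) \approx 1.925N$, which is significantly higher than the classical value $1.5N$. When $T > T_c$, the specific heat is

$$ \frac{C_V}{N} = \left[\frac{\partial}{\partial T}\left(\frac{3}{2}T\frac{g_{5/2}(z)}{g_{3/2}(z)}\right)\right]_v. \tag{31.47} $$

Note the recurrence relation of the Bose-Einstein functions,

$$ z\frac{\partial g_\nu(z)}{\partial z} = g_{\nu-1}(z), \tag{31.48} $$

and $g_{3/2}(z) = \lambda^3/v$ when $T > T_c$. It is easy to get

$$ \frac{1}{z}\left(\frac{\partial z}{\partial T}\right)_v = -\frac{3}{2T}\frac{g_{3/2}(z)}{g_{1/2}(z)}. \tag{31.49} $$

Thus, we have

$$ \frac{C_V}{N} = \frac{15}{4}\frac{g_{5/2}(z)}{g_{3/2}(z)} - \frac{9}{4}\frac{g_{3/2}(z)}{g_{1/2}(z)}. \tag{31.50} $$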
In the limit $z \to 1$, the second term vanishes because of the divergence of $g_{1/2}(z)$, while the first term gives exactly the same result as in the case where $T \to T_c$ from below. The specific heat is, therefore, continuous at the transition point. Its derivative is, however, discontinuous, the magnitude of the discontinuity being

$$ \left(\frac{\partial C_V}{\partial T}\right)_{T=T_c-0} - \left(\frac{\partial C_V}{\partial T}\right)_{T=T_c+0} = \frac{27N}{16\pi T_c}\zeta(3/2)^2 \approx 3.665\,\frac{N}{T_c}. \tag{31.51} $$
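The numerical coefficients quoted in 31.44, in the value of $C_V(T_c)$ and in 31.51 all follow from $\zeta(3/2)$ and $\zeta(5/2)$. The sketch below recomputes them; the zeta routine (direct sum plus an Euler-Maclaurin tail) is an illustrative choice, not something the notes prescribe.

```python
import math

def zeta(s, cutoff=1000):
    """Riemann zeta for s > 1: direct sum below the cutoff plus an Euler-Maclaurin tail."""
    head = sum(n ** -s for n in range(1, cutoff))
    tail = cutoff ** (1 - s) / (s - 1) + 0.5 * cutoff ** -s + s * cutoff ** (-s - 1) / 12
    return head + tail

z32, z52 = zeta(1.5), zeta(2.5)
pressure_ratio = z52 / z32                    # coefficient in (31.44)
cv_at_tc = 15 / 4 * z52 / z32                 # C_V / N at T = T_c from (31.46)
slope_jump = 27 / (16 * math.pi) * z32 ** 2   # coefficient in (31.51)
```

The three derived numbers can be compared with the rounded values given in the text.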

Entropy

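Finally, we examine the adiabats of the ideal Bose gas. For this, we need an expression for the entropy of the system. Making use of $U - TS + PV = N\mu$, we get for the entropy per particle

$$ s \equiv \frac{S}{N} = \begin{cases} \dfrac{5}{2}\dfrac{g_{5/2}(z)}{g_{3/2}(z)} - \ln z, & T > T_c, \\[2ex] \dfrac{5}{2}\dfrac{v}{\lambda^3}\zeta(5/2), & T < T_c. \end{cases} \tag{31.52} $$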

A reversible adiabatic process implies the constancy of $s$. For $T > T_c$, this implies the constancy of $z$ as well and in turn the constancy of $v/\lambda^3$. For $T \le T_c$, it again implies the same. We thus obtain, quite generally, the following relationship between the volume and the temperature of the system when it undergoes a reversible adiabatic process:

$$ vT^{3/2} = \text{const}. \tag{31.53} $$

Using equations 31.38, the corresponding relationship between the pressure and the temperature is

$$ PT^{-5/2} = \text{const}. \tag{31.54} $$

Eliminating $T$, we obtain

$$ Pv^{5/3} = \text{const}. \tag{31.55} $$

In the mixed-phase region, the entropy of the gas may be written as

$$ S = N_e\,\frac{5}{2}\frac{\zeta(5/2)}{\zeta(3/2)}. \tag{31.56} $$

As expected, the $N_0$ particles that constitute the condensate do not contribute to the entropy of the system, while the $N_e$ particles that constitute the normal part contribute an amount of $5\zeta(5/2)/(2\zeta(3/2))$ per particle.

31.4 Ideal Fermi systems


General properties of ideal Fermi systems

For an ideal Fermi gas, we have

$$ \frac{PV}{T} = \ln Z_\Omega = \sum_i \ln\left(1 + ze^{-\beta\epsilon_i}\right), \qquad N = \sum_i \frac{1}{z^{-1}e^{\beta\epsilon_i} + 1}. \tag{31.57} $$

Unlike the Bose case, the parameter z in the Fermi case can take on unrestricted values. More-
over, in view of the Pauli exclusion principle, the question of a large number of particles oc-
cupying a single energy state does not even arise in this case. We can replace summations by
corresponding integrations. We thus obtain

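$$ \frac{P}{T} = \frac{g}{\lambda^3}f_{5/2}(z), \qquad \frac{N}{V} = \frac{g}{\lambda^3}f_{3/2}(z), \tag{31.58} $$

where $g$ is a weight factor arising from the internal structure of the particles and $f_\nu(z)$ are Fermi-Dirac functions defined by

$$ f_\nu(z) \equiv \frac{1}{\Gamma(\nu)}\int_0^\infty \frac{x^{\nu-1}\,dx}{z^{-1}e^x + 1} = z - \frac{z^2}{2^\nu} + \frac{z^3}{3^\nu} - \cdots. \tag{31.59} $$

The internal energy of the Fermi gas is given by

$$ U = -\left(\frac{\partial\ln Z_\Omega}{\partial\beta}\right)_{z,V} = \frac{3}{2}T\frac{gV}{\lambda^3}f_{5/2}(z) = \frac{3}{2}PV = \frac{3f_{5/2}(z)}{2f_{3/2}(z)}NT. \tag{31.60} $$

The free energy and entropy of the gas are

$$ F = N\mu - PV = NT\left[\ln z - \frac{f_{5/2}(z)}{f_{3/2}(z)}\right], \qquad S = \frac{U - F}{T} = N\left[\frac{5}{2}\frac{f_{5/2}(z)}{f_{3/2}(z)} - \ln z\right]. \tag{31.61} $$

Using the recurrence relation of the Fermi-Dirac functions,

$$ z\frac{\partial f_\nu(z)}{\partial z} = f_{\nu-1}(z), \tag{31.62} $$

we also obtain the specific heat of the gas:

$$ \frac{C_V}{N} = \frac{15}{4}\frac{f_{5/2}(z)}{f_{3/2}(z)} - \frac{9}{4}\frac{f_{3/2}(z)}{f_{1/2}(z)}. \tag{31.63} $$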

In order to determine the various properties of the Fermi gas in terms of the particle density $n$ and the temperature $T$, we need to know the functional dependence of the parameter $z$ on $n$ and $T$; this information is formally contained in the implicit relationship $gf_{3/2}(z) = n\lambda^3$.
If the density of the gas is very low and/or its temperature very high, the Fermi gas will be equivalent to a classical ideal gas; we then speak of the gas as being non-degenerate.
If the parameter $z$ is small in comparison with unity but not very small, we should obtain an expansion for $z$ in powers of $n\lambda^3/g$.
If the density and the temperature are such that the parameter $n\lambda^3/g$ is of order unity, the foregoing expansions cannot be of much use. In that case, one has to resort to numerical calculation.
If $n\lambda^3/g \gg 1$, the functions involved can be expressed as asymptotic expansions in powers of $(\ln z)^{-1}$; we then speak of the gas as being degenerate.
As $n\lambda^3/g \to \infty$, our functions assume a closed form and the expressions for the various thermodynamic quantities become highly simplified; we then speak of the gas as being completely degenerate.

Completely degenerate case


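In the limit $T \to 0$, which implies $n\lambda^3/g \to \infty$, the mean occupation numbers of the single-particle states become

$$ \langle n_i\rangle = \frac{1}{e^{\beta(\epsilon_i - \mu)} + 1} = \begin{cases} 1 & \text{for } \epsilon_i < \mu_0, \\ 0 & \text{for } \epsilon_i > \mu_0, \end{cases} \tag{31.64} $$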

where µ0 is the chemical potential of the system at T = 0. Thus, at T = 0, all single-particle


states up to µ0 are “completely” filled, with one particle per state, while all single-particle states
with ϵi > µ0 are empty. The limiting energy µ0 is generally referred to as the Fermi energy
of the system and is denoted by the symbol ϵF ; the corresponding value of the single-particle
momentum is referred to as the Fermi momentum and is denoted by the symbol pF . The
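defining equation for these parameters is

$$ N = \int_0^{\epsilon_F} a(\epsilon)\,d\epsilon, \quad\text{where}\quad a(\epsilon) = \frac{gV}{(2\pi)^3}\,4\pi p^2\frac{dp}{d\epsilon}. \tag{31.65} $$

We readily obtain

$$ N = \frac{gV}{6\pi^2}p_F^3, \tag{31.66} $$

which gives

$$ p_F = \left(\frac{6\pi^2 n}{g}\right)^{1/3}, \qquad \epsilon_F = \frac{1}{2m}\left(\frac{6\pi^2 n}{g}\right)^{2/3}. \tag{31.67} $$

The ground-state, or zero-point, energy of the system is then given by

$$ U_0 = \frac{4\pi gV}{(2\pi)^3}\int_0^{p_F}\frac{p^2}{2m}\,p^2\,dp = \frac{gVp_F^5}{20\pi^2 m} = \frac{3}{5}N\epsilon_F. \tag{31.68} $$

The ground-state pressure of the system is in turn given by

$$ P_0 = \frac{2U_0}{3V} = \frac{2}{5}n\epsilon_F \propto n^{5/3}. \tag{31.69} $$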
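To get a feeling for the size of $\epsilon_F$ and $P_0$, the sketch below evaluates 31.67 and 31.69 in SI units with $\hbar$ restored. The electron density is an assumed, copper-like illustration value and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_ELECTRON = 9.1093837e-31  # electron mass, kg
EV = 1.602176634e-19        # one electronvolt in joules

def fermi_energy(number_density, mass, g=2):
    """SI form of (31.67): eps_F = hbar^2 / (2m) * (6 pi^2 n / g)^(2/3), in joules."""
    return HBAR ** 2 / (2 * mass) * (6 * math.pi ** 2 * number_density / g) ** (2 / 3)

def ground_state_pressure(number_density, mass, g=2):
    """Zero-temperature pressure (31.69): P_0 = (2/5) n eps_F, in pascals."""
    return 0.4 * number_density * fermi_energy(number_density, mass, g)

# Assumed illustration: conduction electrons at n ~ 8.47e28 m^-3 (a copper-like density)
eps_f_ev = fermi_energy(8.47e28, M_ELECTRON) / EV
p0 = ground_state_pressure(8.47e28, M_ELECTRON)
```

A Fermi energy of several electronvolts corresponds to a temperature of order $10^4$ to $10^5$ K, which is why such an electron gas is strongly degenerate at room temperature.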

Degenerate case
For an analytical study of the Fermi gas at finite, but low, temperatures, we observe that the value of $z$ is now finite, though still large in comparison with unity. The functions $f_\nu(z)$ can be expressed as asymptotic expansions in powers of $(\ln z)^{-1}$. For the values of $\nu$ we are presently interested in, we have the approximations

$$ f_{5/2}(z) = \frac{8}{15\pi^{1/2}}(\ln z)^{5/2}\left[1 + \frac{5\pi^2}{8}(\ln z)^{-2} + \cdots\right], \tag{31.70a} $$
$$ f_{3/2}(z) = \frac{4}{3\pi^{1/2}}(\ln z)^{3/2}\left[1 + \frac{\pi^2}{8}(\ln z)^{-2} + \cdots\right], \tag{31.70b} $$
$$ f_{1/2}(z) = \frac{2}{\pi^{1/2}}(\ln z)^{1/2}\left[1 - \frac{\pi^2}{24}(\ln z)^{-2} + \cdots\right]. \tag{31.70c} $$
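The quality of the asymptotic form 31.70b can be checked against the integral definition 31.59. The following sketch does this for $\ln z = 10$; the substitution $x = t^2$, the cutoff and the step count are illustrative numerical choices.

```python
import math

def fermi_dirac_f32(log_z, steps=20000):
    """f_{3/2}(z) from its integral definition, using x = t^2 and Simpson's rule."""
    t_max = math.sqrt(max(log_z, 0.0) + 40.0)   # integrand is negligible beyond this
    h = t_max / steps                            # steps must be even
    def integrand(t):
        # x^(1/2) dx / (e^(x - ln z) + 1) with x = t^2 becomes 2 t^2 dt / (e^(t^2 - ln z) + 1)
        return 2 * t * t / (math.exp(t * t - log_z) + 1)
    total = integrand(0.0) + integrand(t_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3 / math.gamma(1.5)

def sommerfeld_f32(log_z):
    """Two-term asymptotic form (31.70b)."""
    return 4 / (3 * math.sqrt(math.pi)) * log_z ** 1.5 * (1 + math.pi ** 2 / (8 * log_z ** 2))

numeric = fermi_dirac_f32(10.0)
asymptotic = sommerfeld_f32(10.0)
relative_gap = abs(numeric - asymptotic) / numeric
```

The remaining gap is governed by the next term of the expansion, which is of relative order $(\ln z)^{-4}$.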

Thus, from equations 31.58, 31.67 and 31.70, we can get

$$ \epsilon_F^{3/2} = (T\ln z)^{3/2}\left[1 + \frac{\pi^2}{8}(\ln z)^{-2} + \cdots\right]. \tag{31.71} $$

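To the lowest order in $T/\epsilon_F$, the chemical potential of the degenerate Fermi gas is

$$ \mu = T\ln z \approx \epsilon_F\left[1 - \frac{\pi^2}{12}\left(\frac{T}{\epsilon_F}\right)^2\right]. \tag{31.72} $$

From equations 31.60, 31.70 and 31.72, the internal energy and pressure of the degenerate Fermi gas are given by

$$ \frac{U}{N} = \frac{3f_{5/2}(z)}{2f_{3/2}(z)}T \approx \frac{3}{5}(T\ln z)\left[1 + \frac{\pi^2}{2}(\ln z)^{-2}\right] \approx \frac{3}{5}\epsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{T}{\epsilon_F}\right)^2\right] \tag{31.73} $$

and

$$ P = \frac{2U}{3V} \approx \frac{2}{5}n\epsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{T}{\epsilon_F}\right)^2\right]. \tag{31.74} $$

Thus, the low-temperature specific heat of the gas is

$$ \frac{C_V}{N} = \frac{1}{N}\left(\frac{\partial U}{\partial T}\right)_{V,N} \approx \frac{\pi^2}{2}\frac{T}{\epsilon_F}. \tag{31.75} $$

The Helmholtz free energy of the system is

$$ \frac{F}{N} = \mu - \frac{PV}{N} \approx \frac{3}{5}\epsilon_F\left[1 - \frac{5\pi^2}{12}\left(\frac{T}{\epsilon_F}\right)^2\right], \tag{31.76} $$

which gives

$$ \frac{S}{N} = \frac{U - F}{NT} \approx \frac{\pi^2}{2}\frac{T}{\epsilon_F}. \tag{31.77} $$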

31.5 Thermodynamics of the blackbody radiation


We consider a radiation cavity of volume $V$ at temperature $T$. The system can be regarded as a gas of identical and indistinguishable photons. Because the number of photons is not conserved, the chemical potential of the photon gas in equilibrium must be zero (a zero chemical potential means that the ensemble is not allowed to penalize states with different values of $N$). Thus, the mean photon number in energy level $i$ is

$$ \langle n_i\rangle = \frac{1}{e^{\beta\epsilon_i} - 1}. \tag{31.78} $$

The internal energy of the photon gas is

$$ U = \sum_i \frac{\epsilon_i}{e^{\beta\epsilon_i} - 1}. \tag{31.79} $$

The summation in equation 31.79 can be approximated by an integral. Since the spin of a photon can take two distinct values, we have

$$ U = \frac{VT^4}{\pi^2}\int_0^\infty \frac{x^3\,dx}{e^x - 1} = \frac{VT^4}{\pi^2}\Gamma(4)\zeta(4) = \frac{\pi^2VT^4}{15}. \tag{31.80} $$

If there is a small opening in the walls of the cavity, the photons will “effuse” through it. The net rate of flow of the radiation, per unit area of the opening, is

$$ I = \frac{1}{2}\int_0^{\pi/2}\frac{U}{V}\cos\theta\sin\theta\,d\theta = \frac{U}{4V} = \frac{\pi^2}{60}T^4. \tag{31.81} $$
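Restoring $\hbar$, $c$ and $k_B$ in 31.81 turns the coefficient $\pi^2/60$ into the Stefan-Boltzmann constant. The sketch below evaluates it; the function names and the sample temperature are illustrative.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
C = 2.99792458e8         # speed of light, m/s

def stefan_boltzmann_constant():
    """sigma = pi^2 k_B^4 / (60 hbar^3 c^2), the SI form of the coefficient pi^2 / 60 in (31.81)."""
    return math.pi ** 2 * K_B ** 4 / (60 * HBAR ** 3 * C ** 2)

def radiated_flux(temperature):
    """Energy flux through the opening, I = sigma T^4, in W / m^2."""
    return stefan_boltzmann_constant() * temperature ** 4

sigma = stefan_boltzmann_constant()
flux_300k = radiated_flux(300.0)
```

The value of `sigma` can be compared with the tabulated Stefan-Boltzmann constant.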

The grand potential of the photon gas is

$$ \Omega = -T\ln Z_\Omega = \frac{VT^4}{\pi^2}\int_0^\infty \ln\left(1 - e^{-x}\right)x^2\,dx = -\frac{VT^4}{3\pi^2}\int_0^\infty \frac{x^3\,dx}{e^x - 1} = -\frac{1}{3}U. \tag{31.82} $$

It follows that

$$ P = \frac{U}{3V} = \frac{\pi^2T^4}{45}. \tag{31.83} $$

Since the chemical potential of the photon gas is zero, the Helmholtz free energy is equal to $\Omega$; therefore the entropy is given by

$$ S = \frac{U - F}{T} = \frac{4U}{3T} \propto VT^3. \tag{31.84} $$

The specific heat of the photon gas is

$$ C_V = T\left(\frac{\partial S}{\partial T}\right)_V = 3S. \tag{31.85} $$

Photons in the ground state are undetectable and can be neglected. Thus, the equilibrium number density of photons in the radiation cavity is

$$ \frac{N}{V} = \frac{T^3}{\pi^2}\int_0^\infty \frac{x^2\,dx}{e^x - 1} = \frac{2\zeta(3)T^3}{\pi^2} \propto T^3. \tag{31.86} $$

Instructive though it may be, the formula above cannot be taken at face value because, in the present problem, the magnitude of the fluctuations in the variable $N$, which is determined by the quantity $(\partial P/\partial V)_T^{-1}$, is infinitely large.
Chapter 32
Quantum Field Theory in Statistical Physics

32.1 Superfluidity

32.1.1 Experimental facts of Helium at low temperatures


Helium is the only element which remains a liquid at zero temperature and atmospheric pres-
sure. The phase diagram of 4He is shown in Figure 32.1 (a).

[Figure 32.1, two panels: (a) pressure p [bar] versus T [K], showing the melting curve, the λ-line separating He-II from He-I, and the vapor pressure curve; (b) specific heat C [J g⁻¹ K⁻¹] versus T [K] around Tλ.]

Figure 32.1: (a) The phase structure of 4He at low temperature; (b) The specific heat of helium
as a function of temperature.

Helium I is a normal fluid and has a normal gas-liquid critical point. Helium II is a mixture of a
normal fluid and a superfluid. The superfluid is characterized by the vanishing of its viscosity.
Helium I and helium II are separated by a line known as the λ-transition line. At Tλ = 2.18 K,
Pλ = 2.29 Pa, helium I, helium II, and helium gas coexist. The specific heat of liquid helium
along the vapour transition line forms a logarithmic discontinuity, as shown in Figure 32.1
(b). The form of this diagram resembles the Greek letter λ and is the reason for calling the
transition a λ transition.

The excitation spectrum of helium II can be measured experimentally through inelastic neutron scattering. It is found to consist of two parts, the phonon region

E(p) = c|p|, when |p| ≪ |p0 |, (32.1)



and the roton region

$$ E(p) = \Delta + \frac{1}{2\mu}|p - p_0|^2, \quad\text{when } |p| \sim |p_0|, \tag{32.2} $$

where $c = 226\,\mathrm{m/s}$ is the velocity of sound, $\Delta/k_B = 9\,\mathrm{K}$ is the roton parameter, and $\mu = 0.25\,m_{\mathrm{He}}$ is the effective mass. There is another velocity parameter known as the critical velocity $v_0$. It is only when helium II moves with velocity greater than $v_0$ that viscous effects arise. At low temperature the roton excitations are damped by the Boltzmann factor $\exp(-\beta\Delta)$.

32.1.2 Mechanism of superfluidity


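The Hamiltonian of a $^4$He system in momentum space is

$$ H = \sum_k\frac{|k|^2}{2m}a_k^\dagger a_k + \frac{1}{2V}\sum_{k_1,k_2,q}\tilde V(q)\,a_{k_1+q}^\dagger a_{k_2-q}^\dagger a_{k_1}a_{k_2}, \tag{32.3} $$

where

$$ a_k = \frac{1}{\sqrt V}\int dx\,\psi(x)e^{-ik\cdot x}, \qquad \tilde V(q) = \int dx\,V(x)e^{-iq\cdot x}. \tag{32.4} $$

Here we adopt box normalization to make the momentum of the particle discrete. It follows that

$$ \left[a_p, a_q^\dagger\right] = \delta_{p,q}, \qquad \left[a_p, a_q\right] = \left[a_p^\dagger, a_q^\dagger\right] = 0. \tag{32.5} $$

The canonical partition function is

$$ Z = \mathrm{Tr}\left(e^{-\beta H}\right). \tag{32.6} $$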

At low temperature, states with low energy become dominant. These are expected to be states with low values of momentum. Let us consider the system close to $T \approx 0$. We can then assume that the state of lowest energy corresponds to atoms of low momentum with a sizable fraction of the atoms in the zero-momentum state, leading to Bose-Einstein condensation. Thus if the system has on average $N$ atoms then a significant number $N_0$ of the atoms are in the lowest energy state.

Let us suppose that $|C; N, N_0\rangle$ is a superfluid state with a total of $N$ helium atoms, $N_0$ of which are in the zero-momentum plane wave state. If $a_0^\dagger$ and $a_0$ are creation and destruction operators of a state of zero momentum, we have

$$ a_0^\dagger a_0|C; N, N_0\rangle = N_0|C; N, N_0\rangle, \qquad a_0a_0^\dagger|C; N, N_0\rangle = (N_0 + 1)|C; N, N_0\rangle. \tag{32.7} $$

For large $N_0$ we can approximate $N_0 + 1$ by $N_0$, so that on the state $|C; N, N_0\rangle$ we can replace both the operators $a_0^\dagger a_0$ and $a_0a_0^\dagger$ by a single c-number, $N_0$.

For $|C; N, N_0\rangle$, the particle number operator $\hat N$ can be written as

$$ \hat N = N_0 + \sum_{k\neq0}a_k^\dagger a_k. \tag{32.8} $$

Neglecting terms of order $O(1)$, we have

$$ \hat N^2 \approx N_0^2 + 2N_0\sum_{k\neq0}a_k^\dagger a_k. \tag{32.9} $$

We next examine the interaction part of $H$ when restricted to $|C; N, N_0\rangle$. When all four operators in $H_I$ have zero momentum, we have the term

$$ H_I^0 = \frac{1}{2V}\tilde V(0)a_0^\dagger a_0^\dagger a_0a_0 = \frac{1}{2V}\tilde V(0)N_0^2 \approx \frac{1}{2V}\tilde V(0)\left[N^2 - 2N_0\sum_{k\neq0}a_k^\dagger a_k\right]. \tag{32.10} $$

The next term is of order $N_0$ and is the part of $H_I$ containing two operators carrying zero momentum. There are six ways in which this can happen. These are displayed below, each with the momentum variables that are set to zero:

$$ \begin{aligned}
k_1 + q = k_2 - q = 0 &: \ \frac{N_0}{2V}\sum_{q\neq0}\tilde V(q)\,a_{-q}a_q, &\qquad k_1 + q = k_1 = 0 &: \ \frac{N_0}{2V}\sum_{k_2\neq0}\tilde V(0)\,a_{k_2}^\dagger a_{k_2}, \\
k_1 + q = k_2 = 0 &: \ \frac{N_0}{2V}\sum_{q\neq0}\tilde V(q)\,a_{-q}^\dagger a_{-q}, &\qquad k_2 - q = k_1 = 0 &: \ \frac{N_0}{2V}\sum_{q\neq0}\tilde V(q)\,a_q^\dagger a_q, \\
k_2 - q = k_2 = 0 &: \ \frac{N_0}{2V}\sum_{k_1\neq0}\tilde V(0)\,a_{k_1}^\dagger a_{k_1}, &\qquad k_1 = k_2 = 0 &: \ \frac{N_0}{2V}\sum_{q\neq0}\tilde V(q)\,a_q^\dagger a_{-q}^\dagger.
\end{aligned} \tag{32.11} $$

Since at low temperature we expect only small-momentum excitations to be important, we replace $\tilde V(k)$ by $\tilde V(0)$ in $H_I$. Therefore, on the state $|C; N, N_0\rangle$, the interacting Hamiltonian, keeping terms of $O(N_0)$, is given by

$$ H_I^B \equiv \frac{\tilde V(0)}{2V}\left[N^2 + N_0\sum_{k\neq0}\left(2a_k^\dagger a_k + a_k^\dagger a_{-k}^\dagger + a_ka_{-k}\right)\right]. \tag{32.12} $$

Dropping non-operator parts of $H_I^B$, the total Hamiltonian is approximated by

$$ H^B = \sum_{k\neq0}\left(\frac{|k|^2}{2m} + \frac{\tilde V(0)}{V}N_0\right)a_k^\dagger a_k + \frac{\tilde V(0)N_0}{2V}\sum_{k\neq0}\left[a_k^\dagger a_{-k}^\dagger + a_ka_{-k}\right]. \tag{32.13} $$

To determine the energy eigenvalues of $H^B$, we use the method of the Bogoliubov–Valatin transform. Since $H^B$ is a quadratic function of the operators $a_k$ and $a_k^\dagger$, by taking appropriate linear combinations of these operators, we can form new operators $b_k$ and $b_k^\dagger$ which “diagonalize” $H^B$, leading, up to an additive constant, to

$$ H^B = \sum_{k\neq0}E(k)\,b_k^\dagger b_k. \tag{32.14} $$

The function $E(k)$ will then determine the different excitations of the system while $b_k$ and $b_k^\dagger$ will be destruction and creation operators for these excitations or “quasi-particles”, provided they satisfy the commutation rules

$$ \left[b_p, b_q^\dagger\right] = \delta_{p,q}. \tag{32.15} $$

Writing

$$ b_k = \alpha(k)a_k - \beta(k)a_{-k}^\dagger, \qquad b_k^\dagger = \alpha(k)a_k^\dagger - \beta(k)a_{-k}, \tag{32.16} $$

we then have the constraint

$$ \alpha(k)^2 - \beta(k)^2 = 1. \tag{32.17} $$

Substituting equation 32.16 into 32.13 and comparing it with 32.14, we find

$$ E(k) = \sqrt{\frac{k^2}{2m}\left(\frac{k^2}{2m} + \frac{2N_0\tilde V(0)}{V}\right)}. \tag{32.18} $$

The energy $E(k)$ is called the quasi-particle energy and the operators $b_k$ and $b_k^\dagger$ are quasi-particle destruction and creation operators. For small values of $k$ we have

$$ E(k) = \frac{|k|}{m}\sqrt{\frac{N_0}{V}\tilde V(0)m}. \tag{32.19} $$

Observe that for $E(k)$ to be real we must have $\tilde V(0) > 0$. This implies there is a repulsive region for $V(x)$ which must dominate the integral $\int dx\,V(x)$. Observe also that $|k|/m = v$ is a velocity, and $N_0m/V = \rho$ is the density of the superfluid helium, so that the quasi-particle energy can be written as

$$ E(k = mv) \approx |v|\sqrt{\rho\tilde V(0)}. \tag{32.20} $$

We now show that a system with such an energy spectrum represents a superfluid, i.e., a system with no friction. Friction in a system represents dissipation of energy. Consider a molecule of mass $M_A$ moving in a medium. If this molecule can change its energy through collisions with the excitations of the medium, then the system has friction. We will find that a molecule of mass $M_A$ and velocity $V_A$ moving through a system consisting of quasi-particles of energy $E(k)$ cannot change its energy by scattering off quasi-particles if $|V_A| < |v_0|$, where $|v_0|$ is a critical velocity determined by $\tilde V(0)$ and $\rho$.
To see this, let us consider the collision of a molecule of mass $M_A$ and velocity $V_A$ with a quasi-particle at rest. If the final momentum of the molecule is $Q_A$ and that of the quasi-particle is $k$, we have, from momentum conservation,

$$ |Q_A|^2 = |P_A|^2 + |k|^2 - 2|P_A||k|\cos\theta, \tag{32.21} $$

where $P_A = M_AV_A$ and $\theta$ is the angle between $P_A$ and $k$. It follows that

$$ \frac{|P_A|^2 - |Q_A|^2}{2|P_A||k|} \le \frac{|P_A|^2 - |Q_A|^2 + |k|^2}{2|P_A||k|} = \cos\theta \le 1. \tag{32.22} $$

Combining this with the energy conservation condition

$$ \frac{|P_A|^2}{2M_A} = \frac{|Q_A|^2}{2M_A} + E(k), \tag{32.23} $$

we end up with

$$ \frac{M_AE(k)}{|P_A||k|} = \frac{\sqrt{\rho\tilde V(0)}}{mV_A} \le 1. \tag{32.24} $$

Thus the process of changing energy for the molecule is not allowed if $V_A < v_0 = \sqrt{\rho\tilde V(0)}/m$, and the system of quasi-particles behaves like a superfluid.
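The argument above amounts to saying that the critical velocity is the smallest value of $E(k)/|k|$. The sketch below checks this numerically for the spectrum 32.18, in dimensionless units and with hypothetical parameter values; `coupling_density` stands for $N_0\tilde V(0)/V$ and the grid search is an illustrative choice.

```python
import math

def bogoliubov_energy(k, mass, coupling_density):
    """E(k) from (32.18), with coupling_density standing for N_0 * V~(0) / V."""
    kinetic = k * k / (2 * mass)
    return math.sqrt(kinetic * (kinetic + 2 * coupling_density))

def landau_critical_velocity(mass, coupling_density, k_max=10.0, samples=10000):
    """Smallest value of E(k) / k on a grid of k > 0."""
    grid = (k_max * (j + 1) / samples for j in range(samples))
    return min(bogoliubov_energy(k, mass, coupling_density) / k for k in grid)

# Hypothetical dimensionless parameters, chosen only for illustration
m, gn = 1.0, 0.5
v_numeric = landau_critical_velocity(m, gn)
v_sound = math.sqrt(gn / m)   # slope of the small-k spectrum (32.19)
```

Because $E(k)/|k|$ increases with $|k|$ for this spectrum, the minimum sits at the smallest grid point and approaches the phonon velocity. For a free-particle spectrum $E = k^2/2m$ the same minimum would be zero, so no superfluidity would result.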

32.2 Finite temperature perturbation theory


In this section we develop a perturbative formalism for the computation of the grand canonical partition function. This proceeds in close analogy with the perturbative evaluation of the evolution operator. The grand canonical partition function is

$$ Z_\Omega = \mathrm{Tr}\,e^{-\beta K}, \quad\text{where}\quad K \equiv K_0 + H_{\mathrm{int}} \equiv H_0 + H_{\mathrm{int}} - \mu N. \tag{32.25} $$

Define

$$ U(\beta) \equiv \exp(-\beta K), \qquad U_0(\beta) \equiv \exp(-\beta K_0), \qquad W(\beta) = U_0^{-1}(\beta)U(\beta). \tag{32.26} $$

Similar to the time-dependent perturbation theory in quantum mechanics, we can get

$$ \frac{\partial W}{\partial\beta} = -H_I(\beta)W(\beta), \quad\text{where}\quad H_I(\beta) \equiv U_0^{-1}(\beta)H_{\mathrm{int}}U_0(\beta). \tag{32.27} $$

The solution can be represented by the Dyson series

$$ W(\beta) = \sum_{n\ge0}\frac{(-1)^n}{n!}\int_0^\beta d\tau_1\cdots\int_0^\beta d\tau_n\,T\{H_I(\tau_1)\cdots H_I(\tau_n)\}. \tag{32.28} $$

Notice that $T$ is now the ordering with respect to $\tau$ (or imaginary time).
Upon substitution of 32.28 into the grand canonical partition sum, we have

$$ Z_\Omega = \mathrm{Tr}\left[e^{-\beta K_0}W(\beta)\right]. \tag{32.29} $$

So we obtain a perturbative expansion of the partition sum in analogy with that of the evolution
operator. Before we can apply this formalism to the computation of ZΩ , we need to analyze
the finite temperature versions of the time-ordered Green functions and Wick’s theorem. The
detailed discussion can be found in section 9.8 of Elements of Statistical Mechanics (Ivo Sachs,
Siddhartha Sen & James Sexton).

32.3 Path integral


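Similar to the time evolution operator in quantum mechanics, the partition function can also be worked out through a path integral. We first notice that

$$ e^{-\epsilon(T+V)} = e^{-\frac{\epsilon}{2}V}e^{-\epsilon T}e^{-\frac{\epsilon}{2}V} + O(\epsilon^3). \tag{32.30} $$

If we define $\epsilon \equiv \beta/n$, we have

$$ \mathrm{Tr}\,e^{-\beta H} = \mathrm{Tr}\left(e^{-\epsilon H}\right)^n = \mathrm{Tr}\left(e^{-\frac{\epsilon}{2}V}e^{-\epsilon T}e^{-\frac{\epsilon}{2}V}\right)^n + nO(\epsilon^3) = \mathrm{Tr}\left(e^{-\epsilon V}e^{-\epsilon T}\right)^n + O(\epsilon^2). \tag{32.31} $$

In the limit $n \to \infty$ the error term will go to zero, and we have achieved a splitting of the original exponential operator:

$$ \mathrm{Tr}\,e^{-\beta H} = \lim_{n\to\infty}\mathrm{Tr}\left(e^{-\epsilon V}e^{-\epsilon T}\right)^n. \tag{32.32} $$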
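The convergence in 32.32 can be watched on a toy example. The sketch below assumes a two-level system in which $T = a\sigma_x$ and $V = b\sigma_z$ play the roles of the two non-commuting parts of $H$; this stand-in, the parameter values and the function names are illustrative and do not come from the notes.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trotter_trace(beta, n, a=1.0, b=1.0):
    """Tr (e^{-eps V} e^{-eps T})^n for T = a*sigma_x and V = b*sigma_z, with eps = beta / n."""
    eps = beta / n
    exp_v = [[math.exp(-eps * b), 0.0], [0.0, math.exp(eps * b)]]   # e^{-eps b sigma_z}
    ch, sh = math.cosh(eps * a), math.sinh(eps * a)
    exp_t = [[ch, -sh], [-sh, ch]]                                   # e^{-eps a sigma_x}
    step = matmul(exp_v, exp_t)
    product = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        product = matmul(product, step)
    return product[0][0] + product[1][1]

def exact_trace(beta, a=1.0, b=1.0):
    """Tr e^{-beta (a sigma_x + b sigma_z)} = 2 cosh(beta sqrt(a^2 + b^2))."""
    return 2 * math.cosh(beta * math.hypot(a, b))

errors = [abs(trotter_trace(1.0, n) - exact_trace(1.0)) for n in (1, 10, 100, 1000)]
```

The list `errors` shrinks as $n$ grows, in line with the vanishing error term in 32.31.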

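At this point we are still working with operators, but we can now insert a complete set of states between each term in the product, and convert the problem to one with just commuting numbers. For simplicity, we assume the system has only one pair of canonical variables.

$$ \begin{aligned} \mathrm{Tr}\,e^{-\beta H} &= \lim_{n\to\infty}\int(dp\,dq)^n\,\mathrm{Tr}\left[e^{-\epsilon V}|q_0\rangle\langle q_0|e^{-\epsilon T}|p_0\rangle\langle p_0|e^{-\epsilon V}\cdots e^{-\epsilon V}|q_{n-1}\rangle\langle q_{n-1}|e^{-\epsilon T}|p_{n-1}\rangle\langle p_{n-1}|\right] \\ &= \lim_{n\to\infty}\int\left(\frac{dp\,dq}{2\pi}\right)^n e^{\sum_{i=0}^{n-1}\left[ip_i(q_{i+1}-q_i) - \epsilon H(q_i,p_i)\right]}, \quad\text{where } q_n = q_0. \end{aligned} \tag{32.33} $$

If $T(p) = p^2/2m$, the $p_i$ integrals are Gaussian, leading to

$$ \mathrm{Tr}\,e^{-\beta H} = \lim_{n\to\infty}\left(\frac{m}{2\pi\epsilon}\right)^{n/2}\int d^nq\,e^{\epsilon\sum_{i=0}^{n-1}\left[-\frac{m}{2\epsilon^2}(q_{i+1}-q_i)^2 - V(q_i)\right]}. \tag{32.34} $$

Formally, integral 32.34 can be written as

$$ \mathrm{Tr}\,e^{-\beta H} = \int\mathcal{D}q(\tau)\,e^{-S_E[q(\tau)]}, \tag{32.35} $$

where

$$ q(0) = q(\beta), \qquad S_E[q(\tau)] = \int_0^\beta d\tau\left[\frac{m}{2}\dot q^2 + V(q)\right]. \tag{32.36} $$

In quantum mechanics, we have

$$ \langle b|e^{-iHT}|a\rangle = \int\mathcal{D}q(t)\,e^{iS[q(t)]}, \tag{32.37} $$

where

$$ q(0) = a, \qquad q(T) = b, \qquad S[q(t)] = \int_0^T dt\left[\frac{m}{2}\dot q^2 - V(q)\right]. \tag{32.38} $$

Actually, path integral 32.35 can be obtained directly if we replace $T$ with $-i\beta$ and change the integration variable from $t$ to $\tau = it$ in the path integral 32.37.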
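The path integral formulation of the partition function can be generalized to the case of quantum fields straightforwardly. For a system of bosons, we have

$$ \mathrm{Tr}\,e^{-\beta(H-\mu N)} = \int\mathcal{D}\psi\,\mathcal{D}\psi^\dagger\,e^{-S_E}, \tag{32.39} $$

where $\psi(x,0) = \psi(x,\beta)$ and

$$ \begin{aligned} S_E = \int_0^\beta d\tau\,\Bigg[&\int d^3x\,\psi^\dagger(x,\tau)\left(\frac{\partial}{\partial\tau} - \frac{\nabla^2}{2m} - \mu\right)\psi(x,\tau) \\ &+ \frac{1}{2}\int d^3x\,d^3y\,\psi^\dagger(x,\tau)\psi^\dagger(y,\tau)V(x-y)\psi(y,\tau)\psi(x,\tau)\Bigg]. \end{aligned} \tag{32.40} $$

As for fermions, we have

$$ \mathrm{Tr}\,e^{-\beta(H-\mu N)} = \int\mathcal{D}\psi\,\mathcal{D}\psi^\dagger\,e^{-S_E}, \tag{32.41} $$

where $\psi(x,0) = -\psi(x,\beta)$ and $\psi(x,\tau)$ takes Grassmann-number values. A formal construction of the path integral based on coherent states can be found in sections 4.1 and 4.2 of Condensed Matter Field Theory (Alexander Altland & Ben Simons).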
Chapter 33
Phase Transitions and the Renormalization Group

33.1 Order parameter and phase transition


A given equilibrium state of a macroscopic system can be described by an order parameter
field. For a ferromagnet the order parameter field is the magnetization density. A phase tran-
sition corresponds to the order parameter field changing qualitatively together with the emer-
gence of singular behaviour in the system. For instance, the order parameter field in the case
of a ferromagnet is non-zero in the ferromagnetic phase, is zero in the paramagnetic phase,
and the susceptibility of the system diverges at the phase transition temperature.

Determining a suitable order parameter field to characterize a phase is part of the task of a
theory of phase transitions. If the order parameter field changes continuously from one phase
to another, as in the case of a ferromagnet, the transition is said to be a continuous or second-
order phase transition. If it is discontinuous the transition is said to be first order. An example
of a first order transition is when a solid melts to a liquid. The density of the system, which
can be taken as the order parameter, changes discontinuously. A phase transition is a striking
example of an emergent phenomenon. Starting off with only short-range interactions between
its microscopic magnetic moments, the system realizes long-range correlations below critical
temperature Tc .

We start with a model for a ferromagnet. We regard a ferromagnetic solid as being made out of
a finite number of elementary magnets placed at locations throughout the solid. We simplify
our model by assuming that each of these elementary magnets m can either point up m = 1
or down m = −1. Finally each elementary magnet interacts only with its nearest neighbour.
A Hamiltonian for this model could be

$$ H = -g\sum_{n,i}m_im_{i+n} - B\sum_i m_i, \tag{33.1} $$

where the first sum is over $i$ as well as the nearest neighbours of $i$. Notice that $H$ decreases if $m_i$ and $m_{i+n}$ have the same sign for $g > 0$.

If there are altogether a large but finite number of magnets in a ferromagnet, the susceptibility cannot diverge. A divergence can appear, however, if we allow the number of elementary magnets to tend to infinity. This is because an infinite sum of analytic functions need not be analytic.
In order to analyze this possibility we will need to consider the statistical mechanics partition
function in the limit in which the number of configurations is infinite.

Another approach to the problem is to suppose that the external magnetic field $B$ is changed to $B + \delta B(x)$. We expect that a change at $x$, $\delta B(x)$, will produce a change in the magnetization $\delta M$ not just at the point $x$ but at other points as well. Indeed we might expect

$$ \delta M(y) \propto C_T(|x-y|)\,\delta B(x), \tag{33.2} $$

where $C_T$ is a “correlation function” which determines the effect on the magnetization at $y$ due to a change in the external field $\delta B$ at $x$. The total change is then expected to be

$$ \delta M(y) = \int d^3x\,C_T(|x-y|)\,\delta B(x). \tag{33.3} $$

We have assumed that the correlation function depends only on temperature and on the distance between the points $x$ and $y$. Let us now suppose that $\delta B$ is independent of $x$ and let us set $y = 0$. Then we have

$$ \chi(0) = \frac{\delta M(0)}{\delta B} = \int d^3x\,C_T(|x|). \tag{33.4} $$

If we assume that

$$ C_T(|x|) = \begin{cases}\alpha, & |x| \le a(T), \\ 0, & |x| > a(T),\end{cases} \tag{33.5} $$

that is, a disturbance only propagates a distance $a(T)$, we get

$$ \chi(0) = \frac{4\pi\alpha}{3}a^3(T). \tag{33.6} $$
Thus, χ(0) will diverge if a(T ) diverges, that is, if correlations in the system become infinite.
From this point of view, the divergence in this susceptibility is due to the fact that near a phase
transition disturbances propagate over large distances.

33.2 Landau theory of phase transitions


For any change of a system in which the temperature is kept fixed and no work is done by the system, the change of free energy, $\Delta F$, is never positive, so that a state of equilibrium must be a minimum of $F$. Landau utilized this property of the free energy in his theory of phase transitions.
Let us examine this approach for the case of a ferromagnet. The basic idea is to make a model
for the free energy F near the Curie temperature Tc when the system is still a ferromagnet. We
know that for T < Tc long-range correlations are present, that is, the spin at lattice site x must
point in the same direction as that at site y even when x and y are not adjacent. Otherwise
the observed macroscopic magnetic properties of the system would not exist.
The basic assumption underlying Landau’s theory is that, near the critical temperature Tc , the
properties of a ferromagnet can be described in terms of a magnetization density function
M (x). The function M (x) can be defined by considering a volume element ∆V , large com-
pared to the lattice cell volume, but small compared to the volume of correlated spins centred

around the point x. The magnetization of the volume element ∆V is defined to be M (x)∆V .
For this definition of M (x) to be useful, it is important that M (x) should not be a rapidly
varying function of position. Near the Curie temperature Tc we also expect M (x) to be small
in amplitude.

On the basis of arguments of this kind, Landau proposed to introduce a functional $F_L[T, B, M]$ of the magnetization density $M(x)$, temperature $T$, and external magnetic field $B(x)$ of the form

$$ F_L[T,B,M] = F_L(T,B,0) + \int d^3x\left[a(T)|M|^2 + b(T)|M|^4 + \cdots + c(T)\sum_{ij}(\nabla_jM_i)^2 + \cdots - B\cdot M\right]. \tag{33.7} $$

The free energy $F_L(T,B)$ is then obtained by minimizing $F_L[T,B,M]$ with respect to $M$. Notice that the temperature-dependent coefficients $a(T), b(T), c(T), \ldots$ are assumed to be smooth functions of temperature. We will simplify the model by assuming that the magnetic field is along the $z$-direction and that $M(x)$ only has a component in the $z$-direction. Then we have

$$ F_L[T,B_z,M_z] = F_L(T,B_z,0) + \int d^3x\left[a(T)M_z^2 + b(T)M_z^4 + \cdots + c(T)|\nabla M_z|^2 + \cdots - B_zM_z\right]. \tag{33.8} $$

The expression for the Landau free energy $F_L$ is expected to be useful when $T$ is close to the Curie temperature $T_c$. In this region $M_z(x)$ is expected to be small and we also expect $|\nabla M_z|^2$ to be small. For these reasons we will from now on ignore the effect of the higher powers of $M_z$ and higher gradient terms.

To determine the equilibrium configuration of the magnetization we have to minimize the free energy with respect to $M_z(x)$. Using

$$ \delta F_L = \int d^3x\left[2a(T)M_z + 4b(T)M_z^3 - 2c(T)\nabla^2M_z - B_z\right]\delta M_z, \tag{33.9} $$

we see that the vanishing of $\delta F_L$ for arbitrary $\delta M_z$ requires

$$ 2a(T)M_z(x) + 4b(T)M_z^3(x) - 2c(T)\nabla^2M_z(x) = B_z(x). \tag{33.10} $$

Suppose now that $B_z(x)$ does not depend on $x$ and let us see if a solution for $M_z(x)$ independent of $x$ is possible. Such an $x$-independent solution must satisfy

$$ 2a(T)M_z + 4b(T)M_z^3 = B_z. \tag{33.11} $$

Now we ask if it is possible to construct a solution with the property that $M_z \neq 0$ when $B_z = 0$ and $T < T_c$. As we have stressed, this model is constructed to represent a ferromagnet near its Curie temperature. We also assume that the coefficient functions are all smooth functions of temperature. We thus expect the $M_z^4$ term to be small compared to the $M_z^2$ term. It is then reasonable to replace $b(T)$ by $b(T_c) = b_0$, a constant, and set $a(T) = a_0(T - T_c)$. Assume $a_0 > 0$ and $b_0 > 0$. When $T > T_c$, the solution is

$$ M_z = 0. \tag{33.12} $$

When $T < T_c$, the solution is

$$ M_z = 0 \quad\text{or}\quad \pm\sqrt{\frac{a_0(T_c - T)}{2b_0}}. \tag{33.13} $$

Since

$$ F(M_z = 0) > F\left(M_z = \pm\sqrt{\frac{a_0(T_c - T)}{2b_0}}\right), \tag{33.14} $$

we must have $M_z = \pm\sqrt{a_0(T_c - T)/(2b_0)}$ when $T < T_c$.

[Figure 33.1: Landau free energy. The plot shows curves of $F$ for $T > T_c$ and for $T < T_c$.]
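The minimization behind equations 33.12 to 33.14 can be checked numerically. The following is a minimal sketch, assuming a uniform magnetization, $B_z = 0$, and the illustrative values $a_0 = b_0 = 1$ (these numbers, the grid search, and the function names are our own choices, not part of the theory). It compares a brute-force minimum of $a_0 (T - T_c) M_z^2 + b_0 M_z^4$ with the root quoted in equation 33.13.

```python
import math

def landau_f(m, t, a0=1.0, b0=1.0):
    """Uniform Landau free-energy density at B_z = 0, with a(T) = a0 * t and t = T - Tc."""
    return a0 * t * m**2 + b0 * m**4

def minimize_on_grid(t, a0=1.0, b0=1.0, m_max=2.0, n=40001):
    """Brute-force search for the minimizing M_z >= 0 on a uniform grid."""
    best_m, best_f = 0.0, landau_f(0.0, t, a0, b0)
    for i in range(n):
        m = m_max * i / (n - 1)
        f = landau_f(m, t, a0, b0)
        if f < best_f:
            best_m, best_f = m, f
    return best_m

def m_analytic(t, a0=1.0, b0=1.0):
    """Equation 33.13: the nonzero root for T < Tc, zero otherwise."""
    return math.sqrt(a0 * (-t) / (2.0 * b0)) if t < 0 else 0.0

# t = T - Tc; one value above Tc and two below.
for t in (0.5, -0.1, -0.5):
    print(t, minimize_on_grid(t), m_analytic(t))
```

Above $T_c$ the search stays at $M_z = 0$, while below $T_c$ it lands on the nonzero root up to the grid spacing.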

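When $T = T_c$ and $B_z \neq 0$, we have

$$
M_z^3 = \frac{B_z}{4 b_0}. \tag{33.15}
$$

When $T \neq T_c$, if $B_z$ is changed to $B_z + \delta B_z$, the corresponding equilibrium magnetization $M_z$ changes to $M_z + \delta M_z$, where

$$
2 a_0 (T - T_c) \, \delta M_z + 12 b_0 M_z^2 \, \delta M_z = \delta B_z. \tag{33.16}
$$

Setting $B_z = 0$, we find that

$$
\chi = \left. \frac{\delta M_z}{\delta B_z} \right|_{B_z = 0} =
\begin{cases}
\dfrac{1}{2 a_0 (T - T_c)}, & T > T_c \\[2ex]
\dfrac{1}{4 a_0 (T_c - T)}, & T < T_c
\end{cases}. \tag{33.17}
$$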

Landau’s theory qualitatively reproduces the expected singular behaviour.
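As a cross-check of equation 33.17, one can solve the uniform equation of state 33.11 for a small field and differentiate numerically. The sketch below assumes $a_0 = b_0 = 1$ and uses a plain Newton iteration started on the $M_z > 0$ branch; the starting point, step count and field increment are illustrative choices of ours.

```python
def equilibrium_m(t, bz, a0=1.0, b0=1.0, m_start=1.0, steps=200):
    """Solve 2*a0*t*M + 4*b0*M**3 = bz by Newton iteration (t = T - Tc)."""
    m = m_start
    for _ in range(steps):
        g = 2.0 * a0 * t * m + 4.0 * b0 * m**3 - bz
        dg = 2.0 * a0 * t + 12.0 * b0 * m**2
        m -= g / dg
    return m

def chi_numeric(t, delta=1e-6):
    """Central finite difference of M_z with respect to B_z around B_z = 0."""
    return (equilibrium_m(t, delta) - equilibrium_m(t, -delta)) / (2.0 * delta)

def chi_landau(t, a0=1.0):
    """Equation 33.17."""
    return 1.0 / (2.0 * a0 * t) if t > 0 else 1.0 / (4.0 * a0 * (-t))

for t in (0.5, 0.1, -0.1, -0.5):
    print(t, chi_numeric(t), chi_landau(t))
```

Both columns grow as $|T - T_c|^{-1}$, with the amplitude below $T_c$ half of that above $T_c$.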


It is also possible to get a rather precise statement regarding long-range correlations within the framework of Landau's theory. Setting $c(T) = c_0$ in equation 33.10 near the Curie temperature, we have

$$
2a(T) M_z(x) + 4 b_0 M_z^3(x) - 2 c_0 \nabla^2 M_z(x) = B_z(x). \tag{33.18}
$$

It follows that

$$
[2 a_0 (T - T_c) + 12 b_0 M_z^2 - 2 c_0 \nabla^2] \, \delta M_z(x) = \delta B_z(x). \tag{33.19}
$$

Using equation 33.3, we obtain

$$
[2 a_0 (T - T_c) + 12 b_0 M_z^2 - 2 c_0 \nabla^2] \, C_T(|x - y|) = \delta(x - y). \tag{33.20}
$$

Setting $B_z = 0$ and inserting the equilibrium value of $M_z^2$ from equations 33.12 and 33.13, we get

$$
[2 a_0 (T - T_c) - 2 c_0 \nabla^2] \, C_T(|x - y|) = \delta(x - y) \quad \text{when } T > T_c; \tag{33.21}
$$

$$
[4 a_0 (T_c - T) - 2 c_0 \nabla^2] \, C_T(|x - y|) = \delta(x - y) \quad \text{when } T < T_c. \tag{33.22}
$$

The solution is

$$
C_T(|x - y|) = \frac{1}{8 \pi c_0} \frac{e^{-|x - y|/\xi}}{|x - y|}, \tag{33.23}
$$

where $\xi^2 = c_0 / a(T)$ for $T > T_c$ and $\xi^2 = -c_0 / 2a(T)$ for $T < T_c$.
We notice that $\xi \to \infty$ as $T \to T_c$. Thus Landau's theory is in qualitative agreement with the intuitive idea that long-range correlations are generated in a ferromagnet as $T \to T_c$. Another point to notice is that if $\delta B_z$ were $x$ independent, then as we saw before,

$$
\chi(0) = \frac{\delta M(0)}{\delta B} = \int d^3x \, C_T(|x|) \sim \xi^2 \to \infty, \quad \text{as } T \to T_c. \tag{33.24}
$$
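The $\xi^2$ scaling in equation 33.24 follows from an elementary radial integral over the screened kernel. As a worked step (independent of the overall normalization of $C_T$):

```latex
\int d^3x \, \frac{1}{4\pi} \frac{e^{-|x|/\xi}}{|x|}
  = \frac{1}{4\pi} \int_0^\infty 4\pi r^2 \, dr \, \frac{e^{-r/\xi}}{r}
  = \int_0^\infty r \, e^{-r/\xi} \, dr
  = \xi^2 .
```

Dividing equation 33.21 by $2 c_0$ shows that $C_T$ is this kernel times $1/(2 c_0)$, so for $T > T_c$ one finds $\chi(0) = \xi^2 / (2 c_0) = 1/(2 a(T))$, which agrees with equation 33.17.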

Let us summarize the results obtained from Landau's approach. The approach focused on long-range correlations and suggested that the singular behaviour of the susceptibility as $T \to T_c$ is due to such correlations. The approach also predicts that the relations between different macroscopic parameters involve power laws,

$$
M_z \sim (T_c - T)^{\beta}, \quad M_z \sim B_z^{1/\delta}, \quad \chi \sim \frac{1}{|T - T_c|^{\gamma}}, \tag{33.25}
$$

with $\beta = 1/2$, $\gamma = 1$, and $\delta = 3$. The parameters $\beta$, $\gamma$ and $\delta$ are called critical exponents and can be measured experimentally.
The experimental values $\beta \approx 0.33$, $\delta \approx 4.5$, and $\gamma \approx 1.2$ are found for different ferromagnets with different lattice structures and widely differing values of the Curie temperature $T_c$. These exponents are thus a universal property of the ferromagnetic phase transition. Landau's theory shares this feature, since its exponents do not depend on $a_0$, $b_0$ or $c_0$, but its agreement with experiment is only qualitative.

33.3 Renormalization group


33.3.1 One-dimensional Ising model
Although Landau's theory is in good qualitative agreement with experiment, there is room for improvement on the quantitative level concerning the critical exponents. Instead of considering a model for the free energy, we now consider the partition function directly. Let us consider a one-dimensional ferromagnetic system described in terms of an Ising model. The partition function is given by

$$
Z = \operatorname{Tr} e^{-\beta H}, \quad \text{where} \quad H = -J \sum_{i=1}^{N} S_i S_{i+1} - H_{\text{ext}} \sum_{i=1}^{N} S_i. \tag{33.26}
$$

Here, $S_i = \pm 1$ denotes the (uniaxial) magnetization or spin of site $i$ (periodic boundary conditions, $S_{N+1} = S_1$, are imposed), and $H_{\text{ext}}$ represents an external field. The Boltzmann weight of the system can be factorized according to the relation

$$
e^{-\beta H} = e^{\sum_{i=1}^{N} (K S_i S_{i+1} + h S_i)} = \prod_{i=1}^{N} T(S_i, S_{i+1}), \tag{33.27}
$$

where $K = \beta J > 0$ and $h = \beta H_{\text{ext}}$, and the weight is defined through the relation $T(S, S') = \exp[K S S' + h (S + S')/2]$. The partition function of the system can be written as

$$
Z = \sum_{\{S_i\}} e^{-\beta H} = \operatorname{Tr} T^N, \quad \text{where} \quad T = \begin{pmatrix} e^{K+h} & e^{-K} \\ e^{-K} & e^{K-h} \end{pmatrix}. \tag{33.28}
$$

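We first subdivide the spin chain into regular clusters of $b$ neighbouring spins. We then proceed to sum over the sub-configurations of each cluster, thereby generating an effective functional describing the inter-cluster energy balance. For the one-dimensional Ising model, we have

$$
Z_N(K, h) = \operatorname{Tr} T^N = \operatorname{Tr} \left( T^b \right)^{N/b} = \operatorname{Tr} (T')^{N/b} = Z_{N/b}(K', h'). \tag{33.29}
$$

Define $u \equiv e^{-K}$ and $v \equiv e^{-h}$. It can be worked out that

$$
u' = \frac{\sqrt{v + v^{-1}}}{(u^4 + u^{-4} + v^2 + v^{-2})^{1/4}}, \quad v' = \frac{\sqrt{u^4 + v^2}}{\sqrt{u^4 + v^{-2}}} \quad \text{for } b = 2. \tag{33.30}
$$

Strictly speaking, $T^2$ reproduces the form of $T$ only up to an overall constant factor. This factor shifts the free energy by an additive term and does not affect the flow of $(u, v)$, so it is dropped here.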
The possibility of representing the new transfer matrix in the same algebraic structure as the
old one implies that the transformed model again describes an Ising spin system, and that we
can think of each cluster as some kind of block spin. However, the Hamiltonian βH of the new
block spin system is defined at a different temperature, magnetic field and exchange constant
and describes fluctuations on length scales that are twice as large as in the original system. In
particular, the short-distance cut-off has been doubled.
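The claim that the squared transfer matrix has the same algebraic structure as the original one can be tested directly. The following sketch (plain Python, with helper names of our own choosing) builds $T$ from equation 33.28 in terms of $u$ and $v$, squares it, and compares it entry by entry with $T$ evaluated at the $(u', v')$ of equation 33.30. If the four ratios coincide, $T^2$ is proportional to $T'$, and the common ratio is the overall constant that is absorbed into the free energy.

```python
import math

def transfer_matrix(u, v):
    """Transfer matrix of equation 33.28 written with u = exp(-K), v = exp(-h)."""
    return [[1.0 / (u * v), u], [u, v / u]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def block_spin_map(u, v):
    """Equation 33.30 for block size b = 2."""
    u_new = math.sqrt(v + 1.0 / v) / (u**4 + u**-4 + v**2 + v**-2) ** 0.25
    v_new = math.sqrt(u**4 + v**2) / math.sqrt(u**4 + v**-2)
    return u_new, v_new

def proportionality_ratios(u, v):
    """Entry-wise ratios of T(u, v)^2 to T(u', v'); equal ratios mean T^2 = const * T'."""
    t = transfer_matrix(u, v)
    t2 = matmul2(t, t)
    tp = transfer_matrix(*block_spin_map(u, v))
    return [t2[i][j] / tp[i][j] for i in range(2) for j in range(2)]

# Arbitrary illustrative couplings.
print(proportionality_ratios(0.6, 0.8))
```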
To make further progress, one may focus on the two relevant parameters $u'$ and $v'$ and observe that the result of the block spin transformation can be represented as a discrete map

$$
\begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} f_1(u, v) \\ f_2(u, v) \end{pmatrix}, \tag{33.31}
$$

where the functions $f_{1,2}$ are defined through equation 33.30.


In the one-dimensional Ising model, the map $f$ possesses two disjoint sets of fixed points, $(u^*, v^*) = (0, 1)$ and $(u^*, v^*) = (1, v)$. The set of fixed points represents the most important structural characteristic of an RG analysis. They organize the space of “flowing” coupling constants into sectors of qualitatively different behaviour. In particular, one may notice that, at a fixed point, all characteristics of the model, including its correlation length $\xi$, remain invariant. On the other hand, we noticed above that an RG step is tantamount to doubling the fundamental length scale of the system. Consistency requires that either $\xi = 0$ or $\xi = \infty$.

In the present case, the line of fixed points $(1, v)$ is identified with $u = \exp(-\beta J) = 1$, i.e., $\beta = 0$. This is the limit of infinitely large temperatures, at which we expect the model to be in a state of maximal thermal disorder, i.e., $\xi = 0$. Besides the high-temperature fixed line, there is a zero-temperature fixed point $(u, v) = (\exp(-\beta J), \exp(-h)) = (0, 1)$, implying $T \to 0$ and $h \to 0$. Upon approaching zero temperature, the system is expected to order and to build up long-range correlations, i.e., $\xi \to \infty$. A critical point corresponds to a fixed point of the RG transformation with infinite correlation length.

Notice, however, an important difference between the high- and the low-temperature set of
fixed points: while the former is an attractive fixed point in the sense that the RG trajecto-
ries approach it asymptotically, the latter is a repulsive fixed point. No matter how low the
temperature at which we start, the RG flow will drive us into a regime of effectively higher
temperature or lower ordering. (Of course, the physical temperature does not change under
renormalization. All we are saying is that the block spin model behaves as an Ising model at a
higher temperature than the original system.)
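The repulsive character of the zero-temperature fixed point can be seen by iterating the map of equation 33.30. The sketch below starts at a low temperature (small $u$) in zero field ($v = 1$); the starting value and the number of steps are arbitrary illustrative choices.

```python
import math

def block_spin_map(u, v):
    """Equation 33.30 for block size b = 2, with u = exp(-K) and v = exp(-h)."""
    u_new = math.sqrt(v + 1.0 / v) / (u**4 + u**-4 + v**2 + v**-2) ** 0.25
    v_new = math.sqrt(u**4 + v**2) / math.sqrt(u**4 + v**-2)
    return u_new, v_new

def rg_trajectory(u, v, steps):
    """Apply the block spin map repeatedly and record every point visited."""
    points = [(u, v)]
    for _ in range(steps):
        u, v = block_spin_map(u, v)
        points.append((u, v))
    return points

# Start close to the zero-temperature fixed point (u, v) = (0, 1).
for step, (u, v) in enumerate(rg_trajectory(0.05, 1.0, 12)):
    print(step, u, v)
```

In this run $u$ grows at every step and approaches $u = 1$, the high-temperature fixed line, while $v$ stays at 1.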

33.3.2 Gell-Mann–Low equations


There are a number of methodologically different procedures whereby the set of flow equa-
tions can be obtained from the microscopic theory. Here, we formulate this step in a language
adjusted to applications in statistical field theory as opposed to particle physics. While there
is considerable freedom in the actual implementation of the RG procedure, all methods share
the feature that they proceed in a sequence of three more or less canonical steps.

Subdivision of the field manifold


In the first step, one may decompose the integration manifold {ϕ} into a sector to be integrated
out, {ϕf }, and a complementary set, {ϕs }.

• We may proceed according to a generalized block spin scheme and integrate over all degrees of freedom located within a certain structural unit in the base manifold $\{x\}$. (This scheme is adjusted to lattice problems where $\{x\} = \{x_i\}$ is a discrete set of points.)

• We could decide to integrate over a certain sector in momentum space. When this sec-
tor is defined to be a shell Λ/b < |p| < Λ, one speaks of a momentum shell integration.
Naturally, within this scheme, the theory will be explicitly cutoff-dependent at interme-
diate stages.

• Alternatively, we may decide to integrate over all high-lying degrees of freedom $\lambda^{-1} \le |p|$. In this case, we will of course encounter divergent integrals. An elegant way to handle these divergences is to apply dimensional regularization. Within this approach one formally generalizes from integer dimensions $d$ to fractional values $d \pm \epsilon$. One motivation for doing so is that the formal extension of the characteristic integrals appearing during the RG step to non-integer dimensions is finite. As long as one stays clear of the dangerous values $d = \text{integer}$, one can then safely monitor the dependence of the integrals on the IR cutoff $\lambda^{-1}$.

RG step

The second part of the program is to actually integrate over short range fluctuations. This
step usually involves approximations. In most cases, one will proceed by a so-called loop ex-
pansion, i.e., one organizes the integration over the fast field ϕf according to the number of
independent momentum integrals (loops) that occur after the appropriate contractions.

Following the procedure, an expansion over the fast degrees of freedom gives an action in
which coupling constants of the remaining slow fields are altered. Notice that the integration
over fast field fluctuations may lead to the generation of “new” operators, i.e., operators that
have not been present in the bare action. In such cases one has to investigate whether the
newly generated operators are “relevant” in their scaling behaviour. If so, the appropriate way
to proceed is to include these operators in the action from the very beginning (with an a priori
undetermined coupling constant). One then verifies whether the augmented action represents
a complete system, i.e., one that does not lead to the generation of operators beyond those that
are already present. If necessary, one has to repeat this step until a closed system is obtained.

Rescaling

One next rescales frequency/momentum so that the rescaled field amplitude $\phi'$ fluctuates on the same scales as the original field $\phi$, i.e., one sets

$$
q \to b q, \quad \omega \to b^z \omega. \tag{33.32}
$$

Here, the frequency renormalization exponent or dynamical exponent $z$ depends on the effective dispersion relating frequency and momentum. We finally notice that the field $\phi$, as an integration variable, may be rescaled arbitrarily. Using this freedom, we select a term in the action which we believe governs the behaviour of the “free” theory (in a theory with elastic coupling this might be the leading-order gradient operator $\int d^d r \, (\nabla \phi)^2$) and require that it be strictly invariant under the RG step. To this end we designate a dimension $L^{d_\phi}$ for the field, chosen so as to compensate for the factor $b^x$ arising after the renormalization of the operator. The rescaling $\phi \to b^{d_\phi} \phi$ is known as field renormalization. It renders the “leading” operator in the action scale invariant.

As a result of all these manipulations, we obtain a renormalized action

$$
S[\phi] = \sum_a g'_a O_a[\phi], \tag{33.33}
$$

which is entirely described by the set of changed coupling constants, i.e., the effect of the RG step is fully encapsulated in the mapping

$$
g' = \tilde{R}(g), \tag{33.34}
$$

relating the old value of the vector of coupling constants to the renormalized one. By letting the control parameter, $l \equiv \ln b$, of the RG step assume infinitesimal values, one can make the difference between bare and renormalized coupling constants arbitrarily small. It is then natural to express the difference in the form of a generalized $\beta$-function or Gell-Mann–Low equation

$$
\frac{dg}{dl} = R(g), \tag{33.35}
$$

where the right-hand side is defined through the relation

$$
R(g) = \lim_{l \to 0} l^{-1} \big( \tilde{R}(g) - g \big). \tag{33.36}
$$
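The limit in equation 33.36 can be illustrated on a toy example. The sketch below assumes a hypothetical single coupling whose finite RG step is known in closed form (the logistic-type map and the parameters `EPS` and `C` are invented for illustration and are not derived from any model in the text). Shrinking $l$ shows the finite-difference quotient approaching the corresponding $\beta$-function.

```python
import math

EPS, C = 0.5, 2.0  # hypothetical toy parameters, chosen only for illustration

def rg_step(g, l):
    """Finite RG step g -> R~(g) for scale factor b = exp(l) in the toy model."""
    growth = math.exp(EPS * l)
    return g * growth / (1.0 + (C * g / EPS) * (growth - 1.0))

def beta_from_step(g, l):
    """Finite-l estimate of R(g) = lim l^{-1} (R~(g) - g), equation 33.36."""
    return (rg_step(g, l) - g) / l

def beta_exact(g):
    """Beta function of the toy model: dg/dl = EPS * g - C * g**2."""
    return EPS * g - C * g**2

g = 0.1
for l in (1.0, 0.1, 0.01, 0.001):
    print(l, beta_from_step(g, l), beta_exact(g))
```

In this toy model the coupling $g^* = \epsilon / c$ is left unchanged by a step of any size, which is the discrete counterpart of a zero of the $\beta$-function.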

33.3.3 Analysis of the Gell-Mann–Low equation


The prime structural characteristic of the Gell-Mann–Low equations is the set of fixed points,
i.e., the submanifold {g ∗ } of points in coupling constant space which are stationary under the
flow. Once the coupling constants are fine-tuned to a fixed point, the system no longer changes
under subsequent RG transformations. In particular it remains invariant under the change of
space/time scale associated with the transformation. Alluding to the fact that they look the
same no matter how large a magnifying glass is used, systems with this property are referred
to as self-similar.
Now, to each system, one can attribute at least one intrinsic length scale, namely the length ξ
determining the exponential decay of field correlations. However, the existence of a finite, and
pre-determined, intrinsic length scale clearly does not go together with invariance under scale
transformations. We thus conclude that, at a fixed point, either ξ = 0 (not so interesting), or
ξ = ∞. However, a diverging correlation length ξ → ∞ is a hallmark of a second-order phase
transition. We thus tentatively identify fixed points of the RG flow as candidates for “transition
points” of the physical system.
This being so, it is natural to pay special attention to the behaviour of the flow in the immediate vicinity of the fixed-point manifolds. If the set of coupling constants, $g$, is close enough to a fixed point, $g^*$, it will be sufficient to consider the linearized mapping

$$
R(g) \equiv R[g^* + (g - g^*)] \approx W (g - g^*), \quad W_{ab} = \left. \frac{\partial R_a}{\partial g_b} \right|_{g = g^*}. \tag{33.37}
$$

To explore the properties of the flow, let us assume that we had managed to diagonalize the matrix $W$. Denoting the eigenvalues by $\lambda_\alpha$, and the left-eigenvectors by $\phi_\alpha$, we have

$$
\phi_\alpha^{\intercal} W = \phi_\alpha^{\intercal} \lambda_\alpha. \tag{33.38}
$$

Let $v_\alpha$ be the $\alpha$th component of the vector $g - g^*$ when represented in the basis $\{\phi_\alpha\}$, i.e.,

$$
v_\alpha = \phi_\alpha^{\intercal} (g - g^*). \tag{33.39}
$$

It follows that

$$
\frac{dv_\alpha}{dl} = \lambda_\alpha v_\alpha. \tag{33.40}
$$

Under renormalization, the coefficients $v_\alpha$ change by a mere scaling factor, $v_\alpha(l) = e^{\lambda_\alpha l} v_\alpha(0) = b^{\lambda_\alpha} v_\alpha(0)$, wherefore they are called scaling fields. This suggests a discrimination between at least three different types of scaling fields:
• For λα > 0 the flow is directed away from the critical point. The associated scaling field
is said to be relevant.

• In the complementary case, λα < 0, the flow is attracted by the fixed point. Scaling
fields with this property are said to be irrelevant.

• Finally, scaling fields which are invariant under the flow, λα = 0 , are termed marginal.
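The three cases above follow directly from the solution of equation 33.40. As a small sketch (the eigenvalues in the example are hypothetical numbers, chosen only to show one case of each type):

```python
import math

def classify(eigenvalue, tol=1e-12):
    """Label a scaling field by the sign of its eigenvalue lambda_alpha."""
    if eigenvalue > tol:
        return "relevant"
    if eigenvalue < -tol:
        return "irrelevant"
    return "marginal"

def evolve(v0, eigenvalue, l):
    """Solution of dv/dl = lambda * v (equation 33.40) at RG 'time' l = ln b."""
    return v0 * math.exp(eigenvalue * l)

# Hypothetical eigenvalues, one of each type.
for lam in (2.0, -0.5, 0.0):
    print(lam, classify(lam), [evolve(0.01, lam, l) for l in (0.0, 1.0, 2.0)])
```

A relevant field grows under the flow, an irrelevant one decays, and a marginal one stays fixed at linear order.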

The distinction of relevant/irrelevant/marginal scaling fields in turn implies a classification of different types of fixed points:

• Firstly, there are stable fixed points, i.e., fixed points whose scaling fields are all irrelevant
or, at worst, marginal. These points define what we might call “stable phases of matter”:
when you release a system somewhere in the parameter space surrounding any of these
attractors, it will scale towards the fixed point and eventually sit there. Or, expressed in
more physical terms, looking at the problem at larger and larger scales will make it more
and more resemble the infinitely correlated self-similar fixed-point configuration. By
construction, the fixed point is impervious to moderate variations in the microscopic
morphology of the system, i.e., it genuinely represents what one might call a “state of
matter.”

• Complementary to stable fixed points, there are unstable fixed points, where all scaling
fields are relevant (e.g., the T = 0 fixed point of the 1-D Ising model). You can never
get there and, even if you managed to approach it closely, the harsh conditions of reality
will make you flow away from it. Although unstable fixed points do not correspond to
realizable forms of matter, they are of importance inasmuch as they “orient” the global
RG flow of the system.

• Finally, there is the generic class of fixed points with both relevant and irrelevant scaling
fields. These points are of particular interest inasmuch as they can be associated with
phase transitions. To understand this point, we first notice that the r eigenvectors asso-
ciated with irrelevant scaling fields span the tangent space S of an r-dimensional mani-
fold known as the critical surface. This critical manifold forms the basin of attraction of
the fixed point, i.e., whenever a set of physical coupling constants g is fine-tuned so that
g ∈ S, the expansion in terms of scaling fields contains only irrelevant contributions
and the system will feel attracted to the fixed point as if it were a stable one. However,
the smallest deviation from the critical surface introduces a relevant component driving
the system exponentially away from the fixed point. For example, in the case of the fer-
romagnetic phase transition, deviations from the critical temperature Tc are relevant. If
we consider a system only slightly above or below Tc , it may initially appear to be crit-
ical. However, upon further increasing the scale, the relevant deviation will grow and
drive the system away from criticality, either towards the stable high-temperature fixed
point of the paramagnetic phase or towards the ferromagnetic low-temperature phase.
33.4 Critical exponents


The phenomenology of second-order transitions is generally richer than that of first-order transitions. As a thermodynamic state variable, the order parameter $M$ is coupled to a conjugate field $H$ and $M = -\partial_H F$, where $F$ is the free energy. At a second-order transition, $M$ changes non-analytically, which means that the second-order derivative, a thermodynamic susceptibility, $\chi = -\partial_H^2 F$, develops a singularity. The susceptibility is intimately linked to the field fluctuation behaviour of the system. A divergence of the susceptibility implies the accumulation of infinitely long-range field fluctuations.

We have seen that, right at the transition/fixed point, the system is self-similar. This implies
that the behaviour of its various characteristics must be described by power laws. The set
of different exponents characterizing the relevant power laws occurring in the vicinity of the
transition are known as critical exponents.

In the following, let us briefly enumerate the list of the most relevant exponents, α, β, γ, δ, η
and ν. Although we shall again make use of the language of the magnetic transition, it is clear
that the definitions of most exponents generalize to other systems.

1. In the vicinity of the critical temperature, the specific heat

$$
C = -\frac{T}{L^d} \left. \frac{\partial^2 F}{\partial T^2} \right|_{h \searrow 0}, \tag{33.41}
$$

scales as $C \sim |t|^{-\alpha}$, where $t = (T - T_c)/T_c$. Notice that a non-trivial statement has been made: although the phases above and below the transition are essentially different, the scaling exponents controlling the behaviour of $C$ are identical. The same applies to most other exponents listed below.

2. Approaching the transition temperature from below, the magnetization vanishes as

$$
M \equiv -\left. \partial_H F \right|_{H \searrow 0} \sim (-t)^{\beta}. \tag{33.42}
$$

3. The magnetic susceptibility behaves as

$$
\chi \equiv \left. \partial_h M \right|_{h \searrow 0} \sim |t|^{-\gamma}. \tag{33.43}
$$

4. At the critical temperature, $t = 0$, the field dependence of the magnetization is given by $M \sim |h|^{1/\delta}$.

5. Upon approaching the transition point, the correlation length diverges as ξ ∼ |t|−ν .

6. The correlation function,

$$
C(r) = \langle \phi(r) \phi(0) \rangle \sim
\begin{cases}
r^{-(d - 2 + \eta)}, & r \ll \xi \\
\exp(-r/\xi), & r \gg \xi
\end{cases}, \tag{33.44}
$$

crosses over from exponential to a power law scaling behaviour at the length scale $\xi$. The engineering dimension of $\phi$ is $[\phi] = L^{(2-d)/2}$ and so $C(r)$ has canonical dimension $L^{2-d}$. The exponent $\eta$ is called the anomalous dimension of the correlation function. As the response functions can be obtained from integrating the connected correlation functions, we have

$$
\chi \sim \int d^d x \, C(x) \sim \int_0^{\xi} \frac{d^d x}{r^{d - 2 + \eta}} \sim \xi^{2 - \eta}. \tag{33.45}
$$

Universality

In fact, the majority of critical systems can be classified into a relatively small number of uni-
versality classes. Crudely speaking, leaving apart more esoteric classes of phase transitions,
there are O(10) fundamentally different types of flow recurrently appearing in practical ap-
plications. This has to be compared with the near infinity of different physical systems that
display critical phenomena. The origin of this universality can readily be understood from the
concept of critical surfaces.

Imagine, then, an experimentalist exploring a system that is known to exhibit a phase tran-
sition. Motivated by the critical phenomena that accompany phase transitions, the available
control parameters Xi (temperature, pressure, magnetic field, etc.) will be varied until the
system begins to exhibit large fluctuations.

On a theoretical level, the variation of the control parameters determines the initial values of
the coupling constants of the model. For microscopic parameters corresponding to a point
above or below the critical manifold, the system asymptotically falls into either the “high-” or
the “low-temperature” regime. However, eventually the trajectory through parameter space
will intersect the critical surface. For this particular set of coupling constants, the system is
critical. As we look at it on larger and larger length scales, it will be attracted by the fixed point
at S, i.e., it will display the universal behaviour characteristic of this particular point. This is the
origin of universality: variation of the system parameters in a different manner will generate a
different trajectory. However, as long as this trajectory intersects with S, it is guaranteed that
the critical behaviour will exhibit the same universal characteristics controlled by the unique
fixed point.

In fact a more far-reaching statement can be made. Given that there is an infinity of systems ex-
hibiting transition behaviour while there is only a very limited set of universality classes, many
systems of very different microscopic morphology must have the same universal behaviour.
More formally, different microscopic systems must map onto the same critical low-energy
theory.

Scaling laws

Let us consider the case of the ferromagnetic transition. The flow in the vicinity of the magnetic fixed point is controlled by only two relevant scaling fields, the (reduced) temperature $t \equiv (T - T_c)/T_c$ and the reduced magnetic field $h \equiv H/T$. The other scaling fields $g_i$ are irrelevant. Under a renormalization group transformation, the reduced free energy $f = F/(T L^d)$ will behave as

$$
\begin{aligned}
f(t, h, g_i) &= b^{-d} f(t b^{y_t}, h b^{y_h}, g_i b^{\lambda_i}) = t^{d/y_t} f(1, h t^{-y_h/y_t}, g_i t^{-\lambda_i/y_t}) \\
&\overset{t \ll 1}{\approx} t^{d/y_t} f(1, h t^{-y_h/y_t}, 0) \equiv t^{d/y_t} \tilde{f}(h t^{-y_h/y_t}).
\end{aligned} \tag{33.46}
$$

Here, we have used the freedom of arbitrarily choosing the parameter b to set tbyt = 1 while,
in the third equality, we have assumed that we are sufficiently close to the transition that the
dependence of f on irrelevant scaling fields is inessential. Combining the definitions of critical
exponents and equation 33.46, it can be shown that
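$$
\alpha = 2 - \frac{d}{y_t}, \quad \beta = \frac{d - y_h}{y_t}, \quad \gamma = \frac{2 y_h - d}{y_t}, \quad \delta = \frac{y_h}{d - y_h}, \quad \nu = \frac{1}{y_t}, \quad \eta = 2 + d - 2 y_h. \tag{33.47}
$$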
The dimensions of the relevant scaling fields have a more fundamental status than the critical
exponents. Of the six classical exponents, only two can be truly independent. Scaling laws can
be derived by eliminating yh and yt in equations 33.47:

ν(2 − η) = γ, α + 2β + γ = 2, β(δ − 1) = γ, 2 − α = νd. (33.48)
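That the four scaling laws 33.48 hold identically once the exponents are expressed through $y_t$ and $y_h$ can be confirmed with exact rational arithmetic. In the sketch below, the function names and the input values $d = 3$, $y_t = 3/2$, $y_h = 5/2$ are illustrative assumptions of ours, not results from the text.

```python
from fractions import Fraction

def critical_exponents(d, y_t, y_h):
    """Equations 33.47: critical exponents from the two relevant scaling dimensions."""
    d, y_t, y_h = Fraction(d), Fraction(y_t), Fraction(y_h)
    return {
        "alpha": 2 - d / y_t,
        "beta": (d - y_h) / y_t,
        "gamma": (2 * y_h - d) / y_t,
        "delta": y_h / (d - y_h),
        "nu": 1 / y_t,
        "eta": 2 + d - 2 * y_h,
    }

def scaling_law_residuals(d, e):
    """Left-hand side minus right-hand side for each relation in equation 33.48."""
    d = Fraction(d)
    return [
        e["nu"] * (2 - e["eta"]) - e["gamma"],
        e["alpha"] + 2 * e["beta"] + e["gamma"] - 2,
        e["beta"] * (e["delta"] - 1) - e["gamma"],
        2 - e["alpha"] - e["nu"] * d,
    ]

# Illustrative input values (assumed, not taken from the text).
exps = critical_exponents(3, Fraction(3, 2), Fraction(5, 2))
print(exps)
print(scaling_law_residuals(3, exps))
```

All four residuals vanish for any admissible $(d, y_t, y_h)$, which is the statement that only two of the six exponents are independent.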

33.5 RG analysis of the ferromagnetic transition


33.5.1 Preliminary dimensional analysis
We now turn to the problem of calculating the various critical exponents of a ferromagnetic system. The method presented here will improve on Landau's theory while reducing to the latter in a certain limit. The idea is to represent Landau's model as a certain approximation to the partition function of some field theory for the order parameter.

We suppose that the partition function of the Ising model can be represented by

$$
Z = \int D\phi \, e^{-S[\phi]}, \quad \text{where} \quad S[\phi] = \int d^d x \left[ \frac{1}{2} (\nabla \phi)^2 + \frac{r}{2} \phi^2 + \frac{\lambda}{4!} \phi^4 - h \phi \right]. \tag{33.49}
$$

The functional integral above is not derived from a concrete Hamiltonian. The path integral should be understood rather as a statistical averaging over different configurations of the order parameter $\phi(x)$. The reason why it is possible to neglect both higher powers and higher gradients of the field $\phi$ can be formulated by dimensional analysis.
Anticipating that the “real” dimensions carried by the operators in the action will be not too
far from their engineering dimensions, we begin by exploring the latter. It is straightforward
to attribute engineering dimensions to all operators:
$$
\left[ \int \phi^2 \right] = L^2, \quad \left[ \int \phi^4 \right] = L^{4-d}, \quad \left[ \int \phi^n \right] = L^{d + (2-d)n/2}, \quad \left[ \int (\nabla^m \phi)^2 \right] = L^{2(1-m)}. \tag{33.50}
$$

These relations convey much about the potential significance of all structurally allowed operators:

• The engineering dimension of the non-gradient operator $\sim \phi^2$ is positive in all dimensions, indicating general relevance.
• The $\phi^4$ operator is relevant (irrelevant) in dimensions $d < 4$ ($d > 4$). This suggests that for $d > 4$ a harmonic approximation ($\lambda = 0$) of the model should be reasonable. It also gives us a preliminary clue as to how we might want to approach the $\phi^4$-model on a technical level: while for dimensions “much” smaller than $d = 4$ the interaction operator $\sim \phi^4$ is strongly relevant, the dimension $d = 4$ itself is borderline. This suggests that we analyze the model at $d = 4$, or maybe “close” to $d = 4$ where the $\phi^4$ operator is not yet that virulent, and then try to extrapolate to infer what happens at the “physical dimensions” of $d = 2$ and 3.
• Operators $\phi^{n>4}$ become relevant only in dimensions $d < 2/(1 - 2/n) < 4$. However, even below these threshold dimensions, operators of high powers in the field variable are much less relevant than the dominant non-harmonic operator $\phi^4$. This is the a posteriori justification for the neglect of $\phi^{n>4}$ operators in the derivation of the model.
• Similarly, operators with more than two gradients are generally irrelevant and can be neglected in all dimensions.
• In contrast, the operator $\phi$ coupling to the magnetic field carries dimension $(1 + d/2)$ and is therefore always strongly relevant.
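The bookkeeping in the list above amounts to evaluating the exponents of equation 33.50 for given $n$, $m$ and $d$. A minimal sketch, with function names of our own choosing:

```python
from fractions import Fraction

def dim_phi_n(n, d):
    """Exponent p in [integral of phi^n] = L^p, i.e. p = d + (2 - d) * n / 2."""
    return Fraction(d) + Fraction(2 - d) * n / 2

def dim_gradient(m):
    """Exponent p in [integral of (grad^m phi)^2] = L^p, i.e. p = 2 * (1 - m)."""
    return Fraction(2 * (1 - m))

def threshold_dimension(n):
    """phi^n with n > 2 is relevant only below d = 2 / (1 - 2/n) = 2n / (n - 2)."""
    return Fraction(2 * n, n - 2)

def label(p):
    """Positive engineering dimension means relevant, negative means irrelevant."""
    return "relevant" if p > 0 else "irrelevant" if p < 0 else "marginal"

for d in (2, 3, 4, 5):
    print(d, {n: label(dim_phi_n(n, d)) for n in (1, 2, 4, 6)})
print({m: label(dim_gradient(m)) for m in (1, 2, 3)})
```

For example, $\phi^6$ comes out marginal at $d = 3$ and $\phi^4$ marginal at $d = 4$, in line with the threshold $2/(1 - 2/n)$.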

33.5.2 Landau mean field theory


We can approximate $S[\phi]$ by its functional Taylor expansion about $\phi_0$, which is defined by the condition $\delta S / \delta \phi = 0$:

$$
S[\phi] = S[\phi_0] + \frac{1}{2!} \int d^d x \int d^d y \, (\phi - \phi_0)_x (\phi - \phi_0)_y \left. \frac{\delta^2 S}{\delta \phi_x \delta \phi_y} \right|_{\phi_0} + \cdots. \tag{33.51}
$$

If the fluctuation of the field around $\phi_0$ can be neglected, the path integral can be approximated by its saddle point value

$$
Z \approx e^{-S[\phi_0]}. \tag{33.52}
$$

Notice that $Z = e^{-\beta F}$ in the canonical ensemble. We can identify the free energy of statistical mechanics with $S[\phi_0]$ of our field theory by

$$
F = \frac{1}{\beta} S[\phi_0]. \tag{33.53}
$$

We thus recover the Landau mean field theory as the saddle point approximation of the full path integral.
The relative fluctuation of the field in mean field theory can be quantified as

$$
R = \frac{\int_0^{\xi} d^d x \, \langle \phi(0) \phi(x) \rangle}{\int_0^{\xi} d^d x \, \langle \phi \rangle^2}. \tag{33.54}
$$

Notice that $\langle \phi(0) \phi(x) \rangle \sim r^{2-d}$, $\langle \phi \rangle \sim |T - T_c|^{1/2}$ and $\xi \sim |T - T_c|^{-1/2}$ in mean field theory. We can derive $R \sim |T - T_c|^{(d-4)/2}$. Thus, the mean field approximation is valid only when $d > 4$.
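The power counting behind this estimate takes only two lines. Using the mean field inputs quoted above, a sketch of the intermediate steps reads:

```latex
\begin{aligned}
\int_0^{\xi} d^d x \, \langle \phi(0) \phi(x) \rangle
  &\sim \int_0^{\xi} r^{d-1} \, dr \; r^{2-d} \sim \xi^2 \sim |T - T_c|^{-1}, \\
\int_0^{\xi} d^d x \, \langle \phi \rangle^2
  &\sim \xi^d \, |T - T_c| \sim |T - T_c|^{1 - d/2}, \\
R &\sim \frac{|T - T_c|^{-1}}{|T - T_c|^{1 - d/2}} = |T - T_c|^{(d-4)/2}.
\end{aligned}
```

For $d > 4$ the exponent is positive, so $R \to 0$ as $T \to T_c$ and fluctuations become negligible. For $d < 4$ the ratio diverges and the saddle point approximation breaks down near the transition.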
33.5.3 Gaussian model


When $d > 4$, we can write

$$
S[\phi] \approx \int d^d x \left[ \frac{r}{2} \phi^2 + \frac{1}{2} (\nabla \phi)^2 - h \phi \right], \tag{33.55}
$$

neglecting all irrelevant operators. We split our field into fast and slow degrees of freedom, $\phi = \phi_s + \phi_f$, resulting in the fragmentation of the action $S[\phi_s, \phi_f] = S_s[\phi_s] + S_f[\phi_f] + S_c[\phi_s, \phi_f]$. However, the action $S_c$ coupling fast and slow components vanishes, implying that the integration over the fast field merely leads to an inessential constant. The effect of the RG step on the action is then entirely contained in the rescaling of the slow action. The scaling factors are determined by the engineering dimensions of the operators appearing in the action, i.e., $r \to b^2 r$ and $h \to b^{1 + d/2} h$. Using the fact that $r \sim t$, we can then readily write down the two relevant scaling dimensions of the problem, $y_t = 2$ and $y_h = d/2 + 1$. Using equations 33.47, we obtain

$$
\alpha = 2 - \frac{d}{2}, \quad \beta = \frac{d}{4} - \frac{1}{2}, \quad \gamma = 1, \quad \delta = \frac{d + 2}{d - 2}, \quad \nu = \frac{1}{2}, \quad \eta = 0. \tag{33.56}
$$

We notice that the Gaussian model possesses only one fixed point, namely r = h = 0, which
in the context of ϕ4 -theory is called the Gaussian fixed point.

One tricky issue is that the mean field exponents agree with the scaling analysis here only when $d = 4$. This results from the fact that the coefficient $b(T)$ of $M_z^4$ in mean field theory is assumed to be constant in the vicinity of $T = T_c$. For example, in mean field theory, we have $M_z = \sqrt{a_0 (T_c - T)/(2 b_0)}$ and so $M_z \sim (-t)^{1/2}$. If we take into account the fact that $b_0$ scales as $t^{(4-d)/2}$ around the Gaussian fixed point, we will find that $M_z \sim (-t)^{d/4 - 1/2}$, which is fully compatible with the scaling analysis.

It is tempting to think that we can just neglect the irrelevant operators because their coefficients flow to zero as we approach the infra-red. However, sometimes we will be interested in quantities which have the irrelevant coupling constants sitting in the denominator. In this case, one cannot just blindly ignore these irrelevant couplings as they affect the scaling analysis. When this happens, the irrelevant coupling is referred to as dangerously irrelevant.

33.5.4 Renormalization group analysis


The path integral of quantum $\phi^4$ theory in $(d-1)+1$-dimensional spacetime is

$$
Z = \int D\phi \, e^{i S_L[\phi]}, \quad \text{where} \quad S_L = \int dt \int d^{d-1} x \left[ \frac{1}{2} (\partial_t \phi)^2 - \frac{1}{2} (\nabla \phi)^2 - \frac{r}{2} \phi^2 - \frac{\lambda}{4!} \phi^4 + h \phi \right]. \tag{33.57}
$$

If we replace $t$ by $-i\tau$, then $i S_L \to -S$ and $Z$ takes exactly the form of the partition function 33.49 of the Ising model in $d$-dimensional space. Comparing statistical $\phi^4$ theory with its quantum counterpart, we can obtain its free propagator as

$$
D(p) = \frac{1}{p^2 + r}. \tag{33.58}
$$

Note that when calculating loop integrals of the quantum field, the free propagator will be the same as that of the statistical field after Wick rotation. We may infer that the renormalization group equations of the statistical field and the quantum field are identical.

The detailed derivation of the Gell-Mann–Low equations of $\phi^4$ theory can be found in section 8.4.4 of Condensed Matter Field Theory (Alexander Altland & Ben Simons), sections 3.4 and 3.5 of Statistical Field Theory (David Tong), or chapter 12 of An Introduction to Quantum Field Theory (M. E. Peskin & D. V. Schroeder). The results are

$$
\frac{dr}{dl} = 2r + \frac{\lambda}{16\pi^2} - \frac{r\lambda}{16\pi^2}, \quad \frac{d\lambda}{dl} = \epsilon \lambda - \frac{3\lambda^2}{16\pi^2}, \quad \frac{dh}{dl} = \frac{6 - \epsilon}{2} h, \tag{33.59}
$$

where $l \equiv \ln b$ and $\epsilon = 4 - d$. Equations 33.59 clearly illustrate the meaning of the $\epsilon$-expansion. According to the second one, a perturbation away from the Gaussian fixed point will initially grow at a rate set by the engineering dimension $\epsilon$, while the one-loop contribution $\sim \lambda^2$ stops the flow at a value $\lambda \sim \epsilon$.

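[Figure 33.2: Beta function for $\lambda$. The plot shows $\beta(\lambda)$ against $\lambda$ for $\epsilon > 0$ and $\epsilon < 0$, with a zero at $\lambda = O(\epsilon)$.]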

Equating the right-hand sides of the Gell-Mann–Low equations to zero (and temporarily ignoring the magnetic field), we indeed find that besides the Gaussian fixed point a non-trivial fixed point $(r_2^*, \lambda_2^*) = (-\epsilon/6, 16\pi^2 \epsilon/3)$ has appeared. Notice that the second fixed point is $O(\epsilon)$ and coalesces with the Gaussian fixed point as $\epsilon$ is sent to zero. Plotting the $\beta$-function for the coupling constant $\lambda$, we further find that, for $\epsilon > 0$, $\lambda$ is relevant around the Gaussian fixed point but irrelevant at the non-trivial fixed point, as shown in Figure 33.2.
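The location and stability of the non-trivial fixed point can be checked numerically from the one-loop flow equations for $r$ and $\lambda$ with $h = 0$. The sketch below uses $\epsilon = 0.1$, a starting coupling of 0.5 and a simple forward Euler integration; these are illustrative choices of ours. Note that the zero of the $r$ equation at $\lambda^*$ is $r^* = -\epsilon/(6 - \epsilon)$, which reduces to $-\epsilon/6$ at leading order in $\epsilon$.

```python
import math

def beta_r(r, lam):
    """One-loop flow dr/dl at h = 0."""
    return 2.0 * r + lam / (16 * math.pi**2) - r * lam / (16 * math.pi**2)

def beta_lam(lam, eps):
    """One-loop flow dlambda/dl."""
    return eps * lam - 3.0 * lam**2 / (16 * math.pi**2)

def nontrivial_fixed_point(eps):
    """Zero of both beta functions with lambda* != 0 (r* not truncated in eps)."""
    lam_star = 16 * math.pi**2 * eps / 3.0
    r_star = -eps / (6.0 - eps)
    return r_star, lam_star

def flow_lambda(lam0, eps, dl=0.01, steps=20000):
    """Forward Euler integration of dlambda/dl starting from lam0."""
    lam = lam0
    for _ in range(steps):
        lam += dl * beta_lam(lam, eps)
    return lam

eps = 0.1
r_star, lam_star = nontrivial_fixed_point(eps)
print(r_star, -eps / 6, lam_star)
print(flow_lambda(0.5, eps), lam_star)
```

Starting from a small positive coupling, $\lambda$ first grows away from the Gaussian fixed point and then settles at $\lambda^*$, as sketched in Figure 33.2.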

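To understand the flow diagram of the system, one may linearize the $\beta$-function around both the Gaussian and the non-trivial fixed points. Denoting the linearized mappings by $W_{1,2}$, we have
\[
W_1 = \begin{pmatrix} 2 & \dfrac{1}{16\pi^2} \\ 0 & \epsilon \end{pmatrix}, \qquad
W_2 = \begin{pmatrix} 2 - \dfrac{\epsilon}{3} & \dfrac{1+\epsilon/6}{16\pi^2} \\ 0 & -\epsilon \end{pmatrix}. \tag{33.60}
\]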
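
The linearization can be checked numerically. The following is a minimal sketch, not the authors' method: it differentiates the right-hand sides of Eq. 33.59 (with the magnetic field ignored) by central differences at the two fixed points. The names `beta`, `jacobian`, `K` and the value `eps = 0.1` are illustrative assumptions, and the non-trivial fixed point is taken at its first-order values $(-\epsilon/6,\; 16\pi^2\epsilon/3)$.

```python
import math

K = 16 * math.pi ** 2  # loop factor 16*pi^2 appearing in Eq. 33.59


def beta(r, lam, eps):
    """Right-hand sides (dr/dln l, dlam/dln l) of Eq. 33.59, with h ignored."""
    return (2 * r + lam / K - r * lam / K,
            eps * lam - 3 * lam ** 2 / K)


def jacobian(r, lam, eps, step=1e-5):
    """Central-difference Jacobian of beta at (r, lam).

    Entry [i][j] is the derivative of component i with respect to
    variable j, with the ordering (r, lam) for both.
    """
    def column(dr, dlam):
        plus = beta(r + dr, lam + dlam, eps)
        minus = beta(r - dr, lam - dlam, eps)
        return [(p - m) / (2 * step) for p, m in zip(plus, minus)]

    col_r = column(step, 0.0)
    col_lam = column(0.0, step)
    return [[col_r[0], col_lam[0]],
            [col_r[1], col_lam[1]]]


eps = 0.1  # illustrative small value of 4 - d (an assumption)
W1 = jacobian(0.0, 0.0, eps)               # Gaussian fixed point
W2 = jacobian(-eps / 6, K * eps / 3, eps)  # non-trivial fixed point (first-order values)

for name, W in (("W1", W1), ("W2", W2)):
    print(name, [[round(x, 6) for x in row] for row in W])
```

Because both matrices are upper triangular, their eigenvalues are the diagonal entries: $(2, \epsilon)$ at the Gaussian fixed point and $(2 - \epsilon/3, -\epsilon)$ at the non-trivial one.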

Figure 33.3 shows the flow in the vicinity of the two fixed points, as described by the matrices $W_{1,2}$, as well as the extrapolation to a global flow chart. Notice that the critical surface of the system (the straight line interpolating between the two fixed points) is tilted with respect to the r (temperature) axis of the phase diagram. This implies that it is not the physical temperature alone that decides whether the system will eventually wind up in the paramagnetic or ferromagnetic sector of the phase diagram. Rather, one has to relate temperature to the strength of the non-linearity to decide on which side of the critical surface we are. For example, for strong enough λ, even a system with r initially negative may eventually flow towards the disordered phase. This type of behaviour cannot be predicted from the mean-field analysis of the model. Rather, it represents a non-trivial effect of fluctuations.

Figure 33.3: Phase diagram of the $\phi^4$-model as obtained from the $\epsilon$-expansion. (Labels in the figure: ferromagnetic, paramagnetic and unphysical regions; Gaussian and non-trivial fixed points; axis r.)
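
The statement that a negative initial r can still flow to the disordered phase can be illustrated by integrating the one-loop flow. The sketch below is an assumption-laden illustration rather than part of the original notes: it uses the rescaled coupling $u = \lambda/(16\pi^2)$, a plain Euler step, $\epsilon = 1$, and hypothetical starting values. Both runs start from the same $r_0 = -0.1$ and differ only in the initial coupling.

```python
def flow(r0, u0, eps, l_max=3.0, steps=3000):
    """Euler-integrate the one-loop flow of Eq. 33.59 (h ignored).

    With u = lambda / (16 pi^2) the equations read
        dr/dl = 2 r + u (1 - r),   du/dl = eps u - 3 u^2,
    where l stands for ln l in the text.
    """
    dl = l_max / steps
    r, u = r0, u0
    for _ in range(steps):
        dr = 2 * r + u * (1 - r)
        du = eps * u - 3 * u * u
        r, u = r + dl * dr, u + dl * du
    return r, u


eps = 1.0          # d = 3, used here only for illustration
u_star = eps / 3   # non-trivial fixed-point coupling in rescaled units
r0 = -0.1          # hypothetical negative starting value of r

r_strong, _ = flow(r0, u_star, eps)  # coupling already at its fixed-point value
r_weak, _ = flow(r0, 0.01, eps)      # same r0, much weaker initial coupling

print("strong coupling: r flows to", r_strong)  # positive: disordered side
print("weak coupling:   r flows to", r_weak)    # negative: ordered side
```

In this truncated flow the critical value of r at $u = u^*$ is $-\epsilon/(6-\epsilon) = -0.2$, so $r_0 = -0.1$ lies on the paramagnetic side for the strong coupling, while for the weak coupling the same $r_0$ lies on the ferromagnetic side.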

Finally, notice that, while we can formally extend the flow into the lower portion of the diagram, λ < 0, this region is actually unphysical. The reason is that, for λ < 0, the action is fundamentally unstable and, in the absence of a sixth-order contribution, does not describe a physical system.
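Of the two eigenvalues of $W_2$, $2 - \epsilon/3$ and $-\epsilon$, only the former is relevant and tied to the scaling of the coupling constant $r$. Thus, we have $y_t = 2 - \epsilon/3$ and, as before, $y_h = (6 - \epsilon)/2$. To first order in $\epsilon$, the critical exponents are therefore
\[
\alpha = \frac{\epsilon}{6}, \quad
\beta = \frac{1}{2} - \frac{\epsilon}{6}, \quad
\gamma = 1 + \frac{\epsilon}{6}, \quad
\delta = 3 + \epsilon, \quad
\nu = \frac{1}{2} + \frac{\epsilon}{12}, \quad
\eta = 0. \tag{33.61}
\]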
If we extend the expansion to $\epsilon = 1$, we obtain estimates of the critical exponents for the 3-dimensional Ising model. The agreement with the experimental results improves on that of mean-field theory, even though we have driven the $\epsilon$-expansion well beyond its range of applicability.
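
The exponents in Eq. 33.61 can be reproduced from $y_t$ and $y_h$. The sketch below assumes the standard scaling relations ($\nu = 1/y_t$, $\alpha = 2 - d/y_t$, $\beta = (d - y_h)/y_t$, $\gamma = (2y_h - d)/y_t$, $\delta = y_h/(d - y_h)$, $\eta = 2 + d - 2y_h$), which are taken here to be the ones the text refers to with "as before"; the function names are illustrative.

```python
def exponents_from_scaling(eps):
    """Exponents from y_t and y_h via the standard scaling relations (assumed)."""
    d = 4 - eps
    y_t = 2 - eps / 3
    y_h = (6 - eps) / 2
    return {
        "alpha": 2 - d / y_t,
        "beta": (d - y_h) / y_t,
        "gamma": (2 * y_h - d) / y_t,
        "delta": y_h / (d - y_h),
        "nu": 1 / y_t,
        "eta": 2 + d - 2 * y_h,
    }


def exponents_first_order(eps):
    """First-order expressions as quoted in Eq. 33.61."""
    return {
        "alpha": eps / 6,
        "beta": 0.5 - eps / 6,
        "gamma": 1 + eps / 6,
        "delta": 3 + eps,
        "nu": 0.5 + eps / 12,
        "eta": 0.0,
    }


# Compare the two sets at d = 3 (eps = 1), where they differ at O(eps^2).
full = exponents_from_scaling(1.0)
first = exponents_first_order(1.0)
for name in first:
    print(f"{name:>5}: first order = {first[name]:.4f}, unexpanded = {full[name]:.4f}")
```

At $\epsilon = 1$ the first-order expressions give $\alpha = 1/6$, $\beta = 1/3$, $\gamma = 7/6$, $\delta = 4$ and $\nu = 7/12$. The unexpanded ratios differ from these at order $\epsilon^2$, which the one-loop calculation does not control, so only the first-order values should be read as the result of the expansion.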
