Notes of College Physics
Editor: Yuyang Songsheng
Date: August 1, 2021
Email: [email protected]
Version: 1.00
Contents
I Classical Mechanics
3 Small Oscillation
  3.1 Small oscillation in one-dimension
  3.2 Forced oscillation
  3.3 Non-linear oscillation and perturbation theory
  3.4 Oscillations of systems with more than one degree of freedom
5 Special Relativity
  5.1 The principle of relativity
7 Classical Electrodynamics
  7.1 The formulation of classical electrodynamics
    7.1.1 Maxwell's equations and Lorentz force
    7.1.2 Lorentz transformation of fields
    7.1.3 Energy-momentum tensor
    7.1.4 Charged particles in a given EM field
  7.2 Constant electromagnetic field
    7.2.1 Coulomb's law
    7.2.2 Multipole moments
    7.2.3 Biot-Savart law
    7.2.4 Magnetic moment
  7.3 Electromagnetic waves
    7.3.1 Electromagnetic waves
    7.3.2 Monochromatic wave
    7.3.3 Partially polarized light
  7.4 The field of moving charges
    7.4.1 Retarded potential
    7.4.2 Spectral resolution of the retarded potentials
  7.5 Radiation
    7.5.1 Far field approximation
    7.5.2 Low velocity approximation
    7.5.3 Radiation from a rapidly moving charge
  7.6 The interaction between charged particles and EM field
    7.6.1 Radiation reaction
    7.6.2 Scattering by free charges
29 Thermodynamics
  29.1 Central problem of thermodynamics
  29.2 Entropy formulation of thermodynamics
    29.2.1 Property of entropy function
Classical Mechanics
Chapter 1
The Formulation of Classical Mechanics
One of the fundamental concepts of mechanics is that of a particle. By this we mean a body
whose dimensions may be neglected in describing its motion. The position of a particle in
space is defined by its radius vector r, whose components are its Cartesian coordinates x, y, z.
The derivative v = dr/dt of r with respect to the time t is called the velocity of the particle,
and the second derivative $d^2\mathbf{r}/dt^2$ is its acceleration.
If all the coordinates and velocities are simultaneously specified, the state of the system is
completely determined and its subsequent motion can be calculated. Mathematically, if all the
coordinates q and velocities q̇ are given at some instant, the accelerations q̈ at that instant are
uniquely defined. The relations between the accelerations, velocities and coordinates are called
the equations of motion. They are second-order differential equations for the functions q(t),
and their integration makes possible the determination of these functions and so of the path
of the system.
1. If we transform the coordinates q to Q and q = q(Q, t), the new Lagrangian will be
2. If $L' = L + df(q, t)/dt$, then $L$ and $L'$ are equivalent and generate the same
equations of motion.
Example:
1. The Lagrangian for an isolated system of particles in an inertial frame has the form
$$L = \sum_a \frac{1}{2} m_a v_a^2 - U(\mathbf{r}_1, \mathbf{r}_2, \cdots). \tag{1.5}$$
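we can derive that
$$\{f, g\} = \sum_k \left( \frac{\partial f}{\partial q_k}\frac{\partial g}{\partial p_k} - \frac{\partial f}{\partial p_k}\frac{\partial g}{\partial q_k} \right). \tag{1.20}$$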
The Hamilton equations can therefore be rewritten as
$$\dot{p}_i = \{p_i, H\}, \qquad \dot{q}_i = \{q_i, H\}. \tag{1.21}$$
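We can further get
$$\frac{df}{dt} = \{f, H\} + \frac{\partial f}{\partial t}, \qquad \frac{d\{f, g\}}{dt} = \left\{\frac{df}{dt}, g\right\} + \left\{f, \frac{dg}{dt}\right\}. \tag{1.22}$$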
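3. Notice that $d(F - \sum_i p_i q_i)/dt = -\sum_i q_i \dot{p}_i - \sum_i P_i \dot{Q}_i + (H' - H)$. Assuming
$\Phi(p_i, Q_i, t) = F - \sum_i p_i q_i$, we have
$$q_i = -\frac{\partial \Phi}{\partial p_i}, \qquad P_i = -\frac{\partial \Phi}{\partial Q_i}, \qquad H' = H + \frac{\partial \Phi}{\partial t}. \tag{1.27c}$$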
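4. Notice that $d(F + \sum_i P_i Q_i - \sum_i p_i q_i)/dt = -\sum_i q_i \dot{p}_i + \sum_i Q_i \dot{P}_i + (H' - H)$.
Assuming $\Phi(p_i, P_i, t) = F + \sum_i P_i Q_i - \sum_i p_i q_i$, we have
$$q_i = -\frac{\partial \Phi}{\partial p_i}, \qquad Q_i = \frac{\partial \Phi}{\partial P_i}, \qquad H' = H + \frac{\partial \Phi}{\partial t}. \tag{1.27d}$$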
The Poisson bracket is invariant under canonical transformations. Suppose that $(q, p, H) \to (Q, P, H')$
is a canonical transformation and $F(Q, P, t) = f(q, p, t)$, $G(Q, P, t) = g(q, p, t)$. We then
have
$$\{f, g\}_{q,p} = \{F, G\}_{Q,P}. \tag{1.28}$$
A necessary and sufficient condition for a canonical transformation is
$$\{Q_i, Q_j\}_{q,p} = 0, \qquad \{P^i, P^j\}_{q,p} = 0, \qquad \{Q_i, P^j\}_{q,p} = \delta_i^{\,j}. \tag{1.29}$$
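Condition (1.29) can be checked numerically for a single degree of freedom. The sketch below is only an illustration: the finite-difference helper `poisson_bracket` and the test transformation (the action-angle variables of a unit-mass, unit-frequency harmonic oscillator) are assumptions chosen for the example, not part of the notes.

```python
import math

def poisson_bracket(f, g, q, p, h=1e-5):
    """Central-difference estimate of {f, g} = df/dq * dg/dp - df/dp * dg/dq
    for one degree of freedom at the phase-space point (q, p)."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

# Hypothetical test transformation (an assumption for this example):
# Q is the angle variable and P the action variable of the oscillator.
def Q(q, p):
    return math.atan2(q, p)

def P(q, p):
    return 0.5 * (q * q + p * p)

def check_canonical(q, p):
    """Return the three brackets {Q,Q}, {P,P}, {Q,P} at the point (q, p)."""
    return (poisson_bracket(Q, Q, q, p),
            poisson_bracket(P, P, q, p),
            poisson_bracket(Q, P, q, p))
```

For a canonical transformation the first two brackets should vanish and the third should be close to 1, up to the finite-difference error.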
If these formulae are regarded as a transformation from the variables $q_t, p_t$ to $q_{t+\tau}, p_{t+\tau}$, then
this transformation is canonical. This is evident from the expression
for the differential of the action $S(q_t, q_{t+\tau}, t, \tau)$, taken along the true path passing through
the points $q_t$ and $q_{t+\tau}$ at times $t$ and $t + \tau$ for a given $\tau$; $-S$ is the generating function of the
transformation. As a result, we also have the following bracket relations,
$$\{q_{i,t+\tau}, q_{j,t+\tau}\}_{q_t,p_t} = 0, \qquad \{p^i_{t+\tau}, p^j_{t+\tau}\}_{q_t,p_t} = 0, \qquad \{q_{i,t+\tau}, p^j_{t+\tau}\}_{q_t,p_t} = \delta_{ij}. \tag{1.32}$$
The phase-space distribution function is constant along the trajectories of the system.
Proof: The phase volume is invariant under canonical transformations. The change in p and q during
the motion can be regarded as a canonical transformation. Suppose that each point in a region of
phase space moves in the course of time in accordance with the equations of motion of the mechanical
system. The region as a whole therefore moves also, but its volume remains unchanged.
$$S = f(t, q_1, \cdots, q_s; \alpha_1, \cdots, \alpha_s) + A, \tag{1.38}$$
The radial part of the motion can be regarded as taking place in one-dimension in a field where
the effective potential energy is
$$U_{\text{eff}} = U(r) + \frac{M^2}{2mr^2}. \tag{2.7}$$
The motion is finite for $-m\alpha^2/2M^2 \le E < 0$ and infinite for $E \ge 0$. The shape of the path is
$$\frac{p}{r} = 1 + e\cos\phi \quad\text{where}\quad p = \frac{M^2}{m\alpha}, \quad e = \sqrt{1 + \frac{2EM^2}{m\alpha^2}}. \tag{2.15}$$
This is the equation of a conic section with one focus at the origin; 2p is called the latus rectum
of the orbit and e the eccentricity. Our choice of the origin is such that the point where ϕ = 0
is the point nearest to the origin (called the perihelion).
Figure 2.1: (a) Attractive Kepler orbit with e < 1; (b) Attractive Kepler orbit with e > 1; (c)
Repulsive Kepler orbit.
If E < 0, the orbit is an ellipse and the motion is finite, as shown in Figure 2.1(a). The major
and minor semi-axes of the ellipse are
$$a = \frac{p}{1 - e^2} = \frac{\alpha}{2|E|}; \qquad b = \frac{p}{\sqrt{1 - e^2}} = \frac{M}{\sqrt{2m|E|}}. \tag{2.16}$$
The least and greatest distances from the centre of the field (the focus of the ellipse) are
$$r_{\min} = \frac{p}{1 + e} = a(1 - e); \qquad r_{\max} = \frac{p}{1 - e} = a(1 + e). \tag{2.17}$$
The period of revolution in an elliptical orbit is
$$T = \frac{\pi a b}{r^2\dot{\phi}/2} = 2\pi a^{3/2}\sqrt{\frac{m}{\alpha}} = \pi\alpha\sqrt{\frac{m}{2|E|^3}}. \tag{2.18}$$
If E > 0, the path is a hyperbola with the origin as internal focus, as shown in Figure 2.1(b).
The distance of the perihelion from the focus is
$$r_{\min} = \frac{p}{1 + e} = a(e - 1), \tag{2.19}$$
where $a = p/(e^2 - 1) = \alpha/2E$ is the semi-axis of the hyperbola.
If E = 0, the eccentricity e = 1, and the particle moves in a parabola with perihelion distance
rmin = p/2. This case occurs if the particle starts from rest at infinity.
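The orbit classification above can be collected into a small helper. This is a minimal sketch, assuming the attractive field $U = -\alpha/r$ with $\alpha > 0$; the function name `kepler_orbit` and the dictionary layout are illustrative choices, not part of the notes.

```python
import math

def kepler_orbit(E, M, m, alpha):
    """Orbit parameters for the attractive field U = -alpha/r (alpha > 0).

    E: energy, M: angular momentum, m: (reduced) mass.
    """
    p = M**2 / (m * alpha)  # semi-latus rectum, (2.15)
    # eccentricity, (2.15); max() guards against rounding on circular orbits
    e = math.sqrt(max(0.0, 1 + 2 * E * M**2 / (m * alpha**2)))
    orbit = {"p": p, "e": e, "r_min": p / (1 + e)}
    if E < 0:
        a = alpha / (2 * abs(E))  # semi-major axis, (2.16)
        orbit.update(kind="ellipse", a=a,
                     b=M / math.sqrt(2 * m * abs(E)),  # semi-minor axis
                     r_max=p / (1 - e),
                     T=2 * math.pi * a**1.5 * math.sqrt(m / alpha))  # (2.18)
    elif E == 0:
        orbit.update(kind="parabola")
    else:
        orbit.update(kind="hyperbola", a=alpha / (2 * E))
    return orbit
```

For example, `kepler_orbit(-0.25, 1.0, 1.0, 1.0)` describes an ellipse whose entries satisfy $p = a(1 - e^2)$ and $r_{\min} = a(1 - e)$.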
Let us now consider motion in a repulsive field, where
$$U = \frac{\alpha}{r} \qquad (\alpha > 0). \tag{2.20}$$
There is an integral of the motion which exists only in fields U = α/r. It is easy to verify by
direct calculation that the quantity
$$\mathbf{v} \times \mathbf{M} + \frac{\alpha\mathbf{r}}{r} \tag{2.24}$$
is constant. The direction of the conserved vector is along the major axis from the focus to
the perihelion, and its magnitude is αe. This is most simply seen by considering its value at
perihelion.
as shown in Figure 2.2. In physical applications we are usually concerned with the disintegra-
tion of not one but many similar particles, and this raises the problem of the distribution of
the resulting particles in direction, energy, etc. We shall assume that the primary particles are
randomly oriented in space, i.e., isotropically on average.
In the C system, every resulting particle has the same energy, and their directions of motion are
isotropically distributed. The fraction of particles entering a solid angle element do is do/4π.
Thus the distribution with respect to the angle θ0 is
$$\frac{1}{2}\sin\theta_0 \, d\theta_0. \tag{2.27}$$
The corresponding distributions in the L system are obtained by an appropriate transformation.
For example, let us work out the kinetic energy distribution in the L system. Since
$v^2 = v_0^2 + V^2 + 2 v_0 V \cos\theta_0$,
we have $d(v^2) = 2 v_0 V \, d(\cos\theta_0)$. Thus the kinetic energy is distributed uniformly between
$T_{\min} = m(v_0 - V)^2/2$ and $T_{\max} = m(v_0 + V)^2/2$.
A collision between two particles is said to be elastic if it involves no change in their inter-
nal state. The collision is most simply described in the C system. The velocities of the par-
ticles before the collision are related to their velocities v1 and v2 in the L system by v10 =
m2 v/(m1 + m2 ), v20 = −m1 v/(m1 + m2 ), where v = v1 − v2 . Because of the law of conser-
vation of momentum, the momenta of the two particles remain equal and opposite after the
collision, and are also unchanged in magnitude, by the law of conservation of energy. Thus,
in the C system the collision simply rotates the velocities, which remain opposite in direction
and unchanged in magnitude. The velocities of the two particles after the collision are
$$\mathbf{v}_{10}' = \frac{m_2 v \hat{\mathbf{n}}_0}{m_1 + m_2}, \qquad \mathbf{v}_{20}' = -\frac{m_1 v \hat{\mathbf{n}}_0}{m_1 + m_2}. \tag{2.29}$$
The velocities in the L system after the collision are therefore
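$$\mathbf{v}_1' = \frac{m_2 v \hat{\mathbf{n}}_0}{m_1 + m_2} + \frac{m_1\mathbf{v}_1 + m_2\mathbf{v}_2}{m_1 + m_2}, \qquad \mathbf{v}_2' = -\frac{m_1 v \hat{\mathbf{n}}_0}{m_1 + m_2} + \frac{m_1\mathbf{v}_1 + m_2\mathbf{v}_2}{m_1 + m_2}. \tag{2.30}$$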
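Multiplying these equations by $m_1$ and $m_2$ respectively, we obtain
$$\mathbf{p}_1' = m v \hat{\mathbf{n}}_0 + \frac{m_1(\mathbf{p}_1 + \mathbf{p}_2)}{m_1 + m_2}, \qquad \mathbf{p}_2' = -m v \hat{\mathbf{n}}_0 + \frac{m_2(\mathbf{p}_1 + \mathbf{p}_2)}{m_1 + m_2}. \tag{2.31}$$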
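Formula (2.30) translates directly into code. The following is a minimal sketch; the function name `elastic_collision` and the representation of vectors as 3-tuples are assumptions made for the example. The unit vector `n0` (the direction of particle 1 after the collision in the C system) must be supplied by the caller, since it depends on the interaction law.

```python
import math

def elastic_collision(m1, m2, v1, v2, n0):
    """Final L-system velocities from (2.30).

    v1, v2: initial velocities as 3-tuples; n0: unit 3-tuple giving the
    direction of particle 1 after the collision in the C system.
    """
    total = m1 + m2
    rel = tuple(a - b for a, b in zip(v1, v2))       # v = v1 - v2
    v = math.sqrt(sum(c * c for c in rel))           # relative speed
    v_cm = tuple((m1 * a + m2 * b) / total for a, b in zip(v1, v2))
    v1_after = tuple(m2 * v * n / total + c for n, c in zip(n0, v_cm))
    v2_after = tuple(-m1 * v * n / total + c for n, c in zip(n0, v_cm))
    return v1_after, v2_after
```

Whatever unit vector is passed as `n0`, the returned velocities conserve both total momentum and total kinetic energy, which is a quick way to sanity-check the formula.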
Let us consider in more detail the case where one of the particles (m2 , say) is at rest before the
collision. In Figure 2.4, the distance OB = m2 p1 /(m1 + m2 ) = mv is equal to the radius.
The vector AB⃗ is equal to the momentum p1 of the particle m1 before the collision.
Since
$$E = \frac{1}{2} m v_\infty^2, \qquad M = m\rho v_\infty, \tag{2.39}$$
we can get the relation between χ and ρ. Suppose the number density of the particles is n. The
incident flux is therefore nv∞ . The number of events that particles are scattered into the solid
angle do = sin χ dχ dϕ at (χ, ϕ) in time T is
Thus, we have
$$d\sigma = \rho(\chi)\, d\rho\, d\phi = \frac{\rho(\chi)}{\sin\chi}\frac{d\rho}{d\chi}\, do. \tag{2.41}$$
In the C system, we have $\mathbf{r}_1 = m_2\mathbf{r}/(m_1 + m_2)$, so the scattering angle of particle 1 is simply
χ. In the L system (particle 2 is at rest before scattering), we must make the corresponding
transformation to get the right expression for the cross section.
Rutherford’s formula
One of the most important applications of the formulae derived above is to the scattering of
charged particles in a Coulomb field. As U = α/r, we have
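$$\phi_0 = \arccos\frac{\alpha/(m v_\infty^2 \rho)}{\sqrt{1 + (\alpha/(m v_\infty^2 \rho))^2}}. \tag{2.42}$$
Since the scattering angle is $\chi = |\pi - 2\phi_0|$, this gives
$$\rho^2 = \frac{\alpha^2}{m^2 v_\infty^4}\cot^2\frac{\chi}{2} \tag{2.43}$$
and
$$d\sigma = \left(\frac{\alpha}{2 m v_\infty^2}\right)^2 \frac{do}{\sin^4(\chi/2)}. \tag{2.44}$$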
This is Rutherford’s formula. It may be noted that the effective cross-section is independent of
the sign of α, so that the result is equally valid for repulsive and attractive Coulomb fields.
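As a consistency check, Rutherford's formula (2.44) can be compared with the general expression (2.41) evaluated numerically from the impact parameter (2.43). The sketch below is illustrative only: the function names are assumptions, and the absolute value of $d\rho/d\chi$ is taken because $\rho$ decreases with $\chi$.

```python
import math

def impact_parameter(chi, alpha, m, v_inf):
    """rho(chi) from (2.43): rho = |alpha| / (m v_inf^2) * cot(chi/2)."""
    return abs(alpha) / (m * v_inf**2) / math.tan(chi / 2)

def rutherford(chi, alpha, m, v_inf):
    """Differential cross-section dsigma/do from (2.44)."""
    return (alpha / (2 * m * v_inf**2))**2 / math.sin(chi / 2)**4

def cross_section_from_rho(chi, alpha, m, v_inf, h=1e-6):
    """dsigma/do = rho / sin(chi) * |d rho / d chi| as in (2.41),
    with the derivative estimated by central differences."""
    drho = (impact_parameter(chi + h, alpha, m, v_inf)
            - impact_parameter(chi - h, alpha, m, v_inf)) / (2 * h)
    return impact_parameter(chi, alpha, m, v_inf) / math.sin(chi) * abs(drho)
```

Because only $\alpha^2$ and $|\alpha|$ enter, both functions give the same value for attractive and repulsive fields.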
The formula above gives the effective cross-section in the frame of reference in which the centre of
mass of the colliding particles is at rest. The transformation to the laboratory system is effected
by means of
$$\tan\theta_1 = \frac{m_2\sin\chi}{m_1 + m_2\cos\chi}, \qquad \theta_2 = \frac{\pi - \chi}{2}. \tag{2.45}$$
For particles initially at rest, we have
$$d\sigma_2 = \left(\frac{\alpha}{m v_\infty^2}\right)^2 \frac{do_2}{\cos^3\theta_2}. \tag{2.46}$$
The same transformation for the incident particles leads, in general, to a very complex formula,
and we shall merely note two particular cases.
If the mass m2 of the scattering particle is large compared with the mass m1 of the scattered
particle, then χ = θ1 and m = m1 , so that
$$d\sigma_1 = \left(\frac{\alpha}{4E_1}\right)^2 \frac{do_1}{\sin^4(\theta_1/2)}, \tag{2.47}$$
where $E_1 = m_1 v_\infty^2/2$ is the energy of the incident particle. If the masses of the two particles
are equal, then, since $\theta_1 = \chi/2$, we have
$$d\sigma_1 = \left(\frac{\alpha}{E_1}\right)^2 \frac{\cos\theta_1}{\sin^4\theta_1}\, do_1. \tag{2.48}$$
If the particles are entirely identical, that which was initially at rest cannot be distinguished
after the collision. The total effective cross-section for all particles is obtained by adding $d\sigma_1$
and $d\sigma_2$, so
$$d\sigma = \left(\frac{\alpha}{E_1}\right)^2 \left(\frac{1}{\sin^4\theta} + \frac{1}{\cos^4\theta}\right)\cos\theta \, do. \tag{2.49}$$
Let us return to the general formula and use it to determine the distribution of the scattered
particles with respect to the energy lost in the collision. When the masses of the scattered and
scattering particles are arbitrary, the velocity acquired by the latter is given by
$$v_2' = \frac{2m_1}{m_1 + m_2} v_\infty \sin\frac{\chi}{2}. \tag{2.50}$$
The energy acquired by 2 and lost by 1 is therefore
$$\epsilon = \frac{2m^2}{m_2} v_\infty^2 \sin^2\frac{\chi}{2}. \tag{2.51}$$
Expressing sin(χ/2) in terms of ϵ, we obtain
$$d\sigma = 2\pi\frac{\alpha^2}{m_2 v_\infty^2}\frac{d\epsilon}{\epsilon^2}. \tag{2.52}$$
This is the required formula: it gives the effective cross-section as a function of the energy loss
$\epsilon$, which takes values from zero to $\epsilon_{\max} = 2m^2 v_\infty^2/m_2$.
Chapter 3
Small Oscillation
For small oscillation, we can neglect the higher orders of q and the Lagrangian can be written
as
$$L = \frac{1}{2} m\dot{q}^2 - \frac{1}{2} V''(0) q^2. \tag{3.3}$$
The Euler-Lagrange equation gives
$$\ddot{q} + \omega_0^2 q = 0 \quad\text{where}\quad \omega_0^2 = \frac{V''(0)}{m}. \tag{3.4}$$
The general solution is
$$q = A\cos(\omega_0 t + \phi), \tag{3.5}$$
where $A$ and $\phi$ depend on the initial conditions.
If there is a damping force proportional to the velocity of the particle, then we have
$$\ddot{q} + \frac{1}{Q}\dot{q} + \omega_0^2 q = 0. \tag{3.6}$$
If $Q > 1/(2\omega_0)$, we have underdamped oscillation:
$$q = A e^{-t/(2Q)}\cos(\omega t + \phi) \quad\text{where}\quad \omega = \sqrt{\omega_0^2 - \frac{1}{4Q^2}}. \tag{3.7}$$
If $Q < 1/(2\omega_0)$, we have overdamped oscillation:
$$q = A e^{\lambda_+ t} + B e^{\lambda_- t} \quad\text{where}\quad \lambda_\pm = -\frac{1}{2Q} \pm \sqrt{\frac{1}{4Q^2} - \omega_0^2}. \tag{3.8}$$
If $Q = 1/(2\omega_0)$, we have critically damped oscillation:
$$q = C(1 + Dt)e^{-\omega_0 t}. \tag{3.9}$$
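The three regimes follow from the roots of the characteristic equation $\lambda^2 + \lambda/Q + \omega_0^2 = 0$ of (3.6). A minimal sketch of the classification is given below; the function name `damping_regime` is an assumption for the example, and the exact comparison with zero used to detect critical damping is only meaningful for inputs where the discriminant is computed without rounding error.

```python
import cmath

def damping_regime(Q, omega0):
    """Classify q'' + q'/Q + omega0^2 q = 0 and return the two roots of
    lambda^2 + lambda/Q + omega0^2 = 0 as complex numbers."""
    disc = 1 / (4 * Q**2) - omega0**2       # discriminant / 4
    root = cmath.sqrt(disc)                 # imaginary when underdamped
    lam_plus = -1 / (2 * Q) + root
    lam_minus = -1 / (2 * Q) - root
    if disc < 0:
        regime = "underdamped"              # Q > 1/(2 omega0), (3.7)
    elif disc > 0:
        regime = "overdamped"               # Q < 1/(2 omega0), (3.8)
    else:
        regime = "critically damped"        # Q = 1/(2 omega0), (3.9)
    return regime, lam_plus, lam_minus
```

In the underdamped case the imaginary part of `lam_plus` is the frequency $\omega$ of (3.7) and its real part is the decay rate $-1/(2Q)$.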
The solution has the form $q = q_s + q_g$, where $q_g$ is the general solution of the homogeneous
equation and $q_s$ is any particular solution of the full equation. In order to get $q_s$, we consider
the following equation
$$\ddot{G} + \frac{1}{Q}\dot{G} + \omega_0^2 G = \delta(t - t'), \tag{3.11}$$
whose solution is $G(t, t')$. Then we have
$$q_s = \int_{-\infty}^{\infty} F(t') G(t, t') \, dt', \tag{3.12}$$
and so
$$q_s = \int_0^{\infty} F(t - t')\exp\left(-\frac{t'}{2Q}\right)\frac{\sin\omega t'}{\omega} \, dt'. \tag{3.14}$$
For a special case where
$$F(t) = F_0\cos\Omega t, \tag{3.15}$$
we have
$$q(t) = \frac{F_0}{\sqrt{[\Omega^2 - \omega_0^2 + 1/(2Q^2)]^2 + \omega^2/Q^2}}\cos(\Omega t + \phi), \tag{3.16}$$
where
$$\tan\phi = \frac{\Omega/Q}{\Omega^2 - \omega_0^2}. \tag{3.17}$$
When
$$\Omega = \sqrt{\omega_0^2 - \frac{1}{2Q^2}}, \tag{3.18}$$
we have
$$q_{\max} = \frac{Q F_0}{\omega}. \tag{3.19}$$
This is called resonance.
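The resonance condition can be checked numerically. The sketch below is an illustration under one stated assumption: it uses the steady-state amplitude in the form $F_0/\sqrt{(\Omega^2 - \omega_0^2)^2 + \Omega^2/Q^2}$, which is algebraically the same as the amplitude in (3.16) once $\omega^2 = \omega_0^2 - 1/(4Q^2)$ is inserted. The function names are hypothetical.

```python
import math

def amplitude(Omega, F0, Q, omega0):
    """Steady-state amplitude of q'' + q'/Q + omega0^2 q = F0 cos(Omega t)."""
    return F0 / math.sqrt((Omega**2 - omega0**2)**2 + Omega**2 / Q**2)

def resonance(F0, Q, omega0):
    """Resonant driving frequency (3.18) and peak amplitude (3.19)."""
    omega = math.sqrt(omega0**2 - 1 / (4 * Q**2))       # damped frequency, (3.7)
    Omega_res = math.sqrt(omega0**2 - 1 / (2 * Q**2))   # (3.18)
    return Omega_res, Q * F0 / omega                    # (3.19)
```

Evaluating `amplitude` at the frequency returned by `resonance` reproduces the peak value, and nearby frequencies give a smaller amplitude.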
$$\ddot{q} + \omega_0^2 q + \epsilon q^3 = 0. \tag{3.20}$$
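$$q = q_0 + \epsilon q_1 + \epsilon^2 q_2 + \cdots \tag{3.21a}$$
$$\omega = \omega_0 + \epsilon\omega_1 + \epsilon^2\omega_2 + \cdots \tag{3.21b}$$
$$\ddot{q} + \omega^2 q = (\omega^2 - \omega_0^2) q - \epsilon q^3. \tag{3.22}$$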
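$$q_0'' + q_0 = 0, \tag{3.24a}$$
$$q_1'' + q_1 = -\frac{q_0^3}{\omega_0^2} + \frac{2\omega_1}{\omega_0} q_0, \tag{3.24b}$$
and so on. When carrying out the perturbation expansion, we must adjust the $\omega_i$ so that no resonant
(secular) terms appear in the solution. The details are omitted here.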
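To illustrate the adjustment of $\omega_1$: taking $q_0 = A\cos\tau$ in (3.24b) and using $\cos^3\tau = (3\cos\tau + \cos 3\tau)/4$, the resonant $\cos\tau$ term on the right-hand side vanishes when $\omega_1 = 3A^2/(8\omega_0)$. The sketch below compares this first-order prediction with a direct numerical integration of (3.20). The RK4 integrator, the function names and the initial condition $q = A$, $\dot q = 0$ (with $A > 0$) are assumptions made for this example.

```python
def first_order_frequency(A, omega0, eps):
    """omega0 + eps * omega1 with omega1 = 3 A^2 / (8 omega0)."""
    return omega0 + eps * 3 * A**2 / (8 * omega0)

def duffing_period(A, omega0, eps, dt=1e-3):
    """Period of q'' + omega0^2 q + eps q^3 = 0 started from q = A > 0, q' = 0.

    The motion is integrated with RK4 until q first crosses zero; by the
    symmetry of the potential this takes a quarter of a period.
    """
    def acc(q):
        return -omega0**2 * q - eps * q**3

    q, v, t = A, 0.0, 0.0
    while True:
        # one RK4 step for the system q' = v, v' = acc(q)
        k1q, k1v = v, acc(q)
        k2q, k2v = v + 0.5 * dt * k1v, acc(q + 0.5 * dt * k1q)
        k3q, k3v = v + 0.5 * dt * k2v, acc(q + 0.5 * dt * k2q)
        k4q, k4v = v + dt * k3v, acc(q + dt * k3q)
        q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        if q_new <= 0.0:
            # linear interpolation for the time of the zero crossing
            t_cross = t + dt * q / (q - q_new)
            return 4 * t_cross
        q, v, t = q_new, v_new, t + dt
```

For small $\epsilon A^2/\omega_0^2$ the numerical frequency $2\pi/T$ should agree with `first_order_frequency` up to corrections of order $\epsilon^2$.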
Now let us consider the non-linear oscillation with a driving force. The equation of motion is
$$\ddot{q} + \frac{1}{Q}\dot{q} + \omega_0^2 q + \epsilon q^3 = F_0\cos\omega t. \tag{3.25}$$
It can be rewritten as
$$\ddot{q} + \omega^2 q = -\frac{1}{Q}\dot{q} + (\omega^2 - \omega_0^2) q - \epsilon q^3 + F_0\cos\omega t. \tag{3.26}$$
We treat the right-hand side of the equation as a perturbation. We multiply it by a parameter $\mu$ and
set $\mu = 1$ at the end,
$$\ddot{q} + \omega^2 q = \mu\left[-\frac{1}{Q}\dot{q} + (\omega^2 - \omega_0^2) q - \epsilon q^3 + F_0\cos\omega t\right]. \tag{3.27}$$
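$$\tau \equiv \omega t - \delta, \tag{3.28}$$
so
$$q'' + q = \mu\left[-\frac{1}{Q\omega} q' + \left(1 - \frac{\omega_0^2}{\omega^2}\right) q - \frac{\epsilon}{\omega^2} q^3 + \frac{F_0}{\omega^2}\cos(\tau + \delta)\right]. \tag{3.29}$$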
The series expansions of $q$ and $\delta$ are
$$q = q_0 + \mu q_1 + \mu^2 q_2 + \cdots \tag{3.30a}$$
$$\delta = \delta_0 + \mu\delta_1 + \mu^2\delta_2 + \cdots \tag{3.30b}$$
As shown in Figure 3.1, the resonance curve has two branches. When the frequency of the
driving force increases from the left, the amplitude of oscillation becomes larger and larger. But
when it reaches the point of inflection, the amplitude drops to the lower-right part of the
curve. When the frequency of the driving force decreases from the right, the amplitude of oscillation
also becomes larger and larger. When it reaches the point of inflection, the amplitude
jumps to the upper-left part of the curve. This effect is called hysteresis.
Figure 3.1: Resonance curve (vertical axis: $A_0\omega_0^2/F_0$; horizontal axis: $\omega/\omega_0$).
Let q0,i be an equilibrium position and expand about this point qi = q0,i + ηi , and so q̇i = η̇i .
We can expand the potential energy to give
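$$V(q_1, \ldots, q_n) = V(q_{0,1}, \ldots, q_{0,n}) + \sum_i \left.\frac{\partial V}{\partial q_i}\right|_{q_{0,i}} \eta_i + \frac{1}{2}\sum_{i,j} \left.\frac{\partial^2 V}{\partial q_i \partial q_j}\right|_{q_{0,i}} \eta_i\eta_j + \cdots \tag{3.40}$$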
The first term is constant with respect to ηi and constant terms do not affect the motion. The
second term is zero, because q0,i is a point of equilibrium. So we are left with
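$$L = \frac{1}{2}\sum_{i,j}\left(T_{ij}\dot{\eta}_i\dot{\eta}_j - V_{ij}\eta_i\eta_j\right), \tag{3.41}$$
where
$$T_{ij} = T_{ij}(q_{0,1}, \ldots, q_{0,n}), \qquad V_{ij} = \left.\frac{\partial^2 V}{\partial q_i \partial q_j}\right|_{q_{0,i}}. \tag{3.42}$$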
This is a linear differential equation with constant coefficients. We can try using
leading to
$$\sum_j \left(V_{ij} a_j - \omega^2 T_{ij} a_j\right) = 0. \tag{3.45}$$
The equation has nonzero solutions only if $\det[V_{ij} - \omega^2 T_{ij}] = 0$. This gives an nth-degree
polynomial to solve for $\omega^2$. We get n solutions for $\omega^2$, each of which we can substitute into the
matrix equation and solve for $a_j$.
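For two degrees of freedom the secular equation $\det[V_{ij} - \omega^2 T_{ij}] = 0$ is a quadratic in $\omega^2$ and can be solved in closed form. The sketch below is an illustration only: the function name and the example system (two unit masses joined to each other and to two walls by three unit springs) are assumptions, not taken from the notes.

```python
import math

def normal_frequencies_2dof(T, V):
    """Solve det(V - w2 * T) = 0 for symmetric 2x2 matrices T and V.

    Returns the two values of omega^2 in ascending order.
    """
    (t11, t12), (_, t22) = T
    (v11, v12), (_, v22) = V
    # det = (v11 - w2 t11)(v22 - w2 t22) - (v12 - w2 t12)^2 = a w2^2 + b w2 + c
    a = t11 * t22 - t12**2
    b = -(v11 * t22 + v22 * t11 - 2 * v12 * t12)
    c = v11 * v22 - v12**2
    disc = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - disc) / (2 * a), (-b + disc) / (2 * a)))

# Example (assumed system): T is the identity, V = [[2, -1], [-1, 2]].
omega_squared = normal_frequencies_2dof([[1.0, 0.0], [0.0, 1.0]],
                                        [[2.0, -1.0], [-1.0, 2.0]])
```

For this example the two modes are the in-phase motion and the out-of-phase motion of the two masses.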
Chapter 4
Motion of a Rigid Body
We then have
v1 = O(ω2 × r2 + v2 ). (4.5)
Define
ω1 ≡ Oω2 . (4.6)
We can get
v1 = ω1 × r1 + Ov2 . (4.7)
Note: ω is the so-called angular velocity. ω1 is independent of the base vectors we choose for frame 2. If
we choose frame 1 differently, ω1 will transform like a vector.
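$$\mathbf{v}_1 = \mathbf{V} + \boldsymbol{\omega}_1 \times \mathbf{r}_1 = \mathbf{V} + O(\boldsymbol{\omega}_2 \times \mathbf{r}_2). \tag{4.9}$$
$$T = \frac{\mu V^2}{2} + \frac{1}{2}\sum m[\omega^2 r^2 - (\boldsymbol{\omega}\cdot\mathbf{r})^2]. \tag{4.11}$$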
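If we define the inertia tensor as
$$I_{ik} = \sum m(x_l^2\delta_{ik} - x_i x_k), \tag{4.12}$$
$$T = \frac{\mu V^2}{2} + \frac{1}{2} I_{ik}\omega_i\omega_k, \tag{4.13}$$
and the Lagrangian of the rigid body is
$$L = \frac{\mu V^2}{2} + \frac{1}{2} I_{ik}\omega_i\omega_k - U. \tag{4.14}$$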
If the body is regarded as continuous, the sum becomes an integral over the volume of the
body:
$$I_{ik} = \int \rho(x_l^2\delta_{ik} - x_i x_k) \, dV. \tag{4.15}$$
Like any symmetrical tensor of rank two, the inertia tensor can be reduced to diagonal form
by an appropriate choice of the directions of the axes $x_1, x_2, x_3$. These directions are called
the principal axes of inertia, and the corresponding values of the diagonal components of the
tensor are called the principal moments of inertia; we shall denote them by $I_1, I_2, I_3$. When
the axes $x_1, x_2, x_3$ are so chosen, the kinetic energy of rotation takes the very simple form
$$T_{\text{rot}} = \frac{1}{2}(I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2). \tag{4.16}$$
A body whose three principal moments of inertia are all different is called an asymmetrical
top. If two are equal (I1 = I2 ̸= I3 ), we have a symmetrical top. In this case the direction
of one of the principal axes in the x1 x2 -plane may be chosen arbitrarily. If all three principal
moments of inertia are equal, the body is called a spherical top, and the three axes of inertia
may be chosen arbitrarily as any three mutually perpendicular axes.
The determination of the principal axes of inertia is much simplified if the body is symmetrical,
for it is clear that the position of the centre of mass and the directions of the principal axes must
have the same symmetry as the body. For example, if the body has a plane of symmetry, the
centre of mass must lie in that plane, which also contains two of the principal axes of inertia,
while the third is perpendicular to the plane. If a body has an axis of symmetry of any order,
the centre of mass must lie on that axis, which is also one of the principal axes of inertia, while
the other two are perpendicular to it. If the axis is of order higher than the second, the body is
a symmetrical top. For any principal axis perpendicular to the axis of symmetry can be turned
through an angle different from π about the latter, i.e., the choice of the perpendicular axes is
not unique, and this can happen only if the body is a symmetrical top.
Finally, we may note one further result concerning the calculation of the inertia tensor. Although
this tensor has been defined with respect to a system of coordinates whose origin is at
the centre of mass, it may sometimes be more conveniently found by first calculating a similar
tensor,
$$I_{ik}' = \sum m(x_l'^2\delta_{ik} - x_i' x_k'), \tag{4.17}$$
defined with respect to some other origin $O'$. If the distance $OO'$ is represented by a vector $\mathbf{a}$,
i.e., $\mathbf{r} = \mathbf{r}' + \mathbf{a}$, we have
$$I_{ik}' = I_{ik} + \mu(a^2\delta_{ik} - a_i a_k). \tag{4.18}$$
Using this formula, we can easily find $I_{ik}$ if $I_{ik}'$ is known.
Angular momentum
The value of the angular momentum of systems depends on the point with respect to which
it is defined. In the mechanics of a rigid body, the most appropriate point to choose for this
purpose is the origin of the moving system of coordinates, i.e., the centre of mass of the body.
Then we have
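$$\mathbf{M} = \sum m\,\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r} + \mathbf{V}) = \sum m\left[r^2\boldsymbol{\omega} - (\boldsymbol{\omega}\cdot\mathbf{r})\mathbf{r}\right], \tag{4.19}$$
$$M_1 = I_1\omega_1, \qquad M_2 = I_2\omega_2, \qquad M_3 = I_3\omega_3. \tag{4.21}$$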
Equation of motion
Since a rigid body has, in general, six degrees of freedom, the general equations of motion must
be six in number. They can be put in a form which gives the time derivatives of two vectors, the
momentum and the angular momentum of the body. The first equation is obtained by simply
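summing the equations $\dot{\mathbf{p}} = \mathbf{f}$ for each particle in the body. In terms of the total momentum
of the body
$$\mathbf{P} = \sum \mathbf{p} = \mu\mathbf{V}, \tag{4.22}$$
and the total force acting on it, $\mathbf{F} = \sum \mathbf{f}$, we have
$$\frac{d\mathbf{P}}{dt} = \mathbf{F}. \tag{4.23}$$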
Although F has been defined as the sum of all the forces f acting on the various particles,
including the forces due to other particles, F actually includes only external forces: the forces
of interaction between the particles composing the body must cancel out.
Let us now derive the second equation of motion, which gives the time derivative of the angu-
lar momentum M . To simplify the derivation, it is convenient to choose the fixed (inertial)
frame of reference in such a way that the centre of mass is at rest in that frame at the instant
considered. We have
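$$\dot{\mathbf{M}} = \frac{d}{dt}\sum \mathbf{r} \times \mathbf{p} = \sum \dot{\mathbf{r}} \times \mathbf{p} + \sum \mathbf{r} \times \dot{\mathbf{p}}. \tag{4.24}$$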
Our choice of the frame of reference (with V = 0) means that the vectors ṙ and p = mv are
parallel, so ṙ × p = 0. We have
$$\frac{d\mathbf{M}}{dt} = \mathbf{K} \quad\text{where}\quad \mathbf{K} = \sum \mathbf{r} \times \mathbf{f}. \tag{4.25}$$
Since M has been defined as the angular momentum about the centre of mass, it is unchanged
when we go from one inertial frame to another. We can therefore deduce that the equation of
motion, though derived for a particular frame of reference, is valid in any other inertial frame,
by Galileo’s relativity principle. The vector r × f is called the moment of the force f , and so
K is the total torque, i.e., the sum of the moments of all the forces acting on the body. Like
the total force, $\sum \mathbf{r} \times \mathbf{f}$ need include only the external forces.
Euler’s equations
Let dA/dt be the rate of change of any vector A with respect to the fixed system of coordinates.
We have
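$$\frac{d\mathbf{A}}{dt} = \frac{d'\mathbf{A}}{dt} + \boldsymbol{\omega} \times \mathbf{A}, \tag{4.26}$$
where $d'\mathbf{A}/dt$ is the rate of change of the components of $\mathbf{A}$ in the body system of coordinates.
Therefore,
$$\frac{d'\mathbf{M}}{dt} + \boldsymbol{\omega} \times \mathbf{M} = \mathbf{K}. \tag{4.27}$$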
Suppose the principal axes of inertia are $x_1, x_2, x_3$. We have
$$I_1\frac{d\omega_1}{dt} + (I_3 - I_2)\omega_2\omega_3 = K_1; \tag{4.28a}$$
$$I_2\frac{d\omega_2}{dt} + (I_1 - I_3)\omega_1\omega_3 = K_2; \tag{4.28b}$$
$$I_3\frac{d\omega_3}{dt} + (I_2 - I_1)\omega_1\omega_2 = K_3. \tag{4.28c}$$
These are called Euler's equations.
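Euler's equations are easy to integrate numerically. The sketch below treats the torque-free case $K = 0$ with a fixed-step RK4 scheme and exposes the two quantities that must stay constant, the rotational energy (4.16) and $M^2 = \sum_i (I_i\omega_i)^2$. The function names, the step size and the choice of integrator are assumptions made for this example.

```python
def euler_rhs(w, I, K=(0.0, 0.0, 0.0)):
    """Time derivatives of (w1, w2, w3) from Euler's equations (4.28)."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return ((K[0] - (I3 - I2) * w2 * w3) / I1,
            (K[1] - (I1 - I3) * w1 * w3) / I2,
            (K[2] - (I2 - I1) * w1 * w2) / I3)

def integrate_free_top(w0, I, dt=1e-3, steps=10000):
    """Integrate the torque-free equations with RK4 and return the final w."""
    w = tuple(w0)
    for _ in range(steps):
        k1 = euler_rhs(w, I)
        k2 = euler_rhs(tuple(a + 0.5 * dt * b for a, b in zip(w, k1)), I)
        k3 = euler_rhs(tuple(a + 0.5 * dt * b for a, b in zip(w, k2)), I)
        k4 = euler_rhs(tuple(a + dt * b for a, b in zip(w, k3)), I)
        w = tuple(a + dt * (p + 2 * q + 2 * r + s) / 6
                  for a, p, q, r, s in zip(w, k1, k2, k3, k4))
    return w

def rotational_energy(w, I):
    """T_rot = (I1 w1^2 + I2 w2^2 + I3 w3^2) / 2, as in (4.16)."""
    return 0.5 * sum(i * x * x for i, x in zip(I, w))

def angular_momentum_sq(w, I):
    """M^2 = sum of (I_i w_i)^2 in the principal-axis frame."""
    return sum((i * x) ** 2 for i, x in zip(I, w))
```

Comparing `rotational_energy` and `angular_momentum_sq` before and after the integration gives a simple check of both the equations and the step size.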
fixed XY-plane in some line ON, called the line of nodes. This line is evidently perpendicular
to both the Z-axis and the $x_3$-axis; we take its positive direction as that of the vector product
$\hat{\mathbf{z}} \times \hat{\mathbf{x}}_3$. We take, as the quantities defining the position of the axes $x_1, x_2, x_3$ relative to the
axes X, Y, Z, the angle θ between the Z and $x_3$ axes, the angle ϕ between the X-axis and ON,
and the angle ψ between the $x_1$-axis and ON.
Let us now express the components of the angular velocity vector ω along the moving axes
$x_1, x_2, x_3$ in terms of the Eulerian angles and their derivatives. To do this, we must find the
components along those axes of the angular velocities θ̇, ϕ̇, ψ̇. The angular velocity θ̇ is along
the line of nodes ON. The angular velocity ϕ̇ is along the Z-axis. The angular velocity ψ̇ is
along the $x_3$-axis. Collecting the components along each axis, we have
For a symmetrical top, these formulae can be simplified by using the fact that the choice of
directions of the principal axes $x_1$, $x_2$ is arbitrary. If the $x_1$-axis is taken along the line of nodes ON, i.e.,
ψ = 0, the components of the angular velocity are simply
For the free motion of a symmetrical top, we take the Z-axis of the fixed system of coordinates
in the direction of the constant angular momentum M of the top. The x3 -axis of the moving
system is along the axis of the top; let the x1 -axis coincide with the line of nodes at the instant
considered. Then the components of the vector M are
Comparison gives
$$\dot{\theta} = 0, \qquad \dot{\phi} = \frac{M}{I_1}, \qquad \dot{\phi}\cos\theta + \dot{\psi} = \frac{M\cos\theta}{I_3}. \tag{4.32}$$
The first of these equations gives θ = constant, i.e., the angle between the axis of the top and
the direction of M is constant. The second equation gives the angular velocity of precession
ϕ̇ = M/I1 . Finally, the third equation gives the angular velocity with which the top rotates
about its own axis ω3 = M cos θ/I3 .
Part II
The transformation preserving the invariant intervals is called Lorentz transformation, which
can be written as
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu. \tag{5.2}$$
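The invariant symbol of the vector representation of the Lorentz transformation is $\eta^{\mu\nu}$. We have
$$\Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma \eta_{\mu\nu} = \eta_{\rho\sigma} \quad\text{where}\quad \eta_{\mu\nu} \equiv \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}. \tag{5.3}$$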
The inverse of $\eta_{\mu\nu}$ is denoted as $\eta^{\mu\nu}$. We can use $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$ to raise and lower vector indices:
$$x_\mu \equiv \eta_{\mu\nu} x^\nu, \qquad x^\mu = \eta^{\mu\nu} x_\nu. \tag{5.4}$$
In a special case where the new reference frame moves along 1̂ direction with velocity β, we
have
Some physical quantities will behave like a tensor (vector, scalar) when transforming from one
inertial frame to another. For example,
vector: four-velocity $u^\mu \equiv dx^\mu/d\tau$, four-momentum $p^\mu \equiv m u^\mu$, four-acceleration $a^\mu \equiv du^\mu/d\tau$,
four-force $f^\mu \equiv m a^\mu$.
$$u^0 = \gamma, \qquad u^i = \gamma\hat{v}^i. \tag{5.10}$$
If the new reference frame moves along 1̂ direction with velocity β, we have
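$$\hat{v}'^1 = \frac{\hat{v}^1 - \beta}{1 - \hat{v}^1\beta}, \qquad \hat{v}'^2 = \frac{\hat{v}^2}{\gamma(1 - \hat{v}^1\beta)}, \qquad \hat{v}'^3 = \frac{\hat{v}^3}{\gamma(1 - \hat{v}^1\beta)}. \tag{5.11}$$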
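The velocity transformation (5.11) can be written as a short function in units where $c = 1$. This is a minimal sketch; the function name `boost_velocity` and the tuple representation are assumptions made for the example.

```python
import math

def boost_velocity(v, beta):
    """Transform the 3-velocity v (a 3-tuple, c = 1) to a frame moving
    along axis 1 with speed beta, following (5.11)."""
    gamma = 1 / math.sqrt(1 - beta**2)
    d = 1 - v[0] * beta
    return ((v[0] - beta) / d, v[1] / (gamma * d), v[2] / (gamma * d))
```

A light ray stays a light ray: for instance, `boost_velocity((0.0, 1.0, 0.0), 0.6)` has unit magnitude, while a particle moving with velocity `(beta, 0, 0)` is brought to rest.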
$$f^i = \gamma\hat{f}^i. \tag{5.12}$$
$$\frac{dp^\mu}{d\tau} = 0. \tag{5.13}$$
It can be derived in several ways.
Lagrangian formulation
The action for a free particle is given by
$$S = \int_{t_1}^{t_2} L \, dt = -m\int_a^b d\tau \quad\text{where}\quad L = -m\sqrt{1 - \dot{x}^i\dot{x}_i}. \tag{5.14}$$
The action is stationary under perturbations with constraints δxµ (a) = δxµ (b) = 0. We can
derive the equation of motion
$$m\frac{du^\mu}{d\tau} = 0. \tag{5.15}$$
Hamiltonian formulation
The canonical momenta and Hamiltonian for a free particle are
$$\pi^i = \frac{\partial L}{\partial\dot{x}_i} = \gamma m\eta^{ij}\dot{x}_j, \qquad H = \pi^i\dot{x}_i - L = \gamma m = \sqrt{m^2 + \pi^i\pi_i}. \tag{5.16}$$
Thus, Hamilton's equations for a free particle are
$$\dot{\pi}^i = 0, \qquad \dot{x}_i = \eta_{ij}\frac{\pi^j}{\sqrt{m^2 + \pi^k\pi_k}}. \tag{5.17}$$
Hamilton-Jacobi equation
The Hamilton-Jacobi equation for a free particle is
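$$\left(\frac{\partial S}{\partial t}\right)^2 = m^2 + \left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2. \tag{5.18}$$
Notice that $p^0 = H = -\partial_t S$, $p^i = \pi^i = \partial_i S$. We have $p^\mu = \partial^\mu S$.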
Non-free particle
For a non-free particle, we have the revised Newton’s second law:
$$f^\mu = \frac{dp^\mu}{d\tau}. \tag{5.19}$$
It can also be written in the form of three vectors as
If the system consists of more than one particle interacting with each other, the
conservation laws follow from the symmetries of the system.
$$f(\mathbf{r}, \mathbf{p}) = f'(\mathbf{r}', \mathbf{p}') \tag{5.23}$$
under Lorentz transformation.
$$dN = \sigma v_{\text{rel}} n_1 n_2 \, dV \, dt, \tag{5.25}$$
where vrel is the velocity of particle 1 in the rest system of particle 2 (which is just the definition
of the relative velocity of two particles in relativistic mechanics).
The number dN is by its very nature an invariant quantity. We would like to express it in a
form which is applicable in any reference system:
$$dN = A n_1 n_2 \, dV \, dt, \tag{5.26}$$
where A is a number to be determined, for which we know that its value in the rest frame of
one of the particles is vrel σ. We shall always mean by σ precisely the cross-section in the rest
frame of one of the particles, i.e., by definition, an invariant quantity. From its definition, the
relative velocity vrel is also invariant. The product dV dt is an invariant. Therefore the product
An1 n2 must also be an invariant. The law of transformation of the particle density n is
$$n = \frac{n_0}{\sqrt{1 - v^2}} = n_0 E/m, \tag{5.27}$$
where n0 is the density in the rest frame of the particle. Thus we can construct A in an arbitrary
frame as
$$A = -\sigma v_{\text{rel}}\frac{p_1^\mu p_{2\mu}}{E_1 E_2}. \tag{5.28}$$
Notice that
$$-p_1^\mu p_{2\mu} = \frac{m_1}{\sqrt{1 - v_{\text{rel}}^2}} m_2 = m_1 m_2\frac{1 - \mathbf{v}_1\cdot\mathbf{v}_2}{\sqrt{(1 - v_1^2)(1 - v_2^2)}}. \tag{5.29}$$
We can get the following expression for vrel :
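$$v_{\text{rel}} = \frac{\sqrt{(\mathbf{v}_1 - \mathbf{v}_2)^2 - (\mathbf{v}_1 \times \mathbf{v}_2)^2}}{1 - \mathbf{v}_1\cdot\mathbf{v}_2}. \tag{5.30}$$
Finally, we have
$$dN = \sigma\sqrt{(\mathbf{v}_1 - \mathbf{v}_2)^2 - (\mathbf{v}_1 \times \mathbf{v}_2)^2}\; n_1 n_2 \, dV \, dt. \tag{5.31}$$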
If the velocities v1 and v2 are collinear, then we have
E2 = m2 , p2 = 0. (5.34)
where $\theta_1$ ($\theta_2$) is the angle between $\mathbf{p}_1'$ ($\mathbf{p}_2'$) and $\mathbf{p}_1$. In particular, if $m_1 = 0$, we have
$$E_1' = \frac{E_1}{(1 - \cos\theta_1)E_1/m_2 + 1}. \tag{5.36}$$
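$$\frac{(x - c)^2}{a^2} + \frac{y^2}{b^2} = 1, \tag{5.37}$$
where
$$a \equiv \frac{p_1(E_1 m_2 + m_2^2)}{m_1^2 + m_2^2 + 2m_2 E_1}, \qquad b \equiv \frac{m_2 p_1}{\sqrt{m_1^2 + m_2^2 + 2m_2 E_1}} = a\sqrt{1 - V^2}, \qquad c \equiv \frac{p_1(E_1 m_2 + m_1^2)}{m_1^2 + m_2^2 + 2m_2 E_1}, \tag{5.38}$$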
where V ≡ p1 /(E1 + m2 ) is the velocity of particle 2 before scattering in the center of mass
frame (C frame). It is easy to see that a + c = p1 . The scattering in L frame is illustrated in
Figure 5.1. We note that if m1 > m2 , the scattering angle θ1 cannot exceed a certain maximum
value, which is given by sin θ1max = m2 /m1 .
$$E_1' = E_1 - \Delta E, \qquad E_2' = m_2 + \Delta E \quad\text{where}\quad \Delta E \equiv \frac{m_2(E_1^2 - m_1^2)}{m_1^2 + m_2^2 + 2m_2 E_1}(1 - \cos\chi). \tag{5.39}$$
Chapter 6
Classical Field Theory
In physics, a field is a physical quantity, represented by a number or tensor, that has a value
for each point in space and time. A classical field theory is a physical theory that predicts how
one or more physical fields interact with matter through field equations.
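Keeping the end points fixed and requiring S to be stationary under small perturbations, the
field equations are obtained:
$$\partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\right) - \frac{\partial\mathcal{L}}{\partial\phi_a} = 0. \tag{6.2}$$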
A key feature of all theories of nature is the property of locality. The locality of the theory
requires that there are no terms in the Lagrangian coupling $\phi(\mathbf{x}, t)$ directly to $\phi(\mathbf{y}, t)$ with
$\mathbf{x} \neq \mathbf{y}$. The closest we get for the $\mathbf{x}$ label is a coupling between $\phi(\mathbf{x}, t)$ and $\phi(\mathbf{x} + \delta\mathbf{x}, t)$ through
the gradient term $\nabla\phi$.
Modern formulations of classical field theories also require Lorentz covariance as laws of na-
ture are relativistic. The field can behave like a scalar or vector, while Lagrangian density must
be a scalar, or more loosely, action must be invariant under Lorentz transformation.
Scalar fields: Under the Lorentz transformation $x' = \Lambda x$, we have
$$\phi'(x') = \phi(\Lambda^{-1}x'). \tag{6.3}$$
Every continuous symmetry of the Lagrangian density gives rise to a conserved current
j^µ(x) such that the equations of motion imply ∂_µ j^µ = 0. Suppose that the infinitesimal
transformation gives
ϕa → ϕa + δϕa , L → L + δL. (6.6)
If δL = ∂_µ K^µ, the conserved current is
\[ j^\mu = -\frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\,\delta\phi_a + K^\mu. \tag{6.7} \]
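\[ j^\mu = -a_\nu T^{\mu\nu} \quad\text{where}\quad T^{\mu\nu} \equiv -\frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\,\partial^\nu \phi_a + \eta^{\mu\nu}\mathcal{L}. \tag{6.8} \]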
The arbitrariness of δω_{µν}, except for its antisymmetry, implies that ∂_µ M^{µνρ} = 0. If we define
\[ M^{\nu\rho} \equiv \int M^{0\nu\rho}\, d^3x, \tag{6.15} \]
then
\[ \frac{dM^{\nu\rho}}{dt} = 0. \tag{6.16} \]
F: M →R or F : M → C (6.17)
Like the derivative of a function, the functional derivative satisfies the following properties,
where F [ρ] and G[ρ] are functionals:
Linearity
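\[ \frac{\delta(\lambda F + \mu G)[\rho]}{\delta\rho(x)} = \lambda\frac{\delta F[\rho]}{\delta\rho(x)} + \mu\frac{\delta G[\rho]}{\delta\rho(x)} \tag{6.19} \]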
where λ, µ are constants.
Product rule
\[ \frac{\delta(FG)[\rho]}{\delta\rho(x)} = \frac{\delta F[\rho]}{\delta\rho(x)}\,G[\rho] + F[\rho]\,\frac{\delta G[\rho]}{\delta\rho(x)}. \tag{6.20} \]
\[ \frac{\delta F[g(\rho)]}{\delta\rho(x)} = \frac{\delta F[g(\rho)]}{\delta g(\rho(x))}\,\frac{dg}{d\rho}. \tag{6.22} \]
6.4 Hamiltonian formulation –51/453–
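\[ \frac{\delta F}{\delta\rho(x)} = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left\{F[\rho + \epsilon\delta_x] - F[\rho]\right\} \quad\text{where}\quad \delta_x \equiv \delta(y - x). \tag{6.23} \]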
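\[ \frac{\delta f(y)}{\delta f(x)} = \delta(y - x), \qquad \frac{\delta f'(y)}{\delta f(x)} = \frac{d\,\delta(y - x)}{dy}. \tag{6.24} \]
\[ \frac{\delta}{\delta f(x)}\int g(f(t))\, dt = g'(f(x)). \tag{6.25} \]
\[ \frac{\delta}{\delta f(x)}\int g(f'(t))\, dt = -\frac{d}{dx}\left[g'(f'(x))\right]. \tag{6.26} \]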
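The limit definition of the functional derivative can be checked numerically. The following is a minimal sketch, assuming a uniform grid on [0, 1], the sample function f(t) = t², and g = sin; the names `functional` and `functional_derivative` are illustrative and do not come from the notes. The delta function is discretized as a spike of height 1/h at one grid point, and the estimate is compared with g′(f(x)) from equation 6.25.

```python
import math

def functional(f_vals, h, g):
    """Riemann-sum approximation of F[f] = integral of g(f(t)) dt on a uniform grid."""
    return h * sum(g(v) for v in f_vals)

def functional_derivative(f_vals, h, g, i, eps=1e-8):
    """Finite-difference estimate of dF/df(x_i).

    The perturbation eps * delta(y - x_i) is discretized as a bump of
    height eps / h at grid point i (assumption: uniform grid spacing h).
    """
    bumped = list(f_vals)
    bumped[i] += eps / h
    return (functional(bumped, h, g) - functional(f_vals, h, g)) / eps

h = 0.01
grid = [k * h for k in range(101)]   # t in [0, 1]
f_vals = [t * t for t in grid]       # sample function f(t) = t^2
i = 50                               # grid point x = 0.5
estimate = functional_derivative(f_vals, h, math.sin, i)
exact = math.cos(f_vals[i])          # g'(f(x)) with g = sin
```

The two numbers agree up to discretization and round-off error, which illustrates why the limit must be taken as ϵ → 0.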
ϕ˙a (x) = {ϕa (x), H}, π˙a (x) = {π a (x), H}. (6.32)
6.4.2 Momentum
Using equations 6.8 and 6.9, we can derive that
\[ P^0 = H, \qquad P^i = \int -\pi^a\,\partial^i\phi_a\, d^3x. \tag{6.34} \]
{ϕa , P µ } = −∂ µ ϕa , {π a , P µ } = −∂ µ π a , {P µ , P ν } = 0. (6.35)
If we define
\[ M_L^{\mu\nu} \equiv \int \left(x^\mu T^{0\nu} - x^\nu T^{0\mu}\right) d^3x, \qquad M_S^{\mu\nu} \equiv \int \left(-\pi^a (\Sigma^{\mu\nu})_{ab}\,\phi_b\right) d^3x, \tag{6.37} \]
where
(Lµν )ab ≡ −(xµ ∂ ν − xν ∂ µ )δab , (Sµν )ab ≡ −(Σµν )ab . (6.39)
Because dM^{µν}/dt = 0, M^{µν} commutes with d/dt. Derivatives with respect to spatial
coordinates also commute with the bracket operation by definition. As a result, we have
{ϕ(x), {M µν , M ρσ }} = (Lµν Lρσ − Lρσ Lµν + Sµν Sρσ − Sρσ Sµν )ϕ(x). (6.41)
Notice that
If we demand that
{M µν , M ρσ } = −η νρ M µσ + η σµ M ρν + η µρ M νσ − η σν M ρµ , (6.44)
up to the possibility of a term on the right-hand side that commutes with ϕ(x) and its deriva-
tives.
{P µ , M ρσ } = η µσ P ρ − η µρ P σ . (6.47)
It can be rewritten as
Finally, we define L_i ≡ ϵ_{ijk} M_L^{jk}/2 and S_i ≡ ϵ_{ijk} M_S^{jk}/2. We can derive that
∂µ j µ = ∂µ ∂ν F µν = 0. (7.3)
It follows that
\[ \int dV\, dt\; j^\mu A_\mu = \int dx^\mu\, e_a A_\mu(x^\mu(\tau)). \tag{7.5} \]
The action for a charged particle coupled to the EM field is therefore
\[ S = -m\int d\tau + e\int dx^\mu A_\mu(x^\mu(\tau)). \tag{7.6} \]
Note: The Hamiltonian formulation of electrodynamics will be discussed in detail in the Hamiltonian
formulation of general relativity and canonical quantization formulation of quantum electrodynamics.
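We also define ρ_e ≡ j^0 and J^i ≡ j^i. The field equations can be rewritten as Maxwell's
equations:
\[ \nabla\times\mathbf{B} = \frac{\partial\mathbf{E}}{\partial t} + \mathbf{J}, \qquad \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\cdot\mathbf{E} = \rho_e, \qquad \nabla\cdot\mathbf{B} = 0. \tag{7.9} \]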
The equations of motion for the charged particle can be rewritten as the Lorentz force
equations:
\[ \frac{d\mathbf{p}}{dt} = e(\mathbf{E} + \mathbf{v}\times\mathbf{B}), \qquad \frac{d\mathcal{E}}{dt} = e\,\mathbf{E}\cdot\mathbf{v}. \tag{7.10} \]
We also notice that Aµ cannot be completely determined by Maxwell’s equations and Lorentz
force equations. If we make the transformation Aµ → Aµ + ∂µ ξ(x), L and F µν would be in-
variant, and Maxwell’s equations and Lorentz force equations are still valid. This arbitrariness
of ξ is called gauge invariance. This topic will be discussed in detail in QED.
Equation 7.12 can be generalized to the case where the direction of β is arbitrary. We have
If β ≪ 1, we have
E ′ = E + β × B, B ′ = B − β × E. (7.14)
We notice that F_{µν}F^{µν} and ϵ_{µνρσ}F^{µν}F^{ρσ} are invariant under Lorentz transformation, i.e.,
We notice that the energy-momentum tensor defined above is not symmetric. So we define a
modified energy-momentum tensor by adding a term −∂ ρ Aν F µρ , i.e.,
\[ T_{f'}^{\mu\nu} = F^{\nu\rho}F^{\mu}{}_{\rho} - \frac{1}{4}\eta^{\mu\nu}F_{\rho\sigma}F^{\rho\sigma}. \tag{7.17} \]
–56/453– Chapter 7 Classical Electrodynamics
For free EM field, we have ∂ ρ Aν F µρ = ∂ ρ Aν F µρ . As a result,
\[ \partial_\mu T_{f'}^{\mu\nu} = 0, \qquad P_{f'}^{\mu} = P_f^{\mu}. \tag{7.18} \]
From now on, we will use Tf ′ as the energy-momentum tensor of EM field and omit the prime
for simplicity. The momentum of the free EM field is
\[ P_f^0 = \int dV\, \frac{E^2 + B^2}{2} \equiv \int dV\, w, \qquad P_f^i = \int dV\, (\mathbf{E}\times\mathbf{B})^i \equiv \int dV\, S^i. \tag{7.19} \]
If there also exist charged particles in the system, i.e., the source of EM field, we must also
include the energy-momentum tensor of the particles to get the right conservation equation.
The energy-momentum tensor of particles is defined as
\[ T_p^{\mu\nu} \equiv \sum_a m_a\,\delta(\mathbf{r} - \mathbf{r}_a)\sqrt{1 - v_a^2}\; u_a^\mu u_a^\nu. \tag{7.20} \]
From this definition, we can get the four momentum of all particles,
\[ P_p^0 = \sum_a \frac{m_a}{\sqrt{1 - v_a^2}}, \qquad \mathbf{P}_p = \sum_a \frac{m_a \mathbf{v}_a}{\sqrt{1 - v_a^2}}. \tag{7.21} \]
f ij ≡ −Tfij = E i E j + B i B j − wδ ij , (7.29)
∂µ (xν T µρ − xρ T µν ) = 0. (7.31)
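\[ L = \frac{mv^2}{2} + e\mathbf{A}\cdot\mathbf{v} - e\phi, \qquad \boldsymbol{\pi} = m\mathbf{v} + e\mathbf{A}, \qquad H = \frac{(\boldsymbol{\pi} - e\mathbf{A})^2}{2m} + e\phi. \tag{7.37} \]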
Ȧ = 0, E = −∇ϕ. (7.38)
Suppose the direction of electric field is x̂, the orbit is in x − y plane. The equation of motion
will be
ṗx = eE, ṗy = 0. (7.39)
The solution is
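\[ x = \frac{1}{eE}\sqrt{\mathcal{E}_0^2 + (eEt)^2}, \qquad y = \frac{p_0}{eE}\,\operatorname{arcsinh}\frac{eEt}{\mathcal{E}_0}, \tag{7.40} \]
assuming p_x = 0, p_y = p_0 at t = 0 and \(\mathcal{E}_0 \equiv \sqrt{p_0^2 + m^2}\). The trajectory of the particle is
\[ x = \frac{\mathcal{E}_0}{eE}\cosh\frac{eEy}{p_0}. \tag{7.41} \]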
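Eliminating t from the solution 7.40 should reproduce the catenary 7.41. The sketch below checks this numerically in units with c = 1; the function names and the sample parameter values are illustrative assumptions, not part of the notes.

```python
import math

def position(t, charge, field, p0, mass):
    """Position (x, y) at time t from equation 7.40 (units with c = 1)."""
    energy0 = math.sqrt(p0**2 + mass**2)
    force = charge * field
    x = math.sqrt(energy0**2 + (force * t)**2) / force
    y = p0 / force * math.asinh(force * t / energy0)
    return x, y

def trajectory_x(y, charge, field, p0, mass):
    """x as a function of y from equation 7.41."""
    energy0 = math.sqrt(p0**2 + mass**2)
    force = charge * field
    return energy0 / force * math.cosh(force * y / p0)

# Hypothetical sample values: e = 1, E = 0.5, p0 = 2, m = 1.
params = (1.0, 0.5, 2.0, 1.0)
max_mismatch = 0.0
for t in (0.0, 0.3, 1.0, 5.0, 20.0):
    x, y = position(t, *params)
    max_mismatch = max(max_mismatch, abs(x - trajectory_x(y, *params)))
```

The check relies only on the identity cosh(arcsinh u) = √(1 + u²), so `max_mismatch` stays at round-off level for any positive parameters.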
Suppose the direction of magnetic field is ẑ. Notice that particle’s kinetic energy E = γm is
constant if there is no electric field. We can derive the equation of motion
We only focus on the case where the velocity of particle is much smaller than light speed.
Suppose the direction of magnetic field is ẑ and the direction of electric field is within y − z
plane. The equation of motion is
The solution is
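\[ \dot{x} = a\cos\omega t + \frac{E_y}{B}, \qquad \dot{y} = -a\sin\omega t, \qquad \dot{z} = v_{0z} + \frac{eE_z}{m}t, \tag{7.45} \]
where ω = eB/m; a and v_{0z} are determined by the initial conditions. Since we assume that v ≪ 1,
we must have
\[ a \ll 1, \qquad v_{0z} \ll 1, \qquad \frac{eE_z t}{m} \ll 1, \qquad \frac{E_y}{B} \ll 1. \tag{7.46} \]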
7.2 Constant electromagnetic field –59/453–
∇ · E = ρe , ∇ × E = 0. (7.47)
Therefore, we have
E = −∇ϕ, ∇2 ϕ = −ρe . (7.48)
The solution is
\[ \phi(\mathbf{r}) = \int \frac{\rho_e(\mathbf{r}')}{4\pi|\mathbf{r} - \mathbf{r}'|}\, dV'. \tag{7.49} \]
If ρe (r ′ ) = Qδ(r ′ ), we have
\[ \phi(\mathbf{r}) = \frac{Q}{4\pi|\mathbf{r}|}, \qquad \mathbf{E}(\mathbf{r}) = \frac{Q\mathbf{r}}{4\pi|\mathbf{r}|^3}. \tag{7.50} \]
Here, ϕ_a is the electric potential at the point where e_a is located, produced by e_a itself, while
Φ_a is the potential produced by the other charges. It is obvious that U_self = e_a ϕ_a/2 is infinite,
indicating that classical electrodynamics is no longer valid at small distances. This problem is
resolved in quantum electrodynamics: the measured mass of a charged particle is already
renormalized to include the electromagnetic self-energy. Actually, we have
\[ U = \frac{1}{2}\int E^2\, dV - U_{\text{self}} = \frac{1}{2}\sum_{a\neq b}\frac{e_a e_b}{4\pi R_{ab}} \quad\text{where}\quad R_{ab} = |\mathbf{r}_a - \mathbf{r}_b|. \tag{7.52} \]
If the charged particle is moving with a constant velocity v, we can derive the electric field it
produces by Lorentz transformation. The final result is
\[ \mathbf{E} = \frac{e\mathbf{r}}{4\pi r^3}\,\frac{1 - v^2}{(1 - v^2\sin^2\theta)^{3/2}}, \qquad \mathbf{B} = \mathbf{v}\times\mathbf{E}, \tag{7.53} \]
where r is the vector pointing from the particle to the point where we measure the field, and θ
is the angle between r and v. If v ∼ 1, the electric field is concentrated in the directions
perpendicular to v. If v ≪ 1, we have
\[ \mathbf{E} = \frac{e\mathbf{r}}{4\pi r^3}, \qquad \mathbf{B} = \frac{e\mathbf{v}\times\mathbf{r}}{4\pi r^3}. \tag{7.54} \]
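\[ \frac{1}{|\mathbf{r} - \mathbf{r}_a|} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{r_a^l}{r^{l+1}}\,\frac{4\pi}{2l+1}\, Y_{lm}^*(\theta,\phi)\,Y_{lm}(\theta_a,\phi_a). \tag{7.56} \]
The potential can be decomposed into \(\phi = \sum_l \phi^{(l)}\), where
\[ \phi^{(l)} \equiv \frac{1}{4\pi r^{l+1}}\sum_{m=-l}^{l}\sqrt{\frac{4\pi}{2l+1}}\; Q_m^{(l)}\, Y_{lm}^*(\theta,\phi), \qquad Q_m^{(l)} \equiv \sum_a e_a r_a^l \sqrt{\frac{4\pi}{2l+1}}\; Y_{lm}(\theta_a,\phi_a). \tag{7.57} \]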
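\[ \phi^{(0)} = \frac{Q}{4\pi r}, \qquad \mathbf{E}^{(0)} = \frac{Q\hat{n}}{4\pi r^2}; \tag{7.58} \]
\[ \phi^{(1)} = \frac{\mathbf{d}\cdot\hat{n}}{4\pi r^2}, \qquad \mathbf{E}^{(1)} = \frac{3(\mathbf{d}\cdot\hat{n})\hat{n} - \mathbf{d}}{4\pi r^3}; \tag{7.59} \]
\[ \phi^{(2)} = \frac{\hat{n}\cdot D\cdot\hat{n}}{8\pi r^3}, \qquad \mathbf{E}^{(2)} = \frac{5(\hat{n}\cdot D\cdot\hat{n})\hat{n} - (\hat{n}\cdot D + D\cdot\hat{n})}{8\pi r^4}; \tag{7.60} \]
where
\[ \hat{n} \equiv \frac{\mathbf{r}}{r}, \qquad Q \equiv \sum_a e_a, \qquad \mathbf{d} \equiv \sum_a e_a\mathbf{r}_a, \qquad D \equiv \sum_a e_a\left(3\mathbf{r}_a\mathbf{r}_a - r_a^2 I\right). \tag{7.61} \]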
Now turn to a system of charged particles in the electric field ϕ(r). If all the particles are near
r = 0, we can make the expansion
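\[ \phi(\mathbf{r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} r^l \sqrt{\frac{4\pi}{2l+1}}\; a_{lm}\, Y_{lm}(\theta,\phi). \tag{7.62} \]
The potential energy of the system is then
\[ U = \sum_{l=0}^{\infty} U^{(l)}, \qquad U^{(l)} = \sum_{m=-l}^{l} a_{lm}\, Q_m^{(l)}. \tag{7.63} \]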
The force exerted on the system can be obtained by taking the derivatives of the potential
energy. We list the leading terms in decomposition:
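The solution is
\[ \langle\mathbf{A}\rangle = \frac{1}{4\pi}\int\frac{\langle\mathbf{j}\rangle}{|\mathbf{r} - \mathbf{r}'|}\, dV' = \frac{1}{4\pi}\sum_a\frac{e_a\mathbf{v}_a}{|\mathbf{r} - \mathbf{r}_a|}. \tag{7.69} \]
The magnetic field is
\[ \langle\mathbf{B}\rangle = \frac{1}{4\pi}\int\frac{\langle\mathbf{j}\rangle\times(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\, dV'. \tag{7.70} \]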
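Firstly,
\[ \sum_a e\langle\mathbf{v}_a\rangle = \left\langle\frac{d}{dt}\sum_a e\mathbf{r}_a\right\rangle = 0. \tag{7.73} \]
Secondly,
\[ -\sum_a\left\langle e\mathbf{v}_a\,(\mathbf{r}_a\cdot\nabla)\frac{1}{r}\right\rangle = \frac{1}{r^3}\sum_a\left\langle e\mathbf{v}_a(\mathbf{r}_a\cdot\mathbf{r})\right\rangle. \tag{7.74} \]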
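Notice that
\[ \sum_a e\mathbf{v}_a(\mathbf{r}_a\cdot\mathbf{r}) = \frac{1}{2}\frac{d}{dt}\sum_a e\mathbf{r}_a(\mathbf{r}_a\cdot\mathbf{r}) + \frac{1}{2}\left(\sum_a e\,\mathbf{r}_a\times\mathbf{v}_a\right)\times\mathbf{r}. \tag{7.75} \]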
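The decomposition in equation 7.75 rests on a vector identity that holds for each particle separately: with the field point r held fixed, v(r_a·r) = ½ d/dt[r_a(r_a·r)] + ½ (r_a × v)×r. A minimal numerical sketch of this identity follows; the helper names and the sample vectors are illustrative assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lhs(r_a, v_a, r):
    """v_a (r_a . r)"""
    s = dot(r_a, r)
    return tuple(s * c for c in v_a)

def rhs(r_a, v_a, r):
    """(1/2) d/dt[r_a (r_a . r)] + (1/2) (r_a x v_a) x r, with r fixed.

    The time derivative is expanded by the product rule:
    d/dt[r_a (r_a . r)] = v_a (r_a . r) + r_a (v_a . r).
    """
    total_derivative = tuple(dot(r_a, r) * v + dot(v_a, r) * x
                             for v, x in zip(v_a, r_a))
    moment_term = cross(cross(r_a, v_a), r)
    return tuple(0.5 * (d + m) for d, m in zip(total_derivative, moment_term))

# Hypothetical sample vectors.
r_a, v_a, r = (0.3, -1.2, 0.7), (0.05, 0.02, -0.04), (10.0, 4.0, -3.0)
difference = max(abs(p - q) for p, q in zip(lhs(r_a, v_a, r), rhs(r_a, v_a, r)))
```

After time averaging, the total-derivative term drops out for bounded motion, which leaves only the term containing r_a × v_a.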
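We can get
\[ \langle\mathbf{A}\rangle = \frac{\langle\mathbf{m}\rangle\times\mathbf{r}}{4\pi r^3}, \qquad \langle\mathbf{B}\rangle = \frac{3\hat{n}(\langle\mathbf{m}\rangle\cdot\hat{n}) - \langle\mathbf{m}\rangle}{4\pi r^3}. \tag{7.77} \]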
If all the particles have the same charge-to-mass ratio e/m, and the velocity of all the particles is
much smaller than that of light, we have
\[ \mathbf{m} = \frac{e}{2m}\sum_a m\,\mathbf{r}_a\times\mathbf{v}_a = \frac{e}{2m}\mathbf{M}. \tag{7.78} \]
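Now focus on a system of charges in an external constant uniform magnetic field. The time
average of the force acting on the system is
\[ \mathbf{F} = \sum_a e\langle\mathbf{v}_a\times\mathbf{B}\rangle = \left\langle\frac{d}{dt}\sum_a e\mathbf{r}_a\times\mathbf{B}\right\rangle = 0. \tag{7.79} \]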
ϕ = 0, ∇ · A = 0. (7.84)
7.3 Electromagnetic waves –63/453–
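Consequently, we have
\[ \mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}. \tag{7.85} \]
From Maxwell's equations we can derive that
\[ \nabla^2\mathbf{A} - \frac{\partial^2\mathbf{A}}{\partial t^2} = 0. \tag{7.86} \]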
This is the equation which determines the potentials of electromagnetic waves. We can verify
that the electric and magnetic field E and B satisfy the same wave equation.
We consider the special case of electromagnetic waves in which the fields depend on only
one coordinate, say x. Such waves are said to be plane. In this case the field equation
becomes
\[ \frac{\partial^2 f}{\partial t^2} - \frac{\partial^2 f}{\partial x^2} = 0, \tag{7.87} \]
where f is understood to be any component of the vectors A, E and B. The solution is
\[ f = f_1(t - x) + f_2(t + x). \]
f_1(t − x) represents a plane wave moving in the positive direction along the x axis, and f_2(t + x)
represents a plane wave moving in the negative direction along the x axis. The Coulomb
gauge implies that A_x = 0. And we can obtain
where the prime denotes differentiation with respect to t − x and n̂ is a unit vector along the
direction of propagation of the wave. We see that the electric and magnetic fields E and B of
a plane wave are directed perpendicular to the direction of propagation of the wave. For this
reason, electromagnetic waves are said to be transverse. The energy density and flux of the
plane waves are
W = E 2 , S = W n̂. (7.90)
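\[ \frac{\partial^2 f}{\partial x^2} + \omega^2 f = 0. \tag{7.91} \]
The vector potential of such a wave is most conveniently written as the real part of a complex
expression
\[ \mathbf{A} = \operatorname{Re}\left\{\mathbf{A}_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\right\}, \qquad \mathbf{k} = \omega\hat{n}. \tag{7.92} \]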
The time average of the product of field intensity can be worked out as
\[ \langle XY\rangle = \frac{1}{2}\operatorname{Re}\{X_0 Y_0^*\}. \tag{7.93} \]
The electric and magnetic fields are
E = iωA, B = ik × A. (7.94)
A special case is that f(t) is a periodic function with angular frequency ω_0. Then f(t) can be
expanded as
\[ f(t) = \sum_{n=-\infty}^{\infty} f_n\, e^{-in\omega_0 t}, \tag{7.103} \]
where
\[ f_n \equiv \frac{1}{T}\int_0^T f(t)\, e^{in\omega_0 t}\, dt. \tag{7.104} \]
\[ f_\omega = \sum_{n=-\infty}^{\infty} 2\pi f_n\,\delta(\omega - n\omega_0). \tag{7.106} \]
E0 (t)e−iωt , (7.107)
where the complex amplitude E_0 is some slowly varying function of the time. Since E_0
determines the polarization of the wave, this means that at each point of the wave the polarization
changes with time. Such a wave is said to be partially polarized.
Quadratic functions of the field are made up of terms proportional to the products Eα Eβ ,
Eα∗ Eβ∗ or Eα∗ Eβ . Products of the form Eα Eβ and Eα∗ Eβ∗ contain the rapidly oscillating factors
e−i2ωt and will give zero when the time average is taken. Thus, we see that the polarization
properties of the light are completely characterized by the tensor
\[ J_{\alpha\beta} = \overline{E_{0\alpha} E_{0\beta}^{*}}. \tag{7.108} \]
\[ \rho_{\alpha\beta} = \frac{J_{\alpha\beta}}{J}, \tag{7.110} \]
called polarization tensor.
Generally, the polarization tensor can be expressed as
\[ \rho = \frac{1}{2}\begin{pmatrix} 1 + p_3 & p_1 - ip_2 \\ p_1 + ip_2 & 1 - p_3 \end{pmatrix}. \tag{7.111} \]
we have
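\[ \rho = \frac{1}{2}(1 - P)\,I + \frac{1}{2}P\,(I + \hat{n}\cdot\boldsymbol{\sigma}), \tag{7.113} \]
where
\[ P \equiv \sqrt{p_1^2 + p_2^2 + p_3^2}, \qquad \hat{n} \equiv \left(\frac{p_1}{P}, \frac{p_2}{P}, \frac{p_3}{P}\right). \tag{7.114} \]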
For a monochromatic light with polarization state |E⟩ = (cos(θ/2)e−iϕ/2 , sin(θ/2)eiϕ/2 ), the
polarization tensor is |E⟩⟨E|. We can verify that
to pass totally. If light with polarization tensor ρ passes through the device, the relative intensity
becomes
\[ \langle D|\rho|D\rangle = \frac{1}{2} + \frac{1}{2}\,\mathbf{p}\cdot\hat{m}, \tag{7.118} \]
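Equation 7.118 can be checked directly from the matrix form 7.111. The sketch below assumes that the device state |D⟩ is parametrized in the same way as the polarization state |E⟩ above, i.e. |D⟩ = (cos(θ/2)e^{−iϕ/2}, sin(θ/2)e^{iϕ/2}), with m̂ = (sin θ cos ϕ, sin θ sin ϕ, cos θ); this parametrization, the function names, and the sample numbers are assumptions made for illustration.

```python
import cmath
import math

def polarization_tensor(p1, p2, p3):
    """2x2 polarization tensor of equation 7.111 for the vector p = (p1, p2, p3)."""
    return [[0.5 * (1 + p3), 0.5 * (p1 - 1j * p2)],
            [0.5 * (p1 + 1j * p2), 0.5 * (1 - p3)]]

def device_state(theta, phi):
    """Assumed state |D> whose direction is m = (sin t cos p, sin t sin p, cos t)."""
    return [math.cos(theta / 2) * cmath.exp(-0.5j * phi),
            math.sin(theta / 2) * cmath.exp(0.5j * phi)]

def relative_intensity(rho, d):
    """<D|rho|D>, which is real for a Hermitian rho."""
    total = 0j
    for a in range(2):
        for b in range(2):
            total += d[a].conjugate() * rho[a][b] * d[b]
    return total.real

p = (0.3, -0.2, 0.5)              # hypothetical partially polarized light, |p| < 1
theta, phi = 1.1, 0.7             # hypothetical device orientation
m_hat = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
from_matrix = relative_intensity(polarization_tensor(*p), device_state(theta, phi))
from_formula = 0.5 + 0.5 * sum(pi * mi for pi, mi in zip(p, m_hat))
```

For unpolarized light (p = 0) both expressions reduce to 1/2, independent of the device orientation.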
7.4 The field of moving charges –67/453–
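\[ \begin{aligned}
I &\equiv \langle E_x^2\rangle + \langle E_y^2\rangle = \langle E_a^2\rangle + \langle E_b^2\rangle = \langle E_+^2\rangle + \langle E_-^2\rangle, \\
Q &\equiv \langle E_x^2\rangle - \langle E_y^2\rangle, \\
U &\equiv \langle E_a^2\rangle - \langle E_b^2\rangle, \\
V &\equiv \langle E_+^2\rangle - \langle E_-^2\rangle,
\end{aligned} \]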
where the subscripts refer to three different bases of the space of Jones vectors: the standard
Cartesian basis x̂, ŷ, a Cartesian basis rotated by 45° â, b̂, and a circular basis +̂, −̂. The
symbols ⟨·⟩ represent expectation values. It is easy to verify that
∂ 2 Aµ = −j µ . (7.120)
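\[ \nabla^2\mathbf{A} - \frac{\partial^2\mathbf{A}}{\partial t^2} = -\mathbf{J}, \qquad \nabla^2\phi - \frac{\partial^2\phi}{\partial t^2} = -\rho. \tag{7.121} \]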
To find the particular solution, we divide the whole space into infinitely small regions and
determine the field produced by the charges located in one of these volume elements. Because
of the linearity of the field equations, the actual field will be the sum of the fields produced
by all such elements. The charge e in a given volume element is a function of the time. If we
choose the origin of coordinates in the volume element under consideration, then the charge
density is e(t)δ(R), where R is the distance from the origin. Thus, we must solve the equation
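\[ \nabla^2\phi - \frac{\partial^2\phi}{\partial t^2} = -e(t)\,\delta(\mathbf{R}). \tag{7.122} \]
The particular solution is
\[ \phi = \frac{e(t - R)}{4\pi R}. \tag{7.123} \]
For an arbitrary distribution of charges ρ(r′, t), we have
\[ \phi(\mathbf{r}, t) = \int\frac{\rho(\mathbf{r}', t - |\mathbf{r} - \mathbf{r}'|)}{4\pi|\mathbf{r} - \mathbf{r}'|}\, dV'. \tag{7.124} \]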
where
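\[ \hat{n}^* = \frac{\mathbf{R}^*}{R^*}, \qquad \mathbf{R}^* = \mathbf{r} - \mathbf{r}_0(t^*), \qquad \mathbf{v}^* = \mathbf{v}_0(t^*), \qquad t^* = t - R^*. \tag{7.128} \]
Similarly, we have
\[ \mathbf{A}(\mathbf{r}, t) = \frac{e\mathbf{v}^*}{4\pi R^*(1 - \hat{n}^*\cdot\mathbf{v}^*)}. \tag{7.129} \]
These potentials are called the Liénard-Wiechert potentials. Notice that
\[ \frac{\partial t^*}{\partial t} = \frac{1}{1 - \hat{n}^*\cdot\mathbf{v}^*}, \qquad \nabla t^* = -\frac{\hat{n}^*}{1 - \hat{n}^*\cdot\mathbf{v}^*}. \tag{7.130} \]
We can work out the corresponding electric and magnetic fields:
\[ \mathbf{E} = \frac{e}{4\pi(1 - \hat{n}^*\cdot\mathbf{v}^*)^3}\left[\frac{(1 - v^{*2})(\hat{n}^* - \mathbf{v}^*)}{R^{*2}} + \frac{\hat{n}^*\times[(\hat{n}^* - \mathbf{v}^*)\times\mathbf{a}^*]}{R^*}\right]; \tag{7.131a} \]
\[ \mathbf{B} = \hat{n}^*\times\mathbf{E}. \tag{7.131b} \]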
The electric field consists of two parts of different type. The first term depends only on the
velocity of the particle (and not on its acceleration) and varies at large distances like 1/R2 .
The second term depends on the acceleration, and for large R it varies like 1/R. This latter
term is related to the electromagnetic waves radiated by the particle.
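All starred quantities in the Liénard-Wiechert formulas are evaluated at the retarded time t*, defined implicitly by t* = t − |r − r_0(t*)|. A minimal sketch of how t* might be found numerically is given below. It uses plain fixed-point iteration, which converges because the particle speed is below 1 in units with c = 1; the function names and the sample trajectory are illustrative assumptions.

```python
import math

def retarded_time(t, r, trajectory, tol=1e-12, max_iter=500):
    """Solve t* = t - |r - r0(t*)| by fixed-point iteration (units with c = 1).

    `trajectory` maps a time to the particle position (x, y, z). The map is a
    contraction whenever the particle speed stays below 1.
    """
    t_star = t
    for _ in range(max_iter):
        new_t_star = t - math.dist(r, trajectory(t_star))
        if abs(new_t_star - t_star) < tol:
            return new_t_star
        t_star = new_t_star
    raise RuntimeError("retarded-time iteration did not converge")

def uniform_motion(speed):
    """Hypothetical test trajectory: motion along x with constant speed."""
    return lambda s: (speed * s, 0.0, 0.0)

field_point = (3.0, 4.0, 0.0)
path = uniform_motion(0.6)
t_obs = 2.0
t_star = retarded_time(t_obs, field_point, path)
residual = abs((t_obs - t_star) - math.dist(field_point, path(t_star)))
```

Once t* is known, R*, n̂* and v* follow from equation 7.128 and can be inserted into the potentials and fields.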
7.5 Radiation
7.5.1 Far field approximation
We consider the field produced by a system of moving charges at distances large compared
with the dimensions of the system. We choose the origin of coordinates O anywhere in the
interior of the system of charges. The radius vector from O to the point P , where we determine
the field, we denote by r, and the unit vector in this direction by n̂. Let the radius vector of
the charge element be r ′ , and the radius vector from charge to the point P be R. At large
distances from the system of charges, r ≫ r′ , and we have approximately,
R ≈ r − n̂ · r ′ . (7.137)
We substitute this for the retarded potentials. In the denominator of the integrands we can
neglect n̂ · r ′ compared with r. In t − r + n̂ · r ′ , whether it is possible to neglect these terms is
determined by how much the quantities e and j change during the time n̂ · r ′ . The potentials
of the field at large distances from the system of charges are
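\[ \phi(\mathbf{r}, t) = \frac{1}{4\pi r}\int\rho(\mathbf{r}', t - r + \hat{n}\cdot\mathbf{r}')\, dV', \tag{7.138a} \]
\[ \mathbf{A}(\mathbf{r}, t) = \frac{1}{4\pi r}\int\mathbf{J}(\mathbf{r}', t - r + \hat{n}\cdot\mathbf{r}')\, dV'. \tag{7.138b} \]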
At sufficiently large distances from the system of charges, the field over small regions of space
can be considered to be a plane wave. For this it is necessary that the distance be large com-
pared not only with the dimensions of the system, but also with the wavelength of the electro-
magnetic waves radiated by the system. We refer to this region of space as the wave zone of
the radiation. In wave zone, we have
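\[ \mathbf{B} = \frac{\partial\mathbf{A}}{\partial t}\times\hat{n}, \qquad \mathbf{E} = \left(\frac{\partial\mathbf{A}}{\partial t}\times\hat{n}\right)\times\hat{n}. \tag{7.139} \]
The energy flux is given by the Poynting vector which, for a plane wave, is
\[ \mathbf{S} = B^2\hat{n}. \tag{7.140} \]
\[ dP = B^2 r^2\, do. \tag{7.141} \]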
Since the field is inversely proportional to r, we see that the amount of energy radiated by the
system in unit time into the element of solid angle do is the same for all distances. For the
radiation produced by a single arbitrarily moving point charge, it turns out to be convenient
to use the Lienard-Wiechert potentials. At large distances, we have
\[ \mathbf{A}(\mathbf{r}, t) = \frac{e\mathbf{v}(t')}{4\pi r[1 - \hat{n}\cdot\mathbf{v}(t')]}, \tag{7.142} \]
where
\[ t' - \hat{n}\cdot\mathbf{r}_0(t') = t - r. \tag{7.143} \]
Now we turn to the spectral resolution of the field of the waves radiated by the system. For
vector potential, we can derive that
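\[ \mathbf{A}_\omega(\mathbf{r}) = \frac{e^{ikr}}{4\pi r}\int\mathbf{J}_\omega\, e^{-i\mathbf{k}\cdot\mathbf{r}'}\, dV', \tag{7.144} \]
where k ≡ ω n̂. In the wave zone, we have
\[ \mathbf{B}_\omega = i\mathbf{k}\times\mathbf{A}_\omega, \qquad \mathbf{E}_\omega = \frac{i}{\omega}(\mathbf{k}\times\mathbf{A}_\omega)\times\mathbf{k}. \tag{7.145} \]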
Suppose dEωn̂ is the energy radiated into the element of solid angle do in the form of waves
with frequencies in the interval dω. We have
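\[ d\mathcal{E}_{\omega\hat{n}} = 2|\mathbf{B}_\omega|^2 r^2\, do\,\frac{d\omega}{2\pi}. \tag{7.146} \]
For the radiation produced by a single arbitrarily moving point charge, we can derive that
\[ \mathbf{A}_\omega(\mathbf{r}) = \frac{e}{4\pi r}\,e^{ikr}\int_{-\infty}^{\infty} e^{i\omega(t - \hat{n}\cdot\mathbf{r}_0)}\, d\mathbf{r}_0, \qquad \mathbf{B}_\omega(\mathbf{r}) = \frac{ie\omega}{4\pi r}\,e^{ikr}\,\hat{n}\times\int_{-\infty}^{\infty} e^{i\omega(t - \hat{n}\cdot\mathbf{r}_0)}\, d\mathbf{r}_0. \tag{7.147} \]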
As a result, we have
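\[ \mathbf{B} = \frac{1}{4\pi r}\,\ddot{\mathbf{d}}\times\hat{n}, \qquad \mathbf{E} = \frac{1}{4\pi r}\,(\ddot{\mathbf{d}}\times\hat{n})\times\hat{n}. \tag{7.151} \]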
Radiation of this kind is called dipole radiation. We notice that a closed system of particles,
for all of which the ratio of charge to mass is the same, cannot radiate by dipole radiation. The
power of the dipole radiation is
\[ dP = \frac{\ddot{d}^{\,2}}{16\pi^2}\sin^2\theta\, do, \tag{7.152} \]
where θ is the angle between \(\ddot{\mathbf{d}}\) and n̂. Integrating over all directions, we have
\[ P = \frac{\ddot{d}^{\,2}}{6\pi}. \tag{7.153} \]
If we have just one charge moving in the external field, we have
\[ P = \frac{e^2 w^2}{6\pi}, \tag{7.154} \]
where w is the acceleration of the charge.
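The step from the angular distribution 7.152 to the total power 7.153 is the solid-angle integral ∫ sin²θ do = 8π/3. A minimal numerical sketch of this integral follows, using a midpoint rule; the function names are illustrative assumptions.

```python
import math

def dipole_angular_integral(n=2000):
    """Midpoint-rule value of the integral of sin^2(theta) over the sphere,
    i.e. 2*pi times the integral of sin^3(theta) from 0 to pi."""
    h = math.pi / n
    return 2 * math.pi * h * sum(math.sin((k + 0.5) * h) ** 3 for k in range(n))

def total_dipole_power(d_ddot_squared):
    """Total power obtained by integrating equation 7.152 over all directions."""
    return d_ddot_squared / (16 * math.pi ** 2) * dipole_angular_integral()

angular_factor = dipole_angular_integral()   # should be close to 8*pi/3
power = total_dipole_power(1.0)              # should be close to 1/(6*pi)
```

For a single charge, the dipole moment satisfies d̈ = e w, which turns the result into the formula for P in terms of the acceleration w.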
For the spectral resolution of the intensity of dipole radiation, we have
\[ d\mathcal{E}_\omega = \frac{\omega^4}{3\pi}\,|\mathbf{d}_\omega|^2\,\frac{d\omega}{2\pi}. \tag{7.155} \]
More details on dipole radiation during collisions and Coulomb interaction can be found in
section 68, 69 and 70 of The classical theory of fields (L.D.Landau & E.M.Lifshitz).
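If we keep terms up to first order in n̂ · r′, the vector potential of the radiation is
\[ \mathbf{A} = \frac{\dot{\mathbf{d}}}{4\pi r} + \frac{\ddot{D}\cdot\hat{n}}{24\pi r} + \frac{\dot{\mathbf{m}}\times\hat{n}}{4\pi r}. \tag{7.156} \]
We can further get
\[ \mathbf{B} = \frac{1}{4\pi r}\left[\ddot{\mathbf{d}}\times\hat{n} + \frac{1}{6}(\dddot{D}\cdot\hat{n})\times\hat{n} + (\ddot{\mathbf{m}}\times\hat{n})\times\hat{n}\right]; \tag{7.157a} \]
\[ \mathbf{E} = \frac{1}{4\pi r}\left[(\ddot{\mathbf{d}}\times\hat{n})\times\hat{n} + \frac{1}{6}[(\dddot{D}\cdot\hat{n})\times\hat{n}]\times\hat{n} + \hat{n}\times\ddot{\mathbf{m}}\right]; \tag{7.157b} \]
\[ P = \frac{1}{6\pi}\ddot{d}^{\,2} + \frac{1}{720\pi}\dddot{D}^{\,2} + \frac{1}{6\pi}\ddot{m}^{2}. \tag{7.157c} \]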
The total radiation consists of three independent parts: dipole, quadrupole, and magnetic
dipole radiation. The details of the derivation can be found in section 71 of The classical the-
ory of fields (L.D.Landau & E.M.Lifshitz).
Particularly, we have
\[ \Delta\mathcal{E} = \frac{e^2}{6\pi}\int\frac{w^2 - (\mathbf{v}\times\mathbf{w})^2}{(1 - v^2)^3}\, dt = \frac{e^4}{6\pi m^2}\int\frac{(\mathbf{E} + \mathbf{v}\times\mathbf{B})^2 - (\mathbf{E}\cdot\mathbf{v})^2}{1 - v^2}\, dt. \tag{7.161} \]
It is clear that for velocities close to that of light, the total energy radiated per unit time is
proportional to the square of the energy of the moving particle. The only exception is motion
in an electric field, along the direction of the field. In this case the factor (1 − v 2 ) standing in
the denominator is cancelled by an identical factor in the numerator, and the radiation does
not depend on the energy of the particle.
Now we discuss the angular distribution of the radiation from a rapidly moving charge. The
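radiation field is
\[ \mathbf{E} = \frac{e}{4\pi R}\,\frac{\hat{n}\times[(\hat{n} - \mathbf{v})\times\mathbf{w}]}{(1 - \hat{n}\cdot\mathbf{v})^3}, \qquad \mathbf{B} = \hat{n}\times\mathbf{E}, \tag{7.162} \]
where all the quantities on the right sides of the equations refer to the retarded time t′ = t − R.
The power radiated into the solid angle do is
\[ dP = \frac{e^2}{16\pi^2}\left[\frac{2(\hat{n}\cdot\mathbf{w})(\mathbf{v}\cdot\mathbf{w})}{(1 - \hat{n}\cdot\mathbf{v})^5} + \frac{w^2}{(1 - \hat{n}\cdot\mathbf{v})^4} - \frac{(1 - v^2)(\hat{n}\cdot\mathbf{w})^2}{(1 - \hat{n}\cdot\mathbf{v})^6}\right] do. \tag{7.163} \]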
If we want to determine the angular distribution of the total radiation throughout the whole
motion of the particle, we must integrate the intensity over the time. In doing this, it is im-
portant to remember that the integrand is a function of t′ ; therefore we must write
dt = (1 − n̂ · v) dt′ (7.164)
after which the integration over t′ is immediately done.
In the ultrarelativistic case, the intensity is large within the narrow range of angles in which
1 − n̂ · v is small. Thus an ultrarelativistic particle radiates mainly along the direction of its
own motion, within the small range of angles around the direction of its velocity. We also point
out that, for arbitrary velocity and acceleration of the particle, there are always two directions
for which the radiated intensity is zero. These are the directions for which the vector n̂ − v is
parallel to the vector w.
If the velocity and acceleration of the particle are parallel, we have
\[ \mathbf{B} = \frac{e}{4\pi R}\,\frac{\mathbf{w}\times\hat{n}}{(1 - \hat{n}\cdot\mathbf{v})^3}, \qquad dP = \frac{e^2}{16\pi^2}\,\frac{w^2\sin^2\theta}{(1 - v\cos\theta)^6}\, do. \tag{7.165} \]
The distribution is, naturally, symmetric around the common direction of v and w, and vanishes along (θ = 0)
and opposite to (θ = π) the direction of the velocity. In the ultrarelativistic case, the intensity
as a function of θ has a sharp double maximum near the direction of v, with a steep drop to zero for θ = 0.
If the velocity and acceleration are perpendicular to one another, we have
\[ dP = \frac{e^2 w^2}{16\pi^2}\left[\frac{1}{(1 - v\cos\theta)^4} - \frac{(1 - v^2)\sin^2\theta\cos^2\phi}{(1 - v\cos\theta)^6}\right] do, \tag{7.166} \]
where θ is again the angle between v and n̂, and ϕ is the azimuthal angle of the vector n̂ relative
to the plane passing through v and w.
The discussion of synchrotron radiation (magnetic bremsstrahlung) can be found in section
74 of The classical theory of fields (L.D.Landau & E.M.Lifshitz).
7.6 The interaction between charged particles and EM field –73/453–
Since
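\[ \int_{t_1}^{t_2} a^2\, dt = \int_{t_1}^{t_2}\dot{\mathbf{v}}\cdot\dot{\mathbf{v}}\, dt = \mathbf{v}\cdot\dot{\mathbf{v}}\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2}\mathbf{v}\cdot\ddot{\mathbf{v}}\, dt = -\int_{t_1}^{t_2}\mathbf{v}\cdot\ddot{\mathbf{v}}\, dt, \tag{7.168} \]
we can get
\[ \mathbf{F}_{\text{rad}} = \frac{e^2}{6\pi}\,\dot{\mathbf{a}}. \tag{7.169} \]
This is known as the Abraham-Lorentz formula for radiation reaction. This equation can only
be applied when the frequency and the intensity of the EM field are not too large, i.e.,
\[ \lambda \gg \frac{e^2}{m}, \qquad B \ll \frac{m^2}{e^3}. \tag{7.170} \]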
The details can be found in section 75 of The classical theory of fields (L.D.Landau & E.M.Lifshitz).
We then derive the relativistic expression for the radiation damping for a single charge, which
is applicable also to motion with velocity comparable to that of light. This force is now a
four-vector g^µ, which must be included in the equation of motion of the charge, written in
four-dimensional form:
\[ m\frac{du^\mu}{d\tau} = eF^{\mu\nu}u_\nu + g^\mu. \tag{7.171} \]
To determine g µ we notice that for v ≪ 1, its three space components must go over into
the components of the vector e2 ȧ/6π. It is easy to see that the vector (e2 /6π) d2 uµ /dτ 2 has
this property. However, it does not satisfy the identity g µ uµ = 0, which is valid for any force
four-vector. In order to satisfy this condition, we must add to the expression given a certain
auxiliary four-vector, made up from the four-velocity uµ and its derivatives. The three space
components of this vector must become zero in the limiting case v = 0. As a result we find
\[ g^\mu = \frac{e^2}{6\pi}\left(\frac{d^2u^\mu}{d\tau^2} + u^\mu u^\nu\frac{d^2u_\nu}{d\tau^2}\right). \tag{7.172} \]
This is called the Abraham-Lorentz-Dirac force.
The integral of the four-force g µ over the world line of the motion of a charge, passing through
a given field, must coincide (except for opposite sign) with the total four-momentum ∆P µ of
the radiation from the charge. The first term in equation above goes to zero on performing the
integration, since at infinity the particle has no acceleration. We integrate the second term by
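parts and get:
\[ -\int g^\mu\, d\tau = \frac{e^2}{6\pi}\int\frac{du^\nu}{d\tau}\frac{du_\nu}{d\tau}\,u^\mu\, d\tau = \Delta P^\mu. \tag{7.173} \]
\[ \ddot{\mathbf{d}} = e\ddot{\mathbf{r}} = \frac{e^2}{m}\mathbf{E}_0(t)\,e^{-i\omega t}. \tag{7.175} \]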
Now we assume the incident direction of the EM wave is x̂, the scattered direction of the EM
wave is n̂′ = (sin θ cos ϕ, sin θ sin ϕ, cos θ). The dipole radiation is
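\[ d\langle P\rangle = \frac{1}{16\pi^2}\left\langle[\operatorname{Re}(\ddot{\mathbf{d}}\times\hat{n}')]^2\right\rangle do = \frac{e^4}{32\pi^2m^2}\,|\mathbf{E}_0\times\hat{n}'|^2\, do, \tag{7.176} \]
where
\[ |\mathbf{E}_0\times\hat{n}'|^2 = -(E_{0y}E_{0z}^* + E_{0y}^*E_{0z})\cos\theta\sin\phi\sin\theta - |E_{0y}|^2\sin^2\phi\sin^2\theta + |E_{0z}|^2\sin^2\theta + |E_{0y}|^2. \tag{7.177} \]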
Notice that
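\[ \langle S\rangle = \langle\operatorname{Re}(\mathbf{E})\cdot\operatorname{Re}(\mathbf{E})\rangle = \frac{1}{2}\langle\mathbf{E}_0\cdot\mathbf{E}_0^*\rangle. \tag{7.178} \]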
The effective cross-section for scattering can be obtained as
\[ d\sigma = \frac{e^4}{16\pi^2m^2}\left[-(\rho_{12} + \rho_{21})\cos\theta\sin\phi\sin\theta - \rho_{11}\sin^2\phi\sin^2\theta + \rho_{22}\sin^2\theta + \rho_{11}\right] do. \tag{7.179} \]
The total cross section is
\[ \sigma = \frac{8\pi}{3}\left(\frac{e^2}{4\pi m}\right)^2. \tag{7.180} \]
If the incident light is completely linearly polarized in the ẑ direction, then we have
\[ d\sigma = \frac{e^4}{16\pi^2m^2}\sin^2\theta\, do. \tag{7.181} \]
If the incident light is unpolarized, we have
\[ d\sigma = \frac{e^4}{32\pi^2m^2}\left(1 + \cos^2\Theta\right) do, \tag{7.182} \]
where cos Θ = cos ϕ sin θ; i.e., Θ is the angle between the directions of the incident and
scattered light.
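The total cross section 7.180 should follow from integrating the unpolarized differential cross section 7.182 over all scattering directions. The sketch below performs this integral with a midpoint rule over θ and ϕ; the function names and the choice e = m = 1 are illustrative assumptions.

```python
import math

def thomson_total_from_angles(e=1.0, m=1.0, n_theta=300, n_phi=300):
    """Integrate e^4/(32 pi^2 m^2) * (1 + cos^2 Theta) over the sphere,
    with cos Theta = cos(phi) * sin(theta) (incident direction along x)."""
    prefactor = e ** 4 / (32 * math.pi ** 2 * m ** 2)
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_theta = math.sin(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            cos_big_theta = math.cos(phi) * sin_theta
            total += (1 + cos_big_theta ** 2) * sin_theta
    return prefactor * total * d_theta * d_phi

def thomson_total_closed_form(e=1.0, m=1.0):
    """Total cross section as written in equation 7.180."""
    return 8 * math.pi / 3 * (e ** 2 / (4 * math.pi * m)) ** 2

numeric = thomson_total_from_angles()
closed = thomson_total_closed_form()
```

Both numbers equal e⁴/(6πm²) up to the discretization error of the midpoint rule.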
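\[ \ddot{\boldsymbol{\xi}} = \frac{e}{m}\mathbf{E}_0\,e^{-i\omega t} - \omega_0^2\boldsymbol{\xi} + \frac{e^2}{6\pi m}\dddot{\boldsymbol{\xi}}. \tag{7.183} \]
Suppose ξ = ξ_0 e^{−iωt}. We can get
\[ \boldsymbol{\xi} = \frac{e\mathbf{E}_0}{m(\omega_0^2 - \omega^2 - i\omega\gamma)}\,e^{-i\omega t} \quad\text{where}\quad \gamma \equiv \frac{e^2\omega^2}{6\pi m}. \tag{7.184} \]
We can show that
\[ \sigma = \sigma_0\,\frac{\omega^4}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2}, \tag{7.185} \]
where σ_0 is the total cross section when the EM wave is scattered by free charges. When ω ≫ ω_0,
σ is independent of ω. When ω ≪ ω_0, σ is proportional to ω^4; this is called Rayleigh
scattering.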
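The two limits of equation 7.185 can be illustrated numerically. In the sketch below the damping γ is treated as a given small constant, which is a simplifying assumption (in the text γ itself depends on ω); the function name and sample values are illustrative.

```python
def cross_section_ratio(omega, omega0, gamma):
    """sigma / sigma_0 from equation 7.185, with gamma treated as a constant."""
    return omega ** 4 / ((omega0 ** 2 - omega ** 2) ** 2 + omega ** 2 * gamma ** 2)

omega0, gamma = 1.0, 1e-6

# Low-frequency (Rayleigh) regime: the ratio scales like (omega / omega0)^4.
low = cross_section_ratio(0.01, omega0, gamma)
rayleigh_estimate = (0.01 / omega0) ** 4

# High-frequency regime: the ratio tends to 1 (free-charge Thomson value).
high = cross_section_ratio(100.0, omega0, gamma)

# Doubling the frequency in the Rayleigh regime multiplies sigma by about 16.
doubling_factor = cross_section_ratio(0.02, omega0, gamma) / low
```

The ω⁴ dependence in the low-frequency regime is the usual explanation for why short wavelengths are scattered much more strongly than long ones.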
Part III
General Relativity
Chapter 8
Theorem 8.1
Vector space A vector space is a collection of objects called vectors, which may be
added together and multiplied (“scaled”) by numbers.
Dual space In mathematics, any vector space V has a corresponding dual vector space
consisting of all linear functionals on V together with a naturally induced linear
structure. For vector space with finite dimensions, we have V = (V ∗ )∗ .
Tensor product In mathematics, the tensor product V ⊗W of two vector spaces V and
W is the vector space generated by the symbols v ⊗ w, with v ∈ V and w ∈ W ,
in which the relations of bilinearity are imposed for the product operation ⊗,
and no other relations are assumed to hold. It is equivalent to the vector space
consisting of all bilinear functionals on V* and W*. It is also the dual vector space
of V* ⊗ W*.
Tensor Suppose V is an n-dimensional vector space over F with dual space V ∗ . The
elements in the tensor product
\[ V^r_s = \underbrace{V\otimes\cdots\otimes V}_{r\ \text{terms}}\otimes\underbrace{V^*\otimes\cdots\otimes V^*}_{s\ \text{terms}} \tag{8.2} \]
are called (r, s) type tensors. Suppose {ei }1≤i≤n and {e∗i }1≤i≤n are dual bases in
V and V ∗, respectively. An (r, s) type tensor x can be uniquely expressed as
Antisymmetrization operator
\[ A_r(x) = \frac{1}{r!}\sum_{\sigma\in P(x)}\operatorname{sgn}\sigma\cdot\sigma x. \tag{8.6} \]
Λr (V ) ≡ Ar (T r (V )), Λ0 (V ) ≡ F, Λ1 (V ) ≡ V. (8.7)
Wedge product
\[ \xi\wedge\eta \equiv \frac{(k + l)!}{k!\,l!}A_{k+l}(\xi\otimes\eta) \quad\text{where}\quad \xi\in\Lambda^k(V),\ \eta\in\Lambda^l(V). \tag{8.8} \]
Pull-back mapping f : V → W is a linear mapping. We define f ∗ : Λr (W ∗ ) →
Λr (V ∗ ) as
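\[ (\xi_1 + \xi_2)\wedge\eta = \xi_1\wedge\eta + \xi_2\wedge\eta; \tag{8.10} \]
\[ \xi\wedge(\eta_1 + \eta_2) = \xi\wedge\eta_1 + \xi\wedge\eta_2; \tag{8.11} \]
\[ \xi\wedge\eta = (-1)^{kl}\,\eta\wedge\xi; \tag{8.12} \]
\[ (\xi\wedge\eta)\wedge\zeta = \xi\wedge(\eta\wedge\zeta) = \frac{(k + l + h)!}{k!\,l!\,h!}A_{k+l+h}(\xi\otimes\eta\otimes\zeta); \tag{8.13} \]
\[ f^*(\phi\wedge\psi) = f^*\phi\wedge f^*\psi. \tag{8.14} \]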
–80/453– Chapter 8 Elementary Differential Geometry
Fiber bundle A fiber bundle is a space that is locally a product space, but globally
may have a different topological structure. Specifically, the similarity between
a space E and a product space B × F is defined using a continuous surjective
map π : E → B that in small regions of E behaves just like a projection from
corresponding regions of B × F to B. The map π, called the projection or sub-
mersion of the bundle, is regarded as part of the structure of the bundle. The
space E is known as the total space of the fiber bundle, B as the base space, and
F the fiber.
Vector Bundle A vector bundle is a topological construction that makes precise the
idea of a family of vector spaces parameterized by another space X: to every
point x of the space X we associate a vector space V (x) in such a way that these
vector spaces fit together to form another space of the same kind as X, which is
then called a vector bundle over X.
♡
Tangent bundle In differential geometry, the tangent bundle of a differentiable man-
ifold M is a manifold T M , which assembles all the tangent vectors in M . As a
set, it is given by the disjoint union of the tangent spaces of M . That is,
\[ TM = \bigsqcup_{x\in M}T_xM = \bigcup_{x\in M}\{x\}\times T_xM = \bigcup_{x\in M}\{(x, y)\mid y\in T_xM\}, \tag{8.19} \]
Theorem 8.2
Theorem 8.3
Theorem 8.4
[X, Y ] ≡ X ◦ Y − Y ◦ X (8.20) ♣
Proposition 8.3
Suppose X is a smooth tangent vector field on M and ϕt is the one parameter differen-
tiable transformation group inducing it. Denote the trajectory of ϕt through x by γx (t).
Thus we have linear isomorphism
\[ (\phi_t^{-1})_* = (\phi_{-t})_* : T_{\gamma_x(t)}M \to T_xM; \tag{8.27} \]
\[ (\phi_t)^* : T^*_{\gamma_x(t)}M \to T^*_xM. \tag{8.28} \]
Proposition 8.4
Theorem 8.5
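\[ \tau(v_1,\cdots,v_r)|_U = \tau_{|i_1\cdots i_r|}\begin{vmatrix} v_1^{i_1} & \cdots & v_r^{i_1} \\ \vdots & & \vdots \\ v_1^{i_r} & \cdots & v_r^{i_r} \end{vmatrix}. \tag{8.40} \]
\[ f^*\phi|_U = \frac{1}{r!}(\phi_{\alpha_1\cdots\alpha_r}\circ f)\cdot\frac{\partial f^{\alpha_1}}{\partial x^{i_1}}\cdots\frac{\partial f^{\alpha_r}}{\partial x^{i_r}}\,dx^{i_1}\wedge\cdots\wedge dx^{i_r}; \tag{8.41} \]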
and
f ∗ (ϕ ∧ ψ) = f ∗ ϕ ∧ f ∗ ψ. (8.42)
8.5 Exterior differential –85/453–
Proposition 8.6
∀ω ∈ Λ1 (M ), X, Y ∈ T (M ),
∀ω ∈ Λr (M ), X1 , · · · , Xr+1 ∈ T (M ),
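\[ \begin{aligned} d\omega(X_1,\cdots,X_{r+1}) = {} & \sum_{i=1}^{r+1}(-1)^{i+1}X_i\left(\langle X_1\wedge\cdots\wedge\hat{X}_i\wedge\cdots\wedge X_{r+1},\,\omega\rangle\right) \\ & + \sum_{1\le i<j\le r+1}(-1)^{i+j}\langle[X_i, X_j]\wedge\cdots\wedge\hat{X}_i\wedge\cdots\wedge\hat{X}_j\wedge\cdots\wedge X_{r+1},\,\omega\rangle. \end{aligned} \tag{8.44} \]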
Proposition 8.7
1. d2 = 0.
2. Suppose U = B_0(r) is a spherical neighbourhood in R^n centered at the origin O with radius
r. Then for every ω ∈ Λ^r(U) with dω = 0, there exists τ ∈ Λ^{r−1}(U) such
that ω = dτ.
\[ \frac{\partial y^\alpha}{\partial x^i} = f_i^\alpha(x^1,\cdots,x^m, y^1,\cdots,y^n) \qquad (1\le i\le m,\ 1\le\alpha\le n). \tag{8.46} \]
f_i^α(x, y) is a smooth function on the open set U × V ⊂ R^m × R^n. The system of equations
can be written as Pfaff equations on U × V
y α = g α (x1 , . . . , xm ), (8.48)
ω α = 0, (1 ≤ α ≤ r). (8.51) ♡
dω α ∧ ω 1 ∧ · · · ∧ ω r = 0. (8.52) ♣
is completely integrable.
Suppose α : [0, 1] → M is a path on M . For all t ∈ [0, 1], assign an orientation for
Tα(t) M , denoted by µt . If for t0 ∈ [0, 1], there is a local coordinate (U ; xi ) of α(t0 ) and
a neighbourhood [t0 − δ1 , t0 + δ2 ] of t0 that
α([t0 − δ1 , t0 + δ2 ]) ⊂ U (8.53)
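and
\[ \left(\frac{\partial}{\partial x^1},\ldots,\frac{\partial}{\partial x^m}\right)\bigg|_{\alpha(t)}\in\mu_t, \qquad \forall t\in[t_0 - \delta_1, t_0 + \delta_2], \tag{8.54} \]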
Proposition 8.10
A topological manifold with boundary is a Hausdorff space in which every point has a
neighbourhood homeomorphic to an open subset of Euclidean half-space (for a fixed
n):
\[ \mathbb{R}^n_+ = \{(x_1,\ldots,x_n)\in\mathbb{R}^n : x_n\ge 0\}. \tag{8.56} \]
Suppose M is a manifold with boundary. The interior of M , denoted Int M , is the set of
points in M which have neighbourhoods homeomorphic to an open subset of Rn . The
boundary of M , denoted ∂M , is the complement of Int M in M . The boundary points
can be characterized as those points which land on the boundary hyperplane (x_n = 0)
of Rn+ under some coordinate chart. If M is a manifold with boundary of dimension
n, then Int M is a manifold (without boundary) of dimension n and ∂M is a manifold
(without boundary) of dimension n − 1.
Theorem 8.8
Ũ = U ∩ ∂M = {(x1 , . . . , xm ) ∈ U : xm = 0} ̸= ∅ (8.57) ♡
Z Z ! Z X
X XZ
ϕ= gα ·ϕ= (gα · ϕ) = gα · ϕ
M M M M
XZ XZ
α α α
♡
= gα · ϕ = f (w1 , · · · , wm ) dw1 ∧ · · · ∧ dwm
Wα Wα
XZ
α α
8.6 Connection
satisfying that ♡
1. ∀s1 , s2 ∈ Γ(E), D(s1 + s2 ) = Ds1 + Ds2 .
2. ∀s ∈ Γ(E) and α ∈ C ∞ (M ), D(αs) = dα ⊗ s + αDs.
If X is a smooth tangent vector field on M , s ∈ Γ(E), then DX s ≡ ⟨X, Ds⟩, called
absolute derivative of s along X.
Proposition 8.11
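\[ Ds_\alpha = \sum_{1\le i\le m,\ 1\le\beta\le q}\Gamma^\beta{}_{\alpha i}\,du^i\otimes s_\beta = \sum_{\beta=1}^{q}\omega_\alpha{}^\beta\otimes s_\beta, \tag{8.62} \]
where
\[ \omega_\alpha{}^\beta \equiv \sum_{1\le i\le m}\Gamma^\beta{}_{\alpha i}\,du^i. \tag{8.63} \]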
It can be written compactly as
Ds = ω ⊗ S. (8.64)
If we use a new base S ′ = A · S, we have
Theorem 8.11
Theorem 8.12
Ω ≡ dω − ω ∧ ω. (8.67) ♡
Ω′ = A · Ω · A−1 . (8.68) ♠
Suppose X, Y are two arbitrary tangent vector fields on M. Suppose s can be expressed
as \(s = \sum_{\alpha=1}^{q}\lambda^\alpha s_\alpha|_p\) using the local frame. The curvature operator is defined as
\[ R(X, Y)s \equiv \sum_{\alpha,\beta=1}^{q}\lambda^\alpha\,\Omega_\alpha{}^\beta(X, Y)\,s_\beta|_p. \tag{8.69} \]
The transformation law of curvature matrix ensures that curvature operator is indepen-
dent of the choice of local coordinates.
Proposition 8.13
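\[ D\frac{\partial}{\partial u^i} \equiv \omega_i{}^j\otimes\frac{\partial}{\partial u^j} \equiv \Gamma^j{}_{ik}\,du^k\otimes\frac{\partial}{\partial u^j}. \tag{8.75} \]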
Proposition 8.14
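\[ DX = \left(X^i{}_{,j} + X^k\Gamma^i{}_{kj}\right)du^j\otimes\frac{\partial}{\partial u^i} = X^i{}_{;j}\,du^j\otimes\frac{\partial}{\partial u^i}; \tag{8.77} \]
\[ D\alpha = \left(\alpha_{i,j} - \alpha_k\Gamma^k{}_{ij}\right)du^j\otimes du^i = \alpha_{i;j}\,du^j\otimes du^i. \tag{8.78} \]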
\[ \frac{d^2u^i}{dt^2} + \Gamma^i{}_{jk}\frac{du^j}{dt}\frac{du^k}{dt} = 0. \tag{8.79} \]
8.7 Riemannian manifold –93/453–
\[ \Omega_i{}^j = \frac{1}{2}R^j{}_{ikl}\,du^k\wedge du^l. \tag{8.80} \]
\[ R \equiv R^j{}_{ikl}\,\frac{\partial}{\partial u^j}\otimes du^i\otimes du^k\otimes du^l. \tag{8.81} \]
Proposition 8.15
\[ T^j{}_{ik} \equiv \Gamma^j{}_{ki} - \Gamma^j{}_{ik}, \qquad T \equiv T^j{}_{ik}\,\frac{\partial}{\partial u^j}\otimes du^i\otimes du^k. \tag{8.85} \]
Proposition 8.16
\[ T(X, Y) = T^k{}_{ij}X^iY^j\frac{\partial}{\partial u^k} = D_XY - D_YX - [X, Y]. \tag{8.86} \]
Theorem 8.14
Theorem 8.15
Theorem 8.16
Let a given point in a Riemannian manifold be O and consider some nearby point P. If
P is close enough to O then there exists a unique geodesic joining O to P. Let X^i be the
components of the unit tangent vector to this geodesic at O and let s be the geodesic arc
length measured from O to P. Then the Riemann normal coordinates of P are defined
to be u^i = sX^i. One trivial consequence of this definition is that all geodesics through
O are of the form u^i(s) = sX^i and that the X^i are constant along each geodesic.
Proposition 8.18
Theorem 8.18
Theorem 8.19
Define
\[ K(E) = \frac{R(X,Y,X,Y)}{G(X,Y,X,Y)} \tag{8.98} \]
Theorem 8.20
9.1 Introduction
In Newtonian theory, there is a global coordinate system (t, x, y, z) for the whole spacetime, where
t is the time coordinate and (x, y, z) are Euclidean space coordinates. The equation of motion of
a particle is
\[ \frac{d^2t}{d\lambda^2} = 0, \qquad \frac{d^2x^i}{d\lambda^2} + \frac{\partial\Phi}{\partial x^i}\left(\frac{dt}{d\lambda}\right)^2 = 0. \tag{9.1} \]
We can endow the spacetime manifold with a connection whose only nonvanishing components are
$\Gamma^{i}{}_{00} = \Phi_{,i}$. Then the equation of motion can be written as a geodesic equation,
\[ \frac{d^2x^\alpha}{d\lambda^2} + \Gamma^{\alpha}{}_{\beta\gamma}\frac{dx^\beta}{d\lambda}\frac{dx^\gamma}{d\lambda} = 0. \tag{9.2} \]
The Riemann tensor of the given connection is
\[ R^{i}{}_{0j0} = -R^{i}{}_{00j} = \frac{\partial^2\Phi}{\partial x^i\partial x^j}, \tag{9.3} \]
and all other components vanish. The Ricci tensor is defined as the contraction of the first and
third indices of the Riemann curvature tensor, i.e., $R_{\mu\nu} \equiv R^{\alpha}{}_{\mu\alpha\nu}$. For Newtonian theory, we
have
\[ R_{00} = \Phi_{,ii}, \tag{9.4} \]
and all other components vanish. As a result, Newton’s law of universal gravitation can be
expressed as
\[ R_{00} = 4\pi\rho. \tag{9.5} \]
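As a consistency check on (9.3) to (9.5), the curvature can be worked out directly from the connection. The sketch below assumes the standard coordinate-basis formula for the Riemann tensor, with the sign convention that reproduces (9.3), and writes $x^0 = t$; these conventions are assumptions rather than something stated in the notes.

```latex
\begin{align*}
R^{\alpha}{}_{\beta\gamma\delta}
  &= \partial_\gamma \Gamma^{\alpha}{}_{\beta\delta}
   - \partial_\delta \Gamma^{\alpha}{}_{\beta\gamma}
   + \Gamma^{\alpha}{}_{\mu\gamma}\Gamma^{\mu}{}_{\beta\delta}
   - \Gamma^{\alpha}{}_{\mu\delta}\Gamma^{\mu}{}_{\beta\gamma}, \\
R^{i}{}_{0j0}
  &= \partial_j \Gamma^{i}{}_{00} - \partial_0 \Gamma^{i}{}_{0j}
   + \Gamma^{i}{}_{\mu j}\Gamma^{\mu}{}_{00} - \Gamma^{i}{}_{\mu 0}\Gamma^{\mu}{}_{0j}
   = \frac{\partial^2 \Phi}{\partial x^i\,\partial x^j}, \\
R_{00} &= R^{\alpha}{}_{0\alpha 0} = R^{i}{}_{0i0} = \Phi_{,ii}.
\end{align*}
```

The quadratic terms drop out because the only nonzero coefficients are $\Gamma^{i}{}_{00}$, so every product contains a vanishing factor. Combined with Poisson's equation $\Phi_{,ii} = 4\pi\rho$, the last line gives (9.5).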
Regard absolute time t as a scalar field defined once and for all in Newtonian spacetime t =
t(P). The layers of spacetime are the slices of constant t – the “space slices” – each of which
has an identical geometric structure: the old “absolute space”.
9.3 Geometry formulation of Newtonian gravity –99/453–
Curvature of spacetime
Parallel transport a vector around a closed curve lying entirely in a space slice; it will return
to its starting point unchanged. But transport it forward in time by ∆t, northerly in space by
∆xk , back in time by −∆t, and southerly by −∆xk to its starting point; it will return changed
by
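\[ \delta\mathbf{A} = -\mathbf{R}\left(\Delta t\,\frac{\partial}{\partial t},\ \Delta x^k\frac{\partial}{\partial x^k}\right)\mathbf{A}. \tag{9.6} \]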
Geodesics of a space slice (Euclidean straight lines) that are initially parallel remain always
parallel. But geodesics of spacetime (trajectories of freely falling particles) initially parallel get
pried apart or pushed together by spacetime curvature,
\[ \nabla_{\mathbf u}\nabla_{\mathbf u}\mathbf n + \mathbf R(\mathbf n,\mathbf u)\mathbf u = 0. \tag{9.7} \]
Note: if w is a spatial vector field, then ∇u w is also spatial for every u .
3. Spatial vectors are unchanged by parallel transport around infinitesimal closed curves;
i.e.,
\[ \mathbf R(\mathbf n,\mathbf u)\mathbf w = 0 \ \text{ if } \mathbf w \text{ is spatial, for every } \mathbf u \text{ and } \mathbf n. \tag{9.9} \]
4. All vectors are unchanged by parallel transport around infinitesimal, spatial, closed
curves; i.e.,
\[ \mathbf R(\mathbf v,\mathbf w) = 0 \ \text{ for every spatial } \mathbf v \text{ and } \mathbf w. \tag{9.10} \]
6. There exists a metric · defined on spatial vectors only, which is compatible with the
covariant derivative in this sense: for any spatial w and v , and for any u whatsoever,
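\[ \nabla_{\mathbf u}(\mathbf w\cdot\mathbf v) = (\nabla_{\mathbf u}\mathbf w)\cdot\mathbf v + \mathbf w\cdot(\nabla_{\mathbf u}\mathbf v). \tag{9.12} \]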
Note: Axioms (1), (2), and (3) guarantee that such a spatial metric can exist.
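\[ \mathbf J(\mathbf u,\mathbf n)\mathbf p = \frac{1}{2}\left[\mathbf R(\mathbf p,\mathbf n)\mathbf u + \mathbf R(\mathbf p,\mathbf u)\mathbf n\right] \tag{9.13} \]
is “self-adjoint” when operating on spatial vectors, i.e.,
\[ \mathbf v\cdot[\mathbf J(\mathbf u,\mathbf n)\mathbf w] = \mathbf w\cdot[\mathbf J(\mathbf u,\mathbf n)\mathbf v] \ \text{ for all spatial } \mathbf v, \mathbf w; \text{ and for any } \mathbf u, \mathbf n. \tag{9.14} \]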
8. “Ideal rods” measure the lengths that are calculated with the spatial metric; “ideal clocks”
measure universal time t; and “freely falling particles” move along geodesics of ∇.
\[ \frac{\partial^2\Phi}{\partial x^i\partial x^i} = 4\pi\rho. \tag{9.15} \]
\[ \frac{d^2x^i}{dt^2} + \frac{\partial\Phi}{\partial x^i} = 0. \tag{9.16} \]
4. “Ideal rods” measure the Galilean coordinate lengths; “ideal clocks” measure universal
time.
\[ x^0(\mathcal P) = t(\mathcal P), \qquad \frac{\partial}{\partial x^i}\cdot\frac{\partial}{\partial x^j} = \delta_{ij}, \tag{9.17} \]
\[ \Gamma^{j}{}_{00} = \Phi_{,j} \ \text{ for some scalar field } \Phi, \text{ all other components vanish.} \tag{9.18} \]
Were all the matter in the universe concentrated in a finite region of space and surrounded by
emptiness (“island universe”), then one could impose the global boundary condition $\Phi \to 0$
as $r \equiv (x^ix^i)^{1/2} \to \infty$. This would single out a subclass of Galilean coordinates (“absolute”
Galilean coordinates), with a unique, common Newtonian potential. The transformation from
one absolute Galilean coordinate system to any other is called a Galilean transformation.
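\[ \Gamma'^{i}{}_{00} = \frac{\partial x'^i}{\partial x^j}\Gamma^{j}{}_{00}, \qquad \Gamma'^{i}{}_{jk} = \frac{\partial^2x^p}{\partial x'^j\partial x'^k}\frac{\partial x'^i}{\partial x^p}. \tag{9.24} \]
The equation of motion in these coordinates is
\[ \frac{d^2t'}{d\lambda^2} = 0, \qquad \frac{d^2x'^i}{d\lambda^2} + \Gamma'^{i}{}_{00}\frac{dt'}{d\lambda}\frac{dt'}{d\lambda} + \Gamma'^{i}{}_{jk}\frac{dx'^j}{d\lambda}\frac{dx'^k}{d\lambda} = 0, \tag{9.25} \]
or compactly,
\[ \frac{d^2x'^i}{dt^2} + \Gamma'^{i}{}_{00} + \Gamma'^{i}{}_{jk}\frac{dx'^j}{dt}\frac{dx'^k}{dt} = 0. \tag{9.26} \]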
We can derive that
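\[ \Gamma'^{i}{}_{jk} = \frac{1}{2}g'^{il}\left(\frac{\partial g'_{lj}}{\partial x'^k} + \frac{\partial g'_{lk}}{\partial x'^j} - \frac{\partial g'_{jk}}{\partial x'^l}\right), \qquad \Gamma'^{i}{}_{00} = g'^{ij}\frac{\partial\Phi}{\partial x'^j}. \tag{9.27} \]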
The Hodge star operator on a vector space V with a non-degenerate symmetric bilinear
form (herein referred to as the inner product) is a linear operator on the exterior algebra
of V , mapping k-vectors to (n − k)-vectors where n = dim V , for 0 ≤ k ≤ n. It has
the following property, which defines it completely: given two k-vectors α, β,
where ⟨·, ·⟩ denotes the inner product on k-vectors and ω is the preferred unit n-vector.
The inner product ⟨·, ·⟩ on k-vectors is extended from that on V by requiring that
where (i1 , i2 , · · · , in ) is an even permutation of {1, 2, · · · , n}. Of these n!/2, only Cnk are
independent. The first one in the usual lexicographical order reads
Proposition 10.1
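\[ \varepsilon^{i_1,\cdots,i_n} = g^{i_1j_1}\cdots g^{i_nj_n}\,\varepsilon_{j_1,\cdots,j_n} = \frac{\sqrt{|g|}}{g}\,\epsilon^{i_1,\cdots,i_n} = \frac{\operatorname{sgn}(g)}{\sqrt{|g|}}\,\epsilon^{i_1,\cdots,i_n}. \tag{10.6} \]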
Using tensor index notation, the Hodge dual is obtained by contracting the indices of a k-form
with the n-dimensional completely antisymmetric Levi-Civita tensor.
Proposition 10.2
\[ (\star\eta)_{i_1,i_2,\ldots,i_{n-k}} = \frac{1}{k!}\,\eta^{j_1,\ldots,j_k}\,\varepsilon_{j_1,\ldots,j_k,i_1,\ldots,i_{n-k}}, \tag{10.7} \]
where η is an arbitrary antisymmetric tensor in k indices.
Gβ δ ≡ Ḡµβ µδ . (10.11)
4. The Bianchi identity takes a particularly simple form when rewritten in terms of the
double dual:
Ḡαβ γδ;δ = 0, (10.12)
and it has the obvious consequence
Gβδ;δ = 0. (10.13)
–104/453– Chapter 10 More on the Geometry of Spacetime
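5. The Ricci curvature tensor $R_{\beta\delta} = R^{\mu}{}_{\beta\mu\delta}$, which is symmetric, and the curvature scalar
$R = R^{\beta}{}_{\beta}$ are related to the Einstein tensor by
\[ G_\beta{}^{\delta} = R_\beta{}^{\delta} - \frac{1}{2}\delta^{\delta}_{\beta}R. \tag{10.14} \]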
\[ \mathcal L_\xi g = 0. \tag{10.16} \]
is called a Killing vector field; it keeps the metric invariant and therefore corresponds to a
space-time symmetry. A few lines of algebra show that it is equivalent to define a Killing
vector field by
\[ \xi_{\mu;\nu} + \xi_{\nu;\mu} = 0. \tag{10.17} \]
Derivatives of Killing vectors can be related to the Riemann tensor by
This shows that from the values of $\xi^\mu$ and $\xi_{\mu;\nu}$ at a given point one can determine the Killing
vector field uniquely. One should then specify N values for $\xi^\mu$ and N(N−1)/2 values for $\xi_{\mu;\nu}$
(which is antisymmetric by (10.17)), so that there are at most N(N+1)/2 linearly independent
Killing vector fields. For N = 4, there are at most 10 Killing vectors, which is precisely the
dimension of the Poincaré group of Minkowski space.
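The counting argument above can be tabulated with a few lines of Python. This is only an illustrative sketch; the function name `max_killing_vectors` is hypothetical and not part of the notes.

```python
def max_killing_vectors(n: int) -> int:
    """Upper bound N(N+1)/2 on independent Killing vector fields in N dimensions."""
    if n < 1:
        raise ValueError("dimension must be a positive integer")
    values = n                       # components of xi at the chosen point
    derivatives = n * (n - 1) // 2   # independent components of the antisymmetric xi_{mu;nu}
    return values + derivatives


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, max_killing_vectors(n))
```

For N = 4 the bound splits into 4 translations-like values plus 6 antisymmetric derivative components, matching the 10 generators mentioned in the text.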
A space-time enjoying the maximum number of Killing vector fields is called a maximally
symmetric space-time. It can be shown from the Killing equations that the Riemann tensor must
then satisfy
\[ R^{\rho}{}_{\sigma\mu\nu} = \frac{R}{N(N-1)}\left(\delta^{\rho}_{\mu}\,g_{\sigma\nu} - \delta^{\rho}_{\nu}\,g_{\sigma\mu}\right). \tag{10.19} \]
After contraction, we have
\[ R_{\sigma\nu} = \frac{R}{N}\,g_{\sigma\nu}. \tag{10.20} \]
Also, the Ricci scalar must be a constant. Maximally symmetric spaces are thus spaces of
constant curvature. For N = 4, there are three maximally symmetric space-times: Minkowski,
de Sitter and anti-de Sitter.
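The contraction leading from (10.19) to (10.20) can be spelled out as follows. This short check assumes the Ricci convention $R_{\mu\nu} \equiv R^{\alpha}{}_{\mu\alpha\nu}$ stated in Section 9.1 and contracts the first index of (10.19) with the third.

```latex
\begin{align*}
R_{\sigma\nu} = R^{\rho}{}_{\sigma\rho\nu}
 &= \frac{R}{N(N-1)}\left(\delta^{\rho}_{\rho}\, g_{\sigma\nu} - \delta^{\rho}_{\nu}\, g_{\sigma\rho}\right)
  = \frac{R}{N(N-1)}\,(N-1)\, g_{\sigma\nu}
  = \frac{R}{N}\, g_{\sigma\nu}, \\
g^{\sigma\nu} R_{\sigma\nu} &= \frac{R}{N}\, \delta^{\sigma}_{\sigma} = R .
\end{align*}
```

The second line confirms that the normalization in (10.19) is consistent with R being the Ricci scalar.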
10.4 The coordinates of observer –105/453–
3. The tetrad changes from point to point along the observer’s world line, relative to parallel
transport:
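\[ \nabla_{\mathbf u}\mathbf e_{\hat\alpha} = -\mathbf\Omega\cdot\mathbf e_{\hat\alpha}, \quad\text{where}\quad \Omega^{\mu\nu} = a^\mu u^\nu - u^\mu a^\nu + u_\alpha\omega_\beta\,\varepsilon^{\alpha\beta\mu\nu}. \tag{10.26} \]
Here $\mathbf a \equiv \nabla_{\mathbf u}\mathbf u$ is the acceleration of the observer and we have
\[ \mathbf u\cdot\mathbf a = \mathbf u\cdot\boldsymbol\omega = 0. \tag{10.27} \]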
If ω were zero, the observer would be Fermi-Walker-transporting his tetrad (gyroscope-
type transport). If both a and ω were zero, he would be freely falling (geodesic motion)
and would be parallel-transporting his tetrad.
4. The observer constructs his proper reference frame in a manner analogous to the Riemann-
normal construction. From each event P0 (τ ) on his world line, he sends out purely spa-
tial geodesics (geodesics orthogonal to u ), with affine parameter equal to proper length,
P = G[τ, n , s], (10.28)
where τ is proper time, telling “starting point” of geodesic, n is tangent vector to geodesic
at starting point, telling “which” geodesic, and s is proper length along geodesic from
starting point, telling “where” on geodesic. The tangent vector has unit length, because
the chosen affine parameter is proper length.
5. Each event near the observer’s world line is intersected by precisely one of the geodesics
G[τ, n, s]. Far away, this is not true; the geodesics may cross, either because of the
observer’s acceleration or because of the curvature of spacetime.
6. Pick an event P near the observer’s world line. The geodesic through it originated on the
observer’s world line at a specific time τ , had original direction n = nĵ e ĵ ; and needed
to extend a distance s before reaching P. Hence, the four numbers
are a natural way of identifying the event P. These are the coordinates of P in the ob-
server’s proper reference frame.
10.5 Hypersurfaces
10.5.1 Description of hypersurfaces
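Note: We only discuss timelike and spacelike hypersurfaces in this section.
Normal vector
\[ n^\alpha n_\alpha = \epsilon \equiv \begin{cases} -1 & \text{if } \Sigma \text{ is spacelike} \\ +1 & \text{if } \Sigma \text{ is timelike} \end{cases} \tag{10.33} \]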
Induced metric
\[ e^{\alpha}_{a} = \frac{\partial x^\alpha}{\partial y^a}. \tag{10.34} \]
where hab ≡ gαβ eαa eβb . The completeness relation can be written as
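\[ \varepsilon_{\alpha_x\alpha_1\cdots\alpha_{m-1}}\,\frac{\partial x^{\alpha_x}}{\partial y^{\rm in}}\,e^{\alpha_1}_{1}\cdots e^{\alpha_{m-1}}_{m-1} < 0. \tag{10.37} \]
If we demand that the direction of $n^\alpha$ is the opposite of $\partial x^\alpha/\partial y^{\rm in}$, then we have
\[ \varepsilon_{\alpha_x\alpha_1\cdots\alpha_{m-1}}\,n^{\alpha_x}\,e^{\alpha_1}_{1}\cdots e^{\alpha_{m-1}}_{m-1} > 0. \tag{10.38} \]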
Surface element
Element of two-surface
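\[ e^{a}_{A} = \frac{\partial y^a}{\partial\theta^A}, \qquad e^{\alpha}_{A} = \frac{\partial x^\alpha}{\partial\theta^A} = e^{\alpha}_{a}e^{a}_{A}; \tag{10.42} \]
\[ \sigma_{AB} = h_{ab}\,e^{a}_{A}e^{b}_{B} = g_{\alpha\beta}\,e^{\alpha}_{A}e^{\beta}_{B}; \tag{10.43} \]
\[ h^{ab} = \epsilon_r r^ar^b + \sigma^{AB}e^{a}_{A}e^{b}_{B}; \tag{10.44} \]
\[ g^{\alpha\beta} = \epsilon_n n^\alpha n^\beta + \epsilon_r r^\alpha r^\beta + \sigma^{AB}e^{\alpha}_{A}e^{\beta}_{B}. \tag{10.45} \]
If we demand that the direction of $r^a$ is the opposite of that of $\partial y^a/\partial\theta^{\rm in}$, then the condition of
compatibility can be written as
\[ \varepsilon_{\mu\nu\beta\gamma}\,n^\mu r^\nu e^{\beta}_{2}e^{\gamma}_{3} > 0. \tag{10.46} \]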
We define the surface element of a two-surface as
Gauss-Stokes theorem
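\[ \frac{\partial\det A}{\partial A_{ab}} = (\det A)\,(A^{-1})_{ba}. \tag{10.50} \]
\[ \int_V A^{\alpha}{}_{;\alpha}\sqrt{-g}\,d^4x = \oint_{\partial V} A^\alpha\,d\Sigma_\alpha = \oint_{\partial V}\epsilon A^\alpha n_\alpha\sqrt{|h|}\,d^3y. \tag{10.55} \]
\[ \int_\Sigma B^{\alpha\beta}{}_{;\beta}\,d\Sigma_\alpha = \frac{1}{2}\oint_{\partial\Sigma} B^{\alpha\beta}\,dS_{\alpha\beta} = \oint_{\partial\Sigma}\epsilon_n\epsilon_r B^{\alpha\beta}n_\alpha r_\beta\sqrt{|\sigma|}\,d^2\theta. \tag{10.56} \]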
Extrinsic curvature
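The extrinsic curvature of the hypersurface is defined as
\[ K_{ab} \equiv \frac{Dn_\alpha}{Dy^b}\,e^{\alpha}_{a} = n_{\alpha;\beta}\,e^{\alpha}_{a}e^{\beta}_{b}. \tag{10.61} \]
In terms of this, we have
\[ A^{\alpha}{}_{;\beta}\,e^{\beta}_{b} = A^{a}{}_{|b}\,e^{\alpha}_{a} - \epsilon A^aK_{ab}\,n^\alpha. \tag{10.62} \]
We notice that if $e^{\alpha}_{a}$ is substituted in place of $A^\alpha$, we can obtain
This is known as the Gauss-Weingarten equation. We can prove that $K_{ab}$ is a symmetric tensor.
Thus, we have
\[ K_{ab} = \frac{1}{2}(\mathcal L_n g_{\alpha\beta})\,e^{\alpha}_{a}e^{\beta}_{b}. \tag{10.64} \]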
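\[ R^{\mu}{}_{\alpha\beta\gamma}\,e^{\alpha}_{a}e^{\beta}_{b}e^{\gamma}_{c} = {}^{(3)}R^{m}{}_{abc}\,e^{\mu}_{m} + \epsilon(K_{ab|c} - K_{ac|b})\,n^\mu + \epsilon K_{ab}\,n^{\mu}{}_{;\gamma}e^{\gamma}_{c} - \epsilon K_{ac}\,n^{\mu}{}_{;\beta}e^{\beta}_{b}. \tag{10.66} \]
\[ -2\epsilon\,G_{\alpha\beta}n^\alpha n^\beta = {}^{3}R + \epsilon(K^{ab}K_{ab} - K^2), \qquad G_{\alpha\beta}\,e^{\alpha}_{a}n^\beta = K^{b}{}_{a|b} - K_{,a}. \tag{10.67} \]
\[ R = {}^{3}R + \epsilon(K^2 - K^{ab}K_{ab}) + 2\epsilon\,(n^{\alpha}{}_{;\beta}n^\beta - n^\alpha n^{\beta}{}_{;\beta})_{;\alpha}. \tag{10.68} \]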
Chapter 11
Formulation of General Relativity
Give the fields that generate mass-energy, and their time-rates of change, and give 3-geometry
of space and its time-rate of change, all at one time, and solve for the 4-geometry of spacetime
at that one time. Four of the ten components of Einstein’s law connect the curvature of space
here and now with the distribution of mass-energy here and now, and the other six equations
tell how the geometry as thus determined then proceeds to evolve.
3. All special relativistic laws of physics are valid in local Lorentz frames of metric.
Here R is the curvature scalar of the spacetime, K is the extrinsic curvature scalar of ∂V and
$K_0$ is the extrinsic curvature scalar of ∂V when embedded in flat spacetime. The variation of
the Hilbert term is given by
\[ (16\pi)\,\delta S_H = \int_V G_{\alpha\beta}\,\delta g^{\alpha\beta}\sqrt{-g}\,d^4x - \oint_{\partial V}\epsilon\,h^{\alpha\beta}\,\delta g_{\alpha\beta,\mu}\,n^\mu\sqrt{|h|}\,d^3y. \tag{11.13} \]
The variation of the nondynamical term is zero. It is used to eliminate the infinity in the boundary
term. The variation of the matter term is
\[ \delta S_M = \int_V\left(\frac{\partial L}{\partial g^{\alpha\beta}} - \frac{1}{2}Lg_{\alpha\beta}\right)\delta g^{\alpha\beta}\sqrt{-g}\,d^4x. \tag{11.15} \]
11.3 Hamiltonian formulation –113/453–
Define
\[ T_{\alpha\beta} \equiv -2\frac{\partial L}{\partial g^{\alpha\beta}} + Lg_{\alpha\beta}. \tag{11.16} \]
The variational principle then leads to Einstein’s equations
The spacetime can be foliated by spacelike hypersurfaces $\Sigma_t$ that are described by a scalar function
$t(x^\alpha)$, as shown in 11.1. Here t is a single-valued function and the unit normal to the hypersurfaces,
$n_\alpha \propto \partial_\alpha t$, is a future-directed timelike vector field.
Consider a congruence of curves γ intersecting Σt . We use t as a parameter on the curves
and the vector tα is tangent to the congruence (tα ∂α t = 1). Install coordinates y a on Σt and
impose y a (P ′′ ) = y a (P ′ ) = y a (P ), so y a is held constant on each member of the congruence.
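This construction defines a coordinate system $(t, y^a)$ in V. Base vectors of the frame $(t, y^a)$ are
given by
\[ t^\alpha = \left(\frac{\partial x^\alpha}{\partial t}\right)_{y^a}, \qquad e^{\alpha}_{a} = \left(\frac{\partial x^\alpha}{\partial y^a}\right)_{t}. \tag{11.18} \]
The normal vector of the hypersurface is given through
\[ t^\alpha = Nn^\alpha + N^ae^{\alpha}_{a}. \tag{11.20} \]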
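Example: For the electromagnetic field in 3+1 decomposition form, we define the electric and
magnetic fields by
\[ E_a \equiv F_{\alpha\beta}\,n^\beta e^{\alpha}_{a}, \qquad \epsilon_{abc}B^c \equiv F_{\alpha\beta}\,e^{\alpha}_{a}e^{\beta}_{b}. \tag{11.27} \]
With this definition, the equation of motion of a particle in an electromagnetic field can be written
as
\[ ma_a = \gamma e\left[NE_a + \epsilon_{abc}(v^b + N^b)B^c\right], \tag{11.28} \]
where $a_a \equiv a_\alpha e^{\alpha}_{a}$, $\gamma \equiv [N^2 - (N^b + v^b)(N_b + v_b)]^{-1/2}$ and $v^a \equiv dy^a/dt$. If we adopt the
coordinates $(t, y^a)$ at first, it is easy to verify that
\[ E^a \equiv h^{ab}E_b = NF^{0a}, \qquad B^a = \frac{1}{2}\varepsilon^{abc}F_{bc}. \tag{11.29} \]
We further define
\[ \mathcal E^a \equiv \sqrt{h}\,E^a, \quad \mathcal B^a \equiv \sqrt{h}\,B^a, \quad \phi \equiv -A_0, \quad \rho_e \equiv -j^\alpha n_\alpha = Nj^0, \quad J^a \equiv Nj^a. \tag{11.30} \]
If we notice that
\[ F_{0a} = -h_{ab}N^2F^{0b} - F_{ab}N^b, \qquad \epsilon_{abc}\epsilon_{ijk}h^{ai}h^{bj} = \frac{2h_{ck}}{h}, \tag{11.31} \]
we can express the Lagrangian density $L = A_{\mu,\nu}F^{\mu\nu} + F^{\mu\nu}F_{\mu\nu}/4 + A_\mu j^\mu$ as
\[ \sqrt{-g}\,L = -\mathcal E^a\dot A_a + \phi\,\mathcal E^{a}{}_{,a} - \frac{1}{2\sqrt{h}}Nh_{ab}(\mathcal E^a\mathcal E^b + \mathcal B^a\mathcal B^b) + \epsilon_{abc}N^a\mathcal E^b\mathcal B^c - \sqrt{h}\,\phi\rho_e + \sqrt{h}\,A_aJ^a. \tag{11.32} \]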
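or equivalently,
\[ K_{ab} = \frac{1}{2N}(\dot h_{ab} - N_{a|b} - N_{b|a}). \tag{11.38} \]
The corresponding canonical momenta conjugate to $h_{ab}$ are
\[ p^{ab} = \frac{\partial(\sqrt{-g}\,L_G)}{\partial\dot h_{ab}} = \frac{\sqrt{h}}{16\pi}(K^{ab} - Kh^{ab}), \tag{11.39} \]
or equivalently,
\[ \sqrt{h}\,K^{ab} = 16\pi\left(p^{ab} - \frac{1}{2}ph^{ab}\right). \tag{11.40} \]
The Hamiltonian on the hypersurface is given by
\begin{align*}
16\pi H_G ={}& \int_{\Sigma_t}\left[N(K^{ab}K_{ab} - K^2 - {}^{3}R) - 2N_a(K^{ab} - Kh^{ab})_{|b}\right]\sqrt{h}\,d^3y \\
& - 2\oint_{S_t}\left[N(k - k_0) - N_a(K^{ab} - Kh^{ab})r_b\right]\sqrt{\sigma}\,d^2\theta. \tag{11.41}
\end{align*}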
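where
\begin{align*}
16\pi P^{ab} ={}& N\sqrt{h}\,G^{ab} - \sqrt{h}\,(N^{|ab} - h^{ab}N^{|c}{}_{c}) \\
& + 16\pi\left[2p^{c(a}N^{b)}{}_{|c} - \sqrt{h}\left(\frac{1}{\sqrt{h}}\,p^{ab}N^c\right)_{|c}\right] \\
& + (16\pi)^2\left[\frac{2N}{\sqrt{h}}\left(p^{a}{}_{c}\,p^{bc} - \frac{1}{2}pp^{ab}\right) - \frac{N}{2\sqrt{h}}\left(p^{cd}p_{cd} - \frac{1}{2}p^2\right)h^{ab}\right], \tag{11.42b}
\end{align*}
\[ H_{ab} = \frac{32\pi N}{\sqrt{h}}\left(p_{ab} - \frac{1}{2}ph_{ab}\right) + 2N_{(a|b)}, \tag{11.42c} \]
\[ C = \frac{\sqrt{h}}{16\pi}\left({}^{3}R + K^2 - K^{ab}K_{ab}\right), \tag{11.42d} \]
\[ C_a = \frac{\sqrt{h}}{16\pi}\left(K^{b}{}_{a} - K\delta^{b}_{a}\right)_{|b}. \tag{11.42e} \]
Similarly, we can also get the variation of the electromagnetic Hamiltonian,
\[ \delta H_E = \int_{\Sigma_t}\left(-\frac{1}{2}N\sqrt{h}\,I^{ab}\,\delta h_{ab} + \sqrt{h}\,\rho\,\delta N - \sqrt{h}\,s_a\,\delta N^a\right)d^3y, \tag{11.43a} \]
where
\[ I^{ab} = \frac{1}{2}(E^cE_c + B^cB_c)\,h^{ab} - E^aE^b - B^aB^b, \tag{11.43b} \]
\[ \rho = \frac{1}{2}(E^cE_c + B^cB_c), \tag{11.43c} \]
\[ s_a = \epsilon_{abc}E^bB^c. \tag{11.43d} \]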
Now, we can write down Hamilton’s equations and corresponding constraint equations for
general relativity,
where
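\[ -H^{\mu\alpha\nu\beta} \equiv \bar h^{\mu\nu}\eta^{\alpha\beta} + \bar h^{\alpha\beta}\eta^{\mu\nu} - \bar h^{\alpha\nu}\eta^{\mu\beta} - \bar h^{\mu\beta}\eta^{\alpha\nu}. \tag{12.7} \]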
Two different types of coordinate transformations connect nearly globally Lorentz systems to
each other: global Lorentz transformations, and infinitesimal coordinate transformations. As
for global Lorentz transformations, we can verify that hµν and h̄µν transform like components
of a tensor in flat spacetime. For infinitesimal coordinate transformations
where $\xi^\mu$ are four arbitrary functions small enough to leave $|h_{\mu'\nu'}| \ll 1$. We can verify that
the metric perturbation functions in the new ($x^\mu_{\rm new}$) and old ($x^\mu_{\rm old}$) coordinate systems are related
by
\[ h^{\rm new}_{\mu\nu} = h^{\rm old}_{\mu\nu} - \xi_{\mu,\nu} - \xi_{\nu,\mu}. \tag{12.9} \]
The functional forms of all other scalars, vectors, and tensors which are of order O(h), such as
$R_{\mu\nu}$, $T_{\mu\nu}$ and R, are unaltered, to within the precision of linearized theory.
For any physical situation, one can specialize the gauge so that
\[ \bar h^{\mu\alpha}{}_{,\alpha} = 0, \tag{12.10} \]
called the Lorentz gauge. The Lorentz gauge is not fixed uniquely. The gauge condition is left
unaffected by any gauge transformation for which
\[ \Box\xi^\alpha \equiv \xi^{\alpha,\beta}{}_{\beta} = 0. \tag{12.11} \]
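The statement that the Lorentz gauge can always be reached, with residual freedom given by (12.11), follows from how the divergence of the trace-reversed perturbation changes under (12.9). The sketch below assumes $\bar h_{\mu\nu} \equiv h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}h$ and indices raised with $\eta^{\mu\nu}$, working to first order in the perturbation.

```latex
\begin{align*}
h^{\rm new} &= h^{\rm old} - 2\,\xi^{\alpha}{}_{,\alpha}, \\
\bar h^{\rm new}_{\mu\nu} &= \bar h^{\rm old}_{\mu\nu} - \xi_{\mu,\nu} - \xi_{\nu,\mu}
   + \eta_{\mu\nu}\,\xi^{\alpha}{}_{,\alpha}, \\
\bar h^{{\rm new}\,\mu\alpha}{}_{,\alpha}
  &= \bar h^{{\rm old}\,\mu\alpha}{}_{,\alpha}
   - \xi^{\mu,\alpha}{}_{\alpha} - \xi^{\alpha,\mu}{}_{\alpha} + \xi^{\alpha}{}_{,\alpha}{}^{\mu}
   = \bar h^{{\rm old}\,\mu\alpha}{}_{,\alpha} - \Box\xi^{\mu}.
\end{align*}
```

Choosing $\xi^\mu$ to solve $\Box\xi^\mu = \bar h^{{\rm old}\,\mu\alpha}{}_{,\alpha}$ enforces (12.10), and any further transformation with $\Box\xi^\mu = 0$ leaves the condition intact.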
Once the gauge has been fixed by fiat for a given system, one can regard $h_{\mu\nu}$ as components
of tensors in flat spacetime; and one can regard the field equations and the chosen gauge
conditions as geometric, coordinate-independent equations in flat spacetime. This viewpoint
allows one to use curvilinear coordinates, if one wishes. But in doing so, one must everywhere
replace the Lorentz components of the metric $\eta_{\mu\nu}$ by the metric’s components $g^{\rm flat}_{\mu\nu}$ in the
flat-spacetime curvilinear coordinate system; and one must replace all ordinary derivatives in the
field equations and gauge conditions by covariant derivatives whose connection coefficients
come from $g^{\rm flat}_{\mu\nu}$.
\[ \frac{dv^i}{dt} + \Gamma^{i}{}_{00} = \frac{dv^i}{dt} + \Phi_{,i} = 0. \tag{12.16} \]
Therefore, we reproduce the classical Newtonian gravitation theory.
Now let us consider the path of a photon through this geometry; in other words, solve the
perturbed geodesic equation for a null trajectory $x^\mu(\lambda)$. (We parametrize the trajectory with λ to
ensure that $p^\mu = dx^\mu/d\lambda$.) Recall that our philosophy is to consider the metric perturbation
as a field defined on a flat background spacetime. Similarly, we can decompose the geodesic
into a background path plus a perturbation,
where $x^{(0)\mu}(\lambda)$ solves the geodesic equation in the background. We then evaluate all quantities
along the background path, to solve for $x^{(1)\mu}(\lambda)$. For this procedure to make sense, we need to
assume that the potential Φ is not appreciably different along the background and true
geodesics; this condition amounts to requiring that $x^{(1)i}\partial_i\Phi \ll \Phi$. For convenience we denote the
wave vector of the background path as $k^\mu$ and the derivative of the deviation vector as $l^\mu$. The
condition that a path be null is of course
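\[ \frac{dl^\mu}{d\lambda} = -\Gamma^{\mu}{}_{\rho\sigma}k^\rho k^\sigma. \tag{12.23} \]
It follows that
\[ \frac{dl^0}{d\lambda} = -2k\,(\vec k\cdot\nabla\Phi), \qquad \frac{d\vec l}{d\lambda} = -2k^2\,\nabla_\perp\Phi, \tag{12.24} \]
where $\nabla_\perp \equiv \nabla - k^{-2}(\vec k\cdot\nabla)\vec k$.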
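\[ z \equiv \frac{-(k + l^0)\,u_0}{k} - 1 = -\Phi. \tag{12.26} \]
The deflection angle of the photon passing by a gravitational source is
\[ \alpha = -\frac{\vec l}{k} = 2k\int\nabla_\perp\Phi\,d\lambda. \tag{12.27} \]
In particular, for Φ = −M/r, we get
\[ z = \frac{M}{r}, \qquad \alpha = \frac{4M}{b}, \tag{12.28} \]
where b is the impact parameter, i.e. the distance of closest approach between the gravitational source and the light ray.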
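The value α = 4M/b can be checked numerically. The sketch below is an illustration under stated assumptions: the background ray is a straight line along x at impact parameter b, the path length satisfies dx = k dλ so that the integral in (12.27) becomes 2∫|∇⊥Φ| dx, and the function name `deflection_angle` is hypothetical.

```python
import math


def deflection_angle(M: float, b: float, steps: int = 20000) -> float:
    """Integrate alpha = 2 * int |grad_perp Phi| dx along the straight background ray.

    For Phi = -M/r and a ray along x at impact parameter b,
    |grad_perp Phi| = M*b / (b**2 + x**2)**1.5.  The substitution x = b*tan(theta)
    maps the infinite line to theta in (-pi/2, pi/2), where the integrand
    becomes (M/b)*cos(theta).
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * h   # midpoint rule
        total += (M / b) * math.cos(theta) * h
    return 2.0 * total


if __name__ == "__main__":
    M, b = 1.0, 10.0
    print(deflection_angle(M, b), 4 * M / b)
```

The two printed numbers should agree closely, since the transformed integrand is smooth and the midpoint rule converges quickly.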
h00 = −2A, h0i = ∂i B + B̄i , hij = 2Cδij + 2∂i ∂j E + ∂i Ēj + ∂j Ēi + Ẽij (12.29)
with
∂i B̄i = 0, ∂i Ēi = 0, ∂i Ẽij = 0, Ẽii = 0. (12.30)
Then we decompose the displacement vector for gauge transformation as
The tensor modes are therefore gauge invariant since they do not depend on the choice of the
coordinate system. This is not the case for the vector and scalar modes. However, we can
define combinations of these modes that are gauge invariant. For the scalar modes, we define
Thus, we have defined 2 scalar quantities and 1 vector quantity (2 degrees of freedom) which
are gauge invariant. Altogether, we have 6 degrees of freedom, once the 4 arbitrary degrees
of freedom related to the gauge choice have been absorbed.
12.3 Gravitational wave –121/453–
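\[ G_{00} = 2\nabla^2\Psi, \]
\[ G_{0i} = 2\partial_i\dot\Psi + \frac{1}{2}\nabla^2\bar\Psi_i, \]
\[ G_{ij} = (\delta_{ij}\nabla^2 - \partial_i\partial_j)(\Phi - \Psi) + \frac{1}{2}\left(\partial_i\dot{\bar\Psi}_j + \partial_j\dot{\bar\Psi}_i\right) + 2\delta_{ij}\ddot\Psi - \frac{1}{2}\Box\tilde E_{ij}. \]
We note again that the Einstein tensor (Ricci tensor) is gauge invariant.
Now we consider the solution of the linearised Einstein equations in vacuum. For the scalar
modes, Einstein’s equations $G_{\mu\nu} = 0$ imply that
\[ \nabla^2\Psi = 0, \qquad \nabla^2(\Phi - \Psi) = 0. \tag{12.35} \]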
The only regular solution is Ψ̄i = 0. Just as for scalar modes, no vector modes can propagate.
Therefore, the only perturbations that can propagate in a Minkowski space-time are the grav-
itational waves and they satisfy
The three conditions $\Phi = \Psi = \bar\Psi_i = 0$ define a gauge equivalence class. We can choose a
gauge in this family by imposing some conditions on the perturbations. For instance, setting
E = B = 0 and $\bar E_i = 0$, we define what is called a transverse and traceless (TT) gauge in
which the metric is completely determined. In this case, the only non-vanishing component
of $h^{\rm TT}_{\mu\nu}$ is $\tilde E_{ij}$. A particularly useful set of solutions to this wave equation are the plane waves,
given by
\[ h^{\rm TT}_{\mu\nu} = C_{\mu\nu}\,e^{ik_\sigma x^\sigma}, \tag{12.40} \]
where $C_{\mu\nu}$ is a constant, symmetric, traceless and purely spatial tensor. Einstein’s equations
now become
\[ k^\sigma k_\sigma = 0, \qquad k^\mu C_{\mu\nu} = 0. \tag{12.41} \]
Our solution can be made more explicit by choosing spatial coordinates such that the wave is
travelling in the z direction. A little algebra can show that
\[ h^{\rm TT}_{ij} = \left(C_+\,\epsilon^{+}_{ij} + C_\times\,\epsilon^{\times}_{ij}\right)e^{-i\omega(t-z)}, \tag{12.42} \]
–122/453– Chapter 12 Perturbation Theory and Gravitational Radiation
Figure 12.1: Effects of a gravitational plane wave propagating along the z axis on a ring of
particles located in the xy plane, depending on the wave polarization (axis label: Time).
Consider the geodesic equation of a test particle in the gravitational field of a gravitational wave
in the TT gauge. Since to leading order in the perturbation we have $\Gamma^{i}{}_{00} = 0$, a particle initially
at rest will remain at rest. Of course, this does not mean that nothing happens, but rather
that the frame of reference is comoving with the test particle. To see if anything happens, we
should look at the relative motion of two neighbouring particles, which can be done using the
geodesic deviation equation. The relative acceleration is given by
\[ \nabla_{\mathbf u}\nabla_{\mathbf u}\mathbf n = -\mathbf R(\mathbf n,\mathbf u)\mathbf u. \tag{12.44} \]
\[ \frac{d^2n^i}{dt^2} = \frac{1}{2}\ddot h^{\rm TT}_{ij}\,n^j. \tag{12.45} \]
For + modes, we have
\[ \frac{d^2n^x}{dt^2} = -\frac{1}{2}\omega^2C_+\,n^x\,e^{-i\omega(t-z)}, \qquad \frac{d^2n^y}{dt^2} = \frac{1}{2}\omega^2C_+\,n^y\,e^{-i\omega(t-z)}. \tag{12.46} \]
For × modes, we have
\[ \frac{d^2n^x}{dt^2} = -\frac{1}{2}\omega^2C_\times\,n^y\,e^{-i\omega(t-z)}, \qquad \frac{d^2n^y}{dt^2} = -\frac{1}{2}\omega^2C_\times\,n^x\,e^{-i\omega(t-z)}. \tag{12.47} \]
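To see what (12.45) to (12.47) do to a ring of test particles, one can integrate them to first order in the amplitude. The sketch below assumes particles initially at rest on a unit ring, takes the real part of the plane wave, and uses the resulting small-amplitude solution $n^i = n_0^i + \tfrac{1}{2}h_{ij}n_0^j$; the function name `displaced_ring` and its parameters are hypothetical.

```python
import math


def displaced_ring(c_plus, c_cross, phase, n_points=8):
    """Positions of a unit ring in the xy plane under a TT wave travelling along z.

    Uses n^i = n0^i + (1/2) h_ij n0^j with h_xx = -h_yy = c_plus*cos(phase)
    and h_xy = h_yx = c_cross*cos(phase), where phase = omega*(t - z).
    """
    hp = c_plus * math.cos(phase)
    hx = c_cross * math.cos(phase)
    ring = []
    for k in range(n_points):
        angle = 2 * math.pi * k / n_points
        x0, y0 = math.cos(angle), math.sin(angle)
        x = x0 + 0.5 * (hp * x0 + hx * y0)
        y = y0 + 0.5 * (hx * x0 - hp * y0)
        ring.append((x, y))
    return ring


if __name__ == "__main__":
    for label, cp, cx in (("plus", 0.2, 0.0), ("cross", 0.0, 0.2)):
        print(label, [(round(x, 3), round(y, 3)) for x, y in displaced_ring(cp, cx, 0.0, 4)])
```

With a pure + mode the ring is stretched along x while it is squeezed along y (and the reverse half a period later); a pure × mode produces the same pattern rotated by 45 degrees.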
curvature might be due entirely to the waves, or partly to waves and partly to nearby matter
and nongravitational fields.
The analysis uses a coordinate system closely “tuned” to spacetime in the sense that the metric
coefficients can be split into “background” coefficients plus perturbations
\[ g_{\mu\nu} = g^{(B)}_{\mu\nu} + h_{\mu\nu}. \tag{12.48} \]
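A rather long computation shows that the Ricci tensor for an expanded metric is
\[ R_{\mu\nu} = \underbrace{R^{(B)}_{\mu\nu}}_{?} + \underbrace{R^{(1)}_{\mu\nu}}_{A/\lambda^2} + \underbrace{R^{(2)}_{\mu\nu}}_{A^2/\lambda^2} + \underbrace{\text{error}}_{A^3/\lambda^2}. \tag{12.50} \]
Here a marker has been placed under each term to show its typical order of magnitude; $R^{(B)}_{\mu\nu}$
is the Ricci tensor for the background metric $g^{(B)}_{\mu\nu}$; and $R^{(1)}_{\mu\nu}$ and $R^{(2)}_{\mu\nu}$ are expressions defined by
\[ R^{(1)}_{\mu\nu} \equiv \frac{1}{2}\left(h_{\alpha\mu|\nu}{}^{\alpha} + h_{\alpha\nu|\mu}{}^{\alpha} - h_{\mu\nu|\alpha}{}^{\alpha} - h_{|\mu\nu}\right); \tag{12.51a} \]
\begin{align*}
R^{(2)}_{\mu\nu} \equiv \frac{1}{2}\Big[ & \frac{1}{2}h_{\alpha\beta|\mu}h^{\alpha\beta}{}_{|\nu} + h^{\alpha\beta}\left(h_{\mu\nu|\alpha\beta} + h_{\alpha\beta|\mu\nu} - h_{\alpha\mu|\nu\beta} - h_{\alpha\nu|\mu\beta}\right) \\
& + h_{\nu}{}^{\alpha|\beta}\left(h_{\alpha\mu|\beta} - h_{\beta\mu|\alpha}\right) - \left(h^{\alpha\beta}{}_{|\beta} - \frac{1}{2}h^{|\alpha}\right)\left(h_{\alpha\mu|\nu} + h_{\alpha\nu|\mu} - h_{\mu\nu|\alpha}\right)\Big]. \tag{12.51b}
\end{align*}
In these expressions and everywhere below, indices are raised and lowered with $g^{(B)}_{\mu\nu}$, and an
upright line denotes a covariant derivative with respect to $g^{(B)}_{\mu\nu}$.
At the heart of the shortwave formalism is its method for solving the vacuum field equations
$R_{\mu\nu} = 0$. One begins by selecting out the part linear in the amplitude of the wave A, and
setting it equal to zero. The action of the waves to curve up the background is a nonlinear
phenomenon; so $R^{(B)}_{\mu\nu}$ cannot be linear in A. Hence, $R^{(1)}_{\mu\nu}$ is the only linear term, and it must
vanish by itself:
\[ R^{(1)}_{\mu\nu}(h) = 0. \tag{12.52a} \]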
Of course hµν may contain nonlinear correction terms - call them jµν - of order A2 , which
must not be constrained by this linear equation.
One next splits the remainder of Rµν into a part that varies only on scales far larger than λ,
and a second part that contains the fluctuations. This split can be accomplished by averaging
over several wavelengths:
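\[ \underbrace{R^{(B)}_{\mu\nu}}_{?} + \underbrace{\langle R^{(2)}_{\mu\nu}(h)\rangle}_{A^2/\lambda^2} + \underbrace{\text{error}}_{A^3/\lambda^2} = 0 \quad [\text{smooth part}]; \tag{12.52b} \]
\[ \underbrace{R^{(1)}_{\mu\nu}(j)}_{A^2/\lambda^2} + \underbrace{R^{(2)}_{\mu\nu}(h)}_{A^2/\lambda^2} - \underbrace{\langle R^{(2)}_{\mu\nu}(h)\rangle}_{A^2/\lambda^2} + \underbrace{\text{error}}_{A^3/\lambda^2} = 0 \quad [\text{fluctuating part}]. \tag{12.52c} \]
The smooth part shows how the stress-energy in the waves creates the background curvature. It
can be rewritten in the more suggestive form
\[ G^{(B)}_{\mu\nu} \equiv R^{(B)}_{\mu\nu} - \frac{1}{2}R^{(B)}g^{(B)}_{\mu\nu} = 8\pi T^{(GW)}_{\mu\nu} \quad \text{in vacuum}, \tag{12.53} \]
where
\[ T^{(GW)}_{\mu\nu} \equiv -\frac{1}{8\pi}\left(\langle R^{(2)}_{\mu\nu}(h)\rangle - \frac{1}{2}g^{(B)}_{\mu\nu}\langle R^{(2)}(h)\rangle\right) \tag{12.54} \]
is the stress-energy tensor for the gravitational waves. Equation 12.53 can be generalized to
the case where matter and other fields are present,
\[ G^{(B)}_{\mu\nu} = 8\pi\left(T^{(GW)}_{\mu\nu} + T^{(\text{matter})}_{\mu\nu} + T^{(\text{other fields})}_{\mu\nu}\right). \tag{12.55} \]
Finally, the fluctuating part shows how the gravitational waves generate nonlinear corrections j
to themselves (wave-wave scattering, harmonics of the fundamental frequency, etc).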
By an appropriate choice of the four functions ξ µ , one can enforce the transverse and traceless
gauge conditions
h̄µα|α = 0, h̄ = 0. (12.58)
12.5 Conservation laws for 4-momentum and angular momentum –125/453–
\[ h_{\mu\nu|\alpha}{}^{\alpha} + 2R^{(B)}_{\alpha\mu\beta\nu}h^{\alpha\beta} - 2R^{(B)}_{\alpha(\mu}h_{\nu)}{}^{\alpha} = 0. \tag{12.59} \]
The propagation equation is accurate to first order in the amplitude; and its accuracy is in-
dependent of the ratio λ/R. Thus, it can be applied whenever the waves are weak even if the
wavelength is large. All nonlinear interactions of the wave with itself are neglected in this first-
order propagation equation. Actually contained in the propagation equation are all effects due
to the linear action of the background curvature on the propagating wave.
where the error ∼ (λ/R)(T GWµν /R) is negligible in the shortwave approximation.
curvature permits, when one moves radially outward from the source toward infinity. Every-
where in this coordinate system, even inside the source, where |hµν | ≪ 1 may break down,
we still define formally
hµν ≡ gµν − ηµν . (12.63)
The hµν are clearly not the components of a tensor. Neither is ηµν the true metric tensor.
Nevertheless, one is free to raise and lower indices on hµν with ηµν and to define
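\[ \bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h, \qquad h \equiv h_{\alpha\beta}\eta^{\alpha\beta}, \tag{12.64} \]
and
\[ -H^{\mu\alpha\nu\beta} \equiv \bar h^{\mu\nu}\eta^{\alpha\beta} + \bar h^{\alpha\beta}\eta^{\mu\nu} - \bar h^{\alpha\nu}\eta^{\mu\beta} - \bar h^{\mu\beta}\eta^{\alpha\nu}. \tag{12.65} \]
Then we can define the effective energy-momentum pseudotensor by
\[ 16\pi T^{\mu\nu}_{\rm eff} \equiv H^{\mu\alpha\nu\beta}{}_{,\alpha\beta}, \tag{12.66} \]
so it follows that
\[ T^{\mu\nu}_{\rm eff}{}_{,\nu} = 0. \tag{12.67} \]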
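We further define the total 4-momentum and angular momentum for both strong and weak
sources by
\[ P^\mu \equiv \int T^{\mu 0}_{\rm eff}\,d^3x = \frac{1}{16\pi}\oint_S H^{\mu\alpha 0j}{}_{,\alpha}\,dS_j, \tag{12.68} \]
\[ J^{\mu\nu} \equiv \int\left(x^\mu T^{0\nu}_{\rm eff} - x^\nu T^{0\mu}_{\rm eff}\right)d^3x = \frac{1}{16\pi}\oint_S\left(x^\mu H^{\nu\alpha 0j}{}_{,\alpha} - x^\nu H^{\mu\alpha 0j}{}_{,\alpha} + H^{\mu j0\nu} - H^{\nu j0\mu}\right)dS_j, \tag{12.69} \]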
where the closed surface of integration S is in the asymptotically flat region surrounding the
source. Naturally, gravitation energy-momentum pseudotensor is defined as
Although this invariance is hard to see in the volume integrals themselves, it is clear from the
surface-integral forms that no coordinate transformation which changes the coordinates only
inside some spatially bounded region can influence the values of the integrals. For coordinate
changes in the distant, asymptotically flat regions, linearized theory guarantees that under
Lorentz transformations the integrals for P µ and J µν will transform like special relativistic
tensors, and that under infinitesimal coordinate transformations (gauge changes) they will be
invariant.
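Knowing $P^\mu$ and $J^{\mu\nu}$, one can figure out the source’s total mass-energy M and intrinsic
angular momentum $S_\rho$ by
\[ M \equiv (-P^\mu P_\mu)^{1/2}, \qquad Y^\mu \equiv -\frac{J^{\mu\nu}P_\nu}{M^2}, \qquad S_\rho \equiv \frac{1}{2}\varepsilon_{\mu\nu\sigma\rho}\left(J^{\mu\nu} - Y^\mu P^\nu + Y^\nu P^\mu\right)\frac{P^\sigma}{M}. \tag{12.71} \]
It is clear that any quantities $H^{\alpha\mu\nu\beta}_{\rm new}$ which agree with the original $H^{\alpha\mu\nu\beta}$ in the asymptotic
weak-field region will give the same values as $H^{\alpha\mu\nu\beta}$ does for the $P^\mu$ and $J^{\mu\nu}$ surface integrals.
One especially convenient choice is that of Landau and Lifshitz, who define
\[ H^{\mu\alpha\nu\beta}_{\rm L-L} = \mathfrak g^{\mu\nu}\mathfrak g^{\alpha\beta} - \mathfrak g^{\alpha\nu}\mathfrak g^{\mu\beta}, \quad\text{where}\quad \mathfrak g^{\mu\nu} \equiv (-g)^{1/2}g^{\mu\nu}. \tag{12.72} \]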
are precisely quadratic in the first derivatives of the metric. It follows that
\[ T^{\mu\nu}_{\rm L-L,eff} \equiv (-g)\left(T^{\mu\nu} + t^{\mu\nu}_{\rm L-L}\right) \tag{12.75} \]
has all the properties of the $T^{\mu\nu}_{\rm eff}$ defined by 12.66.
In evaluating these flux integrals, it is especially convenient to use the Landau-Lifshitz form
of $T^{\mu\nu}_{\rm eff}$, since that form contains no second derivatives of the metric. Only those portions of
$t^{\mu\nu}_{\rm L-L}$ that die out as $1/r^2$ or $1/r^3$ at large r can contribute to the flux integrals 12.76. For static
solutions $g_{\mu\nu} \sim \text{const.} + O(1/r)$, $t^{\mu\nu}_{\rm L-L}$ dies out as $1/r^4$. Hence, the only contributions come
from dynamic parts of the metric, which, at these large distances, are entirely in the form of
gravitational waves.
When $t^{\mu\nu}_{\rm L-L}$ is averaged over several wavelengths, it becomes the stress-energy tensor $T^{(GW)\mu\nu}$
for the waves. Moreover, averaging $t^{\mu\nu}_{\rm L-L}$ over several wavelengths before evaluating the flux
integrals 12.76 cannot affect the values of the integrals. Therefore, one can freely make in
these integrals the replacement
\[ T^{\mu\nu}_{\rm eff} = T^{(GW)\mu\nu} + T^{\mu\nu}. \tag{12.77} \]
h̄µα,α = 0 (12.78)
are exactly satisfied everywhere, including the interior of the source. The exact Einstein field
equations can be written in terms of h̄µν as
Of the ten components of $\bar h^{\mu\nu}$, only the six spatial ones are of interest, since only they are
needed in projecting out the transverse-traceless radiation field $h^{\rm TT}_{jk}$. Applying $(T^{\mu\nu} + t^{\mu\nu})_{,\nu} = 0$,
we can derive that
\[ [(T^{00} + t^{00})x^jx^k]_{,00} = [(T^{lm} + t^{lm})x^jx^k]_{,lm} - 2[(T^{lj} + t^{lj})x^k + (T^{lk} + t^{lk})x^j]_{,l} + 2(T^{jk} + t^{jk}), \tag{12.82} \]
12.6 Production of gravitational wave –129/453–
whence
\[ \int (T^{jk} + t^{jk})\,d^3x = \frac{1}{2}\frac{d^2I_{jk}}{dt^2}, \quad\text{where}\quad I_{jk}(t) \equiv \int [T^{00}(t,\mathbf x) + t^{00}(t,\mathbf x)]\,x^jx^k\,d^3x. \tag{12.83} \]
Now introduce the nearly Newtonian assumption. It guarantees that gravitation contributes
only a small fraction of the total energy, $t^{00} \sim (\Phi_{,j})^2 \sim |\Phi|T^{00} \ll T^{00}$, hence
\[ I_{jk} = \int T^{00}(t,\mathbf x)\,x^jx^k\,d^3x. \tag{12.84} \]
The quantity $I_{jk}$ thus represents the quadrupole moment of the mass distribution. By combining
equations 12.83 and 12.81, and by noticing that inside the source $|t^{jk}| \sim |\Phi_{,j}\Phi_{,k}| \sim T^{00}|\Phi|$,
one obtains
\[ \bar h^{jk}(t,\mathbf x) = \frac{2}{r}\frac{d^2I_{jk}(t-r)}{dt^2}\left[1 + O\!\left(\frac{|T^{jk}|}{T^{00}}\frac{\lambda}{R} + |\Phi|\right)\right]. \tag{12.85} \]
Notice that $|\Phi| \sim v^2 \sim (R/\lambda)^2$ and $|T^{jk}|/T^{00} \sim v^2 \sim (R/\lambda)^2$. Thus, terms of higher order
can be neglected.
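$h^{\rm TT}_{jk}$ can be obtained by first lowering indices, using $\eta_{lm} = \delta_{lm}$, and then projecting out the TT
part using the projection operator for radially traveling waves:
\[ P_{lm} = \delta_{lm} - n_ln_m, \qquad n_l \equiv \frac{x^l}{r}. \tag{12.86} \]
The result is
\[ h^{\rm TT}_{jk} = \frac{2}{r}\frac{d^2I^{\rm TT}_{jk}}{dt^2}, \quad\text{where}\quad I^{\rm TT}_{jk} = P_{jl}I_{lm}P_{mk} - \frac{1}{2}P_{jk}P_{ml}I_{lm}. \tag{12.87} \]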
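The projection in (12.86) and (12.87) is easy to check on a concrete matrix. The following is a minimal sketch in plain Python, assuming a symmetric 3×3 input given as nested lists and a unit direction vector; the function name `tt_project` is hypothetical.

```python
def tt_project(I, n):
    """Return P_jl I_lm P_mk - (1/2) P_jk P_ml I_lm for a symmetric 3x3 I and unit vector n."""
    idx = range(3)
    # Projector onto the plane transverse to n: P_jk = delta_jk - n_j n_k
    P = [[(1.0 if j == k else 0.0) - n[j] * n[k] for k in idx] for j in idx]
    PIP = [[sum(P[j][l] * I[l][m] * P[m][k] for l in idx for m in idx)
            for k in idx] for j in idx]
    # Trace of the transverse part, P_ml I_lm
    trace = sum(P[m][l] * I[l][m] for l in idx for m in idx)
    return [[PIP[j][k] - 0.5 * P[j][k] * trace for k in idx] for j in idx]


if __name__ == "__main__":
    I = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
    for row in tt_project(I, (0.0, 0.0, 1.0)):
        print(row)
```

For a wave travelling along z the result keeps only the xx, yy and xy components, and its trace vanishes, which is exactly the transverse and traceless property.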
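The effective stress-energy tensor for these outgoing waves is
\[ T^{\rm GW}_{00} = -T^{\rm GW}_{0r} = T^{\rm GW}_{rr} = \frac{1}{32\pi}\left\langle h^{\rm TT}_{jk,0}h^{\rm TT}_{jk,0}\right\rangle = \frac{1}{8\pi r^2}\left\langle \dddot{\mathcal I}_{jk}\dddot{\mathcal I}_{jk} - 2n_j\dddot{\mathcal I}_{jl}\dddot{\mathcal I}_{lk}n_k + \frac{1}{2}\left(n_j\dddot{\mathcal I}_{jk}n_k\right)^2\right\rangle, \tag{12.88} \]
where $\mathcal I_{jk} \equiv I_{jk} - \delta_{jk}I/3$ is the reduced quadrupole moment of the mass distribution. The
total power crossing a sphere of radius r at time t is
\[ L_{\rm GW}(t,r) = \int T^{({\rm GW})0r}\,r^2\,d\Omega = \frac{1}{5}\left\langle \dddot{\mathcal I}_{jk}(t-r)\,\dddot{\mathcal I}_{jk}(t-r)\right\rangle. \tag{12.89} \]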
Example: One case of special interest is the gravitational radiation emitted by a binary star.
For simplicity let us consider two stars of mass $m_1$ and $m_2$ in a circular orbit. We will treat
the motion of the stars in the Newtonian approximation. The angular frequency of the orbit
is $\omega = (M/d^3)^{1/2}$, where d is the separation of the two stars and $M \equiv m_1 + m_2$ is the total mass.
The power that they radiate as gravitational waves is
\[ L_{\rm GW} = \frac{32}{5}\frac{\mu^2M^3}{d^5}, \tag{12.90} \]
where $\mu \equiv m_1m_2/(m_1 + m_2)$ is the reduced mass of the system.
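To get a feeling for the size of (12.90), it can be evaluated in geometric units and, with G and c restored, in SI units. The sketch below is illustrative only: the function names are hypothetical, the constants G and C are standard values supplied here rather than taken from the notes, and the Sun-Earth numbers are rough.

```python
G = 6.67430e-11    # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8   # speed of light, m/s


def binary_gw_power(m1, m2, d):
    """L_GW = (32/5) mu^2 M^3 / d^5 in geometric units (G = c = 1)."""
    M = m1 + m2
    mu = m1 * m2 / M
    return 32.0 / 5.0 * mu**2 * M**3 / d**5


def binary_gw_power_si(m1_kg, m2_kg, d_m):
    """Same formula with G and c restored: (32/5) G^4 mu^2 M^3 / (c^5 d^5), in watts."""
    M = m1_kg + m2_kg
    mu = m1_kg * m2_kg / M
    return 32.0 / 5.0 * G**4 * mu**2 * M**3 / (C**5 * d_m**5)


if __name__ == "__main__":
    print(binary_gw_power(1.0, 1.0, 10.0))
    # Rough Sun-Earth values: masses in kg, separation in m
    print(binary_gw_power_si(1.989e30, 5.972e24, 1.496e11))
```

The steep $d^{-5}$ dependence means that only very compact binaries radiate appreciably; widely separated systems such as the Sun and Earth lose a negligible amount of energy this way.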
Chapter 13
Black Holes
The proof can be found in section 5.2 of Spacetime and Geometry (Sean Carroll). Any spher-
ically symmetric vacuum metric possesses a timelike Killing vector. A metric that possesses
a Killing vector that is timelike near infinity is called stationary. A metric is called static if it
possesses a timelike Killing vector that is orthogonal to a family of hypersurfaces. An alter-
native definition of “static” is stationary, and invariant under time reversal. We should think
of stationary as meaning “doing exactly the same thing at every time,” while static means “not
doing anything at all.”
The Schwarzschild metric coefficients become infinite at r = 0 and r = 2M. Since the metric
coefficients are coordinate-dependent quantities, it is certainly possible to have a coordinate
singularity that results from a breakdown of a specific coordinate system rather than of the
underlying manifold. Direct calculation reveals that
\[ R^{\mu\nu\rho\sigma}R_{\mu\nu\rho\sigma} = \frac{48M^2}{r^6}. \tag{13.3} \]
13.1 Schwarzschild black holes –131/453–
As for $r = 2M$, the Schwarzschild radius, we can check that none of the curvature invariants blows up there. We therefore begin to think that it is actually not singular, and that we have simply chosen a bad coordinate system. The surface $r = 2M$ is very well-behaved in the Schwarzschild metric; it demarcates the event horizon of a black hole.
$$\epsilon = -g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} \tag{13.4}$$
is constant along the path. For massive particles $\epsilon = 1$, while for massless particles $\epsilon = 0$.
We can think of the angular momentum as a three-vector with a magnitude and direction.
Conservation of the direction of angular momentum means that the particle will move in a
plane. We can choose this to be the equatorial plane θ = π/2 of our coordinate system. The
two remaining Killing vectors correspond to energy and the magnitude of angular momentum.
The energy arises from the timelike Killing vector
$$K^\mu = (\partial_t)^\mu = (1, 0, 0, 0), \qquad K_\mu = \left(-\left(1 - \frac{2M}{r}\right), 0, 0, 0\right). \tag{13.5}$$
The Killing vector whose conserved quantity is the magnitude of the angular momentum is
For massless particles, these can be thought of as the conserved energy and angular momen-
tum, while for massive particles they are the conserved energy and angular momentum per
unit mass of the particle.
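After some algebraic manipulation, we can get a single equation for $r(\lambda)$,
$$\frac{1}{2}\left(\frac{dr}{d\lambda}\right)^2 + V(r) = \mathcal{E}, \tag{13.8}$$
where
$$V(r) = \frac{1}{2}\epsilon - \epsilon\frac{M}{r} + \frac{L^2}{2r^2} - \frac{ML^2}{r^3} \quad \text{and} \quad \mathcal{E} = \frac{1}{2}E^2. \tag{13.9}$$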
Figure 13.1: Effective potentials $V(r)$, plotted against $r/M$, for particles in Schwarzschild spacetime. There is an innermost circular orbit at a radius greater than or equal to $3M$, and any orbit that falls inside this radius continues to $r = 0$ for particles on geodesics.
In general relativity, at r = 2M the potential is always zero; inside this radius is the black
hole. For massless particles there is always a barrier (except for L = 0, for which the potential
vanishes identically), but a sufficiently energetic photon will nevertheless go over the barrier
and be dragged inexorably down to the center. At the top of the barrier are unstable circular
orbits rc = 3M .
For massive particles, the circular orbits are at
$$r_c = \frac{L^2 \pm \sqrt{L^4 - 12M^2L^2}}{2M}. \tag{13.10}$$
For large $L$ there will be two circular orbits, one stable and one unstable. In the $L \to \infty$ limit their radii are given by $r_c = (L^2/M, 3M)$. In this limit the stable circular orbit moves farther away, while the unstable one approaches $3M$, behaviour that parallels the massless case. As we decrease $L$, the two circular orbits come closer together; they coincide when $L = \sqrt{12}M$, for which $r_c = 6M$. We have therefore found that the Schwarzschild solution possesses stable circular orbits for $r > 6M$ and unstable circular orbits for $3M < r < 6M$. It is important to remember that these are only the geodesics; there is nothing to stop an accelerating particle from dipping below $r = 3M$ and emerging, as long as it stays beyond $r = 2M$.
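The circular-orbit radii of (13.10) can be checked numerically against the effective potential (13.9). The sketch below is illustrative only: the function names and the sample value $L = 4M$ are assumptions made here, not part of the notes.

```python
import math

def effective_potential(r, L, M=1.0, eps=1.0):
    """V(r) of eq. (13.9): eps/2 - eps*M/r + L^2/(2 r^2) - M L^2/r^3."""
    return 0.5 * eps - eps * M / r + L**2 / (2 * r**2) - M * L**2 / r**3

def circular_orbit_radii(L, M=1.0):
    """Circular-orbit radii for massive particles, eq. (13.10).

    Returns (r_unstable, r_stable), or None when L^2 < 12 M^2,
    in which case no circular orbit exists.
    """
    disc = L**4 - 12 * M**2 * L**2
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return (L**2 - root) / (2 * M), (L**2 + root) / (2 * M)

def dV_dr(r, L, M=1.0, eps=1.0, h=1e-6):
    """Central-difference estimate of V'(r)."""
    return (effective_potential(r + h, L, M, eps)
            - effective_potential(r - h, L, M, eps)) / (2 * h)

# Illustrative value L = 4M (with M = 1): the inner orbit lies between 3M and 6M,
# the outer one beyond 6M.
r_unstable, r_stable = circular_orbit_radii(4.0)
print(r_unstable, r_stable)  # 4.0 12.0
```

Both radii are stationary points of $V(r)$; the inner one is a maximum (unstable) and the outer one a minimum (stable), in line with the ranges quoted above.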
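As for a general non-circular orbit, the equation of the orbit satisfies
$$\left(\frac{dr}{d\phi}\right)^2 + \frac{1}{L^2} r^4 - \frac{2M}{L^2} r^3 + r^2 - 2Mr = \frac{2\mathcal{E}}{L^2} r^4. \tag{13.11}$$
Introducing $x \equiv L^2/(Mr)$ and differentiating with respect to $\phi$, this becomes
$$\frac{d^2 x}{d\phi^2} - 1 + x = \frac{3M^2}{L^2} x^2. \tag{13.13}$$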
In a Newtonian calculation, the last term would be absent, and we could solve for x exactly;
here, we suppose the orbit is far from r = 2M and treat the last term as a perturbation. We
expand x into a Newtonian solution plus a small deviation: x = x0 + x1 . The solution for the
zeroth-order equation is
x0 = 1 + e cos ϕ. (13.14)
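Then the first-order equation becomes
$$\frac{d^2 x_1}{d\phi^2} + x_1 = \frac{3M^2}{L^2}(1 + e\cos\phi)^2. \tag{13.15}$$
Keeping only the term that accumulates over successive orbits, the solution can be written as
$$x = 1 + e\cos[(1 - \alpha)\phi], \quad \text{where} \quad \alpha = \frac{3M^2}{L^2}. \tag{13.16}$$
During each orbit, the perihelion advances by an angle
$$\Delta\phi = 2\pi\alpha = \frac{6\pi M^2}{L^2}. \tag{13.17}$$
An ordinary ellipse satisfies
$$r = \frac{(1 - e^2)a}{1 + e\cos\phi}, \tag{13.18}$$
where $a$ is the semi-major axis. Comparing to our zeroth-order solution and the definition of $x$, we see that
$$L^2 \approx M(1 - e^2)a. \tag{13.19}$$
Hence, we have
$$\Delta\phi = \frac{6\pi M}{(1 - e^2)a}. \tag{13.20}$$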
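To get a feel for the size of (13.20), here is a short sketch that evaluates it for Mercury. The orbital elements and physical constants below are standard reference values quoted from memory and should be treated as assumptions; restoring units means replacing $M$ by $GM/c^2$.

```python
import math

# Reference values (assumptions for this sketch, not taken from the notes)
GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3 / s^2
C = 299_792_458.0           # speed of light, m / s

def perihelion_advance_per_orbit(a, e, gm=GM_SUN, c=C):
    """Eq. (13.20) with units restored: 6*pi*(GM/c^2) / ((1 - e^2) a), in radians."""
    m_geom = gm / c**2  # mass expressed as a length
    return 6 * math.pi * m_geom / ((1 - e**2) * a)

# Approximate orbital elements of Mercury
a_mercury = 5.7909e10       # semi-major axis, m
e_mercury = 0.2056          # eccentricity
period_days = 87.969        # orbital period, days

per_orbit = perihelion_advance_per_orbit(a_mercury, e_mercury)
orbits_per_century = 36525.0 / period_days
arcsec_per_century = math.degrees(per_orbit * orbits_per_century) * 3600
print(f"{arcsec_per_century:.1f} arcseconds per century")
```

With these inputs the result comes out close to the well-known figure of about 43 arcseconds per century.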
The ranges of the new coordinates are $0 \le \xi < \pi$ and $|\psi| + \xi < \pi$. The corresponding Penrose diagram of Minkowski spacetime is shown in Figure 13.3.
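[Figure 13.3: Penrose diagram of Minkowski spacetime, marking $i^+$, $\mathscr{I}^+$, $i^0$, $\mathscr{I}^-$, $i^-$, the line $r = 0$, and curves of constant $t$ and constant $r$.]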
It is obvious that light cones are still 45 degree lines in this Penrose diagram. What is more,
points and regions located at infinite distances in the original coordinates have now been
brought to a finite distance. The figure indicates a set of different kinds of “infinity” that is
useful in the discussion of physical phenomena. The following list gives the definition of each
of these:
Black holes are characterized by the fact that you can enter them, but never exit. Thus, their
most important feature is actually not the singularity at the center, but the event horizon at
the boundary. An event horizon is a hypersurface separating those spacetime points that are
connected to infinity by a timelike path from those that are not. In general relativity, the global
structure of spacetime can take many different forms, with correspondingly different notions
of infinity. But to think about black holes in the real universe, we use infinity as a proxy for
“well outside the black hole,” and imagine that spacetime sufficiently far away from the hole
can be approximated by Minkowski space.
[Figure 13.4: Penrose diagram with the past event horizon marked.]
From the Penrose diagram in Figure 13.4, the future event horizon is the surface beyond which timelike curves cannot escape to infinity. Since the causal past $J^-$ of a region is the set of all points we can reach from that region by moving along past-directed timelike paths, the event horizon can be equivalently defined as the boundary of $J^-(\mathscr{I}^+)$, the causal past of future null infinity. Analogous definitions hold for the past horizon. From the definition, it is clear that the event horizon is a null hypersurface.
For large $r$ the slope is $\pm 45^\circ$, as it would be in flat space, while as we approach $r = 2M$ we get $dt/dr = \pm\infty$, and the light cones "close up", as shown in Figure 13.5. A light ray that approaches $r = 2M$ never seems to get there, at least in this coordinate system; instead it seems to asymptote to this radius.
If we stayed outside while an intrepid observational general relativist dove into the black hole,
sending back signals all the time, we would simply see the signals reach us more and more
slowly, as shown in Figure 13.6.
Figure 13.6: A beacon falling freely into a black hole emits signals at intervals of constant
proper time. An observer at fixed r receives the signals at successively longer time intervals.
The fact that we never see the infalling observer reach r = 2M is a meaningful statement, but
the fact that their trajectory in the t − r plane never reaches there is not. It is highly dependent
on our coordinate system, and we would like to ask a more coordinate-independent question
(such as, “Does the observer reach this radius in a finite amount of their proper time?”). The
best way to do this is to change coordinates to a system that is better behaved at r = 2M .
We omit the intermediate steps of coordinate transformation and move to Kruskal coordinates
13.2 Reissner-Nordström black holes –137/453–
$$ds^2 = \frac{32M^3}{r} e^{-r/2M}\left(-dT^2 + dR^2\right) + r^2 d\Omega^2. \tag{13.27}$$
We can now draw a spacetime diagram in the T − R plane, known as a Kruskal diagram,
representing the maximal extension of the Schwarzschild geometry, as shown in Figure 13.7.
Figure 13.7: The Kruskal diagram: the Schwarzschild solution in Kruskal coordinates, where all light cones are at $\pm 45^\circ$. The diagram marks $r = 0$, the lines $r = 2M$ (where $t = \pm\infty$), and curves of constant $r$ and constant $t$.
We can further transform the coordinates to bring them into a finite range and get the Penrose
diagram of Schwarzschild spacetime. The transformation is given by
$$T + R = \tan\frac{\psi + \xi}{2}, \qquad T - R = \tan\frac{\psi - \xi}{2}. \tag{13.28}$$
$$\tan\frac{\psi + \xi}{2}\,\tan\frac{\psi - \xi}{2} = T^2 - R^2 < 1. \tag{13.29}$$
Figure 13.8: The Penrose diagram for the Schwarzschild spacetime, with the horizons labelled.
once again be a timelike coordinate, but with reversed orientation; you are forced to move in
the direction of increasing r. You will eventually be spit out past r = r+ once more, which is
like emerging from a white hole into the rest of the universe. From here you can choose to go
back into a different hole and repeat the voyage as many times as you like.
[Figure: Penrose diagrams with regions labelled I, II and III. (a) Ayón-Beato-García black hole spacetime; (b) Extremal Ayón-Beato-García; (c) No black hole.]
If $M^2 < Q^2/4\pi$, $\Delta$ is always positive and the metric is completely regular all the way down to $r = 0$. The coordinate $t$ is always timelike, and $r$ is always spacelike. But there is still the singularity at $r = 0$, which is now a timelike line. Since there is no event horizon, there is no obstruction to an observer traveling to the singularity and returning to report on what was observed. This is a naked singularity. A careful analysis of the geodesics reveals that the singularity is repulsive: timelike geodesics never intersect $r = 0$; instead they approach and then reverse course and move away. Null geodesics can reach the singularity, as can nongeodesic timelike curves. As $r \to \infty$ the solution approaches flat spacetime, and as we have just seen the causal structure seems normal everywhere. The conformal diagram will therefore be just like that of Minkowski space, except that now $r = 0$ is a singularity.
where
$$\Delta(r) = r^2 - 2Mr + a^2, \qquad \rho^2(r, \theta) = r^2 + a^2\cos^2\theta. \tag{13.32}$$
We can show that $a$ is the angular momentum per unit mass of the black hole.
From the form of the Kerr metric, it is obvious that the metric becomes ill-defined at $\rho = 0$ and at $\Delta = 0$. The calculation of scalar invariants of the curvature tensor shows that $\rho = 0$ is indeed a physical singularity. The condition $\rho = 0$ corresponds to
$$\rho^2 = r^2 + a^2\cos^2\theta = 0, \tag{13.35}$$
which can only be satisfied with $\theta = \pi/2$ and $r = 0$. Hence we have a ring-like singularity in the case of the Kerr metric. The curvature invariants are well behaved at $\Delta = 0$. The condition $\Delta(r) = 0$ is equivalent to $g^{rr} = 0$, so $\Delta(r) = 0$ is a null surface. The quadratic equation $\Delta = 0$ has two roots if $|a| < M$,
$$r_h = M \pm \sqrt{M^2 - a^2}, \tag{13.36}$$
representing the inner and outer horizons in Kerr spacetime. The corresponding Penrose diagram is shown in Figure 13.10.
There are two Killing vectors of the metric, K µ = (∂t )µ and Rµ = (∂ϕ )µ . The norms of these
Killing vectors are scalar quantities with a coordinate-free, geometrical interpretation. This
13.3 Kerr black holes –141/453–
Figure 13.10: The Penrose diagram for the Kerr spacetime for the case $M > |a|$. Labels recovered from the diagram: the anti-universe, the ring singularity, and the future internal event horizon.
Let us consider the surface defined by the quadratic equation $g_{tt} = 0$. Consider a class of observers with four-velocity $u^\mu$ in the direction of the timelike Killing vector $K^\mu$, that is, $u^\mu = K^\mu/\sqrt{-K^\nu K_\nu}$. For any photon with four-momentum $p^\mu$ propagating in this spacetime, such an observer will attribute to it a frequency
$$\omega = -p_\mu u^\mu = -\frac{p_\mu K^\mu}{\sqrt{-K^\nu K_\nu}} = \frac{E}{\sqrt{-g_{tt}}}, \tag{13.38}$$
where $E = -p_\mu K^\mu$ is the conserved "energy" of the photon. Thus it is easy to see that the surface with $g_{tt} = 0$ corresponds to infinite redshift. For the Kerr metric, the equation $g_{tt} = 0$ also has two solutions, $r = r_\pm$, given by
$$r_\pm = M \pm \sqrt{M^2 - a^2\cos^2\theta}. \tag{13.39}$$
Physically, this corresponds to a surface of infinite redshift, usually called an ergosurface. The region between the outer ergosurface and the outer horizon is called the ergosphere. The geometrical structure of the Kerr black hole is shown in Figure 13.11.
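The relative position of the horizons (13.36) and the ergosurfaces (13.39) can be tabulated with a few lines of code. This is a minimal sketch; the function names and the sample spin $a = 0.6M$ are assumptions chosen for illustration.

```python
import math

def kerr_horizons(M, a):
    """Inner and outer horizon radii from Delta = 0, eq. (13.36). Requires |a| <= M."""
    root = math.sqrt(M**2 - a**2)
    return M - root, M + root

def kerr_ergosurfaces(M, a, theta):
    """Inner and outer roots of g_tt = 0 at polar angle theta, eq. (13.39)."""
    root = math.sqrt(M**2 - (a * math.cos(theta))**2)
    return M - root, M + root

M, a = 1.0, 0.6  # illustrative values, not taken from the notes
r_inner, r_outer = kerr_horizons(M, a)
for theta in (0.0, math.pi / 4, math.pi / 2):
    _, r_ergo = kerr_ergosurfaces(M, a, theta)
    print(f"theta = {theta:.3f}: outer horizon = {r_outer:.4f}, "
          f"outer ergosurface = {r_ergo:.4f}")
```

The outer ergosurface touches the outer horizon at the poles and reaches $r = 2M$ in the equatorial plane, so the ergosphere is thickest at the equator.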
Figure 13.11: Schematic picture showing the geometrical structure of the Kerr spacetime.
Consider a stationary observer with an angular velocity $\Omega$ in the Kerr spacetime with
$$\Omega = \frac{d\phi}{dt} = \frac{u^\phi}{u^t}. \tag{13.40}$$
Such an observer has a four-velocity $(u^t, 0, 0, \Omega u^t)$. From the normalization $u^\mu u_\mu = -1$, we have
$$g_{tt} + 2g_{t\phi}\Omega + g_{\phi\phi}\Omega^2 < 0. \tag{13.41}$$
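This condition limits the range of values allowed for the angular velocity to
$$\Omega_{\min} < \Omega < \Omega_{\max}, \tag{13.42}$$
where
$$\Omega_{\min} = \omega - \sqrt{\omega^2 - g_{tt}/g_{\phi\phi}}, \qquad \Omega_{\max} = \omega + \sqrt{\omega^2 - g_{tt}/g_{\phi\phi}}, \tag{13.43}$$
with
$$\omega = -\frac{g_{\phi t}}{g_{\phi\phi}} = \frac{2Mar}{(r^2 + a^2)^2 - \Delta a^2\sin^2\theta}. \tag{13.44}$$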
First, far away from the black hole, we have $r\Omega_{\min} = -1$ and $r\Omega_{\max} = +1$, which corresponds to the standard result that motion must be at a speed less than that of light. Second, as one moves closer to the black hole, $\Omega_{\min}$ increases due to the dragging of inertial frames. Eventually, $\Omega_{\min}$ reaches zero at the surface on which $g_{tt} = 0$, which is the ergosurface. Therefore, inside the ergosphere, all stationary observers must orbit the black hole with $\Omega > 0$, and hence static observers can exist only outside the ergosurface. Finally, as one crosses the ergosurface and moves towards the event horizon, the allowed angular velocities become ever more positive while the allowed range narrows. At the event horizon, $\Omega_{\min}$ and $\Omega_{\max}$ coincide and all timelike worldlines point inwards. The limiting angular velocity is given by
$$\Omega_H \equiv \omega(r_h, \theta) = \frac{a}{2Mr_h}. \tag{13.45}$$
This limiting angular velocity is sometimes called the angular velocity of the horizon.
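The value quoted in (13.45) follows in one step from (13.44) evaluated on the horizon. The short derivation below uses only $\Delta(r_h) = 0$ from (13.32).

```latex
% On the horizon, \Delta(r_h) = r_h^2 - 2 M r_h + a^2 = 0, so
r_h^2 + a^2 = 2 M r_h .
% Setting \Delta = 0 in (13.44) removes the \theta-dependent term:
\omega(r_h, \theta)
  = \frac{2 M a\, r_h}{(r_h^2 + a^2)^2}
  = \frac{2 M a\, r_h}{4 M^2 r_h^2}
  = \frac{a}{2 M r_h} .
```

The result is independent of $\theta$, which is why a single angular velocity can be assigned to the whole horizon.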
This result can be used to extract energy from the Kerr black hole in several ways, of which the simplest is the following. Consider, for example, a particle A moving in the ergosphere which breaks into two particles B and C. We let particle B fall into the black hole and let particle C escape to infinity. All this can be done using suitable timelike trajectories. The conservation of four-momentum requires that
$$E_A = E_B + E_C. \tag{13.46}$$
Since particle A can fall into the ergosphere from infinity, we have $E_A > m$. We can arrange the trajectory of B so that $E_B < 0$. It immediately follows that $E_C > E_A$. When particle C goes back to infinity, it will have more energy than the original particle had. Thus, using the existence of negative energy orbits in the ergosphere and the local conservation of energy for processes taking place in the ergoregion, one can extract energy from the black hole.
The Penrose process decreases both the mass and the angular momentum of the Kerr black hole by an amount equal to the negative of the energy and the angular momentum of the particle B that falls into the black hole. Consider the dot product of $p^\mu$ with the vector $K^\mu + \omega R^\mu$. Since $K^\mu + \omega R^\mu$ is timelike outside the horizon and we want particle B to fall into the horizon, it is necessary that this dot product is negative. Using $p^\mu K_\mu = -E$ and $p^\mu R_\mu = L$, where $E$ and $L$ are the conserved energy and angular momentum of particle B, we get the condition $-E + \Omega_H L < 0$. When particle B falls into the black hole, the angular momentum and mass of the Kerr black hole will change by $\delta J = L$ and $\delta M = E$. Hence the above bound translates into the result
$$\delta M > \Omega_H\,\delta J. \tag{13.47}$$
$$A = 4\pi(r_h^2 + a^2). \tag{13.48}$$
The radii of stable circular orbits are determined by the minima of $U(r)$; that is, by the simultaneous solution to the equations $E = U(r)$, $U'(r) = 0$. Among all the stable circular orbits, we are interested in the innermost one. A fairly lengthy but straightforward calculation shows that this orbital radius is the solution to the quartic equation
$$r^2 - 6Mr - 3a^2 + 8a\sqrt{Mr} = 0. \tag{13.54}$$
When $a = 0$ we get the standard result that $r = 6M$; as the rotation parameter increases, the radius of the circular orbit decreases for the co-rotating orbit. This shows that one can have stable circular orbits very close to the black hole in the case of rotating black holes. The quantity $(m - E)/m$ represents the fraction of the rest energy that can be released when a particle falls from the innermost stable circular orbit into the black hole. In the extreme case of $a = M$, this fraction is $1 - 1/\sqrt{3}$, which is about 42 percent, while the corresponding value for orbits in the Schwarzschild metric is only about 5.7 percent. This higher efficiency could be of use in certain astrophysical scenarios.
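The numbers quoted above can be reproduced by solving (13.54) numerically. The sketch below uses plain bisection. The energy formula $E/m = \sqrt{1 - 2M/(3r)}$ at the innermost stable orbit is a standard result for the Kerr metric that is not derived in these notes, so it is an assumption of this example, as are the function names.

```python
import math

def isco_radius(a, M=1.0, tol=1e-12):
    """Root of r^2 - 6 M r - 3 a^2 + 8 a sqrt(M r) = 0, eq. (13.54), in [M, 9M].

    Valid for a co-rotating orbit with 0 <= a <= M. Near a = M the root is
    nearly degenerate, so the attainable accuracy is lower there.
    """
    def f(r):
        return r**2 - 6 * M * r - 3 * a**2 + 8 * a * math.sqrt(M * r)

    lo, hi = M, 9.0 * M  # f(lo) <= 0 and f(hi) > 0 for 0 <= a <= M
    while hi - lo > tol * M:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def isco_efficiency(a, M=1.0):
    """Fraction (m - E)/m released, assuming E/m = sqrt(1 - 2M/(3 r_isco))."""
    r = isco_radius(a, M)
    return 1.0 - math.sqrt(1.0 - 2.0 * M / (3.0 * r))

for spin in (0.0, 0.5, 0.9, 1.0):
    print(f"a/M = {spin:.1f}: r_isco/M = {isco_radius(spin):.4f}, "
          f"efficiency = {100 * isco_efficiency(spin):.1f}%")
```

For $a = 0$ this gives $r = 6M$ and an efficiency of about 5.7 percent; for $a = M$ the radius approaches $M$ and the efficiency approaches $1 - 1/\sqrt{3}$.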
Chapter 14
Geometry of the Universe
cold matter (baryons, cold dark matter): $p_{CM} = 0$ and $\rho_{CM} \propto a^{-3}$
$$\frac{H^2}{H_0^2} = \Omega_{R0}\, a^{-4} + \Omega_{CM0}\, a^{-3} + \Omega_{DE0} + \Omega_{K0}\, a^{-2}. \tag{14.8}$$
The subscript 0 denotes today's value of a parameter, and we scale the coordinates so that $a_0 = 1$. Here $H \equiv \dot a/a$, and $H_0$ is called Hubble's constant. Each $\Omega_0$ is the ratio between the present density $\rho_0$ of some substance and the critical density $\rho_{c0} \equiv 3H_0^2/8\pi$. In particular, $\rho_{K0} = -3K/8\pi$ is called the density of curvature energy, and we have $\Omega_{K0} = 1 - \Omega_{DE0} - \Omega_{CM0} - \Omega_{R0}$.
It follows that
$$\frac{dp_t}{p_t} + \frac{da}{a} = 0. \tag{14.12}$$
Notice that $a_0 = 1$, and so $p_{t0} = a\,p_t$. Define the cosmological redshift by $p_{t0} = p_t/(1 + z)$. We have the relation
$$a = \frac{1}{1 + z}. \tag{14.13}$$
Luminosity
Suppose there is an object with intrinsic luminosity $L$ at $r = 0$ and time $t$, and that a photon it emits propagates to $r = r_0$ (our position) at time $t_0$ (now). In coordinates $(\eta, \chi)$, we have $\eta_0 = \chi_0$. Hence
$$\chi_0(z) = \int_t^{t_0} \frac{dt'}{a} = \int_0^z \frac{dz'}{H(z')}, \tag{14.14}$$
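$$r_0 \equiv \begin{cases} \sin\left(\sqrt{K}\chi_0\right)/\sqrt{K}, & K > 0, \\ \chi_0, & K = 0, \\ \sinh\left(\sqrt{-K}\chi_0\right)/\sqrt{-K}, & K < 0. \end{cases} \tag{14.15}$$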
The area of the two-surface $t = t_0$, $r = r_0$ is $4\pi r_0^2$. In a time interval $\Delta t$, the object emits $\Delta N$ photons, so $L = \epsilon\Delta N/\Delta t$, where $\epsilon$ is the energy of a photon. The time interval for the receiver is $\Delta t_0 = \Delta t/a$ (it is easy to verify in coordinates $(\eta, \chi)$ that $\Delta\eta = \Delta\eta'$). Taking into account the redshift of the photon, the flux we measure is
$$f = \frac{\epsilon\Delta N}{(1 + z)\,\Delta t_0\, 4\pi r_0^2} = \frac{L}{4\pi d_L^2}, \quad \text{where} \quad d_L \equiv (1 + z)\, r_0. \tag{14.16}$$
Size
Suppose there is an object with intrinsic size $\Delta l$ at time $t$. Now we put ourselves at $r = 0$, so the object will be on the two-surface with metric $d\sigma^2 = a(t)^2 r_0^2\, d\Omega^2$. The angle it subtends relative to us satisfies $a r_0 \Delta\theta = \Delta l$. Hence we have
$$\Delta\theta = \frac{\Delta l}{d_A}, \quad \text{where} \quad d_A \equiv \frac{r_0}{1 + z}. \tag{14.17}$$
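The distance measures (14.14), (14.16) and (14.17) are easy to evaluate numerically once $H(z)$ is known from (14.8). The sketch below is restricted to a spatially flat universe ($\Omega_{K0} = 0$, so $r_0 = \chi_0$) and works in units of $1/H_0$; the default density parameters and all function names are illustrative assumptions, not values from these notes.

```python
import math

def hubble_ratio(z, omega_cm=0.3, omega_r=0.0):
    """H(z)/H0 from eq. (14.8) with a = 1/(1+z), assuming Omega_K0 = 0."""
    omega_de = 1.0 - omega_cm - omega_r  # flatness fixes the dark-energy term
    return math.sqrt(omega_r * (1 + z)**4 + omega_cm * (1 + z)**3 + omega_de)

def comoving_distance(z, omega_cm=0.3, omega_r=0.0, n=1000):
    """H0 * chi0(z) from eq. (14.14), by composite Simpson's rule (n even)."""
    h = z / n
    total = (1.0 / hubble_ratio(0.0, omega_cm, omega_r)
             + 1.0 / hubble_ratio(z, omega_cm, omega_r))
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight / hubble_ratio(i * h, omega_cm, omega_r)
    return total * h / 3.0

def distances(z, omega_cm=0.3, omega_r=0.0):
    """Return (H0*r0, H0*d_L, H0*d_A); for K = 0, r0 = chi0 by eq. (14.15)."""
    r0 = comoving_distance(z, omega_cm, omega_r)
    return r0, (1 + z) * r0, r0 / (1 + z)

r0, d_lum, d_ang = distances(1.0)
print(f"z = 1: H0*r0 = {r0:.4f}, H0*d_L = {d_lum:.4f}, H0*d_A = {d_ang:.4f}")
```

A convenient sanity check is the matter-only case `omega_cm=1.0`, where the integral can be done by hand: $H_0\chi_0 = 2\left(1 - 1/\sqrt{1+z}\right)$.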
Part IV
Quantum Mechanics
Chapter 15
Linear Algebra
A linear vector space is a set of elements, called vectors, which is closed under addition and multiplication by scalars. That is to say, if $\phi$ and $\psi$ are vectors then so is $a\phi + b\psi$, where $a$ and $b$ are arbitrary scalars. If the scalars belong to the field of complex (real) numbers, we speak of a complex (real) linear vector space. Henceforth the scalars will be complex numbers unless otherwise stated.
Example:
2. Spaces of functions of some type, for example, the space of all differentiable functions
The maximum number of linearly independent vectors in a space is called the dimension of the space.
15.1 Linear Vector Space –151/453–
A maximal set of linearly independent vectors is called a basis for the space. Any vector in the space can be expressed as a linear combination of the basis vectors.
An inner product (or scalar product) for a linear vector space associates a scalar $(\phi, \psi)$ with every ordered pair of vectors. It must satisfy the following properties:
1. $(\phi, \psi)$ = a complex number.
2. $(\phi, \psi) = (\psi, \phi)^*$.
3. $(\phi, c_1\psi_1 + c_2\psi_2) = c_1(\phi, \psi_1) + c_2(\phi, \psi_2)$.
4. $(\phi, \phi) \ge 0$, with equality holding if and only if $\phi = 0$.
Example:
1. If ψ is the column vector with elements a1 , a2 , · · · , and ϕ is the column vector with
elements b1 , b2 , · · · , then
A set of vectors $\{\phi_n\}$ is said to be orthonormal if the vectors are pairwise orthogonal and of unit norm; that is to say, their inner products satisfy $(\phi_m, \phi_n) = \delta_{mn}$.
Corresponding to any linear vector space V there exists the dual space of linear func-
tionals on V . A linear functional F assigns a scalar F (ϕ) to each vector ϕ, such that
f being a fixed vector, and ϕ being an arbitrary vector. Thus the spaces V and V ′ are
essentially isomorphic.
In Dirac’s notation, which is very popular in quantum mechanics, the vectors in V are called
ket vectors, and are denoted as |ϕ⟩. The linear functionals in the dual space V ′ are called bra
vectors, and are denoted as ⟨F |. The numerical value of the functional is denoted as
According to the Riesz theorem, there is a one-to-one correspondence between bras and kets.
Therefore we can use the same alphabetic character for the functional (a member of V ′ ) and
the vector (in V ) to which it corresponds, relying on the bra, ⟨F |, or ket, |F ⟩, notation to
determine which space is referred to. It follows that
Notice that the Riesz theorem establishes, by construction, an antilinear correspondence be-
tween bras and kets. If ⟨F | ↔ |F ⟩, we will have the correspondence
An operator on a vector space maps vectors onto vectors. A linear operator satisfies that
According to the Riesz theorem there must exist a vector χ such that
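If we define the operator $A^\dagger$ by
$$A^\dagger|\phi\rangle = |\chi\rangle, \tag{15.16}$$
we will have
$$\langle\phi|A|\psi\rangle = \langle A^\dagger\phi|\psi\rangle = \langle\psi|A^\dagger|\phi\rangle^*. \tag{15.17}$$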
Proposition 15.1
An operator $A$ that is equal to its adjoint $A^\dagger$ is called self-adjoint. This means that it satisfies
$$\langle\phi|A|\psi\rangle = \langle\psi|A|\phi\rangle^* \tag{15.21}$$
and that the domain of $A$ coincides with the domain of $A^\dagger$. An operator that satisfies only the above equation is called Hermitian.
Theorem 15.4
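If $\langle\psi|A|\psi\rangle = \langle\psi|A|\psi\rangle^*$ for all $|\psi\rangle$, it would follow that $\langle\phi_1|A|\phi_2\rangle = \langle\phi_2|A|\phi_1\rangle^*$ for all $|\phi_1\rangle$ and $|\phi_2\rangle$, and hence that $A = A^\dagger$.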
If an operator acting on a certain vector produces a scalar multiple of that same vector, we call the vector $|\phi\rangle$ an eigenvector and the scalar an eigenvalue of the operator $A$.
The antilinear correspondence between bras and kets, and the definition of the adjoint
operator A† , imply the left-handed eigenvalue equation
Theorem 15.5
Theorem 15.6
If the orthonormal set of vectors $\{\phi_i\}$ is complete, then we can expand an arbitrary vector $|v\rangle$ in terms of it:
$$|v\rangle = \sum_i |\phi_i\rangle\left(\langle\phi_i|v\rangle\right) = \left(\sum_i |\phi_i\rangle\langle\phi_i|\right)|v\rangle. \tag{15.24}$$
Therefore,
$$\sum_i |\phi_i\rangle\langle\phi_i| = I. \tag{15.25}$$
If $A|\phi_i\rangle = a_i|\phi_i\rangle$ and the eigenvectors form a complete orthonormal set, then the operator can be reconstructed in a useful diagonal form in terms of its eigenvalues and eigenvectors:
$$A = \sum_i a_i |\phi_i\rangle\langle\phi_i|. \tag{15.26}$$
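The completeness relation (15.25) and the diagonal form (15.26) can be verified directly in a two-dimensional example. The sketch below uses the eigenvectors of the Pauli matrix $\sigma_y$ as an illustrative choice (it is not an example from these notes) and plain nested lists for the matrices.

```python
import math

def outer(ket, bra):
    """Matrix of |ket><bra|: element (m, n) is ket[m] * conj(bra[n])."""
    return [[k * complex(b).conjugate() for b in bra] for k in ket]

def add(A, B):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(c, A):
    """Multiply every matrix element by the scalar c."""
    return [[c * x for x in row] for row in A]

s = 1 / math.sqrt(2)
# Eigenvalues and normalized eigenvectors of sigma_y = [[0, -i], [i, 0]]
eigenpairs = [(+1.0, [s, 1j * s]), (-1.0, [s, -1j * s])]

zero = [[0j, 0j], [0j, 0j]]
identity, A = zero, zero
for a_i, phi in eigenpairs:
    projector = outer(phi, phi)                 # |phi_i><phi_i|
    identity = add(identity, projector)         # builds up eq. (15.25)
    A = add(A, scale(a_i, projector))           # builds up eq. (15.26)

print(identity)  # approximately the 2x2 identity
print(A)         # approximately [[0, -i], [i, 0]]
```

The complex conjugation inside `outer` reflects the antilinear correspondence between kets and bras; without it the reconstructed operator would not be Hermitian.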
The Hermitian operators in a finite $N$-dimensional vector space have complete sets of eigenvectors, but this statement does not carry over to infinite-dimensional spaces. A Hermitian operator in an infinite-dimensional vector space may or may not possess a complete set of eigenvectors, depending upon the precise nature of the operator and the vector space. Instead, we have the following spectral theorem.
Theorem 15.7
The orthonormality condition for the continuous case takes the form
Evidently the norm of these formal eigenvectors is infinite, since $\langle q|q\rangle \to \infty$. Instead of the spectral theorem for $Q$, Dirac would write
$$Q = \int_{-\infty}^{\infty} q\, |q\rangle\langle q|\, dq. \tag{15.31}$$
Dirac's formulation does not fit into the mathematical theory of Hilbert space, which admits only vectors of finite norm. The projection operator formally given by
$$E(\lambda) = \int_{-\infty}^{\lambda} |q\rangle\langle q|\, dq \tag{15.32}$$
is well defined in Hilbert space, but its derivative does not exist within the Hilbert space framework.
Theorem 15.8
If $A$ and $B$ are self-adjoint operators, each of which possesses a complete set of eigenvectors, and if $AB = BA$, then there exists a complete set of vectors which are eigenvectors of both $A$ and $B$.
Let $(A, B, \cdots)$ be a set of mutually commutative operators that possess a complete set of common eigenvectors. Corresponding to a particular eigenvalue for each operator, there may be more than one eigenvector. If, however, there is no more than one eigenvector (apart from the arbitrary phase and normalization) for each set of eigenvalues $(a_n, b_m, \cdots)$, then the operators $(A, B, \cdots)$ are said to be a complete commuting set of operators.
Theorem 15.9
Any operator that commutes with all members of a complete commuting set must be a function of the operators in that set.
15.4 Rigged Hilbert space –157/453–
Formally, a rigged Hilbert space consists of a Hilbert space $H$, together with a subspace $\Phi$ which carries a finer topology, that is, one for which the natural inclusion $\Phi \subseteq H$ is continuous. It is no loss to assume that $\Phi$ is dense in $H$ for the Hilbert norm. We consider the inclusion of the conjugate space $H^\times$ in $\Phi^\times$, where $\Phi^\times$ is the space of $\tau_\Phi$-continuous antilinear functionals on $\Phi$. For any $\phi \in \Phi$ and $F \in \Phi^\times$, we define
$$\langle F|\phi\rangle \equiv F(\phi), \qquad \langle\phi|F\rangle \equiv [F(\phi)]^*. \tag{15.33}$$
Now by applying the Riesz representation theorem we can identify $H^\times$ with $H$. Therefore, the definition of rigged Hilbert space is in terms of a sandwich:
$$\Phi \subseteq H \subseteq \Phi^\times. \tag{15.34}$$
There may or may not exist any solutions to the eigenvalue equation $A|a_n\rangle = a_n|a_n\rangle$ for a self-adjoint operator $A$ on an infinite-dimensional vector space. However, the generalized spectral theorem asserts that if $A$ is self-adjoint in $H$ then a complete set of eigenvectors exists in the extended space $\Phi^\times$. The precise conditions for the proof of this theorem are rather technical, so the interested reader can refer to Gel'fand and Vilenkin (1964) for further details.
There are many examples of rigged-Hilbert-space triplets. A Hilbert space $H$ is formed by those functions that are square-integrable. That is, $H$ consists of those functions $\psi(x)$ for which
$$\langle\psi|\psi\rangle = \int_{-\infty}^{\infty} |\psi(x)|^2\, dx \text{ is finite}. \tag{15.35}$$
A nuclear space $\Phi$ is made up of functions $\psi(x)$ which satisfy the infinite set of conditions,
$$\int_{-\infty}^{\infty} |\psi(x)|^2 (1 + |x|)^m\, dx \text{ is finite for } m = 0, 1, 2, \cdots \tag{15.36}$$
The functions $\psi(x)$ which make up $\Phi$ must vanish more rapidly than any inverse power of $x$ in the limit $|x| \to \infty$. The extended space $\Phi^\times$, which is conjugate to $\Phi$, consists of those functions $\chi(x)$ for which
$$\langle\chi|\psi\rangle = \int_{-\infty}^{\infty} \chi^*(x)\psi(x)\, dx \text{ is finite for any } \psi \text{ in } \Phi. \tag{15.37}$$
In addition to the functions of finite norm, which also lie in $H$, $\Phi^\times$ will contain functions that are unbounded at infinity, provided the divergence is no worse than a power of $x$. Hence $\Phi^\times$ contains $e^{ikx}$, which is an eigenfunction of the operator $D = i\, d/dx$. It also contains the Dirac delta function, $\delta(x - \lambda)$, which is an eigenfunction of the operator $X$, defined by $X\psi(x) = x\psi(x)$. These two examples suffice to show that rigged Hilbert space seems to be a more natural mathematical setting for quantum mechanics than is Hilbert space.
Consider a family of unitary operators, $U(s)$, that depend on a single continuous parameter $s$. Let $U(0) = I$ be the identity operator, and let $U(s_1 + s_2) = U(s_1)U(s_2)$. We can demonstrate that
$$\frac{dU}{ds}\bigg|_{s=0} = iK \quad \text{with} \quad K = K^\dagger. \tag{15.38}$$
The Hermitian operator $K$ is called the generator of the family of unitary operators because it determines $U(s)$, not only for infinitesimal $s$, but for all $s$. This can be shown by differentiating $U(s_1 + s_2) = U(s_1)U(s_2)$ with respect to $s_2$ at $s_2 = 0$,
$$\frac{dU}{ds}\bigg|_{s=s_1} = U(s_1)\, iK. \tag{15.40}$$
This first-order differential equation with initial condition $U(0) = I$ has the unique solution $U(s) = e^{iKs}$.
Proposition 15.2
2. With every physical property (energy, position, momentum, angular momentum, ...) there is associated a linear Hermitian operator $A$ (usually called an observable), which acts on the space of states. The eigenvalues of the operator are the possible values of the physical property.
3. If $|\psi\rangle$ is the vector representing the state of a system and $|\phi\rangle$ represents another physical state, there exists a probability $P(|\psi\rangle, |\phi\rangle)$ of finding $|\psi\rangle$ in state $|\phi\rangle$, which is given by the squared modulus of the scalar product on $H$: $P(|\psi\rangle, |\phi\rangle) = |\langle\psi|\phi\rangle|^2$.
4. The evolution of a closed system is unitary. The state vector $|\psi(t)\rangle$ at time $t$ is derived from the state vector $|\psi(t_0)\rangle$ at time $t_0$ by applying a unitary operator $U(t, t_0)$, called the evolution operator: $|\psi(t)\rangle = U(t, t_0)|\psi(t_0)\rangle$.
Any mapping of the vector space onto itself that preserves the value of $|\langle\phi|\psi\rangle|$ may be implemented by an operator $U$, with $U$ being either unitary (linear) or antiunitary (antilinear).
Continuous transformation
Only linear operators can describe continuous transformations because every continuous trans-
formation has a square root. Suppose, for example, that U (l) describes a displacement through
the distance l. This can be done by two displacements of U (l/2), and hence U (l) = U (l/2)U (l/2).
The product of two antilinear operators is linear, since the second complex conjugation nul-
lifies the effect of the first. Thus, regardless of the linear or antilinear character of U (l/2), it
must be the case that U (l) is linear. A continuous operator cannot change discontinuously
from linear to antilinear as a function of l, so the operator must be linear for all l.
Transformations of observables
It follows that
$$\frac{dU(t, t_0)}{dt}\bigg|_{t=t_1} = -iH(t_1)U(t_1, t_0). \tag{16.5}$$
If $\mathrm{T}$ denotes time ordering, which places all operators evaluated at later times to the left, equation 16.6 can be written as
$$U(t, t_0) = I + \sum_{n=1}^{\infty} \frac{(-i)^n}{n!} \int_{t_0}^{t} dt_1 \int_{t_0}^{t} dt_2 \cdots \int_{t_0}^{t} dt_n\, \mathrm{T}\{H(t_1)H(t_2)\cdots H(t_n)\} \equiv \mathrm{T}\exp\left(-i\int_{t_0}^{t} H(t')\, dt'\right). \tag{16.7}$$
If the Hamiltonian operator $H$ is time-dependent but the $H$'s at different times commute, equation 16.7 can be simplified to
$$U(t, t_0) = \exp\left(-i\int_{t_0}^{t} H(t')\, dt'\right). \tag{16.8}$$
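Since $|\psi(t)\rangle = U(t, t_0)|\psi(t_0)\rangle$, we can derive the Schrödinger equation
$$\frac{d|\psi(t)\rangle}{dt} = -iH(t)|\psi(t)\rangle. \tag{16.10}$$
The expectation value of an observable $Q$ is $\langle\psi|Q|\psi\rangle$, denoted by $\langle Q\rangle$. We then have
$$\frac{d\langle Q\rangle}{dt} = -i\langle[Q, H]\rangle + \left\langle\frac{\partial Q}{\partial t}\right\rangle, \tag{16.11}$$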
It describes a particle located at $x$ with internal state $s$. We would normalize $|x, s\rangle$ so that
that the position and momentum of particles cannot be measured simultaneously, so we would expect $[X, P] \neq 0$.
For a system which has a classical correspondence, the classical equation of motion of a particle
is
and
H = HC (X, P, t). (16.18)
as a definition for the momentum operator. The form of $H$ cannot be given a priori, and may be specified by hints from classical theory and experiments.
$$\exp(iG\lambda)\, A \exp(-iG\lambda) = A + i\lambda[G, A] + \cdots + \frac{i^n\lambda^n}{n!}[G, [G, [G, \cdots [G, A]]] \cdots ] + \cdots \tag{16.20}$$
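The nested-commutator expansion (16.20) can be tested on $2 \times 2$ matrices, where the left-hand side is known in closed form. The sketch below takes $G = \sigma_z/2$ and $A = \sigma_x$, an illustrative choice made here, for which the exact result is $\sigma_x\cos\lambda - \sigma_y\sin\lambda$. All helper names are assumptions of this example.

```python
import cmath
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def commutator(A, B):
    """[A, B] = AB - BA."""
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def nested_commutator_series(G, A, lam, terms=30):
    """Partial sum of eq. (16.20): sum over n of (i*lam)^n / n! * [G, [G, ... [G, A]]]."""
    total, nested, coeff = A, A, 1.0 + 0j
    for n in range(1, terms):
        nested = commutator(G, nested)  # one more layer of [G, . ]
        coeff *= 1j * lam / n           # (i*lam)^n / n!
        total = add(total, scale(coeff, nested))
    return total

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

lam = 0.7  # illustrative rotation angle
G = scale(0.5, sigma_z)
series = nested_commutator_series(G, sigma_x, lam)

# Direct evaluation: exp(i*G*lam) is diagonal for this choice of G.
U = [[cmath.exp(0.5j * lam), 0], [0, cmath.exp(-0.5j * lam)]]
U_dagger = [[cmath.exp(-0.5j * lam), 0], [0, cmath.exp(0.5j * lam)]]
direct = matmul(matmul(U, sigma_x), U_dagger)

closed_form = add(scale(math.cos(lam), sigma_x), scale(-math.sin(lam), sigma_y))
```

Here `series`, `direct` and `closed_form` agree to rounding error, which illustrates how the generator $G$ of a rotation about the 3-axis mixes the other two components.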
Thus, T (a) is the space translation operator and we can also define the momentum operator
as the generator of space translation.
–164/453– Chapter 16 Formulation of Quantum Mechanics
Experiments show that some microscopic particles possess a property called spin. The state of the spin is denoted by $|s\rangle$. The corresponding observables are $S = [S_1, S_2, S_3]$, which measure the spin along the 1, 2, 3 directions. The spin operator is the generator of rotations of the spin of the particle. Thus we have
$$[S_i, S_j] = i\epsilon_{ijk}S_k. \tag{16.26}$$
Rotations of position and spin are independent. It follows that
$$[L_i, S_j] = 0. \tag{16.27}$$
J = L + S. (16.28)
It is the generator of the rotation of the entire system, which is equivalent to the rotation of the
coordinates in opposite direction.
If $|q\rangle$ is an eigenstate of $Q$ with eigenvalue $q$, then $|q_H(t)\rangle \equiv U^\dagger(t, t_0)|q\rangle$ is an eigenstate of $Q_H$ with eigenvalue $q$. Thus, the probability distribution of the measurement of the observable $Q$ at time $t$ is
It follows that
$$\langle K\rangle(t) = \langle K\rangle(t_0), \qquad \langle k|\psi(t)\rangle = \langle k|\psi_0\rangle. \tag{16.36}$$
The expectation value and probability distribution of the measurement of the observable $K$ will not change with time for an arbitrary initial state. We then say that $K$ is a constant of motion.
Note: The concept of a constant of motion should not be confused with the concept of a stationary state.
Suppose that the Hamiltonian operator H is independent of t, and that the initial state vector is an eigenstate
of H, |ψ0 ⟩ = |En ⟩ with H |En ⟩ = En |En ⟩. This describes a state having a unique value of energy
En . The evolution of the state is
$$|\psi(t)\rangle = e^{-iE_n t}|\psi_0\rangle. \tag{16.37}$$
From this result it follows that the expectation value of any dynamical variable R
is independent of t for such a state. By considering functions of R we can further show that the probability
distribution is independent of time. In a stationary state the averages and probabilities of all dynamical
variables are independent of time, whereas a constant of motion has its average and probabilities independent
of time for all states.
Chapter 17
Coordinate and Momentum Representation
It is a matter of taste whether one says that the set of functions forms a representation of the
vector space, or that the vector space consists of the functions ψ(x). The action of an operator
A on the function space is related to its action on the abstract vector space by the rule
For a spin-less particle in the scalar potential W(x), the Hamiltonian is H = P²/2m + W(X).
The equation of motion in the coordinate representation is
$$\left[-\frac{1}{2m}\nabla^2 + W(x)\right]\psi(x,t) = i\,\frac{\partial\psi(x,t)}{\partial t}. \tag{17.6}$$
17.2 Galilei transformation of Schrödinger equation
x = x′ + vt′ , t = t′ . (17.7)
Because of the requirement of invariance under Galilei transformation, we expect that in F′ the Schrödinger
equation has the form
$$\left[-\frac{1}{2m}\frac{\partial^2}{\partial x'^2} + W'(x')\right]\psi'(x',t') = i\,\frac{\partial\psi'(x',t')}{\partial t'}, \tag{17.9}$$
where ψ ′ (x′ , t′ ) is the wave function in F ′ . The probability density at a point in spacetime must
be the same in the two frames of reference,
The equations of continuity require that the probability flux J (x, t) be continuous across any
surface, since otherwise the surface would contain sources or sinks. Although this condition
applies to all surfaces, implying that J (x, t) must be everywhere continuous, its practical ap-
plications are mainly to surfaces separating regions in which the potential has different analytic
forms. Usually, we have the following conditions,
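1.
$$\psi(x)\big|_{x+0} = \psi(x)\big|_{x-0}, \qquad \left.\frac{d\psi}{dx}\right|_{x+0} = \left.\frac{d\psi}{dx}\right|_{x-0}. \tag{17.16}$$
2.
$$\psi(x)\big|_{x+0} = \psi(x)\big|_{x-0} = 0, \qquad \left.\frac{d\psi}{dx}\right|_{x+0} - \left.\frac{d\psi}{dx}\right|_{x-0} \ \text{is finite}. \tag{17.17}$$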
Consider next the behavior at a singular point, assumed for convenience to be the origin of
coordinates. Let S be a small sphere of radius r surrounding the singularity. The probability
that the particle is inside S must be finite. Suppose that ψ = u/r^α, where u is a smooth
function that does not vanish at r = 0. Then we must have |ψ|² r³ convergent at the origin,
which implies that α < 3/2.
The net outward flow through the surface S is $F = \oint_S J\cdot dS$. It must vanish in the limit r → 0,
since otherwise the origin would be a point source or sink. One has ∂ψ/∂r = r^{−α} ∂u/∂r −
α u r^{−α−1}. The second term does not contribute to the flux, so we obtain
$$F = r^{2-2\alpha}\,\frac{-i}{2m}\oint\left(u^*\frac{\partial u}{\partial r} - u\frac{\partial u^*}{\partial r}\right)d\Omega, \tag{17.18}$$
where the integration is over solid angle. If the integral does not vanish, we must have α < 1
in order for F to vanish in the limit r → 0. This is a stronger condition than that derived from
the probability density.
Since |ψ|² is a probability density, it must vanish sufficiently rapidly at infinity so that its
integral over all configuration space is convergent and equal to 1. The conditions that we have
discussed apply to wave functions ψ(x) which represent physically realizable states, but they
need not apply to the eigenfunctions of operators that represent observables. Those eigenfunc-
tions, χ(x), which play the role of filter functions in computing probabilities, are only required
to lie in the extended space, ΦX , of the rigged-Hilbert-space triplet. It has been suggested that
ψ(x) be restricted to the nuclear space Φ, rather than merely to the Hilbert space H. In many
cases this would amount to requiring that ψ(x) should vanish at infinity more rapidly than
any inverse power of the distance.
The time evolution of a quantum state vector, |ψ(t)⟩ = U (t, t0 ) |ψ0 ⟩, can be regarded as the
propagation of an amplitude in configuration space,
$$\psi(x,t) = \int G(x,t;x',t_0)\,\psi(x',t_0)\,dx', \tag{17.20}$$
where
G(x, t; x′ , t0 ) = ⟨x|U (t, t0 )|x′ ⟩ (17.21)
is often called the propagator. Making use of the multiplicative property of the time develop-
ment operator, it follows that the propagator can be written as
$$G(x,t;x_0,t_0) = \int\cdots\int G(x,t;x_N,t_N)\cdots G(x_1,t_1;x_0,t_0)\,dx_N\cdots dx_1. \tag{17.22}$$
The N -fold integration is equivalent to a sum over zigzag paths that connect the initial point
(x0 , t0 ) to the final point (x, t). If we now pass to the limit of N → ∞ and ∆t = ti −ti−1 → 0,
we will have the propagator expressed as a sum (or, rather, as an integral) over all paths that
connect the initial point to the final point.
For H = P²/2m + V(X), it can be shown that
$$\langle x|e^{-iH\Delta t}|x'\rangle = \sqrt{\frac{m}{2i\pi\Delta t}}\,\exp\left\{i\left[\frac{m(x-x')^2}{2\Delta t^2} - V(x)\right]\Delta t\right\} \quad \text{as } \Delta t \to 0. \tag{17.23}$$
The integral can be regarded as a functional integration over all paths x(τ) which connect the
initial point (x0 , t0 ) to the final point (x, t).
To conclude this section, let us generalize our path-integral formula to more complicated
systems. Consider a very general quantum system, described by an arbitrary set of coordinates
qi , conjugate momenta pi , and Hamiltonian H(q, p). We can show that
$$\langle q_{k+1}|e^{-i\epsilon H}|q_k\rangle = \left(\prod_i\int\frac{dp_{i,k}}{2\pi}\right)\exp\left[-i\epsilon H\left(\frac{q_{k+1}+q_k}{2},p_k\right)\right]\exp\left[i\sum_i p_{i,k}(q_{i,k+1}-q_{i,k})\right] \tag{17.26}$$
and so
$$\langle q_N|U(t,t_0)|q_0\rangle = \left(\prod_{i,k}\int dq_{i,k}\int\frac{dp_{i,k}}{2\pi}\right)\exp\left[i\sum_k\left(\sum_i p_{i,k}(q_{i,k+1}-q_{i,k}) - \epsilon H\left(\frac{q_{k+1}+q_k}{2},p_k\right)\right)\right]. \tag{17.27}$$
There is one momentum integral for each k from 0 to N − 1, and one coordinate integral for each
k from 1 to N − 1. The propagator can also be expressed formally as
$$\langle q_N|U(t,t_0)|q_0\rangle = \left(\prod_i\int\mathcal{D}q(t)\,\mathcal{D}p(t)\right)\exp\left[i\int_0^T dt\left(\sum_i p_i\dot q_i - H(q,p)\right)\right], \tag{17.28}$$
where the functions q(t) are constrained at the endpoints, but p(t) are not. The details of
this generalization can be found in section 9.1 of An introduction to quantum field theory
(M.E.Peskin & D.V.Schroeder).
Bloch’s Theorem
A crystal is unchanged by translation through a vector displacement of the form
Rn = n1 a1 + n2 a2 + n3 a3 , (17.33)
where n1 , n2 and n3 are integers, and a1 , a2 and a3 form the edges of a unit cell of the crystal.
Corresponding to such a translation, there is a unitary operator, U (Rn ) = exp(−iP · Rn ),
which leaves the Hamiltonian of the crystal invariant:
These unitary operators for translations commute with each other, as well as with H, so there
must exist a complete set of common eigenstates for all of these operators,
The vector k is called the Bloch wave vector of the state. A function of the Bloch form can be
expanded in a series of plane waves as
$$\psi_k(x) = \sum_{k'} a(k')\,e^{ik'\cdot x}. \tag{17.38}$$
exp[i(k′ − k) · Rn ] = 1. (17.39)
And Gn ≡ k′ − k is called a vector of the reciprocal lattice. The expansion can now be written
as
$$\psi_k(x) = \sum_{G_m} a(k+G_m)\,e^{i(k+G_m)\cdot x}. \tag{17.40}$$
It follows that
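$$[q,p] = i, \qquad H = \frac{1}{2}\omega(p^2+q^2). \tag{17.43}$$
Define the annihilation operator as
$$a \equiv \frac{q+ip}{\sqrt{2}}. \tag{17.44}$$
We can deduce that
$$[a,a^\dagger] = 1, \qquad H = \frac{1}{2}\omega(aa^\dagger + a^\dagger a) = \omega\left(aa^\dagger - \frac{1}{2}\right) = \omega\left(a^\dagger a + \frac{1}{2}\right). \tag{17.45}$$
Introducing the number operator N ≡ a†a, we obtain
$$[N,a] = -a, \qquad [N,a^\dagger] = a^\dagger. \tag{17.46}$$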
$$\langle n'|a^\dagger|n\rangle = (n+1)^{1/2}\,\delta_{n',n+1}, \qquad \langle n'|a|n\rangle = n^{1/2}\,\delta_{n',n-1}. \tag{17.50}$$
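The matrix elements in (17.50) can be checked numerically in a truncated number basis. The following is a minimal sketch, not part of the original notes: the helper names `ladder_matrices`, `matmul` and `number_operator` are hypothetical, and the truncation to `dim` states is an assumption made only so that the matrices are finite.

```python
import math


def ladder_matrices(dim):
    """Return (a, a_dagger) as dim x dim nested lists, with entry [row][col] = <row|op|col>."""
    a = [[0.0] * dim for _ in range(dim)]
    a_dag = [[0.0] * dim for _ in range(dim)]
    for n in range(dim):
        if n >= 1:
            a[n - 1][n] = math.sqrt(n)          # <n-1|a|n> = n^(1/2)
        if n + 1 < dim:
            a_dag[n + 1][n] = math.sqrt(n + 1)  # <n+1|a_dagger|n> = (n+1)^(1/2)
    return a, a_dag


def matmul(x, y):
    """Plain matrix product of two square nested lists."""
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]


def number_operator(dim):
    """N = a_dagger a in the truncated basis; it is diagonal with entries 0, 1, ..., dim-1."""
    a, a_dag = ladder_matrices(dim)
    return matmul(a_dag, a)
```

In this truncated basis the relation [N, a] = −a of (17.46) still holds exactly, whereas [a, a†] = 1 fails in the last diagonal entry, which is an artifact of the cutoff.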
so severely as to be outside of both Hilbert space and rigged Hilbert space. We would seek
solutions of the form
$$u(q) = H(q)\,e^{-\frac{1}{2}q^2}. \tag{17.55}$$
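$$= \lim_{N\to\infty}\int\left(\frac{m}{2i\pi\Delta t}\right)^{\frac{N+1}{2}}\exp\left\{i\sum_{j=0}^{N}\left[\frac{m(x_{j+1}-x_j)^2}{2\Delta t^2} - \frac{1}{2}m\omega^2x_{j+1}^2\right]\Delta t\right\}dx_1\cdots dx_N, \tag{17.58}$$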
where x0 = xa , xN +1 = xb and ∆t = (tb − ta )/(N + 1). Suppose xc (t) is the classical path
of the harmonic oscillator and δx(t) ≡ x(t) − xc (t) is the deviation from the classical path.
Substituting x = xc + δx into (17.58), the terms that are linear in δx can be dropped since
$$\left.\frac{\delta S}{\delta x}\right|_{x(t)=x_c(t)} = 0. \tag{17.59}$$
$$H = \frac{(\pi - eA)^2}{2m} + e\phi. \tag{17.64}$$
The Hamiltonian operator in the corresponding quantum theory is
$$H = \frac{[P-eA(X)]^2}{2m} + e\phi(X). \tag{17.65}$$
In the Heisenberg picture, the equation of motion is
$$\frac{dX}{dt} = -i[X,H] = \frac{1}{m}(P-eA). \tag{17.66}$$
Define kinetic momentum K by
K ≡ P − eA. (17.67)
It follows that
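$$[K_i,K_j] = ie(\partial_iA_j - \partial_jA_i) = ie\,\epsilon_{ijk}B_k. \tag{17.68}$$
Hence,
$$m\frac{d^2X}{dt^2} = -i[K,H] + \frac{\partial K}{\partial t} = e\left[E + \frac{1}{2}\left(\frac{dX}{dt}\times B - B\times\frac{dX}{dt}\right)\right]. \tag{17.69}$$
$$\frac{1}{2m}[-i\nabla - eA]\cdot[-i\nabla - eA]\,\psi(x,t) + e\phi(x)\psi(x,t) = i\,\frac{\partial\psi(x,t)}{\partial t}. \tag{17.70}$$
Define the probability current of the particle as
$$j \equiv \frac{1}{m}\mathrm{Re}(\psi^*K\psi) = \frac{1}{m}\mathrm{Im}(\psi^*\nabla\psi) - \frac{e}{m}A|\psi|^2. \tag{17.71}$$
We can verify that
$$\nabla\cdot j + \frac{\partial\rho}{\partial t} = 0. \tag{17.72}$$
Gauge transformation
$$\phi\to\phi - \frac{\partial\Lambda}{\partial t}, \qquad A\to A+\nabla\Lambda \tag{17.73}$$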
will leave E and B unchanged. In classical electrodynamics, gauge transformation will not
change the trajectory of particles, which is the only thing we can observe in experiment. In
quantum theory, suppose the state vector |ψ⟩ will transform as
under gauge transformation, where U (t) is a unitary operator. If the Schrödinger equation is
always satisfied, we should demand that
$$H'U - UH = i\,\frac{\partial U}{\partial t}, \tag{17.75}$$
where H ′ is the Hamiltonian operator after gauge transformation. Generally, we have
It follows that
The expectation values of X and K are invariant under gauge transformation. We can also verify
that j is invariant under gauge transformation.
One special case of gauge transformation is
ϕ → ϕ + ϕ0 (t), A → A. (17.78)
If ϕ0 is a constant, we have
O(t) = exp[−ieϕ0 (t − t0 )]. (17.80)
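$$Q' \equiv \frac{K_x}{\gamma}, \qquad P' \equiv \frac{K_y}{\gamma}, \qquad \gamma \equiv \sqrt{|eB|}. \tag{17.81}$$
We have
$$H_{xy} = \frac{1}{2}\frac{|eB|}{m}(Q'^2+P'^2) \quad\text{with}\quad [Q',P'] = i \ \text{or}\ -i. \tag{17.82}$$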
Thus, the eigenvalues of Hxy must be equal to (n + 1/2)|eB|/m, where n is any non-negative
integer.
The spectrum of Kz is gauge invariant. Because the magnetic field is uniform and in the ẑ
direction, it is possible to choose the vector potential such that Az = 0. Thus, the spectrum of
Kz is continuous from −∞ to ∞, like that of Pz . The energy eigenvalues for a charged particle
in a uniform static magnetic field B are therefore
$$E_n(p_z) = \frac{(n+1/2)|eB|}{m} + \frac{p_z^2}{2m}. \tag{17.83}$$
The motion parallel to the magnetic field is not coupled to the transverse motion, and is unaf-
fected by the field. The classical motion in the plane perpendicular to the field is in a circular
orbit with angular frequency ωc = eB/m, and it is well known that periodic motions corre-
spond to discrete energy levels whose separation is ωc .
We can also derive the energy spectrum in coordinate representation. Let us choose the vector
potential to be Ax = −yB, Ay = Az = 0. The Hamiltonian now becomes
$$-\frac{1}{2m}\nabla^2\psi - \frac{ieB}{m}\,y\frac{\partial\psi}{\partial x} + \frac{e^2B^2}{2m}\,y^2\psi = E\psi. \tag{17.85}$$
Suppose
ψ(x, y, z) = exp(ikx x + ikz z)ϕ(y). (17.86)
The equation becomes
$$-\frac{1}{2m}\frac{d^2\phi(y)}{dy^2} + \left[\frac{m\omega_c^2}{2}(y-y_0)^2 - E'\right]\phi(y) = 0, \tag{17.87}$$
where E′ = E − kz²/2m is the energy associated with motion in the xy plane. This is just the
energy eigenvalue equation for a simple harmonic oscillator with angular frequency ω = |ωc |.
The energies for the charged particle in the magnetic field must be E = (n + 1/2)|ωc | + kz²/2m.
Apart from a normalization constant, the eigenfunction is
$$\psi = \exp(ik_xx + ik_zz)\,H_n[\alpha(y-y_0)]\,\exp\left[-\frac{1}{2}\alpha^2(y-y_0)^2\right], \tag{17.88}$$
with α = √(mω) = √|eB| and y0 = −kx /eB.
For fixed n and kz , the energy eigenvalue is highly degenerate. For convenience, we assume
that the system is confined to a rectangle of dimension Dx ×Dy and subject to periodic bound-
ary conditions. The allowed values of kx are kx = 2πnx /Dx , with nx = 0, ±1, · · · . The orbit
center coordinate y0 = −2πnx /Dx eB must lie in the range [0, Dy ]. In the limit as Dx and Dy
become large, we may ignore problems associated with orbits lying near the boundary, since
they will be a negligible fraction of the total. In this limit the number of degenerate states
corresponding to fixed n and kz will be
$$\frac{D_xD_y|eB|}{2\pi} = \left|\frac{e}{2\pi}\Phi\right|. \tag{17.89}$$
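The spectrum (17.83) and the degeneracy count (17.89) are simple enough to evaluate directly. The sketch below is an illustration only: the function names `landau_energy` and `landau_degeneracy` are hypothetical, and it assumes the natural units used throughout these notes (ħ = 1), so that the level spacing is |eB|/m.

```python
import math


def landau_energy(n, p_z, e_charge, b_field, mass):
    """E_n(p_z) = (n + 1/2)|eB|/m + p_z^2/(2m), as in (17.83), with hbar = 1."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    omega_c = abs(e_charge * b_field) / mass  # |omega_c| = |eB|/m
    return (n + 0.5) * omega_c + p_z ** 2 / (2.0 * mass)


def landau_degeneracy(e_charge, b_field, d_x, d_y):
    """Number of states for fixed n and k_z: D_x D_y |eB| / (2 pi) = |e Phi| / (2 pi), as in (17.89)."""
    flux = b_field * d_x * d_y  # Phi through the D_x by D_y rectangle
    return abs(e_charge * flux) / (2.0 * math.pi)
```

For example, `landau_energy(1, 0.0, 1.0, 2.0, 1.0) - landau_energy(0, 0.0, 1.0, 2.0, 1.0)` gives the level spacing |eB|/m for those illustrative parameter values.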
solenoid remains field-free. The solenoid is located in the unilluminated shadow region so
that no particles will reach it, and moreover it may be surrounded by a cylindrical shield that
is impenetrable to the charged particles. Nevertheless it can be shown that the interference
pattern depends upon the magnetic flux through the cylinder.
Let Ψ(0) (x, t) be the solution of Schrödinger equation with boundary conditions of this prob-
lem for the case in which the vector potential is everywhere zero. Now let us consider the case
in which the magnetic field is non-zero inside the cylinder but zero outside of it. The vector
potential A will not vanish everywhere in the exterior region, even though B vanishes outside of the
cylinder. This follows by applying Stokes’ theorem to any path surrounding the cylinder:
$$\oint A\cdot dx = \iint(\nabla\times A)\cdot dS = \iint B\cdot dS = \Phi. \tag{17.90}$$
If the flux through the cylinder is not zero, the vector potential must be nonzero on every path
that encloses the cylinder. However, in any simply connected region outside of the cylinder,
it is possible to express the vector potential as the gradient of a scalar, A = ∇Λ. The wave function
can then be obtained from the zero potential solution by means of a gauge transformation, $\Psi = \Psi^{(0)}\exp(ie\Lambda)$.
In region L, which contains the slit on the left, the wave function can be written as
$\Psi_L = \Psi_L^{(0)}\exp(ie\Lambda_1)$, where $\Psi_L^{(0)}$ is the zero potential solution in region L, and $\Lambda_1(x,t) = \int A\cdot dx$,
with the integral taken along a path within region L. A similar form can be written for the wave
function in the region R, which contains the slit on the right. At the point b, in the overlap of
regions L and R, the wave function is a superposition of contributions from both slits. Hence
we have
$$\Psi_b = \Psi_L^{(0)}\exp(ie\Lambda_1) + \Psi_R^{(0)}\exp(ie\Lambda_2). \tag{17.91}$$
The interference pattern depends on exp[ie(Λ1 − Λ2 )] = exp(ieΦ). Therefore the interfer-
ence pattern is sensitive to the magnetic flux inside of the cylinder, even though the particles
never pass through the region in which the magnetic field is nonzero. The AB (Aharonov-
Bohm) effect is a topological effect, in that the effect depends on the flux encircled by the
paths available to the particle, even though the paths may never approach the region of the
flux.
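The flux dependence of the interference term can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not a model of a real experiment: the function name `intensity_at_b` is hypothetical, the zero potential amplitudes at the point b are supplied by hand as complex numbers, and the sign of the relative phase depends on the orientation chosen for the enclosing path.

```python
import cmath
import math


def intensity_at_b(psi_left, psi_right, e_charge, flux):
    """|Psi_b|^2 from (17.91), using Lambda_1 - Lambda_2 = Phi and dropping a common phase.

    psi_left and psi_right are the zero potential amplitudes from the two slits
    (assumed inputs). Only the relative phase exp(-i e Phi) between them matters.
    """
    relative_phase = cmath.exp(-1j * e_charge * flux)
    return abs(psi_left + psi_right * relative_phase) ** 2
```

With equal amplitudes, the intensity at b goes from fully constructive at Φ = 0 to fully destructive at Φ = π/e, and it is periodic in Φ with period 2π/e.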
Chapter 18
Angular Momentum
Introduce the operator J² ≡ Jx² + Jy² + Jz². We can deduce that [J², J] = 0. Thus, there
exists a complete set of common eigenstates of J² and any one component of J. Particularly,
we have the following eigenvalue equations,
Noticing that
$$\langle\beta,m|J^2 - J_z^2|\beta,m\rangle = \langle\beta,m|J_x^2|\beta,m\rangle + \langle\beta,m|J_y^2|\beta,m\rangle \ge 0, \tag{18.3}$$
the inequality m² ≤ β is obtained immediately. For a fixed value of β, there must be maximum
and minimum values for m.
Define
J+ ≡ Jx + iJy , J− ≡ Jx − iJy . (18.4)
It follows that
[Jz , J+ ] = J+ , [Jz , J− ] = −J− , [J+ , J− ] = 2Jz . (18.5)
Thus, we have
J+ |β, j⟩ = 0. (18.7)
Noticing that
$$J_-J_+ = J^2 - J_z^2 - J_z, \tag{18.8}$$
we can derive that β = j(j + 1). By a similar method, we can show that the minimum eigenvalue
k of Jz for fixed β satisfies β = k(k − 1). Consequently, we have k = −j. Now, we have shown
the existence of a set of eigenstates corresponding to integer spaced m values in the range
−j ≤ m ≤ j. Since the difference between the maximum value j and the minimum value −j
must be an integer, it follows that j = integer/2. Henceforth we shall adopt the common and
more convenient notation of labeling the eigenstates by j instead of by β. The vector that was
previously denoted as |β, m⟩ will now be denoted as |j, m⟩.
To find the matrix element of angular momentum operators, we notice that
where R = exp(−iJ · n̂θ) is the rotation operator generated by J. For a rotation through an
infinitesimal angle ϵ about the z axis, we have
$$R_z(\epsilon)\psi(x,y,z) = \psi(x+\epsilon y,\,y-\epsilon x,\,z) = \psi(x,y,z) + \epsilon\left(y\frac{\partial\psi}{\partial x} - x\frac{\partial\psi}{\partial y}\right). \tag{18.13}$$
Noticing that
Rz (ϵ) = I − iϵJz , (18.14)
we have Jz = −i(x∂y − y∂x ), which is the z component of the orbital angular momentum
operator L = X × P in coordinate representation.
For a multicomponent state function, we have
$$R\begin{pmatrix}\psi_1(x)\\ \psi_2(x)\\ \vdots\end{pmatrix} = D\begin{pmatrix}\psi_1(R^{-1}x)\\ \psi_2(R^{-1}x)\\ \vdots\end{pmatrix}. \tag{18.15}$$
The two factors commute because the first acts only on the coordinate and the second acts only
on the components of the column vector. The matrix D must be unitary, and it can be written
as
$$D_{\hat n}(\theta) = e^{-iS\cdot\hat n\theta}, \tag{18.17}$$
where the components of S are finite-dimensional Hermitian matrices that satisfy the commutation relations [Si , Sj ] =
iϵijk Sk . The angular momentum operator J now takes the form
J =L+S (18.18)
and [Lα , Sβ ] = 0. L and S are called the orbital and spin part of the angular momentum
operator.
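$$\nabla = \hat e_r\frac{\partial}{\partial r} + \hat e_\theta\frac{1}{r}\frac{\partial}{\partial\theta} + \hat e_\phi\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}. \tag{18.19}$$
It follows that
$$L_z = L\cdot\hat e_z = -i\frac{\partial}{\partial\phi}, \qquad L^2 = L\cdot L = -\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right]. \tag{18.21}$$
$$L_zY_l^m(\theta,\phi) = mY_l^m(\theta,\phi) \quad\text{and}\quad L^2Y_l^m(\theta,\phi) = l(l+1)Y_l^m(\theta,\phi) \tag{18.22}$$
are $Y_l^m(\theta,\phi) = e^{im\phi}P_l^m(\cos\theta)$, where $P_l^m$ is the associated Legendre function. If we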
assume that the solutions must be single-valued under rotation, it will follow that m must be
an integer. If we further assume that the solutions must be nonsingular at θ = 0 and θ = π,
from the standard theory of the Legendre equation it will follow that l must be a nonnegative
integer in the range l ≥ |m|. The normalized solutions that result from these assumptions are
the well-known spherical harmonics
$$Y_l^m(\theta,\phi) = (-1)^{(m+|m|)/2}\left[\frac{(2l+1)(l-|m|)!}{4\pi(l+|m|)!}\right]^{1/2}e^{im\phi}P_l^{|m|}(\cos\theta). \tag{18.23}$$
Spin
The eigenvalue equations for S 2 and Sz ,
with eigenstates
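$$\begin{pmatrix}(1+\cos\theta)e^{-i\phi}/2\\ \sqrt{1/2}\,\sin\theta\\ (1-\cos\theta)e^{i\phi}/2\end{pmatrix},\quad \begin{pmatrix}-\sqrt{1/2}\,\sin\theta\,e^{-i\phi}\\ \cos\theta\\ \sqrt{1/2}\,\sin\theta\,e^{i\phi}\end{pmatrix},\quad \begin{pmatrix}(1-\cos\theta)e^{-i\phi}/2\\ -\sqrt{1/2}\,\sin\theta\\ (1+\cos\theta)e^{i\phi}/2\end{pmatrix}, \tag{18.30}$$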
Since Ju = Rz (α)Jy Rz (−α), we have Ru (β) = Rz (α)Ry (β)Rz (−α). Similarly, we can also
get Rz′ (γ) = Ru (β)Rz (γ)Ru (−β). The rotation operator now becomes
where
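$$d^{(j)}_{m'm}(\beta) \equiv \langle j,m'|e^{-i\beta J_y}|j,m\rangle. \tag{18.35}$$
For the case of j = 1/2, we have Jy = σy /2 and σy² = I. We can obtain
$$d^{(1/2)}(\beta) = \begin{pmatrix}\cos(\beta/2) & -\sin(\beta/2)\\ \sin(\beta/2) & \cos(\beta/2)\end{pmatrix}. \tag{18.36}$$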
Notice that this matrix is periodic in β with period 4π, but it changes sign when 2π is added to
β. This double-valuedness under rotation by 2π is a characteristic of the full rotation matrix
whenever j is a half odd-integer. The matrix is single-valued under rotation by 2π whenever
j is an integer.
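The sign change under a 2π rotation can be seen directly from (18.36). The sketch below builds the matrix from the closed form exp(−iβσy/2) = cos(β/2) I − i sin(β/2) σy, which follows from σy² = I; the function name `d_half` and the constant `MINUS_I_SIGMA_Y` are hypothetical names chosen for this illustration.

```python
import math

# -i * sigma_y is a real matrix, so d^(1/2)(beta) has real entries.
MINUS_I_SIGMA_Y = [[0.0, -1.0], [1.0, 0.0]]


def d_half(beta):
    """d^(1/2)(beta) = cos(beta/2) * I + sin(beta/2) * (-i sigma_y), as in (18.36)."""
    c, s = math.cos(beta / 2.0), math.sin(beta / 2.0)
    return [[c * (i == j) + s * MINUS_I_SIGMA_Y[i][j] for j in range(2)]
            for i in range(2)]
```

Comparing `d_half(beta)` with `d_half(beta + 2 * math.pi)` and `d_half(beta + 4 * math.pi)` shows the matrix flipping sign after 2π and returning to itself after 4π.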
where the rotation R(α, β, γ) takes a vector in the direction (θ, ϕ) into the direction (θ′ , ϕ′ ).
By putting β = γ = 0 we obtain
$$Y_l^m(\theta,\phi+\alpha) = \sum_{m'}Y_l^{m'}(\theta,\phi)\,[D^{(l)}_{mm'}(\alpha,0,0)]^* = e^{i\alpha m}\,Y_l^m(\theta,\phi). \tag{18.39}$$
As R(ϕ, θ, γ) carries a vector parallel to the z axis into the direction (θ, ϕ), we have
$$Y_l^m(\theta,\phi) = \sum_{m'}Y_l^{m'}(0,0)\,[D^{(l)}_{mm'}(\phi,\theta,\gamma)]^* = c_l\,[D^{(l)}_{m0}(\phi,\theta,\gamma)]^* \tag{18.42}$$
for arbitrary γ, thus obtaining a simple relation between the spherical harmonics and the ro-
tation matrices. Conventional normalization is obtained if we put
$$c_l = \left(\frac{2l+1}{4\pi}\right)^{1/2}. \tag{18.43}$$
The operator for a rotation through 2π about an axis along the unit vector n̂ is
$R_{\hat n}(2\pi) = e^{-2\pi i\hat n\cdot J}$. Its effect on the standard angular momentum eigenstates is
We assume that a rotation through 2π is a trivial operation that leaves everything unchanged, i.e.,
all dynamical variables are invariant under 2π rotation:
⟨+|A|−⟩ = 0. (18.46)
No physical observable can have nonvanishing matrix elements between states with integer
angular momentum and states with half odd-integer angular momentum. This fact forms the
basis of a superselection rule: There is no observable distinction among the state vectors of the
form
$$|\Psi_\omega\rangle = |+\rangle + e^{i\omega}|-\rangle \tag{18.47}$$
for different values of the phase ω.
These vectors are common eigenstates of the four commutative operators $J^{(1)}\cdot J^{(1)}$, $J^{(2)}\cdot J^{(2)}$,
$J_z^{(1)}$, and $J_z^{(2)}$. It is often desirable to form eigenstates of the total angular momentum
operators, J · J and Jz , where the total angular momentum vector operator is
This is useful when the system is invariant under rotation as a whole, but not under rotation of
the two components separately. The eigenstates of J · J and Jz may be denoted as |α, J, M ⟩.
It is easy to verify that the four operators J (1) · J (1) , J (2) · J (2) , J · J and Jz are mutually
commutative, and hence they possess a complete set of common eigenstates. Since the set of
product vectors and the new set of total angular momentum eigenstates are both eigenstates of
J (1) · J (1) and J (2) · J (2) , the eigenvalues j1 and j2 will be constant in both sets. Therefore we
may confine our attention to the vector space of dimension (2j1 + 1)(2j2 + 1) that is spanned
by product vectors with fixed values of j1 and j2 .
and hence
N (J) = n(J) − n(J + 1). (18.51)
The product vectors |j1 , m1 ⟩ |j2 , m2 ⟩ are eigenstates of the operator Jz , with eigenvalue m1 +
m2 , and the degree of degeneracy n(M ) is equal to the number of pairs (m1 , m2 ) such that
M = m1 + m2 .
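The counting argument can be carried out explicitly: count the pairs (m1, m2) for each M to get n(M), then apply N(J) = n(J) − n(J + 1) from (18.51). The sketch below does this with exact rational arithmetic so that half-integer values are handled without rounding. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def m_values(j):
    """All m = -j, -j+1, ..., j for an integer or half-integer j."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


def degeneracy_n(j1, j2, big_m):
    """n(M): number of pairs (m1, m2) with m1 + m2 = M."""
    big_m = Fraction(big_m)
    return sum(1 for m1 in m_values(j1) for m2 in m_values(j2) if m1 + m2 == big_m)


def multiplicity_N(j1, j2, big_j):
    """N(J) = n(J) - n(J + 1), as in (18.51)."""
    return degeneracy_n(j1, j2, big_j) - degeneracy_n(j1, j2, Fraction(big_j) + 1)


def allowed_J(j1, j2):
    """Values of J that occur (each exactly once), from j1 + j2 downward."""
    top = Fraction(j1) + Fraction(j2)
    candidates = [top - k for k in range(int(2 * top) + 1) if top - k >= 0]
    return [big_j for big_j in candidates if multiplicity_N(j1, j2, big_j) == 1]
```

For j1 = 1 and j2 = 1/2 this returns J = 3/2 and J = 1/2, in agreement with the triangle condition |j1 − j2| ≤ J ≤ j1 + j2.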
[Figure: product states in the (m1 , m2 ) plane, with M = j1 + j2 and M = j1 − j2 labeled.]
The coefficients of this transformation are called the Clebsch–Gordan coefficients, denoted
as (j1 , j2 , m1 , m2 |J, M ). The phases of the CG coefficients are not yet defined because of the
indeterminacy of the relative phases of the vectors |j1 , j2 , J, M ⟩. For different values of M but
fixed J we adopt the usual phase convention that led to
$$J_+|j_1,j_2,J,M\rangle = \sqrt{(J+M+1)(J-M)}\,|j_1,j_2,J,M+1\rangle. \tag{18.55}$$
This leaves one arbitrary phase for each J value, which we fix by requiring that (j1 , j2 , j1 , J −
j1 |J, J) be real and positive. It can be shown that all of the CG coefficients are now real. We
can also prove that CG coefficients vanish unless the following conditions are satisfied:
• m1 + m2 = M .
• |j1 − j2 | ≤ J ≤ |j1 + j2 |.
• j1 + j2 + J = an integer.
It is possible to work out the values of the CG coefficients by successive application of the
raising or lowering operator to
$$|j_1,j_2,J,M\rangle = \sum_{m_1,m_2}|j_1,j_2,m_1,m_2\rangle\,(j_1,j_2,m_1,m_2|J,M). \tag{18.56}$$
The details of the calculation can be found in section 7.7 of Quantum mechanics – a modern
development (Leslie E. Ballentine). Tables of CG coefficients and CG-coefficient calculators are
available on the internet. A special case of angular momentum addition is spin-orbit coupling
of spin 1/2 particles, and we list the corresponding CG coefficients (l, 1/2, M − ms , ms |J, M )
in Table 18.1.
              J = l + 1/2                        J = l − 1/2
ms = 1/2      [(l + M + 1/2)/(2l + 1)]^{1/2}     −[(l − M + 1/2)/(2l + 1)]^{1/2}
ms = −1/2     [(l − M + 1/2)/(2l + 1)]^{1/2}     [(l + M + 1/2)/(2l + 1)]^{1/2}
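Since the CG coefficients form a real orthogonal transformation, the two columns of Table 18.1 should each have unit norm and be mutually orthogonal. The following sketch evaluates the table entries and can be used to check this numerically. The function name `cg_spin_orbit` and its boolean flag are hypothetical conveniences, not notation from the notes.

```python
import math


def cg_spin_orbit(l, big_m, m_s, j_is_l_plus_half):
    """(l, 1/2, M - m_s, m_s | J, M) from Table 18.1.

    j_is_l_plus_half selects the column J = l + 1/2 (True) or J = l - 1/2 (False).
    Valid for |M| <= l + 1/2; outside that range math.sqrt raises ValueError.
    """
    plus = math.sqrt((l + big_m + 0.5) / (2 * l + 1))
    minus = math.sqrt((l - big_m + 0.5) / (2 * l + 1))
    if j_is_l_plus_half:
        return plus if m_s > 0 else minus
    return -minus if m_s > 0 else plus
```

For l = 1 and M = 1/2, the J = 3/2 column is (√(2/3), √(1/3)) and the J = 1/2 column is (−√(1/3), √(2/3)).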
Now let us consider the relation between CG coefficients and rotation matrices. On the one
hand, we have
$$\langle j_1,j_2,m_1,m_2|R|j_1,j_2,m_1',m_2'\rangle = D^{(j_1)}_{m_1m_1'}(R)\,D^{(j_2)}_{m_2m_2'}(R). \tag{18.57}$$
which is equivalent to
R−1 SR = S. (18.63)
Taking the case of infinitesimal rotation, we can derive that
[J , S] = 0. (18.64)
which is equivalent to
R−1 V R = RV . (18.66)
Taking the case of infinitesimal rotation, we can derive that
If V and W are vector operators, we can prove that V · W is a scalar operator and V × W is
a vector operator.
Similarly, tensor operators are defined as
$$R^{-1}T_{ij\cdots k}R = \sum_{i'\cdots}R_{ii'}R_{jj'}\cdots R_{kk'}\,T_{i'j'\cdots k'}. \tag{18.68}$$
Such a tensor is known as a Cartesian tensor. The trouble with a Cartesian tensor is that it is
reducible, i.e., it can be decomposed into objects that transform independently under rotations.
For example, the trace of a tensor transforms like a scalar under rotations. Thus, we would like
to define spherical tensor operators which are irreducible under rotations. A spherical tensor
operator of rank k with (2k + 1) components is defined as
$$R^{-1}T^{(k)}_qR = \sum_{q'=-k}^{k}[D^{(k)}_{qq'}(R)]^*\,T^{(k)}_{q'}, \tag{18.69}$$
or equivalently
$$RT^{(k)}_qR^{-1} = \sum_{q'=-k}^{k}D^{(k)}_{q'q}(R)\,T^{(k)}_{q'}, \tag{18.70}$$
where $D^{(k)}_{qq'}$ is the rotation matrix. Taking the case of infinitesimal rotation, we can derive that
$$[J_\pm,T^{(k)}_q] = \sqrt{(k\mp q)(k\pm q+1)}\,T^{(k)}_{q\pm1}, \qquad [J_z,T^{(k)}_q] = qT^{(k)}_q. \tag{18.71}$$
$$V_{-1} = \frac{V_x - iV_y}{\sqrt{2}}, \qquad V_0 = V_z, \qquad V_1 = -\frac{V_x + iV_y}{\sqrt{2}}, \tag{18.72}$$
satisfy the commutation relation above. So they form a spherical tensor of rank 1. Generally, if V
is a vector operator, $Y_l^m(V)$ will be a spherical tensor of rank l.
Spherical tensors can be formed as products of other spherical tensors. We have the following
theorem:
Theorem 18.1
If $X^{(k_1)}_{q_1}$ and $Z^{(k_2)}_{q_2}$ are irreducible spherical tensors of rank k1 and k2, then
$$T^{(k)}_q = \sum_{q_1,q_2}(k_1,k_2,q_1,q_2|k,q)\,X^{(k_1)}_{q_1}Z^{(k_2)}_{q_2} \tag{18.73}$$
is a spherical tensor of rank k. ♣
The proof can be found in section 3.10 of Modern Quantum Mechanics (J.J.Sakurai).
The proof can also be found in section 3.10 of Modern Quantum Mechanics (J.J.Sakurai).
$$\langle\tau',j',m'|S|\tau,j,m\rangle = \delta_{jj'}\delta_{mm'}\,\frac{\langle\tau',j'\|S\|\tau,j\rangle}{\sqrt{2j+1}}. \tag{18.76}$$
$$\langle\tau',j',m'|V_q|\tau,j,m\rangle = (j,1,m,q|j',m')\,\frac{\langle\tau',j'\|V_q\|\tau,j\rangle}{\sqrt{2j+1}}. \tag{18.77}$$
For j = j′, the Wigner-Eckart theorem, when applied to a vector operator, takes a particularly
simple form:
$$\langle\tau',j,m'|V_q|\tau,j,m\rangle = \frac{\langle\tau',j,m|J\cdot V|\tau,j,m\rangle}{j(j+1)}\,\langle j,m'|J_q|j,m\rangle. \tag{18.79}$$
Example: The magnetic moment operator for an atom has the form
$$\mu = \frac{-e}{2m_e}(g_LL + g_SS). \tag{18.80}$$
The parameters gL and gS have approximately the values gL = 1 and gS = 2. The former is
a generalization of the magnetic moment we worked out in classical electrodynamics for a
system of charged particles. The latter will be discussed in quantum field theory. We define
the effective Landé factor as
$$\langle\tau,J,M'|\mu|\tau,J,M\rangle = g_{\rm eff}\,\frac{-e}{2m_e}\,\langle J,M'|J|J,M\rangle. \tag{18.81}$$
Hence we have
$$g_{\rm eff} = \frac{\langle\tau,J,M|g_LL\cdot J + g_SS\cdot J|\tau,J,M\rangle}{J(J+1)} = 1 + \frac{J(J+1)-L(L+1)+S(S+1)}{2J(J+1)}. \tag{18.82}$$
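The closed form in (18.82), which uses gL = 1 and gS = 2, is easy to evaluate exactly. The sketch below is an illustration only: the function name `lande_g` is hypothetical, and rational arithmetic is used so that half-integer quantum numbers give exact results.

```python
from fractions import Fraction


def lande_g(big_l, big_s, big_j):
    """g_eff = 1 + [J(J+1) - L(L+1) + S(S+1)] / [2 J(J+1)], as in (18.82) with g_L = 1, g_S = 2."""
    L, S, J = Fraction(big_l), Fraction(big_s), Fraction(big_j)
    if J == 0:
        raise ValueError("g_eff is undefined for J = 0")
    return 1 + (J * (J + 1) - L * (L + 1) + S * (S + 1)) / (2 * J * (J + 1))
```

For a single electron with L = 1 and S = 1/2, this gives 4/3 for J = 3/2 and 2/3 for J = 1/2, and for L = 0 it reduces to the pure spin value 2.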
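$$-\frac{1}{2m}\nabla^2\Psi + W(r)\Psi = E\Psi. \tag{18.83}$$
In spherical coordinates, we have
$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{18.84}$$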
[Qcα , Pcβ ] = [Qrα , Prβ ] = iδαβ , [Qcα , Prβ ] = [Qrα , Pcβ ] = 0. (18.92)
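$$-\frac{1}{2\mu}\nabla^2\Psi(r) - \frac{e^2}{4\pi r}\Psi(r) = E\Psi(r). \tag{18.94}$$
Suppose $\Psi(r,\theta,\phi) = Y_l^m(\theta,\phi)\,u(r)/r$; then we have
$$-\frac{1}{2\mu}\frac{d^2u(r)}{dr^2} + \left[\frac{l(l+1)}{2\mu r^2} - \frac{e^2}{4\pi r}\right]u(r) = Eu(r). \tag{18.95}$$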
Define
$$\rho \equiv \alpha r, \qquad \alpha \equiv \sqrt{8\mu|E|}, \qquad \lambda \equiv \frac{e^2}{4\pi}\sqrt{\frac{\mu}{2|E|}}. \tag{18.96}$$
The eigenvalue equation becomes
$$\frac{d^2u}{d\rho^2} + \left[-\frac{1}{4} + \frac{\lambda}{\rho} - \frac{l(l+1)}{\rho^2}\right]u = 0. \tag{18.97}$$
As ρ → ∞, we have u ∼ e^{−ρ/2}. And as ρ → 0, we have u ∼ ρ^{l+1}. Therefore, we would like to
suppose
$$u(\rho) = \rho^{l+1}e^{-\rho/2}v(\rho). \tag{18.98}$$
It follows that
$$\rho\frac{d^2v}{d\rho^2} + (2l+2-\rho)\frac{dv}{d\rho} + (\lambda-l-1)v = 0. \tag{18.99}$$
It is the well-known confluent hypergeometric differential equation. When λ − 1 − l = nr with nr a
non-negative integer, it has regular solutions. These solutions are called associated Laguerre polynomials, and will be denoted as
$L^{2l+1}_{n-l-1}(\rho)$, where n = nr + l + 1. The energy levels are
$$E_n = -\frac{\mu e^4}{32\pi^2n^2}. \tag{18.100}$$
The degeneracy of an eigenvalue En is
$$\sum_{l=0}^{n-1}(2l+1) = n^2. \tag{18.101}$$
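Both the termination of the series solution of (18.99) and the n² degeneracy can be checked with exact arithmetic. Substituting v = Σ a_k ρ^k into (18.99) with λ − l − 1 = nr gives the recurrence a_{k+1} = a_k (k − nr)/[(k + 1)(k + 2l + 2)], which stops at k = nr. The sketch below is an illustration only: the function names are hypothetical, and the normalization a_0 = 1 is an arbitrary choice that differs from the conventional normalization of the associated Laguerre polynomials.

```python
from fractions import Fraction


def radial_series(n, l):
    """Coefficients a_0..a_{n_r} of the polynomial solution v(rho) of (18.99), with a_0 = 1."""
    n_r = n - l - 1
    if n_r < 0:
        raise ValueError("need 0 <= l <= n - 1")
    coeffs = [Fraction(1)]
    for k in range(n_r):
        coeffs.append(coeffs[-1] * Fraction(k - n_r, (k + 1) * (k + 2 * l + 2)))
    return coeffs


def ode_residual(coeffs, n, l):
    """Power-series coefficients of rho v'' + (2l + 2 - rho) v' + (n - l - 1) v."""
    n_r = n - l - 1
    padded = coeffs + [Fraction(0)]
    return [(k + 1) * (k + 2 * l + 2) * padded[k + 1] + (n_r - k) * padded[k]
            for k in range(len(coeffs))]


def degeneracy(n):
    """Sum of (2l + 1) over l = 0..n-1, as in (18.101)."""
    return sum(2 * l + 1 for l in range(n))
```

For n = 3 and l = 0 the coefficients are 1, −1, 1/6, which is proportional to 3 − 3ρ + ρ²/2.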
Note: The degeneracy of an energy level of a hydrogen atom is greater than this by a factor of 4, which
arises from the twofold orientational degeneracies of the electron and proton spin states. This fourfold
degeneracy is modified by the hyperfine interaction between the magnetic moments of the electron and the
proton.
P −1 XP = −X, P −1 P P = −P . (19.1)
We can verify that P must be linear by applying space inversion to the commutation relation
[Xi , Pi ] = i. Hence the parity operator is a unitary operator rather than an anti-unitary oper-
ator. Since two consecutive space inversions produce no change at all, it follows that the states
described by |ψ⟩ and by P 2 |ψ⟩ must be the same. The operator P 2 can differ from the identity
operator by at most a phase factor. This phase factor is left arbitrary. It is most convenient to
choose that phase factor to be unity, and hence we have
P = P −1 = P † . (19.3)
From the fact that P² = 1, it follows that P has eigenvalues ±1. Any even function, ψe (x) =
ψe (−x), is an eigenfunction of P with eigenvalue 1, and any odd function, ψo (x) = −ψo (−x),
is an eigenfunction of P with eigenvalue −1. A function corresponding to parity +1 is also
said to be of even parity, and a function corresponding to parity −1 is said to be of odd parity.
If the parity of operator K is p, i.e., P KP = pK, and the parities of the state |ψ1 ⟩ and |ψ2 ⟩
are p1 and p2 respectively, ⟨ψ1 |K|ψ2 ⟩ would vanish unless p = p1 p2 .
Example: Under space inversion x → −x, the spherical harmonic undergoes the transfor-
mation
$$Y_l^m(\theta,\phi) \to Y_l^m(\pi-\theta,\phi+\pi) = (-1)^l\,Y_l^m(\theta,\phi). \tag{19.6}$$
Hence the single particle orbital angular momentum eigenstate |l, m⟩ is also an eigenstate of
parity, with parity (−1)l . A total orbital angular momentum eigenstate for a two electron atom
is of the form
$$|l_1,l_2,L,M\rangle = \sum_{m_1,m_2}\langle l_1,l_2,m_1,m_2|l_1,l_2,L,M\rangle\,|l_1,m_1\rangle\otimes|l_2,m_2\rangle. \tag{19.7}$$
It is apparent that
$$P|l_1,l_2,L,M\rangle = (-1)^{l_1+l_2}\,|l_1,l_2,L,M\rangle. \tag{19.8}$$
We see that, in general, the parity of an angular momentum state is not determined by its total
angular momentum.
If the parity operator P commutes with the Hamiltonian H, parity eigenvalue ±1 will be a
conserved quantity. In that case an even parity state can never acquire an odd parity com-
ponent, and an odd parity state can never acquire an even parity component. If |ψ(t)⟩ is a
possible evolution of a system with Hamiltonian H, satisfying Schrödinger equation
$$H|\psi\rangle = i\,\frac{\partial|\psi\rangle}{\partial t}, \tag{19.9}$$
we can derive that
$$HP|\psi\rangle = i\,\frac{\partial P|\psi\rangle}{\partial t}, \tag{19.10}$$
i.e., the space inversion of |ψ(t)⟩, P |ψ(t)⟩, can also be a possible physical process of the system.
Experiments have shown that parity in β decay is not conserved.
T −1 XT = X, T −1 P T = −P , T −1 J T = −J . (19.11)
We can verify that T must be antilinear by applying time reversal to the commutation relation
[Xi , Pi ] = i. Thus, the time reversal operator is an anti-unitary operator.
Suppose that the evolution of a system satisfies Schrödinger equation
$$H|\psi(t)\rangle = i\,\frac{\partial|\psi(t)\rangle}{\partial t}. \tag{19.12}$$
If T H = HT , we can derive that
$$HT|\psi(-t)\rangle = i\,\frac{\partial\,T|\psi(-t)\rangle}{\partial t}, \tag{19.13}$$
The condition for the Hamiltonian to be invariant under complex conjugation is that the po-
tential be real: W = W ∗ . In that case, if ψ(x, −t) is a solution, so will be ψ ∗ (x, t). This
suggests that we may identify the time reversal operator with the complex conjugation oper-
ator in this representation,
T = K0 , (19.16)
where, by definition, K0 ψ(x, t) = ψ ∗ (x, t). In this case T is its own inverse.
The formal expression for an arbitrary state in coordinate representation is $|\psi\rangle = \int\psi(x)|x\rangle\,d^3x$,
where the basis vector |x⟩ is an eigenstate of the position operator. Since T is equal to the complex
conjugation operator, its effect is simply $T|\psi\rangle = \int\psi^*(x)|x\rangle\,d^3x$, with T |x⟩ = |x⟩.
Since
$$T|p\rangle = \int\langle x|p\rangle^*|x\rangle\,d^3x = |-p\rangle,$$
we have
$$T|\psi\rangle = \int\psi^*(p)\,|-p\rangle\,d^3p = \int\psi^*(-p)\,|p\rangle\,d^3p. \tag{19.18}$$
The time reversal operator must reverse the angular momentum. For spin operator, we have
T −1 ST = −S. (19.19)
In the standard representation of the spin operators, Sx and Sz are real, while Sy is imaginary.
The time reversal operator T cannot be equal to the complex conjugation operator K0 in this
representation, since the effect of the latter is
K0 Sx K0 = Sx , K0 Sy K0 = −Sy , K0 Sz K0 = Sz . (19.20)
Let us write the time reversal operator as T = Y K0 , where Y is a linear operator. Y must have
the following properties:
And Y must operate only on the spin degrees of freedom. A reasonable choice is
$Y = e^{-i\pi S_y}$, whose effect is to rotate spin (and only spin) through the angle π about the y axis.
Therefore the explicit form of the time reversal operator in this representation is
$$T = e^{-i\pi S_y}K_0. \tag{19.22}$$
Two successive applications of the time reversal transformation must leave the physical situation
unchanged. It follows that
T 2 |ψ⟩ = c |ψ⟩ , (19.23)
where |c| = 1. Noticing that
we have
T 2 (|ψ⟩ + T |ψ⟩) = c |ψ⟩ + c∗ T |ψ⟩ = c′ (|ψ⟩ + T |ψ⟩). (19.25)
It follows that c′ = c∗ = c, and so c = ±1, leading to
or equivalently,
$$T^2 = e^{-i2\pi J_y} = R(2\pi), \tag{19.28}$$
since $e^{-i2\pi L_y} = I$.
Kramers' theorem
Let us consider the energy eigenvalue equation, H |ψ⟩ = E |ψ⟩, for a time-reversal-invariant
Hamiltonian. Since HT |ψ⟩ = T H |ψ⟩ = ET |ψ⟩, both |ψ⟩ and T |ψ⟩ are eigenstates with
energy eigenvalue E. There are two possibilities: (a) |ψ⟩ and T |ψ⟩ are linearly dependent, and
so describe the same state. (b) |ψ⟩ and T |ψ⟩ are linearly independent, and so describe two
degenerate states.
Suppose that (a) is true, in which case we must have T |ψ⟩ = a |ψ⟩ with |a| = 1. Because T is
antilinear, a second application of T yields T²|ψ⟩ = T a|ψ⟩ = a* T|ψ⟩ = |a|²|ψ⟩ = |ψ⟩. Thus for those states that satisfy T²|ψ⟩ = −|ψ⟩ it is
necessarily true that |ψ⟩ and T |ψ⟩ are linearly independent, degenerate states. This result is
known as Kramers' theorem: any system for which T²|ψ⟩ = −|ψ⟩ has only degenerate energy
levels.
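For a single spin-1/2 particle the statement T² = −1 can be verified by hand or in a few lines of code. With Sy = σy/2, the operator in (19.22) becomes T = −iσy K0, and −iσy is a real matrix. The sketch below is an illustration only: the names `Y_HALF` and `time_reverse` are hypothetical, and a spinor is represented as a list of two complex numbers.

```python
# exp(-i*pi*sigma_y/2) = -i*sigma_y for spin 1/2; its entries are real.
Y_HALF = [[0, -1], [1, 0]]


def time_reverse(spinor):
    """Apply T = exp(-i*pi*S_y) K_0: complex-conjugate first, then rotate by pi about y."""
    conj = [z.conjugate() for z in spinor]
    return [sum(Y_HALF[i][j] * conj[j] for j in range(2)) for i in range(2)]
```

Applying `time_reverse` twice returns the original spinor with the opposite sign, and the inner product of a spinor with its time-reversed partner vanishes, so the two states are linearly independent as Kramers' theorem requires.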
Chapter 20
Approximation Method
We pick one of these levels ϵn for study, so the index n will be fixed for the following discussion.
We denote the eigenspace of the unperturbed system corresponding to eigenvalue ϵn by H, so
that the unperturbed eigenstates {|nα⟩ , α = 1, 2, · · · } form a basis in this space.
We take the perturbed Hamiltonian to be H = H0 + λH1 , where λ is a formal expansion
parameter that we allow to vary between 0 and 1 to interpolate between the unperturbed and
perturbed system. When the perturbation is turned on, the unperturbed energy level ϵn may
split and shift. We denote one of the exact energy levels that grows out of ϵn by E. We let |ψ⟩
be an exact energy eigenstate corresponding to energy E, so that
The component P |ψ⟩ is a linear combination of the known unperturbed eigenstates {|nα⟩ , α =
1, 2, · · · }, and is easily characterized. The orthogonal component Q |ψ⟩ is harder to find. It
turns out it is possible to write a neat power series expansion for this solution. Firstly, we have
Note: If there are other unperturbed energy levels ϵk lying close to ϵn , then the perturbation could push
the exact energy E near to or past some of these other levels, and then other small denominators would
make R ill defined. This will certainly happen if the perturbation is large enough. For the time being we
will assume this does not happen, so that R is free of small denominators. When this is not the case we
shall refer to “nearly degenerate perturbation theory”, which is discussed later.
P R = RP = 0, QR = RQ = R, R(E − H0 ) = (E − H0 )R = Q. (20.7)
Then we have
R(E − H0 ) |ψ⟩ = Q |ψ⟩ = λRH1 |ψ⟩ and |ψ⟩ = P |ψ⟩ + λRH1 |ψ⟩ . (20.8)
$$ |\psi\rangle = \frac{1}{1 - \lambda R H_1} P|\psi\rangle = P|\psi\rangle + \lambda R H_1 P|\psi\rangle + \lambda^2 R H_1 R H_1 P|\psi\rangle + \cdots. \tag{20.9} $$
⟨n|ψ⟩ = 1. (20.11)
(20.12)
To find an equation for E, we notice that
It follows that
or in expanded form,
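$$
\begin{aligned}
(E - \epsilon_n) c_\alpha &= \lambda \sum_\beta \langle n\alpha|H_1|n\beta\rangle c_\beta + \lambda^2 \sum_\beta \langle n\alpha|H_1 R H_1|n\beta\rangle c_\beta + \cdots \\
&= \lambda \sum_\beta \langle n\alpha|H_1|n\beta\rangle c_\beta + \lambda^2 \sum_\beta \sum_{k \neq n,\gamma} \frac{\langle n\alpha|H_1|k\gamma\rangle \langle k\gamma|H_1|n\beta\rangle}{E - \epsilon_k} c_\beta + \cdots
\end{aligned}
\tag{20.19}
$$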
This equation must be solved simultaneously for the eigenvalues E and the unknown expan-
sion coefficients cα . If we truncate the series at first order, we see that the corrections E − ϵn to
the energies are determined as the eigenvalues of the matrix ⟨nα|H1 |nβ⟩, and the coefficients
cα are the corresponding eigenvectors. This determines the energies to first order, but the co-
efficients cα only to zeroth order. Then P |ψ⟩ becomes known to zeroth order and Q |ψ⟩ to
first order. The first order matrix may or may not have degeneracies itself. If it does not, then
all degeneracies are lifted at first order; if it does, the remaining degeneracies may be lifted at
a higher order, or may persist to all orders. Degeneracies that persist to all orders are almost
20.2 Application of time-independent perturbation theory in hydrogen atom –199/453–
always due to some symmetry of the system, which can usually be recognized at the outset.
The higher order corrections can be worked out step by step, which will not be listed here.
Now let us consider the case in which the unperturbed levels of H0 , while not technically
degenerate, are close to one another. Suppose to be specific that two levels, say, ϵn and ϵm , are
close enough to one another that first order perturbations will push the exact level E close to
or onto the unperturbed level ϵm . In this case we choose some energy, call it ϵ̄, which is close
to ϵn and ϵm . Then let us take the original unperturbed Hamiltonian and perturbation and
rearrange them in the form,
H = H0 + H1 = H0′ + H1′ , (20.20a)
where
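$$ H_0 = \sum_{k\alpha} \epsilon_k |k\alpha\rangle\langle k\alpha|, \tag{20.20b} $$
$$ H_0' = \sum_{k \neq m,n;\,\alpha} \epsilon_k |k\alpha\rangle\langle k\alpha| + \sum_{k = m,n;\,\alpha} \bar\epsilon\, |k\alpha\rangle\langle k\alpha|, \tag{20.20c} $$
$$ H_1' = H_1 + \sum_{k = m,n;\,\alpha} (\epsilon_k - \bar\epsilon)\, |k\alpha\rangle\langle k\alpha|. \tag{20.20d} $$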
Then standard degenerate perturbation theory may be applied. We will call this procedure
“nearly degenerate perturbation theory.”
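The rearrangement (20.20) can be tried on a toy model. The sketch below assumes a hypothetical two-level system (the numbers eps_n, eps_m and the coupling v are made up for illustration) and compares first-order degenerate perturbation theory applied to H0′ + H1′ with the exact eigenvalues and with the naive non-degenerate second-order estimate, which suffers from the small denominator.

```python
import numpy as np

# Hypothetical close-lying levels and an off-diagonal coupling (illustrative values)
eps_n, eps_m, v = 1.00, 1.02, 0.05
H0 = np.diag([eps_n, eps_m])
H1 = np.array([[0.0, v], [v, 0.0]])

# Rearrangement of (20.20): H0' is exactly degenerate at eps_bar
eps_bar = 0.5 * (eps_n + eps_m)
H0p = eps_bar * np.eye(2)
H1p = H1 + np.diag([eps_n - eps_bar, eps_m - eps_bar])

# First-order degenerate perturbation theory: diagonalize H1' in the degenerate space
first_order = eps_bar + np.linalg.eigvalsh(H1p)
exact = np.linalg.eigvalsh(H0 + H1)

# Naive non-degenerate second-order estimate, spoiled by the small denominator
naive = np.array([eps_n + v**2 / (eps_n - eps_m),
                  eps_m + v**2 / (eps_m - eps_n)])

print("nearly degenerate PT:", first_order)
print("exact               :", exact)
print("naive second order  :", naive)
```

In this two-level model there are no other levels, so the first-order result happens to be exact; in a realistic system the couplings to the remaining levels enter at higher order.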
For small z, the attractive Coulomb field dominates the total potential and we have the usual
Coulomb well that supports atomic bound states. However, for large negative z, the unper-
turbed potential goes to zero, while the perturbing potential becomes large and negative. At
intermediate values of negative z, the competition between the two potentials gives a maxi-
mum in the total potential. The electric force on the electron is zero at the maximum of the
potential. Given the relative weakness of the applied field, the maximum must occur at a dis-
tance from the nucleus that is large in comparison to the Bohr radius a0 . Atomic states with
small principal quantum numbers n lie well inside this radius. The perturbation analysis we
shall perform applies to these states.
The bound states of the unperturbed system are able to tunnel through the potential barrier.
When an external electric field is turned on, the bound states of the atom cease to be bound in
the strict sense, and become resonances. Electrons that tunnel through the barrier and emerge
into the classically allowed region at large negative z will accelerate in the external field, leaving
behind an ion. This is the phenomenon of field ionization. This effect can be neglected if the
external field is weak enough and the lifetime of the “bound state” is long enough.
In the case of hydrogen, the ground state is |100⟩. The first order shift in the ground state
energy level is given by
$$ \Delta E^{(1)}_{\rm gnd} = \langle 100|eFz|100\rangle = 0, \tag{20.24} $$
which vanishes because the parity of z is odd, but ⟨100| and |100⟩ have the same parity. For the
excited states of hydrogen, according to first order degenerate perturbation theory, the shifts
in the energy levels En are given by the eigenvalues of the n2 × n2 matrix,
According to the Wigner-Eckart theorem and parity, the matrix elements vanish unless l−l′ =
±1 and m = m′ . Consider, for example, the case n = 2. The four degenerate states are
|2, 0, 0⟩, |2, 1, −1⟩, |2, 1, 0⟩ and |2, 1, 1⟩. Only the states |2, 0, 0⟩ and |2, 1, 0⟩ are connected by
the perturbation. Therefore of the 16 matrix elements, the only nonvanishing ones are
and its complex conjugate. The matrix connecting the two states |2, 0, 0⟩ and |2, 1, 0⟩ is
$$ \begin{pmatrix} 0 & -W \\ -W & 0 \end{pmatrix}, \tag{20.27} $$
and its eigenvalues are the first order energy shifts in the n = 2 level,
$$ \Delta E^{(1)}_2 = \pm W. \tag{20.28} $$
In addition, the two states |2, 1, −1⟩ and |2, 1, 1⟩ do not shift their energies at first order. The
perturbed eigenstates are
Now let us look at the exact symmetries of the full, perturbed Hamiltonian H = H0 + H1 ,
without doing perturbation theory at all. Since [H, Lz ] = 0 the exact eigenstates of H can be
chosen to be eigenstates of Lz as well. Denote these by |γm⟩, where γ is an additional index
needed to specify an energy eigenstate. Thus, we have
where Eγm is allowed to depend on m since the full rotational symmetry is broken. As for
time reversal, the state T |γm⟩ must be an eigenstate of energy with eigenvalue Eγm since
T H = HT . But because T −1 Lz T = −Lz , it also follows that T |γm⟩ is an eigenstate of Lz
with eigenvalue −m. If m ̸= 0, we must have a degeneracy of at least two. The only energy
levels that can be non-degenerate are those with m = 0. In the example above, even higher
order corrections cannot separate |2, 1, −1⟩ and |2, 1, 1⟩.
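The n = 2 example can be reproduced numerically. In the sketch below, W is the coupling appearing in (20.27), set to a placeholder value of 1 in arbitrary energy units; the ordering of the basis states is a choice made here for illustration.

```python
import numpy as np

W = 1.0  # placeholder value for the coupling in (20.27), arbitrary energy units
basis = ["|2,0,0>", "|2,1,-1>", "|2,1,0>", "|2,1,1>"]

# Full 4x4 first-order matrix: only |2,0,0> and |2,1,0> are connected
H1 = np.zeros((4, 4))
H1[0, 2] = H1[2, 0] = -W

shifts, vecs = np.linalg.eigh(H1)   # eigenvalues in ascending order
print("first-order shifts:", shifts)

# The lowest level is the symmetric combination of |2,0,0> and |2,1,0>
lowest = vecs[:, 0]
print("lowest-state components:", dict(zip(basis, np.round(lowest, 3))))
```

The two zero eigenvalues belong to |2, 1, ±1⟩, in line with the symmetry argument above that these two states stay degenerate.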
The term
$$ H_{\rm RKE} = -\frac{p^4}{8m^3} \tag{20.32} $$
is due to the second order term of the expansion series of $E = \sqrt{p^2 + m^2}$. (The first order
term is just the kinetic energy in non relativistic quantum mechanics). The term
$$ H_D = \frac{1}{8m^2} \nabla^2 V \tag{20.33} $$
resolution of the identity. The effect is to smear out the position of the atomic electron over a
distance of order λC . The term
$$ H_{\rm SO} = \frac{1}{2m^2} \frac{1}{r} \frac{dV}{dr}\, L \cdot S \tag{20.34} $$
arises because the electric field of the nucleus generates a magnetic field in the rest frame of the electron.
The unperturbed energy levels in hydrogen are given by equation 20.22. When the spin of the electron
is taken into account, these levels are 2n2-fold degenerate. One choice of basis is |nlml ms ⟩, the
simultaneous eigenstates of L2 , Lz and Sz . However, Lz and Sz do not commute with HSO . A better
choice of basis is |nljmj ⟩, the simultaneous eigenstates of L2 , J 2 and Jz . HRKE , HD and HSO
all commute with L2 , J 2 and Jz . Thus, for each of these terms H, the matrix element
⟨nl′ j ′ m′j |H|nljmj ⟩ vanishes unless l′ = l, j ′ = j and m′j = mj . We can figure out that
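$$ \langle nljm_j|H_{\rm RKE}|nljm_j\rangle = -\frac{1}{n^2}\left(\frac{3}{4} - \frac{n}{l + 1/2}\right)\alpha^2 E_n, \tag{20.35} $$
$$ \langle nljm_j|H_D|nljm_j\rangle = -\frac{1}{n}\,\delta_{l0}\,\alpha^2 E_n, \tag{20.36} $$
$$ \langle nljm_j|H_{\rm SO}|nljm_j\rangle = -\frac{1}{2n}\,\frac{j(j+1) - l(l+1) - 3/4}{l(l + 1/2)(l + 1)}\,\alpha^2 E_n. \tag{20.37} $$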
It is independent of the orbital angular momentum quantum number l, although each of the
individual terms does depend on l. However, the total energy shift does depend on j in
addition to the principal quantum number n, so when we take into account the fine structure
corrections, the energy levels of the hydrogen atom have the form Enj .
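The cancellation of the l dependence can be verified with exact rational arithmetic. The sketch below adds the three expectation values (20.35) to (20.37) in units of α2 En and compares the sum with the standard closed form (1/n2)(n/(j + 1/2) − 3/4), which is quoted here as a known result rather than taken from the lines above. For l = 0 the spin-orbit expression is of the form 0/0; the code sets it to zero on the assumption that L · S vanishes on s states. Function names are illustrative.

```python
from fractions import Fraction as Fr


def fine_structure_terms(n, l, j):
    """Return (RKE, Darwin, SO) of (20.35)-(20.37) in units of alpha^2 * E_n."""
    rke = -Fr(1, n**2) * (Fr(3, 4) - n / (l + Fr(1, 2)))
    darwin = -Fr(1, n) if l == 0 else Fr(0)
    if l == 0:
        so = Fr(0)  # assumption: L.S gives zero on s states (the formula is 0/0)
    else:
        num = j * (j + 1) - l * (l + 1) - Fr(3, 4)
        so = -Fr(1, 2 * n) * num / (l * (l + Fr(1, 2)) * (l + 1))
    return rke, darwin, so


def total_shift(n, l, j):
    """Sum of the three fine structure terms, in units of alpha^2 * E_n."""
    return sum(fine_structure_terms(n, l, j))


def closed_form(n, j):
    """Standard j-only closed form, in the same units."""
    return Fr(1, n**2) * (n / (j + Fr(1, 2)) - Fr(3, 4))


for n in range(1, 4):
    for l in range(n):
        for j in (l - Fr(1, 2), l + Fr(1, 2)):
            if j > 0:
                print(f"n={n} l={l} j={j}: total = {total_shift(n, l, j)}")
```

Since En is negative, a positive coefficient in these units corresponds to a lowering of the level.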
Besides the fine structure, the remaining important effects that shift the energy levels are hyperfine
effects and the Lamb shift. The Lamb shift is a shift in the energy levels due to the interaction of
the electron with the vacuum fluctuations of the quantized electromagnetic field. It mainly
affects the s states (l = 0) of hydrogen, shifting them by a small amount and thereby introducing
a dependence of the energy levels on l. Thus, including the Lamb shift, the energy levels in
hydrogen have the form Enlj , and the only degeneracy is that due to rotational invariance. It will
be further discussed in quantum electrodynamics. Hyperfine effects are caused by the interaction
between the electron spin and the nuclear spin, and will be discussed later.
$$ H = \frac{(P + eA)^2}{2m} - \frac{e^2}{4\pi r} + H_{\rm FS} + g_e \mu_B\, S \cdot B, \tag{20.39} $$
where ge ≈ 2 and µB ≡ e/2m. We assume a uniform magnetic field B = B ẑ. Taking the
gauge
$$ A = \frac{1}{2} B \times r, \tag{20.40} $$
we have ∇ · A = 0, implying that
P · A = A · P. (20.41)
Hence the cross terms in the expansion of the kinetic energy can be written in either order.
Noticing that
$$ P \cdot A = \frac{1}{2} P \cdot (B \times r) = \frac{1}{2} B \cdot L, \tag{20.42} $$
we thus have
H = Ha + HZ + HB + HFS , (20.43)
where
$$ H_a = \frac{p^2}{2m} - \frac{e^2}{4\pi r}, \qquad H_Z = \frac{e}{2m}(L_z + 2S_z)B, \qquad H_B = \frac{e^2}{8m} B^2 (x^2 + y^2). \tag{20.44} $$
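Denote the typical energy of the term $H_i$ as $E_i$. We have $E_a \sim me^4/32n^2\pi^2\hbar^2\epsilon_0^2$,
$E_Z \sim ne\hbar B/2m$ and $E_B \sim 2n^4\pi^2\epsilon_0^2\hbar^4B^2/m^3e^2$, leading to
$$ \frac{E_Z}{E_a} \sim \frac{16\pi^2 n^3 \hbar^3 \epsilon_0^2}{m^2 e^3}\, B \sim \frac{n^3 B}{2 \times 10^5\ {\rm T}} \tag{20.45} $$
and
$$ \frac{E_B}{E_a} \sim \left(\frac{8\pi^2 n^3 \hbar^3 \epsilon_0^2}{m^2 e^3}\, B\right)^2 \sim \left(\frac{n^3 B}{4 \times 10^5\ {\rm T}}\right)^2. \tag{20.46} $$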
Under usual experimental conditions, we have
EB ≪ EZ ≪ Ea , (20.47)
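To get a feeling for the field scale in (20.45) and (20.46), the estimate can be evaluated in SI units. The sketch below hard-codes CODATA values for the constants (an assumption of this example, not part of the text); the function names are illustrative.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
M_E = 9.1093837015e-31   # electron mass, kg
E_CH = 1.602176634e-19   # elementary charge, C


def field_scale():
    """Field B0 in tesla at which E_Z / E_a ~ 1 for n = 1, from (20.45)."""
    return M_E**2 * E_CH**3 / (16 * math.pi**2 * HBAR**3 * EPS0**2)


def ratios(n, B):
    """Return (E_Z / E_a, E_B / E_a) for principal quantum number n and field B in tesla."""
    ez_over_ea = n**3 * B / field_scale()
    eb_over_ea = (n**3 * B / (2 * field_scale()))**2
    return ez_over_ea, eb_over_ea


print(f"B0 = {field_scale():.3e} T")
for n, B in [(1, 1.0), (2, 1.0), (2, 10.0)]:
    ez, eb = ratios(n, B)
    print(f"n={n}, B={B} T: E_Z/E_a ~ {ez:.2e}, E_B/E_a ~ {eb:.2e}")
```

For laboratory fields of a few tesla and small n both ratios are tiny, which is the hierarchy stated in (20.47).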
Since [HSO , L2 ] = 0, the matrix element vanishes unless l = l′ . Thus we focus on the matrix in
the subspace l = l′ . Take the 2p orbits of hydrogen as an example. There is a 2-fold degeneracy
between |2, 1, −1, 1/2⟩ and |2, 1, 1, −1/2⟩. This makes one 2 × 2 matrix. Let us look at the
off-diagonal element
⟨2, 1, −1, 1/2|f (r)L · S|2, 1, 1, −1/2⟩ . (20.51)
To be non-vanishing, the operator in the middle of the matrix element must connect states
with ∆ml = 2. But the operator L · S = (L+ S− + L− S+ )/2 + Lz Sz permits only
∆ml = 0, ±1, so the off-diagonal matrix element vanishes and the energy shift is determined by
the diagonal elements. As
The final case we shall examine is the weak field limit, in which HZ ≪ HFS and we will treat
HZ as a perturbation.
Note: In the case of hydrogen, one should also consider the Lamb shift for a realistic treatment. For example,
in the n = 2 levels of hydrogen, the Lamb shift is about 10 times smaller than the fine structure energy
shifts, indicating that we really should question how the Lamb shift compares to the Zeeman term which
is also (by our assumptions) much smaller than the fine structure term.
The eigenstates of Ha + HFS are |nljmj ⟩ with eigenvalues Enj . Up to the first order, the matrix
elements we need have the form
Since [HZ , L2 ] = 0 and [HZ , Jz ] = 0, the off-diagonal matrix elements vanish automatically. The
energy shift is
∆E = µB B ⟨nljmj | Lz + 2Sz |nljmj ⟩ = geff µB Bmj , (20.54)
where
$$ g_{\rm eff} = 1 + \frac{j(j+1) - l(l+1) + s(s+1)}{2j(j+1)}. \tag{20.55} $$
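Formula (20.55) can be cross-checked against a direct matrix calculation. The following sketch builds L and S as matrices for l = 1 and s = 1/2, finds the simultaneous eigenvectors of J2 and Jz , and compares ⟨Lz + 2Sz ⟩/mj with geff . It relies on NumPy, and all function names are illustrative.

```python
import numpy as np


def angular_momentum(j):
    """Return (Jx, Jy, Jz) for angular momentum j in the basis m = j, ..., -j (hbar = 1)."""
    m = np.arange(j, -j - 1, -1)
    jz = np.diag(m)
    # <m+1| J+ |m> = sqrt(j(j+1) - m(m+1)) sits on the first superdiagonal
    jp = np.diag(np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1)
    jx = (jp + jp.T) / 2
    jy = (jp - jp.T) / (2 * 1j)
    return jx, jy, jz


def lande_g(l, j, s=0.5):
    """g_eff from (20.55)."""
    return 1 + (j * (j + 1) - l * (l + 1) + s * (s + 1)) / (2 * j * (j + 1))


def lande_g_numeric(l, s=0.5):
    """Map (j, m_j) -> <L_z + 2 S_z> / m_j from explicit matrices."""
    L = [np.kron(op, np.eye(int(2 * s + 1))) for op in angular_momentum(l)]
    S = [np.kron(np.eye(2 * l + 1), op) for op in angular_momentum(s)]
    J = [a + b for a, b in zip(L, S)]
    J2 = sum(op @ op for op in J)
    Jz = J[2]
    # J2 and Jz commute; a small admixture of Jz removes the m_j degeneracy
    _, vecs = np.linalg.eigh(J2 + 0.1 * Jz)
    result = {}
    for v in vecs.T:
        jj = (v.conj() @ J2 @ v).real            # equals j(j+1)
        mj = (v.conj() @ Jz @ v).real
        j = (-1 + np.sqrt(1 + 4 * jj)) / 2
        g = (v.conj() @ (L[2] + 2 * S[2]) @ v).real / mj
        result[(round(2 * j) / 2, round(2 * mj) / 2)] = g
    return result


for (j, mj), g in sorted(lande_g_numeric(1).items()):
    print(f"j={j}, m_j={mj:+.1f}: numeric g = {g:.6f}, formula g = {lande_g(1, j):.6f}")
```

For the 2p states this gives geff = 2/3 for j = 1/2 and geff = 4/3 for j = 3/2.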
Multipole moments for a system of charges have been discussed in the part on classical
electrodynamics. In quantum mechanics, we recall that the intrinsic magnetic moment operator of
an electron is defined in the space of electron spin. We may infer that multipole moments of
nuclei are defined in the space of nuclear spin I. Suppose that I 2 has eigenvalues i(i + 1). The
nuclear Hilbert space will be a (2i + 1)-dimensional space in which the standard basis is |mi ⟩
with −i ≤ mi ≤ i.
Not all the multipole fields that occur classically are allowed in the case of nuclei. There are
two rules governing the allowed multipole moments of the nucleus. The first is that electric
multipoles of odd k and magnetic multipoles of even k are forbidden. For example, if the
nucleus had an electric dipole moment d, the perturbing Hamiltonian would be
$$ H_1 = -e\,\frac{d \cdot r}{r^3}. \tag{20.56} $$
And just like µ, d must be proportional to the spin, because all vector operators on a single
irreducible subspace are proportional (Wigner-Eckart theorem). Thus, we have
$$ H_1 = -\kappa e\,\frac{I \cdot r}{r^3}. \tag{20.57} $$
We find that H1 violates time reversal and parity
The weak interactions do violate parity, and we do know that time reversal is violated at a very
small level in certain decay processes, so it is possible that the terms forbidden by this rule
actually exist at a small level. For example, the neutron or the electron may have an electric
dipole moment, but if such moments exist, they are certainly very small and can be neglected
in our discussion.
The second rule states that a 2k -pole can occur only if k ≤ 2i. For example, the proton with
i = 1/2 can possess an electric monopole moment and a magnetic dipole moment, but not an
electric quadrupole moment. Lying behind this rule is the fact that the operator representing
the 2k -pole on the nuclear Hilbert space is, in fact, an order k irreducible tensor operator. But
the maximum order of an irreducible tensor operator on the nuclear Hilbert space with spin i
is k = 2i.
For the hydrogen atom, whose nuclear spin is i = 1/2, the only multipole term we need to consider is
the magnetic dipole moment. A point magnetic dipole of moment µ situated at the origin of the coordinates
–206/453– Chapter 20 Approximation Method
$$ A(r) = \frac{\mu \times r}{r^3}. \tag{20.59} $$
To avoid the singularity at origin, we modify A(r) as
$$ A(r) = \mu \times r \begin{cases} \dfrac{1}{a^3}, & r < a \\[1ex] \dfrac{1}{r^3}, & r > a \end{cases}, \tag{20.60} $$
taking into account the finite size of the nucleus. By taking the curl we compute the magnetic
field
$$ B(r) = \begin{cases} \dfrac{2\mu}{a^3}, & r < a \\[1ex] \mu \cdot \mathsf{T}, & r > a \end{cases}, \tag{20.61} $$
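Define
$$ \Delta(r) \equiv \begin{cases} \dfrac{1}{a^3}, & r < a \\[1ex] 0, & r > a \end{cases} \qquad \text{and} \qquad f(r) \equiv \begin{cases} 0, & r < a \\ 1, & r > a \end{cases}. \tag{20.62} $$
Then we can write
$$ A(r) = \mu \times r \left[\Delta(r) + \frac{f(r)}{r^3}\right] \qquad \text{and} \qquad B(r) = \mu \cdot \left[2\Delta(r)\,\mathsf{I} + f(r)\,\mathsf{T}\right]. \tag{20.63} $$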
The expressions 20.63 for A and B are the fields of a classical magnetic dipole at the origin,
but now for use in the Hamiltonian we must reinterpret µ as an operator acting on the nuclear
Hilbert space, given in terms of the nuclear spin by
µ = gp µp I, (20.66)
where gp is the g-factor of the proton and µp ≡ e/2mp . Thus, the Hamiltonian must be interpreted
as an operator acting on the total Hilbert space
For Helec the obvious basis is |nljmj ⟩ with energies Enlj when there are no hyperfine terms.
In hydrogen energies depend on l because of the Lamb shift. The obvious basis in Hnucl is
|imi ⟩. Thus we define the basis states in H as |nljmj mi ⟩ (we suppress the index i since it is a
20.3 Time-dependent perturbation theory –207/453–
constant). We call this the uncoupled basis. Now we expand the kinetic energy in Hamiltonian
and neglect the term in A2 , writing the result as H = H0 + H1 , where
$$ H_0 = \frac{p^2}{2m_e} - \frac{e^2}{4\pi r} + H_{\rm FS} + H_{\rm Lamb} \qquad \text{and} \qquad H_1 = 2\mu_B (P \cdot A + S \cdot B). \tag{20.68} $$
Using
P · (I × r) = I · (r × P ) = I · L, (20.69)
we can get
$$ H_{1,{\rm orbi}} \equiv 2\mu_B (P \cdot A) = k\,(I \cdot L)\left[\Delta(r) + \frac{f(r)}{r^3}\right], \tag{20.70} $$
where k ≡ ge gp µB µp . As for spin part, we obtain
F ≡ I + J = I + L + S. (20.73)
This suggests that we couple together J and I to create eigenstates of F and Fz . We will call
the result the “coupled basis”, denoted by |nljf mf ⟩.
In the coupled basis the matrix elements we need to consider for degenerate perturbation
theory are ⟨nljf mf |H1 |nljf ′ m′f ⟩. Since [F , H1 ] = 0, the energy shift caused by H1 is simply
given by diagonal matrix elements, i.e.,
$$ \Delta E = \frac{g_e g_p \mu_B \mu_p}{4\pi a_0^3}\, \frac{1}{n^3}\, \frac{f(f+1) - j(j+1) - i(i+1)}{j(j+1)(2l+1)}. \tag{20.75} $$
The energy levels now have the form Enljf . The energy eigenstates are |nljf mf ⟩, and each level is
(2f + 1)-fold degenerate. The hyperfine interaction thus splits the fine structure levels of
hydrogen into hyperfine multiplets. For example, the ground state |1, 0, 1/2⟩ splits into two levels,
f = 0 and f = 1. The f = 0 level is the true ground state of hydrogen. It is nondegenerate. The
f = 1 level is 3-fold degenerate, and lies above the ground state by an energy of approximately
1.42 GHz in frequency units, or 21 cm in wavelength units.
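The 21 cm figure follows from (20.75) with a few lines of arithmetic. The sketch below works in natural units with e2 /4π = α, µB = e/2me , µp = e/2mp and a0 = 1/(αme ), so that the prefactor becomes ge gp α4 me (me /mp )/4. The numerical constants are rounded CODATA values inserted for this example, ge is set to exactly 2, and the function name is illustrative.

```python
from fractions import Fraction as Fr

ALPHA = 1 / 137.035999        # fine-structure constant
ME_EV = 510998.95             # electron rest energy, eV
ME_OVER_MP = 1 / 1836.15267   # electron-to-proton mass ratio
G_E, G_P = 2.0, 5.5856947     # electron (approximate) and proton g-factors
H_EV_S = 4.135667696e-15      # Planck constant, eV s
C_M_S = 299792458.0           # speed of light, m/s


def hyperfine_shift_ev(n, l, j, f, i=Fr(1, 2)):
    """Energy shift (20.75) in eV, in natural units with e^2/(4 pi) = alpha."""
    # g_e g_p mu_B mu_p / (4 pi a0^3) = g_e g_p alpha^4 m_e (m_e/m_p) / 4
    prefactor = G_E * G_P * ALPHA**4 * ME_EV * ME_OVER_MP / 4
    angular = (f * (f + 1) - j * (j + 1) - i * (i + 1)) / (j * (j + 1) * (2 * l + 1))
    return prefactor * float(angular) / n**3


half = Fr(1, 2)
splitting_ev = hyperfine_shift_ev(1, 0, half, 1) - hyperfine_shift_ev(1, 0, half, 0)
frequency_hz = splitting_ev / H_EV_S
wavelength_m = C_M_S / frequency_hz

print(f"splitting  = {splitting_ev:.4e} eV")
print(f"frequency  = {frequency_hz / 1e9:.3f} GHz")
print(f"wavelength = {wavelength_m * 100:.1f} cm")
```

The small difference from the measured line comes from corrections neglected here, such as the anomalous electron moment and reduced-mass effects.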
Let us assume for simplicity that H0 has a discrete spectrum H0 |n⟩ = En |n⟩. We assume that
the system is initially in an eigenstate of the unperturbed system, what we will call the “initial”
state |i⟩ with energy Ei . The evolution of the state in interaction picture is
|ψI (t)⟩ = W (t) |i⟩ . (20.85)
Let us expand the exact solution of the Schrödinger equation in the interaction picture in the
unperturbed eigenstates as
$$ |\psi_I(t)\rangle = \sum_n c_n(t)\, |n\rangle. \tag{20.86} $$
We then have
$$ c_n(t) = \langle n|W(t)|i\rangle = \langle n|U_0^\dagger(t) U(t)|i\rangle = e^{iE_n t} \langle n|U(t)|i\rangle. \tag{20.87} $$
Thus, the transition amplitudes in the interaction picture and those in the Schrödinger picture
are related by a simple phase factor. The transition probabilities are the squares of the ampli-
tudes and are the same in either case. The perturbation expansion of the transition amplitude
cn (t) is
$$ c_n(t) = \delta_{ni} + c_n^{(1)}(t) + \cdots \tag{20.88} $$
where
$$ c_n^{(1)}(t) = \frac{1}{i} \int_0^t dt'\, \langle n|H_{1I}(t')|i\rangle = \frac{1}{i} \int_0^t dt'\, e^{i(E_n - E_i)t'} \langle n|H_1(t')|i\rangle. \tag{20.89} $$
$$ \lim_{t \to \infty} \frac{1}{t}\, \frac{\sin^2(\omega t/2)}{\omega^2} = \frac{\pi}{2}\, \delta(\omega). \tag{20.96} $$
The δ-function enforces energy conservation in the limit t → ∞. But at finite times, transi-
tions take place to states in a range of energies about the initial energy. This width is of order
1/t. This is an example of the energy-time uncertainty relation, ∆t∆E ∼ 1, indicating that a
system that is isolated (not subjected to a measurement) over a time interval ∆t has an energy
that is uncertain by an amount ∆E ∼ 1/∆t.
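The limit (20.96) can be made plausible numerically: the area under (1/t) sin2 (ωt/2)/ω 2 stays close to π/2 while the peak height grows like t/4 and the width shrinks like 1/t. The sketch below uses a plain midpoint rule; the cutoff omega_max and the step count are arbitrary choices for this illustration.

```python
import math


def kernel(omega, t):
    """(1/t) sin^2(omega t / 2) / omega^2, with the omega -> 0 limit t/4."""
    if omega == 0.0:
        return t / 4.0
    return math.sin(0.5 * omega * t) ** 2 / (t * omega * omega)


def integrate_kernel(t, omega_max=20.0, steps=200_000):
    """Midpoint-rule integral of the kernel over [-omega_max, omega_max]."""
    h = 2.0 * omega_max / steps
    return h * sum(kernel(-omega_max + (k + 0.5) * h, t) for k in range(steps))


for t in (10.0, 50.0):
    area = integrate_kernel(t)
    print(f"t = {t:5.1f}: area = {area:.4f} (pi/2 = {math.pi / 2:.4f}), peak = {kernel(0.0, t):.2f}")
```

The small deficit relative to π/2 is the contribution of the tails beyond the cutoff, which decays as the product of t and the cutoff grows.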
It is conventional to define the transition rate as the transition probability per unit time,
$$ \Gamma(i \to f) \equiv \lim_{t \to \infty} \frac{P(i \to f)}{t}. \tag{20.97} $$
Up to the first order, we have
The term (e/m)A · P is a harmonic perturbation with K = (eA0 /m)eik·x (ϵ · P ). For the
absorption of radiation, the transition rate is
$$ \Gamma_{\rm abs}(1 \to 2) = 2\pi\, \frac{e^2 |A_0|^2}{m^2} \left|\langle 2|e^{ik \cdot x} (\epsilon \cdot P)|1\rangle\right|^2 \delta(E_f - E_i - \omega). \tag{20.101} $$
Notice that the average energy density of the radiation field is
$$ \rho = \frac{1}{2}\, \overline{E^2 + B^2} = 2\omega^2 |A_0|^2. \tag{20.102} $$
The transition rate can be rewritten as
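$$ \Gamma_{\rm abs}(1 \to 2) = \frac{\pi e^2}{m^2 \omega_{12}^2} \left|\langle 2|e^{ik \cdot x} (\epsilon \cdot P)|1\rangle\right|^2 \rho(\omega_{12}), \tag{20.103} $$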
where ρ(ω) ≡ ρδ(E2 − E1 − ω) is the average energy density of the EM field per unit angular
frequency.
The electric dipole approximation is based on the fact that the wavelength of radiation field is
far longer than the atomic dimension. The series
eik·x = 1 + ik · x + · · · (20.104)
where d ≡ ex is the electric dipole operator. Assuming the direction and polarization of the
incident light are totally random, we have
$$ |\langle 2|\epsilon \cdot d|1\rangle|^2 = \frac{1}{3}\, d_{21}^2 \qquad \text{where} \qquad d_{21}^2 \equiv |\langle 2|d|1\rangle|^2. \tag{20.107} $$
$$ \Gamma_{\rm abs}(1 \to 2) = B_{1 \to 2}\, \rho(\omega_{21}) \qquad \text{where} \qquad B_{1 \to 2} \equiv \frac{\pi g_2 d_{21}^2}{3}. \tag{20.108} $$
Similarly, the rate of stimulated emission is
$$ \Gamma_{\rm emm}(2 \to 1) = B_{2 \to 1}\, \rho(\omega_{21}) \qquad \text{where} \qquad B_{2 \to 1} \equiv \frac{\pi g_1 d_{12}^2}{3}. \tag{20.109} $$
Usually B1→2 and B2→1 are called the Einstein coefficients of absorption and stimulated emission,
and the relation g2 B2→1 = g1 B1→2 follows directly. However, spontaneous emission
cannot be explained unless we quantize the EM field as well. A complete treatment of atomic
radiation by quantum field theory can be found in section 4.5 of Theoretical Astrophysics,
Volume 1 (T. Padmanabhan).
Now consider the trial wave function ψ(x) = αd/2 ψ0 (αx), where the prefactor ensures that
ψ(x) continues to be normalized. From the scaling property of the potential, it is simple to
show that
E(α) = α2 ⟨T ⟩0 + α−n ⟨V ⟩0 . (20.115)
$$ \frac{dE}{d\alpha} = 2\alpha \langle T\rangle_0 - n\alpha^{-n-1} \langle V\rangle_0 = 0. \tag{20.116} $$
But this minimum must sit at α = 1 since, by construction, this is the true ground state. We
learn that for the homogeneous potentials, we have
2 ⟨T ⟩0 = n ⟨V ⟩0 , (20.117)
Example: For Coulomb potential, we have V ∝ −1/r. The virial theorem tells us that E0 =
⟨T ⟩0 + ⟨V ⟩0 = − ⟨T ⟩0 < 0. In other words, we proved what we already know: the Coulomb
potential has bound states.
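The scaling argument is easy to test numerically. The sketch below minimizes E(α) of (20.115) for two cases in which the ground-state expectation values are well known: the hydrogen ground state in atomic units (⟨T ⟩0 = 1/2, ⟨V ⟩0 = −1, n = −1) and the harmonic oscillator with ℏω = 1 (⟨T ⟩0 = ⟨V ⟩0 = 1/4, n = 2). These input values and the golden-section minimizer are assumptions of this example.

```python
import math


def scaled_energy(alpha, T0, V0, n):
    """E(alpha) = alpha^2 <T>_0 + alpha^(-n) <V>_0, as in (20.115)."""
    return alpha**2 * T0 + alpha ** (-n) * V0


def golden_min(func, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        if func(x1) < func(x2):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)


cases = {
    "hydrogen (atomic units)": (0.5, -1.0, -1),
    "harmonic oscillator": (0.25, 0.25, 2),
}
for name, (T0, V0, n) in cases.items():
    alpha_min = golden_min(lambda a: scaled_energy(a, T0, V0, n), 0.1, 3.0)
    print(f"{name}: alpha_min = {alpha_min:.6f}, 2<T> = {2 * T0}, n<V> = {n * V0}")
```

In both cases the minimum sits at α = 1 precisely because the inputs satisfy 2 ⟨T ⟩0 = n ⟨V ⟩0 .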
Note: Nowhere in our argument of the virial theorem did we state that the potential must be attractive.
Our conclusion above would seem to hold for repulsive potential, yet this is clearly wrong: the repulsive
potential V ∼ +1/r has no bound states. It is because we assumed at the beginning of the argument
that the ground state ψ0 was normalisable. For repulsive potentials this is not true: all states are asymptotically
plane waves of the form eikx . The virial theorem is not valid for repulsive potentials of this kind.
There is another exact and rather pretty result that holds for particles moving in one dimension.
Consider a particle moving in a potential V (x) such that V (x) = 0 for |x| > L. A bound state
exists whenever $\int dx\, V(x) < 0$. In other words, a bound state exists whenever the potential
is “mostly attractive”. However, the converse to this statement does not hold. The proof can
be found in subsection 2.1.3 of Topics in Quantum Mechanics (David Tong).
$$ -\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + V(x)\psi = E\psi. \tag{20.118} $$
We will look for solutions of the form
Plugging this ansatz into 20.118 leaves us with the differential equation
$$ i\hbar \frac{d^2 W}{dx^2} - \left(\frac{dW}{dx}\right)^2 + p^2(x) = 0, \tag{20.120} $$
where p2 = 2m(E − V ). Here we’ll look for solutions where the second derivative is merely
small, meaning
$$ \hbar \left|\frac{d^2 W}{dx^2}\right| \ll \left|\frac{dW}{dx}\right|^2. \tag{20.121} $$
We refer to this as the semi-classical limit. Roughly speaking, it can be thought of as the ℏ → 0
limit. Indeed, mathematically, it makes sense to attempt to solve the Schrödinger equation using a
power series in ℏ. We treat p(x) as the background potential, which we will take to be of the order of
| dW /dx|. We expand our solution as
$$ \lambda \left|\frac{dV}{dx}\right| \ll \frac{|p|^2}{2m}, \tag{20.127} $$
20.6 WKB method –215/453–
which says that the change of the potential energy over a de Broglie wavelength should be
much less than the kinetic energy.
The WKB approximation provides a solution in regions where E ≫ V (x) and, correspondingly,
p(x) is real. This is the case in the middle of the potential, where the wave function
oscillates. The WKB approximation also provides a solution when E ≪ V (x), where p(x) is
imaginary. This is the case to the far left and far right, where the wave function suffers either
exponential decay or growth,
$$ \psi(x) \approx \frac{A}{2\,[m(V - E)]^{1/4}} \exp\left(\pm \frac{1}{\hbar} \int^x dx' \sqrt{2m(V - E)}\right). \tag{20.128} $$
$$ -\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + Cx\psi = Cx_0\psi. \tag{20.130} $$
and
$$ {\rm Ai}(u) \sim \left(\frac{1}{\pi\sqrt{-u}}\right)^{1/2} \cos\left(\frac{2}{3}\, u\sqrt{-u} + \frac{\pi}{4}\right), \qquad u \ll 0. \tag{20.135} $$
The main purpose in introducing the Airy function is to put it to work in the WKB approxi-
mation. The asymptotic behavior is exactly what we need to match onto the WKB solution.
First consider the case where u ≪ 0. Here E > V (x) and we have the oscillatory solution:
$$ \psi(x) \sim \left[\frac{(2mC\hbar)^{1/3}}{\pi\sqrt{2m(E - V)}}\right]^{1/2} \cos\left(\frac{1}{\hbar} \int_{x_0}^x dx'\, {\rm sgn}(C) \sqrt{2m(E - V)} + \frac{\pi}{4}\right). \tag{20.136} $$
This takes the same oscillatory form as the WKB solution. The two solutions can be patched
together simply by picking an appropriate normalisation factor and phase for the WKB
solution. Similarly, in the region where u ≫ 0, we have the exponentially decaying solution:
$$ \psi(x) \sim \frac{1}{2} \left[\frac{(2mC\hbar)^{1/3}}{\pi\sqrt{2m(V - E)}}\right]^{1/2} \exp\left(-\frac{1}{\hbar} \int_{x_0}^x dx'\, {\rm sgn}(C) \sqrt{2m(V - E)}\right). \tag{20.137} $$
This too has the same form as the exponentially decaying WKB solution. This is how we piece
together solutions. In regions where E > V (x), the WKB approximation gives oscillating
solutions. In regimes where E < V (x), it gives exponentially decaying solutions. The Airy
function interpolates between these two regimes.
[Figure: horizontal axis x, with the turning points a and b marked.]
As we approach x = a, the potential takes the linear form and this coincides with the asymp-
totic form of the Airy function. We then follow this Airy function through to Region 2 where
we have
$$ \psi_2(x) \approx \frac{A}{[m(E - V)]^{1/4}} \cos\left(\frac{1}{\hbar} \int_a^x dx' \sqrt{2m(E - V)} - \frac{\pi}{4}\right). \tag{20.139} $$
The Airy function takes this form close to x = a where V (x) is linear. But we can extend this
solution throughout Region 2 where it coincides with the WKB approximation.
We now repeat this procedure to match Regions 2 and 3. When x ≫ b, the WKB approximation
tells us that the wave function is
$$ \psi_3(x) \approx \frac{A'}{2\,[m(V - E)]^{1/4}} \exp\left(-\frac{1}{\hbar} \int_b^x dx' \sqrt{2m(V - E)}\right). \tag{20.140} $$
We’re left with two expressions for the wave function in Region 2. Clearly these must agree.
Equating the two tells us that |A| = |A′ |, but they may differ by a sign, since this can be
compensated by the cosine function. Insisting that the two cosine functions agree, up to sign,
gives us the condition
$$ \int_a^b dx' \sqrt{2m(E - V)} = \left(n + \frac{1}{2}\right) \hbar\pi. \tag{20.142} $$
The WKB approximation underlies an important piece of history from the pre-Schrödinger
era of quantum mechanics. We can rewrite the quantisation condition as
$$ \oint dx\, p = \left(n + \frac{1}{2}\right) h, \qquad h \equiv 2\pi\hbar, \tag{20.143} $$
where $\oint$ means that we take a closed path in phase space which, in this one-dimensional
example, is from xmin to xmax and back again. In the old days of quantum mechanics, Bohr and
Sommerfeld introduced an ad-hoc method of quantisation. They suggested that one should
impose the condition
$$ \oint dx\, p = nh \tag{20.144} $$
with n an integer. They didn’t include the factor of 1/2. They made this guess because it turns
out to correctly describe the spectrum of the hydrogen atom. The WKB approximation pro-
vides an a-posteriori justification of the Bohr-Sommerfeld quantisation rule. More generally,
“Bohr-Sommerfeld quantisation” means packaging up a 2d-dimensional phase space of the
system into small parcels of volume $h^d$ and assigning a quantum state to each. It is, at best, a
crude approximation to the correct quantisation treatment.
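As an illustration of the quantisation condition (20.142), the sketch below solves it numerically for the harmonic oscillator V (x) = x2 /2 in units m = ℏ = ω = 1, where the exact levels are n + 1/2. The integration window, step count and bisection bracket are arbitrary choices for this example, and the function names are illustrative.

```python
import math


def harmonic(x):
    """Harmonic oscillator potential with m = omega = 1."""
    return 0.5 * x * x


def action(E, V, x_lo=-6.0, x_hi=6.0, steps=4000, m=1.0):
    """Midpoint estimate of the integral of sqrt(2m(E - V)) over the allowed region."""
    h = (x_hi - x_lo) / steps
    total = 0.0
    for k in range(steps):
        x = x_lo + (k + 0.5) * h
        p2 = 2.0 * m * (E - V(x))
        if p2 > 0.0:                 # only the classically allowed region contributes
            total += math.sqrt(p2)
    return total * h


def wkb_level(n, V, E_lo=0.0, E_hi=15.0, hbar=1.0):
    """Solve action(E) = (n + 1/2) * pi * hbar for E by bisection."""
    target = (n + 0.5) * math.pi * hbar
    for _ in range(40):
        E_mid = 0.5 * (E_lo + E_hi)
        if action(E_mid, V) < target:
            E_lo = E_mid
        else:
            E_hi = E_mid
    return 0.5 * (E_lo + E_hi)


for n in range(4):
    print(f"n = {n}: E_WKB = {wkb_level(n, harmonic):.4f}, exact = {n + 0.5}")
```

For the oscillator the WKB levels coincide with the exact ones up to discretisation error; dropping the 1/2, as in (20.144), would shift every level down by half a quantum.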
Let’s place ourselves in one of these energy eigenstates. Now vary the parameters λi . The
adiabatic theorem states that if λi are changed suitably slowly, then the system will cling to
the energy eigenstate |n[λ(t)]⟩ that we started off in. To see this, we want to solve the time-
dependent Schrödinger equation
$$ i\, \frac{\partial |\psi(t)\rangle}{\partial t} = H(\lambda)\, |\psi(t)\rangle. \tag{20.146} $$
We expand the solution in a basis of instantaneous energy eigenstates as
$$ |\psi(t)\rangle = \sum_m a_m(t)\, e^{i\xi_m(t)}\, |m(\lambda)\rangle. \tag{20.147} $$
Here am (t) are coefficients that we wish to determine, while ξm (t) is the usual energy-dependent
phase factor defined as
$$ \xi_m(t) \equiv -\int_0^t dt'\, E_m(t'). \tag{20.148} $$
To proceed, we substitute our ansatz 20.147 into the Schrödinger equation to find
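$$ \sum_m \left[\dot a_m e^{i\xi_m} |m(\lambda)\rangle + a_m e^{i\xi_m} \frac{\partial |m(\lambda)\rangle}{\partial \lambda^i}\, \dot\lambda^i\right] = 0. \tag{20.149} $$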
The adiabatic theorem holds when the change of parameters λ̇i is much smaller than the split-
ting of energy levels Em − En . In this limit, we can ignore this term. We’re then left with
If we start at time t = 0 with am = δmn , so the system is in a definite energy eigenstate |n⟩,
the system will remain in the state |n(λ)⟩ as we vary λ. This is true as long as λ̇i ≪ ∆E. In
particular, this means that when we vary the parameters λ, we should be careful to avoid level
crossing, where another state becomes degenerate with the |n(λ)⟩ that we’re sitting in. In this
case, we will have Em = En for some |m⟩ and all bets are off: when the states separate again,
there is no simple way to tell which linear combinations of the state we now sit in.
In contrast to the energy-dependent phase, this does not depend on the time taken to make the
journey in parameter space. Instead, it depends only on the path we take through parameter
space. It is known as the Berry phase.
Like gauge potential in electromagnetic field theory, there is also a redundancy in the infor-
mation contained in the Berry connection Ai (λ). This follows from the arbitrary choice we
made in fixing the phase of the reference states |n(λ)⟩. We could pick a different phase for
every choice of parameters λ,
|n′ (λ)⟩ = eiω(λ) |n(λ)⟩ (20.156)
for any function ω(λ). If we compute the Berry connection arising from this new choice, we
have
$$ A_i' = A_i - \frac{\partial \omega}{\partial \lambda^i}. \tag{20.157} $$
This takes the same form as the gauge transformation.
Following the analogy with electromagnetism, we might expect that the physical information
in the Berry connection can be found in the gauge invariant field strength which, mathemati-
cally, is known as the curvature of the connection,
$$ F_{ij}(\lambda) = \frac{\partial A_i}{\partial \lambda^j} - \frac{\partial A_j}{\partial \lambda^i}. \tag{20.158} $$
It is certainly true that F contains some physical information about our quantum system, but
it is not the only gauge invariant quantity of interest. In the present context, the most natural
thing to compute is the Berry phase. Importantly, this too is independent of the arbitrariness
arising from the gauge transformation. This is because $\oint \partial_i \omega\, d\lambda^i = 0$. Indeed, we have already
seen this same expression in the context of electromagnetism: it is the Aharonov-Bohm phase.
In fact, it is possible to write the Berry phase in terms of the field strength using the higher-
dimensional version of Stokes’ theorem:
$$ e^{i\gamma} = \exp\left(i \oint_C d\lambda^i A_i(\lambda)\right) = \exp\left(i \int_S dS^{ij} F_{ij}\right), \tag{20.159} $$
where S is a two-dimensional surface in the parameter space bounded by the path C. A stan-
dard example of the application of Berry phase can be found in section 6.3.5 of Applications of
Quantum Mechanics (David Tong).
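A minimal numerical illustration of the Berry phase is a spin 1/2 whose quantisation axis is carried once around a cone of opening angle θ. The sketch below evaluates the gauge-invariant discrete product of overlaps, γ = −arg Πk ⟨nk |nk+1 ⟩, which corresponds to γ = ∮ dλi i ⟨n|∂i |n⟩. The overall sign is a convention, and this choice is an assumption of the example; with it, the spin-up state picks up γ = −π(1 − cos θ), that is, minus half the solid angle enclosed by the path. Function names are illustrative.

```python
import cmath
import math


def spin_up_along(theta, phi):
    """Spin-1/2 state pointing along the direction (theta, phi)."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))


def berry_phase(theta, steps=2000):
    """Berry phase for one loop in phi at fixed theta, from a product of overlaps."""
    product = 1.0 + 0.0j
    for k in range(steps):
        a = spin_up_along(theta, 2 * math.pi * k / steps)
        b = spin_up_along(theta, 2 * math.pi * (k + 1) / steps)
        product *= a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return -cmath.phase(product)


for theta in (math.pi / 4, math.pi / 3):
    expected = -math.pi * (1 - math.cos(theta))
    print(f"theta = {theta:.4f}: gamma = {berry_phase(theta):+.5f}, expected = {expected:+.5f}")
```

Because the product of overlaps is unchanged when each |nk ⟩ is multiplied by an arbitrary phase, the result does not depend on the gauge choice discussed above.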
where
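$$ H_{\rm nucl} \equiv -\sum_\alpha \frac{\nabla_\alpha^2}{2M_\alpha} + \frac{e^2}{4\pi} \sum_{\alpha\beta} \frac{Z_\alpha Z_\beta}{|R_\alpha - R_\beta|}, \tag{20.162} $$
and
$$ H_{\rm el} \equiv -\sum_i \frac{\nabla_i^2}{2m} + \frac{e^2}{4\pi} \left(\sum_{ij} \frac{1}{|r_i - r_j|} - \sum_{i\alpha} \frac{Z_\alpha}{|r_i - R_\alpha|}\right). \tag{20.163} $$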
We then solve for the eigenstates of Hel , where the nuclei positions R are viewed as parame-
ters which, as in the adiabatic approximation, will subsequently vary slowly. For fixed R, the
instantaneous electron wave functions are
In what follows, we will assume that the energy levels are non-degenerate. We then make the
ansatz for the wave function of the full system
$$ \Psi(r; R) = \sum_n \Phi_n(R)\, \phi_n(r; R). \tag{20.165} $$
20.7 Slowly changing Hamiltonians –221/453–
We would like to write down an effective Hamiltonian which governs the nuclei wave functions
Φ(R). This is straightforward. The wave function Ψ obeys
Switching to bra-ket notation for the electron eigenstates, we can write this as
$$ \sum_n \langle \phi_m|H_{\rm nucl} \Phi_n|\phi_n\rangle + \epsilon_m(R)\, \Phi_m = E\, \Phi_m. \tag{20.167} $$
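Now Hnucl contains the kinetic term ∇2R , and this acts not only on the nuclear wave function but
also on the electron wave function, where the nuclear positions sit as parameters. We have
$$ \langle \phi_m|\nabla_R^2\, \Phi_n|\phi_n\rangle = \sum_k \left(\delta_{mk} \nabla_R + \langle \phi_m|\nabla_R|\phi_k\rangle\right) \left(\delta_{kn} \nabla_R + \langle \phi_k|\nabla_R|\phi_n\rangle\right) \Phi_n. \tag{20.168} $$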
We see that the electron energy level ϵn (R) acts as an effective potential for the nuclei. The
Berry connection
An,α = −i ⟨ϕn |∇Rα |ϕn ⟩ (20.171)
acts as an effective magnetic field in which the nuclei move.
The idea of the Born-Oppenheimer approximation is that we can first solve for the fast-moving
degrees of freedom, to find an effective action for the slow-moving degrees of freedom. We
sometimes say that we have “integrated out” the electron degrees of freedom, language which
really comes from the path integral formulation of quantum mechanics. This is a very powerful
idea, and one which becomes increasingly important as we progress in theoretical physics.
Indeed, this simple idea underpins the Wilsonian renormalization group which we will meet
in later chapters.
Chapter 21
Many Body Problem
In three-dimensional space, the value of $e^{i\theta}$ can only be ±1. If the spin of the particle is an
integer, the phase factor must be 1 and the particle is called a boson. If the spin of the particle
is a half-integer, the phase factor must be −1 and the particle is called a fermion. This is the
spin-statistics theorem, which can only be proved in relativistic quantum field theory. In
two-dimensional space, the phase $e^{i\theta}$ can take any value, and particles that obey quantum statistics
of this sort are called anyons. A brief introduction can be found in chapters 12.1 and 12.2 of
Quantum Field Theory and the Standard Model (Matthew D. Schwartz).
Not all vectors in the space $H_1\otimes\cdots\otimes H_n$ are physical states. Physical states must be eigenvectors
of all $E_{ij}$ with eigenvalue 1 (−1) for bosons (fermions). The space composed of physical
states is called Fock space. For example, suppose there are three particles in three different states
a, b, c. The state |abc⟩ is not a physical state. The physical state for bosons is
\[ \frac{1}{\sqrt{3!}}\left(|abc\rangle + |acb\rangle + |bca\rangle + |bac\rangle + |cab\rangle + |cba\rangle\right). \tag{21.3} \]
The physical state for fermions is
\[ \frac{1}{\sqrt{3!}}\left(|abc\rangle - |acb\rangle + |bca\rangle - |bac\rangle + |cab\rangle - |cba\rangle\right). \tag{21.4} \]
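As a small illustration of (21.3) and (21.4), the sketch below builds the symmetric and anti-symmetric combinations for distinct labels and checks their behaviour under an exchange of two slots. The function names (`parity`, `physical_state`, `exchange`) are hypothetical helpers introduced here, not part of the notes.

```python
from itertools import permutations
from math import factorial, sqrt

def parity(perm):
    """Sign of a permutation (tuple of indices), from its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def physical_state(labels, fermions=False):
    """Return {ordered ket: amplitude} for the (anti)symmetrized state."""
    n = len(labels)
    norm = 1 / sqrt(factorial(n))
    state = {}
    for perm in permutations(range(n)):
        ket = tuple(labels[i] for i in perm)
        sign = parity(perm) if fermions else 1
        state[ket] = state.get(ket, 0.0) + sign * norm
    return state

def exchange(state, i, j):
    """Apply E_ij: swap the labels sitting in slots i and j of every ket."""
    swapped = {}
    for ket, amp in state.items():
        k = list(ket)
        k[i], k[j] = k[j], k[i]
        swapped[tuple(k)] = amp
    return swapped

bosons = physical_state("abc")
fermions = physical_state("abc", fermions=True)
```

Applying `exchange` to `bosons` returns the same dictionary, while applying it to `fermions` flips every amplitude, which is the eigenvalue condition stated above.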
In general, for N particles filling N distinct states, there are N ! states to start with, but there
is only one totally symmetric state and one totally anti-symmetric state, and the rest of N ! −
2 states are thrown out. Therefore quantum statistics reduces the size of the Hilbert space
quite dramatically. Furthermore, not all Hermitian operators are physical observables. A
nonphysical operator is one that takes a state in the physical subspace of the Hilbert space (one
that satisfies the right symmetry under exchange), and maps it into a nonphysical state (one
that does not have the right symmetry). An example is the operator X1 . We might call this the
operator corresponding to the measurement of the position of particle 1. The problem with
this operator from a physical standpoint is that you cannot measure the position of particle 1.
You can select a region of space, and ask whether there is a particle in that region. But if you
find one, you cannot say whether it is particle 1 or particle 2, since they are indistinguishable.
A physical observable O must obey
The generalization of the path integral formulation of quantum mechanics to the N-particle case
is
\[ \langle x_{1f}\cdots x_{Nf}, t_f | x_{1i}\cdots x_{Ni}, t_i\rangle = \int \mathcal{D}x_1(t)\cdots x_N(t)\, e^{i\int_{t_i}^{t_f} dt\, L(t)}. \tag{21.6} \]
Here, particle 1 at the initial position $x_{1i}$ moves to the final position $x_{1f}$, particle 2 moves
from the initial position $x_{2i}$ to $x_{2f}$, etc., and we sum over all possible paths. When the particles
are identical, however, we need to introduce proper (anti-)symmetrization of the state. For
fermions, we introduce the anti-symmetrized position bra
\[ \langle[x_1\cdots x_N]| = \frac{1}{\sqrt{N!}}\sum_\sigma (-1)^\sigma \langle x_{\sigma(1)}\cdots x_{\sigma(N)}|. \tag{21.7} \]
Notice that the Lagrangian for identical particles must be invariant under the exchange of
particles. We can prove that
\[ \langle[x_{1f}\cdots x_{Nf}], t_f | [x_{1i}\cdots x_{Ni}], t_i\rangle = \sum_\sigma (-1)^\sigma \langle x_{\sigma(1)f}\cdots x_{\sigma(N)f}, t_f | x_{1i}\cdots x_{Ni}, t_i\rangle. \tag{21.8} \]
In other words, the path integral sums over all possible paths, with the positions at the final
time slice interchanged in all possible ways relative to the positions at the initial time
slice. A diagrammatic representation of the path integral is shown in Figure 21.1. The case of
bosons can be obtained easily by dropping all minus signs.
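It is equivalent to
\[ [\psi(x), \psi^\dagger(y)] = \delta(x - y). \tag{21.16} \]
We can regard $\psi(x)$ as the annihilation operator and $\psi^\dagger(x)$ as the creation operator of a boson at position
x. The Hamiltonian of the system is
\[ H = \int dx \left( -\psi^\dagger \frac{\nabla^2}{2m}\psi + \frac{1}{2}\lambda\,\psi^{\dagger 2}\psi^2 \right). \tag{21.17} \]
\[ |x_1\cdots x_N\rangle = \frac{1}{\sqrt{N!}}\,\psi^\dagger(x_1)\cdots\psi^\dagger(x_N)\,|0\rangle. \tag{21.20} \]
The state $|x_1\cdots x_N\rangle$ is an N-particle state of identical bosons in the position eigenstate at
$x_1, \cdots, x_N$.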
Ψ(x, t) is a c-number function which determines a particular superposition of the position
eigenstates |x⟩ and corresponds to the Schrödinger wave function of single-particle quantum
mechanics. The Schrödinger equation in quantum field theory is
\[ i\frac{\partial|\Psi(t)\rangle}{\partial t} = H|\Psi(t)\rangle. \tag{21.24} \]
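Since
\[ H|\Psi(t)\rangle = \int dx\, \Psi(x,t)\,[H, \psi^\dagger(x)]\,|0\rangle = \int dx \left(-\frac{\nabla^2\Psi(x,t)}{2m}\right)|x\rangle, \tag{21.25} \]
\[ |x_1x_2\rangle = \frac{1}{\sqrt{2}}\,\psi^\dagger(x_1)\psi^\dagger(x_2)\,|0\rangle. \tag{21.27} \]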
We can derive that
\[ \langle x_1x_2|y_1y_2\rangle = \frac{1}{2}\left[\delta(x_1-y_1)\delta(x_2-y_2) + \delta(x_1-y_2)\delta(x_2-y_1)\right]. \tag{21.28} \]
This normalization suggests that we are dealing with a two-particle state of identical particles,
because the norm is non-vanishing when x1 = y1 and x2 = y2 , but also when x1 = y2 and
x2 = y1 . A general two-particle state is constructed by
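\[ |\Psi(t)\rangle \equiv \frac{1}{\sqrt{2}}\int dx_1\,dx_2\, \Psi(x_1,x_2,t)\,\psi^\dagger(x_1)\psi^\dagger(x_2)\,|0\rangle. \tag{21.29} \]
Because $[\psi^\dagger(x_1), \psi^\dagger(x_2)] = 0$, the integration over $x_1$ and $x_2$ is symmetric under the
interchange of $x_1$ and $x_2$, and hence $\Psi(x_1,x_2,t) = \Psi(x_2,x_1,t)$. The symmetry under the
exchange suggests that we are dealing with identical bosons. Since
\[ \begin{aligned} H|\Psi(t)\rangle &= \frac{1}{\sqrt{2}}\int dx_1\,dx_2\, \Psi(x_1,x_2,t)\left([H,\psi^\dagger(x_1)]\,\psi^\dagger(x_2) + \psi^\dagger(x_1)\,[H,\psi^\dagger(x_2)]\right)|0\rangle \\ &= \frac{1}{\sqrt{2}}\int dx_1\,dx_2 \left[\left(-\frac{\nabla_1^2}{2m} - \frac{\nabla_2^2}{2m} + \lambda\delta(x_1-x_2)\right)\Psi(x_1,x_2,t)\right]\psi^\dagger(x_1)\psi^\dagger(x_2)\,|0\rangle, \end{aligned} \tag{21.30} \]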
which is the Schrödinger equation for a two-particle wave function with delta potential as the
interaction between them. Therefore, the Fock space with two creation operators correctly
describes the two-particle quantum mechanics.
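If we want a general interaction potential between them, the action must be modified to
\[ S = \int dt \left[ \int dx\, \psi^\dagger(x)\left(i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m}\right)\psi(x) - \frac{1}{2}\int dx\,dy\, \psi^\dagger(x)\psi^\dagger(y)V(x-y)\psi(x)\psi(y) \right]. \tag{21.32} \]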
It follows that
\[ [N, \psi] = -\psi, \qquad [N, \psi^\dagger] = \psi^\dagger. \tag{21.39} \]
Thus we can derive that
\[ N|x_1,\cdots,x_n\rangle = n\,|x_1,\cdots,x_n\rangle. \tag{21.40} \]
It follows that
\[ [a(p), a^\dagger(q)] = \delta(p-q), \qquad [a(p), a(q)] = [a^\dagger(p), a^\dagger(q)] = 0. \tag{21.42} \]
We can rewrite the Hamiltonian in the momentum space. The free part of the Hamiltonian is
\[ H_0 \equiv \int dx\, \psi^\dagger\,\frac{-\nabla^2}{2m}\,\psi = \int dp\, \frac{p^2}{2m}\,a^\dagger(p)a(p). \tag{21.43} \]
It simply counts the number of particles in a given momentum state and assigns the energy
p2 /2m accordingly. The interaction part of the Hamiltonian is
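\[ \begin{aligned} \Delta H &\equiv \frac{1}{2}\int dx\,dy\, \psi^\dagger(x)\psi^\dagger(y)V(x-y)\psi(x)\psi(y) \\ &= \frac{1}{2}\int dp\,dq\,dp'\,dq'\, V(p-q)\,a^\dagger(p)a^\dagger(p')a(q)a(q')\,\delta(p+p'-q-q'), \end{aligned} \tag{21.44} \]
where
\[ V(p-q) = \frac{1}{(2\pi)^3}\int dx\, V(x)\,e^{-i(p-q)\cdot x}. \tag{21.45} \]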
The delta function represents momentum conservation in the scattering process due to
the potential V. The potential term of the Hamiltonian causes scattering by annihilating two
particles in momentum states q, q′ and creating them in different momentum states p, p′ with
the amplitude V(p − q).
21.2.4 Fermions
We have seen that the quantized Schrödinger field gives multi-body states of identical bosons.
For fermions, we should use anti-commutation relations rather than commutation relations:
\[ \{\psi(x), \psi^\dagger(y)\} = \delta(x-y), \qquad \{\psi(x), \psi(y)\} = \{\psi^\dagger(x), \psi^\dagger(y)\} = 0. \tag{21.46} \]
One noteworthy point is that $\psi^\dagger(x)\psi^\dagger(x) = \{\psi^\dagger(x), \psi^\dagger(x)\}/2 = 0$. What this means is
that one cannot create two particles at the same position, an expression of Pauli's exclusion
principle for fermions.
Consider a two-particle state
\[ |\Psi(t)\rangle = \frac{1}{\sqrt{2}}\int dx_1\,dx_2\, \Psi(x_1,x_2,t)\,\psi^\dagger(x_1)\psi^\dagger(x_2)\,|0\rangle. \tag{21.47} \]
From the anti-commutation relation $\{\psi^\dagger(x), \psi^\dagger(y)\} = 0$, we have
[A, BC] = {A, B}C − B{A, C}, [AB, C] = A{B, C} − {A, C}B. (21.49)
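The two operator identities in (21.49) hold for any associative algebra, so they can be spot-checked on ordinary matrices. The sketch below uses small random integer matrices (an arbitrary choice made here for exact arithmetic) and plain-Python helpers with hypothetical names.

```python
import random

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Commutator [a, b]."""
    return add(matmul(a, b), matmul(b, a), -1)

def anti(a, b):
    """Anti-commutator {a, b}."""
    return add(matmul(a, b), matmul(b, a))

random.seed(0)

def rand(n=3):
    return [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]

A, B, C = rand(), rand(), rand()

# [A, BC] = {A, B} C - B {A, C}
lhs1 = comm(A, matmul(B, C))
rhs1 = add(matmul(anti(A, B), C), matmul(B, anti(A, C)), -1)

# [AB, C] = A {B, C} - {A, C} B
lhs2 = comm(matmul(A, B), C)
rhs2 = add(matmul(A, anti(B, C)), matmul(anti(A, C), B), -1)
```

These identities are what allow commutators with a Hamiltonian that is bilinear in fermionic fields to be reduced to the anti-commutators (21.46).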
Because the potential vanishes in the asymptotic region, the Schrödinger equation
relates the asymptotic fall-off to the energy of the state,
\[ E = -\frac{\lambda^2}{2m}. \tag{22.3} \]
In particular, bound states have E < 0. Indeed, it is this property which ensures that
the particle is trapped within the potential and cannot escape to infinity. Bound states
are rather special. In the absence of a potential, a solution which decays exponentially
to the left will grow exponentially to the far right. But, for the state to be normalisable,
the potential has to turn this behaviour around, so that the wave function decreases at
both x → −∞ and x → +∞. This will only happen for specific values of λ. Ultimately,
this is why the spectrum of bound states is discrete.
• Scattering states are not localised in space and the wave functions are not normalisable.
Instead, asymptotically, far from the potential, scattering states take the form of plane
waves. In one-dimension, there are two possibilities,
Solving the Schrödinger equation in the asymptotic region gives the energy
\[ E = \frac{k^2}{2m}. \tag{22.5} \]
Scattering states have E > 0. Note that nothing special has to happen to find scattering
solutions. We expect to find solutions for any choice of k.
The coefficient r is called the reflection amplitude. The coefficient t is called the transmission
amplitude. The probability for reflection R and transmission T are given by the usual quantum
mechanics rule:
R = |r|2 , T = |t|2 . (22.7)
Given a solution ψ(x) to the Schrödinger equation, we can construct a conserved probability
current
\[ J(x) = \frac{-i}{2m}\left(\psi^*\frac{d\psi}{dx} - \psi\frac{d\psi^*}{dx}\right), \tag{22.8} \]
which obeys dJ/dx = 0. This means that J(x) is constant. For our scattering solution $\psi_R$,
the probability current as x → −∞ is given by
\[ J(x) = \frac{k}{m}\left(1 - |r|^2\right). \tag{22.9} \]
Meanwhile, as x → +∞, we have
\[ J(x) = \frac{k}{m}\,|t|^2. \tag{22.10} \]
Equating the two gives R + T = 1.
Now we throw the particle in from the right. We are now looking for solutions which take the
asymptotic form
\[ \psi_L(x) \sim \begin{cases} t'\,e^{-ikx}, & x\to-\infty \\ e^{-ikx} + r'\,e^{ikx}, & x\to+\infty \end{cases}. \tag{22.11} \]
Because the potential V(x) is a real function, if $\psi_R$ is a solution, so is $\psi_R^*$. By linearity,
$(\psi_R^* - r^*\psi_R)/t^*$ is also a solution, with asymptotic behavior
\[ \frac{\psi_R^*(x) - r^*\psi_R(x)}{t^*} \sim \begin{cases} t\,e^{-ikx}, & x\to-\infty \\ e^{-ikx} - \dfrac{r^*t}{t^*}\,e^{ikx}, & x\to+\infty \end{cases}. \tag{22.12} \]
\[ r = \frac{(k^2-q^2)\sin(qa)\,e^{-ika}}{(q^2+k^2)\sin(qa) + 2iqk\cos(qa)}, \qquad t = \frac{2iqk\,e^{-ika}}{(q^2+k^2)\sin(qa) + 2iqk\cos(qa)}, \tag{22.15} \]
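As a quick numerical check that the amplitudes (22.15) conserve probability, the sketch below evaluates them for a square well and verifies R + T = 1. It assumes $q^2 = 2mV_0 + k^2$ for a well of depth $V_0$; the function name and the sample parameters are illustrative choices, not taken from the notes.

```python
import cmath
import math

def square_well_amplitudes(k, V0, a, m=1.0):
    """Reflection and transmission amplitudes (22.15) for real k > 0."""
    q = math.sqrt(2 * m * V0 + k * k)          # assumed relation between q and k
    denom = (q * q + k * k) * math.sin(q * a) + 2j * q * k * math.cos(q * a)
    phase = cmath.exp(-1j * k * a)
    r = (k * k - q * q) * math.sin(q * a) * phase / denom
    t = 2j * q * k * phase / denom
    return r, t

def probabilities(k, V0, a, m=1.0):
    """Return (R, T) = (|r|^2, |t|^2)."""
    r, t = square_well_amplitudes(k, V0, a, m)
    return abs(r) ** 2, abs(t) ** 2

# Perfect transmission occurs when sin(qa) = 0, e.g. q a = pi.
V0, a = 2.0, 1.0
k_transparent = math.sqrt((math.pi / a) ** 2 - 2 * V0)
```

At `k_transparent` the reflection amplitude vanishes up to rounding, which is the familiar transmission resonance of the square well.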
22.1.2 S-matrix
We have two ingoing asymptotic wave functions, one from the left and one from the right,
We can also think about the S-matrix using our new basis of states. The asymptotic ingoing
modes are even and odd functions, given at |x| → ∞ by
For simplicity, assume that we have a symmetric potential. This means that there is no mixing
between the parity-even and parity-odd wave functions. We start by looking at the parity-even
states. The general solution takes the form
\[ \psi_+(x) = I_+(x) + S_{++}\,O_+(x) = \begin{cases} e^{ikx} + S_{++}\,e^{-ikx}, & x\to-\infty \\ e^{-ikx} + S_{++}\,e^{ikx}, & x\to\infty \end{cases}. \tag{22.25} \]
Suppose that we make k pure imaginary and write k = iκ with κ > 0. Then we get
\[ \psi_+(x) = \begin{cases} e^{-\kappa x} + S_{++}\,e^{\kappa x}, & x\to-\infty \\ e^{\kappa x} + S_{++}\,e^{-\kappa x}, & x\to\infty \end{cases}. \tag{22.26} \]
Both terms proportional to S++ decay asymptotically, but the other terms diverge. The wave
function above is normalisable whenever we can find a κ such that S++ (k = iκ) → ∞. So
poles in the complex momentum plane that lie on the positive imaginary axis correspond to
bound states. This information also tells us the energy of the bound state,
\[ E = -\frac{\kappa^2}{2m}. \tag{22.27} \]
We could also have set k = −iκ, with κ > 0. In this case, it is the terms proportional to S++
which diverge and the wave function is normalisable only if S++ (k = −iκ) = 0. However,
since S++ is a phase, this is guaranteed to be true whenever S++ (k = iκ) has a pole, and
simply gives us back the solution above.
Finally, exactly the same arguments hold for parity-odd wave functions. There is a bound state
whenever S−− (k) has a pole at k = iκ with κ > 0.
Example: We can illustrate this with the example of the square well, of depth $-V_0$ and width a.
We have
\[ S_{++} = r + t = -e^{-ika}\,\frac{q\tan(qa/2) - ik}{q\tan(qa/2) + ik}, \tag{22.28} \]
where $q^2 = 2mV_0 + k^2$. Setting k = iκ, we see that this has a pole when
\[ \kappa = q\tan\frac{qa}{2} \quad\text{with}\quad \kappa^2 + q^2 = 2mV_0. \tag{22.29} \]
These are the usual equations that you have to solve when finding parity-even bound states in
a square well. Similarly, if we look at the parity-odd wave functions, we have
\[ S_{--} = t - r = e^{-ika}\,\frac{q + ik\tan(qa/2)}{q - ik\tan(qa/2)}, \tag{22.30} \]
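The pole condition for parity-even bound states can be explored numerically. The sketch below writes $S_{++}$ in closed form with the overall phase $e^{-ika}$ that follows from combining the amplitudes in (22.15), compares it with r + t at real k, and then solves $\kappa = q\tan(qa/2)$ with $\kappa^2 + q^2 = 2mV_0$ by bisection. The helper names and the sample well (m = 1, V0 = 2, a = 1) are assumptions made for illustration.

```python
import cmath
import math

def s_even(k, V0, a, m=1.0):
    """Closed form of S_++ for the square well; k may be complex."""
    q = cmath.sqrt(2 * m * V0 + k * k)
    T = cmath.tan(q * a / 2)
    return -cmath.exp(-1j * k * a) * (q * T - 1j * k) / (q * T + 1j * k)

def r_plus_t(k, V0, a, m=1.0):
    """r + t assembled directly from the amplitudes (22.15), real k."""
    q = math.sqrt(2 * m * V0 + k * k)
    denom = (q * q + k * k) * math.sin(q * a) + 2j * q * k * math.cos(q * a)
    numer = (k * k - q * q) * math.sin(q * a) + 2j * q * k
    return numer * cmath.exp(-1j * k * a) / denom

def even_bound_state(V0, a, m=1.0):
    """Lowest root of kappa = q tan(qa/2), found by bisection in q."""
    q_max = min(math.pi / a, math.sqrt(2 * m * V0))

    def g(q):
        return q * math.tan(q * a / 2) - math.sqrt(max(2 * m * V0 - q * q, 0.0))

    lo, hi = 1e-12, q_max * (1 - 1e-12)   # g(lo) < 0 < g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    q = 0.5 * (lo + hi)
    return math.sqrt(2 * m * V0 - q * q), q

V0, a = 2.0, 1.0
kappa, q = even_bound_state(V0, a)
# At k = i kappa the denominator q tan(qa/2) + ik becomes q tan(qa/2) - kappa.
pole_denominator = q * math.tan(q * a / 2) - kappa
bound_state_energy = -kappa ** 2 / 2.0    # E = -kappa^2 / 2m with m = 1
```

The denominator is evaluated separately rather than calling `s_even` at the pole, since dividing by a number that is zero up to rounding is not meaningful.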
22.1.4 Resonances
Let us consider the example shown in Figure 22.1. On the one hand, we know that there can be
no bound states in such a trap because they would have E > 0. Any particle that we place in the
trap will ultimately tunnel out. On the other hand, if the walls of the trap are very large then
we might expect that the particle stays there for a long time before it eventually escapes. In this
situation, we talk of a resonance. These are also referred to as unstable or metastable states.
Suppose that $S_{++}$ has a pole in the complex momentum plane at position $k = k_0 - i\gamma$.
We note that the energy is then also complex,
\[ E = E_0 - i\frac{\Gamma}{2} \quad\text{where}\quad E_0 \equiv \frac{k_0^2 - \gamma^2}{2m}, \quad \Gamma \equiv \frac{2\gamma k_0}{m}. \tag{22.32} \]
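The decomposition in (22.32) is just the real and imaginary part of $k^2/2m$ at the pole. A minimal sketch, using Python's built-in complex numbers and a hypothetical helper name:

```python
def resonance_parameters(k0, gamma, m=1.0):
    """Return (E0, Gamma) for a pole at k = k0 - i*gamma, using E = k^2 / 2m."""
    k = complex(k0, -gamma)
    E = k * k / (2 * m)
    E0 = E.real
    Gamma = -2 * E.imag        # E = E0 - i Gamma / 2
    return E0, Gamma

E0, Gamma = resonance_parameters(k0=3.0, gamma=0.1, m=2.0)
tau = 1 / Gamma                # decay time scale of the state
```

A pole close to the real axis (small γ) gives a small Γ and hence a long-lived state.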
Recall that the time dependence of the wave function is given by
For γ > 0, the overall form of the wave function decays exponentially with time. This is the
characteristic behaviour of unstable states. A wave function that is initially supported inside
the trap will be very small there at times much larger than τ = 1/Γ. Here τ is called the
lifetime of the state, while Γ is usually referred to as the width of the state. Including the time
dependence, when $S_{++} \to \infty$, the solution takes the asymptotic form
\[ \psi_+(x,t) = \begin{cases} e^{-iE_0t}\,e^{-ik_0x}\,e^{-\gamma x - \Gamma t/2}, & x\to-\infty \\ e^{-iE_0t}\,e^{ik_0x}\,e^{\gamma x - \Gamma t/2}, & x\to\infty \end{cases}. \tag{22.34} \]
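where $H_0 \equiv P^2/2m$ is the free-particle Hamiltonian operator. The solution can be written
formally as
\[ |\psi^{(\pm)}\rangle = \frac{1}{E - H_0 \pm i\epsilon}\,V\,|\psi^{(\pm)}\rangle + |\phi\rangle \quad\text{where}\quad H_0|\phi\rangle = E|\phi\rangle. \tag{22.38} \]
In coordinate representation, we have
\[ \psi^{(\pm)}(x) = \phi(x) + \int d^3x'\, \langle x|\frac{1}{E - H_0 \pm i\epsilon}|x'\rangle\, V(x')\,\psi^{(\pm)}(x'), \tag{22.39} \]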
It follows that
\[ G_\pm(x,x') = -\frac{1}{4\pi}\,\frac{e^{\pm ik|x-x'|}}{|x-x'|} \quad\text{where}\quad k \equiv \sqrt{2mE}. \tag{22.41} \]
We can verify that
(∇2 + k 2 )G± (x, x′ ) = δ(x − x′ ). (22.42)
Now solution (22.39) becomes
\[ \psi^{(\pm)}(x) = \frac{e^{ik\cdot x}}{(2\pi)^{3/2}} - 2m\int d^3x'\, \frac{1}{4\pi}\,\frac{e^{\pm ik|x-x'|}}{|x-x'|}\,V(x')\,\psi^{(\pm)}(x'). \tag{22.43} \]
We can interpret $\psi^{(+)}(x)$ as a superposition of the incident plane wave and a scattered wave which
propagates outward from the scatterer, and denote it as ψ(x).
The experiment is typically done by placing the detector far away from the scatterer, i.e., |x| ≫
a where a is the “size” of the scatterer. The integration over x′, on the other hand, is limited
to within the “size” of the scatterer because of the V(x′) factor. Therefore, we are in the situation
|x′| ≪ |x|, and hence can use the approximation
\[ |x - x'| \approx |x| - \frac{x'\cdot x}{|x|}, \tag{22.44} \]
The differential cross section for being scattered into solid angle dΩ is
\[ d\sigma = \frac{|j_{\text{scatt}}|\,r^2\,d\Omega}{|j_{\text{inc}}|} = |f(k,k')|^2\,d\Omega, \tag{22.48} \]
where $j_{\text{inc}}$ and $j_{\text{scatt}}$ are the probability fluxes of the incident and scattered wave functions.
In a more realistic situation, we should use wave packets to describe the scattering process.
The basic picture is that a free wave packet approaches the scattering center. After a long time, we
have both the original wave packet moving in the original direction and a spherical wave front
that moves outward. The details can be found in section 3 of the lecture notes Scattering
Theory I (Hitoshi Murayama).
Furthermore, we require that the normalization of the wave function satisfy
$\int d^3x\,|\psi(x)|^2 = 1$ for any t, as guaranteed by the unitarity of the time evolution operator. This
leads to a special requirement on the scattered wave, and hence on f(k, k′), from
which we can derive the optical theorem.
In physics, the optical theorem is a general law of wave scattering theory, which relates
the forward scattering amplitude to the total cross section of the scatterer. It is usually
written in the form
\[ \operatorname{Im} f(0) = \frac{k\,\sigma_{\text{tot}}}{4\pi}, \tag{22.49} \]
where f(0) is the scattering amplitude at zero angle, and $\sigma_{\text{tot}}$ is the total cross
section of the scatterer.
The meaning of this theorem is clear. Because the scattered wave takes the probability away to
different directions, the total probability for the particle to go to the forward direction (unscat-
tered) should decrease. This decrease is caused by the interference between the unscattered
and scattered waves and hence is proportional to Imf (0). On the other hand, the amount of
decrease in the forward direction should equal the total probability at other directions, which
is proportional to the total cross section. The proof can be found in the section 4 of the lecture
notes Scattering Theory I (Hitoshi Murayama).
Form factor
If the source of the Coulomb potential has a distribution ρ(x), the potential becomes
\[ V(x) = \int d^3x'\, \frac{\alpha}{|x - x'|}\,\rho(x'). \tag{22.58} \]
Notice that the potential is mathematically a convolution of the Coulomb potential and the
probability density. Since the first Born amplitude is nothing but the Fourier transform of
the potential, the convolution becomes a product of Fourier transforms, one for the Coulomb
potential and the other for the probability density. Thus we have
\[ f(\theta) = f(\theta)_{\text{pointlike}}\,F(q), \tag{22.59} \]
where
\[ F(q) \equiv \int d^3x\, \rho(x)\,e^{iq\cdot x} \tag{22.60} \]
Born expansion
Define T-matrix by
V |ψ⟩ = T |ϕ⟩ . (22.61)
Using the definition of the T-matrix, we find that
\[ f(k,k') = -\frac{m}{2\pi}\,(2\pi)^3\,\langle k'|T|k\rangle. \tag{22.62} \]
Multiplying both sides of the Lippmann-Schwinger equation (22.38) by V from the left, we
get
\[ T|\phi\rangle = V\frac{1}{E - H_0 + i\epsilon}\,T|\phi\rangle + V|\phi\rangle. \tag{22.63} \]
A formal solution for the T-matrix is
\[ T = \frac{1}{1 - V(E - H_0 + i\epsilon)^{-1}}\,V. \tag{22.64} \]
Expanding T as a geometric series gives
\[ T = V + V\frac{1}{E - H_0 + i\epsilon}V + V\frac{1}{E - H_0 + i\epsilon}V\frac{1}{E - H_0 + i\epsilon}V + \cdots \tag{22.65} \]
Thus we have
\[ |\psi\rangle = \left(1 + \frac{1}{E - H_0 + i\epsilon}V + \frac{1}{E - H_0 + i\epsilon}V\frac{1}{E - H_0 + i\epsilon}V + \cdots\right)|\phi\rangle. \tag{22.66} \]
The first term is the wave which did not get scattered. The second term is the wave that gets
scattered at a point in the potential and then propagates outwards by the propagator. In the
third term, the wave gets scattered at a point in the potential, propagates for a while, gets
scattered again at another point in the potential, and propagates outwards. In the (n + 1)-th
term, the wave is scattered n times before it propagates outwards.
where $j_l(kr)$ are the spherical Bessel functions of the first kind. The asymptotic behaviour of $j_l(kr)$
at large r is
\[ j_l(kr) \sim \frac{\sin\left(kr - \frac{l\pi}{2}\right)}{kr}. \tag{22.68} \]
So we have
\[ e^{ikz} \sim \frac{1}{2ikr}\sum_{l=0}^{\infty}(2l+1)\left(e^{ikr} - (-1)^l e^{-ikr}\right)P_l(\cos\theta) \quad\text{at large } r. \tag{22.69} \]
Notice that
\[ \operatorname{Im} f(0) = \sum_l (2l+1)\,\operatorname{Im} f_l. \tag{22.72} \]
Applying the optical theorem, we find that
\[ |f_l|^2 = \frac{1}{k}\,\operatorname{Im} f_l. \tag{22.73} \]
It follows that
\[ |1 + 2ikf_l|^2 = 1. \tag{22.74} \]
We can define a phase $\delta_l$ by
\[ 1 + 2ikf_l = e^{2i\delta_l}. \tag{22.75} \]
The asymptotic behaviour of the wave function (22.46) is then
\[ \psi(x) \sim \frac{1}{2ikr}\sum_l (2l+1)P_l(\cos\theta)\left[e^{ikr}e^{2i\delta_l} - (-1)^l e^{-ikr}\right]. \tag{22.76} \]
Compare this to the case of the plane wave without scattering. What this equation says is that
the wave converging on the scatterer has the well-defined phase factor $-(-1)^l$, the same as in
the case without scattering, while the wave that emerges from the scatterer has an additional
phase factor $e^{2i\delta_l}$. All that scattering does is shift the phase of the emerging wave by $2\delta_l$. The
reason why this is merely a phase factor is the conservation of probability. What converged
to the origin must come out with the same strength. But this shift in the phase makes the
interference among the partial waves different from the case without phase shifts, and the
result is not a plane wave but contains the scattered wave. In terms of the phase shifts, the cross
section is given by
\[ \sigma = \frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l. \tag{22.77} \]
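The relations (22.72) to (22.77) can be checked against the optical theorem for any set of real phase shifts. The sketch below solves (22.75) for $f_l$, sums the forward amplitude using $P_l(1) = 1$, and compares $\operatorname{Im} f(0)$ with $k\sigma/4\pi$. The phase shift values are arbitrary numbers chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math

def partial_amplitude(delta, k):
    """f_l from 1 + 2 i k f_l = exp(2 i delta_l)."""
    return (cmath.exp(2j * delta) - 1) / (2j * k)

def forward_amplitude(deltas, k):
    """f(0) = sum_l (2l + 1) f_l, since P_l(1) = 1."""
    return sum((2 * l + 1) * partial_amplitude(d, k) for l, d in enumerate(deltas))

def total_cross_section(deltas, k):
    """sigma = (4 pi / k^2) sum_l (2l + 1) sin^2(delta_l)."""
    return 4 * math.pi / k ** 2 * sum(
        (2 * l + 1) * math.sin(d) ** 2 for l, d in enumerate(deltas)
    )

k = 1.3
deltas = [0.9, -0.4, 0.15, 0.02]           # illustrative phase shifts for l = 0..3
lhs = forward_amplitude(deltas, k).imag    # Im f(0)
rhs = k * total_cross_section(deltas, k) / (4 * math.pi)
```

The agreement of `lhs` and `rhs` holds term by term, because $\operatorname{Im} f_l = \sin^2\delta_l / k$ for every real phase shift.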
The actual calculation of phase shifts amounts to solving the Schrödinger equation
\[ \left[-\frac{1}{r}\frac{d^2}{dr^2}r + \frac{l(l+1)}{r^2} + 2mV(r)\right]R_l(r) = k^2R_l(r) \tag{22.78} \]
for each partial wave. After solving the equation, we take the asymptotic limit r → ∞, and
write $R_l(r)$ as proportional to the combination $j_l(kr)\cos\delta_l - n_l(kr)\sin\delta_l$. The relative coefficient of
$j_l$ and $n_l$ determines the phase shift $\delta_l$, and hence the cross section.
The infinite potential corresponds to the boundary condition Rl (a) = 0. We first analyze the
S-wave (l = 0). The Schrödinger equation is simply
\[ -\frac{d^2(rR_0)}{dr^2} = k^2\,rR_0. \tag{22.80} \]
The solution is
\[ rR_0 = c\sin[k(r-a)] = \frac{c\,e^{ika}}{2i}\left(e^{i(kr-2ka)} - e^{-ikr}\right). \tag{22.81} \]
It follows that $\delta_0 = -ka$. The reason behind the phase shift is that the wave cannot penetrate
into r < a, so the wave is shifted outwards, which gives the phase shift −ka. The cross section
from the S-wave scattering is
\[ \sigma_0 = \frac{4\pi}{k^2}\sin^2 ka. \tag{22.82} \]
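Let us generalize the discussion to the case of a slightly penetrable potential
\[ V = \begin{cases} 0, & r > a \\ V_0, & r < a \end{cases}. \tag{22.83} \]
Define $K \equiv \sqrt{2mV_0}$. If k > K, we have
\[ rR_0 = \begin{cases} \sin\left(\sqrt{k^2-K^2}\,r\right), & r < a \\ \sin(kr + \delta_0), & r > a \end{cases}. \tag{22.84} \]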
The phase shift $\delta_0$ always starts linearly with k at small momentum, and the slope is negative.
This is a completely general result for a repulsive potential, and a convenient quantity
\[ a_0 = \lim_{k\to 0}\left(-\frac{d\delta_0}{dk}\right) \tag{22.89} \]
is called the scattering length, as it has the dimension of length. This quantity basically
measures how big the scatterer is. The cross section in the k → 0 limit is then given by $4\pi a_0^2$. For
the hard sphere potential, the scattering length is indeed the size of the sphere.
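For the hard sphere, where $\delta_0 = -ka$ and $\sigma_0 = (4\pi/k^2)\sin^2 ka$, the statements about the scattering length can be checked directly. The sketch below estimates $a_0$ from a forward difference of the phase shift at small k (a numerical stand-in for the limit in (22.89)) and compares the low-energy cross section with $4\pi a_0^2$. The function names and the step size `h` are choices made here.

```python
import math

def delta0_hard_sphere(k, a):
    """S-wave phase shift of a hard sphere of radius a."""
    return -k * a

def scattering_length(delta0, a, h=1e-6):
    """a_0 = -d(delta_0)/dk at k -> 0, estimated by a forward difference."""
    return -(delta0(h, a) - delta0(0.0, a)) / h

def sigma0(k, a):
    """S-wave cross section (4 pi / k^2) sin^2(ka)."""
    return 4 * math.pi / k ** 2 * math.sin(k * a) ** 2

a = 1.5
a0 = scattering_length(delta0_hard_sphere, a)
low_energy_sigma = sigma0(1e-4, a)
```

The low-energy cross section is four times the geometric cross section $\pi a^2$ of the sphere.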
For the hard sphere problem, the phase shifts for higher partial waves can be worked out
similarly. We have
\[ \tan\delta_l = \frac{j_l(ka)}{n_l(ka)}. \tag{22.90} \]
The cross section is then given by
\[ \sigma = \frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l = \sum_{l=0}^{\infty}\frac{4\pi(2l+1)}{k^2}\,\frac{[j_l(ka)]^2}{[j_l(ka)]^2 + [n_l(ka)]^2}. \tag{22.91} \]
For small momentum $k \ll a^{-1}$, we can use the power expansion of the spherical Bessel
functions
\[ j_l(r) \sim \frac{r^l}{(2l+1)!!}, \qquad n_l(r) \sim -\frac{(2l-1)!!}{r^{l+1}}, \tag{22.92} \]
and find that
\[ \delta_l \sim (ka)^{2l+1}. \tag{22.93} \]
Thus phase shift (and so cross section) is smaller for higher partial waves. This is easy to
understand. When k is small, the centrifugal barrier does not allow the particle to reach r = a
classically. Therefore the effect of the potential is extremely suppressed.
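The suppression of higher partial waves at low momentum can be seen numerically from $\tan\delta_l = j_l(ka)/n_l(ka)$. The sketch below uses the standard closed forms of $j_l$ and $n_l$ for l = 0, 1, 2 (written out by hand here rather than taken from a library) and checks that doubling ka multiplies $\delta_l$ by roughly $2^{2l+1}$.

```python
import math

def j(l, x):
    """Spherical Bessel function of the first kind, l = 0, 1, 2."""
    s, c = math.sin(x), math.cos(x)
    if l == 0:
        return s / x
    if l == 1:
        return s / x ** 2 - c / x
    if l == 2:
        return (3 / x ** 3 - 1 / x) * s - 3 * c / x ** 2
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")

def n(l, x):
    """Spherical Neumann function, l = 0, 1, 2."""
    s, c = math.sin(x), math.cos(x)
    if l == 0:
        return -c / x
    if l == 1:
        return -c / x ** 2 - s / x
    if l == 2:
        return -(3 / x ** 3 - 1 / x) * c - 3 * s / x ** 2
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")

def hard_sphere_delta(l, ka):
    """Phase shift from tan(delta_l) = j_l(ka) / n_l(ka), valid for small ka."""
    return math.atan(j(l, ka) / n(l, ka))

def scaling_ratio(l, ka=0.05):
    """delta_l(2 ka) / delta_l(ka), expected to be close to 2^(2l+1)."""
    return hard_sphere_delta(l, 2 * ka) / hard_sphere_delta(l, ka)
```

For l = 0 the ratio is exactly 2, reproducing $\delta_0 = -ka$; for l = 1 and l = 2 it is close to 8 and 32, in line with (22.93).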
At high momentum, sin2 δl oscillates between 0 and 1 as a function of l up to l ∼ ka. Above
this value, the phase shift drops rapidly to zero. This makes sense from the classical physics
intuition. When l > ka, the impact parameter is larger than the size of the target and there
should not be any scattering.
For small K, the scattering length is negative. This is easy to understand because the wave is
pulled into the potential rather than pushed out, unlike in the repulsive case. However, once we make
the potential more attractive (larger K), the scattering length grows and even becomes infinite
at Ka = π/2.
Let us study the analytic structure of the scattering amplitude more carefully. Notice that
\[ e^{2i\delta_0} = e^{-2ika}\,\frac{1 + i\,\dfrac{k}{\sqrt{k^2+K^2}}\tan\left(\sqrt{k^2+K^2}\,a\right)}{1 - i\,\dfrac{k}{\sqrt{k^2+K^2}}\tan\left(\sqrt{k^2+K^2}\,a\right)}. \tag{22.97} \]
This is the condition for bound states. The scattering wave $e^{ikr}$ becomes $e^{-\kappa r}$, which is trapped
by the scatterer. As K is decreased from a sufficiently large value with bound states, the bound state
energies $E = -\kappa^2/2m$ move up. When Ka = (n + 1/2)π, we have tan Ka → ∞, and we
find a bound state approaching κ = k = 0. The infinite scattering cross section at k = 0
happens because there is a bound state exactly at k = 0.
This can also be seen on the complex k plane in the following manner. The lower half plane
is unphysical as it corresponds to an exponentially growing wave function at the infinity for
the scattered wave. When there are bound states, we see poles along the positive imaginary
axis. By decreasing K, the poles along the positive imaginary axis go down, and a pole reaches
the origin. By further decreasing K, the pole goes below the origin into the unphysical region.
However, the existence of a pole just below the origin makes the scattering amplitude at k → 0
large and results in an anomalously large cross section.
22.5 Resonance
This potential leads to a true bound state if γ is sufficiently negative. On the other hand, for
γ → ∞, the regions inside r < a and outside r > a are decoupled and one finds a tower of
states confined inside the shell. The fate of these states for finite γ is very interesting.
The phase shift for the S-wave can be worked out analytically,
\[ e^{2i\delta_0} = e^{-2ika}\,\frac{\sin ka + \dfrac{k}{2m\gamma}\,e^{ika}}{\sin ka + \dfrac{k}{2m\gamma}\,e^{-ika}}. \tag{22.101} \]
The poles are in the unphysical lower half plane. But when γ is large, the poles are very close to
the real axis, and the scattering amplitude receives a large enhancement due to these poles. In
the limit of γ → ∞, or in other words in the limit of no coupling between the regions inside
and outside the shell, they become poles along the real axis. They are the discrete states inside
the shell in this limit. By making γ finite, we introduce coupling between the discrete states
inside the shell to the continuum states outside the shell.
It is instructive to solve Schrödinger equation for the values of k which correspond to the
location of poles. Because the outgoing wave eikr is enhanced relative to the incoming wave
e−ikr by an infinite amount due to the pole, the boundary condition is that the solution is
“purely outgoing”, i.e.,
\[ rR_0 = \begin{cases} \sin kr, & r < a \\ \sin ka\; e^{ik(r-a)}, & r > a \end{cases}, \qquad \operatorname{Re}(k) > 0. \tag{22.104} \]
Because the factor eik(r−a) grows exponentially at large r due to the negative imaginary part
in k, the solution is not a regular normalizable solution. In the large γ limit, sin ka ∼ O(γ −1 )
is small. Therefore the wave function almost vanishes at the shell. Outside the shell, the wave
function oscillates at the small amplitude sin ka, which however starts growing again due to
the eik(r−a) factor exponentially.
We now put the time dependence in. For $k = k_0 - i\kappa$, we have
\[ E = E_0 - i\frac{\Gamma}{2} = \frac{k_0^2}{2m} - i\frac{k_0\kappa}{m} + O(\kappa^2). \tag{22.105} \]
The time dependence of the wave function is simply
\[ rR_0(r,t) = rR_0(r)\,e^{-iE_0t}\,e^{-\Gamma t/2} = \begin{cases} \sin kr\; e^{-iE_0t}\,e^{-\Gamma t/2}, & (r < a) \\ \sin ka\; e^{ik(r-a)}\,e^{-iE_0t}\,e^{-\Gamma t/2}, & (r > a) \end{cases}. \tag{22.106} \]
Inside the shell, it shows an exponentially decaying probability density uniformly over space.
Outside the shell, the probability density is $|rR_0|^2 \propto e^{2\kappa r - \Gamma t}$, which shows the probability
flowing out to infinity with speed $\Gamma/2\kappa = k_0/m$, nothing but the velocity of the particle
itself. In other words, the wave function describes a “bound state” inside the shell decaying into
a continuum state outside the shell moving away at the expected velocity. The resonances can
be viewed as quasi-bound states which decay into continuum states. The lifetime of the
quasi-bound states is τ = 1/Γ. A more rigorous treatment of resonances using wave packets can be
found in section 5 of the lecture notes Scattering Theory III (Hitoshi Murayama).
\[ e^{2i\delta_l} \sim e^{2i\theta}\,\frac{E - E_0 - i\Gamma/2}{E - E_0 + i\Gamma/2}. \tag{22.108} \]
\[ \sigma_l \propto \sin^2\delta_l = \frac{\Gamma^2/4}{(E - E_0)^2 + \Gamma^2/4}. \tag{22.109} \]
At $E = E_0$, it saturates the unitarity limit $\sin^2\delta_l = 1$, and its shape in energy is called
Lorentzian or Breit-Wigner. Γ is nothing but the FWHM (full width at half maximum) of the
Lorentzian peak in $\sin^2\delta_l$. Comparing the discussion of the decaying probability density with
a run-away wave and the dependence of the cross section on the energy, we establish the
relationship between the lifetime of the quasi-bound state and the FWHM of the resonance as
τΓ = 1. This is an explicit manifestation of the energy-time uncertainty relation ∆E∆t ∼ 1.
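The Breit-Wigner shape and its width can be verified in a few lines. The sketch below evaluates $\sin^2\delta_l$ both from the Lorentzian formula and from the resonant phase factor $(E - E_0 - i\Gamma/2)/(E - E_0 + i\Gamma/2)$, taking the background phase θ to be zero (an assumption made here so that the two expressions coincide), and checks that the half-maximum points sit at $E_0 \pm \Gamma/2$.

```python
def breit_wigner(E, E0, Gamma):
    """Lorentzian sin^2(delta) = (Gamma^2/4) / ((E - E0)^2 + Gamma^2/4)."""
    return (Gamma ** 2 / 4) / ((E - E0) ** 2 + Gamma ** 2 / 4)

def sin2_delta_from_phase(E, E0, Gamma):
    """sin^2(delta) recovered from exp(2 i delta), with background phase theta = 0."""
    s = (E - E0 - 0.5j * Gamma) / (E - E0 + 0.5j * Gamma)
    return abs(1 - s) ** 2 / 4      # |1 - e^{2 i delta}|^2 = 4 sin^2(delta)

E0, Gamma = 2.0, 0.3
peak = breit_wigner(E0, E0, Gamma)
half_left = breit_wigner(E0 - Gamma / 2, E0, Gamma)
half_right = breit_wigner(E0 + Gamma / 2, E0, Gamma)
```

The distance between the two half-maximum points is Γ, which is the sense in which Γ is the full width at half maximum.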
If two particles that scatter are identical particles, such as electron-electron scattering or scat-
tering of two identical atoms, symmetry of the wave function needs to be considered. Under
the interchange of two particles, the center of mass motion is not affected, but the relative
coordinates change their signs. If they have spins, their spins need to be interchanged at the
same time.
If two particles are identical spinless bosons, there are no spin degrees of freedom and the
interchange of particles is simply x → −x in the wave function. Because they are bosons, the wave
function should not change under the interchange of particles, and hence the wave function
must be an even function of x. Therefore the asymptotic form of the wave function must be
changed to
\[ \psi(x) \to e^{ikz} + e^{-ikz} + \left[f(\theta) + f(\pi - \theta)\right]\frac{e^{ikr}}{r}. \tag{22.113} \]
The differential cross section is then found to be
\[ \frac{d\sigma}{d\Omega} = |f(\theta) + f(\pi - \theta)|^2. \tag{22.114} \]
Note that one should not integrate over the entire solid angle to obtain the total cross section
because (θ, ϕ) and (π − θ, ϕ + π) correspond to an identical state.
For two spin 1/2 fermions, there are two possible spin wave functions, symmetric S = 1 and
anti-symmetric S = 0. Therefore, depending on the spin wave function, we have either an
anti-symmetric or a symmetric spatial wave function, respectively. In particular, the differential cross
section is the same as for spinless bosons for the anti-symmetric spin wave function S = 0,
while it is
\[ \frac{d\sigma}{d\Omega} = |f(\theta) - f(\pi - \theta)|^2 \tag{22.115} \]
for the symmetric spin wave function S = 1. In the latter case, the differential cross section
vanishes identically at θ = π/2.
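To make the effect of (anti-)symmetrization concrete, the sketch below evaluates both differential cross sections for a made-up amplitude f(θ). The amplitude is purely hypothetical and chosen only because it is smooth and complex; the conclusions at θ = π/2 do not depend on its form.

```python
import math

def f(theta):
    """Hypothetical scattering amplitude, for illustration only."""
    return 1.0 / (1.2 - math.cos(theta)) + 0.3j * math.cos(theta)

def dsigma_symmetric(theta):
    """|f(theta) + f(pi - theta)|^2: spinless bosons, or fermions with S = 0."""
    return abs(f(theta) + f(math.pi - theta)) ** 2

def dsigma_antisymmetric(theta):
    """|f(theta) - f(pi - theta)|^2: fermions with S = 1."""
    return abs(f(theta) - f(math.pi - theta)) ** 2

theta90 = math.pi / 2
boson_90 = dsigma_symmetric(theta90)
triplet_90 = dsigma_antisymmetric(theta90)
distinguishable_90 = 2 * abs(f(theta90)) ** 2   # |f(theta)|^2 + |f(pi - theta)|^2
```

At 90 degrees the symmetric cross section is twice the value obtained by adding probabilities for distinguishable particles, while the anti-symmetric one vanishes.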
When applied to the scattering problem, an additional issue is to define how we sum over the
final states. In particular, we would like to sum over the continuum plane-wave states, and
we must make the sum well-defined. To define the sum over the continuum states, it is useful
to consider the system in a cube of size L. We impose the periodic boundary condition. The
plane-wave solutions in this box are given by
Coming back to the scattering problem, we sum over the final states to define the rate of the
outgoing particle to go into various momentum states
\[ \sum_f \Gamma(i\to f) = \int \frac{L^3\,d^3p}{(2\pi)^3}\; 2\pi\,\delta(E_i - E_f)\,|V_{fi}|^2, \tag{22.119} \]
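where
\[ V_{fi} = \int d^3x\, \frac{e^{-ip_f\cdot x}}{L^{3/2}}\,V(x)\,\frac{e^{ip_i\cdot x}}{L^{3/2}} = \frac{1}{L^3}\int d^3x\, V(x)\,e^{iq\cdot x}, \qquad q = p_i - p_f. \tag{22.120} \]
Notice that $E = p^2/2m$ and $\delta(E_i - E_f) = m\,\delta(p_f - p_i)/p_i$. Equation (22.121) can be simplified
into
\[ \sigma = \int d\Omega\, \left|\frac{m}{2\pi}\int dx\, V(x)\,e^{iq\cdot x}\right|^2. \tag{22.122} \]
This is nothing but the Born approximation for the scattering cross section.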
Part V
23.1 Group
Definition 23.1 Group
Group A group G is a set of elements with a rule for assigning to every (ordered) pair
of elements a third element, their product, satisfying
• If f, g ∈ G, then f g ∈ G.
• For f, g, h ∈ G, f (gh) = (f g)h.
• There is an identity element, e, such that for all f ∈ G, ef = f e = f .
• Every element f ∈ G has an inverse, $f^{-1}$, such that $f f^{-1} = f^{-1} f = e$.
Abelian group An abelian group is one in which the multiplication of any two
elements is commutative.
Finite group A group is finite if it has a finite number of elements. The number of
elements in a finite group G is called the order of G, denoted by N (G). A finite
group with n elements can be characterized by its multiplication table. In each
row or column, every group element appears once and only once.
Let a group G with n elements have a subgroup H with m elements. Then m is a factor
of n. In other words, n/m is an integer.
If H is a subgroup of G, the left cosets are given by {gH | g ∈ G}, where gH ≡ {gh | h ∈
H}. If H is an invariant subgroup, we have gH = Hg. In this case, we can define the
multiplication rule for left cosets as $(g_aH)(g_bH) \equiv (g_ag_b)H$. Then the left cosets form
a group, known as the quotient group and written as Q = G/H. If G is finite, we have
N (Q) = N (G)/N (H).
The set of all bijections {1, · · · , n} → {1, · · · , n}, called permutations, with composition
of maps forms a finite group. We call this group the symmetric group of degree n
and it is denoted by $S_n$. An element $\sigma \in S_n$ can be represented by (1, · · · , n) → (σ(1), · · · , σ(n)).
We will take the convention of composing permutations from right to left, so for
$\pi, \sigma \in S_n$ we have π · σ = (1, · · · , n) → (π(σ(1)), · · · , π(σ(n))).
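The right-to-left composition convention and the statement that the order of a subgroup divides the order of the group can both be illustrated on $S_3$. In the sketch below a permutation is stored as the tuple (σ(1), ..., σ(n)); the helper names are hypothetical.

```python
from itertools import permutations

def compose(pi, sigma):
    """(pi . sigma)(i) = pi(sigma(i)): apply sigma first, then pi."""
    return tuple(pi[sigma[i] - 1] for i in range(len(sigma)))

def cyclic_subgroup(g):
    """Subgroup generated by a single element g."""
    identity = tuple(range(1, len(g) + 1))
    elements, x = [identity], g
    while x != identity:
        elements.append(x)
        x = compose(g, x)
    return elements

S3 = list(permutations((1, 2, 3)))

swap12 = (2, 1, 3)   # exchanges 1 and 2
swap23 = (1, 3, 2)   # exchanges 2 and 3
product_a = compose(swap12, swap23)   # swap23 acts first
product_b = compose(swap23, swap12)   # swap12 acts first

subgroup_orders = sorted({len(cyclic_subgroup(g)) for g in S3})
```

The two products differ, so $S_3$ is not abelian, and every cyclic subgroup has order 1, 2 or 3, each of which divides N(S_3) = 6.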
Definition 23.5
Proposition 23.1
Two representations, D(g) and D′(g), are equivalent representations if they are related
by a similarity transformation $D'(g) = S^{-1}D(g)S$. We have χ′(c) = χ(c).
A representation of a group G is unitary if and only if all the matrices D(g) are
unitary. All representations of finite groups are equivalent to unitary representations.
As a corollary, if a class c of a finite group contains the inverses of its members, χ(c)
must be real.
Proposition 23.2
Definition 23.11
If we take the trace of the representation, we have the following column orthogonality for
the character table $\chi_{cr} \equiv \chi_r(c)$:
\[ \sum_c n_c\,\chi_r^*(c)\,\chi_s(c) = N(G)\,\delta_{rs}, \tag{23.7} \]
where $n_c$ is the number of elements in class c. The character table also satisfies row
orthogonality,
\[ \sum_r \chi_r^*(c)\,\chi_r(c') = \frac{N(G)}{n_c}\,\delta_{cc'}. \tag{23.8} \]
A corollary of column and row orthogonality of the character table is that N (C) = N (R), that is,
the number of classes equals the number of irreducible representations.
The identity element e itself is a class and $\chi_r(e) = d_r$, so we have
\[ \sum_r d_r^2 = N(G). \tag{23.9} \]
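The orthogonality relations (23.7) to (23.9) can be checked on a small example. The sketch below uses the well-known character table of $S_3$ (three classes: the identity, the three transpositions, the two 3-cycles), which is supplied here as an illustration and is not taken from the notes. All characters of $S_3$ are real, so complex conjugation is omitted.

```python
class_sizes = [1, 3, 2]          # n_c for: identity, transpositions, 3-cycles
characters = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}
order = sum(class_sizes)         # N(G) = 6

def column_orthogonality(r, s):
    """sum_c n_c chi_r(c) chi_s(c), expected N(G) if r == s, else 0."""
    return sum(n * characters[r][c] * characters[s][c]
               for c, n in enumerate(class_sizes))

def row_orthogonality(c, c_prime):
    """sum_r chi_r(c) chi_r(c'), expected N(G)/n_c if c == c', else 0."""
    return sum(chi[c] * chi[c_prime] for chi in characters.values())

# chi_r(e) = d_r, so the first entry of each row is the dimension.
sum_of_squared_dimensions = sum(chi[0] ** 2 for chi in characters.values())
```

Here the number of classes and the number of irreducible representations are both three, and the dimensions satisfy 1 + 1 + 4 = 6.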
Proposition 23.4
Proposition 23.6
Corollary 1
For each cell of the Young diagram with coordinates (i, j) (that is, the cell in the ith row
and jth column), the hook Hλ (i, j) is the set of cells (a, b) such that a = i and b ≥ j, or
a ≥ i and b = j. The hook-length hλ (i, j) is the number of cells in the hook Hλ (i, j).
The hook-length formula expresses the number of standard Young tableaux of shape λ,
sometimes denoted by dλ , as
\[
d_\lambda = \frac{n!}{\prod_{(i,j)} h_\lambda(i,j)}. \tag{23.12}
\]
Example: For a Young diagram of shape (2, 1), the hook-lengths of the cells are
\[
\begin{matrix} 3 & 1 \\ 1 & \end{matrix} \tag{23.13}
\]
so that
\[
d_\lambda = \frac{3!}{3} = 2. \tag{23.14}
\]
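The hook-length formula can be checked with a short sketch. The function names and the encoding of a shape as a tuple of row lengths are assumptions chosen for this illustration.

```python
from math import factorial


def hook_lengths(shape):
    """Hook length of every cell of a Young diagram given by its row lengths."""
    # cols[j] = number of rows that reach column j
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    return [[(shape[i] - j - 1) + (cols[j] - i - 1) + 1  # arm + leg + the cell itself
             for j in range(shape[i])]
            for i in range(len(shape))]


def num_standard_tableaux(shape):
    """Number of standard Young tableaux: n! divided by the product of hook lengths."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product


print(hook_lengths((2, 1)))
print(num_standard_tableaux((2, 1)))
```

For shape (2, 1) the hook lengths are 3, 1 in the first row and 1 in the second, reproducing the example.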
23.3 Representations of the symmetric groups –255/453–
Let Θ_λ^σ be any tableau. A horizontal (vertical) permutation h_λ^σ (v_λ^σ) is a permutation
that does not exchange numbers between different rows (columns). Each cycle in h_λ^σ
(v_λ^σ) must contain numbers that appear in the same row (column).
Definition 23.18
Symmetrizer: s_λ^σ ≡ Σ_h h_λ^σ
Anti-symmetrizer: a_λ^σ ≡ Σ_v (−1)^v v_λ^σ
Irreducible symmetrizer: e_λ^σ ≡ s_λ^σ a_λ^σ = Σ_{h,v} (−1)^v h_λ^σ v_λ^σ
The horizontal and vertical permutations are hλ = {e, (12)} and vλ = {e, (13)}, respectively.
The symmetrizer, anti-symmetrizer and irreducible symmetrizer are sλ = e + (12), aλ =
e − (13) and eλ = e + (12) − (13) − (321).
The horizontal and vertical permutations are h_λ^{(2,3)} = {e, (13)} and v_λ^{(2,3)} = {e, (12)},
respectively. The symmetrizer, anti-symmetrizer and irreducible symmetrizer are s_λ^{(2,3)} =
e + (13), a_λ^{(2,3)} = e − (12) and e_λ^{(2,3)} = e − (12) + (13) − (123).
Theorem 23.5
Theorem 23.6
Let Vm be an m-dimensional vector space with basis {|i⟩ , i = 1, · · · , m}. The general
linear group GL(m) consists of all invertible linear transformations on Vm . The natural
m-dimensional representation of GL(m) on Vm is
Theorem 23.7
1. In the smaller tableau, assign the same symbol, say a, to all boxes in the first row, the
same symbol b to all boxes in the second row, etc.
2. Attach the boxes labeled by a to the second tableau in all possible ways, subject to the
rules that no two a’s appear in the same column and that the resulting graph is still a
Young tableau (i.e., the length of the rows does not increase from top to bottom, there
are not more than n rows, etc.). Repeat this process with the b’s, etc.
3. After all symbols have been added to the tableau, the added symbols are read
from right to left in the first row, then in the second row in the same order, and so forth.
The resulting sequence of symbols (e.g., aabbac) must form a lattice permutation. That is, to the left of any
symbol there are no fewer a’s than b’s, no fewer b’s than c’s, etc.
4. The added symbols are read again from top to bottom in the last column, then in the last
but one column in the same order, and so forth. This sequence of symbols must also
form a lattice permutation.
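Stage 0 to Stage 3, equations (23.22) to (23.25): [Young tableau diagrams showing the boxes labeled a, a, b being attached step by step; the diagrams are not recoverable from the extracted text.]
So we have the decomposition (23.26) of the product of the two Young diagrams into a direct sum of six Young diagrams [diagrams not recoverable from the extracted text], that is,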
8 × 8 = 27 + 10 + 10 + 8 + 8 + 1. (23.27)
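The dimension count in (23.27) can be cross-checked with a small sketch. It assumes the standard SU(n) dimension formula (the product over cells of (n + column − row) divided by the hook length) and assumes that the six diagrams have row lengths (4,2), (3,3), (4,1,1), (3,2,1), (3,2,1) and (2,2,2); neither the formula nor these shapes is spelled out in the text above, and the function name is hypothetical.

```python
from fractions import Fraction


def dim_su(shape, n):
    """Dimension of the SU(n) irrep labeled by a Young diagram (row lengths)."""
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    d = Fraction(1)
    for i, row in enumerate(shape):
        for j in range(row):
            hook = (row - j - 1) + (cols[j] - i - 1) + 1
            d *= Fraction(n + j - i, hook)  # (n + content of the cell) / hook length
    return int(d)


# Assumed shapes of the six diagrams in the product of two (2,1) diagrams.
shapes = [(4, 2), (3, 3), (4, 1, 1), (3, 2, 1), (3, 2, 1), (2, 2, 2)]
dims = [dim_su(s, 3) for s in shapes]
print(dims, sum(dims), dim_su((2, 1), 3) ** 2)
```

The dimensions of the six pieces add up to the square of the dimension of the (2,1) diagram, as the product requires.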
–258/453– Chapter 23 Elementary Group Theory
Lie groups G are groups whose elements g ∈ G depend smoothly on a set
of continuous real parameters, g = g(α) where α = {αa | 1 ≤ a ≤ N }. In general, we
choose the parameters {αa } so that the identity can be expressed as e = g(0). For a
representation D(G), we similarly have 1 = D(0).
To integrate functions F (g) of the group elements over the group G, we need an
integration measure dµ (g) to formulate such integrals as ∫_G dµ (g) F (g). The measure
should be invariant under the group action, i.e., dµ (g) = dµ (g ′ ) where g ′ = g1 g. For a
specific parametrization αa of the group manifold, we can write dµ (g) = ρ(αa ) dα1 · · · dαN .
The invariance of the measure requires that ρ(αa ) dα1 · · · dαN = ρ(αa′ ) dα1′ · · · dαN′ . The
group is compact if ∫_G dµ (g) is finite. Most theorems on finite groups also hold for
compact groups if we replace the summation Σ_g by the integral ∫_G dµ (g).
Example: The rotation group in 2-dimensional space can be parametrized by the angle of
rotation θ:
\[
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \quad \text{where } 0 \le \theta < 2\pi. \tag{23.28}
\]
If R(θ′ ) = R(ϕ)R(θ), we have θ′ = ϕ + θ. Since ρ(θ) dθ = ρ(θ′ ) dθ′ and dθ′ = dθ, we can derive that
\[
\rho(\theta) = \rho(\theta + \phi). \tag{23.29}
\]
Setting θ = 0 gives ρ(ϕ) = ρ(0). The measure is determined only up to an overall constant,
and so we might as well set ρ(0) = 1. Noting that
\[
\int_0^{2\pi} d\theta = 2\pi, \tag{23.30}
\]
the rotation group is compact. We can check that representation R(θ) of the rotation group is
equivalent to a unitary representation, like that of a finite group.
If L(v ′ ) = L(u)L(v), we have v ′ = (v + u)/(1 + uv). Since ρ(v) dv = ρ(v ′ ) dv ′ , we can derive
that
\[
\rho(v) = \rho(v')\, \frac{1 - u^2}{(1 + uv)^2}. \tag{23.32}
\]
Setting v = 0 gives ρ(u) = ρ(0)/(1 − u²). We might as well set ρ(0) = 1. Noting that
\[
\int_{-1}^{+1} \frac{dv}{1 - v^2} = \infty, \tag{23.33}
\]
the group of boosts L(v) is not compact.
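A quick numerical sketch can confirm that ρ(v) = 1/(1 − v²) satisfies the invariance condition (23.32) under the velocity composition law. The function names are illustrative assumptions.

```python
def compose_velocity(u, v):
    """Velocity addition v' = (v + u) / (1 + u v)."""
    return (v + u) / (1 + u * v)


def rho(v):
    """Candidate invariant density with the normalization rho(0) = 1."""
    return 1.0 / (1.0 - v * v)


def jacobian(u, v):
    """dv'/dv for v' = (v + u) / (1 + u v)."""
    return (1 - u * u) / (1 + u * v) ** 2


u, v = 0.3, -0.6
lhs = rho(v)
rhs = rho(compose_velocity(u, v)) * jacobian(u, v)
print(lhs, rhs)  # the two sides of rho(v) dv = rho(v') dv'
```

The same check at other values of u and v in (−1, 1) gives agreement up to rounding.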
23.4.2 SO(N )
We define SO(N ) as the group of all N -by-N real matrices R satisfying R⊺ R = I and det R =
1. The elements of the SO(N ) are represented, by definition, by the N -by-N matrices trans-
forming the N unit basis vectors ê1 , · · · , êN into one another. More precisely, the N -dimensional
irreducible representation of SO(N ) is furnished by a vector. A vector is defined by how it
transforms under a rotation:
Tensor representations can be reduced into several invariant subspaces by requiring the tensors to have
definite symmetry properties under permutation of their indices.
The Kronecker delta δ^{ij} is invariant under SO(N ). So T^{(ij)···} can be further decomposed into
the trace T^{(ij)···} δ^{ij} and the traceless part T^{(ij)···} − δ^{ij} [T^{(kl)···} δ^{kl} /N ].
The Levi-Civita symbol ϵ^{i1···iN} is also invariant under SO(N ). So the representation T^{[i1···i_{N−1}]···}
is equivalent to T^{i···}, T^{[i1···i_{N−2}]···} is equivalent to T^{[ij]···}, etc.
The rotation group SO(2n) enjoys the additional feature of self-dual and anti-self-dual tensors.
Consider the antisymmetric tensor with n indices A^{i1···in}. Construct the tensor B^{i1···in} ≡
i^n ϵ^{i1···in i_{n+1}···i_{2n}} A^{i_{n+1}···i_{2n}} /n! dual to A, denoted as B ∼ ϵA. Then A is dual to B, i.e., A ∼
ϵB. It follows that the two tensors T_±^{i1···in} ≡ A^{i1···in} ± B^{i1···in} are self-dual and anti-self-dual,
respectively. Schematically, ϵT± ∼ ϵ(A ± B) ∼ ϵA ± ϵB ∼ B ± A ∼ ±(A ± B) ∼ ±T± .
Clearly, under an SO(2n) transformation, T+ transforms into a linear combination of T+ ,
while T− transforms into a linear combination of T− . The two tensors correspond to two
irreducible representations with dimension (2n)!/(2(n!)²), not (2n)!/(n!)².
Example: For SO(3), a pair of antisymmetric indices can always be traded for a single index.
The irreducible representation is furnished by totally symmetric traceless tensors carrying n
indices, with n an arbitrary positive integer, that is, a tensor S i1 ···in that remains unchanged on
the interchange of any pair of indices and that vanishes when any two indices are contracted.
The dimension of S i1 ···in is 2n + 1.
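The dimension 2n + 1 can be recovered by counting, as in this sketch: a totally symmetric tensor with n indices in N dimensions has C(N + n − 1, n) components, and the tracelessness conditions remove as many components as a symmetric tensor with n − 2 indices has. The function name is an assumption for illustration.

```python
from math import comb


def dim_symmetric_traceless(n, N=3):
    """Independent components of a symmetric traceless rank-n tensor in N dimensions."""
    symmetric = comb(N + n - 1, n)
    # Each trace is itself a symmetric tensor with n - 2 indices.
    traces = comb(N + n - 3, n - 2) if n >= 2 else 0
    return symmetric - traces


for n in range(1, 6):
    print(n, dim_symmetric_traceless(n), 2 * n + 1)
```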
23.4.3 SU(N )
We define SU(N ) as the group of all N -by-N complex matrices U satisfying U † U = I and
det U = 1. By definition, the fundamental representation is furnished by a vector:
V i = U ij V j . (23.36)
The conjugate representation is furnished by the complex conjugation of a vector:
Vi = Vj (U † )j i where Vi ≡ V i∗ . (23.37)
The product representations of them are thus furnished by tensors with upper and lower indices:
\[
V^{i_1 \cdots i_m}_{j_1 \cdots j_n} = U^{i_1}{}_{k_1} \cdots U^{i_m}{}_{k_m}\, V^{k_1 \cdots k_m}_{l_1 \cdots l_n}\, (U^\dagger)^{l_1}{}_{j_1} \cdots (U^\dagger)^{l_n}{}_{j_n}. \tag{23.38}
\]
Tensor representations can be reduced by requiring the tensors to have definite symmetry properties
under permutation of their upper indices and under permutation of their lower indices.
The Kronecker delta δ^i_j is invariant under SU(N ). So T^{i···}_{j···} can be further decomposed into
the trace T^{i···}_{i···} and the traceless part T^{i···}_{j···} − δ^i_j (T^{k···}_{k···} /N ).
The Levi-Civita symbols ϵ^{i1···iN} and ϵ_{i1···iN} are also invariant under SU(N ). Using the two
antisymmetric symbols, we can move indices on SU(N ) tensors upstairs and downstairs.
Example: For SU(2), because the antisymmetric symbols ϵij and ϵij carry two indices, we can
in fact remove all lower indices. Furthermore, it suffices to consider only tensors with up-
per indices all symmetrized. The irreducible representation is furnished by totally symmetric
tensors carrying n upper indices. The dimension of the representation is n + 1.
Since ϵij ψj transforms in exactly the same way as ψ i under SU(2) and ϵij is antisymmetric,
the fundamental representation of SU(2) is pseudoreal.
Any hermitean and traceless 2-by-2 matrix X can be written as a linear combination of the
three Pauli matrices:
\[
X = x_1\sigma_1 + x_2\sigma_2 + x_3\sigma_3 = \begin{pmatrix} x_3 & x_1 - i x_2 \\ x_1 + i x_2 & -x_3 \end{pmatrix}. \tag{23.39}
\]
The determinant of X is det X = −(x1² + x2² + x3²) ≡ −x². Pick an arbitrary element U of SU(2). Consider
X ′ ≡ U † XU . Since X ′ is also hermitean and traceless, we can write it as X ′ = x′i σi , and
x → x′ is a linear transformation. Since det X ′ = det X, x′ and x have the same length,
thus defining a rotation R. In other words, we can associate an element R of SO(3) with any
element U of SU(2). This map f : U → R of SU(2) into SO(3) is actually 2-to-1, since U
and −U are mapped into the same R. The unitary group SU(2) is said to double cover the
orthogonal group SO(3).
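The 2-to-1 map can be made concrete with a small sketch. It uses the relation R_ij = Tr(σ_i U† σ_j U)/2, which follows from X′ = U† XU and Tr(σ_i σ_j) = 2δ_ij; the helper names and the sample element U = exp(−iθσ3/2) are assumptions made for this illustration.

```python
import cmath
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rotation_from_su2(u):
    """R_ij = Tr(sigma_i U^dagger sigma_j U) / 2, so that x'_i = R_ij x_j."""
    ud = dagger(u)
    rot = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(SIGMA[i], matmul(ud, matmul(SIGMA[j], u)))
            row.append(0.5 * complex(m[0][0] + m[1][1]).real)
        rot.append(row)
    return rot


def su2_about_z(theta):
    """Sample SU(2) element U = exp(-i theta sigma_3 / 2)."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]


u = su2_about_z(0.7)
minus_u = [[-z for z in row] for row in u]
print(rotation_from_su2(u))
print(rotation_from_su2(minus_u))  # same rotation: U and -U map to the same R
```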
Example: For SU(3), a pair of antisymmetric upper (lower) indices can always be traded for
a single lower (upper) index. The irreducible representation is furnished by a traceless tensor
φ^{i1···im}_{j1···jn} with all upper indices symmetrized and all lower indices symmetrized. The dimension
In some neighborhood of the identity, the elements of a Lie group G or its representation
D(G) can be Taylor expanded as
\[
D(d\alpha) = 1 + i\, d\alpha_a X^a + O(d\alpha^2), \tag{23.40}
\]
where
\[
X^a = -i \left. \frac{\partial D(\alpha)}{\partial \alpha_a} \right|_{\alpha = 0} \tag{23.41}
\]
are called the generators of the group G in its representation D(G). The X^a are independent of
one another. The representation of the group elements for finite parameters α = {αa }
can be defined as D(α) = exp(iαa X^a ). This procedure is called the exponential map.
The generators of the Lie group G form a closed algebra under the Lie bracket [A, B] =
AB − BA, called the Lie algebra. Lie algebras are generally written as
\[
[X^a, X^b] = i f^{abc} X^c, \tag{23.42}
\]
where the real coefficients f^{abc} are known as the structure constants of the Lie group G.
Proposition 23.7
1. f abc = −f bac .
2. The generators of a unitary representation of Lie group are hermitian matrices.
3. The structure constants satisfy the so-called Jacobi identity,
Define (T^a)_{bd} ≡ −i f^{abd}. We have [T^a , T^c ] = i f^{acd} T^d from the Jacobi identity. Thus T^a is
a representation of the Lie algebra, called the adjoint representation.
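As an illustration of the adjoint representation, the following sketch takes SU(2), where the structure constants are the Levi-Civita symbol (an assumption of this example, with indices running over 0, 1, 2), builds (T^a)_{bd} = −i f^{abd}, and checks [T^a , T^c ] = i f^{acd} T^d entry by entry. All function names are hypothetical.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def adjoint_generators():
    """(T^a)_{bd} = -i f^{abd} with f^{abd} = epsilon^{abd} for SU(2)."""
    return [[[-1j * levi_civita(a, b, d) for d in range(3)]
             for b in range(3)] for a in range(3)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_violation():
    """Largest entry of [T^a, T^c] - i f^{acd} T^d over all a and c."""
    T = adjoint_generators()
    worst = 0.0
    for a in range(3):
        for c in range(3):
            ac, ca = matmul(T[a], T[c]), matmul(T[c], T[a])
            for i in range(3):
                for j in range(3):
                    rhs = sum(1j * levi_civita(a, c, d) * T[d][i][j]
                              for d in range(3))
                    worst = max(worst, abs(ac[i][j] - ca[i][j] - rhs))
    return worst


print(max_violation())
```

A vanishing result means the matrices built from the structure constants close into the same algebra.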
Note: A simple Lie group may contain discrete invariant subgroups, hence being a simple Lie group is
different from being simple as an abstract group.
Proposition 23.8
For semisimple algebra, the real symmetric object gab and its inverse g ab can be used to raise
and lower indices and to take scalar products, e.g., f abc ≡ g dc f abd = −iTr(T a T b T c −
T a T c T b ). Clearly, f abc is totally antisymmetric.
23.5 Lie algebra –263/453–
Note:
• A compact Lie algebra must be the Lie algebra of a compact semisimple Lie group.
• The Cartan metric on the Lie algebra of a compact Lie group is positive semidefinite, not positive
definite in general.
• The Cartan metric on the Lie algebra of a noncompact Lie group may be positive semidefinite as
well.
From now on, we choose a basis of the compact Lie algebra such that g^{ab} = δ^{ab},
so that all T^a are hermitean. The matrices T^i commute with one another by definition, and hence
these l matrices can be simultaneously diagonalized by choosing a new basis for the E^a . Denote
the diagonal elements of T^i by −β^i (a). These l matrices are thus given by
If X a ∈ E, we can say β(a) is a root of the Lie algebra and X a can be denoted as Eβ . Vectors
λ depend on the representation of the Lie algebra, called weight vectors. Clearly, roots of a Lie
algebra are the nonzero weights of its adjoint representation.
23.5.3 SO(N )
In the fundamental representation of SO(N ), the Lie algebra consists of all N -by-N antisym-
metric hermitean matrices. We choose the following bases for the Lie algebra:
[Jmn , Jpq ] = i(δmp Jnq + δnq Jmp − δnp Jmq − δmq Jnp ). (23.51)
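The commutation relation (23.51) can be verified numerically. The sketch below assumes the explicit basis (J_mn)_ij = −i(δ_mi δ_nj − δ_mj δ_ni), which is a common choice but is not shown in the text above; the function names are likewise illustrative.

```python
def generator(m, n, N):
    """Assumed basis: (J_mn)_ij = -i (delta_mi delta_nj - delta_mj delta_ni)."""
    J = [[0j] * N for _ in range(N)]
    J[m][n] += -1j
    J[n][m] += 1j
    return J


def matmul(a, b):
    N = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def delta(a, b):
    return 1 if a == b else 0


def max_violation(N):
    """Largest deviation from (23.51) over all index choices m, n, p, q."""
    worst = 0.0
    for m in range(N):
        for n in range(N):
            for p in range(N):
                for q in range(N):
                    A, B = generator(m, n, N), generator(p, q, N)
                    AB, BA = matmul(A, B), matmul(B, A)
                    Jnq, Jmp = generator(n, q, N), generator(m, p, N)
                    Jmq, Jnp = generator(m, q, N), generator(n, p, N)
                    for i in range(N):
                        for j in range(N):
                            rhs = 1j * (delta(m, p) * Jnq[i][j] + delta(n, q) * Jmp[i][j]
                                        - delta(n, p) * Jmq[i][j] - delta(m, q) * Jnp[i][j])
                            worst = max(worst, abs(AB[i][j] - BA[i][j] - rhs))
    return worst


print(max_violation(4))
```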
Of the N (N − 1)/2 generators, {J12 , J34 , · · · , J2l−1,2l } form a maximal subset of mutually
commuting generators (N = 2l or 2l + 1). Diagonalize them simultaneously, and call them
H 1 , H 2 , · · · , H l respectively.
For N = 2l, we have
from which we read off the 2l weights for the fundamental representation:
Let us write the 2l weights more compactly as ±ei in terms of the l unit vectors ei , for i =
1, · · · , l. The root vectors take us from one state to another, and hence they are given by the
differences between the weights, namely,
We take the positive roots to be ei ± ej (a negative root must be opposite to a positive root). A
simple root is a positive root that cannot be written as a sum of two positive roots with positive
coefficients. The simple roots are then
Figure 23.1: Weight diagram (in fundamental representation) and root diagram of SO(4).
Note that no root takes the weight w1 into w2 ; this is because rotations cannot transform
x1 ± ix2 into each other. Similarly for x3 ± ix4 .
For N = 2l + 1, we have
from which we read off the 2l + 1 weights for the fundamental representation:
Figure 23.2: Weight diagram (in fundamental representation) and root diagram of SO(5).
We take the positive roots to be ei ± ej , (i < j) and ei . The simple roots are then
23.5.4 SU(N )
In the fundamental representation of SU(N ), the Lie algebra consists of all N -by-N traceless
hermitean matrices. Evidently, there are l = N − 1 traceless N -by-N matrices that commute
with one another and hence can be simultaneously diagonalized. They are
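\[
H^1 = \frac{\mathrm{diag}(1, -1, 0, \cdots, 0)}{\sqrt{2}}, \quad H^2 = \frac{\mathrm{diag}(1, 1, -2, 0, \cdots, 0)}{\sqrt{6}}, \quad \cdots,
\]
\[
H^{l-1} = \frac{\mathrm{diag}(1, 1, 1, \cdots, -(l-1), 0)}{\sqrt{(l-1)l}}, \quad H^l = \frac{\mathrm{diag}(1, 1, 1, \cdots, 1, -l)}{\sqrt{l(l+1)}}. \tag{23.60}
\]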
Figure 23.3: Weight diagram (in fundamental representation) and root diagram of SU(3).
The simple roots of SU(l + 1) can be written in a more elegant form by going to a space one
dimension higher. Let e^i (i = 1, · · · , l + 1) denote unit vectors living in (l + 1)-dimensional
space. Then (e^i − e^{i+1})² = 2, and (e^i − e^{i+1}) · (e^j − e^{j+1}) = −1 if j = i ± 1 and 0 otherwise.
The l simple roots of SU(l + 1) are then given by
\[
\alpha^i = e^i - e^{i+1}, \quad i = 1, \cdots, l. \tag{23.63}
\]
Note that the simple roots live in the l-dimensional hyperplane perpendicular to the vector
Σ_j e^j .
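The scalar products of the simple roots in (23.63) can be tabulated with a short sketch; the function names are assumptions for illustration.

```python
def simple_roots(l):
    """Simple roots e^i - e^{i+1} of SU(l+1) as vectors in l+1 dimensions."""
    roots = []
    for i in range(l):
        v = [0] * (l + 1)
        v[i], v[i + 1] = 1, -1
        roots.append(v)
    return roots


def gram_matrix(roots):
    """Matrix of scalar products alpha^i . alpha^j."""
    return [[sum(x * y for x, y in zip(a, b)) for b in roots] for a in roots]


roots = simple_roots(3)  # SU(4)
for row in gram_matrix(roots):
    print(row)
print([sum(r) for r in roots])  # each root is perpendicular to the sum of the e^j
```

The matrix has 2 on the diagonal and −1 between neighboring roots.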
23.5.5 Sp(2l)
We define Sp(2l) as the group of all 2l-by-2l complex matrices U satisfying U † U = I and
U ⊺ JU = J, where
\[
J \equiv \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}_{2l \times 2l}. \tag{23.64}
\]
In the fundamental representation, the Lie algebra of Sp(2l) consists of all 2l-by-2l hermitean
matrices satisfying H ⊺ = JHJ. Thus, the general form of the generators is given by
\[
\begin{pmatrix} P & W^* \\ W & -P^\intercal \end{pmatrix}, \tag{23.65}
\]
where P is hermitean and W is symmetric. The generators can also be represented in the
direct product notation as
\[
iA \otimes I, \quad S_1 \otimes \sigma_1, \quad S_2 \otimes \sigma_2, \quad S_3 \otimes \sigma_3, \tag{23.66}
\]
where A is an arbitrary real l-by-l antisymmetric matrix and S1 , S2 , and S3 are three arbitrary
real l-by-l symmetric matrices.
Of the l(2l + 1) generators, a maximal subset of mutually commuting generators could be
{u1 ⊗ σ3 , · · · , ul ⊗ σ3 }, where ui denotes the l-by-l diagonal matrix with a single entry equal
to 1 in the ith row and ith column. So the weights of the 2l different states in the fundamental
representation are
Figure 23.4: Weight diagram (in fundamental representation) and root diagram of Sp(4).
The root diagram of Sp(4) is the same as that of SO(5), indicating the local isomorphism
Sp(4) ≃ SO(5).
We take the positive roots to be ei ± ej , (i < j) and 2ei . The simple roots are then
Define
M (k, α, β) ≡ Nα,kα+β N−α,(k+1)α+β . (23.70)
From Jacobi identity
we can get
M (k − 1, α, β) = M (k, α, β) + kα · α + α · β. (23.72)
From
[Eα , Epα+β ] = 0, [E−α , E−qα+β ] = 0, (23.73)
we have
M (p, α, β) = 0, M (−q − 1, α, β) = 0. (23.74)
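Using equations 23.72 and 23.74, we obtain
\[
M(p - s, \alpha, \beta) = s \left[ \alpha\cdot\alpha \left( p - \tfrac{1}{2}(s - 1) \right) + \alpha\cdot\beta \right]. \tag{23.75}
\]
Setting s = p + q + 1 and using M (−q − 1, α, β) = 0 gives
\[
\frac{\alpha\cdot\beta}{\alpha\cdot\alpha} = \frac{q - p}{2} \equiv \frac{n}{2}. \tag{23.76}
\]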
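Next, we can repeat the same argument with the roles of α and β interchanged, leading to
\[
\frac{\alpha\cdot\beta}{\beta\cdot\beta} = \frac{q' - p'}{2} \equiv \frac{m}{2}. \tag{23.77}
\]
Combining the two relations gives
\[
\cos^2\theta_{\alpha\beta} = \frac{(\alpha\cdot\beta)^2}{(\alpha\cdot\alpha)(\beta\cdot\beta)} = \frac{mn}{4} \le 1, \qquad \rho_{\alpha\beta} \equiv \frac{\alpha\cdot\alpha}{\beta\cdot\beta} = \frac{m}{n}. \tag{23.78}
\]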
There are only four possible angles between α and β (we can always take θαβ to be acute, by
flipping α if necessary):
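The allowed angles follow from (23.78) by enumerating the integer products mn. The sketch below assumes the acute-angle convention stated above (so m and n are non-negative) and excludes mn = 4, which would make α and β parallel; the function name is an assumption.

```python
import math


def allowed_angles():
    """Map each allowed product m*n to the angle (in degrees) with cos^2 = m*n/4."""
    angles = {0: 90.0}  # m = n = 0: the two roots are orthogonal
    for m in range(1, 4):
        for n in range(1, 4):
            if m * n <= 3:  # m*n = 4 would give parallel roots
                angles[m * n] = math.degrees(math.acos(math.sqrt(m * n) / 2))
    return angles


for product, angle in sorted(allowed_angles().items()):
    print(product, round(angle, 6))
```

For mn = 1, 2, 3 the squared-length ratio m/n of the two roots is 1, 2 (or 1/2) and 3 (or 1/3), respectively.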
Proposition 23.9
Theorem 23.8
Using these theorems, we can enumerate all exceptional compact simple Lie algebras, as shown
in Figure 23.6. The details can be found in section VI.5 of Group Theory in a Nutshell for
Physicists (A.Zee).
Define
\[
\sigma_{ij} \equiv -\frac{i}{2} [\gamma_i, \gamma_j]. \tag{23.84}
\]
23.6 Spinor representations of orthogonal algebras –271/453–
Complex conjugation
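If a matrix C satisfies
\[
\sigma_{ij}^\intercal C + C \sigma_{ij} = 0, \tag{23.95}
\]
ζ ⊺ Cψ will be invariant under rotation. Specifically, we can choose C1 = iσ2 and
\[
C_{n+1} = \begin{pmatrix} 0 & C_n \\ (-1)^{n+1} C_n & 0 \end{pmatrix} = \begin{cases} C_n \otimes \sigma_1 & \text{if } n \text{ is odd} \\ C_n \otimes i\sigma_2 & \text{if } n \text{ is even} \end{cases} \tag{23.96}
\]
\[
C_n^\intercal = (-1)^{n(n+1)/2} C_n, \quad \gamma_F^{(n)} C_n = (-1)^n C_n \gamma_F^{(n)}, \quad \gamma_i C_n = (-1)^{n+i+1} C_n \gamma_i. \tag{23.97}
\]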
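Define
\[
\psi_c \equiv C^{-1} \psi^*. \tag{23.98}
\]
Note that
\[
C^{-1} \sigma_{ij}^* \tfrac{1}{2}(1 \pm \gamma_F)^* C = -\sigma_{ij} \tfrac{1}{2} [1 \pm (-1)^n \gamma_F]. \tag{23.99}
\]
We have
\[
C^{-1} e^{-i\omega_{ij}\sigma_{ij}^*/4} P_\pm^* = e^{i\omega_{ij}\sigma_{ij}/4} P_\pm C^{-1} \quad \text{if } n \text{ is even} \tag{23.100}
\]
and
\[
C^{-1} e^{-i\omega_{ij}\sigma_{ij}^*/4} P_\pm^* = e^{i\omega_{ij}\sigma_{ij}/4} P_\mp C^{-1} \quad \text{if } n \text{ is odd}. \tag{23.101}
\]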
In other words, ψc,± will transform like ψ± if n is even and ψ∓ if n is odd. Thus, the spinor
representation of SO(4k + 2) is complex while that of SO(4k) is non-complex.
Note that C = C ⊺ if n = 4k and C = −C ⊺ if n = 4k + 2. Thus, the spinor representation of
SO(8k) is real while that of SO(8k + 4) is pseudoreal.
where
Γκ = γi1 · · · γiκ , i1 , · · · , iκ are all different. (23.103)
If n is even and κ is odd, or κ is even and n is odd, T++ and T−− will vanish. If n and κ are
both even or both odd, T+− and T−+ will vanish.
If T does not vanish, T would transform like a totally antisymmetric rank κ tensor in 2n
dimensional vector space. So the product representation of spinor representations can be re-
duced to the direct sum of antisymmetric tensor representation.
Take SO(4) as an example. We have
Note that [2] is a selfdual totally antisymmetric rank 2 tensor in 4 dimensional vector space.
In the spinor representation |ϵ1 · · · ϵn ⟩ of SO(2n), (ϵi + 1)/2 can be seen as the occupation number
of fermions in energy level i. The generators of SO(2n) are given by all possible bilinear
operators of the form fi fj , fi† fj† and fi† fj , where fi and fi† are annihilation and creation operators
for fermions in level i. The generators of the U(n) subgroup are then the operators that
conserve the number of fermions, namely, fi† fj . The diagonal U(1) is just the total fermion
number Σ_{i=1}^{n} fi† fi .
When restricted on SU(n), the spinor can be decomposed as follows:
• For n odd,
• For n even,
Especially, we have
ϕa (x) ≡ U −1 (t, t0 )ϕa (x)U (t, t0 ), π a (x) ≡ U −1 (t, t0 )π a (x)U (t, t0 ). (24.4)
Furthermore, it can be shown that Q is also the generator of the infinitesimal transformation
δϕa , i.e.,
U † ϕa U = ϕa + δϕa where U ≡ eiQ ≈ I + iQ. (24.12)
However, in some cases a classical conservation law fails to hold in the quantum field theory.
Such a failure is called an anomaly and will be discussed later.
24.4 Momentum
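The conserved currents for the infinitesimal spacetime translation x′µ = xµ + aµ are
\[
j^\mu = -a_\nu T^{\mu\nu} \quad \text{where} \quad T^{\mu\nu} \equiv -\frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi_a)}\, \partial^\nu \phi_a + \eta^{\mu\nu} \mathcal{L}. \tag{24.13}
\]
The corresponding conserved charges, called the four-momentum, are
\[
P^0 \equiv \int T^{00}\, d^3x = H, \qquad P^i \equiv \int T^{0i}\, d^3x = \int -\pi^a \partial_i \phi_a\, d^3x. \tag{24.14}
\]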
Define
\[
M_L^{\mu\nu} \equiv \int (x^\mu T^{0\nu} - x^\nu T^{0\mu})\, d^3x, \qquad M_S^{\mu\nu} \equiv \int \left( -\pi^a (\Sigma^{\mu\nu})_{ab}\, \phi_b \right) d^3x. \tag{24.19}
\]
where
(Lµν )ab ≡ −i(xµ ∂ ν − xν ∂ µ )δab , (Sµν )ab ≡ −i(Σµν )ab . (24.21)
We now define Ji ≡ ϵijk M jk /2 and Ki ≡ M i0 . So equation 24.20b can be rewritten as
[P µ , M ρσ ] = i(η µσ P ρ − η µρ P σ ), (24.23)
or equivalently,
Finally, we define Li ≡ ϵijk MLjk /2 and Si ≡ ϵijk MSjk /2. We can derive that
and
U −1 (Λ)P µ U (Λ) = Λµν P ν , U −1 (Λ)M µν U (Λ) = Λµρ Λν σ M ρσ , (24.27)
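where
\[
U(\Lambda) = \exp\left( \frac{i}{2} \theta_{\mu\nu} M^{\mu\nu} \right), \qquad S = \exp\left( \frac{i}{2} \theta_{\mu\nu} S^{\mu\nu} \right). \tag{24.28}
\]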
24.6 Anticommutation relation –277/453–
[ϕa , A] and [π a , A] are the same as those in the theory quantized with commutation relation.
It is easy to verify that P i and MSµν have the required form. The form of H is determined
by the specific theory. As we can see later, the Hamiltonian of Dirac field has the required
form. When it is quantized with anticommutation relations, the commutation relations be-
tween field operators, momentum operators and angular momentum operators discussed in
previous sections will hold automatically.
Chapter 25
Scalar Field
(∂ µ ∂µ − m2 )ϕ = 0. (25.2)
[ϕ(x, t), ϕ(y, t)] = 0, [π(x, t), π(y, t)] = 0, [ϕ(x, t), π(y, t)] = iδ(x − y). (25.5)
where E = √(p² + m²), px ≡ p · x − Et and \widetilde{dp} ≡ d³p/[(2π)³ 2E]. It follows that
\[
a(p) = \int d^3x\, e^{-ipx} (i\pi + E\phi), \qquad a^\dagger(p) = \int d^3x\, e^{ipx} (-i\pi + E\phi). \tag{25.7}
\]
and
\[
[P^i, a(p)] = -p^i a(p), \qquad [P^i, a^\dagger(p)] = p^i a^\dagger(p). \tag{25.12}
\]
Define |p⟩ ≡ a† (p) |0⟩. We have
Therefore, we interpret the state |p⟩ as the momentum eigenstate of a single particle of mass m.
We can also show that Ji |p = 0⟩ = 0. So the particle carries no internal angular momentum.
The amplitude for a particle to propagate from y to x is ⟨0|ϕ(x)ϕ(y)|0⟩, denoted by D(x − y).
We find that
\[
D(x - y) = \int \widetilde{dp}\, e^{ip(x-y)} \tag{25.14}
\]
and
[ϕ(x), ϕ(y)] = D(x − y) − D(y − x). (25.15)
If x − y is spacelike, a continuous Lorentz transformation can take (x − y) to −(x − y),
leading to [ϕ(x), ϕ(y)] = 0. As a result, a measurement performed at one point can not affect
a measurement at another point whose separation is spacelike.
where the integration over p0 is performed in the way shown by Figure 25.1. The retarded
Green function also satisfies the equation
where T stands for time ordering, placing all operators evaluated at later times to the left, and
the integration over p0 is performed in the way shown by Figure 25.2.
\[
\mathcal{L} = -\frac{1}{2} \partial_\mu\phi\, \partial^\mu\phi - \frac{1}{2} m_0^2 \phi^2 - \frac{\lambda_0}{4!} \phi^4. \tag{25.19}
\]
Ground states of the interaction field theory and free field theory are denoted by |Ω⟩ and |0⟩
respectively. The zero of energy is fixed by H0 |0⟩ = 0.
Define
\[
\phi_I(t, x) \equiv e^{iH_0(t - t_0)}\, \phi(t_0, x)\, e^{-iH_0(t - t_0)}, \qquad H_I(t) \equiv \int d^3x\, \frac{\lambda_0}{4!} \phi_I^4. \tag{25.21}
\]
The derivation can be found in section 4.2 of An introduction to quantum field theory (M.E.Peskin
& D.V.Schroeder).
To evaluate the right hand side of equation 25.22, we need the following theorem.
25.3 Perturbation theory for canonical quantization –281/453–
\[
T\{\phi_I(x_1) \cdots \phi_I(x_n)\} = N\{\phi_I(x_1) \cdots \phi_I(x_n) + \text{all possible contractions}\}. \tag{25.23}
\]
N means normal order, in which all the a’s are to the right of all the a† s.
The proof can be found in section 4.3 of An introduction to quantum field theory (M.E.Peskin
& D.V.Schroeder).
Example:
⟨0|T {ϕI (x1 )ϕI (x2 )ϕI (x3 )ϕI (x4 )}|0⟩ = DF (x1 − x2 )DF (x3 − x4 )
+ DF (x1 − x3 )DF (x2 − x4 ) + DF (x1 − x4 )DF (x2 − x3 ) (25.24)
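The three terms in (25.24) are the three ways of pairing four points completely, since only fully contracted terms survive in the vacuum expectation value. The sketch below enumerates such pairings for any even number of points; the generator name `pairings` is an assumption made for illustration.

```python
def pairings(points):
    """Yield every way of grouping the given points into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


for p in pairings([1, 2, 3, 4]):
    # each pair (i, j) stands for a factor D_F(x_i - x_j)
    print(p)
print(len(list(pairings([1, 2, 3, 4, 5, 6]))))  # number of terms for six fields
```

The number of full contractions of 2k fields is (2k − 1)!!, which grows quickly with k.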
Figure 25.3: Feynman diagram representation of the perturbation expansion. The symmetry factors
of the diagrams above are S = 4!/3 = 8 and S = 4!/12 = 2, respectively.
Note: We emphasize that in this section, ϕH denotes the operator-valued field in the Heisenberg picture,
ϕS is the field in the Schrödinger picture, and ϕ(x) represents the classical field, whose values are ordinary
numbers.
The proof can be found in section 9.2 of An introduction to quantum field theory (M.E.Peskin
& D.V.Schroeder).
The generating functional of the correlation functions is defined as
\[
Z[J] \equiv \int \mathcal{D}\phi\, \exp\left\{ i \int d^4x\, \left[ \mathcal{L} + J(x)\phi(x) \right] \right\}. \tag{25.30}
\]
V =0,P =0
V! 4! i δJ(x) P! 2
(25.38)
If we focus on a term with particular values of V and P , the number of surviving sources (after
we take all the functional derivatives) will be E = 2P − 4V . The 4V functional derivatives
can act on the 2P sources in (2P )!/(2P − 4V )! different combinations. However, many of
the resulting expressions are algebraically identical.
To organize them, we introduce Feynman diagrams similar to those in perturbation theory of
canonical quantization. In these diagrams, a line segment stands for a propagator DF (x − y),
a filled circle at one end of a line segment for a source i∫d⁴x J(x), and a vertex joining four
line segments for −iλ0 ∫d⁴z.
For each diagram, we can assign a symmetry factor similar to that in perturbation theory for
canonical quantization. Due to the fact that some external sources are identical here, usually
symmetry factors in two cases are not equivalent. However, when calculating the correlation
function, the exchange of the order of functional derivatives to identical sources can eliminate
the difference.
It can be shown that
\[
Z[J] = Z_0[0] \exp\left( \sum_I C_I \right), \tag{25.39}
\]
where CI stands for a particular connected diagram, including its symmetry factor. We can
define W [J] by
Z[J] = Z[0] exp(−iW [J]). (25.40)
It follows from W [0] = 0 that
\[
-iW[J] = \sum_{I \neq \{0\}} C_I. \tag{25.41}
\]
The notation I ̸= {0} means that the vacuum diagrams are omitted from the sum. The detailed
discussion can be found in section 9 of Quantum field theory (M. Srednicki).
25.4.4 Symmetries
Equations of motion
The equation of motion in classical field theory is given by
\[
\frac{\delta S}{\delta\phi(x)} = 0. \tag{25.42}
\]
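In quantum field theory, we derive the equation of motion by requiring that the path integral
be invariant under an infinitesimal change of the field, i.e., ϕ(x) → ϕ(x) + ϵ(x). Define
\[
Z[\phi(x_1), \cdots, \phi(x_n)] \equiv \int \mathcal{D}\phi\, e^{iS} \phi(x_1) \cdots \phi(x_n). \tag{25.43}
\]
It follows that
\[
\delta Z = \int \mathcal{D}\phi\, e^{iS} \int d^4x\, \epsilon(x) \left[ i \frac{\delta S}{\delta\phi(x)} \phi(x_1) \cdots \phi(x_n) + \delta(x - x_1) \phi(x_2) \cdots \phi(x_n) + \cdots \right], \tag{25.44}
\]
leading to
\[
\left\langle \frac{\delta S}{\delta\phi(x)} \phi(x_1) \cdots \phi(x_n) \right\rangle = i \sum_{i=1}^{n} \langle \phi(x_1) \cdots \delta(x - x_i) \cdots \phi(x_n) \rangle. \tag{25.45}
\]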
Conservation laws
Consider a local field theory of a set of fields ϕa (x), governed by a Lagrangian density L(ϕ).
An infinitesimal symmetric transformation on the fields ϕa is of the form
If ϵ is a constant, the action will be invariant under this transformation, i.e., the Lagrangian
density must be invariant up to a total divergence,
Cross section σ can be constructed from M. For a relativistic collinear scattering process, we
have
dN = σ|v1 − v2 |n1 dt . (25.55)
Consider a 2 → n process p1 + p2 → {pj }. Suppose that the volume of the space in which
the scattering process takes place is V and the duration of the scattering process is T . So the
number density of the incident particle is
\[
n_1 = \frac{1}{V} \tag{25.56}
\]
[Figure: diagram with regions labeled "multiparticle continuum", "one particle in motion", and "bound state".]
Assume for now x0 > y 0 and define connected two point function as
Term ⟨Ω|ϕ(x)|Ω⟩ is usually zero by symmetry; for higher spin fields, it is zero by Lorentz
invariance. From the completeness of Klein-Gordon field, we have
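\[
\langle\Omega|\phi(x)\phi(y)|\Omega\rangle_C = \sum_\lambda \int \frac{d^3p}{(2\pi)^3} \frac{1}{2E_p} \langle\Omega|\phi(x)|\lambda_p\rangle \langle\lambda_p|\phi(y)|\Omega\rangle. \tag{25.71}
\]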
Since
⟨Ω|ϕ(x)|λp ⟩ = ⟨Ω|ϕ(0)|λ0 ⟩ eipx |p0 =Ep , (25.72)
we can obtain
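\[
\langle\Omega|\phi(x)\phi(y)|\Omega\rangle_C = \sum_\lambda \int \frac{d^4p}{(2\pi)^4} \frac{-i}{p^2 + m_\lambda^2 - i\epsilon}\, e^{ip(x-y)}\, |\langle\Omega|\phi(0)|\lambda_0\rangle|^2. \tag{25.73}
\]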
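Analogous expressions also hold when y 0 > x0 , and both cases can be summarized as
\[
\langle\Omega|T\phi(x)\phi(y)|\Omega\rangle_C = \int_0^\infty \frac{dM^2}{2\pi}\, \rho(M^2)\, D_F(x - y; M^2), \tag{25.74}
\]
where
\[
\rho(M^2) \equiv \sum_\lambda (2\pi)\, \delta(M^2 - m_\lambda^2)\, |\langle\Omega|\phi(0)|\lambda_0\rangle|^2. \tag{25.75}
\]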
[Figure: plot with features labeled "1-particle states", "bound states", and "2-particle states".]
The one-particle state contributes an isolated delta function to the spectral density function.
It follows that
Figure 25.6: The structure of the two point function in Fourier space.
25.6 LSZ reduction formula –289/453–
The ∼ means that the two sides of the expression share the same singular structure around
p0i → Epi , ki0 → Eki . The proof can be found in section 7.2 of An introduction to quantum
field theory (M.E.Peskin & D.V.Schroeder).
To express 25.78 in the language of Feynman diagrams, we consider 2 → 2 scattering as an
example. Notice that the disconnected diagrams should be disregarded because they do not
have the singularity structure with a product of four poles indicated by the right hand side of
the LSZ reduction formula. The exact four point function
\[
\prod_{i=1}^{2} \int d^4x_i\, e^{-ip_i x_i} \prod_{j=1}^{2} \int d^4y_j\, e^{ik_j y_j}\, \langle\Omega|T\{\phi(x_1)\phi(x_2)\phi(y_1)\phi(y_2)\}|\Omega\rangle \tag{25.79}
\]
[Figure: diagram with a part labeled "Amputated".]
Let −iM²(p²) denote the sum of all one-particle-irreducible (1PI) insertions into the scalar
propagator. Here 1PI refers to any diagram that is still connected after any one line is cut, as shown
in Figure 25.8.
The exact propagator can be written as a geometric series of 1PI propagators, as shown in
Figure 25.9.
If we expand each re-summed propagator about the physical particle pole, we see that each
external leg of the four-point amplitude contributes
\[
\frac{-i}{p^2 + m_0^2 + M^2} \;\underset{p^0 \to E_p}{\sim}\; \frac{-iZ}{p^2 + m^2} + (\text{regular}). \tag{25.80}
\]
Thus, the sum of diagrams of the four point function contains a product of four poles, which
is exactly the singularity on the second line of 25.78. Comparing the coefficients of this product
of poles, we find the relation shown in Figure 25.10.
After Fourier transforming the n-point function to momentum space and cutting off the ex-
ternal legs, the Feynman diagram can be evaluated as follows:
25.7 Renormalization
25.7.1 Counting of ultraviolet divergence
Renormalization is the procedure in quantum field theory by which divergent parts of a cal-
culation, leading to nonsensical infinite results, are absorbed by redefinition into a few mea-
surable quantities, so yielding finite answers.
Consider a pure scalar theory in d dimensions with a ϕⁿ interaction term. The corresponding
Lagrangian density is
\[
\mathcal{L} = -\frac{1}{2} \partial^\mu\phi\, \partial_\mu\phi - \frac{1}{2} m^2 \phi^2 - \frac{\lambda}{n!} \phi^n. \tag{25.81}
\]
Let N be the number of external lines in one Feynman diagram, P the number of propagators,
and V the number of vertices. The number of loops in the diagram is L = P − V + 1. There
are n lines meeting at each vertex, so nV = 2P + N . Loosely speaking, each loop has an
integral ∫dᵈp, while each propagator contributes a factor p⁻². Thus the superficial degree of
divergence is
\[
D = dL - 2P = d + \left[ n\, \frac{d - 2}{2} - d \right] V - \frac{d - 2}{2} N. \tag{25.82}
\]
Naively, we expect a diagram to have a divergence proportional to ΛD , where Λ is a momentum
cutoff, when D > 0. We expect a divergence of the form log Λ when D = 0, and no divergence
when D < 0.
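The counting in (25.82) can be turned into a small calculator, sketched below under the stated relations L = P − V + 1 and nV = 2P + N . The function names are assumptions for illustration.

```python
from fractions import Fraction


def superficial_degree(d, n, V, N):
    """D = d + [n (d - 2)/2 - d] V - (d - 2)/2 N for a phi^n theory in d dimensions."""
    half = Fraction(d - 2, 2)
    return d + (n * half - d) * V - half * N


def degree_from_loops(d, n, V, N):
    """Same quantity computed directly as D = d L - 2 P."""
    P = Fraction(n * V - N, 2)  # from n V = 2 P + N
    L = P - V + 1
    return d * L - 2 * P


# phi^4 theory in d = 4: D = 4 - N, independent of the number of vertices
for N in (2, 4, 6):
    print(N, superficial_degree(4, 4, V=3, N=N))
```

In this case only the two-point and four-point amplitudes (and the vacuum diagrams) have D ≥ 0.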
According to the superficial degrees of divergence of the diagram, there are three possible types
of ultraviolet behavior of quantum field theories. We will refer to them as follows:
Renormalizable: Only a finite number of amplitudes superficially diverge; however,
divergences occur at all orders in perturbation theory.
Ignoring the vacuum diagram, these amplitudes contain three infinite constants. Our goal
is to absorb these constants into the three unobservable parameters of the theory: the bare
mass m0 , the bare coupling constant λ0 , and the field strength Z. To accomplish this goal, it
is convenient to reformulate the perturbation expansion so that these unobservable quantities
do not appear explicitly in the Feynman rules.
Define
where m is the physical mass and λ the physical coupling constant, defined by on-shell (OS)
renormalization conditions, as shown in Figure 25.12. The Lagrangian density then becomes
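\[
\mathcal{L} = -\frac{1}{2} \partial^\mu\phi_r \partial_\mu\phi_r - \frac{1}{2} m^2 \phi_r^2 - \frac{\lambda}{4!} \phi_r^4 - \frac{1}{2} \delta_Z\, \partial^\mu\phi_r \partial_\mu\phi_r - \frac{1}{2} \delta_m \phi_r^2 - \frac{\delta_\lambda}{4!} \phi_r^4. \tag{25.84}
\]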
The last three terms, known as counterterms, have absorbed the infinite but unobservable
shifts between the bare parameters and the physical parameters.
We can use Feynman rules shown in Figure 25.13 to compute any amplitude in ϕ4 theory. The
procedure is as follows. Compute the desired amplitude as the sum of all possible diagrams
created from the propagator and vertices shown in Figure 25.13. The loop integrals in the di-
agrams will often diverge, so one must introduce a regulator. The result of this computation
will be a function of the three unknown parameters δZ , δm , and δλ . Adjust (or “renormalize”)
these three parameters as necessary to maintain the renormalization conditions shown
in Figure 25.12. After this adjustment, the expression for the amplitude should be finite and in-
dependent of the regulator. This procedure, using Feynman rules with counterterms, is known
as renormalized perturbation theory.
Mandelstam variables
In theoretical physics, the Mandelstam variables are numerical quantities that encode the energy, momentum, and angles of particles in a scattering process in a Lorentz-invariant fashion.
They are used for scattering processes of two particles to two particles. The Mandelstam variables s, t, u are defined as
\[ s = -(p_1+p_2)^2, \qquad t = -(p_1-p_3)^2, \qquad u = -(p_1-p_4)^2, \]
where $p_1$ and $p_2$ are the four-momenta of the incoming particles and $p_3$ and $p_4$ the four-momenta of the outgoing particles. s is known as the square of the center-of-mass energy (invariant mass) and t the square of the four-momentum transfer. We can verify that
\[ s + t + u = m_1^2 + m_2^2 + m_3^2 + m_4^2. \]
Wick rotation
Dimensional regularization
Dimensional regularization is a method for regularizing integrals in the evaluation of Feynman diagrams. For example, suppose one wishes to evaluate a loop integral which is logarithmically divergent in four dimensions, like
\[ \int\frac{d^d\bar q}{(2\pi)^d}\,\frac{1}{(\bar q^2+m^2)^2}. \tag{25.93} \]
One first rewrites the integral in some way so that the number of variables integrated over does not depend on d, and then formally varies the parameter d to include non-integral values like d = 4 − ϵ. So the integral 25.93 would become
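\[ \int_0^\infty \frac{d\bar q}{(2\pi)^{4-\epsilon}}\,\frac{2\pi^{(4-\epsilon)/2}}{\Gamma(2-\epsilon/2)}\,\frac{\bar q^{3-\epsilon}}{(\bar q^2+m^2)^2} = \frac{2^{\epsilon-4}\pi^{\epsilon/2-1}}{\sin(\pi\epsilon/2)\,\Gamma(1-\epsilon/2)}\,m^{-\epsilon} = \frac{1}{8\pi^2\epsilon} - \frac{1}{16\pi^2}\left(\ln\frac{m^2}{4\pi} + \gamma\right) + O(\epsilon). \tag{25.94} \]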
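As a numerical sanity check of equation 25.94, the following minimal Python sketch compares three quantities: the closed form $\Gamma(\epsilon/2)(4\pi)^{\epsilon/2-2}m^{-\epsilon}$ (equal to the middle expression of 25.94 by the reflection formula for the Gamma function), the pole-plus-finite-part expansion, and a direct quadrature of the radial integral. The function names and the substitution $\bar q = m\tan\theta$ are illustrative choices, not part of the notes; the quadrature is only intended for ϵ ≥ 1, where the integrand is smooth at the endpoint.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def closed_form(eps, m):
    """Gamma(eps/2) (4 pi)^(eps/2 - 2) m^(-eps): value of the integral in d = 4 - eps."""
    return math.gamma(eps / 2) * (4 * math.pi) ** (eps / 2 - 2) * m ** (-eps)

def laurent_expansion(eps, m):
    """Pole and finite part: 1/(8 pi^2 eps) - (ln(m^2 / 4 pi) + gamma) / (16 pi^2)."""
    pole = 1 / (8 * math.pi ** 2 * eps)
    finite = (math.log(m ** 2 / (4 * math.pi)) + EULER_GAMMA) / (16 * math.pi ** 2)
    return pole - finite

def radial_quadrature(eps, m, steps=20000):
    """Midpoint rule for the radial integral after substituting q = m tan(theta)."""
    d = 4 - eps
    sphere_area = 2 * math.pi ** (d / 2) / math.gamma(d / 2)
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.sin(theta) ** (d - 1) * math.cos(theta) ** (3 - d)
    return sphere_area / (2 * math.pi) ** d * m ** (d - 4) * total * h
```

For small ϵ the closed form and the expansion should agree up to terms of order ϵ, while at ϵ = 1 the quadrature provides an independent check of the closed form.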
\[ \mathcal{L} = \mathcal{L}_0 + \mathcal{L}_I + \mathcal{L}_{CT}, \tag{25.107} \]
where $\mathcal{L}_0$ is the canonically normalized free Lagrangian for physical fields and masses, $\mathcal{L}_I$ contains the interaction, again in terms of physical parameters, and $\mathcal{L}_{CT}$ contains the
3. At the one-loop level, the self-energy is given by the effective two-point vertices: the
1PI two-point vertex of the interaction and the counter-term two-point vertex. The
counterterms absorb ultraviolet divergences, and the finite parts of the counterterms
are determined by renormalization conditions, which ensure the quantities in L0 + LI
are physical. The conditions constrain the self-energy and the effective vertices, and give
a finite, uniquely-determined value for the counterterms.
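Now, for a general theory in d-dimensional spacetime, the field content is given by $\phi_f$, f = 1, 2, · · · , where f labels the field type. The (mass) dimension of the field is $[\phi_f] = \Delta_f$, and we have $\Delta_f > 0$ in all physical theories. We have interaction vertices of type i, i = 1, 2, · · · , contributing a term of the form
\[ \lambda_i\,\partial^{n_i}\prod_f \phi_f^{n_{if}}, \tag{25.108} \]
whose coupling has mass dimension $\kappa_i \equiv [\lambda_i] = d - n_i - \sum_f n_{if}\Delta_f$, since the Lagrangian density has mass dimension d.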
Now consider a 1PI diagram in such a theory. On the one hand, the value of the diagram is $M \sim \Lambda^D \prod_i \lambda_i^{V_i}$, where $V_i$ is the number of vertices of type i, D the superficial degree of divergence and Λ a high-momentum cutoff, leading to $[M] = D + \sum_i V_i\kappa_i$. On the other hand, the diagram could arise from an interaction term $\lambda' \prod_f \phi_f^{E_f}$, where $E_f$ is the number of external lines of $\phi_f$, resulting in $[M] = [\lambda'] = d - \sum_f E_f\Delta_f$. It follows that the superficial degree of divergence of the diagram is
\[ D = d - \sum_f E_f\Delta_f - \sum_i V_i\kappa_i. \tag{25.110} \]
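The power counting in equation 25.110 can be sketched in a few lines of Python. This is a minimal illustration under the assumption that $\kappa_i$ denotes the mass dimension of the coupling $\lambda_i$ (so that $\kappa_i = d - n_i - \sum_f n_{if}\Delta_f$) and that a scalar field has $\Delta = (d-2)/2$; the function names and the dictionary-based bookkeeping are hypothetical conveniences.

```python
from fractions import Fraction

def scalar_dimension(d):
    """Mass dimension (d - 2)/2 of a scalar field in d spacetime dimensions."""
    return Fraction(d - 2, 2)

def coupling_dimension(d, n_derivatives, field_powers, field_dims):
    """kappa_i = d - n_i - sum_f n_if * Delta_f (assumed definition of kappa_i)."""
    return d - n_derivatives - sum(
        power * field_dims[f] for f, power in field_powers.items()
    )

def superficial_degree(d, external_lines, field_dims, vertices):
    """D = d - sum_f E_f Delta_f - sum_i V_i kappa_i, as in equation 25.110.

    external_lines maps a field name to E_f; vertices is a list of
    (V_i, kappa_i) pairs.
    """
    from_lines = sum(count * field_dims[f] for f, count in external_lines.items())
    from_vertices = sum(count * kappa for count, kappa in vertices)
    return d - from_lines - from_vertices
```

For $\phi^4$ theory in d = 4 the coupling is dimensionless, so the result reduces to D = 4 − E for any number of vertices; a $\phi^6$ coupling in d = 4 has κ = −2, so D grows with the number of vertices.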
where all loop momenta are taken proportional to s. Generally, internal propagators have the form
\[ \frac{1}{(as+p)^{\alpha}} \sim \frac{1}{s^{\alpha}} \tag{25.113} \]
for large s, where a is a numerical constant and p is a combination of the external momenta. Differentiating M n times with respect to p gives a term proportional to
\[ \frac{1}{(as+p)^{\alpha+n}} \sim \frac{1}{s^{\alpha+n}}. \tag{25.114} \]
Thus taking D + 1 derivatives with respect to the external momenta makes M finite. It means that we have the expansion
\[ M(p) = M_0 + M_1 p + \cdots + M_D p^D + (\text{finite terms}), \tag{25.115} \]
where the argument p of the function represents the collection of external momenta. We have suppressed the index structure, and $M_0, M_1, \cdots, M_D$ are potentially divergent constants.
Suppose that M has $E_f$ external lines of the field $\phi_f$. Then the divergence of M(p) can be canceled by counterterms of the form
\[ \sum_{j=0}^{D} A_j\,(\partial)^j \prod_f \phi_f^{E_f}, \tag{25.116} \]
where the $A_j$ are divergent coefficients chosen to cancel the divergences in $M_j$. The index structure in $A_j\partial^j$ should match the suppressed index structure of M(p).
However, a more difficult situation occurs when we have nested or overlapping divergences, that is, when two divergent loops share a propagator. Terms like $\log p^2 \log\Lambda^2$ would appear, contradicting our naive argument, based on the criterion of the superficial degree of divergence, that the divergent terms of a Feynman integral are always simple polynomials in p. We will refer to divergences multiplying only polynomials in p as local divergences, since their Fourier transforms back to position space are delta functions or derivatives of delta functions. We will call the new, nonpolynomial, term a nonlocal divergence. It is a local divergence surrounded by an ordinary, nondivergent, quantum field theory process.
Fortunately, the BPHZ theorem states that, for a general renormalizable quantum field theory, to any order in perturbation theory, all divergences are removed by the counterterm vertices corresponding to superficially divergent amplitudes. In other words, any superficially renormalizable quantum field theory is in fact rendered finite when one performs renormalized perturbation theory with the complete set of counterterms.
A more detailed discussion of the appearance and cancellation of nonlocal divergences can be found in sections 10.4 and 10.5 of An introduction to quantum field theory (M.E.Peskin & D.V.Schroeder).
the actual physical mass $m_{ph}$ of the particle is determined by the location of this pole: $p^2 = -m_{ph}^2$. The relation between m and $m_{ph}$ is given by
\[ m_{ph}^2 = M^2(-m_{ph}^2) + m^2 = \left[1 + \frac{\lambda}{32\pi^2}\left(\ln\frac{m^2}{\mu^2} - 1\right) + O(\lambda^2)\right] m^2. \tag{25.118} \]
Because $m_{ph}$ is independent of µ, i.e., $dm_{ph}/d\mu = 0$, it can be derived that
\[ \frac{dm}{d\ln\mu} = \left[\frac{\lambda}{32\pi^2} + O(\lambda^2)\right] m. \tag{25.119} \]
The residue R of the propagator’s pole is no longer one as well. In $\phi^4$ theory, we have
\[ R = 1 + O(\lambda^2). \tag{25.120} \]
\[ iT = R^2\, iM. \tag{25.122} \]
For a scattering process with $p^2 \gg m^2$, we have $D \sim x(1-x)p^2$. In the OS scheme, the one-loop correction to the propagator or vertex generally includes a factor $\ln(D/D_0) \sim \ln(p^2/m^2)$, making
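is given by
\[ \beta(\lambda) = -\epsilon\lambda + \lambda^2 G_1'(\lambda) \xrightarrow{\;\epsilon\to 0\;} \frac{3\lambda^2}{16\pi^2} + O(\lambda^3). \tag{25.131} \]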
The first term, $-\epsilon\lambda$, is fixed by matching the $O(\epsilon)$ terms in 25.129. The second term, $\lambda^2 G_1'$, is similarly determined by matching the $O(\epsilon^0)$ terms. Terms that are higher-order in 1/ϵ must also cancel, and this determines all the other $G_n'(\lambda)$ in terms of $G_1'(\lambda)$. These relations among the $G_n'(\lambda)$ can be checked order by order in perturbation theory.
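In the ϵ → 0 limit, the leading term of equation 25.131 gives $d\lambda/d\ln\mu = 3\lambda^2/16\pi^2$, which can be integrated in closed form as $\lambda(\mu) = \lambda_0/[1 - (3\lambda_0/16\pi^2)\ln(\mu/\mu_0)]$. The following Python sketch, whose function names and Runge-Kutta integrator are illustrative assumptions, compares this closed form with a direct numerical integration and locates the scale at which the one-loop expression diverges (the Landau pole). It is a one-loop illustration only; near the pole the truncation is not trustworthy.

```python
import math

B0 = 3 / (16 * math.pi ** 2)  # one-loop coefficient of the beta function

def beta_one_loop(lam):
    """Leading term of the beta function in the eps -> 0 limit."""
    return B0 * lam ** 2

def running_coupling(mu, mu0, lam0):
    """Closed-form solution of d lam / d ln(mu) = B0 lam^2 with lam(mu0) = lam0."""
    return lam0 / (1 - B0 * lam0 * math.log(mu / mu0))

def integrate_flow(lam0, t_end, steps=1000):
    """Fourth-order Runge-Kutta integration in t = ln(mu / mu0) from 0 to t_end."""
    h = t_end / steps
    lam = lam0
    for _ in range(steps):
        k1 = beta_one_loop(lam)
        k2 = beta_one_loop(lam + 0.5 * h * k1)
        k3 = beta_one_loop(lam + 0.5 * h * k2)
        k4 = beta_one_loop(lam + h * k3)
        lam += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return lam

def landau_pole(mu0, lam0):
    """Scale at which the one-loop running coupling formally diverges."""
    return mu0 * math.exp(1 / (B0 * lam0))
```

Because the coefficient is positive, the coupling decreases toward the infrared and grows toward the ultraviolet.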
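Define
\[ M(\lambda,\epsilon) \equiv \ln\left(Z_m^{1/2} Z_\phi^{-1/2}\right) = \sum_{n=1}^{\infty}\frac{M_n(\lambda)}{\epsilon^n}. \tag{25.132} \]
Then
\[ \frac{d\ln m}{d\ln\mu} = -\frac{\partial M(\lambda,\epsilon)}{\partial\lambda}\frac{d\lambda}{d\ln\mu} = \left(\epsilon\lambda - \lambda^2 G_1'\right)\sum_{n=1}^{\infty}\frac{M_n'(\lambda)}{\epsilon^n} = \lambda M_1'(\lambda) + \cdots, \tag{25.133} \]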
where the ellipses stand for terms with powers of 1/ϵ. In a renormalizable theory, d ln m/d ln µ
should be finite in the ϵ → 0 limit, and so these terms must actually all be zero. Therefore, the
anomalous dimension of the mass, defined via
\[ \gamma_m(\lambda) \equiv \frac{d\ln m}{d\ln\mu}, \tag{25.134} \]
is given by
\[ \gamma_m(\lambda) = \lambda M_1'(\lambda) = \frac{\lambda}{32\pi^2} + O(\lambda^2). \tag{25.135} \]
Let us now consider the n-point Green function in the MS renormalization scheme. The bare Green function should be independent of µ. The bare and renormalized Green functions are related by $G_0^{(n)} = Z_\phi^{n/2} G^{(n)}$. Taking the logarithm and differentiating with respect to ln µ, we get
\[ \left(\frac{\partial}{\partial\ln\mu} + \frac{d\lambda}{d\ln\mu}\frac{\partial}{\partial\lambda} + \frac{dm}{d\ln\mu}\frac{\partial}{\partial m} + \frac{n}{2}\frac{d\ln Z_\phi}{d\ln\mu}\right) G^{(n)}(\lambda,m,\mu) = 0. \tag{25.136} \]
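We can write
\[ \ln Z_\phi = \frac{a_1}{\epsilon} + \frac{a_2 - a_1^2/2}{\epsilon^2} + \cdots \tag{25.137} \]
Then we have
\[ \frac{d\ln Z_\phi}{d\ln\mu} = \frac{\partial\ln Z_\phi}{\partial\lambda}\frac{d\lambda}{d\ln\mu} = \left(\frac{a_1'}{\epsilon} + \cdots\right)\left[-\epsilon\lambda + \lambda^2 G_1'(\lambda)\right] = -\lambda a_1' + \cdots \tag{25.138} \]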
where the ellipses in the last line stand for terms with powers of 1/ϵ. Since G(n) should vary
smoothly with µ in the ϵ → 0 limit, these must all be zero. Therefore, the anomalous dimen-
sion of the field, defined via
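\[ \gamma_\phi(\lambda) \equiv \frac{1}{2}\frac{d\ln Z_\phi}{d\ln\mu}, \tag{25.139} \]
is given by
\[ \gamma_\phi(\lambda) = -\frac{1}{2}\lambda a_1' = O(\lambda^2). \tag{25.140} \]
Equation 25.136 can now be written as
\[ \left(\frac{\partial}{\partial\ln\mu} + \beta(\lambda)\frac{\partial}{\partial\lambda} + \gamma_m(\lambda)\,m\frac{\partial}{\partial m} + n\gamma_\phi(\lambda)\right) G^{(n)}(\lambda,m,\mu) = 0 \tag{25.141} \]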
in the ϵ → 0 limit. This is the Callan–Symanzik equation for the Green function.
1. β(λ) > 0;
2. β(λ) = 0;
3. β(λ) < 0.
In theories of the first class, the running coupling constant goes to zero in the infra-red, lead-
ing to definite predictions about the small-momentum behavior of the theory. However, the
running coupling constant becomes large in the region of high momenta. Thus the short-
distance behavior of the theory cannot be computed using Feynman diagram perturbation
theory. A Feynman diagram analysis is useful in such theories if one is mainly interested in
large-distance or macroscopic behavior.
In theories of the second class, the coupling constant does not flow. In these theories, the
running coupling constant is independent of the momentum scale, and thus equal to the bare
coupling. This means that there can be no ultraviolet divergences in the relation of coupling
constants. The only possible ultraviolet divergences in such theories are those associated with
field rescaling, which automatically cancel in the computation of S-matrix elements.
In theories of the third class, the running coupling constant becomes large in the large-distance
regime and becomes small at large momenta or short distances. Such theories are called
asymptotically free. In theories of this class, the short-distance behavior is completely solv-
able by Feynman diagram methods. Though ultraviolet divergences appear in every order of
perturbation theory, the renormalization group tells us that the sum of these divergences is
completely harmless.
In the region of strong coupling, the approximation we have made, ignoring the higher-order
terms in the β function, is no longer valid. It is a logical possibility that the leading order term
is positive while the higher terms of the β function are negative, so that the β function has the
form shown in Figure 25.16(a). In this case the β function has a zero at a non-zero value λ∗ .
When λ approaches this value, the renormalization group flow slows to a halt; thus λ = λ∗
would be a non-trivial fixed point of the renormalization group.
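For a β function of the form of Figure 25.16(a), the β function behaves in the vicinity of the fixed point as
\[ \frac{d\lambda}{d\ln\mu} \approx -B(\lambda - \lambda_*), \tag{25.142} \]
where B is a positive constant. The solution of this equation is
\[ \lambda(\mu) = \lambda_* + C\left(\frac{\mu_0}{\mu}\right)^{B}. \tag{25.143} \]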
Thus, λ indeed tends to λ∗ as µ → ∞, and the rate of approach is governed by the slope of the
β function at the fixed point.
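The approach to the fixed point described by equations 25.142 and 25.143 can be illustrated with a short Python sketch. It integrates the linearized flow with a simple Euler scheme and compares the result with the power-law solution; the function names, the Euler integrator, and the choice $C = \lambda_0 - \lambda_*$ (fixed by the initial condition $\lambda(\mu_0) = \lambda_0$) are assumptions made for illustration.

```python
import math

def linearized_beta(lam, lam_star, slope_b):
    """Right-hand side of equation 25.142."""
    return -slope_b * (lam - lam_star)

def flow_near_fixed_point(mu, mu0, lam0, lam_star, slope_b):
    """Equation 25.143 with C = lam0 - lam_star."""
    return lam_star + (lam0 - lam_star) * (mu0 / mu) ** slope_b

def euler_flow(lam0, lam_star, slope_b, t_end, steps=20000):
    """Euler integration of the linearized flow in t = ln(mu / mu0)."""
    h = t_end / steps
    lam = lam0
    for _ in range(steps):
        lam += h * linearized_beta(lam, lam_star, slope_b)
    return lam
```

A larger slope B drives the coupling to $\lambda_*$ more quickly as µ increases, which is the sense in which the rate of approach is governed by the slope of the β function.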
For a massless scalar field with a fixed point, the solution of the Callan–Symanzik (C-S) equation for the propagator at the fixed point is
\[ G^{(2)}(p) = \frac{C(\lambda_*)}{p^2}\left(\frac{\mu^2}{p^2}\right)^{-\gamma_\phi(\lambda_*)}, \tag{25.144} \]
where C(λ∗ ) is an integration constant. Thus the two-point correlation function returns to
the form of a simple scaling law, but with a power law different from that expected by dimen-
sional analysis. At the fixed point we have a scale-invariant quantum field theory in which the
interactions of the theory affect the law of rescaling.
A similar behavior is possible in an asymptotically free theory. If the β function has the form
shown in Figure 25.16(b), the running coupling constant will tend to a fixed point λ∗ as µ → 0.
The two-point correlation function of fields will tend to a power law for asymptotically small
momenta. The two cases shown in Figure 25.16(a) and (b) are called, respectively, ultraviolet-
stable and infrared-stable fixed points.
In higher orders of perturbation theory, β and γ depend on the specific renormalization con-
ventions. However, the existence of a zero of the β function, the slope B at the zero, and the
value of the anomalous dimension at the fixed point should all be independent of the conven-
tions used to compute β and γ.
It follows that
\[ \frac{\delta}{\delta J(x)} E[J] = -\phi_{cl}(x). \tag{25.147} \]
The effective action is defined as the Legendre transform of E[J]:
\[ \Gamma[\phi_{cl}] \equiv -E[J] - \int d^4y\, J(y)\phi_{cl}(y). \tag{25.148} \]
If L is invariant under the transformation U , i.e., L(U ϕ) = L(ϕ), it can be shown that the
effective action Γ is also invariant under transformation U , i.e., Γ(U ϕcl ) = Γ(ϕcl ).
By the properties of the Legendre transformation, we get
\[ \frac{\delta}{\delta\phi_{cl}(x)}\Gamma[\phi_{cl}] = -J(x). \tag{25.149} \]
If the external source is set to zero, we will have
\[ \frac{\delta}{\delta\phi_{cl}(x)}\Gamma[\phi_{cl}] = 0. \tag{25.150} \]
The solutions of this equation are the values of ⟨ϕ(x)⟩ in the vacuum states of the theory.
From here on we will assume, for the field theories we consider, that the possible vacuum states
are invariant under translations and Lorentz transformations. Then, for each possible vacuum
state, the corresponding solution ϕcl (x) will be a constant. Furthermore, we know that Γ is an
extensive quantity. If T is the time extent of the region and V is its three dimensional volume,
we can write
Γ[ϕcl ] = −(V T ) · Veff (ϕcl ). (25.151)
The coefficient $V_{eff}$ is called the effective potential. The condition that $\Gamma[\phi_{cl}]$ has an extremum then reduces to the simple equation
\[ \frac{\partial}{\partial\phi_{cl}} V_{eff}(\phi_{cl}) = 0. \tag{25.152} \]
A system with spontaneously broken symmetry will have several minima of $V_{eff}$, all with the same energy by virtue of the symmetry. The choice of one among these vacua is the spontaneous symmetry breaking.
It follows that
\[ e^{-iE[J]} = \int \mathcal{D}\phi\; e^{i\int d^4x\,(\mathcal{L}_r + J_r\phi)}\, e^{i\int d^4x\,(\mathcal{L}_{ct} + J_{ct}\phi)}. \tag{25.155} \]
The term linear in η vanishes by definition of Jr . Put back the effects of the counterterm
Lagrangian, writing it as
(Lct [ϕcl ] + Jct ϕcl ) + (Lct [ϕcl + η] − Lct [ϕcl ] + Jct η). (25.157)
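Define
\[ \mathcal{L}_\eta \equiv \frac{1}{3!}\int d^4x\, d^4y\, d^4z\, \frac{\delta^3 S_r}{\delta\phi(x)\delta\phi(y)\delta\phi(z)}\,\eta(x)\eta(y)\eta(z) + \cdots + \left(\mathcal{L}_{ct}[\phi_{cl}+\eta] - \mathcal{L}_{ct}[\phi_{cl}] + J_{ct}\eta\right). \tag{25.158} \]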
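We have
\[ e^{-iE[J]} = Z_1\, e^{i\int \mathcal{L}_\eta\left(\frac{1}{i}\frac{\delta}{\delta I}\right)} \int\mathcal{D}\eta\; e^{i\int\left(\frac{1}{2}\eta\frac{\delta^2 S_r}{\delta\phi\delta\phi}\eta + I\eta\right)}\Bigg|_{I=0}, \tag{25.159} \]
where
\[ Z_1 \equiv \exp\left[i\int d^4x\,\left(\mathcal{L}_r[\phi_{cl}] + J_r\phi_{cl} + \mathcal{L}_{ct}[\phi_{cl}] + J_{ct}\phi_{cl}\right)\right]. \tag{25.160} \]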
A perturbative expansion for −iE[J] can be obtained using connected Feynman diagrams as
\[ -iE[J] = i\int d^4x\,\left(\mathcal{L}_r[\phi_{cl}] + J_r\phi_{cl} + \mathcal{L}_{ct}[\phi_{cl}] + J_{ct}\phi_{cl}\right) + \log(Z_2) + \text{connected diagrams}. \tag{25.164} \]
Notice that there are no terms remaining that depend explicitly on J; thus, Γ is expressed as a
function of ϕcl , as it should be. The Feynman diagrams contributing to Γ[ϕcl ] have no external
lines, and the simplest ones turn out to have two loops. The lowest-order quantum correction
to Γ is given by the functional determinant Z2 . The last term provides a set of counterterms
that can be used to satisfy the renormalization conditions on Γ and, in the process, to cancel
divergences that appear in the evaluation of the functional determinant and the diagrams. The
renormalization conditions will determine all of the counterterms in Lct .
The formalism we have constructed contains a new counterterm Jct , whose value is deter-
mined by ⟨η⟩ = 0. Our adjustment of Jct to keep ⟨η⟩ = 0 means that the sum of all connected
diagrams with an external line is zero. Consider now that same infinite set of diagrams, but
replace the external line in each of them with some other subdiagram. Here is the point: no
matter what this replacement subdiagram is, the sum of all these diagrams is still zero. There-
fore, we need not bother to compute any of them. The rule is this: ignore any diagram that
falls into two parts when a single line is cut. All of these diagrams (known as tadpoles) are
canceled by the Jct counterterm, no matter what subdiagram they are attached to.
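\[ \frac{\delta^2\Gamma[\phi_{cl}]}{\delta\phi_{cl}(x)\,\delta\phi_{cl}(y)} = iD^{-1}(x,y) \quad\text{where } D(x,y) = \langle\phi(x)\phi(y)\rangle_{conn}, \tag{25.167} \]
\[ \frac{\delta^n\Gamma[\phi_{cl}]}{\delta\phi_{cl}(x_1)\cdots\delta\phi_{cl}(x_n)} = -i\langle\phi(x_1)\cdots\phi(x_n)\rangle_{1PI} \quad\text{where } n \ge 3. \tag{25.168} \]
A detailed proof can be found in section 10.2 of An introduction to quantum field theory (M.E.Peskin & D.V.Schroeder).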
As a result, the effective action can also be defined constructively as
\[ \Gamma[\phi] \equiv \Gamma[\phi_{cl,0}] + \frac{1}{2}\int\frac{d^dk}{(2\pi)^d}\,\tilde\eta(-k)\left(-k^2 - m^2 - M^2(k^2)\right)\tilde\eta(k) + \frac{1}{n!}\int\frac{d^dk_1}{(2\pi)^d}\cdots\frac{d^dk_n}{(2\pi)^d}\,(2\pi)^d\delta(k_1+\cdots+k_n)\,V_n(k_1,\cdots,k_n)\,\tilde\eta(k_1)\cdots\tilde\eta(k_n), \]
where $\tilde\eta(k) = \int d^dx\, e^{-ikx}\eta(x)$, $\eta = \phi - \phi_{cl,0}$, and $iV_n(k_1,\cdots,k_n)$ equals the value of the 1PI Feynman diagrams in momentum space. The effective action has the property that the tree-level Feynman diagrams it generates give the complete scattering amplitude of the original theory. A detailed discussion is provided in section 21 of Quantum field theory (M. Srednicki).
The effective action contains the complete set of physical predictions of the quantum field theory.
The vacuum state of the field theory is identified as the minimum of the effective potential. The
location of the minimum determines whether the symmetries of the Lagrangian are preserved
or spontaneously broken. The second derivative of Γ is the inverse propagator. The poles of the
propagator, or the zeros of the inverse propagator, give the values of the particle masses. The
higher derivatives of Γ are the one-particle-irreducible amplitudes. These can be connected by
full propagators and joined together to construct four- and higher-point connected amplitudes,
which give the S-matrix elements. Thus, from the knowledge of Γ, we can reconstruct the
qualitative behavior of the quantum field theory, its pattern of symmetry-breaking, and then
the quantitative details of its particles and their interactions.
In theories without a ϕ → −ϕ symmetry, there might also be terms linear and cubic in $\phi^i$; we omit these for simplicity. The coefficients $A_0$, $A_2$, $A_4$ have mass dimension, respectively, 4, 2, and 0; thus we expect them to contain $\Lambda^4$, $\Lambda^2$, and log Λ divergences, respectively. The power-counting analysis predicts that all higher terms in the Taylor series expansion should be finite.
The constant term A0 is independent of ϕcl ; it has no physical significance. However, the di-
vergences in A2 and A4 appear in physical quantities, since these coefficients enter the inverse
propagator and the irreducible four-point function and therefore appear in the computation of
S-matrix elements. There is one further coefficient in the effective action that has non-negative
mass dimension by power counting; this is the coefficient of the term quadratic in ∂µ ϕcl , which
appears when the effective action is evaluated for a non-constant background field:
\[ \Delta\Gamma[\phi_{cl}] = \int d^4x\, B_2^{ij}\,\partial_\mu\phi_{cl}^i\,\partial^\mu\phi_{cl}^j. \tag{25.170} \]
All other coefficients in the Taylor expansion of the effective action in powers of ϕcl are finite
by power counting.
We can now argue that the counterterms of the original Lagrangian suffice to remove the di-
vergences that might appear in the computation of Γ[ϕcl ]. The argument proceeds in two steps.
We first use the BPHZ theorem to argue that the divergences of Green’s functions can be re-
moved by adjusting a set of counterterms corresponding to the possible operators that can be
added to the Lagrangian with coefficients of mass dimension greater than or equal to zero.
The coefficients of these counterterms are in 1-to-1 correspondence with the coefficients A2 ,
A4, and B2 of the effective action. Next, we use the fact that the effective action is manifestly invariant under the original symmetry group of the model. This is true even if the vacuum state of the model has spontaneous symmetry breaking, since the method we presented for computing the effective action is manifestly invariant under the original symmetry of the Lagrangian. Combining these two results, we conclude that the effective action can always be made finite by adjusting the set of counterterms that are invariant under the original symmetry of the theory, even if this symmetry is spontaneously broken. By using the results of the previous subsection, which explain how to construct the Green’s functions of the theory from the functional derivatives of the effective action, this conclusion of renormalizability extends to all the Green’s functions of the theory.
where α is an infinitesimal parameter and ∆a is some function of all the ϕ’s. Specialize to constant
fields; then the derivative terms in L vanish and the potential alone must be invariant. This condition
can be written as
\[ V(\phi^a) = V(\phi^a + \alpha\Delta^a(\phi)) \quad\text{or}\quad \Delta^a(\phi)\frac{\partial}{\partial\phi^a}V(\phi) = 0. \tag{25.172} \]
The effective potential Veff encapsulates the full solution to the theory, including all orders of quantum
corrections. At the same time, it satisfies the general properties of the classical potential: It is invariant
to the symmetries of the theory, and its minimum gives the vacuum expectation value of ϕcl . Thus
\[ \Delta^a(\phi)\frac{\partial}{\partial\phi^a}V_{eff}(\phi) = 0. \tag{25.173} \]
The first term vanishes since ϕcl is a minimum of Veff , so the second term must also vanish. If the
transformation leaves ϕcl unchanged (i.e., if the symmetry is respected by the ground state), then
$\Delta^a(\phi_{cl}) = 0$ and this relation is trivial. A spontaneously broken symmetry is precisely one for which $\Delta^a(\phi_{cl}) \neq 0$; in this case $\Delta^a(\phi_{cl})$ is the vector with eigenvalue zero.
A particle of mass 0 corresponds to a zero eigenvalue of this matrix equation at $p^2 = 0$. Now set p = 0. This implies that $(\delta^2\Gamma/\delta\phi^i\delta\phi^j)(x, y)$ has a zero eigenvalue, which is equivalent to $\partial^2 V_{eff}/\partial\phi_{cl}^i\partial\phi_{cl}^j$ having a zero eigenvalue. This completes the proof of Goldstone’s theorem. □
25.10.2 Renormalization
From this expression of the Lagrangian written in terms of shifted fields, we can read off the
Feynman rules for the linear sigma model, as shown in Figure 25.17. Then we can compute
tree-level amplitudes without difficulty.
Diagrams with loops, however, will often diverge. For the amplitude with Ne external legs, the
superficial degree of divergence is
D = 4 − Ne . (25.183)
The linear sigma model has eight different superficially divergent amplitudes and several of
these have D > 0 and therefore can contain more than one infinite constant, as shown in
Figure 25.18.
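\[ \mathcal{L}_{ct} = -\frac{1}{2}\delta_Z\,\partial_\mu\phi^i\partial^\mu\phi^i - \frac{1}{2}\delta_\mu(\phi^i)^2 - \frac{\delta_\lambda}{4}\left[(\phi^i)^2\right]^2. \tag{25.184} \]
Written in terms of σ and π fields, it takes the form
\[ \begin{aligned} \mathcal{L}_{ct} = {} & -\frac{\delta_Z}{2}(\partial_\mu\pi^k)^2 - \frac{1}{2}(\delta_\mu + \delta_\lambda v^2)(\pi^k)^2 - \frac{\delta_Z}{2}(\partial_\mu\sigma)^2 - \frac{1}{2}(\delta_\mu + 3\delta_\lambda v^2)\sigma^2 \\ & - (\delta_\mu v + \delta_\lambda v^3)\sigma - \delta_\lambda v\,\sigma(\pi^k)^2 - \delta_\lambda v\,\sigma^3 - \frac{\delta_\lambda}{4}\left[(\pi^k)^2\right]^2 - \frac{\delta_\lambda}{2}\sigma^2(\pi^k)^2 - \frac{\delta_\lambda}{4}\sigma^4. \end{aligned} \tag{25.185} \]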
Figure 25.19: Feynman rules for counterterm vertices in the linear sigma model.
The Feynman rules associated with these counterterms are shown in Figure 25.19.
Three renormalization parameters, δZ , δµ and δλ , can be adjusted to satisfy the renormalization
conditions shown in Figure 25.20.
The conclusions of subsection 25.9.4 ensure that these three parameters are able to absorb all the infinities arising in the divergent amplitudes shown in Figure 25.18. No new symmetry-breaking terms are needed to make this theory renormalizable. The statement is also verified at the one-loop level in section 11.2 of An introduction to quantum field theory (M.E.Peskin & D.V.Schroeder). As an aside, the calculation also shows that the π particles remain massless after one-loop corrections.
Then the operator 25.187 is just equal to the Klein-Gordon operator $(\partial^2 - m_i^2)$, where
\[ m_i^2 = \begin{cases} \lambda\phi_{cl}^2 - \mu^2, & \text{acting on } \eta^1, \cdots, \eta^{N-1} \\ 3\lambda\phi_{cl}^2 - \mu^2, & \text{acting on } \eta^N \end{cases}. \tag{25.189} \]
As a result, $Z_2$ is given by
\[ Z_2 = \prod_{i=1}^{N} Z_i = \prod_{i=1}^{N}\int\mathcal{D}\eta\; e^{\frac{i}{2}\int\eta(\partial^2 - m_i^2)\eta}. \tag{25.190} \]
And we have
\[ \log Z_i = \sum_I C_I, \tag{25.192} \]
where $C_I$ represents a connected diagram without external sources, as shown in Figure 25.21.
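leading to
\[ \log Z_i = -\frac{1}{2}VT\int\frac{d^4p}{(2\pi)^4}\sum_n\left[-\frac{1}{n}\left(-\frac{m_i^2}{p^2}\right)^n\right] = -\frac{1}{2}VT\int\frac{d^4p}{(2\pi)^4}\log\left(1 + \frac{m_i^2}{p^2}\right). \tag{25.194} \]
\[ \log Z_i = \frac{i}{2}\frac{\Gamma(-\frac{d}{2})}{(4\pi)^{d/2}}\,(m_i^2)^{d/2}\,VT. \tag{25.195} \]
\[ V_{eff} = -\frac{1}{2}\mu^2\phi_{cl}^2 + \frac{\lambda}{4}\phi_{cl}^4 - \frac{1}{2}\frac{\Gamma(-\frac{d}{2})}{(4\pi)^{d/2}}\left[(N-1)(\lambda\phi_{cl}^2 - \mu^2)^{d/2} + (3\lambda\phi_{cl}^2 - \mu^2)^{d/2}\right] + \frac{1}{2}\delta_\mu\phi_{cl}^2 + \frac{1}{4}\delta_\lambda\phi_{cl}^4. \tag{25.196} \]
To make terms involving $\phi_{cl}$ finite, we must have
\[ \delta_\lambda = \frac{2\lambda^2(N+8)}{(4\pi)^2(4-d)} + \text{finite terms}, \qquad \delta_\mu = -\frac{2\lambda\mu^2(N+2)}{(4\pi)^2(4-d)} + \text{finite terms}. \tag{25.197} \]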
Functional determinants
Equation 25.190 can also be evaluated formally using functional determinants. Recall the Gaussian integral
\[ \int_{-\infty}^{\infty} d^nx\, \exp\left(-\frac{i}{2}\sum_{i,j=1}^{n} A_{ij}x_ix_j\right) = \sqrt{\frac{(-2\pi i)^n}{\det A}}. \tag{25.198} \]
Formally we have
\[ \log Z_i = -\frac{1}{2}\log\det\left[(-\partial_x^2 + m_i^2)\,\delta(x-y)\right]. \tag{25.199} \]
Define
It follows that
\[ M(x,z) = \int d^4y\, M_0(x-y)\,M_1(y-z). \tag{25.201} \]
Thus, we have
\[ \log\det M = \log\det M_0 + \log\det M_1 \to \log\det M_1. \tag{25.202} \]
The term $\log\det M_0$ is dropped because it is independent of $\phi_{cl}$.
Since $M_1 = I - G$, where I = δ(x − y) is the identity matrix and $G = -im_i^2 D_F$, we get
\[ \log\det M_1 = \mathrm{Tr}\log M_1 = \mathrm{Tr}\log(I - G) = -\sum_{n=1}^{\infty}\frac{1}{n}\mathrm{Tr}\,G^n, \tag{25.203} \]
where
\[ \mathrm{Tr}\,G^n = (-im_i^2)^n\int dx_1\cdots dx_n\, D_F(x_1 - x_2)\cdots D_F(x_n - x_1). \tag{25.204} \]
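The identity used in equation 25.203, $\log\det(I - G) = -\sum_{n\ge 1}\mathrm{Tr}\,G^n/n$, can be checked in a finite-dimensional toy setting. The Python sketch below uses a real 2×2 matrix in place of the operator $G = -im_i^2 D_F$; this substitution, the helper names, and the truncation of the series at a fixed number of terms are all illustrative assumptions, and the series converges only when the eigenvalues of G have magnitude below one.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def log_det_via_traces(g, terms=200):
    """Truncated series -sum_{n>=1} Tr(G^n) / n."""
    power = [[1.0, 0.0], [0.0, 1.0]]
    total = 0.0
    for n in range(1, terms + 1):
        power = matmul(power, g)
        total -= trace(power) / n
    return total

def log_det_direct(g):
    """log det(I - G) computed directly from the 2x2 determinant."""
    one_minus_g = [[(1.0 if i == j else 0.0) - g[i][j] for j in range(2)]
                   for i in range(2)]
    return math.log(det(one_minus_g))
```

In the field-theory setting each $\mathrm{Tr}\,G^n$ corresponds to a closed loop with n mass insertions, which is why the series reproduces the sum of connected vacuum diagrams.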
\[ -i(T - T^\dagger) = T^\dagger T. \tag{25.206} \]
we can obtain
\[ -i[T(i\to f) - T^*(f\to i)] = \sum_n\int\prod_{k=1}^{n}\widetilde{dq}_k\; T(i\to\{q\})\,T^*(f\to\{q\})\,(2\pi)^4\delta\Big(\sum p_f - \sum_k q_k\Big). \tag{25.209} \]
Let us abbreviate this identity as
\[ -i[T(i\to f) - T^*(f\to i)] = \sum_m\int d\Pi_m\, T(i\to m)\,T^*(f\to m), \tag{25.210} \]
where the sum runs over all possible sets of particles and i and f could be one-particle or multi-particle asymptotic states. For the important special case of forward scattering, we can set i = f to obtain a simpler identity,
\[ \operatorname{Im} T(i\to i) = \frac{1}{2}\sum_m\int d\Pi_m\, |T(i\to m)|^2. \tag{25.211} \]
Supplying the kinematic factors required to build a cross section, we obtain the standard form of the optical theorem,
\[ \operatorname{Im} T(i\to i) = 2E_{cm}\,p_{cm}\,\sigma_{tot}(i\to\text{anything}), \]
where $E_{cm}$ is the total center-of-mass energy and $p_{cm}$ is the momentum of either particle in the center-of-mass frame. This equation relates the forward scattering amplitude to the total cross section for production of all final states. Since the imaginary part of the forward scattering amplitude gives the attenuation of the forward-going wave as the beam passes through the target, it is natural that this quantity should be proportional to the probability of scattering.
in section 7.3 of An introduction to quantum field theory (M.E.Peskin & D.V.Schroeder). This
fact is extremely useful for dealing with unstable particles, which never appear in asymptotic
states.
The exact two-point function for a scalar particle has the form
\[ \frac{-i}{p^2 + m^2 + M^2(p^2)}. \tag{25.213} \]
We defined the quantity $-iM^2(p^2)$ as the sum of all 1PI insertions into the boson propagator, but we can equally well think of it as the sum of all amputated diagrams for 1-particle → 1-particle scattering. Under the OS renormalization scheme, the LSZ formula would imply
\[ T = -M^2(p^2). \tag{25.214} \]
If the scalar boson is stable, there will be no possible final state that can contribute to the right-hand side of equation 25.211, and so $M^2(p^2)$ must be real. The renormalization condition $M^2(-m^2) = 0$ can be realized by a real-valued m, which is the physical mass of the stable particle. The pole of the propagator lies on the real $p^2$ axis, below the multiparticle branch cut.
Often, however, a particle can decay into two or more lighter particles. In this case, $M^2(p^2)$ will acquire an imaginary part and the renormalization condition must be modified to $\operatorname{Re} M^2(-m^2) = 0$. The pole in the propagator would be displaced from the real axis.
If this propagator appears in the s channel of a Feynman diagram, the cross section one computes, in the vicinity of the pole, will have the form
\[ \sigma \propto \left|\frac{1}{s - m^2 - i\operatorname{Im}M^2(-s)}\right|^2 \quad\text{where } s = -p^2,\; p = p_1 + p_2. \tag{25.215} \]
where $\Gamma_{tot}$ is the total decay rate of the intermediate particle. As a result, the width of the resonance and the lifetime of the intermediate particle are related by
\[ \Delta E\,\Delta\tau = 1. \tag{25.219} \]
We stress once again that our derivation of this equation applies only to the case of a long-
lived unstable particle, so that Γ ≪ m. For a broad resonance, the full energy dependence of
M 2 (p2 ) must be taken into account.
To get a more physical understanding of this result, recall that in non-relativistic quantum
mechanics, a metastable state with energy E0 and angular momentum quantum number l
shows up as a resonance in the partial-wave scattering amplitude,
\[ f_l \sim \frac{1}{E - E_0 + i\Gamma/2}. \tag{25.220} \]
If we imagine convolving this amplitude with a wave packet $\tilde\psi(E)e^{-iEt}$, we will find a time dependence
\[ \psi(t) \sim \int dE\,\frac{1}{E - E_0 + i\Gamma/2}\,\tilde\psi(E)\,e^{-iEt} \sim e^{-iE_0t - \Gamma t/2}. \tag{25.221} \]
Therefore $|\psi(t)|^2 \sim e^{-\Gamma t}$, and we identify Γ as the inverse lifetime of the metastable state.
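The two faces of Γ in equations 25.220 and 25.221, as the width of the resonance in energy and as the decay rate in time, can be made concrete with a small Python sketch. The function names and the bisection search for the half-maximum point are illustrative assumptions; the sketch only evaluates the Breit-Wigner form and the exponential time dependence quoted above, and does not perform the energy integral itself.

```python
import cmath

def breit_wigner(energy, e0, gamma):
    """Resonant amplitude 1 / (E - E0 + i Gamma / 2) of equation 25.220."""
    return 1 / (energy - e0 + 0.5j * gamma)

def full_width_half_max(e0, gamma, tol=1e-12):
    """Locate by bisection the energy above E0 where |f|^2 drops to half its peak."""
    peak = abs(breit_wigner(e0, e0, gamma)) ** 2
    lo, hi = e0, e0 + 10 * gamma
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if abs(breit_wigner(mid, e0, gamma)) ** 2 > 0.5 * peak:
            lo = mid
        else:
            hi = mid
    return 2 * (lo - e0)

def survival_probability(t, e0, gamma):
    """|psi(t)|^2 for psi(t) = exp(-i E0 t - Gamma t / 2)."""
    return abs(cmath.exp(-1j * e0 * t - 0.5 * gamma * t)) ** 2
```

The full width at half maximum of $|f_l|^2$ comes out equal to Γ, and the survival probability falls by a factor of e after a time 1/Γ, in line with ∆E∆τ = 1.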
L = −∂ µ Φ† ∂µ Φ − m2 Φ† Φ. (25.222)
π = Φ̇† . (25.223)
\[ [b(p), b^\dagger(q)] = (2\pi)^3\,2\omega\,\delta(p - q), \qquad [c(p), c^\dagger(q)] = (2\pi)^3\,2\omega\,\delta(p - q). \tag{25.227} \]
Working out the commutation relations between H, P and b, b† , c, c† , we can conclude that
b† (p) / c† (p) creates a b / c particle with momentum p, while b(p) / c(p) annihilates a b / c
particle with momentum p. They share the same mass m.
We notice that $\mathcal{L}$ is invariant under the transformation $\Phi \to \Phi e^{i\alpha}$. Noether’s theorem implies that the complex Klein-Gordon field has a conserved charge
\[ Q = i\int d^3x\,(\dot\Phi^\dagger\Phi - \Phi^\dagger\dot\Phi) = \int\widetilde{dp}\,[c^\dagger(p)c(p) - b^\dagger(p)b(p)] = N_c - N_b. \tag{25.228} \]
We would like to interpret the c-particle as the antiparticle of the b-particle. The number of antiparticles minus the number of particles is a conserved quantity, i.e., particles and antiparticles must be created and annihilated in pairs.
Integrating by parts, the Lagrangian density of the complex Klein-Gordon field becomes
\[ \mathcal{L} = \psi^\dagger\left(i\frac{\partial}{\partial t} + \frac{\nabla^2}{2m}\right)\psi, \tag{25.231} \]
We further define $N_i^+ \equiv \frac{1}{2}(S_i - iK_i)$ and $N_i^- \equiv \frac{1}{2}(S_i + iK_i)$. The commutation relations now become
\[ [N_i^+, N_j^+] = i\epsilon_{ijk}N_k^+, \qquad [N_i^-, N_j^-] = i\epsilon_{ijk}N_k^-, \qquad [N_i^+, N_j^-] = 0. \tag{26.5} \]
We see that we have two different SU(2) Lie algebras that are exchanged by hermitian con-
jugation. A representation of the SU(2) Lie algebra is specified by an integer or half integer;
we therefore conclude that a representation of the Lie algebra of the Lorentz group in four
spacetime dimensions is specified by two integers or half-integers n and n′ .
We will label these representations as (2n + 1, 2n′ + 1); the number of components of a rep-
resentation is then (2n + 1)(2n′ + 1). Different components within a representation can also
be labeled by their angular momentum representations. Since Si = Ni+ + Ni− , deducing the
allowed values of j given n and n′ becomes a standard problem in the addition of angular mo-
menta. The general result is that the allowed values of j are |n − n′ |, |n − n′ | + 1, · · · , n + n′ ,
and each of these values appears exactly once.
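The counting in the paragraph above can be verified with a short Python sketch: for a representation labeled by n and n′, it lists the allowed spins j and checks that the spin multiplets add up to (2n + 1)(2n′ + 1) components. The function names are illustrative, and exact rational arithmetic is used so that half-integer labels are handled without rounding.

```python
from fractions import Fraction

def allowed_spins(n, n_prime):
    """Spins j = |n - n'|, |n - n'| + 1, ..., n + n', each appearing once."""
    n, n_prime = Fraction(n), Fraction(n_prime)
    j = abs(n - n_prime)
    spins = []
    while j <= n + n_prime:
        spins.append(j)
        j += 1
    return spins

def component_count(n, n_prime):
    """Number of components (2n + 1)(2n' + 1) of the representation."""
    return int((2 * Fraction(n) + 1) * (2 * Fraction(n_prime) + 1))
```

For example, n = n′ = 1/2 gives the (2, 2) representation with j = 0 and j = 1, which together account for its four components.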
States with identical particles of integer spin are symmetric under the interchange of
the particles, while states with identical particles of half-integer spin are antisymmet-
ric under the interchange of the particles. This is equivalent to the statement that the
creation and annihilation operators for integer spin particles satisfy canonical commutation
relations, while creation and annihilation operators for half-integer spin particles
satisfy canonical anti-commutation relations. Particles quantized with canonical com-
mutation relations are called bosons, and satisfy Bose–Einstein statistics, and particles
quantized with canonical anti-commutation relations are called fermions, and satisfy
Fermi–Dirac statistics.
Roughly speaking, one way to interchange two particles is to rotate them around their midpoint by π. For a particle of spin s, this rotation will introduce a phase factor of $e^{i\pi s}$. Thus, a two-particle state with identical particles both of spin s will pick up a factor of $e^{i2\pi s}$. For s a half-integer, this will give a factor of −1; for s an integer, it will give a factor of +1. Therefore, the creation and annihilation operators for integer spin particles satisfy canonical commutation relations, while creation and annihilation operators for half-integer spin particles satisfy canonical anti-commutation relations. The detailed proof can be found in sections 12.1 and 12.2 of Quantum Field Theory and the Standard Model (Matthew D. Schwartz).
Similarly, a right-handed spinor field is in the $(1, 2)$ representation of the Lie algebra of the
Lorentz group, where
\[ S_R^{ij} = \epsilon^{ijk}(N_k^+ + N_k^-)_{1,2} = \tfrac{1}{2}\epsilon^{ijk}\sigma_k, \qquad S_R^{k0} = i(N_k^- - N_k^+)_{1,2} = -\tfrac{1}{2}i\sigma_k. \tag{26.9} \]
The hermitian conjugate of the left-handed spinor field also furnishes a representation of the
Lorentz group. We distinguish the indices of the conjugate field from those of the original
field by putting dots over them. Thus, we write
\[ U(\Lambda)^{-1}\psi^\dagger_{\dot a}(x)U(\Lambda) = (L^*)_{\dot a}{}^{\dot b}\,\psi^\dagger_{\dot b}(\Lambda^{-1}x). \tag{26.12} \]
Define
\[ \epsilon_{ab} \equiv \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{26.13} \]
Using the fact that det L = 1, we have
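so that $\epsilon_{ab}$ is an invariant symbol of the Lorentz group. The inverse of $\epsilon_{ab}$ is denoted by $\epsilon^{ab}$. We
can use $\epsilon^{ab}$ and $\epsilon_{ab}$ to raise and lower left-handed spinor indices,
\[ \psi^a\chi_a = -\psi_a\chi^a. \tag{26.16} \]
\[ L^a{}_b L_a{}^c = -\delta_b{}^c, \qquad L^a{}_c L^b{}_d\,\epsilon^{cd} = \epsilon^{ab}, \qquad U(\Lambda)^{-1}\psi^a(x)U(\Lambda) = -L^a{}_b(\Lambda)\,\psi^b(\Lambda^{-1}x). \tag{26.17} \]
For the conjugate field, there is also an invariant symbol $\epsilon_{\dot a\dot b}$, which is numerically equal to $\epsilon_{ab}$.
We can use $\epsilon^{\dot a\dot b}$ and $\epsilon_{\dot a\dot b}$ to raise and lower conjugate spinor indices.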
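Using the fact that $\sigma_2\sigma_i^*\sigma_2 = -\sigma_i$ and $\epsilon^{\dot a\dot b} = i(\sigma_2)^{\dot a\dot b}$, we can show that
\[ R^{\dot a}{}_{\dot b} \equiv -(L^*)^{\dot a}{}_{\dot b} = \epsilon^{\dot a\dot c}(L^*)_{\dot c}{}^{\dot d}\,\epsilon_{\dot d\dot b} = \left[\exp\left(-\tfrac{i}{2}\theta_i\sigma_i + \tfrac{1}{2}\eta_i\sigma_i\right)\right]^{\dot a}{}_{\dot b}. \tag{26.18} \]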
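Since
\[ U(\Lambda)^{-1}\psi^{\dagger\dot a}(x)U(\Lambda) = R^{\dot a}{}_{\dot b}\,\psi^{\dagger\dot b}(\Lambda^{-1}x), \tag{26.19} \]
the conjugate field $\psi^{\dagger\dot a}$ is actually a right-handed spinor field.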
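The passage from the left-handed matrix $L$ to the right-handed matrix $R$ can be checked numerically. The sketch below assumes the left-handed form $L = \exp(-\tfrac{i}{2}\theta_i\sigma_i - \tfrac{1}{2}\eta_i\sigma_i)$, which is not shown in this part of the notes, and uses arbitrary illustrative parameters `theta` and `eta`; the small Taylor-series `expm` helper is hypothetical.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def expm(a, terms=60):
    """Matrix exponential by Taylor series (adequate for small 2x2 matrices)."""
    result = np.eye(a.shape[0], dtype=complex)
    term = np.eye(a.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ a / k
        result = result + term
    return result

def dot_sigma(v):
    """Return v_i sigma_i for a 3-component parameter vector v."""
    return sum(v_i * s_i for v_i, s_i in zip(v, sigma))

theta = np.array([0.3, -0.7, 0.2])   # illustrative rotation parameters
eta = np.array([0.5, 0.1, -0.4])     # illustrative boost parameters

# Assumed left-handed matrix: L = exp(-i/2 theta.sigma - 1/2 eta.sigma)
L = expm(-0.5j * dot_sigma(theta) - 0.5 * dot_sigma(eta))
# Right-handed matrix as in (26.18): R = exp(-i/2 theta.sigma + 1/2 eta.sigma)
R = expm(-0.5j * dot_sigma(theta) + 0.5 * dot_sigma(eta))

eps_up = 1j * sigma[1]               # epsilon with raised indices, i*sigma_2
eps_down = np.linalg.inv(eps_up)     # epsilon with lowered indices (its inverse)

conj_identity = all(np.allclose(sigma[1] @ s.conj() @ sigma[1], -s) for s in sigma)
R_from_L = eps_up @ L.conj() @ eps_down
print(conj_identity, np.allclose(R_from_L, R))
```

Both checks rely only on the identity $\sigma_2\sigma_i^*\sigma_2 = -\sigma_i$, so they hold for any choice of the parameters.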
Define
\[ \sigma^\mu_{a\dot a} \equiv (I, \sigma_1, \sigma_2, \sigma_3). \tag{26.20} \]
It is an invariant symbol of $(2, 1) \otimes (1, 2) \otimes (2, 2)$. The properties of invariant
symbols can be used to derive the following equations.
Proposition 26.1
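1.
\[ \sigma^\mu_{a\dot a}\sigma_{\mu b\dot b} = -2\epsilon_{ab}\epsilon_{\dot a\dot b}, \qquad \epsilon^{ab}\epsilon^{\dot a\dot b}\sigma^\mu_{a\dot a}\sigma^\nu_{b\dot b} = -2\eta^{\mu\nu}. \tag{26.21} \]
2.
\[ (S_L^{\mu\nu})_{ab} = (S_L^{\mu\nu})_{ba}, \qquad (S_R^{\mu\nu})_{\dot a\dot b} = (S_R^{\mu\nu})_{\dot b\dot a}. \tag{26.22} \]
3. Define
\[ \bar\sigma^{\mu\dot a a} \equiv \epsilon^{ab}\epsilon^{\dot a\dot b}\sigma^\mu_{b\dot b}. \tag{26.23} \]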
Numerically, we have
We adopt the following convention: a missing pair of contracted, undotted indices is understood
to be written as ${}^c{}_c$, and a missing pair of contracted, dotted indices is understood to be
written as ${}_{\dot c}{}^{\dot c}$. Thus, if $\chi$ and $\psi$ are two left-handed Weyl fields, we have
We expect Weyl fields to describe spin-one-half particles, and by the spin-statistics theorem,
these particles must be fermions. Therefore the corresponding fields must anticommute, rather
than commute. That is, we should have
Proposition 26.2
1.
\[ (\chi\psi)^\dagger = \psi^\dagger\chi^\dagger. \]
2.
\[ [\psi^\dagger\bar\sigma^\mu\chi]^\dagger = \chi^\dagger\bar\sigma^\mu\psi. \]
\[ \mathcal{L} = i\psi^\dagger\bar\sigma^\mu\partial_\mu\psi. \tag{26.28} \]
The second term is a total divergence, and vanishes (with suitable boundary conditions on the
fields at infinity) when we integrate it over $d^4x$ to get the action $S$. Thus $i\psi^\dagger\bar\sigma^\mu\partial_\mu\psi$ has the
hermiticity properties necessary for a term in $\mathcal{L}$.
The field equation can be obtained from the principle of least action,
\[ \bar\sigma^\mu\partial_\mu\psi = 0. \tag{26.30} \]
Majorana field
\[ \mathcal{L} = i\psi^\dagger\bar\sigma^\mu\partial_\mu\psi - \tfrac{1}{2}m\psi\psi - \tfrac{1}{2}m\psi^\dagger\psi^\dagger. \tag{26.31} \]
The field equation is
We can write the equation of motion more compactly by introducing the gamma matrices
\[ \gamma^\mu \equiv \begin{pmatrix} 0 & \sigma^\mu_{a\dot c} \\ \bar\sigma^{\mu\dot a c} & 0 \end{pmatrix}. \tag{26.33} \]
Dirac field
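A Dirac field is composed of two left-handed spinor fields with a $U(1)$ symmetry. The Lagrangian
density is given by
\[ \mathcal{L} = i\chi^\dagger\bar\sigma^\mu\partial_\mu\chi + i\xi^\dagger\bar\sigma^\mu\partial_\mu\xi - \tfrac{1}{2}m\chi\xi - \tfrac{1}{2}m\xi^\dagger\chi^\dagger, \tag{26.37} \]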
which is invariant under the transformation
We can write the Lagrangian density in terms of the Dirac field. First we take the hermitian
conjugate of Ψ to get
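\[ \Psi^\dagger = (\chi^\dagger_{\dot a}, \xi^a). \tag{26.41} \]
\[ \bar\Psi\Psi = \xi\chi + \chi^\dagger\xi^\dagger, \qquad \bar\Psi\gamma^\mu\partial_\mu\Psi = \chi^\dagger\bar\sigma^\mu\partial_\mu\chi + \xi^\dagger\bar\sigma^\mu\partial_\mu\xi + \partial_\mu(\xi\sigma^\mu\xi^\dagger). \tag{26.44} \]
This form of the Lagrangian density is invariant under the $U(1)$ transformation
\[ j^\mu = \bar\Psi\gamma^\mu\Psi = \chi^\dagger\bar\sigma^\mu\chi - \xi^\dagger\bar\sigma^\mu\xi. \tag{26.47} \]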
Charge conjugation
Charge conjugation simply exchanges ξ and χ. We can define a unitary charge conjugation
operator C that implements this:
Then, if we multiply by $C$, we get a field that we will call $\Psi^C$, the charge conjugate of $\Psi$,
\[ \Psi^C \equiv C\bar\Psi^T = \begin{pmatrix} \xi_a \\ \chi^{\dagger\dot a} \end{pmatrix}. \tag{26.51} \]
The charge conjugation matrix has a number of useful properties. As a numerical matrix, it
obeys
\[ C^T = C^\dagger = C^{-1} = -C, \qquad C^{-1}\gamma^\mu C = -(\gamma^\mu)^T. \tag{26.53} \]
Now let us return to the Majorana field. A Majorana field is its own charge
conjugate, that is, $\Psi^C \equiv C\bar\Psi^T = \Psi$. This condition is analogous to the condition $\phi^\dagger = \phi$ that is
satisfied by a real scalar field. A Dirac field, with its $U(1)$ symmetry, is analogous to a complex
scalar field, while a Majorana field, which has no $U(1)$ symmetry, is analogous to a real scalar
field.
Using the fact that $\bar\Psi = \Psi^T C$, the Lagrangian density of a Majorana field in terms of $\Psi$ is given
by
\[ \mathcal{L} = \tfrac{i}{2}\Psi^T C\gamma^\mu\partial_\mu\Psi - \tfrac{1}{2}m\Psi^T C\Psi. \tag{26.54} \]
Projection matrix
We can also recover the Weyl components of a Dirac or Majorana field by means of a suitable
projection matrix. Define
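\[ \gamma_5 \equiv \begin{pmatrix} -\delta_a{}^c & 0 \\ 0 & \delta^{\dot a}{}_{\dot c} \end{pmatrix}. \tag{26.55} \]
\[ \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = -\frac{i}{24}\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma \quad\text{where}\quad \epsilon_{0123} = -1. \tag{26.58} \]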
The γ5 also has the following properties:
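Define
\[ S^{\mu\nu} \equiv \begin{pmatrix} (S_L^{\mu\nu})_a{}^b & 0 \\ 0 & (S_R^{\mu\nu})^{\dot a}{}_{\dot b} \end{pmatrix}. \tag{26.60} \]
Numerically, we have
\[ S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu], \qquad S^{i0} = \frac{i}{2}\begin{pmatrix} \sigma^i & 0 \\ 0 & -\sigma^i \end{pmatrix}, \qquad S^{ij} = \frac{1}{2}\epsilon^{ijk}\begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix}. \tag{26.61} \]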
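it follows that
\[ D^{-1}\gamma^\mu D = \Lambda^\mu{}_\nu\gamma^\nu, \qquad D^{-1}\gamma_5 D = \gamma_5, \tag{26.65} \]
which means that $\bar\Psi\gamma^\mu\Psi$ behaves like a vector under Lorentz transformations, while $\bar\Psi\gamma_5\Psi$ behaves like a
scalar.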
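The block structure of the gamma matrices in 26.33, 26.55 and 26.61 can be checked numerically. The following is a minimal sketch under stated assumptions: the metric is taken to be mostly-plus, $\eta = \mathrm{diag}(-1, 1, 1, 1)$, the numerical form $\bar\sigma^\mu = (I, -\sigma_i)$ is assumed, and the Levi-Civita sign is taken as $\epsilon_{0123} = -1$. The helper names are hypothetical.

```python
import numpy as np
from itertools import permutations

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])      # assumed mostly-plus metric

sigma = [I2] + pauli                       # sigma^mu = (I, sigma_i), as in (26.20)
sigma_bar = [I2] + [-s for s in pauli]     # assumed numerical form (I, -sigma_i)

# gamma^mu in block form, as in (26.33)
gamma = [np.block([[Z2, sigma[mu]], [sigma_bar[mu], Z2]]) for mu in range(4)]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

def S(mu, nu):
    """Lorentz generator S^{mu nu} = (i/4)[gamma^mu, gamma^nu], as in (26.61)."""
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

def permutation_sign(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    inversions = sum(perm[i] > perm[j] for i in range(4) for j in range(i + 1, 4))
    return -1 if inversions % 2 else 1

# Clifford algebra {gamma^mu, gamma^nu} = -2 eta^{mu nu}
clifford_ok = all(
    np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], -2 * eta[m, n] * np.eye(4))
    for m in range(4) for n in range(4)
)
gamma5_ok = np.allclose(gamma5, np.diag([-1, -1, 1, 1]))

# -(i/24) eps_{mu nu rho sigma} gamma^mu gamma^nu gamma^rho gamma^sigma with eps_{0123} = -1
eps_sum = sum(
    -permutation_sign(p) * gamma[p[0]] @ gamma[p[1]] @ gamma[p[2]] @ gamma[p[3]]
    for p in permutations(range(4))
)
gamma5_eps_ok = np.allclose(-1j / 24 * eps_sum, gamma5)

boost_ok = all(
    np.allclose(S(i, 0), 0.5j * np.block([[pauli[i - 1], Z2], [Z2, -pauli[i - 1]]]))
    for i in (1, 2, 3)
)
rotation_ok = np.allclose(S(1, 2), 0.5 * np.block([[pauli[2], Z2], [Z2, pauli[2]]]))
print(clifford_ok, gamma5_ok, gamma5_eps_ok, boost_ok, rotation_ok)
```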
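\[ \{\psi_a(\mathbf{x}, t), \psi_c(\mathbf{y}, t)\} = \{\pi^a(\mathbf{x}, t), \pi^c(\mathbf{y}, t)\} = 0, \qquad \{\psi_a(\mathbf{x}, t), \pi^c(\mathbf{y}, t)\} = i\delta_a{}^c\,\delta(\mathbf{x} - \mathbf{y}). \tag{26.69} \]
Using the field equation, $\psi(x)$ can be expanded as
\[ \psi_a = \int \widetilde{dp}\,\left[b(\mathbf{p})w_a(\mathbf{p})e^{ipx} + d^\dagger(\mathbf{p})w_a(\mathbf{p})e^{-ipx}\right] \quad\text{where}\quad p^2 = 0, \quad (\hat{\mathbf{p}}\cdot\boldsymbol\sigma + 1)w(\mathbf{p}) = 0. \tag{26.70} \]
Choosing the normalization $w^\dagger(\mathbf{p})w(\mathbf{p}) = 2E_p = 2|\mathbf{p}|$, we can derive that
\[ \{b(\mathbf{p}), b^\dagger(\mathbf{q})\} = \{d(\mathbf{p}), d^\dagger(\mathbf{q})\} = (2\pi)^3\,2E_p\,\delta(\mathbf{p} - \mathbf{q}), \tag{26.71} \]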
and all other anticommutation brackets between b, b† , d and d† vanish. In terms of b, b† , d and
d† , the Hamiltonian of the field is
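\[ H = \int \widetilde{dp}\,|\mathbf{p}|\left[b^\dagger(\mathbf{p})b(\mathbf{p}) + d^\dagger(\mathbf{p})d(\mathbf{p})\right] - 2E_0V \quad\text{where}\quad E_0 = \frac{1}{2}(2\pi)^{-3}\int d^3p\,|\mathbf{p}|, \tag{26.72} \]
the momentum is
\[ \mathbf{P} = \int \widetilde{dp}\,\mathbf{p}\left[b^\dagger(\mathbf{p})b(\mathbf{p}) + d^\dagger(\mathbf{p})d(\mathbf{p})\right], \tag{26.73} \]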
where
\[ (\not p + m)u(\mathbf{p}) = 0, \qquad (-\not p + m)v(\mathbf{p}) = 0 \qquad (p^2 + m^2 = 0). \tag{26.79} \]
Here, we introduce the Feynman slash: given any four-vector $a^\mu$, we define $\not a \equiv a_\mu\gamma^\mu$. Each
of the equations in 26.79 has two linearly independent solutions, which we label via $s = +$ and $s = -$.
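For $m \neq 0$, it is easiest to analyze equation 26.79 in the rest frame, where $\mathbf{p} = 0$. Two linearly
independent solutions for $u(\mathbf{p})$ can be chosen as
\[ u_s(0) = \sqrt{m}\begin{pmatrix} \xi_s \\ \xi_s \end{pmatrix} \quad\text{where}\quad \xi_+^\dagger\xi_+ = 1, \quad \xi_- = -i\sigma_2\xi_+^*. \tag{26.80} \]
It can be shown that $\xi_s^\dagger\xi_{s'} = \delta_{ss'}$. We also choose the solutions for $v(\mathbf{p})$ as
\[ v_+(0) = \sqrt{m}\begin{pmatrix} \xi_- \\ -\xi_- \end{pmatrix}, \qquad v_-(0) = \sqrt{m}\begin{pmatrix} -\xi_+ \\ \xi_+ \end{pmatrix}. \tag{26.81} \]
\[ \bar u_s(\mathbf{p}) = \bar u_s(0)\exp(-i\eta\,\hat{\mathbf{p}}\cdot\mathbf{K}), \qquad \bar v_s(\mathbf{p}) = \bar v_s(0)\exp(-i\eta\,\hat{\mathbf{p}}\cdot\mathbf{K}). \tag{26.85} \]
This follows from $\bar K_j \equiv \beta K_j^\dagger\beta = K_j$. In particular, it turns out that $\gamma^\mu$, $S^{\mu\nu}$, $i\gamma_5$, $\gamma^\mu\gamma_5$ and
$i\gamma_5 S^{\mu\nu}$ all satisfy $\bar A = A$.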
Some useful identities of spinors are summarized in the following proposition. The proof can
be found in section 38 of Quantum field theory (Mark Srednicki)
Proposition 26.3
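1.
2.
\[ 2m\,\bar u_{s'}(\mathbf{p}')\gamma^\mu u_s(\mathbf{p}) = \bar u_{s'}(\mathbf{p}')\left[(p' + p)^\mu - 2iS^{\mu\nu}(p' - p)_\nu\right]u_s(\mathbf{p}), \]
\[ -2m\,\bar v_{s'}(\mathbf{p}')\gamma^\mu v_s(\mathbf{p}) = \bar v_{s'}(\mathbf{p}')\left[(p' + p)^\mu - 2iS^{\mu\nu}(p' - p)_\nu\right]v_s(\mathbf{p}). \tag{26.88} \]
3.
4.
\[ \sum_{s=\pm} u_s(\mathbf{p})\bar u_s(\mathbf{p}) = -\not p + m, \qquad \sum_{s=\pm} v_s(\mathbf{p})\bar v_s(\mathbf{p}) = -\not p - m. \tag{26.90} \]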
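The rest-frame spinors of 26.80 and 26.81 and the spin sums of 26.90 can be checked numerically at $\mathbf{p} = 0$. This is a minimal sketch assuming a mostly-plus metric (so that $\not p = -m\gamma^0$ in the rest frame), the numerical form $\bar\sigma^\mu = (I, -\sigma_i)$, $\beta$ numerically equal to $\gamma^0$, and the illustrative choices $\xi_+ = (1, 0)$ and $m = 1.3$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
sigma = [I2] + pauli
sigma_bar = [I2] + [-s for s in pauli]     # assumed numerical form (I, -sigma_i)
gamma = [np.block([[Z2, sigma[mu]], [sigma_bar[mu], Z2]]) for mu in range(4)]
beta = gamma[0].copy()                     # assumption: beta equals gamma^0 numerically

m = 1.3                                    # illustrative mass
p_slash = -m * gamma[0]                    # rest frame: p^mu = (m, 0, 0, 0), p_mu gamma^mu = -m gamma^0

xi_plus = np.array([1, 0], dtype=complex)
xi_minus = -1j * pauli[1] @ xi_plus.conj() # xi_- = -i sigma_2 xi_+^*
xi = {+1: xi_plus, -1: xi_minus}

u = {s: np.sqrt(m) * np.concatenate([xi[s], xi[s]]) for s in (+1, -1)}   # (26.80)
v = {
    +1: np.sqrt(m) * np.concatenate([xi_minus, -xi_minus]),              # (26.81)
    -1: np.sqrt(m) * np.concatenate([-xi_plus, xi_plus]),
}

def bar(w):
    """Dirac conjugate row vector w^dagger beta."""
    return w.conj() @ beta

dirac_ok = all(
    np.allclose((p_slash + m * I4) @ u[s], 0) and np.allclose((-p_slash + m * I4) @ v[s], 0)
    for s in (+1, -1)
)
orthonormal_ok = bool(
    np.isclose(np.vdot(xi_plus, xi_minus), 0) and np.isclose(np.vdot(xi_minus, xi_minus), 1)
)
sum_u = sum(np.outer(u[s], bar(u[s])) for s in (+1, -1))
sum_v = sum(np.outer(v[s], bar(v[s])) for s in (+1, -1))
spin_sums_ok = np.allclose(sum_u, -p_slash + m * I4) and np.allclose(sum_v, -p_slash - m * I4)
print(dirac_ok, orthonormal_ok, spin_sums_ok)
```

The check covers only the rest frame; for general momentum the spinors would have to be boosted first.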
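In the extreme relativistic limit, we have $\eta \to \infty$ and $me^\eta \to 2E$. Choosing $\xi_s$ as the eigenvector
of $\hat{\mathbf{p}}\cdot\boldsymbol\sigma$ with eigenvalue $s$, we have
\[ u_+(\mathbf{p}) = v_-(\mathbf{p}) = \sqrt{2E}\begin{pmatrix} 0 \\ \xi_+ \end{pmatrix}, \qquad u_-(\mathbf{p}) = v_+(\mathbf{p}) = \sqrt{2E}\begin{pmatrix} \xi_- \\ 0 \end{pmatrix}. \tag{26.92} \]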
Next, we need the charge conjugation matrix 26.49. We can show that $C\bar u_s^T(0) = v_s(0)$ and
$C\bar v_s^T(0) = u_s(0)$. Also, equation 26.53 implies $C^{-1}K_jC = -(K_j)^T$. From this we conclude
that
\[ C\bar u_s^T(\mathbf{p}) = v_s(\mathbf{p}), \qquad C\bar v_s^T(\mathbf{p}) = u_s(\mathbf{p}). \tag{26.94} \]
Taking the complex conjugate of 26.94, we get
Next, $\gamma_5u_s(0) = +s\,v_{-s}(0)$ and $\gamma_5v_s(0) = -s\,u_{-s}(0)$, and $\gamma_5K_j = K_j\gamma_5$. Therefore
We will need equation 26.93 in our discussion of parity, equation 26.94 in our discussion of
charge conjugation, and 26.97 in our discussion of time reversal.
From equation 26.78, we can derive that
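\[ b_s(\mathbf{p}) = \int d^3x\,e^{-ipx}\,\bar u_s(\mathbf{p})\gamma^0\Psi(x), \qquad b_s^\dagger(\mathbf{p}) = \int d^3x\,e^{ipx}\,\bar\Psi(x)\gamma^0u_s(\mathbf{p}), \]
\[ d_s(\mathbf{p}) = \int d^3x\,e^{-ipx}\,\bar\Psi(x)\gamma^0v_s(\mathbf{p}), \qquad d_s^\dagger(\mathbf{p}) = \int d^3x\,e^{ipx}\,\bar v_s(\mathbf{p})\gamma^0\Psi(x). \tag{26.98} \]
\[ \{\Psi_a(\mathbf{x}, t), \Psi_c(\mathbf{y}, t)\} = \{\Pi_a(\mathbf{x}, t), \Pi_c(\mathbf{y}, t)\} = 0, \qquad \{\Psi_a(\mathbf{x}, t), \Pi_c(\mathbf{y}, t)\} = i\delta_{ac}\,\delta(\mathbf{x} - \mathbf{y}). \tag{26.99} \]
In terms of $b$, $b^\dagger$, $d$ and $d^\dagger$, the anticommutation relations are
\[ \{b_s(\mathbf{p}), b_{s'}^\dagger(\mathbf{q})\} = \{d_s(\mathbf{p}), d_{s'}^\dagger(\mathbf{q})\} = (2\pi)^3\,2E_p\,\delta_{ss'}\,\delta(\mathbf{p} - \mathbf{q}), \tag{26.100} \]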
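the momentum as
\[ \mathbf{P} = \sum_{s=\pm}\int \widetilde{dp}\,\mathbf{p}\left[N^+(\mathbf{p}, s) + N^-(\mathbf{p}, s)\right], \tag{26.103} \]
and the spin angular momentum as
\[ \mathbf{S} = \sum_{s=\pm}\int \widetilde{dp}\,\frac{s}{2}\,\hat{\mathbf{p}}\left[N^+(\mathbf{p}, s) + N^-(\mathbf{p}, s)\right] + \cdots \tag{26.104} \]
In equation 26.104, we choose $\xi_s$ as the eigenvector of $\hat{\mathbf{p}}\cdot\boldsymbol\sigma$ with eigenvalue $s$. (If $\mathbf{p} = 0$, $\hat{\mathbf{p}}$
can be chosen arbitrarily.) The omitted terms vanish when $\mathbf{S}$ is projected onto $\hat{\mathbf{p}}$.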
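The anti-commutation relation for $\Psi(x)$ at arbitrary spacetime points is given by
\[ \{\Psi_a(x), \bar\Psi_c(y)\} = (i\not\partial_x + m)_{ac}\,i\Delta(x - y), \tag{26.105} \]
where
\[ i\Delta(x - y) \equiv \int \widetilde{dp}\,\left[e^{ip(x-y)} - e^{-ip(x-y)}\right]. \tag{26.106} \]
For $(x - y)^2 > 0$, $\Delta(x - y) = 0$ and so $\{\Psi_a(x), \bar\Psi_c(y)\} = 0$. It follows that
\[ [\bar\Psi_a(x)\Psi_b(x), \bar\Psi_c(y)\Psi_d(y)] = 0 \quad\text{if}\quad (x - y)^2 > 0. \tag{26.107} \]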
Thus, the microscopic causality is satisfied for any physical observables, such as charge density
or momentum density.
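The two-point correlation function for the Dirac field is given by
\[ \langle 0|\Psi_a(x)\bar\Psi_c(y)|0\rangle = (i\not\partial_x + m)_{ac}\int \widetilde{dp}\,e^{ip(x-y)} \tag{26.108} \]
and
\[ \langle 0|\bar\Psi_c(y)\Psi_a(x)|0\rangle = -(i\not\partial_x + m)_{ac}\int \widetilde{dp}\,e^{ip(y-x)}. \tag{26.109} \]
The retarded Green function for the Dirac field is defined via
\[ S_R(x - y)_{ac} \equiv \theta(x^0 - y^0)\,\langle 0|\{\Psi_a(x), \bar\Psi_c(y)\}|0\rangle. \tag{26.110} \]
It is easy to verify that
\[ S_R(x - y) = (i\not\partial_x + m)D_R(x - y) = \int \frac{d^4p}{(2\pi)^4}\,\frac{i(\not p - m)}{p^2 + m^2}\,e^{ip(x-y)} \tag{26.111} \]
and
\[ (i\not\partial_x - m)S_R(x - y) = i\delta(x - y)\cdot 1_{4\times 4}. \tag{26.112} \]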
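In momentum space, equation 26.112 reduces to a matrix identity that can be checked numerically: acting with $i\not\partial_x$ on $e^{ipx}$ gives $-\not p$, so $(-\not p - m)\,i(\not p - m)/(p^2 + m^2)$ should equal $i$ times the unit matrix. The sketch below assumes a mostly-plus metric and the numerical form $\bar\sigma^\mu = (I, -\sigma_i)$; the momentum and mass are arbitrary illustrative values.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])      # assumed mostly-plus metric
sigma = [I2] + pauli
sigma_bar = [I2] + [-s for s in pauli]     # assumed numerical form (I, -sigma_i)
gamma = [np.block([[Z2, sigma[mu]], [sigma_bar[mu], Z2]]) for mu in range(4)]

def slash(p_up):
    """Feynman slash p_mu gamma^mu for a contravariant four-vector p^mu."""
    p_low = eta @ p_up
    return sum(p_low[mu] * gamma[mu] for mu in range(4))

p = np.array([0.3, 1.1, -0.4, 0.8])       # arbitrary off-shell four-momentum
m = 0.7                                    # illustrative mass
p2 = p @ eta @ p

# p-slash squares to -p^2 with this signature
square_ok = np.allclose(slash(p) @ slash(p), -p2 * I4)

# Momentum-space integrand of S_R, then the operator (i dslash_x - m) -> (-pslash - m)
propagator = 1j * (slash(p) - m * I4) / (p2 + m**2)
green_ok = np.allclose((-slash(p) - m * I4) @ propagator, 1j * I4)
print(square_ok, green_ok)
```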
Now, we introduce the time-ordered product for fermion fields
\[ T\eta(x)\eta(y) \equiv \theta(x^0 - y^0)\eta(x)\eta(y) - \theta(y^0 - x^0)\eta(y)\eta(x). \tag{26.113} \]
The Feynman Green function for the Dirac field is defined as
\[ S_F(x - y) \equiv \langle 0|T\Psi(x)\bar\Psi(y)|0\rangle = \int \frac{d^4p}{(2\pi)^4}\,\frac{i(\not p - m)}{p^2 + m^2 - i\epsilon}\,e^{ip(x-y)}. \tag{26.114} \]
It follows that
\[ \langle 0|T\bar\Psi_a(x)\Psi_c(y)|0\rangle = -\langle 0|T\Psi_c(y)\bar\Psi_a(x)|0\rangle = -S_F(y - x)_{ca}. \tag{26.115} \]
We also have $\langle 0|T\Psi(x)\Psi(y)|0\rangle = \langle 0|T\bar\Psi(x)\bar\Psi(y)|0\rangle = 0$ for the Dirac field.
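So the Majorana condition implies that $d_s(\mathbf{p}) = b_s(\mathbf{p})$, and a free Majorana field can be
expanded as
\[ \Psi(x) = \sum_{s=\pm}\int \widetilde{dp}\,\left[b_s(\mathbf{p})u_s(\mathbf{p})e^{ipx} + b_s^\dagger(\mathbf{p})v_s(\mathbf{p})e^{-ipx}\right]. \tag{26.117} \]
The anticommutation relations for a Majorana field in two-component form are the same as
those for a Weyl field, given by equation 26.69. Translating into four-component form, we
have
\[ \{\Psi_a(\mathbf{x}, t), \Psi_c(\mathbf{y}, t)\} = (C\gamma^0)_{ac}\,\delta(\mathbf{x} - \mathbf{y}), \qquad \{\Psi_a(\mathbf{x}, t), \bar\Psi_c(\mathbf{y}, t)\} = (\gamma^0)_{ac}\,\delta(\mathbf{x} - \mathbf{y}), \tag{26.118} \]
which can be used to show that
\[ \{b_s(\mathbf{p}), b_{s'}(\mathbf{q})\} = 0, \qquad \{b_s(\mathbf{p}), b_{s'}^\dagger(\mathbf{q})\} = (2\pi)^3\,2E\,\delta_{ss'}\,\delta(\mathbf{p} - \mathbf{q}), \tag{26.119} \]
as we would expect.
The Hamiltonian for the Majorana field $\Psi$ in terms of $b$ and $b^\dagger$ is
\[ H = \sum_{s=\pm}\int \widetilde{dp}\,E_p\,b_s^\dagger(\mathbf{p})b_s(\mathbf{p}) - 2E_0V. \tag{26.120} \]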
The Majorana Lagrangian density has no U (1) symmetry. Thus there is no associated charge,
and only one kind of particle (with two possible spin states).
The Feynman Green function $\langle 0|T\Psi(x)\bar\Psi(y)|0\rangle$ for the Majorana field is the same as that for
the Dirac field. With the Majorana condition, we also have
where $\eta$ is a possible phase factor that should satisfy $\eta^2 = \pm 1$. We could assign different
phase factors to the $b$ and $d$ operators, but we choose them to be the same so that the parity
transformation is compatible with the Majorana condition $d_s(\mathbf{p}) = b_s(\mathbf{p})$ when applied to
Majorana fermions.
Using 26.122, we can derive that
\[ P^{-1}\mathbf{P}P = -\mathbf{P}, \qquad P^{-1}\mathbf{S}P = \mathbf{S}. \tag{26.123} \]
Thus a parity transformation reverses the three-momentum while leaving the spin direction
unchanged.
It can also be derived that
\[ P^{-1}\Psi(x)P = \sum_{s=\pm}\int \widetilde{dp}\,\left[\eta^*b_s(\mathbf{p})\beta u_s(\mathbf{p})e^{ip\mathcal{P}x} - \eta\,d_s^\dagger(\mathbf{p})\beta v_s(\mathbf{p})e^{-ip\mathcal{P}x}\right], \tag{26.124} \]
where $\mathcal{P}^\mu{}_\nu = \mathrm{diag}(1, -1, -1, -1)$. An acceptable choice is $\eta = -i$, resulting in
Recalling that
\[ \Psi = \begin{pmatrix} \chi_a \\ \xi^{\dagger\dot a} \end{pmatrix}, \tag{26.127} \]
we see from equation 26.124 that
Thus a parity transformation exchanges a left-handed field for a right-handed one. If we take
the hermitian conjugate of equation 26.128, then raise the index on one side while lowering it
on the other, we get
Comparing equations 26.128 and 26.129, we see that they are compatible with the Majorana
condition $\chi_a(x) = \xi_a(x)$.
Time reversal
In quantum theory, the time reversal operator is antiunitary, i.e., $T^{-1}iT = -i$. Under the time
reversal transformation, we require that
It follows that
\[ T^{-1}\mathbf{P}T = -\mathbf{P}, \qquad T^{-1}\mathbf{S}T = -\mathbf{S}. \tag{26.131} \]
Thus a time reversal transformation reverses both the three-momentum and the spin direction.
Applying to field operator, we obtain
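\[ T^{-1}\Psi(x)T = \sum_{s=\pm}\int \widetilde{dp}\,(-s)\,C\gamma_5\left[\zeta_{-s}^*\,b_s(\mathbf{p})u_s(\mathbf{p})e^{ip\mathcal{T}x} + \zeta_{-s}\,d_s^\dagger(\mathbf{p})v_s(\mathbf{p})e^{-ip\mathcal{T}x}\right], \tag{26.132} \]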
Considering the effect of time reversal on the Weyl fields, we can figure out that
Thus left-handed Weyl fields transform into left-handed Weyl fields (and right-handed into
right-handed) under time reversal. If we take the hermitian conjugate of equation 26.135,
then raise the index on one side while lowering it on the other, we get
Charge conjugation
Under charge conjugation, we have
Summary
The transformation properties of the various fermion bilinears under C, P and T are summa-
rized in table 26.1.
Table 26.1: Transformation properties of fermion bilinears under discrete symmetries. Here,
we use the shorthand $(-1)^\mu \equiv 1$ for $\mu = 0$ and $(-1)^\mu \equiv -1$ for $\mu = 1, 2, 3$.
        $\bar\Psi\Psi$    $i\bar\Psi\gamma_5\Psi$    $\bar\Psi\gamma^\mu\Psi$    $\bar\Psi\gamma^\mu\gamma_5\Psi$
P       +1       −1       $(-1)^\mu$       $-(-1)^\mu$
T       +1       −1       $(-1)^\mu$       $(-1)^\mu$
C       +1       +1       −1       +1
CPT     +1       +1       −1       −1
We see that $\bar\Psi\Psi$ and $i\bar\Psi\gamma_5\Psi$ are both even under $CPT$, while $\bar\Psi\gamma^\mu\Psi$ and $\bar\Psi\gamma^\mu\gamma_5\Psi$ are both
odd. These are examples of a more general rule: a fermion bilinear with $n$ vector indices
(and no uncontracted spinor indices) is even (odd) under $CPT$ if $n$ is even (odd). This also
applies if we allow derivatives acting on the fields, since each component of $\partial_\mu$ is odd under
the combination $PT$ and even under $C$.
For scalar and vector fields, it is always possible to choose the phase factors in the $C$, $P$, and
$T$ transformations so that, overall, they obey the same rule: a hermitian combination of fields
and derivatives is even or odd depending on the total number of uncontracted vector indices.
Putting this together with our result for fermion bilinears, we have the following $CPT$ theorem.
Theorem 26.2 CPT theorem
Any hermitian combination of any set of fields (scalar, vector, Dirac, Majorana) and
their derivatives that is a Lorentz scalar (and so carries no indices) is even under $CPT$.
Since the Lagrangian must be formed out of such combinations, we have $\mathcal{L}(x) \to \mathcal{L}(-x)$
under $CPT$, and so the action $S = \int d^4x\,\mathcal{L}$ is invariant.
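The $CPT$ row of table 26.1 is simply the product of the $C$, $P$ and $T$ rows. The following sketch multiplies the signs for each bilinear and each value of $\mu$; the dictionary keys (`scalar`, `pseudoscalar`, `vector`, `axial`) are hypothetical labels for the four columns of the table.

```python
def minus_one_to_mu(mu):
    """Shorthand (-1)^mu of table 26.1: +1 for mu = 0, -1 for mu = 1, 2, 3."""
    return 1 if mu == 0 else -1

# Signs copied from table 26.1; entries without a vector index ignore mu.
TABLE = {
    "scalar":       {"P": lambda mu: +1, "T": lambda mu: +1, "C": lambda mu: +1},
    "pseudoscalar": {"P": lambda mu: -1, "T": lambda mu: -1, "C": lambda mu: +1},
    "vector":       {"P": minus_one_to_mu, "T": minus_one_to_mu, "C": lambda mu: -1},
    "axial":        {"P": lambda mu: -minus_one_to_mu(mu), "T": minus_one_to_mu,
                     "C": lambda mu: +1},
}

def cpt_sign(kind, mu=0):
    """Product of the C, P and T signs for one bilinear and one index value."""
    row = TABLE[kind]
    return row["C"](mu) * row["P"](mu) * row["T"](mu)

for kind in TABLE:
    print(kind, [cpt_sign(kind, mu) for mu in range(4)])
```

The result is independent of $\mu$, as it must be for the rule "even (odd) under $CPT$ if the number of vector indices is even (odd)".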
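\[ \mathcal{L} = -\tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}M_0^2\phi^2 + i\bar\Psi\gamma^\mu\partial_\mu\Psi - m_0\bar\Psi\Psi - g_0\bar\Psi\Psi\phi. \tag{26.139} \]
The Hamiltonian is
\[ H = H_0 + H_{\rm int} \quad\text{where}\quad H_{\rm int} = \int d^3x\,g_0\bar\Psi(x)\Psi(x)\phi(x). \tag{26.140} \]
The correlation function of the Yukawa theory can be expanded perturbatively using
\[ \langle\Omega|T\{\Psi(x)\bar\Psi(y)\phi(z)\}|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\langle 0|T\{\Psi_I(x)\bar\Psi_I(y)\phi_I(z)\exp[-i\int_{-T}^{T}dt\,H_I]\}|0\rangle}{\langle 0|T\{\exp[-i\int_{-T}^{T}dt\,H_I]\}|0\rangle}, \tag{26.141} \]
where the definitions of $\phi_I$, $\Psi_I$ and $H_I$ are similar to those in $\phi^4$ theory.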
The right hand side of equation 26.141 can be evaluated using Wick’s theorem. Before we state
the Wick’s theorem for Yukawa theory, we must note the following conventions:
1. The time-ordered product picks up one minus sign for each interchange of operators
that is necessary to put the fields in time order.
2. The normal-ordered product picks up one minus sign for each interchange of operators
that is necessary to put the fields in normal order.
3. Define contractions under the normal-ordering symbol to include minus signs for
operator interchanges.
With these conventions, Wick’s theorem takes the same form as before:
\[ T\{\Psi_I(x_1)\bar\Psi_I(x_2)\Psi_I(x_3)\cdots\} = N\{\Psi_I(x_1)\bar\Psi_I(x_2)\Psi_I(x_3)\cdots + \text{all possible contractions}\}. \tag{26.142} \]
Example:
\[ \langle 0|T\{\Psi_{Ia}(x_1)\bar\Psi_{Ib}(x_2)\Psi_{Ic}(x_3)\bar\Psi_{Id}(x_4)\}|0\rangle = S_F(x_1 - x_2)_{ab}S_F(x_3 - x_4)_{cd} - S_F(x_1 - x_4)_{ad}S_F(x_3 - x_2)_{cb}. \tag{26.143} \]
Expanding $\langle 0|T\{\Psi_{Ia}(x)\bar\Psi_{Ib}(y)\phi_I(z)\exp[-i\int_{-T}^{T}dt\,H_I]\}|0\rangle$ to first order in $g_0$, we have
\[ \langle 0|T\{\Psi_{Ia}(x)\bar\Psi_{Ib}(y)\phi_I(z)\,(-ig_0)\int d^4w\,\bar\Psi_I(w)\Psi_I(w)\phi_I(w)\}|0\rangle \]
\[ = -(-ig_0)\,S_F(x - y)_{ab}\int d^4w\,D_F(z - w)\,\mathrm{Tr}[S_F(w - w)] + (-ig_0)\int d^4w\,[S_F(x - w)S_F(w - y)]_{ab}\,D_F(w - z). \]
The Feynman rules for Yukawa theory to evaluate Feynman diagrams are:
1. For each fermion propagator from $y$ to $x$, $P = S_F(x - y)$.
2. For each scalar propagator, $P = D_F(x - y)$.
3. For each vertex, $V = (-ig_0)\int d^4w$.
4. For each external point, $E = 1$.
5. Divide by the symmetry factor.
where the $c$s are complex numbers, or, equivalently, $c_{i_1i_2\cdots i_k}$ is a complex-valued, completely
antisymmetric tensor of rank $k$. Again, the $\theta_i$ can be clearly seen here to be playing the role of
basis vectors of a vector space. Observe that the Grassmann algebra generated by $n$ linearly
independent Grassmann variables has dimension $2^n$; this follows from the binomial theorem
applied to the above sum, and the fact that the $(n+1)$-fold product of variables must vanish, by
the anti-commutation relations above. In other words, for $n$ variables, the sum terminates
\[ \Lambda = \mathbb{C} \oplus \Lambda^1(V) \oplus \Lambda^2(V) \oplus \cdots \oplus \Lambda^n(V), \tag{26.149} \]
where $\Lambda^k(V)$ is the $k$-fold alternating product. The dimension of $\Lambda^k(V)$ is given by $n$ choose $k$,
the binomial coefficient. The special case of $n = 1$ is called the dual numbers, and was introduced
by William Clifford in 1873.
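Multi-variable integration:
\[ \int d\theta\,d\eta\,\eta\theta \equiv 1. \tag{26.151} \]
Complex conjugation:
\[ (\theta\eta)^* \equiv \eta^*\theta^* = -\theta^*\eta^*, \qquad \int d\theta^*d\theta\,\theta\theta^* \equiv 1. \tag{26.152} \]
Gaussian integral:
\[ \int d\theta^*d\theta\,e^{-\theta^*b\theta} = b, \qquad \int d\theta^*d\theta\,\theta\theta^*e^{-\theta^*b\theta} = 1. \tag{26.153} \]
Linear transformation:
\[ \prod_i\theta_i' = (\det A)\Big(\prod_i\theta_i\Big) \quad\text{where}\quad \theta_i' = A_{ij}\theta_j \tag{26.154} \]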
the only term of $f(\theta)$ that survives has exactly one factor of each $\theta_i$ and $\theta_i^*$; it is proportional to
$(\prod_i\theta_i)(\prod_i\theta_i^*)$. If we replace $\theta$ by $U\theta$ where $U$ is a unitary transformation, this term acquires
a factor of $\det U\det U^* = 1$, so the integral is unchanged under the unitary transformation.
Using this result, we can derive the Gaussian integral over multiple complex Grassmann numbers:
\[ \Big(\prod_i\int d\theta_i^*d\theta_i\Big)e^{-\theta_i^*B_{ij}\theta_j} = \det B, \qquad \Big(\prod_i\int d\theta_i^*d\theta_i\Big)\theta_k\theta_l^*\,e^{-\theta_i^*B_{ij}\theta_j} = (B^{-1})_{kl}\det B. \tag{26.156} \]
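The Gaussian formulas in 26.156 can be verified with a small, explicit Grassmann algebra. The sketch below is an illustrative toy implementation, not a library API: an element is a dictionary mapping a sorted tuple of generator labels to a coefficient, and the Berezin integral is implemented as "move the generator to the front, then strip it", which reproduces $\int d\theta\,d\eta\,\eta\theta = 1$. All function names are hypothetical.

```python
import math
import numpy as np

def multiply_monomials(a, b):
    """Product of two sorted generator tuples: (sign, sorted tuple), or (0, None) if it vanishes."""
    if set(a) & set(b):
        return 0, None
    seq = list(a + b)
    sign = 1
    # Bubble sort; each adjacent swap of anticommuting generators flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def multiply(x, y):
    """Product of two Grassmann-algebra elements."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            sign, mono = multiply_monomials(a, b)
            if sign:
                out[mono] = out.get(mono, 0) + sign * ca * cb
    return out

def integrate(x, g):
    """Berezin integral over generator g: move g to the front, then strip it."""
    out = {}
    for mono, c in x.items():
        if g in mono:
            k = mono.index(g)
            rest = mono[:k] + mono[k + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** k * c
    return out

def theta(i):
    return 2 * i          # label of theta_i

def theta_star(i):
    return 2 * i + 1      # label of theta_i^*

def gaussian(B, insert=None):
    """Integrate exp(-theta^* B theta), optionally with theta_k theta_l^* inserted."""
    n = B.shape[0]
    X = {}
    for i in range(n):
        for j in range(n):
            sign, mono = multiply_monomials((theta_star(i),), (theta(j),))
            X[mono] = X.get(mono, 0) - sign * B[i, j]
    exp_x = {(): 1.0}
    power = {(): 1.0}
    for k in range(1, n + 1):          # the exponential series terminates at order n
        power = multiply(power, X)
        for mono, c in power.items():
            exp_x[mono] = exp_x.get(mono, 0) + c / math.factorial(k)
    integrand = exp_x
    if insert is not None:
        k_idx, l_idx = insert
        prefactor = multiply({(theta(k_idx),): 1.0}, {(theta_star(l_idx),): 1.0})
        integrand = multiply(prefactor, exp_x)
    for i in range(n):
        integrand = integrate(integrand, theta(i))        # inner integral d theta_i
        integrand = integrate(integrand, theta_star(i))   # outer integral d theta_i^*
    return integrand.get((), 0.0)

B2 = np.array([[1.0, 2.0], [3.0, 4.0]])
print(gaussian(B2), np.linalg.det(B2))
print(gaussian(B2, insert=(0, 1)), np.linalg.inv(B2)[0, 1] * np.linalg.det(B2))
```

Because each pair $\int d\theta_i^*d\theta_i$ is an even operation, the order in which the pairs are integrated does not affect the result.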
It follows that
\[ \bar\Psi'(x) = \bar\Psi(x) - i\int d^4y\,\bar\eta(y)S_F(y - x). \tag{26.160} \]
(26.161)
So we have
\[ Z[\bar\eta, \eta] = Z[0]\exp\left[-\int d^4x\,d^4y\,\bar\eta(x)S_F(x - y)\eta(y)\right]. \tag{26.162} \]
\[ \cdots \times \sum_{P_2=0}^{\infty}\frac{1}{P_2!}\left[-\int d^4y_2\,d^4z_2\,\bar\eta(y_2)S_F(y_2 - z_2)\eta(z_2)\right]^{P_2}. \tag{26.166} \]
If we focus on a term with particular $V$, $P_1$ and $P_2$, the number of surviving scalar sources will
be $E_1 = 2P_1 - V$ and the number of fermion sources $E_2 = 2P_2 - 2V$. Those terms can be organized using
Feynman diagrams. In these diagrams, a dashed line segment stands for a scalar propagator
$D_F(x - y)$, a line with an arrow pointing from $y$ to $x$ for a fermion propagator $S_F(x - y)$, a
filled circle at one end of a dashed line segment for a scalar source $i\int d^4x\,J(x)$, a filled circle at
the start of a line with an arrow for a fermion source $i\int d^4x\,\eta(x)$, a filled circle at the end of a
line with an arrow for an anti-fermion source $i\int d^4x\,\bar\eta(x)$, and a vertex joining three line segments
for $-ig_0\int d^4x$.
where $m$ is the physical mass of the fermion, and $Z_2$ is the probability for the quantum field
to create or annihilate an exact one-particle eigenstate of $H$, defined through
\[ \langle\Omega|\Psi(0)|\mathbf{p}, s, b\rangle = \sqrt{Z_2}\,u_s(\mathbf{p}), \qquad \langle\mathbf{p}, s, d|\Psi(0)|\Omega\rangle = \sqrt{Z_2}\,v_s(\mathbf{p}). \tag{26.168} \]
Scattering amplitudes of interacting fermions and antifermions can be evaluated using the following
reduction formula.
\[ \times\,[\bar u_{s_1}(\mathbf{p}_1)(\not p_1 + m)]\cdots[\bar u_{s_n}(\mathbf{p}_n)(\not p_n + m)] \times [\bar v_{\bar r_1}(\bar{\mathbf{k}}_1)(\not{\bar k}_1 - m)]\cdots[\bar v_{\bar r_{\bar m}}(\bar{\mathbf{k}}_{\bar m})(\not{\bar k}_{\bar m} - m)] \]
\[ \times\,\langle\Omega|T\{\Psi(x_1)\cdots\Psi(x_n)\,\bar\Psi(\bar x_1)\cdots\bar\Psi(\bar x_{\bar n})\,\bar\Psi(y_1)\cdots\bar\Psi(y_m)\,\Psi(\bar y_1)\cdots\Psi(\bar y_{\bar m})\}|\Omega\rangle \]
\[ \times\,[(\not k_1 + m)u_{r_1}(\mathbf{k}_1)]\cdots[(\not k_m + m)u_{r_m}(\mathbf{k}_m)] \times [(\not{\bar p}_1 - m)v_{\bar s_1}(\bar{\mathbf{p}}_1)]\cdots[(\not{\bar p}_{\bar n} - m)v_{\bar s_{\bar n}}(\bar{\mathbf{p}}_{\bar n})]. \tag{26.169} \]
From equation 26.169, we can see that the scattering amplitude vanishes unless $n - \bar n = m - \bar m$,
implying the conservation of charge. Terms like $e^{ipx}$ impose the condition
of momentum conservation, and terms like $\not p \pm m$ remove external legs in Feynman
diagrams. A formal derivation of the reduction formula for fermions can be found in section 2.7
of Advanced Quantum Field Theory (Jorge Crispim Romão).
Now we can list the Feynman rules of Yukawa theory which can be used to evaluate scattering
amplitudes.
1. For each incoming electron, draw a solid line with an arrow pointed towards the vertex,
and label it with the electron’s four-momentum, ki .
2. For each outgoing electron, draw a solid line with an arrow pointed away from the ver-
tex, and label it with the electron’s four-momentum, pi .
3. For each incoming positron, draw a solid line with an arrow pointed away from the
vertex, and label it with minus the positron’s four-momentum, −k̄i .
4. For each outgoing positron, draw a solid line with an arrow pointed towards the vertex,
and label it with minus the positron’s four-momentum, −p̄i .
5. For each incoming scalar, draw a dashed line with an arrow pointed towards the vertex,
and label it with the scalar’s four-momentum, qi .
6. For each outgoing scalar, draw a dashed line with an arrow pointed away from the vertex,
and label it with the scalar’s four-momentum, qi′ .
7. The only allowed vertex joins two solid lines, one with an arrow pointing towards it and
one with an arrow pointing away from it, and one dashed line (whose arrow can point
in either direction). Using this vertex, join up all the external lines, including extra
internal lines as needed. In this way, draw all possible diagrams that are topologically
inequivalent.
8. Assign each internal line its own four-momentum. Think of the four-momenta as flow-
ing along the arrows, and conserve four-momentum at each vertex. For a tree diagram,
this fixes the momenta on all the internal lines.
10. Spinor indices are contracted by starting at one end of a fermion line: specifically, the
end that has the arrow pointing away from the vertex. The factor associated with the
external line is either ū or v̄. Go along the complete fermion line, following the arrows
backwards, and write down (in order from left to right) the factors associated with the
vertices and propagators that you encounter. The last factor is either a u or v. Repeat
this procedure for the other fermion lines, if any.
11. The overall sign of a tree diagram is determined by drawing all contributing diagrams
in a standard form: all fermion lines horizontal, with their arrows pointing from left to
right, and with the left endpoints labeled in the same fixed order (from top to bottom); if
the ordering of the labels on the right endpoints of the fermion lines in a given diagram
is an even (odd) permutation of an arbitrarily chosen fixed ordering, then the sign of
that diagram is positive (negative).
12. Each closed fermion loop contributes an extra minus sign.
13. The value of $i\mathcal{M}$ is given by a sum over the values of the contributing diagrams.
14. $\langle f|iT|i\rangle = (Z_1)^{n_{\rm sca}/2}(Z_2)^{n_{\rm fer}/2}\,i\mathcal{M}\,\delta(\sum p_f - \sum p_i)$.
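When evaluating closed fermion loops, we need to calculate the trace of the product of $n$
gamma matrices. Here, we list some frequently used formulas:
\[ \mathrm{Tr}[\text{odd no. of }\gamma^\mu\text{s}] = 0, \qquad \mathrm{Tr}[\gamma_5\,(\text{odd no. of }\gamma^\mu\text{s})] = 0; \tag{26.170a} \]
\[ \mathrm{Tr}[\gamma^\mu\gamma^\nu] = -4\eta^{\mu\nu}, \qquad \mathrm{Tr}[\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma] = 4(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho}); \tag{26.170b} \]
\[ \mathrm{Tr}\,\gamma_5 = 0, \qquad \mathrm{Tr}[\gamma_5\gamma^\mu\gamma^\nu] = 0, \qquad \mathrm{Tr}[\gamma_5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma] = -4i\varepsilon^{\mu\nu\rho\sigma}. \tag{26.170c} \]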
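The trace formulas 26.170 can be spot-checked by brute force over all index combinations. This is a minimal sketch assuming a mostly-plus metric, the numerical form $\bar\sigma^\mu = (I, -\sigma_i)$ in the gamma matrices of 26.33, and the sign convention $\varepsilon^{0123} = +1$; the helper names are hypothetical.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])      # assumed mostly-plus metric
sigma = [I2] + pauli
sigma_bar = [I2] + [-s for s in pauli]     # assumed numerical form (I, -sigma_i)
gamma = [np.block([[Z2, sigma[mu]], [sigma_bar[mu], Z2]]) for mu in range(4)]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

def trace_of(*matrices):
    """Trace of the ordered product of the given 4x4 matrices."""
    result = np.eye(4, dtype=complex)
    for mat in matrices:
        result = result @ mat
    return np.trace(result)

def levi_civita_upper(indices):
    """eps^{mu nu rho sigma} with eps^{0123} = +1 (assumed sign convention)."""
    if len(set(indices)) < 4:
        return 0
    inversions = sum(indices[i] > indices[j] for i in range(4) for j in range(i + 1, 4))
    return -1 if inversions % 2 else 1

idx = range(4)
odd_ok = all(np.isclose(trace_of(gamma[a]), 0) for a in idx) and all(
    np.isclose(trace_of(gamma[a], gamma[b], gamma[c]), 0) for a, b, c in product(idx, repeat=3)
)
two_ok = all(
    np.isclose(trace_of(gamma[a], gamma[b]), -4 * eta[a, b]) for a, b in product(idx, repeat=2)
)
four_ok = all(
    np.isclose(
        trace_of(gamma[a], gamma[b], gamma[c], gamma[d]),
        4 * (eta[a, b] * eta[c, d] - eta[a, c] * eta[b, d] + eta[a, d] * eta[b, c]),
    )
    for a, b, c, d in product(idx, repeat=4)
)
gamma5_ok = (
    np.isclose(trace_of(gamma5), 0)
    and all(np.isclose(trace_of(gamma5, gamma[a], gamma[b]), 0) for a, b in product(idx, repeat=2))
    and all(
        np.isclose(
            trace_of(gamma5, gamma[a], gamma[b], gamma[c], gamma[d]),
            -4j * levi_civita_upper((a, b, c, d)),
        )
        for a, b, c, d in product(idx, repeat=4)
    )
)
print(odd_ok, two_ok, four_ok, gamma5_ok)
```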
Recall that if we have $n$ complex Grassmann variables $\theta_i$, then we can evaluate Gaussian
integrals by the general formula
\[ \int d^n\theta^*\,d^n\theta\,\exp[-i\theta_i^*M_{ij}\theta_j] \propto \det M. \tag{26.174} \]
In the case of the functional integral in equation 26.173, the “matrix” $M$ becomes
Define
\[ M_{0\alpha\beta}(x, y) \equiv (-i\not\partial_x + m)_{\alpha\beta}\,\delta(x - y), \qquad M_{1\beta\gamma}(y, z) \equiv \delta_{\beta\gamma}\,\delta(y - z) + igS_F(y - z)_{\beta\gamma}\,\phi(z). \tag{26.176} \]
It follows that
\[ \log Z[\phi] = \log\det(M_0M_1) = -\sum_{n=1}^{\infty}\frac{1}{n}\mathrm{Tr}\,G^n + \text{constant}, \tag{26.179} \]
where
\[ \mathrm{Tr}\,G^n = (-ig)^n\int d^4x_1\cdots d^4x_n\,\mathrm{tr}\,S_F(x_1 - x_2)\phi(x_2)\cdots S_F(x_n - x_1)\phi(x_1). \tag{26.180} \]
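The expansion in 26.179 rests on the matrix identity $\log\det(1 - G) = -\sum_{n\ge 1}\mathrm{Tr}\,G^n/n$, valid when the series converges. The following sketch checks a finite-dimensional analogue with an arbitrary small matrix `G`; it is only an illustration of the identity, not of the functional determinant itself.

```python
import numpy as np

# Arbitrary small matrix; its norm is below 1 so that the series converges.
G = 0.1 * np.array([
    [1.0, 2.0, 0.0, -1.0],
    [0.0, 1.0, 3.0, 1.0],
    [2.0, -1.0, 1.0, 0.0],
    [1.0, 0.0, -2.0, 1.0],
])

# Left-hand side: log det(1 - G), computed stably via slogdet.
sign, log_det = np.linalg.slogdet(np.eye(4) - G)

# Right-hand side: -sum_{n>=1} Tr(G^n) / n, truncated at a large order.
series = -sum(np.trace(np.linalg.matrix_power(G, n)) / n for n in range(1, 80))

print(sign, log_det, series)
```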
To better understand what it means, we will rederive it in a different way. Consider treating the
$-g\phi\bar\Psi\Psi$ term in $\mathcal{L}$ as an interaction. This leads to a vertex that connects two $\Psi$ propagators;
the associated vertex factor is $-ig\phi(x)$. Then $\log Z[\phi]$ is given by
\[ \log Z[\phi] = \sum_I C_I, \tag{26.181} \]
where $C_I$ represents a connected diagram without external sources. The only connected diagrams
we can draw with these Feynman rules are fermion loops with $n$ vertices, where $n \ge 1$.
The diagram with $n$ vertices has an $n$-fold cyclic symmetry, leading to a symmetry factor of
$S = n$. The closed fermion loop implies a trace over the spinor indices. Thus the value of the
$n$-vertex diagram is
\[ \frac{1}{n}(-ig)^n\int d^4x_1\cdots d^4x_n\,\mathrm{tr}\,S_F(x_1 - x_2)\phi(x_2)\cdots S_F(x_n - x_1)\phi(x_1). \tag{26.182} \]
Summing up these diagrams, we find that we are missing the overall minus sign in equation
26.179. The appropriate conclusion is that we must associate an extra minus sign with each
closed fermion loop.
Chapter 27
Vector Field
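where
\[ \Lambda = \exp\left(\frac{i}{2}\theta_{\rho\sigma}S_V^{\rho\sigma}\right), \qquad (S_V^{\rho\sigma})^\mu{}_\nu \equiv -i(\eta^{\rho\mu}\delta^\sigma{}_\nu - \eta^{\sigma\mu}\delta^\rho{}_\nu). \tag{27.2} \]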
The vector representation of Lorentz group is equivalent to the (2, 2) representation, as (2, 2)
contains j = 0 and 1, which is just right for a four-vector, whose time component is a scalar
under spatial rotations, and whose space components are a three-vector.
The electromagnetic field is a vector field. The Lagrangian density of the free EM field is
\[ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} \quad\text{where}\quad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu, \quad A^\mu \equiv (\phi, \mathbf{A}), \tag{27.3} \]
leading to the field equation
\[ \partial_\mu F^{\mu\nu} = 0. \tag{27.4} \]
The EM field $A_\mu$ has 4 components, which would naively seem to tell us that it has 4 degrees of
freedom. But there are two related comments which will ensure that quantizing the gauge
field $A_\mu$ gives rise to 2 degrees of freedom, rather than 4:
• The field $A_0$ has no kinetic term $\dot A_0$ in the Lagrangian: it is not dynamical. This means
that if we are given some initial data $A_i$ and $\dot A_i$ at a time $t_0$, the field $A_0$ will be fully
determined by $\partial_\mu F^{\mu 0} = 0$, which, expanding out, reads
\[ \nabla^2A_0 = \nabla\cdot\frac{\partial\mathbf{A}}{\partial t}. \tag{27.5} \]
Thus $A_0$ is not independent: we do not get to specify $A_0$ on the initial time slice.
• The Lagrangian density of the EM field is invariant under the gauge transformation
\[ A_\mu \to A_\mu + \partial_\mu\lambda(x). \tag{27.6} \]
The seemingly infinite number of symmetries, one for each function $\lambda(x)$, is to be viewed
as a redundancy in our description. That is, two states related by a gauge symmetry are
to be identified: they are the same physical state. One way to see that this interpretation
is necessary is to notice that the field equation is not sufficient to specify the evolution of
$A_\mu$. The equation reads
\[ (\eta_{\mu\nu}\partial^2 - \partial_\mu\partial_\nu)A^\nu = 0. \tag{27.7} \]
But the operator (ηµν ∂ 2 − ∂µ ∂ν ) is not invertible: it annihilates any function of the form
∂µ λ. This means that given any initial data, we have no way to uniquely determine Aµ
at a later time since we can not distinguish between Aµ and Aµ + ∂µ λ. This would
be problematic if we thought that Aµ is a physical object. However, if we are happy to
identify Aµ and Aµ +∂µ λ as corresponding to the same physical state, then our problems
disappear.
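The non-invertibility of the operator in 27.7 is easy to see in momentum space, where $\partial_\mu \to ip_\mu$ turns it (up to an overall sign) into the matrix $\eta_{\mu\nu}p^2 - p_\mu p_\nu$. The sketch below assumes a mostly-plus metric and uses an arbitrary illustrative momentum; it checks that the matrix annihilates $p^\nu$, the momentum-space image of a pure gauge $\partial_\mu\lambda$, and is therefore singular.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])        # assumed mostly-plus metric
p_up = np.array([0.9, 0.2, -0.5, 1.4])      # arbitrary off-shell momentum p^mu
p_low = eta @ p_up                          # p_mu
p2 = p_up @ p_low                           # p^2 = p_mu p^mu

# Momentum-space form of eta_{mu nu} d^2 - d_mu d_nu, up to an overall sign
M = eta * p2 - np.outer(p_low, p_low)

annihilates_gauge_mode = np.allclose(M @ p_up, 0)   # M_{mu nu} p^nu = 0
singular = bool(np.isclose(np.linalg.det(M), 0))    # hence no inverse exists
rank = np.linalg.matrix_rank(M)
print(annihilates_gauge_mode, singular, rank)
```

For a generic off-shell momentum the rank is 3: only the pure-gauge direction is annihilated.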
The picture that emerges for the theory of electromagnetism is of an enlarged phase space,
foliated by gauge orbits. All states that lie along a given gauge orbit can be reached by a gauge
transformation and are identified. To make progress, we pick a representative from each gauge
orbit. It does not matter which representative we pick; after all, they are all physically equivalent.
But we should make sure that we pick a “good” gauge, in which we cut the orbits. Here
we will look at two different gauges:
• Lorenz Gauge: ∂ µ Aµ = 0.
• Coulomb Gauge: ∇ · A = 0.
We can make use of the residual gauge transformations in Lorenz gauge to pick ∇·A =
0. Since A0 is fixed by equation 27.5, we have as a consequence A0 = 0. (A0 = 0 will no
longer hold in Coulomb gauge in the presence of charged matter.) The 3 components
of A satisfy a single constraint ∇ · A = 0, leaving behind just 2 degrees of freedom.
These will be identified with the two polarization states of the photon.
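\[ \pi^0 = \frac{\partial\mathcal{L}}{\partial\dot A_0} = 0, \qquad \pi^i = \frac{\partial\mathcal{L}}{\partial(\partial_0A_i)} = \dot A^i = -E^i. \tag{27.8} \]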
Three pairs of Ai and π i are not independent from each other. They must satisfy the constraint
equations
∇ · A = 0, ∇ · π = 0. (27.11)
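The appropriate commutation relations for the EM field are
\[ [A_i(\mathbf{x}, t), A_j(\mathbf{y}, t)] = 0, \qquad [\pi^i(\mathbf{x}, t), \pi^j(\mathbf{y}, t)] = 0, \]
\[ [A_i(\mathbf{x}, t), \pi^j(\mathbf{y}, t)] = i\left(\delta_i{}^j - \frac{\partial_i\partial^j}{\nabla^2}\right)\delta(\mathbf{x} - \mathbf{y}) \equiv i\int\frac{d^3k}{(2\pi)^3}\left(\delta_i{}^j - \frac{k_ik^j}{\mathbf{k}^2}\right)e^{i\mathbf{k}\cdot(\mathbf{x} - \mathbf{y})}. \tag{27.12} \]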
It can be verified that
\[ \dot A_i = -i[A_i(\mathbf{x}, t), H] = \pi_i(\mathbf{x}, t), \qquad \dot\pi^i = -i[\pi^i(\mathbf{x}, t), H] = \nabla^2A^i(\mathbf{x}, t), \tag{27.13} \]
which is consistent with the field equation.
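Using the field equation and gauge condition, the EM field can be expanded as
\[ \mathbf{A}(x) = \sum_{r=\pm}\int \widetilde{dp}\,\left[a_r(\mathbf{p})\boldsymbol\epsilon_r(\mathbf{p})e^{ipx} + a_r^\dagger(\mathbf{p})\boldsymbol\epsilon_r^*(\mathbf{p})e^{-ipx}\right] \quad\text{where}\quad p^2 = 0, \quad \boldsymbol\epsilon\cdot\mathbf{p} = 0. \tag{27.14} \]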
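We will stick to the normalization
\[ \boldsymbol\epsilon_r\cdot\boldsymbol\epsilon_s^* = \delta_{rs}. \tag{27.15} \]
The completeness relation for the polarization vectors is
\[ \sum_{r=\pm}\epsilon_r^i(\mathbf{p})\,\epsilon_r^{*j}(\mathbf{p}) = \delta^{ij} - \frac{p^ip^j}{|\mathbf{p}|^2}. \tag{27.16} \]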
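The normalization 27.15 and the completeness relation 27.16 can be checked for an explicit pair of polarization vectors. The sketch below builds circular polarizations from two real transverse unit vectors; the momentum `p` and the auxiliary vector `helper` are arbitrary illustrative choices, and this particular construction of $\boldsymbol\epsilon_\pm$ is an assumption, since the notes do not fix one here.

```python
import numpy as np

p = np.array([0.3, -1.2, 0.8])            # arbitrary photon three-momentum
p_hat = p / np.linalg.norm(p)

# Two real unit vectors orthogonal to p (any vector not parallel to p works as helper).
helper = np.array([1.0, 0.0, 0.0])
e1 = np.cross(p_hat, helper)
e1 /= np.linalg.norm(e1)
e2 = np.cross(p_hat, e1)

# Circular polarizations labelled r = +1, -1
eps = {
    +1: (e1 + 1j * e2) / np.sqrt(2),
    -1: (e1 - 1j * e2) / np.sqrt(2),
}

transverse_ok = all(np.isclose(eps[r] @ p, 0) for r in (+1, -1))
orthonormal_ok = all(
    np.isclose(eps[r] @ eps[s].conj(), 1.0 if r == s else 0.0)
    for r in (+1, -1) for s in (+1, -1)
)
polarization_sum = sum(np.outer(eps[r], eps[r].conj()) for r in (+1, -1))
projector = np.eye(3) - np.outer(p, p) / (p @ p)
completeness_ok = np.allclose(polarization_sum, projector)
print(transverse_ok, orthonormal_ok, completeness_ok)
```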
Notice that terms eliminated would vanish when S is projected onto p̂.
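Finally, we can derive the Feynman propagator for the EM field in Coulomb gauge,
\[ G_F(x - y)_{ij} \equiv \langle 0|TA_i(x)A_j(y)|0\rangle = \int\frac{d^4p}{(2\pi)^4}\,\frac{-i}{p^2 - i\epsilon}\left(\delta_{ij} - \frac{p_ip_j}{|\mathbf{p}|^2}\right)e^{ip(x-y)}. \tag{27.20} \]
\[ [A_\mu(\mathbf{x}, t), A_\nu(\mathbf{y}, t)] = 0, \qquad [\pi^\mu(\mathbf{x}, t), \pi^\nu(\mathbf{y}, t)] = 0, \qquad [A_\mu(\mathbf{x}, t), \pi^\nu(\mathbf{y}, t)] = i\delta_\mu{}^\nu\,\delta(\mathbf{x} - \mathbf{y}). \tag{27.25} \]
It follows that
\[ [\dot A_\mu(\mathbf{x}, t), \dot A_\nu(\mathbf{y}, t)] = 0, \qquad [A_\mu(\mathbf{x}, t), \dot A_\nu(\mathbf{y}, t)] = i\eta_{\mu\nu}\,\delta(\mathbf{x} - \mathbf{y}). \tag{27.26} \]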
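where $p^2 = 0$ and $\epsilon_\lambda^\mu$ are a set of four independent 4-vectors. We choose $\epsilon_1^\mu$ and $\epsilon_2^\mu$ orthogonal
to $k^\mu$ and $n^\mu = (1, 0, 0, 0)$, such that
\[ \epsilon_\lambda^\mu\,\epsilon^*_{\lambda'\mu} = \delta_{\lambda\lambda'}, \qquad \lambda, \lambda' = 1, 2. \tag{27.28} \]
Finally we choose $\epsilon_0^\mu = n^\mu$. The vectors $\epsilon_1^\mu$ and $\epsilon_2^\mu$ are called transverse polarizations, while
$\epsilon_3^\mu$ and $\epsilon_0^\mu$ are called longitudinal and scalar polarizations, respectively. In general we can show that
\[ \epsilon_\lambda\cdot\epsilon^*_{\lambda'} = \eta_{\lambda\lambda'}, \qquad \eta^{\lambda\lambda'}\epsilon_{\lambda\mu}\,\epsilon^*_{\lambda'\nu} = \eta_{\mu\nu}. \tag{27.30} \]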
showing that the quanta associated with $\lambda = 0$ have a commutation relation with the wrong
sign.
To see the problem with the sign we construct the one-particle state with scalar polarization,
that is
\[ |1\rangle = \int \widetilde{dp}\,f(p)\,a_0^\dagger(\mathbf{p})|0\rangle \tag{27.32} \]
\[ \partial^\mu A_\mu^+|\psi\rangle = 0. \tag{27.34} \]
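where $|\psi_T\rangle$ is obtained from the vacuum with creation operators with transverse polarization
and $|\phi\rangle$ with scalar and longitudinal polarization. To understand the consequences it is enough
to analyze the states $|\phi\rangle$, as $\partial^\mu A_\mu^+$ contains only scalar and longitudinal polarizations:
\[ \partial^\mu A_\mu^+ = i\sum_{\lambda=0,3}\int \widetilde{dp}\,a_\lambda(\mathbf{p})\,(p\cdot\epsilon_\lambda(\mathbf{p}))\,e^{ipx}. \tag{27.36} \]
We can construct $|\phi\rangle$ as a linear combination of states $|\phi_n\rangle$ with $n$ scalar or longitudinal photons:
\[ |\phi\rangle = C_0|\phi_0\rangle + C_1|\phi_1\rangle + \cdots \quad\text{where}\quad |\phi_0\rangle \equiv |0\rangle. \tag{27.39} \]
The states $|\phi_n\rangle$ are eigenstates of the number operator for scalar or longitudinal photons,
\[ N'|\phi_n\rangle = n|\phi_n\rangle \quad\text{where}\quad N' \equiv \int \widetilde{dp}\,\left[a_3^\dagger(\mathbf{p})a_3(\mathbf{p}) - a_0^\dagger(\mathbf{p})a_0(\mathbf{p})\right]. \tag{27.40} \]
Since $n\langle\phi_n|\phi_n\rangle = \langle\phi_n|N'|\phi_n\rangle = 0$, we have $\langle\phi_n|\phi_n\rangle = \delta_{n0}$, i.e., for $n \neq 0$, the state $|\phi_n\rangle$ has
zero norm. So the norm for the general physical state $|\phi\rangle$ is
Define
\[ N_{LS}(\mathbf{p}) \equiv a_3^\dagger(\mathbf{p})a_3(\mathbf{p}) - a_0^\dagger(\mathbf{p})a_0(\mathbf{p}), \qquad N_T(\mathbf{p}) \equiv a_1^\dagger(\mathbf{p})a_1(\mathbf{p}) + a_2^\dagger(\mathbf{p})a_2(\mathbf{p}). \tag{27.42} \]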
We can see that ⟨ψ|H|ψ⟩ / ⟨ψ|ψ⟩ = ⟨ψT |HT |ψT ⟩ / ⟨ψT |ψT ⟩. The arbitrariness of Ci of the
physical states does not affect the physical observables. Only the physical transverse polariza-
tions contribute to the result.
It is important to note that although for the average values of the physical observables only the
transverse polarizations contribute, the scalar and longitudinal polarizations are necessary for
the consistency of the theory. In particular they show up when we consider complete sums
over the intermediate states.
It is easy to verify that $G_F(x - y)_{\mu\nu}$ is the Green’s function of the field equation,
\[ D_\mu\Psi \equiv \partial_\mu\Psi - ie_0A_\mu\Psi. \tag{27.49} \]
\[ \nabla\cdot\mathbf{A} = 0, \qquad \nabla^2A^0 = e_0j^0. \tag{27.52} \]
H = HD + HM + Hint , (27.54)
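where
\[ H_D \equiv \int d^3x\,\left[-\Pi(\vec\alpha\cdot\nabla + i\beta m)\Psi\right], \qquad H_M \equiv \int d^3x\,\frac{1}{2}(\boldsymbol\pi^2 + \mathbf{B}^2), \]
\[ H_{\rm int} \equiv \int d^3x\,\left[-e_0\,\mathbf{j}\cdot\mathbf{A} + \frac{e_0^2}{2}\int d^3x'\,\frac{j^0(\mathbf{x})j^0(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|}\right]. \tag{27.55} \]
The perturbation expansion of the correlation function is given by
\[ \langle\Omega|T\{\Psi(x)\bar\Psi(y)A(z)\}|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\langle 0|T\{\Psi_I(x)\bar\Psi_I(y)A_I(z)\exp[-i\int_{-T}^{T}dt\,H_I]\}|0\rangle}{\langle 0|T\{\exp[-i\int_{-T}^{T}dt\,H_I]\}|0\rangle}, \tag{27.56} \]
where $\int_{-T}^{T}dt\,H_I$ can be written as
\[ -\int d^4x\,e_0\bar\Psi_I\boldsymbol\gamma\Psi_I\cdot\mathbf{A}_I + \int d^4x\,d^4x'\,\frac{e_0^2\,\delta(t - t')}{8\pi|\mathbf{x} - \mathbf{x}'|}\,\bar\Psi_I(\mathbf{x}, t)\gamma^0\Psi_I(\mathbf{x}, t)\,\bar\Psi_I(\mathbf{x}', t')\gamma^0\Psi_I(\mathbf{x}', t'). \tag{27.57} \]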
–350/453– Chapter 27 Vector Field
T {AI (x1 )AI (x2 )AI (x3 ) · · · } = N {AI (x1 )AI (x2 )AI (x3 ) · · · + all possible contractions} .
(27.58)
Example:
⟨0|T {AIi (x1 )AIj (x2 )AIk (x3 )AIl (x4 )}|0⟩
= GF (x1 − x2 )ij GF (x3 − x4 )kl + GF (x1 − x3 )ik GF (x2 − x4 )jl + GF (x1 − x4 )il GF (x2 − x3 )jk .
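The counting behind this example can be sketched in a few lines of Python. The helper `pairings` below is a hypothetical name introduced only for illustration; it enumerates every way of contracting an even number of fields into propagator pairs, which is the combinatorial content of equation 27.58 once the vacuum expectation value removes all terms with uncontracted fields.

```python
def pairings(points):
    """Yield every way to split `points` into unordered pairs (full contractions)."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# Four-field example from the text: three products of two propagators each.
four_point = list(pairings(["x1", "x2", "x3", "x4"]))
for term in four_point:
    print(" * ".join(f"G_F({a} - {b})" for a, b in term))
```

For 2n fields the number of terms is (2n − 1)!!, and an odd number of fields gives no full contraction at all, so the corresponding vacuum expectation value vanishes.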
Now we can derive the Feynman rules for QED. First, we evaluate the term
\[
\Bigl\langle 0\Bigl|T\Bigl\{\Psi_{Ia}(x)\bar\Psi_{Ib}(y)A_{Ii}(z)\,(ie_0)\int d^4w\,\bar\Psi_I(w)\gamma\Psi_I(w)\cdot A_I(w)\Bigr\}\Bigr|0\Bigr\rangle. \tag{27.59}
\]
It seems that Feynman rules in Coulomb gauge would be rather messy. However, the offending
non-local interaction comes from the A0 component of the gauge field, and we could try to
redefine the propagator to include a GF (x − y)00 piece which will capture this term. Since
\[
\frac{i\delta(w^0-w'^0)}{4\pi|w-w'|} = \int\frac{d^4p}{(2\pi)^4}\,\frac{i e^{ip(w-w')}}{|p|^2}, \tag{27.63}
\]
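we can combine the non-local interaction with the transverse photon propagator by defining
a new photon propagator
\[
G_F(p)_{\mu\nu} \equiv
\begin{cases}
\dfrac{i}{|p|^2}, & \mu=\nu=0,\\[2mm]
\dfrac{-i}{p^2-i\epsilon}\Bigl(\delta_{ij}-\dfrac{p_ip_j}{|p|^2}\Bigr), & \mu=i\neq0,\ \nu=j\neq0,\\[2mm]
0, & \text{otherwise}.
\end{cases} \tag{27.64}
\]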
With this propagator, the wavy photon line now carries a µ, ν = 0, 1, 2, 3 index, with the extra
µ = 0 component taking care of the instantaneous interaction.
\[
G_F(p)_{\mu\nu} = \frac{-i\eta_{\mu\nu}}{p^2-i\epsilon} + i(1-\xi)\frac{p_\mu p_\nu}{(p^2-i\epsilon)^2}. \tag{27.66}
\]
This difficulty is due to gauge symmetry. The functional is badly defined because we are re-
dundantly integrating over a continuous infinity of physically equivalent field configurations.
To fix the problem, we would like to isolate the interesting part of the functional integral, which
counts each physical configuration only once. Let G(A) be some function that we wish to set
equal to zero as a gauge-fixing condition. We could constrain the functional integral to cover
only the configurations with G(A) = 0 by inserting a functional delta function, δ[G(A)].
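To do so, we insert 1 in the path integral:
\[
1 = \int \mathcal{D}\alpha(x)\,\delta\{G[A(\alpha)]\}\det\Bigl(\frac{\delta G}{\delta\alpha}\Bigr) \quad\text{where}\quad A_\mu[\alpha(x)] = A_\mu(x) + \frac{1}{e_0}\partial_\mu\alpha(x). \tag{27.71}
\]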
We set the gauge-fixing function to $G(A) = \partial^\mu A_\mu - \omega(x)$, so that
\[
G[A(\alpha)] = \partial^\mu A_\mu + \frac{1}{e_0}\partial^2\alpha - \omega(x). \tag{27.72}
\]
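Since $\det(\delta G/\delta\alpha) = \det(\partial^2/e_0)$ is independent of $A_\mu(x)$ and $\alpha(x)$, we have
\[
Z[0] = \det\Bigl(\frac{\delta G}{\delta\alpha}\Bigr)\int\mathcal{D}\alpha\int\mathcal{D}A\,e^{iS[A]}\,\delta\{G[A(\alpha)]\}. \tag{27.73}
\]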
Now change the integration variable from A to A(α). This is a simple shift, so DA = DA(α).
Also, by gauge invariance, S[A] = S[A(α)]. Since A(α) is now just a dummy integration
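variable, we can rename it back to A, leading to
\[
Z[0] = \det\Bigl(\frac{\delta G}{\delta\alpha}\Bigr)\int\mathcal{D}\alpha\int\mathcal{D}A\,e^{iS[A]}\,\delta[\partial^\mu A_\mu - \omega(x)]. \tag{27.74}
\]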
27.4 Path integral quantization –353/453–
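\[
\begin{aligned}
Z[J] \propto{}& \sum_{V=0}^{\infty}\frac{1}{V!}\Bigl[ie_0\int d^4x\,\Bigl(\frac1i\frac{\delta}{\delta J^\mu(x)}\Bigr)\cdot\Bigl(-\frac1i\frac{\delta}{\delta\eta(x)}\Bigr)\cdot\gamma^\mu\cdot\Bigl(\frac1i\frac{\delta}{\delta\bar\eta(x)}\Bigr)\Bigr]^{V}\\
&\times\sum_{P_1=0}^{\infty}\frac{1}{P_1!}\Bigl[-\frac12\int d^4y_1\,d^4z_1\,J(y_1)G_F(y_1-z_1)J(z_1)\Bigr]^{P_1}\\
&\times\sum_{P_2=0}^{\infty}\frac{1}{P_2!}\Bigl[-\int d^4y_2\,d^4z_2\,\bar\eta(y_2)S_F(y_2-z_2)\eta(z_2)\Bigr]^{P_2}.
\end{aligned}\tag{27.83}
\]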
If we focus on a term with particular values of V , P1 and P2 , the number of surviving vector
sources will be E1 = 2P1 − V and the number of surviving fermion sources E2 = 2P2 − 2V . Those
terms can be organized using Feynman diagrams. In these diagrams, a wavy line segment stands for a
vector propagator $G_F(x-y)$; a line with an arrow pointing from y to x for a fermion propagator
$S_F(x-y)$; a filled circle at one end of a wavy line segment for a vector source $i\int d^4x\,J(x)$; a
filled circle at the start of a line with an arrow for a fermion source $i\int d^4x\,\eta(x)$; a filled circle
at the end of a line with an arrow for an anti-fermion source $i\int d^4x\,\bar\eta(x)$; and a vertex joining
three line segments for $ie_0\gamma^\mu\int d^4x$.
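As a quick illustration of the source counting E1 = 2P1 − V and E2 = 2P2 − 2V, the sketch below tabulates the surviving sources for a given term. The function name `surviving_sources` is an assumption made for this example; a term is treated as vanishing when there are more functional derivatives than sources for them to act on.

```python
def surviving_sources(V, P1, P2):
    """Return (E1, E2) for a term with V vertices, P1 photon and P2 fermion propagators.

    Each photon propagator brings two J sources and each vertex removes one;
    each fermion propagator brings one eta and one eta-bar and each vertex
    removes one of each. Returns None if the derivatives outnumber the sources.
    """
    E1 = 2 * P1 - V
    E2 = 2 * P2 - 2 * V
    if E1 < 0 or E2 < 0:
        return None
    return E1, E2


# Tree-level vertex: one photon source and two fermion sources survive.
print(surviving_sources(V=1, P1=1, P2=2))
# Tree-level two-vertex term with two external photons and two external fermions.
print(surviving_sources(V=2, P1=2, P2=3))
```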
The term $ie_0\langle\Omega|T\,j^\mu\,\Psi(x_1)\bar\Psi(x_2)|\Omega\rangle$ can be represented by diagram 27.3.
Using diagram 27.3 and equation 27.86, we obtain an identity represented by diagram 27.4,
called the Ward-Takahashi identity.
Figure 27.4: Feynman diagram representation of Ward identity. Notice that the external leg
of photon is cut-off, while external legs of fermion remain.
Diagram 27.4 can be generalized to the case with n external fermions. Another proof of the
Ward-Takahashi identity, obtained by analyzing Feynman diagrams directly, can be found in section 7.4
of An Introduction to Quantum Field Theory (M. E. Peskin & D. V. Schroeder).
Define iΠµν to be the sum of all 1-particle-irreducible insertions into the photon propagator.
So we have
\[
G(k) = G_F(k) + G_F(k)[i\Pi(k)]G_F(k) + \cdots = G_F(k)\,\frac{1}{1 - i\Pi(k)G_F(k)}. \tag{27.88}
\]
It follows that
(iG)−1 = (iGF )−1 − Π. (27.89)
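The resummation can be checked numerically in a scalar toy version, where the Lorentz indices are dropped and G_F and Π are treated as plain complex numbers. This is only a sketch: the numerical values below are hypothetical and chosen so that the geometric series converges.

```python
def dressed_propagator(G_F, Pi, terms=200):
    """Sum G_F + G_F (i Pi) G_F + ... term by term (scalar toy model)."""
    total, term = 0j, G_F
    for _ in range(terms):
        total += term
        term *= 1j * Pi * G_F
    return total


G_F = -1j / 2.5   # free propagator -i/k^2 at a hypothetical k^2 = 2.5
Pi = 0.3          # hypothetical value of the 1PI function at that k^2

series = dressed_propagator(G_F, Pi)
closed_form = G_F / (1 - 1j * Pi * G_F)

print(series, closed_form)
# Inverse form of the same statement: (iG)^{-1} = (iG_F)^{-1} - Pi.
print(1 / (1j * closed_form), 1 / (1j * G_F) - Pi)
```

In the full theory the same manipulation is carried out with matrices in the Lorentz indices, which is why the projectors introduced next are convenient.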
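Recall that
\[
iG_F(k)_{\mu\nu} = \frac{1}{k^2-i\epsilon}\bigl(P^T_{\mu\nu} + \xi P^L_{\mu\nu}\bigr) \quad\text{where}\quad P^T_{\mu\nu}\equiv\eta_{\mu\nu}-\frac{k_\mu k_\nu}{k^2},\quad P^L_{\mu\nu}\equiv\frac{k_\mu k_\nu}{k^2}. \tag{27.90}
\]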
We can derive that
\[
(iG_F)^{-1}_{\mu\nu} = k^2\Bigl(P^T_{\mu\nu} + \frac{1}{\xi}P^L_{\mu\nu}\Bigr). \tag{27.91}
\]
We may also decompose $\Pi^{\mu\nu}$ into a transverse part and a longitudinal part,
\[
\Pi^{\mu\nu} = P_T^{\mu\nu}f_T(k^2) + P_L^{\mu\nu}f_L(k^2) = \eta^{\mu\nu}f_T + \frac{k^\mu k^\nu}{k^2}(f_L - f_T). \tag{27.92}
\]
Using equations 27.89, 27.91 and 27.92, we can work out the decomposition of $G(k)_{\mu\nu}$ as
\[
G(k)_{\mu\nu} = \frac{-i}{k^2 - f_T(k^2)}P^T_{\mu\nu} + \frac{-i}{k^2/\xi - f_L(k^2)}P^L_{\mu\nu}. \tag{27.93}
\]
So if fT,L (k 2 = 0) ̸= 0, a mass will be generated for the photon. Because Π(k) comes from
1PI diagrams, it should not be singular at k 2 = 0, and so we have fL − fT = O(k 2 ).
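The scattering amplitude of interacting photons and charged fermions can be evaluated using the
following reduction formula:
\[
\begin{aligned}
\langle p_1\cdots p_n|S|k_1\cdots k_m\rangle ={}& \prod_{i=1}^{n}\int d^4x_i\,e^{-ip_ix_i}\prod_{j=1}^{m}\int d^4y_j\,e^{ik_jy_j}\\
&\times\Bigl(\frac{i}{\sqrt{Z_3}}\Bigr)^{m+n}[p_1^2\,\epsilon^{*\mu_1}_{\lambda_1}(p_1)]\cdots[p_n^2\,\epsilon^{*\mu_n}_{\lambda_n}(p_n)]\,[k_1^2\,\epsilon^{\nu_1}_{\lambda'_1}(k_1)]\cdots[k_m^2\,\epsilon^{\nu_m}_{\lambda'_m}(k_m)]\\
&\times\langle\Omega|T\{A_{\mu_1}(x_1)\cdots A_{\mu_n}(x_n)A_{\nu_1}(y_1)\cdots A_{\nu_m}(y_m)\}|\Omega\rangle.
\end{aligned}\tag{27.105}
\]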
Given the structure of exact propagator and LSZ reduction formula, we can list the Feynman
rules of QED theory which can be used to evaluate scattering amplitudes.
1. For each incoming electron, draw a solid line with an arrow pointed towards the vertex,
and label it with the electron’s four-momentum, pi .
2. For each outgoing electron, draw a solid line with an arrow pointed away from the ver-
tex, and label it with the electron’s four-momentum, p′i .
3. For each incoming positron, draw a solid line with an arrow pointed away from the
vertex, and label it with minus the positron’s four-momentum, −pi .
4. For each outgoing positron, draw a solid line with an arrow pointed towards the vertex,
and label it with minus the positron’s four-momentum, −p′i .
5. For each incoming photon, draw a wavy line with an arrow pointed towards the vertex,
and label it with the photon’s four-momentum, ki .
6. For each outgoing photon, draw a wavy line with an arrow pointed away from the vertex,
and label it with the photon’s four-momentum, ki′ .
7. The only allowed vertex joins two solid lines, one with an arrow pointing towards it and
one with an arrow pointing away from it, and one wavy line. Using this vertex, join
up all the external lines, including extra internal lines as needed. In this way, draw all
possible diagrams that are topologically inequivalent.
8. Assign each internal line its own four-momentum. Think of the four-momenta as flow-
ing along the arrows, and conserve four-momentum at each vertex.
9. The value of a diagram consists of the following factors:
• for each incoming photon, $\epsilon^{\mu}_{\lambda}(k)$; for each outgoing photon, $\epsilon^{*\mu}_{\lambda}(k)$;
• for each incoming electron, $u_r(k)$; for each outgoing electron, $\bar u_s(p)$;
• for each incoming positron, $\bar v_r(k)$; for each outgoing positron, $v_s(p)$;
• for each vertex, $ie_0\gamma^\mu$; for each internal photon, $G_F(p)$; for each internal fermion,
$S_F(p)$.
10. Spinor indices are contracted by starting at one end of a fermion line: specifically, the
end that has the arrow pointing away from the vertex. The factor associated with the
external line is either ū or v̄. Go along the complete fermion line, following the arrows
backwards, and write down (in order from left to right) the factors associated with the
vertices and propagators that you encounter. The last factor is either a u or v. Repeat
this procedure for the other fermion lines, if any. The vector index on each vertex is
contracted with the vector index on either the photon propagator (if the attached photon
line is internal) or the photon polarization vector (if the attached photon line is external).
11. The overall sign of a tree diagram is determined by drawing all contributing diagrams
in a standard form: all fermion lines horizontal, with their arrows pointing from left to
right, and with the left endpoints labeled in the same fixed order (from top to bottom); if
the ordering of the labels on the right endpoints of the fermion lines in a given diagram
is an even (odd) permutation of an arbitrarily chosen fixed ordering, then the sign of
that diagram is positive (negative).
13. Value of iM is given by a sum over the values of the contributing diagrams.
14. $\langle f|iT|i\rangle = (Z_2)^{n_{\text{fer}}/2}(Z_3)^{n_{\text{pho}}/2}\, i\mathcal{M}\,\delta\bigl(\sum p_f - \sum p_i\bigr)$.
Proof: Without loss of generality, we can consider a physical process with a single incoming and a
single outgoing fermion line. The Ward identity represented by diagram 27.4 then states that
Here, F keeps the external fermion legs but cuts external photon lines. According to the LSZ reduction
formula, from each diagram we can get the contribution to an S matrix element by taking the coefficient
When calculating invariant matrix element M, the value of photon propagator depends on
the gauge we used. In Coulomb gauge, the photon propagator is given by equation 27.64. In
Lorenz gauge, the photon propagator becomes equation 27.66.
A general scattering process can be represented by Figure 27.6.
27.7 Renormalization
27.7.1 Renormalized quantum electrodynamics
The superficial degree of divergence of a Feynman diagram in QED is
\[
D = 4 - N_\gamma - \frac32N_e, \tag{27.112}
\]
where Nγ is the number of external photons and Ne is the number of external (anti-)fermions.
Only seven types of diagrams have D ≥ 0, including the vacuum term. However, symmetries
can cause certain terms to cancel, and the divergence of a diagram may be reduced or even
eliminated.
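The count of seven can be reproduced by enumerating the external-leg combinations allowed by equation 27.112. The sketch below assumes that external fermion lines come in pairs (so N_e is even) and lists every combination with D ≥ 0; the names are illustrative only.

```python
from fractions import Fraction


def superficial_degree(n_photon, n_fermion):
    """D = 4 - N_gamma - (3/2) N_e, equation 27.112."""
    return 4 - n_photon - Fraction(3, 2) * n_fermion


# Fermion number conservation forces an even number of external fermion legs.
divergent = [
    (n_photon, n_fermion, int(superficial_degree(n_photon, n_fermion)))
    for n_fermion in range(0, 6, 2)
    for n_photon in range(0, 6)
    if superficial_degree(n_photon, n_fermion) >= 0
]

for n_photon, n_fermion, degree in divergent:
    print(f"N_gamma={n_photon}, N_e={n_fermion}: D={degree}")
```

The seven entries are the vacuum term, the photon one-, two-, three- and four-point functions, the fermion self-energy, and the vertex.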
Under charge conjugation, we have C |Ω⟩ = |Ω⟩ and Cj µ (x)C † = −j µ (x), leading to
According to equation 27.85, the amplitude with Nγ = 1 and Ne = 0 must vanish. Similarly,
the amplitude with Nγ = 3 and Ne = 0 also vanishes.
Consider the scattering amplitude with Nγ = 4. The Ward identity requires that if we
replace any external photon by its momentum vector, the amplitude vanishes: $p_\mu\mathcal{M}^{\mu\nu\sigma\rho} = 0$.
By exhaustion one can show that this condition is satisfied only if the amplitude is proportional
to $\eta^{\mu\nu}p^\sigma - \eta^{\mu\sigma}p^\nu$, with a similar factor for each of the other three legs. Each of these factors
involves one power of momentum, so all terms with fewer than four powers of momentum in the
Taylor series of this amplitude must vanish. The first nonvanishing term has D = 0 − 4 = −4,
and therefore this amplitude is finite.
As discussed in section 27.5, the transverse part of the photon propagator is proportional to
$(\eta_{\mu\nu}p^2 - p_\mu p_\nu)$. Viewing this expression as a Taylor series in p, we see that the constant and
linear terms both vanish, lowering the superficial degree of divergence from 2 to 0. The divergence
is only logarithmic.
Neglecting the vacuum term, there are only three divergent amplitude terms left, as shown in
Figure 27.7. We need four counterterms to eliminate all the divergence.
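Define
\[
\begin{gathered}
A_r \equiv Z_3^{-1/2}A,\quad \Psi_r \equiv Z_2^{-1/2}\Psi,\quad \delta_3\equiv Z_3-1,\quad \delta_2\equiv Z_2-1,\\
\delta_m\equiv Z_2m_0-m,\quad \delta_1\equiv Z_1-1\equiv (e_0/e)Z_2Z_3^{1/2}-1,
\end{gathered}\tag{27.114}
\]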
where m is the physical mass of the fermion and e the physical electric charge. The Lagrangian
density then becomes
L = L1 + Lct , (27.115)
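where
\[
\begin{aligned}
\mathcal{L}_1 &= -\frac14F_r^{\mu\nu}F_{r\mu\nu} + i\bar\Psi_r\gamma^\mu\partial_\mu\Psi_r - m\bar\Psi_r\Psi_r + e\bar\Psi_r\gamma^\mu\Psi_rA_{r\mu},\\
\mathcal{L}_{ct} &= -\frac14\delta_3F_r^{\mu\nu}F_{r\mu\nu} + i\delta_2\bar\Psi_r\gamma^\mu\partial_\mu\Psi_r - \delta_m\bar\Psi_r\Psi_r + e\delta_1\bar\Psi_r\gamma^\mu\Psi_rA_{r\mu}.
\end{aligned}\tag{27.116}
\]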
Denote the renormalized 1PI component of the exact photon propagator by $i(\eta^{\mu\nu}q^2 - q^\mu q^\nu)\Pi_r(q^2)$,
the renormalized 1PI component of the exact fermion propagator by $-i\Sigma_r(\not{p})$, and the renormalized
exact amputated photon-fermion-antifermion vertex by $ie\Gamma^\mu_r(p,p')$. We should adjust δ1 ,
δ2 , δ3 and δm as necessary to maintain the following renormalization conditions:
\[
\Sigma_r(\not{p}=-m)=0,\quad \frac{d\Sigma_r}{d\not{p}}\Big|_{\not{p}=-m}=0,\quad \Pi_r(q^2=0)=0,\quad ie\Gamma^\mu_r(p-p'=0)=ie\gamma^\mu. \tag{27.117}
\]
As an aside, the renormalized exact fermion and photon propagators are
\[
S_r(p) = \frac{-i}{\not{p}+m+\Sigma_r(\not{p})},\qquad G_r(q)_{\mu\nu} = \frac{-i}{q^2\,(1-\Pi_r(q^2))}P^T_{\mu\nu}. \tag{27.118}
\]
\[ Z_1 = Z_2 \tag{27.122} \]
in the OS renormalization scheme, leading to $e = \sqrt{Z_3}\,e_0$.
Since the relation between the bare and renormalized electric charge depends only on the EM
field strength renormalization, not on quantities particular to the fermions, there is a universal
electric charge that has the same value for all species.
In the following subsections, we omit the subscript r unless it is necessary to emphasize
the difference between bare fields and renormalized fields.
Figure 27.9: The one-loop and counterterm corrections to the photon propagator.
Fermion propagator
The exact renormalized fermion propagator in OS renormalization can be written as
\[
iS(\not{p}) = \frac{1}{\not{p}+m-i\epsilon} + \int_{m_{th}}^{\infty}ds\,\frac{\rho_\Psi(s)}{\not{p}+\sqrt{s}-i\epsilon}. \tag{27.125}
\]
We see that the first term has a pole at p/ = −m with residue one. This residue corresponds to
the field normalization that is needed for the validity of the LSZ formula. There is a problem,
however: in quantum electrodynamics, the threshold mass mth is m, corresponding to the
contribution of a fermion and a zero energy photon. Thus the second term has a branch point
at p/ = −m. The pole in the first term is therefore not isolated, and its residue is ill defined.
This is a reflection of an underlying infrared divergence, associated with the massless photon.
To deal with it, we must impose an infrared cutoff that moves the branch point away from
the pole. The most direct method is to change the denominator of the photon propagator
from k 2 to k 2 + m2γ , where mγ is a fictitious photon mass. Ultimately, we must deal with this
issue by computing cross sections that take into account detector inefficiencies. In quantum
electrodynamics, we must specify the lowest photon energy ωmin that can be detected. Only
after computing cross sections with extra undetectable photons, and then summing over them,
is it safe to take the limit mγ → 0.
The quantum corrections to the fermion propagator up to one-loop order are described by
Figure 27.10.
Figure 27.10: The one-loop and counterterm corrections to the fermion propagator.
\[
\delta_2 = -\frac{e^2}{8\pi^2}\frac1\epsilon + \text{finite} + O(e^4) \quad\text{and}\quad \delta_m/m = -\frac{e^2}{2\pi^2}\frac1\epsilon + \text{finite} + O(e^4). \tag{27.127}
\]
Imposing the OS renormalization conditions $\Sigma(-m) = 0$ and $\Sigma'(-m) = 0$, we find that
\[
\Sigma(\not{p}) = -\frac{e^2}{8\pi^2}\int_0^1dx\,\Bigl\{\bigl[(1-x)\not{p}+2m\bigr]\ln\frac{D}{D_0} + \kappa_2(\not{p}+m)\Bigr\} + O(e^4), \tag{27.128}
\]
Vertex
The quantum corrections to the vertex up to one-loop order is shown in Figure 27.11.
In particular, we have F1 (0) = 1 and F2 (0) = α/2π + O(α2 ), where α = e2 /4π is the
fine-structure constant.
\[
Z_1 = Z_2 = 1 - \frac{e^2}{8\pi^2}\frac1\epsilon + O(e^4),\quad Z_3 = 1 - \frac{e^2}{6\pi^2}\frac1\epsilon + O(e^4),\quad Z_m = 1 - \frac{e^2}{2\pi^2}\frac1\epsilon + O(e^4). \tag{27.135}
\]
After using dimensional regularization, the infinities coming from loop integrals take the form
of inverse powers of ϵ. In the MS renormalization scheme, we choose the Zs to cancel off these
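powers of 1/ϵ, and nothing more. Therefore the Zs can be expanded as
\[
Z_1 = 1+\sum_{n=1}^{\infty}\frac{a_n(e)}{\epsilon^n},\quad Z_2 = 1+\sum_{n=1}^{\infty}\frac{b_n(e)}{\epsilon^n},\quad Z_3 = 1+\sum_{n=1}^{\infty}\frac{c_n(e)}{\epsilon^n},\quad Z_m = 1+\sum_{n=1}^{\infty}\frac{d_n(e)}{\epsilon^n}. \tag{27.139}
\]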
Using equation 27.135, we can get a1 = b1 = −e2 /8π 2 + O(e4 ), c1 = −e2 /6π 2 + O(e4 ) and
d1 = −e2 /2π 2 + O(e4 ).
Define
\[
E(e,\epsilon) \equiv \ln Z_3^{-1/2} = \sum_{n=1}^{\infty}\frac{E_n(e)}{\epsilon^n}. \tag{27.140}
\]
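We can work out that $E_1 = -c_1/2 = e^2/12\pi^2 + O(e^4)$. As $\ln e_0 = E + \ln e + (\epsilon/2)\ln\tilde\mu$ and
$de_0/d\mu = 0$, we can derive that
\[
\Bigl(1 + \frac{eE_1'}{\epsilon} + \frac{eE_2'}{\epsilon^2} + \cdots\Bigr)\frac{de}{d\ln\mu} + \frac{\epsilon}{2}e = 0. \tag{27.141}
\]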
In a renormalizable theory, de/d ln µ should be finite in the ϵ → 0 limit. Therefore, the beta
function for the charge is
\[
\beta(e) \equiv \frac{de}{d\ln\mu} = -\frac{\epsilon}{2}e + \frac{e^2}{2}E_1'(e) = -\frac{\epsilon}{2}e + \frac{e^3}{12\pi^2} + O(e^5). \tag{27.142}
\]
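To see what this beta function implies, one can integrate it in the ϵ → 0 limit, where de/d ln µ = e³/12π². The sketch below compares a simple Runge-Kutta integration with the closed-form one-loop solution 1/e²(µ) = 1/e²(µ₀) − ln(µ/µ₀)/6π². The starting value of α and the ratio of scales are illustrative assumptions, and only a single charged fermion species is included, as in the text.

```python
import math


def beta(e):
    """One-loop QED beta function in the epsilon -> 0 limit of equation 27.142."""
    return e**3 / (12 * math.pi**2)


def run_coupling(e_start, log_ratio, steps=1000):
    """Integrate de/dln(mu) = beta(e) over ln(mu/mu0) = log_ratio with RK4."""
    h = log_ratio / steps
    e = e_start
    for _ in range(steps):
        k1 = beta(e)
        k2 = beta(e + 0.5 * h * k1)
        k3 = beta(e + 0.5 * h * k2)
        k4 = beta(e + h * k3)
        e += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return e


def run_coupling_exact(e_start, log_ratio):
    """Closed-form solution of the same one-loop equation."""
    return (e_start**-2 - log_ratio / (6 * math.pi**2)) ** -0.5


alpha_low = 1 / 137.036                 # assumed low-energy fine-structure constant
e_low = math.sqrt(4 * math.pi * alpha_low)
log_ratio = math.log(1.0e3)             # hypothetical: run up by a factor 1000 in mu

e_numeric = run_coupling(e_low, log_ratio)
e_exact = run_coupling_exact(e_low, log_ratio)
print(e_low, e_numeric, e_exact)
```

The positive sign of the e³ term means that the coupling grows slowly with the renormalization scale.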
Define
\[
M(e,\epsilon) \equiv \ln\bigl(Z_mZ_2^{-1}\bigr) = \sum_{n=1}^{\infty}\frac{M_n(e)}{\epsilon^n}. \tag{27.143}
\]
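We can work out $M_1 = d_1 - b_1 = -3e^2/8\pi^2 + O(e^4)$. As $\ln m_0 = M(e,\epsilon) + \ln m$ and
$d\ln m_0/d\ln\mu = 0$, we can derive that
\[
\frac{d\ln m}{d\ln\mu} = -\frac{\partial M(e,\epsilon)}{\partial e}\frac{de}{d\ln\mu} = \frac12\bigl(\epsilon e - e^2E_1'\bigr)\sum_{n=1}^{\infty}\frac{M_n'(e)}{\epsilon^n} = \frac{e}{2}M_1' + \cdots \tag{27.144}
\]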
By a similar procedure, we can also derive the anomalous dimensions of the fermion field and
the EM field:
\[
\gamma_2(e) \equiv \frac12\frac{d\ln Z_2}{d\ln\mu} = -\frac14e\frac{db_1}{de} = \frac{e^2}{16\pi^2} + O(e^4), \tag{27.146}
\]
\[
\gamma_3(e) \equiv \frac12\frac{d\ln Z_3}{d\ln\mu} = -\frac14e\frac{dc_1}{de} = \frac{e^2}{12\pi^2} + O(e^4). \tag{27.147}
\]
where
\[
q \equiv p' - p,\qquad \tilde A^{cl}_\mu(q) \equiv \int d^3x\,e^{-iq\cdot x}A^{cl}_\mu(x). \tag{27.150}
\]
If an electron is scattered by the background field and its momentum changes from p to p′ , the
scattering matrix will be
\[
A^{cl}_\mu(x) = (0, A^{cl}_i),\qquad B^{cl} = \nabla\times A^{cl},\qquad \tilde B^{cl} = iq\times\tilde A^{cl}. \tag{27.152}
\]
Using equations 26.80, 26.82 and 26.91, we can work out that
\[
u(p) = \sqrt{m}\begin{pmatrix}(1-p\cdot\sigma/2m)\,\xi\\ (1+p\cdot\sigma/2m)\,\xi\end{pmatrix}. \tag{27.153}
\]
Notice that in quantum field theory, we normalize the momentum eigenvector as $\langle p'|p\rangle =
(2\pi)^3\,2E\,\delta(p-p')$ rather than $\langle p'|p\rangle = (2\pi)^3\delta(p-p')$. So there is an extra factor 2m in
equation 27.156 when compared with equation 27.157, resulting in
\[
\tilde V(q) = -\xi'^\dagger\Bigl\{\frac{e}{m}\bigl[F_1(0)+F_2(0)\bigr]\frac{\sigma_k}{2}\Bigr\}\xi\,\tilde B^{cl}_k(q). \tag{27.158}
\]
Thus, the magnetic moment of the electron is given by
\[
\mu = g_e\frac{e}{2m}S \quad\text{where}\quad g_e = 2F_1(0)+2F_2(0) = 2+\frac{\alpha}{\pi}+O(\alpha^2). \tag{27.159}
\]
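A short numerical sketch makes the size of the correction concrete. Since F2(0) = α/2π at one loop, the combination g_e − 2 = 2F2(0) equals α/π, and the anomaly (g_e − 2)/2 equals α/2π. The value of α used below is an assumed input, not something derived in these notes.

```python
import math

ALPHA = 1 / 137.035999  # assumed value of the fine-structure constant


def form_factors_at_zero(alpha):
    """Return (F1(0), F2(0)) to one-loop order, as quoted in the text."""
    return 1.0, alpha / (2 * math.pi)


F1, F2 = form_factors_at_zero(ALPHA)
g_e = 2 * F1 + 2 * F2
anomaly = (g_e - 2) / 2   # conventional anomalous magnetic moment a_e

print(f"g_e = {g_e:.8f}")
print(f"a_e = (g_e - 2)/2 = {anomaly:.6e}")
```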
Figure 27.12: Scattering of electron by proton in non-relativistic limit. Here, we have |p| ∼
|p′ | ∼ |q| ≪ me and |k| ∼ |k′ | ≪ mp . We should also keep in mind that me ≪ mp so that
loops formed by proton propagators can be neglected.
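\[
i\mathcal{M} = \bar u(k')(-ie\gamma^\mu)u(k)\,\frac{-i\eta_{\mu\nu}}{q^2}\,\bar u(p')(ie\gamma^\nu)u(p) \approx -i\,\frac{-e^2}{q^2}\,2m_e\delta_{ss'}\,2m_p\delta_{rr'}. \tag{27.160}
\]
\[
\tilde V(q) = \frac{-e^2}{q^2},\qquad V(r) = \frac{-e^2}{4\pi r}. \tag{27.161}
\]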
Next let us examine how an electron loop modifies the electromagnetic interaction. Using the
exact photon propagator, we have the modified potential
\[
\tilde V(q) = \frac{-e^2}{q^2\,[1-\Pi(q^2)]}, \tag{27.162}
\]
where
\[
\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1dx\,x(1-x)\ln\frac{m_e^2}{m_e^2+x(1-x)q^2} + O(\alpha^2). \tag{27.163}
\]
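\[
\Pi(q^2) = \frac{\alpha}{15\pi}\frac{q^2}{m_e^2},\qquad \tilde V(q) = -\frac{4\pi\alpha}{q^2} - \frac{4\alpha^2}{15m_e^2}. \tag{27.164}
\]
\[
V(r) = -\frac{\alpha}{r} - \frac{4\alpha^2}{15m_e^2}\,\delta(x). \tag{27.165}
\]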
The correction term indicates that the electromagnetic force becomes much stronger at small
distances. This effect can be measured in the hydrogen atom, where the energy levels are
shifted by
\[
\Delta E = -\frac{4\alpha^2}{15m_e^2}\,|\psi(0)|^2. \tag{27.166}
\]
The wave function is non-zero at the origin only for s-wave states. For the 2S state, the shift is
about $-1.123\times10^{-7}$ eV. This modified potential causes a split for degenerate levels of different
l. This is a (small) part of the Lamb shift splitting.
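The quoted size of the 2S shift can be checked with a few lines of arithmetic. The sketch below assumes the standard hydrogen result |ψ_nS(0)|² = (α m_e)³/(π n³) in natural units, ignores reduced-mass corrections, and uses assumed input values for α and m_e; it is an order-of-magnitude check rather than a precision calculation.

```python
import math

ALPHA = 1 / 137.035999   # assumed fine-structure constant
M_E_EV = 510998.95       # assumed electron mass in eV (natural units)


def psi_squared_at_origin(n, alpha=ALPHA, m=M_E_EV):
    """|psi_nS(0)|^2 = (alpha m)^3 / (pi n^3) for hydrogen, in eV^3."""
    return (alpha * m) ** 3 / (math.pi * n**3)


def vacuum_polarization_shift(n, alpha=ALPHA, m=M_E_EV):
    """Energy shift of the nS level from equation 27.166, in eV."""
    return -4 * alpha**2 / (15 * m**2) * psi_squared_at_origin(n, alpha, m)


shift_2s = vacuum_polarization_shift(2)
print(f"Delta E(2S) = {shift_2s:.4e} eV")
```

Combining the two factors gives ΔE = −4α⁵m_e/(15πn³), so the shift falls off as 1/n³ for higher s-wave levels.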
A more precise correction is given by the Uehling potential
\[
\delta V(r) = -\frac{\alpha^2}{4\sqrt{\pi}\,r}\,\frac{e^{-2m_er}}{(m_er)^{3/2}}. \tag{27.167}
\]
Thus the range of the correction is roughly the electron’s Compton wavelength, $m_e^{-1}$. Since
hydrogen wave functions are nearly constant on this scale, the delta function was a good
approximation. We can interpret the correction as being due to screening. At $r > m_e^{-1}$, virtual
$e^+e^-$ pairs make the vacuum a dielectric medium in which the apparent charge is less than
the true charge. At smaller distances we begin to penetrate the polarization cloud and see the
bare charge. This phenomenon is known as vacuum polarization.
Chapter 28
Gauge Field
where R is an orthogonal matrix with a positive determinant: R⊺ = R−1 and det R = +1.
Consider an infinitesimal SO(N ) transformation
Orthogonality of Rij implies that θij is real and antisymmetric. It is convenient to express θij
in terms of a basis set of hermitian matrices (T a )ij . The index a runs from 1 to N (N − 1)/2,
the number of linearly independent, hermitian, antisymmetric, N × N matrices. Commonly,
these matrices obey the normalization condition
\[ \operatorname{Tr}\bigl(T^aT^b\bigr) = 2\delta^{ab}. \tag{28.4} \]
The numerical factors f abc are the structure coefficients of the group. If f abc = 0, the group is
abelian. Otherwise, it is nonabelian.
If we multiply equation 28.6 on the right by $T^d$, take the trace, and use equation 28.4, we find
\[
f^{abc} = -\frac{i}{2}\operatorname{Tr}\bigl([T^a,T^b]\,T^c\bigr). \tag{28.7}
\]
Using the cyclic property of the trace, we find that f abc must be completely antisymmetric.
Taking the complex conjugate of equation 28.7, we find that f abc must be real.
Example: The simplest nonabelian group is SO(3). In this case, we can choose $(T^a)_{ij} =
-i\epsilon^{aij}$. The commutation relations become
\[ [T^a,T^b] = i\epsilon^{abc}T^c. \tag{28.8} \]
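The SO(3) example can be verified directly. The sketch below builds the three generators from the Levi-Civita symbol, checks the normalization of equation 28.4, and recovers the structure coefficients from the trace formula of equation 28.7. All helper names are illustrative, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def levi_civita(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


# (T^a)_{ij} = -i eps^{aij}: hermitian, antisymmetric 3x3 matrices.
T = [[[-1j * levi_civita(a, i, j) for j in range(3)] for i in range(3)] for a in range(3)]


def structure_coefficient(a, b, c):
    """f^{abc} = -(i/2) Tr([T^a, T^b] T^c), equation 28.7."""
    return (-0.5j * trace(matmul(commutator(T[a], T[b]), T[c]))).real


print(trace(matmul(T[0], T[0])))        # normalization: expect 2
print(structure_coefficient(0, 1, 2))   # expect +1
print(structure_coefficient(1, 0, 2))   # expect -1
```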
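Consider now the theory of N complex scalar fields $\phi_i$, and a Lagrangian density
\[
\mathcal{L} = -\partial_\mu\phi_i^\dagger\partial^\mu\phi_i - m^2\phi_i^\dagger\phi_i - \frac14\lambda\bigl(\phi_i^\dagger\phi_i\bigr)^2. \tag{28.9}
\]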
where U is a unitary matrix, $U^\dagger = U^{-1}$. We can write $U_{ij} = e^{-i\theta}\tilde U_{ij}$, where θ is a real
parameter and $\det\tilde U = 1$; $\tilde U_{ij}$ is called a special unitary matrix. Clearly the product of two
special unitary matrices is another special unitary matrix; the N × N special unitary matrices
form the group SU(N ). The group U(N ) is the direct product of the group U(1) and the group
SU(N ).
where $\theta^a$ is a set of real, infinitesimal parameters. Unitarity of $\tilde U$ implies that the generator
matrices $T^a$ are hermitian, and $\det\tilde U = 1$ implies that each $T^a$ is traceless. The index a runs
from 1 to $N^2-1$, the number of linearly independent, hermitian, traceless, N × N matrices.
We can choose these matrices to obey the normalization condition
\[ \operatorname{Tr}\bigl(T^aT^b\bigr) = \frac12\delta^{ab}. \tag{28.12} \]
Example: For SU(2), we can choose $(T^a)_{ij} = (\sigma^a)_{ij}/2$. The commutation relations become
\[ [T^a,T^b] = i\epsilon^{abc}T^c. \tag{28.13} \]
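The SU(2) example can be checked in the same spirit. The sketch below takes the Pauli matrices as given, forms T^a = σ^a/2, and measures how far the generators are from satisfying tracelessness, the normalization of equation 28.12, and the commutation relations of equation 28.13. The function names are assumptions made for this illustration.

```python
def levi_civita(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[entry / 2 for entry in row] for row in sigma] for sigma in SIGMA]


def commutation_residual(T):
    """Largest entry of [T^a, T^b] - i eps^{abc} T^c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            AB, BA = matmul(T[a], T[b]), matmul(T[b], T[a])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * levi_civita(a, b, c) * T[c][i][j] for c in range(3))
                    worst = max(worst, abs(AB[i][j] - BA[i][j] - rhs))
    return worst


def normalization_residual(T):
    """Largest deviation of Tr(T^a T^b) from delta^{ab}/2."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            target = 0.5 if a == b else 0.0
            worst = max(worst, abs(trace(matmul(T[a], T[b])) - target))
    return worst


print(commutation_residual(T), normalization_residual(T))
```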
28.1 Nonabelian gauge theory –371/453–
terms with derivatives, such as $\partial^\mu\phi_i^\dagger\partial_\mu\phi_i$, will not remain invariant under a local
transformation. Thus we must include a traceless hermitian N × N matrix of fields $A_\mu(x)$, and promote
ordinary derivatives $\partial_\mu$ to covariant derivatives $D_\mu = \partial_\mu - igA_\mu$ to ensure that
\[ D_\mu\phi \to UD_\mu\phi. \tag{28.16} \]
\[ A_\mu(x) \to U(x)A_\mu(x)U^\dagger(x) + \frac{i}{g}U(x)\partial_\mu U^\dagger(x). \tag{28.17} \]
Replacing all ordinary derivatives in L with covariant derivatives renders L gauge invariant.
We can write U (x) in terms of the generator matrices as exp[−igΓa (x)T a ]. If the structure
constant f abc ̸= 0, we have a nonabelian gauge theory.
We still need a kinetic term for $A_\mu(x)$. Let us define the field strength as
\[
F_{\mu\nu}(x) \equiv \frac{i}{g}[D_\mu, D_\nu] = \partial_\mu A_\nu - \partial_\nu A_\mu - ig[A_\mu, A_\nu]. \tag{28.18}
\]
Everything we have just said about SU(N ) also goes through for SO(N ), with unitary replaced
by orthogonal, and traceless replaced by antisymmetric. There is also another class of compact
nonabelian groups called Sp(2N ), and five exceptional compact groups: G(2), F (4), E(6),
E(7) and E(8). Compact means that Tr T a T b is a positive definite matrix. Nonabelian
gauge theory must be based on a compact group, because otherwise some of the terms in Lkin
would have the wrong sign, leading to a Hamiltonian that is unbounded below.
As a specific example, let us consider quantum chromodynamics (QCD), which is based on
the gauge group SU(3). There are several Dirac fields corresponding to quarks. Each quark
comes in three colors; these are the values of the SU(3) index. There are also six flavours: up,
down, strange, charm, bottom, and top. Thus we consider the Dirac field ΨiI (x), where i is
the color index and I is the flavour index. The Lagrangian density is
\[
\mathcal{L} = i\bar\Psi_{iI}\not{D}_{ij}\Psi_{jI} - m_I\bar\Psi_I\Psi_I - \frac12\operatorname{Tr}\bigl(F^{\mu\nu}F_{\mu\nu}\bigr). \tag{28.24}
\]
The different quark flavours have different masses, ranging from a few MeV for the up and
down quarks to 174 GeV for the top quark. The covariant derivative in equation 28.24 is
The index a on Aaµ runs from 1 to 8, and the corresponding massless spin-one particles are the
eight gluons.
In a nonabelian gauge theory in general, we can consider scalar or spinor fields in different
representations of the group. A representation of a compact nonabelian group is a set of finite-
dimensional hermitian matrices TRa that obey the same commutation relations as the original
generator matrices T a . Given such a set of D(R) × D(R) matrices, and a field ϕ(x) with
D(R) components, we can write its covariant derivative as Dµ = ∂µ − igAaµ TRa . Under a
gauge transformation, ϕ(x) → UR (x)ϕ(x). The theory will be gauge invariant provided that
• If such a unitary matrix also does not exist, the representation R is complex.
• One way to prove that a representation is complex is to show that at least one generator
matrix TRa (or a real linear combination of them) has eigenvalues that do not come in
plus-minus pairs.
Example:
• The fundamental representation of SO(N ) is real.
• The fundamental representation for SU(2) is pseudoreal.
• The fundamental representation for SU(N ) with N ≥ 3 is complex.
Notice that
\[
\operatorname{Tr}\,T^e\Bigl(\bigl[[T^a,T^b],T^c\bigr] + \bigl[[T^c,T^a],T^b\bigr] + \bigl[[T^b,T^c],T^a\bigr]\Bigr) = 0. \tag{28.27}
\]
It can be rearranged as
\[ (-if^{abd})(-if^{cde}) - (-if^{cbd})(-if^{ade}) = if^{acd}(-if^{dbe}). \tag{28.29} \]
Define
\[ (T_A^a)^{bc} \equiv -if^{abc}. \tag{28.30} \]
Clearly, TAa is a new representation of the group, called adjoint representation. The dimension
of adjoint representation is equal to the number of the generators. And adjoint representation
is real.
Two related numbers usefully characterize a representation: the index T (R) and the quadratic
Casimir C(R). The index is defined via
\[ \operatorname{Tr}\bigl(T_R^aT_R^b\bigr) \equiv T(R)\,\delta^{ab}. \tag{28.31} \]
The matrix TRa TRa commutes with every generator, and so must be a number times the identity
matrix. This number is the quadratic Casimir C(R). It is easy to show that
With the standard normalization conventions for the generators, we have T (N) = 1/2 for the
fundamental representation of SU(N ) and T (N) = 2 for the fundamental representation of
SO(N ). Using equation 28.32, it follows that C(N) = (N 2 − 1)/2N for SU(N ) and C(N) =
N − 1 for SO(N ).
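These values can be reproduced from a single relation. Taking the trace of $T_R^aT_R^a = C(R)\,\mathbb{1}$ and using the definition of the index gives T(R) D(A) = C(R) D(R), where D(R) is the dimension of the representation and D(A) the number of generators. The sketch below assumes this is the relation referred to as equation 28.32, which is not reproduced in this excerpt; the function names are illustrative.

```python
from fractions import Fraction


def casimir_from_index(index, dim_rep, dim_adjoint):
    """Solve T(R) D(A) = C(R) D(R) for the quadratic Casimir C(R)."""
    return Fraction(index) * dim_adjoint / dim_rep


def casimir_su_fundamental(N):
    # SU(N): T(N) = 1/2, D(N) = N, D(A) = N^2 - 1
    return casimir_from_index(Fraction(1, 2), N, N * N - 1)


def casimir_so_fundamental(N):
    # SO(N): T(N) = 2, D(N) = N, D(A) = N(N - 1)/2
    return casimir_from_index(2, N, N * (N - 1) // 2)


for N in (2, 3, 5):
    print(N, casimir_su_fundamental(N), casimir_so_fundamental(N))
```

For SU(3) this gives C(N) = 4/3, the color factor that appears in one-loop QCD corrections to quark lines.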
A representation R is reducible if there is a unitary transformation TRa → U −1 TRa U that puts
all the nonzero entries into the same diagonal blocks for each a; otherwise it is irreducible.
Consider a reducible representation R whose generators can be put into two blocks, with the
blocks forming the generators of the irreducible representations R1 and R2 . Then R is the
direct sum representation R = R1 ⊕ R2 , and we have
Suppose we have a field ϕiI that carries two group indices, one for the representation R1 and
one for the representation R2 , denoted by i and I respectively. This field is in the direct product
representation R1 ⊗ R2 . The corresponding generator matrix is
\[ (T^a_{R_1\otimes R_2})_{iI,jJ} = (T^a_{R_1})_{ij}\,\delta_{IJ} + \delta_{ij}\,(T^a_{R_2})_{IJ}. \tag{28.34} \]
We then have
Consider a field ϕ in the complex representation R. We will adopt the convention that such
a field carries a down index: $\phi_i$, where $i = 1,\cdots,D(R)$. Hermitian conjugation changes the
representation from R to $\bar R$, and we will adopt the convention that this also raises the index on
the field, $(\phi_i)^\dagger = \phi^{\dagger i}$. Thus a down index corresponds to the representation R, and an up index
to $\bar R$. Indices can be contracted only if one is up and one is down. Generator matrices for R
are then written with the first index down and the second index up: $(T_R^a)_i{}^j$. An infinitesimal
group transformation of $\phi_i$ takes the form
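\[
\phi^{\dagger i} \to \phi^{\dagger i} - i\theta^a(T^a_{\bar R})^i{}_j\,\phi^{\dagger j} = \phi^{\dagger i} + i\theta^a(T^a_R)_j{}^i\,\phi^{\dagger j}. \tag{28.37}
\]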
Consider the Kronecker delta symbol with one index down and one up: $\delta_i{}^j$. Under a group
transformation, we have
\[
\delta_i{}^j \to (1+i\theta^aT^a_R)_i{}^k\,(1+i\theta^aT^a_{\bar R})^j{}_l\,\delta_k{}^l = \delta_i{}^j + O(\theta^2). \tag{28.38}
\]
So $\delta_i{}^j$ is an invariant symbol of the group. The existence of this invariant symbol, which carries
one index for R and one for $\bar R$, tells us that the product of the representations R and $\bar R$ must
contain the singlet representation 1, specified by $T_1^a = 0$. We therefore can write
\[ R\otimes\bar R = 1\oplus\cdots \tag{28.39} \]
The generator matrix $(T_R^a)_i{}^j$, which carries one index for R, one for $\bar R$, and one for the adjoint
representation A, is also an invariant symbol, which implies that
\[ R\otimes\bar R\otimes A = 1\oplus\cdots \tag{28.40} \]
That is, the product of a representation with its complex conjugate is always reducible into a
sum that includes at least the singlet and adjoint representations. Notably, for the fundamental
representation of SU(N ), we have
\[ N\otimes\bar N = 1\oplus A. \tag{28.42} \]
R ⊗ R = 1 ⊕ A ⊕ ··· (28.43)
The singlet on the right-hand side implies the existence of an invariant symbol with two R
indices; this symbol is the Kronecker delta δij . The fact that δij = δji implies that the singlet
on the right-hand side of the equation above appears in the symmetric part of this product of
two identical representations. Remarkably, for the fundamental representation of SO(N ), we
have
\[ N\otimes N = 1_S\oplus A_A\oplus S_S. \tag{28.44} \]
The representation S corresponds to a field with a symmetric traceless pair of fundamental
indices.
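A simple consistency check on these decompositions is to count dimensions: the pieces on the right-hand side must add up to N². The sketch below does this for the SU(N) product of the fundamental with its conjugate and for the SO(N) product of two fundamentals. The function names are illustrative.

```python
def su_product_dims(N):
    """Dimensions of (singlet, adjoint) in N x N-bar for SU(N)."""
    return 1, N * N - 1


def so_product_dims(N):
    """Dimensions of (singlet, adjoint, symmetric traceless) in N x N for SO(N)."""
    singlet = 1
    adjoint = N * (N - 1) // 2                  # antisymmetric pair of indices
    symmetric_traceless = N * (N + 1) // 2 - 1  # symmetric pair minus the trace
    return singlet, adjoint, symmetric_traceless


for N in (2, 3, 4, 5):
    print(N, su_product_dims(N), so_product_dims(N))
```

For SO(3) the result is 3 ⊗ 3 = 1 ⊕ 3 ⊕ 5, the familiar decomposition of two vectors into a scalar, an antisymmetric (vector) part, and a symmetric traceless tensor.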
Consider now a pseudoreal representation R. Since R is equivalent to its complex conjugate,
up to a change of basis, equation 28.43 still holds. However, we cannot identify δij as the
corresponding invariant symbol, because then R would have to be real, rather than pseudoreal.
From the perspective of the direct product, the only alternative is to have the singlet appear
in the antisymmetric part of the product, rather than the symmetric part. The corresponding
invariant symbol must then be antisymmetric on exchange of its two R indices.
An example is the fundamental representation of SU(2). For SU(N ) in general, another
invariant symbol is the Levi-Civita tensor $\epsilon^{i_1\cdots i_N}$, which carries N fundamental indices and
is completely antisymmetric. For SU(2), the Levi-Civita symbol is $\epsilon^{ij} = -\epsilon^{ji}$; this is the
two-index invariant symbol that corresponds to the singlet in the product $2\otimes2 = 1_A\oplus3_S$, where
3 is the adjoint representation.
The structure constants f abc are another invariant symbol. This follows from (TAa )bc = −if abc ,
since we have seen that generator matrices in any representation are invariant. Alternatively,
given the generator matrices in a representation R, we can write
\[ T(R)f^{abc} = -i\operatorname{Tr}\bigl(T_R^a[T_R^b,T_R^c]\bigr). \tag{28.45} \]
Since the right-hand side is invariant, the left-hand side must be as well. If we use an
anticommutator in place of the commutator, we get another invariant symbol,
\[ A(R)d^{abc} = \frac12\operatorname{Tr}\bigl(T_R^a\{T_R^b,T_R^c\}\bigr), \tag{28.46} \]
where A(R) is the anomaly coefficient of the representation. The cyclic property of the trace
implies that $d^{abc}$ is symmetric on exchange of any pair of indices. Using $(T^a_{\bar R})^i{}_j = -(T^a_R)_j{}^i$, we
can see that
\[ A(\bar R) = -A(R). \tag{28.47} \]
We normalize the anomaly coefficient so that it equals one for the smallest complex represen-
tation. In particular, for SU(N ) with N ≥ 3, the smallest complex representation is the fun-
damental, and A(N) = 1. For SU(2), all representations are real or pseudoreal, and A(R) = 0
for all of them.
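\[
A^a_\mu(\theta+d\theta) = A^a_\mu(\theta) - D^{ac}_\mu(\theta)\,d\theta^c \quad\text{where}\quad D^{ac}_\mu \equiv \delta^{ac}\partial_\mu - igA^b_\mu(T^b_A)^{ac}. \tag{28.50}
\]
we have
\[
1 = \int\mathcal{D}\theta\,\delta(G)\det\Bigl(\frac{\delta G}{\delta\theta}\Bigr) \quad\text{where}\quad \frac{\delta G^a(x)}{\delta\theta^b(y)} = -\partial^\mu D^{ab}_\mu(\theta)\,\delta^4(x-y). \tag{28.52}
\]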
As shown in section 26.10, a functional determinant can be written as a path integral over
complex Grassmann variables. Let us introduce the complex Grassmann field ca (x), and its
hermitian conjugate c̄a (x), called Faddeev–Popov ghosts. Then we can write
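\[
\det\Bigl(\frac{\delta G}{\delta\theta}\Bigr) \propto \int\mathcal{D}c\,\mathcal{D}\bar c\;e^{iS_{gh}} = \int\mathcal{D}c\,\mathcal{D}\bar c\;e^{i\int d^4x\,\mathcal{L}_{gh}[A(\theta)]}, \tag{28.53}
\]
where
\[
\mathcal{L}_{gh}[A(\theta)] \equiv \bar c^a\partial^\mu D^{ab}_\mu(\theta)c^b = -\partial^\mu\bar c^a\partial_\mu c^a + gf^{abc}A^c_\mu(\theta)\,\partial^\mu\bar c^a\,c^b. \tag{28.54}
\]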
We see that ca (x) has the standard kinetic term for a complex scalar field. The ghost field is
also a Grassmann field, and so a closed loop of ghost lines in a Feynman diagram carries an
extra factor of minus one. Since the particles associated with the ghost field do not in fact exist
(and would violate the spin-statistics theorem if they did), it must be that the amplitude to
produce them in any scattering process is zero.
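Combining equations 28.49, 28.52 and 28.53, the path integral for nonabelian gauge theory
becomes
\[
Z[0] \propto \int\mathcal{D}\theta\,\delta[G(\theta)]\int\mathcal{D}c\,\mathcal{D}\bar c\;e^{iS_{gh}(\theta)}\int\mathcal{D}A\,e^{iS_{YM}}. \tag{28.55}
\]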
28.2 Quantization of nonabelian gauge theory –377/453–
We can change the integration variable A in equation 28.55 to A(θ), and we have DA = DA(θ) under
a gauge transformation. Also, by gauge invariance, SYM [A] = SYM [A(θ)]. Since A(θ) is now just
a dummy integration variable, we can rename it back to A. Now equation 28.55 becomes
\[
Z[0] \propto \int\mathcal{D}\theta\int\mathcal{D}c\,\mathcal{D}\bar c\,\mathcal{D}A\;e^{iS_{YM}+iS_{gh}}\,\delta[\partial^\mu A^a_\mu - \omega^a(x)]. \tag{28.56}
\]
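\[ \mathcal{L}_{\rm YM} = -\tfrac12\,\partial^\mu A^{a\nu}\partial_\mu A^a_\nu + \tfrac12\,\partial^\mu A^{a\nu}\partial_\nu A^a_\mu - g f^{abe}A^{a\mu}A^{b\nu}\partial_\mu A^e_\nu - \tfrac14 g^2 f^{abe}f^{cde}A^{a\mu}A^{b\nu}A^c_\mu A^d_\nu. \tag{28.58} \]
Adding the gauge-fixing term and doing some integrations-by-parts in the quadratic terms,
we find
\[ \mathcal{L}_{\rm YM} + \mathcal{L}_{\rm gf} = \tfrac12 A^{e\mu}\left[\eta_{\mu\nu}\partial^2 - \left(1 - \tfrac{1}{\xi}\right)\partial_\mu\partial_\nu\right]A^{e\nu} - g f^{abe}A^{a\mu}A^{b\nu}\partial_\mu A^e_\nu - \tfrac14 g^2 f^{abe}f^{cde}A^{a\mu}A^{b\nu}A^c_\mu A^d_\nu. \tag{28.59} \]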
Therefore, the gluon propagator is
\[ G_F^{ab}(k)_{\mu\nu} = \frac{-i\delta^{ab}}{k^2 - i\epsilon}\left[\eta_{\mu\nu} - (1 - \xi)\frac{k_\mu k_\nu}{k^2}\right]. \tag{28.60} \]
Figure 28.1: The three-gluon and four-gluon vertices in nonabelian gauge theory.
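The three-gluon and four-gluon vertex factors, as shown in Figure 28.1, are given by
\[ iV^{abc}_{\mu\nu\rho}(p,q,r) = g f^{abc}\left[(q-r)_\mu g_{\nu\rho} + (r-p)_\nu g_{\rho\mu} + (p-q)_\rho g_{\mu\nu}\right] \tag{28.61} \]
and
\[ iV^{abcd}_{\mu\nu\rho\sigma} = -ig^2\left[f^{abe}f^{cde}(g_{\mu\rho}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\rho}) + f^{ace}f^{dbe}(g_{\mu\sigma}g_{\rho\nu} - g_{\mu\nu}g_{\rho\sigma}) + f^{ade}f^{bce}(g_{\mu\nu}g_{\sigma\rho} - g_{\mu\rho}g_{\sigma\nu})\right]. \tag{28.62} \]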
For loop calculations, we need to include the ghosts, with Lagrangian density
Figure 28.2: (a) The ghost-ghost-gluon vertex in nonabelian gauge theory; (b) The quark-
quark-gluon vertex in nonabelian gauge theory.
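If we include a quark coupled to the gluons, we have the quark Lagrangian density
\[ \mathcal{L}_{\rm quark} = i\bar\Psi_i \slashed{D}_{ij}\Psi_j - m\bar\Psi_i\Psi_i = i\bar\Psi_i\slashed{\partial}\Psi_i - m\bar\Psi_i\Psi_i + gA^a_\mu\bar\Psi_i\gamma^\mu (T^a)_{ij}\Psi_j. \tag{28.66} \]
\[
\begin{aligned}
\mathcal{L}_{\rm ct} = {}& -\tfrac14\delta_3 F^{a\mu\nu}F^a_{\mu\nu} + \bar\Psi(i\delta_2\slashed{\partial} - \delta_m)\Psi + \delta_2^c\,\bar c^a\partial^2 c^a \\
& + g\delta_1 A^a_\mu\bar\Psi\gamma^\mu T^a\Psi - g\delta_1^{3g} f^{abe}A^{a\mu}A^{b\nu}\partial_\mu A^e_\nu - \tfrac14 g^2\delta_1^{4g} f^{abe}f^{cde}A^{a\mu}A^{b\nu}A^c_\mu A^d_\nu \\
& + g\delta_1^c f^{abc}A^a_\mu\,\partial^\mu\bar c^b\,c^c,
\end{aligned}
\tag{28.69}
\]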
28.3 Renormalization of nonabelian gauge theory –379/453–
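\[
\begin{gathered}
\delta_2 = Z_2 - 1, \qquad \delta_3 = Z_3 - 1, \qquad \delta_2^c = Z_2^c - 1, \qquad \delta_m = Z_2 m_0 - m, \\
\delta_1 = \frac{g_0}{g}Z_2 (Z_3)^{1/2} - 1, \quad \delta_1^{3g} = \frac{g_0}{g}(Z_3)^{3/2} - 1, \quad \delta_1^{4g} = \frac{g_0^2}{g^2}(Z_3)^2 - 1, \quad \delta_1^c = \frac{g_0}{g}Z_2^c (Z_3)^{1/2} - 1.
\end{gathered}
\tag{28.70}
\]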
Notice that these eight counterterms depend on five underlying parameters; thus, there are
three relations among them. The situation is very similar to that for the scalar theories with
spontaneously broken symmetry that we studied before. The underlying symmetry of the
theory - local gauge invariance - implies relations among the divergent amplitudes of the the-
ory and among the counterterms required to cancel them. In the present case, a set of five
renormalization conditions uniquely specifies all of the counterterms in a way that removes
all divergences from the theory. The rigorous proof will be omitted here.
Quark propagator
The quantum corrections to the quark propagator up to one-loop order are represented by Figure 28.3.
Figure 28.3: The one-loop and counterterm corrections to the quark propagator.
\[ Z_2 = 1 - C(R)\frac{g^2}{8\pi^2}\frac{1}{\epsilon} + O(g^4), \qquad Z_m = 1 - C(R)\frac{g^2}{2\pi^2}\frac{1}{\epsilon} + O(g^4). \tag{28.71} \]
Quark-quark-gluon vertex
The quantum corrections to the quark-quark-gluon vertex up to one-loop order are described
by Figure 28.4.
Figure 28.4: The one-loop and counterterm corrections to the quark-quark-gluon vertex.
\[ Z_1 = 1 - [C(R) + T(A)]\frac{g^2}{8\pi^2}\frac{1}{\epsilon} + O(g^4). \tag{28.72} \]
Figure 28.5: The one-loop and counterterm corrections to the gluon propagator.
Gluon propagator
The quantum corrections to the gluon propagator up to one-loop order are shown in Figure 28.5.
Note: The term 5T(A)/3 in the square bracket of equation 28.73 comes from the gluon loop and the ghost loop in
Figure 28.5, while the term 4nF T(R)/3 comes from the quark loop.
Beta function
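\[ \alpha_0 = \frac{Z_1^2}{Z_2^2 Z_3}\,\alpha\,\tilde\mu^{\epsilon}. \tag{28.74} \]
Let us write
\[ \ln\left(Z_3^{-1}Z_2^{-2}Z_1^{2}\right) = \sum_{n=1}^{\infty}\frac{G_n(\alpha)}{\epsilon^n}. \tag{28.75} \]
Then we have
\[ \ln\alpha_0 = \sum_{n=1}^{\infty}\frac{G_n(\alpha)}{\epsilon^n} + \ln\alpha + \epsilon\ln\tilde\mu. \tag{28.76} \]
From equations 28.71, 28.72 and 28.73, we get
\[ G_1(\alpha) = -\left[\frac{11}{3}T(A) - \frac{4}{3}n_F T(R)\right]\frac{\alpha}{2\pi} + O(\alpha^2). \tag{28.77} \]
Since dα0/dµ = 0 and dα/d ln µ should be finite in the ϵ → 0 limit, it can be derived that
\[ \beta(\alpha) \equiv \frac{d\alpha}{d\ln\mu} = -\epsilon\alpha + \alpha^2 G_1'(\alpha) = -\epsilon\alpha - \left[\frac{11}{3}T(A) - \frac{4}{3}n_F T(R)\right]\frac{\alpha^2}{2\pi} + O(\alpha^3). \tag{28.78} \]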
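The step from equation 28.76 to equation 28.78 can be spelled out as follows. This is only a sketch: it assumes that $\tilde\mu$ is proportional to $\mu$ (so that $d\ln\tilde\mu/d\ln\mu = 1$) and uses the finiteness requirement stated above as an ansatz.

```latex
% Sketch of the step from (28.76) to (28.78).
% Assumption: \tilde\mu is proportional to \mu, so d\ln\tilde\mu / d\ln\mu = 1.
\begin{align*}
0 = \frac{d\ln\alpha_0}{d\ln\mu}
  &= \left[\frac{1}{\alpha} + \sum_{n=1}^{\infty}\frac{G_n'(\alpha)}{\epsilon^{n}}\right]
     \frac{d\alpha}{d\ln\mu} + \epsilon , \\
% multiply through by \alpha
0 &= \left[1 + \frac{\alpha G_1'(\alpha)}{\epsilon}
        + \frac{\alpha G_2'(\alpha)}{\epsilon^{2}} + \cdots\right]
     \frac{d\alpha}{d\ln\mu} + \epsilon\alpha .
\end{align*}
% Finiteness as \epsilon \to 0 suggests the ansatz
% d\alpha/d\ln\mu = -\epsilon\alpha + \beta(\alpha) with \beta independent of \epsilon.
% Collecting the O(\epsilon^0) terms of the second line:
\begin{equation*}
\beta(\alpha) - \alpha^{2} G_1'(\alpha) = 0
\quad\Longrightarrow\quad
\frac{d\alpha}{d\ln\mu} = -\epsilon\alpha + \alpha^{2} G_1'(\alpha).
\end{equation*}
```

The terms with negative powers of ϵ give consistency conditions relating the higher $G_n$ to $G_1$; they do not change the result above.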
28.4 Chiral gauge theories and anomalies –381/453–
The gauge group for quantum chromodynamics is SU(3), and quarks are in its fundamental
representation. Thus we have T (A) = 3 and T (R) = 1/2, and the factor in square brackets
is 11 − 2nF /3. As long as nF ≤ 16, the beta function will be negative: the gauge coupling in
quantum chromodynamics gets weaker at high energies, and stronger at low energies.
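The flavor counting can be checked with a few lines of exact arithmetic. The sketch below evaluates the square bracket of equation 28.78 for SU(3) with fundamental quarks; the function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def one_loop_coefficient(t_adj, t_rep, n_f):
    """Square bracket of equation 28.78: (11/3) T(A) - (4/3) n_F T(R)."""
    return Fraction(11, 3) * t_adj - Fraction(4, 3) * n_f * t_rep


def qcd_coefficient(n_f):
    """Same bracket for SU(3) with quarks in the fundamental: T(A) = 3, T(R) = 1/2."""
    return one_loop_coefficient(Fraction(3), Fraction(1, 2), n_f)


def max_flavors_for_asymptotic_freedom():
    """Largest n_F for which the bracket is still positive (beta function negative)."""
    n_f = 0
    while qcd_coefficient(n_f + 1) > 0:
        n_f += 1
    return n_f


if __name__ == "__main__":
    for n_f in (0, 6, 16, 17):
        print(n_f, qcd_coefficient(n_f))
    print("largest asymptotically free n_F:", max_flavors_for_asymptotic_freedom())
```

The bracket equals 11 − 2nF/3, so it changes sign between nF = 16 and nF = 17.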
This has dramatic physical consequences. Perturbation theory cannot serve as a reliable guide
to the low-energy physics. And indeed, in nature we do not see isolated quarks or gluons. The
appropriate conclusion is that color is confined: all finite-energy states are invariant under
a global SU(3) transformation. This has not yet been rigorously proven, but it is the only
hypothesis that is consistent with all of the available theoretical and experimental information.
The detailed calculation omitted in this section can be found in section 73 of Quantum Field
Theory (Mark Srednicki).
Nothing interesting happens in the one-loop and counterterm corrections to the fermion
propagator, or the fermion-fermion-photon vertex. There is simply an extra factor of PL along
the fermion line, which can be moved to the far right. Except for this factor, the results exactly
duplicate those of spinor electrodynamics. All of this implies that a single Weyl field makes
half the contribution of a Dirac field to the leading term in the beta function for the gauge
coupling.
Next we turn to diagrams with three external photons, and no external fermions, shown in
Figure 28.6. In spinor electrodynamics, the fact that the vector potential is odd under charge
conjugation implies that the sum of these diagrams must vanish. For the present case of a
single Weyl field, there is no charge-conjugation symmetry, and so we must evaluate these
diagrams.
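The second diagram in Figure 28.6 is the same as the first, with p ↔ q and µ ↔ ν. Thus we
have
\[ iV^{\mu\nu\rho}(p,q,r) = (-1)(ig)^3 \int\frac{d^4l}{(2\pi)^4}\,\frac{i^3 N^{\mu\nu\rho}}{(l-p)^2\,l^2\,(l+q)^2} + (p,\mu \leftrightarrow q,\nu) + O(g^5), \tag{28.82} \]
where
\[ N^{\mu\nu\rho} = \mathrm{Tr}\left[(\slashed{l} - \slashed{p})\gamma^\mu\,\slashed{l}\,\gamma^\nu(\slashed{l} + \slashed{q})\gamma^\rho P_L\right]. \tag{28.83} \]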
The term in equation 28.83 with PL → 1/2 simply yields half the result that we get in spinor
electrodynamics with a Dirac field, which gives a vanishing contribution to V µνρ (p, q, r).
Hence, we can make the replacement PL → −γ5 /2 in equation 28.83.
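\[ r_\rho V^{\mu\nu\rho}(p,q,r) = 0, \qquad p_\mu V^{\mu\nu\rho}(p,q,r) = \frac{ig^3}{8\pi^2}\epsilon^{\alpha\nu\beta\rho}p_\alpha q_\beta, \qquad q_\nu V^{\mu\nu\rho}(p,q,r) = \frac{ig^3}{8\pi^2}\epsilon^{\alpha\rho\beta\mu}q_\alpha p_\beta. \tag{28.84} \]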
Thus, the three-photon vertex is not gauge invariant.
Equations 28.84 also show that the three-photon vertex does not exhibit the expected symmetry
among the external lines. The reason is that the integral in equation 28.82 is linearly
divergent, so shifting the loop momentum changes its value. An appropriate shift of the loop
momentum can restore the symmetry among the external lines, but no regularization scheme can
eliminate the anomaly completely.
Consider now a U(1) gauge theory with several left-handed Weyl fields $\psi_i$, with charges $Q_i$,
so that the covariant derivative of $\psi_i$ is $\partial_\mu - igQ_iA_\mu$. Then each of these fields circulates in
the loop in Figure 28.6, and each vertex has an extra factor of $Q_i$. The right-hand sides of equations
28.84 are now multiplied by $\sum_i Q_i^3$. If $\sum_i Q_i^3$ happens to be zero, then gauge invariance
is restored. The simplest possibility is to have the ψs come in pairs with equal and opposite
charges.
All of this has a straightforward generalization to nonabelian gauge theories. Suppose we have
a single Weyl field in a (possibly reducible) representation R of the gauge group. Then we must
attach an extra factor of $\mathrm{Tr}(T_R^aT_R^bT_R^c)$ to the first diagram in Figure 28.6, and a factor of $\mathrm{Tr}(T_R^aT_R^cT_R^b)$
to the second; here the group indices a, b, c go along with the momenta p, q, r, respectively.
Repeating our analysis shows that the diagrams with $P_L \to 1/2$ come with an extra factor
of $\frac12\mathrm{Tr}([T_R^a, T_R^b]T_R^c)$; these contribute to the renormalization of the tree-level three-gluon
vertex. Diagrams with $P_L \to -\gamma_5/2$ come with an extra factor of
\[ \frac12\mathrm{Tr}\left(\{T_R^a, T_R^b\}T_R^c\right) = A(R)\,d^{abc}, \tag{28.85} \]
where $d^{abc}$ is a completely symmetric tensor that is independent of the representation, and
A(R) is the anomaly coefficient of R. In order for this theory to exist, we must have A(R) = 0.
For SU(2) and SO(N), N ≠ 2, 6, all representations have A(R) = 0. For SU(N) with N ≥ 3,
the fundamental representation has A(N) = 1, and most complex SU(N) representations R
have A(R) ≠ 0. Notice that $A(R) + A(\bar R) = 0$; thus a theory whose left-handed Weyl fields
come in $R \oplus \bar R$ pairs is automatically anomaly free.
Generally, consider a theory with a nonabelian gauge symmetry, and also a U(1) gauge symmetry.
The theory contains left-handed Weyl fields in the representations $(R_i, Q_i)$, where $R_i$
is the representation of the nonabelian group, and $Q_i$ is the U(1) charge. For this theory to
be anomaly free, we must demand that $\frac12\mathrm{Tr}(\{T^a, T^b\}T^c) = 0$, where $T^a$ is either a generator
of the nonabelian group in the representation $R_1 \oplus \cdots \oplus R_n$, or the generator Q of the
abelian group. The nonabelian generators are block diagonal, with blocks given by $T^a_{R_i}$, and
Q is diagonal, with $d(R_1)$ entries $Q_1$, $d(R_2)$ entries $Q_2$, etc.
• If all three generators are nonabelian, we have $\frac12\mathrm{Tr}(\{T^a, T^b\}T^c) = \sum_i A(R_i)\,d^{abc}$, and
so we must have $\sum_i A(R_i) = 0$.
• If one generator is the abelian generator Q, we have $\frac12\mathrm{Tr}(\{T^a, T^b\}Q) = \sum_i T(R_i)Q_i\,\delta^{ab}$,
and so we must have $\sum_i T(R_i)Q_i = 0$.
• If two generators are abelian, we have $\mathrm{Tr}(Q^2T^c) = \sum_i Q_i^2\,\mathrm{Tr}\,T^c_{R_i} = 0$, since nonabelian
generators are always traceless.
• If all three generators are abelian, we have $\frac12\mathrm{Tr}(\{Q, Q\}Q) = \sum_i d(R_i)Q_i^3$, and this
must also vanish.
Because the fermion field is massless, the Lagrangian is invariant under a global symmetry in
which χ and ξ transform with the same phase:
In terms of Ψ, this is
This is called axial U(1) symmetry, because the associated Noether current
is an axial vector. Noether’s theorem leads us to expect that this current is conserved. However,
the axial current actually has an anomalous divergence.
Consider the matrix element $\langle p,q|j_A^\rho(z)|0\rangle$, where ⟨p, q| is a state of two outgoing photons
with four-momenta p and q, and polarization vectors $\epsilon_\mu$ and $\epsilon'_\nu$, respectively. Using the LSZ
formula for photons, we have
\[ \langle p,q|j_A^\rho(z)|0\rangle = (ig)^2\epsilon_\mu\epsilon'_\nu\int d^4x\,d^4y\; e^{-i(px+qy)}\,\langle 0|Tj^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle, \tag{28.90} \]
where $j^\mu = \bar\Psi\gamma^\mu\Psi$ is the Noether current corresponding to the U(1) gauge symmetry. Since
both $j^\mu(x)$ and $j_A^\mu(x)$ are Noether currents, we expect the Ward identities
\[
\frac{\partial}{\partial x^\mu}\langle 0|Tj^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle = 0, \qquad
\frac{\partial}{\partial y^\nu}\langle 0|Tj^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle = 0, \qquad
\frac{\partial}{\partial z^\rho}\langle 0|Tj^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle = 0
\tag{28.91}
\]
to be satisfied. Note that there are no contact terms in equations 28.91, because both $j^\mu(x)$
and $j_A^\mu(x)$ are invariant under both U(1) transformations.
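Let us define $C^{\mu\nu\rho}(p,q,r)$ via
\[ (2\pi)^4\delta^4(p+q+r)\,C^{\mu\nu\rho}(p,q,r) \equiv \int d^4x\,d^4y\,d^4z\; e^{-i(px+qy+rz)}\,\langle 0|Tj^\mu(x)j^\nu(y)j_A^\rho(z)|0\rangle. \tag{28.92} \]
Then we have
\[ \langle p,q|j_A^\rho(z)|0\rangle = -g^2\epsilon_\mu\epsilon'_\nu\,C^{\mu\nu\rho}(p,q,r)\,e^{irz}\Big|_{r=-q-p}. \tag{28.93} \]
Taking the divergence of the current yields
\[ \langle p,q|\partial_\rho j_A^\rho(z)|0\rangle = -ig^2\epsilon_\mu\epsilon'_\nu\,r_\rho C^{\mu\nu\rho}(p,q,r)\,e^{irz}\Big|_{r=-q-p}. \tag{28.94} \]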
To check equations 28.95, we compute $C^{\mu\nu\rho}(p,q,r)$ with Feynman diagrams. At the one-loop
level, the contributing diagrams are exactly those we computed in the previous subsection, except
that the three vertex factors are now $\gamma^\mu$, $\gamma^\nu$ and $\gamma^\rho\gamma_5$ instead of $ig\gamma^\mu P_L$, $ig\gamma^\nu P_L$ and $ig\gamma^\rho P_L$.
But the three $P_L$s can be combined into just one at the last vertex, and then this one can be
replaced by $-\frac12\gamma_5$. Thus, the vertex function $iV^{\mu\nu\rho}(p,q,r)$ is related to $C^{\mu\nu\rho}(p,q,r)$ by
\[ iV^{\mu\nu\rho}(p,q,r) = -\frac12(ig)^3\,C^{\mu\nu\rho}(p,q,r) + O(g^5). \tag{28.96} \]
In order to preserve the conservation of the current coupled to the gauge field, we should
choose the regularization scheme in which the first two equations in 28.95 are satisfied, resulting
in
\[ r_\rho C^{\mu\nu\rho}(p,q,r) = -\frac{i}{2\pi^2}\epsilon^{\mu\nu\alpha\beta}p_\alpha q_\beta + O(g^2). \tag{28.97} \]
Using this in equation 28.94, we find
\[ \langle p,q|\partial_\rho j_A^\rho(z)|0\rangle = -\frac{g^2}{2\pi^2}\epsilon^{\mu\nu\alpha\beta}p_\alpha q_\beta\,\epsilon_\mu\epsilon'_\nu\,e^{-i(p+q)z} + O(g^4). \tag{28.98} \]
Here Aµ is either the U(1) gauge field, or the matrix-valued nonabelian gauge field, depending
on the theory under consideration.
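Now consider an axial U(1) transformation of the Dirac field, but with a space-time dependent
parameter α(x):
\[ \Psi(x) \to e^{-i\alpha(x)\gamma_5}\Psi(x), \qquad \bar\Psi(x) \to \bar\Psi(x)\,e^{-i\alpha(x)\gamma_5}. \tag{28.100} \]
The corresponding change in the action is
\[ S(A) \to S(A) + \int d^4x\; j_A^\mu(x)\,\partial_\mu\alpha(x) = S(A) - \int d^4x\;\alpha(x)\,\partial_\mu j_A^\mu(x). \tag{28.101} \]
If we assume that the measure $\mathcal{D}\Psi\,\mathcal{D}\bar\Psi$ is invariant under the axial U(1) transformation, then
we have
\[ Z(A) \to \int \mathcal{D}\Psi\,\mathcal{D}\bar\Psi\; e^{iS(A)}\,e^{-i\int d^4x\,\alpha(x)\partial_\mu j_A^\mu(x)}. \tag{28.102} \]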
This must be equal to the original expression for Z(A). This implies that ∂µ jAµ (x) = 0 holds
inside quantum correlation functions, up to contact terms, as discussed in subsection 25.4.4.
However, the assumption that the measure $\mathcal{D}\Psi\,\mathcal{D}\bar\Psi$ is invariant under the axial U(1) transformation
must be examined more closely. The change of variable in equations 28.100 is implemented
by the functional matrix
Because the path integral is over fermionic variables (rather than bosonic), we get a Jacobian
factor of $(\det J)^{-1}$ (rather than det J) for each of the transformations, so that we have
\[ V(\phi) = m^2\phi^\dagger\phi + \frac14\lambda(\phi^\dagger\phi)^2. \tag{28.112} \]
Let us consider $m^2 < 0$. Classically, the field has a nonzero vacuum expectation value (VEV
for short), given by
\[ \langle 0|\phi(x)|0\rangle = \frac{1}{\sqrt2}\,v, \tag{28.113} \]
where we have made a global U(1) transformation to set the phase of the VEV to zero, and
\[ v = \sqrt{\frac{4|m|^2}{\lambda}}. \tag{28.114} \]
We therefore write
\[ \phi(x) = \frac{1}{\sqrt2}[v + \rho(x)]\,e^{-i\chi(x)}, \tag{28.115} \]
where ρ(x) and χ(x) are real scalar fields. The scalar potential depends only on ρ, and is given
by
\[ V = \frac14\lambda v^2\rho^2 + \frac14\lambda v\rho^3 + \frac{1}{16}\lambda\rho^4. \tag{28.116} \]
Since χ does not appear in the potential, it is massless; it is the Goldstone boson for the spon-
taneously broken U(1) symmetry.
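The expansion of the potential in ρ can be verified with exact rational arithmetic. The sketch below substitutes equation 28.115 into equation 28.112 with $m^2 = -\lambda v^2/4$ (equation 28.114) and compares with equation 28.116; it assumes that equation 28.116 drops the ρ-independent constant, and the function names are illustrative.

```python
from fractions import Fraction


def potential(rho, v, lam):
    """V of equation 28.112 evaluated on phi = (v + rho)/sqrt(2), with m^2 = -lam v^2 / 4."""
    m_squared = -lam * v**2 / 4
    phi_dagger_phi = (v + rho) ** 2 / 2
    return m_squared * phi_dagger_phi + lam * phi_dagger_phi**2 / 4


def expanded_potential(rho, v, lam):
    """Right-hand side of equation 28.116."""
    return lam * v**2 * rho**2 / 4 + lam * v * rho**3 / 4 + lam * rho**4 / 16


def expansion_holds(rho, v, lam):
    """True if 28.116 equals 28.112 up to the constant V(rho = 0)."""
    return potential(rho, v, lam) - potential(0, v, lam) == expanded_potential(rho, v, lam)


if __name__ == "__main__":
    v, lam = Fraction(3), Fraction(2, 5)
    for rho in (Fraction(-1), Fraction(1, 7), Fraction(4)):
        print(rho, expansion_holds(rho, v, lam))
```

The absence of a term linear in ρ confirms that ρ = 0 is a stationary point, and the coefficient of ρ² gives the mass of the ρ field.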
In the gauge theory, we can make a gauge transformation that shifts the phase of ϕ(x) by an
arbitrary spacetime function. We can use this gauge freedom to set χ(x) = 0; this choice is
called unitary gauge. Now we have
\[ -(D_\mu\phi)^\dagger D^\mu\phi = -\frac12\partial^\mu\rho\,\partial_\mu\rho - \frac12 g^2(v+\rho)^2A^\mu A_\mu. \tag{28.117} \]
Expanding out the last term, we see that the gauge field now has a mass
\[ M = gv. \tag{28.118} \]
This is the Higgs mechanism: the Goldstone boson disappears, and the gauge field acquires a
mass. Note that this leaves the counting of particle spin states unchanged: a massless spin-one
particle has two spin states, but a massive one has three. The Goldstone boson has become the
third or longitudinal state of the now-massive gauge field. A scalar field whose VEV breaks a
gauge symmetry is generically called a Higgs field.
where the covariant derivative is $(D_\mu\phi)_i = \partial_\mu\phi_i - igA^a_\mu(T^a_R)_{ij}\phi_j$, and the indices i and j run
from 1 to d(R). We assume that φ acquires a VEV
\[ \langle 0|\phi_i(x)|0\rangle = \frac{1}{\sqrt2}\,v_i. \tag{28.119} \]
If we replace φ by its VEV in $-(D_\mu\phi)^\dagger D^\mu\phi$, we find a mass term for the gauge fields
\[ \mathcal{L}_{\rm mass} = -\frac12(M^2)^{ab}A^a_\mu A^{b\mu}, \tag{28.120} \]
where the mass-squared matrix is
\[ (M^2)^{ab} = \frac12 g^2\,v_i^*\{T^a_R, T^b_R\}_{ij}\,v_j. \tag{28.121} \]
A generator T a is spontaneously broken if (TRa )ij vj ̸= 0. We see that gauge fields correspond-
ing to broken generators get a mass, while those corresponding to unbroken generators do not.
The unbroken generators (if any) form a gauge group with massless gauge fields. The massive
gauge fields (and all other fields) form representations of this unbroken group.
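As an illustration of equation 28.121, the sketch below evaluates the mass-squared matrix for SU(3) with a scalar in the fundamental representation and a real VEV in the last component. The choice of the Gell-Mann basis $T^a = \lambda^a/2$ and all function names are assumptions made for this example.

```python
import math


def gell_mann():
    """The eight Gell-Mann matrices as nested lists (standard ordering)."""
    i = 1j
    s = 1 / math.sqrt(3)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -i, 0], [i, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -i], [0, 0, 0], [i, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -i], [0, i, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def mass_squared_matrix(g, vev):
    """(M^2)^{ab} = (1/2) g^2 v_i^* {T^a, T^b}_{ij} v_j, equation 28.121."""
    generators = [[[x / 2 for x in row] for row in lam] for lam in gell_mann()]
    n = len(vev)
    result = []
    for t_a in generators:
        row = []
        for t_b in generators:
            ab, ba = matmul(t_a, t_b), matmul(t_b, t_a)
            value = sum(vev[i].conjugate() * (ab[i][j] + ba[i][j]) * vev[j]
                        for i in range(n) for j in range(n))
            row.append(0.5 * g**2 * complex(value).real)
        result.append(row)
    return result


if __name__ == "__main__":
    g, v = 1.0, 2.0
    m2 = mass_squared_matrix(g, [0.0, 0.0, v])
    print([round(m2[a][a], 6) for a in range(8)])
```

In this basis the matrix comes out diagonal: the first three generators (an SU(2) subgroup) remain massless, four generators get $M^2 = g^2v^2/4$, and the diagonal generator $T^8$ gets $M^2 = g^2v^2/3$.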
Consider the gauge group SU(N ), with a complex scalar field ϕ in the fundamental represen-
tation. We can make a global SU(N ) transformation to bring the VEV entirely into the last
component, and furthermore make it real. Any generator (TRa )ij that does not have a nonzero
entry in the last column will remain unbroken. These generators form an unbroken SU(N −1)
gauge group. There are three classes of broken generators: those with $(T^a_R)_{iN} = 1/2$ for $i \neq N$
(there are N − 1 of these); those with $(T^a_R)_{iN} = -i/2$ for $i \neq N$ (there are also N − 1 of these);
and finally the single generator $T^{N^2-1} = [2N(N-1)]^{-1/2}\,\mathrm{diag}(1, 1, \cdots, -N+1)$. The gauge
fields corresponding to the generators in the first two classes get a mass M = gv/2. We can
group them into a complex vector field that transforms in the fundamental representation
of the unbroken SU(N − 1) subgroup. The gauge field corresponding to $T^{N^2-1}$ gets a mass
Then all generators whose nonzero entries lie entirely within the ith block commute with V,
and hence form an unbroken SU(N_i) subgroup. Furthermore, the generator that is proportional
to V also commutes with V, and forms a U(1) subgroup. Thus the unbroken gauge group is
SU(N1) × SU(N2) × · · · × U(1). The gauge coupling constants for the different groups are
all the same, and equal to the original SU(N) gauge coupling constant.
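\[ \mathcal{L}_{\rm gf} + \mathcal{L}_{\rm gh} = -\frac{G^2}{2\xi} - \bar c\,\frac{\delta G}{\delta\theta}\,c, \tag{28.127} \]
where $G = \partial^\mu A_\mu$, and θ(x) parametrizes an infinitesimal gauge transformation
\[ A_\mu \to A_\mu - \partial_\mu\theta, \qquad \phi \to \phi - ig\theta\phi. \tag{28.128} \]
Since $\delta G/\delta\theta = -\partial^2$, the ghost fields have no interactions, and can be ignored.
In the presence of spontaneous symmetry breaking, we choose instead
\[ G = \partial^\mu A_\mu - \xi gvb, \tag{28.129} \]
Then we have
\[ \frac{\delta G}{\delta\theta} = -\partial^2 + \xi g^2v(v+h). \tag{28.132} \]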
The ghost Lagrangian is
We see from the second term that the ghost has acquired the same mass as the b field.
Now let us examine the vector field. Including $\mathcal{L}_{\rm gf}$, the terms in L that are quadratic in the
vector field can be written as
\[ \mathcal{L}_0 = -\frac12 A_\mu\left[\eta^{\mu\nu}(-\partial^2 + M^2) + (1 - \xi^{-1})\partial^\mu\partial^\nu\right]A_\nu. \tag{28.134} \]
The propagator for the vector field is
\[ S_F(k) = \frac{-iP^{\mu\nu}}{k^2 + M^2 - i\epsilon} + \frac{-i\xi k^\mu k^\nu/k^2}{k^2 + \xi M^2 - i\epsilon} \quad\text{where}\quad P^{\mu\nu} = g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}. \tag{28.135} \]
We see that the transverse components of the vector field propagate with mass M, while the
longitudinal component propagates with the same mass as the b and ghost fields, $\sqrt{\xi}M$. Since
their masses depend on ξ, the ghosts, the b field, and the longitudinal component of the vector
field must all represent unphysical particles that do not appear in incoming or outgoing states.
When ξ → ∞, we recover the unitary gauge.
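That the propagator in equation 28.135 inverts the quadratic operator in equation 28.134 can be checked numerically. The sketch below works in momentum space with a mostly-plus metric, drops the overall factor of −i and the iϵ prescription, and lowers one index so that ordinary matrix multiplication applies; these choices and the function names are assumptions of the example.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[m][s] * b[s][n] for s in range(4)) for n in range(4)]
            for m in range(4)]


def r_xi_residual(k_up, mass, xi):
    """Largest deviation of (kinetic kernel) x (propagator) from the identity.

    k_up holds the contravariant components k^mu; the metric is diag(-1, 1, 1, 1).
    """
    eta = [-1.0, 1.0, 1.0, 1.0]
    k_dn = [eta[m] * k_up[m] for m in range(4)]
    k2 = sum(k_up[m] * k_dn[m] for m in range(4))

    def delta(m, n):
        return 1.0 if m == n else 0.0

    # Kernel from (28.134) after replacing d_mu by i k_mu, one index lowered:
    # K^mu_nu = (k^2 + M^2) delta^mu_nu - (1 - 1/xi) k^mu k_nu
    kernel = [[(k2 + mass**2) * delta(m, n) - (1 - 1 / xi) * k_up[m] * k_dn[n]
               for n in range(4)] for m in range(4)]

    # Propagator (28.135) without the overall -i, same index placement.
    propagator = [[(delta(m, n) - k_up[m] * k_dn[n] / k2) / (k2 + mass**2)
                   + xi * k_up[m] * k_dn[n] / k2 / (k2 + xi * mass**2)
                   for n in range(4)] for m in range(4)]

    product = matmul(kernel, propagator)
    return max(abs(product[m][n] - delta(m, n))
               for m in range(4) for n in range(4))


if __name__ == "__main__":
    print(r_xi_residual([0.3, 1.0, -0.5, 2.0], mass=1.2, xi=0.7))
    print(r_xi_residual([3.0, 1.0, 0.0, 0.0], mass=1.0, xi=2.0))
```

The residual stays at the level of floating-point rounding for any ξ, which reflects the decomposition into a transverse part with mass M and a longitudinal part with mass $\sqrt{\xi}M$.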
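Each broken generator results in a massless Goldstone boson. We note that the potential must
be invariant under a global gauge transformation. It follows that
\[ \frac{\partial V}{\partial\phi_j}(T^a)_{jk}\phi_k = 0. \tag{28.138} \]
Differentiating with respect to $\phi_i$ gives
\[ \frac{\partial^2V}{\partial\phi_i\partial\phi_j}(T^a)_{jk}\phi_k + \frac{\partial V}{\partial\phi_j}(T^a)_{ji} = 0. \tag{28.139} \]
Now set $\phi_i = v_i$, where $\partial V/\partial\phi_j$ vanishes, and identify
\[ (m^2)_{ij} = \left.\frac{\partial^2V}{\partial\phi_i\partial\phi_j}\right|_{\phi_i = v_i} \tag{28.140} \]
as the mass-squared matrix for the scalars after spontaneous symmetry breaking. Thus equation
28.139 becomes
\[ (m^2)_{ij}(T^av)_j = 0. \tag{28.141} \]
We see that if $T^av \neq 0$, then $T^av$ is an eigenvector of the mass-squared matrix with eigenvalue
zero. Thus there is a zero eigenvalue for every linearly independent broken generator.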
Let us write
ϕi (x) = vi + χi (x). (28.142)
The covariant derivative of ϕ becomes
A theorem of linear algebra states that every real rectangular matrix can be written as
where S and R are orthogonal matrices, and the diagonal entries $M^c$ are real and nonnegative.
We see that these diagonal entries are the masses of the vector fields. The vector fields of
definite mass are then given by $\tilde A^a_\mu = S^{ba}A^b_\mu$.
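Now we are ready to fix Rξ gauge. To do so, we add to L the gauge-fixing and ghost terms
\[ \mathcal{L}_{\rm gf} + \mathcal{L}_{\rm gh} = -\frac{G^aG^a}{2\xi} - \bar c^a\frac{\delta G^a}{\delta\theta^b}c^b \quad\text{where}\quad G^a = \partial^\mu A^a_\mu - \xi F^{ai}\chi_i. \tag{28.150} \]
Then we have
\[ \mathcal{L}_{\rm gf} = -\frac{1}{2\xi}\partial^\mu A^a_\mu\,\partial^\nu A^a_\nu - F^{ai}A^a_\mu\,\partial^\mu\chi_i - \frac12\xi(F^{ai}F^{aj})\chi_i\chi_j. \tag{28.151} \]
The last term makes a contribution to the mass-squared matrix for the χ fields,
It follows that
\[ \frac{\delta G^a}{\delta\theta^b} = -\partial^\mu D^{ab}_\mu + \xi(M^2)^{ab} + \xi F^{aj}\tau^b_{jk}\chi_k, \tag{28.155} \]
So the ghost Lagrangian density is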
So the ghost Lagrangian density is
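where $T^a = \sigma^a/2$ and $Y = -I/2$; Y is the hypercharge generator. It will prove useful to write
out $g_2A^a_\mu T^a + g_1B_\mu Y$ in matrix form,
\[ \frac12\begin{pmatrix} g_2A^3_\mu - g_1B_\mu & g_2(A^1_\mu - iA^2_\mu) \\ g_2(A^1_\mu + iA^2_\mu) & -g_2A^3_\mu - g_1B_\mu \end{pmatrix}. \tag{28.159} \]
This potential gives ϕ a non-zero VEV. We can make a global gauge transformation to bring
this VEV entirely into the first component, and furthermore make it real, so that
\[ \langle 0|\phi|0\rangle = \frac{1}{\sqrt2}\begin{pmatrix} v \\ 0 \end{pmatrix}. \tag{28.161} \]
The kinetic term for ϕ is $-(D_\mu\phi)^\dagger D^\mu\phi$. After replacing ϕ by its VEV, we find a mass term for
the gauge fields,
\[ \mathcal{L}_{\rm mass} = -\frac18 v^2\begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} g_2A^3_\mu - g_1B_\mu & g_2(A^1_\mu - iA^2_\mu) \\ g_2(A^1_\mu + iA^2_\mu) & -g_2A^3_\mu - g_1B_\mu \end{pmatrix}^2\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{28.162} \]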
where H is a real scalar field; the corresponding particle is the Higgs boson. The potential now
reads
\[ V = \frac14\lambda v^2H^2 + \frac14\lambda vH^3 + \frac{1}{16}\lambda H^4. \tag{28.168} \]
We see that the mass of the Higgs boson is given by $m_H^2 = \lambda v^2/2$. The kinetic term for H
comes from the kinetic term for ϕ, and is the usual one for a real scalar field, $-\partial_\mu H\partial^\mu H/2$.
Finally, recall that the mass term for the gauge fields is proportional to $v^2$. Hence it should be
multiplied by a factor of $(1 + H/v)^2$.
Now we have to work out the kinetic terms for the gauge fields:
\[ \mathcal{L} = -\frac14F^{a\mu\nu}F^a_{\mu\nu} - \frac14B^{\mu\nu}B_{\mu\nu}. \tag{28.169} \]
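We find
\[ \frac{1}{\sqrt2}(F^1_{\mu\nu} - iF^2_{\mu\nu}) = D_\mu W^+_\nu - D_\nu W^+_\mu, \qquad \frac{1}{\sqrt2}(F^1_{\mu\nu} + iF^2_{\mu\nu}) = D^\dagger_\mu W^-_\nu - D^\dagger_\nu W^-_\mu, \tag{28.170} \]
where we have defined a covariant derivative that acts on $W^+$,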
e ≡ g2 sW . (28.172)
Here we are adopting the convention that e is positive. (In our treatment of quantum electro-
dynamics, we used the convention that e is negative, but that is less convenient in the present
context.)
28.7 The Standard Model –395/453–
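We also have
\[ F^3_{\mu\nu} = s_WF_{\mu\nu} + c_WZ_{\mu\nu} - ig_2(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu), \qquad B_{\mu\nu} = c_WF_{\mu\nu} - s_WZ_{\mu\nu}, \tag{28.173} \]
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the usual electromagnetic field strength, and $Z_{\mu\nu} \equiv \partial_\mu Z_\nu - \partial_\nu Z_\mu$
is the abelian field strength associated with the $Z_\mu$ field.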
Now we can assemble all of this into the complete Lagrangian density for the electroweak gauge
fields and the Higgs boson in unitary gauge. We will express g2 in terms of e and θW , and λ in
terms of mH and v. We ultimately get
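\[
\begin{aligned}
\mathcal{L} = {}& -\frac14F_{\mu\nu}F^{\mu\nu} - \frac14Z_{\mu\nu}Z^{\mu\nu} - D^{\dagger\mu}W^{-\nu}D_\mu W^+_\nu + D^{\dagger\mu}W^{-\nu}D_\nu W^+_\mu + ie(F^{\mu\nu} + \cot\theta_W Z^{\mu\nu})W^+_\mu W^-_\nu \\
& - \frac12\frac{e^2}{\sin^2\theta_W}\left(W^{+\mu}W^-_\mu W^{+\nu}W^-_\nu - W^{+\mu}W^+_\mu W^{-\nu}W^-_\nu\right) - \left(M_W^2W^{+\mu}W^-_\mu + \frac12M_Z^2Z^\mu Z_\mu\right)\left(1 + \frac{H}{v}\right)^2 \\
& - \frac12\partial_\mu H\partial^\mu H - \frac12m_H^2H^2 - \frac12m_H^2v^{-1}H^3 - \frac18m_H^2v^{-2}H^4,
\end{aligned}
\tag{28.174}
\]
where $D_\mu = \partial_\mu - ie(A_\mu + \cot\theta_W Z_\mu)$.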
\[ (D_\mu l)_i = \partial_\mu l_i - ig_2A^a_\mu(T^a)_i{}^j l_j - ig_1(-\tfrac12)B_\mu l_i, \qquad D_\mu\bar e = \partial_\mu\bar e - ig_1(+1)B_\mu\bar e, \tag{28.175} \]
We cannot write down a mass term involving l and (or) ē because there is no gauge-group
singlet contained in any of the products
(2, −1/2) ⊗ (2, −1/2), (2, −1/2) ⊗ (1, +1), (1, +1) ⊗ (1, +1). (28.177)
where ϕ is the Higgs field in the (2, −1/2) representation, and y is the Yukawa coupling con-
stant. A gauge-invariant Yukawa coupling is possible because there is a singlet on the right-
hand side of
(2, −1/2) ⊗ (2, −1/2) ⊗ (1, +1) = (1, 0) ⊕ (3, 0). (28.179)
In unitary gauge, we replace $\phi_1$ with $(v + H)/\sqrt2$, where H is the real scalar field representing
the physical Higgs boson, and $\phi_2$ with zero. The Yukawa term becomes
\[ \mathcal{L}_{\rm Yuk} = -\frac{1}{\sqrt2}\,y(v + H)(l_2\bar e + \text{h.c.}). \tag{28.180} \]
It is now convenient to assign new names to the SU(2) components of l,
\[ l = \begin{pmatrix} \nu \\ e \end{pmatrix}. \tag{28.181} \]
Thus we have
\[ \mathcal{L}_{\rm Yuk} = -\frac{1}{\sqrt2}\,y(v + H)\bar EE, \tag{28.182} \]
where we have defined a Dirac field for the electron
\[ E \equiv \begin{pmatrix} e \\ \bar e^\dagger \end{pmatrix}. \tag{28.183} \]
We see that the electron has acquired a mass $m_e \equiv yv/\sqrt2$, while the neutrino has remained
massless. For neutrinos, it is more convenient to work with
\[ N_L \equiv P_LN = \begin{pmatrix} \nu \\ 0 \end{pmatrix}. \tag{28.184} \]
We can think of $N_L$ as a Dirac field; for example, the neutrino kinetic term can be written as
$i\bar N_L\slashed{\partial}N_L$.
Now we express the covariant derivatives in terms of the $W^\pm_\mu$, $Z_\mu$, and $A_\mu$ fields. From our
results in the previous subsection, we have
\[ g_2A^1_\mu T^1 + g_2A^2_\mu T^2 = \frac{g_2}{\sqrt2}\begin{pmatrix} 0 & W^+_\mu \\ W^-_\mu & 0 \end{pmatrix} \tag{28.185} \]
and
\[ g_2A^3_\mu T^3 + g_1B_\mu Y = e(T^3 + Y)A_\mu + e(\cot\theta_W T^3 - \tan\theta_W Y)Z_\mu. \tag{28.186} \]
Since we identify $A_\mu$ as the electromagnetic field and e as the electromagnetic coupling constant,
we identify
\[ Q = T^3 + Y \tag{28.187} \]
as the generator of electric charge. Then we see that
where
Having worked out the interactions of a single lepton generation, we now examine what hap-
pens when there is more than one of them. Let us consider the fields liI and ēI , where I =
1, 2, 3 is a generation index. The kinetic term for all these fields is
The most general Yukawa term we can write down now reads
where yIJ is a complex 3 × 3 matrix, and the generation indices are summed. We can make
unitary transformations in generation space on the fields: lI → LIJ lJ and ēI → E IJ ēJ , where
L and E are independent unitary matrices. The kinetic terms are unchanged, and the Yukawa
matrix y is replaced with L⊺yE. We can choose L and E so that L⊺yE is diagonal with positive
real entries $y_I$. The charged leptons then have masses $m_I = y_Iv/\sqrt2$, and the neutrinos remain
massless. In the currents, we simply add a generation index I to each field, and sum over it.
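\begin{align}
(D_\mu q)_{\alpha i} &= \partial_\mu q_{\alpha i} - ig_3A^a_\mu(T^a_3)_\alpha{}^\beta q_{\beta i} - ig_2A^a_\mu(T^a_2)_i{}^j q_{\alpha j} - ig_1(+\tfrac16)B_\mu q_{\alpha i}, \tag{28.194a} \\
(D_\mu\bar u)^\alpha &= \partial_\mu\bar u^\alpha - ig_3A^a_\mu(T^a_{\bar 3})^\alpha{}_\beta\bar u^\beta - ig_1(-\tfrac23)B_\mu\bar u^\alpha, \tag{28.194b} \\
(D_\mu\bar d)^\alpha &= \partial_\mu\bar d^\alpha - ig_3A^a_\mu(T^a_{\bar 3})^\alpha{}_\beta\bar d^\beta - ig_1(+\tfrac13)B_\mu\bar d^\alpha. \tag{28.194c}
\end{align}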
We rely on context to distinguish the SU(3) gauge fields from the SU(2) gauge fields. The
kinetic terms for q, ū, and d¯ are
We cannot write down a mass term involving q, ū, and (or) d¯ because there is no gauge-group
singlet contained in any of the products of their representations. But we are able to write down
Yukawa couplings of the form
where ϕ is the Higgs field in the (1, 2, −1/2) representation, and y ′ and y ′′ are the Yukawa
coupling constants.
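In unitary gauge, we replace $\phi_1$ with $(v + H)/\sqrt2$, where H is the real scalar field representing
the physical Higgs boson, and $\phi_2$ with zero. The Yukawa term becomes
\[ \mathcal{L}_{\rm Yuk} = -\frac{1}{\sqrt2}\,y'(v + H)\,q_{\alpha 2}\bar d^\alpha - \frac{1}{\sqrt2}\,y''(v + H)\,q_{\alpha 1}\bar u^\alpha + \text{h.c.} \tag{28.197} \]
It is now convenient to assign new names to the SU(2) components of q,
\[ q = \begin{pmatrix} u \\ d \end{pmatrix}. \tag{28.198} \]
Then we have
\[ \mathcal{L}_{\rm Yuk} = -\frac{1}{\sqrt2}\,y'(v + H)\bar D^\alpha D_\alpha - \frac{1}{\sqrt2}\,y''(v + H)\bar U^\alpha U_\alpha, \tag{28.199} \]
where we have defined Dirac fields for the down and up quarks,
\[ D_\alpha \equiv \begin{pmatrix} d_\alpha \\ \bar d^\dagger_\alpha \end{pmatrix}, \qquad U_\alpha \equiv \begin{pmatrix} u_\alpha \\ \bar u^\dagger_\alpha \end{pmatrix}. \tag{28.200} \]
We see that the down and up quarks have acquired masses
\[ m_d \equiv \frac{y'v}{\sqrt2}, \qquad m_u \equiv \frac{y''v}{\sqrt2}. \tag{28.201} \]
Since $Q = T^3 + Y$, we have
\[ Qu = +\frac23u, \qquad Qd = -\frac13d, \qquad Q\bar u = -\frac23\bar u, \qquad Q\bar d = +\frac13\bar d. \tag{28.202} \]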
This is just the set of electric charge assignments that we expect for the up and down quarks.
Using equations 28.185 and 28.189 in equations 28.194 and 28.195, we find the couplings of
the gauge fields to the quarks,
\[ \mathcal{L}_{\rm int} = \frac{1}{\sqrt2}g_2W^+_\mu J^{-\mu} + \frac{1}{\sqrt2}g_2W^-_\mu J^{+\mu} + \frac{e}{s_Wc_W}Z_\mu J^\mu_Z + eA_\mu J^\mu_{\rm EM}, \tag{28.203} \]
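Having worked out the interactions of a single quark generation, we now examine what happens
when there is more than one of them. Let us consider the fields $q_{\alpha iI}$, $\bar u_I$ and $\bar d_I$, where
I = 1, 2, 3 is a generation index. The kinetic term for all these fields is
\[ \mathcal{L}_{\rm kin} = iq^{\dagger\alpha iI}\bar\sigma^\mu(D_\mu)_{\alpha i}{}^{\beta j}q_{\beta jI} + i\bar u^\dagger_{\alpha I}\bar\sigma^\mu(D_\mu)^\alpha{}_\beta\bar u^\beta_I + i\bar d^\dagger_{\alpha I}\bar\sigma^\mu(D_\mu)^\alpha{}_\beta\bar d^\beta_I, \tag{28.205} \]
where the repeated generation index is summed. The most general Yukawa term we can write
down now reads
\[ \mathcal{L}_{\rm Yuk} = -\epsilon^{ij}\phi_i\,q_{\alpha jI}\,y'_{IJ}\,\bar d^\alpha_J - \phi^{\dagger i}q_{\alpha iI}\,y''_{IJ}\,\bar u^\alpha_J + \text{h.c.}, \tag{28.206} \]
where $y'_{IJ}$ and $y''_{IJ}$ are complex 3 × 3 matrices. In unitary gauge, this becomes
\[ \mathcal{L}_{\rm Yuk} = -\frac{1}{\sqrt2}(v + H)\,d_{\alpha I}\,y'_{IJ}\,\bar d^\alpha_J - \frac{1}{\sqrt2}(v + H)\,u_{\alpha I}\,y''_{IJ}\,\bar u^\alpha_J + \text{h.c.} \tag{28.207} \]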
We can make unitary transformations in generation space on the fields: $d_I \to D_{IJ}d_J$, $\bar d_I \to \bar D_{IJ}\bar d_J$,
$u_I \to U_{IJ}u_J$ and $\bar u_I \to \bar U_{IJ}\bar u_J$, where $U$, $\bar U$, $D$, $\bar D$ are independent unitary matrices.
The kinetic terms are unchanged (except for the couplings to the $W^\pm$, as we will discuss
momentarily), and the Yukawa matrices y′ and y′′ are replaced with $D^{\intercal}y'\bar D$ and $U^{\intercal}y''\bar U$.
We can choose $U$, $\bar U$, $D$, $\bar D$ so that $D^{\intercal}y'\bar D$ and $U^{\intercal}y''\bar U$ are diagonal with positive real entries
$y'_I$ and $y''_I$. The down quarks $d_I$ then have masses $m_{d_I} = y'_Iv/\sqrt2$, and the up quarks $u_I$ have
masses $m_{u_I} = y''_Iv/\sqrt2$. In the neutral currents, we simply add a generation index I to each
field, and sum over it. The charged currents are more complicated, however; they become
and ci = cos θi and si = sin θi . Note that the charged currents now have some terms with
a phase factor eiδ , and some without. Since the time-reversal operator T is antiunitary, the
charged currents do not transform in a simple way under time reversal. This implies that the
charged current terms in Lint are not time-reversal invariant; hence the electroweak interac-
tions violate time-reversal symmetry. Since CP T is always a good symmetry, time-reversal
violation is equivalent to CP violation; δ is therefore sometimes called the CP violating phase.
Finally, we show that the Standard Model is anomaly free. The left-handed Weyl fields are in
three copies of the representation (1, 2, −1/2) ⊕ (1, 1, +1) ⊕ (3, 2, +1/6) ⊕ (3̄, 1, −2/3) ⊕
(3̄, 1, +1/3).
The 3-3-3 anomaly cancels if there are equal numbers of 3’s and 3̄’s; in doing this counting, each
SU(2) component counts separately. We see that each generation has two 3’s from (3, 2, +1/6)
and two 3̄’s from (3̄, 1, −2/3)⊕(3̄, 1, +1/3); thus the 3-3-3 anomaly cancels. There is no 2-2-2
anomaly because the 2 is a pseudoreal representation.
For mixed anomalies such as 3-3-1 and 2-2-1, we require $\sum_i T(R_i)Q_i$ to vanish. For 3-3-1,
each SU(2) component counts separately. Setting T(3) = T(3̄) = 1, we have 2 · (+1/6) +
(+1/3) + (−2/3) = 0. For 2-2-1, each SU(3) component counts separately. Setting T(2) = 1,
we have −1/2 + 3(+1/6) = 0.
For 1-1-1, we require $\sum_i Q_i^3$ to vanish, where the sum counts each SU(2) and SU(3) component
separately. We have 1·2·(−1/2)³ + 1·1·(+1)³ + 3·2·(+1/6)³ + 3·1·(−2/3)³ + 3·1·(+1/3)³ =
0. Other possible combinations, such as 1-2-3 or 2-2-3, always involve the trace of a single
SU(2) or SU(3) generator, and this vanishes. Finally, the global SU(2) anomaly is absent if
there is an even number of 2's; we have 1 + 3 = 4 2's.
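The anomaly sums above are easy to reproduce with exact fractions. The sketch below encodes one generation of left-handed Weyl fields and evaluates each condition with the normalizations used in the text (T(3) = T(3̄) = 1, T(2) = 1, A(3) = 1, A(3̄) = −1); the data layout and names are illustrative assumptions.

```python
from fractions import Fraction as F

# (name, SU(3) representation, SU(2) dimension, hypercharge) for one generation
GENERATION = [
    ("l",    "1",    2, F(-1, 2)),
    ("ebar", "1",    1, F(1)),
    ("q",    "3",    2, F(1, 6)),
    ("ubar", "3bar", 1, F(-2, 3)),
    ("dbar", "3bar", 1, F(1, 3)),
]
COLOR_DIM = {"1": 1, "3": 3, "3bar": 3}
ANOMALY_COEFF = {"1": 0, "3": 1, "3bar": -1}  # A(3) = 1, A(3bar) = -1


def anomaly_sums(fields=GENERATION):
    """Return the anomaly sums for the gauge group SU(3) x SU(2) x U(1)."""
    return {
        # each SU(2) component of a colored field counts separately
        "3-3-3": sum(ANOMALY_COEFF[c] * n2 for _, c, n2, _ in fields),
        "3-3-1": sum(n2 * y for _, c, n2, y in fields if c != "1"),
        # each SU(3) component of a doublet counts separately
        "2-2-1": sum(COLOR_DIM[c] * y for _, c, n2, y in fields if n2 == 2),
        "1-1-1": sum(COLOR_DIM[c] * n2 * y**3 for _, c, n2, y in fields),
        # number of SU(2) doublets, relevant for the global SU(2) anomaly
        "doublets": sum(COLOR_DIM[c] for _, c, n2, _ in fields if n2 == 2),
    }


if __name__ == "__main__":
    for label, value in anomaly_sums().items():
        print(label, value)
```

Since every sum vanishes for a single generation, it also vanishes for three copies.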
Figure 28.7: Standard Model of Particle Physics. The diagram shows the elementary particles
of the Standard Model (the Higgs boson, the three generations of quarks and leptons, and the
gauge bosons), including their names, masses, spins, charges, chiralities, and interactions with
the strong, weak and electromagnetic forces. It also depicts the crucial role of the Higgs boson
in electroweak symmetry breaking, and shows how the properties of the various particles differ
in the (high-energy) symmetric phase (top) and the (low-energy) broken-symmetry phase
(bottom).
Part VI
If the state of a thermodynamic system can be fully characterized by the values of the ther-
modynamic variables, and if these values are invariant over time, one says that it is in a state
of thermodynamic equilibrium. Thermodynamic equilibrium occurs when all fast processes
have already occurred, while the slow ones have yet to take place. Clearly the distinction be-
tween fast and slow processes is dependent on the observation time τ that is being considered.
A system can be in equilibrium if the observation time is fairly short, while it is no longer
possible to consider it in equilibrium for longer observation times. A more curious situation
is that the same system can be considered in equilibrium, but with different properties, for
different observation times.
Let us consider two thermodynamic systems, 1 and 2, that can be made to interact with one
another. Variables like the volume V , the number of particles N , and the internal energy U ,
whose value (relative to the total system) is equal to the sum of the values they assume in
the single systems, are called additive or extensive. Strictly speaking, internal energy is not
extensive, unless the interaction between 1 and 2 can be neglected.
29.2 Entropy formulation of thermodynamics –403/453–
The interaction between thermodynamic systems is usually represented by idealized walls that
allow the passage of one (or more) extensive quantities from one system to the other. Among
the various possibilities, the following are usually considered:
Thermally conductive walls These allow the passage of energy, but not of volume or particles.
Semipermeable walls These allow the passage of particles belonging to a given chemical species.
The space of possible states of equilibrium (compatible with constraints and initial conditions)
is called the space of virtual states. The initial state is obviously a (specific) virtual state. The
central problem of thermodynamics can obviously be restated as follows: Characterize the
actual state of equilibrium among all virtual states.
2. Convexity: If $X^1 = (X_0^1, X_1^1, \cdots, X_r^1)$ and $X^2 = (X_0^2, X_1^2, \cdots, X_r^2)$ are two
thermodynamic states of the same system, then for any $\alpha$ between 0 and 1 one has
$S((1-\alpha)X^1 + \alpha X^2) \ge (1-\alpha)S(X^1) + \alpha S(X^2)$. Differentiating with respect to
$\alpha$ at $\alpha = 0$, one obtains
\[ \sum_{i=0}^{r} \left.\frac{\partial S}{\partial X_i}\right|_{X^1} (X_i^2 - X_i^1) \ge S(X^2) - S(X^1), \tag{29.3} \]
which expresses the fact that the surface $S(X_0, X_1, \cdots, X_r)$ is always below the plane
that is tangent to each of its points. (We adopt the convention that convex means upper
convex.)
The entropy postulate allows one to solve the central problem of thermodynamics, by refer-
ring it back to the solution of a constrained extremum problem: The equilibrium state cor-
responds to the maximum entropy compatible with the constraints.
–404/453– Chapter 29 Thermodynamics
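We will denote the value of $U^{(1)}$ at equilibrium by $U^{(1)}_{eq}$. One therefore has
\[ \left.\frac{\partial S^{(1)}}{\partial U^{(1)}}\right|_{U^{(1)}_{eq}} = \left.\frac{\partial S^{(2)}}{\partial U^{(2)}}\right|_{U^{(2)}_{eq}}. \tag{29.7} \]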
One can easily prove that between two systems, both initially at the same temperature, volume
is initially released by the system in which p is lower to the system in which p is higher. Later,
we will show that p is the pressure of the system.
A semipermeable wall
Let us consider a system composed of several chemical species, and let us introduce the num-
ber of molecules N1 , · · · , Nr belonging to the chemical species that constitute it as part of the
thermodynamic variables. Let us suppose that two systems of this type are separated by a wall
that only allows the k-th chemical species to pass. Clearly, it is impossible for the exchange of
molecules to occur without an exchange of energy. If we introduce the quantity µi by
\[ \frac{\mu_i}{T} = -\frac{\partial S}{\partial N_i}, \tag{29.12} \]
the equilibrium conditions will be
\[ dS = \frac{dU}{T} + \sum_{i=1}^{r} \left.\frac{\partial S}{\partial X_i}\right|_{U,\cdots,X_r} dX_i. \tag{29.15} \]
Temperature
Let us consider a system made up of a thermal engine and two heat reservoirs with T1 > T2 .
A heat reservoir is a system for which T is independent of U . The whole compound system is
enclosed in a container that allows it to exchange energy with the environment only in a purely
mechanical way.
Let the system evolve from an initial equilibrium condition, in which the first heat reservoir has
internal energy U1 , the second has internal energy U2 , and the thermal engine is in some equi-
librium state, to a final equilibrium state in which the first heat reservoir has internal energy
U1′ , the second has U2′ . Thus the work performed by the system is W = (U1 + U2 ) − (U1′ + U2′ ),
and the thermal engine is back to its initial state. By definition, the efficiency of the engine is
given by η = W/(U1 − U1′ ).
In a transformation of this kind, the total entropy of the compound system cannot become
smaller:
\[ S^{(1)}(U_1) + S^{(2)}(U_2) \le S^{(1)}(U_1') + S^{(2)}(U_2'). \tag{29.18} \]
Since we are dealing with heat reservoirs, we have
\[ S^{(i)}(U_i') = S^{(i)}(U_i) + \frac{U_i' - U_i}{T_i}, \quad i = 1, 2. \tag{29.19} \]
It follows from equations 29.18 and 29.19 that
\[ \frac{U_1 - U_1'}{T_1} \le \frac{U_2' - U_2}{T_2}. \tag{29.20} \]
Using the definition of the efficiency, we can get
\[ \eta \le 1 - \frac{T_2}{T_1}. \tag{29.21} \]
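The step from equation 29.20 to equation 29.21 can be spelled out as follows. This is a short sketch that assumes the first reservoir actually gives up energy, $U_1 - U_1' > 0$, so that dividing by it preserves the direction of the inequality.

```latex
\begin{align*}
W &= (U_1 + U_2) - (U_1' + U_2') = (U_1 - U_1') - (U_2' - U_2), \\
\eta &= \frac{W}{U_1 - U_1'} = 1 - \frac{U_2' - U_2}{U_1 - U_1'}, \\
\frac{U_2' - U_2}{U_1 - U_1'} &\ge \frac{T_2}{T_1}
  \qquad \text{(from 29.20, using } U_1 - U_1' > 0\text{)}, \\
\eta &\le 1 - \frac{T_2}{T_1}.
\end{align*}
```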
Comparing this bound with the maximum efficiency evaluated in elementary thermodynamics, we
conclude that T is the absolute temperature, up to an overall factor, which can be fixed to 1 by
rescaling S.
Pressure
Let us consider an infinitesimal variation of V . In this case, mechanics tells us that the work
performed by the system is given by δW = P dV . Thus we have
\[ \left.\frac{\partial S}{\partial V}\right|_{U,\cdots,X_r} = \frac{P}{T}. \tag{29.22} \]
This allows us to identify the pressure P with the quantity p we defined previously.
becomes a dependent variable that satisfies a variational principle. This formalism is known
as the energy scheme.
Let ∆X be a virtual variation of the extensive variables (excluding internal energy U ) with
respect to the equilibrium value Xeq . Then
Since S is a monotonically increasing function of U , there exists a value U ′ > U such that
S(U ′ , Xeq + ∆X) = S(U, Xeq ). Therefore, if S is kept constant, as the system moves out
of equilibrium, U cannot but increase. Thus, in energy formalism, the maximum entropy
principle is replaced by the principle of minimum internal energy: Among all states with a
specific entropy value, the state of equilibrium is that in which internal energy is minimal.
The fundamental equation in the energy scheme is U = U (S, X1 , · · · , Xr ). Its differential
assumes the form
\[ dU = T\,dS + \sum_{i=1}^{r} f_i\,dX_i. \tag{29.25} \]
Furthermore, it can be derived that
where X = {X1 , · · · , Xr }. It can be shown that thermodynamic equilibrium under these
conditions is characterized by the following variational principle: The value of the Helmholtz
free energy is minimal for the equilibrium state among all virtual states at the given tem-
perature T .
Let us now consider more generally the Legendre transform of the internal energy U with
respect to the intensive variable fi :
Then, the state of equilibrium is specified by the following criterion: Among all the states
that have the same value of f1 , the state of equilibrium is that which corresponds to the
minimum value of Φ.
The partial derivative of Φ, performed with respect to f1 , with the other extensive variables
kept fixed, yields the value of the extensive variable X1 :
\[ \left.\frac{\partial \Phi}{\partial f_1}\right|_{S,X_2,\cdots,X_r} = -X_1(S, f_1, X_2, \cdots, X_r). \tag{29.30} \]
Considering two intensive variables T and f1 , we would introduce the thermodynamic po-
tential Φ(T, f1 , X2 , · · · , Xr ), obtained as a Legendre transform of U with respect to S and
X1 :
Φ(T, f1 , X2 , · · · , Xr ) = U − T S − f1 X1 . (29.31)
This thermodynamic potential assumes at equilibrium the minimum value among all the states
with the same values of T and f1 .
We can obtain a whole series of thermodynamic potentials by using a Legendre transform with
respect to the extensive variables Xi . However, we cannot eliminate all extensive variables in
this manner. We will see later that if we did this, the resulting thermodynamic potential would
identically vanish. A general thermodynamic potential
\[ \Phi(T, f_1, \cdots, f_k, X_{k+1}, \cdots, X_r) = U - TS - \sum_{i=1}^{k} f_i X_i \tag{29.32} \]
is concave as a function of the remaining extensive variables, for fixed values of the intensive
variables $f_1, \cdots, f_k$. Φ on the other hand is convex as a function of the intensive variables $f_i$,
when the extensive variables are fixed. The concavity and convexity are connected to the stabil-
ity of thermodynamic equilibrium. For example, the specific heat is always positive whatever
constraint is placed on the system since
\[ C \equiv T\frac{\partial S}{\partial T} = -T\frac{\partial^2 \Phi}{\partial T^2} > 0. \tag{29.33} \]
If this were not the case, a small fluctuation in temperature that might make one of the sys-
tem’s regions colder would lead to this system claiming heat from surrounding regions. With
negative specific heat, this energy would make the temperature of the region diminish further,
and in this manner, the energy fluctuation would be amplified.
If we now take P ’s derivative with respect to T and we use the theorem of the equality of mixed
derivatives, we obtain
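\[ \left.\frac{\partial P}{\partial T}\right|_{V,X_2,\cdots,X_r} = -\frac{\partial^2 F}{\partial T\,\partial V} = \left.\frac{\partial S}{\partial V}\right|_{T,X_2,\cdots,X_r}. \tag{29.36} \]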
These relations between thermodynamic derivatives that derive from the equality of mixed
derivatives of thermodynamic potentials are called Maxwell relations.
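As a numerical sanity check of the Maxwell relation 29.36, the sketch below compares the two sides by central finite differences. It assumes, purely for illustration, the Helmholtz free energy of a classical monatomic ideal gas in units with ħ = k_B = 1; the function names, parameter values and step sizes are hypothetical choices, not part of the notes.

```python
import math

def helmholtz(T, V, N=1.0, m=1.0):
    """Free energy F(T, V, N) of a classical monatomic ideal gas (hbar = k_B = 1)."""
    lam = math.sqrt(2.0 * math.pi / (m * T))  # thermal wavelength
    return -N * T * (math.log(V / (N * lam**3)) + 1.0)

def pressure(T, V, h=1e-5):
    """P = -dF/dV at fixed T, by central difference."""
    return -(helmholtz(T, V + h) - helmholtz(T, V - h)) / (2.0 * h)

def entropy(T, V, h=1e-5):
    """S = -dF/dT at fixed V, by central difference."""
    return -(helmholtz(T + h, V) - helmholtz(T - h, V)) / (2.0 * h)

def maxwell_sides(T, V, h=1e-3):
    """Return (dP/dT at fixed V, dS/dV at fixed T)."""
    dP_dT = (pressure(T + h, V) - pressure(T - h, V)) / (2.0 * h)
    dS_dV = (entropy(T, V + h) - entropy(T, V - h)) / (2.0 * h)
    return dP_dT, dS_dV

lhs, rhs = maxwell_sides(T=2.0, V=3.0)
print(lhs, rhs)  # for this model both sides should be close to N / V
```

For the ideal gas both derivatives reduce to N/V, so the two printed numbers should agree up to finite-difference error.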
The free energy designation is derived from the following property. If a system is put in contact
with a reservoir at temperature T , the maximum quantity of work Wmax that it can perform
on its environment is equal to the variation in free energy between the initial and final states.
G(T, P, X2 , · · · , Xr ) = F + P V = U − T S + P V. (29.37)
The variational principle satisfied by the Gibbs free energy is the following: Among all states
that have the same temperature and pressure values, the state of equilibrium is that in
which the Gibbs free energy assumes the minimum value.
G’s differential is expressed as follows:
\[ dG = -S\,dT + V\,dP + \sum_{i=2}^{r} f_i\,dX_i. \tag{29.38} \]
If a system is brought toward equilibrium while temperature and pressure are kept constant, the
maximum work that can be performed on its environment is given precisely by the difference
between the initial and final values of G.
If we Legendre transform the internal energy U with respect to V , we obtain a new thermo-
dynamic potential, usually denoted by H and called enthalpy:
H(S, P, X2 , · · · , Xr ) = U + P V. (29.39)
Enthalpy governs the equilibrium of adiabatic processes that occur while pressure is constant:
Among all states that have the same entropy and pressure values, the state of equilibrium
is the one that corresponds to the minimum value of enthalpy.
If a system relaxes toward equilibrium while the pressure is kept constant, the maximum heat
that can be produced by the system is equal to its variation in enthalpy. For this reason, enthalpy
is also called free heat. The differential of H is given by
\[ dH = T\,dS + V\,dP + \sum_{i=2}^{r} f_i\,dX_i. \tag{29.40} \]
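The equality of the mixed derivatives of G and H yields two more Maxwell relations:
\[ \left.\frac{\partial S}{\partial P}\right|_{T,X_2,\cdots,X_r} = -\left.\frac{\partial V}{\partial T}\right|_{P,X_2,\cdots,X_r}, \qquad \left.\frac{\partial T}{\partial P}\right|_{S,X_2,\cdots,X_r} = \left.\frac{\partial V}{\partial S}\right|_{P,X_2,\cdots,X_r}. \tag{29.41} \]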
dΩ = −S dT − P dV − N dµ . (29.43)
If one transforms U instead, one obtains a rarely used potential that depends on S, V , and µ,
which we will designate as Φ(S, V, µ) = U − µN . Its differential is given by dΦ = T dS −
P dV − N dµ.
U = T S − P V + µN, (29.45)
From the Euler equation, it follows that the Legendre transform of U with respect to all exten-
sive variables vanishes identically.
Note: The interpretation of the chemical potential as a per particle density of Gibbs free energy is valid
only in the case of simple fluids – in the case of a mixture of several chemical species, it is no longer valid.
29.4 Thermodynamic systems with multi-components –411/453–
If we take the Euler equation’s differential and subtract both sides from the usual expression
of dU , we obtain the Gibbs-Duhem equation:
\[ S\,dT + \sum_{i=1}^{r} X_i\,df_i = 0. \tag{29.47} \]
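The subtraction that leads to equation 29.47 can be written out explicitly. The sketch assumes the Euler equation in its general form $U = TS + \sum_i f_i X_i$, of which equation 29.45 is the simple-fluid case with $f = (-P, \mu)$ and $X = (V, N)$.

```latex
\begin{align*}
U &= TS + \sum_{i=1}^{r} f_i X_i
  && \text{(Euler equation, assumed general form)} \\
dU &= T\,dS + S\,dT + \sum_{i=1}^{r} \left( f_i\,dX_i + X_i\,df_i \right)
  && \text{(differential of the Euler equation)} \\
dU &= T\,dS + \sum_{i=1}^{r} f_i\,dX_i
  && \text{(equation 29.25)} \\
0 &= S\,dT + \sum_{i=1}^{r} X_i\,df_i
  && \text{(difference of the two lines above)}
\end{align*}
```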
dµ = v dP − s dT , (29.48)
where we have made use of the fact that G is extensive, and therefore proportional to N . In
the case of the simple fluid, we also have equation of state s = s(T, P ), which expresses the
entropy per particle s as a function of P and T . In reality, the two equations of state are not
completely independent, because of the Maxwell relations:
\[ \left.\frac{\partial s}{\partial P}\right|_{T} = -\left.\frac{\partial v}{\partial T}\right|_{P}. \tag{29.50} \]
If temperature and pressure are kept fixed, the variation of Gibbs free energy for a certain
variation in the number of particles due to the reaction will be
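\[ \delta G = \sum_i \left.\frac{\partial G}{\partial N_i}\right|_{P,T} \delta N_i \propto \sum_i \left.\frac{\partial G}{\partial N_i}\right|_{P,T} \nu_i = \sum_i \mu_i \nu_i. \tag{29.53} \]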
Since at equilibrium one must have δG = 0 for any virtual variation of the Ni , one will have
\[ \sum_i \mu_i \nu_i = 0. \tag{29.54} \]
In the case of a simple fluid, it is realized, for example, when a liquid coexists with its vapor
inside a container. In this case, the intensive variables assume the same value in both sys-
tems, while densities assume different values. In these cases, we refer to each of the coexisting
homogeneous systems as a phase.
One can describe phase coexistence by saying that the equation of state v = v(P, T ) does
not admit of a unique solution, but instead allows for at least the two solutions v = vliq and
v = vvap which correspond to the liquid and vapor, respectively. Since the liquid and vapor
coexist and can exchange particles, the chemical potential of the liquid has to be equal to that
of the vapor:
µliq (P, T ) = µvap (P, T ). (29.55)
On the other hand, we know that for a simple fluid, the chemical potential is equal to the Gibbs
free energy per particle. Thus, the Gibbs free energy in the total system does not depend on
the number of particles that make up the liquid and the vapor system:
In the equation of state P = P (v, T ), phase coexistence appears as a horizontal segment, for
a given value of T , and for values of v between vliq and vvap , as shown in Figure 29.1 (a).
Consider the Helmholtz free energy F . The pressure P is obtained as the derivative of −F
with respect to V at a given value of T . The isotherm curve F (V ) exhibits a straight segment
with slope −Pt , lying between Vliq = N vliq and Vvap = N vvap , as shown in Figure 29.1 (b).
The Gibbs free energy has a turning point at Pt . The two slopes that coexist at the turning
point correspond to Vliq and Vvap , respectively, as shown in Figure 29.1 (c).
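[Figure 29.1: three panels. (a) P versus v, with the coexistence pressure Pt and the volumes vliq and vvap marked; (b) F versus V, with Vliq and Vvap marked; (c) G versus P, with Pt marked.]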
Figure 29.1: (a) Isotherm P as a function of the volume per particle v for a simple liquid; (b)
Isotherm of the Helmholtz free energy F as a function of volume V ; (c) Isotherm of the Gibbs
free energy G as a function of pressure P . The black and red dashed lines in each curve represent
metastable and unstable states, respectively.
coexistence we just discussed, or continuous. In the first case, the densities present a disconti-
nuity at the transition (first order transitions), while in the second, they vary with continuity,
even though their derivatives can exhibit some singularities (second order transitions).
In the case of a simple fluid, it is possible to identify the transition curve within the plane of
the intensive variables (P, T ), as shown in Figure 29.2 (a), from the condition of equality of
the chemical potential µ between the two coexisting phases:
Taking the total derivative of equation 29.58 with respect to T , along the transition line Pt (T ),
we can obtain the Clausius–Clapeyron equation for phase coexistence:
\[ \frac{dP_t}{dT} = \frac{s_{vap} - s_{liq}}{v_{vap} - v_{liq}}. \tag{29.59} \]
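The derivation of equation 29.59 is short enough to write out. The sketch below combines the coexistence condition 29.55, evaluated along the transition line $P_t(T)$, with the simple-fluid relation $d\mu = v\,dP - s\,dT$ of equation 29.48 applied to each phase.

```latex
\begin{align*}
\mu_{liq}\big(P_t(T), T\big) &= \mu_{vap}\big(P_t(T), T\big), \\
\frac{d}{dT}\,\mu_{\alpha}\big(P_t(T), T\big)
  &= v_{\alpha}\,\frac{dP_t}{dT} - s_{\alpha},
  \qquad \alpha = liq,\ vap, \\
v_{liq}\,\frac{dP_t}{dT} - s_{liq} &= v_{vap}\,\frac{dP_t}{dT} - s_{vap}, \\
\frac{dP_t}{dT} &= \frac{s_{vap} - s_{liq}}{v_{vap} - v_{liq}}.
\end{align*}
```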
We can also represent the phase diagram in the plane (v, T ), as shown in Figure 29.2 (b). In
this manner, phase coexistence is represented by the existence of a forbidden region vliq (T ) <
v < vvap (T ) in the plane. Outside this region, it is possible to obtain any given value of v in
a homogeneous system. Within this region, instead, the system separates into a liquid and a
vapor phase.
\[ \mu_i^{\alpha} = \mu_i, \quad i = 1, \cdots, r, \quad \alpha = 1, \cdots, q. \tag{29.60} \]
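[Figure 29.2: two panels. (a) P versus T, with the critical point (Tc, Pc) marked; (b) T versus v, with the critical point (vc, Tc) and, at a temperature T0, the volumes vliq and vvap marked.]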
Figure 29.2: (a) The transition curve within the plane (P, T ); (b) Coexistence curve for a
simple fluid. The critical point corresponds to (vc , Tc , Pc ).
In this equation, µi is the shared value taken by the chemical potential of species i. We thus
obtain r(q − 1) equations for q(r − 1) + 2 unknown values. These unknown values are P ,
T , and the q(r − 1) independent densities $x_i^{\alpha}$ of species i in phase α. Generically speaking,
f = 2 − q + r free parameters remain. For f = 0, coexistence will occur in isolated points of
the phase diagram, for f = 1, along a line, and so on. The quantity f is called variance.
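The counting argument above is easy to turn into a few lines of code. The following minimal sketch (the function name `variance` is a hypothetical choice) simply subtracts the number of equations from the number of unknowns.

```python
def variance(r, q):
    """Number of free parameters f for r chemical species coexisting in q phases."""
    unknowns = q * (r - 1) + 2   # P, T and the q(r - 1) independent densities
    equations = r * (q - 1)      # equalities of chemical potentials between phases
    return unknowns - equations  # algebraically equal to 2 - q + r

print(variance(1, 2))  # liquid-vapor coexistence of a simple fluid: a line
print(variance(1, 3))  # triple point of a simple fluid: an isolated point
```

For a simple fluid (r = 1) this gives f = 1 for two coexisting phases and f = 0 for three, in line with the statement that coexistence occurs along a line or at an isolated point.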
\[ \chi = -\frac{1}{V}\left.\frac{\partial V}{\partial P}\right|_{T} \to \infty \quad \text{for } T \to T_c. \tag{29.61} \]
Now, consider a system whose space of states is the direct product of two subspaces, i.e.,
H = HA ⊗ HB . (30.2)
Thus, we have
\[ |\psi\rangle\langle\psi| = C_{iI} C^{*}_{jJ}\, |i, I\rangle\langle j, J|. \tag{30.4} \]
We define the partial trace of |ψ⟩⟨ψ| on B as
\[ \mathrm{Tr}_B(|\psi\rangle\langle\psi|) \equiv \sum_{I} C_{iI} C^{*}_{jI}\, |i\rangle\langle j|. \tag{30.5} \]
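To make the definition 30.5 concrete, here is a minimal sketch that computes the reduced density matrix from the coefficient table C[i][I] of a pure state. The two-qubit Bell state used as input is a hypothetical example, not one taken from the notes.

```python
import math

def partial_trace_B(C):
    """Reduced density matrix of A for |psi> = sum_{i,I} C[i][I] |i, I>.

    Implements rho[i][j] = sum_I C[i][I] * conj(C[j][I]).
    """
    dim_A = len(C)
    dim_B = len(C[0])
    return [[sum(C[i][I] * C[j][I].conjugate() for I in range(dim_B))
             for j in range(dim_A)]
            for i in range(dim_A)]

# Hypothetical example: two qubits in the Bell state (|0,0> + |1,1>) / sqrt(2).
amp = 1.0 / math.sqrt(2.0)
bell = [[amp, 0.0], [0.0, amp]]
rho_A = partial_trace_B(bell)
print(rho_A)
```

For this maximally entangled input the result is proportional to the identity, i.e. subsystem A alone is in a mixed state even though the total state is pure.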
Now, if we take A as the system and B as the environment, a practical observable acts only
on the system. For any system that is coupled to an environment, its state can be described by an
operator
ρ = Trenv (|ψ⟩⟨ψ|). (30.8)
Thus the expectation value of the measurement on the system is
Tr[ρOsys ]. (30.9)
–416/453– Chapter 30 Principles of Statistical Mechanics and Ensembles
We have
Tr[ρO] = pi ⟨i|O|i⟩ . (30.12)
It is reasonable to assume pi as the (classical) probability of the system in (pure) state |i⟩. One
fundamental postulate of statistical mechanics is that the entropy operator of the system is
Ŝ = − ln ρ. (30.13)
with
\[ \rho_n = \begin{cases} 1/\Gamma & \text{for each of the accessible states} \\ 0 & \text{for all other states} \end{cases}. \tag{30.16} \]
The entropy of the system is
S = ln Γ. (30.17)
Thus, we can identify T as the absolute temperature and F as the Helmholtz free energy.
\[ \rho = \frac{e^{-\beta(H - \mu N)}}{\mathrm{Tr}[e^{-\beta(H - \mu N)}]}. \tag{30.23} \]
Now, we define
\[ Z_\Omega(\beta, V, \mu) \equiv \mathrm{Tr}\, e^{-\beta(H - \mu N)}, \qquad \Omega(\beta, V, \mu) \equiv -\ln Z_\Omega/\beta. \tag{30.24} \]
Thus, we can identify µ as the chemical potential and Ω as the grand canonical potential.
30.3 Fluctuations
30.3.1 Canonical Ensemble
For the canonical ensemble, we have
\[ \left.\frac{\partial \rho}{\partial \beta}\right|_{N,V} = -\rho H + \rho\,\mathrm{Tr}[\rho H]. \tag{30.28} \]
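\[ \left.\frac{\partial U}{\partial \beta}\right|_{N,V} = -\mathrm{Tr}[\rho H^2] + (\mathrm{Tr}[\rho H])^2 = -\langle E^2 \rangle + \langle E \rangle^2 = -\langle (\Delta E)^2 \rangle. \tag{30.29} \]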
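Equation 30.29 can be checked numerically on a small system. The sketch below assumes a hypothetical three-level spectrum (arbitrary units, k_B = 1) and compares a finite-difference estimate of ∂U/∂β with minus the energy variance.

```python
import math

def canonical_moments(levels, beta):
    """Return (<E>, <E^2>) for the canonical distribution over the given levels."""
    weights = [math.exp(-beta * e) for e in levels]
    Z = sum(weights)
    mean_E = sum(w * e for w, e in zip(weights, levels)) / Z
    mean_E2 = sum(w * e * e for w, e in zip(weights, levels)) / Z
    return mean_E, mean_E2

levels = [0.0, 1.0, 2.5]   # hypothetical spectrum, arbitrary units
beta, h = 0.7, 1e-5

mean_E, mean_E2 = canonical_moments(levels, beta)
variance_E = mean_E2 - mean_E**2

# dU/dbeta estimated by a central finite difference
dU_dbeta = (canonical_moments(levels, beta + h)[0]
            - canonical_moments(levels, beta - h)[0]) / (2.0 * h)

print(dU_dbeta, -variance_E)  # the two numbers should agree closely
```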
\[ \frac{\langle (\Delta n)^2 \rangle}{\langle n \rangle^2} = \frac{T}{V}\kappa_T, \tag{30.31} \]
where $n = N/V$ is the number density and $\kappa_T = -\frac{1}{v}\left(\frac{\partial v}{\partial P}\right)_T$ is the isothermal
compressibility of the system. Thus, the relative root-mean-square fluctuation in the particle density of
the given system is ordinarily $O(N^{-1/2})$ and, hence, negligible.
However, in situations accompanying phase transitions, the compressibility of a given sys-
tem can become excessively large. In the region of phase transitions, especially at the critical
points, we encounter unusually large fluctuations in the particle density of the system. Such
fluctuations indeed exist and account for phenomena like critical opalescence. It is clear that
under these circumstances the formalism of the grand canonical ensemble could, in principle,
lead to results that are not necessarily identical to the ones following from the correspond-
ing canonical ensemble. In such cases, it is the formalism of the grand canonical ensemble
that will have to be preferred because only this one will provide a correct picture of the actual
physical situation.
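The energy fluctuation in the grand canonical ensemble is
\[ \langle (\Delta E)^2 \rangle = T^2 C_V + \left( \left.\frac{\partial U}{\partial N}\right|_{T,V} \right)^2 \langle (\Delta N)^2 \rangle. \tag{30.32} \]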
The mean-square fluctuation in the energy of a system in the grand canonical ensemble is equal
to the value it would have in the canonical ensemble plus a contribution arising from the fact
that now the particle number N is also fluctuating. Again, under ordinary circumstances, the
relative root-mean-square fluctuation in the energy density of the system would be practically
negligible. However, in the region of phase transitions, unusually large fluctuations in the
value of this variable can arise by virtue of the second term in the formula.
Chapter 31
Interaction-free Systems
|n1 , n2 , · · · , ni , · · ·⟩ , (31.1)
where ni is the number of particles in state |i⟩. Here, we choose |i⟩ as the energy eigenstate
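with energy $\epsilon_i$. Adopting the grand canonical ensemble, we have
\[ Z_\Omega = \mathrm{Tr}\, e^{-\beta(H - \mu N)} = \sum_{n_1, \cdots, n_i, \cdots} e^{-\beta \sum_i n_i(\epsilon_i - \mu)} = \prod_i \sum_{n_i = 0}^{\infty} \left[ e^{-\beta(\epsilon_i - \mu)} \right]^{n_i} = \prod_i \frac{1}{1 - e^{-\beta(\epsilon_i - \mu)}}. \tag{31.2} \]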
Thus, the grand canonical potential of the system is
\[ \Omega = -\beta^{-1} \ln Z_\Omega = T \sum_i \ln\left( 1 - e^{-\beta(\epsilon_i - \mu)} \right). \tag{31.3} \]
The chemical potential of the system must satisfy µ < ϵ0 , where ϵ0 is the energy of the
ground state. To derive further results, we introduce a parameter z, called the
fugacity of the system, defined by the relation
z ≡ eβµ . (31.4)
\[ N = -\left.\frac{\partial \Omega}{\partial \mu}\right|_{T,V} = \sum_i \frac{1}{e^{\beta\epsilon_i} z^{-1} - 1}. \tag{31.5} \]
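The relation between equations 31.3 and 31.5 can be verified numerically. The sketch below assumes a hypothetical set of four single-particle levels (k_B = 1) and compares the finite-difference derivative −∂Ω/∂µ with the explicit sum over occupation numbers.

```python
import math

def grand_potential(mu, T, levels):
    """Omega = T * sum_i ln(1 - exp(-(eps_i - mu) / T)) for ideal bosons."""
    return T * sum(math.log(1.0 - math.exp(-(e - mu) / T)) for e in levels)

def mean_number(mu, T, levels):
    """N = sum_i 1 / (exp(eps_i / T) / z - 1) with fugacity z = exp(mu / T)."""
    z = math.exp(mu / T)
    return sum(1.0 / (math.exp(e / T) / z - 1.0) for e in levels)

levels = [0.0, 0.5, 1.0, 1.5]   # hypothetical single-particle energies
T, mu, h = 1.0, -0.3, 1e-6      # mu must stay below the ground-state energy

N_from_omega = -(grand_potential(mu + h, T, levels)
                 - grand_potential(mu - h, T, levels)) / (2.0 * h)
print(N_from_omega, mean_number(mu, T, levels))
```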
F = −N T ln Z1 = −N T ln N + · · · . (31.16)
However, the term −N T ln N is not extensive. This is the Gibbs paradox; it arises because
our assumption of distinguishable particles is wrong. It can be amended if we demand
that
\[ Z = \frac{Z_1^N}{N!}. \tag{31.17} \]
31.2 Ideal Boltzmann Gas –421/453–
Then the non-extensive term is eliminated. In the grand canonical ensemble, we have
\[ Z_\Omega = \sum_{N=0}^{\infty} \frac{z^N (Z_1)^N}{N!} = \exp[Z_1 z]. \tag{31.18} \]
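To see why the factor $1/N!$ in equation 31.17 restores extensivity, one can apply Stirling's approximation $\ln N! \approx N \ln N - N$. The last line of the sketch assumes the ideal-gas result $Z_1 = V/\lambda^3$, which is consistent with the thermodynamic functions of equation 31.30.

```latex
\begin{align*}
F &= -T \ln Z = -T \ln \frac{Z_1^N}{N!} \\
  &\approx -NT \ln Z_1 + T \left( N \ln N - N \right) \\
  &= -NT \left[ \ln \frac{Z_1}{N} + 1 \right] \\
  &= -NT \left[ \ln \frac{V}{N \lambda^3} + 1 \right]
  \qquad \text{(assuming } Z_1 = V/\lambda^3 \text{)}.
\end{align*}
```

The free energy now depends on V and N only through N times a function of V/N, so doubling both V and N doubles F.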
\[ n_i = \frac{1}{e^{\beta\epsilon_i} z^{-1}}. \tag{31.22} \]
As we can see, in the limit of
eβϵi ≥ eβϵ0 ≫ z, (31.23)
Bose-Einstein, Fermi-Dirac and Boltzmann statistics are identical.
• The temperature is high enough so that interaction between molecules can be neglected,
i.e., e−Vint /T ≪ 1.
Suppose the side length of the box is L. The momentum of the particle would be
\[ \frac{2\pi}{L}(n_x, n_y, n_z), \tag{31.24} \]
where nx ,ny and nz are integers. Thus we have
\[ Z_1 = \sum_{n_x, n_y, n_z} \exp\left[ -\frac{\beta}{2m} \left( \frac{2\pi}{L} \right)^2 (n_x^2 + n_y^2 + n_z^2) \right]. \tag{31.25} \]
If the spacing between adjacent energy levels is much smaller than $\beta^{-1}$, the summation can be
approximated by an integral. In SI units, this condition can be written explicitly as
\[ T \gg \frac{h^2}{2 m k_B L^2} \sim 10^{-17}\,\mathrm{K} \left( \frac{m}{m_p} \right)^{-1} \left( \frac{L}{1\,\mathrm{m}} \right)^{-2} \tag{31.26} \]
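How quickly the sum of equation 31.25 approaches its integral approximation can be seen numerically. The sketch below works in units with ħ = k_B = 1 and compares the explicit sum with the continuum value V/λ³, where λ = sqrt(2πβ/m) is assumed as the thermal wavelength; the box sizes and the cutoff `n_max` are hypothetical choices.

```python
import math

def z1_sum(beta, m, L, n_max=400):
    """Single-particle partition function as an explicit sum over momenta (hbar = 1)."""
    a = beta / (2.0 * m) * (2.0 * math.pi / L) ** 2
    one_dim = sum(math.exp(-a * n * n) for n in range(-n_max, n_max + 1))
    return one_dim ** 3   # the three Cartesian directions factorize

def z1_integral(beta, m, L):
    """Continuum approximation Z_1 = V / lambda^3 with lambda = sqrt(2 pi beta / m)."""
    lam = math.sqrt(2.0 * math.pi * beta / m)
    return L ** 3 / lam ** 3

beta, m = 1.0, 1.0   # hypothetical values in units with hbar = k_B = 1
for L in (1.0, 5.0, 50.0):
    print(L, z1_sum(beta, m, L) / z1_integral(beta, m, L))
```

The ratio is far from 1 when the box is comparable to the thermal wavelength and tends to 1 once the level spacing is small compared with the temperature.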
Note: When calculating the integral above, the following formula may be useful:
\[ \int_0^\infty x^n e^{-a x^2}\, dx = \frac{\Gamma\left( \frac{n+1}{2} \right)}{2 a^{\frac{n+1}{2}}}. \tag{31.28} \]
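\[ S = -\left.\frac{\partial F}{\partial T}\right|_{V,N} = N \ln\frac{V}{N\lambda^3} + \frac{5}{2}N, \qquad U = F + TS = \frac{3}{2}NT, \]
\[ P = -\left.\frac{\partial F}{\partial V}\right|_{T,N} = \frac{NT}{V}, \qquad \mu = \left.\frac{\partial F}{\partial N}\right|_{T,V} = -T \ln\frac{V}{N\lambda^3}, \qquad C_V = \left.\frac{\partial U}{\partial T}\right|_{V,N} = \frac{3}{2}N. \tag{31.30} \]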
Note that we must have $z = e^{\beta\mu} = \lambda^3/v \ll 1$ to ensure the validity of Boltzmann statistics.
In SI units, the condition can be written explicitly as
\[ \frac{T^{3/2}}{\rho} \gg \frac{h^3}{(2\pi k_B)^{3/2} m^{5/2}} \sim O(1)\, \frac{\mathrm{K}^{3/2}}{\mathrm{kg/m^3}}. \tag{31.31} \]
Usually, before the condition is violated, the interaction between molecules becomes impor-
tant and the gas may transform to liquid already.
Here, ϵi is the energy associated with a state of internal motion, while gi is the multiplicity of
that state. The contributions made by the internal motions of the molecules, over and above the
31.3 Ideal Bose Systems –423/453–
translational degrees of freedom, follow straightforwardly from the function j(T ). Explicitly,
we have
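\[ F_{int} = -NT \ln j, \qquad S_{int} = N \left( \ln j + T \frac{\partial \ln j}{\partial T} \right), \qquad U_{int} = N T^2 \frac{\partial \ln j}{\partial T}, \]
\[ \mu_{int} = -T \ln j, \qquad (C_V)_{int} = N \frac{\partial}{\partial T} \left( T^2 \frac{\partial \ln j}{\partial T} \right). \tag{31.33} \]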
Now the central problem is to derive an explicit expression for the function j(T ) from a
knowledge of the internal states of the molecules. For this, we note that the internal state of a
molecule is determined by
• the electronic state
• the state of the nuclei
• the vibrational state
• the rotational state.
Rigorously speaking, these four modes of excitation mutually interact; in many cases, however,
they can be treated independently of one another. We can then write
with the result that the net contribution made by the internal motions to the various thermo-
dynamic properties of the system is given by a simple sum of the four respective contributions.
A detailed discussion on gaseous systems composed of molecules with internal motion can be
found in section 6.5 from Statistical Mechanics (R.K.Pathria & Paul D.Beale).
For large V , the spectrum of the single-particle states is almost a continuous one, so the sum-
mations may be replaced by integrations. However, by replacing summation by integration,
we are inadvertently giving a weight zero to the energy level ϵ = 0. This is wrong because in a
quantum mechanical treatment we must give a statistical weight unity to each non-degenerate
single-particle state in the system. It is, therefore, advisable to take this particular state out
of the sum in question before carrying out the integration; a rigorous justification of this un-
usual step can be found in Appendix F of Statistical Mechanics (R.K.Pathria & Paul D.Beale).
We thus obtain
\[ \frac{P}{T} = -\frac{2\pi}{(2\pi)^3} (2m)^{3/2} \int_0^\infty \epsilon^{1/2} \ln\left( 1 - z e^{-\beta\epsilon} \right) d\epsilon - \frac{1}{V} \ln(1 - z) \tag{31.36} \]
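and
\[ \frac{N}{V} = \frac{2\pi}{(2\pi)^3} (2m)^{3/2} \int_0^\infty \frac{\epsilon^{1/2}\, d\epsilon}{z^{-1} e^{\beta\epsilon} - 1} + \frac{1}{V} \frac{z}{1 - z}. \tag{31.37} \]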
For z ≪ 1, which corresponds to situations not far from the classical limit, the last terms of
equations 31.36 and 31.37 are of order 1/N and, therefore, negligible.
However, as z increases and assumes values close to unity, the term V −1 z/(1 − z), which
is identically equal to N0 /V (N0 being the number of particles in the ground state), can well
become a significant fraction of the quantity N/V ; this accumulation of a macroscopic fraction
of the particles into a single state leads to the phenomenon of Bose-Einstein condensation.
Nevertheless, since z = N0 /(N0 + 1), the term −V −1 ln(1 − z) is equal to V −1 ln(N0 + 1),
which is at most O(N −1 ln(N + 1)); this term is, therefore, negligible for all values of z and
hence may be dropped altogether. Thus, we have
\[ \frac{P}{T} = \frac{1}{\lambda^3} g_{5/2}(z), \qquad \frac{N - N_0}{V} = \frac{1}{\lambda^3} g_{3/2}(z), \tag{31.38} \]
where gν (z) are Bose-Einstein functions defined by
\[ g_\nu(z) \equiv \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{x^{\nu - 1}\, dx}{z^{-1} e^x - 1} = z + \frac{z^2}{2^\nu} + \cdots. \tag{31.39} \]
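The series in equation 31.39 is straightforward to evaluate for z < 1. The following minimal sketch (the function name `bose_g` and the tolerance are hypothetical choices) sums the terms z^k / k^ν until they become negligible; the series converges slowly as z approaches 1, so this simple version is meant for z well below unity.

```python
import math

def bose_g(nu, z, tol=1e-15, max_terms=100000):
    """Bose-Einstein function g_nu(z) = sum_{k>=1} z**k / k**nu, for 0 <= z < 1."""
    total, zk = 0.0, 1.0
    for k in range(1, max_terms + 1):
        zk *= z
        term = zk / k ** nu
        total += term
        if term < tol:
            break
    return total

print(bose_g(1.0, 0.5), math.log(2.0))   # g_1(z) = -ln(1 - z) serves as a closed-form check
print(bose_g(1.5, 0.1), bose_g(2.5, 0.1))
```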
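\[ U = -\left.\frac{\partial \ln Z_\Omega}{\partial \beta}\right|_{z,V} = \frac{3T}{2} \frac{V}{\lambda^3} g_{5/2}(z) = \frac{3}{2} PV = \frac{3 g_{5/2}(z)}{2 g_{3/2}(z)} (N - N_0) T \tag{31.40} \]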
\[ \frac{N - N_0}{V} = \frac{1}{\lambda^3} \zeta(3/2), \quad \text{where } \zeta(\nu) \equiv g_\nu(1). \tag{31.41} \]
This curious phenomenon of a macroscopically large number of particles accumulating in a
single quantum state is generally referred to as the phenomenon of Bose-Einstein condensa-
tion. The condition for the onset of Bose-Einstein condensation is
\[ T < T_c \equiv \frac{2\pi}{m} \left( \frac{N}{V \zeta(3/2)} \right)^{2/3}. \tag{31.42} \]
Here, Tc denotes a characteristic temperature that depends on the particle mass m and the
particle density N/V in the system. Accordingly, for T < Tc , the system may be looked on as
a mixture of two “phases”:
• a normal phase, consisting of Ne = N (T /Tc )3/2 particles distributed over the excited
states.
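To get a feeling for the magnitudes involved, the sketch below evaluates equation 31.42 in SI units, which requires restoring ħ and k_B: T_c = (2πħ²/(m k_B)) (n/ζ(3/2))^(2/3). The input (helium-4 atoms at roughly the density of the liquid, about 145 kg/m³) is a hypothetical illustration, and the condensate fraction follows from N_e = N (T/T_c)^(3/2).

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
ZETA_3_2 = 2.612375      # zeta(3/2), rounded

def bec_temperature(n, m):
    """Critical temperature in kelvin for number density n (m^-3) and mass m (kg)."""
    return 2.0 * math.pi * HBAR**2 / (m * K_B) * (n / ZETA_3_2) ** (2.0 / 3.0)

def condensate_fraction(T, Tc):
    """N0 / N = 1 - (T / Tc)^(3/2) below Tc, and zero above."""
    return 1.0 - (T / Tc) ** 1.5 if T < Tc else 0.0

# Hypothetical example: helium-4 atoms at roughly the density of the liquid.
m_he4 = 6.6465e-27        # kg
n = 145.0 / m_he4         # m^-3, assuming a mass density of about 145 kg/m^3
Tc = bec_temperature(n, m_he4)
print(Tc, condensate_fraction(0.5 * Tc, Tc))
```

With these inputs the ideal-gas estimate comes out at a few kelvin, the same order of magnitude as the λ-transition temperature of liquid helium, even though the real liquid is strongly interacting.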
Pressure
Next, we examine the variation of P with T , keeping v fixed. When T < Tc , the pressure is
\[ P = \frac{T}{\lambda^3} \zeta(5/2), \tag{31.43} \]
which is proportional to T 5/2 but independent of v, implying infinite compressibility. At the
transition point the value of the pressure is
\[ P(T_c) = \frac{\zeta(5/2)}{\zeta(3/2)} \frac{N T_c}{V} \approx 0.5134\, \frac{N T_c}{V}. \tag{31.44} \]
Thus, the pressure exerted by the particles of an ideal Bose gas at the transition temperature
is about one-half of that exerted by the particles of an equivalent Boltzmannian gas. When
T > Tc , the pressure is
\[ P = \frac{g_{5/2}(z)}{g_{3/2}(z)} \frac{N T}{V}. \tag{31.45} \]
As T → ∞, the pressure approaches the classical value.
Specific heat
When T < Tc , the specific heat is
\[ \frac{C_V}{N} = \frac{3V}{2N} \zeta(5/2) \frac{d}{dT} \left( \frac{T}{\lambda^3} \right) = \frac{15}{4} \zeta(5/2) \frac{v}{\lambda^3}, \tag{31.46} \]
Entropy
Finally, we examine the adiabats of the ideal Bose gas. For this, we need an expression for the
entropy of the system. Making use of U − T S + P V = N µ, we get
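\[ s = \begin{cases} \dfrac{5}{2} \dfrac{g_{5/2}(z)}{g_{3/2}(z)} - \ln z, & T > T_c \\[2ex] \dfrac{5}{2} \dfrac{v}{\lambda^3} \zeta(5/2), & T < T_c \end{cases}. \tag{31.52} \]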
A reversible adiabatic process implies the constancy of s. For T > Tc , this implies the con-
stancy of z as well and in turn the constancy of v/λ3 . For T ≤ Tc , it again implies the same.
We thus obtain, quite generally, the following relationship between the volume and the tem-
perature of the system when it undergoes a reversible adiabatic process:
Using equations 31.38, the corresponding relationship between the pressure and the temper-
ature is
\[ P T^{-5/2} = \text{const}. \tag{31.54} \]
Eliminating T , we obtain
\[ P v^{5/3} = \text{const}. \tag{31.55} \]
\[ S = N_e\, \frac{5}{2} \frac{\zeta(5/2)}{\zeta(3/2)}. \tag{31.56} \]
As expected, the N0 particles that constitute the condensate do not contribute to the entropy
of the system, while the Ne particles that constitute the normal part contribute an amount of
5ζ(5/2)/2ζ(3/2) per particle.
\[ \frac{PV}{T} = \ln Z_\Omega = \sum_i \ln\left( 1 + z e^{-\beta\epsilon_i} \right), \qquad N = \sum_i \frac{1}{e^{\beta\epsilon_i} z^{-1} + 1}. \tag{31.57} \]
Unlike the Bose case, the parameter z in the Fermi case can take on unrestricted values. More-
over, in view of the Pauli exclusion principle, the question of a large number of particles oc-
cupying a single energy state does not even arise in this case. We can replace summations by
corresponding integrations. We thus obtain
\[ \frac{P}{T} = \frac{g}{\lambda^3} f_{5/2}(z), \qquad \frac{N}{V} = \frac{g}{\lambda^3} f_{3/2}(z), \tag{31.58} \]
31.4 Ideal Fermi systems –427/453–
where g is a weight factor arising from the internal structure of the particles and fν (z) are
Fermi-Dirac functions defined by
\[ f_\nu(z) \equiv \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{x^{\nu - 1}\, dx}{z^{-1} e^x + 1} = z - \frac{z^2}{2^\nu} + \frac{z^3}{3^\nu} - \cdots \tag{31.59} \]
The internal energy of the Fermi gas is given by
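\[ U = -\left.\frac{\partial \ln Z_\Omega}{\partial \beta}\right|_{z,V} = \frac{3T}{2} \frac{gV}{\lambda^3} f_{5/2}(z) = \frac{3}{2} PV = \frac{3 f_{5/2}(z)}{2 f_{3/2}(z)} N T. \tag{31.60} \]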
The free energy and entropy of the gas are
\[ F = N\mu - PV = NT \left[ \ln z - \frac{f_{5/2}(z)}{f_{3/2}(z)} \right], \qquad S = \frac{U - F}{T} = N \left[ \frac{5}{2} \frac{f_{5/2}(z)}{f_{3/2}(z)} - \ln z \right]. \tag{31.61} \]
Using the recurrence relation of Fermi-Dirac function
\[ z \frac{\partial f_\nu(z)}{\partial z} = f_{\nu - 1}(z), \tag{31.62} \]
we also obtain the specific heat of the gas:
\[ \frac{C_V}{N} = \frac{15}{4} \frac{f_{5/2}(z)}{f_{3/2}(z)} - \frac{9}{4} \frac{f_{3/2}(z)}{f_{1/2}(z)}. \tag{31.63} \]
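The intermediate steps behind equation 31.63 can be sketched as follows. The derivation uses equation 31.60 for U, the recurrence 31.62, and the fact that at fixed N and V equation 31.58 gives $f_{3/2}(z) \propto \lambda^3 \propto T^{-3/2}$; the shorthand $R \equiv f_{5/2}/f_{3/2}$ is introduced here only for brevity.

```latex
\begin{align*}
\frac{d \ln f_{3/2}}{dT} = -\frac{3}{2T}
  \quad &\Longrightarrow \quad
  \frac{T}{z} \left. \frac{\partial z}{\partial T} \right|_{N,V}
  = -\frac{3}{2} \frac{f_{3/2}(z)}{f_{1/2}(z)}, \\
z \frac{dR}{dz} &= 1 - \frac{f_{5/2}(z)\, f_{1/2}(z)}{f_{3/2}(z)^2},
  \qquad R \equiv \frac{f_{5/2}(z)}{f_{3/2}(z)}, \\
T \left. \frac{\partial R}{\partial T} \right|_{N,V}
  &= -\frac{3}{2} \left[ \frac{f_{3/2}(z)}{f_{1/2}(z)} - R \right], \\
\frac{C_V}{N} = \frac{\partial}{\partial T} \left( \frac{3}{2} T R \right)
  &= \frac{3}{2} R - \frac{9}{4} \left[ \frac{f_{3/2}(z)}{f_{1/2}(z)} - R \right]
  = \frac{15}{4} \frac{f_{5/2}(z)}{f_{3/2}(z)} - \frac{9}{4} \frac{f_{3/2}(z)}{f_{1/2}(z)}.
\end{align*}
```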
In order to determine the various properties of the Fermi gas in terms of the particle density
n and the temperature T , we need to know the functional dependence of the parameter z on
n and T ; this information is formally contained in the implicit relationship gf3/2 (z) = nλ3 .
If the density of the gas is very low and/or its temperature very high, the Fermi gas will be
equivalent to classical ideal gas; we then speak of the gas as being non-degenerate.
If the parameter z is small in comparison with unity but not very small, we should obtain an
expansion for z in powers of nλ3 /g.
If the density and the temperature are such that the parameter (nλ3 /g) is of order unity, the
foregoing expansions cannot be of much use. In that case, one may have to resort to
numerical calculation.
If (nλ3 /g) ≫ 1, the functions involved can be expressed as asymptotic expansions in powers
of (ln z)−1 ; we then speak of the gas as being degenerate.
As (nλ3 /g) → ∞, our functions assume a closed form and the expressions for the various
thermodynamic quantities become highly simplified; we then speak of the gas as being com-
pletely degenerate.
Degenerate case
For an analytical study of the Fermi gas at finite, but low, temperatures, we observe that the
value of z is now finite, though still large in comparison with unity. The functions fν (z) can be
expressed as asymptotic expansions in powers of (ln z)−1 . For the values of ν we are presently
interested in, we have the approximation
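\[ f_{5/2}(z) = \frac{8}{15\pi^{1/2}} (\ln z)^{5/2} \left[ 1 + \frac{5\pi^2}{8} (\ln z)^{-2} + \cdots \right], \tag{31.70a} \]
\[ f_{3/2}(z) = \frac{4}{3\pi^{1/2}} (\ln z)^{3/2} \left[ 1 + \frac{\pi^2}{8} (\ln z)^{-2} + \cdots \right], \tag{31.70b} \]
\[ f_{1/2}(z) = \frac{2}{\pi^{1/2}} (\ln z)^{1/2} \left[ 1 - \frac{\pi^2}{24} (\ln z)^{-2} + \cdots \right]. \tag{31.70c} \]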
To the lowest order of $T/\epsilon_F$, the chemical potential of the degenerate Fermi gas is
\[ \mu = T \ln z \approx \epsilon_F \left[ 1 - \frac{\pi^2}{12} \left( \frac{T}{\epsilon_F} \right)^2 \right]. \tag{31.72} \]
31.5 Thermodynamics of the blackbody radiation –429/453–
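From equations 31.60, 31.70 and 31.72, the internal energy and pressure of the degenerate
Fermi gas are given by
\[ \frac{U}{N} = \frac{3 f_{5/2}(z)}{2 f_{3/2}(z)} T \approx \frac{3}{5} (T \ln z) \left[ 1 + \frac{\pi^2}{2} (\ln z)^{-2} \right] \approx \frac{3}{5} \epsilon_F \left[ 1 + \frac{5\pi^2}{12} \left( \frac{T}{\epsilon_F} \right)^2 \right] \tag{31.73} \]
and
\[ P = \frac{2U}{3V} \approx \frac{2}{5} n \epsilon_F \left[ 1 + \frac{5\pi^2}{12} \left( \frac{T}{\epsilon_F} \right)^2 \right]. \tag{31.74} \]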
Thus, the low temperature specific heat of the gas is
\[ \frac{C_V}{N} = \frac{1}{N} \left.\frac{\partial U}{\partial T}\right|_{V,N} \approx \frac{\pi^2}{2} \frac{T}{\epsilon_F}. \tag{31.75} \]
which gives
\[ \frac{S}{N} = \frac{U - F}{NT} \approx \frac{\pi^2}{2} \frac{T}{\epsilon_F}. \tag{31.77} \]
The summation in equation 31.79 can be approximated by an integral. Since the spin of a
photon can take two distinct values, we have
\[ U = \frac{V T^4}{\pi^2} \int_0^\infty \frac{x^3\, dx}{e^x - 1} = \frac{V T^4}{\pi^2} \Gamma(4) \zeta(4) = \frac{\pi^2 V T^4}{15}. \tag{31.80} \]
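The value of the integral in equation 31.80 can be confirmed numerically. The sketch below uses a composite Simpson rule on a truncated interval; the cutoff at x = 60 and the number of sub-intervals are hypothetical choices that are more than sufficient because the integrand decays exponentially.

```python
import math

def planck_integrand(x):
    """x^3 / (e^x - 1), with the x -> 0 limit handled explicitly."""
    return 0.0 if x == 0.0 else x**3 / math.expm1(x)

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

integral = simpson(planck_integrand, 0.0, 60.0, 6000)
print(integral, math.pi**4 / 15.0)               # Gamma(4) * zeta(4) = pi^4 / 15
print(integral / math.pi**2, math.pi**2 / 15.0)  # coefficient of V T^4 in U
```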
If there is a small opening in the walls of the cavity, the photons will “effuse” through it. The
net rate of flow of the radiation, per unit area of the opening, is
\[ I = \frac{1}{2} \int_0^{\pi/2} \frac{U}{V} \cos\theta \sin\theta\, d\theta = \frac{U}{4V} = \frac{\pi^2}{60} T^4. \tag{31.81} \]
It follows that
\[ P = \frac{U}{3V} = \frac{\pi^2 T^4}{45}. \tag{31.83} \]
Since the chemical potential of photon gas is zero, the Helmholtz free energy is equal to Ω;
therefore the entropy is given by
\[ S = \frac{U - F}{T} = \frac{4U}{3T} \propto V T^3. \tag{31.84} \]
The specific heat of the photon gas is
\[ C_V = T \left.\frac{\partial S}{\partial T}\right|_{V} = 3S. \tag{31.85} \]
Photons in the ground state are undetectable and can be neglected. Thus, the equilibrium number
density of photons in the radiation cavity is
\[ \frac{N}{V} = \frac{T^3}{\pi^2} \int_0^\infty \frac{x^2\, dx}{e^x - 1} = \frac{2\zeta(3) T^3}{\pi^2} \propto T^3. \tag{31.86} \]
Instructive though it may be, the formula above cannot be taken at face value because, in the
present problem, the magnitude of the fluctuations in the variable N , which is determined by
the quantity $(\partial P/\partial V)_T^{-1}$, is infinitely large.
Chapter 32
Quantum Field Theory in Statistical Physics
32.1 Superfluidity
[Figure 32.1: two panels. (a) Pressure p [bar] versus temperature T [K], showing the melting curve, the λ-line, the He-II and He-I regions, and the vapor pressure curve; (b) specific heat C [J g^-1 K^-1] versus temperature T [K], with Tλ marked.]
Figure 32.1: (a) The phase structure of 4He at low temperature; (b) The specific heat of helium
as a function of temperature.
Helium I is a normal fluid and has a normal gas-liquid critical point. Helium II is a mixture of a
normal fluid and a superfluid. The superfluid is characterized by the vanishing of its viscosity.
Helium I and helium II are separated by a line known as the λ-transition line. At Tλ = 2.18 K,
Pλ = 2.29 Pa, helium I, helium II, and helium gas coexist. The specific heat of liquid helium
along the vapour transition line forms a logarithmic discontinuity, as shown in Figure 32.1
(b). The form of this diagram resembles the Greek letter λ and is the reason for calling the
transition a λ transition.
The excitation spectrum of helium II can be measured experimentally through inelastic neutron
scattering. It is found to consist of two parts, the phonon region
1
E(p) = ∆ + |p − p0 |2 , when |p| ∼ |p0 |, (32.2)
2µ
where c = 226m/s is the velocity of sound, ∆/kB = 9K is the roton parameter, and µ =
0.25mHe is the effective mass. There is another velocity parameter known as the critical veloc-
ity v0 . It is only when helium II moves with velocity greater than v0 that viscous effects arise.
At low temperature the roton excitations are damped by the Boltzmann factor exp(−β∆).
where
\[
a_p = \frac{1}{\sqrt{V}}\int dx\, \psi(x)\,e^{-ip\cdot x}, \qquad \tilde V(q) = \int dx\, V(x)\,e^{-iq\cdot x}. \tag{32.4}
\]
Here we adopt box normalization to make momentum of the particle discrete. It follows that
\[
[a_p, a_q^\dagger] = \delta_{p,q}, \qquad [a_p, a_q] = [a_p^\dagger, a_q^\dagger] = 0. \tag{32.5}
\]
At low temperature, states with low energy become dominant. These are expected to be
states with low values of momentum. Let us consider the system close to T ≈ 0. We can then
assume that the state of lowest energy corresponds to atoms of low momentum with a sizable
fraction of the atoms in the zero-momentum state, leading to Bose-Einstein condensation. Thus
if the system has on average N atoms, then a significant number N0 of the atoms are in the
lowest energy state.
Let us suppose that |C; N, N0 ⟩ is a superfluid state with a total of N helium atoms, N0 of which
are in the zero momentum plane wave state. If a†0 and a0 are creation and destruction operators
of a state of zero momentum, we have
For large N0 we can approximate N0 + 1 by N0, so that on the state |C; N, N0⟩ we can replace
both the operators a†0 a0 and a0 a†0 by a single c-number, N0.
We next examine the interaction part of H when restricted to |C; N, N0⟩. When all four operators in HI have zero momentum, we have the term
\[
H_I^0 = \frac{1}{2V}\tilde V(0)\,a_0^\dagger a_0^\dagger a_0 a_0 = \frac{1}{2V}\tilde V(0)N_0^2 \approx \frac{1}{2V}\tilde V(0)\Big[N^2 - 2N_0\sum_{k\neq 0} a_k^\dagger a_k\Big]. \tag{32.10}
\]
The next term is of order N0 and is the part of HI containing two operators carrying zero
momentum. There are six ways in which this can happen. These are displayed with the mo-
mentum variables which are set to zero as shown
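\[
\begin{aligned}
k_1 + q = k_2 - q = 0 &: \ \frac{N_0}{2V}\sum_{q\neq 0}\tilde V(q)\,a_{-q}a_q, &
k_1 + q = k_1 = 0 &: \ \frac{N_0}{2V}\sum_{k_2\neq 0}\tilde V(0)\,a_{k_2}^\dagger a_{k_2},\\
k_1 + q = k_2 = 0 &: \ \frac{N_0}{2V}\sum_{q\neq 0}\tilde V(q)\,a_{-q}^\dagger a_{-q}, &
k_2 - q = k_1 = 0 &: \ \frac{N_0}{2V}\sum_{q\neq 0}\tilde V(q)\,a_q^\dagger a_q,\\
k_2 - q = k_2 = 0 &: \ \frac{N_0}{2V}\sum_{k_1\neq 0}\tilde V(0)\,a_{k_1}^\dagger a_{k_1}, &
k_1 = k_2 = 0 &: \ \frac{N_0}{2V}\sum_{q\neq 0}\tilde V(q)\,a_q^\dagger a_{-q}^\dagger.
\end{aligned}
\tag{32.11}
\]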
Since at low temperature we expect only small momenta excitations to be important, we replace
$\tilde V(k)$ by $\tilde V(0)$ in $H_I$. Therefore, on the state |C; N, N0⟩, the interacting Hamiltonian, keeping
terms of O(N0), is given by
\[
H_I^B \equiv \frac{\tilde V(0)}{2V}\Big[N^2 + N_0\sum_{k\neq 0}\big(2a_k^\dagger a_k + a_k^\dagger a_{-k}^\dagger + a_k a_{-k}\big)\Big]. \tag{32.12}
\]
The function E(k) will then determine the different excitations of the system while bk and b†k
will be destruction and creation operators for these excitations or “quasi-particles”, provided
they satisfy the commutation rules
\[
[b_p, b_q^\dagger] = \delta_{p,q}. \tag{32.15}
\]
Writing
\[
b_k = \alpha(k)a_k - \beta(k)a_{-k}^\dagger, \qquad b_k^\dagger = \alpha(k)a_k^\dagger - \beta(k)a_{-k}, \tag{32.16}
\]
we then have the constraint
\[
\alpha(k)^2 - \beta(k)^2 = 1. \tag{32.17}
\]
Substituting equation 32.16 into 32.13 and comparing the result with 32.14, we find
\[
E(k) = \sqrt{\frac{k^2}{2m}\left(\frac{k^2}{2m} + \frac{2N_0\tilde V(0)}{V}\right)}. \tag{32.18}
\]
The energy E(k) is called the quasi-particle energy and operators bk and b†k are quasi-particle
destruction and creation operators. For small values of k we have
\[
E(k) \approx \frac{|k|}{m}\sqrt{\frac{N_0}{V}\tilde V(0)\,m}. \tag{32.19}
\]
Observe that for E(k) to be real we must have $\tilde V(0) > 0$. This implies there is a repulsive
region for V(x) which must dominate the integral $\int dx\, V(x)$. Observe also that |k|/m = v is
a velocity and N0 m/V = ρ is the density of the superfluid helium, so that the quasi-particle
energy can be written as
\[
E(k = mv) \approx |v|\sqrt{\rho\,\tilde V(0)}. \tag{32.20}
\]
We now show that a system with such an energy spectrum represents a superfluid, i.e., a system
with no friction. Friction in a system represents dissipation of energy. Consider a molecule
of mass MA moving in a medium. If this molecule can change its energy through collisions
with the excitations of the medium, then the system has friction. We will find that a molecule
of mass MA and velocity VA moving through a system consisting of quasi-particles of energy
E(k) cannot change its energy by scattering off quasi-particles if |VA | < |v0 | where |v0 | is a
critical velocity determined by Ve (0) and ρ.
To see this, let us consider the collision of a molecule of mass MA and velocity VA with a quasi-
particle at rest. If the final momentum of the molecule is QA and that of the quasi-particle is
k, we have, from momentum conservation,
|QA |2 = |PA |2 + |k|2 − 2|PA ||k| cos θ, (32.21)
where PA = MA VA and θ is the angle between PA and k. It follows that
\[
\frac{|P_A|^2 - |Q_A|^2}{2|P_A||k|} \le \frac{|P_A|^2 - |Q_A|^2 + |k|^2}{2|P_A||k|} = \cos\theta \le 1. \tag{32.22}
\]
Combining this with the energy conservation condition
\[
\frac{|P_A|^2}{2M_A} = \frac{|Q_A|^2}{2M_A} + E(k), \tag{32.23}
\]
we end up with
\[
\frac{M_A E(k)}{|P_A||k|} = \frac{\sqrt{\rho\,\tilde V(0)}}{m V_A} \le 1. \tag{32.24}
\]
Thus the process of changing energy for the molecule is not allowed if $V_A < v_0 = \sqrt{\rho\,\tilde V(0)}/m$,
and the system of quasi-particles behaves like a superfluid.
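The argument above can be illustrated numerically: for the spectrum 32.18 the ratio $E(k)/|k|$ is bounded below by the small-$k$ slope of 32.19, and that lower bound plays the role of the critical velocity $v_0$. The following is a minimal sketch in units with $\hbar = 1$; the parameter names (`n0` for the condensate number density $N_0/V$, `v0_tilde` for $\tilde V(0)$) and the grid search are assumptions made for illustration.

```python
import math

def bogoliubov_energy(k, m, n0, v0_tilde):
    """Quasi-particle energy of eq. 32.18 for momentum magnitude k."""
    free = k * k / (2.0 * m)
    return math.sqrt(free * (free + 2.0 * n0 * v0_tilde))

def sound_velocity(m, n0, v0_tilde):
    """Small-k slope of E(k), eq. 32.19: sqrt(rho * V(0)) / m with rho = n0 * m."""
    return math.sqrt(n0 * m * v0_tilde) / m

def landau_critical_velocity(m, n0, v0_tilde, k_max=10.0, samples=10000):
    """Minimum of E(k)/k on a uniform grid of k values in (0, k_max]."""
    grid = (k_max * (i + 1) / samples for i in range(samples))
    return min(bogoliubov_energy(k, m, n0, v0_tilde) / k for k in grid)
```

For any positive `m`, `n0`, and `v0_tilde`, `landau_critical_velocity` approaches `sound_velocity` from above as the grid is refined, while at large `k` the energy tends to the free-particle value shifted by `n0 * v0_tilde`.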
32.2 Finite temperature perturbation theory
Define
Notice that T is now the ordering with respect to τ (or imaginary time).
Upon substitution of 32.28 into the grand canonical partition sum, we have
\[
Z_\Omega = \operatorname{Tr}\big[e^{-\beta K_0}\, W(\beta)\big]. \tag{32.29}
\]
So we obtain a perturbative expansion of the partition sum in analogy with that of the evolution
operator. Before we can apply this formalism to the computation of ZΩ , we need to analyze
the finite temperature versions of the time-ordered Green functions and Wick’s theorem. The
detailed discussion can be found in section 9.8 of Elements of Statistical Mechanics (Ivo Sachs,
Siddhartha Sen & James Sexton).
(32.31)
In the limit n → ∞ the error term will go to zero, and we have achieved a splitting of the
original exponential operator
\[
\operatorname{Tr} e^{-\beta H} = \lim_{n\to\infty} \operatorname{Tr}\big(e^{-\epsilon V} e^{-\epsilon T}\big)^n. \tag{32.32}
\]
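The splitting 32.32 can be tested on the smallest non-trivial example, a two-level system. The sketch below assumes a hypothetical Hamiltonian $H = V + T$ with $V = -h_z\sigma_z$ and $T = -h_x\sigma_x$ (Pauli matrices) and $\epsilon = \beta/n$; these choices are illustrative and are not taken from the notes. For this model the exact trace is $2\cosh\big(\beta\sqrt{h_x^2 + h_z^2}\big)$.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trotter_trace(beta, hx, hz, n):
    """Tr (e^{-eps V} e^{-eps T})^n with eps = beta / n, V = -hz*sigma_z, T = -hx*sigma_x."""
    eps = beta / n
    exp_v = [[math.exp(eps * hz), 0.0], [0.0, math.exp(-eps * hz)]]
    c, s = math.cosh(eps * hx), math.sinh(eps * hx)
    exp_t = [[c, s], [s, c]]  # exp(eps * hx * sigma_x)
    step = matmul(exp_v, exp_t)
    product = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        product = matmul(product, step)
    return product[0][0] + product[1][1]

def exact_trace(beta, hx, hz):
    """Tr e^{-beta H}: the eigenvalues of H are +/- sqrt(hx^2 + hz^2)."""
    return 2.0 * math.cosh(beta * math.hypot(hx, hz))
```

Comparing `trotter_trace(beta, hx, hz, n)` with `exact_trace(beta, hx, hz)` for increasing `n` shows the error term shrinking as the number of factors grows.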
At this point we are still working with operators, but we can now insert a complete set of
states between each term in the product, and convert the problem to one with just commuting
numbers. For simplicity, we assume the system has only one pair of canonical variables.
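\[
\begin{aligned}
\operatorname{Tr} e^{-\beta H}
&= \lim_{n\to\infty}\int (dp\,dq)^n\, \operatorname{Tr}\Big[e^{-\epsilon V}|q_0\rangle\langle q_0|\, e^{-\epsilon T}|p_0\rangle\langle p_0|\, e^{-\epsilon V}\cdots e^{-\epsilon V}|q_{n-1}\rangle\langle q_{n-1}|\, e^{-\epsilon T}|p_{n-1}\rangle\langle p_{n-1}|\Big]\\
&= \lim_{n\to\infty}\int \Big(\frac{dp\,dq}{2\pi}\Big)^n \exp\Big\{\sum_{i=0}^{n-1}\big[i p_i(q_{i+1} - q_i) - \epsilon H(q_i, p_i)\big]\Big\}, \quad \text{where } q_n = q_0.
\end{aligned}
\tag{32.33}
\]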
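If $T(p) = p^2/2m$, each $p_i$ integral is a Gaussian integral, leading to
\[
\operatorname{Tr} e^{-\beta H} = \lim_{n\to\infty}\Big(\frac{m}{2\pi\epsilon}\Big)^{n/2}\int d^n q\, \exp\Big\{\epsilon\sum_{i=0}^{n-1}\Big[-\frac{m}{2\epsilon^2}(q_{i+1} - q_i)^2 - V(q_i)\Big]\Big\}. \tag{32.34}
\]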
where
\[
q(0) = q(\beta), \qquad S_E[q(\tau)] = \int_0^\beta d\tau\,\Big[\frac{m}{2}\dot q^2 + V(q)\Big]. \tag{32.36}
\]
In quantum mechanics, we have
\[
\langle b|\, e^{-iHT}\, |a\rangle = \int Dq(t)\, e^{iS[q(t)]}, \tag{32.37}
\]
where
\[
q(0) = a, \qquad q(T) = b, \qquad S[q(t)] = \int_0^T dt\,\Big[\frac{m}{2}\dot q^2 - V(q)\Big]. \tag{32.38}
\]
Actually, path integral 32.35 can be obtained directly if we replace T with −iβ and change the
integral variable from t to τ = it in the path integral 32.37.
The path integral formulation of the partition function can be generalized straightforwardly to
the case of a quantum field. For a system of bosons, we have
\[
\operatorname{Tr} e^{-\beta(H - \mu N)} = \int D\psi\, D\psi^\dagger\, e^{-S_E}, \tag{32.39}
\]
where $\psi(x, 0) = \psi(x, \beta)$ and $\psi(x, \tau)$ is complex-valued. For a system of fermions the same
expression holds with $\psi(x, 0) = -\psi(x, \beta)$ and $\psi(x, \tau)$ taking Grassmann values. A formal
construction of the path integral based on coherent states can be found in sections 4.1 and 4.2 of
Condensed Matter Field Theory (Alexander Altland & Ben Simons).
Chapter 33
Phase Transitions and the Renormalization Group
Determining a suitable order parameter field to characterize a phase is part of the task of a
theory of phase transitions. If the order parameter field changes continuously from one phase
to another, as in the case of a ferromagnet, the transition is said to be a continuous or second-
order phase transition. If it is discontinuous the transition is said to be first order. An example
of a first order transition is when a solid melts to a liquid. The density of the system, which
can be taken as the order parameter, changes discontinuously. A phase transition is a striking
example of an emergent phenomenon. Starting off with only short-range interactions between
its microscopic magnetic moments, the system develops long-range correlations below the critical
temperature Tc.
We start with a model for a ferromagnet. We regard a ferromagnetic solid as being made out of
a finite number of elementary magnets placed at locations throughout the solid. We simplify
our model by assuming that each of these elementary magnets m can either point up m = 1
or down m = −1. Finally each elementary magnet interacts only with its nearest neighbour.
A Hamiltonian for this model could be
\[
H = -g\sum_{n,i} m_i m_{i+n} - B\sum_i m_i, \tag{33.1}
\]
where the first sum is over i as well as the nearest neighbours of i. Notice that H decreases if
mi , mi+n have the same sign for g > 0.
If there are altogether a large but finite number of magnets in a ferromagnet, the susceptibility
cannot diverge. A divergence can appear, however, if we allow the number of elementary magnets
to tend to infinity, because an infinite sum of analytic functions need not be analytic.
In order to analyze this possibility we will need to consider the statistical mechanics partition
function in the limit in which the number of configurations is infinite.
Another approach to the problem is to suppose that the external magnetic field B is changed to
B + δB(x). We expect that a change at x, δB(x), will produce a change in the magnetization
δM not just at the point x but at other points as well. Indeed we might expect
We have assumed that the correlation function depends only on temperature and on the dis-
tance between the points x and y. Let us now suppose that δB is independent of x and let us
set y = 0. Then we have
\[
\chi(0) = \frac{\delta M(0)}{\delta B} = \int d^3x\, C_T(|x|). \tag{33.4}
\]
If we assume that
\[
C_T(|x|) = \begin{cases} \alpha, & |x| \le a(T),\\ 0, & |x| > a(T), \end{cases} \tag{33.5}
\]
that is, a disturbance only propagates a distance a(T), we get
\[
\chi(0) = \frac{4\pi\alpha}{3}\,a^3(T). \tag{33.6}
\]
Thus, χ(0) will diverge if a(T ) diverges, that is, if correlations in the system become infinite.
From this point of view, the divergence in this susceptibility is due to the fact that near a phase
transition disturbances propagate over large distances.
around the point x. The magnetization of the volume element ∆V is defined to be M (x)∆V .
For this definition of M (x) to be useful, it is important that M (x) should not be a rapidly
varying function of position. Near the Curie temperature Tc we also expect M (x) to be small
in amplitude.
On the basis of arguments of this kind, Landau proposed to introduce a functional FL [T, B, M ]
of the magnetization density M (x), temperature T , and external magnetic field B(x) of the
form
\[
F_L[T, B, M] = F_L(T, B, 0) + \int d^3x\,\Big[a(T)|M|^2 + b(T)|M|^4 + \cdots + c(T)\sum_{ij}(\nabla_j M_i)^2 + \cdots - B\cdot M\Big]. \tag{33.7}
\]
The free energy FL (T, B) is then obtained by minimizing FL [T, B, M ] with respect to M .
Notice that the temperature dependent coefficients a(T ), b(T ), c(T ) · · · are assumed to be
smooth functions of temperature. We will simplify the model function by assuming magnetic
field along the z-direction and that M (x) only has components in the z-direction. Then we
have
\[
F_L[T, B_z, M_z] = F_L(T, B_z, 0) + \int d^3x\,\Big[a(T)M_z^2 + b(T)M_z^4 + \cdots + c(T)|\nabla M_z|^2 + \cdots - B_z M_z\Big]. \tag{33.8}
\]
The expression for the Landau free energy FL is expected to be useful when T is close to the
Curie temperature Tc . In this region Mz (x) is expected to be small and we also expect |∇Mz |2
to be small. For these reasons we will from now on ignore the higher powers
of Mz and the higher gradient terms.
To determine the equilibrium configuration of the magnetization we have to minimize the free
energy with respect to Mz (x). Using
\[
\delta F_L = \int d^3x\,\big[2a(T)M_z + 4b(T)M_z^3 - 2c(T)\nabla^2 M_z - B_z\big]\,\delta M_z, \tag{33.9}
\]
we obtain
\[
2a(T)M_z(x) + 4b(T)M_z^3(x) - 2c(T)\nabla^2 M_z(x) = B_z(x). \tag{33.10}
\]
Suppose now that Bz (x) does not depend on x and let us see if a solution for Mz (x) indepen-
dent of x is possible. Such an x independent solution must satisfy
Now we ask if it is possible to construct a solution with the property that Mz ̸= 0 when Bz = 0
and T < Tc . As we have stressed this model is constructed to represent a ferromagnet near
its Curie temperature. We also assume that the coefficient functions are all smooth functions
of temperature. We thus expect the Mz4 term to be small compared to the Mz2 term. It is then
Mz = 0. (33.12)
[Figure: free energy F, with curves labelled T > Tc and T < Tc.]
It follows that
[2a0 (T − Tc ) + 12b0 Mz2 − 2c0 ∇2 ]δMz (x) = δBz (x). (33.19)
Using equation 33.3, we obtain
[2a0 (T − Tc ) + 12b0 Mz2 − 2c0 ∇2 ]CT (|x − y|) = δ(x − y). (33.20)
Setting Bz = 0, we get
The solution is
\[
C_T(|x - y|) = \frac{1}{4\pi}\,\frac{e^{-|x-y|/\xi}}{|x - y|}, \tag{33.23}
\]
where ξ 2 = c0 /a(T ) for T > Tc and ξ 2 = −c0 /2a(T ) for T < Tc .
We notice that ξ → ∞ as T → Tc . Thus Landau’s theory is in qualitative agreement with the
intuitive idea that long-range correlations are generated in a ferromagnet as T → Tc . Another
point to notice is that if δBz were x independent, then as we saw before,
\[
\chi(0) = \frac{\delta M(0)}{\delta B} = \int d^3x\, C_T(|x|) \sim \xi^2 \to \infty, \quad \text{as } T \to T_c. \tag{33.24}
\]
Let us summarize the results obtained from Landau’s approach. The approach focused on long-
range correlations and suggested that the singular behaviour of the susceptibility was due to
such correlations when T → Tc . The approach also predicts that the relation between different
macroscopic parameters involves power laws,
\[
M_z \sim (T_c - T)^\beta, \qquad M_z \sim B_z^{1/\delta}, \qquad \chi \sim \frac{1}{(T_c - T)^\gamma}, \tag{33.25}
\]
with β = 1/2, γ = 1, and δ = 3. The parameters β, δ and γ are called critical exponents and
are measured experimentally.
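The mean-field values quoted above can be reproduced numerically from the uniform solution of equation 33.10. The sketch below assumes $a(T) = a_0(T - T_c)$, a constant $b(T) = b_0$, and a spatially uniform $M_z$, so that the stationarity condition reads $2a_0(T - T_c)M_z + 4b_0 M_z^3 = B_z$; the bisection solver, the probe values, and the function names are illustrative choices.

```python
import math

def magnetization(delta_t, field, a0=1.0, b0=1.0):
    """Largest non-negative root of 2*a0*delta_t*M + 4*b0*M**3 = field (delta_t = T - Tc)."""
    def residual(m):
        return 2.0 * a0 * delta_t * m + 4.0 * b0 * m**3 - field

    # The residual increases monotonically above its local minimum, located at `lo`.
    lo = math.sqrt(max(0.0, -a0 * delta_t) / (6.0 * b0))
    hi = 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_slope(x1, y1, x2, y2):
    """Slope of log(y) against log(x) between two sample points."""
    return math.log(y2 / y1) / math.log(x2 / x1)

def susceptibility(delta_t, probe=1e-9):
    """Finite-difference estimate of dM/dB at vanishing field, for T > Tc."""
    return magnetization(delta_t, probe) / probe

def mean_field_exponents():
    # beta: M ~ (Tc - T)**beta at zero field, below Tc
    beta = log_slope(1e-3, magnetization(-1e-3, 0.0), 1e-2, magnetization(-1e-2, 0.0))
    # delta: M ~ B**(1/delta) on the critical isotherm T = Tc
    delta = 1.0 / log_slope(1e-6, magnetization(0.0, 1e-6), 1e-4, magnetization(0.0, 1e-4))
    # gamma: chi ~ (T - Tc)**(-gamma) above Tc
    gamma = -log_slope(1e-2, susceptibility(1e-2), 1e-1, susceptibility(1e-1))
    return beta, delta, gamma
```

Under these assumptions `mean_field_exponents()` returns values close to (1/2, 3, 1), independently of the particular `a0` and `b0` used.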
The experimental values of these parameters, β ≈ 0.33, δ ≈ 4.5, and γ ≈ 1.2, are found
for different ferromagnets with different lattice structures and widely differing values of the
Curie temperature Tc. These parameters are thus a universal property of the ferromagnetic
phase transition. Universality is also a feature of Landau’s theory, so Landau’s theory is in
qualitative, though not quantitative, agreement with experiment.
\[
Z = \operatorname{Tr} e^{-\beta H}, \quad \text{where} \quad H = -J\sum_{i=1}^{N} S_i S_{i+1} - H_{\mathrm{ext}}\sum_{i=1}^{N} S_i. \tag{33.26}
\]
Here, Si = ±1 denotes the (uniaxial) magnetization or spin of site i (periodic boundary con-
ditions, SN +1 = S1 , imposed), and Hext represents an external field. The Boltzmann weight
of the system can be factorized according to the relation
\[
e^{-\beta H} = e^{\sum_{i=1}^{N}(K S_i S_{i+1} + h S_i)} = \prod_{i=1}^{N} T(S_i, S_{i+1}), \tag{33.27}
\]
where K = βJ > 0 and h = βHext , and the weight is defined through the relation T (S, S ′ ) =
exp[KSS ′ + h(S + S ′ )/2]. The partition function of the system can be written as
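\[
Z = \sum_{\{S_i\}} e^{-\beta H} = \operatorname{Tr} T^N, \quad \text{where} \quad T = \begin{pmatrix} e^{K+h} & e^{-K}\\ e^{-K} & e^{K-h} \end{pmatrix}. \tag{33.28}
\]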
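Equation 33.28 can be verified for small rings by comparing the trace of the $N$th power of the transfer matrix with a direct sum over all $2^N$ spin configurations. The sketch below is illustrative; the function names are assumptions, and the brute-force sum is only practical for small `n_sites`.

```python
import math
from itertools import product

def transfer_matrix(K, h):
    """2x2 matrix T(S, S') of eq. 33.28; index 0 is spin +1, index 1 is spin -1."""
    return [[math.exp(K + h), math.exp(-K)],
            [math.exp(-K), math.exp(K - h)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def z_transfer(K, h, n_sites):
    """Partition function as Tr T^N."""
    t = transfer_matrix(K, h)
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_sites):
        power = matmul(power, t)
    return power[0][0] + power[1][1]

def z_brute_force(K, h, n_sites):
    """Partition function as an explicit sum over spin configurations on a ring."""
    total = 0.0
    for spins in product((1, -1), repeat=n_sites):
        exponent = sum(K * spins[i] * spins[(i + 1) % n_sites] + h * spins[i]
                       for i in range(n_sites))
        total += math.exp(exponent)
    return total
```

The two functions agree to rounding error for any `K`, `h`, and small `n_sites`, which reflects the factorization 33.27 with periodic boundary conditions.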
We first subdivide the spin chain into regular clusters of b neighbouring spins. We then proceed
to sum over the sub-configurations of each cluster, thereby generating an effective functional
describing the inter-cluster energy balance. For one-dimensional Ising model, we have
\[
Z_N(K, h) = \operatorname{Tr} T^N = \operatorname{Tr}\,(T^b)^{N/b} = \operatorname{Tr}\,(T')^{N/b} = Z_{N/b}(K', h'). \tag{33.29}
\]
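For $b = 2$ and $h = 0$ the map $K \to K'$ in equation 33.29 can be written in closed form. The sketch below uses the standard decimation result $K' = \tfrac{1}{2}\ln\cosh 2K$ together with an overall factor $2\sqrt{\cosh 2K}$ per eliminated spin; this factor is not written explicitly in 33.29 and is stated here as an assumption about how the constant is absorbed. It follows from matching $T^2$ to a rescaled transfer matrix of the same form.

```python
import math

def z_ring(K, n_sites):
    """Tr T^N at h = 0; the eigenvalues of T are 2*cosh(K) and 2*sinh(K)."""
    return (2.0 * math.cosh(K))**n_sites + (2.0 * math.sinh(K))**n_sites

def decimate(K):
    """One b = 2 step at h = 0: returns the new coupling K' and the factor per removed spin."""
    k_new = 0.5 * math.log(math.cosh(2.0 * K))
    prefactor = 2.0 * math.sqrt(math.cosh(2.0 * K))
    return k_new, prefactor

def flow(K, steps):
    """Sequence of couplings generated by repeated decimation."""
    couplings = [K]
    for _ in range(steps):
        K, _ = decimate(K)
        couplings.append(K)
    return couplings
```

With these definitions `z_ring(K, N)` equals `prefactor**(N // 2) * z_ring(k_new, N // 2)` for even `N`, and `flow` shows the coupling decreasing monotonically towards $K = 0$ from any finite starting value.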
all characteristics of the model, including its correlation length ξ, remain invariant. On the
other hand, we noticed above that an RG step is tantamount to doubling the fundamental
length scale of the system. Consistency requires that either ξ = 0 or ξ = ∞.
In the present case, the line of fixed points (1, v) is identified with u = exp(−βJ) = 1, i.e.,
β = 0. This is the limit of infinitely large temperatures, at which we expect the model to be in
a state of maximal thermal disorder, i.e., ξ = 0. Besides the high-temperature fixed line, there
is a zero-temperature fixed point (u, v) = (exp(−βJ), exp(h)) = (0, 1) implying T → 0 and
h → 0. Upon approaching zero temperature, the system is expected to order and to build up
long-range correlations, i.e., ξ → ∞. A critical point corresponds to a fixed point of the RG
with infinite correlation length.
Notice, however, an important difference between the high- and the low-temperature set of
fixed points: while the former is an attractive fixed point in the sense that the RG trajecto-
ries approach it asymptotically, the latter is a repulsive fixed point. No matter how low the
temperature at which we start, the RG flow will drive us into a regime of effectively higher
temperature or lower ordering. (Of course, the physical temperature does not change under
renormalization. All we are saying is that the block spin model behaves as an Ising model at a
higher temperature than the original system.)
• We may proceed according to a generalized block spin scheme and integrate over all de-
grees of freedom located within a certain structural unit in the base manifold {x}.(This
scheme is adjusted to lattice problems where {x} = {xi } is a discrete set of points.)
• We could decide to integrate over a certain sector in momentum space. When this sec-
tor is defined to be a shell Λ/b < |p| < Λ, one speaks of a momentum shell integration.
Naturally, within this scheme, the theory will be explicitly cutoff-dependent at interme-
diate stages.
• Alternatively, we may decide to integrate over all high-lying degrees of freedom λ−1 ≤
|p|. In this case, we will of course encounter divergent integrals. An elegant way to han-
dle these divergences is to apply dimensional regularization. Within this approach one
formally generalizes from integer dimensions d to fractional values d ± ϵ. One moti-
vation for doing so is that the formal extension of the characteristic integrals appearing
during the RG step to non-integer dimensions is finite. As long as one stays clear of
the dangerous values d = integer one can then safely monitor the dependence of the
integrals on the IR cutoff λ−1 .
RG step
The second part of the program is to actually integrate over short range fluctuations. This
step usually involves approximations. In most cases, one will proceed by a so-called loop ex-
pansion, i.e., one organizes the integration over the fast field ϕf according to the number of
independent momentum integrals (loops) that occur after the appropriate contractions.
Following the procedure, an expansion over the fast degrees of freedom gives an action in
which coupling constants of the remaining slow fields are altered. Notice that the integration
over fast field fluctuations may lead to the generation of “new” operators, i.e., operators that
have not been present in the bare action. In such cases one has to investigate whether the
newly generated operators are “relevant” in their scaling behaviour. If so, the appropriate way
to proceed is to include these operators in the action from the very beginning (with an a priori
undetermined coupling constant). One then verifies whether the augmented action represents
a complete system, i.e., one that does not lead to the generation of operators beyond those that
are already present. If necessary, one has to repeat this step until a closed system is obtained.
Rescaling
One next rescales frequency/momentum so that the rescaled field amplitude ϕ′ fluctuates on
the same scales as the original field ϕ, i.e., one sets
\[
q \to bq, \qquad \omega \to b^z\omega. \tag{33.32}
\]
Here, the frequency renormalization exponent or dynamical exponent $z$ depends on the effective
dispersion relating frequency and momentum. We finally notice that the field ϕ, as an
integration variable, may be rescaled arbitrarily. Using this freedom, we select a term in the
action which we believe governs the behaviour of the “free” theory (in a theory with elastic
coupling this might be the leading-order gradient operator $\int d^d r\,(\nabla\phi)^2$) and require that it
be strictly invariant under the RG step. To this end we designate a dimension $L^{d_\phi}$ for the field,
chosen so as to compensate for the factor $b^x$ arising after the renormalization of the operator.
The rescaling $\phi \to b^{d_\phi}\phi$ is known as field renormalization. It renders the “leading” operator
in the action scale invariant.
which is entirely described by the set of changed coupling constants, i.e., the effect of the RG
step is fully encapsulated in the mapping
g ′ = R̃(g), (33.34)
33.3 Renormalization group –445/453–
relating the old value of the vector of coupling constants to the renormalized one. By letting
the control parameter, l ≡ ln b, of the RG step assume infinitesimal values, one can make
the difference between bare and renormalized coupling constants arbitrarily small. It is then
natural to express the difference in the form of a generalized β-function or Gell-Mann–Low
equation
\[
\frac{dg}{dl} = R(g), \tag{33.35}
\]
where the right-hand side is defined through the relation
\[
R(g) = \lim_{l\to 0} l^{-1}\big(\tilde R(g) - g\big). \tag{33.36}
\]
To explore the properties of flow, let us assume that we had managed to diagonalize the matrix
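W. Denoting the eigenvalues by $\lambda_\alpha$ and the left-eigenvectors by $\phi_\alpha$, we have
\[
\phi_\alpha^\intercal W = \lambda_\alpha \phi_\alpha^\intercal. \tag{33.38}
\]
Let $v_\alpha$ be the αth component of the vector $g - g^*$ when represented in the basis $\{\phi_\alpha\}$, i.e.,
\[
v_\alpha = \phi_\alpha^\intercal (g - g^*). \tag{33.39}
\]
It follows that
\[
\frac{dv_\alpha}{dl} = \lambda_\alpha v_\alpha. \tag{33.40}
\]
Under renormalization, the coefficients $v_\alpha$ change by a mere scaling factor,
$v_\alpha(l) = v_\alpha(0)\,e^{\lambda_\alpha l} = v_\alpha(0)\,b^{\lambda_\alpha}$, wherefore they
are called scaling fields. This suggests a discrimination between at least three different types of
scaling fields: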
• For λα > 0 the flow is directed away from the critical point. The associated scaling field
is said to be relevant.
• In the complementary case, λα < 0, the flow is attracted by the fixed point. Scaling
fields with this property are said to be irrelevant.
• Finally, scaling fields which are invariant under the flow, λα = 0 , are termed marginal.
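The three cases above follow directly from the solution of equation 33.40. A minimal sketch, with hypothetical function names and an arbitrary tolerance for deciding when an eigenvalue counts as zero:

```python
import math

def classify_scaling_field(eigenvalue, tol=1e-12):
    """Label a scaling field by the sign of its eigenvalue lambda_alpha."""
    if eigenvalue > tol:
        return "relevant"
    if eigenvalue < -tol:
        return "irrelevant"
    return "marginal"

def scaling_field_after(v0, eigenvalue, l):
    """Solution of dv/dl = eigenvalue * v (eq. 33.40) with v(0) = v0, where l = ln b."""
    return v0 * math.exp(eigenvalue * l)
```

A relevant field grows as $b^{\lambda_\alpha}$ under repeated coarse-graining, an irrelevant one decays, and a marginal one is unchanged at this linear order.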
• Firstly, there are stable fixed points, i.e., fixed points whose scaling fields are all irrelevant
or, at worst, marginal. These points define what we might call “stable phases of matter”:
when you release a system somewhere in the parameter space surrounding any of these
attractors, it will scale towards the fixed point and eventually sit there. Or, expressed in
more physical terms, looking at the problem at larger and larger scales will make it more
and more resemble the infinitely correlated self-similar fixed-point configuration. By
construction, the fixed point is impervious to moderate variations in the microscopic
morphology of the system, i.e., it genuinely represents what one might call a “state of
matter.”
• Complementary to stable fixed points, there are unstable fixed points, where all scaling
fields are relevant (e.g., the T = 0 fixed point of the 1-D Ising model). You can never
get there and, even if you managed to approach it closely, the harsh conditions of reality
will make you flow away from it. Although unstable fixed points do not correspond to
realizable forms of matter, they are of importance inasmuch as they “orient” the global
RG flow of the system.
• Finally, there is the generic class of fixed points with both relevant and irrelevant scaling
fields. These points are of particular interest inasmuch as they can be associated with
phase transitions. To understand this point, we first notice that the r eigenvectors asso-
ciated with irrelevant scaling fields span the tangent space S of an r-dimensional mani-
fold known as the critical surface. This critical manifold forms the basin of attraction of
the fixed point, i.e., whenever a set of physical coupling constants g is fine-tuned so that
g ∈ S, the expansion in terms of scaling fields contains only irrelevant contributions
and the system will feel attracted to the fixed point as if it were a stable one. However,
the smallest deviation from the critical surface introduces a relevant component driving
the system exponentially away from the fixed point. For example, in the case of the fer-
romagnetic phase transition, deviations from the critical temperature Tc are relevant. If
we consider a system only slightly above or below Tc , it may initially appear to be crit-
ical. However, upon further increasing the scale, the relevant deviation will grow and
drive the system away from criticality, either towards the stable high-temperature fixed
point of the paramagnetic phase or towards the ferromagnetic low-temperature phase.
33.4 Critical exponents
We have seen that, right at the transition/fixed point, the system is self-similar. This implies
that the behaviour of its various characteristics must be described by power laws. The set
of different exponents characterizing the relevant power laws occurring in the vicinity of the
transition are known as critical exponents.
In the following, let us briefly enumerate the list of the most relevant exponents, α, β, γ, δ, η
and ν. Although we shall again make use of the language of the magnetic transition, it is clear
that the definitions of most exponents generalize to other systems.
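\[
C = -\frac{T}{L^d}\,\frac{\partial^2 F}{\partial T^2}\bigg|_{h\searrow 0}, \tag{33.41}
\]
\[
M \equiv -\partial_H F\big|_{H\searrow 0} \sim (-t)^{\beta}. \tag{33.42}
\]
\[
\chi \equiv \partial_h M\big|_{h\searrow 0} \sim (-t)^{-\gamma}. \tag{33.43}
\]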
5. Upon approaching the transition point, the correlation length diverges as $\xi \sim |t|^{-\nu}$.
crosses over from exponential to a power law scaling behaviour at the length scale ξ.
The engineering dimension of ϕ is $[\phi] = L^{(2-d)/2}$ and so $C(r)$ has canonical dimension
$L^{2-d}$. The exponent η is called the anomalous dimension of the correlation function.
As the response functions can be obtained from integrating the connected correlation
functions, we have
\[
\chi \sim \int d^d x\, C(x) \sim \int_0^{\xi} \frac{d^d x}{r^{d-2+\eta}} \sim \xi^{2-\eta}. \tag{33.45}
\]
Universality
In fact, the majority of critical systems can be classified into a relatively small number of uni-
versality classes. Crudely speaking, leaving apart more esoteric classes of phase transitions,
there are O(10) fundamentally different types of flow recurrently appearing in practical ap-
plications. This has to be compared with the near infinity of different physical systems that
display critical phenomena. The origin of this universality can readily be understood from the
concept of critical surfaces.
Imagine, then, an experimentalist exploring a system that is known to exhibit a phase tran-
sition. Motivated by the critical phenomena that accompany phase transitions, the available
control parameters Xi (temperature, pressure, magnetic field, etc.) will be varied until the
system begins to exhibit large fluctuations.
On a theoretical level, the variation of the control parameters determines the initial values of
the coupling constants of the model. For microscopic parameters corresponding to a point
above or below the critical manifold, the system asymptotically falls into either the “high-” or
the “low-temperature” regime. However, eventually the trajectory through parameter space
will intersect the critical surface. For this particular set of coupling constants, the system is
critical. As we look at it on larger and larger length scales, it will be attracted by the fixed point
at S, i.e., it will display the universal behaviour characteristic of this particular point. This is the
origin of universality: variation of the system parameters in a different manner will generate a
different trajectory. However, as long as this trajectory intersects with S, it is guaranteed that
the critical behaviour will exhibit the same universal characteristics controlled by the unique
fixed point.
In fact a more far-reaching statement can be made. Given that there is an infinity of systems ex-
hibiting transition behaviour while there is only a very limited set of universality classes, many
systems of very different microscopic morphology must have the same universal behaviour.
More formally, different microscopic systems must map onto the same critical low-energy
theory.
Scaling laws
Let us consider the case of the ferromagnetic transition. The flow in the vicinity of the magnetic
fixed point is controlled by only two relevant scaling fields, the (reduced) temperature t ≡
(T − Tc)/Tc and the reduced magnetic field h ≡ H/T. The other scaling fields $g_i$ are irrelevant.
33.5 RG analysis of the ferromagnetic transition –449/453–
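Under a renormalization group transformation, the reduced free energy $f = F/(T L^d)$ will
behave as
\[
\begin{aligned}
f(t, h, g_i) &= b^{-d} f(t b^{y_t}, h b^{y_h}, g_i b^{\lambda_i}) = t^{d/y_t} f(1, h t^{-y_h/y_t}, g_i t^{-\lambda_i/y_t})\\
&\overset{t \ll 1}{\approx} t^{d/y_t} f(1, h t^{-y_h/y_t}, 0) \equiv t^{d/y_t}\tilde f(h t^{-y_h/y_t}).
\end{aligned}
\tag{33.46}
\]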
Here, we have used the freedom of arbitrarily choosing the parameter b to set tbyt = 1 while,
in the third equality, we have assumed that we are sufficiently close to the transition that the
dependence of f on irrelevant scaling fields is inessential. Combining the definitions of critical
exponents and equation 33.46, it can be shown that
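\[
\alpha = 2 - \frac{d}{y_t}, \quad \beta = \frac{d - y_h}{y_t}, \quad \gamma = \frac{2y_h - d}{y_t}, \quad
\delta = \frac{y_h}{d - y_h}, \quad \nu = \frac{1}{y_t}, \quad \eta = 2 + d - 2y_h. \tag{33.47}
\]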
The dimensions of the relevant scaling fields have a more fundamental status than the critical
exponents. Of the six classical exponents, only two can be truly independent. Scaling laws can
be derived by eliminating yh and yt in equations 33.47:
(33.50)
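The elimination of $y_h$ and $y_t$ can be checked with exact rational arithmetic. The sketch below evaluates the six exponents of 33.47 and tests them against four relations that are standard in the literature (often named after Rushbrooke, Widom, Josephson, and Fisher): $\alpha + 2\beta + \gamma = 2$, $\gamma = \beta(\delta - 1)$, $2 - \alpha = d\nu$, and $\gamma = \nu(2 - \eta)$. Which of these the notes list is an assumption here; the function names are illustrative.

```python
from fractions import Fraction

def exponents(d, y_t, y_h):
    """Critical exponents from eq. 33.47 as exact fractions."""
    d, y_t, y_h = Fraction(d), Fraction(y_t), Fraction(y_h)
    return {
        "alpha": 2 - d / y_t,
        "beta": (d - y_h) / y_t,
        "gamma": (2 * y_h - d) / y_t,
        "delta": y_h / (d - y_h),
        "nu": 1 / y_t,
        "eta": 2 + d - 2 * y_h,
    }

def scaling_law_residuals(d, y_t, y_h):
    """Each entry vanishes identically if the corresponding scaling law holds."""
    e = exponents(d, y_t, y_h)
    return {
        "alpha + 2*beta + gamma - 2": e["alpha"] + 2 * e["beta"] + e["gamma"] - 2,
        "gamma - beta*(delta - 1)": e["gamma"] - e["beta"] * (e["delta"] - 1),
        "2 - alpha - d*nu": 2 - e["alpha"] - Fraction(d) * e["nu"],
        "gamma - nu*(2 - eta)": e["gamma"] - e["nu"] * (2 - e["eta"]),
    }
```

For example, `exponents(3, 2, Fraction(5, 2))` uses the values $y_t = 2$ and $y_h = d/2 + 1$ with $d = 3$, and `scaling_law_residuals` returns zeros for any admissible $(d, y_t, y_h)$ with $y_h \neq d$.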
These relations convey much about the potential significance of all structurally allowed oper-
ators:
neglecting all irrelevant operators. We split our field into fast and slow degrees of freedom
ϕ = ϕs + ϕf , resulting in the fragmentation of the action S[ϕs , ϕf ] = Ss (ϕs ) + Sf [ϕf ] +
Sc [ϕs , ϕf ]. However, the action Sc coupling fast and slow components vanishes, implying that
the integration over the fast field merely leads to an inessential constant. The effect of the RG
step on the action is then entirely contained in the rescaling of the slow action. The scaling
factors are determined by the engineering dimensions of the operators appearing in the action,
i.e., $r \to b^2 r$ and $h \to b^{1+d/2}h$. Using the fact that $r \sim t$ we can then readily write down the
two relevant scaling dimensions of the problem, $y_t = 2$ and $y_h = d/2 + 1$. Using equations
33.47, we obtain
\[
\alpha = 2 - \frac{d}{2}, \quad \beta = \frac{d}{4} - \frac{1}{2}, \quad \gamma = 1, \quad \delta = \frac{d+2}{d-2}, \quad \nu = \frac{1}{2}, \quad \eta = 0. \tag{33.56}
\]
We notice that the Gaussian model possesses only one fixed point, namely r = h = 0, which
in the context of ϕ4 -theory is called the Gaussian fixed point.
One tricky issue is that the mean field exponents agree with the scaling analysis here only
when d = 4. This results from the fact that the coefficient b(T) of $M_z^4$ in mean field theory
is assumed to be constant in the vicinity of T = Tc. For example, in mean field theory, we
have $M_z = \sqrt{a_0(T_c - T)/(2b_0)}$ and so $M_z \sim (-t)^{1/2}$. If we take into account the fact that $b_0$
scales as $t^{(4-d)/2}$ around the Gaussian fixed point, we find that $M_z \sim (-t)^{d/4 - 1/2}$, which is
fully compatible with the scaling analysis.
It is tempting to think that we can just neglect the irrelevant operators because their coefficients
flow to zero as we approach the infra-red. However, sometimes we will be interested in
quantities which have the irrelevant coupling constants sitting in the denominator. In this case, one
cannot just blindly ignore these irrelevant couplings as they affect the scaling analysis. When
this happens, the irrelevant coupling is referred to as dangerously irrelevant.
If we replace t by −iτ , Z will be exactly the partition function of Ising model in d-dimensional
space. Comparing statistical ϕ4 theory with its quantum counterpart, we can obtain the free
propagator of it as
\[
D(p) = \frac{1}{p^2 + r}. \tag{33.58}
\]
Note that when calculating loop integrals of quantum field, the free propagator will be the
same as that of statistical field after Wick rotation. We may infer that the renormalization
group equations of statistical field and quantum field are identical.
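\[
\frac{dr}{dl} = 2r + \frac{\lambda}{16\pi^2} - \frac{r\lambda}{16\pi^2}, \qquad
\frac{d\lambda}{dl} = \epsilon\lambda - \frac{3\lambda^2}{16\pi^2}, \qquad
\frac{dh}{dl} = \frac{6 - \epsilon}{2}\,h, \tag{33.59}
\]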
where ϵ = 4 − d. Equations 33.59 clearly illustrate the meaning of the ϵ-expansion. According
to the second one, a perturbation away from the Gaussian fixed point will initially grow at a
rate set by the engineering dimension ϵ, while the one-loop contribution ∼ λ2 stops the flow
at a value λ ∼ ϵ.
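[Figure 33.2: the β-function β(λ) plotted against λ; curve labels ϵ > 0 and ϵ < 0, with the non-trivial zero at λ = O(ϵ).]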
Equating the right-hand sides of the Gell-Mann–Low equations to zero (and temporarily ignoring
the magnetic field), we indeed find that, besides the Gaussian fixed point, a non-trivial fixed
point $(r_2^*, \lambda_2^*) = (-\epsilon/6, 16\pi^2\epsilon/3)$ has appeared. Notice that the second fixed point is $O(\epsilon)$
and coalesces with the Gaussian fixed point as $\epsilon$ is sent to zero. Plotting the β-function for the
coupling constant $\lambda$, we further find that, for $\epsilon > 0$, $\lambda$ is relevant around the Gaussian fixed
point but irrelevant at the non-trivial fixed point, as shown in Figure 33.2.
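The statement that $\lambda$ leaves the Gaussian fixed point and is attracted to the non-trivial one can be checked numerically. The following is a minimal sketch, not part of the original notes: it integrates the second equation of (33.59) with a hand-written RK4 stepper. The function names, the value $\epsilon = 0.1$, the initial conditions and the integration length are all illustrative assumptions.

```python
import math

def beta_lambda(lam, eps):
    """One-loop beta function for lambda, second equation of (33.59)."""
    return eps * lam - 3.0 * lam**2 / (16.0 * math.pi**2)

def flow_lambda(lam0, eps, s_max=200.0, steps=20000):
    """Integrate d(lambda)/d(ln l) from ln l = 0 to s_max with classical RK4.

    lam0, s_max and steps are illustrative choices, not values from the text.
    """
    h = s_max / steps
    lam = lam0
    for _ in range(steps):
        k1 = beta_lambda(lam, eps)
        k2 = beta_lambda(lam + 0.5 * h * k1, eps)
        k3 = beta_lambda(lam + 0.5 * h * k2, eps)
        k4 = beta_lambda(lam + h * k3, eps)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam

eps = 0.1                                      # assumed small, positive epsilon
lam_star = 16.0 * math.pi**2 * eps / 3.0       # non-trivial zero of beta(lambda)
lam_from_below = flow_lambda(0.01, eps)        # start near the Gaussian point
lam_from_above = flow_lambda(3.0 * lam_star, eps)
print(lam_star, lam_from_below, lam_from_above)
```

Starting either just above $\lambda = 0$ or well above $\lambda^*$, the coupling should settle at $\lambda^* = 16\pi^2\epsilon/3$, which is the sense in which $\lambda$ is relevant at the Gaussian fixed point and irrelevant at the non-trivial one.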
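To understand the flow diagram of the system, one may linearize the β-function around both
the Gaussian and the non-trivial fixed point. Denoting the linearized mappings by $W_{1,2}$, we
have
$$W_1 = \begin{pmatrix} 2 & \frac{1}{16\pi^2} \\ 0 & \epsilon \end{pmatrix}, \qquad W_2 = \begin{pmatrix} 2 - \frac{\epsilon}{3} & \frac{1+\epsilon/6}{16\pi^2} \\ 0 & -\epsilon \end{pmatrix}. \tag{33.60}$$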
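The linearized mappings can be cross-checked by differentiating the right-hand sides of (33.59) numerically. The sketch below is an illustration under stated assumptions: the helper names `beta` and `jacobian`, the finite-difference step and the value $\epsilon = 0.1$ are hypothetical choices, and the non-trivial fixed point is taken at its leading-order position $(-\epsilon/6, 16\pi^2\epsilon/3)$.

```python
import math

C = 1.0 / (16.0 * math.pi**2)   # the recurring factor 1/(16 pi^2)

def beta(r, lam, eps):
    """Right-hand sides of the r and lambda equations in (33.59)."""
    return (2.0 * r + C * lam - C * r * lam,
            eps * lam - 3.0 * C * lam**2)

def jacobian(r, lam, eps, d=1e-6):
    """Central-difference Jacobian of (beta_r, beta_lambda) w.r.t. (r, lambda)."""
    br_rp, bl_rp = beta(r + d, lam, eps)
    br_rm, bl_rm = beta(r - d, lam, eps)
    br_lp, bl_lp = beta(r, lam + d, eps)
    br_lm, bl_lm = beta(r, lam - d, eps)
    return [[(br_rp - br_rm) / (2 * d), (br_lp - br_lm) / (2 * d)],
            [(bl_rp - bl_rm) / (2 * d), (bl_lp - bl_lm) / (2 * d)]]

eps = 0.1                                            # illustrative value
W1 = jacobian(0.0, 0.0, eps)                         # Gaussian fixed point
W2 = jacobian(-eps / 6.0, eps / (3.0 * C), eps)      # non-trivial fixed point
print(W1)
print(W2)
```

Because both matrices are upper triangular, their eigenvalues can be read off the diagonal: $(2, \epsilon)$ for $W_1$ and $(2 - \epsilon/3, -\epsilon)$ for $W_2$.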
Figure 33.3 shows the flow in the vicinity of the two fixed points, as described by the matrices
$W_{1,2}$, as well as the extrapolation to a global flow chart. Notice that the critical surface of the
system (the straight line interpolating between the two fixed points) is tilted with respect
to the $r$ (temperature) axis of the phase diagram. This implies that it is not the physical
temperature alone that decides whether the system will eventually wind up in the paramagnetic
or ferromagnetic sector of the phase diagram. Rather, one has to relate temperature to the
strength of the non-linearity to decide on which side of the critical surface we are. For example,
for strong enough $\lambda$, even a system with $r$ initially negative may eventually flow towards
the disordered phase. This type of behaviour cannot be predicted from the mean-field analysis
of the model. Rather, it represents a non-trivial effect of fluctuations.

[Figure 33.3: Phase diagram of the $\phi^4$-model as obtained from the $\epsilon$-expansion. Labels in the figure: ferromagnetic, paramagnetic, Gaussian, non-trivial, unphysical; axis $r$.]
Finally, notice that, while we can formally extend the flow into the lower portion of the diagram,
$\lambda < 0$, this region is actually unphysical. The reason is that, for $\lambda < 0$, the action is
fundamentally unstable and, in the absence of a sixth-order contribution, does not describe a
physical system.
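Of the two eigenvalues of $W_2$, $2 - \epsilon/3$ and $-\epsilon$, only the former is relevant and tied to the scaling
of the coupling constant $r$. Thus, we have $y_t = 2 - \epsilon/3$ and, as before, $y_h = (6 - \epsilon)/2$. The
critical exponents are therefore
$$\alpha = \frac{\epsilon}{6}, \quad \beta = \frac{1}{2} - \frac{\epsilon}{6}, \quad \gamma = 1 + \frac{\epsilon}{6}, \quad \delta = 3 + \epsilon, \quad \nu = \frac{1}{2} + \frac{\epsilon}{12}, \quad \eta = 0. \tag{33.61}$$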
If we extend the radius of the expansion to $\epsilon = 1$, we obtain the critical exponents of the
three-dimensional Ising model. The agreement with the experimental results has improved, even
though we have driven the $\epsilon$-expansion well beyond its range of applicability.
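To see what setting $\epsilon = 1$ gives in practice, the first-order expressions of (33.61) can simply be evaluated. The snippet below is a small sketch using exact fractions; the function name `exponents_first_order` is a hypothetical helper, and no experimental values are included.

```python
from fractions import Fraction

def exponents_first_order(eps):
    """Critical exponents of (33.61), truncated at first order in epsilon."""
    eps = Fraction(eps)
    return {
        "alpha": eps / 6,
        "beta": Fraction(1, 2) - eps / 6,
        "gamma": 1 + eps / 6,
        "delta": 3 + eps,
        "nu": Fraction(1, 2) + eps / 12,
        "eta": Fraction(0),
    }

mean_field = exponents_first_order(0)   # epsilon = 0, i.e. d = 4
ising_3d = exponents_first_order(1)     # epsilon = 1, i.e. d = 3

for name in mean_field:
    print(f"{name:6s} d=4: {str(mean_field[name]):>4s}   d=3: {str(ising_3d[name]):>5s}")
```

At $\epsilon = 1$ this gives $\alpha = 1/6$, $\beta = 1/3$, $\gamma = 7/6$, $\delta = 4$, $\nu = 7/12$ and $\eta = 0$, compared with the mean field values $0$, $1/2$, $1$, $3$, $1/2$ and $0$ at $\epsilon = 0$. The truncated expressions satisfy $\alpha + 2\beta + \gamma = 2$ for any $\epsilon$.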