Lecture notes GR2
Chris Couzens
Office S1.46
Mathematical Institute, University of Oxford
Andrew Wiles Building, Radcliffe Observatory Quarter
Woodstock Road
Oxford, OX2 6GG
Disclaimer: There are almost certainly typos in the notes. If something does not look
correct or needs further explanation please let me know.
There is a large variety of good textbooks and lecture notes on general relativity. This
course borrows from a number of them in various places, chief among them the
book by Sean Carroll [1] and the book by Wald [2].
For the background material one can read the GR1 lecture notes. This should cover all
the necessary prerequisites that one would need to know about general relativity.
Some useful lecture notes are by Harvey Reall and by Fay Dowker.
Contents
1 Introduction
1.1 Manifolds
1.2 Riemannian geometry
1.3 Einstein's equations
1.4 Schwarzschild solution
2 Killing vectors
2.1 Lie derivative
2.2 Killing vectors
2.3 Maximally symmetric spaces: how many Killing vectors can we have?
2.4 Conserved quantities along geodesics and Killing vectors
2.5 Spherically symmetric, static and stationary spacetimes
2.5.1 Spherically symmetric spacetimes
2.5.2 Static and stationary spacetimes
6 Rotating black holes
6.1 The Kerr–Newman solution
6.2 The Kerr solution
6.3 Komar Integrals
6.4 Maximal extension
6.5 Ergosphere and Penrose process (or how to steal energy from a black hole)
8 Singularity theorem
8.1 Singularities
8.2 Null hypersurfaces
8.3 Geodesic Deviation
8.4 Geodesic congruences
8.5 Expansion, rotation and shear
8.6 Expansion and shear of a null hypersurface
8.7 Trapped surfaces
8.8 Raychaudhuri’s equation
8.9 Energy conditions
8.10 Conjugate points
8.11 Definition of a black hole and the event horizon
8.12 Penrose Singularity Theorem
10.5 Hawking temperature
10.6 Black hole evaporation
Conventions
• We will use the god-given signature convention of mostly plus (−, +, +, +). This may
differ from the convention you have used in other courses, especially field theory courses.
This convention is preferable when thinking about geometry as it gives positive spatial
distances. For quantum field theory the other convention is preferable since it ensures
that energies and frequencies are positive. You may map between the two conventions
through Wick rotation, essentially allowing the coordinates to become complex.
• Spacetime indices will be taken to be Greek letters from the middle of the alphabet:
µ, ν, ρ, ... and run over 0, 1, 2, 3. Latin indices i, j, k, ... run over the spatial directions
and take values 1, 2, 3.
• We employ the Einstein summation convention: repeated indices are summed over, unless
otherwise stated.
• After introducing curvature we will take the metric to be gµν and the determinant will
be det(gµν ) ≡ g.
Useful formulae
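\[
\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} = 0 , \qquad g_{\mu\nu}(x) \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = -1 ,
\]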
where τ is the proper time. For light, the first equation takes the same form just
replacing τ with an affine parameter. The second is modified by −1 → 0.
• The Christoffel symbols (Levi–Civita connection) are
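\[
\Gamma^\mu{}_{\nu\rho} = \frac{1}{2} g^{\mu\sigma} \left( \partial_\nu g_{\sigma\rho} + \partial_\rho g_{\sigma\nu} - \partial_\sigma g_{\nu\rho} \right) .
\]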
– Symmetries
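\[
R_{\mu\nu\rho\sigma} = - R_{\mu\nu\sigma\rho} , \qquad R_{\mu\nu\rho\sigma} = R_{\rho\sigma\mu\nu} .
\]
– Bianchi identity 1
\[
R_{\mu\nu\rho\sigma} + R_{\mu\rho\sigma\nu} + R_{\mu\sigma\nu\rho} = 0 .
\]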
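– Bianchi identity 2
\[
\nabla_\lambda R_{\mu\nu\rho\sigma} + \nabla_\mu R_{\nu\lambda\rho\sigma} + \nabla_\nu R_{\lambda\mu\rho\sigma} = 0 .
\]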
• Ricci tensor
\[
R_{\mu\nu} = R^\rho{}_{\mu\rho\nu} .
\]
• Ricci scalar
\[
R = R_{\mu\nu} g^{\mu\nu} .
\]
• Einstein tensor
\[
G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} .
\]
• Einstein–Hilbert action plus cosmological constant,
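\[
S = \frac{1}{16\pi G} \int d^4x \sqrt{-g} \, \left( R + \Lambda \right) .
\]
\[
\delta g^{\mu\nu} = - g^{\mu\rho} g^{\nu\sigma} \delta g_{\rho\sigma} ,
\]
\[
\delta g = g \, g^{\mu\nu} \delta g_{\mu\nu} ,
\]
\[
\delta R_{\mu\nu} = \nabla_\rho \delta\Gamma^\rho{}_{\mu\nu} - \nabla_\mu \delta\Gamma^\rho{}_{\rho\nu} .
\]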
1 Introduction
These are lecture notes for the Part C course General Relativity 2 at the University of Oxford. They
are an extension of the course General Relativity 1 and we assume that the reader is familiar
with the material covered there.
To keep our conventions in order we will briefly review the essential material from GR1.
For those who have done a GR course but not studied manifolds I recommend consulting the
GR1 notes as manifolds will appear at times in the lectures.
1.1 Manifolds
The underlying structure of General relativity is differential geometry. This is the study of
manifolds.
Definition Let X be any set and T = {Ui |i ∈ I} denote a certain collection of subsets
of X. The pair (X, T ) is called a topological space if T satisfies
1. Both the set X and the empty set ∅ are open subsets: X ∈ T and ∅ ∈ T .
ψ ◦ f ◦ φ−1 : Rm → Rn . (1.1)
If we write $\phi(p) = \{x^\mu\}$ and $\psi(f(p)) = \{y^\alpha\}$ then $\psi \circ f \circ \phi^{-1}$ is just the usual vector-valued
function $y = \psi \circ f \circ \phi^{-1}(x)$ of m variables. Sometimes it is useful to abuse notation and write
$y = f(x)$ or $y^\alpha = f^\alpha(x^\mu)$ when we know the coordinate systems on M and N that are in use.
Definition We say that a function f : M → R is smooth if the map $f \circ \phi^{-1}$ : U → R is
smooth for all charts. We let the set of all smooth functions on M be denoted by F(M ).
Definition We say that a map f : M → N between two manifolds is smooth if the map
ψ◦f ◦φ−1 : U → V is smooth for all charts φ : M → Rm and ψ : N → Rn . If y = ψ◦f ◦φ−1 (x)
is C ∞ then we say that f is differentiable at p. This is actually independent of the coordinate
system.
Definition Let f : M → N be a homeomorphism and ψ and φ coordinate functions. If
$\psi \circ f \circ \phi^{-1}$ is invertible, with both it and its inverse smooth, f is called a diffeomorphism and M is
said to be diffeomorphic to N and vice-versa. This is denoted by M ≡ N .
Since the map is invertible it follows that if M ≡ N then dim M = dim N . Homeomorphisms
classify spaces according to whether it is possible to deform one space into another
continuously. Diffeomorphisms classify spaces into equivalence classes according to whether
it is possible to deform one space into the other smoothly. As such, a diffeomorphism is
stronger than a homeomorphism: it requires that both the map and its inverse are smooth.
Two diffeomorphic manifolds are viewed as the same manifold.
Tangent vectors We can define curves on our manifold, γ : (a, b) → M and the tangent to
such a curve. If we collect all curves passing through the point p and find all tangent vectors
to the point p, this defines the tangent space at p: Tp (M ) which is a vector space. A basis of
the tangent space is given by
\[
\{e_\mu\} = \left\{ \frac{\partial}{\partial x^\mu} \right\} , \tag{1.2}
\]
and any vector field X may be expanded in terms of this basis as
\[
X = X^\mu \frac{\partial}{\partial x^\mu} . \tag{1.3}
\]
When we are looking at vector fields in Tp (M ) the X µ are just numbers, however we can
equally consider the tangent bundle which is the union of all tangent spaces in M . Then a
vector field in the tangent bundle has X µ which are functions on M .
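Let $U_i$, $U_j$ be two coordinate patches with coordinates $x = \phi_i(p)$ and $y = \phi_j(p)$ respectively
and let $p \in U_i \cap U_j$. Then we can give the vector field X in both sets of coordinates and we
have that
\[
\frac{\partial}{\partial x^\mu} = \frac{\partial y^\nu}{\partial x^\mu} \frac{\partial}{\partial y^\nu} , \tag{1.4}
\]
and therefore the components of the vector field X transform as
\[
X = X^\mu \frac{\partial}{\partial x^\mu} = \tilde{X}^\mu \frac{\partial}{\partial y^\mu} \quad \Rightarrow \quad \tilde{X}^\mu = X^\nu \frac{\partial y^\mu}{\partial x^\nu} . \tag{1.5}
\]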
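As an illustration of the transformation law (1.5), the following minimal sketch takes polar coordinates (r, θ) on the plane as the x-coordinates and Cartesian coordinates as the y-coordinates. This choice of charts, and the function name, are assumptions made for the example only. It transforms the components of the rotation generator ∂/∂θ, which in Cartesian coordinates should read −y ∂/∂x + x ∂/∂y.

```python
import math

def polar_to_cartesian_components(r, theta, X_r, X_theta):
    """Apply X~^mu = (dy^mu/dx^nu) X^nu with x = (r, theta), y = (x, y).

    Illustrative helper (not from the notes): y^1 = r cos(theta), y^2 = r sin(theta).
    """
    # Jacobian matrix dy^mu/dx^nu, rows labelled by mu, columns by nu.
    jacobian = [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]
    X = [X_r, X_theta]
    return [sum(jacobian[mu][nu] * X[nu] for nu in range(2)) for mu in range(2)]

# The vector d/dtheta at (r, theta) = (2, pi/2), i.e. at the Cartesian point (0, 2).
components = polar_to_cartesian_components(2.0, math.pi / 2, 0.0, 1.0)
```

At the Cartesian point (0, 2) the expected components are (−y, x) = (−2, 0), which is what the Jacobian contraction returns up to rounding.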
One-forms Since $T_p(M)$ is a vector space there exists a dual vector space whose elements are
linear functions $T_p(M) \to \mathbb{R}$. The dual space is called the cotangent space at p, and denoted
$T_p^*(M)$. An element $\omega \in T_p^*(M)$ is a linear map $T_p(M) \to \mathbb{R}$ and is called a cotangent vector,
dual vector or one-form.
The natural basis of the cotangent space is given by the differentials of the coordinates:
$\{dx^\mu\}$. Using the bilinear map arising from the tangent and cotangent spaces being dual
vector spaces, one takes
\[
\left\langle dx^\mu , \frac{\partial}{\partial x^\nu} \right\rangle = \delta^\mu_\nu . \tag{1.6}
\]
An arbitrary one-form can then be expanded out in this basis as $\omega = \omega_\mu dx^\mu$. Let us take
$p \in U_i \cap U_j$ as before, then for $\omega \in T_p^*(M)$ we have
\[
\omega = \omega_\mu dx^\mu = \tilde{\omega}_\mu dy^\mu \quad \Rightarrow \quad \tilde{\omega}_\nu = \omega_\mu \frac{\partial x^\mu}{\partial y^\nu} . \tag{1.7}
\]
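Tensors We can now define a tensor of type (q, r) to be a multilinear object which maps q
elements of $T_p^*(M)$ and r elements of $T_p(M)$ to R. We denote the set of (q, r) tensors at p by
$T^{(q,r)}_p(M)$. An element of $T^{(q,r)}_p(M)$ can be written in terms of the bases described above
as
\[
T = T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \nu_r} \frac{\partial}{\partial x^{\mu_1}} \ldots \frac{\partial}{\partial x^{\mu_q}} dx^{\nu_1} \ldots dx^{\nu_r} . \tag{1.8}
\]
T is a multilinear function
\[
T : \otimes^q T_p^*(M) \otimes^r T_p(M) \to \mathbb{R} . \tag{1.9}
\]
Let $V_i = V_i^\mu \frac{\partial}{\partial x^\mu}$ with 1 ≤ i ≤ r and $\omega_j = \omega_{j\mu} dx^\mu$ with 1 ≤ j ≤ q then the action of T is
\[
T(\omega_1 , \ldots , \omega_q ; V_1 , \ldots , V_r) = T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \nu_r} \, \omega_{1\mu_1} \ldots \omega_{q\mu_q} V_1^{\nu_1} \ldots V_r^{\nu_r} . \tag{1.10}
\]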
Tensor fields So far we have defined vectors, one-forms and tensors at a particular point
p ∈ M . We want to be able to smoothly assign such an object to every point of M . For a
vector we call such an object a vector field. In other words, if V is a vector field then for every
f ∈ F(M ) we have V [f ] ∈ F(M ). We will denote the set of all vector fields on M as X (M ). A
vector field X at p ∈ M is denoted by X|p which is an element of Tp (M ). Similarly we may
define a tensor field of type (q, r) by a smooth assignment of an element of $T^q_{r,p}(M)$ at each
point p ∈ M . The set of tensor fields of type (q, r) on M is denoted by $T^q_r(M)$.
Differential forms A differential form of order r, or more succinctly an r-form, is a totally
anti-symmetric tensor of type (0, r).
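The wedge product ∧ of r one-forms is defined to be the totally anti-symmetric tensor
product of the one-forms
\[
dx^{\mu_1} \wedge dx^{\mu_2} \wedge \ldots \wedge dx^{\mu_r} \equiv \sum_{P \in S_r} \mathrm{sgn}(P) \, dx^{\mu_{P(1)}} \otimes dx^{\mu_{P(2)}} \otimes \ldots \otimes dx^{\mu_{P(r)}} . \tag{1.11}
\]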
Thus
dxµ ∧ dxν = dxµ ⊗ dxν − dxν ⊗ dxµ . (1.12)
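To make (1.12) concrete, here is a minimal sketch that treats the basis two-form dx^µ ∧ dx^ν as a bilinear function of two vectors given by their components. The helper name and the sample vectors are illustrative choices, not notation from the notes.

```python
def wedge_basis(mu, nu):
    """Return dx^mu ∧ dx^nu = dx^mu ⊗ dx^nu − dx^nu ⊗ dx^mu as a function of two vectors."""
    def two_form(V, W):
        # dx^mu picks out the mu-th component of a vector, see (1.6).
        return V[mu] * W[nu] - V[nu] * W[mu]
    return two_form

V = [1.0, 2.0, 3.0]
W = [4.0, 5.0, 6.0]
dx0_wedge_dx1 = wedge_basis(0, 1)
value = dx0_wedge_dx1(V, W)    # V^0 W^1 − V^1 W^0
swapped = dx0_wedge_dx1(W, V)  # exchanging the arguments flips the sign
```

Swapping either the two arguments or the two indices flips the sign, and a repeated index gives zero, as expected for a totally anti-symmetric object.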
We will denote the vector space of r-forms at the point p ∈ M by $\Omega^r_p(M)$; a basis is provided
by the set of all wedge products in (1.11). We can then expand an element of $\Omega^r_p(M)$ as
\[
\omega = \frac{1}{r!} \omega_{\mu_1 \ldots \mu_r} dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r} , \tag{1.13}
\]
where $\omega_{\mu_1 \ldots \mu_r}$ are taken to be totally anti-symmetric.
We may define the exterior product to be the map $\wedge : \Omega^q_p(M) \times \Omega^r_p(M) \to \Omega^{q+r}_p(M)$. Its
action follows by trivial extension of the wedge product defined above. Let $\omega \in \Omega^q_p(M)$ and
$\xi \in \Omega^r_p(M)$ be a q-form and an r-form respectively. The action of the (q + r)-form ω ∧ ξ
on q + r vectors $V_i$ is
\[
(\omega \wedge \xi)(V_1 , \ldots , V_{q+r}) = \frac{1}{q! \, r!} \sum_{P \in S_{q+r}} \mathrm{sgn}(P) \, \omega\left( V_{P(1)} , \ldots , V_{P(q)} \right) \xi\left( V_{P(q+1)} , \ldots , V_{P(q+r)} \right) . \tag{1.14}
\]
It is common to drop the r subscript and simply write d. The wedge product automatically
anti-symmetrises the coefficient so it is indeed a (r + 1)-form that we obtain. It follows that
for ξ ∈ Ωqp (M ), η ∈ Ωrp (M ) we have
We may extend the tensor gp over the full manifold. With a choice of coordinates we can
write the metric as
g = gµν (x)dxµ ⊗ dxν . (1.20)
We will often write this as the line elements ds2 ,
We may view gµν as a matrix, which by the symmetry property above is symmetric.
This implies that the matrix is diagonalisable, with real eigenvalues. If there are i positive
eigenvalues and j negative eigenvalues the pair (i, j) is called the index of the metric. If j = 1
the metric is called a Lorentz metric, for j = 0 we have a Euclidean metric. The number of
negative entries is called the signature and by Sylvester’s law of inertia1 , this is independent
of the choice of basis.
1 This has nothing to do with inertia, Sylvester just wanted a law of inertia like Newton.
Lorentzian manifolds For our purposes Riemannian manifolds are not what we want
to consider, instead we want to consider Lorentzian manifolds. The simplest example is
Minkowski space. This is R1,m−1 equipped with the metric
which has components ηµν = diag(−1, 1, ..., 1). Note that on a Lorentzian manifold we take
the index to run over 0, 1, .., m − 1.
At any point p on a general Lorentzian manifold it is always possible to find an orthonor-
mal basis {eµ } of Tp (M ) such that locally the metric looks like the Minkowski metric
At each point on M we can then draw light cones which are the null tangent vectors at that
point. The novelty is that the directions of these light cones can vary smoothly as we move
around the manifold. This specifies the causal structure of spacetime which determines which
regions of spacetime can interact together.
We can use the metric to determine the length of curves. The nature of a curve is
inherited from the nature of its tangent vector. A curve is called timelike if its tangent vector
is everywhere timelike. We then measure the proper time
\[
\tau = \int_a^b dt \, \sqrt{ - g_{\mu\nu} \frac{dx^\mu}{dt} \frac{dx^\nu}{dt} } . \tag{1.24}
\]
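As a quick numerical illustration of (1.24), the sketch below integrates the proper time along a curve in two-dimensional Minkowski space with metric diag(−1, 1) in units where c = 1. The restriction to flat space, the midpoint rule and the finite-difference step are all assumptions of the example rather than anything prescribed in the notes.

```python
import math

def proper_time(worldline, a, b, steps=10000):
    """Approximate tau = ∫_a^b dt sqrt(-eta_{mu nu} dx^mu/dt dx^nu/dt).

    `worldline(t)` returns (x^0, x^1) in 2d Minkowski space, metric diag(-1, +1).
    Tangent components are estimated with central differences.
    """
    h = (b - a) / steps
    eps = 1e-6
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h  # midpoint of the k-th sub-interval
        plus, minus = worldline(t + eps), worldline(t - eps)
        dx0 = (plus[0] - minus[0]) / (2 * eps)
        dx1 = (plus[1] - minus[1]) / (2 * eps)
        # -eta_{mu nu} dx^mu dx^nu = (dx^0)^2 - (dx^1)^2, positive for a timelike curve.
        total += math.sqrt(dx0 ** 2 - dx1 ** 2) * h
    return total

v = 0.6
tau = proper_time(lambda t: (t, v * t), 0.0, 10.0)
```

For an observer moving at constant speed v = 0.6 over coordinate time 10 this should reproduce the familiar time dilation result 10 √(1 − v²) = 8.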
The existence of a metric comes with a large number of benefits.
The metric as an isomorphism The metric gives a natural isomorphism between vectors
and covectors, g : Tp (M ) → Tp∗ (M ) for each p. In a coordinate basis we can write X = X µ ∂µ ,
and map it to a one-form X = Xµ dxµ , as
Xµ = gµν X ν . (1.25)
We will usually say that we use the metric to lower (or raise) an index. What we really mean
is that the metric provides an isomorphism between a vector space and its dual. Since g
is non-degenerate and is thus invertible we also have the inverse map. We take the inverse
of $g_{\mu\nu}$ to be $g^{\mu\nu}$ so that $g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho$. This can then be thought of as the components of a
symmetric (2, 0) tensor
ĝ = g µν ∂µ ⊗ ∂ν . (1.26)
Then
X µ = g µν Xν . (1.27)
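The maps (1.25) and (1.27) can be sketched in a few lines for the simplest case, the Minkowski metric in the mostly plus convention. Restricting to a diagonal metric, so that the inverse is obtained entry by entry, is a simplifying assumption of this example, and the function names are illustrative.

```python
# Diagonal entries of eta_{mu nu} in the mostly plus convention (-, +, +, +).
eta = [-1.0, 1.0, 1.0, 1.0]

def lower(X_up):
    """X_mu = eta_{mu nu} X^nu for a diagonal metric."""
    return [eta[mu] * X_up[mu] for mu in range(4)]

def raise_index(X_down):
    """X^mu = eta^{mu nu} X_nu; the inverse of a diagonal metric is diagonal."""
    return [X_down[mu] / eta[mu] for mu in range(4)]

X_up = [2.0, 1.0, 0.0, 0.0]
X_down = lower(X_up)
norm = sum(a * b for a, b in zip(X_down, X_up))  # X_mu X^mu
```

Lowering flips the sign of the time component only, raising undoes it, and the contraction X_µ X^µ comes out negative here, so this vector is timelike.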
The Volume form The metric also gives a natural volume form on the manifold M . On
a Riemannian manifold we take the volume form to be
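\[
\mathrm{vol}(M) = \sqrt{\det(g_{\mu\nu})} \, dx^1 \wedge \ldots \wedge dx^m , \tag{1.28}
\]
and we use the shorthand $\sqrt{\det(g_{\mu\nu})} = \sqrt{g}$. On a Lorentzian manifold the determinant is
negative and therefore we take the volume form to be
\[
\mathrm{vol}(M) = \sqrt{-g} \, dx^0 \wedge dx^1 \wedge \ldots \wedge dx^{m-1} . \tag{1.29}
\]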
Hodge dual On an oriented manifold M we can use the totally anti-symmetric tensor
density to define a map which takes a p-form ω ∈ Ωp (M ) to a (m − p)-form ⋆ω ∈ Ωm−p (M ).
We define this map to be
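\[
(\star\omega)_{\mu_1 \ldots \mu_{m-p}} = \frac{1}{p!} \sqrt{|g|} \, \epsilon_{\mu_1 \ldots \mu_{m-p} \nu_1 \ldots \nu_p} \, \omega^{\nu_1 \ldots \nu_p} , \tag{1.30}
\]
where $\epsilon_{\mu_1 \ldots \mu_m}$ is the totally anti-symmetric tensor, with $\epsilon_{12 \ldots m} = +1$: it takes the value +1 for even
permutations of $12 \ldots m$, −1 for odd permutations and 0 otherwise.
This is called the Hodge dual and is independent of coordinates. One can see that it
satisfies
\[
\star(\star\omega) = \pm (-1)^{p(m-p)} \omega , \tag{1.31}
\]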
\[
\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z , \tag{1.32}
\]
\[
\nabla_{(fX + gY)} Z = f \nabla_X Z + g \nabla_Y Z , \tag{1.33}
\]
\[
\nabla_X (fY) = X[f] \, Y + f \nabla_X Y , \tag{1.34}
\]
In words, you first differentiate the tensor and then for each upper index you add in a +ΓT
and for every down index a −ΓT . The connection takes tensors to tensors, the (q, r) tensor
gets mapped to a (q, r + 1) tensor.
The connection coefficients are not tensors themselves, but transform as
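\[
\tilde{\Gamma}^\mu{}_{\nu\rho} = (\Lambda^{-1})^\mu{}_\kappa \Lambda^\sigma{}_\rho \Lambda^\tau{}_\nu \Gamma^\kappa{}_{\sigma\tau} + (\Lambda^{-1})^\mu{}_\kappa \Lambda^\sigma{}_\rho \partial_\sigma \Lambda^\kappa{}_\nu , \quad \text{with} \quad \Lambda^\mu{}_\nu = \frac{\partial x^\mu}{\partial y^\nu} . \tag{1.36}
\]
The difference
\[
T^\kappa{}_{\sigma\tau} = \Gamma^\kappa{}_{\sigma\tau} - \Gamma^\kappa{}_{\tau\sigma} , \tag{1.37}
\]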
is called the torsion tensor, and is indeed a tensor. If the torsion tensor vanishes we say that
the connection is torsion free.
∇X T = 0 . (1.40)
Let γ connect two points p, q ∈ M . The condition (1.40) provides a map from the vector
space defined at p to the vector space defined at q. Consider a second vector field Y . In
components (1.40) reads
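\[
X^\nu \left( \partial_\nu Y^\mu + \Gamma^\mu{}_{\nu\rho} Y^\rho \right) = 0 . \tag{1.41}
\]
If we evaluate it on the curve γ, we can write $Y^\mu = Y^\mu(x(\lambda))$ and therefore the condition is
\[
\frac{dY^\mu}{d\lambda} + X^\nu \Gamma^\mu{}_{\nu\rho} Y^\rho = 0 . \tag{1.42}
\]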
A geodesic is a curve tangent to a vector field X that obeys
∇X X = 0 . (1.43)
Along the curve γ with coordinates xµ and tangent vector X this implies
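\[
\frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\lambda} \frac{dx^\rho}{d\lambda} = 0 . \tag{1.44}
\]
This is the same geodesic equation one obtains by varying the action
\[
S = \int d\lambda \, \sqrt{ - g_{\mu\nu}(x) \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} } , \tag{1.45}
\]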
and picking an affine parameter.
Using the Levi–Civita connection we can define the curvature and torsion tensors. In
components the Riemann tensor is
Given a rank (1, 3) tensor we can construct a rank (0, 2) tensor by contraction, for the
Riemann tensor the resultant (0, 2)-rank tensor is called the Ricci tensor and is defined by
It inherits symmetry in its indices from the properties of the Riemann tensor
R = g µν Rµν . (1.53)
1.3 Einstein's equations
Birkhoff ’s theorem The Schwarzschild solution is the unique spherically symmetric asymp-
totically flat solution to the vacuum Einstein equations.
New coordinates The Schwarzschild solution in Schwarzschild coordinates has a coordi-
nate singularity at r = Rs = 2GN M . This surface is called the event horizon. In GR no
signals can come out from within the event-horizon, once you fall past the event horizon you
are lost to the outside world.
The apparent singularity at r = Rs is only a coordinate singularity and can be removed
by a coordinate transformation. First introduce the tortoise coordinate r∗
\[
r_* = r + 2 G_N M \log\left( \frac{r - 2 G_N M}{2 G_N M} \right) , \tag{1.61}
\]
then in these coordinates the null radial in-going/out-going geodesics are particularly simple:
Next introduce a pair of null coordinates further adapted to the null geodesics:
v = t + r∗ , u = t − r∗ . (1.63)
Even though the metric coefficient gvv vanishes at r = 2GN M there is no real degeneracy
there and the metric is well-defined as one can see by computing the determinant.
There are also the complementary outgoing Eddington–Finkelstein coordinates where we
eliminate t using u above. With Eddington–Finkelstein coordinates we are able to continue
the Schwarzschild solution beyond the horizon to r > 0. There are two ways to do
this, with either the ingoing or outgoing Eddington–Finkelstein coordinates. In fact we can
do better and write a metric which captures both of these regions simultaneously.
To begin write the Schwarzschild metric using both null (u, v)-coordinates, the metric is
\[
ds^2 = - \left( 1 - \frac{2 G_N M}{r} \right) du \, dv + r^2 ds^2(S^2) , \tag{1.65}
\]
both are null coordinates. The original Schwarzschild black hole is parametrised by U < 0
and V > 0. Outside the horizon they satisfy
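\[
U V = - \exp\left( \frac{r_*}{2 G_N M} \right) = \frac{2 G_N M - r}{2 G_N M} \exp\left( \frac{r}{2 G_N M} \right) , \tag{1.67}
\]
and similarly
\[
\frac{U}{V} = - \exp\left( - \frac{t}{2 G_N M} \right) . \tag{1.68}
\]
The metric is then
\[
ds^2 = - \frac{32 (G_N M)^3}{r} e^{- \frac{r}{2 G_N M}} dU \, dV + r^2 ds^2(S^2) , \tag{1.69}
\]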
with r(U, V ) defined by inverting (1.67). The original Schwarzschild metric covers just U < 0
and V > 0 however there is no obstruction to extending U, V ∈ R. Nothing bad happens at
r = 2GN M , the metric is smooth and non-degenerate and now we have a metric which covers
all regions. The Kruskal spacetime is the maximal extension of the Schwarzschild solution.
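The relations (1.61), (1.67) and (1.68) are easy to check numerically outside the horizon. The sketch below assumes the standard Kruskal definitions U = −exp(−u/4G_N M) and V = exp(v/4G_N M), which are not reproduced in this part of the notes but are consistent with U < 0, V > 0 and with (1.67) and (1.68). It also works in units where G_N M = 1, which is purely a choice of the example.

```python
import math

GM = 1.0  # assumption: units in which G_N M = 1

def tortoise(r):
    """Tortoise coordinate r_* of (1.61), valid for r > 2 G_N M."""
    return r + 2 * GM * math.log((r - 2 * GM) / (2 * GM))

def kruskal(t, r):
    """Assumed Kruskal coordinates U = -exp(-u/4GM), V = exp(v/4GM) with u, v from (1.63)."""
    r_star = tortoise(r)
    u, v = t - r_star, t + r_star
    return -math.exp(-u / (4 * GM)), math.exp(v / (4 * GM))

t, r = 0.7, 3.5
U, V = kruskal(t, r)
product_rhs = (2 * GM - r) / (2 * GM) * math.exp(r / (2 * GM))  # right-hand side of (1.67)
ratio_rhs = -math.exp(-t / (2 * GM))                            # right-hand side of (1.68)

# dr_*/dr should equal 1 / (1 - 2 G_N M / r); estimate it with a central difference.
h = 1e-6
dr_star_dr = (tortoise(r + h) - tortoise(r - h)) / (2 * h)
```

Comparing `U * V` with `product_rhs` and `U / V` with `ratio_rhs` at any sample point with r > 2G_N M gives agreement to rounding error, and the slope of r∗ matches 1/(1 − 2G_N M/r).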
2 Killing vectors
Killing vectors play an important role in general relativity and in understanding black holes.
In this section we will introduce the notion of a Killing vector and show how they give rise
to conserved quantities along geodesics.
This is equivalent to solving a set of first order ODEs with fixed initial conditions, and
therefore there is a unique solution at least locally.
Let σ(λ, p) be the integral curve of X which passes through the point p when λ = 0. The
map σ : R × M → M defines the flow generated by X. The flow defines an abelian group
since one can show that σ(λ1 , σ(λ2 , p)) = σ(λ1 + λ2 , p). Let σλ (p) = σ(λ, p) then
\[
\begin{aligned}
\sigma_\lambda \circ \sigma_\tau (p) &= \sigma_{\lambda + \tau}(p) , \\
\sigma_0 &= \text{Unit element} , \\
\sigma_{-\lambda} &= (\sigma_\lambda)^{-1} .
\end{aligned}
\tag{2.2}
\]
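A simple example of a flow satisfying the group properties above is the rotation of the plane, generated by the vector field X = −y ∂/∂x + x ∂/∂y. The choice of this vector field and the function name are assumptions of the sketch, not an example taken from the notes.

```python
import math

def flow(lam, p):
    """Flow of X = -y d/dx + x d/dy: rotate the point p = (x, y) by angle lam.

    Differentiating with respect to lam gives (-y, x) evaluated on the curve,
    so each curve lam -> flow(lam, p) is an integral curve of X.
    """
    x, y = p
    c, s = math.cos(lam), math.sin(lam)
    return (c * x - s * y, s * x + c * y)

p = (1.0, 2.0)
composed = flow(0.3, flow(0.5, p))    # sigma_lambda applied after sigma_tau
direct = flow(0.8, p)                 # sigma_{lambda + tau}
round_trip = flow(-0.3, flow(0.3, p)) # sigma_{-lambda} undoes sigma_lambda
```

Composing two rotations agrees with a single rotation by the summed parameter, and flowing by −λ returns the starting point.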
This allows us to move points along the curve, in particular by using the flow we can
move tensors from one point on the flow to another, recall that this goes by the name of
push-forward or pull back depending on what object we are acting on.2 This allows us to
define the Lie derivative along the vector field X. For a vector Y we have
\[
\mathcal{L}_X Y |_p = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left[ (\sigma_{-\epsilon})_* Y|_{\sigma_\epsilon(p)} - Y|_p \right] . \tag{2.3}
\]
The Lie derivative can be extended to any tensor with appropriate generalisation. For
tensors one must use a combination of the push-forward and pull-back. Of primary interest
to us here is the Lie derivative of the metric. We have
\[
\mathcal{L}_X g |_p = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left[ (\sigma_\epsilon)^* g|_{\sigma_\epsilon(p)} - g|_p \right] . \tag{2.6}
\]
Note that the pull back uses σϵ rather than σ−ϵ , this is not a typo. In coordinates we have
More generally, let T be a tensor of rank (q, r), then the Lie derivative along the vector field
X in local coordinates is
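\[
\begin{aligned}
\mathcal{L}_X T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \nu_r} = {} & X^\sigma \partial_\sigma T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \nu_r} - \partial_\sigma X^{\mu_1} T^{\sigma \ldots \mu_q}{}_{\nu_1 \ldots \nu_r} - \ldots - \partial_\sigma X^{\mu_q} T^{\mu_1 \ldots \sigma}{}_{\nu_1 \ldots \nu_r} \\
& + \partial_{\nu_1} X^\sigma T^{\mu_1 \ldots \mu_q}{}_{\sigma \ldots \nu_r} + \ldots + \partial_{\nu_r} X^\sigma T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \sigma} .
\end{aligned}
\tag{2.9}
\]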
To make this more manifestly tensorial one can replace the partial derivatives with any
torsion free connection3 , not necessarily the Levi–Civita connection. One can show that the
Lie derivative satisfies
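\[
\begin{aligned}
\mathcal{L}_X (T + S) &= \mathcal{L}_X T + \mathcal{L}_X S , \\
\mathcal{L}_X (T \otimes S) &= \mathcal{L}_X T \otimes S + T \otimes \mathcal{L}_X S , \\
\mathcal{L}_{[X,Y]} &= \mathcal{L}_X \mathcal{L}_Y - \mathcal{L}_Y \mathcal{L}_X , \\
\mathcal{L}_X f &= X[f] ,
\end{aligned}
\tag{2.10}
\]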
where X, Y are vector fields, T and S are arbitrary tensors, and f is a function.
The Lie derivative of the metric along a vector field X captures the variation of the metric
under the infinitesimal coordinate transformation:
xµ → x̃µ = xµ + X µ . (2.11)
Then
\[
\delta g_{\mu\nu}(x) = \tilde{g}_{\mu\nu}(x) - g_{\mu\nu}(x) = (\mathcal{L}_X g)_{\mu\nu} = \nabla_\mu X_\nu + \nabla_\nu X_\mu , \tag{2.13}
\]
Such vectors are known as Killing vectors. They are vectors which define flows along which
the metric does not change. We say that it generates an isometry of the spacetime and that
the metric has a symmetry. We will see later in the course that there are corresponding
conserved quantities for these symmetries as one may suspect from Noether’s theorem, we
will study this in section ?? after introducing some additional technology.
2.3 Maximally symmetric spaces: how many Killing vectors can we have?
It is natural to ask if there is an upper limit on the number of Killing vectors a space can have.
The answer is yes. Consider Euclidean space in n dimensions, $\mathbb{R}^n$, with the flat metric. What
are the symmetries of this space? We know that we have both translations and rotations.
Fix a point p in $\mathbb{R}^n$. Translations are the transformations that move the point: there are n
independent axes along which we can move and therefore there are a total of n translations.
The rotations centred at p are those transformations which leave p fixed. We can think of
rotations as mapping one of the axes through the point p into one of the others. There
are n axes, and for each there are n − 1 axes it can be rotated into. However rotating x into y and y into x
are not independent and therefore the total number of rotations is $\frac{n(n-1)}{2}$. This gives a total
of
\[
n + \frac{n(n-1)}{2} = \frac{n(n+1)}{2} , \tag{2.15}
\]
independent symmetries.
This is the maximum number of linearly independent (by constant coefficients) Killing
vectors that an n-dimensional space may have.
Definition: Maximally symmetric space
An n-dimensional space with the maximum number of Killing vectors, $\frac{n(n+1)}{2}$, is called a
maximally symmetric space.
Aside: To prove this one needs to use that for a Killing vector K we have
\[
\nabla_\mu \nabla_\nu K^\sigma = R^\sigma{}_{\nu\mu\rho} K^\rho . \tag{2.16}
\]
Then we view the Killing equation (2.14) as a set of first order PDEs for the n functions
$K^\mu$. We can now find a solution as a series expansion around some arbitrary point p in
the manifold. We would have
\[
K^\mu(x) = K^\mu(p) + (x^\nu - p^\nu) \, \partial_\nu K^\mu \big|_{x=p} + \frac{1}{2} (x^\nu - p^\nu)(x^\sigma - p^\sigma) \, \nabla_\nu \nabla_\sigma K^\mu \big|_{x=p} + \ldots \tag{2.17}
\]
However since (2.16) allows us to express the second derivative of K at p in terms of K(p)
and ∂µ K(p) it follows that we may eliminate second derivative terms from the expansion.
In fact we may go further, whacking (2.16) with another derivative allows us to express
the third derivative of K in terms of Kν (p) and ∇µ Kν (p) too. We can do this infinitely
many times to obtain expressions for all higher derivative terms. Therefore the solution is
determined uniquely by the initial conditions K µ (p) and ∇µ K ν |x=p . The general solution
is then of the form
where A and B are complicated functions depending on the initial point p and the metric
and its derivatives but independent of the initial data of the Killing vector. Therefore we
have shown that every Killing vector can be determined in terms of the initial conditions
$K_\mu(p)$ and $\nabla_\mu K_\nu |_{x=p}$. There are n independent components of $K_\mu(p)$ and $\frac{n(n-1)}{2}$
independent components of $\nabla_\mu K_\nu |_{x=p}$. The latter comes about because the initial conditions
must satisfy the Killing equation, which fixes $\nabla_\mu K_\nu |_{x=p}$ to be an n × n anti-symmetric
matrix which has $\frac{n(n-1)}{2}$ independent components. This gives the claimed total of
$n + \frac{n(n-1)}{2} = \frac{n(n+1)}{2}$ Killing vectors.
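The counting in this argument can be spelled out in a few lines: n components of the initial value of the Killing vector plus the independent entries of an anti-symmetric n × n matrix for its first derivative. The function name is an illustrative choice.

```python
def max_killing_vectors(n):
    """Count the initial data that fix a Killing vector in n dimensions."""
    values = n                      # components of K_mu(p), "translations"
    derivatives = n * (n - 1) // 2  # anti-symmetric matrix nabla_mu K_nu at p, "rotations"
    return values + derivatives

counts = {n: max_killing_vectors(n) for n in range(1, 6)}
```

For n = 4 this gives 4 + 6 = 10, the familiar number of Poincaré generators of Minkowski space.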
Examples of maximally symmetric spaces are flat space, spheres, hyperbolic space, Minkowski
space and (anti-) de-Sitter space.
If a manifold is maximally symmetric it means that the curvature is the same in all
directions. The Riemann tensor can in fact be fixed in terms of the constant Ricci scalar and
takes the form
\[
R_{\mu\nu\rho\sigma} = \frac{R}{n(n-1)} \left( g_{\mu\rho} g_{\nu\sigma} - g_{\mu\sigma} g_{\nu\rho} \right) . \tag{2.19}
\]
This means that locally the space is determined by the Ricci scalar.4
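One consistency check of (2.19) is that contracting the right-hand side with two inverse metrics returns the Ricci scalar, in line with the definitions of the Ricci tensor and scalar in the list of useful formulae. The sketch below does this for a diagonal metric, so that the inverse metric is trivial to write down. The restriction to a diagonal metric and the sample numbers are assumptions of the example.

```python
from itertools import product

def maximally_symmetric_riemann(g_diag, ricci_scalar):
    """Components R_{mu nu rho sigma} of (2.19) for a diagonal metric g_diag."""
    n = len(g_diag)

    def g(a, b):
        return g_diag[a] if a == b else 0.0

    prefactor = ricci_scalar / (n * (n - 1))
    return {
        (m, v, r, s): prefactor * (g(m, r) * g(v, s) - g(m, s) * g(v, r))
        for m, v, r, s in product(range(n), repeat=4)
    }

def contract_to_scalar(g_diag, riemann):
    """g^{mu rho} g^{nu sigma} R_{mu nu rho sigma}, using the diagonal inverse metric."""
    n = len(g_diag)
    return sum(
        riemann[(m, v, m, v)] / (g_diag[m] * g_diag[v])
        for m, v in product(range(n), repeat=2)
    )

g_diag = [-1.0, 2.0, 3.0, 0.5]  # arbitrary sample values with Lorentzian signature
riemann = maximally_symmetric_riemann(g_diag, 12.0)
recovered = contract_to_scalar(g_diag, riemann)
```

The contraction gives back the input value of R, since the bracket in (2.19) contracts to n² − n = n(n − 1).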
We have already seen conserved quantities when we studied geodesics. When we had an
ignorable coordinate we found a conserved quantity along the geodesic. This is in fact related
to the presence of Killing vectors.
4 For example both a torus and the Euclidean plane are flat, and hence the Riemann tensor vanishes, however
they are very different spaces, one is compact while the other is non-compact. The Ricci scalar therefore does
not capture the global difference of the two spaces.
Consider the action
\[
S = \int d\lambda \, \sqrt{ g_{\mu\nu}\big(x(\lambda)\big) \frac{dx^\mu(\lambda)}{d\lambda} \frac{dx^\nu(\lambda)}{d\lambda} } . \tag{2.20}
\]
For simplicity let us assume that λ is an affine parameter which allows us to consider the
action with the square root removed. From GR1 we know that geodesics are the curves which
extremise the action, that is geodesics are curves, xµ (λ), which when deformed by a small
amount δxµ (λ), the change in the action vanishes.
Consider deforming the curve as
xµ → xµ + ϵX µ . (2.21)
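\[
\begin{aligned}
\delta S &= S(x^\mu + \epsilon X^\mu) - S(x^\mu) \\
&= \epsilon \int d\lambda \, \Big[ X^\rho \partial_\rho g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu + g_{\mu\nu} \big( \dot{X}^\mu \dot{x}^\nu + \dot{x}^\mu \dot{X}^\nu \big) \Big] + O(\epsilon^2) \\
&= \epsilon \int d\lambda \, \dot{x}^\rho \dot{x}^\sigma \Big[ X^\mu \partial_\mu g_{\rho\sigma} + g_{\nu\sigma} \partial_\rho X^\nu + g_{\nu\rho} \partial_\sigma X^\nu \Big] + O(\epsilon^2) \\
&= \epsilon \int d\lambda \, \dot{x}^\rho \dot{x}^\sigma \Big[ \nabla_\rho X_\sigma + \nabla_\sigma X_\rho \Big] + O(\epsilon^2) .
\end{aligned}
\tag{2.22}
\]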
We have used that Ẋ µ = ẋρ ∂ρ X µ . We see that if X is a Killing vector field we have a
symmetry of the action. We know from Noether’s theorem that there must be a conserved
charge.
\[
p_\mu = \frac{\partial L}{\partial \dot{x}^\mu} . \tag{2.24}
\]
Then for any Killing vector X,
\[
Q = X^\mu p_\mu , \tag{2.25}
\]
Proof: Consider a small variation δxµ = ϵX µ generated by the Killing vector field X as
above. As shown above such variations leave the action invariant: δS = 0, which is equivalent
to
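\[
\frac{\partial L}{\partial x^\mu} X^\mu + \frac{\partial L}{\partial \dot{x}^\mu} \dot{X}^\mu = 0 . \tag{2.26}
\]
Along a geodesic the Euler–Lagrange equations are satisfied:
\[
\frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda} \frac{\partial L}{\partial \dot{x}^\mu} = 0 . \tag{2.27}
\]
Therefore along the geodesic (2.26) implies
\[
0 = \left( \frac{d}{d\lambda} p_\mu \right) X^\mu + p_\mu \frac{d}{d\lambda} X^\mu = \frac{d}{d\lambda} \left( p_\mu X^\mu \right) = \frac{d}{d\lambda} Q . \tag{2.28}
\]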
Note that Q is conserved only along the geodesic, for a path which is not a geodesic this is
not conserved. We can see immediately why this must be the case in the derivation above
since we used the Euler–Lagrange equations.
Similarly for the Killing vector K3 = ∂ϕ we have K3µ = (0, 0, 0, 1) and therefore
\[
Q_3 = K_3^\mu p_\mu = p_\phi = \frac{\partial L}{\partial \dot{\phi}} = r^2 \sin^2\theta \, \dot{\phi} = J . \tag{2.30}
\]
These are both conserved charges that you are familiar with, and the intuition about ignorable
coordinates giving rise to conserved quantities holds for these. However Killing vectors need
not be so simple. The Schwarzschild solution has two more Killing vectors
K1 = sin ϕ∂θ + cot θ cos ϕ∂ϕ , K2 = cos ϕ∂θ − cot θ sin ϕ∂ϕ . (2.31)
These two combine with K3 to generate the SO(3) isometry of the spacetime: one can check
that
[Ki , Kj ] = ϵijk Kk , (2.32)
which is indeed the Lie algebra so(3).
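The commutation relations (2.32) can be checked numerically by evaluating the Lie bracket [A, B]^i = A^j ∂_j B^i − B^j ∂_j A^i of the vector fields (2.31) and K3 = ∂ϕ at a sample point. The central-difference derivatives, the step size and the sample point are assumptions of this sketch, and a symbolic computation would of course work equally well.

```python
import math

# Components (theta, phi) of the Killing vectors (2.31) and K3 = d/dphi.
def K1(th, ph):
    return (math.sin(ph), math.cos(ph) / math.tan(th))

def K2(th, ph):
    return (math.cos(ph), -math.sin(ph) / math.tan(th))

def K3(th, ph):
    return (0.0, 1.0)

def lie_bracket(A, B, th, ph, h=1e-5):
    """[A, B]^i = A^j d_j B^i - B^j d_j A^i, derivatives by central differences."""
    def gradient(F):
        d_th = [(p - m) / (2 * h) for p, m in zip(F(th + h, ph), F(th - h, ph))]
        d_ph = [(p - m) / (2 * h) for p, m in zip(F(th, ph + h), F(th, ph - h))]
        return (d_th, d_ph)  # gradient(F)[j][i] approximates d_j F^i

    a, b = A(th, ph), B(th, ph)
    dA, dB = gradient(A), gradient(B)
    return tuple(
        sum(a[j] * dB[j][i] - b[j] * dA[j][i] for j in range(2))
        for i in range(2)
    )

th0, ph0 = 1.1, 0.4  # arbitrary sample point away from the poles
bracket_12 = lie_bracket(K1, K2, th0, ph0)  # should approximate K3
bracket_23 = lie_bracket(K2, K3, th0, ph0)  # should approximate K1
bracket_31 = lie_bracket(K3, K1, th0, ph0)  # should approximate K2
```

At the sample point the three brackets reproduce K3, K1 and K2 respectively to within the finite-difference error.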
The conserved charges for these are not nearly as simple as the previous two:
\[
Q_1 = r^2 \left( \sin\phi \, \dot{\theta} + \cos\theta \cos\phi \sin\theta \, \dot{\phi} \right) , \qquad Q_2 = r^2 \left( \cos\phi \, \dot{\theta} - \cos\theta \sin\theta \sin\phi \, \dot{\phi} \right) . \tag{2.33}
\]
One can check upon application of the geodesic equation that these are indeed conserved.
Recall that when we consider geodesics we use the rotational symmetry to set $\theta(0) = \frac{\pi}{2}$ and
$\dot{\theta}(0) = 0$ which leads to motion in a plane. With these values the two charges Q1 , Q2 vanish.
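Conservation of these charges can be illustrated numerically in a simplified setting: geodesics on the unit two-sphere, where r = 1 is fixed and only the angular part of the problem is kept. This restriction, the RK4 integrator, the step size and the initial data are all assumptions of the sketch. It is not a Schwarzschild geodesic, but the three charges take the same form with r = 1.

```python
import math

def rhs(state):
    """Geodesic equations on the unit S^2 written as a first order system."""
    th, ph, dth, dph = state
    return (
        dth,
        dph,
        math.sin(th) * math.cos(th) * dph ** 2,           # theta'' = sin cos phi'^2
        -2.0 * math.cos(th) / math.sin(th) * dth * dph,   # phi'' = -2 cot(theta) theta' phi'
    )

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def charges(state):
    """Q1, Q2 of (2.33) and Q3 of (2.30), evaluated at r = 1."""
    th, ph, dth, dph = state
    q1 = math.sin(ph) * dth + math.cos(th) * math.cos(ph) * math.sin(th) * dph
    q2 = math.cos(ph) * dth - math.cos(th) * math.sin(th) * math.sin(ph) * dph
    q3 = math.sin(th) ** 2 * dph
    return (q1, q2, q3)

state = (1.0, 0.0, 0.3, 0.5)  # (theta, phi, theta', phi'), arbitrary initial data
initial = charges(state)
for _ in range(2000):
    state = rk4_step(state, 1e-3)
final = charges(state)
drift = max(abs(f - i) for f, i in zip(final, initial))
```

Along the integrated curve all three charges stay constant up to the integration error, and for equatorial initial data, θ = π/2 with vanishing θ̇, the charges Q1 and Q2 vanish as stated above.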
Aside: To find these Killing vectors one needs to solve the Killing equation (2.14) which
gives a set of PDEs to solve. In this case, rather than trying to solve these PDEs, one
can be slightly smarter and use the embedding of the S 2 into R3 . The isometries are
then the rotations about the three axes. We know that these are then:
with the names of the vector fields given by the axis of rotation. Introducing the embed-
ding coordinates
x = sin θ cos ϕ , y = sin θ sin ϕ , z = cos θ , (2.35)
we find that
K1 = X , K2 = Y , K3 = Z . (2.36)
This drastically simplifies the problem in this case where we know the embedding into a
simple space. This is not always possible and one must just bite the bullet and solve the
PDEs.
The familiar conserved quantities along geodesics of the Schwarzschild solution are the
ones where the Killing vector is of the form K = ∂ψ for some coordinate ψ. For any Killing
vector we can find coordinates such that the Killing vector is of this form. However if there
are other Killing vectors this transformation may ruin the nice form of these. For example,
of the three Killing vectors of S 2 studied above, only one is in this nice
adapted form. One could change coordinates to make either of the other two of this nice
form however the sacrifice is that K3 is no longer of this nice form. This is most easily seen
from the embedding of the S 2 into R3 . If we permuted the embedding coordinates in (2.35),
whichever coordinate was just cos θ would have the simple form. To see that in the case of
SO(3) that only one of the Killing vectors can be of this nice form notice that [∂µ , ∂ν ] = 0
and therefore if two vectors were of this simple form the so(3) algebra would not be satisfied.
2.5 Spherically symmetric, static and stationary spacetimes
In general when trying to solve Einstein’s equations we need to make some simplifying as-
sumptions. A set of simplifying assumptions you should already have seen when studying the
Schwarzschild solution are spherically symmetric, static and stationary spacetimes. We will
quickly review what this means in order to use these assumptions in the following section to
describe cold stars.
You should be familiar with the isometries of a round two-sphere. One can rotate the two-sphere
about any axis and it looks the same. This is of course the group SO(3). It can
be further enhanced to O(3) if we also include reflections, however we will not do this in the
following. Any one-dimensional subgroup of SO(3) gives a one-parameter group of isometries
and thus a Killing vector field. The dimension of SO(3) is 3 and thus there are three independent
Killing vectors which can be used to generate the full symmetry group.
If one puts the following metric on the round two-sphere
K1 = sin ϕ∂θ + cot θ cos ϕ∂ϕ , K2 = cos ϕ∂θ − cot θ sin ϕ∂ϕ , K3 = ∂ϕ . (2.38)
where A(p) is the area of the S 2 orbit through the point p. In other words, the induced metric
on the S 2 passing through the point p is $r(p)^2 ds^2(S^2)$.
Definition: Stationary
A spacetime is stationary if it admits an everywhere timelike Killing vector K.
When the space-time is stationary it allows us to introduce a distinguished coordinate
adapted to the Killing vector and write the metric in a simpler form. Given the Killing vector
we may define the flow by finding the integral curve of the vector, let us parametrise the
flow by t. We can pick a hypersurface Σ nowhere tangent to K µ and introduce coordinates
xi on Σ. We may then assign coordinates (t, xi ) to the point a parameter distance t along
the integral curve through the point on Σ with coordinates xi . This gives a coordinate chart
in which $K = \partial_t$. Since $K^\mu$ is a Killing vector field the metric is independent of t and
therefore the metric takes the form
ds2 = g00 (xk )dt2 + 2g0i (xk )dtdxi + gij (xk )dxi dxj , (2.41)
with g00 (xi ) < 0. Conversely given a metric of this form ∂t is a timelike Killing vector field
and thus the metric is stationary.
Definition: Hypersurface-orthogonal
Let Σ be a hypersurface in M specified by the equation f (x) = 0, with f : M → R a smooth
function. We require df ̸= 0 on Σ, then df is normal to Σ.5 The dual vector to df , let us
call it ξ, is said to be hypersurface orthogonal. If ξ is timelike the hypersurface is said to be
spacelike; if ξ is spacelike the hypersurface is timelike and if ξ is null then the hypersurface
is said to be null.
It follows that any other normal to the hypersurface can be written as n = gdf + f n′
with g a smooth function which does not vanish anywhere on Σ, and n′ a smooth one-form.
Then we have
\[
dn = dg \wedge df + df \wedge n' + f \, dn' \quad \Rightarrow \quad dn \big|_\Sigma = (dg - n') \wedge df \quad \Rightarrow \quad n \wedge dn \big|_\Sigma = 0 . \tag{2.42}
\]
such that n = gdf and therefore n is normal to the surfaces defined by f (x) =constant, and
therefore hypersurface-orthogonal.
Definition: Static
A spacetime is static if it admits a hypersurface-orthogonal timelike Killing vector field.
Note: Static implies stationary, but the converse is not true.
Birkhoff’s theorem proves that the Schwarzschild solution is the unique asymptotically flat,
spherically symmetric solution of Einstein’s equations in the absence of matter and cosmo-
logical constant. As such, away from any spherically symmetric static object such as a star,
planet or black hole the metric is the Schwarzschild metric. There are a few questions we may
want to ask at this point. What is the metric inside a star where the Schwarzschild solution
is no longer valid (since there is now a non-trivial contribution from the energy momentum
tensor)? Does GR tell us anything about the different types of stars: hot stars, white dwarfs,
neutron stars? In this section we answer these questions by studying the extension of the
Schwarzschild solution to describe a cold star.
As opposed to a hot star, where there is a thermal source of pressure generated by nuclear
reactions in its core, a cold star must be supported from collapse by a non-thermal pressure
source. When a star forms by condensation of a dust cloud due to gravitational attraction
the pressure increases which leads to an increase in temperature. When the dust cloud has
collapsed far enough and has reached a critical temperature, nuclear fusion in the core begins.
The dominant process is the conversion of four protons to form a helium-4 nucleus. The
emission of photons and neutrinos at this stage provides a thermal radiation which balances
against the collapse of the star due to gravity. As the hydrogen fuel is depleted a helium core
builds up and the pressure from thermal radiation decreases and the star begins to collapse
again.
If the star is massive enough then, as the core contracts, it once again heats up and if a critical
temperature is reached, helium can be fused, giving a thermal pressure which halts the
collapse. If the star is not massive enough the temperature which allows helium to fuse is not
reached and the star uses up its remaining fuel becoming a red dwarf. This process of a
period of equilibrium followed by collapse can keep repeating with the formation of heavier
nuclei in the core such as nickel and iron.
The crucial issue governing how far along this evolutionary sequence a star goes is whether
electron degeneracy pressure becomes sufficient to support the star from further collapse.
There is a critical mass MC , (3.17), below which the collapse is halted by the electron de-
generacy pressure. The Pauli exclusion principle states that two or more identical fermions6
cannot occupy the same quantum state within a quantum system simultaneously.7 Due to
this a gas of cold fermions resists compression, producing a pressure known as degeneracy
pressure. If the mass of the star is below the critical mass no further nuclear fusion will occur
and the star will simply cool down forever in a stable white dwarf configuration. This is the
fate of our Sun. A white dwarf is much denser than a regular star: to get an idea of
how much denser, a matchbox-sized piece of white dwarf material would weigh roughly
the same as an elephant. Newtonian gravity is still applicable here and shows that a white
dwarf cannot have a mass greater than the Chandrasekhar limit, 1.4 M⊙ with M⊙ the mass
of the Sun. A star more massive than this cannot end its life as a white dwarf unless it sheds
some of its mass.
6
A fermion is a particle with half-integer spin. Fermions obey Fermi–Dirac statistics. Quarks and leptons
(electrons, muons and tau-ons and their neutrino versions) are examples of fermions.
7
To get a feel for why this is true one needs to recall some facts about the wave-function in quantum
mechanics. We construct a state by acting on the ground state with operators. Operators which give bosons
(integer spin fields) satisfy commutation relations, while operators which give rise to fermions satisfy
anti-commutation relations. If we want to insert the same (all quantum numbers the same) fermion at the same
point we must act with the same operator twice, but due to the anti-commutation relations this vanishes and therefore
the wave-function vanishes.
If M is greater than MC then after a core of nickel and iron of mass MC has formed
it will be unable to support itself: electron degeneracy pressure is insufficient and no further
nuclear fusion occurs. The core will undergo gravitational collapse. When the density of
the core reaches nuclear density, the density of the nucleus of an atom, neutron degeneracy
pressure and nuclear forces provide a significant cold matter pressure. At such high pressure
one finds that beta decay is reversed: protons combine with electrons to produce neutrons. If
the mass of the star is below the critical limit for cold matter, Mcritical ≈ 2M⊙, then the collapse
will be halted leading to a neutron star. At this stage the Newtonian approximation is no
longer applicable and one must use general relativity.
When the collapse of the core is halted or slowed at nuclear densities a shock wave is
produced and this is expected to lead to the outer envelope of the star producing a supernova.
The presence of pulsars (neutron stars with a hot spot rotating at high speed) at the sites
of the Crab and Vela supernova remnants provides strong evidence that supernovae are
produced in conjunction with the collapse of the core of a star at the end-point of stellar
evolution.
The final option is to have a star which has a mass larger than the critical mass Mcritical .
Equilibrium can never be achieved and complete gravitational collapse will occur. The end-
point of such a collapse will be a Schwarzschild black hole. We find that for a massive enough
star gravitational collapse into a black hole is inevitable.8
In this section we will show that general relativity predicts a maximum mass for a cold
star. To reach this conclusion we will assume that the star is spherically symmetric and static,
recall that this is one of the assumptions that goes into Birkhoff’s theorem. The interior of
the star can be modelled by a perfect fluid and we then need to solve Einstein’s equations
with a perfect fluid source and match onto the Schwarzschild solution outside the star.
8
One can formulate this more concretely following Penrose and Hawking: collapse becomes inevitable
once a trapped surface forms. A trapped surface is a two-dimensional surface for which both the out-going and in-going
future-directed null geodesics orthogonal to the surface converge. For example consider spheres with r, t constant
in the Schwarzschild metric; these are trapped surfaces for r < RSchwarzschild .
3.1 Tolman–Oppenheimer–Volkoff equations
Since we have a static spacetime we have a timelike Killing vector field K with which we can
foliate our spacetime with the surfaces Σt which are orthogonal to K. The orbits of SO(3)
through a point p ∈ Σt lie within Σt . This allows us to define coordinates (r, θ, ϕ) such that
the most general metric with our given assumptions takes the form
\[ ds^2 = -e^{2\Phi(r)} dt^2 + e^{2\Psi(r)} dr^2 + r^2 ds^2(S^2) . \tag{3.1} \]
We now need to specify the energy-momentum tensor. Outside the star this vanishes
and it remains to come up with a suitable ansatz within the star. We can describe this as a
perfect fluid. The energy momentum tensor for a perfect fluid takes the form
\[ T_{\mu\nu} = (\rho + p)\, u_\mu u_\nu + p\, g_{\mu\nu} , \tag{3.2} \]
with uµ the four-velocity of the fluid, normalised to uµ uµ = −1, ρ the energy density and p the
pressure measured in the fluid’s local rest frame. Since we are interested in time-independent
and spherically symmetric stars the fluid should be at rest thus u points in the time-direction
only and therefore
u = e−Φ(r) ∂t . (3.3)
Moreover the time-independence and spherical symmetry imply that ρ and p only depend on
r while the vanishing of the energy-momentum tensor outside of the star implies that ρ, p
vanish when r > Rc with Rc the radius of the star.
A fluid’s equations of motion are determined by the conservation of the energy momentum
tensor. This follows from the Einstein equations, so we need only consider the Einstein
equations in the following. Since the Einstein equations inherit the symmetries of the spacetime
it follows that there are only three non-trivial independent conditions arising from the
Einstein equations. We may take these to be the tt, rr, θθ components; see the Mathematica
file on Moodle which does this computation.
The independent Einstein equations are
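\[
\begin{aligned}
E_{tt} &= \frac{e^{2\Psi}}{r^2}\left[\frac{d}{dr}\Big(r\big(1 - e^{-2\Psi}\big)\Big) - 8\pi r^2 \rho\right] = 0 , \\
E_{rr} &= \frac{1}{r}\left[e^{-2\Phi}\partial_r e^{2\Phi} - \frac{e^{2\Psi} - 1}{r} - 8\pi r e^{2\Psi} p\right] = 0 , \\
E_{\theta\theta} &= e^{-2\Psi} r\left[e^{\Psi-\Phi}\partial_r\big(r e^{-\Psi}\partial_r e^{\Phi}\big) - \partial_r\Psi - 8\pi r e^{2\Psi} p\right] = 0 .
\end{aligned}
\tag{3.4}
\]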
To proceed it is useful to introduce m(r) via
\[ e^{2\Psi(r)} = \left(1 - \frac{2m(r)}{r}\right)^{-1} , \tag{3.5} \]
with 2m(r) < r. The tt component of the Einstein equation becomes
\[ \frac{dm(r)}{dr} = 4\pi r^2 \rho(r) . \tag{3.6} \]
Moreover the rr component reduces to
\[ \frac{d\Phi(r)}{dr} = \frac{m(r) + 4\pi r^3 p(r)}{r\,\big(r - 2m(r)\big)} . \tag{3.7} \]
In the Newtonian limit we have r³ p(r) ≪ m(r) and m(r) ≪ r so (3.7) reduces to
\[ \frac{d\Phi(r)}{dr} \approx \frac{m(r)}{r^2} , \tag{3.8} \]
this is just the spherically symmetric version of Poisson’s equation for the Newtonian gravi-
tational potential. We can see the other terms in (3.7) as relativistic corrections.
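As a check, the reduction of the tt and rr equations can be written out explicitly. This is only a sketch: it assumes the metric ansatz ds² = −e^{2Φ}dt² + e^{2Ψ}dr² + r²ds²(S²) and uses the definition (3.5), which gives e^{2Ψ} − 1 = 2m/(r − 2m).

```latex
% From (3.5): 1 - e^{-2\Psi} = 2m/r, hence r(1 - e^{-2\Psi}) = 2m(r).
\begin{align*}
E_{tt} = 0 &\;\Rightarrow\; \frac{d}{dr}\big(2m(r)\big) = 8\pi r^2 \rho
  \;\Rightarrow\; \frac{dm}{dr} = 4\pi r^2 \rho , \\
E_{rr} = 0 &\;\Rightarrow\; 2\,\frac{d\Phi}{dr}
  = \frac{e^{2\Psi} - 1}{r} + 8\pi r e^{2\Psi} p
  = \frac{2m}{r(r - 2m)} + \frac{8\pi r^2 p}{r - 2m}
  = \frac{2\,\big(m + 4\pi r^3 p\big)}{r(r - 2m)} .
\end{align*}
```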
The final non-trivial component of the Einstein equations is the θθ component given
above, however rather than using that equation, it is simpler to derive the final equation from
the r-component of energy momentum conservation. This gives
\[ \frac{dp(r)}{dr} = -\big(p(r) + \rho(r)\big)\, \frac{m(r) + 4\pi r^3 p(r)}{r\,\big(r - 2m(r)\big)} . \tag{3.9} \]
One can check that this is implied by Eθθ = 0 above, see the Mathematica file. In the
Newtonian limit (p ≪ ρ, m(r) ≪ r) it reduces to the Newtonian hydrostatic equilibrium
equation
\[ \frac{dp(r)}{dr} \approx -\frac{\rho(r)\, m(r)}{r^2} . \tag{3.10} \]
Note that general relativity has little effect on the equilibrium configurations of stars with
p ≪ ρ and m(r) ≪ r.
We have four unknowns m(r), Φ(r), ρ(r), p(r) and only three equations so the system
is currently underdetermined. The one remaining condition comes from the fact that we are
interested in a cold star, one which has a vanishing temperature. Thermodynamics implies
that T, ρ, p are not independent, and therefore we may write p = p(ρ). Moreover we should
take ρ > 0 and p > 0 and that p(ρ) is an increasing function of ρ.9 The three equations (3.6),
(3.7) and (3.9) are known as the Tolman–Oppenheimer–Volkoff equations.
9
If this were not the case then the star would be unstable since a fluctuation in some region that led to an
increased energy density would lead to a decrease in pressure. This would cause the fluid to move into this
region which would lead to a further increase in ρ and the fluctuation would continue to grow.
Outside the star We know that, in the absence of matter and with the imposed constraints,
the unique solution is the Schwarzschild solution:
\[ ds^2 = -\left(1 - \frac{2M}{r}\right) dt^2 + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2 ds^2(S^2) . \tag{3.11} \]
The constant M is the total mass of the star. Recall that Rs = 2M is the Schwarzschild
radius where an event horizon is located. We must therefore take the star to have a radius
larger than the Schwarzschild radius: Rc > Rs . Regular stars have Rc ≫ Rs , for the sun
Rs ≈ 3 km while Rc ≈ 7 × 10^5 km.
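To make the comparison of the two radii concrete, the following short sketch evaluates Rs = 2GN M/c² for the Sun. The constants are rounded SI values (an assumption, not taken from these notes) and the function name `schwarzschild_radius_km` is introduced here purely for illustration.

```python
# Rounded SI values of the constants (assumed for this illustration).
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m s^-1
M_SUN = 1.989e30     # solar mass, kg
R_SUN_KM = 6.96e5    # solar radius, km


def schwarzschild_radius_km(mass_kg: float) -> float:
    """Return R_s = 2 G M / c^2 in kilometres."""
    return 2.0 * G * mass_kg / C**2 / 1000.0


rs_sun = schwarzschild_radius_km(M_SUN)
ratio = R_SUN_KM / rs_sun  # number of Schwarzschild radii in one solar radius
print(f"R_s(Sun) = {rs_sun:.2f} km, R_c / R_s = {ratio:.2e}")
```

The ratio is of order 10^5, which is the sense in which Rc ≫ Rs for an ordinary star.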
Inside the star We now want to consider the interior of the star, and patch it with the
exterior solution above such that the full metric is smooth at the patching surface at r = Rc .
We can integrate (3.6) to give
\[ m(r) = 4\pi \int_0^r \rho(r')\, r'^2\, dr' + m_* , \tag{3.12} \]
There is a slight subtlety here in that the total energy of the matter should include the correct
volume measure when integrating over a spacelike hypersurface. The energy for the spacelike
hypersurface Σt is defined to be
\[ E = \int_{\Sigma_t} \rho(r)\, d\mathrm{vol}(\Sigma_t) = \int_{\Sigma_t} \rho(r)\, e^{\Psi(r)} r^2 \sin\theta\, dr \wedge d\theta \wedge d\phi = 4\pi \int_0^{R_c} \rho(r)\, e^{\Psi(r)} r^2\, dr . \tag{3.14} \]
Note that this differs from the total mass of the star due to the e^{Ψ(r)} factor. Since e^{Ψ(r)} > 1 it
follows that E > M and one can interpret the positive difference E − M as the gravitational
binding energy of the star. This would be the amount of energy needed to disperse the matter
to infinity; for spherical stars this is a well-defined concept but it does not always make sense
in GR.
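As an illustration of the difference between E and M, the sketch below evaluates (3.14) numerically for a star of constant density, for which m(r) = 4πρr³/3 when m(0) = 0. Units with GN = c = 1 are assumed, the midpoint rule is an arbitrary choice of quadrature, and the function name and the sample values of M and Rc are hypothetical.

```python
import math


def proper_energy_uniform_star(M: float, R: float, n: int = 20000) -> float:
    """Midpoint-rule estimate of E = 4 pi * int_0^R rho e^Psi r^2 dr (G = c = 1)."""
    rho = 3.0 * M / (4.0 * math.pi * R**3)   # constant density with m(R) = M
    h = R / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        m = 4.0 * math.pi * rho * r**3 / 3.0            # m(r) from (3.12) with m(0) = 0
        exp_psi = 1.0 / math.sqrt(1.0 - 2.0 * m / r)    # e^Psi from (3.5)
        total += rho * exp_psi * r**2 * h
    return 4.0 * math.pi * total


M, R = 1.5, 10.0                      # sample values with 2M < R
E = proper_energy_uniform_star(M, R)
print(f"E = {E:.4f}, M = {M:.4f}, binding energy E - M = {E - M:.4f}")
```

For weak fields (M ≪ Rc) the difference E − M should approach 3M²/(5Rc), the Newtonian binding energy of a uniform sphere, which gives a simple consistency check on the integral.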
Note that due to the constraint that 2m(r) < r for all r, so that e^{2Ψ(r)} > 0, it follows that
there is an upper bound on the possible mass of the star: 2M < Rc . There is no Newtonian
analogue of this condition. Reinstating the factors of c and GN we have 2GN M < c² Rc and
in the c → ∞ limit this is trivial, which is why this constraint is not seen in the Newtonian
theory.
This upper bound can be improved. From equation (3.9) after some algebra and assuming
ρ ≥ 0 and ρ′ (r) ≤ 0, which you will do in sheet 1, one finds that
\[ \frac{m(r)}{r} \le \frac{2}{9}\left[1 - 6\pi r^2 p(r) + \sqrt{1 + 6\pi r^2 p(r)}\right] . \tag{3.15} \]
Evaluating on the radius of the star where p = 0, one finds
\[ R_c \ge \frac{9M}{4} . \tag{3.16} \]
Note that this is actually independent of the equation of state and so it applies equally to hot
stars and cold stars which satisfy these assumptions. Stars of uniform constant density can
get arbitrarily close to saturating the bound but as they get closer to the bound the pressure
at the centre diverges.
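The divergence of the central pressure can be seen explicitly. The sketch below uses the standard interior Schwarzschild result for a uniform-density star, p_c/ρ = (1 − s)/(3s − 1) with s = (1 − 2M/Rc)^{1/2}; this formula is not derived in these notes and is quoted here as an assumption, and the function name is hypothetical. Units with GN = c = 1 are assumed.

```python
import math


def central_pressure_over_density(compactness: float) -> float:
    """p_c / rho for a uniform-density star with compactness M / R_c (G = c = 1)."""
    if not 0.0 < compactness < 4.0 / 9.0:
        raise ValueError("need 0 < M/R_c < 4/9")
    s = math.sqrt(1.0 - 2.0 * compactness)
    # The denominator vanishes when s = 1/3, i.e. when R_c = 9M/4.
    return (1.0 - s) / (3.0 * s - 1.0)


for x in (0.1, 0.3, 0.4, 0.44, 0.444):
    print(f"M/R_c = {x:.3f}  ->  p_c/rho = {central_pressure_over_density(x):.3g}")
```

The printed ratio grows without bound as M/Rc approaches 4/9, in line with the statement above.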
In order to solve the TOV equations we should use numerical integration. We view (3.6)
and (3.9) as a coupled set of ODEs for m(r) and ρ(r) for some given equation of state.
These can be solved, at least numerically on a computer, once initial conditions for the mass
and density are given. We have that m(0) = 0 and therefore we need only specify a density
ρc = ρ(0) at the centre of the star.
Given these initial conditions we can numerically solve (3.6) and (3.9). Since the latter
equation shows that p decreases with r there must be some point where the pressure vanishes,
this is the surface of the star and the radius is determined by p(Rc ) = 0. We can invert this
to determine Rc as a function of ρc . From (3.13) we can determine M as a function of ρc .
Finally we may determine Φ(r) inside the star by integrating (3.7) from the surface of the star
with initial condition that 2Φ(Rc ) = log(1 − 2M/Rc ), i.e. it gives the Schwarzschild solution
potential. Hence for a given equation of state, static, spherically symmetric cold stars
form a 1-parameter family of solutions labelled by the central density ρc .
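The procedure just described can be sketched in a few lines. The example below is a minimal illustration rather than the computation used in the course: it assumes a polytropic equation of state p = Kρ^Γ with the arbitrary values K = 100, Γ = 2, works in units with GN = c = 1, and integrates (3.6) and (3.9) outwards with a fixed-step fourth-order Runge–Kutta scheme until the pressure vanishes. All names (`rho_of_p`, `tov_rhs`, `solve_star`) and the sample central density are hypothetical.

```python
import math

K, GAMMA = 100.0, 2.0  # assumed polytropic equation of state p = K * rho**GAMMA


def rho_of_p(p):
    """Invert the assumed equation of state; clamp p at zero near the surface."""
    return (max(p, 0.0) / K) ** (1.0 / GAMMA)


def tov_rhs(r, m, p):
    """Right-hand sides of (3.6) and (3.9) in units with G = c = 1."""
    rho = rho_of_p(p)
    dm_dr = 4.0 * math.pi * r**2 * rho
    dp_dr = -(p + rho) * (m + 4.0 * math.pi * r**3 * p) / (r * (r - 2.0 * m))
    return dm_dr, dp_dr


def solve_star(rho_c, dr=1e-3, r_max=100.0):
    """Integrate outwards with RK4 until p reaches zero; return (R_c, M)."""
    r = dr                                    # start just off r = 0 to avoid 0/0
    m = 4.0 * math.pi * rho_c * r**3 / 3.0    # m(r) ~ (4/3) pi rho_c r^3 near the centre
    p = K * rho_c**GAMMA
    while r < r_max:
        k1m, k1p = tov_rhs(r, m, p)
        k2m, k2p = tov_rhs(r + dr / 2, m + dr * k1m / 2, p + dr * k1p / 2)
        k3m, k3p = tov_rhs(r + dr / 2, m + dr * k2m / 2, p + dr * k2p / 2)
        k4m, k4p = tov_rhs(r + dr, m + dr * k3m, p + dr * k3p)
        m_next = m + dr * (k1m + 2 * k2m + 2 * k3m + k4m) / 6
        p_next = p + dr * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        if p_next <= 0.0:
            frac = p / (p - p_next)           # linear interpolation to p = 0
            return r + frac * dr, m + frac * (m_next - m)
        r, m, p = r + dr, m_next, p_next
    raise RuntimeError("pressure did not vanish before r_max")


R_c, M = solve_star(rho_c=1.28e-3)
print(f"R_c = {R_c:.3f}, M = {M:.3f}, 2M/R_c = {2 * M / R_c:.3f}")
```

Calling `solve_star` for a range of values of ρc traces out the 1-parameter family; Φ(r) can then be obtained by integrating (3.7) inwards from the surface value fixed by the Schwarzschild solution.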
If one follows the above procedure one finds that as ρc increases then M increases to a
maximum before decreasing again for larger ρc . One can see this from (3.15). Due to the
minus sign in the first term as we crank up p(r) the contribution from the positive square root
term will no longer be dominant and the upper bound on the mass will start to get smaller.
It follows that there is a maximum mass that a cold star can attain.
The maximum mass depends heavily on the details of the equation of state of cold matter.
For the equation of state of a white dwarf where electron degeneracy pressure is the dominant
outward force, one reproduces the Chandrasekhar bound:
\[ M_C \approx 1.4 \left(\frac{2}{\mu_N}\right)^2 M_\odot , \tag{3.17} \]
where µN is the number of nucleons per electron. The calculation for this bound does not
require general relativity, Newtonian gravity is good enough, and the two bounds agree to a
good precision. Experimentally we know the equation of state up to some density ρ0 which
is nuclear density, past this we no longer know the equation of state.10 One may guess that with some
crazy configuration one could arrange for a star which is arbitrarily heavy, subject to the
above bound. General relativity says that there is in fact a maximal bound independent of
the equation of state. This is around 5M⊙ .
To see why this is true observe that ρ is a decreasing function of r. We may define the
core of the star as the region in which ρ > ρ0 where we do not know the equation of state,
and the envelope (since it envelopes the core) as the region ρ < ρ0 where we do know the
equation of state. Let r0 be the radius of the core, so that the core is the region r < r0 and
the envelope is the region r0 < r < Rc . The mass of the core is m0 = m(r0 ). Since the density
in the core is bigger than the density on the boundary with the envelope we must have that
\[ m_0 \ge \frac{4\pi r_0^3 \rho_0}{3} . \tag{3.18} \]
Note that Newtonian gravity would also predict this inequality, however in GR we also have
the additional constraint (3.15) which we should evaluate at r = r0 where we know the
equation of state and may therefore determine p0 = p(ρ(r0)):
\[ \frac{m_0}{r_0} < \frac{2}{9}\left[1 - 6\pi r_0^2 p_0 + \sqrt{1 + 6\pi r_0^2 p_0}\right] . \tag{3.19} \]
Since the RHS is a decreasing function of p0 evaluating at p0 = 0 we get the weaker bound
\[ m_0 < \frac{4 r_0}{9} . \tag{3.20} \]
10
Past this density this becomes a strongly coupled phenomenon described by a strongly coupled QFT.
Since we typically make progress with understanding QFTs using perturbation theory, when they are strongly
coupled this technique fails and we need new ones. Recently there has been some work using AdS/CFT to
work out a realistic equation of state. (This is by no means the only technique, but it has a nice connection
to string theory.)
These two inequalities define a finite region in the m0 − r0 plane. Hence, even though we
are ignorant of the equation of state within the core, GR predicts that its mass cannot be
arbitrarily large.
Using (3.18) to eliminate r0 and plugging this into (3.20) we have
\[ m_0 < \frac{4}{9\sqrt{3\pi\rho_0}} . \tag{3.21} \]
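For completeness, the elimination of r0 can be spelled out as follows. This is a sketch of the algebra only, using (3.18) to bound r0 from above and then inserting that bound into (3.20).

```latex
\begin{align*}
m_0 \ge \tfrac{4}{3}\pi r_0^3 \rho_0
  &\;\Rightarrow\; r_0 \le \left(\frac{3 m_0}{4\pi\rho_0}\right)^{1/3} , \\
m_0 < \frac{4 r_0}{9}
  &\;\Rightarrow\; m_0 < \frac{4}{9}\left(\frac{3 m_0}{4\pi\rho_0}\right)^{1/3}
  \;\Rightarrow\; m_0^{2} < \frac{64}{729}\cdot\frac{3}{4\pi\rho_0} = \frac{16}{243\,\pi\rho_0} , \\
  &\;\Rightarrow\; m_0 < \frac{4}{\sqrt{243\,\pi\rho_0}} = \frac{4}{9\sqrt{3\pi\rho_0}} .
\end{align*}
```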
Hence, even though we do not know the equation of state inside the core GR predicts that its
mass cannot be indefinitely large. Experimentally we do not know the equation of state of cold matter
at densities much higher than the density of atomic nuclei so we take ρ0 = 5 × 10^14 g/cm^3.
Plugging this into the above gives the bound m0 < 5M⊙ .
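The number quoted here can be reproduced by restoring the factors of GN and c in the bound m0 < 4/(9√(3πρ0)). The sketch below does this with rounded SI constants (an assumption, not values taken from these notes); the function name is hypothetical.

```python
import math

# Rounded SI values of the constants (assumed for this illustration).
G = 6.674e-11       # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m s^-1
M_SUN = 1.989e30    # solar mass, kg


def core_mass_bound_in_solar_masses(rho0_g_cm3: float) -> float:
    """Evaluate 4 / (9 sqrt(3 pi rho0)) with G and c restored, in solar masses."""
    rho0_si = rho0_g_cm3 * 1.0e3          # g/cm^3 -> kg/m^3
    rho0_geom = G * rho0_si / C**2        # density in geometric units, m^-2
    m0_geom = 4.0 / (9.0 * math.sqrt(3.0 * math.pi * rho0_geom))  # mass as a length, m
    m_sun_geom = G * M_SUN / C**2         # solar mass as a length, m
    return m0_geom / m_sun_geom


bound = core_mass_bound_in_solar_masses(5.0e14)
print(f"m0 < {bound:.2f} solar masses")
```

Since the bound scales as ρ0^{-1/2}, taking a larger matching density ρ0 would tighten it.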
If we are given a core with mass m0 and radius r0 we can solve (numerically) for the
envelope region using the known equation of state and the equations for m(r) and p(r) with
the initial conditions given by the core. If one plugs this into a computer programme one
finds the total mass M as a function of m0 and r0. One can then vary this over the
allowed region for (m0 , r0 ); one finds that the largest mass is attained at the maximum of m0 .
At this maximum the envelope contributes less than 1% of the total mass so the maximum
value of M is almost the same as the maximum of m0 and we have M ≤ 5M⊙ .
This is an upper bound for any physically reasonable equation of state for ρ > ρ0 . Any
equation of state will have a smaller upper bound than the one given here. One may put
further constraints on what we call a physically reasonable equation of state. A natural
demand is that the speed of sound through the mass should not exceed the speed of light, so
that dp/dρ ≤ 1; then the upper bound is further reduced to about 3M⊙ .
3.3 Summary
What have we learnt from this exercise? Firstly we see once again that GR predicts something
that Newtonian gravity cannot, we find an upper bound on the maximal size of any cold star,
independent of its composition. Secondly this has an extremely important consequence for
the ultimate fate of a star. Ordinary hot stars are supported against collapse under their
own weight by ideal gas pressure resulting from the high temperature. This pressure is much
higher than the pressure that can be produced by cold matter at comparable densities and so
the above upper limits do not apply. However since a hot star radiates energy, just look out
the nearest window during the day, if this energy is not replenished hydrostatic equilibrium
cannot be maintained. As the fuel source is used up the hydrostatic equilibrium is lost and it
begins to contract until the cold matter pressure dominates the remaining thermal pressure. If
the star is small enough a stable equilibrium may be reached using cold matter pressure and
the star will remain like this forever. However if the mass is greater than the cold matter upper limit
equilibrium can never be achieved and the star would have to undergo complete gravitational
collapse unless it sheds some of its mass to bring its total mass below the upper bound.
Figure 1: The equilibrium configurations of cold matter. Given an equation of state the
equilibrium configuration is uniquely determined by the central density ρc . The radii and
masses of these configurations are shown for values of ρc ranging from ≈ 10^5 g cm^−3 at point
A to ≈ 10^17 g cm^−3 beyond point D. In the white dwarf regime the values of M and Rc
depend somewhat on the assumed composition of the star. The neutron star regime is far
more dependent on the assumptions that go into the equation of state, and interactions
between the fundamental constituents of the matter. In the latter regime this is just a rough
sketch of the qualitative features. The point B is the Chandrasekhar limit and beyond this
the white dwarf must undergo further gravitational collapse to become a neutron star. It
is at this point that the electron degeneracy pressure is insufficient to prevent gravitational
collapse and therefore the equation of state changes past this point.
Figure taken from Wald based on a figure by Harrison, Thorne, Wakano and Wheeler.
4 Causality and Penrose Diagrams
Let us consider a spacetime M . One of the postulates that we demand General Relativity
satisfies is that it is causal. A signal can be sent between two distinct points if and only if
the points can be joined by a non-spacelike curve. Our goal in this section is to investigate
the properties of causality on spacetime. Given that our spacetimes are generically infinite
in extent this can be difficult to understand on a piece of paper. There is a useful way of
resolving this issue called conformal compactification.
Definition: Conformal transformation
A conformal transformation is a map from a spacetime (M, g) to a spacetime (M, g̃) such
that
g̃µν (x) = Ω(x)2 gµν (x) , (4.1)
where Ω(x) is a smooth function of the spacetime coordinates and Ω(x) ̸= 0 for all x ∈ M .
Two spacetimes whose metrics are related by a conformal transformation have the same
null geodesics. However, timelike and spacelike geodesics in one metric will not necessarily
be geodesics in the other. You will prove this in problem sheet 1. One reason why conformal
transformations are useful is because they preserve the causal structure of spacetime. Consider
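a vector V µ on M , not necessarily tangent to a geodesic. Then since Ω(x)2 > 0 it follows that
\[ \tilde g_{\mu\nu} V^\mu V^\nu = \Omega^2 g_{\mu\nu} V^\mu V^\nu \quad \text{has the same sign as} \quad g_{\mu\nu} V^\mu V^\nu . \tag{4.2} \]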
Hence curves which are timelike, null or spacelike with respect to one metric remain timelike,
null or spacelike respectively in the conformally rescaled metric.
We may use this to our advantage when studying the causal structure of spacetime.
By using a suitably chosen conformal factor we may bring “infinity” to a finite coordinate
distance which allows us to draw the causal structure on a finite piece of paper. This is known
as a Penrose diagram and encodes the causal structure of the spacetime.
The general procedure for drawing a Penrose diagram is to perform the following steps.
First change coordinates on (M, g) such that “infinity” is brought to finite coordinate distance.
This then allows us to draw the spacetime on a finite piece of paper. The points at “infinity”
will become the edges of the finite diagram. Typically the metric will diverge at these points.
To remedy this we perform a conformal transformation on g to obtain g̃ which is regular
on the edges. The new pair (M, g̃) is a good representation of the original spacetime (M, g)
for understanding the causal structure: they have the exact same causal structure. It is
customary to add the points at infinity to the spacetime to form a new manifold M̃ (with
boundary now). The resulting spacetime (M̃ , g̃) is often called the conformal compactification
of (M, g).
Note that this has some limitations. Conformal transformations generically change the
curvature tensors so that R̃µνρσ ̸= Rµνρσ , R̃µν ̸= Rµν , R̃ ̸= R ... and so forth, therefore
the conformally compactified spacetime is unphysical, it does not satisfy the Einstein field
equations anymore. Moreover, timelike and spacelike geodesics of (M, g) are not geodesics
in (M, g̃). The utility of the conformal compactification is for understanding the causal
structure.
To understand this better let us consider some examples.
Consider first two-dimensional Minkowski space, with metric
\[ ds^2 = -dt^2 + dx^2 , \tag{4.3} \]
where −∞ < t, x < ∞. The null geodesics are given by t ± x = constant. We may introduce
light-cone coordinates u = t − x and v = t + x in which the null geodesics are simply lines of constant u or v.
In these coordinates the metric becomes
\[ ds^2 = -du\, dv . \tag{4.4} \]
The coordinates still have infinite range and so we have not really done much yet. To proceed we
want to shrink infinity down to a finite distance away. Define
\[ u = \tan\tilde u , \quad v = \tan\tilde v , \tag{4.5} \]
where −π/2 < ũ, ṽ < π/2. Note that the range is open because strictly u, v → ±∞ are not in the
spacetime. The line-element with these coordinates is now
\[ ds^2 = -\frac{1}{\cos^2\tilde u\, \cos^2\tilde v}\, d\tilde u\, d\tilde v . \tag{4.6} \]
It diverges as ũ, ṽ → ±π/2. We can now define a new metric conformally related to the one
above. The obvious conformal factor to use is the one chosen to remove the prefactor. We take
\[ \tilde g = \cos^2\tilde u\, \cos^2\tilde v\, g = -d\tilde u\, d\tilde v . \tag{4.7} \]
Figure 2: Left: On the left we have Minkowski space, (M, g) in (ũ, ṽ) coordinates. The
boundaries ũ, ṽ = ±π/2 are not part of M and g diverges there. Lines with r = const are given
by dashed lines, while the solid lines are those with t = const. Right: On the right is the
Penrose diagram of the conformally compactified spacetime. Future/past timelike infinity is denoted i± ,
future/past null infinity is denoted I ± while spacelike infinity is denoted i0 .
This metric is now regular at the points at infinity where either ũ or ṽ is equal to ±π/2. Since
it is regular there we may now add these points to the spacetime. The resulting spacetime
(M̃ , g̃) is the conformal compactification of (M, g). We may now draw this, see figure 2.
The two points (ũ, ṽ) = (−π/2, −π/2) and (ũ, ṽ) = (π/2, π/2) are denoted by i∓ respectively.
All past and future directed timelike curves end up at i∓ so we refer to i− /i+ as past/future
timelike infinity. Future directed null geodesics either end up at ṽ = π/2 with constant |ũ| < π/2
or at ũ = π/2 with constant |ṽ| < π/2. This set of points is denoted by I + (called scri-plus)
and referred to as future null infinity. An analogous definition holds for past null infinity I −
(scri-minus). Spacelike infinity, i0 , denotes the set of end-points of spacelike geodesics, which
correspond to (ũ, ṽ) = (π/2, −π/2) and (ũ, ṽ) = (−π/2, π/2).
We have just seen the Penrose diagram for d = 2. It turns out that this dimension is somewhat
special: Minkowski space in d > 2 is somewhat different. Consider Minkowski space in
d > 2 dimensions. We may use the “rectangular” metric
\[ ds^2 = -dt^2 + \sum_{i=1}^{d-1} (dx^i)^2 , \tag{4.8} \]
where the coordinates have ranges t ∈ (−∞, ∞), xi ∈ (−∞, ∞). To proceed we make a change
of coordinates to spherical polar coordinates so that the spacelike part of the metric is
equivalent to
\[ \sum_{i=1}^{d-1} (dx^i)^2 = dr^2 + r^2 ds^2(S^{d-2}) , \tag{4.9} \]
with S d−2 the unit (d − 2)-dimensional sphere and ds2 (S d−2 ) the round metric on it. This
exhibits the spacetime as a cone centred at xi = 0. We take r ≥ 0. In these coordinates the
Minkowski metric is
ds2 = −dt2 + dr2 + r2 ds2 (S d−2 ) . (4.10)
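Introducing the light-cone coordinates
\[ u = t - r , \quad v = t + r , \tag{4.11} \]
the metric becomes
\[ ds^2 = -du\, dv + \frac{(v - u)^2}{4}\, ds^2(S^{d-2}) . \tag{4.12} \]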
Note that since r ≥ 0 we have u ≤ v. We now want to bring infinity to finite coordinate
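length, to do this we change coordinates to
\[ u = \tan\tilde u , \quad v = \tan\tilde v , \tag{4.13} \]
where
\[ \tilde u \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) , \quad \tilde v \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) . \tag{4.14} \]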
Note that the range is open since the points at ±∞ in the original coordinates are not part
of the spacetime. We still need to impose that ũ ≤ ṽ. In these coordinates the metric reads
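\[ ds^2 = \frac{1}{4\cos^2\tilde u\, \cos^2\tilde v}\left[-4\, d\tilde u\, d\tilde v + \sin^2(\tilde v - \tilde u)\, ds^2(S^{d-2})\right] . \tag{4.15} \]
We may now use a conformal transformation to remove the overall pre-factor and we are left
with
\[ \tilde g = 4\cos^2\tilde u\, \cos^2\tilde v\, g = -4\, d\tilde u\, d\tilde v + \sin^2(\tilde v - \tilde u)\, ds^2(S^{d-2}) . \tag{4.16} \]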
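The form of the compactified metric can be verified in two short steps. The following is a sketch which assumes that the compactifying change of coordinates is u = tan ũ, v = tan ṽ, applied to ds² = −du dv + ¼(v − u)² ds²(S^{d−2}).

```latex
\begin{align*}
du\, dv &= \frac{d\tilde u\, d\tilde v}{\cos^2\tilde u\, \cos^2\tilde v} , \\
v - u &= \tan\tilde v - \tan\tilde u
       = \frac{\sin\tilde v \cos\tilde u - \cos\tilde v \sin\tilde u}{\cos\tilde u\, \cos\tilde v}
       = \frac{\sin(\tilde v - \tilde u)}{\cos\tilde u\, \cos\tilde v} , \\
ds^2 &= -du\, dv + \frac{(v - u)^2}{4}\, ds^2(S^{d-2})
      = \frac{1}{4\cos^2\tilde u\, \cos^2\tilde v}
        \left[-4\, d\tilde u\, d\tilde v + \sin^2(\tilde v - \tilde u)\, ds^2(S^{d-2})\right] .
\end{align*}
```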
As before, after the conformal transformation ũ, ṽ = ±π/2 is no longer a problem and we may
compactify the space by including these points. We therefore have the coordinate ranges
−π/2 ≤ ũ ≤ ṽ ≤ π/2. At a fixed point on the sphere the metric is the same as that of 2d
Minkowski space, the difference is in the ranges of ũ, ṽ. We only include the half which is
right of the vertical line. Every point on the diagram represents a (d − 2)-dimensional sphere of
radius sin(ṽ − ũ). The Penrose diagram is drawn in figure 3.
Figure 3: Left: On the left we have Minkowski space in general dimension d > 2. Each
point represents a (d − 2)-dimensional sphere. As the null geodesic passes through r = 0
it emerges on another copy of the Penrose diagram whose points represent the antipodes
(diametrically opposite points) on the spheres. Right: The right diagram shows the conformal
compactification for d = 4 as a portion of the Einstein static universe. The curved line
represents the same null geodesic as on the left-hand side.
In 4d, we can picture this differently. Define the coordinates T = ṽ + ũ and χ = ṽ − ũ.
The coordinate ranges are then −π < T < π and 0 < χ < π, with the added constraint
|T | + χ ≤ π. The metric reads
\[ \tilde g = -dT^2 + d\chi^2 + \sin^2\chi\, ds^2(S^2) . \tag{4.17} \]
The spatial part is just the round metric of a three-sphere. This therefore represents a static
universe with spherical spatial slices corresponding to a finite portion of the Einstein static
universe. See the right-hand side of figure 3 where this is plotted. Note that the vertical
direction of the cylinder is T while the angular direction is χ. At each point there is a
two-sphere with radius sin χ. We have
\[ i^\pm : (T, \chi) = (\pm\pi, 0) , \quad i^0 : (T, \chi) = (0, \pi) , \quad \mathcal{I}^\pm : T = \pm(\pi - \chi) ,\ 0 < \chi < \pi . \tag{4.18} \]
Note that i± , i0 are actually points since χ = 0 and χ = π are the north and south poles of
S 3 . Meanwhile I ± are null surfaces with the topology of R × S 2 .
There are a number of features to observe. Radial null geodesics are at ±45◦ in the
diagram. All timelike geodesics begin at i− and end at i+ . All null geodesics begin at I −
and end at I + .
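These features can be checked numerically. The sketch below maps Minkowski coordinates (t, r) to the compactified coordinates (T, χ), assuming the chain u = t − r, v = t + r, ũ = arctan u, ṽ = arctan v, T = ṽ + ũ, χ = ṽ − ũ; the function name `penrose_coords` is hypothetical. Along an outgoing radial null ray u is constant, so T − χ = 2ũ should be constant, which is the statement that the ray is a 45° line.

```python
import math


def penrose_coords(t: float, r: float) -> tuple:
    """Map Minkowski (t, r), r >= 0, to compactified coordinates (T, chi)."""
    u, v = t - r, t + r                        # null coordinates
    u_t, v_t = math.atan(u), math.atan(v)      # bring infinity to finite distance
    return v_t + u_t, v_t - u_t                # (T, chi)


# Points on the outgoing radial null ray u = t - r = 1.
ray = [penrose_coords(1.0 + r, r) for r in (0.0, 1.0, 10.0, 1.0e3, 1.0e6)]
for T, chi in ray:
    print(f"T = {T:.6f}, chi = {chi:.6f}, T - chi = {T - chi:.6f}, T + chi = {T + chi:.6f}")
```

As r grows along the ray, T + χ = 2ṽ approaches π, so the ray ends on I + as described above.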
Rindler space is a subregion of Minkowski space associated with observers who are eternally
accelerated at a constant rate. It appears often when looking at the near-horizon region of
black holes. Consider the two-dimensional Minkowski metric and an observer moving at a
uniform acceleration of magnitude ξ^−1 in the x-direction. Their trajectory is
\[ t(\tau) = \xi \sinh(\tau/\xi) , \quad x(\tau) = \xi \cosh(\tau/\xi) , \tag{4.19} \]
which has constant acceleration α = ξ^−1. Note that the trajectory of the observer satisfies
x2 (τ ) − t2 (τ ) = ξ 2 , (4.20)
which describes a hyperbola asymptoting to the null paths x = −t in the past and x = t in the
future. The accelerated observer travels from past null infinity to future null infinity, rather
than timelike infinity as would be reached by geodesic observers. The region x ≤ t is forever
hidden to them which makes the line x = t a horizon to these observers. This horizon is of
a different flavour to the Schwarzschild horizon since that is an observer independent object
while this horizon is associated with a special family of observers, see figure 4.
Rindler space corresponds to the right wedge x > |t| foliated by the worldlines of the
accelerated observers.
We can choose new coordinates (η, ξ) on 2d Minkowski space that are adapted to uniformly
accelerated motion. Let
\[ t = \xi \sinh\eta , \quad x = \xi \cosh\eta , \tag{4.21} \]
with coordinate range 0 < ξ < ∞ and −∞ < η < ∞. In these coordinates the Minkowski
metric is
\[ ds^2 = -\xi^2 d\eta^2 + d\xi^2 . \tag{4.22} \]
The proper time measured by an accelerated observer, i.e. a stationary (ξ = constant) observer
in Rindler coordinates, is τ = ξη. Since Rindler space is just a subregion of Minkowski space
the Penrose diagram is just a piece of figure 2.
Figure 4: Eternally accelerating observers in Minkowski space. Their worldlines are in blue
and labelled by ξ. Events in the shaded region such as the black dot are hidden to them. The
Rindler horizon is the boundary between the shaded and unshaded regions. Rindler space is
the right wedge bounded by the dashed black lines which are null. The straight lines are lines
of constant Rindler time.
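The Rindler form of the metric and the relation τ = ξη follow from a direct substitution. The sketch below assumes the coordinate change t = ξ sinh η, x = ξ cosh η applied to the two-dimensional Minkowski metric −dt² + dx².

```latex
\begin{align*}
dt &= \sinh\eta\, d\xi + \xi\cosh\eta\, d\eta , \qquad
dx = \cosh\eta\, d\xi + \xi\sinh\eta\, d\eta , \\
-dt^2 + dx^2 &= (\cosh^2\eta - \sinh^2\eta)\, d\xi^2 - \xi^2 (\cosh^2\eta - \sinh^2\eta)\, d\eta^2
  = -\xi^2 d\eta^2 + d\xi^2 , \\
d\tau^2 &= -ds^2\big|_{\xi = \text{const}} = \xi^2 d\eta^2
  \;\Rightarrow\; \tau = \xi\eta .
\end{align*}
```

The cross terms in dξ dη cancel between −dt² and dx², which is why the metric is diagonal in these coordinates.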
Recall that we could extend the Schwarzschild solution beyond the horizon by using Kruskal
coordinates. The metric in these coordinates reads
\[ ds^2 = -\frac{32 M^3}{r} \exp\left(-\frac{r}{2M}\right) dU\, dV + r^2 ds^2(S^2) . \tag{4.23} \]
Recall that the range of the coordinates is −∞ < U, V < ∞. We need to define a new set of
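null coordinates to bring infinity to a finite coordinate distance. We transform as
\[ U = \tan\tilde{U} , \qquad V = \tan\tilde{V} , \tag{4.24} \]
such that −π/2 < Ũ, Ṽ < π/2. The line element becomes
\[ ds^2 = \frac{1}{4\cos^2\tilde{U}\cos^2\tilde{V}} \left[ -\frac{128M^3}{r} \exp\left(-\frac{r}{2M}\right) d\tilde{U}\, d\tilde{V} + 4 r^2 \cos^2\tilde{U}\cos^2\tilde{V}\, ds^2(S^2) \right] . \tag{4.25} \]
We perform the usual conformal transformation
\[ \tilde{g} = 4\cos^2\tilde{U}\cos^2\tilde{V}\, g = -\frac{128M^3}{r} \exp\left(-\frac{r}{2M}\right) d\tilde{U}\, d\tilde{V} + 4 r^2 \cos^2\tilde{U}\cos^2\tilde{V}\, ds^2(S^2) . \tag{4.26} \]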
The curvature singularity at r = 0 is at U V = 1 in U, V coordinates and now corresponds to
Figure 5: Left: The Penrose diagram for Kruskal spacetime. The possible trajectory of
the surface of a collapsing star is plotted; the parts to the left correspond to the interior of
the star and are described by a metric which (at a fixed time slice) is similar to the metric we constructed in
section 3. Right: The Penrose diagram for a collapsing star. The curved surface represents
the surface of the star with the shaded area corresponding to the interior of the star. The
horizon corresponds to the dashed line and appears in spacetime once the star has collapsed
sufficiently.
We can also plot the Penrose diagram of a spherically symmetric collapsing star. The
interior of the star is excluded since the stress energy tensor does not vanish there. We end
up with only the two regions 1 and 3 of Kruskal spacetime, there is no white hole region.
At this point we have almost beaten to death the Schwarzschild solution, we need some new
solutions to play with. There is a generalisation to the Schwarzschild solution that we can
study: we can give it some charge. This will retain the static and spherically symmetric
properties of the Schwarzschild solution but couple Einstein gravity to electromagnetism.
The charged black hole is known as the Reissner–Nordström (RN) black hole.
In nature large imbalances of charge do not occur, it is favourable for the charged object to
attract particles of opposite charge and gradually lose its charge. We would therefore expect
matter undergoing gravitational collapse to be neutral and so the presence of charged black
holes in nature does not seem particularly relevant. Nevertheless the solution exhibits some
interesting features. Moreover, for those doing string theory, RN black holes occasionally
appear, though probably not in your course.
We want to couple Einstein gravity to electromagnetism. Recall that the general prescription
for coupling matter to gravity is through minimal coupling.¹¹ Minimal coupling says
we replace the Minkowski metric with the curved metric of spacetime, we replace regular
derivatives with covariant derivatives and add in the correct volume measure.
Electromagnetism in terms of forms
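Recall that electromagnetism is governed by Maxwell’s equations:
\[
\begin{aligned}
\nabla \times \vec{B} - \partial_t \vec{E} &= \vec{J} , \\
\nabla \cdot \vec{E} &= \rho , \\
\nabla \times \vec{E} + \partial_t \vec{B} &= 0 , \\
\nabla \cdot \vec{B} &= 0 .
\end{aligned}
\tag{5.1}
\]
Here $\vec{E}$ and $\vec{B}$ are the electric and magnetic field 3-vectors, $\vec{J}$ is the current and ρ is the charge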
density. These equations are invariant under Lorentz transformations, even though they
do not look invariant. We can write these in a manifestly invariant way by introducing
the two-form field strength F and its one-form potential A.
¹¹ One can also add non-minimal terms but we will not consider these here.
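Writing Maxwell’s equations in component notation we have
\[
\begin{aligned}
\epsilon^{ijk} \partial_j B_k - \partial_0 E^i &= J^i , \\
\partial_i E^i &= J^0 , \\
\epsilon^{ijk} \partial_j E_k + \partial_0 B^i &= 0 , \\
\partial_i B^i &= 0 .
\end{aligned}
\tag{5.2}
\]
We have introduced the current 4-vector $J^\mu = (\rho, \vec{J})$ to rewrite the first two conditions.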
Let us define the field strength tensor Fµν to be
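\[
F_{\mu\nu} =
\begin{pmatrix}
0 & -E_1 & -E_2 & -E_3 \\
E_1 & 0 & B_3 & -B_2 \\
E_2 & -B_3 & 0 & B_1 \\
E_3 & B_2 & -B_1 & 0
\end{pmatrix}_{\mu\nu}
\tag{5.3}
\]
We have
\[ F^{0i} = E^i , \qquad F^{ij} = \epsilon^{ijk} B_k . \tag{5.4} \]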
Therefore the first two equations in (5.2) can be rewritten as
\[
\begin{aligned}
\partial_j F^{ij} - \partial_0 F^{0i} &= J^i , \\
\partial_i F^{0i} &= J^0 ,
\end{aligned}
\tag{5.5}
\]
d ⋆ F = −J , dF = 0 . (5.8)
The first equation is known as the Maxwell equation, while the second is the Bianchi
identity. Since dF = 0 the form F is closed, which means that locally F can be written as an exact form,
F = dA , Fµν = ∂µ Aν − ∂ν Aµ . (5.9)
The one-form A is known as the gauge field. Note that it is not unique, A + dΛ gives the
same field strength F when Λ is a smooth function. Adding the term dΛ to the potential
is known as a gauge transformation, it is a redundancy/symmetry in our description
of the theory. Physical quantities will generally be expressed in terms of the field
strength F. On the other hand we view the gauge field as the dynamical field of the
theory, i.e. the field we vary an action with respect to.
We can write an action for electromagnetism by using the gauge field A and defining
the field strength F to be F = dA. Then the action giving rise to Maxwell’s equations
with sources is
\[ S_{\text{Maxwell}} = \int d^4x\, \mathcal{L}_{EM} = \int d^4x \left[ -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + A_\mu J^\mu \right] . \tag{5.10} \]
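We have
\[ \frac{\partial \mathcal{L}_{EM}}{\partial A_\nu} = J^\nu , \tag{5.11} \]
and
\[ \frac{\partial \mathcal{L}_{EM}}{\partial(\partial_\mu A_\nu)} = -F^{\mu\nu} . \tag{5.12} \]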
Putting everything together, the Euler–Lagrange equations give
\[ \partial_\mu F^{\mu\nu} = -J^\nu , \tag{5.13} \]
as we found above from Maxwell’s equations. The Bianchi identity arises because we
define F = dA and use that d² = 0.
To couple this to gravity we will replace the Minkowski metric with the curved metric, add
the volume measure and replace derivatives with covariant derivatives. Derivatives appear in
the field strength as
Fµν = ∂µ Aν − ∂ν Aµ −→ ∇µ Aν − ∇ν Aµ = ∂µ Aν − ∂ν Aµ , (5.15)
where the latter follows when using the Levi–Civita connection. The action for Einstein–
Maxwell theory is then
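\[ S = \frac{1}{16\pi} \int d^4x \sqrt{-g} \left( R - F_{\mu\nu} F_{\rho\sigma} g^{\mu\rho} g^{\nu\sigma} \right) \equiv \frac{1}{16\pi} \int d^4x \sqrt{-g} \left( R - F_{\mu\nu} F^{\mu\nu} \right) . \tag{5.16} \]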
The equations of motion derived from the variation of the Einstein–Maxwell action are
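\[
\begin{aligned}
R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} &= 2 \left( F_{\mu\rho} F_\nu{}^\rho - \frac{1}{4} g_{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} \right) , \\
\nabla_\mu F^{\mu\nu} &= 0 ,
\end{aligned}
\tag{5.17}
\]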
and should be accompanied by the Bianchi identity dF = 0. Exercise: Check the equations
of motion are indeed those derived from the action.
5.2 Reissner–Nordström black hole
Theorem:
The unique spherically symmetric solution of the Einstein–Maxwell equations with non-constant
area radius function r is the Reissner–Nordström solution:
5.2.1 Super extremal RN: e2 > M 2
If M 2 −e2 < 0 the roots r± are complex and therefore ∆(r) does not have any real zeros. Thus
the curvature singularity is not hidden behind a horizon and we have a naked singularity.¹³
There is no obstruction to an observer travelling to the singularity, studying it and then
returning to us to tell us all about it. If one studies the geodesics one finds that the naked
singularity is repulsive, timelike geodesics never intersect r = 0, rather they approach r = 0
but reverse course and move away. Null geodesics can reach the singularity as can non-geodesic
timelike curves.
As r → ∞ the solution approaches flat spacetime and the causal structure looks normal
everywhere. The conformal diagram will therefore be just like that of Minkowski space, except
now r = 0 is a singularity. The nakedness of the singularity should be offensive to you. We
should never expect to find a black hole with M 2 < e2 as a result of gravitational collapse.
Roughly, the condition states that the total energy of the hole is less than the contribution
to the energy of the electromagnetic fields alone, and therefore we must have something with
negative mass. Therefore we consider this unphysical. The Penrose diagram is given in figure
6.
In the sub-extremal case M² > e², ∆ has two real simple roots r± and there are consequently two coordinate
singularities. The surfaces defined by r = r± are both null hypersurfaces and are both event horizons
(for the moment we define an event horizon as a hypersurface separating those spacetime points
which are connected to infinity by a timelike path from those that are not). The
singularity at r = 0 is a timelike line (contrast this with Schwarzschild where it was spacelike).
To see that they are coordinate singularities we can proceed in a similar manner as we did
for the Schwarzschild solution and define tortoise like coordinates. Let us begin with r > r+
and define
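\[ dr_* = \frac{r^2}{\Delta(r)}\, dr . \tag{5.21} \]
Integrating gives
\[ r_* = r + \frac{1}{2\kappa_+} \log\left(\frac{r - r_+}{r_+}\right) + \frac{1}{2\kappa_-} \log\left(\frac{r - r_-}{r_-}\right) + \text{const} , \tag{5.22} \]
where
\[ \kappa_\pm = \frac{r_\pm - r_\mp}{2 r_\pm^2} . \tag{5.23} \]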
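One way to convince oneself that (5.22) with (5.23) really integrates (5.21) is a numerical check. The sketch below assumes the standard form ∆(r) = r² − 2Mr + e² = (r − r+)(r − r−), with r± = M ± √(M² − e²), and works in the region r > r+; the function names are hypothetical and chosen only for illustration.

```python
import math

def horizons(M, e):
    """Roots r_+ and r_- of Delta(r), assuming Delta = r^2 - 2 M r + e^2."""
    root = math.sqrt(M * M - e * e)
    return M + root, M - root

def surface_gravity(r_a, r_b):
    """kappa for the root r_a, with r_b the other root, as in (5.23)."""
    return (r_a - r_b) / (2 * r_a**2)

def delta(r, M, e):
    return r * r - 2 * M * r + e * e

def r_star(r, M, e):
    """Tortoise coordinate (5.22) for r > r_+, with the constant set to zero."""
    r_p, r_m = horizons(M, e)
    k_p = surface_gravity(r_p, r_m)   # positive
    k_m = surface_gravity(r_m, r_p)   # negative
    return (r
            + math.log((r - r_p) / r_p) / (2 * k_p)
            + math.log((r - r_m) / r_m) / (2 * k_m))
```

Differentiating r_star numerically and comparing with r²/∆(r) should give agreement up to finite-difference error.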
¹³ We will discuss this in more detail later.
Figure 6: The Penrose diagram for the super-extremal Reissner–Nordström solution.
Now let
u = t − r∗ , v = t + r∗ . (5.24)
region 0 < r < r+ . There is a curvature singularity at r = 0 and there is a null hypersurface
at r = r± .
It follows that no point in the region r < r+ can send a signal to I + , hence it describes
a black hole. The black hole region is r ≤ r+ and the future event horizon is the null
hypersurface r = r+ .
To understand the global structure we need to define Kruskal-like coordinates:
The RHS is a monotonically increasing function of r for r > r−. Initially we have U + < 0
and V + > 0 which gives r > r+ , but now we can analytically continue to U + ≥ 0 or V + ≤ 0.
In particular the metric is smooth and non-degenerate when U + = 0 or V + = 0. We obtain
a diagram very similar to the Kruskal diagram we had for Schwarzschild, see figure 7.
Just as for Kruskal we have a pair of null hypersurfaces which intersect in the bifurcation
2-sphere located at U + = V + = 0. Surfaces of constant t are Einstein–Rosen bridges which
connect regions I and IV. The major difference to the Schwarzschild solution is that we no
longer have a curvature singularity in regions II and III because r(U + , V + ) > r− . However
we know that it is possible to extend our metric into the r < r− region, hence the above
spacetime must be extendable. In other words we know from the EF coordinates that radial
null geodesics reach r = r− in finite affine parameter so we have to investigate what happens
there.
To do this we should start in region II and use ingoing EF coordinates (v, r, θ, ϕ), since
we know that these cover regions I and II. We can now define a retarded time coordinate u
in region II. First define a time coordinate t = v − r∗ in region II with r∗ as defined in (5.22).
The metric in coordinates (t, r, θ, ϕ) takes the static RN form given above with r− < r < r+ .
Now define u = t − r∗ = v − 2r∗. Having defined u in region II we can now define Kruskal
coordinates U − < 0 and V − < 0 in region II using the formula above. In these coordinates
Figure 7: The Reissner–Nordström solution in (U + , V + ) coordinates.
the metric is
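\[ ds^2 = -\frac{r_+ r_-}{\kappa_-^2 r^2}\, e^{2|\kappa_-| r} \left( \frac{r_+ - r}{r_+} \right)^{1 + |\kappa_-|/\kappa_+} dU^- dV^- + r^2 ds^2(S^2) , \tag{5.29} \]
where r(U − , V − ) < r+ is given implicitly by
\[ U^- V^- = e^{-2|\kappa_-| r} \left( \frac{r - r_-}{r_-} \right) \left( \frac{r_+}{r_+ - r} \right)^{|\kappa_-|/\kappa_+} . \tag{5.30} \]
We may as before analytically continue to U − ≥ 0 and V − ≥ 0 which gives the diagram in figure 8.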
We now have the regions V and VI in which 0 < r < r− . These regions contain the
curvature singularity at r = 0 (U − V − = −1) which is timelike. Region III’ is isometric to
region III and so by introducing new coordinates (U +′ , V +′ ) this can be analytically continued
to the future to give further new regions I’, II’, and IV’ as shown in figure 9. In this diagram
the regions I’ and IV’ are new asymptotically flat regions isometric to I and IV. We may repeat
this procedure indefinitely to the future and past, so that the maximal analytic extension of
the RN solution contains infinitely many regions. The resulting Penrose diagram is given in
figure 10. It extends to infinity in both the past and future.
This seems a bit crazy, infinite universes, what is happening here? Notice that if you are
an observer falling into the black hole from far away r+ is just like the Schwarzschild horizon.
Figure 8: The Reissner–Nordström solution in (U − , V − ) coordinates.
At this radius r switches from being a spacelike coordinate to a timelike one and therefore
you necessarily move in the direction of decreasing r. Witnesses outside the black hole see
the same phenomena that they would for the Schwarzschild solution, the infalling observer is
seen to move more and more slowly and is increasingly redshifted.
The inevitable fall from r+ to ever-decreasing radii only lasts until you reach the null
surface at r = r− where r switches from being a timelike coordinate back to being spacelike.
You need not continue travelling on a trajectory of decreasing r and therefore your inevitable
doom of hitting the singularity can be stopped. Indeed r = 0 is a timelike line and is
therefore not necessarily in your future.
At this point you can continue on to r = 0 or begin to move in the direction of increasing
r back through the null surface at r = r− . Then r will once again be a timelike coordinate,
however now the orientation is reversed and you must travel in the direction of increasing r
until you are spat out of the event horizon at r = r+ . This is like emerging from a white hole
into the rest of the universe. From here you can choose to go back into the black hole, this
time a different one to the one you initially entered. You may then repeat this to your heart’s
content.
Figure 9: The regions I’, II’, IV’ of the Reissner–Nordström solution.
How much of this story is actually science rather than science fiction? Well, not much. Viewing
the universe from the point of view of an observer inside the black hole who is about to cross the event
horizon at r = r− you notice that the observer can look back in time to see the entire history
of the external universe, at least as seen from the black hole. They see this infinitely long
history in a finite proper time thus any signal that gets to them as they approach r = r− is
infinitely blue-shifted. Therefore it is likely that any non-spherically symmetric perturbation
that comes into an RN black hole will violently disturb the geometry. For this reason it is
difficult to say exactly what the actual geometry inside the horizon looks like, but there is no
good reason why it must contain an infinite number of asymptotically flat regions connecting
to each other via various wormholes.
Figure 10: The Penrose diagram of the RN black hole.
5.2.3 Extremal RN: M 2 = e2
Finally let us consider the extremal RN solution, when the two roots become equal and we obtain a
double root. The metric of the extremal RN solution is
\[ ds^2 = -\left(1 - \frac{M}{r}\right)^2 dt^2 + \left(1 - \frac{M}{r}\right)^{-2} dr^2 + r^2 ds^2(S^2) , \tag{5.31} \]
ρ=r−M (5.32)
where
\[ H(\rho) = 1 + \frac{M}{\rho} . \tag{5.34} \]
Since the bracketed part of the metric is just the metric on R3 we may rewrite the metric as
\[ ds^2 = -H(\vec{x})^{-2} dt^2 + H(\vec{x})^2 \left[ dx^2 + dy^2 + dz^2 \right] , \tag{5.35} \]
Figure 11: The Penrose diagram of the extremal RN black hole.
with
\[ H(\vec{x}) = 1 + \frac{M}{|\vec{x}|} . \tag{5.36} \]
In the original components the electric field of the extremal solution can be expressed in terms
of a vector potential A as
\[ F_{rt} = \frac{Q}{r^2} = \partial_r A_0 , \qquad A_0 = -\frac{Q}{r} . \tag{5.37} \]
We may rewrite this as
\[ A_0 = H^{-1} - 1 . \tag{5.38} \]
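The change of variables ρ = r − M can be checked numerically. The sketch below assumes a purely electric extremal solution with Q = M (so that e² = Q²), and verifies that (1 − M/r)² = H(ρ)⁻², that the area radius is r = ρH(ρ), and that (5.37) and (5.38) agree. The helper names are ours.

```python
def H(rho, M):
    """Harmonic function H(rho) = 1 + M / rho from (5.34)."""
    return 1.0 + M / rho

def check_extremal(r, M):
    """Return three differences that should all vanish for r > M."""
    rho = r - M                        # (5.32)
    f = (1.0 - M / r) ** 2             # -g_tt in (5.31)
    a0_original = -M / r               # (5.37), assuming Q = M
    a0_isotropic = 1.0 / H(rho, M) - 1.0   # (5.38)
    return (f - H(rho, M) ** -2,       # g_tt matches -H^{-2}
            r - rho * H(rho, M),       # area radius r = rho * H
            a0_original - a0_isotropic)
```

For example, check_extremal(3.5, 1.2) should return three numbers that vanish up to floating point rounding.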
We can now forget that H takes the form above and just plug the metric into the field
equations and we find that we have a solution provided
∇2 H = 0 , (5.39)
with ∇2 the Laplacian on R3 . It is straightforward to write down all solutions that are well
behaved at infinity, they take the form
\[ H = 1 + \sum_{a=1}^{N} \frac{M_a}{|\vec{x} - \vec{x}_a|} , \tag{5.40} \]
for some set of N spatial points ⃗xa . These are the locations of the N extremal RN black holes
with masses Ma and electric charges Qa = Ma .
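To see that the multi-centre function (5.40) does solve (5.39) away from the centres, one can evaluate a finite-difference Laplacian. This is only a numerical sketch: the centres, masses and step size below are arbitrary illustrative values.

```python
import math

def multi_centre_H(x, y, z, centres):
    """H from (5.40); centres is a list of (M_a, (x_a, y_a, z_a))."""
    total = 1.0
    for mass, (xa, ya, za) in centres:
        total += mass / math.sqrt((x - xa) ** 2 + (y - ya) ** 2 + (z - za) ** 2)
    return total

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference Laplacian on R^3."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / h ** 2

# Three extremal black holes at arbitrary positions (illustrative values).
centres = [(1.0, (0.0, 0.0, 0.0)), (0.5, (2.0, 0.0, 0.0)), (2.0, (0.0, -3.0, 1.0))]
residual = laplacian(lambda x, y, z: multi_centre_H(x, y, z, centres), 1.0, 1.5, -0.5)
```

The residual should be zero up to discretisation and rounding error at any point that is not one of the centres.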
We now want to see how to compute the electric and magnetic charges of the solution and
check that they do indeed agree with the parameters Q and P in the RN solution. Consider
Maxwell’s equation in the presence of a current density J:
d ⋆ F = −4π ⋆ J , dF = 0 . (5.41)
and assuming Σ has boundary ∂Σ, Stokes’ theorem gives
\[ Q = \frac{1}{4\pi} \int_{\partial\Sigma} \star F . \tag{5.44} \]
This is the analogue of Gauss’ law $Q \sim \int \vec{E} \cdot d\vec{S}$.
Consider Minkowski spacetime in spherical polar coordinates and choose the orientation
so that the volume form is
Take Σ to be the surface at fixed t = 0.¹⁴ We may view Σ as the boundary of the region t ≤ 0;
then Stokes’ theorem fixes the orientation of Σ as r² sin θ dr ∧ dθ ∧ dϕ. Let Σ_R be the region of
Σ with r ≤ R; its boundary is then the two-sphere of radius R, S²_R. Stokes’ theorem
fixes the orientation of the two-sphere to be dθ ∧ dϕ. Consider the Coulomb potential
\[ A = -\frac{q}{r}\, dt , \qquad F = -\frac{q}{r^2}\, dt \wedge dr . \tag{5.46} \]
Taking the Hodge dual gives
\[ \star F = q \sin\theta\, d\theta \wedge d\phi , \tag{5.47} \]
It remains to be seen why we call this a conserved charge. Consider two spacelike surfaces
Σ1 and Σ2. Consider the cylindrical region V which is bounded by Σ1 and Σ2, and large
enough to contain all of the sources, see figure 12. From this latter condition it follows that
J = 0 on the boundaries and outside V . We then have
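\[
\begin{aligned}
0 &= \int_V d \star J \\
&= \int_{\partial V} \star J \\
&= \int_{\Sigma_1} \star J - \int_{\Sigma_2} \star J \\
&= \frac{1}{4\pi} \int_{\partial\Sigma_1} \star F - \frac{1}{4\pi} \int_{\partial\Sigma_2} \star F \\
&= Q[\Sigma_1] - Q[\Sigma_2] .
\end{aligned}
\tag{5.50}
\]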
Figure 12: The region V bounded by the two spacelike hypersurfaces Σi and containing all
the sources.
We can also define magnetic charges similarly. Since they are already defined on the
spacelike hypersurface we just need to integrate F , as opposed to ⋆F . A similar argument
for showing that it is conserved holds as well.
Let us do this for the RN black hole. We have
\[ F = \frac{Q}{r^2}\, dr \wedge dt + P \sin\theta\, d\theta \wedge d\phi . \tag{5.51} \]
The magnetic charge is defined to be
\[ P[S^2] = \frac{1}{4\pi} \int_{S^2} F = \frac{P}{4\pi} \int_{S^2} \mathrm{dvol}(S^2) = P . \tag{5.52} \]
We find that indeed the parameters Q and P are the electric and magnetic charges.
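Both charge integrals reduce to integrating a multiple of sin θ dθ ∧ dϕ over the two-sphere and dividing by 4π. The sketch below does this with a simple midpoint rule; it applies equally to ⋆F = q sin θ dθ ∧ dϕ from (5.47) and to the magnetic part P sin θ dθ ∧ dϕ of (5.51). The function name and the number of grid points are illustrative choices.

```python
import math

def sphere_charge(strength, n_theta=2000):
    """(1/4 pi) * integral over S^2 of strength * sin(theta) dtheta dphi.

    The phi integral is trivial and gives 2 pi; the theta integral is
    approximated with the midpoint rule on n_theta cells.
    """
    d_theta = math.pi / n_theta
    theta_integral = sum(
        math.sin((i + 0.5) * d_theta) for i in range(n_theta)
    ) * d_theta
    return strength * theta_integral * 2.0 * math.pi / (4.0 * math.pi)
```

The result should reproduce the input strength, q or P, up to the quadrature error, which is the statement that these parameters are the charges.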
All the solutions we have seen so far have been static and spherically symmetric, though these
are nice testing grounds for us to learn things from they are not likely to be objects that we
will see in our universe. Observational evidence seems to suggest that black holes should
rotate. Our goal in this section is to study rotating black holes.
Since the black holes are rotating we must give up our spherical symmetry, they can
however be axisymmetric: symmetric under rotations about an axis. Moreover we must give
up our metric being static, and reduce to the weaker stationary class of metric. This follows
since if we were to run time in the opposite direction we must see rotation in the opposite
direction, clearly this cannot be static, we should then impose the weaker stationary condition.
These generalisations make the metric a lot more complicated. Although the Schwarzschild
solution and Reissner–Nordström solutions were discovered shortly after general relativity
was invented, the metric we will study, known as the Kerr(–Newman) metric was first found
in 1963. Kerr originally found the rotating metric without any charges, and it was later extended
by Newman to include charges.
6.1 The Kerr–Newman solution
At large r the above coordinates reduce to the spherical polar coordinates in Minkowski
spacetime, θ, ϕ have the usual interpretation as angles on S 2 , so we have 0 < θ < π and
ϕ ∈ [0, 2π]. This depends on 4 parameters, a, M , Q and P . You may guess that M is the
mass, Q the electric charge, P the magnetic charge and a related to the angular momentum.
We will show how to compute the angular momentum and mass soon, but for the moment
let us just give the result. The parameter a is the angular momentum per unit mass,
\[ a = \frac{J}{M} , \tag{6.3} \]
with J the Komar angular momentum.
Note that the metric can be rearranged into the form
Since all of the essential phenomena persist in the absence of charge we will set Q = P = 0
in the remainder of this section. If we set a → 0 the metric reduces to the Schwarzschild
solution. If instead we keep a fixed but set M → 0 then we recover flat space, but in funky
coordinates:
\[ ds^2 = -dt^2 + \frac{r^2 + a^2\cos^2\theta}{r^2 + a^2}\, dr^2 + (r^2 + a^2\cos^2\theta)\, d\theta^2 + (r^2 + a^2) \sin^2\theta\, d\phi^2 . \tag{6.5} \]
The spatial part of the metric is flat three-dimensional space written in ellipsoidal coordinates,
see figure 13. They are related to Cartesian coordinates in three-dimensional space by
\[
\begin{aligned}
x &= \sqrt{r^2 + a^2}\, \sin\theta \cos\phi , \\
y &= \sqrt{r^2 + a^2}\, \sin\theta \sin\phi , \\
z &= r \cos\theta .
\end{aligned}
\tag{6.6}
\]
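That the map (6.6) really turns the flat metric dx² + dy² + dz² into the spatial part of (6.5) can be checked numerically. The following sketch computes the Jacobian by central differences and forms the pulled-back metric; the names and step size are illustrative assumptions.

```python
import math

def cartesian(r, theta, phi, a):
    """Ellipsoidal coordinates (r, theta, phi) to Cartesian (x, y, z), as in (6.6)."""
    s = math.sqrt(r * r + a * a)
    return (s * math.sin(theta) * math.cos(phi),
            s * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def spatial_metric(r, theta, phi, a, h=1e-5):
    """Pull back the flat metric; returns a 3x3 list in (r, theta, phi) order."""
    point = [r, theta, phi]
    jac = []
    for i in range(3):
        up = list(point)
        dn = list(point)
        up[i] += h
        dn[i] -= h
        xu = cartesian(up[0], up[1], up[2], a)
        xd = cartesian(dn[0], dn[1], dn[2], a)
        # Row i holds the derivatives of (x, y, z) with respect to coordinate i.
        jac.append([(u - d) / (2 * h) for u, d in zip(xu, xd)])
    return [[sum(jac[i][k] * jac[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]
```

The diagonal entries should match the coefficients of dr², dθ² and dϕ² in (6.5), and the off-diagonal entries should vanish up to finite-difference error.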
Figure 13: The structure of the ellipsoidal coordinates of the Kerr metric. The region r = 0
is a two-dimensional disc of diameter 2a.
There are two Killing vectors of the metric both of which are manifest since the metric
is independent of both t and ϕ. Both K = ∂t and R = ∂ϕ are Killing vectors. K is not
orthogonal to t =constant hypersurfaces and hence the metric is stationary and not static.
This makes sense since the black hole is rotating, so not static, but it is spinning in exactly
the same way at all times so it is stationary. R expresses the axial symmetry of the solution,
we have a symmetry rotating the solution around the axis of rotation.
Besides the Killing vectors the Kerr metric also has a Killing tensor. A Killing tensor is
any symmetric (0, n) tensor σµ1 ...µn satisfying
where
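\[ l^\mu = \frac{1}{\Delta} \left( r^2 + a^2, \Delta, 0, a \right) , \qquad n^\mu = \frac{1}{2\rho^2} \left( r^2 + a^2, -\Delta, 0, a \right) , \tag{6.9} \]
\[ l^\mu l_\mu = 0 , \qquad n^\mu n_\mu = 0 , \qquad l^\mu n_\mu = -1 . \tag{6.10} \]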
The coordinates have been chosen so that the event horizons occur at those fixed values
of r for which g rr = 0. Since g rr = ∆/ρ2 we have zeroes when
∆(r) = r2 − 2M r + a2 = 0 . (6.11)
The solutions with M 2 − a2 < 0 describe a naked singularity, and the M 2 = a2 solution is
unstable, so let us assume that M² > a² from now on. The metric is also singular at θ = 0, π
but these are just coordinate singularities of spherical polars so we will ignore these. There
is also a singularity at ρ² = 0 when r = 0 and θ = π/2.
Let us show that r = r+ is just a coordinate singularity. To do this we define Kerr
coordinates (v, r, θ, χ) for r > r+ by
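\[ dv = dt + \frac{r^2 + a^2}{\Delta(r)}\, dr , \qquad d\chi = d\phi + \frac{a}{\Delta(r)}\, dr . \tag{6.13} \]
In the new coordinates we have χ ∼ χ + 2π and the Killing vectors are
\[ K = \frac{\partial}{\partial v} , \qquad R = \frac{\partial}{\partial\chi} . \tag{6.14} \]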
The new metric in these coordinates is
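\[
\begin{aligned}
ds^2 = {} & -\frac{\Delta(r) - a^2 \sin^2\theta}{\rho^2}\, dv^2 + 2\, dv\, dr - \frac{2a \sin^2\theta\, (r^2 + a^2 - \Delta(r))}{\rho^2}\, dv\, d\chi \\
& - 2a \sin^2\theta\, d\chi\, dr + \frac{(r^2 + a^2)^2 - \Delta(r)\, a^2 \sin^2\theta}{\rho^2} \sin^2\theta\, d\chi^2 + \rho^2 d\theta^2 .
\end{aligned}
\tag{6.15}
\]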
This change of coordinates shows that the metric is non-degenerate at r = r+ . We can
analytically continue through the surface r = r+ into a new region with 0 < r < r+ .
The surface r = r+ is a null hypersurface with normal
\[ \xi^\mu = K^\mu + \Omega_H R^\mu , \tag{6.16} \]
with
\[ \Omega_H = \frac{a}{r_+^2 + a^2} . \tag{6.17} \]
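Note that the one-form
\[ \xi = \frac{\rho^2(r, \theta)}{r^2 + a^2}\, dr , \tag{6.18} \]
vanishes on vectors tangent to the r = r+ surface and is therefore normal to this hypersurface. The dual vector
is
\[ \xi = \partial_v + \frac{\Delta(r)}{r^2 + a^2}\, \partial_r + \frac{a}{r^2 + a^2}\, \partial_\chi , \tag{6.19} \]
which agrees with ξ above on the horizon where ∆(r+ ) = 0. The norm of ξ is
\[ \xi^\mu \xi_\mu = \frac{\rho^2(r, \theta)\, \Delta(r)}{(r^2 + a^2)^2} , \tag{6.20} \]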
which clearly vanishes at r = r+ and therefore the vector ξ is a null Killing vector on r = r+ .
The region r ≤ r+ is part of the black hole region of this spacetime, and r = r+ is the future
event horizon H+ . In Boyer–Lindquist coordinates the Killing vector is
\[ \xi = \frac{\partial}{\partial t} + \Omega_H \frac{\partial}{\partial\phi} . \tag{6.21} \]
Observe that ξ µ ∂µ (ϕ − ΩH t) = 0 and therefore ϕ = ΩH t + const on integral curves of ξ µ .
Conversely integral curves of K have ϕ = const. We see that particles moving on orbits of ξ
rotate with angular velocity ΩH with respect to a stationary observer (someone on an orbit of
K). In particular they rotate with this angular velocity with respect to a stationary observer
at infinity. Since ξ is tangent to the generators of H+ , then these generators rotate with
angular velocity ΩH with respect to a stationary observer at infinity so we can interpret ΩH
as the angular velocity of the black hole.
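As a numerical illustration, we can read off g_vv, g_vχ and g_χχ from (6.15) and check that ξ = K + Ω_H R is null on r = r+ for every θ. The sketch takes r+ = M + √(M² − a²), the larger root of (6.11); the function names are hypothetical.

```python
import math

def kerr_coordinate_metric(r, theta, M, a):
    """Components (g_vv, g_vchi, g_chichi) read off from (6.15)."""
    delta = r * r - 2 * M * r + a * a
    rho2 = r * r + a * a * math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    g_vv = -(delta - a * a * s2) / rho2
    # The dv dchi coefficient in (6.15) equals 2 * g_vchi.
    g_vchi = -a * s2 * (r * r + a * a - delta) / rho2
    g_chichi = ((r * r + a * a) ** 2 - delta * a * a * s2) * s2 / rho2
    return g_vv, g_vchi, g_chichi

def xi_norm(r, theta, M, a):
    """Norm of K + Omega_H R, with Omega_H from (6.17)."""
    r_plus = M + math.sqrt(M * M - a * a)
    omega_h = a / (r_plus ** 2 + a * a)
    g_vv, g_vchi, g_chichi = kerr_coordinate_metric(r, theta, M, a)
    return g_vv + 2 * omega_h * g_vchi + omega_h ** 2 * g_chichi
```

Evaluating xi_norm at r = r+ should give zero for any θ, whereas K alone is not null there away from the poles.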
6.3 Komar Integrals
In the above we have claimed that the Kerr black hole is rotating and has angular momentum
J = aM , we would like to back up this claim. This relies on us defining a Komar integral,
which is essentially a charge associated to a Killing vector.
We have seen that we can define conserved electric and magnetic charges given a gauge
field, one can understand the need for a charge associated to a Killing vector by playing
a little game with Kaluza–Klein reduction.
Consider Einstein gravity in five-dimensions without a cosmological constant. Let us
take an ansatz for the metric of the form
\[ ds^2 = \phi^2(x) \left( d\psi + A_\mu dx^\mu \right)^2 + g_{\mu\nu} dx^\mu dx^\nu , \tag{6.22} \]
where ∂ψ is a Killing vector and the one-form A is defined only on the base with coor-
dinates x. Note that gauge transformations are just coordinate transformations in this
formalism.
We can now plug this into the five-dimensional vacuum Einstein equations. One
finds that there are three conditions one must impose in order for the metric to satisfy
the five-dimensional Einstein vacuum equations:
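\[
\begin{aligned}
\Box\phi &= \frac{1}{4} \phi^3 F^{\mu\nu} F_{\mu\nu} , \\
\nabla_\mu \left( \phi^3 F^{\mu\nu} \right) &= 0 , \\
R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R &= \frac{1}{2} \phi^2 \left( F_{\mu\rho} F_\nu{}^\rho - \frac{1}{4} g_{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} \right) + \frac{1}{\phi} \left( \nabla_\mu \nabla_\nu \phi - g_{\mu\nu} \Box\phi \right) ,
\end{aligned}
\tag{6.23}
\]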
where Fµν = ∂µ Aν − ∂ν Aµ and everything is a four-dimensional object defined by the
metric gµν . For a constant ϕ we can see the Maxwell equation and Einstein equation of
the Einstein–Maxwell theory, of course setting ϕ =constant imposes a non-trivial relation
on the F but let us forget about this for the moment.
We see that if 5-dimensional spacetime has a circle which is small, then we see a
four-dimensional spacetime which is Einstein gravity plus electromagnetism. Now we
know that in the four-dimensional theory we can define electric (and magnetic) charges,
but there should be some remnant of these electric charges in the five-dimensional theory.
In the five-dimensional theory it must enter through the gauge field A and therefore it is
connected to the Killing vector ∂ψ : there must be a way of defining a conserved charge to
a Killing vector which is the analogue of the electric charge in the dimensionally reduced
theory.
Let k be a Killing vector, recall that this implies that ∇(µ kν) = 0, and therefore ∇µ kν is
anti-symmetric. We can define the two-form
\[ K_{\mu\nu} = \nabla_\mu k_\nu , \qquad K = dk , \tag{6.24} \]
where we have abused notation to write k for the form and also the vector. For any vector X
recall that we have
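\[ (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu) X^\sigma = R^\sigma{}_{\rho\mu\nu} X^\rho . \tag{6.25} \]
Let us use this with the Killing vector and contract the σ and µ indices, we have
\[
\begin{aligned}
\nabla_\mu \nabla_\nu k^\mu - \nabla_\nu \nabla_\mu k^\mu &= R_{\rho\nu} k^\rho \\
&= \nabla_\mu \nabla_\nu k^\mu \\
&= \nabla_\mu K_\nu{}^\mu ,
\end{aligned}
\tag{6.26}
\]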
This should look reminiscent of how we defined electric charges in the previous section. We
may rewrite the above using Einstein’s equations:
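\[ R_{\mu\nu} = 8\pi G_N \left( T_{\mu\nu} - \frac{1}{2} T^\rho{}_\rho\, g_{\mu\nu} \right) , \tag{6.29} \]
to find the current
\[ J_\mu = 2 \left( T_{\mu\nu} - \frac{1}{2} T^\rho{}_\rho\, g_{\mu\nu} \right) k^\nu . \tag{6.30} \]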
Thus d ⋆ J = 0. In analogy to how we defined a charge in electromagnetism, on a spatial
hypersurface Σ, we may define the conserved charge
\[ Q_k(B) = -\int_\Sigma \star J = \frac{1}{8\pi} \int_\Sigma d \star dk = \frac{1}{8\pi} \int_{\partial\Sigma} \star dk \tag{6.31} \]
We define the charge to be taken at asymptotic infinity.
Definition: Komar mass
Let Σ be a spacelike hypersurface with boundary Sr2 in an asymptotically flat stationary
spacetime, with time-like Killing vector k. The Komar mass (or Komar energy) is
\[ M_{\text{Komar}} = -\frac{1}{8\pi} \lim_{r\to\infty} \int_{S^2_r} \star dk . \tag{6.32} \]
This is a measure of the total energy of the spacetime. This energy comes from both
matter and the gravitational field. You have seen in GR1 exercises that even when computing
the Komar mass for the Schwarzschild solution, which is a vacuum solution with no matter,
we find a non-zero Komar mass which is equal to M .
Since the only property of k we used was that it is a Killing vector we can also define the
angular momentum.
Definition: Komar angular momentum
Let Σ be a spacelike hypersurface with boundary Sr2 in an asymptotically flat stationary
spacetime with axisymmetric Killing vector k. Then the Komar angular momentum is
\[ J_{\text{Komar}} = \frac{1}{16\pi} \lim_{r\to\infty} \int_{S^2_r} \star dk . \tag{6.33} \]
The Kerr coordinates are analogous to the ingoing Eddington–Finkelstein coordinates that we
used for the Reissner–Nordström solution. One can similarly define retarded EF coordinates
and study the white hole region, before constructing Kruskal like coordinates which cover the
various regions of the metric.
Just as for the RN solution, the spacetime can be extended across the null hypersurfaces
at r = r− in regions II and III. The resulting maximal extension is similar to that of RN
except for the behaviour near the singularity. There is no longer a singularity at r = 0
but rather at ρ = 0, this is where the Kretschmann invariant Rµνρσ Rµνρσ diverges. Since
ρ2 = r2 + a2 cos2 θ is the sum of two manifestly non-negative quantities it can only vanish
when both vanish, this is then at
\[ r = 0 , \qquad \theta = \frac{\pi}{2} . \tag{6.34} \]
For fixed v, r, θ the metric is
This then defines a disc parametrised by θ and χ. When we also take θ = π/2 we end up
with the metric ds² = a² dχ² which is the metric on a ring of radius a. Therefore in the
Kerr metric, the curvature singularity has the structure of a ring. The rotation has softened
the Schwarzschild singularity, spreading it out over a ring. If you travel toward r = 0 from
any angle other than θ = π/2 you will not encounter the singularity and will instead
pass through and enter a new asymptotically flat region. This is not an identical copy of the
spacetime you came from though, instead it is described by the Kerr metric with r < 0. As
a result ∆ never vanishes and there are no horizons in this space.
This spacetime with r < 0 has an unusual feature. One finds that R = ∂ϕ becomes
time-like near the singularity, the metric at fixed t, r < 0 and θ = π/2 is
\[ ds^2 = \left( r^2 + a^2 + \frac{2Ma^2}{r} \right) d\chi^2 , \tag{6.37} \]
which close enough to the singularity is negative. Since χ is 2π-periodic we end up with closed
timelike curves. You may sometimes hear these referred to as time-machines. It is a curve
that is everywhere timelike and that eventually returns to where it started in spacetime. You
can then travel on this CTC and meet yourself in the past!
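To get a feel for where the closed timelike curves appear, one can evaluate the coefficient of dχ² in (6.37) for r < 0. It changes sign where r³ + a²r + 2Ma² = 0, which has a single negative root. The sketch below locates that root by bisection; the bracketing interval and the function names are illustrative assumptions.

```python
def g_chichi_equator(r, M, a):
    """Coefficient of dchi^2 in (6.37), valid at theta = pi/2 and r != 0."""
    return r * r + a * a + 2.0 * M * a * a / r

def ctc_onset_radius(M, a, lo=-100.0, hi=-1e-12, steps=200):
    """Bisect for the negative root of r^3 + a^2 r + 2 M a^2 = 0.

    The cubic is monotonically increasing, negative at lo and positive at hi
    (for moderate M and a), so the root is unique in (lo, hi).
    """
    def cubic(r):
        return r ** 3 + a * a * r + 2.0 * M * a * a
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For r between this root and 0 the coefficient is negative, so the orbits of ∂χ are timelike and closed; for more negative r it is positive and the orbits are spacelike as usual.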
This region is unphysical. Much like in the case of sub-extremal RN the inner horizon
at r = r− becomes a curvature singularity in the presence of the smallest perturbations to
the Kerr metric: at the inner horizon perturbations are infinitely blueshifted, which leads to
divergences in the curvature scalars.
When we considered Schwarzschild we saw that it describes the metric outside a spherical
star. This was a consequence of Birkhoff’s theorem. In contrast the Kerr solution does not
describe the spacetime outside a rotating star. This solution is expected to describe only
the final state of gravitational collapse. One can’t obtain a solution describing gravitational
collapse to form a Kerr black hole by simply gluing in a ball of collapsing matter as was
possible for Schwarzschild. Additionally, the spacetime during collapse would not even be
stationary as gravitational waves must be emitted.
Theorem (Carter 1971, Robinson 1975)
If (M, g) is a stationary, axisymmetric, asymptotically flat vacuum spacetime suitably
regular on, and outside a connected event horizon then (M, g) is a member of the 2-parameter
Kerr family of solutions. The parameters are the angular momentum and mass.
This result says that the final state of gravitational collapse is generically a Kerr black
hole and is fully characterised by just 2 numbers. In contrast the initial state can be arbitrarily
complicated. Nearly all information about the initial state is lost during gravitational collapse:
either by radiation to infinity, or by falling into the black hole, and just 2 numbers are required
to describe the final state on and outside the event horizon. There is an extension of this
theorem for the 4-parameter Kerr–Newman solution.
Drawing the Penrose diagram is now more difficult because the metric is no longer
spherically symmetric. Since the curvature singularity will appear only for θ = π/2 the Penrose
diagram will look different for θ ≠ π/2 and θ = π/2. To represent both cases it is customary
to draw a Penrose diagram that is an amalgamation of the Penrose diagram for an observer
falling in from the north pole and along the equatorial plane at fixed χ. Notice that χ = const
means that ϕ is not constant so the observer falling in at θ = π/2 rotates about the polar axis.
See figure 14 for the Penrose diagram.
6.5 Ergosphere and Penrose process (or how to steal energy from a black hole)
By definition a black hole is a region of space from which neither matter nor light can escape. It
may come as a surprise that we can extract energy from a black hole if it has an ergosphere.
The norm of the Killing vector K is
\[ K^\mu K_\mu = -\frac{1}{\rho^2} \left( \Delta - a^2 \sin^2\theta \right) , \tag{6.38} \]
which we see does not vanish on the outer horizon: there K is spacelike, except at the north
and south poles at θ = 0, π where it is null. The locus of points where K µ Kµ = 0 is called the stationary
limit surface and is given by
\[ (r - M)^2 = M^2 - a^2 \cos^2\theta , \tag{6.39} \]
whereas the outer event horizon is at
\[ (r_+ - M)^2 = M^2 - a^2 . \tag{6.40} \]
Thus there is a region between these two surfaces, called the ergosphere, where $K$
is spacelike, see figure 15. Since $\partial_t$ is not timelike in the ergosphere, one cannot
travel along its integral curves and so cannot remain stationary with respect to observers at
infinity: a stationary observer is someone whose 4-velocity is parallel to $K$, and this is
impossible where $K$ is spacelike. Recall that a timelike worldline must satisfy
$g_{\mu\nu}\dot{x}^\mu \dot{x}^\nu = -1$. Inside the ergosphere every term in this sum is non-negative
except the one involving $g_{t\phi}$, and therefore $\dot\phi \neq 0$: the observer must rotate.
Since $\dot t > 0$ for a future-directed worldline, we must have $\dot\phi > 0$, and therefore the timelike
worldline is dragged around in the direction of the rotation of the black hole. This effect is
an example of frame dragging.
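The form (6.39) of the stationary limit surface can be recovered from (6.38) by completing the square. The sketch below assumes the standard Kerr function $\Delta = r^2 - 2Mr + a^2$, which is not restated in this section.

```latex
\begin{aligned}
0 = \Delta - a^2\sin^2\theta
  &= r^2 - 2Mr + a^2\left(1 - \sin^2\theta\right) \\
  &= (r - M)^2 - M^2 + a^2\cos^2\theta
  \quad\Longrightarrow\quad (r - M)^2 = M^2 - a^2\cos^2\theta .
\end{aligned}
```

Setting $\Delta = 0$ in the same way gives $(r_+ - M)^2 = M^2 - a^2$, so the two surfaces touch only at the poles, where $\cos^2\theta = 1$.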
Figure 14: The Penrose diagram for sub-extremal Kerr. There are an infinite number of
copies of the region outside the black hole. The singularity at $r = 0$ only appears for $\theta = \pi/2$
and is absent for other values of $\theta$. The regions beyond the singularity are where we have
CTCs.
Figure 15: The horizon structure around the Kerr solution. The event horizons are null
surfaces past which it is impossible to return to a certain region of space.
The stationary limit surface is timelike everywhere except where it is tangent to the event
horizon at the poles. It represents the place past which it is impossible to be a stationary
observer. The ergosphere between the stationary limit surface and the outer event horizon is
a region which it is possible to enter and leave again, but in which one cannot remain stationary.
We may exploit this to obtain energy from the black hole. Consider a particle with
4-momentum $p^\mu = m\dot{x}^\mu$, with $m$ the rest mass of the particle. Recall that the existence of Killing
vectors implies the existence of conserved quantities along geodesics. We have the two
conserved quantities:
\[
\begin{aligned}
  E &= -K_\mu p^\mu = m\left(1 - \frac{2Mr}{\rho^2}\right)\frac{dt}{d\tau} + \frac{2mMar}{\rho^2}\sin^2\theta\,\frac{d\phi}{d\tau} , \\
  l &= R_\mu p^\mu = -\frac{2mMar}{\rho^2}\sin^2\theta\,\frac{dt}{d\tau} + \frac{m\left[(r^2 + a^2)^2 - \Delta(r)\,a^2\sin^2\theta\right]}{\rho^2}\sin^2\theta\,\frac{d\phi}{d\tau} .
\end{aligned}
\tag{6.41}
\]
These differ slightly from the earlier definitions, where we had energy and angular momentum
per unit mass; here we have multiplied by the mass of the particle. They are of course still
conserved. The minus sign in $E$ is there because at infinity both $K$ and $p$ are timelike, so their
inner product is negative, and we want the energy to be positive.
Let the particle approach a Kerr black hole along a geodesic. The energy of the particle
according to a stationary observer at infinity is conserved along the geodesic. Inside the
ergosphere, since $K$ becomes spacelike, we can imagine particles for which
\[
  E = -K_\mu p^\mu < 0 . \tag{6.42}
\]
It may bother you slightly that there is a particle with negative energy. However, one can
show that all particles have positive energy outside the ergosphere: a particle with negative energy
must either remain in the ergosphere or be accelerated until its energy is positive if it is to escape.
This allows for a way of extracting energy from a rotating black hole. Let us start
far away from the black hole and throw an object into the black hole along a geodesic.
Denoting its 4-momentum by $p_0$, the total energy that we measure is
\[
  E_0 = -p_0^\mu K_\mu , \tag{6.43}
\]
which is conserved. Let the object enter the ergosphere. We arrange for the object to eject a
mass, in a smart way, whilst in the ergosphere. Conservation of momentum gives
p0 = p1 + p2 , (6.44)
with p1 the momentum of the object and p2 the momentum of the ejected mass. Contracting
with the Killing vector K we have the expected relation
E0 = E1 + E2 . (6.45)
If we arrange for E2 < 0 by a clever choice of how the mass is ejected, then we must have
E1 > E0 . Penrose showed that the ejected mass with negative energy must fall into the black
hole, while the object can now escape with more energy than it initially began with. This is
the Penrose process and is a method for extracting energy from a rotating black hole.
So can a rotating black hole be used as an infinite source of energy? There is no such
thing as a free lunch (though cafe pi occasionally has free lunch samples), so the energy must
come from somewhere, and the only candidate is the black hole. The
Penrose process extracts energy from the black hole by decreasing the black hole's angular
momentum. When the mass is ejected we need it to be ejected against the black hole's
rotation. Recall that the event horizon is a Killing horizon for the Killing
vector
\[
  \xi^\mu = K^\mu + \Omega_H R^\mu , \tag{6.46}
\]
which indeed becomes null on the outer event horizon. The statement that the object with
momentum $p_2$ crosses the event horizon moving forward in time is simply that
\[
  p_2^\mu \xi_\mu = -E_2 + \Omega_H l_2 < 0 , \tag{6.47}
\]
that is,
\[
  l_2 < \frac{E_2}{\Omega_H} . \tag{6.48}
\]
Once our object has escaped the ergosphere and the mass has fallen inside the event
horizon, the mass and the angular momentum of the black hole have changed. They are now
the initial values plus the negative contributions from the in-falling mass:
\[
  \delta M = E_2 , \qquad \delta J = l_2 , \tag{6.49}
\]
with $J = Ma$ the angular momentum of the black hole. The inequality (6.48) then translates
into a limit on the amount by which the angular momentum can decrease:
\[
  \delta J < \frac{\delta M}{\Omega_H} . \tag{6.50}
\]
The ideal process would be when we have equality; in this case the mass thrown into the black
hole becomes more and more null (since in this limit we have $p_2^\mu \xi_\mu \to 0$).
A slight curiosity now appears: we can use the Penrose process to reduce the
mass of the black hole, yet there is a non-decreasing quantity, the area of the horizon.
Let us compute the area of the event horizon at $r = r_+$. To do this we look at the induced
metric on the horizon, obtained by setting $t = \text{const}$ and $r = r_+$. The induced metric is
\[
\begin{aligned}
  ds^2(\text{horizon}) &= \gamma_{ij}\,dx^i dx^j = ds^2(dt = 0 ,\, dr = 0 ,\, r = r_+) \\
  &= \frac{(r_+^2 + a^2)^2}{r_+^2 + a^2\cos^2\theta}\,\sin^2\theta\,d\phi^2 + (r_+^2 + a^2\cos^2\theta)\,d\theta^2 .
\end{aligned}
\tag{6.51}
\]
The area of the horizon is then simply
\[
  A = \int d\mathrm{vol}(\text{horizon}) . \tag{6.52}
\]
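A sketch of how the integral (6.52) evaluates, using only the induced metric (6.51). The closed form obtained is the standard Kerr result and agrees with the first line of (6.56).

```latex
\begin{aligned}
\sqrt{\det\gamma} &= \sqrt{\gamma_{\theta\theta}\,\gamma_{\phi\phi}} = (r_+^2 + a^2)\sin\theta , \\
A &= \int_0^{2\pi} d\phi \int_0^{\pi} d\theta\,(r_+^2 + a^2)\sin\theta = 4\pi\,(r_+^2 + a^2) .
\end{aligned}
```

The $\theta$-dependent factors $r_+^2 + a^2\cos^2\theta$ cancel between $\gamma_{\theta\theta}$ and $\gamma_{\phi\phi}$, which is why the area takes such a simple form even though the horizon is not a round sphere.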
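To show that this does not decrease we work with the so-called irreducible mass, defined by
\[
  M_{\text{irreducible}}^2 = \frac{A}{16\pi} . \tag{6.55}
\]
Then we have
\[
\begin{aligned}
  M_{\text{irreducible}}^2 &= \frac{r_+^2 + a^2}{4} \\
  &= \frac{1}{2}\left(M^2 + \sqrt{M^4 - M^2 a^2}\right) \\
  &= \frac{1}{2}\left(M^2 + \sqrt{M^4 - J^2}\right) .
\end{aligned}
\tag{6.56}
\]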
We can differentiate to obtain how $M_{\text{irreducible}}$ is affected by changes in the mass or angular
momentum:
\[
  \delta M_{\text{irreducible}} = \frac{a}{4 M_{\text{irreducible}}\sqrt{M^2 - a^2}}\left(\Omega_H^{-1}\,\delta M - \delta J\right) . \tag{6.57}
\]
We see that the inequality (6.50) becomes
\[
  \delta M_{\text{irreducible}} > 0 . \tag{6.58}
\]
The irreducible mass can never be reduced, hence the name. It follows that the maximum
amount of energy that can be extracted from the black hole is
\[
  \max(E) = M - M_{\text{irreducible}} = M - \frac{1}{\sqrt{2}}\left(M^2 + \sqrt{M^4 - J^2}\right)^{1/2} . \tag{6.59}
\]
The result after a complete extraction of this amount of energy is a Schwarzschild solution
with mass $M_{\text{irreducible}}$. The most efficient process starts with an extremal Kerr black hole,
from which one can extract approximately 29% of its total energy.
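As a check of the quoted efficiency: for an extremal black hole $a = M$, so $J = Ma = M^2$ and the inner square root in (6.59) vanishes.

```latex
\frac{\max(E)}{M}\bigg|_{J = M^2}
  = 1 - \frac{1}{\sqrt{2}}\,\frac{\sqrt{M^2 + \sqrt{M^4 - M^4}}}{M}
  = 1 - \frac{1}{\sqrt{2}} \approx 0.293 .
```

For $J = 0$ the same formula gives $\max(E) = 0$, as expected: a Schwarzschild black hole has no ergosphere and no energy can be extracted this way.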
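The irreducibility of $M_{\text{irreducible}}$ immediately shows that the surface area is non-decreasing.
We have
\[
  \delta A = \frac{8\pi a}{\Omega_H\sqrt{M^2 - a^2}}\left(\delta M - \Omega_H\,\delta J\right) . \tag{6.60}
\]
This may be recast as
\[
  \delta M = \frac{\kappa}{8\pi}\,\delta A + \Omega_H\,\delta J , \tag{6.61}
\]
where $\kappa$ is
\[
  \kappa = \frac{\sqrt{M^2 - a^2}}{2M\left(M + \sqrt{M^2 - a^2}\right)} . \tag{6.62}
\]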
The quantity κ is the surface gravity of the Kerr solution. This is the force that an observer
at infinity would have to exert in order to keep a unit mass at the horizon.
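The step from (6.60) to (6.61) and (6.62) can be sketched as follows. It assumes the standard Kerr horizon angular velocity $\Omega_H = a/(r_+^2 + a^2)$, which is not restated in this section, together with $r_+ = M + \sqrt{M^2 - a^2}$ and $r_+^2 + a^2 = 2Mr_+$ (from $\Delta(r_+) = 0$).

```latex
\begin{aligned}
\delta M &= \frac{\Omega_H\sqrt{M^2 - a^2}}{8\pi a}\,\delta A + \Omega_H\,\delta J
  \quad\Longrightarrow\quad
  \kappa = \frac{\Omega_H\sqrt{M^2 - a^2}}{a} , \\
\kappa &= \frac{\sqrt{M^2 - a^2}}{r_+^2 + a^2}
  = \frac{\sqrt{M^2 - a^2}}{2Mr_+}
  = \frac{\sqrt{M^2 - a^2}}{2M\left(M + \sqrt{M^2 - a^2}\right)} .
\end{aligned}
```

Two limits are worth noting: for $a = 0$ this reduces to the Schwarzschild value $\kappa = 1/4M$, and for the extremal case $a = M$ it gives $\kappa = 0$.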
To every Killing horizon we can associate a quantity called the surface gravity. Given
the Killing horizon we have an associated Killing vector $\xi$ which is null on the horizon.
Since $\xi$ is normal to the Killing horizon, on the horizon it obeys the (non-affinely parametrised) geodesic equation
\[
  \xi^\mu \nabla_\mu \xi^\nu = -\kappa\,\xi^\nu . \tag{6.63}
\]
It turns out that κ is constant over the horizon (we will prove this later).
Equation (6.61) first started people thinking about a correspondence between the
laws of thermodynamics and black holes. The first law of thermodynamics is
\[
  dE = T\,dS - p\,dV , \tag{6.64}
\]
where $T$ is the temperature, $S$ the entropy, $p$ the pressure and $V$ the volume, so that $-p\,dV$ is the
work done on the system. It is then natural to think of the term $\Omega_H\,\delta J$ as the work we do on
the black hole by throwing our mass into it, and to construct the
dictionary
\[
  E \leftrightarrow M , \qquad S \leftrightarrow \frac{A}{4G_N} , \qquad T \leftrightarrow \frac{\kappa}{2\pi} . \tag{6.65}
\]
This observation leads nicely on towards studying black hole thermodynamics. Before
we get there we need to introduce some more formal definitions of what a black hole is and
a little more technology.
Many physical questions can be rephrased as an initial value problem: given the state of a
system at some moment in time, what will be the state of the system at some later time? The
fact that this has a definitive answer is due to causality: future events can be understood
as consequences of initial conditions plus the laws of physics. Initial value problems are as
common in GR as in Newtonian physics or special relativity, however the dynamical nature
of the spacetime background introduces new ways in which an initial value formulation could
break down.
For the moment we will look at the problem of evolving matter fields on a fixed back-
ground spacetime rather than the evolution of the metric. The guiding principle is that no
signals can travel faster than the speed of light; therefore information can only flow along
timelike or null paths, not necessarily geodesics. We will define a causal curve to be any path
which is timelike or null everywhere. Given any subset S of a manifold M , we can define the
causal future of S denoted J + (S) to be the set of points that can be reached from S following
a future directed causal curve. The chronological future I + (S) is the set of points that can be
reached by following a future directed timelike curve. A point p will always be in its own causal
future J+(p) but not necessarily in its own chronological future I+(p), though it could be. The
causal past J − and chronological past I − are defined analogously.
A subset S ⊂ M is called achronal if no two points in S are connected by a time-like
curve. For example any edgeless spacelike hypersurface in Minkowski space is achronal. Given
a closed (the complement is open) achronal set we define the future domain of dependence of
S, D+ (S) to be the set of all points p such that every past moving inextendible causal curve
through p must intersect S. By inextendible we mean that the curve goes on forever and does
not end at some finite point. Elements of S are elements of D+ (S). A similar definition of
the past domain of dependence, D− (S) holds by replacing future with past. We define the
boundary of D+ (S) to be the future Cauchy horizon H + (S) and likewise the boundary of
D− (S) to be the past Cauchy horizon H − (S). They are both null surfaces. We have sketched
this in figure 16.
Figure 16: A depiction of the domains of dependence D+(S) and D−(S) of the set S on the
achronal surface Σ, together with the Cauchy horizons H+(S) and H−(S).
If nothing moves faster than light, signals cannot propagate outside the lightcone of any
point p. Therefore if every curve that remains inside the lightcone must intersect S then
information specified on S should be sufficient to predict what the situation is at p. That is,
initial data for matter fields on S can be used to solve for the matter fields at p. The set
of all points for which we can predict what happens by knowing what happens on S is the
union D(S) = D+(S) ∪ D−(S), called the domain of dependence. A closed achronal surface
Σ is said to be a Cauchy surface if the domain of dependence D(Σ) is the entire manifold.
Information given on the Cauchy surface can be used to predict what happens throughout
all of spacetime. If a spacetime has a Cauchy surface (it need not) it is said to be globally
hyperbolic.
Therefore a globally hyperbolic spacetime is one in which one can predict what happens
everywhere from data on Σ. Minkowski spacetime is an example of a globally hyperbolic
spacetime, as is the Kruskal spacetime. An example of a spacetime that is not globally hyperbolic is
2d Minkowski space with the origin removed. In this case, for any partial Cauchy surface Σ,
there will be some inextendible causal curves which don't intersect Σ because they end at the
origin.
8 Singularity theorem
We have seen that a spherically symmetric gravitational collapse results in the formation of
a singularity. One can ask whether this is an artefact of the spherical symmetry or
something more generic. In Newtonian gravity the collapse of a spherically symmetric ball of
matter produces a singularity with infinite density at the origin; however, a tiny perturbation
of the spherical symmetry does not lead to a singularity but rather to a bouncing solution. One
could ask whether the same is true in GR. Work by Roger Penrose answered this question
and showed that singularities are a generic prediction of general relativity (see footnote 15).
8.1 Singularities
We have seen numerous different types of singularity so far. We have defined a metric singu-
larity to arise in some basis if its components are not smooth or the determinant vanishes. A
coordinate singularity can be eliminated by a change of coordinates, for example r = 2M in
the Schwarzschild spacetime in Schwarzschild coordinates. These singularities are unphysical
and can be removed by a better choice of coordinates. If it is not possible to remove the
singularity by a change of coordinates then we have a physical singularity. A scalar curvature
singularity is a singularity where some scalar constructed from the Riemann tensor blows up.
Not all physical singularities are curvature singularities, however; we have seen one in
problem sheet 2. Consider the manifold $M = \mathbb{R}^2$ and introduce plane polar coordinates $(r, \phi)$
with $\phi \sim \phi + 2\pi$ and define the Riemannian metric
\[
  g = dr^2 + \lambda^2 r^2\,d\phi^2 , \tag{8.1}
\]
with $\lambda > 0$. The metric determinant vanishes at $r = 0$. For $\lambda = 1$ this is just
Euclidean space in polar coordinates, so we can convert to Cartesian coordinates to see that
$r = 0$ is just a coordinate singularity. For $\lambda \neq 1$, however, define $\phi' = \lambda\phi$. Then the metric
is
\[
  g = dr^2 + r^2\,d\phi'^2 , \tag{8.2}
\]
which is locally isometric to Euclidean space and therefore free of curvature singularities.
However, it is not globally isometric to Euclidean space because the period of $\phi'$ is $2\pi\lambda$. Consider
a circle $r = \epsilon$; this has
\[
  \frac{\text{circumference}}{\text{radius}} = \frac{2\pi\lambda\epsilon}{\epsilon} = 2\pi\lambda , \tag{8.3}
\]
which does not tend to $2\pi$ as $\epsilon \to 0$. Recall that any smooth Riemannian manifold is locally
flat, i.e. one recovers results of Euclidean geometry on sufficiently small length scales. The
above shows that this fails for small circles around $r = 0$, and
therefore the metric cannot be extended smoothly to $r = 0$. This is an example of a conical
singularity.
Footnote 15: It is sometimes said that GR predicts its own downfall. This is because GR predicts
singularities but is ill equipped to deal with them. To fully understand them we need a theory of
quantum gravity, which GR is not.
A problem in defining singularities is that they are not places, they do not belong to
the spacetime manifold because we define spacetime as the pair (M, g) where g is a smooth
Lorentzian metric. This is the reason we remove r = 0 from the Kruskal spacetime, the metric
is no longer smooth there. Similarly in the above example if we want a smooth manifold we
should take M = R2 \ {(0, 0)} so that r = 0 is not part of the spacetime M .
In both examples the existence of the singularity implies that some geodesics cannot be
extended to arbitrarily large affine parameter because they end at the singularity. We will
use this property to define what we mean by a singular spacetime.
First eliminate the trivial case where a geodesic ends because we haven’t taken the range
of its parameter to be large enough. A curve is a smooth map γ : (a, b) → M . Sometimes a
curve can be extended, that is it is part of a bigger curve. If this happens then the first curve
will have an endpoint, which is defined as follows.
Definition future endpoint
The point p ∈ M is a future end-point of a future-directed causal curve γ : (a, b) → M if,
for any neighbourhood O of p there exists t0 such that γ(t) ∈ O for all t > t0 . We say that
γ is future inextendible if it has no future endpoint. Similarly for past endpoints and past
inextendibility. The curve γ is inextendible if it is both future and past inextendible.
Example Let (M, g) be Minkowski spacetime and let γ : (−∞, 0) → M be γ(t) =
(t, 0, 0, 0). Then the origin is a future end-point of γ. However if we instead let (M, g) be
Minkowski spacetime with the origin removed then γ is future inextendible.
Definition Complete
A geodesic is complete if an affine parameter for the geodesic extends to ±∞. A spacetime is
geodesically complete if all inextendible causal geodesics are complete.
Example Minkowski spacetime is geodesically complete, as is the spacetime describing a
spherically symmetric star. Kruskal spacetime on the other hand is geodesically incomplete
because some geodesics have r → 0 in finite affine parameter and hence cannot be extended
to infinite affine parameter.
A spacetime which is extendible will also be geodesically incomplete. The incompleteness
in this case is because we are not considering the full spacetime. We will therefore regard a
spacetime as singular if it is geodesically incomplete and inextendible. This is the case for the
Kruskal spacetime and for the Kruskal-type extensions of the RN and Kerr–Newman black holes.
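Proof: Let the hypersurface $\mathcal{N}$ be given by the equation $f = \text{constant}$ for some function $f$ with
$df \neq 0$ on $\mathcal{N}$. Then we have $n = h\,df$ for some function $h$ which does not vanish on $\mathcal{N}$. Let
$N = df$; the integral curves of $n$ and $N$ are the same up to a choice of reparametrisation (see footnote 16).
Since $\mathcal{N}$ is null we have that $N^\mu N_\mu = 0$ on $\mathcal{N}$, which implies that the gradient of this function
is normal to $\mathcal{N}$:
\[
  \nabla_\mu (N^\nu N_\nu)\big|_{\mathcal{N}} = 2\alpha N_\mu , \tag{8.5}
\]
for some function $\alpha$. Since $N = df$ we have $\nabla_\mu N_\nu = \nabla_\nu N_\mu$, so
\[
  N^\nu \nabla_\mu N_\nu = N^\nu \nabla_\nu N_\mu \quad\Rightarrow\quad N^\nu \nabla_\nu N_\mu\big|_{\mathcal{N}} = \alpha N_\mu . \tag{8.6}
\]
This is nothing but the geodesic equation for a non-affinely parametrised geodesic. Hence on
$\mathcal{N}$ the integral curves of $N$, and therefore also of $n$, are null geodesics.
Footnote 16: To see this consider the integral curves $\tilde{x}^\mu(\tilde\lambda)$ of $n^\mu$:
\[
  n^\mu = \frac{d\tilde{x}^\mu(\tilde\lambda)}{d\tilde\lambda} .
\]
The integral curves of $N$ then satisfy
\[
  N^\mu = \frac{dx^\mu}{d\lambda} = h^{-1} n^\mu = h^{-1}\frac{d\tilde{x}^\mu}{d\tilde\lambda} = \left(h^{-1}\frac{d\lambda}{d\tilde\lambda}\right)\frac{d\tilde{x}^\mu}{d\lambda} .
\]
By choosing the parameter $\lambda(\tilde\lambda)$ so that $d\lambda/d\tilde\lambda = h$ we may make the bracket in the last term
become unity, and therefore the integral curves are the same up to a choice of reparametrisation.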
Consider Kruskal spacetime, with metric (1.69). Let $N = dU$; this is null everywhere
(since $g^{UU} = 0$) and is normal to a family of null hypersurfaces defined by $U = \text{constant}$.
Since $N^2 = 0$ everywhere, it follows that $N$ is tangent to affinely parametrised null geodesics.
Raising an index gives
\[
  N^\mu = -\frac{r}{16M^3}\,e^{r/2M}\left(\frac{\partial}{\partial V}\right)^{\mu} . \tag{8.7}
\]
Let $\mathcal{N}$ be the surface $U = 0$. Since $U = 0$ corresponds to $r = 2M$, on $\mathcal{N}$ we have that $N$
is simply a constant multiple of $\partial/\partial V$. Thus $V$ is an affine parameter for the generators of $\mathcal{N}$.
Similarly $U$ is an affine parameter for the generators of the null hypersurface $V = 0$.
Black holes are characterised by the fact that you can enter them but never exit. The most
important feature is therefore not the singularity but rather the event horizon. An event
horizon is a hypersurface separating those spacetime points that are connected to infinity by
a timelike path from those that are not.
We see that S µ points from one geodesic to an infinitesimally nearby one in the family. The
vector S is called the deviation vector or separation vector.
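On the surface Σ we can use $(\sigma, \lambda)$ as coordinates and therefore we have
\[
  S = \frac{\partial}{\partial\sigma} , \qquad X = \frac{\partial}{\partial\lambda} . \tag{8.10}
\]
Hence $S$ and $X$ commute:
\[
  [X, S] = 0 \quad\Leftrightarrow\quad X^\mu \nabla_\mu S^\nu = S^\mu \nabla_\mu X^\nu . \tag{8.11}
\]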
A more comprehensive picture of the behaviour of neighbouring geodesics comes from
considering not just a one-parameter family but an entire congruence of geodesics. Let $U$ be an
open region of $M$. A congruence in $U$ is a set of geodesics such that every point in $U$ lies on
precisely one curve. A geodesic congruence can be thought of as tracing out the paths of a
set of non-interacting particles moving through spacetime along non-intersecting paths. If the
geodesics cross then the congruence comes to an end at that point. Consider a congruence for
which all the geodesics are of the same type (timelike, spacelike or null). We can then arrange
that the tangent vector $X^\mu$ is normalised to $X^\mu X_\mu = \pm 1, 0$ depending on the type.
Let $X^\mu = dx^\mu/d\tau$ and consider a one-parameter family of geodesics belonging to a congruence.
Then $[S, X] = 0$ can be written as
\[
  X^\mu \nabla_\mu S^\nu = B^\nu{}_\mu S^\mu , \qquad B^\nu{}_\mu = \nabla_\mu X^\nu , \tag{8.13}
\]
so $B^\nu{}_\mu$ measures the failure of $S$ to be parallel transported along the geodesic with tangent
$X$. It therefore measures the extent to which neighbouring geodesics deviate from remaining
parallel. Note that since $X$ is tangent to a geodesic and normalised to have constant norm,
we have
\[
  X_\nu B^\nu{}_\mu = B^\nu{}_\mu X^\mu = 0 . \tag{8.14}
\]
Even after fixing the norm $X^\mu X_\mu$ by an appropriate choice of the affine parameter
we still have the freedom to shift the parameter by a constant. We can take the constant to be different
on different geodesics, allowing it to depend on $\sigma$: $\lambda' = \lambda - a(\sigma)$ for some function $a(\sigma)$. This
changes the deviation vector to
\[
  S'^\mu = S^\mu + \frac{da(\sigma)}{d\sigma}\,X^\mu . \tag{8.16}
\]
This is still a deviation vector pointing to the same geodesic as $S^\mu$. Now we have
\[
  X^\mu S'_\mu = X^\mu S_\mu + \frac{da(\sigma)}{d\sigma}\,X^\mu X_\mu , \tag{8.17}
\]
and therefore in the timelike and spacelike cases we may choose $a(\sigma)$ so that $X^\mu S_\mu = 0$ at one
point on each geodesic, which is a sort of "gauge" freedom. Since $X^\mu S_\mu$ is constant along each
geodesic, we then have $X^\mu S_\mu = 0$ everywhere. We can define the projector
\[
  P^\mu{}_\nu = \delta^\mu_\nu - |X|^{-2} X^\mu X_\nu , \qquad |X|^2 \equiv X^\rho X_\rho , \tag{8.18}
\]
which projects the tangent space at a point $p$ onto the subspace of vectors orthogonal to $X$.
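The constancy of $X^\mu S_\mu$ along each geodesic, used above, can be sketched in three lines. The sketch assumes an affinely parametrised geodesic, $X^\mu\nabla_\mu X^\nu = 0$, and uses the commutation relation (8.11) and the fact that $X^\nu X_\nu$ is constant.

```latex
\begin{aligned}
X^\mu \nabla_\mu (X_\nu S^\nu)
  &= (X^\mu \nabla_\mu X_\nu)\,S^\nu + X_\nu\,X^\mu \nabla_\mu S^\nu \\
  &= 0 + X_\nu\,S^\mu \nabla_\mu X^\nu \\
  &= \tfrac{1}{2}\,S^\mu \nabla_\mu (X^\nu X_\nu) = 0 .
\end{aligned}
```

So once $X^\mu S_\mu$ is set to zero at one point of a geodesic it vanishes along the whole geodesic. The same computation applies in the null case, where it shows that $X^\mu S_\mu$ is constant but cannot be shifted to zero.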
Null case This of course does not work for null geodesics, since $X^\mu X_\mu = 0$ and therefore
$X^\mu S'_\mu = X^\mu S_\mu$. We can instead fix the gauge freedom by picking a spacelike hypersurface Σ
which intersects each geodesic once. Let $N^\mu$ be a vector field defined on Σ obeying $N^2 = 0$
and $N^\mu X_\mu = -1$ on Σ. Now we can extend $N$ off Σ by parallel transport along the
geodesics: $X^\mu \nabla_\mu N^\nu = 0$. This implies that $N^2 = 0$ and $N^\mu X_\mu = -1$ everywhere. We have
therefore constructed a vector field such that
\[
  N^2 = 0 , \qquad X^\mu N_\mu = -1 , \qquad X^\mu \nabla_\mu N^\nu = 0 . \tag{8.19}
\]
We can then decompose the deviation vector as
\[
  S^\mu = \alpha X^\mu + \beta N^\mu + \hat{S}^\mu , \tag{8.20}
\]
where
\[
  X^\mu \hat{S}_\mu = N^\mu \hat{S}_\mu = 0 , \tag{8.21}
\]
which implies that $\hat{S}$ is either spacelike or zero. Now $X^\mu S_\mu = -\beta$ and therefore $\beta$ is constant
along each geodesic. Therefore we can write the deviation vector $S$ as the sum of a part
$\alpha X^\mu + \hat{S}^\mu$ orthogonal to $X$ and a part $\beta N^\mu$ that is parallel transported along each geodesic.
An important case is when the congruence contains the generators of a null hypersurface
$\mathcal{N}$ and we are interested only in the behaviour of these generators. In this case, if we pick a
one-parameter family of geodesics contained within $\mathcal{N}$, then the deviation vector $S$ will be
tangent to $\mathcal{N}$ and hence obey $X^\mu S_\mu = 0$, since $X$ is normal to $\mathcal{N}$. In the decomposition above
this is equivalent to $\beta = 0$.
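We can write
\[
  \hat{S}^\mu = \tilde{P}^\mu{}_\nu S^\nu , \tag{8.22}
\]
where
\[
  \tilde{P}^\mu{}_\nu = \delta^\mu_\nu + N^\mu X_\nu + X^\mu N_\nu \tag{8.23}
\]
is a projection of the tangent space at $p$ onto $T_\perp$, the 2d space of vectors at $p$ orthogonal to $X$ and
$N$. Since both $X$ and $N$ are parallel transported along the geodesics, so is $\tilde{P}$:
\[
  X^\mu \nabla_\mu \tilde{P}^\nu{}_\sigma = 0 . \tag{8.24}
\]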
$B_{\mu\nu}$ is a (0, 2) tensor and so we may decompose it into its anti-symmetric part, symmetric
traceless part, and trace part.
Let us restrict to the null case. Then we may act on $B$ with the projector $\tilde{P}$ as
\[
  \hat{B}^\mu{}_\nu = \tilde{P}^\mu{}_\rho\,B^\rho{}_\sigma\,\tilde{P}^\sigma{}_\nu , \tag{8.25}
\]
the latter shows that it is independent of the vector N and therefore is an intrinsic property of
the congruence. Moreover the scalar invariants of the rotation and shear, for example ωµν ω µν
or the eigenvalues of σ are also independent of the choice of N .
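The definitions of the expansion, shear and rotation referred to here are not reproduced in this part of the notes. As a reminder, the standard decomposition of $\hat{B}$ in the null case is sketched below; it is assumed, not confirmed, to match the conventions and numbering of the notes.

```latex
\theta = \hat{B}^\mu{}_\mu , \qquad
\sigma_{\mu\nu} = \hat{B}_{(\mu\nu)} - \tfrac{1}{2}\,\tilde{P}_{\mu\nu}\,\theta , \qquad
\omega_{\mu\nu} = \hat{B}_{[\mu\nu]} ,
\qquad\Longrightarrow\qquad
\hat{B}^\mu{}_\nu = \tfrac{1}{2}\,\theta\,\tilde{P}^\mu{}_\nu + \sigma^\mu{}_\nu + \omega^\mu{}_\nu .
```

The factor $\tfrac{1}{2}$ reflects the fact that $T_\perp$ is two-dimensional, so that the projector onto it has trace 2 and $\sigma$ is traceless.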
Proposition If the congruence contains the generators of a (null) hypersurface N then
ωµν = 0 on N . Conversely if ωµν = 0 everywhere then X is everywhere hypersurface orthog-
onal, that is orthogonal to a family of null hypersurfaces.
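Proof: The definition of $\hat{B}$ together with $X_\mu B^\mu{}_\nu = 0 = B^\mu{}_\nu X^\nu$ implies
\[
  \hat{B}^\mu{}_\nu = B^\mu{}_\nu + X^\mu N_\rho B^\rho{}_\nu + X_\nu B^\mu{}_\rho N^\rho + X^\mu X_\nu N_\rho B^\rho{}_\tau N^\tau . \tag{8.29}
\]
Hence
\[
  X_{[\mu}\omega_{\nu\rho]} = X_{[\mu}\hat{B}_{\nu\rho]} = X_{[\mu}B_{\nu\rho]} , \tag{8.30}
\]
since the other terms drop out in the anti-symmetrisation. Using the definition of $B$ we have
\[
  X_{[\mu}\omega_{\nu\rho]} = X_{[\mu}\nabla_\rho X_{\nu]} = -\frac{1}{6}\,(X \wedge dX)_{\mu\nu\rho} . \tag{8.31}
\]
If $X$ is normal to $\mathcal{N}$ then $X \wedge dX = 0$ on $\mathcal{N}$. Hence on $\mathcal{N}$
\[
  0 = X_{[\mu}\omega_{\nu\rho]} = \frac{1}{3}\left(X_\mu \omega_{\nu\rho} + X_\nu \omega_{\rho\mu} + X_\rho \omega_{\mu\nu}\right) . \tag{8.32}
\]
Contracting with $N^\mu$ and using $X^\mu N_\mu = -1$ and $\omega_{\mu\nu} N^\nu = 0$ we find $\omega_{\mu\nu} = 0$ on $\mathcal{N}$.
Conversely, if $\omega = 0$ everywhere then (8.31) shows that $X \wedge dX = 0$, so $X$ is hypersurface
orthogonal by Frobenius' theorem.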
Figure 17: A depiction of the shear and expansion for a null hypersurface.
Assume that we have a congruence which includes the generators of a null hypersurface N .
The generators of a null hypersurface have ω = 0. To understand how these generators behave
let us restrict to deviation vectors tangent to N , i.e. a one-parameter family of generators
of N . Consider the evolution of the generators of N as a function of affine parameter λ as
shown in figure 17.
Qualitatively θ corresponds to neighbouring generators moving apart for θ > 0, together
for θ < 0. Shear on the other hand corresponds to geodesics moving apart in one direction
and together in the orthogonal direction whilst preserving the cross-sectional area.
To make this more precise let us introduce Gaussian null coordinates near N as follows.
Pick a spacelike 2-surface S within N and let y i (i = 1, 2) be coordinates on this surface.
Assign coordinates (λ, y i ) to the point affine parameter distance λ from S along the generator
of N with tangent X µ which intersects the surface S at the point with coordinates y i . Now
we have coordinates (λ, y i ) on N such that the generators are lines of constant y i and X = ∂λ .
Let V be a null vector field on N satisfying V · ∂yi = 0 and V · X = 1. Assign coordinates
(r, λ, y i ) to the point affine parameter distance r along the null geodesic which starts at the
point on N with coordinates (λ, y i ) and has tangent vector V there.
This defines a coordinate chart in a neighbourhood of N such that N is at r = 0 with
X = ∂λ on N and ∂r is tangent to affinely parametrised null geodesics. The latter implies
that grr = 0 everywhere.
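At $r = 0$ we have $g_{r\lambda} = X \cdot V = 1$ since $V = \partial_r$ on $\mathcal{N}$, and $g_{ri} = V \cdot \partial_{y^i} = 0$. Since
$g_{r\mu}$ is independent of $r$, these results are valid for all $r$. We also know that $g_{\lambda\lambda} = 0$ at $r = 0$,
since $X$ is null, and $g_{\lambda i} = 0$ at $r = 0$ (since $\partial_{y^i}$ is tangent to $\mathcal{N}$ and hence orthogonal to
$X$). Therefore we can write $g_{\lambda\lambda} = rF$ and $g_{\lambda i} = r h_i$ for some smooth functions $F$ and $h_i$.
Therefore the metric takes the form
\[
  g = 2\,dr\,d\lambda + rF\,d\lambda^2 + 2r h_i\,d\lambda\,dy^i + h_{ij}\,dy^i dy^j . \tag{8.33}
\]
On $\mathcal{N}$ the metric is
\[
  g\big|_{\mathcal{N}} = 2\,dr\,d\lambda + h_{ij}\,dy^i dy^j . \tag{8.34}
\]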
From the form of the metric we see that $\sqrt{h}$, with $h = \det h_{ij}$, is nothing but the area element on a
surface of constant $\lambda$ within $\mathcal{N}$, so $\theta$ measures the rate of increase of this area element with
respect to affine parameter along the geodesics.
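The relation between $\theta$ and the area element can be sketched as follows. The sketch assumes $\theta = \hat{B}^\mu{}_\mu = \nabla_\mu X^\mu$ and works on $\mathcal{N}$ (at $r = 0$), where $X^\mu = \delta^\mu_\lambda$, $X_\mu = \delta^r_\mu$ and the inverse metric has $g^{r\lambda} = 1$, $g^{ij} = h^{ij}$ with all other components vanishing. The conditions $B^\mu{}_\nu X^\nu = 0$ and $X_\mu B^\mu{}_\nu = 0$ then give $\nabla_\lambda X^\lambda = 0$ and $\nabla_r X^r = 0$.

```latex
\begin{aligned}
\theta &= \nabla_\mu X^\mu = \nabla_i X^i
        = \partial_i X^i + \Gamma^i{}_{i\mu} X^\mu
        = \Gamma^i{}_{i\lambda} \\
       &= \tfrac{1}{2}\,h^{ij}\left(\partial_\lambda h_{ij} + \partial_i g_{j\lambda} - \partial_j g_{i\lambda}\right)
        = \tfrac{1}{2}\,h^{ij}\,\partial_\lambda h_{ij}
        = \frac{\partial_\lambda \sqrt{h}}{\sqrt{h}} .
\end{aligned}
```

Here $\partial_i X^i = 0$ and $\partial_i g_{j\lambda} = 0$ because $X^i$ and $g_{j\lambda} = r h_j$ vanish everywhere on $\mathcal{N}$ and $\partial_i$ is tangent to $\mathcal{N}$. The last equality is the usual identity for the derivative of a determinant.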
Consider a 2d spacelike surface S i.e. a 2d submanifold for which all tangent vectors are
spacelike. For any p ∈ S there will be precisely two future directed null vectors X1 and X2
orthogonal to S, up to rescalings. If we assume S is orientable then X1µ and X2µ can be
defined continuously over S. This defines two families of null geodesics which start on S and
are orthogonal to S. These null geodesics form two null hypersurfaces N1 and N2 . In simple
situations these correspond to the set of out-going and in-going light rays that start on S.
Consider a null congruence that contains the generators of Ni . By the proposition above we
will have ωµν = 0 on N1 and N2 .
Example Let S be a two-sphere U = U0 and V = V0 in the Kruskal spacetime. By
symmetry the generators of Ni will be radial null geodesics, see figure 19
and $dV$ respectively. We saw above that $dU$ and $dV$ correspond to affine parametrisation.
Raising an index we find
\[
  X_1 = r e^{r/2M}\,\frac{\partial}{\partial V} , \qquad X_2 = r e^{r/2M}\,\frac{\partial}{\partial U} , \tag{8.38}
\]
where we have discarded an overall constant and fixed the signs so that $X_1$ and $X_2$ are
future-directed. The vectors $\partial/\partial U$ and $\partial/\partial V$ are future-directed because they are globally null and hence
define time-orientations. In region I they both give the same time orientation as the one
defined by $K$.
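We can compute the expansion of these congruences:
\[
  \theta_1 = \nabla_\mu X_1^\mu = \frac{1}{\sqrt{-g}}\,\partial_\mu\!\left(\sqrt{-g}\,X_1^\mu\right)
  = r^{-1} e^{r/2M}\,\partial_V\!\left(r e^{-r/2M}\,r e^{r/2M}\right) = 2 e^{r/2M}\,\partial_V r . \tag{8.39}
\]
The right-hand side can be calculated by using the implicit definition of $U, V$; this gives
\[
  \theta_1 = -\frac{8M^2}{r}\,U , \tag{8.40}
\]
and a similar calculation gives
\[
  \theta_2 = -\frac{8M^2}{r}\,V . \tag{8.41}
\]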
We can now set U = U0 and V = V0 to study the expansion on S of the null geodesics normal
to S. For S in region I we have θ1 > 0 and θ2 < 0 i.e. the outgoing null geodesics normal
to S are expanding and the ingoing geodesics are converging. Similarly in region IV we have
θ2 > 0 and θ1 < 0 so again we have an expanding family and a converging family. However in
region II we have θ1 < 0 and θ2 < 0: both families of geodesics normal to S are converging.
In region III we have θ1 > 0 and θ2 > 0 so both families are expanding.
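The step from (8.39) to (8.40) can be sketched as follows. It assumes that the Kruskal coordinates are defined implicitly by $UV = -\left(\frac{r}{2M} - 1\right)e^{r/2M}$; this convention is not restated in this section but is consistent with (8.7) and with the result (8.40).

```latex
\begin{aligned}
\partial_V(UV) = U &= -\frac{d}{dr}\!\left[\left(\frac{r}{2M} - 1\right)e^{r/2M}\right]\partial_V r
   = -\frac{r}{4M^2}\,e^{r/2M}\,\partial_V r
   \quad\Longrightarrow\quad
   \partial_V r = -\frac{4M^2}{r}\,e^{-r/2M}\,U , \\
\theta_1 &= 2 e^{r/2M}\,\partial_V r = -\frac{8M^2}{r}\,U .
\end{aligned}
```

Exchanging $U \leftrightarrow V$ gives (8.41). The signs of $\theta_1$ and $\theta_2$ in each region then follow directly from the signs of $U$ and $V$ there, for example $U < 0$, $V > 0$ in region I and $U > 0$, $V > 0$ in region II.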
Definition: Trapped
A compact orientable spacelike 2-surface S is trapped if both families of null geodesics orthogonal
to S have negative expansion everywhere on S. It is marginally trapped if both families have
non-positive expansion everywhere on S.
In Kruskal spacetime all 2-spheres (U = U0 , V = V0 ) in region II are trapped and
2-spheres on the event horizon (U0 = 0, V0 > 0) are marginally trapped.
We now want to understand how the expansion evolves along the geodesics of a null geodesic
congruence.
Proposition: Raychaudhuri's equation
\[
  \frac{d\theta}{d\lambda} = -\frac{1}{2}\theta^2 - \sigma^{\mu\nu}\sigma_{\mu\nu} + \omega^{\mu\nu}\omega_{\mu\nu} - R_{\mu\nu}X^\mu X^\nu . \tag{8.42}
\]
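Proof: From the definition of $\theta$ we have
\[
  \frac{d\theta}{d\lambda} = X^\mu \nabla_\mu \theta = X^\mu \nabla_\mu\!\left(B^\rho{}_\sigma \tilde{P}^\sigma{}_\rho\right)
  = \tilde{P}^\sigma{}_\rho\,X^\mu \nabla_\mu B^\rho{}_\sigma = \tilde{P}^\sigma{}_\rho\,X^\mu \nabla_\mu \nabla_\sigma X^\rho . \tag{8.43}
\]
Now commute derivatives using the definition of the Riemann tensor:
\[
\begin{aligned}
  \frac{d\theta}{d\lambda} &= \tilde{P}^\sigma{}_\rho\,X^\mu\left(\nabla_\sigma \nabla_\mu X^\rho + R^\rho{}_{\tau\mu\sigma} X^\tau\right) \\
  &= \tilde{P}^\sigma{}_\rho\left[\nabla_\sigma\!\left(X^\mu \nabla_\mu X^\rho\right) - \left(\nabla_\sigma X^\mu\right)\left(\nabla_\mu X^\rho\right)\right] + \tilde{P}^\sigma{}_\rho\,R^\rho{}_{\mu\nu\sigma} X^\mu X^\nu \\
  &= -B^\rho{}_\nu\,\tilde{P}^\nu{}_\mu\,B^\mu{}_\rho - R_{\mu\nu} X^\mu X^\nu ,
\end{aligned}
\tag{8.44}
\]
where we used the geodesic equation and, in the final term, the anti-symmetry of the Riemann
tensor, which allows us to replace $\tilde{P}^\sigma{}_\rho$ with $\delta^\sigma_\rho$. Finally we can rewrite the first term so that
\[
  \frac{d\theta}{d\lambda} = -\hat{B}^\mu{}_\nu \hat{B}^\nu{}_\mu - R_{\mu\nu} X^\mu X^\nu . \tag{8.45}
\]
The result then follows by using the expansion of $\hat{B}^\mu{}_\nu$ in equation (8.27).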
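The final step can be sketched explicitly. It assumes the standard decomposition $\hat{B}^\mu{}_\nu = \tfrac{1}{2}\theta\,\tilde{P}^\mu{}_\nu + \sigma^\mu{}_\nu + \omega^\mu{}_\nu$, with $\sigma$ symmetric and traceless, $\omega$ anti-symmetric, and $\tilde{P}^\mu{}_\mu = 2$; this is taken to be the content of the equation (8.27) cited above.

```latex
\begin{aligned}
\hat{B}^\mu{}_\nu \hat{B}^\nu{}_\mu
  &= \tfrac{1}{4}\theta^2\,\tilde{P}^\mu{}_\nu \tilde{P}^\nu{}_\mu
     + \sigma^\mu{}_\nu \sigma^\nu{}_\mu
     + \omega^\mu{}_\nu \omega^\nu{}_\mu
     + \theta\,\sigma^\mu{}_\mu + \theta\,\omega^\mu{}_\mu + 2\,\sigma^\mu{}_\nu \omega^\nu{}_\mu \\
  &= \tfrac{1}{2}\theta^2 + \sigma^{\mu\nu}\sigma_{\mu\nu} - \omega^{\mu\nu}\omega_{\mu\nu} .
\end{aligned}
```

The three cross terms vanish because $\sigma$ is traceless, $\omega$ is anti-symmetric, and the contraction of a symmetric with an anti-symmetric tensor is zero. The relative minus sign of the rotation term comes from $\omega_{\nu\mu} = -\omega_{\mu\nu}$. Substituting into (8.45) gives (8.42).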
Similar calculations give equations governing the evolution of shear and rotation.
Raychaudhuri's equation involves the Ricci tensor and so is purely geometric. Through the
Einstein equation this is related to the energy-momentum tensor of matter. We want to
consider only physical matter, which implies that the energy-momentum tensor should satisfy
some conditions. For example, an observer with 4-velocity $u^\mu$ would measure an
energy-momentum current $j^\mu = -T^\mu{}_\nu u^\nu$. We would expect that physically reasonable matter should
not move faster than light, so this current should be non-spacelike. This motivates:
Dominant energy condition For all future-directed timelike vectors $V^\mu$ the vector
$-T^\mu{}_\nu V^\nu$ is a future-directed causal vector or zero.
For matter satisfying the dominant energy condition, if Tµν is zero in some closed region
S of a spacelike hypersurface Σ then Tµν will be zero within D+ (S).
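Example
Consider a massless scalar field
\[
  T_{\mu\nu} = \partial_\mu\phi\,\partial_\nu\phi - \frac{1}{2}\,g_{\mu\nu}\,\partial_\rho\phi\,\partial^\rho\phi . \tag{8.46}
\]
Let
\[
  j^\mu = -T^\mu{}_\nu V^\nu = -\left(V^\nu \partial_\nu\phi\right)\partial^\mu\phi + \frac{1}{2}\,V^\mu\,\partial_\rho\phi\,\partial^\rho\phi , \tag{8.47}
\]
then for timelike $V^\mu$
\[
  j^2 = \frac{1}{4}\,V^2\left(\partial_\rho\phi\,\partial^\rho\phi\right)^2 \leq 0 , \tag{8.48}
\]
so $j$ is indeed causal or zero. Now consider
\[
\begin{aligned}
  V^\mu j_\mu &= -\left(V^\mu \partial_\mu\phi\right)^2 + \frac{1}{2}\,V^2\left(\partial_\rho\phi\,\partial^\rho\phi\right) \\
  &= -\frac{1}{2}\left(V^\mu \partial_\mu\phi\right)^2 + \frac{1}{2}\,V^2\left(\partial_\rho\phi - \frac{V\cdot\partial\phi}{V^2}\,V_\rho\right)\left(\partial^\rho\phi - \frac{V\cdot\partial\phi}{V^2}\,V^\rho\right) ;
\end{aligned}
\tag{8.49}
\]
the final expression in brackets is orthogonal to $V$ and hence must be spacelike or zero, so its
norm is non-negative. We then have $V \cdot j \leq 0$ using $V^2 < 0$. Hence $j^\mu$ is future-directed or
zero.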
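The norm (8.48) follows by squaring (8.47) directly. A short sketch, writing $V\cdot\partial\phi = V^\nu\partial_\nu\phi$ and $(\partial\phi)^2 = \partial_\rho\phi\,\partial^\rho\phi$:

```latex
\begin{aligned}
j^\mu j_\mu
  &= (V\cdot\partial\phi)^2\,(\partial\phi)^2
     - 2\cdot\tfrac{1}{2}\,(V\cdot\partial\phi)\,(\partial\phi)^2\,(V\cdot\partial\phi)
     + \tfrac{1}{4}\,V^2\left((\partial\phi)^2\right)^2 \\
  &= \tfrac{1}{4}\,V^2\left((\partial\phi)^2\right)^2 .
\end{aligned}
```

The first two terms cancel identically, so the sign of $j^2$ is fixed entirely by the sign of $V^2$, which is negative for timelike $V$.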
A less restrictive condition requires only that the energy density measured by all observers
is positive:
Weak energy condition For any causal vector V we have
Tµν V µ V ν ≥ 0 . (8.50)
Tµν V µ V ν ≥ 0 . (8.51)
The dominant energy condition implies the weak energy condition, which implies the null
energy condition. Another energy condition is:
Strong energy condition For all causal vectors V we have
$$\left(T_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} T^\rho{}_\rho\right) V^\mu V^\nu \ge 0 . \tag{8.52}$$
Using the Einstein equation this is equivalent to
$$R_{\mu\nu} V^\mu V^\nu \ge 0 , \tag{8.53}$$
i.e. gravity is attractive. Despite its name the strong energy condition does not imply any of
the other conditions. The strong energy condition is needed to prove some of the singularity
theorems, but the dominant energy condition is the most important physically. For example
our universe appears to contain a positive cosmological constant. This violates the strong
energy condition but respects the dominant energy condition.
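To make the logical relations between the conditions concrete, here is a small sketch for a perfect fluid with energy density ρ and pressure p. The perfect-fluid form and the resulting inequalities are standard but are an assumption here, since the notes state the conditions for a general Tµν; the function name `energy_conditions` is hypothetical. A positive cosmological constant corresponds to p = −ρ with ρ > 0.

```python
def energy_conditions(rho, p):
    """Energy conditions for a perfect fluid T = (rho + p) u u + p g.

    Assumed standard reductions:
      null:     rho + p >= 0
      weak:     rho >= 0 and rho + p >= 0
      dominant: rho >= |p|
      strong:   rho + p >= 0 and rho + 3 p >= 0
    """
    return {
        "null": rho + p >= 0,
        "weak": rho >= 0 and rho + p >= 0,
        "dominant": rho >= abs(p),
        "strong": rho + p >= 0 and rho + 3 * p >= 0,
    }


# Positive cosmological constant: p = -rho, rho > 0 (units are irrelevant here).
cosmological_constant = energy_conditions(1.0, -1.0)

# A fluid with negative energy density that still satisfies the strong condition.
strong_but_not_weak = energy_conditions(-1.0, 1.0)
```

The first case satisfies the dominant, weak and null conditions but not the strong one; the second shows that the strong condition does not imply the weak one.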
In a spacetime satisfying Einstein’s equation with matter obeying the null energy condition,
the generators of a null hypersurface satisfy
$$\frac{d\theta}{d\lambda} \le -\frac{1}{2}\,\theta^2 . \tag{8.54}$$
Proof: Consider the RHS of Raychaudhuri’s equation. The generators of a null hypersurface
have ω = 0. Since vectors in T⊥ are all spacelike, the metric restricted to T⊥ is positive
definite. Hence σµν σ µν ≥ 0. Einstein’s equation gives Rµν X µ X ν = 8πTµν X µ X ν because X
is null. Hence the null energy condition implies Rµν X µ X ν ≥ 0 and the result follows from
Raychaudhuri’s equation.
Corollary If θ = θ0 < 0 at a point p on a generator γ of a null hypersurface then θ → −∞
along γ within an affine parameter distance 2/|θ0 | provided γ extends this far.
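Proof: Let λ = 0 at p. Then equation (8.54) implies
$$\frac{d}{d\lambda}\,\theta^{-1} \ge \frac{1}{2} . \tag{8.55}$$
Integrating gives $\theta^{-1} - \theta_0^{-1} \ge \lambda/2$, which can be rearranged to give
$$\theta \le \frac{2\theta_0}{2 + \lambda\theta_0} . \tag{8.56}$$
If θ0 < 0 then the RHS goes to −∞ as λ → 2/|θ0 |.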
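The focusing bound can be illustrated by integrating a toy version of Raychaudhuri's equation. The sketch below assumes dθ/dλ = −θ²/2 − s with a constant non-negative source s standing in for the shear and Ricci terms (a simplification made for illustration only), and uses a hand-written RK4 stepper; `focusing_blowup` is a hypothetical helper name.

```python
def focusing_blowup(theta0, shear_sq=0.0, step=1e-4, cutoff=-1e6, max_lambda=100.0):
    """Integrate d(theta)/d(lambda) = -theta^2/2 - shear_sq with classical RK4.

    Returns the affine parameter at which theta first drops below `cutoff`,
    or None if that does not happen before `max_lambda`.
    """
    def rhs(theta):
        return -0.5 * theta * theta - shear_sq

    lam, theta = 0.0, theta0
    while lam < max_lambda:
        if theta < cutoff:
            return lam
        k1 = rhs(theta)
        k2 = rhs(theta + 0.5 * step * k1)
        k3 = rhs(theta + 0.5 * step * k2)
        k4 = rhs(theta + step * k3)
        theta += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        lam += step
    return None


# With theta0 = -1 the corollary predicts divergence within lambda = 2 / |theta0| = 2.
lam_free = focusing_blowup(-1.0)            # saturates the bound (no source term)
lam_sheared = focusing_blowup(-1.0, 0.5)    # a positive source makes it diverge sooner
```

With vanishing source the bound (8.56) is saturated, so the divergence sits at λ = 2/|θ0| up to the step size; any positive source term brings it forward.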
Definition: Conjugate
Points p, q on a geodesic γ are conjugate if there exists a solution of the geodesic deviation
equation along γ that vanishes at p and q but is not identically zero.
Since gravity is an attractive force it focuses geodesics and if the generated curvature is
strong enough conjugate points always develop provided we can extend the geodesics arbi-
trarily far in the past and in the future.
We can now define a black hole and its event horizon. Consider a manifold with metric (M, g)
and its conformal compactification (M̄ , ḡ). Recall that the causal past J − of a region is the
set of all points we can reach from that region by moving along past-directed causal curves.
We can define the causal past of scri-plus J − (I + ) ⊂ M̄ . The set of points of M that can
send a signal to I + is M ∩ J − (I + ). We define the black hole region to be the complement
of this region, and the future event horizon to be the boundary of the black hole region:
Definition: Black hole region, future event horizon
Let (M, g) be a spacetime that is asymptotically flat at null infinity. The black hole region is
B = M \[M ∩ J − (I + )] , (8.57)
where J − (I + ) is defined using the unphysical spacetime (M̄ , ḡ). The future event horizon is
H+ = ∂B.
Similarly the white hole region is
W = M \[M ∩ J + (I − )] , (8.58)
9 Laws of black hole thermodynamics
In 1973 Bardeen, Carter and Hawking (BCH) wrote a paper, [3], in which they considered
stationary axisymmetric black holes. They found that black holes obey laws reminiscent
of the laws of thermodynamics. At the time they thought it was just an analogy, and there
seem to be some glaring flaws in it. First, since nothing can escape from a black hole its
temperature must vanish. Second, the entropy is dimensionless whereas the horizon area is
a length squared. Finally, the area of every black hole is separately non-decreasing,
whereas in thermodynamics only the total entropy is non-decreasing. The
resolution to all these flaws lies in the incorporation of quantum theory. Recall that going
to a quantum theory was also the resolution of apparent paradoxes in thermodynamics, for
example black body radiation. We will not study quantum gravity, which is an active area of
research, but will present the classical laws and possibly some semi-classical analysis.
There are four laws of black hole thermodynamics which should be contrasted with the
laws of thermodynamics:
| Law | Thermodynamics | Black holes |
|-----|----------------|-------------|
| 0th | The temperature T is constant throughout a system in thermal equilibrium. | The surface gravity κ is constant over the event horizon of a stationary black hole. |
| 1st | $dE = T\,dS + \sum_i \mu_i\, dN_i$ | $dM = \frac{1}{8\pi}\kappa\, dA + \Omega_H\, dJ + \Phi_H\, dQ$ |
| 2nd | $dS \ge 0$ | $dA \ge 0$ |
| 3rd | T cannot be reduced to zero by a finite number of operations. | κ cannot be reduced to zero by a finite number of operations. |
It may seem strange to say that a black hole has a temperature since nothing can escape
from a black hole and therefore they cannot radiate. This would also mean that they cannot
have a physical entropy. Once quantum effects are taken into account it turns out that a
black hole can have a temperature. Moreover, as pointed out by Jacob Bekenstein the second
law of thermodynamics would be violated if black holes did not have an entropy. One could
throw arbitrary objects with a large entropy into the black hole and thus lower the
entropy of the exterior universe. In order to save the 2nd law of thermodynamics it is essential
for a black hole to have an entropy, and moreover it must be proportional to the surface area
of the horizon. Bekenstein’s generalised second law states that
$$dS_{\text{total}} = d\left(S_{\text{external}} + S_{\text{BH}}\right) \ge 0 . \tag{9.1}$$
In 1974 Hawking announced that black holes are hot and radiate just like any hot body
with a temperature
$$T_H = \frac{\hbar\kappa}{2\pi k_B} , \tag{9.2}$$
from which it follows that a black hole has an entropy given by
$$S_{\text{BH}} = \frac{A}{4 G_N \hbar} , \tag{9.3}$$
which is known as the Bekenstein–Hawking entropy.
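To get a feeling for the numbers, the sketch below evaluates (9.2) and (9.3) for a Schwarzschild black hole of one solar mass. It assumes SI units with the factors of c and k_B restored, the Schwarzschild surface gravity κ = c⁴/(4GM), rounded CODATA-style constants and a nominal solar mass; all function names are illustrative.

```python
import math

# Physical constants in SI units (rounded values, assumed for illustration).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
K_B = 1.380649e-23       # J / K
M_SUN = 1.98847e30       # kg (nominal solar mass)


def schwarzschild_surface_gravity(mass):
    """Surface gravity kappa = c^4 / (4 G M), as an acceleration in m / s^2."""
    return C ** 4 / (4.0 * G * mass)


def schwarzschild_area(mass):
    """Horizon area 4 pi r_s^2 with r_s = 2 G M / c^2."""
    r_s = 2.0 * G * mass / C ** 2
    return 4.0 * math.pi * r_s ** 2


def hawking_temperature(kappa):
    """T_H = hbar kappa / (2 pi k_B c): equation (9.2) with c restored."""
    return HBAR * kappa / (2.0 * math.pi * K_B * C)


def bekenstein_hawking_entropy(area):
    """S_BH = k_B c^3 A / (4 G hbar): equation (9.3) with k_B and c restored."""
    return K_B * C ** 3 * area / (4.0 * G * HBAR)


temperature = hawking_temperature(schwarzschild_surface_gravity(M_SUN))
entropy_in_units_of_k_b = bekenstein_hawking_entropy(schwarzschild_area(M_SUN)) / K_B
```

The temperature comes out far below that of the cosmic microwave background, while the entropy in units of k_B is enormous compared with that of ordinary stars of the same mass.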
In the remainder of the course our goal is to understand the laws of black hole thermo-
dynamics as presented above.
Proposition
Consider a null geodesic congruence that contains the generators of a Killing horizon N .
Then θ = σ = ω = 0 on N .
Proof : We have already seen that ω = 0 since the generators are hypersurface orthogonal.
Let ξ be a Killing vector field normal to N . On N we can write ξ µ = hU µ where U µ is tangent
to the affinely parametrised generators of N and h is a function on N . Let N be specified by
an equation f = 0. Then we can write $U^\mu = h^{-1}\xi^\mu + f V^\mu$ where V µ is a smooth vector field.
We can then calculate
Since both ξµ and ∂µ f are parallel to Uµ on N when we project onto T⊥ both terms are
eliminated and we have
$$\hat{B}_{\mu\nu}\big|_{N} = 0 \tag{9.6}$$
and thus θ = σ = 0 on N .
Theorem: Zeroth law of black hole mechanics
The surface gravity κ is constant on the future event horizon of a stationary black hole
spacetime obeying the dominant energy condition.
Proof: Using Hawking’s theorem we have that H+ is a Killing horizon with respect to
some Killing vector ξ. We know that θ = 0 along the generators of H+ , and therefore dθ/dλ = 0
along these generators. Moreover we have just seen that on H+ σ = ω = 0. Therefore
Raychaudhuri’s equation gives
$$0 = R_{\mu\nu}\xi^\mu\xi^\nu\big|_{H^+} = 8\pi\left(T_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} T^\rho{}_\rho\right)\xi^\mu\xi^\nu\Big|_{H^+} = 8\pi\, T_{\mu\nu}\xi^\mu\xi^\nu\big|_{H^+} \tag{9.7}$$
where we have used Einstein’s equation and that ξ is null on H+ . This implies that J µ ξµ = 0 on H+ , where Jµ = −Tµν ξ ν .
Since ξ is a future-directed causal vector field, then by the dominant energy condition, so is
Jµ (unless it is zero). Thus J µ is parallel to ξ µ on H+ and consequently
$$0 = \xi_{[\mu}J_{\nu]}\big|_{H^+} = -\xi_{[\mu}T_{\nu]\rho}\,\xi^\rho\big|_{H^+} = -\frac{1}{8\pi}\,\xi_{[\mu}R_{\nu]\rho}\,\xi^\rho\big|_{H^+} , \tag{9.9}$$
where we have used Einstein’s equation in the final step. On problem sheet 4 you are asked
to show that this is equivalent to
$$0 = \frac{1}{8\pi}\,\xi_{[\mu}\partial_{\nu]}\kappa . \tag{9.10}$$
Therefore ∂ν κ is proportional to ξν and therefore for any vector field t tangent to H+ it follows
that tµ ∂µ κ = 0. Therefore κ is constant on H+ provided H+ is connected.
Let us derive the identity we need for proving that the surface gravity is constant on the horizon.
We must be very careful with the formulae we use for the surface gravity and with acting on
them with derivatives, since some only hold on the horizon. Since $\xi^2|_{H^+} = 0$ we have that
∇µ ξ 2 is normal to the horizon and therefore there is a function κ on the horizon such
that
∇µ (ξ 2 ) = −2κξµ . (9.11)
We may rewrite this as
ξ µ ∇ν ξµ = −ξ µ ∇µ ξν = −κξν , (9.12)
which is just the geodesic equation in a non-affine parametrisation. The above derivation
of the expression for the surface gravity makes clear that it holds on the Killing surface.
This means that applying derivatives to the above expression is somewhat subtle: we
can only differentiate on the Killing surface and not normal to it. Instead observe that if
ϵµνρσ is the 4d volume element then ϵµνρσ ξσ is tangent to the horizon since ϵµνρσ ξσ ξρ = 0.
Therefore we may use this to project the differential operator onto the horizon by acting
with ϵµνρσ ξρ ∇σ and then this may be applied to any object defined on the horizon.
Equivalently we may act with ξ[µ ∇ν] on any object. Now applying this to (9.12) we
obtain
We may simplify the first term by using the condition that ξ is hypersurface orthogonal
and hence satisfies ξ[µ ∇ν ξρ] = 0. We find
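$$\begin{aligned}
\left(\xi_{[\rho}\nabla_{\sigma]}\xi^\mu\right)\nabla_\mu\xi_\nu &= -\frac{1}{2}\,\xi^\mu\,\nabla_\rho\xi_\sigma\,\nabla_\mu\xi_\nu \\
&= -\frac{1}{2}\,\kappa\,\xi_\nu\nabla_\rho\xi_\sigma \\
&= \kappa\,\xi_{[\rho}\nabla_{\sigma]}\xi_\nu
\end{aligned} \tag{9.14}$$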
This cancels the second term of the first row of (9.14). We therefore have
and acting on this with ξ[σ ∇τ ] we obtain
ξ[σ ∇τ ] ξρ ∇µ ξν + ξρ ξ[σ ∇τ ] ∇µ ξν = −2 ξ[σ ∇τ ] ξ[µ ∇ν] ξρ − 2 ξ[σ ∇τ ] ∇[ν ξ|ρ| ξµ] . (9.17)
with the right-hand-side being the expression we required above. We therefore find
Plugging this into the formulae above gives the required result.
We have already seen a form of the first law when we considered the irreducible mass of the
Kerr solution. We will give a somewhat heuristic argument here for the first law and then check
it in more detail for the black holes we have studied previously. Consider the Killing vector
associated to the Killing horizon; it takes the form ξ = K + ΩH R where K generates time
translations and R generates the axisymmetry. The corresponding charge is a combination
of the mass and the angular momentum:
$$Q_\xi = -\frac{1}{8\pi}\int_{S^2_\infty}\star d\xi = -\frac{1}{8\pi}\int_{S^2_\infty}\star dK - \frac{\Omega_H}{8\pi}\int_{S^2_\infty}\star dR = M - 2\Omega_H J . \tag{9.21}$$
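We can also evaluate Qξ in another way. Let Σ be a spacelike hypersurface intersecting the
horizon H+ on a two-sphere $S^2_H$ which together with the two-sphere $S^2_\infty$ at spatial infinity
forms the boundary of Σ. Using Stokes’ theorem we have:
$$\begin{aligned}
Q_\xi &= -\frac{1}{8\pi}\int_{S^2_H}\star d\xi - \frac{1}{8\pi}\int_\Sigma d\star d\xi \\
&= -\frac{1}{8\pi}\int_{S^2_H}\star d\xi + 2\int_\Sigma\left(T_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} T^\rho{}_\rho\right)\xi^\nu \star dx^\mu ,
\end{aligned} \tag{9.22}$$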
outside the horizon. In order to treat the integral over $S^2_H$ we observe that the volume form
on $S^2_H$ can be written as
$$\mathrm{dvol}(S^2_H) = \star(n\wedge\xi) , \tag{9.24}$$
nµ ξµ = −1. Therefore
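$$\begin{aligned}
\int_{S^2_H}\star d\xi &= \int_{S^2_H}\mathrm{dvol}(S^2_H)\,\big(\star(n\wedge\xi)\big)^{\mu\nu}(\star d\xi)_{\mu\nu} \\
&= 2\int_{S^2_H}\mathrm{dvol}(S^2_H)\, n^\nu\xi^\mu\nabla_\mu\xi_\nu \\
&= -2\kappa\int_{S^2_H}\mathrm{dvol}(S^2_H) \\
&= -2\kappa A_H .
\end{aligned} \tag{9.25}$$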
Plugging this into (9.22) we arrive at
$$M = \frac{\kappa A_H}{4\pi} + 2\Omega_H J + 2\int_\Sigma\left(T_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} T^\rho{}_\rho\right)\xi^\nu \star dx^\mu \tag{9.26}$$
If we are in pure GR, then Tµν = 0 and our spacetime is the Kerr black hole and the formula
reads
$$M = \frac{\kappa A}{4\pi} + 2\Omega_H J . \tag{9.27}$$
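This is Smarr’s formula for the mass of a Kerr black hole. A formula for δM in the vacuum
case can be obtained by varying (9.27):
$$\delta M = \frac{1}{4\pi}\left(A_H\,\delta\kappa + \kappa\,\delta A_H\right) + 2\left(J\,\delta\Omega_H + \Omega_H\,\delta J\right) . \tag{9.28}$$
An alternative computation gives
$$\delta M = -\frac{1}{4\pi}\, A_H\,\delta\kappa - 2J\,\delta\Omega_H . \tag{9.29}$$
Adding the two equations gives
$$\delta M = \frac{1}{8\pi}\,\kappa\,\delta A_H + \Omega_H\,\delta J . \tag{9.30}$$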
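Both Smarr's formula (9.27) and the alternative variation (9.29) can be checked numerically for Kerr. The sketch below assumes the standard Kerr horizon quantities r± = M ± √(M² − a²), A = 4π(r₊² + a²), κ = (r₊ − r₋)/(2(r₊² + a²)), ΩH = a/(r₊² + a²) and J = aM in units G = c = 1, and approximates the variations by central finite differences; the function names are hypothetical.

```python
import math


def kerr_horizon_data(M, a):
    """Horizon area, surface gravity, angular velocity and J for Kerr (G = c = 1)."""
    root = math.sqrt(M * M - a * a)
    r_plus, r_minus = M + root, M - root
    d = r_plus ** 2 + a * a
    return {
        "area": 4.0 * math.pi * d,
        "kappa": (r_plus - r_minus) / (2.0 * d),
        "omega": a / d,
        "J": a * M,
    }


def smarr_residual(M, a):
    """M - (kappa A / 4 pi + 2 Omega_H J); vanishes if (9.27) holds."""
    h = kerr_horizon_data(M, a)
    return M - (h["kappa"] * h["area"] / (4.0 * math.pi) + 2.0 * h["omega"] * h["J"])


def alternative_variation_residual(M, a, dM, da):
    """delta M + A delta kappa / 4 pi + 2 J delta Omega_H; vanishes if (9.29) holds.

    The variations are central differences between (M + dM, a + da) and (M - dM, a - da).
    """
    base = kerr_horizon_data(M, a)
    plus = kerr_horizon_data(M + dM, a + da)
    minus = kerr_horizon_data(M - dM, a - da)
    d_kappa = 0.5 * (plus["kappa"] - minus["kappa"])
    d_omega = 0.5 * (plus["omega"] - minus["omega"])
    return dM + base["area"] * d_kappa / (4.0 * math.pi) + 2.0 * base["J"] * d_omega
```

For example, `smarr_residual(1.0, 0.6)` should vanish to round-off, and `alternative_variation_residual(1.0, 0.6, 1e-5, -2e-5)` should vanish up to the finite-difference error.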
In the case where there is an electric charge, we need to define the electric potential
$$\Phi_H = \xi^\mu A_\mu\big|_{H^+} - \xi^\mu A_\mu\big|_{\infty} . \tag{9.31}$$
For asymptotically flat spacetimes we have that Aµ → 0 as we tend to ∞ and so the second
term drops out. The 1st law with electric charge is then
$$\delta M = \frac{1}{8\pi}\,\kappa\,\delta A_H + \Omega_H\,\delta J + \Phi_H\,\delta Q \tag{9.32}$$
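Kerr–Newman Let us check this for the Kerr–Newman solution:
$$\begin{aligned}
ds^2 = {} & -\frac{\Delta(r) - a^2\sin^2\theta}{\rho(r,\theta)^2}\, dt^2 - \frac{2a\sin^2\theta\left(r^2 + a^2 - \Delta(r)\right)}{\rho(r,\theta)^2}\, dt\, d\phi \\
& + \frac{(r^2 + a^2)^2 - a^2\sin^2\theta\,\Delta(r)}{\rho(r,\theta)^2}\,\sin^2\theta\, d\phi^2 + \frac{\rho(r,\theta)^2}{\Delta(r)}\, dr^2 + \rho(r,\theta)^2\, d\theta^2 , \\
A = {} & -\frac{1}{\rho(r,\theta)^2}\, Q r\left(dt - a\sin^2\theta\, d\phi\right) .
\end{aligned} \tag{9.33}$$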
The Kerr–Newman solution is the unique stationary black hole solution of the Einstein–
Maxwell theory.
Let us compute the quantities that we will need to check the relation. The outer Killing
horizon is at r = r+ with
$$r_\pm = M \pm \sqrt{M^2 - a^2 - Q^2} . \tag{9.35}$$
First let us consider the horizon surface area. We fix an arbitrary time t = t0 and look
at the induced metric on the intersection t = t0 and r = r+ , we find
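$$ds^2(H) = \gamma_{\mu\nu}\, dx^\mu dx^\nu = \rho(r_+,\theta)^2\, d\theta^2 + \frac{(r_+^2 + a^2)^2}{\rho(r_+,\theta)^2}\,\sin^2\theta\, d\phi^2 . \tag{9.36}$$
The volume form is
$$\mathrm{dvol}(\gamma) = (r_+^2 + a^2)\sin\theta\, d\theta\wedge d\phi , \tag{9.37}$$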
Next let us consider the surface gravity. We first need to find the Killing vector which is
null on the horizon and then to compute the surface gravity. Since the horizon is a Killing
horizon we know that it must be of the form
ξ = K + ΩH R , (9.39)
where K and R are the generators of time translations and the axisymmetry respectively.
Note that since a Killing vector remains a Killing vector under a constant rescaling there is an
arbitrariness in how we pick such a Killing vector. We normalise such that K has coefficient
1. Now this needs to have zero norm on the horizon. The norm is
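$$\begin{aligned}
\xi^2\big|_{N^+} &= \frac{a^2\sin^2\theta}{r_+^2 + a^2\cos^2\theta} - \frac{2a\sin^2\theta\,(r_+^2 + a^2)}{r_+^2 + a^2\cos^2\theta}\,\Omega_H + \frac{(r_+^2 + a^2)^2}{r_+^2 + a^2\cos^2\theta}\,\sin^2\theta\,\Omega_H^2 \\
&= \frac{\sin^2\theta}{r_+^2 + a^2\cos^2\theta}\left(a^2 - 2a(r_+^2 + a^2)\,\Omega_H + (r_+^2 + a^2)^2\,\Omega_H^2\right) ,
\end{aligned} \tag{9.40}$$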
∇µ (ξ 2 ) = −2κξµ . (9.42)
we need to use coordinates in which the horizon is not a coordinate singularity. Rather than
changing coordinates we will instead use an alternative formula for the surface gravity
$$\kappa^2 = \lim_{r\to r_+}\frac{g^{\mu\nu}\,\partial_\nu(\xi^2)\,\partial_\mu(\xi^2)}{4\xi^2} . \tag{9.43}$$
After a slightly painful computation we find
$$\kappa = \frac{r_+ - r_-}{2(r_+^2 + a^2)} . \tag{9.44}$$
Finally let us remember that the electric charge is Q and the angular momentum is
J = aM . Putting everything together we have
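$$\begin{aligned}
A_H &= 4\pi\left(\left(M + \sqrt{M^2 - a^2 - Q^2}\right)^2 + a^2\right) \\
&= 4\pi\left(2M^2 - Q^2 + 2M\sqrt{M^2 - Q^2 - a^2}\right) .
\end{aligned} \tag{9.46}$$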
After some explicit computation (which you will do in problem sheet 4) and a little
rearranging we find
$$\delta M = \frac{1}{8\pi}\,\kappa\,\delta A + \Omega_H\,\delta J + \Phi_H\,\delta Q . \tag{9.48}$$
We see that the proof is deceptively simple; all the hard work goes into proving the uniqueness
theorems. You need to know that the black hole settles down to another Kerr–Newman black
hole and not some other spacetime. It is worth noting that there exist proofs of the first law,
known as physical process proofs, that do not assume this.
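The first law for Kerr–Newman can also be verified numerically rather than by hand. The following sketch varies (M, a, Q) by small amounts and compares δM with the right-hand side of (9.48). It uses (9.35), (9.44) and (9.46) from the text and assumes the standard values ΩH = a/(r₊² + a²) and ΦH = Q r₊/(r₊² + a²) with J = aM; the sign convention for ΦH and the helper names are assumptions made for this illustration.

```python
import math


def kerr_newman_horizon_data(M, a, Q):
    """Horizon quantities of Kerr-Newman in units G = c = 1."""
    root = math.sqrt(M * M - a * a - Q * Q)
    r_plus, r_minus = M + root, M - root
    d = r_plus ** 2 + a * a
    return {
        "area": 4.0 * math.pi * d,                 # agrees with (9.46)
        "kappa": (r_plus - r_minus) / (2.0 * d),   # (9.44)
        "omega": a / d,                            # assumed standard value
        "phi": Q * r_plus / d,                     # assumed standard value
        "J": a * M,
    }


def first_law_residual(M, a, Q, dM, da, dQ):
    """delta M - (kappa delta A / 8 pi + Omega_H delta J + Phi_H delta Q).

    delta A and delta J are central differences, so the residual should be
    of third order in the size of the variation if (9.48) holds.
    """
    base = kerr_newman_horizon_data(M, a, Q)
    plus = kerr_newman_horizon_data(M + dM, a + da, Q + dQ)
    minus = kerr_newman_horizon_data(M - dM, a - da, Q - dQ)
    d_area = 0.5 * (plus["area"] - minus["area"])
    d_J = 0.5 * (plus["J"] - minus["J"])
    rhs = base["kappa"] * d_area / (8.0 * math.pi) + base["omega"] * d_J + base["phi"] * dQ
    return dM - rhs
```

For instance, `first_law_residual(1.0, 0.3, 0.4, 1e-5, 2e-5, -1e-5)` should be zero up to finite-difference and round-off error.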
The second law states that in any physical process the area of the event horizon can never
decrease. This is a very surprising feature of these complicated nonlinear PDEs which Hawking
proved using just the Einstein equation, the weak energy condition and cosmic censorship.
Let us give a sketch of the proof. Consider the congruence of the horizon and take a
cross sectional area AH at some value of the affine parameter λ along the geodesics. Then
the expansion θ satisfies
$$\frac{dA_H}{d\lambda} = \theta A_H . \tag{9.49}$$
If we imagine the theorem is violated so that the area decreases then we must have θ < 0
somewhere on the event horizon. Since the generators are geodesics the evolution of the
expansion is governed by Raychaudhuri’s equation. Recall that if θ < 0 and the null energy
condition is satisfied then θ → −∞ in finite λ. This causes a caustic, see figure 20. Since the
points p and q are timelike separated, this contradicts the assumption that the null curves
are the generators of an event horizon, as no two points on the event horizon can be timelike
separated. Thus by contradiction the cross sectional area of an event horizon cannot decrease.
Note that although the proof assumes Einstein’s equations, they are not used in an essential way.
Let us use the second law. Consider a Schwarzschild black hole of mass M . Can a black
hole split into two black holes of smaller mass? It turns out that the second law forbids this.
To see this let the masses of the new black holes be m1 and m2 . Conservation of energy
implies M = m1 + m2 . The horizon of a Schwarzschild black hole sits at r = 2M , so its
surface area is $A = 16\pi M^2$. The horizon area of the final state is
$A_f = A_1 + A_2 = 16\pi(m_1^2 + m_2^2)$ and the horizon area of
the initial state is $A_i = 16\pi M^2 = 16\pi(m_1 + m_2)^2 = 16\pi(m_1^2 + m_2^2 + 2m_1 m_2)$. It is clear that
Ai > Af and therefore this process violates the second law. Black holes cannot split in two!
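The area bookkeeping for the splitting process is easy to script. This is a minimal sketch assuming units G = c = 1, the Schwarzschild area A = 16πM², and exact energy conservation M = m1 + m2 (no radiation); the function names are illustrative.

```python
import math


def schwarzschild_area(mass):
    """Horizon area of a Schwarzschild black hole, A = 4 pi (2M)^2 = 16 pi M^2."""
    return 16.0 * math.pi * mass ** 2


def area_change_on_split(m1, m2):
    """A_f - A_i for a black hole of mass m1 + m2 splitting into masses m1 and m2."""
    initial = schwarzschild_area(m1 + m2)
    final = schwarzschild_area(m1) + schwarzschild_area(m2)
    return final - initial
```

The change equals −32π m1 m2, which is negative for any positive masses, so the split always decreases the total area.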
Figure 20: A family of null geodesics with θ < 0 initially will form a caustic; the dotted curve
connecting p and q lies within the local light cone, so these points are timelike separated.
Of all the laws this is on the least firm ground. When the surface gravity of a black hole
vanishes it is called extremal. For the Kerr–Newman solution this condition corresponds to M 2 =
a2 + Q2 + P 2 . For Kerr and electrically charged Kerr black holes one can try to throw matter
into the black hole and make it extremal. One finds that it gets harder and harder for the
matter to make the black hole become closer to being an extremal black hole.
In general relativity, black hole solutions are fully characterised by a few conserved quantities
such as the mass, the angular momentum and the electric charge. Black holes do not
have hair. However there are many ways of forming a black hole with assigned values of
these charges. From this perspective black holes are macroscopic thermodynamic objects
with many microstates, corresponding to the different possible ways of forming the same
macroscopic solution. Enumerating these microstates leads to an entropy.
General relativity is not a complete theory. For one, the singularity theorem provides evidence
that the theory is incomplete. More convincingly, GR is a classical theory while the world
is fundamentally quantum mechanical. Trying to understand quantum gravity is one of the
leading avenues of research in high energy theory. Though there has been much progress, a
full understanding of quantum gravity remains elusive.
There are two parts to GR: spacetime curvature and its influence on matter, and the
dynamics of the metric in response to a varying energy momentum tensor. Lacking a true
theory of quantum gravity we may still use the first part, saying that the quantum mechanical
matter propagates in a curved background which we will hold fixed. Rather than obeying
some dynamical equations, we take the metric to be fixed.
To begin let us review some quantum mechanics and quantum field theory before defining
quantum field theory in curved space.
Quantum mechanics is profoundly different from classical mechanics; despite this, both try to
answer the same three fundamental questions.
In quantum mechanics the Hilbert spaces of interest are very often infinite-dimensional.
For example, if a classical system is represented by coordinate x and momentum p, the
Hilbert space could be taken to consist of all square-integrable complex-valued functions
of x, or equivalently all square-integrable complex-valued functions of p, but not both
at once.
where
⟨ψ2 |Aψ1 ⟩ = ⟨A† ψ2 |ψ1 ⟩ , (10.3)
for all states |ψ1 ⟩, |ψ2 ⟩. Many operators will not be Hermitian, but observables should
be real and this requires the operator to be Hermitian. In general such operators do
not commute. This means that we cannot simultaneously specify the precise values of
everything we might want to measure. There will be a maximal set of commuting
observables which would represent all we can say about a system at once.
• Evolution of the system may be represented in one of two ways: as unitary evolution
of the state vector in Hilbert space in the Schrödinger picture, or by keeping the state
fixed and allowing observables to evolve according to equations of motion, called the
Heisenberg picture.
In the Schrödinger picture, states are represented by complex-valued wave functions
that evolve with time, such as ψ(x, t). The wave function is really the set of components
of the state vector |ψ⟩ expressed in the delta function position basis |x⟩ so that
$|\psi(t)\rangle = \int dx\,\psi(x,t)\,|x\rangle$. Canonical quantisation consists of imposing the canonical
commutation relation
[x̂, p̂] = i , (10.6)
on the coordinate operator x̂ and its conjugate momentum p̂. For states represented as
wave functions depending on t and x, the operator x̂ is simply multiplication by x, so the
commutation relation can be implemented by fixing
p̂ = −i∂x . (10.7)
The Hamiltonian operator is
$$H = -\frac{1}{2}\,\partial_x^2 + \frac{1}{2}\,\omega^2 x^2 , \tag{10.8}$$
and the equation of motion is the Schrödinger equation
i∂t ψ = Hψ . (10.9)
Since the Hamiltonian is time independent the solutions separate into functions of space and
functions of time, ψ(x, t) = f (t)g(x). The solutions then come in a discrete set labelled by
an integer n ≥ 0 and we find
$$\psi_n(x,t) = e^{-\frac{\omega x^2}{2}}\, H_n(\sqrt{\omega}\, x)\, e^{-iE_n t} , \tag{10.10}$$
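The spatial part of (10.10) can be checked against the Hamiltonian (10.8) numerically. The sketch below assumes that H_n are the physicists' Hermite polynomials and that the energies are E_n = ω(n + 1/2) (the standard oscillator spectrum, taken here as an assumption), and approximates the second derivative by a central difference; all function names are illustrative.

```python
import math


def hermite(n, y):
    """Physicists' Hermite polynomial H_n(y) via H_{k+1} = 2 y H_k - 2 k H_{k-1}."""
    previous, current = 1.0, 2.0 * y
    if n == 0:
        return previous
    for k in range(1, n):
        previous, current = current, 2.0 * y * current - 2.0 * k * previous
    return current


def psi(n, x, omega):
    """Unnormalised spatial part of (10.10)."""
    return math.exp(-0.5 * omega * x * x) * hermite(n, math.sqrt(omega) * x)


def schrodinger_residual(n, x, omega, h=1e-4):
    """(H - E_n) psi_n at the point x, with psi'' from a central difference."""
    second = (psi(n, x + h, omega) - 2.0 * psi(n, x, omega) + psi(n, x - h, omega)) / (h * h)
    hamiltonian_psi = -0.5 * second + 0.5 * omega ** 2 * x * x * psi(n, x, omega)
    energy = omega * (n + 0.5)   # assumed spectrum E_n = omega (n + 1/2)
    return hamiltonian_psi - energy * psi(n, x, omega)
```

The residual should be small, limited only by the finite-difference error, for every n and every sample point x.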
From the commutation relations for x̂ and p̂ we find
thus when acting with $\hat{a}^\dagger$ on |n⟩ we obtain another eigenstate of n̂ with eigenvalue raised by
one, and â gives an eigenstate with eigenvalue lowered by 1. n takes integer values from 0 to
∞ and therefore there must be a vacuum state with
state can be written formally as some fixed initial state acted on by a unitary time evolution
operator
|ψ(t)⟩ = U (t)|ψ(0)⟩ , (10.24)
where
$$U(t) = e^{-i\int H\, dt} . \tag{10.25}$$
The Schrödinger picture expression for the matrix element of a time-independent operator
A between two time-dependent states can be written in the Heisenberg picture in terms of a
time-dependent operator A(t) and time-independent states as
with
A(t) = U † (t)AU (t) . (10.27)
□ϕ − m2 ϕ = 0 . (10.30)
Of course since we are using a time derivative we have assumed a particular inertial frame and
therefore the Hamiltonian procedure necessarily violates manifest Lorentz invariance. With
care however, the observables remain Lorentz invariant. The Hamiltonian is represented as
the integral of a Hamiltonian density over the spatial directions. The Hamiltonian
density is related to the Lagrangian by a Legendre transformation,
$$\mathcal{H}(\phi,\pi) = \pi\dot\phi - \mathcal{L}(\phi,\partial_\mu\phi) = \frac{1}{2}\,\pi^2 + \frac{1}{2}(\nabla\phi)^2 + \frac{m^2}{2}\,\phi^2 , \tag{10.33}$$
with (∇ϕ)2 = δ ij ∂i ϕ∂j ϕ. In comparison to the harmonic oscillator the field ϕ(x) plays the
role of the coordinate x and the momentum field π(x) plays the role of p. Instead of a state
being specified by two numbers (x, p) at some fixed time, the initial conditions are values of
the field over all of the spatial directions at a fixed time.
Note that ϕ(xµ ) is not a wave function; it is a dynamical variable generalising the single
degree of freedom x in the case of the harmonic oscillator. We will use a Heisenberg picture
of time evolution where we promote ϕ to an operator.
First we need to solve the classical theory. The solutions of the Klein–Gordon equation
include the plane wave solution
$$\phi(x^\mu) = \phi_0\, e^{ip_\mu x^\mu} = \phi_0\, e^{-ip^0 t + i\vec{p}\cdot\vec{x}} , \tag{10.34}$$
The latter condition is in order to consider the positive frequency modes only.
We can write down the most general solution by constructing a complete orthonormal
set of modes in terms of which any solution may be expressed. We need to first define an
inner product on the space of solutions. The inner product is an integral over a constant time
hypersurface Σt and is
$$(f, g) = i\int_{\Sigma_t}\left(f^*\overleftrightarrow{\partial_t}\, g\right) d^{n-1}x , \qquad f^*\overleftrightarrow{\partial_t}\, g = f^*\partial_t g - (\partial_t f^*)\, g . \tag{10.37}$$
By using Stokes’ theorem and the equation of motion one can check that this is independent
of the chosen hypersurface. Let us define
$$\psi_p = N_p\, e^{ip_\mu x^\mu} , \tag{10.38}$$
with p2 + m2 = 0. Then {ψp , ψp∗ } form a basis of solutions and any field configuration can be
expanded as
$$\phi(x) = \int d^3p\,\left(a_p\,\psi_p(x) + a_p^*\,\psi_p^*(x)\right) , \tag{10.39}$$
where ap and a∗p are complex constants. In order for the basis to be orthonormal we take
$$N_p = \frac{1}{\sqrt{2p^0}\,(2\pi)^{3/2}} . \tag{10.40}$$
We quantise the theory by promoting ϕ and π to be operators and impose the standard
commutation relations:
[ϕ(t, ⃗x), π(t, ⃗y )] = iδ (3) (⃗x − ⃗y ) , [ϕ(t, ⃗x), ϕ(t, ⃗y )] = 0 , [π(t, ⃗x), π(t, ⃗y )] = 0 . (10.41)
This may then be translated into commutation relations for the a’s, with
ap |0⟩ = 0 , ∀p . (10.43)
It may seem that the definition of the vacuum state depends on the initial choice of
inertial frame, however this is not the case. Consider a different inertial frame x̃µ related by a
Lorentz transformation x̃µ = Λµν xν . In this new frame the positive frequency mode functions
are
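$$\tilde\psi_p = N_p\, e^{ip_\mu\tilde{x}^\mu} , \tag{10.44}$$
and the field expansion is
$$\phi(\tilde{x}) = \int d^3p\,\left(\tilde{a}_p\,\tilde\psi_p + \tilde{a}_p^\dagger\,\tilde\psi_p^*\right) , \tag{10.45}$$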
and in terms of these modes the new vacuum state satisfies ãp |0̃⟩ = 0, ∀p. We need to show
that
ap |0⟩ = 0 ∀p ⇒ ãp |0⟩ = 0 ∀p . (10.46)
We have
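$$\tilde\psi_p = \frac{1}{\sqrt{2p^0}\,(2\pi)^{3/2}}\, e^{ip_\mu\tilde{x}^\mu} = \left(\frac{\tilde{p}^0}{p^0}\right)^{1/2}\frac{1}{\sqrt{2\tilde{p}^0}\,(2\pi)^{3/2}}\, e^{i\tilde{p}_\mu x^\mu} = \left(\frac{\tilde{p}^0}{p^0}\right)^{1/2}\psi_{\tilde{p}} . \tag{10.47}$$
Moreover since we restrict to the orthochronous subgroup of the Lorentz group, i.e. Λ00 > 0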
we have p0 > 0 ⇒ p̃0 > 0. Therefore we have
and the converse follows by symmetry and the vacuum state is independent of the choice of
frame.
10.3 QFT in curved spacetime
We now want to consider what changes when we try to quantise a field theory on curved
spacetime. We fix a background (M, g) and assume that it is globally hyperbolic. Recall
that this means that the spacetime admits a Cauchy surface and from initial conditions on
the Cauchy surface we can solve the equations of motion on all of spacetime. We perform
minimal coupling of the theory so that η µν → g µν and ∂µ → ∇µ . The Klein–Gordon equation
becomes
∇2 ϕ ≡ g µν ∇µ ∂ν ϕ = m2 ϕ , (10.49)
with Σ a spacelike hypersurface and nµ a unit normal vector and γ the determinant of the
induced metric. Let the background admit a Killing vector, K, then on functions we have
[K, ∇2 ]f = 0 . (10.51)
Since ∇2 and iK are both self-adjoint and commuting they admit a complete set of common
eigenfunctions
∇2 f = m2 f , iK µ ∂µ f = ωf . (10.52)
If K is timelike we are entitled to call the eigenvalue the frequency. Indeed this is how it works
in Minkowski space where K = ∂t . If f is an eigenfunction with positive frequency ω then f ∗
is an eigenfunction of negative frequency −ω. We can then without loss of generality expand
our fields in terms of positive and negative frequency eigenfunctions of the Laplacian in a
basis {ψi } of positive frequency modes and {ψi∗ } of negative frequency modes. We expand
our field as
$$\phi = \sum_i\left(a_i\,\psi_i + a_i^\dagger\,\psi_i^*\right) , \tag{10.53}$$
with
$$[a_i, a_j^\dagger] = \delta_{ij} . \tag{10.54}$$
Consider a sandwich spacetime (M, g) made up of three regions, region B bottom, region
C for centre and region T for top, and assume the Klein–Gordon equation holds throughout
spacetime. Region B is stationary and admits a timelike Killing vector K B , region C is not
stationary and all sorts of dynamical processes might take place so long as it remains globally
hyperbolic, and finally region T is once again stationary with a new timelike Killing vector
K T . If we quantise in region B we pick a set of modes {fi , fi∗ } that satisfy iK B fi = ωi fi
with ωi > 0. On the other hand in region T we choose another set of modes {gi , gi∗ } that
satisfy iK T gi = ω̃i gi with ω̃i > 0. Note that even though the positive-frequency conditions
are imposed using the Killing vectors in specific regions the modes extend throughout the
whole of spacetime. In the two cases the respective expansion is then
$$\phi(x) = \sum_i\left(a_i f_i + a_i^\dagger f_i^*\right) = \sum_i\left(b_i g_i + b_i^\dagger g_i^*\right) , \tag{10.55}$$
where the modes have been normalised with respect to the Klein–Gordon inner product so
that the commutation relations are
Since {fi } forms a basis we can also expand any function in terms of it, we have
$$g_i = \sum_j\left(A_{ij} f_j + B_{ij} f_j^*\right) . \tag{10.57}$$
The coefficients Aij and Bij are called the Bogoliubov coefficients and the transformation
between the different bases is called a Bogoliubov transformation. Using the normalisation
conditions it can be shown that they satisfy
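$$\begin{aligned}
\sum_k\left(A_{ik}A^*_{jk} - B_{ik}B^*_{jk}\right) &= \delta_{ij} , \\
\sum_k\left(A_{ik}B_{jk} - B_{ik}A_{jk}\right) &= 0 .
\end{aligned} \tag{10.58}$$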
Or in matrix notation
AA† − BB † = 1 , AB T = BAT . (10.59)
$$b_i = \sum_j\left(A^*_{ij}\, a_j - B^*_{ij}\, a_j^\dagger\right) . \tag{10.60}$$
The procedure above defines a vacuum state associated with the modes {fi , fi∗ } called
the in-vacuum as the states satisfy ai |0⟩in = 0 ∀i. In a stationary reference frame in region
B (i.e. along an integral curve of K B ) this will appear empty. What about in region T ? What
is the expected number of particles in mode i in the state |0⟩in ? It is given by the
expectation value
If this is non-zero there is pair production. Equivalently, the in-vacuum and
out-vacuum are different. Hence a changing spacetime geometry generically causes particle
production.
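A small two-mode example makes the Bogoliubov conditions tangible. The sketch below builds coefficients A = cosh(r) U and B = sinh(r) U from a unitary 2×2 matrix U (a deliberately simple family chosen for illustration, not derived from any particular spacetime), checks (10.59), and evaluates the number of out-particles in the in-vacuum as Σ_j |B_ij|², which follows from acting with (10.60) on |0⟩in. All helper names are hypothetical.

```python
import cmath
import math


def dagger(X):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]


def transpose(X):
    n = len(X)
    return [[X[j][i] for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def example_coefficients(r, angle, phase):
    """A = cosh(r) U and B = sinh(r) U for a unitary 2x2 matrix U (toy family)."""
    c, s = math.cos(angle), math.sin(angle)
    U = [[c, -s * cmath.exp(1j * phase)], [s * cmath.exp(-1j * phase), c]]
    A = [[math.cosh(r) * u for u in row] for row in U]
    B = [[math.sinh(r) * u for u in row] for row in U]
    return A, B


def constraint_violation(A, B):
    """Largest entry of A A^dag - B B^dag - 1 and of A B^T - B A^T, see (10.59)."""
    n = len(A)
    first_a, first_b = matmul(A, dagger(A)), matmul(B, dagger(B))
    second_a, second_b = matmul(A, transpose(B)), matmul(B, transpose(A))
    worst = 0.0
    for i in range(n):
        for j in range(n):
            identity = 1.0 if i == j else 0.0
            worst = max(worst, abs(first_a[i][j] - first_b[i][j] - identity))
            worst = max(worst, abs(second_a[i][j] - second_b[i][j]))
    return worst


def expected_particles(B, i):
    """Number of out-particles in mode i seen in the in-vacuum: sum_j |B_ij|^2."""
    return sum(abs(b) ** 2 for b in B[i])
```

In this toy family every out-mode contains sinh²(r) particles, so the in-vacuum looks empty to the out-observer only when r = 0, that is when B vanishes.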
Even though we have made an effort above to understand QFT in curved space we will
first consider a phenomenon that uses the above ideas but manifests in flat space. This is
the Unruh effect, which states that an accelerating observer in the Minkowski vacuum will
observe a thermal spectrum of particles.
The basic idea is very simple: observers with different notions of positive and negative
frequency modes will disagree on the particle content of a given state. A uniformly accelerated
observer in Minkowski space moves along an orbit of a time-like Killing vector, however this is not
the usual time-translation Killing vector. We can therefore expand the field in terms of modes
appropriate for the accelerated observer and calculate the number operator in the ordinary
Minkowski vacuum. We will find that this leads to a thermal spectrum of particles.
To simplify things as much as possible let us consider a massless scalar field in two
dimensions. The wave equation is
□ϕ = 0 . (10.62)
Before trying to quantise the theory consider a uniformly accelerating observer. We have
seen this earlier in section 4.3, but let us review the details. In inertial coordinates the metric
can be written as
ds2 = −dt2 + dx2 . (10.63)
$$t(\tau) = \frac{1}{\alpha}\sinh(\alpha\tau) , \qquad x(\tau) = \frac{1}{\alpha}\cosh(\alpha\tau) , \tag{10.64}$$
note that
$$x^2 = t^2 + \alpha^{-2} . \tag{10.65}$$
We can choose new coordinates on two-dimensional Minkowski space that are adapted to
uniformly accelerated motion as
$$t = \frac{1}{a}\, e^{a\xi}\sinh(a\eta) , \qquad x = \frac{1}{a}\, e^{a\xi}\cosh(a\eta) , \qquad (x > |t|) . \tag{10.66}$$
The new coordinates have ranges
−∞ < η, ξ < ∞ , (10.67)
and cover the wedge x > |t|. Rindler space corresponds to the right wedge x > |t| foliated
by the worldlines of the accelerated observers and labelled by region I in figure 21. In these
Figure 21: Minkowski spacetime in Rindler coordinates. Region I is the region accessible to
an observer undergoing constant acceleration in the +x-direction. The coordinates (η, ξ) can
be used in region I or region IV, where they point in the opposite direction. The vector field
∂η corresponds to the generator of Lorentz boosts and the horizons H ± are Killing horizons
for this vector field, which represent the boundaries of the past and future as witnessed by
the Rindler observer.
η=τ, ξ = 0. (10.69)
$$ds^2 = e^{2a\xi}\left(-d\eta^2 + d\xi^2\right) . \tag{10.70}$$
The null line t = x labelled by H + is a future Cauchy horizon for any η = constant spacelike
hypersurface in region I. Similarly H − is a past Cauchy horizon.
The metric is independent of η and therefore ∂η is a Killing vector, however since this is
Minkowski spacetime there are more of course. Indeed if we express ∂η in the (t, x) coordinates
we have
∂η = a(x∂t + t∂x ) . (10.71)
This is the Killing vector which generates a boost in the x-direction. It naturally extends
throughout the spacetime: in regions II and III it is spacelike while in region IV it is timelike but past-directed.
The horizons are Killing horizons for ∂η .
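The Rindler form of the metric (10.70) and the expression (10.71) for ∂η can be verified numerically from the coordinate map (10.66). The sketch below pulls back the Minkowski metric −dt² + dx² using a central-difference Jacobian; the function names and the sample point are illustrative choices.

```python
import math


def rindler_to_minkowski(eta, xi, a):
    """Coordinate map (10.66) for region I: returns (t, x)."""
    scale = math.exp(a * xi) / a
    return scale * math.sinh(a * eta), scale * math.cosh(a * eta)


def jacobian(eta, xi, a, h=1e-6):
    """Central-difference derivatives of (t, x) with respect to eta and xi."""
    t_p, x_p = rindler_to_minkowski(eta + h, xi, a)
    t_m, x_m = rindler_to_minkowski(eta - h, xi, a)
    d_eta = ((t_p - t_m) / (2.0 * h), (x_p - x_m) / (2.0 * h))
    t_p, x_p = rindler_to_minkowski(eta, xi + h, a)
    t_m, x_m = rindler_to_minkowski(eta, xi - h, a)
    d_xi = ((t_p - t_m) / (2.0 * h), (x_p - x_m) / (2.0 * h))
    return d_eta, d_xi


def induced_metric(eta, xi, a):
    """Components (g_eta_eta, g_eta_xi, g_xi_xi) of the pulled-back flat metric."""
    d_eta, d_xi = jacobian(eta, xi, a)

    def minkowski(u, v):
        return -u[0] * v[0] + u[1] * v[1]

    return minkowski(d_eta, d_eta), minkowski(d_eta, d_xi), minkowski(d_xi, d_xi)
```

At any sample point the result should be (−e^{2aξ}, 0, e^{2aξ}), and the first row of the Jacobian should equal (a x, a t), which is the component form of ∂η = a(x∂t + t∂x).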
We can define coordinates (η, ξ) in region IV by flipping the signs in (10.66),
$$t = -\frac{1}{a}\, e^{a\xi}\sinh(a\eta) , \qquad x = -\frac{1}{a}\, e^{a\xi}\cosh(a\eta) , \qquad (x < -|t|) . \tag{10.72}$$
The sign guarantees that ∂η and ∂t point in opposite directions. Strictly speaking we cannot
use the (η, ξ) simultaneously in regions I and IV since the ranges are the same in each region;
we must explicitly indicate to which region the coordinates belong. We add labels to
distinguish them, so that the metric takes the same form in both regions.
Along the surface t = 0 the Killing vector ∂η is a hypersurface-orthogonal timelike Killing
vector except for the single point x = 0 where it vanishes. We can therefore use it to define
a set of positive and negative frequency modes on which we can build a Fock space for the
scalar-field Hilbert space. The massless Klein–Gordon equation in Rindler coordinates takes
the form
□ϕ = e−2aξ (−∂η2 + ∂ξ2 )ϕ = 0 . (10.73)
However this is only true in region I since we need our modes to be positive frequency
with respect to a future directed Killing vector, in region IV the relevant Killing vector is
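$\partial_{(-\eta)} = -\partial_\eta$. To remove this problem of defining the modes we introduce two sets of modes,
one with support in region I and one with support in region IV:
\[ g_k^{(1)} = \begin{cases} \frac{1}{\sqrt{4\pi\omega}}\, e^{-i\omega\eta + ik\xi} & \text{I} \\ 0 & \text{IV} \end{cases} \qquad g_k^{(2)} = \begin{cases} 0 & \text{I} \\ \frac{1}{\sqrt{4\pi\omega}}\, e^{+i\omega\eta + ik\xi} & \text{IV} \end{cases} \tag{10.76} \]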
with ω = |k| in each region. These then define the positive frequency with respect to the
relevant future directed timelike Killing vector. The two sets with their conjugates form a
complete set of modes for any solution to the wave equation throughout the spacetime. Both
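sets are non-vanishing in regions II and III, however this is obscured by the choice of (η, ξ)
coordinates. Denoting the associated annihilation and creation operators as $b_k^{(i)}$ and
$b_k^{(i)\dagger}$, we can write
\[ \phi = \int dk \left( b_k^{(1)} g_k^{(1)} + b_k^{(1)\dagger} g_k^{(1)*} + b_k^{(2)} g_k^{(2)} + b_k^{(2)\dagger} g_k^{(2)*} \right) . \tag{10.77} \]
The Rindler modes are normalised so that
\[ \left( g_{k_1}^{(i)}, g_{k_2}^{(j)} \right) = \delta^{ij}\, \delta(k_1 - k_2) , \tag{10.79} \]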
and similarly for the conjugate modes. There are two sets of modes, Minkowski and Rindler,
that we can expand the solution of the Klein–Gordon equation in. Although the Hilbert
spaces are the same the Fock spaces are different, in particular the definition of the vacuum.
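The Minkowski vacuum $|0_M\rangle$ satisfies
\[ a_k |0_M\rangle = 0 , \tag{10.80} \]
while the Rindler vacuum $|0_R\rangle$ satisfies
\[ b_k^{(1)} |0_R\rangle = b_k^{(2)} |0_R\rangle = 0 . \tag{10.81} \]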
Because an individual Rindler mode cannot be written purely in terms of positive frequency
Minkowski modes, the Rindler annihilation operators are a superposition of both the
Minkowski creation and annihilation operators.
A Rindler observer will be static with respect to orbits of the boost Killing vector ∂η .
Such an observer in region I will describe particles in terms of the Rindler modes $g_k^{(1)}$ and will
observe a state in the Rindler vacuum to be devoid of particles, a state $b_k^{(1)\dagger}|0_R\rangle$ to contain
a single particle of frequency ω = |k|, and so forth. Conversely a Rindler observer travelling
through the Minkowski vacuum state will detect a background of particles, even though to
the inertial observer the vacuum is completely empty.
What kind of particles does the Rindler observer detect? We know how to answer this: we
need to compute the Bogoliubov coefficients relating the Minkowski modes to the Rindler
modes, and then use these to compute the expectation values. Unruh found a shortcut to
this somewhat tedious computation. His idea was to find a set of modes that share the same
vacuum as the Minkowski modes but for which the overlap with the Rindler modes is more
direct. We start with the Rindler modes and extend them to all of spacetime, and then
express the extension in terms of the original Rindler modes.
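We have
\[ e^{-a(\eta-\xi)} = \begin{cases} a(x-t) & \text{I} \\ a(t-x) & \text{IV} \end{cases} \qquad e^{a(\eta+\xi)} = \begin{cases} a(t+x) & \text{I} \\ -a(t+x) & \text{IV} \end{cases} \tag{10.82} \]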
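The identities in (10.82) are easy to confirm numerically. The following is a minimal sketch, assuming that region I is covered by $t = a^{-1}e^{a\xi}\sinh(a\eta)$, $x = a^{-1}e^{a\xi}\cosh(a\eta)$ and that region IV is obtained by flipping both signs as in (10.72). The function names are hypothetical helpers introduced only for this check.

```python
import math

def rindler_to_minkowski(eta, xi, a, region):
    """Map Rindler (eta, xi) to Minkowski (t, x) in region "I" or "IV".

    Assumes region I is t = e^{a xi} sinh(a eta)/a, x = e^{a xi} cosh(a eta)/a,
    and that region IV is obtained by flipping both signs, as in (10.72).
    """
    sign = 1.0 if region == "I" else -1.0
    t = sign * math.exp(a * xi) * math.sinh(a * eta) / a
    x = sign * math.exp(a * xi) * math.cosh(a * eta) / a
    return t, x

def check_identities(eta, xi, a, region):
    """Return (lhs1, rhs1, lhs2, rhs2) for the two identities in (10.82)."""
    t, x = rindler_to_minkowski(eta, xi, a, region)
    lhs1 = math.exp(-a * (eta - xi))
    lhs2 = math.exp(a * (eta + xi))
    if region == "I":
        rhs1, rhs2 = a * (x - t), a * (t + x)
    else:  # region IV
        rhs1, rhs2 = a * (t - x), -a * (t + x)
    return lhs1, rhs1, lhs2, rhs2

for region in ("I", "IV"):
    l1, r1, l2, r2 = check_identities(eta=0.3, xi=-0.7, a=1.5, region=region)
    print(region, math.isclose(l1, r1), math.isclose(l2, r2))
```

Both identities should hold in each region for any real η, ξ and any a > 0; the sample values above are arbitrary.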
We can express the spacetime dependence of a mode $g_k^{(1)}$ with k > 0 in terms of the
Minkowski coordinates in region I as
\[ \sqrt{4\pi\omega}\, g_k^{(1)} = a^{i\omega/a} (x - t)^{i\omega/a} . \tag{10.83} \]
The analytic continuation of this throughout all of spacetime is then obvious: we just use this
final expression for all (t, x). We want to express the result in terms of the Rindler modes
everywhere, and so we need to bring the $g_k^{(2)}$ modes into the game. From (10.76) and (10.82) we have
\[ \sqrt{4\pi\omega}\, g_k^{(2)} = a^{i\omega/a} (-t - x)^{i\omega/a} . \tag{10.84} \]
This doesn’t match the behaviour of our analytically extended mode, however if we take the
complex conjugate and reverse the wave number we find
\[ \sqrt{4\pi\omega}\, g_{-k}^{(2)*} = a^{i\omega/a} e^{\pi\omega/a} (-t + x)^{i\omega/a} , \tag{10.85} \]
and therefore
\[ \sqrt{4\pi\omega} \left( g_k^{(1)} + e^{-\pi\omega/a} g_{-k}^{(2)*} \right) = a^{i\omega/a} (-t + x)^{i\omega/a} . \tag{10.86} \]
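An identical result holds for the k < 0 modes. The properly normalised mode is
\[ h_k^{(1)} = \frac{1}{\sqrt{2\sinh\left(\frac{\pi\omega}{a}\right)}} \left( e^{\pi\omega/(2a)} g_k^{(1)} + e^{-\pi\omega/(2a)} g_{-k}^{(2)*} \right) . \tag{10.87} \]
This is the appropriate analytic extension of the $g_k^{(1)}$ modes; the extension of the $g_k^{(2)}$ modes
is
\[ h_k^{(2)} = \frac{1}{\sqrt{2\sinh\left(\frac{\pi\omega}{a}\right)}} \left( e^{\pi\omega/(2a)} g_k^{(2)} + e^{-\pi\omega/(2a)} g_{-k}^{(1)*} \right) . \tag{10.88} \]
One can check that these are correctly normalised. We can now expand in these modes as
\[ \phi = \int dk \left( c_k^{(1)} h_k^{(1)} + c_k^{(1)\dagger} h_k^{(1)*} + c_k^{(2)} h_k^{(2)} + c_k^{(2)\dagger} h_k^{(2)*} \right) . \tag{10.89} \]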
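The normalisation claimed for (10.87) can be checked in a couple of lines. The sketch below assumes the standard properties of the Klein–Gordon inner product: conjugate modes have negative norm, $(g^{*}_{k_1}, g^{*}_{k_2}) = -\delta(k_1 - k_2)$, and modes supported in region I are orthogonal to modes supported in region IV. The prefactors are evaluated at $\omega = |k_1| = |k_2|$, which is allowed on the support of the delta function.

```latex
\begin{align}
\left(h^{(1)}_{k_1}, h^{(1)}_{k_2}\right)
 &= \frac{1}{2\sinh(\pi\omega/a)} \left[ e^{\pi\omega/a}\left(g^{(1)}_{k_1}, g^{(1)}_{k_2}\right)
    + e^{-\pi\omega/a}\left(g^{(2)*}_{-k_1}, g^{(2)*}_{-k_2}\right) \right] \\
 &= \frac{e^{\pi\omega/a} - e^{-\pi\omega/a}}{2\sinh(\pi\omega/a)}\,\delta(k_1-k_2)
  = \delta(k_1-k_2) .
\end{align}
```

The same computation, with the labels (1) and (2) exchanged, applies to (10.88).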
The modes $h_k^{(i)}$ can be expressed purely in terms of positive frequency Minkowski modes $f_k$
and therefore they share the same vacuum state $|0_M\rangle$, so that
\[ c_k^{(i)} |0_M\rangle = 0 . \tag{10.90} \]
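In the Minkowski vacuum an observer in region I will observe particles defined by the operators
$b_k^{(1)}$. Comparing the expansions (10.77) and (10.89) gives
$b_k^{(1)} = \left(2\sinh(\pi\omega/a)\right)^{-1/2} \left( e^{\pi\omega/(2a)} c_k^{(1)} + e^{-\pi\omega/(2a)} c_{-k}^{(2)\dagger} \right)$,
so the expected number of such particles of frequency ω is
\[ \begin{aligned} \langle 0_M | n_R^{(1)}(k) | 0_M \rangle &= \langle 0_M | b_k^{(1)\dagger} b_k^{(1)} | 0_M \rangle \\ &= \frac{1}{2\sinh\left(\frac{\pi\omega}{a}\right)} \langle 0_M | e^{-\pi\omega/a} c_{-k}^{(2)} c_{-k}^{(2)\dagger} | 0_M \rangle \\ &= \frac{1}{e^{2\pi\omega/a} - 1}\, \delta(0) . \end{aligned} \tag{10.91} \]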
Planck’s law describes the spectral density of electromagnetic radiation emitted by a black
body in thermal equilibrium at a given temperature T . It says that the spectral radiance of a
body for frequency ω at temperature T is given by
\[ B(\omega, T) = \frac{\hbar\omega^3}{4\pi^3 c^2}\, \frac{1}{e^{\hbar\omega/(k_B T)} - 1} . \tag{10.92} \]
We conclude that an observer moving with uniform acceleration through the Minkowski
vacuum observes a thermal spectrum of particles: comparing the Bose–Einstein factor in
(10.91) with that in (10.92), in units ℏ = kB = 1, identifies the temperature as T = a/(2π).
(There is more to saying this is a thermal spectrum than just the above: one needs to check
that there are no hidden correlations in the observed particles. This has indeed been shown,
and therefore the radiation detected by a Rindler observer is truly thermal.)
The temperature $T = a/(2\pi)$ is what would be measured by an observer moving along the
path ξ = 0, which feels the acceleration α = a. Any other path with ξ = constant feels an
acceleration
\[ \alpha = a e^{-a\xi} , \tag{10.93} \]
and thus should measure thermal radiation with a temperature $T = \alpha/(2\pi)$. As ξ → ∞ the
temperature goes to 0, which is consistent with the fact that near ∞ the Rindler observer is
nearly inertial.
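To get a feeling for the size of the effect one can restore SI units, in which T = α/(2π) becomes T = ℏα/(2π c k_B). The snippet below is a minimal sketch under that assumption; the constants are CODATA values and the function names are hypothetical.

```python
import math

# SI constants: c and k_B are exact by definition, hbar follows from the exact h.
HBAR = 1.054571817e-34   # J s
C = 299792458.0          # m / s
K_B = 1.380649e-23       # J / K

def unruh_temperature(alpha):
    """Unruh temperature in kelvin for proper acceleration alpha in m/s^2.

    Restores SI units in T = alpha / (2 pi): T = hbar * alpha / (2 pi c k_B).
    """
    return HBAR * alpha / (2.0 * math.pi * C * K_B)

def local_acceleration(a, xi):
    """Proper acceleration alpha = a * exp(-a * xi) on the path xi = constant, as in (10.93).

    Here a and xi are in units with c = 1, as in the notes.
    """
    return a * math.exp(-a * xi)

print(unruh_temperature(9.81))        # Earth's surface gravity: a tiny fraction of a kelvin
print(local_acceleration(1.0, 0.0))   # equals a on the path xi = 0
print(local_acceleration(1.0, 5.0))   # smaller at larger xi, so the temperature drops
```

The temperature is linear in the acceleration, so even very large laboratory accelerations give extremely small temperatures.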
The Unruh effect tells us that an accelerated observer will detect particles in the Minkowski
vacuum state. An inertial observer would say that the same state is completely empty: the
expectation value of the energy momentum tensor vanishes, ⟨Tµν ⟩ = 0. If there is no energy momentum
how can the Rindler observer detect particles? If the Rindler observer is to detect background
particles, they must carry a detector. This must be coupled to the particle being detected.
However if a detector is being maintained at constant acceleration, energy is not conserved.
From the point of view of the Minkowski observer the Rindler detector emits as well as absorbs
particles, once the coupling is introduced the possibility of emission is unavoidable. When
the detector registers a particle the inertial observer would say that it had emitted a particle
and felt a radiation-reaction force in response. Ultimately the energy needed to excite the
Rindler detector does not come from the background energy momentum tensor but from the
energy we put into the detector to keep it accelerating.
We may now use a very quick argument following the above to conclude that a black hole
has a temperature. Consider a static observer at radius r1 > RS outside the Schwarzschild
black hole. Such an observer moves along orbits of the timelike Killing vector K = ∂t . The
red-shift factor is given by
\[ V = \sqrt{1 - \frac{2G_N M}{r}} , \tag{10.94} \]
and the magnitude of the acceleration is given by
\[ a = \frac{G_N M}{r^{3/2}\sqrt{r - 2G_N M}} . \tag{10.95} \]
For an observer close to the event horizon, $r_1 - 2G_N M \ll 2G_N M$, this acceleration becomes very
large compared to the scale set by the Schwarzschild radius,
\[ a_1 \gg \frac{1}{2G_N M} . \tag{10.96} \]
Let us assume that the quantum state of some scalar field ϕ looks like the Minkowski vacuum
as seen by a freely falling observer near the black hole. The static observer looks just
like a constant acceleration observer in flat spacetime and will detect Unruh radiation at a
temperature $T_1 = a_1/(2\pi)$.
Now consider a static observer at infinity. The radiation will propagate to infinity with
an appropriate red-shift factor. We find
\[ T_\infty = \frac{V_1}{V_\infty}\, \frac{a_1}{2\pi} . \tag{10.97} \]
At infinity we have $V_\infty = 1$, so the observed temperature is
\[ T_\infty = \lim_{r_1 \to 2G_N M} \frac{V_1 a_1}{2\pi} = \frac{\kappa}{2\pi} . \tag{10.98} \]
This is the Hawking effect and the radiation is known as Hawking radiation.
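The limit in (10.98) can be made explicit. As a sketch, take the standard Schwarzschild expressions for the redshift factor and for the proper acceleration of a static observer, together with the surface gravity $\kappa = 1/(4G_N M)$ of the Schwarzschild horizon (the latter is assumed here rather than derived):

```latex
\begin{align}
V(r)\,a(r) &= \sqrt{1 - \frac{2G_N M}{r}}\;\frac{G_N M}{r^{2}\sqrt{1 - \frac{2G_N M}{r}}}
            = \frac{G_N M}{r^{2}}, \\
\lim_{r_1\to 2G_N M} V_1 a_1 &= \frac{G_N M}{(2G_N M)^2} = \frac{1}{4G_N M} = \kappa, \\
T_\infty &= \frac{\kappa}{2\pi} = \frac{1}{8\pi G_N M}.
\end{align}
```

The divergence of the local acceleration at the horizon is exactly compensated by the vanishing redshift factor, which is why the temperature at infinity is finite.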
We can be more rigorous in the derivation of the Hawking temperature. Consider a
spacetime that corresponds to a spherically symmetric collapsing star which forms a black
hole, recall that the Penrose diagram is given in 5. This is a curved spacetime which is globally
hyperbolic, for instance I − is a Cauchy surface. Even though the Schwarzschild black hole
solution is a static spacetime the collapsing star is not, and involves complicated dynamics.
However the spacetime is approximately stationary in the far asymptotic past (near I − ) and
the far asymptotic future (near I + ). We can therefore perform second quantisation with
respect to stationary observers near I − , which gives us the “in”-modes and the “in”-vacuum, and
also a second quantisation associated with stationary observers at I + , leading to the “out”-
vacuum. We have a sandwich spacetime and we can ask whether observers in the far future will
see particles in the in-vacuum.
The field expansion defining the in-vacuum can be constructed by specifying a complete
set of positive frequency modes on I − . For the quantisation in the far future I + is not a
Cauchy surface for the spacetime, one must take I + ∪ H+ . We may therefore quantise the
field in the far future by specifying a complete set on it. There are three sets of modes:
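\[ \begin{aligned} f_i &: \text{positive frequency on } \mathcal{I}^- \\ g_i &: \text{positive frequency on } \mathcal{I}^+ \text{ and zero on } \mathcal{H}^+ \\ h_i &: \text{positive frequency on } \mathcal{H}^+ \text{ and zero on } \mathcal{I}^+ \end{aligned} \tag{10.99} \]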
Strictly speaking there is no timelike Killing vector on H+ so the term positive frequency
is somewhat misleading, however the choice of modes hi does not affect the outcome of the
calculation. We can choose an arbitrary set and call them positive frequency modes and
attach them to annihilation operators in the field expansion; we only require that the set
{g, h} gives a basis of modes. We can therefore expand
\[ \phi(x) = \sum_i \left( a_i f_i(x) + \text{h.c.} \right) = \sum_I \left( b_I g_I(x) + \text{h.c.} \right) + \sum_\alpha \left( c_\alpha h_\alpha(x) + \text{h.c.} \right) . \tag{10.100} \]
We now want to look at the analytic solutions of the Klein–Gordon equation in the
Schwarzschild black hole background. This is hard. Instead we can impose a boundary
condition on the solution at I + and investigate what its corresponding form must be on I − .
This amounts to tracing the solution back in time from I + to I − .
The metric of the Schwarzschild black hole spacetime with coordinates (t, r∗ , θ, ϕ) reads
\[ ds^2 = \left(1 - \frac{2M}{r}\right)\left(-dt^2 + dr_*^2\right) + r^2 ds^2(S^2) . \tag{10.102} \]
We will also use the light-cone coordinates u = t − r∗ and v = t + r∗ . We can find the
Klein–Gordon equation for the field ϕ(t, r∗ , θ, ϕ). Expanding in spherical harmonics
we find
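\[ \left[ \partial_t^2 - \partial_{r_*}^2 + V_l(r_*) \right] \chi_l = 0 , \tag{10.104} \]
where
\[ V_l(r_*) = \left(1 - \frac{2M}{r}\right)\left( \frac{l(l+1)}{r^2} + \frac{2M}{r^3} \right) . \tag{10.105} \]
We set
\[ \chi_l(t, r_*) = e^{-i\omega t} R_{l\omega}(r_*) , \tag{10.106} \]
so that
\[ \left( \partial_{r_*}^2 + \omega^2 \right) R_{l\omega} = V_l R_{l\omega} . \tag{10.107} \]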
We can get some intuition by looking at the potential. Both near the horizon H+ (r∗ → −∞)
and near I ± (r∗ → ∞) the potential tends to zero, so it takes the form of a potential barrier. If
we consider how any solution to the above evolves in time, it will be partly transmitted and
partly reflected as it comes in from r∗ = ∞.
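The barrier shape can be seen numerically. The following is a minimal sketch in units G_N = c = 1 that evaluates the potential of (10.105) as a function of the areal radius r and locates its maximum on a grid. The tortoise coordinate is taken to be $r_* = r + 2M\log(r/(2M) - 1)$, which is a standard choice assumed here, and the function names are hypothetical.

```python
import math

def tortoise(r, M=1.0):
    """Tortoise coordinate r_* = r + 2M log(r/(2M) - 1), valid for r > 2M (G = c = 1)."""
    return r + 2.0 * M * math.log(r / (2.0 * M) - 1.0)

def potential(r, l, M=1.0):
    """Effective potential V_l of (10.105), written as a function of the areal radius r."""
    return (1.0 - 2.0 * M / r) * (l * (l + 1) / r**2 + 2.0 * M / r**3)

def barrier_peak(l, M=1.0, r_max=50.0, steps=200000):
    """Locate the maximum of V_l on a uniform grid in r between 2M and r_max."""
    best_r, best_v = None, -1.0
    for n in range(1, steps):
        r = 2.0 * M + (r_max - 2.0 * M) * n / steps
        v = potential(r, l, M)
        if v > best_v:
            best_r, best_v = r, v
    return best_r, best_v

for l in (0, 1, 2):
    r_peak, v_peak = barrier_peak(l)
    print(l, round(r_peak, 3), round(tortoise(r_peak), 3), v_peak)
```

For l = 0 the maximum can be found by hand at r = 8M/3, and for larger l the peak moves outwards towards r = 3M, so the barrier always sits a little outside the horizon.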
Near I ± the solutions are just plane waves. We define outgoing and ingoing as those
which correspond to r∗ increasing or decreasing with time. We define the early modes
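\[ \begin{aligned} f_{lm\omega+} &= \frac{1}{\sqrt{2\pi\omega}}\, \frac{Y_{lm}}{r}\, e^{-i\omega u} , && \text{outgoing} \\ f_{lm\omega-} &= \frac{1}{\sqrt{2\pi\omega}}\, \frac{Y_{lm}}{r}\, e^{-i\omega v} , && \text{ingoing} \end{aligned} \tag{10.108} \]
at I − and late modes
\[ \begin{aligned} g_{lm\omega+} &= \frac{1}{\sqrt{2\pi\omega}}\, \frac{Y_{lm}}{r}\, e^{-i\omega u} , && \text{outgoing} \\ g_{lm\omega-} &= \frac{1}{\sqrt{2\pi\omega}}\, \frac{Y_{lm}}{r}\, e^{-i\omega v} , && \text{ingoing} \end{aligned} \tag{10.109} \]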
at I + . We will be interested mainly in ingoing early modes and outgoing late modes, so we
will use the shorthand notation $f_\omega \equiv f_{lm\omega-}$ and $g_\omega \equiv g_{lm\omega+}$.
We need to express gω in terms of fω′ and fω∗′ on I − . First note that plane waves such
as gω are in fact completely delocalised since they have support everywhere on I + .
We want to trace the solution of the late modes back in time in terms of the early modes.
As the wave travels inwards from I + toward decreasing values of r∗ , it will encounter the
potential barrier. One part of the wave, $g_\omega^{(r)}$, will be reflected and end up on I − with the
same frequency ω. This will correspond to a term of the form $A_{\omega\omega'} \propto \delta(\omega - \omega')$ in the
expansion in (10.101). The remaining part $g_\omega^{(t)}$ will be transmitted through the barrier and
will enter the collapsing matter. In that region the precise geometry of spacetime is unknown.
However since we are interested in a packet peaked at late times and at some finite frequency
ω0 we know that the packet will be peaked at a very high frequency as it enters the collapsing
matter due to the gravitational blueshift. This allows us to assume that the packet will obey
the geometric optics approximation, which means that $g_\omega$ takes the form $A(x)e^{iS(x)}$ where
A(x) is slowly varying compared to S. Substituting into the Klein–Gordon equation we find
$\nabla_\mu S \nabla^\mu S = 0$, which means that surfaces of constant phase are null. Given a wave we can
trace its surfaces of constant phase back in time by following null geodesics.
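The step from the ansatz to the null condition can be spelled out. The following is a sketch for the massless Klein–Gordon equation, in which ε is a bookkeeping parameter (an assumption introduced here, not notation from the notes) that tracks the rapid variation of the phase:

```latex
\begin{align}
\phi &= A(x)\, e^{iS(x)/\epsilon}, \\
\nabla^\mu\nabla_\mu \phi &= \left[ -\frac{1}{\epsilon^2}\, A\, \nabla_\mu S\, \nabla^\mu S
   + \frac{i}{\epsilon}\left( 2\nabla_\mu A\, \nabla^\mu S + A\, \nabla^\mu\nabla_\mu S \right)
   + \nabla^\mu\nabla_\mu A \right] e^{iS/\epsilon} = 0 .
\end{align}
```

At leading order, $O(\epsilon^{-2})$, this gives $\nabla_\mu S \nabla^\mu S = 0$, so the wave vector $k_\mu = \nabla_\mu S$ is null. The $O(\epsilon^{-1})$ term then governs how the amplitude is transported along the rays.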
Consider tracing back the wave along a particular null geodesic γ which starts off at some
u = u0 at I + and hits I − at v = v0 . Denote by γH a null generator of the horizon H+ which
has been extended into the past until it hits I − at some value of v. We may set this value
to v = 0 without loss of generality since the spacetime is invariant under shifts v → v + c.
We therefore have v0 < 0 for the geodesic γ. Let n be a connecting vector between the two
curves and fix its normalisation by requiring n · ξ = −1 with ξ the generator of the Killing
horizon H+ . Near the horizon the Kruskal coordinate $U = -e^{-\kappa u}$ is an affine distance along
n and we can use it to measure the distance between γ and γH . In order to find the form of
the wave at I − we need to understand how the affine distance along the connecting vector n
will change by the time γ reaches I − . At I − the coordinate v is an affine parameter along
the null geodesic integral curves of n. If U0 = 0 then the affine distance is zero at I − . Hence
we can expand the affine distance between γ and γH at I − in powers of U0 : $v = cU_0 + O(U_0^2)$
for some constant c > 0. Using $u = -\kappa^{-1}\log(-U) = -\kappa^{-1}\log(-cv)$ (after redefining the
positive constant c) we can conclude that if a mode takes the form $g_\omega \sim e^{-i\omega u}$ on I + , the
transmitted part $g_\omega^{(t)}$ on I − will take the form
\[ g_\omega^{(t)} \sim \begin{cases} e^{i(\omega/\kappa)\log(-v)} & \text{for } v < 0 \\ 0 & \text{for } v > 0 \end{cases} \tag{10.111} \]
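up to a constant phase. This is exactly analogous to the Rindler modes in the previous section
with κ ↔ a. We have $|B_{\omega\omega'}| = e^{-\pi\omega/\kappa} |A_{\omega\omega'}|$ and therefore
\[ \langle N_\omega \rangle \propto \frac{1}{e^{\hbar\omega/(k_B T)} - 1} , \tag{10.112} \]
where the Hawking temperature is given by
\[ T = \frac{\hbar\kappa}{2\pi k_B} . \tag{10.113} \]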
For the Schwarzschild black hole κ = 1/(4GN M ), so the temperature is inversely proportional
to the mass and the black hole heats up as it evaporates.
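For orientation, here is a minimal numerical sketch of the Hawking temperature of a Schwarzschild black hole in SI units. It assumes the surface gravity κ = c⁴/(4 G_N M) and restores the factor of c in (10.113), T = ℏκ/(2π c k_B). The constants are CODATA values, the solar mass is approximate, and the function names are hypothetical.

```python
import math

# Physical constants in SI units (CODATA 2018).
HBAR = 1.054571817e-34   # J s
C = 299792458.0          # m / s
K_B = 1.380649e-23       # J / K
G_N = 6.67430e-11        # m^3 / (kg s^2)
M_SUN = 1.989e30         # kg, approximate solar mass

def surface_gravity(mass):
    """Surface gravity of a Schwarzschild black hole, kappa = c^4 / (4 G M), in m/s^2."""
    return C**4 / (4.0 * G_N * mass)

def hawking_temperature(mass):
    """Hawking temperature T = hbar * kappa / (2 pi c k_B) in kelvin."""
    return HBAR * surface_gravity(mass) / (2.0 * math.pi * C * K_B)

print(hawking_temperature(M_SUN))        # far below the temperature of the CMB
print(hawking_temperature(M_SUN / 2.0))  # half the mass gives twice the temperature
```

A solar mass black hole is therefore much colder than its surroundings today and would absorb more radiation than it emits.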
If a black hole has a temperature it must evaporate. This leads to a serious problem with
unitarity. We can compute the rate of mass loss due to the Hawking radiation using Stefan’s
law for the rate of energy loss by a blackbody:
\[ \frac{dE}{dt} \sim -\alpha A T^4 . \tag{10.114} \]
Plugging in E = M , $A \propto M^2$ and $T \propto M^{-1}$ we have
\[ \frac{dM}{dt} \propto -\frac{1}{M^2} , \tag{10.115} \]
and hence the black hole evaporates away completely in a time
\[ \tau \sim \frac{G_N^2}{\hbar c^4}\, M^3 . \tag{10.116} \]
Note that the calculation of Hawking radiation assumed no backreaction: M was taken to be
constant. This is good when $\left|\frac{dM}{dt}\right| \ll M$ but fails in the final stages of evaporation.
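The M³ scaling of the evaporation time follows from integrating (10.115). As a sketch, write the proportionality as $dM/dt = -C/M^2$ with C a positive constant (a symbol introduced here for illustration) and let $M_0$ be the initial mass:

```latex
\begin{align}
\frac{dM}{dt} = -\frac{C}{M^2}
\quad\Longrightarrow\quad
\int_{M_0}^{0} M^2\, dM = -C \int_0^{\tau} dt
\quad\Longrightarrow\quad
\tau = \frac{M_0^3}{3C} , \qquad
M(t) = M_0 \left(1 - \frac{t}{\tau}\right)^{1/3} .
\end{align}
```

Dimensional analysis fixes $C \propto \hbar c^4 / G_N^2$, which reproduces (10.116) up to a numerical factor. The solution also shows that most of the lifetime is spent near the initial mass, with the mass loss accelerating sharply at the end.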
Consider a black hole which forms from collapsing matter and then evaporates away
completely, leaving just thermal radiation. It should be possible to arrange that the collapsing
matter is in a definite quantum state |ψ⟩, the associated density matrix would be the one of
a pure state, ρ = |ψ⟩⟨ψ|. When the black hole is formed the Hilbert space naturally splits
into the tensor product of Hilbert spaces, one with support in the interior of the black hole
and the other with support on the exterior of the black hole: H = Hin ⊗ Hout . An outside
observer does not have access to Hin so their description of the black hole state is necessarily
incomplete. They will describe the state outside the horizon as a reduced density matrix
obtained by tracing over Hin : ρout = trin ρ.
Figure 22: The evolution of the modes.
Since it is described by a non-trivial density matrix the outside state is mixed. This is
consistent with the fact that it contains thermal radiation, so there is no issue so far. The
external state is entangled with the interior and the reduced density matrix is just a way in
which the outside observer parametrises their ignorance of the interior. If we assume that the
black hole has completely evaporated, nothing is left in the interior and the exterior reduced
density matrix will describe the full state, which is therefore a mixed state. However evolution
from a pure state to a mixed state is forbidden by unitarity in quantum mechanics.
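The distinction between pure and mixed states can be illustrated with a toy model. The sketch below is only an analogy: one qubit stands in for Hin and one for Hout, and the purity Tr(ρ²) is used to tell the two cases apart. The function names are hypothetical.

```python
import math

def reduced_density_matrix(psi):
    """Trace out the first ("in") factor of a bipartite pure state.

    psi[i][j] is the amplitude of |i>_in |j>_out, so that
    rho_out[j][k] = sum_i psi[i][j] * conj(psi[i][k]).
    """
    dim_in, dim_out = len(psi), len(psi[0])
    return [[sum(psi[i][j] * psi[i][k].conjugate() for i in range(dim_in))
             for k in range(dim_out)]
            for j in range(dim_out)]

def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state and less than 1 for a mixed state."""
    n = len(rho)
    return sum((rho[j][k] * rho[k][j]).real for j in range(n) for k in range(n))

s = 1.0 / math.sqrt(2.0)
entangled = [[s + 0j, 0j], [0j, s + 0j]]   # (|00> + |11>) / sqrt(2)
product = [[1 + 0j, 0j], [0j, 0j]]         # |0>_in |0>_out

print(purity(reduced_density_matrix(entangled)))  # close to 0.5: the outside state is mixed
print(purity(reduced_density_matrix(product)))    # 1.0: the outside state is pure
```

Unitary evolution of the full state preserves the purity of the full density matrix, which is why ending up with only a mixed exterior state is a problem.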
This is the black hole information paradox. It is important to emphasise the difference
from thermal radiation produced in ordinary processes, which do not violate unitarity. If
we burn a printed copy of these lecture notes, thermal radiation is produced, however the
process is unitary and in principle one could reconstruct all the information contained in the
notes by studying the radiation and ashes. The early radiation is entangled with excitations
inside the burning body, however the excitations inside the burning body can still transmit
information to the radiation emitted later on which will thus contain non-trivial information.
On the other hand, if we throw the notes into a black hole, the information appears to be really
lost once the black hole has fully evaporated because the final radiation is exactly thermal.
The internal excitations are shielded by the horizon and by causality cannot influence the
later outgoing radiation.
Nearly half a century after Hawking formulated the black hole information paradox it is
still an open and active area of research. Our analysis has been in a funny hybrid theory of
quantum field theory coupled to classical general relativity. General relativity predicts a
singularity at the centre of a black hole; this is a regime where quantum effects will dramatically
alter our classical expectations. We need a quantum theory of gravity.
References