Notass

1. The chapter discusses Euclidean spacetime and Newtonian physics. 2. In Newtonian mechanics, there are two basic axioms - the principle of relativity and that absolute time exists. 3. Euclidean spacetime is assumed to be intrinsically flat with well-defined distances and angles. Events are characterized by their spatial coordinates and time.

Uploaded by

Felipe Cruz Vieira

CHAPTER 1

EUCLIDEAN SPACETIME AND NEWTONIAN PHYSICS

Absolute, true, and mathematical

time, of itself, and from its own
nature, flows equably without
relation to anything external . . .

Isaac Newton
Scholium of the Principia

The purpose of this chapter is to remind you of the basic features of Galilean spacetime and its
symmetries, which are closely related to the form taken by Newton’s laws as seen by inertial observers.
Although the ideas presented in this chapter will all be familiar to you, the way of looking at them will
probably be new. We will introduce some tensorial notation that will be useful in the future. Indeed,
local differential geometry can be understood as a refinement of the tensorial methods presented here.

1.1 Galilean Relativity

Newtonian mechanics is based on two basic axioms:

1. Principle of Relativity: The laws of physics are the same in all the inertial frames: No experiment
can measure the absolute velocity of an observer; the results of any experiment do not depend
on the speed of the observer relative to other observers not involved in the experiment.

2. There exists an absolute time, which is the same for any observer.

1.2 Euclidean spacetime: old wine in a new bottle

When formulating mechanics in axiomatic form, Newton, relying on everyday experience¹, assumed
spacetime to be Euclidean, $E^1 \times E^3$, i.e. an intrinsically flat and orientable metric space with
trivial topology and well-defined distances and angles. A physical process in this spacetime (such
as the collision of two particles) is called an event, and it is independent of the particular choice of
coordinates used for its description. The spatial location of the event can be specified in Cartesian
coordinates $(x, y, z)$, in spherical coordinates $(r, \theta, \phi)$, or by means of any 3 independent numbers
obtained through a well-defined coordinate transformation. However, among all the coordinate systems
that can be used in Newtonian physics, the inertial coordinate systems are privileged (at least for
Newton and Galileo). An inertial frame is a frame moving freely in spacetime, free of any force, which
carries ideal clocks and measuring rods forming an orthonormal Cartesian coordinate system. In such
a frame, a particular event $P$ is characterized by 4 coordinates: its position²

$$\{x^i\} = \{x, y, z\} = \{x^1, x^2, x^3\} , \tag{1.1}$$

and the time $t$ at which it happens.

¹ That is, at velocities much smaller than the velocity of light.

Figure 1.1: Galilean spacetime. (Labels: worldline, inertial observer, past, future, basis vectors e1, e2.)

Time

Physical time is absolute (up to affine changes, see below) and it is used to parametrize particle
trajectories $x^i(t)$. The temporal separation $dt$ between two events is well-defined, independently of
their spatial separation (see below). Simultaneous events lie on equal-time surfaces, which
separate the future from the past of those events. Any event may cause any simultaneous or later
event.

Space

For each spatial coordinate we define a basis vector along the $x^i$ coordinate direction. Together
they form an orthonormal set

$$\mathbf{e}_i = \{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\} = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\} , \qquad \mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij} , \tag{1.2}$$

with $\delta_{ij}$ the 3-dimensional Kronecker delta

$$\delta_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \mathrm{diag}(1, 1, 1) . \tag{1.3}$$

The infinitesimal displacement vector $d\mathbf{X}$ between two points (like any other vector in $E^3$) can be
expanded in terms of the basis vectors $\mathbf{e}_i$ as

$$d\mathbf{X} = \sum_{i=1}^{3} dx^i\, \mathbf{e}_i = dx^1 \mathbf{e}_1 + dx^2 \mathbf{e}_2 + dx^3 \mathbf{e}_3 , \tag{1.4}$$

with $dx^i$ the so-called contravariant components of the vector in that orthonormal basis.

² We emphasize here that we do not consider $\{x^i\}$ to be a vector, since the homogeneity of space makes the choice of
an origin completely arbitrary. The distance between points is the only significant quantity. On top of that, coordinates
will no longer behave as a vector in the presence of gravity.

Einstein summation convention

Note the way in which we have placed the indices in the previous equation. From now on, an
index appearing twice in a product (in a superscript-subscript combination) will be understood
to be automatically summed over, or contracted. A quantity with no free tensor indices is said
to be fully contracted. The name of a pair of contracted indices (Latin indices $(i, j, k, \ldots)$ for
the spatial coordinates or Greek ones $(\mu, \nu, \rho, \ldots)$ in 3+1 spacetime dimensions) is completely
arbitrary and can be changed at will. For this reason, these indices are called dummy indices.
Expressions in which the same index is repeated more than twice should never occur; in some
cases it is necessary to relabel dummy indices in order to avoid ambiguities. Non-repeated indices
are called free indices and must appear at the same level on both sides of an equation, for each
independent term. As you will see, these rules are very useful, since they allow us to reconstruct
equations without any memorization, just by properly setting the indices up or down. On
top of that, we will save a lot of time when writing expressions in General Relativity, which
typically contain lots of indices. Using this convention, Eq. (1.4) can be written as

$$d\mathbf{X} = \sum_{i=1}^{3} dx^i\, \mathbf{e}_i \quad \longrightarrow \quad d\mathbf{X} = dx^i\, \mathbf{e}_i . \tag{1.5}$$
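The convention can be made concrete with a few lines of Python. The sketch below is only an illustration: the numerical components and the helper name `contract` are assumptions, not part of the notes. It writes the implicit sum of Eq. (1.5) as an explicit loop and shows that renaming the dummy index leaves the result unchanged.

```python
# Components dx^i of an infinitesimal displacement (illustrative numbers).
dx = [0.1, -0.2, 0.3]

# Orthonormal basis vectors e_i, written out in their own coordinates.
e = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

def contract(components, basis):
    """Return dX = dx^i e_i, summing explicitly over the dummy index i."""
    dim = len(basis[0])
    return [sum(components[i] * basis[i][a] for i in range(len(components)))
            for a in range(dim)]

dX = contract(dx, e)

# Renaming the dummy index (i -> j) cannot change the result.
dX_relabelled = [sum(dx[j] * e[j][a] for j in range(3)) for a in range(3)]
```

The list index `a` plays the role of a free index: it appears once on each side and is never summed.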

Exercise
Which of the following expressions do not make sense or are ambiguous according to the previous
rules? Why? Restore the sums over dummy indices in the remaining equations.

$$x^i = A^i{}_j B^j{}_k x^k , \quad x^i = A^j{}_k B^k{}_l x^l , \quad D^i{}_j = A^i{}_k B^k{}_k C^k{}_j , \quad D^i{}_j = A^i{}_k B^k{}_l C^l{}_j ,$$

$$x^i = A^i{}_j x^j + B^i{}_k x^k , \quad x^i = A^i{}_j x^j + B^i{}_j x^j , \quad D^i{}_j = A^i{}_k B^k{}_l C^l{}_i .$$

The orthonormality of the basis vectors allows us to compute the contravariant components $dx^i$
as the scalar product of the vector $d\mathbf{X}$ and the corresponding basis vector $\mathbf{e}_i$

$$d\mathbf{X} \cdot \mathbf{e}_i = dx^j\, \mathbf{e}_j \cdot \mathbf{e}_i = dx^j (\mathbf{e}_j \cdot \mathbf{e}_i) = dx^j \delta_{ji} \equiv dx_i , \tag{1.6}$$

where in the last step we have defined the so-called covariant components $dx_i$

$$dx_i \equiv \delta_{ij}\, dx^j . \tag{1.7}$$

The 3-dimensional Kronecker delta $\delta_{ij}$ therefore allows us to lower (or raise) spatial indices. Covariant
vectors are defined only for notational brevity; there is nothing deep in it. The location of
the indices in Euclidean space is just a clever way of keeping track of the summation convention
and does not give rise to any change in the numerical value of the different components

$$dx_i = +dx^i . \tag{1.8}$$

As we will see in the next chapters, this is not the case in general: it fails in a non-Cartesian reference
frame and in spacetimes with an indefinite metric, such as Minkowski spacetime, where the distinction
between the temporal components of a covariant and a contravariant vector becomes important.
The square of the infinitesimal spatial distance between two points in $E^3$ is given by

$$|d\mathbf{X}|^2 \equiv dX^2 = \delta_{ij}\, dx^i dx^j = dx_i\, dx^i = dx^2 + dy^2 + dz^2 , \tag{1.9}$$

where $\delta_{ij}$ plays the role of a metric in $E^3$, for an orthonormal basis. The line element $dX^2$ is positive-
definite.

1.3 Euclidean space isometry group

Requiring coordinate transformations between two inertial frames to leave the spatial ($dX^2$) and
temporal ($dt^2$) distances unchanged uniquely determines the form of these transformations. The
coordinates in different frames will be distinguished by a bar over the kernel, i.e. $\bar{x}^k$. Let us start by
showing that the transformation must be linear. Using the chain rule, we have

$$d\bar{x}^k = \frac{\partial \bar{x}^k}{\partial x^i}\, dx^i , \tag{1.10}$$

which, imposing the invariance of the line element $d\bar{X}^2 = dX^2$, implies

$$\delta_{ij} = \delta_{kl}\, \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial \bar{x}^l}{\partial x^j} . \tag{1.11}$$

Differentiating the previous expression with respect to $x^p$, and taking into account that $\delta_{ij}$ is a constant
matrix, we get

$$\delta_{kl} \left( \frac{\partial^2 \bar{x}^k}{\partial x^i \partial x^p} \frac{\partial \bar{x}^l}{\partial x^j} + \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial^2 \bar{x}^l}{\partial x^p \partial x^j} \right) = 0 . \tag{1.12}$$

Permuting $ipj$ to $pji$ and $jip$ we obtain two equations

$$\delta_{kl} \left( \frac{\partial^2 \bar{x}^k}{\partial x^p \partial x^j} \frac{\partial \bar{x}^l}{\partial x^i} + \frac{\partial \bar{x}^k}{\partial x^p} \frac{\partial^2 \bar{x}^l}{\partial x^j \partial x^i} \right) = 0 , \tag{1.13}$$

$$\delta_{kl} \left( \frac{\partial^2 \bar{x}^k}{\partial x^j \partial x^i} \frac{\partial \bar{x}^l}{\partial x^p} + \frac{\partial \bar{x}^k}{\partial x^j} \frac{\partial^2 \bar{x}^l}{\partial x^i \partial x^p} \right) = 0 . \tag{1.14}$$

Subtracting (1.13) from (1.12), adding (1.14), and taking into account the symmetry of the metric
and the fact that ordinary partial derivatives commute, the remaining terms cancel in pairs and only
twice the first term of (1.12) survives, so that

$$\delta_{kl}\, \frac{\partial R^k{}_i}{\partial x^p}\, R^l{}_j = 0 , \tag{1.15}$$

where we have defined the matrix³

$$R^i{}_j \equiv \frac{\partial \bar{x}^i}{\partial x^j} . \tag{1.16}$$

Since the transformation $R^i{}_j$ is required to have an inverse⁴, we must conclude that

$$\frac{\partial R^k{}_i}{\partial x^p} = 0 , \tag{1.17}$$

which implies that the transformation must be linear

$$\bar{x}^i = R^i{}_j\, x^j + d^i , \tag{1.18}$$

with $d^i$ some real and arbitrary integration constants and $R^i{}_j$ independent of the coordinates. Substi-
tuting Eq. (1.18) back into (1.11) we obtain the relation

$$\delta_{ij} = R^k{}_i\, R^l{}_j\, \delta_{kl} , \tag{1.19}$$

which is nothing else than the indexed version of the orthogonality condition $R^T I R = R^T R = I$ for
a $3 \times 3$ matrix. $R^i{}_j$ is an $O(3)$ matrix! (as you probably expected). Taking the determinant of both
sides of the orthogonality condition, we conclude that the determinant of an orthogonal matrix can
take two different values, namely $\det R = \pm 1$. Since we will be interested in rotations connected with
the identity, we will restrict ourselves to proper rotations with determinant $\det R = +1$, i.e. orientation-
preserving transformations

$$SO(3) = \{ R \;|\; R^T I R = I , \ \det R = 1 \} . \tag{1.20}$$

Orthogonal transformations with $\det R = -1$ can be obtained by additionally applying a parity
transformation $P^i{}_j = -\delta^i{}_j$ (i.e. $P = -I$) in $E^3$, which is also an orthogonal matrix, $P^T P = I$.

³ The first index in $R^i{}_j$ labels rows and the second one labels columns.
⁴ The system $\bar{x}$ is not at all privileged.
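As a quick numerical sanity check of the statements above, the following sketch verifies the orthogonality condition and the two possible determinants. It assumes, purely for illustration, a rotation about the z axis with an arbitrary angle; the helper names (`rotation_z`, `matmul`, `det3`, `is_identity`) are hypothetical and use only the standard library.

```python
import math

def rotation_z(theta):
    """Rotation by angle theta about the z axis (an illustrative choice of R)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))

R = rotation_z(0.7)                      # proper rotation
P = [[-1.0, 0.0, 0.0],                   # parity transformation, P = -I
     [0.0, -1.0, 0.0],
     [0.0, 0.0, -1.0]]
PR = matmul(P, R)                        # improper orthogonal transformation
```

Both `R` and `PR` satisfy the orthogonality condition, but only `R` has determinant +1 and therefore belongs to SO(3).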
The laws of Newtonian mechanics are required to be covariant, i.e. to have the same form in
each inertial frame of reference. To achieve this, we will make use of tensors, in this case
Cartesian tensors, which have well-defined transformation properties from frame to frame. As you
will soon realize, these objects are the cornerstone of modern physical theories, such as Special or
General Relativity. We will use them repeatedly in this course, so pay attention! We will start
our trip using a concrete and familiar context for the introduction of the tensor notions: rotations
$\bar{x}^i = R^i{}_j\, x^j$ in Euclidean space.

1.4 Tensors in Euclidean space

1.4.1 Scalars
A scalar is a single number that does not change under coordinate transformations (in this case ro-
tations). Some particular examples of Galilean scalars are the spatial line element ($dX$), the temporal
line element ($dt$), the 3-volume $d^3x \equiv |dx\, dy\, dz|$, the Lagrangian, the mass of a particle, its charge or
any numerical constant.

Exercise
Show that the 3-volume is indeed a scalar under rotations.

If we can associate a number to all the points in some spacetime region, as for instance happens with
the value of the temperature at the different points of the Earth, we say that we are dealing with a
scalar field. Under coordinate transformations, it transforms as

$$\bar{\phi}(t, \bar{\mathbf{x}}) = \phi(t, \mathbf{x}) , \qquad \text{or} \qquad \bar{\phi}(t, \mathbf{x}) = \phi\left(t, R^{-1}\mathbf{x}\right) . \tag{1.21}$$
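The two equivalent forms of Eq. (1.21) can be illustrated with a short Python sketch. The field `phi` below is a made-up example (time is suppressed), and the rotation about the z axis is an arbitrary choice; neither comes from the notes.

```python
import math

def rotation_z(theta):
    """Rotation by angle theta about the z axis, as a nested list."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(matrix, x):
    return [sum(matrix[i][j] * x[j] for j in range(3)) for i in range(3)]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def phi(x):
    """Hypothetical scalar field in the original coordinates."""
    return x[0] + 2.0 * x[1] ** 2 - x[2]

R = rotation_z(0.4)
R_inv = transpose(R)  # valid because R is orthogonal

def phi_bar(x_bar):
    """The same field expressed in the rotated coordinates: phi(R^-1 x_bar)."""
    return phi(apply(R_inv, x_bar))

point = [1.0, -2.0, 0.5]        # coordinates of a point in the original frame
point_bar = apply(R, point)     # coordinates of the same point in the rotated frame
```

Evaluating `phi_bar(point_bar)` and `phi(point)` gives the same number, since both refer to the same physical point; the functional form of the field, however, differs between the two frames.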

1.4.2 Vectors
What is a vector? A vector $\mathbf{V}$ (in this case Cartesian) is an absolute geometrical object with a partic-
ular length and direction, which does not depend on the choice of coordinates. The same happens with
the rules of vector calculus. Concepts such as the angle between two vectors can be defined independently
of the coordinates. Even though there is no need to introduce the concept of components of a vector
in a given basis, doing so is sometimes useful. Let us see what happens when we do it. Consider two
orthonormal frames related, for instance, by a rotation of angle $\theta$ around the $z$ axis, as illustrated in
Fig. 1.2. The vector $\mathbf{V}$ can be expanded in terms of the two sets of basis vectors associated with these
coordinate systems. In terms of the basis $\mathbf{e}_i$, the vector $\mathbf{V}$ has components $V^i$

$$\mathbf{V} = V^i \mathbf{e}_i = V^1 \mathbf{e}_1 + V^2 \mathbf{e}_2 + V^3 \mathbf{e}_3 , \tag{1.22}$$

while, in terms of the rotated basis $\bar{\mathbf{e}}_i$, it has different components $\bar{V}^i$

$$\mathbf{V} = \bar{V}^i \bar{\mathbf{e}}_i = \bar{V}^1 \bar{\mathbf{e}}_1 + \bar{V}^2 \bar{\mathbf{e}}_2 + \bar{V}^3 \bar{\mathbf{e}}_3 , \tag{1.23}$$

but the vector itself ($\mathbf{V}$) does not change.

Figure 1.2: A rotation: transformation of the basis vectors and components.

The relation between the basis vectors $\bar{\mathbf{e}}_i$ and $\mathbf{e}_i$ can be easily read from the figure to get

$$\begin{pmatrix} \bar{\mathbf{e}}_1 \\ \bar{\mathbf{e}}_2 \\ \bar{\mathbf{e}}_3 \end{pmatrix}^T = \begin{pmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \mathbf{e}_3 \end{pmatrix}^T \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{1.24}$$

Using this relation, it is easy to write $\mathbf{V}$ in terms of the original basis vectors $\mathbf{e}_i$ and identify from
there the transformation of the components. We obtain

$$\begin{pmatrix} \bar{V}^1 \\ \bar{V}^2 \\ \bar{V}^3 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} V^1 \\ V^2 \\ V^3 \end{pmatrix} . \tag{1.25}$$

The previous exercise can be easily generalized to an arbitrary rotation, giving rise to the following
transformation rules⁵

$$\bar{V}^i = R^i{}_j V^j , \qquad \bar{\mathbf{e}}_i = (R^{-1})^j{}_i\, \mathbf{e}_j , \tag{1.27}$$

which, in a much more powerful notation, can be written as

$$\bar{V}^i = \frac{\partial \bar{x}^i}{\partial x^j} V^j , \qquad \bar{\mathbf{e}}_i = \frac{\partial x^j}{\partial \bar{x}^i}\, \mathbf{e}_j . \tag{1.28}$$

In conclusion, a vector $\mathbf{V}$ remains unchanged under (in this case) rotations due to the simultaneous
and opposite change of its components $V^i$ and the basis $\mathbf{e}_i$

$$\mathbf{V} = \bar{V}^i \bar{\mathbf{e}}_i = \left( \frac{\partial \bar{x}^i}{\partial x^j} V^j \right) \left( \frac{\partial x^k}{\partial \bar{x}^i}\, \mathbf{e}_k \right) = V^j \delta_j{}^k\, \mathbf{e}_k = V^k \mathbf{e}_k = \mathbf{V} . \tag{1.29}$$

⁵ The example has been presented using the passive viewpoint, in which the same vector ends up with different
components when the reference frame is changed. The expression

$$\bar{V}^i = R^i{}_j V^j \tag{1.26}$$

can also describe the active viewpoint, in which a given vector is mapped to a different vector under the same basis
choice.
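The cancellation in Eq. (1.29) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the z-axis rotation of Eq. (1.25) with an arbitrary angle and arbitrary components, and all variable names are hypothetical. The components are rotated with $R$, the basis vectors with $R^{-1}$, and the vector is then rebuilt from the barred quantities.

```python
import math

theta = 0.3                                  # arbitrary rotation angle
c, s = math.cos(theta), math.sin(theta)

# Component transformation matrix R^i_j of Eq. (1.25).
R = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
# For an orthogonal matrix the inverse is the transpose.
R_inv = [list(col) for col in zip(*R)]

# Original orthonormal basis e_j and components V^j (illustrative numbers).
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
V = [1.0, 2.0, 3.0]

# Contravariant components: V_bar^i = R^i_j V^j.
V_bar = [sum(R[i][j] * V[j] for j in range(3)) for i in range(3)]

# Rotated basis vectors: e_bar_i = (R^-1)^j_i e_j.
e_bar = [[sum(R_inv[j][i] * e[j][a] for j in range(3)) for a in range(3)]
         for i in range(3)]

# Rebuild the geometric vector from the barred components and barred basis.
V_rebuilt = [sum(V_bar[i] * e_bar[i][a] for i in range(3)) for a in range(3)]
```

The components `V_bar` differ from `V`, yet `V_rebuilt` reproduces the original vector: the change of the components is compensated by the opposite change of the basis.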

From now on, in a clear abuse of language, we will frequently employ a standard shorthand and
refer to $V^i$ as a vector, instead of saying the components of a vector $\mathbf{V}$. A vector is said to be
contravariant if it transforms as the displacement vector $dx^i$ (cf. Eq. (1.10))

$$\bar{V}^i = \frac{\partial \bar{x}^i}{\partial x^j} V^j . \tag{1.30}$$

On the other hand, a vector is said to be covariant if it transforms as the basis vectors $\mathbf{e}_i$ (cf. Eq.
(1.29))

$$\bar{V}_i = \frac{\partial x^j}{\partial \bar{x}^i} V_j . \tag{1.31}$$

A particular example of an object with the previous transformation properties is the gradient of a
scalar function

$$\frac{\partial f}{\partial \bar{x}^i} = \frac{\partial x^j}{\partial \bar{x}^i} \frac{\partial f}{\partial x^j} . \tag{1.32}$$

The gradient is the change of the function per unit distance in the direction of the basis vector.
When the basis vectors “shrink”, the gradient must “shrink” too.

You may think that I am being a bit pedantic here. For you the gradient was, till now, a regular
vector, as good as the displacement vector. Now I am giving them two different names and
two “different” transformation rules! Indeed... you are right... I am being quite pedantic... but
just to prepare the notation for the future. Note that the matrix $(R^{-1})^j{}_i$ is just an index notation
for $(R^{-1})^T$, which, for the particular case of an orthogonal matrix, is equal to the transformation
matrix $R$ itself. As we already said in Section 1.2, there is no clear difference between covariant
and contravariant components as long as one transforms between Euclidean orthonormal bases.
However, this is not the case in general coordinate systems (such as polar coordinates) or in
Special Relativity. Be patient.

Exercise
Show that
• the 3-divergence of a vector field, $\partial_i V^i$, transforms as a scalar field.

• the Laplacian operator $\nabla^2 = \partial_i \partial^i$ transforms as a Galilean scalar operator.

1.4.3 Tensors: linear machines

The previous examples are just particular cases of a general class of quantities that transform with
a linear and homogeneous transformation law under coordinate transformations: tensors. In order to
get some intuition, let us start by considering in detail the transformation laws of rank-2 tensors. In
the same way that a vector $\mathbf{V}$ can be expanded in terms of the basis $\mathbf{e}_i$, a geometric Cartesian tensor
$\mathbf{T}$ can be expanded as

$$\mathbf{T} = T^{ij}\, \mathbf{e}_i \otimes \mathbf{e}_j , \tag{1.33}$$

where $\otimes$ denotes the direct product. The transformation property of the different components $T^{ij}$
under a rotation follows immediately from the previous expression: $T^{ij}$ transforms as the product of
two contravariant vectors $A^i$ and $B^j$

$$\bar{A}^i \bar{B}^j = R^i{}_k R^j{}_l\, A^k B^l \quad \longrightarrow \quad \bar{T}^{ij} = R^i{}_k R^j{}_l\, T^{kl} . \tag{1.34}$$

As we did in the previous section, we can define the covariant tensor $T_{ij}$, which transforms as the
product of two covariant vectors

$$\bar{A}_i \bar{B}_j = (R^{-1})^k{}_i (R^{-1})^l{}_j\, A_k B_l \quad \longrightarrow \quad \bar{T}_{ij} = (R^{-1})^k{}_i (R^{-1})^l{}_j\, T_{kl} . \tag{1.35}$$

The transformation rules obtained so far are collected in Table 1.1.

Rotations: $\partial \bar{x}^i / \partial x^j \equiv R^i{}_j$ are constants!

Scalar                            $\bar{\phi} = \phi$
Contravariant vector              $\bar{V}^i = \frac{\partial \bar{x}^i}{\partial x^j} V^j$
Covariant vector                  $\bar{V}_i = \frac{\partial x^j}{\partial \bar{x}^i} V_j$
Contravariant rank-2 tensor       $\bar{T}^{ij} = \frac{\partial \bar{x}^i}{\partial x^k} \frac{\partial \bar{x}^j}{\partial x^l} T^{kl}$
Covariant rank-2 tensor           $\bar{T}_{ij} = \frac{\partial x^k}{\partial \bar{x}^i} \frac{\partial x^l}{\partial \bar{x}^j} T_{kl}$
Mixed rank-2 tensor               $\bar{T}^i{}_j = \frac{\partial \bar{x}^i}{\partial x^k} \frac{\partial x^l}{\partial \bar{x}^j} T^k{}_l$

Table 1.1

As before, in a clear abuse of language, we will refer to these tensor components as tensors. Particular
examples of rank-2 Cartesian tensors are the inertia tensor

$$I^{ij} = \int d^3x\, \rho(\mathbf{x}) \left( r^2 \delta^{ij} - x^i x^j \right) \tag{1.36}$$

or the quadrupole tensor (1.57) (cf. Section 1.5.1).

Exercise
Show that the inertia tensor $I^{ij}$ is indeed a rank-(2,0) tensor.
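A numerical check is no substitute for the proof requested in the exercise, but it helps to see the transformation law (1.34) at work. The sketch below assumes a discrete set of point masses in place of the continuous density of Eq. (1.36) and an arbitrary rotation about the z axis; the masses, positions and function names are illustrative assumptions. The inertia tensor is computed once directly from the rotated positions and once by rotating the original tensor.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotate_vector(R, x):
    """x_bar^i = R^i_j x^j"""
    return [sum(R[i][j] * x[j] for j in range(3)) for i in range(3)]

def rotate_tensor(R, T):
    """T_bar^ij = R^i_k R^j_l T^kl, Eq. (1.34)."""
    return [[sum(R[i][k] * R[j][l] * T[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def inertia(masses, positions):
    """Discrete analogue of Eq. (1.36): sum over m (r^2 delta^ij - x^i x^j)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * (r2 * (1.0 if i == j else 0.0) - x[i] * x[j])
    return I

masses = [1.0, 2.0, 0.5]
positions = [[1.0, 0.0, 0.5], [-0.3, 0.8, 0.0], [0.2, -0.4, 1.1]]
R = rotation_z(0.9)

I_original = inertia(masses, positions)
I_from_rotated_positions = inertia(masses, [rotate_vector(R, x) for x in positions])
I_from_tensor_law = rotate_tensor(R, I_original)
```

The two barred tensors agree component by component, as expected for a rank-(2,0) Cartesian tensor.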

Generalizing the transformation laws (1.34) and (1.35), we can define the transformation properties
of arbitrary mixed tensors of contravariant rank $m$ and covariant rank $n$

$$\bar{T}^{i_1 \ldots i_m}{}_{j_1 \ldots j_n} = \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{i_p}}{\partial x^{k_p}} \prod_{q=1}^{n} \frac{\partial x^{l_q}}{\partial \bar{x}^{j_q}} \right) T^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} \tag{1.37}$$

$$= R^{i_1}{}_{k_1} \ldots R^{i_m}{}_{k_m}\, (R^{-1})^{l_1}{}_{j_1} \ldots (R^{-1})^{l_n}{}_{j_n}\, T^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} . \tag{1.38}$$

Tensors (components) are objects with any number of indices. They share the same transformation
properties as vectors and can be classified according to the number of upper and lower indices. For
instance, we say that a scalar is a rank-0 tensor and a contravariant (or covariant) vector is a con-
travariant (or covariant) rank-1 tensor. In general, a tensor with $m$ upper indices and $n$ lower indices
is called a rank-$(m, n)$ tensor.

A tensor is not just a quantity carrying indices. It is the transformation law that defines a
tensor (see below). Not all quantities with indices are tensors.

1.4.4 Some useful properties

Let me present some useful properties and definitions regarding tensors:

1. The sum (or difference) of two like-tensors is a tensor of the same kind. The proof of this is
straightforward. If we take the sum or difference of two general tensors $T^{i_1 \ldots i_m}{}_{j_1 \ldots j_n}$ and
$R^{i_1 \ldots i_m}{}_{j_1 \ldots j_n}$ and apply the transformation rule (1.37), we get

$$\begin{aligned}
\bar{S}^{i_1 \ldots i_m}{}_{j_1 \ldots j_n} &\equiv \bar{T}^{i_1 \ldots i_m}{}_{j_1 \ldots j_n} \pm \bar{R}^{i_1 \ldots i_m}{}_{j_1 \ldots j_n} \\
&= \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{i_p}}{\partial x^{k_p}} \prod_{q=1}^{n} \frac{\partial x^{l_q}}{\partial \bar{x}^{j_q}} \right) T^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} \pm \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{i_p}}{\partial x^{k_p}} \prod_{q=1}^{n} \frac{\partial x^{l_q}}{\partial \bar{x}^{j_q}} \right) R^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} \\
&= \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{i_p}}{\partial x^{k_p}} \prod_{q=1}^{n} \frac{\partial x^{l_q}}{\partial \bar{x}^{j_q}} \right) \left( T^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} \pm R^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} \right) \\
&= \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{i_p}}{\partial x^{k_p}} \prod_{q=1}^{n} \frac{\partial x^{l_q}}{\partial \bar{x}^{j_q}} \right) S^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} .
\end{aligned} \tag{1.39}$$

2. Given two tensors of rank $s$ and $t$, the product transforms as a tensor of rank $(s + t)$.

3. If the expression $T^{\ldots}{}_{\ldots} = R^{\ldots}{}_{\ldots}\, S^{\ldots}{}_{\ldots}$ is invariant under coordinate transformations and $T^{\ldots}{}_{\ldots}$ and
$R^{\ldots}{}_{\ldots}$ are tensors, then $S^{\ldots}{}_{\ldots}$ is a tensor.

Exercise
Prove this for the particular case $T_i = R_j\, S^j{}_i$.

4. A tensor contraction occurs when one of a tensor’s free covariant indices is set equal to one
of its free contravariant indices⁶. A sum is understood to be performed over the now repeated
indices. For instance, $T_{ij}{}^j$ is a contraction on the second and third indices of the tensor $T_{ij}{}^k$.

5. The contraction of a rank-2 tensor is a scalar (its trace) whose value is independent of the
coordinate system chosen.

If all the components of a Cartesian tensor $T^{i_1 \ldots i_m}{}_{j_1 \ldots j_n}$ in a given inertial reference frame are
zero, they will be zero in any other inertial reference frame.

⁶ Note the words covariant and contravariant. A contraction is never done between two covariant or two contravariant
indices.

1.4.5 Symmetric and antisymmetric tensors

An arbitrary rank-2 tensor can be decomposed into a completely symmetric and a completely anti-
symmetric part

$$T_{ij} = T_{(ij)} + T_{[ij]} , \tag{1.40}$$

where we have used the common notation $(\,,)$ and $[\,,]$ to denote respectively symmetrization and
antisymmetrization over the indices enclosed, i.e.

$$T_{(ij)} \equiv \frac{1}{2} \left( T_{ij} + T_{ji} \right) , \qquad T_{[ij]} \equiv \frac{1}{2} \left( T_{ij} - T_{ji} \right) . \tag{1.41}$$

Completely symmetric and antisymmetric rank-2 tensors satisfy $T_{ij} = \pm T_{ji}$, where the plus sign stands
for the symmetric and the minus sign for the antisymmetric one. Particular examples of symmetric
tensors are the inertia tensor (1.36) or the quadrupole tensor (1.57) (cf. Section 1.5.1).
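The decomposition (1.40)-(1.41) is easy to try out on explicit numbers. In the following sketch the matrix `T` and the function names are illustrative assumptions; it splits a generic rank-2 array into its two parts and also evaluates the full contraction of the symmetric part with the antisymmetric one.

```python
# A generic (neither symmetric nor antisymmetric) rank-2 array, chosen arbitrarily.
T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]

def symmetric_part(T):
    """T_(ij) = (T_ij + T_ji) / 2, Eq. (1.41)."""
    n = len(T)
    return [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]

def antisymmetric_part(T):
    """T_[ij] = (T_ij - T_ji) / 2, Eq. (1.41)."""
    n = len(T)
    return [[0.5 * (T[i][j] - T[j][i]) for j in range(n)] for i in range(n)]

S = symmetric_part(T)
A = antisymmetric_part(T)

# Full contraction S_ij A_ij of a symmetric with an antisymmetric array.
S_dot_A = sum(S[i][j] * A[i][j] for i in range(3) for j in range(3))
```

The contraction `S_dot_A` vanishes identically, a fact used again and again when simplifying tensor expressions.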

Exercise
Prove that the trace of a tensor is invariant under rotations. Show that a tensor $T_{ij}$ in $n$
dimensions has three separately invariant parts

$$T_{ij} = \frac{1}{n} T^k{}_k\, \delta_{ij} + T_{[ij]} + \left( T_{(ij)} - \frac{1}{n} T^k{}_k\, \delta_{ij} \right) . \tag{1.42}$$

Exercise
Write down the explicit expressions for the completely symmetric and antisymmetric parts of a
rank-3 tensor $T_{ijk}$.

1.4.6 Permutation tensor

The Levi-Civita or permutation tensor⁷ of rank 3

$$\epsilon_{ijk} = \epsilon^{ijk} = \begin{cases} +1 , & \text{if } ijk \text{ is an even permutation of } 123 \\ -1 , & \text{if } ijk \text{ is an odd permutation of } 123 \\ \ \ 0 , & \text{otherwise} \end{cases} \tag{1.43}$$

flips sign upon the interchange of any pair of indices and vanishes when two of the indices are
equal. Most of the basic identities of vector algebra and vector calculus can be easily proved by using
an important relation between the metric tensor $\delta_{ij}$ and $\epsilon_{ijk}$, the contracted epsilon identity⁸

$$\epsilon_{ijk}\, \epsilon^{i}{}_{lm} = \delta_{jl} \delta_{km} - \delta_{jm} \delta_{kl} . \tag{1.44}$$

You will deal with this expression in the exercises.
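Before proving the contracted epsilon identity by hand, it can be reassuring to confirm it by brute force over all index values. The sketch below is such a check; indices run over 0, 1, 2 instead of 1, 2, 3, and the closed-form expression used for `levi_civita` is a convenient implementation choice, not something taken from the notes.

```python
from itertools import product

def levi_civita(i, j, k):
    """+1 for even permutations of (0, 1, 2), -1 for odd ones, 0 otherwise."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def contracted_identity_holds():
    """Check eps_ijk eps_ilm = d_jl d_km - d_jm d_kl for all free indices."""
    for j, k, l, m in product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(i, l, m) for i in range(3))
        rhs = delta(j, l) * delta(k, m) - delta(j, m) * delta(k, l)
        if lhs != rhs:
            return False
    return True
```

Since all indices are Cartesian, their vertical position plays no role in the code.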

⁷ Technically, I should say that it is a pseudotensor, but we are not interested in introducing this concept here. We
will only deal with rotations.

⁸ In most books you will find this expression with all indices down. Remember that the index convention we
chose is just a way of keeping track of the sums, which can be easily extended to the Minkowski case. For Cartesian
tensors the position of the indices makes no difference.

1.5 Covariance and Classical Mechanics

The main property of tensors is that their transformation law is linear and homogeneous. Each
component of a tensor, in this case Cartesian, is a linear combination of the components of the tensor
in the original frame, namely

$$\bar{T}^{i_1 \ldots i_m}{}_{j_1 \ldots j_n} = \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{i_p}}{\partial x^{k_p}} \prod_{q=1}^{n} \frac{\partial x^{l_q}}{\partial \bar{x}^{j_q}} \right) T^{k_1 \ldots k_m}{}_{l_1 \ldots l_n} . \tag{1.45}$$

In order to ensure that fundamental equations satisfy the Galilean Principle of Relativity, the only
thing we have to do is to write tensorial equations. For instance, if two quantities $S^{ij}{}_k$ and $T^{ij}{}_k$
transform as rank-(2, 1) Cartesian tensors, a fundamental law of the kind

$$S^{ij}{}_k = T^{ij}{}_k \tag{1.46}$$

will retain its form in any inertial reference frame, since both sides of the equation transform in the
same way under coordinate transformations (in this case rotations). The fundamental equation (1.46)
is then said to be covariant and the transformation is said to be a symmetry of the physical theory.

1.5.1 Newton’s theory of gravity

A physical example of the previous discussion is the Newtonian theory of gravity, published by Newton
in 1687 in the Philosophiae Naturalis Principia Mathematica. In this theory, the gravitational
force $F_i$ exerted on a gravitational test mass $m_G$,

$$F_i = -m_G\, \partial_i \Phi , \tag{1.47}$$

is determined by a single function⁹, the gravitational potential $\Phi$, which depends on the matter dis-
tribution through the so-called Poisson equation¹⁰

$$\nabla^2 \Phi(t, x^i) = 4\pi G \rho(t, x^i) . \tag{1.49}$$

Eqs. (1.47) and (1.49) are respectively a vector and a scalar covariant equation. If they are valid in
a given inertial frame, they will be automatically valid in any inertial frame, since their form is
preserved under rotations and translations.

Exercise: Cosmological constant

Galilean invariance allows for an additional constant $\Lambda$ in the Poisson equation, which becomes

$$\nabla^2 \Phi(t, \mathbf{x}) + \Lambda = 4\pi G \rho(t, \mathbf{x}) . \tag{1.50}$$

Observations of galaxies with typical masses of $10^{30}\, M_\odot$ and intergalactic separations of order
1 Mly do not show any significant deviation from Newton’s inverse-square law. Assuming this
deviation to be smaller than 1%, determine an upper bound on the magnitude of $\Lambda$.
⁹ Eqs. (1.47) and (1.49) are left unaltered by the addition to $\Phi$ of an arbitrary function of time $f(t)$, namely

$$\Phi(t, \mathbf{x}) \rightarrow \Phi(t, \mathbf{x}) + f(t) . \tag{1.48}$$

Since the transformation affects only the field $\Phi$ and not the coordinates, the invariance of Eqs. (1.47) and (1.49) under
(1.48) is referred to as an internal or gauge symmetry. The gravitational field $\Phi(t, \mathbf{x})$ has no dynamical degrees of freedom.
Eq. (1.49) is not a dynamical equation for the determination of the potential, but rather a constraint on the
spatial distribution of the potential, which must apply at all times.

¹⁰ No value of Newton’s gravitational constant $G$ was available to Newton. Its numerical value was
first determined by Cavendish in 1797 using a torsion balance, the result being reasonably close to present laboratory
measurements, $G = 6.673(10) \times 10^{-11}\ \mathrm{N\, m^2 / kg^2}$. The gravitational constant remains the most uncertain of all the
fundamental constants of physics.

Figure 1.3: Multipolar expansion. (Labels: O, P, x, x′, x − x′.)

The solution of the Poisson equation can be worked out in the same way as you did for the
electromagnetic potential in your Classical Electrodynamics course. The only difference (albeit a
fundamental one) is the sign of the matter distribution. A formal solution of Poisson’s equation for an
arbitrary mass distribution can be obtained by applying the superposition principle or using Green
functions, which gives

$$\Phi(\mathbf{x}) = -G \int \frac{\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\, d^3x' , \tag{1.51}$$

where $\mathbf{x} = x^i \mathbf{e}_i$ is the radius vector of the point at which the gravitational potential is computed,
and $\mathbf{x}' = x'^i \mathbf{e}_i$ is an arbitrary point in the matter distribution. Note that the Newtonian potential is
negative, as expected for an attractive force.

Exercise: Green’s functions (*)

Use the Green’s function method to prove Eq.(1.51).

The previous expression becomes the usual $-GM/r$ only for a spherical mass distribution. The
general result for a non-spherical distribution is slightly more complicated. As for any distribution func-
tion, the essential features of the matter distribution can be characterized by its moments. For an
observer sufficiently far away from the object we can perform a Taylor expansion around $\mathbf{x}' = 0$ to
obtain

$$\frac{1}{|\mathbf{x} - \mathbf{x}'|} = e^{-\mathbf{x}' \cdot \nabla}\, \frac{1}{r} = \frac{1}{r} - (\mathbf{x}' \cdot \nabla) \frac{1}{r} + \frac{1}{2} (\mathbf{x}' \cdot \nabla)^2 \frac{1}{r} + \ldots + \frac{(-1)^n}{n!} (\mathbf{x}' \cdot \nabla)^n \frac{1}{r} + \ldots \tag{1.52}$$

$$= \frac{1}{r} + \frac{x'^k x_k}{r^3} + \frac{\left( 3 x'^k x'^l - r'^2 \delta^{kl} \right) x_k x_l}{2 r^5} + \ldots , \tag{1.53}$$

where we have used the standard expression for the exponential, $e^{-x} = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} x^n$, and defined the
distance $r^2 = x^k x_k$. Inserting this back into Eq. (1.51), we realize that the potential created by the matter
distribution

$$\Phi(\mathbf{x}) = -G \left( \frac{M}{r} + \frac{D^k x_k}{r^3} + \frac{Q^{kl} x_k x_l}{2 r^5} + \ldots \right) \tag{1.54}$$

can be organized in a series whose individual terms contain information on the spatial structure at an
increasing level of detail, while decaying more rapidly in space the higher the information content
is. The quantities

$$M = \int \rho(\mathbf{x}')\, d^3x' , \tag{1.55}$$

$$D^k = \int \rho(\mathbf{x}')\, x'^k\, d^3x' , \tag{1.56}$$

and

$$Q^{kl} = \int \rho(\mathbf{x}') \left( 3 x'^k x'^l - r'^2 \delta^{kl} \right) d^3x' \tag{1.57}$$

are respectively the total mass of the system, the mass dipole moment and the mass quadrupole mo-
ment tensor. The dipole moment can be eliminated by simply choosing the origin of coordinates at
the center of mass. The quadrupole moment is the second moment of the mass distribution with its
trace removed. Its contribution to the potential is proportional to $1/r^3$, which gives rise to a deviation
from the inverse-square law of the form $1/r^4$ in the force.
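To see how quickly the series (1.54) converges far from the source, one can compare it with the exact potential of a simple mass distribution. The sketch below is only an illustration under several assumptions: the continuous density is replaced by three point masses, units with $G = 1$ are used, and all numerical values and function names are made up.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (assumption)

# Three point masses replacing the continuous density rho(x').
masses = [1.0, 2.0, 1.5]
positions = [(0.3, 0.0, 0.1), (-0.2, 0.4, 0.0), (0.0, -0.1, -0.3)]

def exact_potential(x):
    """Discrete version of Eq. (1.51): -G sum m / |x - x'|."""
    return -G * sum(m / math.dist(x, xp) for m, xp in zip(masses, positions))

def moments():
    """Total mass M, dipole D^k and quadrupole Q^kl, Eqs. (1.55)-(1.57)."""
    M = sum(masses)
    D = [sum(m * xp[k] for m, xp in zip(masses, positions)) for k in range(3)]
    Q = [[sum(m * (3 * xp[k] * xp[l]
                   - (sum(c * c for c in xp) if k == l else 0.0))
              for m, xp in zip(masses, positions))
          for l in range(3)] for k in range(3)]
    return M, D, Q

def multipole_potential(x, order=2):
    """Eq. (1.54) truncated after the monopole (0), dipole (1) or quadrupole (2)."""
    M, D, Q = moments()
    r = math.sqrt(sum(c * c for c in x))
    series = M / r
    if order >= 1:
        series += sum(D[k] * x[k] for k in range(3)) / r**3
    if order >= 2:
        series += sum(Q[k][l] * x[k] * x[l]
                      for k in range(3) for l in range(3)) / (2 * r**5)
    return -G * series

field_point = (10.0, 5.0, -7.0)  # far away compared with the size of the source
errors = [abs(multipole_potential(field_point, n) - exact_potential(field_point))
          for n in range(3)]
```

For this field point each additional multipole reduces the error, in line with the increasingly fast decay of the successive terms.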

Exercise: Multipole expansion

• Prove Eq. (1.52).
• Prove that the quadrupole tensor for a spherical distribution vanishes.
• Prove that a change of the origin modifies the quadrupole tensor by only adding a constant.
CHAPTER 2
MINKOWSKI SPACETIME AND SPECIAL RELATIVITY

Scarcely anyone who truly
understands relativity theory can
escape this magic.

A. Einstein

In the previous chapter we saw that tensors are a very good tool for writing covariant equations in
3-dimensional Euclidean space. In this chapter we will generalize the tensor concept to the framework
of the Special Theory of Relativity, the Minkowski spacetime. I will assume the reader to be familiar
at least with the rudiments of Special Relativity, avoiding therefore any kind of historical introduction
to the theory.

2.1 Einstein’s Relativity

Special Relativity is based on two basic axioms, formulated by Einstein in 1905¹:

1. Principle of Relativity (Galileo): The laws of physics are the same in all the inertial frames: No
experiment can measure the absolute velocity of an observer; the results of any experiment do not
depend on the speed of the observer relative to other observers not involved in the experiment.
2. Invariance of the speed of light: The speed of light in vacuum is the same in all the inertial
frames.

Instantaneous action at a distance is inconsistent with the second postulate and must be replaced
by retarded action at a distance. Absolute simultaneity will only apply as an approximation at low
velocities for nearby events.

2.2 Minkowski spacetime: new wine in an old bottle

The framework of Special Relativity is a 4-dimensional manifold called Minkowski (or pseudo-Euclidean)
spacetime. The differential and topological structures of the Newtonian and Minkowskian spacetimes
coincide², but they differ in the metrical structure, i.e. in the definition of distances. While in Newto-
nian spacetime the spatial and temporal distances are independent, in Minkowskian spacetime space
and time by themselves are doomed to fade away into mere shadows, and only a kind of union of the
two will preserve an independent reality³. Space and time are distinguished only by a sign, which will
nevertheless play a central role. For any inertial frame of reference in Minkowski spacetime there is a set
of coordinates⁴

$$\{x^\mu\} = \{t, x, y, z\} = \{x^0, x^1, x^2, x^3\} = \{x^0, x^i\} , \tag{2.1}$$

and a set of orthonormal basis vectors

$$\{\mathbf{e}_\mu\} = \{\mathbf{e}_t, \mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\} = \{\mathbf{e}_0, \mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\} = \{\mathbf{e}_0, \mathbf{e}_i\} , \tag{2.2}$$

satisfying the Lorentz orthonormality condition⁵

$$\mathbf{e}_\mu \cdot \mathbf{e}_\nu = \eta_{\mu\nu} , \tag{2.3}$$

with

$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \mathrm{diag}(-1, 1, 1, 1) . \tag{2.4}$$

The inverse of (2.4) is traditionally denoted by $\eta^{\mu\nu}$ and satisfies⁶

$$\eta^{\mu\nu} \eta_{\nu\rho} = \delta^\mu{}_\rho , \tag{2.5}$$

where the 4-dimensional Kronecker delta $\delta^\mu{}_\rho$ is the indexed version of the identity matrix, i.e. $\delta^\mu{}_\rho = 1$
if $\mu = \rho$ and zero otherwise. Note that $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$ are numerically equal.

¹ On the electrodynamics of moving bodies.

Exercise
What is the value of $\delta^\mu{}_\mu$?

In terms of the basis vectors, the infinitesimal displacement dS between two points in spacetime can
be expressed as
dS = dxµ eµ , (2.6)
where dxµ are the so-called contravariant components. These are computed via the scalar product of
the vector dS and the corresponding basis vector eµ .

dS · eµ = (dxν eν ) · eµ = dxν (eν · eµ ) = dxν ηνµ = dxµ , (2.7)

where we have defined the covariant or dual components by lowering the index of the contravariant
components with the metric
dxµ ≡ ηµν dxν . (2.8)

2 Both of them are smooth, continuous, homogeneous, isotropic, orientable,. . .

3 Minkowski, 1908.
4 As in the previous chapter, we will not consider the quadruplet {xµ } to be a vector in Minkowski spacetime.

Coordinate indices will be always upper indices.

5 Note that the time coordinate is 4-dimensionally orthogonal to the spatial coordinates.
6 Einstein’s summation convention is used.

Exercise
Starting with a covariant vector defined by Eq. (2.8), show that the inverse of the metric $\eta^{\mu\nu}$
can be used to raise indices
$$\eta^{\mu\nu}\, dx_\nu = dx^\mu . \tag{2.9}$$

As in the Euclidean case, contravariant and covariant vectors are just a convenient way of
simplifying the notation and taking into account the summation convention. Note however that in
the present case lowering or raising indices changes the sign of the temporal component while keeping
the spatial ones intact
$$dx_0 = -dx^0 , \qquad dx_i = +dx^i . \tag{2.10}$$
The upper and lower index notation automatically keeps track of the minus signs associated with the
temporal component. The indefiniteness of the metric is automatically incorporated in the notation!

In some old-fashioned books and in 't Hooft's lecture notes you will find a fourth coordinate
$x^4 = it$, instead of the coordinate $x^0 = t$ appearing before. Written in terms of $x^4$ the Minkowski
spacetime has the appearance of a positive-definite 4-dimensional Euclidean space

$$e_\mu \cdot e_\nu = \delta_{\mu\nu} \tag{2.11}$$

and there is no difference between lower and upper indices. This notation is however confusing
since it hides the non-positive-definite character of the metric.

The square of the infinitesimal distance between two events in Minkowskian spacetime is given by

$$|dS|^2 \equiv ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu = dx_\mu dx^\mu = -dt^2 + dX^2 , \tag{2.12}$$

where $\eta_{\mu\nu}$ is the Minkowski metric and $dX^2 \equiv dx^2 + dy^2 + dz^2$ denotes the spatial interval.

Note that we have arbitrarily chosen the spacelike convention $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ for the metric
signature, which keeps intact the notation used for Cartesian tensors in Euclidean spacetime.
Some books use instead the timelike convention, taking $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$. Although
the physics is independent of the convention used, the signs appearing in the formulas in those
books may differ from those in the expressions presented here. For instance, using the convention
$\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, Eq. (2.40) would change to

$$p_\mu p^\mu = m^2 . \tag{2.13}$$

Note however that in both cases you recover $E^2 = p^2 + m^2$ after expanding the expression into
the different components.

Note that, contrary to the Newtonian case, the metric $\eta_{\mu\nu}$ is not positive-definite. Given the
Lorentzian signature (− + + +), the interval (2.12) can be positive, zero, or negative:

• If $ds^2 = 0$, then $|dX/dt| = 1$ and the interval corresponds to the trajectory of a light ray. This
interval is called a null or lightlike interval. The set of all lightlike worldlines leaving or arriving at
a given point $x^\mu$ spans the future or past lightcone of the event. There is a lightcone associated
with each point in spacetime.
• If $ds^2 < 0$ the interval is said to be timelike. It corresponds to the worldline of a particle with
nonzero rest mass moving with a speed smaller than that of light, $|dX/dt| < 1$. Two events
separated by such an interval lie inside each other's lightcone and can be in causal contact. There
will exist a frame in which the two events happen at the same position but at different times.
• If $ds^2 > 0$ the interval is termed spacelike. There will exist a frame in which the two events
happen at the same time but at different places, without any causal relation between them.

[Figure 2.1: Minkowski spacetime. Lightcone diagram showing the future and past light cones, the
regions of timelike and spacelike separation, and the worldlines of a massive and a massless particle.]

The different concepts are summarized in Fig.2.1.
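The three cases can be made concrete with a few lines of code. This is a minimal sketch in natural units (c = 1) using the signature (− + + +) of Eq. (2.12); the function names and the tolerance are illustrative assumptions.

```python
def interval(dt, dx, dy, dz):
    """Squared interval ds^2 = -dt^2 + dx^2 + dy^2 + dz^2, Eq. (2.12)."""
    return -dt**2 + dx**2 + dy**2 + dz**2


def classify(ds2, tol=1e-12):
    """Label an interval as null, timelike or spacelike from the sign of ds^2."""
    if abs(ds2) <= tol:
        return "null"
    return "timelike" if ds2 < 0 else "spacelike"


# Hypothetical separations (dt, dx, dy, dz) between pairs of events.
print(classify(interval(1.0, 1.0, 0.0, 0.0)))  # light ray: |dX/dt| = 1
print(classify(interval(2.0, 1.0, 0.0, 0.0)))  # slower than light
print(classify(interval(1.0, 2.0, 0.0, 0.0)))  # no causal contact possible
```

With the opposite (timelike) signature convention the two inequalities in `classify` would simply be swapped.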

2.3 Minkowski spacetime isometry group

The transformations of Special Relativity are defined as those that do not change the Minkowski line
element (2.12) (not the spatial or temporal intervals separately!). Following the procedure outlined
in the previous chapter, and taking into account that ηµν is also a constant metric, the requirement
ds2 = ds̄2 gives rise to the following condition
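$$\eta_{\rho\sigma}\, \frac{\partial \Lambda^\rho{}_\mu}{\partial x^\pi}\, \Lambda^\sigma{}_\nu = 0 \quad\to\quad \frac{\partial \Lambda^\rho{}_\mu}{\partial x^\pi} = 0 , \tag{2.14}$$
where we have defined
$$\Lambda^\mu{}_\nu \equiv \frac{\partial \bar{x}^\mu}{\partial x^\nu} . \tag{2.15}$$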
As before, the transformation relating the two reference frames must be linear⁷. This set of
transformations constitutes the so-called inhomogeneous Lorentz group or Poincaré group, which is a
combination of translations
$$x^\mu \to x^\mu + a^\mu \tag{2.16}$$
and linear homogeneous Lorentz transformations

$$\bar{x}^\mu = \Lambda^\mu{}_\nu\, x^\nu , \tag{2.17}$$

with Λ a 4 × 4 matrix, independent of the coordinates. The first (upper) index in $\Lambda^\mu{}_\nu$ labels rows,
while the second (lower) one labels columns.
7 Note that this is basically due to the fact that we are dealing with constant metrics.

In order to preserve the line element (2.12) the constant matrices $\Lambda^\mu{}_\nu$ are required to satisfy the
pseudo-orthogonality condition
$$\eta_{\mu\nu} = \eta_{\rho\sigma}\, \Lambda^\rho{}_\mu\, \Lambda^\sigma{}_\nu , \tag{2.18}$$
which, in matrix notation, becomes
$$\eta = \Lambda^T \eta \Lambda , \tag{2.19}$$
with $^T$ denoting matrix transpose. Eq. (2.18) is the relativistic analogue of the orthogonality condition
(1.19). The determinant of the Λ matrices is also ±1. As we did in the previous chapter, we will not
consider the full Lorentz group⁸ (which is neither connected nor compact)

$$O(3, 1) = L_+ \cup P L_+ \cup T L_+ \cup P T L_+ , \tag{2.20}$$

with P and T the parity $P^\mu{}_\nu = \mathrm{diag}(+1, -I)$ and time reversal $T^\mu{}_\nu = \mathrm{diag}(-1, +I)$ operations. We
will restrict ourselves to the continuous Lorentz transformations connected with the identity (the
proper Lorentz group)

$$L_+ \equiv SO(3, 1) = \{\Lambda \,|\, \Lambda^T \eta \Lambda = \eta ,\ \Lambda^0{}_0 \geq 0 ,\ \det \Lambda = 1\} \tag{2.21}$$

with S denoting special or reflection-free. These transformations are the relativistic analogue of proper
rotations in Euclidean spacetime.

Exercise
Verify that the restricted set of Lorentz transformations (2.21) forms a group :

• Closure: The product of any two Lorentz transformations is another Lorentz transforma-
tion.
• There is an identity transformation.
• Every Lorentz transformation has an inverse.

• The product of Lorentz transformations is associative.

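The fact that $\eta^2 = I_4$, with $I_4$ the identity matrix, allows us to easily compute the inverse Lorentz
transformation
$$\eta^2 = \eta\, \Lambda^T \eta \Lambda = (\eta\, \Lambda^T \eta)\, \Lambda = I_4 \quad\to\quad \Lambda^{-1} = \eta \Lambda^T \eta , \tag{2.22}$$
which, writing explicitly the components, becomes
$$(\Lambda^{-1})^\mu{}_\nu = \eta^{\mu\lambda}\, \Lambda^\rho{}_\lambda\, \eta_{\rho\nu} = \Lambda_\nu{}^\mu . \tag{2.23}$$

The position of the indices is important: $\Lambda^\mu{}_\nu \neq \Lambda_\nu{}^\mu$!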

8 Note that now, not only the determinant of Λ, but also the element $\Lambda^0{}_0$, plays a special role in the splitting (2.20).

How many Lorentz transformations are there? Each Lorentz transformation is represented by a 4 × 4
matrix, which makes a total of 16 components. The pseudo-orthogonality condition (2.18) imposes
however some constraints. Since taking the transpose of this equation leaves it unchanged, it contains
only 10 independent conditions: the 4 diagonal elements plus half of the 12 off-diagonal elements. We
are therefore left with 16 − 10 = 6 independent parameters. There are two different kinds of
homogeneous Lorentz transformations. The most obvious ones are spatial rotations
$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 \\ 0 & R^i{}_j \end{pmatrix} , \tag{2.24}$$

where $R^i{}_j$ is a 3 × 3 orthogonal matrix, $\delta_{kl} = \delta_{ij}\, R^i{}_k R^j{}_l$, with i, j running only over spatial directions.
There are three independent rotation matrices, one per spatial direction. For instance, a rotation of
angle θ around the z-axis will take the form
$$R^i{}_j = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{2.25}$$
The difference with Euclidean transformations arises when considering the so-called boosts, which mix
the spatial and temporal components. There are three of them, each one associated with the mixing of
a particular spatial component with time. As an example consider a boost of rapidity $\eta = \tanh^{-1} v$
along the x direction
$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cosh\eta & -\sinh\eta & 0 & 0 \\ -\sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.26}$$
where we have defined the parameters $\beta = v$ (in natural units) and $\gamma = 1/\sqrt{1 - v^2}$, with v the
3-velocity. The rapidity η should not be confused with the metric $\eta_{\mu\nu}$. After the boost, the temporal
and spatial coordinates are a linear and homogeneous combination of the spatial and temporal
coordinates in the old frame.
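As a sanity check, one can verify numerically that the boost (2.26) satisfies the pseudo-orthogonality condition (2.19) and that its inverse is given by (2.22). The sketch below uses plain Python lists as 4 × 4 matrices; the helper names (`matmul`, `transpose`, `boost_x`, `close`) and the velocity v = 0.6 are illustrative assumptions.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def boost_x(v):
    """Boost along x with velocity v (natural units, |v| < 1), Eq. (2.26)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0.0, 0.0],
            [-gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 4x4 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


lam = boost_x(0.6)
lam_inv = matmul(ETA, matmul(transpose(lam), ETA))  # Eq. (2.22)

print(close(matmul(transpose(lam), matmul(ETA, lam)), ETA))  # Eq. (2.19)
print(close(matmul(lam, lam_inv), IDENTITY))                 # it is the inverse
print(close(lam_inv, boost_x(-0.6)))                         # boost with -v
```

The last line illustrates that undoing a boost amounts to boosting with the opposite velocity.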

Exercise
Verify that Eq. (2.26) gives rise to the standard Lorentz transformation

$$t' = \gamma(t - vx) , \qquad x' = \gamma(x - vt) . \tag{2.27}$$

To do so, assume $\Lambda^\mu{}_\nu$ to be the transformation from the rest frame of a given inertial
observer to the frame of a second inertial observer moving with speed v along the x axis and
determine the relation between that velocity and η.

The convenience of using the rapidity parameter η instead of the velocity v resides in the fact that
η combines additively. In fact, if we consider two consecutive boosts in the same direction, we have
Λ(η1 )Λ(η2 ) = Λ(η1 + η2 ) . (2.28)
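The additivity property (2.28) is easy to test numerically. The sketch below keeps only the (t, x) block of the boost matrix, which is the only non-trivial part for boosts along x; the helper names and the two velocities are illustrative assumptions.

```python
import math


def boost_2d(eta):
    """(t, x) block of the boost (2.26) written in terms of the rapidity eta."""
    c, s = math.cosh(eta), math.sinh(eta)
    return [[c, -s], [-s, c]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


v1, v2 = 0.5, 0.8                       # hypothetical velocities, natural units
eta1, eta2 = math.atanh(v1), math.atanh(v2)

composed = matmul2(boost_2d(eta1), boost_2d(eta2))
direct = boost_2d(eta1 + eta2)
max_diff = max(abs(composed[i][j] - direct[i][j])
               for i in range(2) for j in range(2))
v_total = math.tanh(eta1 + eta2)        # velocity of the composed boost

print(max_diff)                         # compatible with zero up to rounding
print(v_total, (v1 + v2) / (1 + v1 * v2))
print(v_total < 1.0)                    # the result never exceeds light speed
```

Rapidities add linearly, while the corresponding velocities combine in a way that never exceeds the speed of light.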

Exercise
Consider the composition of 2 boosts with velocities $v_1$ and $v_2$ along the x direction. Show that
$\Lambda_1 \Lambda_2$ gives rise to a boost with 3-velocity
$$v = \frac{v_1 + v_2}{1 + v_1 v_2} . \tag{2.29}$$
What happens in the limit $v_1, v_2 \ll 1$? Generalize the previous result to an arbitrary direction. Is
the general result symmetric under the interchange of the two velocities? Note that all the
expressions are written in natural units.
[Figure 2.2: A boost transformation. Spacetime diagram showing the t and x axes of the original and
of the boosted frame, together with the lines of simultaneous events of each observer.]

The form of Eq. (2.26)

$$\Lambda^\mu{}_\nu = \begin{pmatrix} \cosh\eta & -\sinh\eta & 0 & 0 \\ -\sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cos i\eta & i\sin i\eta & 0 & 0 \\ i\sin i\eta & \cos i\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{2.30}$$

and the property (2.28) closely resemble those of spatial rotations (2.25). The main difference is
the replacement of trigonometric functions by their hyperbolic analogues, reflecting the relative sign
of the temporal direction with respect to the spatial directions. Note however two important
differences between boosts and ordinary rotations:

• The rotation parameter θ in Eq. (2.25) runs between 0 and 2π, with both points included. The
rapidity parameter η is non-compact and can take any value in R.
• The boost matrix (2.26) is symmetric, which is not the case for ordinary rotations, cf. Eq.
(2.25).

Although we should not take the analogy between rotations and boosts too seriously, it is instructive
to look at the action of a Lorentz transformation on a spacetime diagram. As shown in Fig. 2.2, a
Lorentz boost rotates time and space by the same angle $\eta = \tanh^{-1} v$, but in opposite directions! In
Special Relativity the simultaneity of two events depends on the observer: the hyperplanes of constant
coordinate time do not have an invariant meaning. Note however that the light cone (i.e. the dashed
line at 45 degrees in the diagram) is invariant under Lorentz transformations.

Exercise
Use the diagram in Fig. 2.2 to derive the well-known effects of length contraction
$$\bar{L} = L \sqrt{1 - v^2} \quad\to\quad \bar{L} < L , \tag{2.31}$$

and time dilation

$$\bar{T} = \frac{T}{\sqrt{1 - v^2}} \quad\to\quad \bar{T} > T . \tag{2.32}$$
Hint: Don't be confused by drawing just a slice of Minkowski spacetime on a Euclidean sheet of paper.
Use the Minkowski metric!

2.4 Tensors in Minkowski spacetime

Einstein's Principle of Relativity introduced at the beginning of this chapter implies that all
the laws of physics must retain their mathematical form in all inertial frames, i.e. they must be
covariant under Lorentz transformations. As we learnt in the previous chapter, tensorial equations
automatically satisfy this requirement. Since the required discussion of tensors in Minkowski spacetime
closely follows that in Section 1.4 for Euclidean spacetime⁹, we will simply summarize the results in
Table 2.1.

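Lorentz transformations: $\frac{\partial \bar{x}^\mu}{\partial x^\nu} \equiv \Lambda^\mu{}_\nu$ are constants!

Scalar                           $\bar{\phi} = \phi$
Contravariant vector             $\bar{V}^\mu = \frac{\partial \bar{x}^\mu}{\partial x^\nu}\, V^\nu$
Covariant vector                 $\bar{V}_\mu = \frac{\partial x^\nu}{\partial \bar{x}^\mu}\, V_\nu$
Contravariant rank-2 tensor      $\bar{T}^{\mu\nu} = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial \bar{x}^\nu}{\partial x^\sigma}\, T^{\rho\sigma}$
Covariant rank-2 tensor          $\bar{T}_{\mu\nu} = \frac{\partial x^\rho}{\partial \bar{x}^\mu} \frac{\partial x^\sigma}{\partial \bar{x}^\nu}\, T_{\rho\sigma}$
Mixed rank-2 tensor              $\bar{T}^\mu{}_\nu = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial \bar{x}^\nu}\, T^\rho{}_\sigma$

Table 2.1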
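The rules of Table 2.1 can be illustrated with a boost along x. In the sketch below a contravariant vector is transformed with $\Lambda$ and a covariant one with $\Lambda^{-1}$, and the contraction $V^\mu W_\mu$ is checked to be a scalar. The function names and the sample components are illustrative assumptions; the inverse boost is taken to be the boost with velocity −v.

```python
import math


def boost_x(v):
    """Matrix Lambda^mu_nu of a boost along x with velocity v (natural units)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0.0, 0.0],
            [-gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_up(lam, v_up):
    """Contravariant rule: Vbar^mu = Lambda^mu_nu V^nu."""
    return [sum(lam[mu][nu] * v_up[nu] for nu in range(4)) for mu in range(4)]


def transform_down(lam_inv, w_down):
    """Covariant rule: Wbar_mu = (Lambda^{-1})^nu_mu W_nu."""
    return [sum(w_down[nu] * lam_inv[nu][mu] for nu in range(4))
            for mu in range(4)]


def contract(v_up, w_down):
    """Scalar V^mu W_mu."""
    return sum(a * b for a, b in zip(v_up, w_down))


v = 0.6
lam, lam_inv = boost_x(v), boost_x(-v)

V_up = [1.0, 2.0, -1.0, 0.5]     # hypothetical contravariant components
W_down = [0.3, -1.0, 4.0, 2.0]   # hypothetical covariant components

before = contract(V_up, W_down)
after = contract(transform_up(lam, V_up), transform_down(lam_inv, W_down))
print(before, after)             # equal up to rounding errors
```

The individual components change under the boost, but the fully contracted quantity does not, which is what makes tensorial equations covariant.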

Exercise
How does the volume element $d^4x = dx^0 dx^1 dx^2 dx^3$ transform under Lorentz transformations?

2.5 Covariance and Relativistic Mechanics

In Newtonian spacetime the trajectory of a particle is described by the position 3-vector as a function
of time $x^i(t)$, with t an absolute element of the theory. In Special Relativity a particle of mass m
follows a timelike worldline in Minkowski spacetime. The time depends on the chosen reference frame
and it is just another coordinate, on the same footing as the spatial coordinates. The trajectory $x^\mu(\sigma)$
can be expressed in terms of a completely arbitrary parameter σ, which changes continuously along the
worldline and does not need to have any particular physical interpretation. However, an interesting
(and natural) possibility for the particular case of massive particles is to identify it with the proper
time τ of the particle. This proper time τ is defined as the time measured by an observer in the
particle's rest frame (dX = 0), which can always be reached by performing a Lorentz transformation.
The proper time interval dτ is therefore related to the Minkowski spacetime interval¹⁰ by

$$ds^2 = -d\tau^2 < 0 . \tag{2.33}$$

9 Note indeed that the only conceptual difference is the replacement of rotations by Lorentz transformations, the
replacement of Latin indices by Greek indices, and the use of the Minkowski metric, instead of the Euclidean one, for
lowering and raising indices.
10 Note that the proper time is not a useful parametrization for the worldline of massless particles, such as photons,
since these particles move on the light cone and can travel any distance in zero proper time, $d\tau^2 = -ds^2 = 0$.

The connection between the proper time and the measurement made in an inertial reference frame
with coordinate time interval dt is given by
$$\gamma \equiv \frac{dt}{d\tau} = \left(1 - v^2\right)^{-1/2} . \tag{2.34}$$
Note that γ is a growing function of the 3-velocity $v^i = dx^i/dt$ and it is never smaller than one. The
proper time goes by at a slower rate than the coordinate time t.
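For completeness, the relation between dt and dτ follows in one line from the line element (2.12) and the definition (2.33) of the proper time, writing $v^2 \equiv \delta_{ij} v^i v^j$ with $v^i = dx^i/dt$:

$$-d\tau^2 = ds^2 = -dt^2 + dX^2 = -dt^2 \left(1 - \frac{dX^2}{dt^2}\right) = -dt^2 \left(1 - v^2\right) \quad\Rightarrow\quad \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - v^2}} \geq 1 .$$

The positive root is taken so that proper time and coordinate time run in the same direction.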

4-velocity

Given τ, and in clear analogy with the 3-dimensional case, the 4-velocity $u^\mu$ along the trajectory is
given by the 4-vector
$$u^\mu \equiv \frac{dx^\mu}{d\tau} , \tag{2.35}$$
which is tangent to the worldline of the particle and automatically normalized

$$\eta_{\mu\nu}\, u^\mu u^\nu = u_\mu u^\mu = -1 . \tag{2.36}$$

In terms of its components, the 4-velocity $u^\mu$ can be written as

$$u^\mu = \frac{dx^\mu}{d\tau} = \frac{dt}{d\tau} \left(1, v^i\right)^T = \gamma \left(1, v^i\right)^T . \tag{2.37}$$
For an observer at rest γ = 1 and Eq. (2.37) becomes simply $u^\mu = (1, 0, 0, 0)^T$. In the Newtonian limit
$v \ll c$, $d\tau \to dt$ and $u^i \to v^i$.
Let us consider the behaviour of $u^\mu$ with respect to Lorentz transformations. Since the proper time
dτ is invariant under Lorentz transformations and $dx^\mu$ transforms as a contravariant vector, we have

$$\bar{u}^\mu = \Lambda^\mu{}_\nu\, u^\nu . \tag{2.38}$$

The 4-velocity is a timelike 4-vector.

4-momentum

Since the mass m of the particle is a scalar under Lorentz transformations, the 4-momentum

$$p^\mu = m u^\mu \tag{2.39}$$
is a Lorentz 4-vector with components $p^\mu = \left(E, p^i\right)^T = \left(m\gamma, m\gamma v^i\right)^T$.

In the instantaneous rest frame of the particle, $p^\mu = (m, 0)^T$. This can be used to simplify many
computations. We can compute things in this particular frame and then re-express the result
in a form valid in any other inertial frame by appealing to covariance.

Using the normalization condition for the 4-velocities (2.36), the normalization condition for the
4-momentum becomes
$$p_\mu p^\mu = -m^2 , \tag{2.40}$$
which is nothing else than the well-known energy-momentum relation $E = \sqrt{p^2 + m^2}$ written in a
covariant way¹¹. In the Newtonian limit ($|p^i| \ll m$) this relation becomes the familiar expression of
the Newtonian theory together with the energy equivalent of the mass $mc^2$, i.e. $E \simeq m + \frac{p^2}{2m}$. Note
that the 4-momentum remains well defined even for massless particles, where it has zero square norm
and becomes lightlike, $p_\mu p^\mu = 0$. We will take this as a definition of a massless classical particle. The
indefiniteness of the Minkowski metric allows for non-zero values of the temporal and spatial parts as
long as they cancel out in $p_\mu p^\mu$. In particular, we can always find a frame in which $p^\mu = (E, 0, 0, E)^T$.

11 Note that p is the three-momentum $p^i$.
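The normalization conditions (2.36) and (2.40) and the Newtonian limit of the energy can be checked with a short numerical sketch. The function names, the mass and the velocities below are illustrative assumptions; natural units are used throughout.

```python
import math


def minkowski_dot(a, b):
    """Scalar product eta_{mu nu} a^mu b^nu with eta = diag(-1, 1, 1, 1)."""
    return -a[0] * b[0] + sum(a[i] * b[i] for i in range(1, 4))


def four_velocity(v3):
    """u^mu = gamma (1, v^i) for a 3-velocity v3 = (vx, vy, vz), Eq. (2.37)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v3))
    return [gamma] + [gamma * c for c in v3]


def four_momentum(m, v3):
    """p^mu = m u^mu, Eq. (2.39)."""
    return [m * c for c in four_velocity(v3)]


m = 2.0                                  # hypothetical mass
u = four_velocity((0.3, 0.4, 0.0))
p = four_momentum(m, (0.3, 0.4, 0.0))
print(minkowski_dot(u, u))               # close to -1, Eq. (2.36)
print(minkowski_dot(p, p))               # close to -m^2 = -4, Eq. (2.40)

# Newtonian limit: for |p| << m the energy approaches m + p^2 / (2 m).
p_slow = four_momentum(m, (1e-3, 0.0, 0.0))
p3_squared = sum(c * c for c in p_slow[1:])
newtonian_gap = p_slow[0] - (m + p3_squared / (2.0 * m))
print(newtonian_gap)                     # tiny: the two expressions agree
```

The residual in the last line is of order $m v^4$, the first relativistic correction to the Newtonian kinetic energy.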

4-acceleration

It is important to remark that Special Relativity, like Newtonian mechanics, is concerned with the
relation between inertial observers and not with the behavior of the objects that they are studying.
In particular, the observed objects can be accelerating. The 4-acceleration in Minkowski spacetime
can be defined as¹²
$$a^\mu = \lim_{\epsilon \to 0} \frac{u^\mu(\tau + \epsilon) - u^\mu(\tau)}{\epsilon} \quad\to\quad a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . \tag{2.42}$$

The 4-acceleration (2.42) transforms in the proper way under Lorentz transformations since $\Lambda^\mu{}_\nu$
is constant and depends only on the relative velocity between the two frames. This fact allows it to
pass through the derivative $d^2/d\tau^2$. The acceleration $a^\mu$ is a spacelike 4-vector

$$\eta_{\mu\nu}\, a^\mu a^\nu > 0 \tag{2.43}$$

orthogonal to the timelike 4-velocity

$$a_\mu u^\mu = 0 , \tag{2.44}$$
as can be easily shown by computing the derivative of Eq. (2.36) with respect to the proper time.
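Spelled out, the argument is the following. Differentiating the normalization condition (2.36) with respect to τ, and using that $\eta_{\mu\nu}$ is constant and symmetric, gives

$$0 = \frac{d}{d\tau}\left(\eta_{\mu\nu}\, u^\mu u^\nu\right) = \eta_{\mu\nu} \left(a^\mu u^\nu + u^\mu a^\nu\right) = 2\, a_\mu u^\mu .$$

In the instantaneous rest frame, where $u^\mu = (1, 0, 0, 0)^T$, this orthogonality forces $a^0 = 0$, so that $a^\mu = (0, a^i)^T$ and $\eta_{\mu\nu}\, a^\mu a^\nu = \delta_{ij}\, a^i a^j$, which is positive for any non-vanishing acceleration. Since the norm is a Lorentz scalar, the same sign holds in every inertial frame.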

Energy-momentum conservation

The objects defined above allow us to easily generalize Newton's second law, which becomes
$$f^\mu = \frac{dp^\mu}{d\tau} = m a^\mu . \tag{2.45}$$
Note that Eq. (2.45) is a tensorial identity, which maintains its form under Lorentz transformations
and automatically satisfies Einstein's principle of Relativity. The explicit expression of $f^\mu$ depends on
the considered interaction (cf. the first exercise in Section 2.7). The components of the 4-vector $f^\mu$
in Eq. (2.45) are proportional to the Newtonian force $F^i$ and to the work done by $F^i$ per unit time,
i.e. $f^\mu = \gamma \left(v_i F^i, F^i\right)^T$. Contrary to what happens in Newtonian physics, energy and momentum
conservation laws are not independent. The conservation laws of Newtonian mechanics in a given
collision between particles are replaced by a conservation law for the total 4-momentum¹³
$$\sum_{\rm in} p^\mu_{\rm in} = \sum_{\rm out} p^\mu_{\rm out} , \tag{2.46}$$

12 You are probably wondering why I am making such a mess writing out the explicit definition in (2.42) instead of
directly writing
$$a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . \tag{2.41}$$
The point I want you to notice is that the two vectors $u^\mu(\tau + \epsilon)$ and $u^\mu(\tau)$ are located at different points in spacetime.
What we are really doing when computing the acceleration in Special Relativity is trivially moving the two vectors
to the same point in spacetime before subtracting them. This trivial operation of moving a vector from one point to
another will turn out to be not so trivial in General Relativity. As you will see, we will need to introduce some extra
machinery in order to do that. . . but let's move one step at a time. . .
13 The interaction is assumed to be a contact interaction. Particles are free away from the interaction point.

with the subscripts in and out denoting the incoming and outgoing particles. Note that the conserva-
tion law is Lorentz covariant and reduces to the Newtonian momentum and energy conservation for
small velocities.

Exercise
Show that a photon cannot spontaneously decay into an electron-positron pair.

2.6 Relativistic Lagrangian for free particles

The equation of motion for a free particle following from (2.45)

$$\frac{dp^\mu}{d\tau} = 0 , \tag{2.47}$$
can also be derived from a Lagrangian formulation where the role of the generalized coordinates
is played by the space-time coordinates $x^\mu$ and the classical time t is replaced by an appropriate
parameter σ. The simplest guess for the relativistic action would be a naive generalization of the
Newtonian action, namely
$$S = \frac{1}{2}\, m \int d\tau\; \eta_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} . \tag{2.48}$$
Note however that the previous expression is not invariant under reparametrizations of the path¹⁴
$\tau \to f(\tau)$. The dynamics of the particle seems to depend on the “internal coordinate” τ used in the
description of the curve $x^\mu(\tau)$. Moreover, the action (2.48) does not contain any information about
the lightcone. On top of that, it does not have a smooth massless limit either.
To solve these problems, we will substitute the proper time by an arbitrary parameter σ and
introduce a non-dynamical function¹⁵ e(σ), the so-called einbein. This quantity will be treated as an
additional generalized coordinate during the intermediate computations and fixed to a particular value
only at the end. To clarify the construction, we proceed in several steps. We start by replacing the
problematic mass appearing in the action (2.48) by $e^{-1}(\sigma)$. This gives rise to the following structure

$$S \sim \frac{1}{2} \int d\sigma\, \frac{1}{e(\sigma)}\, \eta_{\mu\nu}\, \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} . \tag{2.49}$$

In order for the previous action to be invariant under reparametrizations of the path¹⁶ $\sigma \to f(\sigma)$,
the einbein e(σ) must be chosen to transform in the proper way. The transformation rule can be
determined by inspection: the product e(σ)dσ must remain invariant. In other words, the infinitesimal
displacement dσ and the einbein must transform in opposite ways
$$d\bar{\sigma} = \dot{f}(\sigma)\, d\sigma , \qquad \bar{e}(\bar{\sigma}) = \left[\dot{f}(\sigma)\right]^{-1} e(\sigma) . \tag{2.50}$$

With these transformations at hand, we proceed now to reintroduce the mass parameter m in the
action (2.49). The form of the new term is essentially determined by pure dimensional arguments,
reparametrization invariance and the massless limit. In order for the new piece to be reparametrization
invariant, the integration measure dσ must come together with a factor e(σ). This gives a term
$d\sigma\, e(\sigma)$ with dimension $[E]^{-2}$, which must be compensated¹⁷ by something with dimension $[E]^2$ and
proportional to m. There you are: the new term is $d\sigma\, e(\sigma)\, m^2$. The resulting action is the so-called
einbein action
$$S = \frac{1}{2} \int d\sigma \left[ \frac{1}{e(\sigma)}\, \eta_{\mu\nu}\, \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} - m^2 e(\sigma) \right] \tag{2.51}$$
and gives rise to the following Euler-Lagrange equations for the generalized coordinates $x^\mu(\sigma)$ and e(σ)

$$\frac{d}{d\sigma} \left( e^{-1}(\sigma)\, \frac{dx^\mu}{d\sigma} \right) = 0 \qquad \text{and} \qquad \eta_{\mu\nu}\, \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = -m^2 e^2(\sigma) . \tag{2.52}$$

14 The reparametrization invariance of the action should be understood as a gauge symmetry: a redundancy of the
description, not a symmetry relating different solutions of the theory.
15 i.e. no kinetic term for e(σ) will be included.
16 Or, if you prefer, invariant under 1D general coordinate transformations.

The massive or massless character of particles is automatically incorporated in the second equation.
Indeed, choosing e(σ) = 1 and taking the limit m → 0, we obtain the equations of motion for a free
massless particle
$$\frac{d^2 x^\mu}{d\sigma^2} = 0 , \qquad \eta_{\mu\nu}\, \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = 0 . \tag{2.53}$$
On the other hand, the equations for massive particles can be obtained by choosing e(σ) = 1/m and
using the proper time as affine parameter (σ = τ)

$$m\, \frac{d^2 x^\mu}{d\tau^2} = 0 , \qquad \eta_{\mu\nu}\, \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = -1 . \tag{2.54}$$
Choices of this kind, in which e(σ) = constant, are called affine and restrict the function f(σ) to the
form
$$\dot{f} = 1 \quad\to\quad f(\sigma) = \sigma + \text{constant} . \tag{2.55}$$
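The reparametrization invariance of the einbein action can also be probed numerically. The sketch below evaluates the action (2.51) on a straight timelike worldline in the (t, x) plane, first with a constant einbein and then after a change of parameter σ = g(s), with the einbein rescaled so that e dσ is unchanged as in Eq. (2.50). Everything specific here is an assumption made for illustration: the values of `M`, `V` and `E0`, the map `g`, the midpoint integration rule, the finite-difference derivative, and the "naive" action, which is taken to be (2.48) with τ replaced by a generic parameter.

```python
M = 2.0    # hypothetical mass
V = 0.6    # hypothetical constant velocity along x (natural units)
E0 = 0.5   # hypothetical constant einbein in the original parametrization


def path(sigma):
    """Straight timelike worldline x^mu(sigma) = (sigma, V * sigma)."""
    return (sigma, V * sigma)


def g(s):
    """Change of parameter sigma = g(s), monotonic with g(0) = 0, g(1) = 1."""
    return 0.5 * (s + s * s)


def g_prime(s):
    return 0.5 * (1.0 + 2.0 * s)


def speed_sq(x_of, s, h=1e-5):
    """eta_{mu nu} (dx^mu/ds)(dx^nu/ds) from a central finite difference."""
    t_plus, x_plus = x_of(s + h)
    t_minus, x_minus = x_of(s - h)
    dt = (t_plus - t_minus) / (2.0 * h)
    dx = (x_plus - x_minus) / (2.0 * h)
    return -dt * dt + dx * dx


def einbein_action(x_of, e_of, n=2000):
    """Midpoint-rule estimate of Eq. (2.51) on the parameter interval [0, 1]."""
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * step
        total += 0.5 * (speed_sq(x_of, s) / e_of(s) - M * M * e_of(s)) * step
    return total


def naive_action(x_of, n=2000):
    """Midpoint-rule estimate of (1/2) m * integral of eta dx dx, cf. Eq. (2.48)."""
    step = 1.0 / n
    return sum(0.5 * M * speed_sq(x_of, (k + 0.5) * step) * step
               for k in range(n))


def new_path(s):
    return path(g(s))


def new_einbein(s):
    # With sigma = g(s), keeping e * d(sigma) fixed requires e_new = g'(s) * e.
    return g_prime(s) * E0


s_old = einbein_action(path, lambda s: E0)
s_new = einbein_action(new_path, new_einbein)
print(s_old, s_new)                                # agree up to numerical error
print(naive_action(path), naive_action(new_path))  # these two differ
```

The einbein action returns the same number in both parametrizations, while the naive quadratic action does not, which is the problem that motivated the construction.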

Exercise
Consider the massive case. Show that the action (2.51) is equivalent to the geometrical action
$$S = -m \int d\tau . \tag{2.56}$$

2.7 Maxwell’s equations

In their traditional form¹⁸, Maxwell's equations are given by

∇ × E + ∂t B = 0, (2.57)
∇·B = 0, (2.58)
∇ · E = ρ, (2.59)
∇ × B − ∂t E = J , (2.60)

with E and B the electric and magnetic fields, J the current density and ρ the charge density. They
are 8 coupled linear differential equations, in which the boundary conditions are usually taken to be
such that for infinite systems the fields E and B go to zero at infinity. Note also the symmetry E ↔ B
in the absence of sources.
17 Recall that, in natural units, the action is dimensionless.
18 Note that they are written using the Heaviside-Lorentz convention, in which no 4π factors appear.

The homogeneous Maxwell's equations (2.57) and (2.58) can be solved by introducing the so-called
electromagnetic potentials: a scalar potential ϕ and a vector potential¹⁹ A satisfying

$$E = -\nabla\varphi - \partial_t A , \qquad B = \nabla \times A . \tag{2.61}$$

Using them, and imposing the Lorenz gauge condition $\partial_t \varphi + \nabla \cdot A = 0$, the inhomogeneous
Maxwell's equations become

$$\partial_t^2 \varphi - \nabla^2 \varphi = \rho , \qquad \partial_t^2 A - \nabla^2 A = J . \tag{2.62}$$

Given the electromagnetic potentials ϕ and A in Eq. (2.61) the electromagnetic fields E and B are
completely determined, but not vice versa. A and ϕ are gauge fields (see the exercise below).
Familiarity with Maxwell's equations soon leads to the appreciation of the unified nature of the
electromagnetic field and its relativistic nature. Although in their 19th century version Maxwell's
equations (2.57)-(2.60) do not seem at all invariant under Lorentz transformations, they can be written
in a more compact and elegant way that makes their covariant form explicit. Introducing the 4-vector
potential (gauge field) $A^\mu \equiv (\varphi, A)$ and the charge-current density 4-vector $J^\mu \equiv (\rho, J)$, we obtain²⁰

$$\partial_\nu F^{\mu\nu} = J^\mu , \tag{2.63}$$
$$\epsilon_{\mu\nu\rho\sigma}\, \partial^\rho F^{\mu\nu} = 0 , \tag{2.64}$$

where the antisymmetric quantity $F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu$ is the gauge invariant (Faraday) field strength
with components
$$F^{0i} = E^i , \qquad F^{ij} = \epsilon^{ijk} B_k , \tag{2.65}$$
and the different ε are totally antisymmetric tensors in the corresponding dimension²¹ n

$$\epsilon^{\mu_1 \mu_2 \ldots \mu_n} = \begin{cases} +1, & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an even permutation of } 01 \ldots (n-1) , \\ -1, & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an odd permutation of } 01 \ldots (n-1) , \\ 0, & \text{otherwise} . \end{cases} \tag{2.66}$$


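In covariant notation, and in the Lorenz gauge $\partial_\mu A^\mu = 0$, (2.62) becomes

$$-\Box A^\mu = J^\mu \tag{2.67}$$
where we have defined the d'Alembertian operator

$$\Box \equiv \partial_\mu \partial^\mu = -\partial_t^2 + \partial_i^2 . \tag{2.68}$$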

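• The covariant components $\epsilon_{\mu\nu\rho\sigma}$ of the permutation tensor $\epsilon^{\mu\nu\rho\sigma}$ in Minkowski spacetime
are defined by lowering each of the indices with the metric tensor $\eta_{\mu\nu}$

$$\epsilon_{\mu\nu\rho\sigma} = \eta_{\mu\lambda}\, \eta_{\nu\kappa}\, \eta_{\rho\pi}\, \eta_{\sigma\tau}\, \epsilon^{\lambda\kappa\pi\tau} , \tag{2.69}$$

from which it follows that $\epsilon_{0123} = -\epsilon^{0123}$.

• Note that, whereas a cyclic permutation of the indices in the 3-dimensional permutation
symbol leaves it unchanged ($\epsilon^{ijk} = \epsilon^{jki}$), a cyclic permutation of the 4-dimensional
permutation symbol gives rise to a minus sign ($\epsilon^{\mu\nu\rho\sigma} = -\epsilon^{\nu\rho\sigma\mu}$).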
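To see the tensorial character of $F^{\mu\nu}$ at work, the sketch below builds the field strength from given E and B using the components (2.65), boosts it with the rank-2 rule of Table 2.1, and checks that the scalar $F^{\mu\nu}F_{\mu\nu}$ is unchanged. The function names and the field values are illustrative assumptions; with the conventions above one expects $F^{\mu\nu}F_{\mu\nu} = 2(B^2 - E^2)$.

```python
import math

ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # diagonal entries of eta = diag(-1, 1, 1, 1)


def field_strength(E, B):
    """F^{mu nu} with F^{0i} = E^i and F^{ij} = eps^{ijk} B_k, Eq. (2.65)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    F[1][2], F[2][1] = B[2], -B[2]
    F[2][3], F[3][2] = B[0], -B[0]
    F[3][1], F[1][3] = B[1], -B[1]
    return F


def boost_x(v):
    """Boost along x with velocity v (natural units)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0.0, 0.0],
            [-gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform(lam, F):
    """Rank-2 contravariant rule: Fbar^{mu nu} = Lam^mu_rho Lam^nu_sigma F^{rho sigma}."""
    return [[sum(lam[m][r] * lam[n][s] * F[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]


def invariant(F):
    """Scalar F^{mu nu} F_{mu nu}; indices are lowered with the diagonal metric."""
    return sum(F[m][n] ** 2 * ETA_DIAG[m] * ETA_DIAG[n]
               for m in range(4) for n in range(4))


lam = boost_x(0.6)

F = field_strength([0.0, 1.0, 0.0], [0.0, 0.0, 0.5])   # hypothetical fields
print(invariant(F), invariant(transform(lam, F)))      # both equal 2 (B^2 - E^2)

# A purely electric field in one frame acquires a magnetic part after a boost.
F_electric = field_strength([0.0, 1.0, 0.0], [0.0, 0.0, 0.0])
print(transform(lam, F_electric)[1][2])                # Fbar^{12} = Bbar_z is nonzero
```

The second check illustrates why the notion of a purely electric or purely magnetic field is frame dependent.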
19 In the 3-dimensional sense!
20 Eq. (2.64) is sometimes called a Bianchi identity. As you will see, this will not be the last time that we will find
one of these identities.
21 Note that some books use the opposite sign convention for (2.66).

Exercise
Just for those of you who know Classical Field Theory. The previous equations of motion can be
obtained from the following Lagrangian density
$$\mathcal{L} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + J^\mu A_\mu , \tag{2.70}$$
where $J^\mu$ is treated as an external source.
• Check that with the Lagrangian density (2.70), the action $S = \int \mathcal{L}\, d^4x$ is invariant under
Lorentz transformations

$$\bar{A}^\mu = \Lambda^\mu{}_\nu A^\nu , \qquad \bar{F}^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma F^{\rho\sigma} , \qquad \bar{J}^\mu = \Lambda^\mu{}_\nu J^\nu , \tag{2.71}$$

and gauge transformations

$$A^\mu \to A^\mu - \partial^\mu \chi , \tag{2.72}$$
where χ = χ(t, x) is an arbitrary function of space-time. The concept of purely electric
or magnetic fields and that of a static charge distribution with zero current become
meaningless, being a good description only in a particular reference frame.
• What is the Lorentz invariant generalization of the equation of motion F = q(E + v × B)?
Can you guess the associated Lagrangian (a)? Check the consistency of the two results by
computing the Euler-Lagrange equations (3.30) of the obtained Lagrangian.
(a) Hint: What is the generalization of the cross product in the 4-dimensional case? Consider the Lagrangian
of a charged particle in an electrostatic potential Φ.

Taking the 4-divergence of Eq. (2.63) we obtain

$$\partial_\mu J^\mu = \partial_\mu \partial_\nu F^{\mu\nu} = 0 , \tag{2.73}$$

where in the last step we used the fact that $F^{\mu\nu}$ is an antisymmetric tensor, i.e. $F^{\mu\nu} = -F^{\nu\mu}$.
Eq. (2.73) is nothing else than the continuity equation

$$\frac{\partial \rho}{\partial t} + \nabla \cdot J = 0 . \tag{2.74}$$
The conservation of total charge Q(t)
$$\dot{Q}(t) = \int_{R^3} \dot{\rho}(t, x^i)\, d^3x = -\int_{R^3} \partial_k J^k(t, x^i)\, d^3x = 0 , \tag{2.75}$$

is imposed by the field equations. If the charge is not conserved there is no solution!

Exercise
Prove that the product $S_{\mu\nu} A^{\mu\nu}$ of a symmetric tensor $S^{\mu\nu}$ and an antisymmetric tensor $A^{\mu\nu}$ is
zero.
CHAPTER 3
“THE HAPPIEST THOUGHT” OF EINSTEIN’S LIFE

For an observer falling freely from

the roof of a house there exists -
at least in his immediate
surroundings - no gravitational
field [. . . ]. The observer therefore
has the right to interpret his state
as ’at rest’ (at least until he hits
the ground!).

A. Einstein (1920)

The Poisson equation for the gravitational field

$$\nabla^2 \Phi(t, x^i) = 4\pi G \rho(t, x^i)$$

is a linear partial differential equation of 2nd order which does not contain any time derivatives.
The gravitational potential responds instantaneously to the changes in the matter distribution!
This was awkward even for Newton
That one body may act upon another at a distance through a vacuum, without the mediation
of anything else, by and through which their action and force may be conveyed from one to
another, is to me so great an absurdity, that I believe no man, who has in philosophical matters
a competent faculty of thinking, can ever fall into it (Principia, p. 643, Ref. 395).

and it is in clear contradiction with Special Relativity. The instinctive reaction of many physicists when
facing this consistency problem was to apply the recipes used when writing the covariant version of
Maxwell's equations (promote the operator $\nabla^2$ to $\Box$, introduce some kind of vector potential $A^\mu$ for
the gravitational field, generalize the Newtonian force to some combinations of fields and 4-velocities,
get retarded potentials, etc.). None of the attempts was successful¹. Einstein eventually concluded
that a new approach to the problem had to be taken. The purpose of this chapter is to present
Einstein's new look on gravity. As you will see, the new look turned out to be a really old look that
went back to Galileo himself.
1 We will be back to this point in the future.

3.1 Inertial and gravitational masses

Two different masses enter Newton's theory of mechanics and gravitation. According to Newton's
second law the acceleration $a^i = d^2 x^i/dt^2$ experienced by an object is proportional to the exerted
force divided by the inertial mass of the object

$$f^i = m_I\, a^i , \tag{3.1}$$

independently of the origin of the force. This mass $m_I$ measures the resistance of an object to
accelerations. On the other hand, we have the gravitational mass, which measures the strength of
gravity (in the same way that the electric charge measures the strength of the electric force). The
force exerted on a gravitational mass $m_G$ close to the surface of the Earth is given by

$$f^i = m_G\, g^i . \tag{3.2}$$

Comparing the expressions (3.1) and (3.2), we conclude that the acceleration of gravity should depend
a priori on the ratio of the gravitational mass to the inertial mass

$$a^i = \frac{m_G}{m_I}\, g^i . \tag{3.3}$$

Nevertheless, as verified by Galileo's ramp experiments², all bodies fall with the same acceleration in
a gravitational field
$$a^i = g^i . \tag{3.4}$$
This observation implies the equality of the quantity controlling inertia ($m_I$) and that measuring the
strength of gravity ($m_G$)
$$m_I = m_G , \tag{3.5}$$
for all materials, independently of their composition.

Exercise
Consider the magnitude of the electrostatic interaction at a distance r between two particles of
charges $q_1$, $q_2$ and inertial masses $m_{I,1}$, $m_{I,2}$
$$F_e = \frac{q_1 q_2}{4\pi r^2} . \tag{3.6}$$
How does the magnitude of the acceleration felt by particle 2 depend on its properties?
2 Yes, ramps and a water clock. The image of Galileo dropping balls from the Leaning Tower of Pisa is just a widespread
Italian legend.

The results of Galileo's experiments were confirmed, among others, by Newton himself and by the
Baron Eötvös de Vásárosnamény, who used respectively pendula and a torsion balance with different
materials³.

Exercise
How does the oscillation period of a simple pendulum depend on the ratio mI /mG ?

The difference in the acceleration experienced by the two bodies is encoded in the so-called Eötvös
parameter
$$\eta = \frac{\Delta a}{a} = \frac{2|a_1 - a_2|}{|a_1 + a_2|} = \sum_I \eta^I \left( \frac{E^I_1}{m_{I,1} c^2} - \frac{E^I_2}{m_{I,2} c^2} \right) , \tag{3.7}$$
where in the last step we have made explicit the contribution of the various energy forms $E^I$ to the
difference between inertial and gravitational masses
$$m_G - m_I \equiv \sum_I \eta^I\, \frac{E^I}{c^2} . \tag{3.8}$$
The experimental results are summarized in the following table.

Energy form                          $|m_I - m_G| / m_I$

Rest mass, protons and neutrons      < 10⁻¹¹
Rest mass, electrons                 < 2 × 10⁻⁸
Electric fields in nucleus           < 4 × 10⁻¹⁰
Magnetic fields in nucleus           < 2 × 10⁻⁷
Strong fields in nucleus             < 5 × 10⁻¹⁰
Weak fields in nucleus               < 10⁻²
Gravitational energy in Earth        < 2 × 10⁻³

3 Eötvös located two test objects on the opposite ends of a dumbbell suspended from a torsion fiber. If the inertial
and gravitational masses of those objects were different, the centripetal effects associated with the rotation of the Earth
would give rise to a torque (everywhere but at the poles) that could be measured with a delicate torsion balance pointing
west-east. For a detailed description of Eötvös’ original experiments see for instance Weinberg’s book.

Figure 3.1: Lunar Laser Ranging (LLR) experiments

Note that the displayed cases do not include the contribution of the gravitational self-interaction
of the masses, which, for laboratory-size experiments, is extremely small. Its contribution can
however be tested via the so-called Nordtvedt effect. If the gravitational self-energy did not follow Galileo’s
equivalence principle, the Earth and the Moon would fall at different rates towards the Sun, elongating
the orbit of the Moon in the direction of the Sun. As shown by Lunar Laser Ranging (LLR) experiments,
which use reflectors that were placed on the surface of the Moon during the Apollo 11 mission, cf.
Fig. 3.1, the gravitational self-energy behaves as any other energy form, in perfect agreement with
Eötvös’ results. Indeed, LLR experiments provide the most accurate tests of the equivalence between
inertial and gravitational masses. The constraints are really impressive

$$ \eta = \frac{2|a_E - a_M|}{(a_E + a_M)} = (-1 \pm 1.4) \times 10^{-13} . \tag{3.9} $$

Not only matter but also antimatter seems to follow Galileo’s result. Important constraints of the
order $10^{-9}$ were obtained by the CPLEAR collaboration from neutral kaon systems4 .

3.2 The Equivalence Principle

The experimental equality of inertial and gravitational masses is a quite surprising and mysterious
property, relating two completely different concepts and not required at all for the consistency of Newton’s
theory. For Galileo and Newton, this was just a coincidence. For Einstein, it would be the first
stone in the impressive geometrical edifice of General Relativity. Einstein’s theory will be constructed
on top of something so simple that even Galileo could have discovered it: the relation between gravity
and inertia. To illustrate this equality let me consider one of Einstein’s most famous Gedankenexperimente.
Imagine yourself dropping a ball on the surface of the Earth. You will see the ball falling
with constant acceleration. “The effect of gravity”, you would say. Now imagine yourself performing
the same experiment inside a completely isolated rocket in outer space which moves with a constant
acceleration a = g. You will observe the same: the ball falling with constant acceleration. Without
being told, you would not be able to decide whether you were in the true gravitational field of the Earth or in a
rocket! This apparently trivial observation is summarized in the so-called Weak Equivalence Principle:

4 Under the assumption of exact CPT symmetry.


Figure 3.2: Einstein’s Gedankenexperiment

Einstein’s Equivalence Principle

The trajectories of particles in the gravitational field are locally indistinguishable from the
trajectories of free particles as viewed from an accelerated reference frame.

The gravitational interaction resembles also the pseudo forces resulting from the use of non-inertial
reference frames. For example, if there is a frame of reference rotating with angular velocity ω with
respect to an inertial reference frame, all bodies appear to accelerate spontaneously with the same
acceleration in that rotating frame. It seems that there is a universal force acting on all bodies with
a magnitude proportional to their inertial masses

F = −mI [ω̇ × r + 2 ω × ṙ + ω × (ω × r)] . (3.10)

Accelerated frames and local gravitational forces appear to be intimately related. Both of them act
in the same way on all bodies, are proportional to mass and can be transformed away by changing to
a suitable reference frame; a local free falling frame in the case of gravity.

Figure 3.3: A local inertial reference frame.


Exercise
Einstein’s toy: A version of the following device was constructed as a birthday present for
Albert Einstein. The device consists of a hollow broomstick with a cup at the top, together with
a metal ball and an elastic string. When the broomstick is held vertical, the ball can rest in
the cup. The ball is attached to one end of the elastic string, which passes through a hole in
the bottom of the cup, and down the hollow centre of the broomstick to the bottom, where its
other end is secured. You hold the broomstick vertical, with your hand at the bottom, the cup
at the top, and with the ball out of the cup, suspended on its elastic string. The tension in the
string is not enough to draw the ball back into the cup. The problem is to find an elegant way
to get the ball back into the cup. (Inelegant ways are: using your hands or shaking the stick up
and down).

3.3 Life in the rocket: Rindler spacetime

Since accelerated reference frames mimic the local effects of the gravitational field, the understanding
of their properties seems to be a first step towards the correct description of gravity. Let me start by
analyzing the rocket Gedankenexperiment presented above.

Note the slight change of notation below. From now on, we will reserve the first letters of the
Greek alphabet α, β, . . . for indices associated to inertial reference frames. Intermediate letters
of the Greek alphabet µ, ν, . . . will stand for general (non-inertial, accelerated) reference frames.

Consider the movement of the accelerated rocket from the point of view of an inertial observer {ξ α },
momentarily at rest with respect to the rocket’s trajectory. The orientation of the coordinate grid is
such that the rocket moves along the ξ 3 -direction. In that instantaneous inertial frame, the rocket is
seen to undergo a constant acceleration g

$$ u^\mu = (1, 0, 0, 0) , \qquad a^\mu u_\mu = 0 \;\rightarrow\; a^\mu = (0, 0, 0, g) , \qquad a^\mu a_\mu = g^2 . \tag{3.11} $$

In order to determine the worldline $\xi^\alpha(\tau)$ of the rocket at later times, let us look for a general solution
of the covariant equation $u^\alpha u_\alpha = -1$. Writing it explicitly

$$ \eta_{\alpha\beta} u^\alpha u^\beta = -(u^0)^2 + (u^3)^2 = -1 , \tag{3.12} $$

the solution becomes pretty obvious

$$ u^0 = \cosh f(\tau) , \qquad u^3 = \sinh f(\tau) . \tag{3.13} $$

The unknown function $f(\tau)$ can be determined by taking the derivative of the last two equations
$$ a^\alpha = \frac{du^\alpha}{d\tau} = \dot f(\tau) \left( \sinh f(\tau), 0, 0, \cosh f(\tau) \right) \tag{3.14} $$
and imposing the covariant condition $a^\mu a_\mu = g^2$. We get $g^2 = \dot f^2$, $f(\tau) = g\tau$ and

$$ u^\alpha = (\cosh g\tau, 0, 0, \sinh g\tau) . \tag{3.15} $$

The work is almost done. Integrating (3.15) with the initial condition $\xi^\alpha(0) = (0, 0, 0, g^{-1})$, we get

$$ \xi^\alpha(\tau) = g^{-1} (\sinh g\tau, 0, 0, \cosh g\tau) . \tag{3.16} $$


The “constantly accelerated” rocket describes an equilateral hyperbola with semi-major axis 1/g

$$ (\xi^3)^2 - (\xi^0)^2 = g^{-2} . \tag{3.17} $$
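The chain of results for the rocket can be checked numerically. The following sketch, in units with c = 1 and with an arbitrarily chosen acceleration, evaluates the four-velocity, the four-acceleration and the worldline at a few proper times and verifies the normalization of the four-velocity, the magnitude of the four-acceleration and the hyperbola relation. All function names are illustrative assumptions.

```python
import math


def minkowski_dot(a, b):
    """Scalar product with signature (-, +, +, +)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def four_velocity(tau, g):
    """Four-velocity (cosh g tau, 0, 0, sinh g tau)."""
    return (math.cosh(g * tau), 0.0, 0.0, math.sinh(g * tau))


def four_acceleration(tau, g):
    """Proper-time derivative of the four-velocity."""
    return (g * math.sinh(g * tau), 0.0, 0.0, g * math.cosh(g * tau))


def worldline(tau, g):
    """Worldline (sinh g tau, 0, 0, cosh g tau) / g."""
    return (math.sinh(g * tau) / g, 0.0, 0.0, math.cosh(g * tau) / g)


def check_rocket(tau, g):
    """Return (u.u, a.a, (xi^3)^2 - (xi^0)^2) at proper time tau."""
    u = four_velocity(tau, g)
    a = four_acceleration(tau, g)
    xi = worldline(tau, g)
    return minkowski_dot(u, u), minkowski_dot(a, a), xi[3] ** 2 - xi[0] ** 2


G_ACC = 0.5  # illustrative acceleration, units with c = 1
for tau in (0.0, 0.7, 2.0):
    uu, aa, hyperbola = check_rocket(tau, G_ACC)
    print(f"tau={tau}: u.u={uu:.6f}  a.a={aa:.6f}  (xi3)^2-(xi0)^2={hyperbola:.6f}")
```

Up to rounding, the three printed quantities should stay at −1, g² and 1/g² for every τ.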

Let us now look at the problem from the point of view of an accelerated observer sitting in the rocket.
Since the transformation from inertial to accelerated frames is not a Lorentz transformation, we should
expect a change in the Minkowski line element ds2 . The natural coordinates for the accelerated
observer are those adapted to its trajectory. Let’s call them $(x^0, x^3) = (\eta, \rho)$. The transformation to
this frame takes the form5

$$ \xi^0(\eta, \rho) = \rho \sinh\eta , \qquad \xi^3(\eta, \rho) = \rho \cosh\eta . \tag{3.18} $$

In terms of the new coordinates $\eta$ and $\rho$, the Minkowski line element $ds^2 = -(d\xi^0)^2 + (d\xi^3)^2$ becomes
$$ ds^2 = -\rho^2 d\eta^2 + d\rho^2 \equiv g_{\mu\nu} dx^\mu dx^\nu . \tag{3.19} $$
The metric $g_{\mu\nu} = \mathrm{diag}(-\rho^2, 1)$ is now spacetime dependent!

3.4 Beyond inertial observers

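Let us formalize the concepts appearing in the previous example. The distance between two neighboring
points, as measured in a local inertial frame at rest with respect to the particle6 , is given by
the Minkowski metric $\eta_{\alpha\beta}$
$$ ds^2 = \eta_{\alpha\beta}\, d\xi^\alpha d\xi^\beta . \tag{3.20} $$
When performing a transformation to a general coordinate system7 $x^\mu = x^\mu(\xi^\alpha)$, the line element
becomes
$$ ds^2 = \eta_{\alpha\beta}\, d\xi^\alpha d\xi^\beta = \eta_{\alpha\beta} \frac{\partial \xi^\alpha}{\partial x^\mu} dx^\mu \frac{\partial \xi^\beta}{\partial x^\nu} dx^\nu = \eta_{\alpha\beta} \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{\partial \xi^\beta}{\partial x^\nu} dx^\mu dx^\nu = g_{\mu\nu} dx^\mu dx^\nu , \tag{3.21} $$
and distances are no longer determined by the Minkowski metric, but rather by a metric
$$ g_{\mu\nu} = \eta_{\alpha\beta} \frac{\partial \xi^\alpha}{\partial x^\mu} \frac{\partial \xi^\beta}{\partial x^\nu} , \tag{3.22} $$
which generically depends on the spacetime coordinates. The inverse of the new metric is defined
through the relation $g^{\mu\nu} g_{\nu\lambda} = \delta^\mu{}_\lambda$ and can be easily computed by taking into account the identity
$$ \frac{\partial x^\mu}{\partial \xi^\alpha} \frac{\partial \xi^\beta}{\partial x^\mu} = \delta^\beta{}_\alpha . \tag{3.23} $$
We get
$$ g^{\mu\nu} = \eta^{\alpha\beta} \frac{\partial x^\mu}{\partial \xi^\alpha} \frac{\partial x^\nu}{\partial \xi^\beta} . \tag{3.24} $$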
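The construction of the metric from the Jacobian of the coordinate change can be tried out numerically. The sketch below builds the induced metric for the two-dimensional Rindler transformation, approximating the Jacobian by central finite differences (a numerical shortcut assumed here, not part of the notes). The function names, the step size and the sample point are illustrative assumptions.

```python
import math

# Two-dimensional Minkowski metric in the coordinates (xi^0, xi^3)
ETA = ((-1.0, 0.0), (0.0, 1.0))


def rindler_to_inertial(x):
    """Map (eta, rho) to (xi^0, xi^3) = (rho sinh eta, rho cosh eta)."""
    eta_coord, rho = x
    return (rho * math.sinh(eta_coord), rho * math.cosh(eta_coord))


def jacobian(transform, x, h=1e-6):
    """J[alpha][mu] = d xi^alpha / d x^mu, by central differences."""
    n = len(x)
    columns = []
    for mu in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[mu] += h
        x_minus[mu] -= h
        f_plus, f_minus = transform(x_plus), transform(x_minus)
        columns.append([(f_plus[a] - f_minus[a]) / (2.0 * h) for a in range(n)])
    # columns[mu][alpha] -> J[alpha][mu]
    return [[columns[mu][a] for mu in range(n)] for a in range(n)]


def induced_metric(transform, x):
    """g_{mu nu} = eta_{alpha beta} J[alpha][mu] J[beta][nu]."""
    n = len(x)
    J = jacobian(transform, x)
    return [[sum(ETA[a][b] * J[a][mu] * J[b][nu]
                 for a in range(n) for b in range(n))
             for nu in range(n)] for mu in range(n)]


g = induced_metric(rindler_to_inertial, (0.3, 2.0))
for row in g:
    print([round(entry, 6) for entry in row])
```

At ρ = 2 the result should be close to diag(−ρ², 1) = diag(−4, 1), independently of the value chosen for η.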

Exercise
Prove the similarity transformation (3.24).

5 The change of coordinates is just the Lorentzian analogue of polar coordinates, inspired by the worldline equation
(3.16).
6 In the context of a particle in the gravitational field these frames are called local free-falling reference frames.
7 General means completely arbitrary. It can be a curvilinear coordinate system, an accelerated system, a rotating
system . . . whatever you want.


Note that the reference frame xµ is not at all privileged. We could perfectly move now into another
non-inertial reference frame x̄ρ (ξ α ) in which

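$$ ds^2 = g_{\mu\nu} dx^\mu dx^\nu = g_{\mu\nu} \frac{\partial x^\mu}{\partial \bar x^\rho} d\bar x^\rho \frac{\partial x^\nu}{\partial \bar x^\sigma} d\bar x^\sigma = g_{\mu\nu} \frac{\partial x^\mu}{\partial \bar x^\rho} \frac{\partial x^\nu}{\partial \bar x^\sigma} d\bar x^\rho d\bar x^\sigma = \bar g_{\rho\sigma} d\bar x^\rho d\bar x^\sigma , \tag{3.25} $$
with
$$ \bar g_{\rho\sigma} = g_{\mu\nu} \frac{\partial x^\mu}{\partial \bar x^\rho} \frac{\partial x^\nu}{\partial \bar x^\sigma} . \tag{3.26} $$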
The results of this section are summarized in Figure 3.4.

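Figure 3.4: The inertial frame $\xi^\alpha$ (metric $\eta_{\alpha\beta}$) and the general frames $x^\mu$ (metric $g_{\mu\nu}$) and $\bar x^\rho$ (metric $\bar g_{\rho\sigma}$), related by the maps $x^\mu(\xi^\alpha)$, $\bar x^\rho(\xi^\alpha)$ and $\bar x^\rho(x^\mu)$.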

Exercise
Show that gµν must be symmetric, i.e. gµν = gνµ .

3.5 The geodesic equation

The equation of motion for a free particle in an accelerated reference frame can be obtained by applying
the transformation (3.22) to the Lagrangian of a free relativistic particle (2.51). We get

$$ S = \frac{1}{2} \int d\sigma \left( e^{-1}(\sigma)\, g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} - m^2 e(\sigma) \right) . \tag{3.27} $$

As in the Minkowski case, the action (3.27) is invariant under reparametrizations of the path $\sigma \to \bar\sigma = f(\sigma)$
provided that we let $e(\sigma)$ transform in such a way that the quantity $e(\sigma)d\sigma$ is left invariant.
Note also that the action (3.27) is invariant under general coordinate transformations8 , as can be
easily seen by taking into account the similarity relation (3.26).
The Euler-Lagrange equation $\partial L/\partial e = 0$ for the non-dynamical variable $e(\sigma)$
$$ g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = -m^2 e(\sigma)^2 , \tag{3.28} $$
automatically incorporates the massive (e(σ) = 1/m) and massless (e(σ) = 1, m → 0) cases we are
interested in. In these two limits, the action (3.27) takes the form

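$$ S_{\rm massive} = \frac{1}{2}\, m \int d\sigma \left( g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} - 1 \right) , \qquad S_{\rm massless} = \frac{1}{2} \int d\sigma\, g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} , \tag{3.29} $$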
8 This should be expected from pure Lagrangian mechanics.

with σ = τ for the massive case. Smassive and Smassless are very similar. Indeed, the computation of
the equations of motion is formally equivalent in both cases9 . Let us denote by a dot the derivative
with respect to σ and forget in what follows about the irrelevant factors m and m/2. The equations
of motion
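$$ \frac{d}{d\sigma} \frac{\partial L}{\partial \dot x^\rho} - \frac{\partial L}{\partial x^\rho} = 0 \tag{3.30} $$
for the generalized coordinates $x^\rho$ can be computed as follows. The simplest part is the variation of
$\frac{1}{2} g_{\mu\nu} \dot x^\mu \dot x^\nu$ with respect to $x^\rho$. All the dependence on the coordinates is hidden in the metric
$$ \frac{\partial L}{\partial x^\rho} = \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\rho} \dot x^\mu \dot x^\nu . \tag{3.31} $$
The variation with respect to $\dot x^\rho$ is slightly more involved, but can nevertheless be written in a very
compact way by taking into account the properties
$$ \frac{\partial \dot x^\mu}{\partial \dot x^\rho} = \delta^\mu{}_\rho , \qquad g_{\mu\nu} \delta^\nu{}_\rho = g_{\mu\rho} , \qquad g_{\mu\nu} = g_{\nu\mu} , \tag{3.32} $$
together with some simple index relabeling. We get
$$ \frac{\partial}{\partial \dot x^\rho} \left( \frac{1}{2} g_{\mu\nu} \dot x^\mu \dot x^\nu \right) = \frac{1}{2} g_{\mu\nu} \left( \frac{\partial \dot x^\mu}{\partial \dot x^\rho} \dot x^\nu + \dot x^\mu \frac{\partial \dot x^\nu}{\partial \dot x^\rho} \right) = \frac{1}{2} \left( g_{\rho\nu} \dot x^\nu + g_{\mu\rho} \dot x^\mu \right) = g_{\rho\nu} \dot x^\nu . \tag{3.33} $$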
Collecting the two pieces, the Euler-Lagrange equations (3.30) become
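$$ \begin{aligned}
\frac{d}{d\sigma} \frac{\partial L}{\partial \dot x^\rho} - \frac{\partial L}{\partial x^\rho}
&= \frac{d}{d\sigma} \left( g_{\rho\nu} \dot x^\nu \right) - \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\rho} \dot x^\mu \dot x^\nu \\
&= \frac{\partial g_{\rho\nu}}{\partial x^\sigma} \dot x^\sigma \dot x^\nu + g_{\rho\nu} \ddot x^\nu - \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\rho} \dot x^\mu \dot x^\nu \\
&= g_{\rho\nu} \ddot x^\nu + \dot x^\nu \dot x^\sigma \left( \frac{\partial g_{\rho\nu}}{\partial x^\sigma} - \frac{1}{2} \frac{\partial g_{\sigma\nu}}{\partial x^\rho} \right) \\
&= g_{\rho\nu} \ddot x^\nu + \frac{1}{2} \dot x^\nu \dot x^\sigma \left( \frac{\partial g_{\rho\nu}}{\partial x^\sigma} + \frac{\partial g_{\rho\sigma}}{\partial x^\nu} - \frac{\partial g_{\sigma\nu}}{\partial x^\rho} \right) \\
&= 0 .
\end{aligned} \tag{3.34} $$
The work is done. Multiplying by the inverse metric and relabeling indices we obtain the equation we
were looking for, the so-called geodesic equation
$$ \frac{d^2 x^\mu}{d\sigma^2} + \Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\sigma} \frac{dx^\rho}{d\sigma} = 0 , \qquad \Gamma^\mu{}_{\nu\rho} = \frac{1}{2} g^{\mu\sigma} \left( \partial_\rho g_{\sigma\nu} + \partial_\nu g_{\sigma\rho} - \partial_\sigma g_{\nu\rho} \right) . \tag{3.35} $$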
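To see the Christoffel formula at work, here is a small sketch that evaluates it by finite differences for the flat plane in polar coordinates, ds² = dr² + r² dθ². This metric is not taken from the notes; it is chosen as a familiar curvilinear example, and the function names and step size are likewise assumptions.

```python
def polar_metric(x):
    """Metric of the Euclidean plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def metric_derivative(metric, x, sigma, h=1e-5):
    """Matrix of d g_{mu nu} / d x^sigma, by central differences."""
    x_plus, x_minus = list(x), list(x)
    x_plus[sigma] += h
    x_minus[sigma] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[m][k] - g_minus[m][k]) / (2.0 * h) for k in range(n)]
            for m in range(n)]


def christoffel(metric, x):
    """gamma[mu][nu][rho] = 1/2 g^{mu s} (d_rho g_{s nu} + d_nu g_{s rho} - d_s g_{nu rho})."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, s) for s in range(n)]  # dg[s][m][k] = d_s g_{mk}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for mu in range(n):
        for nu in range(n):
            for rho in range(n):
                gamma[mu][nu][rho] = 0.5 * sum(
                    g_inv[mu][s] * (dg[rho][s][nu] + dg[nu][s][rho] - dg[s][nu][rho])
                    for s in range(n))
    return gamma


gamma = christoffel(polar_metric, (2.0, 0.4))
print("Gamma^r_{theta theta} =", round(gamma[0][1][1], 6))
print("Gamma^theta_{r theta} =", round(gamma[1][0][1], 6))
```

The only non-vanishing symbols for this metric are Γ^r_θθ = −r and Γ^θ_rθ = Γ^θ_θr = 1/r, so at r = 2 the two printed values should be close to −2 and 0.5. Replacing `polar_metric` by another two-dimensional metric function reuses the same machinery.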

Exercise
• Consider a reparametrization σ → f (σ). Show that the geodesic equation (3.35) retains
its form only if f (σ) = aσ + b.
• Compute the geodesic equation associated to the Rindler metric (3.19).

The geodesic equation is automatically covariant since the Lagrangian from which it was derived was
invariant under general coordinate transformations. The transformation of the so-called Christoffel
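symbols $\Gamma^\mu{}_{\nu\lambda}$ is however non-homogeneous
$$ \bar\Gamma^{\mu'}{}_{\nu'\rho'} = \Gamma^\mu{}_{\nu\rho} \frac{\partial \bar x^{\mu'}}{\partial x^\mu} \frac{\partial x^\nu}{\partial \bar x^{\nu'}} \frac{\partial x^\rho}{\partial \bar x^{\rho'}} + \frac{\partial \bar x^{\mu'}}{\partial x^\mu} \frac{\partial^2 x^\mu}{\partial \bar x^{\nu'} \partial \bar x^{\rho'}} . \tag{3.36} $$
9 The only difference is the presence of a global factor m and a constant term m/2 which do not play any role in the
variation of the action.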


They are not a tensor.

Exercise
• What form does Eq. (3.37) take in a Cartesian coordinate system?

• Prove the transformation law (3.36) using the fact that the geodesic equation is covariant.
• How many independent components do the Christoffel symbols have in four dimensions?

The Christoffel symbols encode the local aspects of the gravitational interaction as well as the
fictitious forces (centrifugal, Coriolis, etc)
$$ F^\mu \equiv \frac{d^2 x^\mu}{d\sigma^2} = -\Gamma^\mu{}_{\nu\lambda} \frac{dx^\nu}{d\sigma} \frac{dx^\lambda}{d\sigma} \tag{3.37} $$
arising when using non-inertial reference frames. Forces of this kind can always be eliminated by
going to an inertial reference frame or to a free-falling frame. Note that this would not be the case if
the Christoffel symbols were tensors.

Local free-falling reference frames

The geodesic equation allows for a precise definition of local free-falling frames. According to
the Equivalence Principle, in those frames the geodesic equation must become

$$ \left. \frac{d^2 \xi^\alpha}{d\sigma^2} \right|_P = 0 . \tag{3.38} $$

A free-falling reference frame at P is therefore defined asa

$$ g_{\mu\nu}(P) = \eta_{\mu\nu} , \qquad \partial_\sigma g_{\mu\nu}(P) = 0 , \tag{3.39} $$

with the second condition being equivalent to the vanishing of the Christoffel symbols at that
point, i.e. $\Gamma^\mu{}_{\nu\lambda}(P) = 0$.
a The existence of these frames is guaranteed by the so-called Local flatness theorem.

3.5.1 Massive particles don’t go on diet

It is easy to show that the geodesic equation (3.35) is equivalent to a conservation equation
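$$ \frac{d^2 x^\mu}{d\sigma^2} + \Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\sigma} \frac{dx^\rho}{d\sigma} = 0 \;\longrightarrow\; \frac{d}{d\sigma} \left( g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} \right) = 0 \tag{3.40} $$
associated to translations in the parameter σ, f (σ) = σ + c. The physical meaning of Eq. (3.40)
is quite obvious and should be expected; it simply states that massless/massive particles remain
massless/massive along the geodesic. In other words, given an initial condition
$$ \left. g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} \right|_{\sigma=0} = \begin{cases} -1 & \text{for massive particles} \\ \phantom{-}0 & \text{for massless particles} \end{cases} , \tag{3.41} $$

it will be satisfied for all values of σ 10 . We will rediscover this equation in Chapter 5, when dealing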
with the concept of parallel transport.

10 Note that for the massive case we can identify σ = τ


Exercise
Prove the relation (3.40).

3.5.2 Conserved quantities

Note that if the metric coefficients are independent of one coordinate11 $x^\nu$, the Lagrangian (3.27) will
also be independent of that coordinate. In such a case, the covariant component $\dot x_\nu$ is a conserved
quantity along affinely parametrized geodesics
$$ p_\nu = \frac{\partial L}{\partial \dot x^\nu} = g_{\mu\nu} \dot x^\mu = \dot x_\nu = \text{constant} . \tag{3.42} $$
We will make use of this important property in Chapter 9, when dealing with the Schwarzschild
geometry.
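Both conservation statements of this section can be illustrated with a short numerical experiment. The sketch below integrates the geodesic equation of the flat plane in polar coordinates (an illustrative metric with Euclidean signature, assumed here instead of a spacetime metric) using a fourth-order Runge-Kutta step, which is a standard integrator chosen for convenience. Since θ is cyclic, p_θ = r² θ̇ should stay constant, together with g_µν ẋ^µ ẋ^ν.

```python
import math


def geodesic_rhs(state):
    """Right-hand side for state = (r, theta, r_dot, theta_dot)."""
    r, _theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot ** 2                   # -Gamma^r_{theta theta} theta_dot^2
    theta_ddot = -2.0 * r_dot * theta_dot / r     # -2 Gamma^theta_{r theta} r_dot theta_dot
    return (r_dot, theta_dot, r_ddot, theta_ddot)


def rk4_step(state, h):
    """One fourth-order Runge-Kutta step of size h in the affine parameter."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2.0))
    k3 = geodesic_rhs(shifted(state, k2, h / 2.0))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def norm_squared(state):
    """g_{mu nu} xdot^mu xdot^nu for the polar metric."""
    r, _theta, r_dot, theta_dot = state
    return r_dot ** 2 + r ** 2 * theta_dot ** 2


def p_theta(state):
    """Conserved momentum conjugate to the cyclic coordinate theta."""
    r, _theta, _r_dot, theta_dot = state
    return r ** 2 * theta_dot


state = (1.0, 0.0, 0.0, 1.0)   # start at r = 1 moving tangentially with unit speed
for _ in range(1000):
    state = rk4_step(state, 1e-3)

print("r, theta      :", state[0], state[1])
print("norm, p_theta :", norm_squared(state), p_theta(state))
```

The geodesic is the straight line x = 1, y = σ, so after unit parameter length the point should sit near r = √2, θ = π/4, while the norm and p_θ remain at their initial value 1.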

3.6 The Newtonian limit

Let us see how the usual results of Newtonian gravity fit into the geometric picture. Of course, we
cannot expect to link the relativistic formulation presented above with a non-relativistic theory of
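gravity without making some assumptions. We will consider a massive particle moving at small velocity
(with respect to the speed of light)

$$ \frac{dx^i}{dt} \ll 1 \;\longrightarrow\; \frac{dx^i}{d\tau} \ll \frac{dt}{d\tau} \tag{3.43} $$
in a “weak”
$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \qquad |h_{\mu\nu}| \ll 1 \tag{3.44} $$
and stationary12 gravitational field
$$ \partial_0 g_{\mu\nu} = \partial_0 h_{\mu\nu} = 0 . \tag{3.45} $$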
The first two conditions (small velocities and weak fields) are quite natural from the point of view
of a non-relativistic description. On the other hand, the stationarity condition (3.45) is just a good
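approximation for the particular cases of interest in this section: the gravitational fields
of the Sun and the Earth.
At first order in the small perturbation $h_{\mu\nu}$, the geodesic equation (3.35) takes the form
$$ \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{00} \left( \frac{dt}{d\tau} \right)^2 = 0 , \tag{3.46} $$

where the Christoffel symbols $\Gamma^\mu{}_{00}$ are completely determined by the perturbation $h_{\mu\nu}$ around the
Minkowski metric13

$$ \Gamma^\mu{}_{00} = \frac{1}{2} g^{\mu\rho} \left( \frac{\partial g_{0\rho}}{\partial x^0} + \frac{\partial g_{0\rho}}{\partial x^0} - \frac{\partial g_{00}}{\partial x^\rho} \right) = -\frac{1}{2} g^{\mu\rho} \frac{\partial g_{00}}{\partial x^\rho} = -\frac{1}{2} \eta^{\mu\rho} \frac{\partial h_{00}}{\partial x^\rho} . \tag{3.47} $$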
11 The coordinate xν is then said to be cyclic.
12 Or varying sufficiently slowly over the scale probed by the particle.
13 Note that, since we are interested only in first order terms, we can raise and lower indices with the Minkowski
metric $\eta_{\mu\nu}$. For example $h^\mu{}_\nu = g^{\mu\lambda} h_{\lambda\nu} \simeq \eta^{\mu\lambda} h_{\lambda\nu} + O(h^2_{\mu\nu})$.


                Mass          Size          |Φ|/c^2

Atom            10^{-26} kg   10^{-10} m    10^{-43}
Human           10^{2} kg     1 m           10^{-25}
Earth           10^{25} kg    10^{7} m      10^{-9}
Sun             10^{30} kg    10^{9} m      10^{-6}
Galaxy          10^{41} kg    10^{21} m     10^{-7}
White Dwarf     10^{30} kg    10^{7} m      10^{-4}
Neutron Star    10^{30} kg    10^{4} m      0.1
Black Hole                                  1

Table 3.1: Gravitational self-energy: Orders of magnitude

Splitting the spatial and temporal components of Eq. (3.46) and using the stationarity condition
(3.45), we obtain14
$$ \frac{d^2 t}{d\tau^2} = 0 , \qquad \frac{d^2 x^i}{d\tau^2} = \frac{1}{2} c^2 \frac{\partial h_{00}}{\partial x^i} . \tag{3.48} $$
The first of these two equations allows us to identify the proper time τ with the coordinate time t and
write
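$$ \frac{d^2 x^i}{dt^2} = \frac{1}{2} c^2 \frac{\partial h_{00}}{\partial x^i} . \tag{3.49} $$
The value of the unknown function $h_{00}$ can be determined by comparing Eq. (3.49) with the Newtonian
expression for a particle in a gravitational field

$$ \frac{d^2 x^i}{dt^2} = -\delta^{ij} \frac{\partial \Phi}{\partial x^j} . \tag{3.50} $$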
The first true component of the gravitational metric tensor15 comes directly from Newton’s theory!

$$ h_{00} = -\frac{2\Phi}{c^2} \;\longrightarrow\; g_{00} = -\left( 1 + \frac{2\Phi}{c^2} \right) . \tag{3.51} $$

Indeed . . . this is the first and the last component that we can expect to get from Newton . . . Newtonian
gravity involves just one scalar function: the gravitational potential Φ, nothing else. This
observation naturally raises the question of how to compute the remaining components of the metric.
Let’s forget about this problem for a while and enjoy our findings. As you will see, we can learn a
lot of new things without knowing the precise form of the other metric components. The correction
to the Minkowski metric is proportional to the so-called gravitational self-energy Φ/c2 . This quantity
can be understood as the ratio of the Newtonian potential energy to the relativistic energy. For an
object of mass M and typical size R we have

$$ \frac{|\Phi|}{c^2} = \frac{GM^2}{R} \cdot \frac{1}{M c^2} = \frac{GM}{R c^2} . \tag{3.52} $$
14 Note that we have restored the speed of light c for later convenience.
15 Note that we could in principle allow for an extra constant C in Eq. (3.51), i.e. $h_{00} = -\frac{2\Phi}{c^2} + C$, which should
be fixed by requiring the metric to approach the flat Minkowski metric at infinity. For isolated mass distributions the
gravitational potential Φ vanishes at infinity and therefore C = 0.

Figure 3.5: Gravitational redshift, Gedankenexperiment.

Some orders of magnitude for Φ can be found in Table 3.1. Note that, even for a white dwarf or a
galaxy, the gravitational self-energy is very small; the weak field approximation used in the derivation
of Eq. (3.51) is justified. The correction to the Minkowski metric is expected to be important only
for very compact objects such as a neutron star or a black hole.
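The orders of magnitude quoted for |Φ|/c² can be reproduced with a few lines of code. The sketch below evaluates GM/(Rc²) for three bodies; the masses and radii are rounded reference values assumed for illustration (the neutron-star entry is a generic example, not a specific object), and the function name is hypothetical.

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s


def self_energy_ratio(mass_kg: float, radius_m: float) -> float:
    """|Phi| / c^2 = G M / (R c^2) for a body of mass M and size R."""
    return G * mass_kg / (radius_m * C ** 2)


# (mass in kg, radius in m); rounded values assumed for illustration
BODIES = {
    "Earth": (5.97e24, 6.37e6),
    "Sun": (1.99e30, 6.96e8),
    "Neutron star (assumed 2.8e30 kg, 10 km)": (2.8e30, 1.0e4),
}

for name, (mass, radius) in BODIES.items():
    print(f"{name:42s} |Phi|/c^2 ~ {self_energy_ratio(mass, radius):.2e}")
```

The Earth and Sun values come out at the 10^{-9} and 10^{-6} level, while the neutron-star example is of order 0.1, where the weak-field expansion can no longer be trusted.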

3.7 The power of the equivalence principle

Let us consider the direct consequences of the previous results. In order to get some intuition let me go
back for a moment to the constantly accelerated rocket and perform the following Gedankenexperiment.
Imagine two observers in the rocket, one on the base and one on a platform close to the ceiling. The
observer at the bottom sends some light pulses with a frequency dictated by its proper time interval
ν1 = 1/∆τ1 . Due to the acceleration of the rocket, these pulses are received by the observer at the top
at a lower rate ν2 = 1/∆τ2 than the rate at which they were emitted16 . According to the Equivalence
Principle the same phenomenon should happen in the presence of gravity. Yes, gravity must affect
the flow of time!
Let us put our Gedankenexperiment into equations. Looking at Eq. (3.51), we see that the
interval in proper time dτ at a fixed point in the vicinity of a massive object differs from the interval
in coordinate time dt
$$ d\tau = \sqrt{-g_{\mu\nu} dx^\mu dx^\nu} = \sqrt{1 + \frac{2\Phi(r)}{c^2}}\; dt . \tag{3.53} $$
A local measurement of this effect is nevertheless impossible since our measuring instruments are affected
by gravity in the same way as the timing of the objects we want to measure. Observable effects on
the flow of time can only appear when we compare two different points in the gravitational potential
Φ, as we did in the rocket Gedankenexperiment.

3.7.1 Gravity and the flow of time

Consider two observers in the weak gravitational field of a spherically symmetric and stationary mass
distribution. Although the Newtonian limit developed in Section 3.6 provides only information
16 By the time the light arrives at the top, the ceiling is moving faster than when the light was emitted.

about the g00 element, the large symmetry of the problem severely constrains the form of the metric
to be
$$ ds^2 = -\left( 1 + \frac{2\Phi(r)}{c^2} \right) dt^2 + g_{rr}(r)\, dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2 , \tag{3.54} $$
with grr (r) an undetermined function of the radial distance, whose explicit form will not be needed in
what follows. In order to disentangle the effect of gravity from other velocity dependent Doppler-like
effects and to make the analysis as clear as possible, we will require the observers to be at rest in a
radial configuration with coordinates r1 and r2 . Imagine the observer at r1 sending pulses of light to
the observer at r2 . The period of emitted pulses is the interval in proper time of the emitter
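$$ \Delta\tau_1 = \int \sqrt{-g_{00}(r_1)}\, dt = \sqrt{-g_{00}(r_1)} \int dt = \sqrt{-g_{00}(r_1)}\, \Delta t_1 . \tag{3.55} $$

On the other hand, the period of received pulses is the interval in proper time of the receiver
$$ \Delta\tau_2 = \int \sqrt{-g_{00}(r_2)}\, dt = \sqrt{-g_{00}(r_2)} \int dt = \sqrt{-g_{00}(r_2)}\, \Delta t_2 . \tag{3.56} $$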

The coordinate time interval elapsed between the emission of two pulses $\Delta t_1$ is equal to the coordinate
time interval elapsed between the reception of two pulses $\Delta t_2$, as can be easily seen by noting that
the coordinate time interval needed to go from $r_1$ to $r_2$
$$ ds^2 = g_{00}\, dt^2 + g_{rr}\, dr^2 = 0 \;\longrightarrow\; \Delta t = \int_{r_1}^{r_2} \sqrt{-\frac{g_{rr}(r)}{g_{00}(r)}}\; dr \tag{3.57} $$

is independent of the coordinate time t. Taking the ratio of Eqs. (3.56) and (3.55), we get the first
prediction of the Equivalence Principle
$$ \frac{\Delta\tau_2}{\Delta\tau_1} = \sqrt{\frac{g_{00}(r_2)}{g_{00}(r_1)}} = \sqrt{\frac{1 + 2\Phi(r_2)/c^2}{1 + 2\Phi(r_1)/c^2}} . \tag{3.58} $$

For weak gravitational fields, the previous expression can be approximated by its binomial expansion17

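$$ \frac{\Delta\tau_2}{\Delta\tau_1} \simeq 1 + \frac{\Phi(r_2) - \Phi(r_1)}{c^2} , \tag{3.59} $$
which is usually quoted in terms of the ratio

$$ \frac{\Delta\tau}{\tau} \equiv \frac{\Delta\tau_2 - \Delta\tau_1}{\Delta\tau_1} = \frac{\Phi(r_2) - \Phi(r_1)}{c^2} . \tag{3.60} $$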

Time goes by slower

Clocks slow down in those places where the gravitational potential is larger (in magnitude).
In particular, clocks at a distance r from the centre of a massive spherical body of mass M
($\Phi = -\frac{GM}{r}$) slow down by a factor $\sqrt{1 - \frac{2GM}{r c^2}}$ with respect to clocks at r → ∞ (Φ = 0).

The dilation of time was tested by Hafele and Keating in 1972 using cesium-beam atomic clocks
transported on commercial flights around the Earth and compared on return to standard clocks in
the US Naval Observatory. The net effect on the reading of the on-flight clocks is a combination of
special relativistic effects and gravitational changes in the flow of time. The two contributions act in
17 $(1 + x)^{1/2} \simeq 1 + \frac{1}{2} x$ for $|x| \ll 1$.

∆τ (ns)           Eastward       Westward

Time of flight    41.2 h         48.6 h
∆τ_G              144 ± 14       179 ± 18
∆τ_SR             −184 ± 18      96 ± 10
∆τ_tot            −40 ± 23       275 ± 21
∆τ_obs            −59 ± 10       273 ± 7

Table 3.2: Hafele-Keating: Predictions and experimental results

Figure 3.6: The highs and lows: Rebka and Pound at the top and bottom of the tower.

an opposite way. Special Relativity tends to decrease the rate of the clock in the plane with respect
to the standard clock on the surface of the Earth18 . On the other hand, gravity tends to speed up
the clock in the plane with respect to the clock sitting deeper in the gravitational potential of the Earth. The
experiment was performed twice, once flying towards the east and once flying towards the west. The
results and their comparison with the predictions are summarized in Table 3.2. As you can see, the
agreement between the measurements and the theoretical prediction is notably good.
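A rough feeling for the size of the gravitational contribution can be obtained from the weak-field ratio ∆τ/τ = [Φ(r2) − Φ(r1)]/c². The sketch below assumes, purely for illustration, that the plane stays at a constant cruise altitude of 9 km for the whole flight; the real prediction integrates the recorded flight profile, so only the order of magnitude is meaningful. The constants are rounded and the function names are hypothetical.

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_EARTH = 5.97e24   # kg
R_EARTH = 6.37e6    # m


def potential(r: float) -> float:
    """Newtonian potential of the Earth at distance r from its centre."""
    return -G * M_EARTH / r


def gravitational_gain_ns(altitude_m: float, flight_time_h: float) -> float:
    """Proper time gained (ns) by a clock held at fixed altitude,
    relative to a clock on the ground, in the weak-field limit."""
    ratio = (potential(R_EARTH + altitude_m) - potential(R_EARTH)) / C ** 2
    return ratio * flight_time_h * 3600.0 * 1e9


ASSUMED_ALTITUDE_M = 9000.0  # hypothetical constant cruise altitude
east = gravitational_gain_ns(ASSUMED_ALTITUDE_M, 41.2)
west = gravitational_gain_ns(ASSUMED_ALTITUDE_M, 48.6)
print(f"eastward: {east:.0f} ns, westward: {west:.0f} ns")
```

Under this crude assumption the gain is of the order of a hundred nanoseconds or more per flight, the same order of magnitude as the ∆τ_G row of the table.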

Exercise
How much older are the theorists on the upper floor of the Cubotron with respect to the experimentalists
on the lower floor at the end of their academic life? Should this effect be taken into account
by the Swiss pension system?
18 This is just a consequence of the well-known time dilation effect in Special Relativity.

3.7.2 Gravitational shift of frequencies

An immediate consequence of the previous result is the gravitational frequency shift. Denoting by ν1
the frequency of the light emitted at r1 and by ν2 the frequency of the light received at r2 , Eq. (3.58)
can be rewritten as
$$ \frac{\nu_1}{\nu_2} \equiv 1 + z = \sqrt{\frac{g_{00}(r_2)}{g_{00}(r_1)}} = \sqrt{\frac{1 + 2\Phi(r_2)/c^2}{1 + 2\Phi(r_1)/c^2}} , \tag{3.61} $$
where we have defined the so-called redshift parameter
$$ z = \frac{\Delta\nu}{\nu} \equiv \frac{\nu_1 - \nu_2}{\nu_2} . \tag{3.62} $$
If z > 0 the received light is said to be redshifted, while if z < 0 the light is said to be blueshifted. For
weak gravitational potentials, Eq. (3.61) can be approximated by
$$ z = \frac{\Delta\nu}{\nu} \simeq \frac{\Phi(r_2) - \Phi(r_1)}{c^2} . \tag{3.63} $$
To get an estimate of the order of magnitude of this frequency shift, consider for instance the light
from the Sun (r = r1 ) received on Earth (r = r2 )19 . Since r2 > r1 , |Φ(r1 )| > |Φ(r2 )| and therefore20
z > 0. As in our Gedankenexperiment, the light redshifts (ν1 > ν2 ) as it climbs upwards in the
gravitational potential!

The gravitational frequency shift is a test of the Equivalence Principle, not of Einstein’s
theory of gravity in its full form. Note that the spatial part of the metric $g_{rr}(r)$ did not play
any role in the previous developments.

Numerically, the gravitational redshift of the light emitted by the Sun is very small
$$ \frac{\Delta\nu}{\nu} = 2.12 \times 10^{-6} , \tag{3.64} $$
ν
and indeed very difficult to detect due to the broadening of spectral lines and to Doppler shifts
associated to the convection currents in the solar atmosphere21 .
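The size of the solar redshift, and the quality of the weak-field approximation, can be checked with a short computation. The sketch below compares the exact square-root expression for 1 + z with its linearized form for light emitted at the solar surface and received at the Earth's orbit, neglecting the Earth's own field as in the text. The solar mass, solar radius and astronomical unit are rounded values assumed for illustration, and the function names are hypothetical.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.99e30   # kg
R_SUN = 6.96e8    # m
AU = 1.496e11     # m, mean Sun-Earth distance


def potential(r: float) -> float:
    """Newtonian potential of the Sun at distance r from its centre."""
    return -G * M_SUN / r


def redshift_exact(r_emit: float, r_recv: float) -> float:
    """z from the square-root formula for nu_1 / nu_2."""
    ratio = (1.0 + 2.0 * potential(r_recv) / C ** 2) / (1.0 + 2.0 * potential(r_emit) / C ** 2)
    return math.sqrt(ratio) - 1.0


def redshift_linear(r_emit: float, r_recv: float) -> float:
    """Weak-field approximation z ~ [Phi(r_recv) - Phi(r_emit)] / c^2."""
    return (potential(r_recv) - potential(r_emit)) / C ** 2


z_exact = redshift_exact(R_SUN, AU)
z_linear = redshift_linear(R_SUN, AU)
doppler_velocity = z_linear * C   # velocity giving the same first-order Doppler shift

print(f"z exact  = {z_exact:.4e}")
print(f"z linear = {z_linear:.4e}")
print(f"equivalent Doppler velocity = {doppler_velocity:.0f} m/s")
```

With these inputs both expressions give z of about 2 × 10^{-6} and differ only at second order in Φ/c². The equivalent Doppler velocity comes out at several hundred metres per second, which is why motions in the solar atmosphere easily mask the effect.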
A more precise non-astronomical test of the gravitational frequency shift was performed by Pound
and Rebka in 1960 using gamma rays produced in a 14.4 keV nuclear transition in $^{57}$Fe. These gamma
rays were emitted at the top of a 22.6-meter tower in the Jefferson Physical Laboratory at Harvard
University and directed down towards a similar sample of $^{57}$Fe located at the bottom of the tower. The
absorption of the gamma rays by the receiver is only efficient if the frequency at reception coincides
with the frequency at emission (Mössbauer effect). Due to the gravitational shift of frequencies this
was not the case. Pound and Rebka compensated the gravitational shift in a very clever way: a
Doppler shift induced by the vertical motion of the source at the top of the tower. By looking for
a resonance in the absorption they were able to obtain a direct measurement of the gravitational
redshift. The result was in excellent agreement with the Equivalence Principle’s prediction
$$ \frac{(\Delta\nu/\nu)_{\rm exp}}{(\Delta\nu/\nu)_{\rm th}} = 1.05 \pm 0.10 . \tag{3.65} $$
19 $\Phi(r_1) = -GM/r_1$ and $\Phi(r_2) = -GM/r_2$ are small (cf. Table 3.1). The binomial expansion is justified. The
gravitational field of the Earth is neglected.
20 Remember that the gravitational potential is negative.
21 The gravitational redshift (3.64) corresponds numerically to the Doppler shift associated to a velocity of 0.6 km/s,
which is easily exceeded by the hot gases at the surface of the Sun.

Figure 3.7: Different tests of the gravitational redshift. The parameter α parametrizes the deviations
from the Equivalence Principle, $\Delta\nu/\nu = (1 + \alpha)\Delta\Phi/c^2$.


Figure 3.8: Deflection of light, Gedankenexperiment.

Exercise
Determine the predicted value ∆ν/ν in the Pound-Rebka experiment. Are the gamma rays
traveling down the tower blueshifted or redshifted?

3.8 The weakness of the Equivalence Principle

A second (and incomplete) prediction of the Equivalence Principle is the deflection of light. This effect
can be easily understood with another Gedankenexperiment. Imagine an observer in the accelerated
rocket. A pulse of light is emitted by some device from one of the walls in the transverse direction to
the rocket motion. Due to the acceleration of the rocket, the pulse will hit the opposite wall at a height
below that of the emission. Since the uniform acceleration of the rocket is locally indistinguishable
from a gravitational field, we should expect the same deflection of light in a gravitational field.

The bending of light in a gravitational field was considered by Newton himself, but he did not
perform any proper computation. The first known result about the deflection of light was presented
by the German astronomer Johann Georg von Soldner in 1804. Based on Newton’s corpuscular theory
of light, Soldner predicted a deflection angle of 0.87″ for a ray of light grazing the surface of the Sun.
Einstein, unaware of Soldner’s computations and based on the Equivalence Principle, obtained the
same number about one hundred years later, in 1911. Let us reproduce his arguments and (wrong) results.

3.8.1 Einstein’s 1911 (wrong) treatment

In Einstein’s 1911 paper, the speed of light is considered as a scalar quantity which depends on the
value of the gravitational field

$$ ds^2 = -(1 + 2\Phi)\, dt^2 + dX^2 = 0 \;\rightarrow\; c = c_0 (1 + \Phi) . \tag{3.66} $$

The Minkowski value c = c0 is only recovered at long distances (r → ∞), where the gravitational
potential is negligible22 . According to Huygens’ principle the position of a wavefront at a time t + ∆t
can be determined by considering each point of the wavefront at t as a source of spherical waves. The
wavefront at t + ∆t is then given by the envelope of the multiple spherical wavefronts originated at
t. Imagine a wavefront in the vicinity of a matter distribution M . Consider two points P1 and P2
separated by a spatial distance δl at time t. The velocity of light at those points (c1 and c2 ) depends
on the value of the gravitational field. Looking at Fig. 3.9, we conclude that in a time δt the
wavefront is refracted by an angle
$$ \delta\alpha = \frac{(c_1 - c_2)\, \delta t}{\delta l} = \frac{\delta\Phi}{\delta l}\, \delta t , \tag{3.67} $$
with δΦ/δl the component of the gravitational acceleration along the wavefront. This infinitesimal
refraction angle can be integrated along the full path to obtain the total deflection angle23
$$ \alpha = \int d\alpha = \int \frac{d\Phi}{dl}\, dt . \tag{3.68} $$
Since the velocity of light along the path is nearly constant we can set dt = ds, with s measuring the
distance along the path. Evaluating the integral (3.68) for an impact parameter b, we get
$$ \alpha = \int \frac{d\Phi}{dl}\, ds = \int_{-\pi/2}^{\pi/2} \frac{GM}{r^2} \cos\theta\, ds = \frac{2GM}{b} , \tag{3.69} $$

which for the particular case of a photon grazing the surface of the Sun becomes24
$$ \alpha = \frac{2GM}{c^2 R} \approx 0.875'' . \tag{3.70} $$
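The deflection integral and the numerical value for the Sun can be cross-checked with a simple quadrature. The sketch below assumes that θ is the angle between the radius vector and the direction of closest approach, so that r = b/cos θ and s = b tan θ along the nearly straight path; this parametrization, the midpoint rule and the rounded solar constants are assumptions made for illustration.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.99e30   # kg
R_SUN = 6.96e8    # m

ARCSEC_PER_RAD = 180.0 / math.pi * 3600.0


def deflection_numeric(b: float, n: int = 20000) -> float:
    """Midpoint-rule estimate of (1/c^2) * integral of (G M / r^2) cos(theta) ds,
    with r = b / cos(theta) and ds = b / cos(theta)^2 dtheta."""
    d_theta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2.0 + (k + 0.5) * d_theta
        r = b / math.cos(theta)
        ds = b / math.cos(theta) ** 2 * d_theta
        total += G * M_SUN / r ** 2 * math.cos(theta) * ds
    return total / C ** 2


def deflection_closed_form(b: float) -> float:
    """Closed-form result 2 G M / (c^2 b), in radians."""
    return 2.0 * G * M_SUN / (C ** 2 * b)


alpha_num = deflection_numeric(R_SUN)
alpha_exact = deflection_closed_form(R_SUN)
print(f"numeric    : {alpha_num * ARCSEC_PER_RAD:.4f} arcsec")
print(f"closed form: {alpha_exact * ARCSEC_PER_RAD:.4f} arcsec")
```

Both numbers should agree closely and land near 0.875 arcseconds for a ray grazing the solar limb, the value obtained from the Equivalence Principle alone.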
22 In “Relativity: The Special and General Theory”, Einstein wrote:
[. . . ] our result shows that, according to the general theory of relativity, the law of the
constancy of the velocity of light in vacuo, which constitutes one of the two fundamental
assumptions in the special theory of relativity and to which we have already frequently referred,
cannot claim any unlimited validity. A curvature of rays of light can only take place when the
velocity of propagation of light varies with position. Now we might think that as a consequence
of this, the special theory of relativity and with it the whole theory of relativity would be laid
in the dust. But in reality this is not the case. We can only conclude that the special theory
of relativity cannot claim an unlimited domain of validity; its results hold only so long as we
are able to disregard the influences of gravitational fields on the phenomena (e.g. of light).
23 This is a good approximation for small deflection angles, as is the case for the deflection of light by the Sun.
24 Note that we have restored the powers of c.

Figure 3.9: Deflection of light, Huygens’s principle.

As Einstein stated in the original paper, since "the fixed stars in the part of the sky near the sun
are visible during a total eclipse of the sun, this consequence of the theory may be compared to
experiment". He indeed "urgently wishes astronomers to take up this question" and measure the
deflection of light during a solar eclipse. Fortunately for him . . . they didn't do it in time. Einstein's
1911 prediction, based only on the equivalence principle, was incomplete25 . No measurement of the
deflection angle was performed between 1911 and 1915, the moment at which he straightened out his
result to
$$\alpha = 2 \times \frac{2GM}{c^2 R} \approx 2 \times 0.875'' . \tag{3.71}$$
Although different expeditions to observe solar eclipses were organized, all of them were cancelled,
either for weather-related or political reasons. One of the most interesting stories is that of the German
astronomer and mathematician Erwin Finlay Freundlich, who, interested in testing Einstein's
prediction, convinced the German armament manufacturer Krupp to finance a trip to Crimea for the
solar eclipse of 21st August 1914. Unfortunately for him, and fortunately for Einstein, the German astronomer was
arrested by the Russians as a suspected spy before being able to perform any measurement.

25 We will see why later on.

CHAPTER 4
GENERAL COORDINATES

No one can understand the new

law of gravitation without a
thorough knowledge of the theory
of invariants and of the calculus
of variations

J. J. Thomson
Royal Society, 1919

In Euclidean and Minkowski spacetimes we were dealing with global Cartesian coordinate systems.
Our goal in this Chapter is to develop the mathematical tools needed to write physical equations in
a way completely independent of the particular coordinate system we actually end up using.

4.1 General coordinate transformations

Consider a completely arbitrary set of coordinates {xµ } and a set of natural basis vectors {eµ } tangent
to the curves xµ = constant in N dimensions. These basis vectors are allowed to change in magnitude
and/or direction from point to point. In general, they will satisfy

eµ (x) · eν (x) = gµν (x) . (4.1)

with gµν generically depending on the coordinates. Once we have established a coordinate basis, we
can define the components of a vector. Let us consider the infinitesimal displacement vector between
two points
dS = dxµ eµ , (4.2)
whose scalar product with itself defines the line element

|dS|2 ≡ ds2 = eµ (x) · eν (x) dxµ dxν = gµν (x)dxµ dxν . (4.3)

The above expression represents a generalization of the Pythagorean theorem for an arbitrary coor-
dinate system. As usual, the inverse of the metric g µν is defined through the relation g µν gνλ = δ µ λ .
Note that both gµν (x) and g µν (x) are symmetric.

Exercise
• Prove that gµν is a symmetric matrix. Hint: Assume gµν is not symmetric and decompose
it into a symmetric and an antisymmetric part.
• What is the number of independent components of a general symmetric matrix in N
dimensions?
Consider two different coordinate systems xµ and x̄µ related to each other by an arbitrary coordi-
nate transformation
x̄µ = f µ (x1 , x2 , . . . , xN ) , (µ = 1, 2, . . . , N ) . (4.4)
The N arbitrary real functions f µ (x1 , x2 , . . . , xN ) are assumed to be single valued, continuous and
differentiable over the whole range of their arguments. By differentiating each of these functions with
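respect to the coordinates, we obtain an N × N transformation matrix
$$\left[\frac{\partial\bar{x}^\mu}{\partial x^\nu}\right] =
\begin{pmatrix}
\dfrac{\partial f^1}{\partial x^1} & \dfrac{\partial f^1}{\partial x^2} & \dots & \dfrac{\partial f^1}{\partial x^N} \\
\dfrac{\partial f^2}{\partial x^1} & \dfrac{\partial f^2}{\partial x^2} & \dots & \dfrac{\partial f^2}{\partial x^N} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f^N}{\partial x^1} & \dfrac{\partial f^N}{\partial x^2} & \dots & \dfrac{\partial f^N}{\partial x^N}
\end{pmatrix} , \tag{4.5}$$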

whose entries are, in general, functions of the coordinates. The determinant of the transformation
matrix
$$J(x) = \det\left[\frac{\partial\bar{x}^\mu}{\partial x^\nu}\right] , \tag{4.6}$$
is called the Jacobian of the transformation. If the N arbitrary real functions are independent, the
Jacobian is different from zero and the coordinate transformation (4.4) can be inverted to express xµ
in terms of x̄µ
xµ = g µ (x̄1 , x̄2 , . . . , x̄N ) . (4.7)

Exercise
The transformation matrix and the Jacobian associated to this inverse coordinate transformation
(4.7) are given respectively by the inverse of the transformation matrix (4.5), $\left[\partial x^\mu/\partial\bar{x}^\nu\right]$, and the
inverse of the Jacobian (4.6), $\bar{J} = J^{-1}$. Prove it.

As you may expect at this point, there is a simple relationship between the coordinate basis vectors
in the two systems. This relation can be found by requiring the invariance of the line element ds2 ,
which is a purely geometrical quantity independent of the coordinate system used to describe it. We
get
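$$d\bar{x}^\mu = \frac{\partial\bar{x}^\mu}{\partial x^\nu}\,dx^\nu , \qquad \bar{e}_\mu = \frac{\partial x^\nu}{\partial\bar{x}^\mu}\,e_\nu . \tag{4.8}$$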
The previous expressions are equivalent to the similarity relation between the metrics in the two
coordinate systems that we found in the previous chapter
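$$\bar{g}_{\rho\sigma}(\bar{x}) = \frac{\partial x^\mu}{\partial\bar{x}^\rho}\frac{\partial x^\nu}{\partial\bar{x}^\sigma}\,g_{\mu\nu}(x(\bar{x})) . \tag{4.9}$$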

Exercise
Taking into account Eq. (4.9), determine the number of independent components of the metric.

A worked-out example: Polar coordinates in the plane

Let us illustrate the previous results with the simplest example one can think of: polar coordi-
nates in R2 . These coordinates are defined by
$$x^1 = r\cos\theta , \quad x^2 = r\sin\theta \quad\leftrightarrow\quad r = \sqrt{(x^1)^2 + (x^2)^2} , \quad \theta = \arctan\frac{x^2}{x^1} . \tag{4.10}$$
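The Jacobian matrices associated with this transformation
$$\left[\frac{\partial x^\mu}{\partial\bar{x}^\nu}\right] =
\begin{pmatrix} \dfrac{\partial x^1}{\partial r} & \dfrac{\partial x^1}{\partial\theta} \\ \dfrac{\partial x^2}{\partial r} & \dfrac{\partial x^2}{\partial\theta} \end{pmatrix} =
\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} , \tag{4.11}$$
$$\left[\frac{\partial\bar{x}^\mu}{\partial x^\nu}\right] =
\begin{pmatrix} \dfrac{\partial r}{\partial x^1} & \dfrac{\partial r}{\partial x^2} \\ \dfrac{\partial\theta}{\partial x^1} & \dfrac{\partial\theta}{\partial x^2} \end{pmatrix} =
\begin{pmatrix} \cos\theta & \sin\theta \\ -\frac{1}{r}\sin\theta & \frac{1}{r}\cos\theta \end{pmatrix} , \tag{4.12}$$

have non-vanishing determinant, i.e. are non-singular, except at r = 0. The polar coordinate system admits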
a pair of basis vectors er and eθ , adapted to the coordinates and related to the Cartesian basis
vectors by (cf. (4.8))

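$$e_r = \frac{\partial x^1}{\partial r}\,e_1 + \frac{\partial x^2}{\partial r}\,e_2 = \cos\theta\,e_1 + \sin\theta\,e_2 , \tag{4.13}$$
$$e_\theta = \frac{\partial x^1}{\partial\theta}\,e_1 + \frac{\partial x^2}{\partial\theta}\,e_2 = -r\sin\theta\,e_1 + r\cos\theta\,e_2 . \tag{4.14}$$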
Note that the resulting basis is not a unit basis

$$|e_r|^2 = 1 , \qquad |e_\theta|^2 = r^2 . \tag{4.15}$$

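On the other hand, the relation between the infinitesimal displacements in both coordinate
systems is given by (cf. (4.8))

$$dr = \frac{\partial r}{\partial x^1}\,dx^1 + \frac{\partial r}{\partial x^2}\,dx^2 = \cos\theta\,dx^1 + \sin\theta\,dx^2 , \tag{4.16}$$
$$d\theta = \frac{\partial\theta}{\partial x^1}\,dx^1 + \frac{\partial\theta}{\partial x^2}\,dx^2 = -\frac{1}{r}\sin\theta\,dx^1 + \frac{1}{r}\cos\theta\,dx^2 . \tag{4.17}$$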
The components of the metric tensor and its inverse in this basis can be computed either through
the definition of the metric (4.1)

grr = er · er = 1 , gθθ = eθ · eθ = r2 , grθ = gθr = er · eθ = 0 , (4.18)

or through its transformation properties (4.9)

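$$g_{rr} = \delta_{ij}\frac{\partial x^i}{\partial r}\frac{\partial x^j}{\partial r} = \left(\frac{\partial x^1}{\partial r}\right)^2 + \left(\frac{\partial x^2}{\partial r}\right)^2 = \cos^2\theta + \sin^2\theta = 1 , \tag{4.19}$$
$$g_{\theta\theta} = \delta_{ij}\frac{\partial x^i}{\partial\theta}\frac{\partial x^j}{\partial\theta} = \left(\frac{\partial x^1}{\partial\theta}\right)^2 + \left(\frac{\partial x^2}{\partial\theta}\right)^2 = (-r\sin\theta)^2 + (r\cos\theta)^2 = r^2 . \tag{4.20}$$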

In both cases, we get

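$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix} , \qquad g^{\mu\nu} = \begin{pmatrix} 1 & 0 \\ 0 & r^{-2} \end{pmatrix} \tag{4.21}$$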

for the metric and its inverse. The line element (4.3) written in polar coordinates becomes

ds2 = dr2 + r2 dθ2 , (4.22)

which is what one usually writes down when asked for the metric of this coordinate system.
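The steps of this example can also be checked numerically. The sketch below is just an illustration (the function names are mine, not part of the notes): it builds the two Jacobian matrices (4.11) and (4.12), verifies that they are inverse to each other, and obtains the polar metric from the transformation law (4.9) as the matrix product JᵀJ, assuming the Cartesian metric is δij.

```python
import math


def jacobian_cart_wrt_polar(r, theta):
    """Matrix of Eq. (4.11): rows (x1, x2), columns (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def jacobian_polar_wrt_cart(r, theta):
    """Matrix of Eq. (4.12): rows (r, theta), columns (x1, x2)."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta) / r, math.cos(theta) / r]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def polar_metric(r, theta):
    """Eq. (4.9) with a Euclidean Cartesian metric: g_bar = J^T J."""
    j = jacobian_cart_wrt_polar(r, theta)
    return matmul(transpose(j), j)


if __name__ == "__main__":
    print(polar_metric(2.0, 0.6))  # expected to be close to diag(1, r^2)
    print(matmul(jacobian_cart_wrt_polar(2.0, 0.6),
                 jacobian_polar_wrt_cart(2.0, 0.6)))  # close to the identity
```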

Exercise
Repeat the same exercise for spherical coordinates.

4.2 Tensors
The transformation laws of tensors under general coordinate transformations are just a generalization
of those found in Chapters 1 and 2, the main difference being the replacement of the constant matrices
Ri j and Λµ ν by the arbitrary transformation matrix (4.5) and the use of the metric gµν and its inverse
for lowering and raising indices
Vµ ≡ gµν V ν , V µ ≡ g µν Vν . (4.23)
The simplest transformation rules are summarized in Table 4.1. For a general tensor with m con-
travariant indices and n covariant indices we have
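$$\bar{T}^{\mu_1\dots\mu_m}{}_{\nu_1\dots\nu_n} = \left(\prod_{p=1}^{m}\frac{\partial\bar{x}^{\mu_p}}{\partial x^{\rho_p}}\;\prod_{q=1}^{n}\frac{\partial x^{\sigma_q}}{\partial\bar{x}^{\nu_q}}\right) T^{\rho_1\dots\rho_m}{}_{\sigma_1\dots\sigma_n} . \tag{4.24}$$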

Table 4.1: Transformation rules under general coordinate transformations. The matrices $\partial\bar{x}^\mu/\partial x^\nu$ are arbitrary!

4.3 Tensorial densities

In Special Relativity the volume element d4 x = dx0 dx1 dx2 dx3 provides an invariant integration mea-
sure because of the unit value of the determinant of Lorentz transformations. When considering
arbitrary coordinate transformations this is no longer true and the Jacobian of the transformation
appears in the transformation law
$$d^4\bar{x} = \det\left[\frac{\partial\bar{x}^\mu}{\partial x^\nu}\right] d^4x = J\,d^4x . \tag{4.25}$$
A similar thing happens with the determinant of the metric. Taking the determinant of the matrix
equation (4.9), we get
ḡ = J −2 g . (4.26)

The previous expression can be combined with Eq. (4.25) to obtain a generally covariant quantity
$$\sqrt{|\bar{g}|}\,d^4\bar{x} = \sqrt{|g|}\,d^4x , \tag{4.27}$$
which we can use as an appropriate volume element in arbitrary dimensions. The absolute value in
Eq. (4.27) is introduced to take into account the case of a metric with Lorentzian signature (− + ++)
and negative determinant.

A worked-out example: Polar coordinates in the plane

The volume element in polar coordinates is given by $dV = \sqrt{|g|}\,dr\,d\theta = r\,dr\,d\theta$.

The volume density d4 x and the determinant of the metric g are just particular cases of a general
class of quantities called tensor densities. A tensor density transforms as a tensor except for the
appearance of the Jacobian to a given power w called the weight of the tensor density, namely
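$$\bar{D}^{\mu_1\dots\mu_m}{}_{\nu_1\dots\nu_n} = J^{-w}\left(\prod_{p=1}^{m}\frac{\partial\bar{x}^{\mu_p}}{\partial x^{\rho_p}}\;\prod_{q=1}^{n}\frac{\partial x^{\sigma_q}}{\partial\bar{x}^{\nu_q}}\right) D^{\rho_1\dots\rho_m}{}_{\sigma_1\dots\sigma_n} . \tag{4.28}$$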

Ordinary tensors can be therefore considered as tensor densities of weight zero. The determinant of
the covariant rank-2 metric tensor is a scalar density of weight 2, while d4 x is a scalar density of
weight −1. Eq. (4.27) can be easily generalized to obtain a rule for transforming tensorial densities
into tensors
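$$|\bar{g}|^{-\frac{w}{2}}\,\bar{D}^{\mu_1\dots\mu_m}{}_{\nu_1\dots\nu_n} = \left(\prod_{p=1}^{m}\frac{\partial\bar{x}^{\mu_p}}{\partial x^{\rho_p}}\;\prod_{q=1}^{n}\frac{\partial x^{\sigma_q}}{\partial\bar{x}^{\nu_q}}\right) |g|^{-\frac{w}{2}}\,D^{\rho_1\dots\rho_m}{}_{\sigma_1\dots\sigma_n} . \tag{4.29}$$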

Exercise
• Show that the totally antisymmetric quantity

$$\epsilon^{\mu\nu\rho\sigma} = \begin{cases} +1 , & \text{if } \mu\nu\rho\sigma \text{ is an even permutation of } 0123 , \\ -1 , & \text{if } \mu\nu\rho\sigma \text{ is an odd permutation of } 0123 , \\ \phantom{+}0 , & \text{otherwise} , \end{cases} \tag{4.30}$$

is a tensor density under general coordinate transformations. Determine its weight. Construct
a tensor from it using the metric.

• Show that the components of $\epsilon^{\mu\nu\rho\sigma}$ remain unchanged under general coordinate
transformations.

4.4 Covariant derivative

We saw that in Special relativity the derivative of a vector is a tensor under Lorentz transformations.
However, when general coordinate transformations are taken into account, the usual derivative of the
components of a tensor is not a tensor
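$$\bar{\partial}_\nu \bar{V}^\mu = \frac{\partial x^\rho}{\partial\bar{x}^\nu}\,\partial_\rho\!\left(\frac{\partial\bar{x}^\mu}{\partial x^\sigma}\,V^\sigma\right) = \frac{\partial x^\rho}{\partial\bar{x}^\nu}\frac{\partial\bar{x}^\mu}{\partial x^\sigma}\,\partial_\rho V^\sigma + \frac{\partial x^\rho}{\partial\bar{x}^\nu}\frac{\partial^2\bar{x}^\mu}{\partial x^\rho\,\partial x^\sigma}\,V^\sigma \tag{4.31}$$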

The second term spoils the tensorial property of the derivative for first-rank tensors. In order to
determine why this happens, let's go back to geometrical, truly coordinate-independent objects. Consider
for instance the expansion of a vector
V = V µ eµ (x) , (4.32)
in terms of arbitrary basis vectors eµ (x), which, contrary to the Euclidean or Minkowski cases, gener-
ically depend on the coordinates. The derivative of such a vector contains two different contributions,
one due to the intrinsic change of the vector field from place to place and one describing the variation
of the basis vectors from place to place
$$\frac{\partial V}{\partial x^\nu} = \frac{\partial V^\mu}{\partial x^\nu}\,e_\mu + V^\mu\,\frac{\partial e_\mu}{\partial x^\nu} . \tag{4.33}$$
The first term is present even in Cartesian coordinates and is a linear combination of the basis vectors,
i.e. a vector. On the other hand, the second term involves the derivative of the basis vectors eµ . The
relation between these vectors and those in a (local) inertial reference frame 1 eα is given by

$$e_\mu = \frac{\partial\xi^\alpha}{\partial x^\mu}\,e_\alpha . \tag{4.34}$$
Taking the derivative of the previous expression and using the fact that the vectors eα are constant
$$\frac{\partial e_\alpha}{\partial x^\nu} = 0 , \tag{4.35}$$
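we get the following relation
$$\frac{\partial e_\mu}{\partial x^\nu} = \frac{\partial}{\partial x^\nu}\!\left(\frac{\partial\xi^\alpha}{\partial x^\mu}\right) e_\alpha . \tag{4.36}$$
The right-hand side of the previous equation is a linear combination of the inertial basis vectors eα
and therefore is a vector. This allows us to rewrite $\partial e_\mu/\partial x^\nu$ in terms of the arbitrary basis vectors eµ

$$\frac{\partial e_\mu}{\partial x^\nu} = \Gamma^\rho{}_{\mu\nu}\,e_\rho . \tag{4.37}$$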
For the time being, the so-called affine connection Γρ µν is just a 3-index notation denoting the linear
combination of arbitrary basis vectors eµ . The index µ specifies the basis vector that is differentiated,
ν the coordinate with respect to which it is differentiated and ρ the component of the resulting vector2 .
Inserting the definition (4.37) into Eq.(4.33) we obtain

$$\frac{\partial V}{\partial x^\nu} = \frac{\partial V^\mu}{\partial x^\nu}\,e_\mu + V^\mu\,\Gamma^\rho{}_{\mu\nu}\,e_\rho , \tag{4.38}$$
where the basis vector in the right hand side can be factored out by simply relabeling the dummy
indices µ and ρ
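$$\frac{\partial V}{\partial x^\nu} = \left(\frac{\partial V^\mu}{\partial x^\nu} + V^\rho\,\Gamma^\mu{}_{\rho\nu}\right) e_\mu \equiv \left(\nabla_\nu V^\mu\right) e_\mu . \tag{4.39}$$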
The quantities in parenthesis are the components of a tensor, called the covariant derivative, which
takes into account the variation of basis vectors from point to point. We will denote it in two alternative
ways, either with the symbol ∇

∇ν V µ = ∂ν V µ + Γµ ρν V ρ . (4.40)
1 Note the use of the first letters of the Greek alphabet for denoting quantities in (local) inertial frames.
2 Yes, I am deliberately using the same symbol I used for the Christoffel symbols. You will understand why in a
while. Be patient.

or with a semicolon

V µ ;ν = V µ ,ν + Γµ ρν V ρ . (4.41)

Note that the standard derivative ∂µ has been denoted by a comma. The notation (4.41) is especially
convenient for its brevity and for remembering the definition of the covariant derivative (the ν index
appears in the last position of each term).

A worked-out example: Polar coordinates in the plane

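Taking the derivatives of the basis vectors (4.13) and (4.14) and taking into account that the
Cartesian vectors are constant, we have
$$\frac{\partial e_r}{\partial r} = 0 \quad\rightarrow\quad \Gamma^\mu{}_{rr} = 0 , \tag{4.42}$$
$$\frac{\partial e_\theta}{\partial r} = \frac{1}{r}\,e_\theta \quad\rightarrow\quad \Gamma^r{}_{\theta r} = 0 , \quad \Gamma^\theta{}_{\theta r} = \frac{1}{r} , \tag{4.43}$$
$$\frac{\partial e_r}{\partial\theta} = \frac{1}{r}\,e_\theta \quad\rightarrow\quad \Gamma^r{}_{r\theta} = 0 , \quad \Gamma^\theta{}_{r\theta} = \frac{1}{r} , \tag{4.44}$$
$$\frac{\partial e_\theta}{\partial\theta} = -r\,e_r \quad\rightarrow\quad \Gamma^r{}_{\theta\theta} = -r , \quad \Gamma^\theta{}_{\theta\theta} = 0 , \tag{4.45}$$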
and the associated covariant derivatives become

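$$V^r{}_{;r} = V^r{}_{,r} , \qquad V^\theta{}_{;\theta} = V^\theta{}_{,\theta} + \frac{1}{r}\,V^r , \tag{4.46}$$
$$V^\theta{}_{;r} = V^\theta{}_{,r} + \frac{1}{r}\,V^\theta , \qquad V^r{}_{;\theta} = V^r{}_{,\theta} - r\,V^\theta . \tag{4.47}$$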
Note that the final expressions do not involve any Cartesian tensors and allow you to directly
derive the formulae for the divergence of a vector field
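$$\nabla\cdot V = V^r{}_{;r} + V^\theta{}_{;\theta} = V^r{}_{,r} + V^\theta{}_{,\theta} + \frac{1}{r}\,V^r = \frac{1}{r}\frac{\partial}{\partial r}\left(r V^r\right) + \frac{\partial}{\partial\theta}V^\theta \tag{4.48}$$
and the Laplacian of a scalar field^a

$$\nabla\cdot\nabla\phi \equiv \nabla^2\phi = \frac{1}{r}\frac{\partial}{\partial r}\!\left(r\,\frac{\partial\phi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\phi}{\partial\theta^2} , \tag{4.49}$$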

in polar coordinates. You should recognize the result. . . The formulae appearing in your favorite
electromagnetism books are just a consequence of the existence of non-vanishing connection
coefficients in curvilinear coordinates!
a Note that ∇µ φ = ∂µ φ.
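As a sanity check of (4.48), one can compare it numerically with the Cartesian divergence. The sketch below makes its assumptions explicit: the test field is an arbitrary choice of mine, its polar components are obtained from the transformation (4.16)-(4.17) applied to vector components, and all derivatives are approximated by central differences.

```python
import math


def cartesian_field(x, y):
    """Hypothetical smooth test field (V^x, V^y); any other choice would do."""
    return (x * x * y, math.sin(x) + y ** 3)


def polar_components(r, theta):
    """Coordinate-basis components (V^r, V^theta), using Eqs. (4.16)-(4.17)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    vx, vy = cartesian_field(x, y)
    v_r = math.cos(theta) * vx + math.sin(theta) * vy
    v_theta = (-math.sin(theta) * vx + math.cos(theta) * vy) / r
    return v_r, v_theta


def divergence_polar(r, theta, h=1e-5):
    """Eq. (4.48): (1/r) d(r V^r)/dr + d(V^theta)/dtheta, by central differences."""
    d_rvr = ((r + h) * polar_components(r + h, theta)[0]
             - (r - h) * polar_components(r - h, theta)[0]) / (2 * h)
    d_vtheta = (polar_components(r, theta + h)[1]
                - polar_components(r, theta - h)[1]) / (2 * h)
    return d_rvr / r + d_vtheta


def divergence_cartesian(x, y, h=1e-5):
    """Ordinary divergence dV^x/dx + dV^y/dy, by central differences."""
    dvx = (cartesian_field(x + h, y)[0] - cartesian_field(x - h, y)[0]) / (2 * h)
    dvy = (cartesian_field(x, y + h)[1] - cartesian_field(x, y - h)[1]) / (2 * h)
    return dvx + dvy


if __name__ == "__main__":
    r, theta = 1.3, 0.7
    print(divergence_polar(r, theta))
    print(divergence_cartesian(r * math.cos(theta), r * math.sin(theta)))
```

The two printed numbers should agree up to the finite-difference error, since the divergence is a scalar.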

For a covariant tensor the covariant derivative takes a slightly different form. Consider a scalar

φ = Vµ U µ . (4.50)

Since a scalar does not depend on the basis vectors, its covariant derivative coincides with the standard
derivative
$$\nabla_\nu\phi = \partial_\nu\phi = \frac{\partial V_\mu}{\partial x^\nu}\,U^\mu + V_\mu\,\frac{\partial U^\mu}{\partial x^\nu} \tag{4.51}$$

Using Eq. (4.40) for replacing ∂ν U µ in favor of ∇ν U µ and relabeling dummy indices in the term
containing the connection we get

$$\nabla_\nu\phi = \left(\frac{\partial V_\mu}{\partial x^\nu} - \Gamma^\rho{}_{\mu\nu}\,V_\rho\right) U^\mu + V_\mu\,\nabla_\nu U^\mu . \tag{4.52}$$

All the terms in the previous expression, except the one in parentheses, are tensor components. Since
the multiplication and addition of tensor components always gives rise to tensors, the quantity in
parentheses must be a tensor. The covariant derivative of Vµ is then given by

∇ν Vµ = ∂ν Vµ − Γρ µν Vρ , (4.53)

or
Vµ;ν = Vµ,ν − Γρ µν Vρ . (4.54)
in the semicolon notation. Note the similarities and differences between (4.41) and (4.54). In both
cases, the index with respect to which the covariant derivative is taken (ν in this case) is the last
subscript of the connection Γ. The remaining indices can only be arranged in one way without raising
and lowering them. For a contravariant vector (superscript) the sign of the connection is positive; for a
covariant vector (subscript) the connection carries a minus sign. These rules can be
generalized to extra covariant and contravariant indices by introducing a factor Γ for each index with
the proper sign and index matching to obtain
$$T^{\mu\dots}{}_{\nu\dots;\rho} = T^{\mu\dots}{}_{\nu\dots,\rho} + \Gamma^\mu{}_{\lambda\rho}\,T^{\lambda\dots}{}_{\nu\dots} + \dots - \Gamma^\lambda{}_{\nu\rho}\,T^{\mu\dots}{}_{\lambda\dots} - \dots . \tag{4.55}$$

A practical rule for covariant derivatives

• Write down the partial derivative of the tensor.
• The expression is to be corrected by a set of terms, one for each index of the tensor. Each
term will be a product of a Γ and the original tensor.
• A term will have positive sign if the index we are correcting for is a superscript and a
negative sign if the index is a subscript.
• The index with respect to which the covariant derivative is taken will be the last subscript
of all the connections Γ.

• The index of the tensor that is being corrected will be replaced by a dummy index, that
will be contracted with one of the indices of Γ.
• The remaining indices can be placed in a unique way.

Exercise
Write explicitly T µ ν;ρ and T µν ;ρ .

4.4.1 Relation between the connection and the metric tensor

Since the definition of the affine connection in (4.37) involves only basis vectors and their derivatives,
it is clear that Γρ µν must be somehow related to the metric and its derivatives. Note however that
there is certain ambiguity in the way we introduced the connection coefficients Γν µρ . We could have

perfectly well written the same expression with a different ordering of indices, i.e. Γν ρµ instead of Γν µρ . In
the most general case, these two quantities are not necessarily equal to each other

T ν µρ ≡ Γν µρ − Γν ρµ ≠ 0 , (4.56)

and the spacetime is said to have torsion3 . In what follows, we will require our spacetime to be
torsionless and will take
Γν µρ = Γν ρµ (4.57)
Under this assumption, the relation between the metric and the connection can be determined as
follows. Starting with (4.1), and differentiating it with respect to xρ we obtain

∂ρ gµν = ∂ρ eµ · eν + eµ · ∂ρ eν
= Γσ µρ eσ · eν + eµ · Γσ νρ eσ
= Γσ µρ gσν + Γσ νρ gµσ , (4.58)

where in the last step we have made use of the defining equation (4.37) and the metric definition
(4.1). By cyclically permuting the indices we obtain the following two equivalent expressions

∂ν gρµ = Γσ ρν gσµ + Γσ µν gρσ , (4.59)

∂µ gνρ = Γσ νµ gρσ + Γσ ρµ gνσ , (4.60)

which, combined with Eq.(4.58) and using the assumed property Γσ µν = Γσ νµ , allows us to form the
following combination
∂ρ gµν + ∂ν gρµ − ∂µ gνρ = 2Γσ ρν gµσ . (4.61)
Multiplying by the inverse metric g κµ , using g κµ gµσ = δ κ σ and relabeling indices we obtain the result
we were looking for4
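$$\Gamma^\mu{}_{\nu\rho} = \frac{1}{2}\,g^{\mu\sigma}\left(\partial_\nu g_{\sigma\rho} + \partial_\rho g_{\nu\sigma} - \partial_\sigma g_{\nu\rho}\right) . \tag{4.62}$$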

A connection satisfying the previous property is called a metric connection, a Christoffel connection,
a Levi-Civita connection or a Riemannian connection.

A worked-out example I: Polar coordinates in the plane

The only nonzero derivative of the covariant metric components is gθθ,r = 2r. Therefore, coming
back to Eq. (4.62), the non-zero components are
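$$\Gamma^r{}_{\theta\theta} = \frac{1}{2}\,g^{rr}\left(\partial_\theta g_{r\theta} + \partial_\theta g_{\theta r} - \partial_r g_{\theta\theta}\right) = -r , \tag{4.63}$$
$$\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = \frac{1}{2}\,g^{\theta\theta}\left(\partial_\theta g_{\theta r} + \partial_r g_{\theta\theta} - \partial_\theta g_{r\theta}\right) = \frac{1}{r} , \tag{4.64}$$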
which coincide with the results (4.42)-(4.45) obtained by using the definition (4.37).
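Formula (4.62) is mechanical enough to be delegated to a computer. The following sketch (the helper names are my own, and the metric derivatives are approximated by central differences rather than computed symbolically) evaluates all the connection coefficients of a two-dimensional metric at a point and can be used to reproduce (4.63) and (4.64).

```python
def polar_metric(x):
    """Metric (4.21) at the point x = (r, theta)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def metric_derivative(metric, x, k, h=1e-6):
    """Central-difference estimate of d g_{ab} / d x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)]
            for a in range(2)]


def christoffel(metric, x):
    """Gamma[mu][nu][rho] from Eq. (4.62), for a two-dimensional metric."""
    g_inv = inverse_2x2(metric(x))
    # dg[k][a][b] approximates the partial derivative d_k g_{ab}
    dg = [metric_derivative(metric, x, k) for k in range(2)]
    return [[[0.5 * sum(g_inv[mu][s] * (dg[nu][s][rho] + dg[rho][nu][s]
                                        - dg[s][nu][rho])
                        for s in range(2))
              for rho in range(2)]
             for nu in range(2)]
            for mu in range(2)]


if __name__ == "__main__":
    gamma = christoffel(polar_metric, [2.0, 0.3])
    print(gamma[0][1][1])  # Gamma^r_{theta theta}, expected close to -r
    print(gamma[1][0][1])  # Gamma^theta_{r theta}, expected close to 1/r
```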

Remember that the metric connection (4.62) is not a tensor

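$$\bar{\Gamma}^\mu{}_{\nu\rho} = \frac{\partial\bar{x}^\mu}{\partial x^\lambda}\frac{\partial x^\sigma}{\partial\bar{x}^\nu}\frac{\partial x^\kappa}{\partial\bar{x}^\rho}\,\Gamma^\lambda{}_{\sigma\kappa} + \frac{\partial\bar{x}^\mu}{\partial x^\lambda}\frac{\partial^2 x^\lambda}{\partial\bar{x}^\nu\,\partial\bar{x}^\rho} . \tag{4.65}$$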
3 As we will see below the connection Γ is not a tensor. The torsion T ν µρ , involving the difference of two connections,
is however a tensor.
4 Note that in the presence of torsion it would differ from the affine connection defined by (4.37).

Exercise
• Prove explicitly the relation (4.68) by using the relation (4.62).

• Show that the difference of two connections is a tensor.

4.4.2 Properties of the covariant derivative

i) Linearity: The covariant derivative of a linear combination of tensors with constant coefficients
is the same as the linear combination of the tensors once the covariant differentiation has been
carried out
∇ρ (aU µ ν + bV µ ν ) = a∇ρ U µ ν + b∇ρ V µ ν . (4.66)

ii) Leibniz (product) rule: The covariant derivative of outer and inner products of tensors obeys the
same rules as the usual derivative

∇ρ (U µ Vν ) = (∇ρ U µ ) Vν + U µ (∇ρ Vν ) . (4.67)

iii) Metric compatibility: The covariant derivative of the metric tensor is zero

∇ρ gµν = 0 . (4.68)

In other words, the metric tensor is not constant, ∂ρ gµν ≠ 0, but it is covariantly constant. The
result follows immediately from comparing the general expression for the covariant derivative of
a rank-2 covariant tensor with (4.58)5 .
iv) The raising and lowering of tensor indices is not affected by covariant differentiation. For example

∇ν V µ = ∇ν (g µσ Vσ ) = g µσ ∇ν Vσ , (4.69)

where we have made use of properties ii) and iii). Note that this would not be the case if our
connection was not metric-compatible. We should be very careful about index placement in that
case.
v) The covariant derivative of the Kronecker delta δ µ ν is zero

∇ρ δ µ ν = ∂ρ δ µ ν + Γµ σρ δ σ ν − Γσ νρ δ µ σ = Γµ νρ − Γµ νρ = 0. (4.70)

vi) The covariant derivative commutes with the contraction of indices. For example

∇ν T µρ ρ = ∂ν T µρ ρ + Γµ λν T λρ ρ + Γρ λν T µλ ρ − Γκ ρν T µρ κ = ∂ν T µρ ρ + Γµ λν T λρ ρ . (4.71)

Exercise
• Consider a tensor T µ ν = U µ Vν . Use Leibniz's rule (4.67) together with the expressions
for the covariant derivatives of a covariant and a contravariant vector to compute T µ ν;ρ .
Is the result consistent with the practical rules below Eq. (4.55)?
• Verify Eq.(4.68) for the particular case of polar coordinates in the plane.
5 Indeed, we have implicitly assumed that the affine connection was metric-compatible in our derivation of Eq. (4.58).

4.4.3 Some useful formulas

Let me present (without proving them) some useful formulae involving the connection and the covari-
ant derivatives.

1. Contraction of the Christoffel symbols

$$\Gamma^\mu{}_{\mu\rho} = \frac{1}{2}\,\partial_\rho\log|g| = \frac{1}{\sqrt{|g|}}\,\partial_\rho\sqrt{|g|} . \tag{4.72}$$

2. Divergence of a contravariant vector

$$\nabla_\mu V^\mu = \frac{1}{\sqrt{|g|}}\,\partial_\mu\!\left(\sqrt{|g|}\,V^\mu\right) = \partial_\mu V^\mu + \Gamma^\mu{}_{\mu\rho}\,V^\rho . \tag{4.73}$$

3. Covariant form of the Gauss theorem

$$\int d^4x\,\sqrt{|g|}\,\nabla_\mu V^\mu = 0 . \tag{4.74}$$

4. Covariant Laplacian6
$$\nabla^2\phi \equiv \nabla_\mu\nabla^\mu\phi = \frac{1}{\sqrt{|g|}}\,\partial_\mu\!\left(\sqrt{|g|}\,\partial^\mu\phi\right) . \tag{4.75}$$

5. Covariant divergence of a rank-2 tensor

$$\nabla_\mu T^{\mu\nu} = \frac{1}{\sqrt{|g|}}\,\partial_\mu\!\left(\sqrt{|g|}\,T^{\mu\nu}\right) + \Gamma^\nu{}_{\mu\rho}\,T^{\mu\rho} . \tag{4.76}$$

6. Covariant divergence of a rank-2 symmetric tensor

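$$\nabla_\mu S^\mu{}_\nu = \frac{1}{\sqrt{|g|}}\,\partial_\mu\!\left(\sqrt{|g|}\,S^\mu{}_\nu\right) - \frac{1}{2}\left(\partial_\nu g_{\mu\rho}\right) S^{\mu\rho} . \tag{4.77}$$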

7. Covariant divergence of a rank-2 antisymmetric tensor

$$\nabla_\mu A^{\mu\nu} = \frac{1}{\sqrt{|g|}}\,\partial_\mu\!\left(\sqrt{|g|}\,A^{\mu\nu}\right) . \tag{4.78}$$

Eqs. (4.73), (4.77) and (4.78) are particularly useful, since they allow us to compute the covariant
derivative of an object without having to compute the Christoffel symbols.

Exercise
• Derive all the expressions in this section.
• Use Eq. (4.75) to rederive the expression for the Laplacian in polar coordinates.
6 The symbol used for the Laplacian operator depends on the dimension of the spacetime considered. The
three-sided symbol ∇2 in (4.75) is the most common notation in arbitrary dimension. The 4-dimensional case is sometimes
singled out. It has a special name, the d'Alembertian, and its own four-sided symbol, □.

4.5 An application: Maxwell equations in arbitrary coordinates
As shown at the end of Chapter 2, the electromagnetic field equations in Cartesian coordinates take
the form
∂β F αβ = J α , ∂γ Fαβ + ∂α Fβγ + ∂β Fγα = 0 , (4.79)
with Fαβ = ∂α Aβ − ∂β Aα . Let us see which is the form taken by those equations in an arbitrary
coordinate system xµ . In order to do that, let us write the field strength and the current in terms of
xµ (ξ α )
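$$F^{\mu\nu} \equiv \frac{\partial x^\mu}{\partial\xi^\alpha}\frac{\partial x^\nu}{\partial\xi^\beta}\,F^{\alpha\beta} , \qquad J^\mu \equiv \frac{\partial x^\mu}{\partial\xi^\alpha}\,J^\alpha \tag{4.80}$$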
and replace the partial derivatives ∂µ by the covariant derivatives ∇µ to take into account the fact
that the basis vectors are in general not constant. We obtain

∇ν F µν = J µ , ∇λ Fµν + ∇µ Fνλ + ∇ν Fλµ = 0 , (4.81)

with Fµν = ∇µ Aν − ∇ν Aµ = ∂µ Aν − ∂ν Aµ . The resulting equations are fully covariant, i.e. if they
are valid in one coordinate system they will be valid in all coordinate systems. Taking into
account the antisymmetry of F µν together with the property (4.78), the first equation in (4.81) can be
written in a very convenient form7
$$\partial_\nu\!\left(\sqrt{|g|}\,F^{\mu\nu}\right) = \sqrt{|g|}\,J^\mu , \tag{4.82}$$

which, taking into account (4.73), allows us to easily compute the expression for the continuity equation
(2.73) in arbitrary coordinate systems
$$\nabla_\mu J^\mu = \frac{1}{\sqrt{|g|}}\,\partial_\mu\!\left(\sqrt{|g|}\,J^\mu\right) = 0 . \tag{4.83}$$

Exercise
Fill in the steps in the derivation of Eqs. (4.82) and (4.83).

4.6 Parallel transport and geodesics

In Cartesian coordinates, a vector field will be considered as constant if its components with respect
to the coordinates are the same everywhere. In that case, the vector field can be thought of as parallel
transported to itself at every point in space. In other words, it moves parallel to itself when it moves
along a curve. Within the framework of Cartesian coordinates this is a consistent definition since the
Cartesian basis vectors are parallel to the axes everywhere. The previous property gives rise to an
alternative definition for a geodesic: It is the curve for which the tangent vector always points in the
same direction. Indeed, the equation of motion for a free particle
$$\frac{du^\mu}{d\tau} = 0 \tag{4.84}$$
can be interpreted as a straight line in spacetime (i.e. the shortest distance between two points, a
geodesic) or as the fact that the vector uµ remains constant along the line parametrized by τ .
7 The second equation also becomes simpler when taking into account the antisymmetry of the field strength tensor,

Fµν,λ + Fνλ,µ + Fλµ,ν = 0.


On the other hand, if we consider a non-Cartesian set of coordinates, such as polar coordinates in
the plane, the notion of parallel transport is more difficult to define since as we saw in the previous
sections the basis vectors change from point to point. The covariant derivative can be nevertheless
used to provide a natural definition for parallel transport in an arbitrary spacetime. To see this,
consider the derivative of a vector V = V µ eµ along a curve parametrized by an affine parameter σ
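$$\frac{dV}{d\sigma} = \frac{dV^\mu}{d\sigma}\,e_\mu + V^\mu\,\frac{de_\mu}{d\sigma}
= \frac{dV^\mu}{d\sigma}\,e_\mu + V^\mu\,\frac{\partial e_\mu}{\partial x^\rho}\frac{dx^\rho}{d\sigma}
= \frac{dV^\mu}{d\sigma}\,e_\mu + \Gamma^\nu{}_{\mu\rho}\,V^\mu\,\frac{dx^\rho}{d\sigma}\,e_\nu . \tag{4.85}$$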
Relabelling indices and factoring out the basis vector, we get
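$$\frac{dV}{d\sigma} = \left(\frac{dV^\mu}{d\sigma} + \Gamma^\mu{}_{\nu\rho}\,V^\nu\,\frac{dx^\rho}{d\sigma}\right) e_\mu \equiv \frac{DV^\mu}{d\sigma}\,e_\mu , \tag{4.86}$$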
where we have defined the components of the intrinsic derivative $DV^\mu/d\sigma$ of a contravariant vector as
$$\frac{DV^\mu}{d\sigma} \equiv \frac{dV^\mu}{d\sigma} + \Gamma^\mu{}_{\nu\rho}\,V^\nu\,\frac{dx^\rho}{d\sigma} . \tag{4.87}$$
A similar expression can be found for the components of the intrinsic derivative of a covariant vector
$$\frac{DV_\mu}{d\sigma} \equiv \frac{dV_\mu}{d\sigma} - \Gamma^\nu{}_{\mu\rho}\,V_\nu\,\frac{dx^\rho}{d\sigma} . \tag{4.88}$$
The condition of parallel transport, dV/dσ = 0, implies
$$\frac{DV^\mu}{d\sigma} \equiv \frac{dV^\mu}{d\sigma} + \Gamma^\mu{}_{\nu\rho}\,V^\nu\,\frac{dx^\rho}{d\sigma} = 0 . \tag{4.89}$$

Geodesics and parallel transport

A direct application of (4.89) to the vector $u^\nu = dx^\nu/d\sigma$ tangent to a given trajectory $x^\mu(\sigma)$ gives
rise to
$$\frac{d^2x^\mu}{d\sigma^2} + \Gamma^\mu{}_{\nu\rho}\,\frac{dx^\rho}{d\sigma}\frac{dx^\nu}{d\sigma} = 0 . \tag{4.90}$$
Voilà! A curve is a geodesic if it parallel transports its own tangent vector!
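To see the geodesic equation (4.90) at work, one can integrate it numerically in polar coordinates, with the connection coefficients (4.42)-(4.45), and check that the resulting curve is a straight line when mapped back to Cartesian coordinates. This is only a sketch: the fourth-order Runge-Kutta integrator, the step number and the initial data are my own choices.

```python
import math


def geodesic_rhs(state):
    """Eq. (4.90) in polar coordinates, written as a first-order system.

    state = (r, theta, u^r, u^theta) with u^mu = dx^mu/dsigma.
    """
    r, theta, u_r, u_theta = state
    a_r = r * u_theta ** 2               # -Gamma^r_{theta theta} (u^theta)^2
    a_theta = -2.0 * u_r * u_theta / r   # -2 Gamma^theta_{r theta} u^r u^theta
    return (u_r, u_theta, a_r, a_theta)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, h / 2))
    k3 = geodesic_rhs(shifted(k2, h / 2))
    k4 = geodesic_rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate_geodesic(state, sigma_end, steps=2000):
    """Integrate from sigma = 0 to sigma_end with a fixed step."""
    h = sigma_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def to_cartesian(state):
    r, theta = state[0], state[1]
    return r * math.cos(theta), r * math.sin(theta)


if __name__ == "__main__":
    # Start at (x, y) = (1, 0) moving in the +y direction: u^r = 0, u^theta = 1.
    final = integrate_geodesic((1.0, 0.0, 0.0, 1.0), 2.0)
    print(to_cartesian(final))  # should lie on the straight line x = 1, near y = 2
```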

The concept of intrinsic derivative and parallel transport can be generalized to objects with more
indices. The parallel transport of a tensor T along the path xµ (σ) is defined by the requirement
$$\frac{DT^{\mu\cdots}{}_{\nu\cdots}}{d\sigma} \equiv \frac{dx^\rho}{d\sigma}\,\nabla_\rho T^{\mu\cdots}{}_{\nu\cdots} = 0 . \tag{4.91}$$
Applying this to the metric and taking into account (4.68) we conclude that
$$\frac{D}{d\sigma}\,g_{\mu\nu} = \frac{dx^\rho}{d\sigma}\,\nabla_\rho g_{\mu\nu} = 0 , \tag{4.92}$$
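which gives rise to an important property of parallel transport: it conserves the scalar product of two
parallel transported vectors
$$\frac{D}{d\sigma}\left(V_\mu U^\mu\right) = \frac{D}{d\sigma}\left(g_{\mu\nu}V^\mu U^\nu\right) = \frac{Dg_{\mu\nu}}{d\sigma}\,V^\mu U^\nu + g_{\mu\nu}\,\frac{DV^\mu}{d\sigma}\,U^\nu + g_{\mu\nu}\,V^\mu\,\frac{DU^\nu}{d\sigma} = 0 , \tag{4.93}$$
and therefore their norm, orthogonality, etc . . . .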
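The conservation of the norm can be illustrated with a small numerical experiment. The sketch below (my own construction, not taken from the notes) parallel transports a vector along the circle r = R, θ = σ in the plane, using (4.89) with the polar connection coefficients (4.43)-(4.45). The norm computed with the metric (4.21) should stay constant and, the plane being flat, the vector should come back to itself after a full turn.

```python
import math


def transport_rhs(v, radius):
    """Eq. (4.89) along the circle r = radius, theta = sigma.

    On this curve dr/dsigma = 0 and dtheta/dsigma = 1.
    """
    v_r, v_theta = v
    dv_r = radius * v_theta      # -Gamma^r_{theta theta} V^theta, Gamma^r_{theta theta} = -r
    dv_theta = -v_r / radius     # -Gamma^theta_{r theta} V^r, Gamma^theta_{r theta} = 1/r
    return (dv_r, dv_theta)


def parallel_transport(v, radius, sigma_end, steps=4000):
    """Fourth-order Runge-Kutta integration of the transport equation."""
    h = sigma_end / steps
    for _ in range(steps):
        k1 = transport_rhs(v, radius)
        k2 = transport_rhs((v[0] + h / 2 * k1[0], v[1] + h / 2 * k1[1]), radius)
        k3 = transport_rhs((v[0] + h / 2 * k2[0], v[1] + h / 2 * k2[1]), radius)
        k4 = transport_rhs((v[0] + h * k3[0], v[1] + h * k3[1]), radius)
        v = (v[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return v


def norm_squared(v, radius):
    """g_{mu nu} V^mu V^nu with the polar metric (4.21) at r = radius."""
    return v[0] ** 2 + radius ** 2 * v[1] ** 2


if __name__ == "__main__":
    v0 = (1.0, 0.3)
    v1 = parallel_transport(v0, 2.0, 2.0 * math.pi)
    print(v1)                                           # close to v0
    print(norm_squared(v0, 2.0), norm_squared(v1, 2.0))  # nearly equal
```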

4.7 Summary
The general procedure for converting an equation which is valid in Cartesian inertial coordinates to
an equation valid in arbitrary coordinate systems is:

• Write the equations in a Lorentz invariant form.

• Replace the Minkowski metric ηµν by gµν .

• Replace partial derivatives by covariant derivatives.
• Replace ordinary derivatives along curves by intrinsic derivatives.

This prescription is known as minimal coupling prescription, since it does not introduce any terms
apart from those already present.

Exercise
The equation of motion of a charged particle in an electromagnetic field in Cartesian coordinates
takes the form
$$m\,\frac{du^\alpha}{d\tau} = q\,F^\alpha{}_\beta\,u^\beta . \tag{4.94}$$
Which is the form taken by the previous expression in an arbitrary coordinate system? Do you
recognize the left-hand side?
CHAPTER 5
TIDAL FORCES AND CURVATURE

What are the differential laws

which determine the Riemann
metric (i.e. gµν ) itself?. . . The
solution obviously needed
invariant differential systems of
the second order taken from gµν .
We soon saw that these had been
already established by Riemann.

A. Einstein

In both Newtonian mechanics in the absence of gravity and Einstein’s theory of Relativity, inertial
frames are characterized by the absence of accelerations, which are absolute elements of the theory.
If particles move in straight lines at constant speed the system is inertial. On the other hand, if
the trajectory in spacetime is not a straight line the system must be accelerating. The situation is
slightly different when gravity is taken into account. The equality between inertial and gravitational
masses does not allow one to locally distinguish the acceleration of a given reference frame from purely
gravitational effects. Gravity can be locally switched off by properly choosing a local inertial frame
associated to an observer in free-fall in the gravitational field. The word locally is fundamental, since
the global behaviours of accelerations and gravity are completely different: while the true gravitational
field vanishes at large distances, the apparent gravitational field in an accelerating frame takes a
nonzero constant value at infinity. Real and apparent gravity can be distinguished by tracking the
relative acceleration of nearby local inertial observers that appears due to the non-homogeneity of the
gravitational field!

5.1 Gravity is a central force: Tides

Non-uniform gravitational fields are observable. Consider for instance two non-interacting particles
falling towards the surface of the Earth (cf. Fig. 5.1). Since the Earth is spherical in shape, both
particles move towards the center of the Earth in such a way that the separation between them decreases
as they fall. The central character of the gravitational field gives rise to tidal forces. Let’s put this
into equations.

Figure 5.1: The effect of tidal forces.

In an inertial frame the equations of motion for the particles are given by the usual Newtonian
expressions, namely
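$$\frac{d^2x^i}{dt^2} = -\delta^{ik}\,\frac{\partial\Phi(x^j)}{\partial x^k} , \tag{5.1}$$
$$\frac{d^2(x^i + \xi^i)}{dt^2} = -\delta^{ik}\,\frac{\partial\Phi(x^j + \xi^j)}{\partial x^k} , \tag{5.2}$$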
with ξ i the separation vector between the two particles. For sufficiently small separations Eq. (5.2)
can be Taylor expanded to linear order in ξ i to obtain

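$$\frac{d^2(x^i + \xi^i)}{dt^2} = -\delta^{ik}\left(\frac{\partial\Phi(x)}{\partial x^k} + \frac{\partial}{\partial x^j}\frac{\partial\Phi(x)}{\partial x^k}\,\xi^j + \dots\right) . \tag{5.3}$$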

The Newtonian deviation equation for the separation vector ξ i becomes therefore

$$\frac{d^2\xi^i}{dt^2} = -\delta^{ik}\,\frac{\partial^2\Phi}{\partial x^k\,\partial x^j}\,\xi^j . \tag{5.4}$$
The non-relativistic tidal tensor
$$E^i{}_j \equiv \delta^{ik}\,\frac{\partial^2\Phi}{\partial x^k\,\partial x^j} , \tag{5.5}$$
determines the tidal forces, which tend to bring the particles together. This is the fundamental object
for the description of gravity and not their individual accelerations $g_i = -\partial_i\Phi$!

Exercise
Assume the tidal tensor E i j to be reduced to diagonal form, as in the example below. Show
that the components of that tensor cannot all have the same sign.

As a particular example that will be useful in the future, consider two particles in the gravitational
field of a spherically symmetric distribution of mass $M$, i.e. $\Phi = -GM/r$. The tidal tensor (5.5) in
this case becomes
$$ E_{ij} = \left( \delta_{ij} - 3 n_i n_j \right) \frac{GM}{r^3} , \tag{5.6} $$
where $n_i \equiv x_i/r$ are the components of the unit vector in the radial direction. Writing explicitly the
different components in spherical coordinates we obtain
$$ \frac{d^2 \xi^r}{dt^2} = +\frac{2GM}{r^3}\,\xi^r , \qquad \frac{d^2 \xi^\theta}{dt^2} = -\frac{GM}{r^3}\,\xi^\theta , \qquad \frac{d^2 \xi^\phi}{dt^2} = -\frac{GM}{r^3}\,\xi^\phi . \tag{5.7} $$

Figure 5.2: Bunch of geodesics classified by the value of λ.

Note the different signs: the object is stretched in the radial direction and compressed in the transverse
directions. Tidal forces squeeze a sphere into an ellipsoid (cf. Fig. 5.1).
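The sign pattern in (5.7) can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes units with $GM = 1$, evaluates the second derivatives of $\Phi = -GM/r$ by central finite differences, and compares them with the closed form (5.6). The function names and the sample point are illustrative choices.

```python
import math

def potential(x, y, z, GM=1.0):
    """Newtonian potential Phi = -GM/r of a point mass at the origin."""
    return -GM / math.sqrt(x * x + y * y + z * z)

def tidal_tensor_numeric(point, GM=1.0, h=1e-4):
    """E_ij = d^2 Phi / dx^i dx^j estimated by central finite differences."""
    E = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            def shifted(si, sj):
                p = list(point)
                p[i] += si * h
                p[j] += sj * h
                return potential(*p, GM=GM)
            E[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return E

def tidal_tensor_exact(point, GM=1.0):
    """Closed form (5.6): E_ij = (delta_ij - 3 n_i n_j) GM / r^3."""
    r = math.sqrt(sum(c * c for c in point))
    n = [c / r for c in point]
    return [[((1.0 if i == j else 0.0) - 3 * n[i] * n[j]) * GM / r ** 3
             for j in range(3)] for i in range(3)]

# Sample point on the z axis, so that z is the radial direction.
point = (0.0, 0.0, 2.0)
E_num = tidal_tensor_numeric(point)
E_exact = tidal_tensor_exact(point)
trace = sum(E_num[i][i] for i in range(3))

for row in E_num:
    print(["%+.6f" % entry for entry in row])
print("trace:", "%.2e" % trace)
```

Since $d^2\xi^i/dt^2 = -E^i{}_j \xi^j$, the negative radial entry corresponds to stretching and the two positive transverse entries to compression. The trace vanishes outside the mass distribution, as Poisson's equation requires for $\rho = 0$.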

Exercise
Assuming the water in the oceans to be in static equilibrium and taking into account the results
of the previous example, estimate the height of the tides generated by the Moon.

Using the tidal tensor (5.5) we can write the equations governing the structure of Newtonian gravity
in the following suggestive way

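$$ E^i{}_i = 4\pi G\rho \qquad \text{Poisson's equation} \tag{5.8} $$
$$ \frac{d^2 \xi^i}{dt^2} = -E^i{}_j\,\xi^j \qquad \text{Geodesic deviation} \tag{5.9} $$
$$ \left. \begin{aligned} E_{ij} &= E_{ji} \\ E^i{}_{[j,l]} &= 0 \end{aligned} \right\} \qquad \text{Bianchi identities} \tag{5.10} $$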

where the symbol [j, l] stands for antisymmetrization in the corresponding indices, i.e.
$$ E^i{}_{[j,l]} \equiv \frac{1}{2} \left( E^i{}_{j,l} - E^i{}_{l,j} \right) . \tag{5.11} $$

5.2 Geodesic deviation

Let us now study this issue taking into account the things that we learned in the previous chapter.
Consider a bunch of geodesics $x^\mu(\sigma, \lambda)$ classified by the value of some parameter $\lambda$ (cf. Fig. 5.2).
What is the requirement for having tidal forces? To answer this question, let me define two kinds of
vectors (cf. Fig. 5.2): the tangent vector to the trajectory, $\partial x^\mu(\sigma, \lambda)/\partial\sigma$, which we will denote for short by
$u^\mu(\sigma, \lambda)$, and the derivative in the $\lambda$ direction, $\partial x^\mu(\sigma, \lambda)/\partial\lambda$, which we will denote for short by $v^\mu$.
Taking Newtonian gravity as a guide, we expect the motion of the particles to be described by a
second order differential equation involving the change of the separation vector v µ along the path

$$ \frac{D^2 v^\mu}{d\sigma^2} = u^\sigma \nabla_\sigma \left( u^\rho \nabla_\rho v^\mu \right) . \tag{5.12} $$
The right-hand side of this equation should contain the information about the true gravitational field.
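Using the relation$^1$
$$ v^\rho \nabla_\rho u^\mu = u^\rho \nabla_\rho v^\mu , \tag{5.13} $$
between the covariant derivatives of $u^\mu$ and $v^\mu$, we get two pieces
$$ \frac{D^2 v^\mu}{d\sigma^2} = u^\sigma \nabla_\sigma \left( u^\rho \nabla_\rho v^\mu \right) = u^\sigma \nabla_\sigma \left( v^\rho \nabla_\rho u^\mu \right) = u^\sigma \left( \nabla_\sigma v^\rho \right) \left( \nabla_\rho u^\mu \right) + u^\sigma v^\rho \nabla_\sigma \nabla_\rho u^\mu . \tag{5.14} $$
$^1$ It follows directly from the definition of the covariant derivative and the relation $\partial u^\mu/\partial\lambda = \partial v^\mu/\partial\sigma$.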
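Using Eq. (5.13) again in the first piece and changing the order of the covariant derivatives in the
second piece, we obtain
$$ \begin{aligned}
\frac{D^2 v^\mu}{d\sigma^2} &= \underbrace{u^\sigma \left( \nabla_\sigma v^\rho \right)}_{v^\sigma \left( \nabla_\sigma u^\rho \right)} \left( \nabla_\rho u^\mu \right) + u^\sigma v^\rho \underbrace{\nabla_\sigma \nabla_\rho}_{\nabla_\rho \nabla_\sigma + [\nabla_\sigma , \nabla_\rho]} u^\mu \\
&= \underbrace{v^\sigma \left( \nabla_\sigma u^\rho \right) \left( \nabla_\rho u^\mu \right)}_{\sigma \leftrightarrow \rho} + u^\sigma v^\rho \nabla_\rho \nabla_\sigma u^\mu + u^\sigma v^\rho [\nabla_\sigma , \nabla_\rho] u^\mu \\
&= v^\rho \left( \nabla_\rho u^\sigma \right) \left( \nabla_\sigma u^\mu \right) + u^\sigma v^\rho \nabla_\rho \nabla_\sigma u^\mu + u^\sigma v^\rho [\nabla_\sigma , \nabla_\rho] u^\mu \\
&= v^\rho \nabla_\rho \left( u^\sigma \nabla_\sigma u^\mu \right) + u^\sigma v^\rho [\nabla_\sigma , \nabla_\rho] u^\mu ,
\end{aligned} \tag{5.15} $$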

where in the last steps we have simply performed some index relabelings and collected terms. The
first term in the last line of (5.15) vanishes since, as we showed in Section 4.6, the tangent vector to the
trajectory is parallel transported along the geodesic, $u^\sigma \nabla_\sigma u^\mu = 0$. We are therefore left with a very
compact expression
$$ \frac{D^2 v^\mu}{d\sigma^2} = u^\sigma v^\rho [\nabla_\sigma , \nabla_\rho] u^\mu , \tag{5.16} $$
which, however, hides a large amount of work inside the commutator of the two covariant derivatives.

What we should expect

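Before proceeding to the explicit computation of this commutator, let me anticipate what is
going to happen. Note that the commutator of two covariant derivatives acting on a scalar $\phi$
$$ [\nabla_\sigma , \nabla_\rho] \phi = \nabla_\sigma \partial_\rho \phi - \nabla_\rho \partial_\sigma \phi = \left( \Gamma^\kappa{}_{\sigma\rho} - \Gamma^\kappa{}_{\rho\sigma} \right) \partial_\kappa \phi , \tag{5.17} $$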

vanishes for a symmetric connection Γκρσ = Γκσρ , like the metric connection we are working
with (cf. Eq. (4.62)). Taking this into account, let me compute the quantity

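$$ [\nabla_\sigma , \nabla_\rho] (\phi u^\mu) = \left( [\nabla_\sigma , \nabla_\rho] \phi \right) u^\mu + \phi\, [\nabla_\sigma , \nabla_\rho] u^\mu = \phi\, [\nabla_\sigma , \nabla_\rho] u^\mu . \tag{5.18} $$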

The final result has important consequences. In particular, it tells us that $[\nabla_\sigma , \nabla_\rho] u^\mu$ cannot
depend on the derivatives of $u^\mu$, because in that case $[\nabla_\sigma , \nabla_\rho] (\phi u^\mu)$ would also have to depend on the
derivatives of the scalar field $\phi$, in contradiction with (5.18). As the dependence on the vector $u^\mu$ is linear, we are left with
an expression of the form
$$ [\nabla_\sigma , \nabla_\rho] u^\mu = R^\mu{}_{\nu\sigma\rho}\, u^\nu , \tag{5.19} $$
with $R^\mu{}_{\nu\sigma\rho}$ some unknown coefficients. Although the particular combination of connections
inside these coefficients cannot be determined without performing the full computation, it is
nice to have an idea of the final result before computing it, right?

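Let us start the explicit computation of the commutator $[\nabla_\sigma , \nabla_\rho] u^\mu$ from the definition of the covariant
derivative
$$ \nabla_\rho u^\mu = \partial_\rho u^\mu + \Gamma^\mu{}_{\kappa\rho}\, u^\kappa . \tag{5.20} $$
Taking a second covariant derivative with respect to $x^\sigma$ we obtain
$$ \begin{aligned}
\nabla_\sigma \nabla_\rho u^\mu &= \partial_\sigma \left( \nabla_\rho u^\mu \right) + \Gamma^\mu{}_{\lambda\sigma} \nabla_\rho u^\lambda - \Gamma^\kappa{}_{\rho\sigma} \nabla_\kappa u^\mu \\
&= \partial_\sigma \partial_\rho u^\mu + \partial_\sigma \left( \Gamma^\mu{}_{\kappa\rho} u^\kappa \right) + \Gamma^\mu{}_{\lambda\sigma} \left( \partial_\rho u^\lambda + \Gamma^\lambda{}_{\kappa\rho} u^\kappa \right) - \Gamma^\kappa{}_{\rho\sigma} \left( \partial_\kappa u^\mu + \Gamma^\mu{}_{\lambda\kappa} u^\lambda \right) ,
\end{aligned} \tag{5.21} $$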

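where we have treated $\nabla_\rho u^\mu$ as a second rank tensor. Computing the difference $\nabla_\sigma \nabla_\rho u^\mu - \nabla_\rho \nabla_\sigma u^\mu$
we find that the terms involving first derivatives of $u^\mu$ cancel, as expected. The covariant derivatives
of a vector fail to commute by an amount that depends only on the value of the vector field at the point in question
$$ [\nabla_\sigma , \nabla_\rho] u^\mu = -\left( \partial_\rho \Gamma^\mu{}_{\nu\sigma} - \partial_\sigma \Gamma^\mu{}_{\nu\rho} + \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\sigma} - \Gamma^\mu{}_{\kappa\sigma} \Gamma^\kappa{}_{\nu\rho} \right) u^\nu \equiv -R^\mu{}_{\nu\rho\sigma}\, u^\nu = R^\mu{}_{\nu\sigma\rho}\, u^\nu . \tag{5.22} $$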

Ambiguities
Note that the non-commutation of covariant derivatives gives rise to some ambiguities in the
minimal coupling prescription (comma-goes-to-semicolon) introduced in the previous Chapter.
To illustrate this, consider for instance a physical law which in an inertial frame takes the form
$$ U^\mu \partial_\mu \partial_\nu V^\nu = U^\mu \partial_\nu \partial_\mu V^\nu = 0 , \tag{5.23} $$
with $U^\mu$ and $V^\nu$ some vector fields. What should be the covariant generalization of this law?
Should we write something like
$$ U^\mu \nabla_\mu \nabla_\nu V^\nu = 0 , \tag{5.24} $$
or rather something like
$$ U^\mu \nabla_\nu \nabla_\mu V^\nu = 0 \; ? \tag{5.25} $$
According to (5.22), these two equations are not equivalent; their left-hand sides differ by a term proportional
to $R^\mu{}_{\nu\rho\sigma}$, which is not necessarily zero. The comma-goes-to-semicolon prescription is ambiguous.
This is reminiscent of the problem of ordering operators in quantum mechanics: the minimal
prescription does not say anything about how to order the operators. The correct way of
adapting the laws of physics to spaces with non-vanishing $R^\mu{}_{\nu\rho\sigma}$ can only be determined by
experiments.

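The $n^4$ quantities
$$ R^\mu{}_{\nu\rho\sigma} \equiv \partial_\rho \Gamma^\mu{}_{\nu\sigma} - \partial_\sigma \Gamma^\mu{}_{\nu\rho} + \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\sigma} - \Gamma^\mu{}_{\kappa\sigma} \Gamma^\kappa{}_{\nu\rho} \tag{5.26} $$
are the components of a tensor, as can be easily seen by applying the quotient theorem$^2$ to Eq. (5.22).
This tensor is called the curvature or Riemann tensor and it is defined in terms of the metric and its
first and second derivatives.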
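To see Eq. (5.26) at work, here is a small numerical sketch (an illustration added under stated assumptions, not the method used in these notes). It takes the unit 2-sphere metric $ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2$, builds the Christoffel symbols from finite differences of the metric, and then assembles $R^\mu{}_{\nu\rho\sigma}$ exactly as in (5.26). All function names and step sizes are hypothetical choices.

```python
import math

def metric(x):
    """Unit 2-sphere in coordinates x = (theta, phi): g = diag(1, sin^2 theta)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def diff(a, b, step):
    """Elementwise (a - b) / step for arbitrarily nested lists."""
    if isinstance(a, list):
        return [diff(ai, bi, step) for ai, bi in zip(a, b)]
    return (a - b) / step

def partial(f, x, k, h=1e-4):
    """Central difference of a nested-list-valued function f along x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return diff(f(xp), f(xm), 2 * h)

def christoffel(x):
    """G[m][n][r] = Gamma^m_{n r} of the metric (Levi-Civita) connection."""
    dim = len(x)
    ginv = inverse_2x2(metric(x))
    dg = [partial(metric, x, k) for k in range(dim)]  # dg[k][a][b] = d_k g_ab
    return [[[0.5 * sum(ginv[m][l] * (dg[r][l][n] + dg[n][l][r] - dg[l][n][r])
                        for l in range(dim))
              for r in range(dim)] for n in range(dim)] for m in range(dim)]

def riemann(x):
    """R[m][n][r][s] = R^m_{n r s}, assembled term by term as in Eq. (5.26)."""
    dim = len(x)
    G = christoffel(x)
    dG = [partial(christoffel, x, k) for k in range(dim)]  # dG[k][m][n][r]
    return [[[[dG[r][m][n][s] - dG[s][m][n][r]
               + sum(G[m][k][r] * G[k][n][s] - G[m][k][s] * G[k][n][r]
                     for k in range(dim))
               for s in range(dim)] for r in range(dim)]
             for n in range(dim)] for m in range(dim)]

theta = 1.0
R = riemann([theta, 0.3])
print("R^theta_{phi theta phi} =", R[0][1][0][1])
print("sin^2(theta)            =", math.sin(theta) ** 2)
```

For this metric the only independent component is $R^\theta{}_{\phi\theta\phi} = \sin^2\theta$, which the finite-difference result reproduces up to discretization error. Swapping in a different `metric` function (and a matching inverse) gives a quick sanity check for hand computations.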

Exercise:
• What is the value of $R^\mu{}_{\nu\rho\sigma}$ for a 2-dimensional Euclidean metric written in Cartesian
coordinates? And if the metric is written in polar coordinates?

• Derive the action of the commutator of two covariant derivatives on a covariant vector.
Hint: This should be a fast exercise. Remember the metric compatibility.
• Use the previous result to determine the action of the commutator of covariant derivatives
on an arbitrary rank-(r, s) tensor.

Substituting (5.22) into Eq. (5.16) we obtain the so-called geodesic deviation equation

$$ \frac{D^2 v^\mu}{d\sigma^2} = -R^\mu{}_{\nu\rho\sigma}\, u^\nu u^\sigma v^\rho . \tag{5.27} $$
The term on the right-hand side is the sought-after effect of gravity that cannot be removed by going
to a freely falling frame: the tidal acceleration. In the non-relativistic limit, the intrinsic derivative on
the left-hand side becomes $d^2/dt^2$ and $u^\mu \approx \delta^\mu{}_0$, in such a way that
$$ \frac{d^2 v^\mu}{dt^2} = -R^\mu{}_{0\rho 0}\, v^\rho . \tag{5.28} $$
$^2$ cf. property 3 in Section 1.4.4

Figure 5.3: First appearance of the Riemann tensor in Einstein's Zurich notebooks. The Riemann
tensor is written in the old-fashioned notation (ik, lm). According to some urban legends, Einstein
learned the methods of Ricci and Levi-Civita through his school friend Marcel Grossmann. It was
Grossmann who went to the library searching for methods to deal with arbitrary coordinate
systems and discovered Ricci and Levi-Civita's 1901 paper. The annotation "Grossmann tensor
fourth rank" that you can find on the right-hand side of the formula suggests indeed that Grossmann
conveyed the Riemann tensor formula to Einstein.

Taking into account Eqs. (5.4) and (5.5), we can identify $R^\mu{}_{0\rho 0}$ with the non-relativistic tidal tensor$^3$
$$ E^i{}_j = R^i{}_{0j0} . \tag{5.30} $$

Exercise:
Compute the Christoffel symbols and the curvature tensor to the lowest order for the line element

$$ ds^2 = -(1 + 2\phi)\, dt^2 + \delta_{ij}\, dx^i dx^j . \tag{5.31} $$

Interpret the result.

$^3$ Note that the deviation between two neighboring geodesics parametrized by the values $\lambda$ and $\lambda + \delta\lambda$ is given by
$$ \xi^\mu = \frac{\partial x^\mu}{\partial \lambda}\, \delta\lambda = v^\mu\, \delta\lambda . \tag{5.29} $$

5.3 Flat versus curved: A dirty and quick introduction to curvature.

Figure 5.4: Principal curvatures of a surface.

The geodesic equation is a clear manifestation of the geometrical character of Einstein's theory of
gravity: it is a theory of curved spacetimes. To understand this, let me start with a basic and dirty
introduction to the theory of surfaces and the concept of curvature. When I say curvature I mean
what you understand by curvature in your everyday experience; objects such as eggshells, donuts,
tennis balls, etc. are curved. A two-dimensional surface can be thought of as embedded in the usual
3-dimensional Euclidean space$^4$. At any given point $P$ on the 2-dimensional surface, we can introduce
a tangent plane with Cartesian coordinates $(X_1, X_2)$ (cf. Fig. 5.4). This Euclidean space is called the
tangent space to the surface at $P$. The deviation $z(X_1, X_2)$ of the curved surface from the tangent
plane describes the local properties of our geometry. Since curvature effects arise only through the
second derivatives of $z(X_1, X_2)$, it is convenient to use a quadratic function
$$ z(X_1, X_2) = \frac{1}{2}\, X^T M X , \tag{5.32} $$
with
$$ M = \begin{pmatrix} a & c \\ c & b \end{pmatrix} , \qquad X^T \equiv (X_1 , X_2) , \tag{5.33} $$
and $a$, $b$ and $c$ quantities with dimensions of inverse length$^5$. Eq. (5.32) can be recast in a diagonal form
by rotating the coordinates, $\bar X = R X$, and accordingly transforming the matrix $M$, $\bar M = R M R^{-1}$.
In the new coordinate basis $(\xi, \eta)$, we obtain
$$ z(\xi, \eta) = \frac{1}{2} \left( \kappa_1 \xi^2 + \kappa_2 \eta^2 \right) \equiv \frac{1}{2} \left( \frac{\xi^2}{\rho_1} + \frac{\eta^2}{\rho_2} \right) , \tag{5.34} $$
where we have defined the so-called principal curvatures $\kappa_1$ and $\kappa_2$ and the principal radii of curvature
$\rho_1$ and $\rho_2$.
The result is quite intuitive. It simply states that any surface is locally the sum of two parabolas
in the $\xi$ and $\eta$ directions, with radii of curvature $\rho_1$ and $\rho_2$ respectively (cf. Fig. 5.4).
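As a quick illustration of the diagonalization step, the principal curvatures are simply the eigenvalues of the symmetric matrix $M$. The sketch below uses the closed-form eigenvalues of a symmetric $2 \times 2$ matrix; the function name and the numerical values of $a$, $b$, $c$ are hypothetical and chosen only for illustration.

```python
import math

def principal_curvatures(a, b, c):
    """Eigenvalues of M = [[a, c], [c, b]] for a symmetric 2x2 matrix."""
    mean = 0.5 * (a + b)
    radius = math.hypot(0.5 * (a - b), c)
    return mean + radius, mean - radius

# Hypothetical entries of M, in units of inverse length.
a, b, c = 1.0, -0.5, 0.25
kappa1, kappa2 = principal_curvatures(a, b, c)

product = kappa1 * kappa2   # rotation invariant: equals det M = a*b - c**2
total = kappa1 + kappa2     # rotation invariant: equals tr M = a + b

print("principal curvatures:", kappa1, kappa2)
print("product (det M):", product)
print("sum (tr M):", total)
```

The product and the sum of the two eigenvalues do not depend on how the tangent-plane coordinates are rotated, which is why these two combinations are the natural candidates for coordinate-independent measures of curvature.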

$^4$ We do this just for visualization purposes; that is why I said that my introduction is somewhat dirty. There is
no need to choose a particular embedding for studying the geometry of the surface; the geometry can be completely
determined by measuring angles and distances on the surface. This is indeed a theorem, known as Gauss' Egregium
Theorem. In the words of Gauss himself, it reads

Formula itaque art[iculi] praec[edentis] sponte perducit ad egregium Theorema. Si superficies
curva in quamcunque aliam superficiem explicatur, mensura curvaturae in singulis punctis
invariata manet,

which, for those of you who do not know Latin, means

Thus the formula of the preceding article leads itself to the remarkable Theorem. If a curved
surface is developed upon any other surface whatever, the measure of curvature in each point
remains unchanged.

$^5$ A local region is defined for values of $X_1$ and $X_2$ much smaller than $a^{-1}$, $b^{-1}$, $c^{-1}$.

Figure 5.5: A clever ant determining the curvature of a sphere via the Bertrand-Diguet-Puiseux
formula.

Exercise
Expand a circle of radius ρ around some point. Comment on the result.

The square of the distance between two nearby points with coordinates$^6$ $(\xi, \eta)$ and $(\xi + d\xi, \eta + d\eta)$ is
given by
$$ ds^2 = d\xi^2 + d\eta^2 + dz^2 = \left( \kappa_1 \xi\, d\xi + \kappa_2 \eta\, d\eta \right)^2 + d\xi^2 + d\eta^2 \equiv \gamma_{\mu\nu}\, dx^\mu dx^\nu . \tag{5.35} $$
Since the measure of the surface curvature cannot depend on the set of coordinates used, it must be
related to the basis-independent attributes of the matrix $M$. These attributes are its eigenvalues, or
equivalently, its determinant and trace. The determinant $K = \det M = \kappa_1 \kappa_2$ is called intrinsic or
Gaussian curvature and can be expressed entirely in terms of intrinsic measurements on the surface,
without any reference to the external embedding space. Starting from a point $P$ on the surface and
proceeding along a geodesic on the surface for a proper distance $\epsilon$, we arrive at a point $Q_1$. Repeating
this process with geodesics starting off in different directions, we obtain a set of points $Q_1, Q_2, \dots$, all
of them sitting on the circumference $C(\epsilon)$ of a geodesic disc centered at $P$ (cf. Fig. 5.5). A simple
computation using the metric (5.35) shows that the quantity$^7$
$$ \lim_{\epsilon \to 0^+} \frac{3}{\pi \epsilon^3} \left( 2\pi\epsilon - C(\epsilon) \right) , \tag{5.37} $$
measuring the difference between the circumference $C(\epsilon)$ of our geodesic disc and that of a circle of the same radius in
the plane, corresponds precisely to the value of the Gaussian curvature $K$ at $P$
$$ K = \kappa_1 \kappa_2 = \frac{1}{\rho_1 \rho_2} = \lim_{\epsilon \to 0^+} \frac{3}{\pi \epsilon^3} \left( 2\pi\epsilon - C(\epsilon) \right) . \tag{5.38} $$

This expression, relating the Gaussian curvature of a surface to the circumference of a geodesic circle,
is known as the Bertrand-Diguet-Puiseux formula, and is closely related to the Gauss-Bonnet theorem
that we will discuss below. Spaces with $K = 0$ everywhere are said to be flat or developable, since
they can be "developed" or flattened out into a plane without stretching or tearing them (cf. Fig.
5.7). On the other hand, spaces with $K > 0$ everywhere are said to be positively curved, while spaces
with $K < 0$ everywhere are said to be negatively curved or saddle-like. For someone living at a given
point of a space embedded in a higher-dimensional space, the curvature at that point will be positive
if the space curves away in the same way in any direction, while it will be negative if the space curves
away in a different way when moving in different directions (cf. Fig. 5.6).

$^6$ Note that although $M$ is diagonal, the metric is not.
$^7$ There is neither an absolute scale for the Gaussian curvature nor a unique choice of the normalization factor $3/(\pi\epsilon^3)$
appearing in Eq. (5.37). People have just agreed on the convention that the curvature of the unit sphere should be
equal to 1 (although there are some natural motivations for it). For a small geodesic disc of radius $\epsilon$ on the unit sphere
we have
$$ C(\epsilon) \sim 2\pi \left( \epsilon - \frac{1}{6}\, \epsilon^3 \right) , \tag{5.36} $$
which explains the proportionality factor $3/(\pi\epsilon^3)$.

Figure 5.6: Positively ($K > 0$) and negatively ($K < 0$) curved spaces.

Figure 5.7: A plane sheet of paper ($\kappa_1 = \kappa_2 = 0$) rolled in the form of a cylinder of radius $r$ ($\kappa_1 = 1/r$
and $\kappa_2 = 0$). The extrinsic curvature changes from 0 to $\kappa_1 + \kappa_2 = 1/r$.

A worked-out example: a truly curved space.

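As a direct application of the Bertrand-Diguet-Puiseux formula, consider the metric of the
2-dimensional sphere of unit radius
$$ ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2 , \tag{5.39} $$
and take $P$ to be the pole $\theta = 0$. The distance from $P$ to the point $(\theta, \phi) = (\epsilon, \phi)$, measured along a meridian, is given by
$$ \int_0^\epsilon ds = \int_0^\epsilon d\theta = \epsilon . \tag{5.40} $$
The set of points with coordinates $(\epsilon, \phi)$ forms a circle whose circumference is given by
$$ C(\epsilon) = \int_0^{2\pi} d\phi\, \sin\epsilon = 2\pi \sin\epsilon . \tag{5.41} $$
Applying (5.38), we get
$$ K = \lim_{\epsilon \to 0} \frac{6}{\epsilon^2} \left( 1 - \frac{\sin\epsilon}{\epsilon} \right) = 1 . \tag{5.42} $$
The sphere (5.39) is a positively curved space.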
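The limit in (5.42) can also be watched converging numerically, which is roughly what the ant of Fig. 5.5 would do with finite circles. The following minimal sketch evaluates the finite-$\epsilon$ version of (5.38) for the sphere and, for comparison, for the plane; the function names are illustrative.

```python
import math

def circumference_sphere(eps):
    """Circumference of a geodesic circle of radius eps on the unit sphere, Eq. (5.41)."""
    return 2 * math.pi * math.sin(eps)

def circumference_plane(eps):
    """Circumference of a circle of radius eps in the Euclidean plane."""
    return 2 * math.pi * eps

def curvature_estimate(circumference, eps):
    """Finite-eps version of Eq. (5.38): 3 (2 pi eps - C(eps)) / (pi eps^3)."""
    return 3 * (2 * math.pi * eps - circumference(eps)) / (math.pi * eps ** 3)

radii = (0.5, 0.1, 0.01)
sphere_estimates = [curvature_estimate(circumference_sphere, eps) for eps in radii]
plane_estimates = [curvature_estimate(circumference_plane, eps) for eps in radii]

for eps, k_sphere, k_plane in zip(radii, sphere_estimates, plane_estimates):
    print("eps = %.2f   sphere: %.6f   plane: %.6f" % (eps, k_sphere, k_plane))
```

The sphere estimates approach 1 from below as $\epsilon$ shrinks, with a correction of order $\epsilon^2$, while the plane gives exactly zero for every radius.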

On the other hand, the extrinsic curvature 8 is defined through the trace of M , namely κ1 + κ2 . The
difference between the two can be easily understood by considering, for instance, a plane sheet of paper
(κ1 = κ2 = 0) rolled in the form of a cylinder of radius r which will look like a curved 2-dimensional
surface embedded in a 3-dimensional Euclidean space (cf. Fig 5.7). For the cylindrical surface we
have κ1 = 1/r and κ2 = 0. The intrinsic curvature retains the value of the flat sheet of paper. On the
other hand, the extrinsic curvature changes from 0 to κ1 + κ2 = 1/r.

Exercise: Coordinates should not be trusted

Is the 2-dimensional space $ds^2 = \cos^2\phi\, d\phi^2 + \sin^2\phi\, d\theta^2$ curved or flat?

$^8$ In some books, the extrinsic curvature is normalized as $(\kappa_1 + \kappa_2)/2$ and called mean curvature.

5.4 Parallel transport around a closed path

Consider the sum of the angles of a triangle, let's call them $\alpha$, $\beta$ and $\gamma$. As you know, this sum is
equal to $\pi$ rad in flat space. What happens on a curved surface? When a surface is curved, the sum of
the angles of the triangle$^9$ is in general different from $\pi$. The more curved the surface is, the larger
the difference with respect to the flat result. The quantitative version of this rather intuitive statement is
the so-called Gauss-Bonnet theorem$^{10}$:
$$ \int_S K\, dS = \alpha + \beta + \gamma - \pi \tag{5.43} $$
with $K$ the Gauss curvature and $S$ the region inside the triangle. To generalize this notion of curvature,
note that when the tangent vector to the side $PQ$ is parallel transported from $P$ to $Q$ (cf. Fig. 5.8),
it forms an angle $\pi - \beta$ with the tangent vector to the next side of the triangle. The same happens at
the other vertices. This means that if we parallel transport a vector around the whole closed path, we
obtain an angle $\pi - \beta + \pi - \gamma + \pi - \alpha$, which, forgetting about multiples of $2\pi$ and writing the appropriate
sign, is given by $\alpha + \beta + \gamma - \pi$. The Gauss curvature measures the rotation, per unit area,
of vectors parallel transported around closed paths.

$^9$ We are implicitly assuming that the sides of the triangle are geodesics, the curved analog of Euclidean straight lines.

Figure 5.8: Parallel transport of a vector around a closed path on the sphere.

Ways of determining curvature

• Make distance measurements in different directions to construct the metric and then use
it to find the curvature

• Parallel transport a vector between two points along two different paths (or around a closed path) and compare the results.

Note that in both cases, we don’t make any reference to the higher-dimensional space in which
we are embedded.
Although the intuitive reasoning presented above was two-dimensional, it can be easily generalized to
arbitrary dimension. To do that, consider the parallel transport equation
$$ \frac{dv^\mu}{d\sigma} = -\Gamma^\mu{}_{\nu\rho}\, v^\nu\, \frac{dx^\rho}{d\sigma} \tag{5.44} $$
and apply it to the case in which $v^\mu$ is parallel-transported along a small curve $C$ from some initial
point $P$. The value of the vector at any other point $\sigma$ along this curve is given by
$$ v^\mu(\sigma) = v^\mu_P - \int_0^\sigma \Gamma^\mu{}_{\nu\rho}\, v^\nu\, \frac{dx^\rho}{d\sigma}\, d\sigma . \tag{5.45} $$

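$^{10}$ The standard presentation of the theory of surfaces is usually based on Gauss' Egregium Theorem and finishes with
the derivation of the Gauss-Bonnet theorem. This sequence is however not chronological. Gauss deduced the Egregium
Theorem starting from the Gauss-Bonnet theorem.

Let us assume the loop $C$ to be infinitesimally small. In that case, the quantities in the integrand of
the previous expression can be Taylor expanded around the point $P$ to get
$$ \Gamma^\mu{}_{\nu\rho}(\sigma) = \Gamma^\mu{}_{\nu\rho}\big|_P + \partial_\lambda \Gamma^\mu{}_{\nu\rho}\big|_P\, \Delta x^\lambda + \dots \tag{5.46} $$
$$ v^\mu(\sigma) = v^\mu_P - \Gamma^\mu{}_{\nu\rho}\big|_P\, v^\nu_P\, \Delta x^\rho + \dots \tag{5.47} $$
with $\Delta x^\lambda \equiv x^\lambda(\sigma) - x^\lambda_P$. Plugging these expressions back into (5.45) and retaining only terms
up to first order in $\Delta x^\lambda$, we obtain
$$ v^\mu(\sigma) = v^\mu_P - \Gamma^\mu{}_{\nu\rho}\big|_P\, v^\nu_P \int_0^\sigma \frac{dx^\rho}{d\sigma}\, d\sigma - \left( \partial_\lambda \Gamma^\mu{}_{\nu\rho} - \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\lambda} \right)_P v^\nu_P \int_0^\sigma \left( x^\lambda - x^\lambda_P \right) \frac{dx^\rho}{d\sigma}\, d\sigma . \tag{5.48} $$
The second term and the part of the last term proportional to $x^\lambda_P$ vanish for a closed path ($\oint dx^\rho = 0$). We
are therefore left with a net change
$$ \Delta v^\mu = -\left( \partial_\lambda \Gamma^\mu{}_{\nu\rho} - \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\lambda} \right)_P v^\nu_P \oint x^\lambda\, dx^\rho . \tag{5.49} $$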

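This effect can be written in a more meaningful form by averaging it with the result of interchanging the dummy
indices $\rho$ and $\lambda$. Doing this, and taking into account that
$$ \oint d(x^\rho x^\lambda) = \oint \left( x^\rho\, dx^\lambda + x^\lambda\, dx^\rho \right) = 0 , \tag{5.50} $$
we get
$$ \Delta v^\mu = -\frac{1}{2} \left( \partial_\rho \Gamma^\mu{}_{\nu\lambda} - \partial_\lambda \Gamma^\mu{}_{\nu\rho} + \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\lambda} - \Gamma^\mu{}_{\kappa\lambda} \Gamma^\kappa{}_{\nu\rho} \right)_P v^\nu_P \oint x^\rho\, dx^\lambda . \tag{5.51} $$
Denoting by
$$ A^{\rho\lambda} \equiv \oint x^\rho\, dx^\lambda \tag{5.52} $$
the oriented area enclosed by the loop $C$ and taking into account Eq. (5.26), we finally obtain
$$ \Delta v^\mu = -\frac{1}{2}\, R^\mu{}_{\nu\rho\lambda}\, v^\nu_P\, A^{\rho\lambda} . \tag{5.53} $$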
The change of the vector when it moves along a closed path is proportional to the Riemann tensor
and to the area enclosed by the loop$^{11}$! $R^\mu{}_{\nu\rho\sigma}$ is the generalization$^{12}$ of the Gauss curvature $K$. The
components of a vector $v^\mu$ will remain unchanged after parallel transport around any closed path if and only if the curvature
tensor vanishes. If that happens, the spacetime is actually flat. Any apparent dependence of the
metric on the coordinates will be just an illusion due to the use of some weird coordinate system and
we will be able to find a global coordinate system in which the metric takes a Cartesian form.

$^{11}$ Note that although our derivation was performed under the assumption of having an infinitesimal loop, it can be
easily extended to larger closed curves. A given surface $A$ bounded by a curve $C$ can be understood as the sum of many
small areas bounded by closed curves $C_N$. Since the contributions from the interior edges cancel and only
the outer edges contribute, we can express the change in the components $v^\mu$ along $C$ as the sum of the changes around
the small curves, namely
$$ \Delta v^\mu = \sum_N (\Delta v^\mu)_N . \tag{5.54} $$
$^{12}$ Indeed the geodesic deviation equation (5.27) is nothing else than the generalization of the Jacobi equation
$$ \frac{d^2 y}{d\sigma^2} + K y = 0 \tag{5.55} $$
for the separation $y$ between two nearby geodesics on a two-dimensional surface.

Figure 5.9: Einstein's manipulations of the Riemann tensor (Zurich notebook). The computation is
abandoned, "zu umstaendlich" (too involved).
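The relation between parallel transport and enclosed area can be illustrated with a short numerical experiment. The sketch below is an added example under stated assumptions: it integrates the parallel transport equation (5.44) on the unit sphere (5.39) around a circle of constant $\theta = \theta_0$, using the Christoffel symbols $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \cot\theta$ and a standard fourth-order Runge-Kutta step. The function name, the value of $\theta_0$ and the number of steps are arbitrary choices.

```python
import math

def transport_around_latitude(theta0, v0=(1.0, 0.0), steps=2000):
    """Integrate dv^mu/dphi = -Gamma^mu_{nu phi} v^nu along theta = theta0 (RK4)."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        v_theta, v_phi = v
        # -Gamma^theta_{phi phi} v^phi = sin*cos*v^phi ; -Gamma^phi_{theta phi} v^theta = -cot*v^theta
        return (s * c * v_phi, -(c / s) * v_theta)

    h = 2 * math.pi / steps
    v = v0
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return v

theta0 = 0.3
v_theta, v_phi = transport_around_latitude(theta0)

# Norm with respect to the metric (5.39); the initial vector has unit norm.
norm = math.sqrt(v_theta ** 2 + (math.sin(theta0) * v_phi) ** 2)
# Angle between the initial vector (1, 0) and the transported one.
rotation = math.acos(max(-1.0, min(1.0, v_theta / norm)))
# Area of the spherical cap enclosed by the loop.
area = 2 * math.pi * (1 - math.cos(theta0))

print("rotation angle:", rotation)
print("enclosed area :", area)
```

On the unit sphere $K = 1$, so the rotation angle of the transported vector coincides with the enclosed area, in line with the statement that the change of a vector around a loop is controlled by the curvature times the area. Parallel transport preserves the norm of the vector, which provides a simple check on the integration.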

Exercise:
Determine the Gauss curvature of a spherical surface of radius $R$ through the Gauss-Bonnet
theorem. Hint: Apply it, for instance, to the triangle determined by 1/8 of the sphere.

5.5 Properties of the Riemann tensor

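Eq. (5.26) provides a way of computing the $4^4 = 256$ components of the Riemann tensor in four dimensions directly from the
line element. This is usually a rather tedious process, even for Einstein (cf. Fig. 5.9). Fortunately, the
covariant form of the Riemann tensor $R_{\mu\nu\rho\sigma} \equiv g_{\mu\lambda} R^\lambda{}_{\nu\rho\sigma}$ shows many interesting symmetries in its
indices that will simplify our life. Writing it explicitly in terms of the metric and Christoffel symbols
we get
$$ R_{\mu\nu\rho\sigma} = \frac{1}{2} \left( \partial_\nu \partial_\rho g_{\mu\sigma} + \partial_\mu \partial_\sigma g_{\nu\rho} - \partial_\nu \partial_\sigma g_{\mu\rho} - \partial_\mu \partial_\rho g_{\nu\sigma} \right) + g_{\lambda\kappa} \left( \Gamma^\lambda{}_{\nu\rho} \Gamma^\kappa{}_{\mu\sigma} - \Gamma^\lambda{}_{\nu\sigma} \Gamma^\kappa{}_{\mu\rho} \right) . \tag{5.56} $$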
Using this expression we can derive the following properties:

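• Symmetry: The Riemann tensor $R_{\mu\nu\rho\sigma}$ is symmetric under the interchange of the first pair of
indices with the second pair of indices
$$ R_{\mu\nu\rho\sigma} = +R_{\rho\sigma\mu\nu} . \tag{5.57} $$

• Antisymmetry: The Riemann tensor $R_{\mu\nu\rho\sigma}$ is antisymmetric under the interchange of either
the first two indices or the second two indices
$$ R_{\mu\nu\rho\sigma} = -R_{\mu\nu\sigma\rho} = -R_{\nu\mu\rho\sigma} = R_{\nu\mu\sigma\rho} . \tag{5.58} $$
This is a direct consequence of the definition of the Riemann tensor (the operator $[\nabla_\sigma , \nabla_\rho]$ is
antisymmetric) and the metric compatibility
$$ [\nabla_\sigma , \nabla_\rho]\, g_{\mu\nu} = 0 \;\longrightarrow\; R^\kappa{}_{\mu\rho\sigma}\, g_{\kappa\nu} + R^\kappa{}_{\nu\rho\sigma}\, g_{\mu\kappa} = R_{\nu\mu\rho\sigma} + R_{\mu\nu\rho\sigma} = 0 . \tag{5.59} $$

• 1st Bianchi identity: The cyclic sum over the last three indices is zero
$$ 3 R_{\mu[\nu\rho\sigma]} \equiv R_{\mu\nu\rho\sigma} + R_{\mu\rho\sigma\nu} + R_{\mu\sigma\nu\rho} = 0 . \tag{5.60} $$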


This can be easily understood by applying the operator $[\nabla_\rho , \nabla_\sigma]$ to the gradient $\nabla_\nu \phi$ of a scalar
field. For any scalar $\nabla_{[\rho} \nabla_\sigma \nabla_{\nu]} \phi = 0$, which implies
$$ R^\kappa{}_{[\nu\rho\sigma]} \nabla_\kappa \phi = 0 . \tag{5.61} $$
Since the resulting expression is valid for all gradients, Eq. (5.60) follows immediately. Note
that the result is non-trivial only when the three indices $\nu\rho\sigma$ are different. When two of these
indices are equal, one of the terms drops out and the remaining terms just express the antisymmetry
in the last two indices of the curvature tensor.

• 2nd Bianchi identity: The Riemann tensor satisfies the differential identity$^{13}$
$$ \nabla_\kappa R^\mu{}_{\nu\rho\sigma} + \nabla_\sigma R^\mu{}_{\nu\kappa\rho} + \nabla_\rho R^\mu{}_{\nu\sigma\kappa} = 0 . \tag{5.63} $$
The proof is left as an exercise.

Exercise
Prove Eq. (5.63). Hint: Use a local inertial frame.

• Ricci tensor and Ricci scalar: There are two important contractions of the Riemann tensor$^{14}$.
The first one is a second rank tensor obtained by contracting a pair of indices. Since $R_{\mu\nu\rho\sigma}$ is
antisymmetric in $\mu\nu$ and $\rho\sigma$, the only non-trivial contraction is between $\mu$ and $\rho$ or between $\mu$
and $\sigma$. These two contractions differ only by a change of sign. Taking the first contraction, we
obtain the so-called Ricci tensor
$$ R_{\nu\sigma} \equiv g^{\mu\rho} R_{\mu\nu\rho\sigma} = R^\mu{}_{\nu\mu\sigma} = \partial_\mu \Gamma^\mu{}_{\nu\sigma} - \partial_\sigma \Gamma^\mu{}_{\nu\mu} + \Gamma^\mu{}_{\kappa\mu} \Gamma^\kappa{}_{\nu\sigma} - \Gamma^\mu{}_{\kappa\sigma} \Gamma^\kappa{}_{\nu\mu} , \tag{5.64} $$
which is symmetric, as can be easily seen by taking into account the relation (4.72)
$$ \partial_\sigma \Gamma^\mu{}_{\nu\mu} = \partial_\sigma \left( \frac{1}{\sqrt{|g|}}\, \partial_\nu \sqrt{|g|} \right) = -\frac{1}{|g|}\, \partial_\sigma \sqrt{|g|}\; \partial_\nu \sqrt{|g|} + \frac{1}{\sqrt{|g|}}\, \partial_\nu \partial_\sigma \sqrt{|g|} . \tag{5.65} $$
The second contraction is the so-called Ricci scalar or Ricci curvature
$$ R \equiv R^\nu{}_\nu = g^{\nu\sigma} R_{\nu\sigma} = g^{\mu\rho} g^{\nu\sigma} R_{\mu\nu\rho\sigma} . \tag{5.66} $$
That's all. There are no more non-vanishing contractions. The result (5.66) is quite remarkable.
Among the 20 independent components of the Riemann tensor that transform into linear combinations
of each other under general coordinate transformations, there is one combination which remains
unchanged. $R$ is the only scalar that can be built from the metric and is linear in its second derivatives.

Exercise:
• Among the different ways of constructing a scalar from the Riemann tensor discussed
above, why did I not discuss the contraction $\epsilon^{\mu\nu\rho\sigma} R_{\mu\nu\rho\sigma}$?

$^{13}$ This identity is related to the Jacobi identity
$$ [[\nabla_\mu , \nabla_\nu], \nabla_\rho] + [[\nabla_\nu , \nabla_\rho], \nabla_\mu] + [[\nabla_\rho , \nabla_\mu], \nabla_\nu] = 0 . \tag{5.62} $$
$^{14}$ We will only discuss the contractions at lowest order in the curvature tensor. Higher order contractions such as
$R^2$, $R_{\mu\nu} R^{\mu\nu}$ or the square of the Riemann tensor, the so-called Kretschmann scalar $R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma}$, will be introduced
in due time.

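• Contracted Bianchi identities: Note the important result that follows from the Bianchi
identity (5.63) and the definition of the Ricci scalar. Contracting the indices $\mu$ and $\rho$ in (5.63) we get
$$ \nabla_\kappa R^\rho{}_{\nu\rho\sigma} + \nabla_\sigma R^\rho{}_{\nu\kappa\rho} + \nabla_\rho R^\rho{}_{\nu\sigma\kappa} = \nabla_\kappa R_{\nu\sigma} - \nabla_\sigma R_{\nu\kappa} + \nabla_\rho R^\rho{}_{\nu\sigma\kappa} = 0 , \tag{5.67} $$
where we have made use of the antisymmetry property (5.58). Multiplying by the metric $g^{\nu\sigma}$,
contracting the indices $\nu$ and $\sigma$ and taking into account that $\nabla_\rho R^{\rho\sigma}{}_{\sigma\kappa} = -\nabla_\rho R^{\sigma\rho}{}_{\sigma\kappa} = -\nabla_\rho R^\rho{}_\kappa$,
Eq. (5.67) becomes
$$ \nabla_\kappa R - \nabla_\sigma R^\sigma{}_\kappa - \nabla_\rho R^\rho{}_\kappa = 0 . \tag{5.68} $$
The last two terms are equal after relabeling the dummy index, so that $\nabla_\kappa R - 2 \nabla_\rho R^\rho{}_\kappa = 0$. This
expression can be written in a much more enlightening way
$$ \nabla^\mu \left( R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R \right) = 0 . \tag{5.69} $$
The divergence of the so-called Einstein tensor
$$ G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R \tag{5.70} $$
vanishes by construction$^{15}$! The symmetry of the Einstein tensor under the interchange of its
indices follows directly from the symmetries of the Ricci tensor and the metric. What is the
geometrical meaning of this tensor? To answer this, consider an observer moving with 4-velocity
$u^\mu$ and compute the spatial components of the Riemann tensor in the instantaneous rest frame
of such an observer$^{16}$
$$ \mathcal{R}_{\gamma\epsilon\lambda\kappa} = h^\mu{}_\gamma\, h^\nu{}_\epsilon\, h^\rho{}_\lambda\, h^\sigma{}_\kappa\, R_{\mu\nu\rho\sigma} , \tag{5.71} $$
where we have made use of the projection operator $h^\mu{}_\nu = \delta^\mu{}_\nu + u^\mu u_\nu$. Contracting the indices
$\gamma$ and $\lambda$ and the indices $\epsilon$ and $\kappa$ in the previous expression, we get the scalar
$$ \mathcal{R} = h^{\mu\rho} h^{\nu\sigma} R_{\mu\nu\rho\sigma} = \left( g^{\mu\rho} + u^\mu u^\rho \right) \left( g^{\nu\sigma} + u^\nu u^\sigma \right) R_{\mu\nu\rho\sigma} = R + 2 u^\mu u^\rho R_{\mu\rho} , \tag{5.72} $$
which, comparing with the definition (5.70) of the Einstein tensor and using $u^\mu u_\mu = -1$, can be written as
$$ \mathcal{R} = 2 u^\mu u^\rho G_{\mu\rho} . \tag{5.73} $$
$G_{\mu\nu} u^\mu u^\nu$ measures the local scalar curvature of the spatially projected curvature tensor.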

A final warning
There are several sign conventions involved in the definition of the Riemann tensor and its
contractions. Be careful when taking results from different books or articles. Our convention
is that of Misner, Thorne and Wheeler. A very useful reference sheet taken precisely from this
book can be found in the Moodle.

$^{15}$ Remember this; we will make use of it very soon.
$^{16}$ $\mathcal{R}_{\mu\nu\rho\sigma}$ is not the curvature of the 3-space orthogonal to $u^\mu$, ${}^{(3)}R_{\mu\nu\rho\sigma}$!

5.6 Independent components of the Riemann tensor

How many independent components does the Riemann tensor $R_{\mu\nu\rho\sigma}$ have in $n$ dimensions? As a 4-indexed
object in $n$ dimensions it has a priori $n^4$ independent components, but the symmetries (5.57)-(5.60)
significantly reduce this number. In order to see this, consider the Riemann tensor $R_{\mu\nu\rho\sigma}$ as
a symmetric $m \times m$ matrix$^{17}$ $R_{AB} = R_{BA}$ with indices $A = \{\mu\nu\}$ and $B = \{\rho\sigma\}$.
This matrix has $\frac{1}{2} m(m + 1)$ independent components. The value of $m$ is determined by the number
of choices that we have for $A$ and $B$, which, taking into account Eq. (5.58), have the same content as
an $n \times n$ antisymmetric matrix. We have therefore $m = \frac{1}{2} n(n - 1)$ possible choices of $A$ and $B$. The
total number of components so far is

$$ \frac{m(m + 1)}{2} = \frac{1}{2}\, \frac{n(n - 1)}{2} \left( \frac{n(n - 1)}{2} + 1 \right) = \frac{n^4 - 2n^3 + 3n^2 - 2n}{8} , \tag{5.74} $$
but we still have to subtract the constraints imposed by Eq. (5.60). To determine the number of extra
constraints, notice that if one sets any two indices equal (for instance $\mu = \nu$), Eq. (5.60) is satisfied identically
(one term goes away by antisymmetry and the other two cancel). Only if the 4 indices are
different do we get a constraint. The number of independent constraints is the same as the number of
combinations of 4 objects that can be chosen from $n$ objects
$$ \binom{n}{4} = \frac{n!}{4!\,(n - 4)!} = \frac{n(n - 1)(n - 2)(n - 3)}{24} . \tag{5.75} $$
The final number of independent components of the Riemann tensor becomes
$$ C_R = \frac{m(m + 1)}{2} - \frac{n!}{4!\,(n - 4)!} = \frac{n^2 (n^2 - 1)}{12} . \tag{5.76} $$
Evaluating this for different dimensions we get

Number of dimensions                                   1     2     3     4     5
Total components of $R^\mu{}_{\nu\rho\sigma}$          1    16    81   256   625
Independent components of $R^\mu{}_{\nu\rho\sigma}$    0     1     6    20    50

The number of independent components in 4 dimensions has been reduced from 256 to 20! The fact
that the number is still quite large is reasonable, since we need a lot of numbers to specify how the
space curves in many different directions. As we will see in the next Section, these are precisely the
degrees of freedom in the second derivatives of the metric that we cannot set to zero by performing a
change of coordinates.
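The counting leading to (5.76) is easy to automate. The short sketch below (function names are illustrative) follows the argument step by step: the $m(m+1)/2$ components of the symmetric matrix $R_{AB}$ minus the $\binom{n}{4}$ constraints from the cyclic identity, compared against the closed form $n^2(n^2-1)/12$.

```python
from math import comb

def riemann_independent_components(n):
    """Pair-symmetric components m(m+1)/2 minus the C(n, 4) cyclic constraints."""
    m = n * (n - 1) // 2          # number of antisymmetric index pairs
    return m * (m + 1) // 2 - comb(n, 4)

def closed_form(n):
    """Closed form n^2 (n^2 - 1) / 12 of Eq. (5.76)."""
    return n * n * (n * n - 1) // 12

# Maps the dimension n to (total components, independent components).
table = {n: (n ** 4, riemann_independent_components(n)) for n in range(1, 6)}

for n, (total, independent) in table.items():
    print(n, total, independent, closed_form(n))
```

Note that `comb(n, 4)` returns zero for $n < 4$, which reflects the fact that the cyclic identity gives no new constraint unless four different index values are available.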

Exercise
• In one dimension the Riemann tensor is always identically zero. Explain why.
Hint: Remember the geometrical interpretation of the Riemann tensor.
• How many components do the Ricci tensor and the Ricci scalar have in 2, 3 and 4 dimensions?
And the Einstein tensor? Is there any dimension in which the Riemann and the Ricci
tensors have the same number of independent components?

$^{17}$ This is sometimes called the Petrov notation.

5.6.1 Local versus global flatness: A counting exercise

The Equivalence Principle is based on the existence of locally inertial (or freely falling) reference
frames
$$ g_{\mu\nu}(P) = \eta_{\mu\nu} , \qquad \partial_\sigma g_{\mu\nu}(P) = 0 , \tag{5.77} $$
in which gravity can be transformed away. So, one of the things that we would like to verify is that
this kind of coordinate system exists in the context of Riemannian geometry, i.e., that we can always
introduce a freely falling frame (5.77) at an arbitrary point for an arbitrary metric $g_{\mu\nu}$. To do
that, consider a coordinate transformation from the coordinates $x^\mu$ to some coordinates $\xi^\alpha$ in the
neighborhood of some point $P$. Performing a Taylor expansion around $P$, we get
$$ \xi^\alpha(x) = \xi^\alpha(P) + A^\alpha{}_\mu\, \Delta x^\mu + B^\alpha{}_{\mu\nu}\, \Delta x^\mu \Delta x^\nu + D^\alpha{}_{\mu\nu\rho}\, \Delta x^\mu \Delta x^\nu \Delta x^\rho + \dots , \tag{5.78} $$
with $\Delta x^\mu \equiv x^\mu - P^\mu$ and$^{18}$
$$ A^\alpha{}_\mu = \left. \frac{\partial \xi^\alpha}{\partial x^\mu} \right|_P , \qquad B^\alpha{}_{\mu\nu} = \frac{1}{2} \left. \frac{\partial^2 \xi^\alpha}{\partial x^\mu \partial x^\nu} \right|_P , \qquad D^\alpha{}_{\mu\nu\rho} = \frac{1}{6} \left. \frac{\partial^3 \xi^\alpha}{\partial x^\mu \partial x^\nu \partial x^\rho} \right|_P . \tag{5.79} $$
Let us see if we can generically choose the values of the coefficients $A^\alpha{}_\mu$, $B^\alpha{}_{\mu\nu}$, $D^\alpha{}_{\mu\nu\rho}$, ... in such a way
that the conditions (5.77) are satisfied$^{19}$. In four dimensions, the matrix $A^\alpha{}_\mu$ has $4^2 = 16$ independent
components. Since we need only 10 conditions to impose $g_{\mu\nu}(P) = \eta_{\mu\nu}$, we are left with 6 components
to spare, precisely the number of boosts and rotations that we can make without
modifying the form of the Minkowski metric $\eta_{\mu\nu}$! The requirement $\partial_\sigma g_{\mu\nu}(P) = 0$ gives rise
to $4 \times 4(4 + 1)/2 = 40$ conditions, which is precisely the number of components of the symmetric
quantity $B^\alpha{}_{\mu\nu}$. We have just shown that one can always choose coordinates in such a way that
the metric reduces to the inertial form (5.77) in an infinitesimal region around a point $P$. In the
mathematical literature, this is known as the local flatness theorem.
But what happens with the other coefficients? Can we also set the second derivatives of the metric to zero by simply performing coordinate transformations? The answer is no. The second derivatives of the metric, $\partial_\sigma \partial_\rho g_{\mu\nu}$, have $10 \times 10 = 100$ independent components, while $D^\alpha{}_{\mu\nu\rho}$ has only $4^2 \times (5 \times 6)/6 = 80$ components. This means that among the 100 components of the metric second derivatives only 80 can be set to zero at P via coordinate transformations. The remaining $100 - 80 = 20$ are precisely the independent components of the Riemann tensor in 4 dimensions! Indeed, it is not difficult to prove that, at quadratic order in the coordinates, we can write

$$ g_{\mu\nu} = \eta_{\mu\nu} - \frac{1}{3} \left( R_{\mu\rho\nu\sigma} + R_{\nu\rho\mu\sigma} \right) \Delta x^\rho \Delta x^\sigma \,. \tag{5.80} $$

The second derivatives of the metric (or, if you prefer, the first derivatives of the Christoffel symbols) encode the information about the true gravitational field $R_{\mu\nu\rho\sigma}$! A freely falling observer can pretend that he/she is not in the presence of a gravitational field, but the tidal forces cannot be eliminated!
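The counting above can be repeated for any dimension with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the counts assume that each Taylor coefficient is totally symmetric in its lower indices and that the metric is symmetric.

```python
from math import comb


def taylor_coefficients(n: int, order: int) -> int:
    """Free parameters in the Taylor coefficient of xi^alpha with `order` lower indices.

    n values of alpha times a totally symmetric set of `order` lower indices.
    """
    return n * comb(n + order - 1, order)


def metric_derivatives(n: int, order: int) -> int:
    """Independent components of the `order`-th partial derivatives of g_{mu nu} at P."""
    return comb(n + 1, 2) * comb(n + order - 1, order)


def riemann_components(n: int) -> int:
    """Independent components of the Riemann tensor in n dimensions."""
    return n * n * (n * n - 1) // 12


def leftovers(n: int) -> tuple:
    """Return (spare parameters in A, unmatched first derivatives, unmatched second derivatives)."""
    spare_a = taylor_coefficients(n, 1) - metric_derivatives(n, 0)
    first = metric_derivatives(n, 1) - taylor_coefficients(n, 2)
    second = metric_derivatives(n, 2) - taylor_coefficients(n, 3)
    return spare_a, first, second


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(n, leftovers(n), riemann_components(n))
```

For n = 4, `leftovers(4)` gives `(6, 0, 20)`: six spare parameters in A, no surviving first derivatives, and 20 second derivatives that cannot be removed, in agreement with the count in the text.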

Exercise
Repeat this exercise in arbitrary dimensions. What happens?

18 Note that, in spite of appearances, the coefficients in the previous expression are not tensors, because they only transform as such under global linear coordinate transformations.

19 Note that the coefficients $B^\alpha{}_{\mu\nu}$, $D^\alpha{}_{\mu\nu\rho}$, ... are completely symmetric in the lower indices.

5.6.2 The Weyl tensor

In 4 dimensions, the Riemann tensor has 20 independent components, while the Ricci tensor can only account for 10 of those components (the scalar curvature being its trace). This should be somehow expected, since the Ricci tensor and the scalar curvature contain the information about the “traces” of the Riemann tensor, and not about it as a whole. The 20 independent components of the Riemann curvature tensor in 4 dimensions can be written in terms of three irreducible pieces: the scalar curvature R, the tracefree part of the Ricci tensor

$$ S_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{4} g_{\mu\nu} R \,, \tag{5.81} $$

and the so-called Weyl tensor

$$ C_{\mu\nu\rho\sigma} = R_{\mu\nu\rho\sigma} - \left( g_{\mu[\rho} R_{\sigma]\nu} - g_{\nu[\rho} R_{\sigma]\mu} \right) + \frac{1}{3} R\, g_{\mu[\rho} g_{\sigma]\nu} \,. \tag{5.82} $$
The Weyl tensor is a rank-(0,4) tensor, linear in $R_{\mu\nu\rho\sigma}$, with no dependence on the derivatives of the metric except through $R_{\mu\nu\rho\sigma}$. It has the same symmetry properties as the Riemann tensor, and therefore the same number of potential components. Note however that the Weyl tensor is traceless,

$$ C^\mu{}_{\nu\mu\sigma} = g^{\mu\rho} C_{\mu\nu\rho\sigma} = 0 \,, \tag{5.83} $$

which, taking into account the symmetry in the indices ν and σ, leaves us with 20 − 10 = 10 independent components. Together with the 10 − 1 = 9 independent components of the tracefree part of the Ricci tensor $S_{\mu\nu}$ and the single component of the curvature scalar R, these make up the 20 components of the Riemann tensor. Note that no new quantities can be obtained by contracting the indices of the above irreducible components.
An important property of the Weyl tensor is its behaviour under conformal transformations. A conformal transformation can be understood as a local dilatation, in which the line element changes from $ds^2$ to $\Omega^2(x)\, ds^2$, with $\Omega^2(x)$ an arbitrary non-vanishing function called the conformal factor²⁰. Through a straightforward but quite lengthy computation, one can verify that when we perform one of these conformal transformations,

$$ g_{\mu\nu} \longrightarrow \Omega^2(x)\, g_{\mu\nu} \,, \tag{5.85} $$

the totally covariant Weyl tensor transforms accordingly,

$$ C_{\mu\nu\rho\sigma} \longrightarrow \Omega^2(x)\, C_{\mu\nu\rho\sigma} \,, \tag{5.86} $$

and therefore²¹ $C^\mu{}_{\nu\rho\sigma}$ is conformally invariant²². This has an interesting consequence: in those cases in which the metric can be written as a conformal transformation of a flat metric, $g_{\mu\nu} = f(x)\delta_{\mu\nu}$ or $g_{\mu\nu} = f(x)\eta_{\mu\nu}$, the Weyl tensor is zero and the Riemann tensor can be entirely expressed in terms of the Ricci tensor $R_{\mu\nu}$ and the scalar curvature R.
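The step from (5.86) to the conformal invariance of $C^\mu{}_{\nu\rho\sigma}$ can be spelled out in two lines. The sketch below only assumes that the inverse metric rescales with the opposite power of the conformal factor, which follows from $g^{\mu\lambda} g_{\lambda\nu} = \delta^\mu_\nu$:

```latex
\begin{align}
g^{\mu\lambda} &\longrightarrow \Omega^{-2}(x)\, g^{\mu\lambda} , \\
C^{\mu}{}_{\nu\rho\sigma} = g^{\mu\lambda} C_{\lambda\nu\rho\sigma}
 &\longrightarrow \Omega^{-2}(x)\, g^{\mu\lambda}\; \Omega^{2}(x)\, C_{\lambda\nu\rho\sigma}
 = C^{\mu}{}_{\nu\rho\sigma} .
\end{align}
```

The two powers of $\Omega$ cancel only when exactly one index is raised, which is why the position of the indices matters.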

Exercise
Prove that the Weyl tensor (5.82) is indeed traceless.

20 Note that this kind of transformation preserves the angle between vectors,
$$ \cos(U, V) = \frac{U_\mu V^\mu}{\sqrt{(U_\nu U^\nu)(V_\rho V^\rho)}} \,. \tag{5.84} $$
21 Note the position of the indices.
22 This is true in any dimension.

5.7 A laboratory for Riemannian geometry: 2 dimensional manifolds

In two dimensions the covariant Riemann tensor $R_{\mu\nu\rho\sigma}$ has only one independent component. Since the indices can take only two different values, say 1 and 2, and $R_{\mu\nu\rho\sigma}$ is antisymmetric in µ and ν and in ρ and σ, and symmetric under the interchange of the pairs µν and ρσ as a whole, we are left with a single component, $R_{1212}$. Let us see how this component is related to the Ricci scalar. In order to do that, let me express the Riemann tensor as a linear combination of two tensors

$$ S_{\mu\nu\rho\sigma} = g_{\mu\rho} g_{\nu\sigma} \,, \qquad T_{\mu\nu\rho\sigma} = g_{\mu\sigma} g_{\nu\rho} \,, \tag{5.87} $$

depending only on the metric and respecting the symmetries of the Riemann tensor²³,

$$ R_{\mu\nu\rho\sigma} = A \left( S_{\mu\nu\rho\sigma} - T_{\mu\nu\rho\sigma} \right) . \tag{5.88} $$

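Contracting the previous expression to obtain the Ricci scalar on the left-hand side we get

$$ R = A\, g^{\mu\rho} g^{\nu\sigma} \left( S_{\mu\nu\rho\sigma} - T_{\mu\nu\rho\sigma} \right) = A \left( g^{\mu\rho} g_{\mu\rho}\, g^{\nu\sigma} g_{\nu\sigma} - g^{\mu\rho} g_{\mu\sigma}\, g^{\nu\sigma} g_{\nu\rho} \right) = (4 - 2) A = 2A \,, \tag{5.89} $$

which allows us to identify the unknown factor A in Eq. (5.88) and write the fully covariant expression²⁴

$$ R_{\mu\nu\rho\sigma} = K \left( g_{\mu\rho} g_{\nu\sigma} - g_{\mu\sigma} g_{\nu\rho} \right) , \tag{5.91} $$

where we have defined the Gaussian curvature as K = R/2.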

5.7.1 A worked-out example: 2 dimensional sphere

Let us go through the whole process of computing the Ricci scalar. This kind of computation is usually involved, but with a bit of practice and care it is quite tractable²⁵. The line element on the surface of a sphere of radius a can be obtained by substituting the coordinate transformations

$$ x = a \sin\theta \cos\phi \,, \qquad y = a \sin\theta \sin\phi \,, \qquad z = a \cos\theta \,, $$

into the Euclidean line element $ds^2 = dx^2 + dy^2 + dz^2$. We obtain

$$ ds^2 = a^2 d\theta^2 + a^2 \sin^2\theta\, d\phi^2 \quad\longrightarrow\quad g_{\mu\nu} = \begin{pmatrix} a^2 & 0 \\ 0 & a^2 \sin^2\theta \end{pmatrix} . \tag{5.92} $$

The Christoffel symbols can be computed in many different ways, the most practical one being the Lagrangian method. The only non-vanishing components are

$$ \Gamma^\theta{}_{\phi\phi} = -\cos\theta \sin\theta \,, \qquad \Gamma^\phi{}_{\theta\phi} = \Gamma^\phi{}_{\phi\theta} = \cot\theta \,. \tag{5.93} $$
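As a sketch of the Lagrangian method mentioned above (the symbol $L$ and the affine parameter $\sigma$, with dots denoting $d/d\sigma$, are notation introduced here for illustration), one can vary $L = \tfrac{1}{2} g_{\mu\nu} \dot x^\mu \dot x^\nu$ and read off the Christoffel symbols by comparing the Euler-Lagrange equations with the geodesic equation $\ddot x^\mu + \Gamma^\mu{}_{\nu\rho}\, \dot x^\nu \dot x^\rho = 0$:

```latex
\begin{align}
L &= \tfrac{1}{2}\, a^2 \left( \dot\theta^2 + \sin^2\theta\, \dot\phi^2 \right) , \\
\frac{d}{d\sigma} \frac{\partial L}{\partial \dot\theta} - \frac{\partial L}{\partial \theta} = 0
 &\;\Longrightarrow\; \ddot\theta - \sin\theta \cos\theta\, \dot\phi^2 = 0
 \;\Longrightarrow\; \Gamma^\theta{}_{\phi\phi} = -\sin\theta \cos\theta , \\
\frac{d}{d\sigma} \frac{\partial L}{\partial \dot\phi} - \frac{\partial L}{\partial \phi} = 0
 &\;\Longrightarrow\; \ddot\phi + 2 \cot\theta\, \dot\theta\, \dot\phi = 0
 \;\Longrightarrow\; \Gamma^\phi{}_{\theta\phi} = \Gamma^\phi{}_{\phi\theta} = \cot\theta .
\end{align}
```

The factor 2 in the second equation is shared between the two symmetric components $\Gamma^\phi{}_{\theta\phi}$ and $\Gamma^\phi{}_{\phi\theta}$.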

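23 The combination S − T is antisymmetric under µ ↔ ν.

24 This is a particular case of a much more general relation,
$$ R_{\mu\nu\rho\sigma} = \frac{R}{n(n-1)} \left( g_{\mu\rho} g_{\nu\sigma} - g_{\mu\sigma} g_{\nu\rho} \right) , \tag{5.90} $$
valid for a maximally symmetric spacetime with constant R in arbitrary dimension n. Unfortunately, I don’t have the time to go through it. The interested reader can have a look at this subject in Weinberg’s book.

25 Since this is the first non-trivial computation of the Ricci scalar that we perform, I will do it in great detail. Although I could directly compute $R_{1212}$ (we are dealing with a 2-dimensional metric), I prefer not to do so in order to teach you some general tricks related to the symmetries of the Riemann tensor that will be useful when dealing with more complicated metrics.

The µ = θ component of the Riemann tensor is given by

$$ R^\theta{}_{\nu\rho\sigma} = \partial_\rho \Gamma^\theta{}_{\nu\sigma} - \partial_\sigma \Gamma^\theta{}_{\nu\rho} + \Gamma^\theta{}_{\lambda\rho} \Gamma^\lambda{}_{\nu\sigma} - \Gamma^\theta{}_{\lambda\sigma} \Gamma^\lambda{}_{\nu\rho} \,. \tag{5.94} $$

Of the two possible values of the summed index λ in the ΓΓ pieces, only λ = φ contributes, since $\Gamma^\theta{}_{\lambda\rho}$ is non-zero only for λ = ρ = φ. We can therefore expand the sum over λ in the last two terms,

$$ R^\theta{}_{\nu\rho\sigma} = \partial_\rho \Gamma^\theta{}_{\nu\sigma} - \partial_\sigma \Gamma^\theta{}_{\nu\rho} + \Gamma^\theta{}_{\phi\rho} \Gamma^\phi{}_{\nu\sigma} - \Gamma^\theta{}_{\phi\sigma} \Gamma^\phi{}_{\nu\rho} \,. \tag{5.95} $$

Since the Riemann tensor is antisymmetric in ρ and σ, we cannot have ρ = σ. Let’s set therefore ρ = φ and σ = θ (keeping in mind that the alternative choice, ρ = θ and σ = φ, just gives rise to a relative minus sign). We have

$$ R^\theta{}_{\nu\phi\theta} = \Gamma^\theta{}_{\phi\phi} \Gamma^\phi{}_{\nu\theta} - \partial_\theta \Gamma^\theta{}_{\nu\phi} \,, \tag{5.96} $$

from which we get, potentially, two terms

\begin{align}
R^\theta{}_{\theta\phi\theta} &= \Gamma^\theta{}_{\phi\phi} \Gamma^\phi{}_{\theta\theta} - \partial_\theta \Gamma^\theta{}_{\theta\phi} = 0 \,, \tag{5.97} \\
R^\theta{}_{\phi\phi\theta} &= \Gamma^\theta{}_{\phi\phi} \Gamma^\phi{}_{\phi\theta} - \partial_\theta \Gamma^\theta{}_{\phi\phi} \nonumber \\
&= (-\cos\theta \sin\theta)(\cot\theta) - \sin^2\theta + \cos^2\theta = -\sin^2\theta \tag{5.98} \\
&= -R^\theta{}_{\phi\theta\phi} \,. \nonumber
\end{align}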
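For µ = φ, the Riemann tensor becomes

$$ R^\phi{}_{\nu\rho\sigma} = \partial_\rho \Gamma^\phi{}_{\nu\sigma} - \partial_\sigma \Gamma^\phi{}_{\nu\rho} + \Gamma^\phi{}_{\lambda\rho} \Gamma^\lambda{}_{\nu\sigma} - \Gamma^\phi{}_{\lambda\sigma} \Gamma^\lambda{}_{\nu\rho} \,. \tag{5.99} $$

As before, the only option is ρ = φ and σ = θ,

$$ R^\phi{}_{\nu\phi\theta} = \partial_\phi \Gamma^\phi{}_{\nu\theta} - \partial_\theta \Gamma^\phi{}_{\nu\phi} + \Gamma^\phi{}_{\lambda\phi} \Gamma^\lambda{}_{\nu\theta} - \Gamma^\phi{}_{\lambda\theta} \Gamma^\lambda{}_{\nu\phi} \,. \tag{5.100} $$

Taking into account that the metric does not depend on φ and that the only non-vanishing Christoffel symbols are those in (5.93), the previous expression reduces to

$$ R^\phi{}_{\nu\phi\theta} = -\partial_\theta \Gamma^\phi{}_{\nu\phi} - \Gamma^\phi{}_{\phi\theta} \Gamma^\phi{}_{\nu\phi} \,, \tag{5.101} $$

which is different from zero only if ν = θ,

$$ R^\phi{}_{\theta\phi\theta} = -\partial_\theta \Gamma^\phi{}_{\theta\phi} - \Gamma^\phi{}_{\phi\theta} \Gamma^\phi{}_{\theta\phi} = \frac{1}{\sin^2\theta} - \cot^2\theta = 1 \,. \tag{5.102} $$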
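The Ricci tensor is obtained by contracting the upper and the second lower index. In matrix notation we have

$$ R_{\mu\nu} = \begin{pmatrix} R^\theta{}_{\theta\theta\theta} + R^\phi{}_{\theta\phi\theta} & R^\theta{}_{\theta\theta\phi} + R^\phi{}_{\theta\phi\phi} \\ R^\theta{}_{\phi\theta\theta} + R^\phi{}_{\phi\phi\theta} & R^\theta{}_{\phi\theta\phi} + R^\phi{}_{\phi\phi\phi} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & \sin^2\theta \end{pmatrix} . \tag{5.103} $$

The Ricci scalar is

$$ R = g^{\theta\theta} R_{\theta\theta} + g^{\phi\phi} R_{\phi\phi} = \frac{1}{a^2} + \frac{1}{a^2 \sin^2\theta} \sin^2\theta = \frac{2}{a^2} \,. \tag{5.104} $$

The Gaussian curvature

$$ K \equiv \frac{R}{2} = \frac{1}{a^2} \tag{5.105} $$

is positive and constant, as expected, and coincides with the result (5.42) obtained by directly applying the Bertrand-Diguet-Puiseux formula (5.38).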
Remember: This was quite an explicit computation, meant to show how to use the symmetries to rapidly derive the final result. In two-dimensional cases it is better to remember that the Riemann tensor has only one independent component, directly compute the $R_{1212}$ component

$$ R^\theta{}_{\phi\theta\phi} = \sin^2\theta \quad\longrightarrow\quad R_{\theta\phi\theta\phi} = g_{\theta\theta} R^\theta{}_{\phi\theta\phi} = a^2 \sin^2\theta \tag{5.106} $$

and contract it with the inverse metric to obtain the scalar curvature

$$ R = g^{\theta\theta} R_{\theta\theta} + g^{\phi\phi} R_{\phi\phi} = \frac{2}{a^2} = \frac{2 R_{\theta\phi\theta\phi}}{|g|} \,. \tag{5.107} $$

Note that the result

$$ R = \frac{2 R_{\theta\phi\theta\phi}}{|g|} \tag{5.108} $$

is just a particular version of Eq. (5.91).
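The whole chain from the Christoffel symbols (5.93) to the Ricci scalar (5.104) can be cross-checked numerically. The following is a minimal sketch, not a general-purpose tool: it hard-codes the Christoffel symbols of the 2-sphere, approximates their θ-derivatives by central differences (the step `h` is an arbitrary choice), and uses hypothetical function names.

```python
import math


def christoffel(theta):
    """Gamma[k][i][j] = Gamma^k_{ij} of Eq. (5.93); index 0 is theta, index 1 is phi."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.cos(theta) * math.sin(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G


def d_christoffel(theta, h=1e-5):
    """dG[r][k][i][j] = d_r Gamma^k_{ij}; nothing depends on phi, so dG[1] vanishes."""
    Gp, Gm = christoffel(theta + h), christoffel(theta - h)
    d_theta = [[[(Gp[k][i][j] - Gm[k][i][j]) / (2 * h) for j in range(2)]
                for i in range(2)] for k in range(2)]
    d_phi = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    return [d_theta, d_phi]


def riemann(theta):
    """R[m][n][r][s] = R^m_{n r s}, with the index ordering of Eq. (5.94)."""
    G, dG = christoffel(theta), d_christoffel(theta)
    R = [[[[0.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for m in range(2):
        for n in range(2):
            for r in range(2):
                for s in range(2):
                    val = dG[r][m][n][s] - dG[s][m][n][r]
                    for l in range(2):
                        val += G[m][l][r] * G[l][n][s] - G[m][l][s] * G[l][n][r]
                    R[m][n][r][s] = val
    return R


def ricci_scalar(theta, a):
    """Contract R^m_{n m s} with the inverse of the diagonal metric (5.92)."""
    R = riemann(theta)
    ricci = [[sum(R[m][n][m][s] for m in range(2)) for s in range(2)] for n in range(2)]
    g_inv = [1.0 / a**2, 1.0 / (a**2 * math.sin(theta) ** 2)]
    return sum(g_inv[i] * ricci[i][i] for i in range(2))
```

Evaluating `ricci_scalar(theta, a)` at any θ away from the poles should reproduce 2/a² up to the finite-difference error, independently of θ.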

Exercise
Compute the intrinsic curvature of the two-dimensional cone in Cartesian and polar coordinates.
Interpret the result.
CHAPTER 6
EINSTEIN EQUATIONS

You will be convinced of the

general theory of relativity once
you have studied it. Therefore I
am not going to defend it with a
single word.

A. Einstein

6.1 The energy-momentum tensor

Having decided that our description of the motion of test particles and light in a gravitational field should be based on the idea of curved spacetimes with a metric, we must now complete the theory by postulating a law that says how the sources of the gravitational field determine the metric. To construct this gravitational field equation we must first find a covariant way of expressing the source term ρ in the Poisson equation

$$ \nabla^2 \Phi = 4\pi G \rho \,. \tag{6.1} $$

It is clear that the relativistic generalization of Eq. (6.1) cannot simply involve ρ as the source of the relativistic gravitational field, since ρ is the energy density measured by only one observer, the one at rest with respect to the fluid element. It is not the first time that we find this kind of situation. The relativistic formulation of Maxwell equations required combining the charge density $\rho_e$ and the charge current $J^i$ into a 4-vector $J^\mu = (\rho_e, J^i)$ with the right transformation properties. Can we do something similar here? The most naive attempt would be to combine the energy density ρ with some energy flux $s^i$ into a 4-vector, let’s say $s^\mu = (\rho, s^i)$. However, the total energy in this case, $E = \int \rho\, d^3x$, is not a Lorentz invariant quantity¹, since it combines with the linear 3-momentum $p^i$ into the 4-momentum $p^\mu = (E, p^i)$. We are therefore forced to look for a higher rank object encoding the relation among the energy density, the energy flux, the momentum density and the momentum flux or stress. This quantity is the so-called energy-momentum-stress tensor. Let’s construct it.

1 Note that in the electromagnetic case, the total electric charge $Q = \int \rho_e\, d^3x$ is Lorentz invariant.

6.1.1 Newtonian fluids

While point particles are characterized by their energy and momentum, the motion of continuous matter is usually characterized by two quantities: the mass density $\rho(t, x^i)$ and the velocity of the fluid $v(t, x^i)$, which generally depend on space and time. The evolution of a continuous system is determined by two equations:

i) A continuity equation

$$ \frac{\partial \rho}{\partial t} + \frac{\partial (\rho v^j)}{\partial x^j} = 0 \,, \tag{6.2} $$

reflecting the fact that mass is neither created nor destroyed in Classical Mechanics (the flow of mass out of a volume is equal to the loss of mass in it).

ii) Newton’s 2nd law for fluids

$$ f^i = \rho a^i = \rho \left( \frac{\partial v^i}{\partial t} + v^j \frac{\partial v^i}{\partial x^j} \right) , \tag{6.3} $$

with

$$ a^i = \lim_{\Delta t \to 0} \frac{v^i(t + \Delta t, x + \Delta x) - v^i(t, x)}{\Delta t} \,, \tag{6.4} $$

and $f^i = f^i(t, x)$ the total force per unit volume around a point x at time t. The so-called total derivative of the velocity field

$$ \frac{D v^i}{dt} \equiv \frac{\partial v^i}{\partial t} + v^j \frac{\partial v^i}{\partial x^j} \tag{6.5} $$

contains two pieces: the local derivative ∂v/∂t, which gives the change of the velocity v as a function of time at a given point in space, and the so-called convective derivative, (v · ∇) v, which represents the change of v for a moving fluid particle due to the inhomogeneity of the velocity field.
If we assume that there are no other forces apart from those exerted by the fluid on itself, we are left with internal forces like pressure or friction acting only between neighboring regions of matter. Consider an infinitesimal volume dV with surface area dA centered at a point x at time t. Let us denote by $n_j$ the normal vector to the surface. In a perfect fluid², the force $F^i$ exerted by the matter on the area is proportional to the area itself, $F^i = p(t, x)\, \delta^{ij} n_j\, dA$, with p(t, x) the pressure at that point at time t. In the most general case, we will also have shear forces

$$ F^i(t, x) = T^{ij}(t, x)\, n_j\, dA \,, \tag{6.6} $$

due to the tendency of fluid elements moving with different velocities to drag adjacent matter. The coefficients $T^{ij}$ are the components of the so-called stress tensor, which must be symmetric, $T^{ij} = T^{ji}$.

Exercise
Consider the 3-component of the torque acting on an infinitesimal cube of a material of density ρ and side length L. Compare it with the moment of inertia of the cube, $I = \frac{1}{6} \rho L^5$. What happens if $T^{ij} \neq T^{ji}$ in the limit L → 0?

2 A perfect fluid is defined as one for which there are no forces between the particles, no heat conduction and no viscosity.

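The total force exerted per unit area in a given direction³ can be transformed into a total force per unit volume via Gauss’ theorem,

$$ -\int_A T^{ij} n_j\, dA = -\int_V \partial_j T^{ij}\, dV \quad\longrightarrow\quad f^i = -\frac{\partial T^{ij}}{\partial x^j} \,. \tag{6.7} $$

Plugging this result into Newton’s 2nd law (cf. Eq. (6.3)) we get

$$ \rho \left( \frac{\partial v^i}{\partial t} + v^j \frac{\partial v^i}{\partial x^j} \right) + \frac{\partial T^{ij}}{\partial x^j} = 0 \,. \tag{6.8} $$

Using the continuity equation (6.2) to write

$$ \rho \frac{\partial v^i}{\partial t} = \frac{\partial (\rho v^i)}{\partial t} - v^i \frac{\partial \rho}{\partial t} = \frac{\partial (\rho v^i)}{\partial t} + v^i \frac{\partial (\rho v^j)}{\partial x^j} \,, \tag{6.9} $$

Newton’s 2nd law (6.3) for this particular case ($f^i = -\partial_j T^{ij}$) can be written as

$$ \frac{\partial (\rho v^i)}{\partial x^0} + \frac{\partial}{\partial x^j} \left( \rho v^i v^j + T^{ij} \right) = 0 \,, \tag{6.10} $$

which is the so-called Euler equation.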

6.1.2 Relativistic fluids

Eqs. (6.2) and (6.10) can be unified into a single equation in the framework of Special Relativity. To see this, note that the 3-velocity $v^i$ is contained in the relativistic 4-velocity $u^\mu = (u^0, u^i) = (\gamma, \gamma v^i)$. Taking into account the non-relativistic limit of this relation, $u^\mu = (1, v^i)$, we can rewrite (6.2) and (6.10) as

$$ \frac{\partial (\rho u^0 u^0)}{\partial x^0} + \frac{\partial (\rho u^0 u^j)}{\partial x^j} = 0 \,, \qquad \frac{\partial (\rho u^i u^0)}{\partial x^0} + \frac{\partial}{\partial x^j} \left( \rho u^i u^j + T^{ij} \right) = 0 \,, \tag{6.11} $$

which can be considered as parts of the single equation

$$ \partial_\nu T^{\mu\nu} = 0 \,, \qquad T^{\mu\nu} = \rho u^\mu u^\nu + t^{\mu\nu} \,, \tag{6.12} $$

with $t^{\mu\nu} = \mathrm{diag}(0, T^{ij})$. The quantity $T^{\mu\nu}$ is the so-called energy-momentum-stress tensor or, for short, the energy-momentum tensor⁴ or the stress-energy tensor. It is a rank-2 symmetric tensor encoding all the information about energy density, momentum density, stress, pressure, etc. The ten components of this tensor have the following interpretation:

• $T^{00}$ is the local energy density, including any potential contribution from forces between particles and their kinetic energy.
• $T^{0i}$ is the energy flux in the i direction. This includes not only the bulk motion but also any other processes giving rise to transfers of energy, as for instance heat conduction.
• $T^{i0}$ is the density of the momentum component in the i direction, i.e. the 3-momentum density. As in the previous case, it also takes into account the changes in momentum associated with heat conduction.
• $T^{ij}$ is the 3-momentum flux or stress tensor, i.e. the rate of flow of the i momentum component per unit area in the plane orthogonal to the j direction. The component $T^{ii}$ encodes the isotropic pressure in the i direction while the components $T^{ij}$ with $i \neq j$ refer to the viscous stresses of the fluid.

3 The minus sign appears because we are considering the force exerted on matter inside the volume by the matter outside.

4 This name can sometimes be misleading, as it can be confused with the energy-momentum 4-vector $p^\mu$ in sentences including things like “the energy-momentum conservation equation...”. The difference should always be clear from the context.

6.1.3 Relativistic perfect fluids

A relativistic perfect fluid is defined to be one in which the $t^{\mu\nu}$ part of the stress-energy tensor $T^{\mu\nu}$, as seen in a local reference frame moving along with the fluid, has the same form as for the non-relativistic perfect fluid,

$$ t^{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix} . \tag{6.13} $$

Heat conduction, viscosity or any other transport or dissipative processes are negligible in this case. The form of Eq. (6.13) in an arbitrary inertial frame can be obtained by performing a general Lorentz transformation

$$ \Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & \gamma v^i \\ \gamma v^i & \delta^{ij} + v^i v^j (\gamma - 1)/v^2 \end{pmatrix} = \begin{pmatrix} u^0 & u^i \\ u^i & \delta^{ij} + u^i u^j/(1 + \gamma) \end{pmatrix} , \tag{6.14} $$

moving from the rest frame $u^\mu = (1, \mathbf{0})$ to one in which the fluid moves with 3-velocity $v^i$. We get

$$ \bar t^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma\, t^{\rho\sigma} = p \left( \eta^{\mu\nu} + u^\mu u^\nu \right) , \tag{6.15} $$

with $u^\mu$ the 4-velocity vector field tangent to the worldlines of the fluid particles. Taking into account this result, the full stress-energy tensor (6.12) takes the form

$$ T^{\mu\nu} = (\rho + p) u^\mu u^\nu + p\, \eta^{\mu\nu} \,. \tag{6.16} $$

The resulting equation is manifestly covariant and can be easily generalized to arbitrary coordinate systems or curved spacetimes by simply replacing the local metric $\eta^{\mu\nu}$ by a general metric $g^{\mu\nu}$,

$$ T^{\mu\nu} = (\rho + p) u^\mu u^\nu + p\, g^{\mu\nu} \,. \tag{6.17} $$

The conservation law $\partial_\nu T^{\mu\nu} = 0$ in Eq. (6.12) becomes a local conservation law

$$ \nabla_\nu T^{\mu\nu} = 0 \tag{6.18} $$

in which the standard derivative $\partial_\mu$ is replaced by the covariant derivative $\nabla_\mu$. The word local is, as always in this course, important. Eq. (6.18) is not a global conservation law, nor should it be. As we will see, energy is not conserved in the presence of dynamical spacetime curvature but rather changes in response to it.

Exercise
Prove Eq. (6.15).
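Before attempting the proof, Eq. (6.15) can be spot-checked numerically. The sketch below is not a proof and rests on a few assumptions stated in the comments: units with c = 1, signature (−,+,+,+), a non-zero 3-velocity, and hypothetical function names.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of eta^{mu nu}, signature (-,+,+,+)


def boost(v):
    """Boost matrix of Eq. (6.14) for a non-zero 3-velocity v (units with c = 1)."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = gamma * v[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + v[i] * v[j] * (gamma - 1.0) / v2
    return lam, gamma


def boosted_t(p, v):
    """Left-hand side of Eq. (6.15): Lambda Lambda t, with t = diag(0, p, p, p)."""
    lam, _ = boost(v)
    return [[sum(lam[m][k] * lam[n][k] * p for k in range(1, 4))
             for n in range(4)] for m in range(4)]


def covariant_t(p, v):
    """Right-hand side of Eq. (6.15): p (eta^{mu nu} + u^mu u^nu)."""
    _, gamma = boost(v)
    u = [gamma] + [gamma * c for c in v]
    return [[p * ((ETA[m] if m == n else 0.0) + u[m] * u[n])
             for n in range(4)] for m in range(4)]
```

Comparing `boosted_t(p, v)` and `covariant_t(p, v)` entry by entry for a few velocities with |v| < 1 should give agreement up to rounding errors.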

6.2 The microscopic description

The relation between ρ and p is usually characterized by an equation of state p = p(ρ), which depends on the microscopic particles making up the fluid. In order to get some insight into the possible equations of state, let me consider a macroscopic collection of N structureless point particles interacting through spatially localized collisions. The energy density associated with any of them is given by

$$ T_n^{00} = E_n\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n(t)) = m_n \gamma_n\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n(t)) \,, \tag{6.19} $$

with $\gamma_n = 1/\sqrt{1 - v_n^2}$ and n = 1, . . . , N a label selecting the particular particle we are referring to. Taking into account the identity

\begin{align}
\int_{-\infty}^{+\infty} d\tau\, \delta^{(4)}(x - x(\tau)) &= \int_{-\infty}^{+\infty} d\tau\, \delta(t - t(\tau))\, \delta^{(3)}(\mathbf{x} - \mathbf{x}(\tau)) \nonumber \\
&= \frac{d\tau}{dt}\, \delta^{(3)}(\mathbf{x} - \mathbf{x}(t)) = \frac{1}{\gamma}\, \delta^{(3)}(\mathbf{x} - \mathbf{x}(t)) \,, \tag{6.20}
\end{align}

the non-Lorentz invariant 3-dimensional Dirac delta appearing in Eq. (6.19) can be traded for a Lorentz invariant 4-dimensional Dirac delta⁵,

$$ T_n^{00} = m_n \int_{-\infty}^{+\infty} d\tau_n\, u_n^0 u_n^0\, \delta^{(4)}(x - x_n(\tau_n)) \,. \tag{6.21} $$

The same procedure can be applied to the spatial momentum density (or energy current) of the particle

$$ T_n^{0i} = p_n^i\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n(t)) = m_n \gamma_n v_n^i\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n(t)) = E_n v_n^i\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n(t)) \,, \tag{6.22} $$

and to the flux of the i momentum component in the j direction (or vice versa)

$$ T_n^{ij} = p_n^i v_n^j\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n(t)) = p_n^j v_n^i\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n(t)) \,. \tag{6.23} $$

We obtain

$$ T_n^{0i} = m_n \int_{-\infty}^{+\infty} d\tau_n\, u_n^0 u_n^i\, \delta^{(4)}(x - x_n(\tau_n)) \,, \qquad T_n^{ij} = m_n \int_{-\infty}^{+\infty} d\tau_n\, u_n^i u_n^j\, \delta^{(4)}(x - x_n(\tau_n)) \,. \tag{6.24} $$

Eqs. (6.21) and (6.24) can be rewritten in a very compact way in terms of the stress-energy-momentum tensor $T^{\mu\nu}$,

$$ T_n^{\mu\nu} = m_n \int_{-\infty}^{+\infty} d\tau_n\, u_n^\mu u_n^\nu\, \delta^{(4)}(x - x_n(\tau_n)) = \int_{-\infty}^{+\infty} d\tau_n\, \frac{p_n^\mu p_n^\nu}{m_n}\, \delta^{(4)}(x - x_n(\tau_n)) \,, \tag{6.25} $$

which is manifestly symmetric and Lorentz covariant, since $u_n^\mu u_n^\nu$ is a tensor under Lorentz transformations and both $m_n$ and $d\tau_n\, \delta^{(4)}(x - x_n(\tau_n))$ are Lorentz scalars. The total energy-momentum tensor of the whole system of particles can be written as the sum of the individual contributions, namely

$$ T^{\mu\nu} = \sum_{n=1}^{N} T_n^{\mu\nu} \,. \tag{6.26} $$

5 The fact that the 4-dimensional Dirac delta $\delta^{(4)}(x)$ is Lorentz invariant follows directly from the definition $\int d^4x\, \delta^{(4)}(x) = 1$ and the fact that the volume element $d^4x$ is Lorentz invariant.

6.2.1 Energy-momentum tensor conservation and geodesics

Let us see under which conditions the total energy-momentum tensor (6.26) is conserved. Taking the derivative with respect to the coordinates we get

$$ \partial_\mu T^{\mu\nu} = \sum_{n=1}^{N} m_n \int_{-\infty}^{+\infty} d\tau_n\, u_n^\mu u_n^\nu\, \partial_\mu \delta^{(4)}(x - x_n(\tau_n)) \,, \tag{6.27} $$

which using

$$ u_n^\mu\, \partial_\mu \delta^{(4)}(x - x_n(\tau_n)) = \frac{dx_n^\mu}{d\tau_n} \frac{\partial}{\partial x^\mu} \delta^{(4)}(x - x_n(\tau_n)) = -\frac{d}{d\tau_n} \delta^{(4)}(x - x_n(\tau_n)) \,, \tag{6.28} $$

can be written as

$$ \partial_\mu T^{\mu\nu} = -\sum_{n=1}^{N} m_n \int_{-\infty}^{+\infty} d\tau_n\, \frac{d}{d\tau_n} \left[ u_n^\nu\, \delta^{(4)}(x - x_n(\tau_n)) \right] + \sum_{n=1}^{N} m_n \int_{-\infty}^{+\infty} d\tau_n\, \dot u_n^\nu\, \delta^{(4)}(x - x_n(\tau_n)) \,. $$

The first term on the right-hand side of the previous expression, being a total derivative, disappears if the particles are stable, i.e. if the orbits are closed or come from negative infinite time and disappear into positive infinite time. We are left then with the second term, which can be written as

$$ \partial_\mu T^{\mu\nu} = \sum_{n=1}^{N} \int_{-\infty}^{+\infty} d\tau_n\, \frac{dp_n^\nu}{d\tau_n}\, \delta^{(4)}(x - x_n(\tau_n)) = \sum_{n=1}^{N} \frac{dp_n^\nu}{dt}\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n) \,, \tag{6.29} $$

with $p_n^\nu = m_n u_n^\nu$ the 4-momentum of the individual particles. The local energy-momentum conservation $\partial_\mu T^{\mu\nu} = 0$ requires the particles to be free. In other words, the condition $\partial_\mu T^{\mu\nu} = 0$ is equivalent to the geodesic equation in Minkowski spacetime, $dp_n^\mu/d\tau = 0$. This will also be the case in curved spacetime.

6.2.2 The fluid limit

On distances d much larger than the typical mean free path a, the number of particles is large and the statistical fluctuations about the mean properties of the fluid are expected to be small⁶. Imagine a comoving observer exploring distances $d \gg a$. If the fluid is isotropic⁷, the average value of the $T^{0i} \propto u^0 u^i$ component measured by this observer will be zero, since the vector $u^i$ points in all possible directions. In this case, the fluid can be characterized in terms of two quantities: its mean density and pressure over the volume $\Delta V = d^3$,

$$ \rho = \Big\langle \sum_n E_n\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n) \Big\rangle_{\Delta V} , \qquad p = \frac{1}{3} \sum_i \Big\langle \sum_n p_n^i v_n^i\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n) \Big\rangle_{\Delta V} . \tag{6.30} $$

A simple inspection of Eqs. (6.30) reveals that, for standard matter, 0 ≤ p ≤ ρ/3. In any other reference frame, the energy-momentum tensor for the perfect fluid reads

$$ T^{\mu\nu}(x) = (\rho(x) + p(x))\, u^\mu(x) u^\nu(x) + p(x)\, \eta^{\mu\nu} \,, \tag{6.31} $$

with $u^\mu(x)$ denoting now the average value of the 4-velocities $u_n^\mu$ of the individual particles inside the volume⁸. The perfect fluid form (6.31) can be used to model very different physical situations that often fall into one of the following categories:
1. Non-relativistic matter: For small velocities the dispersion relation $E_n = \sqrt{m_n^2 + p_n^2}$ can be approximated by $E_n \simeq m_n + p_n^2/2m_n$, which plugged back into (6.30) gives rise to $\rho \simeq m_n n + \frac{3}{2} p$ (the factor n multiplying $m_n$ denotes the number density of particles). Taking into account that the statistical definition of temperature T is twice the energy possessed by each degree of freedom and assuming a monoatomic gas with 3 kinetic degrees of freedom, we can write $T = (2/3) \times p_n^2/2m_n$ and therefore $\rho \simeq m_n n + \frac{3}{2} n T$.

2. Dust: A perfect fluid with zero pressure: p = 0, $t^{\mu\nu} = 0$, $T^{\mu\nu} = \rho\, \mathrm{diag}(1, 0, 0, 0)$ in the rest frame.

3. Radiation: A perfect highly relativistic fluid. In this case $E_n \simeq |p_n| \gg m_n$ and therefore⁹ ρ ≃ 3p. The energy-momentum tensor for radiation is traceless, $T = T^\mu{}_\mu = \eta_{\mu\nu} T^{\mu\nu} = -\rho + 3p = 0$.

6 Remember that, when we later apply the Equivalence Principle, we will have another scale in play: the scale L at which the gravitational effects start to be important. If this scale happens to be much larger than the scale d ($L \gg d \gg a$), the mean properties of the fluid can be safely considered as constant over the region.

7 i.e. if the fluid is perfect.

8 Note that, when writing $u^\mu(x)$, ρ(x) and p(x), we are explicitly taking into account that the averages can vary from one region to another.
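The bounds 0 ≤ p ≤ ρ/3 and the two limiting cases above can be illustrated with a toy ensemble. The sketch below is only meant as an illustration: it assumes equal-mass particles with isotropically distributed momenta, uses $\sum_i p_n^i v_n^i = |p_n|^2/E_n$, and drops the common volume factor, which cancels in the ratio p/ρ. The function name is hypothetical.

```python
import math


def equation_of_state(momenta, mass):
    """Return w = p / rho for particles of equal mass and isotropic momenta (c = 1).

    momenta: iterable of momentum magnitudes |p_n|.
    rho is proportional to sum(E_n); p is proportional to (1/3) sum(|p_n|^2 / E_n).
    """
    momenta = list(momenta)
    energies = [math.sqrt(mass**2 + q**2) for q in momenta]
    rho = sum(energies)
    pressure = sum(q**2 / e for q, e in zip(momenta, energies)) / 3.0
    return pressure / rho
```

For massless particles the ratio is 1/3 (radiation), while for momenta much smaller than the mass it approaches 0 (dust); any intermediate case falls between these two values.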

A worked-out example: The electromagnetic field

The paradigmatic case of a fluid with a radiation equation of state is the electromagnetic field. To see this explicitly, consider the energy density of the electromagnetic field

$$ T_{00} = \frac{1}{2} \left( E^2 + B^2 \right) , \tag{6.32} $$

and write it in the way seen by an observer moving with 4-velocity $u^\mu$. The electric field seen by that observer is given by

$$ E_\mu = F_{\mu\nu} u^\nu \,. \tag{6.33} $$

Using this expression we get the following covariant expression for the square of the electric field

$$ E^2 = F_{\mu\nu} u^\nu\, F^\mu{}_\rho u^\rho \,. \tag{6.34} $$

A similar expression for the square of the magnetic field can be obtained from the explicit expression for the square of the electromagnetic field strength tensor

$$ F_{\mu\nu} F^{\mu\nu} = -2 \left( E^2 - B^2 \right) . \tag{6.35} $$

We obtain

$$ B^2 = F_{\mu\nu} u^\nu\, F^\mu{}_\rho u^\rho + \frac{1}{2} F_{\mu\nu} F^{\mu\nu} \,. \tag{6.36} $$

Putting Eqs. (6.34) and (6.36) together, the covariant generalization of the energy density (6.32) becomes

$$ \rho = \left( F_{\rho\mu} F^\rho{}_\nu - \frac{1}{4} F_{\rho\sigma} F^{\rho\sigma} \eta_{\mu\nu} \right) u^\mu u^\nu \,, \tag{6.37} $$

where we have used $u_\mu u^\mu = -1$. The work is basically done. The quantity in parentheses is the sought-for energy-momentum tensor for the electromagnetic field!

$$ T_{\mu\nu} = F_{\rho\mu} F^\rho{}_\nu - \frac{1}{4} F_{\rho\sigma} F^{\rho\sigma} \eta_{\mu\nu} \,. \tag{6.38} $$

Exercise
• Compute $T^{0i}$ in terms of the electric and magnetic fields. Do you recognize the result?
• Prove that the electromagnetic energy-momentum tensor is symmetric, $T_{\mu\nu} = T_{\nu\mu}$, and traceless, $T^\mu{}_\mu = 0$. The electromagnetic field behaves as a fluid with equation of state p = ρ/3.
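A quick numerical sanity check of Eq. (6.38) can help before attempting the proofs. The sketch below is illustrative only: it assumes the signature (−,+,+,+) and the component convention $F_{i0} = E_i$, $F_{ij} = \epsilon_{ijk} B_k$ (an assumption made here; the checked quantities are quadratic in F and do not depend on the overall sign), and the function names are hypothetical.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal Minkowski metric, signature (-,+,+,+)


def field_strength(E, B):
    """Covariant F_{mu nu} built from 3-vectors E and B (assumed convention, see text)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[i + 1][0] = E[i]
        F[0][i + 1] = -E[i]
    F[1][2], F[2][1] = B[2], -B[2]
    F[2][3], F[3][2] = B[0], -B[0]
    F[3][1], F[1][3] = B[1], -B[1]
    return F


def stress_tensor(E, B):
    """T_{mu nu} = F_{rho mu} F^rho_nu - (1/4) F_{rho sigma} F^{rho sigma} eta_{mu nu}."""
    F = field_strength(E, B)
    # F_{rho sigma} F^{rho sigma}: indices raised with the diagonal metric.
    ff = sum(ETA[r] * ETA[s] * F[r][s] ** 2 for r in range(4) for s in range(4))
    return [[sum(ETA[r] * F[r][m] * F[r][n] for r in range(4))
             - 0.25 * ff * (ETA[m] if m == n else 0.0)
             for n in range(4)] for m in range(4)]


def trace(T):
    """T^mu_mu = eta^{mu nu} T_{mu nu} for a diagonal metric."""
    return sum(ETA[m] * T[m][m] for m in range(4))
```

For any choice of E and B, `stress_tensor(E, B)[0][0]` should equal (E² + B²)/2 as in Eq. (6.32), and `trace(...)` should vanish up to rounding errors.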

9 Note that the quantity $\sum_i p_n^i v_n^i$ in Eq. (6.30) can be written as $\sum_i p_n^i v_n^i = \sum_i m_n \gamma_n v_n^i v_n^i = |p_n|^2/E_n$, which goes to $|p_n|$ when $E_n \simeq |p_n|$.

6.3 Einstein equations: Heuristic derivation

We finally have all the tools needed to derive the Einstein field equations for the gravitational field. In the Poisson equation, the gravitational field is determined by the matter distribution. The relativistic version of the matter distribution¹⁰, the energy-momentum tensor $T^M_{\mu\nu}$, must be somehow equated¹¹ to some tensor $K_{\mu\nu}$ depending on the metric $g_{\mu\nu}$ and its first and second derivatives¹²,

$$ K_{\mu\nu} = \kappa^2 T^M_{\mu\nu} \,, \tag{6.39} $$

with $\kappa^2$ a proportionality constant to be determined. But which tensor? Einstein got the answer to this question through a complicated process of intuition, trial and error; superhuman exertions in his own words. As claimed above, the left-hand side of Eq. (6.39) should contain a second order differential operator acting on the metric. We already found some quantities with this property in the previous chapter: the Riemann tensor $R^\mu{}_{\nu\rho\sigma}$ and its contractions. The most natural candidate for $K_{\mu\nu}$ would be the Ricci tensor $R_{\mu\nu}$, since this is the contraction appearing in the Newtonian limit of the geodesic deviation equation ($R^i{}_{0i0} = E^i{}_i$). This was also one of Einstein's first trial and error choices,

$$ R_{\mu\nu} \approx \kappa^2 T^M_{\mu\nu} \,. \tag{6.40} $$

Note however that this choice is inconsistent, since the divergence $\nabla^\nu R_{\mu\nu}$ of the Ricci tensor is, in general, different from zero while, according to our minimal coupling prescription, the energy-momentum tensor should be locally conserved, $\nabla^\nu T^M_{\mu\nu} = 0$. Indeed, making use of the Bianchi identity (5.71) we can write $\nabla^\mu R_{\mu\nu} = \frac{1}{2} \nabla_\nu R$, which together with the trace of Eq. (6.40), $R = \kappa^2 g^{\mu\nu} T^M_{\mu\nu} = \kappa^2 T^M$, implies the condition $\nabla_\mu T^M = 0$. Since the covariant derivative of the scalar quantity $T^M$ is just the partial derivative, we should necessarily have a constant $T^M$ throughout the whole spacetime, which is highly implausible since, as we know, $T^M = 0$ for the electromagnetic field and $T^M = -\rho + 3p \neq 0$ for standard matter. On top of that, Eq. (6.40) hides 10 differential equations for 6 physical unknowns: the components of the metric that cannot be freely changed by performing coordinate transformations in the 4 coordinates. We have to try harder.
The most general combination of symmetric tensors involving up to two derivatives of the metric is

$$ K_{\mu\nu} = R_{\mu\nu} + a\, g_{\mu\nu} R + \Lambda g_{\mu\nu} \tag{6.41} $$

with a and Λ some unknown constants to be determined¹³. Imposing the local conservation of the energy-momentum tensor, $\nabla^\mu T^M_{\mu\nu} = 0$, in Eq. (6.39) we get

$$ \nabla^\mu K_{\mu\nu} = \nabla^\mu \left( R_{\mu\nu} + a\, g_{\mu\nu} R \right) = 0 \,, \tag{6.42} $$

where we have taken into account that the covariant derivative $\nabla_\mu$ is metric compatible and therefore $\nabla_\mu (\Lambda g^{\mu\nu}) = 0$. Our situation now is much better than that of Einstein: we are aware of the contracted form of the Bianchi identities¹⁴ (5.71) and know the precise value of a that satisfies Eq. (6.42), namely a = −1/2. Taking this into account, we can rewrite Eq. (6.39) as

$$ G_{\mu\nu} + \Lambda g_{\mu\nu} = \kappa^2 T^M_{\mu\nu} \,, \tag{6.43} $$

with

$$ G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R \tag{6.44} $$

the Einstein tensor defined in the previous chapter (cf. Eq. (5.72)) and Λ the famous cosmological constant. Writing the cosmological constant term on the right-hand side of the equation, we can interpret it as the energy-momentum tensor of a fluid with a weird equation of state p = −ρ,

$$ G_{\mu\nu} = \kappa^2 \left( T^M_{\mu\nu} + T^\Lambda_{\mu\nu} \right) , \qquad T^\Lambda_{\mu\nu} = -\frac{\Lambda}{\kappa^2}\, g_{\mu\nu} \,. \tag{6.45} $$

Defining $T_{\mu\nu} \equiv T^M_{\mu\nu} + T^\Lambda_{\mu\nu}$, we can write

$$ G_{\mu\nu} = \kappa^2 T_{\mu\nu} \,. \tag{6.46} $$

10 Matter should be understood in a broad sense, meaning really matter, radiation, etc.

11 A relativistic generalization should take the form of an equation between tensors.

12 The requirement of having derivatives only up to second order is certainly reasonable. If this were not the case, one would have to specify for the Cauchy problem not only the value of the metric and its first derivative, but also higher derivatives on a spacelike surface.

13 A possible proportionality constant in front of $R_{\mu\nu}$ has been factored out and incorporated in the still unknown constants κ and Λ.

14 He wasn’t.
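The value of a in Eq. (6.42) and the equation of state attributed to the cosmological constant term in Eq. (6.45) can both be checked in a couple of lines. The sketch below uses only the contracted Bianchi identity quoted in the text, $\nabla^\mu R_{\mu\nu} = \tfrac{1}{2} \nabla_\nu R$, and the perfect-fluid form (6.17) with lowered indices; the labels $\rho_\Lambda$ and $p_\Lambda$ are introduced here for illustration:

```latex
\begin{align}
\nabla^\mu \left( R_{\mu\nu} + a\, g_{\mu\nu} R \right)
 &= \left( \tfrac{1}{2} + a \right) \nabla_\nu R = 0
 \quad\Longrightarrow\quad a = -\tfrac{1}{2} , \\
T^{\Lambda}_{\mu\nu} = -\frac{\Lambda}{\kappa^2}\, g_{\mu\nu}
 &= (\rho_\Lambda + p_\Lambda)\, u_\mu u_\nu + p_\Lambda\, g_{\mu\nu}
 \quad\Longrightarrow\quad
 p_\Lambda = -\rho_\Lambda = -\frac{\Lambda}{\kappa^2} .
\end{align}
```

The first line holds for generic (non-constant) R, and the second follows from matching the $u_\mu u_\nu$ and $g_{\mu\nu}$ structures separately.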

Even though our derivation was quite heuristic, the solution that we have obtained is unique (Lovelock theorem). The resulting tensorial equation is a set of ten differential equations¹⁵ for the metric $g_{\mu\nu}(x)$ given the energy-momentum tensor $T_{\mu\nu}(x)$. However, due to the existence of the Bianchi identities, not all of them are independent. There are only 6 independent equations to determine 6 independent components of the metric tensor.
As differential equations they are very complicated, even in vacuum. Both the Ricci tensor and the scalar curvature involve derivatives and products of Christoffel symbols, which in turn involve derivatives of the metric tensor. There is also some dependence on the metric hidden in the energy-momentum tensor. On top of that, the equations are not linear, as should be expected, since, according to the Equivalence Principle, every form of energy, including the gravitational self-energy, must be a source of the gravitational field¹⁶. The non-linearity of the equations forbids us from applying the superposition principle: two known solutions cannot be combined to get a new one.

The Einstein equation in words

The physical meaning of Einstein equations can be clarified by considering an observer with 4-velocity $u^\mu$. The energy density as measured in the rest frame of such an observer is given by $\rho = T_{\mu\nu} u^\mu u^\nu$. Taking this into account, together with the interpretation of the Einstein tensor that we developed in the previous Chapter, the physical content of (6.43) can be summarized as

$$ \left( G_{\mu\nu} - \kappa^2 T_{\mu\nu} \right) u^\mu u^\nu = 0 \,, \tag{6.47} $$

which in words reads

$$ \left( \begin{array}{c} \text{Scalar curvature of the spatial} \\ \text{sections measured by an} \\ \text{observer with velocity } u^\mu \end{array} \right) = 2\kappa^2 \left( \begin{array}{c} \text{Energy density measured by} \\ \text{an observer with 4-velocity } u^\mu \end{array} \right) . $$

15 Both sides of the equation are symmetric rank-2 tensors.

16 Note however that they have a well-posed initial-value structure, i.e. they determine the future values of $g_{\mu\nu}$ from given initial data. This consideration is of key importance for the study of systems evolving in time from some initial state, as for instance, gravitational waves.

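| Newton | Einstein |
|---|---|
| Newton 2nd law: $\frac{d^2 x^i}{dt^2} = -\delta^{ij} \frac{\partial \Phi}{\partial x^j}$ | Geodesic equation: $\frac{d^2 x^\mu}{d\sigma^2} = -\Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\sigma} \frac{dx^\rho}{d\sigma}$ |
| Tidal deviation: $\frac{d^2 \xi^i}{dt^2} = -E^i{}_j\, \xi^j$ | Geodesic deviation: $\frac{D^2 \xi^\mu}{d\sigma^2} = -R^\mu{}_{\nu\rho\sigma} u^\nu u^\sigma \xi^\rho$ |
| 1st Bianchi identity: $E_{ij} = E_{ji}$ | 1st Bianchi identity: $R_{\mu\nu\rho\sigma} + R_{\mu\rho\sigma\nu} + R_{\mu\sigma\nu\rho} = 0$ |
| 2nd Bianchi identity: $E^i{}_{[j,l]} = 0$ | 2nd Bianchi identity: $\nabla_\kappa R^\mu{}_{\nu\rho\sigma} + \nabla_\sigma R^\mu{}_{\nu\kappa\rho} + \nabla_\rho R^\mu{}_{\nu\sigma\kappa} = 0$ |
| Mass density: $\rho$ | Energy-momentum tensor: $T_{\mu\nu}$ |
| Poisson equation: $E^i{}_i = 4\pi G \rho$ | Einstein equation: $G_{\mu\nu} = 8\pi G\, T_{\mu\nu}$ |
| Single elliptic equation | 10 coupled equations, 4 elliptic and 6 hyperbolic |
| Boundary data required | Initial and boundary data required |

Table 6.1: Newtonian vs Einsteinian description of gravity.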

6.4 The linearized theory of gravity

Equation (6.46) looks very promising, but we still have to prove that it is able to reproduce the
Newtonian theory of gravity and to determine the value of the unknown constants $\kappa$ and $\Lambda$. The fastest
way to obtain the Newtonian limit is to use the assumptions discussed in Section 3.6. Let me however
relax these assumptions and obtain the general expression for the Einstein equation in the so-called
weak field limit. This limit is defined by the condition

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \qquad \text{with } |h_{\mu\nu}| \ll 1 . \tag{6.48}$$

The quantity $h_{\mu\nu}$ is then understood as a small perturbation on top of the Minkowski background.
Consistently with this point of view, we will raise and lower its indices with the flat Minkowski metric
$\eta_{\mu\nu}$, namely $h^\mu{}_\sigma = \eta^{\mu\rho} h_{\rho\sigma}$, $h^{\mu\nu} = \eta^{\nu\sigma} h^\mu{}_\sigma$.
In order to compute the expression for the Einstein tensor $G_{\mu\nu}$ at the lowest order in perturbation
theory we must first determine the linearized version of the Ricci tensor and the scalar curvature,
which are functions of the metric connection $\Gamma^\mu{}_{\nu\rho}$. Inserting the expansion (6.48) into the definition
of the metric connection, we get
$$\Gamma^\mu{}_{\nu\rho} = \frac{1}{2}\eta^{\mu\sigma}\left(\partial_\nu h_{\sigma\rho} + \partial_\rho h_{\sigma\nu} - \partial_\sigma h_{\rho\nu}\right) + O(h^2) . \tag{6.49}$$
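The next step is to compute the 4 pieces of the Riemann tensor, which, written in a very schematic way,
have the structure $R \sim \partial\Gamma - \partial\Gamma + \Gamma\Gamma - \Gamma\Gamma$. Taking into account (6.49), we realize that only the
first two terms ($\sim \partial\Gamma$) contribute at leading order:
$$R^\mu{}_{\nu\rho\sigma} = \frac{1}{2}\partial_\rho\left(\partial_\sigma h^\mu{}_\nu + \partial_\nu h^\mu{}_\sigma - \partial^\mu h_{\nu\sigma}\right) - (\rho\leftrightarrow\sigma) = \frac{1}{2}\left(\partial_\nu\partial_\rho h^\mu{}_\sigma + \partial_\sigma\partial^\mu h_{\nu\rho} - (\rho\leftrightarrow\sigma)\right) . \tag{6.50}$$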
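The linearized version of the Ricci tensor and the scalar curvature can be computed by simply
performing contractions in the previous expression. Denoting respectively by $h \equiv h^\mu{}_\mu$ and $\Box = \partial^\mu\partial_\mu$
the trace of the perturbation tensor and the d'Alembertian operator, and contracting the indices $\mu$
and $\sigma$ in Eq. (6.50), we get17
$$R_{\nu\rho} = -\frac{1}{2}\left(\Box h_{\nu\rho} + \partial_\nu\partial_\rho h - \partial_\nu\partial_\sigma h^\sigma{}_\rho - \partial_\rho\partial_\sigma h^\sigma{}_\nu\right) , \tag{6.51}$$
which can be further contracted in the indices $\nu$ and $\rho$ to obtain

$$R = R^\nu{}_\nu = \eta^{\nu\rho}R_{\nu\rho} = -\Box h + \partial_\nu\partial_\sigma h^{\nu\sigma} . \tag{6.52}$$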

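Collecting all the terms and inserting them into the definition of the Einstein tensor (6.44), we get
$$\begin{aligned}
G_{\nu\rho} &= -\frac{1}{2}\left(\partial_\nu\partial_\rho h + \Box h_{\nu\rho} - \partial_\nu\partial_\sigma h^\sigma{}_\rho - \partial_\rho\partial_\sigma h^\sigma{}_\nu - \eta_{\nu\rho}\Box h + \eta_{\nu\rho}\partial_\mu\partial_\sigma h^{\mu\sigma}\right) \\
&= -\frac{1}{2}\left(\Box\tilde h_{\nu\rho} + \eta_{\nu\rho}\partial_\mu\partial_\sigma\tilde h^{\mu\sigma} - \partial_\nu\partial_\sigma\tilde h^\sigma{}_\rho - \partial_\rho\partial_\sigma\tilde h^\sigma{}_\nu\right) ,
\end{aligned} \tag{6.53}$$
where in the last step we have defined the so-called trace reverse
$$\tilde h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h , \qquad h_{\mu\nu} = \tilde h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\tilde h , \tag{6.54}$$
which keeps track of the extra terms obtained when passing from $R_{\nu\rho}$ to $G_{\nu\rho}$. The name trace reverse
comes from the property $\tilde h \equiv \tilde h^\mu{}_\mu = -h$. Note also the useful properties

$$\tilde{\tilde h}_{\mu\nu} = h_{\mu\nu} , \qquad G_{\mu\nu} = \tilde R_{\mu\nu} . \tag{6.55}$$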
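The two properties of the trace reverse quoted above (it flips the sign of the trace, and applying it twice gives back the original tensor) can be checked numerically. The following is a minimal sketch in plain Python, assuming the signature (−,+,+,+) used in these notes; the function names and the sample components of h are hypothetical and chosen only for illustration.

```python
# Minkowski metric with signature (-,+,+,+); it is numerically equal to its inverse.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def trace(h):
    """Trace h = eta^{mu nu} h_{mu nu} of a covariant rank-2 tensor."""
    return sum(ETA[m][n] * h[m][n] for m in range(4) for n in range(4))


def trace_reverse(h):
    """Return h_{mu nu} - (1/2) eta_{mu nu} h, as in Eq. (6.54)."""
    tr = trace(h)
    return [[h[m][n] - 0.5 * ETA[m][n] * tr for n in range(4)] for m in range(4)]


# Arbitrary symmetric sample perturbation (hypothetical numbers, not from the notes).
h = [[0.5, 0.125, 0.0, 0.25],
     [0.125, -0.75, 0.5, 0.0],
     [0.0, 0.5, 1.5, -0.25],
     [0.25, 0.0, -0.25, 0.375]]

h_tilde = trace_reverse(h)       # trace-reversed perturbation
h_back = trace_reverse(h_tilde)  # applying the operation twice

print("trace of h       :", trace(h))
print("trace of h_tilde :", trace(h_tilde))
```

The sign flip of the trace follows from $\eta^{\mu\nu}\eta_{\mu\nu} = 4$ in four dimensions, which is the only place where the dimension enters.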

The linearized Einstein equations finally become

$$\Box\tilde h_{\nu\rho} + \eta_{\nu\rho}\partial_\mu\partial_\sigma\tilde h^{\mu\sigma} - \partial_\nu\partial_\sigma\tilde h^\sigma{}_\rho - \partial_\rho\partial_\sigma\tilde h^\sigma{}_\nu = -2\kappa^2 T_{\nu\rho} . \tag{6.56}$$

The resulting expression is rather involved, but fortunately we still have some freedom to play with:
the gauge freedom.
17 The global minus sign comes from the permutation of the last two indices needed to construct the Ricci tensor.

Gauge fixing
Eqs. (6.50) and (6.53), and therefore (6.56), are invariant under the transformation

$$h_{\nu\rho} \longrightarrow h_{\nu\rho} - \partial_\nu\xi_\rho - \partial_\rho\xi_\nu , \tag{6.57}$$

as can be easily verified by performing the explicit computation. This kind of change is
called a gauge transformation, due to the strong analogy with the gauge transformations in
the electromagnetic theory. The simplest way to understand this gauge freedom is to trace
it back to the transformation of the full metric $g_{\mu\nu}$. Consider an infinitesimal transformation
$x^\mu \to \bar x^\mu = x^\mu + \xi^\mu$. Under such a transformation the metric changes to
$$\begin{aligned}
\bar g^{\mu\nu}(x^\rho + \xi^\rho) &= \frac{\partial\bar x^\mu}{\partial x^\rho}\frac{\partial\bar x^\nu}{\partial x^\sigma}\, g^{\rho\sigma}(x^\rho) \\
&= g^{\rho\sigma}\left(\delta^\mu{}_\rho + \partial_\rho\xi^\mu\right)\left(\delta^\nu{}_\sigma + \partial_\sigma\xi^\nu\right) \\
&= g^{\mu\nu}(x^\rho) + g^{\mu\sigma}\partial_\sigma\xi^\nu + g^{\nu\rho}\partial_\rho\xi^\mu .
\end{aligned} \tag{6.58}$$

Expanding the left-hand side of this equation in a Taylor series in $\xi^\rho$ and retaining only the
terms up to linear order, we get

$$\bar g^{\mu\nu}(x^\rho) = g^{\mu\nu}(x^\rho) + \delta g^{\mu\nu} , \tag{6.59}$$

with
$$\delta g^{\mu\nu} \equiv -\xi^\rho\partial_\rho g^{\mu\nu} + g^{\mu\rho}\partial_\rho\xi^\nu + g^{\nu\rho}\partial_\rho\xi^\mu = \nabla^\nu\xi^\mu + \nabla^\mu\xi^\nu . \tag{6.60}$$
In the particular case in which the perturbation is performed around the Minkowski background,
$g_{\mu\nu} = h_{\mu\nu} + \eta_{\mu\nu}$, the covariant derivatives in (6.60) become standard derivatives and we recover
the transformation law (6.57). The linearized theory is invariant under (6.57) because the
full nonlinear theory is invariant under general coordinate transformations! This is extremely
interesting, since it allows us to further simplify the linearized version of the Einstein tensor by
simply performing infinitesimal coordinate transformations, or in other words, changes from
a splitting $g_{\mu\nu} = \eta_{\mu\nu} + h^{\rm old}_{\mu\nu}$ to a different splitting $g_{\mu\nu} = \eta_{\mu\nu} + h^{\rm new}_{\mu\nu}$. A simple inspection of
Eq. (6.56) reveals that an interesting condition to be satisfied by the trace reverse tensor in
the new coordinate system would be the tensor analog of the Lorenz gauge $\partial_\mu A^\mu = 0$ in the
electromagnetic theorya, namely
$$\partial_\rho\tilde h^{\nu\rho}_{\rm new} = 0 . \tag{6.61}$$
Let us see if we are allowed to choose such a gauge. The change in the trace reverse tensor $\tilde h^{\mu\nu}$
follows directly from Eqs. (6.54) and (6.57)
$$\tilde h^{\nu\rho}_{\rm new} = \tilde h^{\nu\rho}_{\rm old} - \partial^\nu\xi^\rho - \partial^\rho\xi^\nu + \eta^{\nu\rho}\partial_\mu\xi^\mu . \tag{6.62}$$

Taking the derivative of this equation we get

$$\partial_\rho\tilde h^{\nu\rho}_{\rm new} = \partial_\rho\tilde h^{\nu\rho}_{\rm old} - \Box\xi^\nu . \tag{6.63}$$
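The step from (6.62) to (6.63) can be spelled out explicitly. The short derivation below only uses the fact that partial derivatives commute on the flat background; the box symbol stands for the d'Alembertian $\partial_\rho\partial^\rho$.

```latex
\begin{aligned}
\partial_\rho \tilde h^{\nu\rho}_{\rm new}
  &= \partial_\rho \tilde h^{\nu\rho}_{\rm old}
     - \partial_\rho\partial^\nu \xi^\rho
     - \partial_\rho\partial^\rho \xi^\nu
     + \eta^{\nu\rho}\,\partial_\rho\partial_\mu \xi^\mu \\
  &= \partial_\rho \tilde h^{\nu\rho}_{\rm old}
     - \partial^\nu\partial_\rho \xi^\rho
     - \Box\, \xi^\nu
     + \partial^\nu\partial_\mu \xi^\mu \\
  &= \partial_\rho \tilde h^{\nu\rho}_{\rm old} - \Box\, \xi^\nu .
\end{aligned}
```

The second and fourth terms cancel after relabelling the dummy index, so only the d'Alembertian of $\xi^\nu$ survives.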

In order to satisfy the gauge fixing (6.61), $\xi^\nu$ must be a solution of the inhomogeneous wave
equation
$$\Box\xi^\nu = \partial_\rho\tilde h^{\nu\rho}_{\rm old} . \tag{6.64}$$
The existence of a solution transforming from an arbitrary $h_{\mu\nu}$ to the so-called Lorenz gauge
$\partial_\rho\tilde h^{\nu\rho}_{\rm new} = 0$ is guaranteed for sufficiently well behaved $\partial_\rho\tilde h^{\nu\rho}_{\rm old}$ b. In fact, the choice is not unique
since we can always add to it any solution of the homogeneous wave equation $\Box\xi^\nu_H = 0$ and the
result will still obey $\Box(\xi^\nu + \xi^\nu_H) = \partial_\rho\tilde h^{\nu\rho}_{\rm old}$. The Lorenz gauge $\partial_\rho\tilde h^{\nu\rho}_{\rm new} = 0$ is actually a set of
gauges.
a It “kills” three of the four terms in (6.53).
b As you learnt in your electrodynamics course, the solution of this equation can be obtained by means of the
retarded Green functions of the d'Alembertian operator.
[Figure: the origin O, the source point x′, the observation point P at x, and the separation vector x − x′.]

In view of the previous discussion, we realize that most of the terms in the left-hand side of Eq. (6.56)
merely serve to maintain gauge invariance. When the Hilbert gauge condition18 $\partial_\rho\tilde h^{\nu\rho} = 0$ is imposed,
the linearized version of the Einstein equation simplifies dramatically:
$$\Box\tilde h_{\mu\nu} = -2\kappa^2 T_{\mu\nu} . \tag{6.65}$$
This equation is formally identical to the Maxwell equations in the Lorenz gauge and can be solved
by using the Green's function method.

Green’s functions
Consider a differential wave equation of the form

$$\Box f(t, x) = s(t, x) , \tag{6.66}$$

with $f(t, x)$ a radiation field and $s(t, x)$ a source term. A Green's function $G(t, x; t', x')$ is
defined as the field generated at the point $(t, x)$ by a delta function source at $(t', x')$, i.e.

$$\Box G(t, x; t', x') = \delta(t - t')\,\delta(x - x') . \tag{6.67}$$

The field due to the actual source $s(t, x)$ can be obtained by integrating the Green's function
against the source:
$$f(t, x) = \int dt'\, d^3x'\, G(t, x; t', x')\, s(t', x') . \tag{6.68}$$

Physically the Green's function approach merely reflects the fact that (6.66) is a linear equation.
The full solution of the equation can be obtained by solving for a point source and adding the
resulting waves from each point inside the source.

The Green's function associated with the wave operator $\Box$ is very well known (see for instance
Jackson's book on electrodynamics):
$$G(t, x; t', x') = -\frac{\delta\left(t' - [t - |x - x'|]\right)}{4\pi|x - x'|} . \tag{6.69}$$

Exercise
Derive this equation in case you haven’t done it before.

Inserting (6.69) into (6.65), we get19

$$\tilde h_{\mu\nu} = \frac{\kappa^2}{2\pi}\int\frac{T_{\mu\nu}(t - |x - x'|, x')}{|x - x'|}\, d^3x' , \tag{6.70}$$
18 This gauge is also called Einstein gauge, harmonic gauge, de Donder gauge, Fock gauge or, in analogy with
electromagnetism, Lorenz gauge.

19 Note that we can always add to this particular solution an arbitrary solution of the homogeneous wave equation

(vacuum). As in electromagnetism, the metric perturbation consists of the field generated by the source plus wave-like
vacuum solutions propagating at the speed of light.

which is analogous to the relation between the vector potential $A_\mu$ and the current $J_\mu$ in
electromagnetism. Note the argument $t - |x - x'|$, which reads $t - |x - x'|/c$ once the factors of $c$ are restored. Eq. (6.70) is a retarded solution 20, taking
into account the lag associated with the propagation of information from events at $x'$ to position $x$.
Gravitational influences propagate at the finite speed of light. Action at a distance is gone forever!
We will come back to this point in the next chapter, but first let me finish our main task: determining
the value of the constants $\kappa^2$ and $\Lambda$. To do that, let me consider the case we know best: the
gravitational field created by a static spherical mass distribution of total mass $M$. The energy-momentum
tensor for such a system has only one non-vanishing component (cf. Eq. (6.45))

$$T^{\mu\nu} = \left(\rho + \frac{\Lambda}{\kappa^2}\right)\mathrm{diag}\,(1, 0, 0, 0) . \tag{6.72}$$
Plugging this into the time independent version of Eq. (6.70), we get
$$\tilde h_{00} = \frac{\kappa^2}{2\pi}\int\frac{\rho(x')}{|x - x'|}\, d^3x' + \frac{1}{2\pi}\int\frac{\Lambda}{|x - x'|}\, d^3x' , \qquad \tilde h_{0i} = 0 , \qquad \tilde h_{ij} = 0 . \tag{6.73}$$
If the mass distribution is concentrated around the origin ($x' = 0$), the component $\tilde h_{00}$ evaluated at a
distance $r = |x - x'|$ becomes21
$$\tilde h_{00} = \frac{\kappa^2}{2\pi r}\int\rho(x')\, d^3x' + \frac{1}{2\pi}\frac{\Lambda}{r}\int d^3x' = \frac{\kappa^2 M}{2\pi r} + \frac{2}{3}\Lambda r^2 \tag{6.74}$$
with
$$M = \int\rho(x')\, d^3x' \tag{6.75}$$

the total mass of our spherical distribution. Taking now into account that $\tilde h = \eta^{\mu\nu}\tilde h_{\mu\nu} = -\tilde h_{00}$ and
using the definition (6.54) we get
$$h_{00} = h_{11} = h_{22} = h_{33} = \frac{\kappa^2 M}{4\pi r} + \frac{1}{3}\Lambda r^2 . \tag{6.76}$$
Comparing this result with that obtained by performing the weak field limit of the geodesic equation in
the $\Lambda = 0$ case, $h^{\Lambda=0}_{00} = -2\Phi = 2GM/r$, allows us to identify the sought-for proportionality constant
$$\kappa^2 = 8\pi G . \tag{6.77}$$
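To get a feeling for how weak the field actually is, one can evaluate $h_{00} = 2GM/r$ (with the factor $1/c^2$ restored) for a familiar system. The sketch below does this for the Sun at the distance of the Earth's orbit; the rounded values of the physical constants and the function name are assumptions made here for illustration, not quantities taken from the notes.

```python
# Rounded SI values (assumed for this illustration).
G = 6.674e-11      # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8        # speed of light in m/s
M_SUN = 1.989e30   # solar mass in kg
AU = 1.496e11      # Earth-Sun distance in m


def h00(mass_kg, r_m):
    """Dimensionless perturbation h_00 = 2GM/(r c^2) for Lambda = 0."""
    return 2.0 * G * mass_kg / (r_m * C ** 2)


h00_earth_orbit = h00(M_SUN, AU)
print(f"h_00 at 1 AU from the Sun: {h00_earth_orbit:.2e}")
```

The result is of order $10^{-8}$, so the condition $|h_{\mu\nu}| \ll 1$ in (6.48) is very well satisfied throughout the solar system.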
When $\Lambda \neq 0$, the Newtonian potential becomes modified at long distances
$$\Phi = -\frac{GM}{r} - \frac{\Lambda}{6}r^2 \tag{6.78}$$
and the line element takes the form
$$ds^2 = -\left(1 - \frac{2GM}{r} - \frac{1}{3}\Lambda r^2\right)dt^2 + \left(1 + \frac{2GM}{r} + \frac{1}{3}\Lambda r^2\right)dX^2 , \tag{6.79}$$
with $dX^2 \equiv dx^2 + dy^2 + dz^2$. In Newtonian terms, a positive cosmological constant ($\Lambda > 0$) gives rise
to a repulsive force per unit mass whose strength increases linearly with the distance
$$\mathbf{f} = -\frac{GM}{r^2}\,\mathbf{u}_r + \frac{\Lambda}{3}\, r\,\mathbf{u}_r . \tag{6.80}$$
20 The retarded solution is obtained by imposing the Kirchhoff-Sommerfeld “no-incoming radiation” boundary condition
at past null infinity
$$\lim_{t\to\infty}\left(\partial_r + \partial_t\right)\left(r\tilde h_{\mu\nu}\right) = 0 , \tag{6.71}$$
with the limit taken along any surface with $ct + r = $ constant, together with the condition that $r\tilde h_{\mu\nu}$ and $r\partial_\rho\tilde h_{\mu\nu}$ are
bounded in this limit.
21 Note that the integral is over the primed variables!

Cosmological constant
If $\Lambda \neq 0$, it must be at least very small, $\rho_\Lambda \ll \rho_{\rm matter}$, to avoid any observational effect in
those situations in which Newton's theory of gravity successfully explains the observations.
Taking into account, for instance, that we do not see any modification of the Newtonian theory
of gravity within the solar system, we can set the limit

$$|\rho_\Lambda| = \frac{|\Lambda|}{8\pi G} \leq \rho_{\rm Solar} \quad\longrightarrow\quad |\rho_\Lambda| \leq \frac{3M}{4\pi R^3_{\rm Pluto}} \simeq 10^{-29}\ {\rm GeV}^4 \tag{6.81}$$

which, as assumed, makes the contribution of $\Lambda$ completely negligible on the scale of the systems
we will be interested in during this coursea.
a It will however play a fundamental role at larger scales, such as those you will consider in your Cosmology
course.
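The number quoted in (6.81) can be reproduced with a few lines of arithmetic. The following sketch assumes that M is the solar mass and that the radius is the approximate semi-major axis of Pluto's orbit; these inputs, the rounded conversion factors and the function names are assumptions made for this illustration.

```python
import math

M_SUN_KG = 1.989e30        # solar mass in kg (assumed source mass)
R_PLUTO_M = 5.9e12         # approx. semi-major axis of Pluto's orbit in m (assumed)
KG_TO_GEV = 5.61e26        # rest energy of 1 kg expressed in GeV
HBAR_C_GEV_M = 1.973e-16   # hbar * c in GeV * m, converts 1/m into GeV


def mean_density_kg_m3(mass_kg, radius_m):
    """Mean density 3M / (4 pi R^3) of a mass spread over a sphere of radius R."""
    return 3.0 * mass_kg / (4.0 * math.pi * radius_m ** 3)


def to_gev4(rho_kg_m3):
    """Convert an energy density from kg/m^3 to natural units, GeV^4."""
    return rho_kg_m3 * KG_TO_GEV * HBAR_C_GEV_M ** 3


rho_solar = mean_density_kg_m3(M_SUN_KG, R_PLUTO_M)
rho_bound_gev4 = to_gev4(rho_solar)
print(f"rho_Solar = {rho_solar:.2e} kg/m^3 = {rho_bound_gev4:.1e} GeV^4")
```

With these inputs the bound comes out close to $10^{-29}\ {\rm GeV}^4$, in agreement with the order of magnitude given in the text.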

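| | Linearized Gravity | Electromagnetism |
|---|---|---|
| Field equation | Einstein equation with $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ | Maxwell equations |
| Basic potentials | Linearized metric $h_{\mu\nu}(x)$ | 4-vector potential $A^\mu = (\Phi, \mathbf{A})$ |
| Sources | Energy-momentum tensor $T^{\mu\nu}$ | 4-vector current $J^\mu = (\rho, \mathbf{J})$ |
| Lorenz gauge | $\partial_\mu\tilde h^{\mu\nu} = 0$, with $\tilde h_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h$ | $\partial_\mu A^\mu = 0$ |
| Sourced wave equation | $\Box\tilde h_{\mu\nu} = -16\pi G\,T_{\mu\nu}$ | $\Box A_\mu = J_\mu$ |
| Solution | $\tilde h_{\mu\nu} = 4G\int\frac{[T_{\mu\nu}]_{\rm ret}}{\lvert x - x'\rvert}\, d^3x'$ | $A_\mu = \frac{1}{4\pi}\int\frac{[J_\mu]_{\rm ret}}{\lvert x - x'\rvert}\, d^3x'$ |

Table 6.2: Linearized Einstein equations vs Maxwell equations.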

CHAPTER 7
GRAVITATIONAL WAVES

One of the most fascinating predictions of General Relativity is the existence of gravitational waves.
Einstein's theory of gravity abandons the Newtonian conception of space and time as a rigid structure
in which the particles move. Spacetime is now alive and can curve, move and vibrate!

7.1 A bunch of questions

In this chapter we will try to answer the following questions

• How are gravitational waves generated?

• How do they propagate?
• Can we detect them? How?

• Why are they interesting?

Let me start by answering the simplest question, the second one.

7.2 Propagation in vacuum

The starting point for any discussion on gravitational waves is the time-dependent version of the
linearized Einstein equations in the Lorenz gauge1

$$\Box\tilde h_{\mu\nu} = -16\pi G\,T_{\mu\nu} . \tag{7.1}$$

1 The derivation of this equation was performed around a Minkowski background. A more general treatment for
perturbations around an arbitrary background $g^{(0)}_{\mu\nu}$ exists. The so-called Isaacson shortwave approximation can still be
applied in those cases in which the perturbative scale of the waves $h_{\mu\nu}$ is much smaller than the curvature scale of the
background $g^{(0)}_{\mu\nu}$.

Consider the propagation of the perturbation $h_{\mu\nu}$ far away from the generating source. In this case,
the energy-momentum tensor in Eq. (7.1) can be set to zero and we are left with the homogeneous
equation
$$\Box\tilde h_{\mu\nu} = 0 . \tag{7.2}$$
The resulting vacuum case is quite particular since it still contains a residual gauge freedom on top of
the Lorenz condition
$$\partial^\nu\tilde h_{\mu\nu} = 0 . \tag{7.3}$$
Looking at Eqs. (6.62) and (6.63), written here with lowered indices,

$$\tilde h^{\rm new}_{\mu\nu} = \tilde h^{\rm old}_{\mu\nu} - \left(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \eta_{\mu\nu}\partial_\rho\xi^\rho\right) , \tag{7.4}$$

$$\partial^\nu\tilde h^{\rm new}_{\mu\nu} = \partial^\nu\tilde h^{\rm old}_{\mu\nu} - \Box\xi_\mu , \tag{7.5}$$
we realize that we can still make an infinitesimal coordinate transformation $x^\mu \to x^\mu + \xi^\mu$ with $\Box\xi_\mu = 0$
without modifying the gauge condition (7.3). Indeed, if $\Box\xi_\mu = 0$, we automatically have2

$$\Box\underbrace{\left(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \eta_{\mu\nu}\partial_\rho\xi^\rho\right)}_{\xi_{\mu\nu}} = 0 , \tag{7.6}$$

meaning that we can always subtract the combination $\xi_{\mu\nu}$ from $\tilde h_{\mu\nu}$ in Eq. (7.2). The quantity $\xi_{\mu\nu}$
depends on 4 arbitrary functions $\xi_\mu$, which can be chosen at will to impose 4 extra conditions on the
perturbation $\tilde h_{\mu\nu}$. In particular, we can choose $\xi_0$ and $\xi_i$ in such a way that $\tilde h = 0$ and $\tilde h_{0i} = 0$. The
condition of vanishing trace $\tilde h = 0$ erases the distinction between the perturbation and its trace reverse

$$\tilde h_{\mu\nu} = h_{\mu\nu} . \tag{7.7}$$

On the other hand, the condition $\tilde h_{0i} = h_{0i} = 0$ applied to the $\mu = 0$ component of the Lorenz gauge
$\partial^\nu\tilde h_{\mu\nu} = 0$ implies that $h_{00}$ is constant in time

$$\partial^0\tilde h_{00} + \partial^i\tilde h_{0i} = \partial^0 h_{00} + \partial^i h_{0i} = 0 \quad\longrightarrow\quad \partial^0 h_{00} = 0 . \tag{7.8}$$

This component corresponds to the static part of the gravitational interaction, i.e. to the Newtonian
potential of the source which gave rise to the gravitational wave. The gravitational wave itself is the
time-dependent part. As far as gravitational waves are concerned, the condition $\partial^0 h_{00} = 0$ really
means $h_{00} = 0$.
The discussion presented above defines the so-called transverse-traceless (TT) or radiation gauge

$$h_{0\mu} = 0 , \qquad h^i{}_i = 0 , \qquad \partial^j h_{ij} = 0 , \tag{7.9}$$

which completely fixes all the local ambiguities and leaves us with 10 − 4 − 4 = 2 degrees of freedom, the
physical ones. The existence of such a gauge is guaranteed as long as there are no sources. Although
inside the source we are still allowed to perform a coordinate transformation with $\Box\xi_\mu = 0$ (or
equivalently $\Box\xi_{\mu\nu} = 0$) on top of the Lorenz gauge, we cannot set to zero any further component in $\tilde h_{\mu\nu}$,
since $\Box\tilde h_{\mu\nu} \neq 0$. The situation is completely analogous to what happens in Classical Electrodynamics.
Maxwell equations can always be reduced to the form $\Box A^\mu = J^\mu$ by imposing the Lorenz gauge
condition $\partial_\mu A^\mu = 0$. Once there, we still have the freedom to implement a residual gauge transformation
$A_\mu \longrightarrow A_\mu + \partial_\mu\xi$ with $\xi$ satisfying the condition $\Box\xi = 0$. In the absence of sources, the function $\xi$ can
be used to get rid of one of the components in $A^\mu$, let's say $A^0$. The Lorenz gauge reduces in this case
to a transversality condition on $A^i$, namely $\partial_i A^i = 0$, and we are left with 4 − 1 − 1 = 2 polarizations.
If instead $J^0 \neq 0$, we have $\Box A^0 \neq 0$ and there is no choice of $\xi$ able to satisfy simultaneously $\Box\xi = 0$
and $A^0 = 0$.
2 The flat d'Alembertian $\Box$ commutes with $\partial_\mu$.

7.2.1 Plane wave solutions

Eq. (7.2) admits a plane wave solution3 of the form4
$$\tilde h_{\mu\nu} = A_{\mu\nu}\, e^{ik_\sigma x^\sigma} = A_{\mu\nu}\, e^{ik_i x^i} e^{-i\omega t} , \tag{7.10}$$
with $A_{\mu\nu}$ a symmetric rank-2 tensor called polarization tensor and $k^\mu = (\omega, \mathbf{k})$ a wave 4-vector
satisfying the normalization condition5
$$\Box\tilde h_{\mu\nu} = \eta^{\rho\sigma}\partial_\rho\partial_\sigma\tilde h_{\mu\nu} = -k_\sigma k^\sigma\tilde h_{\mu\nu} = 0 \quad\longrightarrow\quad k_\sigma k^\sigma = 0 . \tag{7.11}$$
Since $k^\sigma$ is a null 4-vector, the dispersion relation takes the form $\omega = |\mathbf{k}|$, and gravitational perturbations
propagate at the speed of light. The transverse-traceless gauge (7.9) translates into the following
restrictions on the components of the symmetric rank-2 tensor $A_{\mu\nu}$
$$A_{0\mu} = 0 , \qquad A^i{}_i = 0 , \qquad k^j A_{ij} = 0 . \tag{7.12}$$
To clarify our findings, let me consider a particular case. Imagine a wave propagating in the z-direction.
In this case, $k^\mu = (\omega, 0, 0, k^3) = (\omega, 0, 0, \omega)$ and $A_{3i} = 0$, leaving us with only 4 non-vanishing
components, namely $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$. Since $A_{ij}$ is also symmetric and traceless, these components
must satisfy $A_{11} = -A_{22} \equiv h_+$ and $A_{12} = A_{21} \equiv h_\times$, with $h_+$ and $h_\times$ the so-called “plus” and
“cross” polarizations. Written in matrix form, the coefficient $A_{\mu\nu}$ in the transverse-traceless gauge
takes the form
$$A^{\rm TT}_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & h_+ & h_\times & 0 \\ 0 & h_\times & -h_+ & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{7.13}$$
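The conditions (7.12) are easy to verify for the matrix (7.13). The sketch below builds the polarization tensor for given amplitudes and checks that it has no time components and is traceless, transverse and symmetric for a wave vector along z, while it fails the transversality test for a wave vector along x. The helper names and the sample amplitudes are hypothetical.

```python
def polarization_tt(h_plus, h_cross):
    """Polarization tensor A_{mu nu} of Eq. (7.13) for propagation along z."""
    return [[0.0, 0.0, 0.0, 0.0],
            [0.0, h_plus, h_cross, 0.0],
            [0.0, h_cross, -h_plus, 0.0],
            [0.0, 0.0, 0.0, 0.0]]


def is_tt(A, k_spatial, tol=1e-12):
    """Check A_{0 mu} = 0, A^i_i = 0, k^j A_{ij} = 0 and the symmetry of A."""
    no_time = all(abs(A[0][m]) < tol and abs(A[m][0]) < tol for m in range(4))
    traceless = abs(sum(A[i][i] for i in range(1, 4))) < tol
    transverse = all(
        abs(sum(k_spatial[j - 1] * A[i][j] for j in range(1, 4))) < tol
        for i in range(1, 4)
    )
    symmetric = all(abs(A[m][n] - A[n][m]) < tol for m in range(4) for n in range(4))
    return no_time and traceless and transverse and symmetric


omega = 2.0                      # sample frequency (arbitrary units)
A = polarization_tt(0.3, -0.1)   # sample amplitudes h_+ and h_x

ok_z = is_tt(A, (0.0, 0.0, omega))   # wave vector along z
ok_x = is_tt(A, (omega, 0.0, 0.0))   # wave vector along x
print("TT for k along z:", ok_z)
print("TT for k along x:", ok_x)
```

Spatial indices are raised and lowered with the Kronecker delta, so no distinction between upper and lower spatial indices is needed in the code.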

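| | Linearized Gravity | Electromagnetism |
|---|---|---|
| Plane wave solution | $\tilde h_{\mu\nu} = A_{\mu\nu}\, e^{ik_\sigma x^\sigma}$ | $A_\mu = a_\mu\, e^{ik_\sigma x^\sigma}$ |
| Lorenz gauge | $k_\sigma k^\sigma = 0$, $k^\mu A_{\mu\nu} = 0$ | $k_\sigma k^\sigma = 0$, $k^\mu a_\mu = 0$ |
| TT gauge | $h_{00} = 0$, $h_{i0} = 0$, $\partial_i h^{ij} = 0$, $h^i{}_i = 0$; $h_{ij} = h^{\rm TT}_{ij}$ (symmetric, transverse and traceless) | $A_0 = 0$, $\partial_i A^i = 0$; $A_i = A^{\rm T}_i$ (transverse) |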

7.3 Interaction of gravitational waves with matter

Once we have learned how to describe the propagation of gravitational waves, the next step is to
discuss their interactions with matter, or in other words, the way of detecting them. Although it
3 This is just the paradigmatic case. In the linear theory, we are always allowed to build an arbitrary wave-like

solution by simply considering a superposition of these plane waves.

4 The real part of the complex-valued expression (7.10) is assumed to be taken at the end of the computation, as

usual.
5 The components $A_{\mu\nu}$ are assumed to be constant.

might seem natural to think that we can learn something interesting by considering the geodesic
equation
$$\frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\rho\nu}u^\rho u^\nu = 0 \tag{7.14}$$
for a test particle in the gravitational field of the wave, this is not the case. To see this, consider our
test particle to be at rest, $u^\mu = (1, 0, 0, 0)$, at an initial time, let's say, $\tau = \tau_0$. Evaluating the geodesic
equation (7.14) at this time we get

$$\left.\frac{du^\mu}{d\tau}\right|_{\tau=\tau_0} = -\Gamma^\mu{}_{00}\Big|_{\tau=\tau_0} = -\frac{1}{2}\eta^{\mu\sigma}\left(2\partial_0 h^{\rm TT}_{0\sigma} - \partial_\sigma h^{\rm TT}_{00}\right)\Big|_{\tau=\tau_0} , \tag{7.15}$$

which is identically zero since both $h_{0i}$ and $h_{00}$ are zero in the transverse-traceless gauge. The particle
does not seem to experience any acceleration; it completely ignores the wave! Does this mean that
gravitational waves have no effect on matter? Certainly not! It simply reflects the fact that the Riemannian
spacetime is locally flat at any given point.

Gauge freedom in General Relativity

In General Relativity, gauge freedom means freedom to choose the coordinates. The transverse-
traceless gauge is a choice of frame which moves with the particle at the lowest order of approx-
imation. The coordinates stretch themselves, in response to the arrival of the wave, in such a
way that the position of the free test mass initially at rest does not change.

To detect gravitational waves we must go beyond a single point in spacetime and explore its
neighborhood. Consider the wave (7.13) passing through a ring of test particles in the x−y plane. Let's denote
by $v^\mu$ the separation vector from the center of the ring to a test particle and use the geodesic deviation equation

$$\frac{D^2v^\mu}{d\tau^2} = \eta^{\mu\lambda}R_{\lambda\nu\rho\sigma}u^\nu u^\rho v^\sigma . \tag{7.16}$$
The linearized Riemann tensor
$$R_{\lambda\nu\rho\sigma} = \frac{1}{2}\left(\partial_\nu\partial_\rho h_{\lambda\sigma} + \partial_\sigma\partial_\lambda h_{\nu\rho} - (\rho\leftrightarrow\sigma)\right) , \tag{7.17}$$
generated by the passing gravitational wave is a gauge invariant quantity, meaning that we can
compute it in any frame without affecting the result. Clearly the best choice is the TT gauge since
the form of $h_{\mu\nu}$ in this frame is extremely simple. Assuming the particles to be moving slowly,
$u^\nu \approx (1, 0, 0, 0)$, Eq. (7.16) becomes6

$$\frac{d^2v^i}{dt^2} = R^i{}_{00j}v^j = \frac{1}{2}\frac{d^2h^{\rm TT}_{ij}}{dt^2}v^j . \tag{7.18}$$
The resulting equation is extremely simple. The response of the particles can be understood in purely
Newtonian terms, without any further reference to General Relativity. Since $h^{\rm TT}_{ij}$ is traceless, the
effective Newtonian force per unit mass

$$F_i \equiv \frac{1}{2}\frac{d^2h^{\rm TT}_{ij}}{dt^2}v^j , \tag{7.19}$$
is divergence free, $\partial^iF_i = 0$, meaning that there are no sources or sinks for the lines of force.
Note also that, as in the electromagnetic case, only the directions transverse to the wave
propagation ($v^x$ and $v^y$) are affected (cf. Eq. (7.13)). If a particle is initially at z = 0, it will remain at z = 0.
6 Note that, at leading order in $h_{\mu\nu}$, $\tau = t$.

A pictorial representation of $F^i$ can be obtained by drawing the lines of force for the “plus” and “cross”
polarizations

These lines are defined in such a way that at each point (x, y) they go in the direction of the force
with a density proportional to the modulus of the force7. The effect of the components $h_+$ and $h_\times$
on the ring of particles is in clear agreement with the quadrupolar pattern displayed in the previous
figure:

• “Plus” polarization: $h_+ \neq 0$ and $h_\times = 0$:

In this case,
$$\frac{d^2v^x}{dt^2} = \frac{v^x}{2}\frac{d^2}{dt^2}\left(h_+e^{ik_\sigma x^\sigma}\right) , \qquad \frac{d^2v^y}{dt^2} = -\frac{v^y}{2}\frac{d^2}{dt^2}\left(h_+e^{ik_\sigma x^\sigma}\right) , \tag{7.20}$$
whose solution, to lowest order of accuracy, can be written as
$$v^x = v^x_0 + \frac{1}{2}h_+e^{ik_\sigma x^\sigma}v^x_0 , \qquad v^y = v^y_0 - \frac{1}{2}h_+e^{ik_\sigma x^\sigma}v^y_0 , \tag{7.21}$$

with $v^x_0$ and $v^y_0$ standing for the initial separation of the particles in the x and y directions. A
“+-polarized” wave makes the particles initially located at $v^x_0$ and $v^y_0$ bounce back and forth in the
x and y directions respectively. This fact, together with the 180° phase difference associated with
the minus sign in Eq. (7.21), gives rise to the following pattern in the ring of particles

• “Cross” polarization: $h_\times \neq 0$ and $h_+ = 0$:

In this case, the separation vector in a given direction depends also on the initial separation
vector in the orthogonal direction
$$v^x = v^x_0 + \frac{1}{2}h_\times e^{ik_\sigma x^\sigma}v^y_0 , \qquad v^y = v^y_0 + \frac{1}{2}h_\times e^{ik_\sigma x^\sigma}v^x_0 . \tag{7.22}$$

A “×-polarized” wave gives rise to a stretching and a squeezing along the $(2^{-1/2}, 2^{-1/2}, 0)$ and
$(-2^{-1/2}, 2^{-1/2}, 0)$ directions. The ring of particles bounces back and forth describing a cross
shape.

7 Observe that the second figure can be obtained by rotating the first one 45°.
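The solutions (7.21) and (7.22) can be explored numerically. The sketch below displaces a ring of test particles under each polarization and checks that the “cross” pattern coincides with the “plus” pattern rotated by 45°. The function names, the amplitude (greatly exaggerated) and the `phase` argument, which stands for the real part of the oscillating exponential at a fixed instant, are assumptions made for this illustration.

```python
import math


def displaced(v0, h_plus=0.0, h_cross=0.0, phase=1.0):
    """Eqs. (7.21)-(7.22): v^i = v0^i + (1/2) h_ij v0^j at a fixed instant."""
    x0, y0 = v0
    x = x0 + 0.5 * phase * (h_plus * x0 + h_cross * y0)
    y = y0 + 0.5 * phase * (h_cross * x0 - h_plus * y0)
    return (x, y)


def rotate(v, angle):
    """Rotate a 2D vector counter-clockwise by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def ring(n=8, radius=1.0):
    """Initial positions of n test particles on a circle in the x-y plane."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


H = 0.2  # exaggerated amplitude, for illustration only

plus_ring = [displaced(v, h_plus=H) for v in ring()]
cross_ring = [displaced(v, h_cross=H) for v in ring()]

# "Plus" deformation applied in a frame rotated by 45 degrees.
quarter = math.pi / 4
rotated_plus = [rotate(displaced(rotate(v, -quarter), h_plus=H), quarter)
                for v in ring()]

for c, r in zip(cross_ring, rotated_plus):
    print(f"cross: ({c[0]:+.3f}, {c[1]:+.3f})   rotated plus: ({r[0]:+.3f}, {r[1]:+.3f})")
```

Changing the sign of `phase` gives the configuration half a period later, in which the stretched and squeezed directions are interchanged.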

The components h+ and h× constitute the two independent linear polarizations of the gravitational
wave and play a similar role to the vertical and horizontal polarization in electromagnetic waves.
Different superpositions of these two modes can be always considered within the linear theory.

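| | Linearized Gravity | Electromagnetism |
|---|---|---|
| Wave vector | $k^\mu = (\omega, 0, 0, \omega)$ | $k^\mu = (\omega, 0, 0, \omega)$ |
| Polarization modes | $A_{11} = -A_{22} \equiv h_+ \neq 0$; $A_{12} = A_{21} \equiv h_\times \neq 0$; $h_{R,L} = \frac{1}{\sqrt{2}}(h_+ \pm ih_\times)$ | $a_1 \neq 0$; $a_2 \neq 0$; $a_{R,L} = \frac{1}{\sqrt{2}}(a_1 \pm ia_2)$ |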

7.3.1 Laser interferometers

The most widespread gravitational wave detectors are laser-based interferometric detectors, whose basic
operation can be summarized as follows: a laser beam is sent through a beam splitter and directed
towards two very long resonant cavities. The light is reflected by mirrors at the end of the cavities
and sent back to the beam splitter, which transmits half of each beam and reflects the other half.
One part of each beam then goes back to the laser, while the other parts are combined to reach
a photodetector in which the interference pattern is monitored. If a gravitational wave of amplitude
h happens to pass through the detector, its arms will be periodically shortened in one direction
and lengthened in the other, giving rise to a change in the interference pattern.

The total difference in length between the two arms can be derived8 from Eq. (7.18)

$$\frac{\Delta L}{L} \sim h . \tag{7.23}$$
It is interesting to put in some numbers. If we consider for instance the typical amplitude of gravitational
waves emitted by a rotating binary system9, $h \simeq 10^{-21}$, and a typical detector such as LIGO or Virgo
with arm lengths of 3−4 km, we get a change $\Delta L \simeq 10^{-16}$ cm.
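The estimate above is a one-line computation, sketched here for the two arm lengths quoted in the text. The function name is hypothetical and the order-one angular factors mentioned in footnote 8 are ignored.

```python
def arm_length_change_cm(strain, arm_length_km):
    """Delta L ~ h * L from Eq. (7.23), returned in centimetres."""
    return strain * arm_length_km * 1.0e5  # 1 km = 1e5 cm


delta_l_3km = arm_length_change_cm(1e-21, 3.0)
delta_l_4km = arm_length_change_cm(1e-21, 4.0)
print(f"Delta L for a 3 km arm: {delta_l_3km:.1e} cm")
print(f"Delta L for a 4 km arm: {delta_l_4km:.1e} cm")
```

Both values are a few times $10^{-16}$ cm, far smaller than the size of an atomic nucleus (about $10^{-13}$ cm), which gives an idea of the experimental challenge.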

7.4 The helicity of the graviton

In a hypothetical quantum theory of gravity, the gravitational waves presented in this Chapter would
be quantized into particles satisfying the relativistic wave equation of a massless particle. The spin of
the graviton can be inferred from the transformation properties of the classical field of the particle
under rotations.

8 We are implicitly assuming that the wave propagates orthogonally to the plane of the detector. In the general case,

we get some angular coefficients of order 1.

9 You will determine this number in the exercise session.

The Poincaré group has two physically interesting representations:

• Massive representation: These representations are characterized by the mass $m^2 = -p_\mu p^\mu$
and the spin s, which can take integer or half-integer values s = 0, 1/2, 1, . . .. The
representation with spin s has dimension 2s + 1. Example: A massive spin-1 particle has
three degrees of freedom.

• Massless representation: These representations are characterized by $p_\mu p^\mu = 0$ and
a definite value of the helicity, which is defined as the projection of the total angular
momentum (or the spin) in the direction of motiona

$$h = \mathbf{J}\cdot\mathbf{n} = (\mathbf{L} + \mathbf{S})\cdot\mathbf{n} = \mathbf{S}\cdot\mathbf{n} . \tag{7.24}$$

Under a rotation of angle $\theta$ around that direction a helicity eigenstate $|h\rangle$ transforms as

$$|h\rangle \longrightarrow e^{ih\theta}|h\rangle . \tag{7.25}$$

There are always two helicity states $h = \pm s$, corresponding to the alignment or
counter-alignment of the spin and the momentum.
a Unfortunately the traditional symbol h for the helicity coincides with some of the notations used in this
chapter.

Applying a global rotation of angle $\theta$

$$R_{ij} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{7.26}$$

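to our plane wave (7.13), we get

$$A^{\rm TT}_{ij} = (R^{-1})^k{}_i\,(R^{-1})^l{}_j\,A^{\rm TT}_{kl} \longrightarrow \begin{pmatrix} h_+\cos 2\theta + h_\times\sin 2\theta & h_\times\cos 2\theta - h_+\sin 2\theta & 0 \\ h_\times\cos 2\theta - h_+\sin 2\theta & -h_+\cos 2\theta - h_\times\sin 2\theta & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{7.27}$$
In the quantum theory the two polarization amplitudes $h_+$ and $h_\times$ become annihilation operators of
gravitons and the circular polarization operators, $h_{R,L} \equiv (h_+ \pm ih_\times)$, will transform as

$$h_{R,L} \longrightarrow U h_{R,L} U^\dagger \tag{7.28}$$

with $U = e^{iJ_3\theta/\hbar}$. Thus

$$h_{R,L} \longrightarrow e^{\mp 2i\theta}h_{R,L} \tag{7.29}$$
showing that the spin of the graviton is 2, in units of $\hbar$.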
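The passage from (7.27) to (7.29) can be checked with complex arithmetic. The sketch below rotates the pair of amplitudes according to (7.27), forms the circular combinations and compares them with the original ones multiplied by the phases in (7.29). The function names and sample values are hypothetical, and the overall normalization of the circular combinations is omitted since it does not affect the phase.

```python
import cmath
import math


def rotate_amplitudes(h_plus, h_cross, theta):
    """Transformed (h_+, h_x) read off from the matrix in Eq. (7.27)."""
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return (h_plus * c2 + h_cross * s2, h_cross * c2 - h_plus * s2)


def circular(h_plus, h_cross):
    """Circular combinations h_R = h_+ + i h_x and h_L = h_+ - i h_x."""
    return (h_plus + 1j * h_cross, h_plus - 1j * h_cross)


theta = 0.37          # sample rotation angle in radians
hp, hc = 0.8, -0.3    # sample amplitudes

hR, hL = circular(hp, hc)
hR_rot, hL_rot = circular(*rotate_amplitudes(hp, hc, theta))

print("h_R:", hR_rot, "vs", cmath.exp(-2j * theta) * hR)
print("h_L:", hL_rot, "vs", cmath.exp(+2j * theta) * hL)
```

The factor 2 in the phase also means that a rotation by 180° leaves the amplitudes unchanged, as expected for a quadrupolar pattern.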

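| | Linearized Gravity | Electromagnetism |
|---|---|---|
| Helicity | $A_{ij} = (R^{-1})^k{}_i\,(R^{-1})^l{}_j\,A_{kl}$ | $a_i = (R^{-1})^j{}_i\,a_j$ |
| | $h_{R,L} \longrightarrow e^{\mp 2i\theta}h_{R,L}$ | $a_{R,L} \longrightarrow e^{\mp i\theta}a_{R,L}$ |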
CHAPTER 8
THE SCHWARZSCHILD-DROSTE SOLUTION

As you see, the war treated me

kindly enough, in spite of the
heavy gunfire, to allow me to get
away from it all and take this
walk in the land of your ideas

Schwarzschild’s letter
to Einstein during
World War I

Most of the work done till now has been related to weak-field solutions of the Einstein equations.
In this Chapter, we go a step forward and look for exact solutions. Given the non-linearity of the field
equations and the associated difficulty in finding analytical solutions for arbitrary matter distributions,
we will restrict ourselves to vacuum solutions. To determine our starting point, let me rewrite the
Einstein equations $G_{\mu\nu} + \Lambda g_{\mu\nu} = \kappa^2 T^M_{\mu\nu}$ in a much more convenient form. Multiplying by the inverse
metric and taking the trace we obtain a relation between the Ricci scalar, the cosmological constant
and the trace $T^M \equiv g^{\mu\nu}T^M_{\mu\nu}$ of the locally conserved energy-momentum tensor $T^M_{\mu\nu}$, namely
$$R^\mu{}_\mu - \frac{1}{2}R\,\delta^\mu{}_\mu + \Lambda\,\delta^\mu{}_\mu = \kappa^2 T^M \quad\longrightarrow\quad R = -\kappa^2 T^M + 4\Lambda . \tag{8.1}$$
Substituting this result back into the original Einstein equations we realize that they can be written
as
$$R_{\mu\nu} = \kappa^2\left(T^M_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T^M\right) + \Lambda g_{\mu\nu} . \tag{8.2}$$
Vacuum solutions ($T^M_{\mu\nu} = \Lambda = 0$) correspond then to solutions of the equation
$$R_{\mu\nu} = 0 , \tag{8.3}$$
rather than to solutions of $G_{\mu\nu} = 0$.
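For completeness, the substitution leading from (8.1) to (8.2) can be written out step by step. The derivation below starts from the Einstein equations with $G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R$ and uses only the trace relation (8.1).

```latex
\begin{aligned}
R_{\mu\nu}
  &= \kappa^2 T^M_{\mu\nu} - \Lambda g_{\mu\nu} + \tfrac{1}{2}\, g_{\mu\nu} R \\
  &= \kappa^2 T^M_{\mu\nu} - \Lambda g_{\mu\nu}
     + \tfrac{1}{2}\, g_{\mu\nu}\left(-\kappa^2 T^M + 4\Lambda\right) \\
  &= \kappa^2\left(T^M_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} T^M\right) + \Lambda g_{\mu\nu} .
\end{aligned}
```

Setting $T^M_{\mu\nu} = 0$ and $\Lambda = 0$ in the last line gives the vacuum equation (8.3) directly.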

Vacuum solutions are not necessarily flat

Eq. (8.3) does not imply the vanishing of the Riemann tensor Rµ νρσ , which contains extra
components.

The problem of finding a solution of this equation is further simplified in those cases in which the
problem is highly symmetric. In what follows, we will look for spherically symmetric solutions.

8.1 A spherically symmetric ansatz

Consider the spacetime outside a spherically symmetric mass distribution, which can be static or
not. A spacetime is said to possess a particular symmetry if the functional form of the metric under
the action of such a symmetry is maintained. In particular, a spherically symmetric spacetime is a
spacetime whose line element is invariant under rotations (or, if you want, a spacetime “with the
symmetries of the sphere”). The only rotational invariants of the spacelike coordinates $\mathbf{x} = x^i$ and
their differentials are
$$\mathbf{x}\cdot\mathbf{x} \equiv r^2 , \qquad d\mathbf{x}\cdot d\mathbf{x} , \qquad \mathbf{x}\cdot d\mathbf{x} . \tag{8.4}$$
The most general spatially isotropic metric that can be constructed with these elements takes the
form
$$ds^2 = -a(t, r)\,dt^2 - 2b(t, r)\,dt\,(\mathbf{x}\cdot d\mathbf{x}) + c(t, r)\,(\mathbf{x}\cdot d\mathbf{x})^2 + d(t, r)\,d\mathbf{x}\cdot d\mathbf{x} , \tag{8.5}$$
with a, b, c and d some arbitrary functions of t and r. The required invariance under rotations suggests
the use of spherical coordinates $\{r, \theta, \phi\}$. Performing the change of variables we realize that all the
angular dependence in (8.5) is isolated in the $d\mathbf{x}\cdot d\mathbf{x}$ part

$$\mathbf{x}\cdot\mathbf{x} = r^2 , \qquad \mathbf{x}\cdot d\mathbf{x} = r\,dr , \qquad d\mathbf{x}\cdot d\mathbf{x} = dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2 . \tag{8.6}$$

Substituting these expressions into (8.5) we arrive at the equivalent form

$$ds^2 = -a(t, r)\,dt^2 - 2b(t, r)\,r\,dt\,dr + c(t, r)\,r^2dr^2 + d(t, r)\left(dr^2 + r^2d\Omega^2\right) , \tag{8.7}$$

where we have defined $d\Omega^2 = d\theta^2 + \sin^2\theta\,d\phi^2$. Collecting terms together and defining some, still
arbitrary, functions

$$A(t, r) \equiv a(t, r) , \quad B(t, r) \equiv r\,b(t, r) , \quad C(t, r) \equiv r^2c(t, r) + d(t, r) , \quad D(t, r) \equiv r^2d(t, r) ,$$

to take into account the extra factors of r in Eq. (8.7), we are left with

$$ds^2 = -A(t, r)\,dt^2 - 2B(t, r)\,dt\,dr + C(t, r)\,dr^2 + D(t, r)\,d\Omega^2 . \tag{8.8}$$

The resulting metric can be further simplified by using the freedom in the choice of coordinates. For
instance, we can define a new radial coordinate $\bar r^2 \equiv D(t, r)$ and eliminate r and dr in terms of $\bar r$, t, $d\bar r$
and dt. This gives rise to a big mess that changes the explicit form of the coefficients A, B, C to some
new, but still arbitrary, coefficients $A'$, $B'$, $C'$

$$ds^2 = -A'(t, \bar r)\,dt^2 - 2B'(t, \bar r)\,dt\,d\bar r + C'(t, \bar r)\,d\bar r^2 + \bar r^2d\Omega^2 . \tag{8.9}$$

The next thing we can do is to find some new coordinate time $\bar t(t, \bar r)$ to get rid of the nasty term $dt\,d\bar r$.
To do that, let me define this new time as

$$d\bar t = \mu(t, \bar r)\left[A'(t, \bar r)\,dt + B'(t, \bar r)\,d\bar r\right] = \partial_t\Psi(t, \bar r)\,dt + \partial_{\bar r}\Psi(t, \bar r)\,d\bar r , \tag{8.10}$$

where the new unknown integrating factor $\mu$ is determined by the condition that the second equality
holds for some $\Psi$. In other words, we require $\mu(t, \bar r)\left[A'(t, \bar r)\,dt + B'(t, \bar r)\,d\bar r\right]$ to be a total differential,
so that the first equality makes sense.
Squaring Eq. (8.10)
$$d\bar t^2 = \mu^2\left(A'^2dt^2 + 2A'B'\,dt\,d\bar r + B'^2d\bar r^2\right) \tag{8.11}$$

and isolating the terms related to $dt^2$ and $dt\,d\bar r$, we get

$$A'dt^2 + 2B'\,dt\,d\bar r = \frac{1}{A'\mu^2}\,d\bar t^2 - \frac{B'^2}{A'}\,d\bar r^2 . \tag{8.12}$$
In terms of the new temporal coordinate $\bar t$ the cross term disappears and the ansatz (8.9) becomes
diagonal
$$ds^2 = -\frac{1}{A'\mu^2}\,d\bar t^2 + \left(C' + \frac{B'^2}{A'}\right)d\bar r^2 + \bar r^2d\Omega^2 . \tag{8.13}$$
Since the functions of $\bar t$ and $\bar r$ in this expression are arbitrary we can collect them into some arbitrary
new functions1
$$e^{2\alpha} \equiv \frac{1}{A'\mu^2} , \qquad e^{2\beta} \equiv C' + \frac{B'^2}{A'} , \tag{8.14}$$
and write
$$ds^2 = -e^{2\alpha(\bar t, \bar r)}d\bar t^2 + e^{2\beta(\bar t, \bar r)}d\bar r^2 + \bar r^2d\Omega^2 . \tag{8.15}$$
Dropping the bars to keep the notation as light as possible, we arrive at our first important result
$$ds^2 = -e^{2\alpha(t, r)}dt^2 + e^{2\beta(t, r)}dr^2 + r^2d\Omega^2 . \tag{8.16}$$
Just by using spherical symmetry and our freedom to change coordinates, we have been able to reduce
the 10 functions in $g_{\mu\nu}$ to two functions of only two variables! Rather impressive.

8.2 Spherical symmetry and staticity

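The unknown functions α and β can be determined by inserting the ansatz (8.16) into the vacuum
Einstein equations (8.3). The first step in this procedure is to compute the metric connection Γ^µ_{νσ}.
The job is conceptually straightforward but rather tedious. Whichever way you do it², you should
obtain 12 independent non-vanishing components out of 40, namely

\Gamma^t_{tt} = \partial_t\alpha , \quad \Gamma^t_{tr} = \Gamma^t_{rt} = \partial_r\alpha , \quad \Gamma^t_{rr} = e^{2(\beta-\alpha)}\,\partial_t\beta ,
\Gamma^r_{tt} = e^{2(\alpha-\beta)}\,\partial_r\alpha , \quad \Gamma^r_{tr} = \Gamma^r_{rt} = \partial_t\beta , \quad \Gamma^r_{rr} = \partial_r\beta ,
\Gamma^r_{\theta\theta} = -r e^{-2\beta} , \quad \Gamma^r_{\phi\phi} = \sin^2\theta\,\Gamma^r_{\theta\theta} , \quad \Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r , \qquad (8.17)
\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta , \quad \Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta , \quad \Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r .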

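The non-vanishing components of the Riemann tensor associated to these Christoffel symbols are
given by

R^t{}_{rtr} = e^{2(\beta-\alpha)}\left[\partial_t^2\beta + (\partial_t\beta)^2 - \partial_t\alpha\,\partial_t\beta\right] + \left[\partial_r\alpha\,\partial_r\beta - \partial_r^2\alpha - (\partial_r\alpha)^2\right] ,
R^t{}_{\theta t\theta} = -r e^{-2\beta}\,\partial_r\alpha , \quad R^t{}_{\phi t\phi} = -r e^{-2\beta}\sin^2\theta\,\partial_r\alpha , \quad R^t{}_{\theta r\theta} = -r e^{-2\alpha}\,\partial_t\beta ,
R^t{}_{\phi r\phi} = -r e^{-2\alpha}\sin^2\theta\,\partial_t\beta , \quad R^r{}_{\theta r\theta} = r e^{-2\beta}\,\partial_r\beta , \quad R^r{}_{\phi r\phi} = r e^{-2\beta}\sin^2\theta\,\partial_r\beta , \qquad (8.18)
R^\theta{}_{\phi\theta\phi} = \left(1 - e^{-2\beta}\right)\sin^2\theta ,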

¹ This exponential form is especially useful for writing compact expressions for the components of the metric
connection and the Riemann tensor.

² The quickest way to get Γ^µ_{νσ} is to use the Lagrangian procedure for geodesics, but you can also use the brute
force method and compute them via Eq. (4.62).

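which, contracted, provide us with the non-vanishing components of the Ricci tensor

R_{tt} = \left[\partial_t^2\beta + (\partial_t\beta)^2 - \partial_t\alpha\,\partial_t\beta\right] + \left[\partial_r^2\alpha + (\partial_r\alpha)^2 - \partial_r\alpha\,\partial_r\beta + \frac{2}{r}\,\partial_r\alpha\right] e^{2(\alpha-\beta)} ,
R_{rr} = \left[\partial_t^2\beta + (\partial_t\beta)^2 - \partial_t\alpha\,\partial_t\beta\right] e^{2(\beta-\alpha)} - \left[\partial_r^2\alpha + (\partial_r\alpha)^2 - \partial_r\alpha\,\partial_r\beta - \frac{2}{r}\,\partial_r\beta\right] , \qquad (8.19)
R_{tr} = \frac{2}{r}\,\partial_t\beta , \quad R_{\theta\theta} = 1 + e^{-2\beta}\left[r(\partial_r\beta - \partial_r\alpha) - 1\right] , \quad R_{\phi\phi} = R_{\theta\theta}\sin^2\theta .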

Understanding the result

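The result (8.19) can be easily understood from simple symmetry considerations. Consider for
instance the R_{rθ} component and note that the metric (8.16) is invariant under “reflections” in
the θ and φ coordinates, i.e. θ → −θ and φ → −φ. When θ → −θ, the sign of R_{rθ} changes and
we are forced to have R_{rθ} = 0. The same kind of argument can be applied to many components
to get

R_{r\theta} = R_{r\phi} = R_{t\theta} = R_{t\phi} = R_{\theta\phi} = 0 . \qquad (8.20)

The relation between R_{φφ} and R_{θθ} can also be derived without performing the explicit
computation. To see this, consider the coordinate transformation (θ, φ) → (θ′, φ′) and write the
expression for the angular part of the line element in both coordinate systems

d\theta^2 + \sin^2\theta\,d\phi^2 = \left[\left(\frac{\partial\theta}{\partial\theta'}\right)^2 + \sin^2\theta\left(\frac{\partial\phi}{\partial\theta'}\right)^2\right] d\theta'^2 + \dots \qquad (8.21)

The invariance of the line element under rotations implies the equality

\left(\frac{\partial\theta}{\partial\theta'}\right)^2 + \sin^2\theta\left(\frac{\partial\phi}{\partial\theta'}\right)^2 = 1 . \qquad (8.22)

Substituting this into the transformation law for the R_{θθ} component

R_{\theta'\theta'} = \left(\frac{\partial\theta}{\partial\theta'}\right)^2 R_{\theta\theta} + \left(\frac{\partial\phi}{\partial\theta'}\right)^2 R_{\phi\phi} \;\longrightarrow\; R_{\theta'\theta'} = \left[1 - \sin^2\theta\left(\frac{\partial\phi}{\partial\theta'}\right)^2\right] R_{\theta\theta} + \left(\frac{\partial\phi}{\partial\theta'}\right)^2 R_{\phi\phi}

and demanding R_{θ′θ′} = R_{θθ}, we get the sought-for relation R_{φφ} = sin²θ R_{θθ}.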

The empty-space field equations are obtained by setting each of the components (8.19) equal to zero.
This gives rise to 5 equations, among which only 4 are useful since the Rφφ component simply repeats
the information of the Rθθ component. Among these 4 equations, the simplest one is that associated
to Rtr . A simple inspection of this equation reveals a very interesting property: the function β must
be independent of time

Rtr = 0 −→ ∂t β = 0 −→ β = β(r) . (8.23)

Taking into account this result and performing the time derivative of the vacuum equation Rθθ = 0,
we get
\partial_t R_{\theta\theta} = 0 \;\longrightarrow\; \partial_t\partial_r\alpha = 0 \;\longrightarrow\; \alpha = \gamma(r) + \kappa(t) . \qquad (8.24)

The coefficient e^{2α(t,r)} can then be split into two pieces, e^{2α(t,r)} = e^{2γ(r)} e^{2κ(t)}. This allows us to
perform an extra coordinate redefinition

dt \to e^{-\kappa(t)}\,dt , \qquad \gamma(r) \equiv \alpha(r) , \qquad (8.25)

in Eq. (8.16) to obtain a much simpler line element

ds^2 = -e^{2\alpha(r)}\,dt^2 + e^{2\beta(r)}\,dr^2 + r^2 d\Omega^2 , \qquad (8.26)

specified by only two time-independent functions α(r) and β(r). The resulting metric is static³ even
though we did not impose any requirement on the source apart from being spherically symmetric. The
source could be as dynamical as a collapsing or a pulsating star and the metric outside the matter
distribution would still take the form (8.26), as long as the collapse is spherically symmetric. This result is in
perfect agreement with our discussion on gravitational waves: if a spherically symmetric body undergoes
pure radial pulsations, there is no quadrupole and there is no emission of gravitational waves.

All vacuum solutions of the Einstein equations with SO(3) symmetry are necessarily static.

8.3 The Schwarzschild-Droste solution

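Thanks to symmetry, we are left with 3 equations of a single variable r for two unknowns α and β. Let
me rewrite them as

R_{tt} = \left(\alpha'' + \alpha'^2 - \alpha'\beta' + \frac{2\alpha'}{r}\right) e^{2(\alpha-\beta)} = 0 , \qquad (8.27)
R_{rr} = -\left(\alpha'' + \alpha'^2 - \alpha'\beta' - \frac{2\beta'}{r}\right) = 0 , \qquad (8.28)
R_{\theta\theta} = 1 - e^{-2\beta}\left(1 + r\alpha' - r\beta'\right) = 0 , \qquad (8.29)

with the prime denoting derivatives with respect to r. Note that the first two equations are rather
similar. Multiplying the first one by e^{−2(α−β)} and adding it to the second we get

e^{2(\beta-\alpha)} R_{tt} + R_{rr} = \frac{2}{r}\left(\alpha' + \beta'\right) = 0 \;\longrightarrow\; \alpha' + \beta' = 0 \;\longrightarrow\; \alpha(r) + \beta(r) = \text{constant} . \qquad (8.30)

The integration constant appearing in the previous expression can always be set to zero by simply
performing a coordinate redefinition, allowing us to set α = −β. Inserting this result into Eq. (8.29)
we get

R_{\theta\theta} = 0 \;\longrightarrow\; \left(1 + 2r\alpha'\right) e^{2\alpha} = 1 \;\longrightarrow\; \left(r e^{2\alpha}\right)' = 1 , \qquad (8.31)

which can be easily integrated to obtain

r e^{2\alpha} = r + C \;\longrightarrow\; e^{2\alpha} = e^{-2\beta} = 1 + \frac{C}{r} , \qquad (8.32)

or equivalently

ds^2 = -\left(1 + \frac{C}{r}\right) dt^2 + \left(1 + \frac{C}{r}\right)^{-1} dr^2 + r^2 d\Omega^2 . \qquad (8.33)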
³ A static spacetime is one in which
i) the components of g_{µν} are independent of the timelike coordinate x⁰;
ii) the line element is invariant under the transformation x⁰ → −x⁰.
If the second condition is not satisfied the spacetime is instead said to be stationary. A particular example of a stationary
metric is the one generated by a rotating star, where the change x⁰ → −x⁰ changes the sense of rotation.

The obtained metric is asymptotically flat: it tends to the Minkowski metric when r → ∞.
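
As a quick sanity check, the three vacuum equations (8.27)-(8.29) can be evaluated numerically on the solution e^{2α} = e^{−2β} = 1 + C/r. The Python sketch below does this with central finite differences; the function names, the step size and the sample values of r and C are illustrative choices of mine, not part of the derivation.

```python
import math

def alpha(r, C):
    """alpha(r) defined through e^{2 alpha} = 1 + C/r, Eq. (8.32)."""
    return 0.5 * math.log(1.0 + C / r)

def vacuum_residuals(r, C, step=1e-3):
    """Left-hand sides of Eqs. (8.27)-(8.29) evaluated with beta = -alpha.

    Derivatives are approximated by central finite differences, so the
    residuals vanish only up to discretization and round-off error.
    """
    a0 = alpha(r, C)
    a_plus, a_minus = alpha(r + step, C), alpha(r - step, C)
    a1 = (a_plus - a_minus) / (2.0 * step)          # alpha'
    a2 = (a_plus - 2.0 * a0 + a_minus) / step**2    # alpha''
    b0, b1 = -a0, -a1                               # beta = -alpha
    r_tt = (a2 + a1**2 - a1 * b1 + 2.0 * a1 / r) * math.exp(2.0 * (a0 - b0))
    r_rr = -(a2 + a1**2 - a1 * b1 - 2.0 * b1 / r)
    r_thth = 1.0 - math.exp(-2.0 * b0) * (1.0 + r * a1 - r * b1)
    return r_tt, r_rr, r_thth

# Sample point outside r = -C (here C = -2, i.e. 2GM = 2 in these units).
print(vacuum_residuals(r=10.0, C=-2.0))
```

All three residuals come out compatible with zero within the finite-difference error, which is the numerical counterpart of the statement that three equations for two unknowns are nevertheless consistent.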

Birkhoff’s theorem
Any solution of the vacuum Einstein equations with SO(3) symmetry must be static and asymptotically
flat.

The only thing left is to associate the constant C to some physical parameter. The most important
use of a spherically symmetric vacuum solution is to represent the spacetime outside stars or planets.
In that case, we would expect to recover the Newtonian limit

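g_{00} = -\left(1 - \frac{2GM}{r}\right) , \qquad g_{rr} = 1 + \frac{2GM}{r} , \qquad (8.34)

at large r values. Comparing (8.34) with the r → ∞ limit of the metric (8.33)

g_{00} = -\left(1 + \frac{C}{r}\right) , \qquad g_{rr} \approx 1 - \frac{C}{r} , \qquad (8.35)

we get C = −2GM, which allows us to write the final and traditional expression for the so-called
Schwarzschild-Droste metric⁴

ds^2 = -\left(1 - \frac{R_S}{r}\right) dt^2 + \left(1 - \frac{R_S}{r}\right)^{-1} dr^2 + r^2 d\Omega^2 \qquad (8.36)

with

R_S \equiv \frac{2GM}{c^2} \simeq 3\,\text{km}\,\frac{M}{M_\odot} , \qquad (8.37)

the Schwarzschild radius⁵.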
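
To get a feeling for the numbers in (8.37), here is a minimal Python sketch that evaluates R_S = 2GM/c² in SI units. The rounded values of G, c and the solar mass are standard reference values that I am assuming here; the Earth mass is the one quoted later in these notes.

```python
G = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2 (assumed rounded value)
C_LIGHT = 2.998e8    # speed of light in m/s (assumed rounded value)
M_SUN = 1.989e30     # solar mass in kg (assumed rounded value)

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius R_S = 2GM/c^2 in metres, Eq. (8.37)."""
    return 2.0 * G * mass_kg / C_LIGHT**2

for name, mass in [("Sun", M_SUN), ("Earth", 5.9722e24)]:
    print(f"{name}: R_S = {schwarzschild_radius(mass):.3e} m")
```

The Sun comes out at roughly 3 km and the Earth at slightly under a centimetre, in line with the 3 km (M/M_⊙) scaling.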

Exercise
• Verify that (8.36) satisfies Eq. (8.29). Explain why this is guaranteed to happen even
though we initially had three equations for two unknowns.
• Show that the Schwarzschild metric can be written in a form that makes explicit its
isotropic character, namely
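
ds^2 = -\left(\frac{1 - \frac{R_S}{4\rho}}{1 + \frac{R_S}{4\rho}}\right)^2 dt^2 + \left(1 + \frac{R_S}{4\rho}\right)^4 \left(dx^2 + dy^2 + dz^2\right) , \qquad (8.38)

with

\rho = \frac{1}{2}\left[r - GM + \sqrt{r^2 - 2GMr}\right] . \qquad (8.39)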
2
⁴ Karl Schwarzschild found this exact solution in 1915 while serving in the German army on the Russian front during
World War I and died a year later from pemphigus, a painful autoimmune disease. An alternative derivation of this
solution based on the Weyl method was presented by Droste around the same time but for some reason the physics
community completely ignored it.
⁵ We have momentarily restored the factors of c.

8.4 Measuring distances and times

What is the physical meaning of the coordinates (t, r, θ, φ) appearing in the Schwarzschild-Droste
solution? Although they provide a global reference frame for an observer making measurements at an
infinite distance from the source (asymptotic flatness), not all of them represent physical quantities
measured by arbitrary observers. While θ and φ have the same interpretation as the spherical
angular coordinates in flat spacetime, the coordinate radius r and the coordinate time t cannot be
generically interpreted as the physical radius or the physical time measured by a clock.
Physical quantities must be computed from the metric. The physical interval in the radial direction
measured by an arbitrary local observer is given by the proper distance (dt = dθ = dφ = 0)

ds = \left(1 - \frac{R_S}{r}\right)^{-1/2} dr , \qquad (8.40)

while the time measured by a stationary clock at r (dr = dθ = dφ = 0) is given by the proper time
interval

d\tau = \left(1 - \frac{R_S}{r}\right)^{1/2} dt . \qquad (8.41)

Understanding the result

In the Schwarzschild metric, space is foliated by spheres S² of area 4πr² separated by a proper
distance (1 − R_S/r)^{−1/2} dr.
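
The difference between coordinate and proper radial distance can be made concrete by integrating (8.40) between two spheres. The sketch below is only an illustration: it compares a midpoint-rule integration with a closed-form antiderivative that I worked out for this check (it does not appear in the text), in units where R_S = 1.

```python
import math

def proper_radial_distance(r1, r2, rs, n=20000):
    """Integrate Eq. (8.40) from r1 to r2 (both larger than rs) with the midpoint rule."""
    dr = (r2 - r1) / n
    total = 0.0
    for i in range(n):
        r = r1 + (i + 0.5) * dr
        total += dr / math.sqrt(1.0 - rs / r)
    return total

def proper_radial_distance_exact(r1, r2, rs):
    """Same integral from the antiderivative
    F(r) = sqrt(r (r - rs)) + rs * ln(sqrt(r) + sqrt(r - rs))."""
    def antiderivative(r):
        return math.sqrt(r * (r - rs)) + rs * math.log(math.sqrt(r) + math.sqrt(r - rs))
    return antiderivative(r2) - antiderivative(r1)

# Spheres at r = 2 R_S and r = 3 R_S: coordinate separation is exactly 1 R_S.
print(proper_radial_distance(2.0, 3.0, 1.0))
print(proper_radial_distance_exact(2.0, 3.0, 1.0))
```

Both numbers agree and are larger than the coordinate separation r₂ − r₁, as expected from the factor (1 − R_S/r)^{−1/2} > 1.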

8.5 Visualizing Schwarzschild spacetime

A mental image of the Schwarzschild-Droste spacetime can be obtained by embedding a subset of it
into a higher dimensional spacetime⁶. Since our solution is static and spherically symmetric, we can,
without loss of generality, fix t = constant and θ = π/2. This leaves us with a 2-dimensional surface

dX^2 = \left(1 - \frac{R_S}{r}\right)^{-1} dr^2 + r^2 d\phi^2 = f(r)^{-1} dr^2 + r^2 d\phi^2 , \qquad (8.42)

which can be easily embedded into the ordinary 3-dimensional Euclidean space

dX^2 = dz^2 + dr^2 + r^2 d\phi^2 = \left[1 + \left(\frac{dz(r)}{dr}\right)^2\right] dr^2 + r^2 d\phi^2 . \qquad (8.43)

The function z(r) can be obtained by simply comparing (8.42) and (8.43)

1 + \left(\frac{dz(r)}{dr}\right)^2 = f(r)^{-1} , \qquad (8.44)

and performing a trivial integration

z(r) = \int_0^r dr' \sqrt{\frac{1 - f(r')}{f(r')}} = 2\sqrt{R_S (r - R_S)} + \text{constant} . \qquad (8.45)

The resulting embedding diagram is the Flamm’s paraboloid shown in Fig. 8.1. The distances between
circles on this surface are larger than just ∆r, in clear agreement with the discussion presented in the
previous section.
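
The embedding function can be checked directly against the defining condition (8.44). The following sketch, with function names of my own choosing and units R_S = 1, differentiates z(r) = 2√(R_S(r − R_S)) numerically and compares 1 + (dz/dr)² with 1/f(r).

```python
import math

def flamm_z(r, rs):
    """Height of Flamm's paraboloid, Eq. (8.45) with the constant set to zero."""
    return 2.0 * math.sqrt(rs * (r - rs))

def embedding_mismatch(r, rs, step=1e-6):
    """Return 1 + (dz/dr)^2 - 1/f(r); it should vanish according to Eq. (8.44)."""
    dz_dr = (flamm_z(r + step, rs) - flamm_z(r - step, rs)) / (2.0 * step)
    f = 1.0 - rs / r
    return 1.0 + dz_dr**2 - 1.0 / f

for radius in (1.5, 3.0, 10.0):
    print(radius, embedding_mismatch(radius, 1.0))
```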
⁶ Remember that such embedding diagrams can be misleading. For instance, a 2-dimensional cylinder embedded in
3-dimensional Euclidean space can seem to be curved even though it is intrinsically flat, K = κ₁κ₂ = 0.

Figure 8.1: Embedding diagram for the Schwarzschild (r − φ) plane: Flamm’s paraboloid

8.6 Apparent singularity

The line element (8.36) appears to contain two singularities, one at r = 0 coming from the gtt
component (blue dashed line) and another at r = RS coming from grr (red line).

Is this a problem? Not necessarily. In most astrophysical applications the typical size R of
the source is much larger than the Schwarzschild radius (8.37). For the Earth, the Sun and a neutron star respectively

\left.\frac{R_S}{R}\right|_\oplus \approx 10^{-9} , \qquad \left.\frac{R_S}{R}\right|_\odot \approx 10^{-6} , \qquad \left.\frac{R_S}{R}\right|_{NS} \approx 10^{-1} . \qquad (8.46)

This fact makes the singularities at r = 0 and r = R_S completely irrelevant in most cases, since
they lie in the interior of objects where the exterior Schwarzschild solution does not apply. Indeed,
the problem disappears when one considers realistic interior solutions of the Einstein equations

ds^2 = -\left(1 - \frac{2GM(r)}{r}\right) dt^2 + \left(1 - \frac{2GM(r)}{r}\right)^{-1} dr^2 + r^2 d\Omega^2 , \qquad (8.47)

since the function M(r) decreases faster than r and effectively kills all the above singularities.

We should worry and speculate about the singularities only in those cases in which the size of the
object is such that the Schwarzschild-Droste solution applies all the way down to r = R_S. Objects of
this kind are called black holes. Even in that case the two singularities described above are not on
equal footing. The metric coefficients in the line element (8.36) depend on the choice of a particular
coordinate system and you should not extract any conclusion from them alone. Let me present an
illustrative example.

A worked-out example: Coordinates should not be trusted

Consider the completely regular and singularity-free Euclidean space in two dimensions

dX^2 = dx^2 + dy^2 , \qquad (8.48)

and perform a general coordinate transformation to a new variable ρ defined through x = 2√ρ
to get

dX^2 = \frac{1}{\rho}\,d\rho^2 + dy^2 . \qquad (8.49)

The metric appears to blow up at ρ = 0 even though we know that our space is, by construction,
flat and free of singularities. The apparent singularity is a breakdown of our coordinate system
at the point where ρ would become negative. It has nothing to do with a breakdown of the
underlying manifold!

In order to determine whether we are dealing with some artifact of our coordinate system or with a true
physical singularity, we cannot simply look at the curvature tensors either, since their components are
coordinate-dependent⁷. We should rather construct scalars out of the curvature tensors. If any such
scalar blows up in a particular coordinate system, it will do so in all of them. The simplest possibility
would be to consider the Ricci scalar R, but we can also construct higher order scalars such as R_{µν}R^{µν} or
R_{µνρσ}R^{µνρσ}. For the particular case of the Schwarzschild-Droste metric, the first two quantities are
not useful since they are identically equal to zero. We are then forced to consider the square of the Riemann
tensor, the so-called Kretschmann scalar. Taking into account the non-vanishing components of the
Riemann tensor (8.18), we obtain

K = R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma} = \frac{12 R_S^2}{r^6} , \qquad (8.50)

which is a perfectly regular quantity at the Schwarzschild radius, but becomes infinite at r = 0. This
last point is a real physical singularity! The singularity at r = R_S is, on the other hand, just a
pathology of the specific coordinate system used.
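
The contrast between the two “singularities” is easy to tabulate. In the illustrative sketch below (units R_S = 1, function names are mine) the metric coefficient g_rr of (8.36) blows up as r → R_S while the Kretschmann scalar (8.50) stays finite there and diverges only as r → 0.

```python
def kretschmann(r, rs):
    """Kretschmann scalar K = 12 R_S^2 / r^6, Eq. (8.50)."""
    return 12.0 * rs**2 / r**6

def g_rr(r, rs):
    """Radial metric coefficient of Eq. (8.36); not defined exactly at r = rs."""
    return 1.0 / (1.0 - rs / r)

# Approach the horizon from outside: g_rr grows without bound, K tends to 12 / R_S^4.
for r in (2.0, 1.1, 1.001, 1.000001):
    print(f"r = {r:<9} g_rr = {g_rr(r, 1.0):.3e}   K = {kretschmann(r, 1.0):.3e}")

# Approach the centre: K itself diverges.
for r in (0.5, 0.1, 0.01):
    print(f"r = {r:<9} K = {kretschmann(r, 1.0):.3e}")
```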

8.7 Geodesics in Schwarzschild metric

Let us study the motion of pointlike objects in our recently found Schwarzschild solution. To do that,
let me consider the reparametrization invariant action (3.27)

S = \int L\,d\sigma = \int d\sigma\,\frac{1}{2}\left[e^{-1}(\sigma)\,g_{\mu\nu}\frac{dx^\mu}{d\sigma}\frac{dx^\nu}{d\sigma} - m^2 e(\sigma)\right] , \qquad (8.51)

in the massive (e(σ) = 1/m) and massless (e(σ) = 1, m → 0) cases⁸

S_{\rm massive} = \frac{1}{2}\,m\int d\sigma\left[g_{\mu\nu}\frac{dx^\mu}{d\sigma}\frac{dx^\nu}{d\sigma} - 1\right] , \qquad S_{\rm massless} = \frac{1}{2}\int d\sigma\,g_{\mu\nu}\frac{dx^\mu}{d\sigma}\frac{dx^\nu}{d\sigma} . \qquad (8.52)

⁷ They can catch singularities when going from one coordinate system to another through the transformation matrices
∂x̄^µ/∂x^ν.

The geodesic equations for both actions can be written directly by taking into account the non-vanishing
Christoffel symbols (8.17). Let’s denote the derivatives with respect to the affine parameter
σ by a dot. The explicit form⁹ obtained by following this procedure turns out to be not very useful
since the resulting equations are coupled

\ddot t = -\frac{R_S}{r(r - R_S)}\,\dot r\,\dot t , \qquad (8.53)
\ddot r = -\frac{R_S (r - R_S)}{2r^3}\,\dot t^2 + \frac{R_S}{2r(r - R_S)}\,\dot r^2 + (r - R_S)\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right) , \qquad (8.54)
\ddot\theta = -\frac{2}{r}\,\dot\theta\,\dot r + \sin\theta\cos\theta\,\dot\phi^2 , \qquad (8.55)
\ddot\phi = -\frac{2}{r}\,\dot\phi\,\dot r - 2\cot\theta\,\dot\theta\,\dot\phi . \qquad (8.56)

Fortunately, our task can be greatly simplified by considering the symmetries of the Schwarzschild-Droste
metric. Since (8.36) does not depend on the coordinates t and φ (they are cyclic coordinates
in (8.52)), we have two conservation laws

\partial_t L = 0 \;\longrightarrow\; E = \left(1 - \frac{R_S}{r}\right)\dot t = \text{constant} , \qquad (8.57)
\partial_\phi L = 0 \;\longrightarrow\; h = r^2 \sin^2\theta\,\dot\phi = \text{constant} , \qquad (8.58)

with a clear physical interpretation. In the massless case, E and h are the relativistic energy and angular
momentum that the particle would have at r = ∞. In the massive case, they are the relativistic
energy and angular momentum per unit mass.

Exercise
Check this by taking the non-relativistic limit of (8.57) and (8.58) at the equatorial plane
θ = π/2.

Conservation of angular momentum means that the particle moves in a plane, which we can set to
be the equatorial plane θ = π/2 without loss of generality. Indeed, a simple inspection of Eq. (8.55)
shows that if we consider a geodesic passing through a point on the equator θ = π/2 and tangent to the
equatorial plane, θ̇ = 0, we will always have θ̈ = 0 and θ̇ = 0.
On top of the above symmetries, we still have a generic conservation law associated with the invariance
of the action (8.51) under reparametrizations of the path σ → σ′ = f(σ) (cf. Section 3.5.1). This reads

\frac{d}{d\sigma}\left(g_{\mu\nu} u^\mu u^\nu\right) = 0 \;\longrightarrow\; g_{\mu\nu} u^\mu u^\nu = -\epsilon , \qquad (8.59)

with ε = 1 and 0 for massive and massless particles respectively. Expanding this equation¹⁰

-\left(1 - \frac{R_S}{r}\right)\dot t^2 + \left(1 - \frac{R_S}{r}\right)^{-1}\dot r^2 + r^2\dot\phi^2 = -\epsilon \qquad (8.60)

and plugging in (8.57) and (8.58) we obtain a single equation for r(σ)

\frac{1}{2}\left(\frac{dr}{d\sigma}\right)^2 + V(r) = \mathcal{E} , \qquad (8.61)

with

V(r) \equiv -\epsilon\,\frac{GM}{r} + \frac{h^2}{2r^2} - \frac{GM h^2}{r^3} \qquad (8.62)

playing the role of an exact effective potential and

\mathcal{E} \equiv \frac{1}{2}\left(E^2 - \epsilon\right) . \qquad (8.63)

Eq. (8.61) is structurally equivalent to that of a particle of unit mass and energy¹¹ ℰ moving in an
effective potential V(r). It is interesting to compare the obtained potential with the Newtonian result

V_N(r) = -\epsilon\,\frac{GM}{r} + \frac{h^2}{2r^2} . \qquad (8.64)

The first two terms in Eq. (8.62) are just the universal gravitational attraction and the centrifugal
barrier that were already present in Newton’s theory of gravity. The third term is new.
At sufficiently long distances, the extra contribution is rather small and does not significantly modify
the Newtonian effective potential¹² (cf. Fig. 8.4). The situation is completely different at short
distances. The new term eventually dominates over the centrifugal barrier for small r and drives the
potential to −∞¹³. Let me analyze the massive and massless cases separately.

⁸ Remember that σ = τ in the massive case.
⁹ Up to a global factor m in the massive case.
¹⁰ Remember that θ = π/2.
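
The comparison between (8.62) and (8.64) can be explored numerically. The sketch below is a minimal illustration in units GM = 1 (so that R_S = 2); the parameter `eps` stands for the massive (1) or massless (0) case and all names are my own.

```python
def v_newton(r, h, gm=1.0, eps=1.0):
    """Newtonian effective potential, Eq. (8.64)."""
    return -eps * gm / r + h**2 / (2.0 * r**2)

def v_gr(r, h, gm=1.0, eps=1.0):
    """Relativistic effective potential, Eq. (8.62): Newtonian part plus the -GM h^2 / r^3 term."""
    return v_newton(r, h, gm, eps) - gm * h**2 / r**3

h = 4.0
for r in (0.1, 0.5, 2.0, 10.0, 100.0, 1000.0):
    print(f"r = {r:<7} V_N = {v_newton(r, h):+.4e}   V = {v_gr(r, h):+.4e}")
```

At large r the two columns are practically identical, while at small r the Newtonian centrifugal barrier grows positive and the relativistic potential becomes large and negative.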

Massive particles, ε = 1, σ = τ:

We can distinguish two cases:

• If h² > 3R_S², the potential displays both a maximum and a minimum at

\left.\frac{dV(r)}{dr}\right|_{\epsilon=1} = 0 \;\longrightarrow\; r_{\rm max,min} = \frac{h^2}{R_S}\left[1 \mp \sqrt{1 - 3\frac{R_S^2}{h^2}}\right] . \qquad (8.65)

We have then four possibilities depending on the relation between the effective energy ℰ of the
particle (the right-hand side of (8.61)) and the potential (cf. Fig. ??):
1. Circular orbits: If ℰ = V(r_max) or ℰ = V(r_min) the particle describes an unstable or stable
circular orbit respectively.
2. Bound precessing orbits: If 0 > ℰ > V(r_min) the particle is trapped in the potential well and
describes an elliptical orbit with shifting perihelion (see below).
3. Scattering orbits: If V(r_max) > ℰ > 0 the particle bounces off the potential barrier and retreats back
to infinity.
4. Plunging orbits: If ℰ > V(r_max) the particle sails over the top of the potential to finally
spiral into the black hole.
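
The classification above can be turned into a few lines of code. This is a rough sketch under simplifying assumptions of mine: units with R_S = 2GM, a particle approaching from outside the barrier, and no special treatment of the marginal circular cases. The labels returned by `classify` are illustrative.

```python
import math

def v_eff(r, h, rs):
    """Effective potential (8.62) for a massive particle, with GM = R_S / 2."""
    gm = 0.5 * rs
    return -gm / r + h**2 / (2.0 * r**2) - gm * h**2 / r**3

def extrema(h, rs):
    """Positions (r_max, r_min) of the maximum and minimum of V, Eq. (8.65).

    Returns None when h^2 < 3 R_S^2 and the barrier is absent.
    """
    if h == 0.0:
        return None
    disc = 1.0 - 3.0 * rs**2 / h**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    r_max = h**2 / rs * (1.0 - root)   # inner extremum: the maximum
    r_min = h**2 / rs * (1.0 + root)   # outer extremum: the minimum
    return r_max, r_min

def classify(energy, h, rs):
    """Orbit type for effective energy `energy`, the right-hand side of Eq. (8.61)."""
    ext = extrema(h, rs)
    if ext is None:
        return "plunging"
    r_max, r_min = ext
    if energy > v_eff(r_max, h, rs):
        return "plunging"
    if energy >= 0.0:
        return "scattering"
    if energy > v_eff(r_min, h, rs):
        return "bound"
    return "not allowed outside the barrier"

print(extrema(5.0, 2.0))
for energy in (0.2, 0.1, -0.01):
    print(energy, classify(energy, 5.0, 2.0))
```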
¹¹ The true energy per unit mass is E, but the effective potential for r responds instead to ℰ.
¹² The small correction will however play a central role! See next section.
¹³ Note that the potential is always zero at r = R_S.

[Figure: two panels, “Newtonian gravity: Massive particle” (V_N(r)) and “General Relativity: Massive particle” (V(r)), each plotted against r/R_S.]

Figure 8.2: Effective potentials in Newtonian gravity and General Relativity for massive particles.
Different lines correspond to h²/R_S² = 0, 1, 3, 5, 7, 9 (from brown to blue). Note the change in the
potential at the critical value h²/R_S² = 3.

Figure 8.3: Orbits for massive particles in Schwarzschild-Droste geometry


Exercise
What happens with r_max and r_min when h → 0? And when h decreases? What are the
minimal values of h and r allowing for a stable circular orbit?

• If h² < 3R_S² the centrifugal barrier disappears and the particle has no other option but to spiral
into the singularity. Consider for clarity the limiting case h = 0 in which the particle follows a
radial trajectory. In this case, and for a particle released from rest at infinity (E = 1, so that the
right-hand side of (8.61) vanishes), the radial equation of motion (8.61) becomes¹⁴

\frac{dr}{d\tau} = \pm\left(\frac{R_S}{r}\right)^{1/2} \;\longrightarrow\; \int \sqrt{r}\,dr = -R_S^{1/2}\int d\tau . \qquad (8.66)

Integrating the previous equation we get

\tau(r) = \frac{2}{3\sqrt{R_S}}\left(r_0^{3/2} - r^{3/2}\right) , \qquad (8.67)

with r₀ > r an integration constant fixing the initial value of the proper time to zero. The
particle reaches the Schwarzschild radius in a finite proper time τ. The interval measured by
an observer at rest at spatial infinity is however quite different. Indeed, it is infinite, as can be
easily seen by evaluating

\frac{dr}{dt} = \frac{d\tau}{dt}\frac{dr}{d\tau} = -\left(1 - \frac{R_S}{r}\right)\left(\frac{R_S}{r}\right)^{1/2} \;\longrightarrow\; \int dt = -R_S^{-1/2}\int \frac{r^{3/2}\,dr}{r - R_S} \qquad (8.68)

at r = R_S. For the observer at infinity the particle appears to approach but never quite cross
the horizon! This is just another indication that the Schwarzschild coordinates are flawed near
r = R_S.
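
The contrast between (8.67) and (8.68) can be seen numerically. The sketch below, in units R_S = 1, evaluates the proper time in closed form and integrates the coordinate time with the substitution r = R_S + eˣ, which I introduce only to tame the 1/(r − R_S) factor; the function names and sample radii are illustrative.

```python
import math

def proper_time(r0, r, rs):
    """Proper time to fall radially from r0 to r, Eq. (8.67)."""
    return 2.0 / (3.0 * math.sqrt(rs)) * (r0**1.5 - r**1.5)

def coordinate_time(r0, r, rs, n=20000):
    """Coordinate time to fall from r0 to r > rs, from Eq. (8.68).

    With r = rs + exp(x) the integrand r^{3/2} dr / (sqrt(rs) (r - rs))
    becomes r^{3/2} dx / sqrt(rs), which is smooth; midpoint rule in x.
    """
    x_lo, x_hi = math.log(r - rs), math.log(r0 - rs)
    dx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        radius = rs + math.exp(x_lo + (i + 0.5) * dx)
        total += radius**1.5 / math.sqrt(rs) * dx
    return total

r0 = 10.0
for delta in (1e-1, 1e-3, 1e-6, 1e-9):
    r = 1.0 + delta
    print(f"r - R_S = {delta:.0e}: tau = {proper_time(r0, r, 1.0):.4f}, t = {coordinate_time(r0, r, 1.0):.4f}")
```

The proper time saturates at a finite value while the coordinate time keeps growing, roughly by R_S ln 10 for every factor of ten in r − R_S.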

Exercise
What happens with t when the observer crosses the horizon?

Massless particles, ε = 0:

The potential (8.62) with ε = 0 displays a unique maximum for all values of h at

r_{\rm max} = \frac{3}{2}\,R_S . \qquad (8.69)

Thus, depending on the value ℰ of the right-hand side of (8.61), the motion of massless particles can be divided into three cases:

1. Circular orbit: If ℰ = V(r_max) the particle describes an unstable circular orbit.

2. Scattering orbits: If V(r_max) > ℰ the particle bounces off the potential barrier and retreats back to
infinity (deflection of light).
3. Plunging orbits: If ℰ > V(r_max) the particle sails over the top of the potential to finally spiral
into the black hole.
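
The location of the maximum in (8.69) can be confirmed with a brute-force scan of the massless potential. The sketch below uses the ε = 0 form of (8.62) rewritten as V = h²(1 − R_S/r)/(2r²); the grid search and its parameters are ad hoc choices of mine.

```python
def v_photon(r, h, rs):
    """Effective potential (8.62) with eps = 0, using GM = R_S / 2."""
    return h**2 / (2.0 * r**2) * (1.0 - rs / r)

def locate_maximum(h, rs, r_lo=1.0, r_hi=10.0, n=90000):
    """Scan r between r_lo * rs and r_hi * rs and return (r, V) at the largest V found."""
    best_r, best_v = None, float("-inf")
    for i in range(n + 1):
        r = rs * (r_lo + (r_hi - r_lo) * i / n)
        v = v_photon(r, h, rs)
        if v > best_v:
            best_r, best_v = r, v
    return best_r, best_v

r_peak, v_peak = locate_maximum(h=1.0, rs=1.0)
print(r_peak, v_peak)   # the peak sits close to 1.5 R_S, independently of h
```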

¹⁴ Among the two signs in the square root we take the negative one, so that the particle falls toward r → 0.

[Figure: two panels, “Newtonian gravity: Massless particle” (V_N(r)) and “General Relativity: Massless particle” (V(r)), each plotted against r/R_S.]

Figure 8.4: Effective potentials in Newtonian gravity and General Relativity for massless particles.
Different lines correspond to h²/R_S² = 0, 1, 3, 5, 7, 9 (from brown to blue).

Figure 8.5: Orbits for massless particles in Schwarzschild-Droste geometry


8.8 Solving the radial equation

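Let us determine the equation for the orbits described in the previous section. To do so, we
make use of Eq. (8.58) with θ = π/2 and change the derivatives with respect to the affine parameter
in Eq. (8.61) to derivatives with respect to the angular variable φ

\frac{dr}{d\sigma} = \frac{dr}{d\phi}\frac{d\phi}{d\sigma} = \frac{h}{r^2}\frac{dr}{d\phi} , \qquad (8.70)

to obtain (with ℰ the constant on the right-hand side of (8.61) and ε = 1, 0 for massive and massless particles)

\left(\frac{h}{r^2}\frac{dr}{d\phi}\right)^2 + \frac{h^2}{r^2} = \epsilon\,\frac{2GM}{r} + \frac{2GM h^2}{r^3} + 2\mathcal{E} . \qquad (8.71)

The tricks to solve this kind of equation are well known. Let’s perform a change of variable u ≡ 1/r
in (8.71)

\left(\frac{du}{d\phi}\right)^2 + u^2 = \epsilon\,\frac{2GM u}{h^2} + 2GM u^3 + \frac{2\mathcal{E}}{h^2} , \qquad (8.72)

and differentiate the result with respect to φ. This gives rise to a second order differential equation of the
form

\frac{d^2u}{d\phi^2} + u = \epsilon\,\frac{GM}{h^2} + 3GM u^2 . \qquad (8.73)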

8.8.1 The massive case: Perihelion advance of Mercury

In the massive case ε = 1 and (8.73) becomes

\frac{d^2u}{d\phi^2} + u = \frac{GM}{h^2} + 3GM u^2 . \qquad (8.74)

The resulting equation is extremely similar to the Newtonian equation of motion of a particle of mass
m in the equatorial plane

\frac{d^2u_0}{d\phi^2} + u_0 = \frac{GM}{h^2} , \qquad (8.75)

even though the interpretation of the radial variable r is completely different¹⁵. As you probably
remember from your Classical Mechanics course, the general solution of (8.75) is a conic

u_0 = \frac{GM}{h^2}\left(1 + e\cos\phi\right) \;\longrightarrow\; r_0 = \frac{a(1 - e^2)}{1 + e\cos\phi} \qquad (8.76)

with

a(1 - e^2) = \frac{h^2}{GM} . \qquad (8.77)

¹⁵ In Newtonian gravity r is the radial distance from the mass while in the relativistic case it is just a radial coordinate
that can only be related to a distance through the metric.


Orbital Mumbo Jumbo

• a = semi-major axis: 1/2 of the long axis of the ellipse.

• b = semi-minor axis: 1/2 of the short axis of the ellipse.

• e = eccentricity: It characterizes the deviation of the ellipse from a circle. When e = 0
the ellipse is a circle, when e = 1 the ellipse degenerates into a parabola. It is defined in terms of the
semi-major and semi-minor axes a and b as

e = \sqrt{1 - \left(\frac{b}{a}\right)^2} . \qquad (8.78)

• f = focus: The point on the major axis at a distance f = ae from the geometric
center of the ellipse.
• l = semi-latus rectum: The distance l = b²/a from the focus to the ellipse along a line
parallel to the semi-minor axis.
• r_p = periapsis: The distance r_p = a(1 − e) from the focus to the nearest point
of the ellipse.
• r_a = apoapsis: The distance r_a = a(1 + e) from the focus to the furthest point
of the ellipse.

• The equation of the orbit: It gives the distance to the orbiting body from the focus
of the orbit as a function of the polar angle θ

r(\theta) = \frac{a(1 - e^2)}{1 + e\cos\theta} . \qquad (8.79)

If the gravitational field is sufficiently weak, Newtonian gravity alone is expected to provide a good
approximation to the motion of massive particles in General Relativity. This suggests treating the extra
term 3GM u² as a perturbation on top of the solution of Eq. (8.75). The perturbative solution of Eq.
(8.74) can be determined by considering the ansatz

u = u_0 + \Delta u , \qquad (8.80)

with u₀ given by (8.76). Inserting this into (8.74) and keeping only the leading contribution 3GM u₀² of the extra term we get

\frac{d^2\Delta u}{d\phi^2} + \Delta u = A\left[1 + \frac{e^2}{2} + 2e\cos\phi + \frac{1}{2}e^2\cos 2\phi\right] \qquad (8.81)

with

A = \frac{3(GM)^3}{h^4} \qquad (8.82)

a tiny parameter. To solve this equation let me notice two identities

\frac{d^2}{d\phi^2}\left(\phi\sin\phi\right) + \phi\sin\phi = 2\cos\phi , \qquad (8.83)
\frac{d^2}{d\phi^2}\left(\cos 2\phi\right) + \cos 2\phi = -3\cos 2\phi . \qquad (8.84)

A direct comparison of (8.83) and (8.84) with (8.81) suggests the solution

\Delta u = A\left[1 + \frac{e^2}{2} - \frac{e^2}{6}\cos 2\phi + e\,\phi\sin\phi\right] , \qquad (8.85)

which can be checked by direct differentiation. The three kinds of terms in the square bracket are rather
different. The first and the second are just a constant and an oscillatory term around zero, both
of them very small due to the constant A in front. The third one is different since it accumulates over
successive orbits and gradually grows with time. Retaining only this last term we get

u = \frac{GM}{h^2}\left[1 + e\left(\cos\phi + \alpha\,\phi\sin\phi\right)\right] \qquad (8.86)

which can be written in a more enlightening way

u \approx \frac{GM}{h^2}\left[1 + e\cos\left((1 - \alpha)\,\phi\right)\right] \qquad (8.87)

by taking into account that

\cos\left[\phi\,(1 - \alpha)\right] = \cos\phi\cos\alpha\phi + \sin\phi\sin\alpha\phi \approx \cos\phi + \alpha\,\phi\sin\phi \qquad (8.88)

for

\alpha \equiv \frac{3(GM)^2}{h^2} \ll 1 . \qquad (8.89)

The solution (8.87) shows that the orbit is still periodic, but with a period that is no longer 2π, but
rather 2π/(1 − α). The values of r repeat on cycles larger than 2π and the orbit precesses.

The advance of the perihelion in one revolution is

\Delta\phi = \frac{2\pi}{1 - \alpha} - 2\pi \approx 2\pi\alpha = \frac{6\pi G^2 M^2}{h^2} , \qquad (8.90)

which taking into account (8.77) can be written as¹⁶ (note that we restore the c factors)

\Delta\phi = \frac{6\pi G^2 M^2}{h^2 c^2} = \frac{6\pi GM}{a(1 - e^2)c^2} . \qquad (8.91)
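
The perturbative result can be cross-checked by integrating the orbit equation u″ + u = GM/h² + 3GM u² numerically and measuring the angle between two successive perihelia. The following sketch uses a hand-written fourth-order Runge-Kutta step in units G = M = c = 1; the values of h and e, the step size and the function name are illustrative assumptions of mine, and agreement with 6π(GM)²/h² is only expected up to higher-order corrections.

```python
import math

def perihelion_shift(h, e, gm=1.0, dphi=1e-3):
    """Angle between two successive perihelia minus 2*pi, from a direct
    RK4 integration of u'' + u = GM/h^2 + 3 GM u^2 (units with c = 1)."""
    def accel(u):
        return gm / h**2 + 3.0 * gm * u * u - u

    u = gm / h**2 * (1.0 + e)   # start at a perihelion: u maximal, du/dphi = 0
    du = 0.0
    phi = 0.0
    for _ in range(int(4.0 * math.pi / dphi)):
        k1u, k1v = du, accel(u)
        k2u, k2v = du + 0.5 * dphi * k1v, accel(u + 0.5 * dphi * k1u)
        k3u, k3v = du + 0.5 * dphi * k2v, accel(u + 0.5 * dphi * k2u)
        k4u, k4v = du + dphi * k3v, accel(u + dphi * k3u)
        u_new = u + dphi / 6.0 * (k1u + 2.0 * k2u + 2.0 * k3u + k4u)
        du_new = du + dphi / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
        if du > 0.0 and du_new <= 0.0:
            # du/dphi changes sign from + to -: next perihelion (linear interpolation)
            phi_peri = phi + dphi * du / (du - du_new)
            return phi_peri - 2.0 * math.pi
        u, du, phi = u_new, du_new, phi + dphi
    raise RuntimeError("no second perihelion found within two revolutions")

h = 20.0
numerical = perihelion_shift(h, e=0.2)
first_order = 6.0 * math.pi / h**2
print(numerical, first_order)
```

For this rather relativistic choice of h the two numbers differ at the percent level, which is the size of the terms neglected in the perturbative treatment.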

Because it is a small effect, let’s accumulate it over 100 years to get the observable quantity

\Delta\phi_{100} \equiv \frac{\Delta\phi}{T} \times \frac{100\ \text{years}}{\text{century}} , \qquad (8.92)

with T the period of the orbit in years. Among the observable orbits within the solar system, Mercury
is the closest planet to the Sun, and so it should have the largest precession.

Object    Mass         Mean equatorial radius   Period      Semimajor axis   Eccentricity
          (10^24 kg)   (10^3 km)                (days)      (10^8 km)
Mercury   0.33010      2.4397                   87.969      0.57909227       0.20563593
Venus     4.8673       6.0518                   224.701     1.0820948        0.00677672
Earth     5.9722       6.3710                   365.256     1.4959826        0.01671123
Mars      0.64169      3.3895                   686.98      2.2794382        0.0933941
Jupiter   1898.1       69.911                   4332.71     7.7834082        0.04838624
Saturn    568.32       58.232                   10759.50    14.266664        0.05386179
Uranus    86.810       25.362                   30685.00    28.706582        0.04725744
Neptune   102.41       24.622                   60190.00    44.983964        0.00859048

Taking into account the values for Mercury’s orbit, we obtain

\Delta\phi_{100} \approx 43.03'' . \qquad (8.93)
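
The number above is easy to reproduce. The sketch below evaluates 6πGM/(a(1 − e²)c²) per orbit and accumulates it over a Julian century of 36525 days. The solar gravitational parameter GM_⊙ and the speed of light are standard reference values that I am assuming; the orbital elements are Mercury’s semimajor axis and eccentricity together with a sidereal period of 87.969 days.

```python
import math

GM_SUN = 1.32712440018e20      # m^3/s^2, standard solar gravitational parameter (assumed)
C_LIGHT = 299792458.0          # m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def perihelion_advance_per_century(a_m, e, period_days):
    """Relativistic perihelion advance in arcseconds per century."""
    per_orbit = 6.0 * math.pi * GM_SUN / (a_m * (1.0 - e**2) * C_LIGHT**2)
    orbits_per_century = 36525.0 / period_days   # Julian century in days
    return per_orbit * orbits_per_century * ARCSEC_PER_RAD

mercury = perihelion_advance_per_century(0.57909227e11, 0.20563593, 87.969)
venus = perihelion_advance_per_century(1.0820948e11, 0.00677672, 224.701)
earth = perihelion_advance_per_century(1.4959826e11, 0.01671123, 365.256)
print(f"Mercury: {mercury:.2f}  Venus: {venus:.2f}  Earth: {earth:.2f}  (arcsec/century)")
```

Small differences in the last digit with respect to quoted values can come from the adopted constants and from the precise period used.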

The major axis of Mercury precesses at a rate of 43 arcseconds per century. The observational results are
in excellent agreement with General Relativity

Planet    Observed residual    GR prediction
Mercury   (43.11 ± 0.45)″      43.03″
Venus     (8.4 ± 4.8)″         8.6″
Earth     (5.0 ± 1.2)″         3.8″

¹⁶ The use of the expressions for the unperturbed solution is justified by the fact that we are looking at a very small
quantity.

43 arcseconds and the end of the Newtonian empire

Newton’s theory had been a very successful theory, extensively used by astronomers for centuries.
It had predicted the return of comet Halley (1758) with an error of 33 days, the elliptical
character of the orbit of the recently discovered Uranus (1781) and, even more surprisingly, the location,
mass and orbital parameters of Neptune, even before it was directly observed (1846). Leverrier
discovered it just with the point of his pen!ᵃ; clearly an amazing proof of the universality of
the gravitational interaction. Nevertheless, at the end of the 19th century there were still some
caveats related to Mercury’s orbit. As you should know, the 1/r² dependence of Newton’s
force gives rise to elliptical trajectories on a plane, and the corresponding perihelion is a priori a
fixed pointᵇ. However, different perturbations (due for instance to the presence of other massive
objects in the Solar system, such as Jupiter, or to the quadrupole moment of the Sun) give rise
to a perihelion advance, and therefore to an ellipse turning on the plane. Even when all those
effects were taken into account there was a residual contribution to the shift. As pointed out by
Leverrier and Newcomb at the end of the 19th century, Mercury’s perihelion precesses at a rate of 575″
per century, but only 532″ can be explained by the perturbations associated with the other planets.
The remaining 43″ per century could not be accounted for by the Newtonian theory even when
errors were taken into account. The observational problem was basically closed for everyone
(apart from Leverrierᶜ), but the theoretical problem would remain open until the introduction of
General Relativity in 1915.
ᵃ Francois Arago 1786-1853.
ᵇ The Laplace-Runge-Lenz vector is conserved.
ᶜ He died believing that the history of the discovery of Neptune would repeat itself and that a new planet with a mass
large enough to account for the 43″ per century would be found between the Sun and Mercury.

8.9 The massless case: Gravitational deflection of light

Let us consider now the massless case where ε = 0 and (8.73) becomes

\frac{d^2u}{d\phi^2} + u = 3GM u^2 . \qquad (8.94)

In the absence of the term 3GM u², the previous equation reduces to the simple harmonic oscillator
equation

\frac{d^2u_0}{d\phi^2} + u_0 = 0 , \qquad (8.95)

whose solutions

u_0 = \frac{\sin\phi}{b} , \qquad (8.96)

can be interpreted as straight lines passing at a distance b from the origin. Following a similar
procedure to the one used in the previous section, we look for perturbative solutions of the form

u = u_0 + \Delta u = \frac{\sin\phi}{b} + \Delta u \qquad (8.97)

with u₀ given by (8.96). Substituting (8.97) into (8.94) we get, to leading order, a linear equation in ∆u

\frac{d^2\Delta u}{d\phi^2} + \Delta u = \frac{3GM}{b^2}\sin^2\phi , \qquad (8.98)

whose solution is given by

\Delta u = \frac{3GM}{2b^2}\left[1 + \frac{1}{3}\cos(2\phi)\right] . \qquad (8.99)

Adding this to the unperturbed solution we get

u = \frac{\sin\phi}{b} + \frac{3GM}{2b^2}\left[1 + \frac{1}{3}\cos(2\phi)\right] , \qquad (8.100)

which in the limit r → ∞, u → 0 and for small φ gives rise to

\phi \approx -\frac{2GM}{bc^2} . \qquad (8.101)

The total deflection angle is twice the magnitude of this value

\Delta\phi = \frac{2R_S}{b} = \frac{4GM}{bc^2} . \qquad (8.102)

For rays coming from distant stars and grazing the surface of the Sun¹⁷

b \approx R_\odot = 6.96 \times 10^5\ \text{km} , \qquad M_\odot = 2 \times 10^{30}\ \text{kg} \qquad (8.103)

we get

\Delta\phi = \frac{4GM_\odot}{c^2 R_\odot} = 1.75'' . \qquad (8.104)
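
As a numerical check of the last value, the sketch below evaluates 4GM/(bc²) for a ray grazing the solar limb and converts the result to arcseconds. The solar gravitational parameter and the speed of light are standard reference values that I am assuming; the solar radius is the one quoted in (8.103).

```python
import math

GM_SUN = 1.32712440018e20      # m^3/s^2, standard solar gravitational parameter (assumed)
C_LIGHT = 299792458.0          # m/s
R_SUN = 6.96e8                 # m, solar radius as quoted in Eq. (8.103)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def deflection_arcsec(impact_parameter_m, gm=GM_SUN):
    """Total light deflection 4GM / (b c^2), Eq. (8.102), in arcseconds."""
    return 4.0 * gm / (impact_parameter_m * C_LIGHT**2) * ARCSEC_PER_RAD

print(f"grazing ray: {deflection_arcsec(R_SUN):.3f} arcsec")
print(f"b = 2 R_sun: {deflection_arcsec(2.0 * R_SUN):.3f} arcsec")
```

The 1/b dependence makes the effect fall off quickly away from the limb, which is why grazing rays are the natural target.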
Stars whose light passes so close to the Sun are of course not visible by day, but they become visible at the
time of a total eclipse. Their position relative to the other background stars during the total eclipse
appears shifted relative to the position in the usual night sky. This prediction of General Relativity
was verified in 1919, just a few years after the formulation of the theory. Two separate groups led by
Arthur Eddington and Andrew Crommelin moved to Guinea and Brazil to observe the total eclipse of
May 29, 1919. They reported deflections of (1.61 ± 0.40)″ and (1.98 ± 0.16)″, in reasonable agreement
with Einstein’s prediction (8.104).

8.10 The post-Newtonian formalism

Nowadays, the agreement between theory and observation is at the level of a few parts in a thousand.
Deviations from General Relativity are usually parametrized in terms of the so-called post-Newtonian
parameters β and γ, which measure respectively the non-linearity in the superposition law for
gravity and the spatial curvature produced by a unit rest mass,
$$ ds^2 = -\left[1 - \frac{2GM}{r} + 2(\beta-\gamma)\frac{G^2M^2}{r^2}\right]dt^2 + \left[1 - \frac{2\gamma GM}{r}\right]^{-1}dr^2 + r^2 d\Omega^2 . \tag{8.105} $$

When these parameters are taken into account, Eqs. (8.91) and (8.102) become respectively
$$ \Delta\phi = \frac{2-\beta+2\gamma}{3}\,\frac{6\pi GM}{a(1-e^2)c^2} , \qquad \Delta\phi = \frac{1+\gamma}{2}\,\frac{4GM}{bc^2} . \tag{8.106} $$

The General Relativity limit corresponds to γ = β = 1. Recent measurements provide values
γ = 0.9998 ± 0.0003 and |2γ − β − 1| < 3 × 10⁻³, in excellent agreement with GR.
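
The two prefactors in Eq. (8.106) can be tabulated with a few lines of code. This is only an illustrative sketch: the function names are hypothetical, and the choice β = 1 in the second example is an assumption made here, not a measured value.

```python
def perihelion_factor(beta, gamma):
    """Prefactor (2 - beta + 2 gamma)/3 multiplying the GR perihelion shift in Eq. (8.106)."""
    return (2.0 - beta + 2.0 * gamma) / 3.0

def deflection_factor(gamma):
    """Prefactor (1 + gamma)/2 multiplying the GR light deflection in Eq. (8.106)."""
    return (1.0 + gamma) / 2.0

# General Relativity limit: both prefactors reduce to one.
print(perihelion_factor(1.0, 1.0), deflection_factor(1.0))

# Central value of gamma quoted in the text; beta = 1 is assumed for illustration.
print(perihelion_factor(1.0, 0.9998), deflection_factor(0.9998))
```

Note that a theory with γ = 0 would predict half of the GR light deflection, which is why the deflection measurement is a direct probe of the spatial curvature.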

17 In this case, the effect is maximized and easier to observe.

CHAPTER 9
GENERAL RELATIVITY: THE FIELD THEORY APPROACH

We move now to the modern approach to General Relativity: field theory. The chief advantage of
this formulation is that it is simple and easy; the only thing to specify is the so-called Lagrangian
density. We start by presenting a simple introduction to classical field theory in flat spacetime which
we later generalize to curved spacetime. The last part of the Chapter is devoted to the action for the
gravitational field and the recovery of Einstein equations from it.

9.1 Classical mechanics

The fundamental problem of classical mechanics is to determine the way particles move given a
potential. Dynamical systems can be described by equations of motion or by action functionals
$$ S = \int dt\, L\left(q_j(t), \dot q_j(t)\right) \tag{9.1} $$

depending on generalized coordinates and velocities {qj (t), q̇j (t)}. The classical trajectory is defined
as the unique path that extremizes the action functional (δS = 0) for all variations qj → qj + δqj with
fixed initial qj (t0 ) and final values qj (tf ). An explicit variation of the action gives
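$$ \delta S = \int_{t_0}^{t_f} dt \left[\frac{\partial L}{\partial q_j}\,\delta q_j + \frac{\partial L}{\partial \dot q_j}\,\delta \dot q_j\right] . \tag{9.2} $$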

Integrating the last term by parts to flip the temporal derivative onto ∂L/∂ q̇j we get
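$$ \delta S = \int_{t_0}^{t_f} dt \left[\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_j}\right]\delta q_j , \tag{9.3} $$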

where we have omitted a total derivative that vanishes because of the boundary conditions δqj (t0 ) =
δqj (tf ) = 0. Since δqj is arbitrary, the extremization of the action translates into the so-called Euler-
Lagrange equations
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_j} - \frac{\partial L}{\partial q_j} = 0 . \tag{9.4} $$
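
The stationarity of the action can be illustrated numerically. The sketch below is an assumption-laden toy: it takes a harmonic oscillator with unit mass and frequency, L = q̇²/2 − q²/2, whose classical solution between t = 0 and t = 1 is q = sin t, and perturbs it with a hypothetical variation ε sin(πt) that vanishes at both ends. The first-order change of the discretized action should vanish up to discretization errors.

```python
import math

def action(path, dt):
    """Discretized action for L = q_dot**2/2 - q**2/2 (unit mass and frequency)."""
    total = 0.0
    for q0, q1 in zip(path, path[1:]):
        velocity = (q1 - q0) / dt          # finite-difference velocity on the segment
        midpoint = 0.5 * (q0 + q1)         # position evaluated at the segment midpoint
        total += (0.5 * velocity**2 - 0.5 * midpoint**2) * dt
    return total

def perturbed_path(eps, n_steps=1000, t_final=1.0):
    """Classical solution sin(t) plus eps * sin(pi t / t_final), which vanishes at both ends."""
    dt = t_final / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    path = [math.sin(t) + eps * math.sin(math.pi * t / t_final) for t in times]
    return path, dt

def action_of(eps):
    path, dt = perturbed_path(eps)
    return action(path, dt)

eps = 1e-3
# Symmetric finite differences in eps: first and second variation of the action.
first_order = (action_of(eps) - action_of(-eps)) / (2 * eps)
second_order = (action_of(eps) + action_of(-eps) - 2 * action_of(0.0)) / eps**2
print(first_order, second_order)
```

The first-order coefficient is compatible with zero, while the second-order one is positive: for this short time interval the classical path is a minimum of the action.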
This variational formulation has several advantages:

i) The properties of the system are compactly summarized in one function, the Lagrangian.
ii) There is a direct connection between invariances of the Lagrangian and constants of motion¹.
iii) There is a close relation between the Lagrangian formulation of classical mechanics and quantum
mechanics.

9.2 From Classical Mechanics to Field theory

The Lagrangian formalism presented above can be extended to continuous systems involving an infinite
number of degrees of freedom. This is achieved by taking the appropriate limit of a system with a
finite number of degrees of freedom. Consider a one-dimensional chain of length l made of N equal
masses m. The masses are separated by a distance a and connected by identical massless springs
with force constant k. The total length of the system is l = (N + 1)a. The displacement of each
particle with respect to its equilibrium position x̄j = ja is described by a generalized coordinate
φj(t) ≡ φ(xj, t) ≡ xj(t) − x̄j with j = 1, . . . , N and φ0 = φN+1 = 0. The Lagrangian of the full system
includes the kinetic energy of the particles and the energy stored in the springs, i.e.
$$ L = T - U = \frac{1}{2}\, m \sum_{j=1}^{N} \dot\phi_j^2(t) - \frac{1}{2}\, k \sum_{j=0}^{N} \left(\phi_{j+1}(t) - \phi_j(t)\right)^2 . \tag{9.5} $$

The continuous limit of the previous expression can be taken by sending N → ∞ and a → 0 in such
a way that the total length of the chain, l = (N + 1)a, remains fixed. To keep the total mass of the
system and the force between particles finite we require m/a and ka to go to some finite values µ and
Y, playing the role of the mass density and the Young modulus in the continuous theory. We have
$$ L = \frac{1}{2}\sum_{j=1}^{N} a\,\frac{m}{a}\,\dot\phi_j^2(t) - \frac{1}{2}\sum_{j=0}^{N} a\,(ka)\left(\frac{\phi_{j+1}(t)-\phi_j(t)}{a}\right)^2 \;\longrightarrow\; L = \frac{1}{2}\int_0^l dx \left[\mu\,\dot\phi^2 - Y\,(\partial_x\phi)^2\right] , $$
with the finite number of generalized coordinates φj replaced by a continuous function φ(x, t). The
relative sign between the time and space derivatives in this expression suggests the introduction of a
set of coordinates $x^\mu = (c_s t, x)^T$ with $c_s = \sqrt{Y/\mu}$ and a Lorentzian metric ηµν = diag(−1, 1).
This allows us to write
$$ S = \int d(c_s t)\,dx\; \mathcal{L} \tag{9.6} $$
with
$$ \mathcal{L} = -\frac{\mu c_s}{2}\,\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi , \tag{9.7} $$
the so-called Lagrangian density. The jump from fields existing within a physical medium to fields in
vacuum is now straightforward: we must simply replace cs by the speed of light c. Generalizing the
metric ηµν to arbitrary dimensions, we can generically write the action for relativistic fields as
$$ S = \int d^n x\; \mathcal{L}(\phi, \partial_\mu\phi) , \tag{9.8} $$
where we have allowed for a dependence of the Lagrangian density on the fields.

Exercise
Consider again the chain of masses connected by springs. Modify the system to give rise to an
explicit dependence of the Lagrangian on φ. Hint: Eq. (9.7) is shift-invariant.
1 For instance, if the Lagrangian is invariant under rotations, angular momentum is conserved.

Gauge freedom in the Lagrangian

The Lagrangian of a physical system is not unique. To see this, consider a transformation of
the form
$$ \mathcal{L} \longrightarrow \mathcal{L}' = \mathcal{L} + \partial_\sigma g^\sigma , \tag{9.9} $$
and its effect on the action (9.8)
$$ S' = \int d^n x\, \mathcal{L}' = S + \int_R d^n x\, \partial_\sigma g^\sigma = S + \int_{\partial R} g^\sigma\, dS_\sigma . \tag{9.10} $$

The term associated with $\partial_\sigma g^\sigma$ turns out to be a boundary term, which does not contribute to
the equations of motion. Lagrangians differing by a contribution $\partial_\sigma g^\sigma$ give rise to the same
equations of motion.

The equations of motion for the field φ(x, t) can be obtained by considering the change of the action
under an infinitesimal change φ(x, t) −→ φ(x, t) + δφ(x, t). The only requirement to be satisfied by
the variations δφ is to be differentiable and to vanish outside some bounded region of spacetime (to
allow an integration by parts). Performing this variation we get
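$$ \delta S = \int d^n x \left[\frac{\partial \mathcal{L}}{\partial\phi}\,\delta\phi + \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\,\delta\phi_{,\mu}\right] = \int d^n x \left[\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right]\delta\phi . \tag{9.11} $$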

Requiring the action to be stationary (δS = 0) and taking into account that δφ is completely arbitrary,
we obtain the continuous version of the Euler-Lagrange equations

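$$ \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} - \frac{\partial \mathcal{L}}{\partial\phi} = 0 . \tag{9.12} $$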

A worked-out example
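As a direct application of Eq. (9.12), let me consider the Lagrangian density (9.7), which does not
depend on φ itself:
$$ \partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu\phi)} = 0 \;\longrightarrow\; \partial_\mu\left(\eta^{\mu\nu}\partial_\nu\phi\right) = 0 \;\longrightarrow\; -\frac{1}{c_s^2}\frac{\partial^2\phi}{\partial t^2} + \frac{\partial^2\phi}{\partial x^2} = 0 . \tag{9.13} $$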

As expected, we get a wave equation.
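
The emergence of the wave equation from the chain of masses can also be checked numerically. The sketch below relies on a standard textbook result that is not derived in these notes, namely that the normal-mode frequencies of N masses with fixed ends are ω_n = 2√(k/m) sin[nπ/(2(N+1))]. They are compared with the standing-wave frequencies ω_n = c_s nπ/l of the wave equation, keeping µ = m/a, Y = ka and l fixed. The function names and the unit values of µ, Y and l are assumptions made for illustration.

```python
import math

def chain_frequency(n, n_masses, mu=1.0, young=1.0, length=1.0):
    """n-th normal-mode frequency of the discrete chain with fixed ends."""
    a = length / (n_masses + 1)      # spacing, so that l = (N + 1) a
    m = mu * a                       # keeps m / a = mu fixed
    k = young / a                    # keeps k a = Y fixed
    return 2.0 * math.sqrt(k / m) * math.sin(n * math.pi / (2 * (n_masses + 1)))

def continuum_frequency(n, mu=1.0, young=1.0, length=1.0):
    """n-th standing-wave frequency of the wave equation with c_s = sqrt(Y / mu)."""
    c_s = math.sqrt(young / mu)
    return c_s * n * math.pi / length

# The ratio approaches one from below as the number of masses grows.
for n_masses in (10, 100, 1000):
    ratio = chain_frequency(1, n_masses) / continuum_frequency(1)
    print(n_masses, ratio)
```

The ratio of discrete to continuum frequencies is sin(x)/x with x = nπ/(2(N+1)), so the deviation from the wave-equation result decreases quadratically with the lattice spacing.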

9.3 Principles of Lagrangian construction

What kind of Lagrangian density should we choose? To be on the safe side, the Lagrangian density of
a relativistic theory is required to satisfy the following requirements:

1. L must be a real-valued function, since it enters into expressions of physical significance, like
the Hamiltonian.
2. L must have dimension 4 in units of energy, since in natural units the action is dimensionless
and [d⁴x] = −4.
3. L must be a linear combination of Lorentz invariant quantities constructed from the fields, their
first partial derivatives and the universally available objects $\eta_{\mu\nu}$ and $\epsilon_{\mu\nu\rho\sigma}$.
4. The coefficients of this linear combination can be restricted by the symmetries of the problem
(internal symmetries/gauge symmetries).
5. L should be bounded from below.

The power of the previous program is made most vividly evident by considering some examples.

9.3.1 A complex scalar field with U (1) symmetry

Consider a complex scalar field

Φ = φ1 + iφ2 , Φ∗ = φ1 − iφ2 . (9.14)

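The quadratic Lorentz invariants which can be constructed from Φ, Φ∗, ∂µΦ and ∂µΦ∗ lead to a
Lagrangian density of the form
$$ \mathcal{L} = \frac{1}{2}\eta^{\mu\nu}\left[a\,\partial_\mu\Phi\,\partial_\nu\Phi + a^*\partial_\mu\Phi^*\partial_\nu\Phi^* + 2a_0\,\partial_\mu\Phi^*\partial_\nu\Phi\right] + \frac{1}{2}\left[b\,\Phi\Phi + b^*\Phi^*\Phi^* + 2b_0\,\Phi^*\Phi\right] , \tag{9.15} $$
where the reality condition L = L∗ imposes the appearance of the pairs a, a∗ and b, b∗ and requires the
coefficients a0 and b0 to be real. The previous Lagrangian density can be written in a more compact
way by introducing the arrays
$$ \tilde\Phi \equiv \begin{pmatrix}\Phi\\ \Phi^*\end{pmatrix} \quad\text{and}\quad \tilde\Phi^\dagger \equiv \begin{pmatrix}\Phi^*\\ \Phi\end{pmatrix}^{T} = \left(\Phi^*\ \ \Phi\right) \tag{9.16} $$
to get²
$$ \mathcal{L} = \frac{1}{2}\,\eta^{\mu\nu}\,\tilde\Phi^\dagger_{,\mu}\begin{pmatrix}a_0 & a^*\\ a & a_0\end{pmatrix}\tilde\Phi_{,\nu} + \frac{1}{2}\,\tilde\Phi^\dagger\begin{pmatrix}b_0 & b^*\\ b & b_0\end{pmatrix}\tilde\Phi . \tag{9.17} $$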
The number of terms appearing in this Lagrangian can be reduced in cases in which we have symmetries
on top of Lorentz invariance. As an illustration of this, imagine the field Φ to possess an internal
symmetry
$$ \Phi \to e^{i\omega}\Phi , \qquad \Phi^* \to e^{-i\omega}\Phi^* . \tag{9.18} $$
In this case, we necessarily have a = b = 0 and the matrices in (9.17) become diagonal. This leaves
us with a simpler Lagrangian that, with some notational adjustments, can be written as
$$ \mathcal{L} = \frac{1}{2}K\left[-\eta^{\mu\nu}\partial_\mu\Phi^*\partial_\nu\Phi - \kappa^2\,\Phi^*\Phi\right] . \tag{9.19} $$
A direct application of the Euler-Lagrange equations (9.12) provides two uncoupled equations for Φ
and Φ∗, namely
$$ \left(\Box - \kappa^2\right)\Phi = 0 , \qquad \left(\Box - \kappa^2\right)\Phi^* = 0 , \tag{9.20} $$
that we can use to provide a physical interpretation for the parameter κ². Indeed, setting $\Phi = e^{ik_\mu x^\mu}$
with $k^\mu = (E, \mathbf{p})$ in any of these two equations, we get a dispersion relation E² − p² = κ², which
makes it natural to identify the parameter κ² with the squared mass m² of the field.
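
The step leading to the dispersion relation can be spelled out as follows. This is a short sketch assuming the mostly-plus signature η = diag(−1, 1, 1, 1), consistent with the two-dimensional metric diag(−1, 1) used above, and writing the wave vector as k^µ = (E, p):

```latex
\begin{align}
\Phi &= e^{i k_\mu x^\mu}, \qquad k^\mu = (E, \mathbf{p}), \qquad
k_\mu x^\mu = -E\,t + \mathbf{p}\cdot\mathbf{x}, \\
\Box\,\Phi &= \eta^{\mu\nu}\partial_\mu\partial_\nu \Phi
            = -k_\mu k^\mu\,\Phi
            = \left(E^2 - \mathbf{p}^2\right)\Phi, \\
\left(\Box - \kappa^2\right)\Phi &= 0
\quad\Longrightarrow\quad E^2 - \mathbf{p}^2 = \kappa^2 .
\end{align}
```

Comparing with the relativistic energy-momentum relation E² = p² + m² in natural units gives κ² = m².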
2 In this notation, the reality condition L = L∗ results from the hermiticity of the 2 × 2 matrices.

9.3.2 A worked-out example: Vector and tensor fields

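The previous analysis can be easily extended to other kinds of fields. Consider for instance a vector
field Aµ and a tensor field Bµν. Which is the most general Lagrangian L(A, ∂A, B, ∂B) that can be
constructed out of these two fields? To clarify the procedure, let me split the problem into several
pieces
$$ \mathcal{L}(A, B, \partial A, \partial B) = \mathcal{L}_A(A, \partial A) + \mathcal{L}_B(B, \partial B) + \mathcal{L}_{int}(A, B, \partial A, \partial B) . \tag{9.21} $$
The Lagrangian L_A is a free Lagrangian for Aµ, i.e. a linear combination of Lorentz invariant terms
quadratic in Aµ³,
$$ \mathcal{L}_A = a_1\,\partial_\mu A^\mu\,\partial_\nu A^\nu + a_2\,\partial_\mu A_\nu\,\partial^\mu A^\nu + a_3\,\partial_\mu A_\nu\,\partial^\nu A^\mu + a_4\,A_\mu A^\mu + a_5\,\epsilon_{\mu\nu\rho\sigma}(\partial^\nu A^\mu)(\partial^\sigma A^\rho) . \tag{9.22} $$
The Lagrangian L_B is a free Lagrangian for Bµν
$$ \begin{aligned} \mathcal{L}_B ={}& b_1\,\partial_\mu B^\nu{}_\nu\,\partial^\mu B^\kappa{}_\kappa + b_2\,\partial_\mu\left(B^{\kappa\nu} + B^{\nu\kappa}\right)\partial^\mu\left(B_{\kappa\nu} + B_{\nu\kappa}\right) + b_3\,\partial_\mu\left(B^{\mu\nu} + B^{\nu\mu}\right)\partial_\nu B^\kappa{}_\kappa \\ &+ b_4\,\partial_\mu\left(B^{\kappa\mu} + B^{\mu\kappa}\right)\partial^\nu\left(B_{\kappa\nu} + B_{\nu\kappa}\right) + b_5\,\epsilon_{\mu\nu\rho\sigma}B^{\mu\nu}B^{\rho\sigma} + b_6\,B_{\mu\nu}B^{\mu\nu} + b_7\,B_{\mu\nu}B^{\nu\mu} + b_8\left(B^\mu{}_\mu\right)^2 . \end{aligned} \tag{9.23} $$
The interaction Lagrangian L_int follows the same logic, but now involving the interactions between
the two fields,
$$ \begin{aligned} \mathcal{L}_{int} ={}& i_1\,\partial_\mu A_\nu\left(B^{\mu\nu} + B^{\nu\mu}\right) + i_2\,A_\mu\,\partial_\nu\left(B^{\mu\nu} + B^{\nu\mu}\right) + i_3\,\epsilon_{\mu\nu\rho\sigma}(\partial^\mu A^\nu)B^{\rho\sigma} \\ &+ i_4\,\epsilon_{\mu\nu\rho\sigma}A^\mu(\partial^\nu B^{\rho\sigma}) + i_5\,A_\mu A_\nu B^{\mu\nu} . \end{aligned} \tag{9.24} $$
I may have missed some terms, but I think you get the idea. As in the scalar case, the list of operators
and independent coefficients can be shortened in cases in which there are extra symmetries on top of
Lorentz invariance. Let me illustrate this with a simpler example.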

The action for the electromagnetic field

Consider the Lagrangian (9.22) alone. The first thing that can simplify our life is the gauge freedom in
the choice of the Lagrangian density. In particular, notice that choosing $g_\sigma = \epsilon_{\mu\nu\rho\sigma}(\partial^\nu A^\mu)A^\rho$ in (9.9)
allows us to eliminate the term a5 in (9.22), since
$$ \partial^\sigma g_\sigma = \epsilon_{\mu\nu\rho\sigma}(\partial^\nu A^\mu)(\partial^\sigma A^\rho) + \underbrace{\epsilon_{\mu\nu\rho\sigma}(\partial^\nu\partial^\sigma A^\mu)A^\rho}_{0\ \text{by symmetry}} . \tag{9.25} $$

Taking this into account we are left with an action containing 4 pieces
$$ S = \int d^4x \left[a_1\,\partial_\mu A^\mu\,\partial_\nu A^\nu + a_2\,\partial_\mu A_\nu\,\partial^\mu A^\nu + a_3\,\partial_\mu A_\nu\,\partial^\nu A^\mu + a_4\,A_\mu A^\mu\right] . \tag{9.26} $$

Imagine now that Aµ is the field of a gauge theory. In that case the field configurations Aµ and
$$ A'_\mu = A_\mu + \partial_\mu\chi , \tag{9.27} $$
with arbitrary scalar function χ give rise to the same physical observables⁴. This automatically forbids
the a4 term in (9.26)⁵ and puts some restrictions on the other coefficients. To see this, let me split
∂µAν into its symmetric $S_{\mu\nu} \equiv \partial_{(\mu}A_{\nu)}$ and antisymmetric $F_{\mu\nu} \equiv \partial_{[\mu}A_{\nu]}$ parts
$$ \partial_\mu A_\nu = S_{\mu\nu} + F_{\mu\nu} , \tag{9.28} $$
3 Quadratic actions give rise to linear equations of motion, where the superposition principle can be applied.
4 In the same way that physicality cannot be attributed to L, we cannot make any claim about the physicality of Aµ .
Physicality might be attributed to the set {Aµ } of gauge-equivalent 4-potentials or to any gauge invariant attribute of
that set, but not to its individual elements.
5 It cannot be compensated by the transformation of the other (derivative) terms.

and rewrite the action (9.26) as
$$ S = \int d^4x \left[a_1\,S^\mu{}_\mu S^\nu{}_\nu + (a_2 + a_3)\,S_{\mu\nu}S^{\mu\nu} + (a_2 - a_3)\,F_{\mu\nu}F^{\mu\nu}\right] . \tag{9.29} $$

The invariance of the action under the gauge transformation Aµ → Aµ + ∂µχ requires a1 = 0 and
a3 = −a2. This restriction leaves us with an action
$$ S = \int d^4x\, F_{\mu\nu}F^{\mu\nu} , \tag{9.30} $$
where we have omitted an overall normalization factor that can be determined by choosing the coupling
of the gauge field Aµ to matter and the units of that coupling. The equations of motion associated
with this action can be computed via the Euler-Lagrange equations (9.12) or by varying the action
with respect to Aµ. We follow the second procedure to get
$$ \begin{aligned} \delta S &= \int d^4x \left[F^{\mu\nu}\delta F_{\mu\nu} + F_{\mu\nu}\delta F^{\mu\nu}\right] = 2\int d^4x\, F^{\mu\nu}\delta F_{\mu\nu} \\ &= 2\int d^4x\, F^{\mu\nu}\left(\partial_\mu\delta A_\nu - \partial_\nu\delta A_\mu\right) = 4\int d^4x\, F^{\mu\nu}\partial_\mu\delta A_\nu \\ &= -4\int d^4x\, \partial_\mu F^{\mu\nu}\,\delta A_\nu + \text{boundary terms} , \end{aligned} \tag{9.31} $$
where we used the antisymmetry of Fµν and performed an integration by parts. Imposing
finally the condition δS = 0 for arbitrary δAν, we arrive at the very familiar result
$$ \partial_\mu F^{\mu\nu} = 0 . \tag{9.32} $$

The Maxwell equations in vacuum are recovered from an action (9.30) constructed with very limited
principles, namely, quadraticity in the fields, Lorentz invariance and gauge invariance.
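
The reason why gauge invariance singles out the antisymmetric combination can be made explicit with a short computation. The sketch below uses only the definitions of S_{µν} and F_{µν} given above and the fact that partial derivatives commute:

```latex
\begin{align}
A_\mu \to A_\mu + \partial_\mu \chi : \qquad
F_{\mu\nu} &\to F_{\mu\nu} + \partial_{[\mu}\partial_{\nu]}\chi = F_{\mu\nu} , \\
S_{\mu\nu} &\to S_{\mu\nu} + \partial_{(\mu}\partial_{\nu)}\chi
            = S_{\mu\nu} + \partial_\mu\partial_\nu\chi .
\end{align}
```

The antisymmetric part is therefore gauge invariant by itself, while every term built out of S_{µν} shifts under the transformation. Invariance for arbitrary χ then requires the coefficients of those terms to vanish, a1 = 0 and a2 + a3 = 0.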

9.4 The action for the graviton

The procedure outlined in the previous section is quite powerful. As an interesting application for
General Relativity, let me consider the action for a second-rank symmetric and massless tensor field
hµν. As in the vector field case, the kinetic term is constructed out of scalars that are quadratic in
the derivatives ∂ρhµν. The most general expression will be the sum of the different scalars obtained
by contracting pairs of indices in all possible ways. The resulting action takes the form
$$ S = \int d^4x \left[c_1\,\partial_\mu h^\nu{}_\nu\,\partial^\mu h^\kappa{}_\kappa + c_2\,\partial_\mu h_{\kappa\nu}\,\partial^\mu h^{\kappa\nu} + c_3\,\partial_\mu h^{\mu\nu}\,\partial_\nu h^\kappa{}_\kappa + c_4\,\partial_\mu h^{\kappa\mu}\,\partial^\nu h_{\kappa\nu}\right] , \tag{9.33} $$
with c1, c2, c3, c4 some undetermined constants. These constants can be determined up to an overall
factor by requiring the action to be invariant under the gauge transformations
$$ h_{\mu\nu} \to h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu . \tag{9.34} $$

Plugging (9.34) into (9.33) and performing some simple manipulations we get
$$ \begin{aligned} S \to S + \int d^4x\,\big[ &-2(2c_1 + c_3)\,\partial_\mu h^\kappa{}_\kappa\,\partial^\mu\partial_\lambda\xi^\lambda - 2(2c_2 + c_4)\,\partial^\mu h_{\mu\nu}\,\Box\xi^\nu - 2(c_3 + c_4)\,\partial^\mu h_{\mu\nu}\,\partial^\nu\partial_\kappa\xi^\kappa \\ &+ (4c_1 + 2c_2 + 4c_3 + 3c_4)\,\partial_\mu\partial_\kappa\xi^\kappa\,\partial^\mu\partial_\lambda\xi^\lambda + (2c_2 + c_4)\,\Box\xi_\mu\,\Box\xi^\mu \big] , \end{aligned} \tag{9.35} $$
which, on requiring the additional terms to vanish for arbitrary ξµ, provides the constraints
$$ c_2 = -c_1 , \qquad c_4 = -c_3 = 2c_1 . \tag{9.36} $$
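
The constraints (9.36) can be checked against the coefficient combinations that multiply the ξ-dependent terms in Eq. (9.35). The following minimal sketch uses exact rational arithmetic; the function names are hypothetical and the choice c1 = −1 is simply the normalization that reproduces the signs of Eq. (9.37).

```python
from fractions import Fraction

def gauge_coefficients(c1, c2, c3, c4):
    """Combinations of c1..c4 multiplying the xi-dependent terms in Eq. (9.35)."""
    return (
        2 * c1 + c3,
        2 * c2 + c4,
        c3 + c4,
        4 * c1 + 2 * c2 + 4 * c3 + 3 * c4,
    )

def constrained(c1):
    """Solution (9.36): c2 = -c1, c4 = -c3 = 2 c1, returned as (c1, c2, c3, c4)."""
    return c1, -c1, -2 * c1, 2 * c1

# With the constraints imposed, every combination vanishes for any value of c1.
print(gauge_coefficients(*constrained(Fraction(-1))))

# A generic choice of coefficients is not gauge invariant.
print(gauge_coefficients(1, 1, 1, 1))
```

Three of the four combinations are independent, which is why the gauge symmetry fixes the action up to the single overall factor c1.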

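Taking this into account, the action (9.33) takes the form
$$ S = \int d^4x \left[\partial_\mu h_{\kappa\nu}\,\partial^\mu h^{\kappa\nu} + 2\,\partial_\mu h^{\mu\nu}\,\partial_\nu h^\kappa{}_\kappa - 2\,\partial_\mu h^{\kappa\mu}\,\partial^\nu h_{\kappa\nu} - \partial_\mu h^\nu{}_\nu\,\partial^\mu h^\kappa{}_\kappa\right] , \tag{9.37} $$
where we have omitted an overall normalization factor that can be determined by specifying the
coupling to matter and setting the units of the coupling. The associated equations of motion can be
obtained by varying the action with respect to the field. This leads to
$$ \begin{aligned} \delta S = \int d^4x\,\big[ & 2\,\partial^\mu h^{\kappa\nu}\,\partial_\mu\delta h_{\kappa\nu} + 2\,\partial_\nu h^\kappa{}_\kappa\,\partial_\mu\delta h^{\mu\nu} + 2\,\eta^{\kappa\lambda}\,\partial_\mu h^{\mu\nu}\,\partial_\nu\delta h_{\kappa\lambda} \\ & - 2\,\partial^\nu h_{\kappa\nu}\,\partial_\mu\delta h^{\kappa\mu} - 2\,\partial^\mu h_{\kappa\mu}\,\partial_\nu\delta h^{\kappa\nu} - 2\,\eta^{\kappa\lambda}\,\partial^\mu h^\nu{}_\nu\,\partial_\mu\delta h_{\kappa\lambda} \big] , \end{aligned} \tag{9.38} $$
which, integrating by parts, dropping boundary terms and renaming indices can be written as
$$ \begin{aligned} \delta S &= 2\int d^4x\,\big[ -\Box h^{\kappa\nu}\,\delta h_{\kappa\nu} - \partial_\mu\partial_\nu h^\kappa{}_\kappa\,\delta h^{\mu\nu} - \eta^{\kappa\lambda}\,\partial_\mu\partial_\nu h^{\mu\nu}\,\delta h_{\kappa\lambda} + \partial_\mu\partial^\nu h_{\kappa\nu}\,\delta h^{\kappa\mu} + \partial^\mu\partial_\nu h_{\kappa\mu}\,\delta h^{\kappa\nu} + \eta^{\kappa\lambda}\,\Box h^\nu{}_\nu\,\delta h_{\kappa\lambda} \big] \\ &= 2\int d^4x\,\big[ -\Box h_{\mu\nu} - \partial_\mu\partial_\nu h^\kappa{}_\kappa - \eta_{\mu\nu}\,\partial_\kappa\partial_\lambda h^{\kappa\lambda} + \partial_\mu\partial^\kappa h_{\kappa\nu} + \partial_\nu\partial^\kappa h_{\kappa\mu} + \eta_{\mu\nu}\,\Box h^\kappa{}_\kappa \big]\,\delta h^{\mu\nu} . \end{aligned} \tag{9.39} $$

A simple inspection reveals that the quantity inside the square brackets is nothing else than the
linearized version of the Einstein tensor Gµν
$$ \delta S \propto \int d^4x\, G_{\mu\nu}\,\delta h^{\mu\nu} \quad\longrightarrow\quad G_{\mu\nu} = 0 . \tag{9.40} $$

The linearized Einstein equations in vacuum are recovered from an action (9.37) constructed
with very limited principles, namely, quadraticity in the fields, Lorentz invariance and gauge
invariance.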

9.5 Field theory in curved spacetime

The variational approach for fields presented in the previous sections can be generalized to include the
interaction with gravity. We guess the form of the action in this case with the help of the Equivalence
Principle:

• Replace the Minkowski metric ηµν by gµν.

• Replace partial derivatives by covariant derivatives (comma-goes-to-semicolon rule).

• Replace the Lorentz invariant volume element $d^n x$ by the covariant volume element $d^n x\sqrt{|g|}$.

Since L̃ and $d^n x\sqrt{|g|}$ are scalars under general coordinate transformations, the resulting action
$$ S = \int d^n x\; \underbrace{\sqrt{|g|}\;\tilde{\mathcal{L}}(\phi, \nabla_\mu\phi, g_{\mu\nu})}_{\mathcal{L}} \tag{9.41} $$
is guaranteed to provide covariant equations of motion. Note that the untilded quantity $\mathcal{L} \equiv \sqrt{|g|}\,\tilde{\mathcal{L}}$
is a scalar density of weight 1.

9.6 The energy-momentum tensor

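Consider now an arbitrary infinitesimal coordinate transformation
$$ x^\mu \to \bar x^\mu = x^\mu + \xi^\mu(x) . \tag{9.42} $$

This transformation generates a perturbation to both the fields and the metric in such a way that the
Lagrangian density L (no tilde) becomes
$$ \mathcal{L}(\phi + \delta\phi, \phi_{,\mu} + \delta\phi_{,\mu}, g_{\mu\nu} + \delta g_{\mu\nu}) \approx \mathcal{L}(\phi, \phi_{,\mu}, g_{\mu\nu}) + \frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi + \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\,\delta\phi_{,\mu} + \frac{\partial\mathcal{L}}{\partial g_{\mu\nu}}\,\delta g_{\mu\nu} . \tag{9.43} $$

Integrating by parts, we get two pieces
$$ \delta S = \int d^n x\, \underbrace{\left[\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right]}_{=0}\delta\phi + \int d^n x\, \frac{\partial\mathcal{L}}{\partial g_{\mu\nu}}\,\delta g_{\mu\nu} . \tag{9.44} $$

The first one is associated with a particular variation δφ and vanishes when taking into account the
Euler-Lagrange equation for φ. The second term must then be equal to zero for S to remain unchanged.
The integrand ∂L/∂gµν is a tensor density. Let's define a symmetric second-rank tensor out of such a
density
$$ T^{\mu\nu} \equiv \frac{2}{\sqrt{|g|}}\frac{\partial\mathcal{L}}{\partial g_{\mu\nu}} , \tag{9.45} $$
and write
$$ \delta S = \frac{1}{2}\int d^n x\, \sqrt{|g|}\;T^{\mu\nu}\,\delta g_{\mu\nu} . \tag{9.46} $$
Although it is tempting to simply set $\sqrt{|g|}\,T^{\mu\nu} = 0$, this condition is overly restrictive, since δgµν refers
here to a specific type of variation, not to an arbitrary one. The variation δgµν can however be
expressed in terms of the arbitrary perturbation ξµ by taking into account that δgµν = −(ξµ;ν + ξν;µ)
(cf. Eq. (6.60) and notice (9.50)). This gives
$$ \begin{aligned} \delta S &= \frac{1}{2}\int d^n x\, \sqrt{|g|}\;T^{\mu\nu}\,\delta g_{\mu\nu} = -\int d^n x\, \sqrt{|g|}\;T^{\mu\nu}\,\xi_{\mu;\nu} \\ &= \int d^n x\, \sqrt{|g|}\;T^{\mu\nu}{}_{;\nu}\,\xi_\mu - \underbrace{\int d^n x\, \sqrt{|g|}\,\left(T^{\mu\nu}\xi_\mu\right)_{;\nu}}_{=0} , \end{aligned} \tag{9.47} $$
where we have made use of the symmetry of $T^{\mu\nu}$ and integrated by parts to get a total
derivative that vanishes by assumption on the boundary of integration. Since ξµ is arbitrary we must
have
$$ \nabla_\nu T^{\mu\nu} = 0 , \tag{9.48} $$
which is a continuity equation suggesting that we can identify the tensor (9.45) with the
energy-momentum tensor of any physical system.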

A common sign mistake

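You will find some books giving an alternative definition of the energy-momentum tensor
$$ T_{\mu\nu} = -\frac{2}{\sqrt{|g|}}\frac{\partial\mathcal{L}}{\partial g^{\mu\nu}} , \tag{9.49} $$
in terms of $\delta g^{\mu\nu}$ rather than $\delta g_{\mu\nu}$. The difference in sign between these two equivalent
expressions comes from
$$ g_{\mu\nu}\,g^{\nu\lambda} = \delta_\mu{}^\lambda \quad\to\quad \delta\left(g_{\mu\nu}\,g^{\nu\lambda}\right) = 0 \quad\to\quad \delta g^{\mu\nu} = -g^{\mu\lambda}g^{\nu\rho}\,\delta g_{\lambda\rho} . \tag{9.50} $$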

9.6.1 A particular case

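When the Lagrangian L̃ in $\mathcal{L} = \sqrt{|g|}\,\tilde{\mathcal{L}}$ depends only on the metric and not on the first derivatives
of the metric⁶, i.e. L̃ = L̃(φ, ∂µφ, gµν), it is possible to derive an alternative expression for the
energy-momentum tensor. In particular, taking into account that $\partial\sqrt{|g|}/\partial g_{\mu\nu} = \frac{1}{2}\sqrt{|g|}\,g^{\mu\nu}$, we can write
$$ T^{\mu\nu} = \frac{2}{\sqrt{|g|}}\frac{\partial\mathcal{L}}{\partial g_{\mu\nu}} = \frac{2}{\sqrt{|g|}}\left(\frac{\partial\sqrt{|g|}}{\partial g_{\mu\nu}}\,\tilde{\mathcal{L}} + \sqrt{|g|}\,\frac{\partial\tilde{\mathcal{L}}}{\partial g_{\mu\nu}}\right) = \frac{2}{\sqrt{|g|}}\left(\frac{1}{2}\sqrt{|g|}\,g^{\mu\nu}\,\tilde{\mathcal{L}} + \sqrt{|g|}\,\frac{\partial\tilde{\mathcal{L}}}{\partial g_{\mu\nu}}\right) , \tag{9.51} $$
or equivalently
$$ T^{\mu\nu} = g^{\mu\nu}\,\tilde{\mathcal{L}} + 2\,\frac{\partial\tilde{\mathcal{L}}}{\partial g_{\mu\nu}} . \tag{9.52} $$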
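
The derivative of the metric determinant used in this derivation can be verified numerically. The sketch below compares the exact change of √|g| under a small symmetric perturbation with the linear prediction ½√|g| g^{µν}δg_{µν}. The 2×2 Lorentzian metric and the perturbation are arbitrary values chosen purely for illustration.

```python
import math

def det2(g):
    """Determinant of a 2x2 matrix."""
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def inverse2(g):
    """Inverse of a 2x2 matrix, i.e. the metric with upper indices."""
    d = det2(g)
    return [[g[1][1] / d, -g[0][1] / d],
            [-g[1][0] / d, g[0][0] / d]]

def sqrt_abs_det(g):
    return math.sqrt(abs(det2(g)))

# Hypothetical 2x2 Lorentzian metric and small symmetric perturbation.
g = [[-1.3, 0.2], [0.2, 0.9]]
dg = [[1e-6, -2e-6], [-2e-6, 3e-6]]

# Exact change of sqrt|g| under g -> g + dg.
g_new = [[g[i][j] + dg[i][j] for j in range(2)] for i in range(2)]
exact_change = sqrt_abs_det(g_new) - sqrt_abs_det(g)

# Linear prediction (1/2) sqrt|g| g^{mu nu} dg_{mu nu}, summed over all index pairs.
g_inv = inverse2(g)
linear_change = 0.5 * sqrt_abs_det(g) * sum(
    g_inv[i][j] * dg[i][j] for i in range(2) for j in range(2)
)
print(exact_change, linear_change)
```

The two numbers agree up to terms of second order in the perturbation, as expected for a first-order variation.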

Exercise
Compute the energy-momentum tensor for (9.7).

9.7 The Einstein-Hilbert action

In order to construct an action for the gravitational field we must define a Lagrangian. This Lagrangian
must transform as a scalar under general coordinate transformations and depend on the metric tensor
and its derivatives. Note however that gravity is completely different from all other fundamental
interactions, since no non-trivial scalar quantity can be constructed from the metric and its first
derivatives alone. This can be easily seen by considering an arbitrary scalar combination f(gµν, ∂ρgµν) of
these quantities around a small region in spacetime. According to the local flatness theorem, in such a
region it is always possible to find coordinates such that gµν = ηµν and ∂ρgµν = 0, and therefore
f = constant. But, since we are dealing with a scalar quantity, this will be the case in any other
coordinate system. In other words, any covariant scalar function constructed just from the metric
tensor and its first derivatives will be a trivial constant.
The simplest non-trivial quantity that can be constructed from the metric and its derivatives is the
Ricci scalar, which depends on the metric and its first and second order derivatives. The resulting
action reads
$$ S_{EH} = \int d^4x\, \mathcal{L}_{EH} = \frac{1}{2\kappa^2}\int d^4x\, \sqrt{|g|}\;R = \frac{1}{2\kappa^2}\int d^4x\, \sqrt{|g|}\;g^{\mu\nu}R_{\mu\nu} \tag{9.53} $$
and is known as the Einstein-Hilbert action. The constant of proportionality κ² is included on
dimensional grounds and will be determined at the end of the computation.
6 Particular cases are the Lagrangian for scalar fields or that for the electromagnetic field, where the
covariant derivative reduces to the standard derivative.

Exercise
Which is the dimension of κ2 ?

Consider the variation of (9.53) resulting from the variation of the metric tensor
$$ g_{\mu\nu} \to g_{\mu\nu} + \delta g_{\mu\nu} , \tag{9.54} $$
where δgµν and its first derivative are assumed to vanish at the boundary of the integration region.
We obtain
$$ \delta\mathcal{L}_{EH} \propto \underbrace{\sqrt{|g|}\;\delta g^{\mu\nu}R_{\mu\nu} + \delta\sqrt{|g|}\;R}_{\delta\mathcal{L}_1} + \underbrace{\sqrt{|g|}\;g^{\mu\nu}\,\delta R_{\mu\nu}}_{\delta\mathcal{L}_2} , \tag{9.55} $$
where we have defined two pieces, δL1 and δL2. The first one can be easily evaluated by taking into
account the variation of the metric determinant
$$ \delta\sqrt{|g|} = -\frac{1}{2}\sqrt{|g|}\;g_{\mu\nu}\,\delta g^{\mu\nu} . \tag{9.56} $$
2

Exercise
Prove Eq. (9.56).

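Collecting the terms, we get
$$ \delta\mathcal{L}_1 = \sqrt{|g|}\left(R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R\right)\delta g^{\mu\nu} . \tag{9.57} $$
The evaluation of δL2 is slightly more involved since it requires performing the variation of the Ricci
tensor. The easiest way of doing this is to consider the variation of the Riemann tensor and perform
the required contractions at the end of the computation. Schematically, the variation of the Riemann
tensor has the structure
$$ \text{Riemann} \sim \partial\Gamma + \Gamma\Gamma \quad\longrightarrow\quad \delta(\text{Riemann}) \sim \delta(\partial\Gamma) + (\delta\Gamma)\,\Gamma , \tag{9.58} $$
so the first thing that we have to compute is the variation of the Christoffel symbols $\delta\Gamma^\rho{}_{\mu\nu}$ defined by
$$ \Gamma^\rho{}_{\mu\nu} \xrightarrow{\;g_{\mu\nu} + \delta g_{\mu\nu}\;} \tilde\Gamma^\rho{}_{\mu\nu} = \Gamma^\rho{}_{\mu\nu} + \delta\Gamma^\rho{}_{\mu\nu} \quad\Longrightarrow\quad \delta\Gamma^\rho{}_{\mu\nu} = \tilde\Gamma^\rho{}_{\mu\nu} - \Gamma^\rho{}_{\mu\nu} . \tag{9.59} $$

Be careful! We are just performing a variation of the metric, not transforming it.

It is easy to see that, even though a connection is not a tensor, the difference of two connections $\delta\Gamma^\rho{}_{\mu\nu}$
transforms as a tensor, i.e.
$$ \delta\bar\Gamma^\rho{}_{\mu\nu} = \frac{\partial\bar x^\rho}{\partial x^\sigma}\frac{\partial x^\lambda}{\partial\bar x^\mu}\frac{\partial x^\kappa}{\partial\bar x^\nu}\;\delta\Gamma^\sigma{}_{\lambda\kappa} . \tag{9.60} $$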

Exercise
Check it.

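The property (9.60) greatly simplifies the computation of the variation of the Riemann tensor.
Indeed, we can always go to a local free-fall reference frame in which $\Gamma^\rho{}_{\mu\nu} = 0$ at some arbitrary point
P. At such a point the contracted form of the expression (9.58) becomes
$$ \delta R_{\mu\nu} = \partial_\rho\left(\delta\Gamma^\rho{}_{\mu\nu}\right) - \partial_\nu\left(\delta\Gamma^\rho{}_{\rho\mu}\right) = \nabla_\rho\left(\delta\Gamma^\rho{}_{\mu\nu}\right) - \nabla_\nu\left(\delta\Gamma^\rho{}_{\rho\mu}\right) , \tag{9.61} $$
where in the last step we made use of the fact that the partial and covariant derivatives coincide when
$\Gamma^\rho{}_{\mu\nu} = 0$. The resulting Palatini equation
$$ \delta R_{\mu\nu} = \nabla_\rho\left(\delta\Gamma^\rho{}_{\mu\nu}\right) - \nabla_\nu\left(\delta\Gamma^\rho{}_{\rho\mu}\right) \tag{9.62} $$
is a tensorial equation (remember the property (9.60)) valid in any arbitrary coordinate system (and
not only in the free-fall reference frame at P).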

Exercise
Prove that the second term in the previous expression is symmetric, as it should be.

A similar trick can be applied to get the explicit expression of δΓ, which takes the same form as the
definition of the Christoffel symbols, with the metric replaced by the metric variation and the partial
derivatives replaced by covariant derivatives, i.e.
$$ \delta\Gamma^\mu{}_{\nu\rho} = \frac{1}{2}\,g^{\mu\sigma}\left(\nabla_\nu\,\delta g_{\sigma\rho} + \nabla_\rho\,\delta g_{\sigma\nu} - \nabla_\sigma\,\delta g_{\nu\rho}\right) . \tag{9.63} $$

Exercise
Check the previous expression by explicit computation.

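Taking (9.62) into account, δL2 becomes
$$ \delta\mathcal{L}_2 = \sqrt{|g|}\;g^{\mu\nu}\left[\nabla_\rho\left(\delta\Gamma^\rho{}_{\mu\nu}\right) - \nabla_\nu\left(\delta\Gamma^\rho{}_{\rho\mu}\right)\right] = \sqrt{|g|}\;\nabla_\sigma\left[g^{\mu\nu}\,\delta\Gamma^\sigma{}_{\mu\nu} - g^{\mu\sigma}\,\delta\Gamma^\rho{}_{\rho\mu}\right] , \tag{9.64} $$
where we have used the metric compatibility condition (4.68). We are left therefore with the covariant
divergence of a vector. Using the property
$$ V^\mu{}_{;\mu} = \frac{1}{\sqrt{|g|}}\left(\sqrt{|g|}\;V^\mu\right)_{,\mu} , \tag{9.65} $$
we get the boundary term
$$ \delta\mathcal{L}_2 = \sqrt{|g|}\;\nabla_\sigma\left[g^{\mu\nu}\,\delta\Gamma^\sigma{}_{\mu\nu} - g^{\mu\sigma}\,\delta\Gamma^\rho{}_{\rho\mu}\right] = \partial_\sigma\left[\sqrt{|g|}\;g^{\mu\nu}\left(\delta\Gamma^\sigma{}_{\mu\nu}\right) - \sqrt{|g|}\;g^{\mu\sigma}\left(\delta\Gamma^\rho{}_{\rho\mu}\right)\right] . \tag{9.66} $$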

As I said before, gravity is a quite particular field theory. The existence of second derivatives
in the Einstein-Hilbert action gives rise to a contribution depending on the value of the first
derivatives on the boundary. To deal with this, we have two options:
• Extend the variational principle and require the fields and their derivatives to be fixed at
the boundary. This would give rise to reasonable field equations. A clear example from
classical mechanics illustrating this would be
$$ S = \int_{t_0}^{t_f} dt \left(\ddot q + \frac{1}{2}\dot q^2\right) = \dot q(t_f) - \dot q(t_0) + \frac{1}{2}\int_{t_0}^{t_f} dt\, \dot q^2 , \tag{9.67} $$
with the assumption that both q̇ and q are fixed at the boundary. This approach has
however some caveats. On the one hand, it does not obey a composition rule of the kind
$$ S(0 \to 1 \to 2) = S(0 \to 1) + S(1 \to 2) , \tag{9.68} $$
where the paths connecting (q0, t0) and (q2, t2) are decomposed at an intermediate time
t1 with t0 < t1 < t2. Although the paths are expected to be continuous at t = t1, they do
not need to be smooth at that point, which requires leaving q̇1 free at t = t1. On the other
hand, the action principle has its roots in quantum mechanics, where the simultaneous
fixing of q and q̇ is inappropriate.
• Add the so-called Gibbons-Hawking-York counterterm to the action
$$ S = S_{EH} + S_{GHY} = \frac{1}{2\kappa^2}\int_R d^4x\, \sqrt{|g|}\;R + \frac{1}{\kappa^2}\int_{\partial R} d^3x\, \sqrt{|h|}\;K , \tag{9.69} $$
with h the determinant of the induced metric on the boundary and K the trace of the
extrinsic curvature. The Gibbons-Hawking-York term is constructed in such a way that its
variation cancels the unwanted term associated with the second derivatives of the metric,
keeping only the part associated with the quadratic part of the action. Proving this statement
is beyond the scope of this course. The interested reader is referred to the excellent
discussion in Padmanabhan's book.
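Forgetting about the boundary term, the variation of the Einstein-Hilbert action (9.53) becomes
$$ \delta S_{EH} = \frac{1}{2\kappa^2}\int d^4x\, \sqrt{|g|}\left(R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R\right)\delta g^{\mu\nu} , \tag{9.70} $$
which, demanding it to vanish for arbitrary variations $\delta g^{\mu\nu}$, gives Einstein's equations in the
absence of matter
$$ G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu} = 0 . \tag{9.71} $$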

9.8 Einstein equations in the presence of matter

Having obtained the Einstein equations in vacuum, let us now derive their full form in the presence
of matter. Consider the action
$$ S = S_{EH} + S_M , \tag{9.72} $$
with S_M containing all the non-gravitational fields. The variation of S_M with respect to $\delta g^{\mu\nu}$ (upper
indices) gives
$$ \delta S_M = -\frac{1}{2}\int d^4x\, \sqrt{|g|}\;T_{\mu\nu}\,\delta g^{\mu\nu} , \tag{9.73} $$
where we have made use of the covariant definition (9.49). Putting everything together we get
$$ \delta S_{EH} + \delta S_M = \frac{1}{2\kappa^2}\int d^4x\, \sqrt{|g|}\left(R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R - \kappa^2\,T_{\mu\nu}\right)\delta g^{\mu\nu} . \tag{9.74} $$

Since $\delta g^{\mu\nu}$ is arbitrary, we must have
$$ G_{\mu\nu} = \kappa^2\,T_{\mu\nu} , \tag{9.75} $$
which confirms the identification of (9.45), or (9.49), with the energy-momentum tensor and allows
us to identify the proportionality constant κ² with 8πG.

Exercise
Modify the Einstein-Hilbert action to obtain the Einstein equations with cosmological constant.
CHAPTER 3
THE INFLATIONARY PARADIGM

Ubi materia, ibi geometria.

Johannes Kepler

3.1 The hot Big Bang paradise

In General Relativity, the Universe as a whole becomes a dynamical entity that can be
modeled and measured. The combination of Einstein’s theory of gravity with the Standard
Model of particle physics gives rise to the successful hot Big Bang (hBB) scenario, describing
the evolution of the Universe and its matter content from the first fraction of a second till
the present era. The expansion of the Universe, the relative abundance of light nuclei or the
discovery of the Cosmic Microwave Background (CMB) give confidence in the basic picture,
the expansion and cooling of a primordial soup. Many of the key cosmological parameters
describing the Universe have been accurately determined. This has led to the establishment
of a precision cosmological model known as Λ Cold Dark Matter (ΛCDM). At the same time,
these parameters provide useful information for particle physics. The stringent limits on the
sum of neutrino masses and on the variations of fundamental constants clearly illustrate the
entanglement between cosmology and high-energy physics.

3.1.1 Homogeneity and isotropy

In this course, we will not be interested in local objects such as galaxies or stars, but rather in
the dynamics of the Universe as a whole. In particular, we will average over local structures
and assume the Universe to be described by an approximately homogeneous and isotropic
“gas” of matter, whose “molecules” are, for example, galaxies. On physical grounds,
homogeneity means that the physical conditions are the same at every point, while isotropy means
that they are the same in every direction. Isotropy at every point automatically implies homogeneity.

Exercise
Convince yourself that homogeneity does not imply isotropy. Provide some examples.

Figure 3.1: The Cosmic Microwave Background as seen by Planck. The fluctuations on top
of the average temperature T = 2.73 K ≃ 0.235 meV are of order one part in 10⁵.

At first sight, the idealization of the Universe as a homogeneous and isotropic object might
seem a bit drastic. On the other hand, we know from hydrodynamics that a continuous
description of gases works very well even if these have a very discontinuous structure at
molecular scales. The homogeneous and isotropic approximation seems indeed to be in good
agreement with observations: both the CMB and the galaxy distribution look rather
homogeneous when averaged on sufficiently large scales (cf. Figs. 3.1 and 3.2).
A given spacetime in General Relativity is specified by its metric tensor gµν. This quantity
defines the line element
$$ ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu , \tag{3.1} $$
where $dx^\mu$ stands for infinitesimal displacements in the coordinates $x^\mu$. From a mathematical
point of view, a homogeneous and isotropic Universe must be equipped with a metric tensor
invariant under translations and rotations in the spatial components. The most general
4-dimensional geometry consistent with these symmetries is the so-called
Friedmann-Lemaître-Robertson-Walker (FLRW) spacetime,
$$ ds^2 = -dt^2 + a^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] . \tag{3.2} $$

This equation represents a time-ordered slicing of spacetime with respect to a global time t
whose 3-dimensional spatial surfaces are maximally symmetric. Here r is a radial coordinate
and θ and φ are the usual angular coordinates on a two-sphere, with ranges 0 ≤ θ ≤ π
and 0 ≤ φ < 2π. The coordinates (r, θ, φ) are usually called comoving coordinates, since they
are decoupled from the effect of expansion.
The FLRW metric is invariant under the redefinition
$$ k \to \frac{k}{|k|} , \qquad r \to r\sqrt{|k|} , \qquad a \to \frac{a}{\sqrt{|k|}} , \tag{3.3} $$
meaning that the only relevant parameter is the sign of k. We can therefore distinguish
three types of spatial sections:

1. Flat: for k = 0 the spatial slices are flat and r ranges from zero to infinity, 0 < r < ∞.

Figure 3.2: The Sloan Digital Sky Survey map. Each dot is a galaxy. The empty regions are
just areas that the survey did not cover.

Figure 3.3: 2-dimensional projection of the 3-dimensional slices of the FLRW metric for
k = +1 (left) and k = −1 (right).

2. Open: for k = −1 the spatial slices are hyperbolic and again 0 < r < ∞.

3. Closed: for k = 1, the spatial slices are three-spheres and the radial coordinate r is
restricted to a compact range, 0 < r < 1.
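
The invariance under the rescaling (3.3), which underlies this classification, can be verified directly for k ≠ 0. In the sketch below primes denote the rescaled quantities k' = k/|k|, r' = r√|k| and a' = a/√|k|; this notation is introduced here only for the check.

```latex
\begin{align}
\frac{a'^2\, dr'^2}{1 - k' r'^2}
  &= \frac{a^2}{|k|}\,\frac{|k|\, dr^2}{1 - \dfrac{k}{|k|}\,|k|\, r^2}
   = \frac{a^2\, dr^2}{1 - k r^2} , \\
a'^2\, r'^2 &= \frac{a^2}{|k|}\,|k|\, r^2 = a^2 r^2 .
\end{align}
```

Both the radial and the angular parts of the line element (3.2) are unchanged, so any non-vanishing k can be normalized to ±1.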

The scale factor a(t) characterizes the relative size of the spatial sections at a given time. Its
temporal evolution depends on the matter content of the Universe via the Einstein equations
$$ R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu} + \Lambda\,g_{\mu\nu} = 8\pi G\,T_{\mu\nu} , \tag{3.4} $$
with G Newton's constant, Rµν the Ricci tensor, $R = g^{\mu\nu}R_{\mu\nu}$ the Ricci scalar and Λ the
infamous cosmological constant. The energy-momentum tensor Tµν encodes the Universe's
matter content and is locally conserved,
$$ \nabla^\mu T_{\mu\nu} = 0 . \tag{3.5} $$

The homogeneity and isotropy of the background metric restrict the form of the energy-momentum
tensor to the perfect fluid case

$$ T_{\mu\nu} = p\, g_{\mu\nu} + (\rho + p)\, u_\mu u_\nu , \tag{3.6} $$

with $u^\mu$ the comoving four-velocity satisfying $u^\mu u_\mu = -1$ and ρ(t) and p(t) the local energy
density and pressure of the fluid. For an observer comoving with the fluid, $u^\mu = (1, 0, 0, 0)$,
the energy-momentum tensor looks isotropic,

$$ T^\mu{}_\nu = \mathrm{diag}(-\rho(t), p(t), p(t), p(t)) . \tag{3.7} $$

Note that the trace is given by

$$ T = T^\mu{}_\mu = -\rho + 3p . \tag{3.8} $$
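
As a quick consistency check of Eqs. (3.6)-(3.8), the sketch below builds the perfect-fluid tensor at one point of the FLRW metric (3.2), raises one index with the diagonal inverse metric, and reads off the mixed components. The helper name `mixed_stress_tensor` and the sample numbers are assumptions made for illustration only.

```python
import math


def mixed_stress_tensor(rho, p, a, k, r, theta):
    """Return T^mu_nu for a comoving perfect fluid in the FLRW metric (3.2)."""
    # Diagonal metric components g_{mu mu} in coordinates (t, r, theta, phi).
    g = [-1.0, a**2 / (1.0 - k * r**2), (a * r) ** 2, (a * r * math.sin(theta)) ** 2]
    u_up = [1.0, 0.0, 0.0, 0.0]                   # comoving four-velocity u^mu
    u_down = [g[m] * u_up[m] for m in range(4)]   # u_mu = g_{mu nu} u^nu
    t_down = [
        [(p * g[m] if m == n else 0.0) + (rho + p) * u_down[m] * u_down[n] for n in range(4)]
        for m in range(4)
    ]
    # Raise the first index: T^mu_nu = g^{mu mu} T_{mu nu} for a diagonal metric.
    return [[t_down[m][n] / g[m] for n in range(4)] for m in range(4)]


T = mixed_stress_tensor(rho=2.0, p=0.5, a=1.7, k=-1.0, r=0.8, theta=1.1)
diagonal = [T[m][m] for m in range(4)]
print(diagonal)        # expected pattern: (-rho, p, p, p)
print(sum(diagonal))   # trace, expected to equal -rho + 3p
```

The result does not depend on a, k, r or θ, which is the sense in which the fluid looks isotropic to the comoving observer.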

Exercise
Derive Eqs. (3.7) and (3.6).

3.1.2 Friedmann equations

Given the FLRW metric (3.2) we can compute the connection coefficients, the Ricci
tensor components and the Ricci scalar. The computation of these quantities is straightforward
but quite tedious, so we will simply summarize the results:

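i) Connection coefficients (non-vanishing components)

$$ \begin{aligned}
\Gamma^0_{11} &= \frac{a\dot a}{1-kr^2} , \qquad \Gamma^0_{22} = a\dot a\, r^2 , \qquad \Gamma^0_{33} = a\dot a\, r^2 \sin^2\theta , \\
\Gamma^1_{01} &= \Gamma^1_{10} = \Gamma^2_{02} = \Gamma^2_{20} = \Gamma^3_{03} = \Gamma^3_{30} = \frac{\dot a}{a} , \\
\Gamma^1_{11} &= \frac{kr}{1-kr^2} , \qquad \Gamma^1_{22} = -r(1-kr^2) , \qquad \Gamma^1_{33} = -r(1-kr^2)\sin^2\theta , \\
\Gamma^2_{12} &= \Gamma^2_{21} = \Gamma^3_{13} = \Gamma^3_{31} = \frac{1}{r} , \\
\Gamma^2_{33} &= -\sin\theta\cos\theta , \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\theta .
\end{aligned} \tag{3.9} $$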

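ii) Ricci tensor components

$$ \begin{aligned}
R_{00} &= -3\,\frac{\ddot a}{a} , \\
R_{11} &= \frac{a\ddot a + 2\dot a^2 + 2k}{1-kr^2} , \\
R_{22} &= r^2\left(a\ddot a + 2\dot a^2 + 2k\right) , \\
R_{33} &= r^2\left(a\ddot a + 2\dot a^2 + 2k\right)\sin^2\theta ,
\end{aligned} \tag{3.10} $$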

iii) Ricci scalar

$$ R = \frac{6}{a^2}\left(a\ddot a + \dot a^2 + k\right) . \tag{3.11} $$

Combining these expressions with the energy-momentum tensor (3.7) we can particularize
Einstein's equations (3.4) to the homogeneous and isotropic case. We obtain the so-called
Friedmann equations

$$ H^2 = \frac{\rho}{3M_P^2} + \frac{\Lambda}{3} - \frac{k}{a^2} , \tag{3.12} $$

$$ \dot H + H^2 = -\frac{1}{6M_P^2}(\rho + 3p) + \frac{\Lambda}{3} , \tag{3.13} $$
with the dots denoting derivatives with respect to the coordinate time t and

$$ M_P = (8\pi G)^{-1/2} = 2.436 \times 10^{18}\ \text{GeV} \tag{3.14} $$

the reduced Planck mass. The quantity

$$ H \equiv \frac{\dot a}{a} \tag{3.15} $$

is the so-called Hubble rate. This parameter is positive for an expanding Universe and negative
for a contracting one. The value of the Hubble parameter at the present epoch is the Hubble
constant

$$ H_0 = 100\, h\ \text{km s}^{-1}\,\text{Mpc}^{-1} , \tag{3.16} $$

with h = 0.678 ± 0.009 and Mpc ≈ 3.09 × 10^24 cm standing for "megaparsec". This value allows us
to estimate the present age and size of the Universe,

$$ H_0^{-1} = 9.77\, h^{-1}\ \text{Gyr} , \qquad c H_0^{-1} = 3000\, h^{-1}\ \text{Mpc} . \tag{3.17} $$
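
The numbers in Eq. (3.17) follow from a unit conversion of Eq. (3.16). A minimal sketch is given below; the conversion constants are standard values rounded to a few digits, and the function names are illustrative assumptions.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 10^9 Julian years
C_KM_PER_S = 299792.458       # speed of light in km/s


def hubble_time_gyr(h):
    """Hubble time 1/H0 in Gyr for H0 = 100 h km/s/Mpc."""
    h0_per_second = 100.0 * h / KM_PER_MPC
    return 1.0 / h0_per_second / SECONDS_PER_GYR


def hubble_radius_mpc(h):
    """Hubble radius c/H0 in Mpc for H0 = 100 h km/s/Mpc."""
    return C_KM_PER_S / (100.0 * h)


# h = 1 reproduces the prefactors of Eq. (3.17).
print(hubble_time_gyr(1.0), hubble_radius_mpc(1.0))
```

Dividing both results by the measured value of h gives the corresponding estimates for our Universe.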

The Friedmann equations (3.12) and (3.13) can be combined to obtain the continuity equation

$$ \dot\rho + 3H(\rho + p) = 0 . \tag{3.18} $$

Exercise
Derive Eq. (3.18) i) by combining Eqs. (3.12) and (3.13) and ii) from the covariant
energy-momentum conservation (3.5). Interpret the result by considering the adiabatic
dilution of energy due to the expansion and the work done by pressure.
Hint: Consider the first law of thermodynamics in the form T dS = dU + p dV.

The cosmological evolution following from Eqs. (3.12), (3.13) and (3.18) can be determined
once a pressure to energy density relation p(ρ) is specified. We will restrict ourselves to
barotropic fluids for which the pressure is linearly proportional to the energy density,

p = wρ , (3.19)

with w the so-called equation-of-state parameter. This case covers the two main matter
components in the hot Big Bang scenario, namely (non-relativistic) matter (w = 0) and
radiation (w = 1/3).

Exercise
Consider a macroscopic collection of structureless point particles interacting through
spatially localized collisions. On distances d much larger than the typical mean free
path, the number of particles is large and the statistical fluctuations about the mean
properties of the fluid are expected to be small. If the fluid is isotropic (i.e. if the fluid
is perfect), the mean density and pressure observed by a comoving observer over the
volume $\Delta V = d^3$ can be written as

$$ \rho = \left\langle \sum_n E_n\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n) \right\rangle_{\Delta V} , \qquad p = \frac{1}{3} \left\langle \sum_n p_n^i v_n^i\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n) \right\rangle_{\Delta V} , \tag{3.20} $$

with $E_n = \sqrt{p_n^2 + m_n^2}$ the energy of the individual particles. The index i is a space
index ranging from 1 to 3 and n selects the particle of mass $m_n$ and momentum $p_n$. Use
these microscopic expressions to derive the equation of state for non-relativistic matter
and radiation.

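In our Universe, several species with different equations of state coexist. Their relative
contribution is traditionally parametrized by the dimensionless parameters

$$ \Omega_M \equiv \frac{\rho_M}{3M_P^2H^2} , \qquad \Omega_R \equiv \frac{\rho_R}{3M_P^2H^2} , \qquad \Omega_\Lambda \equiv \frac{\Lambda}{3H^2} , \qquad \Omega_K \equiv -\frac{k}{(aH)^2} , \tag{3.21} $$

with the subindices M, R, Λ and K standing for matter, radiation, cosmological constant and
curvature contributions. With these definitions the Friedmann equation (3.12) reads
$\Omega_M + \Omega_R + \Omega_\Lambda + \Omega_K = 1$. At present time, the radiation and curvature contributions are very
small ($\Omega_R \simeq 5 \times 10^{-5}$, $|\Omega_K| < 0.005$) and

$$ \Omega_M \simeq 0.3 , \qquad \Omega_\Lambda \simeq 0.7 . \tag{3.22} $$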
Our present Universe is therefore dominated by a cosmological constant or dark energy
component. Note however that it was dominated by matter and radiation in the past. This can
be easily seen by considering the scaling of non-relativistic matter and radiation. Integrating
the conservation equation (3.18) and using Eq. (3.12) for the zero curvature case (k = 0),
we get

$$ \rho \propto a^{-3(1+w)} , \qquad a(t) \propto \begin{cases} t^{\frac{2}{3(1+w)}} & w \neq -1 , \\ e^{Ht} & w = -1 . \end{cases} \tag{3.23} $$

For non-relativistic matter (w = 0), the energy density dilutes with the volume, $\rho_M \sim a^{-3}$,
reflecting mass conservation. For relativistic matter (w = 1/3), the energy density dilutes
as $\rho_R \sim a^{-4}$, due to the additional redshift of the energy ($\propto a^{-1}$). Note that the radiation
domination period cannot be eternal in the past. When t → 0, the scale factor goes to zero
and the physical energy density ρ diverges.
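
The scalings in Eq. (3.23) can be verified by integrating the continuity equation (3.18) together with the flat Friedmann equation (3.12) numerically. The sketch below uses the number of e-folds N = ln a as the integration variable and a hand-written fourth-order Runge-Kutta step. Units with M_P = 1, Λ = 0, the initial condition H = 1 and all function names are assumptions made for illustration.

```python
import math


def evolve(w, n_end=8.0, dn=1e-3):
    """Integrate d(rho)/dN = -3(1+w) rho and dt/dN = 1/H for k = 0, Lambda = 0."""

    def rhs(rho):
        hubble = math.sqrt(rho / 3.0)          # Eq. (3.12) with M_P = 1
        return -3.0 * (1.0 + w) * rho, 1.0 / hubble

    rho, t, n = 3.0, 0.0, 0.0                  # start with H = 1 at a = 1
    history = [(n, rho, t)]
    for _ in range(round(n_end / dn)):
        k1 = rhs(rho)
        k2 = rhs(rho + 0.5 * dn * k1[0])
        k3 = rhs(rho + 0.5 * dn * k2[0])
        k4 = rhs(rho + dn * k3[0])
        rho += dn * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        t += dn * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        n += dn
        history.append((n, rho, t))
    return history


def late_time_exponents(w):
    """Return (d ln rho / d ln a, d ln a / d ln t) over the last e-fold."""
    history = evolve(w)
    (n1, rho1, t1), (n2, rho2, t2) = history[-1001], history[-1]
    rho_slope = (math.log(rho2) - math.log(rho1)) / (n2 - n1)
    a_slope = (n2 - n1) / (math.log(t2) - math.log(t1))
    return rho_slope, a_slope


for w in (0.0, 1.0 / 3.0):
    print(w, late_time_exponents(w), (-3.0 * (1.0 + w), 2.0 / (3.0 * (1.0 + w))))
```

The time exponent is measured at late times because the arbitrary choice t = 0 at a = 1 only becomes irrelevant once t is large.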

3.2 Troubles in paradise

In spite of the success of the hot Big Bang for describing the observed Universe, it is not free
of problems.

3.2.1 Flatness problem

Consider the dimensionless energy density parameter

$$ \Omega \equiv \frac{\rho}{\rho_{\rm crit}} , \tag{3.24} $$

with

$$ \rho_{\rm crit} \equiv 3M_P^2H^2 \tag{3.25} $$

the so-called critical energy density. In terms of this quantity, the Friedmann equation (3.12)
becomes

$$ \Omega - 1 = \frac{k}{(aH)^2} . \tag{3.26} $$
The quantity Ω − 1 measures the curvature of the Universe. A Universe with flat spatial
sections (k = 0) corresponds to Ω = 1. For k ≠ 0, the evolution of the curvature depends
on the evolution of the comoving Hubble radius $(aH)^{-1}$. If the comoving Hubble radius
increases/decreases with time, |Ω − 1| increases/decreases accordingly. In a Universe
dominated by a fluid with equation of state w, the comoving Hubble radius evolves as

$$ (aH)^{-1} \propto a^{\frac{1}{2}(1+3w)} . \tag{3.27} $$

For standard matter sources satisfying the strong energy condition 1 + 3w > 0, $(aH)^{-1}$ grows
as the Universe expands. For instance, during matter (MD) and radiation domination (RD)
we have

$$ (aH)^{-1} \propto a^{1/2} \ \ \text{(MD)} , \qquad (aH)^{-1} \propto a \ \ \text{(RD)} . \tag{3.28} $$

Exercise
Derive Eq. (3.27).

The density parameter Ω at present time is very close to one. Specifically, the latest Planck
satellite data combined with baryon acoustic oscillations (BAO) give

$$ \Omega_0 - 1 = 0.000 \pm 0.005 \tag{3.29} $$

at the 95% C.L. Taking a percent-level deviation, $|\Omega_0 - 1| \sim 10^{-2}$, as a conservative
reference value, together with the scalings (3.28) of the comoving Hubble radius during matter
and radiation domination, we can estimate the value of Ω − 1 at the time of matter-radiation
equality ($z_{\rm eq} = 3600$),

$$ \Omega(z_{\rm eq}) - 1 = (\Omega_0 - 1)(1 + z_{\rm eq})^{-1} \approx 2.8 \times 10^{-6} , \tag{3.30} $$

and at Big Bang Nucleosynthesis ($z_{\rm BBN} = 10^{10}$),

$$ \Omega(z_{\rm BBN}) - 1 = \left(\Omega(z_{\rm eq}) - 1\right)\left(\frac{1 + z_{\rm eq}}{1 + z_{\rm BBN}}\right)^2 \approx 3.6 \times 10^{-19} . \tag{3.31} $$

A percent deviation from flatness in the present Universe translates into unnaturally small
deviations at early epochs. In other words, in order to recover the Universe we observe

Figure 3.4: Evolution of the energy density parameter Ω (vertical axis) as a function of log a
(horizontal axis) in standard cosmology. The point Ω = 1, corresponding to flat spatial
sections, is a repeller.

today the initial conditions must be terribly fine-tuned. Any deviation from these initial
conditions translates either into a closed and recollapsing Universe or into an open Universe
completely dominated by curvature. This extreme dependence on the initial conditions is
highly unsatisfactory.
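
The size of the required fine-tuning can be made concrete with a few lines of arithmetic. The sketch below extrapolates a percent-level present-day deviation back in time, using Ω − 1 ∝ a during matter domination and Ω − 1 ∝ a² during radiation domination, as implied by Eqs. (3.26) and (3.28). The function names are illustrative, and the late dark-energy era is neglected, as in the estimate in the text.

```python
def curvature_at_equality(omega0_minus_1, z_eq=3600.0):
    """Omega - 1 at matter-radiation equality, assuming Omega - 1 ∝ a since then."""
    return omega0_minus_1 / (1.0 + z_eq)


def curvature_at_bbn(omega0_minus_1, z_eq=3600.0, z_bbn=1e10):
    """Omega - 1 at nucleosynthesis, assuming Omega - 1 ∝ a^2 before equality."""
    at_equality = curvature_at_equality(omega0_minus_1, z_eq)
    return at_equality * ((1.0 + z_eq) / (1.0 + z_bbn)) ** 2


# Assumed present-day deviation: one percent.
print(curvature_at_equality(1e-2))
print(curvature_at_bbn(1e-2))
```

Both results scale linearly with the assumed present-day deviation, so tightening the bound on Ω₀ − 1 only sharpens the problem.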

Exercise
One could argue that naturalness is just a question of taste and that the most symmetric
initial conditions are somehow more natural. However, this is not very convincing
from the point of view of the self-consistency of the theory, especially if those initial
conditions are unstable. Show that the dimensionless energy density parameter satisfies
the differential equation

$$ \frac{d\Omega}{d\log a} = (1 + 3w)\,\Omega\,(\Omega - 1) . \tag{3.32} $$

Note that for both matter (w = 0) and radiation (w = 1/3) we have 1 + 3w > 0,
meaning that a flat Universe with Ω = 1 is an unstable fixed point. If Ω > 1 at some
point of the evolution, it will keep on growing; and vice versa, if Ω < 1 at some point, it
will keep on decreasing. This behaviour is illustrated in Fig. 3.4.
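
The instability of the fixed point Ω = 1 can also be seen by integrating Eq. (3.32) numerically. The following sketch uses a fixed-step fourth-order Runge-Kutta scheme; the function name, step count and starting values are assumptions chosen for illustration.

```python
def evolve_omega(omega_start, w, n_end, steps=10_000):
    """Integrate dOmega/dN = (1 + 3w) Omega (Omega - 1) from N = 0 to N = n_end."""

    def rhs(omega):
        return (1.0 + 3.0 * w) * omega * (omega - 1.0)

    dn = n_end / steps
    omega = omega_start
    for _ in range(steps):
        k1 = rhs(omega)
        k2 = rhs(omega + 0.5 * dn * k1)
        k3 = rhs(omega + 0.5 * dn * k2)
        k4 = rhs(omega + dn * k3)
        omega += dn * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return omega


# Matter domination (w = 0): deviations of 10^-3 after three e-folds of expansion.
for start in (1.001, 1.0, 0.999):
    print(start, evolve_omega(start, w=0.0, n_end=3.0))
```

A deviation of one part in a thousand grows by roughly a factor e³ ≈ 20 in three e-folds, while Ω = 1 exactly stays put. Integrating much further on the Ω > 1 branch is not advisable with this simple scheme, since the exact solution diverges at a finite value of log a.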

3.2.2 Horizon problem

Our Universe seems to be extremely homogeneous on large scales. The CMB temperature
anisotropies arise only at the level of one part in 10^5. However, if the Universe were only
radiation- or matter-dominated in the past, any two regions in the sky with an angular separation
of more than a few degrees would not have been able to communicate between the singularity and
recombination to agree on a common temperature. This is the so-called
horizon problem. To discuss this problem let us start by describing the causal structure

Figure 3.5: Conformal diagram (conformal time τ versus comoving radial coordinate R) for
standard hBB cosmology (left) and inflationary cosmology (right). Inflation solves the horizon
problem by extending the conformal diagram to negative conformal times.

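of the FLRW metric. Performing a change of coordinates

$$ dR = \frac{dr}{\sqrt{1 - kr^2}} , \qquad r = S_k(R) = \begin{cases} \sin R & \text{for } k = 1 , \\ R & \text{for } k = 0 , \\ \sinh R & \text{for } k = -1 , \end{cases} \tag{3.33} $$

the FLRW metric (3.2) can be written as

$$ ds^2 = -dt^2 + a(t)^2\left[dR^2 + S_k^2(R)\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] . \tag{3.34} $$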

Defining now the conformal time

$$ \tau = \int \frac{dt}{a(t)} , \tag{3.35} $$

we can recast Eq. (3.34) in the conformal form

$$ ds^2 = a(\tau)^2\left[-d\tau^2 + dR^2 + S_k^2(R)\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] . \tag{3.36} $$

The causal structure of the FLRW metric is determined by the way in which light propagates
on null geodesics with $ds^2 = 0$. Since the space is isotropic, we can freely set the coordinates
θ and φ to a constant value. In this case, the condition $ds^2 = 0$ implies

$$ R = \pm\tau + \text{constant} . \tag{3.37} $$

The fact that radial null geodesics are 45° lines in the {τ, R} plane is related to the fact that
the FLRW metric (3.34) is conformally flat. If the Universe started at some initial time $t_i$, then
there is a maximum distance that light can have travelled. The (comoving) particle

horizon is the largest distance that a photon can travel between $t_i$ and a later time $t > t_i$
(recall that c ≡ 1),

$$ d_H = \tau - \tau_i = \int_{t_i}^{t} \frac{dt}{a} = \int_{a_i}^{a} \frac{da}{a\dot a} = \int_{\ln a_i}^{\ln a} (aH)^{-1}\, d\ln a . \tag{3.38} $$

According to this expression, the evolution of $d_H$ depends also on the evolution of the
comoving Hubble radius $(aH)^{-1}$ (cf. Eq. (3.27)).

Exercise
Show this.

For standard matter sources 1 + 3w > 0 and $(aH)^{-1}$ grows as the Universe expands. When
that happens, the integral in Eq. (3.38) becomes dominated by its upper limit,

$$ d_H \propto \frac{2}{1+3w}\, a^{\frac{1}{2}(1+3w)} = \frac{2}{1+3w}\,(aH)^{-1} . \tag{3.39} $$
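
Equation (3.39) can be checked by evaluating the integral (3.38) numerically. The sketch below normalizes the comoving Hubble radius to one at a = 1, uses a simple midpoint rule in ln a, and starts the integration at a small but finite initial scale factor. These choices and the function names are assumptions made for illustration.

```python
import math


def comoving_hubble_radius(a, w):
    """(aH)^-1 from Eq. (3.27), normalized to 1 at a = 1."""
    return a ** (0.5 * (1.0 + 3.0 * w))


def particle_horizon(a, w, a_initial=1e-8, steps=200_000):
    """Midpoint-rule estimate of the integral (3.38) over ln a."""
    x0, x1 = math.log(a_initial), math.log(a)
    dx = (x1 - x0) / steps
    total = 0.0
    for i in range(steps):
        total += comoving_hubble_radius(math.exp(x0 + (i + 0.5) * dx), w)
    return total * dx


for w in (0.0, 1.0 / 3.0):
    ratio = particle_horizon(1.0, w) / comoving_hubble_radius(1.0, w)
    print(w, ratio, 2.0 / (1.0 + 3.0 * w))
```

For 1 + 3w > 0 the result is insensitive to the lower limit, which is the statement that the integral is dominated by late times.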

Note that due to the presence of the singularity at $a_i = 0$ (or equivalently at $\tau_i = 0$) this
quantity is finite. At each instant of time, regions that were never in causal contact before
get into contact for the first time. The fact that two of these regions share approximately
the same temperature cannot be a consequence of thermal equilibrium. On general grounds,
these regions should be expected to look very different from each other. This applies also to
the CMB (see the left panel of Fig. 3.5). The observed homogeneity of the CMB map is not
only remarkable but also strange and unexpected! Most points in that map share roughly
the same temperature even if the naive horizon scale at decoupling is just a few degrees.
How is this possible?

3.3 Inflationary paradigm

The horizon and flatness problems are intimately related to the fact that the comoving Hubble
radius $(aH)^{-1}$ grows within the standard hot Big Bang scenario. A simple solution to these
problems is to postulate a decrease of $(aH)^{-1}$ at early times,

$$ \frac{d}{dt}(aH)^{-1} < 0 , \tag{3.40} $$

or equivalently a violation of the strong energy condition 1 + 3w > 0 (cf. Eq. (3.27)). This
additional phase in the history of the Universe is called inflation. The name can be easily
understood by noticing that Eq. (3.40) implies accelerated expansion,

$$ \frac{d}{dt}(aH)^{-1} = \frac{d}{dt}(\dot a)^{-1} = -\frac{\ddot a}{\dot a^2} < 0 \quad \Rightarrow \quad \ddot a > 0 . \tag{3.41} $$

No weak energy condition violation

Note that the condition $\ddot a > 0$ is very different from $\dot H > 0$, with (for flat spatial sections)

$$ \dot H = \frac{\ddot a}{a} - H^2 = -\frac{1}{2M_P^2}(\rho + p) . \tag{3.42} $$

In an expanding Universe the energy density is always decreasing or at most constant.

In order to have $\dot H > 0$, the null energy condition ρ + p ≥ 0 must be violated.

If the inflationary stage lasts long enough, the hot Big Bang problems are automatically
solved. In particular, if the comoving Hubble radius $(aH)^{-1}$ in Eq. (3.26) decreases, the
curvature is driven towards zero (the unstable point Ω = 1 in Eq. (3.32) now becomes an
attractor). This solves the flatness problem. On the other hand, if 1 + 3w < 0 the integral
in Eq. (3.38) becomes dominated by its lower limit and the singularity is pushed towards
negative conformal times,

$$ \tau_i \propto \frac{2}{1+3w}\, a_i^{\frac{1}{2}(1+3w)} \to -\infty \quad \text{for } a_i \to 0 . \tag{3.43} $$

The extension of the conformal diagram to negative conformal times allows the light cones
of widely separated CMB points to intersect (see the right panel of Fig. 3.5). This solves
the horizon problem.

3.3.1 Minimal duration of inflation

Because the comoving Hubble radius shrinks during inflation, typical scales that were inside
the horizon at the onset of inflation leave the radius of causal contact as inflation proceeds.
When inflation ends, the comoving Hubble radius $(aH)^{-1}$ starts increasing and the scales
reenter the horizon. This is illustrated in Fig. 3.6. A simple inspection of this figure reveals
that in order to solve the hot Big Bang problems we must require

$$ (a_i H_i)^{-1} \geq (a_0 H_0)^{-1} , \tag{3.44} $$

with $(a_i H_i)^{-1}$ the comoving Hubble radius at the onset of the inflationary regime and
$(a_0 H_0)^{-1}$ its value today. Let us assume for simplicity that the radiation-dominated epoch
starts immediately after the end of inflation and neglect the comparatively shorter matter
and dark-energy dominated epochs. Under these assumptions, we have

$$ \frac{(a_i H_i)^{-1}}{(a_{\rm end} H_{\rm end})^{-1}} = \frac{(a_i H_i)^{-1}}{(a_0 H_0)^{-1}}\, \frac{(a_0 H_0)^{-1}}{(a_{\rm end} H_{\rm end})^{-1}} \geq \frac{a_0}{a_{\rm end}} , \tag{3.45} $$

where in the last step we have made use of the condition (3.44) together with the radiation
domination scaling $H \propto a^{-2}$. Taking into account that $a_0/a_{\rm end} \sim T_{\rm end}/T_0$ and assuming
inflation to finish at an energy scale¹ $T_{\rm end} \sim 10^{14}$ GeV $\sim 10^{26}\, T_0$, with $T_0 \sim 10^{-3}$ eV the
temperature of the Universe today, we get

$$ \frac{(a_i H_i)^{-1}}{(a_{\rm end} H_{\rm end})^{-1}} \geq 10^{26} \simeq e^{60} . \tag{3.46} $$

¹ Although accurate enough for a large set of inflationary models, this value is taken for illustration purposes
only. On general grounds, the precise number of e-folds must be computed model by model taking into account
not only the energy scale at the end of inflation, but also the details of the reheating process and the effects
of any intermediate era between the end of inflation and the onset of radiation domination.

Figure 3.6: Scales of cosmological interest as a function of the number of e-folds (horizontal
axis: log a, covering the inflation, heating, radiation and matter epochs). Because the comoving
Hubble radius shrinks, typical scales $\lambda \equiv (a_0 H_0)^{-1}$ that were inside the horizon at the onset of
inflation leave the radius of causal contact as inflation proceeds (horizon exit). When inflation
ends, the comoving Hubble radius $(aH)^{-1}$ starts increasing and the scales reenter the horizon.

For $H_i \simeq H_{\rm end}$ (see below), this condition becomes

$$ N \equiv \log\frac{a_{\rm end}}{a_i} \gtrsim 60 . \tag{3.47} $$

Therefore, in order to solve the hot Big Bang problems we need at least N ≃ 60 e-folds of
inflation.
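
The estimate of the minimal number of e-folds is a one-line computation once the two temperatures are fixed. The sketch below reproduces it for the illustrative values quoted above; the function name is an assumption, and the instantaneous transition to radiation domination is the simplification already made in the text.

```python
import math


def minimal_efolds(t_end_ev, t0_ev):
    """Minimal N = ln(a_end/a_i), assuming a_0/a_end ~ T_end/T_0 and H_i ~ H_end."""
    return math.log(t_end_ev / t0_ev)


T_END_EV = 1e14 * 1e9  # 10^14 GeV expressed in eV (illustrative value from the text)
T0_EV = 1e-3           # present temperature in eV, order of magnitude only

print(minimal_efolds(T_END_EV, T0_EV))  # close to 60
```

Because the dependence on the temperatures is logarithmic, changing the energy scale of inflation by an order of magnitude shifts N by only about 2.3.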

3.3.2 Hubble flow parameters

The conditions for inflation are traditionally formulated as conditions on the variation of the
Hubble rate. Defining the fractional change of the Hubble rate per e-fold N,²

$$ \epsilon \equiv -\frac{\dot H}{H^2} = -\frac{d\ln H}{dN} , \qquad dN \equiv H\,dt = d\ln a , \tag{3.48} $$

we can rewrite Eq. (3.40) as

$$ \frac{d}{dt}(aH)^{-1} = -\frac{1}{a}(1 - \epsilon) . \tag{3.49} $$
2
This quantity will play a central role in the effective field theory of inflation to be presented in Chapter 7.

For inflation to take place, ε must be smaller than one. As argued in Section 3.3.1, the
solution of the flatness and horizon problems requires a rather long inflationary stage. In
order to achieve this, the fractional change of ε,

$$ \eta \equiv \frac{d\ln\epsilon}{dN} = \frac{\dot\epsilon}{H\epsilon} , \tag{3.50} $$

must also be small, |η| < 1. Note that the ε and η parameters are just the first two elements of a
full series of Hubble flow parameters

$$ \epsilon_{i+1} \equiv \frac{d\ln\epsilon_i}{dN} = \frac{\dot\epsilon_i}{H\epsilon_i} . \tag{3.51} $$
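
To get a feeling for these definitions, the sketch below evaluates ε and η by central finite differences for a given Hubble rate H(t). The test case is a power-law expansion a ∝ t^p, for which H = p/t, ε = 1/p and η = 0. The function names, the step size and the sample exponents are assumptions made for illustration.

```python
def epsilon(hubble, t, dt=1e-3):
    """First Hubble flow parameter, epsilon = -Hdot / H^2, by central differences."""
    h_dot = (hubble(t + dt) - hubble(t - dt)) / (2.0 * dt)
    return -h_dot / hubble(t) ** 2


def eta(hubble, t, dt=1e-3):
    """Second Hubble flow parameter, eta = epsilon_dot / (H epsilon)."""
    eps_dot = (epsilon(hubble, t + dt, dt) - epsilon(hubble, t - dt, dt)) / (2.0 * dt)
    return eps_dot / (hubble(t) * epsilon(hubble, t, dt))


def power_law_hubble(p):
    """Hubble rate H(t) = p / t for a scale factor a(t) ∝ t^p."""
    return lambda t: p / t


# p = 50: accelerated (inflationary) expansion; p = 1/2: radiation domination.
for p in (50.0, 0.5):
    print(p, epsilon(power_law_hubble(p), 1.0), eta(power_law_hubble(p), 1.0))
```

Only the p = 50 case satisfies ε < 1, in agreement with the fact that radiation domination is decelerated. Note that `eta` is undefined for an exactly constant Hubble rate, where ε vanishes identically.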

3.3.3 de Sitter spacetime

The minimal value of the parameter ε is zero. In this case the Hubble rate H is constant and
the Universe expands exponentially fast, $a(t) = e^{Ht}$. This limit motivates the study of the
so-called de Sitter spacetime.
The de Sitter spacetime dS4 can be represented as a 4-dimensional hyperboloid extrinsically
embedded in a d = 5 Minkowski spacetime $\eta_{AB} = \mathrm{diag}(-1, 1, 1, 1, 1)$ with coordinates $z^A$,

$$ \eta_{AB}\, z^A z^B = -z_0^2 + z_1^2 + z_2^2 + z_3^2 + z_4^2 = l^2 . \tag{3.52} $$

The quantity l ≡ 1/H is the so-called de Sitter radius.³ This representation makes explicit
the symmetries of the de Sitter space: rotations and Lorentz transformations in the 10 planes
formed by pairs of the five coordinates $z^A$. This ten-parameter SO(4, 1) group plays the same
instrumental role as the Poincaré group in Minkowski spacetime. In particular, it greatly
facilitates computations as far as quantum field theory is concerned.
The dS4 spacetime can also be described in an intrinsic way. Consider the transformation

$$ z_0 = \frac{1}{H}\sinh(Ht) + \frac{H}{2}\, e^{Ht}\, \delta_{ij} x^i x^j , \tag{3.53} $$

$$ z_i = x_i\, e^{Ht} , \tag{3.54} $$

$$ z_4 = \frac{1}{H}\cosh(Ht) - \frac{H}{2}\, e^{Ht}\, \delta_{ij} x^i x^j , \tag{3.55} $$

with i = 1, 2, 3 and −∞ < t < ∞ and −∞ < $x^i$ < ∞. In this coordinate system the line
element induced on the hyperboloid (3.52) becomes a special case of the flat FLRW spacetime

$$ ds^2 = -dt^2 + a^2(t)\, \delta_{ij}\, dx^i dx^j , \tag{3.56} $$

with $a(t) = e^{Ht}$. Note however that Eq. (3.56) is not completely equivalent to (3.52), since
the coordinates {t, $x^i$} cover only half of the de Sitter manifold. This can be easily seen by adding
the $z_0$ and $z_4$ coordinates (see also Fig. 3.7),

$$ z_0 + z_4 = \frac{1}{H}\, e^{Ht} \geq 0 . \tag{3.57} $$
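
The embedding (3.53)-(3.55) can be tested numerically by checking that an arbitrary point {t, x^i} lands on the hyperboloid (3.52) and satisfies Eq. (3.57). The sketch below does this for one sample point; the function names and the sample values are illustrative assumptions.

```python
import math


def embed(t, x, hubble):
    """Map flat-slicing coordinates (t, x^i) to embedding coordinates (z0, z1, z2, z3, z4)."""
    x_squared = sum(xi * xi for xi in x)
    growth = math.exp(hubble * t)
    z0 = math.sinh(hubble * t) / hubble + 0.5 * hubble * growth * x_squared
    z4 = math.cosh(hubble * t) / hubble - 0.5 * hubble * growth * x_squared
    return [z0] + [xi * growth for xi in x] + [z4]


def minkowski_norm(z):
    """eta_AB z^A z^B with signature (-, +, +, +, +)."""
    return -z[0] ** 2 + sum(c * c for c in z[1:])


H = 0.7                                  # hypothetical Hubble rate
z = embed(t=0.3, x=(0.5, -1.2, 2.0), hubble=H)
print(minkowski_norm(z), 1.0 / H**2)     # both should equal l^2 = 1/H^2
print(z[0] + z[4], math.exp(H * 0.3) / H)
```

Since z₀ + z₄ is positive for every t and x^i, points of the hyperboloid with z₀ + z₄ < 0 are never reached, which is the half that the flat slicing misses.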

³ The choice of notation will become clear soon.

Figure 3.7: The embedding of de Sitter space into a five-dimensional flat spacetime with
two spatial coordinates suppressed. The flat coordinates in (3.53)-(3.55) cover only half of
the de Sitter manifold. The surfaces (lines) of constant t and constant x are also indicated.

Exercise
1. Derive Eq. (3.56) from the 5-dimensional embedding (3.52).

2. Other choices of coordinates leading to FLRW metrics with open and closed spatial
sections can also be considered. Find these sets of coordinates.
It is interesting to recast (3.56) in terms of the conformal time (3.35). Taking into account
that

$$ \tau = -\frac{1}{H}\, e^{-Ht} \quad \Longrightarrow \quad a(\tau) = -\frac{1}{H\tau} , \tag{3.58} $$

the line element takes the manifestly conformally flat form

$$ ds^2 = \frac{1}{H^2\tau^2}\left[-d\tau^2 + \delta_{ij}\, dx^i dx^j\right] , \tag{3.59} $$

with τ ranging between −∞ and 0. Note that Eq. (3.59) is manifestly invariant under the
rescaling

$$ \tau \to \lambda\tau , \qquad x^i \to \lambda x^i . \tag{3.60} $$

As we will see in Section 6.1.3, this symmetry plays a central role in the properties of the
primordial perturbations generated during inflation. But let us not get ahead of ourselves and
focus on the background evolution for the time being. What seems clear is that in order to recover
the hot Big Bang scenario the de Sitter phase cannot be eternal. In other words, we need to
equip the de Sitter space with a clock.
