Isaac Newton
Scholium of the Principia
The purpose of this chapter is to remind you of the basic features of Galilean spacetime and its
symmetries, which are closely related to the form taken by Newton’s laws as seen by inertial observers.
Although the ideas presented in this chapter will all be familiar to you, the way of looking at them will
probably be new. We will introduce some tensorial notation that will be useful in the future. Indeed,
local differential geometry can be understood as a refinement of the tensorial methods presented here.
1. Principle of Relativity: The laws of physics are the same in all the inertial frames: No experiment
can measure the absolute velocity of an observer; the results of any experiment do not depend
on the speed of the observer relative to other observers not involved in the experiment.
2. There exists an absolute time, which is the same for any observer.
[Figure: spacetime diagram of an inertial observer, showing its worldline, the basis vectors e1 and e2, and the past and future regions.]
coordinates used for its description. The spatial location of the event can be specified in Cartesian
coordinates (x, y, z), in spherical coordinates (r, θ, φ), or making use of any 3 independent numbers
obtained by a well-defined coordinate transformation. However, among all the coordinate systems
that can be used in Newtonian physics, the inertial coordinate systems are privileged (at least for
Newton and Galileo). An inertial frame is a frame moving freely in spacetime, free of any force, which
carries ideal clocks and measuring rods forming an orthonormal Cartesian coordinate system. In such
a frame, a particular event P is characterized by 4 coordinates: its position2
Time
Physical time is absolute (up to affine changes, see below) and it is used to characterize particle
trajectories xi (t). The temporal separation dt between two events is well-defined, independently of
their spatial separation (see below). Simultaneous events are characterized by equal time surfaces
separating the future and the past of the events. Any event may cause any simultaneous or later
event.
Space
For each spatial coordinate we define a set of orthonormal basis vectors along the xi coordinate
direction
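$$\mathbf{e}_i = \{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\} = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}\,, \qquad \mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}\,, \tag{1.2}$$
with $\delta_{ij}$ the 3-dimensional Kronecker delta
$$\delta_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \mathrm{diag}(1, 1, 1)\,. \tag{1.3}$$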
2 We emphasize here that we do not consider {xi } to be a vector since the homogeneity of space makes the choice of
an origin completely arbitrary. The distance between points is the only significant quantity. On top of that, coordinates
will no longer behave as a vector in the presence of gravity.
1.2 Euclidean spacetime: old wine in a new bottle 4
The infinitesimal displacement vector dX between two points (as any other vector in E3 ) can be
expanded in terms of the basis vectors ei as
$$d\mathbf{X} = \sum_{i=1}^{3} dx^i\, \mathbf{e}_i = dx^1 \mathbf{e}_1 + dx^2 \mathbf{e}_2 + dx^3 \mathbf{e}_3\,, \tag{1.4}$$
with dxi the so-called contravariant components of the vector in that orthonormal basis.
Exercise
Which of the following expressions do not make sense or are ambiguous according to the previous
rules? Why? Restore the sums on dummy indices in the rest of equations.
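$$x^i = A^i{}_j B^j{}_k x^k\,, \quad x^i = A^j{}_k B^k{}_l x^l\,, \quad D^i{}_j = A^i{}_k B^k{}_k C^k{}_j\,, \quad D^i{}_j = A^i{}_k B^k{}_l C^l{}_j\,,$$
$$x^i = A^i{}_j x^j + B^i{}_k x^k\,, \quad x^i = A^i{}_j x^j + B^i{}_j x^j\,, \quad D^i{}_j = A^i{}_k B^k{}_l C^l{}_i\,.$$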
The orthonormality of the basis vectors allows us to compute the contravariant components dxi
as the scalar product of the vector dX and the corresponding basis vector ei
where in the last step we have defined the so-called covariant components dxi
The 3-dimensional Kronecker delta $\delta_{ij}$ therefore allows us to lower (or raise) spatial indices. The definition
of covariant vectors is introduced only for notational brevity; there is nothing deep in it. The location of
the indices in Euclidean space is just a clever way of keeping track of the summation convention
and does not give rise to any change in the numerical value of the different components
As we will see in the next chapters, this is not the general case in a non-Cartesian reference frame or
in other spacetimes with an indefinite metric, such as the Minkowski spacetime, where the distinction
between the temporal components of a covariant and a contravariant vector becomes important.
The square of the infinitesimal spatial distance between two points in E3 is given by
where δij plays the role of a metric in E3 , for an orthonormal basis. The line element dX 2 is positive-
definite.
$$d\bar{x}^k = \frac{\partial \bar{x}^k}{\partial x^i}\, dx^i\,, \tag{1.10}$$
which, imposing the invariance of line element dX̄ 2 = dX 2 , implies
$$\delta_{ij} = \delta_{kl}\, \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial \bar{x}^l}{\partial x^j}\,. \tag{1.11}$$
Differentiating the previous expression with respect to xp , and taking into account that δij is a constant
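matrix, we get
$$\delta_{kl} \left( \frac{\partial^2 \bar{x}^k}{\partial x^i \partial x^p} \frac{\partial \bar{x}^l}{\partial x^j} + \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial^2 \bar{x}^l}{\partial x^p \partial x^j} \right) = 0\,. \tag{1.12}$$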
Permuting ipj to pji and jip we obtain two equations
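$$\delta_{kl} \left( \frac{\partial^2 \bar{x}^k}{\partial x^p \partial x^j} \frac{\partial \bar{x}^l}{\partial x^i} + \frac{\partial \bar{x}^k}{\partial x^p} \frac{\partial^2 \bar{x}^l}{\partial x^j \partial x^i} \right) = 0\,, \tag{1.13}$$
$$\delta_{kl} \left( \frac{\partial^2 \bar{x}^k}{\partial x^j \partial x^i} \frac{\partial \bar{x}^l}{\partial x^p} + \frac{\partial \bar{x}^k}{\partial x^j} \frac{\partial^2 \bar{x}^l}{\partial x^i \partial x^p} \right) = 0\,. \tag{1.14}$$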
Subtracting (1.13) from (1.12), adding (1.14), and taking into account the symmetry of the metric
and the fact that ordinary partial derivatives commute, we get
$$\delta_{kl}\, \frac{\partial R^k{}_i}{\partial x^p}\, R^l{}_j = 0\,, \tag{1.15}$$
where we have defined the matrix3
$$R^i{}_j \equiv \frac{\partial \bar{x}^i}{\partial x^j}\,. \tag{1.16}$$
Since the transformation $R^i{}_j$ is required to have an inverse4, we must conclude that
$$\frac{\partial R^k{}_i}{\partial x^p} = 0\,, \tag{1.17}$$
3 The first index in Ri j labels rows and the second one labels columns.
4 The system x̄ is not at all privileged.
1.4 Tensors in Euclidean space 6
$$\bar{x}^i = R^i{}_j x^j + d^i\,, \tag{1.18}$$
with $d^i$ some real and arbitrary integration constants and $R^i{}_j$ independent of the coordinates.
Substituting Eq. (1.18) back into (1.11) we obtain the similarity transformation
which is nothing else than the indexed version of the orthogonality condition $R^T I R = R^T R = I$ for
a 3 × 3 matrix. $R^i{}_j$ is an O(3) matrix (as you probably expected)! Taking the determinant of both
sides of the orthogonality condition, we conclude that the determinant of an orthogonal matrix can
take two different values, namely det R = ±1. Since we will be interested in rotations connected with
the identity, we will restrict ourselves to proper rotations with determinant det R = +1, i.e. orientation-
preserving transformations
$$SO(3) = \{R \mid R^T I R = I\,,\ \det R = 1\}\,. \tag{1.20}$$
Transformations with det R = −1 can be obtained by applying a parity transformation $P^i{}_j = -\delta^i{}_j$ in E3,
which is also an orthogonal matrix, $P^T P = I$.
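The orthogonality and determinant statements above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`matmul`, `transpose`, `det3`, `rotation_z`, `is_identity`) and the angle 0.7 are illustrative choices, not part of the notes.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def rotation_z(theta):
    """Passive rotation of angle theta around the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))

R = rotation_z(0.7)                                   # proper rotation
P = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]  # parity
PR = matmul(P, R)                                     # improper transformation

print(is_identity(matmul(transpose(R), R)), det3(R))
print(is_identity(matmul(transpose(PR), PR)), det3(PR))
```

Both matrices satisfy the orthogonality condition; only the sign of the determinant tells the proper rotation apart from the one that includes a parity flip.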
The laws of Newtonian mechanics are required to be covariant, i.e. to have the same form in
each inertial frame of reference. To achieve this, we will make use of tensors, in this case
Cartesian tensors, which have well-defined transformation properties from frame to frame. As you
will soon realize, these objects are the cornerstone of modern physical theories, such as Special or
General Relativity. We will use them repeatedly in this course, so pay attention! We will start
our trip using a concrete and familiar context for the introduction of the tensor notions: rotations
$\bar{x}^i = R^i{}_j x^j$ in Euclidean space.
Exercise
Show that the 3-volume is indeed a scalar under rotations.
If we can associate a number to all the points in some spacetime region, as for instance happens with
the value of the temperature in the different points of the Earth, we say that we are dealing with a
scalar field. Under coordinate transformations, it transforms as
$$\bar{\phi}(t, \bar{x}) = \phi(t, x)\,, \qquad \text{or} \qquad \bar{\phi}(t, x) = \phi(t, R^{-1} x)\,. \tag{1.21}$$
1.4.2 Vectors
What is a vector? A vector V (in this case Cartesian) is an absolute geometrical object with a particular
length and direction which does not depend on the choice of coordinates. The same happens with
the rules of vector calculus. Concepts such as the angle between two vectors can be defined independently
Figure 1.2: A rotation: transformation of the basis vectors and components.
of the coordinates. Even though there is no need to introduce the components of a vector
in a given basis, doing so is sometimes useful. Let us see what happens when we do it. Consider two
orthonormal frames related, for instance, by a rotation of angle θ around the z axis, as illustrated in
Fig. 1.2. The vector V can be expanded in terms of the two sets of basis vectors associated with these
coordinate systems. In terms of the basis $\mathbf{e}_i$, the vector V has components $V^i$
$$\mathbf{V} = V^i \mathbf{e}_i = V^1 \mathbf{e}_1 + V^2 \mathbf{e}_2 + V^3 \mathbf{e}_3\,, \tag{1.22}$$
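but the vector itself (V) does not change. The relation between the basis vectors $\bar{\mathbf{e}}_i$ and $\mathbf{e}_i$ can be
easily read from the figure to get
$$\begin{pmatrix} \bar{\mathbf{e}}_1 \\ \bar{\mathbf{e}}_2 \\ \bar{\mathbf{e}}_3 \end{pmatrix}^{T} = \begin{pmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \mathbf{e}_3 \end{pmatrix}^{T} \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\,. \tag{1.24}$$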
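Using this relation, it is easy to write $\mathbf{V} = \bar{V}^i \bar{\mathbf{e}}_i$ in terms of the original basis vectors $\mathbf{e}_i$ and identify from
there the transformation of the components. We obtain
$$\begin{pmatrix} \bar{V}^1 \\ \bar{V}^2 \\ \bar{V}^3 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} V^1 \\ V^2 \\ V^3 \end{pmatrix} \tag{1.25}$$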
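The previous exercise can be easily generalized to an arbitrary rotation, giving rise to the following
transformation rules5
$$\bar{V}^i = R^i{}_j V^j\,, \qquad \bar{\mathbf{e}}_i = (R^{-1})^j{}_i\, \mathbf{e}_j\,, \tag{1.27}$$
which, in a much more powerful notation, can be written as
$$\bar{V}^i = \frac{\partial \bar{x}^i}{\partial x^j} V^j\,, \qquad \bar{\mathbf{e}}_i = \frac{\partial x^j}{\partial \bar{x}^i}\, \mathbf{e}_j\,. \tag{1.28}$$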
5 The example has been presented using the passive viewpoint, in which the same vector ends up with different
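In conclusion, a vector V remains unchanged under (in this case) rotations due to the simultaneous
and opposite change of its components $V^i$ and the basis $\mathbf{e}_i$
$$\mathbf{V} = \bar{V}^i \bar{\mathbf{e}}_i = \frac{\partial \bar{x}^i}{\partial x^j} V^j\, \frac{\partial x^k}{\partial \bar{x}^i}\, \mathbf{e}_k = V^j \delta_j{}^k\, \mathbf{e}_k = V^k \mathbf{e}_k = \mathbf{V}\,. \tag{1.29}$$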
From now on, and in a clear abuse of language, we will frequently employ a standard shorthand and
will refer to $V^i$ as a vector instead of saying the components of a vector V. A vector is said to be
contravariant if it transforms as the displacement vector $dx^i$ (cf. Eq. (1.10))
$$\bar{V}^i = \frac{\partial \bar{x}^i}{\partial x^j} V^j\,. \tag{1.30}$$
On the other hand, a vector is said to be covariant if it transforms as the basis vectors $\mathbf{e}_i$ (cf. Eq.
(1.29))
$$\bar{V}_i = \frac{\partial x^j}{\partial \bar{x}^i} V_j\,. \tag{1.31}$$
A particular example of an object with the previous transformation properties is the gradient of a
scalar function
$$\frac{\partial f}{\partial \bar{x}^i} = \frac{\partial x^j}{\partial \bar{x}^i} \frac{\partial f}{\partial x^j} \tag{1.32}$$
The gradient is the change of the function per unit distance in the direction of the basis vector.
When the basis vectors “shrink”, the gradient must “shrink” too.
You may think that I am being a bit pedantic here. For you the gradient was, until now, a regular
vector, as good as the displacement vector. Now I am giving them two different names and
two “different” transformation rules! Indeed, you are right: I am being quite pedantic, but
just to prepare the notation for the future. Note that the matrix $(R^{-1})^j{}_i$ is just an index notation
for $(R^{-1})^T$, which for the particular case of an orthogonal matrix is equal to the transformation
matrix R itself. As we already said in Section 1.2, there is no clear difference between covariant
and contravariant components as long as one transforms between Euclidean orthonormal bases.
However, this is not the case in general coordinate systems (such as polar coordinates) or in
Special Relativity. Be patient.
Exercise
Show that
• the 3-divergence of a vector field ∂i V i transforms as a scalar field.
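Rotations: $\partial \bar{x}^i / \partial x^j \equiv R^i{}_j$ are constants!

Scalar                          $\bar{\phi} = \phi$
Contravariant vector            $\bar{V}^i = \frac{\partial \bar{x}^i}{\partial x^j} V^j$
Covariant vector                $\bar{V}_i = \frac{\partial x^j}{\partial \bar{x}^i} V_j$
Contravariant rank-2 tensor     $\bar{T}^{ij} = \frac{\partial \bar{x}^i}{\partial x^k} \frac{\partial \bar{x}^j}{\partial x^l} T^{kl}$
Covariant rank-2 tensor         $\bar{T}_{ij} = \frac{\partial x^k}{\partial \bar{x}^i} \frac{\partial x^l}{\partial \bar{x}^j} T_{kl}$
Mixed rank-2 tensor             $\bar{T}^i{}_j = \frac{\partial \bar{x}^i}{\partial x^k} \frac{\partial x^l}{\partial \bar{x}^j} T^k{}_l$

Table 1.1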
where ⊗ denotes the direct product. The transformation property of the different components $T^{ij}$
under a rotation follows immediately from the previous expression: $T^{ij}$ transforms as the product of
two contravariant vectors $A^i$ and $B^j$
$$\bar{A}^i \bar{B}^j = R^i{}_k R^j{}_l A^k B^l \quad\longrightarrow\quad \bar{T}^{ij} = R^i{}_k R^j{}_l T^{kl}\,. \tag{1.34}$$
As we did in the previous section, we can define the covariant tensor $T_{ij}$, which transforms as the
product of two covariant vectors
$$\bar{A}_i \bar{B}_j = (R^{-1})^k{}_i (R^{-1})^l{}_j A_k B_l \quad\longrightarrow\quad \bar{T}_{ij} = (R^{-1})^k{}_i (R^{-1})^l{}_j T_{kl}\,. \tag{1.35}$$
As before, in a clear abuse of language, we will refer to these tensor components as tensors. Particular
examples of rank-2 Cartesian tensors include the inertia tensor
$$I^{ij} = \int d^3x\, \rho(x) \left( r^2 \delta^{ij} - x^i x^j \right) \tag{1.36}$$
Exercise
Show that the inertia tensor I ij is indeed a rank-(2,0) tensor.
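A numerical sanity check can accompany the analytic proof. The sketch below replaces the continuous density by a few point masses (an assumption made only for illustration, as are the masses, positions and helper names) and compares the inertia tensor computed directly from the rotated positions with the rank-2 rule $\bar{I}^{ij} = R^i{}_k R^j{}_l I^{kl}$.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def inertia(masses, positions):
    """I^{ij} = sum_a m_a (r_a^2 delta^{ij} - x_a^i x_a^j) for point masses."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                I[i][j] += m * (r2 * delta - x[i] * x[j])
    return I

def rotate_vector(R, x):
    """Contravariant rule: xbar^i = R^i_j x^j."""
    return [sum(R[i][j] * x[j] for j in range(3)) for i in range(3)]

def rotate_tensor(R, T):
    """Rank-2 contravariant rule: Tbar^{ij} = R^i_k R^j_l T^{kl}."""
    return [[sum(R[i][k] * R[j][l] * T[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

# Hypothetical mass distribution, chosen only for the check.
masses = [1.0, 2.0, 0.5]
positions = [[1.0, 0.0, 0.0], [0.0, 2.0, 1.0], [-1.0, 1.0, 3.0]]
R = rotation_z(0.3)

I_direct = inertia(masses, [rotate_vector(R, x) for x in positions])
I_rule = rotate_tensor(R, inertia(masses, positions))
max_diff = max(abs(I_direct[i][j] - I_rule[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The two results agree up to rounding, which is what the tensor transformation law predicts; a numerical check of this kind does not replace the proof asked for in the exercise.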
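Generalizing the transformation laws (1.34) and (1.35) we can define the transformation properties
for arbitrary mixed tensors of contravariant rank m and covariant rank n
$$\bar{T}^{i_1 \dots i_m}{}_{j_1 \dots j_n} = \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{i_p}}{\partial x^{k_p}} \right) \left( \prod_{q=1}^{n} \frac{\partial x^{l_q}}{\partial \bar{x}^{j_q}} \right) T^{k_1 \dots k_m}{}_{l_1 \dots l_n} \tag{1.37}$$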
Tensors (components) are objects with any number of indices. They share the same transformation
properties as vectors and can be classified according to the number of upper or lower indices. For
instance, we say that a scalar is a rank-0 tensor and a contravariant (or covariant) vector is a con-
travariant (or covariant) rank-1 tensor. In general, a tensor with m upper indices and n lower indices
is called a rank-(m, n) tensor.
A tensor is not just a quantity carrying indices. It is the transformation law that defines a
tensor (see below). Not all quantities with indices are tensors.
1. The sum (or difference) of two like-tensors is a tensor of the same kind. The proof of this is
straightforward. If we take the sum or difference of two general tensors $T^{i_1 \dots i_m}{}_{j_1 \dots j_n}$ and
$R^{i_1 \dots i_m}{}_{j_1 \dots j_n}$ and apply the transformation rule (1.37), we get
2. Given two tensors of rank s and t, the product transforms as a tensor of rank (s + t).
3. If the expression T ... ... = R... ... S ... ... is invariant under coordinate transformations and T ... ... and
R... ... are tensors, then S ... ... is a tensor.
Exercise
Prove this for the particular case Ti = Rj S j i .
4. A tensor contraction occurs when one of a tensor’s free covariant indices is set equal to one
of its free contravariant indices6 . A sum is understood to be performed on the now repeated
indices. For instance, $T_{ij}{}^{j}$ is a contraction on the second and third indices of the tensor $T_{ij}{}^{k}$.
5. The contraction of a rank-2 tensor is a scalar (its trace) whose value is independent of the
coordinate system chosen.
If all the components of a Cartesian tensor T i1 ...im j1 ...jn in a given inertial reference frame are
zero, they will be zero in any other inertial reference frame.
6 Note the words covariant and contravariant. A contraction is never done between two covariant or two contravariant
indices.
Exercise
Prove that the trace of a tensor is invariant under rotations. Show that a tensor Tij in n
dimensions has three separately invariant parts
$$T_{ij} = \frac{1}{n} T^k{}_k\, \delta_{ij} + T_{(ij)} + T_{[ij]} - \frac{1}{n} T^k{}_k\, \delta_{ij}\,. \tag{1.42}$$
Exercise
Write down the explicit expressions for the completely symmetric and antisymmetric parts of a
rank-3 tensor Tijk .
flips the sign upon the interchange of any pair of indices and vanishes when two of the indices are
equal. Most of the basic identities of vector algebra and vector calculus can be easily proved by using
an important relation between the metric tensor $\delta_{ij}$ and $\epsilon_{ijk}$, the contracted epsilon identity 8
chose is just a way of keeping track of the sums that can be easily extended to the Minkowski case. For Cartesian
tensors the position of the indices makes no difference.
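As an illustration, the contracted epsilon identity in its standard form, $\epsilon_{ijk}\epsilon_{imn} = \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}$, can be verified by brute force over all index values. The sketch below is a minimal check with hypothetical helper names; indices run over 0, 1, 2 instead of 1, 2, 3, and the same symbol is used to build a cross product.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    # The product is +2 or -2 for a permutation of (0, 1, 2) and 0 otherwise.
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    return 1 if i == j else 0

def identity_holds():
    """Check eps_ijk eps_imn = delta_jm delta_kn - delta_jn delta_km."""
    idx = range(3)
    for j in idx:
        for k in idx:
            for m in idx:
                for n in idx:
                    lhs = sum(levi_civita(i, j, k) * levi_civita(i, m, n)
                              for i in idx)
                    rhs = delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
                    if lhs != rhs:
                        return False
    return True

def cross(a, b):
    """(a x b)_i = eps_ijk a_j b_k."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3)) for i in range(3)]

print(identity_holds())
print(cross([1, 0, 0], [0, 1, 0]))
```

Because every quantity is an integer, the comparison is exact and no tolerance is needed.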
1.5 Covariance and Classical Mechanics 12
In order to ensure that fundamental equations satisfy the Galilean Principle of Relativity the only
thing we have to do is to write tensorial equations. For instance, if two quantities S ij k and T ij k
transform as rank-(2, 1) Cartesian tensors, a fundamental law of the kind
S ij k = T ij k , (1.46)
will retain its form in any inertial reference frame, since both sides of the equation transform in the
same way under coordinate transformations (in this case rotations). The fundamental equation (1.46)
is then said to be covariant and the transformation is said to be a symmetry of the physical theory.
Observations of galaxies with typical masses of 1030 M , and intergalactic separations of order
1 Mly do not show any significant deviation from Newton’s inverse-square law. Assuming this
deviation to be smaller than 1%, determine an upper bound on the magnitude of Λ.
9 Eqs. (1.47) and (1.49) are left unaltered by the addition to Φ of an arbitrary function of time f (t), namely
Φ(t, x) → Φ(t, x) + f (t) . (1.48)
Since the transformation affects only the field Φ and not the coordinates, the invariance of Eqs. (1.47) and (1.49) under
(1.48) is referred to as an internal or gauge symmetry. The gravitational field Φ(t, x) has no dynamical degrees of freedom.
Eq. (1.49) is not a dynamical equation for the determination of the potential, but rather a constraint on the initial
spatial distribution of the potential, which must apply at all times.
10 No value of the gravitational constant G was available to Newton. Its numerical value was
first determined by Cavendish in 1797 using a torsion balance, the result being reasonably close to present laboratory
measurements, $G = 6.673(10) \times 10^{-11}\ \mathrm{N\,m^2/kg^2}$. The gravitational constant remains the most uncertain of all the
fundamental constants of physics.
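[Figure: geometry of the potential integral, showing the origin O, the field point P at position x, a source point at position x', and the separation vector x − x'.]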
The solution of the Poisson equation can be worked out in the same way that you did for the
electromagnetic potential in your Classical Electrodynamics course. The only difference (albeit
fundamental) is the sign of the matter distribution. A formal solution of Poisson’s equation for an
arbitrary mass distribution can be obtained by applying the superposition principle or using Green
functions to obtain
$$\Phi(\mathbf{x}) = -G \int \frac{\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\, d^3x'\,, \tag{1.51}$$
where $\mathbf{x} = x^i \mathbf{e}_i$ is the radius vector of the point at which the gravitational potential is computed,
and $\mathbf{x}' = x'^i \mathbf{e}_i$ is an arbitrary point in the matter distribution. Note that the Newtonian potential is
negative, as expected for an attractive force.
The previous expression becomes the usual −GM/r only for a spherical mass distribution. The
general result for a non-spherical distribution is slightly more complicated. As for any distribution
function, the essential features of the matter distribution can be characterized by its moments. For an
observer sufficiently far away from the object we can perform a Taylor expansion around $\mathbf{x}' = 0$ to
obtain
$$\frac{1}{|\mathbf{x} - \mathbf{x}'|} = e^{-\mathbf{x}' \cdot \nabla}\, \frac{1}{r} = \frac{1}{r} - (\mathbf{x}' \cdot \nabla) \frac{1}{r} + \frac{1}{2} (\mathbf{x}' \cdot \nabla)^2 \frac{1}{r} + \dots + \frac{(-1)^n}{n!} (\mathbf{x}' \cdot \nabla)^n \frac{1}{r} + \dots \tag{1.52}$$
$$= \frac{1}{r} + \frac{x'^k x_k}{r^3} + \frac{\left( 3 x'^k x'^l - r'^2 \delta^{kl} \right) x_k x_l}{2 r^5} + \dots\,, \tag{1.53}$$
where we have used the standard expression for the exponential $e^{-x} = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} x^n$ and defined the
distance $r^2 = x^k x_k$. Inserted back in Eq. (1.51), we realize that the potential created by the matter
distribution
$$\Phi(\mathbf{x}) = -G \left( \frac{M}{r} + \frac{D^k x_k}{r^3} + \frac{Q^{kl} x_k x_l}{2 r^5} + \dots \right)\,, \tag{1.54}$$
can be organized in a series whose individual terms contain information on the spatial structure at an
increasing level of detail while decaying the more rapidly in space the higher the information content
is. The quantities
$$M = \int \rho(\mathbf{x}')\, d^3x'\,, \tag{1.55}$$
$$D^k = \int \rho(\mathbf{x}')\, x'^k\, d^3x'\,, \tag{1.56}$$
and
$$Q^{kl} = \int \rho(\mathbf{x}') \left( 3 x'^k x'^l - r'^2 \delta^{kl} \right) d^3x'\,, \tag{1.57}$$
are respectively the total mass of the system, the mass dipole moment and the mass quadrupole
moment tensor. The dipole moment can be eliminated by simply choosing the origin of coordinates at
the center of mass. The quadrupole moment is the second moment of the mass distribution with its
trace removed. Its contribution to the potential falls off as $1/r^3$, which gives rise to a deviation from
the inverse-square law of the form $1/r^4$ in the force.
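To see the multipole expansion (1.54) at work, the following sketch compares the exact potential of two equal point masses with its monopole and monopole-plus-quadrupole approximations. The configuration, the choice of units with G = 1, and all function names are assumptions made for illustration only.

```python
import math

G = 1.0  # assumption: units in which G = 1

def exact_potential(masses, sources, x):
    """Phi(x) = -G sum_a m_a / |x - x_a| for point masses."""
    return -G * sum(m / math.dist(x, s) for m, s in zip(masses, sources))

def multipole_moments(masses, sources):
    """Discrete versions of Eqs. (1.55)-(1.57)."""
    M = sum(masses)
    D = [sum(m * s[k] for m, s in zip(masses, sources)) for k in range(3)]
    Q = [[sum(m * (3.0 * s[k] * s[l]
                   - (sum(c * c for c in s) if k == l else 0.0))
              for m, s in zip(masses, sources))
          for l in range(3)] for k in range(3)]
    return M, D, Q

def multipole_potential(M, D, Q, x, order):
    """Truncation of Eq. (1.54): order 0, 1 or 2."""
    r = math.sqrt(sum(c * c for c in x))
    total = M / r
    if order >= 1:
        total += sum(D[k] * x[k] for k in range(3)) / r**3
    if order >= 2:
        total += sum(Q[k][l] * x[k] * x[l]
                     for k in range(3) for l in range(3)) / (2.0 * r**5)
    return -G * total

# Two unit masses on the z axis, placed symmetrically around the origin.
masses = [1.0, 1.0]
sources = [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
x = [3.0, 4.0, 12.0]  # field point at r = 13

M, D, Q = multipole_moments(masses, sources)
phi_exact = exact_potential(masses, sources, x)
err_mono = abs(multipole_potential(M, D, Q, x, 0) - phi_exact)
err_quad = abs(multipole_potential(M, D, Q, x, 2) - phi_exact)
print(D, err_mono, err_quad)
```

The dipole vanishes because the origin sits at the center of mass, and including the quadrupole term reduces the error of the monopole approximation.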
A. Einstein
In the previous chapter we saw that tensors are a very good tool for writing covariant equations in
3-dimensional Euclidean space. In this chapter we will generalize the tensor concept to the framework
of the Special Theory of Relativity, the Minkowski spacetime. I will assume the reader to be familiar
at least with the rudiments of Special Relativity, avoiding therefore any kind of historical introduction
to the theory.
1. Principle of Relativity (Galileo): The laws of physics are the same in all the inertial frames: No
experiment can measure the absolute velocity of an observer; the results of any experiment do not
depend on the speed of the observer relative to other observers not involved in the experiment.
2. Invariance of the speed of light: The speed of light in vacuum is the same in all the inertial
frames.
Instantaneous action at a distance is inconsistent with the second postulate and must be replaced
by retarded action at a distance. Absolute simultaneity will only apply as an approximation at low
velocities for nearby events.
coincide2 , but they differ in the metrical structure, i.e. in the definition of distances. While in
Newtonian spacetime the spatial and temporal distances are independent, in Minkowskian spacetime space
and time by themselves, are doomed to fade away into mere shadows, and only a kind of union of the
two will preserve an independent reality 3 . Space and time are distinguished only by a sign, which will
however play a central role. For any inertial frame of reference in Minkowski spacetime there is a set
of coordinates4
$$\{x^\mu\} = \{t, x, y, z\} = \{x^0, x^1, x^2, x^3\} = \{x^0, x^i\}\,, \tag{2.1}$$
and a set of orthonormal basis vectors
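$$\mathbf{e}_\mu \cdot \mathbf{e}_\nu = \eta_{\mu\nu}\,, \tag{2.3}$$
with
$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \mathrm{diag}(-1, 1, 1, 1)\,. \tag{2.4}$$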
The inverse of (2.4) is traditionally denoted by $\eta^{\mu\nu}$ and satisfies6
$$\eta^{\mu\nu} \eta_{\nu\rho} = \delta^\mu{}_\rho\,, \tag{2.5}$$
where the 4-dimensional Kronecker delta $\delta^\mu{}_\rho$ is the indexed version of the identity matrix, i.e. $\delta^\mu{}_\rho = 1$
if µ = ρ and zero otherwise. Note that $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$ are numerically equivalent.
Exercise
What is the value of $\delta^\mu{}_\mu$?
In terms of the basis vectors, the infinitesimal displacement dS between two points in spacetime can
be expressed as
$$d\mathbf{S} = dx^\mu\, \mathbf{e}_\mu\,, \tag{2.6}$$
where $dx^\mu$ are the so-called contravariant components. These are computed via the scalar product of
the vector dS and the corresponding basis vector $\mathbf{e}_\mu$.
where we have defined the covariant or dual components by lowering the index of the contravariant
components with the metric
$$dx_\mu \equiv \eta_{\mu\nu}\, dx^\nu\,. \tag{2.8}$$
Exercise
Starting with a covariant vector defined by Eq. (2.8), show that the inverse of the metric $\eta^{\mu\nu}$
can be used to raise indices
$$\eta^{\mu\nu}\, dx_\nu = dx^\mu\,. \tag{2.9}$$
As in the Euclidean case, contravariant and covariant vectors are just an appropriate way of
simplifying the notation and taking into account the summation convention. Note however that in
the present case lowering or raising indices changes the sign of the temporal component while keeping
the spatial ones intact
$$dx_0 = -dx^0\,, \qquad dx_i = +dx^i\,. \tag{2.10}$$
The upper and lower index notation automatically keeps track of the minus signs associated with the
temporal component. The indefiniteness of the metric is automatically incorporated in the notation!
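The index gymnastics of Eqs. (2.8)-(2.10) can be mimicked in a few lines. The sketch below stores the metric as a nested list and lowers and raises the index of a sample displacement; the function names and the sample components are illustrative assumptions.

```python
# Minkowski metric with the spacelike convention diag(-1, 1, 1, 1).
# Its inverse has the same numerical entries.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def lower(v_up):
    """dx_mu = eta_{mu nu} dx^nu."""
    return [sum(ETA[mu][nu] * v_up[nu] for nu in range(4)) for mu in range(4)]

def raise_index(v_down):
    """dx^mu = eta^{mu nu} dx_nu (eta^{mu nu} is numerically equal to eta_{mu nu})."""
    return [sum(ETA[mu][nu] * v_down[nu] for nu in range(4)) for mu in range(4)]

def contract(v_down, w_up):
    """Scalar v_mu w^mu, with the sum over mu written out explicitly."""
    return sum(v_down[mu] * w_up[mu] for mu in range(4))

dx_up = [2, 1, 0, 3]          # sample contravariant components (dt, dx, dy, dz)
dx_down = lower(dx_up)        # only the temporal component flips sign
ds2 = contract(dx_down, dx_up)

print(dx_down)                # [-2, 1, 0, 3]
print(raise_index(dx_down))   # [2, 1, 0, 3]
print(ds2)                    # 6
```

The minus sign of the temporal direction never has to be written by hand: it enters only through the metric used to lower the index.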
In some old-fashioned books and in ’t Hooft’s lecture notes you will find a fourth coordinate
$x^4 = it$, instead of the coordinate $x^0 = t$ appearing before. Written in terms of $x^4$ the Minkowski
spacetime has the appearance of a positive-definite 4-dimensional Euclidean space
$$\mathbf{e}_\mu \cdot \mathbf{e}_\nu = \delta_{\mu\nu} \tag{2.11}$$
and there is no difference between lower and upper indices. This notation is however confusing
since it hides the non-positive-definite character of the metric.
The square of the infinitesimal distance between two events in Minkowskian spacetime is given by
where $\eta_{\mu\nu}$ is the Minkowski metric and $dX^2 \equiv dx^2 + dy^2 + dz^2$ denotes the spatial interval.
Note that we have arbitrarily chosen the spacelike convention $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ for the metric
signature, which keeps intact the notation used for Cartesian tensors in Euclidean spacetime.
Some books use a different timelike convention for the signature of the metric, taking $\eta_{\mu\nu} =
\mathrm{diag}(1, -1, -1, -1)$. Although the physics is independent of the convention used, the signs
appearing in the formulas in those books may differ from those in the expressions presented
here. For instance, using the convention $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, Eq. (2.40) would change to
$$p^\mu p_\mu = m^2\,. \tag{2.13}$$
Note however that in both cases you recover $E^2 = p^2 + m^2$ after expanding the expression into
the different components.
Note that, contrary to the Newtonian case, the metric $\eta_{\mu\nu}$ is not positive-definite. Given the
Lorentzian signature (− + + +), the interval (2.12) can be positive, zero, or negative:
• If $ds^2 = 0$, $|dX/dt| = 1$ and the interval corresponds to the trajectory of a light ray. This interval
is called a null or lightlike interval. The set of all lightlike worldlines leaving or arriving at a given
point $x^\mu$ spans the future or past lightcone of the event. There is a lightcone associated with each
point in spacetime.
2.3 Minkowski spacetime isometry group 18
[Figure: lightcone structure of an event, showing the future and past lightcones, the regions of timelike and spacelike separation, and the worldlines of a massive and a massless particle.]
• If $ds^2 < 0$ the interval is said to be timelike. It corresponds to the worldline of a particle with
nonzero rest mass moving with a velocity smaller than that of light, $|dX/dt| < 1$. Two events separated
by such an interval are both inside the lightcone and can be in causal contact. There will exist
a frame in which the two events happen at the same position but at different times.
• If $ds^2 > 0$ the interval is termed spacelike. There will exist a frame in which the two events
happen at the same time but at different places, without any causal relation between them.
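The three cases above translate directly into a small classifier. This is a minimal sketch in natural units (c = 1) with the signature (− + + +) used in these notes; the function names and the tolerance used to decide when an interval counts as null are assumptions.

```python
def interval_squared(dt, dx, dy, dz):
    """ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 in natural units (c = 1)."""
    return -dt * dt + dx * dx + dy * dy + dz * dz

def classify(dt, dx, dy, dz, tol=1e-12):
    """Return 'timelike', 'null' or 'spacelike' for the given separation."""
    ds2 = interval_squared(dt, dx, dy, dz)
    if abs(ds2) <= tol:
        return "null"
    return "timelike" if ds2 < 0 else "spacelike"

print(classify(1.0, 0.0, 0.0, 0.0))   # timelike: same place, different times
print(classify(1.0, 1.0, 0.0, 0.0))   # null: on the lightcone
print(classify(0.0, 1.0, 0.0, 0.0))   # spacelike: same time, different places
```

With the opposite (timelike) signature convention the two inequalities would simply swap roles.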
$$\bar{x}^\mu = \Lambda^\mu{}_\nu\, x^\nu\,, \tag{2.17}$$
with Λ a 4 × 4 matrix, independent of the coordinates. The first (upper) index in $\Lambda^\mu{}_\nu$ labels rows,
while the second (lower) one labels columns.
7 Note that this is basically due to the fact that we are dealing with constant metrics.
In order to preserve the line element (2.12) the constant matrices $\Lambda^\mu{}_\nu$ are required to satisfy the
pseudo-orthogonality condition
$$\eta_{\mu\nu} = \eta_{\rho\sigma}\, \Lambda^\rho{}_\mu \Lambda^\sigma{}_\nu\,, \tag{2.18}$$
which, in matrix notation, becomes
$$\eta = \Lambda^T \eta \Lambda\,, \tag{2.19}$$
with $^T$ denoting the matrix transpose. Eq. (2.18) is the relativistic analogue of the orthogonality condition
(1.19). The determinant of the Λ matrices is also ±1. As we did in the previous chapter, we will not
consider the full Lorentz group8 (which is neither connected nor compact)
$$O(3, 1) = L_+ \cup P L_+ \cup T L_+ \cup P T L_+\,, \tag{2.20}$$
with P and T the parity $P^\mu{}_\nu = \mathrm{diag}(+1, -I)$ and time reversal $T^\mu{}_\nu = \mathrm{diag}(-1, +I)$ operations, where I is
the 3 × 3 identity. We will restrict ourselves to the continuous Lorentz transformations connected with the identity (the
proper Lorentz group)
with S denoting special or reflection-free. These transformations are the relativistic analog of proper
rotations in Euclidean spacetime.
Exercise
Verify that the restricted set of Lorentz transformations (2.21) forms a group :
• Closure: The product of any two Lorentz transformations is another Lorentz transforma-
tion.
• There is an identity transformation.
• Every Lorentz transformation has an inverse.
The fact that η 2 = I4 , with I4 the identity matrix, allows us to easily compute the inverse Lorentz
transformation
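$$\eta^2 = \eta \Lambda^T \eta \Lambda = \left( \eta \Lambda^T \eta \right) \Lambda = I_4 \quad\rightarrow\quad \Lambda^{-1} = \eta \Lambda^T \eta\,, \tag{2.22}$$
which, writing explicitly the components, becomes
$$\left( \Lambda^{-1} \right)^\mu{}_\nu = \eta^{\mu\lambda}\, \Lambda^\rho{}_\lambda\, \eta_{\rho\nu} = \Lambda_\nu{}^\mu\,. \tag{2.23}$$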
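Both the pseudo-orthogonality condition (2.19) and the inverse formula (2.22) are easy to test numerically. The sketch below does so for a boost along x written in terms of hyperbolic functions of a rapidity parameter; the rapidity value 0.8 and the helper names are arbitrary illustrative choices.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def boost_x(rapidity):
    """Boost along x with cosh/sinh entries in the (t, x) block."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, -s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(4) for j in range(4))

L = boost_x(0.8)

# Pseudo-orthogonality: Lambda^T eta Lambda = eta.
preserves_metric = close(matmul(transpose(L), matmul(ETA, L)), ETA)

# Inverse from the metric: Lambda^{-1} = eta Lambda^T eta.
L_inv = matmul(ETA, matmul(transpose(L), ETA))
is_inverse = close(matmul(L_inv, L), IDENTITY)

print(preserves_metric, is_inverse)
```

For a pure boost the inverse obtained this way is the boost with the opposite rapidity, as one would expect physically.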
8 Note that now, not only the determinant of Λ, but also the element $\Lambda^0{}_0$, plays a special role in the splitting (2.20).
How many Lorentz transformations are there? Each Lorentz transformation is represented by a 4×4
matrix, which makes a total of 16 components. The pseudo-orthogonality condition (2.18) however imposes
some constraints. Since taking the transpose of that equation leaves it unchanged, its
independent components are just the diagonal elements plus half of the off-diagonal elements, i.e.
4 + 6 = 10 constraints. We are left therefore with 16 − 10 = 6 independent Lorentz transformations.
There are two different kinds of
homogeneous Lorentz transformations. The most obvious ones are spatial rotations
$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 \\ 0 & R^i{}_j \end{pmatrix}\,, \tag{2.24}$$
where $R^i{}_j$ is a 3 × 3 orthogonal matrix, $\delta_{kl} = \delta_{ij} R^i{}_k R^j{}_l$, with i, j running only over spatial directions.
There are three independent rotation matrices, one per spatial direction. For instance, a rotation of
angle θ around the z-axis will take the form
$$R^i{}_j = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\,. \tag{2.25}$$
The difference with Euclidean transformations arises when considering the so-called boosts, which mix
the spatial and temporal components. There are three of them, each one associated with the mixing of
a particular spatial component with time. As an example consider a boost of rapidity $\eta = \tanh^{-1} v$
along the x direction
$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cosh\eta & -\sinh\eta & 0 & 0 \\ -\sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\,, \tag{2.26}$$
where we have defined the parameter $\gamma = 1/\sqrt{1 - v^2}$, with v the 3-velocity and β = v in natural units.
After the boost, the temporal and spatial coordinates are linear and homogeneous combinations of the
spatial and temporal coordinates in the old frame.
Exercise
Verify that Eq. (2.26) gives rise to the standard Lorentz transformation
To do so, assume $\Lambda^\mu{}_\nu$ to be the transformation from the rest frame of a given inertial
observer to the frame of a second inertial observer moving with speed v along the x axis and
determine the relation between that velocity and η.
The convenience of using the rapidity parameter η instead of the velocity v resides in the fact that
η combines additively. In fact, if we consider two consecutive boosts in the same direction, we have
Λ(η1 )Λ(η2 ) = Λ(η1 + η2 ) . (2.28)
Exercise
Consider the composition of 2 boosts with velocities $v_1$ and $v_2$ along the x direction. Show that
$\Lambda_1 \Lambda_2$ gives rise to a boost with 3-velocity
$$v = \frac{v_1 + v_2}{1 + v_1 v_2}\,. \tag{2.29}$$
What happens in the limit $v_1, v_2 \ll 1$? Generalize the previous result to an arbitrary direction. Is
the general result symmetric under the interchange of the two velocities? Note that all the
expressions are written in natural units.
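The additivity of rapidities (2.28) and the velocity addition law (2.29) can be cross-checked numerically. The sketch below keeps only the (t, x) block of the boost matrix, works in natural units, and uses the sample velocities 0.6 and 0.7; these values and the helper names are assumptions made for illustration.

```python
import math

def boost_tx(rapidity):
    """(t, x) block of a boost along x, written with the rapidity."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, -s], [-s, c]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close2(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

v1, v2 = 0.6, 0.7                      # sample velocities, natural units
eta1, eta2 = math.atanh(v1), math.atanh(v2)

# Two consecutive boosts equal a single boost with the summed rapidity.
composed = matmul2(boost_tx(eta1), boost_tx(eta2))
rapidities_add = close2(composed, boost_tx(eta1 + eta2))

# The velocity of the composed boost follows the relativistic addition law.
v_from_rapidity = math.tanh(eta1 + eta2)
v_from_formula = (v1 + v2) / (1.0 + v1 * v2)

print(rapidities_add, v_from_rapidity, v_from_formula)
```

The composed velocity stays below 1 even though the naive sum 0.6 + 0.7 exceeds it, while for small velocities the denominator tends to 1 and the Galilean sum is recovered.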
Figure 2.2: A boost transformation
and the property (2.28) closely resemble those of spatial rotations (2.25). The main difference is
the replacement of trigonometric functions by their hyperbolic analogues, reflecting the relative sign of the
temporal direction with respect to the spatial directions. Note however two important differences
between boosts and ordinary rotations:
• The rotation parameter θ in Eq. (2.25) runs between 0 and 2π, with both points included. The
rapidity parameter η is non-compact and can take any value in R.
• The boost matrix (2.26) is symmetric, which is not the case for ordinary rotations, cf. Eq.
(2.25).
Although we should not take the analogy between rotations and boosts too seriously, it is instructive
to look at the action of a Lorentz transformation on a spacetime diagram. As shown in Fig. 2.2, a
Lorentz boost rotates time and space by the same hyperbolic angle $\eta = \tanh^{-1} v$, but in opposite directions! In
Special Relativity the simultaneity of two events depends on the observer; the hyperplanes of constant
coordinate time do not have an invariant meaning. Note however that the light cone (i.e. the dashed
line at 45 degrees in the diagram) is invariant under Lorentz transformations.
Exercise
Use the diagram Fig.2.2 to derive the well-known effects of space contraction
$$\bar{L} = L \sqrt{1 - v^2} \quad\rightarrow\quad \bar{L} < L\,, \tag{2.31}$$
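Lorentz transformations: $\partial \bar{x}^\mu / \partial x^\nu \equiv \Lambda^\mu{}_\nu$ are constants!

Scalar                          $\bar{\phi} = \phi$
Contravariant vector            $\bar{V}^\mu = \frac{\partial \bar{x}^\mu}{\partial x^\nu} V^\nu$
Covariant vector                $\bar{V}_\mu = \frac{\partial x^\nu}{\partial \bar{x}^\mu} V_\nu$
Contravariant rank-2 tensor     $\bar{T}^{\mu\nu} = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial \bar{x}^\nu}{\partial x^\sigma} T^{\rho\sigma}$
Covariant rank-2 tensor         $\bar{T}_{\mu\nu} = \frac{\partial x^\rho}{\partial \bar{x}^\mu} \frac{\partial x^\sigma}{\partial \bar{x}^\nu} T_{\rho\sigma}$
Mixed rank-2 tensor             $\bar{T}^\mu{}_\nu = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial \bar{x}^\nu} T^\rho{}_\sigma$

Table 2.1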
Exercise
How does the volume element $d^4x = dx^0\, dx^1\, dx^2\, dx^3$ transform under Lorentz transformations?
replacement of latin indices by greek indices, and the use of the Minkowski metric, instead of the Euclidean one, for
lowering and raising indices.
10 Note that the proper time is not a useful parametrization for the worldline of massless particles, such as photons,
since these particles move on the light cone and can travel any distance in zero proper time, $d\tau^2 = -ds^2 = 0$.
2.5 Covariance and Relativistic Mechanics 23
The connection between the proper time and the measurement made in an inertial reference frame
with coordinate time interval dt is given by
\gamma \equiv \frac{dt}{d\tau} = \left(1 - v^2\right)^{-1/2} . (2.34)
Note that γ is a growing function of the magnitude of the 3-velocity v^i = dx^i/dt and it is never
smaller than one (γ = 1 only for a particle at rest). The proper time goes by at a slower rate than
the coordinate time t.
4-velocity
Given τ, and in clear analogy with the 3-dimensional case, the 4-velocity u^µ along the trajectory is
given by the 4-vector
u^\mu \equiv \frac{dx^\mu}{d\tau} , (2.35)
which is tangent to the worldline of the particle and automatically normalized
\eta_{\mu\nu} u^\mu u^\nu = u^\mu u_\mu = -1 . (2.36)
\bar{u}^\mu = \Lambda^\mu{}_\nu u^\nu . (2.38)
4-momentum
Since the mass m of the particle is a scalar under Lorentz transformations, the 4-momentum
p^\mu = m u^\mu (2.39)
is a Lorentz 4-vector with components p^µ = (E, p^i)^T = (mγ, mγ v^i)^T.
In the instantaneous rest frame of the particle, p^µ = (m, 0)^T. This can be used to simplify many
computations. We can compute things in this particular frame and then re-express the result
in a form valid in any other inertial frame by appealing to covariance.
Using the normalization condition for the 4-velocities (2.36), the normalization condition for the
4-momentum becomes
p^\mu p_\mu = -m^2 , (2.40)
which is nothing else than the well-known energy-momentum relation E = \sqrt{p^2 + m^2} written in a
covariant way11 . In the Newtonian limit (|p^i| ≪ m) this relation becomes the familiar expression of
the Newtonian theory together with the energy equivalent of the mass mc^2, i.e. E ≃ m + p^2/(2m). Note
that the 4-momentum remains well defined even for massless particles, where it has zero square norm
and becomes lightlike, p^µ p_µ = 0. We will take this as a definition of a massless classical particle. The
indefiniteness of the Minkowski metric allows for non-zero values of the temporal and spatial parts as long
as they cancel out in p^µ p_µ. In particular, we can always find a frame in which p^µ = (E, 0, 0, E)^T.
11 Note that p is the three momentum p^i.
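As a quick numerical illustration of the mass-shell condition (2.40) and of its Newtonian limit, here is a minimal sketch in units with c = 1. The function names four_momentum and minkowski_square, as well as the numerical values of the mass and velocities, are hypothetical choices made for this example.

```python
import math

def four_momentum(m, v):
    """p^mu = m * gamma * (1, v^i) for mass m and 3-velocity v, in units with c = 1."""
    speed_squared = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_squared)
    return [m * gamma] + [m * gamma * vi for vi in v]

def minkowski_square(p):
    """p^mu p_mu with signature (-, +, +, +)."""
    return -p[0] ** 2 + sum(pi ** 2 for pi in p[1:])

m = 2.0
p = four_momentum(m, (0.3, 0.0, 0.4))
print("p.p + m^2 =", minkowski_square(p) + m ** 2)      # vanishes up to rounding

# Newtonian limit: for |p^i| << m the energy approaches m + p^2 / (2m).
p_slow = four_momentum(m, (1.0e-3, 0.0, 0.0))
p3_squared = sum(pi ** 2 for pi in p_slow[1:])
print("E - (m + p^2/2m) =", p_slow[0] - (m + p3_squared / (2 * m)))

# A lightlike 4-momentum of the form (E, 0, 0, E) has zero square norm.
photon = [5.0, 0.0, 0.0, 5.0]
print("photon p.p =", minkowski_square(photon))
```

The first check does not depend on the chosen velocity, which is the content of the statement that p^µ p_µ is a Lorentz scalar.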
4-acceleration
It is important to remark that Special Relativity, like Newtonian mechanics, is concerned with the
relation between inertial observers and not with the behavior of the objects that they are studying.
In particular, the observed objects can be accelerating. The 4-acceleration in Minkowski spacetime
can be defined as12
a^\mu = \lim_{\epsilon \to 0} \frac{u^\mu(\tau + \epsilon) - u^\mu(\tau)}{\epsilon} \;\rightarrow\; a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . (2.42)
The 4-acceleration (2.42) transforms in the proper way under Lorentz transformations since the
transformation is linear and Λ^µ_ν depends only on the relative velocity between the two frames. This
allows Λ^µ_ν to pass straight through the derivative d^2/dτ^2. The acceleration a^µ is a spacelike 4-vector
Energy-momentum conservation
The objects defined above allow us to easily generalize Newton’s second law, which becomes
f^\mu = \frac{dp^\mu}{d\tau} = m a^\mu . (2.45)
Note that Eq. (2.45) is a tensorial identity, which maintains its form under Lorentz transformations
and automatically satisfies Einstein’s principle of Relativity. The explicit expression of f^µ depends on
the considered interaction (cf. the first exercise in Section 2.7). The components of the 4-vector f^µ
in Eq. (2.45) are proportional to the Newtonian force F^i and to the work done by F^i per unit time,
i.e. f^µ = γ (v_i F^i, F^i)^T. Contrary to what happens in Newtonian physics, energy and momentum
conservation laws are not independent. The conservation laws of Newtonian mechanics in a given
collision between particles are replaced by a conservation law for the total 4-momentum13
\sum_{\rm in} p^\mu_{\rm in} = \sum_{\rm out} p^\mu_{\rm out} , (2.46)
12 You are probably wondering why I am making such a mess writing out the explicit definition in (2.42) instead of
directly writing
a^\mu \equiv \frac{du^\mu}{d\tau} = \frac{d^2 x^\mu}{d\tau^2} . (2.41)
The point I want you to notice is that the two vectors u^µ(τ + ε) and u^µ(τ) are located at different points in spacetime.
What we are really doing when computing the acceleration in Special Relativity is trivially moving the two vectors
to the same point in spacetime before subtracting them. This trivial operation of moving a vector from one point to
another will turn out to be not so trivial in General Relativity. As you will see, we will need to introduce some extra
machinery in order to do that. . . but let’s move one step at a time. . .
13 The interaction is assumed to be a contact interaction. Particles are free away from the interaction point.
2.6 Relativistic Lagrangian for free particles 25
with the subscripts in and out denoting the incoming and outgoing particles. Note that the conserva-
tion law is Lorentz covariant and reduces to the Newtonian momentum and energy conservation for
small velocities.
Exercise
Show that a photon cannot spontaneously decay into an electron-positron pair.
\frac{dp^\mu}{d\tau} = 0 , (2.47)
can also be derived from a Lagrangian formulation where the role of the generalized coordinates
is played by the space-time coordinates x^µ and the classical time t is replaced by an appropriate
parameter σ. The simplest guess for the relativistic action would be a naive generalization of the
Newtonian action, namely
S = \frac{1}{2} m \int d\tau \, \eta_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} . (2.48)
Note however that the previous expression is not invariant under reparametrizations of the path14
τ → f(τ). The dynamics of the particle seems to depend on the “internal coordinate” τ used in the
description of the curve x^µ(τ). Moreover, the action (2.48) does not contain any information about
the lightcone. On top of that, it does not have a smooth massless limit either.
To solve these problems, we will substitute the proper time by an arbitrary parameter σ and
introduce a non-dynamical function15 e(σ), the so-called einbein. This quantity will be treated as an
additional generalized coordinate during the intermediate computations and fixed to a particular value
only at the end. To clarify the construction, we proceed in several steps. We start by replacing the
problematic mass appearing in the action (2.48) by e^{-1}(σ). This gives rise to the following structure
S \sim \frac{1}{2} \int d\sigma \, \frac{1}{e(\sigma)} \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} . (2.49)
In order for the previous action to be invariant under reparametrizations of the path16 σ → f(σ),
the einbein e(σ) must be chosen to transform in the proper way. The transformation rule can be
determined by inspection: the product e(σ)dσ must remain invariant. In other words, the infinitesimal
displacement dσ and the einbein must transform in opposite ways
d\bar{\sigma} = \dot{f}(\sigma) d\sigma , \qquad \bar{e}(\bar{\sigma}) = \left[\dot{f}(\sigma)\right]^{-1} e(\sigma) . (2.50)
14 The reparametrization invariance of the action should be understood as a gauge symmetry: a redundancy of the
With these transformations at hand, we proceed now to reintroduce the mass parameter m in the
action (2.49). The form of the new term is essentially determined by pure dimensional arguments,
reparametrization invariance and the massless limit. In order for the new piece to be reparametrization
invariant, the integration measure dσ must come together with a factor e(σ). This gives a term
dσ e(σ) with dimension [E]^{-2}, which must be compensated17 by something with dimension [E]^2 and
proportional to m. There you are: the new term is dσ e(σ) m^2. The resulting action is the so-called
einbein action
S = \frac{1}{2} \int d\sigma \left[ \frac{1}{e(\sigma)} \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} - m^2 e(\sigma) \right] (2.51)
and gives rise to the following Euler-Lagrange equations for the generalized coordinates x^µ(σ) and e(σ)
The massive or massless character of particles is automatically incorporated in the second equation.
Indeed, choosing e(σ) = 1 and taking the limit m → 0, we obtain the equations of motion for a free
massless particle
\frac{d^2 x^\mu}{d\sigma^2} = 0 , \qquad \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = 0 . (2.53)
On the other hand, the equations for massive particles can be obtained by choosing e(σ) = 1/m and
using the proper time as affine parameter (σ = τ)
m \frac{d^2 x^\mu}{d\tau^2} = 0 , \qquad \eta_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = -1 . (2.54)
These kinds of choices, in which e(σ) = constant, are called affine and restrict the function f(σ) to the
form
\dot{f} = 1 \;\rightarrow\; f(\sigma) = \sigma + \text{constant} . (2.55)
Exercise
Consider the massive case. Show that the action (2.51) is equivalent to the geometrical action
S = -m \int d\tau . (2.56)
∇ × E + ∂t B = 0, (2.57)
∇·B = 0, (2.58)
∇ · E = ρ, (2.59)
∇ × B − ∂t E = J , (2.60)
with E and B the electric and magnetic fields, J the current density and ρ the charge density. They
are 8 coupled linear differential equations, in which the boundary conditions are usually taken to be
such that for infinite systems the fields E and B go to zero at infinity. Note also the symmetry E ↔ B
in the absence of sources.
17 Recall that, in natural units, the action is dimensionless.
18 Note that they are written using the Heaviside-Lorentz convention, in which no 4π factors appear.
2.7 Maxwell’s equations 27
The homogeneous Maxwell’s equations (2.57) and (2.58) can be solved by introducing the so-called
electromagnetic potentials: a scalar potential ϕ and a vector potential19 A satisfying
E = −∇ϕ − ∂t A , B = ∇ × A. (2.61)
Given the electromagnetic potentials ϕ and A in Eq. (2.61) the electromagnetic fields E and B are
completely determined, but not vice versa. A and ϕ are gauge fields (see the exercise below).
Familiarity with Maxwell’s equations soon leads to the appreciation of the unified nature of the
electromagnetic field and its relativistic nature. Although in their 19th century version Maxwell’s
equations (2.57)-(2.60) do not seem at all invariant under Lorentz transformations, they can be written
in a more compact and elegant way, that makes explicit their covariant form. Introducing the 4-vector
potential (gauge field) Aµ ≡ (ϕ, A) and the charge-current density 4-vector J µ ≡ (ρ, J), we obtain20
\partial_\nu F^{\mu\nu} = J^\mu , (2.63)
\epsilon_{\mu\nu\rho\sigma} \partial^\rho F^{\mu\nu} = 0 , (2.64)
where the antisymmetric quantity F^{µν} ≡ ∂^µ A^ν − ∂^ν A^µ is the gauge invariant (Faraday) field strength
with components
F^{0i} = E^i , \qquad F^{ij} = \epsilon^{ijk} B_k , (2.65)
and the different ε are totally antisymmetric tensors in the corresponding dimension21 n
\epsilon^{\mu_1 \mu_2 \ldots \mu_n} = \begin{cases} +1 , & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an even permutation of } 0 1 \ldots (n-1) , \\ -1 , & \text{if } \mu_1 \mu_2 \ldots \mu_n \text{ is an odd permutation of } 0 1 \ldots (n-1) , \\ 0 , & \text{otherwise} . \end{cases} (2.66)
• The covariant components ε_{µνρσ} of the permutation tensor ε^{µνρσ} in Minkowski spacetime
are defined by lowering each of the indices with the metric tensor η_{µν}
Exercise
Just for those of you who know Classical Field Theory. The previous equations of motion can be
obtained from the following Lagrangian density
L = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + J^\mu A_\mu , (2.70)
where J^µ is treated as an external source.
• Check that with the Lagrangian density (2.70), the action S = \int L \, d^4x is invariant under
Lorentz transformations
\bar{A}^\mu = \Lambda^\mu{}_\nu A^\nu , \qquad \bar{F}^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma F^{\rho\sigma} , \qquad \bar{J}^\mu = \Lambda^\mu{}_\nu J^\nu , (2.71)
\partial_\mu J^\mu = \partial_\mu \partial_\nu F^{\mu\nu} = 0 , (2.73)
where in the last step we used the fact that F^{µν} is an antisymmetric tensor, i.e. F^{µν} = −F^{νµ}.
Eq. (2.73) is nothing else than the continuity equation
\frac{\partial\rho}{\partial t} + \nabla \cdot J = 0 . (2.74)
The conservation of total charge Q(t)
\dot{Q}(t) = \int_{R^3} \dot{\rho}(t, x^i) \, d^3x = -\int_{R^3} \partial_k J^k(t, x^i) \, d^3x = 0 , (2.75)
is imposed by the field equations. If the charge is not conserved there is no solution!
Exercise
Prove that the product S_{µν} A^{µν} of a symmetric tensor S_{µν} and an antisymmetric tensor A^{µν} is
zero.
CHAPTER 3
“THE HAPPIEST THOUGHT” OF EINSTEIN’S LIFE
A. Einstein (1920)
\nabla^2 \Phi(t, x^i) = 4\pi G \rho(t, x^i)
is a linear partial differential equation of 2nd order which does not contain any explicit time depen-
dence. The gravitational potential responds instantaneously to the changes in the matter distribution!
This was awkward even for Newton:
That one body may act upon another at a distance through a vacuum, without the mediation
of anything else, by and through which their action and force may be conveyed from one to
another, is to me so great an absurdity, that I believe no man, who has in philosophical matters
a competent faculty of thinking, can ever fall into it (Principia, p. 643, Ref. 395).
and it is in clear contradiction with Special Relativity. The instinctive reaction of many physicists when
facing this consistency problem was to apply the recipes used when writing the covariant version of
Maxwell equations (promote the operator ∇^2 to □, introduce some kind of vector potential A^µ for
the gravitational field, generalize the Newtonian force to some combination of fields and 4-velocities,
get retarded potentials, etc.). None of the attempts was successful1 . Einstein eventually concluded
that a new approach to the problem must be taken. The purpose of this chapter is to present
Einstein’s new look on gravity. As you will see, the new look turned out to be a really old look that
went back to Galileo himself.
1 We will be back to this point in the future.
3.1 Inertial and gravitational masses 30
f^i = m_I a^i , (3.1)
independently of the origin of the force. This mass m_I measures the resistance of an object to
accelerations. On the other hand, we have the gravitational mass, which measures the strength of
gravity (in the same way that the electric charge measures the strength of the electric force). The
force exerted on a gravitational mass m_G close to the surface of the Earth is given by
f^i = m_G g^i . (3.2)
Comparing the expressions (3.1) and (3.2), we conclude that the acceleration of gravity should depend
a priori on the ratio of the gravitational mass to the inertial mass,
a^i = \frac{m_G}{m_I} g^i . (3.3)
Nevertheless, as verified by Galileo’s ramp experiments2 , all bodies fall with the same acceleration in
a gravitational field
a^i = g^i . (3.4)
This observation implies the equality of the quantity controlling inertia (m_I) and that measuring the
strength of gravity (m_G)
m_I = m_G , (3.5)
for all materials, independently of their composition.
Exercise
Consider the magnitude of the electrostatic interaction at a distance r between two particles of
charges q_1, q_2 and inertial masses m_{1I}, m_{2I}
F_e = \frac{q_1 q_2}{4\pi r^2} . (3.6)
How does the magnitude of the acceleration felt by particle 2 depend on its properties?
2 Yes, ramps and a water clock. The image of Galileo dropping balls from the Leaning Tower of Pisa is just a widespread
Italian legend.
The results of Galileo’s experiments were confirmed, among others, by Newton himself and by the
Baron Eötvös de Vásárosnamény, who used respectively pendula and a torsion balance with different
materials3 .
Exercise
How does the oscillation period of a simple pendulum depend on the ratio mI /mG ?
The difference in the acceleration experienced by the two bodies is encoded in the so-called Eötvös
parameter
\eta = \frac{\Delta a}{a} = \frac{2|a_1 - a_2|}{|a_1 + a_2|} = \sum_I \eta^I \left( \frac{E^I_1}{m_{I,1} c^2} - \frac{E^I_2}{m_{I,2} c^2} \right) , (3.7)
where in the last step we have made explicit the contribution of the various energy forms E^I to the
difference between inertial and gravitational masses
m_G - m_I \equiv \sum_I \eta^I \frac{E^I}{c^2} . (3.8)
The experimental results are summarized in the following table.
|m_I − m_G| / m_I
3 Eötvös located two test objects on the opposite ends of a dumbbell suspended from a torsion fiber. If the inertial
and gravitational masses of those objects were different the centripetal effects associated with the rotation of the Earth
would give rise to a torque (everywhere but at the poles) that could be measured with a delicate torsion balance pointing
west-east. For a detailed description of Eötvös’ original experiments see for instance Weinberg’s book.
3.2 The Equivalence Principle 32
Note that the displayed cases do not include the contribution of the gravitational self-interaction
of the masses, which, for laboratory size experiments, is extremely small. Its contribution can
however be tested via the so-called Nordtvedt effect. If the gravitational self-energy did not follow Galileo’s
equivalence principle, the Earth and the Moon would fall at different rates towards the Sun, elongating
the orbit of the Moon in the direction of the Sun. As shown by Lunar Laser Ranging experiments (LLR),
which use reflectors that were placed on the surface of the Moon during the Apollo 11 mission, cf.
Fig. 3.1, the gravitational self-energy behaves as any other energy form, in perfect agreement with
Eötvös’ results. Indeed, LLR experiments provide the most accurate tests of the equivalence between
inertial and gravitational masses. The constraints are really impressive
\eta = \frac{2|a_E - a_M|}{(a_E + a_M)} = (-1 \pm 1.4) \times 10^{-13} . (3.9)
Not only matter but also antimatter seems to follow Galileo’s result. Important constraints of the
order 10^{-9} were obtained by the CPLEAR collaboration from neutral kaon systems4 .
The gravitational interaction resembles also the pseudo forces resulting from the use of non-inertial
reference frames. For example, if there is a frame of reference rotating with angular velocity ω with
respect to an inertial reference frame, all bodies appear to accelerate spontaneously with the same
acceleration in that rotating frame. It seems that there is a universal force acting on all bodies with
a magnitude proportional to their inertial masses
Accelerated frames and local gravitational forces appear to be intimately related. Both of them act
in the same way on all bodies, are proportional to mass and can be transformed away by changing to
a suitable reference frame; a local free falling frame in the case of gravity.
Exercise
Einstein’s toy: A version of the following device was constructed as a birthday present for
Albert Einstein. The device consists of a hollow broomstick with a cup at the top, together with
a metal ball and an elastic string. When the broomstick is held vertical, the ball can rest in
the cup. The ball is attached to one end of the elastic string, which passes through a hole in
the bottom of the cup, and down the hollow centre of the broomstick to the bottom, where its
other end is secured. You hold the broomstick vertical, with your hand at the bottom, the cup
at the top, and with the ball out of the cup, suspended on its elastic string. The tension in the
string is not enough to draw the ball back into the cup. The problem is to find an elegant way
to get the ball back into the cup. (Inelegant ways are: using your hands or shaking the stick up
and down).
Note the slight change of notation below. From now on, we will reserve the first letters of the
Greek alphabet α, β, . . . for indices associated to inertial reference frames. Intermediate letters
of the Greek alphabet µ, ν, . . . will stand for general (non-inertial, accelerated) reference frames.
Consider the movement of the accelerated rocket from the point of view of an inertial observer {ξ α },
momentarily at rest with respect to the rocket’s trajectory. The orientation of the coordinate grid is
such that the rocket moves along the ξ 3 -direction. In that instantaneous inertial frame, the rocket is
seen to undergo a constant acceleration g
In order to determine the worldline ξ^α(τ) of the rocket at later times, let us look for a general solution
of the covariant equation u^α u_α = −1. Writing it explicitly
The unknown function f(τ) can be determined by taking the derivative of the last two equations
a^\alpha = \frac{du^\alpha}{d\tau} = \dot{f}(\tau) \left( \sinh f(\tau), 0, 0, \cosh f(\tau) \right) (3.14)
and imposing the covariant condition a^µ a_µ = g^2. We get g^2 = \dot{f}^2, f(τ) = gτ and
The work is almost done. Integrating (3.15) with the initial condition ξ^α(0) = (0, 0, 0, g^{-1}), we get
The “constantly accelerated” rocket describes an equilateral hyperbola with semi-major axis 1/g
(\xi^3)^2 - (\xi^0)^2 = g^{-2} . (3.17)
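The chain of conditions used above can be verified numerically. The sketch below assumes the standard hyperbolic solution, u^α = (cosh gτ, 0, 0, sinh gτ) and ξ^α = g^{-1}(sinh gτ, 0, 0, cosh gτ), which is consistent with Eq. (3.14), with f(τ) = gτ and with the initial condition ξ^α(0) = (0, 0, 0, g^{-1}). The function names and the sample values of g and τ are hypothetical.

```python
import math

def worldline(tau, g):
    """Assumed rocket worldline xi^alpha(tau) = g^-1 (sinh g tau, 0, 0, cosh g tau)."""
    return [math.sinh(g * tau) / g, 0.0, 0.0, math.cosh(g * tau) / g]

def four_velocity(tau, g):
    """u^alpha = d xi^alpha / d tau for the worldline above."""
    return [math.cosh(g * tau), 0.0, 0.0, math.sinh(g * tau)]

def four_acceleration(tau, g):
    """a^alpha = d u^alpha / d tau, i.e. Eq. (3.14) with f(tau) = g tau."""
    return [g * math.sinh(g * tau), 0.0, 0.0, g * math.cosh(g * tau)]

def dot(a, b):
    """Minkowski product with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

g, tau = 2.0, 1.5                      # illustrative values, units with c = 1
xi = worldline(tau, g)
u = four_velocity(tau, g)
a = four_acceleration(tau, g)

print("u.u =", dot(u, u))              # normalization, should be close to -1
print("a.a =", dot(a, a), "g^2 =", g ** 2)
print("u.a =", dot(u, a))              # acceleration is orthogonal to the velocity
print("(xi^3)^2 - (xi^0)^2 =", xi[3] ** 2 - xi[0] ** 2, "g^-2 =", g ** -2)
```

The last line is the hyperbola (3.17): the combination (ξ^3)^2 − (ξ^0)^2 does not depend on τ.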
Let us now look at the problem from the point of view of an accelerated observer sitting in the rocket.
Since the transformation from inertial to accelerated frames is not a Lorentz transformation, we should
expect a change in the Minkowski line element ds^2. The natural coordinates for the accelerated
observer are those adapted to its trajectory. Let’s call them (x^0, x^3) = (η, ρ). The transformation to
this frame takes the form5
In terms of the new coordinates η and ρ, the Minkowski line element ds^2 = −(dξ^0)^2 + (dξ^3)^2 becomes
modified
ds^2 = -\rho^2 d\eta^2 + d\rho^2 \equiv g_{\mu\nu} dx^\mu dx^\nu . (3.19)
The metric g_{µν} = diag(−ρ^2, 1) is now space-time dependent!
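g_{\mu\nu} = \eta_{\alpha\beta} \frac{\partial\xi^\alpha}{\partial x^\mu} \frac{\partial\xi^\beta}{\partial x^\nu} , (3.22)
which generically depends on the spacetime coordinates. The inverse of the new metric is defined
through the relation g^{µν} g_{νλ} = δ^µ_λ and can be easily computed by taking into account the identity
\frac{\partial x^\mu}{\partial\xi^\alpha} \frac{\partial\xi^\beta}{\partial x^\mu} = \delta^\beta{}_\alpha . (3.23)
We get
g^{\mu\nu} = \eta^{\alpha\beta} \frac{\partial x^\mu}{\partial\xi^\alpha} \frac{\partial x^\nu}{\partial\xi^\beta} . (3.24)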
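To see Eq. (3.22) at work, the following sketch evaluates the induced metric numerically for the accelerated observer. It assumes the usual Rindler map ξ^0 = ρ sinh η, ξ^3 = ρ cosh η (the Lorentzian analogue of polar coordinates), and it estimates the Jacobian with central finite differences; the function names, the step size and the sample point are hypothetical choices.

```python
import math

ETA = [[-1.0, 0.0], [0.0, 1.0]]   # Minkowski metric in the (xi^0, xi^3) plane

def to_inertial(x):
    """Assumed Rindler map: (eta, rho) -> (xi^0, xi^3) = (rho sinh eta, rho cosh eta)."""
    eta, rho = x
    return [rho * math.sinh(eta), rho * math.cosh(eta)]

def jacobian(func, x, h=1e-6):
    """J[a][m] = d xi^a / d x^m, estimated with central differences."""
    columns = []
    for m in range(len(x)):
        plus, minus = list(x), list(x)
        plus[m] += h
        minus[m] -= h
        columns.append([(p - q) / (2 * h) for p, q in zip(func(plus), func(minus))])
    return [[columns[m][a] for m in range(len(x))] for a in range(len(columns[0]))]

def induced_metric(func, x):
    """g_{mu nu} = eta_{alpha beta} (d xi^alpha / d x^mu) (d xi^beta / d x^nu), Eq. (3.22)."""
    J = jacobian(func, x)
    n = len(x)
    return [[sum(ETA[a][b] * J[a][m] * J[b][k] for a in range(2) for b in range(2))
             for k in range(n)] for m in range(n)]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, used here for g^{mu nu}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

point = [0.7, 2.0]                 # (eta, rho), illustrative values
g = induced_metric(to_inertial, point)
g_inv = inverse_2x2(g)
print("g_mu_nu =", g)              # numerically close to diag(-rho^2, 1)
print("g^mu^nu =", g_inv)          # numerically close to diag(-1/rho^2, 1)
```

The result reproduces the ρ-dependent metric of Eq. (3.19) up to the finite-difference error, and it does not depend on the chosen value of η.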
Exercise
Prove the similarity transformation (3.24).
5 The change of coordinates is just the Lorentzian analogue of polar coordinates, inspired by the worldline equation
(3.16).
6 In the context of a particle in the gravitational field these frames are called local free falling reference frames.
7 General means completely arbitrary. It can be a curvilinear coordinate system, an accelerated system, a rotating
Note that the reference frame x^µ is not at all privileged. We could perfectly well move now into another
non-inertial reference frame x̄^ρ(ξ^α) in which
Figure 3.4 [diagram labels: ξ^α with η_{αβ}; x^µ(ξ^α) with g_{µν}; x̄^ρ(ξ^α) with ḡ_{ρσ}; x̄^ρ(x^µ)]
Exercise
Show that gµν must be symmetric, i.e. gµν = gνµ .
S = \frac{1}{2} \int d\sigma \left[ e^{-1}(\sigma) g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} - m^2 e(\sigma) \right] . (3.27)
As in the Minkowski case, the action (3.27) is invariant under reparametrizations of the path σ →
σ̄ = f(σ) provided that we let e(σ) transform in such a way that the quantity e(σ)dσ is left invariant.
Note also that the action (3.27) is invariant under general coordinate transformations8 , as can be
easily seen by taking into account the similarity relation (3.26).
The Euler-Lagrange equation ∂L/∂e = 0 for the non-dynamical variable e(σ)
g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} = -m^2 e(\sigma)^2 , (3.28)
automatically incorporates the massive (e(σ) = 1/m) and massless (e(σ) = 1, m → 0) cases we are
interested in. In these two limits, the action (3.27) takes the form
with σ = τ for the massive case. Smassive and Smassless are very similar. Indeed, the computation of
the equations of motion is formally equivalent in both cases9 . Let us denote by a dot the derivative
with respect to σ and forget in what follows about the irrelevant factors m and m/2. The equations
of motion
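\frac{d}{d\sigma} \frac{\partial L}{\partial\dot{x}^\rho} - \frac{\partial L}{\partial x^\rho} = 0 (3.30)
for the generalized coordinates x^ρ can be computed as follows. The simplest part is the variation of
(1/2) g_{µν} ẋ^µ ẋ^ν with respect to x^ρ. All the dependence on the coordinates is hidden in the metric
\frac{\partial L}{\partial x^\rho} = \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\rho} \dot{x}^\mu \dot{x}^\nu . (3.31)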
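The variation with respect to ẋ^ρ is slightly more involved, but can be written in a very
compact way by taking into account the properties
\frac{\partial\dot{x}^\mu}{\partial\dot{x}^\rho} = \delta^\mu{}_\rho , \qquad g_{\mu\nu} \delta^\nu{}_\rho = g_{\mu\rho} , \qquad g_{\mu\nu} = g_{\nu\mu} , (3.32)
together with some simple index relabeling. We get
\frac{\partial}{\partial\dot{x}^\rho} \left( \frac{1}{2} g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu \right) = \frac{1}{2} g_{\mu\nu} \left( \frac{\partial\dot{x}^\mu}{\partial\dot{x}^\rho} \dot{x}^\nu + \dot{x}^\mu \frac{\partial\dot{x}^\nu}{\partial\dot{x}^\rho} \right) = \frac{1}{2} \left( g_{\rho\nu} \dot{x}^\nu + g_{\mu\rho} \dot{x}^\mu \right) = g_{\rho\nu} \dot{x}^\nu . (3.33)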
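Collecting the two pieces, the Euler-Lagrange equations (3.30) become
\frac{d}{d\sigma} \frac{\partial L}{\partial\dot{x}^\rho} - \frac{\partial L}{\partial x^\rho} = \frac{d}{d\sigma} \left( g_{\rho\nu} \dot{x}^\nu \right) - \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\rho} \dot{x}^\mu \dot{x}^\nu
= \frac{\partial g_{\rho\nu}}{\partial x^\sigma} \dot{x}^\sigma \dot{x}^\nu + g_{\rho\nu} \ddot{x}^\nu - \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\rho} \dot{x}^\mu \dot{x}^\nu
= g_{\rho\nu} \ddot{x}^\nu + \dot{x}^\nu \dot{x}^\sigma \left( \frac{\partial g_{\rho\nu}}{\partial x^\sigma} - \frac{1}{2} \frac{\partial g_{\sigma\nu}}{\partial x^\rho} \right)
= g_{\rho\nu} \ddot{x}^\nu + \frac{1}{2} \dot{x}^\nu \dot{x}^\sigma \left( \frac{\partial g_{\rho\nu}}{\partial x^\sigma} + \frac{\partial g_{\rho\sigma}}{\partial x^\nu} - \frac{\partial g_{\sigma\nu}}{\partial x^\rho} \right)
= 0 . (3.34)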
The work is done. Multiplying by the inverse metric and relabeling indices we obtain the equation we
were looking for, the so-called geodesic equation
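\frac{d^2 x^\mu}{d\sigma^2} + \Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\sigma} \frac{dx^\rho}{d\sigma} = 0 , \qquad \Gamma^\mu{}_{\nu\rho} = \frac{1}{2} g^{\mu\sigma} \left( \partial_\rho g_{\sigma\nu} + \partial_\nu g_{\sigma\rho} - \partial_\sigma g_{\nu\rho} \right) . (3.35)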
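The definition of the Christoffel symbols in Eq. (3.35) translates almost literally into code. The sketch below is a hedged illustration rather than a general tool: it is restricted to two dimensions, it estimates the metric derivatives with central finite differences, and it uses the flat plane in polar coordinates as a test case, a textbook example that does not appear in these notes. All function names are hypothetical.

```python
def polar_metric(x):
    """Flat plane in polar coordinates x = (r, theta): ds^2 = dr^2 + r^2 dtheta^2."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    """Inverse metric g^{mu nu} for a 2x2 metric."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def metric_derivatives(metric, x, h=1e-5):
    """dg[s][i][j] = d g_{ij} / d x^s, estimated with central differences."""
    n = len(x)
    dg = []
    for s in range(n):
        plus, minus = list(x), list(x)
        plus[s] += h
        minus[s] -= h
        gp, gm = metric(plus), metric(minus)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    return dg

def christoffel(metric, x):
    """Gamma[mu][nu][rho] = 1/2 g^{mu s} (d_rho g_{s nu} + d_nu g_{s rho} - d_s g_{nu rho})."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = metric_derivatives(metric, x)
    return [[[0.5 * sum(g_inv[mu][s] * (dg[rho][s][nu] + dg[nu][s][rho] - dg[s][nu][rho])
                        for s in range(n))
              for rho in range(n)] for nu in range(n)] for mu in range(n)]

r = 2.0
gamma = christoffel(polar_metric, [r, 0.3])
print("Gamma^r_theta_theta =", gamma[0][1][1])   # analytic value: -r
print("Gamma^theta_r_theta =", gamma[1][0][1])   # analytic value: 1/r
```

Even though the plane is flat, the Christoffel symbols do not vanish in polar coordinates: they only encode the curvilinear character of the coordinate grid.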
Exercise
• Consider a reparametrization σ → f (σ). Show that the geodesic equation (3.35) retains
its form only if f (σ) = aσ + b.
• Compute the geodesic equation associated to the Rindler metric (3.19).
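The geodesic equation is automatically covariant since the Lagrangian from which it was derived was
invariant under general coordinate transformations. The transformation of the so-called Christoffel
symbols Γ^µ_{νλ} is however non-homogeneous
\bar{\Gamma}^{\mu'}{}_{\nu'\rho'} = \Gamma^\mu{}_{\nu\rho} \frac{\partial\bar{x}^{\mu'}}{\partial x^\mu} \frac{\partial x^\nu}{\partial\bar{x}^{\nu'}} \frac{\partial x^\rho}{\partial\bar{x}^{\rho'}} + \frac{\partial\bar{x}^{\mu'}}{\partial x^\mu} \frac{\partial^2 x^\mu}{\partial\bar{x}^{\nu'} \partial\bar{x}^{\rho'}} . (3.36)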
9 The only difference is the presence of a global factor m and a constant term m/2 which do not play any role in the
Exercise
• What form does Eq. (3.37) take in a Cartesian coordinate system?
• Prove the transformation law (3.36) using the fact that the geodesic equation is covariant.
• How many independent components do the Christoffel symbols have in four dimensions?
The Christoffel symbols encode the local aspects of the gravitational interaction as well as the
fictitious forces (centrifugal, Coriolis, etc.)
F^\mu \equiv \frac{d^2 x^\mu}{d\sigma^2} = -\Gamma^\mu{}_{\nu\lambda} \frac{dx^\nu}{d\sigma} \frac{dx^\lambda}{d\sigma} (3.37)
arising when using non-inertial reference frames. Forces of this kind can always be eliminated by
going to an inertial reference frame or to a free-falling frame. Note that this would not be the case if
the Christoffel symbols were tensors.
\left. \frac{d^2 \xi^\alpha}{d\sigma^2} \right|_P = 0 . (3.38)
with the second condition being equivalent to the vanishing of the Christoffel symbols at that
point, i.e. Γµ νλ (P ) = 0.
a The existence of these frames is guaranteed by the so-called Local flatness theorem.
it will be satisfied for all values of σ 10 . We will rediscover this equation in Chapter 5, when dealing
with the concept of parallel transport.
Exercise
Prove the relation (3.40).
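\frac{dx^i}{dt} \ll 1 \;\longrightarrow\; \frac{dx^i}{d\tau} \ll \frac{dt}{d\tau} (3.43)
in a “weak”
g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \qquad |h_{\mu\nu}| \ll 1 (3.44)
and stationary12 gravitational field
\partial_0 g_{\mu\nu} = \partial_0 h_{\mu\nu} = 0 . (3.45)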
The first two conditions (small velocities and weak fields) are quite natural from the point of view
of a non-relativistic description. On the other hand, the stationarity condition (3.45) is just a good
approximation for the particular cases we will be interested in in this section: the gravitational fields
of the Sun and the Earth.
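At first order in the small perturbation h_{µν}, the geodesic equation (3.35) takes the form
\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{00} \left( \frac{dt}{d\tau} \right)^2 = 0 , (3.46)
where the Christoffel symbols Γ^µ_{00} are completely determined by the perturbation h_{µν} around the
Minkowski metric13
\Gamma^\mu{}_{00} = \frac{1}{2} g^{\mu\rho} \left( \frac{\partial g_{0\rho}}{\partial x^0} + \frac{\partial g_{0\rho}}{\partial x^0} - \frac{\partial g_{00}}{\partial x^\rho} \right) = -\frac{1}{2} g^{\mu\rho} \frac{\partial g_{00}}{\partial x^\rho} = -\frac{1}{2} \eta^{\mu\rho} \frac{\partial h_{00}}{\partial x^\rho} . (3.47)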
11 The coordinate xν is then said to be cyclic.
12 Or varying sufficiently slow over the scale probed by the particle.
13 Note that, since we are interested only in first order terms, we can raise and lower indices with the Minkowski
Splitting the spatial and temporal components of Eq. (3.46) and using the stationarity condition
(3.45), we obtain14
\frac{d^2 t}{d\tau^2} = 0 , \qquad \frac{d^2 x^i}{d\tau^2} = \frac{1}{2} c^2 \frac{\partial h_{00}}{\partial x^i} . (3.48)
The first of these two equations allows us to identify the proper time τ with the coordinate time t and
write
\frac{d^2 x^i}{dt^2} = \frac{1}{2} c^2 \frac{\partial h_{00}}{\partial x^i} . (3.49)
The value of the unknown function h_{00} can be determined by comparing Eq. (3.49) with the Newtonian
expression for a particle in a gravitational field
\frac{d^2 x^i}{dt^2} = -\delta^{ij} \frac{\partial\Phi}{\partial x^j} . (3.50)
The first true component of the gravitational metric tensor15 comes directly from Newton’s theory!
h_{00} = -\frac{2\Phi}{c^2} \;\longrightarrow\; g_{00} = -\left( 1 + \frac{2\Phi}{c^2} \right) . (3.51)
Indeed. . . this is the first and the last component that we can expect to get from Newton . . . New-
tonian gravity involves just one scalar function: the gravitational potential Φ, nothing else. This
observation naturally raises the question of how to compute the remaining components of the metric.
Let’s forget about this problem for a while and enjoy our findings. As you will see, we can learn a
lot of new things without knowing the precise form of the other metric components. The correction
to the Minkowski metric is proportional to the so-called gravitational self-energy Φ/c2 . This quantity
can be understood as the ratio of the Newtonian potential energy to the relativistic energy. For an
object of mass M and typical size R we have
\frac{|\Phi|}{c^2} = \frac{G M^2}{R} \cdot \frac{1}{M c^2} = \frac{G M}{R c^2} . (3.52)
14 Note that we have restored the speed of light c for later convenience.
15 Note that we could in principle allow for an extra constant C in Eq. (3.51), i.e. h_{00} = −2Φ/c^2 + C, which should
be fixed by requiring the metric to approach the flat Minkowski metric at infinity. For isolated mass distributions the
gravitational potential Φ vanishes at infinity and therefore C = 0.
3.7 The power of the equivalence principle 41
Some orders of magnitude for Φ can be found in Table 3.1. Note that, even for a white dwarf or a
galaxy, the gravitational self-energy is very small; the weak field approximation used in the derivation
of Eq. (3.51) is justified. The correction to the Minkowski metric is expected to be important only
for very compact objects such as a neutron star or a black hole.
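The orders of magnitude quoted here are easy to reproduce from Eq. (3.52). The sketch below uses rough reference values for the masses and radii (they are assumptions of this example and are not taken from Table 3.1); the 1.4 solar mass, 10 km neutron star in particular is only a typical illustrative choice.

```python
G = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8          # speed of light in m/s

# Rough reference values (mass in kg, radius in m); assumptions of this sketch.
BODIES = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (1.989e30, 6.957e8),
    "neutron star": (1.4 * 1.989e30, 1.0e4),
}

def compactness(mass, radius):
    """Dimensionless ratio |Phi| / c^2 = G M / (R c^2) from Eq. (3.52)."""
    return G * mass / (radius * C ** 2)

for name, (mass, radius) in BODIES.items():
    print(f"{name:>12}: GM/(R c^2) = {compactness(mass, radius):.3e}")
```

For the Earth and the Sun the ratio is tiny, so the weak field expansion around the Minkowski metric is well justified, while for the neutron star it is no longer a small number.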
about the g_{00} element, the large symmetry of the problem severely constrains the form of the metric
to be
ds^2 = -\left( 1 + \frac{2\Phi(r)}{c^2} \right) dt^2 + g_{rr}(r) dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 , (3.54)
with grr (r) an undetermined function of the radial distance, whose explicit form will not be needed in
what follows. In order to disentangle the effect of gravity from other velocity dependent Doppler-like
effects and to make the analysis as clear as possible, we will require the observers to be at rest in a
radial configuration with coordinates r1 and r2 . Imagine the observer at r1 sending pulses of light to
the observer at r2 . The period of emitted pulses is the interval in proper time of the emitter
\Delta\tau_1 = \int \sqrt{-g_{00}(r_1)} \, dt = \sqrt{-g_{00}(r_1)} \int dt = \sqrt{-g_{00}(r_1)} \, \Delta t_1 . (3.55)
On the other hand, the period of received pulses is the interval in proper time of the receiver
\Delta\tau_2 = \int \sqrt{-g_{00}(r_2)} \, dt = \sqrt{-g_{00}(r_2)} \int dt = \sqrt{-g_{00}(r_2)} \, \Delta t_2 . (3.56)
The coordinate time interval elapsed between the emission of two pulses ∆t_1 is equal to the coordinate
time interval elapsed between the reception of two pulses ∆t_2, as can be easily seen by noting that
the coordinate time interval needed to go from r_1 to r_2
ds^2 = g_{00} dt^2 + g_{rr} dr^2 = 0 \;\longrightarrow\; \Delta t = \int_{r_1}^{r_2} \sqrt{\frac{-g_{rr}(r)}{g_{00}(r)}} \, dr (3.57)
is independent of the coordinate time t. Taking the ratio of Eqs. (3.56) and (3.55), we get the first
prediction of the Equivalence Principle
\frac{\Delta\tau_2}{\Delta\tau_1} = \sqrt{\frac{g_{00}(r_2)}{g_{00}(r_1)}} = \sqrt{\frac{1 + 2\Phi(r_2)/c^2}{1 + 2\Phi(r_1)/c^2}} . (3.58)
For weak gravitational fields, the previous expression can be approximated by its binomial expansion17
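To get a feeling for the size of the effect, the sketch below evaluates the exact ratio (3.58) for two clocks at rest in the gravitational field of the Earth and compares it with the leading term of the binomial expansion, (Φ(r_2) − Φ(r_1))/c^2. The altitude of 10 km and the numerical constants are assumptions of this example; the function names are hypothetical.

```python
import math

G = 6.674e-11          # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8            # speed of light in m/s
M_EARTH = 5.972e24     # kg
R_EARTH = 6.371e6      # m

def potential(r):
    """Newtonian potential Phi(r) = -G M / r outside a spherical Earth."""
    return -G * M_EARTH / r

def rate_ratio(r1, r2):
    """Exact ratio of proper-time intervals, Eq. (3.58)."""
    return math.sqrt((1 + 2 * potential(r2) / C ** 2) / (1 + 2 * potential(r1) / C ** 2))

def first_order_shift(r1, r2):
    """Leading term of the weak-field expansion: (Phi(r2) - Phi(r1)) / c^2."""
    return (potential(r2) - potential(r1)) / C ** 2

height = 1.0e4   # hypothetical altitude of the upper clock, in metres
exact = rate_ratio(R_EARTH, R_EARTH + height) - 1.0
approx = first_order_shift(R_EARTH, R_EARTH + height)
print(f"exact shift  : {exact:.4e}")
print(f"first order  : {approx:.4e}")
print(f"gain per day : {approx * 86400 * 1e9:.1f} ns")
```

The shift is positive, so the upper clock runs faster, and the exact and first-order values agree to within the rounding error of double precision arithmetic. Note that the exact expression is evaluated as a difference of numbers very close to one, so only a few significant digits of the shift survive.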
The dilation of time was tested by Hafele and Keating in 1972 using cesium-beam atomic clocks
transported on commercial flights around the Earth and compared on return to standard clocks in
the US Naval Observatory. The net effect on the reading of the on-flight clocks is a combination of
special relativistic effects and gravitational changes in the flow of time. The two contributions act in
an opposite way. Special Relativity tends to decrease the rate of the clock in the plane with respect
to the standard clock on the surface of the Earth18 . On the other hand, gravity tends to speed up
the clock in the plane with respect to the clock in the stronger gravitational field of the Earth. The
experiment was performed twice, once flying towards the east and once flying towards the west. The
results and their comparison with the predictions are summarized in Table 3.2. As you can see, the
agreement between the experiment and the theoretical prediction is notably good.
17 (1 + x)^{1/2} ≃ 1 + x/2.
Figure 3.6: The highs and lows: Rebka and Pound at the top and bottom of the tower.
Exercise
How much older are the theorists on the upper floor of the Cubotron than the experimentalists
on the lower floor at the end of their academic life? Should this effect be taken into account
by the Swiss pension system?
18 This is just a consequence of the well-known time dilation effect in Special Relativity.
The gravitational frequency shift is a test of the Equivalence Principle, not of Einstein’s
theory of gravity in its full form. Note that the spatial part of the metric g_{rr}(r) did not play
any role in the previous developments.
Numerically, the gravitational redshift of the light emitted by the Sun is very small
\frac{\Delta\nu}{\nu} = 2.12 \times 10^{-6} , (3.64)
and indeed very difficult to detect due to the broadening of spectral lines and to Doppler shifts
associated to the convection currents in the solar atmosphere21 .
A more precise non-astronomical test of the gravitational frequency shift was performed by Pound
and Rebka in 1960 using gamma rays produced in a 14.4 keV nuclear transition in $^{57}$Fe. These gamma
rays were emitted at the top of a tower of 22.6 meters in the Jefferson Physical Laboratory at Harvard
University and directed down towards a similar sample of $^{57}$Fe located at the bottom of the tower. The
absorption of the gamma rays by the receiver is only efficient if the frequency at reception coincides
with the frequency at emission (Mössbauer effect). Due to the gravitational shift of frequencies this
was not the case. Pound and Rebka compensated the gravitational shift in a very clever way: a
Doppler shift induced by the vertical motion of the source at the top of the tower. By looking for
a resonance in the absorption they were able to obtain a direct measurement of the gravitational
redshift. The result was in excellent agreement with the Equivalence Principle’s prediction
$$ \frac{(\Delta\nu/\nu)_{\rm exp}}{(\Delta\nu/\nu)_{\rm th}} = 1.05 \pm 0.10 . \qquad (3.65) $$
19 $\Phi(r_1) = -GM/r_1$ and $\Phi(r_2) = -GM/r_2$ are small (cf. Table 3.1). The binomial expansion is justified. The
gravitational field of the Earth is neglected.
20 Remember that the gravitational potential is negative.
21 The gravitational redshift (3.64) corresponds numerically to the Doppler shift associated with a velocity of 0.6 km/s,
which is easily exceeded by the hot gases on the surface of the Sun.
3.8 The weakness of the Equivalence Principle 45
Figure 3.7: Different tests of the gravitational redshift. The parameter α parametrizes the deviations
from the Equivalence Principle, $\Delta\nu/\nu = (1 + \alpha)\Delta\Phi/c^2$.
Exercise
Determine the predicted value ∆ν/ν in the Pound-Rebka experiment. Are the gamma rays
traveling down the tower blueshifted or redshifted?
The bending of light in a gravitational field was considered by Newton himself, but he did not
perform any proper computation. The first known result about the deflection of light was presented
by the German astronomer Johann Georg von Soldner in 1804. Based on Newton’s corpuscular theory
of light, Soldner predicted a deflection angle of 0.87'' for a ray of light grazing the surface of the Sun.
Einstein, unaware of Soldner’s computations and based on the Equivalence Principle, obtained the
same number about one hundred years later, in 1911. Let us reproduce his arguments and (wrong) results.
The Minkowski value c = c0 is only recovered at long distances (r → ∞), where the gravitational
potential is negligible22 . According to Huygens’ principle the position of a wavefront at a time t + ∆t
can be determined by considering each point of the wavefront at t as a source of spherical waves. The
wavefront at t + ∆t is then given by the envelope of the multiple spherical wavefronts originated at
t. Imagine a wave front in the vicinity of a matter distribution M . Consider two points P1 and P2
separated by a spatial distance δl at time t. The velocity of light at those points ($c_1$ and $c_2$) depends
on the value of the gravitational field. Looking at Fig. ??, we conclude that in a time δt the
wavefront is refracted by an angle
$$ \delta\alpha = \frac{(c_1 - c_2)\,\delta t}{\delta l} = \frac{\delta\Phi}{\delta l}\,\delta t , \qquad (3.67) $$
with δΦ/δl the component of the gravitational acceleration along the wavefront. This infinitesimal
refraction angle can be integrated along the full path to obtain the total deflection angle23
$$ \alpha = \int d\alpha = \int \frac{d\Phi}{dl}\, dt . \qquad (3.68) $$
Since the velocity of light along the path is nearly constant we can set dt = ds, with s measuring the
distance along the path. Evaluating the integral (3.68) for an impact parameter b, we get
$$ \alpha = \int \frac{d\Phi}{dl}\, ds = \int_{-\pi/2}^{\pi/2} \frac{GM}{r^2} \cos\theta \, ds = \frac{2GM}{b} , \qquad (3.69) $$
which for the particular case of a photon grazing the surface of the Sun becomes24
$$ \alpha = \frac{2GM}{c^2 R} \approx 0.875'' . \qquad (3.70) $$
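The number quoted in (3.70) is easy to reproduce. The following sketch evaluates the integral (3.69) numerically and compares it with the closed form 2GM/(c²b). It assumes an undeflected straight path, parametrised as r = b/cos θ and s = b tan θ (a parametrisation chosen here for illustration), and uses rounded reference values for the solar constants.

```python
import math

# Assumed reference values for the Sun and the speed of light.
G_M_SUN = 1.32712e20   # m^3 / s^2
R_SUN = 6.957e8        # m
C = 2.99792458e8       # m / s
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def deflection_closed_form(gm, b, c=C):
    """Equivalence-principle result alpha = 2 G M / (c^2 b), in radians."""
    return 2.0 * gm / (c**2 * b)


def deflection_numeric(gm, b, c=C, steps=20000):
    """Midpoint-rule evaluation of the integral in Eq. (3.69).

    With r = b / cos(theta) and s = b tan(theta), the integrand
    (G M / r^2) cos(theta) ds reduces to (G M / b) cos(theta) d(theta).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += gm / b * math.cos(theta) * h
    return total / c**2


alpha_1911 = deflection_closed_form(G_M_SUN, R_SUN) * RAD_TO_ARCSEC
alpha_num = deflection_numeric(G_M_SUN, R_SUN) * RAD_TO_ARCSEC
print(f"closed form: {alpha_1911:.4f} arcsec, numerical integral: {alpha_num:.4f} arcsec")
```

Doubling `alpha_1911` gives the value of the full theory discussed in the text that follows.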
22 In “Relativity, The Special and General Theory”, Einstein wrote:
[. . . ] our result shows that, according to the general theory of relativity, the law of the
constancy of the velocity of light in vacuo, which constitutes one of the two fundamental
assumptions in the special theory of relativity and to which we have already frequently referred,
cannot claim any unlimited validity. A curvature of rays of light can only take place when the
velocity of propagation of light varies with position. Now we might think that as a consequence
of this, the special theory of relativity and with it the whole theory of relativity would be laid
in the dust. But in reality this is not the case. We can only conclude that the special theory
of relativity cannot claim an unlimited domain of validity; its results hold only so long as we
are able to disregard the influences of gravitational fields on the phenomena (e.g. of light).
23 This is a good approximation for small deflection angles, as is the case of the deflection of light by the Sun.
24 Note that we have restored the powers of c.
As Einstein stated in the original paper, since “the fixed stars in the part of the sky near the sun
are visible during a total eclipse of the sun, this consequence of the theory may be compared to
experiment”. He indeed “urgently wished astronomers to take up this question” and measure the
deflection of light during a solar eclipse. Fortunately for him . . . they didn’t do it in time. Einstein’s
1911 prediction, based only on the equivalence principle, was incomplete25 . No measurement of the
deflection angle was performed between 1911 and 1915, the moment at which he corrected his
result to
$$ \alpha = 2 \times \frac{2GM}{c^2 R} \approx 2 \times 0.875'' . \qquad (3.71) $$
Although different expeditions to observe solar eclipses were organized, all of them were cancelled,
either for meteorological or political reasons. One of the most interesting stories is that of the German
astronomer and mathematician Erwin Finlay Freundlich, who, interested in testing Einstein’s
prediction, convinced the German armament manufacturer Krupp to finance a trip to Crimea for the
eclipse of 21st August 1914. Unfortunately for him, and fortunately for Einstein, the German astronomer was
arrested by the Russians as a suspected spy before being able to perform any measurement.
J. J. Thomson
Royal Society, 1919
In Euclidean and Minkowski spacetimes we were dealing with global Cartesian coordinate systems.
Our goal in this Chapter is to develop the mathematical tools needed to write physical equations in
a way completely independent of the particular coordinate system we actually end up using.
with gµν generically depending on the coordinates. Once we have established a coordinate basis, we
can define the components of a vector. Let us consider the infinitesimal displacement vector between
two points
$$ dS = dx^\mu\, e_\mu , \qquad (4.2) $$
whose scalar product with itself defines the line element
$$ |dS|^2 \equiv ds^2 = e_\mu(x) \cdot e_\nu(x)\, dx^\mu dx^\nu = g_{\mu\nu}(x)\, dx^\mu dx^\nu . \qquad (4.3) $$
The above expression represents a generalization of the Pythagorean theorem for an arbitrary coordinate
system. As usual, the inverse of the metric $g^{\mu\nu}$ is defined through the relation $g^{\mu\nu} g_{\nu\lambda} = \delta^\mu{}_\lambda$.
Note that both $g_{\mu\nu}(x)$ and $g^{\mu\nu}(x)$ are symmetric.
4.1 General coordinate transformations 49
Exercise
• Prove that $g_{\mu\nu}$ is a symmetric matrix. Hint: Assume $g_{\mu\nu}$ is not symmetric and decompose
it into a symmetric and an antisymmetric part.
• What is the number of independent components of a general symmetric matrix in N
dimensions?
Consider two different coordinate systems $x^\mu$ and $\bar{x}^\mu$ related to each other by an arbitrary coordinate
transformation
$$ \bar{x}^\mu = f^\mu(x^1, x^2, \ldots, x^N) , \qquad (\mu = 1, 2, \ldots, N) . \qquad (4.4) $$
The N arbitrary real functions $f^\mu(x^1, x^2, \ldots, x^N)$ are assumed to be single valued, continuous and
differentiable over the whole range of their arguments. By differentiating each of these functions with
respect to the coordinates, we obtain an N × N transformation matrix
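$$ \frac{\partial \bar{x}^\mu}{\partial x^\nu} =
\begin{pmatrix}
\frac{\partial f^1}{\partial x^1} & \frac{\partial f^1}{\partial x^2} & \ldots & \frac{\partial f^1}{\partial x^N} \\
\frac{\partial f^2}{\partial x^1} & \frac{\partial f^2}{\partial x^2} & \ldots & \frac{\partial f^2}{\partial x^N} \\
\vdots & \vdots & & \vdots \\
\frac{\partial f^N}{\partial x^1} & \frac{\partial f^N}{\partial x^2} & \ldots & \frac{\partial f^N}{\partial x^N}
\end{pmatrix} , \qquad (4.5) $$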
whose entries are, in general, functions of the coordinates. The determinant of the transformation
matrix
$$ J(x) = \left| \frac{\partial \bar{x}^\mu}{\partial x^\nu} \right| , \qquad (4.6) $$
is called the Jacobian of the transformation. If the N arbitrary real functions are independent, the
Jacobian is different from zero and the coordinate transformation (4.4) can be inverted to express $x^\mu$
in terms of $\bar{x}^\mu$
$$ x^\mu = g^\mu(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N) . \qquad (4.7) $$
Exercise
The transformation matrix and the Jacobian associated to this inverse coordinate transformation
(4.7) are given respectively by the inverse of the transformation matrix (4.5), $[\partial x^\mu / \partial \bar{x}^\nu]$, and the
inverse of the Jacobian (4.6), $\bar{J} = J^{-1}$. Prove it.
As you may expect at this point, there is a simple relationship between the coordinate basis vectors
in the two systems. This relation can be found by requiring the invariance of the line element ds2 ,
which is a purely geometrical quantity independent of the coordinate system used to describe it. We
get
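$$ d\bar{x}^\mu = \frac{\partial \bar{x}^\mu}{\partial x^\nu}\, dx^\nu , \qquad \bar{e}_\mu = \frac{\partial x^\nu}{\partial \bar{x}^\mu}\, e_\nu . \qquad (4.8) $$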
The previous expressions are equivalent to the similarity relation between the metrics in the two
coordinate systems that we found in the previous chapter
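$$ \bar{g}_{\rho\sigma}(\bar{x}) = \frac{\partial x^\mu}{\partial \bar{x}^\rho} \frac{\partial x^\nu}{\partial \bar{x}^\sigma}\, g_{\mu\nu}(x(\bar{x})) . \qquad (4.9) $$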
Exercise
Taking into account Eq. (4.9), determine the number of independent components of the metric.
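$$ \frac{\partial \bar{x}^\mu}{\partial x^\nu} =
\begin{pmatrix}
\frac{\partial r}{\partial x^1} & \frac{\partial r}{\partial x^2} \\
\frac{\partial \theta}{\partial x^1} & \frac{\partial \theta}{\partial x^2}
\end{pmatrix} =
\begin{pmatrix}
\cos\theta & \sin\theta \\
-\frac{1}{r}\sin\theta & \frac{1}{r}\cos\theta
\end{pmatrix} , \qquad (4.12) $$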
are different from zero, i.e. non singular, except at r = 0. The polar coordinate system admits
a pair of basis vectors er and eθ , adapted to the coordinates and related to the Cartesian basis
vectors by (cf. (4.8))
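$$ e_r = \frac{\partial x^1}{\partial r}\, e_1 + \frac{\partial x^2}{\partial r}\, e_2 = \cos\theta\, e_1 + \sin\theta\, e_2 , \qquad (4.13) $$
$$ e_\theta = \frac{\partial x^1}{\partial \theta}\, e_1 + \frac{\partial x^2}{\partial \theta}\, e_2 = -r \sin\theta\, e_1 + r \cos\theta\, e_2 . \qquad (4.14) $$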
Note that the resulting basis is not a unit basis
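On the other hand, the relation between the infinitesimal displacements in both coordinate
systems is given by (cf. (4.8))
$$ dr = \frac{\partial r}{\partial x^1}\, dx^1 + \frac{\partial r}{\partial x^2}\, dx^2 = \cos\theta\, dx^1 + \sin\theta\, dx^2 , \qquad (4.16) $$
$$ d\theta = \frac{\partial \theta}{\partial x^1}\, dx^1 + \frac{\partial \theta}{\partial x^2}\, dx^2 = -\frac{1}{r} \sin\theta\, dx^1 + \frac{1}{r} \cos\theta\, dx^2 . \qquad (4.17) $$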
The components of the metric tensor and its inverse in this basis can be computed either through
the definition of the metric (4.1)
for the metric and its inverse. The line element (4.3) written in polar coordinates becomes
which is what one usually writes down when asked for the metric of this coordinate system.
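The transformation rule (4.9) can also be applied numerically. The sketch below starts from the Cartesian metric $g_{ij} = \delta_{ij}$, builds the matrix $\partial x^i / \partial \bar{x}^a$ for polar coordinates by central finite differences, and contracts it as in (4.9). The helper names and the step size are choices made for this illustration.

```python
import math


def cartesian_from_polar(r, theta):
    """Coordinate map x^i(r, theta) for the plane."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian(r, theta, h=1e-6):
    """J[i][a] = d x^i / d xbar^a with xbar = (r, theta), by central differences."""
    xp, xm = cartesian_from_polar(r + h, theta), cartesian_from_polar(r - h, theta)
    yp, ym = cartesian_from_polar(r, theta + h), cartesian_from_polar(r, theta - h)
    return [[(xp[i] - xm[i]) / (2 * h), (yp[i] - ym[i]) / (2 * h)] for i in range(2)]


def polar_metric(r, theta):
    """Eq. (4.9) with g_ij = delta_ij: gbar_ab = sum_i J[i][a] * J[i][b]."""
    J = jacobian(r, theta)
    return [[sum(J[i][a] * J[i][b] for i in range(2)) for b in range(2)]
            for a in range(2)]


g = polar_metric(2.0, 0.7)
print(g)  # diagonal entries close to 1 and r^2, off-diagonal entries close to 0
```

Up to finite-difference error, the output reproduces the familiar diagonal form with components 1 and r².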
Exercise
Repeat the same exercise for spherical coordinates.
4.2 Tensors
The transformation laws of tensors under general coordinate transformations are just a generalization
of those found in Chapters 1 and 2, the main difference being the replacement of the constant matrices
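$R^i{}_j$ and $\Lambda^\mu{}_\nu$ by the arbitrary transformation matrix (4.5) and the use of the metric $g_{\mu\nu}$ and its inverse
for lowering and raising indices
$$ V_\mu \equiv g_{\mu\nu} V^\nu , \qquad V^\mu \equiv g^{\mu\nu} V_\nu . \qquad (4.23) $$
The simplest transformation rules are summarized in Table 4.1. For a general tensor with m contravariant
indices and n covariant indices we have
$$ \bar{T}^{\mu_1 \ldots \mu_m}{}_{\nu_1 \ldots \nu_n} = \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{\mu_p}}{\partial x^{\rho_p}} \prod_{q=1}^{n} \frac{\partial x^{\sigma_q}}{\partial \bar{x}^{\nu_q}} \right) T^{\rho_1 \ldots \rho_m}{}_{\sigma_1 \ldots \sigma_n} . \qquad (4.24) $$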
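General coord. transformations ($\partial \bar{x}^\mu / \partial x^\nu$ are arbitrary!)
Scalar                          $\bar{\phi} = \phi$
Contravariant vector            $\bar{V}^\mu = \frac{\partial \bar{x}^\mu}{\partial x^\nu} V^\nu$
Covariant vector                $\bar{V}_\mu = \frac{\partial x^\nu}{\partial \bar{x}^\mu} V_\nu$
Contravariant rank-2 tensor     $\bar{T}^{\mu\nu} = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial \bar{x}^\nu}{\partial x^\sigma} T^{\rho\sigma}$
Covariant rank-2 tensor         $\bar{T}_{\mu\nu} = \frac{\partial x^\rho}{\partial \bar{x}^\mu} \frac{\partial x^\sigma}{\partial \bar{x}^\nu} T_{\rho\sigma}$
Mixed rank-2 tensor             $\bar{T}^\mu{}_\nu = \frac{\partial \bar{x}^\mu}{\partial x^\rho} \frac{\partial x^\sigma}{\partial \bar{x}^\nu} T^\rho{}_\sigma$
Table 4.1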
The previous expression can be combined with Eq. (4.25) to obtain a generally covariant quantity
$$ \sqrt{|\bar{g}|}\, d^4\bar{x} = \sqrt{|g|}\, d^4x , \qquad (4.27) $$
which we can use as an appropriate volume element in arbitrary dimensions. The absolute value in
Eq. (4.27) is introduced to take into account the case of a metric with Lorentzian signature (− + ++)
and negative determinant.
The volume density $d^4x$ and the determinant of the metric g are just particular cases of a general
class of quantities called tensor densities. A tensor density transforms as a tensor except for the
appearance of the Jacobian to a given power w called the weight of the tensor density, namely
$$ \bar{D}^{\mu_1 \ldots \mu_m}{}_{\nu_1 \ldots \nu_n} = J^{-w} \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{\mu_p}}{\partial x^{\rho_p}} \prod_{q=1}^{n} \frac{\partial x^{\sigma_q}}{\partial \bar{x}^{\nu_q}} \right) D^{\rho_1 \ldots \rho_m}{}_{\sigma_1 \ldots \sigma_n} . \qquad (4.28) $$
Ordinary tensors can therefore be considered as tensor densities of weight zero. The determinant of
the covariant rank-2 metric tensor is a scalar density of weight 2, while $d^4x$ is a scalar density of
weight −1. Eq. (4.27) can be easily generalized to obtain a rule for transforming tensor densities
into tensors
$$ |\bar{g}|^{-w/2}\, \bar{D}^{\mu_1 \ldots \mu_m}{}_{\nu_1 \ldots \nu_n} = \left( \prod_{p=1}^{m} \frac{\partial \bar{x}^{\mu_p}}{\partial x^{\rho_p}} \prod_{q=1}^{n} \frac{\partial x^{\sigma_q}}{\partial \bar{x}^{\nu_q}} \right) |g|^{-w/2}\, D^{\rho_1 \ldots \rho_m}{}_{\sigma_1 \ldots \sigma_n} . \qquad (4.29) $$
Exercise
• Show that the totally antisymmetric quantity
$$ \epsilon_{\mu\nu\rho\sigma} = \begin{cases} +1, & \text{if } \mu\nu\rho\sigma \text{ is an even permutation of } 0123 , \\ -1, & \text{if } \mu\nu\rho\sigma \text{ is an odd permutation of } 0123 , \\ 0, & \text{otherwise} , \end{cases} \qquad (4.30) $$
is a tensor density under general coordinate transformations. Determine its weight. Construct
a tensor from it using the metric.
• Show that the components of $\epsilon_{\mu\nu\rho\sigma}$ remain unchanged under general coordinate
transformations.
The second term spoils the tensorial property of the derivative for first-rank tensors. In order to de-
termine why this happens, let’s go back to geometrical, real coordinate independent objects. Consider
for instance the expansion of a vector
$$ V = V^\mu e_\mu(x) , \qquad (4.32) $$
in terms of arbitrary basis vectors $e_\mu(x)$, which, contrary to the Euclidean or Minkowski cases, generically
depend on the coordinates. The derivative of such a vector contains two different contributions,
one due to the intrinsic change of the vector field from place to place and one describing the variation
of the basis vectors from place to place
$$ \frac{\partial V}{\partial x^\nu} = \frac{\partial V^\mu}{\partial x^\nu}\, e_\mu + V^\mu \frac{\partial e_\mu}{\partial x^\nu} . \qquad (4.33) $$
The first term is present even in Cartesian coordinates and is a linear combination of the basis vectors,
i.e. a vector. On the other hand, the second term involves the derivative of the basis vectors eµ . The
relation between these vectors and those in a (local) inertial reference frame1 $e_\alpha$ is given by
$$ e_\mu = \frac{\partial \xi^\alpha}{\partial x^\mu}\, e_\alpha . \qquad (4.34) $$
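Taking the derivative of the previous expression and using the fact that the vectors $e_\alpha$ are constant
$$ \frac{\partial e_\alpha}{\partial x^\nu} = 0 , \qquad (4.35) $$
we get the following relation
$$ \frac{\partial e_\mu}{\partial x^\nu} = \frac{\partial}{\partial x^\nu} \left( \frac{\partial \xi^\alpha}{\partial x^\mu} \right) e_\alpha . \qquad (4.36) $$
The right-hand side of the previous equation is a linear combination of the inertial basis vectors $e_\alpha$
and therefore is a vector. This allows us to rewrite $\partial e_\mu / \partial x^\nu$ in terms of the arbitrary basis vectors $e_\mu$
$$ \frac{\partial e_\mu}{\partial x^\nu} = \Gamma^\rho{}_{\mu\nu}\, e_\rho . \qquad (4.37) $$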
For the time being, the so-called affine connection $\Gamma^\rho{}_{\mu\nu}$ is just a 3-index notation denoting the linear
combination of arbitrary basis vectors $e_\mu$. The index µ specifies the basis vector that is differentiated,
ν the coordinate with respect to which it is differentiated and ρ the component of the resulting vector2 .
Inserting the definition (4.37) into Eq. (4.33) we obtain
$$ \frac{\partial V}{\partial x^\nu} = \frac{\partial V^\mu}{\partial x^\nu}\, e_\mu + V^\mu\, \Gamma^\rho{}_{\mu\nu}\, e_\rho , \qquad (4.38) $$
where the basis vector in the right hand side can be factored out by simply relabeling the dummy
indices µ and ρ
$$ \frac{\partial V}{\partial x^\nu} = \left( \frac{\partial V^\mu}{\partial x^\nu} + V^\rho\, \Gamma^\mu{}_{\rho\nu} \right) e_\mu \equiv \left( \nabla_\nu V^\mu \right) e_\mu . \qquad (4.39) $$
The quantities in parentheses are the components of a tensor, called the covariant derivative, which
takes into account the variation of basis vectors from point to point. We will denote it in two alternative
ways, either with the symbol ∇
$$ \nabla_\nu V^\mu = \partial_\nu V^\mu + \Gamma^\mu{}_{\rho\nu} V^\rho . \qquad (4.40) $$
1 Note the use of the first letters of the Greek alphabet for denoting quantities in (local) inertial frames.
2 Yes, I am deliberately using the same symbol I used for the Christoffel symbols. You will understand why in a
while. Be patient.
4.4 Covariant derivative 54
or with a semicolon
$$ V^\mu{}_{;\nu} = V^\mu{}_{,\nu} + \Gamma^\mu{}_{\rho\nu} V^\rho . \qquad (4.41) $$
Note that the standard derivative $\partial_\mu$ has been denoted by a comma. The notation (4.41) is especially
convenient for its brevity and for remembering the definition of the covariant derivative (the ν index
appears in the last position of each term).
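$$ V^r{}_{;r} = V^r{}_{,r} , \qquad V^\theta{}_{;\theta} = V^\theta{}_{,\theta} + \frac{1}{r} V^r , \qquad (4.46) $$
$$ V^\theta{}_{;r} = V^\theta{}_{,r} + \frac{1}{r} V^\theta , \qquad V^r{}_{;\theta} = V^r{}_{,\theta} - r V^\theta . \qquad (4.47) $$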
Note that the final expressions do not involve any Cartesian tensors and allow you to directly
derive the formulae for the divergence of a vector field
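$$ \nabla \cdot V = V^r{}_{;r} + V^\theta{}_{;\theta} = V^r{}_{,r} + V^\theta{}_{,\theta} + \frac{1}{r} V^r = \frac{1}{r} \frac{\partial}{\partial r} \left( r V^r \right) + \frac{\partial}{\partial \theta} V^\theta , \qquad (4.48) $$
and the Laplacian of a scalar fielda
$$ \nabla \cdot \nabla \phi \equiv \nabla^2 \phi = \frac{1}{r} \frac{\partial}{\partial r} \left( r \frac{\partial \phi}{\partial r} \right) + \frac{1}{r^2} \frac{\partial^2 \phi}{\partial \theta^2} , \qquad (4.49) $$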
in polar coordinates. You should recognize the result. . . The formulae appearing in your favorite
electromagnetism books are just a consequence of the existence of non-vanishing connection
coefficients in curvilinear coordinates!
a Note that ∇µ φ = ∂µ φ.
For a covariant tensor the covariant derivative takes a slightly different form. Consider a scalar
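$$ \phi = V_\mu U^\mu . \qquad (4.50) $$
Since a scalar does not depend on the basis vectors, its covariant derivative coincides with the standard
derivative
$$ \nabla_\nu \phi = \partial_\nu \phi = \frac{\partial V_\mu}{\partial x^\nu}\, U^\mu + V_\mu \frac{\partial U^\mu}{\partial x^\nu} . \qquad (4.51) $$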
Using Eq. (4.40) for replacing ∂ν U µ in favor of ∇ν U µ and relabeling dummy indices in the term
containing the connection we get
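$$ \nabla_\nu \phi = \left( \frac{\partial V_\mu}{\partial x^\nu} - \Gamma^\rho{}_{\mu\nu} V_\rho \right) U^\mu + V_\mu \nabla_\nu U^\mu . \qquad (4.52) $$
All the terms in the previous expression, except the one in parentheses, are tensor components. Since
the multiplication and addition of tensor components always gives rise to tensors, the quantity in
parentheses must be a tensor. The covariant derivative of $V_\mu$ is then given by
$$ \nabla_\nu V_\mu = \partial_\nu V_\mu - \Gamma^\rho{}_{\mu\nu} V_\rho , \qquad (4.53) $$
or
$$ V_{\mu;\nu} = V_{\mu,\nu} - \Gamma^\rho{}_{\mu\nu} V_\rho , \qquad (4.54) $$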
in the semicolon notation. Note the similarities and differences between (4.41) and (4.54). In both
cases, the index with respect to which the covariant derivative is taken (ν in this case) is the last
subscript of the connection Γ. The remaining indices can only be arranged in one way without raising
and lowering them. For a contravariant vector (superscript) the sign of the connection is positive; for a
covariant vector (subscript) the connection carries a minus sign. These rules can be
generalized to extra covariant and contravariant indices by introducing a factor Γ for each index with
the proper sign and index matching to obtain
$$ T^{\mu \ldots}{}_{\nu \ldots ;\rho} = T^{\mu \ldots}{}_{\nu \ldots ,\rho} + \Gamma^\mu{}_{\lambda\rho}\, T^{\lambda \ldots}{}_{\nu \ldots} + \ldots - \Gamma^\lambda{}_{\nu\rho}\, T^{\mu \ldots}{}_{\lambda \ldots} - \ldots . \qquad (4.55) $$
• The index of the tensor that is being corrected will be replaced by a dummy index, which
will be contracted with one of the indices of Γ.
• The remaining indices can be placed in a unique way.
Exercise
Write explicitly $T^\mu{}_{\nu;\rho}$ and $T^{\mu\nu}{}_{;\rho}$.
perfectly written the same expression with a different ordering of indices, i.e. $\Gamma^\nu{}_{\rho\mu}$ instead of $\Gamma^\nu{}_{\mu\rho}$. In
the most general case, these two quantities are not necessarily equal to each other
$$ T^\nu{}_{\mu\rho} \equiv \Gamma^\nu{}_{\mu\rho} - \Gamma^\nu{}_{\rho\mu} \neq 0 , \qquad (4.56) $$
and the spacetime is said to have torsion3 . In what follows, we will require our spacetime to be
torsionless and will take
$$ \Gamma^\nu{}_{\mu\rho} = \Gamma^\nu{}_{\rho\mu} . \qquad (4.57) $$
Under this assumption, the relation between the metric and the connection can be determined as
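follows. Starting with (4.1), and differentiating it with respect to $x^\rho$ we obtain
$$ \begin{aligned}
\partial_\rho g_{\mu\nu} &= \partial_\rho e_\mu \cdot e_\nu + e_\mu \cdot \partial_\rho e_\nu \\
&= \Gamma^\sigma{}_{\mu\rho}\, e_\sigma \cdot e_\nu + e_\mu \cdot \Gamma^\sigma{}_{\nu\rho}\, e_\sigma \\
&= \Gamma^\sigma{}_{\mu\rho}\, g_{\sigma\nu} + \Gamma^\sigma{}_{\nu\rho}\, g_{\mu\sigma} ,
\end{aligned} \qquad (4.58) $$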
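where in the last step we have made use of the defining equation (4.37) and the metric definition
(4.1). By cyclically permuting the indices we obtain the following two equivalent expressions
which, combined with Eq. (4.58) and using the assumed property $\Gamma^\sigma{}_{\mu\nu} = \Gamma^\sigma{}_{\nu\mu}$, allow us to form the
following combination
$$ \partial_\rho g_{\mu\nu} + \partial_\nu g_{\rho\mu} - \partial_\mu g_{\nu\rho} = 2\, \Gamma^\sigma{}_{\rho\nu}\, g_{\mu\sigma} . \qquad (4.61) $$
Multiplying by the inverse metric $g^{\kappa\mu}$, using $g^{\kappa\mu} g_{\mu\sigma} = \delta^\kappa{}_\sigma$ and relabeling indices we obtain the result
we were looking for4
$$ \Gamma^\mu{}_{\nu\rho} = \frac{1}{2}\, g^{\mu\sigma} \left( \partial_\nu g_{\sigma\rho} + \partial_\rho g_{\nu\sigma} - \partial_\sigma g_{\nu\rho} \right) . \qquad (4.62) $$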
A connection satisfying the previous property is called a metric connection, a Christoffel connection,
a Levi-Civita connection or a Riemannian connection.
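Formula (4.62) is mechanical enough to be evaluated numerically. The sketch below is restricted to two dimensions and differentiates the metric by central finite differences; the function names, the step size and the choice of plane polar coordinates (with $g_{rr} = 1$, $g_{\theta\theta} = r^2$) as a test case are assumptions of this illustration.

```python
def polar_metric(x):
    """Metric of the plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix (enough for this two-dimensional sketch)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of d g_{ab} / d x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)]


def christoffel(metric, x):
    """gamma[mu][nu][rho] = Gamma^mu_{nu rho}, evaluated from Eq. (4.62)."""
    n = len(x)
    ginv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][a][b] = d_k g_ab
    return [[[0.5 * sum(ginv[mu][s] * (dg[nu][s][rho] + dg[rho][nu][s] - dg[s][nu][rho])
                        for s in range(n))
              for rho in range(n)] for nu in range(n)] for mu in range(n)]


gamma = christoffel(polar_metric, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])  # Gamma^r_{theta theta} and Gamma^theta_{r theta}
```

At r = 2 the two printed values should be close to −r = −2 and 1/r = 0.5, the connection coefficients that enter (4.46) and (4.47).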
Exercise
• Prove explicitly the relation (4.68) by using the relation (4.62).
ii) Leibniz (product) rule: The covariant derivative of outer and inner products of tensors obeys the
same rules as the usual derivative
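iii) Metric compatibility: The covariant derivative of the metric tensor is zero
$$ \nabla_\rho g_{\mu\nu} = 0 . \qquad (4.68) $$
In other words, the metric tensor is not constant, $\partial_\rho g_{\mu\nu} \neq 0$, but it is covariantly constant. The
result follows immediately from comparing the general expression for the covariant derivative of
a rank-2 covariant tensor with (4.58)5 .
iv) The raising and lowering of tensor indices is not affected by covariant differentiation. For example
$$ \nabla_\nu V^\mu = \nabla_\nu \left( g^{\mu\sigma} V_\sigma \right) = g^{\mu\sigma} \nabla_\nu V_\sigma , \qquad (4.69) $$
where we have made use of properties ii) and iii). Note that this would not be the case if our
connection were not metric-compatible. We should be very careful about index placement in that
case.
v) The covariant derivative of the Kronecker delta $\delta^\mu{}_\nu$ is zero
$$ \nabla_\rho \delta^\mu{}_\nu = \partial_\rho \delta^\mu{}_\nu + \Gamma^\mu{}_{\sigma\rho}\, \delta^\sigma{}_\nu - \Gamma^\sigma{}_{\nu\rho}\, \delta^\mu{}_\sigma = \Gamma^\mu{}_{\nu\rho} - \Gamma^\mu{}_{\nu\rho} = 0 . \qquad (4.70) $$
vi) The covariant derivative commutes with the contraction of indices. For example
$$ \nabla_\nu T^{\mu\rho}{}_\rho = \partial_\nu T^{\mu\rho}{}_\rho + \Gamma^\mu{}_{\lambda\nu} T^{\lambda\rho}{}_\rho + \Gamma^\rho{}_{\lambda\nu} T^{\mu\lambda}{}_\rho - \Gamma^\kappa{}_{\rho\nu} T^{\mu\rho}{}_\kappa = \partial_\nu T^{\mu\rho}{}_\rho + \Gamma^\mu{}_{\lambda\nu} T^{\lambda\rho}{}_\rho . \qquad (4.71) $$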
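The comparison invoked in property iii) can be spelled out in one line. As a sketch, apply the general rule (4.55) to the two covariant indices of the metric and then insert (4.58) for the partial derivative:

```latex
\begin{align*}
\nabla_\rho g_{\mu\nu}
  &= \partial_\rho g_{\mu\nu}
     - \Gamma^\sigma{}_{\mu\rho}\, g_{\sigma\nu}
     - \Gamma^\sigma{}_{\nu\rho}\, g_{\mu\sigma} \\
  &= \left( \Gamma^\sigma{}_{\mu\rho}\, g_{\sigma\nu} + \Gamma^\sigma{}_{\nu\rho}\, g_{\mu\sigma} \right)
     - \Gamma^\sigma{}_{\mu\rho}\, g_{\sigma\nu}
     - \Gamma^\sigma{}_{\nu\rho}\, g_{\mu\sigma}
   = 0 .
\end{align*}
```

The cancellation is term by term, so no symmetry property of the connection is needed at this step beyond what already went into (4.58).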
Exercise
• Consider a tensor $T^\mu{}_\nu = U^\mu V_\nu$. Use Leibniz’s rule (4.67) together with the expressions
for the covariant derivatives of a covariant and a contravariant vector to compute $T^\mu{}_{\nu;\rho}$.
Is the result consistent with the practical rules below Eq. (4.55)?
• Verify Eq. (4.68) for the particular case of polar coordinates in the plane.
5 Indeed, we have implicitly assumed that the affine connection was metric-compatible in our derivation of Eq. (4.58).
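4. Covariant Laplacian6
$$ \nabla^2 \phi \equiv \nabla_\mu \nabla^\mu \phi = \frac{1}{\sqrt{|g|}}\, \partial_\mu \left( \sqrt{|g|}\, \partial^\mu \phi \right) . \qquad (4.75) $$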
Eqs. (4.73), (4.77) and (4.78) are particularly useful, since they allow us to compute the covariant
derivative of an object without having to compute the Christoffel symbols.
Exercise
• Derive all the expressions in this section.
• Use Eq. (4.75) to rederive the expression for the Laplacian in polar coordinates.
6 The symbol used for the Laplacian operator depends on the dimension of the spacetime considered. The three-sided
symbol $\nabla^2$ in (4.75) is the most common notation in arbitrary dimension. The 4-dimensional case is sometimes
singled out. It has a special name, d’Alembertian, and its own four-sided symbol, $\Box$.
4.5 An application: Maxwell equations in arbitrary coordinates 59
with $F_{\mu\nu} = \nabla_\mu A_\nu - \nabla_\nu A_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu$. The resulting equations are fully covariant, i.e. if they
are valid in one coordinate system they will be valid in all coordinate systems. Taking into
account the antisymmetry of $F^{\mu\nu}$ together with the property (4.78), the first Eq. in (4.81) can be
written in a very convenient form7
$$ \partial_\mu \left( \sqrt{|g|}\, F^{\mu\nu} \right) = \sqrt{|g|}\, J^\nu , \qquad (4.82) $$
which, taking into account (4.73), allows us to easily compute the expression for the continuity equation
(2.73) in arbitrary coordinate systems
$$ \nabla_\mu J^\mu = \frac{1}{\sqrt{|g|}}\, \partial_\mu \left( \sqrt{|g|}\, J^\mu \right) = 0 . \qquad (4.83) $$
Exercise
Fill in the steps in the derivation of Eqs. (4.82) and (4.83).
On the other hand, if we consider a non-Cartesian set of coordinates, such as polar coordinates in
the plane, the notion of parallel transport is more difficult to define since as we saw in the previous
sections the basis vectors change from point to point. The covariant derivative can be nevertheless
used to provide a natural definition for parallel transport in an arbitrary spacetime. To see this,
consider the derivative of a vector V = V µ eµ along a curve parametrized by an affine parameter σ
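$$ \begin{aligned}
\frac{dV}{d\sigma} &= \frac{dV^\mu}{d\sigma}\, e_\mu + V^\mu \frac{de_\mu}{d\sigma} \\
&= \frac{dV^\mu}{d\sigma}\, e_\mu + V^\mu \frac{\partial e_\mu}{\partial x^\rho} \frac{dx^\rho}{d\sigma} \\
&= \frac{dV^\mu}{d\sigma}\, e_\mu + \Gamma^\nu{}_{\mu\rho}\, V^\mu \frac{dx^\rho}{d\sigma}\, e_\nu .
\end{aligned} \qquad (4.85) $$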
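Relabelling indices and factoring out the basis vector, we get
$$ \frac{dV}{d\sigma} = \left( \frac{dV^\mu}{d\sigma} + \Gamma^\mu{}_{\nu\rho}\, V^\nu \frac{dx^\rho}{d\sigma} \right) e_\mu \equiv \frac{DV^\mu}{d\sigma}\, e_\mu , \qquad (4.86) $$
where we have defined the components of the intrinsic derivative $DV^\mu/d\sigma$ of a contravariant vector as
$$ \frac{DV^\mu}{d\sigma} \equiv \frac{dV^\mu}{d\sigma} + \Gamma^\mu{}_{\nu\rho}\, V^\nu \frac{dx^\rho}{d\sigma} . \qquad (4.87) $$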
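A similar expression can be found for the components of the intrinsic derivative of a covariant vector
$$ \frac{DV_\mu}{d\sigma} \equiv \frac{dV_\mu}{d\sigma} - \Gamma^\nu{}_{\mu\rho}\, V_\nu \frac{dx^\rho}{d\sigma} . \qquad (4.88) $$
The condition of parallel transport, dV/dσ = 0, implies
$$ \frac{DV^\mu}{d\sigma} \equiv \frac{dV^\mu}{d\sigma} + \Gamma^\mu{}_{\nu\rho}\, V^\nu \frac{dx^\rho}{d\sigma} = 0 . \qquad (4.89) $$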
The concept of intrinsic derivative and parallel transport can be generalized to objects with more
indices. The parallel transport of a tensor T along the path $x^\mu(\sigma)$ is defined by the requirement
$$ \frac{D T^{\mu \cdots}{}_{\nu \cdots}}{d\sigma} \equiv \frac{dx^\rho}{d\sigma}\, \nabla_\rho T^{\mu \cdots}{}_{\nu \cdots} = 0 . \qquad (4.91) $$
Applying this to the metric and taking into account (4.68) we conclude that
$$ \frac{D g_{\mu\nu}}{d\sigma} = \frac{dx^\rho}{d\sigma}\, \nabla_\rho g_{\mu\nu} = 0 , \qquad (4.92) $$
which gives rise to an important property of parallel transport: it conserves the scalar product of two
parallel transported vectors
$$ \frac{D}{d\sigma} \left( V_\mu U^\mu \right) = \frac{D}{d\sigma} \left( g_{\mu\nu} V^\mu U^\nu \right) = \frac{D g_{\mu\nu}}{d\sigma}\, V^\mu U^\nu + g_{\mu\nu} \frac{D V^\mu}{d\sigma}\, U^\nu + V_\mu \frac{D U^\mu}{d\sigma} = 0 , \qquad (4.93) $$
and therefore their norm, orthogonality, etc.
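A small numerical experiment illustrates both (4.89) and the conservation of the norm. The sketch below parallel transports a vector along the circle r = R, θ = σ in the plane, written in polar coordinates, where the only non-vanishing connection coefficients are $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = 1/r$. The choice of path, the fourth-order Runge-Kutta integrator and all function names are assumptions made for this illustration.

```python
import math


def rhs(v, radius):
    """Right-hand side of Eq. (4.89) on the circle r = radius, theta = sigma.

    dV^r/dsigma     = -Gamma^r_{theta theta} V^theta = radius * V^theta
    dV^theta/dsigma = -Gamma^theta_{r theta} V^r     = -V^r / radius
    """
    v_r, v_theta = v
    return (radius * v_theta, -v_r / radius)


def parallel_transport(v0, radius, sigma_end, steps=2000):
    """Integrate Eq. (4.89) from sigma = 0 to sigma_end with classical RK4 steps."""
    h = sigma_end / steps
    v = tuple(v0)
    for _ in range(steps):
        k1 = rhs(v, radius)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]), radius)
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]), radius)
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]), radius)
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return v


def norm_squared(v, radius):
    """g_{mu nu} V^mu V^nu with g_rr = 1 and g_{theta theta} = radius^2."""
    return v[0] ** 2 + radius ** 2 * v[1] ** 2


def cartesian_components(v, radius, theta):
    """Components of V = V^r e_r + V^theta e_theta in the Cartesian basis, cf. (4.13)-(4.14)."""
    return (v[0] * math.cos(theta) - radius * v[1] * math.sin(theta),
            v[0] * math.sin(theta) + radius * v[1] * math.cos(theta))


v0, radius = (1.0, 0.25), 2.0
v_quarter = parallel_transport(v0, radius, math.pi / 2)
print(v_quarter, norm_squared(v_quarter, radius))
print(cartesian_components(v_quarter, radius, math.pi / 2))
```

The polar components change along the path because the basis vectors rotate, while the norm and the Cartesian components stay fixed, as expected for parallel transport in a flat plane.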
4.7 Summary
The general procedure for converting an equation which is valid in Cartesian inertial coordinates to
an equation valid in arbitrary coordinate systems is:
This prescription is known as minimal coupling prescription, since it does not introduce any terms
apart from those already present.
Exercise
The equation of motion of a charged particle in an electromagnetic field in Cartesian coordinates
takes the form
$$ m \frac{du^\alpha}{d\tau} = q F^\alpha{}_\beta\, u^\beta . \qquad (4.94) $$
What form does the previous expression take in an arbitrary coordinate system? Do you
recognize the left-hand side?
CHAPTER 5
TIDAL FORCES AND CURVATURE
A. Einstein
In both Newtonian mechanics in the absence of gravity and Einstein’s theory of Relativity, inertial
frames are characterized by the absence of accelerations, which are absolute elements of the theory.
If particles move in straight lines at constant speed the system is inertial. On the other hand, if
the trajectory in spacetime is not a straight line the system must be accelerating. The situation is
slightly different when gravity is taken into account. The equality between inertial and gravitational
masses does not allow one to locally distinguish the acceleration of a given reference frame from purely
gravitational effects. Gravity can be locally switched off by properly choosing a local inertial frame
associated to an observer in free-fall in the gravitational field. The word locally is fundamental, since
the global behaviours of accelerations and gravity are completely different: while the true gravitational
field vanishes at large distances, the apparent gravitational field in an accelerating frame takes a
nonzero constant value at infinity. Real and apparent gravity can be distinguished by tracking the
relative acceleration of nearby local inertial observers that appears due to the non-homogeneity of the
gravitational field!
In an inertial frame the equations of motion for the particles are given by the usual Newtonian
expressions, namely
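$$ \frac{d^2 x^i}{dt^2} = -\delta^{ik} \frac{\partial \Phi(x^j)}{\partial x^k} , \qquad (5.1) $$
$$ \frac{d^2 (x^i + \xi^i)}{dt^2} = -\delta^{ik} \frac{\partial \Phi(x^j + \xi^j)}{\partial x^k} , \qquad (5.2) $$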
with ξ i the separation vector between the two particles. For sufficiently small separations Eq. (5.2)
can be Taylor expanded to linear order in ξ i to obtain
The Newtonian deviation equation for the separation vector ξ i becomes therefore
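$$ \frac{d^2 \xi^i}{dt^2} = -\delta^{ik} \frac{\partial^2 \Phi}{\partial x^k \partial x^j}\, \xi^j . \qquad (5.4) $$
The non-relativistic tidal tensor
$$ E^i{}_j \equiv \delta^{ik} \frac{\partial^2 \Phi}{\partial x^k \partial x^j} , \qquad (5.5) $$
determines the tidal forces, which tend to bring the particles together. This is the fundamental object
for the description of gravity and not their individual accelerations $g_i = -\partial_i \Phi$!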
Exercise
Assume the tidal tensor E i j to be reduced to diagonal form, as in the example below. Show
that the components of that tensor cannot all have the same sign.
As a particular example, that will be useful in the future, consider two particles in the gravitational
field of a spherically symmetric distribution of mass M, i.e. Φ = −GM/r. The tidal tensor (5.5) in
this case becomes
$$ E_{ij} = \left( \delta_{ij} - 3 n_i n_j \right) \frac{GM}{r^3} , \qquad (5.6) $$
where $n_i \equiv x_i / r$ are the components of the unit vector in the radial direction. Writing explicitly the
different components in spherical polar coordinates we obtain
$$ \frac{d^2 \xi^r}{dt^2} = +\frac{2GM}{r^3}\, \xi^r , \qquad \frac{d^2 \xi^\theta}{dt^2} = -\frac{GM}{r^3}\, \xi^\theta , \qquad \frac{d^2 \xi^\phi}{dt^2} = -\frac{GM}{r^3}\, \xi^\phi . \qquad (5.7) $$
5.2 Geodesic deviation 66
Note the different signs: the object is stretched in the radial direction and compressed in the transverse
directions. Tidal forces squeeze a sphere into an ellipsoid (cf. Fig.5.1).
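The closed form (5.6) can be cross-checked against the definition (5.5) by differentiating the potential numerically. The sketch below works in Cartesian components and in units with GM = 1; the finite-difference step and the function names are choices made for this illustration.

```python
import math


def potential(x, gm=1.0):
    """Newtonian potential Phi = -G M / r of a point mass at the origin."""
    return -gm / math.sqrt(sum(c * c for c in x))


def tidal_tensor_numeric(x, gm=1.0, h=1e-3):
    """E_ij = d^2 Phi / dx^i dx^j by central differences (Eq. (5.5), Cartesian indices)."""
    n = len(x)

    def shifted(di, dj, i, j):
        y = list(x)
        y[i] += di
        y[j] += dj
        return potential(y, gm)

    return [[(shifted(h, h, i, j) - shifted(h, -h, i, j)
              - shifted(-h, h, i, j) + shifted(-h, -h, i, j)) / (4 * h * h)
             for j in range(n)] for i in range(n)]


def tidal_tensor_analytic(x, gm=1.0):
    """Closed form of Eq. (5.6): E_ij = (delta_ij - 3 n_i n_j) G M / r^3."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    return [[gm / r**3 * ((1.0 if i == j else 0.0) - 3.0 * n[i] * n[j])
             for j in range(3)] for i in range(3)]


point = [2.0, 0.0, 0.0]          # a point on the x axis, so x is the radial direction
E = tidal_tensor_analytic(point)
print(-E[0][0], -E[1][1])        # coefficients of xi in Eq. (5.4): radial, transverse
```

For this point the printed coefficients are +2GM/r³ = 0.25 and −GM/r³ = −0.125, matching the signs in (5.7).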
Exercise
Assuming the water in the oceans to be in static equilibrium and taking into account the results
of the previous example, estimate the height of the tides generated by the Moon.
Using the tidal tensor (5.5) we can write the equations governing the structure of Newtonian gravity
in the following suggestive way
where the symbol [j, l] stands for antisymmetrization in the corresponding indices, i.e.
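$$ E^i{}_{[j,l]} \equiv \frac{1}{2} \left( E^i{}_{j,l} - E^i{}_{l,j} \right) . \qquad (5.11) $$
$$ \frac{D^2 v^\mu}{d\sigma^2} = u^\sigma \nabla_\sigma \left( u^\rho \nabla_\rho v^\mu \right) . \qquad (5.12) $$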
The right hand-side of this equation should contain the information about the true gravitational field.
Using the relation1
$$ v^\rho \nabla_\rho u^\mu = u^\rho \nabla_\rho v^\mu , \qquad (5.13) $$
1 It follows directly from the definition of the covariant derivatives and the relation $\partial u^\mu / \partial \lambda = \partial v^\mu / \partial \sigma$.
vanishes for a symmetric connection $\Gamma^\kappa{}_{\rho\sigma} = \Gamma^\kappa{}_{\sigma\rho}$, like the metric connection we are working
with (cf. Eq. (4.62)). Taking this into account, let me compute the quantity
The final result has important consequences. In particular, it tells us that $[\nabla_\sigma, \nabla_\rho]\, u^\mu$ cannot
depend on the derivatives of $u^\rho$ because in that case it would also have to depend on the
derivatives of the scalar field φ. As the dependence on the vector $u^\mu$ is linear, we are left with
an expression of the form
$$ [\nabla_\sigma, \nabla_\rho]\, u^\mu = R^\mu{}_{\nu\sigma\rho}\, u^\nu , \qquad (5.19) $$
with $R^\mu{}_{\nu\rho\sigma}$ some unknown coefficients. Although the particular combination of connections
inside these coefficients cannot be determined without performing the full computation, it is
nice to have an idea of the final result before computing it, right?
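Let us start the explicit computation of the commutator $[\nabla_\sigma, \nabla_\rho]\, u^\mu$ from the definition of the covariant
derivative
$$ \nabla_\rho u^\mu = \partial_\rho u^\mu + \Gamma^\mu{}_{\kappa\rho}\, u^\kappa . \qquad (5.20) $$
Differentiating with respect to $x^\sigma$ we obtain
$$ \begin{aligned}
\nabla_\sigma \nabla_\rho u^\mu &= \partial_\sigma \left( \nabla_\rho u^\mu \right) + \Gamma^\mu{}_{\lambda\sigma} \nabla_\rho u^\lambda - \Gamma^\kappa{}_{\rho\sigma} \nabla_\kappa u^\mu \qquad (5.21) \\
&= \partial_\sigma \partial_\rho u^\mu + \partial_\sigma \left( \Gamma^\mu{}_{\kappa\rho}\, u^\kappa \right) + \Gamma^\mu{}_{\lambda\sigma} \left( \partial_\rho u^\lambda + \Gamma^\lambda{}_{\kappa\rho}\, u^\kappa \right) - \Gamma^\kappa{}_{\rho\sigma} \left( \partial_\kappa u^\mu + \Gamma^\mu{}_{\lambda\kappa}\, u^\lambda \right) ,
\end{aligned} $$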
Ambiguities
Note that the non-commutation of covariant derivatives gives rise to some ambiguities in the
minimal coupling prescription (comma-goes-to-semicolon) introduced in the previous Chapter.
To illustrate this, consider for instance a physical law which in an inertial frame takes the form
$$ U^\mu \partial_\mu \partial_\nu V^\nu = U^\mu \partial_\nu \partial_\mu V^\nu = 0 , \qquad (5.23) $$
with $U^\mu$ and $V^\nu$ some vector fields. What should be the covariant generalization of this law?
Should we write something like
$$ U^\mu \nabla_\mu \nabla_\nu V^\nu = 0 , \qquad (5.24) $$
or rather something like
$$ U^\mu \nabla_\nu \nabla_\mu V^\nu = 0 \, ? \qquad (5.25) $$
According to (5.22), these two equations are not equal; they differ by a term proportional to
$R^\mu{}_{\nu\rho\sigma}$, which is not necessarily zero. The comma-goes-to-semicolon prescription is ambiguous.
This is reminiscent of the problem of ordering operators in quantum mechanics: the minimal
prescription does not say anything about how to order the operators. The correct way of
adapting the laws of physics to spaces with non-vanishing $R^\mu{}_{\nu\rho\sigma}$ can be only determined by
experiments.
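The $n^4$ quantities
$$ R^\mu{}_{\nu\rho\sigma} \equiv \partial_\rho \Gamma^\mu{}_{\nu\sigma} - \partial_\sigma \Gamma^\mu{}_{\nu\rho} + \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\sigma} - \Gamma^\mu{}_{\kappa\sigma} \Gamma^\kappa{}_{\nu\rho} \qquad (5.26) $$
are the components of a tensor, as can be easily seen by applying the quotient theorem2 to Eq. (5.22).
This tensor is called the curvature or Riemann tensor and it is defined in terms of the metric and its
first and second derivatives.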
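To see definition (5.26) at work, the sketch below evaluates it numerically for the unit 2-sphere with metric $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$. The sphere is not an example taken from the text; it is chosen here only as a simple space with non-vanishing curvature. The connection is obtained from (4.62) by finite differences, and all function names and step sizes are assumptions of this illustration.

```python
import math


def sphere_metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi) (illustrative choice)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def christoffel(metric, x, h=1e-5):
    """gam[mu][nu][rho] = Gamma^mu_{nu rho} from Eq. (4.62), two dimensions only."""
    n = len(x)
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    dg = []  # dg[k][a][b] = d_k g_ab
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)])
    return [[[0.5 * sum(ginv[mu][s] * (dg[nu][s][rho] + dg[rho][nu][s] - dg[s][nu][rho])
                        for s in range(n))
              for rho in range(n)] for nu in range(n)] for mu in range(n)]


def riemann(metric, x, h=1e-4):
    """R[mu][nu][rho][sigma] = R^mu_{nu rho sigma} from Eq. (5.26)."""
    n = len(x)
    gam = christoffel(metric, x)
    dgam = []  # dgam[k][mu][nu][rho] = d_k Gamma^mu_{nu rho}
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        cp, cm = christoffel(metric, xp), christoffel(metric, xm)
        dgam.append([[[(cp[m][a][b] - cm[m][a][b]) / (2 * h) for b in range(n)]
                      for a in range(n)] for m in range(n)])
    return [[[[dgam[r][m][nu][s] - dgam[s][m][nu][r]
               + sum(gam[m][k][r] * gam[k][nu][s] - gam[m][k][s] * gam[k][nu][r]
                     for k in range(n))
               for s in range(n)] for r in range(n)] for nu in range(n)] for m in range(n)]


theta0 = 1.0
R = riemann(sphere_metric, [theta0, 0.5])
print(R[0][1][0][1], math.sin(theta0) ** 2)  # R^theta_{phi theta phi} vs sin^2(theta)
```

With the sign conventions of (5.26), the component $R^\theta{}_{\varphi\theta\varphi}$ should agree with $\sin^2\theta$ up to finite-difference error.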
Exercise:
• What is the value of $R^\mu{}_{\nu\rho\sigma}$ for a 2-dimensional Euclidean metric written in Cartesian
coordinates? And if the metric is written in polar coordinates?
• Derive the action of the commutator of two covariant derivatives on a covariant vector.
Hint: This should be a fast exercise. Remember the metric compatibility.
• Use the previous result to determine the action of the commutator of covariant derivatives
on an arbitrary rank-(r, s) tensor.
Substituting (5.22) into Eq. (5.16) we obtain the so-called geodesic deviation equation
$$ \frac{D^2 v^\mu}{d\sigma^2} = -R^\mu{}_{\nu\rho\sigma}\, u^\nu u^\sigma v^\rho . \qquad (5.27) $$
The term in the right-hand side is the sought-after effect of gravity that cannot be removed by going
to a free falling frame: the tidal acceleration. In the non-relativistic limit, the intrinsic derivative on
2 cf. property 3 in Section 1.4.4
5.3 Flat versus curved: A dirty and quick introduction to curvature. 69
Figure 5.3: First appearance of the Riemann tensor in Einstein’s Zurich notebooks. The Riemann
tensor is written in the old-fashioned notation (ik, lm). According to some urban legends, Einstein
learned the methods of Ricci and Levi-Civita through his school friend Marcel Grossmann. It was
Grossmann who went to the library searching for methods to deal with arbitrary coordinate
systems and discovered Ricci and Levi-Civita’s 1901 paper. The annotation “Grossmann tensor
fourth rank” that you can find on the right-hand side of the formula suggests indeed that Grossmann
conveyed the Riemann tensor formula to Einstein.
the left-hand side becomes d²/dt² and uµ ≈ δ µ 0 , in such a way that
\[ \frac{d^2 v^\mu}{dt^2} = -R^\mu{}_{0\rho 0}\, v^\rho . \qquad (5.28) \]
Taking into account Eqs. (5.4) and (5.5) we can identify Rµ 0ρ0 with the non-relativistic tidal tensor³
\[ E^i{}_j = R^i{}_{0j0} . \qquad (5.30) \]
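As a numerical illustration of the tidal tensor, the sketch below assumes that Eqs. (5.4) and (5.5), which are not reproduced in this section, define it as the matrix of second derivatives of the Newtonian potential, E_ij = ∂i ∂j Φ, and it takes Φ = −GM/r for a point mass. The function names, the value GM = 1 and the finite-difference step are illustrative choices, not part of the notes.

```python
import math

def potential(x, y, z, gm=1.0):
    """Newtonian potential of a point mass at the origin (assumption: G*M = gm)."""
    return -gm / math.sqrt(x * x + y * y + z * z)

def tidal_tensor(point, gm=1.0, h=1e-3):
    """Central-difference estimate of E_ij = d^2 Phi / dx^i dx^j at `point`."""
    def phi(p):
        return potential(p[0], p[1], p[2], gm)

    tensor = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            total = 0.0
            # Four-point stencil; for i == j it reduces to a second
            # difference with step 2h.
            for si in (1, -1):
                for sj in (1, -1):
                    p = list(point)
                    p[i] += si * h
                    p[j] += sj * h
                    total += si * sj * phi(p)
            tensor[i][j] = total / (4 * h * h)
    return tensor

E = tidal_tensor((2.0, 0.0, 0.0))
print([round(E[i][i], 4) for i in range(3)])
```

With the sign of Eq. (5.28), the negative radial component corresponds to stretching along the line joining the observer and the mass, the two positive transverse components to squeezing, and the trace vanishes outside the source.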
Exercise:
Compute the Christoffel symbols and the curvature tensor to the lowest order for the line element
3-dimensional Euclidean space4 . At any given point P on the 2-dimensional surface, we can introduce
a tangent plane with Cartesian coordinates (X1 , X2 ) (cf. Fig. 5.4). This Euclidean space is called the
tangent space to the surface at P . The deviation z(X1 , X2 ) of the curved surface from the tangent
plane describes the local properties of our geometry. Since curvature effects arise only through the
second derivatives of z(X1 , X2 ), it is convenient to use a quadratic function
\[ z(X_1, X_2) = \frac{1}{2} X^T M X , \qquad (5.32) \]
with
\[ M = \begin{pmatrix} a & c \\ c & b \end{pmatrix} , \qquad X^T \equiv (X_1, X_2) , \qquad (5.33) \]
and a, b and c quantities with dimensions of inverse length⁵. Eq. (5.32) can be recast in a diagonal form
by rotating the coordinates, X̄ = RX, and accordingly transforming the matrix M , M̄ = R M R−1 .
In the new coordinate basis (ξ, η), we obtain
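\[ z(\xi, \eta) = \frac{1}{2}\left(\kappa_1 \xi^2 + \kappa_2 \eta^2\right) \equiv \frac{1}{2}\left(\frac{\xi^2}{\rho_1} + \frac{\eta^2}{\rho_2}\right) , \qquad (5.34) \]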
where we have defined the so-called principal curvatures κ1 and κ2 and the principal radii of curvature
ρ1 and ρ2 .
The result is quite intuitive. It simply states that any surface is locally the sum of two parabolas
in the ξ and η directions and with radius of curvature ρ1 and ρ2 respectively (cf. Fig. 5.4).
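The diagonalization can be checked numerically. The following sketch computes the principal curvatures as the eigenvalues of the symmetric matrix M of Eq. (5.33) and verifies that they do not change when M is rotated. The function names and the sample values of a, b and c are hypothetical, chosen only for illustration.

```python
import math

def principal_curvatures(a, b, c):
    """Eigenvalues (kappa1 >= kappa2) of the symmetric matrix M = [[a, c], [c, b]]."""
    mean = 0.5 * (a + b)
    radius = math.hypot(0.5 * (a - b), c)
    return mean + radius, mean - radius

def rotate(a, b, c, angle):
    """Entries (a', b', c') of R M R^T for R = [[cos, sin], [-sin, cos]]."""
    cs, sn = math.cos(angle), math.sin(angle)
    a_new = a * cs * cs + 2 * c * cs * sn + b * sn * sn
    b_new = a * sn * sn - 2 * c * cs * sn + b * cs * cs
    c_new = (b - a) * cs * sn + c * (cs * cs - sn * sn)
    return a_new, b_new, c_new

k1, k2 = principal_curvatures(2.0, 1.0, 0.5)
print(k1, k2, k1 * k2, k1 + k2)
print(principal_curvatures(*rotate(2.0, 1.0, 0.5, 0.7)))
```

The product and the sum of the two eigenvalues are the determinant and the trace of M, which is why they are the natural rotation-independent quantities to attach to the surface at P.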
4 We do this just for visualization purposes; that is why I said that my introduction is somewhat dirty. There is
no need to choose a particular embedding for studying the geometry of the surface; the geometry can be completely
determined by measuring angles and distances on the surface. This is indeed a theorem, known as Gauss’ Egregium
Theorem. In the words of Gauss himself, it reads
Thus the formula of the preceding article leads itself to the remarkable Theorem. If a curved
surface is developed upon any other surface whatever, the measure of curvature in each point
remains unchanged.
5 A local region is defined for values of X1 and X2 much smaller than a−1 , b−1 , c−1 .
Figure 5.5: A clever ant determining the curvature of a sphere via the Bertrand-Diquet-Puiseux
formula.
Exercise
Expand a circle of radius ρ around some point. Comment on the result.
The square of the distance between two nearby points with coordinates⁶ (ξ, η) and (ξ + dξ, η + dη) is
given by
\[ ds^2 = d\xi^2 + d\eta^2 + dz^2 = \left(\kappa_1 \xi\, d\xi + \kappa_2 \eta\, d\eta\right)^2 + d\xi^2 + d\eta^2 \equiv \gamma_{\mu\nu}\, dx^\mu dx^\nu . \qquad (5.35) \]
Since the measure of the surface curvature cannot depend on the set of coordinates used, it must be
related to the basis-independent attributes of the matrix M . These attributes are its eigenvalues, or
equivalently, its determinant and trace. The determinant K = det M = κ1 κ2 is called intrinsic or
Gaussian curvature and can be expressed entirely in terms of intrinsic measurements on the surface,
without any reference to the external embedding space. Starting from a point P on the surface and
proceeding along a geodesic on the surface for a proper distance ε, we arrive at a point Q1 . Repeating
this process with geodesics starting off in different directions, we obtain a set of points Q1 , Q2 , . . ., all
of them sitting on the circumference C(ε) of a geodesic disc centered at P (cf. Fig. 8.6). A simple
computation using the metric (5.35) shows that the quantity⁷
\[ \lim_{\epsilon\to 0^+} \frac{3}{\pi\epsilon^3}\left(2\pi\epsilon - C(\epsilon)\right) , \qquad (5.37) \]
measuring the difference between the circumference C(ε) of our geodesic disc and a circumference in
the plane, corresponds precisely to the value of the Gaussian curvature K at P
\[ K = \kappa_1 \kappa_2 = \frac{1}{\rho_1 \rho_2} = \lim_{\epsilon\to 0^+} \frac{3}{\pi\epsilon^3}\left(2\pi\epsilon - C(\epsilon)\right) . \qquad (5.38) \]
This expression, relating the Gaussian curvature of a surface to the circumference of a geodesic circle,
is known as the Bertrand-Diquet-Puiseux formula, and is closely related to the Gauss-Bonnet theorem
that we will discuss below. Spaces with K = 0 everywhere are said to be flat or developable, since
they can be “developed” or flattened out into a plane without stretching or tearing them (cf. Fig.
6 Note that although M is diagonal, the metric is not.
7 There is neither an absolute scale for Gaussian curvature nor a unique choice of the normalization factor 3/(πε³)
appearing in Eq. (5.37). People have just agreed on the convention that the curvature of the unit sphere should be
equal to 1 (although there are some natural motivations for it). For a small geodesic disc of radius ε on the unit sphere
we have
\[ C(\epsilon) \sim 2\pi\left(\epsilon - \frac{1}{6}\epsilon^3\right) , \qquad (5.36) \]
which explains the proportionality factor 3/(πε³).
Figure 5.7: A plane sheet of paper (κ1 = κ2 = 0) rolled in the form of a cylinder of radius r (κ1 = 1/r
and κ2 = 0). The extrinsic curvature changes from 0 to κ1 + κ2 = 1/r.
5.4 Parallel transport around a closed path 73
5.7). On the other hand, spaces with K > 0 everywhere are said to be positively curved, while spaces
with K < 0 everywhere are said to be negatively curved or saddle like. For someone living on a given
point of a space embedded in a higher dimensional space, the curvature at that point will be positive
if the space curves away in the same way in any direction, while it will be negative if the space curves
away in a different way when moving in different directions (cf. Fig. 5.6).
and take P to be the origin. The distance from the origin to the point (ε, θ) is given by
\[ \int_0^{\epsilon} ds = \epsilon . \qquad (5.40) \]
The set of points with coordinates (ε, θ) bounds a disc whose circumference is given by
\[ \int d\theta\, \sin\epsilon = 2\pi \sin\epsilon . \qquad (5.41) \]
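The clever ant's measurement can be mimicked numerically. The sketch below evaluates the Bertrand-Diquet-Puiseux expression (5.37) at a small but finite ε, using the circumference 2π sin ε of Eq. (5.41). The `radius` parameter, which generalizes the unit sphere to a sphere of radius R with C(ε) = 2πR sin(ε/R), is an addition made here for illustration, as are the function names.

```python
import math

def circumference_on_sphere(eps, radius=1.0):
    """Circumference of a geodesic circle of geodesic radius eps on a sphere."""
    return 2 * math.pi * radius * math.sin(eps / radius)

def gaussian_curvature_estimate(circumference, eps):
    """Finite-eps version of Eq. (5.37): 3 (2 pi eps - C(eps)) / (pi eps^3)."""
    return 3.0 * (2 * math.pi * eps - circumference(eps)) / (math.pi * eps ** 3)

# Unit sphere, sphere of radius 2, and the flat plane for comparison.
print(gaussian_curvature_estimate(circumference_on_sphere, 1e-2))
print(gaussian_curvature_estimate(lambda e: circumference_on_sphere(e, 2.0), 1e-2))
print(gaussian_curvature_estimate(lambda e: 2 * math.pi * e, 1e-2))
```

Only the function C(ε), something the ant can measure without leaving the surface, enters the estimate; the embedding never does.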
On the other hand, the extrinsic curvature 8 is defined through the trace of M , namely κ1 + κ2 . The
difference between the two can be easily understood by considering, for instance, a plane sheet of paper
(κ1 = κ2 = 0) rolled in the form of a cylinder of radius r which will look like a curved 2-dimensional
surface embedded in a 3-dimensional Euclidean space (cf. Fig 5.7). For the cylindrical surface we
have κ1 = 1/r and κ2 = 0. The intrinsic curvature retains the value of the flat sheet of paper. On the
other hand, the extrinsic curvature changes from 0 to κ1 + κ2 = 1/r.
Figure 5.8: Parallel transport of a vector around a closed path on the sphere.
with K the Gauss curvature and S the area inside the triangle. To generalize this form of curvature,
note that when the tangent vector at the P Q side is parallel transported from P to Q (cf. Fig. 5.8),
it forms an angle π − β with the tangent vector of the next side of the triangle. The same happens in
the other vertices. This means that if we make a parallel transport around the whole closed path, we
obtain an angle π − β + π − γ + π − α, which, dropping multiples of 2π and fixing the sign
appropriately, is given by α + β + γ − π. The Gauss curvature measures the variation, in relation with the area,
of parallel transported vectors around closed paths.
Note that in both cases, we don’t make any reference to the higher-dimensional space in which
we are embedded.
Although the intuitive reasoning presented above was two-dimensional, it can be easily generalized to
arbitrary dimension. To do that, consider the parallel transport equation
\[ \frac{dv^\mu}{d\sigma} = -\Gamma^\mu{}_{\nu\rho}\, v^\nu \frac{dx^\rho}{d\sigma} \qquad (5.44) \]
and apply it to the case in which v µ is parallel-transported along a small curve C from some initial
point P . The value of the vector at any other point σ along this curve is given by
\[ v^\mu(\sigma) = v^\mu_P - \int_0^\sigma \Gamma^\mu{}_{\nu\rho}\, v^\nu \frac{dx^\rho}{d\sigma}\, d\sigma . \qquad (5.45) \]
Let us assume the loop C to be infinitesimally small. In that case, the quantities in the integrand of
10 The standard presentation of the theory of surfaces is usually based on Gauss’ Egregium Theorem and finishes with
the derivation of the Gauss-Bonnet theorem. This sequence is however not chronological. Gauss deduced the Egregium
Theorem starting from the Gauss-Bonnet theorem.
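with ∆xλ ≡ xλ (σ) − xλP . Plugging these expressions back into (5.45) and retaining only those terms
up to first order in ∆xλ , we obtain
\[ v^\mu(\sigma) = v^\mu_P - \left.\Gamma^\mu{}_{\nu\rho}\right|_P v^\nu_P \int_0^\sigma \frac{dx^\rho}{d\sigma}\, d\sigma - \left.\left(\partial_\lambda \Gamma^\mu{}_{\nu\rho} - \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\lambda}\right)\right|_P v^\nu_P \int_0^\sigma \left(x^\lambda - x^\lambda_P\right) \frac{dx^\rho}{d\sigma}\, d\sigma . \qquad (5.48) \]
The second and the last term (the part associated to xλP ) vanish for a closed path (∮ dxρ = 0). We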
This effect can be written in a more meaningful form by adding the result of interchanging the dummy
indices ρ and λ. Doing this, and taking into account that
\[ \oint d(x^\rho x^\lambda) = \oint \left(x^\rho dx^\lambda + x^\lambda dx^\rho\right) = 0 , \qquad (5.50) \]
we get
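\[ \Delta v^\mu = -\frac{1}{2} \left.\left(\partial_\rho \Gamma^\mu{}_{\nu\lambda} - \partial_\lambda \Gamma^\mu{}_{\nu\rho} + \Gamma^\mu{}_{\kappa\rho} \Gamma^\kappa{}_{\nu\lambda} - \Gamma^\mu{}_{\kappa\lambda} \Gamma^\kappa{}_{\nu\rho}\right)\right|_P v^\nu_P \oint x^\rho dx^\lambda . \qquad (5.51) \]
Denoting by
\[ A^{\rho\lambda} \equiv \oint x^\rho dx^\lambda \qquad (5.52) \]
the total area enclosed by the loop C and taking into account Eq. (5.26), we finally obtain
\[ \Delta v^\mu = -\frac{1}{2} R^\mu{}_{\nu\rho\lambda}\, v^\nu_P A^{\rho\lambda} . \qquad (5.53) \]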
The change of the vector when it moves along a closed path is proportional to the Riemann tensor
and to the area enclosed by the loop11 ! Rµ νρσ is the generalization12 of the Gauss curvature K. The
components of a vector v µ will remain unchanged after parallel transport if and only if the curvature
tensor vanishes. If that happens, the spacetime is actually flat. Any apparent dependence of the
metric on the coordinates will be just an illusion due to the use of some weird coordinate system and
11 Note that although our derivation was performed under the assumption of having an infinitesimal loop, it can be
easily extended to larger closed curves. A given surface A bounded by a curve C can be understood as the sum of many
small areas bounded by closed curves CN . Since the changes in ∆v µ around any of the interior curves cancel and only
the outer edges contribute, we can express the change in the components v µ along C as the sum of the changes around
the small curves, namely
\[ \Delta v^\mu = \sum_N (\Delta v^\mu)_N . \qquad (5.54) \]
12 Indeed the geodesic deviation equation (5.27) is nothing else than the generalization of the Jacobi equation
\[ \frac{d^2 y}{d\sigma^2} + K y = 0 \qquad (5.55) \]
between two geodesics in a two-dimensional surface.
5.5 Properties of the Riemann tensor 76
Figure 5.9: Einstein’s manipulations of the Riemann tensor (Zurich notebook). The computation is
abandoned, “zu umstaendlich” (too involved).
we will be able to find a global coordinate system in which the metric takes a Cartesian form.
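The relation between parallel transport and curvature can be tested on the unit sphere. The sketch below integrates the parallel transport equation (5.44) around a circle of constant colatitude θ0 and compares the resulting rotation of the vector with K times the enclosed area, with K = 1. It assumes the standard Christoffel symbols of the sphere, Γθ φφ = −sin θ cos θ and Γφ θφ = cot θ, which are not derived in this section, and it uses a fourth-order Runge-Kutta integrator chosen here purely for convenience.

```python
import math

def transport_around_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel-transport (v^theta, v^phi) once around the circle theta = theta0."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        # Eq. (5.44) with phi as the curve parameter, so dx^phi/dsigma = 1:
        # dv^theta/dphi = sin*cos * v^phi,  dv^phi/dphi = -cot * v^theta.
        return (s * c * v[1], -(c / s) * v[0])

    h = 2 * math.pi / steps
    v = (v_theta, v_phi)
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return v

def rotation_angle(theta0, steps=2000):
    """Angle (mod 2 pi) between the initial vector e_theta and its transported copy."""
    v_theta, v_phi = transport_around_latitude(theta0, 1.0, 0.0, steps)
    # Components in the orthonormal frame (e_theta, e_phi / sin(theta)).
    return math.atan2(math.sin(theta0) * v_phi, v_theta) % (2 * math.pi)

theta0 = 1.0
cap_area = 2 * math.pi * (1 - math.cos(theta0))  # area of the enclosed polar cap
print(rotation_angle(theta0), cap_area)
```

The two printed numbers agree to the accuracy of the integrator, even though this loop is far from infinitesimal; on a surface of constant curvature the small-loop contributions simply add up.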
Exercise:
Determine the Gauss curvature of a spherical surface of radius R through the Gauss-Bonnet
theorem. Hint: Apply it, for instance, to the triangle determined by the 1/8 part of the sphere.
• Symmetry: The Riemann tensor Rρσµν is symmetric under the interchange of the first pair of
indices with the second pair of indices
• Antisymmetry: The Riemann tensor Rρσµν is antisymmetric under the interchange of either
the first two indices or the second two indices
This is a direct consequence of the definition of the Riemann tensor ( the operator [∇σ , ∇ρ ] is
antisymmetric) and the metric compatibility
• 1st Bianchi identity: The cyclic sum of the last three indices is zero
This can be easily understood by applying the operator [∇ρ , ∇σ ] to the gradient ∇ν φ of a scalar
field. For any scalar ∇[ρ ∇σ ∇ν] φ = 0, which implies
Rκ [νρσ] ∇κ φ = 0 . (5.61)
Since the resulting expression is valid for all gradients, Eq. (5.60) follows immediately. Note
that the result is non-trivial only when the three indices νρσ are different. When two of these
indices are equal one of the terms drops and the remaining terms just express the antisymmetry
in the last two indices of the curvature tensor.
• 2nd Bianchi identity: The Riemann tensor satisfies the differential identity13
Exercise
Prove Eq. (5.63) Hint: Use a local inertial frame.
• Ricci tensor and Ricci scalar: There are two important contractions of the Riemann tensor14 .
The first one is a second rank tensor obtained from contracting a pair of indices. Since Rµνρσ is
antisymmetric in µν and ρσ, the only non-trivial contraction is between µ and ρ or between µ
and σ. These two contractions differ only by a change of sign. Taking the first contraction, we
obtain the so-called Ricci tensor
which is symmetric, as can be easily seen by taking into account the relation (4.72)
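\[ \partial_\sigma \Gamma^\mu{}_{\nu\mu} = \partial_\sigma\!\left(\frac{1}{\sqrt{|g|}}\, \partial_\nu \sqrt{|g|}\right) = -\frac{1}{|g|}\, \partial_\sigma \sqrt{|g|}\; \partial_\nu \sqrt{|g|} + \frac{1}{\sqrt{|g|}}\, \partial_\nu \partial_\sigma \sqrt{|g|} . \qquad (5.65) \]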
That’s all. There are no more non-vanishing contractions. The result (5.66) is quite remarkable.
Among the 20 independent components of the Riemann tensor that transform into linear com-
binations of each other under general coordinate transformations, there is one which remains
unchanged. R is the only scalar involving the metric and two derivatives.
Exercise:
• Among the different ways of constructing a scalar from the Riemann tensor discussed
above, why did I not discuss the contraction ε^{µνρσ} Rµνρσ ?
14 We will only discuss the contractions at the lowest order in the curvature tensor. Higher order contractions such as
R2 , Rµν Rµν or the square of the Riemann tensor, the so-called Kretschmann scalar Rµνρσ Rµνρσ , will be introduced
in due time.
5.6 Independent components of the Riemann tensor 78
• Contracted Bianchi identities: Note the important result that follows from the Bianchi
identity (5.63) and the definition of the Ricci scalar. Contracting the indices µρ in (5.63) we get
where we have made use of the antisymmetry property (5.58). Multiplying by the metric g νσ ,
contracting the indices ν and σ and taking into account that ∇ρ Rρσ σκ = −∇ρ Rσρ σκ = −∇ρ Rρ κ ,
Eq. (5.67) becomes
\[ \nabla_\kappa R - \nabla_\sigma R^\sigma{}_\kappa - \nabla_\rho R^\rho{}_\kappa = 0 , \qquad (5.68) \]
which, comparing with the definition (5.70) of the Einstein tensor , can be written as
Gµν uµ uν measures the local scalar curvature of the spatially projected curvature tensor.
A final warning
There are several sign conventions involved in the definition of the Riemann tensor and its
contractions. Be careful when taking results from different books or articles. Our convention
is that of Misner, Thorne and Wheeler. A very useful reference sheet taken precisely from this
book can be found in the Moodle.
the expression of a symmetric m × m matrix17 RAB = RBA with indices A = {µν} and B = {ρσ}.
This matrix has m(m + 1)/2 independent components. The value of m is determined by the number
of choices that we have for A and B, which, taking into account Eq. (5.58), have the same content as
an n × n antisymmetric matrix. We have therefore m = n(n − 1)/2 possible choices of A and B. The
total number of components so far is
\[ \frac{m(m+1)}{2} = \frac{1}{2}\, \frac{n(n-1)}{2} \left[\frac{n(n-1)}{2} + 1\right] = \frac{n^4 - 2n^3 + 3n^2 - 2n}{8} , \qquad (5.74) \]
but we still have to subtract the constraints imposed by Eq. (5.60). To determine the number of extra
constraints, notice that if one sets any two indices equal (for instance µ = ν) we get identically
zero (one term goes away by antisymmetry and the other two cancel). Only if the 4 indices are
different do we get a constraint. The number of independent constraints is the same as the number of
combinations of 4 objects that can be chosen from n objects
\[ \binom{n}{4} = \frac{n!}{4!\,(n-4)!} = \frac{n(n-1)(n-2)(n-3)}{24} . \qquad (5.75) \]
The final number of independent components of the Riemann tensor becomes
\[ C_R = \frac{m(m+1)}{2} - \frac{n!}{4!\,(n-4)!} = \frac{n^2(n^2-1)}{12} . \qquad (5.76) \]
Evaluating this for different dimensions we get
Number of dimensions                   1    2    3    4    5
Total components of Rµ νρσ             1   16   81  256  625
Independent components of Rµ νρσ       0    1    6   20   50
The number of independent components in 4 dimensions has been reduced from 256 to 20! The fact
that the number is still quite large is reasonable, since we need a lot of numbers to specify how the
space curves in many different directions. As we will see in the next Section, these are precisely the
degrees of freedom in the second derivatives of the metric that we cannot set to zero by performing a
change of coordinates.
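The counting leading to Eq. (5.76) is easy to reproduce. The sketch below follows the same three steps (antisymmetric index pairs, symmetric matrix of pairs, cyclic constraints) and compares the result with the closed formula n²(n² − 1)/12; the function name is just a label chosen here.

```python
from math import comb

def riemann_independent_components(n):
    """Independent components of the Riemann tensor in n dimensions."""
    m = n * (n - 1) // 2                 # antisymmetric index pairs A = {mu nu}
    symmetric_pairs = m * (m + 1) // 2   # symmetric m x m matrix R_AB, Eq. (5.74)
    cyclic_constraints = comb(n, 4)      # one constraint per set of 4 distinct indices, Eq. (5.75)
    return symmetric_pairs - cyclic_constraints

for n in range(1, 6):
    print(n, n ** 4, riemann_independent_components(n), n * n * (n * n - 1) // 12)
```

Note that `comb(n, 4)` is zero for n < 4, so the cyclic identity only starts removing components in four dimensions.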
Exercise
• In one dimension the Riemann tensor is always identically zero. Explain why.
Hint: Remember the geometrical interpretation of the Riemann tensor.
• How many components do the Ricci tensor and the Ricci scalar have in 2, 3 and 4 dimensions?
And the Einstein tensor? Is there any dimension in which the Riemann and the Ricci
tensors have the same number of independent components?
in which gravity can be transformed away. So, one of the things that we would like to verify is that
this kind of coordinate system exists in the context of Riemannian geometry, i.e., whether we can always
introduce a freely falling frame (5.77) at an arbitrary point for an arbitrary metric gµν . For doing
that, consider a coordinate transformation from the coordinates xµ to some coordinates ξ α in the
neighborhood of some point P . Performing a Taylor expansion around P , we get
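\[ \xi^\alpha(x) = \xi^\alpha(P) + A^\alpha{}_\mu \Delta x^\mu + B^\alpha{}_{\mu\nu} \Delta x^\mu \Delta x^\nu + D^\alpha{}_{\mu\nu\rho} \Delta x^\mu \Delta x^\nu \Delta x^\rho + \ldots , \qquad (5.78) \]
with ∆xµ ≡ xµ − P µ ¹⁸ and
\[ A^\alpha{}_\mu = \left.\frac{\partial \xi^\alpha}{\partial x^\mu}\right|_P , \qquad B^\alpha{}_{\mu\nu} = \frac{1}{2} \left.\frac{\partial^2 \xi^\alpha}{\partial x^\mu \partial x^\nu}\right|_P , \qquad D^\alpha{}_{\mu\nu\rho} = \frac{1}{6} \left.\frac{\partial^3 \xi^\alpha}{\partial x^\mu \partial x^\nu \partial x^\rho}\right|_P . \qquad (5.79) \]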
Let us see if we can generically choose the values of the coefficients Aα µ , B α µν , Dα µνρ , . . . in such a way
that the conditions (5.77) are satisfied¹⁹. In four dimensions, the matrix Aα µ has 4² = 16 independent
components. Since we need only 10 conditions to impose gµν (P ) = ηµν , we are left with 6 components
to spare, precisely the number of Lorentz boosts and rotations that we can make without
modifying the Minkowski form ηµν of the metric! The requirement ∂σ gµν (P ) = 0 gives rise
to 4 × 4(4 + 1)/2 = 40 conditions, which is precisely the number of components of the symmetric
quantity B α µν . We have just proven that one can always choose coordinates in such a way that
the metric reduces to the inertial form (5.77) in an infinitesimal region around a point P . In the
mathematical literature, this is known as the local flatness theorem.
But what happens with the other coefficients? Can we also set the second derivatives of
the metric to zero by simply performing coordinate transformations? The answer is no. The second
derivatives of the metric, ∂σ ∂ρ gµν , have 10 × 10 = 100 independent components, while Dα µνρ has only
4² × (5 × 6)/6 = 80 components. This means that among the 100 components of the metric second
derivatives only 80 can be set to zero at P via coordinate transformations. The remaining
100 − 80 = 20 components are precisely as many as the
independent components of the Riemann tensor in 4 dimensions! Indeed, it is not difficult to prove
that, at quadratic order in the coordinates, we can write
\[ g_{\mu\nu} = \eta_{\mu\nu} - \frac{1}{3} \left(R_{\mu\rho\nu\sigma} + R_{\nu\rho\mu\sigma}\right) \Delta x^\rho \Delta x^\sigma . \qquad (5.80) \]
The second derivatives of the metric (or, if you prefer, the first derivatives of the Christoffel symbols)
encode the information about the true gravitational field Rµνρσ ! A freely falling observer can pretend
that he/she is not in the presence of a gravitational field, but the tidal forces cannot be eliminated!
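The bookkeeping of the last two paragraphs can be collected in a few lines. The sketch below counts the Taylor coefficients of Eq. (5.78) against the conditions on the metric and its derivatives; the function and the dictionary keys are hypothetical names introduced here, and the dimension is left as a parameter with the four-dimensional case of the text as default.

```python
def local_flatness_count(n=4):
    """Compare coordinate freedom with metric conditions at a point, in n dimensions."""
    metric = n * (n + 1) // 2                    # independent components of g_{mu nu}
    a_coeffs = n * n                             # A^alpha_mu
    b_coeffs = n * metric                        # B^alpha_{mu nu}, symmetric in mu nu
    d_coeffs = n * (n * (n + 1) * (n + 2) // 6)  # D^alpha_{mu nu rho}, totally symmetric
    first_derivs = n * metric                    # conditions d_sigma g_{mu nu} = 0
    second_derivs = metric * metric              # components of d_sigma d_rho g_{mu nu}
    return {
        "A_left_over": a_coeffs - metric,
        "B_coefficients": b_coeffs,
        "first_derivative_conditions": first_derivs,
        "D_coefficients": d_coeffs,
        "second_derivatives": second_derivs,
        "second_derivatives_left_over": second_derivs - d_coeffs,
    }

print(local_flatness_count(4))
```

For n = 4 the output reproduces the numbers quoted above: 6 unused components of A, 40 against 40 at first order, and 100 − 80 = 20 second derivatives that no change of coordinates can remove.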
Exercise
Repeat this exercise in arbitrary dimensions. What happens?
which, taking into account the symmetry in the indices ν and σ, leaves us with 20−10 = 10 independent
components, which together with the 10 − 1 = 9 independent components of the trace free part of the
Ricci tensor Sµν , and the single component of the curvature scalar R, makes the 20 components of
the Riemann tensor. Note that no new quantities can be obtained by contracting the indices of the
above irreducible components.
An important property of the Weyl tensor is its behaviour under conformal transformations. A
conformal transformation can be understood as a local dilatation, in which the line element changes
from ds2 to Ω2 (x)ds2 , with Ω2 (x) an arbitrary and non-vanishing function called conformal factor 20 .
Through a trivial, but quite involved computation, one can verify that when we perform one of these
conformal transformations
gµν −→ Ω2 (x)gµν , (5.85)
the totally covariant Weyl tensor transforms accordingly
and therefore²¹ C µ νρσ is conformally invariant²². This has an interesting consequence: in those cases
in which the metric can be written as the result of the conformal transformation of a flat spacetime,
gµν = f (x)δµν or gµν = f (x)ηµν , the Weyl tensor is zero and the Riemann tensor can be entirely
expressed in terms of the Ricci tensor Rµν and the scalar of curvature R.
Exercise
Prove that the Weyl tensor (5.82) is indeed traceless.
depending only in the metric and respecting the symmetries of the Riemann tensor23
Contracting the previous expression to obtain the Ricci scalar in the left-hand side we get
which allows us to identify the unknown factor A in Eq. (5.88) and write the fully covariant expression²⁴
\[ R_{\mu\nu\rho\sigma} = K \left(g_{\mu\rho}\, g_{\nu\sigma} - g_{\mu\sigma}\, g_{\nu\rho}\right) , \qquad (5.91) \]
where we have defined the Gaussian curvature as K = R/2.
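The relation K = R/2 can be checked numerically for the sphere. The sketch below builds the Christoffel symbols from the metric by central finite differences, inserts them into the index pattern of Eq. (5.26), and contracts to the Ricci scalar, taking the Ricci tensor to be the contraction of the first and third indices. The metric functions, helper names and step size are illustrative assumptions, and the finite differences are a shortcut used here instead of the analytic computation.

```python
import math

def sphere_metric(x, radius=1.0):
    """g_ab in coordinates x = (theta, phi) for a sphere of the given radius."""
    return [[radius ** 2, 0.0], [0.0, (radius * math.sin(x[0])) ** 2]]

def polar_plane_metric(x):
    """Flat plane in polar coordinates x = (r, phi)."""
    return [[1.0, 0.0], [0.0, x[0] ** 2]]

def _shift(x, k, delta):
    y = list(x)
    y[k] += delta
    return y

def _inverse(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, x, h=1e-4):
    """gam[m][n][r] = Gamma^m_{n r}, with metric derivatives by central differences."""
    ginv = _inverse(metric(x))
    dg = []  # dg[k][a][b] = d_k g_ab
    for k in range(2):
        gp, gm = metric(_shift(x, k, h)), metric(_shift(x, k, -h))
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)])
    return [[[sum(0.5 * ginv[m][k] * (dg[n][k][r] + dg[r][k][n] - dg[k][n][r])
                  for k in range(2))
              for r in range(2)] for n in range(2)] for m in range(2)]

def riemann(metric, x, h=1e-4):
    """riem[m][n][r][s] = R^m_{n r s}, following the index pattern of Eq. (5.26)."""
    gam = christoffel(metric, x, h)
    dgam = []  # dgam[k][m][n][r] = d_k Gamma^m_{n r}
    for k in range(2):
        gp = christoffel(metric, _shift(x, k, h), h)
        gm = christoffel(metric, _shift(x, k, -h), h)
        dgam.append([[[(gp[m][n][r] - gm[m][n][r]) / (2 * h) for r in range(2)]
                      for n in range(2)] for m in range(2)])
    return [[[[dgam[r][m][n][s] - dgam[s][m][n][r]
               + sum(gam[m][k][r] * gam[k][n][s] - gam[m][k][s] * gam[k][n][r]
                     for k in range(2))
               for s in range(2)] for r in range(2)] for n in range(2)] for m in range(2)]

def ricci_scalar(metric, x, h=1e-4):
    """R = g^{ns} R^m_{n m s} (contraction of the first and third indices)."""
    ginv = _inverse(metric(x))
    riem = riemann(metric, x, h)
    return sum(ginv[n][s] * riem[m][n][m][s]
               for m in range(2) for n in range(2) for s in range(2))

print(ricci_scalar(sphere_metric, [1.0, 0.3]) / 2)    # K for the unit sphere
print(ricci_scalar(polar_plane_metric, [1.5, 0.3]))   # flat plane in polar coordinates
```

The flat plane in polar coordinates is included as a sanity check: its Christoffel symbols do not vanish, but the curvature does.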
The Christoffel symbols can be computed in many different ways, the most practical one being the
Lagrangian method. The only non-vanishing terms are
Rθ νρσ = ∂ρ Γθ νσ − ∂σ Γθ νρ + Γθ λρ Γλ νσ − Γθ λσ Γλ νρ . (5.94)
Among the two possible values of the indices appearing in the ΓΓ pieces, only the λ = ρ = φ choice
contributes, so we can expand the sum over λ in the last two terms
Rθ νρσ = ∂ρ Γθ νσ − ∂σ Γθ νρ + Γθ φρ Γφ νσ − Γθ φσ Γφ νρ . (5.95)
Since the Riemann tensor is antisymmetric in ρ and σ, we cannot have ρ = σ. Let’s set therefore
ρ = φ and σ = θ (keeping in mind that the alternative choice, ρ = θ and σ = φ, just gives rise to a
relative minus sign). We have
Rθ νφθ = Γθ φφ Γφ νθ − ∂θ Γθ νφ = 0 , (5.96)
23 The combination S − T is antisymmetric under µ ↔ ν
24 This is the particular expression of a much more general relation
\[ R_{\mu\nu\rho\sigma} = \frac{R}{n(n-1)} \left(g_{\mu\rho}\, g_{\nu\sigma} - g_{\mu\sigma}\, g_{\nu\rho}\right) \qquad (5.90) \]
for a maximally symmetric spacetime with constant R in arbitrary dimension. Unfortunately, I don’t have the time to
go through it. The interested reader can have a look at this subject in Weinberg’s book.
25 Since this is the first non-trivial computation of the Ricci scalar that we perform, I will do it in great detail.
Although I could directly compute R1212 (we are dealing with a 2-dimensional metric) I prefer not to do so in order to
teach you some general tricks related to the symmetries of the Riemann tensor that will be useful when dealing with
more complicated metrics.
5.7 A laboratory for Riemannian geometry: 2 dimensional manifolds 83
Exercise
Compute the intrinsic curvature of the two-dimensional cone in Cartesian and polar coordinates.
Interpret the result.
CHAPTER 6
EINSTEIN EQUATIONS
A. Einstein
i) A continuity equation
\[ \frac{\partial \rho}{\partial t} + \frac{\partial (\rho v^j)}{\partial x^j} = 0 , \qquad (6.2) \]
reflecting the fact that mass is neither created nor destroyed in Classical Mechanics (the flow
of mass out of a volume is equal to the loss of mass in it).
ii) A Newton’s 2nd law for fluids
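\[ f^i = \rho a^i = \rho \left(\frac{\partial v^i}{\partial t} + v^j \frac{\partial v^i}{\partial x^j}\right) , \qquad (6.3) \]
with
\[ a^i = \lim_{\Delta t \to 0} \frac{v^i(t + \Delta t, x + \Delta x) - v^i(t, x)}{\Delta t} , \qquad (6.4) \]
and f i = f i (t, x) the total force per unit volume around a point x at time t. The so-called total
derivative of the velocity field
\[ \frac{Dv^i}{dt} \equiv \frac{\partial v^i}{\partial t} + v^j \frac{\partial v^i}{\partial x^j} \qquad (6.5) \]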
contains two pieces, the local derivative ∂v/∂t, which gives the change of the velocity v as a
function of time at a given point in space, and the so-called convective derivative, (v · ∇) v, which
represents the change of v for a moving fluid particle due to the inhomogeneity of the fluid vector
field.
If we assume that there are no other forces apart from those exerted by the fluid on itself, we
are left with internal forces like pressure or friction acting only between neighboring regions of
matter. Consider an infinitesimal volume dV with surface area dA centered at a point x at time
t. Let us denote by nj the normal vector to the surface. In a perfect fluid2 , the force F i exerted
by the matter on the area is proportional to the area itself F i = p(t, x)δ ij nj dA, with p(t, x) the
pressure at that point at time t. In the most general case, we will also have shear forces
due to the tendency of fluid elements moving with different velocities to drag adjacent matter.
The coefficients T ij are the components of the so-called stress tensor, which must be symmetric,
T ij = T ji .
Exercise
Consider the 3-component of the torque acting on an infinitesimal cube of a material of
density ρ and side length L. Compare it with the moment of inertia of the cube I = ρL⁵/6.
What happens if T ij ≠ T ji in the limit L → 0?
2 A perfect fluid is defined as one for which there are no forces between the particles, no heat conduction and no
viscosity.
6.1 The energy-momentum tensor 85
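The total force exerted per unit area in a given direction3 can be transformed into a total force
per unit volume via Gauss’ theorem
\[ -\int_A T^{ij} n_j\, dA = -\int_V \partial_j T^{ij}\, dV \quad \longrightarrow \quad f^i = -\frac{\partial T^{ij}}{\partial x^j} . \qquad (6.7) \]
Plugging this result into Newton’s 2nd law (cf. Eq. (6.3)) gives
\[ \rho \left(\frac{\partial v^i}{\partial t} + v^j \frac{\partial v^i}{\partial x^j}\right) + \frac{\partial T^{ij}}{\partial x^j} = 0 . \qquad (6.8) \]
Using the continuity equation we can write
\[ \rho\, \frac{\partial v^i}{\partial t} = \frac{\partial (\rho v^i)}{\partial t} - v^i \frac{\partial \rho}{\partial t} = \frac{\partial (\rho v^i)}{\partial t} + v^i \frac{\partial (\rho v^j)}{\partial x^j} . \qquad (6.9) \]
Combining the previous result with the continuity equation (6.2), Newton’s 2nd law (6.3) for this
particular case (f i = −∂j T ij ) can be written as
\[ \frac{\partial (\rho v^i)}{\partial x^0} + \frac{\partial}{\partial x^j} \left(\rho v^i v^j + T^{ij}\right) = 0 , \qquad (6.10) \]
which is the so-called Euler equation.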
shorter version the energy-momentum tensor 4 or the stress-energy tensor. It is a rank-2 symmetric
tensor encoding all the information about energy density, momentum density, stress, pressure . . . .
The ten components of this tensor have the following interpretation:
• T 00 is the local energy density, including any potential contribution from forces between particles
and their kinetic energy.
• T 0i is the energy flux in the i direction. This includes not only the bulk motion but also any
other processes giving rise to transfers of energy, as for instance heat conduction.
• T i0 is the density of the momentum component in the i direction, i.e. the 3-momentum density.
As in the previous case, it also takes into account the changes in momentum associated to heat
conduction.
3 The minus sign appears because we are considering the force exerted on matter inside the volume by the matter
outside
4 This name can be sometimes misleading as it can be confused with the energy-momentum 4-vector pµ in sentences
including things like “the energy-momentum conservation equation. . . ”. The difference should be always clear from the
context.
6.2 The microscopic description 86
• T ij is the 3-momentum flux or stress tensor, i.e. the rate of flow of the i momentum component
per unit area in the plane orthogonal to the j-direction. The component T ii encodes the isotropic
pressure in the i direction while the components T ij with i ≠ j refer to the viscous stresses of
the fluid.
moving from the rest frame uµ = (1, 0) to one in which the fluid moves with 3-velocity v i . We get
with uµ the 4-velocity vector field tangent to the worldlines of the fluid particles. Taking into account
this result, the full stress-energy tensor (6.12) takes the form
T µν = (ρ + p)uµ uν + pη µν . (6.16)
The resulting equation is manifestly covariant and can be easily generalized to arbitrary coordinate
systems or curved spacetimes by simply replacing the local metric η µν by a general metric g µν
T µν = (ρ + p)uµ uν + pg µν . (6.17)
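As a quick check of the perfect-fluid form, the sketch below builds T µν = (ρ + p)uµ uν + pη µν for a fluid moving with 3-velocity v in units with c = 1 and signature (−, +, +, +), as implied by uµ uµ = −1. The function names and the sample values of ρ, p and v are illustrative choices.

```python
import math

# Minkowski metric; its components are the same with upper or lower indices.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def four_velocity(v):
    """u^mu = gamma (1, v) for a 3-velocity v with |v| < 1 (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return [gamma] + [gamma * c for c in v]

def perfect_fluid(rho, p, v):
    """T^{mu nu} = (rho + p) u^mu u^nu + p eta^{mu nu}, as in Eq. (6.16)."""
    u = four_velocity(v)
    return [[(rho + p) * u[m] * u[n] + p * ETA[m][n] for n in range(4)]
            for m in range(4)]

def trace(T):
    """T^mu_mu = eta_{mu nu} T^{mu nu}."""
    return sum(ETA[m][n] * T[m][n] for m in range(4) for n in range(4))

T_rest = perfect_fluid(1.0, 0.2, (0.0, 0.0, 0.0))
T_moving = perfect_fluid(1.0, 0.2, (0.6, 0.0, 0.0))
print([T_rest[i][i] for i in range(4)])
print(T_moving[0][0], T_moving[0][1], trace(T_moving))
```

In the rest frame the tensor is diag(ρ, p, p, p), while the trace −ρ + 3p is the same in every frame, as expected for a scalar.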
∇ν T µν = 0 (6.18)
in which the standard derivative ∂µ is replaced by the covariant derivative ∇µ . The word local is, as
always in this course, important. Eq. (6.18) is not a conservation law, nor should it be. As we will
see, energy is not conserved in the presence of dynamical spacetime curvature but rather changes in
response to it.
Exercise
Prove Eq. (6.15).
of state, let me consider a macroscopic collection of N structureless point particles interacting through
spatially localized collisions. The energy density associated to any of them is given by
The same procedure can be applied to the spatial momentum density (or energy current) of the particle
\[ T^{0i}_n = p^i_n\, \delta^{(3)}(x - x_n(t)) = m_n \gamma_n v^i_n\, \delta^{(3)}(x - x_n(t)) = E_n v^i_n\, \delta^{(3)}(x - x_n(t)) , \qquad (6.22) \]
and to the flux of the i momentum component in the j direction (or vice versa)
\[ T^{ij}_n = p^i_n v^j_n\, \delta^{(3)}(x - x_n(t)) = p^j_n v^i_n\, \delta^{(3)}(x - x_n(t)) . \qquad (6.23) \]
We obtain
\[ T^{0i}_n = m_n \int_{-\infty}^{+\infty} d\tau_n\, u^0_n u^i_n\, \delta^{(4)}(x - x_n(\tau_n)) , \qquad T^{ij}_n = m_n \int_{-\infty}^{+\infty} d\tau_n\, u^i_n u^j_n\, \delta^{(4)}(x - x_n(\tau_n)) . \qquad (6.24) \]
Eqs. (6.21) and (6.24) can be rewritten in a very compact way in terms of the stress-energy-momentum
tensor T µν
\[ T^{\mu\nu}_n = m_n \int_{-\infty}^{+\infty} d\tau_n\, u^\mu_n u^\nu_n\, \delta^{(4)}(x - x_n(\tau_n)) = \int_{-\infty}^{+\infty} d\tau_n\, \frac{p^\mu_n p^\nu_n}{m_n}\, \delta^{(4)}(x - x_n(\tau_n)) , \qquad (6.25) \]
which is manifestly symmetric and Lorentz covariant since uµn uνn is a tensor under Lorentz
transformations and both mn and dτn δ (4) (x − xn (τn )) are Lorentz scalars. The total energy-momentum tensor of the
whole system of particles can be written as the sum of the individual contributions, namely
\[ T^{\mu\nu} = \sum_{n=1}^{N} T^{\mu\nu}_n . \qquad (6.26) \]
5 The fact that the 4-dimensional Dirac delta δ (4) (x) is Lorentz invariant follows directly from the definition
∫ d4 x δ (4) (x) = 1 and the fact that the volume element d4 x is Lorentz invariant.
which using
\[ u^\mu_n\, \partial_\mu \delta^{(4)}(x - x_n(\tau_n)) = \frac{dx^\mu_n}{d\tau_n} \frac{\partial}{\partial x^\mu} \delta^{(4)}(x - x_n(\tau_n)) = -\frac{d}{d\tau_n} \delta^{(4)}(x - x_n(\tau_n)) , \qquad (6.28) \]
can be written as
\[ \partial_\mu T^{\mu\nu} = -\sum_{n=1}^{N} m_n \int_{-\infty}^{+\infty} d\tau_n\, \frac{d}{d\tau_n} \left[u^\nu_n\, \delta^{(4)}(x - x_n(\tau_n))\right] + \sum_{n=1}^{N} m_n \int_{-\infty}^{+\infty} d\tau_n\, \dot{u}^\nu_n\, \delta^{(4)}(x - x_n(\tau_n)) . \]
The first term on the right-hand side of the previous expression disappears if the particles are stable,
i.e. if the orbits are closed or come from negative infinite time and disappear into positive infinite
time. We are left then with the second term, which can be written as
\[ \partial_\mu T^{\mu\nu} = \sum_{n=1}^{N} \int_{-\infty}^{+\infty} d\tau_n\, \frac{dp^\nu_n}{d\tau_n}\, \delta^{(4)}(x - x_n(\tau_n)) = \sum_{n=1}^{N} \frac{dp^\nu_n}{dt}\, \delta^{(3)}(x - x_n) , \qquad (6.29) \]
with pνn = mn uνn the 4-momentum of the individual particles. The local energy-momentum
conservation ∂µ T µν = 0 requires the particles to be free. In other words, the condition ∂µ T µν = 0 is
equivalent to the geodesic equation in Minkowski space-time, dpµn /dτ = 0. This will also be the case
in curved spacetime.
A simple inspection of Eqs. (6.30) reveals that, for standard matter, 0 ≤ p ≤ ρ/3. In any other
reference frame, the energy-momentum tensor for the perfect fluid reads
with uµ (x) denoting now the average value of the 4-velocities uµi of the individual particles NR inside
the volume8 . The perfect fluid form (6.31) can be used to model very different physical situations
that often fall into one of the following categories:
1. Non-relativistic matter: For small velocities the dispersion relation En = √(mn² + pn²) can be
approximated by En ≃ mn + pn²/2mn , which plugged back into (6.30) gives rise to ρ ≃ mn n + (3/2) p.
Taking into account that the statistical definition of temperature T is twice the energy possessed
by each degree of freedom and assuming a monoatomic gas with 3 kinetic degrees of freedom,
we can write T = (2/3) × pn²/2mn and therefore ρ ≃ mn n + (3/2) n T .
6 Remember that, when we later apply the Equivalence Principle, another scale will come into play: the scale
L at which the gravitational effects start to be important. If this scale happens to be much larger than the scale d
(L ≫ d ≫ a), the mean properties of the fluid can be safely considered as constant over the region.
7 i.e if the fluid is perfect.
8 Note that, when writing uµ (x), ρ(x) and p(x) we are explicitly taking into account that the averages can vary from
2. Dust: A perfect fluid with zero pressure. p = 0, tµν = 0, T µν = ρ diag (1, 0, 0, 0).
3. Radiation: A perfect highly relativistic fluid. In this case En ≃ |pn | ≫ mn and therefore⁹
ρ ≃ 3p. The energy-momentum tensor for radiation is traceless, T = T µ µ = ηµν T µν = −ρ + 3p = 0.
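The three regimes above can be compared numerically. Eq. (6.30) is not reproduced in this section, so the sketch below assumes the standard kinetic-theory averages ρ = n⟨E⟩ and p = n⟨|p|²/(3E)⟩ for an isotropic gas, which is what footnote 9 suggests; the function name, the unit number density and the sample momenta are purely illustrative.

```python
import math

def energy_density_and_pressure(mass, momenta, number_density=1.0):
    """rho = n <E> and p = n <|p|^2 / (3 E)> for a list of momentum magnitudes.

    Assumption: isotropic momentum distribution, units with c = 1,
    and strictly positive momentum magnitudes.
    """
    energies = [math.sqrt(mass ** 2 + q ** 2) for q in momenta]
    count = len(momenta)
    rho = number_density * sum(energies) / count
    pressure = number_density * sum(q * q / (3 * e)
                                    for q, e in zip(momenta, energies)) / count
    return rho, pressure

# Slow massive particles, massless particles, and an intermediate case.
for mass, momenta in [(1.0, [0.01]), (0.0, [1.0, 2.0, 3.0]), (1.0, [0.5, 1.0, 2.0])]:
    rho, p = energy_density_and_pressure(mass, momenta)
    print(mass, momenta, p / rho)
```

The ratio p/ρ is tiny for slow particles (dust is the limit in which it is neglected), equals 1/3 for massless ones, and lies in between otherwise, in line with the bound 0 ≤ p ≤ ρ/3.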
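$$F_{\mu\nu}F^{\mu\nu} = -2\left(E^2 - B^2\right) . \qquad (6.35)$$
We obtain
$$B^2 = F_{\mu\nu}u^\nu F^{\mu}{}_{\rho}u^\rho + \frac{1}{2}F_{\mu\nu}F^{\mu\nu} . \qquad (6.36)$$
Putting Eqs. (6.34) and (6.36) together, the covariant generalization of the energy density
(6.32) becomes
$$\rho = \left(F_{\rho\mu}F^{\rho}{}_{\nu} - \frac{1}{4}F_{\rho\sigma}F^{\rho\sigma}\eta_{\mu\nu}\right)u^\mu u^\nu , \qquad (6.37)$$
where we have inserted a factor $u_\mu u^\mu = -1$. The work is basically done. The quantity in
parentheses is the sought-for energy-momentum tensor for the electromagnetic field!
$$T_{\mu\nu} = F_{\rho\mu}F^{\rho}{}_{\nu} - \frac{1}{4}F_{\rho\sigma}F^{\rho\sigma}\eta_{\mu\nu} . \qquad (6.38)$$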
Exercise
• Compute the T 0i in terms of the electric and magnetic fields. Do you recognize the
result?
• Prove that the electromagnetic energy-momentum tensor is symmetric, $T_{\mu\nu} = T_{\nu\mu}$, and
traceless, $T^\mu{}_\mu = 0$. The electromagnetic field behaves as a fluid with equation of
state $p = \rho/3$.
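Before attempting the proofs, the claimed properties can be spot-checked numerically for one field configuration. The sketch below assumes the convention $F_{0i} = -E_i$, $F_{ij} = \epsilon_{ijk}B_k$ with signature $(-,+,+,+)$; this convention and the function names are assumptions made for the illustration, and a numerical check is of course not a proof.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal Minkowski metric, signature (-,+,+,+)

def field_strength(E, B):
    """Covariant F_{mu nu} with F_{0i} = -E_i and F_{ij} = eps_{ijk} B_k (assumed convention)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]
        F[i + 1][0] = E[i]
    F[1][2], F[2][1] = B[2], -B[2]
    F[2][3], F[3][2] = B[0], -B[0]
    F[3][1], F[1][3] = B[1], -B[1]
    return F

def em_stress_tensor(F):
    """Return (T_{mu nu}, F_{rho sigma} F^{rho sigma}) following Eq. (6.38)."""
    # For a diagonal metric, raising an index multiplies by the matching ETA entry.
    FF = sum(ETA[r] * ETA[s] * F[r][s] ** 2 for r in range(4) for s in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            contraction = sum(ETA[r] * F[r][m] * F[r][n] for r in range(4))
            T[m][n] = contraction - 0.25 * FF * (ETA[m] if m == n else 0.0)
    return T, FF

E_field, B_field = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
T, FF = em_stress_tensor(field_strength(E_field, B_field))
trace_T = sum(ETA[m] * T[m][m] for m in range(4))  # T^mu_mu
```

For these sample fields $E^2 = 14$ and $B^2 = 5.25$, so one expects $F_{\mu\nu}F^{\mu\nu} = -17.5$ and $T_{00} = (E^2 + B^2)/2 = 9.625$.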
9 Note that the quantity $\sum_i p_n^i v_n^i$ in Eq. (6.30) can be written as $\sum_i p_n^i v_n^i = \sum_i m\gamma_n v_n^i v_n^i = |p_n|^2/E_n$, which goes to
$|p_n|$ when $E_n \simeq |p_n|$.
6.3 Einstein equations: Heuristic derivation 90
partial derivative, we should necessarily have a constant $T^M$ throughout the whole spacetime, which
is highly implausible, since, as we know, $T^M = 0$ for the electromagnetic field and $T^M > 0$ for standard
matter. On top of that, Eq. (6.40) hides 10 differential equations for 6 physical unknowns: the
components of the metric that cannot be freely changed by performing coordinate transformations
in the 4 coordinates. We have to try harder.
The most general combination of symmetric tensors involving up to two derivatives of the metric is
$$K_{\mu\nu} = R_{\mu\nu} + a\,g_{\mu\nu}R + \Lambda g_{\mu\nu} \qquad (6.41)$$
with a and Λ some unknown constants to be determined13 . Imposing the local conservation of the
energy-momentum tensor, $\nabla^\mu T^M_{\mu\nu} = 0$, in Eq. (6.39) we get
$$\nabla^\mu K_{\mu\nu} = \nabla^\mu\left(R_{\mu\nu} + a\,g_{\mu\nu}R\right) = 0 , \qquad (6.42)$$
where we have taken into account that the covariant derivative $\nabla_\mu$ is metric compatible and therefore
$\nabla^\mu(\Lambda g_{\mu\nu}) = 0$. Our situation now is much better than Einstein's: we are aware of the contracted
form of the Bianchi identities14 (5.71) and know the precise value of a that satisfies Eq. (6.42), namely
a = −1/2. Taking this into account, we can rewrite Eq. (6.39) as
$$G_{\mu\nu} + \Lambda g_{\mu\nu} = \kappa^2 T^M_{\mu\nu} , \qquad (6.43)$$
with
$$G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R , \qquad (6.44)$$
10 Matter should be understood in a broad sense, meaning really matter, radiation etc. . .
11 A relativistic generalization should take the form of an equation between tensors.
12 The requirement of having derivatives only up to second order is certainly reasonable. If this were not the case, one
would have to specify for the Cauchy problem not only the value of the metric and its first derivative, but also higher
derivatives on a spacelike surface.
13 A possible proportionality constant in front of $R_{\mu\nu}$ has been factored out and incorporated in the still unknown
factors κ and Λ on the right-hand side of Eq. (6.39).
14 He wasn’t.
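the Einstein tensor defined in the previous chapter (cf. Eq. (5.72)) and Λ the famous cosmological constant
term. Writing this cosmological constant term on the right-hand side of the equation, we can interpret
it as the energy-momentum tensor of a fluid with a weird equation of state p = −ρ,
$$G_{\mu\nu} = \kappa^2\left(T^M_{\mu\nu} + T^\Lambda_{\mu\nu}\right) , \qquad T^\Lambda_{\mu\nu} = -\frac{\Lambda}{\kappa^2}\,g_{\mu\nu} . \qquad (6.45)$$
Defining $T_{\mu\nu} \equiv T^M_{\mu\nu} + T^\Lambda_{\mu\nu}$, we can write
$$G_{\mu\nu} = \kappa^2 T_{\mu\nu} . \qquad (6.46)$$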
Even though our derivation was quite heuristic, the solution that we have obtained is unique (Lovelock
theorem). The resulting tensorial equation is a set of ten differential equations15 for the metric $g_{\mu\nu}(x)$
given the energy-momentum tensor $T_{\mu\nu}(x)$. However, due to the existence of the Bianchi identities,
not all of them are independent. There are only 6 independent equations to determine
6 independent components of the metric tensor.
As differential equations they are very complicated, even in vacuum. Both the Ricci tensor and
the scalar curvature involve derivatives and products of Christoffel symbols, which in turn involve
derivatives of the metric tensor. There is also some dependence on the metric hidden in the energy-momentum
tensor. On top of that, the equations are not linear, as should be expected, since,
according to the Equivalence Principle, every form of energy, including the gravitational self-energy,
must be a source of the gravitational field16 . The non-linearity of the equations forbids us to apply the
superposition principle: two known solutions cannot be combined to get a new one.
$$\left(G_{\mu\nu} - \kappa^2 T_{\mu\nu}\right)u^\mu u^\nu = 0 , \qquad (6.47)$$
Newton Einstein
The quantity hµν is then understood as a small perturbation on top of the Minkowski background.
Consistently with this point of view, we will raise and lower its indices with the flat Minkowski metric
ηµν , namely hµ σ = η µρ hρσ , hµν = η νσ hµ σ .
In order to compute the expression for the Einstein tensor Gµν at the lowest order in perturbation
6.4 The linearized theory of gravity 93
theory we must first determine the linearized version of the Ricci tensor and the scalar curvature,
which are functions of the metric connection Γµ νρ . Inserting the expansion (6.48) into the definition
of the metric connection, we get
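$$\Gamma^\mu{}_{\nu\rho} = \frac{1}{2}\eta^{\mu\sigma}\left(\partial_\nu h_{\sigma\rho} + \partial_\rho h_{\sigma\nu} - \partial_\sigma h_{\rho\nu}\right) + O(h_{\mu\nu}^2) . \qquad (6.49)$$
The next step is to compute the 4 pieces of the Riemann tensor, which, written in a very schematic way,
have the structure R ∼ ∂Γ − ∂Γ + ΓΓ − ΓΓ. Taking into account (6.49), we realize that only the
first two terms (∼ ∂Γ) give a contribution at leading order
$$R^\mu{}_{\nu\rho\sigma} = \frac{1}{2}\partial_\rho\left(\partial_\sigma h^\mu{}_\nu + \partial_\nu h^\mu{}_\sigma - \partial^\mu h_{\nu\sigma}\right) - (\rho \leftrightarrow \sigma) = \frac{1}{2}\left(\partial_\nu\partial_\rho h^\mu{}_\sigma + \partial_\sigma\partial^\mu h_{\nu\rho} - (\rho \leftrightarrow \sigma)\right) . \qquad (6.50)$$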
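The linearized version of the Ricci tensor and the scalar of curvature can be computed by simply
performing contractions in the previous expression. Denoting respectively by $h \equiv h^\mu{}_\mu$ and $\Box = \partial^\mu\partial_\mu$
the trace of the perturbation tensor and the d’Alembertian operator and contracting the indices µ
and σ in Eq. (6.50), we get17
$$R_{\nu\rho} = -\frac{1}{2}\left(\Box h_{\nu\rho} + \partial_\nu\partial_\rho h - \partial_\nu\partial_\sigma h^\sigma{}_\rho - \partial_\rho\partial_\sigma h^\sigma{}_\nu\right) , \qquad (6.51)$$
which can be further contracted in the indices ν and ρ to obtain
$$R = -\Box h + \partial_\mu\partial_\sigma h^{\mu\sigma} . \qquad (6.52)$$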
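Collecting all the terms and inserting them into the definition of the Einstein tensor (6.44), we get
$$G_{\nu\rho} = -\frac{1}{2}\left(\partial_\nu\partial_\rho h + \Box h_{\nu\rho} - \partial_\nu\partial_\sigma h^\sigma{}_\rho - \partial_\rho\partial_\sigma h^\sigma{}_\nu - \eta_{\nu\rho}\Box h + \eta_{\nu\rho}\partial_\mu\partial_\sigma h^{\mu\sigma}\right)$$
$$\phantom{G_{\nu\rho}} = -\frac{1}{2}\left(\Box\tilde h_{\nu\rho} + \eta_{\nu\rho}\partial_\mu\partial_\sigma\tilde h^{\mu\sigma} - \partial_\nu\partial_\sigma\tilde h^\sigma{}_\rho - \partial_\rho\partial_\sigma\tilde h^\sigma{}_\nu\right) , \qquad (6.53)$$
where in the last step we have defined the so-called trace reverse
$$\tilde h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h , \qquad h_{\mu\nu} = \tilde h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\tilde h , \qquad (6.54)$$
which keeps track of the extra terms obtained when passing from $R_{\nu\rho}$ to $G_{\nu\rho}$. The name trace reverse
comes from the property $\tilde h \equiv \tilde h^\mu{}_\mu = -h$. Note also the useful properties
$$\tilde{\tilde h}_{\mu\nu} = h_{\mu\nu} , \qquad G_{\mu\nu} = \tilde R_{\mu\nu} . \qquad (6.55)$$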
The resulting expression is rather involved, but fortunately we still have some freedom to play with:
the gauge freedom.
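The two trace-reverse properties quoted above, $\tilde h = -h$ and the fact that reversing twice gives back the original tensor, are easy to verify on a concrete example. The following sketch uses exact rational arithmetic; the sample matrix and the function names are arbitrary choices made for the illustration.

```python
from fractions import Fraction

ETA = [-1, 1, 1, 1]  # diagonal Minkowski metric, signature (-,+,+,+)

def trace(h):
    """h = eta^{mu nu} h_{mu nu} for a covariant tensor h_{mu nu}."""
    return sum(ETA[m] * h[m][m] for m in range(4))

def trace_reverse(h):
    """h~_{mu nu} = h_{mu nu} - (1/2) eta_{mu nu} h."""
    t = trace(h)
    return [[h[m][n] - (Fraction(1, 2) * ETA[m] * t if m == n else 0)
             for n in range(4)] for m in range(4)]

# Arbitrary symmetric sample perturbation (hypothetical numbers).
h_sample = [[1, 2, 3, 4],
            [2, 5, 6, 7],
            [3, 6, 8, 9],
            [4, 7, 9, 10]]

h_tilde = trace_reverse(h_sample)
h_back = trace_reverse(h_tilde)
```

Here `trace(h_tilde)` equals `-trace(h_sample)` and `h_back` coincides with `h_sample`, as expected in four dimensions where $\eta^{\mu\nu}\eta_{\mu\nu} = 4$.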
17 The global minus sign comes from the permutation of the last two indices to construct the Ricci tensor.
Gauge fixing
Eqs. (7.17) and (6.53), and therefore (6.56), are invariant under the transformation
$$h_{\mu\nu} \to h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu , \qquad (6.57)$$
as can be easily verified by performing the explicit computation. This kind of change is
called a gauge transformation, due to the strong analogy with the gauge transformations in
the electromagnetic theory. The simplest way to understand this gauge freedom is to trace
it back to the transformation of the full metric gµν . Consider an infinitesimal transformation
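$x^\mu \to \bar x^\mu = x^\mu + \xi^\mu$. Under such a transformation the metric changes to
$$\bar g^{\mu\nu}(x^\rho + \xi^\rho) = g^{\rho\sigma}(x^\rho)\,\frac{\partial\bar x^\mu}{\partial x^\rho}\frac{\partial\bar x^\nu}{\partial x^\sigma} = g^{\rho\sigma}\left(\delta^\mu{}_\rho + \partial_\rho\xi^\mu\right)\left(\delta^\nu{}_\sigma + \partial_\sigma\xi^\nu\right) = g^{\mu\nu}(x^\rho) + g^{\mu\sigma}\partial_\sigma\xi^\nu + g^{\nu\rho}\partial_\rho\xi^\mu . \qquad (6.58)$$
Expanding the left-hand side of this equation in a Taylor series in $\xi^\rho$ and retaining only the
terms up to linear order, we get
$$\bar g^{\mu\nu}(x^\rho) = g^{\mu\nu}(x^\rho) + \delta g^{\mu\nu} , \qquad (6.59)$$
with
$$\delta g^{\mu\nu} \equiv -\xi^\rho\partial_\rho g^{\mu\nu} + g^{\mu\rho}\partial_\rho\xi^\nu + g^{\nu\rho}\partial_\rho\xi^\mu = \nabla^\nu\xi^\mu + \nabla^\mu\xi^\nu . \qquad (6.60)$$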
In the particular case in which the perturbation is performed around the Minkowski background,
$g_{\mu\nu} = h_{\mu\nu} + \eta_{\mu\nu}$, the covariant derivatives in (6.60) become standard derivatives and we recover
the transformation law (6.57). The linearized theory is invariant under (6.57) because the
full nonlinear theory is invariant under general coordinate transformations! This is extremely
interesting, since it allows us to further simplify the linearized version of the Einstein tensor by
simply performing infinitesimal coordinate transformations, or in other words, changes from
a splitting $g_{\mu\nu} = \eta_{\mu\nu} + h^{\rm old}_{\mu\nu}$ to a different splitting $g_{\mu\nu} = \eta_{\mu\nu} + h^{\rm new}_{\mu\nu}$. A simple inspection of
Eq. (6.56) reveals that an interesting condition to be satisfied by the trace reverse tensor in
the new coordinate system would be the tensor analog of the Lorenz gauge $\partial_\mu A^\mu = 0$ in the
electromagnetic theorya , namely
$$\partial_\rho\tilde h^{\nu\rho}_{\rm new} = 0 . \qquad (6.61)$$
Let us see if we are allowed to choose such a gauge. The change in the trace reverse tensor $\tilde h_{\mu\nu}$
follows directly from Eqs. (6.54) and (6.57)
$$\tilde h^{\nu\rho}_{\rm new} = \tilde h^{\nu\rho}_{\rm old} - \partial^\nu\xi^\rho - \partial^\rho\xi^\nu + \eta^{\nu\rho}\partial_\mu\xi^\mu . \qquad (6.62)$$
In order to satisfy the gauge fixing (6.61), $\xi^\nu$ must be a solution of the inhomogeneous wave
equation
$$\Box\xi^\nu = \partial_\rho\tilde h^{\nu\rho}_{\rm old} . \qquad (6.64)$$
The existence of a solution transforming from an arbitrary $h_{\mu\nu}$ to the so-called Lorenz gauge
$\partial_\rho\tilde h^{\nu\rho}_{\rm new} = 0$ is guaranteedb for sufficiently well behaved $\partial_\rho\tilde h^{\nu\rho}_{\rm old}$. In fact, the choice is not unique
since we can always add to it any solution of the homogeneous wave equation $\Box\xi^\nu_H = 0$ and the
result will still obey $\Box(\xi^\nu + \xi^\nu_H) = \partial_\rho\tilde h^{\nu\rho}_{\rm old}$. The Lorenz gauge $\partial_\rho\tilde h^{\nu\rho}_{\rm new} = 0$ is actually a set of
gauges.
a It “kills” three of the four terms in (6.53).
b As you learnt in your electrodynamics course, the solution of this equation can be obtained by means of the
retarded Green functions of the d’Alembertian operator.
[Figure: origin O, point P, position vectors x and x', and the separation x − x'.]
In view of the previous discussion, we realize that most of the terms on the left-hand side of Eq. (6.56)
merely serve to maintain gauge invariance. When the Hilbert gauge condition18 $\partial_\rho\tilde h^{\nu\rho} = 0$ is imposed,
the linearized version of the Einstein equations simplifies dramatically
$$\Box\tilde h_{\mu\nu} = -2\kappa^2 T_{\mu\nu} . \qquad (6.65)$$
This equation is formally identical to the Maxwell equations in the Lorenz gauge and can be solved
by using the Green’s function method.
Green’s functions
Consider a differential wave equation of the form
$$\Box f(t, \mathbf{x}) = s(t, \mathbf{x}) , \qquad (6.66)$$
with $f(t, \mathbf{x})$ a radiation field and $s(t, \mathbf{x})$ a source term. A Green’s function $G(t, \mathbf{x}; t', \mathbf{x}')$ is
defined as the field generated at the point $(t, \mathbf{x})$ by a delta function source at $(t', \mathbf{x}')$, i.e.
$$\Box G(t, \mathbf{x}; t', \mathbf{x}') = \delta(t - t')\,\delta^{(3)}(\mathbf{x} - \mathbf{x}') . \qquad (6.67)$$
The field due to the actual source $s(t, \mathbf{x})$ can be obtained by integrating the Green’s function
against $s(t, \mathbf{x})$:
$$f(t, \mathbf{x}) = \int dt'\,d^3x'\,G(t, \mathbf{x}; t', \mathbf{x}')\,s(t', \mathbf{x}') . \qquad (6.68)$$
Physically the Green’s function approach merely reflects the fact that (6.66) is a linear equation.
The full solution of the equation can be obtained by solving for a point source and adding the
resulting waves from each point inside the source.
The Green’s function associated with the wave operator $\Box$ is very well known (see, for instance,
Jackson’s book on electrodynamics):
$$G(t, \mathbf{x}; t', \mathbf{x}') = -\frac{\delta(t' - [t - |\mathbf{x} - \mathbf{x}'|])}{4\pi|\mathbf{x} - \mathbf{x}'|} . \qquad (6.69)$$
Exercise
Derive this equation in case you haven’t done it before.
(vacuum). As in electromagnetism, the metric perturbation consists of the field generated by the source plus wave-like
vacuum solutions propagating at the speed of light.
which is analogous to the relation between the vector potential $A_\mu$ and the current $J_\mu$ in electromagnetism.
Note the argument $t - |\mathbf{x} - \mathbf{x}'| = t - |\mathbf{x} - \mathbf{x}'|/c$. Eq. (6.70) is a retarded solution20 , taking
into account the lag associated with the propagation of information from events at $\mathbf{x}'$ to position $\mathbf{x}$.
Gravitational influences propagate at the finite speed of light. Action at a distance is gone forever!
We will come back to this point in the next chapter, but first let me finish our main task: determining
the value of the constants κ² and Λ. To do that, let me consider the case we know best: the gravitational
field created by a static spherical mass distribution of total mass M. The energy-momentum
tensor for such a system has only one non-vanishing component (cf. Eq. (6.45))
$$T^{\mu\nu} = \left(\rho + \frac{\Lambda}{\kappa^2}\right)\mathrm{diag}\,(1, 0, 0, 0) . \qquad (6.72)$$
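Plugging this into the time-independent version of Eq. (6.70), we get
$$\tilde h_{00} = \frac{\kappa^2}{2\pi}\int\frac{\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\,d^3x' + \frac{\Lambda}{2\pi}\int\frac{1}{|\mathbf{x} - \mathbf{x}'|}\,d^3x' , \qquad \tilde h_{0i} = 0 , \qquad \tilde h_{ij} = 0 . \qquad (6.73)$$
If the mass distribution is concentrated around the origin ($\mathbf{x}' = 0$), the component $\tilde h_{00}$ evaluated at a
distance $r = |\mathbf{x} - \mathbf{x}'|$ becomes21
$$\tilde h_{00} = \frac{\kappa^2}{2\pi r}\int\rho(\mathbf{x}')\,d^3x' + \frac{\Lambda}{2\pi r}\int d^3x' = \frac{\kappa^2 M}{2\pi r} + \frac{2}{3}\Lambda r^2 \qquad (6.74)$$
with
$$M = \int\rho(\mathbf{x}')\,d^3x' \qquad (6.75)$$
the total mass of our spherical distribution. Taking now into account that $\tilde h = \eta^{\mu\nu}\tilde h_{\mu\nu} = -\tilde h_{00}$ and
using the definition (6.54) we get
$$h_{00} = h_{11} = h_{22} = h_{33} = \frac{\kappa^2 M}{4\pi r} + \frac{1}{3}\Lambda r^2 . \qquad (6.76)$$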
Comparing this result with that obtained by performing the weak field limit of the geodesic equation in
the Λ = 0 case, $h_{00}^{\Lambda=0} = -2\Phi = 2GM/r$, allows us to identify the sought-for proportionality constant
$$\kappa^2 = 8\pi G . \qquad (6.77)$$
When Λ ≠ 0, the Newtonian potential becomes modified at long distances
$$\Phi = -\frac{GM}{r} - \frac{\Lambda}{6}r^2 \qquad (6.78)$$
and the line element takes the form
$$ds^2 = -\left(1 - \frac{2GM}{r} - \frac{1}{3}\Lambda r^2\right)dt^2 + \left(1 + \frac{2GM}{r} + \frac{1}{3}\Lambda r^2\right)dX^2 , \qquad (6.79)$$
with $dX^2 \equiv dx^2 + dy^2 + dz^2$. In Newtonian terms, a positive cosmological constant (Λ > 0) gives rise
to a repulsive force per unit mass whose strength increases linearly with the distance
$$\mathbf{f} = -\frac{GM}{r^2}\,\mathbf{u}_r + \frac{\Lambda}{3}\,r\,\mathbf{u}_r , \qquad (6.80)$$
20 The retarded solution is obtained by imposing the Kirchoff-Sommerfeld “no-incoming radiation” boundary condition
Cosmological constant
If Λ ≠ 0, it must be at least very small, $\rho_\Lambda \ll \rho_{\rm matter}$, to avoid any observational effect in
those situations in which Newton’s theory of gravity successfully explains the observations.
Taking into account, for instance, that we do not see any modification of the Newtonian theory
of gravity within the solar system, we can set the limit
$$|\rho_\Lambda| = \frac{|\Lambda|}{8\pi G} \le \rho_{\rm Solar} \quad\longrightarrow\quad |\rho_\Lambda| \le \frac{3M_\odot}{4\pi R_{\rm Pluto}^3} \simeq 10^{-29}\ {\rm GeV}^4 \qquad (6.81)$$
which, as assumed, makes the contribution of Λ completely negligible on the scale of the systems
we will be interested in in this coursea .
a It will, however, play a fundamental role at larger scales, such as those you will consider in your Cosmology
course.
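The order of magnitude quoted in (6.81) can be reproduced with a few lines of arithmetic. The sketch below is only an estimate: the numerical constants (solar mass, a Pluto orbital radius of roughly 39.5 AU, and the unit conversions) are rounded reference values assumed here, not numbers given in the notes.

```python
import math

# Assumed rounded reference values (SI units unless stated otherwise).
M_SUN_KG = 1.989e30        # solar mass
R_PLUTO_M = 5.9e12         # Pluto's orbital radius, about 39.5 AU
C_M_PER_S = 2.998e8        # speed of light
GEV_IN_JOULE = 1.602e-10   # 1 GeV expressed in joules
HBAR_C_GEV_M = 1.973e-16   # hbar * c expressed in GeV * m

def solar_system_density_gev4():
    """Mean density of one solar mass spread over a sphere of Pluto's orbital
    radius, converted to natural units (GeV^4)."""
    volume = 4.0 / 3.0 * math.pi * R_PLUTO_M ** 3
    rho_mass = M_SUN_KG / volume                  # kg / m^3
    rho_energy = rho_mass * C_M_PER_S ** 2        # J / m^3
    rho_gev_per_m3 = rho_energy / GEV_IN_JOULE    # GeV / m^3
    return rho_gev_per_m3 * HBAR_C_GEV_M ** 3     # GeV^4

rho_bound = solar_system_density_gev4()
```

With these inputs the result lands at the $10^{-29}\ {\rm GeV}^4$ level, in line with the bound stated above.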
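Solution              $\tilde h_{\mu\nu} = 4G\int\frac{[T_{\mu\nu}]_{\rm ret}}{|\mathbf{x} - \mathbf{x}'|}\,d^3x'$              $A_\mu = \frac{1}{4\pi}\int\frac{[J_\mu]_{\rm ret}}{|\mathbf{x} - \mathbf{x}'|}\,d^3x'$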
One of the most fascinating predictions of General Relativity is the existence of gravitational waves.
Einstein’s theory of gravity abandons the Newtonian conception of space and time as a rigid structure
in which the particles move. Spacetime is now alive and can curve, move and vibrate!
Consider the propagation of the perturbation $h_{\mu\nu}$ far away from the generating source. In this case,
the energy-momentum tensor in Eq. (7.1) can be set to zero and we are left with the homogeneous
equation
$$\Box\tilde h_{\mu\nu} = 0 . \qquad (7.2)$$
The resulting vacuum case is quite particular since it still contains a residual gauge freedom on top
of the Lorenz condition
$$\partial^\nu\tilde h_{\mu\nu} = 0 . \qquad (7.3)$$
Looking at Eqs. (6.62) and (6.63),
$$\tilde h^{\rm new}_{\mu\nu} = \tilde h^{\rm old}_{\mu\nu} - \left(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \eta_{\mu\nu}\partial_\rho\xi^\rho\right) , \qquad (7.4)$$
$$\partial^\nu\tilde h^{\rm new}_{\mu\nu} = \partial^\nu\tilde h^{\rm old}_{\mu\nu} - \Box\xi_\mu , \qquad (7.5)$$
we realize that we can still make an infinitesimal coordinate transformation $x^\mu \to x^\mu + \xi^\mu$ with $\Box\xi_\mu = 0$
without modifying the gauge condition (7.3). Indeed, if $\Box\xi_\mu = 0$, we automatically have2
$$\Box\xi_{\mu\nu} = 0 , \qquad \xi_{\mu\nu} \equiv \partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \eta_{\mu\nu}\partial_\rho\xi^\rho ,$$
meaning that we can always subtract the combination $\xi_{\mu\nu}$ from $\tilde h_{\mu\nu}$ in Eq. (7.2). The quantity $\xi_{\mu\nu}$
depends on 4 arbitrary functions $\xi_\mu$, which can be chosen at will to impose 4 extra conditions on the
perturbation $\tilde h_{\mu\nu}$. In particular, we can take $\xi_0$ and $\xi_i$ in such a way that $\tilde h = 0$ and $\tilde h_{0i} = 0$. The
condition of vanishing trace $\tilde h = 0$ erases the distinction between the perturbation and its trace reverse,
$$\tilde h_{\mu\nu} = h_{\mu\nu} .$$
On the other hand, the condition $\tilde h_{0i} = h_{0i} = 0$ applied to the µ = 0 component of the Lorenz gauge
$\partial^\nu\tilde h_{\mu\nu} = 0$ implies that $h_{00}$ is constant in time,
$$\partial^0 h_{00} = 0 .$$
This component corresponds to the static part of the gravitational interaction, i.e. to the Newtonian
potential of the source which gave rise to the gravitational wave. The gravitational wave itself is the
time-dependent part. As far as gravitational waves are concerned, the condition $\partial^0 h_{00} = 0$ really
means $h_{00} = 0$.
The discussion presented above defines the so-called transverse-traceless (TT) or radiation gauge,
which completely fixes all the local ambiguities and leaves us with 10 − 4 − 4 = 2 degrees of freedom, the
physical ones. The existence of such a gauge is guaranteed as long as there are no sources. Although
inside the source we are still allowed to perform a coordinate transformation with $\Box\xi_\mu = 0$ (or equivalently
$\Box\xi_{\mu\nu} = 0$) on top of the Lorenz gauge, we cannot set to zero any further component in $\tilde h_{\mu\nu}$,
since $\Box\tilde h_{\mu\nu} \neq 0$. The situation is completely analogous to what happens in Classical Electrodynamics.
Maxwell equations can always be reduced to the form $\Box A^\mu = J^\mu$ by imposing the Lorenz gauge condition
$\partial_\mu A^\mu = 0$. Once there, we still have the freedom to implement a residual gauge transformation
$A_\mu \to A_\mu + \partial_\mu\xi$ with ξ satisfying the condition $\Box\xi = 0$. In the absence of sources, the function ξ can
be used to get rid of one of the components in $A_\mu$, let’s say $A_0$. The Lorenz gauge reduces in this case
to a transversality condition on $A_i$, namely $\partial_i A^i = 0$, and we are left with 4 − 1 − 1 = 2 polarizations.
If instead $J^0 \neq 0$, we have $\Box A_0 \neq 0$ and there is no choice of ξ able to satisfy simultaneously $\Box\xi = 0$
and $A_0 = 0$.
2 The flat d’Alembertian $\Box$ commutes with $\partial^\mu$.
7.3 Interaction of gravitational waves with matter 100
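Plane wave solution   $\tilde h_{\mu\nu} = A_{\mu\nu}\,e^{ik_\sigma x^\sigma}$              $A_\mu = a_\mu\,e^{ik_\sigma x^\sigma}$
Lorenz gauge          $k_\sigma k^\sigma = 0$                                  $k_\sigma k^\sigma = 0$
                      $k^\mu A_{\mu\nu} = 0$                                   $k^\mu a_\mu = 0$
TT gauge              $h_{00} = 0$, $h_{i0} = 0$                               $A_0 = 0$
                      $\partial_i h^{ij} = 0$, $h^i{}_i = 0$                   $\partial_i A^i = 0$
                      $h_{ij} = h^{\rm TT}_{ij}$                               $A_i = A^{\rm T}_i$
                      symmetric, transverse and traceless                      transverse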
usual.
5 The components $A_{\mu\nu}$ are assumed to be constant.
might seem natural to think that we can learn something interesting by considering the geodesic
equation
$$\frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\rho}\,u^\nu u^\rho = 0 \qquad (7.14)$$
for a test particle in the gravitational field of the wave, this is not the case. To see this, consider our
test particle to be at rest, $u^\mu = (1, 0, 0, 0)$, at an initial time, let’s say, $\tau = \tau_0$. Evaluating the geodesic
equation (7.14) at this time we get
$$\left.\frac{du^i}{d\tau}\right|_{\tau=\tau_0} = -\Gamma^i{}_{00}\Big|_{\tau=\tau_0} = -\frac{1}{2}\left(2\partial_0 h^{\rm TT}_{0i} - \partial_i h^{\rm TT}_{00}\right)\Big|_{\tau=\tau_0} , \qquad (7.15)$$
which is identically zero since both $h_{0i}$ and $h_{00}$ are zero in the transverse-traceless gauge. The particle
does not seem to experience any acceleration: it completely ignores the wave! Does this mean that
gravitational waves have no effect on matter? Certainly not! It simply reflects the fact that the Riemannian
spacetime is locally flat at any given point.
To detect gravitational waves we must go beyond a single point in spacetime and explore its neighborhood.
Consider the wave (7.13) passing through a ring of test particles in the x-y plane. Let’s denote
by $v^\mu$ the separation of a test particle from the center of the ring and use the geodesic deviation equation
$$\frac{D^2 v^\mu}{d\tau^2} = \eta^{\mu\lambda}R_{\lambda\nu\rho\sigma}\,u^\nu u^\rho v^\sigma . \qquad (7.16)$$
The linearized Riemann tensor
$$R_{\lambda\nu\rho\sigma} = \frac{1}{2}\left(\partial_\nu\partial_\rho h_{\lambda\sigma} + \partial_\sigma\partial_\lambda h_{\nu\rho} - (\rho \leftrightarrow \sigma)\right) , \qquad (7.17)$$
generated by the crossing gravitational wave is a gauge invariant quantity, meaning that we can
compute it in any frame without affecting the result. Clearly the best choice is the TT gauge since
the form of $h_{\mu\nu}$ in this frame is extremely simple. Assuming the particles to be moving slowly,
$u^\nu \approx (1, 0, 0, 0)$, Eq. (7.16) becomes6
$$\frac{d^2 v^i}{dt^2} = R^i{}_{00j}\,v^j = \frac{1}{2}\frac{d^2 h^{\rm TT}_{ij}}{dt^2}\,v^j . \qquad (7.18)$$
The resulting equation is extremely simple. The response of the particles can be understood in purely
Newtonian terms, without any further reference to General Relativity. Since $h^{\rm TT}_{ij}$ is traceless, the
effective Newtonian force per unit mass
$$F_i \equiv \frac{1}{2}\frac{d^2 h^{\rm TT}_{ij}}{dt^2}\,v^j , \qquad (7.19)$$
is divergence free, $\partial^i F_i = 0$, meaning that there are no sources or sinks for the gravitational lines.
Note also that, as in the electromagnetic case, only the transverse directions (v x and v y ) to the wave
propagation are affected (cf. Eq. (7.13)). If a particle is initially at z = 0, it will remain at z = 0.
6 Note that, at leading order in hµν , τ = t.
A pictorial representation of $F^i$ can be obtained by drawing the lines of force for the “plus” and “cross”
polarizations.
These lines are defined in such a way that at each point (x, y) they go in the direction of the force
with a density proportional to the modulus of the force7 . The effect of the components $h_+$ and $h_\times$
on the ring of particles is in clear agreement with the quadrupolar pattern displayed in the previous
figure:
$$\frac{d^2 v^x}{dt^2} = \frac{1}{2}\,v^x\,\frac{d^2}{dt^2}\left(h_+ e^{ik_\sigma x^\sigma}\right) , \qquad \frac{d^2 v^y}{dt^2} = -\frac{1}{2}\,v^y\,\frac{d^2}{dt^2}\left(h_+ e^{ik_\sigma x^\sigma}\right) , \qquad (7.20)$$
whose solution, to lowest order of accuracy, can be written as
$$v^x = v_0^x + \frac{1}{2}h_+ e^{ik_\sigma x^\sigma}v_0^x , \qquad v^y = v_0^y - \frac{1}{2}h_+ e^{ik_\sigma x^\sigma}v_0^y , \qquad (7.21)$$
with $v_0^x$ and $v_0^y$ standing for the initial separation of the particles in the x and y directions. A
“+-polarized” wave makes the particles initially located at $v_0^x$ and $v_0^y$ bounce back and forth in the
x and y directions respectively. This fact, together with the 180◦ phase difference associated with
the minus sign in Eq. (7.21), gives rise to the following pattern in the ring of particles
A “×-polarized” wave gives rise to a stretching and a squeezing along the $(2^{-1/2}, 2^{-1/2}, 0)$ and
$(-2^{-1/2}, 2^{-1/2}, 0)$ directions. The ring of particles bounces back and forth describing a cross
shape.
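The deformation pattern described by Eq. (7.21) can be tabulated directly. The sketch below takes the real part of the plane wave, so the phase factor becomes a cosine, and uses a hugely exaggerated amplitude so that the effect is visible; the function names, the number of particles and the amplitude are illustrative assumptions.

```python
import math

def ring_positions(h_plus, phase, n_particles=8, radius=1.0):
    """Positions of a ring of test particles in the x-y plane under a
    '+'-polarized wave, from the real part of Eq. (7.21):
    x = x0 * (1 + h_plus * cos(phase) / 2), y = y0 * (1 - h_plus * cos(phase) / 2)."""
    stretch = 0.5 * h_plus * math.cos(phase)
    positions = []
    for k in range(n_particles):
        angle = 2.0 * math.pi * k / n_particles
        x0, y0 = radius * math.cos(angle), radius * math.sin(angle)
        positions.append((x0 * (1.0 + stretch), y0 * (1.0 - stretch)))
    return positions

def axis_lengths(positions):
    """Extent of the deformed ring along the x and y directions."""
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    return max(xs) - min(xs), max(ys) - min(ys)

# Exaggerated amplitude h_plus = 0.2, sampled at phase 0 and half a period later.
len_x_0, len_y_0 = axis_lengths(ring_positions(0.2, 0.0))
len_x_pi, len_y_pi = axis_lengths(ring_positions(0.2, math.pi))
```

At phase 0 the ring is stretched along x and squeezed along y; half a period later the roles are exchanged, which is the 180° phase difference mentioned in the text.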
The components $h_+$ and $h_\times$ constitute the two independent linear polarizations of the gravitational
wave and play a similar role to the vertical and horizontal polarizations of electromagnetic waves.
Different superpositions of these two modes can always be considered within the linear theory.
                      $k^\mu = (\omega, 0, 0, \omega)$                         $k^\mu = (\omega, 0, 0, \omega)$
Polarization modes    $A_{11} = -A_{22} \equiv h_+ \neq 0$                     $a_1 \neq 0$
                      $A_{12} = A_{21} \equiv h_\times \neq 0$                 $a_2 \neq 0$
                      $h_{R,L} = \frac{1}{\sqrt{2}}(h_+ \pm ih_\times)$        $a_{R,L} = \frac{1}{\sqrt{2}}(a_1 \pm ia_2)$
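The entries of the table can be checked against the gauge conditions for a wave travelling along z. The sketch below builds the amplitude matrix from $h_+$ and $h_\times$ and evaluates $k_\sigma k^\sigma$, $k^\mu A_{\mu\nu}$ and the trace; the function names and the sample amplitudes are choices made for the illustration.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal Minkowski metric, signature (-,+,+,+)

def tt_amplitude(h_plus, h_cross):
    """Amplitude A_{mu nu} with A_11 = -A_22 = h_plus and A_12 = A_21 = h_cross."""
    A = [[0.0] * 4 for _ in range(4)]
    A[1][1], A[2][2] = h_plus, -h_plus
    A[1][2] = A[2][1] = h_cross
    return A

def gauge_checks(A, omega=1.0):
    """Return (k_sigma k^sigma, [k^mu A_{mu nu}], trace of A) for k^mu = (omega, 0, 0, omega)."""
    k_up = [omega, 0.0, 0.0, omega]
    k_squared = sum(ETA[m] * k_up[m] * k_up[m] for m in range(4))
    transversality = [sum(k_up[m] * A[m][n] for m in range(4)) for n in range(4)]
    trace = sum(ETA[m] * A[m][m] for m in range(4))
    return k_squared, transversality, trace

k_squared, transversality, trace_A = gauge_checks(tt_amplitude(0.3, -0.7), omega=2.0)
```

All three quantities vanish: the wave vector is null, the amplitude is transverse to it, and the amplitude is traceless.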
The total difference in length between the two arms can be derived8 from Eq. (7.18)
$$\frac{\Delta L}{L} \sim h . \qquad (7.23)$$
It is interesting to put in some numbers. If we consider for instance the typical amplitude of gravitational
waves emitted by a rotating binary system9 , $h \simeq 10^{-21}$, and a typical detector such as LIGO or Virgo
with arm lengths of 3-4 km, we get a change $\Delta L \simeq 10^{-16}$ cm.
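The estimate is a one-line product, spelled out here with the unit conversion made explicit. The function name is a hypothetical helper introduced for the illustration; the relation used is only the order-of-magnitude statement (7.23).

```python
def arm_length_change_cm(strain, arm_length_km):
    """Order-of-magnitude length change Delta L = h * L, returned in centimetres."""
    arm_length_cm = arm_length_km * 1.0e5  # 1 km = 1e5 cm
    return strain * arm_length_cm

delta_l_3km = arm_length_change_cm(1e-21, 3.0)
delta_l_4km = arm_length_change_cm(1e-21, 4.0)
```

Both values are a few times $10^{-16}$ cm, several orders of magnitude below the size of an atomic nucleus.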
8 We are implicitly assuming that the wave propagates orthogonally to the plane of the detector. In the general case,
h = J · n = (L + S) · n = S · n . (7.24)
Under a rotation of angle θ around that direction a helicity eigenstate $|h\rangle$ transforms as
There are always two helicity states h = ±s, corresponding to the alignment or counter-
alignment of the spin and the momentum.
a Unfortunately the traditional symbol h for the helicity coincides with some of the notations used in this
chapter.
Helicity              $A'_{ij} = (R^{-1})^k{}_i\,(R^{-1})^l{}_j\,A_{kl}$       $a'_i = (R^{-1})^j{}_i\,a_j$
                      $h_{R,L} \to e^{\mp 2i\theta}\,h_{R,L}$                  $a_{R,L} \to e^{\mp i\theta}\,a_{R,L}$
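The factor of 2 in the phase of $h_{R,L}$ can be verified by rotating the transverse 2×2 block of the amplitude explicitly. The sketch below applies $A \to R A R^{T}$ with R a rotation by θ about the propagation axis; this is one possible convention, assumed here, under which $h_R$ picks up $e^{+2i\theta}$ and $h_L$ picks up $e^{-2i\theta}$. With the inverse rotation the signs are exchanged, so only the magnitude of the helicity, not the sign assignment, should be read off from this check.

```python
import cmath
import math

def rotate_amplitudes(h_plus, h_cross, theta):
    """Rotate the transverse block [[h+, hx], [hx, -h+]] as A -> R A R^T,
    with R a rotation by theta about the propagation axis (assumed convention)."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]
    A = [[h_plus, h_cross], [h_cross, -h_plus]]
    RA = [[sum(R[i][k] * A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    A_new = [[sum(RA[i][k] * R[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
    return A_new[0][0], A_new[0][1]

def circular(h_plus, h_cross):
    """Circular combinations h_R and h_L = (h+ +/- i hx) / sqrt(2)."""
    return ((h_plus + 1j * h_cross) / math.sqrt(2.0),
            (h_plus - 1j * h_cross) / math.sqrt(2.0))

theta = 0.3
h_R, h_L = circular(1.0, 0.5)
h_R_rot, h_L_rot = circular(*rotate_amplitudes(1.0, 0.5, theta))
```

A rotation by π leaves both amplitudes unchanged, which is the familiar statement that the gravitational wave pattern is invariant under a rotation by 180°.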
CHAPTER 8
THE SCHWARZSCHILD-DROSTE SOLUTION
Schwarzschild’s letter
to Einstein during
World War I
Most of the work done till now has been related to weak-field solutions of the Einstein equations.
In this Chapter, we go a step forward and look for exact solutions. Given the non-linearity of the field
equations and the associated difficulty in finding analytical solutions for arbitrary matter distributions,
we will restrict ourselves to vacuum solutions. To determine our starting point, let me rewrite the
Einstein equations $G_{\mu\nu} + \Lambda g_{\mu\nu} = \kappa^2 T^M_{\mu\nu}$ in a much more convenient form. Multiplying by the inverse
metric and taking the trace we obtain a relation between the Ricci scalar, the cosmological constant
and the trace $T^M \equiv g^{\mu\nu}T^M_{\mu\nu}$ of the locally conserved energy-momentum tensor $T^M_{\mu\nu}$, namely
$$R^\mu{}_\mu - \frac{1}{2}R\,\delta^\mu{}_\mu + \Lambda\delta^\mu{}_\mu = \kappa^2 T^M \quad\longrightarrow\quad R = -\kappa^2 T^M + 4\Lambda . \qquad (8.1)$$
Substituting this result back into the original Einstein equations we realize that they can be written
as
$$R_{\mu\nu} = \kappa^2\left(T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T\right) + \Lambda g_{\mu\nu} . \qquad (8.2)$$
Vacuum solutions ($T^M_{\mu\nu} = \Lambda = 0$) correspond then to solutions of the equation
$$R_{\mu\nu} = 0 , \qquad (8.3)$$
rather than to solutions of $G_{\mu\nu} = 0$.
The problem of finding a solution of this equation is further simplified in those cases in which the
problem is highly symmetric. In what follows, we will look for spherically symmetric solutions.
$$ds^2 = -a(t, r)\,dt^2 - 2b(t, r)\,r\,dt\,dr + c(t, r)\,r^2dr^2 + d(t, r)\left(dr^2 + r^2d\Omega^2\right) , \qquad (8.7)$$
where we have defined dΩ2 = dθ2 + sin2 θdφ2 . Collecting terms together and defining some, still
arbitrary, functions
to take into account the extra factors of r in Eq. (8.7), we are left with
$$ds^2 = -A(t, r)\,dt^2 - 2B(t, r)\,dt\,dr + C(t, r)\,dr^2 + D(t, r)\,d\Omega^2 . \qquad (8.8)$$
The resulting metric can be further simplified by using the freedom in the choice of coordinates. For
instance, we can define a new radial coordinate $\bar r^2 \equiv D(t, r)$ and eliminate r and dr in terms of $\bar r$, t, $d\bar r$
and dt. This gives rise to a big mess that changes the explicit form of the coefficients A, B, C to some
new, but still arbitrary, coefficients A', B', C'
$$ds^2 = -A'(t, \bar r)\,dt^2 - 2B'(t, \bar r)\,dt\,d\bar r + C'(t, \bar r)\,d\bar r^2 + \bar r^2d\Omega^2 . \qquad (8.9)$$
The next thing we can do is to find some new time coordinate $\bar t(t, \bar r)$ to get rid of the nasty term $dt\,d\bar r$.
To do that, let me define this new time as
$$d\bar t = \mu(t, \bar r)\left[A'(t, \bar r)\,dt + B'(t, \bar r)\,d\bar r\right] = \partial_t\Psi(t, \bar r)\,dt + \partial_{\bar r}\Psi(t, \bar r)\,d\bar r , \qquad (8.10)$$
where the new unknown integrating factor µ is determined by the condition that the second equality
holds for some Ψ. In other words, we require $\mu(t, \bar r)\left[A'(t, \bar r)\,dt + B'(t, \bar r)\,d\bar r\right]$ to be a total differential,
so that the first equality makes sense.
Squaring Eq. (8.10)
$$d\bar t^2 = \mu^2\left(A'^2dt^2 + 2A'B'\,dt\,d\bar r + B'^2d\bar r^2\right) \qquad (8.11)$$
8.2 Spherical symmetry and staticity 111
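$$\Gamma^r{}_{tt} = e^{2(\alpha-\beta)}\partial_r\alpha , \qquad \Gamma^r{}_{tr} = \Gamma^r{}_{rt} = \partial_t\beta , \qquad \Gamma^r{}_{rr} = \partial_r\beta , \qquad (8.17)$$
$$\Gamma^r{}_{\theta\theta} = -r\,e^{-2\beta} , \qquad \Gamma^r{}_{\phi\phi} = \sin^2\theta\;\Gamma^r{}_{\theta\theta} , \qquad \Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r ,$$
The non-vanishing components of the Riemann tensor associated to these Christoffel symbols are
given by
$$R^t{}_{rtr} = e^{2(\beta-\alpha)}\left[\partial_t^2\beta + (\partial_t\beta)^2 - \partial_t\alpha\,\partial_t\beta\right] + \left[\partial_r\alpha\,\partial_r\beta - \partial_r^2\alpha - (\partial_r\alpha)^2\right] ,$$
which, contracted, provide us with the non-vanishing components of the Ricci tensor
$$R_{tt} = \left[\partial_t^2\beta + (\partial_t\beta)^2 - \partial_t\alpha\,\partial_t\beta\right] + \left[\partial_r^2\alpha + (\partial_r\alpha)^2 - \partial_r\alpha\,\partial_r\beta + \frac{2}{r}\partial_r\alpha\right]e^{2(\alpha-\beta)} ,$$
$$R_{rr} = \left[\partial_t^2\beta + (\partial_t\beta)^2 - \partial_t\alpha\,\partial_t\beta\right]e^{2(\beta-\alpha)} - \left[\partial_r^2\alpha + (\partial_r\alpha)^2 - \partial_r\alpha\,\partial_r\beta - \frac{2}{r}\partial_r\beta\right] , \qquad (8.19)$$
$$R_{tr} = \frac{2}{r}\partial_t\beta , \qquad R_{\theta\theta} = 1 + e^{-2\beta}\left[r(\partial_r\beta - \partial_r\alpha) - 1\right] , \qquad R_{\phi\phi} = R_{\theta\theta}\sin^2\theta .$$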
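The invariance of the line element under rotations implies the equality
$$\left(\frac{\partial\theta}{\partial\theta'}\right)^2 + \sin^2\theta\left(\frac{\partial\phi}{\partial\theta'}\right)^2 = 1 . \qquad (8.22)$$
Substituting this into the transformation law for the $R_{\theta\theta}$ component
$$R_{\theta'\theta'} = \left(\frac{\partial\theta}{\partial\theta'}\right)^2R_{\theta\theta} + \left(\frac{\partial\phi}{\partial\theta'}\right)^2R_{\phi\phi} \quad\longrightarrow\quad R_{\theta'\theta'} = \left(1 - \sin^2\theta\left(\frac{\partial\phi}{\partial\theta'}\right)^2\right)R_{\theta\theta} + \left(\frac{\partial\phi}{\partial\theta'}\right)^2R_{\phi\phi}$$
and demanding $R_{\theta'\theta'} = R_{\theta\theta}$, we get the sought-for relation $R_{\phi\phi} = \sin^2\theta\,R_{\theta\theta}$.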
The empty-space field equations are obtained by setting each of the components (8.19) equal to zero.
This gives rise to 5 equations, among which only 4 are useful since the $R_{\phi\phi}$ component simply repeats
the information of the $R_{\theta\theta}$ component. Among these 4 equations, the simplest one is that associated
with $R_{tr}$. A simple inspection of this equation reveals a very interesting property: the function β must
be independent of time,
$$R_{tr} = 0 \quad\longrightarrow\quad \partial_t\beta = 0 \quad\longrightarrow\quad \beta = \beta(r) . \qquad (8.23)$$
Taking into account this result and performing the time derivative of the vacuum equation $R_{\theta\theta} = 0$,
we get
$$\partial_tR_{\theta\theta} = 0 \quad\longrightarrow\quad \partial_t\partial_r\alpha = 0 \quad\longrightarrow\quad \alpha = \gamma(r) + \kappa(t) . \qquad (8.24)$$
The coefficient $e^{2\alpha(r,t)}$ can then be split into two pieces, $e^{2\alpha(r,t)} = e^{2\gamma(r)}e^{2\kappa(t)}$. This allows us to
perform an extra coordinate redefinition
specified by only two time-independent functions α(r) and β(r). The resulting metric is static3 even
though we did not impose any requirement on the source apart from being spherically symmetric. The
source could be as dynamical as a collapsing or a pulsating star and the metric outside the matter
distribution would still take the form (8.26), as long as the collapse is symmetric. This result is in
perfect agreement with our discussion on gravitational waves: if a spherically symmetric body under-
goes pure radial pulsations, there is no quadrupole and there is no emission of gravitational waves.
All vacuum solutions of the Einstein equations with SO(3) symmetry are necessarily static.
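with the prime denoting derivatives with respect to r. Note that the first two equations are rather
similar. Multiplying the first one by $e^{-2(\alpha-\beta)}$ and adding it to the second we get
$$e^{2(\beta-\alpha)}R_{tt} + R_{rr} = \frac{2}{r}\left(\alpha' + \beta'\right) = 0 \quad\longrightarrow\quad \alpha' + \beta' = 0 \quad\longrightarrow\quad \alpha(r) + \beta(r) = {\rm constant} . \qquad (8.30)$$
The integration constant appearing in the previous expression can always be set to zero by simply
performing a coordinate redefinition, allowing us to set α = −β. Inserting this result into Eq. (8.29)
we get
$$R_{\theta\theta} = 0 \quad\longrightarrow\quad \left(1 + 2r\alpha'\right)e^{2\alpha} = 1 \quad\longrightarrow\quad \left(re^{2\alpha}\right)' = 1 , \qquad (8.31)$$
which can be easily integrated to obtain
$$re^{2\alpha} = r + C \quad\to\quad e^{2\alpha} = e^{-2\beta} = 1 + \frac{C}{r} , \qquad (8.32)$$
or equivalently
$$ds^2 = -\left(1 + \frac{C}{r}\right)dt^2 + \left(1 + \frac{C}{r}\right)^{-1}dr^2 + r^2d\Omega^2 . \qquad (8.33)$$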
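The solution (8.32) can be checked against the $R_{\theta\theta}$ equation numerically. The sketch below evaluates $R_{\theta\theta} = 1 + e^{-2\beta}[r(\beta' - \alpha') - 1]$ with β = −α and a finite-difference derivative; the function names, the step size and the sample value of C are assumptions made for the illustration.

```python
import math

def alpha(r, C):
    """alpha(r) defined through e^{2 alpha} = 1 + C / r, Eq. (8.32)."""
    return 0.5 * math.log(1.0 + C / r)

def r_theta_theta(r, C, eps=1e-6):
    """R_{theta theta} with beta = -alpha, i.e. 1 - e^{2 alpha} (1 + 2 r alpha'),
    using a central finite difference for alpha'."""
    d_alpha = (alpha(r + eps, C) - alpha(r - eps, C)) / (2.0 * eps)
    return 1.0 - math.exp(2.0 * alpha(r, C)) * (1.0 + 2.0 * r * d_alpha)

# Sample value C = -2 (so that 1 + C/r > 0 requires r > 2).
residuals = [r_theta_theta(r, -2.0) for r in (5.0, 10.0, 50.0)]
```

The residuals vanish up to finite-difference error for any value of C, which reflects the fact that C is a free integration constant at this stage.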
3 A static spacetime is one in which
i) The components of gµν are independent of the timelike component x0 .
ii) The line element is invariant under the transformation x0 → −x0 .
If the second condition is not satisfied the spacetime is rather said to be stationary. A particular example of stationary
metric is the one generated by a rotating star, where the change x0 → −x0 changes the sense of rotation.
8.3 The Schwarzschild-Droste solution 114
The obtained metric is asymptotically flat: it tends to the Minkowski metric when r → ∞.
Birkhoff ’s theorem
Any solution of the vacuum Einstein equations with SO(3) symmetry must be static and asymp-
totically flat.
The only thing left is to associate the constant C with some physical parameter. The most important
use of a spherically symmetric vacuum solution is to represent the spacetime outside stars or planets.
In that case, we would expect to recover the Newtonian limit
$$g_{00} = -\left(1 - \frac{2GM}{r}\right) , \qquad g_{rr} = \left(1 + \frac{2GM}{r}\right) , \qquad (8.34)$$
at large r values. Comparing (8.34) with the r → ∞ limit of the metric (8.32)
$$g_{00} = -\left(1 + \frac{C}{r}\right) , \qquad g_{rr} = 1 - \frac{C}{r} , \qquad (8.35)$$
we get C = −2GM, which allows us to write the final and traditional expression for the so-called
Schwarzschild-Droste metric4
$$ds^2 = -\left(1 - \frac{R_S}{r}\right)dt^2 + \left(1 - \frac{R_S}{r}\right)^{-1}dr^2 + r^2d\Omega^2 \qquad (8.36)$$
with
$$R_S \equiv \frac{2GM}{c^2} \simeq 3\,{\rm km}\left(\frac{M}{M_\odot}\right) , \qquad (8.37)$$
the Schwarzschild radius 5 .
Exercise
• Verify that (8.36) satisfies Eq. (8.29). Explain why this is guaranteed to happen even
though we initially had three equations for two unknowns.
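• Show that the Schwarzschild metric can be written in a form that makes explicit its
isotropic character, namely
$$ds^2 = -\frac{\left(1 - \frac{R_S}{4\rho}\right)^2}{\left(1 + \frac{R_S}{4\rho}\right)^2}\,dt^2 + \left(1 + \frac{R_S}{4\rho}\right)^4\left(dx^2 + dy^2 + dz^2\right) , \qquad (8.38)$$
with
$$\rho = \frac{1}{2}\left(r - GM + \sqrt{r^2 - 2GMr}\right) . \qquad (8.39)$$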
4 Karl Schwarzschild found this exact solution in 1915 while serving in the German army on the Russian front during
the World War I and died a year later from pemphigus, a painful autoimmune disease. An alternative derivation of this
solution based on the Weyl method was presented by Droste around the same time but for some reason the physics
community completely ignored it.
5 We have momentarily restored the factors of c.
8.4 Measuring distances and times 115
3-dimensional Euclidean space can seem to be curved even though it is intrinsically flat, K = κ1 κ2 = 0.
8.6 Apparent singularity 116
Figure 8.1: Embedding diagram for the Schwarzschild (r − φ) plane: Flamm’s paraboloid
Is this a problem? Not necessarily. In most astrophysical applications the typical size R of
the source is much larger than the Schwarzschild radius (8.37)
$$\left.\frac{R_S}{R}\right|_\oplus \approx 10^{-9} , \qquad \left.\frac{R_S}{R}\right|_\odot \approx 10^{-6} , \qquad \left.\frac{R_S}{R}\right|_{\rm NS} \approx 10^{-1} . \qquad (8.46)$$
This fact makes the singularities at r = 0 and r = $R_S$ completely irrelevant in most cases, since
they lie in the interior of objects where the exterior Schwarzschild solution does not apply. Indeed,
the problem disappears when one considers realistic interior solutions of the Einstein equations
$$ds^2 = -\left(1 - \frac{2GM(r)}{r}\right)dt^2 + \left(1 - \frac{2GM(r)}{r}\right)^{-1}dr^2 + r^2d\Omega^2 , \qquad (8.47)$$
since the function M(r) decreases faster than r and effectively kills all the above singularities.
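The orders of magnitude in (8.37) and (8.46) follow from a short computation. In the sketch below the masses and radii of the Earth, the Sun and a neutron star (taken as 1.4 solar masses and 12 km) are rounded reference values assumed for the illustration, as are the function and variable names.

```python
G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2
C_LIGHT = 2.998e8      # m / s

# Assumed rounded (mass in kg, radius in m) for each body.
BODIES = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (1.989e30, 6.957e8),
    "neutron star": (1.4 * 1.989e30, 1.2e4),
}

def schwarzschild_radius_m(mass_kg):
    """R_S = 2 G M / c^2 in metres, Eq. (8.37)."""
    return 2.0 * G_NEWTON * mass_kg / C_LIGHT ** 2

def compactness(mass_kg, radius_m):
    """Ratio R_S / R appearing in Eq. (8.46)."""
    return schwarzschild_radius_m(mass_kg) / radius_m

ratios = {name: compactness(m, r) for name, (m, r) in BODIES.items()}
sun_rs_m = schwarzschild_radius_m(BODIES["Sun"][0])
```

With these inputs the Schwarzschild radius of the Sun comes out close to 3 km, and the three ratios fall in the decades quoted in (8.46).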
8.7 Geodesics in Schwarzschild metric 117
We should worry and speculate about the singularities only in those cases in which the size of the
object is such that the Schwarzschild-Droste solution applies all the way down to r = $R_S$. Objects of
this kind are called black holes. Even in that case the two singularities described above are not on
equal footing. The metric coefficients in the line element (8.36) depend on the choice of a particular
coordinate system and you should not draw any conclusion from them alone. Let me present an
illustrative example.
$$ ds^2 = dx^2 + dy^2, \qquad (8.48) $$
and perform a general coordinate transformation to a new variable ρ defined through $x = 2\sqrt{\rho}$
to get
$$ ds^2 = \frac{1}{\rho}\,d\rho^2 + dy^2. \qquad (8.49) $$
The metric appears to blow up at ρ = 0 even though we know that our space is, by construction,
flat and free of singularities. The apparent singularity is a breakdown of our coordinate system
at the point in which ρ becomes negative. It has nothing to do with a breakdown of the
underlying manifold!
In order to determine if we are dealing with some artifact of our coordinate system or with a true
physical singularity, we cannot look at the curvature tensors alone either, since their components are
coordinate-dependent7. We should rather construct scalars out of the curvature tensors. If any such
scalar blows up in a particular coordinate system, it will do so in all of them. The simplest possibility
would be to consider the Ricci scalar R, but we can also construct higher order scalars such as
$R_{\mu\nu}R^{\mu\nu}$ or $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}$. For the particular case of the Schwarzschild-Droste metric, the first two quantities are
not useful since they are identically equal to zero. We are then forced to consider the square of the Riemann
tensor, the so-called Kretschmann scalar. Taking into account the non-vanishing components of the
Riemann tensor (8.18), we obtain
$$ K = R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma} = \frac{12R_S^2}{r^6}, \qquad (8.50) $$
which is a perfectly regular quantity at the Schwarzschild radius, but becomes infinite at r = 0. This
last point is a real physical singularity! The singularity at r = R_S is, on the other hand, just a
pathology of the specific coordinate system used.
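The difference between the two singularities can be made concrete with a few numbers. The following is only a minimal sketch: the choice of units with R_S = 1 and the function names are assumptions made here for illustration. It evaluates the radial metric coefficient (1 − R_S/r)^(−1) next to the Kretschmann scalar (8.50) as r approaches R_S and as r approaches 0.

```python
# Units with G = c = 1 and R_S = 1 (an assumption for illustration),
# so r is measured in Schwarzschild radii.
R_S = 1.0


def g_rr(r):
    """Radial metric coefficient (1 - R_S/r)^(-1) in Schwarzschild coordinates."""
    return 1.0 / (1.0 - R_S / r)


def kretschmann(r):
    """Kretschmann scalar K = 12 R_S^2 / r^6, Eq. (8.50)."""
    return 12.0 * R_S**2 / r**6


# Approach the horizon from outside, then move towards r = 0.
for r in (1.1, 1.01, 1.001, 0.1, 0.01):
    print(f"r/R_S = {r:<6} g_rr = {g_rr(r):12.4e}   K = {kretschmann(r):12.4e}")
```

The metric coefficient grows without bound as r → R_S while K stays of order 12/R_S^4 there; only towards r = 0 does the scalar itself diverge.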
$$ S = \int L\,d\sigma = \frac{1}{2}\int d\sigma\left(e^{-1}(\sigma)\,g_{\mu\nu}\frac{dx^\mu}{d\sigma}\frac{dx^\nu}{d\sigma} - m^2 e(\sigma)\right), \qquad (8.51) $$
7 They can catch singularities when going from one coordinate system to another through the transformation matrices
∂ x̄µ /∂xν .
with a clear physical interpretation. In the massless case, E and h are the relativistic energy and
angular momentum that the particle would have at r = ∞. In the massive case, they are the relativistic
energy and angular momentum per unit mass.
Exercise
Check this by taking the non-relativistic limit of (8.57) and (8.58) at the equatorial plane
θ = π/2.
Conservation of angular momentum means that the particle moves in a plane, which we can set to
be the equatorial plane θ = π/2 without loss of generality. Indeed, a simple inspection of Eq. (8.55)
shows that if we consider a geodesic passing through a point on the equator θ = π/2 and tangent to the
equatorial plane, θ̇ = 0, we will always have θ̈ = 0 and θ̇ = 0.
On top of the above symmetries, we still have a generic conservation law associated to the invariance
of the action (8.51) under reparametrizations of the path σ → f(σ) (cf. Section 3.5.1). This reads
$$ \frac{d}{d\sigma}\left(g_{\mu\nu}u^\mu u^\nu\right) = 0 \quad\longrightarrow\quad g_{\mu\nu}u^\mu u^\nu = -\epsilon, \qquad (8.59) $$
with ε = 1 and 0 for massive and massless particles respectively. Expanding this equation10
$$ -\left(1-\frac{R_S}{r}\right)\dot t^2 + \left(1-\frac{R_S}{r}\right)^{-1}\dot r^2 + r^2\dot\phi^2 = -\epsilon \qquad (8.60) $$
8 Remember that σ = τ in the massive case.
9 Up to a global factor m in the massive case.
10 Remember that θ = π/2.
and plugging (8.57) and (8.58) we obtain a single equation for r(σ)
$$ \frac{1}{2}\left(\frac{dr}{d\sigma}\right)^2 + V(r) = \mathcal{E}, \qquad (8.61) $$
with
$$ V(r) \equiv -\epsilon\,\frac{GM}{r} + \frac{h^2}{2r^2} - \frac{GMh^2}{r^3} \qquad (8.62) $$
playing the role of an exact effective potential and
$$ \mathcal{E} \equiv \frac{1}{2}\left(E^2 - \epsilon\right). \qquad (8.63) $$
Eq. (8.61) is structurally equivalent to that of a particle of unit mass and energy11 $\mathcal{E}$ moving in an
effective potential V(r). It is interesting to compare the obtained potential with the Newtonian result
$$ V_N(r) = -\frac{GM}{r} + \frac{h^2}{2r^2}. \qquad (8.64) $$
The first two terms in Eq. (8.62) are just the universal gravitational attraction and the centrifugal
barrier that were already present in Newton’s theory of gravity. The third term is new.
At sufficiently long distances, the extra contribution is rather small and does not significantly modify
the Newtonian effective potential12 (cf. Fig. 8.4). The situation is completely different at short
distances. The new term eventually dominates over the centrifugal barrier for small r and drives the
potential to −∞13. Let me analyze the massive and massless cases separately.
Massive particles, ε = 1, σ = τ:
We then have four possibilities depending on the relation between the effective energy $\mathcal{E}$ of the
particle and the potential (cf. Fig. 8.2):
1. Circular orbits: If $\mathcal{E} = V(r_{max})$ or $\mathcal{E} = V(r_{min})$ the particle describes an unstable or stable
circular orbit respectively.
2. Bound precessing orbits: If $0 > \mathcal{E} > V(r_{min})$ the particle is trapped in the potential well and
describes an elliptical orbit with shifting perihelion (see below).
3. Scattering orbits: If $V(r_{max}) > \mathcal{E} > 0$ the particle bounces off the potential barrier and retreats back
to infinity.
4. Plunging orbits: If $\mathcal{E} > V(r_{max})$ the particle sails over the top of the potential to finally
spiral into the black hole.
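The classification above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: units with GM = 1, a particle coming from outside the barrier, and function names (`circular_orbit_radii`, `classify`) that are ours. The extrema of the potential (8.62) with ε = 1 follow from dV/dr = 0, that is, GM r² − h² r + 3GM h² = 0.

```python
import math


def V(r, h, GM=1.0):
    """Effective potential (8.62) for massive particles (epsilon = 1)."""
    return -GM / r + h**2 / (2.0 * r**2) - GM * h**2 / r**3


def circular_orbit_radii(h, GM=1.0):
    """Return (r_max, r_min), the unstable and stable circular orbits.

    They solve GM r^2 - h^2 r + 3 GM h^2 = 0. Returns None when
    h^2 < 12 (GM)^2 = 3 R_S^2, where the potential has no extrema.
    """
    disc = h**4 - 12.0 * GM**2 * h**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (h**2 - root) / (2.0 * GM), (h**2 + root) / (2.0 * GM)


def classify(E_eff, h, GM=1.0):
    """Orbit type for a particle approaching from outside the barrier."""
    radii = circular_orbit_radii(h, GM)
    if radii is None:
        return "plunging"          # no centrifugal barrier at all
    r_max, r_min = radii
    if E_eff > V(r_max, h, GM):
        return "plunging"          # sails over the top of the barrier
    if E_eff >= 0.0:
        return "scattering"        # reflected back to infinity
    if E_eff >= V(r_min, h, GM):
        return "bound"             # circular in the limit E_eff = V(r_min)
    return "not allowed outside the barrier"


print(circular_orbit_radii(4.0))   # extrema in units of GM
print(classify(-0.02, 4.0), classify(0.05, 5.0), classify(0.2, 5.0))
```

For h = 4GM the extrema sit at r = 4GM and r = 12GM, and lowering h towards h² = 12(GM)² = 3R_S² makes the two radii merge.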
11 The true energy per unit mass is E, but the radial motion in the effective potential is governed by $\mathcal{E}$.
12 The small correction will, however, play a central role! See next section.
13 Note that, according to (8.62), V(R_S) = −ε/2 for every value of h.
[Figure 8.2, two panels: V_N(r) (left) and V(r) (right), each plotted against r/R_S.]
Figure 8.2: Effective potentials in Newtonian gravity and General Relativity for massive particles.
Different lines correspond to h²/R_S² = 0, 1, 3, 5, 7, 9 (from brown to blue). Note the change in the
potential at the critical value h²/R_S² = 3.
Exercise
What happens with rmax and rmin when h → 0? And when h decreases? Which is the
minimal value of h and r allowing for a stable circular orbit?
• If h² < 3R_S² the centrifugal barrier disappears and the particle has no other option but to spiral
into the singularity. Consider for clarity the limiting case h = 0 in which the particle follows a
radial trajectory. In this case, the radial equation of motion (8.61) becomes14
$$ \frac{dr}{d\tau} = \pm\left(\frac{R_S}{r}\right)^{1/2} \quad\longrightarrow\quad \int\sqrt{r}\,dr = -R_S^{1/2}\int d\tau. \qquad (8.66) $$
at r = R_S. For the observer at infinity the particle appears to approach but never quite cross
the horizon! This is just another indication that the Schwarzschild coordinates are flawed near
r = R_S.
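Integrating (8.66) from a starting radius r0 down to r gives the proper time τ = (2/3) R_S^(−1/2) (r0^(3/2) − r^(3/2)), which is finite both at r = R_S and at r = 0. The sketch below (function names and the choice R_S = 1 are assumptions for illustration) compares this closed form with a direct numerical quadrature of the same integral.

```python
import math


def proper_time_closed_form(r, r0, R_S=1.0):
    """tau elapsed while falling from r0 to r, from integrating Eq. (8.66)."""
    return (2.0 / 3.0) * (r0**1.5 - r**1.5) / math.sqrt(R_S)


def proper_time_numeric(r, r0, R_S=1.0, steps=100_000):
    """Midpoint-rule estimate of the integral of sqrt(r'/R_S) dr' from r to r0."""
    dr = (r0 - r) / steps
    total = 0.0
    for i in range(steps):
        total += math.sqrt((r + (i + 0.5) * dr) / R_S)
    return total * dr


# Fall from r0 = 4 R_S to the horizon and then all the way to r = 0.
for target in (1.0, 0.0):
    print(target,
          proper_time_closed_form(target, 4.0),
          proper_time_numeric(target, 4.0))
```

Nothing special happens to the proper time at r = R_S: the infalling clock reaches the horizon, and then the singularity, after a finite τ.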
Exercise
What happens with t when the observer crosses the horizon?
Massless particles, ε = 0:
The potential (8.62) with ε = 0 displays a unique maximum for all values of h at
$$ r_{max} = \frac{3}{2}R_S. \qquad (8.69) $$
Thus, the motion of massless particles can be divided into three cases:
2. Scattering orbits: If $V(r_{max}) > \mathcal{E}$ the particle bounces off the potential barrier and retreats back to
infinity (deflection of light).
3. Plunging orbits: If $\mathcal{E} > V(r_{max})$ the particle sails over the top of the potential to finally spiral
into the black hole.
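For massless particles the barrier height follows directly from (8.62) and (8.69): V(3GM) = h²/(54 G²M²). Comparing it with the effective energy E²/2 of (8.63) separates the two cases in terms of the ratio b = h/E. The sketch below assumes units with GM = 1 and uses b as an impact parameter; both the name and the helper functions are illustrative choices.

```python
import math


def V_massless(r, h, GM=1.0):
    """Effective potential (8.62) with epsilon = 0."""
    return h**2 / (2.0 * r**2) - GM * h**2 / r**3


def critical_impact_parameter(GM=1.0):
    """Value of b = h/E at which E^2/2 equals the barrier height.

    V(3 GM) = h^2 / (54 (GM)^2) = E^2 / 2  gives  b = 3 sqrt(3) GM.
    """
    return 3.0 * math.sqrt(3.0) * GM


def fate(b, GM=1.0):
    """Scattering if the photon stays below the top of the barrier."""
    return "scattering" if b > critical_impact_parameter(GM) else "plunging"


r_max = 3.0  # (3/2) R_S in units of GM
print(V_massless(r_max, 1.0), 1.0 / 54.0)
print(critical_impact_parameter(), fate(6.0), fate(5.0))
```

Photons aimed with b slightly above 3√3 GM linger near r = 3GM before escaping, which is the strong-field end of the deflection of light.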
14 Among the two signs in the square root we take the negative one, in such a way that we fall toward r → 0
[Figure 8.4, two panels: V_N(r) (left) and V(r) (right), each plotted against r/R_S.]
Figure 8.4: Effective potentials in Newtonian gravity and General Relativity for massless particles.
Different lines correspond to h²/R_S² = 0, 1, 3, 5, 7, 9 (from brown to blue).
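$$ \frac{dr}{d\sigma} = \frac{dr}{d\phi}\frac{d\phi}{d\sigma} = \frac{h}{r^2}\frac{dr}{d\phi}, \qquad (8.70) $$
to obtain
$$ \left(\frac{h}{r^2}\frac{dr}{d\phi}\right)^2 + \frac{h^2}{r^2} = \epsilon\,\frac{2GM}{r} + \frac{2GMh^2}{r^3} + 2\mathcal{E}. \qquad (8.71) $$
The tricks to solve this kind of equation are well known. Let's perform a change of variable u ≡ 1/r
in (8.71)
$$ \left(\frac{du}{d\phi}\right)^2 + u^2 = \epsilon\,\frac{2GMu}{h^2} + 2GMu^3 + \frac{2\mathcal{E}}{h^2}, \qquad (8.72) $$
and differentiate the result with respect to φ. This gives rise to a second order differential equation of the
form
$$ \frac{d^2u}{d\phi^2} + u = \epsilon\,\frac{GM}{h^2} + 3GMu^2. \qquad (8.73) $$
For massive particles (ε = 1) this becomes
$$ \frac{d^2u}{d\phi^2} + u = \frac{GM}{h^2} + 3GMu^2. \qquad (8.74) $$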
The resulting equation is extremely similar to the Newtonian equation of motion of a particle of mass
m in the equatorial plane
$$ \frac{d^2u_0}{d\phi^2} + u_0 = \frac{GM}{h^2} \qquad (8.75) $$
even though the interpretation of the radial variable r is completely different15. As you probably
remember from your Classical Mechanics course, the general solution of (8.75) is a conic
$$ u_0 = \frac{GM}{h^2}\left(1 + e\cos\phi\right) \quad\longrightarrow\quad r_0 = \frac{a(1-e^2)}{1+e\cos\phi} \qquad (8.76) $$
with
$$ a(1-e^2) = \frac{h^2}{GM}. \qquad (8.77) $$
15 In Newtonian gravity r is the radial distance from the mass while in the relativistic case it is just a radial coordinate
• f = focus: The point on the semi-major axis at a distance f = ae from the geometric
center of the ellipse.
• l = semi-latus rectum: The distance l = b²/a from the focus to the ellipse along a line
parallel to the semi-minor axis.
• r_p = periapsis: The distance r_p = a(1 − e) from the focus to the nearest point
of the ellipse.
• r_a = apoapsis: The distance r_a = a(1 + e) from the focus to the furthest point
of the ellipse.
• The equation of the orbit: It gives the distance to the orbiting body from the focus
of the orbit as a function of the polar angle θ
$$ r(\theta) = \frac{a(1-e^2)}{1+e\cos\theta}. \qquad (8.79) $$
If the gravitational field is sufficiently weak, Newtonian gravity alone is expected to provide a good
approximation to the motion of massive particles in General Relativity. This suggests treating the extra
term 3GMu² as a perturbation on top of the solution of Eq. (8.75). The perturbative solution of Eq.
(8.74) can be determined by considering the ansatz
$$ u = u_0 + \Delta u, \qquad (8.80) $$
8.8 Solving the radial equation 125
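$$ \Delta\phi = \frac{2\pi}{1-\alpha} - 2\pi \approx 2\pi\alpha = \frac{6\pi G^2M^2}{h^2}, \qquad (8.90) $$
which taking into account (8.77) can be written as16 (note that we restore the c factors)
$$ \Delta\phi = \frac{6\pi G^2M^2}{h^2c^2} = \frac{6\pi GM}{a(1-e^2)c^2}. \qquad (8.91) $$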
Because it is a small effect, let’s accumulate this over 100 years to get the observable quantity
$$ \Delta\phi_{100} \equiv \frac{\Delta\phi}{T}\times\frac{100\ \text{years}}{\text{century}}, \qquad (8.92) $$
with T the period of the orbit in years. In terms of observable orbits within the solar system, Mercury
is the closest planet to the Sun, and so it should have the largest precession.
The major axis of Mercury precesses at a rate of 43 arcseconds per century. The observational results are
in excellent agreement with General Relativity

Planet     Observed             General Relativity
Mercury    (43.11 ± 0.45)''     43.03''
Venus      (8.4 ± 4.8)''        8.6''
Earth      (5.0 ± 1.2)''        3.8''
16 The use of the expressions for the unperturbed solution is justified by the fact that we are looking at a very small
quantity.
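The numbers quoted for General Relativity can be reproduced directly from (8.91) and (8.92). The sketch below is illustrative only: the solar mass parameter and the orbital elements are standard approximate values supplied here as assumptions, not data taken from these notes.

```python
import math

GM_SUN = 1.32712440018e20          # m^3 / s^2 (assumed standard value)
C = 299_792_458.0                  # m / s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

# (semi-major axis in m, eccentricity, period in days): assumed approximate values
PLANETS = {
    "Mercury": (5.7909e10, 0.2056, 87.969),
    "Venus": (1.0821e11, 0.0068, 224.701),
    "Earth": (1.49598e11, 0.0167, 365.256),
}


def precession_per_orbit(a, e):
    """Eq. (8.91): perihelion advance per orbit, in radians."""
    return 6.0 * math.pi * GM_SUN / (a * (1.0 - e**2) * C**2)


def precession_per_century(a, e, period_days):
    """Eq. (8.92): advance accumulated over 100 years, in arcseconds."""
    orbits_per_century = 100.0 * 365.25 / period_days
    return precession_per_orbit(a, e) * orbits_per_century * ARCSEC_PER_RAD


for name, (a, e, period) in PLANETS.items():
    print(f"{name:8s} {precession_per_century(a, e, period):6.2f} arcsec per century")
```

With these inputs the three results land close to the General Relativity column of the table; Mercury dominates because it has both the smallest a(1 − e²) and the largest number of orbits per century.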
8.9 The massless case: Gravitational deflection of light 127
enough to account for the 43'' per century would be encountered between the Sun and Mercury.
we get
$$ \Delta\phi = \frac{4GM}{c^2R_\odot} = 1.75''. \qquad (8.104) $$
Light paths so close to the Sun are of course not visible by day, but they become visible at the
time of a total eclipse. Their position relative to the other background stars during the total eclipse
appears shifted relative to the position in the usual night sky. This prediction of General Relativity
was verified in 1919, just a few years after the formulation of the theory. Two separate groups led by
Arthur Eddington and Andrew Crommelin travelled to Guinea and Brazil to observe the total eclipse of
May 29, 1919. They reported deflections of (1.61 ± 0.40)'' and (1.98 ± 0.16)'', in reasonable agreement
with Einstein's prediction (8.104).
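As a quick numerical check of (8.104), the deflection angle can be evaluated for a ray grazing the solar limb. The solar mass parameter and the nominal solar radius below are standard values introduced here as assumptions; the function name is illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 / s^2 (assumed standard value)
R_SUN = 6.957e8             # m, nominal solar radius (assumed)
C = 299_792_458.0           # m / s


def deflection_arcsec(b, GM=GM_SUN):
    """Deflection angle 4 G M / (c^2 b) for impact parameter b, in arcseconds."""
    return 4.0 * GM / (C**2 * b) * (180.0 * 3600.0 / math.pi)


print(f"grazing the solar limb: {deflection_arcsec(R_SUN):.2f} arcsec")
print(f"at two solar radii:     {deflection_arcsec(2.0 * R_SUN):.2f} arcsec")
```

The angle falls off as 1/b, so stars observed further from the limb during an eclipse show a proportionally smaller shift.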
When these parameters are taken into account Eqs. (8.91) and (8.102) become respectively
$$ \Delta\phi = \frac{2-\beta+2\gamma}{3}\,\frac{6\pi GM}{a(1-e^2)c^2}, \qquad \Delta\phi = \frac{1+\gamma}{2}\,\frac{4GM}{bc^2}. \qquad (8.106) $$
We move now to the modern approach to General Relativity: field theory. The chief advantage of
this formulation is that it is simple and easy; the only thing to specify is the so-called Lagrangian
density. We start by presenting a simple introduction to classical field theory in flat spacetime which
we later generalize to curved spacetime. The last part of the Chapter is devoted to the action for the
gravitational field and the recovery of Einstein equations from it.
depending on generalized coordinates and velocities {qj (t), q̇j (t)}. The classical trajectory is defined
as the unique path that extremizes the action functional (δS = 0) for all variations qj → qj + δqj with
fixed initial qj (t0 ) and final values qj (tf ). An explicit variation of the action gives
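$$ \delta S = \int_{t_0}^{t_f} dt\left(\frac{\partial L}{\partial q_j}\delta q_j + \frac{\partial L}{\partial\dot q_j}\delta\dot q_j\right). \qquad (9.2) $$
Integrating the last term by parts to flip the temporal derivative onto ∂L/∂q̇_j we get
$$ \delta S = \int_{t_0}^{t_f} dt\left(\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_j}\right)\delta q_j, \qquad (9.3) $$
where we have omitted a total derivative that vanishes because of the boundary conditions δq_j(t_0) =
δq_j(t_f) = 0. Since δq_j is arbitrary, the extremization of the action translates into the so-called Euler-
Lagrange equations
$$ \frac{d}{dt}\frac{\partial L}{\partial\dot q_j} - \frac{\partial L}{\partial q_j} = 0. \qquad (9.4) $$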
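The statement δS = 0 can be checked numerically on a simple system. The sketch below assumes a unit-mass, unit-frequency harmonic oscillator, L = q̇²/2 − q²/2, a discretized action and a test variation η(t) = sin(πt/T) vanishing at the endpoints; all of these are illustrative choices rather than anything fixed by the text. It estimates dS/dε for q → q + εη around a solution of the Euler-Lagrange equation and around a path that is not a solution.

```python
import math


def action(path, T=1.0, N=2000):
    """Discretized action for L = q_dot^2 / 2 - q^2 / 2 on [0, T]."""
    dt = T / N
    S = 0.0
    for i in range(N):
        q0, q1 = path(i * dt), path((i + 1) * dt)
        q_dot = (q1 - q0) / dt          # velocity on the interval
        q_mid = 0.5 * (q0 + q1)         # position at the midpoint
        S += (0.5 * q_dot**2 - 0.5 * q_mid**2) * dt
    return S


def first_variation(path, eps=1e-3, T=1.0):
    """Central-difference estimate of dS/d(eps) for q -> q + eps * eta."""
    def eta(t):
        return math.sin(math.pi * t / T)   # vanishes at t = 0 and t = T

    def shifted(sign):
        return lambda t: path(t) + sign * eps * eta(t)

    return (action(shifted(+1.0), T) - action(shifted(-1.0), T)) / (2.0 * eps)


def classical(t):
    return math.sin(t)                 # solves q_ddot + q = 0


def straight(t):
    return t * math.sin(1.0)           # same endpoints, not a solution


print("classical path:", first_variation(classical))
print("straight line: ", first_variation(straight))
```

The first number vanishes up to discretization error, while the second stays of order −sin(1)/π: only the path obeying (9.4) makes the action stationary.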
This variational formulation has several advantages:
9.2 From Classical Mechanics to Field theory 130
i) The properties of the system are compactly summarized in one function, the Lagrangian.
ii) There is a direct connection between invariances of the Lagrangian and constants of motion1 .
iii) There is a close relation between the Lagrangian formulation of classical mechanics and quantum
mechanics.
The continuous limit of the previous expression can be taken by sending N → ∞ and a → 0 in such
a way that the total length of the chain, l = (N + 1)a, remains fixed. To keep the total mass of the
system and the force between particles finite we require m/a and ka to go to some finite values µ and
Y playing the role of the mass density and the Young modulus in the continuous theory. We have
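$$ L = \frac{1}{2}\sum_{j=1}^{N} a\,\frac{m}{a}\,\dot\phi_j^2(t) - \frac{1}{2}\sum_{j=0}^{N} a\,(ka)\left[\frac{\phi_{j+1}(t)-\phi_j(t)}{a}\right]^2 \quad\longrightarrow\quad L = \frac{1}{2}\int_0^l dx\left[\mu\dot\phi^2 - Y(\partial_x\phi)^2\right], $$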
with the finite number of generalized coordinates φ_j replaced by a continuous function φ(x, t). The
antisymmetric dependence of Eq.(9.6) on the derivatives suggests the introduction of a set of
coordinates $x^\mu = (c_s t, x)^T$ with $c_s = \sqrt{Y/\mu}$ and a Lorentzian metric $\eta_{\mu\nu} = \mathrm{diag}(-1, 1)$. This allows us to
write
$$ S = \int d(c_s t)\,dx\,\mathcal{L} \qquad (9.6) $$
with
$$ \mathcal{L} = -\frac{\mu c_s}{2}\,\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi, \qquad (9.7) $$
the so-called Lagrangian density. The jump from fields existing within a physical medium to fields in
vacuum is now straightforward: we must simply replace c_s by the speed of light c. Generalizing the
metric η_{μν} to arbitrary dimensions, we can generically write the action for relativistic fields as
$$ S = \int d^nx\,\mathcal{L}(\phi, \partial_\mu\phi), \qquad (9.8) $$
where we have allowed for a dependence of the Lagrangian density on the fields.
Exercise
Consider again the chain of masses connected by springs. Modify the system to give rise to an
explicit dependence of the Lagrangian on φ. Hint: Eq. (9.7) is shift-invariant.
1 For instance, if the Lagrangian is invariant under rotations, angular momentum is conserved.
9.3 Principles of Lagrangian construction 131
The term associated to ∂σ g σ turns out to be a boundary term, which does not contribute to
the equations of motion. Lagrangians differing by a contribution ∂σ g σ give rise to the same
equations of motion.
The equations of motion for the field φ(x, t) can be obtained by considering the change of the action
under an infinitesimal change φ(x, t) −→ φ(x, t) + δφ(x, t). The only requirement to be satisfied by
the variations δφ is to be differentiable and to vanish outside some bounded region of spacetime (to
allow an integration by parts). Performing this variation we get
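$$ \delta S = \int d^nx\left(\frac{\partial\mathcal{L}}{\partial\phi}\delta\phi + \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\delta\phi_{,\mu}\right) = \int d^nx\left(\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)\delta\phi. \qquad (9.11) $$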
Requiring the action to be stationary (δS = 0) and taking into account that δφ is completely arbitrary,
we obtain the continuous version of the Euler-Lagrange equations
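$$ \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) - \frac{\partial\mathcal{L}}{\partial\phi} = 0. \qquad (9.12) $$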
A worked-out example
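As a direct application of Eq. (9.12), let me consider the action (9.7)
$$ \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) = 0 \quad\longrightarrow\quad \partial_\mu\left(\eta^{\mu\nu}\partial_\nu\phi\right) = 0 \quad\longrightarrow\quad -\frac{1}{c_s^2}\frac{\partial^2\phi}{\partial t^2} + \frac{\partial^2\phi}{\partial x^2} = 0. \qquad (9.13) $$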
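Equation (9.13) is the wave equation with propagation speed c_s, so any profile of the form f(x − c_s t) should solve it. The sketch below checks this with finite differences; the Gaussian profile, the values Y = 4 and μ = 1, the step size and the function names are arbitrary assumptions made for illustration.

```python
import math


def residual(phi, x, t, c_s, d=1e-3):
    """Finite-difference estimate of -(1/c_s^2) phi_tt + phi_xx at (x, t)."""
    phi_tt = (phi(x, t + d) - 2.0 * phi(x, t) + phi(x, t - d)) / d**2
    phi_xx = (phi(x + d, t) - 2.0 * phi(x, t) + phi(x - d, t)) / d**2
    return -phi_tt / c_s**2 + phi_xx


Y, mu = 4.0, 1.0                 # Young modulus and mass density (assumed values)
c_s = math.sqrt(Y / mu)          # sound speed c_s = sqrt(Y / mu)


def travelling(x, t):
    return math.exp(-(x - c_s * t) ** 2)   # right-moving Gaussian pulse


def static(x, t):
    return math.exp(-x**2)                  # frozen pulse, not a solution


print("travelling pulse:", residual(travelling, 0.3, 0.1, c_s))
print("static pulse:    ", residual(static, 0.3, 0.1, c_s))
```

The residual of the travelling pulse is zero up to the discretization error, while the frozen pulse leaves a residual of order one.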
1. L must be a real-valued function, since it enters into expressions of physical significance, like
the Hamiltonian.
2. L must have dimension 4 in units of energy, since in natural units the action is dimensionless
and [d4 x] = −4.
3. L must be a linear combination of Lorentz invariant quantities constructed from the fields, their
first partial derivatives and the universally available objects $\eta_{\mu\nu}$ and $\epsilon_{\mu\nu\rho\sigma}$.
4. The coefficients of this linear combination can be restricted by the symmetries of the problem
(internal symmetries/ gauge symmetries).
5. L should be bounded from below.
The power of the previous program is made most vividly evident by considering some examples.
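The quadratic Lorentz invariants which can be constructed from Φ, Φ*, ∂_μΦ and ∂_μΦ* lead to a
Lagrangian density of the form
$$ \mathcal{L} = \frac{1}{2}\eta^{\mu\nu}\left[a\,\partial_\mu\Phi\partial_\nu\Phi + a^*\partial_\mu\Phi^*\partial_\nu\Phi^* + 2a_0\,\partial_\mu\Phi^*\partial_\nu\Phi\right] + \frac{1}{2}\left[b\,\Phi\Phi + b^*\Phi^*\Phi^* + 2b_0\,\Phi^*\Phi\right], \qquad (9.15) $$
where the reality condition L = L* imposes the appearance of the pairs a, a* and b, b* and requires the
coefficients a_0 and b_0 to be real. The previous Lagrangian density can be written in a more compact
way by introducing the arrays
$$ \tilde\Phi \equiv \begin{pmatrix}\Phi \\ \Phi^*\end{pmatrix} \qquad\text{and}\qquad \tilde\Phi^\dagger \equiv \begin{pmatrix}\Phi^* \\ \Phi\end{pmatrix}^T = \left(\Phi^*\ \ \Phi\right), \qquad (9.16) $$
to get2
$$ \mathcal{L} = \frac{1}{2}\eta^{\mu\nu}\,\tilde\Phi^\dagger_{,\mu}\begin{pmatrix}a_0 & a^* \\ a & a_0\end{pmatrix}\tilde\Phi_{,\nu} + \frac{1}{2}\,\tilde\Phi^\dagger\begin{pmatrix}b_0 & b^* \\ b & b_0\end{pmatrix}\tilde\Phi. \qquad (9.17) $$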
The number of terms appearing in this Lagrangian can be reduced in cases in which we have symmetries
on top of Lorentz invariance. As an illustration of this, imagine the field Φ to possess an internal
symmetry
$$ \Phi \to e^{i\omega}\Phi, \qquad \Phi^* \to e^{-i\omega}\Phi^*. \qquad (9.18) $$
In this case, we necessarily have a = b = 0 and the matrices in (9.17) become diagonal. This leaves
us with a simpler Lagrangian that, with some notational adjustments, can be written as
$$ \mathcal{L} = \frac{1}{2}K\left(-\eta^{\mu\nu}\partial_\mu\Phi^*\partial_\nu\Phi - \kappa^2\Phi^*\Phi\right). \qquad (9.19) $$
A direct application of the Euler-Lagrange equations (9.12) provides two uncoupled equations for Φ
and Φ*, namely
$$ \left(\Box - \kappa^2\right)\Phi = 0, \qquad \left(\Box - \kappa^2\right)\Phi^* = 0, \qquad (9.20) $$
that we can use to provide a physical interpretation for the parameter κ². Indeed, setting $\Phi = e^{ik_\mu x^\mu}$
with $k^\mu = (E, \mathbf{p})$ in any of these two equations, we get a dispersion relation E² − p² = κ², which
makes it natural to identify the parameter κ² with the mass m² of the field.
2 In this notation, the reality condition L = L∗ results from the hermiticity of the 2 × 2 matrices.
Consider the action (9.22) alone. The first thing that can simplify our life is the gauge freedom in
the choice of the Lagrangian density. In particular notice that choosing $g_\sigma = \epsilon_{\mu\nu\rho\sigma}(\partial^\nu A^\mu)A^\rho$ in (9.9)
allows us to eliminate the term a_5 in (9.22), since
$$ \partial^\sigma g_\sigma = \epsilon_{\mu\nu\rho\sigma}(\partial^\nu A^\mu)(\partial^\sigma A^\rho) + \underbrace{\epsilon_{\mu\nu\rho\sigma}(\partial^\nu\partial^\sigma A^\mu)A^\rho}_{0\ \text{by symmetry}}. \qquad (9.25) $$
Taking this into account we are left with an action containing 4 pieces
$$ S = \int d^4x\left[a_1\,\partial_\mu A^\mu\partial_\nu A^\nu + a_2\,\partial_\mu A_\nu\partial^\mu A^\nu + a_3\,\partial_\mu A_\nu\partial^\nu A^\mu + a_4\,A_\mu A^\mu\right]. \qquad (9.26) $$
Imagine now that A_μ is the field of a gauge theory. In that case the field configurations A_μ and
$$ A'_\mu = A_\mu + \partial_\mu\chi, \qquad (9.27) $$
with χ an arbitrary scalar function give rise to the same physical observables4. This automatically forbids
the a_4 term in (9.26)5 and puts some restrictions on the other coefficients. To see this, let me split
∂_μA_ν into its symmetric $S_{\mu\nu} \equiv \partial_{(\mu}A_{\nu)}$ and antisymmetric $F_{\mu\nu} \equiv \partial_{[\mu}A_{\nu]}$ parts
$$ \partial_\mu A_\nu = S_{\mu\nu} + F_{\mu\nu}, \qquad (9.28) $$
3 Quadratic actions give rise to linear equations of motion, where the superposition principle can be applied.
4 In the same way that physicality cannot be attributed to L, we cannot make any claim about the physicality of Aµ .
Physicality might be attributed to the set {Aµ } of gauge-equivalent 4-potentials or to any gauge invariant attribute of
that set, but not to its individual elements.
5 It cannot be compensated by the transformation of the other (derivative) terms.
9.4 The action for the graviton 134
The invariance of the action under the gauge transformation A_μ → A_μ + ∂_μχ requires a_1 = 0 and
a_3 = −a_2. This restriction leaves us with an action
$$ S = \int d^4x\,F_{\mu\nu}F^{\mu\nu}, \qquad (9.30) $$
where we have omitted an overall normalization factor that can be determined by choosing the coupling
of the gauge field A_μ to matter and the units of that coupling. The equations of motion associated
with this action can be computed via the Euler-Lagrange equations (9.12) or by varying the action
with respect to A_μ. We follow the second procedure to get
$$ \begin{aligned} \delta S &= \int d^4x\left[F^{\mu\nu}\delta F_{\mu\nu} + F_{\mu\nu}\delta F^{\mu\nu}\right] = 2\int d^4x\,F^{\mu\nu}\delta F_{\mu\nu} \\ &= 2\int d^4x\,F^{\mu\nu}\left(\partial_\mu\delta A_\nu - \partial_\nu\delta A_\mu\right) = 4\int d^4x\,F^{\mu\nu}\partial_\mu\delta A_\nu \qquad (9.31) \\ &= -4\int d^4x\,\partial_\mu F^{\mu\nu}\delta A_\nu + \text{boundary terms}, \end{aligned} $$
where we used the symmetry properties of F_{μν} and performed an integration by parts. Imposing
finally the condition δS = 0 for arbitrary δA_ν, we arrive at the very familiar result
$$ \partial_\mu F^{\mu\nu} = 0. \qquad (9.32) $$
The Maxwell equations in vacuum are recovered from an action (9.30) constructed with very limited
principles, namely, quadraticity in the fields, Lorentz invariance and gauge invariance.
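Plugging (9.34) into (9.33) and performing some simple manipulations we get
$$ S \to S + \int d^4x\left[-2(2c_1+c_3)\,\partial_\mu h^\kappa{}_\kappa\,\partial^\mu\partial_\lambda\xi^\lambda - 2(2c_2+c_4)\,\partial^\mu h_{\mu\nu}\,\Box\xi^\nu - 2(c_3+c_4)\,\partial^\mu h_{\mu\nu}\,\partial^\nu\partial_\kappa\xi^\kappa\right] $$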
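Taking this into account, the action (9.33) takes the form
$$ S = \int d^4x\left[\partial_\mu h_{\kappa\nu}\partial^\mu h^{\kappa\nu} + 2\,\partial_\mu h^{\mu\nu}\partial_\nu h^\kappa{}_\kappa - 2\,\partial_\mu h^{\kappa\mu}\partial^\nu h_{\kappa\nu} - \partial_\mu h^\nu{}_\nu\partial^\mu h^\kappa{}_\kappa\right], \qquad (9.37) $$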
where we have omitted an overall normalization factor that can be determined by specifying the
coupling to matter and setting the units of the coupling. The associated equations of motion can be
obtained by varying the action with respect to the field. This leads to
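$$ \delta S = \int d^4x\left[2\,\partial^\mu h^{\kappa\nu}\partial_\mu\delta h_{\kappa\nu} + 2\,\partial_\nu h^\kappa{}_\kappa\,\partial_\mu\delta h^{\mu\nu} + 2\,\eta^{\kappa\lambda}\partial_\mu h^{\mu\nu}\partial_\nu\delta h_{\kappa\lambda} - 2\,\partial^\nu h_{\kappa\nu}\,\partial_\mu\delta h^{\kappa\mu} - 2\,\partial_\mu h^{\kappa\mu}\,\partial^\nu\delta h_{\kappa\nu} - 2\,\eta^{\kappa\lambda}\partial_\mu h^\nu{}_\nu\,\partial^\mu\delta h_{\kappa\lambda}\right], $$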
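which, integrating by parts, dropping boundary terms and renaming indices can be written as
$$ \begin{aligned} \delta S &= 2\int d^4x\left[-\Box h^{\kappa\nu}\delta h_{\kappa\nu} - \partial_\mu\partial_\nu h^\kappa{}_\kappa\,\delta h^{\mu\nu} - \eta^{\kappa\lambda}\partial_\mu\partial_\nu h^{\mu\nu}\delta h_{\kappa\lambda} + \partial_\mu\partial^\nu h_{\kappa\nu}\,\delta h^{\kappa\mu} + \partial^\mu\partial_\nu h_{\kappa\mu}\,\delta h^{\kappa\nu} + \eta^{\kappa\lambda}\,\Box h^\nu{}_\nu\,\delta h_{\kappa\lambda}\right] \\ &= 2\int d^4x\left[-\Box h_{\mu\nu} - \partial_\mu\partial_\nu h^\kappa{}_\kappa - \eta_{\mu\nu}\partial_\kappa\partial_\lambda h^{\kappa\lambda} + \partial_\mu\partial^\kappa h_{\kappa\nu} + \partial_\nu\partial^\kappa h_{\kappa\mu} + \eta_{\mu\nu}\,\Box h^\kappa{}_\kappa\right]\delta h^{\mu\nu}. \qquad (9.39) \end{aligned} $$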
A simple inspection reveals that the quantity inside the square brackets is nothing else than the
linearized version of the Einstein tensor G_{μν}
$$ \delta S \propto \int d^4x\,G_{\mu\nu}\,\delta h^{\mu\nu} \quad\longrightarrow\quad G_{\mu\nu} = 0. \qquad (9.40) $$
The linearized version of the Einstein equations in vacuum is recovered from an action (9.37) constructed
with very limited principles, namely, quadraticity in the fields, Lorentz invariance and gauge invariance.
This transformation generates a perturbation to both the fields and the metric in such a way that the
Lagrangian density L (no tilde) becomes
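$$ \mathcal{L}(\phi+\delta\phi,\ \phi_{,\mu}+\delta\phi_{,\mu},\ g_{\mu\nu}+\delta g_{\mu\nu}) \approx \mathcal{L}(\phi, \phi_{,\mu}, g_{\mu\nu}) + \frac{\partial\mathcal{L}}{\partial\phi}\delta\phi + \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\delta\phi_{,\mu} + \frac{\partial\mathcal{L}}{\partial g_{\mu\nu}}\delta g_{\mu\nu}. \qquad (9.43) $$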
The first one is associated to a particular variation δφ and vanishes when taking into account the
Euler-Lagrange equation for φ. The second term must be then equal to zero for S to remain unchanged.
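The integrand ∂L/∂g_{μν} is a scalar density. Let's define a symmetric second-rank tensor out of such a
density
$$ T^{\mu\nu} \equiv \frac{2}{\sqrt{|g|}}\frac{\partial\mathcal{L}}{\partial g_{\mu\nu}}, \qquad (9.45) $$
and write
$$ \delta S = \frac{1}{2}\int d^nx\,\sqrt{|g|}\,T^{\mu\nu}\delta g_{\mu\nu}. \qquad (9.46) $$
Although it is tempting to simply set $\sqrt{|g|}\,T^{\mu\nu} = 0$, this condition is overly restrictive, since δg_{μν} refers
here to a specific type of variation, not to an arbitrary one. The variation δg_{μν} can be however
expressed in terms of the arbitrary perturbation ξ_μ by taking into account that $\delta g_{\mu\nu} = -(\xi_{\mu;\nu} + \xi_{\nu;\mu})$
(cf. Eq. (6.60) and notice (9.50)). This gives
$$ \begin{aligned} \delta S &= \frac{1}{2}\int d^nx\,\sqrt{|g|}\,T^{\mu\nu}\delta g_{\mu\nu} = -\int d^nx\,\sqrt{|g|}\,T^{\mu\nu}\xi_{\mu;\nu} \\ &= \int d^nx\,\sqrt{|g|}\,T^{\mu\nu}{}_{;\nu}\,\xi_\mu - \underbrace{\int d^nx\,\sqrt{|g|}\left(T^{\mu\nu}\xi_\mu\right)_{;\nu}}_{=0}, \qquad (9.47) \end{aligned} $$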
where we have made use of the symmetry property of T µν and integrated by parts to get a total
derivative that vanishes by assumption on the boundary of integration. Since ξµ is arbitrary we must
have
$$ \nabla_\nu T^{\mu\nu} = 0, \qquad (9.48) $$
which is a continuity equation suggesting that we can identify the tensor (9.45) with the energy-
momentum tensor of any physical system.
9.7 The Einstein-Hilbert action 137
in terms of δg µν rather than δgµν . The difference in sign between these two equivalent expres-
sions comes from
or equivalently
$$ T^{\mu\nu} = g^{\mu\nu}\tilde{\mathcal{L}} + 2\,\frac{\partial\tilde{\mathcal{L}}}{\partial g_{\mu\nu}}. \qquad (9.52) $$
Exercise
Compute the energy-momentum tensor for (9.7).
and is known as the Einstein-Hilbert action. The constant of proportionality κ² is included on dimensional
grounds and will be determined at the end of the computation.
Exercise
Which is the dimension of κ2 ?
Consider the variation of (9.53) resulting from the variation of the metric tensor
where δgµν and its first derivative are assumed to vanish at the boundary of the integration region.
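We obtain
$$ \delta\mathcal{L}_{EH} \propto \underbrace{\sqrt{|g|}\,\delta g^{\mu\nu}R_{\mu\nu} + \delta\sqrt{|g|}\,R}_{\delta\mathcal{L}_1} + \underbrace{\sqrt{|g|}\,g^{\mu\nu}\delta R_{\mu\nu}}_{\delta\mathcal{L}_2}, \qquad (9.55) $$
where we have defined two pieces, L_1 and L_2. The first one can be easily evaluated by taking into
account the variation of the metric determinant
$$ \delta\sqrt{|g|} = -\frac{1}{2}\sqrt{|g|}\,g_{\mu\nu}\delta g^{\mu\nu}. \qquad (9.56) $$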
Exercise
Prove Eq. (9.56).
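Before attempting the proof, Eq. (9.56) can be sanity-checked numerically. The sketch below is not a proof: it takes an arbitrary two-dimensional Lorentzian metric and a small symmetric perturbation of the inverse metric (both invented here for illustration) and compares the actual change of √|g| with the right-hand side of (9.56).

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def sqrt_abs_det(m):
    return math.sqrt(abs(det2(m)))


g = [[-2.0, 0.3], [0.3, 1.5]]              # g_{mu nu}, arbitrary example
g_inv = inv2(g)                             # g^{mu nu}
eps = 1e-6
d_ginv = [[0.7 * eps, -0.2 * eps],          # delta g^{mu nu}, symmetric
          [-0.2 * eps, 0.4 * eps]]

# Perturb the inverse metric and invert back to get the perturbed metric.
g_inv_new = [[g_inv[i][j] + d_ginv[i][j] for j in range(2)] for i in range(2)]
g_new = inv2(g_inv_new)

lhs = sqrt_abs_det(g_new) - sqrt_abs_det(g)
contraction = sum(g[i][j] * d_ginv[i][j] for i in range(2) for j in range(2))
rhs = -0.5 * sqrt_abs_det(g) * contraction

print(lhs, rhs)
```

The two numbers agree up to terms of second order in the perturbation, as expected for a first-order variation.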
so the first thing that we have to compute is the variation of Christoffel symbols δΓρµν defined by
Be careful! We are just performing a variation of the metric, not transforming it.
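It is easy to see that, even though a connection is not a tensor, the difference of two connections δΓ^ρ_{μν}
transforms as a tensor, i.e.
$$ \delta\bar\Gamma^\rho{}_{\mu\nu} = \frac{\partial\bar x^\rho}{\partial x^\sigma}\frac{\partial x^\lambda}{\partial\bar x^\mu}\frac{\partial x^\kappa}{\partial\bar x^\nu}\,\delta\Gamma^\sigma{}_{\lambda\kappa}. \qquad (9.60) $$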
Exercise
Check it.
The property (9.60) greatly simplifies the computation of the variation of the Riemann tensor.
Indeed, we can always go to a local free fall reference frame in which Γ^ρ_{μν} = 0 at some arbitrary point
P. At such a point the expression (9.58) becomes
where in the last step we made use of the fact that the partial and covariant derivatives coincide when Γ^ρ_{μν} = 0.
The resulting Palatini equation
is a tensorial equation (remember the property (9.60)) valid in any arbitrary coordinate system (and
not only in the free fall reference frame at P ).
Exercise
Prove that the second term in the previous expression is symmetric, as it should be.
A similar trick can be applied to get the explicit expression of δΓ, which takes the same form as the
definition of the Christoffel symbols, with the metric replaced by the metric variation and the partial
derivatives replaced by covariant derivatives, i.e.
$$ \delta\Gamma^\mu{}_{\nu\rho} = \frac{1}{2}g^{\mu\sigma}\left(\nabla_\nu\delta g_{\sigma\rho} + \nabla_\rho\delta g_{\sigma\nu} - \nabla_\sigma\delta g_{\nu\rho}\right). \qquad (9.63) $$
Exercise
Check the previous expression by explicit computation.
where we have used the metric compatibility condition (4.68). We are left therefore with the covariant
divergence of a vector. Using the property
$$ V^\mu{}_{;\mu} = \frac{1}{\sqrt{|g|}}\left(\sqrt{|g|}\,V^\mu\right)_{,\mu}, \qquad (9.65) $$
As I said before, gravity is a quite particular field theory. The existence of second derivatives
in the Einstein-Hilbert action gives rise to a contribution depending on the value of the first
derivatives on the boundary. To deal with these, we have two options:
• Extend the variational principle and require the fields and their derivatives to be fixed at
the boundary. This would give rise to reasonable field equations. A clear example from
classical mechanics illustrating this would be
$$ S = \int_{t_0}^{t_f} dt\left(\ddot q + \frac{1}{2}\dot q^2\right) = \dot q(t_f) - \dot q(t_0) + \frac{1}{2}\int_{t_0}^{t_f} dt\,\dot q^2, \qquad (9.67) $$
with the assumption that both q̇ and q are fixed at the boundary. This approach has
however some caveats. On the one hand, it does not obey a composition rule of the kind
where the paths connecting (q_0, t_0) and (q_2, t_2) are decomposed at an intermediate time
t_1 with t_0 < t_1 < t_2. Although the paths are expected to be continuous at t = t_1, they do
not need to be smooth at that point, which requires leaving q̇_1 free at t = t_1. On the other
hand, the action principle has its roots in quantum mechanics, where the simultaneous
fixing of q and q̇ is inappropriate.
• Add the so-called Gibbons-Hawking-York counterterm to the action
$$ S = S_{EH} + S_{GHY} = \frac{1}{2\kappa^2}\int_{\mathcal{R}} d^4x\,\sqrt{|g|}\,R + \frac{1}{\kappa^2}\int_{\partial\mathcal{R}} d^3x\,\sqrt{|h|}\,K, \qquad (9.69) $$
with h the determinant of the induced metric on the boundary and K the trace of the
extrinsic curvature. The Gibbons-Hawking-York term is constructed in such a way that its
variation cancels the unwanted term associated to the second derivatives of the metric,
keeping only the part associated to the quadratic part of the action. Proving this statement
is beyond the scope of this course. The interested reader is referred to the excellent
discussion in Padmanabhan's book.
Forgetting about the boundary term, the variation of the Einstein-Hilbert action (9.53) becomes
$$ \delta S_{EH} = \frac{1}{2\kappa^2}\int d^4x\,\sqrt{|g|}\left(R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R\right)\delta g^{\mu\nu}, \qquad (9.70) $$
which, demanding it to vanish for arbitrary variations δg^{μν}, gives us Einstein's equations in the
absence of matter
$$ G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}Rg_{\mu\nu} = 0. \qquad (9.71) $$
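with S_M containing all the non-gravitational fields. The variation of S_M with respect to δg^{μν} (upper
indices) gives
$$ \delta S_M = -\frac{1}{2}\int d^4x\,\sqrt{|g|}\,T_{\mu\nu}\,\delta g^{\mu\nu}, \qquad (9.73) $$
where we have made use of the covariant definition (9.49). Putting everything together we get
$$ \delta S_{EH} + \delta S_M = \frac{1}{2\kappa^2}\int d^4x\,\sqrt{|g|}\left(R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R - \kappa^2T_{\mu\nu}\right)\delta g^{\mu\nu}. \qquad (9.74) $$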
Exercise
Modify the Einstein-Hilbert action to obtain the Einstein equations with cosmological constant.
CHAPTER 3
THE INFLATIONARY PARADIGM
Exercise
Convince yourself that homogeneity does not imply isotropy. Provide some examples.
3.1 The hot Big Bang paradise 26
Figure 3.1: The Cosmic Microwave Background as seen by Planck. The fluctuations on top
of the average temperature T = 2.73 K ≃ 0.235 meV are one part in 10^5.
At first sight, the idealization of the Universe as a homogeneous and isotropic object might
seem a bit drastic. On the other hand, we know from hydrodynamics that a continuous
description of gases works very well even if these have a very discontinuous structure at
molecular scales. The homogeneous and isotropic approximation seems to be in good
agreement with observations. Indeed, both the CMB and the galaxy distribution look rather
homogeneous when averaged on sufficiently large scales (cf. Figs. 3.1 and 3.2).
A given spacetime in General Relativity is specified by its metric tensor g_{μν}. This quantity
defines the line element
$$ ds^2 = g_{\mu\nu}dx^\mu dx^\nu, \qquad (3.1) $$
where dx^μ stands for infinitesimal displacements in the coordinates x^μ. From a mathematical
point of view, a homogeneous and isotropic Universe must be equipped with a metric tensor
invariant under translations and rotations in the spatial components. The most general
4-dimensional geometry consistent with these symmetries is the so-called
Friedmann-Lemaître-Robertson-Walker (FLRW) spacetime,
$$ ds^2 = -dt^2 + a^2(t)\left[\frac{dr^2}{1-kr^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \qquad (3.2) $$
This equation represents a time-ordered slicing of spacetime with respect to a global time t
whose 3-dimensional spatial surfaces are maximally symmetric. Here r is a radial coordinate
and θ and φ are the usual angular coordinates on a two-sphere, ranging between 0 < θ < π
and 0 ≤ φ < 2π. The coordinates (r, θ, φ) are usually called comoving coordinates, since they
are decoupled from the effect of expansion.
The FLRW metric is invariant under the redefinition
$$ k \to \frac{k}{|k|}, \qquad r \to r\sqrt{|k|}, \qquad a \to \frac{a}{\sqrt{|k|}}, \qquad (3.3) $$
meaning that the only relevant parameter is the sign of k. We can therefore distinguish
three types of spatial sections:
1. Flat: for k = 0 the spatial slices are flat and r ranges from zero to infinity, 0 < r < ∞.
Figure 3.2: The Sloan Digital Sky Survey map. Each dot is a galaxy. The empty regions are
just areas that the survey did not cover.
Figure 3.3: 2-dimensional projection of the 3-dimensional slices of the FLRW metric for
k = +1 (left) and k = −1 (right).
2. Open: for k = −1 the spatial slices are hyperbolic and again 0 < r < ∞.
3. Closed: for k = 1, the spatial slices are three-spheres and the radial coordinate r is
restricted to a compact range, 0 < r < 1.
The scale factor a(t) characterizes the relative size of the spatial sections at a given time. Its
temporal evolution depends on the matter content of the Universe via the Einstein equations
$$ R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu} + \Lambda g_{\mu\nu} = 8\pi G\,T_{\mu\nu}, \qquad (3.4) $$
with G Newton's constant, R_{μν} the Ricci tensor, R = g^{μν}R_{μν} the Ricci scalar and Λ the
infamous cosmological constant. The energy-momentum tensor T_{μν} encodes the Universe's
matter content and is locally conserved,
$$ \nabla^\mu T_{\mu\nu} = 0. \qquad (3.5) $$
The homogeneity and isotropy of the background metric restrict the form of the
energy-momentum tensor to the perfect fluid case
with uµ the comoving four-velocity satisfying uµ uµ = −1 and ρ(t) and p(t) the local energy
density and pressure of the fluid. For an observer comoving with the fluid, uµ = (1, 0, 0, 0),
the energy-momentum tensor looks isotropic
Exercise
Derive Eqs. (3.7) and (3.6).
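i) Connection coefficients
$$ \begin{aligned} &\Gamma^0_{11} = \frac{a\dot a}{1-kr^2}, \qquad \Gamma^0_{22} = a\dot a\,r^2, \qquad \Gamma^0_{33} = a\dot a\,r^2\sin^2\theta, \\ &\Gamma^1_{01} = \Gamma^1_{10} = \Gamma^2_{02} = \Gamma^2_{20} = \Gamma^3_{03} = \Gamma^3_{30} = \frac{\dot a}{a}, \\ &\Gamma^1_{22} = -r(1-kr^2), \qquad \Gamma^1_{33} = -r(1-kr^2)\sin^2\theta, \\ &\Gamma^2_{12} = \Gamma^2_{21} = \Gamma^3_{13} = \Gamma^3_{31} = \frac{1}{r}, \\ &\Gamma^2_{33} = -\sin\theta\cos\theta, \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\theta. \end{aligned} \qquad (3.9) $$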
Combining these expressions with the energy-momentum tensor (3.7) we can specialize
the Einstein equations (3.4) to the homogeneous and isotropic case. We obtain the so-called
Friedmann equations
$$ H^2 = \frac{\rho}{3M_P^2} + \frac{\Lambda}{3} - \frac{k}{a^2}\,, \tag{3.12} $$
$$ \dot H + H^2 = -\frac{1}{6M_P^2}\,(\rho + 3p) + \frac{\Lambda}{3}\,, \tag{3.13} $$
with the dots denoting derivatives with respect to the coordinate time t, $H \equiv \dot a/a$ the Hubble rate and $M_P \equiv (8\pi G)^{-1/2}$ the reduced Planck mass.
The Friedmann equations (3.12) and (3.13) can be combined to obtain the continuity equation
ρ̇ + 3H (ρ + p) = 0. (3.18)
Exercise
Derive Eq. (3.18) i) by combining Eqs. (3.12) and (3.13) and ii) from the covariant
energy-momentum conservation (3.5). Interpret the result by considering the adiabatic
dilution of energy due to the expansion and the work done by pressure.
Hint: Consider the first law of thermodynamics in the form T dS = dU + p dV.
The cosmological evolution following from Eqs. (3.12), (3.13) and (3.18) can be determined
once a pressure to energy density relation p(ρ) is specified. We will restrict ourselves to
barotropic fluids for which the pressure is linearly proportional to the energy density,
p = wρ , (3.19)
with w the so-called equation-of-state parameter. This case covers the two main matter
components in the hot Big Bang scenario, namely (non-relativistic) matter (w = 0) and
radiation (w = 1/3).
3.2 Troubles in paradise 30
Exercise
Consider a macroscopic collection of structureless point particles interacting through
spatially localized collisions. On distances d much larger than the typical mean free
path, the number of particles is large and the statistical fluctuations about the mean
properties of the fluid are expected to be small. If the fluid is isotropic,a the mean
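density and pressure observed by a comoving observer over the volume $\Delta V = d^3$ can
be written as
$$ \rho = \Big\langle \sum_n E_n\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n) \Big\rangle_{\Delta V}\,, \qquad p = \frac{1}{3} \sum_i \Big\langle \sum_n p_n^i v_n^i\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_n) \Big\rangle_{\Delta V}\,, \tag{3.20} $$
with $E_n = \sqrt{p_n^2 + m_n^2}$ the energy of the individual particles. The index i is a space
index ranging from 1 to 3 and n selects the particle of mass $m_n$ and momentum $p_n$. Use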
these microscopic expressions to derive the equation of state for non-relativistic matter
and radiation.
a I.e., if the fluid is perfect.
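The microscopic expressions (3.20) can be explored numerically before attempting the analytic derivation. The following is a minimal sketch, assuming an isotropic fluid so that the sum over i reduces to p·v = |p|²/E, and dropping the common volume average, which cancels in the ratio p/ρ. The function name `equation_of_state` and the sample momenta are hypothetical choices made for illustration only.

```python
import math

def equation_of_state(momenta, mass):
    """Return w = p / rho for particles with the given momentum magnitudes.

    Based on Eq. (3.20) with the common volume average dropped:
    rho ~ sum_n E_n and p ~ (1/3) sum_n |p_n| |v_n|, with v_n = p_n / E_n.
    """
    energies = [math.sqrt(p * p + mass * mass) for p in momenta]
    rho = sum(energies)
    pressure = sum(p * (p / e) for p, e in zip(momenta, energies)) / 3.0
    return pressure / rho

# Illustrative momentum magnitudes in arbitrary units (hypothetical values).
momenta = [0.5, 1.0, 1.5, 2.0]

w_radiation = equation_of_state(momenta, mass=0.0)    # massless particles
w_matter = equation_of_state(momenta, mass=1.0e4)     # |p| much smaller than m

if __name__ == "__main__":
    print("radiation: w =", w_radiation)
    print("matter:    w =", w_matter)
```

For massless particles the ratio equals 1/3 whatever the momentum distribution, while for |p| ≪ m it is suppressed by (p/m)², in line with the values w = 1/3 and w = 0 quoted above.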
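In our Universe, several species with different equations of state coexist. Their relative
contribution is traditionally parametrized by the dimensionless parameters
$$ \Omega_M \equiv \frac{\rho_M}{3M_P^2 H^2}\,, \qquad \Omega_R \equiv \frac{\rho_R}{3M_P^2 H^2}\,, \qquad \Omega_\Lambda \equiv \frac{\Lambda}{3H^2}\,, \qquad \Omega_K \equiv -\frac{k}{(aH)^2}\,, \tag{3.21} $$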
with the subindices M, R, Λ and K standing for matter, radiation, cosmological constant and
curvature contributions. At present time, the radiation and curvature contributions are very
small ($\Omega_R \simeq 5 \times 10^{-5}$, $\Omega_K < 0.005$) and
$$ \Omega_M \simeq 0.3\,, \qquad \Omega_\Lambda \simeq 0.7\,. \tag{3.22} $$
Our present Universe is therefore dominated by a cosmological constant or dark energy
component. Note however that it was dominated by matter and radiation in the past. This can
be easily seen by considering the scaling of non-relativistic matter and radiation. Integrating
the conservation equation (3.18) and using Eq. (3.12) for the zero curvature case (k = 0),
we get
$$ \rho \propto a^{-3(1+w)}\,, \qquad a(t) \propto \begin{cases} t^{2/[3(1+w)]} & w \neq -1\,, \\ e^{Ht} & w = -1\,. \end{cases} \tag{3.23} $$
For non-relativistic matter (w = 0), the energy density dilutes with the volume, $\rho_M \sim a^{-3}$,
reflecting mass conservation. For relativistic matter (w = 1/3), the energy density dilutes
as $\rho_R \sim a^{-4}$, due to the additional redshift of energy ($\propto a^{-1}$). Note that the radiation
domination period cannot extend indefinitely into the past: as t → 0, the scale factor goes to zero
and the physical energy density ρ diverges.
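The scalings in Eq. (3.23) can be cross-checked by integrating the flat Friedmann equation directly. The sketch below is only illustrative: it assumes k = 0, units in which 3M_P² = 1 and the normalization ρ(a = 1) = 1, and it uses a hand-written fourth-order Runge-Kutta step. The helper names `evolve` and `analytic` are hypothetical.

```python
import math

def hubble(a, w):
    """Flat Friedmann equation (3.12) with rho = a^(-3(1+w)), in units 3*M_P^2 = 1."""
    return math.sqrt(a ** (-3.0 * (1.0 + w)))

def evolve(a0, w, t_span, steps=2000):
    """Integrate da/dt = a * H(a) over a time interval t_span with classical RK4."""
    dt = t_span / steps
    a = a0
    for _ in range(steps):
        k1 = a * hubble(a, w)
        a2 = a + 0.5 * dt * k1
        k2 = a2 * hubble(a2, w)
        a3 = a + 0.5 * dt * k2
        k3 = a3 * hubble(a3, w)
        a4 = a + dt * k3
        k4 = a4 * hubble(a4, w)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

def analytic(a0, w, t_span):
    """Closed-form solution: a^n = a0^n + n*t with n = 3(1+w)/2, or a0*exp(t) if n = 0."""
    n = 1.5 * (1.0 + w)
    if abs(n) < 1e-12:
        return a0 * math.exp(t_span)
    return (a0 ** n + n * t_span) ** (1.0 / n)

if __name__ == "__main__":
    for label, w in (("matter", 0.0), ("radiation", 1.0 / 3.0), ("cosmological constant", -1.0)):
        print(label, evolve(1.0, w, 2.0), analytic(1.0, w, 2.0))
```

At late times the closed form reduces to a ∝ t^{2/[3(1+w)]}, while for w = −1 the Hubble rate is constant and the growth is exponential.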
For standard matter sources satisfying the strong energy condition 1 + 3w > 0, $(aH)^{-1}$ grows
as the Universe expands. For instance, during matter (MD) and radiation domination (RD)
we have
$$ (aH)^{-1} \propto a^{1/2} \quad \text{(MD)}\,, \qquad (aH)^{-1} \propto a \quad \text{(RD)}\,. \tag{3.28} $$
Exercise
Derive Eq. (3.27).
The density parameter Ω at present time is very close to one. Specifically, the latest Planck
satellite data combined with baryon acoustic oscillations (BAO) give
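at the 95% C.L. Taking into account this value ($|\Omega_0 - 1| \sim 10^{-3}$) together with the evolution
equations (3.28) for the comoving Hubble radius during matter and radiation domination,
we can compute the value of Ω − 1 at the time of matter-radiation equality ($z_{eq} = 3600$),
$$ \Omega(z_{eq}) - 1 = (\Omega_0 - 1)\,(1 + z_{eq})^{-1} \approx 2.8 \times 10^{-5}\,, \tag{3.30} $$
and at the time of Big Bang nucleosynthesis (BBN),
$$ \Omega(z_{BBN}) - 1 = \left(\Omega(z_{eq}) - 1\right) \left(\frac{1 + z_{eq}}{1 + z_{BBN}}\right)^{2} \approx 3.6 \times 10^{-18}\,. \tag{3.31} $$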
A percent deviation from flatness in the present Universe translates into unnaturally small
deviations at early epochs. In other words, in order to recover the Universe we observe
today the initial conditions must be terribly fine-tuned. Any deviation from these initial
conditions translates either into a closed and recollapsing Universe or into an open Universe
completely dominated by curvature. This extreme dependence on the initial conditions is
highly unsatisfactory.
Figure 3.4: Evolution of the energy density parameter Ω as a function of log a in standard cosmology. The point
Ω = 1, corresponding to a spatially flat Universe, is a repeller.
Exercise
One could argue that naturalness is just a question of taste and that the most symmetric
initial conditions are somehow more natural. However, this is not very convincing
from the point of view of the self-consistency of the theory, especially if those initial
conditions are unstable. Show that the dimensionless energy density parameter satisfies
the differential equation
$$ \frac{d\Omega}{d\log a} = (1 + 3w)\,\Omega\,(\Omega - 1)\,. \tag{3.32} $$
Note that for both matter (w = 0) and radiation (w = 1/3) we have 1 + 3w > 0,
meaning that a flat Universe with Ω = 1 is an unstable fixed point. If Ω > 1 at some
point of the evolution, it will keep on growing; and vice versa, if Ω < 1 at some point, it
will keep on decreasing. This behaviour is illustrated in Fig. 3.4.
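The instability of the fixed point Ω = 1 can be made concrete by integrating Eq. (3.32) numerically. What follows is a rough sketch rather than a definitive implementation: it assumes a constant equation-of-state parameter w, uses a hand-written RK4 integrator, and compares the result with the closed form that follows from the substitution y = (Ω − 1)/Ω, for which Eq. (3.32) gives y ∝ a^{1+3w}. The function names are hypothetical.

```python
import math

def evolve_omega(omega_i, w, n_efolds, steps=10000):
    """Integrate dOmega/dlog(a) = (1 + 3w) * Omega * (Omega - 1) with classical RK4."""
    h = n_efolds / steps

    def rhs(om):
        return (1.0 + 3.0 * w) * om * (om - 1.0)

    om = omega_i
    for _ in range(steps):
        k1 = rhs(om)
        k2 = rhs(om + 0.5 * h * k1)
        k3 = rhs(om + 0.5 * h * k2)
        k4 = rhs(om + h * k3)
        om += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return om

def omega_exact(omega_i, w, n_efolds):
    """Closed form: y = (Omega - 1)/Omega scales as exp((1 + 3w) * N)."""
    y = (omega_i - 1.0) / omega_i * math.exp((1.0 + 3.0 * w) * n_efolds)
    return 1.0 / (1.0 - y)

if __name__ == "__main__":
    # A deviation of one part in a million, followed for 5 e-folds of radiation domination.
    print(evolve_omega(1.0 - 1e-6, 1.0 / 3.0, 5.0))
    print(evolve_omega(1.0 + 1e-6, 1.0 / 3.0, 5.0))
    # With 1 + 3w < 0 the sign of the flow reverses and Omega is driven towards 1.
    print(evolve_omega(0.5, -1.0, 5.0))
```

Note that for Ω > 1 the closed form diverges after a finite number of e-folds, so the starting values used here are deliberately chosen close to one.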
Figure 3.5: Conformal diagram in the {τ, R} plane for standard hBB cosmology (left) and inflationary cosmology
(right). Inflation solves the horizon problem by extending the conformal diagram to negative
conformal times.
The causal structure of the FLRW metric is determined by the way in which light propagates
on null geodesics with $ds^2 = 0$. Since space is isotropic, we can freely set the coordinates
θ and φ to constant values. In this case, the condition $ds^2 = 0$ implies
$$ R = \pm\tau + \text{constant}\,. \tag{3.37} $$
The fact that null geodesics are 45° lines in the {τ, R} plane is related to the fact that the
FLRW metric (3.34) is conformally flat. If the Universe started at some initial time $t_i$, then
there is a maximum amount of time for light to have travelled. The (comoving) particle
3.3 Inflationary paradigm 34
horizon is the largest distance that a photon can travel between $t_i$ and a later time $t > t_i$
(recall that c ≡ 1)
$$ d_H = \tau - \tau_i = \int_{t_i}^{t} \frac{dt}{a} = \int_{a_i}^{a} \frac{da}{a\dot a} = \int_{\ln a_i}^{\ln a} (aH)^{-1}\, d\ln a\,. \tag{3.38} $$
According to this expression, the evolution of $d_H$ also depends on the evolution of the
comoving Hubble radius $(aH)^{-1}$ (cf. Eq. (3.27)).
Exercise
Show this.
For standard matter sources 1 + 3w > 0 and $(aH)^{-1}$ grows as the Universe expands. When
that happens, the integral in Eq. (3.38) becomes dominated by its upper limit
$$ d_H \propto \frac{2}{1 + 3w}\, a^{\frac{1}{2}(1+3w)} = \frac{2}{1 + 3w}\, (aH)^{-1}\,. \tag{3.39} $$
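The dominance of the upper limit in Eq. (3.38) can be checked with a simple quadrature. This is a minimal sketch under stated assumptions: a single fluid with constant w, the comoving Hubble radius normalized to one at a = 1 so that (aH)^{-1} = a^{(1+3w)/2}, and a midpoint rule in ln a. All function names are hypothetical.

```python
import math

def comoving_hubble_radius(a, w):
    """(aH)^-1 for p = w*rho, normalised to 1 at a = 1."""
    return a ** (0.5 * (1.0 + 3.0 * w))

def particle_horizon(a_i, a, w, steps=20000):
    """Midpoint-rule estimate of Eq. (3.38): integral of (aH)^-1 over ln a."""
    x0, x1 = math.log(a_i), math.log(a)
    dx = (x1 - x0) / steps
    total = 0.0
    for j in range(steps):
        total += comoving_hubble_radius(math.exp(x0 + (j + 0.5) * dx), w)
    return total * dx

def upper_limit_estimate(a, w):
    """Right-hand side of Eq. (3.39): 2/(1+3w) * (aH)^-1."""
    return 2.0 / (1.0 + 3.0 * w) * comoving_hubble_radius(a, w)

if __name__ == "__main__":
    for label, w in (("radiation", 1.0 / 3.0), ("matter", 0.0)):
        d_h = particle_horizon(1e-6, 1.0, w)
        print(label, d_h, upper_limit_estimate(1.0, w))
    # For 1 + 3w < 0 the lower limit dominates instead and d_H grows as a_i -> 0.
    print("w = -1:", particle_horizon(1e-3, 1.0, -1.0))
```

For 1 + 3w > 0 the integral is insensitive to how small a_i is taken, which is the numerical counterpart of the finiteness of d_H discussed in the text.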
Note that due to the presence of the singularity at $a_i = 0$ (or equivalently at $\tau_i = 0$) this
quantity is finite. At each instant of time, regions that were never in causal contact before
get into contact for the first time. The fact that two of these regions share approximately
the same temperature cannot be a consequence of thermal equilibrium. On general grounds,
these regions should be expected to look very different from each other. This applies also to
the CMB (see the left panel of Fig. 3.5). The observed homogeneity of the CMB map is not
only remarkable but also strange and unexpected! Most points in that map share roughly
the same temperature even if the naive horizon scale at decoupling is just a few degrees.
How is this possible?
$$ \frac{d}{dt}(aH)^{-1} < 0\,, \tag{3.40} $$
or equivalently a violation of the strong energy condition 1 + 3w > 0 (cf. Eq. (3.27)). This
additional phase in the history of the Universe is called inflation. The name can be easily
understood by noticing that Eq. (3.40) implies accelerated expansion
$$ \frac{d}{dt}(aH)^{-1} = \frac{d}{dt}(\dot a)^{-1} = -\frac{\ddot a}{\dot a^{2}} < 0 \quad\Rightarrow\quad \ddot a > 0\,. \tag{3.41} $$
If the inflationary stage lasts long enough, the hot Big Bang problems are automatically
solved. In particular, if the comoving Hubble radius $(aH)^{-1}$ in Eq. (3.26) decreases, the
curvature is driven towards zero (the unstable point Ω = 1 in Eq. (3.32) now becomes an
attractor). This solves the flatness problem. On the other hand, if 1 + 3w < 0 the integral
in Eq. (3.38) becomes dominated by its lower limit and the singularity is pushed towards
negative conformal times,
$$ \tau_i \propto \frac{2}{1 + 3w}\, a_i^{\frac{1}{2}(1+3w)} \to -\infty \quad \text{as } a_i \to 0\,. \tag{3.43} $$
The extension of the conformal diagram to negative conformal times allows the light cones
of widely separated CMB points to intersect (see the right panel of Fig. 3.5). This solves
the horizon problem.
with $(a_i H_i)^{-1}$ the comoving Hubble radius at the onset of the inflationary regime and
$(a_0 H_0)^{-1}$ its value today. Let us assume for simplicity that the radiation-dominated epoch
starts immediately after the end of inflation and neglect the comparatively shorter matter
and dark-energy dominated epochs. Under these assumptions, we have
Figure 3.6: Scales of cosmological interest as a function of the number of e-folds (horizontal axis: log a,
running through the inflation, heating, radiation and matter eras up to today; vertical axis: comoving scales).
Because the comoving Hubble radius shrinks during inflation, typical scales $\lambda \equiv (a_0 H_0)^{-1}$ that were
inside the horizon at the onset of inflation leave the radius of causal contact as inflation proceeds
(horizon exit). When inflation ends, the comoving Hubble radius $(aH)^{-1}$ starts increasing and the
scales reenter the horizon (horizon reentry).
$$ \frac{(a_i H_i)^{-1}}{(a_{end} H_{end})^{-1}} \geq 10^{26} \simeq e^{60}\,. \tag{3.46} $$
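The conversion between the factor 10^26 in Eq. (3.46) and the quoted number of e-folds is a one-line computation. The snippet below is only a sketch: it assumes, as an approximation, that H is nearly constant during inflation, so that the ratio of comoving Hubble radii equals the growth of the scale factor, a_end/a_i = e^N. The helper name `efolds` is hypothetical.

```python
import math

def efolds(growth_factor):
    """Number of e-folds N = ln(a_end / a_i) for a given growth of the scale factor."""
    return math.log(growth_factor)

n_min = efolds(1e26)   # about 59.9, i.e. roughly 60 e-folds

if __name__ == "__main__":
    print("minimal number of e-folds:", n_min)
```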
$$ \epsilon \equiv -\frac{\dot H}{H^2} = -\frac{d\ln H}{dN}\,, \qquad dN \equiv H\,dt = d\ln a\,, \tag{3.48} $$
we can rewrite Eq. (3.40) as
$$ \frac{d}{dt}(aH)^{-1} = -\frac{1}{a}\,(1 - \epsilon)\,. \tag{3.49} $$
2 This quantity will play a central role in the effective field theory of inflation to be presented in Chapter 7.
For inflation to take place, ε must be smaller than one. As argued in Section 3.3.1, the
solution of the flatness and horizon problems requires a rather long inflationary stage. In
order to achieve this, the fractional change of ε,
$$ \eta \equiv \frac{d\ln\epsilon}{dN} = \frac{\dot\epsilon}{H\epsilon}\,, \tag{3.50} $$
must also be small, |η| < 1. Note that ε and η are just the first two elements of a
full series of Hubble flow parameters
$$ \epsilon_{i+1} \equiv \frac{d\ln\epsilon_i}{dN} = \frac{\dot\epsilon_i}{H\epsilon_i}\,. \tag{3.51} $$
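The definitions (3.48) and (3.50) can be evaluated numerically for any expansion history given as ln H(N). The sketch below uses a toy history that is not taken from the text: it is constructed by hand so that ε(N) = ε₀ e^{η₀ N}, which makes the expected answers known in advance. The parameter values and function names are hypothetical, and the derivatives are approximated by central finite differences.

```python
import math

EPS0 = 0.01   # hypothetical value of epsilon at N = 0
ETA0 = 0.02   # hypothetical constant value of eta

def ln_hubble(n):
    """Toy expansion history ln H(N), built so that epsilon(N) = EPS0 * exp(ETA0 * N)."""
    return -(EPS0 / ETA0) * (math.exp(ETA0 * n) - 1.0)

def epsilon(n, h=1e-3):
    """epsilon = -d ln H / dN, Eq. (3.48), by central differences."""
    return -(ln_hubble(n + h) - ln_hubble(n - h)) / (2.0 * h)

def eta(n, h=1e-3):
    """eta = d ln(epsilon) / dN, Eq. (3.50), by central differences."""
    return (math.log(epsilon(n + h)) - math.log(epsilon(n - h))) / (2.0 * h)

if __name__ == "__main__":
    for n in (0.0, 10.0, 50.0):
        print(n, epsilon(n), eta(n))
```

In this toy history both parameters stay well below one over tens of e-folds, which is the regime required by the argument above.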
The quantity l ≡ 1/H is the so-called de Sitter radius.3 This representation makes explicit
the symmetries of the de Sitter space: rotations and Lorentz transformations in the 10 planes
formed by pairs of the five coordinates $z^A$. This ten-parameter SO(4, 1) group plays the same
instrumental role as the Poincaré group in Minkowski spacetime. In particular, it greatly
facilitates computations as far as quantum field theory is concerned.
The dS4 spacetime can also be described in an intrinsic way. Consider the transformation
$$ z^0 = \frac{1}{H}\sinh(Ht) + \frac{H}{2}\, e^{Ht}\, \delta_{ij} x^i x^j\,, \tag{3.53} $$
$$ z^i = x^i e^{Ht}\,, \tag{3.54} $$
$$ z^4 = \frac{1}{H}\cosh(Ht) - \frac{H}{2}\, e^{Ht}\, \delta_{ij} x^i x^j\,, \tag{3.55} $$
with i = 1, 2, 3, −∞ < t < ∞ and −∞ < $x^i$ < ∞. In this coordinate system the line
element (3.52) becomes a special case of the flat FLRW spacetime
$$ ds^2 = -dt^2 + a^2(t)\, \delta_{ij}\, dx^i dx^j\,, \tag{3.56} $$
with $a(t) = e^{Ht}$. Note however that Eq. (3.56) is not completely equivalent to (3.52), since
the coordinates $\{t, x^i\}$ cover only half of the de Sitter manifold. This can be easily seen by adding
the $z^0$ and $z^4$ coordinates (see also Fig. 3.7),
$$ z^0 + z^4 = \frac{1}{H}\, e^{Ht} \geq 0\,. \tag{3.57} $$
3 The choice of notation will become clear soon.
Figure 3.7: The embedding of de Sitter space into a five-dimensional flat spacetime with
two spatial coordinates suppressed. The flat coordinates in (3.53)-(3.55) cover only half of
the de Sitter manifold. The surfaces (lines) of constant t and constant x are also indicated.
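The flat slicing (3.53)-(3.55) can be checked numerically against the embedding. The sketch below assumes that the embedding constraint has the standard hyperboloid form −(z⁰)² + (z¹)² + (z²)² + (z³)² + (z⁴)² = 1/H², with one timelike direction as suggested by the SO(4, 1) symmetry; this explicit form is an assumption made here for illustration. The function names are hypothetical.

```python
import math

def flat_slicing(t, x, hubble):
    """Embedding coordinates [z0, z1, z2, z3, z4] from Eqs. (3.53)-(3.55)."""
    x_sq = sum(xi * xi for xi in x)
    growth = math.exp(hubble * t)
    z0 = math.sinh(hubble * t) / hubble + 0.5 * hubble * growth * x_sq
    zi = [xi * growth for xi in x]
    z4 = math.cosh(hubble * t) / hubble - 0.5 * hubble * growth * x_sq
    return [z0] + zi + [z4]

def hyperboloid(z):
    """Assumed constraint function: -z0^2 + z1^2 + z2^2 + z3^2 + z4^2."""
    return -z[0] ** 2 + sum(c * c for c in z[1:])

if __name__ == "__main__":
    hubble = 0.7                      # arbitrary illustrative value
    z = flat_slicing(0.3, (0.2, -0.5, 1.1), hubble)
    print("constraint:", hyperboloid(z), "expected:", 1.0 / hubble ** 2)
    print("z0 + z4   :", z[0] + z[4], "expected:", math.exp(hubble * 0.3) / hubble)
```

The second printed line reproduces Eq. (3.57): the combination z⁰ + z⁴ is positive for every choice of t and x, so these coordinates never reach the other half of the hyperboloid.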
Exercise
1. Derive Eq. (3.56) from the 5-dimensional embedding (3.52).
2. Other choices of coordinates leading to FLRW metrics with open and closed spatial
sections can also be considered. Find these sets of coordinates.
It is interesting to recast (3.56) in terms of the conformal time (3.35). Taking into account
that
$$ \tau = -\frac{1}{H}\, e^{-Ht} \quad\Longrightarrow\quad a(\tau) = -\frac{1}{H\tau}\,, \tag{3.58} $$
the line element takes the manifestly conformally flat form
$$ ds^2 = \frac{1}{H^2\tau^2} \left( -d\tau^2 + \delta_{ij}\, dx^i dx^j \right)\,, \tag{3.59} $$
with τ ranging between −∞ and 0. Note that Eq. (3.59) is manifestly invariant under the
rescaling
$$ \tau \to \lambda\tau\,, \qquad x^i \to \lambda x^i\,. \tag{3.60} $$
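Both the relation (3.58) and the rescaling symmetry (3.60) are easy to verify numerically. The following sketch evaluates the line element (3.59) on small coordinate displacements; the numerical values of H, λ and the displacements are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def conformal_time(t, hubble):
    """tau(t) = -exp(-H t) / H, Eq. (3.58)."""
    return -math.exp(-hubble * t) / hubble

def scale_factor(tau, hubble):
    """a(tau) = -1 / (H tau), Eq. (3.58)."""
    return -1.0 / (hubble * tau)

def interval(tau, d_tau, d_x, hubble):
    """Line element (3.59) for displacements d_tau and d_x = (dx1, dx2, dx3)."""
    spatial = sum(c * c for c in d_x)
    return (-d_tau ** 2 + spatial) / (hubble * tau) ** 2

if __name__ == "__main__":
    hubble, lam = 0.7, 3.7
    tau = conformal_time(1.3, hubble)
    d_tau, d_x = 0.01, (0.02, -0.01, 0.03)
    print("a(tau):", scale_factor(tau, hubble), "exp(Ht):", math.exp(hubble * 1.3))
    print("ds^2 before rescaling:", interval(tau, d_tau, d_x, hubble))
    print("ds^2 after rescaling :",
          interval(lam * tau, lam * d_tau, tuple(lam * c for c in d_x), hubble))
```

The two values of ds² coincide because the factor λ² from the displacements cancels against the one from 1/τ².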
As we will see in Section 6.1.3, this symmetry plays a central role in the properties of the
primordial perturbations generated during inflation. But let us not get ahead of ourselves; for
the time being we focus on the background evolution. What seems clear is that in order to recover
the hot Big Bang scenario the de Sitter phase cannot be eternal. In other words, we need to
equip the de Sitter space with a clock.