Relativity Notes
(MTH6132)
Course notes
September 2010
Preface
These are the notes for the course of Relativity (MTH6132) I lectured in the Semester
A of 2011 (October-December 2010) at the School of Mathematical Sciences of Queen
Mary, University of London. These notes are mostly based on handwritten notes I have
inherited from Prof. Reza Tavakol. Of course, several parts of the notes have been
adapted to my particular taste and understanding of the subject. In any case, any typos,
omissions or misrepresentations are entirely my responsibility.
The present course on Relativity is aimed at the particular characteristics of our
students at the School of Mathematical Sciences. In particular, it assumes very little
background in Physics. Hence, a certain amount of time is spent presenting the underlying
assumptions and experimental motivation for such a theory. It also assumes very little
from the mathematical side. All the necessary ideas from Differential Geometry and
tensors are provided within.
The course is quite an ambitious one. It begins with Special Relativity, then moves
to Differential Geometry and finally (in the last third) it provides an introduction to
General Relativity. Due to time constraints, there are some clear omissions in the choice
of topics. In particular, in the chapter on Special Relativity it would be desirable to
have a discussion of the Maxwell equations. In the chapter on General Relativity, the
discussion is restricted to the vacuum field equations. There is little mention of the field
equations with matter. Also, it would be desirable to have a discussion of the Friedmann
Cosmological models. The discussion of these topics would require at least a couple of
weeks more, and may also involve the reorganisation of some of the topics discussed in
the mathematical background. I do not discard the possibility of carrying out such a
revision next time I lecture the course.
Chapter 1
Introduction
Special Relativity still applies locally. The domain of applicability of General Relativity
is in Astrophysics and Cosmology. More recently, the Global Positioning System (GPS)
requires General Relativity to function accurately! Contrary to Special Relativity,
General Relativity was not widely accepted until the 1960s.
• Events. This notion denotes a single point in space together with a single point in
time. Thus, events are characterised by 4 real numbers: an ordered triple (x, y, z)
giving the location in space relative to a fixed coordinate system and a real number
giving the Newtonian time. One denotes the event by E = (t, x, y, z).
There are an infinite number of frames of reference. Motion relative to each frame
looks, in principle, different. Hence, it is natural to ask: is there a subset of these frames
which are in some sense simple, preferred or natural? The answer to this question is yes.
These are the so-called inertial frames. In an inertial frame an isolated, non-rotating,
unaccelerated body moves on a straight line and uniformly.
Inertial frames are not unique. There are actually an infinite number of these. This
raises the question: can one tell which inertial frame one is in? It turns out that
within the framework of Newtonian Mechanics this is not possible. More precisely, one
has the following:
Galilean Principle of Relativity. Laws of mechanics cannot distinguish between
inertial frames. This implies that there is no absolute rest. In other words, the laws of
Mechanics retain the same form in different inertial frames.
In this sense, Relativity predates Einstein.
(1) Any material body continues in its state of rest or uniform motion (in a straight
line) unless it is made to change the state by forces acting on it. This principle is
equivalent to the statement of existence of inertial frames.
These laws or principles, together with the following fundamental assumptions (some
of which are implicitly assumed in Newton’s laws) amount to the Newtonian framework :
(A1) Space and time are continuous —i.e. not discrete. This is necessary to make use
of the Calculus.
(A2) There is a universal (absolute) time. Different observers in different frames mea-
sure the same time. In fact, Newton also regarded space to be absolute as well.
However, the absoluteness of space is not necessary for the development of the
Newtonian framework, as space intervals turn out to be invariant under Galilean
transformations. Historically, Newton demanded this for subjective reasons.
(A3) Mass remains invariant as viewed from different inertial frames.
(A4) The Geometry of space is Euclidean. For example, the sum of angles in any triangle
equals 180 degrees.
(A5) There is no limit to the accuracy with which quantities such as time and space can
be measured.
As will be seen in the sequel, Assumptions 2 and 3 are relaxed in Special Relativity
while Assumption 4 is relaxed in General Relativity. Assumption 5 is relaxed in Quantum
Mechanics —not to be discussed in the course. Presumably Assumption 1 will be relaxed
in Quantum Gravity!
\[ x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t, \tag{1.1} \]
\[ \underline{r}' = \underline{r} - \underline{v}t. \]
In general, if the coordinate axes are not in standard configuration and the origins $O$ and
$O'$ of the coordinate axes do not coincide, then the transformation takes the general
form:
\[ \underline{r}' = R\underline{r} - \underline{v}t + \underline{d}, \]
where $R$ is the rotation matrix aligning the axes of the frames and $\underline{d}$ is the displacement
between the origins at $t = 0$. Note that the general transformation is linear, so that $F'$
is inertial if $F$ is. The most general transformation would also include
\[ t' = t + \tau \]
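\[ \underline{v}'_2 - \underline{v}'_1 = \underline{v}_2 - \underline{v}_1, \qquad \underline{r}'_2 - \underline{r}'_1 = \underline{r}_2 - \underline{r}_1. \]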
This implies that f, and hence the Second Law, remains invariant under changes of
inertial frame.
This discussion amounts to a form of self-consistency, in the sense that Physics, when
confined to Newtonian Mechanics, satisfies the Galilean Principle of Relativity.
1.2.5 Electromagnetism
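Special Relativity arises from the tension between Newtonian Mechanics and the other
great physical theory of the 19th century: Electromagnetism. The fundamental laws of
Electromagnetism are the so-called Maxwell equations$^5$:
\[ \nabla\cdot\underline{D} = \rho, \qquad \nabla\times\underline{E} = -\frac{\partial\underline{B}}{\partial t}, \]
\[ \nabla\cdot\underline{B} = 0, \qquad \nabla\times\underline{H} = \underline{j} + \frac{\partial\underline{D}}{\partial t}, \]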
where B is the magnetic induction, E the electric field, H the magnetic field, D the
electric displacement, j the electric current density and ρ the electric charge density.
It can be shown that these equations predict the existence of electromagnetic waves
for E and H in the form
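\[ \nabla^2\underline{E} = \frac{1}{c^2}\frac{\partial^2\underline{E}}{\partial t^2}, \qquad \nabla^2\underline{H} = \frac{1}{c^2}\frac{\partial^2\underline{H}}{\partial t^2}, \]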
where c is the speed of propagation of the waves. These electromagnetic waves were soon
identified with the propagation of light.
We recall that light travels with a speed of $c \approx 3 \times 10^8$ m/s. This was first measured
by Rømer$^6$ in 1675 by studying the delay in the appearance of the moons of Jupiter.
Within the Newtonian framework, the Maxwell equations give rise to two problems:
(1) With respect to which system of reference is the speed of light c measured? First,
it was assumed that the absolute space of Newton —the so-called ether — was the
medium in (and relative to) which light moved. However, attempts at detecting
the effects of Earth’s motion on the velocity of light —the so-called terrestrial
ether drift— all failed. The most important of these was the Michelson-Morley
experiment$^7$. This gave a null result.
(2) It is easy to show that Maxwell’s equations and the wave equation do not remain
invariant under Galilean transformations.
These problems gave rise to a crisis in 19th century Physics. Three scenarios were
put forward to resolve the tension. These were:
(i) Maxwell’s equations were incorrect. The correct laws of Electromagnetism would
remain invariant under Galilean transformations.
(ii) Electromagnetism had a preferred frame of reference —that of ether.
(iii) There is a Relativity Principle for the whole of Physics —Mechanics and Electro-
magnetism. In that case the laws of Mechanics need modification.
Now, Electromagnetism was very successful and had very strong predictive power.
There was no experimental support for (ii). Hence the point of view (iii) was adopted by
Einstein. His resolution of the tension between Mechanics and Electromagnetism came
to be known as Special Relativity.
5 James C. Maxwell (1831-1879). Scottish mathematician.
6 Ole C. Rømer (1644-1710). Danish astronomer.
7 Albert Michelson (1852-1931). Edward Morley (1838-1923). American physicists.
Chapter 2
Special Relativity
Worldline. Defined as the set of all points that the trajectory of a particle follows in
spacetime.
2.2.2 Examples
• The worldline of a light ray is a straight line with slope equal to 1/c. In practice
we shall usually choose c = 1 so that the slope is equal to 1.
Note. All uniformly moving particles have worldlines which are straight lines with
slopes bigger than 1/c or bigger than 1 if c = 1. Therefore they all lie in the shaded
region of the figure.
• The worldlines of accelerating bodies are curved. For example, for a uniformly
accelerated body from rest one has that initially the worldline is tangent to the t-axis.
The upper bound for v is c. The slope of the asymptotic motion is 1(= 1/c). This
situation will be analysed in detail later on.
2.3 Lorentz transformations (LT)
Consider two frames $F$ and $F'$ moving in standard configuration, i.e. $O'$ moves with
speed $v$ along the x-axis relative to $O$. The worldline of $O'$ in the frame $F$ is given as in
the figure:
Let observers $O$ and $O'$ carry clocks measuring $t$ and $t'$ respectively such that when
$O'$ is at $(t, vt)$ according to $O$, the clock at $O'$ registers $t' = \beta t$, where $\beta$ may be a function
of $v$. In this sense $\beta$ carries all the effect that the motion has on $t$. Note also that $\beta = 1$
for Galilean transformations.
Now consider a light ray emitted by $O$ at $t = t_1$, travelling via $O'$, being reflected at
$p(t, x)$ and received by $O$ at $t = t_4$, i.e. a round trip.
We want to relate the coordinates of the event at $p$ relative to the frames $F$ and $F'$.
In line with Einstein’s postulates assume that the speed of light is $c$ for both $O$ and
$O'$. From the perspective of $O$ the distance and time may be fixed using the so-called
radar convention:
\[ x = \tfrac{1}{2}c(t_4 - t_1), \qquad t = \tfrac{1}{2}(t_4 + t_1) \]
so that
Similarly,
where it has been used that $t' = \beta t$. Therefore, the time and location of $p(t, x)$ as
measured by $O'$ (using again the radar convention) are given by:
\[ x' = \tfrac{1}{2}c(t'_3 - t'_2) = \frac{\beta c^2(x - vt)}{c^2 - v^2}, \tag{2.5a} \]
\[ t' = \tfrac{1}{2}(t'_3 + t'_2) = \frac{\beta(c^2 t - vx)}{c^2 - v^2}, \tag{2.5b} \]
where equations (2.4a) and (2.4b) have been used to obtain the second equalities in the
last pair of equations.
Note. The observer $O'$ is also assuming that the velocity of light is c. This assumption
is inconsistent with the Galilean transformations.
Eliminating $x$ between (2.5a) and (2.5b) one obtains
\[ t = \frac{1}{\beta}\left(t' + \frac{vx'}{c^2}\right). \tag{2.6} \]
Now, the Relativity principle requires that we obtain the same result if we interchange $x$,
$x'$ and $t$, $t'$ and let $v \to -v$. Applying this idea to equation (2.5b) and equating to (2.6):
Remark. This is a particular case of a more general transformation with 10 parameters. These
parameters are the 3 components of the velocity, 3 components of a shift of the origin,
3 parameters of a rotation and a further parameter fixing the origin of the time. The
set of these transformations forms a group. The transformation given by (2.9) is the
1-parameter subgroup of this group called the special Lorentz group.
We also require α and v to have the same sign as cosh α = cosh(−α).
The Lorentz transformation (2.9) becomes (hyperbolic form of the Lorentz transfor-
mation):
Adding and subtracting $x'$ and $ct'$ as given by (2.10a) and (2.10b) one obtains
To show that the Lorentz transformations form a group one needs to show:
The most convenient way to verify the latter is to use the form given by (2.11a) and
(2.11b) and then check one by one:
(i) One sees that there exists an identity Lorentz transformation corresponding to $v = 0$
($\alpha = 0$).
(iii) Let $F''$ move with velocity $v_2$ ($\alpha_2$) relative to $F'$ and $F'$ with velocity $v_1$ ($\alpha_1$)
relative to $F$, all in standard configuration.
and
It follows then that
The previous discussion also allows one to discuss the Special Relativity rule for the
composition of velocities. Since the resultant of two Lorentz transformations with parameters
$\alpha_1$ and $\alpha_2$ is a Lorentz transformation with parameter $\alpha_1 + \alpha_2$, the corresponding relation
between the velocity parameters of the transformations can be easily derived from
\[ \tanh\alpha = \frac{v}{c} \]
by recalling that
\[ \tanh(\alpha_1 + \alpha_2) = \frac{\tanh\alpha_1 + \tanh\alpha_2}{1 + \tanh\alpha_1\tanh\alpha_2}. \]
Substituting for
\[ \tanh\alpha_1 = \frac{v_1}{c}, \qquad \tanh\alpha_2 = \frac{v_2}{c}, \qquad \tanh(\alpha_1 + \alpha_2) = \frac{v}{c} \]
one obtains
\[ v = \frac{v_1 + v_2}{1 + v_1v_2/c^2} \tag{2.12} \]
where $v$ is the velocity of $F''$ relative to $F$. It represents the relativistic sum of collinear
velocities $v_1$ and $v_2$ along the x-axis. A generalisation of this rule will be discussed later.
Remark 1. When
\[ \frac{v_1}{c} \ll 1, \qquad \frac{v_2}{c} \ll 1, \]
then equation (2.12) takes the Galilean form
\[ v = v_1 + v_2. \]
Remark 2. Since | tanh α| < 1, it follows that v always satisfies |v| < c.
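The composition rule (2.12) and its derivation via the hyperbolic parameter can be checked numerically. The following is a minimal sketch in Python; the function names `add_velocities` and `add_via_rapidity` are illustrative choices and not part of the notes, and units with c = 1 are assumed by default.

```python
import math

def add_velocities(v1, v2, c=1.0):
    """Relativistic sum of collinear velocities, equation (2.12)."""
    return (v1 + v2) / (1.0 + v1 * v2 / c**2)

def add_via_rapidity(v1, v2, c=1.0):
    """Same sum, obtained by adding the parameters alpha = artanh(v/c)."""
    alpha = math.atanh(v1 / c) + math.atanh(v2 / c)
    return c * math.tanh(alpha)

# Two large collinear velocities still compose to something below c.
v = add_velocities(0.8, 0.9)
print(v, add_via_rapidity(0.8, 0.9))

# For small velocities the Galilean sum of Remark 1 is recovered.
print(add_velocities(1e-4, 2e-4))
```

Both routes give the same value, and the result stays below c as stated in Remark 2.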
\[ x' = x\cos\alpha + y\sin\alpha, \]
\[ y' = -x\sin\alpha + y\cos\alpha, \]
where $(x, y)$ and $(x', y')$ correspond to the coordinates of the point p in the two frames.
\[ x' = OA + AB = OA + CD = OC\cos\alpha + PC\sin\alpha = x\cos\alpha + y\sin\alpha, \]
\[ y' = PB = PD - BD = PC\cos\alpha - OC\sin\alpha = -x\sin\alpha + y\cos\alpha. \]
Letting
\[ (OP)^2 \equiv x^2 + y^2, \tag{2.13} \]
one sees that in Euclidean space, rotations leave the distance (OP) invariant. Note
also that the rotation leaves curves of constant distance from the origin, i.e. circles,
invariant.
and multiplying both sides one obtains
where the choice of sign in the previous equation is a convention. Furthermore, since
$y' = y$ and $z' = z$ one obtains
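Alternatively, one could start from the infinitesimal version of the Lorentz transformations
\[ \Delta t' = \gamma\left(\Delta t - \frac{v\Delta x}{c^2}\right), \quad \Delta x' = \gamma(\Delta x - v\Delta t), \quad \Delta y' = \Delta y, \quad \Delta z' = \Delta z, \]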
Therefore
\[ -c^2 dt^2 + dx^2 + dy^2 + dz^2 \]
\[ -dt^2 + dx^2 + dy^2 + dz^2 \]
which, apart from the negative sign, is very similar to the Euclidean distance in 4 dimensions
\[ dl^2 = dx^2 + dy^2 + dz^2 + dw^2. \]
The latter measures the “distance” between events $(t, x, y, z)$ and $(t + dt, x + dx, y + dy, z + dz)$
in spacetime.
Note. As opposed to Euclidean geometry, the set of points with equal distances from
the origin defines a hyperbola:
\[ x^2 - t^2 = D, \qquad D \text{ a constant.} \]
One can also ask what is seen in the reference frame $F'$. For this one can use the
inverse Lorentz transformations
\[ t = \gamma(t' + vx'), \qquad x = \gamma(x' + vt'). \]
The $x$ and $t$ axes from the point of view of the frame $F'$ are given, respectively, by
\[ t' = -vx', \qquad t' = -\frac{1}{v}x'. \]
Thus, the picture from the point of view of $F'$ is the following:
This picture is consistent with the Principle of Relativity —all frames of reference
are equivalent and should provide an equivalent picture! We shall see further examples
of this symmetry in the sequel.
(ii) Repeated indices are called dummy indices since they may be replaced by another
index (from the same alphabet!) not already used. For example:
\[ ds^2 = \eta_{ab}\,dx^a dx^b = \eta_{cd}\,dx^c dx^d. \]
(iii) To avoid ambiguity, no index should appear more than twice in the same expression.
So
ai bi ci
is not allowed!
(iv) Indices that occur only once in an expression (or terms of an equation) are called
free indices. In an equation such indices match in every term. For example consider
Ai Bi Cj = Dj .
Notice that i is a dummy index and that j is a free index.
Examples
For simplicity in the following examples let the Latin lower case index take values 1, 2.
(1)
\[ A^iB^j = A^1B^1,\ A^1B^2,\ A^2B^1,\ A^2B^2 \]
as i, j are free indices.
(2)
\[ A^iB_i = \sum_{i=1}^{2} A^iB_i = A^1B_1 + A^2B_2, \]
as i is a dummy index.
(3)
\[ g_{ij} = g_{11},\ g_{12},\ g_{21},\ g_{22} \]
as, again, i, j are free indices.
(5) In $R^i{}_{jkl}$ all indices are free and there are 16 terms: $R^1{}_{111}$, $R^1{}_{112}$, $R^1{}_{122}$, ...
(6)
\[ \Gamma^i{}_{jk}\frac{dx^j}{ds}\frac{dx^k}{ds} = \Gamma^i{}_{lm}\frac{dx^l}{ds}\frac{dx^m}{ds} \]
as l, m are dummy indices while i is free.
(7) xa yb z b = za yc y c .
(8) $g_{ij}\,dx^i dx^j = g_{mn}\,dx^m dx^n = g_{11}(dx^1)^2 + g_{12}\,dx^1dx^2 + g_{21}\,dx^2dx^1 + g_{22}(dx^2)^2$.
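The distinction between free and dummy indices can be made concrete by writing the implied sums as explicit loops. This is only an illustrative sketch in plain Python with indices running over two values; the helper names `contract`, `outer` and `quadratic_form` and the sample numbers are assumptions introduced here.

```python
def contract(A, B):
    """A^i B_i: the repeated (dummy) index i is summed over, giving one number."""
    return sum(A[i] * B[i] for i in range(len(A)))

def outer(A, B):
    """A^i B^j: both indices are free, so every component is kept."""
    return [[A[i] * B[j] for j in range(len(B))] for i in range(len(A))]

def quadratic_form(g, dx):
    """g_ij dx^i dx^j: both i and j are dummy indices (double sum)."""
    n = len(dx)
    return sum(g[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))

A = [2.0, 3.0]
B = [5.0, 7.0]
g = [[1.0, 0.5], [0.5, 2.0]]
dx = [1.0, 2.0]

print(contract(A, B))         # a single number, as in example (2)
print(outer(A, B))            # four components, as in example (1)
print(quadratic_form(g, dx))  # four terms summed, as in example (8)
```

Renaming the loop variable changes nothing, which is the statement that a dummy index may be replaced by any unused index.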
4-vectors
A 4-vector is a set of four ordered real numbers which transform in exactly the same
manner as do (t, x, y, z) under Lorentz transformations.
Denote 4-vectors by overlines; as opposed to 3-vectors denoted by underlines.
In index notation
\[ \bar{A} = (A^i) \equiv (A^0, A^1, A^2, A^3). \]
The Lorentz transformation relating $A^i$ to $A'^i$ may be written as
\[ A'^i = L^i{}_j A^j, \]
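where $L^i{}_j$ is the Lorentz transformation matrix defined as
\[ (L^i{}_j) \equiv \begin{pmatrix} L^0{}_0 & L^0{}_1 & L^0{}_2 & L^0{}_3 \\ L^1{}_0 & L^1{}_1 & L^1{}_2 & L^1{}_3 \\ L^2{}_0 & L^2{}_1 & L^2{}_2 & L^2{}_3 \\ L^3{}_0 & L^3{}_1 & L^3{}_2 & L^3{}_3 \end{pmatrix} = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
Check:
\[ A'^0 = (\gamma, -v\gamma, 0, 0)\begin{pmatrix} A^0 \\ A^1 \\ A^2 \\ A^3 \end{pmatrix} = \gamma(A^0 - vA^1). \]
So it transforms like $x^0$.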
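As a numerical companion to the check above, the matrix can be applied to a sample 4-vector. The sketch below uses plain Python lists, units with c = 1, and the sign convention of the scalar product used in these notes (minus sign on the time component); the names `boost_matrix`, `transform` and `norm2` and the sample components are illustrative assumptions.

```python
import math

def boost_matrix(v):
    """Matrix (L^i_j) for a boost with speed v along x, in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -v * g, 0.0, 0.0],
            [-v * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(L, A):
    """A'^i = L^i_j A^j, with the sum over the dummy index j written out."""
    return [sum(L[i][j] * A[j] for j in range(4)) for i in range(4)]

def norm2(A):
    """Minkowski norm -(A^0)^2 + (A^1)^2 + (A^2)^2 + (A^3)^2."""
    return -A[0] ** 2 + A[1] ** 2 + A[2] ** 2 + A[3] ** 2

A = [2.0, 1.0, 0.5, -0.3]
A_prime = transform(boost_matrix(0.6), A)

print(A_prime)                    # time component equals gamma * (A^0 - v A^1)
print(norm2(A), norm2(A_prime))   # the two norms agree up to rounding
```

The components change under the boost while the norm does not, which is the invariance asked for in the exercise that follows.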
is invariant.
Exercise: Show by direct substitution that the norm of a 4-vector is invariant. One has
that
Hence,
Remark. Because of the negative sign in (2.17), the norm of a vector does not have to
be positive! A 4-vector Ā is said to be:
• null if $|\bar{A}|^2 = 0$.
In Minkowski spacetime a null vector need not be the zero vector, whose components
are all zero! Only in a space in which the norm is positive definite is it true that $|A|^2 = 0$
implies $A = 0$.
Example: Show that Ā = (1, 1, 0, 0) is a null vector. A direct computation gives
Similarly for
\[ (1, -1, 0, 0), \quad (1, 0, 1, 0), \quad \left(1, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0\right), \quad \text{etc.} \]
\[ x^2 + y^2 + z^2 - t^2 = 0. \]
This is said to define a light cone at the origin, because all light rays emitted at $t = 0$
at the origin lie on the cone $x^2 + y^2 + z^2 = c^2t^2$. Suppressing one space dimension one has the
following figure:
Scalar product
The scalar product of two 4-vectors Ā, B̄ is defined by
\[ \bar{A}\cdot\bar{B} = \eta_{ab}A^aB^b = -A^0B^0 + A^1B^1 + A^2B^2 + A^3B^3. \]
and note that $|\bar{A}|^2$, $|\bar{B}|^2$ and $|\bar{A} + \bar{B}|^2$ are all invariants. Hence, so is $\bar{A}\cdot\bar{B}$.
Orthogonality
Two vectors are called orthogonal if Ā · B̄ = 0.
Note 1. Because of the nature of the Minkowski geometry, two orthogonal 4-vectors do
not appear orthogonal graphically.
Note 2. Null vectors are orthogonal to themselves (Ā · Ā = 0)!
Basic 4-vectors
In any frame F , there exist 4 basic vectors
ē0 = (1, 0, 0, 0), ē1 = (0, 1, 0, 0), ē2 = (0, 0, 1, 0), ē3 = (0, 0, 0, 1),
\[ \bar{A} = A^i\bar{e}_i = A^0\bar{e}_0 + A^1\bar{e}_1 + A^2\bar{e}_2 + A^3\bar{e}_3, \]
(1) Any event Ei inside the light cone occurring after O from the perspective of $F$ will
also occur after O from the perspective of $F'$ no matter how fast $F'$ moves with
respect to $F$ so long as $v < c$. An event E0 outside the light cone and occurring
after O from the point of view of $F$ could occur before O from the point of view
of $F'$. Therefore, outside the future (and similarly the past) light cone of O there
exists no ordered time sense of events.
Given any point O, the spacetime is divided up into the absolute past of O (the
past light cone at O) and the absolute future of O (the future light cone at O) and
a region (spacelike) known as the region of relative simultaneity.
(2) For invariance of causality, interactions must take place at speeds less than c. To
see this, consider a process in which an event E1 causes an event E2 at super-light
speed u > c relative to some frame F . Choose coordinates in F such that E1 and
E2 occur on the x-axis and let their space and time separation ∆x > 0, ∆t > 0
(i.e. E1 precedes E2). Now, in a frame $F'$ moving with velocity $v$ relative to $F$
we have:
\[ \Delta t' = \gamma\left(\Delta t - \frac{v}{c^2}\Delta x\right) = \gamma\Delta t\left(1 - \frac{uv}{c^2}\right) \]
where
\[ u = \frac{\Delta x}{\Delta t} \]
is the speed of propagation. Now, for
\[ \frac{c^2}{u} < v < c \]
we would have $\Delta t' < 0$ so that in $F'$ the event E2 precedes E1, i.e. cause and
effect are reversed or we have information from receiver to transmitter!
2.10.2 Length contraction
This is also called the Lorentz-Fitzgerald contraction. Consider $F$ and $F'$ in standard
configuration. Let a rod of length $\Delta x'$ be placed at rest along the $x'$-axis of $F'$. To find
the length as measured in F , we must measure the distance between the two ends of the
rod simultaneously in F . Consider two events occurring simultaneously at the end points
of the rod in F . Therefore one has ∆t = 0. Now, using
Geometrically:
$F$ measures the distance between the two ends of the rod at $t = 0$, i.e. $F$ measures OB,
while $F'$ measures OA.
2.11 Paradoxes
These arise from an incautious view of the situation, and the fact that simultaneity means
different things to different observers.
Answer: No, since there is no symmetry! The twin A remained in the same inertial
frame, but B has experienced acceleration and deceleration and therefore knows that
she/he has not been in an inertial frame! This solves the paradox.
Note: in Minkowski spacetime OO1 O2 < OO2 .
\[ d\tau^2 \equiv -\frac{ds^2}{c^2} = dt^2 - \frac{1}{c^2}\left(dx^2 + dy^2 + dz^2\right). \]
In the previous definition the minus sign is included so that $d\tau$ and $dt$ have the same
sign! The name proper time comes from the fact that a clock at rest with a moving
particle, i.e. in the particle’s rest frame where $dx = dy = dz = 0$, has $d\tau = dt$, i.e.
it is equal to the time elapsed on the particle’s clock.
We employ τ as the invariant measure of time for the particle.
2.14 4-velocity and 4-momentum
In order to express Newton’s laws in Special Relativity in an invariant way, we need to
express them in terms of 4-vectors.
4-velocity
The 4-velocity of a particle is defined as a unit tangent to its worldline:
\[ \bar{U} = \frac{d\bar{x}}{d\tau}, \qquad U^i = \frac{dx^i}{d\tau}. \]
Remarks:
Ū · Ū = −1. (2.18)
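where $\underline{v}$ denotes the 3-velocity relative to the frame $F$ and $v^2 = \underline{v}\cdot\underline{v}$. Hence, one
concludes that
\[ \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - v^2}} = \gamma(v) \qquad (c = 1). \]
Now, using
\[ \frac{dx}{d\tau} = \frac{dx}{dt}\frac{dt}{d\tau} = \gamma(v)v^1, \quad \text{etc.} \]
one finds that
\[ \bar{U} = \left(\frac{dt}{d\tau}, \frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau}\right) = \gamma(v)(1, v^1, v^2, v^3), \]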
or in short
Ū = γ(v)(1, v). (2.19)
Note that the spatial part of Ū is essentially v.
4-momentum
The 4-momentum is the natural analogue of the 3-momentum:
\[ \bar{p} = m_0\bar{U}, \]
where $m_0$ denotes the mass of the particle. From the definition it follows that
\[ \bar{p}\cdot\bar{p} = m_0^2\,\bar{U}\cdot\bar{U} = -m_0^2, \]
where it has been used that $\bar{U}\cdot\bar{U} = -1$. Also, using (2.19) one has
\[ \bar{p} = m_0\gamma(v)(1, \underline{v}). \tag{2.20} \]
It follows that the space part of (2.20) can be identified with the 3-momentum, where
by analogy $m_0\gamma$ is called the moving mass, or the apparent mass, and $m_0$ is referred
to as the rest mass.
Let
\[ m \equiv m_0\gamma(v) = \frac{m_0}{\sqrt{1 - v^2/c^2}}, \]
so that the time component of $\bar{p}$ is identified with the energy
\[ E = m_0c^2\gamma(v). \]
One reason for this identification comes from considering the limit for small $v/c$. For
$v/c \ll 1$ one has
\[ E = m_0c^2\gamma(v) = m_0c^2(1 - v^2/c^2)^{-1/2} \approx m_0c^2 + \tfrac{1}{2}m_0v^2, \]
where the binomial expansion has been used. Now, the second term is just the Newtonian
kinetic energy ($\tfrac{1}{2}m_0v^2$). The first term ($m_0c^2$) is then interpreted as the rest mass energy.
This is the famous equation
\[ E_{\mathrm{rest}} = m_0c^2. \]
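How good the low-velocity approximation is can be seen by comparing the two expressions for the energy at several speeds. This is a rough sketch in units with c = 1; the function names `total_energy` and `newtonian_estimate` are illustrative and not taken from the notes.

```python
import math

def total_energy(m0, v, c=1.0):
    """Relativistic energy E = m0 c^2 gamma(v)."""
    return m0 * c**2 / math.sqrt(1.0 - (v / c) ** 2)

def newtonian_estimate(m0, v, c=1.0):
    """Rest energy plus Newtonian kinetic energy, m0 c^2 + (1/2) m0 v^2."""
    return m0 * c**2 + 0.5 * m0 * v**2

for v in (0.01, 0.1, 0.5):
    exact = total_energy(1.0, v)
    approx = newtonian_estimate(1.0, v)
    print(v, exact, approx, exact - approx)
```

The difference is of order v⁴/c⁴, so it is negligible at v = 0.01 but clearly visible at v = 0.5.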
2.15 Photons
The definition of 4-velocity given in the previous sections breaks down when applied to
particles moving with the speed of light (photons) since for light rays one has $ds^2 =
-d\tau^2 = 0$. In this case one may choose another parameter $\lambda$ and define
\[ \bar{k} = \frac{d\bar{x}}{d\lambda}, \]
but again $\bar{k}\cdot\bar{k} = 0$ since $\bar{k}$ is null. This also implies that $\bar{p}\cdot\bar{p} = 0$ for photons as $\bar{p}$ is in
the direction of $\bar{k}$. Now, recalling that $\bar{p}\cdot\bar{p} = -m_0^2$, it follows that $m_0 = 0$ for photons.
Hence, particles moving with the speed of light must be massless!
Consider a photon with 4-momentum $\bar{p} = (E, \underline{p})$ defined relative to some frame $F$.
As seen before $\bar{p}\cdot\bar{p} = 0$, so that one finds that
\[ E^2 - p^2 = 0, \quad \text{or} \quad E = p. \]
Therefore, for photons the spatial 3-momentum and the energy are equal. In particular,
if the photon moves along the x-direction one has that
px = E.
2.16 Doppler shift
Let $F$ and $F'$ be in standard configuration. Consider a photon of frequency $\nu$ moving in
the x-direction relative to the frame $F$. Relative to the frame $F'$ the energy of the photon
may be obtained using a Lorentz transformation. For this recall that $\bar{p}$ is a 4-vector and
its energy is given by its t-component. So, from
\[ \bar{p} = (E, p_x), \qquad p_y = p_z = 0, \]
one obtains
\[ E' = \gamma(E - vp_x), \qquad (c = 1). \tag{2.22} \]
Also, recall that from Quantum Mechanics, a photon of frequency $\nu$ has energy given by
$h\nu$ where $h$ denotes Planck’s constant:
\[ h\nu' = \frac{h\nu - vp_x}{\sqrt{1 - v^2}}. \tag{2.23} \]
Furthermore, for such a photon $E = p_x$ so that substituting into (2.23):
\[ h\nu' = \frac{h\nu - vh\nu}{\sqrt{1 - v^2}}, \]
from where
\[ \frac{\nu'}{\nu} = \frac{1 - v}{\sqrt{1 - v^2}} = \sqrt{\frac{1 - v}{1 + v}}. \]
Adding the constant c:
\[ \frac{\nu'}{\nu} = \sqrt{\frac{1 - v/c}{1 + v/c}}. \tag{2.24} \]
This is the relativistic Doppler shift formula. Note that when $v/c \ll 1$, then using the
binomial expansion in (2.24) one obtains
\[ \frac{\nu'}{\nu} \approx 1 - v/c, \]
which is the usual (non-relativistic) formula for the Doppler shift.
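The two routes to the Doppler formula, via the transformation of the photon energy in (2.22) and via the closed form (2.24), can be compared numerically. The following is a small sketch; the function names `doppler_ratio` and `doppler_via_energy` are assumptions made for illustration.

```python
import math

def doppler_ratio(v, c=1.0):
    """nu'/nu from (2.24) for a frame F' moving with speed v along the
    direction of propagation of the photon."""
    beta = v / c
    return math.sqrt((1.0 - beta) / (1.0 + beta))

def doppler_via_energy(v, c=1.0):
    """Same ratio from E' = gamma (E - v p_x) with E = p_x for the photon."""
    beta = v / c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (1.0 - beta)

print(doppler_ratio(0.6), doppler_via_energy(0.6))
print(doppler_ratio(0.001), 1.0 - 0.001)   # non-relativistic limit
```

For v = 0.6 the frequency is halved, while for v = 0.001 the result differs from 1 − v/c only at order v²/c².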
Remark. The Doppler shift has been fundamental in Cosmology to establish the
expansion of the Universe.
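\[ \frac{d\bar{p}}{d\tau} = \frac{d(m_0\bar{U})}{d\tau} = \frac{d}{d\tau}\left[m_0\gamma(v)(1, \underline{v})\right] = m_0\frac{dt}{d\tau}\frac{d}{dt}\left[\gamma(v)(1, \underline{v})\right]. \tag{2.25} \]
But,
\[ \frac{dt}{d\tau} = \gamma(v), \]
as seen in section 2.14. Also,
\[ \frac{d\gamma(v)}{dt} = \frac{d}{dt}(1 - v^2)^{-1/2} = \frac{d}{dt}(1 - \underline{v}\cdot\underline{v})^{-1/2} = -\frac{1}{2}\,\frac{-2\,\underline{v}\cdot\frac{d\underline{v}}{dt}}{(1 - \underline{v}\cdot\underline{v})^{3/2}}, \]
so that
\[ \frac{d\gamma(v)}{dt} = \gamma^3\,\underline{v}\cdot\dot{\underline{v}}, \]
where we have written
\[ \dot{\underline{v}} \equiv \frac{d\underline{v}}{dt}. \]
Substituting into (2.25):
\[ \frac{d\bar{p}}{d\tau} = m_0\gamma\left[\gamma(0, \dot{\underline{v}}) + \gamma^3\,\underline{v}\cdot\dot{\underline{v}}\,(1, \underline{v})\right], \]
and finally
\[ \frac{d\bar{p}}{d\tau} = m_0\gamma^4\left[\underline{v}\cdot\dot{\underline{v}},\ (1 - v^2)\dot{\underline{v}} + (\underline{v}\cdot\dot{\underline{v}})\underline{v}\right]. \]
Now, for $v \ll c$ one has that $\gamma \approx 1$ and $(\underline{v}\cdot\dot{\underline{v}})\underline{v} \approx \dot{v}v^2/c^2 \ll 1$ so that
\[ \frac{d\bar{p}}{d\tau} \approx m_0(\underline{v}\cdot\dot{\underline{v}},\ \dot{\underline{v}}). \]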
The second term (spatial part) on the right hand side of the last equation is the usual
rate of change of the 3-momentum while the time part is the rate of change of the kinetic
energy.
4-acceleration
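The 4-acceleration is defined as $d\bar{U}/d\tau$. For $|\underline{v}| \ll c$ one has
\[ \frac{d\bar{U}}{d\tau} \approx (\underline{v}\cdot\dot{\underline{v}},\ \dot{\underline{v}}), \]
with the spatial part being approximately the 3-acceleration at low $v$. From
\[ \bar{U}\cdot\frac{d\bar{U}}{d\tau} = 0 \]
it follows that the 4-acceleration is orthogonal to the 4-velocity. Using the definition of
4-acceleration Newton’s second law becomes
\[ \bar{F} = \frac{d\bar{p}}{d\tau}, \]
where $\bar{F}$ denotes the 4-force vector. Note also that $\bar{F}\cdot\bar{U} = 0$ so that $\bar{F}$ and $\bar{U}$ are also
orthogonal. This can be seen as follows:
\[ \bar{F}\cdot\bar{U} = \frac{d\bar{p}}{d\tau}\cdot\bar{U} = m_0\frac{d\bar{U}}{d\tau}\cdot\bar{U} = 0. \]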
2.18 3-velocity and 3-acceleration
Let $F$ and $F'$ be in standard configuration, with $F'$ moving with velocity $V$ along the x-axis.
For simplicity, we will restrict our attention to movements along the x-axis. Let $v$ be the
(uniform) velocity of a particle relative to $F$. To find $v'$, the velocity relative to $F'$, recall
that:
\[ v = \frac{dx}{dt}, \tag{2.26a} \]
\[ v' = \frac{dx'}{dt'}, \tag{2.26b} \]
where the increments represent the distances and times between two events for the
particle relative to the two frames. Using the inverse Lorentz transformations one has
\[ v = \frac{v' + V}{1 + v'V} \]
and calculating the differential
\[ dv = \frac{dv'}{1 + v'V} - \frac{v' + V}{(1 + v'V)^2}V\,dv'. \]
From
\[ t = \gamma(t' + Vx'), \]
it follows that
\[ dt = \gamma(dt' + V\,dx'), \]
and furthermore that
\[ \frac{dv}{dt} = \frac{1}{\gamma^3(1 + v'V)^3}\frac{dv'}{dt'}. \]
Notice that, as a consequence of this formula, if the acceleration is zero in one inertial
frame, then it is zero in all inertial frames. Hence, acceleration is in a certain sense
absolute.
It follows that
\[ \lim_{t\to\infty} v = \infty, \]
\[ \frac{dv}{dt} = \frac{a_0}{\gamma^3} = \left(1 - v^2\right)^{3/2}a_0. \]
Integrating:
\[ \int_0^v \frac{dv}{(1 - v^2)^{3/2}} = a_0\int_{t_0}^{t} dt \qquad (v_0 = 0 \text{ at } t = t_0), \]
one obtains
\[ \frac{v}{(1 - v^2)^{1/2}} = a_0(t - t_0). \]
Solving for $v$ one finds
\[ v = \frac{dx}{dt} = \frac{a_0(t - t_0)}{\left[1 + a_0^2(t - t_0)^2\right]^{1/2}}. \]
Integrating once more
\[ x - x_0 = \frac{1}{a_0}\left[1 + a_0^2(t - t_0)^2\right]^{1/2} - \frac{1}{a_0}, \]
\[ \frac{(x - x_0 + 1/a_0)^2}{(1/a_0)^2} - \frac{(t - t_0)^2}{(1/a_0)^2} = 1. \tag{2.28} \]
The latter is a hyperbola in the $(x, t)$ plane. For simplicity take $t_0 = 0$ and $x_0 = 1/a_0$
so that (2.28) reduces to
\[ \frac{x^2}{(1/a_0)^2} - \frac{t^2}{(1/a_0)^2} = 1. \]
This formula gives different hyperbolae for different values of $a_0$.
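The formulas for uniformly accelerated motion can be checked against each other numerically. The sketch below works in units with c = 1; the names `velocity` and `position` and the sample value of a0 are illustrative assumptions.

```python
import math

def velocity(t, a0, t0=0.0):
    """v(t) = a0 (t - t0) / sqrt(1 + a0^2 (t - t0)^2), starting from rest."""
    s = a0 * (t - t0)
    return s / math.sqrt(1.0 + s * s)

def position(t, a0, t0=0.0, x0=0.0):
    """x(t) = x0 + (sqrt(1 + a0^2 (t - t0)^2) - 1) / a0."""
    s = a0 * (t - t0)
    return x0 + (math.sqrt(1.0 + s * s) - 1.0) / a0

a0 = 2.0
for t in (0.3, 1.0, 5.0, 100.0):
    x = position(t, a0, x0=1.0 / a0)
    # With t0 = 0 and x0 = 1/a0 the worldline satisfies a0^2 (x^2 - t^2) = 1.
    print(t, velocity(t, a0), a0**2 * (x * x - t * t))
```

The velocity approaches but never reaches 1 (that is, c), and the last column stays equal to 1, so the points lie on the hyperbola.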
2.20 Relativistic dynamics
In Special Relativity Newton’s laws become:
First law. Remains unchanged, except that the straight lines referred to
are now worldlines in Minkowski spacetime.
Second law. One has
\[ \bar{F} = \frac{d\bar{p}}{d\tau}. \]
Third law. On the basis of very precise experiments in Particle Physics, this remains
unchanged. That is, 4-momentum is conserved in collisions:
\[ \sum_i \bar{p}_i = \text{constant}, \]
Example 1
Consider 2 particles with rest masses $m_1$ and $m_2$, both moving collinearly with
speeds $u_1$ and $u_2$. The particles collide and coalesce, with the resulting particle moving
in the same direction. The question is: what are the mass m and the speed u of the
resulting particle?
Recall that p̄ = mγ(1, v) for a particle of 3-velocity v. The initial 4-momenta are:
Squaring
\[ \bar{p}^2 = \bar{p}\cdot\bar{p} = \bar{p}_1^2 + \bar{p}_2^2 + 2\bar{p}_1\cdot\bar{p}_2. \tag{2.30} \]
However,
Substituting in (2.30):
\[ m = \sqrt{m_1^2 + m_2^2 + 2m_1m_2\gamma(u_1)\gamma(u_2)(1 - u_1u_2)}. \tag{2.31} \]
Remark. In the limit of $u_1 \ll c$ and $u_2 \ll c$ one has that $\gamma(u_1), \gamma(u_2) \approx 1$ and that
$(1 - u_1u_2) \approx 1$ so that (2.31) and (2.33) yield
\[ m \approx m_1 + m_2, \]
\[ u \approx \frac{m_1u_1 + m_2u_2}{m_1 + m_2}, \]
which are the classical version of the result.
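The result of Example 1 can be cross-checked by adding the two 4-momenta directly. The sketch below works in units with c = 1. It computes the final speed as the ratio of total momentum to total energy, which is a standard consequence of p̄ = mγ(1, v) and is assumed here to correspond to the expression for u referred to in the text; the function names are illustrative.

```python
import math

def gamma(u):
    """Lorentz factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - u * u)

def coalesce(m1, u1, m2, u2):
    """Mass and speed of the merged particle from conservation of 4-momentum."""
    energy = m1 * gamma(u1) + m2 * gamma(u2)
    momentum = m1 * gamma(u1) * u1 + m2 * gamma(u2) * u2
    m = math.sqrt(energy**2 - momentum**2)   # from p.p = -m^2
    u = momentum / energy
    return m, u

def mass_formula(m1, u1, m2, u2):
    """Equation (2.31)."""
    return math.sqrt(m1**2 + m2**2
                     + 2.0 * m1 * m2 * gamma(u1) * gamma(u2) * (1.0 - u1 * u2))

print(coalesce(1.0, 0.6, 2.0, -0.8), mass_formula(1.0, 0.6, 2.0, -0.8))
print(coalesce(1.0, 1e-3, 2.0, 2e-3))   # close to the classical values
```

For the fast head-on collision the final mass exceeds m1 + m2, since kinetic energy has been converted into rest mass, while at low speeds the classical values are recovered.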
Example 2
Consider the collision (scattering) of a photon of frequency ν moving in the x-direction
by an electron of mass me in a frame in which me is initially at rest. Assume that the
subsequent motion remains in the xy plane.
Before the collision the 4-momenta of the photon and electron are given, respectively,
by
where $\nu'$ is the new photon frequency and $\alpha$, $\beta$ are as given in the figure.
The conservation of 4-momentum gives:
Squaring:
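\[ (\bar{p}_{p1} + \bar{p}_{e1} - \bar{p}_{p2})\cdot(\bar{p}_{p1} + \bar{p}_{e1} - \bar{p}_{p2}) = \bar{p}_{e2}\cdot\bar{p}_{e2}. \tag{2.34} \]
But,
\[ \bar{p}_{e1}^2 = \bar{p}_{e2}^2 = -m_e^2, \qquad \bar{p}_{p1}^2 = \bar{p}_{p2}^2 = 0. \]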
Substituting in (2.34) one obtains
from where
\[ -m_eh\nu + m_eh\nu' = h^2\nu\nu'(\cos\alpha - 1), \]
and
\[ \sin^2\frac{\alpha}{2} = \frac{m_ec^2}{2h}\left(\frac{1}{\nu'} - \frac{1}{\nu}\right). \tag{2.35} \]
This example shows that the photon is deflected (or scattered) by an angle given by
(2.35).
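Equation (2.35) can be turned around to give the scattered frequency for a given deflection angle. The following is a minimal sketch in SI units; the numerical constants are standard values, and the function name `scattered_frequency` and the sample frequency are illustrative choices.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
M_E = 9.1093837015e-31   # electron rest mass, kg

def scattered_frequency(nu, alpha):
    """Frequency nu' after scattering through angle alpha, from (2.35)
    solved for nu': 1/nu' = 1/nu + (2h / (m_e c^2)) sin^2(alpha/2)."""
    return 1.0 / (1.0 / nu + (2.0 * H / (M_E * C**2)) * math.sin(alpha / 2.0) ** 2)

nu = 1.0e20  # a hard X-ray photon, Hz
for alpha in (0.0, math.pi / 2.0, math.pi):
    print(alpha, scattered_frequency(nu, alpha) / nu)
```

The frequency is unchanged for forward scattering and decreases monotonically with the angle, the largest loss occurring for back-scattering.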
Chapter 3
(1) The (equation of) motion of a (spherically symmetric) test particle (one whose
own gravitational field may be neglected) in a gravitational field is independent
of its mass and composition. The first verification of this statement is claimed
to be Galileo’s Pisa bell tower experiment —although this particular experiment
probably never took place. More recent experiments like the one by Roll, Krotkov
and Dicke (1964) have established the equality to 1 part in $10^{11}$.
(2) Matter (as well as every form of energy) is acted on by (and is itself a source of)
a gravitational field. In other words, gravity couples to everything.
and Coriolis forces) which arise when non-inertial frames of reference are employed. The
important point about these forces is that like gravity, they are proportional to the mass
of the particle. This led Einstein to suspect that these and the gravitational forces should
enter the theory in the same way.
To get a better feeling for this, recall that the only way one can eliminate the force of
gravity is by choosing a freely falling frame —i.e. a comoving frame with the freely falling
particle. This can be visualised in the thought experiment (Gedankenexperiment)
sometimes referred to as the lift experiment.
The experiment suggests that there are no local experiments which distinguish
non-rotating free fall in a gravitational field from uniform motion in a space free from
gravitational fields. By local, here it is understood that the experiment is performed in a small
region such that the variation of the gravitational field is negligible (observationally).
This is another way of expressing the Equivalence Principle (all particles fall in the same
way). In this sense, Special Relativity is regained locally, in the sense that the laws of
Physics in a freely falling frame are compatible with Special Relativity. Alternatively,
one can say that spacetime is locally Minkowskian. Furthermore, for a global theory in
the presence of gravitation (i.e. GR), the geometry of spacetime must be such that it
is locally Minkowskian. The natural tool to express and implement these ideas is the
so-called tensor calculus.
3.3 Summary
In the presence of gravitational fields there exist, in small regions (locally), preferred inertial
frames (i.e. the non-rotating free falling frames) in which the special relativistic results
hold. On a large scale, on the other hand, there are no such preferred frames, and hence
one needs to treat all large scale reference frames on the same footing. This suggests
that the laws of nature should be formulated in such a way that they are invariant under
arbitrary transformations of coordinates (i.e. reference frames), and not just the Lorentz
transformations as was the case in Special Relativity.
Interpreted physically, this is called the General Principle of Relativity as opposed to
the Special Principle of Relativity according to which laws of nature have the same form
in inertial frames.
Interpreted mathematically, it is called the principle of General Covariance —the
equations of Physics should have tensorial form.
Chapter 4
In describing spacetime we wish our equations to be valid for any coordinates. Tensorial
equations satisfy this property —hence their significance.
Curves
A curve is defined as the set of points given by
\[ x^i = f^i(u), \quad i = 0, 1, \ldots, n-1, \]
with u a parameter.
Subspaces
\[ x^i = f^i(u^1, u^2, \ldots, u^m), \quad i = 0, 1, \ldots, n-1, \]
The points in the space not satisfying this equation fall into 2 classes —that for F > 0
and that for F < 0.
\[ x^a = x^a(x'^b), \tag{4.2} \]
where $x^a$ and $x'^a$ refer to the coordinates of a point p relative to coordinate systems F and
F' which are no longer assumed to be inertial. We shall also assume that the functions $x^a$
and $x'^a$ are differentiable.
Example
We may describe the plane $\mathbb{R}^2$ by Cartesian coordinates $(x^i) = (x, y)$ or polar coordinates
$(x'^a) = (r, \theta)$. We then have
Remark. Contravariant tensors of rank k are also called tensors of type (k, 0), e.g. a
contravariant vector $V^a$ is referred to as a tensor of type (1, 0). An important special
case is a tensor of rank 0 (type (0, 0)), also called a scalar or an invariant:
\[ \phi' = \phi \quad \text{at } p. \]
More precisely, a covariant vector is defined as a set of n quantities $Y_b$ which transform
according to:
\[ Y'_a = \frac{\partial x^b}{\partial x'^a}\, Y_b. \tag{4.6} \]
Similarly, a covariant tensor of rank 2 (or type (0, 2)) can be defined by:
\[ Y'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}\, Y_{cd}. \]
More generally, a covariant tensor of rank k (or type (0, k)) is defined as:
Mixed tensors
One can also define geometric objects called mixed tensors. For example, the mixed
tensor of rank 3 with 1 contravariant and 2 covariant indices (of type (1, 2)) satisfies
\[ Z^{a_1 \cdots a_p}{}_{b_1 \cdots b_q}. \]
An example
If a contravariant vector and a covariant vector have, respectively, components $A^i =
(A^1, A^2)$ and $A_i = (A_1, A_2)$ in Cartesian coordinates, find the components in polar
coordinates. In this example one has
Also
\[ x = r\cos\theta, \quad y = r\sin\theta, \]
\[ r = (x^2 + y^2)^{1/2}, \quad \theta = \arctan(y/x). \]
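A lengthy but straightforward calculation gives:
\[ \frac{\partial x^1}{\partial x'^1} = \frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial x^2}{\partial x'^1} = \frac{\partial y}{\partial r} = \sin\theta, \]
\[ \frac{\partial x^1}{\partial x'^2} = \frac{\partial x}{\partial \theta} = -r\sin\theta, \quad \frac{\partial x^2}{\partial x'^2} = \frac{\partial y}{\partial \theta} = r\cos\theta, \]
and
\[ \frac{\partial x'^1}{\partial x^1} = \frac{\partial r}{\partial x} = \frac{x}{r}, \quad \frac{\partial x'^1}{\partial x^2} = \frac{\partial r}{\partial y} = \frac{y}{r}, \]
\[ \frac{\partial x'^2}{\partial x^1} = \frac{\partial \theta}{\partial x} = -\frac{y}{r^2}, \quad \frac{\partial x'^2}{\partial x^2} = \frac{\partial \theta}{\partial y} = \frac{x}{r^2}. \]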
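The partial derivatives of the polar example can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names are hypothetical, and the only inputs are the two Jacobian matrices quoted in the example together with the contravariant and covariant transformation laws. It verifies that the two Jacobians are mutually inverse and that the contraction of a covariant with a contravariant vector is unchanged by the change of coordinates.

```python
import math

def jac_cartesian_wrt_polar(r, theta):
    """Entries dx^a/dx'^b with x = (x, y) and x' = (r, theta); row a, column b."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def jac_polar_wrt_cartesian(x, y):
    """Entries dx'^a/dx^b with x' = (r, theta) and x = (x, y); row a, column b."""
    r = math.hypot(x, y)
    return [[x / r, y / r],
            [-y / r ** 2, x / r ** 2]]

def matmul(m, n):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def contravariant_to_polar(A_up, x, y):
    """A'^i = (dx'^i/dx^j) A^j."""
    J = jac_polar_wrt_cartesian(x, y)
    return [sum(J[i][j] * A_up[j] for j in range(2)) for i in range(2)]

def covariant_to_polar(A_down, r, theta):
    """A'_i = (dx^j/dx'^i) A_j."""
    J = jac_cartesian_wrt_polar(r, theta)
    return [sum(J[j][i] * A_down[j] for j in range(2)) for i in range(2)]

# Illustrative point and components (arbitrary values chosen for the check).
r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)
A_up, B_down = [1.5, -0.5], [0.3, 2.0]

identity = matmul(jac_polar_wrt_cartesian(x, y), jac_cartesian_wrt_polar(r, theta))
scalar_cartesian = sum(a * b for a, b in zip(A_up, B_down))
scalar_polar = sum(a * b for a, b in zip(contravariant_to_polar(A_up, x, y),
                                         covariant_to_polar(B_down, r, theta)))
```

The invariance of the contraction is the reason the two transformation laws involve mutually inverse Jacobians.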
\[ V_{ab} = W_{ab}. \]
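The transformation to primed coordinates is given by
\[ V'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}\, V_{cd}, \]
\[ W'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}\, W_{cd}, \]
so that
\[ V'_{ab} = W'_{ab}. \]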
Now, to prove that $\delta^i{}_j$ is a tensor of type (1, 1) we note that if it were one it should transform
as:
\[ \delta'^i{}_j = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^j}\, \delta^a{}_b. \]
Now, substituting in the right hand side for $\delta^a{}_b$
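\[ V^a{}_b \equiv c\,W^a{}_b + d\,Z^a{}_b, \]
\[ V'^a{}_b = c\,W'^a{}_b + d\,Z'^a{}_b
 = c\,\frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\, W^e{}_f + d\,\frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\, Z^e{}_f
 = \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\, V^e{}_f, \]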
4.5.2 Direct product
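The product of 2 tensors of type $(p_1, q_1)$ and $(p_2, q_2)$ is a tensor of type $(p_1 + p_2, q_1 + q_2)$
provided none of the indices are the same. As an example, if $V^a{}_b$ and $W^c$ are tensors of
type (1, 1) and (1, 0) respectively, show that
\[ Z^a{}_b{}^c \equiv V^a{}_b W^c, \]
\[ Z'^a{}_b{}^c = V'^a{}_b W'^c
 = \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\, V^e{}_f\, \frac{\partial x'^c}{\partial x^h}\, W^h
 = \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\frac{\partial x'^c}{\partial x^h}\, Z^e{}_f{}^h. \]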
4.5.3 Contraction
Setting an upper and a lower index equal and summing over its values results in a
new tensor with the two indices absent. That is, one passes from a tensor of type (p, q)
to one of type (p − 1, q − 1). For example if $Z^a{}_b{}^{cd}$ is a tensor of type (3, 1), show that
$Z^{ac} \equiv Z^a{}_b{}^{cb}$ is a tensor of type (2, 0). To see this write
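\[ A_i B^i = A'_i B'^i = A'_i\, \frac{\partial x'^i}{\partial x^j}\, B^j \]
so that
\[ \left( A_j - A'_i\, \frac{\partial x'^i}{\partial x^j} \right) B^j = 0, \]
and since $B^j$ is arbitrary this implies
\[ A_j = \frac{\partial x'^i}{\partial x^j}\, A'_i, \]
so that $A_j$ is indeed covariant.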
4.5.5 Symmetric and antisymmetric tensors
A tensor $A_{ij}$ is said to be symmetric if
\[ A_{ij} = A_{ji}, \]
For a tensor of higher rank one says that it is symmetric (or skew) with respect to a
pair of indices if interchanging the indices does not change the components (changes the
sign). The indices involved must be both “upstairs” or “downstairs”. For example
\[ R_{[ab][cd]} \]
implies
\[ R_{abcd} = -R_{bacd}, \]
\[ R_{abcd} = -R_{abdc}, \]
\[ R_{abcd} = R_{badc}. \]
\[ V'^b = \frac{\partial x'^b}{\partial x^a}\, V^a. \]
Differentiating with respect to $x'^c$ one obtains
\[ \nabla_c f = \partial_c f, \]
\[ \nabla_c (A_b + B_b) = \nabla_c A_b + \nabla_c B_b, \quad \text{(linearity)} \]
\[ \nabla_c (A_a B_b) = (\nabla_c A_a) B_b + A_a (\nabla_c B_b), \quad \text{(Leibniz rule)}. \]
The simplest modification of $\partial_c$ that satisfies the above requirements is the following:
\[ \nabla_c V^a \equiv \partial_c V^a + \Gamma^a{}_{bc} V^b, \tag{4.7} \]
where the quantity $\Gamma^a{}_{bc}$, which has $N^3$ components, is called the connection or sometimes
the affine connection. Note that its particular form has not yet been identified.
Notation. Very often we shall write equation (4.7) and similar expressions as
\[ V^a{}_{;c} = V^a{}_{,c} + \Gamma^a{}_{bc} V^b, \]
\[ V^a{}_{;c} \equiv \nabla_c V^a, \quad \partial_c V^a \equiv V^a{}_{,c}. \]
Also notice that the differentiation index c comes last in the connection $\Gamma^a{}_{bc}$.
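\[ V'^b = \frac{\partial x'^b}{\partial x^a}\, V^a, \]
it follows that
\[ V'^b{}_{,c} = \frac{\partial^2 x'^b}{\partial x^d \partial x^a}\frac{\partial x^d}{\partial x'^c}\, V^a + \frac{\partial V^a}{\partial x^d}\frac{\partial x^d}{\partial x'^c}\frac{\partial x'^b}{\partial x^a}. \]
Now, from the definition (4.7) one has
\[ V'^a{}_{;c} = V'^a{}_{,c} + \Gamma'^a{}_{bc} V'^b, \]
so that
\[ V'^a{}_{;c} = \frac{\partial V^b}{\partial x^d}\frac{\partial x^d}{\partial x'^c}\frac{\partial x'^a}{\partial x^b} + \frac{\partial^2 x'^a}{\partial x^d \partial x^b}\frac{\partial x^d}{\partial x'^c}\, V^b + \Gamma'^a{}_{bc}\, \frac{\partial x'^b}{\partial x^f}\, V^f. \tag{4.8} \]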
Now, to ensure that $V'^a{}_{;c}$ transforms as a tensor of type (1, 1) one requires
\[ V_{a;b} = V_{a,b} - \Gamma^c{}_{ab} V_c. \]
\[ W^b \nabla_b V^a = W^b V^a{}_{;b} = 0. \]
Now, recall that one way of characterising straight lines in Euclidean space is as curves
whose tangent vectors are parallely transported at every point, i.e. they are autoparallels.
The notion of shortest distance is not appropriate in this context as we have not yet
defined a distance on the manifold; this will be seen in the sequel.
The notion defined above can be used to define the analogue of straight lines in more
general manifolds. Such curves are referred to as affine geodesics, i.e. curves along
which the tangent vector is propagated parallely to itself.
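Letting $W^b$ be tangent to a geodesic, one has that
\[ W^b \nabla_b W^a = W^b W^a{}_{;b} = 0, \]
from where
\[ W^b W^a{}_{,b} + \Gamma^a{}_{cb} W^c W^b = 0. \]
If the curve is parametrised by λ, then
\[ W^b = \frac{dx^b}{d\lambda}, \]
and since
\[ W^b \frac{\partial}{\partial x^b} = \frac{dx^b}{d\lambda}\frac{\partial}{\partial x^b} \equiv \frac{d}{d\lambda}, \]
so that
\[ \frac{d}{d\lambda}\left( \frac{dx^a}{d\lambda} \right) + \Gamma^a{}_{bc}\, \frac{dx^c}{d\lambda}\frac{dx^b}{d\lambda} = 0, \]
and finally that
\[ \frac{d^2 x^a}{d\lambda^2} + \Gamma^a{}_{bc}\, \frac{dx^c}{d\lambda}\frac{dx^b}{d\lambda} = 0. \tag{4.9} \]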
Note. From the existence and uniqueness theorems for ordinary differential equations,
it follows that corresponding to every direction at a point, there exists a unique geodesic
passing through the point. The initial conditions are
\[ \lambda = \lambda_0, \quad x^a_0 = x^a(\lambda_0), \quad W^a_0 = \frac{dx^a}{d\lambda}(\lambda_0). \]
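Example. Show that changing the geodesic parameter λ to σ in such a way that σ =
σ(λ), the geodesic equation only keeps its form (4.9) in σ if σ = aλ + b.
To see this recall that
\[ \frac{dx^a}{d\lambda} = \frac{dx^a}{d\sigma}\frac{d\sigma}{d\lambda}, \]
so that
\[ \frac{d^2 x^a}{d\lambda^2} = \frac{d^2 x^a}{d\sigma^2}\left( \frac{d\sigma}{d\lambda} \right)^2 + \frac{dx^a}{d\sigma}\frac{d^2\sigma}{d\lambda^2}. \]
Substituting into equation (4.9) one gets
\[ \left( \frac{d^2 x^a}{d\sigma^2} + \Gamma^a{}_{bc}\, \frac{dx^c}{d\sigma}\frac{dx^b}{d\sigma} \right)\left( \frac{d\sigma}{d\lambda} \right)^2 + \frac{dx^a}{d\sigma}\frac{d^2\sigma}{d\lambda^2} = 0, \]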
(ii) the need of an alternative notion of parallelism based on the idea of length;
(iii) finding a relation between covariant and contravariant tensors.
In order to accomplish these points we introduce the notion of a metric. This essentially
amounts to defining the distance between two neighbouring points $x^a$ and $x^a + dx^a$
through an expression of the form
\[ ds^2 = g_{ab}(x)\, dx^a dx^b \tag{4.10} \]
where $ds^2$ is called the line element or interval and $g_{ab}$ is the metric tensor. A manifold
with such a metric defined on it is called a manifold with metric.
Remark 1. The tensor $g_{ab}$ is a tensor of type (0, 2). This follows immediately from the
scalar nature of $ds^2$ and the fact that $dx^a$ is a contravariant vector, by the tensor
detection theorem. To see this recall that
\[ dx^a = \frac{\partial x^a}{\partial x'^c}\, dx'^c, \]
so that
\[ ds^2 = g_{ab}\, dx^a dx^b = g_{ab}\, \frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\, dx'^c dx'^d = g'_{cd}\, dx'^c dx'^d, \]
where
\[ g'_{cd} = \frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\, g_{ab}. \]
The latter is precisely the transformation law for a tensor of type (0, 2).
Remark 2. In order for (4.10) to determine $g_{ab}$ uniquely, $g_{ab}$ must be symmetric. Note
that if $g_{ab}$ is symmetric, then it can always be diagonalised. Let $\lambda_i$, $i = 0, \ldots, N-1$, denote
the eigenvalues of $g_{ab}$. If all the eigenvalues of $g_{ab}$ are positive, then the metric $g_{ab}$ will
be said to be a Riemannian metric and the manifold will be said to be a Riemannian
manifold. If one of the eigenvalues is negative and the remaining ones are positive, then the metric
will be said to be Lorentzian; this is the case of relevance in Relativity. The number of
positive eigenvalues minus the number of negative eigenvalues is called the signature. For
example, the Minkowski metric of Special Relativity (see below) has signature 2, while
the standard Euclidean metric in $\mathbb{R}^4$ has signature 4.
Remark 3. Euclidean space with
\[ ds^2 = dx^2 + dy^2 + dz^2 + dw^2 \]
is an example of a Riemannian manifold. On the other hand, Minkowski space with
\[ ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 \]
is a special case of a Lorentzian manifold. Note that in both examples, the coefficients
$g_{ab}$ are constants. We also note that the Minkowski metric can be written in spherical
coordinates as:
\[ ds^2 = -dt^2 + dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2. \]
The definition (4.10) allows the following natural definitions of notions one had in
Minkowski space.
Norm of a contravariant vector $V^a$
This is defined via
\[ |V|^2 \equiv g_{ab} V^a V^b. \]
If $|V|^2 > 0$ (or $|V|^2 < 0$) for all non-zero vectors $V^a$, the metric is called positive definite (or
negative definite); the positive definite case is the Riemannian one. Otherwise it is called indefinite; this
includes the Lorentzian case.
Null vectors
For indefinite metrics there are vectors that are orthogonal to themselves. That is,
\[ g_{ab} A^a A^b = 0. \]
\[ g_{ab}\, g^{ac} = \delta_b{}^c. \]
For example
\[ g_{ac} W^{ab} = W_c{}^b, \]
\[ T^{ab} = g^{ac}\, T_c{}^b = g^{ac} g^{bd}\, T_{cd}. \]
Quite crucially, one can see that the operation of lowering and raising indices does not
add extra information to the tensors. For example, given
\[ V^b = g^{ba} V_a, \]
Connection between contravariant and covariant vectors
So far, covariant and contravariant tensors have remained unrelated. The metric can be
taken as the mapping between contravariant and covariant tensors:
\[ V_a = g_{ab} V^b, \quad V^a = g^{ab} V_b. \]
This is the reason why in this case there is no distinction between covariant and
contravariant tensors.
Remark 1. Raising and lowering of indices enables us to write equations with indices in
any position. It is for that reason that one writes a blank space above each lower index
and below each upper index. For example the contravariant version of
\[ G_{ab} = T_{ab} \]
is
\[ G^{ab} = T^{ab}. \]
Theorem 2. In the diagonal form of gab the number of components equal to +1 and to
−1 do not change under coordinate transformations. The difference between these two is
called the signature of the metric.
No torsion condition
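Let φ be a scalar. The usual partial derivatives acting on a scalar commute. That is,
\[ \partial_a \partial_b \phi = \partial_b \partial_a \phi. \]
This is, in general, not the case for covariant derivatives. Recall that
\[ \nabla_b \phi = \partial_b \phi, \]
so that
\[ \nabla_a \nabla_b \phi = \partial_a \partial_b \phi - \Gamma^e{}_{ba}\, \partial_e \phi, \]
\[ \nabla_b \nabla_a \phi = \partial_b \partial_a \phi - \Gamma^e{}_{ab}\, \partial_e \phi. \]
\[ \Gamma^c{}_{ab} = \Gamma^c{}_{ba} = \Gamma^c{}_{(ab)}. \]
This property has a nice geometric interpretation, namely, that the parallelogram formed
by the parallel propagation of two infinitesimal displacements closes.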
\[ T^a \nabla_a (g_{bc} V^b W^c) = 0, \]
\[ T^a \nabla_a V^b = 0, \quad T^a \nabla_a W^b = 0. \]
\[ T^a V^b W^c\, \nabla_a g_{bc} = 0. \]
This equation holds for all curves and all parallely transported vectors if and only if
\[ \nabla_a g_{cd} = g_{cd;a} = 0. \]
Theorem 3. Let $g_{ab}$ be a metric. Then there exists a unique connection such that
\[ \nabla_a g_{bc} = 0. \]
so that
\[ \Gamma_{cab} + \Gamma_{bac} = \partial_a g_{bc}, \]
where $\Gamma_{cab} \equiv g_{dc}\, \Gamma^d{}_{ab}$. By index substitution one also has that
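Adding the first two equations, subtracting the third and using the symmetry $\Gamma^c{}_{ab} = \Gamma^c{}_{ba}$
one finds
\[ 2\Gamma_{cab} = \partial_a g_{bc} + \partial_b g_{ac} - \partial_c g_{ab}. \]
That is,
\[ \Gamma^c{}_{ab} = \tfrac{1}{2}\, g^{cd} \left( \partial_a g_{bd} + \partial_b g_{ad} - \partial_d g_{ab} \right). \]
This is called the Levi-Civita connection of the metric $g_{ab}$.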
\[ L = L(x, \dot{x}, \lambda), \quad x = x(\lambda), \quad \dot{x} = \frac{dx}{d\lambda}. \]
That is, L is a function of functions of λ; L is called a functional. It is assumed that L
is differentiable in x, ẋ, λ.
We are looking for the necessary conditions on the function x such that the integral
\[ \int_{x_1}^{x_2} L(x, \dot{x}, \lambda)\, d\lambda \]
To deduce the geodesic equation we want to consider the length of the curve defined
by $\int ds$ to be stationary. Introducing a parameter λ along the curve such that
\[ \int ds = \int \frac{ds}{d\lambda}\, d\lambda, \]
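Alternatively, one can find extremals of
\[ L = \left( \frac{ds}{d\lambda} \right)^2 = g_{jk}\, \dot{x}^j \dot{x}^k. \]
A computation renders
\[ \frac{\partial L}{\partial \dot{x}^c} = g_{ab}\, \frac{\partial \dot{x}^a}{\partial \dot{x}^c}\, \dot{x}^b + g_{ab}\, \dot{x}^a\, \frac{\partial \dot{x}^b}{\partial \dot{x}^c}
 = g_{ab}\, \delta^a{}_c\, \dot{x}^b + g_{ab}\, \dot{x}^a\, \delta^b{}_c = 2 g_{ac}\, \dot{x}^a, \]
\[ \frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}^c} \right) = 2\, \frac{dg_{ac}}{d\lambda}\, \dot{x}^a + 2 g_{ac}\, \frac{d\dot{x}^a}{d\lambda}. \]
Finally,
\[ \frac{\partial L}{\partial x^c} = \partial_c g_{ab}\, \dot{x}^a \dot{x}^b. \]
Thus, using $dg_{ac}/d\lambda = \partial_b g_{ac}\, \dot{x}^b$ and symmetrising in a and b, one has that
\[ 0 = \frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}^c} \right) - \frac{\partial L}{\partial x^c} = 2 g_{ac}\, \ddot{x}^a + (\partial_b g_{ac} + \partial_a g_{bc} - \partial_c g_{ab})\, \dot{x}^a \dot{x}^b. \]
\[ \frac{d^2 x^l}{ds^2} = 0, \]
which is the usual equation for straight motion.
Remark 2. As it stands, the above equation only makes sense for spacelike curves for
which $ds^2 > 0$. For timelike curves one uses $\int d\tau$ instead. Also, starting with $ds^2$ gives
the same geodesic equation.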
Remark 3. For null geodesics, i.e. geodesics for which ds = 0, the curve may be
parametrised by a parameter u
\[ \frac{d^2 x^l}{du^2} + \Gamma^l{}_{jk}\, \frac{dx^j}{du}\frac{dx^k}{du} = 0, \]
where
\[ g_{jk}\, \frac{dx^j}{du}\frac{dx^k}{du} = 0. \]
Remark 4. It can be proved that if $g_{ab}$ is Riemannian then the solutions to equation
(4.13) are curves of minimum length. On the other hand, if $g_{ab}$ is Lorentzian, then the
geodesics maximise length. Now, recall that in Special Relativity one defines the proper
time as $d\tau^2 = -ds^2/c^2$. Thus, time observed by a comoving clock always goes slower.
The problem with this approach is that one needs to calculate all the components, one by
one.
4.11.2 Computation using the Euler-Lagrange equations
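This is usually a more useful way as it gives directly the non-zero Christoffel symbols.
Let
\[ L = \left( \frac{ds}{d\lambda} \right)^2 = \dot{u}^2 + \cos^2 u\, \dot{v}^2. \]
The Euler-Lagrange equations are given by
\[ \frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}^i} \right) - \frac{\partial L}{\partial x^i} = 0. \]
\[ \frac{d^2 x^1}{ds^2} + \Gamma^1{}_{jk}\, \frac{dx^j}{ds}\frac{dx^k}{ds} = 0, \]
or
\[ \frac{d^2 x^1}{ds^2} + \Gamma^1{}_{11}\, \frac{dx^1}{ds}\frac{dx^1}{ds} + \Gamma^1{}_{12}\, \frac{dx^1}{ds}\frac{dx^2}{ds} + \Gamma^1{}_{21}\, \frac{dx^2}{ds}\frac{dx^1}{ds} + \Gamma^1{}_{22}\, \frac{dx^2}{ds}\frac{dx^2}{ds} = 0. \]
However, in our case one only has $\dot{v}^2$ terms so the latter becomes
\[ \frac{d^2 x^1}{ds^2} + \Gamma^1{}_{22} \left( \frac{dx^2}{ds} \right)^2 = 0. \]
\[ \Gamma^1{}_{22} = \sin u \cos u, \quad \Gamma^1{}_{11} = \Gamma^1{}_{12} = \Gamma^1{}_{21} = 0. \]
\[ \frac{d^2 x^2}{ds^2} + \Gamma^2{}_{12}\, \frac{dx^1}{ds}\frac{dx^2}{ds} + \Gamma^2{}_{21}\, \frac{dx^2}{ds}\frac{dx^1}{ds} = 0. \]
However,
\[ \Gamma^2{}_{12} = \Gamma^2{}_{21}, \]
and hence
\[ \Gamma^2{}_{12} = \Gamma^2{}_{21} = -\tan u. \]
Finally,
\[ \Gamma^2{}_{22} = \Gamma^2{}_{11} = 0. \]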
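The Christoffel symbols quoted for the metric $ds^2 = du^2 + \cos^2 u\, dv^2$ can be cross-checked against the Levi-Civita formula. The Python sketch below is an illustration rather than part of the notes: the function names are hypothetical, the derivatives of the metric are approximated by central finite differences with an assumed step h, and the matrix inverse is written out for the 2-dimensional case only.

```python
import math

def sphere_metric(x):
    """Components g_ab of ds^2 = du^2 + cos^2(u) dv^2 at x = (u, v)."""
    u, _v = x
    return [[1.0, 0.0], [0.0, math.cos(u) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix (sufficient for this 2-dimensional example)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, x, h=1e-5):
    """Return gamma[c][a][b] = (1/2) g^{cd} (d_a g_bd + d_b g_ad - d_d g_ab).

    Partial derivatives are central finite differences, so the result is
    approximate (the step h is an assumption of this sketch).
    """
    n = len(x)
    dg = []  # dg[k][a][b] approximates the derivative of g_ab along x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    ginv = inverse_2x2(metric(x))
    return [[[0.5 * sum(ginv[c][d] * (dg[a][b][d] + dg[b][a][d] - dg[d][a][b])
                        for d in range(n))
              for b in range(n)]
             for a in range(n)]
            for c in range(n)]

u0 = 0.3
gamma = christoffel(sphere_metric, [u0, 1.0])
# Index 0 stands for x^1 = u and index 1 for x^2 = v.
gamma_1_22 = gamma[0][1][1]  # compare with sin(u) cos(u)
gamma_2_12 = gamma[1][0][1]  # compare with -tan(u)
```

Evaluating every component this way is the "one by one" route; the Euler-Lagrange method in the text reads off only the non-zero symbols.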
Chapter 5
Curvature
A novel feature of General Relativity is that it employs the notion of curved space. Our
intuition of curvature is mainly based on the curvature of 2-dimensional objects in
3-dimensional space, like spheres, saddles, etc. The notion of curvature whose definition
depends on a space of higher dimension is called extrinsic. In the case of spacetime this
notion is not useful and one requires an intrinsic notion, i.e. a definition which is independent
of the embedding space.
where it has been assumed that g12 = 0 for simplicity, it is possible to define an intrinsic
curvature (a scalar function) which is invariant under coordinate transformations, but
varies from point to point. This is given by an expression of the form
such that when K = 0 the space is flat and for a sphere of radius R it gives K = 1/R2 .
In spaces of higher dimension we require more than one quantity at each point to
describe curvature. It turns out that the right definition involves the components of a
4-index tensor called the Riemann curvature tensor :
\[ R^a{}_{bcd} \equiv \partial_c \Gamma^a{}_{bd} - \partial_d \Gamma^a{}_{bc} + \Gamma^a{}_{ec} \Gamma^e{}_{bd} - \Gamma^a{}_{ed} \Gamma^e{}_{bc}. \tag{5.1} \]
Since the Christoffel symbols contain derivatives of the metric, one finds that the Riemann
tensor has the same form as K. Note that in flat space given by Cartesian coordinates
the Christoffel symbols vanish, and thus the Riemann tensor vanishes! If one shows
that $R^a{}_{bcd}$ is indeed a tensor, then this last statement is valid for any coordinates! This
statement is actually an if and only if statement. The hard part is to show that vanishing
curvature implies Minkowski space.
There are many ways of motivating this formula. Here we will proceed by looking at
the commutation of covariant derivatives. Consider:
\[ \nabla_c \nabla_b V_a - \nabla_b \nabla_c V_a = V_{a;b;c} - V_{a;c;b}. \]
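Now recall that
\[ V_{a;b} = V_{a,b} - \Gamma^d{}_{ab} V_d, \]
so that
\[ V_{a;b;c} = (V_{a;b})_{,c} - \Gamma^f{}_{ac} V_{f;b} - \Gamma^f{}_{bc} V_{a;f} \]
\[ = \left( V_{a,b} - \Gamma^d{}_{ab} V_d \right)_{,c} - \Gamma^f{}_{ac} \left( V_{f,b} - \Gamma^d{}_{fb} V_d \right) - \Gamma^f{}_{bc} \left( V_{a,f} - \Gamma^d{}_{af} V_d \right) \]
\[ = V_{a,b,c} - \partial_c \Gamma^d{}_{ab} V_d - \Gamma^d{}_{ab} V_{d,c} - \Gamma^f{}_{ac} V_{f,b} + \Gamma^f{}_{ac} \Gamma^d{}_{fb} V_d - \Gamma^f{}_{bc} V_{a,f} + \Gamma^f{}_{bc} \Gamma^d{}_{af} V_d. \]
Interchanging b and c in the last expression:
\[ V_{a;c;b} = V_{a,c,b} - \partial_b \Gamma^d{}_{ac} V_d - \Gamma^d{}_{ac} V_{d,b} - \Gamma^f{}_{ab} V_{f,c} + \Gamma^f{}_{ab} \Gamma^d{}_{fc} V_d - \Gamma^f{}_{cb} V_{a,f} + \Gamma^f{}_{cb} \Gamma^d{}_{af} V_d. \]
Thus,
\[ V_{a;b;c} - V_{a;c;b} = (V_{a,b,c} - V_{a,c,b}) \]
\[ \quad + \Gamma^d{}_{ac} V_{d,b} - \Gamma^f{}_{ac} V_{f,b} + \Gamma^f{}_{ab} V_{f,c} - \Gamma^d{}_{ab} V_{d,c} \]
\[ \quad + \Gamma^f{}_{cb} V_{a,f} - \Gamma^f{}_{bc} V_{a,f} + \Gamma^f{}_{bc} \Gamma^d{}_{af} V_d - \Gamma^f{}_{cb} \Gamma^d{}_{af} V_d \]
\[ \quad + \partial_b \Gamma^d{}_{ac} V_d - \partial_c \Gamma^d{}_{ab} V_d + \Gamma^f{}_{ac} \Gamma^d{}_{fb} V_d - \Gamma^f{}_{ab} \Gamma^d{}_{fc} V_d. \]
The first term on the right hand side cancels out as the usual partial derivatives commute. The
second and third pairs cancel out directly, while in the fourth and fifth we use the symmetry
of the Christoffel symbols. Thus, one is left with
\[ V_{a;b;c} - V_{a;c;b} = V_d \left( \partial_b \Gamma^d{}_{ac} - \partial_c \Gamma^d{}_{ab} + \Gamma^f{}_{ac} \Gamma^d{}_{fb} - \Gamma^f{}_{ab} \Gamma^d{}_{fc} \right) = V_d R^d{}_{abc}, \]
as can be seen by comparison with equation (5.1). This expression is sometimes called
the Ricci identity. Defined through this expression, it follows that $R^d{}_{abc}$ is indeed a
tensor as the expression on the left hand side is a tensor. Alternatively, one could look
at the transformation rules of the Christoffel symbols, but this is much more involved!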
Geometric interpretation
It can be shown that the change of a vector $V^c$ parallely transported along a closed
path is proportional to the curvature (see figure). For an infinitesimal loop along the
directions given by $u^b$ and $w^d$ one has that
\[ \delta V^a = R^a{}_{cbd}\, V^c\, \delta u^b\, \delta w^d. \]
Recall that, as seen before, such a parallelogram closes (due to the no torsion condition)!
5.2 Symmetries of the curvature tensor
In general, a tensor of rank 4 has $4^4 = 256$ components (in spacetime). Symmetries,
if present, are important because they reduce the number of independent components.
Lowering the index in the definition of the Riemann tensor one obtains
\[ R_{abcd} = \partial_c \Gamma_{abd} - \partial_d \Gamma_{abc} + \Gamma_{aec} \Gamma^e{}_{bd} - \Gamma_{aed} \Gamma^e{}_{bc}, \]
where
\[ R_{abcd} = g_{af} R^f{}_{bcd}, \quad \Gamma_{abd} = g_{af} \Gamma^f{}_{bd}. \]
Now, since $R_{abcd}$ is a tensor, it should have the same symmetries in all frames.
Accordingly, choose a locally inertial frame for which the Christoffel symbols vanish. For these
coordinates one has then that
\[ R_{abcd} = \partial_c \Gamma_{abd} - \partial_d \Gamma_{abc}. \]
Recalling that
\[ \Gamma_{abc} = \tfrac{1}{2} (g_{ab,c} + g_{ac,b} - g_{bc,a}) \]
one obtains
\[ R_{abcd} = \tfrac{1}{2} (g_{ad,bc} + g_{bc,ad} - g_{bd,ac} - g_{ac,bd}), \]
from where it is easy to read the symmetries of the tensor. It can be checked that
\[ R_{abcd} = -R_{bacd}, \quad R_{abcd} = -R_{abdc}, \quad R_{abcd} = R_{cdab}. \]
Furthermore,
\[ R_{abcd} + R_{adbc} + R_{acdb} = 0 \quad \text{so that} \quad R_{a[bcd]} = 0. \]
These symmetries amount to 236 constraints, so $R_{abcd}$ has only 20 independent components.
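The count of 20 independent components can be reproduced from the standard closed formula $n^2(n^2-1)/12$ for a tensor with the symmetries listed above in n dimensions. This formula is not derived in these notes and is quoted here as an assumption; the short Python sketch below (with a hypothetical function name) simply evaluates it.

```python
def riemann_component_counts(n):
    """Return (total, independent, constraints) for the Riemann tensor in n dimensions.

    Uses the standard count n^2 (n^2 - 1) / 12 of independent components,
    which is assumed here rather than derived.
    """
    total = n ** 4
    independent = n * n * (n * n - 1) // 12
    return total, independent, total - independent

total_4d, independent_4d, constraints_4d = riemann_component_counts(4)
independent_2d = riemann_component_counts(2)[1]
```

In 2 dimensions the formula gives a single independent component, consistent with curvature being described there by one scalar function K.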
5.4 Bianchi identities, the Ricci and Einstein tensors
Recall that in a locally inertial frame one had that
Using the fact that partial derivatives commute one finds that
Now, in a locally inertial frame the Christoffel symbols vanish so that in fact one has
that
\[ R_{abcd;e} + R_{abec;d} + R_{abde;c} = 0, \quad R_{ab[cd;e]} = 0. \]
This tensorial equation is valid in all frames and is called the Bianchi identity. One could
have derived it by directly taking the covariant derivative of the Riemann tensor.
Remark 1. Because of the symmetries of the Riemann tensor one has that the Ricci
tensor is symmetric. That is,
\[ R_{bd} = R_{db}. \]
Remark 2. Other contractions of the Riemann tensor vanish or give $\pm R_{bd}$. For example
$R^b{}_{bcd} = 0$ since $R_{abcd}$ is antisymmetric in a and b. Also,
and similarly.
\[ R \equiv g^{ab} R_{ab} = g^{ac} g^{bd} R_{abcd}. \]
Now,
\[ g^{ac} R_{abcd} = R_{bd}, \quad g^{ac} R_{abec} = -g^{ac} R_{abce}, \]
so that (5.2) renders
\[ R_{bd;e} - R_{be;d} + R^c{}_{bde;c} = 0. \]
Contracting on b and d:
\[ R_{;e} - R^d{}_{e;d} - R^c{}_{e;c} = 0, \tag{5.3} \]
where it has been used that
\[ G^f{}_e = R^f{}_e - \tfrac{1}{2} R\, g^f{}_e. \]
Remark 1. The Einstein tensor is symmetric (from the symmetries of the Ricci and
metric tensors) and therefore it has 10 independent components.
Remark 2. By construction, the Einstein tensor is divergence free.
Chapter 6
General Relativity
\[ \frac{d^2 x'^a}{d\tau^2} = 0. \]
\[ \frac{d^2 x^a}{d\tau^2} + \gamma^a{}_{bc}\, \frac{dx^b}{d\tau}\frac{dx^c}{d\tau} = 0, \]
where
\[ \gamma^a{}_{bc} = \frac{\partial x^a}{\partial x'^d}\frac{\partial^2 x'^d}{\partial x^b \partial x^c}. \]
Here the $\gamma^a{}_{bc}$ are the “fictitious” terms that arise due to the non-inertial nature of the
frame.
Now, due to the Equivalence Principle the latter implies that locally gravity is
equivalent to acceleration and this in turn gives rise to non-inertial frames. The main idea of
General Relativity is to argue that gravitation as well as inertial forces should be described
by appropriate $\gamma^a{}_{bc}$'s!
The simplest way to do this is by means of a Lorentzian manifold, since the latter is
endowed with geodesics of the required type:
\[ \frac{d^2 x^a}{d\tau^2} + \Gamma^a{}_{bc}\, \frac{dx^b}{d\tau}\frac{dx^c}{d\tau} = 0. \]
Now, if the $\Gamma^a{}_{bc}$'s are associated with gravitational forces, then the metric $g_{ab}$ may be
associated with a potential. Note that the gravitational potential in the Newtonian
theory satisfies
\[ \nabla^2 \phi = 4\pi G \rho, \quad \rho \text{ the density}. \]
The relativistic analogue of this equation should be tensorial and of second order in the
metric. To take this analogy further, consider two neighbouring particles with coordinates
$x^\alpha(t)$, $x^\alpha(t) + \xi^\alpha(t)$, with $\xi^\alpha(t)$ small, $\alpha = 1, 2, 3$, moving in a gravitational field with a
potential φ. The equations of motion are then given by:
\[ \ddot{x}^\alpha = -\frac{\partial \phi(x)}{\partial x^\alpha} \]
and
\[ \ddot{x}^\alpha + \ddot{\xi}^\alpha = -\frac{\partial \phi(x)}{\partial x^\alpha} - \xi^\beta\, \frac{\partial^2 \phi}{\partial x^\alpha \partial x^\beta} + O(\xi^2). \]
Subtracting the last two equations:
\[ \ddot{\xi}^\alpha = -\xi^\beta\, \frac{\partial^2 \phi}{\partial x^\alpha \partial x^\beta}. \]
This is the relative acceleration of two test particles separated by a 3-vector $\xi^\alpha$; the
second derivative of the potential gives the tidal forces. This is in analogy to the geodesic
deviation equation:
\[ \nabla_{\bar{V}} \nabla_{\bar{V}}\, \xi^a = R^a{}_{cdb}\, V^c V^d \xi^b, \]
provided that one identifies
\[ -\xi^\beta\, \frac{\partial^2 \phi}{\partial x^\alpha \partial x^\beta} \quad \text{and} \quad R^a{}_{cdb}\, V^c V^d \xi^b. \]
This identification would make clear the relation between gravity and geometry; note
that the Riemann tensor involves second derivatives of the metric tensor.
The main idea underlying General Relativity is that matter (including energy) curves
spacetime (assumed to be a Lorentzian manifold). This in turn affects the motion of par-
ticles and light rays, postulated to move on timelike and null geodesics of the Lorentzian
manifold, respectively.
(3) Correspondence principle. General Relativity must agree with Special Relativity
in the absence of gravitation and with Newtonian gravitational theory in the case of
weak gravitational fields and in the non-relativistic limit (slow speed).
6.3 The Einstein equations in vacuum
In vacuum (such as outside a body in empty space) the density ρ
vanishes and the equation for the Newtonian potential becomes:
\[ \nabla^2 \phi = 0. \]
The Laplace equation involves an object with two indices ($\partial^2 \phi / \partial x^i \partial x^j$). As a result,
what one needs is an object with two indices, namely a contraction of the Riemann tensor such as
the Ricci tensor:
\[ R_{bc} = 0. \]
The latter are called the Einstein vacuum field equations. In fact, the most general form
of the vacuum equations which is tensorial and depends linearly on second derivatives of
the metric is:
\[ R_{bc} = \Lambda g_{bc}, \]
where Λ is the so-called Cosmological constant.
Remark 1. Outside Cosmology, Λ is usually taken to be zero.
Remark 2. The vacuum equations are a set of ten partial differential equations for the
components of the metric tensor gab . These are hard to solve, apart from simple settings.
Remark 1. The Einstein equations are the simplest compatible with the Equivalence
Principle, but they are not the only ones.
From (6.5a) it follows that dt/dτ is a constant. Also, from
\[ \frac{dx^\alpha}{d\tau} = \frac{dx^\alpha}{dt}\frac{dt}{d\tau}, \]
it follows that
\[ \frac{d^2 x^\alpha}{d\tau^2} = \frac{d^2 x^\alpha}{dt^2}\left( \frac{dt}{d\tau} \right)^2 + \frac{dx^\alpha}{dt}\frac{d^2 t}{d\tau^2}, \]
which in our case reduces to
\[ \frac{d^2 x^\alpha}{d\tau^2} = \frac{d^2 x^\alpha}{dt^2}\left( \frac{dt}{d\tau} \right)^2. \]
Combining the latter with (6.5a)
\[ \frac{d^2 x^\alpha}{dt^2} = \tfrac{1}{2} \nabla h_{00}. \tag{6.6} \]
The corresponding Newtonian result is
\[ \frac{d^2 x^\alpha}{dt^2} = -\nabla \phi \tag{6.7} \]
where φ is the gravitational potential, which at a distance r from a central body of mass M
is given by
\[ \phi = -\frac{GM}{r}. \]
Comparing (6.6) and (6.7) one finds then that
\[ h_{00} = -2\phi + \text{constant}. \]
However, at large distances from M one has that φ → 0 (gravity becomes negligible) and
$g_{ab} \to \eta_{ab}$ (the space becomes flat). Therefore the constant must be zero so that
\[ h_{00} = -2\phi. \]
Substituting in (6.4) one finds
\[ g_{00} = -(1 + 2\phi). \]
Now, recall that φ has dimensions of (velocity)$^2$, $[\phi] = [GM/R] = L^2/T^2$, so that
$\phi/c^2$ is dimensionless. Its magnitude at the surface of the Earth is $\sim 10^{-9}$, at the surface of the Sun $\sim 10^{-6}$
and at the surface of a white dwarf $\sim 10^{-4}$. It follows that in most cases the distortion
produced by gravity in $g_{ab}$ is very small.
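The orders of magnitude quoted for $\phi/c^2$ can be reproduced with a few lines of Python. This is a rough sketch, not part of the notes: the masses and radii of the Earth and the Sun are standard rounded reference values, while the white dwarf figures (one solar mass within roughly one Earth radius) are an illustrative assumption.

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1

def potential_over_c2(mass_kg, radius_m):
    """Magnitude of phi / c^2 = G M / (R c^2) at the surface of a body."""
    return G * mass_kg / (radius_m * C ** 2)

earth = potential_over_c2(5.972e24, 6.371e6)
sun = potential_over_c2(1.989e30, 6.957e8)
# Assumed white dwarf: about one solar mass and about one Earth radius.
white_dwarf = potential_over_c2(1.989e30, 6.371e6)
```

With these inputs the three values come out at the orders of magnitude stated above, which is what justifies treating $h_{00}$ as a small perturbation in the Newtonian limit.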
(i) The vacuum spherically symmetric static case (the Schwarzschild spacetime).
(ii) The weak field case (gravitational waves).
(iii) The isotropic and homogeneous case (Cosmology).
6.6 The Schwarzschild solution
This is the basis for nearly all the tests of General Relativity. The solution is
the metric of a static, spherically symmetric gravitational field in the
empty spacetime surrounding a central mass (like the Sun).
Choosing coordinates (t, r, θ, ϕ), it can be shown that a metric of this type is of the
form:
\[ ds^2 = -e^{A(r)} dt^2 + e^{B(r)} dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2), \tag{6.8} \]
where A(r) and B(r) describe the deviation of the metric from Minkowski spacetime. Note
that for constant t and r the metric reduces to the standard metric for the surface of a
sphere. As one is dealing with vacuum, one has to solve
\[ R_{ab} = 0. \tag{6.9} \]
Substituting (6.8) in (6.9), and after some algebra, the only non-zero components of (6.9)
have the form:
\[ R_{rr} = R_{11} = \tfrac{1}{2} A'' - \tfrac{1}{4} A' B' + \tfrac{1}{4} A'^2 - \frac{B'}{r}, \tag{6.10a} \]
\[ R_{\theta\theta} = R_{22} = e^{-B} \left[ 1 + \tfrac{1}{2} r (A' - B') \right] - 1, \tag{6.10b} \]
\[ R_{\varphi\varphi} = R_{33} = R_{22} \sin^2\theta, \tag{6.10c} \]
\[ R_{tt} = R_{00} = -e^{A-B} \left[ \tfrac{1}{2} A'' - \tfrac{1}{4} A' B' + \tfrac{1}{4} A'^2 + \frac{A'}{r} \right], \tag{6.10d} \]
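\[ r e^A = r + \sigma, \quad \sigma \text{ a constant} \]
so that
\[ e^A = 1 + \frac{\sigma}{r}, \]
so that the metric one obtains is given by
\[ ds^2 = -\left( 1 + \frac{\sigma}{r} \right) dt^2 + \left( 1 + \frac{\sigma}{r} \right)^{-1} dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2). \]
To fix σ, recall that in the Newtonian limit of a central mass M,
\[ g_{00} = -\left( 1 - \frac{2GM}{r} \right). \]
Comparing with
\[ -\left( 1 + \frac{\sigma}{r} \right), \]
one finds that
\[ \sigma = -2GM. \]
Hence, at the end of the day one has
\[ ds^2 = -\left( 1 - \frac{2GM}{r} \right) dt^2 + \left( 1 - \frac{2GM}{r} \right)^{-1} dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{6.11} \]
The latter is called the Schwarzschild metric.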
Remark 1. This solution shows how the presence of mass curves flat spacetime.
Remark 2. The metric (6.11) is asymptotically flat. That is, it becomes Minkowskian
as r → ∞.
Remark 3. The solution only applies to the exterior of a star.
Remark 4. The Birkhoff Theorem: a spherically symmetric solution in vacuum is
necessarily static. That is, there is no time dependence in spherically symmetric vacuum solutions.
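\[ L = \left( \frac{d\tau}{d\lambda} \right)^2 = \left( 1 - \frac{2MG}{r} \right) \dot{t}^2 - \left( 1 - \frac{2MG}{r} \right)^{-1} \dot{r}^2 - r^2 (\dot{\theta}^2 + \sin^2\theta\, \dot{\varphi}^2), \]
where a dot denotes differentiation with respect to the parameter λ. For timelike geodesics
one has that λ = τ so that L = 1. On the other hand, for null geodesics L = 0.
The Euler-Lagrange equations then read
\[ \frac{d}{d\lambda} \left( 2 A \dot{t} \right) = 0, \tag{6.12a} \]
\[ \frac{d}{d\lambda} \left( 2 \dot{r} A^{-1} \right) - 2r (\dot{\theta}^2 + \sin^2\theta\, \dot{\varphi}^2) + \dot{r}^2 A^{-2} A' + \dot{t}^2 A' = 0, \tag{6.12b} \]
\[ \frac{d}{d\lambda} \left( r^2 \dot{\theta} \right) - r^2 \sin\theta \cos\theta\, \dot{\varphi}^2 = 0, \tag{6.12c} \]
\[ \frac{d}{d\lambda} \left( r^2 \sin^2\theta\, \dot{\varphi} \right) = 0, \tag{6.12d} \]
where
\[ A(r) = 1 - \frac{2GM}{r}, \]
and a prime denotes differentiation with respect to r. It turns out that it is simpler to use
\[ \left( 1 - \frac{2MG}{r} \right) \dot{t}^2 - \left( 1 - \frac{2MG}{r} \right)^{-1} \dot{r}^2 - r^2 (\dot{\theta}^2 + \sin^2\theta\, \dot{\varphi}^2) = 1, 0. \tag{6.13} \]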
This is, in fact, an integral of motion of the Euler-Lagrange equation. It expresses the
fact that the square of the norm of the 4-velocity vector of a timelike particle is −1, while
that of a photon is 0. This is like in Special Relativity.
As in Classical Mechanics (central force orbit), let us look for solutions in the
equatorial plane: θ = π/2. It follows then that $\dot{\theta} = 0$, and from (6.12c) with cos θ = 0
one finds that $\ddot{\theta} = 0$. The orbits remain in a plane! This is like in Classical Mechanics
(conservation of angular momentum).
Now, from (6.12d) and (6.12a) it follows that
\[ r^2 \dot{\varphi} = h, \quad h \text{ a constant}, \tag{6.14a} \]
\[ \left( 1 - \frac{2GM}{r} \right) \dot{t} = l, \quad l \text{ a constant}. \tag{6.14b} \]
Substituting (6.14a) and (6.14b) in (6.13) one obtains
\[ l^2 \left( 1 - \frac{2GM}{r} \right)^{-1} - \left( 1 - \frac{2GM}{r} \right)^{-1} \dot{r}^2 - \frac{h^2}{r^2} = 1, 0. \tag{6.15} \]
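As in Newtonian theory, let u = 1/r so that
\[ \dot{r} = \frac{dr}{d\lambda} = \frac{dr}{d\varphi}\frac{d\varphi}{d\lambda} = \dot{\varphi}\, \frac{dr}{d\varphi}, \quad \frac{dr}{d\varphi} = -\frac{1}{u^2}\frac{du}{d\varphi}. \]
Using (6.14a) one finds
\[ \dot{r} = -h\, \frac{du}{d\varphi}. \]
Then equation (6.15) in (u, ϕ) coordinates becomes
\[ \left( \frac{du}{d\varphi} \right)^2 + u^2 = \frac{l^2 - 1}{h^2} + \frac{2GM}{h^2}\, u + \frac{2GM u^3}{c^2}, \quad \text{for timelike geodesics}, \tag{6.16a} \]
\[ \left( \frac{du}{d\varphi} \right)^2 + u^2 = \frac{l^2}{h^2} + \frac{2GM u^3}{c^2}, \quad \text{for null geodesics}. \tag{6.16b} \]
The speed of light c has been added for dimensional reasons. These are the analogues
of energy equations in Newtonian theory. One can solve (6.16a)-(6.16b) approximately.
For this, differentiate the equations with respect to ϕ:
\[ \frac{d^2 u}{d\varphi^2} + u = \frac{GM}{h^2} + \frac{3GM u^2}{c^2}, \quad \text{for timelike geodesics}, \tag{6.17a} \]
\[ \frac{d^2 u}{d\varphi^2} + u = \frac{3GM u^2}{c^2}, \quad \text{for null geodesics}. \tag{6.17b} \]
From here, one has to analyse the two cases separately.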
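where ε is dimensionless and assumed to be small. Then equation (6.17a) implies
\[ \frac{d^2 u}{d\varphi^2} + u = a + \epsilon\, \frac{u^2}{a}. \tag{6.18} \]
We will look for solutions of the type
\[ u = u_0 + \epsilon u_1 + O(\epsilon^2). \]
\[ \frac{d^2 u_0}{d\varphi^2} + u_0 + \epsilon \left( \frac{d^2 u_1}{d\varphi^2} + u_1 \right) = a + \epsilon\, \frac{u_0^2}{a} + O(\epsilon^2). \tag{6.19} \]
Equating zeroth order terms in ε in equation (6.19) one obtains
\[ \frac{d^2 u_0}{d\varphi^2} + u_0 = a, \]
which can be solved to give
\[ u_0 = a + b \cos(\varphi - \varphi_0), \tag{6.20} \]
where without loss of generality we have set $\varphi_0 = 0$. Now, equating the first order terms
in (6.19) one has
\[ \frac{d^2 u_1}{d\varphi^2} + u_1 = \frac{u_0^2}{a}. \]
Substituting for $u_0$ from (6.20), and using that
\[ \cos^2\varphi = \tfrac{1}{2} (1 + \cos 2\varphi), \]
one obtains
\[ \frac{d^2 u_1}{d\varphi^2} + u_1 = a + \frac{b^2}{2a} + 2b \cos\varphi + \frac{b^2}{2a} \cos 2\varphi, \tag{6.21} \]
which is a linear inhomogeneous ordinary differential equation. Its general solution
consists of a general solution to the homogeneous part plus a particular solution
corresponding to each term of the right hand side of (6.21). One has that for:
\[ a + \frac{b^2}{2a}, \quad \text{the solution is } a + \frac{b^2}{2a}, \]
\[ 2b \cos\varphi, \quad \text{the solution is } b \varphi \sin\varphi, \]
\[ \frac{b^2}{2a} \cos 2\varphi, \quad \text{the solution is } -\frac{b^2}{6a} \cos 2\varphi. \]
Hence, the solution to the orbit equation to first order in ε is
\[ u = u_0 + \epsilon u_1 = a + \epsilon a + \frac{\epsilon b^2}{2a} + b \cos\varphi - \frac{\epsilon b^2}{6a} \cos 2\varphi + \epsilon b \varphi \sin\varphi. \]
It is observed that only the last term in this expression is non-periodic, and hence, any
irregularity in the orbit (non-periodicity) must relate to this term. To see the effect of
this term recall that for small εϕ
\[ \cos \epsilon\varphi \approx 1, \quad \sin \epsilon\varphi \approx \epsilon\varphi, \]
so that
\[ \cos(\varphi - \epsilon\varphi) = \cos\varphi \cos\epsilon\varphi + \sin\varphi \sin\epsilon\varphi \approx \cos\varphi + \epsilon\varphi \sin\varphi. \]
Hence, we write the solution as
\[ u = a + b \cos(\varphi - \epsilon\varphi) + \epsilon \left( a + \frac{b^2}{2a} - \frac{b^2}{6a} \cos 2\varphi \right), \]
that is:
\[ u = a + b \cos(\varphi - \epsilon\varphi) + \epsilon\, (\text{periodic terms}). \]
Now, recall that the perihelion of a planet around the Sun occurs when r is a minimum
(u a maximum). Now, cos(ϕ − εϕ) is a maximum when ϕ(1 − ε) = 2nπ, so that successive
perihelia are separated by
\[ \Delta\varphi = \frac{2\pi}{1 - \epsilon} \approx 2\pi (1 + \epsilon), \]
instead of 2π as in the case of periodic motion. Therefore, the perihelion shift per
revolution (δϕ = ∆ϕ − 2π) is
\[ \delta\varphi = 2\pi\epsilon = \frac{6\pi G^2 M^2}{h^2 c^2}. \]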
From Newtonian theory we have that
\[ \alpha = \frac{h^2}{GM (1 - e^2)}, \quad T^2 = \frac{4\pi^2 \alpha^3}{GM}, \]
where e is the eccentricity of the orbit, α the semi-major axis, and T the period. One
obtains then that
\[ \delta\varphi = \frac{24 \pi^3 \alpha^2}{c^2 T^2 (1 - e^2)}. \]
For the case of the planet Mercury this gives a total shift of 43.03'' per century which is
in good agreement with the classically unaccounted shift of 43.11'' ± 0.45''.
Remark 1. This is one of the famous classical tests of General Relativity, the so-called
perihelion shift of Mercury.
Remark 2. The effect is largest for Mercury because of its high eccentricity and small
period which results in a large shift.
Remark 3. For Venus one has a predicted shift of 8.6'' and an observed one of 8.4'' ± 4.8''.
For the Earth one has 3.8'' and 5.0'' ± 1.2''. For the asteroid Icarus 10.3'' and 9.8'' ± 0.8''.
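The figure quoted for Mercury can be checked by evaluating the perihelion shift formula directly. The Python sketch below is only an illustration: the function name is hypothetical, and the orbital elements used for Mercury (semi-major axis, period and eccentricity) are rounded reference values supplied here, not taken from the notes.

```python
import math

C = 2.99792458e8          # speed of light, m s^-1
SECONDS_PER_DAY = 86400.0
DAYS_PER_CENTURY = 36525.0

def perihelion_shift_per_orbit(semi_major_axis_m, period_s, eccentricity):
    """delta_phi = 24 pi^3 alpha^2 / (c^2 T^2 (1 - e^2)), in radians per revolution."""
    return (24 * math.pi ** 3 * semi_major_axis_m ** 2
            / (C ** 2 * period_s ** 2 * (1 - eccentricity ** 2)))

# Assumed (rounded) orbital elements of Mercury.
alpha_mercury = 5.791e10      # semi-major axis in metres
period_days = 87.969          # orbital period in days
e_mercury = 0.2056            # eccentricity

shift_rad = perihelion_shift_per_orbit(alpha_mercury,
                                       period_days * SECONDS_PER_DAY,
                                       e_mercury)
orbits_per_century = DAYS_PER_CENTURY / period_days
shift_arcsec_per_century = math.degrees(shift_rad * orbits_per_century) * 3600
```

With these inputs the result is close to 43'' per century, in line with the value stated above; the same function can be fed the elements of Venus, the Earth or Icarus.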
\[ \frac{d^2 u}{d\varphi^2} + u = \frac{3GM u^2}{c^2}. \]
As before, the term $3GM u^2/c^2$ is small relative to u so let $\epsilon \equiv 3GM/c^2$, and rewrite (6.17b)
as
\[ \frac{d^2 u}{d\varphi^2} + u = \epsilon u^2. \tag{6.22} \]
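As before, look for solutions of the form
\[ u = u_0 + \epsilon u_1 + O(\epsilon^2). \]
\[ \frac{d^2 u_0}{d\varphi^2} + u_0 + \epsilon \left( \frac{d^2 u_1}{d\varphi^2} + u_1 \right) = \epsilon u_0^2 + O(\epsilon^2). \]
Equating the zeroth order terms in the previous equation:
\[ \frac{d^2 u_0}{d\varphi^2} + u_0 = 0, \]
which can be solved to give
\[ u_0 = L \cos\varphi, \quad L \text{ a constant}. \]
\[ \frac{d^2 u_1}{d\varphi^2} + u_1 = u_0^2 = L^2 \cos^2\varphi = \tfrac{1}{2} L^2 (1 + \cos 2\varphi) \]
which has the particular solution
\[ u_1 = \tfrac{1}{2} L^2 - \tfrac{1}{6} L^2 \cos 2\varphi = \tfrac{2}{3} L^2 - \tfrac{1}{3} L^2 \cos^2\varphi. \]
So, the effect of the first order terms (the last 2 terms) is to make light deflect from a
straight line.
For a light ray grazing the Sun and arriving at Earth, the asymptote of the trajectory
corresponds to values of ϕ for which r → ∞, that is, u → 0. Substituting in $u = u_0 + \epsilon u_1$ this
gives
\[ \cos^2\varphi - \frac{3}{\epsilon L} \cos\varphi - 2 = 0. \]
The latter can be solved to give
\[ \cos\varphi = \frac{3}{2\epsilon L} \left( 1 \pm \sqrt{1 + \tfrac{8}{9} \epsilon^2 L^2} \right). \]
Choosing the negative sign to make |cos ϕ| < 1 and expanding one finds
\[ |\cos\varphi| \approx \tfrac{2}{3} \epsilon L = \frac{2GM L}{c^2} \ll 1, \]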
Here δ is the angle that each asymptote makes with the undeflected straight line.
The angle between the two asymptotes is
\[ \Delta = 2\delta. \]
Accordingly, one finds
\[ \Delta = \frac{4GML}{c^2}. \]
For a light ray just grazing the Sun this predicts a deflection of 1.75″, which compares
well with some recent radio observations yielding ∆ = 1.73″ ± 0.05″.
Remark. This is the second famous test of General Relativity, more generally referred
to as the bending of light. A first measurement was carried out by Eddington and
collaborators in 1919.
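The predicted deflection for a ray grazing the Sun can be reproduced numerically. The sketch below assumes L = 1/R, with R the solar radius (for the undeflected line u0 = L cos ϕ the closest approach is r = 1/L), and uses approximate values of G, the solar mass and the solar radius that are not given in these notes. It compares the first order result ∆ = 4GML/c² with the angle obtained from the exact root of the quadratic for cos ϕ.

```python
import math

G = 6.674e-11            # m^3 kg^-1 s^-2 (approximate, assumed)
M_SUN = 1.989e30         # kg (approximate, assumed)
R_SUN = 6.957e8          # m (approximate, assumed)
C = 299_792_458.0        # m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def deflection_first_order(L):
    """Delta = 4 G M L / c^2, in radians."""
    return 4.0 * G * M_SUN * L / C**2


def deflection_from_quadratic(L):
    """Delta = 2 delta, with cos(phi) taken from the negative root."""
    eps = 3.0 * G * M_SUN / C**2
    x = (8.0 / 9.0) * (eps * L) ** 2
    # 1 - sqrt(1 + x) rewritten as -x / (1 + sqrt(1 + x)) to avoid cancellation.
    cos_phi = (3.0 / (2.0 * eps * L)) * (-x / (1.0 + math.sqrt(1.0 + x)))
    delta = math.asin(-cos_phi)   # phi = pi/2 + delta, so cos(phi) = -sin(delta)
    return 2.0 * delta


L_graze = 1.0 / R_SUN             # assumption: closest approach at the solar radius
first_order = deflection_first_order(L_graze) * ARCSEC_PER_RAD
from_root = deflection_from_quadratic(L_graze) * ARCSEC_PER_RAD
print(f"{first_order:.3f} arcsec, {from_root:.3f} arcsec")
```

Both numbers should be close to 1.75″; they differ only at relative order (εL)², which is negligible for the Sun.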
dr = dθ = dϕ = 0.
\[ \frac{dt}{d\tau} = \left(1 - \frac{2GM}{r}\right)^{-1/2}, \]
from where it follows that dt (i.e. the period or interval as measured by an observer at
infinity) is larger than dτ as measured at r. Similarly, the period of atomic oscillations in
the gravitational field of M, as seen from infinity, is increased. Therefore, the frequency
ν (ν = 1/τ) of the emitted light is decreased, i.e. redshifted, as seen from infinity.
We define redshift via
\[ z \equiv \frac{\lambda_{rec} - \lambda_{em}}{\lambda_{em}} = \frac{\nu_{em}}{\nu_{rec}} - 1. \]
Accordingly,
\[ 1 + z = \frac{\tau_{rec}}{\tau_{em}} = \frac{dt}{d\tau} = \left(1 - \frac{2GM}{r}\right)^{-1/2}. \]
Note that z → ∞ as r → 2GM/c2 .
Remark. It used to be thought that gravitational redshift also constituted a test of General
Relativity, but it turns out that any other theory compatible with the Equivalence
Principle will predict a redshift.
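To get a feeling for the size of the effect, the redshift formula can be evaluated for light emitted at the surface of the Sun and received at infinity. This is a sketch under stated assumptions: the factor of c is restored by writing 2GM/(c²r) in place of 2GM/r, and the values of G, the solar mass and the solar radius are approximate numbers not taken from these notes.

```python
G = 6.674e-11            # m^3 kg^-1 s^-2 (approximate, assumed)
M_SUN = 1.989e30         # kg (approximate, assumed)
R_SUN = 6.957e8          # m (approximate, assumed)
C = 299_792_458.0        # m/s


def redshift(mass, r):
    """z = (1 - 2GM/(c^2 r))^(-1/2) - 1 for emission at radius r."""
    r_s = 2.0 * G * mass / C**2
    if r <= r_s:
        raise ValueError("formula only applies for r > 2GM/c^2")
    return (1.0 - r_s / r) ** -0.5 - 1.0


z_sun = redshift(M_SUN, R_SUN)
r_s_sun = 2.0 * G * M_SUN / C**2
z_near = redshift(M_SUN, 1.0001 * r_s_sun)   # emitter just outside r = 2GM/c^2
print(z_sun, z_near)
```

For the Sun the redshift is of order 10⁻⁶, while for an emitter just outside r = 2GM/c² it grows without bound, in line with the note above.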
\[ r = \frac{2GM}{c^2}, \qquad r = 0. \]
To get a better feel for what is happening, it is best to look at coordinate independent
scalars at these points. A good candidate is R_{abcd}R^{abcd}, which for the metric
(6.24) is proportional to 1/r⁶. The latter is clearly singular at r = 0, with severe physical
consequences such as diverging tidal forces. Nevertheless, the scalar remains well
behaved at r = 2GM/c².
One says that at r = 0 one has a physical singularity, whereas at r = 2GM/c² one has a
coordinate singularity. In order to understand this better, choose a new coordinate t̃ via
\[ \tilde{t} = t + 2GM\ln|r - 2GM|. \]
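The coordinates (t̃, r, θ, ϕ) are called Eddington-Finkelstein coordinates. Using this new
time coordinate one finds that
\[ dt = d\tilde{t} - \frac{2GM}{r - 2GM}dr, \]
so that the metric (6.24) transforms into
\[ ds^2 = -\left(1 - \frac{2GM}{r}\right)d\tilde{t}^2 + \frac{4GM}{r}d\tilde{t}\,dr + \left(1 + \frac{2GM}{r}\right)dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{6.25} \]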
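The transformation can be checked numerically for radial displacements (dθ = dϕ = 0). The sketch below assumes that (6.24) is the usual Schwarzschild line element ds² = −(1 − 2GM/r)dt² + (1 − 2GM/r)⁻¹dr² + r²(dθ² + sin²θ dϕ²) in units with G = c = 1; the sample values of M, r, dt̃ and dr are arbitrary.

```python
def ds2_schwarzschild(M, r, dt, dr):
    """Radial part of the assumed form of (6.24), with G = c = 1."""
    f = 1.0 - 2.0 * M / r
    return -f * dt**2 + dr**2 / f


def ds2_eddington_finkelstein(M, r, dt_tilde, dr):
    """Radial part of (6.25), with G = c = 1."""
    return (-(1.0 - 2.0 * M / r) * dt_tilde**2
            + (4.0 * M / r) * dt_tilde * dr
            + (1.0 + 2.0 * M / r) * dr**2)


def compare(M, r, dt_tilde, dr):
    """Evaluate both line elements for the same displacement."""
    dt = dt_tilde - 2.0 * M / (r - 2.0 * M) * dr   # dt in terms of dt_tilde and dr
    return ds2_schwarzschild(M, r, dt, dr), ds2_eddington_finkelstein(M, r, dt_tilde, dr)


# Arbitrary sample points, outside and inside r = 2M (but not at r = 2M).
samples = [(1.0, 3.0, 0.7, -0.4), (1.0, 10.0, 0.2, 0.5), (1.0, 1.5, 0.3, -0.1)]
results = [compare(*s) for s in samples]
for old, new in results:
    print(old, new)
```

The two values should agree at every sample point with r ≠ 2GM, which is just the statement that (6.25) is the same metric written in different coordinates.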
Note that in these coordinates all metric coefficients are well behaved, except at r = 0.
To understand the light cone structure of the spacetime, we look at radial light cones
defined by
\[ \theta = \text{constant}, \qquad \phi = \text{constant}, \qquad ds^2 = 0, \]
so that
\[ \left(1 - \frac{2GM}{r}\right)\left(\frac{d\tilde{t}}{dr}\right)^2 - \frac{4GM}{r}\frac{d\tilde{t}}{dr} - \left(1 + \frac{2GM}{r}\right) = 0. \]
Solving the quadratic equation one finds
\[ \frac{d\tilde{t}}{dr} = -1, \qquad \frac{d\tilde{t}}{dr} = \frac{r + 2GM}{r - 2GM}. \]
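The two slopes can be recovered by solving the quadratic numerically, which also shows how the light cones tilt as r decreases. This is a small sketch in units with G = c = 1 and with M = 1 chosen arbitrarily; the function name is introduced here for illustration only.

```python
import math


def light_cone_slopes(M, r):
    """Return (ingoing, outgoing) values of dt_tilde/dr for radial null rays."""
    a = 1.0 - 2.0 * M / r
    b = -4.0 * M / r
    c = -(1.0 + 2.0 * M / r)
    if a == 0.0:
        raise ValueError("at r = 2M the quadratic degenerates to a linear equation")
    disc = math.sqrt(b * b - 4.0 * a * c)   # equals 2 for every r
    ingoing = (-b - disc) / (2.0 * a)       # always -1
    outgoing = (-b + disc) / (2.0 * a)      # (r + 2M) / (r - 2M)
    return ingoing, outgoing


outside = light_cone_slopes(1.0, 4.0)   # r > 2M
inside = light_cone_slopes(1.0, 1.0)    # r < 2M
print(outside, inside)
```

Outside r = 2GM the second family has dt̃/dr > 0, so r can increase along it. Inside r = 2GM both slopes are negative, so r decreases along every radial null ray as t̃ increases.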
Note also that even though a particle can sit stationary at r > 2GM (as we do on
Earth, given appropriate forces to counter gravity and stop our free fall), this cannot be
done in the region r < 2GM, as all timelike (and null) trajectories must make an angle with
the vertical, and therefore must have decreasing r for increasing t̃. Therefore, the particle
must fall towards r → 0. There is no static behaviour in the region r < 2GM and no
escape from the singularity once inside.
Combining these equations one obtains
\[ \left(\frac{dr}{d\tau}\right)^2 = r'^2 = -1 + \frac{2GM}{r} + l^2. \]
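As an illustration of this equation, the proper time elapsed along an infalling trajectory can be obtained by integrating dτ = dr / √(−1 + 2GM/r + l²). The sketch below works in units with G = c = 1 and treats l as a constant of the motion; the choice l = 1, for which the integral has the closed form τ = (2/3)(r_start^{3/2} − r_end^{3/2})/√(2M), and the radii used are assumptions made here for illustration.

```python
import math


def proper_time(M, l, r_start, r_end, steps=20000):
    """Midpoint-rule integral of dr / sqrt(-1 + 2M/r + l^2) from r_end to r_start."""
    h = (r_start - r_end) / steps
    total = 0.0
    for k in range(steps):
        r = r_end + (k + 0.5) * h
        total += h / math.sqrt(-1.0 + 2.0 * M / r + l * l)
    return total


M = 1.0                      # arbitrary mass (G = c = 1)
r_start, r_end = 10.0, 2.0   # fall from r = 10M down to r = 2M
numeric = proper_time(M, 1.0, r_start, r_end)
exact = (2.0 / 3.0) * (r_start**1.5 - r_end**1.5) / math.sqrt(2.0 * M)
print(numeric, exact)
```

The proper time to reach r = 2GM is finite, and the numerical and closed-form values should agree to high accuracy.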