Relativity Notes

Relativity
(MTH6132)
Course notes
Dr. Juan A. Valiente Kroon
(adapted from notes from Prof. Reza Tavakol)
School of Mathematical Sciences,

Queen Mary, University of London,
Mile End Road
London E1 4NS
September 2010
Preface
These are the notes for the course of Relativity (MTH6132) I lectured in the Semester
A of 2011 (October-December 2010) at the School of Mathematical Sciences of Queen
Mary, University of London. These notes are mostly based on handwritten notes I have
inherited from Prof. Reza Tavakol. Of course, several parts of the notes have been
adapted to my particular taste and understanding of the subject. In any case, any typos,
omissions or misrepresentations are entirely my responsibility.
The present course on Relativity is aimed at the particular characteristics of our
students at the School of Mathematical Sciences. In particular, it assumes very little
background in Physics. Hence, a certain amount of time is spent presenting the underlying
assumptions and experimental motivation for such a theory. It also assumes very little
from the mathematical side. All the necessary ideas from Differential Geometry and
tensors are provided within.
The course is quite an ambitious one. It begins with Special Relativity, then moves
to Differential Geometry and finally (in the last third) it provides an introduction to
General Relativity. Due to time constraints, there are some clear omissions in the choice
of topics. In particular, in the chapter on Special Relativity it would be desirable to
have a discussion of the Maxwell equations. In the chapter on General Relativity, the
discussion is restricted to the vacuum field equations. There is little mention of the field
equations with matter. Also, it would be desirable to have a discussion of the Friedmann
cosmological models. The discussion of these topics would require at least a couple of
weeks more, and may also involve the reorganisation of some of the topics discussed in
the mathematical background. I do not rule out the possibility of carrying out such a
revision next time I lecture the course.
Chapter 1
Introduction
1.1 What is Relativity?

The term Relativity encompasses two physical theories proposed by Einstein¹, namely
Special Relativity and General Relativity. However, as we will see, the word relativity is
also used in reference to Galilean Relativity². The term Theory of Relativity was first
coined by Max Planck³ in 1906 to emphasize how a theory devised by Einstein in 1905
(what we now call Special Relativity) uses the Principle of Relativity.
1.1.1 Special Relativity?

Special Relativity is the physical theory of measurement in inertial frames of reference.
It was proposed in 1905 by Albert Einstein in the article On the Electrodynamics of
Moving Bodies (Zur Elektrodynamik bewegter Körper, Annalen der Physik 17, 891 (1905)).
It generalises Galileo’s Principle of Relativity, which states that all motion is relative and that there is
no absolute and well-defined state of rest. Special Relativity incorporates the principle
that the speed of light is the same for all inertial observers, regardless of the state of motion
of the source. The theory is termed special because it only applies to the special case
of inertial reference frames, i.e. frames of reference in uniform relative motion with
respect to each other. Special Relativity predicts the equivalence of mass and energy
as expressed by the formula
\[ E = mc^2. \]
Special Relativity is a fundamental tool to describe the interaction between elementary
particles, and was widely accepted by the Physics community by the 1920’s.
1.1.2 General Relativity?

General Relativity is the geometric theory of gravitation published by Albert Einstein
in 1915 in the article The Field Equations of Gravitation (Die Feldgleichungen der Gravitation,
Sitzungsberichte der Preussischen Akademie der Wissenschaften zu Berlin 844).
It generalises Special Relativity and Newton’s law of universal gravitation, providing a
unified description of gravity as the manifestation of the curvature of spacetime. The
theory is general because it applies the Principle of Relativity to any frame of reference so
as to handle general coordinate transformations. From General Relativity it follows that
Special Relativity still applies locally. The main domain of applicability of General Relativity
is Astrophysics and Cosmology. More recently, the Global Positioning System (GPS)
has come to rely on General Relativity to function accurately. In contrast to Special Relativity,
General Relativity was not widely accepted until the 1960’s.
¹ Albert Einstein (1879-1955). Physicist of German origin. Died in the USA.
² Galileo Galilei (1564-1642). Italian physicist, mathematician and astronomer.
³ Max Planck (1858-1947). German physicist.
1.2 Pre-relativistic Physics

1.2.1 Galilean Relativity
In order to study General Relativity one starts by discussing Special Relativity. To this end,
it is important to briefly look at pre-relativistic Physics to see how Special Relativity
arose.
The starting point of Special Relativity is the study of motion. For this one needs
the following ingredients:
• Frames of reference. These consist of an origin in space, 3 orthogonal axes and
a clock.
• Events. This notion denotes a single point in space together with a single point in
time. Thus, events are characterised by 4 real numbers: an ordered triple (x, y, z)
giving the location in space relative to a fixed coordinate system and a real number
giving the Newtonian time. One denotes the event by E = (t, x, y, z).
There are an infinite number of frames of reference. Motion relative to each frame
looks, in principle, different. Hence, it is natural to ask: is there a subset of these frames
which are in some sense simple, preferred or natural? The answer to this question is yes.
These are the so-called inertial frames. In an inertial frame an isolated, non-rotating,
unaccelerated body moves on a straight line and uniformly.
Inertial frames are not unique. There are actually an infinite number of these. This
raises the question: can one tell which inertial frame one is in? It turns out that
within the framework of Newtonian Mechanics this is not possible. More precisely, one
has the following:
Galilean Principle of Relativity. Laws of mechanics cannot distinguish between
inertial frames. This implies that there is no absolute rest. In other words, the laws of
Mechanics retain the same form in different inertial frames.
In this sense, Relativity predates Einstein.
1.2.2 Laws of Newton

The three Laws of Newtonian Mechanics⁴ are:
(1) Any material body continues in its state of rest or uniform motion (in a straight
line) unless it is made to change the state by forces acting on it. This principle is
equivalent to the statement of existence of inertial frames.
(2) The rate of change of momentum is equal to the force.
(3) Action and reaction are equal and opposite.

⁴ Isaac Newton (1643-1727). English physicist and mathematician.
These laws or principles, together with the following fundamental assumptions (some
of which are implicitly assumed in Newton’s laws) amount to the Newtonian framework :
(A1) Space and time are continuous —i.e. not discrete. This is necessary to make use
of the Calculus.
(A2) There is a universal (absolute) time. Different observers in different frames measure
the same time. In fact, Newton also regarded space as absolute.
However, the absoluteness of space is not necessary for the development of the
Newtonian framework, as space intervals turn out to be invariant under Galilean
transformations. Historically, Newton demanded this for subjective reasons.
(A3) Mass remains invariant as viewed from different inertial frames.
(A4) The Geometry of space is Euclidean. For example, the sum of angles in any triangle
equals 180 degrees.
(A5) There is no limit to the accuracy with which quantities such as time and space can
be measured.
As will be seen in the sequel, Assumptions (A2) and (A3) are relaxed in Special Relativity
while Assumption (A4) is relaxed in General Relativity. Assumption (A5) is relaxed in Quantum
Mechanics, which is not discussed in this course. Presumably Assumption (A1) will be relaxed
in Quantum Gravity!
1.2.3 Galilean transformations

Galilean transformations tell us how to transform from one inertial frame to another.
Consider two inertial frames $F(x, y, z, t)$ and $F'(x', y', z', t')$ moving with velocity $v$
relative to one another in standard configuration; that is, $F'$ moves along the $x$ axis of
the frame $F$ with uniform speed $v$, and all axes remain parallel. See the figure:

Now, suppose that at a given moment of time $t$, an event $E$ is specified by coordinates
$(t, x, y, z)$ and $(t', x', y', z')$ relative to the frames $F$ and $F'$, respectively. Let the origins
$O$ and $O'$ coincide at $t = 0$. From the figure one sees that
\[ x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t, \tag{1.1} \]
or more compactly (recall that in general $\underline{v} = (v_x, v_y, v_z)$, but here $v_y = v_z = 0$):
\[ \underline{r}' = \underline{r} - \underline{v}t. \]
In general, if the coordinate axes are not in standard configuration and the origins $O$ and
$O'$ do not coincide at $t = 0$, the transformation takes the form
\[ \underline{r}' = R\underline{r} - \underline{v}t + \underline{d}, \]
where $R$ is the rotation matrix aligning the axes of the frames and $\underline{d}$ is the constant displacement
between the origins at $t = 0$. Note that the general transformation is linear in $\underline{r}$ and $t$ (up to the
constant shift $\underline{d}$), so that uniform motion on a straight line is mapped to uniform motion on a
straight line and $F'$ is inertial if $F$ is. The most general transformation would also include
\[ t' = t + \tau, \]
where $\tau$ is a real constant.

These transformations form a 10-parameter group (1 for τ, 3 for v, 3 for d, and 3 for
R). The group property implies that the composition of two Galilean transformations is a
Galilean transformation, and that given a Galilean transformation there is always an
inverse transformation. The Galilean transformations restricted to standard configuration
form a 1-parameter subgroup of this group, parametrised by v.
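As a small illustration of the group property for the standard-configuration subgroup, the following Python sketch composes and inverts transformations of the form (1.1). The class name `GalileanBoost` and its methods are hypothetical names introduced here for illustration; they are not part of the course material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GalileanBoost:
    """Standard-configuration Galilean transformation (1.1) with relative speed v."""
    v: float

    def apply(self, t, x):
        # Equation (1.1): t' = t, x' = x - v t (y and z are unchanged and omitted).
        return t, x - self.v * t

    def compose(self, other):
        # Applying self first and then other gives another boost whose speed is the sum.
        return GalileanBoost(self.v + other.v)

    def inverse(self):
        # The inverse transformation is the boost with speed -v.
        return GalileanBoost(-self.v)


if __name__ == "__main__":
    b1, b2 = GalileanBoost(3.0), GalileanBoost(-1.5)
    event = (2.0, 10.0)  # (t, x) in the frame F
    print(b2.apply(*b1.apply(*event)))   # two successive boosts
    print(b1.compose(b2).apply(*event))  # the single composed boost
```

Both printed pairs should agree, which is the closure property in this one-parameter case; the identity corresponds to v = 0.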
1.2.4 Invariance of Newton’s laws under Galilean transformations

Important for the sequel is the notion of invariance. Invariance refers to properties of a
system that remain unchanged under a particular type of transformation. For example,
$\underline{a} = \underline{b}$ as a vector equation is invariant under a change of coordinate system. However,
the particular values of the components of the vector equation do change. We will see
more about this in the next chapter!
We will see that the laws of Mechanics keep the same form as we go from one inertial
frame to another, i.e. under Galilean transformations. To see this, let the position of
a particle $P$ be specified by $\underline{r} = \underline{r}(t)$ relative to a frame $F$. The motion relative to $F'$ is
given by equation (1.1), that is $\underline{r}' = \underline{r} - \underline{V}t$, where $\underline{V}$ now denotes the velocity of $F'$
relative to $F$ (written $v$ in (1.1)). Differentiating both sides once and twice with respect to $t$ (notice that
$t = t'$) gives:
\[ \underline{v}' = \underline{v} - \underline{V}, \quad \underline{a}' = \underline{a}, \]
where
\[ \underline{v} = \frac{d\underline{r}}{dt}, \quad \underline{a} = \frac{d^2\underline{r}}{dt^2}, \]
are, respectively, the velocity and acceleration of the particle.
Now, the First and Third Laws are invariant as the former involves inertial frames
and the latter involves accelerations which are invariant. It remains to show that the
Second Law (the fundamental equation of Newtonian Mechanics)
\[ m\frac{d^2\underline{r}}{dt^2} = m\underline{a} = \underline{f} \tag{1.2} \]
is invariant as we go from one inertial frame to another.
To show the invariance of (1.2) recall that $\underline{a}' = \underline{a}$ and $m$ remains invariant (by
assumption) so that one only needs to show that $\underline{f}$ remains invariant as we go from $F$ to
$F'$. To do this, recall that generally $\underline{f}$ takes the form $\underline{f} = \underline{f}(\underline{r}, \underline{v}, t)$ where usually $\underline{r}$ and
$\underline{v}$ are the relative distance and the relative velocity between two bodies. One can verify
that the relative distances and velocities remain invariant, since the frame-dependent terms cancel in the differences. That is,
\[ \underline{v}'_2 - \underline{v}'_1 = \underline{v}_2 - \underline{v}_1, \quad \underline{r}'_2 - \underline{r}'_1 = \underline{r}_2 - \underline{r}_1. \]
This implies that $\underline{f}$, and hence the Second Law, remains invariant under changes of
inertial frame.
This discussion amounts to a form of self-consistency, in the sense that Physics, when
confined to Newtonian Mechanics, satisfies the Galilean Principle of Relativity.
1.2.5 Electromagnetism
Special Relativity arises from the tension between Newtonian Mechanics and the other
great physical theory of the 19th century: Electromagnetism. The fundamental laws of
Electromagnetism are the so-called Maxwell equations⁵:
\[ \nabla \cdot \underline{D} = \rho, \]
\[ \nabla \times \underline{E} = -\frac{\partial \underline{B}}{\partial t}, \]
\[ \nabla \cdot \underline{B} = 0, \]
\[ \nabla \times \underline{H} = \underline{j} + \frac{\partial \underline{D}}{\partial t}, \]
where B is the magnetic induction, E the electric field, H the magnetic field, D the
electric displacement, j the electric current density and ρ the electric charge density.
It can be shown that, in vacuum, these equations predict the existence of electromagnetic waves
for E and H in the form
\[ \nabla^2 \underline{E} = \frac{1}{c^2}\frac{\partial^2 \underline{E}}{\partial t^2}, \quad \nabla^2 \underline{H} = \frac{1}{c^2}\frac{\partial^2 \underline{H}}{\partial t^2}, \]
where c is the speed of propagation of the waves. These electromagnetic waves were soon
identified with the propagation of light.
We recall that light travels with a speed of $c \approx 3 \times 10^8$ m/s. This was first measured
by Rømer⁶ in 1675 by studying the delay in the appearance of the moons of Jupiter.
Within the Newtonian framework, the Maxwell equations give rise to two problems:
(1) With respect to which system of reference is the speed of light c measured? At first,
it was assumed that the absolute space of Newton, identified with the so-called ether, was the
medium in (and relative to) which light moved. However, attempts at detecting
the effects of Earth’s motion on the velocity of light (the so-called terrestrial
ether drift) all failed. The most important of these was the Michelson-Morley
experiment⁷, which gave a null result.
(2) It is easy to show that Maxwell’s equations and the wave equation do not remain
invariant under Galilean transformations.
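As a brief sketch of claim (2), consider the wave equation in one space dimension for a scalar field $\phi$. Both the restriction to one dimension and the symbol $\phi$ (standing in for any component of E or H) are simplifying assumptions made here. Applying the chain rule to the Galilean transformation (1.1) gives:

```latex
% Galilean transformation (1.1): x' = x - vt, t' = t.
\[
\frac{\partial}{\partial x} = \frac{\partial}{\partial x'}, \qquad
\frac{\partial}{\partial t} = \frac{\partial}{\partial t'} - v\frac{\partial}{\partial x'},
\]
% Substituting into the wave operator:
\[
\frac{\partial^2\phi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2}
= \left(1 - \frac{v^2}{c^2}\right)\frac{\partial^2\phi}{\partial x'^2}
+ \frac{2v}{c^2}\frac{\partial^2\phi}{\partial x'\,\partial t'}
- \frac{1}{c^2}\frac{\partial^2\phi}{\partial t'^2}.
\]
```

The mixed-derivative term and the factor $1 - v^2/c^2$ show that the equation does not keep its form unless $v = 0$.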
These problems gave rise to a crisis in 19th century Physics. Three scenarios were
put forward to resolve the tension. These were:
(i) Maxwell’s equations were incorrect. The correct laws of Electromagnetism would
remain invariant under Galilean transformations.
(ii) Electromagnetism had a preferred frame of reference —that of ether.
(iii) There is a Relativity Principle for the whole of Physics —Mechanics and Electro-
magnetism. In that case the laws of Mechanics need modification.
Now, Electromagnetism was very successful and had very strong predictive power, which spoke against (i).
There was no experimental support for (ii). Hence the point of view (iii) was adopted by
Einstein. His resolution of the tension between Mechanics and Electromagnetism came
to be known as Special Relativity.
⁵ James C. Maxwell (1831-1879). Scottish mathematician.
⁶ Ole C. Rømer (1644-1710). Danish astronomer.
⁷ Albert Michelson (1852-1931). Edward Morley (1838-1923). American physicists.
Chapter 2
Special Relativity
The contradiction brought about by the development of Electromagnetism gave rise to a
crisis in the 19th century that Special Relativity resolved.
2.1 Einstein’s postulates of Special Relativity

(i) There is no ether (there is no absolute system of reference).
(ii) The laws of Nature have the same form in all inertial frames (Einstein’s principle
of Relativity)
(iii) The velocity of light in empty space is a universal constant, i.e. same for all
observers and light sources, independent of their motion —Michelson & Morley’s
result is promoted to an axiom.
Note that postulate (iii) is clearly incompatible with the Galilean transformations, which
imply $c' = c - v$. Because of this the Galilean transformations need modification. This
leads to the Lorentz transformations.
2.2 Spacetime pictures

Spacetime pictures (diagrams) are a very useful way to visualise motion and the relations between events.
2.2.1 Some definitions

Spacetime. Defined as the set of all 4-tuples of real numbers (t, x, y, z).
For simplicity (in order to be able to visualise it) we confine ourselves to 2 dimensions: one
space and one time coordinate.
Event. Represented by a point in spacetime: i.e. E(t, x, y, z) or E(t, x).

Worldline. Defined as the set of all points that the trajectory of a particle follows in
spacetime.
2.2.2 Examples
• The worldline of a particle stationary at $x = x_0$.

• The worldline of a particle moving with uniform velocity v and passing through O at
t = 0 is a straight line:
\[ x = vt \quad \text{so that} \quad t = \frac{1}{v}x. \]
Therefore the slope of the line is given by 1/v.

• The worldline of a light ray is a straight line with slope equal to 1/c. In practice
we shall usually choose c = 1 so that the slope is equal to 1.

Note. All uniformly moving particles have worldlines which are straight lines with
slopes of magnitude bigger than 1/c, or bigger than 1 if c = 1. Therefore they all lie in the shaded
region of the figure.
• The worldlines of accelerating bodies are curved. For example, for a body uniformly
accelerated from rest one has that initially the worldline is tangent to the t axis.
The upper bound for v is c. The slope of the asymptotic motion is 1 (= 1/c). This
situation will be analysed in detail later on.
• The worldline of instantaneous travel is a horizontal line; however, this is forbidden
within the framework of Special Relativity.
2.3 Lorentz transformations (LT)
Consider two frames $F$ and $F'$ moving in standard configuration, i.e. $O'$ moves with
speed $v$ along the $x$-axis relative to $O$. The worldline of $O'$ in the frame $F$ is given as in
the figure:

Let observers $O$ and $O'$ carry clocks measuring $t$ and $t'$ respectively such that when
$O'$ is at $(t, vt)$ according to $O$, the clock at $O'$ registers $t' = \beta t$, where $\beta$ may be a function
of $v$. In this sense $\beta$ carries all the effect that the motion has on $t$. Note also that $\beta = 1$
for Galilean transformations.
Now consider a light ray emitted by $O$ at $t = t_1$, passing $O'$ at $t = t_2$, being reflected at
$p(t, x)$, passing $O'$ again at $t = t_3$ and received by $O$ at $t = t_4$, i.e. a round trip.

We want to relate the coordinates of the event at $p$ relative to the frames $F$ and $F'$.
In line with Einstein’s postulates assume that the speed of light is $c$ for both $O$ and
$O'$. From the perspective of $O$ the distance and time may be fixed using the so-called
radar convention:
\[ x = \tfrac{1}{2}c(t_4 - t_1), \quad t = \tfrac{1}{2}(t_4 + t_1), \]
so that
\[ x = c(t - t_1) = c(t_4 - t). \tag{2.1} \]

Similarly, since $O'$ is located at $x = vt_2$ and $x = vt_3$ when the ray passes it,
\[ x - vt_2 = c(t - t_2), \tag{2.2} \]
\[ x - vt_3 = c(t_3 - t). \tag{2.3} \]

Now, equations (2.2) and (2.3) imply, respectively,
\[ t_2 = \frac{ct - x}{c - v}, \quad t_3 = \frac{ct + x}{c + v}. \]
The corresponding times as measured by $O'$ are:
\[ t'_2 = \beta t_2 = \beta\frac{ct - x}{c - v}, \tag{2.4a} \]
\[ t'_3 = \beta t_3 = \beta\frac{ct + x}{c + v}, \tag{2.4b} \]
where it has been used that $t' = \beta t$. Therefore, the time and location of $p(t, x)$ as
measured by $O'$ (using again the radar convention) are given by:
\[ x' = \tfrac{1}{2}c(t'_3 - t'_2) = \frac{\beta c^2(x - vt)}{c^2 - v^2}, \tag{2.5a} \]
\[ t' = \tfrac{1}{2}(t'_3 + t'_2) = \frac{\beta(c^2t - vx)}{c^2 - v^2}, \tag{2.5b} \]
where equations (2.4a) and (2.4b) have been used to obtain the second equalities in the
last pair of equations.
Note. The observer $O'$ is also assuming that the velocity of light is $c$. This assumption
is inconsistent with the Galilean transformations.
Eliminating $x$ between (2.5a) and (2.5b) one obtains
\[ t = \frac{1}{\beta}\left(t' + \frac{vx'}{c^2}\right). \tag{2.6} \]
Now, the Relativity principle requires that we obtain the same result if we interchange $x$,
$x'$ and $t$, $t'$ and let $v \to -v$. Applying this idea to equation (2.5b) and equating to (2.6):
\[ t = \frac{\beta(c^2t' + vx')}{c^2 - v^2} = \frac{1}{\beta}\left(t' + \frac{vx'}{c^2}\right), \tag{2.7} \]
so that
\[ \beta = \left(1 - \frac{v^2}{c^2}\right)^{1/2}. \]
Letting $\gamma \equiv 1/\beta$, the transformation for $x'$ can be found from (2.5a):
\[ x' = \gamma(x - vt). \tag{2.8} \]
Similarly for $t'$ from equation (2.5b):
\[ t' = \gamma\left(t - \frac{vx}{c^2}\right). \]
Finally, the coordinates y and z remain the same as there is no motion in these directions.
Thus, we have obtained the so-called Lorentz transformations:
\[ x' = \gamma(x - vt), \quad t' = \gamma\left(t - \frac{vx}{c^2}\right), \quad y' = y, \quad z' = z. \tag{2.9} \]
The inverse transformation can be obtained by interchanging primed and unprimed coordinates and letting $v \to -v$ to
yield:
\[ x = \gamma(x' + vt'), \quad t = \gamma\left(t' + \frac{vx'}{c^2}\right), \quad y = y', \quad z = z'. \]
Remark. This is a special case of a more general transformation with 10 parameters. These
parameters are the 3 components of the velocity, 3 components of a shift of the origin,
3 parameters of a rotation and a further parameter fixing the origin of time. The
set of these transformations forms a group. The transformations given by (2.9) form a
1-parameter subgroup of this group called the special Lorentz group.
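The derivation above can be checked numerically. The following Python sketch compares the radar construction (2.4)-(2.5) with the closed form (2.9). The function names and the choice of units with c = 1 are assumptions made here for illustration; the variable `beta` denotes the factor β of these notes (so β = 1/γ), not the ratio v/c.

```python
import math

C = 1.0  # assumption: relativistic units with c = 1


def gamma(v, c=C):
    # gamma = 1/beta with beta = (1 - v^2/c^2)^(1/2)
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz(t, x, v, c=C):
    # Equation (2.9): returns (t', x') for frames in standard configuration.
    g = gamma(v, c)
    return g * (t - v * x / c**2), g * (x - v * t)


def lorentz_inverse(t_prime, x_prime, v, c=C):
    # The inverse transformation is (2.9) with v replaced by -v.
    return lorentz(t_prime, x_prime, -v, c)


def radar_coordinates(t, x, v, c=C):
    # Equations (2.4a)-(2.5b): rebuild (t', x') from the two clock readings of O'.
    beta = math.sqrt(1.0 - (v / c) ** 2)
    t2_prime = beta * (c * t - x) / (c - v)
    t3_prime = beta * (c * t + x) / (c + v)
    return 0.5 * (t3_prime + t2_prime), 0.5 * c * (t3_prime - t2_prime)


if __name__ == "__main__":
    print(lorentz(2.0, 1.0, 0.6))
    print(radar_coordinates(2.0, 1.0, 0.6))
```

Both calls should give the same pair up to rounding, and `lorentz_inverse` applied to the result should recover the original event.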
2.4 Hyperbolic form of the Lorentz transformations

This is a convenient representation for showing the group properties of the Lorentz transformations.
The key idea is to replace the velocity parameter $v$ by a hyperbolic parameter $\alpha$
satisfying the following:
\[ \cosh\alpha = \gamma, \quad \sinh\alpha = \frac{v}{c}\gamma, \quad \tanh\alpha = \frac{v}{c}. \]
We also require $\alpha$ and $v$ to have the same sign, since $\cosh\alpha = \cosh(-\alpha)$ does not fix the sign of $\alpha$.
The Lorentz transformation (2.9) becomes (hyperbolic form of the Lorentz transformation):
\[ x' = x\cosh\alpha - ct\sinh\alpha, \tag{2.10a} \]
\[ ct' = -x\sinh\alpha + ct\cosh\alpha, \tag{2.10b} \]
\[ y' = y, \tag{2.10c} \]
\[ z' = z. \tag{2.10d} \]
Adding and subtracting $x'$ and $ct'$ as given by (2.10a) and (2.10b) one obtains
\[ ct' + x' = e^{-\alpha}(ct + x), \tag{2.11a} \]
\[ ct' - x' = e^{\alpha}(ct - x), \tag{2.11b} \]
where it has been used that
\[ \cosh\alpha = \frac{e^{\alpha} + e^{-\alpha}}{2}, \quad \sinh\alpha = \frac{e^{\alpha} - e^{-\alpha}}{2}. \]
To show that the Lorentz transformations form a group one needs to show:
(i) there exists an identity element;
(ii) for every Lorentz transformation there exists an inverse;
(iii) the composition of Lorentz transformations is a Lorentz transformation and that
the composition is associative.
The most convenient way to verify these properties is to use the form given by (2.11a) and
(2.11b) and then check them one by one:
(i) One sees that there exists an identity Lorentz transformation corresponding to $v = 0$
($\alpha = 0$).
(ii) There exists an inverse Lorentz transformation obtained by letting $v \to -v$ ($\alpha \to -\alpha$).
(iii) Let $F''$ move with velocity $v_2$ ($\alpha_2$) relative to $F'$ and $F'$ with velocity $v_1$ ($\alpha_1$)
relative to $F$, all in standard configuration.
From (2.11a) and (2.11b) one has that
\[ ct'' + x'' = e^{-\alpha_2}(ct' + x'), \]
\[ ct'' - x'' = e^{\alpha_2}(ct' - x'), \]
\[ y'' = y', \quad z'' = z', \]
and
\[ ct' + x' = e^{-\alpha_1}(ct + x), \]
\[ ct' - x' = e^{\alpha_1}(ct - x), \]
\[ y' = y, \quad z' = z. \]
It follows then that
\[ ct'' + x'' = e^{-(\alpha_1 + \alpha_2)}(ct + x), \]
\[ ct'' - x'' = e^{(\alpha_1 + \alpha_2)}(ct - x), \]
\[ y'' = y, \quad z'' = z, \]
which shows that the composition of Lorentz transformations is a Lorentz transformation.
Since the hyperbolic parameters add, and addition of real numbers is associative, the composition is also associative.
The previous discussion also allows one to derive the Special Relativity rule for the composition
of velocities. Since the resultant of two Lorentz transformations with parameters
$\alpha_1$ and $\alpha_2$ is a Lorentz transformation with parameter $\alpha_1 + \alpha_2$, the corresponding relation
between the velocity parameters of the transformations can be easily derived from
\[ \tanh\alpha = \frac{v}{c} \]
by recalling that
\[ \tanh(\alpha_1 + \alpha_2) = \frac{\tanh\alpha_1 + \tanh\alpha_2}{1 + \tanh\alpha_1\tanh\alpha_2}. \]
Substituting for
\[ \tanh\alpha_1 = \frac{v_1}{c}, \quad \tanh\alpha_2 = \frac{v_2}{c}, \quad \tanh(\alpha_1 + \alpha_2) = \frac{v}{c}, \]
one obtains
\[ v = \frac{v_1 + v_2}{1 + v_1v_2/c^2}, \tag{2.12} \]
where $v$ is the velocity of $F''$ relative to $F$; it represents the relativistic sum of collinear
velocities $v_1$ and $v_2$ along the $x$-axis. A generalisation of this rule will be discussed later.
Remark 1. When
\[ \frac{v_1}{c} \ll 1, \quad \frac{v_2}{c} \ll 1, \]
then equation (2.12) takes the Galilean form
\[ v = v_1 + v_2. \]
Remark 2. Since | tanh α| < 1, it follows that v always satisfies |v| < c.
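The composition rule (2.12) and its origin in the additivity of the hyperbolic parameter can be compared numerically. In this Python sketch the function names are hypothetical and c = 1 is assumed as the default.

```python
import math


def add_velocities(v1, v2, c=1.0):
    # Equation (2.12): relativistic sum of collinear velocities.
    return (v1 + v2) / (1.0 + v1 * v2 / c**2)


def add_via_rapidity(v1, v2, c=1.0):
    # Same result obtained by adding the hyperbolic parameters,
    # using tanh(alpha) = v/c for each transformation.
    alpha = math.atanh(v1 / c) + math.atanh(v2 / c)
    return c * math.tanh(alpha)


if __name__ == "__main__":
    for v1, v2 in [(0.5, 0.5), (0.9, 0.9), (1e-6, 2e-6)]:
        print(v1, v2, add_velocities(v1, v2), add_via_rapidity(v1, v2))
```

For small speeds the output is close to the Galilean sum of Remark 1, while for speeds close to c the result stays below c, in line with Remark 2.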
2.5 The Minkowski spacetime

There are many ways to study Special Relativity. Here we take the geometrical approach
developed in 1908 by H. Minkowski. This approach naturally leads (and led Einstein!)
to General Relativity.
To gain some intuition, start with the Euclidean geometry of the 2 dimensional plane
and recall the transformation of coordinates corresponding to the rotation of Cartesian
axes by an angle $\alpha$ in such a plane:
\[ x' = x\cos\alpha + y\sin\alpha, \]
\[ y' = -x\sin\alpha + y\cos\alpha, \]
where $(x, y)$ and $(x', y')$ correspond to the coordinates of the point $P$ in the two frames.

The transformation can be deduced from the diagram by observing that:
\[ x' = OA + AB = OA + CD = OC\cos\alpha + PC\sin\alpha = x\cos\alpha + y\sin\alpha, \]
\[ y' = PB = PD - BD = PC\cos\alpha - OC\sin\alpha = -x\sin\alpha + y\cos\alpha. \]
The rotation parameter $\alpha$ can be eliminated by taking
\[ x'^2 + y'^2 = (x\cos\alpha + y\sin\alpha)^2 + (-x\sin\alpha + y\cos\alpha)^2 = x^2 + y^2. \]
Letting
\[ (OP)^2 \equiv x^2 + y^2, \tag{2.13} \]
one sees that in Euclidean space, rotations leave the distance $(OP)$ invariant. Note
also that rotations leave curves of constant distance from the origin (i.e. circles)
invariant.

Analogue for Lorentz transformations. Starting from
\[ ct' + x' = e^{-\alpha}(ct + x), \]
\[ ct' - x' = e^{\alpha}(ct - x), \]
and multiplying the two equations together one obtains (after changing the overall sign)
\[ -c^2t^2 + x^2 = -c^2t'^2 + x'^2, \]
where the choice of sign in the previous equation is a convention. Furthermore, since
$y' = y$ and $z' = z$ one obtains
\[ -c^2t^2 + x^2 + y^2 + z^2 = -c^2t'^2 + x'^2 + y'^2 + z'^2. \tag{2.14} \]
Alternatively, one could start from the Lorentz transformations written for coordinate differences,
\[ \Delta t' = \gamma\left(\Delta t - \frac{v\Delta x}{c^2}\right), \quad \Delta x' = \gamma(\Delta x - v\Delta t), \quad \Delta y' = \Delta y, \quad \Delta z' = \Delta z, \]
and taking the infinitesimal limit of the analogue of equation (2.14) one obtains
\[ -c^2dt^2 + dx^2 + dy^2 + dz^2 = -c^2dt'^2 + dx'^2 + dy'^2 + dz'^2. \tag{2.15} \]
Therefore
\[ -c^2dt^2 + dx^2 + dy^2 + dz^2 \]
remains invariant under Lorentz transformations (boosts).
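The invariance of the interval can be illustrated numerically with a few boosts. The following is a minimal Python sketch, assuming c = 1 by default; the helper names `boost` and `interval2` are introduced here for illustration only.

```python
import math


def boost(t, x, v, c=1.0):
    # Lorentz boost along the x-axis; returns (t', x').
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (t - v * x / c**2), g * (x - v * t)


def interval2(dt, dx, dy=0.0, dz=0.0, c=1.0):
    # The invariant quantity -c^2 dt^2 + dx^2 + dy^2 + dz^2.
    return -(c * dt) ** 2 + dx**2 + dy**2 + dz**2


if __name__ == "__main__":
    dt, dx = 3.0, 1.0  # separation between two events in the frame F
    for v in (0.0, 0.3, 0.6, -0.9):
        dt_p, dx_p = boost(dt, dx, v)  # the transformation is linear, so it acts on differences too
        print(v, interval2(dt_p, dx_p))
```

Every line printed should show the same value of the interval (up to rounding) as in the original frame, whatever the boost speed.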

Remark 1. The value of c is unit dependent. Often, relativists choose units (relativistic
units) such that c = 1. That is, distance is measured in light seconds —the distance
travelled by light in 1 second. From now on we shall put c = 1. Subsequent formulae
may be put “right” dimensionally by putting the missing c’s back on dimensional
grounds.
Remark 2. With c = 1 one has that the invariant in equation (2.15) reduces to
\[ -dt^2 + dx^2 + dy^2 + dz^2, \]
which, apart from the negative sign, is very similar to the Euclidean distance in 4 dimensions
\[ dl^2 = dx^2 + dy^2 + dz^2 + dw^2. \]
Furthermore, they both remain invariant under coordinate transformations: Lorentz
transformations and rotations, respectively. This invariant quantity is called the interval
$ds^2$ (or line element) in a new type of geometry called the Minkowski geometry or
spacetime. It is then described by
\[ ds^2 = -dt^2 + dx^2 + dy^2 + dz^2. \]
The latter measures the “distance” between events $(t, x, y, z)$ and $(t + dt, x + dx, y + dy, z + dz)$
in spacetime.
Note. As opposed to Euclidean geometry, the set of points at equal “distance” from
the origin defines a hyperbola:
\[ x^2 - t^2 = D, \quad D \text{ a constant}. \]

2.6 Minkowski diagrams

The consequences of Special Relativity are best visualised using Minkowski diagrams.
These are pictures in Minkowski spacetime (usually x − t pictures). As an example let
us look at the positions of the $x'$ and $t'$ axes relative to the $x$ and $t$ axes.
The $x'$ axis (i.e. $t' = 0$) is given by ($c = 1$):
\[ t' = \gamma(t - vx) = 0, \quad \text{so that} \quad t = vx. \]
Similarly, the $t'$ axis (i.e. $x' = 0$) is given by
\[ x' = \gamma(x - vt) = 0, \quad \text{so that} \quad t = \frac{1}{v}x. \]

One can also ask what is seen in the reference frame $F'$. For this one can use the
inverse Lorentz transformations
\[ t = \gamma(t' + vx'), \quad x = \gamma(x' + vt'). \]
The $x$ and $t$ axes from the point of view of the frame $F'$ are given, respectively, by
\[ t' = -vx', \quad t' = -\frac{1}{v}x'. \]
Thus, the picture from the point of view of $F'$ is the following:

This picture is consistent with the Principle of Relativity —all frames of reference
are equivalent and should provide an equivalent picture! We shall see further examples
of this symmetry in the sequel.
2.7 Index notation

In what follows let
\[ (t, x, y, z) = (x^0, x^1, x^2, x^3), \]
where the index position is a convention (more about this later). Writing
\[ x^a, \quad (a = 0, 1, 2, 3) \]
for $x^0, x^1, x^2, x^3$ we may write the interval in (2.15), with $c = 1$, as
\[ ds^2 = \sum_{a=0}^{3}\sum_{b=0}^{3}\eta_{ab}\,dx^a dx^b, \tag{2.16} \]
where $\eta_{ab}$ is called the Minkowski metric tensor given by
\[ (\eta_{ab}) \equiv \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \]
so that
\[ \eta_{11} = \eta_{22} = \eta_{33} = 1, \quad \eta_{00} = -1, \]
while all other $\eta_{ab}$’s are zero.
In order to drop clumsy summations hereafter we will use the so-called Einstein
summation convention:
(i) Whenever an index is repeated (appears exactly twice) in a term, it is understood
to imply summation over that index over all its permissible values. In this course
lower case Latin indices a, b, . . . take values 0, 1, 2, 3. Hence equation (2.16) may
be written
\[ ds^2 = \eta_{ab}\,dx^a dx^b. \]
(ii) Repeated indices are called dummy indices since they may be replaced by another
index (from the same alphabet!) not already used. For example:
\[ ds^2 = \eta_{ab}\,dx^a dx^b = \eta_{cd}\,dx^c dx^d. \]
(iii) To avoid ambiguity, no index should appear more than twice in the same term.
So
\[ a_i b_i c_i \]
is not allowed!
(iv) Indices that occur only once in an expression (or in each term of an equation) are called
free indices. In an equation such indices must match in every term. For example consider
\[ A_i B^i C_j = D_j. \]
Notice that i is a dummy index and that j is a free index.
Examples
For simplicity in the following examples let the Latin lower case indices take values 1, 2.
(1)
\[ A^i B^j: \quad A^1B^1, \; A^1B^2, \; A^2B^1, \; A^2B^2, \]
as i, j are free indices.
(2)
\[ A^i B_i = \sum_{i=1}^{2} A^i B_i = A^1B_1 + A^2B_2, \]
as i is a dummy index.
(3)
\[ g_{ij}: \quad g_{11}, \; g_{12}, \; g_{21}, \; g_{22}, \]
as, again, i, j are free indices.
(4) In $\Gamma^i{}_{jk}$ all indices are free. There are 8 components: $\Gamma^1{}_{11}, \Gamma^1{}_{12}, \ldots$
(5) In $R^i{}_{jkl}$ all indices are free and there are 16 components: $R^1{}_{111}, R^1{}_{112}, R^1{}_{122}, \ldots$
(6)
\[ \Gamma^i{}_{jk}\frac{dx^j}{ds}\frac{dx^k}{ds} = \Gamma^i{}_{lm}\frac{dx^l}{ds}\frac{dx^m}{ds}, \]
as j, k (equivalently l, m) are dummy indices while i is free.
(7) $x_a y_b z^b = x_a y_c z^c$.
(8) $g_{ij}\,dx^i dx^j = g_{mn}\,dx^m dx^n = g_{11}(dx^1)^2 + g_{12}\,dx^1 dx^2 + g_{21}\,dx^2 dx^1 + g_{22}(dx^2)^2$.
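The summation convention can be made concrete by writing the implied sums as explicit loops. The following Python sketch evaluates the contraction of the Minkowski metric with two sets of components; the names `ETA`, `contract` and `line_element` are hypothetical helpers, and c = 1 is assumed.

```python
# Minkowski metric eta_ab with signature (-, +, +, +), as in (2.16).
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def contract(eta, u, w):
    # eta_ab u^a w^b: both a and b are dummy indices, so both are summed over 0..3.
    return sum(eta[a][b] * u[a] * w[b] for a in range(4) for b in range(4))


def line_element(dx):
    # ds^2 = eta_ab dx^a dx^b for the displacement dx = (dt, dx, dy, dz).
    return contract(ETA, dx, dx)


if __name__ == "__main__":
    print(line_element([2, 1, 0, 0]))               # -4 + 1
    print(contract(ETA, [1, 2, 3, 4], [5, 6, 7, 8]))
```

Renaming the loop variables a, b to c, d changes nothing in the result, which is the content of rule (ii) on dummy indices.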
2.8 4-vectors in Special Relativity

In order to write Newton’s laws in the Minkowski spacetime, we require 4-vectors.
In analogy with 3-vectors (which are invariant under the change of coordinates) we
define 4-vectors in the Minkowski 4-dimensional geometry in such a way that the resulting
calculus will have equations invariant under Lorentz transformations (boosts).
4-vectors
A 4-vector is a set of four ordered real numbers which transform in exactly the same
manner as do (t, x, y, z) under Lorentz transformations.
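Denote 4-vectors by overlines, as opposed to 3-vectors which are denoted by underlines.
In index notation
\[ \bar{A} = (A^i) \equiv (A^0, A^1, A^2, A^3). \]
The Lorentz transformation relating $A^i$ to $A'^i$ may be written as
\[ A'^i = L^i{}_j A^j \quad \text{(summation over } j\text{)}, \]
where $L^i{}_j$ is the Lorentz transformation matrix defined (with $c = 1$) as
\[ (L^i{}_j) \equiv \begin{pmatrix} L^0{}_0 & L^0{}_1 & L^0{}_2 & L^0{}_3 \\ L^1{}_0 & L^1{}_1 & L^1{}_2 & L^1{}_3 \\ L^2{}_0 & L^2{}_1 & L^2{}_2 & L^2{}_3 \\ L^3{}_0 & L^3{}_1 & L^3{}_2 & L^3{}_3 \end{pmatrix} = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
Check:
\[ A'^0 = (\gamma, -v\gamma, 0, 0)\begin{pmatrix} A^0 \\ A^1 \\ A^2 \\ A^3 \end{pmatrix} = \gamma(A^0 - vA^1). \]
So it transforms like $x^0$.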
Norm or magnitude of a 4-vector

It is defined by
\[ |\bar{A}|^2 \equiv (A^1)^2 + (A^2)^2 + (A^3)^2 - (A^0)^2 = \eta_{ab}A^aA^b, \tag{2.17} \]
which in analogy with the invariance of
\[ x^2 + y^2 + z^2 - t^2 = |\bar{x}|^2, \quad \bar{x} = (t, x, y, z), \]
is invariant.
Exercise: Show by direct substitution that the norm of a 4-vector is invariant. One has
that
\[ A'^0 = \gamma(A^0 - vA^1), \]
\[ A'^1 = \gamma(A^1 - vA^0), \]
\[ A'^2 = A^2, \quad A'^3 = A^3. \]
Hence,
\[ -(A'^0)^2 + (A'^1)^2 = \gamma^2(A^1)^2 + \gamma^2v^2(A^0)^2 - 2\gamma^2vA^1A^0 - \gamma^2(A^0)^2 - \gamma^2v^2(A^1)^2 + 2\gamma^2vA^0A^1 \]
\[ = \gamma^2(A^1)^2(1 - v^2) - \gamma^2(A^0)^2(1 - v^2) \]
\[ = (A^1)^2 - (A^0)^2, \]
where $\gamma^2(1 - v^2) = 1$ has been used in the last step.
Remark. Because of the negative sign in (2.17), the norm of a vector does not have to
be positive! A 4-vector Ā is said to be:
• timelike if |Ā|2 < 0,
• spacelike if |Ā|2 > 0,
• null if |Ā|2 = 0.
In Minkowski spacetime a null vector need not be the zero vector (the vector whose components
are all zero)! Only in a space in which the norm is positive definite is it true that $|A|^2 = 0$
implies $A = 0$.
Example: Show that $\bar{A} = (1, 1, 0, 0)$ is a null vector. A direct computation gives
\[ |\bar{A}|^2 = -(A^0)^2 + (A^1)^2 = -1 + 1 = 0. \]
Similarly for
\[ (1, -1, 0, 0), \quad (1, 0, 1, 0), \quad (1, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0), \quad \text{etc.} \]
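The classification into timelike, spacelike and null 4-vectors can be sketched in a few lines of Python. The function names and the tolerance `tol` are assumptions introduced here; the tolerance is needed because components such as 1/√2 are not represented exactly in floating point.

```python
def norm2(A):
    # Equation (2.17) with c = 1: |A|^2 = -(A^0)^2 + (A^1)^2 + (A^2)^2 + (A^3)^2.
    return -A[0] ** 2 + A[1] ** 2 + A[2] ** 2 + A[3] ** 2


def classify(A, tol=1e-12):
    # Returns "timelike", "spacelike" or "null" according to the sign of |A|^2.
    n = norm2(A)
    if abs(n) <= tol:
        return "null"
    return "timelike" if n < 0 else "spacelike"


if __name__ == "__main__":
    for A in [(1, 1, 0, 0), (1, 2 ** -0.5, 2 ** -0.5, 0), (2, 1, 0, 0), (0, 1, 0, 0)]:
        print(A, classify(A))
```

Note that the null examples all have non-zero components, in agreement with the remark that a null vector need not be the zero vector.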
The light cone

Take the hypersurface in spacetime defined by $|\bar{A}|^2 = 0$, which for $\bar{A} = (t, x, y, z)$ reads
\[ x^2 + y^2 + z^2 - t^2 = 0. \]
This is said to define a light cone at the origin, because all light rays emitted at t = 0
at the origin lie on the cone $x^2 + y^2 + z^2 = c^2t^2$. Suppressing one space dimension one has the
following figure:

Similarly, $|\bar{A}|^2 < 0$, i.e. $x^2 + y^2 + z^2 - t^2 < 0$, corresponds to the interior of the
light cone containing the t axis, while $|\bar{A}|^2 > 0$ corresponds to the exterior of the
cone.
Scalar product
The scalar product of two 4-vectors $\bar{A}$, $\bar{B}$ is defined by
\[ \bar{A}\cdot\bar{B} = \eta_{ab}A^aB^b = -A^0B^0 + A^1B^1 + A^2B^2 + A^3B^3. \]
Notice that as a consequence of this definition $|\bar{A}|^2 = \bar{A}\cdot\bar{A}$.

Example: Prove that $\bar{A}\cdot\bar{B}$ is invariant under Lorentz transformations: start with
\[ (\bar{A} + \bar{B})\cdot(\bar{A} + \bar{B}) = |\bar{A}|^2 + |\bar{B}|^2 + 2\bar{A}\cdot\bar{B}, \]
and note that $|\bar{A}|^2$, $|\bar{B}|^2$ and $|\bar{A} + \bar{B}|^2$ are all invariants. Hence, so is $\bar{A}\cdot\bar{B}$.
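The argument of the example can be checked numerically: the scalar product is recovered from three norms, and it is unchanged by a boost. In this Python sketch the helper names are hypothetical, c = 1 is assumed, and the boost matrix is the standard-configuration matrix for motion along the x-axis.

```python
import math


def boost_matrix(v):
    # Lorentz transformation matrix for a boost with speed v along x (c = 1).
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [
        [g, -v * g, 0.0, 0.0],
        [-v * g, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(L, A):
    # A'^i = L^i_j A^j, with the sum over j written out explicitly.
    return [sum(L[i][j] * A[j] for j in range(4)) for i in range(4)]


def dot(A, B):
    # Scalar product: -A^0 B^0 + A^1 B^1 + A^2 B^2 + A^3 B^3.
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]


if __name__ == "__main__":
    A, B = [2.0, 1.0, 0.0, 3.0], [1.0, -1.0, 4.0, 0.0]
    L = boost_matrix(0.6)
    S = [a + b for a, b in zip(A, B)]
    print(dot(A, B), 0.5 * (dot(S, S) - dot(A, A) - dot(B, B)))
    print(dot(transform(L, A), transform(L, B)))
```

All three printed numbers should coincide up to rounding.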
Orthogonality
Two vectors are called orthogonal if Ā · B̄ = 0.
Note 1. Because of the nature of the Minkowski geometry, two orthogonal 4-vectors do
not appear orthogonal graphically.
Note 2. Null vectors are orthogonal to themselves (Ā · Ā = 0)!
Basic 4-vectors
In any frame $F$, there exist 4 basic vectors
$$\bar{e}_0 = (1, 0, 0, 0), \quad \bar{e}_1 = (0, 1, 0, 0), \quad \bar{e}_2 = (0, 0, 1, 0), \quad \bar{e}_3 = (0, 0, 0, 1),$$
in terms of which any 4-vector $\bar{A}$ may be expressed:
$$\bar{A} = A^i \bar{e}_i = A^0 \bar{e}_0 + A^1 \bar{e}_1 + A^2 \bar{e}_2 + A^3 \bar{e}_3,$$
where $A^i$ are the components of $\bar{A}$.

One can add and subtract 4-vectors pictorially, as is done for 3-vectors.
Example: With the help of a sketch convince yourself that the sum of two timelike or
spacelike vectors or the sum of a timelike and a spacelike vector can be null!
2.9 A brief discussion of causality

In what follows we discuss some consequences of the $x$-dependence of the Lorentz
transformation of time.

(1) Any event $E_i$ inside the light cone occurring after $O$ from the perspective of $F$ will
also occur after $O$ from the perspective of $F'$, no matter how fast $F'$ moves with
respect to $F$, so long as $v < c$. An event $E_0$ outside the light cone and occurring
after $O$ from the point of view of $F$ could occur before $O$ from the point of view
of $F'$. Therefore, outside the future (and similarly the past) light cone of $O$ there
exists no ordered time sense of events.
Given any point $O$, the spacetime is divided up into the absolute past of $O$ (the
past light cone at $O$), the absolute future of $O$ (the future light cone at $O$) and
a (spacelike) region known as the region of relative simultaneity.

(2) For invariance of causality, interactions must take place at speeds less than $c$. To
see this, consider a process in which an event $E_1$ causes an event $E_2$ at super-light
speed $u > c$ relative to some frame $F$. Choose coordinates in $F$ such that $E_1$ and
$E_2$ occur on the $x$-axis and let their space and time separations be $\Delta x > 0$, $\Delta t > 0$
(i.e. $E_1$ precedes $E_2$). Now, in a frame $F'$ moving with velocity $v$ relative to $F$
we have:
$$\Delta t' = \gamma\left(\Delta t - \frac{v}{c^2}\Delta x\right) = \gamma \Delta t\left(1 - \frac{uv}{c^2}\right),$$
where
$$u = \frac{\Delta x}{\Delta t}$$
is the speed of propagation. Now, for
$$\frac{c^2}{u} < v < c$$
we would have $\Delta t' < 0$ so that in $F'$ the event $E_2$ precedes $E_1$, i.e. cause and
effect are reversed, or we have information travelling from receiver to transmitter!
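The sign change of $\Delta t'$ can be illustrated with a few numbers. This is only a sketch of the formula above, assuming units with c = 1 by default; the function name `delta_t_prime` is illustrative.

```python
import math

def delta_t_prime(dt, u, v, c=1.0):
    """Time separation in F' of two events separated by dt and dx = u * dt in F."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * dt * (1.0 - u * v / c ** 2)

# Signal slower than light: the order of the events is preserved for every v < c.
print(delta_t_prime(1.0, 0.5, 0.9))
# Signal with u = 2c: the order is preserved for v < c^2/u = 0.5 ...
print(delta_t_prime(1.0, 2.0, 0.4))
# ... but reversed for c^2/u < v < c.
print(delta_t_prime(1.0, 2.0, 0.6))
```

Only the last call, with a superluminal signal and a sufficiently fast frame, gives a negative time separation.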
2.10 Clocks and rods in relativistic motion

We now consider the effects of uniform motion on clocks and rods.
2.10.1 Time dilation

Consider $F$ and $F'$ in standard configuration. Let a standard clock be at rest in $F'$ (at
$x' = x'_0$) and consider two events at this clock at times $t'_1$ and $t'_2$. Let also
$$\Delta t' = t'_2 - t'_1.$$
In order to find the interval $\Delta t$ as measured by $F$, recall that (with $c = 1$)
$$\Delta t = \gamma(\Delta t' + v\Delta x').$$
However, $\Delta x' = 0$ as $x'_2 = x'_1 = x'_0$. Hence one obtains
$$\Delta t = \gamma \Delta t'.$$
Since
$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}} > 1,$$
one finds that the interval as measured by $F$ is longer.
There is a symmetry! Both observers say the same thing about each other!
2.10.2 Length contraction
This is also called the Lorentz-Fitzgerald contraction. Consider $F$ and $F'$ in standard
configuration. Let a rod of length $\Delta x'$ be placed at rest along the $x'$-axis of $F'$. To find
the length as measured in $F$, we must measure the distance between the two ends of the
rod simultaneously in $F$. Consider two events occurring simultaneously at the end points
of the rod in $F$. Therefore one has $\Delta t = 0$. Now, using
$$\Delta x' = \gamma(\Delta x - v\Delta t)$$
one finds that
$$\Delta x' = \gamma \Delta x, \quad \text{or} \quad \Delta x = \frac{1}{\gamma}\Delta x'.$$
Accordingly, the length of the rod in the direction of motion as measured by $F$ is reduced
by a factor of $(1 - v^2)^{1/2}$.
Geometrically:
$F$ measures the distance between the two ends of the rod at $t = 0$, i.e. $F$ measures $OB$,
while $F'$ measures $OA$.

2.11 Paradoxes
These arise from an incautious view of the situation, and the fact that simultaneity means
different things to different observers.
The twin paradox

Consider a pair of twins A and B. Let A be stationary at the origin of $F$ whereas B moves
with speed $v$ for a time $T$ and then with speed $-v$ for an equal time and returns to A’s
position. The total elapsed time as measured by A is $2T$. Because of time dilation, the
time as measured by B is
$$\frac{2T}{\gamma} < 2T.$$
Therefore, when the twins reach the point $(0, 2T)$ in A’s frame, A is older than B.
The “paradox”: cannot B say with equal right that it was she/he who remained where
she/he was while A went on a round trip and that A should, consequently, be the younger
when they meet?

Answer: No, since there is no symmetry! The twin A remained in the same inertial
frame, but B has experienced acceleration and deceleration and therefore knows that
she/he has not been in an inertial frame! This solves the paradox.
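Note: in Minkowski spacetime the proper time elapsed along the broken path $OO_1O_2$ is less than along the straight path $OO_2$, i.e. $OO_1O_2 < OO_2$.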
2.12 Experimental evidence for Special Relativity

Clearly Special Relativity is consistent with Michelson & Morley’s experiment and its
refined versions since.
A well known test of time dilation comes from the behaviour of muons (elementary
particles formed by the collision of cosmic rays with particles in the upper atmosphere).
The mean life of muons is approximately $2.2 \times 10^{-6}$ s so that if they moved at the speed
of light they could only cover a distance of approximately 0.66 km. However, they reach
ground level from heights of about 10 km. To explain this, they must have a dilation
factor of approximately 15. This means they would have a speed of about 0.998c!
From the muons’ point of view, they have a normal lifetime; however, the depth of
the atmosphere is contracted by a factor of 15.
Time dilation can also be observed using accurate atomic clocks on board airplanes
which are then compared with fixed clocks.
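The muon estimate can be reproduced in a few lines. This is a rough sketch using the round figures quoted above (mean life 2.2 microseconds, height 10 km); the variable names are illustrative, and the calculation simply asks what dilation factor lets a muon moving at essentially the speed of light cover the full height in one mean life.

```python
import math

C = 299_792_458.0   # speed of light in m/s
TAU = 2.2e-6        # muon mean life in s (figure quoted in the text)
HEIGHT = 10_000.0   # production height in m (figure quoted in the text)

range_at_c = C * TAU                    # distance covered without time dilation
gamma_needed = HEIGHT / range_at_c      # dilation factor needed to reach the ground
beta = math.sqrt(1.0 - 1.0 / gamma_needed ** 2)  # corresponding speed v/c

print(f"range without dilation: {range_at_c:.0f} m")
print(f"required gamma: {gamma_needed:.1f}")
print(f"required v/c: {beta:.4f}")
```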
2.13 Proper time

In order to develop relativistic dynamics one requires the analogues of
$$v = \frac{dx}{dt}, \quad a = \frac{dv}{dt}, \quad F = \frac{dp}{dt},$$
etc. The problem is that in Special Relativity, $t = x^0$ is not a scalar, so that we cannot
just carry $d/dt$ over to Special Relativity.
The closest thing to $dt$ which is a scalar is the proper time interval $d\tau$ defined by
$$d\tau^2 \equiv -\frac{ds^2}{c^2} = dt^2 - dx^2 - dy^2 - dz^2 \quad (c = 1).$$
In the previous definition the minus sign is included so that $d\tau$ and $dt$ have the same
sign! The name proper time comes from the fact that for a clock at rest with respect to a moving
particle (i.e. in the particle’s rest frame, where $dx = dy = dz = 0$) one has $d\tau = dt$, i.e.
it is equal to the time elapsed on the particle’s clock.
We employ τ as the invariant measure of time for the particle.
2.14 4-velocity and 4-momentum
In order to express Newton’s laws in Special Relativity in an invariant way, we need to
express them in terms of 4-vectors.
4-velocity
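The 4-velocity of a particle is defined as a unit tangent to its worldline:
$$\bar{U} = \frac{d\bar{x}}{d\tau}, \quad U^i = \frac{dx^i}{d\tau}.$$
Remarks:
(1) From the definition of $d\tau$ one finds that
$$ds^2 = -d\tau^2 = d\bar{x} \cdot d\bar{x}$$
where $d\bar{x} = (dt, dx, dy, dz)$ so that
$$\bar{U} \cdot \bar{U} = -1. \tag{2.18}$$
So the 4-velocity as defined has unit length.
(2) From $d\tau^2 = dt^2 - dx^2 - dy^2 - dz^2$ one finds that
$$\left(\frac{d\tau}{dt}\right)^2 = 1 - \left(\frac{dx}{dt}\right)^2 - \left(\frac{dy}{dt}\right)^2 - \left(\frac{dz}{dt}\right)^2 = 1 - v^2,$$
where $\mathbf{v}$ denotes the 3-velocity relative to the frame $F$ and $v^2 = \mathbf{v} \cdot \mathbf{v}$. Hence, one
concludes that
$$\frac{dt}{d\tau} = \frac{1}{\sqrt{1 - v^2}} = \gamma(v) \quad (c = 1).$$
Now, using
$$\frac{dx}{d\tau} = \frac{dx}{dt}\frac{dt}{d\tau} = \gamma(v) v^1, \quad \text{etc.}$$
one finds that
$$\bar{U} = \left(\frac{dt}{d\tau}, \frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau}\right) = \gamma(v)(1, v^1, v^2, v^3),$$
or in short
$$\bar{U} = \gamma(v)(1, \mathbf{v}). \tag{2.19}$$
Note that the spatial part of $\bar{U}$ is essentially $\mathbf{v}$.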
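As a quick numerical illustration of (2.18) and (2.19), one can build the 4-velocity from a 3-velocity and confirm that its Minkowski norm is -1. This is a sketch under the conventions of these notes (c = 1, signature (-, +, +, +)); the helper names `four_velocity` and `minkowski_dot` are illustrative.

```python
import math

def four_velocity(v3):
    """Return U = gamma(v) * (1, v) for a 3-velocity v3 = (vx, vy, vz), with c = 1."""
    vx, vy, vz = v3
    v2 = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - v2)
    return (gamma, gamma * vx, gamma * vy, gamma * vz)

def minkowski_dot(a, b):
    """Scalar product with signature (-, +, +, +)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

U = four_velocity((0.3, 0.4, 0.5))
print(minkowski_dot(U, U))  # should be -1 up to rounding error
```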
4-momentum
The 4-momentum is the natural analogue of the 3-momentum:
$$\bar{p} = m_0 \bar{U},$$
where $m_0$ denotes the mass of the particle. From the definition it follows that
$$\bar{p} \cdot \bar{p} = m_0^2\, \bar{U} \cdot \bar{U} = -m_0^2,$$
where it has been used that $\bar{U} \cdot \bar{U} = -1$. Also, using (2.19) one has
$$\bar{p} = m_0 \gamma(v)(1, \mathbf{v}). \tag{2.20}$$
It follows that the space part of (2.20) can be identified with the 3-momentum, where
by analogy $m_0 \gamma$ is called the moving mass, or the apparent mass, and $m_0$ is referred
to as the rest mass.
Let
$$m \equiv m_0 \gamma(v) = \frac{m_0}{\sqrt{1 - v^2/c^2}},$$
so that the time component of $\bar{p}$ is identified with the energy
$$E = m_0 c^2 \gamma(v).$$
One reason for this identification comes from considering the limit for small $v/c$. For
$v/c \ll 1$ one has
$$E = m_0 c^2 \gamma(v) = m_0 c^2 (1 - v^2/c^2)^{-1/2} \approx m_0 c^2 + \tfrac{1}{2} m_0 v^2,$$
where the binomial expansion has been used. Now, the second term is just the Newtonian
kinetic energy ($\tfrac{1}{2} m_0 v^2$). The first term ($m_0 c^2$) is then interpreted as the rest mass energy.
This is the famous equation
$$E_{\text{rest}} = m_0 c^2.$$
From the previous discussion one can write
$$\bar{p} = (E, \mathbf{p}), \tag{2.21}$$
with $\mathbf{p}$ the 3-momentum and $E$ the energy. From (2.21) one concludes that
$$\bar{p} \cdot \bar{p} = (E, \mathbf{p}) \cdot (E, \mathbf{p}) = -E^2 + \mathbf{p} \cdot \mathbf{p}.$$
Using (2.18) one concludes
$$E^2 - \mathbf{p} \cdot \mathbf{p} = m_0^2 \quad (c = 1).$$
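Both the energy-momentum relation and the low-speed expansion can be checked numerically. The sketch below assumes one-dimensional motion and units with c = 1; `energy_momentum` is an illustrative helper name.

```python
import math

def energy_momentum(m0, v):
    """Return (E, p) for rest mass m0 and speed v along one axis, with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return m0 * gamma, m0 * gamma * v

# Relation E^2 - p^2 = m0^2 at a relativistic speed.
E, p = energy_momentum(2.0, 0.6)
print(E * E - p * p)

# Low-speed limit: E - m0 should be close to the Newtonian kinetic energy.
E_slow, _ = energy_momentum(2.0, 0.01)
kinetic_newton = 0.5 * 2.0 * 0.01 ** 2
print(E_slow - 2.0, kinetic_newton)
```

The small residual between the last two numbers comes from the next term of the binomial expansion, which is of relative order $v^2/c^2$.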
2.15 Photons
The definition of 4-velocity given in the previous sections breaks down when applied to
particles moving with the speed of light (photons) since for light rays one has $ds^2 =
-d\tau^2 = 0$. In this case one may choose another parameter $\lambda$ and define
$$\bar{k} = \frac{d\bar{x}}{d\lambda},$$
for which $\bar{k} \cdot \bar{k} = 0$, since $\bar{k}$ is null. This also implies that $\bar{p} \cdot \bar{p} = 0$ for photons, as $\bar{p}$ is in
the direction of the tangent $\bar{k}$. Now, recalling that $\bar{p} \cdot \bar{p} = -m_0^2$, it follows that $m_0 = 0$ for photons.
Hence, particles moving with the speed of light must be massless!
Consider a photon with 4-momentum $\bar{p} = (E, \mathbf{p})$ defined relative to some frame $F$.
As seen before $\bar{p} \cdot \bar{p} = 0$, so that one finds that
$$E^2 - p^2 = 0, \quad \text{or} \quad E = p,$$
where $p = |\mathbf{p}|$. Therefore, for photons the magnitude of the spatial 3-momentum and the energy are equal. In particular,
if the photon moves along the $x$-direction one has that
$$p_x = E.$$
2.16 Doppler shift
Let $F$ and $F'$ be in standard configuration. Consider a photon of frequency $\nu$ moving in
the $x$-direction relative to the frame $F$. Relative to the frame $F'$ the energy of the photon
may be obtained using a Lorentz transformation. For this recall that $\bar{p}$ is a 4-vector and
its energy is given by its $t$-component. So, from
$$\bar{p} = (E, p_x, p_y, p_z), \quad p_y = p_z = 0,$$
one obtains
$$E' = \gamma(E - v p_x) \quad (c = 1). \tag{2.22}$$
Also, recall from Quantum Mechanics that a photon of frequency $\nu$ has energy given by
$h\nu$ where $h$ denotes Planck’s constant:
$$h = 6.626 \times 10^{-34}\ \text{J s}.$$
Similarly, one has $E' = h\nu'$. Substituting in (2.22) one obtains
$$h\nu' = \frac{h\nu - v p_x}{\sqrt{1 - v^2}}. \tag{2.23}$$
Furthermore, for such a photon $E = p_x$ so that substituting into (2.23):
$$h\nu' = \frac{h\nu - v h\nu}{\sqrt{1 - v^2}},$$
from where
$$\frac{\nu'}{\nu} = \frac{1 - v}{\sqrt{1 - v^2}} = \sqrt{\frac{1 - v}{1 + v}}.$$
Restoring the constant $c$:
$$\frac{\nu'}{\nu} = \sqrt{\frac{1 - v/c}{1 + v/c}}. \tag{2.24}$$
This is the relativistic Doppler shift formula. Note that when $v/c \ll 1$, then using the
binomial expansion in (2.24) one obtains
$$\frac{\nu'}{\nu} \approx 1 - v/c,$$
which is the usual (non-relativistic) formula for the Doppler shift.
Remark. The Doppler shift has been fundamental in Cosmology to establish the
expansion of the Universe.
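The formula (2.24) and its low-speed limit can be compared numerically. A minimal sketch, with `doppler_ratio` an illustrative name and `beta` standing for v/c:

```python
import math

def doppler_ratio(beta):
    """Return nu'/nu from (2.24) for a source/observer recession speed beta = v/c."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))

for beta in (0.001, 0.1, 0.6):
    # Compare the relativistic ratio with the non-relativistic approximation 1 - beta.
    print(beta, doppler_ratio(beta), 1.0 - beta)
```

For small `beta` the two columns agree closely; at `beta = 0.6` the relativistic ratio is 0.5, noticeably different from the non-relativistic value 0.4.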
2.17 Newton’s second law and 4-acceleration

In analogy with Newton’s second law, one needs to consider the rate of change of $\bar{p}$. For
this, use the proper time $\tau$ as the invariant candidate for time. For a particle moving with
velocity $\mathbf{v}$ relative to $F$ one has that
$$\frac{d\bar{p}}{d\tau} = \frac{d(m_0 \bar{U})}{d\tau} = \frac{d}{d\tau}\left[m_0 \gamma(v)(1, \mathbf{v})\right] = m_0 \frac{dt}{d\tau}\frac{d}{dt}\left[\gamma(v)(1, \mathbf{v})\right]. \tag{2.25}$$
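But,
$$\frac{dt}{d\tau} = \gamma(v),$$
as seen in section 2.14. Also,
$$\frac{d\gamma(v)}{dt} = \frac{d}{dt}(1 - v^2)^{-1/2} = \frac{d}{dt}(1 - \mathbf{v} \cdot \mathbf{v})^{-1/2} = -\frac{1}{2}\,\frac{-2\mathbf{v} \cdot \dfrac{d\mathbf{v}}{dt}}{(1 - \mathbf{v} \cdot \mathbf{v})^{3/2}},$$
so that
$$\frac{d\gamma(v)}{dt} = \gamma^3\, \mathbf{v} \cdot \dot{\mathbf{v}},$$
where we have written
$$\dot{\mathbf{v}} \equiv \frac{d\mathbf{v}}{dt}.$$
Substituting into (2.25):
$$\frac{d\bar{p}}{d\tau} = m_0 \gamma\left[\gamma(0, \dot{\mathbf{v}}) + \gamma^3 (\mathbf{v} \cdot \dot{\mathbf{v}})(1, \mathbf{v})\right],$$
and finally
$$\frac{d\bar{p}}{d\tau} = m_0 \gamma^4\left(\mathbf{v} \cdot \dot{\mathbf{v}},\ (1 - v^2)\dot{\mathbf{v}} + (\mathbf{v} \cdot \dot{\mathbf{v}})\mathbf{v}\right).$$
Now, for $v \ll c$ one has that $\gamma \approx 1$ and that $(\mathbf{v} \cdot \dot{\mathbf{v}})\mathbf{v}$ is of order $\dot{\mathbf{v}}\, v^2/c^2$, which is negligible compared with $\dot{\mathbf{v}}$, so that
$$\frac{d\bar{p}}{d\tau} \approx m_0 (\mathbf{v} \cdot \dot{\mathbf{v}}, \dot{\mathbf{v}}).$$
The second term (spatial part) on the right hand side of the last equation is the usual
rate of change of the 3-momentum while the time part is the rate of change of the kinetic
energy.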
4-acceleration
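The 4-acceleration is defined as $d\bar{U}/d\tau$. For $|\mathbf{v}| \ll c$ one has
$$\frac{d\bar{U}}{d\tau} \approx (\mathbf{v} \cdot \dot{\mathbf{v}}, \dot{\mathbf{v}}),$$
with the spatial part being approximately the 3-acceleration at low $v$. Differentiating
$\bar{U} \cdot \bar{U} = -1$ with respect to $\tau$ gives
$$\bar{U} \cdot \frac{d\bar{U}}{d\tau} = 0,$$
so that the 4-acceleration is orthogonal to the 4-velocity. Using the definition of
4-acceleration Newton’s second law becomes
$$\bar{F} = \frac{d\bar{p}}{d\tau},$$
where $\bar{F}$ denotes the 4-force vector. Note also that $\bar{F} \cdot \bar{U} = 0$, so that $\bar{F}$ and $\bar{U}$ are
also orthogonal. This can be seen as follows:
$$\bar{F} \cdot \bar{U} = \frac{d\bar{p}}{d\tau} \cdot \bar{U} = m_0 \frac{d\bar{U}}{d\tau} \cdot \bar{U} = 0.$$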
2.18 3-velocity and 3-acceleration
Let $F$ and $F'$ be in standard configuration, with $F'$ moving with velocity $V$ along the $x$-axis.
For simplicity, we will restrict our attention to motion along the $x$-axis. Let $v$ be the
(uniform) velocity of a particle relative to $F$. To find $v'$, the velocity relative to $F'$, recall
that:
$$v = \frac{dx}{dt}, \tag{2.26a}$$
$$v' = \frac{dx'}{dt'}, \tag{2.26b}$$
where the increments represent the distances and times between two events for the
particle relative to the two frames. Using the inverse Lorentz transformations
$$dx = \gamma(dx' + V dt'), \quad dt = \gamma(dt' + V dx'),$$
in (2.26a) one obtains
$$v = \frac{\gamma(dx' + V dt')}{\gamma(dt' + V dx')} = \frac{v' + V}{1 + v' V}.$$
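The composition law just derived can be tried on a few values. A minimal sketch, with `add_velocities` an illustrative name and c = 1 by default:

```python
def add_velocities(v_prime, V, c=1.0):
    """Velocity in F of a particle moving with v_prime in F', where F' moves with V relative to F."""
    return (v_prime + V) / (1.0 + v_prime * V / c ** 2)

print(add_velocities(0.5, 0.5))   # two half-light speeds compose to 0.8, not 1.0
print(add_velocities(1.0, 0.3))   # light moves at c in every frame
print(add_velocities(0.001, 0.002))  # close to the Galilean sum for small speeds
```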
In the sequel we will need a transformation rule for the 3-acceleration. Starting from
$$v = \frac{v' + V}{1 + v' V}$$
and calculating the differential
$$dv = \frac{dv'}{1 + v' V} - \frac{v' + V}{(1 + v' V)^2}\, V\, dv',$$
one concludes that
$$dv = \frac{1}{\gamma^2}\frac{dv'}{(1 + v' V)^2}, \tag{2.27}$$
where $\gamma = \gamma(V)$. Also, from the inverse Lorentz transformation
$$t = \gamma(t' + V x'),$$
it follows that
$$dt = \gamma(dt' + V dx') = \gamma(1 + v' V)\, dt',$$
and furthermore that
$$\frac{dv}{dt} = \frac{1}{\gamma^3 (1 + v' V)^3}\frac{dv'}{dt'}.$$
Notice that as a consequence of this formula, if the acceleration is zero in one inertial
frame, then it is zero in all inertial frames. Hence, acceleration is in a certain sense
absolute.
2.19 Uniform acceleration

In Newtonian Physics, uniform acceleration is defined as
$$\frac{dv}{dt} = \text{constant}.$$
It follows that
$$\lim_{t \to \infty} v = \infty,$$
which contradicts Special Relativity! Thus, in Special Relativity a definition of uniform

acceleration is adopted which does not suffer from this shortcoming. One defines 3-
acceleration as uniform if at each time t the acceleration of the particle relative to an
inertial frame with the same velocity as the particle has the same value —i.e. if it has
the same value in a comoving frame (a frame that is momentarily at rest). An example
of this would be a spacecraft with engine running at constant rate.
Let a constantly accelerating particle have velocity $v = v(t)$ relative to a frame $F$,
along its $x$-axis. Then, at any time $t$ the velocity of the comoving frame $F'$ (in which the
particle is stationary) is $v$. Therefore the velocity of the particle relative to $F'$ is $v' = 0$
and the 3-acceleration $dv'/dt'$ is a constant, say $a_0$.
Using the transformation rule for the acceleration deduced in the previous section,
namely,
$$\frac{dv}{dt} = \frac{1}{\gamma^3 (1 + v' v)^3}\frac{dv'}{dt'},$$
with $v' = 0$ and $dv'/dt' = a_0$ one obtains
$$\frac{dv}{dt} = \frac{a_0}{\gamma^3} = (1 - v^2)^{3/2} a_0.$$
Integrating:
$$\int_0^v \frac{dv}{(1 - v^2)^{3/2}} = a_0 \int_{t_0}^t dt \quad (v = 0 \text{ at } t = t_0),$$
one obtains
$$\frac{v}{(1 - v^2)^{1/2}} = a_0 (t - t_0).$$
Solving for $v$ one finds
$$v = \frac{dx}{dt} = \frac{a_0 (t - t_0)}{\left[1 + a_0^2 (t - t_0)^2\right]^{1/2}}.$$
Integrating once more
$$x - x_0 = \frac{1}{a_0}\left[1 + a_0^2 (t - t_0)^2\right]^{1/2} - \frac{1}{a_0},$$
which can be rewritten as
$$\frac{(x - x_0 + 1/a_0)^2}{(1/a_0)^2} - \frac{(t - t_0)^2}{(1/a_0)^2} = 1. \tag{2.28}$$
The latter is a hyperbola in the $(x, t)$ plane. For simplicity take $t_0 = 0$ and $x_0 = 1/a_0$
so that (2.28) reduces to
$$\frac{x^2}{(1/a_0)^2} - \frac{t^2}{(1/a_0)^2} = 1.$$
This formula gives different hyperbolae for different values of $a_0$.
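The closed-form solution can be checked against the differential equation it came from. The sketch below assumes c = 1; the names `velocity` and `position` are illustrative, and the derivative is approximated with a central finite difference.

```python
import math

def velocity(t, a0, t0=0.0):
    """v(t) for uniform (proper) acceleration a0, with v = 0 at t = t0 and c = 1."""
    s = a0 * (t - t0)
    return s / math.sqrt(1.0 + s * s)

def position(t, a0, t0=0.0, x0=0.0):
    """x(t) obtained by integrating velocity(), with x = x0 at t = t0."""
    s = a0 * (t - t0)
    return x0 + (math.sqrt(1.0 + s * s) - 1.0) / a0

a0, t, h = 2.0, 0.3, 1e-6
v = velocity(t, a0)
dv_dt = (velocity(t + h, a0) - velocity(t - h, a0)) / (2.0 * h)
print(dv_dt, a0 * (1.0 - v * v) ** 1.5)   # both sides of dv/dt = (1 - v^2)^(3/2) a0

# With t0 = 0 and x0 = 1/a0 the worldline lies on the hyperbola a0^2 (x^2 - t^2) = 1.
x = position(3.0, a0, x0=1.0 / a0)
print(a0 ** 2 * (x ** 2 - 3.0 ** 2))
```

Note that `velocity` stays below 1 for all times, in contrast with the Newtonian definition of uniform acceleration.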
2.20 Relativistic dynamics
In Special Relativity Newton’s laws become:
First law. Remains unchanged, except that the straight lines referred to
are now world lines in Minkowski spacetime.
Second law. One has
$$\bar{F} = \frac{d\bar{p}}{d\tau}.$$
Third law. On the basis of very precise experiments of Particle Physics, this remains
unchanged. That is, 4-momentum is conserved in collisions:
$$\sum_i \bar{p}_i = \text{constant},$$
where the sum is over the particles involved in the collision.

Note. Due to constancy of the time component, the conservation of energy with rest
mass is included in the balance!
2.21 Examples of relativistic collisions

Problems of this type can be solved by equating components, squaring and then using
further properties of $\bar{p}$.
Example 1
Consider 2 particles with rest masses $m_1$ and $m_2$ moving collinearly with
speeds $u_1$ and $u_2$. The particles collide and coalesce, with the resulting particle moving
in the same direction. The question is: what are the mass $m$ and the speed $u$ of the
resulting particle?
Recall that $\bar{p} = m\gamma(v)(1, \mathbf{v})$ for a particle of rest mass $m$ and 3-velocity $\mathbf{v}$. The initial 4-momenta are:
$$\bar{p}_1 = m_1 \gamma(u_1)(1, u_1, 0, 0), \quad \bar{p}_2 = m_2 \gamma(u_2)(1, u_2, 0, 0).$$
The final 4-momentum is
$$\bar{p} = m \gamma(u)(1, u, 0, 0).$$
The conservation of 4-momentum is expressed by
$$\bar{p} = \bar{p}_1 + \bar{p}_2. \tag{2.29}$$
Squaring
$$\bar{p}^2 = \bar{p} \cdot \bar{p} = \bar{p}_1^2 + \bar{p}_2^2 + 2\bar{p}_1 \cdot \bar{p}_2. \tag{2.30}$$
However,
$$|\bar{p}_1|^2 = -m_1^2, \quad |\bar{p}_2|^2 = -m_2^2, \quad \bar{p}_1 \cdot \bar{p}_2 = m_1 m_2 \gamma(u_1)\gamma(u_2)(-1 + u_1 u_2).$$
Substituting in (2.30):
$$m = \sqrt{m_1^2 + m_2^2 + 2 m_1 m_2 \gamma(u_1)\gamma(u_2)(1 - u_1 u_2)}. \tag{2.31}$$
Taking space and $t$-components of the 4-momenta in equation (2.29)
$$m\gamma(u) u = m_1 \gamma(u_1) u_1 + m_2 \gamma(u_2) u_2, \tag{2.32a}$$
$$m\gamma(u) = m_1 \gamma(u_1) + m_2 \gamma(u_2). \tag{2.32b}$$
Dividing (2.32a) by (2.32b) one obtains
$$u = \frac{m_1 \gamma(u_1) u_1 + m_2 \gamma(u_2) u_2}{m_1 \gamma(u_1) + m_2 \gamma(u_2)}. \tag{2.33}$$
Remark. In the limit of $u_1 \ll c$ and $u_2 \ll c$ one has that $\gamma(u_1), \gamma(u_2) \approx 1$ and that
$(1 - u_1 u_2) \approx 1$ so that (2.31) and (2.33) yield
$$m \approx m_1 + m_2, \quad u \approx \frac{m_1 u_1 + m_2 u_2}{m_1 + m_2},$$
which are the classical version of the result.
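Formulas (2.31) and (2.33) are easy to evaluate directly. The following sketch assumes c = 1 and one-dimensional motion; `gamma` and `coalesce` are illustrative names.

```python
import math

def gamma(u):
    """Lorentz factor for speed u, with c = 1."""
    return 1.0 / math.sqrt(1.0 - u * u)

def coalesce(m1, u1, m2, u2):
    """Rest mass and speed of the particle formed when two collinear particles merge."""
    g1, g2 = gamma(u1), gamma(u2)
    m = math.sqrt(m1 * m1 + m2 * m2 + 2.0 * m1 * m2 * g1 * g2 * (1.0 - u1 * u2))  # (2.31)
    u = (m1 * g1 * u1 + m2 * g2 * u2) / (m1 * g1 + m2 * g2)                        # (2.33)
    return m, u

# Head-on collision of equal masses: the product is at rest and heavier than m1 + m2.
print(coalesce(1.0, 0.6, 1.0, -0.6))
# Slow particles: close to the classical result m = 3, u = (1*0.01 + 2*0.02)/3.
print(coalesce(1.0, 0.01, 2.0, 0.02))
```

In the head-on case the kinetic energy of the incoming particles shows up as extra rest mass of the product, which is the content of the Note on energy conservation above.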
Example 2
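Consider the collision (scattering) of a photon of frequency $\nu$ moving in the $x$-direction
with an electron of mass $m_e$, in a frame in which the electron is initially at rest. Assume that the
subsequent motion remains in the $xy$ plane.
Before the collision the 4-momenta of the photon and electron are given, respectively,
by
$$\bar{p}_{p1} = (h\nu, h\nu, 0, 0), \quad \bar{p}_{e1} = m_e \gamma(0)(1, 0, 0, 0), \quad \gamma(0) = 1.$$
After the collision we have that
$$\bar{p}_{p2} = (h\nu', h\nu' \cos\alpha, h\nu' \sin\alpha, 0), \quad \bar{p}_{e2} = m_e \gamma(v)(1, v\cos\beta, v\sin\beta, 0),$$
where $\nu'$ is the new photon frequency and $\alpha$, $\beta$ are the angles that the outgoing photon and electron make with the $x$-axis, as given in the figure.
The conservation of 4-momentum gives:
$$\bar{p}_{p1} + \bar{p}_{e1} = \bar{p}_{p2} + \bar{p}_{e2}.$$
Squaring:
$$(\bar{p}_{p1} + \bar{p}_{e1} - \bar{p}_{p2}) \cdot (\bar{p}_{p1} + \bar{p}_{e1} - \bar{p}_{p2}) = \bar{p}_{e2} \cdot \bar{p}_{e2}. \tag{2.34}$$
But,
$$\bar{p}_{e1}^2 = \bar{p}_{e2}^2 = -m_e^2, \quad \bar{p}_{p1}^2 = \bar{p}_{p2}^2 = 0.$$
Substituting in (2.34) one obtains
$$\bar{p}_{e1} \cdot \bar{p}_{p1} - \bar{p}_{e1} \cdot \bar{p}_{p2} = \bar{p}_{p1} \cdot \bar{p}_{p2},$$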
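from where
$$-m_e h\nu + m_e h\nu' = h^2 \nu\nu' (\cos\alpha - 1),$$
and, restoring $c$ and using $1 - \cos\alpha = 2\sin^2(\alpha/2)$,
$$\sin^2(\alpha/2) = \frac{m_e c^2}{2h}\left(\frac{1}{\nu'} - \frac{1}{\nu}\right). \tag{2.35}$$
Similarly, to find $\beta$ rewrite (2.34) as
$$(\bar{p}_{p2} + \bar{p}_{e2} - \bar{p}_{p1}) \cdot (\bar{p}_{p2} + \bar{p}_{e2} - \bar{p}_{p1}) = \bar{p}_{e1} \cdot \bar{p}_{e1}.$$
This example shows that the photon is deflected (or scattered) by an angle $\alpha$ given by
(2.35).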
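Relation (2.35) can be inverted to give the scattered frequency for a chosen deflection angle. The sketch below is an illustration only: the function name `scattered_frequency` is hypothetical, and the SI values of the constants are standard reference values rather than figures taken from these notes.

```python
import math

H = 6.62607015e-34      # Planck's constant in J s
M_E = 9.1093837015e-31  # electron rest mass in kg
C = 299_792_458.0       # speed of light in m/s

def scattered_frequency(nu, alpha):
    """Solve (2.35) for nu' given the incoming frequency nu and the photon deflection angle alpha."""
    inverse_nu_prime = 1.0 / nu + (2.0 * H / (M_E * C ** 2)) * math.sin(alpha / 2.0) ** 2
    return 1.0 / inverse_nu_prime

# A photon whose energy equals the electron rest energy, scattered straight back,
# comes out with one third of its original frequency.
nu0 = M_E * C ** 2 / H
print(scattered_frequency(nu0, math.pi) / nu0)
```

For forward scattering (`alpha = 0`) the frequency is unchanged, and the shift grows monotonically with the deflection angle.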
Chapter 3
Prelude to General Relativity
3.1 General remarks

At the time of the development of Special Relativity, physical interactions were supposed
to be either gravitational or electromagnetic. Electromagnetism was already compatible
with Special Relativity —i.e. invariant under Lorentz transformations. On the other
hand, Newton’s laws were not.
After the development of Special Relativity, what was needed was to construct a
relativistic theory of gravity compatible with Special Relativity. The first attempts to
construct such a theory involved generalisations of Newton’s laws of gravity. For example,
Nordström developed a theory which was Lorentz invariant but which is incompatible
with the observations: it does not produce light bending.
Einstein in 1915 succeeded in constructing a theory which is both Lorentz invariant
and compatible with observations. This theory is called General Relativity. In
order to develop General Relativity, we will require some ingredients of tensor calculus.
To understand why this mathematical tool is required, we take first a look at some of
the principles that underlie the theory.
3.2 The Equivalence Principle

The Equivalence Principle amounts to the following two statements:
(1) The (equation of) motion of a (spherically symmetric) test particle (one whose
own gravitational field may be neglected) in a gravitational field is independent
of its mass and composition. The first verification of this statement is claimed
to be Galileo’s Pisa bell tower experiment, although this particular experiment
probably never took place. More recent experiments like the one by Roll, Krotkov
and Dicke (1964) have established the equality to 1 part in $10^{11}$.
(2) Matter (as well as every form of energy) is acted on by (and is itself a source of)
a gravitational field. In other words, gravity couples to everything.
An immediate consequence of (2) is that it is not possible to eliminate the force of
gravity in the same way that other forces may be eliminated, for example by
disconnecting power sources or by means of shielding as in the case of Faraday cages. The only
other forces that behave in this way are the so-called fictitious forces (i.e. the centrifugal
and Coriolis forces) which arise when non-inertial frames of reference are employed. The
important point about these forces is that like gravity, they are proportional to the mass
of the particle. This led Einstein to suspect that these and the gravitational forces should
enter the theory in the same way.
To get a better feeling for this, recall that the only way one can eliminate the force of
gravity is by choosing a freely falling frame, i.e. a frame comoving with the freely falling
particle. This can be visualised in the thought experiment (Gedankenexperiment)
sometimes referred to as the lift experiment.
The experiment suggests that there are no local experiments which distinguish
non-rotating free fall in a gravitational field from uniform motion in a space free from
gravitational fields. By local, here it is understood that the experiment is performed in a small
region such that the variation of the gravitational field is negligible (observationally).
This is another way of expressing the Equivalence Principle (all particles fall in the same
way). In this sense, Special Relativity is regained locally, in the sense that the laws of
Physics in a freely falling frame are compatible with Special Relativity. Alternatively,
one can say that spacetime is locally Minkowskian. Furthermore, for a global theory in
the presence of gravitation (i.e. GR), the geometry of spacetime must be such that it
is locally Minkowskian. The natural tool to express and implement these ideas is the
so-called tensor calculus.
3.3 Summary
In the presence of gravitational fields there exist, in small regions (locally), preferred inertial
frames (i.e. the non-rotating freely falling frames) in which the special relativistic results
hold. On a large scale, on the other hand, there are no such preferred frames, and hence
one needs to treat all large scale reference frames on the same footing. This suggests
that the laws of nature should be formulated in such a way that they are invariant under
arbitrary transformations of coordinates (i.e. reference frames), and not just the Lorentz
transformations as was the case in Special Relativity.
Interpreted physically, this is called the General Principle of Relativity as opposed to
the Special Principle of Relativity according to which laws of nature have the same form
in inertial frames.
Interpreted mathematically, it is called the principle of General Covariance —the
equations of Physics should have tensorial form.
Chapter 4
Differential Geometry and tensor calculus
In describing spacetime we wish our equations to be valid for any coordinates. Tensorial
equations satisfy this property —hence their significance.
4.1 Manifolds and coordinates

Roughly speaking, a manifold is locally equivalent to a subset of $n$-dimensional Euclidean
space $\mathbb{R}^n$, i.e. made of pieces that look like open sets in $\mathbb{R}^n$ and such that the pieces
may be glued together smoothly. This definition allows the notion of curved space to be
made precise.
Example. The 2-sphere $S^2$ is locally like $\mathbb{R}^2$, even though non-locally it
is curved and closed.
We view an $n$-dimensional manifold (also called spacetime in Relativity, where $n = 4$)
as a set of points each possessing a set of $n$ coordinates $(x^0, x^1, \ldots, x^{n-1})$ where each
coordinate ranges over a subset of the reals or the whole reals.
Note. The first coordinate is chosen to be $x^0$, consistent with the notation in Special
Relativity where $x^0 = t$.
An important feature of general manifolds is that we cannot assume that the whole
manifold can be covered with a single (non-degenerate) coordinate system, as is the
case in Euclidean or Minkowski space.
Example. On the surface of the sphere $S^2$, there are no coordinates which cover the
whole surface without degeneracy, i.e. with all images being well defined.
We shall have occasion to deal with spaces of dimension $n \geq 2$, but the cases $n = 2, 3, 4$
are of most interest.
Curves
A curve is defined as the set of points given by
$$x^i = f^i(u), \quad i = 0, 1, \ldots, n - 1,$$
with $u$ a parameter.
Subspaces
A subspace is defined as the set of points given by
$$x^i = f^i(u^1, u^2, \ldots, u^m), \quad i = 0, 1, \ldots, n - 1,$$
with $m < n$. We speak of a subspace of dimension $m < n$.
We shall call a subspace of dimension $n - 1$ a hypersurface because, like a surface in
3-dimensional space, it divides the $n$-dimensional space into 2 disjoint sets. This can be
seen as follows: one can eliminate the $n - 1$ parameters from the $n$ equations $x^i = f^i$
leaving
$$F(x^0, x^1, \ldots, x^{n-1}) = 0.$$
The points in the space not satisfying this equation fall into 2 classes: that for $F > 0$
and that for $F < 0$.
4.2 Transformation of coordinates

Assume that well behaved coordinates exist, at least in patches. Since we wish our
equations to be valid for any coordinates, one has to analyse the changes from the coordinates
$(x^a)$ to $(x'^a)$. That is, changes of the form
$$x'^a = x'^a(x^b) \equiv x'^a(x^0, \ldots, x^{n-1}) \tag{4.1}$$
or inverse transformations of the type
$$x^a = x^a(x'^b), \tag{4.2}$$
where $x^a$ and $x'^a$ refer to coordinates of a point $p$ relative to coordinate systems $F$ and
$F'$ which are no longer assumed to be inertial. We shall also assume that the functions $x^a$
and $x'^a$ are differentiable.
Differentiating (4.1) one obtains
$$dx'^a = \frac{\partial x'^a}{\partial x^0} dx^0 + \frac{\partial x'^a}{\partial x^1} dx^1 + \cdots + \frac{\partial x'^a}{\partial x^{n-1}} dx^{n-1},$$
or in a more compact form
$$dx'^a = \frac{\partial x'^a}{\partial x^b} dx^b \tag{4.3}$$
where $\partial x'^a/\partial x^b$ is the Jacobian matrix of the transformation. For example, in 2 dimensions we
have
$$\frac{\partial x'^a}{\partial x^b} = \begin{pmatrix} \dfrac{\partial x'^1}{\partial x^1} & \dfrac{\partial x'^1}{\partial x^2} \\[2ex] \dfrac{\partial x'^2}{\partial x^1} & \dfrac{\partial x'^2}{\partial x^2} \end{pmatrix}.$$
Now, $dx^a$ may be treated as an infinitesimal displacement between two neighbouring
points $p(x^a)$ and $q(x^a + dx^a)$.
Example
We may describe the plane $\mathbb{R}^2$ by Cartesian coordinates $(x^a) = (x, y)$ or polar coordinates
$(x'^a) = (r, \theta)$. We then have
$$x'^1 = r = (x^2 + y^2)^{1/2}, \quad x'^2 = \theta = \arctan(y/x).$$
The inverse transformations are given by
$$x^1 = x = r\cos\theta = x'^1 \cos x'^2, \quad x^2 = y = r\sin\theta = x'^1 \sin x'^2.$$
4.3 Contravariant vectors

The infinitesimal displacement $dx^a$ is the prototype of a class of geometrical objects
called contravariant vectors. A contravariant vector is defined as a set of $n$ quantities $V^a$
associated with a point $p$ of the manifold which under a change of coordinates transform
according to
$$V'^a = \frac{\partial x'^a}{\partial x^b} V^b, \tag{4.4}$$
that is, in the same way as differentials. Similarly, a contravariant tensor of rank or order
2 is defined as a set of $n^2$ quantities $U^{ab}$ which under a change of coordinates transform
like
$$U'^{ab} = \frac{\partial x'^a}{\partial x^c}\frac{\partial x'^b}{\partial x^d} U^{cd}. \tag{4.5}$$
In general, a contravariant tensor of rank $k$ is a set of quantities $V^{a_1 a_2 \cdots a_k}$ which
transform according to:
$$V'^{a_1 a_2 \cdots a_k} = \frac{\partial x'^{a_1}}{\partial x^{b_1}}\frac{\partial x'^{a_2}}{\partial x^{b_2}} \cdots \frac{\partial x'^{a_k}}{\partial x^{b_k}} V^{b_1 b_2 \cdots b_k}.$$
Remark. Contravariant tensors of rank $k$ are also called tensors of type $(k, 0)$, e.g. a
contravariant vector $V^a$ is referred to as a tensor of type $(1, 0)$. An important special
case is a tensor of rank 0 (type $(0, 0)$), also called a scalar or an invariant:
$$\phi' = \phi \quad \text{at } p.$$
4.4 Covariant and mixed tensors

Recall that given a real valued function (scalar field) $\phi$ on the manifold one can define
the gradient of $\phi$ by
$$\frac{\partial \phi}{\partial x^a}$$
which is also a vector. If we transform this expression to another coordinate system $\{x'^a\}$
we have:
$$\frac{\partial \phi}{\partial x'^a} = \frac{\partial \phi}{\partial x^b}\frac{\partial x^b}{\partial x'^a}$$
where the chain rule has been used. The latter is the prototype of a covariant vector or
covariant tensor of rank 1 or of type $(0, 1)$.
More precisely, a covariant vector is defined as a set of $n$ quantities $Y_b$ which transform
according to:
$$Y'_a = \frac{\partial x^b}{\partial x'^a} Y_b. \tag{4.6}$$
Similarly, a covariant tensor of rank 2 (or type $(0, 2)$) can be defined by:
$$Y'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b} Y_{cd}.$$
More generally, a covariant tensor of rank $k$ (or type $(0, k)$) is defined as:
$$Y'_{a_1 a_2 \cdots a_k} = \frac{\partial x^{b_1}}{\partial x'^{a_1}}\frac{\partial x^{b_2}}{\partial x'^{a_2}} \cdots \frac{\partial x^{b_k}}{\partial x'^{a_k}} Y_{b_1 b_2 \cdots b_k}.$$
Important remark! It is a convention to write contravariant tensors with raised indices
and covariant tensors with lowered indices.
Mixed tensors
One can also define geometric objects called mixed tensors. For example, the mixed
tensor of rank 3 with 1 contravariant and 2 covariant indices (of type $(1, 2)$) satisfies
$$Z'^a{}_{bc} = \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\frac{\partial x^g}{\partial x'^c} Z^e{}_{fg}.$$
Finally, one may define a mixed tensor of rank $(p + q)$ of type $(p, q)$, i.e. with $p$ contravariant
and $q$ covariant indices. It can be written as
$$Z^{a_1 \cdots a_p}{}_{b_1 \cdots b_q}.$$
An example
If a contravariant vector and a covariant vector have, respectively, components $A^i =
(A^1, A^2)$ and $A_i = (A_1, A_2)$ in Cartesian coordinates, find the components in polar
coordinates. In this example one has
$$(x'^1, x'^2) = (r, \theta), \quad (x^1, x^2) = (x, y).$$
Also
$$x = r\cos\theta, \quad y = r\sin\theta, \quad r = (x^2 + y^2)^{1/2}, \quad \theta = \arctan(y/x).$$
Recall also that
$$A'^i = \frac{\partial x'^i}{\partial x^j} A^j, \quad A'_i = \frac{\partial x^j}{\partial x'^i} A_j.$$
Thus, one has to compute
$$\frac{\partial x'^i}{\partial x^j}, \quad \frac{\partial x^j}{\partial x'^i}.$$
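A lengthy but straightforward calculation gives:
$$\frac{\partial x^1}{\partial x'^1} = \frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial x^2}{\partial x'^1} = \frac{\partial y}{\partial r} = \sin\theta,$$
$$\frac{\partial x^1}{\partial x'^2} = \frac{\partial x}{\partial \theta} = -r\sin\theta, \quad \frac{\partial x^2}{\partial x'^2} = \frac{\partial y}{\partial \theta} = r\cos\theta,$$
and
$$\frac{\partial x'^1}{\partial x^1} = \frac{\partial r}{\partial x} = \frac{x}{r}, \quad \frac{\partial x'^1}{\partial x^2} = \frac{\partial r}{\partial y} = \frac{y}{r},$$
$$\frac{\partial x'^2}{\partial x^1} = \frac{\partial \theta}{\partial x} = -\frac{y}{r^2}, \quad \frac{\partial x'^2}{\partial x^2} = \frac{\partial \theta}{\partial y} = \frac{x}{r^2}.$$
For the contravariant vector $A^a$ one has then that
$$A'^1 = \frac{\partial x'^1}{\partial x^a} A^a = \frac{\partial r}{\partial x} A^1 + \frac{\partial r}{\partial y} A^2 = \frac{1}{r}(x A^1 + y A^2),$$
$$A'^2 = \frac{\partial x'^2}{\partial x^a} A^a = \frac{\partial \theta}{\partial x} A^1 + \frac{\partial \theta}{\partial y} A^2 = \frac{1}{r^2}(-y A^1 + x A^2).$$
For the covariant vector $A_a$ one has
$$A'_1 = \frac{\partial x^a}{\partial x'^1} A_a = \frac{\partial x}{\partial r} A_1 + \frac{\partial y}{\partial r} A_2 = A_1 \cos\theta + A_2 \sin\theta,$$
$$A'_2 = \frac{\partial x^a}{\partial x'^2} A_a = \frac{\partial x}{\partial \theta} A_1 + \frac{\partial y}{\partial \theta} A_2 = -r\sin\theta\, A_1 + r\cos\theta\, A_2.$$
This example shows, in particular, that contravariant and covariant tensors are different
geometric objects.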
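The two transformation rules of this example can be evaluated at a concrete point. The sketch below is illustrative only: the function names are hypothetical, and `math.atan2(y, x)` is used in place of $\arctan(y/x)$ so that the angle is well defined in every quadrant. It also checks that the contraction of a covariant with a contravariant vector takes the same value in both coordinate systems.

```python
import math

def to_polar_contravariant(A, x, y):
    """Polar components (A'^1, A'^2) of a contravariant vector with Cartesian components A at (x, y)."""
    r = math.hypot(x, y)
    return ((x * A[0] + y * A[1]) / r, (-y * A[0] + x * A[1]) / r ** 2)

def to_polar_covariant(A, x, y):
    """Polar components (A'_1, A'_2) of a covariant vector with Cartesian components A at (x, y)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return (A[0] * math.cos(theta) + A[1] * math.sin(theta),
            -r * math.sin(theta) * A[0] + r * math.cos(theta) * A[1])

x, y = 3.0, 4.0
B_up = (1.0, 2.0)      # contravariant components in Cartesian coordinates
A_down = (0.5, -1.0)   # covariant components in Cartesian coordinates

B_up_polar = to_polar_contravariant(B_up, x, y)
A_down_polar = to_polar_covariant(A_down, x, y)

# The same Cartesian components give different polar components under the two rules ...
print(to_polar_contravariant(B_up, x, y), to_polar_covariant(B_up, x, y))
# ... but the contraction A_i B^i is the same number in both coordinate systems.
print(A_down[0] * B_up[0] + A_down[1] * B_up[1],
      A_down_polar[0] * B_up_polar[0] + A_down_polar[1] * B_up_polar[1])
```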
Remark 1. In Cartesian coordinates there exists no distinction between covariant and
contravariant vectors and that is why one could get away with thinking that they are the
same. In general, especially in spaces in which no global Cartesian coordinates exist, it is
important to recognize that even though $\partial\phi/\partial x^a$ is a vector, it is not the same kind of
vector as $dx^a$.
Remark 2. Tensors as we have defined them are a set of components at a point of the
manifold with particular transformation rules.
Remark 3. A tensor field is an association of a tensor of the same rank to every point
of a manifold. A tensor field is called continuous or differentiable if its components in
some coordinate system are continuous or differentiable functions of the coordinates. If
they are C ∞ , they are called smooth.
Remark 4. A vector is a tensor with one index. A scalar is a tensor with no indices.
Not all geometric objects are tensors.
Remark 5. The importance of tensors in Mathematical Physics and Relativity lies in the
fact that a tensor equation which holds in one coordinate system holds in all coordinate
systems. For example, suppose that in unprimed coordinates one has that
$$V_{ab} = W_{ab}.$$
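The transformation to primed coordinates is given by
$$V'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b} V_{cd}, \quad W'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b} W_{cd},$$
so that
$$V'_{ab} = W'_{ab}.$$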
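Remark 6. The Kronecker delta $\delta^i{}_j$ is defined by
$$\delta^i{}_j = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$
Now, to prove that $\delta^i{}_j$ is a tensor of type $(1, 1)$ we note that if it were one it should transform
as:
$$\delta'^i{}_j = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^j} \delta^a{}_b.$$
Now, substituting in the right hand side for $\delta^a{}_b$
$$\frac{\partial x'^i}{\partial x^b}\frac{\partial x^b}{\partial x'^j} = \frac{\partial x'^i}{\partial x'^j} = \delta'^i{}_j.$$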
4.5 Tensor algebra

To write equations in a covariant (tensorial) form, we need to build new tensors from
given ones. The following simple algebraic operations are useful. The trick is always to
show that a given object transforms like a tensor.
4.5.1 Addition (linear combination)

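The linear combination of tensors of the same type is a tensor of the same type. As a
first example we show that if $W^a{}_b$ and $Z^a{}_b$ are tensors of type $(1, 1)$, then so is
$$V^a{}_b \equiv c W^a{}_b + d Z^a{}_b,$$
with $c$, $d$ scalars. To see this notice that
$$\begin{aligned}
V'^a{}_b &= c W'^a{}_b + d Z'^a{}_b \\
&= c \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b} W^e{}_f + d \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b} Z^e{}_f \\
&= \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b} V^e{}_f,
\end{aligned}$$
which transforms as a tensor of type $(1, 1)$.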
4.5.2 Direct product
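The product of 2 tensors of type $(p_1, q_1)$ and $(p_2, q_2)$ is a tensor of type $(p_1 + p_2, q_1 + q_2)$
provided none of the indices are the same. As an example, if $V^a{}_b$ and $W^c$ are tensors of
type $(1, 1)$ and $(1, 0)$ respectively, show that
$$Z^a{}_b{}^c \equiv V^a{}_b W^c,$$
is a tensor of type $(2, 1)$. To see this
$$\begin{aligned}
Z'^a{}_b{}^c &= V'^a{}_b W'^c \\
&= \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b} V^e{}_f\, \frac{\partial x'^c}{\partial x^h} W^h \\
&= \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\frac{\partial x'^c}{\partial x^h} Z^e{}_f{}^h.
\end{aligned}$$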
4.5.3 Contraction
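Setting an upper and a lower index equal and summing over its values results in a
new tensor with the two indices absent. That is, one passes from a tensor of rank (p, q)
to one of rank (p − 1, q − 1). For example if $Z^a{}_b{}^{cd}$ is a tensor of type (3, 1), show that
$Z^{ac} \equiv Z^a{}_b{}^{cb}$ is a tensor of type (2, 0). To see this write
\[ Z'^{ac} = Z'^a{}_b{}^{cb} = \frac{\partial x'^a}{\partial x^e}\frac{\partial x^f}{\partial x'^b}\frac{\partial x'^c}{\partial x^g}\frac{\partial x'^b}{\partial x^h}\, Z^e{}_f{}^{gh}. \]
However, note that
\[ \frac{\partial x^f}{\partial x'^b}\frac{\partial x'^b}{\partial x^h} = \frac{\partial x^f}{\partial x^h} = \delta^f{}_h, \]
so that
\[ Z'^{ac} = \frac{\partial x'^a}{\partial x^e}\frac{\partial x'^c}{\partial x^g}\, Z^{eg}. \]
The latter is the transformation law of a tensor of rank (2, 0).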
4.5.4 Detection of tensors

Suppose one has a geometric object with indices. How does one decide if it is a tensor?
As an example, if $B^i$ is an arbitrary contravariant vector and $A_i B^i$ is an invariant scalar,
prove that $A_i$ is a covariant vector. To see this write
\[ A_i B^i = A'_i B'^i = A'_i\, \frac{\partial x'^i}{\partial x^j}\, B^j, \]
so that
\[ \left( A_j - A'_i\, \frac{\partial x'^i}{\partial x^j} \right) B^j = 0, \]
and since $B^j$ is arbitrary this implies
\[ A_j = A'_i\, \frac{\partial x'^i}{\partial x^j}, \]
which is the transformation law of a covariant vector, so that $A_j$ is indeed covariant.
4.5.5 Symmetric and antisymmetric tensors
A tensor $A_{ij}$ is said to be symmetric if
\[ A_{ij} = A_{ji}, \]
and antisymmetric (or skew) if
\[ A_{ij} = -A_{ji}. \]
Remark 1. For tensors this property is preserved under coordinate transformations
because
\[ A_{ij} \pm A_{ji} \]
are tensors by the addition property, so if they vanish in one coordinate system then they
vanish in all coordinate systems.
Remark 2. In n dimensions a symmetric tensor $A_{ij}$ has $\tfrac{1}{2} n(n + 1)$ independent
components and an antisymmetric one has $\tfrac{1}{2} n(n - 1)$ independent components.
Remark 3. Any rank 2 tensor can be expressed as the sum of a symmetric and an
antisymmetric (skew) part:
\[ A_{ij} = \tfrac{1}{2}(A_{ij} + A_{ji}) + \tfrac{1}{2}(A_{ij} - A_{ji}) = A_{(ij)} + A_{[ij]}. \]
For a tensor of higher rank one says that it is symmetric (or skew) with respect to a
pair of indices if interchanging the indices does not change the components (changes the
sign). The indices involved must be both “upstairs” or “downstairs”. For example
\[ R_{[ab][cd]} \]
implies
\[ R_{abcd} = -R_{bacd}, \qquad R_{abcd} = -R_{abdc}, \qquad R_{abcd} = R_{badc}. \]
One also defines
\[ A_{(ijk)} = \tfrac{1}{6}\left( A_{ijk} + A_{jki} + A_{kij} + A_{kji} + A_{ikj} + A_{jik} \right), \]
\[ A_{[ijk]} = \tfrac{1}{6}\left( A_{ijk} + A_{jki} + A_{kij} - A_{kji} - A_{ikj} - A_{jik} \right). \]
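The definitions of $A_{(ijk)}$ and $A_{[ijk]}$ can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `parity` and `project`, and the sample components stored in the dictionary `A`, are illustrative assumptions. It sums over the six permutations of the index positions, weighting each by its sign in the antisymmetric case.

```python
from fractions import Fraction
from itertools import permutations, product


def parity(perm):
    """Return +1 for an even permutation of (0, 1, ..., m-1) and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        # Sort in place by transpositions; each swap flips the sign.
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def project(A, n, antisymmetric):
    """Return A_(ijk) (antisymmetric=False) or A_[ijk] (antisymmetric=True) as a dict."""
    perms = list(permutations(range(3)))
    result = {}
    for idx in product(range(n), repeat=3):
        total = 0
        for p in perms:
            permuted = tuple(idx[p[k]] for k in range(3))
            weight = parity(p) if antisymmetric else 1
            total += weight * A[permuted]
        result[idx] = Fraction(total, 6)
    return result


# Hypothetical sample components in n = 3 dimensions (no symmetry assumed).
n = 3
A = {(i, j, k): (i + 1) * (j + 2) ** 2 * (k + 3) ** 3
     for i, j, k in product(range(n), repeat=3)}

sym = project(A, n, antisymmetric=False)
anti = project(A, n, antisymmetric=True)
```

With these definitions `sym` is unchanged under the interchange of any two indices, while `anti` changes sign, so that any component of `anti` with a repeated index vanishes.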
4.6 Derivatives and connections

Most dynamical laws of Physics are expressible as differential equations. A coordinate
independent formulation of such laws requires a coordinate independent definition of
derivative.
Recall that the partial derivative of a scalar function is a tensor. However, as
will be seen, the partial derivative of a higher rank tensor is not tensorial. To see this,
consider a contravariant tensor $V^a$. Its transformation law is given by
\[ V'^b = \frac{\partial x'^b}{\partial x^a}\, V^a. \]
Differentiating with respect to $x'^c$ one obtains
\[ \frac{\partial V'^b}{\partial x'^c} = \frac{\partial^2 x'^b}{\partial x^d \partial x^a}\frac{\partial x^d}{\partial x'^c}\, V^a + \frac{\partial V^a}{\partial x^d}\frac{\partial x^d}{\partial x'^c}\frac{\partial x'^b}{\partial x^a}, \]
where the chain rule
\[ \frac{\partial}{\partial x'^c} = \frac{\partial x^d}{\partial x'^c}\frac{\partial}{\partial x^d} \]
has been used. The second term in the right hand side is what one would expect if
$\partial V^a/\partial x^d$ were a tensor of second rank. It is the first term that destroys the
tensorial character!
One needs a definition of derivative which renders tensors. That is, a modification
$\nabla_a$ of $\partial_a$ which produces tensors. If $\nabla_c$ is to be a derivative one needs the following to be
satisfied:
\[ \nabla_c f = \partial_c f, \]
\[ \nabla_c (A_b + B_b) = \nabla_c A_b + \nabla_c B_b \quad \text{(linearity)}, \]
\[ \nabla_c (A_a B_b) = (\nabla_c A_a) B_b + A_a (\nabla_c B_b) \quad \text{(Leibniz rule)}. \]
The simplest modification of $\partial_c$ that satisfies the above requirements is the following:
\[ \nabla_c V^a \equiv \partial_c V^a + \Gamma^a{}_{bc} V^b, \tag{4.7} \]
where the quantity $\Gamma^a{}_{bc}$, which has $N^3$ components, is called the connection or sometimes
the affine connection. Note that its particular form has not yet been specified.
Notation. Very often we shall write equation (4.7) and similar expressions as
\[ V^a{}_{;c} = V^a{}_{,c} + \Gamma^a{}_{bc} V^b, \]
where we have introduced the comma-semicolon notation:
\[ V^a{}_{;c} \equiv \nabla_c V^a, \qquad V^a{}_{,c} \equiv \partial_c V^a. \]
Also notice that the differentiation index c comes last in the connection $\Gamma^a{}_{bc}$.
Tensorial character of the covariant differentiation

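We will now choose a transformation law for $\Gamma^a{}_{bc}$ making $\nabla_c V^a$ a tensor of type (1, 1).
Recall that we have already seen that from
\[ V'^a = \frac{\partial x'^a}{\partial x^b}\, V^b \]
it follows that
\[ V'^a{}_{,c} = \frac{\partial^2 x'^a}{\partial x^d \partial x^b}\frac{\partial x^d}{\partial x'^c}\, V^b + \frac{\partial V^b}{\partial x^d}\frac{\partial x^d}{\partial x'^c}\frac{\partial x'^a}{\partial x^b}. \]
Now, from the definition (4.7) one has
\[ V'^a{}_{;c} = V'^a{}_{,c} + \Gamma'^a{}_{bc} V'^b, \]
so that
\[ V'^a{}_{;c} = \frac{\partial V^b}{\partial x^d}\frac{\partial x^d}{\partial x'^c}\frac{\partial x'^a}{\partial x^b} + \frac{\partial^2 x'^a}{\partial x^d \partial x^b}\frac{\partial x^d}{\partial x'^c}\, V^b + \Gamma'^a{}_{bc}\, \frac{\partial x'^b}{\partial x^f}\, V^f. \tag{4.8} \]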
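Now, to ensure that $V'^a{}_{;c}$ transforms as a tensor of type (1, 1) one requires
\[ \Gamma'^a{}_{bc} = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\, \Gamma^d{}_{ef} - \frac{\partial^2 x'^a}{\partial x^d \partial x^l}\frac{\partial x^d}{\partial x'^c}\frac{\partial x^l}{\partial x'^b}. \]
The second term in this last expression cancels the second term in (4.8), as one sees by noting
that
\[ \frac{\partial x^l}{\partial x'^b}\frac{\partial x'^b}{\partial x^f} = \delta^l{}_f \]
and that f and b are dummy indices that can be relabelled:
\[ -\frac{\partial^2 x'^a}{\partial x^d \partial x^l}\frac{\partial x^d}{\partial x'^c}\frac{\partial x^l}{\partial x'^b}\frac{\partial x'^b}{\partial x^f}\, V^f = -\frac{\partial^2 x'^a}{\partial x^d \partial x^f}\frac{\partial x^d}{\partial x'^c}\, V^f. \]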
Remark 1. Clearly, $\Gamma^a{}_{bc}$ is not a tensor. Its transformation law is not homogeneous.

Remark 2. By insisting that the covariant derivative of a scalar is the partial derivative
and that the covariant differentiation satisfies the Leibniz rule, one can obtain a formula
for the covariant derivative of a covariant tensor:
\[ V_{a;b} = V_{a,b} - \Gamma^c{}_{ab} V_c. \]
In general, one has that
\[ T^{a\cdots}{}_{b\cdots;c} = T^{a\cdots}{}_{b\cdots,c} + \Gamma^a{}_{dc}\, T^{d\cdots}{}_{b\cdots} + \cdots - \Gamma^d{}_{bc}\, T^{a\cdots}{}_{d\cdots} - \cdots. \]
4.7 Parallel transport

A tensor $V^a$ is said to be parallel-transported along $W^b$ if
\[ W^b \nabla_b V^a = W^b V^a{}_{;b} = 0. \]
Now, recall that one way of characterising straight lines in Euclidean space is as curves
whose tangent vectors are parallel-transported at every point, i.e. they are autoparallels.
The notion of shortest distance in this context is not appropriate as we have not yet
defined a distance on the manifold; this will be done in the sequel.
The notion defined above can be used to define the analogue of straight lines in more
general manifolds. Such curves are referred to as affine geodesics, i.e. curves along
which the tangent vector is propagated parallel to itself.
Letting $W^b$ be tangent to a geodesic, one has that
\[ W^b \nabla_b W^a = W^b W^a{}_{;b} = 0, \]
from where
\[ W^b W^a{}_{,b} + \Gamma^a{}_{cb} W^c W^b = 0. \]
If the curve is parametrised by λ, then
\[ W^b = \frac{dx^b}{d\lambda}, \]
and since
\[ W^b \frac{\partial}{\partial x^b} = \frac{dx^b}{d\lambda}\frac{\partial}{\partial x^b} \equiv \frac{d}{d\lambda}, \]
one has that
\[ \frac{d}{d\lambda}\left( \frac{dx^a}{d\lambda} \right) + \Gamma^a{}_{bc}\, \frac{dx^c}{d\lambda}\frac{dx^b}{d\lambda} = 0, \]
and finally that
\[ \frac{d^2 x^a}{d\lambda^2} + \Gamma^a{}_{bc}\, \frac{dx^c}{d\lambda}\frac{dx^b}{d\lambda} = 0. \tag{4.9} \]
Note. From the existence and uniqueness theorems for ordinary differential equations,
it follows that corresponding to every direction at a point, there exists a unique geodesic
passing through the point. The initial conditions are given at $\lambda = \lambda_0$:
\[ x^a_0 = x^a(\lambda_0), \qquad W^a_0 = \frac{dx^a}{d\lambda}(\lambda_0). \]
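Example. Show that changing the geodesic parameter λ to σ in such a way that σ =
σ(λ), the geodesic equation only keeps its form (4.9) in σ if σ = aλ + b.
To see this recall that
\[ \frac{dx^a}{d\lambda} = \frac{dx^a}{d\sigma}\frac{d\sigma}{d\lambda}, \]
so that
\[ \frac{d^2 x^a}{d\lambda^2} = \frac{d^2 x^a}{d\sigma^2}\left( \frac{d\sigma}{d\lambda} \right)^2 + \frac{dx^a}{d\sigma}\frac{d^2\sigma}{d\lambda^2}. \]
Substituting into equation (4.9) one gets
\[ \left( \frac{d^2 x^a}{d\sigma^2} + \Gamma^a{}_{bc}\, \frac{dx^c}{d\sigma}\frac{dx^b}{d\sigma} \right) \left( \frac{d\sigma}{d\lambda} \right)^2 + \frac{dx^a}{d\sigma}\frac{d^2\sigma}{d\lambda^2} = 0, \]
which only has the form of (4.9) if
\[ \frac{d^2\sigma}{d\lambda^2} = 0. \]
That is, if
\[ \sigma = a\lambda + b, \]
with a and b constants.
A parameter of this form is called an affine parameter.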
Remark. Note that only the symmetric part of the connection coefficient is required in
the geodesic equation (4.9).
4.8 Manifolds with metric

So far, in addition to tensor fields, our manifold had a connection defined on it. This
allows for the notions of differentiation and parallelism.
There are some reasons that lead us to introduce further structure on the manifold.
Namely,
(i) from the Equivalence principle, spacetime is locally Minkowskian;
(ii) the need of an alternative notion of parallelism based on the idea of length;
(iii) finding a relation between covariant and contravariant tensors.
In order to accomplish these points we introduce the notion of metric. This essentially
amounts to defining the distance between two neighbouring points $x^a$ and $x^a + dx^a$
through an expression of the form
\[ ds^2 = g_{ab}(x)\, dx^a dx^b, \tag{4.10} \]
where $ds^2$ is called the line element or interval and $g_{ab}$ is the metric tensor. A manifold
with such a metric defined on it is called a manifold with metric.
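Remark 1. The tensor $g_{ab}$ is a tensor of type (0, 2). This follows immediately from the
scalar nature of $ds^2$ and the fact that $dx^a$ is a contravariant tensor, by the tensor
detection argument. To see this recall that
\[ dx^a = \frac{\partial x^a}{\partial x'^c}\, dx'^c, \]
so that
\[ \begin{aligned}
ds^2 &= g_{ab}\, dx^a dx^b \\
&= g_{ab}\, \frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\, dx'^c dx'^d \\
&= g'_{cd}\, dx'^c dx'^d,
\end{aligned} \]
where
\[ g'_{cd} = \frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\, g_{ab}. \]
The latter is precisely the transformation law for a tensor of type (0, 2).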
Remark 2. In order for (4.10) to determine $g_{ab}$ uniquely, $g_{ab}$ must be symmetric. Note
that if $g_{ab}$ is symmetric, then it can always be diagonalised. Let $\lambda_i$, $i = 1, \dots, N$, denote
the eigenvalues of $g_{ab}$. If all the eigenvalues of $g_{ab}$ are positive, then the metric $g_{ab}$ will
be said to be a Riemannian metric and the manifold will be said to be a Riemannian
manifold. If one of the eigenvalues is negative and the remaining ones are positive, then the
metric will be said to be Lorentzian; this is the case of relevance in Relativity. The number of
positive eigenvalues minus the number of negative eigenvalues is called the signature. For
example, the Minkowski metric of Special Relativity (see below) has signature 2, while
the standard Euclidean metric in $\mathbb{R}^4$ has signature 4.
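The signature count can be illustrated with a short Python sketch. It assumes that the metric has already been diagonalised and is given by the list of its diagonal entries; the function name `signature` is ours, chosen for illustration only.

```python
def signature(diagonal_entries):
    """Number of positive minus number of negative entries of a diagonalised metric."""
    positives = sum(1 for entry in diagonal_entries if entry > 0)
    negatives = sum(1 for entry in diagonal_entries if entry < 0)
    return positives - negatives


minkowski = [-1, 1, 1, 1]   # diagonal entries of the Minkowski metric
euclidean = [1, 1, 1, 1]    # diagonal entries of the Euclidean metric in four dimensions

minkowski_signature = signature(minkowski)
euclidean_signature = signature(euclidean)
```

The two values reproduce the signatures 2 and 4 quoted in the remark.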
Remark 3. Euclidean space with
\[ ds^2 = dx^2 + dy^2 + dz^2 + dw^2 \]
is an example of a Riemannian manifold. On the other hand, Minkowski space with
\[ ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 \]
is a special case of a Lorentzian manifold. Note that in both examples, the coefficients
$g_{ab}$ are constants. We also note that the Minkowski metric can be written in spherical
coordinates as:
\[ ds^2 = -dt^2 + dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2. \]
The definition (4.10) allows the following natural definitions of notions one had in
Minkowski space.
Norm of a contravariant vector $V^a$
This is defined via
\[ |V|^2 \equiv g_{ab} V^a V^b. \]
If $|V|^2 > 0$ (or $|V|^2 < 0$) for all non-zero vectors $V^a$, the metric is called positive definite
(or negative definite); this is the Riemannian case. Otherwise it is called indefinite; this
includes the Lorentzian case.
Scalar product between two vectors $A^a$ and $B^b$

This is defined via
\[ A \cdot B \equiv g_{ab} A^a B^b. \]
If $g_{ab} A^a B^b = 0$, then $A^a$ and $B^b$ are said to be orthogonal.
Null vectors
For indefinite metrics there are non-zero vectors that are orthogonal to themselves. That is,
\[ g_{ab} A^a A^b = 0. \]
Contravariant form of the metric

Let $g \equiv \det(g_{ab})$. If $g \neq 0$, then the inverse $g^{ab}$ of $g_{ab}$ can be defined by
\[ g_{ab}\, g^{ac} = \delta_b{}^c. \]
Defined in this way, $g^{ab}$ is a contravariant tensor of rank 2.

As an example, consider the case when $g_{ab}$ is diagonal, that is, $g_{ab} = 0$ for $a \neq b$. Then
one can show that for $g \neq 0$,
\[ g^{11} = \frac{1}{g_{11}}, \quad g^{22} = \frac{1}{g_{22}}, \quad \cdots, \quad g^{ab} = 0 \ \text{ for } a \neq b. \]
Lowering and raising of indices (index gymnastics)

One can use $g_{ab}$ and $g^{ab}$ to lower and raise indices for general tensors via the rules:
\[ T^{\cdots}{}_{\cdots a} \equiv g_{ab}\, T^{\cdots b}{}_{\cdots} \quad \text{(a is the lowered index)}, \]
\[ T^{\cdots a}{}_{\cdots} \equiv g^{ab}\, T^{\cdots}{}_{\cdots b} \quad \text{(a is the raised index)}. \]
For example
\[ g_{ac} W^{ab} = W_c{}^b, \]
\[ T^{ab} = g^{ac}\, T_c{}^b = g^{ac} g^{bd}\, T_{cd}. \]
Quite crucially, one can see that the operation of lowering and raising indices does not
add extra information to the tensors. For example, given
\[ V^b = g^{ba} V_a, \]
one has that
\[ g_{eb} V^b = g_{eb}\, g^{ba} V_a = \delta_e{}^a V_a = V_e. \]
Thus, if one raises an index and then lowers it, one recovers the original tensor.
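The round trip of lowering and then raising an index can be checked with exact arithmetic. The sketch below is only an illustration under stated assumptions: the 2-dimensional symmetric matrix `g` and the components `v_up` are made-up sample values, and the helper names `inverse_2x2`, `lower_index` and `raise_index` are hypothetical.

```python
from fractions import Fraction


def inverse_2x2(g):
    """Inverse g^{ab} of a 2x2 metric g_{ab}; requires det(g) != 0."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    if det == 0:
        raise ValueError("the metric is degenerate (g = 0)")
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def lower_index(g, v_up):
    """V_a = g_{ab} V^b."""
    return [sum(g[a][b] * v_up[b] for b in range(2)) for a in range(2)]


def raise_index(g_inv, v_down):
    """V^a = g^{ab} V_b."""
    return [sum(g_inv[a][b] * v_down[b] for b in range(2)) for a in range(2)]


# Sample symmetric, non-degenerate metric and contravariant components (illustrative).
g = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(3)]]
v_up = [Fraction(1), Fraction(2)]

g_inv = inverse_2x2(g)
v_down = lower_index(g, v_up)           # covariant components V_a
v_up_again = raise_index(g_inv, v_down) # raising the index recovers V^a
```

Because `Fraction` arithmetic is exact, `v_up_again` coincides with `v_up` exactly, mirroring the computation $g_{eb} g^{ba} V_a = V_e$.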
Connection between contravariant and covariant vectors
So far, covariant and contravariant tensors have remained unrelated. The metric can be
taken as the mapping between contravariant and covariant tensors:
\[ V_a = g_{ab} V^b, \qquad V^a = g^{ab} V_b. \]
For Euclidean space in Cartesian coordinates
\[ g_{ab} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{pmatrix}. \]
This is the reason why in this case there is no distinction between covariant and
contravariant tensors.
Remark 1. Raising and lowering of indices enable us to write equations with indices in
any position. It is for that reason that one writes a blank space above each lower index
and below each upper index. For example the contravariant version of
\[ G_{ab} = T_{ab} \]
is
\[ G^{ab} = T^{ab}. \]
Remark 2. In a general 4-dimensional Lorentzian manifold with a metric $g_{ab}$ with $g \neq 0$
the following theorem holds:
Theorem 1. Given any point p of a Lorentzian manifold, it is always possible to find
coordinates with origin at p such that
\[ g_{ab}(x) = \eta_{ab} + O(x^2). \]
That is, $g_{ab}$ agrees with the Minkowski metric up to terms of second order:
\[ g_{ab}(p) = \eta_{ab}, \qquad g_{ab,c}(p) = 0, \qquad g_{ab,cd}(p) \neq 0 \ \text{ in general}. \]
One also has
Theorem 2. In the diagonal form of $g_{ab}$ the number of components equal to +1 and to
−1 does not change under coordinate transformations. The difference between these two
numbers is called the signature of the metric.
For example, the signature of the Minkowski metric is +2.
4.9 The Levi-Civita connection

Up to now the connection $\Gamma^a{}_{bc}$ has remained undefined. We will see now that given
a metric $g_{ab}$, there is a preferred (canonical) connection. For this we will impose two
conditions on the connection.
No torsion condition
Let φ be a scalar. The usual partial derivatives acting on a scalar commute. That is,
\[ \partial_a \partial_b \phi = \partial_b \partial_a \phi. \]
This is, in general, not the case for covariant derivatives. Recall that
\[ \nabla_b \phi = \partial_b \phi, \]
so that
\[ \nabla_a \nabla_b \phi = \partial_a \partial_b \phi - \Gamma^e{}_{ba}\, \partial_e \phi, \]
\[ \nabla_b \nabla_a \phi = \partial_b \partial_a \phi - \Gamma^e{}_{ab}\, \partial_e \phi. \]
Thus, the covariant derivatives commute if and only if
\[ \Gamma^c{}_{ab} = \Gamma^c{}_{ba} = \Gamma^c{}_{(ab)}. \]
This property has a nice geometric interpretation, namely, that the parallelogram formed
by the parallel propagation of two infinitesimal displacements closes.
Constancy of the inner product upon parallel propagation

The metric $g_{ab}$ imposes a natural condition on the parallel transport. Given two vectors
$V^a$ and $W^b$ one can require that their inner product $g_{ab} V^a W^b$ remains unchanged if we
parallel transport them along any curve with tangent $T^a$. Thus, we require
\[ T^a \nabla_a (g_{bc} V^b W^c) = 0, \]
with $V^a$ and $W^b$ satisfying
\[ T^a \nabla_a V^b = 0, \qquad T^a \nabla_a W^b = 0. \]
Using the Leibniz rule one obtains
\[ T^a V^b W^c\, \nabla_a g_{bc} = 0. \]
This equation holds for all curves and all parallel-transported vectors if and only if
\[ \nabla_a g_{cd} = g_{cd;a} = 0. \]
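Theorem 3. Let $g_{ab}$ be a metric. Then there exists a unique torsion-free connection such that
\[ \nabla_a g_{bc} = 0. \]
To prove this start from
\[ 0 = \nabla_a g_{bc} = \partial_a g_{bc} - \Gamma^d{}_{ba}\, g_{dc} - \Gamma^d{}_{ca}\, g_{bd}, \]
so that, using the symmetry $\Gamma^c{}_{ab} = \Gamma^c{}_{ba}$ of the no torsion condition,
\[ \Gamma_{cab} + \Gamma_{bac} = \partial_a g_{bc}, \]
where $\Gamma_{cab} \equiv g_{dc}\, \Gamma^d{}_{ab}$. By index substitution one also has that
\[ \Gamma_{cba} + \Gamma_{abc} = \partial_b g_{ac}, \]
\[ \Gamma_{bca} + \Gamma_{acb} = \partial_c g_{ab}. \]
Adding the first two equations, subtracting the third and using the symmetry $\Gamma^c{}_{ab} = \Gamma^c{}_{ba}$
once more one finds
\[ 2\Gamma_{cab} = \partial_a g_{bc} + \partial_b g_{ac} - \partial_c g_{ab}. \]
That is,
\[ \Gamma^c{}_{ab} = \tfrac{1}{2}\, g^{cd} \left( \partial_a g_{bd} + \partial_b g_{ad} - \partial_d g_{ab} \right). \]
This is called the Levi-Civita connection of the metric $g_{ab}$.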
4.10 Metric geodesics

In Euclidean geometry, straight lines are defined as the shortest distance between any
two points. Here we give an analogue of this for a manifold with a metric.
Recall that in Lorentzian manifolds, straight lines are not those with shortest dis-
tances (intervals) between 2 points, but the longest. The generalisation of a straight line
—a geodesic line— turns out to be the curve of extremal path (i.e. maximal or minimal).
In order to find extrema, one needs some elements of calculus of variations. Let
\[ L = L(x, \dot{x}, \lambda), \qquad x = x(\lambda), \qquad \dot{x} = \frac{dx}{d\lambda}. \]
That is, L is a function of functions of λ; L is called a functional. It is assumed that L
is differentiable in $x$, $\dot{x}$, $\lambda$.
We are looking for the necessary conditions on the function x such that the integral
\[ \int_{x_1}^{x_2} L(x, \dot{x}, \lambda)\, d\lambda \]
is stationary (i.e. a maximum or a minimum) with respect to changes in the function x.

The required condition is called the Euler-Lagrange equation and takes the form
\[ \frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = 0. \tag{4.11} \]
This expression can be generalised to the case where L is a function of N independent
functions, $x^i(\lambda)$, $i = 1, \dots, N$, provided they can be varied independently. In that case
(4.11) becomes
\[ \frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}^i} \right) - \frac{\partial L}{\partial x^i} = 0, \tag{4.12} \]
corresponding to N equations, one for each value of i.
To deduce the geodesic equation we want to consider the length of the curve defined
by $\int ds$ to be stationary. Introducing a parameter λ along the curve such that
\[ \int ds = \int \frac{ds}{d\lambda}\, d\lambda, \]
the problem becomes that of finding the extremals of
\[ L = \frac{ds}{d\lambda} = \sqrt{ g_{jk}\, \frac{dx^j}{d\lambda}\frac{dx^k}{d\lambda} } = \sqrt{ g_{jk}\, \dot{x}^j \dot{x}^k }. \]
Alternatively, one can find extremals of
\[ L = \left( \frac{ds}{d\lambda} \right)^2 = g_{jk}\, \dot{x}^j \dot{x}^k. \]
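A computation renders
\[ \begin{aligned}
\frac{\partial L}{\partial \dot{x}^c} &= g_{ab}\, \frac{\partial \dot{x}^a}{\partial \dot{x}^c}\, \dot{x}^b + g_{ab}\, \dot{x}^a\, \frac{\partial \dot{x}^b}{\partial \dot{x}^c} \\
&= g_{ab}\, \delta^a{}_c\, \dot{x}^b + g_{ab}\, \dot{x}^a\, \delta^b{}_c \\
&= g_{cb}\, \dot{x}^b + g_{ac}\, \dot{x}^a = 2 g_{ac}\, \dot{x}^a.
\end{aligned} \]
Now, recall that the chain rule gives
\[ \frac{d}{d\lambda} = \frac{dx^e}{d\lambda}\frac{\partial}{\partial x^e} = \dot{x}^e \partial_e. \]
Thus,
\[ \begin{aligned}
\frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}^c} \right) &= 2\, \frac{dg_{ac}}{d\lambda}\, \dot{x}^a + 2 g_{ac}\, \frac{d\dot{x}^a}{d\lambda} \\
&= 2\, \partial_e g_{ac}\, \dot{x}^e \dot{x}^a + 2 g_{ac}\, \ddot{x}^a.
\end{aligned} \]
Finally,
\[ \frac{\partial L}{\partial x^c} = \partial_c g_{ab}\, \dot{x}^a \dot{x}^b. \]
Thus, one has that
\[ 0 = \frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}^c} \right) - \frac{\partial L}{\partial x^c} = 2 g_{ac}\, \ddot{x}^a + \left( \partial_b g_{ac} + \partial_a g_{bc} - \partial_c g_{ab} \right) \dot{x}^a \dot{x}^b. \]
Multiplying by $\tfrac{1}{2} g^{fc}$ one obtains
\[ \ddot{x}^f + \Gamma^f{}_{ab}\, \dot{x}^a \dot{x}^b = 0, \]
which can be rewritten as
\[ \frac{d^2 x^f}{d\lambda^2} + \Gamma^f{}_{ab}\, \frac{dx^a}{d\lambda}\frac{dx^b}{d\lambda} = 0. \tag{4.13} \]
This is the geodesic equation which we have met already. Thus, “straight lines” are also
extremal.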
Remark 1. In Euclidean space in Cartesian coordinates or in Minkowski space in
Minkowski coordinates all the Christoffel symbols vanish and equation (4.13) becomes
\[ \frac{d^2 x^l}{ds^2} = 0, \]
which is the usual equation for straight motion.
Remark 2. As it stands, the above equation only makes sense for spacelike curves, for
which $ds^2 > 0$. For timelike curves one uses $d\tau$ instead. Also, starting with $\int ds^2$ gives
the same geodesic equation.
Remark 3. For null geodesics, i.e. geodesics for which $ds = 0$, the curve may be
parametrised by a parameter u:
\[ \frac{d^2 x^l}{du^2} + \Gamma^l{}_{jk}\, \frac{dx^j}{du}\frac{dx^k}{du} = 0, \]
where
\[ g_{jk}\, \frac{dx^j}{du}\frac{dx^k}{du} = 0. \]
Remark 4. It can be proved that if $g_{ab}$ is Riemannian then the solutions to equation
(4.13) are curves of minimum length. On the other hand, if $g_{ab}$ is Lorentzian, then the
geodesics maximise length. Now, recall that in Special Relativity one defines the proper
time as $d\tau^2 = -ds^2/c^2$. Thus, time observed by a comoving clock always goes slower.
4.11 Calculation of Christoffel symbols and geodesic equations using the metric

There are 2 ways to do this: either directly using the definition of the Christoffel symbols
or by using the geodesic equation.
As an example consider the 2-dimensional metric
\[ ds^2 = du^2 + \cos^2 u\, dv^2. \]
4.11.1 Computation using the definition of the Christoffel symbols

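Recall that
\[ \Gamma^l{}_{ij} = \tfrac{1}{2}\, g^{lk} \left( \partial_i g_{kj} + \partial_j g_{ik} - \partial_k g_{ij} \right). \]
For our metric, with $x^1 = u$ and $x^2 = v$, one has that
\[ g_{ij} = \begin{pmatrix} 1 & 0 \\ 0 & \cos^2 u \end{pmatrix}, \qquad g^{ij} = \begin{pmatrix} 1 & 0 \\ 0 & 1/\cos^2 u \end{pmatrix}. \]
Now, for a 2-dimensional space, the Christoffel symbols have $2^3 = 8$ components:
\[ \begin{aligned}
\Gamma^1{}_{11} &= \tfrac{1}{2} g^{11} (g_{11,1} + g_{11,1} - g_{11,1}) = 0 \quad \text{since } g_{11,1} = 0, \\
\Gamma^1{}_{12} &= \tfrac{1}{2} g^{11} (g_{11,2} + g_{12,1} - g_{12,1}) = 0 \quad \text{since } g_{12} = 0 \text{ and } g_{11,2} = 0, \\
\Gamma^1{}_{21} &= 0, \\
\Gamma^1{}_{22} &= \tfrac{1}{2} g^{11} (g_{12,2} + g_{12,2} - g_{22,1}) = -\tfrac{1}{2} g^{11} g_{22,1} = \sin u \cos u, \\
\Gamma^2{}_{12} &= \tfrac{1}{2} g^{22} (g_{21,2} + g_{22,1} - g_{12,2}) = \tfrac{1}{2} g^{22} g_{22,1} = -\tan u, \\
\Gamma^2{}_{21} &= -\tan u, \\
\Gamma^2{}_{11} &= \Gamma^2{}_{22} = 0.
\end{aligned} \]
The problem of this approach is that one needs to calculate all the components, one by
one.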
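The component-by-component computation lends itself to a numerical cross-check. The following Python sketch is an illustration under stated assumptions, not the method of the notes: it approximates the derivatives $\partial_k g_{ij}$ by central finite differences (step `h`, an arbitrary choice) and evaluates the formula for $\Gamma^l{}_{ij}$ at the sample point $u = 0.3$. The function names are hypothetical, and indices are 0-based in the code, so that `gamma[0][1][1]` stands for $\Gamma^1{}_{22}$.

```python
import math


def metric(x):
    """Components g_ij of ds^2 = du^2 + cos^2(u) dv^2 at the point x = (u, v)."""
    u, _v = x
    return [[1.0, 0.0], [0.0, math.cos(u) ** 2]]


def inverse_diagonal(g):
    """Inverse of a diagonal 2x2 metric."""
    return [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]


def metric_derivative(x, k, h=1e-5):
    """Central finite-difference approximation of d g_ij / d x^k."""
    forward, backward = list(x), list(x)
    forward[k] += h
    backward[k] -= h
    g_plus, g_minus = metric(forward), metric(backward)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]


def christoffel(x):
    """gamma[l][i][j] = (1/2) g^{lk} (d_i g_kj + d_j g_ik - d_k g_ij)."""
    n = 2
    g_inv = inverse_diagonal(metric(x))
    dg = [metric_derivative(x, k) for k in range(n)]  # dg[k][i][j] = d_k g_ij
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for l in range(n):
        for i in range(n):
            for j in range(n):
                gamma[l][i][j] = 0.5 * sum(
                    g_inv[l][k] * (dg[i][k][j] + dg[j][i][k] - dg[k][i][j])
                    for k in range(n)
                )
    return gamma


u0 = 0.3
gamma = christoffel([u0, 0.0])
```

Up to finite-difference error, `gamma[0][1][1]` agrees with $\sin u \cos u$ and `gamma[1][0][1]` with $-\tan u$ at $u = 0.3$, while the remaining components are numerically zero.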
4.11.2 Computation using the Euler-Lagrange equations
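This is usually a more useful way as it gives directly the non-zero Christoffel symbols.
Let
\[ L = \left( \frac{ds}{d\lambda} \right)^2 = \dot{u}^2 + \cos^2 u\, \dot{v}^2. \]
The Euler-Lagrange equations are given by
\[ \frac{d}{d\lambda}\left( \frac{\partial L}{\partial \dot{x}^i} \right) - \frac{\partial L}{\partial x^i} = 0. \]
Look at the different components. For i = 1 one has
\[ \frac{d}{d\lambda}(2\dot{u}) - (-2 \sin u \cos u\, \dot{v}^2) = 0, \]
so that
\[ \ddot{u} + \sin u \cos u\, \dot{v}^2 = 0. \tag{4.14} \]
The latter is equivalent to (cf. (4.13)):
\[ \frac{d^2 x^1}{ds^2} + \Gamma^1{}_{jk}\, \frac{dx^j}{ds}\frac{dx^k}{ds} = 0, \]
or
\[ \frac{d^2 x^1}{ds^2} + \Gamma^1{}_{11}\, \frac{dx^1}{ds}\frac{dx^1}{ds} + \Gamma^1{}_{12}\, \frac{dx^1}{ds}\frac{dx^2}{ds} + \Gamma^1{}_{21}\, \frac{dx^2}{ds}\frac{dx^1}{ds} + \Gamma^1{}_{22}\, \frac{dx^2}{ds}\frac{dx^2}{ds} = 0. \]
However, in our case one only has $\dot{v}^2$ terms so the latter becomes
\[ \frac{d^2 x^1}{ds^2} + \Gamma^1{}_{22} \left( \frac{dx^2}{ds} \right)^2 = 0. \]
The latter in combination with (4.14) gives
\[ \Gamma^1{}_{22} = \sin u \cos u, \qquad \Gamma^1{}_{11} = \Gamma^1{}_{12} = \Gamma^1{}_{21} = 0. \]
For i = 2 one finds
\[ \frac{d}{d\lambda}\left( 2 \dot{v} \cos^2 u \right) = 0, \]
so that
\[ \ddot{v} - 2 \dot{u} \dot{v} \tan u = 0. \tag{4.15} \]
Again, from the equation for the geodesic one has that
\[ \frac{d^2 x^2}{ds^2} + \Gamma^2{}_{12}\, \frac{dx^1}{ds}\frac{dx^2}{ds} + \Gamma^2{}_{21}\, \frac{dx^2}{ds}\frac{dx^1}{ds} = 0. \]
However,
\[ \Gamma^2{}_{12} = \Gamma^2{}_{21}, \]
and hence
\[ \Gamma^2{}_{12} = \Gamma^2{}_{21} = -\tan u. \]
Finally,
\[ \Gamma^2{}_{22} = \Gamma^2{}_{11} = 0. \]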
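The geodesic equations (4.14) and (4.15) can also be explored numerically. The sketch below is a hedged illustration: it integrates the two equations with a classical fourth-order Runge-Kutta step (the integrator, the step size and the initial data are our choices, not taken from the notes) and monitors two quantities that the derivation above implies are constant along a geodesic, namely $\cos^2 u\, \dot{v}$ (from the i = 2 Euler-Lagrange equation) and $L = \dot{u}^2 + \cos^2 u\, \dot{v}^2$.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of (4.14)-(4.15) written as a first-order system."""
    u, v, du, dv = state
    ddu = -math.sin(u) * math.cos(u) * dv ** 2   # equation (4.14)
    ddv = 2.0 * math.tan(u) * du * dv            # equation (4.15)
    return [du, dv, ddu, ddv]


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = geodesic_rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = geodesic_rhs([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def conserved(state):
    """Return (cos^2(u) * dv, du^2 + cos^2(u) * dv^2)."""
    u, _v, du, dv = state
    return math.cos(u) ** 2 * dv, du ** 2 + math.cos(u) ** 2 * dv ** 2


# Illustrative initial data: (u, v, du/dlambda, dv/dlambda) at lambda = 0.
state = [0.2, 0.0, 0.1, 1.0]
start = conserved(state)
for _ in range(1000):
    state = rk4_step(state, 1e-3)
end = conserved(state)
```

Comparing `start` and `end` shows both quantities unchanged to within the integration error, which is a useful sanity check on the signs in (4.14) and (4.15).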
Chapter 5
Curvature
A novel feature of General Relativity is that it employs the notion of curved space. Our
intuition of curvature is mainly based on the curvature of 2-dimensional objects in
3-dimensional space, like spheres, saddles, etc. The notion of curvature whose definition
depends on a space of higher dimension is called extrinsic. In the case of spacetime this
notion is not useful and one requires an intrinsic notion, i.e. a definition which is independent
of the embedding space.
5.1 Intrinsic curvature and the Riemann tensor

Gauss showed that for a general 2-dimensional surface with a metric of the form
\[ ds^2 = g_{11}(x^1, x^2)\, (dx^1)^2 + g_{22}(x^1, x^2)\, (dx^2)^2, \]
where it has been assumed that $g_{12} = 0$ for simplicity, it is possible to define an intrinsic
curvature (a scalar function) which is invariant under coordinate transformations, but
varies from point to point. This is given by an expression of the form
\[ K = F[g_{ij}, \partial_k g_{ij}, \partial_l \partial_k g_{ij}], \]
such that when K = 0 the space is flat and for a sphere of radius R it gives $K = 1/R^2$.
In spaces of higher dimension we require more than one quantity at each point to
describe curvature. It turns out that the right definition involves the components of a
4-index tensor called the Riemann curvature tensor:
\[ R^a{}_{bcd} \equiv \partial_c \Gamma^a{}_{bd} - \partial_d \Gamma^a{}_{bc} + \Gamma^a{}_{ec} \Gamma^e{}_{bd} - \Gamma^a{}_{ed} \Gamma^e{}_{bc}. \tag{5.1} \]
Since the Christoffel symbols contain derivatives of the metric, one finds that the Riemann
tensor has the same form as K. Note that in flat space given by Cartesian coordinates
the Christoffel symbols vanish, and thus the Riemann tensor vanishes! If one shows
that $R^a{}_{bcd}$ is indeed a tensor, then this last statement is valid for any coordinates! This
statement is actually an if and only if statement. The hard part is to show that vanishing
curvature implies Minkowski space.
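There are many ways of motivating this formula. Here we will proceed by looking at
the commutation of covariant derivatives. Consider:
\[ \nabla_c \nabla_b V_a - \nabla_b \nabla_c V_a = V_{a;b;c} - V_{a;c;b}. \]
Now recall that
\[ V_{a;b} = V_{a,b} - \Gamma^d{}_{ab} V_d, \]
so that
\[ \begin{aligned}
V_{a;b;c} &= (V_{a;b})_{,c} - \Gamma^f{}_{ac} V_{f;b} - \Gamma^f{}_{bc} V_{a;f} \\
&= \left( V_{a,b} - \Gamma^d{}_{ab} V_d \right)_{,c} - \Gamma^f{}_{ac} \left( V_{f,b} - \Gamma^d{}_{fb} V_d \right) - \Gamma^f{}_{bc} \left( V_{a,f} - \Gamma^d{}_{af} V_d \right) \\
&= V_{a,b,c} - \partial_c \Gamma^d{}_{ab}\, V_d - \Gamma^d{}_{ab} V_{d,c} - \Gamma^f{}_{ac} V_{f,b} + \Gamma^f{}_{ac} \Gamma^d{}_{fb} V_d - \Gamma^f{}_{bc} V_{a,f} + \Gamma^f{}_{bc} \Gamma^d{}_{af} V_d.
\end{aligned} \]
Interchanging b and c in the last expression:
\[ V_{a;c;b} = V_{a,c,b} - \partial_b \Gamma^d{}_{ac}\, V_d - \Gamma^d{}_{ac} V_{d,b} - \Gamma^f{}_{ab} V_{f,c} + \Gamma^f{}_{ab} \Gamma^d{}_{fc} V_d - \Gamma^f{}_{cb} V_{a,f} + \Gamma^f{}_{cb} \Gamma^d{}_{af} V_d. \]
Thus,
\[ \begin{aligned}
V_{a;b;c} - V_{a;c;b} = {} & \left( V_{a,b,c} - V_{a,c,b} \right) \\
& + \left( \Gamma^d{}_{ac} V_{d,b} - \Gamma^f{}_{ac} V_{f,b} \right) + \left( \Gamma^f{}_{ab} V_{f,c} - \Gamma^d{}_{ab} V_{d,c} \right) \\
& + \left( \Gamma^f{}_{cb} V_{a,f} - \Gamma^f{}_{bc} V_{a,f} \right) + \left( \Gamma^f{}_{bc} \Gamma^d{}_{af} V_d - \Gamma^f{}_{cb} \Gamma^d{}_{af} V_d \right) \\
& + \partial_b \Gamma^d{}_{ac}\, V_d - \partial_c \Gamma^d{}_{ab}\, V_d + \Gamma^f{}_{ac} \Gamma^d{}_{fb} V_d - \Gamma^f{}_{ab} \Gamma^d{}_{fc} V_d.
\end{aligned} \]
The first term of the right hand side cancels out as usual partial derivatives commute. The
second and third cancel out directly, while in the fourth and fifth we use the symmetry
of the Christoffel symbols. Thus, one is left with
\[ \begin{aligned}
V_{a;b;c} - V_{a;c;b} &= V_d \left( \partial_b \Gamma^d{}_{ac} - \partial_c \Gamma^d{}_{ab} + \Gamma^f{}_{ac} \Gamma^d{}_{fb} - \Gamma^f{}_{ab} \Gamma^d{}_{fc} \right) \\
&= V_d\, R^d{}_{abc},
\end{aligned} \]
as can be seen by comparison with equation (5.1). This expression is sometimes called
the Ricci identity. Defined through this expression, it follows that $R^d{}_{abc}$ is indeed a
tensor, as the expression in the left hand side is a tensor. Alternatively, one could look
at the transformation rules of the Christoffel symbols. This is much more involved!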
Geometric interpretation
It can be shown that the change of a vector $V^c$ parallel-transported along a closed
path is proportional to the curvature (see figure). For an infinitesimal loop along the
directions given by $u^b$ and $w^d$ one has that
\[ \delta V^a = R^a{}_{cbd}\, V^c\, \delta u^b\, \delta w^d. \]
Recall that, as seen before, such a parallelogram closes (due to the no torsion condition)!
5.2 Symmetries of the curvature tensor
In general, a tensor of rank 4 has $4^4 = 256$ components (in spacetime). Symmetries,
if present, are important because they reduce the number of independent components.
Lowering the index in the definition of the Riemann tensor one obtains
\[ R_{abcd} = \partial_c \Gamma_{abd} - \partial_d \Gamma_{abc} + \Gamma_{aec} \Gamma^e{}_{bd} - \Gamma_{aed} \Gamma^e{}_{bc}, \]
where
\[ R_{abcd} = g_{af}\, R^f{}_{bcd}, \qquad \Gamma_{abd} = g_{af}\, \Gamma^f{}_{bd}. \]
Now, since $R_{abcd}$ is a tensor, it should have the same symmetries in all frames.
Accordingly, choose a locally inertial frame for which the Christoffel symbols vanish. For these
coordinates one has then that
\[ R_{abcd} = \partial_c \Gamma_{abd} - \partial_d \Gamma_{abc}. \]
Recalling that
\[ \Gamma_{abc} = \tfrac{1}{2} \left( g_{ab,c} + g_{ac,b} - g_{bc,a} \right) \]
one obtains
\[ R_{abcd} = \tfrac{1}{2} \left( g_{ad,bc} + g_{bc,ad} - g_{bd,ac} - g_{ac,bd} \right), \]
from where it is easy to read the symmetries of the tensor. It can be checked that
\[ R_{abcd} = -R_{bacd}, \qquad R_{abcd} = -R_{abdc}, \qquad R_{abcd} = R_{cdab}. \]
Furthermore,
\[ R_{abcd} + R_{adbc} + R_{acdb} = 0 \quad \text{so that} \quad R_{a[bcd]} = 0. \]
These symmetries amount to 236 constraints, so $R_{abcd}$ has only 20 independent components.
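The count of 20 can be reproduced by a short combinatorial argument, sketched here in Python as an illustration (the function name is hypothetical). The assumption behind the sketch is the standard counting: antisymmetry in each index pair leaves $m = \tfrac{1}{2} n(n-1)$ values per pair, the pair symmetry $R_{abcd} = R_{cdab}$ makes $R_{abcd}$ a symmetric $m \times m$ array, and the cyclic identity adds one further independent constraint for each choice of four distinct indices.

```python
from math import comb


def riemann_independent_components(n):
    """Independent components of R_abcd in n dimensions, counted from its symmetries."""
    pairs = comb(n, 2)                        # independent antisymmetric index pairs [ab]
    symmetric_in_pairs = pairs * (pairs + 1) // 2
    cyclic_constraints = comb(n, 4)           # one constraint per set of four distinct indices
    return symmetric_in_pairs - cyclic_constraints


counts = {n: riemann_independent_components(n) for n in range(1, 7)}
```

For n = 4 this gives 20, so the symmetries remove 256 − 20 = 236 components; the result agrees with the closed formula $n^2(n^2 - 1)/12$.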
5.3 Geodesic deviation

Parallel lines in curved space do not remain parallel when extended. To see this one considers
two nearby geodesics with tangent given by $V^a$ and a vector $\xi^a$ describing their separation.
The evolution of the separation vector $\xi^a$ is described by the equation
\[ \nabla_V \nabla_V \xi^a = R^a{}_{cdb}\, V^c V^d \xi^b, \]
which is called the geodesic deviation equation. Note that if the curvature is non-zero then
$\xi^a$ changes! In this last equation $\nabla_V$ denotes the directional derivative with respect to
$V^a$, that is, $\nabla_V \equiv V^a \nabla_a$.
Remark. The last equation shows that if particles follow geodesics (an assumption made
in General Relativity) then the tidal gravitational forces that make the trajectories
converge can be mathematically represented by the curvature of spacetime!
5.4 Bianchi identities, the Ricci and Einstein tensors
Recall that in a locally inertial frame one had that
\[ R_{abcd} = \tfrac{1}{2} \left( g_{bc,ad} + g_{ad,bc} - g_{bd,ac} - g_{ac,bd} \right). \]
Differentiating with respect to $x^e$ one obtains
\[ R_{abcd,e} = \tfrac{1}{2} \left( g_{bc,ade} + g_{ad,bce} - g_{bd,ace} - g_{ac,bde} \right). \]
Using the fact that partial derivatives commute one finds that
\[ R_{abcd,e} + R_{abec,d} + R_{abde,c} = 0, \qquad R_{ab[cd,e]} = 0. \]
Now, in a locally inertial frame the Christoffel symbols vanish so that in fact one has
that
\[ R_{abcd;e} + R_{abec;d} + R_{abde;c} = 0, \qquad R_{ab[cd;e]} = 0. \]
This tensorial equation is valid in all frames and is called the Bianchi identity. One could
have derived it by directly taking the covariant derivative of the Riemann tensor.
The Ricci tensor

The Ricci tensor is obtained by contracting the first and third indices of the Riemann
tensor:
\[ R_{bd} \equiv g^{ac} R_{abcd} = R^c{}_{bcd}. \]
Remark 1. Because of the symmetries of the Riemann tensor one has that the Ricci
tensor is symmetric. That is,
\[ R_{bd} = R_{db}. \]
Remark 2. Other contractions of the Riemann tensor vanish or give $\pm R_{bd}$. For example
$R^b{}_{bcd} = 0$ since $R_{abcd}$ is antisymmetric on a and b. Also,
\[ R^a{}_{bda} = -R^a{}_{bad} = -R_{bd}, \]
and similarly for the remaining contractions.
The Ricci scalar

The Ricci scalar is defined as the contraction of the indices of the Ricci tensor:
\[ R \equiv g^{ab} R_{ab} = g^{ac} g^{bd} R_{abcd}. \]
The Einstein tensor

In the next computations recall that $g_{ab;c} = 0$ and $g^{ab}{}_{;c} = 0$. Consider the Bianchi
identity, contracting with $g^{ac}$ and bringing $g^{ac}$ into the covariant derivative:
\[ (g^{ac} R_{abcd})_{;e} + (g^{ac} R_{abec})_{;d} + (g^{ac} R_{abde})_{;c} = 0. \tag{5.2} \]
Now,
\[ g^{ac} R_{abcd} = R_{bd}, \qquad g^{ac} R_{abec} = -g^{ac} R_{abce} = -R_{be}, \]
so that (5.2) renders
\[ R_{bd;e} - R_{be;d} + R^c{}_{bde;c} = 0. \]
Contracting on b and d:
\[ R_{;e} - R^d{}_{e;d} - R^c{}_{e;c} = 0, \tag{5.3} \]
where it has been used that
\[ g^{bd} R^c{}_{bde} = g^{bd} g^{ca} R_{abde} = -g^{ca} g^{bd} R_{bade} = -g^{ca} R_{ae} = -R^c{}_e. \]
One can rewrite equation (5.3) as
\[ (2 R^c{}_e - \delta^c{}_e R)_{;c} = 0. \tag{5.4} \]
Raising e one gets
\[ (2 R^{cd} - g^{cd} R)_{;c} = 0. \]
Defining
\[ G^{cd} \equiv R^{cd} - \tfrac{1}{2} R\, g^{cd} \]
one has
\[ G^{cd}{}_{;c} = 0. \]
The tensor $G^{cd}$ is called the Einstein tensor. Observe that one can also lower the indices:
\[ G_{fe} = R_{fe} - \tfrac{1}{2} R\, g_{fe}. \]
Remark 1. The Einstein tensor is symmetric (from the symmetries of the Ricci and
metric tensors) and therefore it has 10 independent components.
Remark 2. By construction, the Einstein tensor is divergence free.
Chapter 6
General Relativity
6.1 Towards the Einstein equations

There are several ways of motivating the Einstein equations. The most natural is perhaps
through considerations involving the Equivalence Principle. In gravitational fields there
exist local inertial frames in which Special Relativity is recovered. The equation of motion
of a free particle in such frames is:
\[ \frac{d^2 x'^a}{d\tau^2} = 0. \]
Relative to an arbitrary (accelerating) frame specified by $x^a = x^a(x'^b)$, the latter becomes:
\[ \frac{d^2 x^a}{d\tau^2} + \gamma^a{}_{bc}\, \frac{dx^b}{d\tau}\frac{dx^c}{d\tau} = 0, \]
where
\[ \gamma^a{}_{bc} = \frac{\partial x^a}{\partial x'^d}\frac{\partial^2 x'^d}{\partial x^b \partial x^c}. \]
Here the $\gamma^a{}_{bc}$ are the “fictitious” terms that arise due to the non-inertial nature of the
frame.
Now, due to the Equivalence Principle the latter implies that locally gravity is
equivalent to acceleration and this in turn gives rise to non-inertial frames. The main idea of
General Relativity is to argue that gravitation as well as inertial forces should be described
by appropriate $\gamma^a{}_{bc}$'s!
The simplest way to do this is by means of a Lorentzian manifold, as the latter is
endowed with geodesics of the required type:
\[ \frac{d^2 x^a}{d\tau^2} + \Gamma^a{}_{bc}\, \frac{dx^b}{d\tau}\frac{dx^c}{d\tau} = 0. \]
Now, if the $\Gamma^a{}_{bc}$'s are associated with gravitational forces, then the metric $g_{ab}$ may be
associated with a potential. Note that the gravitational potential in the Newtonian
theory satisfies
\[ \nabla^2 \phi = 4\pi G \rho, \qquad \rho \ \text{the density}. \]
The relativistic analogue of this equation should be tensorial and of second order in the
metric. To take this analogy further, consider two neighbouring particles with coordinates
$x^\alpha(t)$ and $x^\alpha(t) + \xi^\alpha(t)$, with $\xi^\alpha(t)$ small and $\alpha = 1, 2, 3$, moving in a gravitational field with a
potential φ. The equations of motion are then given by:
\[ \ddot{x}^\alpha = -\frac{\partial \phi(x)}{\partial x^\alpha} \]
and
\[ \ddot{x}^\alpha + \ddot{\xi}^\alpha = -\frac{\partial \phi(x)}{\partial x^\alpha} - \xi^\beta\, \frac{\partial^2 \phi}{\partial x^\alpha \partial x^\beta} + O(\xi^2). \]
Subtracting the two last equations:
\[ \ddot{\xi}^\alpha = -\xi^\beta\, \frac{\partial^2 \phi}{\partial x^\alpha \partial x^\beta}. \]
This is the relative acceleration of two test particles separated by a 3-vector $\xi^\alpha$; the
second derivative of the potential gives the tidal forces. This is in analogy to the geodesic
deviation equation:
\[ \nabla_V \nabla_V \xi^a = R^a{}_{cdb}\, V^c V^d \xi^b, \]
provided that one identifies
\[ -\xi^\beta\, \frac{\partial^2 \phi}{\partial x^\alpha \partial x^\beta} \quad \text{and} \quad R^a{}_{cdb}\, V^c V^d \xi^b. \]
This identification would make clear the relation between gravity and geometry. Note
that the Riemann tensor involves second derivatives of the metric tensor.
The main idea underlying General Relativity is that matter (including energy) curves
spacetime (assumed to be a Lorentzian manifold). This in turn affects the motion of
particles and light rays, postulated to move on timelike and null geodesics of the Lorentzian
manifold, respectively.
6.2 The principles employed in General Relativity

(1) Equivalence Principle.
(2) Principle of General Covariance. This states that laws of Nature should have
tensorial form.
(3) Principle of minimal gravitational coupling. This is used to derive the General
Relativity analogues of Special Relativity results. For this change
\[ \eta_{ab} \to g_{ab}, \qquad \partial \to \nabla. \]
For example in Special Relativity the equations for a perfect fluid are given by:
\[ T^{ab} = (\rho + p) V^a V^b - p\, \eta^{ab}, \]
\[ T^{ab}{}_{,b} = 0. \]
In General Relativity these should be changed to:
\[ T^{ab} = (\rho + p) V^a V^b - p\, g^{ab}, \]
\[ T^{ab}{}_{;b} = 0. \]
(4) Correspondence principle. General Relativity must agree with Special Relativity
in absence of gravitation and with Newtonian gravitational theory in the case of
weak gravitational fields and in the non-relativistic limit (slow speed).
6.3 The Einstein equations in vacuum
In vacuum 9such as in the outside of a body in empty space) one has that the density ρ
vanishes and the equation for the Newtonian potential becomes:
∇2 φ = 0.
The Laplace equation involves an object with two indices (∂ 2 φ/∂xi ∂xj ). As a result,
what one needs is an object with two indices —a contraction of the Riemann tensor, like
the Ricci tensor:
Rbc = 0.
The latter are called the Einstein vacuum field equations. In fact, the most general form
of the vacuum equations which is tensorial and depends linearly on second derivatives of
the metric is:
Rbc = Λgab ,
where Λ is the so-called Cosmological constant.
Remark 1. Outside Cosmology, Λ is usually taken to be zero.
Remark 2. The vacuum equations are a set of ten partial differential equations for the
components of the metric tensor gab . These are hard to solve, apart from simple settings.
Remark 3. The Einstein equations are the simplest compatible with the Equivalence
Principle, but they are not the only ones.
6.4 Newtonian limit

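Consider a slowly moving particle in a weak stationary gravitational field. Recall the geodesic equation:
\[ \frac{d^2 x^a}{d\tau^2} + \Gamma^a{}_{bc} \frac{dx^b}{d\tau}\frac{dx^c}{d\tau} = 0. \tag{6.1} \]
For a slowly moving particle $dx^\alpha/d\tau$ ($\alpha = 1, 2, 3$) may be neglected relative to $dt/d\tau$, so that (6.1) implies that
\[ \frac{d^2 x^a}{d\tau^2} + \Gamma^a{}_{00} \left(\frac{dt}{d\tau}\right)^2 = 0. \tag{6.2} \]
Since the gravitational field is assumed to be stationary, all t-derivatives of $g_{ab}$ vanish and therefore
\[ \Gamma^a{}_{00} = -\tfrac{1}{2} g^{ad} \frac{\partial g_{00}}{\partial x^d}. \tag{6.3} \]
Furthermore, since the field is weak, one may adopt a local coordinate system in which
\[ g_{ab} = \eta_{ab} + h_{ab}, \qquad |h_{ab}| \ll 1. \tag{6.4} \]
Substituting into (6.3), one has to first order in $h_{ab}$ that
\[ \Gamma^a{}_{00} = -\tfrac{1}{2} \eta^{ad} \frac{\partial h_{00}}{\partial x^d}. \]
Substituting in (6.2):
\[ \frac{d^2 x^\alpha}{d\tau^2} = \tfrac{1}{2}\left(\frac{dt}{d\tau}\right)^2 \nabla h_{00}, \qquad \nabla \equiv \eta^{\alpha\beta}\frac{\partial}{\partial x^\beta}, \tag{6.5a} \]
\[ \frac{d^2 t}{d\tau^2} = 0, \qquad \text{as } h_{00,0} = 0. \tag{6.5b} \]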
From (6.5b) it follows that $dt/d\tau$ is a constant. Also, from
\[ \frac{dx^\alpha}{d\tau} = \frac{dx^\alpha}{dt}\frac{dt}{d\tau}, \]
it follows that
\[ \frac{d^2 x^\alpha}{d\tau^2} = \frac{d^2 x^\alpha}{dt^2}\left(\frac{dt}{d\tau}\right)^2 + \frac{dx^\alpha}{dt}\frac{d^2 t}{d\tau^2}, \]
which in our case reduces to
\[ \frac{d^2 x^\alpha}{d\tau^2} = \frac{d^2 x^\alpha}{dt^2}\left(\frac{dt}{d\tau}\right)^2. \]
Combining the latter with (6.5a):
\[ \frac{d^2 x^\alpha}{dt^2} = \tfrac{1}{2}\nabla h_{00}. \tag{6.6} \]
The corresponding Newtonian result is
\[ \frac{d^2 x^\alpha}{dt^2} = -\nabla\phi, \tag{6.7} \]
where φ is the gravitational potential, which at a distance r far from a central body of mass M is given by
\[ \phi = -\frac{GM}{r}. \]
Comparing (6.6) and (6.7) one finds that
\[ h_{00} = -2\phi + \text{constant}. \]
However, at large distances from M one has that φ → 0 (gravity becomes negligible) and $g_{ab} \to \eta_{ab}$ (the space becomes flat). Therefore the constant must be zero, so that
\[ h_{00} = -2\phi. \]
Substituting in (6.4) one finds
\[ g_{00} = -(1 + 2\phi). \]
Now, recall that φ has dimensions of (velocity)², $[\phi] = [GM/R] = L^2/T^2$, so that $\phi/c^2$ is dimensionless. Its magnitude at the surface of the Earth is ∼ $10^{-9}$, on the surface of the Sun ∼ $10^{-6}$ and at the surface of a white dwarf ∼ $10^{-4}$. It follows that in most cases the distortion in $g_{ab}$ produced by gravity is very small.
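As a rough numerical check of these orders of magnitude, the following Python sketch evaluates $GM/(Rc^2)$ at the surface of the three bodies. The masses and radii are standard reference values that are not part of these notes, and the white dwarf entry (one solar mass, radius 7000 km) is only an assumed typical case.

```python
# Order-of-magnitude check of |phi|/c^2 = GM/(R c^2) at the surface of a body.
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

# (mass in kg, radius in m); the white dwarf values are an assumed typical case
BODIES = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (1.989e30, 6.957e8),
    "white dwarf (assumed)": (1.989e30, 7.0e6),
}


def surface_potential_ratio(mass, radius):
    """Return GM/(R c^2), i.e. |h_00|/2 at the surface in the weak-field limit."""
    return G * mass / (radius * C**2)


ratios = {name: surface_potential_ratio(m, r) for name, (m, r) in BODIES.items()}
for name, value in ratios.items():
    print(f"{name}: {value:.1e}")
```

The three values differ by roughly three and two orders of magnitude respectively, in line with the estimates quoted above.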
6.5 Applications of General Relativity

In general, the Einstein field equations are an extremely complicated set of non-linear partial
differential equations. In some simple settings, analytic solutions may be found. These
include:
(i) The vacuum spherically symmetric static case (the Schwarzschild spacetime).
(ii) The weak field case (gravitational waves).
(iii) The isotropic and homogeneous case (Cosmology).
One usually assumes that Λ = 0, except in Cosmology.
6.6 The Schwarzschild solution
This is the basis for nearly all the tests of General Relativity. The solution is the metric of a static, spherically symmetric gravitational field in the empty spacetime surrounding a central mass (like the Sun).
Choosing coordinates (t, r, θ, ϕ), it can be shown that a metric of this type is of the form:
\[ ds^2 = -e^{A(r)}dt^2 + e^{B(r)}dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2), \tag{6.8} \]
where A(r) and B(r) describe the deviation of the metric from Minkowski spacetime. Note that for constant t and r the metric reduces to the standard metric for the surface of a sphere. As one is dealing with vacuum, one has to solve
\[ R_{ab} = 0. \tag{6.9} \]
Substituting (6.8) in (6.9), and after some algebra, the only non-zero components of the Ricci tensor have the form:
\[ R_{rr} = R_{11} = \tfrac{1}{2}A'' - \tfrac{1}{4}A'B' + \tfrac{1}{4}A'^2 - \frac{B'}{r}, \tag{6.10a} \]
\[ R_{\theta\theta} = R_{22} = e^{-B}\left[1 + \tfrac{1}{2}r(A' - B')\right] - 1, \tag{6.10b} \]
\[ R_{\varphi\varphi} = R_{33} = R_{22}\sin^2\theta, \tag{6.10c} \]
\[ R_{tt} = R_{00} = -e^{A-B}\left(\tfrac{1}{2}A'' - \tfrac{1}{4}A'B' + \tfrac{1}{4}A'^2 + \frac{A'}{r}\right), \tag{6.10d} \]
where $'$ denotes differentiation with respect to r.

To solve $R_{ab} = 0$, we start by looking at the combination:
\[ R_{rr} + e^{B-A}R_{tt} = -\frac{1}{r}(B' + A') = 0. \]
Integrating one obtains
\[ A = -B, \]
where one can, without loss of generality, rescale t to absorb the constant of integration. Substituting in (6.10b):
\[ e^{A}(1 + rA') - 1 = 0. \]
The latter can be rewritten as
\[ (re^{A})' = 1, \]
which can be integrated to give
\[ re^{A} = r + \sigma, \qquad \sigma \text{ a constant}, \]
so that
\[ e^{A} = 1 + \frac{\sigma}{r}, \]
and the metric one obtains is given by
\[ ds^2 = -\left(1 + \frac{\sigma}{r}\right)dt^2 + \left(1 + \frac{\sigma}{r}\right)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2). \]
To fix σ, recall that in the Newtonian limit of a central mass M,
\[ g_{00} = -\left(1 - \frac{2GM}{r}\right). \]
Comparing with
\[ g_{00} = -\left(1 + \frac{\sigma}{r}\right), \]
one finds that
\[ \sigma = -2GM. \]
Hence, one finally has
\[ ds^2 = -\left(1 - \frac{2GM}{r}\right)dt^2 + \left(1 - \frac{2GM}{r}\right)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{6.11} \]
The latter is called the Schwarzschild metric.
Remark 1. This solution shows how the presence of mass curves flat spacetime.
Remark 2. The metric (6.11) is asymptotically flat. That is, it becomes Minkowskian
as r → ∞.
Remark 3. The solution only applies to the exterior of a star.
Remark 4. The Birkhoff Theorem: a spherically symmetric solution in vacuum is
necessarily static. That is, there is no time dependence in spherically symmetric vacuum solutions.
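The derivation above can be checked numerically. The following Python sketch, which is only an illustration and not part of the original notes, evaluates the expressions (6.10a), (6.10b) and (6.10d) by central finite differences for $e^{A} = 1 + \sigma/r$, $B = -A$ and $\sigma = -2GM$. The function name `ricci_components`, the step size and the choice of units with $GM = 1$ are assumptions made for the example.

```python
import math


def ricci_components(A, B, r, h=1e-3):
    """Evaluate (6.10a), (6.10b) and (6.10d) at radius r by central differences."""
    dA = (A(r + h) - A(r - h)) / (2 * h)
    dB = (B(r + h) - B(r - h)) / (2 * h)
    d2A = (A(r + h) - 2 * A(r) + A(r - h)) / h**2
    common = 0.5 * d2A - 0.25 * dA * dB + 0.25 * dA**2
    R_rr = common - dB / r
    R_thth = math.exp(-B(r)) * (1 + 0.5 * r * (dA - dB)) - 1
    R_tt = -math.exp(A(r) - B(r)) * (common + dA / r)
    return R_rr, R_thth, R_tt


GM = 1.0                 # assumed units in which GM = 1
sigma = -2.0 * GM        # value fixed by the Newtonian limit


def A(r):
    return math.log(1 + sigma / r)


def B(r):
    return -A(r)


# sample radii outside r = 2GM (illustrative choices)
residuals = {r: ricci_components(A, B, r) for r in (3.0, 10.0, 50.0)}
for r, comps in residuals.items():
    print(r, ["%.1e" % c for c in comps])
```

All three components vanish up to the finite-difference error, as expected for a solution of the vacuum equations.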
6.7 Experimental tests of General Relativity

The classical experimental tests of General Relativity are based on the Schwarzschild solution. They rely on the comparison of the trajectories of freely falling particles and light rays in the gravitational field of a central body with their counterparts in Newtonian theory.
In order to derive the geodesics in Schwarzschild spacetime, it is best to use the Euler-Lagrange equations with
\[ L = \left(\frac{d\tau}{d\lambda}\right)^2 = \left(1 - \frac{2GM}{r}\right)\dot{t}^2 - \left(1 - \frac{2GM}{r}\right)^{-1}\dot{r}^2 - r^2(\dot\theta^2 + \sin^2\theta\,\dot\varphi^2), \]
where the dot denotes differentiation with respect to the parameter λ. For timelike geodesics one has that λ = τ so that L = 1. On the other hand, for null geodesics L = 0.
The Euler-Lagrange equations then read
\[ \frac{d}{d\lambda}\left(2A\dot{t}\right) = 0, \tag{6.12a} \]
\[ \frac{d}{d\lambda}\left(2\dot{r}A^{-1}\right) - 2r(\dot\theta^2 + \sin^2\theta\,\dot\varphi^2) + \dot{r}^2A^{-2}A' + \dot{t}^2A' = 0, \tag{6.12b} \]
\[ \frac{d}{d\lambda}\left(r^2\dot\theta\right) - r^2\sin\theta\cos\theta\,\dot\varphi^2 = 0, \tag{6.12c} \]
\[ \frac{d}{d\lambda}\left(r^2\sin^2\theta\,\dot\varphi\right) = 0, \tag{6.12d} \]
where
\[ A(r) = 1 - \frac{2GM}{r}, \]
and $'$ denotes differentiation with respect to r. It turns out that it is simpler to use
\[ \left(1 - \frac{2GM}{r}\right)\dot{t}^2 - \left(1 - \frac{2GM}{r}\right)^{-1}\dot{r}^2 - r^2(\dot\theta^2 + \sin^2\theta\,\dot\varphi^2) = 1 \text{ (timelike)}, \; 0 \text{ (null)}. \tag{6.13} \]
This is, in fact, an integral of motion of the Euler-Lagrange equation. It expresses the
fact that the square of the norm of the 4-velocity vector of a timelike particle is −1, while
that of a photon is 0. This is like in Special Relativity.
As in Classical Mechanics (central force orbit), let us look for solutions in the Equa-
torial plane: θ = π/2. It follows then that θ̇ = 0, and from (6.12c) with cos θ = 0
one finds that θ̈ = 0. The orbits remain in a plane! This is like in Classical Mechanics
—conservation of angular momentum.
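Now, from (6.12d) and (6.12a) it follows, respectively, that
\[ r^2\dot\varphi = h, \qquad h \text{ a constant}, \tag{6.14a} \]
\[ \left(1 - \frac{2GM}{r}\right)\dot{t} = l, \qquad l \text{ a constant}. \tag{6.14b} \]
Substituting (6.14a) and (6.14b) in (6.13) one obtains
\[ l^2\left(1 - \frac{2GM}{r}\right)^{-1} - \left(1 - \frac{2GM}{r}\right)^{-1}\dot{r}^2 - \frac{h^2}{r^2} = 1, \; 0. \tag{6.15} \]
As in Newtonian theory, let u = 1/r so that
\[ \dot{r} = \frac{dr}{d\lambda} = \frac{dr}{d\varphi}\frac{d\varphi}{d\lambda} = \dot\varphi\,\frac{dr}{d\varphi}, \qquad \frac{dr}{d\varphi} = -\frac{1}{u^2}\frac{du}{d\varphi}. \]
Using (6.14a) one finds
\[ \dot{r} = -h\frac{du}{d\varphi}. \]
Then equation (6.15) in (u, ϕ) coordinates becomes
\[ \left(\frac{du}{d\varphi}\right)^2 + u^2 = \frac{l^2 - 1}{h^2} + \frac{2GM}{h^2}u + \frac{2GMu^3}{c^2}, \quad \text{for timelike geodesics}, \tag{6.16a} \]
\[ \left(\frac{du}{d\varphi}\right)^2 + u^2 = \frac{l^2}{h^2} + \frac{2GMu^3}{c^2}, \quad \text{for null geodesics}. \tag{6.16b} \]
The speed of light c has been added for dimensional reasons. These are the analogues of the energy equations in Newtonian theory. One can solve (6.16a)-(6.16b) approximately. For this, differentiate the equations with respect to ϕ:
\[ \frac{d^2u}{d\varphi^2} + u = \frac{GM}{h^2} + \frac{3GMu^2}{c^2}, \quad \text{for timelike geodesics}, \tag{6.17a} \]
\[ \frac{d^2u}{d\varphi^2} + u = \frac{3GMu^2}{c^2}, \quad \text{for null geodesics}. \tag{6.17b} \]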
From here, one has to analyse the two cases separately.
6.7.1 Timelike case —an orbiting particle

The appropriate equation (6.17a) is identical to the equation for Newtonian orbits except for the last term. This last term is small relative to the other terms for planetary orbits: the ratio of the last two terms for Mercury is ∼ $10^{-7}$. As a consequence, equation (6.17a) will be solved using perturbation methods. For this let
\[ a \equiv \frac{GM}{h^2}, \qquad \epsilon \equiv \frac{3GM}{c^2}\,\frac{GM}{h^2}, \]
where $\epsilon$ is dimensionless and assumed to be small. Then equation (6.17a) implies
\[ \frac{d^2u}{d\varphi^2} + u = a + \frac{\epsilon}{a}u^2. \tag{6.18} \]
We will look for solutions of the type
\[ u = u_0 + \epsilon u_1 + O(\epsilon^2). \]
Substitution in (6.18) yields
\[ \frac{d^2u_0}{d\varphi^2} + u_0 + \epsilon\frac{d^2u_1}{d\varphi^2} + \epsilon u_1 = a + \frac{\epsilon}{a}u_0^2 + O(\epsilon^2). \tag{6.19} \]
Equating zeroth order terms in $\epsilon$ in equation (6.19) one obtains
\[ \frac{d^2u_0}{d\varphi^2} + u_0 = a, \]
which can be solved to give
\[ u_0 = a + b\cos\varphi, \qquad b \text{ a constant}, \tag{6.20} \]
where without loss of generality we have set $\varphi_0 = 0$. Now, equating the first order terms in (6.19) one has
\[ \frac{d^2u_1}{d\varphi^2} + u_1 = \frac{u_0^2}{a}. \]
Substituting for $u_0$ from (6.20), and using that
\[ \cos^2\varphi = \tfrac{1}{2}(1 + \cos 2\varphi), \]
one obtains
\[ \frac{d^2u_1}{d\varphi^2} + u_1 = \left(a + \frac{b^2}{2a}\right) + 2b\cos\varphi + \frac{b^2}{2a}\cos 2\varphi, \tag{6.21} \]
which is a linear inhomogeneous ordinary differential equation. Its general solution consists of the general solution to the homogeneous part plus a particular solution corresponding to each term of the right hand side of (6.21). One has that for:
$a + \frac{b^2}{2a}$, the solution is $a + \frac{b^2}{2a}$,
$2b\cos\varphi$, the solution is $b\varphi\sin\varphi$,
$\frac{b^2}{2a}\cos 2\varphi$, the solution is $-\frac{b^2}{6a}\cos 2\varphi$.
Hence, the solution to the orbit equation to first order in $\epsilon$ is
\[ u = u_0 + \epsilon u_1 = a + \epsilon\left(a + \frac{b^2}{2a}\right) + b\cos\varphi - \epsilon\frac{b^2}{6a}\cos 2\varphi + \epsilon b\varphi\sin\varphi. \]
It is observed that only the last term in this expression is non-periodic, and hence any irregularity in the orbit (non-periodicity) must relate to this term. To see the effect of this term recall that, for small $\epsilon\varphi$,
\[ \cos\epsilon\varphi \approx 1, \qquad \sin\epsilon\varphi \approx \epsilon\varphi, \]
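so that
\[ \cos(\varphi - \epsilon\varphi) = \cos\varphi\cos\epsilon\varphi + \sin\varphi\sin\epsilon\varphi \approx \cos\varphi + \epsilon\varphi\sin\varphi. \]
Hence, we write the solution as
\[ u = a + b\cos(\varphi - \epsilon\varphi) + \epsilon\left(a + \frac{b^2}{2a} - \frac{b^2}{6a}\cos 2\varphi\right), \]
that is:
\[ u = a + b\cos(\varphi - \epsilon\varphi) + \epsilon\,(\text{periodic terms}). \]
Now, recall that the perihelion of a planet around the Sun occurs when r is a minimum (u a maximum). The term $\cos(\varphi - \epsilon\varphi)$ is a maximum when
\[ \varphi(1 - \epsilon) = 2n\pi, \quad \text{or approximately} \quad \varphi \approx 2n\pi(1 + \epsilon). \]
Successive perihelia then occur at intervals of
\[ \Delta\varphi \approx 2\pi(1 + \epsilon), \]
instead of 2π as in the case of periodic motion. Therefore, the perihelion shift per revolution ($\delta\varphi = \Delta\varphi - 2\pi$) is
\[ \delta\varphi = 2\pi\epsilon = \frac{6\pi G^2M^2}{h^2c^2}. \]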
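The first order result $\delta\varphi = 2\pi\epsilon$ can be compared with a direct numerical integration of (6.18). The following Python sketch is only an illustration: the function name `perihelion_advance`, the Runge-Kutta scheme, the step size and the sample values of a, b and $\epsilon$ are all assumptions made for the example, not part of the notes.

```python
import math


def perihelion_advance(a, b, eps, step=1e-3):
    """Integrate u'' + u = a + (eps/a) u^2 from one perihelion to the next.

    Starts at a maximum of u (u = a + b, u' = 0) and returns the angle swept
    until the next maximum of u, minus 2*pi.
    """
    def rhs(u, v):
        return v, a + (eps / a) * u * u - u

    phi, u, v = 0.0, a + b, 0.0
    while True:
        # classical fourth-order Runge-Kutta step for the pair (u, v = u')
        k1u, k1v = rhs(u, v)
        k2u, k2v = rhs(u + 0.5 * step * k1u, v + 0.5 * step * k1v)
        k3u, k3v = rhs(u + 0.5 * step * k2u, v + 0.5 * step * k2v)
        k4u, k4v = rhs(u + step * k3u, v + step * k3v)
        u_new = u + step * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v_new = v + step * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        # next perihelion: u' passes from positive to non-positive
        if v > 0.0 and v_new <= 0.0:
            crossing = phi + step * v / (v - v_new)  # linear interpolation
            return crossing - 2 * math.pi
        phi, u, v = phi + step, u_new, v_new


eps = 1e-3                                  # illustrative small parameter
shift = perihelion_advance(a=1.0, b=0.2, eps=eps)
print(shift, 2 * math.pi * eps)
```

For small $\epsilon$ the two printed numbers agree up to corrections of relative order $\epsilon$, which are neglected in the first order calculation.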
From Newtonian theory we have that
\[ \alpha = \frac{h^2}{GM(1 - e^2)}, \qquad T^2 = \frac{4\pi^2\alpha^3}{GM}, \]
where e is the eccentricity of the orbit, α the semi-major axis, and T the period. One then obtains
\[ \delta\varphi = \frac{24\pi^3\alpha^2}{c^2T^2(1 - e^2)}. \]
For the case of the planet Mercury this gives a total shift of 43.03″ per century, which is in good agreement with the classically unaccounted shift of 43.11″ ± 0.45″.
Remark 1. This is one of the famous classical tests of General Relativity, the so-called perihelion shift of Mercury.
Remark 2. The effect is largest for Mercury because of its high eccentricity and small period, which result in a large shift.
Remark 3. For Venus one has a predicted shift of 8.6″ and an observed shift of 8.4″ ± 4.8″. For the Earth one has 3.8″ and 5.0″ ± 1.2″. For the asteroid Icarus, 10.3″ and 9.8″ ± 0.8″.
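The predicted values can be reproduced from the formula for δϕ. The sketch below is a hedged illustration in Python: the orbital elements (semi-major axis, period, eccentricity) are standard approximate values that are not given in these notes, and the function name `shift_per_century` is hypothetical.

```python
import math

C = 2.998e8                           # speed of light, m/s
ARCSEC_PER_RAD = 180 * 3600 / math.pi
DAYS_PER_CENTURY = 36525.0


def shift_per_century(semi_major_axis_m, period_days, eccentricity):
    """Perihelion shift 24 pi^3 alpha^2 / (c^2 T^2 (1 - e^2)) per orbit,
    accumulated over one century and converted to arcseconds."""
    period_s = period_days * 86400.0
    per_orbit = (24 * math.pi**3 * semi_major_axis_m**2
                 / (C**2 * period_s**2 * (1 - eccentricity**2)))
    orbits = DAYS_PER_CENTURY / period_days
    return per_orbit * orbits * ARCSEC_PER_RAD


# assumed approximate orbital elements (not taken from the notes)
mercury = shift_per_century(5.791e10, 87.969, 0.2056)
venus = shift_per_century(1.0821e11, 224.701, 0.0068)
earth = shift_per_century(1.496e11, 365.256, 0.0167)
print(f"Mercury {mercury:.1f}, Venus {venus:.1f}, Earth {earth:.1f} arcsec/century")
```

With these inputs the results come out close to the predicted shifts quoted above for Mercury, Venus and the Earth.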
6.7.2 Null geodesics

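The relevant equation for null geodesics is given by (6.17b):
\[ \frac{d^2u}{d\varphi^2} + u = \frac{3GMu^2}{c^2}. \]
As before, the term $3GMu^2/c^2$ is small relative to u (that is, $GMu/c^2 \ll 1$), so let $\epsilon \equiv 3GM/c^2$, and rewrite (6.17b) as
\[ \frac{d^2u}{d\varphi^2} + u = \epsilon u^2. \tag{6.22} \]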
As before, look for solutions of the form
\[ u = u_0 + \epsilon u_1 + O(\epsilon^2). \]
Substituting into (6.22) one has
\[ \frac{d^2u_0}{d\varphi^2} + u_0 + \epsilon\frac{d^2u_1}{d\varphi^2} + \epsilon u_1 = \epsilon u_0^2 + O(\epsilon^2). \]
Equating the zeroth order terms in the previous equation:
\[ \frac{d^2u_0}{d\varphi^2} + u_0 = 0, \]
which can be solved to give
\[ u_0 = L\cos\varphi, \qquad L \text{ a constant}. \]
Now, to this order $u_0 = 1/r$, so that
\[ r\cos\varphi = \frac{1}{L}, \]
which represents a straight line, as expected: to zeroth order light is not deflected by the gravitational field of the Sun.

Equating terms of order one in $\epsilon$ one obtains
\[ \frac{d^2u_1}{d\varphi^2} + u_1 = u_0^2 = L^2\cos^2\varphi = \tfrac{1}{2}L^2(1 + \cos 2\varphi), \]
which has the particular solution
\[ u_1 = \tfrac{1}{2}L^2 - \tfrac{1}{6}L^2\cos 2\varphi = \tfrac{2}{3}L^2 - \tfrac{1}{3}L^2\cos^2\varphi. \]
Hence, one obtains that
\[ u = u_0 + \epsilon u_1 = L\cos\varphi + \tfrac{2}{3}\epsilon L^2 - \tfrac{1}{3}\epsilon L^2\cos^2\varphi. \tag{6.23} \]
So, the effect of the first order terms (the last two terms) is to make light deflect from a straight line.
For a light ray grazing the Sun and arriving at Earth, the asymptotes of the trajectory correspond to values of ϕ for which r → ∞, that is, u → 0. Substituting in (6.23) this gives
\[ \cos^2\varphi - \frac{3}{\epsilon L}\cos\varphi - 2 = 0. \]
The latter can be solved to give
\[ \cos\varphi = \frac{3}{2\epsilon L}\left(1 \pm \sqrt{1 + \tfrac{8}{9}\epsilon^2L^2}\right). \]
Choosing the negative sign to make $|\cos\varphi| < 1$ and expanding the square root one finds
\[ |\cos\varphi| \approx \tfrac{2}{3}\epsilon L = \frac{2GML}{c^2} \ll 1, \]
which implies ϕ ≈ π/2. Let ϕ = π/2 + δ so that
\[ \sin\delta \approx \frac{2GML}{c^2} \approx \delta. \]
One has that δ is the angle that each asymptote makes with the undeflected straight line. The angle between the two asymptotes is
\[ \Delta = 2\delta. \]
Accordingly, one finds
\[ \Delta = \frac{4GML}{c^2}. \]
For a light ray just grazing the Sun (so that 1/L is the solar radius) this predicts a deflection of 1.75″, which compares well with some recent radio observations yielding Δ = 1.73″ ± 0.05″.

Remark. This is the second famous test of General Relativity, more generally referred to as the bending of light. A first measurement was carried out by Eddington and collaborators in 1919.
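The quoted deflection can be reproduced directly from $\Delta = 4GML/c^2$ with $1/L$ equal to the solar radius. The following Python sketch is an illustration only: the solar mass and radius are standard reference values not given in the notes, and the function name `deflection_arcsec` is hypothetical.

```python
import math

G = 6.674e-11                         # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8                           # speed of light, m/s
ARCSEC_PER_RAD = 180 * 3600 / math.pi


def deflection_arcsec(mass_kg, closest_approach_m):
    """Total deflection 4 G M L / c^2, with L = 1/(distance of closest approach)."""
    L = 1.0 / closest_approach_m
    return 4 * G * mass_kg * L / C**2 * ARCSEC_PER_RAD


# assumed solar mass (kg) and solar radius (m)
sun_grazing = deflection_arcsec(1.989e30, 6.957e8)
print(f"{sun_grazing:.2f} arcsec")
```

Since Δ is proportional to L, a ray passing at twice the solar radius is deflected by half this amount.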
6.8 Gravitational redshift

Consider a clock or an atom at fixed (r, θ, ϕ). Hence,
dr = dθ = dϕ = 0.
Then the Schwarzschild metric implies that:
\[ \frac{dt}{d\tau} = \left(1 - \frac{2GM}{r}\right)^{-1/2}, \]
from where it follows that dt (i.e. the period or interval as measured by an observer at
infinity) is larger than dτ as measured at r. Similarly, the period of atomic oscillations in
the gravitational field of M , as seen from infinity is increased. Therefore, the frequency
ν (ν = 1/τ ) of light it emits is decreased —i.e. redshifted as seen from infinity.
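We define redshift via
\[ z \equiv \frac{\lambda_{rec} - \lambda_{em}}{\lambda_{em}} = \frac{\nu_{em}}{\nu_{rec}} - 1. \]
Accordingly,
\[ 1 + z = \frac{\tau_{rec}}{\tau_{em}} = \frac{dt}{d\tau} = \left(1 - \frac{2GM}{r}\right)^{-1/2}. \]
Note that z → ∞ as r → 2GM/c².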
Remark. It used to be thought that gravitational redshift also constituted a test of Gen-
eral Relativity, but it turns out that any other theory compatible with the Equivalence
Principle will predict a redshift.
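To get a feeling for the size of the effect, the redshift formula can be evaluated numerically. The Python sketch below restores the factors of c explicitly, $1 + z = (1 - 2GM/(rc^2))^{-1/2}$; the solar mass and radius are assumed reference values and the function name `gravitational_redshift` is hypothetical.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s


def gravitational_redshift(mass_kg, r_m):
    """Redshift z seen at infinity for an emitter at rest at radius r."""
    r_s = 2 * G * mass_kg / C**2          # Schwarzschild radius
    if r_m <= r_s:
        raise ValueError("emitter must be at rest outside r = 2GM/c^2")
    return 1.0 / math.sqrt(1.0 - r_s / r_m) - 1.0


M_SUN = 1.989e30                          # assumed solar mass, kg
R_SUN = 6.957e8                           # assumed solar radius, m
z_sun = gravitational_redshift(M_SUN, R_SUN)

# the redshift grows without bound as r approaches the Schwarzschild radius
r_s_sun = 2 * G * M_SUN / C**2
z_near = gravitational_redshift(M_SUN, 1.0001 * r_s_sun)
print(z_sun, z_near)
```

For the Sun the result is of order $10^{-6}$, consistent with the weak field estimate $z \approx GM/(rc^2)$.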
6.9 Black holes

Looking at the Schwarzschild metric
\[ ds^2 = -\left(1 - \frac{2GM}{r}\right)dt^2 + \left(1 - \frac{2GM}{r}\right)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2), \tag{6.24} \]
one sees that there is something peculiar happening at
\[ r = \frac{2GM}{c^2}, \qquad r = 0. \]
To get a better feel of what is happening, it is best to look at coordinate independent scalars at these points. A good candidate for this is $R_{abcd}R^{abcd}$, which for the metric (6.24) is proportional to $1/r^6$. The latter is clearly singular at r = 0, with severe physical consequences such as the divergence of tidal forces. Nevertheless, the scalar remains well behaved at r = 2GM/c².
One says that at r = 0 one has a physical singularity, whereas at r = 2GM/c² one has a coordinate singularity. In order to understand this better choose a new coordinate $\tilde{t}$ via
\[ \tilde{t} = t + 2GM\ln|r - 2GM|. \]
The coordinates $(\tilde{t}, r, \theta, \varphi)$ are called Eddington-Finkelstein coordinates. Using this new time coordinate one finds that
\[ dt = d\tilde{t} - \frac{2GM}{r - 2GM}dr, \]
so that the metric (6.24) transforms into
\[ ds^2 = -\left(1 - \frac{2GM}{r}\right)d\tilde{t}^2 + \frac{4GM}{r}d\tilde{t}\,dr + \left(1 + \frac{2GM}{r}\right)dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{6.25} \]
Note that in these coordinates all metric coefficients are well behaved, except at r = 0.
To understand the light cone structure of the spacetime, we look at radial light cones defined by
\[ \theta = \text{constant}, \qquad \varphi = \text{constant}, \qquad ds^2 = 0, \]
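so that
\[ \left(1 - \frac{2GM}{r}\right)\left(\frac{d\tilde{t}}{dr}\right)^2 - \frac{4GM}{r}\frac{d\tilde{t}}{dr} - \left(1 + \frac{2GM}{r}\right) = 0. \]
Solving the quadratic equation one finds
\[ \frac{d\tilde{t}}{dr} = -1, \qquad \frac{d\tilde{t}}{dr} = \frac{r + 2GM}{r - 2GM}. \]
Therefore:
as r → ∞, $d\tilde{t}/dr \to (-1, 1)$;
as r → 2GM, $d\tilde{t}/dr \to (-1, \infty)$;
as r → 0, $d\tilde{t}/dr \to (-1, -1)$.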
One sees that as r decreases, the outgoing path ($d\tilde{t}/dr > 0$) becomes steeper and steeper and eventually slopes inwards for r < 2GM. Therefore nothing, including light, can go from r < 2GM to r > 2GM. This justifies the name black hole.
Note also that even though a particle can sit stationary at r > 2GM (as we do on Earth, given appropriate forces to counter gravity and stop our free fall), this cannot be done in the region r < 2GM, as all timelike and null trajectories must make an angle with the vertical, and therefore must have decreasing r for increasing $\tilde{t}$. Therefore, the particle must fall towards r → 0. There is no static behaviour in the region r < 2GM and no escape from the singularity once inside.
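The tilting of the light cones can be made concrete by evaluating the two slopes at a few radii. The Python sketch below is an illustration under the assumption of units with $G = c = 1$ and $M = 1$ (so that the horizon is at r = 2); the function names `light_cone_slopes` and `null_condition` are hypothetical. It also checks that both slopes satisfy the quadratic equation for radial null rays.

```python
import math


def light_cone_slopes(r, gm=1.0):
    """Return (ingoing, outgoing) values of dt~/dr for radial light rays."""
    if r == 2 * gm:
        return -1.0, math.inf
    return -1.0, (r + 2 * gm) / (r - 2 * gm)


def null_condition(slope, r, gm=1.0):
    """Left-hand side of the quadratic equation for radial null rays."""
    return ((1 - 2 * gm / r) * slope**2
            - (4 * gm / r) * slope
            - (1 + 2 * gm / r))


# sample radii inside and outside r = 2GM (illustrative choices)
SAMPLE_RADII = (0.5, 1.5, 3.0, 100.0)
slopes = {r: light_cone_slopes(r) for r in SAMPLE_RADII}
for r, (ingoing, outgoing) in slopes.items():
    print(f"r = {r}: ingoing {ingoing:.3f}, outgoing {outgoing:.3f}")
```

For r < 2GM the second slope is negative as well, so both families of radial light rays move towards decreasing r.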
Infalling particles (observers)

Another interesting question is to compare the picture as seen by an observer at infinity with that seen by an infalling observer. For this, consider radially infalling free particles into a black hole of mass M. Compare the time τ elapsed on the particle's clock for the particle to reach r = 2GM/c² with the time t measured by a clock at rest at infinity (as r → ∞, dτ → dt). Now, consider the timelike geodesics derived in section 6.7.1 and make them radial by letting dθ = dϕ = 0. One obtains the equations
\[ \left(1 - \frac{2GM}{r}\right)t' = l, \]
\[ \left(1 - \frac{2GM}{r}\right)t'^2 - \left(1 - \frac{2GM}{r}\right)^{-1}r'^2 = 1, \]
where $'$ denotes differentiation with respect to τ.
Combining these equations one obtains
\[ \left(\frac{dr}{d\tau}\right)^2 = r'^2 = -1 + \frac{2GM}{r} + l^2. \]
The constant l is arbitrary; choosing l = 1, which corresponds to a particle falling from rest at infinity, one has
\[ \frac{dr}{d\tau} = -\sqrt{\frac{2GM}{r}}, \]
where the negative root gives the infalling behaviour. Integrating one finds
\[ \int_{\tau_0}^{\tau} d\tau = -\frac{1}{\sqrt{2GM}}\int_{r_0}^{r}\sqrt{r}\,dr, \]
so that
\[ \tau - \tau_0 = \frac{2}{3\sqrt{2GM}}\left(r_0^{3/2} - r^{3/2}\right). \]
It follows that nothing special happens to this particle at r = 2GM/c² (the Schwarzschild radius) and the body falls to r = 0 in a finite proper time τ. This allows one to calculate the time taken for a large body (such as a galaxy) to collapse.
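The closed form for the proper time can be checked against a direct numerical quadrature of $d\tau = -dr/\sqrt{2GM/r}$. The Python sketch below is an illustration assuming units with $G = c = 1$ and $M = 1$; the function names and the starting radius $r_0 = 20$ are choices made for the example.

```python
import math


def proper_time_closed_form(r0, r, gm=1.0):
    """tau - tau0 = 2 / (3 sqrt(2GM)) * (r0^(3/2) - r^(3/2))."""
    return 2.0 / (3.0 * math.sqrt(2.0 * gm)) * (r0**1.5 - r**1.5)


def proper_time_numeric(r0, r, gm=1.0, n=20000):
    """Midpoint rule for the integral of sqrt(r / 2GM) dr from r up to r0."""
    h = (r0 - r) / n
    return h * sum(math.sqrt((r + (i + 0.5) * h) / (2.0 * gm)) for i in range(n))


R0 = 20.0                                              # illustrative start
to_horizon = proper_time_closed_form(R0, 2.0)          # down to r = 2GM
to_singularity = proper_time_closed_form(R0, 0.0)      # down to r = 0
print(to_horizon, proper_time_numeric(R0, 2.0))
print(to_singularity, proper_time_numeric(R0, 0.0))
```

Both proper times are finite, and the extra proper time needed to go from the horizon to r = 0 is only $\tfrac{4}{3}GM$ in these units.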
To see what happens in coordinate time t, one can proceed from dt/dr and integrate. Alternatively, one can start with the light cones of the Schwarzschild solution (radial light rays) given by
\[ ds^2 = 0, \qquad d\theta = d\varphi = 0. \]
The Schwarzschild metric then implies
\[ \frac{dt}{dr} = \pm\frac{1}{1 - 2GM/r}, \]
so that as r → ∞, dt/dr → ±1, and as r → 2GM, dt/dr → ±∞.

Therefore, the light cones close up as one approaches r → 2GM. Since particles can only move within light cones, this makes them move more and more vertically. The approach takes an infinite amount of t-time!
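The divergence of the coordinate time can be seen explicitly for an ingoing radial light ray. Integrating $dt/dr = -1/(1 - 2GM/r)$ from $r_0$ down to r gives $t = (r_0 - r) + 2GM\ln[(r_0 - 2GM)/(r - 2GM)]$, which grows logarithmically as r → 2GM. The Python sketch below evaluates this expression, assuming units with $G = c = 1$ and $M = 1$; the function name and the sample radii are hypothetical choices.

```python
import math


def coordinate_time_to_reach(r0, r, gm=1.0):
    """Coordinate time t for an ingoing radial light ray to go from r0 to r."""
    if not r0 > r > 2 * gm:
        raise ValueError("need r0 > r > 2GM")
    return (r0 - r) + 2 * gm * math.log((r0 - 2 * gm) / (r - 2 * gm))


# approach the horizon r = 2 from r0 = 10 in steps of a decade in (r - 2GM)
times = [coordinate_time_to_reach(10.0, 2.0 + 10.0**(-k)) for k in range(1, 7)]
for k, t in enumerate(times, start=1):
    print(f"r - 2GM = 1e-{k}: t = {t:.3f}")
```

Each reduction of r − 2GM by a factor of ten adds approximately $2GM\ln 10$ to the coordinate time, so t increases without bound as the horizon is approached.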