Special Relativity Part 1
Supplementary Lecture I:
Spacetime Diagrams and Causality
© Joel C. Corbo, 2005
This set of notes accompanied the first in a series of “fun” lectures about relativity
given during the Fall 2005 Physics H7C course at UC Berkeley. Its focus is on using
spacetime diagrams to understand the causal structure of Minkowski space and as a
tool to solve problems in special relativity.
which tell us how to transform between the position and time coordinates of two
Lorentz frames† moving with relative velocity v = βc in the x-direction. Before
saying any more about these transformations, we will pause to make it clear exactly
what we mean by a frame as opposed to an observer. A frame is synonymous with
a system of coordinates, so that when we speak of a frame S as opposed to a frame
S′ we are really speaking of a set of coordinates (ct, x, y, z) as opposed to a set
(ct′, x′, y′, z′). This is the sense in which the Lorentz transformations are used to
convert between two different Lorentz frames. Clearly, then, a frame is something
which covers all of spacetime.
An observer, on the other hand, is a person (or a rocket, or a particle, or any
other material object) moving through spacetime. In principle, an observer need not
be restricted to a particular Lorentz frame, although we often do so for convenience.
If an observer does stay in the same Lorentz frame, it is tempting to speak of the
observer and the frame interchangeably. However, this is a bad habit to get into,
because it encourages us to describe what happens in spacetime by what a particular
observer “sees,” which is not what enters into the Lorentz transformations. As stated
above, the Lorentz transformations convert between coordinate systems, which span
all of spacetime, as opposed to observers, which exist at particular points in spacetime.
Therefore, what a single human being flying through spacetime physically sees has
very little to do with the underlying reality of the events occurring in that spacetime
as described by the Lorentz transformations‡.
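As a minimal sketch of the point that the Lorentz transformations act on coordinates rather than on what any one observer sees, the following Python function maps the (ct, x) coordinates of an event in frame S to (ct′, x′) in frame S′. It assumes the standard one-dimensional boost ct′ = γ(ct − βx), x′ = γ(x − βct) that appears later in these notes; the function names `lorentz_gamma` and `boost` are illustrative only.

```python
import math

def lorentz_gamma(beta):
    """Lorentz factor for relative speed beta = v/c, with |beta| < 1."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def boost(ct, x, beta):
    """Map an event's coordinates (ct, x) in S to (ct', x') in S'."""
    g = lorentz_gamma(beta)
    return g * (ct - beta * x), g * (x - beta * ct)

# The event (ct, x) = (5, 3) lies on the worldline x = 0.6 * ct of the
# spatial origin of S', so its x' coordinate should vanish.
ct_prime, x_prime = boost(5.0, 3.0, 0.6)
```

Nothing in the function refers to an observer's position: it accepts any event in spacetime, which is the sense in which a frame covers all of spacetime.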
Given the Lorentz transformations, we can define a quantity that is invariant, that
is, a quantity that has the same value in any Lorentz frame; we call it the spacetime
interval.
† In special relativity, inertial frames are often referred to as Lorentz frames.
‡ If we insist on thinking of frames as having to do with people, we can imagine an infinite array
of people, each with a synchronized clock, spaced uniformly throughout space. Each of these people
observes only what is happening in an infinitesimally small region of space immediately around them,
and once they are done observing, they all get together to reconstruct what happened throughout
spacetime. This is equivalent to speaking of a frame as a coordinate system, but it is needlessly
personified.
To see that this quantity is indeed invariant, we can Lorentz transform it¶ ,
[Figure 1: spacetime diagram with ct on the vertical axis and x on the horizontal axis; the dotted lines through the origin are labeled "light".]
rockets colliding, a bolt of lightning striking, a star exploding, etc) that occurred at
a particular position and time in a given frame.
A particle moving through spacetime traces out a curve on a spacetime diagram
called a worldline. The worldline is simply the trajectory of the particle, but we
do not use the word trajectory because it suggests only motion through space. For
example, if we were asked to describe the trajectory of a rock sitting on the ground,
we would likely say that it has no trajectory because it is not moving. It does have
a worldline, however, because while its position is not changing, it is moving forward
in time. Its worldline on a spacetime diagram is simply a vertical line. This is a good
time to point out that we have chosen to plot position on the horizontal axis and time
on the vertical axis, which is opposite the normal convention of plotting x vs. t, and
that we have chosen to plot ct on the vertical axis instead of just t so that both axes
carry the same units. This means that the slope of an object’s worldline is given by
the reciprocal of its velocity expressed as a fraction of c. In other words, the slope of
a worldline is the reciprocal of β. Therefore, the slope of the worldline of a particle
at rest is infinite, while the slope of a photon’s worldline has magnitude 1.
Suppose we were to draw the worldlines of photons passing through the origin; we
would produce the dotted lines in Figure 1. Imagine rotating this diagram about the ct-axis,
so that instead of intersecting just the x-axis they intersect the entire xy-plane. The
light rays then form two cones. The upper cone is called the future light cone and
the lower cone is called the past light cone∗∗.
These names are reasonable. Imagine an observer, Albert, starting out for a
journey through spacetime from the origin of the spacetime diagram. Because all
objects must travel at speeds less than or equal to the speed of light, the only points
Albert can get to are those that lie within the future light cone. In this sense, these
points define Albert’s future because they are the set of points accessible to him from
the origin. Similarly, the set of points that lie within the past light cone defines
Albert’s past because these points are the only ones from which he might have
arrived at the origin. We call the rest of spacetime the present, the set of points
completely inaccessible to Albert††. These regions of spacetime are labeled in Figure
2. Note that these definitions of Albert’s past, present, and future are only valid
when Albert is at the origin. As Albert moves through spacetime, the set of points
that are his past, present, or future changes depending on his location, as though he
carries along his own set of light cones that constantly repartition these three regions.
This ensures that the slope of Albert’s worldline never has magnitude less than 1.
The concepts of past, present, and future also have a more mathematical inter-
pretation. Consider the points A, B, and C in Figure 2; A lies in the future of the
origin, C in the present, and B on the future light cone. For A, ct > x, which is
true of any point in the future. This means that the spacetime interval, given by Eq.
(2), is positive for all points in the future. However, for point C, and any other point
in the present, x > ct, which means that the spacetime interval is negative for these
points. On the light cone, ct = x, so the spacetime interval is identically zero for point
B and any other point on the cone. These facts give rise to a new set of terminology.
Given any two points separated in spacetime, the separation is said to be timelike if
(∆s)² > 0, spacelike if (∆s)² < 0, and lightlike or null if (∆s)² = 0.
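The classification above can be sketched in a few lines of Python. This is only an illustration, assuming the sign convention (∆s)² = (c∆t)² − (∆x)² implied by the discussion (positive in the future, negative in the present) and a single spatial dimension; the function names and the tolerance `tol` are hypothetical choices.

```python
def interval_squared(c_dt, dx):
    """(Delta s)^2 = (c Delta t)^2 - (Delta x)^2 for one spatial dimension."""
    return c_dt * c_dt - dx * dx

def classify_separation(c_dt, dx, tol=1e-12):
    """Label a separation as timelike, spacelike, or lightlike."""
    s2 = interval_squared(c_dt, dx)
    if s2 > tol:
        return "timelike"
    if s2 < -tol:
        return "spacelike"
    return "lightlike"

labels = [classify_separation(2.0, 1.0),   # inside the light cone
          classify_separation(1.0, 2.0),   # outside the light cone
          classify_separation(1.0, 1.0)]   # on the light cone
```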
∗∗ Of course, because space has three spatial dimensions, the past and future light cones are really
4D hyper-cones. This is obviously impossible for us to picture, so we will stick to our simplified 2D
diagrams.
†† Compare these definitions of past, present, and future to those in Euclidean space combined
with absolute time. From Albert’s point of view at the origin, the future is all points with t > 0, the
past is all points with t < 0, and the present is all points with t = 0, all without regard to position.
This happens because in non-relativistic mechanics there is no speed limit, so that Albert, starting
at the origin, could get to any point in space in finite time if he traveled fast enough.
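[Figure 2: spacetime diagram with ct and x axes and the light cones of the origin O, dividing spacetime into regions labeled FUTURE, PRESENT (on both sides), and PAST; points A, B, and C are marked.]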
Before moving on to more specific topics, we would like to say a few words about the
geometry of Minkowski space as represented by spacetime diagrams. When mapmak-
ers make maps of the Earth, they must always sacrifice some aspect of the Earth’s
true geometry in order to make a flat map out of a round surface. They may choose
to correctly reproduce the relative sizes of the continents or to preserve the angles
between lines of latitude and longitude, but they can never accurately reproduce all
aspects of the round Earth. This is OK as long as people using the maps understand
this distortion. In analogy to this, spacetime diagrams do not accurately reproduce
the geometry of Minkowski space. By representing time and space equivalently on an
intrinsically Euclidean plane, spacetime diagrams distort distances, so that a line that
appears longer on the diagram can actually represent a shorter spacetime separation.
Suppose we drew a right triangle on a spacetime diagram, like the one in Figure 3,
such that one leg of the triangle has length c∆t and the other has length ∆x. We can
interpret the vertices of this triangle as events: events A and C occurred at the same
time but in different places while events B and C occurred at the same place but at
different times. We can interpret the hypotenuse of the triangle as the separation ∆s
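between events A and B; it is pictorially what we meant by the interval defined in
Eq. (2).
[Figure 3: right triangle with vertices A, B, and C; the horizontal leg AC has length ∆x, the vertical leg CB has length c∆t, the hypotenuse AB is labeled ∆s, and φ is the angle at A.]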
How can we calculate ∆s? The natural approach, given our diagram, is to use the
Pythagorean theorem:
\[ (\Delta s)^2 = (c\,\Delta t)^2 + (\Delta x)^2. \tag{4} \]
However, this contradicts Eq. (2), which says that
\[ (\Delta s)^2 = (c\,\Delta t)^2 - (\Delta x)^2. \tag{5} \]
We interpret this strange result in analogy to the example of the flat map of the
Earth: although the line segment connecting events A and B in our triangle looks
longer than the line segment connecting events B and C, it in fact represents a shorter
spacetime interval. Just as representing the surface of a sphere on a flat piece of paper
distorts geometrical features, representing Minkowski space on a flat piece of paper
also distorts its geometry‡‡. Therefore, Minkowski space is fundamentally different
from Euclidean space♦, and our spacetime diagrams misrepresent some aspects of the
true geometry of spacetime.
‡‡ As an extreme example of this, imagine if c∆t = ∆x in our triangle; if that were the case, ∆s,
the “length” of the hypotenuse, would be zero!
♦ There is a way to make the interval in Minkowski space and the interval in Euclidean space
look the same: replace ct with ict in Eq. (4). This may seem like a tempting solution, but really it
replaces one inconvenience, representing Minkowski space on a Euclidean surface, with a much worse
one, plotting imaginary time. Since imaginary time is much harder to conceptualize than distorted
distances, we will stick to plotting real time while keeping in mind the misrepresentation of distance
it causes.
We can be a bit more concrete about what is causing this misrepresentation by
thinking about trigonometry. Given the angle φ defined by Figure 3 (and forgetting
for the moment that we know there is something strange happening with geometry),
we find that
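\[ \sin\phi = \frac{c\,\Delta t}{\Delta s} \tag{6a} \]
\[ \cos\phi = \frac{\Delta x}{\Delta s}. \tag{6b} \]
If we substitute these expressions into the trigonometric identity
\[ \sin^2\phi + \cos^2\phi = 1 \tag{7} \]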
and simplify, we produce Eq. (4). This should not be too surprising since the
Pythagorean theorem is equivalent to Eq. (7). However, we know it is the wrong
expression for the interval ∆s.
It turns out that there is a way to fix this: replace usual (circular) trig functions
with hyperbolic trig functions by defining
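\[ \cosh\alpha = \frac{c\,\Delta t}{\Delta s} \tag{8a} \]
\[ \sinh\alpha = \frac{\Delta x}{\Delta s}, \tag{8b} \]
where α is the hyperbolic angle. Then, using the identity
\[ \cosh^2\alpha - \sinh^2\alpha = 1 \tag{9} \]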
and simplifying, we produce Eq. (5), which is the correct expression for the interval
∆s. Thus, we have discovered something interesting about the structure of Minkowski
space: it is based on hyperbolas rather than circles.
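The hyperbolic parametrization of Eqs. (8a) and (8b) can be checked numerically. The sketch below is only an illustration: it assumes a timelike separation with c∆t > |∆x|, uses tanh α = ∆x / (c∆t) (the ratio of Eq. (8b) to Eq. (8a)) to find α, and the function name `hyperbolic_angle` is hypothetical.

```python
import math

def hyperbolic_angle(c_dt, dx):
    """Return (ds, alpha) for a timelike separation with c_dt > |dx|."""
    if c_dt <= abs(dx):
        raise ValueError("separation must be timelike with c_dt > 0")
    ds = math.sqrt(c_dt ** 2 - dx ** 2)   # Minkowski interval, minus sign
    alpha = math.atanh(dx / c_dt)         # tanh(alpha) = sinh / cosh = dx / c_dt
    return ds, alpha

ds, alpha = hyperbolic_angle(5.0, 3.0)
# ds * cosh(alpha) should recover c_dt, and ds * sinh(alpha) should recover dx.
recovered_c_dt = ds * math.cosh(alpha)
recovered_dx = ds * math.sinh(alpha)
```

With these sample values the Pythagorean formula would give a "length" of about 5.83, whereas the interval is 4, shorter than the vertical leg of the triangle.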
One last note before we move on: we mentioned earlier the useful identity
\[ \gamma^2 - (\gamma\beta)^2 = 1, \tag{10} \]
which looks similar in structure to Eq. (9). Let’s define a quantity θ such that
\[ \cosh\theta = \gamma \tag{11a} \]
\[ \sinh\theta = \gamma\beta. \tag{11b} \]
The hyperbolic angle has an interesting geometrical interpretation that is beyond the scope of
this set of notes. The Wikipedia article on “hyperbolic function” would be a good place to start
learning more about this subject.
The quantity θ is called the rapidity, and it has an important interpretation in special
relativity. If we calculate its hyperbolic tangent, we find
\[ \tanh\theta = \frac{\sinh\theta}{\cosh\theta} = \frac{\gamma\beta}{\gamma} = \beta. \tag{12} \]
In other words, θ is closely related to the frame velocity β. In fact, in certain circum-
stances it is more useful to use θ in calculation than β, particularly when adding the
velocities of several frames together. We will see an example of this in the next set
of notes in this series.
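In the meantime, here is a minimal sketch of why rapidity is convenient. It assumes the standard rule for combining collinear velocities, β = (β₁ + β₂)/(1 + β₁β₂), which these notes do not derive, and shows that it is equivalent to simply adding rapidities. The function names are illustrative.

```python
import math

def rapidity(beta):
    """Rapidity theta defined by tanh(theta) = beta, for |beta| < 1."""
    return math.atanh(beta)

def combine_collinear(*betas):
    """Net beta from successive boosts along one axis, by summing rapidities."""
    return math.tanh(sum(rapidity(b) for b in betas))

net = combine_collinear(0.5, 0.5)
direct = (0.5 + 0.5) / (1.0 + 0.5 * 0.5)   # standard velocity-addition formula
```

Because tanh never reaches 1, any finite sum of rapidities corresponds to a speed below c.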
[Figure 4: hyperbolas of constant interval I in the x-ct plane, with branches labeled I > 0 and I < 0.]
present. If we imagine rotating these pictures about the ct-axis, we produce Figure
5. We see that positive I forms a hyperboloid of two sheets while negative I forms a
hyperboloid of one sheet.
What does this have to do with causality? Suppose we have two points in spacetime,
the origin O and some other point P, which are timelike separated such that t_P > 0.
Because they are timelike separated, we know that (∆s)² = I > 0. Therefore,
point P must lie on the hyperbola contained in O’s future. Since Lorentz transformations
leave I invariant, after a Lorentz transformation P must lie on the same
hyperbola that it started out on. Therefore, there is no way to move P out of the
future of O! If an event at O caused an event at P, causality is preserved because
the event at O always occurs before the event at P in any Lorentz frame. Thus,
we conclude that events can be causally connected in special relativity so long as
the interval between them is timelike, because this guarantees that all observers, no
matter their inertial frame, will agree on the temporal order of the events.
We could perform a similar analysis if the point P is spacelike separated from the
origin. Unsurprisingly, we would find that such a point cannot be causally connected
to the origin, because different observers will disagree about the order in which the
events at O and P took place (just look at the hyperboloid for I < 0 to see that
this is true). This is OK, though, because there is no way to send a signal between
two spacelike separated points, so there is no practical way that anything at one such
point could cause something to happen at the other anyway.
Figure 5: Hyperbolas rotated. The image on the left is the I > 0 hyperbolas, while
the image on the right is the I < 0 hyperbola.
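The two cases can be compared numerically. The sketch below assumes the boost ct′ = γ(ct − βx) and scans a sample of frame velocities; the sample events (ct, x) = (2, 1) and (1, 2) and the name `boosted_ct` are illustrative choices, not taken from the notes.

```python
import math

def boosted_ct(ct, x, beta):
    """Time coordinate ct' = gamma * (ct - beta * x) in a frame moving at beta."""
    return (ct - beta * x) / math.sqrt(1.0 - beta * beta)

betas = [i / 100.0 for i in range(-99, 100)]   # sample of |beta| < 1

# Timelike-separated event in O's future (ct > |x|): is ct' > 0 in every frame?
timelike_signs = {boosted_ct(2.0, 1.0, b) > 0 for b in betas}

# Spacelike-separated event (|x| > ct): the sign of ct' depends on the frame.
spacelike_signs = {boosted_ct(1.0, 2.0, b) > 0 for b in betas}
```

For the timelike event the set contains only `True`, while for the spacelike event it contains both `True` and `False`: frames with β > 0.5 place that event before the origin.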
So what is the connection between these arguments and the geometry of Minkowski
space? Suppose spacetime were Euclidean, with the interval given in Eq. (4). In that
case, the locus of points of constant interval would be a circle around the origin, and
the analysis given here would have shown that if P started out in O’s future, it could
be transformed into O’s future, present, or past, destroying causality completely.
Causality is rescued by the fact that spacetime is Minkowski in nature.
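To illustrate the counterfactual, the sketch below treats a hypothetical "Euclidean spacetime" whose symmetry transformations are ordinary rotations of the (ct, x) plane. This is an assumption made only for illustration; the notes do not spell out the transformation. A half-turn preserves the Euclidean interval of Eq. (4) but reverses the sign of ct.

```python
import math

def rotate(ct, x, phi):
    """Ordinary rotation of the point (ct, x) by angle phi in a Euclidean plane."""
    return (ct * math.cos(phi) - x * math.sin(phi),
            ct * math.sin(phi) + x * math.cos(phi))

ct_P, x_P = 2.0, 1.0                          # P starts in O's future (ct > 0)
ct_rot, x_rot = rotate(ct_P, x_P, math.pi)    # half-turn about the origin

euclid_before = ct_P ** 2 + x_P ** 2          # Euclidean "interval", Eq. (4)
euclid_after = ct_rot ** 2 + x_rot ** 2
```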
In analogy to the facts above, we should be able to construct the x′-axis by setting
ct′ equal to zero. If we do that, we find
\[ ct = \beta x, \tag{15} \]
which tells us that the x′-axis can be drawn as a line in the x-ct plane that passes
through O and has slope β. Similarly, we can construct the ct′-axis by setting x′
equal to zero in
\[ x' = \gamma(x - \beta ct), \tag{16} \]
producing
\[ ct = \frac{x}{\beta}. \tag{17} \]
Therefore, the ct′-axis makes the same angle with respect to the ct-axis as the x′-axis
makes with respect to the x-axis; the slope of the ct′-axis is 1/β♣. The primed and
unprimed axes are shown in Figure 6. Note that in the primed coordinates, the light
cone still has slope 1, which means that the speed of light is the same in both reference
frames, as it should be. Note also that we are setting the origins of the primed and
unprimed coordinates equal in our diagram; this is OK because spacetime is invariant
under translations of the origin, so we always have the freedom to put the origins on
top of each other.
♣ This is a reasonable result since an observer at rest in the primed frame would make a slanted
path in the unprimed frame, with the slope of that path equaling the reciprocal of the observer’s β.
[Figure 6: the primed axes (ct′, x′) drawn in the unprimed x-ct plane.]
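The construction of the primed axes can be sketched numerically. The code below takes the slopes directly from Eqs. (15) and (17), checks that the two axes tilt by the same angle, and confirms that a light ray through the origin keeps slope 1 after the boost. It assumes 0 < β < 1, and the name `primed_axes` is illustrative.

```python
import math

def primed_axes(beta):
    """Slopes d(ct)/dx and tilt angles of the primed axes for 0 < beta < 1."""
    if not 0.0 < beta < 1.0:
        raise ValueError("this sketch assumes 0 < beta < 1")
    slope_x_axis = beta          # x'-axis: ct = beta * x, Eq. (15)
    slope_ct_axis = 1.0 / beta   # ct'-axis: ct = x / beta, Eq. (17)
    tilt_x = math.atan(slope_x_axis)                     # measured from the x-axis
    tilt_ct = math.pi / 2.0 - math.atan(slope_ct_axis)   # measured from the ct-axis
    return slope_x_axis, slope_ct_axis, tilt_x, tilt_ct

slope_x, slope_ct, tilt_x, tilt_ct = primed_axes(0.5)

# A photon through the origin passes through (ct, x) = (1, 1); boost that event.
g = 1.0 / math.sqrt(1.0 - 0.5 ** 2)
ct_p, x_p = g * (1.0 - 0.5 * 1.0), g * (1.0 - 0.5 * 1.0)
```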
By using a spacetime diagram representing two inertial frames simultaneously,
most kinematics problems in special relativity reduce to the task of finding the coor-
dinates of relevant events in the two frames. By plotting these events on a spacetime
diagram and determining their coordinates in both frames with the Lorentz transfor-
mations, it is often much easier to solve problems in an understandable way than it
is by doing algebra alone. We will see some examples of this below.
2.1 Simultaneity
Suppose we are standing in front of a set of train tracks such that the tracks are
oriented along our x -axis. A train car comes by at speed β. Just as the center of the
train car is right in front of us, two lightning bolts strike. One bolt strikes the front
of the train car and the other the back of the train car; both strikes are simultaneous
in our frame. Are they simultaneous in the train car’s frame?
In order to analyze this situation, we need to understand what the problem is
asking. There are two events of relevance, the two lightning strikes; we will call them
events A and B. By saying that these events look simultaneous in our reference frame,
we mean that they have the same time coordinate. Clearly, however, they must have
different space coordinates. Let’s say that event A has coordinates x = -d and ct =
0 and event B has coordinates x = d and ct = 0.
To find how these events “look” in the train car’s frame, we must transform
the coordinates of these events. Before blindly calculating, however, let’s see what
all this looks like on a spacetime diagram, like the one in Figure 7. We see pictured
both sets of axes, as well as two slanted lines representing the front and back of
the train car; the events A and B are labeled. Clearly, these events have the same
ct-coordinate. However, they do NOT have the same ct′-coordinate. Starting from
event A, we follow a line parallel to the x′-axis until we intercept the ct′-axis; this is
the ct′ coordinate of A, and it is positive. If we follow the same procedure for event
B, we find that it has a negative ct′ coordinate. This means that in the primed frame,
event B happens before event A.
Let’s confirm this with algebra. For event A, we have
\[ ct' = \gamma(ct - \beta x) = \gamma(0 - \beta(-d)) = \gamma\beta d, \tag{18} \]
while for event B we have
\[ ct' = \gamma(ct - \beta x) = \gamma(0 - \beta(d)) = -\gamma\beta d. \tag{19} \]
We see that the ct′ coordinate of event B is indeed less than that of event A.
Therefore, events simultaneous in one Lorentz frame are generally NOT simultaneous
in a different Lorentz frame.
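The same calculation can be run numerically. This is a small sketch with illustrative values d = 1 and β = 0.6, which are not from the notes; it evaluates ct′ = γ(ct − βx) for the two strikes.

```python
import math

def ct_prime(ct, x, beta):
    """Time coordinate in the train frame: ct' = gamma * (ct - beta * x)."""
    return (ct - beta * x) / math.sqrt(1.0 - beta * beta)

d, beta = 1.0, 0.6                 # illustrative values, not from the notes
ct_A = ct_prime(0.0, -d, beta)     # event A at x = -d, ct = 0
ct_B = ct_prime(0.0, +d, beta)     # event B at x = +d, ct = 0
```

With these values γβd = 0.75, so event A has ct′ = 0.75 and event B has ct′ = −0.75: B precedes A in the train car's frame.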
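[Figure 7: spacetime diagram showing both sets of axes (ct, x and ct′, x′), the worldlines of the front and back of the train car, and events A and B on the x-axis.]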
[Figure 8: spacetime diagram showing both sets of axes (ct, x and ct′, x′), the origin O, events A and B, and the length L0.]
\[
\begin{aligned}
x' &= \gamma(x - \beta ct) \\
   &= \gamma(L_0 - \beta^2 L_0) \\
   &= \gamma(1 - \beta^2) L_0 \\
   &= \frac{L_0}{\gamma}.
\end{aligned}
\tag{21}
\]
Since γ is greater than 1 whenever β is nonzero, we see that the length as measured in the moving
frame is shorter than the length measured in the rod’s rest frame, which is the result
we expected. Something funny is happening with geometry: the rod is shorter in the
moving frame even though it looks like the opposite is true in the diagram. This is
yet another example of the distortion of distances inherent in spacetime diagrams.
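Eq. (21) can be checked with a short numerical sketch. It assumes, as the substitution in Eq. (21) suggests, that the relevant event has x = L0 and ct = βL0 in the rod's rest frame, so that it lies on the x′-axis (ct′ = 0). The function name and sample values are illustrative.

```python
import math

def far_end_in_moving_frame(L0, beta):
    """Return (ct', x') of the event x = L0, ct = beta * L0 in the moving frame."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    x = L0            # far end of the rod in its rest frame
    ct = beta * L0    # chosen so that ct' = gamma * (ct - beta * x) = 0
    return g * (ct - beta * x), g * (x - beta * ct)

ct_p, x_p = far_end_in_moving_frame(10.0, 0.6)
```

Here x′ comes out as 8, which is L0/γ for γ = 1.25, and ct′ is zero, so the two ends of the rod are being located at the same time in the moving frame.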
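[Figure 9: spacetime diagram showing both sets of axes (ct, x and ct′, x′), events O and A, and a dotted line from A to the ct-axis marking the elapsed time cT.]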
This situation is depicted in Figure 9. The two flashes of light are our events, and
they occur at points O and A. The dotted line from A to the ct-axis indicates how
much time appears to have passed in the unprimed frame. It seems like less time has
passed in the unprimed frame, which is opposite of what we expect, but we are again
rescued by the hyperbolic geometry of spacetime.
Setting x′ = 0 and ct′ = cτ, we find
\[ ct = \gamma(ct' + \beta x') = \gamma(c\tau). \tag{22} \]
Thus the unprimed observer experiences the passage of a time γτ while he sees the
moving clock register the passage of a time τ , where τ is by definition the proper time
of the clock, the time that elapsed in the clock’s own rest frame. In other words, the
unprimed observer sees the moving clock run slowly.
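A quick numerical sketch of Eq. (22) follows. It uses the inverse boost ct = γ(ct′ + βx′) from the text together with its standard companion x = γ(x′ + βct′), which is assumed here rather than quoted from the notes; the sample values are illustrative.

```python
import math

def inverse_boost(ct_prime, x_prime, beta):
    """Map (ct', x') in the clock's frame back to (ct, x) in the unprimed frame."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (ct_prime + beta * x_prime), g * (x_prime + beta * ct_prime)

c_tau = 4.0                               # c times the proper time on the clock
ct, x = inverse_boost(c_tau, 0.0, 0.6)    # the clock sits at x' = 0
```

With β = 0.6 the unprimed frame assigns ct = 5 to an event the moving clock reads as cτ = 4, and the clock has moved to x = 3 = β·ct in the meantime.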
[Figure 10: spacetime diagram of the twins’ journeys in the x-ct plane, with the two halves of the trip each marked T/2 along the ct-axis.]
constant speed, turns around, and returns home at that same speed; we will neglect
the short periods of acceleration necessary to change his velocity. From Fred’s point
of view, George spends a time T /2 on his outward journey, and the same time on his
return journey. This situation is depicted in Figure 10.
We would like to know which twin is older upon their reunion; in other words,
we would like to know which of the two twins has experienced more proper time
between their separation and reunion. For Fred, who remained on Earth, the amount
of proper time experienced is T. The question before us is to calculate the proper
time experienced by George.
To do this, we turn once again to the spacetime interval. We know that this
interval is invariant under Lorentz transformations, but we have not discussed the
physical interpretation of this invariant quantity. In order to do so, let’s restrict our
attention to the case in which the interval is timelike, so that it is positive. We can
then take its square root without worrying about generating imaginary numbers.
Let’s also think about infinitesimal length and time separations between two events
(dx and dt) instead of finite separations (∆x and ∆t). Then we can define a quantity
ds such that
\[
\begin{aligned}
ds &= \sqrt{c^2 (dt)^2 - (dx)^2} \\
   &= c\,dt \sqrt{1 - \frac{1}{c^2}\left(\frac{dx}{dt}\right)^2} \\
   &= c\,dt \sqrt{1 - \beta^2} \\
   &= \frac{c\,dt}{\gamma}.
\end{aligned}
\tag{23}
\]
When γ is constant (on our diagram, when the path under consideration is a straight
line), this integrates to
\[ s = \frac{ct}{\gamma}. \tag{24} \]
Comparing this to our expression for time dilation, Eq. (22), we see that s (up to a
factor of c) is nothing but the proper time for an object moving with Lorentz factor γ
while a coordinate time t passes in the unprimed frame. Therefore, when the interval is
timelike, we can interpret it as the proper time between two events♠.
\[ \tau_F > \tau_G, \tag{27} \]
as expected.
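The comparison can be made concrete with a short sketch. It assumes that George's proper time is obtained by applying Eq. (24) to each straight leg of his trip, each lasting T/2 in Fred's frame, so that his total is T/γ; the function name and sample values are illustrative.

```python
import math

def twin_proper_times(T, beta):
    """Proper times (Fred, George) for a round trip lasting T in Fred's frame."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    tau_fred = T                           # Fred stays at rest, so gamma = 1
    tau_george = 2.0 * (T / 2.0) / gamma   # two straight legs, Eq. (24) on each
    return tau_fred, tau_george

tau_F, tau_G = twin_proper_times(10.0, 0.8)
```

For T = 10 and β = 0.8, Fred ages 10 units while George ages 6.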
path of minimum distance. It turns out that the straight line connecting two timelike-
separated events in spacetime is always the path of maximum proper time; this is yet
another result of the peculiar geometry of Minkowski space. Consider the case of
Fred and George. Fred followed the straight line path through spacetime between the
events marked by the twins’ departure and arrival, whereas George followed a different
path. Given the form of Eq. (26), we see that George’s proper time must always be
less than Fred’s, no matter his speed, because his γ is greater than one. Hence,
Fred followed the path of maximum proper time between those two events. For two
arbitrary timelike separated events, there is always a straight-line path connecting
them that corresponds to the worldline of an inertial observer. By transforming into
the frame of that observer, we reproduce the situation of Fred and George because
that observer is now at rest. Therefore, the straight line path always maximizes the
proper time between two events.
So what is the minimum possible proper time between two events? We see from
Eq. (26) that as γ increases, τ decreases. The limiting case is the one where γ tends
to infinity, which is the case for a ray of light traveling at c. Hence, the proper time
elapsed for light is always zero; colloquially, we say that time does not pass for a
ray of light. This means that we can always connect two events by a path of proper
time equal to 0 by connecting them by a series of lines of slope ±1.
The second important insight has to do with accelerations. The way that intro-
ductory special relativity is usually taught makes it seem like SR completely breaks
down if an object accelerates. After all, the frames we consider are always inertial,
so what place could accelerations possibly have in this theory? The answer is that
while frames must always be inertial, no one ever said that observers all had to be;
this is, in fact, why I made such a big distinction between frames and observers at
the beginning of these notes. Given a Lorentz frame, an observer can move in an
accelerated fashion with respect to the frame; if drawn on a spacetime diagram,
his worldline would be a smooth curve with slope always greater than 1 in magnitude.
How do we reconcile accelerated observers with our standard ideas about Lorentz
frames? Take Eq. (24). We arrived at it by assuming that γ was constant. Suppose,
however, that γ were changing, which is the same as supposing that the observer it
represents is accelerating. We could write the proper time of such an observer as
\[ \tau = \int_{t_1}^{t_2} \frac{dt}{\gamma(t)}. \tag{28} \]
This is nice mathematically, but we need to make sure it has a sensible physical