Chapter 2 Special Relativity

2

From Electrodynamics to

SPECIAL RELATIVITY

Introduction. We have already had occasion to note that “Maxwell’s trick”

implied—tacitly but inevitably—the abandonment of Galilean relativity. We
have seen how this development came about (it was born of Maxwell’s desire
to preserve charge conservation), and can readily appreciate its revolutionary
signiﬁcance, for
To the extent that Maxwellean electrodynamics is physically
correct, Newtonian dynamics—which is Galilean covariant—
must be physically in error.

. . . but have now to examine the more detailed ramiﬁcations of this formal
development. The issue leads, of course, to special relativity.
That special relativity is—though born of electrodynamics—“bigger” than
electrodynamics (i.e., that it has non-electrodynamic implications, applications
—and roots) is a point clearly appreciated by Einstein himself (). Readers
should understand, therefore, that my intent here is a limited one: my goal
is not to produce a “complete account of the special theory of relativity” but
only to develop those aspects of special relativity which are speciﬁcally relevant
to our electrodynamical needs . . . and, conversely, to underscore those aspects
of electrodynamics which are of a peculiarly “relativistic” nature.
In relativistic physics c —born of electrodynamics and called (not quite
appropriately) the “velocity of light”—is recognized for what it is: a constant
106 Aspects of special relativity

of Nature which would retain its relevance and more fundamental meaning
“even if electrodynamics—light—did not exist.” From
$$[c] = \text{velocity} = LT^{-1}$$

we see that in “c-physics” we can, if we wish, measure temporal intervals in the

units of spatial length. It is in this spirit—and because it proves formally to be
very convenient—that we agree henceforth to write

$$x^0 \equiv ct \quad\text{and}\quad \begin{cases} x^1 \equiv x \\ x^2 \equiv y \\ x^3 \equiv z \end{cases}$$
To indicate that he has used his “good clock and Cartesian frame” to assign
coordinates to an “event” (i.e., to a point in space at a moment in time:
briefly, to a point in spacetime) an inertial observer O may write $x^\mu$ with
$\mu \in \{0, 1, 2, 3\}$. Or he may (responding to the convenience of the moment)
write one of the following:

$$x \equiv \begin{pmatrix} x^0 \\ \mathbf{x} \end{pmatrix} \equiv \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}$$

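We agree also to write

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu}\,, \quad\text{and also}\quad \partial \equiv \begin{pmatrix} \partial_0 \\ \nabla \end{pmatrix} \equiv \begin{pmatrix} \partial_0 \\ \partial_1 \\ \partial_2 \\ \partial_3 \end{pmatrix}$$

Note particularly that $\partial_0 = \frac{1}{c}\partial_t$. We superscript $x$’s but subscript $\partial$’s in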

anticipation of a fundamental transformation-theoretic distinction that will be
discussed in §2.
It is upon this notational base—simple though it is—that we will build.

1. Notational reexpression of Maxwell’s equations. Even simple thoughts can be

rendered unintelligible if awkwardly expressed . . . and Maxwell’s was hardly a
“simple thought.” It took physicists the better part of 40 years to gain a
clear sense of the essentials of the theory that Maxwell had brought into being
(and which he himself imagined to be descriptive of the mechanical properties
of an imagined but elusive “æther”). Running parallel to the ever-deepening
physical insight were certain notational adjustments/simpliﬁcations inspired by
developments in the world of pure mathematics.77
During the last decade of that formative era increasing urgency attached
to a question
77
See “Theories of Maxwellian design” ().
Notational preparations 107

What are the (evidently non-Galilean) transformations

which preserve the form of Maxwell’s equations?

was ﬁrst posed () and resolved () by H. A. Lorentz ( –), who
was motivated by a desire to avoid the ad hoc character of previous attempts
to account for the results of the Michelson–Morley, Trouton–Noble and related
experiments. Lorentz’ original discussion78 strikes the modern eye as excessively
complex. The discussion which follows owes much to the mathematical insight
of H. Minkowski (–),79 whose work in this ﬁeld was inspired by the
accomplishments of one of his former students (A. Einstein), but which has
roots also in Minkowski’s youthful association with H. Hertz ( –), and
is distinguished by its notational modernism.
Here we look to the notational aspects of Minkowski’s contribution, drawing
tacitly (where Minkowski drew explicitly) upon the notational conventions and
conceptual resources of tensor analysis. In a reversal of the historical order,
I will in §2 let the pattern of our results serve to motivate a review of tensor
algebra and calculus. We will be placed then in position to observe (in §3) the
sense in which special relativity almost “invents itself.” Now to work:
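Let Maxwell’s equations (65) be notated

$$
\begin{aligned}
\nabla\cdot\mathbf{E} &= \rho \\
\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial}{\partial t}\mathbf{E} &= \frac{1}{c}\,\mathbf{j} \\
\nabla\cdot\mathbf{B} &= 0 \\
\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial}{\partial t}\mathbf{B} &= \mathbf{0}
\end{aligned}
$$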

where, after placing all ﬁelds on the left and sources on the right, we have
grouped together the “sourcy” equations (Coulomb, Ampere), and formed a
second quartet from their sourceless counterparts. Drawing now upon the
notational conventions introduced on the preceding page we have

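$$
\left.
\begin{aligned}
\partial_1 E_1 + \partial_2 E_2 + \partial_3 E_3 &= \tfrac{1}{c}\, j^0 \equiv \rho \\
-\partial_0 E_1 + \partial_2 B_3 - \partial_3 B_2 &= \tfrac{1}{c}\, j^1 \\
-\partial_0 E_2 - \partial_1 B_3 + \partial_3 B_1 &= \tfrac{1}{c}\, j^2 \\
-\partial_0 E_3 + \partial_1 B_2 - \partial_2 B_1 &= \tfrac{1}{c}\, j^3
\end{aligned}
\right\} \tag{157.1}
$$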

78
Reprinted in English translation under the title “Electromagnetic
phenomena in a system moving with any velocity less than that of light” in
The Principle of Relativity (), a valuable collection of reprinted classic papers
which is still available in paperback (published by Dover).
79
See §7 of “Die Grundgleichungen für die elektromagnetischen Vorgänge in
bewegten Körpern” () in Minkowski’s Collected Works.


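$$
\left.
\begin{aligned}
-\partial_1 B_1 - \partial_2 B_2 - \partial_3 B_3 &= 0 \\
+\partial_0 B_1 + \partial_2 E_3 - \partial_3 E_2 &= 0 \\
+\partial_0 B_2 - \partial_1 E_3 + \partial_3 E_1 &= 0 \\
+\partial_0 B_3 + \partial_1 E_2 - \partial_2 E_1 &= 0
\end{aligned}
\right\} \tag{157.2}
$$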
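where we have found it formally convenient to write

$$j \equiv \begin{pmatrix} j^0 \\ \mathbf{j} \end{pmatrix} = \begin{pmatrix} j^0 \\ j^1 \\ j^2 \\ j^3 \end{pmatrix} \quad\text{with}\quad j^0 \equiv c\rho \tag{158}$$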

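It is evident that (157.1) could be written in the following remarkably compact
and simple form

$$\partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^\nu$$

(note: here as always, summation $\sum_{\mu=0}^{3}$ on the repeated index is understood)
provided the $F^{\mu\nu}$ are defined by the following scheme:

$$
\mathbb{F} \equiv
\begin{pmatrix}
F^{00} & F^{01} & F^{02} & F^{03} \\
F^{10} & F^{11} & F^{12} & F^{13} \\
F^{20} & F^{21} & F^{22} & F^{23} \\
F^{30} & F^{31} & F^{32} & F^{33}
\end{pmatrix}
=
\begin{pmatrix}
0 & -E_1 & -E_2 & -E_3 \\
E_1 & 0 & -B_3 & B_2 \\
E_2 & B_3 & 0 & -B_1 \\
E_3 & -B_2 & B_1 & 0
\end{pmatrix}
\equiv A(\mathbf{E}, \mathbf{B}) \tag{159}
$$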

Here the A-notation is intended to emphasize that the 4×4 matrix in question
is antisymmetric; as such, it has 6 independently-specifiable components,
which at (159) we have been motivated to identify in a specific way with the
six components of a pair of 3-vectors. The statement

$$F^{\nu\mu} = -F^{\mu\nu} \;:\; \text{more compactly}\quad \mathbb{F}^{T} = -\mathbb{F} \tag{160}$$

evidently holds at every spacetime point, and will play a central role in our work
henceforth.
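It follows by inspection from results now in hand that the sourceless field
equations (157.2) can be formulated

$$\partial_\mu G^{\mu\nu} = 0$$

with

$$
\mathbb{G} \equiv \|G^{\mu\nu}\| = A(-\mathbf{B}, \mathbf{E}) =
\begin{pmatrix}
0 & B_1 & B_2 & B_3 \\
-B_1 & 0 & -E_3 & E_2 \\
-B_2 & E_3 & 0 & -E_1 \\
-B_3 & -E_2 & E_1 & 0
\end{pmatrix} \tag{161}
$$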
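The layouts (159) and (161) can be checked mechanically. The following is a minimal sketch in plain Python; the helper names `A`, `transpose` and `negate`, and the sample values chosen for E and B, are illustrative assumptions and do not come from the text.

```python
# Sketch: build A(u, v) as laid out in (159) and check antisymmetry (160).
# Helper names and the sample field values are assumptions for illustration.

def A(u, v):
    """Antisymmetric 4x4 array A(u, v) with the layout of (159)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [
        [0.0, -u1, -u2, -u3],
        [u1, 0.0, -v3, v2],
        [u2, v3, 0.0, -v1],
        [u3, -v2, v1, 0.0],
    ]

def transpose(m):
    return [list(row) for row in zip(*m)]

def negate(m):
    return [[-x for x in row] for row in m]

E = (1.0, 2.0, 3.0)   # sample (E1, E2, E3)
B = (4.0, 5.0, 6.0)   # sample (B1, B2, B3)

F = A(E, B)                         # the array of (159)
G = A(tuple(-b for b in B), E)      # the array of (161)

is_antisymmetric_F = transpose(F) == negate(F)
is_antisymmetric_G = transpose(G) == negate(G)
print(is_antisymmetric_F, is_antisymmetric_G)
print(G[0])   # top row of G: 0, B1, B2, B3
```

Writing G as `A(-B, E)` rather than typing out its sixteen entries keeps the sketch tied to the A-notation used in the text.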

. . . but with this step we have acquired an obligation to develop the sense in
which G is a “natural companion” of F. To that end:
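Let the Levi-Civita symbol $\epsilon_{\mu\nu\rho\sigma}$ be defined

$$
\epsilon_{\mu\nu\rho\sigma} \equiv
\begin{cases}
+1 & \text{if } (\mu\nu\rho\sigma) \text{ is an even permutation of } (0123) \\
-1 & \text{if } (\mu\nu\rho\sigma) \text{ is an odd permutation of } (0123) \\
\phantom{+}0 & \text{otherwise}
\end{cases}
$$

and let quantities $F_{\mu\nu}$ be constructed with its aid:

$$F_{\mu\nu} \equiv \tfrac{1}{2}\,\epsilon_{\mu\nu\alpha\beta} F^{\alpha\beta} \quad\text{where } \textstyle\sum_{\alpha,\beta} \text{ is understood} \tag{162}$$

By computation we readily establish that

$$
\|F_{\mu\nu}\| =
\begin{pmatrix}
0 & F^{23} & -F^{13} & F^{12} \\
 & 0 & F^{03} & -F^{02} \\
(-) & & 0 & F^{01} \\
 & & & 0
\end{pmatrix}
=
\begin{pmatrix}
0 & -B_1 & -B_2 & -B_3 \\
B_1 & 0 & -E_3 & E_2 \\
B_2 & E_3 & 0 & -E_1 \\
B_3 & -E_2 & E_1 & 0
\end{pmatrix}
= A(\mathbf{B}, \mathbf{E})
$$

(the entries marked $(-)$ below the diagonal follow by antisymmetry)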

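which would become $\mathbb{G}$ if we could change the sign of the B-entries, and this
is readily accomplished: multiply $A(\mathbf{B}, \mathbf{E})$ by

$$
\boldsymbol{g} \equiv
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix} \tag{163}
$$

on the right (this leaves the 0th column unchanged, but changes the sign of the
1st, 2nd and 3rd columns), and again by another factor of $\boldsymbol{g}$ on the left (this
leaves the 0th row unchanged, but changes the sign of the 1st, 2nd and 3rd rows,
the 1st, 2nd and 3rd elements of which have now been restored to their original
signs). We are led thus to $\boldsymbol{g}\, A(\mathbf{B}, \mathbf{E})\, \boldsymbol{g} = A(-\mathbf{B}, \mathbf{E})$ which, because

$$
\boldsymbol{g} =
\begin{cases}
\boldsymbol{g}^{T} & :\ \boldsymbol{g} \text{ is its own transpose (i.e., is symmetric)} \\
\boldsymbol{g}^{-1} & :\ \boldsymbol{g} \text{ is its own inverse}
\end{cases} \tag{164}
$$

can also be expressed $A(\mathbf{B}, \mathbf{E}) = \boldsymbol{g}\, A(-\mathbf{B}, \mathbf{E})\, \boldsymbol{g}$. In short,80

$$\|F_{\mu\nu}\| = \boldsymbol{g}\, \mathbb{G}\, \boldsymbol{g}^{T} \quad\text{equivalently}\quad \mathbb{G} = \boldsymbol{g}^{-1}\, \|F_{\mu\nu}\|\, (\boldsymbol{g}^{-1})^{T} \tag{165}$$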

80
problem 35.

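Let the elements of $\boldsymbol{g}$ be called $g_{\mu\nu}$, and the elements of $\boldsymbol{g}^{-1}$ (though they
happen to be numerically identical to the elements of $\boldsymbol{g}$) be called $g^{\mu\nu}$:

$$
\boldsymbol{g} \equiv \|g_{\mu\nu}\| \quad\text{and}\quad \boldsymbol{g}^{-1} \equiv \|g^{\mu\nu}\|
\quad\Longrightarrow\quad
g^{\mu\alpha} g_{\alpha\nu} = \delta^{\mu}{}_{\nu} \equiv
\begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases}
$$

We then have

$$F_{\mu\nu} = g_{\mu\alpha}\, g_{\nu\beta}\, G^{\alpha\beta} \quad\text{or equivalently}\quad G^{\mu\nu} = g^{\mu\alpha}\, g^{\nu\beta}\, F_{\alpha\beta}$$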

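To summarize: we have

$$F^{\mu\nu} \xrightarrow{\ \text{contract with } \frac{1}{2}\epsilon_{\mu\nu\alpha\beta}\ } F_{\mu\nu} \xrightarrow{\ \text{lift indices with } \boldsymbol{g}^{-1}\ } g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta} = G^{\mu\nu}$$

which in $(\mathbf{E}, \mathbf{B})$-notation reads

$$\mathbb{F} = A(\mathbf{E}, \mathbf{B}) \longrightarrow A(\mathbf{B}, \mathbf{E}) \longrightarrow A(-\mathbf{B}, \mathbf{E}) = \mathbb{G}$$

Repetition of the process gives

$$\mathbb{G} = A(-\mathbf{B}, \mathbf{E}) \longrightarrow A(\mathbf{E}, -\mathbf{B}) \longrightarrow A(-\mathbf{E}, -\mathbf{B}) = -\mathbb{F}$$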
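The two-step dualization summarized here (contract with the Levi-Civita symbol, then multiply by the diagonal matrix (163) on both sides) can be replayed numerically. This is a sketch only: the function names `parity`, `dual_lower`, `raise_both` and `dualize` and the sample field values are assumptions introduced for illustration.

```python
from itertools import permutations

def parity(p):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:          # sort by transpositions, flipping the sign each time
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

# Nonzero components of the Levi-Civita symbol, keyed by index tuple.
EPS = {p: parity(p) for p in permutations(range(4))}

def A(u, v):
    """Antisymmetric 4x4 array with the layout of (159)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [[0.0, -u1, -u2, -u3],
            [u1, 0.0, -v3, v2],
            [u2, v3, 0.0, -v1],
            [u3, -v2, v1, 0.0]]

def dual_lower(F_up):
    """(1/2) eps_{mu nu alpha beta} F^{alpha beta}, as in (162)."""
    out = [[0.0] * 4 for _ in range(4)]
    for (m, n, a, b), s in EPS.items():
        out[m][n] += 0.5 * s * F_up[a][b]
    return out

g_diag = [1.0, -1.0, -1.0, -1.0]   # diagonal of the matrix (163)

def raise_both(T):
    """Multiply by the diagonal matrix (163) on the left and on the right."""
    return [[g_diag[m] * T[m][n] * g_diag[n] for n in range(4)] for m in range(4)]

def dualize(F_up):
    return raise_both(dual_lower(F_up))

E = (1.0, 2.0, 3.0)   # sample values
B = (4.0, 5.0, 6.0)
F = A(E, B)
G = dualize(F)
minus_F = [[-x for x in row] for row in F]

print(dual_lower(F) == A(B, E))              # intermediate step A(B, E)
print(G == A(tuple(-b for b in B), E))       # G = A(-B, E)
print(dualize(G) == minus_F)                 # dualizing twice gives -F
```

Storing only the 24 nonzero components of the symbol keeps the contraction loop short; a dense 4×4×4×4 array would work equally well.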

$G^{\mu\nu}$ is said to be the “dual” of $F^{\mu\nu}$, and the process $F^{\mu\nu} \longrightarrow G^{\mu\nu}$ is called
“dualization;” it amounts to a kind of “rotation in $(\mathbf{E}, \mathbf{B})$-space,” in the sense
illustrated below:

[Figure 45: The “rotational” effect of “dualization” on $\mathbf{E}$ and $\mathbf{B}$.]

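Preceding remarks lend precise support and meaning to the claim that $F^{\mu\nu}$ and
$G^{\mu\nu}$ are “natural companions,” and very closely related.
We shall, as above but more generally (and for the good tensor-theoretic
reasons that will soon emerge), use $g^{\mu\nu}$ and $g_{\mu\nu}$ to raise and lower (in short,
to “manipulate”) indices, writing (for example)81

$$
\begin{aligned}
\partial^\mu &= g^{\mu\alpha}\partial_\alpha\,, & \partial_\mu &= g_{\mu\alpha}\partial^\alpha \\
j^\mu &= g^{\mu\alpha} j_\alpha\,, & j_\mu &= g_{\mu\alpha} j^\alpha \\
F_{\mu\nu} &= g_{\mu\alpha} F^{\alpha}{}_{\nu} = g_{\mu\alpha}\, g_{\nu\beta}\, F^{\alpha\beta}
\end{aligned}
$$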
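To make the index-lowering rule concrete, here is a small sketch in plain Python. The function names `lower_vector` and `lower_tensor` and all numerical values are illustrative assumptions; the only input taken from the text is the matrix (163) and the layout (159).

```python
# Sketch: lowering indices with g_{mu nu} = diag(1, -1, -1, -1), as in (163).
g = [[1, 0, 0, 0],
     [0, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 0, 0, -1]]

def lower_vector(v_up):
    """v_mu = g_{mu alpha} v^alpha."""
    return [sum(g[m][a] * v_up[a] for a in range(4)) for m in range(4)]

def lower_tensor(T_up):
    """T_{mu nu} = g_{mu alpha} g_{nu beta} T^{alpha beta}."""
    return [[sum(g[m][a] * g[n][b] * T_up[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

j_up = [7, 1, 2, 3]                 # sample (j^0, j^1, j^2, j^3)
j_down = lower_vector(j_up)

# F^{mu nu} laid out as in (159) with sample E = (1, 2, 3), B = (4, 5, 6)
F_up = [[0, -1, -2, -3],
        [1, 0, -6, 5],
        [2, 6, 0, -4],
        [3, -5, 4, 0]]
F_down = lower_tensor(F_up)

print(j_down)       # spatial components change sign
print(F_down[0])    # time-space entries change sign
print(F_down[1])    # space-space entries are unchanged
```

Lowering one index flips the sign of spatial components; lowering two flips only the entries that carry exactly one spatial index.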

81
problem 36.

We are placed thus in position to notice that the sourceless Maxwell equations
(157.2) can be formulated82

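$$
\left.
\begin{aligned}
\partial_1 F_{23} + \partial_2 F_{31} + \partial_3 F_{12} &= 0 \\
\partial_0 F_{23} + \partial_2 F_{30} + \partial_3 F_{02} &= 0 \\
\partial_0 F_{13} + \partial_1 F_{30} + \partial_3 F_{01} &= 0 \\
\partial_0 F_{12} + \partial_1 F_{20} + \partial_2 F_{01} &= 0
\end{aligned}
\right\} \tag{166.1}
$$

where the sums over cyclic permutations are sometimes called “windmill sums.”
More compactly, we have83

$$\epsilon_{\mu\alpha\rho\sigma}\, \partial^\alpha F^{\rho\sigma} = 0 \tag{166.2}$$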
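As a check on the compact form, the $\mu = 0$ component of (166.2) can be written out by hand. The steps below use only (159) for the entries $F^{23} = -B_1$, $F^{31} = -B_2$, $F^{12} = -B_3$ and (163) for $\partial^i = -\partial_i$; the factor 2 arises because each term appears twice, once for each ordering of the antisymmetric index pair.

$$
\begin{aligned}
\epsilon_{0\alpha\rho\sigma}\,\partial^\alpha F^{\rho\sigma}
&= 2\left(\partial^1 F^{23} + \partial^2 F^{31} + \partial^3 F^{12}\right) \\
&= 2\left[(-\partial_1)(-B_1) + (-\partial_2)(-B_2) + (-\partial_3)(-B_3)\right] \\
&= 2\,\nabla\cdot\mathbf{B}
\end{aligned}
$$

so the $\mu = 0$ component of (166.2) reproduces $\nabla\cdot\mathbf{B} = 0$, in agreement with the first line of (166.1).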

There is no new physics in the material presented thus far: our work has
been merely reformulational, notational—old wine in new bottles. Proceeding
in response mainly to the linearity of Maxwell’s equations, we have allowed
ourselves to play linear-algebraic and notational games intended to maximize
the formal symmetry/simplicity of Maxwell’s equations . . . so that the
transformation-theoretic problem which is our real concern can be posed in
the simplest possible terms. Maxwell himself 84 construed the electromagnetic
field to involve a pair of 3-vector fields: $\mathbf{E}$ and $\mathbf{B}$. We have seen, however, that
• one can construe the components of $\mathbf{E}$ and $\mathbf{B}$ to be the accidentally
distinguished names given to the six independently-specifiable non-zero
components of an antisymmetric tensor 85 field $F^{\mu\nu}$. The field equations
then read

$$\partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^\nu \quad\text{and}\quad \epsilon_{\mu\alpha\rho\sigma}\, \partial^\alpha F^{\rho\sigma} = 0 \tag{167}$$

provided the $g^{\alpha\beta}$ that enter into the definition $\partial^\alpha \equiv g^{\alpha\beta}\partial_\beta$ are given by
(163). Alternatively . . .
• one can adopt the view that the electromagnetic field involves a pair of
antisymmetric tensor fields $F^{\mu\nu}$ and $G^{\mu\nu}$ which are constrained to satisfy
not only the field equations

$$\partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^\nu \quad\text{and}\quad \partial_\mu G^{\mu\nu} = 0 \tag{168.1}$$

but also the algebraic condition

$$G^{\mu\nu} = \tfrac{1}{2}\, g^{\mu\alpha} g^{\nu\beta}\, \epsilon_{\alpha\beta\rho\sigma} F^{\rho\sigma} \tag{168.2}$$

Here again, the “index manipulators” $g_{\mu\nu}$ and $g^{\mu\nu}$ must be assigned the
specific meanings implicit in (163).
82
problem 37.
83
problem 38.
84
Here I take some liberty with the complicated historical facts of the matter:
see again the fragmentary essay77 cited earlier.
85
For the moment “tensor” simply means “doubly indexed.”

It will emerge that Lorentz’ question (page 107), if phrased in the terms natural
to either of those descriptions of Maxwellian electrodynamics, virtually “answers
itself.” But to see how this comes about one must possess a command of the
basic elements of tensor analysis—a subject with which Minkowski
(mathematician that he was) enjoyed a familiarity not shared by any of his
electrodynamical predecessors or contemporaries.86

2. Introduction to the algebra and calculus of tensors. Let P be a point in an

N-dimensional manifold M.87 Let $(x^1, x^2, \ldots, x^N)$ be coordinates assigned to
P by a coordinate system X inscribed on a neighborhood88 containing P, and
86
Though (167) and (168) serve optimally my immediate purposes, the reader
should be aware that there exist also many alternative formulations of the
Maxwellian theory, and that these may aﬀord advantages in specialized contexts.
We will have much to say about the formalism that proceeds from writing

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$$

and considering the fundamental object of electrodynamic analysis to be a single
4-vector field. Alternatively, one might construct and study the “6-vector”

$$
f = \begin{pmatrix} f^1 \\ f^2 \\ f^3 \\ f^4 \\ f^5 \\ f^6 \end{pmatrix}
\equiv \begin{pmatrix} E_1 \\ E_2 \\ E_3 \\ B_1 \\ B_2 \\ B_3 \end{pmatrix}
$$

(see §26 in Arnold Sommerfeld’s Electrodynamics ( English translation ) or

my Classical Field Theory (), Chapter 2, pages 4–6). Or one might consider
electrodynamics to be concerned with the properties of a single complex 3-vector

$$\mathbf{V} \equiv \mathbf{E} + i\mathbf{B}$$

(see Appendix B in my “On some recent electrodynamical work by Thomas

Wieting” ()). And there exist yet many other formalisms. Maxwell himself
gave passing attention to a “quaternionic” formulation of his theory.
87
Think “surface of a sphere,” “surface of a torus,” etc. or of their higher-
dimensional counterparts. Or of N-dimensional Euclidean space itself. Or—as
soon as you can—4-dimensional spacetime. I intend to proceed quite informally,
and to defer questions of the nature “What is a manifold?” until such time
as we are able to look back and ask “What properties should we fold into our
deﬁnitions? What did we need to make our arguments work?”
88
I say “neighborhood” because it may happen that every coordinate system
inscribed on M necessarily displays one or more singularities (think of the
longitude of the North Pole). It is our announced intention to stay away from
such points.
Introduction to tensor analysis 113

let $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N)$ be the coordinates assigned to that same point by a second
coordinate system $\bar{X}$. We seek to develop rules according to which objects
defined in the neighborhood of P respond to coordinate transformations: $X \to \bar{X}$.
The statement that “$\varphi(x)$ transforms as a scalar field” carries this familiar
meaning:

$$\varphi(x) \longrightarrow \bar{\varphi}(\bar{x}) \equiv \varphi(x(\bar{x})) \tag{169}$$

Here and henceforth: $x(\bar{x})$ alludes to the functional statements

$$x^m = x^m(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N) \;:\; m = 1, 2, \ldots, N \tag{170}$$

that describe how $X$ and $\bar{X}$ are, in the instance at hand, specifically related.
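How do the partial derivatives of $\varphi$ transform? By calculus

$$\frac{\partial\bar{\varphi}}{\partial\bar{x}^m} = \frac{\partial x^a}{\partial\bar{x}^m}\,\frac{\partial\varphi}{\partial x^a} \tag{171.1}$$

where (as always) $\sum_a$ is understood. Looking to the 2nd derivatives, we have

$$\frac{\partial^2\bar{\varphi}}{\partial\bar{x}^m\,\partial\bar{x}^n} = \frac{\partial x^a}{\partial\bar{x}^m}\,\frac{\partial x^b}{\partial\bar{x}^n}\,\frac{\partial^2\varphi}{\partial x^a\,\partial x^b} + \frac{\partial^2 x^a}{\partial\bar{x}^m\,\partial\bar{x}^n}\,\frac{\partial\varphi}{\partial x^a} \tag{171.2}$$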

Et cetera. Such are the “objects” we encounter in routine work, and the
transformation rules which we want to be able to manipulate in a simple
manner.
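The quantities $\partial x^a/\partial\bar{x}^m$ arise directly and exclusively from the equations
(170) that describe $X \leftarrow \bar{X}$. They constitute the elements of the “transformation
matrix”

$$\mathbb{W} \equiv \|W^n{}_m\|\,, \qquad W^n{}_m \equiv \partial x^n/\partial\bar{x}^m \tag{172.1}$$

the value of which will in general vary from point to point. Function theory
teaches us that the coordinate transformation will be invertible (i.e., that we
can proceed from $x^n = x^n(\bar{x})$ to equations of the form $\bar{x}^n = \bar{x}^n(x)$) if and only
if $\mathbb{W}$ is non-singular: $\det\mathbb{W} \neq 0$, which we always assume to be the case (in the
neighborhood of P). The inverse $X \to \bar{X}$ of $X \leftarrow \bar{X}$ gives rise to

$$\mathbb{M} \equiv \|M^m{}_n\|\,, \qquad M^m{}_n \equiv \partial\bar{x}^m/\partial x^n \tag{172.2}$$

It is important to notice that

$$\mathbb{W}\,\mathbb{M} = \left\|\frac{\partial x^n}{\partial\bar{x}^a}\,\frac{\partial\bar{x}^a}{\partial x^m}\right\| = \left\|\frac{\partial x^n}{\partial x^m}\right\| = \|\delta^n{}_m\| = \mathbb{I} \tag{173}$$

i.e., that the matrices $\mathbb{M}$ and $\mathbb{W}$ are inverses of each other.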
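A familiar instance may help fix the roles of the two matrices. The sketch below takes plane Cartesian coordinates as the unbarred system and polar coordinates $(r, \theta)$ as the barred one; that choice, the function names `W_matrix`, `M_matrix` and `matmul`, and the sample point are all assumptions made for illustration.

```python
import math

def W_matrix(r, theta):
    """W^n_m = d x^n / d xbar^m for x = (r cos(theta), r sin(theta))."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def M_matrix(x, y):
    """M^m_n = d xbar^m / d x^n for xbar = (r, theta) = (sqrt(x^2+y^2), atan2(y, x))."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Evaluate both Jacobians at the same point, described in each system.
r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)

WM = matmul(W_matrix(r, theta), M_matrix(x, y))
print(WM)   # the identity matrix, up to rounding
```

Both matrices vary from point to point, so the product is the identity only when the two are evaluated at the same point of the manifold.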

Objects $X^{m_1\ldots m_r}{}_{n_1\ldots n_s}$ are said to comprise the “components of a (mixed)
tensor of contravariant rank r and covariant rank s” if and only if they respond
to $X \to \bar{X}$ by the following multilinear rule:

$$
X^{m_1\ldots m_r}{}_{n_1\ldots n_s} \longrightarrow
\bar{X}^{m_1\ldots m_r}{}_{n_1\ldots n_s} = M^{m_1}{}_{a_1}\cdots M^{m_r}{}_{a_r}\, W^{b_1}{}_{n_1}\cdots W^{b_s}{}_{n_s}\, X^{a_1\ldots a_r}{}_{b_1\ldots b_s} \tag{174}
$$

All indices range on $\{1, 2, \ldots, N\}$, N is called the “dimension” of the tensor, and
summation on repeated indices is (by the “Einstein summation convention”)
understood. The covariant/contravariant distinction is signaled notationally as
a subscript/superscript distinction, and alludes to whether it is $\mathbb{W}$ or $\mathbb{M}$ that
transports the components in question “across the street, from the $X$-side to
the $\bar{X}$-side.”
If

$$X^m \longrightarrow \bar{X}^m = M^m{}_a X^a$$

then the $X^m$ are said to be “components of a contravariant vector.” Coordinate
differentials provide the classic prototype:

$$dx^m \longrightarrow d\bar{x}^m = \frac{\partial\bar{x}^m}{\partial x^a}\, dx^a \tag{175}$$

If, on the other hand,

$$X_n \longrightarrow \bar{X}_n = W^b{}_n X_b$$

then the $X_n$ are said to be “components of a covariant vector.” Here the first
partials $\varphi_{,n} \equiv \partial_n\varphi$ of a scalar field (components of the gradient) provide the
classic prototype:

$$\varphi_{,n} \longrightarrow \bar{\varphi}_{,n} = \varphi_{,b}\,\frac{\partial x^b}{\partial\bar{x}^n} \tag{176}$$

That was the lesson of (171.1).

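Look, however, to the lesson of (171.2), where we found that

$$\varphi_{,mn} \longrightarrow \bar{\varphi}_{,mn} = \varphi_{,ab}\,\frac{\partial x^a}{\partial\bar{x}^m}\,\frac{\partial x^b}{\partial\bar{x}^n} + \text{extraneous term}$$

The intrusion of the “extraneous term” is typical of the differential calculus of
tensors, and arises from an elementary circumstance: hitting

$$\bar{X}^m{}_n = M^m{}_a\, W^b{}_n\, X^a{}_b \quad\text{(say)}$$

with $\bar{\partial}_p = W^q{}_p\, \partial_q$ gives

$$
\begin{aligned}
\bar{X}^m{}_{n,p} &= M^m{}_a\, W^b{}_n\, W^q{}_p\, X^a{}_{b,q} + X^a{}_b\, W^q{}_p\, \frac{\partial (M^m{}_a W^b{}_n)}{\partial x^q} \\
&= (\text{term with covariant rank increased by one}) + (\text{extraneous term})
\end{aligned}
$$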

The “extraneous term” vanishes if the M’s and W’s are constant; i.e., if the
functions $\bar{x}^n(x)$ depend at most linearly upon their arguments: $\bar{x}^n = M^n{}_a x^a + \xi^n$.
And in a small number of (electrodynamically important!) cases the extraneous
terms cancel when derivatives are combined in certain ways . . . as we will soon
have occasion to see. But in general, eﬀective management of the extraneous
term must await the introduction of some powerful new ideas—ideas that belong
not to the algebra of tensors (my present concern) but to the calculus of tensors.
For the moment I must be content to emphasize that, on the basis of evidence
now in hand,

Not every multiply-indexed object transforms tensorially!

In particular, the $x^n$ themselves do not transform tensorially except in the linear
case $\bar{x}^n = M^n{}_a x^a$.

A conceptual point of major importance: the $X^{m_1\ldots m_r}{}_{n_1\ldots n_s}$ refer to a
tensor, but do not themselves comprise the tensor: they are the components
of the tensor $\boldsymbol{X}$ with respect to the coordinate system $X$, and collectively serve
to describe $\boldsymbol{X}$. Similarly $\bar{X}^{m_1\ldots m_r}{}_{n_1\ldots n_s}$ with respect to $\bar{X}$. The tensor itself is a
coordinate-independent object that lives “behind the scene.” The situation is
illustrated in Figure 46.

To lend substance to a remark made near the top of the page: Let $X_m$
transform as a covariant vector. Look to the transformation properties of $X_{m,n}$
and obtain

$$\bar{X}_{m,n} = W^a{}_m\, W^b{}_n\, X_{a,b} + \underbrace{\frac{\partial^2 x^a}{\partial\bar{x}^n\,\partial\bar{x}^m}\, X_a}_{\text{extraneous term, therefore non-tensorial}}$$

Now construct $A_{mn} \equiv X_{m,n} - X_{n,m} = -A_{nm}$ and obtain

$$\bar{A}_{mn} = W^a{}_m\, W^b{}_n\, A_{ab} \quad\text{because the extraneous terms cancel}$$

We conclude that the antisymmetric construction $A_{mn}$ (which we might call
the curl of the covariant vector field $X_m(x)$) does, “accidentally,” transform
tensorially.
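The cancellation can be displayed in two lines. Writing the transformation rule for $\bar{X}_{m,n}$ once as it stands and once with $m$ and $n$ exchanged, then subtracting and relabeling the dummy indices $a \leftrightarrow b$ in the second product, one has

$$
\begin{aligned}
\bar{A}_{mn} &= W^a{}_m W^b{}_n X_{a,b} - W^a{}_n W^b{}_m X_{a,b}
+ \left(\frac{\partial^2 x^a}{\partial\bar{x}^n\,\partial\bar{x}^m} - \frac{\partial^2 x^a}{\partial\bar{x}^m\,\partial\bar{x}^n}\right) X_a \\
&= W^a{}_m W^b{}_n \left(X_{a,b} - X_{b,a}\right) + 0
\end{aligned}
$$

where the bracket vanishes because mixed second partial derivatives commute.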
[Figure 46: an arrow drawn against two coordinate systems, $X$ and $\bar{X}$.]

Figure 46: The $X^m$ serve to describe the blue arrow with respect to
the black coordinate system $X$, as the $\bar{X}^m$ serve to describe the blue
arrow with respect to the red coordinate system $\bar{X}$. But neither $X^m$
nor $\bar{X}^m$ will be confused with the blue arrow itself: to do so would be
to confuse descriptors with the thing described. So it is with tensors
in general. Tensor analysis is concerned with relationships among
alternative descriptors, not with “things in themselves.”
The following points are elementary, but fundamental to applications of
the tensor concept:
1) If the components $X^{\cdots}{}_{\ldots}$ of a tensor (all) vanish in one coordinate system,
then they vanish in all coordinate systems; this follows from the homogeneity of
the defining statement (174).
2) Tensors can be added/subtracted if and only if $X^{\cdots}{}_{\ldots}$ and $Y^{\cdots}{}_{\ldots}$ are of
the same covariant/contravariant rank and dimension. Constructions of
(say) the form $A^m + B_m$ “come unstuck” when transformed; for that same
reason, statements of (say) the form $A^m = B_m$, while they may be valid
in some given coordinate system, do not entail $\bar{A}^m = \bar{B}_m$. But . . .
3) If $X^{\cdots}{}_{\ldots}$ and $Y^{\cdots}{}_{\ldots}$ are of the same rank and dimension, then

$$X^{\cdots}{}_{\ldots} = Y^{\cdots}{}_{\ldots} \quad\Longrightarrow\quad \bar{X}^{\cdots}{}_{\ldots} = \bar{Y}^{\cdots}{}_{\ldots}$$

It is, in fact, because of the remarkable transformational stability of
tensorial equations that we study this subject, and try to formulate our
physics in tensorial terms.

4) If $X^{\cdots}{}_{\ldots}$ and $Y^{\cdots}{}_{\ldots}$ are co-dimensional tensors of ranks $\{r', s'\}$ and $\{r'', s''\}$
then their product $X^{\cdots}{}_{\ldots}\, Y^{\cdots}{}_{\ldots}$ is tensorial with rank $\{r' + r'', s' + s''\}$:
tensors of the same dimension can be multiplied irrespective of their ranks.

If $X^{\cdots}{}_{\ldots}$ is tensorial of rank $\{r, s\}$ then the operation of

contraction: Set a superscript equal to a subscript, and add

yields components of a tensor of rank $\{r - 1, s - 1\}$. The mechanism is exposed
most simply by example: start from (say)

$$\bar{X}^{jk}{}_{\ell} = M^j{}_a\, M^k{}_b\, W^c{}_{\ell}\, X^{ab}{}_c$$

Set (say) $k = \ell$ and obtain

$$
\begin{aligned}
\bar{X}^{jk}{}_k &= M^j{}_a\, M^k{}_b\, W^c{}_k\, X^{ab}{}_c \\
&= M^j{}_a\, \delta^c{}_b\, X^{ab}{}_c \qquad\text{by } \mathbb{M}\mathbb{W} = \mathbb{I} \\
&= M^j{}_a\, X^{ab}{}_b
\end{aligned}
$$

according to which $X^j \equiv X^{jk}{}_k$ transforms as a contravariant vector. Similarly,
the twice-contracted objects $X^{jk}{}_{jk}$ and $X^{jk}{}_{kj}$ transform as (generally distinct)
invariants.89 Mixed tensors of high rank can be singly/multiply contracted in
many distinct ways. It is also possible to “contract one tensor into another;” a
simple example:

$$A_k B^k \;:\; \text{invariant formed by contracting a covariant vector into a contravariant vector}$$
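The invariance of a contraction is easy to confirm with numbers. In the sketch below the matrix `W` (with integer inverse `M`), the sample components, and the helper names are all hypothetical, chosen only so that the arithmetic is exact.

```python
# Sketch: contraction yields an invariant. W and M = W^{-1} are sample matrices
# (an assumption for illustration), stored with the upper index as the row.
W = [[2, 1],
     [1, 1]]
M = [[1, -1],
     [-1, 2]]

def transform_covariant(A_low):
    """Abar_n = W^b_n A_b."""
    return [sum(W[b][n] * A_low[b] for b in range(2)) for n in range(2)]

def transform_contravariant(B_up):
    """Bbar^m = M^m_a B^a."""
    return [sum(M[m][a] * B_up[a] for a in range(2)) for m in range(2)]

def transform_mixed(X):
    """Xbar^m_n = M^m_a W^b_n X^a_b."""
    return [[sum(M[m][a] * W[b][n] * X[a][b] for a in range(2) for b in range(2))
             for n in range(2)] for m in range(2)]

A_low = [3, 5]      # sample covariant components A_k
B_up = [7, -2]      # sample contravariant components B^k
X = [[1, 2],
     [3, 4]]        # sample mixed components X^j_k

invariant = sum(a * b for a, b in zip(A_low, B_up))
invariant_bar = sum(a * b for a, b in
                    zip(transform_covariant(A_low), transform_contravariant(B_up)))

trace = X[0][0] + X[1][1]
X_bar = transform_mixed(X)
trace_bar = X_bar[0][0] + X_bar[1][1]

print(invariant, invariant_bar)   # the contraction A_k B^k in both systems
print(trace, trace_bar)           # the contraction X^k_k in both systems
```

The components themselves change under the transformation; only the contracted sums agree.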

The “Kronecker symbol” $\delta^m{}_n$ is a number-valued object90 with which all
readers are familiar. If “transformed tensorially” it gives

$$
\begin{aligned}
\delta^m{}_n \longrightarrow \bar{\delta}^m{}_n &= M^m{}_a\, W^b{}_n\, \delta^a{}_b \\
&= M^m{}_a\, W^a{}_n \\
&= \delta^m{}_n \qquad\text{by } \mathbb{M}\mathbb{W} = \mathbb{I}
\end{aligned}
$$

and we are brought to the remarkable conclusion that the components $\delta^m{}_n$ of the
Kronecker tensor have the same numerical values in every coordinate system.
Thus does $\delta^m{}_n$ become what I will call a “universally available object,” to be
joined soon by a few others. With this . . .
We are placed in position to observe that if the quantities $g_{mn}$ transform
as the components of a 2nd rank covariant tensor

$$g_{mn} \longrightarrow \bar{g}_{mn} = W^a{}_m\, W^b{}_n\, g_{ab} \tag{177}$$

89
The “theory of invariants” was a favorite topic among 19th Century
mathematicians, and provided the founding fathers of tensor analysis with a
source of motivation (see pages 206 –211 in E. T. Bell’s The Development of
Mathematics ()).
90
See again the top of page 110.

then
1) the equation $g^{ma} g_{an} = \delta^m{}_n$, if taken as (compare page 110) a definition of
the contravariant tensor $g^{mn}$, makes good coordinate-independent tensor-
theoretic sense, and
2) so do the equations

$$
\begin{aligned}
X^{\cdots m \cdots}{}_{\ldots\,\ldots} &\equiv g^{ma}\, X^{\cdots\,\cdots}{}_{\ldots a \ldots} \\
X^{\cdots\,\cdots}{}_{\ldots m \ldots} &\equiv g_{ma}\, X^{\cdots a \cdots}{}_{\ldots\,\ldots}
\end{aligned}
$$

by means of which we have proposed already on page 110 to raise and
lower indices.91 To insure that $g^{ma} X^{\cdots\,\cdots}{}_{\ldots a \ldots}$ and $g^{am} X^{\cdots\,\cdots}{}_{\ldots a \ldots}$ are identical
we will require that

$$g_{mn} = g_{nm} \;:\; \text{implies the symmetry also of } g^{mn}$$

The transformation equation (177) admits (uncharacteristically) of matrix
formulation

$$\boldsymbol{g} \longrightarrow \bar{\boldsymbol{g}} = \mathbb{W}^{T} \boldsymbol{g}\, \mathbb{W}$$

Taking determinants of both sides, and writing

$$g \equiv \det\boldsymbol{g}\,, \qquad W \equiv \det\mathbb{W} = 1/\det\mathbb{M} = M^{-1}$$

we have

$$g \longrightarrow \bar{g} = W^2 g \tag{178.1}$$

The statement that $\varphi(x)$ transforms as a scalar density of weight w carries this
meaning:

$$\varphi(x) \longrightarrow \bar{\varphi}(\bar{x}) = W^w \cdot \varphi(x(\bar{x}))$$

We recover (169) in the “weightless” case w = 0 (and for arbitrary values of w
when it happens that W = 1). Evidently

$$g \equiv \det\boldsymbol{g} \text{ transforms as a scalar density of weight } w = 2 \tag{178.2}$$
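The weight-2 behavior of the determinant can be seen in the simplest curvilinear case. The sketch below assumes the Euclidean plane with Cartesian coordinates as the unbarred system (so the metric is the identity) and polar coordinates as the barred one; the names `polar_W`, `det2`, `transform_metric` and the sample point are illustrative assumptions.

```python
import math

def polar_W(r, theta):
    """W^a_m = d x^a / d xbar^m for x = (r cos(theta), r sin(theta))."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def transform_metric(g, W):
    """gbar_{mn} = W^a_m W^b_n g_ab, i.e. the matrix product W^T g W of (177)."""
    n = len(g)
    return [[sum(W[a][m] * g[a][b] * W[b][k] for a in range(n) for b in range(n))
             for k in range(n)] for m in range(n)]

r, theta = 2.0, 0.7                # sample point
g = [[1.0, 0.0], [0.0, 1.0]]       # Euclidean metric in Cartesian coordinates
W = polar_W(r, theta)
g_bar = transform_metric(g, W)     # expected: diag(1, r^2), up to rounding

lhs = det2(g_bar)                  # determinant of the transformed metric
rhs = det2(W) ** 2 * det2(g)       # W^2 g, as in (178.1)
print(g_bar)
print(lhs, rhs)
```

Here the determinant of the transformation matrix is r, so the determinant of the polar metric is r squared, as (178.1) requires.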

The more general statement that $X^{m_1\ldots m_r}{}_{n_1\ldots n_s}$ transforms as a tensor density
of weight w means that

$$\bar{X}^{m_1\ldots m_r}{}_{n_1\ldots n_s} = W^w \cdot M^{m_1}{}_{a_1}\cdots M^{m_r}{}_{a_r}\, W^{b_1}{}_{n_1}\cdots W^{b_s}{}_{n_s}\, X^{a_1\ldots a_r}{}_{b_1\ldots b_s}$$

We can multiply/contract tensors of dissimilar weight, but must be careful not
to try to add them or set them equal. The “tensor/tensor density distinction”
becomes significant only in contexts where $W \neq 1$.
Familiarity with the tensor density concept places us in position to consider
the tensor-theoretic signiﬁcance of the Levi-Civita symbol
91
Note, however, that we work now N -dimensionally, and have stripped
gmn of its formerly specialized (Lorentzian) construction (163): it has become
“generic.”
$$\epsilon_{n_1 n_2 \ldots n_N} \equiv \operatorname{sgn}\begin{pmatrix} 1 & 2 & \cdots & N \\ n_1 & n_2 & \cdots & n_N \end{pmatrix}$$

where “sgn” refers to the “signum,” which reports (see again page 109) whether
$(n_1, n_2, \ldots, n_N)$ is an even/odd permutation of $(1, 2, \ldots, N)$ or no permutation
at all. The tentative assumption that $\epsilon_{n_1 n_2 \ldots n_N}$ transforms as a (totally
antisymmetric) tensor density of unspecified weight w

$$
\begin{aligned}
\bar{\epsilon}_{n_1 n_2 \ldots n_N} &= W^w \cdot W^{a_1}{}_{n_1} W^{a_2}{}_{n_2} \cdots W^{a_N}{}_{n_N}\, \epsilon_{a_1 a_2 \ldots a_N} \\
&= W^w \cdot \epsilon_{n_1 n_2 \ldots n_N}\, \det\mathbb{W} \qquad\text{by definition of the determinant!} \\
&= W^{w+1} \cdot \epsilon_{n_1 n_2 \ldots n_N}
\end{aligned}
$$

brings us to the remarkable conclusion that the components of the Levi-Civita
tensor will have the same numerical values in every coordinate system provided
$\epsilon_{n_1 n_2 \ldots n_N}$ is assumed to transform as a density of weight w = −1. The
Levi-Civita tensor thus joins our short list of “universally available objects.”92
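The step labeled “by definition of the determinant” can be tested directly for N = 3. In this sketch the indices run over 0, 1, 2 rather than 1, 2, 3, the closed-form `eps3` is a shortcut valid only for three indices, and the sample matrix `W` is arbitrary; all of these are assumptions made for illustration.

```python
from itertools import product

def eps3(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2} (closed form valid for N = 3 only)."""
    return (i - j) * (j - k) * (k - i) // 2

# Sample transformation matrix, upper index = row, lower index = column.
W = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]

def det3(m):
    """Determinant via the permutation sum eps_{abc} m[a][0] m[b][1] m[c][2]."""
    return sum(eps3(a, b, c) * m[a][0] * m[b][1] * m[c][2]
               for a, b, c in product(range(3), repeat=3))

def transformed_eps(n1, n2, n3):
    """W^{a1}_{n1} W^{a2}_{n2} W^{a3}_{n3} eps_{a1 a2 a3}, summed over a1, a2, a3."""
    return sum(W[a][n1] * W[b][n2] * W[c][n3] * eps3(a, b, c)
               for a, b, c in product(range(3), repeat=3))

detW = det3(W)
identity_holds = all(transformed_eps(*n) == detW * eps3(*n)
                     for n in product(range(3), repeat=3))
print(detW, identity_holds)
```

Multiplying by the weight factor with w = −1 then removes the determinant, which is the sense in which the components are the same in every coordinate system.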
I have remarked that $\epsilon_{n_1 n_2 \ldots n_N}$ is “totally antisymmetric.” It is of
importance to notice in this connection that, more generally, statements of
the forms

$$X^{\cdots m \cdots n \cdots}{}_{\ldots} = \pm X^{\cdots n \cdots m \cdots}{}_{\ldots}$$

and

$$X^{\cdots}{}_{\cdots m \cdots n \cdots} = \pm X^{\cdots}{}_{\cdots n \cdots m \cdots}$$

have tensorial (or coordinate system independent) significance, while symmetry
statements of the hybrid form

$$X^{\cdots m \cdots}{}_{\cdots n \cdots} = \pm X^{\cdots n \cdots}{}_{\cdots m \cdots}$$

(while they might be valid in some particular coordinate system) “become
unstuck” when transformed. Note also that

$$X^{mn} = \tfrac{1}{2}\left(X^{mn} + X^{nm}\right) + \tfrac{1}{2}\left(X^{mn} - X^{nm}\right)$$

serves to resolve $X^{mn}$ tensorially into its symmetric and antisymmetric parts.93

92
The (weightless) “metric tensor” $g_{mn}$ is not “universally available,” but
must be introduced “by hand.” In contexts where $g_{mn}$ is available (has been
introduced to facilitate index manipulation) it becomes natural to construct

$$\sqrt{g}\;\epsilon_{n_1 n_2 \ldots n_N} \;:\; \text{weightless totally antisymmetric tensor}$$

the values of which range on $\{0, \pm\sqrt{g}\}$ in all coordinate systems.
93
problem 39.

We have now in our possession a command of tensor algebra which is

suﬃcient to serve our immediate needs, but must sharpen our command of the
differential calculus of tensors. This is a more intricate subject, but one into
which—surprisingly—we need not enter very deeply to acquire the tools needed
to achieve our electrodynamical objectives. I will be concerned mainly with the
development of a short list of “accidentally tensorial derivative constructions,”94
and will glance only cursorily at what might be called the “non-accidental
aspects” of the tensor calculus.
catalog of accidentally tensorial derivative constructions

1. We established already at (171.1) that if $\varphi$ transforms as a weightless scalar
field then the components of the gradient of $\varphi$

$$\partial_m \varphi \quad\text{transform tensorially} \tag{179.1}$$

2. And we observed on page 115 that if $X_m$ transforms as a weightless covariant
vector field then the components of the curl of $X_m$ transform tensorially.

$$\partial_n X_m - \partial_m X_n \quad\text{transform tensorially} \tag{179.2}$$

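3. If $X_{jk}$ is a weightless tensor field, how do the $\partial_i X_{jk}$ transform? Immediately

$$
\begin{aligned}
\bar{\partial}_i \bar{X}_{jk} &= W^b{}_j W^c{}_k \cdot W^a{}_i\, \partial_a X_{bc} + X_{bc}\, \bar{\partial}_i \left(W^b{}_j W^c{}_k\right) \\
&= W^a{}_i W^b{}_j W^c{}_k\, \partial_a X_{bc} + \underbrace{X_{bc} \left\{ \frac{\partial^2 x^b}{\partial\bar{x}^i\,\partial\bar{x}^j}\,\frac{\partial x^c}{\partial\bar{x}^k} + \frac{\partial x^b}{\partial\bar{x}^j}\,\frac{\partial^2 x^c}{\partial\bar{x}^k\,\partial\bar{x}^i} \right\}}_{\text{extraneous term}}
\end{aligned}
$$

so $\partial_i X_{jk}$ transforms tensorially only under such circumstances as cause the
“extraneous term” to vanish: this happens when $X \to \bar{X}$ is “affine;” i.e., when
the $\mathbb{W}$-matrix is x-independent. Notice, however, that we now have

$$
\begin{aligned}
\bar{\partial}_i \bar{X}_{jk} + \bar{\partial}_j \bar{X}_{ki} + \bar{\partial}_k \bar{X}_{ij}
&= W^a{}_i W^b{}_j W^c{}_k \left(\partial_a X_{bc} + \partial_b X_{ca} + \partial_c X_{ab}\right) \\
&\quad + X_{bc} \left\{ \frac{\partial^2 x^b}{\partial\bar{x}^i\,\partial\bar{x}^j}\,\frac{\partial x^c}{\partial\bar{x}^k} + \frac{\partial x^b}{\partial\bar{x}^j}\,\frac{\partial^2 x^c}{\partial\bar{x}^k\,\partial\bar{x}^i} \right. \\
&\qquad\qquad + \frac{\partial^2 x^b}{\partial\bar{x}^j\,\partial\bar{x}^k}\,\frac{\partial x^c}{\partial\bar{x}^i} + \frac{\partial x^b}{\partial\bar{x}^k}\,\frac{\partial^2 x^c}{\partial\bar{x}^i\,\partial\bar{x}^j} \\
&\qquad\qquad \left. + \frac{\partial^2 x^b}{\partial\bar{x}^k\,\partial\bar{x}^i}\,\frac{\partial x^c}{\partial\bar{x}^j} + \frac{\partial x^b}{\partial\bar{x}^i}\,\frac{\partial^2 x^c}{\partial\bar{x}^j\,\partial\bar{x}^k} \right\}
\end{aligned}
$$

in which $\{\text{etc.}\}$ is bc-symmetric; if $X_{bc}$ were antisymmetric the extraneous
term would therefore drop away. We conclude that if $X_{jk}$ is an antisymmetric
weightless covariant tensor field then the components of the windmill sum

$$\partial_i X_{jk} + \partial_j X_{ki} + \partial_k X_{ij} \quad\text{transform tensorially} \tag{179.3}$$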

94
The possibility and electrodynamical utility of such a list was brought ﬁrst
to my attention when, as a student, I happened upon the discussion which
appears on pages 22–24 of E. Schrödinger’s Space-time Structure (). This
elegant little volume (which runs to only 119 pages) provides physicists with
an elegantly succinct introduction to tensor analysis. I recommend it to your
attention.

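4. If $X^m$ is a vector density of unspecified weight w how does $\partial_m X^m$ transform?
Immediately

$$
\begin{aligned}
\bar{\partial}_m \bar{X}^m &= W^w \cdot M^m{}_a\, \bar{\partial}_m X^a + X^a\, \bar{\partial}_m \left(W^w \cdot M^m{}_a\right) \\
&= W^w \cdot \partial_a X^a + X^a \left\{ W^w\, \frac{\partial}{\partial\bar{x}^m}\frac{\partial\bar{x}^m}{\partial x^a} + w W^{w-1}\, \frac{\partial W}{\partial x^a} \right\}
\end{aligned}
$$

An important lemma95 asserts that

$$
\begin{aligned}
\frac{\partial}{\partial\bar{x}^m}\frac{\partial\bar{x}^m}{\partial x^a} &= \frac{\partial}{\partial x^a} \log\det\left\|\frac{\partial\bar{x}^m}{\partial x^n}\right\| \\
&= \partial_a \log M = -\partial_a \log W \\
&= -W^{-1}\, \partial_a W
\end{aligned}
$$

so

$$\bar{\partial}_m \bar{X}^m = W^w \cdot \partial_a X^a + \underbrace{X^a\, (w - 1)\, W^{w-1}\, \frac{\partial W}{\partial x^a}}_{\text{extraneous term}}$$

The extraneous term vanishes (for all w) when $X \to \bar{X}$ has the property that $W$
is x-independent,96 and it vanishes unrestrictedly if w = 1. We conclude that
if $X^m$ is a contravariant vector density of unit weight then its divergence

$$\partial_m X^m \quad\text{transforms tensorially (by invariance)} \tag{179.4}$$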

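5. If $X^{mn}$ is a tensor density of unspecified weight w how does $\partial_m X^{mn}$
transform? Immediately

$$\bar{\partial}_m \bar{X}^{mn} = W^w \cdot M^m{}_a M^n{}_b \left(W^c{}_m\, \partial_c X^{ab}\right) + \underbrace{X^{ab}\, \bar{\partial}_m \left(W^w \cdot M^m{}_a M^n{}_b\right)}_{\text{extraneous term}}$$

in which the first term equals $W^w \cdot M^n{}_b\, \partial_a X^{ab}$ by $M^m{}_a W^c{}_m = \delta^c{}_a$.
The extraneous term can be developed

$$X^{ab} \left\{ M^n{}_b\, w W^{w-1} \left(M^m{}_a \bar{\partial}_m\right) W + W^w \left[ M^n{}_b\, \bar{\partial}_m M^m{}_a + \left(M^m{}_a \bar{\partial}_m\right) M^n{}_b \right] \right\}$$

where $\bar{\partial}_m M^m{}_a = -W^{-1}\, \partial_a W$ by the lemma, so by $M^m{}_a \bar{\partial}_m = \partial_a$ we have

$$\text{extraneous term} = X^{ab} \left\{ M^n{}_b\, (w - 1)\, W^{w-1}\, \partial_a W + W^w\, \frac{\partial^2 \bar{x}^n}{\partial x^a\,\partial x^b} \right\}$$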

95
For the interesting but somewhat intricate proof, see classical dynamics
(/), Chapter 2, page 49.
96
This is weaker than the requirement that the matrix $\mathbb{W}$ be x-independent.

The second partial is ab-symmetric, and makes no net contribution if we assume
$X^{ab}$ to be ab-antisymmetric. The surviving fragment of the extraneous term
vanishes (all w) if $W$ is constant, and vanishes unrestrictedly if w = 1. We are
brought thus to the conclusion that if $X^{mn}$ is an antisymmetric density of unit
weight then

$$\partial_m X^{mn} \quad\text{transforms tensorially} \tag{179.5}$$

“Generalized divergences” $\partial_m X^{m n_1 \cdots n_p}$ yield to a similar analysis, but will not
be needed.
6. Taking (179.5) and (179.4) in combination we find that under those same
conditions (i.e., if $X^{mn}$ is an antisymmetric density of unit weight)

$$\partial_m \partial_n X^{mn} \quad\text{transforms tensorially}$$

but this is hardly news: the postulated antisymmetry of $X^{mn}$ combines with
the manifest symmetry of $\partial_m \partial_n$ to give

$$\partial_m \partial_n X^{mn} = 0 \quad\text{automatically}$$
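The vanishing follows in three steps: relabel the dummy indices, use the symmetry of mixed partial derivatives, then use the antisymmetry of $X^{mn}$:

$$\partial_m \partial_n X^{mn} = \partial_n \partial_m X^{nm} = \partial_m \partial_n X^{nm} = -\,\partial_m \partial_n X^{mn}$$

A quantity equal to its own negative is zero.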

The evidence now in hand suggests (accurately) that antisymmetry has
a marvelous power to dispose of what we have called “extraneous terms.” The
calculus of antisymmetric tensors is in fact much easier than the calculus of
tensors-in-general, and is known as the exterior calculus. That independently
developed sub-branch of the tensor calculus supports not only a differential
calculus of tensors but also (uniquely) an integral calculus, which radiates
from the theory of determinants (which are antisymmetry-infested) and in
which the fundamental statement is a vast generalization of Stokes’ theorem.97
remark: Readers will be placed at no immediate disadvantage if,
on a ﬁrst reading, they skip the following descriptive comments,
which have been inserted only in the interest of a kind of “sketchy
completeness” and which refer to material which is—remarkably!—
inessential to our electrodynamical progress (though indispensable
in many other physical contexts).

In more general (antisymmetry-free) contexts one deals with the non-tensoriality of $\partial_m X^{\cdots}{}_{\cdots}$ by modifying the concept of differentiation, writing (for example)

$$D_j \bar X_k \equiv W^{b}{}_{j} W^{c}{}_{k}\,\partial_b X_c \equiv \text{components of the covariant derivative of } \bar X_k$$

(the middle expression being the tensorial transform of $\partial_j X_k$)

97
See again the mathematical digression that culminates on page 50. A
fairly complete and detailed account of the exterior calculus can be found in
“Electrodynamical applications of the exterior calculus” ().

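where by computation

$$D_j \bar X_k = \bar\partial_j \bar X_k - \bar X_i\, \bar\Gamma^{i}{}_{jk} \quad\text{with}\quad \bar\Gamma^{i}{}_{jk} \equiv \frac{\partial \bar x^{i}}{\partial x^{p}}\,\frac{\partial^2 x^{p}}{\partial \bar x^{j}\,\partial \bar x^{k}}$$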
By extension of the notational convention $X_{k,j} \equiv \partial_j X_k$ one writes $X_{k;j} \equiv D_j X_k$. It is clear that $X_{k;j}$, since created by “tensorial continuation” from the “seed” $\partial_j X_k$, transforms tensorially, and that it has something to do with familiar differentiation (it is differentiation, but with built-in compensation for the familiar “extraneous term,” and reduces to ordinary differentiation in the root coordinate system X). The quantities $\Gamma^{i}{}_{jk}$ turn out not to transform tensorially, but by the rule

$$\bar\Gamma^{i}{}_{jk} = M^{i}{}_{a} W^{b}{}_{j} W^{c}{}_{k}\,\Gamma^{a}{}_{bc} + \frac{\partial \bar x^{i}}{\partial x^{p}}\,\frac{\partial^2 x^{p}}{\partial \bar x^{j}\,\partial \bar x^{k}}$$
characteristic of “affine connections.” Finally, one gives up the assumption that there exists a coordinate system (the X-system of prior discussion) in which $D_j$ and $\partial_j$ coincide (i.e., in which $\Gamma^{i}{}_{jk}$ vanishes globally). The affine connection $\Gamma^{i}{}_{jk}(x)$ becomes an object that we are free to deposit on the manifold M, to create an “affinely connected manifold” . . . just as by deposition of $g_{ij}(x)$ we create a “metrically connected manifold.” But when we do both things98 a compatibility condition arises, for we expect
• index manipulation followed by covariant differentiation, and
• covariant differentiation followed by index manipulation
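to yield the same result. This is readily shown to entail $g_{ij;k} = 0$, which in turn entails

$$\Gamma^{i}{}_{jk} = \tfrac12\, g^{ia}\left( \frac{\partial g_{aj}}{\partial x^{k}} + \frac{\partial g_{ak}}{\partial x^{j}} - \frac{\partial g_{jk}}{\partial x^{a}} \right)$$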
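To see the formula produce familiar numbers, here is a small sketch (an illustration, not taken from the text) that evaluates it for the Euclidean plane in polar coordinates, where $g = \mathrm{diag}(1, r^2)$. The derivatives are estimated by central differences, the inverse is taken assuming a diagonal metric, and all function names are hypothetical.

```python
def metric(x):
    """Plane polar coordinates x = (r, theta): g = diag(1, r^2)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_diagonal(g):
    """Inverse of a 2x2 metric, ASSUMING it is diagonal (true for this example)."""
    return [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]

def dmetric(x, a, h=1e-6):
    """Central-difference estimate of d g_ij / d x^a."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(2)] for i in range(2)]

def christoffel(x):
    """G[i][j][k] = (1/2) g^{ia} (d_k g_aj + d_j g_ak - d_a g_jk)."""
    n = 2
    g_inv = inverse_diagonal(metric(x))
    dg = [dmetric(x, a) for a in range(n)]  # dg[a][i][j] = d_a g_ij
    return [[[0.5 * sum(g_inv[i][a] * (dg[k][a][j] + dg[j][a][k] - dg[a][j][k])
                        for a in range(n))
              for k in range(n)]
             for j in range(n)]
            for i in range(n)]

G = christoffel((2.0, 0.3))
print(G[0][1][1], G[1][0][1])  # compare with -r and 1/r at r = 2
```

The two printed components can be compared with the textbook values $\Gamma^{r}{}_{\theta\theta} = -r$ and $\Gamma^{\theta}{}_{r\theta} = 1/r$.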
The affine connection has become implicit in the metric connection—it has
become the “Christoffel connection,” which plays a central role in Riemannian
geometry and its applications (general relativity): down the road just a short
way lies the Riemann-Christoffel curvature tensor
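$$R^{m}{}_{nij} = \frac{\partial \Gamma^{m}{}_{nj}}{\partial x^{i}} - \frac{\partial \Gamma^{m}{}_{ni}}{\partial x^{j}} + \Gamma^{m}{}_{ai}\Gamma^{a}{}_{nj} - \Gamma^{m}{}_{aj}\Gamma^{a}{}_{ni}$$

which enters into statements such as the following

$$X_{n;ij} - X_{n;ji} = X_a\, R^{a}{}_{nij}$$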

which describes the typical inequality of crossed covariant derivatives. The

“covariant derivative” was invented by Elwin Christoﬀel (–) in .
98
Notice that we need both if we want to construct such things as the

covariant Laplacian of φ ≡ g mn φ;mn


Figure 47: Any attempt to construct a transformationally coherent

theory of diﬀerentiation by comparing such neighboring vectors is
doomed unless $X \to \bar X$ gives rise to a transformation matrix that is
constant on the neighborhood.

Figure 48: The problem just noted is resolved if one compares one
vector with the local parallel transport of the other—a “stand-in”
rooted to the same point as the original vector. For then only a
single transformation matrix enters into the discussion.

Sharp insight into the meaning of the covariant derivative was provided in
 by Levi-Civita,99 who pointed out that when one works from Figure 47 one
cannot realistically expect to obtain a transformationally sensible result, for the
99
The fundamental importance of Levi-Civita’s idea was immediately
appreciated and broadcast by Hermann Weyl. See §14 in his classic Space,
Time & Matter (4th edition , the English translation of which has been
reprinted by Dover).

transformation matrices $\mathbf{W}(x)$ and $\mathbf{W}(x + dx)$ that act upon (say) $X_m(x)$ and $X_m(x + dx)$ are, in general, distinct. Levi-Civita observed that a workable procedure does, however, result if one looks not to $X_m(x + dx) - X_m(x)$ but to $\tilde X_m(x) - X_m(x)$, where

$\tilde X_m(x)$ results from parallel transport of $X_m(x + dx)$ from $x + dx$ back to $x$

He endowed the intuitive concept “parallel transport” (Figure 48) with a precise
(natural) meaning, and immediately recovered the standard theory of covariant
diﬀerentiation. But he obtained also much else: he showed, for example,
that “geodesics” can be considered to arise not as “shortest” curves, i.e. curves produced by minimization of arc length

$$\int ds \quad\text{with}\quad (ds)^2 = g_{mn}\,dx^{m}\,dx^{n}$$

but as curves whose tangents can be got one from another by parallel
transportation: head oﬀ in some direction and “follow your nose” was the
idea. Levi-Civita’s idea so enriched a subject previously known as the “absolute
diﬀerential calculus” that its name was changed . . . to “tensor analysis.”
Our catalog (pages 120–122) can be looked upon as an enumeration of circumstances in which, “by accident,” the $\Gamma$-apparatus falls away. Look, for example, to the “covariant curl,” where we have

$$X_{m;n} - X_{n;m} = (X_{m,n} - X_a \Gamma^{a}{}_{nm}) - (X_{n,m} - X_a \Gamma^{a}{}_{mn}) = X_{m,n} - X_{n,m} \quad\text{by } \Gamma^{a}{}_{mn} = \Gamma^{a}{}_{nm}$$

The basic principles of the “absolute diﬀerential calculus” were developed

between  and  by Gregorio Ricci-Curbastro (–), who was a
mathematician in the tradition of Riemann and Christoffel.100 In  his
student, Tullio Levi-Civita (–), published “Sulle trasformazioni delle
equazioni dinamiche” to demonstrate the physical utility of the methods which
Ricci himself had applied only to differential geometry. In —at the urging
of Felix Klein, in Göttingen—Ricci and Levi-Civita co-authored “Méthodes
de calcul différentiel absolu et leurs applications,” a lengthy review of the
subject . . . but they were Italians writing in French, and published in a German
periodical (Mathematische Annalen), and their work was largely ignored: for
nearly twenty years the subject was known to only a few cognoscenti (who
included Minkowski at Göttingen), and cultivated by fewer. General interest
in the subject developed—explosively!—only in the wake of Einstein’s general
theory of relativity (). Tensor methods had been brought to the reluctant
attention of Einstein by Marcel Grossmann, a geometer who had been a
classmate of Einstein’s at the ETH in Zürich (Einstein reportedly used to study
100
Ricci had interest also in physics, and as a young man published (in Nuovo
Cimento) the first Italian account of Maxwellian electrodynamics.

Grossmann’s class notes instead of attending Minkowski’s lectures) and whose

father had been instrumental in obtaining for the young and unknown Einstein
a position in the Swiss patent oﬃce.
Acceptance of the tensor calculus was impeded for a while by those (mainly mathematicians) who perceived it to be in competition with the exterior calculus, an elegant French creation (Poincaré, Goursat, Cartan, . . . ) which treats (but more deeply) a narrower set of issues, but (for that very reason) supports also a robust integral calculus. The exterior calculus shares the Germanic pre-history of tensor analysis (Gauss, Grassmann, Riemann, . . . ) but was developed semi-independently (and somewhat later), and has only fairly recently begun to be included among the work-a-day tools of mathematical physicists. Every physicist can be expected today to have some knowledge of the tensor calculus, but the exterior calculus has yet to find a secure place in the pedagogical literature of physics, and for that (self-defeating) reason physicists who wish to be understood still tend to avoid the subject . . . in their writing and (at greater hazard) in their creative thought.

3. Transformation properties of the electromagnetic field equations. We will be

led in the following discussion from Maxwell’s equations to—ﬁrst and most
easily—the group of “Lorentz transformations,” which by some fairly natural
interpretive enlargement detach from their electrodynamic birthplace to provide
the foundation of Einstein’s Principle of Relativity. But it will emerge that

The covariance group of a theory depends

in part upon how the theory is expressed :

slight adjustments in the formal rendition of Maxwell’s equations will lead to

transformation groups that diﬀer radically from the Lorentz group (but that
contain the Lorentz group as a subgroup) . . . and that also is a lesson that admits
of “enlargement”—that pertains to ﬁelds far removed from electrodynamics.
The point merits explicit acknowledgement because it relates to how casually
accepted conventions can exert unwitting control on the development of physics.
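first point of view Let Maxwell’s equations be notated101

$$\partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^{\nu} \tag{180.1}$$
$$\partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} + \partial_\lambda F_{\mu\nu} = 0 \tag{180.2}$$

where $F^{\mu\nu}$ is antisymmetric and where

$$F_{\mu\nu} \equiv g_{\mu\alpha}\, g_{\nu\beta}\, F^{\alpha\beta} \quad\text{with}\quad \mathbf{g} \equiv \|g_{\mu\nu}\| = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{181}$$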

is (automatically) also antisymmetric. From Fµν = −Fνµ it follows, by the way,

101
Compare (167).
Lorentz covariance of Maxwell’s equations 127

that (180.2) reduces to the triviality 0 = 0 unless µ, ν and λ are distinct, so

the equation in question is just a condensed version of the sourceless Maxwell
equations as they were encountered on page 111.102 In view of entry (179.5) in
our catalog it becomes natural to
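assume that $F^{\mu\nu}$ and $j^{\mu}$ transform as the components of tensor densities of unit weight:

$$F^{\mu\nu} \longrightarrow \bar F^{\mu\nu} = W \cdot M^{\mu}{}_{\alpha} M^{\nu}{}_{\beta}\, F^{\alpha\beta} \tag{A1}$$
$$j^{\mu} \longrightarrow \bar j^{\mu} = W \cdot M^{\mu}{}_{\alpha}\, j^{\alpha} \tag{A2}$$

103
We note that it makes coordinate-independent good sense to assume of the field tensor that it is antisymmetric:

$$F^{\mu\nu} \text{ antisymmetric} \implies \bar F^{\mu\nu} \text{ antisymmetric}$$

The unrestricted covariance (in the sense “form-invariance under coordinate transformation”) of (180.1) is then assured:

$$\partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^{\nu} \longrightarrow \bar\partial_\mu \bar F^{\mu\nu} = \tfrac{1}{c}\, \bar j^{\nu}$$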

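On grounds that it would be intolerable for the description (181) of $\mathbf{g}$ to be “special to the coordinate system X” we assume $g_{\mu\nu}$ to transform as a symmetric tensor of zero weight

$$g_{\mu\nu} \longrightarrow \bar g_{\mu\nu} = W^{\alpha}{}_{\mu} W^{\beta}{}_{\nu}\, g_{\alpha\beta} \tag{B1}$$

but impose upon $X \to \bar X$ the constraint that

$$\bar g_{\mu\nu} = g_{\mu\nu} \tag{B2}$$

This amounts in effect to imposition of the requirement that $X \to \bar X$ be of such a nature that

$$\mathbf{W}^{T} \mathbf{g}\, \mathbf{W} = \mathbf{g} \quad\text{everywhere} \tag{182}$$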

102
We might write

$$\|F^{\mu\nu}\| \equiv \begin{pmatrix} 0 & -E_1 & -E_2 & -E_3 \\ E_1 & 0 & -B_3 & B_2 \\ E_2 & B_3 & 0 & -B_1 \\ E_3 & -B_2 & B_1 & 0 \end{pmatrix} \quad\therefore\quad \|F_{\mu\nu}\| = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix}$$

to establish explicit contact with orthodox 3-vector notation and terminology (and at the same time to make antisymmetry manifest), but such a step would be extraneous to the present line of argument.
103
See again page 119.

Looking to the determinant of the preceding equation we obtain

$$W^2 = 1$$

from which (arguing from continuity) we conclude that

$$W \text{ is } \begin{cases} \text{everywhere equal to } +1, \text{ else} \\ \text{everywhere equal to } -1. \end{cases} \tag{183}$$

This result protects us from a certain embarrassment: assumptions A1 and B1

jointly imply that Fµν transforms as a tensor of unit weight, while covariance of
the windmill sum in (180.2) was seen at (179.3) to require Fµν to transform as a
weightless tensor. But (183) reduces all weight distinctions to empty trivialities.
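Thus does B2 insure the covariance of (180.2):

$$\partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} + \partial_\lambda F_{\mu\nu} = 0 \longrightarrow \bar\partial_\mu \bar F_{\nu\lambda} + \bar\partial_\nu \bar F_{\lambda\mu} + \bar\partial_\lambda \bar F_{\mu\nu} = 0$$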

From (182) we will extract the statement that

$X \to \bar X$ is a Lorentz transformation (184)

and come to the conclusion that Maxwellian electrodynamics—as formulated

above—is Lorentz covariant. Lorentz () and Einstein () were the
independent co-discoverers of this fundamental fact, which they established
by two alternative (and quite distinct) lines of argument.
second point of view Retain both the field equations (180) and the assumptions A but, in order to escape from the above-mentioned “point of embarrassment,” agree in place of B1 to assume that $g_{\mu\nu}$ transforms as a symmetric tensor density of weight $w = -\tfrac12$

$$g_{\mu\nu} \longrightarrow \bar g_{\mu\nu} = W^{-\frac12} \cdot W^{\alpha}{}_{\mu} W^{\beta}{}_{\nu}\, g_{\alpha\beta} \tag{B$^*$1}$$

for then $F_{\mu\nu}$ becomes weightless, as (179.3) requires. Retaining

$$\bar g_{\mu\nu} = g_{\mu\nu} \tag{B2}$$

we obtain

$$W^{-\frac12} \cdot \mathbf{W}^{T} \mathbf{g}\, \mathbf{W} = \mathbf{g} \quad\text{everywhere} \tag{185.1}$$

If spacetime were $N$-dimensional the determinantal argument would now give

$$W^{\,2 - \frac{N}{2}} = 1$$

which (uniquely) in the physical case ($N = 4$) reduces to a triviality: $W^{0} = 1$.

The constraint (183) therefore drops away, with consequences which I will
discuss in a moment.

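third point of view This differs only superficially from the viewpoint just considered. Retain B1 but in place of B2 assume that

$$\bar g_{\mu\nu} = \Omega\, g_{\mu\nu} \tag{B$^*$2}$$

Then

$$\mathbf{W}^{T} \mathbf{g}\, \mathbf{W} = \Omega\, \mathbf{g} \tag{185.2}$$

and the determinantal argument supplies

$$\Omega = W^{\frac{2}{N}} = W^{\frac12} \quad\text{in the physical case } N = 4$$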

Equations (185.1) and (185.2) evidently say the same thing: the Lorentzian
constraint (183) drops away and in place of (184) we have

$X \to \bar X$ is a conformal transformation (186)

The conformal covariance of Maxwellian electrodynamics was discovered

independently by Cunningham104 and Bateman.105 It gives rise to ideas which
have a curious past106 and which have assumed a central place in elementary
particle physics at high energy. Some of the electrodynamical implications of
conformal covariance are so surprising that they have given rise to vigorous
controversy.107 A transformation is said (irrespective of the speciﬁc context) to
be “conformal” if it preserves angles locally . . . though such transformations do
not (in general) preserve non-local angles, nor do they (even locally) preserve
length. Engineers make heavy use of the conformal recoordinatizations of the
plane that arise from the theory of complex variables via the statement

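$$z \longrightarrow w = f(z) \;:\; f(z) \text{ analytic}$$

The bare bones of the argument: write $z = x + iy$, $w = u + iv$ and obtain

$$\begin{aligned} u &= u(x, y) \\ v &= v(x, y) \end{aligned} \qquad\text{giving}\qquad \begin{aligned} du &= u_x\,dx + u_y\,dy \\ dv &= v_x\,dx + v_y\,dy \end{aligned}$$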

104
E. Cunningham, “The principle of relativity in electrodynamics and an
extension thereof,” Proc. London Math. Soc. 8, 223 (1910).
105
H. Bateman, “The transformation of the electrodynamical equations,”
Proc. London Math. Soc. 8, 223 (1910).
106
T. Fulton, F. Rohrlich & L. Witten, “Conformal invariance in physics,”
Rev. Mod. Phys. 34, 442 (1962).
107
See “Radiation in hyperbolic motion” in R. Peierls, Surprises in Theoretical
Physics (), page 160.
Figure 49: Cartesian grid (above) and its conformal image (below) in the case $f(z) = z^3$, which supplies

$$u(x, y) = x^3 - 3xy^2, \qquad v(x, y) = 3x^2 y - y^3$$

The command ParametricPlot was used to construct the figure.

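But

$$\text{analyticity of } f(z) \iff \text{Cauchy-Riemann conditions}: \quad u_x = +v_y, \quad u_y = -v_x$$

so

$$\begin{pmatrix} u_x \\ u_y \end{pmatrix} \cdot \begin{pmatrix} v_x \\ v_y \end{pmatrix} = u_x v_x + u_y v_y = -u_x u_y + u_y u_x = 0$$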
uy vy

which is to say: curves of constant $u$ are everywhere ⊥ to curves of constant $v$, just as curves of constant $x$ were everywhere normal to curves of constant $y$. The situation is illustrated in the preceding figure. The 2-dimensional case (in which one can conformally transform in as infinitely many ways as one can select $f(z)$) is, however, exceptional:108 in the cases $N > 2$ conformality arises from a less esoteric circumstance, and the possibilities are described by a finite set of parameters. Let $A^m$ and $B^m$ be weightless vectors, let the inner product be defined $(A, B) \equiv g_{mn} A^m B^n$, and suppose $g_{mn}$ to transform as a symmetric tensor density of weight $w$. Then $(A, B)$ and the “squared lengths” $(A, A)$ and $(B, B)$ all transform (not as invariants but) as scalar densities. But the

$$\text{angle between } A^m \text{ and } B^m \equiv \arccos \frac{(A, B)}{\sqrt{(A, A)(B, B)}}$$

clearly does transform by invariance. Analysis of (185.2) gives rise in the physical case ($N = 4$) to a 15-parameter conformal group that contains the 6-parameter Lorentz group as a subgroup.
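Both halves of the preceding paragraph lend themselves to a quick numerical check. The sketch below (plain Python, hypothetical helper names, not drawn from the text) first confirms that the gradients of $u$ and $v$ are orthogonal for the map $f(z) = z^3$ of Figure 49, and then confirms that an angle computed from an inner product is unchanged when the metric is rescaled by a positive factor. The second check uses a positive-definite (Euclidean) metric as a simplifying assumption so that the arccosine is always defined.

```python
import math

def grad_u(x, y):
    """Gradient of u = x^3 - 3 x y^2, the real part of z^3."""
    return (3 * x * x - 3 * y * y, -6 * x * y)

def grad_v(x, y):
    """Gradient of v = 3 x^2 y - y^3, the imaginary part of z^3."""
    return (6 * x * y, 3 * x * x - 3 * y * y)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def inner(g, p, q):
    """(P, Q) = g_mn P^m Q^n for a metric given as a nested list."""
    n = len(p)
    return sum(g[m][k] * p[m] * q[k] for m in range(n) for k in range(n))

def angle(g, p, q):
    """Angle defined by the inner product; assumes g is positive-definite."""
    return math.acos(inner(g, p, q) / math.sqrt(inner(g, p, p) * inner(g, q, q)))

# Orthogonality of the curves u = const and v = const at a few sample points.
for point in [(1.0, 2.0), (-0.5, 0.7), (3.0, -1.5)]:
    print(point, dot(grad_u(*point), grad_v(*point)))

# Rescaling the metric changes lengths but not angles.
g = [[1.0, 0.0], [0.0, 1.0]]
g_scaled = [[7.3 * entry for entry in row] for row in g]
A, B = (1.0, 2.0), (3.0, -1.0)
print(angle(g, A, B), angle(g_scaled, A, B))
```

The rescaling factor cancels between numerator and denominator of the arccosine argument, which is the content of the invariance claim.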
fourth point of view Adopt the (unique) affine connection $\Gamma^{\lambda}{}_{\mu\nu}$ which vanishes here in our inertial X-coordinate system. For us there is then no distinction between ordinary differentiation and covariant differentiation. So in place of (180) we can, if we wish, write

$$F^{\mu\nu}{}_{;\mu} = \tfrac{1}{c}\, j^{\nu} \tag{187.1}$$
$$F_{\nu\lambda;\mu} + F_{\lambda\mu;\nu} + F_{\mu\nu;\lambda} = 0 \tag{187.2}$$

Which is to say: we can elect to “tensorially continuate” our Maxwell equations to other coordinate systems of arbitrary (moving curvilinear) design. We retain the description (181) of $g_{\mu\nu}$, and we retain

$$g_{\mu\nu} \longrightarrow \bar g_{\mu\nu} = W^{\alpha}{}_{\mu} W^{\beta}{}_{\nu}\, g_{\alpha\beta} \tag{B1}$$

But we have no longer any reason to retain B2, no longer any reason to impose any specific constraint upon the design of $\bar g_{\mu\nu}$. We arrive thus at a formalism in which

$$F^{\mu\nu}{}_{;\mu} = \tfrac{1}{c}\, j^{\nu} \longrightarrow \bar F^{\mu\nu}{}_{;\mu} = \tfrac{1}{c}\, \bar j^{\nu}$$
$$F_{\nu\lambda;\mu} + F_{\lambda\mu;\nu} + F_{\mu\nu;\lambda} = 0 \longrightarrow \bar F_{\nu\lambda;\mu} + \bar F_{\lambda\mu;\nu} + \bar F_{\mu\nu;\lambda} = 0$$

and in which

$$X \to \bar X \text{ is unrestricted} \tag{188}$$

No “natural weights” are assigned within this formalism to $F^{\mu\nu}$, $j^{\mu}$ and $g_{\mu\nu}$, but formal continuity with the conformally-covariant formalism (whence with the Lorentz-covariant formalism) seems to require that we assign weights $w = 1$ to $F^{\mu\nu}$ and $j^{\mu}$, weight $w = -\tfrac12$ to $g_{\mu\nu}$.
108
See page 55 of “The transformations which preserve wave equations” ()
in transformational physics of waves (–).

Still other points of view are possible,109 but I have carried this discussion
already far enough to establish the validity of a claim made at the outset: the
only proper answer to the question “What transformations X → X preserve the
structure of Maxwell’s equations?” is “It depends—depends on how you have
chosen to write Maxwell’s equations.”
We have here touched, in a physical setting, upon an idea—look at
“objects,” and the groups of transformations which preserve relationships
among those objects—which Felix Klein, in the lecture given when (in ,
at the age of ) he assumed the mathematical professorship at the University
of Erlangen, proposed might be looked upon as the organizing principle of
all pure/applied mathematics—a proposal which has come down to us as the
“Erlangen Program.” It has been supplanted in the world of pure mathematics,
but continues to illuminate the historical and present development of physics.110
4. Lorentz transformations, and some of their implications. To state that $X \leftarrow \bar X$ is a Lorentz transformation is, by definition, to state that the associated transformation matrix $\mathbf{M} \equiv \|M^{\mu}{}_{\nu}\| \equiv \|\partial x^{\mu}/\partial \bar x^{\nu}\|$ has (see again page 127) the property that

$$\mathbf{M}^{T} \mathbf{g}\, \mathbf{M} = \mathbf{g} \quad\text{everywhere} \tag{182}$$

where by fundamental assumption $\mathbf{g} = \mathbf{g}^{T} = \mathbf{g}^{-1}$ possesses at each point in spacetime the specific structure indicated at (181).
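I begin with the observation that $\mathbf{M}$ must necessarily be a constant matrix. The argument is elementary: hit (182) with $\bar\partial_\lambda$ and obtain

$$(\bar\partial_\lambda \mathbf{M})^{T} \mathbf{g}\, \mathbf{M} + \mathbf{M}^{T} \mathbf{g}\, (\bar\partial_\lambda \mathbf{M}) = \mathbf{O} \quad\text{because } \mathbf{g} \text{ is constant}$$

This can be rendered

$$g_{\alpha\beta}\, M^{\alpha}{}_{\lambda\mu} M^{\beta}{}_{\nu} + g_{\alpha\beta}\, M^{\alpha}{}_{\mu} M^{\beta}{}_{\nu\lambda} = 0$$

where $M^{\alpha}{}_{\lambda\mu} \equiv \partial^2 x^{\alpha}/\partial \bar x^{\lambda} \partial \bar x^{\mu} = M^{\alpha}{}_{\mu\lambda}$. More compactly

$$\Gamma_{\mu\nu\lambda} + \Gamma_{\nu\lambda\mu} = 0$$

where $\Gamma_{\mu\nu\lambda} \equiv g_{\alpha\beta}\, M^{\alpha}{}_{\mu} M^{\beta}{}_{\nu\lambda}$. Also (subjecting the $\mu\nu\lambda$ to cyclic permutation)

$$\Gamma_{\nu\lambda\mu} + \Gamma_{\lambda\mu\nu} = 0$$
$$\Gamma_{\lambda\mu\nu} + \Gamma_{\mu\nu\lambda} = 0$$

so

$$\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} \Gamma_{\lambda\mu\nu} \\ \Gamma_{\mu\nu\lambda} \\ \Gamma_{\nu\lambda\mu} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$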

109
See D. van Dantzig, “The fundamental equations of electromagnetism,
independent of metric geometry,” Proc. Camb. Phil. Soc. 30, 421 (1935).
110
For an excellent discussion see the section “Codiﬁcation of geometry by
invariance” (pages 442–453) in E. T. Bell’s The Development of Mathematics
(). The Erlangen Program is discussed in scholarly detail in T. Hawkins,
Emergence of the theory of Lie Groups (): see the index. For a short
history of tensor analysis, see Bell’s Chapter 9.
Lorentz transformations 133

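The 3×3 matrix is non-singular, so we must have

$$\Gamma_{\lambda\mu\nu} = M^{\alpha}{}_{\lambda}\, g_{\alpha\beta}\, \bar\partial_\mu M^{\beta}{}_{\nu} = 0 \;:\; \text{ditto cyclic permutations}$$

which in matrix notation reads

$$\mathbf{M}^{T} \mathbf{g}\, (\bar\partial_\mu \mathbf{M}) = \mathbf{O}$$

The matrices $\mathbf{M}$ and $\mathbf{g}$ are non-singular, so we can multiply by $(\mathbf{M}^{T} \mathbf{g})^{-1}$ to obtain

$$\bar\partial_\mu \mathbf{M} = \mathbf{O} \;:\; \text{the elements of } \mathbf{M} \text{ must be constants}$$

The functions $x^{\mu}(\bar x)$ that describe the transformation $X \leftarrow \bar X$ must possess therefore the inhomogeneous linear structure111

$$x^{\mu} = \Lambda^{\mu}{}_{\nu}\, \bar x^{\nu} + a^{\mu} \;:\; \text{the } \Lambda^{\mu}{}_{\nu} \text{ and } a^{\mu} \text{ are constants}$$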
The transformation matrix $\mathbf{M}$, with elements given now by constants $\Lambda^{\mu}{}_{\nu}$, will henceforth be denoted $\boldsymbol{\Lambda}$ to emphasize that it is no longer generic but has been specialized (and also to suggest “Lorentz”). We shall (when the risk of confusion is slight) write

$$x = \boldsymbol{\Lambda}\, \bar x + a \tag{189.1}$$

(where $a$ describes a translation in spacetime) to describe an (“inhomogeneous Lorentz” or) Poincaré transformation, and

$$x = \boldsymbol{\Lambda}\, \bar x \tag{189.2}$$

to describe a (simple homogeneous) Lorentz transformation, the assumption in both cases being that

$$\boldsymbol{\Lambda}^{T} \mathbf{g}\, \boldsymbol{\Lambda} = \mathbf{g} \tag{190}$$

important remark: Linearity of a transformation—

constancy of the transformation matrix—is suﬃcient in
itself to kill all “extraneous terms,” without the assistance
of weight restrictions.

It was emphasized on page 119 that “not every indexed object transforms
tensorially,” and that, in particular, the xµ themselves do not transform
tensorially except in the linear case. We have now in hand just such a case, and
for that reason relativity becomes—not just locally but globally—an exercise
in linear algebra. Spacetime has become a 4-dimensional vector space; indeed, it has become an inner product space, with

$$\begin{aligned} (x, y) &\equiv g_{\mu\nu}\, x^{\mu} y^{\nu} \\ &= (y, x) \quad\text{by } g_{\mu\nu} = g_{\nu\mu} \\ &= x^{T} \mathbf{g}\, y \\ &= x^0 y^0 - x^1 y^1 - x^2 y^2 - x^3 y^3 = x^0 y^0 - \boldsymbol{x} \cdot \boldsymbol{y} \end{aligned} \tag{191.1}$$

111
Einstein ()—on the grounds that what he sought was a minimal
modiﬁcation of the Galilean transformations (which are themselves linear)—
was content simply to assume linearity.

The Lorentz inner product (interchangeably: the “Minkowski inner product”) described above is, however, “pathological” in the sense that it gives rise to an “indefinite norm;” i.e., to a norm

$$\begin{aligned} (x, x) &= g_{\mu\nu}\, x^{\mu} x^{\nu} \\ &= x^{T} \mathbf{g}\, x \\ &= (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = (x^0)^2 - \boldsymbol{x} \cdot \boldsymbol{x} \end{aligned} \tag{191.2}$$

which (instead of being positive unless $x = 0$) can assume either sign, and can vanish even if $x \neq 0$. From this primitive fact radiates much (arguably all) that is most distinctive about the geometry of spacetime . . . which, as Minkowski was the first to appreciate (and as will emerge) lies at the heart of the theory of relativity.
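The three possibilities for the sign of $(x, x)$ are easy to exhibit. The following sketch uses the metric (181); the labels “timelike,” “spacelike” and “null” are the standard names for the three cases and are supplied here as an assumption, since this passage does not itself introduce them. The function names are likewise illustrative.

```python
# Diagonal entries of the metric (181).
G_DIAG = (1.0, -1.0, -1.0, -1.0)

def minkowski(x, y):
    """Lorentz inner product (x, y) = x0*y0 - x1*y1 - x2*y2 - x3*y3."""
    return sum(g * a * b for g, a, b in zip(G_DIAG, x, y))

def classify(x):
    """Name the sign of the indefinite norm (x, x)."""
    s = minkowski(x, x)
    if s > 0:
        return "timelike"
    if s < 0:
        return "spacelike"
    return "null"

for vector in [(2.0, 1.0, 0.0, 0.0), (1.0, 2.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0)]:
    print(vector, minkowski(vector, vector), classify(vector))
```

The third vector is non-zero yet has vanishing norm, which is exactly the behaviour a positive-definite inner product forbids.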
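If $A^{\mu}$, $B^{\mu}$ and $g_{\mu\nu}$ transform as weightless tensors, then basic tensor algebra informs us that $g_{\mu\nu} A^{\mu} B^{\nu}$ transforms by invariance:

$$g_{\mu\nu} A^{\mu} B^{\nu} \longrightarrow \bar g_{\mu\nu} \bar A^{\mu} \bar B^{\nu} = g_{\mu\nu} A^{\mu} B^{\nu} \quad\text{unrestrictedly}$$

What distinguishes Lorentz transformations from transformations-in-general is that

$$\bar g_{\mu\nu} = g_{\mu\nu}$$

To phrase the issue as it relates not to things (like $A^{\mu}$ and $B^{\mu}$) “written on” spacetime but to the structure of spacetime itself, we can state that the linear transformation

$$x \longrightarrow \bar x = \boldsymbol{\Lambda}\, x$$

describes a Lorentz transformation if and only if

$$\bar x^{T} \mathbf{g}\, \bar y = x^{T} \boldsymbol{\Lambda}^{T} \mathbf{g}\, \boldsymbol{\Lambda}\, y = x^{T} \mathbf{g}\, y \quad\text{for all } x \text{ and } y \;:\; \text{entails } \boldsymbol{\Lambda}^{T} \mathbf{g}\, \boldsymbol{\Lambda} = \mathbf{g}$$

where, to be precise, we require that $\mathbf{g}$ has the specific design

$$\mathbf{g} \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$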

that at (163) was impressed upon us by our interest in the transformation

properties of Maxwell’s equations (i.e., by some narrowly prescribed speciﬁc
physics).
We come away with the realization that Lorentz transformations have in
fact only incidentally to do with electrodynamics: they are the transformations
that preserve Lorentzian inner products, which is to say: that preserve the
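metric properties of spacetime . . . just as “rotations” $\boldsymbol{x} \longrightarrow \bar{\boldsymbol{x}} = \mathbf{R}\, \boldsymbol{x}$ are the linear transformations that preserve Euclidean inner products

$$\bar{\boldsymbol{x}}^{T}\, \mathbf{I}\, \bar{\boldsymbol{y}} = \boldsymbol{x}^{T} \mathbf{R}^{T}\, \mathbf{I}\, \mathbf{R}\, \boldsymbol{y} = \boldsymbol{x}^{T}\, \mathbf{I}\, \boldsymbol{y} \quad\text{for all } \boldsymbol{x} \text{ and } \boldsymbol{y} \;:\; \text{entails } \mathbf{R}^{T} \mathbf{R} = \mathbf{I}$$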

Figure 50 (axes labeled “time” and “space”): Two “events” identify a triangle in the spacetime.

Relativity asks each inertial observer to use metersticks and clocks
to assign traditional meanings to the “Euclidean length” of the black
side (here thickened to suggest that space is several-dimensional)
and to the “duration” of the blue side—meanings which (as will
emerge) turn out, however, to yield observer-dependent numbers—
but assigns (Lorentz-invariant!) meaning also to the squared length
of the hypotenuse.

and in so doing preserve the lengths/angles/areas/volumes . . . that endow

Euclidean 3 -space with its distinctive metric properties.
That spacetime can be said to possess metric structure is the great surprise, the great discovery. In pre-relativistic physics one could speak of the duration (quantified by a clock) of the temporal interval $\Delta t = t_a - t_b$ separating a pair of events, and one could speak of the length

$$\Delta\ell = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2 + (z_a - z_b)^2}$$

(quantified by a meter stick) of the spatial interval separating a pair of points; one spoke of “space” and “time,” but “spacetime” remained an abstraction of the design space ⊗ time. Only with the introduction of $\mathbf{g}$ did it become possible (see Figure 50) to speak of the (squared) length

$$(\Delta s)^2 = c^2 (t_a - t_b)^2 - (x_a - x_b)^2 - (y_a - y_b)^2 - (z_a - z_b)^2$$

of the interval separating $(t_a, \boldsymbol{x}_a)$ from $(t_b, \boldsymbol{x}_b)$:

“space ⊗ time” had become “spacetime”


The first person to recognize the profoundly revolutionary nature of what had
been accomplished was (not Einstein but) Minkowski, who began an address to
the Assembly of German Natural Scientists & Physicians ( September )
with these words:
“The views of space and time which I wish to lay before you have
sprung from the soil of experimental physics, and therein lies their
strength. They are radical. Henceforth space by itself, and time by
itself, are doomed to fade away into mere shadows, and only a kind
of union of the two will preserve an independent reality.”
Electrodynamics had led to the first clear perception of the geometrical design
of the spacetime manifold upon which all physics is written. The symmetries
inherent in that geometry were by this time known to be reflected in the design
of Maxwell’s equations. Einstein’s Principle of Relativity holds that they must,
in fact, be reflected in the design of all physical theories—irrespective of the
specific phenomenology to which any individual theory may refer.
Returning now to the technical mainstream of this discussion . . . let the Lorentz condition (190) be written

$$\boldsymbol{\Lambda}^{-1} = \mathbf{g}^{-1} \boldsymbol{\Lambda}^{T} \mathbf{g} \tag{192}$$

Generally inversion of a 4×4 matrix is difficult, but (192) shows that inversion of a Lorentz matrix $\boldsymbol{\Lambda}$ can be accomplished very easily.112
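A short check makes the point concrete. The sketch below builds a boost along the first spatial axis (written out from the 2-dimensional form derived later in this section, and used here only as a convenient test case), inverts it by the transpose-and-sandwich recipe of (192), and confirms that the product is the identity. The helper names are assumptions made for illustration.

```python
import math

G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(p):
    return [list(row) for row in zip(*p)]

def boost_x(beta):
    """Boost along x^1 with speed parameter beta = v/c, |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, gamma * beta, 0.0, 0.0],
            [gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def lorentz_inverse(lam):
    """Inverse by (192); note that g is its own inverse."""
    return matmul(matmul(G, transpose(lam)), G)

L = boost_x(0.6)
product = matmul(L, lorentz_inverse(L))
print(all(abs(product[i][j] - (1.0 if i == j else 0.0)) < 1e-12
          for i in range(4) for j in range(4)))
```

No elimination or cofactor expansion is needed: the inverse costs one transpose and a few sign flips.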
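Equations (190/192) impose a multiplicative condition upon $\boldsymbol{\Lambda}$. It was to reduce multiplicative conditions to additive conditions (which are easier) that logarithms were invented. Assume, therefore, that $\boldsymbol{\Lambda}$ can be written

$$\boldsymbol{\Lambda} = e^{\mathbf{A}} = \mathbf{I} + \mathbf{A} + \tfrac{1}{2!} \mathbf{A}^2 + \cdots$$

It now follows that

$$\boldsymbol{\Lambda}^{-1} = e^{-\mathbf{A}} \quad\text{while}\quad \mathbf{g}^{-1} \boldsymbol{\Lambda}^{T} \mathbf{g} = \mathbf{g}^{-1} e^{\mathbf{A}^{T}} \mathbf{g} = e^{\mathbf{g}^{-1} \mathbf{A}^{T} \mathbf{g}}$$

Evidently $\boldsymbol{\Lambda}$ will be a Lorentz matrix if

$$-\mathbf{A} = \mathbf{g}^{-1} \mathbf{A}^{T} \mathbf{g}$$

which (by $\mathbf{g}^{T} = \mathbf{g}$) can be expressed

$$(\mathbf{g} \mathbf{A})^{T} = -(\mathbf{g} \mathbf{A})$$

This is an additive condition (involves negation instead of inversion) and amounts simply to the statement that $\mathbf{g} \mathbf{A} \equiv \|A_{\mu\nu}\|$ is antisymmetric. Adopt this notation

$$\mathbf{g} \mathbf{A} = \begin{pmatrix} 0 & A_1 & A_2 & A_3 \\ -A_1 & 0 & -a_3 & a_2 \\ -A_2 & a_3 & 0 & -a_1 \\ -A_3 & -a_2 & a_1 & 0 \end{pmatrix}$$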
112
problem 40

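where $\{A_1, A_2, A_3, a_1, a_2, a_3\}$ comprise a sextet of adjustable real constants. Multiplication on the left by $\mathbf{g}^{-1}$ gives a matrix of (what I idiosyncratically call) the “$\mathbf{g}$-antisymmetric” design113

$$\mathbf{A} \equiv \|A^{\mu}{}_{\nu}\| = \begin{pmatrix} 0 & A_1 & A_2 & A_3 \\ A_1 & 0 & a_3 & -a_2 \\ A_2 & -a_3 & 0 & a_1 \\ A_3 & a_2 & -a_1 & 0 \end{pmatrix}$$

We come thus to the conclusion that matrices of the form

$$\boldsymbol{\Lambda} = \exp \begin{pmatrix} 0 & A_1 & A_2 & A_3 \\ A_1 & 0 & a_3 & -a_2 \\ A_2 & -a_3 & 0 & a_1 \\ A_3 & a_2 & -a_1 & 0 \end{pmatrix} \tag{193}$$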
are Lorentz matrices; i.e., they satisfy (190/192), and when inserted into (189)
they describe Poincaré/Lorentz transformations.
Does every Lorentz matrix $\boldsymbol{\Lambda}$ admit of such representation? Not quite. It follows immediately from (190) that $(\det \boldsymbol{\Lambda})^2 = 1$; i.e., that

$$\Lambda \equiv \det \boldsymbol{\Lambda} = \pm 1, \text{ according as } \boldsymbol{\Lambda} \text{ is } \begin{cases} \text{“proper”} \\ \text{“improper”} \end{cases}$$

while the theory of matrices supplies the lovely identity114

$$\det(e^{\mathbf{M}}) = e^{\operatorname{tr} \mathbf{M}} \;:\; \mathbf{M} \text{ is any square matrix} \tag{194}$$

We therefore have $\Lambda = \det(e^{\mathbf{A}}) = 1$ by $\operatorname{tr} \mathbf{A} = 0$:

Every Lorentz matrix $\boldsymbol{\Lambda}$ of the form (193) is necessarily proper; moreover (as will emerge), every proper $\boldsymbol{\Lambda}$ admits of such an “exponential representation.” (195)

It will emerge also that when one has developed the structure of the matrices $\boldsymbol{\Lambda} = e^{\mathbf{A}}$ one has “cracked the nut,” in the sense that it becomes easy to describe their improper companions.115
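The claim that every matrix of the form (193) satisfies the Lorentz condition can be tested directly by summing the exponential series. The sketch below is an illustration under stated assumptions: it truncates the power series after a fixed number of terms (adequate for generators of modest size, not a general-purpose matrix exponential), and the names `generator`, `expm_series` and `is_lorentz` are hypothetical.

```python
import math

G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(p):
    return [list(row) for row in zip(*p)]

def generator(A1, A2, A3, a1, a2, a3):
    """The g-antisymmetric matrix A that appears in (193)."""
    return [[0.0, A1, A2, A3],
            [A1, 0.0, a3, -a2],
            [A2, -a3, 0.0, a1],
            [A3, a2, -a1, 0.0]]

def expm_series(a, terms=30):
    """Truncated series I + A + A^2/2! + ... (assumes a small-norm A)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, a)]  # A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def is_lorentz(lam, tol=1e-9):
    """Check the condition (190): Lambda^T g Lambda = g."""
    lhs = matmul(matmul(transpose(lam), G), lam)
    return all(abs(lhs[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))

A = generator(0.3, -0.2, 0.5, 0.1, 0.4, -0.6)
print(sum(A[i][i] for i in range(4)))  # trace of A
print(is_lorentz(expm_series(A)))
```

Setting all parameters but `A1` to zero reduces the result to the hyperbolic form obtained below for the 2-dimensional case.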
What it means to “develop the structure of $\boldsymbol{\Lambda} = e^{\mathbf{A}}$” is exposed most simply in the (physically artificial) case $N = 2$. Taking

$$\mathbf{g} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \;:\; \text{Lorentz metric in 2-dimensional spacetime}$$

113
Notice that $\mathbf{g}$-antisymmetry becomes literal antisymmetry when the metric $\mathbf{g}$ is Euclidean. Notice also that while it makes tensor-algebraic good sense to write $\mathbf{A}^2 = \|A^{\mu}{}_{\alpha} A^{\alpha}{}_{\nu}\|$ it would be hazardous to write $(\mathbf{g} \mathbf{A})^2 = \|A_{\mu\alpha} A_{\alpha\nu}\|$.
114
problem 41.
115
problem 42.

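as our point of departure, the argument that gave (193) gives

$$\boldsymbol{\Lambda} = e^{A\, \mathbf{J}} = \exp \begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix} \tag{196.1}$$

where evidently

$$\mathbf{J} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

By quick calculation (or, more elegantly, by appeal to the Cayley-Hamilton theorem, according to which every matrix satisfies its own characteristic equation) we find $\mathbf{J}^2 = \mathbf{I}$, from which it follows that

$$\mathbf{J}^n = \begin{cases} \mathbf{I} & \text{if } n \text{ is even} \\ \mathbf{J} & \text{if } n \text{ is odd} \end{cases}$$

So

$$\begin{aligned} \boldsymbol{\Lambda} &= \underbrace{\left(1 + \tfrac{1}{2!} A^2 + \tfrac{1}{4!} A^4 + \cdots\right)}_{\cosh A} \mathbf{I} + \underbrace{\left(A + \tfrac{1}{3!} A^3 + \tfrac{1}{5!} A^5 + \cdots\right)}_{\sinh A} \mathbf{J} \\ &= \begin{pmatrix} \cosh A & \sinh A \\ \sinh A & \cosh A \end{pmatrix} \\ &\equiv \boldsymbol{\Lambda}(A) \;:\; \text{Lorentzian for all real values of } A \end{aligned} \tag{196.2}$$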

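It is evident (whether one argues from (196.2) or, more efficiently, from (196.1)) that

$$\begin{aligned} \mathbf{I} &= \boldsymbol{\Lambda}(0) &&: \text{existence of identity} && (197.1) \\ \boldsymbol{\Lambda}(A_2)\, \boldsymbol{\Lambda}(A_1) &= \boldsymbol{\Lambda}(A_1 + A_2) &&: \text{compositional closure} && (197.2) \\ \boldsymbol{\Lambda}^{-1}(A) &= \boldsymbol{\Lambda}(-A) &&: \text{existence of inverse} && (197.3) \end{aligned}$$

and that all such $\boldsymbol{\Lambda}$-matrices commute.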
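The three group properties and the commutativity can be confirmed numerically from (196.2) alone. The sketch below is a small illustration with hypothetical helper names (`lam`, `matmul2`, `close`); it relies only on the hyperbolic functions in Python's `math` module.

```python
import math

def lam(A):
    """The 2x2 Lorentz matrix (196.2) with real parameter A."""
    return [[math.cosh(A), math.sinh(A)],
            [math.sinh(A), math.cosh(A)]]

def matmul2(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
A1, A2 = 0.3, 0.5

print(close(lam(0.0), IDENTITY))                                       # (197.1)
print(close(matmul2(lam(A2), lam(A1)), lam(A1 + A2)))                  # (197.2)
print(close(matmul2(lam(A1), lam(-A1)), IDENTITY))                     # (197.3)
print(close(matmul2(lam(A1), lam(A2)), matmul2(lam(A2), lam(A1))))     # commutativity
```

Closure here is nothing more than the addition formulas for cosh and sinh, which is why the parameter $A$ simply adds under composition.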

We are now (but only now) in position to consider the kinematic meaning of $A$, and of the action of $\boldsymbol{\Lambda}(A)$. We are, let us pretend, a “point PhD” who, having passed the physical tests required to establish our inertiality, use our “good clock and Cartesian frame” to assign coordinates $x \equiv (x^0, x^1, x^2, x^3)$ to events. O, a second observer, similarly endowed, whom we see to be gliding by with velocity $\boldsymbol{v}$, assigns coordinates $\bar x \equiv (\bar x^0, \bar x^1, \bar x^2, \bar x^3)$ to those same events. O shares our confidence in the validity of Maxwellian electrodynamics: we can therefore write $x = \boldsymbol{\Lambda}\, \bar x + a$. In the interests merely of simplicity we will assume that O’s origin and our origin coincide: the translational terms $a^{\mu}$ then drop away and we have $x = \boldsymbol{\Lambda}\, \bar x$ . . . which in the 2-dimensional case reads

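$$\begin{pmatrix} x^0 \\ x^1 \end{pmatrix} = \begin{pmatrix} \cosh A & \sinh A \\ \sinh A & \cosh A \end{pmatrix} \begin{pmatrix} \bar x^0 \\ \bar x^1 \end{pmatrix} \tag{198}$$

To describe the successive “ticks of the clock at his origin” O writes

$$\begin{pmatrix} c \bar t \\ 0 \end{pmatrix}$$

while, to describe those same events, we write

$$\begin{pmatrix} c t \\ v t \end{pmatrix}$$

Immediately $vt = c \bar t \cdot \sinh A$ and $ct = c \bar t \cdot \cosh A$ which, when we divide the former by the latter, give

$$\tanh A = \beta \tag{199}$$

with

$$\beta \equiv v/c \tag{200}$$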
These equations serve to assign kinematic meaning to A, and therefore to /\\ (A).

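Drawing now upon the elementary identities

$$\cosh A = \frac{1}{\sqrt{1 - \tanh^2 A}} \quad\text{and}\quad \sinh A = \frac{\tanh A}{\sqrt{1 - \tanh^2 A}}$$

we find that (198) can be written

$$\begin{pmatrix} x^0 \\ x^1 \end{pmatrix} = \gamma \begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix} \begin{pmatrix} \bar x^0 \\ \bar x^1 \end{pmatrix} \tag{201}$$

with

$$\gamma \equiv \frac{1}{\sqrt{1 - \beta^2}} = 1 + \tfrac12 \beta^2 + \tfrac38 \beta^4 + \cdots \tag{202}$$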
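The link between the parameter $A$, the speed parameter $\beta$ and the factor $\gamma$ can be checked in a few lines. The following is a sketch with assumed function names; it compares the closed form of $\gamma$ with the three-term series in (202) and confirms that $\cosh A$ and $\sinh A$ reproduce $\gamma$ and $\gamma\beta$ when $A = \operatorname{artanh}\beta$.

```python
import math

def gamma(beta):
    """Closed form of (202); defined only on the interval -1 < beta < 1."""
    if abs(beta) >= 1.0:
        raise ValueError("gamma is real only for |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def gamma_series(beta):
    """First three terms of the expansion in (202)."""
    return 1.0 + 0.5 * beta ** 2 + 0.375 * beta ** 4

def parameter_from_beta(beta):
    """Invert (199): A = artanh(beta)."""
    return math.atanh(beta)

for beta in (0.01, 0.1, 0.6):
    A = parameter_from_beta(beta)
    print(beta, gamma(beta), gamma_series(beta), math.cosh(A), math.sinh(A))
```

The truncated series is a good approximation only for small $\beta$; at $\beta = 0.6$ the neglected terms are already visible in the printed values.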

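Evidently $\gamma$ becomes singular (see Figure 51) at $\beta^2 = 1$; i.e., at $v = \pm c$ . . . with diverse consequences which we will soon have occasion to consider. The non-relativistic limit arises physically from $\beta^2 \ll 1$; i.e., from $v^2 \ll c^2$, but can be considered formally to arise from $c \uparrow \infty$. One must, however, take careful account of the $c$ that lurks in the definitions of $x^0$ and $\bar x^0$: when that is done, one finds that (201) assumes the (less memorably symmetric) form

$$\begin{pmatrix} t \\ x \end{pmatrix} = \gamma \begin{pmatrix} 1 & v/c^2 \\ v & 1 \end{pmatrix} \begin{pmatrix} \bar t \\ \bar x \end{pmatrix}$$

giving

$$\begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ v & 1 \end{pmatrix} \begin{pmatrix} \bar t \\ \bar x \end{pmatrix} \quad\text{as } c \uparrow \infty \tag{203}$$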
Figure 51: Graph of the $\beta$-dependence of $\gamma \equiv 1/\sqrt{1 - \beta^2}$, as $\beta \equiv v/c$ ranges on the physical interval $-1 < \beta < +1$. Outside that interval $\gamma$ becomes imaginary.

Heretofore we have been content to share our profession with a zippy population
of “superluminal inertial observers” who glide past us with speeds v > c. But

$\Lambda(\beta)$ becomes imaginary when $\beta^2 > 1$

We cannot enter into meaningful dialog with such observers; we therefore

strip them of their clocks, frames and PhD’s and send them into retirement,
denied any further collaboration in the development of our relativistic theory of
the world114 —indispensable though they were to our former Galilean activity.
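Surprisingly, we can get along very well without them, for

$$\Lambda(\beta_2)\,\Lambda(\beta_1) = \Lambda(\beta)$$

$$\begin{aligned}
\beta = \beta(\beta_1, \beta_2) &= \tanh(A_1 + A_2) \\
&= \frac{\tanh A_1 + \tanh A_2}{1 + \tanh A_1 \tanh A_2} \\
&= \frac{\beta_1 + \beta_2}{1 + \beta_1 \beta_2}
\end{aligned} \tag{204}$$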

entails (this is immediately evident in Figure 52) that

if v1 < c and v2 < c then so also is v(v1 , v2 ) < c:

one cannot leapfrog into the superluminal domain

114
This, however, does not, of itself, deny any conceivable role to superluminal
signals or particles in a relativistic physics!

Figure 52: Graph of the function $\beta(\beta_1, \beta_2)$. The vertices of the
frame stand at the points $(\pm 1, \pm 1, \pm 1)$ in 3-dimensional β-space.
If we write $\beta_3 = -\beta(\beta_1, \beta_2)$ then (204) assumes the high symmetry

$$\beta_1 + \beta_2 + \beta_3 + \beta_1 \beta_2 \beta_3 = 0$$

clearly evident in the figure. The “β-surface” looks rather like a soap
film spanning the 6-sided frame that results when the six untouched
edges of the cube are discarded.

The function $\beta(\beta_1, \beta_2)$ plays in (2-dimensional) relativity a role precisely
analogous to a “group table” in the theory of finite groups: it describes how
Lorentz transformations compose, and possesses many wonderful properties, of
which I list here only a few:

β(β1 , β2 ) = β(β2 , β1 )
β(β1 , β2 ) = 0 if β2 = −β1
β(1, 1) = 1

To this list our forcibly retired superluminal friends might add the following:

$$\beta(\beta_1, \beta_2) = \beta\!\left(\tfrac{1}{\beta_1}, \tfrac{1}{\beta_2}\right)$$

If β is subluminal then $1/\beta$ is superluminal. So we have here the statement

that the compose of two superluminal Lorentz transformations is subluminal
(the i’s have combined to become real). Moreover, every subluminal Lorentz
transformation can be displayed as such a compose (in many ways). Curious!
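The composition rule (204) and the properties listed above can be exercised directly. This is a small illustrative sketch; the function name `compose` and the sample velocities are my own choices, not the author's.

```python
import math

def compose(b1, b2):
    """Composition rule (204) for collinear velocities, in units of c."""
    return (b1 + b2) / (1.0 + b1 * b2)

# Two subluminal velocities never compose to a superluminal one
for pair in [(0.5, 0.5), (0.9, 0.9), (0.99, 0.999), (-0.7, 0.95)]:
    print(pair, compose(*pair))

b1, b2 = 0.3, 0.8
b = compose(b1, b2)

# Rapidities add: atanh(b) = atanh(b1) + atanh(b2)
print(math.atanh(b), math.atanh(b1) + math.atanh(b2))

# Symmetric form from the caption of Figure 52, with b3 = -b
b3 = -b
print(b1 + b2 + b3 + b1 * b2 * b3)      # vanishes up to rounding

# Properties from the list
print(compose(b1, b2) == compose(b2, b1))
print(compose(b1, -b1))
print(compose(1.0, 1.0))
print(compose(1 / b1, 1 / b2), b)       # a superluminal pair gives the same result
```

The last line illustrates the remark about superluminal pairs: replacing each β by its reciprocal leaves the composed value unchanged.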
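Equation (204) is often presented as the “relativistic velocity addition formula”

$$\begin{aligned}
v &= \frac{v_1 + v_2}{1 + v_1 v_2 / c^2} \\
&= (v_1 + v_2) \cdot \left[ 1 - \frac{v_1 v_2}{c^2} + \left(\frac{v_1 v_2}{c^2}\right)^2 - \left(\frac{v_1 v_2}{c^2}\right)^3 + \cdots \right] \\
&= (\text{Galilean formula}) \cdot (\text{relativistic correction factor})
\end{aligned}$$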

but that portrayal of the situation—though sometimes useful—seems to me to

miss (or to entail risk of missing) the simple origin and essential signiﬁcance of
(204): the tradition that has, for now nearly a century, presented relativity as a
source of endless paradox (and which has, during all that time, contributed little
or nothing to understanding—paradox being, as it is, a symptom of imperfect
understanding) should be allowed to wither.
In applications we will have need also of $\gamma(\beta_1, \beta_2) \equiv [1 - \beta^2(\beta_1, \beta_2)]^{-1/2}$, the
structure of which is developed most easily as follows:

$$\begin{aligned}
\gamma &= \cosh(A_1 + A_2) \\
&= \cosh A_1 \cosh A_2 \,\bigl[ 1 + \tanh A_1 \tanh A_2 \bigr] \\
&= \gamma_1 \gamma_2 \,\bigl[ 1 + \beta_1 \beta_2 \bigr]
\end{aligned} \tag{205}$$

This “γ-composition law” (in which we might, though it is seldom useful, use

$$\beta = \sqrt{1 - \gamma^{-2}} = \frac{\sqrt{(\gamma + 1)(\gamma - 1)}}{\gamma}$$

to eliminate the surviving β's) will acquire importance when we come to the
theory of radiation.
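The step between the first two lines of (205) uses the addition theorem for the hyperbolic cosine. Spelled out, with a numerical instance added as a check (the values β₁ = β₂ = 3/5 are chosen here only because they give round numbers):

```latex
\begin{aligned}
\cosh(A_1 + A_2) &= \cosh A_1 \cosh A_2 + \sinh A_1 \sinh A_2 \\
                 &= \cosh A_1 \cosh A_2 \,\bigl[\,1 + \tanh A_1 \tanh A_2\,\bigr] \\
                 &= \gamma_1 \gamma_2 \,\bigl[\,1 + \beta_1 \beta_2\,\bigr] \\[6pt]
\beta_1 = \beta_2 = \tfrac{3}{5}:\qquad
\gamma_1 = \gamma_2 &= \tfrac{5}{4}, \qquad
\gamma = \tfrac{25}{16}\cdot\tfrac{34}{25} = \tfrac{17}{8}, \qquad
\beta = \frac{6/5}{34/25} = \tfrac{15}{17}, \qquad
\frac{1}{\sqrt{1 - (15/17)^2}} = \tfrac{17}{8}
\end{aligned}
```

The two routes to γ (composition law versus direct evaluation from the composed β) agree, as they must.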

5. Geometric considerations. Our recent work has been algebraic. The following
remarks emphasize the geometrical aspects of the situation, and are intended
to provide a more vivid sense of what Lorentz transformations are all about.
By way of preparation: In Euclidean 3-space the equation $\mathbf{x}^T \mathbf{x} = r^2$ defines a
sphere (concentric about the origin, of radius r) which, consisting as it does of
points all of which lie at the same (Euclidean) distance from the origin, we may
reasonably call an “isometric surface.” Rotations ($\mathbf{x} \to \bar{\mathbf{x}} = R\,\mathbf{x}$ with $R^T R = I$)
cause the points of 3-space to shift about, but by a linear rule (straight lines
remain straight) that maps isometric spheres onto themselves: such surfaces
are, in short, “R-invariant.” Similarly . . .
In spacetime the σ-parameterized equations

$$x^T g\, x = \sigma$$

define a population of Lorentz-invariant isometric surfaces $\Sigma_\sigma$. The surfaces
that in 3-dimensional spacetime arise from

$$(x^0)^2 - (x^1)^2 - (x^2)^2 = \sigma$$

which describes a

• hyperboloid of two sheets in the case σ > 0

• cone in the case σ = 0
• hyperboloid of one sheet in the case σ < 0

are shown in Figure 53. The analogous construction in 2 -dimensional spacetime

(Figure 54) is easier to sketch, and serves most purposes well enough, but is
misleading in one important respect: it fails to indicate the profound distinction
between one-sheeted and two-sheeted hyperboloids. On the former one can
move continuously from any point to any other (one can, in particular, get
from one to the other by Lorentz transformation), but passage from one sheet
to the other is necessarily discontinuous (requires “time reflection,” which might
be symbolized

future ⇄ past

and cannot be executed “a little bit at a time”).

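How (within the geometric framework just described) is one to represent
the action $x \to \bar{x} = \Lambda x$ of $\Lambda(\beta)$? I find it advantageous to approach the
question somewhat obliquely: Suppose O to be thinking about the points
(events)

$$\begin{pmatrix} +1 \\ +1 \end{pmatrix}, \quad \begin{pmatrix} +1 \\ -1 \end{pmatrix}, \quad \begin{pmatrix} -1 \\ +1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} -1 \\ -1 \end{pmatrix}$$

that mark the vertices of a “unit square” on her spacetime diagram. By quick
calculation

$$\left.\begin{aligned}
\begin{pmatrix} +1 \\ +1 \end{pmatrix} &\to K^+(\beta) \begin{pmatrix} +1 \\ +1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} -1 \\ -1 \end{pmatrix} \to K^+(\beta) \begin{pmatrix} -1 \\ -1 \end{pmatrix} \\
\begin{pmatrix} +1 \\ -1 \end{pmatrix} &\to K^-(\beta) \begin{pmatrix} +1 \\ -1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} -1 \\ +1 \end{pmatrix} \to K^-(\beta) \begin{pmatrix} -1 \\ +1 \end{pmatrix}
\end{aligned}\right\} \tag{206}$$

where

$$K^+(\beta) \equiv \sqrt{\frac{1 + \beta}{1 - \beta}} \quad\text{and}\quad K^-(\beta) \equiv \sqrt{\frac{1 - \beta}{1 + \beta}} \tag{207}$$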

Figure 53: Isometric surfaces in 3-dimensional spacetime. The

arrow is “the arrow of time.” Points on the blue “null cone” (or
“light cone”) are deﬁned by the condition σ = 0: the interval
separating such points from the origin has zero squared length (in
the Lorentzian sense). Points on the green cup (which is interior
to the forward cone) lie in the “future” of the origin, while points
on the green cap (interior to the backward cone) lie in the “past:”
in both cases σ > 0. Points on the yellow girdle (exterior to
the cone) arise from σ < 0: they are separated from the origin by
intervals of negative squared length, and are said to lie “elsewhere.”
In physical (4 -dimensional) spacetime the circular cross sections
(cut by “time-slices”) become spherical. Special relativity acquires
many of its most distinctive features from the circumstance that the
isometric surfaces Σσ are hyperboloidal.

Figure 54: The isometric surfaces shown in the preceding ﬁgure

become isometric curves in 2-dimensional spacetime, where all
hyperbolas have two branches. We see that

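$\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ gives $\sigma = 1^2 - 0^2 = +1$, typical of points with timelike

$\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ gives $\sigma = 1^2 - 1^2 = 0$, typical of points with lightlike

$\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ gives $\sigma = 0^2 - 1^2 = -1$, typical of points with spacelike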
separation from the origin. And that—since the ﬁgure maps to itself
under the Lorentz transformations that
• describe the symmetry structure of spacetime
• describe the relationships among inertial observers
—these classiﬁcations are Lorentz-invariant, shared by all inertial
observers.
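The Lorentz invariance of the timelike/lightlike/spacelike classification can be confirmed numerically in the 2-dimensional case. A minimal sketch, with hypothetical helper names (`sigma`, `classify`, `boost`) and a small tolerance to absorb rounding:

```python
import math

def sigma(x0, x1):
    """Squared Lorentzian length of the interval from the origin to (x0, x1)."""
    return x0 * x0 - x1 * x1

def classify(x0, x1, tol=1e-9):
    s = sigma(x0, x1)
    if s > tol:
        return "timelike"
    if s < -tol:
        return "spacelike"
    return "lightlike"

def boost(beta, x0, x1):
    """Apply the matrix of (201) to the event (x0, x1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (x0 + beta * x1), g * (beta * x0 + x1)

events = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
for event in events:
    for beta in (-0.9, 0.3, 0.8):
        moved = boost(beta, *event)
        print(event, beta, classify(*event), classify(*moved), sigma(*moved))
```

For each sample event the class, and indeed the value of σ itself, is the same before and after the boost.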

Calculation would establish what is in fact made obvious already at (206): the
$K^\pm(\beta)$ are precisely the eigenvalues of $\Lambda(\beta)$.115 Nor are we surprised that the
associated eigenvectors are null vectors, since

$$(x, x) \to (Kx, Kx) = K^2 (x, x) = (x, x) \quad\text{entails}\quad (x, x) = 0 \quad\text{whenever } K^2 \neq 1$$

115
We note in passing that $K^-(\beta) = [K^+(\beta)]^{-1} = K^+(-\beta)$.
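The eigenvalue statement (206)–(207), the relation $K^+ K^- = 1$, and the resulting preservation of Euclidean area (unit determinant) can be checked as follows. This is an illustrative sketch; the helper names and the value β = 0.6 are arbitrary choices.

```python
import math

def boost(beta):
    """The 2x2 matrix of (201)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, g * beta], [g * beta, g]]

def apply(L, x):
    return [L[0][0] * x[0] + L[0][1] * x[1],
            L[1][0] * x[0] + L[1][1] * x[1]]

def k_plus(beta):
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def k_minus(beta):
    return math.sqrt((1.0 - beta) / (1.0 + beta))

beta = 0.6
L = boost(beta)
up = apply(L, [1.0, 1.0])       # should equal K+ times (1, 1)
down = apply(L, [1.0, -1.0])    # should equal K- times (1, -1)
det = L[0][0] * L[1][1] - L[0][1] * L[1][0]

print(up, k_plus(beta))
print(down, k_minus(beta))
print(k_plus(beta) * k_minus(beta), det)   # both should be 1 up to rounding
print(k_minus(beta), 1.0 / k_plus(beta), k_plus(-beta))
```

With β = 0.6 the two eigenvalues come out close to 2 and 1/2, so one diagonal of the unit square is doubled while the other is halved.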
Figure 55: Inertial observer O inscribes a “unit square” □, with
lightlike vertices, on her spacetime diagram. $\Lambda(\beta)$ stretches one
diagonal by the factor $K^+$, and shrinks the other by the factor $K^-$.
That individual points “slide along isometric curves” is illustrated
here by the motion • → • of a point of tangency. Corresponding
sides of □ and its transform have different Euclidean lengths, but
identical Lorentzian lengths. Curiously, it follows from $K^+ K^- = 1$
that □ and its transform have identical Euclidean areas.116,117

The upshot of preceding remarks is illustrated above, and elaborated in the

ﬁgure on the next page, where I have stated in the caption but here emphasize
once again that such ﬁgures, though drawn on the Euclidean page, are to be read
as inscriptions on 2-dimensional spacetime. The distinction becomes especially
clear when one examines Figure 57.

116
problem 43.
117
Some authors stress the utility in special relativity of what they call the
“k-calculus:” see, for example, Hermann Bondi, Relativity and Common Sense:
A New Approach to Einstein (), pages 88 –121 and occasional papers in the
American Journal of Physics. My K-notation is intended to establish contact
with that obscure tradition.

Figure 56: Elaboration of the preceding ﬁgure. O has inscribed a

Cartesian gridwork on spacetime. On the right is shown the Lorentz
transform of that coordinate grid. Misner, Thorne & Wheeler
(Gravitation (), page 11) have referred in this connection to
the “collapse of the egg crate,” though that picturesque terminology
is somewhat misleading: egg crates preserve side-length when they
collapse, while the present mode of collapse preserves Euclidean
area. Orthogonality, though obviously violated in the Euclidean
sense, is preserved in the Lorentzian sense . . . which is, in fact,
the only relevant sense, since the ﬁgure is inscribed not on the
Euclidean plane but on 2-dimensional spacetime. Notice that
tangents to isometric curves remain in each case tangent to the
same such curve. The entire population of isometric curves (see
again Figure 54) can be recovered as the population of envelopes of
the grid lines, as generated by allowing β to range over all allowed
values (−1 < β < +1).

Figure 57: O writes $(ct, 0)$ to describe the “$t$th tick of her clock.”
Working from (201) we find that $\bar{O}$ assigns coordinates $(\gamma ct, \gamma\beta ct)$ to
that same event. The implication is that the (Euclidean) angle ϑ
subtended by
• $\bar{O}$'s time axis and
• $\bar{O}$'s representation of O's time axis
can be described

$$\tan\vartheta = \beta$$

The same angle, by a similar argument, arises when one looks to
$\bar{O}$'s representation of O's space axis. One could, with this information,
construct the instance of Figure 56 which is appropriate to
any prescribed β-value. Again I emphasize that (their Euclidean
appearance notwithstanding) O and $\bar{O}$ are in agreement that O's
coordinate axes are normal in the Lorentzian sense.118

We are in position now to make four points of fundamental physical significance,
of which three are temporal, and one spatial. The points I have in mind will
be presented in a series of figures, and developed in the captions:

118
problem 44.

Figure 58: Breakdown of non-local simultaneity . O sees three

spatially-separated events to be simultaneous. $\bar{O}$, on the other hand,
assigns distinct $\bar{x}^0$-coordinates to those same events (see the figure
on the right), which he considers to be non-simultaneous/sequential.
It makes relativistic good sense to use the word “simultaneous”
only in reference to events which (like the birth of twins) occur
at the same moment and at the same spatial point. The Newtonian
concept of “instantaneous action at a distance”—central to his
“Universal Law of Gravitation” but which, on philosophical grounds,
bothered not only Newton’s contemporaries but also Newton himself
—has been rendered relativistically untenable: interactions, in any
relativistically coherent physics, have become necessarily local,
dominated by what philosophers call the “Principle of Contiguity.”
They have, in short, become collision-like events, the effects of which
propagate like a contagion: neighbor infects neighbor. If “particles”
are to participate in collisions they must necessarily be held to be
pointlike in the mathematical sense (a hard idealization to swallow),
lest one acquire an obligation to develop a physics of processes
interior to the particle. The language most natural to physics has
become field theory—a theory in which all interactions are local
field-field interactions, described by partial differential equations.

Figure 59: Conditional covariance of causal sequence . At left:

diverse inertial observers all place the event • on a sheet of the
isometric hyperboloid that is confined to the interior of the forward
lightcone, and all agree that • lies “in the future” of the origin ◦.
But if (as at the right) • is separated from ◦ by a spacelike interval;
i.e., if • lies outside the lightcone at ◦, then some observers see
• to lie in the future of ◦, while other observers see • to lie in
its past. In the latter circumstance it is impossible to develop an
agreed-upon sense of causal sequence. Generally: physical events
at a point p can be said to have been “caused” only by events
that lie in/on the lightcone that extends backward from p, and can
themselves influence only events that lie in/on the lightcone that
extends forward from p. In electrodynamics it will emerge that
(owing to the absence of “photon mass terms”) effects propagate on
the lightcone. Recent quantum mechanical experiments (motivated
by the “EPR paradox”) are of great interest because they have yielded
results that appear to be “acausal” in the sense implied by preceding
remarks: the outcome of a quantum coin-flip at p predetermines
the result of a similar measurement at q even though the interval
separating q from p is spacelike.
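The “conditional covariance of causal sequence” described in this caption can be made concrete in two dimensions. The sketch below is illustrative only (the sample events and the grid of β-values are arbitrary); it asks whether every observer in a family related to us by boosts places an event after the origin.

```python
import math

def boosted_time(beta, x0, x1):
    """Time coordinate assigned, per (201), by an observer related to us by Lambda(beta)."""
    return (x0 + beta * x1) / math.sqrt(1.0 - beta * beta)

betas = [i / 10.0 for i in range(-9, 10)]   # -0.9, -0.8, ..., +0.9

timelike_event = (2.0, 1.0)    # inside the forward lightcone of the origin
spacelike_event = (1.0, 2.0)   # outside the lightcone

later_timelike = {boosted_time(b, *timelike_event) > 0 for b in betas}
later_spacelike = {boosted_time(b, *spacelike_event) > 0 for b in betas}

print(later_timelike)    # a single truth value: all observers agree
print(later_spacelike)   # both truth values occur: observers disagree
```

For the timelike event every observer reports a positive time; for the spacelike event the sign depends on β, which is the disagreement the caption describes.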

Figure 60: Time dilation. Inertial observer O assigns duration $x^0$
to the interval separating “successive ticks • . . . • of her clock.” A
second observer $\bar{O}$, in motion relative to O, assigns to those same
events (see again Figure 57) the coordinates

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \bar{x}^0 \\ \bar{x}^1 \end{pmatrix} = \begin{pmatrix} \gamma x^0 \\ \gamma\beta x^0 \end{pmatrix}$$

He assigns the same Lorentzian value to the squared length of the
spacetime interval • . . . • that O assigned to • . . . •

$$(\gamma x^0)^2 - (\gamma\beta x^0)^2 = (x^0)^2 - (0)^2$$

but reports that the 2nd tick occurred at time

$$\bar{x}^0 = \gamma x^0 > x^0$$

In an example discussed in every text (see, e.g., Taylor & Wheeler,

Spacetime Physics (), §42) the “ticking” is associated with the
lifetime of an unstable particle—typically a muon—which (relative
to the tabulated rest-frame value) seems dilated to observers who see
the particle to be in motion.
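A numerical illustration of the muon example may help fix orders of magnitude. The rest-frame mean lifetime used below (about 2.197 microseconds) is the commonly tabulated figure; the speed β = 0.995 is a hypothetical value chosen only for illustration, and the variable names are my own.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

C = 299_792_458.0        # speed of light in m/s
TAU_REST = 2.197e-6      # muon mean lifetime in its rest frame, in seconds (approximate)
beta = 0.995             # hypothetical speed, for illustration only

g = gamma(beta)
tau_moving = g * TAU_REST              # dilated lifetime seen by the second observer
path = beta * C * tau_moving           # distance covered during that time

# The squared interval between "birth" and "decay" is the same for both observers
x0 = C * TAU_REST
lhs = (g * x0) ** 2 - (g * beta * x0) ** 2
rhs = x0 ** 2 - 0.0 ** 2

print(g, tau_moving, path)
print(lhs, rhs)
```

The dilation factor here is about ten, so the mean path length is roughly ten times what a naive (undilated) estimate would give.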

Figure 61: Lorentz contraction. This is often looked upon as

the flip side of time dilation, but the situation as it pertains to
spatial intervals is—owing to the fact that metersticks persist, and
are therefore not precise analogs of clockticks—a bit more subtle.
At left is O’s representation of a meterstick sitting there, sitting
there, sitting there . . . and at right is $\bar{O}$’s representation of that same
construction. The white arrows indicate that while O and $\bar{O}$ have
the same thought in mind when they talk about the “length of the
meterstick” (length of the spatial interval that separates one end
from the other at an instant) they are—because they assign distinct
meanings to “at an instant”—actually talking about different things.
Detailed implications are developed in the following figure.

Figure 62: Lorentz contraction (continued). When observers speak
of the “length of a meterstick” they are really talking about what they
perceive to be the width of the “ribbon” which such an extended
object inscribes on spacetime. This expanded detail from the
preceding figure shows how it comes about that the meterstick which
O sees to be at rest, and to which she assigns length $\ell_0$, is assigned
length

$$\ell = \gamma^{-1} \ell_0 < \ell_0$$

by $\bar{O}$, who sees the meterstick to be in uniform motion. This familiar
result poses, by the way, a problem which did not escape Einstein’s
attention, and which contributed to the development of general
relativity: The circumference of a rigidly rotating disk has become
too short to go all the way around!119

Prior to Einstein’s appearance on the scene () it was universally held
that time dilation and “Lorentz-FitzGerald contraction” were physical effects,
postulated to account for the null result of the Michelson-Morley experiment,
and attributed to the interaction of physical clocks and physical metersticks
with the physical “æther” through which they were being transported. Einstein
(with his trains and lanterns) argued that such effects are not “physical,” in the
sense that they have to do with the properties of “stuff” . . . but “metaphysical”
(or should one say: pre-physical?): artifacts of the operational procedures by
which one assigns meaning to lengths and times. In preceding pages I have, in
the tradition established by Minkowski, espoused a third view: I have
represented all such effects as reflections of the circumstance (brought first
to our attention by electrodynamics) that the hyperbolic geometry of spacetime
is a primitive fact of the world, embraced by all inertial observers . . . and written
into the design of all possible physics.

119
See J. Stachel, “Einstein and the rigidly rotating disk” in A. Held (editor),
General Relativity & Gravitation (), Volume 1, page 1. H. Arzeliès, in
Relativistic Kinematics (), devotes an entire chapter to the disk problem
and its relatives.

remark: It would be nice if things were so simple (which in leading

approximation they are), but when we dismissed Newton’s Law of
Universal Gravitation as “relativistically untenable” we acquired
a question (“How did the Newtonian theory manage to serve so
well for so long?”) and an obligation—the development of a “ﬁeld
theory of gravitation.” The latter assignment, as discharged by
Einstein himself, culminated in the invention of “general relativity”
and the realization that it is—except in the approximation that
gravitational eﬀects can be disregarded—incorrect to speak with
global intent about the “hyperbolic geometry of spacetime.” The
“geometry of spacetime” is “hyperbolic” only in the same
approximate/tangential sense that vanishingly small regions
inscribed on (say) the unit sphere become “Euclidean.”

6. Lorentz transformations in 4-dimensional spacetime. The transition from toy

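2-dimensional spacetime to physical 4-dimensional spacetime poses an enriched
algebraic problem

$$\Lambda = \exp \begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix} \tag{196.1}$$

$$\downarrow$$

$$\Lambda = \exp \begin{pmatrix} 0 & A_1 & A_2 & A_3 \\ A_1 & 0 & a_3 & -a_2 \\ A_2 & -a_3 & 0 & a_1 \\ A_3 & a_2 & -a_1 & 0 \end{pmatrix} \tag{193}$$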

and brings to light a physically-important point or two which were overlooked

by Einstein himself. The algebraic details are, if addressed with a measure of
elegance, of some intrinsic interest120 . . . but I must here be content merely to
outline the most basic facts, and to indicate their most characteristic kinematic/
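physical consequences.

120
See elements of relativity ().

Consider first the case $A_1 = A_2 = A_3 = 0$, in which $\Lambda$ possesses only space/space generators.121
Then

$$\Lambda = \exp \begin{pmatrix} 0 & \mathbf{0}^T \\ \mathbf{0} & \mathbb{A} \end{pmatrix}$$

where

$$\mathbb{A} \equiv \begin{pmatrix} 0 & a_3 & -a_2 \\ -a_3 & 0 & a_1 \\ a_2 & -a_1 & 0 \end{pmatrix} \quad\text{is real and antisymmetric}$$

It follows quite easily that

$$\Lambda = \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & R \end{pmatrix} \tag{208}$$

where $R \equiv e^{\mathbb{A}}$ is a 3×3 rotation matrix. The action of such a $\Lambda$ can be described

$$\begin{pmatrix} x^0 \\ \mathbf{x} \end{pmatrix} \to \begin{pmatrix} \bar{x}^0 \\ \bar{\mathbf{x}} \end{pmatrix} = \begin{pmatrix} x^0 \\ R\,\mathbf{x} \end{pmatrix}$$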

as a spatial rotation that leaves time coordinates unchanged. Look to the case
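$a_1 = a_2 = 0$, $a_3 = \varphi$ and use the Mathematica command MatrixExp (applied
to the exponent in (193)) to obtain

$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi & 0 \\ 0 & -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

with the evident implication that in the general case such a Lorentz matrix
describes a lefthanded rotation through angle $\varphi = \sqrt{\mathbf{a} \cdot \mathbf{a}}$ about the unit vector
$\boldsymbol{\lambda} \equiv \hat{\mathbf{a}}$.122 Such Lorentz transformations contain no allusion to $\mathbf{v}$ and have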
no properly kinematic signiﬁcance: O simply stands beside us, using her clock
(indistinguishable from ours) and her rotated Cartesian frame to “do physics.”
What we have learned is that

Spatial rotations are Lorentz transformations

of a special type (a type for which the 2 -dimensional theory is too impoverished
to make provision). The associated Lorentz matrices will be notated R (φ, λ).
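Readers without Mathematica can reproduce the exponential with a few lines of standard-library Python. The sketch below is an assumption-laden stand-in, not the author's procedure: it sums the power series for the matrix exponential directly (adequate for small matrices of modest norm), and the helper names `matmul`, `expm` and `rotation_exponent` are hypothetical.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential by truncated power series: sum of A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def rotation_exponent(a1, a2, a3):
    """The exponent of (193) with A1 = A2 = A3 = 0."""
    return [[0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, a3, -a2],
            [0.0, -a3, 0.0, a1],
            [0.0, a2, -a1, 0.0]]

phi = 0.7
L = expm(rotation_exponent(0.0, 0.0, phi))
expected = [[1.0, 0.0, 0.0, 0.0],
            [0.0, math.cos(phi), math.sin(phi), 0.0],
            [0.0, -math.sin(phi), math.cos(phi), 0.0],
            [0.0, 0.0, 0.0, 1.0]]
error = max(abs(L[i][j] - expected[i][j]) for i in range(4) for j in range(4))
print(error)

# General axis: the rotation angle is |a|, so the 3x3 block has trace 1 + 2 cos|a|
a = (0.2, -0.3, 0.6)
norm_a = math.sqrt(sum(x * x for x in a))
G = expm(rotation_exponent(*a))
trace_block = G[1][1] + G[2][2] + G[3][3]
print(trace_block, 1.0 + 2.0 * math.cos(norm_a))
```

The trace comparison checks only the angle; the sense of rotation is read off from the explicit matrix in the special case.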
Look next to the complementary . . .
121
“Time/time” means 0 appears twice, “time/space” and “space/time” mean
that 0 appears once, “space/space” means that 0 is absent.
122
See classical dynamics (/), Chapter 1, pages 83–89 for a simple
account of the detailed argument.

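case $a_1 = a_2 = a_3 = 0$ in which $\Lambda$ possesses only time/space generators.
Here (as it turns out) $\Lambda$ does possess kinematic significance. The argument
which (on page 139) gave

$$A = \tanh^{-1}\beta \quad\text{with}\quad \beta = v/c$$

now gives

$$\mathbf{A} = \tanh^{-1}\beta \cdot \hat{\mathbf{v}}$$

while the argument which (on pages 138–139) gave

$$\Lambda = \exp\left\{ \tanh^{-1}\beta \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \right\} = \begin{pmatrix} \gamma & v\gamma/c \\ v\gamma/c & \gamma \end{pmatrix}$$

now gives

$$\begin{aligned}
\Lambda &= \exp\left\{ \tanh^{-1}\beta \begin{pmatrix} 0 & \hat{v}_1 & \hat{v}_2 & \hat{v}_3 \\ \hat{v}_1 & 0 & 0 & 0 \\ \hat{v}_2 & 0 & 0 & 0 \\ \hat{v}_3 & 0 & 0 & 0 \end{pmatrix} \right\} \\
&= \begin{pmatrix}
\gamma & v_1\gamma/c & v_2\gamma/c & v_3\gamma/c \\
v_1\gamma/c & 1 + (\gamma - 1)v_1 v_1/v^2 & (\gamma - 1)v_1 v_2/v^2 & (\gamma - 1)v_1 v_3/v^2 \\
v_2\gamma/c & (\gamma - 1)v_2 v_1/v^2 & 1 + (\gamma - 1)v_2 v_2/v^2 & (\gamma - 1)v_2 v_3/v^2 \\
v_3\gamma/c & (\gamma - 1)v_3 v_1/v^2 & (\gamma - 1)v_3 v_2/v^2 & 1 + (\gamma - 1)v_3 v_3/v^2
\end{pmatrix}
\end{aligned}$$

Such Lorentz matrices will be notated

$$\Lambda = \Lambda(\boldsymbol{\beta}), \qquad \boldsymbol{\beta} \equiv \mathbf{v}/c \tag{209}$$

They give rise to Lorentz transformations $x \to \bar{x} = \Lambda(\boldsymbol{\beta})\,x$ which are “pure”
(in the sense “rotation-free”) and are called “boosts.” The construction (209)
looks complicated, but in fact it possesses precisely the structure that one might
(with a little thought) have anticipated. For (209) supplies123

$$\begin{aligned}
\bar{t} &= \gamma\, t + (\gamma/c^2)\, \mathbf{v} \cdot \mathbf{x} \\
\bar{\mathbf{x}} &= \mathbf{x} + \bigl[ \gamma\, t + (\gamma - 1)\, (\mathbf{v} \cdot \mathbf{x})/v^2 \bigr]\, \mathbf{v}
\end{aligned} \tag{210.1}$$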

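and if we resolve $\mathbf{x}$ and $\bar{\mathbf{x}}$ into components which are parallel/perpendicular to
the velocity $\mathbf{v}$ with which $\bar{O}$ sees O to be gliding by

$$\mathbf{x} = \mathbf{x}_\perp + \mathbf{x}_\parallel \quad\text{with}\quad \mathbf{x}_\parallel \equiv (\mathbf{x} \cdot \hat{\mathbf{v}})\, \hat{\mathbf{v}} \equiv x_\parallel \hat{\mathbf{v}}, \quad \mathbf{x}_\perp \equiv \mathbf{x} - \mathbf{x}_\parallel$$

$$\bar{\mathbf{x}} = \bar{\mathbf{x}}_\perp + \bar{\mathbf{x}}_\parallel \quad\text{with}\quad \bar{\mathbf{x}}_\parallel \equiv (\bar{\mathbf{x}} \cdot \hat{\mathbf{v}})\, \hat{\mathbf{v}} \equiv \bar{x}_\parallel \hat{\mathbf{v}}, \quad \bar{\mathbf{x}}_\perp \equiv \bar{\mathbf{x}} - \bar{\mathbf{x}}_\parallel$$

then (210.1) can be written (compare (203))

$$\left.\begin{aligned}
\begin{pmatrix} \bar{t} \\ \bar{x}_\parallel \end{pmatrix} &= \gamma \begin{pmatrix} 1 & v/c^2 \\ v & 1 \end{pmatrix} \begin{pmatrix} t \\ x_\parallel \end{pmatrix} \\
\bar{\mathbf{x}}_\perp &= \mathbf{x}_\perp
\end{aligned}\right\} \tag{210.2}$$

And in the Galilean limit we recover

$$\begin{pmatrix} \bar{t} \\ \bar{x}^1 \\ \bar{x}^2 \\ \bar{x}^3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ v_1 & 1 & 0 & 0 \\ v_2 & 0 & 1 & 0 \\ v_3 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \tag{210.3}$$

123
problem 45, 46.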
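The closed form of the boost matrix can be checked against its exponential definition, and against the defining condition $\Lambda^T g \Lambda = g$, without any symbolic software. The following sketch works in units with c = 1 (so that the velocity is the vector β) and again sums the exponential series directly; all helper names are hypothetical, and the sample velocity is arbitrary but must be nonzero and subluminal.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def expm(A, terms=40):
    """Matrix exponential by truncated power series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def boost(b):
    """Closed-form boost matrix, with c = 1 and b = v/c (b must be nonzero)."""
    b2 = sum(x * x for x in b)
    g = 1.0 / math.sqrt(1.0 - b2)
    L = [[g] + [g * x for x in b]]
    for i in range(3):
        L.append([g * b[i]] + [float(i == j) + (g - 1.0) * b[i] * b[j] / b2
                               for j in range(3)])
    return L

def boost_exponent(b):
    """atanh(|b|) times the time/space generator built from the unit vector b/|b|."""
    speed = math.sqrt(sum(x * x for x in b))
    A = [math.atanh(speed) * x / speed for x in b]
    return [[0.0, A[0], A[1], A[2]],
            [A[0], 0.0, 0.0, 0.0],
            [A[1], 0.0, 0.0, 0.0],
            [A[2], 0.0, 0.0, 0.0]]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

METRIC = [[1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0]]

b = [0.3, -0.4, 0.5]
closed = boost(b)
series = expm(boost_exponent(b))
invariant = matmul(transpose(closed), matmul(METRIC, closed))

print(max_diff(closed, series))      # exponential and closed form agree
print(max_diff(invariant, METRIC))   # the Lorentz condition is satisfied
```

Both printed discrepancies should be at the level of floating-point rounding.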

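general case. Having discussed the 3-parameter family of rotations
$R(\varphi, \boldsymbol{\lambda})$ and the 3-parameter family of boosts $\Lambda(\boldsymbol{\beta})$ the question arises: What
can one say in the general 6-parameter case

$$\Lambda = e^{\mathbb{A}} \;?$$

It is (given the context in which the question was posed) natural to write

$$\mathbb{A} = \mathbb{J} + \mathbb{K}$$

with

$$\mathbb{J} \equiv \begin{pmatrix} 0 & A_1 & A_2 & A_3 \\ A_1 & 0 & 0 & 0 \\ A_2 & 0 & 0 & 0 \\ A_3 & 0 & 0 & 0 \end{pmatrix} \equiv \sum_{i=1}^{3} A_i\, \mathbb{J}_i$$

$$\mathbb{K} \equiv \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & a_3 & -a_2 \\ 0 & -a_3 & 0 & a_1 \\ 0 & a_2 & -a_1 & 0 \end{pmatrix} \equiv \sum_{i=1}^{3} a_i\, \mathbb{K}_i$$

and one might on this basis be tempted to write $\Lambda = e^{\mathbb{K}} \cdot e^{\mathbb{J}}$, giving

$$\Lambda_{\text{general}} = (\text{rotation}) \cdot (\text{boost}) \tag{211}$$

Actually, a representation theorem of the form (211) is available, but the
argument which here led us to (211) is incorrect: one can write

$$e^{\mathbb{J} + \mathbb{K}} = e^{\mathbb{K}} \cdot e^{\mathbb{J}} \quad\text{if and only if } \mathbb{J} \text{ and } \mathbb{K} \text{ commute}$$

and in the present instance we (by computation) have

$$[\mathbb{J}, \mathbb{K}] = -\sum_{i=1}^{3} (\mathbf{A} \times \mathbf{a})_i\, \mathbb{J}_i \tag{212}$$

$$= \mathbb{O} \quad\text{if and only if } \mathbf{A} \text{ and } \mathbf{a} \text{ are parallel}$$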

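More careful analysis (which requires some fairly sophisticated algebraic
machinery124) leads back again to (211), but shows the boost and rotational
factors of $\Lambda$ to be different from those initially contemplated. I resist the
temptation to inquire more closely into the correct factorization of $\Lambda$, partly
because I have other fish to fry . . . but mainly because I have already in hand
the facts needed to make my major point, which concerns the composition of
boosts in 4-dimensional spacetime. It follows immediately from (208) that

$$(\text{rotation}) \cdot (\text{rotation}) = (\text{rotation}) \tag{213.1}$$

(specific description of the rotation on the right poses a non-trivial
but merely technical (algebraic) problem).

It might, on analogical grounds, appear plausible therefore that

$$(\text{boost}) \cdot (\text{boost}) = (\text{boost})$$

but (remarkably!) this is not the case: actually

$$(\text{boost}) \cdot (\text{boost}) = (\text{rotation}) \cdot (\text{boost}) \tag{213.2}$$

Detailed calculation shows more specifically that

$$\Lambda(\boldsymbol{\beta}_2) \cdot \Lambda(\boldsymbol{\beta}_1) = R(\varphi, \boldsymbol{\lambda})\, \Lambda(\boldsymbol{\beta}) \tag{214.0}$$

where

$$\boldsymbol{\beta} = \frac{\bigl[ 1 + (\beta_2/\beta_1)\bigl(1 - \tfrac{1}{\gamma_1}\bigr) \cos\omega \bigr]\, \boldsymbol{\beta}_1 + \tfrac{1}{\gamma_1}\, \boldsymbol{\beta}_2}{1 + \beta_1 \beta_2 \cos\omega} \tag{214.1}$$

$$\boldsymbol{\lambda} = \text{unit vector parallel to } \boldsymbol{\beta}_2 \times \boldsymbol{\beta}_1 \tag{214.2}$$

$$\omega = \text{angle between } \boldsymbol{\beta}_1 \text{ and } \boldsymbol{\beta}_2 \tag{214.3}$$

$$\varphi = \tan^{-1} \frac{\epsilon \sin\omega}{1 + \epsilon \cos\omega} \tag{214.4}$$

$$\epsilon = \sqrt{\frac{(\gamma_1 - 1)(\gamma_2 - 1)}{(\gamma_1 + 1)(\gamma_2 + 1)}} \tag{214.5}$$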
and where β1 , β2 , γ1 and γ2 have the obvious meanings. One is quite unprepared
by 2-dimensional experience for results which are superficially so ugly, and which
are undeniably so complex. The following points should be noted:
1. Equation (214.1) is the 4-dimensional velocity addition formula. Looking
with its aid to $\boldsymbol{\beta} \cdot \boldsymbol{\beta}$ we obtain the speed addition formula

$$\beta = \frac{\sqrt{\beta_1^2 + \beta_2^2 + 2\beta_1 \beta_2 \cos\omega - (\beta_1 \beta_2 \sin\omega)^2}}{1 + \beta_1 \beta_2 \cos\omega} \tag{215}$$

$$\Downarrow$$

$$\beta \leq 1 \quad\text{if } \beta_1 \leq 1 \text{ and } \beta_2 \leq 1$$

according to which (see the following figure) one cannot, by composing velocities,
escape from the c-ball. Note also that

$$\beta \;\to\; \frac{\beta_1 + \beta_2}{1 + \beta_1 \beta_2} \quad\text{in the collinear case: } \omega = 0$$

which is in precise conformity with the familiar 2-dimensional formula (204).

124
The requisite machinery is developed in elaborate detail in elements of
special relativity ().

Figure 63: $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$, if not collinear, span a plane in 3-dimensional
β-space. The figure shows the intersection of that plane with what
I call the “c-ball,” defined by the condition $\beta^2 = 1$. The placement
of $\boldsymbol{\beta}$ is given by (214.1). Notice that, while $\boldsymbol{\beta}_1 + \boldsymbol{\beta}_2$ falls into the
forbidden exterior of the c-ball, $\boldsymbol{\beta}$ does not. Notice also that $\boldsymbol{\beta}$ lies
on the $\boldsymbol{\beta}_1$-side of $\boldsymbol{\beta}_1 + \boldsymbol{\beta}_2$, from which it deviates by an angle that
turns out to be precisely the φ that enters into the design of the
rotational factor $R(\varphi, \boldsymbol{\lambda})$.

2. It is evident in (214.1) that $\boldsymbol{\beta}$ depends asymmetrically upon $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$.
Not only is $\boldsymbol{\beta} \neq \boldsymbol{\beta}_1 + \boldsymbol{\beta}_2$, it is not even parallel to $\boldsymbol{\beta}_1 + \boldsymbol{\beta}_2$, from which it
deviates by an angle that turns out to be precisely the φ encountered already,
in quite another connection, at (214.4). The asymmetry of the situation might
be summed up in the phrase “$\boldsymbol{\beta}_1$ predominates.” From this circumstance one
acquires interest in the angle Ω between $\boldsymbol{\beta}$ and $\boldsymbol{\beta}_1$: we find

$$\Omega = \tan^{-1} \frac{\beta_2 \sin\omega}{\gamma_1 (\beta_1 + \beta_2 \cos\omega)} \tag{216}$$

$$\downarrow$$

$$\Omega_0 = \tan^{-1} \frac{\beta_2 \sin\omega}{\beta_1 + \beta_2 \cos\omega} \quad\text{in the non-relativistic limit}$$

Figure 64: At left: Galilean composition of non-collinear velocities.
At right: its Lorentzian counterpart, showing the sense in which
“$\boldsymbol{\beta}_1$ predominates.” Evidently

Ωrelativistic = Ω0 + φ Ω0

calculations which are elementary in the Galilean case (see the ﬁgure) but
become a little tedious in the relativistic case.125 Asymmetry eﬀects become
most pronounced in the ultra-relativistic limit. Suppose, for example, that
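$\beta_1 = 1$: then $\Omega \downarrow 0$ and

$$\boldsymbol{\beta} \to \boldsymbol{\beta}_1, \text{ irrespective of the value assigned to } \boldsymbol{\beta}_2!$$

More physically,126 suppose $\beta_1 < 1$ but $\beta_2 = 1$: then

$$\Omega = \tan^{-1} \left[ \sqrt{1 - \beta_1^2}\; \frac{\sin\omega}{\beta_1 + \cos\omega} \right]$$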
The ﬁrst occurrence of this formula is in §7 of Einstein’s ﬁrst relativity paper
(), where it is found to provide the relativistic correction to the classic “law
of aberration.”127
3. It is a corollary of (215) that

$$\gamma = \gamma_1 \gamma_2 \,\bigl[ 1 + \beta_1 \beta_2 \cos\omega \bigr]$$

which gives back (205) in the collinear case.

125
See page 87 in the notes just cited.
126
I say “more physically” because β = 1 cannot pertain to an “observer ”
(though it can pertain to the flight of a massless particle): while it does make
sense to ask what an observer in motion (with respect to us) has to say about
the lightbeam to which we assign a certain direction of propagation, it makes
no sense to ask what the lightbeam has to say about the observer!
127
“Aberration” is the name given by astronomers to the fact that “fixed
stars” are seen to trace small ellipses in the sky, owing to the earth’s annual
progress along its orbit. See page 17 in W. Pauli’s classic Theory of Relativity
(first published in , when Pauli was only twenty-one years old; reissued
with a few additional notes in ) or P. G. Bergmann, Introduction to the
Theory of Relativity (), pages 36–38.
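4. In the small-velocity approximation (214.1) and (214.4) give

$$\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2 - \Bigl\{ \tfrac{1}{2}\beta_1 \beta_2 \cos\omega \cdot \mathbf{v}_1 + \bigl[ \tfrac{1}{2}\beta_1^2 + \beta_1 \beta_2 \cos\omega \bigr]\, \mathbf{v}_2 \Bigr\} + \cdots$$

$$\varphi = \tfrac{1}{4}\beta_1 \beta_2 \sin\omega + \cdots$$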

according to which all “relativistic correction terms” are of 2nd order.
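Points 1 and 3 above, and the non-closure statement (213.2), can be tested by multiplying two boost matrices numerically. The sketch below makes several assumptions of its own: units with c = 1, the ordering $\Lambda(\boldsymbol{\beta}_2)\Lambda(\boldsymbol{\beta}_1)$ of (214.0), and the observation that for a product of the form (rotation)·(boost) the top row is that of the boost factor, so that β can be read off as the ratio of top-row entries. It does not attempt to check the angle formula (214.4).

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gamma(b):
    return 1.0 / math.sqrt(1.0 - dot(b, b))

def boost(b):
    """Closed-form boost matrix (209) with c = 1; b must be nonzero."""
    g, b2 = gamma(b), dot(b, b)
    L = [[g] + [g * x for x in b]]
    for i in range(3):
        L.append([g * b[i]] + [float(i == j) + (g - 1.0) * b[i] * b[j] / b2
                               for j in range(3)])
    return L

def add_velocities(b1, b2):
    """Formula (214.1), written with dot products instead of the angle omega."""
    g1, d = gamma(b1), dot(b1, b2)           # d = beta1 * beta2 * cos(omega)
    k = 1.0 + (1.0 - 1.0 / g1) * d / dot(b1, b1)
    return [(k * x1 + x2 / g1) / (1.0 + d) for x1, x2 in zip(b1, b2)]

b1 = [0.6, 0.0, 0.0]
b2 = [0.3, 0.7, 0.0]                         # not collinear with b1

M = matmul(boost(b2), boost(b1))
from_matrix = [M[0][i] / M[0][0] for i in (1, 2, 3)]
from_formula = add_velocities(b1, b2)

# Speed from (215), using (beta1 beta2 sin omega)^2 = beta1^2 beta2^2 - d^2
d = dot(b1, b2)
cross_sq = dot(b1, b1) * dot(b2, b2) - d * d
speed_215 = math.sqrt(dot(b1, b1) + dot(b2, b2) + 2.0 * d - cross_sq) / (1.0 + d)
speed = math.sqrt(dot(from_formula, from_formula))

# Strip off the boost factor: what remains should be a pure spatial rotation
R = matmul(M, boost([-x for x in from_matrix]))
block = [row[1:] for row in R[1:]]
gram = [[sum(block[k][i] * block[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]

print(from_matrix, from_formula)
print(speed, speed_215)
print(M[0][0], gamma(b1) * gamma(b2) * (1.0 + d))
print(M[0][1] - M[1][0])                     # nonzero: the product is not a pure boost
print(R[0], gram)
```

A pure boost matrix is symmetric, so the nonzero difference between the (0,1) and (1,0) entries of the product is a direct sign of the rotational factor.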

The presence of the R-factor on the right side of (213.2), i.e., the fact that
rotations arise when one composes non-collinear boosts, can be traced to the
following algebraic circumstance:

$$[\mathbb{J}_1, \mathbb{K}_2] = -\mathbb{J}_3 = -[\mathbb{J}_2, \mathbb{K}_1] \tag{217.1}$$

$$[\mathbb{K}_1, \mathbb{K}_2] = -\mathbb{K}_3 \tag{217.2}$$

$$[\mathbb{J}_1, \mathbb{J}_2] = +\mathbb{K}_3 \tag{217.3}$$

each of which remains valid under cyclic index permutation. Equations
(217.1) are but a rewrite of (212). The compositional closure (213.1) of the
rotations can be attributed to the fact that it is a $\mathbb{K}$ that stands on the right
side of (217.2). The fact (213.2) that the set of boosts is not compositionally
closed arises from the circumstance that it is again a $\mathbb{K}$ (not, as one might have
expected, a $\mathbb{J}$) that stands on the right side of (217.3).
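The commutation relations can be verified mechanically from the generator matrices. In the sketch below the matrices are read off from the exponent in (193), following the convention of these notes in which the J's multiply the time/space parameters $A_i$ and the K's multiply the space/space parameters $a_i$; the function names and the sample parameter values are hypothetical.

```python
def zeros():
    return [[0] * 4 for _ in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(4)] for i in range(4)]

def scale(s, A):
    return [[s * v for v in row] for row in A]

def add(*mats):
    return [[sum(m[i][j] for m in mats) for j in range(4)] for i in range(4)]

def J(i):
    """Time/space generator: coefficient of A_i in (193)."""
    M = zeros()
    M[0][i] = M[i][0] = 1
    return M

def K(i):
    """Space/space generator: coefficient of a_i in (193)."""
    j, k = {1: (2, 3), 2: (3, 1), 3: (1, 2)}[i]
    M = zeros()
    M[j][k], M[k][j] = 1, -1
    return M

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Basic relations and their cyclic permutations
for i, j, k in [(1, 2, 3), (2, 3, 1), (3, 1, 2)]:
    print(commutator(J(i), K(j)) == scale(-1, J(k)),
          commutator(K(i), K(j)) == scale(-1, K(k)),
          commutator(J(i), J(j)) == K(k))

# Equation (212) for sample integer parameters
A, a = (1, 2, 3), (-2, 0, 5)
bigJ = add(*[scale(A[i - 1], J(i)) for i in (1, 2, 3)])
bigK = add(*[scale(a[i - 1], K(i)) for i in (1, 2, 3)])
c = cross(A, a)
rhs = scale(-1, add(*[scale(c[i - 1], J(i)) for i in (1, 2, 3)]))
print(commutator(bigJ, bigK) == rhs)

# Parallel parameter vectors commute
parallelK = add(*[scale(2 * A[i - 1], K(i)) for i in (1, 2, 3)])
print(commutator(bigJ, parallelK) == zeros())
```

Integer entries are used throughout so that the comparisons are exact rather than approximate.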
The essential presence of the rotational R-factor on the right side of (214)
was discovered by L. H. Thomas (: relativity was then already 21 years
old), whose motivation was not mathematical/kinematic, but intensely physical:
Uhlenbeck & Goudsmit had sought () to derive ﬁne details of the hydrogen
spectrum from the assumption that the electron in the Bohr atom possesses
intrinsic “spin”. . . but had obtained results which were invariably oﬀ by a factor
of 2. Thomas—then a post-doctoral student at the Bohr Institute, and for
reasons to which I will return in a moment—speculated that a “relativistic
correction” would resolve that problem. Challenged by Bohr to develop the idea
(for which neither Bohr nor his associate Kramers held much hope), Thomas
“that weekend” argued as follows: (i) A proton, pinned to the origin of an
inertial frame, sees an electron to be revolving with angular velocity $\Omega_{\text{orbital}}$
on a circular Bohr orbit of radius R. (ii) Go to the frame of the non-inertial
observer who is “riding on the electron” (and therefore sees the proton to be in
circular motion): do this by
going to the frame of the inertial observer who is instantaneously
comoving with the electron at time $t_0 = 0$, then . . .
boosting to the frame of the inertial observer who is instantaneously
comoving with the electron at time $t_1 = \tau$, then . . .
boosting to the frame of the inertial observer who is instantaneously
comoving with the electron at time $t_2 = 2\tau$, then . . .
. . .
boosting to the frame of the inertial observer who is instantaneously
comoving with the electron at time $t = N\tau$

Figure 65: Thomas precession of the non-inertial frame of an
observer • in circular orbit about an inertial observer •. In celestial
mechanical applications the effect is typically so small (on the order
of seconds of arc per century) as to be obscured by dynamical effects.
But in the application to (pre-quantum mechanical) atomic physics
that was of interest to Thomas the precession becomes quite brisk
(on the order of $\sim 10^{12}$ Hz).

and by taking that procedure to the limit $\tau \downarrow 0$, $N = t/\tau \uparrow \infty$. One arrives thus
at a method for Lorentz transforming to the frame of an accelerated observer. The
curvature of the orbit means, however, that successive boosts are not collinear;
rotational factors intrude at each step, and have a cumulative effect which (as
detailed analysis128 shows) can be described

$$\begin{aligned}
\frac{d\varphi}{dt} \equiv \Omega_{\text{Thomas}} &= (\gamma - 1)\, \Omega_{\text{orbital}} \\
&= \tfrac{1}{2}\beta^2\, \Omega_{\text{orbital}} \bigl[ 1 + \tfrac{3}{4}\beta^2 + \tfrac{15}{24}\beta^4 + \cdots \bigr]
\end{aligned}$$

in the counterrotational sense (see the ﬁgure). It is important to notice that

this Thomas precessional eﬀect is of relativistic kinematic origin: it does not
128
See §103 in E. F. Taylor & J. A. Wheeler, Spacetime Physics () or pages
95–116 in the notes previously cited.122 Thomas’ own writing—“The motion of
the spinning electron,” Nature 117, 514 (1926); “The kinematics of an electron
with an axis,” Phil. Mag. 3, 1 (1927); “Recollections of the discovery of the
Thomas precessional frequency” in G. M. Bunce (editor), High Energy Spin
Physics–,AIP Conference Proceedings No. 95 (1983)—have never seemed
to me to be particularly clear. See also J. Frenkel, “Die Elektrodynamic des
rotierenden Elektrons,” Z. für Physik 37, 243 (1926).

arise from impressed forces. (iii ) Look now beyond the kinematics to the
dynamics: from •’s viewpoint the revolving • is, in effect, a current loop, the
generator of a magnetic field B . Uhlenbeck & Goudsmit had assumed that
the electron possesses a magnetic moment proportional to its postulated spin:
such an electron senses the B -field, to which it responds by precessing, acquiring
precessional energy EUhlenbeck & Goudsmit . Uhlenbeck & Goudsmit worked,
however, from a mistaken conception of “•’s viewpoint.” The point recognized
by Thomas is that when relativistic frame-precession is taken into account129
one obtains
EThomas = 12 EUhlenbeck & Goudsmit
—in good agreement with the spectroscopic data. This was a discovery of
historic importance, for it silenced those (led by Pauli) who had dismissed as
“too classical” the spin idea when it had been put forward by Krönig and
again, one year later, by Uhlenbeck & Goudsmit: “spin” became an accepted/
fundamental attribute of elementary particles.130
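
As a rough numerical companion to the precession formula $\Omega_{\text{Thomas}} = (\gamma - 1)\Omega_{\text{orbital}}$ quoted above, the following sketch compares the exact ratio with the three-term expansion in β. The function names and the sample speeds are illustrative assumptions, not taken from the text.

```python
import math

def thomas_ratio_exact(beta):
    """Omega_Thomas / Omega_orbital = gamma - 1, for beta = v/c with 0 <= beta < 1."""
    # gamma = (1 - beta^2)^(-1/2); expm1/log1p avoid cancellation at small beta.
    return math.expm1(-0.5 * math.log1p(-beta * beta))

def thomas_ratio_series(beta):
    """Three-term expansion: (1/2) beta^2 {1 + (3/4) beta^2 + (15/24) beta^4}."""
    b2 = beta * beta
    return 0.5 * b2 * (1.0 + 0.75 * b2 + (15.0 / 24.0) * b2 * b2)

if __name__ == "__main__":
    # Sample speeds are assumptions chosen only to show where the series is adequate.
    for beta in (0.0073, 0.1, 0.5):
        print(beta, thomas_ratio_exact(beta), thomas_ratio_series(beta))
```

At small β the two agree to many digits; the truncated series visibly underestimates the exact ratio once β is an appreciable fraction of 1.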
So much for the structure and properties of the Lorentz transformations
. . . to which (following more closely in Minkowski’s footsteps than Lorentz’) we
were led by analysis of the condition

$$\Lambda^{\mathsf T} g\,\Lambda = g \quad\text{everywhere} \tag{182}$$

which arose from one natural interpretation of the requirement that X → X̄

preserve the form of Maxwell’s equations . . . but to which Einstein himself
was led by quite other considerations: Einstein—recall his trains/clocks/rods
and lanterns—proceeded by operational/epistemological analysis of how inertial
observers O and Ō, consistently with the most primitive principles of an
idealized macroscopic physics, would establish the relationship between their
coordinate systems. Einstein’s argument was wonderfully original, and lent an
air of “inescapability” to his conclusions . . . but (in my view) must today be
dismissed as irrelevant, for special relativity appears to remain effective in the
129
See pages 116 –122 in elements of relativity ().
130
Thomas precession is a relativistic effect which 2 -dimensional theory is too
impoverished to expose. Einstein himself missed it, and—so far as I am aware—
never commented in print upon Thomas’ discovery. Nor is it mentioned in
Pauli’s otherwise wonderfully complete Theory of Relativity.125 In  I had
an opportunity to ask Thomas himself how he had come upon his essential
insight. He responded “Nothing is ever really new. I learned about the subject
from Eddington’s discussion [Eddington was in fact one of Thomas’ teachers] of
the relativistic dynamics of the moon—somewhere in his relativity book, which
was then new. I’m sure the whole business—except for the application to Bohr’s
atom—was known to Eddington by . Eddington was a smart man.” Arthur
Stanley Eddington’s The Mathematical Theory of Relativity () provided
the first English-language account of general relativity. The passage to which
Thomas evidently referred occurs in the middle of page 99 in the 2nd edition
(), and apparently was based upon then-recent work by W. De Sitter.

deep microscopic realm where Einstein’s operational devices/procedures (his

“trains and lanterns”) are—for quantum mechanical reasons—meaningless.
Einstein built better than he knew—or could know . . . but I’m ahead of my
story. The Lorentz transformations enter into the statement of—but do not in
and of themselves comprise—special relativity. The “meaning of relativity” is
a topic to which I will return in §8.

7. Conformal transformations in N-dimensional spacetime.* We have seen that

a second—and hardly less natural—interpretation of “Lorentz’ question” gives
rise not to (182) but to a condition of the form

$$W^{\mathsf T} g\, W = \Omega\, g \quad\text{everywhere} \tag{185.2}$$

where (as before)

$$g = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix}$$

My objective here is to describe the specific structure of the transformations
X → X̄ which arise from (185.2).
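We begin as we began on page 132 (though the argument will not lead
to a proof of enforced linearity). If (185.2) is written

$$g_{\alpha\beta}\, W^\alpha{}_\mu W^\beta{}_\nu = \Omega\, g_{\mu\nu} \tag{218}$$

then (since the elements of $g$ are constants) application of $\partial_\lambda$ gives

$$g_{\alpha\beta}\, W^\alpha{}_{\lambda\mu} W^\beta{}_\nu + g_{\alpha\beta}\, W^\alpha{}_\mu W^\beta{}_{\nu\lambda} = g_{\mu\nu}\,\Omega_\lambda \tag{219}$$

where $W^\alpha{}_{\lambda\mu} \equiv \partial_\lambda W^\alpha{}_\mu = \partial^2 \bar x^\alpha/\partial x^\lambda \partial x^\mu$ and $\Omega_\lambda \equiv \partial_\lambda \Omega$. Let functions $\Gamma_{\mu\nu\lambda}$
and $\varphi_\lambda \equiv \partial_\lambda\varphi$ be defined—deviously—as follows:

$$\Omega_\lambda \equiv 2\Omega\,\varphi_\lambda \tag{220}$$

$$g_{\alpha\beta}\, W^\alpha{}_\mu W^\beta{}_{\nu\lambda} \equiv \Omega\,\Gamma_{\mu\nu\lambda} \;:\; \nu\lambda\text{-symmetric} \tag{221}$$

Then (since the stipulated invertibility of X → X̄ entails $\sqrt{\Omega^{N}} = |W| \neq 0$) equation
(219) becomes

$$\Gamma_{\mu\nu\lambda} + \Gamma_{\nu\lambda\mu} = 2 g_{\mu\nu}\,\varphi_\lambda$$

which by the “cyclic permutation argument” encountered on page 132 gives

$$\Gamma_{\lambda\mu\nu} = g_{\lambda\mu}\varphi_\nu + g_{\lambda\nu}\varphi_\mu - g_{\mu\nu}\varphi_\lambda \tag{222}$$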

* It is the logic of the overall argument—certainly not pedagogical good

sense! —that has motivated me to introduce this material (which will not be
treated in lecture). First-time readers should skip directly to §8.
Conformal transformations 165

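Now

$$W^\alpha{}_{\mu\nu} = \Gamma_{\lambda\mu\nu} \cdot \Omega\, M^\lambda{}_\beta\, g^{\beta\alpha} \quad\text{by (221)}$$

in which $\Omega\, M^\lambda{}_\beta\, g^{\beta\alpha} = g^{\lambda\kappa} W^\alpha{}_\kappa$ by (218), so by (222)

$$W^\alpha{}_{\mu\nu} = \varphi_\mu W^\alpha{}_\nu + \varphi_\nu W^\alpha{}_\mu - g_{\mu\nu}\cdot g^{\lambda\kappa}\varphi_\lambda W^\alpha{}_\kappa \tag{223}$$

where the µν-symmetry is manifest. More compactly

$$W^\alpha{}_{\mu\nu} = \Gamma^\kappa{}_{\mu\nu}\, W^\alpha{}_\kappa \tag{224}$$

where

$$\Gamma^\kappa{}_{\mu\nu} \equiv g^{\kappa\lambda}\,\Gamma_{\lambda\mu\nu}$$

Application of $\partial_\lambda$ to (224) gives

$$W^\alpha{}_{\lambda\mu\nu} = \frac{\partial \Gamma^\kappa{}_{\mu\nu}}{\partial x^\lambda}\, W^\alpha{}_\kappa + \Gamma^\kappa{}_{\mu\nu}\, W^\alpha{}_{\lambda\kappa}$$

which (since $W^\alpha{}_{\mu\nu}$, $W^\alpha{}_{\lambda\mu\nu}$ and $\Gamma^\kappa{}_{\mu\nu}$ are symmetric in their subscripts, and after relabeling
some indices) can be written

$$W^\alpha{}_{\lambda\mu\nu} = \frac{\partial \Gamma^\beta{}_{\lambda\nu}}{\partial x^\mu}\, W^\alpha{}_\beta + \Gamma^\kappa{}_{\nu\lambda}\, W^\alpha{}_{\kappa\mu} \quad\text{with}\quad W^\alpha{}_{\kappa\mu} = \Gamma^\beta{}_{\kappa\mu}\, W^\alpha{}_\beta \text{ by (224)}$$

$$\phantom{W^\alpha{}_{\lambda\mu\nu}} = \left\{ \frac{\partial \Gamma^\beta{}_{\lambda\nu}}{\partial x^\mu} + \Gamma^\beta{}_{\kappa\mu}\,\Gamma^\kappa{}_{\nu\lambda} \right\} W^\alpha{}_\beta$$

from which it follows in particular that

$$W^\alpha{}_{\lambda\mu\nu} - W^\alpha{}_{\lambda\nu\mu} = \left\{ \frac{\partial \Gamma^\beta{}_{\lambda\nu}}{\partial x^\mu} - \frac{\partial \Gamma^\beta{}_{\lambda\mu}}{\partial x^\nu} + \Gamma^\beta{}_{\kappa\mu}\,\Gamma^\kappa{}_{\nu\lambda} - \Gamma^\beta{}_{\kappa\nu}\,\Gamma^\kappa{}_{\mu\lambda} \right\} W^\alpha{}_\beta \equiv R^\beta{}_{\lambda\mu\nu}\, W^\alpha{}_\beta \tag{225}$$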
The preceding sequence of manipulations will, I fear, strike naive readers as an
unmotivated jumble. But those with some familiarity with patterns of argument
standard to differential geometry will have recognized that
• the quantities $W^\alpha{}_\mu$ transform as components of an α-parameterized set
of covariant vectors;
• the quantities $\Gamma^\kappa{}_{\mu\nu}$ are components of 131 an affine connection to which
(222) assigns a specialized structure;
• the α-parameterized equations (224) can be notated
$$D_\nu W^\alpha{}_\mu \equiv \partial_\nu W^\alpha{}_\mu - W^\alpha{}_\kappa\, \Gamma^\kappa{}_{\mu\nu} = 0$$
according to which each of the vectors $W^\alpha{}_\mu$ has the property that its
covariant derivative 129 vanishes;
• the 4th rank tensor $R^\beta{}_{\lambda\mu\nu}$ defined at (225) is just the Riemann-Christoffel
curvature tensor,129 to which a specialized structure has in this instance
been assigned by (222).
131
See again page 123.

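But of differential geometry I will make explicit use only in the following—
independently verifiable—facts: let

$$R_{\kappa\lambda\mu\nu} \equiv g_{\kappa\beta}\, R^\beta{}_{\lambda\mu\nu}$$

Then—owing entirely to (i) the definition of $R^\beta{}_{\lambda\mu\nu}$ and (ii) the µν-symmetry
of $\Gamma^\beta{}_{\mu\nu}$—the tensor $R_{\kappa\lambda\mu\nu}$ possesses the following symmetry properties:

$$\begin{aligned}
R_{\kappa\lambda\mu\nu} &= -R_{\kappa\lambda\nu\mu} &&: \text{antisymmetry on the last pair of indices}\\
&= -R_{\lambda\kappa\mu\nu} &&: \text{antisymmetry on the first pair of indices}\\
&= +R_{\mu\nu\kappa\lambda} &&: \text{supersymmetry}\\
R_{\kappa\lambda\mu\nu} + R_{\kappa\mu\nu\lambda} + R_{\kappa\nu\lambda\mu} &= 0 &&: \text{windmill symmetry}
\end{aligned}$$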

These serve to reduce the number of independent components from $N^4$ to
$\tfrac{1}{12} N^2 (N^2 - 1)$:

    N      N^4     N^2(N^2 − 1)/12

    1        1        0
    2       16        1
    3       81        6
    4      256       20
    5      625       50
    6     1296      105
    ⋮        ⋮        ⋮
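
The count in the table can be cross-checked by brute force. The sketch below (an illustration, not the author's method) imposes the four listed symmetries as linear constraints on the $N^4$ components and counts the free components that survive exact Gaussian elimination; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def count_by_formula(dim):
    """N^2 (N^2 - 1) / 12, which is always an integer."""
    return dim * dim * (dim * dim - 1) // 12

def rank_of(rows):
    """Rank of a sparse linear system; each row is a dict {column: coefficient}."""
    pivots = {}  # leading column -> row normalised to leading coefficient 1
    for row in rows:
        row = dict(row)
        while row:
            col = min(row)
            if col not in pivots:
                lead = row[col]
                pivots[col] = {c: v / lead for c, v in row.items()}
                break
            factor = row[col]
            for c, v in pivots[col].items():
                new = row.get(c, 0) - factor * v
                if new == 0:
                    row.pop(c, None)
                else:
                    row[c] = new
    return len(pivots)

def count_by_elimination(dim):
    """Number of components left free by the four symmetry conditions."""
    index = {t: i for i, t in enumerate(product(range(dim), repeat=4))}
    rows = []

    def constraint(*terms):
        row = {}
        for sign, t in terms:
            row[index[t]] = row.get(index[t], 0) + sign
        row = {c: Fraction(v) for c, v in row.items() if v != 0}
        if row:
            rows.append(row)

    for k, l, m, n in index:
        constraint((1, (k, l, m, n)), (1, (k, l, n, m)))    # antisymmetry, last pair
        constraint((1, (k, l, m, n)), (1, (l, k, m, n)))    # antisymmetry, first pair
        constraint((1, (k, l, m, n)), (-1, (m, n, k, l)))   # pair-exchange symmetry
        constraint((1, (k, l, m, n)), (1, (k, m, n, l)), (1, (k, n, l, m)))  # windmill
    return dim ** 4 - rank_of(rows)

if __name__ == "__main__":
    for dim in (1, 2, 3):
        print(dim, dim ** 4, count_by_formula(dim), count_by_elimination(dim))
```

Exact rational arithmetic is used so that the rank is not disturbed by rounding; the elimination is kept to small N because the number of constraints grows like $N^4$.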

We will, in particular, need to know that in the 2 -dimensional case the only
non-vanishing components of Rκλµν are

R0101 = −R0110 = −R1001 = +R1010

Returning now to the analytical mainstream. . .

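The left side of (225) vanishes automatically, and from the invertibility of
W we infer that

$$R_{\kappa\lambda\mu\nu} = 0 \tag{226}$$

Introducing (222) into (225) we find (after some calculation marked by a great
deal of cancellation) that $R_{\kappa\lambda\mu\nu}$ has the correspondingly specialized structure

$$R_{\kappa\lambda\mu\nu} = g_{\kappa\nu}\Phi_{\lambda\mu} - g_{\kappa\mu}\Phi_{\lambda\nu} - g_{\lambda\nu}\Phi_{\kappa\mu} + g_{\lambda\mu}\Phi_{\kappa\nu} \tag{227}$$

where

$$\Phi_{\lambda\mu} \equiv \varphi_{\lambda\mu} - \varphi_\lambda\varphi_\mu + \tfrac12 g_{\lambda\mu}\cdot(g^{\alpha\beta}\varphi_\alpha\varphi_\beta) \tag{228}$$

$$\varphi_{\lambda\mu} \equiv \partial\varphi_\lambda/\partial x^\mu = \partial^2\varphi/\partial x^\lambda \partial x^\mu = \varphi_{\mu\lambda}$$

entail $\Phi_{\lambda\mu} = \Phi_{\mu\lambda}$. It follows now from (227) that

$$R_{\lambda\mu} \equiv R^\alpha{}_{\lambda\mu\alpha} = (N - 2)\,\Phi_{\lambda\mu} + g_{\lambda\mu}\cdot(g^{\alpha\beta}\Phi_{\alpha\beta}) \tag{229.1}$$

$$R \equiv R^\beta{}_\beta = 2(N - 1)\cdot g^{\alpha\beta}\Phi_{\alpha\beta} \tag{229.2}$$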

must—in consequence of (226)—both vanish:

Rλµ = 0 (230.1)
R=0 (230.2)

In the case N = 2 the equations (230) are seen to reduce to a solitary

condition
g αβ Φαβ = 0 (231)

which in cases N > 2 becomes a corollary of the stronger condition

Φαβ = 0 (232)

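This is the conformality condition from which we will work. When introduced
into (227) it renders (226) automatic.132
Note that (220) can be written $\partial_\lambda\varphi \equiv \varphi_\lambda = \partial_\lambda \log\sqrt{\Omega}$ and entails

$$\varphi = \log\sqrt{\Omega} + \text{constant}$$

Returning with this information to (228), the conformality condition (232)
becomes

$$\frac{\partial^2 \log\sqrt{\Omega}}{\partial x^\mu \partial x^\nu} - \frac{\partial \log\sqrt{\Omega}}{\partial x^\mu}\frac{\partial \log\sqrt{\Omega}}{\partial x^\nu} + \tfrac12 g_{\mu\nu}\cdot g^{\alpha\beta}\frac{\partial \log\sqrt{\Omega}}{\partial x^\alpha}\frac{\partial \log\sqrt{\Omega}}{\partial x^\beta} = 0$$

which—if we introduce

$$F \equiv \frac{1}{\sqrt{\Omega}} \tag{233}$$

—can be written

$$\frac{\partial^2 \log F}{\partial x^\mu \partial x^\nu} = \tfrac12 g_{\mu\nu}\cdot g^{\alpha\beta}\frac{\partial \log F}{\partial x^\alpha}\frac{\partial \log F}{\partial x^\beta} - \frac{\partial \log F}{\partial x^\mu}\frac{\partial \log F}{\partial x^\nu}$$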
132
When N = 2 one must, on the other hand, proceed from (231). It is
therefore of interest that (231) and (226) are—uniquely in the case N = 2
—notational variants of the same statement . . . for

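$$\begin{aligned}
R_{0101} &= \text{only independent element}\\
&= g_{01}\Phi_{10} - g_{00}\Phi_{11} - g_{11}\Phi_{00} + g_{10}\Phi_{01} \quad\text{by (227)}\\
&= -g\cdot g^{\alpha\beta}\Phi_{\alpha\beta} \quad\text{by}\quad \begin{pmatrix} g^{00} & g^{01}\\ g^{10} & g^{11} \end{pmatrix} = g^{-1}\cdot\begin{pmatrix} g_{11} & -g_{01}\\ -g_{10} & g_{00} \end{pmatrix}
\end{aligned}$$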

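We write out the derivatives and obtain these simpler-looking statements

$$F_{\mu\nu} = g_{\mu\nu}\cdot\left\{\frac{g^{\alpha\beta} F_\alpha F_\beta}{2F}\right\} \tag{234}$$

where $F_\mu \equiv \partial_\mu F$ and $F_{\mu\nu} \equiv \partial_\mu\partial_\nu F$. The implication is that

$$\partial_\nu (g^{\lambda\mu} F_\mu) = g^{\lambda\mu} F_{\mu\nu} = \tfrac12\left\{\frac{g^{\alpha\beta} F_\alpha F_\beta}{F}\right\}\delta^\lambda{}_\nu \;:\; \text{vanishes unless } \nu = \lambda$$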

which is to say: $g^{\lambda\mu} F_\mu$ is a function only of $x^\lambda$. But $g$ is, by initial assumption,
a constant diagonal matrix, so we have

$F_\mu$ is a function only of $x^\mu$, and so are all of its derivatives $F_{\mu\nu}$

Returning with this information to (234), we are brought to the conclusion that
the expression {etc.} is a function only of $x^0$, only of $x^1$, . . . ; that it is, in short,
a constant (call it 2C), and that (234) can be written

$$F_{\mu\nu} = 2C g_{\mu\nu}$$

giving

$$F = C g_{\alpha\beta}\, x^\alpha x^\beta - 2 b_\alpha x^\alpha + A = C\cdot(x, x) - 2(b, x) + A \tag{235}$$

where $b_\alpha$ and $A$ are constants of integration. Returning with this information
to (234) we obtain

$$4CF = g^{\alpha\beta} F_\alpha F_\beta = g^{\alpha\beta}(2C x_\alpha - 2 b_\alpha)(2C x_\beta - 2 b_\beta) = 4C\left[ C\cdot(x, x) - 2(b, x) + \frac{(b, b)}{C} \right]$$

the effect of which, upon comparison with (235), is to constrain the constants
$\{A, b_\alpha, C\}$ to satisfy

$$AC = (b, b)$$

This we accomplish by setting $C = (b, b)/A$, giving

$$F = A - 2(b, x) + \frac{(b, b)(x, x)}{A} \;:\; A \text{ and } b_\alpha \text{ now unconstrained}$$

Finally we introduce $a_\alpha \equiv b_\alpha/A$ to obtain the pretty result

$$F = A\left[ 1 - 2(a, x) + (a, a)(x, x) \right] \tag{236}$$
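
A quick numerical sanity check of (236) against the condition (234) can be sketched as follows. The derivatives are worked out by hand from (236); the function names, the random sample ranges, and the choice of four dimensions with metric diag(1, −1, −1, −1) are assumptions made for illustration.

```python
import random

G = (1.0, -1.0, -1.0, -1.0)  # diagonal entries of g (its inverse has the same diagonal)

def inner(u, v):
    """Lorentz inner product (u, v) = g_ab u^a v^b."""
    return sum(g * p * q for g, p, q in zip(G, u, v))

def F(x, A, a):
    """F = A [1 - 2(a, x) + (a, a)(x, x)], as in (236)."""
    return A * (1.0 - 2.0 * inner(a, x) + inner(a, a) * inner(x, x))

def first_derivatives(x, A, a):
    """F_mu = dF/dx^mu, differentiated by hand from (236)."""
    aa = inner(a, a)
    return [A * G[m] * (-2.0 * a[m] + 2.0 * aa * x[m]) for m in range(4)]

def second_derivatives(A, a):
    """F_mu_nu = 2 A (a, a) g_mu_nu, which does not depend on x."""
    aa = inner(a, a)
    return [[2.0 * A * aa * G[m] if m == n else 0.0 for n in range(4)]
            for m in range(4)]

def residual_234(x, A, a):
    """Largest entry of F_mu_nu - g_mu_nu * (g^ab F_a F_b) / (2F)."""
    Fm = first_derivatives(x, A, a)
    scale = sum(G[m] * Fm[m] ** 2 for m in range(4)) / (2.0 * F(x, A, a))
    Fmn = second_derivatives(A, a)
    return max(abs(Fmn[m][n] - (scale * G[m] if m == n else 0.0))
               for m in range(4) for n in range(4))

if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1, 1) for _ in range(4)]
    a = [rng.uniform(-0.05, 0.05) for _ in range(4)]
    print(residual_234(x, 3.0, a))
```

The residual should sit at rounding level for any point where F does not vanish.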

The conformal transformations X̄ ← X have yet to be described, but we
now know this about W, the Jacobian of such a transformation:

$$\Omega = W^{2/N} = \frac{1}{F^2} = \frac{1}{A^2\,[1 - 2(a, x) + (a, a)(x, x)]^2} \tag{237}$$
Clearly, tensor weight distinctions do not become moot in the context provided
by the conformal group, as they did (to within signs) in connection with the
Lorentz group.
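To get a handle on the functions $\bar x^\alpha(x)$ that describe specific conformal
transformations X̄ ← X we introduce

$$\partial_\mu\varphi \equiv \varphi_\mu = \partial_\mu \log\sqrt{\Omega} = -\partial_\mu \log F = -\frac{1}{F} F_\mu$$

into (223) to obtain

$$F\, W^\alpha{}_{\mu\nu} + F_\mu W^\alpha{}_\nu + F_\nu W^\alpha{}_\mu = g_{\mu\nu}\cdot g^{\lambda\kappa} F_\lambda W^\alpha{}_\kappa$$

or again (use $W^\alpha{}_\mu = \partial \bar x^\alpha/\partial x^\mu$)

$$(F \bar x^\alpha)_{\mu\nu} = F_{\mu\nu}\, \bar x^\alpha + g_{\mu\nu}\cdot g^{\lambda\kappa} F_\lambda W^\alpha{}_\kappa \tag{238}$$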

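To eliminate some subsequent clutter we agree to translate from $\bar x$-coordinates to
y-coordinates whose origin coincides with that of the x-coordinate system: we
write

$$\bar x^\alpha(x) = y^\alpha(x) + K t^\alpha \quad\text{with}\quad K \equiv A^{-1}$$

and achieve $y^\alpha(0) = 0$ by setting $K t^\alpha \equiv \bar x^\alpha(0)$. Clearly, if the functions $\bar x^\alpha(x)$
satisfy (238) then so also do the functions $y^\alpha(x)$, and conversely. We change
dependent variables now once again, writing

$$F y^\alpha \equiv z^\alpha$$

Then $y^\alpha{}_\mu = -\frac{1}{F^2} F_\mu z^\alpha + \frac{1}{F} z^\alpha{}_\mu$ and (238) assumes the form

$$z^\alpha{}_{\mu\nu} = \frac{1}{F}\left\{ \left[ F_{\mu\nu} - g_{\mu\nu}\cdot\frac{g^{\lambda\kappa} F_\lambda F_\kappa}{F} \right] z^\alpha + g_{\mu\nu}\cdot g^{\lambda\kappa} F_\lambda z^\alpha{}_\kappa \right\}$$

It follows, however, from the previously established structure of F that the
bracketed factor is

$$F_{\mu\nu} - g_{\mu\nu}\cdot\frac{g^{\lambda\kappa} F_\lambda F_\kappa}{F} = -F_{\mu\nu} = -2C g_{\mu\nu}$$

so

$$z^\alpha{}_{\mu\nu} = g_{\mu\nu}\cdot\frac{1}{F}\left\{ -2C z^\alpha + g^{\lambda\kappa} F_\lambda z^\alpha{}_\kappa \right\} \tag{239}$$

Each of these α-parameterized equations is structurally analogous to (234), and
the argument that gave (235) now gives

$$z^\alpha(x) = P^\alpha\cdot(x, x) + \Lambda^\alpha{}_\beta\, x^\beta + \left\{ \text{now no } x\text{-independent term because } y(0) = 0 \Rightarrow z(0) = 0 \right\}$$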

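Returning with this population of results to (239) we obtain

$$2P^\alpha\left[ C(x, x) - 2(b, x) + A \right] = -2C\left[ P^\alpha (x, x) + \Lambda^\alpha{}_\beta\, x^\beta \right] + \left( 2C x^\beta - 2 b^\beta \right)\left( 2P^\alpha x_\beta + \Lambda^\alpha{}_\beta \right)$$

—the effect of which (after much cancellation) is to constrain the constants $P^\alpha$
and $\Lambda^\alpha{}_\beta$ to satisfy $P^\alpha = -\frac{1}{A}\Lambda^\alpha{}_\beta\, b^\beta = -\Lambda^\alpha{}_\beta\, a^\beta$. Therefore

$$z^\alpha(x) = \Lambda^\alpha{}_\beta\left[ x^\beta - (x, x)\, a^\beta \right]$$

Reverting to y-variables this becomes

$$y^\alpha(x) = K\,\frac{\Lambda^\alpha{}_\beta\left[ x^\beta - (x, x)\, a^\beta \right]}{1 - 2(a, x) + (a, a)(x, x)}$$

so in $\bar x$-variables—the variables of primary interest—we have

$$\bar x^\alpha(x) = K\left\{ t^\alpha + \frac{\Lambda^\alpha{}_\beta\left[ x^\beta - (x, x)\, a^\beta \right]}{1 - 2(a, x) + (a, a)(x, x)} \right\} \tag{240}$$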

Finally we set K = 1 and aα = 0 (all α) which by (237) serve to establish

Ω = 1. But in that circumstance (240) assumes the simple form

↓
= Λα β xβ

and the equation (185.2) that served as our point of departure becomes
$\Lambda^{\mathsf T} g\,\Lambda = g$, from which we learn that the $\Lambda^\alpha{}_\beta$ must be elements of a Lorentz
matrix.
Transformations of the form (240) have been of interest to mathematicians
since the latter part of the 19th Century. Details relating to the derivation
of (240) by iteration of inﬁnitesimal conformal transformations were worked
out by S. Lie, and are outlined on pages 28–32 of J. E. Campbell’s Theory of
Continuous Groups (). The ﬁnitistic argument given above—though in a
technical sense “elementary”—shows the toolmarks of a master’s hand, and is in
fact due (in essential outline) to H. Weyl (). I have borrowed most directly
from V. Fock, The Theory of Space, Time & Gravitation (), Appendix A:
“On the derivation of the Lorentz transformations.”
Equation (240) describes—for N ≠ 2—the most general N-dimensional
conformal transformation, and can evidently be considered to arise by
composition from the following:

$$\begin{aligned}
\text{Lorentz transformation} &: \quad x \to \bar x = \Lambda x &&(241.1)\\
\text{Translation} &: \quad x \to \bar x = x + t &&(241.2)\\
\text{Dilation} &: \quad x \to \bar x = K x &&(241.3)\\
\text{Möbius transformation} &: \quad x \to \bar x = \frac{x - (x, x)\,a}{1 - 2(a, x) + (a, a)(x, x)} &&(241.4)
\end{aligned}$$
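
The claim that (241.4) is conformal can be probed numerically. The following sketch, under the assumption of four dimensions with metric diag(1, −1, −1, −1), estimates the Jacobian matrix of the Möbius map by central differences and compares $W^{\mathsf T} g W$ with $\Omega g$, taking $\Omega = [1 - 2(a,x) + (a,a)(x,x)]^{-2}$ from (237) with A = 1. Function names, the step size, and the sample point are illustrative choices.

```python
G = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Lorentz metric g

def inner(u, v):
    """Lorentz inner product (u, v)."""
    return sum(g * p * q for g, p, q in zip(G, u, v))

def denominator(x, a):
    return 1.0 - 2.0 * inner(a, x) + inner(a, a) * inner(x, x)

def mobius(x, a):
    """The Möbius transformation (241.4)."""
    xx = inner(x, x)
    d = denominator(x, a)
    return [(x[m] - xx * a[m]) / d for m in range(4)]

def jacobian(f, x, h=1e-6):
    """W[alpha][mu] = d f^alpha / d x^mu, estimated by central differences."""
    W = [[0.0] * 4 for _ in range(4)]
    for mu in range(4):
        up, down = list(x), list(x)
        up[mu] += h
        down[mu] -= h
        f_up, f_down = f(up), f(down)
        for alpha in range(4):
            W[alpha][mu] = (f_up[alpha] - f_down[alpha]) / (2.0 * h)
    return W

def conformality_residual(x, a):
    """Largest entry of W^T g W - Omega g at the point x."""
    W = jacobian(lambda p: mobius(p, a), x)
    omega = 1.0 / denominator(x, a) ** 2
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            lhs = sum(G[al] * W[al][mu] * W[al][nu] for al in range(4))
            rhs = omega * G[mu] if mu == nu else 0.0
            worst = max(worst, abs(lhs - rhs))
    return worst

if __name__ == "__main__":
    print(conformality_residual([0.3, -0.2, 0.5, 0.1], [0.2, 0.1, -0.3, 0.05]))
```

Because the Jacobian is only a finite-difference estimate, the residual is limited by the step size rather than by machine precision.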

To specify such a transformation one must assign values to

$$\tfrac12 N(N - 1) + N + 1 + N = \tfrac12 (N + 2)(N + 1)$$

adjustable parameters $t^\alpha$, $K$, $a^\alpha$ and the elements of log Λ, the physical
dimensionalities of which are diverse but obvious. The associated numerology
is summarized below:

    N      (N + 2)(N + 1)/2

    1        3
    2        6 + ∞
    3       10
    4       15
    5       21
    6       28
    ⋮        ⋮
Concerning the entry at N = 2: equation (240) makes perfect sense in the
case N = 2, and that case provides a diagrammatically convenient context
within which to study the meaning of (240) in the general case. But (240) was
derived from (232), which was seen on page 167 to be stronger than the condition
(231) appropriate to the 2-dimensional case. The weakened condition requires
alternative analysis,133 and admits of more possibilities—actually infinitely
many more, corresponding roughly to the infinitely many ways of selecting
f(z) in the theory of conformal transformations as it is encountered in complex
function theory.134 I do not pursue the topic because the physics of interest to
us is inscribed (as are we) on 4-dimensional spacetime.
Some of the mystery which surrounds the Möbius transformations—which
are remarkable for their nonlinearity—is removed by the remark that they can
be assembled from translations and “inversions,” where the latter are defined
as follows:

$$\text{Inversion} : \quad x \to \bar x = \mu^2\,\frac{x}{(x, x)} \tag{241.5}$$

Here $\mu^2$ is a constant of arbitrary value, introduced mainly for dimensional
reasons. The proof is by construction:

$$\begin{aligned}
x \xrightarrow{\ \text{inversion}\ }\ \bar x &= \mu^2\, x/(x, x)\\
\xrightarrow{\ \text{translation with } t = -\mu^2 a\ }\ \bar{\bar x} &= \bar x - \mu^2 a\\
\xrightarrow{\ \text{inversion}\ }\ \bar{\bar{\bar x}} &= \mu^2\, \bar{\bar x}/(\bar{\bar x}, \bar{\bar x})\\
&= \frac{x - (x, x)\,a}{1 - 2(a, x) + (a, a)(x, x)}
\end{aligned} \tag{242}$$
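
The construction (242) is easy to replay numerically. This sketch composes inversion, translation by $-\mu^2 a$, and inversion again, and compares the result with the closed Möbius form; it assumes four dimensions with metric diag(1, −1, −1, −1), and the function names and sample vectors are illustrative.

```python
G = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Lorentz metric g

def inner(u, v):
    """Lorentz inner product (u, v)."""
    return sum(g * p * q for g, p, q in zip(G, u, v))

def inversion(x, mu2):
    """x -> mu^2 x / (x, x); undefined on the lightcone (x, x) = 0."""
    xx = inner(x, x)
    return [mu2 * c / xx for c in x]

def translate(x, t):
    return [c + d for c, d in zip(x, t)]

def mobius(x, a):
    """Closed form (241.4)."""
    xx = inner(x, x)
    d = 1.0 - 2.0 * inner(a, x) + inner(a, a) * xx
    return [(x[m] - xx * a[m]) / d for m in range(4)]

def mobius_by_composition(x, a, mu2):
    """inversion, then translation by -mu^2 a, then inversion, as in (242)."""
    step1 = inversion(x, mu2)
    step2 = translate(step1, [-mu2 * c for c in a])
    return inversion(step2, mu2)

def max_difference(x, a, mu2):
    direct = mobius(x, a)
    composed = mobius_by_composition(x, a, mu2)
    return max(abs(p - q) for p, q in zip(direct, composed))

if __name__ == "__main__":
    x = [0.3, -0.2, 0.5, 0.1]
    a = [0.2, 0.1, -0.3, 0.05]
    for mu2 in (1.0, 2.5, -4.0):
        print(mu2, max_difference(x, a, mu2))
```

The result does not depend on the value chosen for $\mu^2$, in keeping with the remark that the constant is there mainly for dimensional reasons.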

133
The problem is discussed in my transformational physics of waves
( –).
134
See again page 129.

Inversion—which
• admits readily of geometrical interpretation (as a kind of “radial reflection”
in the isometric surface (x, x) = µ2 )
• can be looked upon as the ultimate source of the nonlinearity which is
perhaps the most striking feature of the conformal transformations (240)
—is one of the sharpest tools available to the conformal theorist, so I digress
to examine some of its properties:
We have, in effect, already shown (at (242): set a = 0) that inversion
is—like every kind of “reflection”—self-reciprocal:

$$(\text{inversion})\cdot(\text{inversion}) = \text{identity} \tag{243}$$

That inversion is conformal in the sense “angle-preserving” can be established
as follows: let $\bar x$ and $\bar y$ be the inversive images of $x$ and $y$. Then

$$(\bar x, \bar y) = \mu^4\,\frac{(x, y)}{(x, x)(y, y)}$$

shows that inversion does not preserve inner products. But immediately

$$\frac{(\bar x, \bar y)}{\sqrt{(\bar x, \bar x)(\bar y, \bar y)}} = \frac{(x, y)}{\sqrt{(x, x)(y, y)}} \tag{244}$$

which is to say:

$$\overline{\text{angle}} = \text{angle}$$
Inversion, since conformal, must be describable in terms of the primitive
transformations listed at (241). How is that to be accomplished? We notice that
each of those transformations—with the sole exception of the improper Lorentz
transformations—is continuous with the identity (which arises at Λ = I, at
t = 0, at K = 1, at a = 0). Evidently improper Lorentz transformations—in a
word: reflections—must enter critically into the fabrication of inversion, and it
is this observation that motivates the following short digression: For arbitrary
non-null $a^\mu$ we can always write

$$x = \left[ x - \frac{(x, a)}{(a, a)}\,a \right] + \frac{(x, a)}{(a, a)}\,a \equiv x_\perp + x_\parallel$$

which serves to resolve $x^\mu$ into components parallel/normal to $a^\mu$. It becomes
in this light natural to define

$$a\text{-reflection} : \quad x = x_\perp + x_\parallel \;\longrightarrow\; \hat x = x_\perp - x_\parallel = x - 2\,\frac{(x, a)}{(a, a)}\,a \tag{245}$$

and to notice that (by quick calculation)

$$(\hat x, \hat y) = (x, y) \;:\; a\text{-reflection is inner-product preserving}$$


This simple fact leads us to notice that (245) can be written

$$\hat x = \Lambda x \quad\text{with}\quad \Lambda \equiv \|\Lambda^\mu{}_\nu\| = \|\delta^\mu{}_\nu - 2(a, a)^{-1} a^\mu a_\nu\|$$

where a brief calculation (examine $\Lambda^\alpha{}_\mu\, g_{\alpha\beta}\, \Lambda^\beta{}_\nu$) establishes that Λ is a Lorentz
matrix with (according to Mathematica) det Λ = −1. In short:

a-reflections are improper Lorentz transformations (246)
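
Statement (246) can be checked for any particular non-null vector. The sketch below builds the matrix $\delta^\mu{}_\nu - 2(a,a)^{-1}a^\mu a_\nu$ for one sample $a$ (an arbitrary illustrative choice), then tests the Lorentz condition and evaluates the determinant by cofactor expansion. Function names are hypothetical.

```python
G = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Lorentz metric g

def inner(u, v):
    """Lorentz inner product (u, v)."""
    return sum(g * p * q for g, p, q in zip(G, u, v))

def reflection_matrix(a):
    """L[mu][nu] = delta^mu_nu - 2 a^mu a_nu / (a, a), with a_nu = g_nu_nu a^nu."""
    aa = inner(a, a)
    return [[(1.0 if m == n else 0.0) - 2.0 * a[m] * G[n] * a[n] / aa
             for n in range(4)] for m in range(4)]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def lorentz_residual(L):
    """Largest entry of L^T g L - g."""
    worst = 0.0
    for m in range(4):
        for n in range(4):
            s = sum(L[al][m] * G[al] * L[al][n] for al in range(4))
            worst = max(worst, abs(s - (G[m] if m == n else 0.0)))
    return worst

if __name__ == "__main__":
    L = reflection_matrix([0.2, 0.1, -0.3, 0.05])
    print(lorentz_residual(L), det(L))
```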

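Thus prepared, we are led after a little exploratory tinkering to the following
sequence of transformations:

$$\begin{aligned}
x \xrightarrow{\ \text{translation}\ }\ \bar x &= x - \frac{1}{(a, a)}\,a\\
\xrightarrow{\ \text{reflection}\ }\ \bar{\bar x} &= \bar x - 2\,\frac{(\bar x, a)}{(a, a)}\,a\\
\xrightarrow{\ \text{Möbius}\ }\ \bar{\bar{\bar x}} &= \frac{\bar{\bar x} - (\bar{\bar x}, \bar{\bar x})\,a}{1 - 2(a, \bar{\bar x}) + (a, a)(\bar{\bar x}, \bar{\bar x})}\\
&\;\;\vdots\ \ \text{algebraic simplification}\\
&= \frac{1}{(a, a)}\left[ \frac{x}{(x, x)} - a \right]\\
\xrightarrow{\ \text{reverse translation}\ }\ \bar{\bar{\bar{\bar x}}} &= \bar{\bar{\bar x}} + \frac{1}{(a, a)}\,a\\
&= \mu^2\,\frac{x}{(x, x)} \quad\text{with}\quad \mu^2 \equiv (a, a)^{-1}
\end{aligned}$$

The preceding equations make precise the sense in which

$$\text{inversion} = (\text{translation})^{-1}\cdot(\text{Möbius})\cdot(\text{reflection})\cdot(\text{translation}) \tag{247}$$

and confirm the conclusion reached already at (244): inversion is conformal.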
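
The factorization (247) can be replayed step by step. This sketch applies the translation, the a-reflection, the Möbius map and the reverse translation in turn and compares the outcome with a direct inversion using $\mu^2 = (a,a)^{-1}$. The metric diag(1, −1, −1, −1), the function names, and the sample vectors are assumptions made for illustration.

```python
G = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Lorentz metric g

def inner(u, v):
    """Lorentz inner product (u, v)."""
    return sum(g * p * q for g, p, q in zip(G, u, v))

def reflect(x, a):
    """a-reflection (245): x -> x - 2 (x, a) a / (a, a)."""
    k = 2.0 * inner(x, a) / inner(a, a)
    return [c - k * d for c, d in zip(x, a)]

def mobius(x, a):
    """Möbius transformation (241.4)."""
    xx = inner(x, x)
    d = 1.0 - 2.0 * inner(a, x) + inner(a, a) * xx
    return [(x[m] - xx * a[m]) / d for m in range(4)]

def inversion(x, mu2):
    """Inversion (241.5)."""
    xx = inner(x, x)
    return [mu2 * c / xx for c in x]

def chain_247(x, a):
    """translation, reflection, Möbius, reverse translation, in that order."""
    shift = [c / inner(a, a) for c in a]
    step1 = [c - d for c, d in zip(x, shift)]
    step2 = reflect(step1, a)
    step3 = mobius(step2, a)
    return [c + d for c, d in zip(step3, shift)]

def max_difference(x, a):
    direct = inversion(x, 1.0 / inner(a, a))
    chained = chain_247(x, a)
    return max(abs(p - q) for p, q in zip(direct, chained))

if __name__ == "__main__":
    print(max_difference([0.3, -0.2, 0.5, 0.1], [0.2, 0.1, -0.3, 0.05]))
```

Both $a$ and $x$ must be non-null for every step to be defined.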

Finally, if one were to attempt direct evaluation of the Jacobian W of the
general conformal transformation (240)—thus to confirm the upshot

$$W = \pm K^N \left[ \frac{1}{1 - 2(a, x) + (a, a)(x, x)} \right]^N$$

of (237)—one would discover soon enough that one had a job on one’s hands!
But the result in question can be obtained as an easy consequence of the
following readily-established statements:

$$W_{\text{inversion}} = -\mu^{2N}\,\frac{1}{(x, x)^N} \tag{248.1}$$
$$W_{\text{Lorentz}} = \pm 1 \tag{248.2}$$
$$W_{\text{translation}} = 1 \tag{248.3}$$
$$W_{\text{dilation}} = K^N \tag{248.4}$$

It follows in particular from (242) that

$$W_{\text{Möbius}} = (-)^2\, \mu^{2N}\frac{1}{(\bar{\bar x}, \bar{\bar x})^N}\cdot 1\cdot \mu^{2N}\frac{1}{(x, x)^N} \quad\text{with}\quad \bar{\bar x} = \mu^2\left[ \frac{x}{(x, x)} - a \right]$$

$$\phantom{W_{\text{Möbius}}} = \left[ \frac{1}{1 - 2(a, x) + (a, a)(x, x)} \right]^N \tag{248.5}$$

We are familiar with the fact that specialized Lorentz transformations serve
to boost one to the frame of an observer O in uniform motion. I discuss now
a related fact with curious electrodynamic implications: specialized Möbius
transformations serve to boost one to the frame of a uniformly accelerated
observer. From (241.4) we infer that $a_\mu$ has the dimensionality of reciprocal
length, so

$$\tfrac12 g_\mu \equiv c^2 a_\mu \quad\text{is dimensionally an “acceleration”}$$

and in this notation (241.4) reads

$$x^\mu \to \bar x^\mu = \frac{x^\mu - \frac{1}{2c^2}(x, x)\,g^\mu}{1 - \frac{1}{c^2}(g, x) + \frac{1}{4c^4}(g, g)(x, x)} \tag{249}$$

We concentrate now on implications of the assumption that $g_\mu$ possesses the
specialized structure

$$\begin{pmatrix} g_0\\ g_1\\ g_2\\ g_3 \end{pmatrix} = \begin{pmatrix} 0\\ \boldsymbol g \end{pmatrix}$$

that results from setting $g_0 = 0$. To describe (compare page 139) the “successive
ticks of the clock at his origin” O writes

$$\begin{pmatrix} ct\\ \boldsymbol 0 \end{pmatrix}$$

which to describe those same events we write

$$\begin{pmatrix} c\bar t\\ \bar{\boldsymbol x} \end{pmatrix} = \frac{1}{1 - (g t/2c)^2}\begin{pmatrix} ct\\ \boldsymbol 0 + \tfrac12 \boldsymbol g\, t^2 \end{pmatrix}$$

where $g \equiv \sqrt{\boldsymbol g\cdot\boldsymbol g}$ and the + intruded because we are talking here about $g^\mu$; i.e.,
because we raised the index. In the non-relativistic limit this gives

$$\begin{pmatrix} \bar t\\ \bar{\boldsymbol x} \end{pmatrix} = \begin{pmatrix} t\\ \tfrac12 \boldsymbol g\, t^2 \end{pmatrix} \tag{250}$$

which shows clearly the sense in which we see O to be in a state of uniform
acceleration. To simplify more detailed analysis of the situation we (without
loss of generality) sharpen our former assumption, writing

$$\boldsymbol g = \begin{pmatrix} g\\ 0\\ 0 \end{pmatrix}$$
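Then

$$1 - \tfrac{1}{c^2}(g, x) + \tfrac{1}{4c^4}(g, g)(x, x) = \frac{(x - \lambda)^2 + y^2 + z^2 - c^2 t^2}{\lambda^2}, \qquad \lambda \equiv \frac{2c^2}{g} \text{ is a “length”}$$

and (249) becomes, with [etc.] $\equiv (x - \lambda)^2 + y^2 + z^2 - c^2 t^2$,

$$\left.\begin{aligned}
\bar t &= \frac{\lambda^2}{[\text{etc.}]}\cdot t\\
\bar x &= \frac{\lambda^2}{[\text{etc.}]}\cdot\left\{ x + \lambda^{-1}(c^2 t^2 - x^2 - y^2 - z^2) \right\}\\
&= \frac{\lambda}{[\text{etc.}]}\cdot\left\{ c^2 t^2 - x^2 - y^2 - z^2 + \lambda x \right\}\\
&= \frac{\lambda}{[\text{etc.}]}\cdot\left\{ -[\text{etc.}] - \lambda(x - \lambda) \right\}\\
\bar y &= \frac{\lambda^2}{[\text{etc.}]}\cdot y\\
\bar z &= \frac{\lambda^2}{[\text{etc.}]}\cdot z
\end{aligned}\right\} \tag{251}$$

It is evident that [etc.] vanishes—and the transformation (251) becomes
therefore singular—on the lightcone $c^2 t^2 - (x - \lambda)^2 - y^2 - z^2 = 0$ whose vertex
is situated at $\{t, x, y, z\} = \{0, \lambda, 0, 0\}$. It is to gain a diagrammatic advantage
that we now set y = z = 0 and study what (251) has to say about how $\bar t$ and $\bar x$
depend upon t and x. We have

$$\bar t = \frac{\lambda^2}{[(x - \lambda)^2 - c^2 t^2]}\cdot t \tag{252.1}$$

$$(\bar x + \lambda) = -\frac{\lambda^2}{[(x - \lambda)^2 - c^2 t^2]}\cdot(x - \lambda) \tag{252.2}$$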
which jointly entail

$$\left[ c^2 \bar t^{\,2} - (\bar x + \lambda)^2 \right]\cdot\left[ c^2 t^2 - (x - \lambda)^2 \right] = \lambda^4 \tag{253}$$

But equations (252) can be written

$$c^2 t^2 - (x - \lambda)^2 = -\lambda^2\,\frac{t}{\bar t} \tag{254.1}$$

$$\phantom{c^2 t^2 - (x - \lambda)^2} = \lambda^2\,\frac{x - \lambda}{\bar x + \lambda} \tag{254.2}$$

and when we return with the latter to (253) we find

$$c^2 \bar t^{\,2} - (\bar x + \lambda)^2 = \lambda^2\,\frac{\bar x + \lambda}{x - \lambda}$$

from which t has been eliminated: complete the square and obtain

$$\left( \bar x + \lambda\,\frac{x - \tfrac12\lambda}{x - \lambda} \right)^2 - (c\bar t)^2 = \left( \frac{\lambda^2}{2(x - \lambda)} \right)^2 \tag{255.1}$$

which is seen to describe an x-parameterized family of hyperbolas inscribed on the
$(\bar t, \bar x)$-plane. These are Möbius transforms of the lines of constant x inscribed
on the (t, x)-plane. Proceeding similarly to the elimination of (x − λ) we find

$$c^2 \bar t^{\,2} - (\bar x + \lambda)^2 = -\lambda^2\,\frac{\bar t}{t}$$

giving

$$\left( c\bar t + \frac{\lambda^2}{2ct} \right)^2 - (\bar x + \lambda)^2 = \left( \frac{\lambda^2}{2ct} \right)^2 \tag{255.2}$$

which describes a t-parameterized family of hyperbolas—Möbius transforms
of the “time-slices” or lines of constant t inscribed on the (t, x)-plane. The
following remarks proceed from the results now in hand:
• Ō, by (252), assigns to O’s origin the coordinates $\bar t_0 = 0$, $\bar x_0 = 0$; their
origins, in short, coincide.
• In (255.1) set x = 0 and find that Ō writes

$$(\bar x + \tfrac12\lambda)^2 - (c\bar t)^2 = (\tfrac12\lambda)^2$$

to describe O’s worldline, which Ō sees to be hyperbolic, with $\bar x$-intercepts
at $\bar x = 0$ and $\bar x = -\lambda$ and asymptotes $c\bar t = \pm(\bar x + \tfrac12\lambda)$ that intersect at
$\bar t = 0$, $\bar x = -\tfrac12\lambda$.
• If, in (252), we set x = 0 we obtain

$$\bar t = \frac{\lambda^2}{[\lambda^2 - c^2 t^2]}\cdot t, \qquad \bar x = \frac{\lambda^3}{[\lambda^2 - c^2 t^2]} - \lambda$$

which provide Ō’s t-parameterized description of O’s worldline. Notice
that $\bar t$ and $\bar x$ both become infinite at t = λ/c, and that $\bar t$ thereafter
becomes negative!
• To describe her lightcone O writes x = ±ct. Insert x = +ct into (252.1),
(ask Mathematica to) solve for t and obtain $ct = \lambda c\bar t/(2c\bar t + \lambda)$. Insert
that result and x = +ct into (252.2) and, after simplifications, obtain
$\bar x = +c\bar t$. Repeat the procedure taking x = −ct as your starting point:
obtain $ct = -\lambda c\bar t/(2c\bar t - \lambda)$ and finally $\bar x = -c\bar t$. The striking implication
is that (252) sends

O’s lightcone −→ Ō’s lightcone
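
The remarks above can be spot-checked numerically from (252) alone. In the sketch below the function returns the pair that stands on the left of (252.1) and (252.2), given the pair on the right; units with c = 1 are assumed by working with ct directly, and the function names and the sample value of λ are illustrative.

```python
def transform_252(ct, x, lam):
    """Equations (252) with y = z = 0: returns (c*t, x) of the left-hand sides."""
    etc = (x - lam) ** 2 - ct ** 2       # vanishes on the lightcone through (0, lam)
    ct_new = lam ** 2 * ct / etc
    x_new = -lam - lam ** 2 * (x - lam) / etc
    return ct_new, x_new

def hyperbola_residual(ct, lam):
    """Residual of (x + lam/2)^2 - (ct)^2 = (lam/2)^2 along the image of x = 0."""
    ct_new, x_new = transform_252(ct, 0.0, lam)
    return (x_new + 0.5 * lam) ** 2 - ct_new ** 2 - (0.5 * lam) ** 2

def lightcone_residual(ct, sign, lam):
    """How far the image of the point x = sign*ct lies from the line x = sign*ct."""
    ct_new, x_new = transform_252(ct, sign * ct, lam)
    return x_new - sign * ct_new

if __name__ == "__main__":
    lam = 2.0
    print(transform_252(0.0, 0.0, lam))            # the origins coincide
    for ct in (0.1, 0.5, 0.9, 1.5):
        print(ct, hyperbola_residual(ct, lam))
    for ct in (0.3, 0.7, 1.4):
        print(ct, lightcone_residual(ct, +1, lam), lightcone_residual(ct, -1, lam))
```

The sample times avoid the singular values (ct = λ on the worldline, ct = λ/2 on the forward lightcone) at which the denominator vanishes.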


The conformal group is a wonderfully rich mathematical object, of which I

have scarcely scratched the surface.135 But I have scratched deeply enough to
illustrate the point which motivated this long and intricate digression, a point
made already on page 126:

The covariance group of a theory depends

in part upon how the theory is expressed :

One rendering of Maxwell’s equations led us to the Lorentz group, and to

special relativity. An almost imperceptibly different rendering committed us,
however, to an entirely different line of analysis, and led us to an entirely
different place—the conformal group, which contains the Lorentz group as
a subgroup, but contains also much else . . . including transformations to the
frames of “uniformly accelerated observers.” Though it was electrodynamics
which inspired our interest in the conformal group,136 if you were to ask an
elementary particle theorist about the conformal group you would be told that
“the group arises as the covariance group of the wave equation

$$\Box\varphi = 0 \;:\; \text{conformally covariant}$$

Conformal covariance is broken (reduced to Lorentz covariance) by the inclusion
of a “mass term”

$$(\Box + m^2)\varphi = 0 \;:\; \text{conformal covariance is broken}$$

It becomes the dominant symmetry in particle physics because at high energy
mass terms can, in good approximation, be neglected

$$\text{rest energy } mc^2 \ll \text{total particle energy}$$

and enters into electrodynamics because the photon has no mass.” That the
group enters also into the physics of massy particles133 is, in the light of such
a remark, somewhat surprising. Surprises are imported also into classical
electrodynamics by the occurrence of accelerations within the conformal group,
for the question then arises: Does a uniformly accelerated charge radiate?137

135
I scratch deeper, and discuss the occurrence of the conformal group in
connection with a rich variety of physical problems, in appell, galilean
& conformal transformations in classical/quantum free particle
dynamics () and transformational physics of waves (–).
136
In “‘Electrodynamics’ in 2 -dimensional spacetime” () I develop a
“toy electrodynamics” that gives full play to the exceptional richness that the
conformal group has been seen to acquire in the 2 -dimensional case.
137
This question—ﬁrst posed by Pauli in §32γ of his Theory of Relativity—
once was the focus of spirited controversy: see T. Fulton & F. Rohrlich,
“Classical radiation from a uniformly accelerated charge,” Annals of Physics 9,

8. Transformation properties of electromagnetic fields. To describe such a ﬁeld at

a spacetime point P we might display the values assumed there by the respective
components of the electric and magnetic ﬁeld vectors E and B . Or we might
display the values assumed there by the components $F^{\mu\nu}$ of the electromagnetic
field tensor. To describe the same physical facts a second138 observer Ō would
display the values assumed by $\bar{\boldsymbol E}$ and $\bar{\boldsymbol B}$, or perhaps by $\bar F^{\mu\nu}$. The question is

How are $\{\boldsymbol E, \boldsymbol B\}$ and $\{\bar{\boldsymbol E}, \bar{\boldsymbol B}\}$ related?
The answer has been in our possession ever since (at A on page 127, and on
the “natural” grounds there stated) we assumed it to be the case that
F µν transforms as a tensor density of unit weight (256)
But now we know things about the “allowed” coordinate transformations that
on page 127 we did not know. Our task, therefore, is to make explicit the
detailed mathematical/physical consequences of (256). We know (see again
(186) on page 129) that (256) pertains even when X → X is conformal, but
I will restrict my attention to the (clearly less problematic, and apparently
more important) case (184) in which
X → X is Lorentzian

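The claim, therefore, is that

$$x \to \bar x = \Lambda x \quad\text{induces}\quad F \to \bar F = V\cdot\Lambda F \Lambda^{\mathsf T}$$

where $\Lambda^{\mathsf T} g\,\Lambda = g$ entails

$$V \equiv \frac{1}{\det\Lambda} = \pm 1$$

and $\bar F = V\cdot\Lambda F \Lambda^{\mathsf T}$ means $\bar F^{\mu\nu} = V\,\Lambda^\mu{}_\alpha F^{\alpha\beta} \Lambda^\nu{}_\beta$. It is known, moreover, that (see
again (211) on page 157) Λ can be considered to have this factored structure:

$$\Lambda = \mathbb R\cdot\Lambda(\boldsymbol\beta)$$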
β)

499 (1960); T. Fulton, F. Rohrlich & L. Witten,

(continued from the preceding page)
“Physical consequences of a coordinate transformation to a uniformly
accelerated frame,” Nuovo Cimento 26, 652 (1962) and E. L. Hill, “On
accelerated coordinate systems in classical and relativistic mechanics,” Phys.
Rev. 67, 358 (1945); “On the kinematics of uniformly accelerated motions &
classical electromagnetic theory,” Phys. Rev. 72, 143 (1947). The matter
is reviewed by R. Peierls in §8.1 of Surprises in Theoretical Physics (),
and was elegantly laid to rest by D. Boulware, “Radiation from a uniformly
accelerated charge,” Annals of Physics 124, 169 (1980). For more general
discussion see T. Fulton, F. Rohrlich & L. Witten, “Conformal invariance in
physics,” Rev. Mod. Phys. 34, 442 (1962) and L. Page, “A new relativity,”
Phys. Rev. 49, 254 (1936). Curiously, Boulware (with whom I was in touch
earlier today:  October ) proceeded without explicit reference to the
conformal group, of which he apparently was (and remains) ignorant.
138
In view of the conformal covariance of electrodynamics I hesitate to insert
here the adjective “inertial.”
How electromagnetic fields respond to Lorentz transformations 179

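This means that we can study separately the response of F to spatial rotations
$\mathbb R$ and its response to boosts $\Lambda(\boldsymbol\beta)$.

response to rotations  Write out again (159)

$$F = A(\boldsymbol E, \boldsymbol B) \equiv \begin{pmatrix} 0 & -E_1 & -E_2 & -E_3\\ E_1 & 0 & -B_3 & B_2\\ E_2 & B_3 & 0 & -B_1\\ E_3 & -B_2 & B_1 & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & -\boldsymbol E^{\mathsf T}\\ \boldsymbol E & \mathbb B \end{pmatrix}$$

and (208)

$$\mathbb R \equiv \begin{pmatrix} 1 & \boldsymbol 0^{\mathsf T}\\ \boldsymbol 0 & R \end{pmatrix}$$

where

$$R = \begin{pmatrix} R_{11} & R_{12} & R_{13}\\ R_{21} & R_{22} & R_{23}\\ R_{31} & R_{32} & R_{33} \end{pmatrix}$$

is a 3×3 rotation matrix: $R^{-1} = R^{\mathsf T}$. It will, in a moment, become essential to
notice that the latter equation, when spelled out in detail, reads

$$\frac{1}{\det R}\begin{pmatrix} (R_{22}R_{33} - R_{23}R_{32}) & (R_{13}R_{32} - R_{12}R_{33}) & (R_{12}R_{23} - R_{13}R_{22})\\ (R_{23}R_{31} - R_{21}R_{33}) & (R_{11}R_{33} - R_{13}R_{31}) & (R_{21}R_{13} - R_{23}R_{11})\\ (R_{32}R_{21} - R_{31}R_{22}) & (R_{31}R_{12} - R_{32}R_{11}) & (R_{11}R_{22} - R_{12}R_{21}) \end{pmatrix} = \begin{pmatrix} R_{11} & R_{21} & R_{31}\\ R_{12} & R_{22} & R_{32}\\ R_{13} & R_{23} & R_{33} \end{pmatrix} \tag{257}$$

where

$$\frac{1}{\det R} = \pm 1 \text{ according as } R \text{ is proper/improper}$$

Our task now is the essentially elementary one of evaluating

$$\bar F = \frac{1}{\det R}\begin{pmatrix} 1 & \boldsymbol 0^{\mathsf T}\\ \boldsymbol 0 & R \end{pmatrix}\begin{pmatrix} 0 & -\boldsymbol E^{\mathsf T}\\ \boldsymbol E & \mathbb B \end{pmatrix}\begin{pmatrix} 1 & \boldsymbol 0^{\mathsf T}\\ \boldsymbol 0 & R^{\mathsf T} \end{pmatrix} = \frac{1}{\det R}\begin{pmatrix} 0 & -(R\boldsymbol E)^{\mathsf T}\\ R\boldsymbol E & R\,\mathbb B\,R^{\mathsf T} \end{pmatrix}$$

which supplies

$$\bar{\boldsymbol E} = (\det R)^{-1}\cdot R\boldsymbol E \tag{258.1}$$

$$\bar{\mathbb B} = (\det R)^{-1}\cdot R\,\mathbb B\,R^{\mathsf T} \tag{258.2}$$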

The latter shows clearly how the antisymmetry of $\mathbb B$ comes to be inherited by $\bar{\mathbb B}$,
but does not much resemble its companion. However . . . if we139 first spell out

139
problem 47.

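the meaning of (258.2)

$$\begin{pmatrix} 0 & -\bar B_3 & \bar B_2\\ \bar B_3 & 0 & -\bar B_1\\ -\bar B_2 & \bar B_1 & 0 \end{pmatrix} = (\det R)^{-1}\cdot R\begin{pmatrix} 0 & -B_3 & B_2\\ B_3 & 0 & -B_1\\ -B_2 & B_1 & 0 \end{pmatrix} R^{\mathsf T} \tag{259.1}$$

then (on a large sheet of paper) construct a detailed description of the matrix
on the right, and finally make simplifications based on the rotational identity
(257) . . . we find that (259.1) is precisely equivalent to (which is to say: simply
a notational variant of) the statement140

$$\begin{pmatrix} \bar B_1\\ \bar B_2\\ \bar B_3 \end{pmatrix} = R\begin{pmatrix} B_1\\ B_2\\ B_3 \end{pmatrix} \tag{259.2}$$

Equations (258) can therefore be expressed

$$\bar{\boldsymbol E} = (\det R)^{-1}\cdot R\boldsymbol E \tag{260.1}$$

$$\bar{\boldsymbol B} = R\boldsymbol B \tag{260.2}$$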
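
The lemma that takes (259.1) into (259.2) can be spot-checked without the large sheet of paper. The sketch below builds a proper rotation from two elementary rotations, builds an improper one by appending a mirror reflection, and compares $(\det R)^{-1} R\,\mathbb B\,R^{\mathsf T}$ with the antisymmetric matrix made from $R\boldsymbol B$. The function names, angles and field components are illustrative assumptions.

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(P):
    return [[P[j][i] for j in range(3)] for i in range(3)]

def det3(P):
    return (P[0][0] * (P[1][1] * P[2][2] - P[1][2] * P[2][1])
            - P[0][1] * (P[1][0] * P[2][2] - P[1][2] * P[2][0])
            + P[0][2] * (P[1][0] * P[2][1] - P[1][1] * P[2][0]))

def apply(P, v):
    return [sum(P[i][k] * v[k] for k in range(3)) for i in range(3)]

def antisymmetric(B):
    """The 3x3 matrix that appears in (259.1)."""
    return [[0.0, -B[2], B[1]], [B[2], 0.0, -B[0]], [-B[1], B[0], 0.0]]

def rotation(theta_z, theta_x):
    """A proper rotation: about the z-axis, composed with one about the x-axis."""
    cz, sz = math.cos(theta_z), math.sin(theta_z)
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    Rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    return matmul(Rz, Rx)

def lemma_residual(R, B):
    """Largest entry of (det R)^-1 R [B] R^T - [R B]."""
    d = det3(R)
    lhs = matmul(matmul(R, antisymmetric(B)), transpose(R))
    rhs = antisymmetric(apply(R, B))
    return max(abs(lhs[i][j] / d - rhs[i][j]) for i in range(3) for j in range(3))

if __name__ == "__main__":
    B = [0.7, -1.2, 0.4]
    proper = rotation(0.6, -1.1)
    mirror = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
    improper = matmul(proper, mirror)
    print(det3(proper), lemma_residual(proper, B))
    print(det3(improper), lemma_residual(improper, B))
```

The improper case is the informative one: without the factor $(\det R)^{-1}$ the two sides would differ by a sign.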

remark: In the conventional language of 3-dimensional
physics, objects $\boldsymbol A$ that respond to rotation $\boldsymbol x \to \bar{\boldsymbol x} = R\boldsymbol x$ by
the rule
$$\boldsymbol A \to \bar{\boldsymbol A} = R\boldsymbol A$$
are said to transform as vectors (or “polar vectors”), while
objects that transform by the rule

$$\boldsymbol A \to \bar{\boldsymbol A} = (\det R)\cdot R\boldsymbol A$$

are said to transform as pseudovectors (or “axial vectors”).

Vectors and pseudovectors respond identically to proper
rotations, but the latter respond to reflections (improper
rotations) by acquisition of a minus sign. If $\boldsymbol A$ and $\boldsymbol B$ are both
vectors (or both pseudovectors) then $\boldsymbol C \equiv \boldsymbol A \times \boldsymbol B$ provides the
standard example of a pseudovector . . . for reasons that become
evident when one considers what mirrors do to the “righthand
rule.”

The assumption141 that

F µν transforms as a tensor density of unit weight

140
For a more elegant approach to the proof of this important lemma see
pages 22–22 in classical gyrodynamics ().
141
See again the first point of view , page 126.

was seen at (260) to carry the implication that

E responds to rotation as a pseudovector
(261.1)
B responds to rotation as a vector

If we were, on the other hand, to assume142 that

F µν transforms as a weightless tensor

then the (det R)–1 factors would disappear from the right side of (258), and we
would be led to the opposite conclusion:

E responds to rotation as a vector
(261.2)
B responds to rotation as a pseudovector

The transformation properties of E and B are in either case “opposite,”143 and

it is from E that the transformation properties of ρ and j are inherited. The
mirror image of the Coulombic field of a positive charge looks
• like the Coulombic field of a negative charge according to (261.1), but
• like the Coulombic field of a positive charge according to (261.2).
Perhaps it is for this reason (supported by no compelling physical argument)
that (261.2) describes the tacitly-adopted convention standard to the relativistic
electrodynamical literature. The factors that distinguish tensor densities from
weightless tensors are, in special relativity, so nearly trivial (det Λ = ±1) that
many authors successfully contrive to neglect the distinction altogether.

response to boosts: All boosts are proper. Our task, therefore, is to evaluate

$$A(\bar{\mathbf E}, \bar{\mathbf B}) = \Lambda(\boldsymbol\beta)\,A(\mathbf E, \mathbf B)\,\Lambda^{\mathsf T}(\boldsymbol\beta) \tag{262}$$

where Λ(β) has the structure (209) described on page 156. It will serve our exploratory purposes to suppose initially that

$$\boldsymbol\beta = \begin{pmatrix} \beta \\ 0 \\ 0 \end{pmatrix}$$

142
See again the second point of view , page 128.
143
This fact has been latent ever since—at (67)—we alluded to the “E-like character” of (1/c) v × B, since

vector × vector = pseudovector
vector × pseudovector = vector

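—i.e., that we are boosting along the x-axis: then

$$\Lambda(\boldsymbol\beta) = \begin{pmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$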

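and it follows from (262) by quick calculation that

$$A(\bar{\mathbf E}, \bar{\mathbf B}) = \begin{pmatrix}
0 & -E_1 & -\gamma(\mathbf E - \boldsymbol\beta\times\mathbf B)_2 & -\gamma(\mathbf E - \boldsymbol\beta\times\mathbf B)_3 \\
E_1 & 0 & -\gamma(\mathbf B + \boldsymbol\beta\times\mathbf E)_3 & +\gamma(\mathbf B + \boldsymbol\beta\times\mathbf E)_2 \\
\gamma(\mathbf E - \boldsymbol\beta\times\mathbf B)_2 & +\gamma(\mathbf B + \boldsymbol\beta\times\mathbf E)_3 & 0 & -B_1 \\
\gamma(\mathbf E - \boldsymbol\beta\times\mathbf B)_3 & -\gamma(\mathbf B + \boldsymbol\beta\times\mathbf E)_2 & +B_1 & 0
\end{pmatrix}$$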

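Noting that

$$E_1 = (\mathbf E - \boldsymbol\beta\times\mathbf B)_1 \quad\text{because}\quad (\boldsymbol\beta\times\mathbf B) \perp \boldsymbol\beta$$
$$B_1 = (\mathbf B + \boldsymbol\beta\times\mathbf E)_1 \quad\text{because}\quad (\boldsymbol\beta\times\mathbf E) \perp \boldsymbol\beta$$

we infer that

$$\begin{aligned}
\bar{\mathbf E} &= (\mathbf E - \boldsymbol\beta\times\mathbf B)_{\parallel} + \gamma(\mathbf E - \boldsymbol\beta\times\mathbf B)_{\perp} \\
\bar{\mathbf B} &= (\mathbf B + \boldsymbol\beta\times\mathbf E)_{\parallel} + \gamma(\mathbf B + \boldsymbol\beta\times\mathbf E)_{\perp}
\end{aligned} \tag{263}$$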

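where components ∥ and ⊥ to β are defined in the usual way: generically

$$\mathbf A = \mathbf A_{\parallel} + \mathbf A_{\perp}$$

$$\mathbf A_{\parallel} \equiv (\mathbf A\cdot\hat{\boldsymbol\beta})\,\hat{\boldsymbol\beta} = \frac{1}{\beta^2}\begin{pmatrix} \beta_1\beta_1 & \beta_1\beta_2 & \beta_1\beta_3 \\ \beta_2\beta_1 & \beta_2\beta_2 & \beta_2\beta_3 \\ \beta_3\beta_1 & \beta_3\beta_2 & \beta_3\beta_3 \end{pmatrix}\mathbf A$$

in which the matrix (together with its 1/β² prefactor) projects onto β.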
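The claim that (263) is just (262) written out can be checked numerically for a boost in an arbitrary direction. The following is a minimal Python sketch, not part of the original notes: the function names are illustrative, the general boost matrix is assumed to be the standard one that reduces to the x-axis form displayed above, and the field tensor is laid out as in the B-block of (259.1) and the displayed result of (262).

```python
import math

def boost(beta):
    """General boost matrix (assumed form): gamma, gamma*beta_i, delta_ij + (gamma-1) beta_i beta_j / beta^2.
    Requires a nonzero beta."""
    b2 = sum(b * b for b in beta)
    g = 1.0 / math.sqrt(1.0 - b2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = g * beta[i]
        for j in range(3):
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + (g - 1.0) * beta[i] * beta[j] / b2
    return L

def field_tensor(E, B):
    """Antisymmetric 4x4 array A(E, B), same layout as in the x-axis example."""
    return [[0.0, -E[0], -E[1], -E[2]],
            [E[0], 0.0, -B[2], B[1]],
            [E[1], B[2], 0.0, -B[0]],
            [E[2], -B[1], B[0], 0.0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def par_plus_gamma_perp(X, beta, g):
    """X_par + g * X_perp, with components taken relative to beta."""
    b2 = sum(b * b for b in beta)
    s = sum(x * b for x, b in zip(X, beta)) / b2
    par = [s * b for b in beta]
    return tuple(p + g * (x - p) for x, p in zip(X, par))

def transform_fields(E, B, beta):
    """Right-hand sides of (263)."""
    g = 1.0 / math.sqrt(1.0 - sum(b * b for b in beta))
    e = tuple(x - y for x, y in zip(E, cross(beta, B)))   # E - beta x B
    b = tuple(x + y for x, y in zip(B, cross(beta, E)))   # B + beta x E
    return par_plus_gamma_perp(e, beta, g), par_plus_gamma_perp(b, beta, g)

# Arbitrary test data (hypothetical values)
E, B, beta = (1.0, 2.0, -0.5), (0.3, -1.2, 0.7), (0.2, -0.4, 0.5)
L = boost(beta)
A_bar = matmul(matmul(L, field_tensor(E, B)), transpose(L))   # equation (262)
E_bar, B_bar = transform_fields(E, B, beta)                   # equations (263)
A_expected = field_tensor(E_bar, B_bar)
max_diff = max(abs(A_bar[i][j] - A_expected[i][j]) for i in range(4) for j in range(4))
print(max_diff)
```

If the two routes agree, `max_diff` should be at the level of floating-point rounding.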

Several comments are now in order:

1. We had already on page 46 (when we were arguing from Galilean relativity) reason to suspect that “E & B fields transform in a funny, interdependent way.” Equations (263) first appear—somewhat disguised—in §4 of Lorentz ().78 They appear also in §6 of Einstein ().78 They were, in particular, unknown to Maxwell.
2. Equations (263) are ugly enough that they invite reformulation, and can
in fact be formulated in a great variety of (superficially diverse) ways . . . some
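obvious—in the 6-vector formalism86 one writes

$$\begin{pmatrix} \bar{\mathbf E} \\ \bar{\mathbf B} \end{pmatrix} = M(\boldsymbol\beta)\begin{pmatrix} \mathbf E \\ \mathbf B \end{pmatrix}$$

where M(β) is a 6×6 matrix whose elements can be read off from (263)—and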
some not so obvious. I would pursue this topic in response to some speciﬁc
formal need, but none will arise.

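3. The following statements are equivalent:

Maxwell’s equations

$$\begin{aligned}
\nabla\cdot\mathbf E &= \rho \\
\nabla\times\mathbf B - \frac{1}{c}\frac{\partial}{\partial t}\mathbf E &= \frac{1}{c}\,\mathbf j \\
\nabla\cdot\mathbf B &= 0 \\
\nabla\times\mathbf E + \frac{1}{c}\frac{\partial}{\partial t}\mathbf B &= \mathbf 0
\end{aligned}$$

simply “turn black” (that is, preserve their form) in response to

$$\begin{aligned}
\bar t &= \gamma t + \tfrac{1}{c^2}\gamma\,\mathbf v\cdot\mathbf x \\
\bar{\mathbf x} &= \mathbf x + \left[\gamma t + (\gamma - 1)\tfrac{1}{v^2}\,\mathbf v\cdot\mathbf x\right]\mathbf v \\
\bar\rho &= \gamma\rho + \tfrac{1}{c^2}\gamma\,\mathbf v\cdot\mathbf j \\
\bar{\mathbf j} &= \mathbf j + \left[\gamma\rho + (\gamma - 1)\tfrac{1}{v^2}\,\mathbf v\cdot\mathbf j\right]\mathbf v \\
\bar{\mathbf E} &= (\mathbf E - \boldsymbol\beta\times\mathbf B)_{\parallel} + \gamma(\mathbf E - \boldsymbol\beta\times\mathbf B)_{\perp} \\
\bar{\mathbf B} &= (\mathbf B + \boldsymbol\beta\times\mathbf E)_{\parallel} + \gamma(\mathbf B + \boldsymbol\beta\times\mathbf E)_{\perp}
\end{aligned} \tag{264.1}$$

Maxwell’s equations

$$\begin{aligned}
\partial_\mu F^{\mu\nu} &= \frac{1}{c}\,j^\nu \\
\partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} + \partial^\lambda F^{\mu\nu} &= 0
\end{aligned}$$

simply “turn black” in response to

$$\begin{aligned}
\bar x^\mu &= \Lambda^\mu{}_\alpha\,x^\alpha \\
\bar j^\nu &= \Lambda^\nu{}_\beta\,j^\beta \\
\bar F^{\mu\nu} &= \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\,F^{\alpha\beta}
\end{aligned} \tag{264.2}$$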

and provide detailed statements of what one means when one refers to the
“Lorentz covariance of Maxwellian electrodynamics.” Note that it is not enough
to know how Lorentz transformations act on spacetime coordinates: one must
know also how they act on ﬁelds and sources. The contrast in the formal
appearance of (264.1: Lorentz & Einstein) and (264.2: Minkowski) is striking,
and motivates me to remark that
• it is traditional in textbooks to view (264.1) as “working equations,” and
to regard (264.2) as “cleaned-up curiosities,” to be written down and
admired as a kind of afterthought . . . but
• my own exposition has been designed to emphasize the practical utility
of (264.2): I view (264.1) as “elaborated commentary” upon (264.2)—too
complicated to work with except in some specialized applications.
4. We know now how to translate electrodynamical statements from one inertial
frame to another. But we do not at present possess answers to questions such
as the following:

• How do electromagnetic fields and/or Maxwell’s equations look to an
observer in a rotating frame?
• How—when Thomas precession is taken into account—does the nuclear
Coulomb field look to an observer sitting on an electron in Bohr orbit?
• How do electromagnetic fields and the field equations look to an arbitrarily
accelerated observer?
We are, however, in position now to attack such problems, should physical
motivation arise.
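5. Suppose O sees a pure E-field: B(x) = 0 (all x). It follows from (263) that
we would see an electromagnetic field of the form

$$\begin{aligned}
\bar{\mathbf E} &= \mathbf E_{\parallel} + \gamma\mathbf E_{\perp} = \gamma\mathbf E + (1 - \gamma)\tfrac{1}{v^2}(\mathbf v\cdot\mathbf E)\,\mathbf v \\
\bar{\mathbf B} &= \gamma(\boldsymbol\beta\times\mathbf E) = \tfrac{1}{c}\gamma(\mathbf v\times\mathbf E)
\end{aligned}$$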

Our B-field is, however, structurally atypical: it has a specialized ancestry,
and (go to O’s frame) can be transformed away—globally. In general it is not
possible by Lorentz transformation to kill B (or E) even locally, for to do so
would be (unless E ⊥ B at the spacetime point in question) to stand in violation
of the second of the following remarkable equations144

$$\bar{\mathbf E}\cdot\bar{\mathbf E} - \bar{\mathbf B}\cdot\bar{\mathbf B} = \mathbf E\cdot\mathbf E - \mathbf B\cdot\mathbf B \tag{265.1}$$

$$\bar{\mathbf E}\cdot\bar{\mathbf B} = \mathbf E\cdot\mathbf B \tag{265.2}$$

The preceding remark makes vividly clear, by the way, why it is that attempts to
“derive” electrodynamics from “Coulomb’s law + special relativity” are doomed
to fail: with only that material to work with one cannot escape from the force
of the special/atypical condition E ·B = 0.
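The invariance statements (265) and the atypical condition E·B = 0 can be illustrated numerically. The sketch below is an illustration only (helper names are hypothetical, and the test values are arbitrary); it applies (263) in the equivalent form X∥ + γX⊥ = γX + (1 − γ)(X·β)β/β².

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def boost_fields(E, B, beta):
    """Equations (263), for a nonzero beta."""
    g = 1.0 / math.sqrt(1.0 - dot(beta, beta))

    def combine(X):
        # X_par + g * X_perp, rewritten as g*X + (1 - g) * (X.beta) beta / beta^2
        s = (1.0 - g) * dot(X, beta) / dot(beta, beta)
        return tuple(g * x + s * b for x, b in zip(X, beta))

    E_bar = combine(tuple(e - c for e, c in zip(E, cross(beta, B))))
    B_bar = combine(tuple(b + c for b, c in zip(B, cross(beta, E))))
    return E_bar, B_bar

beta = (0.2, -0.4, 0.5)

# Generic fields: both quadratic expressions in (265) should be unchanged.
E, B = (1.0, 2.0, -0.5), (0.3, -1.2, 0.7)
E_bar, B_bar = boost_fields(E, B, beta)
delta_1 = (dot(E_bar, E_bar) - dot(B_bar, B_bar)) - (dot(E, E) - dot(B, B))
delta_2 = dot(E_bar, B_bar) - dot(E, B)

# Pure E-field: a B-field appears after the boost, but it stays perpendicular to E.
E0_bar, B0_bar = boost_fields(E, (0.0, 0.0, 0.0), beta)
pure_E_product = dot(E0_bar, B0_bar)

print(delta_1, delta_2, pure_E_product)
```

All three printed numbers should vanish to rounding accuracy, while `B0_bar` itself is nonzero: the induced B-field is real, but it is of the special kind that a Lorentz transformation can remove again.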
6. We do not have in hand the statements analogous to (264) that serve to lend
detailed meaning to the “conformal covariance of Maxwellian electrodynamics.”
To gain a sense of the most characteristic features of the enriched theory it
would be sufficient to describe how electromagnetic fields and sources respond
to dilations and inversions.
7. An uncharged copper rod is transported with velocity v in the presence of a
homogeneous magnetic field B. We see a charge separation take place (one
end of the rod becomes positively charged, the other negatively: see Figure 66),
which we attribute to the presence of q(v × B)-forces. But an observer O co-moving
with the rod sees no such forces (since v = 0), and must attribute the charge
separation phenomenon to the presence of an electric field E. It was to account
for such seeming “explanatory asymmetry” that Einstein invented the theory
of relativity. I quote from the beginning of his  paper:

144
problem 48.

Figure 66: A copper rod is transported with constant velocity v in a
homogeneous magnetic field. Charge separation is observed to occur
in the rod. Observers in relative motion explain the phenomenon
in—unaccountably, prior to the invention of special relativity—quite
different ways.

ON THE ELECTRODYNAMICS OF MOVING BODIES

A. Einstein

It is known that Maxwell’s electrodynamics—as usually understood at

the present time—when applied to moving bodies, leads to asymmetries
which do not appear to be inherent in the phenomena. Take, for example,
the reciprocal electrodynamic action of a magnet and a conductor. The
observable phenomenon here depends only on the relative motion of the
conductor and the magnet, whereas the customary view draws a sharp
distinction between the two cases in which either the one or the other
of these bodies is in motion. For if the magnet is in motion and the
conductor at rest, there arises in the neighborhood of the magnet an
electric field with a certain definite energy, producing a current at the
places where parts of the conductor are situated. But if the magnet is
stationary and the conductor in motion, no electric field arises in the
neighborhood of the magnet. In the conductor, however, we find an
electromotive force, to which in itself there is no corresponding energy,
but which gives rise—assuming equality of relative motion in the two
cases discussed—to electric currents of the same path and intensity as
those produced by the electric forces in the former case.
Examples of this sort, together with the unsuccessful attempts to
discover any motion of the earth relatively to the “light medium,” suggest
that the phenomena of electrodynamics as well as of mechanics possess
no properties corresponding to the idea of absolute rest.

After sixteen pages of inspired argument Einstein arrives at equations (263),

from which he concludes that

. . . electric and magnetic forces do not exist independently of the state

of motion of the system of coordinates.
Furthermore it is clear that the asymmetry mentioned in the
introduction as arising when we consider the currents produced by the
relative motion of a magnet and a conductor now disappears.

He comes to the latter conclusion by arguing that to determine the force F
experienced by a moving charge q in an electromagnetic field E, B a typical
inertial observer should

i) transform E, B → E₀, B₀ to the instantaneous rest frame of the
charge;
ii) write F₀ = qE₀;
iii) transform back again to his own reference frame: F ← F₀.
We don’t, as yet, know how to carry out the last step (because we have yet
to study relativistic mechanics). It is already clear, however, that Einstein’s
program eliminates asymmetry because it issues identical instructions to every
inertial observer . Note, moreover, that it contains no reference to “the” velocity
. . . but refers only to the relative velocity (of charge and observer, of observer
and observer).
The ﬁeld-transformation equations (263) lie, therefore, at the motivating
heart of Einstein’s  paper. All the rest can be read as “technical support”—
evidence of the extraordinary surgery Einstein was willing to perform to remove
a merely aesthetic blemish from a theory (Maxwellean electrodynamics) which—
after all—worked perfectly well as it was! Several morals could be drawn. Most
are too obvious to state . . . and all are too important for the creative physicist
to ignore.

9. Principle of relativity . The arguments which led Einstein to the Lorentz trans-
formations differ profoundly from those which (unbeknownst to Einstein) had
led Lorentz to the same result. Lorentz argued (as we have seen . . . and done)
from the structure of Maxwell’s equations. Einstein, on the other hand (and
though he had an electrodynamic problem in mind), extracted the Lorentz
transformations from an unprecedented operational analysis: his argument
assumed very little . . . and he had, therefore, correspondingly greater confi-
dence in the inevitability and generality of his conclusions. His argument was,
in particular, entirely free from any reference to Maxwell’s equations, so his
conclusion—that inertial observers are interrelated by Lorentz transformations
—could not be specific to Maxwellean electrodynamics. It was this insight—and
the firmness145 with which he adhered to it—which distinguished Einstein’s
thought from that of his contemporaries (Lorentz, Poincaré). It led him to
145
I have indicated on page 163 why, in the light of subsequent developments,
Einstein’s “firmness” can be argued to have been inappropriately strong.

propose, at the beginning of his §2, two principles . . . which amount, in eﬀect,
to this, the

Principle of Relativity: The concepts, statements and
formulæ of physics—whatever the phenomenology to
which they specifically pertain—must preserve their
structure when subjected to Lorentz transformation.          (266)

The principle of relativity functions as a “syntactical constraint” on the

“statements that physicists may properly utter”—at least when they are doing
local physics. Concepts/statements/theories which fail to pass the (quite
stringent) “Lorentz covariance test” can, according to the principle of relativity,
be dismissed out of hand as ill-formed, inconsistent with the grammar of physics
. . . and therefore physically untenable. Theories that pass the test are said to be
“relativistic,” “Lorentz invariant” or (more properly) Lorentz covariant. The
physical correctness of such a theory is, of course, not guaranteed. What is
guaranteed is the ultimate physical incorrectness of any theory—whatever may
be its utility in circumscribed contexts (think of non-relativistic classical and
quantum mechanics!)—that stands in violation of the principle of relativity.146
Some theories—such as the version of Maxwellean electrodynamics that
was summarized at (264.1)—conform to the principle of relativity, but do so
“non-obviously.” Other theories—see again (264.2)—conform more obviously.
Theories of the latter type are said to be “manifestly Lorentz covariant.”
Manifest covariance is, for obvious reasons, a very useful formal attribute for a physical
theory to possess. Much attention has been given therefore to the cultivation
of principles and analytical techniques which sharpen one’s ability to generate
manifestly covariant theories “automatically.” Whence the importance which
theoretical physicists nowadays attach to variational principles, tensor analysis,
group representation theory, . . . (Einstein did without them all!).
Clearly, the principle of relativity involves much besides the simple “theory
of Lorentz transformations” (it involves, in short, all of physics!) . . . but one
must have a good command of the latter subject in order to implement the
principle. If in (266) one substitutes for the word “Lorentz” the words
“Galilean,” “conformal,” . . . one obtains the “principle of Galilean relativity,”
the “principle of conformal relativity,” etc. These do have some physically
illuminating formal consequences, but appear to pertain only approximately to
the world-as-we-find-it . . . while the principle announced by Einstein pertains
“exactly/universally.”
I have several times emphasized the universal applicability of the principle
146
But every physical theory is ultimately incorrect! So the question that
confronts physicists in individual cases is this: Is Lorentz non-covariance the
principal defect of the theory in question, the defect worthy of my corrective
attention? Much more often than not, the answer is clearly “No.”

of relativity. It is, therefore, by way of illustrative application that in Part II

of his paper Einstein turns to the specific physics which had served initially to
motivate his research—Maxwellean electrodynamics. It is frequently stated that
“electrodynamics was already relativistic (while Newtonian dynamics had to be
deformed to conform).” But this is not quite correct. The electrodynamics
inherited by Einstein contained field equations, but it contained no allusion
to a field transformation law . Einstein produced such a law—namely (263)—
by insisting that Maxwell’s field equations conform to the principle of relativity.
Einstein derived (from Maxwell’s equations + relativity, including prior
knowledge of the Lorentz transformations) a result—effectively: that the F µν
transform tensorially—which we were content (on page 127) to assume. We, on
the other hand, used Maxwell’s equations + tensoriality to deduce the design of
the Lorentz transformations. Our approach—which is effectively Lorentz’—is
efficient (also free of allusions to trains & lanterns), but might be criticized on
the ground that it is excessively “parochial,” too much rooted in specifics of
electrodynamics. It is not at all clear that our approach would have inspired
anyone to utter a generalization so audacious as Einstein’s (266). Historically it
didn’t: both Lorentz and Poincaré were in possession of the technical rudiments
of relativity already in , yet both—for distinct reasons—failed to recognize
the revolutionary force of the idea encapsulated at . Einstein was, in this
respect, well served by his trains and lanterns. But it was not Einstein but
Minkowski who first appreciated that at Einstein had in effect prescribed
that

The physics inscribed on spacetime must mimic

the symmetry structure of spacetime itself.

10. Relativistic mechanics of a particle. We possess a Lorentz covariant ﬁeld

dynamics. We want a theory of fields and (charged) particles in interaction.
Self-consistency alone requires that the associated particle dynamics be Lorentz
covariant. So also—irrespective of any reference to electromagnetism—does the
principle of relativity.
The discussion which follows will illustrate how non-relativistic theories
are “deformed to conform” to the principle of relativity. But it is offered to
serve a more explicit and pressing need: my primary goal will be to develop
descriptions of the relativistic analogs of the pre-relativistic concepts of energy,
momentum, force, . . . though a number of collateral topics will be treated en
route.
In Newtonian dynamics the “worldline” of a mass point m is considered to
be described by the 3 -vector-valued solution x(t) of a differential equation of
the form
$$\mathbf F(t, \mathbf x) = m\,\frac{d^2}{dt^2}\,\mathbf x(t) \tag{267}$$
This equation conforms to the principle of Galilean covariance (and it was from
this circumstance that historically we acquired our interest in the “population

Figure 67: At left: the time-parameterized flight x(t) of a particle,
standard to Newtonian mechanics, where t is assigned the status
of an independent variable and x is a set of dependent variables.
At right: arbitrary parameterization, t(λ) and x(λ), permits t to join
the list of dependent variables; i.e., to be treated co-equally with x.

of inertial observers”), but its Lorentz non-covariance is manifest . . . for the

equation treats t and x with a distinctness which the Lorentz transformations
do not allow because they do not preserve. We confront therefore this problem:
How to describe a worldline in conformity with the requirement that space
and time coordinates be treated co-equally? One’s first impulse is to give up
t-parameterization in favor of an arbitrary parameterization of the worldline
(Figure 67), writing xµ (λ). This at least treats space and time co-equally
. . . but leaves every inertial observer to his own devices: the resulting theory
(kinematics) would be too sloppy to support sharp physics. The “slop” would,
however, disappear if λ could be assigned a “natural” meaning—a meaning
which stands in the same relationship to all inertial observers. Einstein’s idea—
foreshadowed already on page 186—was to assign to λ the meaning/value of
“time as measured by a comoving clock.” The idea is implemented as follows
(see Figure 68): Let O write x(λ) to describe a worldline, and let him write

$$dx(\lambda) \equiv x(\lambda + d\lambda) - x(\lambda) = \begin{pmatrix} c\,dt \\ d\mathbf x \end{pmatrix}$$

to describe the interval separating a pair of “neighboring points” (points on

the tangent at x(λ)). If and only if dx(λ) is timelike will O be able to boost to
the instantaneous restframe (i.e., to the frame of an observer O who sees the
particle to be momentarily resting at her origin):

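$$\begin{pmatrix} c\,dt \\ d\mathbf x \end{pmatrix} = \Lambda(\boldsymbol\beta)\begin{pmatrix} c\,d\tau \\ \mathbf 0 \end{pmatrix}$$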

Figure 68: An accelerated observer/particle borrows his/its proper
time increments dτ from the wristwatches of momentarily comoving
inertial observers. (The worldline points λ0, λ and λ + dλ are marked
in the figure.)

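where from the boost-invariant structure of spacetime it follows that

$$\begin{aligned}
d\tau &= \sqrt{(dt)^2 - \tfrac{1}{c^2}\,d\mathbf x\cdot d\mathbf x} = \sqrt{1 - \beta^2(t)}\;dt \\
&\equiv \text{time differential measured by instantaneously comoving clock} \\
&= \tfrac{1}{c}\sqrt{\left(\tfrac{dx^0}{d\lambda}\right)^2 - \left(\tfrac{dx^1}{d\lambda}\right)^2 - \left(\tfrac{dx^2}{d\lambda}\right)^2 - \left(\tfrac{dx^3}{d\lambda}\right)^2}\;d\lambda \\
&= \tfrac{1}{c}\,ds
\end{aligned} \tag{268}$$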

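The proper time τ associated with a finitely-separated pair of points is defined

$$\tau(\lambda, \lambda_0) = \frac{1}{c}\int_{\lambda_0}^{\lambda}\sqrt{g_{\alpha\beta}\,\frac{dx^\alpha(\lambda')}{d\lambda'}\frac{dx^\beta(\lambda')}{d\lambda'}}\;d\lambda' = \int d\tau = \frac{\text{arc-length}}{c}$$

This vanishes at λ = λ0: x(λ0) is the reference point at which
we “start the proper clock.”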
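As a concrete illustration of the definition just given, the following Python sketch estimates τ by direct numerical integration along a sample worldline. Everything specific in it is an assumption made for illustration: units with c = 1, a uniformly circling particle of speed 0.6c, and the deliberately odd parameterization t = λ², chosen to stress that the value of the integral does not depend on how the worldline is parameterized.

```python
import math

C = 1.0  # units with c = 1 (an assumption, for convenience)

def worldline(lam, R=1.0, omega=0.6):
    """Hypothetical worldline x^mu(lambda): uniform circular motion with t = lambda**2."""
    t = lam * lam
    return (C * t, R * math.cos(omega * t), R * math.sin(omega * t), 0.0)

def proper_time(x, lam0, lam1, steps=20000, h=1e-6):
    """Midpoint-rule estimate of (1/c) * integral of sqrt(g_ab x'^a x'^b) d(lambda)."""
    dlam = (lam1 - lam0) / steps
    total = 0.0
    for k in range(steps):
        lam = lam0 + (k + 0.5) * dlam
        xp, xm = x(lam + h), x(lam - h)
        d = [(p - m) / (2.0 * h) for p, m in zip(xp, xm)]  # central-difference tangent
        ds2 = d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2  # signature (+, -, -, -)
        total += math.sqrt(ds2) * dlam  # fails loudly if the tangent is not timelike
    return total / C

lam0, lam1 = 0.5, 2.0  # coordinate time runs from 0.25 to 4.0
tau_numeric = proper_time(worldline, lam0, lam1)

# Closed form for constant speed v: tau = sqrt(1 - v^2/c^2) * (elapsed coordinate time)
v = 1.0 * 0.6
tau_closed = math.sqrt(1.0 - (v / C) ** 2) * (lam1 ** 2 - lam0 ** 2)
print(tau_numeric, tau_closed)
```

The two numbers should agree closely. Note that `math.sqrt` raises an error for a spacelike tangent, which is the numerical counterpart of the remark that only timelike worldlines can be τ-parameterized.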

Functional inversion gives

λ = λ(τ, λ0 )

and in place of x(λ) it becomes natural to write

x(τ ) ≡ x(λ(τ, λ0 )) : τ -parameterized description of the worldline

Evidently τ -parameterization is equivalent (to within a c-factor) to arc-length

parameterization—long known by diﬀerential geometers to be “most natural”
in metric spaces. Two points deserve comment:

Figure 69: The worldline of a masspoint lies everywhere interior

to lightcones with vertices on the worldline. The spacetime interval
separating any two points on a worldline is therefore time-like, and
the constituent points of the worldline fall into a temporal sequence
upon which all inertial observers agree.

1. Einstein’s program works if and only if all tangents to the worldline are
timelike (Figure 69). One cannot, therefore, τ -parameterize the worldline of a
photon. Or of a “tachyon.” The reason is that one cannot boost such particles
to rest: one cannot Lorentz transform the tangents to such worldlines into local
coincidence with the x0 -axis.

2. The dτ’s in ∫dτ refer to a population of osculating inertial observers.
It is a big step—a step which Einstein (and also L. H. Thomas) considered
quite “natural,” but a big step nonetheless—to suppose that τ has anything
literally to do with “time as measured by a comoving (which in the general case
means an accelerating) clock.” The relativistic dynamics of particles is, in fact,
independent of whether one attaches literal meaning to the preceding phrase. Close
reading of Einstein’s paper shows, however, that he did intend to be understood
literally (even though—patent clerk that he was—he would not have expected
his mantle clock to keep good time if jerked about). Experimental evidence
supportive of Einstein’s view derives from the decay of accelerated radioactive

particles and from recent observations pertaining to the so-called twin paradox
(see below).
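Given a τ-parameterized (whence everywhere timelike) worldline x(τ), we
define by

$$u(\tau) \equiv \frac{d}{d\tau}x(\tau) = \frac{dt}{d\tau}\frac{d}{dt}\begin{pmatrix} ct \\ \mathbf x \end{pmatrix} = \gamma\begin{pmatrix} c \\ \mathbf v \end{pmatrix} \tag{269}$$

the 4-velocity u^µ(τ), and by

$$a(\tau) \equiv \frac{d^2}{d\tau^2}x(\tau) = \frac{d}{d\tau}u(\tau) = \frac{dt}{d\tau}\frac{d}{dt}\left[\gamma\begin{pmatrix} c \\ \mathbf v \end{pmatrix}\right] = \begin{pmatrix} \tfrac{1}{c}\gamma^4(\mathbf a\cdot\mathbf v) \\ \gamma^2\mathbf a + \tfrac{1}{c^2}\gamma^4(\mathbf a\cdot\mathbf v)\,\mathbf v \end{pmatrix} \tag{270}$$

the 4-acceleration a^µ(τ). These are equations written by inertial observer O:
v refers to O’s perception of the particle’s instantaneous velocity v(t), and
γ ≡ [1 − (1/c²) v·v]^(−1/2).147 Structurally similar equations (but with everything
turned red) would be written by a second observer O̅. In developing this aspect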
of the subject one must be very careful to distinguish—both notationally and
conceptually—the following:

O’s perception of the instantaneous particle velocity v
O̅’s perception of O’s velocity s
O̅’s perception of the instantaneous particle velocity v̄

Supposing O and O̅ to be boost-equivalent (no frame rotation)

$$\bar x = \Lambda(\mathbf s/c)\,x$$

we have

$$\bar u = \Lambda(\mathbf s/c)\,u \tag{271.1}$$

$$\bar a = \Lambda(\mathbf s/c)\,a \tag{271.2}$$

These equations look simple enough, but their explicit meaning is—owing to the
complexity of Λ(s/c), of u^µ and particularly of a^µ—actually quite complex. I
will develop the detail only when forced by explicit need.148
It follows from (269) that

$$(u, u) = g_{\alpha\beta}u^\alpha u^\beta = \gamma^2(c^2 - v^2) = c^2\cdot\gamma^2(1 - \beta^2) = c^2 \tag{272}$$

147
problem 49.
148
In the meantime, see my electrodynamics (/), pages 202–205.

according to which all velocity 4-vectors have the same Lorentzian length. All
are, in particular (since (u, u) = c2 > 0), timelike. Diﬀerentiating (272) with
respect to τ we obtain
$$\frac{d}{d\tau}(u, u) = 2(u, a) = 0 \tag{273}$$
according to which it is invariably the case that u ⊥ a in the Lorentzian sense.
It follows now from the timelike character of u that all acceleration 4-vectors
are spacelike. Direct veriﬁcation of these statements could be extracted from
(269) and (270). The statement (u, u) = c2 —of which (273) is an immediate
corollary—has no precursor in non-relativistic kinematics,149 but is, as will
emerge, absolutely fundamental to relativistic kinematics/dynamics.
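The “direct verification” mentioned above is easy to carry out numerically. The sketch below is an illustration rather than part of the original argument; the function names and sample values are hypothetical. It builds u and a from the explicit forms (269) and (270) and evaluates the Lorentzian inner products.

```python
import math

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def minkowski(p, q):
    """Lorentzian inner product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def four_velocity(v, c):
    """Equation (269): u = gamma * (c, v)."""
    g = 1.0 / math.sqrt(1.0 - dot3(v, v) / c ** 2)
    return (g * c,) + tuple(g * vi for vi in v)

def four_acceleration(v, a, c):
    """Equation (270), with a = dv/dt the ordinary 3-acceleration."""
    g = 1.0 / math.sqrt(1.0 - dot3(v, v) / c ** 2)
    av = dot3(a, v)
    time_part = g ** 4 * av / c
    space_part = tuple(g ** 2 * ai + g ** 4 * av * vi / c ** 2 for ai, vi in zip(a, v))
    return (time_part,) + space_part

c = 2.0                      # arbitrary units, chosen so that factors of c are not hidden
v = (0.5, -0.3, 1.1)         # |v| < c
a3 = (0.2, 0.7, -0.4)        # arbitrary 3-acceleration

u = four_velocity(v, c)
a4 = four_acceleration(v, a3, c)

uu = minkowski(u, u)    # expected: c^2, by (272)
ua = minkowski(u, a4)   # expected: 0, by (273)
aa = minkowski(a4, a4)  # expected: negative (spacelike)
print(uu, ua, aa)
```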
Looking “with relativistic eyes” to Newton’s 2nd law (267) we write
$$K^\mu = m\,\frac{d^2}{d\tau^2}\,x^\mu(\tau) \tag{274}$$

This equation would be Lorentz covariant—manifestly covariant—if

K µ ≡ Minkowski force transforms like a 4-vector

and m transforms as an invariant. The Minkowski equation (274) can be

reformulated

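$$K^\mu = m\,\frac{d}{d\tau}u^\mu = m\,a^\mu$$

or again

$$K^\mu = \frac{d}{d\tau}\,p^\mu$$

where

$$p^\mu \equiv m\,u^\mu = \gamma m\begin{pmatrix} c \\ \mathbf v \end{pmatrix} \equiv \begin{pmatrix} p^0 \\ \mathbf p \end{pmatrix} \tag{275}$$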

From the γ -expansion (202) we obtain

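$$\begin{aligned}
p^0 &= \gamma mc \\
&= \left(1 + \tfrac{1}{2}\beta^2 + \tfrac{3}{8}\beta^4 + \cdots\right)mc \\
&= \tfrac{1}{c}\left(mc^2 + \tfrac{1}{2}mv^2 + \cdots\right)
\end{aligned} \tag{276.1}$$

in which the term ½mv² is familiar from non-relativistic dynamics as kinetic energy, and

$$\begin{aligned}
\mathbf p &= \gamma m\mathbf v \\
&= m\mathbf v + \cdots
\end{aligned} \tag{276.2}$$

in which the leading term mv is familiar from non-relativistic dynamics as linear momentum.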

It becomes in this light reasonable to call pµ the energy-momentum 4 -vector.

149
The constant speed condition

v · v = constant

is sometimes encountered, but has no claim to “universality” in non-relativistic

physics: when encountered (as in uniform circular motion), it entails v ⊥ a.

Looking to the ﬁner details of standard relativistic terminology . . . one writes

$$p^0 = \tfrac{1}{c}\,E \tag{277}$$

and calls

$$E = \gamma mc^2 = mc^2 + \tfrac{1}{2}mv^2 + \cdots$$

the relativistic energy. More particularly

E0 ≡ mc2 is the rest energy (278)

T ≡ E − E0 is the relativistic kinetic energy

In terms of the v -dependent “relativistic mass” deﬁned150

$$M \equiv \gamma m = \frac{m}{\sqrt{1 - v^2/c^2}} \tag{279}$$

we have

$$E = Mc^2 \tag{280.1}$$

and

$$T = (M - m)c^2 = \left[\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right]mc^2$$

The relativistic momentum can in this notation be described

$$\mathbf p = M\mathbf v \tag{280.2}$$

It is—so far as I can tell—the “non-relativistic familiarity” of (280.2) that

tempts some people151 to view (279) as the fruit of an astounding “empirical
discovery,” lying (they would have us believe) near the physical heart of special
relativity. But (279) is, I insist, a definition—an occasional convenience,
nothing more—one incidental detail among many in a coherent theory. It
is naive to repeat the tired claim that “in relativity mass becomes velocity
dependent: ” it is profoundly wrongheaded to attempt to force relativistic
dynamics to look less relativistic than it is.
We have

$$p = \begin{pmatrix} \tfrac{1}{c}E \\ \mathbf p \end{pmatrix} = m\,u$$

and from (272) it follows that

$$(p, p) = (E/c)^2 - \mathbf p\cdot\mathbf p = m^2c^2 \tag{281}$$

This means that p lies always on a certain m-determined hyperboloid (called

the “mass shell ”: see Figure 70) in 4 -dimensional energy-momentum space.
150
It becomes natural in the following context to call m the rest mass, though
in grown-up relativistic physics there is really no other kind . Those who write
m when they mean M are obliged to write m0 to distinguish the rest mass.
151
See, for example, A. P. French, Special Relativity: The MIT Introductory
Physics Series (), page 23.

Figure 70: The hyperboloidal mass shell, based upon (281) and
drawn in energy-momentum space. The p⁰-axis (energy axis) runs
up. The mass shell intersects the p⁰-axis at a point determined by
the value of m:
p⁰ = mc, i.e., E = mc²
The figure remains meaningful (though the hyperboloid becomes a
cone) even in the limit m ↓ 0, which provides first indication that
relativistic mechanics supports a theory of massless particles.

From (281) we obtain

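$$\begin{aligned}
E &= \pm c\sqrt{\mathbf p\cdot\mathbf p + (mc)^2} \\
&= \pm\left(mc^2 + \tfrac{1}{2m}\,\mathbf p\cdot\mathbf p + \cdots\right)
\end{aligned} \tag{282}$$

which for a relativistic particle describes the p-dependence of the energy E, and
should be compared with its non-relativistic free-particle counterpart

$$E = \tfrac{1}{2m}\,\mathbf p\cdot\mathbf p$$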

The ± assumes major importance in relativistic quantum mechanics (where it

must be explained away lest it provide a rathole that would de-stabilize the
world! ), but in relativistic classical mechanics one simply abandons the minus
sign—dismisses it as an algebraic artifact.
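The mass-shell relation (281), the expansion (282) and the massless limit mentioned in the caption of Figure 70 can all be illustrated with a few lines of Python. This is only a sketch under simple assumptions: motion along one axis, arbitrary units, and hypothetical function names. The positive root is taken throughout, as classical mechanics prescribes.

```python
import math

def gamma(v, c):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def energy_from_momentum(p, m, c):
    """Positive root of (282): E = c * sqrt(p^2 + (m c)^2)."""
    return c * math.sqrt(p * p + (m * c) ** 2)

m, c = 2.0, 3.0   # arbitrary units

# 1. A fast particle sits on the mass shell: (E/c)^2 - p^2 = (m c)^2.
v = 0.4 * c
E = gamma(v, c) * m * c ** 2   # E = gamma m c^2
p = gamma(v, c) * m * v        # p = gamma m v, as in (276.2)
shell_defect = (E / c) ** 2 - p ** 2 - (m * c) ** 2
root_defect = E - energy_from_momentum(p, m, c)

# 2. A slow particle: E - m c^2 approaches the Newtonian p^2 / (2 m).
v_slow = 1e-3 * c
p_slow = gamma(v_slow, c) * m * v_slow
kinetic = energy_from_momentum(p_slow, m, c) - m * c ** 2
newtonian = p_slow ** 2 / (2.0 * m)
relative_gap = abs(kinetic - newtonian) / newtonian

# 3. Massless limit: the hyperboloid degenerates to the cone E = c |p|.
massless_defect = energy_from_momentum(5.0, 0.0, c) - c * 5.0

print(shell_defect, root_defect, relative_gap, massless_defect)
```

The relative gap in the second check is of order β², which is the size of the first neglected term in (282).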
Looking next to the structure of K^µ . . . it follows from the Minkowski
equation K = ma by (u, a) = 0 that

(K, u) = 0 : K ⊥ u in the Lorentzian sense (283)

We infer that the 4 -vectors that describe Minkowski forces are invariably
spacelike. It follows moreover from (283) that as p ∼ u moves around the
K-vector must move in concert, contriving always to be ⊥ to u: in relativistic

dynamics all forces are velocity-dependent. What was fairly exceptional in

non-relativistic dynamics (where F_damping = −b v and F_magnetic = (q/c) v × B
are the only velocity-dependent forces that come readily to mind) is in
relativistic dynamics universal . Symbolically

K = K(u, . . .)

where the dots signify such other variables as may in particular cases enter into
the construction of K. The simplest case—which is, as we shall see, the case of
electrodynamical interest—arises when K depends linearly on u:

$$K_\mu = A_{\mu\nu}\,u^\nu \tag{284.1}$$

where (K, u) = A_{µν} u^µ u^ν = 0 forces the quantities A_{µν}(. . .) to satisfy the antisymmetry condition

$$A_{\mu\nu} = -A_{\nu\mu} \tag{284.2}$$

K-vectors that depend quadratically upon u exist in much greater variety: the
following example

$$K_\mu = \phi^\alpha(x)\left[c^2 g_{\alpha\mu} - u_\alpha u_\mu\right]$$

ﬁgured prominently in early (unsuccessful) eﬀorts to construct a special

relativistic theory of gravitation.152,153
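If K is notated

$$K = \begin{pmatrix} K^0 \\ \mathbf K \end{pmatrix} \tag{285}$$

then (283)—written γ(K⁰c − K·v) = 0—entails

$$K^0 = \tfrac{1}{c}\,\mathbf K\cdot\mathbf v \;:\; \text{knowledge of } \mathbf K \text{ determines } K^0 \tag{286}$$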

It follows in particular that

$$K^0 = 0 \quad\text{in the (momentary) rest frame} \tag{287}$$

It is, of course, the non-zero value of K that causes the particle to take leave
of (what a moment ago was) the rest frame. Borrowing notation from (275) and

152
This work (∼) is associated mainly with the name of G. Nordström,
but for a brief period engaged the enthusiastic attention of Einstein himself:
see page 144 in Pauli,135 and also A. O. Barut, Electrodynamics and Classical
Theory of Fields and Particles (), page 56; A. Pais, Subtle is the Lord: The
Science and Life of Albert Einstein (), page 232.
153
For further discussion of the “general theory of K-construction” see my
relativistic dynamics (), pages 13–22.

(285), the Minkowski equation (274) becomes

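$$\begin{pmatrix} K^0 \\ \mathbf K \end{pmatrix} = \gamma\,\frac{d}{dt}\begin{pmatrix} \gamma mc \\ \gamma m\mathbf v \end{pmatrix} \tag{288}$$

where use has been made once again of d/dτ = γ d/dt. In the non-relativistic limit this becomes

$$\begin{pmatrix} 0 \\ \mathbf F \end{pmatrix} = \frac{d}{dt}\begin{pmatrix} 0 \\ m\mathbf v \end{pmatrix} \qquad\longleftarrow\ \text{Newtonian!}$$

where we have written

$$\mathbf F \equiv \lim_{c\uparrow\infty}\mathbf K \tag{289}$$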

to account for such c-factors as may lurk in the construction of K . We are used
to thinking of the “non-relativistic limit” as an approximation to relativistic
physics, but at this point it becomes appropriate to remark that
In fully relativistic particle dynamics the “non-relativistic
limit” becomes literally effective in the momentary rest frame.
The implication is that if we knew the force F experienced by a particle at rest
then we could by Lorentz transformation obtain the Minkowski force K active
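upon a moving particle:

$$\begin{pmatrix} K^0 \\ \mathbf K \end{pmatrix} = \Lambda(\boldsymbol\beta)\begin{pmatrix} 0 \\ \mathbf F \end{pmatrix} \tag{290}$$

Reading from (210.1) it follows more particularly that

$$\begin{aligned}
K^0 &= \gamma\,\tfrac{1}{c}\,\mathbf v\cdot\mathbf F \\
\mathbf K &= \mathbf F + (\gamma - 1)\,\frac{(\mathbf v\cdot\mathbf F)}{v^2}\,\mathbf v = \mathbf F_{\perp} + \gamma\mathbf F_{\parallel}
\end{aligned} \tag{291}$$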
from which, it is gratifying to observe, one can recover both (289) and (286).
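The passage from (290) to (291), and the recovery of (286), can be checked numerically. The Python sketch below is illustrative only: the helper names and sample values are hypothetical, and the general boost matrix is assumed to have the standard form that reduces to the x-axis matrix used earlier.

```python
import math

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def boost_matrix(v, c):
    """Assumed general boost: gamma, gamma*beta_i, delta_ij + (gamma-1) beta_i beta_j / beta^2."""
    beta = [vi / c for vi in v]
    b2 = dot3(beta, beta)
    g = 1.0 / math.sqrt(1.0 - b2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = g * beta[i]
        for j in range(3):
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + (g - 1.0) * beta[i] * beta[j] / b2
    return L, g

def minkowski_force(F, v, c):
    """Equations (291): K^0 = gamma v.F / c and K = F + (gamma - 1)(v.F) v / v^2."""
    g = 1.0 / math.sqrt(1.0 - dot3(v, v) / c ** 2)
    vF = dot3(v, F)
    K0 = g * vF / c
    K = tuple(Fi + (g - 1.0) * vF * vi / dot3(v, v) for Fi, vi in zip(F, v))
    return (K0,) + K

c = 2.0
v = (0.6, -0.8, 1.0)    # particle velocity, |v| < c
F = (1.5, 0.2, -0.7)    # force in the momentary rest frame

L, g = boost_matrix(v, c)
rest = (0.0,) + F
K_boosted = tuple(sum(L[m][n] * rest[n] for n in range(4)) for m in range(4))  # (290)
K_formula = minkowski_force(F, v, c)                                           # (291)
max_diff = max(abs(x - y) for x, y in zip(K_boosted, K_formula))

# (286): K^0 = K.v / c, which is the same as (K, u) = 0
defect_286 = K_formula[0] - dot3(K_formula[1:], v) / c
print(max_diff, defect_286)
```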
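We stand now (at last) in position to trace the details of the program
proposed154 in a specifically electrodynamical setting by Einstein. Suppose
that a charged particle experiences a force

$$\mathbf F = q\mathbf E_0 \;:\; \mathbf E_0 \equiv \text{electrical field in the particle’s rest frame}$$

Then

$$\mathbf K = q(\mathbf E_{0\perp} + \gamma\mathbf E_{0\parallel})$$

But from the field transformation equations (263) it follows that

$$\begin{aligned}
\mathbf E_{0\perp} &= \gamma(\mathbf E + \boldsymbol\beta\times\mathbf B)_{\perp} \\
\mathbf E_{0\parallel} &= (\mathbf E + \boldsymbol\beta\times\mathbf B)_{\parallel}
\end{aligned}$$

where E and B refer to our perception of the electric and magnetic fields at
the particle’s location, and β to our perception of the particle’s velocity. So
(because the γ-factors interdigitate so sweetly) we have

$$\mathbf K = \gamma q\left(\mathbf E + \tfrac{1}{c}\,\mathbf v\times\mathbf B\right) \tag{292}$$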

154
See again page 186.

But (288) supplies K = γ (d/dt)(γmv), so (dropping the γ-factors on left and right)155
we have

$$q\left(\mathbf E + \tfrac{1}{c}\,\mathbf v\times\mathbf B\right) = \frac{d}{dt}(\gamma m\mathbf v) \tag{293}$$
This famous equation describes the relativistic motion of a charged particle in
an impressed electromagnetic field (no radiation or radiative reaction), and is
the upshot of 156 the Lorentz force law —obtained here not as an it ad hoc
assumption, but as a forced consequence of
• some general features of relativistic particle dynamics
• the transformation properties of electromagnetic fields
• the operational definition of E . . . all fitted into
• Einstein’s “go to the frame of the particle” program (pages 186 & 189).
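Returning with (292) to (286) we obtain

$$K^0 = \tfrac{1}{c}\,\gamma q\,\mathbf{E}\cdot\mathbf{v} \tag{294}$$

so the Minkowski 4-force experienced by a charged particle in an impressed
electromagnetic field becomes

$$K = \begin{pmatrix} K^0 \\ \mathbf{K} \end{pmatrix} = \gamma q \begin{pmatrix} \tfrac{1}{c}\,\mathbf{E}\cdot\mathbf{v} \\ \mathbf{E} + \tfrac{1}{c}\,\mathbf{v}\times\mathbf{B} \end{pmatrix} = (q/c)\begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ E_1 & 0 & B_3 & -B_2 \\ E_2 & -B_3 & 0 & B_1 \\ E_3 & B_2 & -B_1 & 0 \end{pmatrix}\begin{pmatrix} \gamma c \\ \gamma v_1 \\ \gamma v_2 \\ \gamma v_3 \end{pmatrix}$$

$$\downarrow$$

$$K^\mu = (q/c)\,F^\mu{}_\nu\,u^\nu \tag{295}$$

We are brought thus to the striking conclusion that the electromagnetic
Minkowski force is, in the sense described at (284), simplest possible.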
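The matrix form (295) can be checked against the vector form built from (292) and (294). The following is only an illustrative sketch: the function names and the sample field values are assumptions of mine, and units are chosen with c = 1 by default.

```python
import math

def cross(a, b):
    """Ordinary 3-vector cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gamma_of(v, c):
    return 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)

def four_force_matrix(q, E, B, v, c=1.0):
    """K^mu = (q/c) F^mu_nu u^nu with the mixed field tensor of (295)."""
    g = gamma_of(v, c)
    F = [[0.0,  E[0],  E[1],  E[2]],
         [E[0], 0.0,   B[2], -B[1]],
         [E[1], -B[2], 0.0,   B[0]],
         [E[2],  B[1], -B[0], 0.0]]
    u = [g * c, g * v[0], g * v[1], g * v[2]]
    return [(q / c) * sum(F[m][n] * u[n] for n in range(4)) for m in range(4)]

def four_force_vector(q, E, B, v, c=1.0):
    """K^0 from (294) and the spatial part from (292)."""
    g = gamma_of(v, c)
    vxB = cross(v, B)
    K0 = g * q * sum(e * x for e, x in zip(E, v)) / c
    return [K0] + [g * q * (E[i] + vxB[i] / c) for i in range(3)]

if __name__ == "__main__":
    q, E, B, v = 1.5, (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0), (0.1, -0.2, 0.3)
    print(four_force_matrix(q, E, B, v))
    print(four_force_vector(q, E, B, v))
```

Both routes should print the same four numbers; the matrix route makes it plain that the force is linear in the 4-velocity.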
The theory in hand descends from $\mathbf{F} = m\ddot{\mathbf{x}}$, and might plausibly be called
“relativistic Newtonian dynamics.” Were we to continue this discussion we
might expect to busy ourselves with the construction of
• a “relativistic Lagrangian dynamics”
• a “relativistic Hamiltonian dynamics”
• a “relativistic Hamilton-Jacobi formalism”
• “relativistic variational principles,” etc.
all in an effort to produce a full-blown “relativistic dynamics of particles.”
The subject¹⁵⁷ is, however, a minefield, and must be pursued with much greater
delicacy than the standard texts suggest. Relativistic particle mechanics
155 problem 50.
156 See again equation (67) on page 35.
157 See the notes¹⁵³ already cited.

remains in a relatively primitive state of development because many of the
concepts central to non-relativistic mechanics are, for reasons having mainly to
do with the breakdown of non-local simultaneity, in conflict with the principle
of relativity. But while the relativistic theory of interacting particles presents
awkwardnesses at every turn, the relativistic theory of interacting fields unfolds
with great ease and naturalness: it appears to be a lesson of relativity that we
should adopt a field-theoretic view of the world.
We have already in hand a relativistic particle mechanics which, though
rudimentary, is sufficient to our electrodynamic needs. Were we to pursue this
subject we would want to look to the problem of solving Minkowski’s equation
of motion (274) in illustrative special cases . . . any short list of which would
include
• the relativistic harmonic oscillator
• the relativistic Kepler problem
• motion in a (spatially/temporally) constant electromagnetic field.
This I do on pages 245–275 of electrodynamics (/), where I give also
many references. The most significant point to emerge from that discussion is
that distinct relativistic systems can have the same non-relativistic limit; i.e.,
that constructing the relativistic generalization of a non-relativistic system is an
inherently ambiguous process. For the present I must be content to examine two
physical questions that have come already to the periphery of our attention.

hyperbolic motion: the “twin paradox”

We, who call ourselves O, are inertial. A second observer Q sits on a mass point m which we see to
be moving with (some dynamically possible but otherwise) arbitrary motion
along our x-axis. I am tempted to say that Q rides in a little rocket, but that
would entail (on physical grounds extraneous to my main intent) the temporal
variability of m: let us suppose therefore that Q moves (accelerates) because
m is acted on by impressed forces. In any event, we imagine Q to be equipped
with
• a clock which, since co-moving, measures proper time τ
• an accelerometer, with output g. If Q were merely a passenger then g(τ)
would constitute a kind of log. But if Q were a rocket captain then g(τ)
might describe his flight instructions, his prescribed “throttle function.”
Finally, let O_τ designate the inertial observer who at proper time τ sees Q to
be instantaneously at rest: spacetime points to which we assign coordinates x
are by O_τ assigned coordinates x_τ. Our interest attaches initially to questions
such as the following: Given the throttle function g(τ),
1) What is the boost Λ(τ) associated with O ← O_τ?
2) What is the functional relationship between t and τ?
3) What are the functions
x(t) that describes our sense of Q’s position at time t
β(t) that describes our sense of Q’s velocity at time t
a(t) that describes our sense of Q’s acceleration at time t?

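Since O_τ sees Q to be momentarily resting at O_τ’s origin we have

$$u(\tau) = \Lambda(\tau)\begin{pmatrix} c \\ 0 \end{pmatrix} \quad\text{by (269)}$$

$$a(\tau) = \Lambda(\tau)\begin{pmatrix} 0 \\ g(\tau) \end{pmatrix} \quad\text{by (269)} \tag{296}$$

But

$$a(\tau) = \frac{du(\tau)}{d\tau} = \frac{d\Lambda(\tau)}{d\tau}\begin{pmatrix} c \\ 0 \end{pmatrix}$$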

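We know, moreover, that¹⁵⁸

$$\Lambda(\tau) = e^{A(\tau)J} \quad\text{with}\quad J \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad A(\tau) = \tanh^{-1}\beta(\tau)$$

so

$$\frac{d\Lambda(\tau)}{d\tau} = \Lambda(\tau)\cdot\frac{dA(\tau)}{d\tau}\,J, \qquad \frac{dA(\tau)}{d\tau} = \frac{1}{1-\beta^2}\,\frac{d\beta}{d\tau}$$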
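That the exponential of AJ really is the familiar boost matrix can be confirmed by summing the power series. The sketch below is illustrative only (the function names are mine); it uses J² = I to sort the series into diagonal and off-diagonal parts rather than multiplying matrices explicitly.

```python
import math

def boost_from_series(A, terms=40):
    """Sum exp(A J) = sum_n A^n J^n / n! for J = [[0, 1], [1, 0]].

    Because J*J is the identity, even powers of J contribute to the
    diagonal and odd powers to the off-diagonal entries.
    """
    diag = off = 0.0
    term = 1.0  # holds A^n / n!
    for n in range(terms):
        if n % 2 == 0:
            diag += term
        else:
            off += term
        term *= A / (n + 1)
    return [[diag, off], [off, diag]]

def boost_from_beta(beta):
    """The closed form gamma * [[1, beta], [beta, 1]]."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, gamma * beta], [gamma * beta, gamma]]

if __name__ == "__main__":
    beta = 0.6
    print(boost_from_series(math.atanh(beta)))
    print(boost_from_beta(beta))
```

The diagonal sum is cosh A = γ and the off-diagonal sum is sinh A = γβ, which is why the rapidity A, and not β, is the quantity that simply accumulates under acceleration.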
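Returning with this information to (296) we obtain

$$\frac{1}{1-\beta^2}\,\frac{d\beta}{d\tau} = \tfrac{1}{c}\,g(\tau)$$

where integration of dt/dτ = γ supplies

$$\tau = \int^{t}\sqrt{1-\beta^2(t')}\;dt' \tag{297}$$

Given g(•), our assignment therefore is to solve

$$\left[\frac{1}{1-\beta^2(t)}\right]^{3/2}\frac{d\beta(t)}{dt} = \tfrac{1}{c}\,g\!\left(\int^{t}\sqrt{1-\beta^2(t')}\;dt'\right) \tag{298}$$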

for β(t): a ﬁnal integration would then supply the x(t) that describes our
perception of Q’s worldline. The problem presented by (298) appears in the
general case to be hopeless . . . but let us at this point assume that the throttle
function has the simple structure

g(τ ) = g : constant
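Though (298) resists analytical attack for a general throttle function, the problem is easy to integrate numerically if one works in proper time, where the rapidity A = tanh⁻¹β obeys simply dA/dτ = g(τ)/c while dt/dτ = cosh A and dx/dτ = c sinh A. The following sketch assumes that reformulation; the function name, the step count and the choice of a classical Runge-Kutta scheme are my own and not part of the notes.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def integrate_flight(throttle, tau_end, steps=2000, c=C):
    """Integrate rapidity A, coordinate time t and position x over proper time.

    throttle(tau) is the accelerometer reading g(tau). Returns (beta, t, x)
    at tau_end, starting from rest at t = x = 0.
    """
    def rate(tau, state):
        A, _t, _x = state
        return (throttle(tau) / c, math.cosh(A), c * math.sinh(A))

    def shifted(state, k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    h = tau_end / steps
    state, tau = (0.0, 0.0, 0.0), 0.0
    for _ in range(steps):
        k1 = rate(tau, state)
        k2 = rate(tau + h / 2, shifted(state, k1, h / 2))
        k3 = rate(tau + h / 2, shifted(state, k2, h / 2))
        k4 = rate(tau + h, shifted(state, k3, h))
        state = tuple(s + h / 6 * (a + 2 * b + 2 * d + e)
                      for s, a, b, d, e in zip(state, k1, k2, k3, k4))
        tau += h
    A, t, x = state
    return math.tanh(A), t, x

if __name__ == "__main__":
    # Constant throttle, for which closed forms are available.
    print(integrate_flight(lambda tau: 9.8, 3.0e7))
```

For a constant throttle the output can be compared with β = tanh(gτ/c), t = (c/g) sinh(gτ/c) and x = (c²/g)[cosh(gτ/c) − 1]. A discontinuous throttle would be handled more accurately by integrating each smooth piece separately.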

158 See again pages 138 and 139.

The integrodifferential equation (298) then becomes a differential equation
which integrates at once: assuming β(0) = 0 we obtain $\beta/\sqrt{1-\beta^2} = (g/c)\,t$,
giving

$$\beta(t) = \frac{t}{\sqrt{(c/g)^2 + t^2}} \tag{299.1}$$

By integration we therefore have¹⁵⁹

$$\left[x(t) - x(0) + (c^2/g)\right]^2 - (ct)^2 = (c^2/g)^2 \tag{299.2}$$

and

$$\tau(t) = (c/g)\,\sinh^{-1}(gt/c) \tag{299.3}$$

while expansion in powers of gt/c (which presumes gt ≪ c) gives

$$\begin{aligned} v(t) &= g\,t\left[1 - \tfrac{1}{2}(gt/c)^2 + \cdots\right] \\ x(t) &= x(0) + \tfrac{1}{2}g\,t^2\left[1 - \tfrac{1}{4}(gt/c)^2 + \cdots\right] \\ \tau(t) &= t\left[1 - \tfrac{1}{6}(gt/c)^2 + \cdots\right] \end{aligned} \tag{300}$$

in which the leading terms conform to non-relativistic experience.
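The closed forms (299) and the truncated series (300) are easily compared numerically. The sketch below is illustrative: the function names are hypothetical, x(0) is taken to be zero, and x(t) is evaluated in an algebraically equivalent form that avoids subtracting two nearly equal large numbers when gt ≪ c.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def exact(t, g, c=C):
    """Closed forms (299.1)-(299.3) with x(0) = 0; returns (v, x, tau)."""
    root = math.sqrt((c * t) ** 2 + (c * c / g) ** 2)
    beta = t / math.sqrt((c / g) ** 2 + t * t)
    # x = root - c^2/g, rewritten to avoid cancellation for small t
    x = (c * t) ** 2 / (root + c * c / g)
    tau = (c / g) * math.asinh(g * t / c)
    return c * beta, x, tau

def series(t, g, c=C):
    """Two-term expansions (300); returns (v, x, tau)."""
    s2 = (g * t / c) ** 2
    return (g * t * (1 - s2 / 2),
            0.5 * g * t * t * (1 - s2 / 4),
            t * (1 - s2 / 6))

if __name__ == "__main__":
    for t in (1.0e5, 1.0e7, 1.0e8):  # seconds of our time
        print(t, exact(t, 9.8), series(t, 9.8))
```

The two descriptions should agree closely while gt/c is small and part company as gt/c approaches unity, which for g = 9.8 m/s² happens after roughly a year of our time.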

According to (299.2) we see Q to trace out (not a parabolic worldline, as in
non-relativistic physics, but) a hyperbolic worldline, as shown in Figure 71.
The results now in hand place us in position to construct concrete
illustrations of several points that have been discussed thus far only as vague
generalities:
1. Equation (299.1) entails

$$\gamma(t) = \sqrt{1 + (gt/c)^2}$$

which places us in position to construct an explicit description

$$\Lambda(t) = \gamma(t)\begin{pmatrix} 1 & \beta(t) \\ \beta(t) & 1 \end{pmatrix} \quad:\quad \text{recall (201), with } t = (c/g)\sinh(g\tau/c) \text{ by (299.3)}$$

of the Lorentz matrix that achieves O ← O_τ, and thus to answer a question
posed on page 199. We can use that information to (for example) write

$$K(t) = m\,a(t) = \Lambda(t)\begin{pmatrix} 0 \\ mg \end{pmatrix}$$

to describe the relationship between

$$K(t) \equiv \text{our perception of the Minkowski force impressed upon } m$$

$$\begin{pmatrix} 0 \\ mg \end{pmatrix} \equiv O_\tau\text{’s perception of that Minkowski force}$$

159 problem 51.

Figure 71: Our (inertial) representation of the hyperbolic worldline
of a particle which initially rests at the point x(0) but moves off with
(in its own estimation) constant acceleration g. With characteristic
time c/g it approaches (and in Galilean physics would actually
achieve) the speed of light. If we assign to g the comfortable value
9.8 meters/second² we find c/g ≈ 354 days.

2.* In (299.2) set x(0) = 0. The resulting spacetime hyperbola is, by notational
adjustment ½λ → c²/g, identical to that encountered at the middle of page 176:
our perception of Q’s worldline is a conformal transform of Q’s own perception
of her (from her point of view trivial) worldline. If Q elected to pass her time
doing electrodynamics she would, though non-inertial, use equations that are
structurally identical to the (conformally covariant) equations that we might
use to describe those same electrodynamical events.
3. O is inertial, content to sit home at x = 0. Q, O’s twin, is an astronaut,
who at time t = 0 gives her brother a kiss and sets off on a flight along the
x-axis, on which her instruction is to execute the following throttle function:

$$g(\tau) = \begin{cases} +g &: \; 0 < \tau < \tfrac{1}{4}T \\ -g &: \; \tfrac{1}{4}T < \tau < \tfrac{3}{4}T \\ +g &: \; \tfrac{3}{4}T < \tau < T \end{cases}$$

Pretty clearly, O’s representation of Q’s worldline will be assembled from
four hyperbolic segments (Figure 72), each of duration (c/g) sinh(gT/4c). At

* This remark will be intelligible only to those brave readers who ignored my
recommendation that they skip §6.

Figure 72: Inertial observer O’s representation of the rocket flight
of his twin sister Q. If T ≫ 4c/g then O will see Q to be moving
much of the time at nearly the speed of light (hyperbola approaches
its asymptote). The dashed curve represents the flight of a lightbeam
that departs/returns simultaneously with Q.

the moment of her return the clock on Q’s control panel will read T, but
according to O’s clock the

$$\text{return time} = T\cdot(4c/gT)\,\sinh(gT/4c) \;\begin{cases} > T \\ \sim T \text{ only if } T \ll 4c/g \end{cases} \tag{301.1}$$

and Q’s adventure will have taken her to a turn-around point lying160 a
160 Work from (299.2).

Figure 73: Particle worldlines • → • all lie within the confines
of the blue box (interior of the spacetime region bounded by the
lightcones that extend forward from the lower vertex, and backward
from the later vertex). The red trajectory, though shortest-possible
in the Euclidean sense, is longest-possible in Minkowski’s sense,
while the blue trajectory has zero length. The “twin paradox” hinges
on the latter fact. The acceleration experienced by the rocket-borne
observer Q is, however, not abrupt (as at the kink in the blue
trajectory) but evenly distributed.

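$$\text{distance} = 2\left[\sqrt{(ct)^2 + (c^2/g)^2} - c^2/g\right]_{t \,=\, \frac{1}{4}(\text{return time})} \tag{301.2}$$

away. For brief trips we therefore have

$$\text{distance} = 2(c^2/g)\left[\sqrt{1 + (gt/c)^2} - 1\right] = 2\cdot\tfrac{1}{2}g\,t^2 + \cdots$$

while for long trips

$$\text{distance} = 2ct\left[\sqrt{1 + (c/gt)^2} - (c/gt)\right]$$

where the bracketed factor is always positive, always < 1, and approaches unity as t ↑ ∞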

—both of which make good intuitive sense.161 Notice (as Einstein—at the end
of §4 in his ﬁrst relativity paper—was the ﬁrst to do) that

Q is younger than O upon her return

and that this surprising fact can be attributed to a basic metric property of
spacetime (Figure 73).162 The so-called twin paradox arises when one argues
that from Q’s point of view it is O who has been doing the accelerating, and who
should return younger . . . and they can’t both be younger! But those who pose
the “paradox” misconstrue the meaning of the “relativity of motion.” Only O
remained inertial throughout the preceding exercise, and only Q had to purchase
rocket fuel . . . and those facts break the supposed “symmetry” of the situation.
The issue becomes more interesting with the observation that we have spent
our lives in (relative to the inertial frames falling through the ﬂoor) “a rocket
accelerating upward with acceleration g” (but have managed to do so without an
investment in “fuel”). Why does our predicament not more nearly resemble the
predicament of Q than of O?¹⁶³
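To attach numbers to the twin adventure, (301.1) and (301.2) can be evaluated directly. The sketch below is only illustrative: the function names and the sample trip are my own assumptions, the year is taken to be 365.25 days, and c is given its defined SI value.

```python
import math

C = 299_792_458.0          # speed of light in m/s
DAY = 86_400.0             # seconds
YEAR = 365.25 * DAY        # seconds (Julian year, an assumed convention)

def return_time(T, g, c=C):
    """O's clock reading at Q's return, from (301.1); T is Q's proper time."""
    return (4 * c / g) * math.sinh(g * T / (4 * c))

def turnaround_distance(T, g, c=C):
    """Distance to the turn-around point, from (301.2)."""
    t = return_time(T, g, c) / 4
    return 2 * (math.hypot(c * t, c * c / g) - c * c / g)

if __name__ == "__main__":
    g = 9.8
    print("c/g in days:", C / g / DAY)
    T = 10 * YEAR  # ten years by Q's clock
    print("O's elapsed years:", return_time(T, g) / YEAR)
    print("turn-around distance in light-years:",
          turnaround_distance(T, g) / (C * YEAR))
```

Since the first leg satisfies ct = (c²/g) sinh(gT/4c), the distance can equally be written 2(c²/g)[cosh(gT/4c) − 1], which provides an independent check on the arithmetic.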

current-charge interaction from two points of view

We possess a command of relativistic electrodynamics/particle dynamics that is now so
complete that we can contemplate detailed analysis of the “asymmetries” that
served to motivate Einstein’s initial relativistic work. The outline of the
illustrative discussion which follows was brought to my attention by Richard
Crandall.¹⁶⁴ The discussion involves rather more than mere “asymmetry:” on
its face it involves a “paradox.” The system of interest, and the problem it
presents, are described in Figure 74. The observer O who is at rest with respect
to the wire sees an electromagnetic field which (at points exterior to the wire)
can be described

$$\mathbf{E} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad\text{and}\quad \mathbf{B} = \begin{pmatrix} 0 \\ -Bz/R \\ +By/R \end{pmatrix}$$

where $B = I/2\pi cR$ and $R = \sqrt{y^2 + z^2}$. The Minkowski 4-force experienced by
q therefore becomes (see again (295))

$$\begin{pmatrix} K^0 \\ K^1 \\ K^2 \\ K^3 \end{pmatrix} = (q/c)\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & By/R & Bz/R \\ 0 & -By/R & 0 & 0 \\ 0 & -Bz/R & 0 & 0 \end{pmatrix}\begin{pmatrix} \gamma c \\ \gamma v \\ 0 \\ 0 \end{pmatrix}$$

161 problem 52.
162 problem 53.
163 See at this point C. W. Sherwin, “Some recent experimental tests of the
clock paradox,” Phys. Rev. 120, 17 (1960).
164 For parallel remarks see §5.9 in E. M. Purcell’s Electricity & Magnetism:
Berkeley Physics Course–Volume 2 () and §13.6 of The Feynman Lectures
on Physics–Volume 2 ().

Figure 74: At top: O’s view of the system of interest . . . and at
bottom: Ō’s view. O, at rest with respect to a cylindrical conductor
carrying current I, sees a charge q whose initial motion is parallel
to the wire. He argues that the wire is wrapped round by a solenoidal
magnetic field, so the moving charge experiences a (v × B)-force
directed toward the wire, to which the particle responds by veering
toward and ultimately impacting the wire. Ō is (initially) at rest
with respect to the particle, so must attribute the impact to an electrical
force. But electrical forces arise (in the absence of time-dependent
magnetic fields) only from charges. The nub of the problem: How
do uncharged current-carrying wires manage to appear charged to
moving observers?

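So we have

$$\begin{pmatrix} K^0 \\ K^1 \\ K^2 \\ K^3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -(\gamma qBv/c)\,y/R \\ -(\gamma qBv/c)\,z/R \end{pmatrix} \equiv \begin{pmatrix} K^0 \\ \mathbf{K} \end{pmatrix}$$

according to which K is directed radially toward the wire. To describe this
same physics Ō, who sees O to be moving to the left with speed v, writes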

   
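$$\bar{K} = \Lambda K = (q/c)\cdot\underbrace{\Lambda\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & By/R & Bz/R \\ 0 & -By/R & 0 & 0 \\ 0 & -Bz/R & 0 & 0 \end{pmatrix}\Lambda^{-1}}_{\bar{F}}\cdot\underbrace{\Lambda\begin{pmatrix} \gamma c \\ \gamma v \\ 0 \\ 0 \end{pmatrix}}_{\bar{u}}$$

with

$$\Lambda = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Straightforward computation supplies

$$\bar{K} = (q/c)\cdot\begin{pmatrix} 0 & 0 & -\beta\gamma By/R & -\beta\gamma Bz/R \\ 0 & 0 & +\gamma By/R & +\gamma Bz/R \\ -\beta\gamma By/R & -\gamma By/R & 0 & 0 \\ -\beta\gamma Bz/R & -\gamma Bz/R & 0 & 0 \end{pmatrix}\begin{pmatrix} c \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -(\gamma qBv/c)\,y/R \\ -(\gamma qBv/c)\,z/R \end{pmatrix} \equiv \begin{pmatrix} \bar{K}^0 \\ \bar{\mathbf{K}} \end{pmatrix}$$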

While O saw only a B-field, it is clear from the computed structure of F̄ that Ō
sees both a B-field (γ times stronger than O’s) and an E-field. We have known
since (210.2) that

(spatial part of any 4-vector)⊥ boosts by invariance

so (since K ⊥ v) we are not surprised to discover that

K̄ = K, but observe that O considers K to be a magnetic effect
while Ō considers K̄ to be an electric effect

More specifically, Ō sees (Figure 74) a centrally-directed electric field of just
the strength

$$\bar{E} = \beta\gamma B = \beta\gamma I/2\pi cR$$

that would arise from an infinite line charge of linear density

$$\bar{\lambda} = -\beta\gamma I/c$$
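The matrix arithmetic just described is tedious by hand but trivial by machine. The following sketch repeats it numerically; the function names and sample values are assumptions of mine, units are chosen with c = 1 by default, and the field tensor is the mixed form used in (295).

```python
import math

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(A, x):
    return [sum(A[i][k] * x[k] for k in range(4)) for i in range(4)]

def wire_in_both_frames(q, I, y, z, beta, c=1.0):
    """Field tensor and 4-force for the wire problem, seen by O and by O-bar."""
    R = math.hypot(y, z)
    B = I / (2 * math.pi * c * R)
    a, b = B * y / R, B * z / R
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    F = [[0, 0, 0, 0], [0, 0, a, b], [0, -a, 0, 0], [0, -b, 0, 0]]
    L = [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0],
         [0, 0, 1, 0], [0, 0, 0, 1]]
    L_inv = [[g, g * beta, 0, 0], [g * beta, g, 0, 0],
             [0, 0, 1, 0], [0, 0, 0, 1]]
    u = [g * c, g * beta * c, 0, 0]          # 4-velocity of q according to O
    K = [(q / c) * k for k in matvec(F, u)]
    F_bar = matmul(matmul(L, F), L_inv)      # field tensor according to O-bar
    u_bar = matvec(L, u)                     # should be (c, 0, 0, 0)
    K_bar = [(q / c) * k for k in matvec(F_bar, u_bar)]
    return {"B": B, "gamma": g, "K": K, "K_bar": K_bar,
            "F_bar": F_bar, "u_bar": u_bar}

if __name__ == "__main__":
    r = wire_in_both_frames(q=2.0, I=3.0, y=0.4, z=0.3, beta=0.6)
    print(r["K"], r["K_bar"])
    print("E-bar magnitude:", math.hypot(r["F_bar"][0][2], r["F_bar"][0][3]))
```

The top row of the transformed tensor carries the electric field that the co-moving observer perceives, and its magnitude should come out as βγB.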

The question now before us: How does the current-carrying wire acquire, in Ō’s
estimation, a net charge? An answer of sorts can be obtained as follows: Assume
(in the interest merely of simplicity) that the current is uniformly distributed
on the wire’s cross-section:

$$I = ja \quad\text{where}\quad a \equiv \pi r^2 = \text{cross-sectional area}$$


Figure 75: O’s representation of current ﬂow in a stationary wire

and (below) the result of Lorentz transforming that diagram to the
frame of the passing charge q. For interpretive commentary see the
text.

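To describe the current 4-vector interior to the wire O therefore writes

$$j = \begin{pmatrix} 0 \\ I/a \\ 0 \\ 0 \end{pmatrix}$$

Ō, on the other hand, writes the Lorentz transform of j:

$$\bar{j} \equiv \begin{pmatrix} c\bar{\rho} \\ \bar{\mathbf{j}} \end{pmatrix} = \Lambda j = \begin{pmatrix} -\beta\gamma I/a \\ \gamma I/a \\ 0 \\ 0 \end{pmatrix} \quad\Longrightarrow\quad \bar{\rho} = -\beta\gamma I/ac$$

O and Ō assign identical values to the cross-sectional area

$$\bar{a} = a \quad\text{because cross-section} \perp \mathbf{v}$$

so Ō obtains

$$\bar{\lambda} \equiv \text{charge per unit length} = \bar{\rho}\,\bar{a} = -\beta\gamma I/c$$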
in precise agreement with the result deduced previously. Sharpened insight
into the mechanism that lies at the heart of this counterintuitive result can be
gained from a comparison of the spacetime diagrams presented in Figure 75. At
top we see O’s representation of current in a stationary wire: negatively ionized
atoms stand in place, positive charges drift in the direction of current flow.¹⁶⁵
In the lower figure we see how the situation presents itself to an observer Ō
who is moving with speed v in a direction parallel to the current flow. At any
instant of time (look, for example, to her x̄⁰ = 0 timeslice, drawn in red) Ō sees
ions and charge carriers to have distinct linear densities . . . the reason being
that she sees ions and charge carriers to be moving with distinct speeds, and
the intervals separating one ion from the next, one charge carrier from the next
to be Lorentz contracted by distinct amounts. Ō’s charged wire is, therefore, a
differential Lorentz contraction effect. That such a small velocity differential

drift velocity relative to ions ∼ 10⁻¹¹ c

can, from Ō’s perspective, give rise to a measurable net charge is no more
surprising than that it can, from O’s perspective, give rise to a measurable
net current: both can be attributed to the fact that an awful lot of charges
participate in the drift.
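The differential-contraction account can be tested against λ̄ = −βγI/c with a crude two-species model. Everything in the sketch below is an assumption made for illustration: ions of charge −e at rest and carriers of charge +e drifting with speed u (the sign convention adopted in the text), equal number densities n in the wire frame, the relativistic velocity-addition rule for the carriers, and hypothetical function and parameter names.

```python
import math

def gamma(speed, c):
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)

def line_charge_seen_by_moving_observer(n, e, area, u, v, c=1.0):
    """Return (lambda_bar from densities, lambda_bar from -beta*gamma*I/c).

    n    : number density of ions and of carriers in the wire frame
    e    : magnitude of the charge on each ion and each carrier
    area : cross-sectional area (unchanged by the boost)
    u    : drift speed of the carriers in the wire frame
    v    : speed of the observer, parallel to the current
    """
    # Ions are at rest in the wire frame, so their spacing contracts by gamma(v).
    n_ions_bar = gamma(v, c) * n
    # Carriers: pass to their own rest frame, then to the observer's frame.
    n_carriers_rest = n / gamma(u, c)
    u_bar = (u - v) / (1.0 - u * v / c**2)
    n_carriers_bar = gamma(u_bar, c) * n_carriers_rest
    lam_from_densities = e * area * (n_carriers_bar - n_ions_bar)
    current = e * n * area * u
    lam_from_formula = -gamma(v, c) * (v / c) * current / c
    return lam_from_densities, lam_from_formula

if __name__ == "__main__":
    print(line_charge_seen_by_moving_observer(n=1.0, e=1.0, area=1.0,
                                              u=0.01, v=0.6))
```

The two returned numbers should agree. With a realistic drift speed the same agreement holds in principle, though the subtraction of nearly equal densities then taxes floating-point precision, which is itself a fair picture of how delicate the cancellation is.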

165 O knows perfectly well that in point of physical fact the ionized atoms
are positively charged, the current carriers negatively charged, and their drift
opposite to the direction of current flow: the problem is that Benjamin Franklin
did not know that. But the logic of the argument is unaffected by this detail.

Just about any electro-mechanical system would yield similar asymmetries/
“paradoxes” when analysed by alternative inertial observers O and Ō. The
preceding discussion is in all respects typical, and serves to illustrate two points
of general methodological significance:
• The formal mechanisms of (manifestly covariant) relativistic physics are
so powerful that they tend to lead one automatically past conceptual
diﬃculties of the sort that initially so bothered Einstein, and (for that
very reason) . . .
• They tend, when routinely applied, to divert one’s attention from certain
(potentially quite useful) physical insights: there exist points of physical
principle which relativistic physics illuminates only when explicitly
interrogated.
When using powerful tools one should always wear goggles.
