Chapter 2 Special Relativity
From Electrodynamics to
SPECIAL RELATIVITY
. . . but have now to examine the more detailed ramifications of this formal
development. The issue leads, of course, to special relativity.
That special relativity is—though born of electrodynamics—“bigger” than
electrodynamics (i.e., that it has non-electrodynamic implications, applications
—and roots) is a point clearly appreciated by Einstein himself (). Readers
should understand, therefore, that my intent here is a limited one: my goal
is not to produce a “complete account of the special theory of relativity” but
only to develop those aspects of special relativity which are specifically relevant
to our electrodynamical needs . . . and, conversely, to underscore those aspects
of electrodynamics which are of a peculiarly “relativistic” nature.
In relativistic physics c —born of electrodynamics and called (not quite
appropriately) the “velocity of light”—is recognized for what it is: a constant
106 Aspects of special relativity
of Nature which would retain its relevance and more fundamental meaning
“even if electrodynamics—light—did not exist.” From
\[ [c] = \text{velocity} = LT^{-1} \]
was first posed () and resolved () by H. A. Lorentz ( –), who
was motivated by a desire to avoid the ad hoc character of previous attempts
to account for the results of the Michelson–Morley, Trouton–Noble and related
experiments. Lorentz’ original discussion78 strikes the modern eye as excessively
complex. The discussion which follows owes much to the mathematical insight
of H. Minkowski (–),79 whose work in this field was inspired by the
accomplishments of one of his former students (A. Einstein), but which has
roots also in Minkowski’s youthful association with H. Hertz ( –), and
is distinguished by its notational modernism.
Here we look to the notational aspects of Minkowski’s contribution, drawing
tacitly (where Minkowski drew explicitly) upon the notational conventions and
conceptual resources of tensor analysis. In a reversal of the historical order,
I will in §2 let the pattern of our results serve to motivate a review of tensor
algebra and calculus. We will be placed then in position to observe (in §3) the
sense in which special relativity almost “invents itself.” Now to work:
Let Maxwell’s equations (65) be notated
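\[
\begin{aligned}
\nabla\cdot\mathbf{E} &= \rho \\
\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial}{\partial t}\mathbf{E} &= \frac{1}{c}\,\mathbf{j} \\
\nabla\cdot\mathbf{B} &= 0 \\
\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial}{\partial t}\mathbf{B} &= \mathbf{0}
\end{aligned}
\]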
where, after placing all fields on the left and sources on the right, we have
grouped together the “sourcy” equations (Coulomb, Ampere), and formed a
second quartet from their sourceless counterparts. Drawing now upon the
notational conventions introduced on the preceding page we have
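\[
\begin{aligned}
\partial_1 E_1 + \partial_2 E_2 + \partial_3 E_3 &= \tfrac{1}{c}\, j^0 \equiv \rho \\
-\partial_0 E_1 + \partial_2 B_3 - \partial_3 B_2 &= \tfrac{1}{c}\, j^1 \\
-\partial_0 E_2 - \partial_1 B_3 + \partial_3 B_1 &= \tfrac{1}{c}\, j^2 \\
-\partial_0 E_3 + \partial_1 B_2 - \partial_2 B_1 &= \tfrac{1}{c}\, j^3
\end{aligned}
\qquad (157.1)
\]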
78
Reprinted in English translation under the title “Electromagnetic
phenomena in a system moving with any velocity less than that of light” in
The Principle of Relativity (), a valuable collection of reprinted classic papers
which is still available in paperback (published by Dover).
79
See §7 of “Die Grundgleichungen für die elektromagnetischen Vorgänge in
bewegten Körpern” () in Minkowski’s Collected Works.
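\[
\begin{aligned}
-\partial_1 B_1 - \partial_2 B_2 - \partial_3 B_3 &= 0 \\
+\partial_0 B_1 + \partial_2 E_3 - \partial_3 E_2 &= 0 \\
+\partial_0 B_2 - \partial_1 E_3 + \partial_3 E_1 &= 0 \\
+\partial_0 B_3 + \partial_1 E_2 - \partial_2 E_1 &= 0
\end{aligned}
\qquad (157.2)
\]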
where we have found it formally convenient to write
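\[
j \equiv \begin{pmatrix} j^0 \\ \mathbf{j} \end{pmatrix}
  = \begin{pmatrix} j^0 \\ j^1 \\ j^2 \\ j^3 \end{pmatrix}
\quad\text{with}\quad j^0 \equiv c\rho \qquad (158)
\]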
\[ \partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^\nu \]
note: Here as always, summation $\sum_{\mu=0}^{3}$ on the repeated index is understood.
\[ \cdots \equiv A(\mathbf{E}, \mathbf{B}) \qquad (159) \]
Here the A-notation is intended to emphasize that the 4×4 matrix in question
is antisymmetric; as such, it has 6 independently-specifiable components,
which at (159) we have been motivated to identify in a specific way with the
six components of a pair of 3 -vectors. The statement
evidently holds at every spacetime point, and will play a central role in our work
henceforth.
It follows by inspection from results now in hand that the sourceless field
equations (157.2) can be formulated
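\[ \partial_\mu G^{\mu\nu} = 0 \]
with
\[
G \equiv \|G^{\mu\nu}\| = A(-\mathbf{B}, \mathbf{E}) =
\begin{pmatrix}
0 & B_1 & B_2 & B_3 \\
-B_1 & 0 & -E_3 & E_2 \\
-B_2 & E_3 & 0 & -E_1 \\
-B_3 & -E_2 & E_1 & 0
\end{pmatrix}
\qquad (161)
\]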
Notational preparations 109
. . . but with this step we have acquired an obligation to develop the sense in
which G is a “natural companion” of F. To that end:
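Let the Levi-Civita symbol $\epsilon_{\mu\nu\rho\sigma}$ be defined
\[
\epsilon_{\mu\nu\rho\sigma} \equiv
\begin{cases}
+1 & \text{if } (\mu\nu\rho\sigma) \text{ is an even permutation of } (0123) \\
-1 & \text{if } (\mu\nu\rho\sigma) \text{ is an odd permutation of } (0123) \\
\phantom{+}0 & \text{otherwise}
\end{cases}
\]
and let quantities $F_{\mu\nu}$ be constructed with its aid:
\[ F_{\mu\nu} \equiv \tfrac{1}{2}\,\epsilon_{\mu\nu\alpha\beta} F^{\alpha\beta} \quad\text{where summation on the repeated indices is understood} \qquad (162) \]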
which would become G if we could change the sign of the B-entries, and this
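is readily accomplished: multiply $A(\mathbf{B}, \mathbf{E})$ by
\[
\mathbf{g} \equiv
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\qquad (163)
\]
on the right (this leaves the 0th column unchanged, but changes the sign of the 1st, 2nd and 3rd columns), and again by another factor of $\mathbf{g}$ on the left (this leaves the 0th row unchanged, but changes the sign of the 1st, 2nd and 3rd rows, the 1st, 2nd and 3rd elements of which have now been restored to their original signs). We are led thus to $\mathbf{g}\,A(\mathbf{B}, \mathbf{E})\,\mathbf{g} = A(-\mathbf{B}, \mathbf{E})$ which, because
\[
\mathbf{g} =
\begin{cases}
\mathbf{g}^{T} & : \mathbf{g} \text{ is its own transpose (i.e., is symmetric)} \\
\mathbf{g}^{-1} & : \mathbf{g} \text{ is its own inverse}
\end{cases}
\qquad (164)
\]
can also be expressed $A(\mathbf{B}, \mathbf{E}) = \mathbf{g}\,A(-\mathbf{B}, \mathbf{E})\,\mathbf{g}$. In short,80
\[ F = \mathbf{g}\, G\, \mathbf{g}^{T} \quad\text{equivalently}\quad G = \mathbf{g}^{-1} F\, (\mathbf{g}^{-1})^{T} \qquad (165) \]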
80
problem 35.
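Let the elements of $\mathbf{g}$ be called $g_{\mu\nu}$, and the elements of $\mathbf{g}^{-1}$ (though they happen to be numerically identical to the elements of $\mathbf{g}$) be called $g^{\mu\nu}$:
\[
\mathbf{g} \equiv \|g_{\mu\nu}\| \ \text{and}\ \mathbf{g}^{-1} \equiv \|g^{\mu\nu}\|
\;\Rightarrow\;
g^{\mu\alpha} g_{\alpha\nu} = \delta^{\mu}{}_{\nu} \equiv
\begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases}
\]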
We then have
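\[ F_{\mu\nu} = g_{\mu\alpha}\, g_{\nu\beta}\, G^{\alpha\beta} \quad\text{or equivalently}\quad G^{\mu\nu} = g^{\mu\alpha}\, g^{\nu\beta}\, F_{\alpha\beta} \]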
To summarize: we have
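\[ F^{\mu\nu} \longrightarrow F_{\mu\nu} \xrightarrow{\ \text{lift indices with } \mathbf{g}^{-1}\ } F^{\mu\nu} = G^{\mu\nu} \]
which in $(\mathbf{E}, \mathbf{B})$-notation reads
\[
\begin{aligned}
F = A(\mathbf{E}, \mathbf{B}) &\longrightarrow A(\mathbf{B}, \mathbf{E}) \longrightarrow A(-\mathbf{B}, \mathbf{E}) = G \\
G = A(-\mathbf{B}, \mathbf{E}) &\longrightarrow A(\mathbf{E}, -\mathbf{B}) \longrightarrow A(-\mathbf{E}, -\mathbf{B}) = -F
\end{aligned}
\]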
Preceding remarks lend precise support and meaning to the claim that F µν and
Gµν are “natural companions,” and very closely related.
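The chain from F to G summarized above can be checked numerically. What follows is a minimal sketch rather than anything taken from the notes: the helper names `A`, `parity`, `dual` and `sandwich` are hypothetical, the layout of `A(u, v)` is the one that can be read off from (161), and the sign convention is the stated one in which (0123) counts as an even permutation.

```python
from itertools import permutations

def A(u, v):
    """Antisymmetric 4x4 matrix A(u, v); with u = E, v = B this is F."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [[0, -u1, -u2, -u3],
            [u1, 0, -v3, v2],
            [u2, v3, 0, -v1],
            [u3, -v2, v1, 0]]

def parity(p):
    """Return +1 (even) or -1 (odd) for a permutation p of (0, 1, 2, 3)."""
    p, sign = list(p), 1
    for i in range(len(p)):
        while p[i] != i:            # move the entry p[i] to its home slot
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

# Levi-Civita symbol: only genuine permutations are stored; anything else is 0.
EPS = {p: parity(p) for p in permutations(range(4))}

def dual(F):
    """(1/2) eps_{mu nu a b} F^{ab}, the construction (162)."""
    return [[0.5 * sum(EPS.get((m, n, a, b), 0) * F[a][b]
                       for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

G_DIAG = (1, -1, -1, -1)            # diagonal of the metric matrix (163)

def sandwich(X):
    """g X g for the diagonal metric (flips the sign of time-space entries)."""
    return [[G_DIAG[m] * X[m][n] * G_DIAG[n] for n in range(4)]
            for m in range(4)]

E, B = (1, 2, 3), (4, 5, 6)         # arbitrary sample field components
minus_B = tuple(-b for b in B)
print(dual(A(E, B)) == A(B, E))             # dualizing swaps the roles of E and B
print(sandwich(A(B, E)) == A(minus_B, E))   # g A(B,E) g reproduces G
```

Integer sample values are used so that the comparisons are exact; any other numbers would serve equally well up to rounding.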
We shall—as above, but more generally (and for the good tensor-theoretic
reasons that will soon emerge) use g µν and gµν to raise and lower—in short,
to “manipulate”—indices, writing (for example)81
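\[
\begin{aligned}
\partial^{\mu} &= g^{\mu\alpha}\,\partial_{\alpha}, & \partial_{\mu} &= g_{\mu\alpha}\,\partial^{\alpha} \\
j^{\mu} &= g^{\mu\alpha}\, j_{\alpha}, & j_{\mu} &= g_{\mu\alpha}\, j^{\alpha} \\
F_{\mu\nu} &= g_{\mu\alpha} F^{\alpha}{}_{\nu} = g_{\mu\alpha}\, g_{\nu\beta}\, F^{\alpha\beta}
\end{aligned}
\]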
81
problem 36.
We are placed thus in position to notice that the sourceless Maxwell equations
(157.2) can be formulated82
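\[
\begin{aligned}
\partial_1 F_{23} + \partial_2 F_{31} + \partial_3 F_{12} &= 0 \\
\partial_0 F_{23} + \partial_2 F_{30} + \partial_3 F_{02} &= 0 \\
\partial_0 F_{13} + \partial_1 F_{30} + \partial_3 F_{01} &= 0 \\
\partial_0 F_{12} + \partial_1 F_{20} + \partial_2 F_{01} &= 0
\end{aligned}
\qquad (166.1)
\]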
where the sums over cyclic permutations are sometimes called “windmill sums.”
More compactly, we have83
\[ \epsilon_{\mu\alpha\rho\sigma}\, \partial^{\alpha} F^{\rho\sigma} = 0 \qquad (166.2) \]
There is no new physics in the material presented thus far: our work has
been merely reformulational, notational—old wine in new bottles. Proceeding
in response mainly to the linearity of Maxwell’s equations, we have allowed
ourselves to play linear-algebraic and notational games intended to maximize
the formal symmetry/simplicity of Maxwell’s equations . . . so that the
transformation-theoretic problem which is our real concern can be posed in
the simplest possible terms. Maxwell himself 84 construed the electromagnetic
field to involve a pair of 3-vector fields: E and B . We have seen, however, that
• one can construe the components of E and B to be the accidentally
distinguished names given to the six independently-specifiable non-zero
components of an antisymmetric tensor 85 field F µν . The field equations
then read
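\[ \partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^{\nu} \quad\text{and}\quad \epsilon_{\mu\alpha\rho\sigma}\, \partial^{\alpha} F^{\rho\sigma} = 0 \qquad (167) \]
provided the $g^{\alpha\beta}$ that enter into the definition $\partial^{\alpha} \equiv g^{\alpha\beta}\partial_{\beta}$ are given by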
(163). Alternatively . . .
• one can adopt the view that the electromagnetic field involves a pair of
antisymmetric tensor fields F µν and Gµν which are constrained to satisfy
not only the field equations
Here again, the “index manipulators” gµν and g µν must be assigned the
specific meanings implicit in (163).
82
problem 37.
83
problem 38.
84
Here I take some liberty with the complicated historical facts of the matter:
see again the fragmentary essay77 cited earlier.
85
For the moment “tensor” simply means “doubly indexed.”
It will emerge that Lorentz’ question (page 107), if phrased in the terms natural
to either of those descriptions of Maxwellian electrodynamics, virtually “answers
itself.” But to see how this comes about one must possess a command of the
basic elements of tensor analysis—a subject with which Minkowski
(mathematician that he was) enjoyed a familiarity not shared by any of his
electrodynamical predecessors or contemporaries.86
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \]
\[ \mathbf{V} \equiv \mathbf{E} + i\mathbf{B} \]
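\[ x^m = x^m(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N) \quad : \quad m = 1, 2, \ldots, N \qquad (170) \]
that describe how $X$ and $\bar{X}$ are, in the instance at hand, specifically related.
How do the partial derivatives of φ transform? By calculus
\[ \frac{\partial \bar{\phi}}{\partial \bar{x}^m} = \frac{\partial x^a}{\partial \bar{x}^m}\, \frac{\partial \phi}{\partial x^a} \qquad (171.1) \]
where (as always) $\sum_a$ is understood. Looking to the 2nd derivatives, we have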
Et cetera. Such are the “objects” we encounter in routine work, and the
transformation rules which we want to be able to manipulate in a simple
manner.
The quantities $\partial x^a / \partial \bar{x}^m$ arise directly and exclusively from the equations (170) that describe $X \leftarrow \bar{X}$. They constitute the elements of the “transformation matrix”
\[ \mathbf{W} \equiv \|W^n{}_m\|, \qquad W^n{}_m \equiv \partial x^n / \partial \bar{x}^m \qquad (172.1) \]
the value of which will in general vary from point to point. Function theory teaches us that the coordinate transformation will be invertible (i.e., that we can proceed from $x^n = x^n(\bar{x})$ to equations of the form $\bar{x}^n = \bar{x}^n(x)$) if and only if $\mathbf{W}$ is non-singular: $\det \mathbf{W} \neq 0$, which we always assume to be the case (in the neighborhood of P). The inverse $X \rightarrow \bar{X}$ of $X \leftarrow \bar{X}$ gives rise to
\[ \mathbf{M} \equiv \|M^m{}_n\|, \qquad M^m{}_n \equiv \partial \bar{x}^m / \partial x^n \qquad (172.2) \]
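\[
X^{m_1 \ldots m_r}{}_{n_1 \ldots n_s}
\;\longrightarrow\;
\bar{X}^{m_1 \ldots m_r}{}_{n_1 \ldots n_s}
= M^{m_1}{}_{a_1} \cdots M^{m_r}{}_{a_r}\, W^{b_1}{}_{n_1} \cdots W^{b_s}{}_{n_s}\, X^{a_1 \ldots a_r}{}_{b_1 \ldots b_s}
\qquad (174)
\]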
All indices range on 1, 2, . . . , N , N is called the “dimension” of the tensor, and
summation on repeated indices is (by the “Einstein summation convention”)
understood. The covariant/contravariant distinction is signaled notationally as
a subscript/superscript distinction, and alludes to whether it is W or M that
transports the components in question “across the street, from the $X$-side to
the $\bar{X}$-side.”
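If
\[ X^m \longrightarrow \bar{X}^m = M^m{}_a X^a \]
then the $X^m$ are said to be “components of a contravariant vector.” Coordinate differentials provide the classic prototype:
\[ dx^m \longrightarrow d\bar{x}^m = \frac{\partial \bar{x}^m}{\partial x^a}\, dx^a \qquad (175) \]
If
\[ X_n \longrightarrow \bar{X}_n = W^b{}_n X_b \]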
then the Xn are said to be “components of a covariant vector.” Here the first
partials φ,n ≡ ∂n φ of a scalar field (components of the gradient) provide the
classic prototype:
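\[ \phi_{,n} \longrightarrow \bar{\phi}_{,n} = \phi_{,b}\, \frac{\partial x^b}{\partial \bar{x}^n} \qquad (176) \]
\[ \phi_{,mn} \longrightarrow \bar{\phi}_{,mn} = \phi_{,ab}\, \frac{\partial x^a}{\partial \bar{x}^m}\, \frac{\partial x^b}{\partial \bar{x}^n} + \text{extraneous term} \]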
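with $\bar{\partial}_p = W^q{}_p\, \partial_q$ gives
\[
\begin{aligned}
\bar{X}^m{}_{n,p} &= M^m{}_a W^b{}_n\, X^a{}_{b,q}\, W^q{}_p + X^a{}_b\, W^q{}_p\, \frac{\partial (M^m{}_a W^b{}_n)}{\partial x^q} \\
&= (\text{term with covariant rank increased by one}) + (\text{extraneous term})
\end{aligned}
\]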
The “extraneous term” vanishes if the M’s and W’s are constant; i.e., if the
functions $\bar{x}^n(x)$ depend at most linearly upon their arguments: $\bar{x}^n = M^n{}_a x^a + \xi^n$.
And in a small number of (electrodynamically important!) cases the extraneous
terms cancel when derivatives are combined in certain ways . . . as we will soon
have occasion to see. But in general, effective management of the extraneous
term must await the introduction of some powerful new ideas—ideas that belong
not to the algebra of tensors (my present concern) but to the calculus of tensors.
For the moment I must be content to emphasize that, on the basis of evidence
now in hand,
To lend substance to a remark made near the top of the page: Let Xm
transform as a covariant vector. Look to the transformation properties of Xm,n
and obtain
\[ \bar{X}_{m,n} = W^a{}_m W^b{}_n\, X_{a,b} + \underbrace{\frac{\partial^2 x^a}{\partial \bar{x}^n\, \partial \bar{x}^m}\, X_a}_{\text{extraneous term, therefore non-tensorial}} \]
Figure 46: The Xm serve to describe the blue arrow with respect to
the black coordinate system X, as the X m serve to describe the blue
arrow with respect to the red coordinate system X. But neither Xm
nor X m will be confused with the blue arrow itself: to do so would be
to confuse descriptors with the thing described. So it is with tensors
in general. Tensor analysis is concerned with relationships among
alternative descriptors, not with “things in themselves.”
The following points are elementary, but fundamental to applications of
the tensor concept:
1) If the components X ··· ... of a tensor (all) vanish in one coordinate system,
then they vanish in all coordinate systems—this by the homogeneity of
the defining statement (174).
2) Tensors can be added/subtracted if and only if X ··· ... and Y ··· ... are of
the same covariant/contravariant rank and dimension. Constructions of
(say) the form $A^m + B_m$ “come unstuck” when transformed; for that same
reason, statements of (say) the form $A^m = B_m$, while they may be valid
in some given coordinate system, do not entail $\bar{A}^m = \bar{B}_m$. But . . .
3) If X ··· ... and Y ··· ... are of the same rank and dimension, then
\[ X^{\cdots}{}_{\ldots} = Y^{\cdots}{}_{\ldots} \;\Longrightarrow\; \bar{X}^{\cdots}{}_{\ldots} = \bar{Y}^{\cdots}{}_{\ldots} \]
It is, in fact, because of the remarkable transformational stability of
tensorial equations that we study this subject, and try to formulate our
physics in tensorial terms.
4) If X ··· ... and Y ··· ... are co-dimensional tensors of ranks {r′, s′} and {r″, s″}
then their product X ··· ... Y ··· ... is tensorial with rank {r′ + r″, s′ + s″}:
tensors of the same dimension can be multiplied irrespective of their ranks.
Introduction to tensor analysis 117
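If X ··· ... is tensorial of rank r, s then the operation of
\[ \bar{X}^{jk}{}_{\ell} = M^j{}_a M^k{}_b W^c{}_{\ell}\, X^{ab}{}_c \]
\[
\begin{aligned}
\bar{X}^{jk}{}_{k} &= M^j{}_a M^k{}_b W^c{}_k\, X^{ab}{}_c \\
&= M^j{}_a\, \delta^c{}_b\, X^{ab}{}_c \quad\text{by } \mathbf{M}\mathbf{W} = \mathbf{I} \\
&= M^j{}_a\, X^{ab}{}_b
\end{aligned}
\]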
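\[
\begin{aligned}
\delta^m{}_n \longrightarrow \bar{\delta}^m{}_n &= M^m{}_a W^b{}_n\, \delta^a{}_b \\
&= M^m{}_a W^a{}_n \\
&= \delta^m{}_n \quad\text{by } \mathbf{M}\mathbf{W} = \mathbf{I}
\end{aligned}
\]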
and we are brought to the remarkable conclusion that the components δ m n of the
Kronecker tensor have the same numerical values in every coordinate system.
Thus does δ m n become what I will call a “universally available object”—to be
joined soon by a few others. With this . . .
We are placed in position to observe that if the quantities gmn transform
as the components of a 2nd rank covariant tensor
89
The “theory of invariants” was a favorite topic among 19th Century
mathematicians, and provided the founding fathers of tensor analysis with a
source of motivation (see pages 206 –211 in E. T. Bell’s The Development of
Mathematics ()).
90
See again the top of page 110.
then
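1) the equation $g^{ma} g_{an} = \delta^m{}_n$, if taken as (compare page 110) a definition of
the contravariant tensor $g^{mn}$, makes good coordinate-independent tensor-theoretic sense, and
2) so do the equations
\[
\begin{aligned}
X^{\cdots m \cdots}{}_{\ldots\,\ldots} &\equiv g^{ma}\, X^{\cdots\,\cdots}{}_{\ldots a \ldots} \\
X^{\cdots\,\cdots}{}_{\ldots m \ldots} &\equiv g_{ma}\, X^{\cdots a \cdots}{}_{\ldots\,\ldots}
\end{aligned}
\]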
we have
\[ g \longrightarrow \bar{g} = W^2 g \qquad (178.1) \]
The statement that φ(x) transforms as a scalar density of weight w carries this
meaning:
\[ \phi(x) \longrightarrow \bar{\phi}(\bar{x}) = W^{w} \cdot \phi(x(\bar{x})) \]
We recover (169) in the “weightless” case w = 0 (and for arbitrary values of w
when it happens that W = 1). Evidently
\[ g \equiv \det \mathbf{g} \ \text{transforms as a scalar density of weight } w = 2 \qquad (178.2) \]
The more general statement that X m1 ...mr n1 ...ns transforms as a tensor density
of weight w means that
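\[ \epsilon_{n_1 n_2 \ldots n_N} \equiv \operatorname{sgn}\begin{pmatrix} 1 & 2 & \cdots & N \\ n_1 & n_2 & \cdots & n_N \end{pmatrix} \]
where “sgn” refers to the “signum,” which reports (see again page 109) whether
$n_1, n_2, \ldots, n_N$ is an even/odd permutation of 1, 2, . . . , N or no permutation
at all. The tentative assumption that $\epsilon_{n_1 n_2 \ldots n_N}$ transforms as a (totally
antisymmetric) tensor density of unspecified weight w gives
\[
\begin{aligned}
\bar{\epsilon}_{n_1 n_2 \ldots n_N} &= W^{w} \cdot W^{a_1}{}_{n_1} W^{a_2}{}_{n_2} \cdots W^{a_N}{}_{n_N}\, \epsilon_{a_1 a_2 \ldots a_N} \\
&= W^{w} \cdot \epsilon_{n_1 n_2 \ldots n_N} \det \mathbf{W} \quad\text{by definition of the determinant!} \\
&= W^{w+1} \cdot \epsilon_{n_1 n_2 \ldots n_N}
\end{aligned}
\]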
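The step labelled “by definition of the determinant” can be made concrete in the case N = 3. The following is an illustrative sketch only: the function names `sgn`, `det3` and `contracted` and the sample matrix are assumptions introduced here, and indices run over 0, 1, 2 rather than 1, 2, 3 for programming convenience.

```python
from itertools import product

def sgn(idx):
    """Signum of an index string: 0 if an index repeats, else the parity."""
    if len(set(idx)) != len(idx):
        return 0
    inversions = sum(1 for i in range(len(idx))
                     for j in range(i + 1, len(idx)) if idx[i] > idx[j])
    return -1 if inversions % 2 else 1

def det3(W):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (W[0][0] * (W[1][1] * W[2][2] - W[1][2] * W[2][1])
            - W[0][1] * (W[1][0] * W[2][2] - W[1][2] * W[2][0])
            + W[0][2] * (W[1][0] * W[2][1] - W[1][1] * W[2][0]))

def contracted(W, n):
    """Sum over a1, a2, a3 of eps_{a1 a2 a3} W[a1][n1] W[a2][n2] W[a3][n3]."""
    return sum(sgn(a) * W[a[0]][n[0]] * W[a[1]][n[1]] * W[a[2]][n[2]]
               for a in product(range(3), repeat=3))

W = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]          # arbitrary integer sample, so comparisons are exact

# The contraction reproduces eps_{n1 n2 n3} * det W for every index string.
identity_holds = all(contracted(W, n) == sgn(n) * det3(W)
                     for n in product(range(3), repeat=3))
print(identity_holds)
```

The determinant is computed by cofactor expansion on purpose, so that the check is not circular.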
\[ X^{mn} = \tfrac{1}{2}(X^{mn} + X^{nm}) + \tfrac{1}{2}(X^{mn} - X^{nm}) \]
94
The possibility and electrodynamical utility of such a list was brought first
to my attention when, as a student, I happened upon the discussion which
appears on pages 22–24 of E. Schrödinger’s Space-time Structure (). This
elegant little volume (which runs to only 119 pages) provides physicists with
an elegantly succinct introduction to tensor analysis. I recommend it to your
attention.
The extraneous term vanishes (for all w) when X → X has the property that W
is x-independent,96 and it vanishes unrestrictedly if w = 1. We conclude that
if X m is a contravariant vector density of unit weight then its divergence
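\[
\begin{aligned}
\bar{\partial}_m \bar{X}^{mn} &= W^{w} \cdot M^m{}_a M^n{}_b\, (W^c{}_m\, \partial_c X^{ab}) + \underbrace{X^{ab}\, \bar{\partial}_m (W^{w} \cdot M^m{}_a M^n{}_b)}_{\text{extraneous term}} \\
&= W^{w} \cdot M^n{}_b\, \partial_a X^{ab} + \text{extraneous term} \quad\text{by } M^m{}_a W^c{}_m = \delta^c{}_a
\end{aligned}
\]
so by $M^m{}_a\, \bar{\partial}_m = \partial_a$ we have
\[ \text{extraneous term} = X^{ab} \left\{ M^n{}_b\, (w - 1)\, W^{w-1}\, \partial_a W + W^{w}\, \frac{\partial^2 \bar{x}^n}{\partial x^a\, \partial x^b} \right\} \]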
95
For the interesting but somewhat intricate proof, see classical dynamics
(/), Chapter 2, page 49.
96
This is weaker than the requirement that the matrix $\mathbf{W}$ (as distinct from its determinant $W$) be x-independent.
∂m ∂n X mn transforms tensorially
∂m ∂n X mn = 0 automatically
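\[
\begin{aligned}
\bar{D}_j \bar{X}_k &\equiv W^b{}_j W^c{}_k\, \partial_b X_c \quad (\text{the tensorial transform of } \partial_j X_k) \\
&\equiv \text{components of the covariant derivative of } \bar{X}_k
\end{aligned}
\]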
97
See again the mathematical digression that culminates on page 50. A
fairly complete and detailed account of the exterior calculus can be found in
“Electrodynamical applications of the exterior calculus” ().
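where by computation
\[ \bar{D}_j \bar{X}_k = \bar{\partial}_j \bar{X}_k - \bar{X}_i\, \Gamma^i{}_{jk} \]
with
\[ \Gamma^i{}_{jk} \equiv \frac{\partial \bar{x}^i}{\partial x^p}\, \frac{\partial^2 x^p}{\partial \bar{x}^j\, \partial \bar{x}^k} \]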
By extension of the notational convention Xk, j ≡ ∂j Xk one writes Xk; j ≡ Dj Xk .
It is clear that $X_{j;k}$, since created by “tensorial continuation” from the
“seed” $\partial_j X_k$, transforms tensorially, and that it has something to do with
familiar differentiation (is differentiation, but with built-in compensation for
the familiar “extraneous term,” and reduces to ordinary differentiation in the
root coordinate system X). The quantities Γ i jk turn out not to transform
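tensorially, but by the rule
\[ \Gamma^i{}_{jk} \longrightarrow M^i{}_a W^b{}_j W^c{}_k\, \Gamma^a{}_{bc} + \frac{\partial \bar{x}^i}{\partial x^p}\, \frac{\partial^2 x^p}{\partial \bar{x}^j\, \partial \bar{x}^k} \]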
characteristic of “affine connections.” Finally, one gives up the assumption
that there exists a coordinate system (the X-system of prior discussion) in
which $D_j$ and $\partial_j$ coincide (i.e., in which $\Gamma^i{}_{jk}$ vanishes globally). The
affine connection Γ i jk (x) becomes an object that we are free to deposit on the
manifold M, to create an “affinely connected manifold”. . . just as by deposition
of gij (x) we create a “metrically connected manifold.” But when we do both
things98 a compatibility condition arises, for we expect
• index manipulation followed by covariant differentiation, and
• covariant differentiation followed by index manipulation
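to yield the same result. This is readily shown to entail $g_{ij;k} = 0$, which in turn
entails
\[ \Gamma^i{}_{jk} = \tfrac{1}{2}\, g^{ia} \left( \frac{\partial g_{aj}}{\partial x^k} + \frac{\partial g_{ak}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^a} \right) \]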
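To see the Christoffel formula at work, it may help to evaluate it in a familiar case. The sketch below is an illustration that does not appear in the notes: it assumes plane polar coordinates (r, θ) with metric diag(1, r²), estimates the metric derivatives by central differences, and uses hypothetical helper names throughout.

```python
def metric(x):
    """Plane polar metric g_ij for coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def d_metric(x, a, h=1e-5):
    """Central-difference estimate of the matrix of dg_jk/dx^a."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[j][k] - gm[j][k]) / (2 * h) for k in range(2)]
            for j in range(2)]

def christoffel(x):
    """Gamma[i][j][k] = (1/2) g^{ia} (d_k g_aj + d_j g_ak - d_a g_jk)."""
    g_inv = inverse2(metric(x))
    dg = [d_metric(x, a) for a in range(2)]   # dg[a][j][k] = dg_jk/dx^a
    return [[[0.5 * sum(g_inv[i][a] * (dg[k][a][j] + dg[j][a][k] - dg[a][j][k])
                        for a in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]

Gamma = christoffel((2.0, 0.3))
# Textbook values for polar coordinates: Gamma^r_{theta theta} = -r and
# Gamma^theta_{r theta} = 1/r, to be compared with the numerical estimates.
print(Gamma[0][1][1], Gamma[1][0][1])
```

Finite differences are used only to keep the example dependency-free; a symbolic computation would serve as well.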
The affine connection has become implicit in the metric connection—it has
become the “Christoffel connection,” which plays a central role in Riemannian
geometry and its applications (general relativity): down the road just a short
way lies the Riemann-Christoffel curvature tensor
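\[ R^m{}_{nij} = \frac{\partial \Gamma^m{}_{nj}}{\partial x^i} - \frac{\partial \Gamma^m{}_{ni}}{\partial x^j} + \Gamma^m{}_{ai}\, \Gamma^a{}_{nj} - \Gamma^m{}_{aj}\, \Gamma^a{}_{ni} \]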
which enters into statements such as the following
Figure 48: The problem just noted is resolved if one compares one
vector with the local parallel transport of the other—a “stand-in”
rooted to the same point as the original vector. For then only a
single transformation matrix enters into the discussion.
Sharp insight into the meaning of the covariant derivative was provided in
by Levi-Civita,99 who pointed out that when one works from Figure 47 one
cannot realistically expect to obtain a transformationally sensible result, for the
99
The fundamental importance of Levi-Civita’s idea was immediately
appreciated and broadcast by Hermann Weyl. See §14 in his classic Space,
Time & Matter (4th edition , the English translation of which has been
reprinted by Dover).
transformation matrices W(x) and W(x + dx) that act upon (say) Xm (x) and
Xm (x + dx) are, in general, distinct. Levi-Civita observed that a workable
procedure does, however, result if one looks not to Xm (x + dx) − Xm (x) but to
Xm (x) − Xm (x), where
He endowed the intuitive concept “parallel transport” (Figure 48) with a precise
(natural) meaning, and immediately recovered the standard theory of covariant
differentiation. But he obtained also much else: he showed, for example,
that “geodesics” can be considered to arise not as “shortest” curves—curves
produced by minimization of arc length
\[ \int ds \quad\text{with}\quad (ds)^2 = g_{mn}\, dx^m\, dx^n \]
—but as curves whose tangents can be got one from another by parallel
transportation: head off in some direction and “follow your nose” was the
idea. Levi-Civita’s idea so enriched a subject previously known as the “absolute
differential calculus” that its name was changed . . . to “tensor analysis.”
Our catalog (pages 120–122) can be looked upon as an enumeration of
circumstances in which—“by accident”—the Γ -apparatus falls away. Look, for
example, to the “covariant curl,” where we have
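\[ \partial_\mu F^{\mu\nu} = \tfrac{1}{c}\, j^{\nu} \qquad (180.1) \]
\[ \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} + \partial_\lambda F_{\mu\nu} = 0 \qquad (180.2) \]
where $F^{\mu\nu}$ is antisymmetric and where
\[
F_{\mu\nu} \equiv g_{\mu\alpha}\, g_{\nu\beta}\, F^{\alpha\beta}
\quad\text{with}\quad
\mathbf{g} \equiv \|g_{\mu\nu}\| =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\qquad (181)
\]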
102
We might write
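\[
\|F^{\mu\nu}\| \equiv
\begin{pmatrix}
0 & -E_1 & -E_2 & -E_3 \\
E_1 & 0 & -B_3 & B_2 \\
E_2 & B_3 & 0 & -B_1 \\
E_3 & -B_2 & B_1 & 0
\end{pmatrix}
\quad\therefore\quad
\|F_{\mu\nu}\| =
\begin{pmatrix}
0 & E_1 & E_2 & E_3 \\
-E_1 & 0 & -B_3 & B_2 \\
-E_2 & B_3 & 0 & -B_1 \\
-E_3 & -B_2 & B_1 & 0
\end{pmatrix}
\]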
to establish explicit contact with orthodox 3 -vector notation and terminology
(and at the same time to make antisymmetry manifest), but such a step would
be extraneous to the present line of argument.
103
See again page 119.
W2 = 1
= gµν B2
we obtain
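\[ W^{-\frac{1}{2}} \cdot \mathbf{W}^{T} \mathbf{g}\, \mathbf{W} = \mathbf{g} \quad\text{everywhere} \qquad (185.1) \]
If spacetime were N-dimensional the determinantal argument would now give
\[ W^{\,2 - \frac{N}{2}} = 1 \]
Then
\[ \mathbf{W}^{T} \mathbf{g}\, \mathbf{W} = \Omega\, \mathbf{g} \qquad (185.2) \]
and the determinantal argument supplies
\[ \Omega = W^{\frac{2}{N}} = W^{\frac{1}{2}} \ \text{in the physical case } N = 4 \]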
Equations (185.1) and (185.2) evidently say the same thing: the Lorentzian
constraint (183) drops away and in place of (184) we have
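\[
\begin{aligned} u &= u(x, y) \\ v &= v(x, y) \end{aligned}
\qquad\text{giving}\qquad
\begin{aligned} du &= u_x\, dx + u_y\, dy \\ dv &= v_x\, dx + v_y\, dy \end{aligned}
\]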
104
E. Cunningham, “The principle of relativity in electrodynamics and an
extension thereof,” Proc. London Math. Soc. 8, 77 (1910).
105
H. Bateman, “The transformation of the electrodynamical equations,”
Proc. London Math. Soc. 8, 223 (1910).
106
T. Fulton, F. Rohrlich & L. Witten, “Conformal invariance in physics,”
Rev. Mod. Phys. 34, 442 (1962).
107
See “Radiation in hyperbolic motion” in R. Peierls, Surprises in Theoretical
Physics (), page 160.
Figure 49: Cartesian grid (above) and its conformal image (below)
in the case $f(z) = z^3$, which supplies
\[ u(x, y) = x^3 - 3xy^2, \qquad v(x, y) = 3x^2 y - y^3 \]
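But
\[
\text{analyticity of } f(z) \iff \text{Cauchy-Riemann conditions}:
\begin{cases} u_x = +v_y \\ u_y = -v_x \end{cases}
\]
so
\[
\begin{pmatrix} u_x \\ u_y \end{pmatrix} \cdot \begin{pmatrix} v_x \\ v_y \end{pmatrix}
= u_x v_x + u_y v_y = -u_x u_y + u_y u_x = 0
\]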
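The orthogonality just derived can be confirmed directly for the map f(z) = z³ of Figure 49. The sketch below is illustrative only: the function names are hypothetical, and the partial derivatives were obtained by differentiating the quoted expressions for u and v by hand.

```python
def u(x, y):
    """Real part of (x + iy)^3."""
    return x**3 - 3 * x * y**2

def v(x, y):
    """Imaginary part of (x + iy)^3."""
    return 3 * x**2 * y - y**3

def grad_u(x, y):
    """(u_x, u_y), differentiated by hand from u."""
    return (3 * x**2 - 3 * y**2, -6 * x * y)

def grad_v(x, y):
    """(v_x, v_y), differentiated by hand from v."""
    return (6 * x * y, 3 * x**2 - 3 * y**2)

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

# Integer sample points keep every comparison exact.
points = [(1, 2), (-3, 1), (2, -5), (0, 4)]
for x, y in points:
    ux, uy = grad_u(x, y)
    vx, vy = grad_v(x, y)
    cauchy_riemann = (ux == vy) and (uy == -vx)
    print((x, y), cauchy_riemann, dot((ux, uy), (vx, vy)))
```

The curves u = constant and v = constant therefore cross at right angles wherever the gradients are non-zero, which is the conformality visible in the figure.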
Lorentz covariance of Maxwell’s equations 131
\[ F^{\mu\nu}{}_{;\mu} = \tfrac{1}{c}\, j^{\nu} \qquad (187.1) \]
\[ F_{\nu\lambda;\mu} + F_{\lambda\mu;\nu} + F_{\mu\nu;\lambda} = 0 \qquad (187.2) \]
Which is to say: we can elect to “tensorially continuate” our Maxwell equations
to other coordinate systems of arbitrary (moving curvilinear) design. We retain
the description (181) of gµν , and we retain
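\[ g_{\mu\nu} \longrightarrow \bar{g}_{\mu\nu} = W^{\alpha}{}_{\mu} W^{\beta}{}_{\nu}\, g_{\alpha\beta} \qquad \text{B1} \]
But we have no longer any reason to retain B2, no longer any reason to impose
any specific constraint upon the design of $\bar{g}_{\mu\nu}$. We arrive thus at a formalism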
in which
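\[
\begin{aligned}
F^{\mu\nu}{}_{;\mu} = \tfrac{1}{c}\, j^{\nu} \;&\longrightarrow\; \bar{F}^{\mu\nu}{}_{;\mu} = \tfrac{1}{c}\, \bar{j}^{\nu} \\
F_{\nu\lambda;\mu} + F_{\lambda\mu;\nu} + F_{\mu\nu;\lambda} = 0 \;&\longrightarrow\; \bar{F}_{\nu\lambda;\mu} + \bar{F}_{\lambda\mu;\nu} + \bar{F}_{\mu\nu;\lambda} = 0
\end{aligned}
\]
and in which
\[ X \rightarrow \bar{X} \ \text{is unrestricted} \qquad (188) \]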
No “natural weights” are assigned within this formalism to $F^{\mu\nu}$, $j^{\mu}$ and $g_{\mu\nu}$,
but formal continuity with the conformally-covariant formalism (whence with
the Lorentz-covariant formalism) seems to require that we assign weights w = 1
to $F^{\mu\nu}$ and $j^{\mu}$, weight $w = -\tfrac{1}{2}$ to $g_{\mu\nu}$.
108
See page 55 of “The transformations which preserve wave equations” ()
in transformational physics of waves (–).
Still other points of view are possible,109 but I have carried this discussion
already far enough to establish the validity of a claim made at the outset: the
only proper answer to the question “What transformations X → X preserve the
structure of Maxwell’s equations?” is “It depends—depends on how you have
chosen to write Maxwell’s equations.”
We have here touched, in a physical setting, upon an idea—look at
“objects,” and the groups of transformations which preserve relationships
among those objects—which Felix Klein, in the lecture given when (in ,
at the age of ) he assumed the mathematical professorship at the University
of Erlangen, proposed might be looked upon as the organizing principle of
all pure/applied mathematics—a proposal which has come down to us as the
“Erlangen Program.” It has been supplanted in the world of pure mathematics,
but continues to illuminate the historical and present development of physics.110
4. Lorentz transformations, and some of their implications. To state that X ← X
is a Lorentz transformation is, by definition, to state that the associated
transformation matrix M ≡ M µ ν ≡ ∂xµ /∂xν has (see again page 127) the
property that
\[ \mathbf{M}^{T} \mathbf{g}\, \mathbf{M} = \mathbf{g} \quad\text{everywhere} \qquad (182) \]
where by fundamental assumption $\mathbf{g} = \mathbf{g}^{T} = \mathbf{g}^{-1}$ possesses at each point in
spacetime the specific structure indicated at (181).
I begin with the observation that M must necessarily be a constant matrix.
The argument is elementary: hit (182) with ∂ λ and obtain
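\[ (\partial_\lambda \mathbf{M})^{T} \mathbf{g}\, \mathbf{M} + \mathbf{M}^{T} \mathbf{g}\, (\partial_\lambda \mathbf{M}) = \mathbf{O} \quad\text{because } \mathbf{g} \text{ is constant} \]
This can be rendered
\[ g_{\alpha\beta}\, M^{\alpha}{}_{\lambda\mu}\, M^{\beta}{}_{\nu} + g_{\alpha\beta}\, M^{\alpha}{}_{\mu}\, M^{\beta}{}_{\nu\lambda} = 0 \]
where $M^{\alpha}{}_{\lambda\mu} \equiv \partial_\lambda M^{\alpha}{}_{\mu}$ is a second partial derivative of the transformation functions, whence $M^{\alpha}{}_{\lambda\mu} = M^{\alpha}{}_{\mu\lambda}$. More compactly
\[ \Gamma_{\mu\nu\lambda} + \Gamma_{\nu\lambda\mu} = 0 \]
where $\Gamma_{\mu\nu\lambda} \equiv g_{\alpha\beta}\, M^{\alpha}{}_{\mu}\, M^{\beta}{}_{\nu\lambda}$. Also (subjecting the µνλ to cyclic permutation)
\[ \Gamma_{\nu\lambda\mu} + \Gamma_{\lambda\mu\nu} = 0 \]
\[ \Gamma_{\lambda\mu\nu} + \Gamma_{\mu\nu\lambda} = 0 \]
so
\[
\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}
\begin{pmatrix} \Gamma_{\lambda\mu\nu} \\ \Gamma_{\mu\nu\lambda} \\ \Gamma_{\nu\lambda\mu} \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]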
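The homogeneous system just displayed forces all three Γ's to vanish provided the 3×3 coefficient matrix is non-singular. A short check of that determinant follows; it is a sketch with a hypothetical helper name, not part of the original argument.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

COEFFS = [[0, 1, 1],
          [1, 0, 1],
          [1, 1, 0]]

d = det3(COEFFS)
# A non-zero determinant means the only solution of COEFFS . Gamma = 0
# is Gamma = 0, i.e. every second derivative of the transformation vanishes.
print(d, d != 0)
```

Since the second derivatives vanish, the transformation functions are at most linear, which is the constancy of M asserted in the text.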
109
See D. van Dantzig, “The fundamental equations of electromagnetism,
independent of metric geometry,” Proc. Camb. Phil. Soc. 30, 421 (1935).
110
For an excellent discussion see the section “Codification of geometry by
invariance” (pages 442–453) in E. T. Bell’s The Development of Mathematics
(). The Erlangen Program is discussed in scholarly detail in T. Hawkins,
Emergence of the theory of Lie Groups (): see the index. For a short
history of tensor analysis, see Bell’s Chapter 9.
Lorentz transformations 133
It was emphasized on page 119 that “not every indexed object transforms
tensorially,” and that, in particular, the xµ themselves do not transform
tensorially except in the linear case. We have now in hand just such a case, and
for that reason relativity becomes—not just locally but globally—an exercise
in linear algebra. Spacetime has become a 4 -dimensional vector space; indeed,
it has become an inner product space, with
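\[
\begin{aligned}
(x, y) &\equiv g_{\mu\nu}\, x^{\mu} y^{\nu} \\
&= (y, x) \quad\text{by } g_{\mu\nu} = g_{\nu\mu} \\
&= x^{T} \mathbf{g}\, y \\
&= x^0 y^0 - x^1 y^1 - x^2 y^2 - x^3 y^3 = x^0 y^0 - \mathbf{x} \cdot \mathbf{y}
\end{aligned}
\qquad (191.1)
\]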
111
Einstein ()—on the grounds that what he sought was a minimal
modification of the Galilean transformations (which are themselves linear)—
was content simply to assume linearity.
which (instead of being positive unless x = 0) can assume either sign, and can
vanish even if x ≠ 0. From this primitive fact radiates much (arguably all)
that is most distinctive about the geometry of spacetime . . . which, as Minkowski
was the first to appreciate (and as will emerge) lies at the heart of the theory
of relativity.
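The indefiniteness of the Lorentzian inner product is easy to exhibit. The snippet below is a minimal sketch: the function name `minkowski` and the three sample vectors are illustrative choices, and the signature (+, −, −, −) is the one fixed at (181).

```python
def minkowski(x, y):
    """Inner product (x, y) = x0*y0 - x1*y1 - x2*y2 - x3*y3."""
    return x[0] * y[0] - sum(x[i] * y[i] for i in range(1, 4))

timelike = (2, 1, 0, 0)    # (x, x) > 0
spacelike = (1, 2, 0, 0)   # (x, x) < 0
null = (1, 1, 0, 0)        # (x, x) = 0 although x is not the zero vector

for name, x in (("timelike", timelike), ("spacelike", spacelike), ("null", null)):
    print(name, minkowski(x, x))

# Symmetry (x, y) = (y, x) follows from the symmetry of the metric.
print(minkowski(timelike, spacelike) == minkowski(spacelike, timelike))
```

The three cases correspond to the three possible signs of (x, x), a trichotomy that has no Euclidean analogue.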
If Aµ , B µ and gµν transform as weightless tensors, then basic tensor algebra
informs us that gµν Aµ B ν transforms by invariance:
The first person to recognize the profoundly revolutionary nature of what had
been accomplished was (not Einstein but) Minkowski, who began an address to
the Assembly of German Natural Scientists & Physicians ( September )
with these words:
“The views of space and time which I wish to lay before you have
sprung from the soil of experimental physics, and therein lies their
strength. They are radical. Henceforth space by itself, and time by
itself, are doomed to fade away into mere shadows, and only a kind
of union of the two will preserve an independent reality.”
Electrodynamics had led to the first clear perception of the geometrical design
of the spacetime manifold upon which all physics is written. The symmetries
inherent in that geometry were by this time known to be reflected in the design
of Maxwell’s equations. Einstein’s Principle of Relativity holds that they must,
in fact, be reflected in the design of all physical theories—irrespective of the
specific phenomenology to which any individual theory may refer.
Returning now to the technical mainstream of this discussion . . . let the
Lorentz condition (190) be written
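\[ \boldsymbol{\Lambda}^{-1} = \mathbf{g}^{-1} \boldsymbol{\Lambda}^{T} \mathbf{g} \qquad (192) \]
Generally inversion of a 4×4 matrix is difficult, but (192) shows that inversion
of a Lorentz matrix $\boldsymbol{\Lambda}$ can be accomplished very easily.112
Equations (190/192) impose a multiplicative condition upon $\boldsymbol{\Lambda}$. It was to
reduce multiplicative conditions to additive conditions (which are easier) that
logarithms were invented. Assume, therefore, that $\boldsymbol{\Lambda}$ can be written
\[ \boldsymbol{\Lambda} = e^{A} = I + A + \tfrac{1}{2!} A^2 + \cdots \]
It now follows that
\[ \boldsymbol{\Lambda}^{-1} = e^{-A} \quad\text{while}\quad \mathbf{g}^{-1} \boldsymbol{\Lambda}^{T} \mathbf{g} = \mathbf{g}^{-1} e^{A^{T}} \mathbf{g} = e^{\mathbf{g}^{-1} A^{T} \mathbf{g}} \]
Evidently $\boldsymbol{\Lambda}$ will be a Lorentz matrix if
\[ -A = \mathbf{g}^{-1} A^{T} \mathbf{g} \]
which (by $\mathbf{g}^{T} = \mathbf{g}$) can be expressed
\[ (\mathbf{g} A)^{T} = -(\mathbf{g} A) \]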
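This is an additive condition (involves negation instead of inversion) and
amounts simply to the statement that $\mathbf{g} A \equiv \|A_{\mu\nu}\|$ is antisymmetric. Adopt
this notation
\[
\mathbf{g} A =
\begin{pmatrix}
0 & A_1 & A_2 & A_3 \\
-A_1 & 0 & -a_3 & a_2 \\
-A_2 & a_3 & 0 & -a_1 \\
-A_3 & -a_2 & a_1 & 0
\end{pmatrix}
\]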
112
problem 40
where $A_1, A_2, A_3, a_1, a_2, a_3$ comprise a sextet of adjustable real constants.
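Multiplication on the left by $\mathbf{g}^{-1}$ gives a matrix of (what I idiosyncratically
call) the “$\mathbf{g}$-antisymmetric” design113
\[
A \equiv \|A^{\mu}{}_{\nu}\| =
\begin{pmatrix}
0 & A_1 & A_2 & A_3 \\
A_1 & 0 & a_3 & -a_2 \\
A_2 & -a_3 & 0 & a_1 \\
A_3 & a_2 & -a_1 & 0
\end{pmatrix}
\]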
We come thus to the conclusion that matrices of the form
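\[
\boldsymbol{\Lambda} = \exp
\begin{pmatrix}
0 & A_1 & A_2 & A_3 \\
A_1 & 0 & a_3 & -a_2 \\
A_2 & -a_3 & 0 & a_1 \\
A_3 & a_2 & -a_1 & 0
\end{pmatrix}
\qquad (193)
\]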
are Lorentz matrices; i.e., they satisfy (190/192), and when inserted into (189)
they describe Poincaré/Lorentz transformations.
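The claim that the exponential of a generator of this design is a Lorentz matrix can be tested numerically. What follows is a sketch under stated assumptions: the matrix exponential is approximated by a truncated power series (adequate only for generators of modest size), the parameter values are arbitrary, and all function names are hypothetical.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def expm(X, terms=40):
    """Truncated series I + X + X^2/2! + ... for a small square matrix."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[e / k for e in row] for row in matmul(term, X)]  # X^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def generator(A, a):
    """Generator with boost parameters A and rotation parameters a."""
    A1, A2, A3 = A
    a1, a2, a3 = a
    return [[0.0, A1, A2, A3],
            [A1, 0.0, a3, -a2],
            [A2, -a3, 0.0, a1],
            [A3, a2, -a1, 0.0]]

G = [[1.0, 0, 0, 0], [0, -1.0, 0, 0], [0, 0, -1.0, 0], [0, 0, 0, -1.0]]

L = expm(generator((0.3, -0.2, 0.5), (0.1, 0.4, -0.7)))
check = matmul(matmul(transpose(L), G), L)       # should reproduce G
residual = max(abs(check[i][j] - G[i][j]) for i in range(4) for j in range(4))
print(residual)   # expected to be at the level of floating-point rounding
```

Setting the rotation parameters to zero gives pure boosts, and setting the boost parameters to zero gives ordinary spatial rotations.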
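Does every Lorentz matrix $\boldsymbol{\Lambda}$ admit of such representation? Not quite. It
follows immediately from (190) that $(\det \boldsymbol{\Lambda})^2 = 1$; i.e., that
\[
\Lambda \equiv \det \boldsymbol{\Lambda} = \pm 1, \ \text{according as } \boldsymbol{\Lambda} \text{ is }
\begin{cases} \text{“proper”} \\ \text{“improper”} \end{cases}
\]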
It will emerge also that when one has developed the structure of the matrices
$\boldsymbol{\Lambda} = e^{A}$ one has “cracked the nut,” in the sense that it becomes easy to describe
their improper companions.115
What it means to “develop the structure of $\boldsymbol{\Lambda} = e^{A}$” is exposed most
simply in the (physically artificial) case N = 2. Taking
\[ \mathbf{g} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad : \quad \text{Lorentz metric in 2-dimensional spacetime} \]
113
Notice that $\mathbf{g}$-antisymmetry becomes literal antisymmetry when the metric
$\mathbf{g}$ is Euclidean. Notice also that while it makes tensor-algebraic good sense to
write $A^2 = \|A^{\mu}{}_{\alpha} A^{\alpha}{}_{\nu}\|$ it would be hazardous to write $(\mathbf{g} A)^2 = \|A_{\mu\alpha} A_{\alpha\nu}\|$.
114
problem 41.
115
problem 42.
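where evidently
\[ J = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]
So
\[
\begin{aligned}
\boldsymbol{\Lambda} &= \left( 1 + \tfrac{1}{2!} A^2 + \tfrac{1}{4!} A^4 + \cdots \right) I + \left( A + \tfrac{1}{3!} A^3 + \tfrac{1}{5!} A^5 + \cdots \right) J \\
&= \begin{pmatrix} \cosh A & \sinh A \\ \sinh A & \cosh A \end{pmatrix}
\end{aligned}
\qquad (196.2)
\]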
sinh A cosh A
x0 cosh A sinh A x0
= (198)
x1 sinh A cosh A x1
To describe the successive “ticks of the clock at his origin” O writes
ct
0
tanh A = β (199)
with
β ≡ v/c (200)
These equations serve to assign kinematic meaning to A, and therefore to $\boldsymbol{\Lambda}(A)$.
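Given tanh A = β, the entries of the 2-dimensional boost matrix can be written directly in terms of β. The sketch below assumes the standard identities cosh A = γ and sinh A = γβ with γ = 1/√(1 − β²); the function names and the sample value β = 0.6 are illustrative choices.

```python
import math

def gamma(beta):
    """Lorentz factor 1/sqrt(1 - beta^2), defined for |beta| < 1."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def rapidity(beta):
    """The parameter A fixed by tanh A = beta."""
    return math.atanh(beta)

def boost(beta):
    """2x2 boost [[cosh A, sinh A], [sinh A, cosh A]] expressed through beta."""
    g = gamma(beta)
    return [[g, g * beta], [g * beta, g]]

beta = 0.6
A = rapidity(beta)
L = boost(beta)
# The two descriptions should agree entry by entry (up to rounding).
print(L[0][0], math.cosh(A))
print(L[0][1], math.sinh(A))
# Unit determinant: cosh^2 A - sinh^2 A = 1.
print(L[0][0] * L[1][1] - L[0][1] * L[1][0])
```

Working with the rapidity A rather than β is convenient because, for boosts along a common axis, rapidities simply add.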
Figure 51: Graph of the β-dependence of $\gamma \equiv 1/\sqrt{1 - \beta^2}$, as
β ≡ v/c ranges on the physical interval −1 < β < +1. Outside that
interval γ becomes imaginary.
Heretofore we have been content to share our profession with a zippy population
of “superluminal inertial observers” who glide past us with speeds v > c. But
114
This, however, does not, of itself, deny any conceivable role to superluminal
signals or particles in a relativistic physics!
β1 + β2 + β3 + β1 β2 β3 = 0
clearly evident in the figure. The “β-surface” looks rather like a soap
film spanning the 6-sided frame that results when the six untouched
edges of the cube are discarded.
β(β1 , β2 ) = β(β2 , β1 )
β(β1 , β2 ) = 0 if β2 = −β1
β(1, 1) = 1
To this list our forcibly retired superluminal friends might add the following:
$$= (v_1 + v_2) \cdot \Big\{ 1 - \frac{v_1 v_2}{c^2} + \Big(\frac{v_1 v_2}{c^2}\Big)^2 - \Big(\frac{v_1 v_2}{c^2}\Big)^3 + \cdots \Big\}$$
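The listed properties of the composition function are easy to confirm numerically. The sketch below assumes the collinear rule β(β1, β2) = (β1 + β2)/(1 + β1β2), which is the form implied by the constraint β1 + β2 + β3 + β1β2β3 = 0 with β3 = −β; the function names are hypothetical. It also sums the alternating series in powers of β1β2 to show that it reproduces the same value.

```python
def compose_beta(b1, b2):
    """Collinear composition of two speeds in units of c (assumed form)."""
    return (b1 + b2) / (1.0 + b1 * b2)

def series_beta(b1, b2, terms):
    """Partial sum (b1 + b2) * (1 - x + x^2 - ...), with x = b1 * b2."""
    x = b1 * b2
    return (b1 + b2) * sum((-x) ** k for k in range(terms))

b1, b2 = 0.6, 0.8
beta = compose_beta(b1, b2)

# Residual of the surface equation with beta3 = -beta; it should vanish.
residual = b1 + b2 + (-beta) + b1 * b2 * (-beta)
```

Although b1 + b2 = 1.4 exceeds unity, the composed value stays inside the physical interval.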
5. Geometric considerations. Our recent work has been algebraic. The following
remarks emphasize the geometrical aspects of the situation, and are intended
to provide a more vivid sense of what Lorentz transformations are all about.
By way of preparation: In Euclidean 3-space the equation $x^{\mathsf T}x = r^2$ defines a
sphere (concentric about the origin, of radius r) which—consisting as it does of
points all of which lie at the same (Euclidean) distance from the origin—we may
reasonably call an “isometric surface.” Rotations ($x \to \bar{x} = R\,x$ with $R^{\mathsf T}R = I$)
cause the points of 3-space to shift about, but by a linear rule (straight lines
remain straight) that maps isometric spheres onto themselves: such surfaces
are, in short, “R-invariant.” Similarly . . .
In spacetime the σ-parameterized equations
$$x^{\mathsf T}\boldsymbol{g}\,x = \sigma$$
The revealed geometry of spacetime 143
which describes a
that mark the vertices of a “unit square” on her spacetime diagram. By quick
calculation
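$$\begin{pmatrix} +1 \\ +1 \end{pmatrix} \longrightarrow K_+(\beta) \begin{pmatrix} +1 \\ +1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} -1 \\ -1 \end{pmatrix} \longrightarrow K_+(\beta) \begin{pmatrix} -1 \\ -1 \end{pmatrix}$$
$$\begin{pmatrix} +1 \\ -1 \end{pmatrix} \longrightarrow K_-(\beta) \begin{pmatrix} +1 \\ -1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} -1 \\ +1 \end{pmatrix} \longrightarrow K_-(\beta) \begin{pmatrix} -1 \\ +1 \end{pmatrix} \tag{206}$$
where
$$K_+(\beta) \equiv \sqrt{\frac{1+\beta}{1-\beta}} \quad\text{and}\quad K_-(\beta) \equiv \sqrt{\frac{1-\beta}{1+\beta}} \tag{207}$$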
Calculation would establish what is in fact made obvious already at (206): the
K ± (β) are precisely the eigenvalues of /\\ (β).115 Nor are we surprised that the
associated eigenvectors are null vectors, since
115
We note in passing that K − (β) = [K + (β)]–1 = K + (−β).
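The eigenvalue statement can be verified directly. The sketch below assumes the 2-dimensional boost in the form γ[[1, β], [β, 1]] (equivalent to (196.2) with tanh A = β) and applies it to the null vectors (1, ±1); the function names are illustrative only.

```python
import math

def boost(beta):
    """2-dimensional boost matrix gamma * [[1, beta], [beta, 1]]."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, g * beta], [g * beta, g]]

def apply(m, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def k_plus(beta):
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def k_minus(beta):
    return math.sqrt((1.0 - beta) / (1.0 + beta))

def lorentz_norm(v):
    """Squared Lorentzian length with metric diag(1, -1)."""
    return v[0] ** 2 - v[1] ** 2

beta = 0.6
L = boost(beta)
image_plus = apply(L, [1.0, 1.0])     # expected: K_+ times (1, 1)
image_minus = apply(L, [1.0, -1.0])   # expected: K_- times (1, -1)
```

For β = 0.6 one has γ = 1.25, so K₊ = 2 and K₋ = ½; the two null directions are stretched and shrunk by reciprocal factors.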
116
problem 43.
117
Some authors stress the utility in special relativity of what they call the
“k-calculus:” see, for example, Hermann Bondi, Relativity and Common Sense:
A New Approach to Einstein (), pages 88 –121 and occasional papers in the
American Journal of Physics. My K-notation is intended to establish contact
with that obscure tradition.
Figure 57: $\bar{O}$ writes $(c\bar{t}, 0)$ to describe the “$\bar{t}$-th tick of her clock.”
Working from (201) we find that O assigns coordinates $(\gamma\bar{t}, \gamma\beta\bar{t})$ to
that same event. The implication is that the (Euclidean) angle ϑ
subtended by
• O’s time axis and
• O’s representation of $\bar{O}$’s time axis
can be described
$$\tan\vartheta = \beta$$
The same angle, by a similar argument, arises when one looks to
O’s representation of $\bar{O}$’s space axis. One could, with this information,
construct the instance of Figure 56 which is appropriate to
any prescribed β-value. Again I emphasize that—their Euclidean
appearance notwithstanding—O and $\bar{O}$ are in agreement that $\bar{O}$’s
coordinate axes are normal in the Lorentzian sense.118
118
problem 44.
[Figure: spacetime diagram with legs γ and γβ, illustrating $\bar{x}^0 = \gamma x^0 > x^0$]
(with his trains and lanterns) argued that such effects are not “physical,” in the
sense that they have to do with the properties of “stuff”. . . but “metaphysical”
(or should one say: pre-physical?)—artifacts of the operational procedures by
which one assigns meaning to lengths and times. In preceding pages I have, in
the tradition established by Minkowski, espoused a third view: I have
represented all such effects as reflections of the circumstance (brought first
to our attention by electrodynamics) that the hyperbolic geometry of spacetime
is a primitive fact of the world, embraced by all inertial observers . . . and written
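into the design of all possible physics.
$$\boldsymbol{\Lambda} = \exp \begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix} \tag{196.1}$$
$$\downarrow$$
$$\boldsymbol{\Lambda} = \exp \begin{pmatrix} 0 & A_1 & A_2 & A_3 \\ A_1 & 0 & a_3 & -a_2 \\ A_2 & -a_3 & 0 & a_1 \\ A_3 & a_2 & -a_1 & 0 \end{pmatrix} \tag{193}$$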
120
See elements of relativity ().
Lorentz transformations in 4-dimensional spacetime 155
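where
$$A \equiv \begin{pmatrix} 0 & a_3 & -a_2 \\ -a_3 & 0 & a_1 \\ a_2 & -a_1 & 0 \end{pmatrix} \text{ is real and antisymmetric}$$
where $R \equiv e^{A}$ is a 3×3 rotation matrix. The action of such a $\boldsymbol{\Lambda}$ can be described
$$\begin{pmatrix} x^0 \\ \boldsymbol{x} \end{pmatrix} \longrightarrow \begin{pmatrix} \bar{x}^0 \\ \bar{\boldsymbol{x}} \end{pmatrix} = \begin{pmatrix} x^0 \\ R\,\boldsymbol{x} \end{pmatrix}$$
as a spatial rotation that leaves time coordinates unchanged. Look to the case
a1 = a2 = 0, a3 = φ and use the Mathematica command MatrixExp (applied to the 4×4 generator) to
obtain
$$\boldsymbol{\Lambda} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi & 0 \\ 0 & -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$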
with the evident implication that in the general case such a Lorentz matrix
describes a lefthanded rotation through angle $\varphi = \sqrt{\boldsymbol{a}\cdot\boldsymbol{a}}$ about the unit vector
$\boldsymbol{\lambda} \equiv \hat{\boldsymbol{a}}$.122 Such Lorentz transformations contain no allusion to v and have
no properly kinematic significance: O simply stands beside us, using her clock
(indistinguishable from ours) and her rotated Cartesian frame to “do physics.”
What we have learned is that
of a special type (a type for which the 2 -dimensional theory is too impoverished
to make provision). The associated Lorentz matrices will be notated R (φ, λ).
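Readers without Mathematica at hand can reproduce the MatrixExp computation with a short series sum. The Python sketch below exponentiates the 3×3 antisymmetric matrix A built from (a1, a2, a3), recovers the displayed cos φ / sin φ block for a = (0, 0, φ), and checks for a general a that the rotation angle is √(a·a) and that a itself is left fixed. All function names and sample numbers are assumptions made for illustration.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=60):
    """Approximate exp(m) by a truncated power series."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # holds m^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def rotation_generator(a):
    """The real antisymmetric matrix A of the text, built from (a1, a2, a3)."""
    a1, a2, a3 = a
    return [[0.0, a3, -a2],
            [-a3, 0.0, a1],
            [a2, -a1, 0.0]]

phi = 0.4
R_z = mat_exp(rotation_generator((0.0, 0.0, phi)))

a = (0.3, -0.5, 0.8)                             # arbitrary sample vector
R = mat_exp(rotation_generator(a))
angle = math.sqrt(sum(x * x for x in a))         # phi = sqrt(a . a)
trace = R[0][0] + R[1][1] + R[2][2]              # equals 1 + 2 cos(angle) for a rotation
R_a = [sum(R[i][j] * a[j] for j in range(3)) for i in range(3)]
```

The trace test identifies the angle, and R a = a identifies the axis, because A annihilates a.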
Look next to the complementary . . .
121
“Time/time” means 0 appears twice, “time/space” and “space/time” mean
that 0 appears once, “space/space” means that 0 is absent.
122
See classical dynamics (/), Chapter 1, pages 83–89 for a simple
account of the detailed argument.
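now gives
$$\boldsymbol{A} = \tanh^{-1}\beta \cdot \hat{\boldsymbol{v}}$$
while the argument which (on pages 138–139) gave
$$\boldsymbol{\Lambda} = \exp\Big\{ \tanh^{-1}\beta \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \Big\} = \begin{pmatrix} \gamma & v\gamma/c \\ v\gamma/c & \gamma \end{pmatrix}$$
now gives
$$\boldsymbol{\Lambda} = \exp\Big\{ \tanh^{-1}\beta \begin{pmatrix} 0 & \hat{v}_1 & \hat{v}_2 & \hat{v}_3 \\ \hat{v}_1 & 0 & 0 & 0 \\ \hat{v}_2 & 0 & 0 & 0 \\ \hat{v}_3 & 0 & 0 & 0 \end{pmatrix} \Big\}$$
$$= \begin{pmatrix} \gamma & v_1\gamma/c & v_2\gamma/c & v_3\gamma/c \\ v_1\gamma/c & 1 + (\gamma-1)v_1v_1/v^2 & (\gamma-1)v_1v_2/v^2 & (\gamma-1)v_1v_3/v^2 \\ v_2\gamma/c & (\gamma-1)v_2v_1/v^2 & 1 + (\gamma-1)v_2v_2/v^2 & (\gamma-1)v_2v_3/v^2 \\ v_3\gamma/c & (\gamma-1)v_3v_1/v^2 & (\gamma-1)v_3v_2/v^2 & 1 + (\gamma-1)v_3v_3/v^2 \end{pmatrix}$$
$$= \boldsymbol{\Lambda}(\boldsymbol{\beta}) \tag{209}$$
$$\boldsymbol{\beta} \equiv \boldsymbol{v}/c$$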
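They give rise to Lorentz transformations $x \longrightarrow \bar{x} = \boldsymbol{\Lambda}(\boldsymbol{\beta})x$ which are “pure”
(in the sense “rotation-free”) and are called “boosts.” The construction (208)
looks complicated, but in fact it possesses precisely the structure that one might
(with a little thought) have anticipated. For (209) supplies123
$$\bar{t} = \gamma t + (\gamma/c^2)\,\boldsymbol{v}\cdot\boldsymbol{x}$$
$$\bar{\boldsymbol{x}} = \boldsymbol{x} + \big\{ \gamma t + (\gamma-1)(\boldsymbol{v}\cdot\boldsymbol{x})/v^2 \big\}\boldsymbol{v} \tag{210.1}$$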
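As a check on (209) and (210.1), the following Python sketch builds the 4×4 boost matrix for a sample β, verifies that it preserves the Lorentz metric diag(1, −1, −1, −1), and confirms that the matrix action on an event agrees with the vector form (210.1). Units with c = 1 are assumed, β must be non-zero (the code divides by β·β), and the names `boost_matrix` and `preserves_metric` are hypothetical.

```python
import math

def boost_matrix(beta):
    """4x4 boost of the form (209); beta is a non-zero 3-sequence, |beta| < 1, c = 1."""
    b2 = sum(b * b for b in beta)
    g = 1.0 / math.sqrt(1.0 - b2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = g * beta[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            L[i + 1][j + 1] = delta + (g - 1.0) * beta[i] * beta[j] / b2
    return L, g

METRIC = [1.0, -1.0, -1.0, -1.0]          # diagonal of the Lorentz metric

def preserves_metric(L, tol=1e-12):
    """Check L^T g L = g entry by entry."""
    for m in range(4):
        for n in range(4):
            s = sum(L[k][m] * METRIC[k] * L[k][n] for k in range(4))
            target = METRIC[m] if m == n else 0.0
            if abs(s - target) > tol:
                return False
    return True

beta = (0.3, -0.4, 0.5)                   # sample velocity, beta . beta = 0.5
L, gamma = boost_matrix(beta)
t, x = 2.0, (1.0, 0.5, -1.5)              # sample event (c = 1)
event = [t] + list(x)
image = [sum(L[m][n] * event[n] for n in range(4)) for m in range(4)]

# The same transformation written in the vector form (210.1).
b2 = sum(b * b for b in beta)
v_dot_x = sum(bi * xi for bi, xi in zip(beta, x))
t_bar = gamma * t + gamma * v_dot_x
coeff = gamma * t + (gamma - 1.0) * v_dot_x / b2
x_bar = [xi + coeff * bi for xi, bi in zip(x, beta)]
```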
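$$\boldsymbol{\Lambda} = e^{\boldsymbol{A}}, \qquad \boldsymbol{A} = J + K$$
with
$$J \equiv \begin{pmatrix} 0 & A_1 & A_2 & A_3 \\ A_1 & 0 & 0 & 0 \\ A_2 & 0 & 0 & 0 \\ A_3 & 0 & 0 & 0 \end{pmatrix} \equiv \sum_{i=1}^{3} A_i J_i$$
$$K \equiv \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & a_3 & -a_2 \\ 0 & -a_3 & 0 & a_1 \\ 0 & a_2 & -a_1 & 0 \end{pmatrix} \equiv \sum_{i=1}^{3} a_i K_i$$
$$[J, K] = -\sum_{i=1}^{3} (\boldsymbol{A}\times\boldsymbol{a})_i J_i \tag{212}$$
$$= O \text{ if and only if } \boldsymbol{A} \text{ and } \boldsymbol{a} \text{ are parallel}$$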
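The commutation relation (212) can be confirmed by brute force. This Python sketch builds the boost-type generator J from a vector A and the rotation-type generator K from a vector a, forms JK − KJ, and compares it with the boost-type generator built from −(A × a). The sample vectors and all function names are assumptions chosen for illustration.

```python
def boost_generator(A):
    """J of the text: the symmetric time/space part built from (A1, A2, A3)."""
    A1, A2, A3 = A
    return [[0.0, A1, A2, A3],
            [A1, 0.0, 0.0, 0.0],
            [A2, 0.0, 0.0, 0.0],
            [A3, 0.0, 0.0, 0.0]]

def rotation_generator(a):
    """K of the text: the antisymmetric space/space part built from (a1, a2, a3)."""
    a1, a2, a3 = a
    return [[0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, a3, -a2],
            [0.0, -a3, 0.0, a1],
            [0.0, a2, -a1, 0.0]]

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(p, q):
    pq, qp = mat_mul(p, q), mat_mul(q, p)
    return [[pq[i][j] - qp[i][j] for j in range(4)] for i in range(4)]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

A = (0.2, -0.7, 0.5)
a = (1.1, 0.3, -0.4)
lhs = commutator(boost_generator(A), rotation_generator(a))
rhs = boost_generator(tuple(-c for c in cross(A, a)))

# Parallel case: a = 2A, so the commutator should vanish identically.
parallel = commutator(boost_generator(A),
                      rotation_generator(tuple(2.0 * c for c in A)))
```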
124
The requisite machinery is developed in elaborate detail in elements of
special relativity ().
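$$\Omega = \tan^{-1} \frac{\beta_2 \sin\omega}{\gamma_1(\beta_1 + \beta_2\cos\omega)} \tag{216}$$
$$\downarrow$$
$$\Omega_0 = \tan^{-1} \frac{\beta_2 \sin\omega}{\beta_1 + \beta_2\cos\omega} \quad\text{in the non-relativistic limit}$$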
Ωrelativistic = Ω0 + φ Ω0
calculations which are elementary in the Galilean case (see the figure) but
become a little tedious in the relativistic case.125 Asymmetry effects become
most pronounced in the ultra-relativistic limit. Suppose, for example, that
β1 = 1: then Ω ↓ 0 and
and by taking that procedure to the limit τ ↓ 0, N = t/τ ↑ ∞. One arrives thus
at a method for Lorentz transforming to the frame of an accelerated observer. The
curvature of the orbit means, however, that successive boosts are not collinear;
rotational factors intrude at each step, and have a cumulative effect which (as
detailed analysis128 shows) can be described
$$\frac{d\varphi}{dt} \equiv \Omega_{\text{Thomas}} = (\gamma - 1)\,\Omega_{\text{orbital}} = \tfrac{1}{2}\beta^2\,\Omega_{\text{orbital}} \Big\{ 1 + \tfrac{3}{4}\beta^2 + \tfrac{15}{24}\beta^4 + \cdots \Big\}$$
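The quality of the displayed expansion is easily gauged. The sketch below compares the exact factor γ − 1 with the truncated series ½β²(1 + ¾β² + (15/24)β⁴) at a few speeds; the function names are illustrative assumptions.

```python
import math

def thomas_factor_exact(beta):
    """Exact ratio Omega_Thomas / Omega_orbital = gamma - 1."""
    return 1.0 / math.sqrt(1.0 - beta ** 2) - 1.0

def thomas_factor_series(beta):
    """Truncated expansion (1/2) beta^2 (1 + (3/4) beta^2 + (15/24) beta^4)."""
    return 0.5 * beta ** 2 * (1.0 + 0.75 * beta ** 2 + (15.0 / 24.0) * beta ** 4)

for beta in (0.01, 0.1, 0.3):
    print(beta, thomas_factor_exact(beta), thomas_factor_series(beta))
```

The discrepancy is of order β⁸, so the truncation is excellent for atomic electrons and degrades only as β approaches unity.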
arise from impressed forces. (iii ) Look now beyond the kinematics to the
dynamics: from •’s viewpoint the revolving • is, in effect, a current loop, the
generator of a magnetic field B . Uhlenbeck & Goudsmit had assumed that
the electron possesses a magnetic moment proportional to its postulated spin:
such an electron senses the B -field, to which it responds by precessing, acquiring
precessional energy EUhlenbeck & Goudsmit . Uhlenbeck & Goudsmit worked,
however, from a mistaken conception of “•’s viewpoint.” The point recognized
by Thomas is that when relativistic frame-precession is taken into account129
one obtains
EThomas = 12 EUhlenbeck & Goudsmit
—in good agreement with the spectroscopic data. This was a discovery of
historic importance, for it silenced those (led by Pauli) who had dismissed as
“too classical” the spin idea when it had been put forward by Krönig and
again, one year later, by Uhlenbeck & Goudsmit: “spin” became an accepted/
fundamental attribute of elementary particles.130
So much for the structure and properties of the Lorentz transformations
. . . to which (following more closely in Minkowski’s footsteps than Lorentz’) we
were led by analysis of the condition
$$\boldsymbol{\Lambda}^{\mathsf T}\boldsymbol{g}\,\boldsymbol{\Lambda} = \boldsymbol{g} \quad\text{everywhere} \tag{182}$$
then (since the elements of gj are constants) application of ∂ λ gives
Ωλ ≡ 2Ωϕλ (220)
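Now
$$W^\alpha{}_{\mu\nu} = \Gamma_{\lambda\mu\nu} \cdot \underbrace{\Omega\, M^{\lambda}{}_{\beta}\, g^{\beta\alpha}}_{=\, g^{\lambda\kappa} W^\alpha{}_\kappa \text{ by (218)}} \quad\text{by (221)}$$
so by (222)
$$= \varphi_\mu W^\alpha{}_\nu + \varphi_\nu W^\alpha{}_\mu - g_{\mu\nu} \cdot g^{\lambda\kappa}\varphi_\lambda W^\alpha{}_\kappa \tag{223}$$
where the µν-symmetry is manifest. More compactly
$$= \Gamma^\kappa{}_{\mu\nu} W^\alpha{}_\kappa \tag{224}$$
where
$$\Gamma^\kappa{}_{\mu\nu} \equiv g^{\kappa\lambda}\Gamma_{\lambda\mu\nu}$$
Application of $\partial_\lambda$ to (224) gives
$$W^\alpha{}_{\lambda\mu\nu} = \frac{\partial\Gamma^\kappa{}_{\mu\nu}}{\partial x^\lambda} W^\alpha{}_\kappa + \Gamma^\kappa{}_{\mu\nu} W^\alpha{}_{\lambda\kappa}$$
which (since $W^\alpha{}_{\mu\nu}$, $W^\alpha{}_{\lambda\mu\nu}$ and Γ are symmetric in their subscripts, and after relabeling
some indices) can be written
$$W^\alpha{}_{\lambda\mu\nu} = \frac{\partial\Gamma^\beta{}_{\lambda\nu}}{\partial x^\mu} W^\alpha{}_\beta + \Gamma^\kappa{}_{\nu\lambda} \underbrace{W^\alpha{}_{\kappa\mu}}_{=\, \Gamma^\beta{}_{\kappa\mu} W^\alpha{}_\beta \text{ by (224)}} = \Big\{ \frac{\partial\Gamma^\beta{}_{\lambda\nu}}{\partial x^\mu} + \Gamma^\beta{}_{\kappa\mu}\Gamma^\kappa{}_{\nu\lambda} \Big\} W^\alpha{}_\beta$$
from which it follows in particular that
$$W^\alpha{}_{\lambda\mu\nu} - W^\alpha{}_{\lambda\nu\mu} = \Big\{ \frac{\partial\Gamma^\beta{}_{\lambda\nu}}{\partial x^\mu} - \frac{\partial\Gamma^\beta{}_{\lambda\mu}}{\partial x^\nu} + \Gamma^\beta{}_{\kappa\mu}\Gamma^\kappa{}_{\nu\lambda} - \Gamma^\beta{}_{\kappa\nu}\Gamma^\kappa{}_{\mu\lambda} \Big\} W^\alpha{}_\beta \equiv R^\beta{}_{\lambda\mu\nu} W^\alpha{}_\beta \tag{225}$$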
The preceding sequence of manipulations will, I fear, strike naive readers as an
unmotivated jumble. But those with some familiarity with patterns of argument
standard to differential geometry will have recognized that
• the quantities W α µ transform as components of an α-parameterized set
of covariant vectors;
• the quantities Γ κ µν are components of 131 an affine connection to which
(222) assigns a specialized structure;
• the α-parameterized equations (224) can be notated
Dν W α µ ≡ ∂ ν W α µ − W α κ Γ κ µν = 0
according to which each of the vectors W α µ has the property that its
covariant derivative 129 vanishes;
• the 4th rank tensor Rβ λµν defined at (225) is just the Riemann-Christoffel
curvature tensor ,129 to which a specialized structure has in this instance
been assigned by (222).
131
See again page 123.
But of differential geometry I will make explicit use only in the following—
independently verifiable—facts: let
  N      $N^4$     $\tfrac{1}{12}N^2(N^2-1)$
  1        1         0
  2       16         1
  3       81         6
  4      256        20
  5      625        50
  6     1296       105
  ...    ...       ...
We will, in particular, need to know that in the 2 -dimensional case the only
non-vanishing components of Rκλµν are
Introducing (222) into (225) we find (after some calculation marked by a great
deal of cancellation) that Rκλµν has the correspondingly specialized structure
Rκλµν = gκν Φλµ − gκµ Φλν − gλν Φκµ + gλµ Φκν (227)
where
Rλµ = 0 (230.1)
R=0 (230.2)
Φαβ = 0 (232)
This is the conformality condition from which we will work. When introduced
into (227) it renders (226) automatic.132
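Note that (220) can be written $\partial_\lambda\varphi \equiv \varphi_\lambda = \partial_\lambda \log\sqrt{\Omega}$ and entails
$$\varphi = \log\sqrt{\Omega} + \text{constant}$$
$$F_{\mu\nu} = g_{\mu\nu} \cdot \frac{g^{\alpha\beta}F_\alpha F_\beta}{2F} \tag{234}$$
where $F_\mu \equiv \partial_\mu F$ and $F_{\mu\nu} \equiv \partial_\mu\partial_\nu F$. The implication is that
$$\partial_\nu(g^{\lambda\mu}F_\mu) = g^{\lambda\mu}F_{\mu\nu} = \tfrac{1}{2}\,\frac{g^{\alpha\beta}F_\alpha F_\beta}{F}\,\delta^\lambda{}_\nu \;:\; \text{vanishes unless } \nu = \lambda$$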
which is to say: g λµ Fµ is a function only of xλ . But gj is, by initial assumption,
a constant diagonal matrix, so we have
Fµν = 2Cgµν
giving
$$F = C g_{\alpha\beta}x^\alpha x^\beta - 2b_\alpha x^\alpha + A = C\cdot(x,x) - 2(b,x) + A \tag{235}$$
the
effect of which, upon comparison with (235), is to constrain the constants
A, bα , C to satisfy
$$AC = (b,b)$$
$$F = A - 2(b,x) + \frac{(b,b)(x,x)}{A} \;:\; A \text{ and } b_\alpha \text{ now unconstrained}$$
Finally we introduce $a_\alpha \equiv b_\alpha/A$ to obtain the pretty result
$$F = A\big[ 1 - 2(a,x) + (a,a)(x,x) \big] \tag{236}$$
Conformal transformations 169
$$\Omega = W^{2/N} = \frac{1}{F^2} = \frac{1}{A^2\,[1 - 2(a,x) + (a,a)(x,x)]^2} \tag{237}$$
Clearly, tensor weight distinctions do not become moot in the context provided
by the conformal group, as they did (to within signs) in connection with the
Lorentz group.
To get a handle on the functions xα (x) that describe specific conformal
transformations X ← X we introduce
$$\partial_\mu\varphi \equiv \varphi_\mu = \partial_\mu \log\sqrt{\Omega} = -\partial_\mu \log F = -\frac{1}{F}F_\mu$$
into (223) to obtain
F W α µν + Fµ W α ν + Fν W α µ = gµν · g λκ Fλ W α κ
F yα ≡ zα
= −Fµν = −2Cgµν
so
= gµν · 1 − 2Cz α + g λκ Fλ z α κ (239)
F
Each of these α-parameterized equations is structurally analogous to (234), and
the argument that gave (235) now gives
$$z^\alpha(x) = P^\alpha \cdot (x,x) + \Lambda^\alpha{}_\beta x^\beta + \Big\{ \text{now no } x\text{-independent term because } y(0) = 0 \Rightarrow z(0) = 0 \Big\}$$
↓
= Λα β xβ
and the equation (185.2) that served as our point of departure becomes
$\boldsymbol{\Lambda}^{\mathsf T}\boldsymbol{g}\,\boldsymbol{\Lambda} = \boldsymbol{g}$, from which we learn that the $\Lambda^\alpha{}_\beta$ must be elements of a Lorentz
matrix.
Transformations of the form (240) have been of interest to mathematicians
since the latter part of the 19th Century. Details relating to the derivation
of (240) by iteration of infinitesimal conformal transformations were worked
out by S. Lie, and are outlined on pages 28–32 of J. E. Campbell’s Theory of
Continuous Groups (). The finitistic argument given above—though in a
technical sense “elementary”—shows the toolmarks of a master’s hand, and is in
fact due (in essential outline) to H. Weyl (). I have borrowed most directly
from V. Fock, The Theory of Space, Time & Gravitation (), Appendix A:
“On the derivation of the Lorentz transformations.”
Equation (240) describes—for N ≠ 2—the most general N-dimensional
conformal transformation, and can evidently be considered to arise by
composition from the following:
  1      3
  2      6 + ∞
  3     10
  4     15
  5     21
  6     28
  ...   ...
Concerning the entry at N = 2: equation (240) makes perfect sense in the
case N = 2, and that case provides a diagrammatically convenient context
within which to study the meaning of (240) in the general case. But (240) was
derived from (232), which was seen on page 167 to be stronger than the condition
(231) appropriate to the 2-dimensional case. The weakened condition requires
alternative analysis,133 and admits of more possibilities—actually infinitely
many more, corresponding roughly to the infinitely many ways of selecting
f (z) in the theory of conformal transformations as it is encountered in complex
function theory.134 I do not pursue the topic because the physics of interest to
us is inscribed (as are we) on 4 -dimensional spacetime.
Some of the mystery which surrounds the Möbius transformations—which
are remarkable for their nonlinearity—is removed by the remark that they can
be assembled from translations and “inversions,” where the latter are defined
as follows:
$$\text{Inversion}: \quad x \to \bar{x} = \mu^2 \frac{x}{(x,x)} \tag{241.5}$$
Here µ2 is a constant of arbitrary value, introduced mainly for dimensional
reasons. The proof is by construction:
$$x \xrightarrow{\ \text{inversion}\ } x' = \mu^2 x/(x,x)$$
$$\xrightarrow{\ \text{translation with } t = -\mu^2 a\ } x'' = x' - \mu^2 a$$
$$\xrightarrow{\ \text{inversion}\ } x''' = \mu^2 x''/(x'',x'') = \frac{x - (x,x)a}{1 - 2(a,x) + (a,a)(x,x)} \tag{242}$$
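The construction (242) can be tested numerically. The Python sketch below composes inversion, translation by −µ²a and a second inversion, using the Lorentzian inner product with metric diag(1, −1, −1, −1), and compares the outcome with the closed Möbius form on the last line of (242). The sample values of µ², x and a are arbitrary assumptions (chosen so that no denominator vanishes), as are the function names.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)

def dot(x, y):
    """Lorentzian inner product (x, y)."""
    return sum(g * xi * yi for g, xi, yi in zip(METRIC, x, y))

def inversion(x, mu2=1.0):
    """x -> mu^2 x / (x, x); undefined on the null cone."""
    s = dot(x, x)
    return [mu2 * xi / s for xi in x]

def translation(x, t):
    return [xi + ti for xi, ti in zip(x, t)]

def moebius(x, a):
    """Closed form on the last line of (242)."""
    xx, ax, aa = dot(x, x), dot(a, x), dot(a, a)
    denom = 1.0 - 2.0 * ax + aa * xx
    return [(xi - xx * ai) / denom for xi, ai in zip(x, a)]

mu2 = 2.5
x = [0.4, 1.2, -0.7, 0.3]
a = [0.1, -0.2, 0.05, 0.3]

step1 = inversion(x, mu2)
step2 = translation(step1, [-mu2 * ai for ai in a])
step3 = inversion(step2, mu2)
direct = moebius(x, a)

# Setting a = 0 in the same chain shows that inversion is self-reciprocal.
twice = inversion(inversion(x, mu2), mu2)
```

Note that µ² drops out of the final result, as the closed form requires.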
133
The problem is discussed in my transformational physics of waves
( –).
134
See again page 129.
Inversion—which
• admits readily of geometrical interpretation (as a kind of “radial reflection”
in the isometric surface (x, x) = µ2 )
• can be looked upon as the ultimate source of the nonlinearity which is
perhaps the most striking feature of the conformal transformations (240)
—is one of the sharpest tools available to the conformal theorist, so I digress
to examine some of its properties:
We have, in effect, already shown (at (242): set a = 0) that inversion
is—like every kind of “reflection”—self-reciprocal:
which is to say:
angle = angle
Inversion, since conformal, must be describable in terms of the primitive
transformations listed at (241). How is that to be accomplished? We notice that
each of those transformations—with the sole exception of the improper Lorentz
transformations—is continuous with the identity (which arises at /\\ = I, at
t = 0, at K = 1, at a = 0). Evidently improper Lorentz transformations—in a
word: reflections—must enter critically into the fabrication of inversion, and it
is this observation that motivates the following short digression: For arbitrary
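non-null $a^\mu$ we can always write
$$x = \Big\{ x - \frac{(x,a)}{(a,a)}a \Big\} + \frac{(x,a)}{(a,a)}a \equiv x_\perp + x_\parallel$$
$$a\text{-reflection}: \quad x = x_\perp + x_\parallel \;\longrightarrow\; \hat{x} = x_\perp - x_\parallel = x - 2\frac{(x,a)}{(a,a)}a \tag{245}$$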
Thus prepared, we are led after a little exploratory tinkering to the following
sequence of transformations:
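$$x \xrightarrow{\ \text{translation}\ } x' = x - \frac{1}{(a,a)}a$$
$$\xrightarrow{\ \text{reflection}\ } x'' = x' - 2\frac{(x',a)}{(a,a)}a$$
$$\xrightarrow{\ \text{Möbius}\ } x''' = \frac{x'' - (x'',x'')a}{1 - 2(a,x'') + (a,a)(x'',x'')} = \cdots\ (\text{algebraic simplification})\ \cdots = \frac{1}{(a,a)}\Big\{ \frac{x}{(x,x)} - a \Big\}$$
$$\xrightarrow{\ \text{reverse translation}\ } x'''' = x''' + \frac{1}{(a,a)}a = \mu^2\frac{x}{(x,x)} \quad\text{with } \mu^2 \equiv (a,a)^{-1}$$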
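The “algebraic simplification” step is where errors most easily creep in, so a numerical check is worthwhile. The sketch below carries a sample point through the four steps (translation by −a/(a,a), a-reflection (245), Möbius transformation, reverse translation) and compares the result with the inversion µ²x/(x,x) for µ² = 1/(a,a). The sample x and non-null a are arbitrary assumptions, and the function names are hypothetical.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)

def dot(x, y):
    """Lorentzian inner product (x, y)."""
    return sum(g * xi * yi for g, xi, yi in zip(METRIC, x, y))

def shift(x, a, scale):
    """Translate x by scale * a."""
    return [xi + scale * ai for xi, ai in zip(x, a)]

def reflection(x, a):
    """a-reflection (245): x -> x - 2 (x, a)/(a, a) a."""
    return shift(x, a, -2.0 * dot(x, a) / dot(a, a))

def moebius(x, a):
    xx, ax, aa = dot(x, x), dot(a, x), dot(a, a)
    denom = 1.0 - 2.0 * ax + aa * xx
    return [(xi - xx * ai) / denom for xi, ai in zip(x, a)]

def inversion(x, mu2):
    s = dot(x, x)
    return [mu2 * xi / s for xi in x]

x = [0.4, 1.2, -0.7, 0.3]
a = [0.1, -0.2, 0.05, 0.3]            # non-null: (a, a) = -0.1225
aa = dot(a, a)

step1 = shift(x, a, -1.0 / aa)        # translation
step2 = reflection(step1, a)          # reflection
step3 = moebius(step2, a)             # Moebius transformation
step4 = shift(step3, a, +1.0 / aa)    # reverse translation

expected = inversion(x, 1.0 / aa)     # mu^2 = 1/(a, a)
```

With this spacelike a the constant µ² comes out negative, which the text permits since µ² is “of arbitrary value.”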
N
1
W = ±K N
1 − 2(a, x) + (a, a)(x, x)
of (237)—one would discover soon enough that one had a job on one’s hands!
But the result in question can be obtained as an easy consequence of the
We are familiar with the fact that specialized Lorentz transformations serve
to boost one to the frame of an observer O in uniform motion. I discuss now
a related fact with curious electrodynamic implications: specialized Möbius
transformations serve to boost one to the frame of a uniformly accelerated
observer . From (241.4) we infer that aµ has the dimensionality of reciprocal
length, so
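$$\tfrac{1}{2}g^\mu \equiv c^2 a^\mu \quad\text{is dimensionally an “acceleration”}$$
Then
$$1 - \frac{1}{c^2}(g,x) + \frac{1}{4c^4}(g,g)(x,x) = \frac{(x-\lambda)^2 + y^2 + z^2 - c^2t^2}{\lambda^2}$$
$$\lambda \equiv \frac{2c^2}{g} \quad\text{is a “length”}$$
$$\bar{t} = \frac{\lambda^2}{[(x-\lambda)^2 - c^2t^2]} \cdot t \tag{252.1}$$
$$(\bar{x} + \lambda) = -\frac{\lambda^2}{[(x-\lambda)^2 - c^2t^2]} \cdot (x - \lambda) \tag{252.2}$$
which jointly entail
$$\big[ c^2\bar{t}^2 - (\bar{x}+\lambda)^2 \big]\big[ c^2t^2 - (x-\lambda)^2 \big] = \lambda^4 \tag{253}$$
$$\bar{t} = \frac{\lambda^2}{[\lambda^2 - c^2t^2]} \cdot t$$
$$\bar{x} = \frac{\lambda^3}{[\lambda^2 - c^2t^2]} - \lambda$$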
which provide O’s t-parameterized description of O’s worldline. Notice
that t and x both become infinite at t = λ/c, and that t thereafter
becomes negative!
• To describe her lightcone O writes x = ±ct. Insert x = +ct into (252.1),
(ask Mathematica to) solve for t and obtain ct = λct/(2ct + λ). Insert
that result and x = +ct into (252.2) and, after simplifications, obtain
x = +ct. Repeat the procedure taking x = −ct as your starting point:
obtain ct = −λct/(2ct − λ) and finally x = −ct. The striking implication
is that (252) sends
and enters into electrodynamics because the photon has no mass.” That the
group enters also into the physics of massy particles133 is, in the light of such
a remark, somewhat surprising. Surprises are imported also into classical
electrodynamics by the occurrence of accelerations within the conformal group,
for the question then arises: Does a uniformly accelerated charge radiate?137
135
I scratch deeper, and discuss the occurrence of the conformal group in
connection with a rich variety of physical problems, in appell, galilean
& conformal transformations in classical/quantum free particle
dynamics () and transformational physics of waves (–).
136
In “‘Electrodynamics’ in 2 -dimensional spacetime” () I develop a
“toy electrodynamics” that gives full play to the exceptional richness that the
conformal group has been seen to acquire in the 2 -dimensional case.
137
This question—first posed by Pauli in §32γ of his Theory of Relativity—
once was the focus of spirited controversy: see T. Fulton & F. Rohrlich,
“Classical radiation from a uniformly accelerated charge,” Annals of Physics 9,
This means that we can study separately the response of F to spatial rotations
R and its response to boosts $\boldsymbol{\Lambda}(\boldsymbol{\beta})$.
where
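$$R = \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix}$$
is a 3×3 rotation matrix: $R^{-1} = R^{\mathsf T}$. It will, in a moment, become essential to
notice that the latter equation, when spelled out in detail, reads
$$\frac{1}{\det R} \begin{pmatrix} (R_{22}R_{33} - R_{23}R_{32}) & (R_{13}R_{32} - R_{12}R_{33}) & (R_{12}R_{23} - R_{13}R_{22}) \\ (R_{23}R_{31} - R_{21}R_{33}) & (R_{11}R_{33} - R_{13}R_{31}) & (R_{21}R_{13} - R_{23}R_{11}) \\ (R_{32}R_{21} - R_{31}R_{22}) & (R_{31}R_{12} - R_{32}R_{11}) & (R_{11}R_{22} - R_{12}R_{21}) \end{pmatrix} = \begin{pmatrix} R_{11} & R_{21} & R_{31} \\ R_{12} & R_{22} & R_{32} \\ R_{13} & R_{23} & R_{33} \end{pmatrix} \tag{257}$$
where
$$\frac{1}{\det R} = \pm 1 \quad\text{according as } R \text{ is proper/improper}$$
Our task now is the essentially elementary one of evaluating
$$\bar{F} = \frac{1}{\det R} \begin{pmatrix} 1 & \boldsymbol{0}^{\mathsf T} \\ \boldsymbol{0} & R \end{pmatrix} \begin{pmatrix} 0 & -\boldsymbol{E}^{\mathsf T} \\ \boldsymbol{E} & \mathbb{B} \end{pmatrix} \begin{pmatrix} 1 & \boldsymbol{0}^{\mathsf T} \\ \boldsymbol{0} & R^{\mathsf T} \end{pmatrix} = \frac{1}{\det R} \begin{pmatrix} 0 & -(R\boldsymbol{E})^{\mathsf T} \\ R\boldsymbol{E} & R\,\mathbb{B}R^{\mathsf T} \end{pmatrix}$$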
which supplies
139
problem 47.
then (on a large sheet of paper) construct a detailed description of the matrix
on the right, and finally make simplifications based on the rotational identity
(257) . . . we find that (258.1) is precisely equivalent to (which is to say: simply
a notational variant of) the statement140
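$$\begin{pmatrix} \bar{B}_1 \\ \bar{B}_2 \\ \bar{B}_3 \end{pmatrix} = R \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} \tag{259.2}$$
$$\boldsymbol{A} \to \bar{\boldsymbol{A}} = (\det R) \cdot R\boldsymbol{A}$$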
140
For a more elegant approach to the proof of this important lemma see
pages 22–22 in classical gyrodynamics ().
141
See again the first point of view , page 126.
How electromagnetic fields respond to Lorentz transformations 181
then the (det R)–1 factors would disappear from the right side of (258), and we
would be led to the opposite conclusion:
E responds to rotation as a vector
(261.2)
B responds to rotation as a pseudovector
142
See again the second point of view , page 128.
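143
This fact has been latent ever since—at (67)—we alluded to the “E-like
character” of $\tfrac{1}{c}\boldsymbol{v}\times\boldsymbol{B}$, since
$$\text{vector} \times \begin{Bmatrix} \text{vector} \\ \text{pseudovector} \end{Bmatrix} = \begin{Bmatrix} \text{pseudovector} \\ \text{vector} \end{Bmatrix}$$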
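Noting that
$$\bar{\boldsymbol{E}}_\parallel = (\boldsymbol{E} - \boldsymbol{\beta}\times\boldsymbol{B})_\parallel \quad\text{because } (\boldsymbol{\beta}\times\boldsymbol{B}) \perp \boldsymbol{\beta}$$
$$\bar{\boldsymbol{B}}_\parallel = (\boldsymbol{B} + \boldsymbol{\beta}\times\boldsymbol{E})_\parallel \quad\text{because } (\boldsymbol{\beta}\times\boldsymbol{E}) \perp \boldsymbol{\beta}$$
we infer that
$$\bar{\boldsymbol{E}} = (\boldsymbol{E} - \boldsymbol{\beta}\times\boldsymbol{B})_\parallel + \gamma(\boldsymbol{E} - \boldsymbol{\beta}\times\boldsymbol{B})_\perp$$
$$\bar{\boldsymbol{B}} = (\boldsymbol{B} + \boldsymbol{\beta}\times\boldsymbol{E})_\parallel + \gamma(\boldsymbol{B} + \boldsymbol{\beta}\times\boldsymbol{E})_\perp \tag{263}$$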
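The transformation rule (263) lends itself to a quick numerical experiment. The sketch below implements it for 3-vectors given as Python lists, then checks two things that any correct field transformation must satisfy: the quantities E·B and E·E − B·B are unchanged, and transforming back with −β recovers the original fields. Units in which E and B share a dimension are assumed (as the 1/c factors of the text suggest), and all names and sample values are illustrative.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def split(a, beta):
    """Return the parts of a parallel and perpendicular to beta."""
    scale = dot(a, beta) / dot(beta, beta)
    par = [scale * b for b in beta]
    perp = [ai - pi for ai, pi in zip(a, par)]
    return par, perp

def transform_fields(E, B, beta):
    """Field transformation in the form (263); beta must be non-zero, |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - dot(beta, beta))
    e_mix = [e - c for e, c in zip(E, cross(beta, B))]   # E - beta x B
    b_mix = [b + c for b, c in zip(B, cross(beta, E))]   # B + beta x E
    e_par, e_perp = split(e_mix, beta)
    b_par, b_perp = split(b_mix, beta)
    E_new = [p + gamma * q for p, q in zip(e_par, e_perp)]
    B_new = [p + gamma * q for p, q in zip(b_par, b_perp)]
    return E_new, B_new

E = [1.0, -2.0, 0.5]
B = [0.3, 0.7, -1.1]
beta = [0.2, 0.4, -0.3]

E_bar, B_bar = transform_fields(E, B, beta)
E_back, B_back = transform_fields(E_bar, B_bar, [-b for b in beta])
```

Both invariants survive whichever sign convention is adopted for β, so this test does not by itself fix the direction of the boost.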
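$$\boldsymbol{A} = \boldsymbol{A}_\parallel + \boldsymbol{A}_\perp$$
$$\boldsymbol{A}_\parallel \equiv (\boldsymbol{A}\cdot\hat{\boldsymbol{\beta}})\hat{\boldsymbol{\beta}} = \frac{1}{\beta^2} \begin{pmatrix} \beta_1\beta_1 & \beta_1\beta_2 & \beta_1\beta_3 \\ \beta_2\beta_1 & \beta_2\beta_2 & \beta_2\beta_3 \\ \beta_3\beta_1 & \beta_3\beta_2 & \beta_3\beta_3 \end{pmatrix} \boldsymbol{A} \quad (\text{the matrix projects onto } \boldsymbol{\beta})$$
where $M(\boldsymbol{\beta})$ is a 6×6 matrix whose elements can be read off from (263)—and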
some not so obvious. I would pursue this topic in response to some specific
formal need, but none will arise.
Maxwell’s equations
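$$\partial_\mu F^{\mu\nu} = \tfrac{1}{c}\,j^\nu$$
$$\partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} + \partial_\lambda F_{\mu\nu} = 0$$
simply “turn black” in response to
$$\bar{x}^\mu = \Lambda^\mu{}_\alpha x^\alpha, \qquad \bar{j}^\nu = \Lambda^\nu{}_\beta j^\beta, \qquad \bar{F}^{\mu\nu} = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta F^{\alpha\beta} \tag{264.2}$$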
and provide detailed statements of what one means when one refers to the
“Lorentz covariance of Maxwellian electrodynamics.” Note that it is not enough
to know how Lorentz transformations act on spacetime coordinates: one must
know also how they act on fields and sources. The contrast in the formal
appearance of (264.1: Lorentz & Einstein) and (264.2: Minkowski) is striking,
and motivates me to remark that
• it is traditional in textbooks to view (264.1) as “working equations,” and
to regard (264.2) as “cleaned-up curiosities,” to be written down and
admired as a kind of afterthought . . . but
• my own exposition has been designed to emphasize the practical utility
of (264.2): I view (264.1) as “elaborated commentary” upon (264.2)—too
complicated to work with except in some specialized applications.
4. We know now how to translate electrodynamical statements from one inertial
frame to another. But we do not at present possess answers to questions such
as the following:
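$$\bar{\boldsymbol{E}} = \boldsymbol{E}_\parallel + \gamma\boldsymbol{E}_\perp = \gamma\boldsymbol{E} + (1-\gamma)\frac{1}{v^2}(\boldsymbol{v}\cdot\boldsymbol{E})\,\boldsymbol{v}$$
$$\bar{\boldsymbol{B}} = \gamma(\boldsymbol{\beta}\times\boldsymbol{E}) = \tfrac{1}{c}\gamma(\boldsymbol{v}\times\boldsymbol{E})$$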
The preceding remark makes vividly clear, by the way, why it is that attempts to
“derive” electrodynamics from “Coulomb’s law + special relativity” are doomed
to fail: with only that material to work with one cannot escape from the force
of the special/atypical condition E ·B = 0.
6. We do not have in hand the statements analogous to (264) that serve to lend
detailed meaning to the “conformal covariance of Maxwellian electrodynamics.”
To gain a sense of the most characteristic features of the enriched theory it
would be sufficient to describe how electromagnetic fields and sources respond
to dilations and inversions.
7. An uncharged copper rod is transported with velocity v in the presence of a
homogeneous magnetic field B. We see a charge separation take place (one
end of the rod becomes positively charged, the other negatively: see Figure 66),
which we attribute to the presence of q(v×B)-forces. But an observer O co-moving
with the rod sees no such forces (since v = 0), and must attribute the charge
separation phenomenon to the presence of an electric field E . It was to account
for such seeming “explanatory asymmetry” that Einstein invented the theory
of relativity. I quote from the beginning of his paper:
144
problem 48.
a. einstein
9. Principle of relativity . The arguments which led Einstein to the Lorentz trans-
formations differ profoundly from those which (unbeknownst to Einstein) had
led Lorentz to the same result. Lorentz argued (as we have seen . . . and done)
from the structure of Maxwell’s equations. Einstein, on the other hand (and
though he had an electrodynamic problem in mind), extracted the Lorentz
transformations from an unprecedented operational analysis: his argument
assumed very little . . . and he had, therefore, correspondingly greater confi-
dence in the inevitability and generality of his conclusions. His argument was,
in particular, entirely free from any reference to Maxwell’s equations, so his
conclusion—that inertial observers are interrelated by Lorentz transformations
—could not be specific to Maxwellean electrodynamics. It was this insight—and
the firmness145 with which he adhered to it—which distinguished Einstein’s
thought from that of his contemporaries (Lorentz, Poincaré). It led him to
145
I have indicated on page 163 why, in the light of subsequent developments,
Einstein’s “firmness” can be argued to have been inappropriately strong.
Principle of relativity 187
propose, at the beginning of his §2, two principles . . . which amount, in effect,
to this, the
t t(λ)
x(t) x(λ)
λ + dλ
λ
λ0
λ = λ(τ, λ0 )
1. Einstein’s program works if and only if all tangents to the worldline are
timelike (Figure 69). One cannot, therefore, τ -parameterize the worldline of a
photon. Or of a “tachyon.” The reason is that one cannot boost such particles
to rest: one cannot Lorentz transform the tangents to such worldlines into local
coincidence with the x0 -axis.
2. The dτ ’s in dτ refer to a population of osculating inertial observers.
It is a big step—a step which Einstein (and also L. H. Thomas) considered
quite “natural,” but a big step nonetheless—to suppose that τ has anything
literally to do with “time as measured by a comoving (which in the general case
means an accelerating) clock.” The relativistic dynamics of particles is, in fact,
independent of whether one attaches literal meaning to the preceding phrase. Close
reading of Einstein’s paper shows, however, that he did intend to be understood
literally (even though—patent clerk that he was—he would not have expected
his mantel clock to keep good time if jerked about). Experimental evidence
supportive of Einstein’s view derives from the decay of accelerated radioactive
particles and from recent observations pertaining to the so-called twin paradox
(see below).
Given a τ -parameterized (whence everywhere timelike) worldline x(τ ), we
define by
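$$u(\tau) \equiv \frac{d}{d\tau}x(\tau) = \frac{dt}{d\tau}\frac{d}{dt}\begin{pmatrix} ct \\ \boldsymbol{x} \end{pmatrix} = \gamma\begin{pmatrix} c \\ \boldsymbol{v} \end{pmatrix} \tag{269}$$
the 4-velocity $u^\mu(\tau)$, and by
$$a(\tau) \equiv \frac{d^2}{d\tau^2}x(\tau) = \frac{d}{d\tau}u(\tau) = \frac{dt}{d\tau}\frac{d}{dt}\,\gamma\begin{pmatrix} c \\ \boldsymbol{v} \end{pmatrix} = \begin{pmatrix} \tfrac{1}{c}\gamma^4(\boldsymbol{a}\cdot\boldsymbol{v}) \\ \gamma^2\boldsymbol{a} + \tfrac{1}{c^2}\gamma^4(\boldsymbol{a}\cdot\boldsymbol{v})\boldsymbol{v} \end{pmatrix} \tag{270}$$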
x = /\\ (ss/c)x
we have
These equations look simple enough, but their explicit meaning is—owing to the
complexity of $\boldsymbol{\Lambda}(\boldsymbol{s}/c)$, of $u^\mu$ and particularly of $a^\mu$—actually quite complex. I
will develop the detail only when forced by explicit need.148
It follows from (269) that
147
problem 49.
148
In the meantime, see my electrodynamics (/), pages 202–205.
Relativistic mechanics of a particle 193
according to which all velocity 4-vectors have the same Lorentzian length. All
are, in particular (since (u, u) = c2 > 0), timelike. Differentiating (272) with
respect to τ we obtain
$$\frac{d}{d\tau}(u,u) = 2(u,a) = 0 \tag{273}$$
according to which it is invariably the case that u ⊥ a in the Lorentzian sense.
It follows now from the timelike character of u that all acceleration 4-vectors
are spacelike. Direct verification of these statements could be extracted from
(269) and (270). The statement (u, u) = c2 —of which (273) is an immediate
corollary—has no precursor in non-relativistic kinematics,149 but is, as will
emerge, absolutely fundamental to relativistic kinematics/dynamics.
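The “direct verification” alluded to above can be delegated to a few lines of Python. The sketch evaluates the 4-velocity (269) and 4-acceleration (270) for a sample 3-velocity v and 3-acceleration a, then forms the Lorentzian products (u,u), (u,a) and (a,a). The value c = 3 and the sample vectors are arbitrary assumptions, chosen only so that the c-factors are exercised; the function names are hypothetical.

```python
import math

def dot3(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def minkowski(p, q):
    """Lorentzian inner product with signature (+, -, -, -)."""
    return p[0] * q[0] - dot3(p[1:], q[1:])

def four_velocity(v, c):
    """Equation (269): u = gamma (c, v)."""
    gamma = 1.0 / math.sqrt(1.0 - dot3(v, v) / c ** 2)
    return [gamma * c] + [gamma * vi for vi in v]

def four_acceleration(v, a, c):
    """Equation (270), with a the ordinary 3-acceleration dv/dt."""
    gamma = 1.0 / math.sqrt(1.0 - dot3(v, v) / c ** 2)
    av = dot3(a, v)
    time_part = gamma ** 4 * av / c
    space_part = [gamma ** 2 * ai + gamma ** 4 * av * vi / c ** 2
                  for ai, vi in zip(a, v)]
    return [time_part] + space_part

c = 3.0
v = (1.0, -0.5, 2.0)      # |v| < c
a = (0.3, 0.8, -0.2)

u4 = four_velocity(v, c)
a4 = four_acceleration(v, a, c)
```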
Looking “with relativistic eyes” to Newton’s 2nd law (267) we write
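$$K^\mu = m\frac{d^2}{d\tau^2}x^\mu(\tau) \tag{274}$$
$$K^\mu = m\frac{d}{d\tau}u^\mu = ma^\mu$$
$$\text{or again} \quad = \frac{d}{d\tau}p^\mu$$
$$\text{where} \quad p^\mu \equiv mu^\mu = \gamma m\begin{pmatrix} c \\ \boldsymbol{v} \end{pmatrix} \equiv \begin{pmatrix} p^0 \\ \boldsymbol{p} \end{pmatrix} \tag{275}$$
$$p^0 = \gamma mc = \big( 1 + \tfrac{1}{2}\beta^2 + \tfrac{3}{8}\beta^4 + \cdots \big) mc = \tfrac{1}{c}\big( mc^2 + \tfrac{1}{2}mv^2 + \cdots \big) \tag{276.1}$$
where the term $\tfrac{1}{2}mv^2$ is familiar from non-relativistic dynamics as kinetic energy
$$\boldsymbol{p} = \gamma m\boldsymbol{v} = m\boldsymbol{v} + \cdots \tag{276.2}$$
where the term $m\boldsymbol{v}$ is familiar from non-relativistic dynamics as linear momentum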
v · v = constant
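$$p^0 = \tfrac{1}{c}E \tag{277}$$
and calls $E = \gamma mc^2 = mc^2 + \tfrac{1}{2}mv^2 + \cdots$
we have
$$E = Mc^2 \tag{280.1}$$
$$\text{and} \quad T = (M - m)c^2 = \Big\{ \frac{1}{\sqrt{1 - v^2/c^2}} - 1 \Big\} mc^2$$
$$\boldsymbol{p} = M\boldsymbol{v} \tag{280.2}$$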
Figure 70: The hyperboloidal mass shell, based upon (281) and
drawn in energy-momentum space. The p0 -axis (energy axis) runs
up. The mass shell intersects the p0 -axis at a point determined by
the value of m:
p0 = mc i.e., E = mc2
The figure remains meaningful (though the hyperboloid becomes a
cone) even in the limit m ↓ 0, which provides first indication that
relativistic mechanics supports a theory of massless particles.
which for a relativistic particle describes the p -dependence of the energy E, and
should be compared with its non-relativistic free -particle counterpart
$$E = \frac{1}{2m}\,\boldsymbol{p}\cdot\boldsymbol{p}$$
We infer that the 4 -vectors that describe Minkowski forces are invariably
spacelike. It follows moreover from (283) that as p ∼ u moves around the
K-vector must move in concert, contriving always to be ⊥ to u: in relativistic
K = K(u, . . .)
where the dots signify such other variables as may in particular cases enter into
the construction of K. The simplest case—which is, as we shall see, the case of
electrodynamical interest—arises when K depends linearly on u:
$$K_\mu = A_{\mu\nu}u^\nu \tag{284.1}$$
K-vectors that depend quadratically upon u exist in much greater variety: the
following example
$$K_\mu = \varphi^\alpha(x)\big[ c^2 g_{\alpha\mu} - u_\alpha u_\mu \big]$$
It is, of course, the non-zero value of K that causes the particle to take leave
of (what a moment ago was) the rest frame. Borrowing notation from (275) and
152
This work (∼) is associated mainly with the name of G. Nordström,
but for a brief period engaged the enthusiastic attention of Einstein himself:
see page 144 in Pauli,135 and also A. O. Barut, Electrodynamics and Classical
Theory of Fields and Particles (), page 56; A. Pais, Subtle is the Lord: The
Science and Life of Albert Einstein (), page 232.
153
For further discussion of the “general theory of K-construction” see my
relativistic dynamics (), pages 13–22.
to account for such c-factors as may lurk in the construction of K . We are used
to thinking of the “non-relativistic limit” as an approximation to relativistic
physics, but at this point it becomes appropriate to remark that
In fully relativistic particle dynamics the “non-relativistic
limit” becomes literally effective in the momentary rest frame.
The implication is that if we knew the force F experienced by a particle at rest
then we could by Lorentz transformation obtain the Minkowski force K active
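upon a moving particle:
$$\begin{pmatrix} K^0 \\ \boldsymbol{K} \end{pmatrix} = \boldsymbol{\Lambda}(\boldsymbol{\beta}) \begin{pmatrix} 0 \\ \boldsymbol{F} \end{pmatrix} \tag{290}$$
Reading from (210.1) it follows more particularly that
$$K^0 = \gamma\,\tfrac{1}{c}\,\boldsymbol{v}\cdot\boldsymbol{F}$$
$$\boldsymbol{K} = \boldsymbol{F} + \big\{ (\gamma - 1)(\boldsymbol{v}\cdot\boldsymbol{F})/v^2 \big\}\boldsymbol{v} = \boldsymbol{F}_\perp + \gamma\boldsymbol{F}_\parallel \tag{291}$$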
from which, it is gratifying to observe, one can recover both (289) and (286).
We stand now (at last) in position to trace the details of the program
proposed154 in a specifically electrodynamical setting by Einstein. Suppose
that a charged particle experiences a force
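$$\boldsymbol{F} = q\boldsymbol{\mathcal{E}} \;:\; \boldsymbol{\mathcal{E}} \equiv \text{electrical field in the particle’s rest frame}$$
Then
$$\boldsymbol{K} = q(\boldsymbol{\mathcal{E}}_\perp + \gamma\boldsymbol{\mathcal{E}}_\parallel)$$
But from the field transformation equations (263) it follows that
$$\boldsymbol{\mathcal{E}}_\perp = \gamma(\boldsymbol{E} + \boldsymbol{\beta}\times\boldsymbol{B})_\perp$$
$$\boldsymbol{\mathcal{E}}_\parallel = (\boldsymbol{E} + \boldsymbol{\beta}\times\boldsymbol{B})_\parallel$$
where E and B refer to our perception of the electric and magnetic fields at
the particle’s location, and β to our perception of the particle’s velocity. So
(because the γ-factors interdigitate so sweetly) we have
$$\boldsymbol{K} = \gamma q(\boldsymbol{E} + \tfrac{1}{c}\boldsymbol{v}\times\boldsymbol{B}) \tag{292}$$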
154
See again page 186.
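But (288) supplies $\boldsymbol{K} = \gamma\frac{d}{dt}(\gamma m\boldsymbol{v})$, so (dropping the γ-factors on left and right)155
we have
$$q(\boldsymbol{E} + \tfrac{1}{c}\boldsymbol{v}\times\boldsymbol{B}) = \frac{d}{dt}(\gamma m\boldsymbol{v}) \tag{293}$$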
This famous equation describes the relativistic motion of a charged particle in
an impressed electromagnetic field (no radiation or radiative reaction), and is
the upshot of the Lorentz force law,156 obtained here not as an ad hoc
assumption, but as a forced consequence of
• some general features of relativistic particle dynamics
• the transformation properties of electromagnetic fields
• the operational definition of E . . . all fitted into
• Einstein’s “go to the frame of the particle” program (pages 186 & 189).
Returning with (292) to (286) we obtain
$$K^0 = \tfrac{1}{c}\,\gamma q\,\mathbf{E}\cdot\mathbf{v} \tag{294}$$
so
$$\frac{d\Lambda(\tau)}{d\tau} = \Lambda(\tau)\cdot J\,\frac{dA(\tau)}{d\tau}$$
$$\frac{dA(\tau)}{d\tau} = \frac{1}{1 - \beta^2}\,\frac{d\beta}{d\tau}$$
Returning with this information to (296) we obtain
$$\frac{1}{1 - \beta^2}\,\frac{d\beta}{d\tau} = \tfrac{1}{c}\,g(\tau)$$
for β(t): a final integration would then supply the x(t) that describes our
perception of Q’s worldline. The problem presented by (298) appears in the
general case to be hopeless . . . but let us at this point assume that the throttle
function has the simple structure
g(τ ) = g : constant
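For constant g the differential equation for β integrates in closed form: with the particle starting from rest, dA/dτ = g/c gives A = gτ/c and hence β(τ) = tanh(gτ/c). The following sketch checks that claim by brute force. It is an illustration under stated assumptions (initial condition β(0) = 0, units with c = 1 by default, a hypothetical function name), not a reproduction of the equations (299) cited below.

```python
import math

def beta_of_tau(g, tau_end, c=1.0, steps=1000):
    """Integrate d(beta)/d(tau) = (1 - beta**2) * g / c from beta(0) = 0 (RK4)."""
    def rhs(beta):
        return (1.0 - beta * beta) * g / c

    h = tau_end / steps
    beta = 0.0
    for _ in range(steps):
        k1 = rhs(beta)
        k2 = rhs(beta + 0.5 * h * k1)
        k3 = rhs(beta + 0.5 * h * k2)
        k4 = rhs(beta + h * k3)
        beta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return beta
```

The numerical β stays below 1 however long the throttle is held open, and tracks tanh(gτ/c) to within the integration error.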
158
See again pages 138 and 139.
159
problem 51.
2.* In (299.2) set x(0) = 0. The resulting spacetime hyperbola is, by notational
adjustment ½λ → c²/g, identical to that encountered at the middle of page 176:
our perception of Q’s worldline is a conformal transform of Q’s own perception
of her (from her point of view trivial) worldline. If Q elected to pass her time
doing electrodynamics she would—though non-inertial—use equations that are
structurally identical to the (conformally covariant) equations that we might
use to describe those same electrodynamical events.
3. O is inertial, content to sit home at x = 0. Q—O’s twin—is an astronaut,
who at time t = 0 gives her brother a kiss and sets off on a flight along the
x-axis, on which her instruction is to execute the following throttle function:
$$g(\tau) = \begin{cases}
+g & :\; 0 < \tau < \tfrac{1}{4}T \\
-g & :\; \tfrac{1}{4}T < \tau < \tfrac{3}{4}T \\
+g & :\; \tfrac{3}{4}T < \tau < T
\end{cases}$$
* This remark will be intelligible only to those brave readers who ignored my
recommendation that they skip §6.
the moment of her return the clock on Q’s control panel will read T, but
according to O’s clock the
$$\text{return time} = T\cdot(4c/gT)\,\sinh(gT/4c)\;\begin{cases} > T \\ \sim T \ \text{only if}\ T \ll 4c/g \end{cases} \tag{301.1}$$
and Q’s adventure will have taken her to a turn-around point lying160 a
160
Work from (299.2).
$$\text{distance} = 2\Big[\sqrt{(ct)^2 + (c^2/g)^2} - c^2/g\Big]\bigg|_{\,t = \frac{1}{4}(\text{return time})} \tag{301.2}$$
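To get a feeling for the sizes involved, (301.1) and (301.2) can be evaluated directly. The sketch below is a hedged illustration: the function names are hypothetical, SI units are assumed, and T is Q's total proper time for the round trip. The turn-around distance is evaluated exactly as (301.2) prescribes, at t equal to one quarter of the return time.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def return_time(T, g, c=C):
    """O's clock reading at Q's return, per (301.1); T is Q's proper time."""
    return (4.0 * c / g) * math.sinh(g * T / (4.0 * c))

def turnaround_distance(T, g, c=C):
    """Distance to the turn-around point, per (301.2)."""
    t = return_time(T, g, c) / 4.0  # O's time for the first quarter of the trip
    return 2.0 * (math.sqrt((c * t) ** 2 + (c * c / g) ** 2) - c * c / g)
```

For example, `return_time(4 * 3.15576e7, 9.8)` gives O's elapsed time in seconds for a trip that lasts four years on Q's clock at an acceleration of 9.8 m/s². Because ct = (c²/g) sinh(gT/4c) at the quarter point, the distance can equally be written 2(c²/g)[cosh(gT/4c) − 1], which provides a convenient cross-check.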
—both of which make good intuitive sense.161 Notice (as Einstein—at the end
of §4 in his first relativity paper—was the first to do) that
Q returns home younger than O, and that this surprising fact can be attributed to a basic metric property of
spacetime (Figure 73).162 The so-called twin paradox arises when one argues
that from Q’s point of view it is O who has been doing the accelerating, and who
should return younger . . . and they can’t both be younger! But those who pose
the “paradox” misconstrue the meaning of the “relativity of motion.” Only O
remained inertial throughout the preceding exercise, and only Q had to purchase
rocket fuel . . . and those facts break the supposed “symmetry” of the situation.
The issue becomes more interesting with the observation that we have spent
our lives in (relative to the inertial frames falling through the floor) “a rocket
accelerating upward with acceleration g” (but have managed to do so without an
investment in “fuel”). Why does our predicament not more nearly resemble the
predicament of Q than of O?163
161
problem 52.
162
problem 53.
163
See at this point C. W. Sherwin, “Some recent experimental tests of the
clock paradox,” Phys. Rev. 120, 17 (1960).
164
For parallel remarks see §5.9 in E. M. Purcell’s Electricity & Magnetism:
Berkeley Physics Course–Volume 2 () and §13.6 of The Feynman Lectures
on Physics–Volume 2 ().
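So we have
$$\begin{pmatrix} K^0 \\ K^1 \\ K^2 \\ K^3 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ -(\gamma qBv/c)\,y/R \\ -(\gamma qBv/c)\,z/R \end{pmatrix}
= \begin{pmatrix} 0 \\ \mathbf{K} \end{pmatrix}$$
$$K' = \Lambda K = (q/c)\cdot
\underbrace{\Lambda \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & By/R & Bz/R \\
0 & -By/R & 0 & 0 \\
0 & -Bz/R & 0 & 0
\end{pmatrix} \Lambda^{-1}}_{F'}
\cdot
\underbrace{\Lambda \begin{pmatrix} \gamma c \\ \gamma v \\ 0 \\ 0 \end{pmatrix}}_{u'}$$
with
$$\Lambda = \begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}$$
Straightforward computation supplies
$$K' = (q/c)\cdot
\begin{pmatrix}
0 & 0 & -\beta\gamma By/R & -\beta\gamma Bz/R \\
0 & 0 & +\gamma By/R & +\gamma Bz/R \\
-\beta\gamma By/R & -\gamma By/R & 0 & 0 \\
-\beta\gamma Bz/R & -\gamma Bz/R & 0 & 0
\end{pmatrix}
\begin{pmatrix} c \\ 0 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ -(\gamma qBv/c)\,y/R \\ -(\gamma qBv/c)\,z/R \end{pmatrix}
= \begin{pmatrix} 0 \\ \mathbf{K}' \end{pmatrix}$$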
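The matrix algebra just described is easy to get wrong by a sign, so a numerical cross-check may be welcome. The sketch below is illustrative only: the function names are hypothetical, c = 1 by default, and the boost is taken along the x-axis with Λ⁻¹ obtained by reversing the sign of β. It forms F′ = ΛFΛ⁻¹ and u′ = Λu and compares (q/c)F′u′ with Λ applied to (q/c)Fu.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(A, x):
    return [sum(A[i][k] * x[k] for k in range(4)) for i in range(4)]

def boost(beta):
    """Boost matrix along x; boost(-beta) is its inverse."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def wire_force(q, B, y, z, beta, c=1.0):
    """Return (F', u', K, K') for a charge moving parallel to the wire."""
    R = math.hypot(y, z)
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    F = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, B * y / R, B * z / R],
         [0.0, -B * y / R, 0.0, 0.0],
         [0.0, -B * z / R, 0.0, 0.0]]
    u = [gamma * c, gamma * beta * c, 0.0, 0.0]   # 4-velocity, v = beta * c
    L, L_inv = boost(beta), boost(-beta)
    F_prime = matmul(matmul(L, F), L_inv)
    u_prime = matvec(L, u)                         # should be (c, 0, 0, 0)
    K = [q / c * x for x in matvec(F, u)]
    K_prime = [q / c * x for x in matvec(F_prime, u_prime)]
    return F_prime, u_prime, K, K_prime
```

In the primed frame the charge is at rest, so only the first column of F′ (its electric part) contributes to K′; the transverse force components come out the same in both frames.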
While O saw only a B -field, it is clear from the computed structure of F′ that O′
sees both a B -field (γ times stronger than O’s) and an E -field. We have known
since (210.2) that
λ′ = −βγI/c
The question now before us: How does the current-carrying wire acquire, in O′’s
estimation, a net charge? An answer of sorts can be obtained as follows: Assume
(in the interest merely of simplicity) that the current is uniformly distributed
on the wire’s cross-section:
a′ = a because cross-section ⊥ v
so O′ obtains
λ′ ≡ charge per unit length = ρ′a′
= −βγI/c
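The intermediate step can be sketched as follows. This is a reconstruction under assumptions supplied here rather than equations quoted from the text: that the current 4-vector (cρ, j) transforms under a boost along the wire in the standard way, that the wire is neutral in O's frame (ρ = 0), and that its current density is j = I/a.

```latex
\begin{aligned}
\rho' &= \gamma\,(\rho - \beta j/c) = -\beta\gamma\, j/c
      && \text{(neutral wire: } \rho = 0\text{)} \\
j &= I/a, \qquad a' = a
      && \text{(cross-section} \perp \mathbf{v}\text{)} \\
\lambda' &= \rho' a' = -\beta\gamma\,\frac{I}{a}\,\frac{a}{c} = -\beta\gamma I/c
\end{aligned}
```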
—in precise agreement with the result deduced previously. Sharpened insight
into the mechanism that lies at the heart of this counterintuitive result can be
gained from a comparison of the spacetime diagrams presented in Figure 75. At
top we see O’s representation of current in a stationary wire: negatively ionized
atoms stand in place, positive charges drift in the direction of current flow.165
In the lower figure we see how the situation presents itself to an observer O′
who is moving with speed v in a direction parallel to the current flow. At any
instant of time (look, for example, to his x′⁰ = 0 timeslice, drawn in red) O′ sees
ions and charge carriers to have distinct linear densities . . . the reason being
that he sees ions and charge carriers to be moving with distinct speeds, and
the intervals separating one ion from the next, one charge carrier from the next
to be Lorentz contracted by distinct amounts. O′’s charged wire is, therefore, a
differential Lorentz contraction effect. That such a small velocity differential
can, from O′’s perspective, give rise to a measurable net charge is no more
surprising than that it can, from O’s perspective, give rise to a measurable
net current: both can be attributed to the fact that an awful lot of charges
participate in the drift.
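The differential-contraction argument can be made quantitative. The sketch below is a minimal illustration, with hypothetical function names and these assumptions: in O's frame the ions carry linear charge density −λ₀ and sit at rest, the carriers carry +λ₀ and drift with speed u (the text's Franklin-style sign convention), so that the wire is neutral and I = λ₀u. Each species is contracted according to its own speed in the moving frame, with the carrier speed obtained from the velocity addition law.

```python
import math

def gamma(u, c=1.0):
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def net_line_charge(lam0, u_drift, v, c=1.0):
    """Net linear charge density seen by an observer moving at speed v along the wire."""
    # Ions: at rest for O, so -lam0 is their proper density; the moving
    # observer sees them at speed v, contracted by gamma(v).
    lam_ions = -lam0 * gamma(v, c)
    # Carriers: proper density is lam0 / gamma(u_drift); their speed for the
    # moving observer follows from relativistic velocity addition.
    u_prime = (u_drift - v) / (1.0 - u_drift * v / c**2)
    lam_carriers = lam0 / gamma(u_drift, c) * gamma(u_prime, c)
    return lam_ions + lam_carriers

def predicted_line_charge(lam0, u_drift, v, c=1.0):
    """The result -beta * gamma * I / c, with I = lam0 * u_drift."""
    return -(v / c) * gamma(v, c) * lam0 * u_drift / c
```

The two functions agree for any pair of speeds below c. Realistic drift speeds are so small that the two contracted densities cancel to many digits, so exaggerated speeds are the sensible choice when trying this in floating point.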
165
O knows perfectly well that in point of physical fact the ionized atoms
are positively charged, the current carriers negatively charged, and their drift
opposite to the direction of current flow: the problem is that Benjamin Franklin
did not know that. But the logic of the argument is unaffected by this detail.
210 Aspects of special relativity