Vectors and Tensors in Curved Space Time: Physics Dep., University College Cork
Asaf Pe’er¹
1. Introduction
Using the equivalence principle, we have studied the trajectories of free test particles
in curved space time. We argued that from the point of view of the test particle, it is in
free motion, and remains in free motion even in the presence of a gravitational field (think
of astronauts orbiting Earth: they are weightless!). The inclusion of gravity changes the
curvature of space time. In the presence of gravity, space-time can no longer be considered
“flat”; it is “curved”. This curvature is what gives rise to what we call gravitational
“acceleration”, which is mathematically described by the affine connection.
Our ultimate goal is to understand exactly how the presence of a gravitational field
curves space-time. This will eventually lead us to Einstein’s field equation. However,
before we can get there, we still need to fill a mathematical gap. We first need to understand
how to describe physical quantities such as vectors and tensors, from which physical equations
are derived, in curved space time.
We want to understand how the laws of physics, beyond those governing freely-falling
particles described by the geodesic equation, adapt to the curvature of space-time. The
procedure essentially follows the paradigm established in arguing that free particles move
along geodesics.
• First, we consider an equation that describes a law of physics in flat space-time, tradi-
tionally written in terms of partial derivatives and the flat metric.
• Second, according to the equivalence principle this equation will hold in the presence
of gravity, provided that the equation is generally covariant, namely, it preserves its
form under general coordinate transformation, x → x′ .
¹ Physics Dep., University College Cork
In special relativity, we emphasized the fact that vectors belong to the tangent space,
composed of the set of all vectors at a single point in space-time. The crucial point was to
emphasize that vectors are objects associated with a single point. By doing so,
we had to pay a price: we lost the sense of direction. We could not use a statement like “the
vector points in the x direction”; this doesn’t make sense if the tangent space is merely an
abstract vector space associated with each point. Now it is time to fix this problem.
Before we continue, since I will often use the term “manifold” to describe curved space
time, I should briefly introduce it. Very crudely speaking, without getting into the math, a
manifold is an n-dimensional space that near each of its points resembles n-dimensional
Euclidean space. Thus, while locally it looks Euclidean, globally it need not be, which is exactly what
happens to space-time in the presence of gravity. Simple examples are the n-dimensional sphere,
the torus (“bagel”), and the Riemann surface of genus g, which is a two-dimensional torus with g holes (see Figure
1).
Fig. 1.— Riemann surfaces. A Riemann surface of genus 0 is a two-dimensional sphere (also
known as $S^2$), and a Riemann surface of genus g is a two-dimensional torus with g holes.
Let’s imagine that we wanted to construct the tangent space at a point p in a curved
space time (or in a manifold M ), using only things that are intrinsic to M (no embedding in
higher-dimensional spaces etc.). There is a little bit of mathematical subtlety here, so let’s
go over it carefully.
One first guess might be to use our intuitive knowledge that there are objects called
“tangent vectors to curves” which belong in the tangent space. Thus, what we do is that we
draw a set of curves on the manifold that all pass through the point p. The temptation is
to define the tangent space as simply the space of all tangent vectors to these curves at the
point p. But this is obviously cheating; the tangent space $T_p$ is supposed to be the space of
vectors at p, and before we have defined this we don’t have an independent notion of what
“the tangent vector to a curve” is supposed to mean. In some coordinate system $x^\mu$ any
curve through p defines an element of $\mathbb{R}^n$ specified by the n real numbers $dx^\mu/d\lambda$ (where
λ is the parameter along the curve), but this map is clearly coordinate-dependent, which is
not what we want.
Nevertheless, we are on the right track; we just have to make things independent of
coordinates. To this end we define F to be the space of all smooth functions on M (in
mathematical lingo, $C^\infty$ maps $f : M \to \mathbb{R}$). Then we notice that each curve through
p defines an operator on this space, the directional derivative, which maps $f \to df/d\lambda$ (at
p). We will make the following claim: the tangent space $T_p$ can be identified with the space
of directional derivative operators along curves through p. To establish this idea we must
demonstrate two things: first, that the space of directional derivatives is a vector space, and
second that it is the vector space we want (it has the same dimensionality as M , yields a
natural idea of a vector pointing along a certain direction, and so on).
The first claim, that directional derivatives form a vector space, seems straightforward
enough. Imagine two operators $\frac{d}{d\lambda}$ and $\frac{d}{d\eta}$ representing derivatives along two curves through
p. There is no problem adding these and scaling by real numbers, to obtain a new operator
$a\frac{d}{d\lambda} + b\frac{d}{d\eta}$. It is not immediately obvious, however, that the space closes; i.e., that the
resulting operator is itself a derivative operator. A good derivative operator is one that
acts linearly on functions, and obeys the conventional Leibniz (product) rule on products
of functions. Our new operator is manifestly linear, so we need to verify that it obeys the
Leibniz rule. We have
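$$ \left(a\frac{d}{d\lambda} + b\frac{d}{d\eta}\right)(fg) = af\frac{dg}{d\lambda} + ag\frac{df}{d\lambda} + bf\frac{dg}{d\eta} + bg\frac{df}{d\eta} = \left(a\frac{df}{d\lambda} + b\frac{df}{d\eta}\right)g + \left(a\frac{dg}{d\lambda} + b\frac{dg}{d\eta}\right)f . \tag{1} $$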
As we had hoped, the product rule is satisfied, and the set of directional derivatives is
therefore a vector space.
Is it the vector space that we would like to identify with the tangent space? The easiest
way to become convinced is to find a basis for the space. Consider again a coordinate chart
with coordinates $x^\mu$.¹ Then there is an obvious set of n directional derivatives at p, namely
the partial derivatives $\partial_\mu$ at p (see Figure 2).
Fig. 2.— A coordinate chart on a (curved) manifold M provides a natural way to form a
basis of the tangent space, by using the partial derivatives $\{\partial_\mu\}$ at p.
¹ A coordinate chart, or coordinate system, is a way of expressing the points of a small neighborhood of
a point p on a manifold M as coordinates in Euclidean space. Technically, this is a one-to-one mapping
$\varphi : U \to \mathbb{R}^n$ from an open set U in M to an open set in $\mathbb{R}^n$.
We are now going to claim that the partial derivative operators $\{\partial_\mu\}$ at p form a basis
for the tangent space $T_p$. (It follows immediately that $T_p$ is n-dimensional, since that is the
number of basis vectors.) To see this we need to show that any directional derivative can be
decomposed into a sum of real numbers times partial derivatives. But this is in fact just the
familiar expression for the components of a tangent vector. Consider a general n-dimensional
manifold M, a curve $\gamma : \mathbb{R} \to M$, and a function $f : M \to \mathbb{R}$. If λ is the parameter along
the curve γ, we can expand the vector/operator $\frac{d}{d\lambda}$ in terms of the partials $\partial_\mu$
and hence (since the matrix $\partial x^{\mu'}/\partial x^\mu$ is the inverse of the matrix $\partial x^\mu/\partial x^{\mu'}$),
$$ V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu} V^\mu . \tag{6} $$
Since the basis vectors are usually not written explicitly, the rule in Equation 6 for transforming
components is what we call the “transformation law of a (contravariant) vector.”
An object that transforms according to Equation 6 when the coordinates are changed from
$x^\mu \to x^{\mu'}$ is a contravariant vector. Thus, we have identified vectors with directional
derivatives.
Of course, the transformation law in Equation 6 is compatible with the transformation
of contravariant vector components in special relativity. Under Lorentz transformations, we
had $V^{\mu'} = \Lambda^{\mu'}{}_{\mu} V^\mu$, but a Lorentz transformation is a special kind of coordinate transformation,
with $x^{\mu'} = \Lambda^{\mu'}{}_{\mu} x^\mu$. Equation 6, though, is much more general, as it encompasses the
behavior of vectors under arbitrary changes of coordinates (and therefore bases), not just
linear transformations. As such, it can be used in curved space time, not only in a flat one.
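As a concrete illustration of Equation 6 for a non-linear change of coordinates, the following sketch transforms the components of a vector from Cartesian coordinates (x, y) to polar coordinates (r, θ) in the flat plane. The choice of coordinates, the sample point and the function names are assumptions made for illustration only.

```python
import math

def jacobian_cart_to_polar(x, y):
    """Matrix of partial derivatives d(r, theta)/d(x, y) at the point (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],        # dr/dx,      dr/dy
            [-y / r2, x / r2]]     # dtheta/dx,  dtheta/dy

def transform_contravariant(jac, v):
    """Apply Equation 6: V^{mu'} = (dx^{mu'}/dx^mu) V^mu."""
    return [sum(jac[i][j] * v[j] for j in range(len(v))) for i in range(len(jac))]

# Hypothetical sample: the point (x, y) = (1, 1) and the vector V = d/dx.
point = (1.0, 1.0)
v_cart = [1.0, 0.0]
v_polar = transform_contravariant(jacobian_cart_to_polar(*point), v_cart)
print(v_polar)  # components (V^r, V^theta) of the same vector
```

The vector itself is unchanged; only its components differ, because the polar basis differs from the Cartesian one at this point.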
As usual, we are trying to emphasize a somewhat subtle ontological distinction — tensor
components do not change when we change coordinates, but they change when we change the
basis in the tangent space. Since we have decided to use the coordinates to define our basis,
a change of coordinates induces a change of basis (see Figure 3), which, in turn, induces a
change in the tensor components.
Fig. 3.— When we change the coordinates from $x^\mu$ to $x^{\mu'}$, we induce a change in the basis.
This basis change leads to a change in the components of a tensor.
vector $\frac{d}{d\lambda}$
is exactly the directional derivative of the function:
$$ df\left(\frac{d}{d\lambda}\right) = \frac{df}{d\lambda} . \tag{7} $$
Note the following: it’s tempting to think, “why shouldn’t the function f itself be considered
the one-form, and df /dλ its action?” The point is that a one-form, like a vector, exists only
at the point it is defined, and does not depend on information at other points on M . If you
know a function in some neighborhood of a point you can take its derivative, but not just
from knowing its value at the point; the gradient, on the other hand, encodes precisely the
information necessary to take the directional derivative along any curve through p, fulfilling
its role as a dual vector.
Just as the partial derivatives along coordinate axes provide a natural basis for the
tangent space, the gradients of the coordinate functions $x^\mu$ provide a natural basis for the
cotangent space. Recall that in flat space we constructed a basis for $T_p^*$ by demanding that
$\hat\theta^{(\mu)}(\hat e_{(\nu)}) = \delta^\mu_\nu$. Continuing the same philosophy on an arbitrary manifold, we find that Equation 7
leads to
$$ dx^\mu(\partial_\nu) = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu . \tag{8} $$
Therefore the gradients $\{dx^\mu\}$ are an appropriate set of basis one-forms; an arbitrary one-form
is expanded into components as $\omega = \omega_\mu\, dx^\mu$.
The transformation properties of basis dual vectors and components follow from what
is by now the usual procedure. We obtain, for basis one-forms,
$$ dx^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\, dx^\mu , \tag{9} $$
and for components,
$$ \omega_{\mu'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\, \omega_\mu . \tag{10} $$
We will usually write the components $\omega_\mu$ when we speak about a one-form ω. Thus, Equation
10 can be viewed as defining the transformation law of the covariant vector (or one-form) ω.
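A quick numerical check of Equations 6 and 10 together: because the two transformation matrices are inverses of each other, the contraction $\omega_\mu V^\mu$ should come out as the same number in both coordinate systems. The sketch below uses the Cartesian-to-polar change of coordinates in the plane; the sample point and the component values are hypothetical.

```python
import math

def jacobian(x, y):
    """dx^{mu'}/dx^mu for (x, y) -> (r, theta); rows are (r, theta)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r], [-y / r2, x / r2]]

def inverse_jacobian(x, y):
    """dx^mu/dx^{mu'} at the same point; rows are (x, y), columns (r, theta)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return [[math.cos(theta), -r * math.sin(theta)],   # dx/dr, dx/dtheta
            [math.sin(theta),  r * math.cos(theta)]]   # dy/dr, dy/dtheta

x, y = 2.0, 1.0                  # hypothetical sample point
V = [0.3, -1.2]                  # hypothetical vector components V^mu
omega = [1.5, 0.7]               # hypothetical one-form components omega_mu

J = jacobian(x, y)
Jinv = inverse_jacobian(x, y)

# Equation 6: V^{mu'} = (dx^{mu'}/dx^mu) V^mu
V_new = [sum(J[a][m] * V[m] for m in range(2)) for a in range(2)]
# Equation 10: omega_{mu'} = (dx^mu/dx^{mu'}) omega_mu
omega_new = [sum(Jinv[m][a] * omega[m] for m in range(2)) for a in range(2)]

scalar_old = sum(omega[m] * V[m] for m in range(2))
scalar_new = sum(omega_new[a] * V_new[a] for a in range(2))
print(scalar_old, scalar_new)    # the two values agree up to rounding
```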
3.2. Tensors
The transformation law for general tensors follows this same pattern of replacing the
Lorentz transformation matrix used in flat space with a matrix representing more general
coordinate transformations. A (k, l) tensor T can be expanded as
$$ T = T^{\mu_1\cdots\mu_k}{}_{\nu_1\cdots\nu_l}\, \partial_{\mu_1} \otimes \cdots \otimes \partial_{\mu_k} \otimes dx^{\nu_1} \otimes \cdots \otimes dx^{\nu_l} , \tag{11} $$
where I have used the symbol ⊗ to describe a tensor product (also known as outer
product).²
Under a coordinate transformation the components change like the product of contravariant
vectors and covariant vectors,
$$ T^{\mu'_1\cdots\mu'_k}{}_{\nu'_1\cdots\nu'_l} = \frac{\partial x^{\mu'_1}}{\partial x^{\mu_1}} \cdots \frac{\partial x^{\mu'_k}}{\partial x^{\mu_k}} \frac{\partial x^{\nu_1}}{\partial x^{\nu'_1}} \cdots \frac{\partial x^{\nu_l}}{\partial x^{\nu'_l}}\, T^{\mu_1\cdots\mu_k}{}_{\nu_1\cdots\nu_l} . \tag{12} $$
This tensor transformation law is straightforward to remember, since there really isn’t anything
else it could be, given the placement of indices. Equation 12 thus defines the transformation
law of tensors.
3.2.1. Example.
Let us now change the coordinate system: consider new coordinates, say
$$ x' = x^{1/3}, \qquad y' = e^{x+y} . \tag{15} $$
² Without getting into a precise mathematical definition: if T and S are tensors, in the sense that each acts
on a set of dual vectors and vectors, then T ⊗ S can be thought of as first acting with T on the appropriate set of
dual vectors and vectors, then acting with S on the remainder, and then multiplying the answers. Note that, in
general, T ⊗ S ≠ S ⊗ T.
We need only plug these expressions directly into Equation 14 to write the components of S
in terms of the new coordinates x′, y′ (remembering that tensor products don’t commute,
so $dx'\,dy' \neq dy'\,dx'$):
$$ S = 9(x')^4\left[1 + (x')^3\right](dx')^2 - 3\frac{(x')^2}{y'}\left(dx'\,dy' + dy'\,dx'\right) + \frac{1}{(y')^2}(dy')^2 , \tag{17} $$
or
$$ S_{\mu'\nu'} = \begin{pmatrix} 9(x')^4\left[1 + (x')^3\right] & -3\frac{(x')^2}{y'} \\ -3\frac{(x')^2}{y'} & \frac{1}{(y')^2} \end{pmatrix} . \tag{18} $$
Notice that the tensor S is still symmetric. We did not use the transformation law (Equation
12) directly, but doing so would have yielded the same result, as you can check.
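The check suggested above can be sketched numerically. The code assumes that the original components are $S_{\mu\nu} = \mathrm{diag}(x, 1)$, i.e. $S = x\,(dx)^2 + (dy)^2$; this is an assumption, chosen because it reproduces Equation 17 once Equation 15 is inverted as $x = (x')^3$, $y = \ln y' - (x')^3$. The function names and the sample point are hypothetical.

```python
import math

def S_old(x, y):
    """Assumed components S_{mu nu} in the original (x, y) coordinates."""
    return [[x, 0.0], [0.0, 1.0]]

def old_coords(xp, yp):
    """Invert Equation 15: x = (x')^3, y = ln(y') - (x')^3."""
    return xp ** 3, math.log(yp) - xp ** 3

def jac_old_wrt_new(xp, yp):
    """Matrix dx^mu/dx^{mu'}; rows are (x, y), columns are (x', y')."""
    return [[3 * xp ** 2, 0.0],
            [-3 * xp ** 2, 1.0 / yp]]

def S_new_from_law(xp, yp):
    """Equation 12 for a (0, 2) tensor: S_{m'n'} = (dx^m/dx^m')(dx^n/dx^n') S_{mn}."""
    x, y = old_coords(xp, yp)
    S, J = S_old(x, y), jac_old_wrt_new(xp, yp)
    return [[sum(J[m][a] * J[n][b] * S[m][n] for m in range(2) for n in range(2))
             for b in range(2)] for a in range(2)]

def S_new_eq18(xp, yp):
    """Components as written in Equation 18."""
    off = -3 * xp ** 2 / yp
    return [[9 * xp ** 4 * (1 + xp ** 3), off], [off, 1.0 / yp ** 2]]

xp, yp = 1.3, 2.5   # hypothetical sample point with y' > 0
print(S_new_from_law(xp, yp))
print(S_new_eq18(xp, yp))
```

Both printed matrices agree up to rounding, and both are symmetric, as expected.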
For the most part the various tensor operations we defined in flat space are unaltered
in a more general setting: contraction, symmetrization, etc. There are three important
exceptions: partial derivatives, the metric, and the Levi-Civita tensor. Let’s look at the
metric first.
Using the transformation rule of the metric tensor (Equation 21) and taking its determinant,
$$ g(x^{\mu'}) = \left|\frac{\partial x^{\mu'}}{\partial x^\mu}\right|^{-2} g(x^\mu) . \tag{23} $$
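Example.
Consider the flat 3-d space written in spherical coordinates: $ds^2 = dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$.
We can thus write the metric as
$$ g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix} . \tag{26} $$
Thus, $g = r^4\sin^2\theta$, and a volume element is $\sqrt{g}\, dr\, d\theta\, d\phi = r^2\sin\theta\, dr\, d\theta\, d\phi$.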
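As a sanity check on this volume element, the sketch below integrates $\sqrt{g}\,dr\,d\theta\,d\phi$ over a ball of radius R with a simple midpoint rule and compares the result with the familiar $4\pi R^3/3$. The function names, the grid size and the choice of integration rule are assumptions made for illustration.

```python
import math

def sqrt_g_spherical(r, theta):
    """Square root of the determinant of the diagonal metric in Equation 26."""
    g = 1.0 * (r ** 2) * (r ** 2 * math.sin(theta) ** 2)
    return math.sqrt(g)

def ball_volume(R, n=200):
    """Midpoint-rule integral of sqrt(g) dr dtheta dphi over a ball of radius R."""
    dr, dtheta = R / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            total += sqrt_g_spherical(r, theta) * dr * dtheta
    return total * 2.0 * math.pi   # the integrand does not depend on phi

print(ball_volume(1.0), 4.0 * math.pi / 3.0)   # numerical vs. exact volume
```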
The final change we have to make to our tensor knowledge now that we have dropped
the assumption of flat space has to do with the Levi-Civita tensor, $\epsilon_{\mu_1\mu_2\cdots\mu_n}$. Remember that
the flat-space version of this object, which we will now denote by $\tilde\epsilon_{\mu_1\mu_2\cdots\mu_n}$, was defined as
$$ \tilde\epsilon_{\mu_1\mu_2\cdots\mu_n} = \begin{cases} +1 & \text{if } \mu_1\mu_2\cdots\mu_n \text{ is an even permutation of } 01\cdots(n-1) , \\ -1 & \text{if } \mu_1\mu_2\cdots\mu_n \text{ is an odd permutation of } 01\cdots(n-1) , \\ 0 & \text{otherwise} . \end{cases} \tag{27} $$
We will now define the Levi-Civita symbol to be exactly this $\tilde\epsilon_{\mu_1\mu_2\cdots\mu_n}$; that is, an object
with n indices which has the components specified above in any coordinate system. This is
called a “symbol,” of course, because it is not a tensor; it is defined not to change under
coordinate transformations.
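We can relate its behavior to that of an ordinary tensor by looking at the determinant
of the matrix $\partial x^{\mu'}/\partial x^\mu$, which obeys
$$ \tilde\epsilon_{\mu'_1\mu'_2\cdots\mu'_n} = \left|\frac{\partial x^{\mu'}}{\partial x^\mu}\right| \tilde\epsilon_{\mu_1\mu_2\cdots\mu_n} \frac{\partial x^{\mu_1}}{\partial x^{\mu'_1}} \frac{\partial x^{\mu_2}}{\partial x^{\mu'_2}} \cdots \frac{\partial x^{\mu_n}}{\partial x^{\mu'_n}} . \tag{28} $$
(This can be found in any linear algebra book.) Thus, the Levi-Civita symbol is a tensor
density of weight 1.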
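The linear-algebra fact behind Equation 28 is that, for any n×n matrix M, contracting the Levi-Civita symbol with n copies of M returns the determinant of M times the symbol. Taking M to be $\partial x^\mu/\partial x^{\mu'}$, whose determinant is the inverse of $|\partial x^{\mu'}/\partial x^\mu|$, gives Equation 28. The sketch below checks the identity for n = 3; the matrix entries are hypothetical and the helper names are not from the text.

```python
from itertools import product

def levi_civita(indices):
    """Levi-Civita symbol: +1/-1 for even/odd permutations of 0..n-1, else 0."""
    n = len(indices)
    if sorted(indices) != list(range(n)):
        return 0
    # Count inversions; the parity of the count is the parity of the permutation.
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if indices[i] > indices[j])
    return -1 if inversions % 2 else 1

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Hypothetical matrix standing in for dx^mu/dx^{mu'} (row mu, column mu').
M = [[2.0, 1.0, 0.0],
     [0.5, 3.0, 1.0],
     [1.0, 0.0, 4.0]]

def contracted(i, j, k):
    """eps_{abc} M^a_i M^b_j M^c_k, summed over a, b, c."""
    return sum(levi_civita((a, b, c)) * M[a][i] * M[b][j] * M[c][k]
               for a, b, c in product(range(3), repeat=3))

# The contraction equals det(M) times the symbol for every index choice.
for i, j, k in product(range(3), repeat=3):
    assert abs(contracted(i, j, k) - det3(M) * levi_civita((i, j, k))) < 1e-12
print("identity holds; det M =", det3(M))
```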
However, we don’t like tensor densities, we like tensors. There is a simple way to convert
a density into an honest tensor: multiply by $|g|^{w/2}$, where w is the weight of the density
(the absolute value signs are there because g < 0 for Lorentz metrics). The result will
transform according to the tensor transformation law. Therefore, for example, we can define
the Levi-Civita tensor as
$$ \epsilon_{\mu_1\mu_2\cdots\mu_n} = \sqrt{|g|}\, \tilde\epsilon_{\mu_1\mu_2\cdots\mu_n} . \tag{29} $$
Since this is a real tensor, we can raise indices, etc. Sometimes people define a version of the
Levi-Civita symbol with upper indices, $\tilde\epsilon^{\mu_1\mu_2\cdots\mu_n}$, whose components are numerically equal
to the symbol with lower indices. This turns out to be a density of weight −1, and is related
to the tensor with upper indices by
$$ \epsilon^{\mu_1\mu_2\cdots\mu_n} = \mathrm{sgn}(g)\, \frac{1}{\sqrt{|g|}}\, \tilde\epsilon^{\mu_1\mu_2\cdots\mu_n} . \tag{30} $$
As an aside (for those of you who like math), we should come clean and admit that, even
with the factor of $\sqrt{|g|}$, the Levi-Civita tensor is in some sense not a true tensor, because
on some manifolds it cannot be globally defined. Those on which it can be defined are
called orientable, and we will deal exclusively with orientable manifolds in this course. An
example of a non-orientable manifold is the Möbius strip; see Schutz’s Geometrical Methods
in Mathematical Physics (or a similar text) for a discussion.
5. Covariant derivatives
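The unfortunate fact is that the partial derivative of a tensor is not, in general, a new
tensor. For example, if we take the contravariant vector $V^\mu$, whose transformation law is
given by Equation 6,
$$ V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu} V^\mu , $$
and we differentiate with respect to $x^{\lambda'}$, we get
$$ \frac{\partial V^{\mu'}}{\partial x^{\lambda'}} = \frac{\partial}{\partial x^{\lambda'}}\left(\frac{\partial x^{\mu'}}{\partial x^\mu} V^\mu\right) = \frac{\partial x^{\mu'}}{\partial x^\mu}\frac{\partial x^\rho}{\partial x^{\lambda'}}\frac{\partial V^\mu}{\partial x^\rho} + \frac{\partial^2 x^{\mu'}}{\partial x^\mu \partial x^\rho}\frac{\partial x^\rho}{\partial x^{\lambda'}} V^\mu . \tag{31} $$
The first term on the right hand side is what we would expect if $\partial V^\mu/\partial x^\lambda$ were a tensor. It
is the second term that destroys the tensor behavior.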
This is a very problematic result, as derivatives of tensors are obvious ingredients in
physical equations. We somehow need to find a way to generalize equations such as $\partial_\mu T^{\mu\nu} = 0$
to curved space time. Thus, what we are really looking for is an operator which reduces to the
partial derivative in flat space with Cartesian coordinates, but transforms as a tensor on an
arbitrary manifold. Such an operator is called a covariant derivative.
In order to construct a covariant derivative, let’s have a look first at the transformation
law of the affine connection.
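Although $\partial V^\mu/\partial x^\lambda$ is not a tensor, the results of Equation 37 can be used to construct
a tensor. This is done by looking at the transformation law of $\Gamma^{\lambda'}_{\mu'\nu'} V^{\nu'}$,
$$ \begin{aligned} \Gamma^{\lambda'}_{\mu'\nu'} V^{\nu'} &= \left[ \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\tau}{\partial x^{\mu'}}\frac{\partial x^\sigma}{\partial x^{\nu'}}\, \Gamma^\rho_{\tau\sigma} - \frac{\partial x^\sigma}{\partial x^{\mu'}}\frac{\partial x^\rho}{\partial x^{\nu'}}\frac{\partial^2 x^{\lambda'}}{\partial x^\rho \partial x^\sigma} \right] \frac{\partial x^{\nu'}}{\partial x^\kappa} V^\kappa \\ &= \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\tau}{\partial x^{\mu'}}\, \Gamma^\rho_{\tau\sigma} V^\sigma - \frac{\partial x^\sigma}{\partial x^{\mu'}}\frac{\partial^2 x^{\lambda'}}{\partial x^\rho \partial x^\sigma} V^\rho . \end{aligned} \tag{38} $$
Adding Equations 31 and 38 (after relabeling the free indices of Equation 38 as $\lambda' \to \mu'$, $\mu' \to \lambda'$), the inhomogeneous terms cancel, and we obtain
$$ \frac{\partial V^{\mu'}}{\partial x^{\lambda'}} + \Gamma^{\mu'}_{\lambda'\kappa'} V^{\kappa'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\frac{\partial x^\rho}{\partial x^{\lambda'}} \left[ \frac{\partial V^\mu}{\partial x^\rho} + \Gamma^\mu_{\rho\sigma} V^\sigma \right] , \tag{39} $$
which is basically what we wanted! We are thus led to define a covariant derivative
$$ \nabla_\lambda V^\mu \equiv V^\mu{}_{;\lambda} \equiv \frac{\partial V^\mu}{\partial x^\lambda} + \Gamma^\mu_{\lambda\sigma} V^\sigma . \tag{40} $$
Equation 39 tells us that $V^\mu{}_{;\lambda}$ is a tensor, since
$$ V^{\mu'}{}_{;\lambda'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\frac{\partial x^\rho}{\partial x^{\lambda'}}\, V^\mu{}_{;\rho} . \tag{41} $$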
In an identical way, we can define the covariant derivative of a covariant vector $\omega_\mu$. Recall
the transformation law (Equation 10),
$$ \omega_{\mu'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\, \omega_\mu , $$
and differentiate with respect to $x^{\nu'}$, to get
Subtracting Equation 43 from 42, we cancel the inhomogeneous terms and obtain
Let us stop for a moment and see what we got. By introducing the concept of covariant
derivatives, combined with the algebraic properties of tensors [linearity, external (direct)
product and contraction ($T^{\mu}{}_{\nu} \equiv T^{\mu\rho}{}_{\nu\rho}$)], we were able to extend the concept of partial derivatives
from flat space time to a curved one. Moreover, we did it without being dependent on
the particular coordinate system used. In particular, we found that the covariant derivatives
are:
1. Linear: $(\alpha T + \beta S)_{;\lambda} = \alpha T_{;\lambda} + \beta S_{;\lambda}$, where α and β are numbers, T and S are tensors.
The covariant derivative of the metric tensor is 0. This can be understood “intuitively”:
in the local inertial frame it vanishes, and being a tensor, if it is 0 in one coordinate
system it is 0 in any coordinate system. This can also be seen directly, by writing
$$ g_{\mu\nu;\lambda} = \frac{\partial g_{\mu\nu}}{\partial x^\lambda} - \Gamma^\rho_{\lambda\mu}\, g_{\rho\nu} - \Gamma^\rho_{\lambda\nu}\, g_{\rho\mu} , \tag{48} $$
and using the definition of the affine connection. In a very similar way, $g^{\mu\nu}{}_{;\lambda} = 0$, and
$\delta^\mu_{\nu;\lambda} = 0$.
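Equation 48 can be checked numerically for a simple metric. The sketch below takes the flat plane in polar coordinates, computes the affine connection from the metric with the standard formula $\Gamma^\rho_{\mu\nu} = \tfrac{1}{2} g^{\rho\sigma}(\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\sigma\mu} - \partial_\sigma g_{\mu\nu})$ (an assumption here, since that formula is not restated in this section), and evaluates $g_{\mu\nu;\lambda}$. The choice of metric, the finite-difference step and all function names are illustrative.

```python
def metric(q):
    """Flat plane in polar coordinates q = (r, theta): ds^2 = dr^2 + r^2 dtheta^2."""
    r, _theta = q
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(q, lam, h=1e-5):
    """Central finite difference for dg_{mu nu}/dx^lam."""
    qp, qm = list(q), list(q)
    qp[lam] += h
    qm[lam] -= h
    gp, gm = metric(qp), metric(qm)
    return [[(gp[m][n] - gm[m][n]) / (2 * h) for n in range(2)] for m in range(2)]

def inverse2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def christoffel(q):
    """G[rho][mu][nu] = (1/2) g^{rho s} (d_mu g_{s nu} + d_nu g_{s mu} - d_s g_{mu nu})."""
    ginv = inverse2(metric(q))
    dg = [d_metric(q, lam) for lam in range(2)]   # dg[lam][mu][nu]
    return [[[0.5 * sum(ginv[rho][s] * (dg[mu][s][nu] + dg[nu][s][mu] - dg[s][mu][nu])
                        for s in range(2))
              for nu in range(2)] for mu in range(2)] for rho in range(2)]

def cov_deriv_metric(q):
    """Equation 48, returned as result[mu][nu][lam] = g_{mu nu;lam}."""
    g, G = metric(q), christoffel(q)
    dg = [d_metric(q, lam) for lam in range(2)]
    return [[[dg[lam][mu][nu]
              - sum(G[rho][lam][mu] * g[rho][nu] for rho in range(2))
              - sum(G[rho][lam][nu] * g[rho][mu] for rho in range(2))
              for lam in range(2)] for nu in range(2)] for mu in range(2)]

q = (1.7, 0.4)   # hypothetical sample point (r, theta)
G = christoffel(q)
print(G[0][1][1], G[1][0][1])   # Gamma^r_{theta theta} = -r, Gamma^theta_{r theta} = 1/r
print(cov_deriv_metric(q))      # every entry vanishes up to rounding
```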
The importance of the covariant derivative arises from two of its properties:
These properties thus suggest an easy algorithm to assess the effects of gravitation on physical
systems: Write the appropriate SR equation that holds in the absence of gravity;
then replace $\eta_{\mu\nu}$ with $g_{\mu\nu}$ and all derivatives with covariant derivatives. The resulting
equations will be generally covariant, and true in the absence of gravitation. According
to the principle of general covariance, they will be true in the presence of a gravitational field
(provided that we work in a sufficiently small region of space).
Let us have another look at covariant derivatives, as these are crucial when working in
curved space-time. Let us look first at flat space-time. When we want to take a derivative
of a vector, we consider two vectors $V(x^\alpha)$ and $V(x^\alpha + dx^\alpha)$ separated by an infinitesimal
displacement $dx^\alpha$ along the direction of the derivative. To construct the derivative,
we first transport the vector $V(x^\alpha + dx^\alpha)$ parallel to itself back to the point $x^\alpha$, to give the
vector $V_\parallel(x^\alpha)$. Only then is it in the tangent space at $x^\alpha$. In a second step, we
subtract the vector $V(x^\alpha)$ from it, using the parallelogram rule (see Figure 4). The key thing
that we do is parallel transport.
We now turn our attention to curved space time. We can perform parallel transport in
curved space time, because locally we have a local inertial frame which is equivalent to flat
space time. However, when we do that, the components of the vector change. This results
from the change in the angle the vector makes with the basis vectors, as demonstrated in
Figure 5. This change is linear in the vector components. We therefore expect an expression of the
form $\nabla_\beta V^\alpha = \partial V^\alpha/\partial x^\beta + \Gamma^\alpha_{\beta\gamma} V^\gamma$. Thus, while the first term $\partial V^\alpha/\partial x^\beta$ arises from a change
in the vector field V between $x^\alpha$ and $x^\alpha + dx^\alpha$, the second term arises from a change in the
basis vectors between the two points.
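The two contributions can be seen explicitly in a small numerical sketch. It takes the constant Cartesian vector field $\partial_x$ on the flat plane, written in polar components, and uses the polar connection coefficients $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$. The example field, the sample point and the function names are assumptions made for illustration.

```python
import math

def V(r, theta):
    """Polar components (V^r, V^theta) of the constant Cartesian field d/dx."""
    return [math.cos(theta), -math.sin(theta) / r]

def gamma(r, theta):
    """Nonzero connection coefficients of the flat plane in polar coordinates.

    Index order is G[alpha][beta][gamma] with 0 = r and 1 = theta.
    """
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r            # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r       # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r       # Gamma^theta_{theta r}
    return G

def partial_V(r, theta, h=1e-6):
    """Central differences: result[beta][alpha] = dV^alpha/dx^beta."""
    d_r = [(a - b) / (2 * h) for a, b in zip(V(r + h, theta), V(r - h, theta))]
    d_t = [(a - b) / (2 * h) for a, b in zip(V(r, theta + h), V(r, theta - h))]
    return [d_r, d_t]

def covariant_derivative(r, theta):
    """result[beta][alpha] = dV^alpha/dx^beta + Gamma^alpha_{beta gamma} V^gamma."""
    dV, G, v = partial_V(r, theta), gamma(r, theta), V(r, theta)
    return [[dV[b][a] + sum(G[a][b][c] * v[c] for c in range(2))
             for a in range(2)] for b in range(2)]

r, theta = 2.0, 0.8    # hypothetical sample point
print(partial_V(r, theta))             # nonzero: the components vary
print(covariant_derivative(r, theta))  # approximately zero: the vector does not
```

The partial derivatives of the components are nonzero only because the polar basis vectors rotate and stretch from point to point; the connection term removes exactly that contribution.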
The question now is why the coefficients $\Gamma^\alpha_{\beta\gamma}$ that appear here turn out to be identical to the affine connection
of the geodesic equation. Surely this is no coincidence? Of course it isn’t. We saw that the geodesic, in locally
flat space time, is a straight line. Now, we defined a “straight line” as the curve of extremal
(minimal) distance between points. However, an alternative definition is a curve whose unit
tangent vector is parallel to itself (see Figure 6). Let us call this tangent vector u: then its
Fig. 4.— The derivative of a vector in flat space time includes two stages: First, the vector
$V(x^\alpha + dx^\alpha)$ is transported parallel to itself from $x^\alpha + dx^\alpha$ to $x^\alpha$. Then $V(x^\alpha)$
is subtracted from the transported vector to obtain the difference $\Delta V(x^\alpha)$. The derivative is the ratio
$\Delta V(x^\alpha)/dx^\alpha$ in the limit $dx^\alpha \to 0$.
The equations of electromagnetism, fluid mechanics and many other areas of classical
physics make use of three-dimensional vector calculus, employing functions such as the gradient,
divergence, curl and Laplacian. You have seen explicit forms of these functions in
Fig. 5.— When parallel transporting a vector in non-Cartesian coordinates, the components
of the vector change, due to change in the basis vectors: in this example, we use polar coordi-
nates, and while the vector itself does not change when parallel-transported, its components
do.
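$$ \omega_{\mu;\nu} - \omega_{\nu;\mu} = \frac{\partial \omega_\mu}{\partial x^\nu} - \frac{\partial \omega_\nu}{\partial x^\mu} . \tag{53} $$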
Fig. 6.— A geodesic can be thought of as a line with the following property: when a tangent vector V at $x^\alpha$ is
parallel-transported to $x^{\alpha'}$, the obtained vector $V_\parallel(x^{\alpha'})$ coincides with the tangent vector
$V(x^{\alpha'})$.
(Note the appearance of $\sqrt{|g|}$, which makes $d^4x\,\sqrt{|g|}$ invariant.)
In three dimensions, the Laplacian of a scalar S is just the divergence of its gradient, namely