Vectors and Tensors in Curved Space Time: Physics Dep., University College Cork

This document discusses vectors and tensors in curved spacetime. It begins by introducing the concept of curved spacetime due to gravity and the need to describe vectors and tensors in this context. It then discusses the principle of general covariance, which states that the laws of physics must take the same form in any coordinate system. Finally, it generalizes the definition of vectors and tensors to curved spacetime by defining them as directional derivative operators along curves at a point, with the partial derivatives forming a basis for the tangent space at that point.

Vectors and tensors in curved space time

Asaf Pe’er¹

January 29, 2013

1. Introduction

Using the equivalence principle, we have studied the trajectories of free test particles
in curved space time. We argued that from the point of view of the test particle, it is in
free motion, and remains in free motion even in the presence of a gravitational field (think
of astronauts orbiting Earth: they are weightless!). The inclusion of gravity changes the
curvature of space time. In the presence of gravity, space-time can no longer be considered
“flat”; it is “curved”. This curvature is what gives rise to what we call gravitational
“acceleration”, which is mathematically described by the affine connection.
Our ultimate goal is to understand exactly how the presence of a gravitational field
curves space-time. This will eventually lead us to Einstein’s field equation. However,
before we can get there, we still need to fill a mathematical gap. We first need to understand
how to describe physical quantities such as vectors and tensors, from which physical equations
are derived, in curved space time.

2. The principle of general covariance

We want to understand how the laws of physics, beyond those governing freely-falling
particles described by the geodesic equation, adapt to the curvature of space-time. The
procedure essentially follows the paradigm established in arguing that free particles move
along geodesics.

• First, we consider an equation that describes a law of physics in flat space-time, tradi-
tionally written in terms of partial derivatives and the flat metric.

• Second, according to the equivalence principle this equation will hold in the presence
of gravity, provided that the equation is generally covariant, namely, it preserves its
form under general coordinate transformation, x → x′ .

¹ Physics Dep., University College Cork

This is known as the principle of general covariance.


We can easily show that the principle of general covariance follows from the equivalence
principle. Assume that we are in an arbitrary gravitational field, and consider any equation
that satisfies both conditions. From the second condition, it follows that if the equation is
true in one coordinate system, it is true in any other system, as it preserves its form. We
know from the equivalence principle that at any point we can construct a locally inertial
system, in which the effects of gravity are absent. From the first criterion, we know that the
equation holds in this system, hence we conclude that it must hold in all other coordinate
systems.
Another way of thinking about the principle of general covariance is that it is a consequence
of the equivalence principle, plus the requirement that the laws of physics be independent
of coordinates. (The requirement that laws of physics be independent of coordinates is
essentially impossible to even imagine being untrue. Given some experiment, if one person
uses one coordinate system to predict a result and another one uses a different coordinate
system, they had better agree.)
The principle of general covariance manifests the importance of vectors and tensors
introduced earlier: as these have simple transformation laws, equations composed of vectors
and tensors can (relatively) easily be made invariant under general coordinate transformations.
The first thing we need to do, though, is to somewhat generalize the notion of vectors and
tensors introduced in flat space-time, so that we could use them in arbitrary curved space-
time.

3. Generalization of the definition of vectors and tensors to curved space-time.

In special relativity, we emphasized the fact that vectors belong to the tangent space,
composed of the set of all vectors at a single point in space-time. The crucial point was to
emphasize the fact that vectors are objects associated with a single point. By doing so,
we had to pay a price: we lost the sense of direction. We could not use a statement like “the
vector points in the x direction”; this doesn’t make sense if the tangent space is merely an
abstract vector space associated with each point! Now it is time to fix this problem.
Before we continue, since I will often use the term “manifold” to describe curved space
time, I should briefly introduce it. Very crudely speaking, without getting into the math, a
manifold is an n-dimensional space that near each of its points resembles n-dimensional
Euclidean space. Thus, while locally it looks Euclidean, globally it need not be, which is exactly
what happens to space-time in the presence of gravity. Simple examples are the n-dimensional sphere,
the torus (“bagel”), and the Riemann surface of genus g, which is a two-dimensional torus with g holes
(see Figure 1).

[Figure 1: three surfaces, labeled genus 0, genus 1 and genus 2.]

Fig. 1. Riemann surfaces. A Riemann surface of genus 0 is a 2-dimensional sphere (also
known as $S^2$), and a Riemann surface of genus g is a two-dimensional torus with g holes.

Let’s imagine that we wanted to construct the tangent space at a point p in a curved
space time (or in a manifold M ), using only things that are intrinsic to M (no embedding in
higher-dimensional spaces etc.). There is a little bit of mathematical subtlety here, so let’s
go over it carefully.
One first guess might be to use our intuitive knowledge that there are objects called
“tangent vectors to curves” which belong in the tangent space. Thus, what we do is that we
draw a set of curves on the manifold that all pass through the point p. The temptation is
to define the tangent space as simply the space of all tangent vectors to these curves at the
point p. But this is obviously cheating; the tangent space Tp is supposed to be the space of
vectors at p, and before we have defined this we don’t have an independent notion of what
“the tangent vector to a curve” is supposed to mean. In some coordinate system xµ any
curve through p defines an element of Rn specified by the n real numbers dxµ /dλ (where
λ is the parameter along the curve), but this map is clearly coordinate-dependent, which is
not what we want.
Nevertheless we are on the right track, we just have to make things independent of
coordinates. To this end we define F to be the space of all smooth functions on M (in
mathematical lingo, we say C ∞ maps f : M → R). Then we notice that each curve through
p defines an operator on this space, the directional derivative, which maps f → df /dλ (at
p). We will make the following claim: the tangent space Tp can be identified with the space
of directional derivative operators along curves through p. To establish this idea we must
demonstrate two things: first, that the space of directional derivatives is a vector space, and
second that it is the vector space we want (it has the same dimensionality as M , yields a
natural idea of a vector pointing along a certain direction, and so on).
The first claim, that directional derivatives form a vector space, seems straightforward
enough. Imagine two operators $\frac{d}{d\lambda}$ and $\frac{d}{d\eta}$ representing derivatives along two curves through
p. There is no problem adding these and scaling by real numbers, to obtain a new operator
$a\frac{d}{d\lambda} + b\frac{d}{d\eta}$. It is not immediately obvious, however, that the space closes; i.e., that the
resulting operator is itself a derivative operator. A good derivative operator is one that
acts linearly on functions, and obeys the conventional Leibniz (product) rule on products
of functions. Our new operator is manifestly linear, so we need to verify that it obeys the
Leibniz rule. We have

$$
\begin{aligned}
\left(a\frac{d}{d\lambda} + b\frac{d}{d\eta}\right)(fg) &= af\frac{dg}{d\lambda} + ag\frac{df}{d\lambda} + bf\frac{dg}{d\eta} + bg\frac{df}{d\eta} \\
&= \left(a\frac{df}{d\lambda} + b\frac{df}{d\eta}\right)g + \left(a\frac{dg}{d\lambda} + b\frac{dg}{d\eta}\right)f .
\end{aligned}
\tag{1}
$$

As we had hoped, the product rule is satisfied, and the set of directional derivatives is
therefore a vector space.
Is it the vector space that we would like to identify with the tangent space? The easiest
way to become convinced is to find a basis for the space. Consider again a coordinate chart
with coordinates xµ .1 Then there is an obvious set of n directional derivatives at p, namely
the partial derivatives ∂µ at p (see Figure 2).

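[Figure 2: a point p on the manifold, with the coordinate axes x1 and x2 and the two basis vectors ∂1 and ∂2 at p.]

Fig. 2. A coordinate chart on a (curved) manifold M provides a natural way to form the
basis of the tangent space, by using the partial derivatives {∂µ} at p as a basis.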

1 A coordinate chart, or coordinate system, is a way of expressing the points of a small neighborhood of
a point p on a manifold M as coordinates in Euclidean space. Technically, this is a one-to-one mapping
$\varphi : U \to \mathbb{R}^n$ from an open set U in M to an open set in $\mathbb{R}^n$.

We are now going to claim that the partial derivative operators {∂µ} at p form a basis
for the tangent space Tp. (It follows immediately that Tp is n-dimensional, since that is the
number of basis vectors.) To see this we need to show that any directional derivative can be
decomposed into a sum of real numbers times partial derivatives. But this is in fact just the
familiar expression for the components of a tangent vector. Consider a general n-dimensional
manifold M, a curve γ : R → M, and a function f : M → R. If λ is the parameter along
the curve γ, we can expand the vector/operator $\frac{d}{d\lambda}$ in terms of the partials $\partial_\mu$:

$$
\frac{d}{d\lambda} f = \lim_{\epsilon\to 0}\frac{f(x^\mu(\lambda+\epsilon)) - f(x^\mu(\lambda))}{\epsilon} = \frac{dx^\mu}{d\lambda}\,\partial_\mu f .
\tag{2}
$$

Since the function f is arbitrary, we can write

$$
\frac{d}{d\lambda} = \frac{dx^\mu}{d\lambda}\,\partial_\mu .
\tag{3}
$$

Thus, the partials {∂µ} do indeed represent a good basis for the vector space of directional
derivatives, which we can therefore safely identify with the tangent space.
Of course, the vector represented by $\frac{d}{d\lambda}$ is one we already know; it is the tangent vector
to the curve with parameter λ. Thus Equation 3 can be thought of as a restatement of
Equation 28 in the SR chapter, where we claimed that the components of the tangent vector
were simply $dx^\mu/d\lambda$. The only difference is that we are working on an arbitrary manifold,
and we have specified our basis vectors to be $\hat{e}_{(\mu)} = \partial_\mu$.
This particular basis (ê(µ) = ∂µ ) is known as a coordinate basis for Tp ; it is the
formalization of the notion of setting up the basis vectors to point along the coordinate
axes. There is no reason why we are limited to coordinate bases when we consider tangent
vectors; it is sometimes more convenient, for example, to use orthonormal bases of some
sort. However, the coordinate basis is very simple and natural, and we will use it almost
exclusively throughout the course.
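One of the advantages of the rather abstract point of view we have taken toward vectors
is that the transformation law is immediate. Since the basis vectors are $\hat{e}_{(\mu)} = \partial_\mu$, the basis
vectors in some new coordinate system $x^{\mu'}$ are given by the chain rule as

$$
\partial_{\mu'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\,\partial_\mu .
\tag{4}
$$

We can get the transformation law for vector components by the same technique used in flat
space, demanding the vector $V = V^\mu\partial_\mu$ be unchanged by a change of basis. We have

$$
\begin{aligned}
V^\mu\partial_\mu &= V^{\mu'}\partial_{\mu'} \\
&= V^{\mu'}\frac{\partial x^\mu}{\partial x^{\mu'}}\,\partial_\mu ,
\end{aligned}
\tag{5}
$$

and hence (since the matrix $\partial x^{\mu'}/\partial x^\mu$ is the inverse of the matrix $\partial x^\mu/\partial x^{\mu'}$),

$$
V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\,V^\mu .
\tag{6}
$$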

Since the basis vectors are usually not written explicitly, the rule in Equation 6 for transforming
components is what we call the “transformation law of a (contravariant) vector.”
An object that transforms according to Equation 6 when the coordinates are changed from
$x^\mu \to x^{\mu'}$ is a contravariant vector. Thus, we have identified vectors with directional
derivatives.
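As a concrete illustration of Equation 6, the sketch below transforms the components of a vector between Cartesian coordinates (x, y) and polar coordinates (r, θ) on the plane. The choice of coordinates, the sample point and the function names are assumptions made for the illustration, not part of the notes.

```python
import math

def to_polar_components(x, y, Vx, Vy):
    """Apply Equation 6 with x^mu = (x, y) and x^mu' = (r, theta)."""
    r = math.hypot(x, y)
    # Jacobian d(x^mu') / d(x^mu): rows r, theta; columns x, y.
    J = [[x / r, y / r],
         [-y / r**2, x / r**2]]
    return [J[0][0] * Vx + J[0][1] * Vy,
            J[1][0] * Vx + J[1][1] * Vy]

def to_cartesian_components(r, theta, Vr, Vtheta):
    """Apply Equation 6 in the opposite direction, using the inverse Jacobian."""
    c, s = math.cos(theta), math.sin(theta)
    # Jacobian d(x^mu) / d(x^mu'): rows x, y; columns r, theta.
    J_inv = [[c, -r * s],
             [s, r * c]]
    return [J_inv[0][0] * Vr + J_inv[0][1] * Vtheta,
            J_inv[1][0] * Vr + J_inv[1][1] * Vtheta]

# Sample point and vector (arbitrary choices for the illustration).
x, y = 3.0, 4.0
V_polar = to_polar_components(x, y, 1.0, 0.0)
V_back = to_cartesian_components(math.hypot(x, y), math.atan2(y, x), *V_polar)
```

The components change (here from (1, 0) to (0.6, −0.16)), while transforming back recovers the original components, as expected for a single geometric object V.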
Of course, the transformation law in Equation 6 is compatible with the transformation
of contravariant vector components in special relativity. Under Lorentz transformations, we
had $V^{\mu'} = \Lambda^{\mu'}{}_{\mu}V^\mu$, but a Lorentz transformation is a special kind of coordinate transformation,
with $x^{\mu'} = \Lambda^{\mu'}{}_{\mu}x^\mu$. Equation 6, though, is much more general, as it encompasses the
behavior of vectors under arbitrary changes of coordinates (and therefore bases), not just
linear transformations. As such, it can be used in curved space time, not only in a flat one.
As usual, we are trying to emphasize a somewhat subtle ontological distinction — tensor
components do not change when we change coordinates, but they change when we change the
basis in the tangent space. Since we have decided to use the coordinates to define our basis,
a change of coordinates induces a change of basis (see Figure 3), which, in turn, induces a
change in the tensor components.


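[Figure 3: the coordinate basis vectors ∂1, ∂2 at a point, together with the basis vectors ∂1′, ∂2′ induced by the new coordinates xµ′.]

Fig. 3. When we change the coordinates from $x^\mu$ to $x^{\mu'}$, we induce a change in the basis.
This basis change leads to a change in the components of a tensor.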

3.1. Covariant vectors

Equation 6 thus provides a general definition of a contravariant vector. We can now
continue to follow the steps we took in flat space (SR), and consider the dual vectors (one-forms).
Once again the cotangent space $T_p^*$ is the set of linear maps $\omega : T_p \to \mathbb{R}$. The
canonical example of a one-form is the gradient of a function f, denoted df. Its action on a
vector $\frac{d}{d\lambda}$ is exactly the directional derivative of the function:

$$
df\left(\frac{d}{d\lambda}\right) = \frac{df}{d\lambda} .
\tag{7}
$$
Note the following: it’s tempting to think, “why shouldn’t the function f itself be considered
the one-form, and df /dλ its action?” The point is that a one-form, like a vector, exists only
at the point it is defined, and does not depend on information at other points on M . If you
know a function in some neighborhood of a point you can take its derivative, but not just
from knowing its value at the point; the gradient, on the other hand, encodes precisely the
information necessary to take the directional derivative along any curve through p, fulfilling
its role as a dual vector.
Just as the partial derivatives along coordinate axes provide a natural basis for the
tangent space, the gradients of the coordinate functions $x^\mu$ provide a natural basis for the
cotangent space. Recall that in flat space we constructed a basis for $T_p^*$ by demanding that
$\hat{\theta}^{(\mu)}(\hat{e}_{(\nu)}) = \delta^\mu_\nu$. Continuing the same philosophy on an arbitrary manifold, we find that Equation 7
leads to

$$
dx^\mu(\partial_\nu) = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu .
\tag{8}
$$

Therefore the gradients $\{dx^\mu\}$ are an appropriate set of basis one-forms; an arbitrary one-form
is expanded into components as $\omega = \omega_\mu\,dx^\mu$.
The transformation properties of basis dual vectors and components follow from what
is by now the usual procedure. We obtain, for basis one-forms,

$$
dx^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\,dx^\mu ,
\tag{9}
$$

and for components,

$$
\omega_{\mu'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\,\omega_\mu .
\tag{10}
$$

We will usually write the components $\omega_\mu$ when we speak about a one-form ω. Thus, Equation
10 can be viewed as defining the transformation law of the covariant vector (or one-form) ω.

3.2. Tensors

The transformation law for general tensors follows this same pattern of replacing the
Lorentz transformation matrix used in flat space with a matrix representing more general
coordinate transformations. A (k, l) tensor T can be expanded as

$$
T = T^{\mu_1\cdots\mu_k}{}_{\nu_1\cdots\nu_l}\;\partial_{\mu_1}\otimes\cdots\otimes\partial_{\mu_k}\otimes dx^{\nu_1}\otimes\cdots\otimes dx^{\nu_l} ,
\tag{11}
$$

where I have used the symbol ⊗ to describe a tensor product (also known as outer
product).²
Under a coordinate transformation the components change like the product of contravariant
vectors and covariant vectors,

$$
T^{\mu'_1\cdots\mu'_k}{}_{\nu'_1\cdots\nu'_l} = \frac{\partial x^{\mu'_1}}{\partial x^{\mu_1}}\cdots\frac{\partial x^{\mu'_k}}{\partial x^{\mu_k}}\,\frac{\partial x^{\nu_1}}{\partial x^{\nu'_1}}\cdots\frac{\partial x^{\nu_l}}{\partial x^{\nu'_l}}\;T^{\mu_1\cdots\mu_k}{}_{\nu_1\cdots\nu_l} .
\tag{12}
$$

This tensor transformation law is straightforward to remember, since there really isn’t anything
else it could be, given the placement of indices. Equation 12 thus defines the transformation
law of tensors.

3.2.1. Example.

Let us consider a symmetric (0, 2) tensor S on a 2-dimensional curved space (manifold).
Let us take as coordinate system on the manifold $(x^1 = x,\; x^2 = y)$. Let us assume that the
components of the tensor are given by

$$
S_{\mu\nu} = \begin{pmatrix} x & 0 \\ 0 & 1 \end{pmatrix} .
\tag{13}
$$

This can be written equivalently as

$$
S = x\,(dx)^2 + (dy)^2 .
\tag{14}
$$

Let us now change the coordinate system: consider new coordinates, say

$$
\begin{aligned}
x' &= x^{1/3} \\
y' &= e^{x+y} .
\end{aligned}
\tag{15}
$$

This leads directly to

$$
\begin{aligned}
x &= (x')^3 \\
y &= \ln(y') - (x')^3 \\
dx &= 3(x')^2\,dx' \\
dy &= \frac{1}{y'}\,dy' - 3(x')^2\,dx' .
\end{aligned}
\tag{16}
$$
2
Without getting into a precise mathematical definition: if T and S are tensors, in the sense that each acts
on a set of dual vectors and vectors, then T ⊗ S can be thought of as first acting with T on the appropriate set of
dual vectors and vectors, then acting with S on the remainder, and then multiplying the answers. Note that, in
general, T ⊗ S ≠ S ⊗ T.

We need only plug these expressions directly into Equation 14 to write the components of S
in terms of the new coordinates x′, y′ (remembering that tensor products don’t commute,
so $dx'\,dy' \neq dy'\,dx'$):

$$
S = 9(x')^4\left[1 + (x')^3\right](dx')^2 - 3\frac{(x')^2}{y'}\left(dx'\,dy' + dy'\,dx'\right) + \frac{1}{(y')^2}(dy')^2 ,
\tag{17}
$$

or

$$
S_{\mu'\nu'} = \begin{pmatrix} 9(x')^4\left[1 + (x')^3\right] & -3\dfrac{(x')^2}{y'} \\[2mm] -3\dfrac{(x')^2}{y'} & \dfrac{1}{(y')^2} \end{pmatrix} .
\tag{18}
$$

Notice that the tensor S is still symmetric. We did not use the transformation law (Equation
12) directly, but doing so would have yielded the same result, as you can check.
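The check suggested above can be sketched numerically. The code below applies the transformation law of Equation 12 to the (0, 2) tensor of Equation 13 and compares the result with the closed form of Equation 18 at one sample point. The function names and the sample point are arbitrary choices made for this illustration.

```python
import math

def jacobian_old_wrt_new(xp, yp):
    """J[mu][mu'] = d(x^mu)/d(x^mu') for x = x'^3, y = ln(y') - x'^3 (Equation 16)."""
    return [[3 * xp**2, 0.0],
            [-3 * xp**2, 1.0 / yp]]

def transform_S(xp, yp):
    """S_{mu' nu'} = (dx^mu/dx^mu') (dx^nu/dx^nu') S_{mu nu}, i.e. Equation 12 for a (0, 2) tensor."""
    x = xp**3                       # old coordinate x in terms of x'
    S = [[x, 0.0], [0.0, 1.0]]      # Equation 13
    J = jacobian_old_wrt_new(xp, yp)
    return [[sum(J[m][a] * J[n][b] * S[m][n] for m in range(2) for n in range(2))
             for b in range(2)] for a in range(2)]

def equation_18(xp, yp):
    """Closed-form components quoted in Equation 18."""
    off_diagonal = -3 * xp**2 / yp
    return [[9 * xp**4 * (1 + xp**3), off_diagonal],
            [off_diagonal, 1.0 / yp**2]]

xp, yp = 1.3, 2.0                   # arbitrary sample point
S_new = transform_S(xp, yp)
S_expected = equation_18(xp, yp)
max_difference = max(abs(S_new[a][b] - S_expected[a][b]) for a in range(2) for b in range(2))
```

Agreement at a single point is of course not a proof, but it is a quick way to catch an index or sign slip in a hand calculation.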
For the most part the various tensor operations we defined in flat space are unaltered
in a more general setting: contraction, symmetrization, etc. There are three important
exceptions: partial derivatives, the metric, and the Levi-Civita tensor. Let’s look at the
metric first.

4. Volume elements in curved space time and tensor densities

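Clearly, the metric tensor,

$$
g_{\mu\nu} \equiv \eta_{\alpha\beta}\frac{\partial\xi^\alpha}{\partial x^\mu}\frac{\partial\xi^\beta}{\partial x^\nu} ,
\tag{19}
$$

transforms as

$$
g_{\mu'\nu'} = \eta_{\alpha\beta}\frac{\partial\xi^\alpha}{\partial x^{\mu'}}\frac{\partial\xi^\beta}{\partial x^{\nu'}} = \eta_{\alpha\beta}\frac{\partial\xi^\alpha}{\partial x^\mu}\frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial\xi^\beta}{\partial x^\nu}\frac{\partial x^\nu}{\partial x^{\nu'}} ,
\tag{20}
$$

or

$$
g_{\mu'\nu'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\nu}{\partial x^{\nu'}}\,g_{\mu\nu} .
\tag{21}
$$

Thus, we see that $g_{\mu\nu}$ is indeed a covariant tensor. Its inverse, $g^{\mu\nu}$, is a contravariant tensor.
The Kronecker symbol, $\delta^\mu_\nu$, is a mixed tensor. However, not everything is a tensor! For
example, the affine connection, $\Gamma^\lambda_{\mu\nu}$, is not a tensor (as we will see below).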
One important example of a non-tensor quantity is the determinant of the metric tensor:

$$
g \equiv -\mathrm{Det}\,(g_{\mu\nu}) = |g_{\mu\nu}| .
\tag{22}
$$

Using the transformation rule of the metric tensor (Equation 21) and taking its determinant,

$$
g(x^{\mu'}) = \left|\frac{\partial x^{\mu'}}{\partial x^\mu}\right|^{-2} g(x^\mu) ,
\tag{23}
$$

which can be written as

$$
g' = \left|\frac{\partial x'}{\partial x}\right|^{-2} g ,
\tag{24}
$$

where |∂x′/∂x| is the Jacobian of the transformation x → x′. Thus, g transforms like a
scalar, except for an extra factor of the Jacobian. It is thus called a scalar density (which is
a special case of a tensor density). The number of factors of |∂x′/∂x| is called the weight
of the density; thus g is a scalar density of weight −2.
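The importance of tensor densities arises from the fact that under a general coordinate
transformation x → x′, the volume element $d^n x$ picks up a factor of the Jacobian,

$$
d^n x' = \left|\frac{\partial x^{\mu'}}{\partial x^\mu}\right| d^n x .
\tag{25}
$$

Since, by Equation 24, $\sqrt{g}$ picks up the inverse factor, the Jacobians cancel in the product.
Thus, $\sqrt{g}\,d^n x$ is an invariant volume element.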

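Example.
Consider the flat 3-d space written in spherical coordinates: $ds^2 = dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$.
We can thus write the metric as

$$
g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix} .
\tag{26}
$$

Thus, $g = r^4\sin^2\theta$ (for this positive-definite metric the determinant is taken without the minus sign),
and a volume element is $\sqrt{g}\,dr\,d\theta\,d\phi = r^2\sin\theta\,dr\,d\theta\,d\phi$.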
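The spherical-coordinates example can be reproduced numerically as a sanity check. The sketch below builds the metric from the Jacobian of the map from spherical to Cartesian coordinates (Equation 21 applied to the Euclidean metric) and compares its determinant with $r^4\sin^2\theta$. The function names and the sample point are assumptions made for the illustration.

```python
import math

def jacobian_cart_wrt_sph(r, theta, phi):
    """Rows: x, y, z; columns: derivatives with respect to r, theta, phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, r * ct * cp, -r * st * sp],
        [st * sp, r * ct * sp,  r * st * cp],
        [ct,      -r * st,      0.0],
    ]

def induced_metric(J):
    """g_{mu' nu'} = sum_a J[a][mu'] J[a][nu'], i.e. Equation 21 with a Euclidean g_{mu nu}."""
    n = len(J)
    return [[sum(J[a][i] * J[a][j] for a in range(n)) for j in range(n)] for i in range(n)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, phi = 2.0, 0.7, 1.1       # arbitrary sample point
J = jacobian_cart_wrt_sph(r, theta, phi)
g = induced_metric(J)
det_g = det3(g)                     # should equal r^4 sin^2(theta)
det_J = det3(J)                     # should equal r^2 sin(theta), the volume factor
```

The determinant of the metric is the square of the Jacobian determinant, which is why its square root supplies exactly the factor needed in the volume element.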

4.1. The Levi-Civita tensor density

The final change we have to make to our tensor knowledge now that we have dropped
the assumption of flat space has to do with the Levi-Civita tensor, $\epsilon_{\mu_1\mu_2\cdots\mu_n}$. Remember that
the flat-space version of this object, which we will now denote by $\tilde{\epsilon}_{\mu_1\mu_2\cdots\mu_n}$, was defined as

$$
\tilde{\epsilon}_{\mu_1\mu_2\cdots\mu_n} =
\begin{cases}
+1 & \text{if } \mu_1\mu_2\cdots\mu_n \text{ is an even permutation of } 01\cdots(n-1) , \\
-1 & \text{if } \mu_1\mu_2\cdots\mu_n \text{ is an odd permutation of } 01\cdots(n-1) , \\
0 & \text{otherwise} .
\end{cases}
\tag{27}
$$

We will now define the Levi-Civita symbol to be exactly this $\tilde{\epsilon}_{\mu_1\mu_2\cdots\mu_n}$; that is, an object
with n indices which has the components specified above in any coordinate system. This is
called a “symbol,” of course, because it is not a tensor; it is defined not to change under
coordinate transformations.

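We can relate its behavior to that of an ordinary tensor by looking at the determinant
of the matrix $\partial x^{\mu'}/\partial x^\mu$, which obeys

$$
\tilde{\epsilon}_{\mu'_1\mu'_2\cdots\mu'_n} = \left|\frac{\partial x^{\mu'}}{\partial x^\mu}\right|\,\tilde{\epsilon}_{\mu_1\mu_2\cdots\mu_n}\,\frac{\partial x^{\mu_1}}{\partial x^{\mu'_1}}\frac{\partial x^{\mu_2}}{\partial x^{\mu'_2}}\cdots\frac{\partial x^{\mu_n}}{\partial x^{\mu'_n}} .
\tag{28}
$$

(This can be found in any linear algebra book.) Thus, the Levi-Civita symbol is a tensor
density of weight 1.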
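The linear-algebra fact behind Equation 28 is that contracting the Levi-Civita symbol with n copies of a matrix M reproduces the symbol times Det M. A minimal sketch that verifies this for one 3×3 matrix is given below; the matrix entries and the function names are arbitrary choices for the illustration, and M stands in for the matrix $\partial x^\mu/\partial x^{\mu'}$.

```python
from itertools import permutations, product

def levi_civita(indices):
    """Levi-Civita symbol: +1/-1 for even/odd permutations of 0..n-1, else 0."""
    n = len(indices)
    if sorted(indices) != list(range(n)):
        return 0
    # Count inversions to get the parity of the permutation.
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if indices[i] > indices[j])
    return -1 if inversions % 2 else 1

def determinant(M):
    """Leibniz formula: sum over permutations of sign * product of entries."""
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(levi_civita(perm))
        for col, row in enumerate(perm):
            term *= M[row][col]
        total += term
    return total

def contract(M, lower):
    """Sum over mu_1..mu_n of eps[mu_1..mu_n] * M[mu_1][lower_1] * ... * M[mu_n][lower_n]."""
    total = 0.0
    for upper in permutations(range(len(M))):   # all other index tuples have eps = 0
        term = float(levi_civita(upper))
        for mu, nu in zip(upper, lower):
            term *= M[mu][nu]
        total += term
    return total

M = [[2.0, 1.0, 0.5], [0.0, 3.0, -1.0], [1.0, -2.0, 4.0]]
det_M = determinant(M)
# Largest deviation from  eps_{mu...} M M M = Det(M) eps_{nu...}  over all index choices.
max_error = max(
    abs(contract(M, lower) - det_M * levi_civita(lower))
    for lower in product(range(3), repeat=3)
)
```

Moving the determinant to the other side of this identity gives the extra Jacobian factor in Equation 28, which is what makes the symbol a density rather than a tensor.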
However, we don’t like tensor densities, we like tensors. There is a simple way to convert
a density into an honest tensor: multiply by $|g|^{w/2}$, where w is the weight of the density
(the absolute value signs are there because the determinant of the metric is negative for Lorentz metrics). The result will
transform according to the tensor transformation law. Therefore, for example, we can define
the Levi-Civita tensor as

$$
\epsilon_{\mu_1\mu_2\cdots\mu_n} = \sqrt{|g|}\;\tilde{\epsilon}_{\mu_1\mu_2\cdots\mu_n} .
\tag{29}
$$

Since this is a real tensor, we can raise indices, etc. Sometimes people define a version of the
Levi-Civita symbol with upper indices, $\tilde{\epsilon}^{\mu_1\mu_2\cdots\mu_n}$, whose components are numerically equal
to the symbol with lower indices. This turns out to be a density of weight −1, and is related
to the tensor with upper indices by

$$
\epsilon^{\mu_1\mu_2\cdots\mu_n} = \mathrm{sgn}(g)\,\frac{1}{\sqrt{|g|}}\;\tilde{\epsilon}^{\mu_1\mu_2\cdots\mu_n} .
\tag{30}
$$

As an aside (for those of you who like math), we should come clean and admit that, even
with the factor of $\sqrt{|g|}$, the Levi-Civita tensor is in some sense not a true tensor, because
on some manifolds it cannot be globally defined. Those on which it can be defined are
called orientable, and we will deal exclusively with orientable manifolds in this course. An
example of a non-orientable manifold is the Möbius strip; see Schutz’s Geometrical Methods
in Mathematical Physics (or a similar text) for a discussion.

5. Covariant derivatives

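The unfortunate fact is that the partial derivative of a tensor is not, in general, a new
tensor. For example, if we take the contravariant vector $V^\mu$, whose transformation law is
given by Equation 6,

$$
V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\,V^\mu ,
$$

and we differentiate with respect to $x^{\lambda'}$, we get

$$
\frac{\partial V^{\mu'}}{\partial x^{\lambda'}} = \frac{\partial}{\partial x^{\lambda'}}\left(\frac{\partial x^{\mu'}}{\partial x^\mu}V^\mu\right) = \frac{\partial x^{\mu'}}{\partial x^\mu}\frac{\partial x^\rho}{\partial x^{\lambda'}}\frac{\partial V^\mu}{\partial x^\rho} + \frac{\partial^2 x^{\mu'}}{\partial x^\mu\,\partial x^\rho}\frac{\partial x^\rho}{\partial x^{\lambda'}}\,V^\mu .
\tag{31}
$$

The first term on the right hand side is what we would expect if $\partial V^\mu/\partial x^\lambda$ were a tensor. It
is the second term that destroys the tensor behavior.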
This is a very problematic result, as derivatives of tensors are obvious ingredients in
physical equations. We somehow need to find a way to generalize equations such as $\partial_\mu T^{\mu\nu} = 0$
to curved space time. Thus, what we are really looking for is an operator which reduces to the
partial derivative in flat space with Cartesian coordinates, but transforms as a tensor on an
arbitrary manifold. Such an operator is called a covariant derivative.
In order to construct a covariant derivative, let’s first have a look at the transformation
law of the affine connection.

5.1. Transformation law of the affine connection

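Recall the definition of the affine connection,

$$
\Gamma^\lambda_{\mu\nu} \equiv \frac{\partial x^\lambda}{\partial\xi^\alpha}\frac{\partial^2\xi^\alpha}{\partial x^\mu\,\partial x^\nu} ,
\tag{32}
$$

where $\xi^\alpha(x)$ is the locally inertial coordinate system. When changing coordinates from $x^\mu$
to $x^{\mu'}$, the affine connection transforms as

$$
\begin{aligned}
\Gamma^{\lambda'}_{\mu'\nu'} &\equiv \frac{\partial x^{\lambda'}}{\partial\xi^\alpha}\frac{\partial^2\xi^\alpha}{\partial x^{\mu'}\,\partial x^{\nu'}} \\
&= \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\rho}{\partial\xi^\alpha}\frac{\partial}{\partial x^{\mu'}}\left(\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial\xi^\alpha}{\partial x^\sigma}\right) \\
&= \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\rho}{\partial\xi^\alpha}\left[\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial x^\tau}{\partial x^{\mu'}}\frac{\partial^2\xi^\alpha}{\partial x^\tau\,\partial x^\sigma} + \frac{\partial^2 x^\sigma}{\partial x^{\nu'}\,\partial x^{\mu'}}\frac{\partial\xi^\alpha}{\partial x^\sigma}\right] .
\end{aligned}
\tag{33}
$$

Undoubtedly, lovely. Using again Equation 32, we can write this as

$$
\Gamma^{\lambda'}_{\mu'\nu'} = \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial x^\tau}{\partial x^{\mu'}}\,\Gamma^\rho_{\tau\sigma} + \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial^2 x^\rho}{\partial x^{\nu'}\,\partial x^{\mu'}} .
\tag{34}
$$

Clearly, the first term is what we would get if the affine connection were a tensor. The second
term is inhomogeneous, and makes it a non-tensor.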
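To see the inhomogeneous term of Equation 34 at work, consider the flat plane: in Cartesian coordinates the connection vanishes, so in polar coordinates the whole connection comes from the second term. The sketch below evaluates that term using the analytic derivatives of x = r cos θ, y = r sin θ. The example, the index convention (0 = r, 1 = θ) and the function name are assumptions made for the illustration.

```python
import math

def polar_connection_from_cartesian(r, theta):
    """Gamma[l][m][n] in polar coordinates (index 0 = r, 1 = theta), from the
    inhomogeneous term of Equation 34 with a vanishing Cartesian connection."""
    c, s = math.cos(theta), math.sin(theta)
    # dxp_dx[l][rho] = d(x'^l)/d(x^rho), with x' = (r, theta) and x = (x, y).
    dxp_dx = [[c, s],
              [-s / r, c / r]]
    # d2x[rho][m][n] = d^2(x^rho) / d(x'^m) d(x'^n).
    d2x = [
        [[0.0, -s], [-s, -r * c]],   # second derivatives of x = r cos(theta)
        [[0.0,  c], [ c, -r * s]],   # second derivatives of y = r sin(theta)
    ]
    return [[[sum(dxp_dx[l][rho] * d2x[rho][m][n] for rho in range(2))
              for n in range(2)] for m in range(2)] for l in range(2)]

r, theta = 1.5, 0.4                  # arbitrary sample point
Gamma = polar_connection_from_cartesian(r, theta)
gamma_r_theta_theta = Gamma[0][1][1]     # evaluates to -r
gamma_theta_r_theta = Gamma[1][0][1]     # evaluates to 1/r
```

A tensor that vanishes in one coordinate system vanishes in all of them; the connection does not, which is the content of the statement that it is not a tensor.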
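We will write this in a slightly different form, using the identity

$$
\frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\rho}{\partial x^{\nu'}} = \delta^{\lambda'}_{\nu'} .
\tag{35}
$$

Differentiating this identity with respect to $x^{\mu'}$, we can write

$$
\frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial^2 x^\rho}{\partial x^{\mu'}\,\partial x^{\nu'}} = -\frac{\partial^2 x^{\lambda'}}{\partial x^\rho\,\partial x^\sigma}\frac{\partial x^\sigma}{\partial x^{\mu'}}\frac{\partial x^\rho}{\partial x^{\nu'}} .
\tag{36}
$$

The transformation of the affine connection, Equation 34, thus becomes

$$
\Gamma^{\lambda'}_{\mu'\nu'} = \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial x^\tau}{\partial x^{\mu'}}\,\Gamma^\rho_{\tau\sigma} - \frac{\partial x^\sigma}{\partial x^{\mu'}}\frac{\partial x^\rho}{\partial x^{\nu'}}\frac{\partial^2 x^{\lambda'}}{\partial x^\rho\,\partial x^\sigma} .
\tag{37}
$$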

5.2. Covariant derivatives of vectors and tensors

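Although $\partial V^\mu/\partial x^\lambda$ is not a tensor, the result of Equation 37 can be used to construct
a tensor. This is done by looking at the transformation law of $\Gamma^{\lambda'}_{\mu'\nu'}V^{\nu'}$,

$$
\begin{aligned}
\Gamma^{\lambda'}_{\mu'\nu'}V^{\nu'} &= \left[\frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial x^\tau}{\partial x^{\mu'}}\,\Gamma^\rho_{\tau\sigma} - \frac{\partial x^\sigma}{\partial x^{\mu'}}\frac{\partial x^\rho}{\partial x^{\nu'}}\frac{\partial^2 x^{\lambda'}}{\partial x^\rho\,\partial x^\sigma}\right]\frac{\partial x^{\nu'}}{\partial x^\kappa}V^\kappa \\
&= \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\tau}{\partial x^{\mu'}}\,\Gamma^\rho_{\tau\sigma}V^\sigma - \frac{\partial x^\sigma}{\partial x^{\mu'}}\frac{\partial^2 x^{\lambda'}}{\partial x^\rho\,\partial x^\sigma}\,V^\rho .
\end{aligned}
\tag{38}
$$

Adding this to Equation 31 (after relabeling the indices λ′ → µ′ and µ′ → λ′ in Equation 38), one gets

$$
\frac{\partial V^{\mu'}}{\partial x^{\lambda'}} + \Gamma^{\mu'}_{\lambda'\nu'}V^{\nu'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\frac{\partial x^\rho}{\partial x^{\lambda'}}\left(\frac{\partial V^\mu}{\partial x^\rho} + \Gamma^\mu_{\rho\sigma}V^\sigma\right) ,
\tag{39}
$$

which is basically what we wanted! We are thus led to define a covariant derivative

$$
\nabla_\lambda V^\mu \equiv V^\mu{}_{;\lambda} \equiv \frac{\partial V^\mu}{\partial x^\lambda} + \Gamma^\mu_{\lambda\sigma}V^\sigma .
\tag{40}
$$

Equation 39 tells us that $V^\mu{}_{;\lambda}$ is a tensor, since

$$
V^{\mu'}{}_{;\lambda'} = \frac{\partial x^{\mu'}}{\partial x^\mu}\frac{\partial x^\rho}{\partial x^{\lambda'}}\,V^\mu{}_{;\rho} .
\tag{41}
$$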
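A simple way to get a feeling for Equation 40 is to apply it to a vector field that is constant in Cartesian coordinates, but written in polar components. Its partial derivatives do not vanish, yet its covariant derivative should. The sketch below assumes the standard polar-coordinate connection components ($\Gamma^r_{\theta\theta} = -r$, $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$), which are quoted rather than derived here; the function names and the finite-difference step are illustrative choices.

```python
import math

def gamma_polar(r):
    """Non-zero polar connection components; index 0 = r, 1 = theta (assumed standard values)."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r            # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r       # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r       # Gamma^theta_{theta r}
    return G

def vector_field(r, theta):
    """Polar components (V^r, V^theta) of the constant Cartesian field V = (1, 0)."""
    return [math.cos(theta), -math.sin(theta) / r]

def covariant_derivative(V, r, theta, h=1e-5):
    """Return D[lam][mu] = dV^mu/dx^lam + Gamma^mu_{lam sigma} V^sigma (Equation 40),
    with the partial derivatives taken by central finite differences."""
    G = gamma_polar(r)
    v = V(r, theta)
    plus = [V(r + h, theta), V(r, theta + h)]
    minus = [V(r - h, theta), V(r, theta - h)]
    D = [[0.0, 0.0], [0.0, 0.0]]
    for lam in range(2):
        for mu in range(2):
            partial = (plus[lam][mu] - minus[lam][mu]) / (2 * h)
            D[lam][mu] = partial + sum(G[mu][lam][s] * v[s] for s in range(2))
    return D

D = covariant_derivative(vector_field, 2.0, 0.6)
largest_component = max(abs(D[lam][mu]) for lam in range(2) for mu in range(2))
partial_theta_Vr = -math.sin(0.6)   # the ordinary derivative dV^r/dtheta alone is not zero
```

All four components of D vanish to within the finite-difference error, even though individual partial derivatives such as $\partial V^r/\partial\theta$ do not: the connection term compensates exactly for the rotation of the basis vectors.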

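In an identical way, we can define the covariant derivative of a covariant vector $\omega_\mu$. Recall
the transformation law (Equation 10),

$$
\omega_{\mu'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\,\omega_\mu ,
$$

and differentiate with respect to $x^{\nu'}$, to get

$$
\frac{\partial\omega_{\mu'}}{\partial x^{\nu'}} = \frac{\partial x^\rho}{\partial x^{\mu'}}\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial\omega_\rho}{\partial x^\sigma} + \frac{\partial^2 x^\rho}{\partial x^{\mu'}\,\partial x^{\nu'}}\,\omega_\rho .
\tag{42}
$$

Using Equation 34, we have

$$
\begin{aligned}
\Gamma^{\lambda'}_{\mu'\nu'}\omega_{\lambda'} &= \left[\frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial x^\tau}{\partial x^{\mu'}}\,\Gamma^\rho_{\tau\sigma} + \frac{\partial x^{\lambda'}}{\partial x^\rho}\frac{\partial^2 x^\rho}{\partial x^{\nu'}\,\partial x^{\mu'}}\right]\frac{\partial x^\kappa}{\partial x^{\lambda'}}\,\omega_\kappa \\
&= \frac{\partial x^\sigma}{\partial x^{\nu'}}\frac{\partial x^\tau}{\partial x^{\mu'}}\,\Gamma^\kappa_{\tau\sigma}\,\omega_\kappa + \frac{\partial^2 x^\kappa}{\partial x^{\nu'}\,\partial x^{\mu'}}\,\omega_\kappa .
\end{aligned}
\tag{43}
$$

Subtracting Equation 43 from Equation 42, we cancel the inhomogeneous terms and obtain

$$
\frac{\partial\omega_{\mu'}}{\partial x^{\nu'}} - \Gamma^{\lambda'}_{\mu'\nu'}\omega_{\lambda'} = \frac{\partial x^\rho}{\partial x^{\mu'}}\frac{\partial x^\sigma}{\partial x^{\nu'}}\left(\frac{\partial\omega_\rho}{\partial x^\sigma} - \Gamma^\kappa_{\rho\sigma}\,\omega_\kappa\right) .
\tag{44}
$$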

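Thus, we define the covariant derivative of a covariant vector by

$$
\nabla_\lambda\omega_\mu \equiv \omega_{\mu;\lambda} \equiv \frac{\partial\omega_\mu}{\partial x^\lambda} - \Gamma^\sigma_{\lambda\mu}\,\omega_\sigma .
\tag{45}
$$

Clearly, from Equation 44, $\omega_{\mu;\lambda}$ is a tensor, since

$$
\omega_{\mu';\lambda'} = \frac{\partial x^\rho}{\partial x^{\mu'}}\frac{\partial x^\sigma}{\partial x^{\lambda'}}\,\omega_{\rho;\sigma} .
\tag{46}
$$

It is clear how to generalize these definitions to arbitrary tensors. For each upper index
we introduce a term with a +Γ, and for each lower index a term with a −Γ. For example,

$$
T^{\mu_1\mu_2}{}_{\nu_1;\sigma} = \frac{\partial T^{\mu_1\mu_2}{}_{\nu_1}}{\partial x^\sigma} + \Gamma^{\mu_1}_{\sigma\lambda}\,T^{\lambda\mu_2}{}_{\nu_1} + \Gamma^{\mu_2}_{\sigma\lambda}\,T^{\mu_1\lambda}{}_{\nu_1} - \Gamma^\lambda_{\sigma\nu_1}\,T^{\mu_1\mu_2}{}_\lambda .
\tag{47}
$$

Clearly, this is a tensor.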
The idea of covariant derivative can be extended to tensor densities, but I will check
whether it is absolutely needed before continuing in this direction.

6. Importance of covariant derivatives, and the derivative of the metric

Let us stop for a moment and see what we got. By introducing the concept of covariant
derivatives, combined with the algebraic properties of tensors [linearity, external (direct)
product and contraction (T µ ≡ T µ ρ νρ )], we were able to extend the concept of partial deriva-
tives from flat space time to a curved one. Moreover, we did it without being dependent on
the particular coordinate system used. In particular, we found that the covariant derivatives
are:

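1. Linear: $(\alpha T + \beta S)_{;\lambda} = \alpha T_{;\lambda} + \beta S_{;\lambda}$, where α and β are numbers, T and S are tensors.

2. Obey the Leibniz (product) rule: $\nabla(T\otimes S) = (\nabla T)\otimes S + T\otimes(\nabla S)$, or $(TS)_{;\lambda} = T_{;\lambda}S + TS_{;\lambda}$.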

The covariant derivative of the metric tensor is 0. This can be understood “intuitively”:
in the local inertial frame it vanishes, and being a tensor, if it is 0 in one coordinate
system it is 0 in any coordinate system. This can also be seen directly, by writing

$$
g_{\mu\nu;\lambda} = \frac{\partial g_{\mu\nu}}{\partial x^\lambda} - \Gamma^\rho_{\lambda\mu}\,g_{\rho\nu} - \Gamma^\rho_{\lambda\nu}\,g_{\rho\mu} ,
\tag{48}
$$

and using the definition of the affine connection. In a very similar way, $g^{\mu\nu}{}_{;\lambda} = 0$ and
$\delta^\mu_{\nu;\lambda} = 0$.
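Equation 48 can be checked explicitly in a simple case. The sketch below evaluates $g_{\mu\nu;\lambda}$ for the flat plane in polar coordinates, with metric components $g_{rr} = 1$, $g_{\theta\theta} = r^2$ and the standard polar connection components; both are assumptions quoted for the illustration rather than results derived in the text, and the function names are hypothetical.

```python
def metric_polar(r):
    """g_{mu nu} for ds^2 = dr^2 + r^2 dtheta^2; index 0 = r, 1 = theta."""
    return [[1.0, 0.0], [0.0, r * r]]

def gamma_polar(r):
    """Non-zero polar connection components G[upper][lower1][lower2] (assumed standard values)."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r            # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r       # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r       # Gamma^theta_{theta r}
    return G

def metric_covariant_derivative(r, h=1e-5):
    """Return D[lam][m][n] = g_{mn;lam} from Equation 48."""
    G = gamma_polar(r)
    g = metric_polar(r)
    g_plus, g_minus = metric_polar(r + h), metric_polar(r - h)
    # The metric components depend on r only, so the theta-derivative is zero.
    dg = [
        [[(g_plus[m][n] - g_minus[m][n]) / (2 * h) for n in range(2)] for m in range(2)],
        [[0.0, 0.0], [0.0, 0.0]],
    ]
    return [[[dg[lam][m][n]
              - sum(G[rho][lam][m] * g[rho][n] for rho in range(2))
              - sum(G[rho][lam][n] * g[rho][m] for rho in range(2))
              for n in range(2)] for m in range(2)] for lam in range(2)]

D = metric_covariant_derivative(1.8)
largest_component = max(abs(D[lam][m][n])
                        for lam in range(2) for m in range(2) for n in range(2))
```

The derivative $\partial g_{\theta\theta}/\partial r = 2r$ is cancelled by the two connection terms, each contributing $r$, so every component vanishes.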
The importance of the covariant derivative arises from two of its properties:

1. It converts tensors to other tensors.

2. It reduces to ordinary differentiation in the absence of gravitation, $\Gamma^\lambda_{\mu\nu} = 0$, namely in
flat space-time and Cartesian coordinates.

These properties thus suggest an easy algorithm to assess the effects of gravitation on physical
systems: Write the appropriate SR equation that holds in the absence of gravity;
then replace $\eta_{\mu\nu}$ with $g_{\mu\nu}$ and all derivatives with covariant derivatives. The resulting
equations will be generally covariant, and true in the absence of gravitation. According
to the principle of general covariance, they will be true in the presence of a gravitational field
(provided that we work in a sufficiently small region of space).

7. Geometric interpretation of covariant derivatives

Let us have another look at covariant derivatives, as these are crucial when working in
curved space-time. Let us look first at flat space-time. When we want to take a derivative
of a vector, we consider two vectors $V(x^\alpha)$ and $V(x^\alpha + dx^\alpha)$ separated by an infinitesimal
displacement $dx^\alpha$ along the direction of the derivative. Thus, to construct the derivative,
we first transport the vector $V(x^\alpha + dx^\alpha)$ parallel to itself back to the point $x^\alpha$, to give the
vector $V_\parallel(x^\alpha)$. Only then is it in the tangent space at $x^\alpha$. In a second step, we
subtract the vector $V(x^\alpha)$ from it, using the parallelogram rule (see Figure 4). The key thing
that we do is parallel transport.

Fig. 4. The derivative of a vector in flat space time includes two stages: First, the vector
$V(x^\alpha + dx^\alpha)$ is transported parallel to itself from $x^\alpha + dx^\alpha$ to $x^\alpha$. Then $V(x^\alpha)$ is
subtracted from the transported vector to obtain the difference $\Delta V(x^\alpha)$. The derivative is
$\Delta V(x^\alpha)/dx^\alpha$ in the limit $dx^\alpha \to 0$.

We now turn our attention to curved space time. We can perform parallel transport in
curved space time, because locally we have a local inertial frame which is equivalent to flat
space time. However, when we do that, the components of a vector change. This results
from the change in the angle the vector makes with the basis vectors, as demonstrated in
Figure 5. This change is linear in the vector components. We therefore expect a covariant derivative of the
form $\nabla_\beta V^\alpha = \partial V^\alpha/\partial x^\beta + \tilde{\Gamma}^\alpha_{\beta\gamma}V^\gamma$ (the tilde distinguishes these coefficients, for the moment, from
the affine connection). Thus, while the first term $\partial V^\alpha/\partial x^\beta$ arises from a change
in the vector field V between $x^\alpha$ and $x^\alpha + dx^\alpha$, the second term arises from a change in the
basis vectors between the two points.
The question now is why $\tilde{\Gamma}^\alpha_{\beta\gamma}$ turns out to be identical to the affine connection, $\Gamma^\alpha_{\beta\gamma}$;
surely this is no coincidence(?). Of course it isn’t. We saw that the geodesic, in locally
flat space time, is a straight line. Now, we defined a “straight line” as the curve of extremal
(minimal) distance between points. However, an alternative definition is a curve whose unit
tangent vector remains parallel to itself when transported along the curve (see Figure 6). Let us call this tangent vector u: then its
covariant derivative in its own direction must vanish,

$$
\nabla_u u^\alpha = u^\beta\left(\frac{\partial u^\alpha}{\partial x^\beta} + \tilde{\Gamma}^\alpha_{\beta\gamma}u^\gamma\right) = 0 ,
\tag{49}
$$

where $u^\alpha = dx^\alpha/d\tau$. However, we already know that u fulfills the geodesic equation, which
we can write as

$$
u^\beta\left(\frac{\partial u^\alpha}{\partial x^\beta} + \Gamma^\alpha_{\beta\gamma}u^\gamma\right) = 0 .
\tag{50}
$$

Comparing Equations 49 and 50 shows that indeed $\tilde{\Gamma}^\alpha_{\beta\gamma} = \Gamma^\alpha_{\beta\gamma}$, as we expected.
This argument can in fact be turned around, to give an elegant version of the geodesic
equation in terms of the covariant derivative. A geodesic is a curve whose tangent vector u
obeys

$$
\nabla_u u = 0 .
\tag{51}
$$

8. Gradient, divergence and curl

The equations of electromagnetism, fluid mechanics and many other areas of classical
physics make use of three-dimensional vector calculus, employing operators such as gradient,
divergence, curl and Laplacian. You have seen explicit forms of these operators in
non-Cartesian coordinate systems, such as cylindrical or spherical. The concept of covariant
derivative provides a unified picture of all these derivatives and a direct route to the explicit
forms in given coordinate systems.

Fig. 5. When parallel transporting a vector in non-Cartesian coordinates, the components
of the vector change, due to the change in the basis vectors: in this example, we use polar coordinates,
and while the vector itself does not change when parallel-transported, its components
do.
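We have already seen that the covariant derivative of a scalar is just the ordinary
gradient:

$$
S_{;\mu} = \frac{\partial S}{\partial x^\mu} .
\tag{52}
$$

Another special case is the covariant curl. Using Equation 45, $\omega_{\mu;\nu} = \partial\omega_\mu/\partial x^\nu - \Gamma^\lambda_{\mu\nu}\omega_\lambda$,
and the fact that $\Gamma^\lambda_{\mu\nu}$ is symmetric in µ and ν, the covariant curl is just the ordinary curl,

$$
\omega_{\mu;\nu} - \omega_{\nu;\mu} = \frac{\partial\omega_\mu}{\partial x^\nu} - \frac{\partial\omega_\nu}{\partial x^\mu} .
\tag{53}
$$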

The covariant divergence of a contravariant vector is

$$
V^\mu{}_{;\mu} = \frac{\partial V^\mu}{\partial x^\mu} + \Gamma^\mu_{\mu\lambda}V^\lambda .
\tag{54}
$$

Fig. 6. A geodesic can be thought of as a line for which, when a tangent vector V at $x^\alpha$ is
parallel-transported to $x^{\alpha'}$, the obtained vector $V_\parallel(x^{\alpha'})$ coincides with the tangent vector
$V(x^{\alpha'})$ there.

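We can use the symmetry properties of $\Gamma^\mu_{\mu\lambda}$ to write it as

$$
\Gamma^\mu_{\mu\lambda} = \frac{1}{2}g^{\mu\rho}\left(\frac{\partial g_{\rho\mu}}{\partial x^\lambda} + \frac{\partial g_{\rho\lambda}}{\partial x^\mu} - \frac{\partial g_{\mu\lambda}}{\partial x^\rho}\right) = \frac{1}{2}g^{\mu\rho}\frac{\partial g_{\rho\mu}}{\partial x^\lambda} .
\tag{55}
$$

Using the algebraic identity

$$
\mathrm{Tr}\left[M^{-1}(x)\frac{\partial}{\partial x^\lambda}M(x)\right] = \frac{\partial}{\partial x^\lambda}\ln\mathrm{Det}\,M(x)
\tag{56}
$$

and applying it to the matrix $M = g_{\rho\mu}$ leads to

$$
\Gamma^\mu_{\mu\lambda} = \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x^\lambda}\sqrt{|g|} ,
\tag{57}
$$

and the covariant divergence is

$$
V^\mu{}_{;\mu} = \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x^\mu}\left(\sqrt{|g|}\,V^\mu\right) .
\tag{58}
$$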
This immediately leads to the covariant form of Gauss’s theorem: if $V^\mu$ vanishes at infinity,
then

$$
\int d^4x\,\sqrt{|g|}\;V^\mu{}_{;\mu} = 0 .
\tag{59}
$$

(Note the appearance of $\sqrt{|g|}$, which makes $d^4x\,\sqrt{|g|}$ invariant.)
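Equation 58 is easy to try out in polar coordinates on the flat plane, where $\sqrt{|g|} = r$. The sketch below evaluates the covariant divergence of two fields whose Cartesian divergence is known: the field (x, y), with divergence 2, and the constant field (1, 0), with divergence 0. The choice of fields, the function names and the finite-difference step are assumptions made for the illustration.

```python
import math

def sqrt_g_polar(r, theta):
    """sqrt|g| for ds^2 = dr^2 + r^2 dtheta^2."""
    return r

def covariant_divergence(V, r, theta, h=1e-5):
    """Equation 58 in polar coordinates (x^0 = r, x^1 = theta), by central differences."""
    def flux(component, rr, tt):
        return sqrt_g_polar(rr, tt) * V(rr, tt)[component]
    d_r = (flux(0, r + h, theta) - flux(0, r - h, theta)) / (2 * h)
    d_theta = (flux(1, r, theta + h) - flux(1, r, theta - h)) / (2 * h)
    return (d_r + d_theta) / sqrt_g_polar(r, theta)

def radial_field(r, theta):
    """Polar components of the Cartesian field (x, y): V^r = r, V^theta = 0."""
    return [r, 0.0]

def constant_field(r, theta):
    """Polar components of the constant Cartesian field (1, 0)."""
    return [math.cos(theta), -math.sin(theta) / r]

div_radial = covariant_divergence(radial_field, 1.7, 0.3)      # Cartesian value: 2
div_constant = covariant_divergence(constant_field, 1.7, 0.3)  # Cartesian value: 0
```

Both results agree with the Cartesian values, as they must for a scalar; note that no connection components had to be computed, which is the practical advantage of Equation 58 over Equation 54.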
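In three dimensions, the Laplacian of a scalar S is just the divergence of its gradient, namely

$$
\nabla^2 S = \left(g^{ij}S_{;i}\right)_{;j} ,
\tag{60}
$$

which, using Equations 52 and 58, is

$$
\nabla^2 S = \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x^j}\left(\sqrt{|g|}\,g^{ij}\frac{\partial S}{\partial x^i}\right) .
\tag{61}
$$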
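As a quick illustration of Equation 61, consider flat 3-d space in spherical coordinates, with the metric of Equation 26. The inverse metric components below are simply the reciprocals of the diagonal entries; the worked steps are a sketch added for illustration, and should reproduce the familiar spherical Laplacian:

```latex
% Ingredients, read off from the diagonal metric of Equation 26
\sqrt{|g|} = r^2\sin\theta , \qquad
g^{rr} = 1 , \quad
g^{\theta\theta} = \frac{1}{r^2} , \quad
g^{\phi\phi} = \frac{1}{r^2\sin^2\theta} .

% Inserting these into Equation 61, term by term
\nabla^2 S
  = \frac{1}{r^2\sin\theta}\left[
      \frac{\partial}{\partial r}\left(r^2\sin\theta\,\frac{\partial S}{\partial r}\right)
    + \frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial S}{\partial\theta}\right)
    + \frac{\partial}{\partial\phi}\left(\frac{1}{\sin\theta}\,\frac{\partial S}{\partial\phi}\right)
    \right]

  = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial S}{\partial r}\right)
  + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial S}{\partial\theta}\right)
  + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 S}{\partial\phi^2} .
```

For example, $S = r^2$ gives $\nabla^2 S = 6$, in agreement with the Cartesian result for $x^2 + y^2 + z^2$.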
