
General theory of relativity for undergraduates
Chapter 1: Intrinsic differential geometry
Jaroslav Albert (preprint, June 2020)

Abstract
This is the first chapter of a book on general relativity intended for upper undergraduate physics
students. The goal of this chapter is to introduce students to the essential mathematical concepts
they will need to understand general relativity, such as intrinsic coordinates, tangent vector space,
basis vectors, the metric tensor, and the covariant derivative of vectors and tensors. The concepts
of covariance and invariance form the through line of this chapter. Great emphasis is placed on geometric
arguments. All formulas are derived for a 2-d surface embedded in a 3-d Euclidean space. Thanks
to this approach, any student who has taken vector calculus (or classical electrodynamics) will be
able to follow the material presented herein.

In this chapter we develop some basic tools for dealing with curved spaces. To make
matters as simple as possible, we will only work with two dimensional surfaces embedded in
a three dimensional space. This will allow us to visualize every concept introduced in this
chapter.

THE METRIC

Let us consider a 2-d surface embedded in a 3-d space labeled by the Cartesian coordinate
system Y, as shown in Figure 1A. A point p on this surface can be represented by the vector
R = R_1 ŷ_1 + R_2 ŷ_2 + R_3 ŷ_3, where ŷ_1, ŷ_2 and ŷ_3 are unit vectors along the three coordinate
axes y_1, y_2 and y_3, respectively. One way to describe this surface is to parametrize R_3
using the other two components: R_3 = R_3(R_1, R_2). While this is a legitimate scheme, it
can become problematic when the surface in question is multivalued along the y_3 direction.
Also, if we were to rotate the surface in such a way that the y_3 axis were tangent to a point
on the surface, the derivative of R_3 with respect to R_1 or R_2 at that point would be infinite.
There is an elegant solution to these problems, which is to represent the point p using
a coordinate system X defined on the surface itself. Figure 1B shows an example of such
a coordinate system. X is called an intrinsic coordinate system because it needs only the
surface itself to be defined, as opposed to the extrinsic coordinate system Y , which references
a larger space in order to describe the same surface. The visible grid lines are only meant
to showcase a finite portion of X, which we take to have infinitely many subdivisions; much
as when we use kilometers to speak about long distances, we do not need to mention the
existence of meters, centimeters, nanometers, and so on, to make our point. The location of p,
according to this system, is (4, 3); that is, in order to locate p starting from the origin (0, 0),
we must travel a coordinate distance of 4 units in the x1 direction and then the coordinate
distance of 3 units in the x2 direction. If we parametrize R using X instead of Y ,

R(x1 , x2 ) = R1 (x1 , x2 )y1 + R2 (x1 , x2 )y2 + R3 (x1 , x2 )y3 , (1)

a derivative of R with respect to x1 or x2 at any point on the surface is guaranteed to be


finite, provided the surface is smooth (which will be the case throughout this book).
Note that the coordinate distance does not necessarily correspond to actual distance,
one that is measurable with an odometer; we could have chosen to label the coordinates as

FIG. 1.

0, π, 2π, 3π, ... along x_1 and 0, 2.7^1, 2.7^2, 2.7^3, ... along x_2. Then the location of p would have
been (4π, 2.7^3). A coordinate distance should only be thought of as a label, much like the
streets and avenues in New York City; instructing someone to walk five streets North and
two avenues West does not inform the participant about the actual distance of travel, but it
does get them to their destination. To compute actual distances on the surface, we can use
the fact that R is a vector that represents a real distance. By subtracting R(x1 , x2 ) from
R(x1 + ∆x1 , x2 ), where ∆x1 is some small coordinate distance, we obtain a vector ∆s1 that
connects the points (x1 + ∆x1 , x2 ) and (x1 , x2 ), as shown in Figure 2, and is approximately
tangent to the surface. Similarly, subtracting R(x1 , x2 ) from R(x1 , x2 + ∆x2 ) produces a
vector ∆s2 that connects the points (x1 , x2 + ∆x2 ) and (x1 , x2 ), and it too is approximately
tangent to the surface. The actual squared distances between the two pairs of points are
(∆s_1 · ∆s_1) and (∆s_2 · ∆s_2), respectively. Adding these two vectors together, we obtain the
distance vector between the points (x1 , x2 ) and (x1 + ∆x1 , x2 + ∆x2 ), ∆s = ∆s1 + ∆s2 , or,
equivalently,
   
\Delta s(x_1, x_2) = \left[\frac{R(x_1 + \Delta x_1, x_2) - R(x_1, x_2)}{\Delta x_1}\right]\Delta x_1 + \left[\frac{R(x_1, x_2 + \Delta x_2) - R(x_1, x_2)}{\Delta x_2}\right]\Delta x_2. \qquad (2)
If we let ∆x1 and ∆x2 approach zero, the expressions in the square brackets become partial

FIG. 2.

derivatives of R with respect to x1 and x2 , respectively. In that same limit, ∆s1 and ∆s2 get
promoted from being only approximately tangential to exactly tangential, and ∆s(x1 , x2 )
becomes the exact distance between the points (x1 , x2 ) and (x1 + ∆x1 , x2 + ∆x2 ). Hence,
replacing ∆ with d, we can write

ds(x) = e1 (x)dx1 + e2 (x)dx2 , (3)

where

e_1(x) = \frac{\partial R(x)}{\partial x_1}, \qquad e_2(x) = \frac{\partial R(x)}{\partial x_2} \qquad (4)

are tangent vectors called basis vectors. The symbol x will serve as a shorthand notation for
(x_1, x_2). The directions of e_1 and e_2 are always along the x_1 and x_2 axes, respectively. These
vectors span the plane that is tangent to the surface at x (see Figure 3). Since the basis
vectors are formed from a real distance (numerator) and a coordinate distance (denominator),
we cannot ascribe either real or coordinate distance to them; they are a mix of the two kinds
of distance.
The square distance between the points (x1 , x2 ) and (x1 +dx1 , x2 +dx2 ) is the dot product

FIG. 3.

of ds(x) with itself:

ds(x)^2 = ds(x) \cdot ds(x) = \sum_{i=1}^{2}\sum_{j=1}^{2} [e_i(x) \cdot e_j(x)]\, dx_i\, dx_j. \qquad (5)

The two-index object [e_i(x) · e_j(x)] is called the metric tensor, or metric for short, and is
customarily denoted by g_ij(x). Since the dot product [e_i(x) · e_j(x)] is symmetric with
respect to the indices i and j, the metric comprises three distinct functions of x: g_11,
g_12 = g_21 and g_22.
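As a concrete illustration (not part of the original text), here is a short sympy sketch that computes the metric of a familiar curved surface, a sphere of radius a embedded in 3-d Euclidean space, directly from definitions (4) and (5). The parametrization and variable names are my own choices for the example.

```python
import sympy as sp

# Intrinsic coordinates on the sphere: x1 = theta (polar angle), x2 = phi (azimuth).
a, theta, phi = sp.symbols('a theta phi', positive=True)
x = [theta, phi]

# Extrinsic description R(x1, x2) in the Cartesian coordinates y1, y2, y3.
R = sp.Matrix([a * sp.sin(theta) * sp.cos(phi),
               a * sp.sin(theta) * sp.sin(phi),
               a * sp.cos(theta)])

# Basis vectors e_i = dR/dx_i, Eq. (4).
e = [R.diff(xi) for xi in x]

# Metric g_ij = e_i . e_j, Eq. (5).
g = sp.Matrix(2, 2, lambda i, j: sp.simplify(e[i].dot(e[j])))
print(g)   # expected: Matrix([[a**2, 0], [0, a**2*sin(theta)**2]])
```

The off-diagonal components vanish here because the two coordinate directions happen to be orthogonal on the sphere; nothing in the definition requires that.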

CHANGING COORDINATES, INVARIANCE AND TENSORS

We have seen in the previous section that the basis vectors, and by extension the metric, depend
on our choice of a coordinate system. However, the distance vector ds in Eq. (3) represents a
real, physical distance between two points on a surface, and as such it must be independent
of our taste in coordinates. In other words, two distinct coordinate systems X and X′ must
produce the same ds:

e_1(x)\,dx_1 + e_2(x)\,dx_2 = e'_1(x')\,dx'_1 + e'_2(x')\,dx'_2, \qquad (6)

where

e'_1(x') = \frac{\partial R'(x')}{\partial x'_1}, \qquad e'_2(x') = \frac{\partial R'(x')}{\partial x'_2}. \qquad (7)
To find the relation between the primed and unprimed basis vectors, we first need to realize that
R(x) = R'(x'). This relation does not imply that the two functions R and R' are identical,
but that R evaluated at x gives the same vector (magnitude and direction) as R' evaluated
at x'. To appreciate the subtlety of this point, let's consider a quick example.
Consider two coordinate systems, one Cartesian and the other polar, and a function
f(x_1, x_2) = 3x_1^2 - 5x_2. In polar coordinates x_1 = r cos(θ), x_2 = r sin(θ) and f = 3r^2 cos^2(θ) - 5r sin(θ).
If we let r → x'_1 and θ → x'_2, we get f' = 3x_1'^2 cos^2(x'_2) - 5x'_1 sin(x'_2). Clearly,
f and f' are not the same function; if they were, f(1, 4) = -17 would be the same as
f'(1, 4) ≈ 5.07. The reason they are not the same is that the point (x_1 = 1, x_2 = 4), which is
the input of f, is not the same point as (x'_1 = 1, x'_2 = 4), which is the input of f'. The upshot
of this exercise is that when we change coordinates, it is not the function that changes;
what changes is the manner by which we get to the desired points.
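A two-line numerical check of this example (my own illustration; values rounded as in the text):

```python
import math

f = lambda x1, x2: 3 * x1**2 - 5 * x2                                          # Cartesian form
fp = lambda x1p, x2p: 3 * x1p**2 * math.cos(x2p)**2 - 5 * x1p * math.sin(x2p)  # polar form

print(f(1, 4))                                            # -17: the point (x1=1, x2=4)
print(round(fp(1, 4), 2))                                 # 5.07: the *different* point (r=1, theta=4 rad)
print(round(fp(math.hypot(1, 4), math.atan2(4, 1)), 2))   # -17.0: same physical point, polar labels
```

The last line makes the point of the paragraph explicit: feeding the polar labels of the same physical point into f' recovers the Cartesian value.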
Getting back to the basis vectors, let us ask how they transform under the coordinate change
X → X'. Starting with the definition (7) and the relation R(x) = R'(x'), we can write

e'_1(x') = \frac{\partial R'(x')}{\partial x'_1} = \frac{\partial R(x(x'))}{\partial x'_1}, \qquad (8)

where we made it explicit in the argument of R that x can be expressed in terms of x'. (In
the above example, we had x_1 = x'_1 cos(x'_2), x_2 = x'_1 sin(x'_2).) According to the chain rule,
we can write

\frac{\partial R(x(x'))}{\partial x'_1} = \frac{\partial R(x(x'))}{\partial x_1}\frac{\partial x_1}{\partial x'_1} + \frac{\partial R(x(x'))}{\partial x_2}\frac{\partial x_2}{\partial x'_1} = e_1 \frac{\partial x_1}{\partial x'_1} + e_2 \frac{\partial x_2}{\partial x'_1}. \qquad (9)

Applying the same treatment to e'_2, we get

\frac{\partial R(x(x'))}{\partial x'_2} = \frac{\partial R(x(x'))}{\partial x_1}\frac{\partial x_1}{\partial x'_2} + \frac{\partial R(x(x'))}{\partial x_2}\frac{\partial x_2}{\partial x'_2} = e_1 \frac{\partial x_1}{\partial x'_2} + e_2 \frac{\partial x_2}{\partial x'_2}. \qquad (10)

We also need to transform dx'_1 and dx'_2. Invoking the chain rule again, we can write

dx'_1 = \frac{\partial x'_1}{\partial x_1}\, dx_1 + \frac{\partial x'_1}{\partial x_2}\, dx_2, \qquad
dx'_2 = \frac{\partial x'_2}{\partial x_1}\, dx_1 + \frac{\partial x'_2}{\partial x_2}\, dx_2. \qquad (11)

Inserting Eqs. (9), (10) and (11) into Eq. (6), and factoring out dx_1 and dx_2, we obtain

e_1(x)dx_1 + e_2(x)dx_2 = \left[ e_1\left(\frac{\partial x_1}{\partial x'_1}\frac{\partial x'_1}{\partial x_1} + \frac{\partial x_1}{\partial x'_2}\frac{\partial x'_2}{\partial x_1}\right) + e_2\left(\frac{\partial x_2}{\partial x'_1}\frac{\partial x'_1}{\partial x_1} + \frac{\partial x_2}{\partial x'_2}\frac{\partial x'_2}{\partial x_1}\right) \right] dx_1
+ \left[ e_1\left(\frac{\partial x_1}{\partial x'_1}\frac{\partial x'_1}{\partial x_2} + \frac{\partial x_1}{\partial x'_2}\frac{\partial x'_2}{\partial x_2}\right) + e_2\left(\frac{\partial x_2}{\partial x'_1}\frac{\partial x'_1}{\partial x_2} + \frac{\partial x_2}{\partial x'_2}\frac{\partial x'_2}{\partial x_2}\right) \right] dx_2. \qquad (12)

The right hand side does not quite look like the left hand side – yet. Let us examine the
factors of e_1 and e_2 inside the first bracket. We can show that the first factor is equal to
1 by noticing that it is equivalent to ∂x_1/∂x_1. To see this, let us differentiate an arbitrary
function f(x'(x)) with respect to x_1. Note that the argument is now x' as a function of x.
(In the above example, we would write x'_1 = \sqrt{x_1^2 + x_2^2} and x'_2 = \tan^{-1}(x_2/x_1).) The result
is

\frac{\partial f}{\partial x_1} = \frac{\partial f}{\partial x'_1}\frac{\partial x'_1}{\partial x_1} + \frac{\partial f}{\partial x'_2}\frac{\partial x'_2}{\partial x_1}. \qquad (13)

Since f is arbitrary, we can set it to x_1(x'(x)), which gives ∂x_1/∂x_1 = 1. Applying the same
logic, we can show that the second factor is equal to ∂x_2/∂x_1, which is zero, since x_1 and x_2
are independent by definition. Similarly, inside the second bracket in (12) the factors of e_1
and e_2 equal ∂x_1/∂x_2 = 0 and ∂x_2/∂x_2 = 1, respectively. Hence, Eq. (6) is indeed correct!
At this point it might be a good time to introduce a notation that will persist for the
rest of this book. First, let us condense expressions (9), (10) and (11) using the summation
symbol:

e'_i = \sum_{j=1}^{2} e_j\, \frac{\partial x_j}{\partial x'_i}, \qquad
dx'_i = \sum_{j=1}^{2} \frac{\partial x'_i}{\partial x_j}\, dx_j, \qquad (14)

for i = 1, 2. Notice that in the first expression, the primed variable on the right hand side
appears in the denominator, while in the second expression it appears in the numerator.
The first kind of transformation is called covariant, while the second kind of transformation
is referred to as contravariant. To distinguish them, we make a slight change in notation:
quantities that transform covariantly will have a lower index, e.g. e_i, while those quanti-
ties that transform contravariantly will have an upper index, e.g. dx_i → dx^i. (To avoid
confusion, we will put parentheses around contravariant objects whenever we wish to ex-
ponentiate them. For example, the square of dx^2 will be written as (dx^2)^2.) Thanks to
Einstein, we can simplify our notation even further by removing the summation symbol Σ
from all expressions. So, Eqs. (14) can be written as
e'_i = e_j\, \frac{\partial x^j}{\partial x'^i}, \qquad
dx'^i = \frac{\partial x'^i}{\partial x^j}\, dx^j. \qquad (15)
Einstein's argument was that since repeated indices are always summed over (unless stated
otherwise), why bother writing Σ? In this new notation, the infinitesimal square distance
in Eq. (5) can be written as

ds(x)^2 = [e_i(x) \cdot e_j(x)]\, dx^i dx^j. \qquad (16)

Notice that both i and j appear as subscripts and superscripts, and each is summed over
with its upstairs/downstairs counterpart. This is not a coincidence: only when we sum
a covariant quantity with a contravariant one is the result invariant under a coordinate
transformation. This is why the upper and lower index notation was invented in the first
place. From now on, whenever we see the same index appearing twice as a superscript
or twice as a subscript, an alarm should go off in our heads: Wrong!
We can extend the definition of covariant transformation to an object T_{i_1,...,i_N} with an N
number of indices:

T'_{i_1,...,i_N} = \frac{\partial x^{k_1}}{\partial x'^{i_1}} \cdots \frac{\partial x^{k_N}}{\partial x'^{i_N}}\, T_{k_1,...,k_N} \qquad \text{covariant.} \qquad (17)

Similarly, an object with only upper indices transforms like so:

T'^{\,i_1,...,i_N} = \frac{\partial x'^{i_1}}{\partial x^{k_1}} \cdots \frac{\partial x'^{i_N}}{\partial x^{k_N}}\, T^{k_1,...,k_N} \qquad \text{contravariant.} \qquad (18)
We can also have an object that has a mixture of upper and lower indices, e.g. T^{i_1,...,i_M}_{j_1,...,j_N}.
The transformation of such an object would be contravariant with respect to the indices
i_1, ..., i_M, and covariant with respect to the indices j_1, ..., j_N:

T'^{\,i_1,...,i_M}_{j_1,...,j_N} = \left(\frac{\partial x'^{i_1}}{\partial x^{k_1}}\right) \cdots \left(\frac{\partial x'^{i_M}}{\partial x^{k_M}}\right) \left(\frac{\partial x^{l_1}}{\partial x'^{j_1}}\right) \cdots \left(\frac{\partial x^{l_N}}{\partial x'^{j_N}}\right) T^{k_1,...,k_M}_{l_1,...,l_N} \qquad \text{mixed.} \qquad (19)

All objects that transform according to one of these three rules are called tensors. The
number of indices determines the tensor's rank. So, for example, scalars, which have zero
indices, have rank zero. Vector components have rank 1. The metric, which has rank 2,
transforms covariantly:

g'_{ij} = [e'_i \cdot e'_j] = \left[\frac{\partial x^k}{\partial x'^i}\, e_k \cdot \frac{\partial x^l}{\partial x'^j}\, e_l\right] = \frac{\partial x^k}{\partial x'^i}\frac{\partial x^l}{\partial x'^j}\, [e_k \cdot e_l] = \frac{\partial x^k}{\partial x'^i}\frac{\partial x^l}{\partial x'^j}\, g_{kl}. \qquad (20)

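The transformation law (20) is easy to verify symbolically. The sympy sketch below is my own check (not from the text): it uses the flat plane with Cartesian coordinates x and polar coordinates x' from the earlier example, builds g'_{ij} from the Jacobian, and confirms that the Cartesian metric δ_kl is carried into the polar metric diag(1, r²).

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)

# Unprimed coordinates: Cartesian (x1, x2); primed coordinates: polar (x'1, x'2) = (r, theta).
x = sp.Matrix([r * sp.cos(th), r * sp.sin(th)])   # x expressed in terms of x'
xp = sp.Matrix([r, th])

g = sp.eye(2)                   # Cartesian metric g_kl = delta_kl
J = x.jacobian(xp)              # J[k, i] = dx^k / dx'^i

# Eq. (20): g'_ij = (dx^k/dx'^i)(dx^l/dx'^j) g_kl, i.e. J^T g J in matrix form
gp = sp.simplify(J.T * g * J)
print(gp)                       # expected: Matrix([[1, 0], [0, r**2]])
```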
An important rank two tensor is the inverse metric. By inverse we mean the same as
an inverse of a matrix. Since g_{ij} is symmetric, and its determinant is always nonzero (at
least for all intents and purposes), it will always have an inverse, which we denote by g^{ij}.
The fact that we wrote the inverse of g_{ij} with upper indices is not accidental; the inverse
metric indeed transforms contravariantly. We can show this by assuming that g^{ij} transforms
contravariantly, and seeing if it satisfies the property required of an inverse matrix,
namely, that g_{ij} g^{jk} = δ_i^k, where δ_i^k is the Kronecker delta, which is 1 if k = i and
0 if k ≠ i. Beware! δ_i^k is NOT a tensor, despite the fact that it has one upper and one
lower index; it is always the identity matrix. Now, if g^{ij} transforms contravariantly, then
the product g_{ij} g^{jk} must always give the identity matrix regardless of the coordinate system.
In other words,

g'_{ij}\, g'^{jk} = \left(\frac{\partial x^q}{\partial x'^i}\frac{\partial x^l}{\partial x'^j}\, g_{ql}\right) \left(\frac{\partial x'^j}{\partial x^p}\frac{\partial x'^k}{\partial x^s}\, g^{ps}\right) = \delta_i^k. \qquad (21)

To prove that this statement is true, we must rearrange some terms:

g'_{ij}\, g'^{jk} = \left(\frac{\partial x^l}{\partial x'^j}\frac{\partial x'^j}{\partial x^p}\right) \frac{\partial x'^k}{\partial x^s}\frac{\partial x^q}{\partial x'^i}\, g_{ql}\, g^{ps}. \qquad (22)

We have shown earlier that the expression in the first parentheses is merely ∂x^l/∂x^p = δ^l_p.
Hence,

g'_{ij}\, g'^{jk} = \delta^l_p\, \frac{\partial x'^k}{\partial x^s}\frac{\partial x^q}{\partial x'^i}\, g_{ql}\, g^{ps} = \frac{\partial x'^k}{\partial x^s}\frac{\partial x^q}{\partial x'^i}\, g_{ql}\, g^{ls} = \frac{\partial x'^k}{\partial x^s}\frac{\partial x^q}{\partial x'^i}\, \delta_q^s = \frac{\partial x'^k}{\partial x^q}\frac{\partial x^q}{\partial x'^i} = \delta_i^k. \qquad (23)

Indeed, if and only if we assume the inverse metric to be a contravariant tensor does the identity
g_{ij} g^{jk} = δ_i^k hold in any coordinate system.

As a final note, we should say something about the transformation matrices ∂x^k/∂x'^i
and ∂x'^i/∂x^k. Because they have indices, one may be tempted to ask how they transform.
The answer is: they do not, as they are mere conversion factors between two coordinate
systems. Just like δ_i^k, they are an exception to the rule. The good news is that these are the
only exceptions we will encounter in this book. That is not to say that every object with
indices we encounter will transform as a tensor – quite the contrary – but it will transform
somehow, unlike ∂x^k/∂x'^i, ∂x'^i/∂x^k and δ_i^k, which do not transform at all.

DUAL BASIS

So far, we have discussed one example of a vector: the distance vector ds. The components
of this vector were the differentials dx^i, which transform contravariantly. However, what if
we wanted (or had no choice but) to express a vector in terms of vector components that
transform covariantly? Could we still use the basis e_i to describe it? Yes and no. Yes,
in the sense that, since the basis vectors e_i at any point x span the tangent plane, they
can be used to describe any vector in that plane. And no, in the sense that the sum V_i e_i
is not invariant! To make it invariant, we can define the dual basis e^i. These are basis vectors that
transform contravariantly. There are potentially infinitely many ways to define such a basis;
all we need to do is write e^i = T^{ij} e_j, where T^{ij} is an arbitrary contravariant tensor. However,
it is more natural to use the inverse metric g^{ij}. One reason for this is that it makes e^i and
e_j orthogonal, i.e. e^i · e_j = δ^i_j. The proof of this is simple:

e_i \cdot e^j = e_i \cdot (g^{jk} e_k) = (e_i \cdot e_k)\, g^{jk} = g_{ik}\, g^{jk} = \delta_i^{\,j}, \qquad (24)

where in the last step we used the fact that g^{jk} = g^{kj}. Another reason for defining the dual
basis this way is that the dot product e^i · e^j gives us the inverse metric:

e^i \cdot e^j = g^{ik} g^{jl}\, (e_k \cdot e_l) = g^{ik}\, (g^{jl} g_{kl}) = g^{ik}\, \delta_k^{\,j} = g^{ij}. \qquad (25)

So, from now on, we will use this definition for the dual basis:

e^i = g^{ik}\, e_k. \qquad (26)

LOWERING AND RAISING INDICES

Suppose we were given a vector field with contravariant vector components, V = V^i e_i, but
for whatever reason we wanted to use covariant components. What would be the relationship
between the contra- and co-variant vector components V^i and V_i? First of all, in order to
ensure that V remains invariant, we need to change the basis to the dual basis: e_i → e^i = g^{ik} e_k.
But this fixes the form of the covariant vector components to V_i = g_{ij} V^j. Proof:

V_i e^i = (g_{ij} V^j)(g^{ik} e_k) = (g_{ij} g^{ik})\, V^j e_k = \delta_j^{\,k}\, V^j e_k = V^j e_j. \qquad (27)

Hence, any time we want to change an index of a tensor from contra- to co-variant or vice
versa, we follow these procedures, called lowering and raising indices:

T^i_{\ j} = T^{ik}\, g_{kj} \qquad (28)

and

T^{ij} = T^i_{\ k}\, g^{kj}. \qquad (29)

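As a minimal numerical sketch of lowering and raising an index (my own illustration, using the polar-plane metric diag(1, ρ²) that appears later in this chapter): note that the contraction V^i V_i is unchanged by which form we store.

```python
import numpy as np

rho = 2.0
g = np.diag([1.0, rho**2])          # g_ij for polar coordinates on the flat plane
g_inv = np.linalg.inv(g)            # g^ij

V_up = np.array([0.3, 0.5])         # contravariant components V^i
V_down = g @ V_up                   # lowering: V_i = g_ij V^j
V_up_again = g_inv @ V_down         # raising:  V^i = g^ij V_j

print(V_down)                            # [0.3 2. ]
print(np.allclose(V_up, V_up_again))     # True
print(V_up @ V_down)                     # the invariant V^i V_i
```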
COVARIANT DERIVATIVE

Consider a vector field on a surface and two vectors, V and V′, located at points x and x′,
respectively, as shown in Figure 4 A. How does V differ from V′ and how can we quantify
this difference? If we were working in the three dimensional space in which the surface
is embedded, we would simply subtract V from V′ and call it a day. However, intrinsic
geometry demands that we work with objects that are intrinsic to the surface; in our case,
we must work with the basis vectors. This poses a problem: V is described by the basis at x, i.e.
V^i e_i (Figure 4 B), while V′ is described by the basis at x′, i.e. V′^i e′_i (Figure 4 C). Since the
two planes that accommodate the bases e_i and e′_i are not in principle identical, it is impossible
to describe the difference V′ − V in terms of either basis. There is however a procedure
that allows us to describe the part of the difference V′ − V that is relevant to the surface
at x. First, let us take V′ and parallel transport it from x′ to x along a straight line that
passes through both points, as shown in Figure 4 D. The new V′, now at the point x, will be
labeled V′_∥ to signify via the subscript ∥ that it has been parallel-transported. Next, let us
project V′_∥ onto the x_1 and x_2 axes of the plane at x, but also onto the axis normal to that
plane (Figure 4 E and F). As a result, V′_∥ can be written as V′_∥ = V′^1_∥ e_1 + V′^2_∥ e_2 + Ṽ′_∥ n, where
Ṽ′_∥ is the component of V′_∥ along the normal. Or, in the Einstein summation convention,

V'_\parallel = V'^{\,i}_\parallel\, e_i + \tilde V'_\parallel\, n. \qquad (30)

The difference V′_∥ − V then reads

V'_\parallel - V = (V'^{\,i}_\parallel - V^i)\, e_i + \tilde V'_\parallel\, n. \qquad (31)

If we let x′ = (x_1 + ∆x_1, x_2), Eq. (31) can be written as

\left[\frac{V'_{\parallel 1} - V}{\Delta x_1}\right] \Delta x_1 = (V'^{\,i}_{\parallel 1} - V^i)\, e_i + \tilde V'_{\parallel 1}\, n, \qquad (32)

FIG. 4.

where the index ∥1 tells us that we parallel-transported the vector along the x_1 axis. The
expression in the brackets on the left hand side should look familiar to you: in the limit as
∆x_1 = dx_1 → 0, it becomes the definition of the partial derivative of a vector. Hence,

\frac{\partial V}{\partial x^k} = \frac{(V'^{\,i}_{\parallel k} - V^i)}{dx^k}\, e_i + \frac{\tilde V'_{\parallel k}}{dx^k}\, n, \qquad (33)

where k = 1, 2 labels the axes in the direction of which the vector was parallel-transported.
Parallel-transporting the basis vectors, we get

\frac{\partial e_i}{\partial x^k} = \left[\frac{(e'^{\,j}_{i\parallel k} - e^j_i)}{dx^k}\right] e_j + \frac{\tilde e'_{i\parallel k}}{dx^k}\, n. \qquad (34)

The first term in the brackets, e'^{\,j}_{i\parallel k}, is the jth component of the parallel-transported basis vector e′_i,
i.e. its projection onto the axis x_j; and e^j_i = δ^j_i by definition. Conventionally, the term
in the square brackets on the right hand side is given the symbol

\Gamma^j_{ki} = \frac{(e'^{\,j}_{i\parallel k} - e^j_i)}{dx^k}, \qquad (35)

called the Christoffel symbol. Thus, we can write the partial derivative of a vector as

\frac{\partial V}{\partial x^k} = \frac{\partial (V^i e_i)}{\partial x^k} = \frac{\partial V^i}{\partial x^k}\, e_i + V^i\, \frac{\partial e_i}{\partial x^k} = \frac{\partial V^i}{\partial x^k}\, e_i + \Gamma^j_{ki}\, V^i\, e_j + \frac{\tilde e'_{i\parallel k}}{dx^k}\, V^i\, n. \qquad (36)

Since i and j in the term Γ^j_{ki} V^i e_j are dummy indices, we can interchange them: Γ^j_{ki} V^i e_j =
Γ^i_{kj} V^j e_i. This gives us a more concise expression for the partial derivative of a vector:

\frac{\partial V}{\partial x^k} = \left[\frac{\partial V^i}{\partial x^k} + \Gamma^i_{kj}\, V^j\right] e_i + \frac{\tilde e'_{i\parallel k}}{dx^k}\, V^i\, n. \qquad (37)
Comparing this expression with Eq. (33), we can see that

\frac{(V'^{\,i}_{\parallel k} - V^i)}{dx^k} = \frac{\partial V^i}{\partial x^k} + \Gamma^i_{kj}\, V^j. \qquad (38)

The term on the right hand side is called the covariant derivative and will from now on bear
the symbol D_k:

D_k V^i = \frac{\partial V^i}{\partial x^k} + \Gamma^i_{kj}\, V^j \qquad \text{Covariant derivative.} \qquad (39)
The covariant derivative tells us how vector components change along the surface. The
presence of the additional term Γ^i_{kj} V^j is due to the fact that the basis vectors themselves depend
on the coordinates x.
Since the covariant derivative is the projection of the partial derivative of V onto the
tangent plane, we can define the operator

D_k = e_i \left( e^i \cdot \frac{\partial}{\partial x^k} \right). \qquad (40)

The order of operations is always right to left. In other words, first we take the partial
derivative with respect to x^k, then take the dot product of e^i with the differentiated vector,
and finally multiply by e_i. The result is

D_k V = e_i \left( e^i \cdot \frac{\partial}{\partial x^k}(V^j e_j) \right)
      = e_i \left( e^i \cdot \left[ (D_k V^j)\, e_j + \frac{\tilde e'_{j\parallel k}}{dx^k}\, V^j\, n \right] \right)
      = e_i \left[ (D_k V^j)(e^i \cdot e_j) + \frac{\tilde e'_{j\parallel k}}{dx^k}\, V^j\, (e^i \cdot n) \right]
      = e_i\, (D_k V^j)\, \delta^i_j = (D_k V^i)\, e_i, \qquad (41)

where the orthogonality relation (e^i \cdot n) = 0 killed the normal component.


We could also ask: what is the derivative of a vector that is expressed in terms of the dual
basis, V_i e^i? To answer, we need to know what the partial derivative of the dual basis is. The
relation between the basis and dual basis, given by Eq. (26), can help us:

\frac{\partial e_i}{\partial x^k} = \frac{\partial}{\partial x^k}\left(g_{ij}\, e^j\right) = \frac{\partial}{\partial x^k}\left((e_i \cdot e_j)\, e^j\right) = \left(\frac{\partial e_i}{\partial x^k} \cdot e_j\right) e^j + \left(e_i \cdot \frac{\partial e_j}{\partial x^k}\right) e^j + (e_i \cdot e_j)\, \frac{\partial e^j}{\partial x^k}. \qquad (42)

The first term in the brackets on the right hand side is equal to

\frac{\partial e_i}{\partial x^k} \cdot e_j = \Gamma^q_{ik}\, (e_q \cdot e_j) + \frac{\tilde e'_{i\parallel k}}{dx^k}\, (n \cdot e_j) = \Gamma^q_{ik}\, g_{qj}. \qquad (43)

Similarly,

e_i \cdot \frac{\partial e_j}{\partial x^k} = \Gamma^q_{jk}\, g_{iq}. \qquad (44)
Thus, we can write Eq. (42) as

\Gamma^m_{ki}\, e_m + \frac{\tilde e'_{i\parallel k}}{dx^k}\, n = \Gamma^m_{ki}\, g_{mj}\, e^j + \Gamma^m_{kj}\, g_{im}\, e^j + g_{ij}\, \frac{\partial e^j}{\partial x^k}. \qquad (45)

Notice that the first term on the right hand side is equivalent to Γ^m_{ki} e_m (since g_{mj} e^j = e_m),
which is identical to the first term on the left hand side. So, solving for g_{ij}\, ∂e^j/∂x^k, we get

g_{ij}\, \frac{\partial e^j}{\partial x^k} = -\Gamma^m_{kj}\, g_{im}\, e^j + \frac{\tilde e'_{i\parallel k}}{dx^k}\, n. \qquad (46)

Finally, multiplying both sides by g^{qi} leads to

\frac{\partial e^q}{\partial x^k} = -\Gamma^q_{kj}\, e^j + g^{qi}\, \frac{\tilde e'_{i\parallel k}}{dx^k}\, n. \qquad (47)

ACCELERATION AND DIRECTIONAL DERIVATIVE

We have derived Eq. (39) by taking a partial derivative of a vector field with respect
to xk . However, we could imagine taking a derivative of the same vector field with respect
to some parameter t: d/dt. To see how, let us imagine a small vehicle moving along a line
on the surface, as in Figure (5). Let us also equip this vehicle with a speedometer and a
little pendulum, perhaps hanging from the rear view mirror (in place of one of those aroma
pine trees). From the data the speedometer records during the trip, we can calculate the
magnitude of the acceleration at any time. The deviations from equilibrium of the pendulum
allow us to calculate the acceleration in directions parallel to the tangent plane at any point
on the curve. Thus, by reading these two instruments, we can reconstruct the acceleration
vector along the surface at any point on the curve. But what if instead of these instruments
we had the basis and the mathematical representation of the curve? Could we calculate the
acceleration at any point on the curve? Yes. Here is how. We already know that the vector
ds = dx^i e_i is tangent to the surface and that its magnitude gives the real distance between

FIG. 5.

the points (x_1, x_2) and (x_1 + dx_1, x_2 + dx_2). If we parametrize the curve using time, such
that dx^i = u^i(t)\,dt, where u^i = dx^i/dt, then the tangent velocity vector reads

v(t) = \frac{ds}{dt} = u^i\, e_i. \qquad (48)
Taking another time derivative, we obtain the acceleration vector

a = \frac{dv}{dt} = \frac{d}{dt}\left[u^i(t)\, e_i(x(t))\right] = \frac{du^i(t)}{dt}\, e_i(x(t)) + u^i(t)\, \frac{d}{dt} e_i(x(t)). \qquad (49)
The second term on the right hand side can be written as

u^i(t)\, \frac{d}{dt} e_i(x(t)) = u^i(t)\, \frac{\partial e_i(x)}{\partial x^k}\, \frac{dx^k}{dt} = \Gamma^j_{ki}\, u^k u^i\, e_j + \frac{\tilde e'_{i\parallel k}}{dx^k}\, u^k u^i\, n. \qquad (50)
So, the acceleration vector is

a = \frac{dv(t)}{dt} = \left[\frac{du^i(t)}{dt} + \Gamma^i_{kj}\, u^k u^j\right] e_i + \frac{\tilde e'_{i\parallel k}}{dx^k}\, u^k u^i\, n. \qquad (51)
Back to the question we asked at the beginning of this section: Can we calculate the
acceleration at any point on the curve using only the curve itself and the basis? Well, what
we do know in Eq. (51) is ui (t) and dui (t)/dt, but not Γjki . We will see in the next section
that Γikj can in fact be expressed entirely in terms of the metric, and hence in terms of the
basis. Once we have Γjki explicitly in terms of the basis, we can indeed answer the question
with a yes.

We can also ask how an arbitrary vector field V changes along a curve. Using the relation

\frac{dV^i}{dt} = \frac{\partial V^i}{\partial x^k}\, \frac{dx^k}{dt} = \frac{\partial V^i}{\partial x^k}\, u^k, \qquad (52)

we can write

\frac{dV}{dt} = \frac{\partial V}{\partial x^k}\, \frac{dx^k}{dt} = \left[ (D_k V^i)\, e_i + \frac{\tilde e'_{i\parallel k}}{dx^k}\, V^i\, n \right] u^k. \qquad (53)

So, the change of V along the surface is given by the expression u^k (D_k V^i)\, e_i, where u^k D_k
is called the directional derivative:

u^k D_k V^i = u^k \left( \frac{\partial V^i}{\partial x^k} + \Gamma^i_{kj}\, V^j \right) \qquad \text{Directional derivative.} \qquad (54)

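As a quick sanity check of Eq. (51) (my own worked example, using the polar Christoffel symbols Γ^ρ_θθ = −ρ and Γ^θ_ρθ = Γ^θ_θρ = 1/ρ that are derived later in Eq. (122)): for uniform circular motion on the flat plane, ρ = const and θ = ωt, so u^ρ = 0, u^θ = ω and du^i/dt = 0; the entire in-plane acceleration then comes from the Christoffel term and reproduces the familiar centripetal acceleration.

```python
import numpy as np

rho0, omega = 2.0, 0.5
u = np.array([0.0, omega])           # (u^rho, u^theta) for uniform circular motion
du_dt = np.array([0.0, 0.0])         # the coordinate velocities are constant

# Christoffel symbols of the polar metric diag(1, rho^2); see Eq. (122).
Gamma = np.zeros((2, 2, 2))          # Gamma[i, k, j] = Gamma^i_{kj}
Gamma[0, 1, 1] = -rho0               # Gamma^rho_{theta theta}
Gamma[1, 0, 1] = Gamma[1, 1, 0] = 1.0 / rho0   # Gamma^theta_{rho theta}

# Tangential part of Eq. (51): a^i = du^i/dt + Gamma^i_{kj} u^k u^j
a = du_dt + np.einsum('ikj,k,j->i', Gamma, u, u)
print(a)   # [-0.5  0. ]  ->  a^rho = -rho0 * omega**2, the centripetal acceleration
```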
CHRISTOFFEL SYMBOL

Let's recall that g_{ij} = e_i \cdot e_j and take a derivative of g_{ij} with respect to x^k:

\frac{\partial g_{ij}}{\partial x^k} = \frac{\partial (e_i \cdot e_j)}{\partial x^k} = \frac{\partial e_i}{\partial x^k} \cdot e_j + e_i \cdot \frac{\partial e_j}{\partial x^k}. \qquad (55)

According to Eqs. (43) and (44), the right hand side can be written as

\frac{\partial g_{ij}}{\partial x^k} = \Gamma^q_{ik}\, g_{qj} + \Gamma^q_{jk}\, g_{iq}. \qquad (56)

Since expression (56) is true for any values of i, j and k, we can also write

\frac{\partial g_{kj}}{\partial x^i} = \Gamma^q_{ki}\, g_{qj} + \Gamma^q_{ji}\, g_{kq}, \qquad (57)

\frac{\partial g_{ik}}{\partial x^j} = \Gamma^q_{ij}\, g_{qk} + \Gamma^q_{kj}\, g_{iq}, \qquad (58)
where in Eq. (57) we swapped i with k, and in Eq. (58) we switched j and k. If we add
Eqs. (56) and (57), and subtract Eq. (58), we end up with

\frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{kj}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^j} = (\Gamma^q_{ik}\, g_{qj} + \Gamma^q_{ki}\, g_{qj}) + (\Gamma^q_{jk}\, g_{iq} - \Gamma^q_{kj}\, g_{iq}) + (\Gamma^q_{ji}\, g_{kq} - \Gamma^q_{ij}\, g_{qk}). \qquad (59)

The terms on the right hand side of Eq. (59) have been grouped that way to call attention
to the following fact: the terms in the first parentheses add up to 2Γ^q_{ik} g_{qj}, while those in the
other two parentheses cancel each other out, which gives us

\frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{kj}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^j} = 2\,\Gamma^q_{ik}\, g_{qj}. \qquad (60)

This simplification is possible thanks to the symmetries g_{ij} = g_{ji} and Γ^i_{jk} = Γ^i_{kj}. The first
one we have proven earlier, and the second we will show to be true in a moment. For now,
let's solve for the Christoffel symbol by multiplying both sides of Eq. (60) by the inverse
metric g^{mj} and summing over j:

g^{mj}\left( \frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{kj}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^j} \right) = 2\,\Gamma^q_{ik}\, g^{mj} g_{qj} = 2\,\Gamma^q_{ik}\, \delta_q^m = 2\,\Gamma^m_{ik}. \qquad (61)

Finally, dividing by 2, we arrive at the final expression

\Gamma^m_{ik} = \frac{1}{2}\, g^{mj}\left( \frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{kj}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^j} \right). \qquad (62)

This is great! The Christoffel symbol can be computed entirely from the metric, or, equiv-
alently, from the basis.
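Equation (62) translates directly into code. The sympy helper below is my own illustration (the function name and conventions are mine): it computes Γ^m_ik from an arbitrary metric, and applied to the polar metric diag(1, ρ²) it reproduces the symbols quoted later in Eq. (122).

```python
import sympy as sp

def christoffel(g, coords):
    """Gamma[m][i][k] = 1/2 g^{mj} (d_k g_ij + d_i g_kj - d_j g_ik), Eq. (62)."""
    n = len(coords)
    g_inv = g.inv()
    return [[[sp.simplify(sp.Rational(1, 2) * sum(
                g_inv[m, j] * (sp.diff(g[i, j], coords[k])
                               + sp.diff(g[k, j], coords[i])
                               - sp.diff(g[i, k], coords[j]))
                for j in range(n)))
              for k in range(n)] for i in range(n)] for m in range(n)]

rho, theta = sp.symbols('rho theta', positive=True)
Gamma = christoffel(sp.diag(1, rho**2), [rho, theta])
print(Gamma[0][1][1], Gamma[1][0][1])   # -rho, 1/rho: the only independent nonzero symbols
```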
Before moving on, we must circle back and check that the symmetry Γ^i_{jk} = Γ^i_{kj} that we
used to arrive at Eq. (62) holds. The easiest way to do this is to rewrite Eq. (44) by
multiplying both sides by g^{mi} and summing over i, which gives

g^{mi}\left( e_i \cdot \frac{\partial e_j}{\partial x^k} \right) = e^m \cdot \frac{\partial e_j}{\partial x^k} = \Gamma^m_{jk}. \qquad (63)

Recall from Eq. (4) that

e_i = \frac{\partial R}{\partial x^i}. \qquad (64)
If we insert it into ∂e_j/∂x^k, we get

\Gamma^m_{jk} = e^m \cdot \frac{\partial^2 R}{\partial x^k\, \partial x^j}. \qquad (65)

Since partial derivatives can be carried out in any order, we see that the Christoffel symbol
is indeed symmetric with respect to its lower indices.

COVARIANT DERIVATIVE UNDER A COORDINATE TRANSFORMATION

We have seen that an object of the form T^{ij} K_{ij} is invariant under any coordinate transfor-
mation if, and only if, T^{ij} transforms contravariantly and K_{ij} transforms covariantly. This
is why tensors are useful and why we should be asking whether the covariant derivative of a
vector, D_k V^i, is a tensor. Just because it has one lower index and one upper index does not
make it automatically a tensor. We have to check.

Let's start with the first term on the right hand side of Eq. (39). The transformations
of a derivative and a vector component are

\frac{\partial}{\partial x'^k} = \frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}, \qquad (66)

and

V'^i = \frac{\partial x'^i}{\partial x^j}\, V^j. \qquad (67)
To understand where Eq. (66) came from, we have to imagine the derivative acting on some
arbitrary function, f(x(x')), as in Eq. (13). So, we have that

\frac{\partial V'^i}{\partial x'^k} = \frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left( \frac{\partial x'^i}{\partial x^j}\, V^j \right) = \frac{\partial x'^i}{\partial x^j}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial V^j}{\partial x^q} + \left[\frac{\partial}{\partial x^q}\left(\frac{\partial x'^i}{\partial x^j}\right)\right] \frac{\partial x^q}{\partial x'^k}\, V^j. \qquad (68)

If we only had the first term on the right hand side, ∂V i /∂xk would indeed be a tensor.
However, the second term spoils the transformation. So far, things are not looking good
for the covariant derivative’s chances of being a tensor. But, let’s be diligent and check the
transformation of the second term on the right hand side of Eq. (39), starting with the
Christoffel symbol. We will again work with the relation

\Gamma^i_{jk} = e^i \cdot \frac{\partial e_j}{\partial x^k}. \qquad (69)

Let's write down the transformation of each term:

e'^i = \frac{\partial x'^i}{\partial x^m}\, e^m, \qquad
\frac{\partial}{\partial x'^k} = \frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}, \qquad
e'_j = \frac{\partial x^l}{\partial x'^j}\, e_l. \qquad (70)

And so,

\Gamma'^i_{jk} = e'^i \cdot \frac{\partial e'_j}{\partial x'^k} = \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, e^m \cdot \frac{\partial}{\partial x^q}\left( \frac{\partial x^l}{\partial x'^j}\, e_l \right)
= \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, e^m \cdot \left[ \frac{\partial x^l}{\partial x'^j}\, \frac{\partial e_l}{\partial x^q} + e_l\, \frac{\partial}{\partial x^q}\frac{\partial x^l}{\partial x'^j} \right]
= \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial x^l}{\partial x'^j}\, e^m \cdot \frac{\partial e_l}{\partial x^q} + \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, (e^m \cdot e_l)\, \frac{\partial}{\partial x^q}\frac{\partial x^l}{\partial x'^j}
= \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial x^l}{\partial x'^j}\, \Gamma^m_{lq} + \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left(\frac{\partial x^m}{\partial x'^j}\right), \qquad (71)

where the relation e^m \cdot e_l = \delta^m_l allowed us to simplify the last line. Here again, we discover
that the Christoffel symbol would transform like a tensor if it were not for the second term.
Multiplying Eq. (71) by

V'^j = \frac{\partial x'^j}{\partial x^n}\, V^n, \qquad (72)
and adding Eq. (68), we obtain the transformed covariant derivative of V^i:

D'_k V'^i = \frac{\partial x'^i}{\partial x^j}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial V^j}{\partial x^q} + \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial x^l}{\partial x'^j}\, \frac{\partial x'^j}{\partial x^n}\, \Gamma^m_{lq}\, V^n
+ \left[\frac{\partial}{\partial x^q}\left(\frac{\partial x'^i}{\partial x^j}\right)\right] \frac{\partial x^q}{\partial x'^k}\, V^j + \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \left[\frac{\partial}{\partial x^q}\left(\frac{\partial x^m}{\partial x'^j}\right)\right] \frac{\partial x'^j}{\partial x^n}\, V^n. \qquad (73)

Thanks to the relation

\frac{\partial x^l}{\partial x'^j}\, \frac{\partial x'^j}{\partial x^n} = \frac{\partial x^l}{\partial x^n} = \delta^l_n, \qquad (74)
we can write

\frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial x^l}{\partial x'^j}\, \frac{\partial x'^j}{\partial x^n}\, \Gamma^m_{lq}\, V^n = \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \Gamma^m_{lq}\, V^l. \qquad (75)
If we also replace m with j, the first two terms on the right hand side of Eq. (73) can be
expressed as

\frac{\partial x'^i}{\partial x^j}\, \frac{\partial x^q}{\partial x'^k}\, \left( \frac{\partial V^j}{\partial x^q} + \Gamma^j_{lq}\, V^l \right). \qquad (76)
The last two terms can be simplified by swapping n and j, which gives

\left[ \frac{\partial}{\partial x^q}\left(\frac{\partial x'^i}{\partial x^j}\right) \frac{\partial x^q}{\partial x'^k} + \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left(\frac{\partial x^m}{\partial x'^n}\right) \frac{\partial x'^n}{\partial x^j} \right] V^j. \qquad (77)
The expression in the brackets can be further simplified by writing the second term as

\frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left( \frac{\partial x'^i}{\partial x^m}\, \frac{\partial x^m}{\partial x'^n} \right) \frac{\partial x'^n}{\partial x^j} - \frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left(\frac{\partial x'^i}{\partial x^m}\right) \frac{\partial x^m}{\partial x'^n}\, \frac{\partial x'^n}{\partial x^j}. \qquad (78)
The first term in Eq. (78) is zero by virtue of expression (74), while the second term can be
simplified to this:

-\frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left(\frac{\partial x'^i}{\partial x^m}\right) \frac{\partial x^m}{\partial x'^n}\, \frac{\partial x'^n}{\partial x^j} = -\frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left(\frac{\partial x'^i}{\partial x^m}\right) \delta^m_j = -\frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q}\left(\frac{\partial x'^i}{\partial x^j}\right), \qquad (79)
which cancels the first term in Eq. (77). So, finally, we arrive at the expression
D'_k V'^i = \frac{\partial x'^i}{\partial x^j}\, \frac{\partial x^q}{\partial x'^k}\, \left( \frac{\partial V^j}{\partial x^q} + \Gamma^j_{lq}\, V^l \right) = \frac{\partial x'^i}{\partial x^j}\, \frac{\partial x^q}{\partial x'^k}\, D_q V^j. \qquad (80)
The covariant derivative of a vector is indeed a tensor.

COVARIANT DERIVATIVE OF TENSORS

Consider the product V^i U^j, where V^i and U^j are components of the vectors
V = V^i e_i and U = U^j e_j, respectively. How does the product V^i U^j change when going from
x to x + dx? Referring to Eq. (33), we know that

\frac{(V'^{\,i}_{\parallel k} - V^i)}{dx^k} = \frac{\partial V^i}{\partial x^k} + \Gamma^i_{kq}\, V^q. \qquad (81)

Solving for V'^{\,i}_{\parallel k}, we obtain

V'^{\,i}_{\parallel k} = V^i + \left( \frac{\partial V^i}{\partial x^k} + \Gamma^i_{kq}\, V^q \right) dx^k. \qquad (82)
We get the same expression for U'^{\,j}_{\parallel k}:

U'^{\,j}_{\parallel k} = U^j + \left( \frac{\partial U^j}{\partial x^k} + \Gamma^j_{kl}\, U^l \right) dx^k. \qquad (83)

Thus, to first order in dx^k,

V'^{\,i}_{\parallel k}\, U'^{\,j}_{\parallel k} - V^i U^j = \left( U^j \frac{\partial V^i}{\partial x^k} + V^i \frac{\partial U^j}{\partial x^k} + \Gamma^i_{kq}\, V^q U^j + \Gamma^j_{kl}\, V^i U^l \right) dx^k
= \left( \frac{\partial (V^i U^j)}{\partial x^k} + \Gamma^i_{kq}\, V^q U^j + \Gamma^j_{kl}\, V^i U^l \right) dx^k. \qquad (84)

Dividing both sides by dx^k, we get an expression for the covariant derivative of V^i U^j:

D_k (V^i U^j) = \frac{\partial (V^i U^j)}{\partial x^k} + \Gamma^i_{kq}\, V^q U^j + \Gamma^j_{kl}\, V^i U^l. \qquad (85)

This result establishes the validity of the product rule (the Leibniz rule) for covariant derivatives:

D_k (V^i U^j) = (D_k V^i)\, U^j + V^i\, (D_k U^j) = \left( \frac{\partial V^i}{\partial x^k} + \Gamma^i_{kq}\, V^q \right) U^j + V^i \left( \frac{\partial U^j}{\partial x^k} + \Gamma^j_{kl}\, U^l \right). \qquad (86)

We can also ask what the covariant derivative is for general tensors, e.g. T^{ij}, that
are not mere products of two vector components. It turns out that the expression in Eq.
(85) applies to them as well. To prove this, consider a scalar φ that is composed of some
arbitrary contravariant tensor T^{ij} and two arbitrary covariant vector components V_i and U_j:
φ = T^{ij} V_i U_j. Let's take a derivative of both sides with respect to x^k:

\frac{\partial \phi}{\partial x^k} = \frac{\partial T^{ij}}{\partial x^k}\, V_i U_j + T^{ij}\, \frac{\partial V_i}{\partial x^k}\, U_j + T^{ij}\, V_i\, \frac{\partial U_j}{\partial x^k}. \qquad (87)

It is straightforward to show that the left hand side transforms like a covariant vector. Just
replace the partial derivative with the expression in Eq. (66) to get

\frac{\partial \phi'}{\partial x'^k} = \frac{\partial x^q}{\partial x'^k}\, \frac{\partial \phi}{\partial x^q}. \qquad (88)

(Recall that φ'(x') = φ(x).) Logically, the right hand side should also transform like a
covariant vector. However, it is a useful exercise to show this explicitly. The reader is
encouraged to do this. The right hand side of (87) can also be expressed in terms of the
covariant derivatives of Vi and Uj :
\frac{\partial \phi}{\partial x^k} = \frac{\partial T^{ij}}{\partial x^k}\, V_i U_j + T^{ij} \left( D_k V_i + \Gamma^q_{ki}\, V_q \right) U_j + T^{ij}\, V_i \left( D_k U_j + \Gamma^l_{kj}\, U_l \right)
= \left( \frac{\partial T^{ij}}{\partial x^k}\, V_i U_j + T^{ij}\, \Gamma^q_{ki}\, V_q U_j + T^{ij}\, \Gamma^l_{kj}\, V_i U_l \right) + T^{ij}\, (D_k V_i)\, U_j + T^{ij}\, V_i\, (D_k U_j). \qquad (89)

If we swap i and q in the second term inside the brackets, and switch l and j in the third
term inside the brackets, we can write

\frac{\partial \phi}{\partial x^k} = \left( \frac{\partial T^{ij}}{\partial x^k} + T^{qj}\, \Gamma^i_{kq} + T^{il}\, \Gamma^j_{kl} \right) V_i U_j + T^{ij}\, (D_k V_i)\, U_j + T^{ij}\, V_i\, (D_k U_j). \qquad (90)

The expression in the brackets looks exactly like the expression in Eq. (85), but now, instead
of a simple product of vectors, we have a general rank two tensor. If we define this expression,

D_k T^{ij} = \frac{\partial T^{ij}}{\partial x^k} + T^{qj}\, \Gamma^i_{kq} + T^{il}\, \Gamma^j_{kl}, \qquad (91)

as the covariant derivative of the tensor T^{ij}, then the entire expression on the right hand
side of Eq. (90) is nothing but the application of the product rule for covariant derivatives:

D_k \left( T^{ij} V_i U_j \right) = (D_k T^{ij})\, V_i U_j + T^{ij}\, (D_k V_i)\, U_j + T^{ij}\, V_i\, (D_k U_j). \qquad (92)

We could go through the same exercise with a covariant tensor T_{ij} and find that its covariant
derivative has the form

D_k T_{ij} = \frac{\partial T_{ij}}{\partial x^k} - T_{qj}\, \Gamma^q_{ki} - T_{iq}\, \Gamma^q_{kj}. \qquad (93)
Similarly, for a mixed tensor T^i_j, we would find that

D_k T^i_{\ j} = \frac{\partial T^i_{\ j}}{\partial x^k} + T^q_{\ j}\, \Gamma^i_{kq} - T^i_{\ q}\, \Gamma^q_{kj}. \qquad (94)

In fact, by the same logic, we can show that the covariant derivative of an arbitrary mixed
tensor T^{i_1,...,i_M}_{j_1,...,j_N} reads

D_k T^{i_1,...,i_M}_{j_1,...,j_N} = \frac{\partial}{\partial x^k} T^{i_1,...,i_M}_{j_1,...,j_N} + \sum_{m=1}^{M} \Gamma^{i_m}_{kq}\, T^{i_1,...,q,...,i_M}_{j_1,...,j_N} - \sum_{n=1}^{N} \Gamma^{q}_{kj_n}\, T^{i_1,...,i_M}_{j_1,...,q,...,j_N}, \qquad (95)

where in each term of the sums the index q replaces the m-th upper or n-th lower index.

In this book, however, we will not need to worry about tensors with more than four
indices.
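For concreteness, here is Eq. (95) specialized to one upper and one lower index, i.e. Eq. (94), written as a small numpy routine. This is my own sketch (names and array conventions are mine), and the quick check uses the fact that the Kronecker delta, whose partial derivatives vanish, is covariantly constant.

```python
import numpy as np

def cov_deriv_mixed(dT, T, Gamma):
    """D_k T^i_j per Eq. (94): d_k T^i_j + Gamma^i_{kq} T^q_j - Gamma^q_{kj} T^i_q.
    dT[k, i, j] = partial_k T^i_j; Gamma[i, k, j] = Gamma^i_{kj}. Result has index order [k, i, j]."""
    plus = np.einsum('ikq,qj->kij', Gamma, T)
    minus = np.einsum('qkj,iq->kij', Gamma, T)
    return dT + plus - minus

# Quick check with the polar Christoffel symbols of Eq. (122): D_k delta^i_j = 0.
rho0 = 2.0
Gamma = np.zeros((2, 2, 2))
Gamma[0, 1, 1] = -rho0
Gamma[1, 0, 1] = Gamma[1, 1, 0] = 1.0 / rho0
print(np.allclose(cov_deriv_mixed(np.zeros((2, 2, 2)), np.eye(2), Gamma), 0))   # True
```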

TENSORS AND THEIR DERIVATIVES AS INVARIANTS

We saw earlier that the projection of a differentiated vector can be obtained by applying
the operator in Eq. (40), which gave us Dk V i ei . Suppose we wanted to know how Dk V i ei
changes when going from some point x to x+dx. It makes sense to apply the same operation
to D_k V^i\, e_i as we did with V:

D_l \left( D_k V^i\, e_i \right) = e_i \left( e^i \cdot \frac{\partial}{\partial x^l} \left[ (D_k V^j)\, e_j \right] \right)
= e_i \left( e^i \cdot \left[ \left( \frac{\partial (D_k V^j)}{\partial x^l} + \Gamma^j_{lq}\, (D_k V^q) \right) e_j + \frac{\tilde e'_{j\parallel l}}{dx^l}\, (D_k V^j)\, n \right] \right)
= \left[ \frac{\partial (D_k V^i)}{\partial x^l} + \Gamma^i_{lq}\, (D_k V^q) \right] e_i. \qquad (96)

Let us examine the expression on the right hand side. We have a partial derivative of the
tensor D_k V^i followed by the term Γ^i_{lq} (D_k V^q), which looks like a covariant derivative – of a
vector! To make it a covariant derivative of the mixed tensor D_k V^i, we would need to add the
term −Γ^q_{lk} (D_q V^i). However, we can't just add terms willy-nilly. The term is either there,
or it isn't. The problem is this. In the previous section, we defined the covariant derivative
of the tensor T^i_j in such a way as to ensure that the derivative of the scalar T^{ij} V_i U_j is a
covariant quantity. What we are doing now is applying D_k, defined in Eq. (40), twice to V
and hoping to get the result of Eq. (94). However, this cannot work for the simple reason
that D_l D_k does not transform like a tensor. This can be seen by realizing that the operator
e_i e^i \cdot is an invariant: a lower index is summed with an upper index. Thus, D_l D_k transforms
just like a second partial derivative:

\frac{\partial}{\partial x'^l}\, \frac{\partial}{\partial x'^k} = \frac{\partial x^m}{\partial x'^l}\, \frac{\partial}{\partial x^m} \left( \frac{\partial x^q}{\partial x'^k}\, \frac{\partial}{\partial x^q} \right) = \frac{\partial x^m}{\partial x'^l}\, \frac{\partial x^q}{\partial x'^k}\, \frac{\partial^2}{\partial x^m\, \partial x^q} + \frac{\partial x^m}{\partial x'^l} \left[ \frac{\partial}{\partial x^m}\left(\frac{\partial x^q}{\partial x'^k}\right) \right] \frac{\partial}{\partial x^q}. \qquad (97)

We can fix this problem by making the operator D_k invariant: D_k → e^k D_k. So,

e^k D_k V = e^k e_i \left( e^i \cdot \frac{\partial}{\partial x^k}(V^m e_m) \right) = e^k e_i\, (D_k V^m)\, (e^i \cdot e_m) = e^k e_i\, (D_k V^m)\, \delta^i_m = (D_k V^i)\, e^k e_i \qquad (98)

is now an invariant object. If we apply e^k D_k again, we obtain

e^l D_l \left( D_k V^i\, e^k e_i \right) = e^l e_q \left( e^q \cdot \frac{\partial}{\partial x^l} \left[ D_k V^i\, e^k e_i \right] \right)
= e^l e_q \left( e^q \cdot \left[ \frac{\partial D_k V^i}{\partial x^l}\, e^k e_i + D_k V^i\, \frac{\partial e^k}{\partial x^l}\, e_i + D_k V^i\, e^k\, \frac{\partial e_i}{\partial x^l} \right] \right)
= e^l e_q \left[ \frac{\partial D_k V^i}{\partial x^l}\, (e^q \cdot e^k)\, e_i + D_k V^i \left( e^q \cdot \frac{\partial e^k}{\partial x^l} \right) e_i + D_k V^i\, (e^q \cdot e^k)\, \frac{\partial e_i}{\partial x^l} \right]
= e^l e_q \left[ \frac{\partial D_k V^i}{\partial x^l}\, g^{qk}\, e_i - D_k V^i\, \Gamma^k_{lm}\, g^{qm}\, e_i + D_k V^i\, g^{qk}\, \Gamma^m_{li}\, e_m \right]
= e^l \left[ \frac{\partial D_k V^i}{\partial x^l}\, (g^{qk} e_q)\, e_i - D_k V^i\, \Gamma^k_{lm}\, (g^{qm} e_q)\, e_i + D_k V^i\, (g^{qk} e_q)\, \Gamma^m_{li}\, e_m \right]
= e^l \left[ \frac{\partial D_k V^i}{\partial x^l}\, e^k e_i - D_k V^i\, \Gamma^k_{lm}\, e^m e_i + D_k V^i\, e^k\, \Gamma^m_{li}\, e_m \right]. \qquad (99)

If we switch k with m in the second term and m with i in the last term, and put all the
basis vectors to the right, we get

e^l D_l \left( D_k V^i\, e^k e_i \right) = \left[ \frac{\partial D_k V^i}{\partial x^l} - D_m V^i\, \Gamma^m_{lk} + D_k V^m\, \Gamma^i_{lm} \right] e^l e^k e_i. \qquad (100)
The expression in the brackets is the definition of the covariant derivative of a mixed tensor,
which in this case is D_k V^i. One question regarding the logic in Eq. (99) is: how did we
know to take the dot product of e^q with e^k, rather than with e_i? The answer is simple:
either one would have produced the same result.
We can now define a set of invariant objects and invariant operators that act on them.
First, we define tensors as

T = T^{i_1,...,i_M}_{j_1,...,j_N}\; e_{i_1} \cdots e_{i_M}\, e^{j_1} \cdots e^{j_N}, \qquad (101)

where T^{i_1,...,i_M}_{j_1,...,j_N} are the tensor components. A tensor of rank zero will have no indices and no
basis vectors. A tensor of rank one will have one index and one basis vector, for example a vector V^i e_i. A
tensor of rank two will have two indices and two basis vectors, for example the metric tensor g_{ij}\, e^i e^j.
Second, we define the differential operator that acts on tensors as

e^k D_k = e^k e_i \left( e^i \cdot \frac{\partial}{\partial x^k} \right). \qquad (102)

Acting with this operator on T produces a new invariant object

e^k D_k T = \left( D_k T^{i_1,...,i_M}_{j_1,...,j_N} \right) e_{i_1} \cdots e_{i_M}\, e^{j_1} \cdots e^{j_N}\, e^k \qquad (103)

of rank M + N + 1, where the expression in the brackets is given by Eq. (95).

TENSOR PRODUCT

Throughout this chapter we have been a bit too loose with basis products of the form
e_i e_j e_k..., as seen in Eq. (103). What exactly does it mean to have the basis vectors placed in a row?
Let us look at the basis product in a rank two tensor, T = T^{ij} e_i e_j, and recall the Cartesian
coordinate system from the beginning of this chapter. In that system, we can write the basis
vectors as e_i = e_i^{(m)} ŷ_m and e_j = e_j^{(n)} ŷ_n, and the tensor T as

T = T^{ij}\, e_i^{(m)} e_j^{(n)}\; \hat y_m \hat y_n, \qquad (104)

where ŷ_m is the unit vector along the mth axis. The parentheses around the indices m and
n are merely a reminder that we are now working in the space extrinsic to the surface; also,
the fact that we have written them as superscripts has nothing to do with co- or contra-
variance, and the same goes for the subscripts on the ŷ's (the Einstein summation convention
still applies, however). The new object, T^{ij} e_i^{(m)} e_j^{(n)}, depends only on the indices m and
n. If we were to form a scalar from this object and two other vectors, U = U^k e_k^{(p)} ŷ_p and
V = V^l e_l^{(q)} ŷ_q, it would look like this:

U \cdot (T \cdot V) = T^{ij}\, e_i^{(m)} e_j^{(n)}\, U^k e_k^{(p)}\, V^l e_l^{(q)}\, (\hat y_p \cdot \hat y_m)(\hat y_q \cdot \hat y_n)
= T^{ij}\, e_i^{(m)} e_j^{(n)}\, U^k e_k^{(p)}\, V^l e_l^{(q)}\, \delta_{pm}\, \delta_{qn} = T^{ij}\, U^k V^l \left[ e_i^{(m)} e_j^{(n)}\, e_k^{(m)} e_l^{(n)} \right]. \qquad (105)

If we focus on the indices m and n, we see that the entire product in the square brackets
could have been formed if we had written e_k as a row vector

\left( e_k^{(1)} \;\; e_k^{(2)} \;\; e_k^{(3)} \right), \qquad (106)

e_l as a column vector

\begin{pmatrix} e_l^{(1)} \\ e_l^{(2)} \\ e_l^{(3)} \end{pmatrix}, \qquad (107)

and the product e_i e_j as a matrix

e_i e_j = \begin{pmatrix}
e_i^{(1)} e_j^{(1)} & e_i^{(1)} e_j^{(2)} & e_i^{(1)} e_j^{(3)} \\
e_i^{(2)} e_j^{(1)} & e_i^{(2)} e_j^{(2)} & e_i^{(2)} e_j^{(3)} \\
e_i^{(3)} e_j^{(1)} & e_i^{(3)} e_j^{(2)} & e_i^{(3)} e_j^{(3)}
\end{pmatrix}. \qquad (108)

So, we see that the product e_i e_j is nothing more than a matrix composed of all the combi-
nations of the components of e_i and e_j. It is called a tensor product and is conventionally
written like so: e_i ⊗ e_j. A tensor product of more than two basis vectors is a generalization of the
row-matrix-column formalism we are used to. For example, the scalar formed from a rank
3 tensor T = T^{ijk} e_i ⊗ e_j ⊗ e_k and the vectors U, V and W can be written as

T^{ijk}\, (e_i \cdot U)\, (e_j \cdot V)\, (e_k \cdot W) = T^{ijk}\, U^l V^p W^q \left[ e_i^{(m)} e_j^{(n)} e_k^{(s)} \right]\left[ e_l^{(m)} e_p^{(n)} e_q^{(s)} \right]. \qquad (109)

Though the expression e_i^{(m)} e_j^{(n)} e_k^{(s)} can be thought of as some kind of 3-dimensional matrix,
it is neither popular to construct it visually nor necessary. We could think of tensor products
as a shorthand notation. Instead of writing the right hand side of Eq. (109), we simply
perform the operation

T^{ijk}\, (e_i \cdot U)\, (e_j \cdot V)\, (e_k \cdot W) = T^{ijk}\, U^l V^p W^s\, (e_i \cdot e_l)(e_j \cdot e_p)(e_k \cdot e_s)
= T^{ijk}\, U^l g_{il}\, V^p g_{jp}\, W^s g_{ks} = T^{ijk}\, U_i V_j W_k, \qquad (110)

which allows us to bypass the large space referenced by the indices m, n and s.
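In components, the contraction in Eq. (110) is exactly what numpy's einsum performs. The sketch below is my own illustration with random data: it checks that contracting T^{ijk} with the metric-lowered components U_i V_j W_k gives the same scalar as performing the lowering inside the contraction.

```python
import numpy as np

rng = np.random.default_rng(0)
g = np.diag([1.0, 4.0])                  # a sample 2-d metric (polar with rho = 2)
T = rng.normal(size=(2, 2, 2))           # contravariant components T^{ijk}
U, V, W = rng.normal(size=(3, 2))        # contravariant components U^l, V^p, W^s

# Eq. (110): T^{ijk} U^l g_il V^p g_jp W^s g_ks ...
scalar1 = np.einsum('ijk,l,il,p,jp,s,ks->', T, U, g, V, g, W, g)
# ... equals T^{ijk} U_i V_j W_k with the indices lowered first
U_d, V_d, W_d = g @ U, g @ V, g @ W
scalar2 = np.einsum('ijk,i,j,k->', T, U_d, V_d, W_d)
print(np.isclose(scalar1, scalar2))      # True
```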
Another way to think about tensors is as linear maps, or functions, that take vectors as
inputs and output a real number. For example, we can write a rank 2 tensor as T( , ),
where the two slots indicate that T can take as input up to two vectors. A dot product
between two vectors can also be thought of in this way. For instance, in U · V, U( ) can be
thought of as a linear map that can take as input one vector, U(V), and vice versa. The
reason these maps are called linear is this property: U(V + W) = U(V) + U(W).
Throughout this chapter we have worked with a 2-d surface because it was easy to visu-
alize. However, all the formulas and definitions can be generalized for higher dimensional
surfaces simply by letting the indices run past two.
Now we know everything we need to know to understand the mathematics behind general
relativity. However, this chapter does cover a lot of material that may seem a bit abstract,
which is why I think it will be helpful to end it with a section that strikes a more familiar
note.

GRADIENT AND DIVERGENCE

In your study of electromagnetism, you have encountered the differential operators known
as the gradient and divergence (and also the curl, which I do not mention here because it will come
up in a later chapter). In Cartesian coordinates they seemed very straightforward. But you
may have wondered how they acquired this form,

\nabla f = \frac{\partial f}{\partial \rho}\, \hat\rho + \frac{1}{\rho}\frac{\partial f}{\partial \varphi}\, \hat\varphi + \frac{\partial f}{\partial z}\, \hat z \qquad \text{gradient}

\nabla \cdot A = \frac{1}{\rho}\frac{\partial (\rho A_\rho)}{\partial \rho} + \frac{1}{\rho}\frac{\partial A_\varphi}{\partial \varphi} + \frac{\partial A_z}{\partial z} \qquad \text{divergence} \qquad (111)
in polar coordinates, and

\nabla f = \frac{\partial f}{\partial r}\, \hat r + \frac{1}{r}\frac{\partial f}{\partial \theta}\, \hat\theta + \frac{1}{r\sin\theta}\frac{\partial f}{\partial \varphi}\, \hat\varphi \qquad \text{gradient}

\nabla \cdot A = \frac{1}{r^2}\frac{\partial (r^2 A_r)}{\partial r} + \frac{1}{r\sin\theta}\frac{\partial (A_\theta \sin\theta)}{\partial \theta} + \frac{1}{r\sin\theta}\frac{\partial A_\varphi}{\partial \varphi} \qquad \text{divergence} \qquad (112)

in spherical coordinates.
With the concepts developed in this chapter, we will see how these monsters arise nat-
urally from our definitions of tensors and their derivatives. We restrict ourselves to a 2-
dimensional space in which the formulae for the gradient and divergence reduce to

\nabla f = \frac{\partial f}{\partial \rho}\, \hat\rho + \frac{1}{\rho}\frac{\partial f}{\partial \theta}\, \hat\theta \qquad \text{gradient} \qquad (113)

\nabla \cdot A = \frac{1}{\rho}\frac{\partial (\rho A_\rho)}{\partial \rho} + \frac{1}{\rho}\frac{\partial A_\theta}{\partial \theta} \qquad \text{divergence} \qquad (114)
(simply ignore the z-component in Eqs. (111) and write θ in place of ϕ). Figure (6) shows a 2-d flat surface parallel
to the y_1-y_2 plane with polar coordinates: x^1 = ρ, x^2 = θ. Since the surface is flat, and we
choose the basis vectors to be orthogonal,

e_1 = e_\rho, \qquad e_2 = e_\theta, \qquad (115)

the metric and its inverse are simply

g_{11} = 1, \quad g_{22} = \rho^2, \qquad
g^{11} = 1, \quad g^{22} = \frac{1}{\rho^2}. \qquad (116)
To calculate the gradient of some scalar f(x), we apply to it the operator

e^k D_k = e^k\, \frac{\partial}{\partial x^k}. \qquad (117)

FIG. 6.

There is no necessity to include the factor e_i e^i \cdot, as in Eq. (102), because a derivative of
any vector living on a flat surface can only result in a vector (or tensor) that is parallel to
that surface. So, the gradient of f is

e^k\, \frac{\partial f}{\partial x^k} = g^{kl} e_l\, \frac{\partial f}{\partial x^k} = \sum_k g^{kk} e_k\, \frac{\partial f}{\partial x^k} = \left( e_\rho\, \frac{\partial}{\partial \rho} + \frac{e_\theta}{\rho^2}\, \frac{\partial}{\partial \theta} \right) f. \qquad (118)

This doesn't look quite right; according to Eq. (113), there should be a factor of 1/ρ in front
of ∂/∂θ, not 1/ρ^2. This is because Eq. (113) is expressed in terms of the unit basis ê_i, while
we have expressed the gradient (118) in the basis e_i. To compare apples with apples, we write
e_i = ê_i \sqrt{g_{ii}}. Hence,

\left( \hat e_\rho \sqrt{g_{\rho\rho}}\, \frac{\partial}{\partial \rho} + \frac{\hat e_\theta \sqrt{g_{\theta\theta}}}{\rho^2}\, \frac{\partial}{\partial \theta} \right) f = \left( \hat e_\rho\, \frac{\partial}{\partial \rho} + \frac{\hat e_\theta}{\rho}\, \frac{\partial}{\partial \theta} \right) f. \qquad (119)

The divergence is the dot product of the operator e^k ∂/∂x^k with a vector V = V^i e_i:

e^k\, \frac{\partial}{\partial x^k} \cdot \left( V^i e_i \right) = (D_k V^i)\, (e^k \cdot e_i) = D_k V^i\, \delta^k_i = D_k V^k = \frac{\partial V^k}{\partial x^k} + \Gamma^k_{kj}\, V^j. \qquad (120)

However, the vector components V^j go with the basis e_j: V = V^j e_j. What we want are vector
components that go with the unit basis ê_1 = ê_ρ and ê_2 = ê_θ: V = V̂^j ê_j = (V̂^j/\sqrt{g_{jj}})\, e_j. So,
Eq. (120) becomes

e^k\, \frac{\partial}{\partial x^k} \cdot \left( V^i e_i \right) = \frac{\partial}{\partial x^k}\left( \frac{\hat V^k}{\sqrt{g_{kk}}} \right) + \Gamma^k_{kj} \left( \frac{\hat V^j}{\sqrt{g_{jj}}} \right). \qquad (121)

If we plug g_{ij} and g^{ij} from Eq. (116) into Eq. (62), we can work out that the only nonzero
Christoffel symbols are these:

\Gamma^1_{22} = -\rho, \qquad \Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{\rho}. \qquad (122)

Plugging these into Eq. (121), we get

e^k\, \frac{\partial}{\partial x^k} \cdot \left( V^i e_i \right) = \frac{\partial \hat V^\rho}{\partial \rho} + \frac{1}{\rho}\frac{\partial \hat V^\theta}{\partial \theta} + \frac{\hat V^\rho}{\rho}, \qquad (123)

which is identical to Eq. (114). So, as you can see, you have been using the calculus of
intrinsic geometry all this time without even knowing it. I will leave it as an exercise for
the reader to work out the gradient and the divergence on a 3-d flat surface embedded in a
higher dimensional space.
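To close the loop numerically, here is a short sympy check (my own) that the covariant divergence of Eq. (123), written in terms of the unit-basis components, agrees with the textbook polar divergence (114) for an arbitrary vector field.

```python
import sympy as sp

rho, theta = sp.symbols('rho theta', positive=True)
V_rho = sp.Function('V_rho')(rho, theta)      # unit-basis (physical) components
V_theta = sp.Function('V_theta')(rho, theta)

# Coordinate-basis components: V^1 = V_rho / sqrt(g_11), V^2 = V_theta / sqrt(g_22)
V1, V2 = V_rho, V_theta / rho

# D_k V^k = d_rho V^1 + d_theta V^2 + Gamma^k_{kj} V^j, with Gamma^theta_{theta rho} = 1/rho
div_cov = sp.diff(V1, rho) + sp.diff(V2, theta) + V1 / rho

# Textbook polar divergence, Eq. (114)
div_polar = sp.diff(rho * V_rho, rho) / rho + sp.diff(V_theta, theta) / rho

print(sp.simplify(div_cov - div_polar))       # 0
```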
