Tensors: A Guide For Undergraduate Students: American Journal of Physics July 2013
Tensors: A Guide For Undergraduate Students: American Journal of Physics July 2013
net/publication/258757260
CITATIONS READS
5 3,458
2 authors:
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Francesco Battaglia on 28 January 2016.
I. INTRODUCTION points of the theory” are provided). In the preface to the sec-
ond edition, the author contradicts his previous statement
Undergraduate students, to whom this paper is addressed, with the warning, “Do not use this book in the wrong way…
are generally aware of the importance of tensors. However, It is advisable to study the first chapters I-V thoroughly.”
as pointed out by others,1 unless students have been engaged We tend to agree with the latter advice and believe that,
in a course dedicated to the study of tensors, the subject as with any subject, there are no short cuts. Yet, we also
remains covered in a veil of mystery. Given that an under- believe that the background necessary to get started can be
standing of tensors helps aid in understanding even simple presented more efficiently than is currently available in the
experiments carried out in undergraduate laboratory literature.
courses,2 this is unfortunate and is due, in our opinion, to Besides the classics by Levi-Civita and Schouten, several
two major causes. First, a lack in the literature of a presenta- other books exist at both the introductory5–10 and
tion of tensors that is tied to what students already know advanced11–16 levels (the books referenced are simply a sam-
from courses they have already taken, such as calculus, vec- pling of those we consider worthwhile to an undergraduate
tor algebra, and vector calculus. And second, the use of a student). The second barrier mentioned is related to notation.
notation that turns the necessary use of multiple indices into To become acquainted with the field, it is necessary to gain
an unnecessarily cumbersome process. some virtuosity with the so-called “index gymnastics.”
To be sure, there exist plenty of excellent books, such as Without this skill, it is not possible for anybody to follow the
The Absolute Differential Calculus by Levi-Civita,3 to name calculations of others or to perform their own. The origin of
a classic book written by one of the founders of the field. much difficulty resides, we believe, in a notation that is
This outstanding treatise, however, starts with three long widely used in the journal literature, in specific books on ten-
chapters on functional determinants and matrices, systems of sors (certainly in all the books referenced above, with only
total differential equations, and linear partial differential three exceptions), and in most popular books on mathemati-
equations, before entering into the algebraic foundations of cal methods for physicists.17 Being a tensor refers to a prop-
tensor calculus. And even so, little connection is made to the erty possessed by a multi-component quantity, defined
vector algebra and calculus that are a standard part of the within a coordinate system, whose components transform in
undergraduate physics curriculum. Given the treatise nature a specific way upon changing the coordinate system. The
of this book (which, by the way, is a must read by anybody notation we are referring to here is that if the components of
who aims at being a professional on the subject), this circum- the multi-component quantity A are denoted by Aa in a par-
stance is not terribly surprising, but it does make the book ticular coordinate system, those in another coordinate system
only somewhat appropriate to those readers (undergraduates) are often denoted by, for instance, A0 a or Aa —of the sixteen
we have in mind here. books quoted in Refs. 3–17, seven use the former notation
The excellent book by Schouten,4 another founder of ten- and six use the latter. As reasonable as it might seem, such a
sor analysis, in spite of its promising title, Tensor Analysis practice is the origin of an unnecessary confusion, perfectly
for Physicists, leaves us with some doubts on its appropriate- avoidable as long as the prime (or the bar) is attached to the
ness for undergraduate students. It is enough to browse index, such as in Aa0 . This tiny change in notation might
through the book to realize that the target audience is quite seem extravagant and unnatural but, as we shall see, it
advanced. The author himself, in the preface to the first edi- greatly simplifies the manipulations involved. Schouten (b.
tion, claims that some readers “can avoid actually working 1883) is seemingly the one who first introduced this notation
through the whole of the earlier chapters” (namely, the first so we will name it after him. What we propose in this paper
five chapters, after which a “brief summary of the salient is that Schouten’s notation be widely adopted.
498 Am. J. Phys. 81 (7), July 2013 http://aapt.org/ajp C 2013 American Association of Physics Teachers
V 498
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
Only three authors among the ones quoted in Refs. 3–17 space. The vectors may be associatively and commutatively
follow Schouten’s notation: Jeevanjee, Lichnerowicz, and, summed, there is a unique zero-vector, and each vector has
of course, Schouten himself. However, Lichnerowicz uses its unique opposite. The vectors may be associatively multi-
the notation inappropriately and adds confusion to the mat- plied by real numbers, an operation that is distributive with
ter. In Schouten’s notation, a sum over the components of A, respect to both the sum of real numbers and the sum of vec-
which
PN in the unprimed coordinate system is written as tors.19 Between any two vectors is defined a dot (or scalar)
A
a¼1 aP , in the primed coordinate PNsystem is correctly writ- product, a commutative rule that associates a real number to
N
ten as a¼1 A a 0 and not as a0 ¼1 Aa0 , which is what each pair of vectors. This dot-product rule is distributive
Lichnerowicz (incorrectly) writes. with respect to a linear combination of vectors and provides
A common notation that we adopt is the Einstein summa- a non-negative number whenever a vector is dotted with
tion rule. As we will see, we need two types of indices that itself (only the zero-vector dotted with itself gives the num-
we distinguish from each other by locating them either in a ber zero).20 A set of vectors fAn ; n ¼ 1; :::; Ng is said to be
lower position, as in Aa or in an upper position, as in Aa . The linearly independent if (note that a sum over n is implied)
Einstein summation rule is:
Rule 0: Whenever an index is repeated twice in a prod- Cn An ¼ 0 ) Cn ¼ 0 8n: (1)
uct—once in an upper position and once in a lower posi-
tion—the index is called a dummy index0 and summation over0 In general, if N is the maximum allowed number of linearly
P
it is implied. For instance, A b
0 b0 B c means A 0 b0 B c
b independent vectors, then the space is said to be N-dimen-
0 0
a P b a
¼ Aa0 10 B1c þ Aa0 20 B2c þ , and Caa means C
a a
a
¼ C 1
1
sional and, in this case, N linearly independent vectors are
2
þC2 þ . said to form a basis—any vector may be written as a linear
An index that is not a dummy index is called a free index. combination of the basis. Our ordinary space is Euclidean
Dummy and free indices follow three rules that, as trivial as because the Pythagorean theorem holds, and it is 3D because
they might sound, are so crucial and used continuously that the maximum value of N is N ¼ 3; that is, there exist sets of
we better state them explicitly: three vectors fea ; a ¼ 1; 2; 3g such that any vector A may
Rule 1: In the case of a multiple sum, different letters must be written as a linear combination of the vectors ea .
denote the dummy indices even though one letter is primed In our ordinary 3D space a basis of vectors may always
0
and the other is 0 not. For instance, we will not write Aa0 a Ba a be chosen to be orthonormal. By this we mean that if f^ xa ; a
a b
but rather Aa0 b B to imply a double sum over the two indices. ¼ 1; 2; 3g is such a basis, their mutual dot products obey
Rule 2: A dummy index may be renamed, at will and as
convenience requires, within any single factor (as long there x^a x^b ¼ dab ; (2)
is no conflict with Rule 1).
Rule 3: Any free index must appear with the same name where dab is the Kronecker delta, a quantity whose 9 compo-
and position on each side of an equation; thereafter, if one nents are equal to 0 if a 6¼ b or equal to 1 if a ¼ b. Each vec-
wishes to rename it, it must be done throughout the entire tor may be written as
equation.
A ¼ Aa x^a ; (3)
The remainder of this paper is organized as follows. In
Sec. II, we discuss the background assumed, and presumably where Aa is the ath component of the vector A in the basis
shared, by undergraduate students in physics or engineer- _
fx a g. A direct consequence of Eqs. (2) and (3) and the
ing—vector calculus in a Cartesian coordinate system.18 As stated properties of the dot product is that
we will see, there is no need to have upper and lower indices
in such a coordinate system; all indices appear as lower indi- Aa ¼ x^a A (4)
ces. Hence, only in Sec. II, do we allow the Einstein summa-
tion rule to apply to each repeated index, even though and
located twice in a lower position.
In subsequent sections, we tie the background of Sec. II to A B ¼ ðAa x^a Þ ðBb x^b Þ ¼ ðAa Bb Þð^
x a x^b Þ
the algebra and calculus of tensors. In Sec. III, we adopt an ar- ¼ ðAa Bb Þdab ¼ Aa Ba : (5)
bitrary basis (not necessarily orthonormal) and see how the
dual basis naturally emerges, so that for each vector one has to As done here, and in several calculations that follow, we
deal with two types of components—those with respect to the shall redundantly collect some factors within parentheses
original basis and those with respect to its dual. In Sec. IV, we with the aim of clarifying the algebra involved without the
see how the dual basis and the two types of components trans- need for any other explanation.
form under a coordinate transformation, and we introduce the In the 3D space, a cross (or vector) product is defined as
tensor-character concept. In Sec. V, we consider scalar, vector, 2 3
and tensor fields, introduce the covariant derivative, and pro- x^1 x^2 x^3
vide coordinate-independent expressions for the gradient and A B det4 A1 A2 A3 5 ¼ eabc Aa Bb x^c ; (6)
Laplacian of a scalar field and for the divergence and curl of a B1 B2 B3
vector field. Finally, in Sec. VI, we tailor these expressions to
general orthogonal coordinates, thereby completing our jour- where eabc is the Levi-Civita symbol, a quantity whose 27
ney back to what our readers are likely to be familiar with. components are equal to þ1 or 1 according to whether
ða; b; cÞ forms an even or odd permutation of the sequence
II. A REVIEW: ORTHONORMAL BASES ð1; 2; 3Þ, and equal to 0 otherwise (if two or more indices
take on the same value).
Let f^x a ; a ¼ 1; 2; 3g be an orthonormal basis spanning A field is a function of the space coordinates
the vectors of the ordinary Euclidean three-dimensional (3D) x ðx1 ; x2 ; x3 Þ, and possibly of time as well, and may be a
499 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 499
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
scalar or a vector field depending on whether the function is a homogeneous and isotropic (invariant under translations and
scalar or a vector. In this section, we denote the differential rotations) and if we believe in the law of inertia (that invari-
operator by @a @=@xa . Given a vector field A ¼ AðxÞ ance under a Lorentz transformation holds), then the funda-
¼ Aa ðxÞ^x a , its divergence is the scalar field given by mental laws of physics must be accordingly invariant. In
other words, they must be expressed by equations involving
r A ¼ @a Aa (7) single- and multi-component quantities that under transla-
tion, rotation, and Lorentz transformations preserve the ob-
and its curl is the vector field given by jectivity character we are after. Quantities with such a
2 3 character and defined in an N-dimensional space are called
x^1 x^2 x^3
tensors and, more specifically, rth-rank tensors if they have
r A det4 @1 @2 @3 5 ¼ eabc @a Ab x^c : (8)
N r components. Scalars, being single-component quantities,
A1 A2 A3 are then zero-rank tensors, while vectors, being N-compo-
nent quantities, are first-rank tensors. Higher-rank tensors
Given a scalar field /ðxÞ, its gradient is the vector field are…well, tensors!
defined by To see how higher-rank quantities naturally emerge, a
r/ðxÞ ¼ x^a @a /; (9) classic example is the case of the stress tensor defined on
each point of a solid body. We consider a small volume V
and its Laplacian is defined as the divergence of its gradient, surrounding a point within the solid and let Fa denote the ath
namely Cartesian component of the total internal stress force in V.
This force may be due, for instance, to external forces being
D/ r2 / r r/ ¼ @a @a /: (10) applied to the surface of the body, or because V feels on its
surface the gravitational force due to the rest of the surround-
Why do we need scalars, vectors, and, in general, tensors? ing body. Whatever
Ð its cause, the internal stress force can be
What we need in defining a physical quantity is for it to have written Fa ¼ V dV/a , where /a is the ath Cartesian compo-
some character of objectivity in the sense that it does not nent of the stress force density. Stress forces act within a few
have to depend on the coordinate system used. For instance, molecular layers (their action range is negligible when com-
a one-component function of position, such as a temperature pared to the linear dimensions of the body) so they act
field specifying the temperature T at each point P of space, through the surface S enclosing V. The above volume inte-
must provide a unique value T when the point P is consid- gral must then be expressible as a surface integral over S,
ered, regardless of the coordinate system used. and makingÐ use of the divergence
Ð (Gauss’s) theorem, we
Not every quantity specified by a single number is a sca- have Fa ¼ V @b rab dV ¼ S rab nb ds, where nb is the bth
lar. For example, in our 3D space whose points are parame- component of the unit vector perpendicular to the surface
terized by a Cartesian coordinate system, all displacements element ds of S, /a ¼ @b rab , and rab is a 9-component quan-
from a given point, such as the origin, will be specified by tity called the stress tensor.
three quantities ðx1 ; x2 ; x3 Þ. Now, the displacement that in
one coordinate system is specified, for instance, as ð1; 0; 0Þ,
would be specified as ð0; 1; 0Þ in a coordinate system rotated III. ARBITRARY BASES: DUALITY
clockwise by p=2 around the third axis. Thus, each compo- For reasons that will soon become apparent, for the re-
nent of the displacement is indeed a single-component quan- mainder of the paper, we apply the Einstein summation rule
tity, yet these components depend on the coordinate system as stated in Sec. I: summation over a repeated index is
used (they are not scalars) and do not have that objectivity implied only if it appears once in an upper and once in a
character we desire. An objectivity-preserving, single-com- lower position. Let us now see what happens as soon as we
ponent quantity must transform in a specific way under a relax our very first assumption, expressed in Eq. (2), accord-
coordinate transformation—it must be an invariant. ing to which the basis used is orthonormal. In contrast to Eq.
Likewise, multi-component quantities must have an objec- (2), we then consider a set of basis vectors fea ; a ¼ 1; 2; 3g
tivity character (and are thereby called vectors and tensors), for which
a circumstance that translates into specific rules about how
their components must transform in order to preserve that gab ea eb 6¼ dab : (11)
objectivity. As not all single-component quantities are sca-
lars, similarly, not all multi-component quantities are vectors It is appropriate to remark at this point that the (symmetric)
or tensors. We will specify the transformation rules in Sec. matrix ½gab , whose elements are gab , has the determinant
IV. For the time being, we only remark that the above defini- G det½gab 6¼ 0. In fact, given fea g as a basis its vectors
tions for dot and cross products, divergence, curl, and gradi- are linearly independent [Eq. (1) holds] so that
ent are what they are because they assure that the objectivity
we are requiring is preserved. For instance, we will see that Ca ea ¼ 0 ) Ca ¼ 0 8a: (12)
the above-defined displacements are vectors, and the dot
product [as defined in Eq. (5)] of each displacement with But this equation implies that
itself is a scalar. Had we defined, for instance, the single-
component quantity ðx1 Þ2 ðx2 Þ2 þ ðx3 Þ2 , we would not ðCa ea Þ eb ¼ Ca ðea eb Þ ¼ Ca gab ¼ 0 ) Ca ¼ 0 8a:
have had an invariant. (For the displacement example con- (13)
sidered in the previous paragraph, this quantity is equal to
þ1 in the first coordinate system and –1 in the other.) However, a homogeneous linear equation such as Ca gab ¼ 0
Another reason to require in physics the objectivity we possesses the trivial solution (Ca ¼ 0) if and only if
are considering here is that if we believe our space to be det½gab ¼
6 0. (Incidentally, this implies that ½gab is a
500 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 500
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
non-singular matrix; in other words, it admits an inverse.) We left-hand side of Eq. (21) becomes ðCac ec Þ eb ¼ Cac ðec eb Þ
shall denote the elements of the inverse matrix by gab , so that ¼ Cac gcb , which means
which is equivalent to writing Because the matrix ½gab is non-singular, Eq. (22) has the
solution ½Cab ¼ ½gab 1 ¼ ½gab , which means that the unique
½gab ½gab ¼ I or gac gcb ¼ dba ; (15) solution of Eq. (21) is given by
where I is the identity matrix and the indices in the ea ¼ gab eb : (23)
Kronecker delta have been written according to our Rule 3.
Clearly, ½gab is also symmetric and det½gab ¼ 1=det½gab The basis fea g is called the dual basis of fea g and is
1=G, a result that follows from Eq. (15) and the fact that obtained from the latter by transforming it via the matrix
the determinant of a product of matrices is equal to the prod- inverse of ½ea eb . Similarly, the dual basis of the basis fea g
uct of their determinants. is obtained from the latter by transforming it via the matrix
Any vector can be expressed as a linear combination of inverse of ½ea eb . To understand this, we note that Eqs. (23)
the basis fea g so that and (21) lead to
A ¼ Aa ea ; (16)
ea eb ¼ ðgac ec Þ eb ¼ gac ðec eb Þ ¼ gac dbc ¼ gab ; (24)
which is equivalent to Eq. (3) except for the change in nota-
tion, where the components of the vector A in the basis fea g so that
have been labelled with an upperP index. However, if one
asks if Aa ¼ ea A or A B ¼ a Aa Ba still hold as in Eqs. gab ¼ ea eb (25)
(4) and (5), the answer is negative:
whereby ½ea eb 1 ½gab 1 ¼ ½gab . Thus, the vector dual
b b
ea A ¼ ea ðA eb Þ ¼ ðea eb ÞA ¼ gab A 6¼ b
dab Ab ¼A a to ea is
(17) gab eb ¼ gab ðgbc ec Þ ¼ ðgab gbc Þec ¼ dca ec ¼ ea : (26)
and
In other words, if the basis dual to fea g is fea g, then the ba-
a b
A B ¼ ðA ea Þ ðB eb Þ ¼ ðea eb ÞA B a b sis dual to the latter is fea g:
X
¼ gab Aa Bb 6¼ dab Aa Bb ¼ Aa Ba : (18) fea ¼ gab eb g () fea ¼ gab eb g: (27)
a
P We therefore see that once a non-orthonormal basis is con-
In conclusion, Aa 6¼ ea A and A B 6¼ a Aa Ba . Please
a
note that stating A 6¼ ea A is in keeping with Rule 3. In sidered, another basis—its dual—naturally emerges.
some sense, Rule 3 warns us that this cannot be a bona fide Accordingly, any vector A may then be written as
equality. The reader might wonder whether this is a conse-
quence of the adopted rule and our having arbitrarily set the A ¼ A b e b ¼ A b eb ; (28)
labeling index in the upper position. This is not the case. The
inequality in Eq. (17) says that the ath component of a vector where the components Aa can be found by dotting ea with
A in a given basis fea g is not given by the dot product of A Eq. (28) and using Eq. (21) to get
with ea , no matter where we choose to locate the index to
label that component; Rule 3 simply certifies that this is so. A a ¼ A ea : (29)
A natural question to ask is whether there exists some other
basis, which we provisionally denote by fea g, such that Similarly, the components Aa can be found by dotting ea
with Eq. (28) to get
Aa ¼ ea A: (19)
A a ¼ A ea : (30)
Although this provisional notation is motivated by our desire
to follow Rule 3, we will see that such a basis does exist and For reasons that will become apparent in the following sec-
is unique. Moreover, if we call feag the dual basis of the tion, the components labeled by an upper index are called
original basis fea g, it turns out that the dual of fea g is simply the contravariant components of A, and those labeled by a
fea g. To find such a basis, we note that Eqs. (16) and (19) lower index are called the covariant components of A. A
lead to relation between the contravariant and covariant components
may be readily obtained from the equality Ab eb ¼ Ab eb : dot-
Aa ¼ ea A ¼ ea ðAb eb Þ ¼ ðea eb ÞAb ; (20) ting it with either ea or ea and using Eq. (21) together with
either Eq. (11) or Eq. (25) gives
which holds provided the vectors fea g are solutions of the
equation Aa ¼ gab Ab (31)
a
e eb ¼ dab : (21)
and
If the vectors fea g exist they must be expressible as a linear
combination of the basis fea g. Therefore, ea ¼ Cac ec and the Aa ¼ gab Ab : (32)
501 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 501
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
What about the dot product between two vectors? Using Cc ¼ C ec ¼ ðA BÞ ec ¼ Aa Bb ðea eb Þ ec
Eqs. (28), (21), (11), and (25), we now have four ways to
1
write it [compare with Eq. (5)]: ¼ pffiffiffiffi eabc Aa Bb ; (39)
G
A B ¼ gab Aa Bb ¼ gab Aa Bb ¼ Aa Ba ¼ Aa Ba : (33)
where eabc ¼ eabc with the indices located to be consistent
If we consider the two sets of components as vectors of two with our Rule 0. In conclusion we have
different spaces, one dual of the other, we would say that the
pffiffiffiffi 1
dot product is actually performed between the vectors of a AB¼ Geabc Aa Bb ec ¼ pffiffiffiffi eabc Aa Bb ec : (40)
space and those of its dual. This distinction is hidden when G
an orthonormal basis is used because gab ¼ dab and from
Eqs. (27), (31), and (32) we see that such a basis is self-dual From Eqs. (11) and (25), we see that the quantities gab and
so the contravariant and covariant components coincide; this gab depend only on the basis vectors (the way the space has
is why there was no need in Sec. II to distinguish between been parameterized). In addition, from Eqs. (27), (31), and
them by using upper and lower indices. (32), we see that these quantities determine the dual basis
We notice here that in our 3D space, a ready way to obtain from the original basis and relate the covariant and contra-
the dual vectors ea of a given basis fea g is by using the variant components to each other or, as usually phrased, that
relation gab and gab allow the lowering and rising of the indices.
Finally, these quantities determine the dot product of two
e b ec vectors [Eq. (33)] and, via G det½gab , their cross product
ea ¼ : (34)
V as well [Eq. (40)]. Because of the central role played by the
quantities gab and gab , they have been given a special name:
Here ða; b; cÞ form an even permutation of the triplet ð1; 2; 3Þ the metric. In Sec. IV, we will see that the metric is indeed a
and V ¼ ea ðeb ec Þ is the volume of the parallelepiped second-rank tensor.
spanned by the vectors of fea g. Indeed, the vectors defined
by Eq. (34) satisfy Eq. (21), whose unique solution guaran-
tees that these vectors are the wanted duals.21 IV. CHANGING BASES: TENSORS
To find the contravariant and covariant components of the We now wish to move from a given basis fea g to another
cross product of two vectors, let us first show that for any basis that, in keeping with Shouten’s notation, will be
given six vectors, which shall here be denoted as S1 , S2 , S3 , denoted as fea0 g. From the requirement that fea0 g be a basis,
T1 , T2 , and T3 , one has it follows that each vector ea of the unprimed basis can be
written as a linear combination of the primed basis vectors
ðS1 S2 S3 ÞðT1 T2 T3 Þ ¼ det½Sa Tb : (35)
0
ea ¼ Rba eb0 (41)
Using Eqs. (5) and (6), we see that the left-hand side
becomes det½SaA det½TbB ¼ det½SaA det½TBb ¼ det½SaA TAb ,
thus proving the assertion. Here, SaA is the Ath component of and, given fea g as a basis, each vector ea0 of the primed basis
the vector Sa (with similar meaning for TaA ) and we have can be written as a linear combination of the unprimed basis
made use of the facts that determinants are invariant when vectors
rows and columns are interchanged (first equality) and the
determinant of a product of matrices is equal to the product ea0 ¼ Rba0 eb : (42)
of their determinants (second equality). Applying Eq. (35) to
the basis fea g we get This equation acts as the definition
0
of the primed basis. Note
that we might think of Rba (which is a 2-index quantity) as
V 2 ¼ ðe1 e2 e3 Þ2 ¼ det½ea eb ¼ det½gab G; (36) the elements of a matrix. If so, we have to decide which
index refers to the rows and which to the columns; we freely
where again, V ¼ ea ðeb ec Þ is the volume of the parallel- opt for the lower index for the rows and the upper for the
epiped spanned by the basis vectors ea . To avoid any columns. 0
unlikely confusion between an upper index and a power Inserting Eq. (42) into (41), we get ea ¼ Rba ðRcb0 ec Þ
b0 c
exponent, we have inserted within parentheses all quantities ¼ ðRa Rb0 Þec , whereby
raised to a power. We note that 0 0
Rba Rcb0 ¼ dca ; or ½Rba ½Rba0 ¼ I: (43)
1 2 3 2 a b ab
ðe e e Þ ¼ det½e e ¼ det½g ¼ 1=G; (37) 0
The matrix ½Rba is invertible and its inverse is the matrix
due to the fact that det½gab ¼ 1=det½gab . Armed with Eqs. ½Rba0 . It is crucial at this
0
point to have no misunderstandings
(36)–(37) and (28)–(30), we are now ready to give an expres- about the notation: Rba and Rba0 are matrix elements of differ-
sion for the cross product of two vectors C ¼ A B. For the ent matrices—in fact, they are inverse to each other—and it
covariant components, we have is the position of the prime (on the lower or on the upper 0
index) that tells us that this is so. In particular, det½Rba
b
6¼ det½Ra0 and these nonzero determinants are reciprocal to
Cc ¼ C ec ¼ ðA BÞ ec ¼ Aa Bb ðea eb Þ ec
pffiffiffiffi each other.
¼ Veabc Aa Bb ¼ Geabc Aa Bb ; (38) Any vector can now be written as
0 0
and similarly, for the contravariant components, we have A Aa ea Aa0 ea Aa ea0 Aa ea ; (44)
502 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 502
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
where the equalities have been denoted as identities to stress basis; in fact, they transform as does the dual basis as can be
the objectivity character of the quantity we are dealing with. seen by comparing Eqs. (48) and (52). This is the origin of
Our task now is to find how to go from the direct and dual the names covariant and contravariant given to the compo-
bases (contravariant and covariant components) in the nents labeled with indices in, respectively, the lower and
unprimed system to the direct and dual bases (contravariant upper positions.
and covariant components) in the primed system, and vice The dot product defined in Eq. (33) is indeed a scalar (an
versa. Equation (42) is the starting point as it defines the invariant). For instance, from the last form of Eq. (33), we
primed basis (we assume the matrix R ½Rba0 is given). have
Equation (41) gives the unprimed direct basis from the 0 0 0
primed direct basis (requiring inversion of the matrix R). Aa0 Ba ¼ ðRca0 Ac ÞðRad Bd Þ ¼ ðRca0 Rad ÞAc Bd
The metric of the primed system is ¼ dcd Ac Bd ¼ Ac Bc ¼ Aa Ba : (54)
ga0 b0 ea0 eb0 ¼ ðRca0 ec Þ ðRdb0 ed Þ ¼ Rca0 Rdb0 ðec ed Þ; (45) Thus, when going from one basis to another in our 3D
space as dictated by the transformation (42), scalar quantities
or are, by definition, invariant. Or, if we like, quantities that are
invariant under transformation (42) are legitimate one-
ga0 b0 ¼ Rca0 Rdb0 gcd : (46) component physical quantities. On the other hand, the
components of a legitimate 3-component physical quantity
From Eq. (41) and the last identity in Eq. (44), and making a (vector) must transform according to either Eq. (48)
judicious use of our rules, we have (contravariant components) or Eq. (50) (covariant
0 0 0 0 components).
Aa ea0 Aa ea ¼ Aa ðRba eb0 Þ ¼ ðRba Aa Þeb0 ¼ ðRab Ab Þea0 ; Composing two vectors as in Aa Ba has yielded a scalar,
(47) but two vectors might be composed as in Aa Bb , Aa Bb or
Aa Bb , producing 9-component quantities whose transforma-
whereby tion rules, readily obtained from those of the single vectors,
0 0
guarantee the objectivity character we are requiring for a
Aa ¼ Rab Ab ; (48) physical quantity. However, such higher-rank quantities
need not be obtained by composing two vectors. We are then
which are the primed contravariant components from the led to define a second-rank tensor as a 9-component quantity
unprimed ones. To obtain the primed covariant components T whose covariant, contravariant, and mixed components
from the unprimed ones, we may lower the index in the con- transform, respectively, as
travariant components by means of the metric. From Eqs.
(31), (43), (46), and (48), we obtain Ta0 b0 ¼ Rca0 Rdb0 Tcd ; (55)
0 0 0 0
0 0
Aa0 ¼ ga0 c0 Ac ¼ ðRba0 Rdc0 gbd ÞðRce Ae Þ ¼ Rba0 gbd ðRdc0 Rce ÞAe
0
T a b ¼ Rac Rbd T cd ; (56)
¼ Rba0 gbd ðdde Ae Þ ¼ Rba0 ðgbd Ad Þ; (49) 0 0
Tab0 ¼ Rca0 Rbd Tcd : (57)
or Equations (55)–(57) may also be considered as the defining
relations of, respectively, a ð0; 2Þ-type, ð2; 0Þ-type, and
Aa0 ¼ Rba0 Ab : (50) ð1; 1Þ-type second-rank tensor. Contravariant and covariant
vectors are ð1; 0Þ-type and ð0; 1Þ-type first-rank tensors, and
Finally, we want to obtain the primed dual vectors from the scalars are ð0; 0Þ-type zero-rank tensors. Meanwhile, ðp; qÞ-
unprimed ones, which we find as type tensors of rank ðp þ qÞ may be easily defined by gener-
0 0 0 0 0 0 0 alizing the relations (55)–(57).
ea ¼ ga c ec0 ¼ ðRab Rcd gbd ÞðRec0 ee Þ ¼ Rab gbd ðRcd Rec0 Þee From Eqs. (46) and (53) we see that gab and gab are the co-
0 0
¼ Rab gbd ðded ee Þ ¼ Rab ðgbd ed Þ; (51) variant and contravariant components of a second-rank ten-
sor, the metric tensor. We can obtain its mixed components
or by either lowering one index of gab or raising one index of
gab ; in both cases, we have gca gab ¼ dcb , showing that the
0 0 metric-tensor mixed components are given by the Kronecker
ea ¼ Rab eb : (52)
delta dba . To show that dba is indeed a tensor0 we should0
check
that, according to Eq. (57), we have dba0 ¼ Rca0 Rbd ddc . Sure
From this and Eq. (25) we see that 0 0 0
enough, we find that Rca0 ðRbd ddc Þ ¼ Rca0 Rbc ¼ dba0 . This result
0 0
ga b ¼ Rac Rbd gcd :
0 0
(53) also shows that the Kronecker delta dba has the same compo-
nents in all coordinate systems, a property not shared by ei-
P
Let us now look at Eqs. (42), (48), (50), and (52). We see ther dab or dab . For instance, Rca0 Rdb0 dcd ¼ c Rca0 Rcb0 , which,
that the covariant components transform exactly as does the in general, is not equal to da0 b0 unless the transformation ma-
direct basis—compare Eqs. (50) and (42), where the transfor- trix R ½Rba0 happens to be orthogonal (its inverse is equal
mation is governed by the matrix R ½Rba0 —whereas the to its transpose R1 ¼ RT ).
contravariant-component
0
transformation (48) is governed by In an obvious way, a tensor may be multiplied by a real
the matrix R1 ½Rab . These components transform in a number by multiplying each component by the number, and
manner “contrary” (so to speak) to that adopted by the direct tensors of the same type may be added by summing their
503 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 503
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
homonymous components. Tensors of any type may be mul- Xab is a tensor. This would be0 0true if we had had an equation
tiplied: the outer product of a tensor of type ðp; qÞ by a ten- such as ðXa0 b0 Rca0 Rdb0 Xcd ÞAa b ¼ 0 holding for any Aab . But
sor of type ðs; tÞ produces a tensor of type ðp þ s; q þ tÞ. Aa Ab is not any tensor—it is a special (a symmetric) tensor.
From a tensor of type ðp; qÞ, we can get another tensor, of Therefore, from Eq. (59), we cannot infer that Xab is a
type ðp 1; q 1Þ, by setting two indices—one covariant tensor.
and the other contravariant—equal to each other, thereby A tensor is symmetric if it is invariant under the exchange
summing over the resulting repeated index; this operation is of two equal-variance (both upper or lower) indices, such as
called contraction. The outer product of two tensors fol- Tab ¼ Tba , whereas a tensor is antisymmetric if it changes
lowed by a contraction is an inner product, and a fully con- sign under the exchange of two equal-variance indices, such
tracted inner product of two equal-rank tensors can be called as Tab ¼ Tba . The importance of these properties is due to
a scalar product, an example of which is the dot product in the fact that they hold in any coordinate system. If Tab
Eq. (33). That the operations just defined produce new ten- ¼ 6 Tba then
sors is easy to check from the defining transformation prop-
erties (55)–(57). Ta0 b0 ¼ Rca0 Rdb0 Tcd ¼ 6 Rca0 Rdb0 Tdc ¼ 6 Rdb0 Rca0 Tdc
It is possible to test the tensor character of a multi-
component quantity also by using the quotient rule. To ¼ 6 Tb0 a0 ; (60)
explain this rule and how it works, let us start with a certain
notation. Given two tensors T1 and T2 , let T1 T2 denote and similarly for two contravariant indices. From any tensor
any resulting tensor arising from taking their outer product, with two equal-variance indices, such as Tab , we may con-
possibly followed by a contraction. Then let X be a multi- struct a symmetric and an antisymmetric tensor, namely
component quantity whose tensor character (if any) we wish Tab 6 Tba . Likewise, any such tensor may be written as a
to explore. The quotient rule says the following. By treating sum of a symmetric and an antisymmetric tensor as
X provisionally as a tensor and considering a specific prod- Tab ¼ 12 ½ðTab þ Tba Þ þ ðTab Tba Þ. Symmetry/antisymmetry
uct X A T between the components of X and A, as is not defined for two indices of different variance because in
defined above (i.e., a product that does not violate our Rules this case the property would not be coordinate-independent.
0-3), if it can be shown that for any tensor A the quantity T
is also a tensor, then X is a tensor. V. FIELDS
Before seeing how the rule arises, we can readily apply it.
Suppose we do not know that the metric was a tensor, but we A field is a function of the space coordinates (and possibly
do know that it is a 2-index quantity such that, for any of time as well). To represent an objectivity-preserving
tensor Ab , Eq. (31) holds so that gab Ab is a tensor. Then by physical quantity the field function must have a tensorial
the quotient rule, gab are the covariant components of a ten- character. Specifically, in our ordinary 3D space, it must be a
sor or, in short, gab is a tensor. Similarly, because Eq. (32) 3r-component tensor field of rank r, with r ¼ 0; 1; 2; :::: a
holds for any tensor Ab , then gab is also a tensor. Moreover, ð0; 0Þ-type tensor (scalar field), ð1; 0Þ-type tensor (contravar-
because we know that for any tensor Aa the quantity dba Aa iant vector field), ð0; 1Þ-type tensor (covariant vector field)
¼ Ab is also a tensor, we can deduce that dba is a tensor. On or, in general, a ðp; qÞ-type tensor field of rank r ¼ p þ q.
the other hand, to show from the quotient rule that dab is a Let us then consider our ordinary Euclidean 3D space par-
tensor we would need to show that, for instance, for any ten- ameterized by a set of arbitrary coordinates, which we shall
sor Aa the quantity dab Aa also is a tensor Tb (in this case, a denote by x ðx1 ; x2 ; x3 Þ. We assume that they are related to
covariant vector). However, from the definition of dab , we the Cartesian
0 0 0
coordinates—hereafter denoted by
have dab Aa 6¼ Ab (as previously discussed, the equality x0 ðx1 ; x2 ; x3 Þ—by an invertible relation, so that
would hold only within orthonormal bases). Therefore, dab is 0 0 0
not a tensor. xa ¼ xa ðx1 ; x2 ; x3 Þ xa ðxÞ (61a)
The quotient rule should be applied with some care; the
key words in the rule’s statement are for any. Instead of and
proving the rule, we shall see how the proof would 0 0 0
proceed in a case where it is not legitimate to invoke it, xa ¼ xa ðx1 ; x2 ; x3 Þ xa ðx0 Þ: (61b)
thereby catching with one token both the reasoning to follow
for a proof and its pitfalls. Say we have proved that The Jacobian determinants are nonzero
Xab Aa Ab T is a scalar (a zero-rank tensor) for any tensor a0
Aa . Then @x 0
J det b
det½Rab 6¼ 0 (62a)
@x
0 0 0 0
Xa0 b0 Aa Ab ¼ Xcd Ac Ad ¼ Xcd ðRca0 Aa ÞðRdb0 Ab Þ
0 0
and
¼ Rca0 Rdb0 Xcd Aa Ab ; (58)
@xa
J 1 det det½Rab0 6¼ 0; (62b)
where in the second equality we have used the tensor nature @xb0
of Aa . From the first and last terms of this result, it follows
that where we have set
0 0 0
ðXa0 b0 Rca0 Rdb0 Xcd ÞAa Ab ¼ 0; (59) @xa 0 @xa
b
Rab and b0 Rab0 : (63)
a
@x @x
supposedly true for any A . This circumstance, however,
does not allow us to infer that Xa0 b0 Rca0 Rdb0 Xcd ¼ 0, i.e., that These are elements of matrices inverse to each other
504 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 504
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
0
X @xa @xb0 @xa As promised, two bases have indeed emerged from the ar-
Rab0 Rbc ¼ ¼ c ¼ dac ; (64) bitrary coordinate system x ðx1 ; x2 ; x3 Þ: the basis fea g,
b
@xb0 @xc @x
defined at each point in space, of the vectors tangent to the
coordinate curves, and the basis fea g, defined at each point
which is why we wrote the Jacobian as J 1 in Eq. (62b). in space, of the vectors normal to the coordinate surfaces.
We now proceed to show how two bases of vectors The components of the basis fea g are referred to as contravar-
emerge from Eqs. (61). From Eq. (61a), upon fixing one iant, whereas the components of the basis fea g are referred to
coordinate at a time, three surfaces (the coordinate surfaces) as covariant. For instance, if AðxÞ ¼ Aa ðxÞea ¼ Aa ðxÞea is a
are specified; by fixing xa ¼ xa0 , say, one specifies the surface vector field, its contravariant and covariant components would
of the points of space with constant xa , whose parametric transform, respectively, as prescribed by Eqs. (48) and (50),
equations are given by Eq. (61a) upon setting xa ¼ xa0 (the namely
other two coordinates play the role of parameters). For
instance, the coordinate surfaces arising by parameterizing 0
Aa ¼ Rab0 Ab (68)
the space with spherical polar coordinates are spheres cen-
tered at the origin, cones with the vertex at the origin, and
and
half-planes originating from the z-axis.
Similarly, upon fixing two coordinates at a time, xa ¼ xa0 0
Aa ¼ Rba Ab0 : (69)
and xb ¼ xb0 , say, one specifies the points lying on both the
ath and bth coordinate surfaces, i.e., the points lying on their
The tensor algebra of the preceding sections remains
intersection curve—a coordinate curve—whose parametric 0
unchanged, except that Rab0 and Rba now have an explicit
equations are given by Eq. (61a) upon setting xa ¼ xa0 and
expression, given by Eq. (63).
xb ¼ xb0 (the remaining coordinate plays the role of the
Which basis should be used? It just so happens that some
parameter).
tensors are naturally expressed in one basis rather than in
Of course, upon fixing all three coordinates a single point
another. For instance, taking the differential of Eq. (61b), we
of space is specified. Each point is at the intersection of three
have
coordinate curves, whose tangent vectors at that point—read-
ily evaluated from the parametric equation of the curves—are X @xa 0 0
dxa ¼ dxb Rab0 dxb : (70)
X @x b0
0 b
@xb0
ea ¼ e ¼
b0 Rba eb0 : (65)
b
@xa
The differentials dxa do indeed transform as the components
0
Here eb0 is the bth Cartesian basis vector and Eq. (63) has of a contravariant vector. However, the differentials dxa are
been used. Recall that, being orthonormal, a Cartesian basis the Cartesian components of the displacement vector so that
0
is self-dual (eb0 ¼ eb ) and it is precisely this fact that allows we may conclude that dxa are the contravariant components
us to write Eq. (65) in a manner suitable to make use of our of the displacement vector in any coordinate system:
Rule 0. dx ¼ dxa ea .
Are the vectors fea g in Eq. (65) a basis? That is, does Let us now consider a scalar field / /ðxÞ ¼ /ðx0 Þ (the
Ca ea ¼ 0 imply Ca ¼ 0; 8a? The answer is yes; the ortho- equality arises from the definition of a scalar as an invariant
normal 0vectors fea0 g are a 0basis, whereby Ca ðRba eb0 Þ0
0
under coordinate transformations) and compute @a /ðxÞ,
¼ ðC Ra Þeb0 ¼ 0 implies C Ra ¼ 0; 8b. However,0 Ca Rba
a b a b where we denote @a @=@xa , to get
¼ 0 implies Ca ¼ 0; 8a because, by Eq. (62a), det½Rba 6¼ 0.
@/ðxÞ @/ðx0 Þ X @/ðx0 Þ @xb
0
0
As seen above, the matrices whose elements are given in @a / ¼ ¼ Rba @b0 /:
@xa @xa @x b0 @xa
Eq. (63) are inverses of each other, and using Eqs. (65), (41), b
and (52), we could at once write the vectors dual to those in (71)
Eq. (65). However, it is instructive to see how they emerge
from Eq. (61b) by computing, from each of them, the gradi- Here, the second equality follows from the fact that / is a
ent vectors as prescribed by Eq. (9) scalar, the third from the chain rule, and the 0 fourth from Eq.
X @xa 0
(63). From Eq. (71), we have @a /ðxÞ ¼ Rba @b0 /ðx0 Þ, which
ea rxa ¼ eb0 ¼ Rab0 eb : (66) tells us that the partial derivatives @a / do transform as the
@xb0
b components of a covariant vector. However, from Eq. (9),
0 we have @b0 / ðr/Þb0 , and hence we may conclude that
Here again, thanks to the fact that eb0 ¼ eb , we have written @a / are the covariant components of the gradient of the sca-
the last term of Eq. (66) in a manner suitable for applying lar field / in any coordinate system:
Rule 0. Also, the left-hand side has been written in keeping
with Rule 3, although there is more to it than just the desire r/ ðr/Þa ea ¼ ð@a /Þea : (72)
of not breaking rules that, as appealing as they might look,
are arbitrary. Indeed, the vectors in Eqs. (65) and (66) are
The partial derivative of a scalar field with respect to any
dual to each other, as the notation chosen suggests. In fact,
coordinate component readily gives the homonymous covar-
0 0 0 0 iant component of the gradient of the field [this is the mean-
ea eb ¼ ðRac0 ec Þ ðRdb ed0 Þ ¼ Rac0 Rdb ðec ed0 Þ
0 0 0
ing of the word “naturally” in the sentence above Eq. (70)].
¼ Rac0 Rdb dcd0 ¼ Rac0 Rcb ¼ dab (67) Wishing to compute the contravariant components of the
gradient, Eq. (32) gives a recipe: ðr/Þa ¼ gab ðr/Þb
so that they satisfy Eq. (21), the defining equation of dual ¼ gab @b /, where gab are the contravariant components of the
vectors. metric tensor.
505 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 505
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
In principle, the metric tensor may be evaluated from Eqs. In the last term, @b ec may be rewritten as a linear combina-
(61), (65), (66), (11), (14), and (25). An easier way to evalu- tion of the basis vectors to give
ate it, however, is from the invariance of dx dx. This is
gab dxa dxb in arbitrary coordinates, whereas in Cartesian @b ec ¼ ð@b ec Þa ea ¼ ½ea ð@b ec Þea Cabc ea ; (77)
coordinates it is
0 0 0 0
X 0 0 where we have used Eq. (19) in the second equality and
gb0 c0 dxb dxc ¼ db0 c0 dxb dxc ¼ dxc dxc introduced the Christoffel symbol (of the second kind) as
c
X 0 0 Cabc ea ð@b ec Þ: (78)
¼ ðRca dxb ÞðRcb dxb Þ
c
! This is just the ath contravariant component of @b ec and it is
X 0 0 symmetric under the exchange of the lower indices
¼ Rca Rcb dxa dxb : (73)
c 0 0 0 0
@b ec ¼ @b ðRac ea0 Þ ¼ ð@b Rac Þea0 ¼ Rabc ea0 ¼ Racb ea0
By comparison with gab dxa dxb we see that 0 0
X 0 0 ¼ ð@c Rab Þea0 ¼ @c ðRab ea0 Þ ¼ @c eb ; (79)
gab ¼ Rca Rcb ; (74)
c where denoting by primes a Cartesian coordinate system (in
directly computable from Eqs. (61a) and (63). For instance, which the basis vectors are constant),
0 0
@b ec0 ¼ 0, and use has
for the well-known spherical0 polar coordinates—for which been made of the fact that Rabc ¼ Racb (the mixed second par-
0 0
we replace ðx1 ; x2 ; x3 Þ and ðx1 ; x2 ; x3 Þ by the more common tial derivatives may be exchanged). Therefore, we have
ðr; #; uÞ and ðx; y; zÞ—one would get gab dxa dxb ¼ ðdrÞ2
@ b ec ¼ @ c eb (80)
þr 2 ðd#Þ2 þ r2 sin2 #ðduÞ2 , whereby the metric is the diago-
nal matrix ½gab ¼ diag½1; r2 ; r 2 sin2 #.
and
The partial derivatives of a scalar field @a / are then the
(covariant) components of a (first-rank) tensor (the gradient Cabc ¼ Cacb : (81)
of the field), which gives valuable (and, most importantly,
coordinate-independent or objectivity-preserving) informa- Equation (76) can then be written as
tion on the spatial variation rate of the field. The question nat-
urally arises as to how to secure such important information @b A ¼ ð@b Ac Þec þ Cabc Ac ea ¼ ð@b Aa þ Cabc Ac Þea Aa;b ea ;
on the components of a vector field. Do the partial derivatives
@b Aa or @b Aa of the components of a vector field, or the sec- (82)
ond partial derivatives @ba / @b @a / of a scalar field, have a
similar appeal as coordinate-independent, objectivity-preserv- where we have defined the covariant derivative of the contra-
ing quantities? In our language, are @b Aa , @b Aa , or @ba / variant components of the vector A (or more concisely, the
(second-rank) tensors? This is an easy question to answer covariant derivative of the contravariant vector Aa ) as
because we have just to check Eqs. (55)–(57). For instance,
Aa;b @b Aa þ Cabc Ac ; (83)
@b0 A a0 ¼ @b0 ðRca0 Ac Þ ¼ Rca0 ð@b0 Ac Þ þ Ac @b0 Rca0
where the covariant derivative is denoted by a subscript
¼ Rca0 ðRdb0 @d Ac Þ þ Ac Rcb0 a0 ¼ Rdb0 Rca0 @d Ac þ Ac Rcb0 a0 ; semicolon. Notice that, rather than Eq. (76), we could have
(75a) instead written
where we have used the chain rule in the second equality and @b A ¼ @b ðAc ec Þ ¼ ð@b Ac Þec þ Ac ð@b ec Þ (84)
have set @b0 Rca0 Rcb0 a0 . Due to the extra term Ac Rcb0 a0 , Eq.
(75a) shows that @b0 Aa0 6¼ Rdb0 Rca0 @d Ac so that @b Aa is not a with
tensor. Similar reasoning, and reference to Eq. (71), shows
that @ba / @b ð@a /Þ is not a tensor either, and nor is @b Aa : @b ec ¼ ð@b ec Þa ea ¼ ½ea ð@b ec Þea : (85)
0 0 0 0
@b0 Aa ¼ @b0 ðRac Ac Þ ¼ Rac ð@b0 Ac Þ þ Ac @b0 Rac However,
0 0
¼ Rac ðRdb0 @d Ac Þ þ Ac Rdb0 @d Rac ea ð@b ec Þ ¼ @b ðea ec Þec @b ea ¼ @b dca ec @b ea ¼ Ccba ;
0
¼ Rdb0 Rac @d Ac þ Ac Rdb0 Radc :
0
(75b) (86)
where we have used Eq. (21) in the second equality and Eq.
Although @ba / @b ð@a /Þ and @b Aa turn out not to be ten-
(78) in the last. Equation (84) finally becomes
sors, we still have the desire, if not the need, of a physical
quantity providing coordinate-independent information on
@b A ¼ ð@b Ac Þec Ac Ccba ea ¼ ð@b Aa Ccba Ac Þea Aa;b ea ;
the rate of spatial variation of arbitrary-rank tensors, not just
scalars. (87)
Let us compute the bth spatial derivative of a vector, as
given by Eq. (28), keeping in mind that in non-Cartesian where we have defined the covariant derivative of the covari-
coordinates the basis vectors are not constant: ant (components of the) vector Aa as
506 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 506
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
To compute the covariant derivatives, the needed As we have already seen, the partial-derivative term in
Christoffel symbols are provided by the precious metric Eqs. (83) and (88) is not a tensor and neither is the other
term because the Christoffel symbols are not tensors.
1 1 However, the combination of the two terms does make the
Cabc ¼ ðCabc þ Cacb Þ ¼ ½ea ð@b ec Þ þ ea ð@c eb Þ
2 2 covariant derivative a tensor, as we will soon see. To demon-
1 ad strate that the Christoffel symbols are not tensors, it is con-
¼ ½ðg ed Þ ð@b ec Þ þ ðgad ed Þ ð@c eb Þ venient to start by introducing the Christoffel symbol of the
2
1 first kind (please pay attention to the relative position of the
¼ gad ½@b ðed ec Þ þ @c ðed eb Þ ec ð@b ed Þ indices), defined by
2
eb ð@c ed Þ 1
Cabc gad Cdbc ¼ gad gde ½@b gec þ @c geb @e gbc
1 2
¼ gad ½@b gdc þ @c gdb ec ð@d eb Þ eb ð@d ec Þ
2 1
1 ¼ dea ½@b gec þ @c geb @e gbc ; (92)
¼ gad ½@b gdc þ @c gdb @d ðeb ec Þ 2
2
1 which gives
¼ gad ½@b gdc þ @c gdb @d gbc ; (89)
2 1
Cabc gad Cdbc ¼ ½@b gac þ @c gab @a gbc : (93)
where, along with the derivation, we have used, in order, 2
Eqs. (81), (78), (27), (11), (80), and (11) again. The impor-
tant relation obtained is From Cabc gad Cdbc , we have gae Cabc ¼ gae gad Cdbc ¼ ded Cdbc
¼ Cebc , or
1
Cabc ¼ gad ½@b gdc þ @c gdb @d gbc : (90) Cabc ¼ gad Cdbc : (94)
2
From Eqs. (82) and (87), we then have Incidentally, it should be clear from Eqs. (90) and (93) that
all components of the Christoffel symbols are equal to zero
@b A ¼ Aa;b ea ¼ Aa;b ea ; (91) in Cartesian coordinates, where the metric-tensor compo-
nents are given by the (constant) Kronecker delta.
where the covariant derivatives are given by Eqs. (83) and (88) The transformation rule for the Christoffel symbol of the
and the Christoffel symbol (of the second kind) by Eq. (90). first kind is obtained from Eqs. (93) and (46) as
2Ca0 b0 c0 ¼ @b0 ga0 c0 þ @c0 ga0 b0 @a0 gb0 c0 ¼ @b0 ðRda0 Rec0 gde Þ þ @c0 ðRda0 Reb0 gde Þ @a0 ðRdb0 Rec0 gde Þ
¼ ðRda0 Rec0 Þ@b0 gde þ ðRda0 Reb0 Þ@c0 gde ðRdb0 Rec0 Þ@a0 gde þ gde ½@b0 ðRda0 Rec0 Þ þ @c0 ðRda0 Reb0 Þ @a0 ðRdb0 Rec0 Þ
¼ Rda0 Rec0 Rfb0 @f gde þ Rda0 Reb0 Rfc0 @f gde Rdb0 Rec0 Rfa0 @f gde
þgde ½ðRdb0 a0 Rec0 þ Rda0 Reb0 c0 Þ þ ðRdc0 a0 Reb0 þ Rda0 Rec0 b0 Þ ðRda0 b0 Rec0 þ Rdb0 Rea0 c0 Þ
¼ Rda0 Reb0 Rfc0 ð@e gdf þ @f gde @d gef Þ þ gde ð2Rda0 Reb0 c0 þ Rdc0 a0 Reb0 Rdb0 Rea0 c0 Þ
¼ 2Rda0 Reb0 Rfc0 Cdef þ 2gde Rda0 Reb0 c0 : (95)
0 0 0 0 0 0 0 0 0
Cab0 c0 ¼ ga d Cd0 b0 c0 ¼ ðRae Rdf gef ÞðRpd0 Rqb0 Rrc0 Cpqr þ gpq Rpd0 Rqb0 c0 Þ ¼ Rae Rqb0 Rrc0 ðRdf Rpd0 Þgef Cpqr þ gef gpq Rae ðRdf Rpd0 ÞRqb0 c0
0 0 0 0
¼ Rae Rqb0 Rrc0 ðdpf gef ÞCpqr þ gef ðgpq dpf ÞRae Rqb0 c0 ¼ Rae Rqb0 Rrc0 ðgep Cpqr Þ þ ðgef gf q ÞRae Rqb0 c0
0 0 0 0
¼ Rae Rqb0 Rrc0 Ceqr þ ðdeq Rae ÞRqb0 c0 ¼ Rae Rqb0 Rrc0 Ceqr þ Raq Rqb0 c0 ; (97)
or
0 0 0 0 0
Cab0 c0 ¼ Rad Reb0 Rfc0 Cdef þ Rad Rdb0 c0 ¼ Rad Reb0 Rfc0 Cdef Rade Rdb0 Rec0 ; (98)
507 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 507
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
0 0 0 0
Rad Rdb0 c0 ¼ Rad @b0 Rdc0 ¼ @b0 ðRad Rdc0 Þ Rdc0 @b0 Rad We then see that, because of the presence of the second terms
0 0 in Eqs. (96) and (98), the Christoffel symbols do not transform
¼ @b0 dac0 Rdc0 ðReb0 @e Rad Þ as a tensor. As noted, the partial derivatives do not transform
0 0 0
¼ Rdc0 Reb0 Raed ¼ Raed Reb0 Rdc0 ¼ Rade Rdb0 Rec0 : as a tensor either. However, their combination in the covariant
derivatives (83) and (88) does transform as a tensor
(99)
0 0 0
Aa0 ;b0 @b0 Aa0 Ccb0 a0 Ac0 ¼ ðRdb0 Rea0 @d Ae þ Ae Reb0 a0 Þ ðRdb0 Rea0 Rcf Cfde þ Rce Reb0 a0 ÞRpc0 Ap
0 0
¼ Rdb0 Rea0 ð@d Ae Cfde ðRcf Rpc0 ÞAp Þ þ Ae Reb0 a0 ðRce Rpc0 ÞAp Reb0 a0 ¼ Rdb0 Rea0 ð@d Ae Cfde dpf Ap Þ þ Ae Reb0 a0 dpe Ap Reb0 a0
¼ Rdb0 Rea0 ð@d Ae Cfde Af Þ þ Ae Reb0 a0 Ae Reb0 a0 ¼ Rdb0 Rea0 Ad;e ; (100)
where Eqs. (88), (75a), (50), and the second equality in Eq. (98) have been used. Similarly,
0 0 0 0 0 0 0 0 0
Aa;b0 @b0 Aa þ Cab0 c0 Ac ¼ ðRdb0 Rae @d Ae þ Ae Rdb0 Rade Þ þ ðRae Rdb0 Rfc0 Cedf Rade Rdb0 Rec0 ÞRcp Ap
0 0 0 0 0
¼ Rdb0 Rae ð@d Ae þ Rfc0 Cedf Rcp Ap Þ þ Ae Rdb0 Rade Rade Rdb0 ðRec0 Rcp ÞAp
0 0 0 0 0 0 0
¼ Rdb0 Rae Ae;d þ Ae Rdb0 Rade Rade Rdb0 dep Ap ¼ Rdb0 Rae Ae;d þ Ae Rdb0 Rade Rade Rdb0 Ae ¼ Rdb0 Rae Ae;d ; (101)
where Eqs. (83), (75b), (48), and the first equality in Eq. (98) where the mixed second derivatives may be exchanged
have been used. @bc Aa ¼ @cb Aa , in covariant differentiation the order is im-
It is worth mentioning that the covariant derivative could portant, Aa;bc 6¼ Aa;cb ).
have been obtained in the following alternative way, where The covariant derivative of the metric is zero. Indeed,
we only sketch the procedure and leave the details to the
reader. Starting from the first equality in Eq. (98), solve for gab;c ¼ @c gab Cdac gdb Cdbc gad ¼ @c gab Cbac Cabc
Rdb0 c0 and insert the result into Eq. (75a). Rearranging the 1
equation obtained, Eq. (101) follows, showing that the covar- ¼ @c gab ½ð@a gbc þ @c gab @b gac Þ
2
iant derivative defined in Eq. (88) is indeed a tensor.
þ ð@b gac þ @c gab @a gbc Þ ¼ 0; (106)
Similarly,
0
starting from the second equality in Eq. (98), solve
for Rade and insert the result into Eq. (75b). Rearranging the
where Eqs. (103) and (93) have been used. Likewise, from
equation obtained, Eq. (100) follows, showing that the covar-
Eq. (104),
iant derivative defined in Eq. (83) is indeed a tensor.
Covariant derivatives of higher-rank tensors are readily
dba;c ¼ @c dba þ Cbcd dda Cdac dbd ¼ Cbca Cbac ¼ 0; (107)
written. For instance
Aab;c @c Aab þ Cacd Adb þ Cbcd Aad ; (102) and from Eqs. (105) and (107),
Aab;c @c Aab Cdac Adb Cdad Aad ; (103) 0 ¼ dba;c ¼ ðgad gdb Þ;c ¼ gad;c gdb þ gdb db
;c gad ¼ g ;c gad ;
(108)
and
whereby gab ;c ¼ 0. The metric-tensor components are
Aba;c @c Aba þ Cbcd Ada Cdac Abd ; (104)
“transparent” to covariant differentiation, a result known as
with a scheme that should be self-evident. Just recall that the the Ricci theorem.
sign is positive when computing the covariant derivative of a Now that we have a new tensor—the covariant derivative
contravariant component and negative when computing the of a field—we may treat it as such. For instance, the second-
covariant derivative of a covariant component. Rules 1–3 rank tensor in Eq. (83) may be contracted to a scalar, as
automatically take care of the indices and if any rule is bro- explained in Sec. IV. If we do so, we obtain Aa;a ¼ @a Aa
ken there is a mistake. (Regrettably it does not work the þCaac Ac . We have already noted that in Cartesian coordinates
other way—satisfying the rules does not guarantee mistake- the Christoffel symbols are zero, and the covariant and con-
free algebra.) travariant components of a vector are equal to each other.
The covariant derivative of a product follows the same Thus, in Cartesian coordinates,
P the above contraction
rules as the ordinary derivative. For instance becomes Aa;a ¼ @a Aa ¼ a @a Aa . But this is precisely Eq.
(7), the definition of the divergence of a vector, which is
ðAa Bb Þ;c ¼ Aa;c Bb þ Bb;c Aa ; (105) then a scalar (an invariant). We are then led to define the
divergence of a vector in an arbitrary coordinate system as
as can be easily verified by applying Eq. (104) to Aa Bb . (A
word of caution: unlike the case of ordinary differentiation, r A Aa;a @a Aa þ Caac Ac : (109)
508 Am. J. Phys., Vol. 81, No. 7, July 2013 F. Battaglia and T. F. George 508
Downloaded 19 Jun 2013 to 134.124.117.107. Redistribution subject to AAPT license or copyright; see http://ajp.aapt.org/authors/copyright_permission
Notice that the divergence of a vector is defined in terms of its contravariant components. If the covariant components are available, then $\nabla \cdot \mathbf{A} \equiv A^a_{\ ;a} = (g^{ab} A_b)_{;a} = g^{ab} A_{b;a}$; we define the divergence of (the components of) a covariant vector to be the divergence of the associated contravariant (components of the) vector.

Remarkably, to evaluate the divergence of a vector, it is not necessary to compute the Christoffel symbols. In fact, the ones appearing in Eq. (109) are

\[ \Gamma^a_{\ ac} = \tfrac{1}{2}\, g^{ad}\bigl[\partial_a g_{dc} + \partial_c g_{da} - \partial_d g_{ac}\bigr] = \tfrac{1}{2}\, g^{ad}\, \partial_c g_{ad}, \qquad (110) \]

where the cancellation between the first and third terms on the right-hand side of the first equality arises upon exchanging the dummy indices a and d and taking advantage of the symmetry of the metric tensor. However, for the matrix whose elements are $g^{ad}$, we may write

\[ [g^{ad}] = [g^{da}] = [g_{da}]^{-1} = \frac{G^{ad}}{G}, \qquad (111) \]

where the first equality follows from the symmetry of the metric, the second from Eq. (14), and the last equality (in which $G^{ad}$ and $G$ are, respectively, the cofactor of the element $g_{ad}$ and the determinant of the matrix $[g_{ab}]$) follows from the definition of the inverse of a matrix. Then for each element of the matrix $[g^{ab}]$ we can write $g^{ab} = G^{ab}/G$. However, from the definition of a determinant we also have $G \equiv g_{ab} G^{ab}$, whereby $\partial G/\partial g_{ad} = G^{ad}$ and $g^{ad} = G^{ad}/G = (1/G)(\partial G/\partial g_{ad})$. Equation (110) becomes

\[ \Gamma^a_{\ ac} = \frac{1}{2}\, g^{ad}\, \partial_c g_{ad} = \frac{1}{2G} \frac{\partial G}{\partial g_{ad}} \frac{\partial g_{ad}}{\partial x^c} = \frac{1}{2G} \frac{\partial G}{\partial x^c} = \frac{1}{\sqrt{G}} \frac{\partial \sqrt{G}}{\partial x^c} = \frac{\partial_c(\sqrt{G})}{\sqrt{G}}, \qquad (112) \]

which can be inserted into Eq. (109) to give

\[ \nabla \cdot \mathbf{A} \equiv A^a_{\ ;a} \equiv \partial_a A^a + \frac{\partial_a(\sqrt{G})}{\sqrt{G}}\, A^a = \frac{\partial_a(\sqrt{G}\, A^a)}{\sqrt{G}}. \qquad (113) \]

This is our final, and most general, expression for the divergence of a vector field, applicable in any coordinate system.
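As a concrete check of Eq. (113), and of the claim that no Christoffel symbols are needed, the following sketch (ours, not taken from the article) evaluates it in spherical coordinates and compares the result with the divergence formula familiar from vector calculus; the conversion from orthonormal (physical) components, discussed in Sec. VI below, is our own bookkeeping.

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
x = [r, th, ph]

# spherical coordinates: metric coefficients and sqrt(G) = r^2 sin(theta), 0 < theta < pi
h = [1, r, r * sp.sin(th)]
sqrtG = r**2 * sp.sin(th)

# arbitrary physical (orthonormal) components of the field A
Ar, Ath, Aph = [sp.Function(s)(r, th, ph) for s in ('A_r', 'A_th', 'A_ph')]

# contravariant components: A^a = (physical component)/h_(a)
A_contra = [Ar / h[0], Ath / h[1], Aph / h[2]]

# Eq. (113): div A = (1/sqrt(G)) d_a ( sqrt(G) A^a )
div = sum(sp.diff(sqrtG * A_contra[a], x[a]) for a in range(3)) / sqrtG

# familiar vector-calculus expression in spherical coordinates
div_textbook = (sp.diff(r**2 * Ar, r) / r**2
                + sp.diff(sp.sin(th) * Ath, th) / (r * sp.sin(th))
                + sp.diff(Aph, ph) / (r * sp.sin(th)))

print(sp.simplify(div - div_textbook))   # -> 0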
Needless to say, we may define divergences of a higher-rank tensor by taking the covariant derivative with respect to the ath coordinate of the bth contravariant component and contracting with respect to the indices a and b.

The Laplacian of a scalar field $\phi = \phi(x)$ is defined as the divergence of its gradient [see Eq. (10)]. From Eqs. (72) and (113), we have

\[ \Delta\phi \equiv \nabla \cdot \nabla\phi = \frac{\partial_a\bigl[\sqrt{G}\,(\nabla\phi)^a\bigr]}{\sqrt{G}} = \frac{\partial_a\bigl[\sqrt{G}\, g^{ab}\,(\nabla\phi)_b\bigr]}{\sqrt{G}}, \qquad (114) \]

or

\[ \Delta\phi \equiv \nabla \cdot \nabla\phi = \frac{\partial_a\bigl[\sqrt{G}\, g^{ab}\, \partial_b \phi\bigr]}{\sqrt{G}}, \qquad (115) \]

which is the general expression, applicable in any coordinate system, for the Laplacian of a scalar field.
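Equation (115) can be checked the same way; the sketch below (again ours, under the same assumptions on the coordinate range) reproduces the Laplacian in spherical coordinates.

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
x = [r, th, ph]

g = sp.diag(1, r**2, r**2 * sp.sin(th)**2)   # spherical-coordinate metric
ginv = g.inv()
sqrtG = r**2 * sp.sin(th)                    # sqrt(det g) for 0 < theta < pi

f = sp.Function('f')(r, th, ph)              # a generic scalar field

# Eq. (115): Laplacian = (1/sqrt(G)) d_a [ sqrt(G) g^{ab} d_b f ]
lap = sum(sp.diff(sqrtG * ginv[a, b] * sp.diff(f, x[b]), x[a])
          for a in range(3) for b in range(3)) / sqrtG

# familiar expression in spherical coordinates
lap_textbook = (sp.diff(r**2 * sp.diff(f, r), r) / r**2
                + sp.diff(sp.sin(th) * sp.diff(f, th), th) / (r**2 * sp.sin(th))
                + sp.diff(f, ph, 2) / (r**2 * sp.sin(th)**2))

print(sp.simplify(lap - lap_textbook))       # -> 0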
Finally, as explained at the end of Sec. IV, from a covariant second-rank tensor we can construct an antisymmetric tensor (the one constructed from $A_{a;b}$ would be $A_{a;b} - A_{b;a}$, or one proportional to this). However, from Eq. (88), we have $A_{a;b} - A_{b;a} = \partial_b A_a - \partial_a A_b$ because the terms containing the Christoffel symbols cancel. Thus, even though the partial derivatives of a vector are not tensors (in fact, this is precisely what has led us to the definition of the covariant derivative), their antisymmetric difference is a tensor. A 9-component antisymmetric tensor has only 3 independent components, and this is so in any coordinate system (because antisymmetry is preserved). These three components may then be viewed as the components of a vector that, in the case of $A_{a;b} - A_{b;a}$, turn out to be proportional precisely to the curl of the vector A. This quantity is defined as the "cross product" between the gradient operator and the vector A [as in Eq. (40)]:

\[ \nabla \times \mathbf{A} = \frac{1}{\sqrt{G}}\, \det\!\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \partial_1 & \partial_2 & \partial_3 \\ A_1 & A_2 & A_3 \end{bmatrix} = \frac{1}{\sqrt{G}}\, \epsilon^{abc}\, \partial_a A_b\, \mathbf{e}_c. \qquad (116) \]
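A quick symbolic check of Eq. (116) is also possible. The sketch below (ours, not part of the article) works in cylindrical coordinates, where sqrt(G) = rho, takes the A_b in Eq. (116) to be the covariant components, and converts the result to physical components before comparing with the textbook curl; these bookkeeping choices are our own reading of the formula.

import sympy as sp

rho, phi, z = sp.symbols('rho phi z', positive=True)
x = [rho, phi, z]
h = [1, rho, 1]          # metric coefficients in cylindrical coordinates
sqrtG = rho              # sqrt(det g)

# arbitrary physical components; covariant components are A_b = h_(b) * (physical)
A_phys = [sp.Function(s)(rho, phi, z) for s in ('A_rho', 'A_phi', 'A_z')]
A_cov = [h[b] * A_phys[b] for b in range(3)]

# Eq. (116): (curl A)^c = (1/sqrt(G)) eps^{abc} d_a A_b   (contravariant components)
curl_contra = [sum(sp.LeviCivita(a, b, c) * sp.diff(A_cov[b], x[a])
                   for a in range(3) for b in range(3)) / sqrtG
               for c in range(3)]

# physical components of the curl: multiply by h_(c); compare with the textbook result
curl_phys = [sp.simplify(h[c] * curl_contra[c]) for c in range(3)]
curl_textbook = [sp.diff(A_phys[2], phi) / rho - sp.diff(A_phys[1], z),
                 sp.diff(A_phys[0], z) - sp.diff(A_phys[2], rho),
                 (sp.diff(rho * A_phys[1], rho) - sp.diff(A_phys[0], phi)) / rho]

print([sp.simplify(curl_phys[c] - curl_textbook[c]) for c in range(3)])   # -> [0, 0, 0]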
VI. ORTHOGONAL COORDINATES

Geometrically, orthogonal coordinates have the property that the coordinate surfaces intersect one another at right angles, and so do the coordinate curves. The metric matrices are diagonal,

\[ [g_{ab}] = \mathrm{diag}[g_{11}, g_{22}, g_{33}] \equiv \mathrm{diag}[h_{(1)}^2, h_{(2)}^2, h_{(3)}^2], \qquad (117) \]

with the metric coefficients defined by

\[ h_{(a)} \equiv \sqrt{g_{aa}} = \frac{1}{\sqrt{g^{aa}}}, \qquad (118) \]

where the second equality follows from the inverse of a diagonal matrix. In orthogonal coordinates, the lengths of the coordinate basis vectors are given by $\sqrt{\mathbf{e}_a \cdot \mathbf{e}_a} = \sqrt{g_{aa}} = h_{(a)}$ and $\sqrt{\mathbf{e}^a \cdot \mathbf{e}^a} = \sqrt{g^{aa}} = h_{(a)}^{-1}$, and the coordinate basis vectors themselves may be expressed as $\mathbf{e}_a = h_{(a)}\, \hat{\mathbf{e}}_a$ and $\mathbf{e}^a = h_{(a)}^{-1}\, \hat{\mathbf{e}}^a$, where $\hat{\mathbf{e}}_a = \hat{\mathbf{e}}^a$ (these basis vectors are an orthonormal set). We have written them in a manner to fulfil our Rule 3 and have inserted within parentheses the metric-coefficient label to remind us, in case of unlikely doubts, that no sum is involved over that index.

Any vector can then be written as

\[ \mathbf{A} = A^a \mathbf{e}_a = h_{(a)} A^a\, \hat{\mathbf{e}}_a = \tilde{A}^a\, \hat{\mathbf{e}}_a = A_a \mathbf{e}^a = h_{(a)}^{-1} A_a\, \hat{\mathbf{e}}^a = \tilde{A}_a\, \hat{\mathbf{e}}^a, \qquad (119) \]

where

\[ \tilde{A}^a = \tilde{A}_a = h_{(a)} A^a = h_{(a)}^{-1} A_a \qquad (120) \]

are the physical components, often preferred by physicists because they have the same physical dimensions as the quantity considered and because they are relative to an orthonormal basis. For instance, in terms of its physical components, the gradient of a scalar field is then

\[ \nabla\phi = \frac{1}{h_{(a)}}\, \partial_a \phi\, \hat{\mathbf{e}}_a. \qquad (121) \]
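In spherical coordinates, for example, Eq. (121) gives back the familiar gradient at once; a minimal sketch (ours) of this bookkeeping:

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
x = [r, th, ph]
h = [1, r, r * sp.sin(th)]       # metric coefficients of Eq. (118) in spherical coordinates

f = sp.Function('f')(r, th, ph)

# covariant components of the gradient are (grad f)_a = d_a f;
# physical components follow from Eq. (120): (1/h_(a)) d_a f, which is Eq. (121)
grad_phys = [sp.diff(f, x[a]) / h[a] for a in range(3)]
print(grad_phys)
# -> [df/dr, (1/r) df/dtheta, (1/(r sin theta)) df/dphi]

These are the components along the orthonormal basis vectors ê_r, ê_θ, ê_φ, exactly as quoted in vector-calculus texts.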
Calculations should be performed using covariant and contravariant components, changing into physical components only at the end, if these are preferred.

In an orthogonal coordinate system, the Christoffel symbols take a simpler form. From Eqs. (90) and (118), we have

\[ \Gamma^a_{\ bc} = \tfrac{1}{2}\, g^{ad}\bigl[\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}\bigr] \]

We have therefore come back, after a journey along the path of tensor calculus, to expressions, Eqs. (121) and (123)–(125), that our readers might be familiar with; a place as good as any to stop.

VII. CONCLUSIONS
13. D. C. Kay, Schaum's Outline of Tensor Calculus (McGraw-Hill, New York, 1988).
14. L. P. Lebedev, M. J. Cloud, and V. A. Eremeyev, Tensor Analysis with Applications in Mechanics (World Scientific, Singapore, 2010).
15. J. L. Synge and A. Schild, Tensor Calculus (Dover, USA, 1978).
16. R. C. Wrede, Introduction to Vector and Tensor Analysis (Dover, USA, 1972).
17. For example, see P. M. Morse and H. Feshbach, Methods of Theoretical Physics (McGraw-Hill, New York, 1953); F. W. Byron, Jr. and R. W. Fuller, Mathematics of Classical and Quantum Physics (Dover, USA, 1970).
18. We are also assuming that the readers have been exposed to Cramer's rule for solving linear algebraic equations, i.e., that they have some familiarity with the properties of determinants and the inversion of matrices.
19. For completeness, we should add that multiplication by 1 leaves a vector unchanged.
20. For vectors defined in the complex field, the scalar product would not be commutative.
21. In the context of crystallography, the lattice space and its basis are called the direct space and direct basis, whereas the dual basis and the lattice space thereby generated are called the reciprocal basis and reciprocal lattice; V is the volume of the elementary cell of the direct lattice, whereas its inverse is the volume of the elementary cell of the reciprocal lattice. These concepts in crystallography are of utmost importance: the constructive interference which provides a diffraction spectrum occurs only when the vector differences between the wave vectors of the incident and diffracted x-rays are vectors of the reciprocal space.
22. This is in contrast to the formulation followed here, which, archaic as it might seem, nonetheless provides a practical tool widely used by the working physicist.
23. B. Schutz, Geometrical Methods of Mathematical Physics (Cambridge U.P., Cambridge, 1980).