Tensor Analysis
Translated by
Andrea Dziubek and Edmond Rusjan
Authors
Prof. Dr.-Ing. Heinz Schade
App. 430, Erlenweg 72
14532 Kleinmachnow
Germany
[email protected]

Translators
Prof. Dr. Andrea Dziubek
State University of New York Polytechnic Institute
100 Seymour Road
Utica NY 13502, USA
[email protected]
ISBN 978-3-11-040425-8
e-ISBN (PDF) 978-3-11-040426-5
e-ISBN (EPUB) 978-3-11-040549-1
www.degruyter.com
The first systematic presentation of tensor calculus was published in 1900 by Ricci and
Levi-Civita and hence the subject was initially often called Ricci calculus. Gradually,
the term tensor calculus became generally accepted. Tensor calculus was essential
for the formulation of the theory of relativity, which has in turn stimulated its further
development.
Tensor calculus is on the one hand an area of mathematics with applications
in differential geometry and on the other hand an essential resource for theoretical
physics and, increasingly, also for theoretical engineering. The theory of tensors con-
sists of two parts: tensor algebra and tensor analysis. The term tensor analysis is to-
day often (as in the title of this book) used for both parts. Tensor algebra deals with
the general theory of tensors, while tensor analysis is the theory of tensor fields, i. e.
tensors, which depend on the points in space, in particular in the three-dimensional
Euclidean space of our intuition.
In mathematics, tensor algebra is primarily interesting as a part of multilinear
algebra. Consequently, mathematical presentations usually use algebraic methods.
Scalars, vectors, and finally tensors are introduced as elements of sets. Then opera-
tions between these elements are defined that satisfy certain rules. Finally, the prop-
erties of these tensors are investigated. This approach is very abstract for physics stu-
dents, and even more for engineering students, and leads to a more general concept
of vectors and tensors than we need here.¹

¹ The original German edition of this book had a final chapter on this mathematical approach to tensor analysis. We omitted this chapter from the English edition because it appeared that this approach was of less interest to the readers of this book.
In physics and in engineering we are interested in tensors, because physical quan-
tities can be described as tensors (scalars and vectors are special cases of tensors).
Physical equations are equations between tensors and tensor analysis enables us to
formulate physical equations and to calculate with them. If we want to express a phys-
ical quantity, such as velocity or force, quantitatively, we need a coordinate system in
the space of our intuition. The simplest coordinate system is usually a Cartesian coor-
dinate system. We will see that we also need a coordinate system to quantitatively de-
scribe all other physical quantities, with the exception of scalar quantities. This means
that as long as we only use tensor analysis for computations with physical quantities
we can, mathematically speaking, limit ourselves to tensors which can be represented
in a three-dimensional Cartesian basis.
This book is about these tensors. It builds tensor analysis in a more intuitive way
by defining operations between tensors in terms of their coordinates in a Cartesian
basis. This naturally leads to the two tensor analysis notations: the so-called symbolic
or direct notation and the so-called coordinate or index notation. Both notations have
advantages and it is best to learn both notations simultaneously and to be able to
translate between them like between languages. Hence the symbolic notation must be
able to express, as much as possible, the same information as the coordinate notation
with its algorithmic character can express. For example, the order of a tensor is clear
in the coordinate notation from the number of the indices and hence it should be also
clear in the symbolic notation. We aimed to derive a symbolic notation that does that.
The book has five chapters. In Chapter 1 we collect the tools used later in the book,
including the so-called Kronecker symbols and the most important facts about deter-
minants and matrices. Chapter 2 presents tensors in symbolic notation and in Carte-
sian coordinates while Chapter 4 presents tensors in curvilinear coordinates. Chapter 3
deals with the algebra of second-order tensors, which is closely related to the theory of square matrices of order 3. It also includes some additional theorems about square matri-
ces. Chapter 5 is an introduction to the theory of the representation of tensor functions.
We investigate permissible representations of physical quantities under the condition
that a physical equation must satisfy tensorial homogeneity. This is similar to dimen-
sional analysis, where an equation must satisfy dimensional homogeneity. (Tensorial
homogeneity means that only tensors of the same order can be added or set equal to
each other. For example, torque cannot be equal to work, because torque is a vector
and work is a scalar.)
Important formulas are indented and labeled by a number. Intermediate results,
which are cited later, for example, in a proof or in an auxiliary calculation, are left-
aligned and labeled by a letter on the right side. The most important formulas and
theorems (which are worth memorizing) are highlighted by a gray background and
typeset in a font different from the rest of the text. Italics are used to emphasize text.
The book can be used as a textbook for a one-semester course. It is also well suited
for self-study. Both, and especially the latter, require solving exercise problems and
hence the text contains assignment problems at appropriate places. They are marked
with the symbol of a pen on the left side. Only a limited number of problems is in-
cluded, so all problems should be solved (unless the corresponding part of the text
contains nothing new for the reader), and preferably before continuing reading the
text. The relatively small number of problems makes it possible to write in some detail
how the problems can be solved (see Appendix A).
The bibliography at the end of the book has been compiled by the translators.
Without the critical questions and suggestions for improvements from many stu-
dents and assistants, and the generous support of our Institute, we would not have
been able to write this book. We want to especially thank (now Professor) Frank
Kameier, who professionally and didactically accompanied the course development
for many years, Thomas Lauke for careful proofreading and preparing the manuscript
for printing, Evelyn Kulzer for drawing the pictures, Andrea Dziubek and Edmond Rus-
jan for their conscientious translation into English and for initiating this translation,
and last but not least Dr. Konrad Kieling from the Walter de Gruyter publishing house
for enabling this English edition.
1 Algebraic Tools | 1
1.1 The Summation Convention | 1
1.2 N-tuples | 4
1.2.1 Definitions | 4
1.2.2 Linear Operations | 4
1.2.3 Linear Independence | 5
1.3 Determinants | 6
1.3.1 Definitions | 6
1.3.2 Computing Determinants | 7
1.3.3 Computations with Determinants | 8
1.4 Kronecker Symbols | 10
1.4.1 δij | 10
1.4.2 δ^{i...j}_{p...q} | 11
1.4.3 εi...j | 12
1.4.4 Representation of a Determinant by εi...j | 15
1.4.5 εijk | 18
1.5 Matrices | 20
1.5.1 Definitions | 20
1.5.2 Arithmetic Operations and First Conclusions | 21
1.5.3 Equations Involving Matrices and Equations Involving Matrix
Elements | 25
1.5.4 Elementary Transformations, Normal Form, Equivalent Matrices,
Similar Matrices | 26
1.5.5 Orthogonal Matrices | 28
1.6 Algorithms | 29
1.6.1 Computing the Value of a Determinant | 29
1.6.2 Solution of Systems of Linear Equations with the Same Regular
Coefficient Matrix (“Division by a Regular Matrix”, Gauss–Jordan
Algorithm) | 29
1.6.3 Determining the Rank of a Matrix or a Determinant | 30
References | 249
Appendices | 253
Index | 321
2. In some sections, we need to distinguish between lower and upper indices. Where
upper indices can be confused with exponents, we will set the exponents in brackets.
3. For the calculation with matrices and tensors, an agreement proposed by Einstein
which is called summation convention has been adopted for running indices. It exists
in different versions; we introduce it in the following form:
If a running index appears twice in a term, summation over the values of the index
is implied, without explicitly writing the summation sign.
4. The summation convention has the following consequences for running indices:
– A running index can only appear once or twice in a term.
If a running index appears once, it is called a free index. A free index succes-
sively takes each value of its index set. If a running index appears twice, it is called
a summation index. A summation index indicates summation over the values of
its index set.
For example, if an index i has the index set 1, 2 and a second index j has the
index set 1, 2, 3, then the equation aij bj = ci abbreviates the two equations

a11 b1 + a12 b2 + a13 b3 = c1 ,
a21 b1 + a22 b2 + a23 b3 = c2 .

These two equations can also be written as aik bk = ci or amn bn = cm , but not as
aii bi = ci , because in the last equation i appears three times in a term and this is
not defined.
It is often necessary to rename indices. For example, if we substitute Ai = αij Bj
into aj = Ai Cij we get formally aj = αij Bj Cij , but then j appears three times in a
term. To avoid this, we can write the first equation as Ai = αik Bk , and now we get
an acceptable equation, aj = αik Bk Cij .
– Running indices must have the same index set in all terms of an equation.
For example, in aij bj = ci , if i runs from 1 to 3 in aij , it cannot run from 1 to 2
in ci , and the same is true for j.
– A free index must match in all terms of an equation. The order of indices in an equa-
tion does not matter.
For example, the equation aij bj = ck is not permissible, while ai bj = cij and
bj ai = cij are permissible and have the same meaning.
The last rule has one exception: we do not add a subscript to the number zero, i. e. the
equation αi = 0 means all αi are zero. We will also need the logical negation of this
statement. To express “not all αi are zero” we introduce our own inequality sign =\ and write αi =\ 0 (read: αi not all zero). On the other hand, the equation with the common inequality sign αi ≠ 0 means all αi are nonzero. To not deviate more than necessary
from common practice, we will use the usual inequality sign if an equation has both
meanings. This is the case when the equation does not have free indices, so we write
a ≠ 0 and ai bi ≠ 0.
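The rules above can also be checked mechanically. The following NumPy sketch (an illustrative addition with arbitrarily chosen array values, not part of the original text) spells out aij bj = ci and shows that renaming a summation index changes nothing.

```python
import numpy as np

# a_ij b_j = c_i : the repeated index j is summed, i stays free
a = np.array([[1., 2., 3.],
              [4., 5., 6.]])        # index i has the index set 1, 2; index j the set 1, 2, 3
b = np.array([1., 0., -1.])

c = np.einsum('ij,j->i', a, b)      # same result as a @ b
# renaming the summation index: a_ik b_k = c_i
assert np.allclose(np.einsum('ik,k->i', a, b), c)

# a term like a_ii b_i, with an index appearing three times, is not defined;
# einsum likewise rejects 'ii,i->i' here because the two i-dimensions differ.
```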
Problem 1.1.
Write out the following expressions:
for N = 4, i. e. i = 1, 2, 3, 4
A. ai Bi ,  B. Aii ,  C. ∂ui/∂xi ,  D. ∂²φ/∂xi² ,
for N = 3
E. aij bij , F. aii bjj ,
for N = 2
G. (∂ui/∂xj)(∂ui/∂xj) .
Problem 1.2.
Substitute:
A. ui = Aik nk in φ = uk vk ,
B. ui = Bij vj and Cij = pi qj in wi = Cmi um ,
C. ϕ = τij dij , τij = η (∂vi/∂xj + ∂vj/∂xi ), dij = ½ (∂vi/∂xj + ∂vj/∂xi ), and
qi = −κ ∂T/∂xi (κ constant) in ρ T Ds/Dt = ϕ − ∂qi/∂xi .
5. In the rare cases that we do not want to apply the summation convention, we un-
derline the corresponding running index. For example, Aij with i, j = 1, 2, 3 represents the matrix

$$\begin{pmatrix}A_{11}&A_{12}&A_{13}\\ A_{21}&A_{22}&A_{23}\\ A_{31}&A_{32}&A_{33}\end{pmatrix},$$

Aii represents the sum A11 + A22 + A33 of the diagonal elements, and Ai̲i̲ represents the set (A11 , A22 , A33 ) of the diagonal elements. Underlined indices are neither free nor
summation indices. We call these indices bound indices, because if a term has a bound
index, then the same index must appear either as a free index or as a summation index
in the same term. These indices do not count in the rule that a running index can
appear at most twice in a term.
This means αi Ai̲i is permissible and represents α1 A11 + α2 A22 + α3 A33 , and λi̲ aij is also permissible and represents the matrix

$$\begin{pmatrix}\lambda_1 a_{11}&\lambda_1 a_{12}&\lambda_1 a_{13}\\ \lambda_2 a_{21}&\lambda_2 a_{22}&\lambda_2 a_{23}\\ \lambda_3 a_{31}&\lambda_3 a_{32}&\lambda_3 a_{33}\end{pmatrix}.$$

A short calculation shows that we may multiply an equation by a factor whose index is bound to a free index, but not by a factor whose index is bound to a summation index: From aij bjk = cik it follows that λi̲ aij bjk = λi̲ cik , but from ai bi = ci di it does not follow that λi̲ ai bi = λi̲ ci di .
If an index is bound to a free index, we can convert the free index into a summation index by multiplication. For example λi̲ ai becomes λi̲ ai bi . However, such an expression is not associative; it should be read as (λi̲ ai ) bi . The other association λi̲ (ai bi ) is not defined, which becomes clear when we substitute ai bi = c.
Problem 1.3.
A. Are the following equations allowed by the summation convention?
1. Ai Bjj = Ci Dkk , 2. Ai Bi = Ci Dj , 3. Ai Bi = Ci Di ,
4. Ai Bi = Ci , 5. Am Bn = Cmn , 6. Ai Bk = Ci Dj ,
1.2 N-tuples
1.2.1 Definitions
An ordered list of N numbers a1 , a2 , . . . , aN , written (ai ), is called an N-tuple; the numbers ai are its elements. The order of elements matters: changing the order of elements in an N-tuple gives a
different N-tuple. Special names are often used for small values of N. We call 2-tuples
ordered pairs, 3-tuples ordered triples, 4-tuples ordered quadruples, etc.
For example, the three Cartesian coordinates of a point form an N-tuple, in par-
ticular, an ordered triple.
We call an N-tuple of arbitrary size with all elements zero a zero-N-tuple and
write (0).
For N-tuples of equal size we define the following so-called linear operations: equality,
addition, subtraction, and multiplication by a number:
– Two N-tuples are equal if all homologous elements, i. e. elements at identical positions in both N-tuples and subscribed by the same index, are equal.
– To add or subtract two N-tuples, we add or subtract their homologous elements.
– To multiply an N-tuple by a number, we multiply every element by this number.
Equality, addition, and subtraction are defined only for N-tuples of equal size.
1. A set of P N-tuples of equal size (ai )^1 , (ai )^2 , . . . , (ai )^P is called linearly independent if the equation

α1 (ai )^1 + α2 (ai )^2 + ⋅ ⋅ ⋅ + αP (ai )^P = (0) (1.2)

is satisfied only if all αj are zero. This solution is called the trivial solution. Using the summation convention we abbreviate

αj (ai )^j = (0) only if αj = 0,  i = 1, . . . , N, j = 1, . . . , P. (1.3)

If equation (1.2) has also nontrivial solutions, i. e. if at least one of the αj is nonzero,

αj (ai )^j = (0), αj =\ 0,  i = 1, . . . , N, j = 1, . . . , P, (1.4)

then the N-tuples (ai )^j are called linearly dependent.
2. Equation (1.2) represents a system of N homogeneous, linear equations to deter-
mine the P unknown αj :
α1 a1^1 + α2 a1^2 + ⋅ ⋅ ⋅ + αP a1^P = 0,
α1 a2^1 + α2 a2^2 + ⋅ ⋅ ⋅ + αP a2^P = 0,
. . .
α1 aN^1 + α2 aN^2 + ⋅ ⋅ ⋅ + αP aN^P = 0.
If the number of unknowns is larger than the number of equations, i. e. if P > N,
the system always has nontrivial solutions. More than N N-tuples are always linearly
dependent, while N or fewer than N N-tuples can be linearly independent.
3. A set of N-tuples where each pair is linearly dependent is called collinear; a set of
N-tuples where each triple is linearly dependent is called coplanar.
4. The left side of (1.2) is an N-tuple, which is obtained by multiplying each of the N-
tuples (of a given set of N-tuples) by a number and then summing up these products.
This is called a linear combination of these N-tuples.
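In practice, linear independence of a set of N-tuples is conveniently tested by comparing the rank of the matrix formed by the N-tuples with their number. The following NumPy sketch (an added illustration with arbitrarily chosen values, not part of the original text) shows this for three 4-tuples.

```python
import numpy as np

# three 4-tuples, written as the rows of a matrix
tuples = np.array([[1., 0., 2., 1.],
                   [0., 1., 1., 0.],
                   [2., 1., 5., 2.]])   # third row = 2 * first row + second row

# the tuples are linearly independent iff alpha_j (a_i)^j = (0) has only the
# trivial solution, i.e. iff the rank equals the number of tuples
print(np.linalg.matrix_rank(tuples))    # prints 2 < 3: linearly dependent
```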
1.3 Determinants
We assume the reader is familiar with the algebra of determinants; however, we sum-
marize here the most important definitions and theorems (without proofs) for later
reference.
1.3.1 Definitions
The determinant that remains when we delete from det a∼ the row and the column of an element aij is called a minor;¹ minors corresponding to the main diagonal are also called principal minors. A minor multiplied by (−1)^(i+j) is called the cofactor corresponding to the element aij , for which we write bij :

bij := (−1)^(i+j) ⋅ (minor corresponding to aij ). (1.8)
5. The summation convention does not apply across the symbol det: det aij = 1 is a
valid equation; however, det aij = δij is not permissible.
det a∼ = a11 a22 − a21 a12 . (1.9)
To easily remember this formula, we write out the elements of the determinant:
a11 a12
a21 a22

This square table has two diagonals: the diagonal from a11 to a22 is called the main diagonal, and the diagonal from a21 to a12 is called the antidiagonal. To compute the value of the determinant, we subtract from the product of the elements on the main diagonal the product of the elements on the antidiagonal; we shorten this rule to “main diagonal minus antidiagonal”.
det a∼ = a11 a22 a33 − a21 a12 a33 − a31 a22 a13 − a11 a32 a23 + a21 a32 a13 + a31 a12 a23 . (1.10)
To easily remember this formula we first write out the elements of the determinant and
then we repeat the first and the second row below.
1 The term minor is also used with a different meaning in the literature.
2. We can compute the value of a determinant with the cofactor (or Laplace) expansion
formula: we pick any row or column, multiply every element by its cofactor, and add
these products. Then we have
det a∼ = ai̲k bi̲k = aik̲ bik̲ . (1.11)
The most important rules for computations with determinants are the following:
I. The value of a determinant does not change,
a) if we add to a row a linear combination of the remaining rows;
b) if we add to a column a linear combination of the remaining columns;
c) if we exchange rows and columns (reflect the determinant at the main diago-
nal).
II. The sign of a determinant changes if we exchange two rows or columns.
III. A determinant is zero if and only if its rows are linearly dependent. (According to
condition Ic, this is equivalent to its columns being linearly dependent.)
IV. Determinants which differ in a single row (or column) are added and subtracted by adding and subtracting the elements of this row (or column). For two determinants that differ only in their first row (the elements of the second determinant’s first row are written here with a prime),

$$\begin{vmatrix}a_{11}&a_{12}&\cdots&a_{1N}\\ a_{21}&a_{22}&\cdots&a_{2N}\\ \vdots&&&\vdots\\ a_{N1}&a_{N2}&\cdots&a_{NN}\end{vmatrix}\pm\begin{vmatrix}a'_{11}&a'_{12}&\cdots&a'_{1N}\\ a_{21}&a_{22}&\cdots&a_{2N}\\ \vdots&&&\vdots\\ a_{N1}&a_{N2}&\cdots&a_{NN}\end{vmatrix}=\begin{vmatrix}a_{11}\pm a'_{11}&a_{12}\pm a'_{12}&\cdots&a_{1N}\pm a'_{1N}\\ a_{21}&a_{22}&\cdots&a_{2N}\\ \vdots&&&\vdots\\ a_{N1}&a_{N2}&\cdots&a_{NN}\end{vmatrix}. \qquad(1.12)$$
$$a_{ij}\,b_{ik} = a_{ji}\,b_{ki} = \begin{cases}\det \underset{\sim}{a} & \text{for } j = k,\\ 0 & \text{for } j \neq k.\end{cases} \qquad(1.14)$$
Problem 1.4.
Compute the following determinants (consult also Section 1.6.1):
$$\text{A. } \Delta=\begin{vmatrix}1&0&-1\\ 3&1&-3\\ 1&2&-2\end{vmatrix}, \qquad \text{B. } \Delta=\begin{vmatrix}1&2&-4&4\\ 3&6&9&0\\ -2&8&1&9\\ 4&2&-2&0\end{vmatrix}.$$
1.4.1 δij
1. In δij , both indices must have the same index set, 1, . . . , N. We define δij as follows:

δij := 1 for i = j,  δij := 0 for i ≠ j.
Problem 1.5.
Compute for N = 5
A. δii , B. δi̲i̲ .
2. The term δij is clearly symmetric in its indices, i. e. its value does not change if we
change the order of the two indices.
For example, for N = 3, summing Ai δij over i gives

Ai δi1 = A1 , Ai δi2 = A2 , Ai δi3 = A3 .

We can summarize these three equations as Ai δij = Aj , and we get similar results for
N ≠ 3. If A has additional indices (possibly with a different index set) not appearing
in δij , these indices are not affected by multiplication with δij .
More generally,

Ai...jmp...q δmn = Ai...jnp...q .

Let us write this important formula in words: If we sum over one index of δmn , we
replace this index in the other quantity (here the m in Ai...jmp...q ) by the second index
of δmn (here the n) and we drop the δmn .
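This replacement rule is easy to verify numerically. The following NumPy sketch (an added illustration, not part of the original text) contracts an arbitrary three-index quantity with δmn.

```python
import numpy as np

delta = np.eye(3)
A = np.random.default_rng(0).standard_normal((3, 3, 3))

# summing over one index of delta_mn replaces that index in the other factor:
# A_imp delta_mn = A_inp
lhs = np.einsum('imp,mn->inp', A, delta)
assert np.allclose(lhs, A)
```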
Problem 1.6.
Simplify the following expressions:
A. δij δjk , B. δi2 δik δ3k , C. δ1k Ak , D. δi2 δjk Aij .
1.4.2 δ^{i...j}_{p...q}
1. In δ^{i...j}_{p...q} the number M of the upper and the lower indices is the same. We also say that δ^{i...j}_{p...q} has M pairs of indices; in addition, all 2M indices are assumed to have the same index set 1, . . . , N. Then we define
$$\delta^{ij\ldots k}_{pq\ldots r}:=\begin{vmatrix}\delta_{ip}&\delta_{iq}&\cdots&\delta_{ir}\\ \delta_{jp}&\delta_{jq}&\cdots&\delta_{jr}\\ \vdots&&&\vdots\\ \delta_{kp}&\delta_{kq}&\cdots&\delta_{kr}\end{vmatrix}. \qquad(1.19)$$
We call δ^{i...j}_{p...q} the generalized Kronecker symbol. Clearly, δ^i_j = δij , so this name is plausible.
2. Let us derive some rules for the values of the elements of δ^{i...j}_{p...q}. We first consider an element of δ^{i...j}_{p...q} with at least two equal upper or lower indices. Then the corresponding rows or columns of the determinant are equal, which implies that the determinant is zero, independent of the values of the other indices. This must be the case, for example, if the number M of index pairs is greater than the length N of the index set, and thus such δ^{i...j}_{p...q} are zero.
Next, we consider an element of δ^{i...j}_{p...q} with one upper index which does not appear
among the lower indices (or an element with a lower index which does not appear
among the upper indices). Then the elements of the corresponding row (or column) are
all zero and the determinant is again zero, independent of the values of the remaining
indices.
So only those elements can be nonzero, whose upper indices are all different and
whose lower indices are all different, and whose lower indices differ from their upper
indices at most in their order.
We now consider the case where both indices of every index pair are equal, i. e.
the order of the upper and lower indices is the same. Then all diagonal elements of
the determinant are one and all remaining elements are zero, so the determinant and
hence also the corresponding element of δ^{i...j}_{p...q} is one.
Every change in the order of two lower indices (while keeping the order of the
upper indices) exchanges the corresponding columns of the determinant and hence
changes the sign of the determinant.
This leads us to the following result.
Clearly,
δ^{i...j}_{p...q} = δ^{p...q}_{i...j} , (1.21)
since the value of a determinant does not change if we exchange rows and columns.
Problem 1.7.
A. Write out all nonzero elements of δ^{ij}_{pq} and compute their values
1. for N = 2, 2. for N = 3.
B. Compute δ^{ij}_{ij} for N = 3.
1.4.3 εi...j
1. For some of the more frequently used generalized Kronecker symbols we introduce
another notation. We define
εij...k := δ^{12...M}_{ij...k} = δ^{ij...k}_{12...M} . (1.22)
On the one hand, elements of εi...j are nonzero only if none of the indices i . . . j is larger
than M. On the other hand, generalized Kronecker symbols for N < M do not exist
anyway. Therefore, without loss of generality, we have N = M, i. e. the length of the
index set of all indices equals the number of indices in εi...j .
The term εi...j is called permutation symbol, Levi-Civita symbol, or alternating sym-
bol.
Problem 1.8.
Compute for N = 2 all elements of δij and εij .
Then we get

εi...j = (1/M!) (−1)^{αk} pk (εi...j ). (1.24)
This formula trivially holds when some of the indices of εi...j are repeated, but then
each term is zero anyway.
From (1.22) and (1.19), we obtain
$$\varepsilon_{ij\ldots k}=\begin{vmatrix}\delta_{1i}&\delta_{1j}&\cdots&\delta_{1k}\\ \delta_{2i}&\delta_{2j}&\cdots&\delta_{2k}\\ \vdots&&&\vdots\\ \delta_{Mi}&\delta_{Mj}&\cdots&\delta_{Mk}\end{vmatrix}=\begin{vmatrix}\delta_{i1}&\delta_{i2}&\cdots&\delta_{iM}\\ \delta_{j1}&\delta_{j2}&\cdots&\delta_{jM}\\ \vdots&&&\vdots\\ \delta_{k1}&\delta_{k2}&\cdots&\delta_{kM}\end{vmatrix}. \qquad(1.25)$$
3. Let us derive two more useful formulas. From (1.25), it follows that
$$\varepsilon_{ij\ldots k}\,\varepsilon_{pq\ldots r}=\begin{vmatrix}\delta_{i1}&\delta_{i2}&\cdots&\delta_{iM}\\ \delta_{j1}&\delta_{j2}&\cdots&\delta_{jM}\\ \vdots&&&\vdots\\ \delta_{k1}&\delta_{k2}&\cdots&\delta_{kM}\end{vmatrix}\cdot\begin{vmatrix}\delta_{1p}&\delta_{1q}&\cdots&\delta_{1r}\\ \delta_{2p}&\delta_{2q}&\cdots&\delta_{2r}\\ \vdots&&&\vdots\\ \delta_{Mp}&\delta_{Mq}&\cdots&\delta_{Mr}\end{vmatrix}.$$
Using the multiplication theorem for determinants (1.13), we have for the upper left
element of this product of two determinants δi1 δ1p + δi2 δ2p + ⋅ ⋅ ⋅ + δiM δMp = δim δmp = δip , and analogously for the other elements, so that with (1.19) we obtain

εij...k εpq...r = δ^{ij...k}_{pq...r} . (1.26)
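For M = 3 the identity (1.26) can be verified by brute force. The following NumPy sketch (an added illustration, not part of the original text) builds ε and the generalized Kronecker symbol directly from their definitions and compares the two sides entry by entry.

```python
import numpy as np

M = 3
eps = np.zeros((M, M, M))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0    # even permutations of 123
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0   # odd permutations of 123

delta = np.eye(M)

# generalized Kronecker symbol via (1.19): a 3x3 determinant of deltas per entry
gen_delta = np.zeros((M,) * 6)
for i, j, k, p, q, r in np.ndindex(*(M,) * 6):
    mat = np.array([[delta[i, p], delta[i, q], delta[i, r]],
                    [delta[j, p], delta[j, q], delta[j, r]],
                    [delta[k, p], delta[k, q], delta[k, r]]])
    gen_delta[i, j, k, p, q, r] = np.linalg.det(mat)

# (1.26): eps_ijk eps_pqr = delta^{ijk}_{pqr}
assert np.allclose(np.einsum('ijk,pqr->ijkpqr', eps, eps), gen_delta)
```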
The product of εij...k and δpq , where the number of indices of ε and the length of the index set of all indices is N, is

εij...k δpq = εpj...k δiq + εip...k δjq + ⋅ ⋅ ⋅ + εij...p δkq
           = εqj...k δip + εiq...k δjp + ⋅ ⋅ ⋅ + εij...q δkp . (1.27)
This formula is easy to understand. According to (1.20), we have for all indices of the index set of length N, δ^{pij...kl}_{q12...N−1,N} = 0, and with (1.19), it follows that
$$\begin{vmatrix}\delta_{pq}&\delta_{p1}&\delta_{p2}&\cdots&\delta_{p,N-1}&\delta_{pN}\\ \delta_{iq}&\delta_{i1}&\delta_{i2}&\cdots&\delta_{i,N-1}&\delta_{iN}\\ \delta_{jq}&\delta_{j1}&\delta_{j2}&\cdots&\delta_{j,N-1}&\delta_{jN}\\ \vdots&&&&&\vdots\\ \delta_{kq}&\delta_{k1}&\delta_{k2}&\cdots&\delta_{k,N-1}&\delta_{kN}\\ \delta_{lq}&\delta_{l1}&\delta_{l2}&\cdots&\delta_{l,N-1}&\delta_{lN}\end{vmatrix}=0.$$
Expanding (using the “expansion” formula) according to the first column and using
(1.26) gives
The factor ε12...N is one and can be omitted. Now we bring all terms except the first
to the right side and perform as many index exchanges in the epsilons as we need to
move the index p to the M-th position in the M-th term on the right side, while keeping
the order of the remaining indices unchanged. Clearly, M − 1 exchanges are necessary,
and we get
εij...kl δpq = εpj...kl δiq + εip...kl δjq + ⋅ ⋅ ⋅ + εij...pl δkq + εij...kp δlq .
Or, if we change the notation of the indices slightly, we get the first equation (1.27).
The second equation we get from the first equation, if we exchange p and q and use
the symmetry of δpq .
1. The system of εi...j has properties which are similar to the properties of determinants;
for example, a determinant is zero if two rows (or columns) are equal and it changes
sign if we exchange two rows (or columns). The same is true for the elements of εi...j
with respect to the indices, so it is plausible to write εi...j according to (1.25) as a deter-
minant. Conversely, we can assume that a determinant can also be written using εi...j .
To see this, we multiply (1.25) by ai1 aj2 ⋅ ⋅ ⋅ akM . Then we have
If, on the right side, we multiply ai1 into the first column of the determinant, aj2 into
the second column, and so on and take (1.17) into account, we get
$$\varepsilon_{ij\ldots k}\,a_{i1} a_{j2}\cdots a_{kM}=\begin{vmatrix}a_{11}&a_{12}&\cdots&a_{1M}\\ a_{21}&a_{22}&\cdots&a_{2M}\\ \vdots&&&\vdots\\ a_{M1}&a_{M2}&\cdots&a_{MM}\end{vmatrix},$$
and then multiply the first row of the determinant by a1i , the second row by a2j , etc.
This gives
det a∼ = εij...k ai1 aj2 ⋅ ⋅ ⋅ akM = εij...k a1i a2j ⋅ ⋅ ⋅ aMk . (1.28)
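Formula (1.28) can be evaluated directly for M = 3. The following NumPy sketch (an added illustration; the matrix values are arbitrary, not taken from the book) compares the ε-representation of the determinant with numpy.linalg.det.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

a = np.array([[1., 0., 2.],
              [2., 1., 3.],
              [1., 1., 2.]])

# det a = eps_ijk a_i1 a_j2 a_k3, the first form of (1.28)
det_by_eps = np.einsum('ijk,i,j,k->', eps, a[:, 0], a[:, 1], a[:, 2])
assert np.isclose(det_by_eps, np.linalg.det(a))
```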
det a∼ = δ^{ij...k}_{12...M} ai1 aj2 ⋅ ⋅ ⋅ akM . (a)
The value of δ^{ij...k}_{12...M} does not change if we change the order of two index pairs, so we also have
det a∼ = δ^{ji...k}_{21...M} ai1 aj2 ⋅ ⋅ ⋅ akM = δ^{ji...k}_{21...M} aj2 ai1 ⋅ ⋅ ⋅ akM .
And if we rename i to j and j to i in the last equation, we get
det a∼ = δ^{ij...k}_{21...M} ai2 aj1 ⋅ ⋅ ⋅ akM .
So we can also obtain det a∼ if we replace in (a) the lower indices by a permutation of the numbers 1, 2, . . . , M and the second indices of the terms in the product ai1 aj2 ⋅ ⋅ ⋅ akM by the same permutation. If we replace in (a) the number indices by running indices, i. e. if we write

δ^{ij...k}_{pq...r} aip ajq ⋅ ⋅ ⋅ akr ,
this represents an M-fold summation over all M values of the indices p, q, . . . , r. How-
ever, of the M M summands, only those M! terms are nonzero, in which the sequence
p, q, . . . , r is a permutation of the sequence 1, 2, . . . , M. Since all summands are equal
to det a∼, from (1.26) it follows that
$$\det \underset{\sim}{a}=\frac{1}{M!}\,\delta^{ij\ldots k}_{pq\ldots r}\,a_{ip} a_{jq}\cdots a_{kr}=\frac{1}{M!}\,\varepsilon_{ij\ldots k}\,\varepsilon_{pq\ldots r}\,a_{ip} a_{jq}\cdots a_{kr}=\frac{1}{M!}\begin{vmatrix}\delta_{ip}&\delta_{iq}&\cdots&\delta_{ir}\\ \delta_{jp}&\delta_{jq}&\cdots&\delta_{jr}\\ \vdots&&&\vdots\\ \delta_{kp}&\delta_{kq}&\cdots&\delta_{kr}\end{vmatrix}\,a_{ip} a_{jq}\cdots a_{kr}.$$
Finally, if we multiply the first row of the determinant by aip , the second row by ajq ,
etc., we get
$$\det \underset{\sim}{a}=\frac{1}{M!}\begin{vmatrix}a_{pp}&a_{qp}&\cdots&a_{rp}\\ a_{pq}&a_{qq}&\cdots&a_{rq}\\ \vdots&&&\vdots\\ a_{pr}&a_{qr}&\cdots&a_{rr}\end{vmatrix},$$

so that

$$\det \underset{\sim}{a}=\frac{1}{M!}\,\varepsilon_{ij\ldots k}\,\varepsilon_{pq\ldots r}\,a_{ip} a_{jq}\cdots a_{kr}=\frac{1}{M!}\begin{vmatrix}a_{pp}&a_{pq}&\cdots&a_{pr}\\ a_{qp}&a_{qq}&\cdots&a_{qr}\\ \vdots&&&\vdots\\ a_{rp}&a_{rq}&\cdots&a_{rr}\end{vmatrix}. \qquad(1.29)$$
2. We can also compute the cofactors of the elements of a determinant in terms of εi...j . We start with (1.29)₁ and multiply both sides by δmn , which gives

(det a∼) δmn = (1/M!) εij...k δmn εpq...r aip ajq ⋅ ⋅ ⋅ akr . (a)
Applying (1.27) to the product εij...k δmn yields

(det a∼) δmn = (1/M!) (εmj...k δin + εim...k δjn + ⋅ ⋅ ⋅ + εij...m δkn ) εpq...r aip ajq ⋅ ⋅ ⋅ akr
= (1/M!) (εmj...k εpq...r anp ajq ⋅ ⋅ ⋅ akr + εim...k εpq...r aip anq ⋅ ⋅ ⋅ akr + ⋅ ⋅ ⋅ + εij...m εpq...r aip ajq ⋅ ⋅ ⋅ anr ) .
To ensure that the factor anp ajq ⋅ ⋅ ⋅ akr appears in every term, we rename the summation
indices. We have
(det a∼) δmn = (1/M!) (εmj...k εpq...r anp ajq ⋅ ⋅ ⋅ akr + εjm...k εqp...r anp ajq ⋅ ⋅ ⋅ akr + ⋅ ⋅ ⋅ + εkj...m εrq...p anp ajq ⋅ ⋅ ⋅ akr ) .
If we now exchange an index pair in each term except the first term in both epsilons,
which does not change the value of the terms, we see that all M terms are equal. Thus
we obtain
(det a∼) δmn = (1/(M − 1)!) εmj...k εpq...r ajq ⋅ ⋅ ⋅ akr anp . (b)
Next, we introduce
bmp := (1/(M − 1)!) εmj...k εpq...r ajq ⋅ ⋅ ⋅ akr
and we want to show that the so-defined bmp is the cofactor to the element amp . For
m = p = 1, we get
b11 = (1/(M − 1)!) ε1j...k ε1q...r ajq ⋅ ⋅ ⋅ akr .
Since the number of the indices of the epsilons equals the length of the index set and
since all epsilons in which an index appears at least twice are zero, we can substitute
εj...k for ε1j...k and εq...r for ε1q...r if the M − 1 indices of the epsilons range from 2 to M. We
have
b11 = (1/(M − 1)!) εj...k εq...r ajq ⋅ ⋅ ⋅ akr , for all indices: 2, . . . , M.
But this is, according to (1.29), the minor to a11 , which is, according to (1.8), equal to
the cofactor b11 . Analogously, we obtain for m = 1, p = 2,
b12 = (1/(M − 1)!) ε1j...k ε2q...r ajq ⋅ ⋅ ⋅ akr = − (1/(M − 1)!) ε1j...k εq2...r ajq ⋅ ⋅ ⋅ akr .
We can again substitute εj...k for ε1j...k and εq...r for εq2...r if the indices j . . . k can take
values from 2 to M and the indices q . . . r can take values 1 and from 3 to M. Without
the minus sign this is the minor to a12 ; with the minus sign it is the cofactor b12 . Anal-
ogously, the same can be shown for the remaining elements of bmp , so we get for the
cofactor of the element aip of a determinant
bip = (1/(M − 1)!) εij...k εpq...r ajq ⋅ ⋅ ⋅ akr , (1.30)
(det a∼) δmn = bmp anp .
Since δij is symmetric we can change the order of m and n on the right side, so we can
also write
(det a∼) δmn = amp bnp .
(det a∼) δmn = bim ain = aim bin .
So we have
(det a∼) δij = bik ajk = aik bjk = bki akj = aki bkj . (1.31)
1.4.5 εijk
1. We will need the εi...j mostly for the case M = 3 and we wish to write out some
formulas for this case.
$$\varepsilon_{ijk}=\begin{cases}\phantom{-}1 & \text{for } ijk = 123 \text{ and its cyclic permutations } 231 \text{ and } 312,\\ -1 & \text{for } ijk = 321 \text{ and its cyclic permutations } 213 \text{ and } 132,\\ \phantom{-}0 & \text{for all remaining index combinations, i.\,e. if an index appears at least twice.}\end{cases} \qquad(1.33)$$
We can easily convince ourselves that this definition is consistent with (1.23).
Problem 1.9.
Write out the following expressions for N = 3:
A. vi = εijk ωj rk .
Let vi , ωi , and ri be the Cartesian coordinates of the three vectors v, ω, and r. Which
frequently used vector algebraic relation between v, ω, and r is expressed by this
equation?
B. Ωj = εijk ∂vi/∂xk .
Let vi and Ωi be the Cartesian coordinates of the two vector fields v and Ω, i. e.
vi and Ωi are functions of the coordinates xi = (x1 , x2 , x3 ) = (x, y, z). Which fre-
quently used vector analytic relation between Ω and v is expressed by this equa-
tion?
3. By equating two homologous indices (i. e. indices which are at the same position in
both epsilons) we obtain from (1.26), after straightforward computations, the so-called
Grassmann identity.
εijk εmnk = εikj εmkn = εkij εkmn = δim δjn − δin δjm . (1.35)
Let us describe this important formula in words. Summing over two homologous in-
dices in two epsilons (for M = 3) gives the difference of two terms which are both
products of two deltas. In the first term of the difference (the minuend), the indices of
both deltas are the homologous indices of the epsilons; in the second term (the sub-
trahend), the indices are not homologous indices of the epsilons.
Setting in (1.35) n = j gives εijk εmjk = δim δjj − δij δjm = 3 δim − δim = 2 δim ; setting additionally m = i gives εijk εijk = 2 δii = 6.
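The Grassmann identity and its contractions are easy to confirm numerically. The following NumPy sketch (an added illustration, not part of the original text) checks (1.35) and the contraction εijk εmjk = 2 δim for all index values.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

delta = np.eye(3)

# (1.35): sum over the homologous index k
lhs = np.einsum('ijk,mnk->ijmn', eps, eps)
rhs = np.einsum('im,jn->ijmn', delta, delta) - np.einsum('in,jm->ijmn', delta, delta)
assert np.allclose(lhs, rhs)

# contracting once more: eps_ijk eps_mjk = 2 delta_im
assert np.allclose(np.einsum('ijk,mjk->im', eps, eps), 2 * delta)
```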
1.5 Matrices
1.5.1 Definitions
A rectangular table of numbers is called a matrix. The numbers are the elements of
the matrix, with elements aligned horizontally forming the rows and elements aligned
vertically forming the columns. A matrix with M rows and N columns is also called an
M, N-matrix. A matrix with only one row is called a row matrix, a matrix with only
one column is called a column matrix, a matrix with N rows and N columns is called
a square matrix of order N. A matrix which can be obtained from another matrix by
dropping an arbitrary number of rows and columns is called a submatrix of the other
matrix.
We denote a matrix by a letter with an underlying tilde and its elements by the
same letter with two indices, where the first index refers to the row and the second
index refers to the column. We write
We call the element Aij at the intersection of the i-th row and the j-th column the
i, j-element of the matrix. If we take both indices as running indices and let i range
from 1 to M and j from 1 to N, then Aij is a symbol for the entire matrix. So we have
three equivalent ways to write this:
1. A matrix is a special case of an N-tuple; the easiest way to see this is to imagine all
rows of a matrix written one after another. Now we can define linear operations and
the terms linearly independent and linearly dependent for matrices.
– Equality: Two matrices A∼ and B∼ are equal if and only if their homologous elements are equal:

A∼ = B∼ ⇐⇒ Aij = Bij . (1.40)

– Addition and subtraction: To add or subtract two matrices, we add or subtract their homologous elements:

A∼ ± B∼ = C∼ ⇐⇒ Aij ± Bij = Cij . (1.41)
Clearly equality, addition, and subtraction of matrices are only possible if the matrices
have the same number of rows and columns.
– Multiplication of a matrix by a number: To multiply a matrix A∼ by a number α, we multiply every element of the matrix by this number:

α A∼ = B∼ ⇐⇒ α Aij = Bij . (1.42)
– Linear independence: Q M, N-matrices A∼^1 , A∼^2 , . . . , A∼^Q are called linearly independent, if the equation

αi A∼^i = 0∼ ⇐⇒ αi A^i_mn = 0 (1.43)

has only the trivial solution, where all αi are zero. Otherwise the matrices are called linearly dependent.
Since the equations (1.43) represent MN homogeneous linear equations for the Q un-
known αi , more than MN matrices are always linearly dependent.
2. Now we define multiplication of two matrices. A matrix C∼ is the product of two matrices A∼ and B∼ if and only if their elements satisfy the relation Cik = Aij Bjk :

A∼ B∼ = C∼ ⇐⇒ Aij Bjk = Cik . (1.44)
Multiplication of matrices requires that the number of columns of the first factor is
equal to the number of rows of the second factor. The so-called Falk scheme is a con-
venient way to perform the multiplication.
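The defining relation Cik = Aij Bjk can be written out with explicit sums. The following NumPy sketch (an added illustration with arbitrarily chosen matrices, not the book's Falk-scheme example) does exactly that and compares with the built-in matrix product.

```python
import numpy as np

A = np.array([[1., 0., 2.],
              [2., 1., 3.]])                  # a 2,3-matrix
B = np.array([[1., -1., 0., 2.],
              [0.,  2., 1., 1.],
              [3.,  1., 2., 0.]])             # a 3,4-matrix: columns of A match rows of B

# C_ik = A_ij B_jk, with the sum over the repeated index j written out
C = np.zeros((2, 4))
for i in range(2):
    for k in range(4):
        C[i, k] = sum(A[i, j] * B[j, k] for j in range(3))

assert np.allclose(C, A @ B)                  # the same product in matrix notation
# B A is not defined here: the number of columns of B (4) differs from the rows of A (2)
```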
The product of two matrices is not commutative in general; in the example given above B∼ A∼ is not even defined. The product of a matrix and the identity matrix is, however, the original matrix, independent of the order of the factors:

A∼ I∼ = I∼ A∼ = A∼ ⇐⇒ Aij δjk = δij Ajk = Aik . (1.45)
When the matrix A∼ is rectangular, however, the order of the identity matrix differs for the two cases: if we multiply A∼ by the identity matrix from the right, then the order of the identity matrix equals the number of columns of A∼; if we multiply A∼ by the identity matrix from the left, then the order of the identity matrix equals the number of rows of A∼.
If A∼ is square and regular, the converse is also true. If the result of the product (1.45) of a regular square matrix A∼ and an (initially unknown) matrix I∼ is again the original matrix A∼, then I∼ is the identity matrix.
This is easy to see: If A∼ is square, then according to (1.44), I∼ also has to be square and has the same order as A∼, say N. Then (1.45) forms a system of N² nonhomogeneous linear equations for the N² elements of I∼. If A∼ is regular, this equation system clearly always has a unique solution. Since (1.45) is satisfied for every matrix A∼ by the identity matrix I∼ and since it has a unique solution if A∼ is a regular square matrix, the identity matrix is the only solution.
The product of more than two matrices is associative.
3. The transpose A∼ᵀ of a matrix A∼ is the matrix with the elements

Aᵀij = Aji ; (1.46)

the i, j-element of the matrix A∼ᵀ is equal to the j, i-element of the matrix A∼.²
Clearly (Aᵀij )ᵀ = Aᵀji = Aij , i. e. the transpose of the transpose matrix is the original matrix:

(A∼ᵀ)ᵀ = A∼ ⇐⇒ (Aᵀij )ᵀ = Aij . (1.47)
Aim Bmj is the i, j-element of the matrix A∼ B∼, and (Aim Bmj )ᵀ is the i, j-element of the matrix (A∼ B∼)ᵀ,³ and so with (1.46) we get

(Aim Bmj )ᵀ = Ajm Bmi = Bᵀim Aᵀmj ;

the far-right term is the i, j-element of the matrix B∼ᵀ A∼ᵀ. Comparing the far-left and the far-right expressions shows that the i, j-elements of the matrices (A∼ B∼)ᵀ and B∼ᵀ A∼ᵀ are equal, and since this is true for all i and j these two matrices are equal. We have now derived the important formula

(A∼ B∼)ᵀ = B∼ᵀ A∼ᵀ ⇐⇒ (Aim Bmj )ᵀ = Bᵀim Aᵀmj . (1.48)

2 The transpose of a matrix element is not defined. So we should really read Aᵀij as “A transpose ij” and not “A ij transpose”.
3 We will read (Aim Bmj )ᵀ as “A im B mj (in parentheses) transpose”, but let us keep in mind that we really mean the i, j-element of the matrix (A∼ B∼)ᵀ.
4. Another operation defined for square matrices is the inversion of a matrix. If we can associate to a square matrix A∼ a matrix A∼⁻¹, such that the product of the two matrices is the identity matrix,

A∼ A∼⁻¹ = I∼ ⇐⇒ Aij A⁽⁻¹⁾jk = δik , (1.49)

then the matrix A∼⁻¹ is called the inverse matrix⁴ of A∼.
If A∼ is known, then (1.49) represents a nonhomogeneous system of linear equations for the elements of the matrix A∼⁻¹. According to the rules for systems of linear equations, it has a unique solution if and only if A∼ is regular. Multiplying the relation (det A∼) δij = Bki Akj (this is (1.31), written for A∼) by A⁽⁻¹⁾jm and using (1.49), we obtain

(det A∼) δij A⁽⁻¹⁾jm = Bki Akj A⁽⁻¹⁾jm = Bki δkm ,
(det A∼) A⁽⁻¹⁾im = Bmi ,

A⁽⁻¹⁾ij = Bji / det A∼ , (1.50)
where Bij is the cofactor of Aij . (Note the different positions of i and j on the two sides
of this equation!)
We conclude that a matrix is invertible if and only if it is square and regular. In
practice we usually use the Gauss–Jordan algorithm to compute the inverse matrix
rather than (1.50).
Finally, from (1.49) it also follows that
A∼⁻¹ A∼ = I∼ ⇐⇒ A⁽⁻¹⁾ij Ajk = δik , (1.51)
4 A⁽⁻¹⁾ij is the i, j-element of the matrix A∼⁻¹ and we should read “A inverse ij”. Similarly as the transpose matrix, the inverse of a matrix is defined, and the inverse of a matrix element is not defined.
Indeed, setting C∼ := A∼⁻¹ A∼, we get with (1.49)

A∼ C∼ = A∼ ,

and since A∼ is square and regular, it follows from the converse of (1.45) that C∼ is the identity matrix, which proves (1.51) and (1.52). Substituting (1.50) into (1.49)₂ and (1.51)₂ we finally obtain

(det A∼) δik = Aij Bkj = Bji Ajk ; (1.53)
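The cofactor formula (1.50) is readily checked numerically. The following NumPy sketch (an added illustration using the first matrix of Problem 1.10, not part of the original text) builds the cofactors elementwise and verifies that Bᵀ/det A∼ is indeed the inverse.

```python
import numpy as np

A = np.array([[1., 0., 2.],
              [2., 1., 3.],
              [1., 1., 2.]])          # matrix A of Problem 1.10

# cofactor B_ij of A_ij: (-1)^(i+j) times the minor obtained by deleting row i, column j
B = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        B[i, j] = (-1) ** (i + j) * np.linalg.det(minor)

A_inv = B.T / np.linalg.det(A)        # (1.50): A^(-1)_ij = B_ji / det A
assert np.allclose(A_inv @ A, np.eye(3))
assert np.allclose(A @ A_inv, np.eye(3))
```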
Problem 1.10.
Compute the inverse of the following matrices and verify your result, i. e. show that the
product of the original matrix and the computed inverse is the identity matrix (com-
pare also with Section 1.6.2):
$$\text{A. }\begin{pmatrix}1&0&2\\ 2&1&3\\ 1&1&2\end{pmatrix}, \qquad \text{B. }\begin{pmatrix}0&1&1&-1\\ 1&2&-1&1\\ 1&3&-1&0\\ -1&-2&1&-2\end{pmatrix}.$$
Problem 1.11.
A. Show that the operations transpose and inverse of a matrix commute, i. e. show
that

(A∼⁻¹)ᵀ = (A∼ᵀ)⁻¹ =: A∼⁻ᵀ . (1.54)
1.5.3 Equations Involving Matrices and Equations Involving Matrix Elements
Every equation for matrices can be written as an equation for the elements of the matrices, or in other
words, every equation for matrices can be translated into an equivalent equation for
matrix elements. From the definition of matrix equality (1.40) it follows that an equa-
tion for matrix elements has the property that the order of free indices coincides in all
its terms. Conversely, an equation for matrix elements can only be translated into an
equation for matrices, when the order of free indices coincides in all terms. This is not
the case in (1.46) and (1.50), for example.
We will meet the same facts again when we compute with tensors.
Theorems II, V, Ia, and Ib for determinants immediately imply that elementary trans-
formations do not change the rank of a matrix.
2. The normal form of any M, N-matrix with rank R is defined as the matrix
$$\underset{\sim}{D}=\begin{pmatrix}1&0&\cdots&0&0&\cdots&0\\ 0&1&\cdots&0&0&\cdots&0\\ \vdots&&&\vdots&\vdots&&\vdots\\ 0&0&\cdots&1&0&\cdots&0\\ 0&0&\cdots&0&0&\cdots&0\\ \vdots&&&\vdots&\vdots&&\vdots\\ 0&0&\cdots&0&0&\cdots&0\end{pmatrix}, \qquad(1.56)$$

where the first R rows and first R columns carry the R × R unit matrix and the remaining M − R rows and N − R columns contain only zeros.
Obviously, every M, N-matrix of rank R can be transformed into this normal form by
elementary transformations and every M, N-matrix, which can be transformed into
this normal form by elementary transformations, has rank R, so the name normal form
makes sense.
Problem 1.12.
Determine the rank of the matrices (see also Section 1.6.3)
$$\text{A. }\begin{pmatrix}0&3&6&9&-3\\ 0&0&2&-4&8\\ 1&2&-3&4&5\\ 1&5&5&9&10\end{pmatrix}, \qquad \text{B. }\begin{pmatrix}1&-3&2&0\\ 4&-11&10&-1\\ -2&8&-5&3\end{pmatrix}.$$
3. The elements of a set are equivalent and form an equivalence class, if there exists a
relation A ∼ B between two arbitrary elements A and B of the set with the following
properties. The relation is
– reflexive, i. e. A ∼ A (the relation exists also for B = A);
– symmetric, i. e. if A ∼ B then B ∼ A;
– transitive, i. e. if A ∼ B and B ∼ C then A ∼ C.
4. It can be shown that for any pair A∼ and B∼ of M, N-matrices of equal rank R, there exists a pair of regular square matrices S∼ and T∼, such that

A∼ = S∼ B∼ T∼ ⇐⇒ Aij = Sim Bmn Tnj , (1.57)
All matrices with the same number of rows, the same number of columns, and
the same rank form an equivalence class. Two matrices with this property (or in other
words, two matrices A∼ and B∼ satisfying the relation A∼ = S∼ B∼ T∼, where S∼ and T∼ are regular) are called equivalent.
5. Two square matrices A∼ and B∼ of the same order are called similar if there exists a regular matrix S∼ such that

A∼ = S∼ B∼ S∼⁻¹ . (1.58)
1. A square matrix A∼ is called orthogonal, if its inverse equals its transpose. Then Aik Aᵀkj = Aik Ajk = δij , and at the same time Aᵀik Akj = Aki Akj = δij . Thus we get

A∼ A∼ᵀ = A∼ᵀ A∼ = I∼ ⇐⇒ Aik Ajk = Aki Akj = δij . (1.59)

The last two equations show that the sum of the squares of the elements of each row (or column) is one and the sum of the products of the homologous elements of two different rows (or two different columns) is zero.
2. The determinant of an orthogonal matrix has the value ±1. The easiest way to prove this is to expand (det Aij )² using the rules for multiplication of determinants and then transpose the second determinant. Then we have

$$(\det A_{ij})^2=\begin{vmatrix}A_{11}&A_{12}&\cdots&A_{1N}\\ A_{21}&A_{22}&\cdots&A_{2N}\\ \vdots&&&\vdots\\ A_{N1}&A_{N2}&\cdots&A_{NN}\end{vmatrix}\cdot\begin{vmatrix}A_{11}&A_{21}&\cdots&A_{N1}\\ A_{12}&A_{22}&\cdots&A_{N2}\\ \vdots&&&\vdots\\ A_{1N}&A_{2N}&\cdots&A_{NN}\end{vmatrix}=\begin{vmatrix}A_{1i}A_{1i}&A_{1i}A_{2i}&\cdots&A_{1i}A_{Ni}\\ A_{2i}A_{1i}&A_{2i}A_{2i}&\cdots&A_{2i}A_{Ni}\\ \vdots&&&\vdots\\ A_{Ni}A_{1i}&A_{Ni}A_{2i}&\cdots&A_{Ni}A_{Ni}\end{vmatrix}.$$
According to (1.59) this is equal to det δij = 1, which completes the proof.
Note that the converse is not true: From det Aij = ±1 it does not follow that Aij is
orthogonal.
3. If the determinant of an orthogonal matrix has the value +1, we call the matrix proper orthogonal. If the determinant has the value −1, we call the matrix improper orthogonal.⁵
5 A more common name for proper orthogonal is special orthogonal, but there is no corresponding
commonly accepted name for det Aij = −1 matrices, so we prefer the names proper and improper.
1.6 Algorithms
In this section we summarize some fundamental algorithms from linear algebra with-
out proof.
A determinant or a matrix is transformed column by column from left to right. We
describe the steps for an arbitrary column. The row which has the same diagonal ele-
ment as the column which is being transformed is called the corresponding row. The
steps of the algorithms are carried out subsequently, while omitting the steps which
do not satisfy the required assumptions.
1.6.2 Solution of Systems of Linear Equations with the Same Regular Coefficient
Matrix (“Division by a Regular Matrix”, Gauss–Jordan Algorithm)
I. If the diagonal element and all elements below the diagonal element of the left
side of the double matrix are zero, then A
∼ is singular. (stop)
II. If the diagonal element is zero, exchange the corresponding row with a row below
to make the diagonal element nonzero.
III. Divide the corresponding row by the value of the diagonal element.
IV. Make the elements above and below the diagonal element zero by adding to each
row a convenient multiple of the corresponding row.
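The steps I–IV translate almost literally into a short routine. The following Python sketch (a minimal illustration added here, not the book's own program; it uses the largest available pivot in step II for numerical robustness) solves A∼ X∼ = B∼ for a regular A∼, i. e. performs the "division by a regular matrix".

```python
import numpy as np

def gauss_jordan_solve(A, B):
    """Solve A X = B for a regular square A by elimination steps I-IV above."""
    A = A.astype(float).copy()
    B = B.astype(float).copy()
    n = A.shape[0]
    for col in range(n):
        # step II: if the diagonal element is zero, exchange with a row below
        pivot = col + int(np.argmax(np.abs(A[col:, col])))
        if np.isclose(A[pivot, col], 0.0):
            raise ValueError("matrix is singular")            # step I
        A[[col, pivot]], B[[col, pivot]] = A[[pivot, col]].copy(), B[[pivot, col]].copy()
        # step III: divide the corresponding row by the diagonal element
        B[col] /= A[col, col]
        A[col] /= A[col, col]
        # step IV: clear the column above and below the diagonal element
        for row in range(n):
            if row != col:
                B[row] -= A[row, col] * B[col]
                A[row] -= A[row, col] * A[col]
    return B    # A has become the identity matrix, B the solution

A = np.array([[1., 0., 2.], [2., 1., 3.], [1., 1., 2.]])
X = gauss_jordan_solve(A, np.eye(3))              # right-hand side I gives the inverse
assert np.allclose(A @ X, np.eye(3))
```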
The aim is the transformation of the matrix or of the elements of the determinant into
normal form. Then the rank of the matrix or the determinant is equal to the number
of nonzero (diagonal) elements.
I. If the diagonal element and all elements below the diagonal are zero, all elements
of this column are zero. Exchange the column with the rightmost column whose
elements are not all zero.
II. If the diagonal element is zero but not all elements below the diagonal are zero,
then exchange the corresponding row with a row below to make the diagonal el-
ement nonzero.
III. Make the elements below the diagonal element zero by adding to each row a con-
venient multiple of the corresponding row.
IV. Replace the elements to the right of the diagonal element by zeros. (Make these el-
ements zero by adding to each column a convenient multiple of the column which
is being transformed.)
V. Replace the diagonal element by the number one. (Divide the corresponding row
by the value of the diagonal element.)
x = x ex + y ey + z ez
= x1 e1 + x2 e2 + x3 e3 (2.1)
= xi ei .
the tilde coordinate system. We call the other coordinate system the no-tilde coordi-
nate system. The relative position of the two coordinate systems is then characterized
by
β = βx ex + βy ey + βz ez = βi ei . (2.2)
II. The basis vectors ẽi of the tilde coordinate system are represented in the no-tilde
system as
ẽj = αij ei . (2.3)
The transformation between the no-tilde and the tilde coordinate systems is clearly
completely described by the βi and αij . The αij are called the transformation coefficients
or the transformation matrix.
1. Let ẽ^j_i be the coordinates of the vector ẽj in the basis ei , i. e. let ẽj = ẽ^j_i ei . Comparing this expression with (2.3) yields

αij = ẽ^j_i , (2.4)
Since ẽj ⋅ ek is the cosine of the angle between ẽj and ek , we get

αkj = ẽj ⋅ ek = cos ∠(ẽj , ek ),
where the αij are the so-called direction cosines. We see immediately that the indices of
αij are not interchangeable; αij is not the same as αji . For example, in the figure on the
facing page, ex and ẽy form an obtuse angle, and the corresponding direction cosine
is therefore negative; ey and ẽx form an acute angle, and the corresponding direction
cosine is therefore positive.
3. Since both the ei and the ẽi are pairwise perpendicular unit vectors, they satisfy the
relations
δij = ẽi ⋅ ẽj = αmi em ⋅ αnj en = αmi αnj em ⋅ en = αmi αnj δmn = αmi αmj , i. e.

αmi αmj = δij , αim αjm = δij . (2.6)

Taking the determinant of (2.6) and using the multiplication theorem for determinants gives (det α∼)² = 1, i. e.

det α∼ = ±1. (2.7)
For a geometric visualization of these two cases, we first assume that both coordinate
systems have the same orientation: either both are right-handed systems or both are
left-handed systems. We can then convert one system into the other system by a mo-
tion.2
1 Assuming familiarity with the scalar product of two vectors from elementary vector analysis.
2 By a motion, we mean a composition of a rotation and a translation (but not a reflection). Some
authors call this a direct, proper, or rigid motion.
For example, the transformation matrix of a reflection at the plane spanned by e1 and e2 is

$$\begin{pmatrix}1&0&0\\ 0&1&0\\ 0&0&-1\end{pmatrix},$$
and its determinant is −1. An arbitrary transformation between two coordinate sys-
tems with different orientations can be composed of a reflection followed by a motion.
Since the determinant of a transformation matrix does not change during a motion, it
has for every transformation, which changes the orientation of the coordinate system,
the value −1. In other words, the transformation is improper orthogonal.
Formula (2.3) can be easily solved for the ei with the help of the orthogonality relation. We formally multiply (2.3) by αkj , which, due to the summation convention, means summation over j. With (2.6), we get

αkj ẽj = αkj αij ei = δki ei = ek .

After renaming indices, we obtain the following transformation law for basis vectors:

ei = αij ẽj , ẽi = αji ej . (2.8)
Since we only used the orthogonality relation, these relations (as well as the trans-
formation laws we will derive later) hold, whether the coordinate system changes its
orientation during the transformation or not. In other words, the αij in (2.8) can repre-
sent a proper or an improper orthogonal matrix.
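The transformation laws (2.3) and (2.8) can be checked with a concrete choice of αij. The following NumPy sketch (an added illustration with an assumed rotation about the third axis, not part of the original text) writes the basis vectors as coordinate triples in the no-tilde system.

```python
import numpy as np

phi = np.pi / 6
alpha = np.array([[np.cos(phi), -np.sin(phi), 0.],
                  [np.sin(phi),  np.cos(phi), 0.],
                  [0.,           0.,          1.]])   # assumed transformation coefficients

e = np.eye(3)                                 # rows: the no-tilde basis vectors e_i
e_tilde = np.einsum('ij,ik->jk', alpha, e)    # (2.3): e~_j = alpha_ij e_i

# orthogonality relation (2.6) and det alpha = +1, i.e. no change of orientation (2.7)
assert np.allclose(alpha.T @ alpha, np.eye(3))
print(np.linalg.det(alpha))

# (2.8): e_i = alpha_ij e~_j recovers the original basis
assert np.allclose(np.einsum('ij,jk->ik', alpha, e_tilde), e)
```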
The position vector to a point P can be expressed, as shown in the figure on page 32,
in both coordinate systems by

x = β + x̃ , i. e. xi ei = βi ei + x̃i ẽi .
If all quantities in the last equation are expressed with respect to the no-tilde basis,
we obtain
xi ei = βi ei + x̃i αki ek .
We want to factor out the basis, so we need to rename the indices in the second term.
Then we have

xi ei = βi ei + αik x̃k ei = (βi + αik x̃k ) ei .
Since the representation (2.1) of a position vector with respect to a basis is unique, we
have
xi = βi + αij x̃j .
To solve for the tilde coordinates, we multiply by αik and use (2.6):

αik xi = αik βi + αij αik x̃j = αik βi + δjk x̃j = αik βi + x̃k ,
x̃k = αik xi − αik βi .

Thus we obtain the transformation law for point coordinates,

xi = αij x̃j + βi , x̃i = αji (xj − βj ). (2.9)
2.2 Vectors
2.2.1 Vectors, Vector Components, and Vector Coordinates
1. We first met vectors³ as arrows in space; the basis vectors of a Cartesian coordinate system are one such example. In order to represent such a vector quantitatively, we
need, similarly as for a point, a coordinate system. Analogously to (2.1) we get for a
vector a in a Cartesian coordinate system with basis ei in different notations
a = ax ex + ay ey + az ez
= a1 e1 + a2 e2 + a3 e3 (2.10)
= ai ei .

3 Latin: carrier, driver (agent noun to vehere [to carry]); short for radius vector (traveling ray, i. e. position vector: the ray from a fixed point to a moving point, e. g. in the case of planetary motion from the sun at the focus of the elliptic orbit to the planet). The term vector thus evolved from the term position vector.
2. Here, we would like to conceptually distinguish between the components and the
coordinates of a vector. We call the quantity ax ex the x-component of the vector a, and
we call the quantity ax the x-coordinate of the vector a (in the given coordinate system
or with respect to the given basis). The components of a vector are themselves vectors,
but the coordinates are not; the components of a vector are characterized by their mag-
nitude and direction, the coordinates are characterized by their absolute value and
sign. Conceptually distinguishing between components and coordinates of a vector is
not common practice, but it is meaningful.
1. We can obtain the transformation law for the coordinates of an oriented line from
the transformation law for point coordinates if we realize that (compare the [again
only two-dimensional] figure on this page)
a = y − x = ỹ − x̃,
and so
ai ei = (yi − xi ) ei , ai = yi − xi ,
ãi ẽi = (ỹi − x̃i ) ẽi , ãi = ỹi − x̃i .
Here we could again cancel out the bases ei and ẽi , because the decomposition of a vec-
tor into components is unique, like the decomposition of a position vector. According
to (2.9) we have

ỹi = αji (yj − βj ), x̃i = αji (xj − βj ),

which gives

ãi = ỹi − x̃i = αji (yj − xj ) = αji aj .
Thus we obtained the transformation law for the coordinates of an oriented line
ai = αij ãj , ãi = αji aj . (2.11)
The same law also follows directly from the two representations of the vector a,

a = ai ei = ãi ẽi .

Substituting ei = αij ẽj from (2.8) gives

a = ai αij ẽj = ãi ẽi .
Clearly, these are two different representations of the vector a with respect to the tilde
basis. The representation of a vector with respect to a basis is unique, so we can cancel
out the basis and obtain an equation for coordinates only. This changes the summa-
tion index of the basis into a free index. For all terms in the equation for coordinates
to match in this free index, we have to rename the summation indices before cancel-
ing out the basis, so that the index of the tilde basis is the same on both sides of the
equation. For example, here we can replace on the left side j by i and i by j, and we get
aj αji ẽi = ãi ẽi .
Cancelling out the basis gives ãi = αji aj , in agreement with (2.11).
2. Two typical examples of physical quantities described by vectors are velocities and
angular velocities. However, there is an important difference between these two types
Clearly, polar vectors satisfy these invariances inherently. For a quantity associated
with an axial vector, both the direction of rotation and the corresponding vector have
to be transformed. In a motion, the correlation between the direction of rotation and
the direction of the corresponding vector remains unchanged; a reflection, however,
reverses this correlation: a right-hand screw correlation becomes a left-hand screw
correlation and vice versa. If we do not adjust the correlation, then the signs of the
vector coordinates are reversed in the reflected coordinate system, i. e. the invariance
requirement is violated. To preserve reflection invariance, we introduce the following
convention.
This is a convention in the sense that the reflection invariance would also be satisfied
if we associate with such a quantity in a right-handed system a vector corresponding
to a left-hand screw rule and in a left-handed system a vector corresponding to the
right-hand screw rule.
In a motion (translation and rotation) where the correlation between the direction
of rotation and the corresponding vector is preserved, we do not have this problem.
Certainly this convention has to hold also if the same physical quantity is repre-
sented in two different coordinate systems. A polar vector is invariant under an ar-
bitrary change of coordinates, i. e. we have a = a ̃ , which explains why the vector
commonly is denoted by a in both coordinate systems. An axial vector, however, is
invariant only under a motion, for a coordinate transformation with a change of ori-
entation we have a = −a ̃ . This may seem unfamiliar, as it means that a vector describ-
ing an angular velocity, angular momentum, or torque depends on the orientation of
the coordinate system in which it is represented. In other words, such a vector is not
independent of the coordinate system.
Since we have det α∼ = 1 for a change of coordinates without change of orientation and det α∼ = −1 for a change of coordinates together with a change of orientation, we can summarize the transformation law for axial vectors by writing

a = (det α∼) ã , ã = (det α∼) a ,
ai = (det α∼) αij ãj , ãi = (det α∼) αji aj . (2.12)
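A cross product gives a concrete axial vector on which (2.11) and (2.12) can be compared. The following NumPy sketch (an added illustration with arbitrary vectors and an assumed reflection, not part of the original text) shows that the cross product computed in the reflected system matches the transformed axial coordinates only when the factor det α∼ is included.

```python
import numpy as np

alpha = np.diag([1., 1., -1.])        # improper orthogonal: a reflection, det alpha = -1
det = np.linalg.det(alpha)

a = np.array([1., 2., 3.])            # coordinates of a polar vector
b = np.array([0., 1., 0.])
w = np.cross(a, b)                    # an axial vector, e.g. obtained as a cross product

a_t, b_t = alpha.T @ a, alpha.T @ b   # (2.11): polar coordinates transform with alpha alone
w_t = det * alpha.T @ w               # (2.12): axial coordinates pick up the factor det alpha

# the cross product computed in the tilde system agrees with the transformed w
assert np.allclose(np.cross(a_t, b_t), w_t)
```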
3. From the decomposition (2.10) and the transformation laws (2.11) and (2.12) we find
the following definition of a polar vector and an axial vector.
4. Finally we remark that the Cartesian coordinates of a vector depend only on the ba-
sis of the chosen coordinate system, but the coordinates of a position vector depend
also on its origin. This shows that position vectors are no vectors in the sense of the
definition above; however, the difference of two position vectors, which can be inter-
preted geometrically as a directed line from one point to another point, is a vector in
the sense of the definition above; compare the figure on page 36 and the derivation
of (2.11).
2.3 Tensors
2.3.1 Second-Order Tensors
Ai = Tij Bj . (2.13)
Let ẽj be another arbitrary basis such that between the coordinates Ãi and B̃j of the two vectors A and B the analogous relation

Ãi = T̃ij B̃j (2.14)

holds. Then, what is the relation between the two matrices Tij and T̃ij ?
Let us assume initially that the two vectors A and B are polar. Then according
to (2.11) we have
Ai = αik Ãk , Bj = αjn B̃n .
Substituting these into (2.13) gives

αik Ãk = Tij αjn B̃n .

Multiplication by αim and use of the orthogonality relation αim αik = δmk yield

αim αik Ãk = Tij αim αjn B̃n , Ãm = αim αjn Tij B̃n = T̃mn B̃n ,
(αim αjn Tij − T̃mn ) B̃n = 0.

With the abbreviation Smn := αim αjn Tij this reads

(Smn − T̃mn ) B̃n = 0. (a)
Now we choose three linearly independent polar vectors a, b, and c, so that every polar
vector B can be written as a linear combination of these three vectors; we have
B = α a + β b + γ c , B̃n = α ãn + β b̃n + γ c̃n .
If we substitute the last equation into (a), we see that (a) is satisfied for arbitrary B, if
(a) is satisfied for three linearly independent vectors. If we now substitute in (a) for B
successively a, b, and c, and write out the equations for m = 1, we obtain
(S11 − T̃11 ) ã1 + (S12 − T̃12 ) ã2 + (S13 − T̃13 ) ã3 = 0,
(S11 − T̃11 ) b̃1 + (S12 − T̃12 ) b̃2 + (S13 − T̃13 ) b̃3 = 0,
(S11 − T̃11 ) c̃1 + (S12 − T̃12 ) c̃2 + (S13 − T̃13 ) c̃3 = 0.
This is a system of three homogeneous linear equations for the three quantities S11 − T̃11 , S12 − T̃12 , S13 − T̃13
with nonzero coefficient determinant (the vectors a, b, and c are linearly independent); i. e. the system of linear equations has only the trivial solution

S11 = T̃11 , S12 = T̃12 , S13 = T̃13 .

For m = 2 and m = 3 we find analogously that also the other elements of the two matrices Sij and T̃ij must be equal, i. e. we have

T̃mn = αim αjn Tij ,

and from this we compute the inverse relationship by multiplying by αpm αqn :

Tij = αim αjn T̃mn , T̃ij = αmi αnj Tmn . (2.15)
This is clearly a generalization of the transformation law (2.11) for coordinates of po-
lar vectors. Similarly as before, where we interpreted the quantities ai and ã i , related
by (2.11), as the coordinates of a (coordinate system-independent) polar vector a, we
interpret the matrices Tij and T̃ ij , related by (2.15), as the coordinates of a (coordinate
system-independent) second-order polar tensor4 T with respect to the bases ei and ẽi .
Thus, if in any coordinate system the coordinates Bi of an arbitrary polar vector B
(or equivalently, the coordinates of three linearly independent polar vectors) are as-
signed to the coordinates Ai of a polar vector by always the same relation, either (2.13)
or (2.14), then the Tij are the coordinates of a polar tensor of order 2.
Since Tij = Tᵀji we arrive at the same transformation law if we start with the equations Ai = Tᵀji Bj and Ãi = T̃ᵀji B̃j , i. e. if we sum over the first index of the matrix.
4 Latin: tensioner (agent noun to tendere [to tension]); identified initially only with the quantity that
we today call the (Cauchy) stress tensor. (The word stress tensor is etymologically a pleonasm.)
Because the αij are orthogonal and since the transposed matrix of an orthogonal
matrix is equal to its inverse, it follows from (1.58) that all coordinate matrices of a
second-order polar tensor are similar; however, not all matrices which are similar to
a coordinate matrix of a second-order polar tensor are Cartesian coordinates of this
tensor.
Problem 2.1.
Prove the converse: If the relation (2.13) or (2.14) is valid in any Cartesian coordinate
system and if the Bi are the coordinates of a polar vector and the Tij are the coordinates
of a second-order polar tensor (i. e. they satisfy the transformation equations (2.11) and
(2.15)), then the Ai are the coordinates of a polar vector. (Note the difference: For the
original theorem the Bi must be the coordinates of an arbitrary vector, but the converse
theorem only requires that they are the coordinates of a vector.)
2. To develop an intuition for second-order polar tensors, let us consider the (Cauchy)
stress tensor, from which the tensor originally has its name. The stress tensor describes
the relation between the stress vector σ, i. e. the surface force per unit surface element
in a point of a surface and the orientation n of the surface element through this point.
The stress tensor depends, even in the simplest case of a fluid at rest, on the orientation
of the area element: it is always perpendicular to the surface element. Let p be the
pressure at a given surface element. Then we have
σi = −p ni .
For a fluid in motion (or a deformable solid body) it is shown in mechanics that the relations

σi = πij nj

hold between the coordinates of σ and n. (If we set

πij = −p δij ,

we obtain the formula for the fluid at rest as a special case of the formula above.)
Since both n and σ are polar vectors and since we can substitute for n in particular the
three linearly independent unit vectors ex , ey , and ez , the πij are, according to our defi-
nition (2.13), the coordinates of a polar tensor of order 2, which we call the stress tensor π.
The stress tensor describes the state of stress at a given point by assigning a stress vec-
tor σ to every normal vector n; hence, we can also say the tensor π maps the vector n
to the vector σ.
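The mapping σi = πij nj is easy to experiment with numerically. The following sketch (Python/NumPy, with an assumed pressure value and a hypothetical stress state chosen only for illustration) evaluates the stress vector for a fluid at rest and for a general stress tensor.

    import numpy as np

    # For a fluid at rest with (assumed) pressure p: πij = −p δij, so σ = −p n.
    p = 2.5
    pi_rest = -p * np.eye(3)

    # A hypothetical stress state of a deformed body (values chosen for illustration):
    pi = np.array([[10.0,  2.0,  0.0],
                   [ 2.0, -4.0,  1.0],
                   [ 0.0,  1.0,  6.0]])

    n = np.array([0.0, 0.6, 0.8])        # unit normal of a surface element
    print(pi_rest @ n)                    # −p n: perpendicular to the surface element
    print(pi @ n)                         # stress vector for the general stress state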
3. If in (2.13) Ai and Bi are the coordinates of two axial vectors, we again obtain the
transformation law (2.15); so also for this case the Tij are the coordinates of a second-
order polar tensor. If, however, one of the two vectors A and B is a polar vector and the
other vector is an axial vector, then we obtain the transformation law
Tij = (det α) αim αjn T̃mn ,   T̃ij = (det α) αmi αnj Tmn ,
and the Tij are called the coordinates of an axial tensor of order 2. Often the word
pseudotensor is used instead of axial tensor.
1. We can show similarly that if in any Cartesian coordinate system the coordinates Bi
of an arbitrary vector are assigned to the coordinates Aij of a tensor of order 2 by the
analogous relation
Aij = Tijk Bk ,
then the relation between Tijk and T̃ijk with respect to the bases ei and ẽi , where A and
B are both either polar or axial, is given by the transformation law
Tijk = αim αjn αkp T̃mnp , T̃ijk = αmi αnj αpk Tmnp
or, if one of the two vectors A and B is polar and the other is axial, by the transformation
law
Tijk = (det α) αim αjn αkp T̃mnp ,   T̃ijk = (det α) αmi αnj αpk Tmnp .
In the former case the Tijk are called the coordinates of a polar tensor T of order 3, and
in the latter case they are called the coordinates of an axial tensor T of order 3.
We obtain the same transformation laws if we start with the relations
Aij = T∗ikj Bk   or   Aij = T∗∗kij Bk .
2. A scalar is also called a tensor of order 0 and a vector is also called a tensor of
order 1. Thus we have the following rule.
The order of a tensor equals the number of underlines of the tensor and the number
of indices of its coordinates.
3. The transformation equations for polar tensors and for axial tensors are different.
For polar tensors we have the following laws.
a = ã ,                    ã = a ,
ai = αim ãm ,              ãi = αmi am ,        (2.17)
aij = αim αjn ãmn ,        ãij = αmi αnj amn ,
etc.                       etc.
For axial tensors we have the following laws.
a = (det α) ã ,                    ã = (det α) a ,
ai = (det α) αim ãm ,              ãi = (det α) αmi am ,        (2.18)
aij = (det α) αim αjn ãmn ,        ãij = (det α) αmi αnj amn ,
etc.                               etc.
Here det α denotes the determinant of the matrix of the transformation coefficients αij ,
which for orthogonal transformations equals +1 or −1.
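The different behaviour of polar and axial coordinates under an orientation-reversing transformation can be made concrete with a short numerical sketch (Python/NumPy; the vector below is an arbitrary example), illustrating (2.17) and (2.18).

    import numpy as np

    alpha = np.diag([1.0, 1.0, -1.0])     # reflection in the x-y plane, det α = −1
    det = np.linalg.det(alpha)

    a = np.array([1.0, 2.0, 3.0])

    a_tilde_polar = np.einsum('mi,m->i', alpha, a)            # ãi = αmi am
    a_tilde_axial = det * np.einsum('mi,m->i', alpha, a)      # ãi = (det α) αmi am
    print(a_tilde_polar)   # [ 1.  2. -3.]
    print(a_tilde_axial)   # [-1. -2.  3.]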
4. For both polar and axial vectors we have in the two coordinate systems
a = ai ei ,   ã = ãi ẽi .
Similarly, tensors of order 2 or higher can be defined from their representation with
respect to a Cartesian coordinate system and either the transformation law (2.17) or
(2.18), respectively, for the coordinates so obtained. However, we need to postpone
this definition, until we know how to represent a tensor in a coordinate system.
5. The tensor whose coordinates are all zero is called the zero tensor. We denote such
a tensor by 0, 0, 0, etc.
Physical processes are reproducible and invariant under translations, rotations, and
reflections of the coordinate system, as well as under a change of the units in which
physical quantities are measured. All these invariances have consequences for the
formulation of physical equations, namely it fol-
lows that physical equations do not change under a symmetry operation (e. g. when a
physical experiment is repeated). As a result, certain measurable properties of phys-
ical processes cannot appear in a physical equation (i. e. strictly speaking, they are
not physical quantities), and not all mathematically defined combinations of physical
quantities are also physically permissible. We only mention these consequences here;
a sound justification is beyond the scope of this text.
From the reproducibility of a physical process it follows that points of time can-
not appear explicitly in a physical equation but only time intervals; and from the in-
variance under translation it follows that points (position coordinates) cannot appear
explicitly in a physical equation but only distances between points.
From the invariance of physical properties under a change of units it follows that
all terms of a physical equation must be measurable in the same unit (in other words,
they must be dimensionally homogeneous). For example, energy and temperature are
not measured in the same units, so an energy cannot be equal to a temperature. This
symmetry is systematically utilized in dimensional analysis, a method which is known
to reduce the number of mathematically possible combinations of physical quantities
significantly.
The reflection invariance leads to the distinction between polar and axial tensors;
and from the rotational invariance it follows that physical quantities must all be (po-
lar or axial) tensors and that in a physical equation all terms must be tensors of the
same order that must all be either polar or axial. (Analogous to the concept of dimen-
sional homogeneity, we could call this property tensorial homogeneity.) For exam-
ple, a torque (vector!) cannot be equal to a work (scalar!), although both quantities
are measured in the unit Joule, i. e. we have dimensional homogeneity; and a velocity
(polar vector!) cannot be equal to an angular velocity (axial vector!), even if we elim-
inated their dimensional inhomogeneity by multiplying the axial vector with a polar
scalar. These symmetries are systematically exploited in the theory of representation
of tensor functions (see Chapter 5).
There are no practical reasons for the use of left-handed systems in physics and
technology. Thus we could drop the distinction between polar tensors and axial ten-
sors if we would limit ourselves to right-handed systems. But then we could not use
the reflection invariance to exclude equations which are physically not permissible
(i. e. which are irrelevant for the description of physical processes).
We can define mathematical operations for the tensors themselves and for their coor-
dinates. As a result we have two different notations of tensor calculus, which we call
symbolic notation and index notation. We require tensor equations to be independent
of a coordinate system in symbolic notation (i. e. between tensors) as well as in index
notation (i. e. between tensor coordinates).
It is important to master both notations and to be able to translate from one nota-
tion into the other. Every equation in symbolic notation can be translated into an equa-
tion in index notation; the converse is not always possible, but it is in most practically
occurring cases. We think it is best to grow up virtually bilingual from the beginning
and will therefore write all operations in both notations. When performing the com-
putations we use whichever notation is more convenient. Usually this means we do
the computations in index notation and then translate the results into symbolic nota-
tion. Occasionally we will also speak seemingly inaccurately, for example of a vector
ai , since the coordinates of a vector in an unspecified Cartesian coordinate system are
after all also a symbol for the vector itself.
Let us finally agree to regard the coordinates of vectors as column matrices (rather
than row matrices). Then we can write all tensor equations which contain only scalars,
vectors, and second-order tensors also as matrix equations. Writing tensor equations
in matrix notation combines the conciseness of symbolic notation with the algorith-
mic character of index notation. It fails however, for tensors of order higher than 2
(we will see that this is already the case for the vector product of two vectors); and it
is not obvious how many columns a matrix a ∼ has (for example, whether it represents
the coordinates of a vector or of a second-order tensor). Nevertheless, matrix notation
is widely used (and often confused with symbolic notation). For this reason we will
write the definitions of tensor algebraic operations not only in symbolic notation and
index notation, but whenever possible also in matrix notation; however, we will not
use matrix notation in our computations.
5 Since tensor coordinates are (for us: real) numbers (extension to complex numbers is straightfor-
ward), for tensor equations in index notation the rules of arithmetic apply.
Two tensors are equal if all their homologous coordinates are equal.
a = A ,      a = A ,        a = A ,
a = B ,      ai = Bi ,      a∼ = B∼ ,       (2.19)
a = C ,      aij = Cij ,    a∼ = C∼ ,
etc.         etc.
Conversely, this implies that two tensors are not equal if they differ in at least one pair
of homologous coordinates.
Two tensors are added or subtracted by adding or subtracting their homologous
coordinates.
a ± b = A ,      a ± b = A ,         a ± b = A ,
a ± b = B ,      ai ± bi = Bi ,      a∼ ± b∼ = B∼ ,       (2.20)
a ± b = C ,      aij ± bij = Cij ,   a∼ ± b∼ = C∼ ,
etc.             etc.
A tensor is multiplied by a scalar α by multiplying each of its coordinates by α.
α a = A ,      α a = A ,        α a = A ,
α a = B ,      α ai = Bi ,      α a∼ = B∼ ,       (2.21)
α a = C ,      α aij = Cij ,    α a∼ = C∼ ,
etc.           etc.
2. For completeness, we still need to convince ourselves that the quantities A, Bi , Cij
defined in this way are tensor coordinates, i. e. that they satisfy the corresponding
transformation law, provided that the quantities on the left side of these equations
are tensor coordinates. This is easy to see; e. g. for polar tensors we have
B̃i = ãi ± b̃i = αmi am ± αmi bm = αmi (am ± bm ) = αmi Bm .
So we can only equate, add, or subtract tensors which obey the same transformation
law, and hence have the same order and are either both polar or both axial.
3. We call Q tensors A(1) , A(2) , . . . , A(Q) , which have the same order and are either all
polar or all axial, linearly independent if the equation
αi A(i) = 0
is only satisfied for αi = 0; otherwise these tensors are called linearly dependent. More
than three vectors, more than nine second-order tensors, or in general more than 3^N
tensors of order N are clearly always linearly dependent.
2. Two second-order tensors whose coordinates differ only by the order of the indices
are called transposed tensors. Clearly, their coordinate matrices are transposed ma-
trices. Similar to transposed matrices we also denote a transposed tensor by the same
letter with a T as superscript.
3. Tensors whose coordinates differ only in the order of indices are called isomers. The
number of isomers of a tensor increases quickly with its order: A second-order tensor
has two isomers, namely the tensor itself and the transposed tensor. A tensor of order
3 has already six isomers, namely Aijk , Ajki , Akij , Akji , Ajik , and Aikj . Evidently tensors
of order n have n! isomers, and we usually do not have a symbolic notation for them.
a = aT ,      aij = aji ,      a∼ = a∼T .      (2.24)
Every second-order tensor can be decomposed,
aij = 1/2 (aij + aji ) + 1/2 (aij − aji ) ,      (2.26)
into a symmetric part a(ij) = 1/2 (aij + aji ) and an antimetric part a[ij] = 1/2 (aij − aji ),
which is easy to see if we expand the formula. We analogously say that a higher-order
tensor is symmetric or antimetric with respect to an index pair; e. g. we have⁶
a[ijk]l = 1/2 (aijkl − akjil ).
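A minimal numerical sketch of the decomposition (2.26), with an arbitrarily chosen coordinate matrix:

    import numpy as np

    a = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 10.0]])

    a_sym  = 0.5 * (a + a.T)      # symmetric part a(ij)
    a_anti = 0.5 * (a - a.T)      # antimetric part a[ij]
    print(np.allclose(a_sym + a_anti, a))      # True: the parts add up to a
    print(np.allclose(a_anti, -a_anti.T))      # True: a[ij] is antimetric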
7. Finally, for completeness, we need to show that the transposed coordinate matrices
of a tensor in different coordinate systems are again the coordinates of a tensor. This
is easy to see using the transformation law.
Let aij and ãij be the coordinates of a second-order polar tensor in two different
coordinate systems, i. e. we have aij = αim αjn ãmn , and let ãTnm = ãmn be the trans-
posed coordinate matrix in the tilde system. This gives
aTij = aji = αjm αin ãmn = αim αjn ãnm = αim αjn ãTmn ,
i. e. the transposed coordinate matrices satisfy the same transformation law.
Thus, symmetry or antimetry of a tensor is a property of the tensor itself and not just
of its coordinate matrix in a particular coordinate system.
Problem 2.2.
A tensor (in a given Cartesian coordinate system) has the coordinates
        ( 1   0   2 )
aij =   ( 4  −3   6 ) .
        ( 2  −4   5 )
A. Write out the coordinates of the transposed tensor.
B. Decompose the tensor into its symmetric and its antimetric part and check your
result, i. e. show that the sum of both parts gives the original tensor.
6 Some authors denote by a[ijk]l the average of all possible permutations of the indices i, j, k, where
an odd number of index exchanges result in a minus sign, i. e.
a[ijk]l = 1/3! (aijkl + ajkil + akijl − akjil − ajikl − aikjl ).
We introduce three different products of tensors: the tensor product,8 the scalar prod-
uct,9 and the vector product,10 and we again define them by specifying the arithmetic
operations that need to be performed between the coordinates of the two factors.
The tensor or dyadic product is a generalization of the usual product of two
scalars, so we denote it similarly by writing the two factors next to each other without
a symbol between them.11 It is defined for tensors of arbitrary order.
a b = A ,      a b = A ,         a b = A ,
a b = B ,      a bi = Bi ,       a b∼ = B∼ ,
a b = C ,      ai b = Ci ,       a∼ b = C∼ ,
a b = D ,      ai bj = Dij ,     a∼ b∼T = D∼ ,       (2.27)
a b = E ,      a bij = Eij ,     a b∼ = E∼ ,
a b = F ,      aij b = Fij ,     a∼ b = F∼ ,
a b = G ,      ai bjk = Gijk ,
a b = H ,      aij bk = Hijk ,
etc.           etc.
The name tensor product indicates that the tensor product of two vectors is a second-
order tensor.
Problem 2.3.
Two vectors have (in a given coordinate system) the coordinates ai = (2, 3, 4),
bi = (−2, 1, 5). Compute the coordinates of the tensor products a b and b a.
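The coordinates of a dyadic product are conveniently computed with np.outer; the following sketch uses the vectors of Problem 2.3 and also checks that b a is the transposed tensor of a b.

    import numpy as np

    a = np.array([2.0, 3.0, 4.0])
    b = np.array([-2.0, 1.0, 5.0])

    ab = np.outer(a, b)     # coordinates (a b)ij = ai bj
    ba = np.outer(b, a)     # coordinates (b a)ij = bi aj
    print(ab)
    print(np.allclose(ba, ab.T))   # True: b a is the transposed tensor of a b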
2.7.2 Properties
1. First, we need to convince ourselves that the result of the so-defined product be-
tween tensor coordinates consists again of tensor coordinates. We do this by means of
the equation ai bj = Dij . Writing this equation in two different Cartesian coordinate
systems, we have
ai bj = Dij and ãi b̃j = D̃ij ,
and if, for example, ai and bj are the coordinates of two polar vectors, we have
ai = αim ãm and bj = αjn b̃n .
This gives
Dij = ai bj = αim ãm αjn b̃n = αim αjn ãm b̃n = αim αjn D̃mn ,
i. e. the Dij are according to (2.15) the coordinates of a second-order polar tensor.
2. Clearly, the tensor product of two polar tensors or of two axial tensors gives a polar
tensor, and the tensor product of two tensors where one is polar and the other is axial
gives an axial tensor.
3. If we compare the equations in (2.27) in the two notations, we realize that for the
translation the order of the indices must be the same on the left side and on the right
side of the equation. So sometimes we may have to change the order of the factors or
introduce isomers. For example, the equation
aj bi ck + dk eji = fijk
must first be rewritten as
bi aj ck + eijT dk = fijk ,
which then translates into
b a c + eT d = f .
If we have to generate isomers of tensors with order higher than 2, then we would
need additional rules to be able to translate the equation into symbolic notation. An
example is the following equation:
aik bj = cijk .
Problem 2.4.
Translate, as far as possible, into the other two notations:
A. a b c = d, B. ai bkl cj = dijkl , C. αi bkj = Aijk , D. aTmn = bnm .
4. Tensor coordinates are real numbers; so all operations in index notation must sat-
isfy the rules for calculations with real numbers, e. g. the commutative law and the
associative law for multiplication must hold. We need to check for each law if it is also
satisfied in symbolic notation.
Clearly addition and tensor multiplication are distributive also in symbolic nota-
tion. We have, for example, in index notation ai bj + ai cj = ai (bj + cj ), and translated
a b + a c = a(b + c).
We can similarly prove the associative law, e. g. we have (ai bj ) ck = ai (bj ck ), and
translated (a b) c = a(b c).
However, the commutative law does not hold in general. It only holds if one of
the factors is a scalar: a bij = bij a gives a b = b a, but ai bj = bj ai cannot be
translated directly because of the different order of the free indices on the two sides of
the equation. It is easy to see that in general a b ≠ b a. We set a b =: A and b a =: B.
Then we have ai bj = Aij and bm an = Bmn .
Since the equation on the right side is commutative in index notation, we can also
write an bm = Bmn and then rename the indices for easier comparison with the equa-
tion on the left side, i. e. we replace n by i and m by j. This gives ai bj = Bji , so we
have Aij = Bji , the tensors A and B are transposed, i. e. we have a b = (b a)T , and thus
a b ≠ b a in general.
5. A tensor product is zero if and only if at least one factor is zero. Consider, for ex-
ample, the tensor product C = A B and assume that only one coordinate of the two
factors is nonzero in the coordinate system under consideration. Then also one co-
ordinate of the product is nonzero. Only if all the coordinates of one factor are zero,
then also all the coordinates of the product are zero. And if all the coordinates of the
product are zero, then also all the coordinates of at least one factor are zero.
6. We can factor out a factor which is common to several tensor products, if the factor
is nonzero. For example, from a b = a c we first get a(b − c) = 0. If a ≠ 0, it then
follows from the last paragraph that b − c = 0, which gives b = c.
7. We can now prove the so-called quotient rule: if the tensor product of two factors,
where one of the factors is a nonzero tensor of order m, is a tensor of order (m+n), then
the other factor is a tensor of order n. It is a polar tensor if the two tensors are either
both polar or both axial, and it is axial if of the two given tensors one is polar and the
other is axial. Consider, for example, the relation
ai bj = Dij and ãi b̃j = D̃ij
in two Cartesian coordinate systems and let ai be a nonzero polar vector and Dij a
second-order polar tensor, i. e.
ai ≠ 0 ,   ai = αim ãm ,   Dij = αim αjn D̃mn .
From these relations it follows that
bi = αim b̃m ,
i. e. the bi are the coordinates of a polar vector.
1. We can now use tensor products to associate second- and higher-order tensors with
their coordinates in a Cartesian coordinate system, analogously to (2.10). If we replace
the left side of the equation a b = c using the representation (2.10), we obtain with
(2.27)
a b = ai ei bj ej = ai bj ei ej = cij ei ej = c,
i. e.
c = cij ei ej .
Here ei ej is a tensor product of two basis vectors. Since i and j take the values 1 to 3, we
can write ei ej as a square matrix with three rows, whose elements are second-order
tensors:
e1 e1 e1 e2 e1 e3
ei ej = ( e2 e1 e2 e2 e2 e3 ) .
e3 e1 e3 e2 e3 e3
Generalizing (2.10) gives the following relation between a tensor of arbitrary order and
its coordinates with respect to the basis ei of a coordinate system.
a = ai ei ,
a = aij ei ej ,      (2.28)
a = aijk ei ej ek ,
etc.
2. Just as any vector a can be written as a linear combination of the three basis vec-
tors ei with the coordinates ai as coefficients, we can also represent any second-order
tensor a as a linear combination of the nine basis tensors ei ej with the coordinates aij
as coefficients, and the same holds for tensors of higher order. Similarly as for vectors,
we also distinguish for tensors of arbitrary order between their components and coor-
dinates; for example, we call the tensor axy ex ey the x, y-component of a in the given
coordinate system, and we call the quantity axy its x, y-coordinate. The components
of a tensor are also tensors, whereas the coordinates are not.
3. We can now generalize our definition of a vector from Section 2.2.2 to tensors of
arbitrary order.
At this point it is useful to distinguish three different types of equations with tensors,
namely tensor equations, transformation equations, and representation equations:
– Tensor equations relate different tensors (tensor equations in symbolic notation)
or coordinates of different tensors in the same coordinate system (tensor equa-
tions in index notation), e. g. a b = c or aij bk = cijk .
– Transformation equations relate the coordinates of the same tensor in different
coordinate systems, e. g. ai = αij a ̃j .
– Representation equations relate a tensor itself and its coordinates in a given coor-
dinate system, e. g. a = aij ei ej .
Clearly, αim αjn δmn = αim αjm = δij , i. e. the δij satisfy the transformation law (2.17)
for the coordinates of a second-order polar tensor. Thus we can interpret the δij as the
Cartesian coordinates of a second-order polar tensor
δ = δij ei ej (2.29)
which is called the δ-tensor, identity tensor, metric tensor, or fundamental tensor.
Clearly, it has the property that its coordinates are the same in every Cartesian co-
ordinate system, namely they are equal to the δij defined in (1.15).
1. Now we want to show that the εijk satisfy the transformation law (2.18) for the coor-
dinates of a third-order axial tensor, i. e. that they are the Cartesian coordinates of an
axial tensor
ε = εijk ei ej ek . (2.30)
Since the εijk are defined by (1.33), independently of a coordinate system, they must
satisfy the equation εijk = (det α) αmi αnj αpk εmnp . Let
ε̃ijk = (det α) αmi αnj αpk εmnp ,
and then we have to show, using the properties of the εijk , that ε̃ijk = εijk .
Only six of the 27 terms in the triple sum on the right side are nonzero, according
to (1.33):
ε̃ijk = (det α) [α1i α2j α3k + α2i α3j α1k + α3i α1j α2k − α3i α2j α1k − α1i α3j α2k − α2i α1j α3k ].
The bracket can clearly be written as a determinant, i. e.
               | α1i  α1j  α1k |
ε̃ijk = (det α) | α2i  α2j  α2k | .
               | α3i  α3j  α3k |
Since changing the order of two indices in ̃εijk exchanges two columns and changes
the sign of the determinant, it follows that ̃εijk , like εijk , is antimetric with respect to
all index pairs; i. e. analogously to (1.34) we have
If two indices have the same value, then two columns of the determinant are equal,
i. e. the determinant is zero; so just like εijk , ̃εijk is also zero, if at least two indices are
equal. Finally, it remains to show that ̃ε123 = 1, and indeed for i = 1, j = 2, k = 3 we
have ε̃123 = (det α)(det α) = (det α)² = 1, since for these indices the determinant above
is just det α.
Some authors use instead the polar tensor
E = Eijk ei ej ek ,   with Eijk = εijk in right-handed systems and Eijk = −εijk in left-handed systems.   (2.31)
We will not use the tensor E.
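The properties of the εijk just used can also be verified numerically. The following sketch (Python/NumPy, 0-based indices) builds the εijk and checks the axial transformation law for a rotation and for a reflection.

    import numpy as np

    eps = np.zeros((3, 3, 3))
    for i in range(3):
        for j in range(3):
            for k in range(3):
                eps[i, j, k] = (i - j) * (j - k) * (k - i) / 2   # ε123 = 1 etc.

    def transformed(alpha):
        # (det α) αmi αnj αpk εmnp, cf. the equation shown above
        return np.linalg.det(alpha) * np.einsum('mi,nj,pk,mnp->ijk',
                                                alpha, alpha, alpha, eps)

    phi = 0.7
    rotation   = np.array([[np.cos(phi), -np.sin(phi), 0.0],
                           [np.sin(phi),  np.cos(phi), 0.0],
                           [0.0,          0.0,         1.0]])
    reflection = np.diag([1.0, 1.0, -1.0])

    print(np.allclose(transformed(rotation), eps))     # True
    print(np.allclose(transformed(reflection), eps))   # True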
The scalar product or inner product or dot product of two tensors is denoted by a dot
between the two factors. It is obtained from the corresponding tensor product by set-
ting equal the two indices adjacent to the gap between the two factors, i. e. by summing
over these indices.
It is therefore defined for two tensors of order at least one.
a ⋅ b = A ,      ai bi = A ,         a∼T b∼ = A ,
a ⋅ b = B ,      ai bij = Bj ,       a∼T b∼ = B∼T ,
a ⋅ b = C ,      aij bj = Ci ,       a∼ b∼ = C∼ ,       (2.33)
a ⋅ b = D ,      aij bjk = Dik ,     a∼ b∼ = D∼ ,
a ⋅ b = E ,      aij bjkl = Eikl ,
etc.             etc.
Its name stems from the fact that the scalar product of two vectors is a scalar.
12 Equation (2.32) is another example of an equation that (without additional conventions) cannot be
translated into symbolic notation.
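In numerical work the scalar products of (2.33) are conveniently written with np.einsum, which makes the summation indices explicit; the arrays below are arbitrary examples.

    import numpy as np

    a_vec = np.array([1.0, 2.0, 3.0])
    b_vec = np.array([0.5, -1.0, 2.0])
    a_mat = np.arange(9.0).reshape(3, 3)
    b_mat = np.eye(3) + 1.0

    A = np.einsum('i,i->',     a_vec, b_vec)    # a · b   (two vectors)
    B = np.einsum('i,ij->j',   a_vec, b_mat)    # ai bij = Bj
    C = np.einsum('ij,j->i',   a_mat, b_vec)    # aij bj = Ci
    D = np.einsum('ij,jk->ik', a_mat, b_mat)    # aij bjk = Dik
    print(A, B, C, D, sep='\n')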
2.9.2 Properties
1. First, we need to convince ourselves that the product of tensor coordinates defined
in this way are again tensor coordinates. We do this using the example of the equation
ai bij = Bj for polar tensors. Writing this equation in two different Cartesian coordinate
systems, we get
ai bij = Bj and ãi b̃ij = B̃j ,
and since the ai are the coordinates of a polar vector and the bij are the coordinates of
a polar tensor, we have
ai = αik ãk and bij = αim αjn b̃mn .
This gives
Bj = ai bij = αik ãk αim αjn b̃mn = δkm αjn ãk b̃mn = αjn ãm b̃mn = αjn B̃n ,
i. e. the Bj are the coordinates of a polar vector.
2. Clearly, the scalar product of two polar tensors or two axial tensors gives a polar
tensor and the scalar product of a polar tensor and an axial tensor gives an axial ten-
sor.
3. Comparing the equations from (2.33) in the two notations shows that for the trans-
lation the order of the free indices must be equal on both sides of an equation. Fur-
thermore, to be able to write the summation symbolically as a scalar product, the two
summation indices must be adjacent to the gap between both tensors.
Problem 2.5.
Translate, as far as possible, into the other two notations:
A. a ⋅ b = c, B. b ⋅ a = c, C. aik bij = cjk , D. ai bk ai ck = d,
Hint: Also in F and G (where transposed tensors in symbolic notation appear) only
translate but do not transpose, i. e. the ‘T’ for transposition remains in the equation
after translation. (In order to translate an equation from index notation we may have
to first perform some transpositions; we never have to do this when we translate from
symbolic notation.)
4. Addition and scalar multiplication are distributive also in symbolic notation; e. g. we
have ai bi + ai ci = ai (bi + ci ), and translated
a ⋅ b + a ⋅ c = a ⋅ (b + c).
5. Clearly a ⋅ b = b ⋅ a; if both factors are vectors, the commutative law also holds in
symbolic notation. If one of the factors is of second order or higher, the commutative
law does not hold in general, e. g. a ⋅ b is not equal to b ⋅ a. To show this, we assume
ai bij = bij ai . To be able to translate this equation, we need to transpose bij on the
right side. Then we have ai bij = bTji ai , which gives a ⋅ b = bT ⋅ a, but in general bT ⋅ a
is not equal to b ⋅ a.
6. The associative law holds in symbolic notation for an arbitrary sequence of tensor and
scalar multiplications, if and only if in index notation no pair of summation indices is
separated by other indices. The equation (aij bj ) ck = aij (bj ck ) can be translated into
(a ⋅ b) c = a ⋅ (b c), but the equation (aij bj ) ci = aij (bj ci ) cannot be translated into
(a ⋅ b) ⋅ c = a ⋅ (b ⋅ c). In symbolic notation the associative law holds, if for each tensor
the sum of its adjacent dots is smaller or equal to its order. This is true for a ⋅ b c but
not for a ⋅ b ⋅ c. (In this form, the statement is also true for the multiple scalar products
defined below.)
Problem 2.6.
For which sequences of the factors on the left side can you translate the equation
ai bk ai cj = djk into symbolic notation? For each case, check if the associative law
holds for all five possible associations.
7. Next, we ask ourselves in which cases we can cancel a common factor in scalar
products.
We start with the equation a ⋅ A = b ⋅ A, which we can also write as (a − b) ⋅ A = 0
or (ai − bi ) Ai = 0 .
Clearly, we cannot conclude from a zero scalar product of two vectors that one
of its factors is zero. Even if Ai \= 0, ai − bi does not need to be zero; it can also
be perpendicular to Ai . However, if the equation above is satisfied for three linearly
independent vectors Ai(1) , Ai(2) , and Ai(3) , then certainly ai − bi = 0, because no vector can
be perpendicular to three linearly independent vectors simultaneously. In particular
it follows from (ai − bi ) ei = 0 or ai ei = bi ei that ai = bi ; we have already used
this uniqueness of the decomposition of a vector with respect to a basis several times.
Thus, since every vector can be represented as a linear combination of three Cartesian
basis vectors, we can alternatively require that the equation above be satisfied for any
vector A.
To generalize this proof to the equation ai...jk Ak = bi...jk Ak , we start by writing the
proof for ai Ai = bi Ai more formally. The three equations
(ai − bi ) Ai = 0 ,   (ai − bi ) Bi = 0 ,   (ai − bi ) Ci = 0 ,
with known Ai , Bi , and Ci , form a system of homogeneous linear equations for the
ai − bi . This system has only the trivial solution ai − bi = 0, if and only if the three
vectors A, B, and C are linearly independent.
This proof generalizes to the next higher-order tensors in the equation a ⋅ A = b ⋅ A
or (aij − bij ) Aj = 0 by using the same argument. The three equations
(aij − bij ) Aj = 0,
(aij − bij ) Bj = 0,
(aij − bij ) Cj = 0,
with known Aj , Bj , and Cj , form a system of homogeneous linear equations for ai1 −
bi1 , ai2 − bi2 , and ai3 − bi3 , for each value of the free index i. These three equations
have only the trivial solution, if and only if the three vectors A, B, and C are linearly
independent.
The proof for the equation ai...jk Ak = bi...jk Ak is analogous. A vector which is
a common factor in an equation with several scalar products can be cancelled out,
if and only if arbitrary values can be substituted for this vector (e. g. three linearly
independent vectors).
The equation a ⋅ A = b ⋅ A or (ai − bi ) Aij = 0 with known A forms a system of
homogeneous linear equations for the ai − bi . This system has only the trivial solution, if
the determinant of Aij is nonzero. This condition must hold for the equation a⋅A = b⋅A
or (aij − bij ) Ajk = 0 for each value of the free index i separately. We conclude that a
second-order tensor can be canceled out of several scalar products, if and only if its
determinant does not vanish.
The equation a ⋅ A = b ⋅ A or (ai − bi ) Aijk = 0 with known A forms a system of
nine homogeneous linear equations for the three coordinates ai − bi : one equation
for each value of the free indices j and k. At most three of these nine equations can
be linearly independent. The system of equations has only the trivial solution, if and
only if three equations are linearly independent. So we can cancel out the factor A, if
the matrix
( A111  A112  A113  A121  A122  A123  A131  A132  A133 )
( A211  A212  A213  A221  A222  A223  A231  A232  A233 )
( A311  A312  A313  A321  A322  A323  A331  A332  A333 )
has rank 3. To generalize the proof to equations of the form ai...jk Aklm = bi...jk Aklm we
follow the same approach as above.
In the above matrix formed by the Aijk , we call i a row index, since there corre-
sponds to each value of i one row of the matrix, and we call j and k column indices,
since there corresponds to each value pair jk a column of the matrix. With this notion
we can now say under which condition a tensor of order 3 or higher can be canceled
out. In the equation ai...jk Akm...n = bi...jk Akm...n we can cancel out Akm...n , if the matrix
formed by the coordinates Akm...n , with the summation index k as row index and with
the free indices m . . . n as column indices, has rank 3.
Clearly, the much more general condition that the common factor can assume ar-
bitrary values is also sufficient.
8. We showed at the beginning of this section that the scalar product of two tensors
results again in a tensor. We can now prove the converse, which is also called the
quotient rule and of which we already used a special case in Section 2.3.1, when we
introduced tensors: if the scalar product of two factors is a tensor of order (m + n − 2)
and one factor is a tensor of order m that can take on arbitrary values, then the other
factor is a tensor of order n. We show this for the example aij bjk = cik for polar tensors.
In two different Cartesian coordinate systems we have
aij bjk = cik and ãij b̃jk = c̃ik ,
and since the aij and the cik are the coordinates of second-order polar tensors,
ãmj = αpm αqj apq ,   cik = αim αkn c̃mn .
This gives
aiq (bqk − αkn αqj b̃jn ) = 0 .
If aiq can take any value, we can cancel it out, and then we get
bqk = αqj αkn b̃jn ,
i. e. the bqk are the coordinates of a second-order polar tensor.
9. We recall the geometric interpretation of the scalar product of two vectors: it equals
the product of their magnitudes and the cosine of the enclosed angle. If the two vectors
form an acute angle, then the scalar product is positive; if they are
perpendicular to each other, it is zero; if they enclose an obtuse angle, it is negative.
10. For the scalar product of a tensor A of arbitrary order with the δ-tensor it follows
as a special case of (1.17) that
A ⋅δ =A; (2.34)
scalar multiplication of a tensor with the δ-tensor does not change the tensor. With
respect to scalar multiplication, the δ-tensor has the same property as the number one
for multiplication in arithmetic. This explains why the δ-tensor is also called identity
tensor.
which we can easily prove analogously to (1.48) by translating it into index notation.
Problem 2.7.
Prove the following identities by translating into index notation:
A. (a b)T = b a, B. a ⋅ b = bT ⋅ a, C. (a ⋅ b ⋅ c)T = cT ⋅ bT ⋅ aT .
1. Setting equal an index in both factors of a tensor product of two tensors defines,
according to the summation convention, an arithmetic operation. This operation is
called contraction of the two tensors on the two indices. A contraction is defined only
in index notation. It builds a tensor of order (m + n − 2) from a tensor of order m and
a tensor of order n. The scalar product is a special case of a contraction, namely a
contraction on two adjacent indices.
2. The operation defined by setting equal two indices of a single tensor is also called
contraction. It reduces the order of the original tensor by two. The simplest contraction
of a tensor, the contraction of a second-order tensor, has a special name; it is called
the trace of a tensor and it is also expressed in symbolic notation.
tr a = b, aii = b. (2.36)
So the term contraction is used for two similar but different operations. The contrac-
tion of the two factors of a tensor multiplication is simultaneously a contraction of
their tensor product.
We denote the double scalar product or double inner product or double dot product
by two adjacent dots. It is obtained by contracting the two index pairs adjacent to the
gap between the two tensors.
a ⋅⋅ b = A ,      aij bij = A ,
a ⋅⋅ b = B ,      aij bijk = Bk ,       (2.37)
a ⋅⋅ b = C ,      aijk bjk = Ci ,
etc.              etc.
If the two index pairs are instead contracted crosswise (mirrored at the gap between
the two factors), the product is written with a colon:
a : b = D ,      aij bji = D ,
a : b = E ,      aij bjik = Ek ,        (2.39)
a : b = F ,      aijk bkji = F .
(As a mnemonic hint, the horizontal dots remind of the summation indices on the two
sides of the dots and the vertical dots remind of a plane of reflection.)
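Trace and double scalar products are again simple contractions; a minimal sketch with arbitrary coordinate matrices:

    import numpy as np

    a = np.arange(9.0).reshape(3, 3)
    b = np.arange(9.0, 18.0).reshape(3, 3)

    tr_a         = np.einsum('ii->', a)         # trace, aii, cf. (2.36)
    double_dot   = np.einsum('ij,ij->', a, b)   # a ·· b = aij bij, cf. (2.37)
    double_colon = np.einsum('ij,ji->', a, b)   # a : b  = aij bji, cf. (2.39)
    print(tr_a, double_dot, double_colon)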
Problem 2.8.
Translate, using the definitions in (2.37), into symbolic notation:
A. aij bij = c, B. aij bji = c, C. aij bkjl cik = dl , D. aij bikl cklj = d.
Are parentheses necessary in symbolic notation in C and D?
Problem 2.9.
Compute
A. tr δ, B. δ ⋅ δ, C. δ ⋅⋅ δ.
Problem 2.10.
A. Under which condition is the double scalar product of two second-order tensors
commutative?
B. Under which condition can bij be cancelled out in aij bij = 0 ?
Problem 2.11.
A. Prove that the double scalar product of a symmetric tensor and an antimetric ten-
sor is zero.
B. Prove also the converse: If the double scalar product of two tensors is zero and one
of the tensors is an arbitrary symmetric (antimetric) tensor, then the other tensor
is antimetric (symmetric).
C. Prove also the following theorem: If the double scalar product of a second-order
tensor with the ε-tensor is zero, then this tensor is symmetric (the converse is ac-
cording to (2.40) true anyway).
D. How many linearly independent symmetric tensors are necessary to write an ar-
bitrary symmetric tensor as their linear combination?
1. Tensor multiplication assigns a (second-order) tensor to two vectors and scalar mul-
tiplication assigns a scalar. A vector can be assigned to two vectors by contracting
twice with the ε-tensor. We define
a × b := ε ⋅⋅ a b = A
and call A the vector product or outer product or cross product of the vectors a and b.
We write in index notation
(a × b)i := εijk aj bk = Ai .
Clearly, we can also write εijk between the two vectors or behind them; if the contrac-
tion is translatable, we obtain the expressions
εijk aj bk = −aj εjik bk = aj bk εjki ,
i. e. we have three equivalent definitions for the vector product of two vectors:
a × b := ε ⋅⋅ a b = −a ⋅ ε ⋅ b = a b ⋅⋅ ε.
Like the scalar product, the vector product is also a special case of a contraction.
To compute the vector product of two vectors, we remember that the elements of
εijk are nonzero only if all three indices are different. For example, for i = 1 only ε123
and ε132 are nonzero. Thus we have
(a × b)1 = ε123 a2 b3 + ε132 a3 b2 = a2 b3 − a3 b2 ,
and the remaining coordinates follow by cyclic permutation of the indices.
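A short numerical sketch of the definition (a × b)i = εijk aj bk , compared with the built-in cross product (the vectors are those of Problem 2.3):

    import numpy as np

    eps = np.zeros((3, 3, 3))
    for i in range(3):
        for j in range(3):
            for k in range(3):
                eps[i, j, k] = (i - j) * (j - k) * (k - i) / 2

    a = np.array([2.0, 3.0, 4.0])
    b = np.array([-2.0, 1.0, 5.0])
    print(np.einsum('ijk,j,k->i', eps, a, b))   # εijk aj bk
    print(np.cross(a, b))                        # the same coordinates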
2. The ε-tensor is an axial tensor, so the vector product of two polar vectors or of two
axial vectors gives an axial vector, and the vector product of a polar and an axial vector
gives a polar vector. In other words, the three vectors of a vector product are either all
axial or two are polar and the third is axial.
3. Some authors use the polar E-tensor from (2.31) instead of the axial ε-tensor to de-
fine the vector product. With this definition the result of the vector product of two polar
vectors is a polar vector. We denote this vector product by the symbol × ̃ to distinguish
it from our definition:
a ×̃ b := E ⋅⋅ a b ,      (a ×̃ b)i = Eijk aj bk .      (2.42)
4. The magnitude of the vector product of two vectors equals the product of their mag-
nitudes and the sine of the enclosed angle, i. e. the area of the parallelogram spanned
by the two vectors. The angle is the smaller of the two angles enclosed by the vectors
a and b, so that it is
at most 180∘ and thus the sine is nonnegative. The vector a × b is perpendicular to the
vectors a and b (i. e. to the plane spanned by them) and forms a right-hand screw with
the direction of rotation obtained by rotating the vector a about the smaller of the two
angles enclosed by a and b into b in a right-handed system, and a left-hand screw in
a left-handed system.
a × b := −a ⋅ ε ⋅ b = A ,      (a × b)j := −ai εijk bk = ai εkji bk = Aj ,
a × b := −a ⋅ ε ⋅ b = B ,      (a × b)jm := −ai εijk bkm = ai εkji bkm = Bjm ,
a × b := −a ⋅ ε ⋅ b = C ,      (a × b)mj := −ami εijk bk = ami εkji bk = Cmj ,      (2.43)
a × b := −a ⋅ ε ⋅ b = D ,      (a × b)mjn := −ami εijk bkn = ami εkji bkn = Dmjn ,
etc.                            etc.
If the first factor of a vector product is a vector, then we can write the ε-tensor also in
front of the two factors, i. e.
a × b = −a ⋅ ε ⋅ b = ε ⋅⋅ a b,
and if the last factor of a vector product is a vector, then we can write the ε-tensor also
behind the two factors, i. e.
a × b = −a ⋅ ε ⋅ b = a b ⋅⋅ ε,
a ⊗ b = −a × b ,      (a ⊗ b)q := ap εpqr br ,
A ⊗ B = −A × B ,      (A ⊗ B)i...jqm...n := ai...jp εpqr brm...n .      (2.44)
Problem 2.12.
Which of the following expressions are ambiguous without parentheses?
A. a × b c, B. a × b ⋅ c, C. a ⋅ b × c, D. a ⋅ b × c.
Hint: First, check if both meanings, in A e. g. (a × b) c and a × (b c), are defined.
If this is true, then check if the associative law holds. An expression is not unique
without parentheses only if both meanings are defined and if the associative law does
not hold.
Problem 2.13.
Translate the following expressions into the other notation. Then simplify each ex-
pression in index notation and translate the result into symbolic notation, thus ob-
taining a tensor algebraic identity in symbolic notation.
A. εijk am bj εmin ck dn , B. εijk cm bkm εqpi aj dq , C. a × [(b ⋅ c) × d],
D. a × (b × c). E. Compare the expressions B and C.
Problem 2.14.
Use the Grassmann identity (“backwards”) to rewrite the expression ai bk − ak bi and
translate it into symbolic notation.
2.10.2 Properties
Since the vector products in (2.43) are defined by scalar products, we can include the
vector products in the associative law for scalar products.
From a × b := −a ⋅ ε ⋅ b it follows that for both factors the cross is equivalent to the
dot.
So a sequence of products of tensors, including vector products, is associative if
for each tensor the sum of dots and crosses on each side of the tensor is at most equal
to the order of the tensor. For example, this is not the case for b in a × b × c = a ⋅ ε ⋅ b ⋅ ε ⋅ c
and, as we know, a × (b × c) ≠ (a × b) × c.
The remaining statements about vector products can be proven analogously to
their corresponding statements about scalar products.
– The vector product of a tensor of order m and a tensor of order n gives a tensor
of order (m + n − 1).
– The three tensors of a vector multiplication (i. e. the two factors and the prod-
uct) are either all axial or two of them are polar and the third is axial.
– For the translation the order of the free indices of a vector product must be the
same on both sides of an equation.
– Addition and vector multiplication are distributive also in symbolic notation.
1. A vector product followed by a scalar product between three vectors is often called
the triple product of these three vectors; we denote the triple product by square brack-
ets. Using (1.28) we have the following formulas.
[a, b, c] = a × b ⋅ c = a ⋅ b × c = A ,
                 | a1  a2  a3 |
εijk ai bj ck =  | b1  b2  b3 |  = A .      (2.45)
                 | c1  c2  c3 |
(Here we do not need to set parentheses because only (a×b)⋅c is defined, but a×(b⋅c)
is not defined.)
If e. g. all three vectors are polar, then the triple product is an axial scalar; instead
of axial scalar often the term pseudoscalar is used.
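A minimal numerical sketch of (2.45), computing the triple product in three equivalent ways for arbitrarily chosen vectors:

    import numpy as np

    eps = np.zeros((3, 3, 3))
    for i in range(3):
        for j in range(3):
            for k in range(3):
                eps[i, j, k] = (i - j) * (j - k) * (k - i) / 2

    a = np.array([1.0, 0.0, 0.0])
    b = np.array([1.0, 2.0, 0.0])
    c = np.array([0.0, 1.0, 3.0])
    print(np.einsum('ijk,i,j,k->', eps, a, b, c))   # εijk ai bj ck
    print(np.linalg.det(np.array([a, b, c])))       # determinant form
    print(np.dot(np.cross(a, b), c))                # a × b · c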
2. We recall the geometric interpretation of the triple product. Its magnitude is equal
to the volume of the parallelepiped spanned by the three vectors a, b, and c; it is zero
if the three vectors are coplanar. This includes the case that one of the three vectors is
zero.
The triple product is positive if the vector product a × b forms an acute angle
with c. If this is the case for three polar vectors in a right-handed system, then we say
the three vectors a, b, and c (in this order) form a right-handed system. If this is true
for three polar vectors in a left-handed system, then we say they form (in this order) a
left-handed system.
A triple product changes its sign if we change the order of two vectors. The sign
does not change if we permute the three vectors cyclically.
Using (2.45) and (1.13), the product of two triple products can be written as
                       | a1  a2  a3 |  | d1  e1  f1 |
[a, b, c] [d, e, f ] = | b1  b2  b3 |  | d2  e2  f2 | ,
                       | c1  c2  c3 |  | d3  e3  f3 |
i. e.
                       | a ⋅ d   a ⋅ e   a ⋅ f |
[a, b, c] [d, e, f ] = | b ⋅ d   b ⋅ e   b ⋅ f | .      (2.46)
                       | c ⋅ d   c ⋅ e   c ⋅ f |
contraction (order 1 and higher):   aijk bmkn = cijmn ;
also multiple:   aijk bmki = cjm ;
in particular, scalar multiplication:   aijk bkm = cijm ,   a ⋅ b = c .
If we differentiate a scalar field a(x, y, z) with respect to the spatial coordinates, we get
the coordinates (𝜕a/𝜕x, 𝜕a/𝜕y, 𝜕a/𝜕z) of a vector field, which is called the gradient of
a. We can show in general that the derivatives of the Cartesian coordinates of a tensor
field of order n with respect to the spatial coordinates form the Cartesian coordinates
of a tensor field of order (n + 1). This statement is called the fundamental theorem of
tensor analysis.
We prove it here using the example of a second-order polar tensor field aij (x),
where the argument x stands for the Cartesian spatial coordinates (x1 , x2 , x3 ). Accord-
ing to (2.17) we have ãij = αmi αnj amn , and since xp = αpk x̃k implies 𝜕xp /𝜕x̃k = αpk ,
the chain rule gives
𝜕ãij /𝜕x̃k = αmi αnj (𝜕amn /𝜕xp ) (𝜕xp /𝜕x̃k ) = αmi αnj αpk 𝜕amn /𝜕xp ,
which is the transformation law for the coordinates of a third-order polar tensor.
2.12.2 Gradient
In symbolic notation we start with the tensor field a(x) = aij ei ej . If we differentiate
with respect to xk , we have, because the bases are independent of the position,
𝜕a/𝜕xk = (𝜕aij /𝜕xk ) ei ej .
Since the 𝜕aij /𝜕xk are the coordinates of a tensor of order 3, we obtain from 𝜕a/𝜕xk
a tensor of order 3 by multiplying it by ek . This tensor is called the gradient13 of the
original tensor a.
The gradient of a polar tensor field of order n gives a polar tensor field of order
(n + 1), and the gradient of an axial tensor field of order n gives an axial tensor field of
order (n + 1).
Since the tensor product of basis vectors is not commutative, we have two differ-
ent definitions of the gradient, depending on whether we multiply 𝜕a/𝜕xk by ek from
the left or from the right; they can be distinguished as the left gradient and the right
gradient. Both definitions are used in the literature, so one has to check each time,
when the gradient of a tensor of at least order 1 is used, how the gradient is defined.
These two different definitions of the gradient depend on which convention is
used for the order of the indices in the spatial derivative of tensor coordinates. If the
denominator index is put behind the numerator indices, i. e.
𝜕aij /𝜕xk = Aijk ,
then contraction with the corresponding basis gives the right gradient
gradR a = (𝜕aij /𝜕xk ) ei ej ek = Aijk ei ej ek = A .
13 Gradiens Latin: (steepest) stepping (increase) (present participle of gradi [to step/walk]).
14 Also called Nabla operator.
where in the first case del acts on the tensor from the right and in the second case it
acts on the tensor from the left.
Since tensor multiplication of a tensor by a scalar is commutative, the two gradi-
ents of a scalar coincide, i. e.
grad a = a ∇ = ∇ a.
Thus the left and the right gradient are generalizations of the gradient from vector
calculus. However, for tensors of at least order 1 we have to make a decision. We choose
here the right gradient, because the order “numerator index in front of denominator
index” is how we read a differential quotient 𝜕aij /𝜕xk . So we agree on the following
convention.
With this convention we can drop the index R in gradR from now on.
For example, we write
grad a = a ∇ = aij ei ej (𝜕/𝜕xk ) ek = (𝜕aij /𝜕xk ) ei ej ek .
grad a := (𝜕a/𝜕xk ) ek ,
grad a := (𝜕ai /𝜕xk ) ei ek ,      (2.49)
grad a := (𝜕aij /𝜕xk ) ei ej ek ,
etc.
grad a = A ,      𝜕a/𝜕xk = Ak ,
grad a = A ,      𝜕ai /𝜕xk = Aik ,      (2.50)
grad a = A ,      𝜕aij /𝜕xk = Aijk ,
etc.              etc.
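A symbolic sketch of the right gradient (2.49)/(2.50) of a vector field, using SymPy with a hypothetical field chosen only for illustration:

    import sympy as sp

    x1, x2, x3 = sp.symbols('x1 x2 x3')
    X = (x1, x2, x3)

    a = sp.Matrix([x1 * x2, x2**2 * x3, sp.sin(x3)])            # coordinates ai(x)
    grad_a = sp.Matrix(3, 3, lambda i, k: sp.diff(a[i], X[k]))  # (grad a)ik = ∂ai/∂xk
    sp.pprint(grad_a)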
Some authors use the del operator instead of the grad symbol in symbolic notation
and use it in computations like a vector. However, we need additional rules for com-
putations with this “symbolic vector”, since it is not a real vector but only behaves like
a vector in some ways. For that reason we use del only in definitions and to distinguish
the different differential operations in tensor analysis. We will not use it in computa-
tions, where we will use index notation instead, like we did in tensor algebra.
1. Consider two points with the coordinates xi and xi +dxi , i. e. with the position vectors
x and x + d x, which do not have to be neighboring points. The difference d x between
the two position vectors is called the differential of the position vector x.
Differentials of the position vector in two different coordinate systems are related
by
d xi = (𝜕xi /𝜕x̃j ) d x̃j ,
i. e. the differential of the position vector is a polar vector. We write in different nota-
tions
d x = d xi ei = d x ex + d y ey + d z ez ,      (2.52)
where d x1 = d x, d x2 = d y, and d x3 = d z.
d a := grad a ⋅ d x ,        d a := (𝜕a/𝜕xk ) d xk ,
d a := (grad a) ⋅ d x ,      d ai := (𝜕ai /𝜕xk ) d xk ,      (2.53)
d a := (grad a) ⋅ d x ,      d aij := (𝜕aij /𝜕xk ) d xk ,
etc.                          etc.
The differential of a tensor depends clearly on the point x at which we compute the
gradient and on the differential d x. Geometrically, it represents the increase of the
tensor field on the tangential hyperplane of the tensor field at the point x.
Sometimes instead of “the differential of a tensor in index notation” we also speak
of “the differential of tensor coordinates”.
If we use the left gradient instead of the right gradient in the formulas above, then
in symbolic notation, the order of the factors changes.
Since the basis vectors are independent of the position, the differential of a tensor
and the differential of its coordinates are directly related. For example, we have
d a = d (ai ei ) = d ai ei ,
d a = d a ,
d a = d ai ei ,      (2.54)
d a = d aij ei ej ,
etc.
2. For neighboring points with the position vectors x + d x and x, the differential
of tensors and the differential of tensor coordinates, respectively, are equal to the dif-
ferences of the tensors and tensor coordinates at the two points, up to second order
in d x.
d a = a(x + d x) − a(x),
d a = a(x + d x) − a(x),
d a = a(x + d x) − a(x),
etc.
(2.55)
d a = a(xp + d xp ) − a(xp ),
d ai = ai (xp + d xp ) − ai (xp ),
d aij = aij (xp + d xp ) − aij (xp ),
etc.
2.12.4 Divergence
We obtain the divergence15 from the gradient, if we replace the tensor product of the
tensor field and the del by the scalar product, i. e.
So the divergence of a tensor field is defined for tensor fields of at least order 1, and
since the scalar product of two vectors is commutative, the right divergence and left
divergence of a vector field coincide; thus both divergences are generalizations of the
divergence from vector calculus.
Clearly, the divergence of a polar tensor field of order n gives a polar tensor field of
order (n − 1), and the divergence of an axial tensor field of order n gives a tensor field
of order (n − 1).
We will use the right divergence from now on and thus drop the index R in divR .
So we agree on the following convention.
The differentiation index in a divergence is equal to the last index of the tensor
whose divergence is computed.
div a = a ⋅ ∇ = aij ei ej ⋅ (𝜕/𝜕xk ) ek = (𝜕aij /𝜕xk ) ei (ej ⋅ ek ) = (𝜕aij /𝜕xk ) δjk ei = (𝜕aij /𝜕xj ) ei .
div a := 𝜕ak /𝜕xk ,
div a := (𝜕aik /𝜕xk ) ei ,      (2.57)
div a := (𝜕aijk /𝜕xk ) ei ej ,
etc.
15 Divergentia Latin: divergence, also source strength (verbal noun to modern Latin divergere [to di-
verge], derived from vergere [to slope]).
div a = A ,      𝜕ak /𝜕xk = A ,
div a = A ,      𝜕aik /𝜕xk = Ai ,      (2.58)
div a = A ,      𝜕aijk /𝜕xk = Aij ,
etc.              etc.
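A symbolic sketch of the divergence (2.57)/(2.58) of a second-order tensor field (SymPy; the field is a hypothetical example):

    import sympy as sp

    x1, x2, x3 = sp.symbols('x1 x2 x3')
    X = (x1, x2, x3)

    a = sp.Matrix([[x1 * x2, x2, 0],              # hypothetical tensor field aik(x)
                   [0, x2 * x3, x1],
                   [x3**2, 0, x1 * x2 * x3]])

    # (div a)i = ∂aik/∂xk: contract the differentiation index with the last index
    div_a = sp.Matrix([sum(sp.diff(a[i, k], X[k]) for k in range(3))
                       for i in range(3)])
    sp.pprint(div_a)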
Problem 2.15.
Translate into the other notation:
A. div grad a = b,   B. grad div a = c,   C. (𝜕ai /𝜕xj )(𝜕bj /𝜕xk ) = cik ,   D. (𝜕ai /𝜕xj )(𝜕bi /𝜕xj ) = c.
2.12.5 Curl
We obtain the curl from the gradient, if we replace the tensor product of the tensor
field and the del by the vector product.
So also the curl of a tensor field is defined for tensor fields of at least order 1. Since
the sign of a vector product changes if we change the order of the factors, and since
we want both the right curl and the left curl to coincide for a vector field with the curl
from vector calculus, we define
The curl of a polar tensor field of order n clearly gives an axial tensor field of order n,
and the curl of an axial tensor field of order n gives a polar tensor field of order n.
From now on we use the right curl, so we will drop the index R in curlR .
We write e. g.
curl a = a ⊗ ∇.
In contrast to the gradient and the divergence we have to be careful in ami εijk 𝜕/𝜕xk
to pull the differentiation in front of the epsilon and not the coordinate behind the
epsilon; e. g. in εijk 𝜕ami /𝜕xk = (curl a)jm the order of the free indices is reversed, so it
would be a different tensor.
For tensors of various order we have the following definitions.
curl a := (𝜕ai /𝜕xk ) εijk ej ,
curl a := (𝜕ami /𝜕xk ) εijk em ej ,      (2.60)
curl a := (𝜕amni /𝜕xk ) εijk em en ej ,
etc.
curl a = A ,      εijk 𝜕ai /𝜕xk = Aj ,
curl a = A ,      εijk 𝜕ami /𝜕xk = Amj ,      (2.61)
curl a = A ,      εijk 𝜕amni /𝜕xk = Amnj ,
etc.              etc.
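A symbolic sketch of (2.61) for a vector field (SymPy; the field below is the gradient of x1 x2 x3 , so its curl must vanish):

    import sympy as sp

    def eps(i, j, k):                # Levi-Civita symbol for 0-based indices
        return (i - j) * (j - k) * (k - i) // 2

    x1, x2, x3 = sp.symbols('x1 x2 x3')
    X = (x1, x2, x3)
    a = sp.Matrix([x2 * x3, x1 * x3, x1 * x2])   # hypothetical vector field

    # (curl a)j = εijk ∂ai/∂xk
    curl_a = sp.Matrix([sum(eps(i, j, k) * sp.diff(a[i], X[k])
                            for i in range(3) for k in range(3))
                        for j in range(3)])
    sp.pprint(curl_a)                            # the zero vector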
Problem 2.16.
Compute
A. grad x, B. div x, C. curl x.
Problem 2.17.
For each of the following expressions, derive a tensor analytical identity in index no-
tation and then translate this identity into symbolic notation:
A. div (λ a),   B. curl (λ a),   C. curl grad a,   D. div curl a,
E. curl curl a,   F. εijk εmnk bm 𝜕apj /𝜕xi ,   G. εijk εmnk 𝜕(apj bi )/𝜕xm .
Hint: The last two problems are slightly more difficult, because they have two free
indices. If this is the case, we have to check, before the translation, if the order of the
free indices matches in all terms.
The divergence of a gradient is also called the Laplace operator or the Laplacian and
is denoted by Δ.
The Laplace operator of a polar tensor field of order n gives a polar tensor field of
order n, and the Laplace operator of an axial tensor field of order n gives an axial tensor
field of order n. For example, div grad a gives translated
(𝜕/𝜕xk )(𝜕aij /𝜕xk ) = 𝜕²aij /𝜕xk² ,
so for tensors of various order we have the following equations.
Δa = 𝜕²a/𝜕xk² ,
Δa = (𝜕²ai /𝜕xk²) ei ,      (2.63)
Δa = (𝜕²aij /𝜕xk²) ei ej ,
etc.
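A minimal symbolic check that the Laplacian is the divergence of the gradient, for a hypothetical scalar field:

    import sympy as sp

    x1, x2, x3 = sp.symbols('x1 x2 x3')
    X = (x1, x2, x3)
    a = x1**2 * x2 + sp.exp(x3)                   # hypothetical scalar field

    laplace_a = sum(sp.diff(a, xk, 2) for xk in X)            # ∂²a/∂xk²
    div_grad_a = sum(sp.diff(sp.diff(a, xk), xk) for xk in X) # div grad a
    print(sp.simplify(laplace_a - div_grad_a))    # 0
    print(laplace_a)                              # 2*x2 + exp(x3)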
∫ a(x, y, z) d x,
∫ a(x, y, z) d y, (2.64)
∫ a(x, y, z) d z,
∫∫ a(x, y, z) d y d z,
∫∫ a(x, y, z) d z d x, (2.65)
∫∫ a(x, y, z) d x d y,
∫∫∫ a(x, y, z) d x d y d z .
We will see that the first three integrals can be interpreted as line integrals, the second
three as surface integrals, and the last as a volume integral.
With the vector line element d x, the three integrals in (2.64) can be combined into the
single integral ∫ a(x) d x,
where the so-defined vector line element d x is, according to (2.51) (and in agreement
with our intuition), a polar vector.
(The three spatial coordinates of the argument can be combined as x, indepen-
dently of whether the equation is written in symbolic notation or in index notation;
i. e. we can do this also in the single integrals of (2.64)). These integrals, where the
differential is a vector line element, are called line integrals of the second kind.
If the differential is the magnitude of the vector line element instead of the line
element, we have
∫ a(x) d x. (2.68)
x = x(u),
y = y(u), u1 ≦ u ≦ u2 , (2.69)
z = z(u);
d xi = (d xi /d u) d u      (2.71)
and for its magnitude
d x = √((d xi )²) = √((d xi /d u)²) d u .      (2.72)
Thus we obtain for the line integrals (2.67) and (2.68) along the curve segment (2.69)
∫ a(x) d xi = ∫_{u1}^{u2} a(x(u)) (d xi /d u) d u ,      (2.73)
∫ a(x) d x = ∫_{u1}^{u2} a(x(u)) √((d xi /d u)²) d u .      (2.74)
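Line integrals of the form (2.73) and (2.74) are easily evaluated numerically once the curve is parametrized; the following sketch uses a quarter circle of radius R and the field a = x y as hypothetical examples.

    import numpy as np

    R = 2.0
    u = np.linspace(0.0, np.pi / 2, 2001)
    x, y, z = R * np.cos(u), R * np.sin(u), 0.0 * u
    a = x * y                                 # scalar field evaluated on the curve

    dx_du = np.gradient(x, u)                 # dxi/du (numerically)
    dy_du = np.gradient(y, u)
    dz_du = np.gradient(z, u)
    ds_du = np.sqrt(dx_du**2 + dy_du**2 + dz_du**2)

    print(np.trapz(a * dx_du, u))             # ∫ a dx1, cf. (2.73); ≈ −R³/3
    print(np.trapz(a * ds_du, u))             # ∫ a |dx|, cf. (2.74); ≈ R³/2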
Problem 2.18.
A point mass with mass m is moving along one turn of a helix around the z-axis with
radius a and slope h. The forces acting on the point mass are the gravitational force
F 1 = − m g ez and the elastic force F 2 = − λ x. Compute the work W = ∫ F ⋅ d x exerted
on the point mass.
1. A surface element is characterized by its magnitude d A and its normal vector n (the
unit vector perpendicular to the surface element). The product of the normal vector
and the magnitude of the surface element is called the surface vector d A.
d A = n d A. (2.75)
The normal vector n and the surface vector d A are not yet uniquely defined, since
a surface element has two unit vectors (negatives of each other, in other words, ori-
ented in opposite directions) perpendicular to the surface element. We determine the
orientation with a convention, distinguishing between open surfaces (surfaces with a
boundary curve) and closed surfaces (surfaces without a boundary curve); in addition
we restrict ourselves to two-sided16 surfaces.
– We can assign a direction of rotation to an open surface (and so also to its surface
elements) by the way we travel along the boundary curve. Let us agree to define the
normal vector and the surface vector of a surface element by their magnitudes and
by the rotational direction of a boundary curve. Then n and d A are axial vectors. If
an observer travels along the boundary curve and sees the surface on the left side,
then n (and d A) and the direction of rotation are associated according to the right-
hand screw rule. For example, if the surface is the ring between two concentric
circles, then the boundary has two parts (the outer and the inner circle), and while
traveling along each part the surface must be on the same (e. g. left) side of the
boundary curve, so the observer travels in different directions along the two parts.
– A closed surface does not have a boundary curve; instead it has an inner side and
an outer side. We adopt the following convention.
The normal vector and the surface vector of the surface element of a closed
surface always point outward from the enclosed volume.
16 A surface is two-sided, if we cannot go from one side to the other side without passing through the
boundary. The most famous example of a one-sided surface is the Möbius strip.
For example, if the enclosed volume is the space between two concentric
spheres, then the closed surface has two parts (the surface of the outer sphere
and the surface of the inner sphere), and the normal vector points, according to
our convention, radially outward from each surface element of the outer sphere
and radially inward from each surface element of the inner sphere.
The magnitude of both a polar vector and an axial vector is a polar scalar, i. e. the area
of a surface element is a polar scalar for open surfaces and for closed surfaces and
thus (in agreement with our intuition) independent of the orientation of the coordinate
system.
2. Consider an infinitesimal tetrahedron, with one plane being oriented in arbitrary di-
rection and the other three planes perpendicular to the coordinate axes. The inclined
plane has the magnitude d A and the (outwardly oriented) normal vector n; the length
of the edges along the coordinate axes are |d ax |, |d ay |, and |d az |. The value of d A is
half of the area of the parallelogram spanned by the edge vectors d u and d v (see the
figure above). According to Section 2.10.1, No. 4 we can write, using the vector product,
n d A = 1/2 d u × d v
      = 1/2 (−|d az | ez + |d ay | ey ) × (−|d az | ez − |d ax | ex )
      = 1/2 (|d ax | |d az | ez × ex − |d ay | |d az | ey × ez − |d ax | |d ay | ey × ex )
      = 1/2 (|d ax | |d az | ey − |d ay | |d az | ex + |d ax | |d ay | ez )
      = −|d Ax | ex + |d Ay | ey + |d Az | ez ,      (a)
with |d Ax | = 1/2 |d ay | |d az | , |d Ay | = 1/2 |d ax | |d az | , |d Az | = 1/2 |d ax | |d ay | ,
and ez × ex = ey , ey × ez = ex , ey × ex = −ez .
Here |d Ax |, |d Ay |, and |d Az | are the areas of the faces of the tetrahedron perpendicular
to the coordinate axes; geometrically speaking they are the projections of the inclined
surface onto the coordinate planes. The signs differ because the x-component of d A
points in the negative x-direction, while the y-component and the z-component point
in the direction of the positive y-axis and z-axis, respectively. If we compare the last
row of (a) with the general form of the vector d A with respect to the basis ei ,
d A = d Ax e x + d Ay e y + d Az e z ,
n d A + ex |d Ax | − ey |d Ay | − ez |d Az | = 0,
n d A + nx |d Ax | + ny |d Ay | + nz |d Az | = 0.
The last equation says that the sum of the (outwardly oriented) surface vectors over
the total surface of the tetrahedron is equal to the zero vector. Then, for any closed
surface, also the integral over the surface vectors of all surface elements gives again
the zero vector, since we can split the enclosed volume into infinitesimal tetrahedrons,
and the inner surfaces of two adjacent subvolume elements cancel during integration;
on the other hand, the integral over all surface area elements gives A.
∮ d A = 0, ∮ d A = A. (2.76)
∫ a(x, y, z) d Ax := ± ∫∫ a(x, y, z) d y d z ,
∫ a(x, y, z) d Ay := ± ∫∫ a(x, y, z) d z d x ,
∫ a(x, y, z) d Az := ± ∫∫ a(x, y, z) d x d y .
Such an integral, where the differential is a vector surface element, is called a surface
integral of the second kind. If the differential is the magnitude of the surface element
instead of the vector surface element we have
∫ a(x) d A, (2.79)
x = x(u, v),
u1 ≦ u ≦ u2 ,
y = y(u, v), (2.80)
v1 (u) ≦ v ≦ v2 (u),
z = z(u, v),
Along a u-line only u changes and v remains constant; i. e. for a line element along a
u-line we have d xi = (𝜕xi /𝜕u) d u. Analogously we have for a line element along a
v-line d xi = (𝜕xi /𝜕v) d v. This gives for the surface vector of a surface element
d Ai = εijk (𝜕xj /𝜕u) (𝜕xk /𝜕v) d u d v      (2.82)
d Ax = d y d z = (𝜕y/𝜕u · 𝜕z/𝜕v − 𝜕z/𝜕u · 𝜕y/𝜕v) d u d v = (𝜕(y, z)/𝜕(u, v)) d u d v ,   (2.83)
d Ay = d z d x = (𝜕z/𝜕u · 𝜕x/𝜕v − 𝜕x/𝜕u · 𝜕z/𝜕v) d u d v = (𝜕(z, x)/𝜕(u, v)) d u d v ,   (2.84)
d Az = d x d y = (𝜕x/𝜕u · 𝜕y/𝜕v − 𝜕y/𝜕u · 𝜕x/𝜕v) d u d v = (𝜕(x, y)/𝜕(u, v)) d u d v ,   (2.85)
where 𝜕(y, z)/𝜕(u, v) etc. denote the Jacobian determinants of the respective pairs of
coordinates with respect to u and v.
∫ a(x) d Ai = ∫_{u1}^{u2} ∫_{v1(u)}^{v2(u)} a(x(u, v)) εijk (𝜕xj /𝜕u) (𝜕xk /𝜕v) d u d v .      (2.86)
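Equation (2.82) and the relations (2.76) can be checked numerically for a closed surface; the following sketch evaluates the surface vectors of a sphere of radius R on a parameter grid (the radius and grid sizes are arbitrary choices).

    import numpy as np

    R = 1.5
    n_u, n_v = 400, 800
    u = (np.arange(n_u) + 0.5) * np.pi / n_u          # polar angle
    v = (np.arange(n_v) + 0.5) * 2 * np.pi / n_v      # azimuth
    du, dv = np.pi / n_u, 2 * np.pi / n_v
    U, V = np.meshgrid(u, v, indexing='ij')

    x = np.array([R * np.sin(U) * np.cos(V),
                  R * np.sin(U) * np.sin(V),
                  R * np.cos(U)])
    x_u = np.gradient(x, u, axis=1)                   # ∂xi/∂u
    x_v = np.gradient(x, v, axis=2)                   # ∂xi/∂v
    # dAi = εijk (∂xj/∂u)(∂xk/∂v) du dv is the cross product of the two tangent vectors:
    dA = np.cross(x_u, x_v, axis=0) * du * dv

    print(dA.sum(axis=(1, 2)))                        # ∮ dA ≈ (0, 0, 0)
    print(np.linalg.norm(dA, axis=0).sum())           # ∮ dA ≈ 4 π R² ≈ 28.27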
Problem 2.19.
Consider a hemispherical shell exposed to the internal pressure p = ϱ g(H − z). Com-
pute the vertical force Fz = ∫ p d Az acting on the hemispherical shell.
d V = [d x1 , d x2 , d x3 ] = d x1 × d x2 ⋅ d x3 = d x d y d z (ex × ey ⋅ ez ) = d x d y d z ,
since ex × ey ⋅ ez = 1.
The volume element d V is a triple product of three polar vectors and so, according to
Section 2.10.3, No. 1, an axial scalar, i. e. it changes its sign under a transformation to a
coordinate system with a different orientation according to (2.18). However, under the
reflection the lower and the upper surface are swapped, so the sign changes again,
and thus the volume is (in agreement with our intuition) a polar scalar.
x = x(u, v, w), u1 ≦ u ≦ u2 ,
y = y(u, v, w), v1 (u) ≦ v ≦ v2 (u), (2.91)
z = z(u, v, w), w1 (u, v) ≦ w ≦ w2 (u, v).
Similarly as before, the volume element is spanned in these coordinates by the three
line elements pointing in positive direction of the coordinate lines and its value is the
absolute value of the corresponding triple product. For a line element in u-direction we
have d xi = (𝜕xi /𝜕u) d u, for a line element in v-direction and in w-direction we have
accordingly d xi = (𝜕xi /𝜕v) d v and d xi = (𝜕xi /𝜕w) d w. So we have for the volume
element in the coordinates u, v, w
d V = εijk (𝜕xi /𝜕u) (𝜕xj /𝜕v) (𝜕xk /𝜕w) d u d v d w.        (2.92)
∫ a(x) d V = ∫_{u1}^{u2} ∫_{v1(u)}^{v2(u)} ∫_{w1(u,v)}^{w2(u,v)} a(x(u, v, w)) (𝜕(x, y, z)/𝜕(u, v, w)) d u d v d w.        (2.93)
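As a small sketch (not part of the text), (2.93) can be checked numerically for spherical coordinates u = r, v = θ, w = φ, where the Jacobian is 𝜕(x, y, z)/𝜕(r, θ, φ) = r² sin θ; integrating a(x) = 1 over a ball of radius R must give its volume (the value of R is chosen only as an example).

import numpy as np
from scipy.integrate import tplquad

R = 2.0
# integrand: the Jacobian r**2 sin(theta); the argument order of tplquad is (w, v, u)
vol, _ = tplquad(lambda w, v, u: u**2 * np.sin(v),
                 0.0, R,          # u = r
                 0.0, np.pi,      # v = theta
                 0.0, 2*np.pi)    # w = phi
print(vol, 4/3*np.pi*R**3)        # both approx. 33.51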
1. In the formulas above we assumed for simplicity that the integrand is a scalar field.
However, we can easily substitute ai...j (x, y, z) for a(x, y, z) in these formulas; also
the coordinates of a tensor field of higher order are only functions of the three spatial
coordinates.
For example, if the integrand is a second-order tensor we have the following types
of integrals.
∫ a d x = A,        ∫ aij d x = Aij ,
∫ a d x = B,        ∫ aij d xk = Bijk ,
∫ a d A = C,        ∫ aij d A = Cij ,
∫ a d A = D,        ∫ aij d Ak = Dijk ,
∫ a d V = E.        ∫ aij d V = Eij .
Here the products of the integrand and the differential are tensor products; we can
clearly replace them in integrals of the second kind by scalar products or vector prod-
ucts, and then we get e. g. the formulas
∫ a ⋅ d x = F,        ∫ aij d xj = Fi ,
∫ a × d x = G,        ∫ aij εklj d xk = Gil .        (2.95)
2. A line integral over a closed curve (the boundary curve of a surface) or a surface
integral over a closed surface (the boundary surface of a volume) are also called a
closed line integral and a closed surface integral, respectively, and we denote such
integrals by ∮.
3. If we integrate a tensor over a given spatial domain (e. g. a given curve between
two points P and Q), then the result, as for any definite integral, does not depend any
more on the integration variables. So integration of a tensor field gives a tensor which
is independent of the position.
1. Consider the Cartesian coordinates ai...j of a tensor field with the following proper-
ties:
– the ai...j are continuously differentiable in a given volume,
– the closed surface of this volume is at least piecewise continuously differentiable,
and
– if the closed surface (as e. g. for the volume enclosed by the two concentric
spheres) has several parts, then the integration is performed over all properly
oriented subsurfaces.
Then the following versions of the Gauss theorem hold:

∫ grad a d V = ∮ a d A,        ∫ (𝜕a/𝜕xk ) d V = ∮ a d Ak ,
∫ grad a d V = ∮ a d A,        ∫ (𝜕aj /𝜕xk ) d V = ∮ aj d Ak ,        (2.96)
∫ grad a d V = ∮ a d A,        ∫ (𝜕aij /𝜕xk ) d V = ∮ aij d Ak ,
etc.                           etc.
2. In these equations we can replace the tensor product on the right side with a scalar
product, and thus the gradient on the left side with a divergence, if we multiply the
equations in index notation (starting with the second one) with δjk .
∫ div a d V = ∮ a ⋅ d A,        ∫ (𝜕ak /𝜕xk ) d V = ∮ ak d Ak ,        (2.97)
3. We can also replace the tensor product on the right side with a vector product, and
thus the gradient on the left side with a curl, if we multiply (2.96) in index notation
(starting with the second one) with εjpk .
∫ curl a d V = ∮ a ⊗ d A = − ∮ a × d A,        ∫ (𝜕aj /𝜕xk ) εjpk d V = ∮ aj εjpk d Ak ,
∫ curl a d V = ∮ a ⊗ d A = − ∮ a × d A,        ∫ (𝜕aij /𝜕xk ) εjpk d V = ∮ aij εjpk d Ak ,        (2.98)
etc.                                           etc.
The equations (2.97) and (2.98) are also versions of the Gauss theorem.
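As a small numerical sketch (not part of the text), the divergence version (2.97) can be checked for a concrete field and volume; here the field a = (x, y², z³) and the unit cube are chosen purely for illustration.

import numpy as np
from scipy.integrate import tplquad, dblquad

def div_a(z, y, x):                    # div a = 1 + 2y + 3z**2
    return 1.0 + 2.0*y + 3.0*z**2

vol, _ = tplquad(div_a, 0, 1, 0, 1, 0, 1)

def face(fun):                         # integral of fun(u, v) over the unit square
    return dblquad(fun, 0, 1, 0, 1)[0]

# closed surface integral: a_x dA_x + a_y dA_y + a_z dA_z over the six faces,
# with the surface vectors oriented outward (+e_i on x,y,z = 1; -e_i on x,y,z = 0)
surf = (face(lambda v, u: 1.0) - face(lambda v, u: 0.0)    # x = 1 and x = 0: a_x = x
        + face(lambda v, u: 1.0) - face(lambda v, u: 0.0)  # y = 1 and y = 0: a_y = y**2
        + face(lambda v, u: 1.0) - face(lambda v, u: 0.0)) # z = 1 and z = 0: a_z = z**3

print(vol, surf)                       # both give 3.0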
4. When we stated the Gauss theorem we used right derivatives. It is easy to see that
if we use left derivatives, then in symbolic notation we need to change the order of the
factors and replace ⊗ by ×. This gives e. g. for (2.98)2
∫ d V curlL a = ∮ d A × a.
5. We first prove the basic version of the Gauss theorem, i. e. the equation

∫ (𝜕a/𝜕xk ) d V = ∮ a d Ak .

For definiteness we consider its z-component,

∫ (𝜕a/𝜕z) d V = ∮ a d Az ,
and let us assume a volume of integration such that the projection of its surface onto
the x, y-plane splits into two parts z = z1 (x, y) and z = z2 (x, y), where the two
parts intersect in a curve C, and let us further assume that to each value pair (x, y)
of the projected surface corresponds exactly one point on z1 and one point on z2 . In
addition we assume that the functions z1 and z2 are continuous and at least piecewise continuously differentiable. Then we have

∫ (𝜕a/𝜕z) d V = ∫_{x1}^{x2} d x ∫_{y1(x)}^{y2(x)} d y [a(x, y, z2 (x, y)) − a(x, y, z1 (x, y))].

We can split the last integral into two parts and interpret the first as an integral of a over the surface z = z2 (x, y) and the second as an integral of a over the surface z = z1 (x, y), i. e.

∫ (𝜕a/𝜕z) d V = ∫_{z2 (x,y)} a d Az − ∫_{z1 (x,y)} a d Az ,
where the two surfaces are oriented so that the z-coordinates of all surface vectors
point into the positive z-direction. To have the surface vector oriented outward on the
closed surface, as we agreed on, we need to change the sign of the integral over z1 ,
which gives
∫ (𝜕a/𝜕z) d V = ∮ a d Az ,
and this is what we wanted to prove.
Now we extend the proof to volumes in which the two surface parts z1 and z2 do
not intersect along a curve C but are connected by a cylindrical shell whose generating
lines are parallel to the z-axis. Such a cylinder does not contribute to the closed surface
integral ∮ a d Az because everywhere on the shell we have d Az = 0; so we conclude
that the basic version of the Gauss theorem is also true for these volumes.
We finally extend the proof to volumes with arbitrarily shaped closed surfaces,
e. g. multiply connected volumes and volumes with inner closed surfaces, as long as
all parts of the closed surfaces are continuous and at least piecewise continuously
differentiable. We can always split these volumes into volumes of the type which we
already handled earlier, i. e. for which the Gauss theorem holds.
Adding the Gauss theorems for the subvolumes on the left side gives the volume
integral over the total volume. When we add the surface integrals on the right side,
the parts for the shared faces of adjacent subvolumes cancel out, because the surface
vectors of a subvolume are always oriented outward, so for adjacent subvolumes they
point in opposite directions; only the surface integral over the outer surface remains,
including possibly existing inner parts of the closed surface, where the surface vector
is also oriented outward of the volume.
The proof of the basic version of the Gauss theorem for the other two coordinates
is similar. To obtain the remaining versions of the Gauss theorem we replace in the
basic version the scalar field a(x, y, z) with the coordinates of a tensor field of higher
order or perform further algebraic operations.
The only difference between the coordinates of a tensor field of any order and a
scalar field is how they transform under a change to another coordinate system. Since
we did not make use of the invariance of the scalar field a with respect to a coor-
dinate transformation in our proof, this proof also holds for the coordinates of a tensor
field of higher order.
1. Consider the Cartesian coordinates ai...j of a tensor field with the following proper-
ties:
– the ai...j are continuously differentiable on a given two-sided surface,
– the boundary curve of this surface is at least piecewise continuously differen-
tiable, and
– if the boundary curve (as e. g. for the surface between two concentric circles) has
several parts, then the closed line integral is performed over all properly oriented
parts.
Then the following versions of the Stokes theorem hold:

∫ grad a ⊗ d A = − ∫ grad a × d A = ∮ a d x,        ∫ (𝜕a/𝜕xi ) εijk d Ak = ∮ a d xj ,
∫ grad a ⊗ d A = − ∫ grad a × d A = ∮ a d x,        ∫ (𝜕an /𝜕xi ) εijk d Ak = ∮ an d xj ,        (2.99)
∫ grad a ⊗ d A = − ∫ grad a × d A = ∮ a d x,        ∫ (𝜕amn /𝜕xi ) εijk d Ak = ∮ amn d xj ,
etc.                                                etc.
2. If we replace the tensor product on the right side with a scalar product, by multiply-
ing the equations in index notation (starting with the second one) with δnj , we obtain
the following equations.
∫ curl a ⋅ d A = ∮ a ⋅ d x,        ∫ (𝜕aj /𝜕xi ) εjki d Ak = ∮ aj d xj ,
∫ curl a ⋅ d A = ∮ a ⋅ d x,        ∫ (𝜕amj /𝜕xi ) εjki d Ak = ∮ amj d xj ,        (2.100)
etc.                               etc.
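As a small numerical sketch (not part of the text), (2.100) can be checked for a concrete example; the field a = (−y, x, 0) and the unit disk in the plane z = 0 are chosen purely for illustration.

import numpy as np
from scipy.integrate import quad

# closed line integral along the boundary circle x = cos t, y = sin t
def line_integrand(t):
    x, y = np.cos(t), np.sin(t)
    dx, dy = -np.sin(t), np.cos(t)
    return -y*dx + x*dy            # a . dx = -y dx + x dy

line, _ = quad(line_integrand, 0.0, 2*np.pi)

# surface integral: curl a = (0, 0, 2) and dA = ez dA on the disk,
# so the flux is 2 times the area of the unit disk
flux = 2.0 * np.pi

print(line, flux)                  # both give 2*pi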
3. We can also replace in (2.99) the tensor product on the right side with a vector
product, if we multiply the equations in index notation (starting with the second one)
with εnpj = −εnjp . This gives e. g. for the second equation
∫ (𝜕an /𝜕xi ) εijk d Ak (−εnjp ) = ∮ an εnpj d xj .
In symbolic notation this corresponds to

∫ (grad a × d A) ⋅⋅ ε = − ∮ a × d x,

a form we will not use if we want to avoid the uncommon notation with the double scalar product with the ε-tensor.
4. We stated the Stokes theorem using right derivatives. If we prefer to use left deriva-
tives, we again need to change the order of the factors and replace ⊗ by ×. Then we
have e. g. for (2.99)2
∫ d A × gradL a = ∮ d x a.
We see from all these equations that left derivatives lead to symbolic equations with
the commonly used vector product × and right derivatives lead to symbolic equations
with the uncommon vector product ⊗.
5. We again start with proving the basic version of the Stokes theorem, i. e.

∫ (𝜕a/𝜕xi ) εijk d Ak = ∮ a d xj .

Writing the surface vector according to (2.82), the left side becomes

Ij := ∫ (𝜕a/𝜕xi ) εijk d Ak = ∫∫ εijk εkmn (𝜕a/𝜕xi ) (𝜕xm /𝜕u) (𝜕xn /𝜕v) d u d v
    = ∫∫ εkij εkmn (𝜕a/𝜕xi ) (𝜕xm /𝜕u) (𝜕xn /𝜕v) d u d v
    = ∫∫ (δim δjn − δin δjm ) (𝜕a/𝜕xi ) (𝜕xm /𝜕u) (𝜕xn /𝜕v) d u d v
    = ∫∫ ((𝜕a/𝜕u)(𝜕xj /𝜕v) − (𝜕a/𝜕v)(𝜕xj /𝜕u)) d u d v.
Now we consider the x-coordinate in the basic version of the theorem, i. e. we set the
only free index j to 1, and we choose u = x, v = y, i. e. the surface of integration is
given as z = z(x, y). Let us consider for the moment only surfaces whose projection
onto the x, y-plane is a one-to-one image of the surface, whose boundary intersects
with the coordinate lines x = const in at most two points, and let the surface be suffi-
ciently smooth so that z = z(x, y) is continuous and at least piecewise continuously
differentiable. Then we have, with

𝜕x1 /𝜕v = 𝜕x/𝜕y = 0    and    𝜕x1 /𝜕u = 𝜕x/𝜕x = 1,

I1 = − ∫_{x1}^{x2} d x ∫_{y1(x)}^{y2(x)} (𝜕a/𝜕y) d y = − ∫_{x1}^{x2} d x [a(x, y2 (x)) − a(x, y1 (x))].
The last integral has two parts. The first can be interpreted as an integral of a along
the curve y = y1 (x) and the second as an integral of a along the curve y = y2 (x), so
we have
I1 = ∫_{y1(x)} a d x − ∫_{y2(x)} a d x.
Here both parts of the curve are oriented so that a traveler on the curves moves to-
wards increasing x. We choose the orientation of the boundary curve as shown in the
figure on the last page; then, if we use a right-handed system (as in the figure), the
z-component of the surface vector points in the positive z-direction. To make sure that
both parts of the curve have this orientation, we need to change the sign of the integral
over y2 , which gives
I1 = ∫ (𝜕a/𝜕xi ) εi1k d Ak = ∮ a d x,
which is what we wanted to prove. It is easy to see that this equation also holds, if we
change the orientation of the boundary curve or if we use a left-handed system.
Similarly as for the Gauss theorem we now extend the proof to surfaces where the
boundary of the projected surface of integration intersects with the coordinate lines
x = const not only in two points x1 and x2 , but is partially tangent to these coordinate
lines. Such a part of the boundary does not contribute to the line integral, since we
have d x = 0 everywhere, so we conclude that the equation we want to prove also
holds for those surfaces.
Finally, we can extend the proof to surfaces of any shape, including multiply con-
nected two-sided surfaces, as long as all parts of the boundary are continuous and
at least piecewise continuously differentiable. We can always cut these surfaces into
parts of the type of surfaces for which we have already shown that the Stokes theorem
holds. When we add the Stokes theorems for these subsurfaces, we obtain on the left
side the surface integral over the total surface, and in the closed line integrals on the
right side adjacent curves are each traveled through twice in opposite directions, so
that the shared parts of the boundary cancel out and only the closed line integral over
the outer boundary of the surface remains. For example, in the case of a doubly con-
nected surface with an outer and an inner boundary, both boundary pieces must be
oriented so that a traveler along the curve sees the surface in both cases on the same
side (e. g. on the left side) of the curve. (It is easy to see that if we split a one-sided
surface, like the Möbius strip, the cutting curve is traveled through twice in the same
direction.)
Again, since the three Cartesian coordinates are equivalent, the proofs for the
other two coordinates of the first version of the Stokes theorem are analogous. Using
the same argument as for the Gauss theorem, we can replace the scalar field a(x, y, z)
with the coordinates of a higher-order tensor field, and hence extend the proof to the
other versions of the Stokes theorem.
A tensor can also be decomposed into an isotropic part â δij and a trace-free part åij , its so-called deviator:

aij = â δij + åij .        (3.2)
By deriving unique rules to compute a ̂ and å ij from the coordinates aij of an arbitrary
tensor, we can show that a unique decomposition is always possible. For this, we com-
pute the trace of (3.2). Since å ii is zero by definition, it follows that aii = 3 â or
â = (1/3) aii ,        (3.3)

and when we substitute this into (3.2) and solve for åij we get

åij = aij − (1/3) akk δij .        (3.4)
3. These two decompositions are related as follows:
aij = â δij + (a(ij) − â δij ) + a[ij] ,        (3.5)

where the first two terms together give the symmetric part a(ij) and the last two terms together give the deviator åij .
https://doi.org/10.1515/9783110404265-003
Using the transformation laws,1 it is easy to see that each part of this decomposition, i. e. the isotropic part â δij , the symmetric part of the deviator å(ij) := (a(ij) − â δij ), the antimetric part a[ij] , the symmetric part a(ij) , and the deviatoric part åij , again transforms into a part of the same kind under a transformation to another coordinate system.
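The decompositions above are easy to carry out numerically; the following lines are a small numpy sketch (the coordinate matrix is only an example, not one from the text).

import numpy as np

a = np.array([[1.0, 2.0, 0.0],
              [4.0, 3.0, 1.0],
              [0.0, 5.0, 2.0]])

iso  = np.trace(a)/3.0 * np.eye(3)     # isotropic part  a_hat * delta_ij, cf. (3.3)
dev  = a - iso                         # deviator, cf. (3.4): trace-free
sym  = 0.5*(a + a.T)                   # symmetric part  a_(ij)
anti = 0.5*(a - a.T)                   # antimetric part a_[ij]
sym_dev = sym - iso                    # symmetric part of the deviator

print(np.trace(dev))                               # 0.0
print(np.allclose(iso + sym_dev + anti, a))        # True, cf. (3.5)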
Problem 3.1.
Consider a tensor (in a given Cartesian coordinate system) with the coordinate matrix
4 1 −7
aij = (−3 5 8) .
3 2 9
Compute (by finding the corresponding coordinate matrices) its isotropic part, its de-
viatoric part, its symmetric part, its antimetric part, and the symmetric part of its de-
viator. Check your result by adding the isotropic part, the symmetric deviatoric part,
and the antimetric part.
Problem 3.2.
Show that the antimetry of a tensor is invariant under a transformation of coordinates.
                                                     | app  apq  apr |
A := det aij = (1/6) εijk εpqr aip ajq akr = (1/6)   | aqp  aqq  aqr | .        (3.7)
                                                     | arp  arq  arr |
Since the right side of this equation is a scalar, the determinant of the coordinate ma-
trix of a tensor is also a scalar, i. e. it is independent of a chosen coordinate system.
Moreover, for a polar tensor it is a polar scalar, and for an axial tensor it is an axial
scalar. This justifies to have a symbol for the determinant of a tensor itself and thus
we write A = det a.
2. A scalar obtained from the coordinates of a tensor is called an invariant of the ten-
sor. So the trace and the determinant of a tensor are invariants of this tensor.
3. The determinant of an antimetric tensor is zero. If we change in the last equation the
order of i and p, j and q, and k and r, then we get A = (1/6) εpqr εijk api aqj ark . Comparing
this with the last equation gives, for an antimetric tensor, A = −A, i. e. A = 0.
For an antimetric tensor a with the associated vector A we have, for an arbitrary vector B,

a ⋅ B = B × A.        (3.10)
Problem 3.3.
Consider the antimetric tensor given by
0 a12 −a31
a[ij] = (−a12 0 a23 ) .
a31 −a23 0
2 Those authors who consider only polar tensors as tensors and who use only the polar ε-tensor call
Ak the axial vector associated with a[ij] . This axial vector is then clearly in our sense also a polar vector.
3 It is not very common to use the ε-tensor in symbolic notation; it can be replaced by a suitable vector
product with the δ-tensor.
bmn = (1/2) εmij εnpq aip ajq .        (3.11)
This means that the cofactors of a second-order tensor also form the coordinate matrix
of a second-order tensor. The so-defined tensor b is called the cotensor of the tensor a.
Clearly, the cotensor of a second-order tensor is always a polar tensor.
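As a small sketch (not part of the text), (3.11) can be evaluated directly with numpy's einsum; the example also checks the familiar cofactor property aij bkj = (det a) δik (the coordinate matrix is only an example).

import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

a = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])

# b_mn = 1/2 eps_mij eps_npq a_ip a_jq, cf. (3.11)
b = 0.5 * np.einsum('mij,npq,ip,jq->mn', eps, eps, a, a)

# Laplace expansion of the determinant: a_ij b_kj = det(a) delta_ik
print(np.allclose(a @ b.T, np.linalg.det(a) * np.eye(3)))    # True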
2. In particular, for an antimetric tensor we obtain a simple expression for its cotensor,
if we use the vector associated with the tensor according to (3.8). We have
bip = (1/2) εijk εpqr εjqm Am εkrn An
    = (1/2) εjki εjqm εpqr εnkr Am An
    = (1/2) (δkq δim − δkm δiq )(δpn δqk − δpk δqn ) Am An
    = (1/2) (3 δim δpn − δim δpn − δim δpn + δpm δin ) Am An
    = (1/2) (δim δpn + δpm δin ) Am An
    = (1/2) (Ai Ap + Ap Ai ),

bip = Ai Ap .        (3.12)
Rank 3:  regular,                          A ≠ 0;
Rank 2:  single singular, rank defect 1,   A = 0, bij ≠ 0;
Rank 1:  double singular, rank defect 2,   bij = 0, aij ≠ 0;
Rank 0:  triple singular, rank defect 3,   aij = 0.
Since the property of the coordinate matrices aij and bij , to be zero or nonzero, is in-
variant under coordinate transformations, we conclude that all coordinate matrices of
a tensor have the same rank. We call this property the rank of the associated tensor.
2. An antimetric tensor is, as shown in Section 3.2, always singular. If it is not triple
singular, then it is, according to (3.12), single singular.
aij a(−1)jk = δik .        (a)
The transformation equation between the coordinates aij of the tensor a in the original coordinate system and its coordinates ãij in another coordinate system is given by (2.17),

aij = αip αjq ãpq ,        (b)

and correspondingly for the coordinates of the inverse tensor,

a(−1)jk = αjp αkq ã(−1)pq .        (c)
We will show that the coordinate matrices of these two tensors are inverses of each
other also in the other coordinate system (and thus in every Cartesian coordinate sys-
tem).
For this, we substitute (b) and (c) into (a), and we obtain

αim αkq ãmn ã(−1)nq = δik .

Multiplying both sides by αir αks and using αim αir = δmr , αkq αks = δqs , and αkr αks = δrs gives

ãrn ã(−1)ns = δrs ,    Q. E. D.
For every regular tensor a there exists exactly one tensor a−1 with the property that the coordinate matrices of both tensors are inverses of each other in every Cartesian coordinate system. We call these two tensors the inverses of each other and then (1.49), (1.51),
(1.52), (1.54), (1.50), and (1.53) give
a−1 ⋅ a = δ,                      a(−1)ij ajk = δik ,
(a−1 )−1 = a,                     (a(−1)ij )(−1) = aij ,        (3.13)
(a−1 )T = (aT )−1 =: a−T ,        (a(−1)ij )T = (aTij )(−1) =: a(−T)ij ,

a−1 = (1/A) bT ,                  a(−1)ij = (1/A) bji ,        (3.14)
Problem 3.4.
Consider the tensor with the coordinates
0 3 2
Aij = (−3 0 −1) .
−2 1 0
A tensor whose transposed tensor is equal to its inverse is called orthogonal; for an orthogonal tensor we have

aT = a−1 ,        a ⋅ aT = aT ⋅ a = δ,
aik ajk = δij ,   aki akj = δij ,        (3.16)
A = ±1.
Problem 3.5.
Prove that if the coordinate matrix of a tensor is proper or improper orthogonal in one
coordinate system, then it is proper or improper orthogonal in all coordinate systems.
Hint: First show that if the coordinate matrix of a tensor is orthogonal in one co-
ordinate system, then it is orthogonal in all coordinate systems.
Problem 3.6.
Show that a proper orthogonal tensor is equal to its cotensor and that an improper
orthogonal tensor is equal to the negative of its cotensor.
Hint: The coordinates Aij of a regular tensor and the coordinates Bij of its cotensor are related, with (1.50), as Bij = A(−1)ji det A.
U = a ⋅ X, Ui = aij Xj . (3.17)
The tensor a assigns a vector U to each vector X. We also say U is a function of X; since
X is a vector, we call it a vector function, and since U is a vector, we call it a vector-
valued function. We also say the tensor a maps the vector X to the vector U. The set
of all vectors X that are mapped by the tensor a is called the domain and, conversely,
the set of all vectors U is called the range of the mapping. An element U of the range
is also referred to as the image of the corresponding X in the domain of the mapping
and, conversely, X is the inverse image of U. Using (3.17), we have f (X +Y) = f (X)+f (Y)
and f (λ X) = λ f (X); we refer to such a function as linear. In summary, we can call a
tensor a linear vector-valued vector function and we call the mapping that it generates
a linear map.
2. This property of a tensor suggests a geometric interpretation of its coordinates: if we successively substitute for Xi the Cartesian coordinates e(α)i = δiα of the unit vectors e(α) of the underlying coordinate system, then we get for the image of this Cartesian basis

U(α)i = aij e(α)j = aij δjα ,
U(α)i = aiα .        (3.18)
This means that the columns of the coordinate matrix aij are the coordinates of the
images of the unit vectors of the underlying coordinate system.
3. We consider two collinear vectors Xi and Xi∗ = λ Xi . Their images Ui = aij Xj and Ui∗ = aij Xj∗ are related as follows:

Ui∗ = aij Xj∗ = λ aij Xj = λ Ui .
Two collinear vectors are thus mapped to two also collinear vectors with the same
length ratio. Such a map is called affine.
We will see that further properties of a map depend on the rank of the tensor.
Hence we consider the values which the rank of a tensor can take separately.
3.8.1 Rank 3
1. If the tensor aij has rank 3, then the images of the unit vectors are linearly independent, so they are not coplanar, and the inverse tensor a(−1)ij exists and generates the inverse map

X = a−1 ⋅ U,        Xi = a(−1)ij Uj .        (3.19)
2. Clearly, the zero vector is mapped to the zero vector; since the map is one-to-one,
no vector which is different from zero is mapped to the zero vector.
3. If the tensor is orthogonal, then we get for the square of the image of any vector Xi with (3.17) and (3.16)

Ui Ui = aij Xj aik Xk = δjk Xj Xk = Xj Xj ,

i. e. each vector is mapped to a vector of the same length. The scalar product of the image vectors Ui = aij Xj and Vi = aij Yj of any two vectors Xi and Yi satisfies

Ui Vi = aij Xj aik Yk = δjk Xj Yk = Xj Yj ,

i. e. the scalar product of two vectors, and with it the angle between them, is also preserved under the mapping.
4. A regular tensor maps a Cartesian basis to three linearly independent (i. e. non-
coplanar) vectors, which generally are neither unit vectors nor perpendicular to each
other. In particular, an orthogonal tensor maps a Cartesian basis to three mutually
perpendicular unit vectors, i. e. again a Cartesian basis. Later we will prove that if the tensor is proper orthogonal, then the orientation of the basis is preserved, i. e. a right-handed basis is mapped to a right-handed basis, while an improper orthogonal tensor reverses the orientation.
3.8.2 Rank 2
1. If the tensor aij has rank 2, then both the rows and the columns of its coordinate
matrix are coplanar, but not collinear. If we interpret both the rows and the columns as
the coordinates of position vectors or as points in space, then they each define a plane
through the origin, which we call the row plane and the column plane, respectively.
The images of the unit vectors then lie in the column plane. Since any vector in the
domain can be represented as a linear combination of the unit vectors and since this
mapping is linear, its image in the image space can also be represented as a linear
combination of the U(α)i ; i. e. geometrically the tensor aij maps each point of the space
to a point in the column plane. We also say the tensor aij maps the three-dimensional
vector space of the Xi onto the two-dimensional vector space of the Ui .
2. Since the matrix of the aij is singular, the equation aij Xj = 0 has nontrivial solu-
tions. This means that there exist vectors Xj whose image is the zero vector, i. e.
aij Xj = 0. (3.20)
These vectors are called null vectors; they span the null space of the single singular
tensor aij . Geometrically, the null space is a straight line through the origin; the di-
rection of this line is called the null direction. If we interpret again the rows of aij as
position vectors, then this line is, according to (3.20), perpendicular to these position
vectors and thus is perpendicular to the row plane. Let X and Y be any two vectors in
the domain, with the images U = a ⋅ X and V = a ⋅ Y. Then we get by subtracting U
− V = a ⋅ (X − Y). Clearly, the images U and V are equal if and only if the difference
between X and Y is pointing in the null direction. Thus two vectors whose projections
onto the row plane are equal have the same image. In other words, if we decompose a
vector X into a component X T in the row plane and a component X N perpendicular to
the row plane, then we get a ⋅ (X T + X N ) = a ⋅ X T .
Two different vectors in the row plane have different images; geometrically, a
rank-2 tensor maps each point in the row plane one-to-one to a point in the column
plane, and the corresponding transposed tensor accordingly maps each point in the
column plane (of the original tensor) one-to-one to a point in the row plane.
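The rank, the null direction, and the column plane of a given coordinate matrix can of course also be determined numerically; the following lines are a small sketch using the singular value decomposition (the matrix is only an example of a single singular tensor, not one from the text).

import numpy as np

a = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

U, s, Vt = np.linalg.svd(a)
rank = int(np.sum(s > 1e-12))
print(rank)                          # 2: single singular

null_space   = Vt[rank:]             # spans the null direction
column_plane = U[:, :rank]           # orthonormal basis of the column plane

print(np.allclose(a @ null_space.T, 0.0))    # True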
3. For an antimetric tensor, which we know is either single singular or the zero tensor,
the vector associated to the tensor points in the null direction. From (3.8)2 it follows
by multiplying by Aj that
aij Aj = εijk Ak Aj = 0,

because εijk is antimetric in j and k while Aj Ak is symmetric; the associated vector is therefore a null vector.
Problem 3.7.
Consider a tensor (in a given coordinate system) with the coordinate matrix
1 −1 −1
aij = (−2 1 3) .
−1 0 2
A. Convince yourself that it is single singular.
B. Determine the unit vector which spans the null space.
C. Find a Cartesian basis of the image plane.
Hint: Let U 1 and U 2 be two linearly independent vectors in the image plane;
first find a vector V 1 = U 1 + α U 2 which is perpendicular to U 1 .
3.8.3 Rank 1
If the tensor aij has rank 1, then both the rows and the columns of its coordinate ma-
trix are collinear. Geometrically, both the rows and the columns define a straight line
through the origin, which we call the row line and the column line.
The images of all vectors lie on the column line; the tensor maps the three-
dimensional vector space of the Xi onto the one-dimensional vector space of the Ui .
There are two linearly independent null directions which are both perpendicular
to the row line. These null directions span the null space of the double singular tensor
aij . Geometrically, the null space is a plane through the origin, whose normal vector
lies on the row line; we call it the null plane.
A tensor of rank 1 maps each point of the row line one-to-one to a point of the
column line; the corresponding transposed tensor accordingly maps each point of the
column line (of the original tensor) one-to-one to a point of the row line.
Problem 3.8.
Consider the tensor with the coordinate matrix
1 −1 2
aij = (−1 1 −2) .
3 −3 6
A. Convince yourself that it is double singular.
B. Determine two linearly independent null vectors.
3.8.4 Rank 0
Finally, if the tensor aij has rank 0, then clearly it transforms any vector Xi into the zero
vector: the tensor is the zero tensor.
3.9.1 Definition
2. Now we define the basis g^i , reciprocal to the basis gi , as the three vectors

g^1 = (g2 × g3 )/[g1 , g2 , g3 ],    g^2 = (g3 × g1 )/[g1 , g2 , g3 ],    g^3 = (g1 × g2 )/[g1 , g2 , g3 ].        (3.21)

Since the triple product in the denominator is nonzero, the three vectors g^i are uniquely determined, and since three vectors, which are each perpendicular to the other two vectors of a basis, cannot be coplanar, the g^i also form a basis.
If the gi are polar vectors, then the g^i are also polar vectors.
We can summarize the three equations (3.21) in one equation,

g^i = (1/2) εijk (gj × gk )/[g1 , g2 , g3 ].        (3.23)
g^1 ⋅ g1 = 1,    g^2 ⋅ g1 = 0,    g^3 ⋅ g1 = 0.

Similar equations hold for g2 and g3 , and we can summarize these nine equations into the so-called orthogonality relations

g^i ⋅ gj = δij ,        g^i_k gjk = δij ,        (3.24)

where gjk and g^i_k denote the Cartesian coordinates of gj and g^i .
In matrix form, the coordinate version of (3.24) reads

( g^1_1  g^1_2  g^1_3 ) ( g11  g21  g31 )   ( 1  0  0 )
( g^2_1  g^2_2  g^2_3 ) ( g12  g22  g32 ) = ( 0  1  0 ) .
( g^3_1  g^3_2  g^3_3 ) ( g13  g23  g33 )   ( 0  0  1 )

The matrix formed by the coordinates of the three vectors g^i as rows is the inverse of the matrix formed by the coordinates of the three vectors gi as columns. From this we can compute the basis reciprocal to a given basis with the Gauss–Jordan algorithm.
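In a numerical setting the matrix inversion can take the place of the Gauss–Jordan elimination; the following lines are a small sketch of this (the basis vectors are only an example).

import numpy as np

g1 = np.array([1.0, 0.0, 0.0])
g2 = np.array([1.0, 2.0, 0.0])
g3 = np.array([0.0, 1.0, 3.0])

G = np.column_stack([g1, g2, g3])    # columns: the basis vectors g_i
G_rec = np.linalg.inv(G)             # rows: the reciprocal basis vectors

# orthogonality relations (3.24)
print(np.allclose(G_rec @ G, np.eye(3)))    # True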
To derive a second form of the orthogonality relations, we consider the tensor with the coordinates

Xij = g^k_i gkj .

Multiplying by g^m_j and using (3.24) gives

Xij g^m_j = g^k_i gkj g^m_j = g^k_i δkm = g^m_i = δij g^m_j ,

(Xij − δij ) g^m_j = 0,        (X − δ) ⋅ g^m = 0.
Thus all three vectors g m are null vectors of the tensor X −δ, i. e. the tensor X −δ must
be triple singular. In other words, it must be the zero tensor. This yields the following
relations (which are also called orthogonality relations):
g^k gk = δ,        g^k_i gkj = δij .        (3.25)
If the three vectors of a basis are mutually perpendicular, we call the basis orthogonal,
and if these vectors are unit vectors, we call it a normalized basis. If the basis (as the
basis of a Cartesian coordinate system) is both orthogonal and normalized, it is called
orthonormal.
For a pair of reciprocal bases it follows from (3.21) and (3.24) that if the original
basis is orthogonal, then its reciprocal basis is also orthogonal, and the homologous
vectors of both bases are collinear and their magnitudes are reciprocal. If the original
basis is orthonormal, then it is identical to its reciprocal basis.
The orthogonality relations (3.24) and (3.25) for a Cartesian basis ei are, according to (2.4),4 given by

ei ⋅ ej = δij ,    eik ejk = δij ,    αki αkj = δij ,
ek ek = δ,         eki ekj = δij ,    αik αjk = δij .        (3.26)
Problem 3.9.
The following is given with respect to a Cartesian basis ei :
A. the (orthogonal) basis g 1 = 3 e1 , g 2 = 2 e2 , g 3 = e3 ;
B. the (nonorthogonal) basis, whose vectors are given by the three edges of a regular
tetrahedron with edge length one, lying in the first octant, with g 1 = e1 and g 2
lying in the e1 , e2 -plane.
1. Let g 1 and g 2 be two linearly independent vectors. Then there is, in the plane spanned
by g 1 and g 2 , exactly one pair of vectors g 1 and g 2 which satisfies the orthogonality re-
lations (3.24), with i and j running naturally only from 1 to 2. We call this basis the basis
reciprocal to g 1 and g 2 . The easiest way to see why this is plausible is geometrically:
for example, g 1 must be in the plane spanned by g 1 and g 2 , it must be perpendicular to
g 2 , it must enclose an acute angle with g 1 , and its length must be such that g 1 ⋅ g 1 = 1.
2. In a Cartesian basis in this plane, all four vectors have only two coordinates, and
the two-dimensional Cartesian coordinates of g 1 and g 2 are computed, analogously to
(3.22), from the equations
g^1_i = εij g2j /(εmn g1m g2n ),        g^2_j = εij g1i /(εmn g1m g2n ).        (3.27)
It is easy to verify that the so-defined two-dimensional vectors satisfy the orthogonal-
ity relations.
We choose an arbitrary basis g i and denote the images of its vectors under the mapping by a tensor a by

hi = a ⋅ g i .        (3.28)

Forming the dyadic products of these images with the vectors g i of the reciprocal basis and summing gives, with (3.25),

hi g i = a ⋅ g i g i = a ⋅ δ = a,

a = hi g i = h1 g 1 + h2 g 2 + h3 g 3 .        (3.29)
Any tensor a can therefore be represented by six vectors in this way; here the hi are the
images of the basis g i under the mapping by the tensor a, and the basis g i is reciprocal
to the basis g i . One can start from an arbitrary basis g i and hence the representation
(3.29) is possible in infinitely many ways: the basis g i can be freely specified and then
all three vectors hi are uniquely determined by a.
The basis g i is always noncoplanar. With respect to the hi , we need to distinguish
between the cases where a has rank 3, 2, or 1.
3.10.1 Rank 3
1. If a has rank 3, then the hi are also noncoplanar. The tensor a−1 , the inverse of the
tensor a, exists and so does the basis hi , reciprocal to hi .
2. The converse of (3.29) is also true: if a tensor a can be written in the form (3.29) and
both the g i and the hi are linearly independent, then the tensor a has rank 3.
Then U = a ⋅ X gives

U = h1 g 1 ⋅ X + h2 g 2 ⋅ X + h3 g 3 ⋅ X,

i. e. with αi := g i ⋅ X,

U = α1 h1 + α2 h2 + α3 h3 ,
hi = a ⋅ g i , g i = aT ⋅ hi , g i = a−1 ⋅ hi , hi = a−T ⋅ g i ,
and solving for the tensors using the orthogonality relations (3.25) gives
a = hi g i , aT = g i hi , a−1 = g i hi , a−T = hi g i .
a = hi g i , aT = g i hi ,
(3.30)
a−1 = g i hi , a−T = hi g i .
Here the g i and g i and also the hi and hi are pairs of reciprocal bases.
– The converse is also true: if a tensor a can be represented in the form (3.30)1
and both the hi and g i form a basis, then the tensor has rank 3.
– If we solve the equations (3.30) for the four bases, we obtain
hi = a ⋅ g i , g i = aT ⋅ hi ,
(3.31)
g i = a−1 ⋅ hi , hi = a−T ⋅ g i .
– For any tensor a, we can freely choose one of the four bases g i , g i , hi , or hi , and
then the other three are uniquely determined.
Problem 3.10.
Represent the regular tensor with the coordinates
2 3 −1
aij = (4 −2 3)
1 2 1
Hint: The hi can be computed using the Gauss–Jordan algorithm in one step.
3.10.2 Rank 2
1. If a has rank 2, we choose the basis g i in (3.28) so that g 1 and g 2 lie in the row plane
of a and g 3 is a null vector of a. Then we have h3 = 0, and g 1 and g 2 are perpendicular
to g 3 , i. e. they also lie in the row plane of a. Then (3.29) reduces to
a = h1 g 1 + h2 g 2 . (3.32)
We can represent any single singular tensor by four vectors in this way; here the g 1
and g 2 are the basis reciprocal to g 1 and g 2 in the row plane of the tensor, and h1 and
h2 are the images of g 1 and g 2 under the mapping by the tensor and they form a basis
in the column plane of the tensor.
2. The converse of (3.32) is also true: if a tensor a can be written in the form a =
h1 g 1 + h2 g 2 , and both h1 and h2 and also g 1 and g 2 are not collinear, then a has rank
2. Then it follows from U = a ⋅ X that
U = h1 g 1 ⋅ X + h2 g 2 ⋅ X.
aT = g 1 h1 + g 2 h2 .
Instead, we can also write aT = g i hi , where i runs only from 1 to 2. Scalar multiplica-
tion of this equation from the right by hj , the basis reciprocal to hj in the column plane
of a, gives
aT ⋅ hj = g i hi ⋅ hj = g i δij = g j .
a = h1 g 1 + h2 g 2 , aT = g 1 h1 + g 2 h2 . (3.33)
hi = a ⋅ g i , g i = aT ⋅ hi . (3.34)
Here the g i are the basis reciprocal to the g i in the plane and the hi are the basis
reciprocal to the hi in the plane.
– The g i and g i lie in the row plane of a, and the hi and hi lie in the column plane
of a.
– For any rank-2 tensor a, we can freely choose one of the four vector pairs
g i , g i , hi , or hi , and then the other three vector pairs are uniquely determined.
Problem 3.11.
A. Verify that the single singular tensor from Problem 3.7 can be represented in the
form a = h1 g 1 + h2 g 2 , with h1 = (−1, 2, 1) and h2 = (0, 1, 1).
B. Compute g 1 and g 2 . (This can be done using the Gauss–Jordan algorithm in one
step.)
C. Write a as the sum of h1 g 1 and h2 g 2 .
3.10.3 Rank 1
1. Finally, if a has rank 1, then we choose the basis in (3.28) so that g 1 lies on the row
line of a and so that g 2 and g 3 are null vectors of a. Then h2 = h3 = 0, and g 1 also lies
on the row line of a. Equation (3.29) reduces to
a = h1 g 1 , aT = g 1 h1 . (3.35)
This means that any double singular tensor can be represented as a tensor product
of two nonzero vectors, where h1 lies on the column line of the tensor, g 1 lies on the
row line of the tensor, and h1 is the image of the vector g 1 reciprocal to g 1 , under the
mapping with the tensor. (Two vectors are reciprocal if they are collinear and their
scalar product is equal to one.)
2. The converse is also true: if a tensor a can be written in the form (3.35)1 and h1 and
g 1 are nonzero, then the tensor has rank 1. Then from U = a ⋅ X we have U = h1 g 1 ⋅ X,
i. e. the image vector U is pointing in the direction of h1 for all X.
3. Let h1 be the vector reciprocal to h1 . Then from (3.35) we get by scalar multiplication
from the right by g 1 and h1 , respectively,
h1 = a ⋅ g 1 , g 1 = aT ⋅ h1 . (3.36)
a = h1 g 1 , aT = g 1 h1 . (3.35)
– The converse is also true: if a tensor a can be represented in the form (3.35)1
and g 1 and h1 are nonzero, then the tensor has rank 1.
– If we solve for h1 and g 1 , then the equations (3.35) become
h1 = a ⋅ g 1 , g 1 = aT ⋅ h1 . (3.36)
1. We want to investigate under which conditions the vectors associated with the trans-
formation Ui = aij Xj are collinear, i. e.
Ui = λ Xi , (3.37)
aij Xj = λ Xi , (3.38)
(aij − λ δij ) Xj = 0. (3.39)
Clearly, this condition is only satisfied if the tensor (aij − λ δij ) is singular, i. e. if

det (aij − λ δij ) = 0,        (3.40)

and if Xj is a null vector of this singular tensor. The values λ, for which the tensor (aij −
λ δij ) is singular, are called the eigenvalues of the tensor aij . As scalars, they are clearly
invariants of the tensor. The null vectors of (aij − λ δij ) are called the eigenvectors of
aij , the null directions of (aij − λ δij ) are called the eigendirections of aij , and equation
(3.39) is called the eigenvalue equation of aij .
2. In addition to the eigenvalue problem above, we also introduce the eigenvalue prob-
lem aTij Zj = μ Zi for the transposed tensor. We have
Zj aji = μ Zi , (3.41)
Zi (aij − μ δij ) = 0. (3.42)
Since the determinant of a matrix is equal to the determinant of its transposed matrix, both eigenvalue problems have the same characteristic equation and therefore the same eigenvalues,

μ = λ.        (3.43)
The eigenvectors of both problems are also related. We mention here only the simplest
relation: right and left eigenvectors, which belong to different eigenvalues, are orthog-
onal. Let X m be a right eigenvector corresponding to an eigenvalue λm and Zn be a left
eigenvector corresponding to an eigenvalue λn , i. e. we have
a ⋅ X m = λm Xm , Z n ⋅ a = λn Zn .
Scalar multiplication of the first equation from the left with Z n and of the second equa-
tion from the right with X m gives
Z n ⋅ a ⋅ X m = λm Z n ⋅ Xm , Z n ⋅ a ⋅ X m = λn Zn ⋅ X m .
The left sides of both equations are the same, so we set the right sides equal and get
(λm − λn )Z n ⋅ X m = 0.
Since the two eigenvalues are distinct by assumption, it follows that

Xm ⋅ Zn = 0    for m ≠ n.        (3.44)
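The following lines are a small numerical illustration (not part of the text) of (3.44): right eigenvectors and left eigenvectors (i. e. the eigenvectors of the transposed matrix) belonging to different eigenvalues are orthogonal; the matrix is only an example.

import numpy as np

a = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

lam, X = np.linalg.eig(a)      # right eigenvectors as columns of X
mu,  Z = np.linalg.eig(a.T)    # left eigenvectors as columns of Z

X = X[:, np.argsort(lam)]      # sort both sets by eigenvalue so that
Z = Z[:, np.argsort(mu)]       # column m of X and of Z belong together

print(np.round(Z.T @ X, 10))   # off-diagonal entries (m != n) vanish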
1. The condition for determining the eigenvalues follows from (3.7) and is
(1/6) εijk εpqr aip ajq akr − (λ/6) εijk εpqr (δip ajq akr + δjq akr aip + δkr aip ajq )
+ (λ2/6) εijk εpqr (δip δjq akr + δjq δkr aip + δkr δip ajq ) − (λ3/6) εijk εpqr δip δjq δkr = 0.        (3.45)
For the coefficient of λ3 , we have εijk εpqr δip δjq δkr = εijk εijk , which equals the sum of
the squares of all nonzero coordinates of the ε-tensor, hence 6. The coefficients of the
remaining powers of λ are clearly scalars and so they are, like the eigenvalues, also
invariants of the tensor aij . We write for the last equation
A − A λ + A λ2 − λ3 = 0,        (3.46)

with

A = (1/2) εijk εpqr δip δjq akr = aii ,        (3.47)
A = (1/2) εijk εpqr δip ajq akr = bii ,        (3.48)
A = (1/6) εijk εpqr aip ajq akr = det aij .        (3.49)
We call these three invariants the first, second, and third main invariant of the tensor
aij (after the power of the coordinates of aij ). According to (1.35), we can also write A
as
A = (1/2) (δjq δkr − δjr δkq ) ajq akr

or

A = (1/2) (ajj akk − ajk akj ) = (1/2) [tr2 a − tr (a ⋅ a)].        (3.50)
2. For a polar tensor, clearly, the eigenvalues and the main invariants are polar scalars.
For an axial tensor, the eigenvalues are, according to (3.39), axial scalars, and then
according to (3.46), both A and A are axial scalars, while A is a polar scalar.
3. Using (3.50), we can now derive a simple expression for the cotensor. We have, with (3.11) and (1.26),

bip = (1/2) εijk εpqr ajq akr = (1/2) | δip  δiq  δir |            = (1/2) | δip  δiq  δir |
                                      | δjp  δjq  δjr |  ajq akr           | apq  aqq  arq | .
                                      | δkp  δkq  δkr |                    | apr  aqr  arr |

Expanding the last determinant along its first row gives

bip = (1/2) [δip (aqq arr − aqr arq ) + δiq (arq apr − apq arr ) + δir (apq aqr − aqq apr )]
    = (1/2) [δip (aqq arr − aqr arq ) + ari apr − api arr + apq aqi − aqq api ]
    = (1/2) (aqq arr − aqr arq ) δip + apq aqi − aqq api .
1. The characteristic equation (3.45) has at least one real root (we assume real coordi-
nates aij ). Thus each tensor with real coordinates has at least one real eigenvalue. The
following cases are possible:
I:   three distinct real eigenvalues;
II:  three distinct eigenvalues, one real and two complex conjugate;
III: two distinct real eigenvalues, one of multiplicity 1 and one of multiplicity 2;
IV:  one real eigenvalue of multiplicity 3.
2. All eigenvalues of a symmetric tensor are real. To see this, we assume a complex eigenvalue λ + i μ with a corresponding eigenvector Xi + i Yi . Splitting the eigenvalue equation (3.38) into its real and imaginary parts gives

aij Xj = λ Xi − μ Yi ,        aij Yj = μ Xi + λ Yi .

We multiply the first equation by Yi and the second by Xi .
Due to the symmetry of aij the left sides are equal. Setting the right sides equal gives
μ(Xi2 + Yi2 ) = 0.
Since the eigenvector is nonzero by assumption, it follows that μ = 0 , i. e. the
eigenvalue must be real.
3. An antimetric tensor has the eigenvalues

λ1 = 0,        λ2,3 = ± i √(Ak Ak ),        (3.52)

where Ai is the vector associated with the tensor according to (3.8), the i before the root is the imaginary unit (case II), and Ai is also an eigenvector corresponding to λ1 .
To see this, we write out the characteristic equation (3.46). We showed at the end of
Section 3.2 that the determinant of an antimetric tensor is zero. According to (3.48) and
(3.12), we have A = Ai Ai , and the trace of an antimetric tensor clearly vanishes.
Thus the characteristic equation reads λ(Ai Ai + λ2 ) = 0, from which (3.52) follows
immediately. The determining equations (3.39) for the eigenvector corresponding to
λ1 = 0 are aij Xj = 0. According to (3.8)2 , this gives εijk Ak Xj = 0, and then we
immediately recognize that Xi = Ai is a solution, because εijk is antimetric and Ak Aj
is symmetric.
4. A tensor which is neither symmetric nor antimetric can clearly belong to any of the
four cases mentioned above.
5. The converse of the characteristic equation (3.46) gives the so-called Vieta formulas
A = λ1 λ2 λ3 ,
A = λ1 λ2 + λ2 λ3 + λ3 λ1 , (3.53)
A = λ1 + λ2 + λ3 .
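The main invariants and Vieta's formulas are easy to check numerically; the following lines are a small sketch (the coordinate matrix is only an example).

import numpy as np

a = np.array([[4.0, 1.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])

I1 = np.trace(a)                              # first invariant,  cf. (3.47)
I2 = 0.5*(np.trace(a)**2 - np.trace(a @ a))   # second invariant, cf. (3.50)
I3 = np.linalg.det(a)                         # third invariant,  cf. (3.49)

lam = np.linalg.eigvals(a)                    # eigenvalues (possibly complex)
e1 = lam.sum()
e2 = 0.5*(lam.sum()**2 - (lam**2).sum())
e3 = lam.prod()

# the elementary symmetric functions of the eigenvalues are real, cf. (3.53)
print(np.allclose([I1, I2, I3], [e1.real, e2.real, e3.real]))    # True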
6. If the coordinate matrix of a tensor is triangular, then the diagonal elements of this matrix are the eigenvalues of the tensor. In this case the coordinate matrix of the tensor (a − λ δ) is also triangular, its determinant is the product (a11 − λ)(a22 − λ)(a33 − λ) of its diagonal elements, and this product vanishes exactly for λ = a11 , a22 , a33 .
7. If in one row or column of the coordinate matrix of a tensor only the diagonal
element aii is nonzero, then this diagonal element is an eigenvalue of the tensor,
because expanding the determinant in the characteristic equation det (aij − λ δij )
= 0 with respect to that row or column leads to the factor (aii − λ).
If, in particular, in one column only the diagonal element is nonzero, then the
associated coordinate direction is the eigendirection corresponding to this eigenvalue.
To prove this statement, we assume, without loss of generality, that this is the case in
the first column and compute the eigenvectors corresponding to the eigenvalue on the
diagonal position by solving the system of equations
0 a12 a13 X1 0
(0 a22 − a11 a23 ) (X2 ) = (0) .
0 a32 a33 − a11 X3 0
6 If two eigenvalues are zero, then A = 0 and hence also bii = 0, from which we of course cannot
conclude that also bij = 0 .
Its nontrivial solutions are the coordinates of the basis vector e1 , i. e. X1 = 1 and
X2 = X3 = 0.
Clearly, the converse is also true: if a coordinate direction is an eigendirection,
then the diagonal element in the associated column of the coordinate matrix is the
corresponding eigenvalue, and the remaining elements of this column are zero. If we
assume, again without loss of generality, that the x1 -direction is an eigendirection, then from the system of equations (aij − λ δij ) Xj = 0 with X1 = 1, X2 = X3 = 0 it follows that ai1 = λ δi1 , i. e. a11 = λ and a21 = a31 = 0.
In this section, we derive theorems about eigenvectors of tensors with real coordinates,
and we obtain statements about the four classes of tensors, which we distinguished
at the beginning of Section 3.11.3.
1. The following statement is fundamental.

Theorem 1. A real eigenvalue has at least one real eigendirection; a complex eigenvalue has no real eigendirection.

From (3.39) it follows immediately that each eigenvalue has at least one eigendirection. To ensure full generality, we assume that both the eigenvalue and the eigendirection in (3.39) are complex, i. e.

a ⋅ (X + i Y) = (λ + i μ)(X + i Y).

Splitting into real and imaginary parts gives

a ⋅ X = λ X − μ Y,        a ⋅ Y = μ X + λ Y.

If the eigenvalue is real, i. e. μ = 0, these equations reduce to

a ⋅ X = λ X,        a ⋅ Y = λ Y,

i. e. the real part and the imaginary part of the eigenvector are each, if they are nonzero, real eigendirections; a real eigenvalue therefore has at least one real eigendirection. If, on the other hand, the eigenvalue is complex (μ ≠ 0) and we assume the eigendirection to be real, i. e. Y = 0, then the second of the equations above gives μ X = 0 and thus X = 0. Since X and Y cannot both be zero, our assumption has led to a contradiction, i. e. a complex eigenvalue cannot have a real eigendirection.
2. The following theorems follow immediately from the fact that the null directions
of the tensor (a − λ δ) are the eigendirections of the tensor a corresponding to the
eigenvalue λ.
Theorem 2. If, for an eigenvalue λ, the tensor (a − λ δ) has rank 2, then there exists only
one eigendirection corresponding to this eigenvalue.
Theorem 3. If, for an eigenvalue λ, the tensor (a−λ δ) has rank 1, then there exist two dis-
tinct (and thus linearly independent) eigendirections corresponding to this eigenvalue.
Theorem 4. If, for an eigenvalue λ, the tensor (a − λ δ) has rank 0, then there exist three
linearly independent eigendirections corresponding to this eigenvalue.
If an eigendirection q corresponded to two distinct eigenvalues λ1 and λ2 , we would have

a ⋅ q = λ1 q = λ2 q.

But since q as an eigenvector cannot be the zero vector, it follows from λ1 q = λ2 q that λ1 = λ2 ; an eigendirection therefore corresponds to exactly one eigenvalue.
Now let q1 , q2 , and q3 be eigendirections, chosen as unit vectors, corresponding to three distinct eigenvalues λ1 , λ2 , and λ3 , i. e.

a ⋅ q1 = λ1 q1 ,        a ⋅ q2 = λ2 q2 ,        a ⋅ q3 = λ3 q3 .

If these eigendirections were coplanar (linearly dependent), then a relation

αi qi = 0,        αi ≠ 0
must hold. Let α3 ≠ 0. Then we have with β1 = −α1 /α3 , β2 = −α2 /α3
q3 = β1 q1 + β2 q2 .
Since q1 , q2 , and q3 are unit vectors and all are distinct, clearly neither β1 nor β2 can
be zero. We now substitute the last equation into a ⋅ q3 = λ3 q3 , and we obtain
a ⋅ (β1 q1 + β2 q2 ) = λ3 (β1 q1 + β2 q2 ),
β1 a ⋅ q1 + β2 a ⋅ q2 = β1 λ3 q1 + β2 λ3 q2 ,
β1 λ1 q1 + β2 λ2 q2 = β1 λ3 q1 + β2 λ3 q2 ,
β1 (λ1 − λ3 ) q1 + β2 (λ2 − λ3 ) q2 = 0.

Since neither β1 (λ1 − λ3 ) nor β2 (λ2 − λ3 ) is zero, q1 and q2 would have to be collinear, in contradiction to the assumption that they are distinct. Eigendirections corresponding to distinct eigenvalues are therefore linearly independent.
Theorem 10. If an eigenvalue has only one eigendirection, it can have multiplicity 1, 2,
or 3.
An example is a tensor with the coordinate matrix

( λ1  a12  a13 )
(  0  λ1   a23 ) .
(  0   0   λ1  )

According to Section 3.11.3, No. 6, it has the eigenvalue λ1 of multiplicity 3, and the coordinate matrix of the tensor (a − λ1 δ) is

( 0  a12  a13 )
( 0   0   a23 ) .
( 0   0    0  )
For a12 ≠ 0, a23 ≠ 0 this matrix has rank 2 and hence the tensor has, according
to Theorem 2, only one eigendirection. For a12 = 0, a23 ≠ 0 the matrix has rank 1
and hence the tensor has, according to Theorem 3, two distinct eigendirections. For
a12 = a13 = a23 = 0 the matrix has rank 0 and hence the tensor has, according to
Theorem 4, three linearly independent eigendirections.
We now show that an eigenvalue λ1 of multiplicity 2 can have at most two distinct
eigendirections.
The rank of the tensor (a − λ1 δ) corresponding to an eigenvalue λ1 of multiplicity
2 cannot be zero, because otherwise (a − λ1 δ) would be the zero tensor, i. e. we would
have a = λ1 δ, and this tensor has the eigenvalue λ1 of multiplicity 3. So the rank of
(a − λ1 δ) must be at least one, and thus the tensor a has, according to Theorem 2 and
Theorem 3, at most two distinct eigendirections.
According to Theorem 1, a real eigenvalue has at least one (real) eigendirection. So
it only remains to show that an eigenvalue of multiplicity 2 can also have two distinct
eigendirections. An example is a tensor with the coordinate matrix
λ1 a12 a13
(0 λ1 a23 ) ,
0 0 λ2
where λ2 ≠ λ1 . According to Section 3.11.3, No. 6, such a tensor has an eigenvalue λ1 of
multiplicity 2 and an eigenvalue λ2 of multiplicity 1, and the coordinate matrix of the
tensor (a − λ1 δ) is given by
0 a12 a13
(0 0 a23 ) .
0 0 λ2 − λ1
If a12 = 0, the matrix has rank 1 and thus the tensor has, according to Theorem 3, two
distinct eigendirections corresponding to the eigenvalue of multiplicity 2.
Now, let λ1 be an eigenvalue of a with multiplicity 1 and let q1 be an eigendirec-
tion corresponding to λ1 . We choose q1 as a unit vector in the x1 -direction of a Cartesian
coordinate system, and we further choose two arbitrary unit vectors, which are orthog-
onal to q1 and orthogonal to each other, as the x2 - and x3 -directions. In this coordinate
system, the coordinate matrix of a is, according to Section 3.11.3, No. 7,
λ1 a∗12 a∗13
(0 a∗22 a∗23 ) ,
0 a∗32 a∗33
Its characteristic equation reads

(λ1 − λ) [(a∗22 − λ)(a∗33 − λ) − a∗23 a∗32 ] = 0.

For λ = λ1 , the first factor is zero and the equation is satisfied by assumption. If, for λ = λ1 , the second factor were also zero, then λ1 would not be an eigenvalue of multiplicity 1. Since this is excluded by assumption, the tensor (a − λ1 δ) has rank 2, and thus there is, according to Theorem 2, only one eigendirection corresponding to λ1 .
5. A tensor with less than three linearly independent eigendirections is called defec-
tive, and a tensor with three linearly independent eigendirections is called nondefec-
tive.
According to Theorem 7, a defective tensor must have an eigenvalue with a multi-
plicity of at least 2.
For a nondefective tensor we can choose the g i in (3.28) to be three linearly inde-
pendent eigenvectors and then we have hi = λi g i . Substituting this into (3.29) gives
a = λi g i g i , (3.54)
aT ⋅ g j = λj g j ,
7. We end this section by summarizing the theorems for the four classes of tensors,
which we distinguished at the beginning of the previous section.
I: The tensor has three distinct real eigenvalues.
   Then there exist three linearly independent real eigendirections; the tensor is nondefective.
II: The tensor has three distinct eigenvalues, one is real and two are complex conjugate.
Then there exist three linearly independent eigendirections, the real eigen-
value has a real eigendirection, and the two complex eigenvalues have complex
eigendirections; the tensor is nondefective.
III: The tensor has two distinct real eigenvalues, one of multiplicity 1 and one of
multiplicity 2.
Then two cases are possible:
a) If the tensor (a−λ δ), with λ being an eigenvalue of multiplicity 2, has rank
2, then there exists one real eigendirection corresponding to this eigen-
value and another real eigendirection (different from the first) correspond-
ing to the second eigenvalue of multiplicity 1; the tensor is defective.
b) If the tensor (a − λ δ), with λ being an eigenvalue of multiplicity 2, has
rank 1, then there exist two linearly independent real eigendirections cor-
responding to this eigenvalue and spanning an eigenplane, and another
real eigendirection not lying in the eigenplane and corresponding to the
eigenvalue of multiplicity 1; the tensor is nondefective.
IV: The tensor has one real eigenvalue of multiplicity 3.
Then three cases are possible:
a) If the tensor (a − λ δ) has rank 2, then there exists only one real eigendi-
rection; the tensor is defective.
b) If the tensor (a − λ δ) has rank 1, there exist two linearly independent real
eigendirections, which span an eigenplane; the tensor is defective.
c) If the tensor (a − λ δ) has rank 0, then the tensor is isotropic and each
direction is an eigendirection; the tensor is nondefective.
Problem 3.12.
Consider tensors with the following coordinate matrices:7
1 1 0 1 0 0 1 0 0
A. (0 1 1) , B. (0 1 1) , C. (0 1 0) ,
0 0 1 0 0 1 0 0 1
1 1 0 1 0 0 1 0 0
D. (0 1 1) , E. (0 1 1) , F. (0 1 0) .
0 0 2 0 0 2 0 0 2
For each tensor, find the eigenvalues and, for each eigenvalue, find a set of linearly
independent eigendirections. In which cases is the tensor nondefective, and in which
cases do there exist three mutually orthogonal eigendirections?
1. The terms eigenvalue and eigenvector carry over to any square matrix of order N. To such a matrix a∼ we can assign an eigenvalue equation

(a∼ − λ I∼) X∼ = 0∼.        (3.55)

We call the numbers λ, which satisfy this equation, the eigenvalues of the matrix and we call the column matrices X∼ ≠ 0∼, which solve this equation for a particular value of λ, the (right) eigenvectors of the matrix corresponding to the eigenvalue λ.
We can interpret (3.55) as a homogeneous system of linear equations for the N elements of the column matrix X∼. For this system to have a nontrivial solution, the coefficient determinant must be zero, i. e. the characteristic equation

det (a∼ − λ I∼) = 0        (3.56)

must be satisfied. Analogously to (3.7), the determinant can be written as

det (a∼ − λ I∼) = (1/N!) εij...k εpq...r (aip − λ δip )(ajq − λ δjq ) ⋅ ⋅ ⋅ (akr − λ δkr ),        (3.57)

where ij . . . k and pq . . . r each stand for N indices.
Collecting the powers of λ, the characteristic equation takes the form

α − α λ + α λ2 − ⋅ ⋅ ⋅ + (−1)N λN = 0.        (3.58)

Such an equation has N not necessarily distinct and generally complex solutions (roots) λ1 , λ2 , . . . , λN , and therefore it can be factored into the form

(λ − λ1 )(λ − λ2 ) ⋅ ⋅ ⋅ (λ − λN ) = 0.
α = λ1 λ2 ⋅ ⋅ ⋅ λN ,
α = λ2 λ3 ⋅ ⋅ ⋅ λN + λ1 λ3 λ4 ⋅ ⋅ ⋅ λN + ⋅ ⋅ ⋅ + λ1 λ2 ⋅ ⋅ ⋅ λN−1 ,
.. (3.59)
.
α (N−1)
= λ1 + λ2 + ⋅ ⋅ ⋅ + λN .
Here the general coefficient α(k) is clearly the sum of all the products of (N − k) different factors, which can be composed of the N roots λi , and there are N!/(k! (N − k)!) such terms.
α = (1/N!) εij...k εpq...r aip ajq ⋅ ⋅ ⋅ akr = det a∼.        (3.60)

For α we have, instead of the single product aip ajq ⋅ ⋅ ⋅ akr , the sum of the N products in each of which one factor a is replaced by δ. The second of these terms is, for example,

(1/N!) εij...k εpq...r aip δjq ⋅ ⋅ ⋅ akr = (1/N!) εji...k εqp...r δip ajq ⋅ ⋅ ⋅ akr ,

but this is equal to the first term, i. e. all N terms are equal, and we have

α = (1/(N − 1)!) εij...k εiq...r ajq ⋅ ⋅ ⋅ akr .        (3.61)
According to (1.30), this is, for i = 1, the minor of the element a11 and analogously,
for i = 2, the minor of the element a22 , etc. This means α is the sum of the principal
minors of the matrix a∼ (actually, of the cofactors, but on the main diagonal minors and
cofactors coincide).
For α , we have in (1.29), instead of the product aip ajq ⋅ ⋅ ⋅ akr , the sum of the N!/(2! (N − 2)!) products, in each of which two factors a are replaced by δ. Analogously to α , we can show that all terms are equal, i. e.

α = (1/((N − 2)! 2!)) εijk...l εijp...q akp ⋅ ⋅ ⋅ alq .
This in turn is the sum of the principal subdeterminants of order (N − 2) of the matrix.
For the coefficient of λk we get

α(k) = (1/((N − k)! k!)) εi...jm...n εi...jp...q amp ⋅ ⋅ ⋅ anq ,        (3.62)

where i . . . j are k indices and m . . . n, p . . . q are N − k indices; this is the sum of the N!/(k! (N − k)!) principal subdeterminants of order (N − k) of the matrix.
For the coefficient α(N−1) of λN−1 , we finally obtain the sum of the elements of the
main diagonal; we call this, analogously to the tensors, the trace of the square matrix.
We have
α(N−1) = aii =: tr a∼.        (3.63)
For example, for the easiest case of a square matrix of order 2, the characteristic equa-
tion is
α − α λ + λ2 = 0, (3.64)
α = λ1 λ2 = det a∼ = a11 a22 − a12 a21 ,
α = λ1 + λ2 = tr a∼ = a11 + a22 .        (3.65)
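As a small numerical sketch (not part of the text), the statement that α(k) is the sum of the principal subdeterminants of order N − k can be compared with the characteristic polynomial computed by numpy; the matrix and its order are only an example.

import numpy as np
from itertools import combinations

a = np.array([[2.0, 1.0, 0.0, 3.0],
              [0.0, 1.0, 4.0, 1.0],
              [5.0, 0.0, 3.0, 0.0],
              [1.0, 2.0, 0.0, 2.0]])
N = a.shape[0]

def alpha(k):
    # sum of all principal subdeterminants of order N - k, cf. (3.62)
    return sum(np.linalg.det(a[np.ix_(rows, rows)])
               for rows in combinations(range(N), N - k))

# coefficient of lambda**k in det(a - lambda*I) is (-1)**k * alpha(k)
ours = [(-1.0)**k * alpha(k) for k in range(N)] + [(-1.0)**N]

# numpy's np.poly gives det(lambda*I - a); the factor (-1)**N converts it
numpy_coeffs = np.poly(a)[::-1] * (-1.0)**N

print(np.allclose(ours, numpy_coeffs))    # True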
2. The theorems from Sections 3.11.3 and 3.11.4 about eigenvalues and eigenvectors
carry over, to a large extent, from tensors to square matrices.
A square matrix of order N with real elements has, according to (3.58), exactly
N (real and/or complex) eigenvalues, if we count the eigenvalues according to their
multiplicity. If N is odd, we always have at least one real eigenvalue; if N is even, all
eigenvalues can be complex.
The number of the linearly independent eigendirections corresponding to an eigenvalue λ is determined by the rank defect of the matrix a∼ − λ I∼. If a∼ − λ I∼ has the rank defect M, then there exist M linearly independent eigendirections corresponding to this eigenvalue. Because of det (a∼ − λ I∼) = 0, the rank defect is at least M = 1, i. e. for each eigenvalue, there exists at least one corresponding eigendirection, for a real eigenvalue a real eigendirection, for a complex eigenvalue a complex eigendirection.
If all eigenvalues are real and distinct, then there are N linearly independent
eigenvectors, which then form a basis for the N-dimensional vector space of the col-
umn matrices of order N.
We can make additional statements, if the matrix a∼ has a particular shape. In a triangular matrix, the diagonal elements are also eigenvalues; if in the i-th column of a∼ only the diagonal element is nonzero, then this diagonal element is an eigenvalue of a∼ and the corresponding eigenvector X∼ contains only a one in the i-th row, while all the remaining elements are zero.
1. Every tensor whose coordinate matrix has a diagonal form in a suitable Cartesian
coordinate system, i. e. it contains only zeros outside the main diagonal, is clearly sym-
metric. We will show in this section that, conversely, for every symmetric tensor, there
exists (at least) one Cartesian coordinate system in which its coordinate matrix has a
diagonal form. We call the coordinate directions of such a coordinate system the prin-
cipal axes of the tensor, and the transformation of an arbitrary Cartesian coordinate
system to the principal axes we call a principal axes transformation. We assume for
the rest of this section that all Cartesian coordinate systems and by this also the sys-
tem of principal axes are right-handed systems; for this we have to choose the order of
the principal axes appropriately. With this restriction, we do not need to distinguish
between polar and axial tensors for the rest of this section.
2. We will prove that a symmetric tensor always has a system of principal axes by
demonstrating how to find such a system of principal axes. According to Section 3.11.3,
No. 2, a symmetric tensor always has three (not necessarily distinct) real eigenvalues
and according to Section 3.11.4, No. 1, Theorem 1, it has at least one corresponding real
eigendirection. Using these facts, we transform the tensor a in a first step to a Cartesian
coordinate system, in which the x̃1 -axis coincides with this eigendirection. According
to Section 3.11.3, No. 7 and because of the symmetry, which is, according to Section 2.6,
No. 7, invariant under a coordinate transformation, the tensor a has, in this coordinate
system, the coordinate matrix
        ( λ1    0     0  )
ãij =   (  0   ã22   ã23 ) .        (a)
        (  0   ã32   ã33 )
In a second step, we determine an eigendirection corresponding to a second eigenvalue λ2 in this coordinate system; the eigenvalue equation reads

( λ1 − λ2      0          0      )   ( X̃1 )   ( 0 )
(    0      ã22 − λ2     ã23     )   ( X̃2 ) = ( 0 )
(    0        ã32      ã33 − λ2  )   ( X̃3 )   ( 0 )

or

(λ1 − λ2 ) X̃1 = 0,
(ã22 − λ2 ) X̃2 + ã23 X̃3 = 0,        (b)
ã32 X̃2 + (ã33 − λ2 ) X̃3 = 0.
If λ2 and λ1 are distinct, (b) can only be satisfied if X̃1 = 0, i. e. if the eigendirection corresponding to λ2 is perpendicular to the x̃1 -axis, which is at the same time an eigendirection corresponding to λ1 . If, on the other hand, λ2 = λ1 , i. e. if λ1 is an eigenvalue of multiplicity 2, then the characteristic equation of the coordinate matrix (a) requires

(ã22 − λ1 )(ã33 − λ1 ) − (ã23 )² = 0.
Then the tensor a−λ1 δ is at least double singular and in addition to the eigendirection
X
̃1 = 1, X
̃2 = X ̃3 = 0, there is at least one more eigendirection corresponding to the
eigenvalue λ1 . For the second eigendirection we can choose X ̃1 = 0, which then lies
in the x̃2 , x̃3 -plane, i. e. it is perpendicular to the first eigendirection, and both eigendi-
rections span an eigenplane.
Independently of λ1 and λ2 being equal or not we can choose the two eigendirec-
tions as the x̂1 - and x̂2 -axes of a new Cartesian coordinate system and transform the
tensor a again. Using the same arguments as before, we have for the coordinate matrix
in the new coordinate system
        ( λ1   0    0  )
âij =   (  0   λ2   0  ) .
        (  0   0   â33 )
But then, according to Section 3.11.3, No. 7, the x̂3 -axis is also an eigendirection corre-
sponding to the eigenvalue λ3 = a ̂ 33 ; so we showed that every symmetric tensor has
at least one system of principal axes and that in any system of principal axes the ele-
ments on the main diagonal of the coordinate matrix are eigenvalues and the principal
axes are eigendirections.
Thus a symmetric tensor with three different eigenvalues has only one system of
principal axes. For an eigenvalue of multiplicity 2, there exist two linearly independent
eigendirections, which span an eigenplane, and every Cartesian basis with one basis
vector perpendicular to this plane forms a system of principal axes. A tensor with an
eigenvalue of multiplicity 3 is isotropic; in this case every Cartesian basis is a system
of principal axes.
3. A symmetric tensor is always nondefective and can therefore always be represented
in the form (3.54). Since, according to Section 3.9.3, a basis q i of a system of principal
axes is identical to its reciprocal basis, we can replace both the g i and g i in (3.54) by
qi , which gives
a = λi qi qi .
5. We summarize the results of this section; they are valid under the assumption that
all coordinate systems are right-handed systems and so they are valid for both polar
and for axial tensors.
λ1 0 0
a
̂ ij = λi δij = ( 0 λ2 0). (3.66)
0 0 λ3
j
αij = qi . (3.68)
j
– The coordinates qi of three orthonormal eigenvectors qj of the tensor in the
original coordinate system are determined by the eigenvalue equation
k
(aij − λk δij ) qj = 0. (3.69)
i j
̂ ij = qm qn amn = λi δij ,
a
m n k k
̂ mn = λk qi qj ,
aij = q i qj a (3.70)
a = λk q k q k .
Problem 3.13.
Consider a tensor which has in its original coordinate system the coordinate matrix
5 0 4
tij = (0 9 0) .
4 0 5
Transform the tensor to principal axes, i. e. find a right-handed Cartesian basis (in
terms of the coordinates of the original system) in which the coordinate matrix of the
tensor has a diagonal form, and write out this coordinate matrix.
From the existence of a system of principal axes (3.66) it further follows that if all eigen-
values are nonzero, the tensor a has rank 3; if one eigenvalue is zero, it has rank 2; if
two eigenvalues are zero, it has rank 1; and if all three eigenvalues are zero, it has rank
zero. Thus we have, for a symmetric tensor, a simple relation between the rank of the
tensor and the number of its vanishing eigenvalues.
Rank Eigenvalues
3 λ1,2,3 ≠ 0
2 λ1 = 0, λ2,3 ≠ 0
1 λ1 = λ2 = 0, λ3 ≠ 0
0 λ1 = λ2 = λ3 = 0
1. We consider the quadratic form of an arbitrary tensor a with the Cartesian coordi-
nates aij ,
Q = aij Xi Xj , (3.71)
2. If we compute the quadratic form (3.71) in the system of principal axes, we get
i. e. the sign of Q depends for X ≠ 0 only on the sign of the λj , and because of Vieta’s
formulas (3.53)1 , the same is true for the determinant of the tensor. Thus, for a sym-
metric tensor, we find the following relations between the sign of Q and the signs of
its eigenvalues and of its determinant.
8 Except for the zero tensor, which is both positive and negative semidefinite, the three sets of positive
semidefinite, negative semidefinite, and indefinite tensors are disjoint.
Problem 3.14.
We can generalize the eigenvalue problem of a tensor a to the case that the eigenvalue
λ is associated with a second tensor b, i. e.
aij Xj = λ bij Xj .
Let a and b be two symmetric tensors. Determine the conditions under which the gen-
eralized eigenvalue problem has only real eigenvalues.
3. With the quadratic form Q = aij Xi Xj we can interpret the eigendirections of a sym-
metric tensor a in a different way than in Section 3.11.1, No. 1. To this end, we assume
X to be a unit vector and ask for those directions, for which the quadratic form takes
its maximal or minimal values. Thus we formulated an extremum problem,
Q = aij Xi Xj → extremum,
1 − δij Xi Xj = 0.
A common method for solving such problems is to multiply the left side of the con-
straint with an initially undetermined factor λ (the so-called Lagrange multiplier) and
add this term to the function whose extrema are to be determined, i. e.
h = aij Xi Xj + λ (1 − δij Xi Xj ) .
The necessary condition for the extrema of Q is then that the partial derivatives of the
auxiliary function h with respect to the coordinates of X must be zero, i. e.
𝜕X 𝜕Xj
= (aij − λ δij ) ( i Xj + Xi
𝜕h
)
𝜕Xk ⏟⏟𝜕X
⏟⏟⏟k⏟⏟ ⏟⏟𝜕X
⏟⏟⏟k⏟⏟
δik δjk
= (akj − λ δkj ) Xj + (aik − λ δik ) Xi = 0.
(aij − λ δij ) Xj = 0,
i. e. we obtained the familiar eigenvalue problem for the tensor a. Thus the quadratic
form takes its extrema in the eigendirections of a.
T
A symmetric matrix, i. e. a matrix for which A ∼=A ∼ , must be square. Furthermore, we
restrict ourselves in this section to matrices with real elements, without mentioning it
every time.
Several results from symmetric tensors (with real coordinates) carry over to sym-
metric matrices of order N (with real elements).
Theorem 1. A symmetric matrix has only real eigenvalues, and to each eigenvalue cor-
responds at least one real eigenvector (i. e. an eigenvector where all elements are real
numbers).
Theorem 2. A symmetric matrix has at least one system of principal axes consisting
of mutually orthogonal eigenvectors. In the system of principal axes the matrix has a
diagonal form; the elements on the main diagonal are the eigenvalues of the matrix.
The principal axes transformation follows the same procedure as for tensors. In
the first step we choose a matrix R
∼1 , whose first column is an eigenvector X ∼1 of a.
∼ Be-
cause of the symmetry, as in Section 3.12.1, the elements in the first row and the first
column of the resulting similar matrix a∼ are all zero except the element on the diago-
̃
nal, which is the eigenvalue λ1 corresponding to the eigenvector X ∼1 , so we have
λ1 0 0 ... 0
0 a
̃ 22 a
̃ 23 ... a
̃ 2N
a T 0 a a a
∼=R
∼1 a∼R ).
̃ ( ̃ 23 ̃ 33 ... ̃ 3N )
∼1 = ( .
..
(0 a
̃ 2N a
̃ 3N ... a
̃ NN )
a
̃ 22 a
̃ 23 ... a
̃ 2N
a
̃ 23 a
̃ 33 ... a
̃ 3N
( . ),
..
a
̃ 2N a
̃ 3N ... a
̃ NN
λ1 0 0 ... 0
0 λ2 0 ... 0
a T 0 0 a a
∼=R
∼2 a∼R ).
̂ ( ̂ 33 ... ̂ 3N )
∼2 = ( .
..
(0 0 a
̂ 3N ... a
̂ NN )
After N such steps the matrix is fully diagonalized. If multiple eigenvalues exist, then
we have, as in Section 3.12.1, more than one system of principal axes.
3. Any symmetric matrix of order N belongs to one of the five definiteness classes.
Let a
∼ be a symmetric matrix of order N and let X
∼ be a nonzero column matrix with
N elements. Then its quadratic form is
T
Q=X
∼ a ∼ X,
∼ Q = Xi aij Xj . (3.73)
Now, if the number Q is positive, independently of the choice of X, ∼ we call the matrix
a
∼ positive definite; if it is always negative, we call a
∼ negative definite. If Q is always
positive or zero, or always negative or zero, we call a∼ positive semidefinite or negative
semidefinite, respectively. If finally Q can take both signs, we call a ∼ indefinite. With
these definitions we come to the following theorem.
T T T
Q=X
∼ a ∼X
∼=X
∼ λX∼ = λX
∼ X.∼
T
Since X
∼ has only real elements, X∼ X ∼ is positive, i. e. λ and Q have the same sign.
Since, according to (3.59) and (3.60), the determinant of a symmetric matrix is
equal to the product of its N eigenvalues, we further have the following theorem.
Theorem 5. The principal submatrices of a symmetric matrix have the same definite-
ness as the matrix itself; for example, if the matrix is positive definite, then all its princi-
pal submatrices are also positive definite.
Theorem 6. The main diagonal elements of a positive definite symmetric matrix are all
positive; for a negative definite symmetric matrix they are all negative; for a positive
semidefinite symmetric matrix they are all positive or zero; and for a negative semidefi-
nite symmetric matrix they are all negative or zero.
5. The converse of all these theorems is not true, e. g. if all eigenvalues of a symmetric
matrix are real, we cannot conclude that the matrix is symmetric; and if the determi-
nant of a symmetric matrix is positive, it does not follow that the matrix is positive
definite.
For the following, we need a matrix which maps a position vector X in the x, y-plane
by rotating about the z-axis through an angle ϑ to a vector U. Clearly, these vectors are
X1 = r cos φ, X2 = r sin φ,
U1 = r cos(φ + ϑ) = r (cos φ cos ϑ − sin φ sin ϑ)
= X1 cos ϑ − X2 sin ϑ,
U2 = r sin(φ + ϑ) = r (sin φ cos ϑ + cos φ sin ϑ)
= X2 cos ϑ + X1 sin ϑ,
U1 cos ϑ − sin ϑ X1
( )=( )( ). (3.74)
U2 sin ϑ cos ϑ X2
1. Any tensor (with real coordinates), in particular any orthogonal tensor, has (at least)
one real eigenvalue and correspondingly (at least) one real eigendirection. Since an or-
thogonal tensor represents a length preserving mapping, all its real eigenvalues must
be, according to (3.37), either +1 or −1.
According to Section 3.11.3, in particular (3.53)1 , the following cases can occur:
I. All three eigenvalues are 1, then det a ∼ = 1.
II. All three eigenvalues are −1, then det a ∼ = −1.
III. Two eigenvalues are 1, one is −1, then det a ∼ = −1.
a11 a12 0
(a21 a22 0).
a31 a32 ±1
Since the sum of the squares in the last row must be 1, we have a31 = a32 = 0, so
a11 a12 0
(a21 a22 0).
0 0 ±1
The sum of the squares in the first column is a211 + a221 = 1, which can be satisfied,
without loss of generality, by a11 = cos ϑ, a21 = sin ϑ, i. e.
cos ϑ a12 0
( sin ϑ a22 0).
0 0 ±1
Since the sign of det aij must match the sign of a33 , it follows that
cos ϑ a12
= a22 cos ϑ − a12 sin ϑ = 1.
sin ϑ a22
The sum of the squares in the first row gives a12 = ± sin ϑ, and from that it follows
with the previous equation that a12 = − sin ϑ and a22 = cos ϑ.
Thus, if det a
∼ = 1, i. e. if the tensor is proper orthogonal, we have for the coordi-
nate matrix of our orthogonal tensor
cos ϑ − sin ϑ 0
aij = ( sin ϑ cos ϑ 0) (3.75)
0 0 +1
and if det a
∼ = −1, i. e. if it is improper orthogonal, we have
cos ϑ − sin ϑ 0
aij = ( sin ϑ cos ϑ 0) . (3.76)
0 0 −1
3. These formulas are easy to interpret with (3.74). The mapping given by a proper
orthogonal tensor does not change the z-coordinate of any position vector and it ro-
tates the projection of the position vector onto the x, y-plane by the angle ϑ; i. e. the
mapping of the tensor is a rotation through the angle ϑ about the eigendirection cor-
responding to its positive eigenvalue. This explains why a proper orthogonal tensor
is also called a rotation tensor. The mapping given by an improper orthogonal tensor
changes only the sign of the z-coordinate, and it again rotates the projection of the
position vector onto the x, y-plane by the angle ϑ; i. e. the mapping of the tensor is
a rotation through the angle ϑ about the eigendirection corresponding to its negative
eigenvalue, followed by a reflection through the plane orthogonal to this eigendirec-
tion.
4. Since the mapping by a tensor has to be independent of the coordinate system, this
intuitive interpretation implies that an orthogonal tensor must be invariant under a
rotation about the eigendirection, i. e. it has the same coordinates in all coordinate
systems where the z-axis points in this eigendirection. Hence, such a tensor can be
diagonalized only for ϑ = 0 or ϑ = π .
Problem 3.15.
Find the eigenvalues of the orthogonal tensor
cos ϑ − sin ϑ 0
aij = ( sin ϑ cos ϑ 0) .
0 0 ±1
5. We end this section by reviewing the six cases we distinguished earlier, starting
with the three cases where the tensor is proper orthogonal:
– ϑ = 0: All eigenvalues are 1. The tensor is the identity tensor, it represents a rota-
tion by 0∘ , i. e. every vector is mapped onto itself (Case I).
– 0 < ϑ < π: Two eigenvalues are complex conjugate, one is 1. The tensor represents
a rotation by the angle ϑ (Case V).
– ϑ = π: Two eigenvalues are −1, one is 1. The tensor represents a rotation by 180∘
(Case IV).
Next we consider the three cases where the tensor is improper orthogonal:
– ϑ = 0: Two eigenvalues are 1, one is −1. The tensor represents a reflection through
the plane orthogonal to the eigendirection of the negative eigenvalue (Case III).
– 0 < ϑ < π: Two eigenvalues are complex conjugate, one is −1. The tensor rep-
resents the same reflection as before, combined with a rotation by the angle ϑ
(Case VI).
– ϑ = π: All three eigenvalues are −1. The tensor is the negative identity tensor; it
is also called the inversion tensor, because it maps any vector to a vector with the
same length pointing in the opposite direction (Case II).
Problem 3.16.
A. Find the quantities α, β, γ, δ, ε, ζ in the matrix
α β δ
αij = (−α√2 γ ε) ,
α −β ζ
such that the αij represent the transformation matrix between two Cartesian coor-
dinate systems (with or without change of orientation).
Hint: Consider first which of the two orthogonality relations (2.6) is more con-
venient for the computations.
B. How many solutions does the problem have? What additional information do you
need, so that the problem has a unique solution? How can the different solutions
be interpreted geometrically?
3.13.3 The Orthogonal Tensor as a Function of Rotation Angle and Rotation Axis or
Reflection Axis
3.13.3.1 Rotation
1. A proper orthogonal polar tensor (rotation tensor) can be represented by its axis of
rotation, which we assume to be given by the axial unit vector n and its rotation angle
ϑ. To this end, we have to express the rotated vector U as a function of n, ϑ, and the
original vector X. We choose as a basis for this representation:
– the unit vector n, representing the axis of rotation,
– the projection a of the vector X onto the plane orthogonal to n, and
– the vector n × X.
This basis is orthogonal, but not normalized. To simplify our derivation, we assume
that X and hence U is a polar vector and that the underlying coordinate system is a
right-handed system.
We set
→ → →
OM = α n, MR = β a, and RQ = γ n × X,
and now, from the figure, we have to find α, β, γ, and a as a function of n, ϑ, and X.
Since n is a unit vector, we have
α = OM = OP cos(n, X) = X cos(n, X) = n ⋅ X.
Furthermore,
MR MR
β= = = cos ϑ.
MP MQ
→
For RQ, we have, on the one hand, by taking the length of RQ = γ n × X,
MP
RQ = γ X sin(n, X) = γ X = γ MP,
OP
and on the other hand, from the triangle MRQ,
RQ = MQ sin ϑ = MP sin ϑ.
U = n ⋅ X n + cos ϑ (X − n ⋅ X n) + sin ϑ n × X.
We further see that the coordinates of an orthogonal tensor do not change under a
rotation of the coordinate system about its axis of rotation. In all these coordinate
systems, the coordinates of ni and clearly also the scalar ϑ have the same value. Finally,
we see that (3.77) is the decomposition of aij in its symmetric and antimetric parts.
3. We can also write ni and ϑ as functions of aij . To this end we contract (3.77), so we
have
aii = 1 + 2 cos ϑ,
1
cos ϑ = (aii − 1). (3.78)
2
We further compute εijm aij . Here we can replace aij , according to (2.40), by its antimet-
ric part, so we have
= − (3 − 1) δmk nk sin ϑ,
εijm aij
nm = − . (3.79)
2 sin ϑ
Thus we can uniquely assign an angle of rotation between 0∘ and 180∘ and an axis
of rotation, arbitrarily oriented in space, to any rotation tensor. Rotations by an an-
gle between 0∘ and −180∘ correspond to rotations by an angle between 0∘ and 180∘
with reversely oriented axis of rotation. Substituting ϑ || −ϑ does not change (3.78) and
results with (3.79) in the substitution ni || −ni .
3.13.3.2 Reflection
An improper orthogonal tensor, which reflects any vector through the plane orthogo-
nal to n, can be represented by the unit vector n. To this end, we have to express the
reflected vector U as a function of the original vector X and the normal vector n.
or in coordinate notation
Ui = Xi − 2 nj Xj ni .
Ui = (δij − 2 ni nj ) Xj ,
where aij is a symmetric tensor which has the eigenvalue −1 with multiplicity 1 and
the eigenvalue 1 of multiplicity 2 and n is the unit eigenvector corresponding to the
eigenvalue −1.
From (3.81) it follows that aii = −1 + 2 cos ϑ, εijk aij = −εijk εijm nm sin ϑ; hence we
obtain the angle and the axis of rotation of a rotary reflection analogously to (3.78) and
(3.79) as
1 εijm aij
cos ϑ = (a + 1), nm = − . (3.82)
2 ii 2 sin ϑ
Problem 3.17.
One of the solutions to Problem 3.16 is
1 1 √2 1
2 2 2
αij = (− 1 √2 0 1 √2) .
2 2
1
2
− 21 √2 1
2
Find the axis and the angle of rotation and interpret the transformation geometrically.
represent a mapping between two Cartesian bases, which are rotated to each other.
However, the transformation laws (2.17) for tensor coordinates can be interpreted in
two ways: either as a transformation law for the coordinates of a tensor between two
coordinate systems, which are rotated to each other, or as a transformation law for the
coordinates of two tensors in the same coordinate system, which are rotated to each
other. For example, in the first interpretation, the equations
ai = αij a
̃j , a
̃ i = αji aj (3.84)
relate the coordinates of the vector a, from the figure on the facing page, in the no-
tilde coordinate system to the coordinates of this vector in the tilde coordinate sys-
j
tem. Then, according to (2.4) αij = ẽi is the transformation matrix whose columns are
the basis vectors of the tilde coordinate system, written as a linear combination of the
basis vectors of the no-tilde coordinate system. In the second interpretation, the equa-
tions (3.84) relate the coordinates of the two vectors a and a ̃ in the no-tilde coordinate
system. Then α = αij ei ej is the rotation tensor, which maps the vector a ̃ to the vector
a, i. e. it rotates it through the angle ϑ, which is positive in our case, as can be seen
from the figure.
If we interpret (3.84)1 as an equation for the coordinates of the same vector a in
two different coordinate systems, we cannot translate it into symbolic notation. If we
interpret it as an equation for the coordinates of two different vectors, a and a ̃ , in the
same coordinate system, then we can translate it as a = α ⋅ a ̃.
Analogously, the equations
either relate the coordinates of the point P, from the figure above, in the no-tilde co-
ordinate system to the coordinates of this point in the tilde coordinate system, or they
relate the coordinates of the two points, P and P,̃ in the no-tilde coordinate system. In
the first case it is not possible to translate equation (3.85)1 into symbolic notation, in
the second case we can translate it into x = α ⋅ x̃ with α as rotation tensor.
a2 := a ⋅ a, a(2)
ij
:= aik akj ,
etc., etc.
Here a(2)
ij
is the i, j-coordinate of the tensor a2 and not the square of the i, j-coordinate of
the tensor a, i. e. we say “a square i j”. These powers, where the exponent is a natural
number, are defined for arbitrary tensors.
2. For all tensors, except the zero tensor, we define the zeroth power to be the identity
tensor; we have
a0 := δ, a(0)
ij := δij . (3.87)
3. If the inverse of a tensor exists, i. e. if the original tensor is regular, we define the
minus-one power to be the inverse tensor and further powers with negative integer
exponents as
etc., etc.
aij Xj = λ Xi .
Then
Problem 3.18.
Prove that two tensors which are inverses of each other have reciprocal eigenvalues
and the same eigenvectors; thus the above theorem is valid for all integer powers of a
tensor, as long as negative powers of the tensor exist.
Problem 3.19.
Prove for integer values of n that
Problem 3.20.
The proper orthogonal tensor a rotates the vector X about the axis of rotation n
through the angle ϑ to the vector U. Show that the tensor a3 rotates X about the same
axis of rotation through the angle 3 ϑ to a vector W.
1. For positive semidefinite symmetric tensors, we can also define powers with arbi-
trary real exponents.
A symmetric tensor with eigenvalues λk and three orthonormal eigenvectors qk
can be represented with (3.70) in the form
k k
a = λk qk qk , aij = λk qi qj .
k k
an := (λk )n qk qk , a(n)
ij := (λk )
n
qi qj . (3.90)
Clearly, for integer values of n, the power defined in this way is always uniquely deter-
mined and identical to the integer power of the tensor defined in Section 3.14.1.
am ⋅ an = am+n , a(m)
ij
a(n)
jk
= a(m+n)
ik
,
(3.91)
m n mn n
(a ) = a , (a(m)
ij
) = a(mn)
ij
.
n
n m k k (3.90) k k
(a(m)
ij ) = ((λk )
qi qj ) = (λk )mn qi qj = a(mn)
ij , Q. E. D.
With the definition (3.90), we assign to any positive semidefinite symmetric tensor a,
for any positive real n, exactly one positive semidefinite symmetric tensor an , and we
call this tensor the n-th power of a, because for all these tensors the rules (3.91) are
satisfied.
If we restrict ourselves to positive definite symmetric tensors, our considerations
and the formulas (3.90) and (3.91) are also valid for negative real exponents.
2. The definition (3.90) can be generalized to a more general class of tensors, namely
the nonnegative tensors. A tensor is called nonnegative, if it is not defective and if its
eigenvalues λk are positive or zero.
Using (3.54), a nonnegative tensor can be represented in the form
k
a = λk g k g k , aij = λk g g j , (3.92)
ki
where g k is the basis reciprocal to the eigenvectors g k . The n-th power of a, with n > 0,
is then defined by
k
an := (λk )n g k g k , a(n)
ij := (λk )
n
g gj; (3.93)
ki
If, in particular, all eigenvalues of a nonnegative tensor are positive, the tensor is
called positive. Then the formulas (3.92) and (3.93) also hold for negative exponents.
3. We summarize our definitions of the tensor powers in a table.
Problem 3.21.
A. Show that the tensor of Problem 3.13 is positive definite.
B. Compute the coordinates of its positive definite square root in the original coor-
dinate system and check your result, i. e. verify that the square of the square root
gives the original tensor.
A δ − A a + A a2 − a3 = 0. (3.94)
This equation is called the Cayley–Hamilton theorem. If we replace the powers of the
tensor with the powers of the eigenvalues, it has the same form as the characteristic
equation (3.46); we also say: a tensor satisfies its characteristic equation. To further
generalize (3.94), we scalar multiply by an , where n is an integer number, and we ob-
tain
This equation is also called the Cayley–Hamilton theorem. Thus, we can write any
integer power of a tensor a as a linear combination of a2 , a, and δ; and the coefficients
in this equation are polynomials of the main invariants of the tensor.
2. For tensors which have at most two distinct eigenvalues λ1 and λ2 we can alter-
natively find all eigenvalues, if we refrain from specifying their multiplicity, from the
condition
(λ1 − λ)(λ2 − λ) = 0
or
λ1 λ2 − (λ1 + λ2 ) λ + λ2 = 0; (3.96)
λ1 λ2 X − (λ1 + λ2 ) λ X + λ2 X = 0,
a ⋅ X = λ X, a2 ⋅ X = λ2 X,
which gives
[λ1 λ2 δ − (λ1 + λ2 ) a + a2 ] ⋅ X = 0.
If the tensor has three linearly independent eigenvectors, we can substitute for X three
linearly independent vectors and therefore factor out X.
Thus, for a nondefective tensor with at most two distinct eigenvalues we found,
in addition to (3.94), also the following version of the Cayley–Hamilton theorem:
λ1 λ2 δ − (λ1 + λ2 ) a + a2 = 0. (3.97)
A = tr a,
1 2
A = (tr a − tr a2 ), (3.98)
2
1 3
A = (tr a − 3 tr a tr a2 + 2 tr a3 ).
6
Clearly, tr a, tr a2 , and tr a3 form, in addition to the main invariants and the eigenval-
ues, a third set of invariants of a second-order tensor; we call them the basic invariants.
A = aii = λ1 + λ2 + λ3
= tr a,
A = bii = λ1 λ2 + λ2 λ3 + λ3 λ1
1 2 (3.99)
= (tr a − tr a2 ),
2
A = det a
∼ = λ1 λ2 λ3
1
= (tr3 a − 3 tr a tr a2 + 2 tr a3 ).
6
If we solve for the basic invariants, we have, after elementary computations, the fol-
lowing formulas.
tr a = λ1 + λ2 + λ3 = A ,
If a tensor has only two distinct eigenvalues, we can, in general, solve the first two
equations in (3.99) and the first two equations in (3.100) for the two distinct eigenval-
ues and then substitute these equations into the third equation of (3.99) or (3.100).
Thus the three main invariants and the three basic invariants are not independent
from each other, but only two of them are.
If a tensor has only one eigenvalue (of multiplicity 3), it has only one independent
main invariant and only one independent basic invariant.
The three different sets of invariants (eigenvalues, main invariants, and basic in-
variants) usually do not form a complete set of invariants; a general second-order ten-
sor has seven independent invariants. We will come back to this topic in Chapter 5.
where Rij is orthogonal, Uij and Vij are symmetric and positive definite, and Rij , Uij , and
Vij are uniquely determined. This decomposition is called polar, because it resembles
the decomposition of a complex number into polar coordinates, i. e. into modulus and
argument: Uij and Vij are positive definite tensors, similarly as the modulus of a com-
plex number is positive, the determinant of Rij is 1, similarly as the exponential factor
of a complex number has modulus 1, and Rij can be interpreted as a rotation or as a
rotary reflection, respectively (compare Section 3.13.3).
2. We prove the decomposition theorem (3.101) in six steps.
I. We first show that the scalar product of any tensor with its transposed tensor gives
a positive semidefinite symmetric tensor. If the original tensor is regular, then the
scalar product is even positive definite and thus with (3.53)1 also regular.
Both are easy to see. We note that, for arbitrary Fij , Fij FjkT = Fij Fkj is symmetric.
Furthermore A := Fij Fkj Xi Xk is the square of the vector Fij Xi and thus it is always
greater than or equal to zero, i. e. Fij Fkj is positive semidefinite.
If, in particular, Fij is regular, it does not have a null direction, i. e. for Xi \= 0
we have Fij Xi \= 0 , and hence A > 0. Then, according to Section 3.12.3, Fij Fkj is
positive definite, and its eigenvalues are all positive.
II. Next we show that any regular tensor Fij can be represented in the form
So the transpose of the tensor Rij is equal to the inverse of Rij , i. e. Rij is orthogonal.
III. Uniqueness of the decomposition is easy to prove by contradiction.
Let Fij = vik rkj be a second decomposition. Then we have Vik Rkj = vik rkj and,
if we transpose both sides, RTik Vkj = rik
T
vkj . This gives
Vin(2) = Vik Vkn = Vik δkm Vmn = Vik Rkj RTjm Vmn
T
= vik rkj rjm vmn = vik δkm vmn = vik vkn = vin
(2)
.
Now, since Vij is symmetric and positive definite, Vij(2) is also symmetric and
positive definite: Vij(2) = Vik Vkj = Vik Vjk is symmetric, and A := Vij(2) Xi Xj
= Vik Vjk Xi Xj is the square of the vector Vik Xi , and since Vij is regular, Vij has no
null direction, i. e. the square is always positive.
From the fact that a symmetric and positive definite tensor has only one sym-
metric and positive definite square root, it follows that Vij = vij and hence Vik Rkj =
Vik rkj . Multiplying by Vmi(−1)
then gives Rmj = rmj .
IV. Next we show (also to practice symbolic notation) that any regular tensor F can
also be represented as
F = S ⋅ U, (3.105)
U := √F T ⋅ F, U 2 = U ⋅ U = F T ⋅ F, (3.106)
and
S := F ⋅ U −1 . (3.107)
Analogously to step II, U is again symmetric and positive definite, and if we scalar
multiply (3.107) from the right by U, we get S ⋅ U = F, i. e. the two tensors U and S
satisfy equation (3.105). It remains to show that S is orthogonal. This follows, with
(2.35)
(U −1 )T = U −1 and ST = U −1 ⋅ F T , from
(3.106)
ST ⋅ S = U −1 ⋅ F T ⋅ F ⋅ U −1 = U −1 ⋅ U ⋅ U ⋅ U −1 = δ.
U 2 = U ⋅ δ ⋅ U = U ⋅ ST ⋅ S ⋅ U = u ⋅ sT ⋅ s ⋅ u = u ⋅ δ ⋅ u = u2 ,
and since U 2 and u2 are both symmetric and positive definite, both have only one
symmetric and positive definite square root, i. e. we have U = u and hence S ⋅ U =
s ⋅ U. Scalar multiplication by U −1 from the right gives S = s.
3. Scalar multiplication of the last equation by R from the left and by RT from the right
allows us to solve for V: R ⋅ U ⋅ RT = V. Then we obtain a relation between V and U
which reads in coordinate notation
or
So we can again interpret Vij and Uij as the Cartesian coordinates of the same tensor in
two Cartesian coordinate systems, which are related by the transformation matrix Rij .
k
Thus both tensors have the same eigenvalues, and the unit eigenvectors qV and
k
qU corresponding to the eigenvalue λk are related by
k k k k
(qV )i = Rij (qU )j , (qU )i = Rji (qV )j . (3.109)
Since the Rij are the coordinates of the tensor R, we can write equations (3.108) and
(3.109) also in symbolic notation. For example, for (3.108)1 and (3.109)1 we have
k k
V = R ⋅ U ⋅ RT , qV = R ⋅ qU . (3.110)
k
The tensor R rotates the tensor U to the tensor V and it rotates the eigendirections qU
k
of U to the eigendirections qV of V.
Problem 3.22.
Find, for the tensor
2 2 1
aij = ( 1 −2 2) ,
2 −1 −2
the two factors of the polar decomposition aij = Vik Rkj and check your result, i. e.
verify that Vik Rkj = aij .
Then every point in space is uniquely described by the three variables ui . Thus we
can also call the ui point coordinates, i. e. each set of transformation equations (4.1),
together with a Cartesian coordinate system, defines again a coordinate system. All
these coordinate systems are called curvilinear coordinate systems.1
As an example we consider cylindrical coordinates. They are described by the
transformation equations
x1 = u1 cos u2 , x2 = u1 sin u2 , x3 = u3 .
We obtain conversely
x2
u1 = √x12 + x22 , u2 = arctan , u3 = x3
x1
x = R cos φ, y = R sin φ, z = z,
y (4.2)
R = √x2 + y2 , φ = arctan , z = z.
x
2. From d xi = (𝜕xi /𝜕uj ) d uj it follows that 𝜕xi /𝜕xk = δik = (𝜕xi /𝜕uj ) (𝜕uj /𝜕xk ), and
analogously from d ui = (𝜕ui /𝜕xj )d xj it follows that 𝜕ui /𝜕uk = δik = (𝜕ui /𝜕xj )(𝜕xj /𝜕uk ),
i. e. the partial derivatives of the transformation equations satisfy the orthogonality
relations
𝜕xi 𝜕uj 𝜕ui 𝜕xj
= δik , = δik . (4.3)
𝜕uj 𝜕xk 𝜕xj 𝜕uk
1 Since the transformation equations (2.9) between two Cartesian coordinate systems are special cases
of (4.1), Cartesian coordinate systems are special cases of curvilinear coordinate systems.
https://doi.org/10.1515/9783110404265-004
3. Computing the determinant of both sides of (4.3) and using the multiplication the-
orem for determinants (1.13), for the two Jacobians
𝜕x1 𝜕x1 𝜕x1
𝜕u1 𝜕u2 𝜕u3
𝜕(x1 , x2 , x3 )
𝜕x2 𝜕x2 𝜕x2
1 2 3
:=
𝜕(u , u , u ) 𝜕u1 𝜕u2 𝜕u3
𝜕x3 𝜕x3 𝜕x3
𝜕u1 𝜕u2 𝜕u3
and
𝜕u1 𝜕u1 𝜕u1
𝜕x1 𝜕x2 𝜕x3
1 2 3
𝜕u2 𝜕u2 𝜕u2
𝜕(u , u , u )
:= ,
𝜕(x1 , x2 , x3 ) 𝜕x1 𝜕x2 𝜕x3
𝜕u3 𝜕u3 𝜕u3
𝜕x1 𝜕x2 𝜕x3
𝜕(x1 , x2 , x3 ) 𝜕(u1 , u2 , u3 )
= 1. (4.4)
𝜕(u1 , u2 , u3 ) 𝜕(x1 , x2 , x3 )
Since we can interchange the rows and columns of a determinant, it does not mat-
ter if we take the row or the column index as the numerator index (unlike with the
coordinate matrices 𝜕ai /𝜕xj of the gradient of a vector, where it does matter).
4. We can show that the coordinate transformation (4.1) is one-to-one at all the points
where the Jacobian is neither zero nor infinite. The transformation is called regular at
these points and singular at all other points.
In the example of the cylindrical coordinates we have
𝜕(R, φ, z) 1
= .
𝜕(x, y, z) R
The surfaces with ui = const are called coordinate surfaces. Except at singular points,
they cover the space simply, i. e. every point lies on exactly one coordinate surface
of each of the three families of coordinate surfaces. Each pair of coordinate surfaces
intersects along a coordinate curve, so that, again except for singular points, three
coordinate curves intersect at each point.
In the example of the cylindrical coordinates, the equations for the coordinate sur-
faces are R = √x2 + y2 = const, or simpler, x2 + y2 = const, φ = arctan(y/x)
= const, or simpler, y/x = const, z = const.
The surfaces with x2 + y2 = const are cylinders which are coaxial with the z-axis;
the surfaces y/x = const are planes through the z-axis; and the surfaces z = const
are planes perpendicular to the z-axis.
On the coordinate surfaces R = √x2 + y2 = const, only φ and z vary; on the
coordinate surfaces φ = arctan(y/x) = const, only R and z vary; and on the coordinate
surfaces z = const, only R and φ vary. On the curve of intersection of the surfaces
R = const and φ = const, only z varies; on the curve of intersection of the surfaces
φ = const and z = const, only R varies; and on the curve of intersection of the
surfaces z = const and R = const, only φ varies.
1. At every point, the differentials d xi and d ui of the two coordinate systems are related
by
d u1 + 12 d u2 + 13 d u3 ,
𝜕x1 𝜕x 𝜕x
d x1 =
𝜕u1 𝜕u 𝜕u
d u1 + 22 d u2 + 23 d u3 ,
𝜕x2 𝜕x 𝜕x
d x2 = 1
𝜕u 𝜕u 𝜕u
d u1 + 32 d u2 + 33 d u3 .
𝜕x3 𝜕x 𝜕x
d x3 =
𝜕u1 𝜕u 𝜕u
𝜕xj
gi = ej (4.5)
𝜕ui
with the Cartesian coordinates
𝜕xj
g = (4.6)
ij 𝜕ui
𝜕ui
gi = e (4.7)
𝜕xj j
i 𝜕ui
gj = (4.8)
𝜕xj
has thus the property that its vectors are perpendicular to the coordinate surfaces
of the ui -coordinates at each point. We call this basis the contravariant basis of the
ui -coordinates at this point.
3. The covariant and the contravariant bases are not the only bases which we can
associate to a point in a curvilinear coordinate system. However, computations with
these bases are particularly simple, as we can easily convince ourselves, because they
are reciprocal bases. Both of these bases are called holonomic bases.
4. If we contract (4.5) with 𝜕ui /𝜕xk and (4.7) with 𝜕xk /𝜕ui and then solve for the Carte-
sian bases, we get
The order of the basis vectors ei of the Cartesian coordinate system xi , in which the
curvilinear coordinate system ui (xj ) is embedded, and the order of the basis vectors
g i and g i of its holonomic bases are such that all three bases form right-handed
systems.
Then the Jacobians (4.10) are positive, axial tensors are also independent of the coor-
dinate system, and their coordinates transform in the same way as the coordinates of
polar tensors. Thus we do not need to distinguish between polar and axial tensors.
The transformation matrices 𝜕xi /𝜕uj and 𝜕ui /𝜕xj are, according to (4.6) and (4.8), the
Cartesian coordinates of the two reciprocal bases for the ui -coordinates. These matri-
ces are in general functions of position and thus the two bases are also different at
different positions.
The transformation matrices, and hence also the reciprocal bases, are clearly in-
dependent of position, if and only if the transformation equations (4.1) have the form
and if the transformation coefficients Aij , Aij , Bi , and Bi are independent of position.
Since the reciprocal bases are independent of position, all coordinate curves are
straight lines and all coordinate surfaces are planes. Equally named coordinate curves
are parallel at all points (i. e. all u1 lines are parallel, all u2 lines are parallel, and all
u3 lines are parallel). Equally named coordinate surfaces are, at all points, parallel
planes. Such coordinate systems are called rectilinear and thus rectilinear coordinate
systems are a special case of curvilinear coordinate systems. (For cylindrical coordi-
nates, all R-curves and all z-curves are lines, but the φ-curves are not. Thus cylindrical
coordinates do not form a rectilinear coordinate system.)
The transformation coefficients of a rectilinear coordinate system satisfy the rela-
tions
𝜕ui i 𝜕xi
Aij = = gj, Aij = =g,
𝜕xj 𝜕uj j i
(4.12)
Aij Ajk = δik , Aij Ajk = δik ,
Bi = −Aij Bj , Bi = −Aij Bj .
The transformation matrices Aij and Aij are inverses of each other and hence regular.
If furthermore the transformation matrices are special orthogonal, the ui form a
Cartesian coordinate system, the two bases g i and g i are orthonormal and coincide,
and the equations (4.11) simplify to the transformation equations (2.9).
Problem 4.1.
Consider an oblique, rectilinear coordinate system in the plane, sharing the origin
with a Cartesian coordinate system and with axes at angles α and β with respect to the
axes of this Cartesian coordinate system.
A. Write out the transformation equations xi = xi (uj ) and ui = ui (xj ) for the coordi-
nates of a point P.
B. Find the Cartesian coordinates of the holonomic bases. Which basis vectors are
unit vectors?
A coordinate system in which the coordinate curves at a point are mutually orthogonal
and hence the covariant basis vectors at each point are also mutually orthogonal is
called an orthogonal coordinate system. Cylindrical coordinates are an example of an
orthogonal coordinate system. According to Section 3.9.3, in an orthogonal coordinate
system, the coordinate surfaces at a point and hence the contravariant basis vectors at
every point are also orthogonal. The homologous vectors of both bases have the same
direction and their lengths are reciprocal. The direction and length of the basis vectors
can vary from point to point, as for example in cylindrical coordinates.
Problem 4.2.
Find the Cartesian coordinates of the covariant basis and of the contravariant basis in
cylindrical coordinates.
a = ai ei , a = aij ei ej , etc.,
we can represent the tensor also with respect to the two holonomic bases of a curvilin-
ear coordinate system. Clearly, already a vector has different coordinates with respect
to the covariant basis and to the contravariant basis. The coordinates with respect to
the covariant basis are called contravariant coordinates and are denoted by an up-
per index, the coordinates with respect to the contravariant basis are called covariant
coordinates and are denoted by a lower index. We have the representations
a = ai g i = ai g i .
For a second-order tensor, we have, in addition to the covariant basis g i g j and the
contravariant basis g i g j , also mixed bases, i. e. we have the four representations
a = aij g i g j = ai j g i g j = ai j g i g j = aij g i g j ,
Lower indices of bases and tensor coordinates are called covariant and upper in-
dices are called contravariant.
We obtain the following representations of a tensor with respect to the two holonomic
bases of a curvilinear coordinate system.
a = ai g i = ai g i ,
a = aij g i g j = ai j g i g j = ai j g i g j = aij g i g j , (4.13)
etc.
Clearly, these equations generalize the representation equations (2.28) of a tensor with
respect to a Cartesian basis.
All bases of a tensor defined in (4.13) are called holonomic bases, and all corre-
sponding coordinates are called holonomic coordinates.
In the case of nonrectilinear coordinates, the bases change from position to po-
sition and a tensor field depends on position both through the coordinates and the
bases.
a = a1 g 1 + a2 g 2 + a3 g 3 ,
3. Using the orthogonality relations (3.24) between reciprocal bases we obtain from
(4.13), by an appropriate scalar multiplication, the converse equations
ai = a ⋅ g i ,
ai = a ⋅ g i ,
aij = a ⋅⋅ g i g j = g i ⋅ a ⋅ g j ,
ai j = a ⋅⋅ g i g j = g i ⋅ a ⋅ g j ,
(4.14)
ai j = a ⋅⋅ g i g j = g i ⋅ a ⋅ g j ,
aij = a ⋅⋅ g i g j = g i ⋅ a ⋅ g j ,
etc.
4. Clearly we could also write a position vector with respect to the holonomic bases
of a curvilinear coordinate system, i. e.
x = xi g i = xi g i .
However, in general neither the xi nor the xi depend in a simple manner on the curvi-
linear position coordinates ui , so they are not used.
Problem 4.3.
A tensor T and a basis g i are given by their Cartesian coordinates, i. e.
1 −1 2 g 1 = (1, 0, 0),
T̂ij = ( 2 2 −1) , g 2 = (1, 1, 0),
−1 −2 1 g 3 = (1, 1, 1).
ui = ui (xj ), xi = xi (uj ),
(4.15)
̃i = u
u ̃ i (xj ), ̃ j ).
xi = xi (u
By eliminating this Cartesian coordinate system from the above equations, we obtain
analogous transformation equations between the two curvilinear Coordinate systems
themselves, i. e.
ui = ui (u
̃ j ), ̃i = u
u ̃ i (uj ). (4.16)
𝜕ui 𝜕ũj
= δik . (4.17)
̃ j 𝜕uk
𝜕u
Because the tilde and the no-tilde coordinate systems are on an equal footing, it is
trivial to write down the equation corresponding to the second equation in (4.3).
3. From (4.9) it follows that
𝜕uj ̃j
𝜕u
g̃ = i g j = ij g̃ j ,
𝜕x 𝜕x
ei = gj =
𝜕xi 𝜕xi j 𝜕uj 𝜕u
̃
𝜕uj ̃j
𝜕u 𝜕xi j 𝜕xi j
gj = g̃ , g = j g̃ .
𝜕xi 𝜕xi j 𝜕uj 𝜕u
̃
̃ m and 𝜕u
Contraction with 𝜕xi /𝜕u ̃ m /𝜕xi , respectively, gives
𝜕uj ̃m j
𝜕u
g̃ m = g, g̃ m = g. (4.18)
̃m j
𝜕u 𝜕uj
Since the two coordinate systems are on an equal footing, we can exchange the tilde
and the no-tilde quantities in these equations.
The two equations (4.18) are clearly transformation equations between the covari-
ant and the contravariant bases of two curvilinear coordinate systems. They general-
ize the transformation equations (2.8) between the bases of two Cartesian coordinate
𝜕uj 𝜕ui k
̃ i g̃ = a
a = ai g i = a ̃i g or ai = a
̃ .
i ̃i j
𝜕u ̃k
𝜕u
Hence the transformation equations between the holonomic coordinates of the same
type for a tensor in two curvilinear coordinate systems are
̃i m
𝜕u 𝜕um
̃i =
a a , a
̃i = a ,
𝜕um ̃i m
𝜕u
̃ i 𝜕u
𝜕u ̃ j mn ̃ i 𝜕un m
𝜕u
̃ ij =
a a , ̃i j =
a a n,
𝜕um 𝜕un 𝜕um 𝜕ũj
(4.19)
𝜕um 𝜕u
̃j 𝜕um 𝜕un
a j
̃i = a n, a
̃ ij = a ,
̃ 𝜕un m
𝜕u i ̃ j mn
̃ i 𝜕u
𝜕u
etc.
These equations generalize the transformation equations (2.17) for the coordinates of
a tensor in two Cartesian coordinate systems (when both are right-handed systems).
5. Thus a covariant index of a tensor coordinate transforms like a covariant index of
a basis and “oppositely” to a contravariant index of a tensor coordinate or a basis.
This is the case, because contraction of a covariant index with a contravariant index
(and vice versa) results in an index-free expression, which is as such independent of
a coordinate system, and this does not depend on whether or not these indices are
indices of a tensor or of a basis; ai bi = c, ai g i = a, g i g i = δ. (This property of
covariant and contravariant indices explains the names covariant and contravariant:
covariant means changing similarly, contravariant means changing oppositely.)
6. The transformation law for tensor coordinates depends only on the number of the
upper and lower indices, but not on their order; according to (4.19) we have
̃ i 𝜕un m
𝜕u ̃ i 𝜕un m
𝜕u
̃i j =
a a n, ̃j i =
a a .
𝜕um 𝜕ũj 𝜕um 𝜕ũj n
Problem 4.4.
Compute the transformation equations between the Cartesian coordinates and the
holonomic cylindrical coordinates of a vector.
Problem 4.5.
A. For the oblique coordinate system from Problem 4.1, compute the transformation
equations Ai = Ai (A ̂ j ), A ̂ i (Aj ), and Ai = Ai (A
̂i = A ̂ j ) between the holonomic and
the Cartesian vector coordinates.
B. Add the representations A = Ai g i and A = Ai g i to the graphs (in each case as
addition of two vectors). Show that the contravariant vector coordinates can be
constructed by parallel projection and that the covariant vector coordinates can
be constructed by perpendicular projection onto the coordinate lines.
Hint: First verify geometrically the computed representations A ̂ i (Aj ) and
̂i = A
Ai = Ai (A
̂ j ) using auxiliary lines.
The distinction between upper and lower indices in holonomic bases and tensor co-
ordinates requires that we adjust the summation convention and its consequent rules
for the use of running indices. This follows from the representation equations (4.13)
and (4.14) and from the transformation equations (4.18) and (4.19).
We modify the summation convention for equations in which, besides tensors,
only holonomic tensor coordinates, holonomic bases, and transformation matrices
appear, i. e. for tensor equations, transformation equations, and representation equa-
tions (compare Section 2.7.4) we have the following rule.
If a running index appears twice in a term, once as a lower index and once as an
upper index, summation over the values of the index from one to three is implied,
without explicitly writing the summation sign. In a transformation matrix, the index
in the “numerator” counts as an upper index and the index in the “denominator”
counts as a lower index.
This has the following implications for running indices (see Section 1.1, No. 4):
– A running index can appear only once or twice in a term. If it appears twice, one
must be in upper position and one in lower position.
– Each running index must take the values one, two, three. (In Chapters 2 to 4 we
restrict ourselves to the three-dimensional space of our intuition.)
– All terms of an equation must match in their free indices, also with respect to their
positions, but not with respect to their order.
In the rare cases where we do not want to apply the summation convention, we again
underline the corresponding running index.
g ij = δ ⋅⋅ g i g j = g i ⋅ g j ,
(3.24)
δji = δ ⋅⋅ g i g j = g i ⋅ g j = δij ,
j (3.24)
δi = δ ⋅⋅ g i g j = g i ⋅ g j = δij ,
gij = δ ⋅⋅ g i g j = g i ⋅ g j ,
or, using the fact that the scalar product of two vectors does not depend on the order
of its factors, we write the following.
j
δi = δji = g i ⋅ g j = g i ⋅ g j = δij ,
g ij = g ji = g i ⋅ g j , (4.21)
gij = gji = g i ⋅ g j .
Here (in accordance with the definition (1.19) of the generalized Kronecker symbols), δji
j
and δi have the same meaning as δij ; so we will set one index of the Kronecker symbol
as upper index, if this is required to satisfy the rules for computations with running
indices for curvilinear coordinates. Because the order of the indices does not matter
in the Kronecker symbol, often both indices are set on top of each other (unlike for
mixed tensor coordinates, where the order does matter).
2. The g ij and gij are also called metric coefficients (“measure coefficients”). This
makes sense, because we need them to compute the length of a vector (to “measure”
its length) from the three holonomic coordinates; if, for example, the covariant coor-
dinates ai of a vector a are given, then we compute its length as
i. e. we need both the covariant coordinates ai and the contravariant metric coeffi-
cients. This explains why the δ-tensor is called metric tensor (a name we had already
mentioned earlier).
3. Using (4.21), we can also interpret the coordinates of the δ-tensor geometrically;
they are the scalar products of two basis vectors. We compute for the mixed coordi-
nates in every curvilinear coordinate system, according to the definition (3.21) of re-
ciprocal bases or the orthogonality relations (3.24), the same values, namely 1 if both
indices are equal and 0 if they are not equal. The covariant coordinates and the con-
travariant coordinates of the δ-tensor, however, depend on the coordinate system; co-
ordinates with the same indices are the square of the length of the corresponding basis
vector, e. g. we have
g11 = |g 1 |2 .
Coordinates with different indices are the product of the lengths of the corresponding
two basis vectors and the cosine of the enclosed angle, e. g.
g 13 = |g 1 ||g 3 | cos(g 1 , g 3 ).
Problem 4.6.
Compute the covariant and contravariant cylindrical coordinates of the δ-tensor.
The most obvious property or condition is that the metric coefficients are symmet-
ric, i. e.
ai aj gij = ai aj g ij = a ⋅ a > 0,
which is true for any vector a (except the zero vector), it follows that the matrices of the
metric coefficients gij and g ij are positive definite. From this, according to Theorem 5
and Theorem 4 from Section 3.12.4, it follows that the determinant and all principal
subdeterminants (e. g. for a matrix of order 3 the principal minors and the elements
on the main diagonal) are positive.
First, the diagonal elements must be positive, i. e.
This we see also from their geometric interpretation as the square of the length of the
basis vectors. Secondly, the principal minors must be positive, i. e. the subdetermi-
nants of order 2, which are obtained by dropping a row and its associated column, so
we must have
ii
g ij
g gij g
ii
> 0, ji > 0, i ≠ j. (4.24)
gji
gjj g
g jj
Lastly, the determinant of the metric coefficients itself must also be positive, and this
must be true independently of the orientation of the basis vectors. We have e. g.
g
11 g12 g13 g 1 ⋅ g 1 g1 ⋅ g2 g 1 ⋅ g 3
det gij = g21 g22 g23 = g 2 ⋅ g 1 g2 ⋅ g2 g 2 ⋅ g 3
g31 g32 g33 g ⋅ g
3 1
g3 ⋅ g2 g 3 ⋅ g 3
(2.46)
= [g 1 , g 2 , g 3 ][g 1 , g 2 , g 3 ],
3. Because the two holonomic bases are reciprocal, one is uniquely determined by the
other, and thus the contravariant metric coefficients are uniquely determined by the
covariant metric coeficients and vice versa. Using this, we obtain relations between
i. e. the matrices of the two types of metric coefficients are inverses of each other, and
one can be computed from the other using the Gauss–Jordan algorithm.
4. Clearly we have
(1.13) (4.26)
det gij det g ij = det (gij g jk ) = 1.
1
g := det gij , det g ij = , det gij det g ij = 1. (4.27)
g
and accordingly
1
[g 1 , g 2 , g 3 ]2 = .
g
Problem 4.7.
Find the transformation equation for g := det gij , when changing from one curvilinear
coordinate system to another.
Hint: Start with the transformation equations (4.19) for the covariant tensor coor-
dinates gij .
Clearly we have
g ij g j = g i ⋅ g j g j = g i ⋅ δ = g i ,
gij g j = g i ⋅ g j g j = g i ⋅ δ = g i .
a = ai g i = ai g i ,
j
ai g i ⋅ g j = ai g i ⋅ g j , ai δi = ai g ij , aj = g ji ai ,
ai g i ⋅ g j = ai g i ⋅ g j , ai gij = ai δji , aj = gji ai .
a = aij g i g j = ai j g i g j = ai j g i g j = aij g i g j
eij k = ε ⋅⋅⋅ g i g j g k = [g i , g j , g k ],
..
.
eijk = ε ⋅⋅⋅ g i g j g k = [g i , g j , g k ].
According to (2.46), we can write the square of these triple products as determinant as
follows:
g ij g ik
ii
g
[g i , g j , g k ][g i , g j , g k ] = g ji g jj g jk ,
ki
g
g kj g kk
g ii
g ij δki
j
[g , g , g k ][g , g , g k ] = g ji
i j i j
g jj δk ,
i
δ j
k δk gkk
etc.
g ij
ii
g g ik
ijk i j k
e = [g , g , g ] = ±√ g ji g jj g jk ,
ki
g kj g kk
g
√ ii
g ij δki
√ g
√
√
eij k i j
= [g , g , g k ] = ± g ji g jj
j
δk ,
(4.33)
i j
δ δk gkk
√ k
..
.
√
√ gii gij gik
√
eijk = [g i , g j , g k ] = ± gji gjj gjk .
gki gkj gkk
√
All eight versions can be obtained from each other by raising or lowering single indices
and by replacing g with δ as necessary.
2. The sign of a holonomic coordinate of the ε-tensor changes, if we change the order
of two indices, e. g. i and j. The sign does not change, if we raise or lower an index. For
example,
εpqk g g
(2.45) 3 (3.22) 1p 2q
[g 1 , g 2 , g 3 ] = εijk g g g k = εijk g g
1i 2j 1 i 2 j εlmn g g g
1l 2m 3n
g11 g12
(δip δjq − δiq δjp ) g g g g g g g g −g g g g
g21
1i 2j 1p 2q 1i 1i 2j 2j 1i 2i 1j 2j g22
= = = .
εlmn g g g εlmn g g g [g 1 , g 2 , g 3 ]
1l 2m 3n 1l 2m 3n
Since the numerator is, according to (4.24), always positive, [g 1 , g 2 , g 3 ] has the same
sign as [g 1 , g 2 , g 3 ].
1
e123 = [g 1 , g 2 , g 3 ] = √g, e123 = [g 1 , g 2 , g 3 ] = , e123 e123 = 1.
√g
Since a triple product is zero if two of its vectors are equal and since it changes the
sign when we change the order of two of its vectors, we have for the entirely covariant
2. The holonomic coordinates of the ε-tensor only change their sign if we exchange
two indices, while keeping their positions (upper or lower). The sign does not change
when the indices are changed cyclically, while keeping their positions, for example,
we have
where indices can be lowered or raised and indices in mixed positions can be set both
in upper or lower position. Thus we obtain the translated Grassman identity (1.35).
2 The rule from Section 4.2.3, according to which free indices must be in the same position in all
terms of an equation, applies only to equations which contain besides tensors only holonomic tensor
coordinates, holonomic bases, and transformation coefficients. This is not the case for an equation
with an εijk .
3 See Section 4.2.8.5.
1 ijk
gi = e gj × gk , g i × g j = eijk g k . (4.37)
2
Problem 4.8.
How many holonomic coordinates does the ε-tensor have? How many of them are not
zero in cylindrical coordinates? Find the nonzero holonomic cylindrical coordinates
of the ε-tensor.
Problem 4.9.
Complete the expression eijk em nk to obtain the Grassmann identity.
The δ-tensor, the ε-tensor, and combinations of both are, according to Section 2.8.3,
characterized by the property that their Cartesian coordinates are equal in all Carte-
sian coordinate systems. We called them isotropic tensors.
Clearly, holonomic curvilinear coordinates of isotropic tensors do not have this
property: the metric coefficients can have different values in different curvilinear co-
ordinate systems. However, since the metric coefficients depend only on the length
and the relative position of the basis vectors, which do not change under a rotation
of the basis (at the point under consideration), and since the holonomic curvilinear
coordinates of isotropic tensors depend only on the metric coefficients, the following
property holds. The holonomic curvilinear coordinates of isotropic tensors are invari-
ant under a rotation of the basis. (Note that the bases of all Cartesian coordinate sys-
tems at a point can be obtained from each other by a rotation or a rotary reflection.)
We can derive rules for computations with holonomic coordinates from the rules for
computations in symbolic notation or (e. g. when operations are only defined in co-
ordinate notation) from the rules for computations in Cartesian coordinates. Thus we
find as a third notation, in addition to the symbolic notation and the notation in Carte-
sian coordinates, the notation of an equation in holonomic curvilinear coordinates,
with analogous translation rules.
ai = bi and ai = bi .
Thus two tensors are equal if we have in symbolic notation, in coordinate notation
in Cartesian coordinates, and in coordinate notation in holonomic curvilinear coordi-
nates
a = A, a = A, a = A,
a = B, a
̂i = B
̂i, ai = Bi ,
ai = Bi ,
a = C, a
̂ ij = C
̂ ,
ij aij = C ij ,
(4.38)
ai j = C i j ,
ai j = Ci j ,
aij = Cij ,
etc.
ai ± bi = ci and ai ± bi = ci ,
and, in general, for addition and subtraction of tensors in the three notations
a ± b = A, a±b = A, a±b = A,
i i
a ± b = B, a
̂i ± b
̂ =B
i
̂i, a ±b = Bi ,
ai ± bi = Bi ,
a ± b = C, a
̂ ij ± b
̂ =C
ij
̂ ,
ij aij ± bij = C ij ,
(4.39)
ai j ± bi j = C i j ,
ai j ± bi j = Ci j ,
aij ± bij = Cij ,
etc.
Thus, we can write an equation, whose terms are vectors in holonomic curvilinear
coordinates, in two equivalent versions: either in contravariant coordinates or in co-
variant coordinates; an equation whose terms are second-order tensors can be written
in four equivalent versions. In general, we can write an equation whose terms are ten-
sors of order n in 2n equivalent versions (corresponding to the number of holonomic
curvilinear coordinates).
The transpose of a tensor in Cartesian coordinates is given by (2.23), and with the equa-
tions above we have in holonomic curvilinear coordinates
̂ Tij = a
a ̂ ji , (aT )ij = aji ,
(aT )i j = aj i ,
(4.40)
(aT )i j = aj i ,
(aT )ij = aji ,
i. e. we transpose a tensor by exchanging the two free indices of the holonomic curvi-
linear tensor coordinates while keeping their positions.
2. According to (2.26) and with (4.39), we have for the symmetric part of a second-order
tensor
1 1 ij
a
̂ (ij) = (a
̂ +a
̂ ji ),
a(ij) = (a + aji ),
2 ij 2
1
a(i j) = (ai j + aj i ),
2
(4.41)
1
a(i j) = (a j + aj i ),
2 i
1
a(ij) = (a + aji )
2 ij
and
ai j g i g j bk g k = ai j bk g i g j g k = ci jk g i g j g k or ai j bk = ci jk .
Six more equivalent versions are possible, because three indices i, j, and k can be ar-
ranged into 23 = 8 different combinations of upper and lower indices. We note that
in computations with these tensor products we can pull the coordinates before the
bases (because they are numbers), but we cannot change the order of the bases, be-
cause tensor products of vectors are not commutative, i. e. g i g j ≠ g j g i . Furthermore,
to be able to cancel out bases, they must match not only in their order, but also in the
positions of their indices, i. e. g i g j ≠ g i g j . As a result of these two conditions, after
canceling out the bases, the free indices in a tensor equation match in both their order
and their positions, i. e.
a b = A, a b = A, a b = A,
a b = B, ab
̂ =B
i
̂i, a bi = Bi ,
a bi = Bi ,
a b = C, a
̂i b = C
̂,
i ai b = C i ,
ai b = Ci ,
(4.43)
a b = D, a
̂i b
̂ =̂
j Dij , ai bj = Dij ,
ai bj = Di j ,
ai bj = Di j ,
ai bj = Dij ,
etc.
or
aij bj = ci .
ai j bj = ci , bk gjk = ci ,
aij ⏟⏟⏟⏟⏟⏟⏟⏟⏟ ai j b jk
k g =c.
⏟⏟⏟⏟⏟⏟⏟⏟⏟
i
bj bj
aij bj = ci and ai j bj = ci .
Since the right sides and thus also the left sides are equal (unlike in the equivalent for-
mulations for indices in different positions), we interpret both equations as different
notations of the same version, i. e. of the contravariant version of the vector equation
a ⋅ b = c. In addition we have the covariant version ai j bj = aij bj = ci . We note that
summation indices must be in different positions, but it does not matter which of the
two indices is in upper position and which is in lower position;
a ⋅ b = A, a
̂i b
̂ = A,
i ai bi = ai bi = A,
a ⋅ b = B, a
̂i b
̂ =B
ij
̂j, ai bij = ai bi j = Bj ,
(4.44)
ai bi j = ai bij = Bj ,
etc.
a
̂ ijk b
̂ =A
j
̂ ik , aijk bj = ai j k bj = Aik ,
aij k bj = ai jk bj = Ai k ,
(4.45)
ai jk bj = aij k bj = Ai k ,
ai j k bj = aijk bj = Aik .
4.2.8.5 Summary
1. Tensor equations in holonomic curvilinear coordinates must satisfy the three rules
from Section 4.2.3.
4. Thus the equations (4.31) on raising and lowering of indices are simply translations
̂ i = δij a
of identities such as a ̂ j into curvilinear coordinates, where the index of the
vector coordinate is set in different positions on the left and on the right side.
Problem 4.10.
Translate (in one possible way) into holonomic curvilinear coordinates
A. the equations from Problem 2.5;
B. a × (b × c) = a ⋅ c b − a ⋅ b c.
a = a1 g 1 + a2 g 2 + a3 g 3 = a1 g 1 + a2 g 2 + a3 g 3 .
Since in general the basis vectors are not unit vectors, one part of the length of the
components ai g i and ai g i is contained in the coordinates and the other part in the
basis vectors. If the vector, and thus its components, represent a physical quantity,
such as force, then one part of the physical dimension is in the coordinates and the
other part is in the basis vectors. This is often inconvenient, so we define for each
holonomic basis another basis, whose basis vectors have the same directions, but are
unit vectors. We call this basis the corresponding physical basis and denote this basis
and its corresponding coordinates by a star. Thus we have
g*_i = g_i / √(g_{ii}) ,   g*^i = g^i / √(g^{ii})   (no summation over i),    (4.46)
a = a^i g_i = a*^i g*_i = a_i g^i = a*_i g*^i .    (4.47)
The so-defined physical coordinates a∗i and a∗ i clearly represent the length of the
corresponding components, and they have the same physical dimension as the vector
itself. We can generalize (4.47) analogously to higher-order tensors, by substituting
(4.46) for the relation between the physical coordinates and the holonomic coordi-
nates, and we obtain
a*^i = a^i √(g_{ii}) ,   a*_i = a_i √(g^{ii}) ,
a*^{ij} = a^{ij} √(g_{ii}) √(g_{jj}) ,   a*^i_j = a^i_j √(g_{ii}) √(g^{jj}) ,    (4.48)
a*_i^j = a_i^j √(g^{ii}) √(g_{jj}) ,   a*_{ij} = a_{ij} √(g^{ii}) √(g^{jj}) ,
etc.
2. That some of the indices in these equations are underlined, i. e. that the summa-
tion convention cannot be applied, already indicates that we pay a price for the di-
mensional equality of the physical coordinates. The powerful elegance of tensor cal-
culus, which is expressed in the summation convention, exists only for holonomic
coordinates. In practice, we therefore perform computations as long as possible in
holonomic coordinates, and convert only the final result into physical coordinates, if
necessary.
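To make the conversion rule (4.48) concrete, here is a minimal numerical sketch (an addition of ours, not part of the original text) for cylindrical coordinates u^1 = R, u^2 = φ, u^3 = z with g_11 = 1, g_22 = R², g_33 = 1; the sample values of the vector coordinates are made up for illustration.

import numpy as np

# cylindrical coordinates at a point with R = 2.0
R = 2.0
g_cov = np.diag([1.0, R**2, 1.0])          # covariant metric coefficients g_ii
g_con = np.linalg.inv(g_cov)               # contravariant metric coefficients g^ii

a_contra = np.array([3.0, 0.5, -1.0])      # holonomic contravariant coordinates a^i (sample values)
a_cov = g_cov @ a_contra                   # lowering the index: a_i = g_ij a^j

# physical coordinates, cf. (4.48): a*^i = a^i sqrt(g_ii), a*_i = a_i sqrt(g^ii)  (no summation)
a_phys_from_contra = a_contra * np.sqrt(np.diag(g_cov))
a_phys_from_cov = a_cov * np.sqrt(np.diag(g_con))

print(a_phys_from_contra)                  # [ 3.  1. -1.]
print(a_phys_from_cov)                     # same values: both give the lengths of the components

That the two physical coordinate sets coincide here reflects the orthogonality of cylindrical coordinates, discussed in the following point.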
3. For example, in general the two physical bases g ∗ i and g ∗i , corresponding to the
reciprocal bases g i and g i , are not reciprocal, i. e. g ∗ i and g ∗i do not satisfy the orthog-
onality relations (3.24) and (3.25).
However, for an important group of curvilinear coordinates, namely the orthogo-
nal coordinates, this is still the case. Here, the homologous vectors of the two holo-
nomic bases have the same direction, i. e. the corresponding physical bases coincide.
We then write
g*_i = g*^i =: g_{<i>} .    (4.49)
Thus the physical bases corresponding to an orthogonal coordinate system are or-
thonormal at every point, but in general their direction changes from point to point.
This explains why orthogonal physical coordinates are also called locally Cartesian
coordinates.
This change of direction differentiates orthonormal bases from Cartesian bases.
For tensor algebra it does not matter if tensors or bases depend on the position. Con-
sequently, all tensor-algebraic equations have the same form in orthogonal physical
coordinates and in Cartesian coordinates. To translate into orthogonal physical coordinates, we only have to replace the lower indices of the Cartesian coordinates by
indices in angle brackets.
Problem 4.11.
A. Compute the Cartesian coordinates of the physical basis in cylindrical coordi-
nates.
B. Compute the transformation equations between the Cartesian coordinates and the
physical cylindrical coordinates of a vector.
Hint: Using the results of Problems 4.2 and 4.4, the desired equations can be
found immediately or with only few intermediate steps.
C. What is the value of the physical cylindrical coordinates of the δ-tensor and of the
ε-tensor? (No computations necessary.)
D. Translate into orthogonal physical coordinates:
(a ⋅ b)T = c, αik bi j = cjk , a × (b × c) = a ⋅ c b − a ⋅ b c.
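As a cross-check for part A of Problem 4.11, a short sympy sketch (our addition, not the book's solution): it computes the covariant basis of cylindrical coordinates from an assumed parametrization x = (R cos φ, R sin φ, z) (the usual convention of Problem 4.2) and normalizes it to obtain the physical basis in Cartesian components.

import sympy as sp

R, phi, z = sp.symbols('R phi z', positive=True)
x = sp.Matrix([R*sp.cos(phi), R*sp.sin(phi), z])        # assumed Cartesian coordinates x_i(u^k)

g_cov = [x.diff(uk) for uk in (R, phi, z)]              # covariant basis g_i = x_,i , cf. (4.5)
g_phys = [sp.simplify(gi / gi.norm()) for gi in g_cov]  # physical basis g*_i = g_i / sqrt(g_ii)

for gi in g_phys:
    print(gi.T)
# expected: [cos(phi), sin(phi), 0], [-sin(phi), cos(phi), 0], [0, 0, 1]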
G_{,i} := ∂G/∂u^i .    (4.51)
Here G can be the position vector, a basis, a tensor, or a tensor coordinate. For the total
differential d G of a quantity G, we have the following expression.
dG = G_{,i} du^i .    (4.52)
The partial derivative of a quantity is, like the quantity itself, also a function of po-
sition. The total differential of a quantity depends on the coordinates ui and also on
their differentials d ui . For small d ui , i. e. for nearby points with the position vectors
x_{,i} = g_i ,    (4.54)
i. e. the partial derivative of the position vector with respect to the curvilinear coordi-
nates gives the covariant basis. From (4.52) it further follows that d x = g i d ui , i. e.
the differentials of the curvilinear coordinates ui are the contravariant coordinates of
the vector d x. (The position vector x, as we know, is not a vector; the differential d x of
the position vector, however, is a vector.) This property of the coordinate differentials
explains why we write the index as an upper index. In Cartesian and in curvilinear
coordinates, we have
dx = dx_i e_i = du^i g_i .    (4.55)
1. Let us compute the partial derivatives of the two bases with respect to the curvilinear
coordinates ui and write the result with respect to the original basis.
For the covariant basis we get, with (4.5) and (4.9),
g_{i,j} = (∂²x_k / ∂u^i ∂u^j) e_k = (∂²x_k / ∂u^i ∂u^j)(∂u^m / ∂x_k) g_m .
The coefficients
Γ^m_{ij} := (∂²x_k / ∂u^i ∂u^j)(∂u^m / ∂x_k)    (4.56)
are called Christoffel symbols (of the second kind). Note that they are symmetric in the
lower indices.
Γ^m_{ij} = Γ^m_{ji} .    (4.57)
However, they are not tensor coordinates, since they do not satisfy the transformation
law for single contravariant and twice covariant tensor coordinates, as we will see in
Problem 4.13. If the ui are rectilinear, for example, Cartesian coordinates, the Christof-
fel symbols are zero.
To derive the partial derivative of the contravariant basis, we differentiate g i ⋅ g j
= δji with respect to uk . We have
g^i_{,k} ⋅ g_j + g^i ⋅ g_{j,k} = 0 ,
g^i_{,k} ⋅ g_j = −Γ^m_{jk} g_m ⋅ g^i = −Γ^m_{jk} δ^i_m = −Γ^i_{jk} ,
g^i_{,k} ⋅ g_j g^j = g^i_{,k} ⋅ δ = g^i_{,k} = −Γ^i_{jk} g^j .
Thus, the partial derivatives of the bases with respect to the curvilinear coordinates
are
g_{i,j} = Γ^m_{ij} g_m ,   g^i_{,j} = −Γ^i_{jm} g^m .    (4.58)
dg_i = g_{i,j} du^j = Γ^m_{ij} du^j g_m ,
dg^i = g^i_{,j} du^j = −Γ^i_{jm} du^j g^m .    (4.59)
For small d ui , they are the difference between the corresponding basis vectors of two
neighboring points.
Γ^m_{ik} g_m ⋅ g^n = g_{ij,k} g^j ⋅ g^n − g_{ij} Γ^j_{km} g^m ⋅ g^n ,
Γ^n_{ik} = g_{ij,k} g^{jn} − g_{ij} g^{mn} Γ^j_{km} ,
g_{np} Γ^n_{ik} + g_{ij} g^{mn} g_{np} Γ^j_{km} = g_{ij,k} g^{jn} g_{np} ,
g_{pj} Γ^j_{ik} + g_{ij} Γ^j_{kp} = g_{ip,k} .
If we multiply the first of the last three equations by −1/2, the other two by +1/2, and add all three equations, we get
g_{kj} Γ^j_{ip} = 1/2 (g_{pk,i} + g_{ki,p} − g_{ip,k}) .
Multiplication by the contravariant metric coefficients finally gives
Γ^m_{ip} = 1/2 g^{mk} (g_{ki,p} + g_{kp,i} − g_{ip,k}) .    (4.60)
Problem 4.12.
Compute the Christoffel symbols in cylindrical coordinates.
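One way to check the result of Problem 4.12 is to evaluate (4.60) symbolically. The following sympy sketch is our addition; it assumes the cylindrical metric g_11 = 1, g_22 = R², g_33 = 1 (cf. Problem 4.4) and prints only the nonzero Christoffel symbols.

import sympy as sp

R, phi, z = sp.symbols('R phi z', positive=True)
u = (R, phi, z)
g = sp.diag(1, R**2, 1)        # covariant metric coefficients of cylindrical coordinates
ginv = g.inv()

# Gamma^m_{ip} = 1/2 g^{mk} (g_{ki,p} + g_{kp,i} - g_{ip,k}), cf. (4.60)
for m in range(3):
    for i in range(3):
        for p in range(3):
            G = sum(sp.Rational(1, 2) * ginv[m, k] *
                    (sp.diff(g[k, i], u[p]) + sp.diff(g[k, p], u[i]) - sp.diff(g[i, p], u[k]))
                    for k in range(3))
            if sp.simplify(G) != 0:
                print(f"Gamma^{m+1}_{i+1}{p+1} =", sp.simplify(G))
# expected output: Gamma^1_22 = -R,  Gamma^2_12 = Gamma^2_21 = 1/R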
Problem 4.13.
Derive the transformation law for Christoffel symbols when changing between two
curvilinear coordinate systems.
1. Let us now compute the partial derivatives of tensors with respect to the curvilinear
coordinates ui and write them with respect to the original basis or bases. We obtain
a_{,k} = (a^{ij} g_i g_j)_{,k} = a^{ij}_{,k} g_i g_j + a^{ij} Γ^m_{ik} g_m g_j + a^{ij} g_i Γ^m_{jk} g_m
      = (a^{ij}_{,k} + Γ^i_{mk} a^{mj} + Γ^j_{mk} a^{im}) g_i g_j
      = (a^i_j g_i g^j)_{,k} = ⋅⋅⋅ ,
etc.
The coefficients of the original bases in these equations are called covariant deriva-
tives or absolute derivatives of the curvilinear tensor coordinates and they are denoted
by a vertical line; i. e. we have
a|_k := a_{,k} ,
a^i|_k := a^i_{,k} + Γ^i_{mk} a^m ,
a_i|_k := a_{i,k} − Γ^m_{ik} a_m ,
a^{ij}|_k := a^{ij}_{,k} + Γ^i_{mk} a^{mj} + Γ^j_{mk} a^{im} ,    (4.61)
a^i_j|_k := a^i_{j,k} + Γ^i_{mk} a^m_j − Γ^m_{jk} a^i_m ,
a_i^j|_k := a_i^j_{,k} − Γ^m_{ik} a_m^j + Γ^j_{mk} a_i^m ,
a_{ij}|_k := a_{ij,k} − Γ^m_{ik} a_{mj} − Γ^m_{jk} a_{im} ,
etc.
Using these covariant derivatives of tensor coordinates, we can write the partial deriva-
tives of tensors as follows.
a_{,k} = a|_k ,
a_{,k} = a^i|_k g_i = a_i|_k g^i ,    (4.62)
a_{,k} = a^{ij}|_k g_i g_j = a^i_j|_k g_i g^j = a_i^j|_k g^i g_j = a_{ij}|_k g^i g^j ,
etc.
2. The partial derivatives of tensors clearly depend on the coordinate system. Using
the chain rule, we obtain the transformation law for derivatives with respect to the
curvilinear coordinates
∂A/∂ũ^i = (∂u^m/∂ũ^i) ∂A/∂u^m ,    (4.63)
i. e. the index of the partial derivative of a tensor transforms like a covariant index.
Since, on both sides of the equations (4.62), the index k must transform similarly,
the covariant derivative of the holonomic coordinates of a tensor of n-th order gives the
holonomic coordinates of a tensor of order n + 1, i. e. the covariant derivative adds an-
other covariant index. On the other hand, the partial derivative of a tensor coordinate
with respect to the curvilinear coordinates is not a tensor coordinate.
4.4.5 The Total Differential of Tensors. The Total and the Absolute Differential
of Tensor Coordinates
dA = A_{,i} du^i .    (4.64)
δa^{i...j}_{m...n} = a^{i...j}_{m...n}|_k du^k .    (4.66)
The absolute differential of a holonomic tensor coordinate is, just like the covariant
derivative, also a holonomic tensor coordinate, and it is a tensor coordinate of the
same order and type as the original coordinate.
In general, we have
da = δa ,
da = δa^i g_i = δa_i g^i ,    (4.67)
da = δa^{ij} g_i g_j = δa^i_j g_i g^j = δa_i^j g^i g_j = δa_{ij} g^i g^j ,
etc.
Thus the absolute differential of tensor coordinates are the coordinates of the total dif-
ferential of a tensor with respect to the holonomic bases. They are zero if and only if
the total differential of the tensor is zero, i. e. if the tensor is equal at the two neigh-
boring points. In particular, for a vector, this means geometrically that the two vectors
can be obtained from each other by a parallel translation.
We translate differentials of tensor coordinates from Cartesian coordinates into
holonomic curvilinear coordinates, according to (4.67), as follows:
da = dâ_i e_i = δa^i g_i = δa_i g^i .
4. The following computation shows the relation between the total differentials d a,
d ai , the differentials d ui , and the absolute differentials δai , using a vector as an
example:
da = d(a^i g_i) = da^i g_i + a^i dg_i ,
where da^i = a^i_{,k} du^k and dg_i = Γ^j_{ik} g_j du^k (cf. (4.59)), so that
da = (a^i_{,k} + Γ^i_{jk} a^j) g_i du^k = a^i|_k du^k g_i = δa^i g_i .    (4.68)
4.4.7 Gradient
∇ = (∂/∂u^i) g^i .    (4.71)
(The partial derivative of a tensor with respect to the curvilinear coordinates trans-
forms, according to (4.63), like a covariant vector coordinate.)
Thus, according to (2.48), we have for the (right) gradient
or, using (4.62), written out for tensors of different order, we obtain the following for-
mulas.
grad a = a|_k g^k ,
grad a = a^i|_k g_i g^k = a_i|_k g^i g^k ,
grad a = a^{ij}|_k g_i g_j g^k = a^i_j|_k g_i g^j g^k    (4.73)
       = a_i^j|_k g^i g_j g^k = a_{ij}|_k g^i g^j g^k ,
etc.
In other words, the covariant derivatives of the coordinates of a tensor are the coordi-
nates of the gradient of this tensor, and we translate partial derivatives into holonomic
curvilinear coordinates as follows.
We replace the partial derivative of a Cartesian tensor coordinate with the covariant
derivative of a holonomic curvilinear tensor coordinate.
2. Since the δ-tensor and the ε-tensor are independent of position, their gradients
must be zero and thus, according to (4.73), the covariant derivatives of their coordi-
nates must be zero, i. e.
g^{ij}|_k = g_{ij}|_k = 0 ,    (4.74)
e^{ijk}|_m = e^{ij}_k|_m = ⋅⋅⋅ = e_{ijk}|_m = 0 .
(The first of these two equations is also called the Ricci theorem.) This implies that,
in general, the partial derivatives of the coordinates of the δ-tensor and the ε-tensor
are nonzero. We can compute them from the corresponding covariant derivative and
the definition (4.61). For the partial derivatives of the δ-tensor we again obtain the
relations (4.60), which we derived earlier.
Using (4.62) and (4.21), we have e.g. div a = a^i|_k g_i ⋅ g^k = a^i|_k δ_i^k = a^k|_k , and for tensors of
various orders we write the following expressions.
div a = a^k|_k ,
div a = a^{ik}|_k g_i = a_i^k|_k g^i ,    (4.76)
etc.
3. If we substitute (4.61) for the covariant derivative a_i|_k, we see, because of the antimetry of the ε-tensor and the symmetry of the Christoffel symbols with respect to the
two lower indices, that we can write for the curl of a vector
Although these equations contain partial derivatives, which are not tensor coordi-
nates, they are very convenient for computing the curl.
Problem 4.14.
The following equations are given either in symbolic notation, in Cartesian coordi-
nates, or in holonomic curvilinear coordinates. Translate each of them into the other
two notations.
A. a × curl a = b ,    B. ∂â_i/∂x_j + ∂â_j/∂x_i = b̂_{ij} ,
C. a^i b_j|_i = c_j ,    D. e^i_{jk} e_{imn} a^j b^k c^m d^n = f .
Simplify the last expression, using the Grassmann identity, before translating.
Substituting successively 1, 2, and 3 for the free index and summing over the summa-
tion index, we obtain
(div a)^1 = a^{11}|_1 + a^{12}|_2 + a^{13}|_3 ,
(div a)^2 = a^{21}|_1 + a^{22}|_2 + a^{23}|_3 ,
(div a)^3 = a^{31}|_1 + a^{32}|_2 + a^{33}|_3 .
Now we compute the covariant derivatives of the tensor coordinates in these equa-
tions. Using (4.61), we have
a^{ij}|_j = a^{ij}_{,j} + Γ^i_{mj} a^{mj} + Γ^j_{mj} a^{im} .
Taking into account that for cylindrical coordinates only Γ^1_{22}, Γ^2_{12}, and Γ^2_{21} are nonzero,
we have
a^{11}|_1 = a^{11}_{,1} ,
a^{12}|_2 = a^{12}_{,2} + Γ^1_{22} a^{22} + Γ^2_{12} a^{11} ,
a^{13}|_3 = a^{13}_{,3} ,
a^{21}|_1 = a^{21}_{,1} + Γ^2_{21} a^{21} ,
a^{22}|_2 = a^{22}_{,2} + Γ^2_{12} a^{12} + Γ^2_{12} a^{21} ,
a^{23}|_3 = a^{23}_{,3} ,
a^{31}|_1 = a^{31}_{,1} ,
a^{32}|_2 = a^{32}_{,2} + Γ^2_{12} a^{31} ,
a^{33}|_3 = a^{33}_{,3} .
Next we substitute these covariant derivatives into the equations for the contravariant
coordinates of the divergence; we write out the partial derivatives and we substitute
for the Christoffel symbols their values Γ^1_{22} = −R, Γ^2_{12} = Γ^2_{21} = 1/R, so we obtain, with
a^{11} = a_{RR} ,   a^{12} = (1/R) a_{Rφ} ,   a^{13} = a_{Rz} ,   a^{21} = (1/R) a_{φR} ,
a^{22} = (1/R²) a_{φφ} ,   a^{23} = (1/R) a_{φz} ,   a^{31} = a_{zR} ,   a^{32} = (1/R) a_{zφ} ,   a^{33} = a_{zz} ,
the result
(div a)_R = (div a)^1 = ∂a_{RR}/∂R + (1/R) ∂a_{Rφ}/∂φ − a_{φφ}/R + a_{RR}/R + ∂a_{Rz}/∂z ,
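As an independent check of this component, the following sympy sketch (our own, not from the book) rebuilds (div a)^1 from the holonomic coordinates and the Christoffel symbols and compares it with the expression above; the function names a_RR, a_Rphi, ... stand for arbitrary physical coordinates.

import sympy as sp

R, phi, z = sp.symbols('R phi z', positive=True)
u = (R, phi, z)
g = sp.diag(1, R**2, 1)                              # cylindrical metric
ginv = g.inv()

def Gamma(m, i, k):                                  # Christoffel symbols, cf. (4.60)
    return sum(sp.Rational(1, 2) * ginv[m, n] *
               (sp.diff(g[n, i], u[k]) + sp.diff(g[n, k], u[i]) - sp.diff(g[i, k], u[n]))
               for n in range(3))

names = ('R', 'phi', 'z')
# arbitrary physical coordinates a_RR(R, phi, z), a_Rphi(R, phi, z), ...
P = sp.Matrix(3, 3, lambda i, j: sp.Function('a_' + names[i] + names[j])(R, phi, z))
# holonomic contravariant coordinates a^{ij} = a*_ij / (sqrt(g_ii) sqrt(g_jj)), cf. (4.48)
A = sp.Matrix(3, 3, lambda i, j: P[i, j] / sp.sqrt(g[i, i] * g[j, j]))

# (div a)^1 = a^{1k}|_k = a^{1k}_{,k} + Gamma^1_{mk} a^{mk} + Gamma^k_{mk} a^{1m}
div1 = (sum(sp.diff(A[0, k], u[k]) for k in range(3))
        + sum(Gamma(0, m, k) * A[m, k] for m in range(3) for k in range(3))
        + sum(Gamma(k, m, k) * A[0, m] for m in range(3) for k in range(3)))

target = (sp.diff(P[0, 0], R) + P[0, 0]/R + sp.diff(P[0, 1], phi)/R
          - P[1, 1]/R + sp.diff(P[0, 2], z))
print(sp.simplify(div1 - target))                    # 0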
1. According to (4.54), together with (4.18), and also according to (4.63), the index of
the partial derivative of the position vector or of a tensor transforms like a covariant
index. This enabled us to introduce the covariant derivative (4.61) of a tensor coordi-
nate in terms of the partial derivative of the tensor. For the second derivative this is not
the case: the partial derivative of (4.54) is x_{,ij} = g_{i,j} = Γ^m_{ij} g_m, and the partial derivative
of (4.63) is
∂²A/(∂ũ^j ∂ũ^i) = (∂u^m/∂ũ^i)(∂u^n/∂ũ^j) ∂²A/(∂u^n ∂u^m) + (∂²u^m/(∂ũ^j ∂ũ^i)) ∂A/∂u^m ,    (4.80)
i. e. while the index i in A,i transforms covariantly, the two indices in A,ij do not trans-
form like tensor indices. This is the reason why we do not use the second partial deriva-
tive of the position vector or of a tensor. Instead, we introduce the second covariant
derivative. Since the covariant derivative of a tensor coordinate is again a tensor co-
ordinate, the covariant derivative of the covariant derivative of a tensor coordinate is
defined and is again a tensor coordinate. We call it the second covariant derivative of
the original tensor coordinate and write
a^{i...j}_{m...n}|_{pq} := a^{i...j}_{m...n}|_p|_q .    (4.81)
As we know, all Cartesian coordinates of this tensor are zero, which follows immedi-
ately from the fact that the second partial derivative of a quantity does not depend on
the order of differentiation. However, if all Cartesian coordinates of a tensor are zero,
according to the transformation law (4.19) for tensor coordinates, also all holonomic
coordinates of this tensor are zero in any curvilinear coordinate system. Thus we have
a^{i...j}|_{kn} e^{kmn} = 0. For m = 1, for example, this gives
Since e^{312} ≠ 0, it follows that a^{i...j}|_{32} = a^{i...j}|_{23}. For m = 2 and m = 3 we have
a^{i...j}|_{13} = a^{i...j}|_{31}, a^{i...j}|_{12} = a^{i...j}|_{21}. Thus, in general we obtain the following formula.
a^{i...j}_{m...n}|_{pq} = a^{i...j}_{m...n}|_{qp} .    (4.83)
The second covariant derivative of a tensor coordinate also does not depend on the
order of differentiation.
4. We finally compute Δ A := div grad A . For example, we have
grad a = a^i|_k g_i g^k = a^i|_m g^{mk} g_i g_k ,
div grad a = (a^i|_m g^{mk})|_k g_i = g^{mk} a^i|_{mk} g_i + g^{mk}|_k a^i|_m g_i ,   where g^{mk}|_k = 0 by (4.74),
or, in general,
Δa = g^{mn} a|_{mn} ,
Δa = g^{mn} a^i|_{mn} g_i = g^{mn} a_i|_{mn} g^i ,    (4.84)
Δa = g^{mn} a^{ij}|_{mn} g_i g_j = g^{mn} a^i_j|_{mn} g_i g^j = g^{mn} a_i^j|_{mn} g^i g_j = g^{mn} a_{ij}|_{mn} g^i g^j ,
etc.
Problem 4.16.
Translate into the other two notations:
A. ∂²â_m/∂x_k² = b̂_m ,    B. ∂²â_k/(∂x_m ∂x_k) = b̂_m ,    C. da = (grad a) ⋅ dx .
Problem 4.17.
Compute in physical cylindrical coordinates: A. Δa for a scalar field a, B. Δa for a vector field a.
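For part A of Problem 4.17 (the scalar field), a small sympy sketch (an addition of ours) evaluates Δa = g^{mn} a|_{mn} in cylindrical coordinates and compares it with the familiar form (1/R) ∂/∂R(R ∂a/∂R) + (1/R²) ∂²a/∂φ² + ∂²a/∂z².

import sympy as sp

R, phi, z = sp.symbols('R phi z', positive=True)
u = (R, phi, z)
g = sp.diag(1, R**2, 1)
ginv = g.inv()

def Gamma(k, m, n):                                  # Christoffel symbols, cf. (4.60)
    return sum(sp.Rational(1, 2) * ginv[k, p] *
               (sp.diff(g[p, m], u[n]) + sp.diff(g[p, n], u[m]) - sp.diff(g[m, n], u[p]))
               for p in range(3))

a = sp.Function('a')(R, phi, z)                      # an arbitrary scalar field

# a|_{mn} = a_{,mn} - Gamma^k_{mn} a_{,k},  then  Delta a = g^{mn} a|_{mn}, cf. (4.84)
lap = sum(ginv[m, n] * (sp.diff(a, u[m], u[n])
                        - sum(Gamma(k, m, n) * sp.diff(a, u[k]) for k in range(3)))
          for m in range(3) for n in range(3))

classical = sp.diff(R * sp.diff(a, R), R)/R + sp.diff(a, phi, 2)/R**2 + sp.diff(a, z, 2)
print(sp.simplify(lap - classical))                  # 0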
Problem 4.18.
A. Show, using the definition of the covariant derivative, that the following holds:
B. Prove that the bracket represents the coordinates of the zero tensor of fourth or-
der.
The tensor
R^m_{ijk} := Γ^m_{ik,j} − Γ^m_{ij,k} + Γ^n_{ik} Γ^m_{nj} − Γ^n_{ij} Γ^m_{nk}
dx = du^i g_i .    (4.85)
dx_1 = du^1 g_1 ,   dx_2 = du^2 g_2 ,   dx_3 = du^3 g_3 .    (4.86)
This explains why we usually use the contravariant coordinates of the vector d x. From
(4.85), together with (4.86), we obtain in component form
dx = du^1 g_1 + du^2 g_2 + du^3 g_3 = dx_1 + dx_2 + dx_3 .    (4.87)
dA = dA_i g^i .    (4.88)
dA_1 = dA_1 g^1 ,   dA_2 = dA_2 g^2 ,   dA_3 = dA_3 g^3 .    (4.89)
This again explains why we commonly use the covariant coordinates of the vector d A.
From (4.88), together with (4.89), analogously to (4.87), we obtain
dA = dA_1 g^1 + dA_2 g^2 + dA_3 g^3 = dA_1 + dA_2 + dA_3 .    (4.90)
3. The three vectors d Ai and the three vectors d x i are related as follows. For example,
the surface element with the surface vector d A1 is spanned by the line elements d x2
and d x 3 , i. e. we have
dA_1 = ±dx_2 × dx_3 ,   dA_2 = ±dx_3 × dx_1 ,   dA_3 = ±dx_1 × dx_2 .    (4.91)
dA_1 = ±√g du^2 du^3 ,
dA_2 = ±√g du^3 du^1 ,    (4.92)
dA_3 = ±√g du^1 du^2 .
dV = [dx_1, dx_2, dx_3] .    (4.93)
dV = [du^1 g_1, du^2 g_2, du^3 g_3] = [g_1, g_2, g_3] du^1 du^2 du^3 ,
dV = √g du^1 du^2 du^3 .    (4.94)
1. In (2.94) and (2.95) we defined various types of integrals of tensor fields, both in
symbolic notation and in Cartesian coordinates. If, in the equations in symbolic nota-
tion, we write all tensors as the sum of their components with respect to a Cartesian
coordinate system, we can pull the bases in front of the integral and cancel them (they
are independent of position) and thus obtain equations in Cartesian coordinates. We
have, e. g. for
∫ a dx = B ,    (a)
∫ â_{ij} e_i e_j dx_k e_k = e_i e_j e_k ∫ â_{ij} dx_k = B̂_{ijk} e_i e_j e_k ,
∫ â_{ij} dx_k = B̂_{ijk} .    (b)
do is to write the bases, using (4.5) or (4.7), with respect to a Cartesian basis and then
pull this Cartesian basis in front of the integrals and cancel them out, e. g. we write
In this way, we obtain an expression for the integrand which contains curvilinear co-
ordinates. However, according to the transformation law (4.19) for tensor coordinates,
the integrand consists of the Cartesian coordinates of the respective tensor as a func-
tion of the curvilinear coordinates. Such expressions are sometimes useful, but let us
keep in mind that ultimately they are expressions in Cartesian tensor coordinates.
2. However, there are two special cases where the integrand can be written in “true”
curvilinear coordinates. First, if the integral is a scalar combination of tensor coordi-
nates, then the integral does not contain a basis. An example is the Gauss theorem
(2.97)1 in the form
∫ a^i|_i dV = ∮ a^i dA_i    (4.95)
∫ a_i|_k e^{ijk} dA_j = ∮ a_i du^i .    (4.96)
∫ a|_3 dV = ∮ a dA_3    (4.97)
tensor analysis in the 19th century. From today’s perspective, however, it is easier to
derive the fundamental equations of the theory of surfaces as a special case of tensor
analysis in three-dimensional curvilinear coordinate systems. We will take this path
here.
We further assume that the surface is sufficiently smooth and that the functions xi are
sufficiently often differentiable with respect to u1 and u2 . The coordinates u1 and u2 are
also called surface parameters or surface coordinates.
2. From the perspective of the surrounding space, the surface can be interpreted as
a coordinate surface in a three-dimensional curvilinear coordinate system. The third
coordinate u3 has the same value for all points on the surface u3 = c. Points with
u3 ≠ c are not included in the surface. With this interpretation we can assign two basis
vectors to the surface, i. e. the covariant basis vectors g 1 and g 2 , which are, according
to (4.5), tangent to the coordinate curves of the u1 - and u2 -coordinates, respectively,
i. e.
g_1 = (∂x_i/∂u^1) e_i = x_{,1} ,   g_2 = (∂x_i/∂u^2) e_i = x_{,2} .    (4.100)
Together, g 1 and g 2 span the tangent plane of the surface at the point P.
If we need a basis in the three-dimensional space, we add to g 1 and g 2 the normal
vector n of the surface at the point P, so we have
n = (g_1 × g_2) / |g_1 × g_2| .    (4.101)
The basis g 1 , g 2 , n is called a moving frame of the surface and, in general, it is varying
from point to point. The normal vector n has length 1 and is perpendicular to both
g 1 and g 2 . The lengths of g 1 and g 2 and the angle enclosed by them depend on the
definition of the coordinates u1 and u2 .
The normal vector can be interpreted as a covariant basis vector g 3 = (𝜕xi /𝜕u3 ) ei ,
if the third coordinate u3 is at every point normal to the surface and scaled such that
g 3 = n. This interpretation turns out to be useful, when we specialize the equations of
tensor analysis from the three-dimensional space to the surface. However, according
to definition (4.101), g 3 = n does not depend on u3 ; it is via g 1 and g 2 only determined
by the surface coordinates u1 and u2 .
3. For our further investigations we find it useful to extend the summation convention.
As before, we denote coordinate indices in the three-dimensional space by small Latin
letters i, j, k, . . . with the value set 1, 2, 3; and as before we sum over Latin indices, which
appear twice, from 1 to 3. However, to analyze a surface, we need only two coordinates.
We denote indices which only take the values 1, 2 by small Greek letters α, β, . . . , and
we sum correspondingly over Greek indices appearing twice from 1 to 2. Thus, we write
(4.99)2 and (4.100) compactly as
xi = xi (uα ), (4.102)
g_α = (∂x_i/∂u^α) e_i = x_{,α} .    (4.103)
u^α = u^α(ũ^β) ,   ũ^α = ũ^α(u^β) .    (4.104)
g̃_α = (∂x_i/∂ũ^α) e_i .
Using the chain rule, we get
g̃_α = (∂u^β/∂ũ^α) g_β .    (4.105)
This equation corresponds to the transformation law (4.18) for covariant basis vectors,
with the only difference that here summation is only from 1 to 2.
The new basis vectors g̃ α can be represented as a linear combination of the old ba-
sis vectors g α (as we can expect from the transformation equations (4.104)), and thus,
they also lie in the tangent plane spanned by g α . In other words, only one tangent
plane exists at a point P and it contains the tangent vectors of all curves through the
point P.
5. The basis vectors g i , reciprocal to the moving frame, are given by (3.21) and with
g 3 = n as
g^1 = (g_2 × n) / [g_1, g_2, n] ,   g^2 = (n × g_1) / [g_1, g_2, n] ,   g^3 = (g_1 × g_2) / [g_1, g_2, n] .
From the properties of the vector product, we know that g 1 and g 2 are perpendicular
to n and that n is perpendicular to g 1 and g 2 . Thus g 1 and g 2 lie in the plane spanned
by g 1 and g 2 , while g 3 is perpendicular to g 1 and g 2 . Hence g 3 and n are collinear and
since [g 1 , g 2 , n] = g 1 × g 2 ⋅ n = |g 1 × g 2 |, we have g 3 = n. The g α are again called con-
travariant basis vectors. According to Section 3.9.4, they satisfy, analogously to (3.24),
the orthogonality relations for reciprocal bases in a plane. We have
g^α ⋅ g_β = δ^α_β .    (4.106)
g_α g^α + n n = δ .    (4.107)
1. The differential of the position vector on the surface is given, in terms of the coor-
dinates u^α and ũ^α, as
dx = dx_i e_i = (∂x_i/∂u^α) e_i du^α = du^α g_α = (∂x_i/∂ũ^α) e_i dũ^α = dũ^α g̃_α .
Thus, the differential of the position vector dx also lies in the tangent plane. Substitution of the transformation law (4.105) for the basis vectors gives
dũ^α g̃_α = du^β g_β = du^β (∂ũ^α/∂u^β) g̃_α .
Problem 4.19.
Derive the transformation laws for the contravariant basis vectors g α and for the co-
variant coordinates aα of a surface vector a.
Surface tensors of higher order are defined similarly, and their coordinates must
analogously satisfy the transformation laws (4.19). Thus, computations with surface
tensors are performed in the same way as with tensors in a three-dimensional curvilin-
ear coordinate system. The only difference is that the values of the indices are limited
to 1, 2.
Problem 4.20.
Metric coefficients on a curved surface can be defined by the scalar product
g α ⋅ g β = gαβ , just like in a three-dimensional curvilinear coordinate system. Show
that the metric coefficients gαβ are the entirely covariant coordinates of a second-
order surface tensor.
1. Now we ask how we can measure lengths, angles, and surface areas on a curved
surface; these quantities are also called metric properties of a surface. The fundamen-
tal idea is to replace the curved surface in the infinitesimal neighborhood of a point P
by its tangent plane; there we can compute lengths, angles, and surface areas, using
the tools of plane geometry. The transition to surface patches of finite size then follows
directly by integration.
2. The differential of the position vector was introduced in Section 4.5.2 as
d x = g α d uα . (4.109)
d x ⋅ d x = (g α d uα ) ⋅ (g β d uβ ) = g α ⋅ g β d uα d uβ .
The scalar products of the (covariant) basis vectors of the surface form, analogously
to (4.21), the (covariant) metric coefficients of the surface, i. e.
gαβ := g α ⋅ g β , (4.110)
and then
d x ⋅ d x = gαβ d uα d uβ . (4.111)
This equation is called the first fundamental form of the surface. It offers another way
to show, different from Problem 4.20, that the metric coefficients are tensor coordi-
nates; since the left side is a scalar d x ⋅ d x and the differentials d uα and d uβ of the
coordinates are, according to (4.108), the contravariant coordinates of a surface vector,
the gαβ are, according to the quotient rule, the covariant coordinates of a second-order
surface tensor. The metric coefficients are symmetric, since they are defined as scalar
products,
and furthermore, because d x ⋅ d x > 0, their matrix is positive definite. The following
notations are commonly used:
and with u1 = u and u2 = v these are the relations (2.87). Analogously to (4.21), we
compute the contravariant metric coefficients as
g αβ := g α ⋅ g β . (4.114)
g_{αγ} g^{γβ} = g_α ⋅ g_γ g^γ ⋅ g^β = g_α ⋅ (δ − n n) ⋅ g^β = g_α ⋅ g^β − g_α ⋅ n n ⋅ g^β = δ_α^β ,
i. e. the matrices of the two types of metric coefficients are inverses of each other, so
we have
Because the metric coefficients on a surface are only 2,2-matrices, we can easily com-
pute the inverse and we obtain
g^{11} = g_{22}/g ,   g^{22} = g_{11}/g ,   g^{12} = g^{21} = −g_{12}/g ,    (4.116)
Problem 4.21.
Show, using a surface vector as an example, that similarly as in three dimensions, we
can again use the metric coefficients gαβ and g αβ to raise and lower indices.
Using the definition (4.110) and the Grassmann identity (compare the solution to
Problem 2.13 A), we can also write the determinant as
g = (g_1 ⋅ g_1)(g_2 ⋅ g_2) − (g_1 ⋅ g_2)(g_1 ⋅ g_2) = g_{11} g_{22} − g_{12} g_{12} = (g_1 × g_2) ⋅ (g_1 × g_2) = |g_1 × g_2|²
or
√g = |g 1 × g 2 |. (4.118)
Furthermore, g is also the determinant of the covariant metric coefficients of the mov-
ing frame. Since g 1 × g 2 and n are, according to (4.101), collinear and n has length 1, it
follows with (4.25)1 that
3. For a curve defined on a surface, the surface coordinates uα are functions of a curve
parameter s, i. e. we have for the position vector to a point on the curve
Using the chain rule and (4.103), we obtain for the vector element tangent to the curve
dx = (dx/ds) ds = x_{,α} (du^α/ds) ds = g_α (du^α/ds) ds = t ds .
Here
t = dx/ds ,   t^α = du^α/ds    (4.120)
is the vector tangent to the curve.
To find the length L of a finite curve element between the point A (with the pa-
rameter value sA ) and the point B (with the parameter value sB ) we integrate over
|d x| = √d x ⋅ d x. Using (4.111) and (4.120), we get
L = ∫_{s_A}^{s_B} √(g_{αβ} t^α t^β) ds .    (4.121)
4. For two surface curves intersecting at a point P on the surface, we can compute the
angle φ between the two curves from the two tangent vectors t 1 and t 2 . From
t 1 ⋅ t 2 = |t 1 | |t 2 | cos φ
cos φ = (t_1 ⋅ t_2) / (|t_1| |t_2|) = g_{αβ} t_1^α t_2^β / (√(g_{μν} t_1^μ t_1^ν) √(g_{ρσ} t_2^ρ t_2^σ)) .    (4.122)
5. We consider two curve elements d x1 and d x2 , which are tangent to the u1 - and
u2 -coordinate curves,
d x 1 = g 1 d u1 , d x 2 = g 2 d u2 .
The surface area of the parallelogram spanned by the two vectors is given by the length
of the vector product
d A = d x1 × d x 2 = (g 1 × g 2 ) d u1 d u2 ,
d A = |g 1 × g 2 | d u1 d u2 = √g d u1 d u2 .
Hence, we compute the surface area A of a finite surface patch bounded by u^1_A ≤ u^1 ≤ u^1_B and u^2_A(u^1) ≤ u^2 ≤ u^2_B(u^1) with the following integral:
A = ∫_{u^1_A}^{u^1_B} ∫_{u^2_A(u^1)}^{u^2_B(u^1)} √g du^2 du^1 .    (4.123)
We thus derived once more the equation for a surface integral of the first kind from
Section 2.14.3; to see this, we compare the last equation with (2.88) and write g using
the notations E, F, G from (4.113), so we have g = E G − F 2 .
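To see the machinery of (4.110), (4.118), and (4.123) at work, here is a short sympy sketch (ours, not the book's) for a sphere of radius a with the assumed surface coordinates u^1 = ϑ, u^2 = φ.

import sympy as sp

a, th, ph = sp.symbols('a theta phi', positive=True)
x = sp.Matrix([a*sp.sin(th)*sp.cos(ph), a*sp.sin(th)*sp.sin(ph), a*sp.cos(th)])

g1, g2 = x.diff(th), x.diff(ph)                      # covariant surface basis, cf. (4.103)
g = sp.Matrix([[g1.dot(g1), g1.dot(g2)],
               [g2.dot(g1), g2.dot(g2)]]).applyfunc(sp.simplify)   # g_ab, cf. (4.110)
print(g)                                             # diag(a**2, a**2*sin(theta)**2)

det_g = sp.simplify(g.det())                         # a**4*sin(theta)**2
sqrt_g = a**2 * sp.sin(th)                           # = sqrt(g) for 0 <= theta <= pi, cf. (4.118)
assert sp.simplify(det_g - sqrt_g**2) == 0

area = sp.integrate(sqrt_g, (ph, 0, 2*sp.pi), (th, 0, sp.pi))      # cf. (4.123)
print(area)                                          # 4*pi*a**2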
1. Planes are defined by the fact that their normal vector n points everywhere in the
same direction, i. e. its differential always satisfies d n = 0. On a curved surface,
the direction of n varies from point to point, i.e. we have dn ≠ 0. This suggests investigating dn to gain further insight into the nature of curvature.
2. According to Section 4.5.1, the normal vector n is a function of the surface coordi-
nates uα , i. e. its differential is given by
d n = n,β d uβ . (4.124)
We could compute n,β directly from the defining equation (4.101). However, this is com-
plicated and messy, because we have to use the product rule and the quotient rule sev-
eral times. It turns out to be more efficient to use some of the properties of the moving
frame.
The normal vector n is by definition a unit vector, so its scalar product with itself is
n ⋅ n = 1.
Partial differentiation of this equation with respect to the surface coordinates uβ gives
n,β ⋅ n = 0 .
Hence the vectors n,β are perpendicular to n, i. e. they lie in the tangent plane of the
surface at the point P. We can thus represent the n,β as a linear combination of the
basis vectors g α which span this tangent plane. Denoting the coefficients of this linear
combination by −b^α_β, we first obtain
n_{,β} = −b^α_β g_α .    (4.125)
Scalar multiplication by dx = g_α du^α gives
−dn ⋅ dx = −(n_{,β} du^β) ⋅ (g_α du^α) = b^γ_β g_γ ⋅ g_α du^α du^β = b^γ_β g_{γα} du^α du^β .
The left side is the scalar −d n⋅d x, the coordinate differentials d uα and d uβ are accord-
ing to (4.108) the contravariant coordinates of a surface vector, and we already know
from (4.111) that the metric coefficients gγα are the covariant coordinates of a surface
tensor of second order. Hence it follows from the quotient rule that the bγ β are also
tensor coordinates, or more precisely, the coordinates of a surface tensor of second
order. Thus, we can lower the first index using the metric coefficients, and we obtain
the covariant coordinates bαβ of the curvature tensor, i. e. we obtain
−d n ⋅ d x = bαβ d uα d uβ . (4.126)
n ⋅ g α = 0.
Taking the partial derivative of this equation with respect to the surface coordinates
uβ gives
n_{,β} ⋅ g_α = −n ⋅ g_{α,β} ,
−b^γ_β g_γ ⋅ g_α = −b^γ_β g_{γα} = −n ⋅ g_{α,β} ,
i.e. we obtain
b_{αβ} = n ⋅ g_{α,β} .    (4.127)
If we substitute the definition (4.101) for n and use |g 1 × g 2 | = √g, we can write with
(4.118) the covariant coordinates of the curvature tensor as a triple product, i. e.
b_{αβ} = (1/√g) [g_1, g_2, g_{α,β}] .    (4.128)
According to (2.45), this equation reads in coordinate notation, using the definition
(4.103) of the basis vectors g α ,
b_{αβ} = (1/√g) ε_{ijk} (∂x_i/∂u^1)(∂x_j/∂u^2)(∂²x_k/∂u^α ∂u^β) .    (4.129)
because the order of the partial derivatives with respect to uα and uβ does not matter.
The covariant coordinates of the curvature tensor are often denoted by
Problem 4.22.
The equation z = f (x, y) = x y defines a surface over the Cartesian x, y-plane. Compute
the coordinates bα β and bαβ of the curvature tensor for this surface.
Hint: Use the Cartesian coordinates x and y as surface parameters.
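A possible route to Problem 4.22, sketched here with sympy as an aid (this is our addition, not the book's solution): evaluate b_{αβ} = n ⋅ g_{α,β} directly for the parametrization x = (u, v, u v).

import sympy as sp

u, v = sp.symbols('u v', real=True)
w = (u, v)
x = sp.Matrix([u, v, u*v])                           # the surface z = x*y with u = x, v = y
g1, g2 = x.diff(u), x.diff(v)
n = g1.cross(g2) / g1.cross(g2).norm()               # normal vector, cf. (4.101)

# covariant curvature coordinates b_ab = n . g_{a,b}
b = sp.Matrix(2, 2, lambda i, j: sp.simplify(n.dot(x.diff(w[i]).diff(w[j]))))
gmat = sp.Matrix([[g1.dot(g1), g1.dot(g2)], [g2.dot(g1), g2.dot(g2)]])
b_mixed = (gmat.inv() * b).applyfunc(sp.simplify)    # b^a_b = g^{ac} b_{cb}

print(b)        # [[0, 1/sqrt(u**2 + v**2 + 1)], [1/sqrt(u**2 + v**2 + 1), 0]]
print(b_mixed)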
Problem 4.23.
Write the mean curvature and the Gaussian curvature in terms of E, F, G, L, M, N.
i. e. d t/d s points in the direction of the normal vector n. Since in addition |n| = 1, we
can write for the curvature
κ = (dt/ds) ⋅ n ,
which also determines the sign; if d t/d s and n have the same direction, κ is positive;
if they have opposite directions, it is negative. Furthermore, it follows from t ⋅ n = 0
that
(dt/ds) ⋅ n = −t ⋅ (dn/ds) ,
κ = −(dx/ds) ⋅ (dn/ds) .    (4.134)
The plane of a normal section can be rotated arbitrarily around the normal vector. So it
makes sense to ask for which planes the curvature attains its maximum and minimum
values. The answer to this question is the solution to the extremum problem
κ = −(dx/ds) ⋅ (dn/ds) → extremum
for the tangent vector t, under the constraint t ⋅ t = 1, i. e.
1 − (dx/ds) ⋅ (dx/ds) = 0 .
We introduce a Lagrange multiplier k and the corresponding Lagrange function
h = −(dx/ds) ⋅ (dn/ds) + k (1 − (dx/ds) ⋅ (dx/ds)) ,
substitute the first fundamental form (4.111) and the second fundamental form (4.126),
and use (4.120) to obtain
h = b_{αβ} (du^α/ds)(du^β/ds) + k (1 − g_{αβ} (du^α/ds)(du^β/ds)) = b_{αβ} t^α t^β + k (1 − g_{αβ} t^α t^β) .
Thus, the necessary condition 𝜕h/𝜕t γ = 0 for the extrema to exist leads, analogously
to Section 3.12.3, to the generalized eigenvalue problem
In the theory of surfaces these eigenvalues are called principal curvatures. Prob-
lem 3.14 shows that they are always real, because the bαβ are symmetric and the gαβ
are symmetric and positive definite. The corresponding eigendirections are called
principal curvature directions. The principal curvatures k1 and k2 are as usual com-
puted by solving the characteristic equation det (bαβ − k gαβ ) = 0. If we use the
mixed coordinates bα β , we can write the characteristic equation, according to the
multiplication theorem (1.13) for determinants, as
det(b_{αβ} − k g_{αβ}) = det(g_{αμ} (b^μ_β − k δ^μ_β)) = (det g_{αμ}) (det(b^μ_β − k δ^μ_β)) = 0 ,
k² − 2 H k + K = 0 .    (4.136)
if we multiply each equation by the other eigenvector and subtract the two equations;
then we have
The left side is zero due to the symmetry of bαβ ; on the right side we can use the sym-
metry of gαβ . Thus we get for k1 ≠ k2
g_{αβ} t_1^α t_2^β = 0 .
With (4.122), however, this is equivalent to cos φ = 0, i.e. the angle between the
eigendirections t_1 and t_2 is φ = 90°.
Problem 4.24.
Find the principal curvatures and the principal curvature directions for the surface
z = f (x, y) = x y of Problem 4.22.
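A quick check at the single point u = v = 0 (a special case of Problem 4.24, added here as an illustration): there g_{αβ} = δ_{αβ} and b_{αβ} has only off-diagonal entries 1 (see the sketch after Problem 4.22), so the generalized eigenvalue problem can be solved directly.

import sympy as sp

k = sp.symbols('k')
g = sp.eye(2)                          # g_ab at the origin of the surface z = x*y
b = sp.Matrix([[0, 1], [1, 0]])        # b_ab at the origin

ks = sp.solve((b - k*g).det(), k)      # characteristic equation, cf. (4.136)
dirs = [(b - kk*g).nullspace()[0].T for kk in ks]
print(ks)                              # [-1, 1]  (principal curvatures k1, k2)
print(dirs)                            # directions along (-1, 1) and (1, 1): mutually orthogonal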
1. In the last section we used the partial derivative n,β of the normal vector to gather
more information about the curvature of a surface. In this section we generalize our
considerations to the covariant basis vectors g 1 and g 2 , i. e. we ask, in all generality,
how the moving frame varies along the surface.
2. We can write the partial derivative of a covariant basis using (4.58) in terms of the
Christoffel symbols as follows:
g_{i,j} = Γ^m_{ij} g_m .
g_{α,β} = Γ^γ_{αβ} g_γ + Γ^3_{αβ} n ,   n_{,β} = Γ^γ_{3β} g_γ + Γ^3_{3β} n .    (a)
From Section 4.5.4 we already know that n,β can be represented as a linear combination
of the g α , so comparing (a) with (4.125) gives
Γ^γ_{3β} = −b^γ_β = −g^{γα} b_{αβ} ,   Γ^3_{3β} = 0 .    (4.138)
In the next step we use g α ⋅ n = 0 and after taking the partial derivatives, we have
n,β ⋅ g α = −n ⋅ g α,β .
Substituting n,β and g α,β and using (a) together with (4.138), we obtain
−b^γ_β g_γ ⋅ g_α = −n ⋅ (Γ^γ_{αβ} g_γ + Γ^3_{αβ} n) = −Γ^γ_{αβ} g_γ ⋅ n − Γ^3_{αβ} n ⋅ n ,
i.e., with g_γ ⋅ g_α = g_{γα}, g_γ ⋅ n = 0, and n ⋅ n = 1,
Γ^3_{αβ} = b_{αβ} .
Thus, we finally obtain for the partial derivative of the moving frame
g_{α,β} = Γ^γ_{αβ} g_γ + b_{αβ} n ,    (4.140)
n_{,β} = −b^γ_β g_γ .    (4.141)
These equations are known in the theory of surfaces as the derivative equations of
Gauss and Weingarten. The b_{αβ} are, according to (4.128), the covariant coordinates of
the curvature tensor. The Christoffel symbols Γ^γ_{αβ} can be computed from (4.60).
Christoffel symbols themselves are related to the partial derivatives of the metric coef-
ficients according to (4.60), and thus the curvature properties of a surface are already
determined by the metric coefficients.
Thus, we can, for a given set of metric coefficients, interpret the derivative equa-
tions (4.140) and (4.141) as the determining partial differential equations of the moving
frame. However, the metric coefficients cannot be chosen arbitrarily. Since (4.140) and
(4.141) consist of a total of eighteen equations for only nine vector coordinates, certain
integrability conditions must be satisfied; otherwise the moving frame cannot be con-
sistently determined.
The integrability conditions follow from the interchangeability of the mixed sec-
ond partial derivatives. From (4.140) we have
g_{α,βδ} = (Γ^γ_{αβ} g_γ + b_{αβ} n)_{,δ} = Γ^γ_{αβ,δ} g_γ + Γ^γ_{αβ} g_{γ,δ} + b_{αβ,δ} n + b_{αβ} n_{,δ} .
The second term on the right side can be written with the help of renaming indices
and equation (4.140) as
Γ^γ_{αβ} g_{γ,δ} = Γ^ν_{αβ} g_{ν,δ} = Γ^ν_{αβ} (Γ^γ_{νδ} g_γ + b_{νδ} n) ,
and from (4.141) it further follows that the fourth term equals b_{αβ} n_{,δ} = −b_{αβ} b^γ_δ g_γ. Altogether we have
g_{α,βδ} = (Γ^γ_{αβ,δ} + Γ^ν_{αβ} Γ^γ_{νδ} − b_{αβ} b^γ_δ) g_γ + (b_{αβ,δ} + Γ^ν_{αβ} b_{νδ}) n .
and subtract the two expressions. Since g α,βδ − g α,δβ = 0, it follows from the right
sides that
0 = (Γ^γ_{αβ,δ} − Γ^γ_{αδ,β} + Γ^ν_{αβ} Γ^γ_{νδ} − Γ^ν_{αδ} Γ^γ_{νβ} − (b_{αβ} b^γ_δ − b_{αδ} b^γ_β)) g_γ + (b_{αβ,δ} + Γ^ν_{αβ} b_{νδ} − b_{αδ,β} − Γ^ν_{αδ} b_{νβ}) n .
This equation is satisfied, if the parentheses in front of g γ and n are zero, and this gives
the following integrability conditions:
Γ^γ_{αβ,δ} − Γ^γ_{αδ,β} + Γ^ν_{αβ} Γ^γ_{νδ} − Γ^ν_{αδ} Γ^γ_{νβ} = b_{αβ} b^γ_δ − b_{αδ} b^γ_β ,    (a)
The expression on the left side of (a) is called (mixed) Riemann curvature tensor of
the surface
R^μ_{αβγ} := Γ^μ_{αγ,β} − Γ^μ_{αβ,γ} + Γ^ν_{αγ} Γ^μ_{νβ} − Γ^ν_{αβ} Γ^μ_{νγ} ,    (4.142)
Comparing this equation with (4.61) shows that it represents the covariant derivatives
of the entirely covariant coordinates of the curvature tensor, i. e. instead of (4.144) we
can also write
Problem 4.25.
Obtain the integrability conditions (4.143) and (4.144) from the relation R^i_{jkl} = 0 for
the three-dimensional Riemann curvature tensor.
R_{βαγδ} = b_{βγ} b_{αδ} − b_{βδ} b_{αγ} = −(b_{αγ} b_{βδ} − b_{αδ} b_{βγ}) = −R_{αβγδ} ,
and correspondingly, if we change the order of the last two indices, we obtain
R_{αβδγ} = b_{αδ} b_{βγ} − b_{αγ} b_{βδ} = −(b_{αγ} b_{βδ} − b_{αδ} b_{βγ}) = −R_{αβγδ} .
Thus, the entirely covariant coordinates of the Riemann curvature tensor are antimet-
ric with respect to the first two indices and also with respect to the last two indices.
In other words, 12 of the 16 coordinates are zero, namely the coordinates where the
first and second index or the third and fourth index are equal. For the remaining four
coordinates we have R1212 = −R1221 = R2121 = −R2112 , i. e. the entirely covariant coordi-
nates of the Riemann curvature tensor are symmetric also with respect to the first and
second index pair. It remains to compute only one coordinate, and we choose R1212 , so
we write
Using the multiplication theorem (1.13) and with (4.133), we can write this determinant
in terms of the Gaussian curvature K, i. e.
In the last step we used properties of symmetry and antimetry that are similar to those
of permutation symbols, so we write
However, because of the permutation symbols this is not an equation between tensor
coordinates. To make this an equation for tensor coordinates, we first have to introduce
the holonomic coordinates of a two-dimensional ε-tensor. For the shared set of index
values 1,2 we set μ = i, ν = j; then we can write εμν as εμν = εij3 . We define the
covariant coordinates analogously as
Problem 4.26.
Show that the eμν satisfy the transformation law for the covariant coordinates of a
second-order surface tensor.
Problem 4.27.
Compute, using the Christoffel symbols, the Gaussian curvature of a spherical surface
and of a cylindrical surface.
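As a plausibility check for Problem 4.27, a sympy sketch (our addition): it computes K from the surface metric alone via the Christoffel symbols (4.60), the Riemann tensor (4.142), and the relation R_{1212} = K g; the sphere metric diag(a², a² sin²ϑ) and the cylinder metric diag(a², 1) are assumed.

import sympy as sp

def gauss_curvature(g, coords):
    """K = R_1212 / det(g) for a two-dimensional metric, using (4.60) and (4.142)."""
    ginv = g.inv()
    n = 2
    Gamma = [[[sum(sp.Rational(1, 2) * ginv[m, k] *
                   (sp.diff(g[k, i], coords[j]) + sp.diff(g[k, j], coords[i])
                    - sp.diff(g[i, j], coords[k])) for k in range(n))
               for j in range(n)] for i in range(n)] for m in range(n)]   # Gamma[m][i][j] = Γ^m_ij

    def R(m, a, b, c):                                # mixed Riemann tensor R^m_{abc}, cf. (4.142)
        val = sp.diff(Gamma[m][a][c], coords[b]) - sp.diff(Gamma[m][a][b], coords[c])
        val += sum(Gamma[nu][a][c] * Gamma[m][nu][b] - Gamma[nu][a][b] * Gamma[m][nu][c]
                   for nu in range(n))
        return val

    R_1212 = sum(g[0, m] * R(m, 1, 0, 1) for m in range(n))   # lowering the first index
    return sp.simplify(R_1212 / g.det())

a, th, ph, z = sp.symbols('a theta phi z', positive=True)
print(gauss_curvature(sp.diag(a**2, a**2*sp.sin(th)**2), (th, ph)))   # 1/a**2  (sphere)
print(gauss_curvature(sp.diag(a**2, 1), (ph, z)))                     # 0       (cylinder)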
n_{,βδ} = −(b^γ_{β,δ} + b^ν_β Γ^γ_{νδ}) g_γ − b^ν_β b_{νδ} n ,
n_{,δβ} = −(b^γ_{δ,β} + b^ν_δ Γ^γ_{νβ}) g_γ − b^ν_δ b_{νβ} n ,
and subtract the two expressions. Because n,βδ − n,δβ = 0 it follows from the right
sides that
0 = −(b^γ_{β,δ} − b^γ_{δ,β} + b^ν_β Γ^γ_{νδ} − b^ν_δ Γ^γ_{νβ}) g_γ − (b^ν_β b_{νδ} − b^ν_δ b_{νβ}) n ,
i. e.
b^γ_{β,δ} + Γ^γ_{νδ} b^ν_β = b^γ_{δ,β} + Γ^γ_{νβ} b^ν_δ ,    (a)
and
must hold. Again we can write (a) in terms of covariant derivatives, if we subtract
Γ^ν_{βδ} b^γ_ν on both sides and take Γ^ν_{βδ} = Γ^ν_{δβ} into account. Then we have
b^γ_{β,δ} + Γ^γ_{νδ} b^ν_β − Γ^ν_{βδ} b^γ_ν = b^γ_{δ,β} + Γ^γ_{νβ} b^ν_δ − Γ^ν_{δβ} b^γ_ν ,
i.e. b^γ_β|_δ = b^γ_δ|_β .
This equation is equivalent to (4.145), with the only difference that here we used the
mixed coordinates of the curvature tensor.
We can write (b), by comparing it with (4.143), in terms of the Riemann curvature
tensor. Thus we get with (4.147)
2. How to evaluate conditions of the type (a) or (b) and how to use them to construct
functions relating tensors is the subject of a theory called the theory of the representa-
tion of tensor functions. The basic idea is that a scalar-valued function cannot depend
on single tensor coordinates, but only on scalar combinations of tensor coordinates,
which are clearly invariant under a coordinate transformation. This explains why, in-
stead of the theory of the representation of tensor functions, we also speak of the the-
ory of invariants. We will show that for any symmetric tensor, such as the stress ten-
sor T, the three basic invariants tr T, tr T 2 , and tr T 3 form a complete set of invariants,
so we can write the scalar-valued function f in our example as a function of the three
basic invariants (or, according to Section 3.15, of another equivalent set of invariants,
such as the eigenvalues or the main invariants), so we have
σ = f (tr T, tr T 2 , tr T 3 ).
T ⋅⋅ H = f (D, H),
where the auxiliary tensor H must appear in such a way that it can be canceled out in
the end.
Before we turn to the details, we have to clarify how many tensor invariants are
needed for a (yet to be defined) representation to be complete. For this we first need to
generalize the Cayley–Hamilton theorem, which in its basic version relates only pow-
ers of the same tensor, while in the theory of the representation of tensor functions,
products of different tensors often appear as well. Since the tensors in physical appli-
cations are often polar, we restrict ourselves to polar tensors for the rest of this chapter
(this includes scalars and vectors as tensors of order zero and one, respectively), with-
out mentioning this every time. In the few cases where we consider axial tensors, or if
the distinction between polar and axial tensors is important for the argument, we will
mention this explicitly.
Contraction with api bqj crk gives, if we multiply the first row by api , the second row by
bqj , and the third row by crk ,
Multiplying δ_{lp}, δ_{lq}, and δ_{lr} into the first, second, and third row of the first three determinants, respectively, gives
  | a_{lq}  a_{lr}  a_{ls} |     | a_{pp}  a_{pr}  a_{ps} |     | a_{pp}  a_{pq}  a_{ps} |
− | b_{qq}  b_{qr}  b_{qs} |  +  | b_{lp}  b_{lr}  b_{ls} |  −  | b_{qp}  b_{qq}  b_{qs} |
  | c_{rq}  c_{rr}  c_{rs} |     | c_{rp}  c_{rr}  c_{rs} |     | c_{lp}  c_{lq}  c_{ls} |

          | a_{pp}  a_{pq}  a_{pr} |
+ δ_{ls}  | b_{qp}  b_{qq}  b_{qr} |  = 0.
          | c_{rp}  c_{rq}  c_{rr} |
− alq bqr crs − alr bqs crq − als bqq crr + als bqr crq + alr bqq crs + alq bqs crr
+ app blr crs + apr bls crp + aps blp crr − aps blr crp − apr blp crs − app bls crr
− app bqq cls − apq bqs clp − aps bqp clq + aps bqq clp + apq bqp cls + app bqs clq
+ δls (app bqq crr + apq bqr crp + apr bqp crq − apr bqq crp − apq bqp crr
− app bqr crq ) = 0.
− a ⋅ b ⋅ c − a ⋅ c ⋅ b − a tr b tr c + a tr(b ⋅ c) + a ⋅ c tr b + a ⋅ b tr c
+ b ⋅ c tr a + b tr(a ⋅ c) + b ⋅ a tr c − b ⋅ c ⋅ a − b ⋅ a ⋅ c − b tr a tr c
− c tr a tr b − c ⋅ a ⋅ b − c ⋅ b ⋅ a + c ⋅ a tr b + c tr(a ⋅ b) + c ⋅ b tr a
+ δ (tr a tr b tr c + tr(a ⋅ b ⋅ c) + tr(a ⋅ c ⋅ b) − tr(a ⋅ c) tr b − tr(a ⋅ b) tr c
− tr(b ⋅ c) tr a) = 0.
Finally, after sorting these terms and reversing the signs, we obtain
a ⋅ (b ⋅ c + c ⋅ b) + b ⋅ (a ⋅ c + c ⋅ a) + c ⋅ (a ⋅ b + b ⋅ a)
− (b ⋅ c + c ⋅ b) tr a − (a ⋅ c + c ⋅ a) tr b − (a ⋅ b + b ⋅ a) tr c
+ [tr b tr c − tr(b ⋅ c)] a + [tr a tr c − tr(a ⋅ c)] b
(5.1)
+ [tr a tr b − tr(a ⋅ b)] c
− [tr a tr b tr c + tr(a ⋅ b ⋅ c) + tr(c ⋅ b ⋅ a)
− tr(b ⋅ c) tr a − tr(a ⋅ c) tr b − tr(a ⋅ b) tr c] δ = 0.
6 a3 − 6 tr a a2 + 3 [tr2 a − tr a2 ] a − [tr3 a − 3 tr a tr a2 + 2 tr a3 ] δ = 0.
After dividing by −6 and using (3.98) we see that this is the Cayley–Hamilton theorem
(3.94), so we call (5.1) the generalized Cayley–Hamilton theorem.
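Identity (5.1) is easy to test numerically; the following numpy sketch (an addition, using arbitrary random 3×3 matrices) evaluates the left side of (5.1) and confirms that it vanishes to machine precision.

import numpy as np

rng = np.random.default_rng(0)
a, b, c = (rng.standard_normal((3, 3)) for _ in range(3))
tr, I = np.trace, np.eye(3)

def sym2(x, y):                       # x . y + y . x
    return x @ y + y @ x

lhs = (a @ sym2(b, c) + b @ sym2(a, c) + c @ sym2(a, b)
       - sym2(b, c)*tr(a) - sym2(a, c)*tr(b) - sym2(a, b)*tr(c)
       + (tr(b)*tr(c) - tr(b @ c))*a + (tr(a)*tr(c) - tr(a @ c))*b
       + (tr(a)*tr(b) - tr(a @ b))*c
       - (tr(a)*tr(b)*tr(c) + tr(a @ b @ c) + tr(c @ b @ a)
          - tr(b @ c)*tr(a) - tr(a @ c)*tr(b) - tr(a @ b)*tr(c)) * I)

print(np.allclose(lhs, 0.0))          # True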
3. For the representation of tensor functions, the first row of (5.1) is of particular im-
portance, so we introduce the following symbol:
Σ := a ⋅ (b ⋅ c + c ⋅ b) + b ⋅ (a ⋅ c + c ⋅ a) + c ⋅ (a ⋅ b + b ⋅ a) . (5.2)
Then we obtain from (5.1), after solving for the first row, i. e.
Σ = (b ⋅ c + c ⋅ b) tr a + (a ⋅ c + c ⋅ a) tr b + (a ⋅ b + b ⋅ a) tr c
− [tr b tr c − tr(b ⋅ c)] a − [tr a tr c − tr(a ⋅ c)] b
− [tr a tr b − tr(a ⋅ b)] c (5.3)
+ [tr a tr b tr c + tr(a ⋅ b ⋅ c) + tr(c ⋅ b ⋅ a)
− tr(b ⋅ c) tr a − tr(a ⋅ c) tr b − tr(a ⋅ b) tr c] δ.
Problem 5.1.
Show, using the transformation equations (2.17) or (2.18), that tr T and tr T 2 are invari-
ants of a (polar or axial) second-order tensor.
In Section 3.15, we saw that the eigenvalues, the main invariants, and the basic
invariants are three different sets of three invariants of a second-order tensor. The el-
ements of each of these three sets are independent of each other, i. e. no element of a
set can be computed from the other two elements of the same set; but the three sets are
not independent of each other, i. e. if we know the invariants of one set, then we can
compute the invariants of the other sets. However, it is not clear if the three invariants
of a set form a complete set, i. e. if further invariants exist, which cannot be computed
from these invariants. A complete set of invariants is called a basis (of invariants).
The easiest way to compute invariants is from appropriate contractions of tensor
coordinates. An example are the main invariants of a second-order tensor. Such invari-
ants are polynomials of tensor coordinates and we call them polynomial invariants.
In the theory of the representation of tensor functions we usually consider only poly-
nomial invariants and we use the following terms. A polynomial invariant which can
be written as a polynomial of other polynomial invariants is called reducible (with
respect to these invariants); otherwise it is called irreducible. A complete set of irre-
ducible invariants, i. e. a set such that all other polynomial invariants can be written
in terms of these irreducible invariants, is called an integrity basis. However, it is also
possible that polynomial invariants are related by a polynomial and yet none of them
can be written as a polynomial of the other invariants. Such polynomials are called
syzygies;1 an example is equation (2.46) for the scalar products of the six vectors a, b,
c, d, e, and f .
An example of nonpolynomial invariants are the eigenvalues of a second-order
tensor. We computed them in Section 3.11.2 from a cubic equation whose coefficients
are polynomial invariants, but an eigenvalue cannot be written as a polynomial of
tensor coordinates.
1 Greek: pair (from syn [together] and zygon [yoke]), i. e. literally, a pair coupled together by a yoke,
e. g. in astronomy a generic term for conjunction and opposition of two planets, and in poetry the
juxtaposition of two equal metrical feet.
The polynomial invariants are a subset of all the invariants of a tensor. Since re-
ducible invariants are polynomial combinations of tensor coordinates, it may still hap-
pen that there are more irreducible invariants than independent invariants. In other
words, an integrity basis can contain more elements than a basis.
The triple product plays a particular role. If the vectors u, v, w are polar, then the first
six invariants are also polar scalars, whereas the triple product is, due to the ε-tensor,
an axial scalar. However, according to our definition, an axial scalar is no invariant,
i. e. from three (polar) vectors u, v, w, we can form six irreducible invariants.
If we assume that T is polar, then S and A are also polar and b is axial.
The decomposition into a symmetric part and an antimetric part is, according to
Section 2.6, No. 7, independent of the coordinate system. So we can also use the prin-
cipal axis system of S to represent the coordinates of the full tensor T, and thus we obtain
        ( σ1   0    0  )   (  0    β3  −β2 )
T̃ij =   (  0   σ2   0  ) + ( −β3   0    β1 ) ,
        (  0    0   σ3 )   (  β2  −β1   0  )

        ( σ1   β3  −β2 )
T̃ij =   ( −β3  σ2   β1 ) .    (5.4)
        (  β2  −β1  σ3 )
Here the σi are the (not necessarily different) eigenvalues of S and the βi are the co-
ordinates of b in the principal axis system of S. We already know from Section 3.11.2
that the eigenvalues σi are invariants. The coordinates of b can be interpreted as the
projections of the vector b onto the principal axes of S; hence the βi are also invariants,
because the direction of the principal axes of S and the direction of b are independent
of the coordinate system.
Thus the principal axis system of the symmetric part S is a distinguished coordi-
nate system of the tensor T; the coordinate matrix of T is then (and only then) anti-
metric, except on the main diagonal.
We conclude that, in general, a second-order tensor has six independent invari-
ants; more than six independent invariants are not possible, because the tensor is
uniquely determined by the σi and βi . However, these invariants are not polynomial
invariants, i. e. they cannot be written as polynomials of tensor coordinates in an ar-
bitrary Cartesian coordinate system.
2. If the symmetric part S has multiple eigenvalues, the number of independent in-
variants is reduced.
For an eigenvalue of multiplicity two, we choose the x, y-plane as the eigenplane
of the principal axis system and we put the x-axis such that b lies in the x, z-plane.
Then the tensor T has the coordinate matrix
        ( σ1   0    0  )   (  0    β3   0  )   ( σ1   β3   0  )
T̃ij =   (  0   σ1   0  ) + ( −β3   0    β1 ) = ( −β3  σ1   β1 ) ,
        (  0    0   σ3 )   (  0   −β1   0  )   (  0   −β1  σ3 )
        ( σ   0   0 )   ( 0   0   0 )   ( σ   0   0 )
T̃ij =   ( 0   σ   0 ) + ( 0   0   β ) = ( 0   σ   β ) ,
        ( 0   0   σ )   ( 0  −β   0 )   ( 0  −β   σ )
        ( cos ϑ   −sin ϑ    0 )
R̃ij =   ( sin ϑ    cos ϑ    0 ) .
        (   0        0     ±1 )
This is a representation of a tensor with respect to the principal axis system of its sym-
metric part, so we note that with the angle ϑ only one independent invariant exists.
tr(a ⋅ b ⋅ . . . ⋅ c ⋅ d) = aij bjk ⋅ ⋅ ⋅ cmn dni = bjk ⋅ ⋅ ⋅ cmn dni aij = tr(b ⋅ . . . ⋅ c ⋅ d ⋅ a),
so we have
II. The trace of a sequence of scalar products of second-order tensors does not change
if we transpose each factor, while reversing the order of the factors, i. e.
so we have
and also
tr(s ⋅ a) = tr (s ⋅ a)^T = −tr(a ⋅ s) = −tr(s ⋅ a) ,   using (5.7), (a), and (5.5),
we obtain
tr(s ⋅ a) = 0. (5.8)
From three factors, there are altogether eight possible scalar products: a3 , b3 , a2 ⋅ b, a ⋅
b⋅a, b⋅a2 , b2 ⋅a, b⋅a⋅b, a⋅b2 ; after computing the traces and taking cyclical permutations
according to (5.5) into account, only four invariants of degree three remain:
I_31 := tr a³ ,   I_32 := tr b³ ,    (5.11)
I_33 := tr(a² ⋅ b) ,   I_34 := tr(a ⋅ b²) .
The invariants of degree one, two, and three, which we have considered so far, are all
irreducible, because none can be written in terms of the others. However, for invariants
of degree four, a fundamental restriction appears, since invariants such as tr(a3 ⋅b) and
tr(b3 ⋅ a) are reducible. To see this, we solve the Cayley–Hamilton theorem (3.94) for a3
and scalar multiply it from the right by b, so we have
a³ ⋅ b = A_I a² ⋅ b − A_II a ⋅ b + A_III b ,
i.e. tr(a³ ⋅ b) can be written in terms of invariants of degree less than or equal to three,
since A_I, A_II, and A_III are, according to (3.99), also functions of tr a, tr a², and tr a³. The
same is true for tr(b3 ⋅a), because a and b can be interchanged in the above derivation.
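This reduction can also be verified numerically. In the sketch below (ours), the main invariants of a are taken in the standard form A_I = tr a, A_II = 1/2 (tr²a − tr a²), A_III = det a, which is consistent with (3.99).

import numpy as np

rng = np.random.default_rng(1)
a, b = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
tr = np.trace

A_I = tr(a)
A_II = 0.5 * (tr(a)**2 - tr(a @ a))    # main invariants of a
A_III = np.linalg.det(a)

lhs = tr(a @ a @ a @ b)
rhs = A_I * tr(a @ a @ b) - A_II * tr(a @ b) + A_III * tr(b)
print(np.isclose(lhs, rhs))            # True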
We can extend these considerations to invariants of the form tr(c ⋅ a3 ⋅ d) and
tr(c ⋅ b3 ⋅ d), where c and d are arbitrary second-order tensors; in particular they can
be scalar products of the factors a and b. Such invariants can be reduced, using the
Cayley–Hamilton theorem (3.94), to invariants of a smaller degree. In other words, in-
variants with a term of power three (or higher) are reducible. Hence we only need to
investigate sequences of scalar products with the elements a2 , b2 , a, b.
For scalar products with at least four factors, we can derive another relation, us-
ing the generalized Cayley–Hamilton theorem, which will narrow down the search for
irreducible invariants. For this we set in (5.1) a = b = n and scalar multiply from the
right by an arbitrary tensor d. For the first row we get with (5.2)
Σ ⋅ d = [n ⋅ (n ⋅ c + c ⋅ n) + n ⋅ (n ⋅ c + c ⋅ n) + c ⋅ (n ⋅ n + n ⋅ n)] ⋅ d
= 2 [n ⋅ c ⋅ n + n2 ⋅ c + c ⋅ n2 ] ⋅ d = 2 [n ⋅ c ⋅ n ⋅ d + n2 ⋅ c ⋅ d + c ⋅ n2 ⋅ d] .
Using (5.3), we can write tr(Σ ⋅ d) in terms of invariants of at least one degree less than
the degree of tr(n⋅c⋅n⋅d), tr(n2 ⋅c⋅d), and tr(n2 ⋅d⋅c). In the theory of the representation
of tensor functions we also say that tr(n ⋅ c ⋅ n ⋅ d) and − (tr(n2 ⋅ c ⋅ d) + tr(n2 ⋅ d ⋅ c)) are
equivalent and we write
Equivalent invariants have the same degree and differ only in their (reducible) invari-
ants of lower degrees; in particular, we can replace an irreducible invariant in an in-
tegrity basis by an equivalent invariant. If a computation shows that an invariant is
equivalent to zero, then this invariant is reducible. The important implication of the
equivalence relation (5.12) is that invariants with two factors which are equal but ap-
pear separated from each other can be written as the sum of two other invariants in
which this factor appears squared.
With these preparations we can now reduce our search for fourth-order irre-
ducible invariants to the following scalar products: a2 ⋅ b ⋅ a, a ⋅ b ⋅ a2 , b2 ⋅ a ⋅ b, b ⋅ a ⋅ b2 ,
a ⋅ b ⋅ a ⋅ b, b ⋅ a ⋅ b ⋅ a, a ⋅ b2 ⋅ a, b ⋅ a2 ⋅ b, a2 ⋅ b2 , b2 ⋅ a2 . After computing the trace
and using cyclic permutation we have tr(a2 ⋅ b ⋅ a) = tr(a ⋅ b ⋅ a2 ) = tr(a3 ⋅ b) and
tr(b2 ⋅ a ⋅ b) = tr(b ⋅ a ⋅ b2 ) = tr(b3 ⋅ a), so these invariants are reducible. With the
same reasoning we have tr(a ⋅ b ⋅ a ⋅ b) = tr(b ⋅ a ⋅ b ⋅ a) and tr(a2 ⋅ b2 ) = tr(b2 ⋅ a2 )
= tr(a ⋅ b2 ⋅ a) = tr(b ⋅ a2 ⋅ b), so only two invariants remain, i. e. tr(a ⋅ b ⋅ a ⋅ b) and
tr(a2 ⋅ b2 ), and we have to check if they are irreducible. Setting in (5.12) n = a, c = d = b
we obtain
tr(a ⋅ b ⋅ a ⋅ b) ≡ −2 tr(a2 ⋅ b2 ),
i. e. both invariants are equivalent. Hence only one irreducible invariant of degree four
exists and we choose
When investigating scalar products with five factors, we consider cyclic permutations
from the outset and then we see that there are only two ways of computing invariants
of degree five, which are composed of the elements a2 , b2 , a, b, namely tr(a ⋅ b ⋅ a ⋅ b2 )
and tr(b ⋅ a ⋅ b ⋅ a2 ). From (5.12) it follows that
tr(a ⋅ b ⋅ a ⋅ b2 ) ≡ −2 tr(a2 ⋅ b3 ),
tr(b ⋅ a ⋅ b ⋅ a2 ) ≡ −2 tr(b2 ⋅ a3 );
however, because of a3 and b3 both invariants are reducible. In other words, no irre-
ducible invariants of degree five exist.
In order to find the irreducible invariants of degree six, which can be formed from
the elements a2 , b2 , a, b, we start, after taking cyclic permutations into account, from
the invariants tr(a ⋅ b2 ⋅ a ⋅ b2 ), tr(b ⋅ a2 ⋅ b ⋅ a2 ), tr(a ⋅ b ⋅ a ⋅ b ⋅ a ⋅ b), tr(a2 ⋅ b2 ⋅ a ⋅ b),
tr(a2 ⋅ b ⋅ a ⋅ b2 ). We then investigate, using (5.12), which of them are irreducible. For
tr(a ⋅ b2 ⋅ a ⋅ b2 ) ≡ −2 tr(a2 ⋅ b4 ),
tr(b ⋅ a2 ⋅ b ⋅ a2 ) ≡ −2 tr(b2 ⋅ a4 );
but because of a4 and b4 these invariants are reducible. For the third invariant we have
i. e. the third invariant is equivalent to the (negative) sum of the last two invariants.
For the last two invariants we obtain
i. e. they are equivalent and can only be reduced as a sum, but not individually, to in-
variants of lower degree. So we can regard one of them as irreducible and we choose
It turns out that we do not have to look further for invariants of degree seven or higher,
because they will always be reducible. This follows from two facts. First, a factor can
appear in irreducible invariants only as itself or squared. Second, the equivalence re-
lation (5.12) states that invariants appearing twice can be replaced by invariants with
quadratic factors. From the elements a2 , b2 , a, b we can form, without repetition, at
most invariants of degree six, because for higher degrees at least one of the elements
appears twice. If the element appearing twice is a or b, we can replace the invariant
under consideration, using (5.12), with equivalent invariants with a2 or b2 ; however,
the elements a2 and b2 already appear, so that applying (5.12) again leads to invariants
with a4 or b4 , which are reducible. If the element appearing twice is a2 or b2 , then
applying (5.12) once is already sufficient to reduce the invariant under consideration
to invariants of lower degree.
tensors):
I_11 = tr a ,   I_12 = tr b ,
I_21 = tr a² ,   I_22 = tr b² ,   I_23 = tr(a ⋅ b) ,
I_31 = tr a³ ,   I_32 = tr b³ ,   I_33 = tr(a² ⋅ b) ,   I_34 = tr(a ⋅ b²) ,    (5.15)
I_41 = tr(a² ⋅ b²) ,
I_61 = tr(a² ⋅ b ⋅ a ⋅ b²) .
It remains to answer the question if these irreducible invariants form a complete set.
We will do this by means of examples.
4. As a first example, we consider a single tensor T and set in (5.15) a = T,
b = T T . Since the trace does not change if we transpose its argument and since,
according to (5.7), transposition and exponentiation are interchangeable within a
trace, we see immediately that some of the invariants in (5.15) are the same. We have
I1 := Tii = tr T,
I2 := Tij Tji = tr T 2 ,
I3 := Tij Tij = tr(T ⋅ T T ),
I4 := Tij Tjk Tki = tr T 3 , (5.16)
I5 := Tij Tjk Tik = tr(T 2 ⋅ T T ),
I6 := Tij Tjk Tlk Til = tr(T 2 ⋅ (T T )2 ),
I7 := Tij Tjk Tlk Tlm Tnm Tin = tr(T 2 ⋅ T T ⋅ T ⋅ (T T )2 ).
These invariants in fact form an integrity basis. Both T and T T are needed and it is
not possible to write all invariants in terms of traces using only T. We can see this,
for example, if we compare I2 and I3 . In index notation we have I2 = Tij Tji and I3 =
Tij Tij ; both are scalars obtained by contracting tensor coordinates, but without the
transposed tensor, we cannot write I3 as a trace.2
2 This explains why (5.15) is not a complete set of irreducible invariants of the tensors a and b;
to obtain an integrity basis, we have to include also the transposed tensors aT and bT , i. e. we have
to investigate four different tensors.
If we compare our results with Section 5.3.2, we see that, in general, a second-
order tensor has six independent invariants, but seven irreducible invariants. We can
clearly write all irreducible invariants in terms of the independent invariants from Sec-
tion 5.3.2, No. 1, if we write the traces in (5.16) with respect to the principal axis system
of the symmetric part of T.
Symmetric tensors obey T T = T. Then I2 = I3 and I4 = I5 , and we can reduce
I6 = tr T 4 and I7 = tr T 6 , using the Cayley–Hamilton theorem (3.95), to the remaining
invariants. So, from the irreducible invariants in (5.16), only the three basic invariants
I1 = tr T, I2 = tr T 2 , I4 = tr T 3 remain.
Antimetric tensors satisfy T T = −T; hence I1 = 0. We know T 2 is symmetric
because Tij Tjk = (−Tji ) (−Tkj ) = Tkj Tji , so I2 is nonzero, but we have I3 = −I2 .
Here T 3 is again antimetric because Tij Tjk Tkl = (−Tji ) (−Tkj ) (−Tlk ) = −Tlk Tkj Tji ,
so we have I4 = 0 and thus also I5 = −I4 = 0. It further follows that I6 = tr T 4 and
I7 = −tr T 6 , so the Cayley–Hamilton theorem (3.95) helps us reduce I6 and I7 to the
only remaining invariant I2 = tr T 2 .
For orthogonal tensors we have T T = T −1 . Then I3 = I6 = I7 = 3. In other words,
I3 , I6 , and I7 do not contain any information about a particular orthogonal tensor
and thus do not count as invariants. It further follows that I5 = I1 , so we are left
with the three basic invariants I1 = tr T, I2 = tr T 2 , I4 = tr T 3 . The determinant of
an orthogonal tensor is, according to Section 3.13.2, equal to ±1, and the cotensor is,
according to (3.14), equal to the tensor itself, up to the sign. Thus, we have for the main
invariants, according to (3.47)–(3.49), the values tr T, ±tr T, and ±1; then it follows
from (3.100) that tr T 2 = tr2 T ∓ 2 tr T, tr T 3 = tr3 T ∓ 3 tr2 T ± 3, so an orthogonal tensor
has only one irreducible invariant I1 = tr(T).
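The statements about orthogonal tensors can also be checked numerically. The following sketch (not part of the original text) builds a proper orthogonal tensor from two elementary rotations and verifies that I3 , I6 , and I7 equal 3 and that I5 = I1 :

```python
import numpy as np

def rot_z(p):
    c, s = np.cos(p), np.sin(p)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_x(p):
    c, s = np.cos(p), np.sin(p)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

T = rot_x(0.7) @ rot_z(1.3)          # proper orthogonal tensor
tr = np.trace
assert np.isclose(tr(T @ T.T), 3)                               # I3
assert np.isclose(tr(T @ T @ T.T @ T.T), 3)                     # I6
assert np.isclose(tr(T @ T @ T.T @ T @ T.T @ T.T), 3)           # I7
assert np.isclose(tr(T @ T @ T.T), tr(T))                       # I5 = I1
print("orthogonal-tensor invariants verified")
```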
5. As our next example we consider the case of two symmetric tensors U and V, which
is also important for physical applications. Now we do not need to distinguish between
the tensors and their transposed tensors, so we certainly can obtain an integrity basis
from (5.15). We set a = U, b = V in (5.15) and keep the first 10 invariants. Only the
invariant of degree six does not belong to the integrity basis because it turns out to
be reducible. On the one hand, from the symmetry of the tensors and by taking cyclic
permutations and transposition (5.5)–(5.7) into account, it follows that
tr(U 2 ⋅ V ⋅ U ⋅ V 2 ) ≡ −tr(U 2 ⋅ V ⋅ U ⋅ V 2 ),
Problem 5.2.
Find all irreducible invariants which can be computed from two antimetric tensors A
and B, and compare the result with the invariants of two vectors from Section 5.3.1.
Hint: To show that tr(A2 ⋅ B2 ) is reducible, represent the two antimetric tensors A
and B temporarily by their corresponding vectors.
Problem 5.3.
Find all irreducible invariants which can be formed from a polar symmetric tensor S
and a vector u, and compare the result with the invariants of a symmetric tensor and
an antimetric tensor. Also, distinguish the cases where u is polar or axial.
5.3.4 Summary
We finally summarize the results of Section 5.3 in a table. Here T denotes an arbitrary
tensor, R is an orthogonal tensor, S, U, V are symmetric tensors, A, B are antimetric
tensors, and u, v, w are vectors. All vectors and tensors are polar.
tensors          irreducible invariants
u, v             u ⋅ u, v ⋅ v, u ⋅ v
u, v, w u ⋅ u, v ⋅ v, w ⋅ w, u ⋅ v, u ⋅ w, v ⋅ w
T tr T , tr T 2 , tr T 3 , tr(T ⋅ T T ), tr(T 2 ⋅ T T ),
tr(T 2 ⋅ (T T )2 ), tr(T 2 ⋅ T T ⋅ T ⋅ (T T )2 )
S tr S, tr S2 , tr S3
A tr A2
R tr R
U, V tr U, tr U 2 , tr U 3 , tr V , tr V 2 , tr V 3 ,
tr(U ⋅ V ), tr(U 2 ⋅ V ), tr(U ⋅ V 2 ), tr(U 2 ⋅ V 2 )
A, B tr A2 , tr B2 , tr(A ⋅ B)
S, A tr S, tr S2 , tr S3 , tr A2 , tr(S ⋅ A2 ),
tr(S2 ⋅ A2 ), tr(S2 ⋅ A ⋅ S ⋅ A2 )
S, u tr S, tr S2 , tr S3 , u ⋅ u, u ⋅ S ⋅ u, u ⋅ S2 ⋅ u
Tensors cannot be related by arbitrary functions, because the function value must sat-
isfy the corresponding transformation law for tensor coordinates under a change to
another Cartesian coordinate system. Let polar vectors v, . . . , w and polar second-
order tensors M, . . . , N be arguments of tensor functions F of various orders. Under
a change of the Cartesian coordinate system we have for the coordinates of the argu-
ments, according to (2.17), the transformation equations
ṽi = αpi vp , . . . , w̃j = αqj wq ,
M̃kl = αpk αql Mpq , . . . , Ñmn = αpm αqn Npq .
Depending on the order of the tensors, we obtain different conditions for the func-
tion F :
– A polar scalar s = f (v, . . . , w, M, . . . , N) must remain unchanged under a trans-
formation of the coordinate system,
ũr = αur uu = fr (ṽi , . . . , w̃j , M̃kl , . . . , Ñmn ).
With
If the transformation coefficients αij form an arbitrary orthogonal matrix (we also say
that they encompass all orthogonal transformations), we call (5.17), (5.18) and (5.19)
isotropic tensor functions.
The functions (5.17), (5.18), and (5.19) can in general further depend on polar
scalars; since this does not impose any restrictions on the functions, we did not in-
clude scalars in the list of arguments.
We derived the invariance conditions here only for polar tensors, but they can be
easily transferred to axial tensors by means of the transformation equations (2.18).
In order to satisfy the invariance condition (5.17), a scalar cannot depend on individual
tensor coordinates, because, in general, coordinates change under a transformation
of the coordinate system, so a scalar can only depend on combinations of coordinates
which are invariants and thus also scalars. If it is possible to form an integrity basis
with P irreducible invariants I1 , . . . , IP from the coordinates v, . . . , w, M, . . . , N, then
the scalar-valued function f satisfies
s = f (I1 , . . . , IP ).
The theory of the representation of tensor functions does not provide any more infor-
mation about the function f , so for a particular physical process one has to rely on
experiments. The type and the number P of the invariants I1 , . . . , IP depend on the
particular case and must be determined according to the rules from Section 5.3.
1. We can obtain a scalar from a vector by means of the scalar product of the vector
and another vector. This suggests that, with the help of the scalar product of the vector-
valued function f and an arbitrary auxiliary vector h and by adding the auxiliary vector
to the argument list, we can reduce the problem of finding a representation of a vector-
valued function to the problem of finding a representation of a scalar-valued function:
u ⋅ h = f (v, . . . , w, M, . . . , N, h).
The type and the number Q of the vectors J 1 , . . . , J Q have to be determined for each case individually. The vectors J i are also called the
generators of the representation and a complete set of generators is called a function
basis. We start by writing u ⋅ h as a linear combination of the J i ⋅ h, i. e.
u ⋅ h = k1 J 1 ⋅ h + ⋅ ⋅ ⋅ + kQ J Q ⋅ h, (a)
where the coefficients k1 , . . . , kQ are scalars which can still depend on the invariants
I1 , . . . , IP of the integrity basis for v, . . . , w, M, . . . , N, i. e. we have
ki = ki (I1 , . . . , IP ).
Since each term in (a) is a scalar product with the auxiliary vector h, we can cancel
h out, and thus we obtain a representation of the original vector-valued function f ,
which inherently satisfies the invariance condition (5.18) due to its construction. We
have
u = k1 J 1 + ⋅ ⋅ ⋅ + kQ J Q . (5.21)
We show by means of two examples how to determine the generators J i of this repre-
sentation for a particular case.
2. In the first example we seek a representation for
u = f (v)
and, with the auxiliary vector h,
u ⋅ h = f (v, h).
According to Section 5.3.1, the vector v has only one irreducible invariant, i. e. its
square v ⋅ v, and from the vectors v and h we can form also only one irreducible si-
multaneous invariant, and this is linear in h, namely the scalar product v ⋅ h. So the
representation of the vector u has only one generator, namely the vector v itself, and
we have
u = k(v ⋅ v) v.
The function k(v⋅v) cannot be determined further from the theory of the representation
of tensor functions. For example, for a physical process, the missing information must
be obtained from experiments.
3. In a second example, we extend the function f from No. 2 and include a symmetric
tensor S, so we seek a representation for
u = f (v, S)
and, with the auxiliary vector h,
u ⋅ h = f (v, S, h).
The simultaneous invariants which are linear in h are v ⋅ h, v ⋅ S ⋅ h, and v ⋅ S2 ⋅ h, so the generators are v, v ⋅ S, and v ⋅ S2 , and we obtain
u = k1 v + k2 v ⋅ S + k3 v ⋅ S2 ,
where the coefficients k1 , k2 , k3 are scalar-valued functions of the irreducible invariants of v and S, i. e.
ki = f (tr S, tr S2 , tr S3 , v ⋅ v, v ⋅ S ⋅ v, v ⋅ S2 ⋅ v).
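That this representation indeed defines an isotropic vector-valued function can be checked numerically. The sketch below (not part of the original text) uses arbitrarily chosen coefficient functions ki of the six invariants and verifies f (Q ⋅ v, Q ⋅ S ⋅ QT ) = Q ⋅ f (v, S) for an orthogonal Q:

```python
import numpy as np

def f(v, S):
    # u = k1 v + k2 S.v + k3 S^2.v with arbitrary functions of the invariants.
    i = (np.trace(S), np.trace(S @ S), np.trace(S @ S @ S),
         v @ v, v @ S @ v, v @ S @ S @ v)
    k1, k2, k3 = np.sin(i[0]) + i[3], i[1] * i[4], np.cos(i[5])
    return k1 * v + k2 * S @ v + k3 * S @ S @ v

rng = np.random.default_rng(2)
v = rng.random(3)
S = rng.random((3, 3)); S = S + S.T          # symmetric tensor
Q, _ = np.linalg.qr(rng.random((3, 3)))      # orthogonal tensor
assert np.allclose(f(Q @ v, Q @ S @ Q.T), Q @ f(v, S))
print("isotropy of the vector representation verified")
```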
Problem 5.4.
Find the representation of a vector u which depends on a polar tensor T for the follow-
ing cases:
– T is symmetric or antimetric;
– u is polar or axial.
1. It is easy to generalize our work from Section 5.4.3 for vector-valued functions to
tensor-valued functions; we simply introduce an auxiliary tensor H and use the double
scalar product; this again ensures that the result inherently satisfies the invariance
condition (5.19) for tensor-valued functions. Then, from
T = f (v, . . . , w, M, . . . , N)
we first get
T ⋅⋅ H = f (v, . . . , w, M, . . . , N, H).
The generators J i are now second-order tensors, because in order for H to cancel out
at the end, the simultaneous invariants with the auxiliary tensor must have the linear form J i ⋅⋅ H. We therefore write
T ⋅⋅ H = k1 J 1 ⋅⋅ H + ⋅ ⋅ ⋅ + kQ J Q ⋅⋅ H,
and, after canceling H out, we obtain for the original tensor-valued function f
T = k1 J 1 + ⋅ ⋅ ⋅ + kQ J Q . (5.22)
As in Section 5.4.3, the coefficients k1 , . . . , kQ are scalars which can depend on the
invariants I1 , . . . , IP of an integrity basis for v, . . . , w, M, . . . , N, i. e.
ki = ki (I1 , . . . , IP ).
The type and number Q of the generators J 1 , . . . , J Q can only be determined for each
particular case; we will again explain this with two examples.
2. In the first example we seek a representation for
T = f (S)
and, with the auxiliary tensor H,
T ⋅⋅ H = f (S, H).
Since S is symmetric and because we only need the invariants which are linear in H,
we can start from (5.15) and set there a = S and b = H. From this we obtain the three
basic invariants tr S, tr S2 , tr S3 of the symmetric tensor S and, using the symmetry of
S, the invariants which are linear in H,
δ ⋅⋅ H, S ⋅⋅ H, S2 ⋅⋅ H.
The corresponding generators are δ, S, and S2 , so we obtain the representation
T = k1 δ + k2 S + k3 S2 ,
where the coefficients k1 , k2 , k3 are scalar-valued functions of the three basic invariants
of S, i. e.
ki = f (tr S, tr S2 , tr S3 ).
Since we assumed that the tensor S is symmetric, we immediately obtain from the the-
ory of the representation of tensor functions that the tensor T must also be symmetric.
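A corresponding numerical check for this tensor-valued representation (not part of the original text; the coefficient functions are chosen arbitrarily) verifies f (Q ⋅ S ⋅ QT ) = Q ⋅ f (S) ⋅ QT :

```python
import numpy as np

def f(S):
    # T = k1 δ + k2 S + k3 S^2 with arbitrary functions of the basic invariants.
    i1, i2, i3 = np.trace(S), np.trace(S @ S), np.trace(S @ S @ S)
    k1, k2, k3 = i1 * i2, np.exp(-i3), 1.0 + i2
    return k1 * np.eye(3) + k2 * S + k3 * S @ S

rng = np.random.default_rng(3)
S = rng.random((3, 3)); S = S + S.T          # symmetric tensor
Q, _ = np.linalg.qr(rng.random((3, 3)))      # orthogonal tensor
assert np.allclose(f(Q @ S @ Q.T), Q @ f(S) @ Q.T)
print("isotropy of the tensor representation verified")
```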
3. In the second example we seek a representation for
T = f (S, v)
and, with the auxiliary tensor H,
T ⋅⋅ H = f (S, v, H).
We can keep the invariants of S and v from Section 5.4.3, No. 3, so we only need to
compute the simultaneous invariants which are linear in H. As in No. 2 we initially
find
δ ⋅⋅ H, S ⋅⋅ H, S2 ⋅⋅ H.
v v ⋅⋅ H.
In order to form the simultaneous invariants with both S and v, we start from v v ⋅⋅ H
and form scalar products with v v and S or S2 on the right or on the left; we can neglect
higher powers of S, because of the Cayley–Hamilton theorem (3.95). This gives
(S ⋅ v v) ⋅⋅ H, (v v ⋅ S) ⋅⋅ H, (S2 ⋅ v v) ⋅⋅ H, (v v ⋅ S2 ) ⋅⋅ H,
(S ⋅ v v ⋅ S) ⋅⋅ H, (S2 ⋅ v v ⋅ S) ⋅⋅ H, (S ⋅ v v ⋅ S2 ) ⋅⋅ H, (S2 ⋅ v v ⋅ S2 ) ⋅⋅ H.
Canceling H out, we thus obtain the representation
T = k1 δ + k2 S + k3 S2 + k4 v v + k5 S ⋅ v v + k6 v v ⋅ S + k7 S2 ⋅ v v + k8 v v ⋅ S2 ,
where the coefficients k1 , . . . , k8 are scalar-valued functions of the irreducible invariants of S and v, i. e.
ki = f (tr S, tr S2 , tr S3 , v ⋅ v, v ⋅ S ⋅ v, v ⋅ S2 ⋅ v).
In contrast to No. 2, here we cannot conclude from the symmetry of S that T is also
symmetric. This is only true if k5 = k6 and k7 = k8 . But if T is symmetric, then we can
represent T also in the form
T = k1∗ δ + k2∗ S + k3∗ S2 + k4∗ v v + k5∗ (S ⋅ v v + v v ⋅ S) + k6∗ S ⋅ v v ⋅ S.
If T is symmetric, then S2 ⋅v v and v v⋅S2 can only appear as a sum, and according to the
generalized Cayley–Hamilton theorem (5.1), we can replace this sum by the generator
S ⋅ v v ⋅ S, which is symmetric from the outset. The coefficients k1∗ , . . . , k6∗ are again
scalar-valued functions of the invariants of S and v.
Problem 5.5.
Find the representation of a polar second-order tensor T which depends on a polar
vector or on an axial vector v.
5.4.5 Summary
We summarize the results of Section 5.4 in a table. Here S denotes a polar symmetric
tensor, A a polar antimetric tensor, v a polar vector, and u an axial vector.
vector-valued functions
polar axial
v v
u u
S –
A ϵ ⋅⋅ A
S, v    v, S ⋅ v, S2 ⋅ v
tensor-valued functions
polar axial
v δ, v v ϵ⋅v
u δ, u u, ϵ ⋅ u
S δ, S, S2
S, v δ, S, S2 , v v, S ⋅ v v, v v ⋅ S, S2 ⋅ v v, v v ⋅ S2
2. The invariance conditions from Section 5.4.1 can also be used if only a subset of
all orthogonal transformations is allowed. Without further elaboration, we only men-
tion here that, in general, the number of the invariants increases if the permissible
transformations are restricted. An example is the restriction to only proper orthogonal
transformations. Then we also speak of hemitropic invariants and hemitropic tensor
functions. They are formed similarly as in Sections 5.3 and 5.4, but now we also have
to take the simultaneous invariants with the ε-tensor into account, because for proper
orthogonal transformations, we do not need to distinguish between polar and axial
tensors. Another example are rotations only about a certain axis. If we choose this
axis as the z-axis of a Cartesian coordinate system, then the matrix of the transforma-
tion coefficients has, according to (3.75) and with Section 3.13.4, the form
cos φ − sin φ 0
αij = ( sin φ cos φ 0) .
0 0 1
Then it follows from the transformation equations (2.17) that the coordinate u3 of a
vector and the coordinate T33 of a second-order tensor remain unchanged under the
transformation and thus they have to be counted as invariants.
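This can be confirmed with a short computation (not part of the original text); the vector and tensor coordinates are arbitrary sample values:

```python
import numpy as np

phi = 0.9
a = np.array([[np.cos(phi), -np.sin(phi), 0],
              [np.sin(phi),  np.cos(phi), 0],
              [0, 0, 1]])                    # rotation about the z-axis
rng = np.random.default_rng(4)
u = rng.random(3)
T = rng.random((3, 3))
u_t = a.T @ u          # transformed vector coordinates, u~_i = alpha_pi u_p
T_t = a.T @ T @ a      # transformed tensor coordinates
assert np.isclose(u_t[2], u[2]) and np.isclose(T_t[2, 2], T[2, 2])
print("u3 and T33 are unchanged")
```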
3. All tensors which are invariant under the transformations of a certain symmetry
group form an anisotropy class. We explain this by means of three examples with
second-order tensors.
The anisotropy class of general anisotropy includes all tensors; its symmetry
group consists of the identical transformation αij = δij and the inversion
αij = −δij , because only under these transformations do the coordinates of every tensor remain unchanged. We have
T̃ij = (±δmi )(±δnj ) Tmn = Tij .
On the other hand, the anisotropy class of isotropy includes all tensors whose coor-
dinates are invariant under arbitrary orthogonal transformations. We already intro-
duced such tensors as isotropic tensors; they have the form
T = k δ,
because for the transformed coordinates we have, according to the orthogonality relation (2.6),
T̃ij = αmi αnj k δmn = k αni αnj = k δij .
As a third example, we consider the anisotropy class of transverse isotropy. Its sym-
metry group includes all rotations and rotary reflections with a fixed axis. The gen-
eral form of a transversely isotropic tensor of second order follows from Problem 5.5.
A transversely isotropic tensor can be seen as a tensor which depends only on the
direction n of the axis of rotation or rotary reflection: T = f (n). Thus, it remains to
ensure in the solution to Problem 5.5 that n is an (axial) unit vector, and we have
T = α δ + β n n + γ ε ⋅ n. (5.23)
Since n ⋅ n = 1, here the α, β, γ are, unlike in Problem 5.5, not functions, but arbitrary
constants.
If we choose the z-axis of the Cartesian coordinate system as the axis of rotation
or rotary reflection, i. e. if n1 = n2 = 0, n3 = 1, then T has the coordinate matrix
α 0 0 0 0 0 0 γ 0
Tij = (0 α 0) + (0 0 0) + (−γ 0 0)
0 0 α 0 0 β 0 0 0
α γ 0
= (−γ α 0 ).
0 0 α+β
we have

T̃ij = αmi αnj Tmn = αTim Tmn αnj

      ( α cos φ − γ sin φ     α sin φ + γ cos φ      0     )   ( cos φ   − sin φ   0 )
   =  ( −α sin φ − γ cos φ    α cos φ − γ sin φ      0     )   ( sin φ     cos φ   0 )
      ( 0                     0                    α + β   )   ( 0          0      1 )

      ( α    γ    0     )
   =  ( −γ   α    0     ) .
      ( 0    0   α + β  )
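The same invariance can be confirmed numerically for arbitrary values of α, β, γ, and φ (a sketch, not part of the original text):

```python
import numpy as np

alpha_c, beta_c, gamma_c, phi = 1.7, -0.4, 2.3, 1.1   # arbitrary sample values
T = np.array([[ alpha_c, gamma_c, 0],
              [-gamma_c, alpha_c, 0],
              [0, 0, alpha_c + beta_c]])      # transversely isotropic tensor, n = e_z
a = np.array([[np.cos(phi), -np.sin(phi), 0],
              [np.sin(phi),  np.cos(phi), 0],
              [0, 0, 1]])                     # rotation about the z-axis
assert np.allclose(a.T @ T @ a, T)
print("T is reproduced by every rotation about its axis")
```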
The class of transversely isotropic tensors has some subclasses of tensors which we
have already seen earlier. If we set α = cos ϑ, β = ±1−cos ϑ, γ = − sin ϑ, then we obtain
for T, according to (3.77) and (3.81), the general form of an orthogonal tensor, with the
angle of rotation ϑ and the axis of rotation or rotary reflection n. If we set α = β = 0,
then T is antimetric and γ n is the corresponding vector of T. If we choose γ = 0, then
the tensor is symmetric with an eigenvalue α of multiplicity 2 and another eigenvalue
α + β of multiplicity 1; n is the eigendirection corresponding to the eigenvalue α + β
of multiplicity 1.
We summarize the results of our discussion on anisotropy classes of second-order
tensors in a table. For the anisotropy classes, the number of tensors T, which are con-
tained in the class, increases from the first class to the last; the tensors of a partic-
ular anisotropy class are always included in the next anisotropy class. For the sym-
metry groups, the number of orthogonal matrices αij which are included in the class
increases from the last class to the first; the matrices of a symmetry group are always
contained in the previous symmetry group.
anisotropy class        tensor T                      symmetry group αij

Isotropy                k δ                           arbitrary

                                                        ( cos φ   − sin φ   0 )
Transverse isotropy     α δ + β n n + γ ε ⋅ n           ( sin φ     cos φ   0 ) ,  n = ez
                                                        ( 0          0      1 )

General anisotropy      arbitrary                     ±δij
When we related the stress tensor T to the strain tensor D in Section 5.1, we silently assumed an isotropic solid, and, according to Sec-
tion 5.4.4, No. 2, the function f has the following representation:
T = k1 δ + k2 D + k3 D2 , ki = f (tr D, tr D2 , tr D3 ).
If, on the other hand, the solid is transversely isotropic, then it has a distinguished
direction n, which we have to include in the argument list of the function f , i. e. we are
looking for a representation for T = f (D, n). We can use the result from Section 5.4.4,
No. 3, if in addition we take into account that T is symmetric and that n is a unit vector,
i. e. that n ⋅ n = 1 does not count as an invariant, i. e.
T = k1∗ δ + k2∗ D + k3∗ D2 + k4∗ n n + k5∗ (D ⋅ n n + n n ⋅ D) + k6∗ D ⋅ n n ⋅ D, (a)
ki∗ = f (tr D, tr D2 , tr D3 , n ⋅ D ⋅ n, n ⋅ D2 ⋅ n).
This expression includes the stress–strain relation for an isotropic solid as a special
case, if we set k4∗ = k5∗ = k6∗ = 0 and if k1∗ , k2∗ , and k3∗ do not depend on n.
For small strains we can approximate (a) by a linear relation between the coordi-
nates of T and D. Then k3∗ = k6∗ = 0, k2∗ = α, and k5∗ = β are constant, and k1∗ and k4∗
are linear functions of the invariants tr D and n ⋅ D ⋅ n which are linear in D, so we have
k1∗ = κ tr D + λ n ⋅ D ⋅ n, k4∗ = μ tr D + ν n ⋅ D ⋅ n.
The expression in the curly brackets can be interpreted as the coordinates of a fourth-
order elasticity tensor E with the symmetry properties
Schutz encourages his readers to develop both a pictorial way of thinking and a feel-
ing for geometrical tools in physics. He emphasizes the idea that tensors are geomet-
rical objects independent of any coordinate system. The book has a large chapter on
physical applications including thermodynamics, mechanics, electromagnetism, and
gauge theories.
Epstein believes, just as we do, that what is true and good must also be beautiful,
symmetric, and well balanced. He introduces the concept of a continuum as well as
the field theories in the language of topology and modern differential geometry, and
emphasizes the richness of these ideas using illustrations from continuum mechanics
and materials science.
Frankel aims to provide a working knowledge of geometric techniques for a deeper
understanding of both classical and modern physics and engineering at the level of
advanced undergraduate and graduate students. He promotes a language with which
mathematicians and scientists can communicate and includes applications to fluid
dynamics, electromagnetism, thermodynamics, elasticity, and relativity.
Tensors naturally appear in the study of curves, surfaces, and manifolds, so many
differential geometry books include chapters on tensor analysis.
An elementary account of the geometry of curves and surfaces, which only re-
quires a standard background in calculus and linear algebra, and an introduction to
the main ideas of differential geometry are given in:
– Barrett O’Neill: Elementary Differential Geometry, 2nd Edition. Elsevier, 1997,
2006.
For linear algebra and multilinear algebra, we recommend Strang and Greub, respec-
tively:
– Gilbert Strang: Introduction to Linear Algebra, 5th Edition. Wellesley–Cambridge
Press, 2016.
– Werner Greub: Multilinear Algebra, 2nd Edition. Springer, 1978.
The most important facts about a large number of curvilinear coordinate systems can
be found in:
– Parry Moon, Domina Eberle Spencer: Field Theory Handbook. Springer, 1988.
Readers who would like to learn more about the representation theory, introduced
in Chapter 5, are encouraged to consult the specialized literature on the subject. We
recommend:
– Jean-Paul Boehler (ed.): Application of Tensor Functions in Solid Mechanics. CISM
Courses and Lectures, No. 292. Springer, 1987.
– A. J. M. Spencer: Theory of invariants. In: A. Cemal Eringen (ed.): Continuum
Physics, Vol. I. Academic Press, 1971.
For problems that go beyond the scope of this book, we refer to the following books:
– A. M. Goodbody: Cartesian Tensors. Chichester: Ellis Horwood Limited, 1982.
– Jerald L. Ericksen: Tensor fields (Appendix to: Clifford Truesdell, R. A. Toupin: The
Classical Field Theories). In: Handbuch der Physik, Vol. III/1. Springer, 1960.
– Mikhail Itskov: Tensor Algebra and Tensor Analysis for Engineers, 4th Edition.
Springer, 2015.
– David Lovelock, Hanno Rund: Tensors, Differential Forms, and Variational Prin-
ciples. Dover, 1989.
The following book is a very readable introduction to differential forms and their ap-
plications. Differential forms were first introduced by Cartan and they generalize an-
timetric tensors:
– Harley Flanders: Differential Forms with Applications to the Physical Sciences.
Dover, 1989.
Problem 1.2.
A. During substitution we have to ensure two things:
– Substitution must not lead to an index appearing more than twice in a term.
We can avoid this by renaming those summation indices which appear also
in one of the other terms (as free indices or as summation indices).
– The quantity to be substituted (here u) must have the same index in the equa-
tion in which it is inserted and in the equation we use to substitute. If this is
not the case, we must rename one of the two indices.
In this problem it is easiest to rename the index k in φ = uk vk to i: φ = ui vi ; this
gives the solution φ = Aik nk vi .
B. If we want to keep i as a free index, we can write the substitution equations e. g.
as um = Bmj vj and Cmi = pm qi . Substituting then gives wi = pm qi Bmj vj .
If we rename in the third term j to i and i to j, we see that the first and the third
terms are equal, so we get
Problem 1.3.
A. 1. Yes (i is a free index on the left and on the right).
2. No (no free index on the left, i and j are free indices on the right).
3. No (no free index on the left, i is a free index on the right).
4. Yes (i is a free index on the left and on the right).
5. Yes (m and n are free indices on the left and on the right).
6. No (i and k are free indices on the left, i and j are free indices on the right).
7. Yes (i and k are free indices on the left and on the right).
8. Yes (n is a free index on the left and on the right).
9. Yes (no free index on the left and on the right).
10. Yes (i is a free index on the left and on the right).
B. 1. A1 (B11 + B22 ) = C1 (D11 + D22 ), A2 (B11 + B22 ) = C2 (D11 + D22 ).
4. A1 B1 = C1 , A2 B2 = C2 .
5. A1 B1 = C11 , A1 B2 = C12 , A2 B1 = C21 , A2 B2 = C22 .
7. We obtain four equations:
A1 B1 = A1 B1 , A1 B2 = A2 B1 , A2 B1 = A1 B2 , A2 B2 = A2 B2 .
The first equation and the last equation are trivial and the other two equations
are equivalent, so that the only nontrivial statement is A1 B2 = A2 B1 .
8. μ11 A1 + μ21 A2 = c1 (F11 + F22 ), μ12 A1 + μ22 A2 = c2 (F11 + F22 ).
9. A = α1 C11 + α2 C22 .
10. A11 = B11 C111 + B12 C221 , A22 = B21 C112 + B22 C222 .
Problem 1.4.
A. We calculate a determinant of order three with a zero element most conveniently
by an expansion with respect to the row (or column) containing the zero. Expan-
sion with respect to the first row gives Δ = 1(−2 + 6) − 1(6 − 1) = −1.
B. A determinant of order larger than three is best transformed into triangular form;
see also Section 1.6.1. (In the following calculation, the elementary row and column
operations performed in each step are noted in parentheses.)

    | 1   2  −4   4 |
Δ = | 3   6   9   0 |
    | −2  8   1   9 |
    | 4   2  −2   0 |

(factor 2 extracted from the second column, factor 3 from the second row)

  = 6 | 1   1  −4   4 |
      | 1   1   3   0 |
      | −2  4   1   9 |
      | 4   1  −2   0 |

(row 2 minus row 1, row 3 plus 2 × row 1, row 4 minus 4 × row 1)

  = 6 | 1   1  −4    4 |
      | 0   0   7   −4 |
      | 0   6  −7   17 |
      | 0  −3  14  −16 |

(rows 2 and 4 interchanged)

  = −6 | 1   1  −4    4 |
       | 0  −3  14  −16 |
       | 0   6  −7   17 |
       | 0   0   7   −4 |

(row 3 plus 2 × row 2, then factor 3 extracted from row 3)

  = −18 | 1   1  −4    4 |
        | 0  −3  14  −16 |
        | 0   0   7   −5 |
        | 0   0   7   −4 |

(row 4 minus row 3)

  = −18 | 1   1  −4    4 |
        | 0  −3  14  −16 |   = −18 ⋅ 1 ⋅ (−3) ⋅ 7 ⋅ 1 = 378.
        | 0   0   7   −5 |
        | 0   0   0    1 |
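A quick numerical check of the result (not part of the original solution):

```python
import numpy as np

A = np.array([[1, 2, -4, 4],
              [3, 6, 9, 0],
              [-2, 8, 1, 9],
              [4, 2, -2, 0]])
print(np.linalg.det(A))   # -> 378.0 (up to rounding)
```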
Problem 1.5.
A. δii = (δ11 , δ22 , δ33 , δ44 , δ55 ) = (1, 1, 1, 1, 1).
B. δii = δ11 + δ22 + δ33 + δ44 + δ55 = 5.
Problem 1.6.
A. δij δjk = δik .
B. δi2 δik is equal to δk2 or δ2k , depending on whether we substitute the i in the first
δ by the k in the second δ, or if we substitute the i in the second δ by the 2 in the
first δ; since δk2 = δ2k , the result is the same: δk2 δ3k = δ32 = 0.
C. δ1k Ak = A1 .
D. δi2 δjk Aij = A2k .
Problem 1.7.
A. Following the arguments from Section 1.4.2, No. 2, we get:
1. for N = 2: δ^12_12 = δ^21_21 = 1, δ^12_21 = δ^21_12 = −1,
2. for N = 3: δ^12_12 = δ^21_21 = δ^13_13 = δ^31_31 = δ^23_23 = δ^32_32 = 1,
   δ^12_21 = δ^21_12 = δ^13_31 = δ^31_13 = δ^23_32 = δ^32_23 = −1.
B. δ^ij_ij is the sum of all elements with identical upper and lower indices and the solution to the sub-problem A 2 shows that this sum is six. We obtain the same result using definition (1.19):

   δ^ij_ij = | δii  δij |  = δii δjj − δij δji = 3 ⋅ 3 − 3 = 6,
             | δji  δjj |

since δij δji = δii = 3.
Problem 1.8.
Problem 1.9.
The εijk are different from zero only if all three indices are different. If only one index is
a free index and if we have chosen a number for this index, only two elements, of the
nine possible elements with this free index, are nonzero. This is used in the following.
A. vi = εijk ωj rk :
v1 = ε123 ω2 r3 + ε132 ω3 r2 = ω2 r3 − ω3 r2 ,
v2 = ε231 ω3 r1 + ε213 ω1 r3 = ω3 r1 − ω1 r3 ,
v3 = ε312 ω1 r2 + ε321 ω2 r1 = ω1 r2 − ω2 r1 .
These three equations are usually summarized as the vector product of ω and r:
v = ω × r.
𝜕v
B. Ωj = εijk i :
𝜕xk
𝜕v 𝜕v 𝜕v 𝜕v
Ω1 = ε312 3 + ε213 2 = 3 − 2 ,
𝜕x2 𝜕x3 𝜕x2 𝜕x3
𝜕v1 𝜕v 𝜕v 𝜕v
Ω2 = ε123 + ε321 3 = 1 − 3 ,
𝜕x3 𝜕x1 𝜕x3 𝜕x1
𝜕v2 𝜕v1 𝜕v2 𝜕v1
Ω3 = ε231 + ε132 = − .
𝜕x1 𝜕x2 𝜕x1 𝜕x2
These three equations are usually summarized as the curl of v: Ω = curl v.
Problem 1.10.
We compute the inverse of the matrix using the Gauss–Jordan algorithm (Section 1.6.2).
We have
A.
   1 0  2 |  1 0 0
   2 1  3 |  0 1 0
   1 1  2 |  0 0 1

(row 2 minus 2 × row 1, row 3 minus row 1)

   1 0  2 |  1 0 0
   0 1 −1 | −2 1 0
   0 1  0 | −1 0 1

(row 3 minus row 2)

   1 0  2 |  1  0 0
   0 1 −1 | −2  1 0
   0 0  1 |  1 −1 1

(row 1 minus 2 × row 3, row 2 plus row 3)

   1 0  0 | −1  2 −2
   0 1  0 | −1  0  1
   0 0  1 |  1 −1  1

Check:

   ( 1 0 2 ) ( −1  2 −2 )   ( 1 0 0 )
   ( 2 1 3 ) ( −1  0  1 ) = ( 0 1 0 ) .
   ( 1 1 2 ) (  1 −1  1 )   ( 0 0 1 )
B.
0 1 1 −1 1 0 0 0 ↑
1 2 −1 1 0 1 0 0 ↓
1 3 −1 0 0 0 1 0
−1 −2 1 −2 0 0 0 1
1 2 −1 1 0 1 0 0 −1 1
0 1 1 −1 1 0 0 0
1 3 −1 0 0 0 1 0 ←
−1 −2 1 −2 0 0 0 1 ←
1 2 −1 1 0 1 0 0 ←
0 1 1 −1 1 0 0 0 −2 −1
0 1 0 −1 0 −1 1 0 ←
0 0 0 −1 0 1 0 1
1 0 −3 3 −2 1 0 0
0 1 1 −1 1 0 0 0
0 0 −1 0 −1 −1 1 0 −1
0 0 0 −1 0 1 0 1
1 0 −3 3 −2 1 0 0 ←
0 1 1 −1 1 0 0 0 ←
0 0 1 0 1 1 −1 0 3 −1
0 0 0 −1 0 1 0 1 −1
1 0 0 3 1 4 −3 0 ←
0 1 0 −1 0 −1 1 0 ←
0 0 1 0 1 1 −1 0
0 0 0 1 0 −1 0 −1 −3 1
1 0 0 0 1 7 −3 3
0 1 0 0 0 −2 1 −1
0 0 1 0 1 1 −1 0
0 0 0 1 0 −1 0 −1
Check:
1 7 −3 3
0 −2 1 −1
1 1 −1 0
0 −1 0 −1
0 1 1 −1 1 0 0 0
1 2 −1 1 0 1 0 0
1 3 −1 0 0 0 1 0
−1 −2 1 −2 0 0 0 1
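A quick numerical check of both inverses (not part of the original solution):

```python
import numpy as np

A = np.array([[1, 0, 2], [2, 1, 3], [1, 1, 2]])
A_inv = np.array([[-1, 2, -2], [-1, 0, 1], [1, -1, 1]])
B = np.array([[0, 1, 1, -1], [1, 2, -1, 1], [1, 3, -1, 0], [-1, -2, 1, -2]])
B_inv = np.array([[1, 7, -3, 3], [0, -2, 1, -1], [1, 1, -1, 0], [0, -1, 0, -1]])
assert np.allclose(A @ A_inv, np.eye(3)) and np.allclose(B @ B_inv, np.eye(4))
print("both inverses verified")
```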
Problem 1.11.
A. We have several options (as is often the case with proofs). The easiest way is to
transpose both sides of (1.54), according to (1.46). We consider each i, j-element of
the two matrices in the following equation:
T
(ATij )
(−1)
ij ) = Aji ,
(A(−1) = Aji(−1) .
(−1)
The right sides are equal and we establish the claim by equating the left sides.
B. The usual calculation rules (commutativity, associativity, distributivity) clearly
hold for calculations between elements, so we prove the claim for the elements
(we can also say: in element notation).
We start with the definition of the inverse in the form (1.51):
Akn
Multiplication by A(−1)
np gives
δkp
Problem 1.12.
Here we use the algorithm from Section 1.6.3 to find the rank. We have
A.
0 3 6 9 −3 : 3
0 0 2 −4 8 :2
1 2 −3 4 5
1 5 5 9 10
0 1 2 3 −1 ←
0 0 1 −2 4 ←
1 2 −3 4 5 ←
1 5 5 9 10
1 2 −3 4 5 −1
0 1 2 3 −1
0 0 1 −2 4
1 5 5 9 10 ←
1 2 −3 4 5 → Zeros
0 1 2 3 −1 −3
0 0 1 −2 4
0 3 8 5 5 ←
1 0 0 0 0
0 1 2 3 −1 → Zeros
0 0 1 −2 4 −2
0 0 2 −4 8 ←
1 0 0 0 0
0 1 0 0 0
0 0 1 −2 4 → Zeros
0 0 0 0 0
1 0 0 0 0 The matrix
0 1 0 0 0 has rank 3.
0 0 1 0 0
0 0 0 0 0
B.
1 −3 2 0 −4 2
4 −11 10 −1 ←
−2 8 −5 3 ←
1 −3 2 0 → Zeros
0 1 2 −1 −2
0 2 −1 3 ←
1 0 0 0
0 1 2 −1 → Zeros
0 0 −5 5 : (−5)
1 0 0 0
0 1 0 0
0 0 1 −1 → Zeros
1 0 0 0 The matrix
0 1 0 0 has rank 3.
0 0 1 0
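A quick numerical check of both ranks (not part of the original solution):

```python
import numpy as np

A = np.array([[0, 3, 6, 9, -3], [0, 0, 2, -4, 8], [1, 2, -3, 4, 5], [1, 5, 5, 9, 10]])
B = np.array([[1, -3, 2, 0], [4, -11, 10, -1], [-2, 8, -5, 3]])
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B))   # -> 3 3
```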
Problem 1.13.
Reflexivity: For A∼ = B∼ we can take S∼ = T∼ = I∼: Aij = δim Amn δnj .
Symmetry: From Aij = Sim Bmn Tnj it follows that

S(−1)pi Aij T(−1)jq = S(−1)pi Sim Bmn Tnj T(−1)jq = δpm Bmn δnq = Bpq .

Transitivity: From Aij = Sim Bmn Tnj , Bmn = Pmr Crs Qsn it follows that

Aij = Sim Pmr Crs Qsn Tnj = Uir Crs Vsj with Uir = Sim Pmr , Vsj = Qsn Tnj .

Here, U∼ and V∼ are the products of two regular matrices and hence they are, according to (1.13), also regular.
Problem 2.1.
Assumption:  Ai = Tij Bj , (2.13) =: (a),
             Ãi = T̃ij B̃j , (2.14) =: (b),
             Tij = αim αjn T̃mn , (2.15) =: (c),
             Bj = αjk B̃k , (2.11) =: (d).
Claim:       Ai = αim Ãm .
Proof:       Ai = Tij Bj = αim αjn T̃mn αjk B̃k = αim δnk T̃mn B̃k = αim T̃mk B̃k = αim Ãm ,
what was to be shown; we used (a), (c), (d), the orthogonality relation (2.6) αjn αjk = δnk , the substitution rule (1.17), and (b).
Problem 2.2.
A.
1 4 2
aTij = (0 −3 −4) .
2 6 5
B.
1 0 2 1 4 2
1[ ]
a(ij) = [(4 −3 6 ) + (0 −3 −4)]
2
[ 2 −4 5 2 6 5 ]
2 4 4 1 2 2
1
= (4 −6 2) = (2 −3 1) ,
2
4 2 10 2 1 5
1 0 2 1 4 2
1[ ]
a[ij] = [(4 −3 6 ) − (0 −3 −4)]
2
[ 2 −4 5 2 6 5 ]
0 −4 0 0 −2 0
1
= (4 0 10) = ( 2 0 5) .
2
0 −10 0 0 −5 0
1 2 2 0 −2 0 1 0 2
(2 −3 1) + ( 2 0 5) = (4 −3 6) .
2 1 5 0 −5 0 2 −4 5
Problem 2.3.
−4 2 10 −4 −6 −8
ab =
̂ (−6 3 15) , ba =
̂ ( 2 3 4) .
−8 4 20 10 15 20
Problem 2.4.
We cannot translate the sub-problems A, B, and C into matrix notation, because the
order of their tensors is greater than 2. Before translating into symbolic notation, we
need to rewrite each equation so that the order of the (free) indices is the same in all
terms; this is possible in all cases (B, C, and D) (otherwise translation into symbolic
notation would not be possible).
A. a b c = d ⇐⇒ aij bk cl = dijkl .
B. ai bkl cj = dijkl ,
ai cj bkl = dijkl ⇐⇒ a c b = d.
C. αi bkj = Aijk ,
αi bTjk = Aijk ⇐⇒ α bT = A.
D. aTmn = bnm , anm = bnm ⇐⇒ a = b ⇐⇒ a∼ = b∼ .
Problem 2.5.
A. a ⋅ b = c ⇐⇒ aij bjk = cik ⇐⇒ a
∼b∼ = c.
∼
B. b ⋅ a = c ⇐⇒ bij ajk = cik ⇐⇒ b
∼ ∼ c.
a = ∼
C. aik bij = cjk :
We have two ways to rewrite the left side, so that it can be translated:
aik bij = bTji aik and aik bij = aTki bij .
They lead to different translations:
– bTji aik = cjk can be translated immediately, i. e.
bTji aik = cjk ⇐⇒ bT ⋅ a = c ⇐⇒ b T
∼ a ∼ = c.
∼
– In aTki bij = cjk we first set cjk = ckj
T
, so
aTki bij = ckj
T
⇐⇒ aT ⋅ b = cT ⇐⇒ a T
∼ b
T
∼ = c∼ .
Both translations are correct; we will show later that they can be converted into
each other.
D. ai bk ai ck = d,
T T
ai ai bk ck = d ⇐⇒ a⋅ab⋅c =d ⇐⇒ (a∼ a)
∼ (b∼ c)∼ = d.
E. This equation cannot be translated into symbolic notation, because symbolic no-
tation is usually not defined for isomers of tensors of order 3.
F. (a ⋅ b)T = c ⇐⇒ (aij bjk )T = cik ⇐⇒ (a
∼ b)
T
∼ = c. ∼
G. (a ⋅ b ⋅ c)T = d ⇐⇒ (aij bjk ckl )T = dil ⇐⇒ (a
∼b∼ c)
T
∼ = d.
∼
Problem 2.6.
In the following forms the equation can be translated:
are equivalent.
are equivalent.
So in all three cases parentheses need not to be set; all five possible ways to inter-
pret the expression lead to the same result.
Problem 2.7.
A. (a b)T is translated to (ai bj )T , which gives transposed (ai bj )T = aj bi . To make
the order of the free indices on both sides equal, we change on the right side the
order of the factors aj bi = bi aj , which gives (ai bj )T = bi aj . This equation can
be translated and we get (a b)T = b a, Q. E. D.
B. Translating a ⋅ b gives ai bij . Now we have ai bij = bTji ai , which translates into
a ⋅ b = bT ⋅ a, Q. E. D.
C. Translating (a ⋅ b ⋅ c)T gives (aij bjk ckl )T . Now we have (aij bjk ckl )T = alj bjk cki =
T T T
cki bjk alj = cik bkj ajl . Translation of the very left and the very right terms gives
(a ⋅ b ⋅ c)T = cT ⋅ bT ⋅ aT , Q. E. D.
Problem 2.8.
A. aij bij = c ⇐⇒ a ⋅⋅ b = c.
B. aij bji = c, aij bTij = c ⇐⇒ a ⋅⋅ bT = c.
C. aij bkjl cik = dl :
T
Option 2: cki aij bkjl = dl ⇐⇒ (cT ⋅ a) ⋅⋅ b = d.
In this translation we need the parentheses, because cT ⋅ (a ⋅ ⋅ b) = d is also
T
defined, but gives something different, i. e. cki ajl bjli = dk .
D. aij bikl cklj = d ⇐⇒ a ⋅⋅ (b ⋅⋅ c) = d.
Again, we do not need parentheses here, because (a ⋅⋅ b) ⋅⋅ c is not defined, but
we still recommend them.
Problem 2.9.
A. tr δ = δii = δ11 + δ22 + δ33 = 3.
(1.17)
B. δij δjk = δik , so we have δ ⋅ δ = δ.
C. δij δij = (δ11 )2 + (δ12 )2 + (δ13 )2 + (δ21 )2 + ⋅ ⋅ ⋅ + (δ33 )2 = 3.
Problem 2.10.
A. aij bij = bij aij can be translated into a ⋅⋅ b = b ⋅⋅ a; therefore the double scalar
product of two tensors of order two is always commutative.
1 2 9
B. If we choose nine different tensors bij , bij , . . . , bij for bij , we get from aij bij = 0
a homogeneous system of linear equations for the nine unknown aij , i. e.
1 1 1
a11 b11 + a12 b12 + ⋅ ⋅ ⋅ + a33 b33 = 0,
2 2 2
a11 b11 + a12 b12 + ⋅ ⋅ ⋅ + a33 b33 = 0,
..
.
9 9 9
a11 b11 + a12 b12 + ⋅ ⋅ ⋅ + a33 b33 = 0.
Clearly, we can cancel out the bij from aij bij = 0, if and only if the system of equa-
k
tions has only the trivial solution, i. e. if the nine tensors bij , i, j = 1, 2, . . . , 3, k =
1, . . . , 9 are linearly independent. Since every tensor bij can be written as a lin-
k
ear combination of nine linearly independent tensors bij , we can cancel out bij if
and only if bij can take arbitrary values, and the criterion for this is that we can
substitute nine linearly independent tensors.
Problem 2.11.
A. a(ij) b[ij] = (1/2)(aij + aji ) ⋅ (1/2)(bij − bji )     (using (2.26))
   = (1/4)(aij bij + aji bij − aij bji − aji bji ).
We have aij bij = aji bji and aji bij = aij bji , as we can see immediately, after
renaming in both equations on the right side the summation index i to j and the
summation index j to i, i. e. the bracket is zero, which is what we wanted to prove.
B. We want to show that, for arbitrary tensors a(ij) , it follows from a(ij) bij = 0 that
bij is antimetric. It can be shown analogously that, for arbitrary values of b[ij] , it
follows from aij b[ij] = 0 that aij is symmetric.
We can prove this analogously to Part A of this problem: by assumption, we
have
1
(a + aji ) bij = 0, aij bij + aji bij = 0,
2 ij
where aij can take arbitrary values. Renaming the indices in the second term gives
Since aij can take any value, we can cancel out aij in the last equation and it follows
that bij + bji = 0, i. e. bij is antimetric.
C. For i = 1, it follows that ε123 a23 + ε132 a32 = 0, a23 − a32 = 0, a23 = a32 ; for i = 2
and i = 3, it correspondingly follows that a31 = a13 , a12 = a21 . This proves the
statement.
i
D. From six linearly independent symmetric tensors Amn , i = 1, . . . , 6, we can con-
i
struct any symmetric tensor Bmn by a linear combination αi Amn = Bmn . Because of
i
the symmetry with respect to m and n, only six of the nine equations αi Amn = Bmn
i
are different. For given Amn and Bmn , they form a system of linear equations for the
i
six αi . Because the Amn are linearly independent, the coefficient matrix of these
equations is regular and the solution to the system of equations is unique.
Problem 2.12.
A. (a × b) c =
̂ εijk aj bk cl .
a × (b c) =
̂ εijk aj bk cl .
Both readings are defined and mean the same thing.
B. (a × b) ⋅ c =
̂ εijk aj bk ci .
The expression a × (b ⋅ c) is not defined.
C. (a ⋅ b) × c =
̂ εijk ajm bm ck .
a ⋅ (b × c) =
̂ ajm εmik bi ck = εmik ajm bi ck .
Both readings are defined and mean something different.
D. (a ⋅ b) × c =
̂ εijk am bmj ck .
a ⋅ (b × c) =
̂ − ai bij εjkl cl = εkjl ai bij cl .
Both readings are defined and mean the same thing.
Problem 2.13.
A. εijk am bj εmin ck dn
= εijk bj ck εinm dn am ̂ (b × c) ⋅ (d × a)
=
= εijk εinm bj ck dn am
= (δjn δkm − δjm δkn ) bj ck dn am
= bj ck dj ak − bj ck dk aj
= bj dj ck ak − bj aj ck dk ̂ b ⋅ d c ⋅ a − b ⋅ a c ⋅ d,
=
(b × c) ⋅ (d × a) = b ⋅ d c ⋅ a − b ⋅ a c ⋅ d.
Here and in the following examples, other formulations are also permissible in
symbolic notation, e. g. here we can write for the left side (c × b) ⋅ (a × d).
B. εijk aj bkm cm is a vector product with i as the only free index.
The indices i and q in εqpi are summation indices, so they belong to a vector
product with the above vector product as one of its factors. We set the free index
p as first index, so we have
C. a × [(b ⋅ c) × d] =
̂ εijk aj εkpq bpm cm dq
= εijk εpqk aj bpm cm dq
= (δip δjq − δiq δjp ) aj bpm cm dq
= aj bim cm dj − aj bjm cm di
= dj aj bim cm − di aj bjm cm =
̂ d ⋅ a b ⋅ c − d a ⋅ b ⋅ c,
a × [(b ⋅ c) × d] = d ⋅ a b ⋅ c − d a ⋅ b ⋅ c.
D. a × (b × c) =
̂ εijk aj εkmn bm cn
= εijk εmnk aj bm cn
= (δim δjn − δin δjm ) aj bm cn
= an bi cn − am bm ci
= an cn bi − am bm ci =
̂ a ⋅ c b − a ⋅ b c,
a × (b × c) = a ⋅ c b − a ⋅ b c.
E. The terms B and C are both of the form a × b ⋅ c × d and they differ only by the order
in which the two vector products have to be evaluated, in other words, by setting
the parentheses. The expressions are different; therefore the associative law does
not hold.
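Both the expansion formula from D and the failure of associativity noted in E are easy to confirm numerically (a sketch, not part of the original solution; arbitrary sample vectors):

```python
import numpy as np

rng = np.random.default_rng(5)
a, b, c = rng.random(3), rng.random(3), rng.random(3)
# a x (b x c) = (a.c) b - (a.b) c
assert np.allclose(np.cross(a, np.cross(b, c)), (a @ c) * b - (a @ b) * c)
# (a x b) x c and a x (b x c) generally differ:
print(np.allclose(np.cross(np.cross(a, b), c), np.cross(a, np.cross(b, c))))  # False
```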
Problem 2.14.
For translation, the order of the free indices must be the same in all terms:
ai bk − bi ak = εikj εjmn am bn ,
a b − b a = ε ⋅ (a × b).
Problem 2.15.
A. div grad a = b ⇐⇒ 𝜕/𝜕xi (𝜕amn /𝜕xi ) = 𝜕2 amn /𝜕xi 𝜕xi = bmn .
B. grad div a = c ⇐⇒ 𝜕/𝜕xi (𝜕amn /𝜕xn ) = 𝜕2 amn /𝜕xi 𝜕xn = cmi .
C. (𝜕ai /𝜕xj )(𝜕bj /𝜕xk ) = cik ⇐⇒ (grad a) ⋅ (grad b) = c.
D. (𝜕ai /𝜕xj )(𝜕bi /𝜕xj ) = c ⇐⇒ (grad a) ⋅⋅ (grad b) = c.
Problem 2.16.
A. grad x =̂ 𝜕xi /𝜕xj = δij ⇐⇒ grad x = δ.
B. div x =̂ 𝜕xi /𝜕xi = 3 ⇐⇒ div x = 3.
C. curl x =̂ (𝜕xi /𝜕xk ) εijk = δik εijk = 0 ⇐⇒ curl x = 0.
Problem 2.17.
𝜕λ ai 𝜕a 𝜕λ
A. div (λ a) =
̂ = λ i + a,
𝜕xi 𝜕xi 𝜕xi i
div (λ a) = λ div a + (grad λ) ⋅ a.
𝜕λ ai 𝜕a 𝜕λ
B. curl (λ a) =
̂ ε = λ i εijk + ε a,
𝜕xk ijk 𝜕xk 𝜕xk ijk i
curl (λ a) = λ curl a + (grad λ) × a.
𝜕 𝜕a 𝜕2 a (2.40)
C. curl grad a =
̂ εijk = ε = 0,
𝜕xk 𝜕xi 𝜕xk 𝜕xi ijk
curl grad a = 0.
𝜕 𝜕ai 𝜕2 ai (2.40)
D. div curl a =
̂ εijk = ε = 0,
𝜕xj 𝜕xk 𝜕xj 𝜕xk ijk
div curl a = 0.
𝜕 𝜕am 𝜕2 am
E. curl curl a =
̂ ( εmin ) εijk = εinm εijk
𝜕xk 𝜕xn 𝜕xk 𝜕xn
𝜕2 am 𝜕2 ak 𝜕2 aj
= (δnj δmk − δnk δmj ) = −
𝜕xk 𝜕xn 𝜕xk 𝜕xj 𝜕xk 𝜕xk
𝜕 𝜕ak 𝜕 𝜕aj
= − ,
𝜕xj 𝜕xk 𝜕xk 𝜕xk
curl curl a = grad div a − div grad a.
Problem 2.18.
W = ∫ F ⋅ dx = ∫ (Fx dx + Fy dy + Fz dz)
  = ∫ [−λ x dx − λ y dy − (λ z + m g) dz]
       2π
  = − ∫ [λ x (dx/dφ) + λ y (dy/dφ) + (λ z + m g)(dz/dφ)] dφ.
       0
We substitute the parametric representation of the helix and obtain, after elementary
calculations,
W = −(1/2) λ h2 − m g h.

The work of the external forces exerted on the mass point is −(1/2) λ h2 − m g h, i. e. the
mass point must do the work (1/2) λ h2 + m g h to move up along the helix against the
external forces.
Problem 2.19.
Fz = ∫ p dAz = ∫ ϱ g (H − z) dAz = ∫∫ ϱ g (H − z(u, v)) (𝜕(x, y)/𝜕(u, v)) du dv.

With 𝜕(x, y)/𝜕(u, v) = R2 sin u cos u and the integration limits 0 ≤ u ≤ π/2, 0 ≤ v ≤ 2π
we obtain

Fz = ϱ g (π R2 H − (2π/3) R3 ).
In mechanics it is shown that the vertical force equals the weight of the volume above
the area; in our case this is the weight of the indicated volume. This volume is the
difference between a circular cylinder with radius R and height H and a hemisphere
with radius R. Using this fact and the corresponding stereometric formulas we can
compute the vertical force without having to solve an integral.
Problem 3.1.
From
4 1 −7
aij = (−3 5 8)
3 2 9
6 0 0 −2 −1 −2 0 2 −5 4 1 −7
(0 6 0) + ( −1 −1 5) + (−2 0 3) = (−3 5 8)
0 0 6 −2 5 3 5 −3 0 3 2 9
â δij + å(ij) + a[ij] = aij .
Problem 3.2.
We can prove the statement as follows:
For second-order polar tensors, we have, according to (2.17),
For a
̃ mn = −a
̃ nm it then follows that
Problem 3.3.
Evaluating (3.8) gives
A1 = (1/2)(ε123 a[23] + ε132 a[32] ) = a23
and the two cyclic permutations, thus
Hence, the coordinates of a vector satisfy the transformation law for certain coordi-
nates of an antimetric tensor.
Problem 3.4.
The given tensor is antimetric.
A. Consequently, its determinant is zero.
B. The cotensor satisfies, according to (3.12), bij = Ai Aj , and from the solution to
Problem 3.3 it follows that (A1 , A2 , A3 ) = (a23 , a31 , a12 ), i. e. (A1 , A2 , A3 ) =
(−1, −2, 3). Thus, we obtain the following coordinates of the cotensor:
1 2 −3
bij = ( 2 4 −6) .
−3 −6 9
C. Since an antimetric tensor is singular, its inverse tensor does not exist.
Problem 3.5.
Let aij be the coordinate matrix of the tensor a in a given coordinate system; then we
have, according to (2.17),
where ã mn is the coordinate matrix of the tensor in another coordinate system. Fur-
thermore, let aij be proper or improper orthogonal. Then we have, according to (1.59),
in both cases
Substituting the first equation for aij and for akj in the second equation gives
αim αjn a
̃ mn αkp αjq a
̃ pq = δik ,
̃ mn αkp δnq a
αim a ̃ pq = δik ,
αim αkp a
̃ mn a
̃ pn = δik .
α im αir α
⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ ⏟⏟⏟⏟⏟α⏟⏟⏟ks⏟⏟ a
⏟⏟⏟kp ̃ mn a
̃ pn = δik αir αks = α kr αks ,
⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟
δmr δps δrs
a
̃ rn a
̃ sn = δrs ,
i. e. the coordinate matrix of the tensor a is orthogonal also in the other coordinate
system (and hence in every coordinate system).
Finally we need to show that a
̃ ij is proper orthogonal if aij is proper orthogonal and
vice versa. To show this, we compute the determinant of the transformation equation
between aij and a ̃ ij and use Theorem 1.13 for the multiplication of determinants. We
have
Since the transformation matrix αij is orthogonal, its determinant is ±1 and it follows
that
i. e. the coordinate matrices aij and ã ij are either both proper orthogonal or both im-
proper orthogonal, which proves the statement.
Problem 3.6.
Orthogonal tensors satisfy A(−1)
ji
= ATji = Aij , so it follows from (1.50) that
Proper orthogonal tensors have det Aij = 1 and hence Bij = Aij ; improper orthogonal
tensors have det Aij = −1 and thus Bij = − Aij .
Problem 3.7.
A. A tensor is single singular, if the rows of its coordinate matrix are coplanar, but
not collinear. For the tensor under consideration, we see that the sum of the first
two rows is the third row and that the second row is not a multiple of the first
row. Hence the three rows are coplanar, but not collinear. If we do not see this, we
need to convince ourselves that the determinant of the coordinate matrix is zero
and that a minor is nonzero.
B. In order to find the null direction, we have to solve the homogeneous system of
linear equations aij Xj = 0. Because the coefficient matrix of this system of equa-
tions is single singular, we can omit one equation and we can choose freely one
unknown, e. g. as 1. If we omit the third equation and set x3 = 1, we obtain
1 −1 1 2
−2 1 −3 ←
X1 − X2 = 1, 1 −1 1 ←
, X = (2, 1, 1),
−2 X1 + X2 = −3, 0 −1 −1 −1 −1
1 0 2
0 1 1
i. e. the null direction (in the given coordinate system) is given by the unit vector
1 1 1
X̊ = ± ( √6, √6, √6) .
3 6 6
C. The columns of the coordinate matrix lie in the image plane, e. g. two linearly
independent vectors in the image plane are
U 1 ⋅ V 1 = U 1 ⋅ U 1 + α U 1 ⋅ U 2 = 0,
U1 ⋅ U1 6
α=− =− = 2,
U1 ⋅ U2 −3
V 1 = (−1, 0, −1).
Problem 3.8.
A. For the tensor to be double singular, for example, the rows of the coordinate ma-
trix must be collinear. We see immediately that the second row is the negative of
the first row and that the third row is three times the first row.
B. In order to solve the system of equations aij Xj = 0, we can omit two equations
and we can freely choose two unknowns in the remaining equation. If we omit the
second and third equations, we obtain for X2 = 1, X3 = 0:
X1 − 1 = 0, X1 = 1
and for X2 = 0, X3 = 1:
X1 + 2 = 0, X1 = −2.
C. The unit vector normal to the null plane is in the direction of the row line, i. e.
1 1 1
Y̊ = ± ( √6, − √6, √6) .
6 6 3
D. The image line is the column line; its direction is given by the unit vector
1 1 3 √
Ů = ± ( √11, − √11, 11) .
11 11 11
Problem 3.9.
A. For an orthogonal basis it immediately follows from
g 1 = 3 e1 , g 2 = 2 e2 , g 3 = e3
g 1 = e1 .
For g 2 we have
1
g 2 ⋅ e3 = 0, g 2 ⋅ g 1 = g 2 ⋅ e1 = cos 60∘ = , g 2 ⋅ g 2 = 1.
2
Using the first two equations we obtain for the third equation
1 1√
g 2 ⋅ g 2 = (g 2 ⋅ e1 )2 + (g 2 ⋅ e2 )2 = + (g 2 ⋅ e2 )2 = 1, g 2 ⋅ e2 = 3,
4 2
1 1
g2 = e + √3 e2 .
2 1 2
Finally, we have for g 3
1 1
g 3 ⋅ g 1 = cos 60∘ = , g 3 ⋅ g 2 = cos 60∘ = , g 3 ⋅ g 3 = 1.
2 2
This implies that
1
g 3 ⋅ e1 = ,
2
1 1 1 1 1 1
g 3 ⋅ g 2 = g 3 ⋅ e1 + √3 g 3 ⋅ e2 = + √3 g 3 ⋅ e2 = , g 3 ⋅ e2 = √3,
2 2 4 2 2 6
1 1
g 3 ⋅ g 3 = (g 3 ⋅ e1 )2 + (g 3 ⋅ e2 )2 + (g 3 ⋅ e3 )2 = + + (g 3 ⋅ e3 )2 = 1,
4 12
1√
g 3 ⋅ e3 = 6,
3
1 1 √ 1
g 3 = e1 + 3 e2 + √6 e3 .
2 6 3
The easiest method to compute the basis reciprocal to g i is to use the Gauss–Jordan
algorithm, i. e.
g1 1 0 0 1 0 0 − 21 − 21
1 1
g2 2 2
√3 0 0 1 0 ←
1 1 1
g3 2 6
√3
3
√6 0 0 1 ←
1 0 0 1 0 0
1
0 2
√3 0 − 21 1 0 − 31 2
√3
= 2
3
√3
1 1
0 6
√3
3
√6 − 21 0 1 ←
1 0 0 1 0 0
0 1 0 − 31 √3 2
3
√3 0
1
0 0 3
√6 − 31 − 31 1 3
√6
= 1
2
√6
1 0 0 1 0 0
0 1 0 − 31 √3 2
3
√3 0
0 0 1 − 61 √6 − 61 √6 1
2
√6
1 2
g g g3
1√ 1
g 1 = e1 − 3 e2 − √6 e3 ,
3 6
2√ 1
g2 = 3 e2 − √6 e3 ,
3 6
1
g 3 = √6 e3 .
2
Problem 3.10.
i
According to (3.30)2 , we have g i hi = aT or g m hn = aTmn = anm ,
i
1 2 3
g1 g1 g1 h1 h2 h3
1 1 1 a11 a21 a31
1 2 3
(g 2 g2 g 2 ) (h2 1 h2
2
h3 ) = (a12
2
a22 a32 ) .
1 2 3
h1 h2 h3 a13 a23 a33
g3 g3 g3 3 3 3
g1 g2 g3
1 3 1 2 4 1 1
0 1 2 3 −2 2
−1 −3 −2 −1 3 1 ←
1 3 1 2 4 1 ←
0 1 2 3 −2 2 −3
0 0 −1 1 7 2
1 0 −5 −7 10 −5 ←
0 1 2 3 −2 2 ←
0 0 −1 1 7 2 −5 2 −1
1 0 0 −12 −25 −15 h1
0 1 0 5 12 6 h2
0 0 1 −1 −7 −2 h3
Thus we get
2 3 −1 −12 0 12 15 5 −15 −1 −2 2
(4 −2 3) = (−25 0 25) + (36 12 −36) + (−7 −14 14) .
1 2 1 −15 0 15 18 6 −18 −2 −4 4
Problem 3.11.
A. In order to be able to represent a with h1 and h2 in this form, h1 and h2 have to lie
in the column plane of a. The easiest way to verify this is to prove that the matrix
formed by h1 , h2 , and two columns of a has rank 2, so
−1 0 1 −1 2 1 −1
2 1 −2 1 ←
1 1 −1 0 ←
1 0 −1 1
0 1 0 −1
0 1 0 −1
Since the last two rows are equal, we can omit one row when we determine the
rank, and since the first two rows are not collinear, the matrix has rank 2.
i
B. We have hi g i = a, hm g n = amn ,
i
i
To determine the g n , the first two rows are sufficient:
h1 h2
−1 0 1 −1 −1 2 −1
2 1 −2 1 3 ←
1 0 −1 1 1 g1
0 1 0 −1 1 g2
1 −1 −1 1 −1 −1 0 0 0
(−2 1 3) = (−2 2 2) + (0 −1 1) .
−1 0 2 −1 1 1 0 −1 1
Problem 3.12.
Determining the eigenvalues:
All coordinate matrices are triangular matrices. Hence, according to Section 3.11.3,
No. 6, the diagonal elements are the eigenvalues: In the cases A to C, the tensor has
the eigenvalue λ = 1 of multiplicity 3. In the cases D to F, the tensor has the eigenvalue
λ = 1 of multiplicity 2 and the eigenvalue λ = 2 of multiplicity 1.
0 1 0 x1 0
(0 0 1) (x2 ) = (0) .
0 0 0 x3 0
The coefficient matrix has rank 2, i. e. according to Section 3.11.4, Theorem 2, only
one eigendirection exists and the tensor is defective. From the equation we obtain
x2 = 0, x3 = 0. We can freely choose x1 , so the eigendirection is given by the unit
vector
x̊ = (1, 0, 0).
0 0 0 x1 0
(0 0 1) (x2 ) = (0) .
0 0 0 x3 0
The coefficient matrix has rank 1, i. e. according to Section 3.11.4, Theorem 3, two
distinct eigendirections exist and the tensor is again defective. Here we obtain
from the equation x3 = 0. We can freely choose x1 and x2 , so two eigendirections
are, for example, given by the unit vectors
0 0 0 x1 0
(0 0 0) (x2 ) = (0) .
0 0 0 x3 0
The coefficient matrix has rank 0, i. e. according to Section 3.11.4, Theorem 4, three
linearly independent eigendirections exist, or in other words, every direction is an
eigendirection. The tensor is nondefective and we can also find three mutually or-
thogonal eigenvectors. Accordingly, the equation does not provide any restriction
for the coordinates of the eigenvectors and a set of mutually orthogonal eigenvec-
tors is, for example,
0 1 0 x1 0
(0 0 1) (x2 ) = (0) .
0 0 1 x3 0
The coefficient matrix has rank 2, only one eigendirection corresponds to the
eigenvalue with multiplicity two, and the tensor is defective. From the equation
we obtain x2 = 0, x3 = 0; the eigendirection is given by the unit vector
x̊ 1 = (1, 0, 0).
−1 1 0 x1 0
( 0 −1 1) (x2 ) = (0) ,
0 0 0 x3 0
which gives
− x1 + x2 = 0, −x2 + x3 = 0.
0 0 0 x1 0
(0 0 1) (x2 ) = (0) .
0 0 1 x3 0
The coefficient matrix has rank 1, two eigendirections correspond to the eigen-
value with multiplicity two, and the tensor is nondefective. From the equation we
only obtain x3 = 0; two orthogonal eigendirections are, for example, given by the
unit vectors
−1 0 0 x1 0
( 0 −1 1) (x2 ) = (0) ,
0 0 0 x3 0
which gives
− x1 = 0, −x2 + x3 = 0.
0 0 0 x1 0
(0 0 0) (x2 ) = (0) .
0 0 1 x3 0
The coefficient matrix has again rank 1, two eigendirections correspond to the
eigenvalue with multiplicity two, and the tensor is nondefective. From the equa-
tion we obtain only x3 = 0; two orthogonal eigendirections are given by the unit
vectors
−1 0 0 x1 0
( 0 −1 0) (x2 ) = (0) ,
0 0 0 x3 0
which gives
− x1 = 0, −x2 = 0.
x̊ 3 = (0, 0, 1).
Problem 3.13.
A principal axes transformation requires finding three mutually perpendicular eigen-
directions. Such a triple of eigendirections exists only for symmetric tensors. See also
the results of the last problem.
We start by finding the eigenvalues of the tensor. The characteristic equation is
5−λ 0 4
0 9−λ 0 = 0.
4 0 5−λ
λ1 = λ2 = 9, λ3 = 1.
From the eigenvalue equation for the eigenvalue λ = 9 with multiplicity two, we
obtain the orthonormal eigenvectors
q1 = ((1/2)√2, 0, (1/2)√2) and q2 = (0, 1, 0),

and from the eigenvalue equation for the eigenvalue λ = 1, we obtain the normalized
eigenvector

q3 = ((1/2)√2, 0, −(1/2)√2) ,

which is orthogonal to the other two eigenvectors. We still have to check if the three
eigendirections form, in this order, a right-handed system. For this to be true, the value
of the determinant

| (1/2)√2   0    (1/2)√2 |
| 0         1    0       |
| (1/2)√2   0   −(1/2)√2 |

must be positive. This is clearly not the case, i. e. we have to reverse the direction of
the third eigendirection, and thus we obtain as a principal axes system

q1 = ((1/2)√2, 0, (1/2)√2) , q2 = (0, 1, 0), q3 = (−(1/2)√2, 0, (1/2)√2) .

The coordinate matrix of the tensor in this principal axes system is

        ( 9 0 0 )
âij =   ( 0 9 0 ) .
        ( 0 0 1 )
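A quick numerical check of the principal axes transformation (not part of the original solution):

```python
import numpy as np

a = np.array([[5, 0, 4], [0, 9, 0], [4, 0, 5]])
s = np.sqrt(2) / 2
Q = np.array([[s, 0, s], [0, 1, 0], [-s, 0, s]])   # rows: q1, q2, q3
assert np.allclose(Q @ a @ Q.T, np.diag([9.0, 9.0, 1.0]))
assert np.isclose(np.linalg.det(Q), 1)             # right-handed system
print("principal axes transformation verified")
```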
Problem 3.14.
We assume that the generalized eigenvalue problem has a complex eigenvalue and a
complex eigendirection. Then, similarly to Section 3.11.3, No. 2, it follows that
Contraction of the first equation with Yi and of the second equation with Xi and com-
puting the difference gives
a ij Yi Xj − aij Xi Yj = λ (b
⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ ij Yi Xj − bij Xi Yj ) − μ (bij Yi Yj + bij Xi Xj ) .
⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟
0 0
The first two expressions are zero, because aij and bij are symmetric, and what remains
is
μ (bij Yi Yj + bij Xi Xj ) = 0.
Problem 3.15.
Solving the characteristic equation gives
Problem 3.16.
A. The two orthogonality relations yield six equations, which can be solved for the
six unknowns α, β, γ, δ, ε, ζ . Here αki αkj = δij (a statement about the sums of
the squares and the sums of the products of the columns) is more useful than
αik αjk = δij (an analogous statement about the row sums) because in the first case
the six equations are partially decoupled.
From αk1 αk1 = 1 it follows that α = ±1/2.
From αk1 αk2 = 0 and αk2 αk2 = 1 it follows, independently of the sign of α, that
β = ±(1/2)√2, γ = 0.
From αk1 αk3 = 0, αk2 αk3 = 0, and αk3 αk3 = 1 it follows, independently of the
sign of α and β, that
δ = ζ = ±1/2, ε = ±(1/2)√2.
Thus the general form of the transformation matrix is

        (  ±1/2        ±(1/2)√2     ±1/2     )
αij =   (  ∓(1/2)√2     0           ±(1/2)√2 ) ,
        (  ±1/2        ∓(1/2)√2     ±1/2     )
where we can choose in each column either the upper or the lower sign, indepen-
dently of the other columns.
B. Thus the problem has 23 = 8 solutions. We can characterize one of these solu-
tions e. g. by specifying the signs of α, β, and δ, which corresponds to choosing
the upper or the lower sign in the three columns. According to (2.4), the columns
of αij represent the coordinates of the rotated basis vectors ẽj with respect to the
original basis ei ; every sign change of a column can be interpreted as reversing
the orientation of the corresponding basis vector.
Problem 3.17.
To decide whether to use (3.78) or (3.82) to compute the angle of rotation, we first have
to check if the transformation matrix is proper orthogonal or improper orthogonal.
The determinant is +1, so the transformation matrix describes a rotation, and (3.78)
yields cos ϑ = 0, ϑ = ± π2 . We choose ϑ = + π2 . Then sin ϑ = 1, and from (3.79) it
follows after a short computation that
1 1
n = (− √2, 0, − √2).
2 2
The transformation matrix describes a rotation of 90∘ about an axis in the x, z-plane
which forms with both the negative x-axis and the negative z-axis an angle of 45∘ . This
is equivalent to a rotation of −90∘ about an axis in the x, z-plane which forms with both
the positive x-axis and the positive z-axis an angle of 45∘ , respectively. This is the case
for ϑ = − π2 , sin ϑ = −1.
Problem 3.18.
We can prove this as follows. Let λ be an eigenvalue and let Xi be an eigenvector of aij
corresponding to λ. Then we have
aij Xj = λ Xi .
Multiplication by a(−1)
ki
gives
a (−1)
aij Xj = λ a(−1)
⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟
ki ki
Xi , Xk = λ a(−1)
ki
Xi , a(−1)
ki
Xi = λ−1 Xk , Q. E. D.
δkj
Problem 3.19.
We can prove this by mathematical induction:
Basic step: n = 0.
For n = 0, according to (3.87), this statement is satisfied.
Inductive step:
Assumption: For n = k we have (Ak )T = (AT )k . (a)
Proof of statement 1:
(3.86) (2.35) (a)
(Ak+1 )T = (A ⋅ Ak )T = (Ak )T ⋅ AT = (AT )k ⋅ AT
(3.86)
= (AT )k+1 , Q. E. D.
(3.86)
= (AT )k−1 , Q. E. D.
Problem 3.20.
We have U = a ⋅ X and we set V := a ⋅ U. Then we have
W = a3 ⋅ X = a2 ⋅ a ⋅ X = a2 ⋅ U = a ⋅ a
⏟⏟⏟⏟⏟⏟⏟ ⋅ U = a ⋅ V.
⏟⏟⏟⏟⏟⏟⏟
U V
The tensor a rotates (in each case through the angle ϑ) first the vector X into the vec-
tor U, then the vector U into the vector V, and finally the vector V into the vector W,
i. e. a3 rotates the vector X through the angle 3 ϑ into the vector W.
Problem 3.21.
Problem 3.13 shows that the tensor has the following eigenvalues and the following
corresponding normalized eigenvectors, which form a right-handed coordinate sys-
tem:
1 1
λ1 = λ2 = 9, q1 = ( √2, 0, √2) ,
2 2
q2 = (0, 1, 0),
1 1
λ3 = 1, q3 = (− √2, 0, √2) .
2 2
A. Since the original tensor is symmetric and all its eigenvalues are positive, the ten-
sor is positive definite.
B. We can compute the coordinates of the positive definite square root b = √a, using
(3.90), from
k k
bij = √λk qi qj .
1 2 3 1 1 1
√λ1 q1 √λ2 q1 √λ3 q1 q1 q2 q3
(√λ q1 √λ2 q2
2 3
√λ3 q2 ) (q1
2 2
q2
2
q3 )
1 2
1 2 3 3 3 3
√λ1 q3 √λ2 q3 √λ3 q3 q1 q2 q3
or in the form
1 2
q1 q1
1 1 1 2 2 2
bij = √λ1 (q1 ) (q1 , q2 , q3 ) + √λ2 (q2 ) (q1 , q2 , q3 )
2 2
1 2
q3 q3
3
q1
3 3 3
+ √λ3 (q3 ) (q1 , q2 , q3 ) .
2
3
q3
2 0 1
bij = (0 3 0) .
1 0 2
We can verify this result by showing that the square of bij is aij .
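Such a check can be done numerically (not part of the original solution):

```python
import numpy as np

a = np.array([[5, 0, 4], [0, 9, 0], [4, 0, 5]])
b = np.array([[2, 0, 1], [0, 3, 0], [1, 0, 2]])
assert np.allclose(b @ b, a)
print("b is the positive definite square root of a")
```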
Problem 3.22.
Straightforward computations give
9 0 0 3 0 0
aik aTkj = (0 9 0) , Vij = √aik aTkj = (0 3 0) .
0 0 9 0 0 3
        ( 2/3    2/3    1/3 )
Rij =   ( 1/3   −2/3    2/3 ) ,
        ( 2/3   −1/3   −2/3 )
Problem 4.1.
A. We see from the figure that
OB = x1 , OA = OC cos α = u1 cos α,
BP = x2 , AB = CP sin β = u2 sin β,
OC = u1 , BD = OC sin α = u1 sin α,
CP = u2 , DP = CP cos β = u2 cos β,
x1 = u1 cos α + u2 sin β,
x2 = u1 sin α + u2 cos β,
x1 cos β − x2 sin β = u1 (cos α cos β − sin α sin β) = u1 cos(α + β),
−x1 sin α + x2 cos α = u2 (− sin α sin β + cos α cos β) = u2 cos(α + β),

u1 = (cos β/cos(α + β)) x1 − (sin β/cos(α + β)) x2 ,
u2 = −(sin α/cos(α + β)) x1 + (cos α/cos(α + β)) x2 .
In this case the covariant basis vectors are unit vectors, but the contravariant basis
vectors are not.
Problem 4.2.
𝜕x 𝜕x 𝜕x
x = R cos φ: = cos φ, = −R sin φ, = 0;
𝜕R 𝜕φ 𝜕z
𝜕y 𝜕y 𝜕y
y = R sin φ: = sin φ, = R cos φ, = 0;
𝜕R 𝜕φ 𝜕z
𝜕z 𝜕z 𝜕z
z = z: = 0, = 0, = 1;
𝜕R 𝜕φ 𝜕z
$$R = \sqrt{x^2+y^2}: \quad \frac{\partial R}{\partial x} = \frac{2x}{2\sqrt{x^2+y^2}} = \frac{x}{R} = \cos\varphi, \quad \frac{\partial R}{\partial y} = \frac{2y}{2\sqrt{x^2+y^2}} = \frac{y}{R} = \sin\varphi, \quad \frac{\partial R}{\partial z} = 0;$$
$$\varphi = \arctan\frac{y}{x}: \quad \frac{\partial \varphi}{\partial x} = \frac{1}{1+\left(\frac{y}{x}\right)^2}\left(-\frac{y}{x^2}\right) = -\frac{x^2}{x^2+y^2}\,\frac{y}{x^2} = -\frac{y}{R^2} = -\frac{1}{R}\sin\varphi,$$
$$\frac{\partial \varphi}{\partial y} = \frac{1}{1+\left(\frac{y}{x}\right)^2}\,\frac{1}{x} = \frac{x^2}{x^2+y^2}\,\frac{1}{x} = \frac{x}{R^2} = \frac{1}{R}\cos\varphi, \qquad \frac{\partial \varphi}{\partial z} = 0;$$
The Cartesian coordinates of the covariant basis vectors, $\overset{j}{g}{}_i = \dfrac{\partial x_j}{\partial u^i}$, are
$$g_1 = \Big(\frac{\partial x}{\partial R}, \frac{\partial y}{\partial R}, \frac{\partial z}{\partial R}\Big) = (\cos\varphi,\ \sin\varphi,\ 0), \quad g_2 = \Big(\frac{\partial x}{\partial \varphi}, \frac{\partial y}{\partial \varphi}, \frac{\partial z}{\partial \varphi}\Big) = (-R\sin\varphi,\ R\cos\varphi,\ 0), \quad g_3 = \Big(\frac{\partial x}{\partial z}, \frac{\partial y}{\partial z}, \frac{\partial z}{\partial z}\Big) = (0,\ 0,\ 1);$$
and those of the contravariant basis vectors, $\overset{j}{g}{}^i = \dfrac{\partial u^i}{\partial x_j}$, are
$$g^1 = \Big(\frac{\partial R}{\partial x}, \frac{\partial R}{\partial y}, \frac{\partial R}{\partial z}\Big) = (\cos\varphi,\ \sin\varphi,\ 0), \quad g^2 = \Big(\frac{\partial \varphi}{\partial x}, \frac{\partial \varphi}{\partial y}, \frac{\partial \varphi}{\partial z}\Big) = \Big(-\frac{1}{R}\sin\varphi,\ \frac{1}{R}\cos\varphi,\ 0\Big), \quad g^3 = \Big(\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}, \frac{\partial z}{\partial z}\Big) = (0,\ 0,\ 1).$$
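The same derivatives can be generated symbolically. The following sketch (an illustration, not the book's procedure) computes the covariant basis vectors as the columns of the Jacobian matrix and the contravariant basis vectors as the rows of its inverse.

```python
# Cylindrical coordinate bases of Problem 4.2 with sympy (sketch).
import sympy as sp

R, phi, z = sp.symbols('R phi z', positive=True)
x = sp.Matrix([R * sp.cos(phi), R * sp.sin(phi), z])   # Cartesian coordinates x_j(u^i)
u = sp.Matrix([R, phi, z])

J = x.jacobian(u)        # J[j, i] = dx_j/du^i: columns are the covariant basis vectors
Jinv = J.inv()           # Jinv[i, j] = du^i/dx_j: rows are the contravariant basis vectors

print(sp.simplify(J.T))     # rows: g_1, g_2, g_3
print(sp.simplify(Jinv))    # rows: g^1, g^2, g^3
```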
Problem 4.3.
Equation (4.14) relates contravariant coordinates and Cartesian coordinates. We have
$$T^{ij} = g^i \cdot T \cdot g^j = \overset{i}{g}{}^m\, \hat{T}_{mn}\, \overset{j}{g}{}^n.$$
In order to apply this equation, we first have to compute the Cartesian coordinates of
the basis reciprocal to g i . We get, for example using the Gauss–Jordan algorithm,
Arranging the coordinate matrices of $\hat{T}_{mn}$, $\overset{i}{g}{}^m$, and $\overset{j}{g}{}^n$ in a multiplication scheme and carrying out the matrix products, we obtain
$$T^{ij} = \begin{pmatrix} 2 & -6 & 3 \\ -1 & 6 & -2 \\ 1 & -3 & 1 \end{pmatrix}.$$
Problem 4.4.
If we identify the tilde coordinates in (4.19), i. e. in
$$\tilde{a}^i = \frac{\partial \tilde{u}^i}{\partial u^m}\, a^m, \qquad \tilde{a}_i = \frac{\partial u^m}{\partial \tilde{u}^i}\, a_m,$$
with cylindrical coordinates and the no-tilde coordinates with Cartesian coordinates,
and then use the solutions to Problem 4.2, we get
$$a^1 = \frac{\partial R}{\partial x}\, a_x + \frac{\partial R}{\partial y}\, a_y + \frac{\partial R}{\partial z}\, a_z = a_x\cos\varphi + a_y\sin\varphi,$$
$$a^2 = \frac{\partial \varphi}{\partial x}\, a_x + \frac{\partial \varphi}{\partial y}\, a_y + \frac{\partial \varphi}{\partial z}\, a_z = -a_x\,\frac{\sin\varphi}{R} + a_y\,\frac{\cos\varphi}{R},$$
$$a^3 = \frac{\partial z}{\partial x}\, a_x + \frac{\partial z}{\partial y}\, a_y + \frac{\partial z}{\partial z}\, a_z = a_z;$$
$$a_1 = \frac{\partial x}{\partial R}\, a_x + \frac{\partial y}{\partial R}\, a_y + \frac{\partial z}{\partial R}\, a_z = a_x\cos\varphi + a_y\sin\varphi,$$
$$a_2 = \frac{\partial x}{\partial \varphi}\, a_x + \frac{\partial y}{\partial \varphi}\, a_y + \frac{\partial z}{\partial \varphi}\, a_z = -a_x R\sin\varphi + a_y R\cos\varphi,$$
$$a_3 = \frac{\partial x}{\partial z}\, a_x + \frac{\partial y}{\partial z}\, a_y + \frac{\partial z}{\partial z}\, a_z = a_z.$$
Conversely, if we identify the tilde coordinates with Cartesian coordinates and the no-
tilde coordinates with cylindrical coordinates, we get
$$a_x = \frac{\partial x}{\partial R}\, a^1 + \frac{\partial x}{\partial \varphi}\, a^2 + \frac{\partial x}{\partial z}\, a^3 = a^1\cos\varphi - a^2 R\sin\varphi = \frac{\partial R}{\partial x}\, a_1 + \frac{\partial \varphi}{\partial x}\, a_2 + \frac{\partial z}{\partial x}\, a_3 = a_1\cos\varphi - a_2\,\frac{\sin\varphi}{R},$$
$$a_y = \frac{\partial y}{\partial R}\, a^1 + \frac{\partial y}{\partial \varphi}\, a^2 + \frac{\partial y}{\partial z}\, a^3 = a^1\sin\varphi + a^2 R\cos\varphi = \frac{\partial R}{\partial y}\, a_1 + \frac{\partial \varphi}{\partial y}\, a_2 + \frac{\partial z}{\partial y}\, a_3 = a_1\sin\varphi + a_2\,\frac{\cos\varphi}{R},$$
$$a_z = \frac{\partial z}{\partial R}\, a^1 + \frac{\partial z}{\partial \varphi}\, a^2 + \frac{\partial z}{\partial z}\, a^3 = a^3 = \frac{\partial R}{\partial z}\, a_1 + \frac{\partial \varphi}{\partial z}\, a_2 + \frac{\partial z}{\partial z}\, a_3 = a_3.$$
Problem 4.5.
A. With the solutions to Problem 4.1, we obtain
$$A_i = \frac{\partial x_j}{\partial u^i}\, \hat{A}_j \quad\text{gives}\quad A_1 = A_x\cos\alpha + A_y\sin\alpha, \qquad A_2 = A_x\sin\beta + A_y\cos\beta.$$
B.
$A = \overrightarrow{PQ}$:
$$A^1 = JI = PR, \quad A^1 g_1 = \overrightarrow{PR}, \qquad A^2 = HC = RQ, \quad A^2 g_2 = \overrightarrow{RQ},$$
$$A_x = KL = PG = PF + FG, \quad PF = PR\cos\alpha = A^1\cos\alpha, \quad FG = RD = RQ\sin\beta = A^2\sin\beta,$$
$$A_y = EB = GQ = GD + DQ, \quad GD = FR = PR\sin\alpha = A^1\sin\alpha, \quad DQ = RQ\cos\beta = A^2\cos\beta,$$
$$A^1 g_1 = \overrightarrow{PR}, \quad |A^1 g_1| = PR = A^1, \qquad A^2 g_2 = \overrightarrow{RQ}, \quad |A^2 g_2| = RQ = A^2.$$
$A = \overrightarrow{PQ}$:
$$A_x = UV = PN = GQ, \qquad A_y = MF = PG = NQ,$$
$$A_1 = SK = PH = TL = TN + NL, \quad TN = PN\cos\alpha = A_x\cos\alpha, \quad NL = NQ\sin\alpha = A_y\sin\alpha,$$
$$A_2 = JD = PE = IC = IG + GC, \quad IG = PG\cos\beta = A_y\cos\beta, \quad GC = GQ\sin\beta = A_x\sin\beta,$$
$$A_1 g^1 = \overrightarrow{PR}, \quad |A_1 g^1| = PR = \frac{PH}{\cos(\alpha+\beta)} = \frac{A_1}{\cos(\alpha+\beta)}, \qquad A_2 g^2 = \overrightarrow{RQ} = \overrightarrow{PB}, \quad |A_2 g^2| = PB = \frac{PE}{\cos(\alpha+\beta)} = \frac{A_2}{\cos(\alpha+\beta)}.$$
Problem 4.6.
From the solution to Problem 4.2 we know
$$g_{11} = 1, \quad g_{22} = R^2, \quad g_{33} = 1, \qquad g^{11} = 1, \quad g^{22} = \frac{1}{R^2}, \quad g^{33} = 1.$$
The off-diagonal elements are all zero, because cylindrical coordinates are orthogonal.
Problem 4.7.
According to (4.19), the transformation equations are
$$\tilde{g}_{ij} = \frac{\partial u^m}{\partial \tilde{u}^i}\frac{\partial u^n}{\partial \tilde{u}^j}\, g_{mn},$$
$$\det(\tilde{g}_{ij}) = \det\Big(\frac{\partial u^m}{\partial \tilde{u}^i}\Big)\det\Big(\frac{\partial u^n}{\partial \tilde{u}^j}\Big)\det(g_{mn}),$$
$$\tilde{g} = \bigg[\frac{\partial(u^1, u^2, u^3)}{\partial(\tilde{u}^1, \tilde{u}^2, \tilde{u}^3)}\bigg]^2\, g.$$
Problem 4.8.
A tensor of order three has 3³ = 27 Cartesian coordinates, each of which can be trans-
lated into 2³ = 8 different holonomic coordinates, so the ε-tensor has 8 ⋅ 27 = 216
holonomic coordinates. Because cylindrical coordinates are orthogonal, all the coor-
dinates of the ε-tensor which have two equal indices are zero, so only 8 ⋅ 6 = 48 coordi-
nates are nonzero.
The easiest way to compute these coordinates is from (4.33):
Since the basis vectors are mutually orthogonal, the magnitudes of the
triple products are equal to the product of the lengths of the vectors spanning the triple
products, and a coordinate is positive or negative, depending on whether the three
vectors of the triple product form a right-handed system or a left-handed system.
Problem 4.9.
$$e^{ijk}\, e^m{}_{nk} = g^{im}\, \delta^j_n - \delta^i_n\, g^{jm}.$$
Problem 4.10.
Possible translations are:
A. $a \cdot b = c \iff a_{ij}\, b^j{}_k = c_{ik}$,
$b \cdot a = c \iff b_{ij}\, a^{jk} = c_i{}^k$,
$\hat{a}_{ik}\, \hat{b}_{ij} = \hat{c}_{jk} \iff a_{ik}\, b^i{}_j = c_{jk}$,
$\hat{a}_i\, \hat{b}_k\, \hat{a}_i\, \hat{c}_k = d \iff a_i\, b^k\, a^i\, c_k = d$,
$\hat{a}_{ijk}\, \hat{b}_i\, \hat{c}_k = \hat{d}_j \iff a_{ijk}\, b^i\, c^k = d_j$,
$\hat{a}_{ijk}\, \hat{b}_i\, \hat{c}_j = \hat{d}_k \iff a_{ijk}\, b^i\, c^j = d_k$,
$(a \cdot b)^T = c \iff (a_{ij}\, b^{jk})^T = c_i{}^k$,
$(a \cdot b \cdot c)^T = d \iff (a_i{}^j\, b_j{}^k\, c_k{}^l)^T = d_i{}^l$.
B. $e^{ijk}\, a_j\, e_{kmn}\, b^m c^n = a_j c^j\, b^i - a_j b^j\, c^i$.
Problem 4.11.
Since cylindrical coordinates are orthogonal, our considerations from Section 4.3,
No. 3 apply here as well.
A. From the solution to Problem 4.2 it follows immediately, after normalizing the ba-
sis vectors, that
g <1> = (cos φ, sin φ, 0), g <2> = (− sin φ, cos φ, 0), g <3> = (0, 0, 1).
B. According to (4.48), we have $a_{<i>} = a^i \sqrt{g_{ii}}$, and with $g_{11} = 1$, $g_{22} = R^2$, $g_{33} = 1$
it follows that $a_{<1>} = a^1$, $a_{<2>} = R\, a^2$, $a_{<3>} = a^3$.
C. Since the g <i> are locally Cartesian and since the coordinates of δ and ε are equal
in all Cartesian coordinate systems, it follows that g <ij> = δij , e<ijk> = εijk .
D. $(a \cdot b)^T = c \iff (a_{<ij>}\, b_{<jk>})^T = c_{<ik>}$,
$a_{ik}\, b^i{}_j = c_{jk} \iff a_{<ik>}\, b_{<ij>} = c_{<jk>}$,
$a \times (b \times c) = a \cdot c\ b - a \cdot b\ c \iff \varepsilon_{ijk}\, a_{<j>}\, \varepsilon_{kmn}\, b_{<m>}\, c_{<n>} = a_{<j>} c_{<j>}\, b_{<i>} - a_{<j>} b_{<j>}\, c_{<i>}$.
Problem 4.12.
The easiest way to compute the Christoffel symbols is by (4.60).
For cylindrical coordinates we get
$$g_{11} = 1, \quad g_{11,i} = 0; \qquad g_{22} = R^2, \quad g_{22,1} = 2R, \quad g_{22,2} = g_{22,3} = 0; \qquad g_{33} = 1, \quad g_{33,i} = 0;$$
$$g^{11} = 1, \qquad g^{22} = \frac{1}{R^2}, \qquad g^{33} = 1.$$
This gives
$$\Gamma^1_{ip} = \tfrac{1}{2}\, g^{11}(g_{1i,p} + g_{1p,i} - g_{ip,1}), \quad \Gamma^2_{ip} = \tfrac{1}{2}\, g^{22}(g_{2i,p} + g_{2p,i} - g_{ip,2}), \quad \Gamma^3_{ip} = \tfrac{1}{2}\, g^{33}(g_{3i,p} + g_{3p,i} - g_{ip,3}),$$
so the only nonvanishing Christoffel symbols are
$$\Gamma^1_{22} = -\tfrac{1}{2}\, g^{11}\, g_{22,1} = -R, \qquad \Gamma^2_{12} = \Gamma^2_{21} = \tfrac{1}{2}\, g^{22}\, g_{22,1} = \frac{1}{R}.$$
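The same evaluation can be delegated to a computer algebra system. The following sketch is an illustration only (the variable names are ours); it evaluates (4.60) for cylindrical coordinates and reproduces $\Gamma^1_{22} = -R$ and $\Gamma^2_{12} = \Gamma^2_{21} = 1/R$.

```python
# Christoffel symbols of cylindrical coordinates with sympy (Problem 4.12, sketch).
import sympy as sp

R, phi, z = sp.symbols('R phi z', positive=True)
u = [R, phi, z]
g = sp.diag(1, R**2, 1)          # covariant metric coefficients
ginv = g.inv()

Gamma = [[[sp.simplify(sum(sp.Rational(1, 2) * ginv[k, m] *
           (sp.diff(g[m, i], u[p]) + sp.diff(g[m, p], u[i]) - sp.diff(g[i, p], u[m]))
           for m in range(3)))
           for p in range(3)] for i in range(3)] for k in range(3)]

for k in range(3):
    for i in range(3):
        for p in range(3):
            if Gamma[k][i][p] != 0:
                print(f"Gamma^{k+1}_{i+1}{p+1} =", Gamma[k][i][p])
# prints: Gamma^1_22 = -R,  Gamma^2_12 = 1/R,  Gamma^2_21 = 1/R
```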
Problem 4.13.
We can derive the transformation law as follows. Starting with equation (4.58)1 in the
tilde system, we substitute the transformation equations (4.18)1 for the covariant basis
vectors, so we have
$$\tilde{\Gamma}^i_{jk}\, \tilde{g}_i = \frac{\partial u^m}{\partial \tilde{u}^j}\frac{\partial u^n}{\partial \tilde{u}^k}\, \Gamma^l_{mn}\, \frac{\partial \tilde{u}^i}{\partial u^l}\, \tilde{g}_i + \frac{\partial^2 u^m}{\partial \tilde{u}^j \partial \tilde{u}^k}\, \frac{\partial \tilde{u}^i}{\partial u^m}\, \tilde{g}_i,$$
$$\tilde{\Gamma}^i_{jk} = \frac{\partial \tilde{u}^i}{\partial u^l}\frac{\partial u^m}{\partial \tilde{u}^j}\frac{\partial u^n}{\partial \tilde{u}^k}\, \Gamma^l_{mn} + \frac{\partial^2 u^m}{\partial \tilde{u}^j \partial \tilde{u}^k}\, \frac{\partial \tilde{u}^i}{\partial u^m}.$$
The first term is the transformation law for single contravariant and twice covariant
tensor coordinates. The second term shows that Christoffel symbols are not tensor co-
ordinates. If the no-tilde coordinates are Cartesian, the Christoffel symbols vanish in
the no-tilde coordinate system and the transformation law reduces to Definition 4.56
of the Christoffel symbols. Starting from equation (4.58)2 , we analogously obtain the
equivalent transformation law
$$\tilde{\Gamma}^i_{jk} = \frac{\partial \tilde{u}^i}{\partial u^l}\frac{\partial u^m}{\partial \tilde{u}^j}\frac{\partial u^n}{\partial \tilde{u}^k}\, \Gamma^l_{mn} - \frac{\partial^2 \tilde{u}^i}{\partial u^m \partial u^n}\, \frac{\partial u^m}{\partial \tilde{u}^j}\frac{\partial u^n}{\partial \tilde{u}^k}.$$
Problem 4.14.
A. $a \times \operatorname{curl} a = b$
$$\iff \varepsilon_{ijk}\, \hat{a}_j\, \varepsilon_{mkn}\, \frac{\partial \hat{a}_m}{\partial x_n} = \hat{b}_i \iff e_{ijk}\, a^j\, a_m|_n\, e^{mkn} = b_i.$$
B. $\dfrac{\partial \hat{a}_i}{\partial x_j} + \dfrac{\partial \hat{a}_j}{\partial x_i} = \hat{b}_{ij} \iff a_i|_j + a_j|_i = b_{ij} \iff \operatorname{grad} a + \operatorname{grad}^T a = b.$
C. $a^i\, b_j|_i = c_j \iff \hat{a}_i\, \dfrac{\partial \hat{b}_j}{\partial x_i} = \hat{c}_j \iff (\operatorname{grad} b)\cdot a = c.$
D. $e^i{}_{jk}\, e_{imn}\, a^j b^k c^m d^n = e^i{}_{jk}\, a^j b^k\ e_{imn}\, c^m d^n = (g_{jm} g_{kn} - g_{jn} g_{km})\, a^j b^k c^m d^n = a_j c^j\, b_k d^k - a_j d^j\, b_k c^k$
$$\iff \varepsilon_{ijk}\varepsilon_{imn}\, \hat{a}_j \hat{b}_k \hat{c}_m \hat{d}_n = \hat{a}_j \hat{c}_j\, \hat{b}_k \hat{d}_k - \hat{a}_j \hat{d}_j\, \hat{b}_k \hat{c}_k \iff (a\times b)\cdot(c\times d) = a\cdot c\ b\cdot d - a\cdot d\ b\cdot c.$$
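Identity D is easy to spot-check numerically with arbitrary vectors (illustration only):

```python
# Numerical check of (a x b).(c x d) = a.c b.d - a.d b.c (Problem 4.14 D, sketch).
import numpy as np

rng = np.random.default_rng(0)
a, b, c, d = rng.standard_normal((4, 3))

lhs = np.dot(np.cross(a, b), np.cross(c, d))
rhs = np.dot(a, c) * np.dot(b, d) - np.dot(a, d) * np.dot(b, c)
print(np.isclose(lhs, rhs))   # True
```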
Problem 4.15.
A. $(\operatorname{grad} a)_i \overset{(4.73)}{=} a|_i \overset{(4.61)}{=} a_{,i}$,
$$(\operatorname{grad} a)_R = (\operatorname{grad} a)_1 \overset{(\mathrm{B.6})}{=} a_{,1} = \frac{\partial a}{\partial R}, \quad (\operatorname{grad} a)_\varphi = \frac{1}{R}\,(\operatorname{grad} a)_2 = \frac{1}{R}\, a_{,2} = \frac{1}{R}\frac{\partial a}{\partial \varphi}, \quad (\operatorname{grad} a)_z = (\operatorname{grad} a)_3 = a_{,3} = \frac{\partial a}{\partial z}.$$
B. $\operatorname{div} a \overset{(4.76)}{=} a^i|_i \overset{(4.61)}{=} a^i{}_{,i} + \Gamma^i_{mi}\, a^m \overset{(\mathrm{B.13})}{=} a^1{}_{,1} + a^2{}_{,2} + a^3{}_{,3} + \Gamma^2_{12}\, a^1 \overset{(\mathrm{B.6})}{=} \dfrac{\partial a_R}{\partial R} + \dfrac{1}{R}\dfrac{\partial a_\varphi}{\partial \varphi} + \dfrac{\partial a_z}{\partial z} + \dfrac{a_R}{R}.$
C. $(\operatorname{curl} a)^j \overset{(4.79)}{=} e^{ijk}\, a_{i,k}$; the physical coordinates follow analogously.
D. $(\operatorname{grad} a)^i{}_j \overset{(4.73)}{=} a^i|_j \overset{(4.61)}{=} a^i{}_{,j} + \Gamma^i_{mj}\, a^m,$
or we use the fact that cylindrical coordinates are orthogonal, and thus the scalar
product is an algebraic operation where the factors are composed of the physical
cylindrical coordinates, similarly as in Cartesian coordinates. Using the solutions
to part D and from (B.17), respectively, we obtain
$$[(\operatorname{grad} a)\cdot b]_R = \frac{\partial a_R}{\partial R}\, b_R + \Big(\frac{1}{R}\frac{\partial a_R}{\partial \varphi} - \frac{a_\varphi}{R}\Big)\, b_\varphi + \frac{\partial a_R}{\partial z}\, b_z = \frac{\partial a_R}{\partial R}\, b_R + \frac{\partial a_R}{\partial \varphi}\frac{b_\varphi}{R} + \frac{\partial a_R}{\partial z}\, b_z - \frac{a_\varphi b_\varphi}{R},$$
and analogously for the other two coordinates.
Problem 4.16.
A. $\dfrac{\partial^2 \hat{a}_m}{\partial x_k \partial x_k} = \hat{b}_m \iff a_m|_{kp}\, g^{kp} = b_m \iff \operatorname{div}\operatorname{grad} a = \Delta a = b.$
B. $\dfrac{\partial^2 \hat{a}_k}{\partial x_m \partial x_k} = \hat{b}_m \iff a^k|_{km} = b_m \iff \operatorname{grad}\operatorname{div} a = b.$
C. $\mathrm{d} a = (\operatorname{grad} a)\cdot \mathrm{d}x \iff \mathrm{d}a_{ij} = a_{ij}|_k\, \mathrm{d}u^k \iff \mathrm{d}\hat{a}_{ij} = \dfrac{\partial \hat{a}_{ij}}{\partial x_k}\, \mathrm{d}x_k.$
Problem 4.17.
Similarly as in the last problem, we have two options:
– We first perform the computations in holonomic coordinates and translate the re-
sult into physical coordinates.
– We compute the Laplace operator as the product of the physical coordinates of the
gradient and the divergence, which we have computed earlier.
A. Option 1:
$$\Delta a \overset{(4.84)}{=} g^{mn}\, a|_{mn} \overset{(4.81)}{=} g^{mn}\, a|_m|_n \overset{(4.61)}{=} g^{mn}\big[(a|_m)_{,n} - \Gamma^j_{mn}\, a|_j\big] \overset{(4.61)}{=} g^{mn}\big(a_{,mn} - \Gamma^j_{mn}\, a_{,j}\big),$$
$$\Delta a = \underbrace{\frac{\partial^2 a}{\partial R^2} + \frac{1}{R}\frac{\partial a}{\partial R}}_{\frac{1}{R}\frac{\partial}{\partial R}\left(R\,\frac{\partial a}{\partial R}\right)} + \frac{1}{R^2}\frac{\partial^2 a}{\partial \varphi^2} + \frac{\partial^2 a}{\partial z^2}.$$
Option 2:
$$b = \operatorname{grad} a \iff b_R = \frac{\partial a}{\partial R}, \quad b_\varphi = \frac{1}{R}\frac{\partial a}{\partial \varphi}, \quad b_z = \frac{\partial a}{\partial z},$$
and $\Delta a = \operatorname{div} b$, evaluated with the physical cylindrical coordinates of the divergence, reproduces the same result.
B. We compute only the R-coordinate and refer to (B.21) for the other two coordinates.
Option 1:
$$(\Delta a)_i \overset{(4.84)}{=} g^{mn}\, a_i|_{mn} \overset{(4.81)}{=} g^{mn}\, a_i|_m|_n, \quad \text{with } b_{im} := a_i|_m, \qquad (\Delta a)_i = g^{mn}\, b_{im}|_n,$$
$$a_i|_{mn} = b_{im}|_n \overset{(4.61)}{=} b_{im,n} - \Gamma^j_{in}\, b_{jm} - \Gamma^j_{mn}\, b_{ij} \overset{(4.61)}{=} (a_{i,m} - \Gamma^j_{im}\, a_j)_{,n} - \Gamma^j_{in}(a_{j,m} - \Gamma^k_{jm}\, a_k) - \Gamma^j_{mn}(a_{i,j} - \Gamma^k_{ij}\, a_k)$$
$$= a_{i,mn} - \Gamma^j_{im,n}\, a_j - \Gamma^j_{im}\, a_{j,n} - \Gamma^j_{in}\, a_{j,m} + \Gamma^j_{in}\Gamma^k_{jm}\, a_k - \Gamma^j_{mn}\, a_{i,j} + \Gamma^j_{mn}\Gamma^k_{ij}\, a_k,$$
$$a_i|_{mm} = a_{i,mm} - \Gamma^j_{im,m}\, a_j - 2\,\Gamma^j_{im}\, a_{j,m} - \Gamma^j_{mm}\, a_{i,j} + \Gamma^j_{im}\Gamma^k_{jm}\, a_k + \Gamma^j_{mm}\Gamma^k_{ij}\, a_k.$$
$$a_1|_{11} = a_{1,11} = \frac{\partial^2 a_1}{\partial R^2},$$
$$a_1|_{22} = a_{1,22} - \Gamma^2_{12,2}\, a_2 - 2\,\Gamma^2_{12}\, a_{2,2} - \Gamma^1_{22}\, a_{1,1} + \Gamma^2_{12}\Gamma^1_{22}\, a_1 = \frac{\partial^2 a_1}{\partial \varphi^2} - \frac{2}{R}\frac{\partial a_2}{\partial \varphi} + R\,\frac{\partial a_1}{\partial R} - a_1,$$
$$a_1|_{33} = a_{1,33} = \frac{\partial^2 a_1}{\partial z^2},$$
$$(\Delta a)_1 = g^{mn}\, a_1|_{mn} = \frac{\partial^2 a_1}{\partial R^2} + \frac{1}{R^2}\left(\frac{\partial^2 a_1}{\partial \varphi^2} - \frac{2}{R}\frac{\partial a_2}{\partial \varphi} + R\,\frac{\partial a_1}{\partial R} - a_1\right) + \frac{\partial^2 a_1}{\partial z^2},$$
and with $a_1 = a_R$, $a_2 = R\, a_\varphi$ we obtain
$$(\Delta a)_R = \frac{\partial^2 a_R}{\partial R^2} + \frac{1}{R}\frac{\partial a_R}{\partial R} - \frac{a_R}{R^2} + \frac{1}{R^2}\frac{\partial^2 a_R}{\partial \varphi^2} + \frac{\partial^2 a_R}{\partial z^2} - \frac{2}{R^2}\frac{\partial a_\varphi}{\partial \varphi}.$$
Problem 4.18.
A. According to (4.61), we have
$$a_i|_k = a_{i,k} - \Gamma^m_{ik}\, a_m =: b_{ik},$$
$$a_i|_{kl} = b_{ik}|_l = a_{i,kl} - \Gamma^m_{ik,l}\, a_m - \Gamma^m_{ik}\, a_{m,l} - \Gamma^n_{il}\, a_{n,k} + \Gamma^n_{il}\Gamma^m_{nk}\, a_m - \Gamma^n_{kl}\, a_{i,n} + \Gamma^n_{kl}\Gamma^m_{in}\, a_m,$$
$$a_i|_{lk} = b_{il}|_k = a_{i,lk} - \Gamma^m_{il,k}\, a_m - \Gamma^m_{il}\, a_{m,k} - \Gamma^n_{ik}\, a_{n,l} + \Gamma^n_{ik}\Gamma^m_{nl}\, a_m - \Gamma^n_{lk}\, a_{i,n} + \Gamma^n_{lk}\Gamma^m_{in}\, a_m.$$
The terms that are symmetric in k and l cancel when we form the difference, so
$$a_i|_{kl} - a_i|_{lk} = \big(\Gamma^m_{il,k} - \Gamma^m_{ik,l} + \Gamma^n_{il}\Gamma^m_{nk} - \Gamma^n_{ik}\Gamma^m_{nl}\big)\, a_m.$$
B. On the left side are the entirely covariant coordinates of a third-order tensor, and
with the quotient rule the terms in parentheses represent the single contravariant
and triple covariant coordinates $R^m{}_{ikl}$ of a fourth-order tensor. According to (4.83),
the tensor on the left side is the zero tensor for arbitrary $a_m$, i. e. we have $R^m{}_{ikl}\, a_m = 0$
for arbitrary $a_m$, and this implies that $R^m{}_{ikl} = 0$.
Problem 4.19.
To derive the transformation law for the contravariant basis vectors, we cannot refer
to the Cartesian basis, as we did in Section 4.2.2, because only the relation between
the surface coordinates $u^\alpha$ and $\tilde{u}^\alpha$ is one-to-one, while the relation (4.102) between the
surface coordinates and the points of space is not invertible. Instead we start from
$$u^\beta = u^\beta\big[\,\tilde{u}^\alpha(u^\gamma)\,\big],$$
and then we obtain, using the chain rule, analogous to (4.17), the relation
$$\frac{\partial u^\beta}{\partial u^\gamma} = \frac{\partial u^\beta}{\partial \tilde{u}^\alpha}\frac{\partial \tilde{u}^\alpha}{\partial u^\gamma} = \delta^\beta_\gamma.$$
Therefore, contraction of (4.105) with $\partial \tilde{u}^\alpha/\partial u^\gamma$ gives
$$\frac{\partial \tilde{u}^\alpha}{\partial u^\gamma}\, \tilde{g}_\alpha = \underbrace{\frac{\partial \tilde{u}^\alpha}{\partial u^\gamma}\frac{\partial u^\beta}{\partial \tilde{u}^\alpha}}_{\delta^\beta_\gamma}\, g_\beta = g_\gamma,$$
$$\delta = g_\gamma\, g^\gamma + n\, n = \frac{\partial \tilde{u}^\alpha}{\partial u^\gamma}\, \tilde{g}_\alpha\, g^\gamma + n\, n,$$
so scalar multiplication with $\tilde{g}^\beta$ from the left gives
$$\tilde{g}^\beta \cdot \delta = \frac{\partial \tilde{u}^\alpha}{\partial u^\gamma}\, \underbrace{\tilde{g}^\beta \cdot \tilde{g}_\alpha}_{\delta^\beta_\alpha}\, g^\gamma + \underbrace{\tilde{g}^\beta \cdot n}_{0}\; n, \qquad \tilde{g}^\beta = \frac{\partial \tilde{u}^\beta}{\partial u^\gamma}\, g^\gamma.$$
This is the desired transformation law for the contravariant basis vectors, analogous
to (4.18).
To derive the transformation law for the covariant coordinates aα of a surface vec-
tor a, we compute the scalar product of the relation
$$a = \tilde{a}_\beta\, \tilde{g}^\beta = a_\beta\, g^\beta$$
with g̃ α and substitute on the right side the transformation law (4.105), so we have
$$\tilde{a}_\beta\, \underbrace{\tilde{g}^\beta \cdot \tilde{g}_\alpha}_{\delta^\beta_\alpha} = a_\beta\, g^\beta \cdot \tilde{g}_\alpha = a_\beta\, \underbrace{g^\beta \cdot g_\gamma}_{\delta^\beta_\gamma}\, \frac{\partial u^\gamma}{\partial \tilde{u}^\alpha}.$$
This gives the transformation law for the covariant coordinates of a surface vector,
analogous to (4.19),
$$\tilde{a}_\alpha = \frac{\partial u^\beta}{\partial \tilde{u}^\alpha}\, a_\beta.$$
Problem 4.20.
Writing the covariant metric coefficients as scalar products of the covariant basis vectors, we have in two different surface coordinate systems
$$\tilde{g}_{\alpha\beta} = \tilde{g}_\alpha \cdot \tilde{g}_\beta, \qquad g_{\alpha\beta} = g_\alpha \cdot g_\beta.$$
Substituting the transformation law (4.105) into the left equation gives
$$\tilde{g}_{\alpha\beta} = \frac{\partial u^\gamma}{\partial \tilde{u}^\alpha}\, g_\gamma \cdot \frac{\partial u^\delta}{\partial \tilde{u}^\beta}\, g_\delta = \frac{\partial u^\gamma}{\partial \tilde{u}^\alpha}\frac{\partial u^\delta}{\partial \tilde{u}^\beta}\, g_{\gamma\delta}.$$
Comparing this equation with (4.19) shows that it is the transformation law for the
entirely covariant coordinates of a second-order tensor, here in particular of a surface
tensor, because the summation runs only from 1 to 2.
Problem 4.21.
The scalar product of $a = a_\alpha g^\alpha = a^\alpha g_\alpha$, once with $g^\beta$ and once with $g_\beta$, together with
(4.106), (4.110), and (4.114), yields
$$a_\alpha\, g^\alpha \cdot g^\beta = a^\alpha\, g_\alpha \cdot g^\beta, \qquad a^\alpha\, \delta^\beta_\alpha = a_\alpha\, g^{\alpha\beta}, \qquad a^\beta = g^{\beta\alpha}\, a_\alpha,$$
$$a_\alpha\, g^\alpha \cdot g_\beta = a^\alpha\, g_\alpha \cdot g_\beta, \qquad a_\alpha\, \delta^\alpha_\beta = a^\alpha\, g_{\alpha\beta}, \qquad a_\beta = g_{\beta\alpha}\, a^\alpha.$$
Problem 4.22.
If we take the Cartesian coordinates as surface parameters $u^1 = x$, $u^2 = y$, the parametric representation of the surface $z = x\,y$ is given as
$$x = x\, e_1 + y\, e_2 + x\,y\, e_3.$$
For the next computations we need the metric coefficients. With the covariant basis vectors $g_1 = (1,\ 0,\ y)$ and $g_2 = (0,\ 1,\ x)$ we compute the covariant metric coefficients $g_{\alpha\beta}$, using (4.110), as
$$g_{11} = g_1\cdot g_1 = 1 + y^2, \qquad g_{12} = g_{21} = g_1\cdot g_2 = x\,y, \qquad g_{22} = g_2\cdot g_2 = 1 + x^2.$$
The determinant is
$$g = g_{11}\, g_{22} - g_{12}\, g_{21} = (1+y^2)(1+x^2) - x^2 y^2 = 1 + x^2 + y^2,$$
and the contravariant metric coefficients are $g^{11} = (1+x^2)/g$, $g^{22} = (1+y^2)/g$, $g^{12} = g^{21} = -x\,y/g$.
To compute the covariant coordinates of the curvature tensor bαβ = n ⋅ g α,β , we need,
according to (4.127), the normal vector n and the partial derivatives g α,β of the covariant
basis vectors.
We compute the vector product g 1 × g 2 ,
g 1 × g 2 = −y e1 − x e2 + e3 ,
and obtain, from (4.101), with $|g_1\times g_2| = \sqrt{y^2+x^2+1} = \sqrt{g}$, the normal vector
$$n = \frac{g_1\times g_2}{|g_1\times g_2|} = \frac{1}{\sqrt{g}}\,(-y\, e_1 - x\, e_2 + e_3).$$
$$g_{1,1} = \frac{\partial g_1}{\partial x} = 0, \qquad g_{2,2} = \frac{\partial g_2}{\partial y} = 0, \qquad g_{1,2} = g_{2,1} = \frac{\partial g_1}{\partial y} = e_3.$$
Hence we obtain for the covariant coordinates bαβ of the curvature tensor
$$b_{11} = n\cdot g_{1,1} = 0, \qquad b_{22} = n\cdot g_{2,2} = 0, \qquad b_{12} = b_{21} = n\cdot g_{1,2} = \frac{1}{\sqrt{g}}.$$
$$b^1{}_1 = g^{1\gamma}\, b_{\gamma 1} = g^{12}\, b_{21} = -\frac{x\,y}{g\sqrt{g}}, \qquad b^1{}_2 = g^{1\gamma}\, b_{\gamma 2} = g^{11}\, b_{12} = \frac{1+x^2}{g\sqrt{g}},$$
$$b^2{}_1 = g^{2\gamma}\, b_{\gamma 1} = g^{22}\, b_{21} = \frac{1+y^2}{g\sqrt{g}}, \qquad b^2{}_2 = g^{2\gamma}\, b_{\gamma 2} = g^{21}\, b_{12} = -\frac{x\,y}{g\sqrt{g}}.$$
We note that the matrix of the covariant coordinates bαβ of the curvature tensor is
symmetric, but the matrix of the mixed coordinates bα β is not symmetric. We further
note that the metric coefficients g12 = g21 are zero only if x = 0 or y = 0. Thus, in
general, g 1 ⋅ g 2 ≠ 0, i. e. the x- and y-coordinate curves do not form an orthogonal grid
on the surface z = x y, but only in the x, y-plane.
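The whole computation for the surface z = x y can be reproduced symbolically. The following sketch is an illustration only (the variable names are ours); it also evaluates the mean and Gaussian curvatures needed in Problems 4.23 and 4.24.

```python
# Metric, curvature tensor, and curvatures of the surface z = x*y (sketch).
import sympy as sp

x, y = sp.symbols('x y', real=True)
r = sp.Matrix([x, y, x*y])                       # parametric representation

g1, g2 = r.diff(x), r.diff(y)                    # covariant basis vectors
g_cov = sp.Matrix([[g1.dot(g1), g1.dot(g2)],
                   [g2.dot(g1), g2.dot(g2)]])    # [[1+y^2, x*y], [x*y, 1+x^2]]
g = sp.simplify(g_cov.det())                     # 1 + x^2 + y^2

n = g1.cross(g2) / sp.sqrt(g)                    # unit normal vector
b_cov = sp.Matrix([[n.dot(g1.diff(x)), n.dot(g1.diff(y))],
                   [n.dot(g2.diff(x)), n.dot(g2.diff(y))]])   # b_ab = n . g_a,b

b_mix = sp.simplify(g_cov.inv() * b_cov)         # b^a_b = g^ac b_cb
H = sp.simplify(b_mix.trace() / 2)               # mean curvature: -x*y / g^(3/2)
K = sp.simplify(b_mix.det())                     # Gaussian curvature: -1 / g^2

print(g_cov, g)
print(H, K)
```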
Problem 4.23.
Using the covariant coordinates of the curvature tensor, we obtain the mean curvature
as $H = \tfrac{1}{2}\, b^\alpha{}_\alpha = \tfrac{1}{2}\, g^{\alpha\beta}\, b_{\beta\alpha}$ and the Gaussian curvature as $K = \det b^\alpha{}_\beta = \det(g^{\alpha\gamma}\, b_{\gamma\beta}) =
(\det g^{\alpha\gamma})(\det b_{\gamma\beta}) = \tfrac{1}{g}\det b_{\gamma\delta}$ (compare (1.13) and (4.27)). Substituting (4.113), (4.116),
and (4.131) gives
$$H = \frac{1}{2}\,\big(g^{11}\, b_{11} + 2\, g^{12}\, b_{12} + g^{22}\, b_{22}\big) = \frac{G\,L - 2\,F\,M + E\,N}{2\,(E\,G - F^2)},$$
$$K = \frac{b_{11}\, b_{22} - b_{12}\, b_{12}}{g_{11}\, g_{22} - g_{12}\, g_{12}} = \frac{L\,N - M^2}{E\,G - F^2}.$$
Problem 4.24.
Using the results of Problem 4.22, we obtain for the mean curvature H and the Gaussian
curvature K
$$H = \frac{1}{2}\,\big(b^1{}_1 + b^2{}_2\big) = \frac{-x\,y}{g\sqrt{g}}, \qquad K = b^1{}_1\, b^2{}_2 - b^1{}_2\, b^2{}_1 = \frac{-(1 + x^2 + y^2)}{g^3} = -\frac{1}{g^2}.$$
The principal curvatures $k_{1,2} = H \pm \sqrt{H^2 - K}$ are
$$k_{1,2} = \frac{-x\,y \pm \sqrt{x^2 y^2 + g}}{g\sqrt{g}} = \frac{-x\,y \pm \sqrt{(1+x^2)(1+y^2)}}{g\sqrt{g}}.$$
The principal curvature directions are determined by $(b^\alpha{}_\beta - k\,\delta^\alpha_\beta)\, t^\beta = 0$, or equivalently by $(b_{\alpha\beta} - k\, g_{\alpha\beta})\, t^\beta = 0$, and with $g_{11} = 1 + y^2$, $g_{22} = 1 + x^2$, $g_{12} = x\,y$ we have
$$-k\,(1+y^2)\, t^1 + \Big(\frac{1}{\sqrt{g}} - k\,x\,y\Big)\, t^2 = 0, \qquad \Big(\frac{1}{\sqrt{g}} - k\,x\,y\Big)\, t^1 - k\,(1+x^2)\, t^2 = 0.$$
Solving these equations we obtain the principal curvature directions (in nonnormalized form) as
$$t_1 = \big(\sqrt{1+x^2},\ \sqrt{1+y^2}\,\big), \qquad t_2 = \big(-\sqrt{1+x^2},\ \sqrt{1+y^2}\,\big).$$
Problem 4.25.
If we set m = μ, i = α, k = β, l = γ for the indices of the three-dimensional Riemann
curvature tensor in Problem 4.18 and write out the n = 3 terms of the summation over
n separately, we have
$$0 = \underbrace{\Gamma^\mu_{\alpha\gamma,\beta} - \Gamma^\mu_{\alpha\beta,\gamma} + \Gamma^\nu_{\alpha\gamma}\Gamma^\mu_{\nu\beta} - \Gamma^\nu_{\alpha\beta}\Gamma^\mu_{\nu\gamma}}_{R^\mu{}_{\alpha\beta\gamma}} + \Gamma^3_{\alpha\gamma}\Gamma^\mu_{3\beta} - \Gamma^3_{\alpha\beta}\Gamma^\mu_{3\gamma}.$$
From this we obtain, using (4.138)₁ and (4.139), the integrability condition (4.143):
$$R^\mu{}_{\alpha\beta\gamma} = \underbrace{\Gamma^3_{\alpha\beta}}_{b_{\alpha\beta}}\underbrace{\Gamma^\mu_{3\gamma}}_{-b^\mu{}_\gamma} - \underbrace{\Gamma^3_{\alpha\gamma}}_{b_{\alpha\gamma}}\underbrace{\Gamma^\mu_{3\beta}}_{-b^\mu{}_\beta} = b^\mu{}_\beta\, b_{\alpha\gamma} - b^\mu{}_\gamma\, b_{\alpha\beta}.$$
Setting m = 3 instead, we analogously obtain
$$0 = \underbrace{\Gamma^3_{\alpha\gamma,\beta}}_{b_{\alpha\gamma,\beta}} - \underbrace{\Gamma^3_{\alpha\beta,\gamma}}_{b_{\alpha\beta,\gamma}} + \Gamma^\nu_{\alpha\gamma}\,\underbrace{\Gamma^3_{\nu\beta}}_{b_{\nu\beta}} - \Gamma^\nu_{\alpha\beta}\,\underbrace{\Gamma^3_{\nu\gamma}}_{b_{\nu\gamma}} + \underbrace{\Gamma^3_{\alpha\gamma}}_{b_{\alpha\gamma}}\underbrace{\Gamma^3_{3\beta}}_{0} - \underbrace{\Gamma^3_{\alpha\beta}}_{b_{\alpha\beta}}\underbrace{\Gamma^3_{3\gamma}}_{0},$$
$$b_{\alpha\beta,\gamma} - \Gamma^\nu_{\alpha\gamma}\, b_{\nu\beta} = b_{\alpha\gamma,\beta} - \Gamma^\nu_{\alpha\beta}\, b_{\nu\gamma}.$$
Problem 4.26.
From the definition (4.146) it follows, for two different surface coordinate systems, with (4.33) and $\tilde{g}_3 = g_3 = n$, that
$$\tilde{e}_{\alpha\beta} = [\tilde{g}_\alpha,\ \tilde{g}_\beta,\ n], \qquad e_{\alpha\beta} = [g_\alpha,\ g_\beta,\ n].$$
Substituting the transformation law (4.105) into the left equation gives
$$\tilde{e}_{\alpha\beta} = \frac{\partial u^\gamma}{\partial \tilde{u}^\alpha}\frac{\partial u^\delta}{\partial \tilde{u}^\beta}\, [g_\gamma,\ g_\delta,\ n] = \frac{\partial u^\gamma}{\partial \tilde{u}^\alpha}\frac{\partial u^\delta}{\partial \tilde{u}^\beta}\, e_{\gamma\delta}.$$
This is, similar to Problem 4.20, the transformation law for the entirely covariant co-
ordinates of a surface tensor of second order.
Problem 4.27.
We can compute the Gaussian curvature from the Riemann curvature tensor. Accord-
ing to (4.147), we have K = R1212 /g. Using (4.142), we further obtain
$$R_{1212} = g_{1\mu}\big(\Gamma^\mu_{22,1} - \Gamma^\mu_{21,2} + \Gamma^\nu_{22}\Gamma^\mu_{\nu 1} - \Gamma^\nu_{21}\Gamma^\mu_{\nu 2}\big) = g_{1\mu}\big(\Gamma^\mu_{22,1} - \Gamma^\mu_{21,2} + \Gamma^1_{22}\Gamma^\mu_{11} + \Gamma^2_{22}\Gamma^\mu_{21} - \Gamma^1_{21}\Gamma^\mu_{12} - \Gamma^2_{21}\Gamma^\mu_{22}\big).$$
For orthogonal surface coordinates ($g_{12} = 0$) only the term with $\mu = 1$ remains:
$$K\, g = g_{11}\big(\Gamma^1_{22,1} - \Gamma^1_{21,2} + \Gamma^1_{22}\Gamma^1_{11} + \Gamma^2_{22}\Gamma^1_{21} - \Gamma^1_{21}\Gamma^1_{12} - \Gamma^2_{21}\Gamma^1_{22}\big)$$
$$\;\widehat{=}\; g_{22}\big(\Gamma^2_{33,2} - \Gamma^2_{32,3} + \Gamma^2_{33}\Gamma^2_{22} + \Gamma^3_{33}\Gamma^2_{32} - \Gamma^2_{32}\Gamma^2_{23} - \Gamma^3_{32}\Gamma^2_{33}\big),$$
where the second form applies when the surface coordinates carry the labels 2 and 3.
Here we can also use $b^\alpha{}_\beta = g^{\alpha\mu}\, b_{\mu\beta} = g^{\alpha\mu}\, \Gamma^3_{\mu\beta}$.
On a cylinder surface we have K = 0. This means we can unfold the surface into a
plane, whereas such an unfolding is not possible with a spherical surface, because
K ≠ 0.
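The relation K = R₁₂₁₂/g can also be checked symbolically from the surface metric alone. The following sketch is an illustration only (the function name is ours; the metric matrices of a cylinder of radius a, with coordinates φ, z, and of a sphere of radius r, with coordinates ϑ, φ, are assumed in the form shown); it reproduces K = 0 for the cylinder and K = 1/r² for the sphere.

```python
# Gaussian curvature from the surface metric via K = R_1212 / g (sketch).
import sympy as sp

def gaussian_curvature(g, u):
    """K from the covariant 2x2 surface metric g in the coordinates u = (u1, u2)."""
    ginv = g.inv()
    Gamma = [[[sum(sp.Rational(1, 2) * ginv[m, n] *
               (sp.diff(g[n, a], u[b]) + sp.diff(g[n, b], u[a]) - sp.diff(g[a, b], u[n]))
               for n in range(2)) for b in range(2)] for a in range(2)] for m in range(2)]
    # R_1212 = g_{1m} (Gamma^m_{22,1} - Gamma^m_{21,2} + Gamma^n_22 Gamma^m_n1 - Gamma^n_21 Gamma^m_n2)
    R1212 = sum(g[0, m] * (sp.diff(Gamma[m][1][1], u[0]) - sp.diff(Gamma[m][1][0], u[1])
                + sum(Gamma[n][1][1]*Gamma[m][n][0] - Gamma[n][1][0]*Gamma[m][n][1]
                      for n in range(2))) for m in range(2))
    return sp.simplify(R1212 / g.det())

a, r, theta, phi, z = sp.symbols('a r theta phi z', positive=True)

K_cyl = gaussian_curvature(sp.diag(a**2, 1), [phi, z])                          # 0
K_sph = gaussian_curvature(sp.diag(r**2, r**2*sp.sin(theta)**2), [theta, phi])  # 1/r**2

print(K_cyl, K_sph)
```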
Problem 5.1.
If T is a polar tensor, then its coordinates transform according to (2.17); for an axial
tensor T the transformation law contains the additional factor det α. Setting the indices
equal, i = j, and using the orthogonality relation (2.6), we obtain for the trace of an axial
tensor T
$$T_{ii} = (\det \alpha)\, \tilde{T}_{mm}$$
and
$$\operatorname{tr} T^2 = T_{ik}\, T_{ki} = (\det\alpha)^2\, \alpha_{im}\alpha_{kn}\tilde{T}_{mn}\, \alpha_{kp}\alpha_{iq}\tilde{T}_{pq} = \alpha_{im}\alpha_{iq}\, \alpha_{kn}\alpha_{kp}\, \tilde{T}_{mn}\tilde{T}_{pq} = \delta_{mq}\,\delta_{np}\, \tilde{T}_{mn}\tilde{T}_{pq} = \tilde{T}_{qp}\tilde{T}_{pq}.$$
Here tr T 2 is a polar scalar, but tr T is an axial scalar, and hence, according to our
convention from Chapter 5, tr T does not count as an invariant.
Problem 5.2.
All odd powers of antimetric tensors are antimetric, i. e. the trace of these powers is
zero; but the even powers are symmetric. Furthermore, according to (5.8), the trace of a
scalar product of a symmetric and an antimetric tensor is zero, so we see, by comparing
with (5.15), that we have the following nonzero invariants: tr A2 , tr B2 , tr(A⋅B), tr(A2 ⋅B2 ),
tr(A2 ⋅ B ⋅ A ⋅ B2 ).
The invariant of degree six can be shown to be reducible, in the same way as we
showed this for two symmetric tensors in Section 5.3.3, No. 5.
Only the investigation of the invariant of degree four requires more effort. If we
write the antimetric tensor A, using (3.8)₂, in terms of its corresponding vector a, we
obtain, using the Grassmann identity, for the coordinates of the square A²
$$(A^2)_{ij} = A_{ik}\, A_{kj} = a_i\, a_j - a_k a_k\, \delta_{ij}.$$
The same holds for the coordinates of the square B², so we compute the coordinates
of A² ⋅ B². We have
$$(A^2 \cdot B^2)_{ij} = (a_i a_k - a_m a_m\, \delta_{ik})(b_k b_j - b_n b_n\, \delta_{kj}), \qquad \operatorname{tr}(A^2 \cdot B^2) = (a_k b_k)^2 + a_k a_k\; b_m b_m.$$
Using (3.8)₁, we can reintroduce the coordinates of the antimetric tensors on the right
side. Using the Grassmann identity and taking antimetry into account, we get initially
$$a_k b_k = \frac{1}{4}\,\varepsilon_{klm}\varepsilon_{kpq}\, A_{lm} B_{pq} = \frac{1}{4}\,(\delta_{lp}\delta_{mq} - \delta_{lq}\delta_{mp})\, A_{lm} B_{pq} = \frac{1}{4}\,(A_{pq} B_{pq} - A_{qp} B_{pq}) = -\frac{1}{2}\, A_{qp} B_{pq} = -\frac{1}{2}\operatorname{tr}(A\cdot B).$$
Hence the integrity basis for two antimetric tensors consists of only three irreducible
invariants:
– the only irreducible invariants, according to Section 5.3.3, No. 4, of the tensors A
and B themselves: I1 = tr A2 , I2 = tr B2 ;
– the only irreducible simultaneous invariant of A and B: I3 = tr(A ⋅ B).
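A quick numerical check of the reduction of the degree-four invariant (illustration only; the sign convention $A_{ij} = \varepsilon_{ijk} a_k$ is assumed here, which does not affect the quadratic identities):

```python
# Check that tr(A^2 . B^2) reduces to the invariants a.b, a.a, b.b (Problem 5.2, sketch).
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.standard_normal((2, 3))

def antimetric(v):
    # A_ij = eps_ijk v_k, the antimetric tensor corresponding to the vector v (assumed convention)
    return np.array([[0.0,   v[2], -v[1]],
                     [-v[2], 0.0,   v[0]],
                     [v[1], -v[0],  0.0]])

A, B = antimetric(a), antimetric(b)

lhs = np.trace(A @ A @ B @ B)
rhs = np.dot(a, b)**2 + np.dot(a, a) * np.dot(b, b)
print(np.isclose(lhs, rhs))                               # True
print(np.isclose(np.dot(a, b), -0.5*np.trace(A @ B)))     # True: a.b = -tr(A.B)/2
```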
Problem 5.3.
We can compute the following invariants from the coordinates of a symmetric tensor
S and a vector u:
– the basic invariants of the symmetric tensor S:
I1 = tr S, I2 = tr S2 , I3 = tr S3 ;
– the basic invariant of the vector u:
I4 = u ⋅ u;
– the simultaneous invariants of S and u:
I5 = u ⋅ S ⋅ u, I6 = u ⋅ S2 ⋅ u, I7 = [u, S ⋅ u, S2 ⋅ u].
Problem 5.4.
We obtain from u = f (T), by introducing an auxiliary vector, u ⋅ h = f (T, h). Simul-
taneous invariants of T and h which are linear in h can only be computed using the
ε-tensor.
If T is symmetric, then, according to (2.40), we have εijk Tjk hi = 0; the same result
is obtained for the tensor powers T 2 , T 3 , which are also symmetric. So a representation
u = f (T), in which a vector u depends on a symmetric tensor T, does not exist.
If T is antimetric, then T 2 is symmetric and T 3 is again antimetric; but using the
Cayley–Hamilton theorem (3.94), T 3 can be written in terms of T. So with εijk Tjk hi only
one irreducible invariant exists which is linear in h. If T is polar, then ε ⋅⋅ T is axial
because of the ε-tensor. Thus, a representation u = f (T) of a vector u, which depends
on an antimetric tensor T, only exists if the vector u is also axial, and it is
$$u = k(\operatorname{tr} T^2)\ \varepsilon \cdot\cdot\, T,$$
or, written in terms of the axial vector t corresponding to the antimetric tensor T,
$$u = k^*(t \cdot t)\ t.$$
The coefficients k and k* are polar scalar-valued functions of the polar invariants
tr T² and t ⋅ t, respectively.
Problem 5.5.
We get from T = f (v), after introducing an auxiliary tensor, T ⋅⋅ H = f (v, H). So,
three invariants exist which are linear in H:
$$\delta \cdot\cdot\, H, \qquad v\, v \cdot\cdot\, H, \qquad (\varepsilon \cdot v) \cdot\cdot\, H.$$
If v is polar, then v v is also polar, but ε ⋅ v is axial because of the ε-tensor. Since we
are looking for a representation for a polar tensor T, we have only the two generators
δ and v v, and the representation is
$$T = k_1(v \cdot v)\ \delta + k_2(v \cdot v)\ v\, v.$$
If v is axial, then v v and ε ⋅ v are both polar, and the representation is
$$T = k_1(v \cdot v)\ \delta + k_2(v \cdot v)\ v\, v + k_3(v \cdot v)\ \varepsilon \cdot v.$$
$$R = \sqrt{x^2 + y^2}, \quad 0 \leqq R < \infty, \qquad \varphi = \arctan\frac{y}{x}, \quad 0 \leqq \varphi < 2\pi. \tag{B.2}$$
B.1.2 Bases
The Cartesian coordinates of the covariant basis, the contravariant basis, and the
physical basis are
$$g_1 = g^1 = g_{<1>} = \{\cos\varphi,\ \sin\varphi,\ 0\}, \quad g_2 = \{-R\sin\varphi,\ R\cos\varphi,\ 0\}, \quad g^2 = \Big\{-\frac{\sin\varphi}{R},\ \frac{\cos\varphi}{R},\ 0\Big\},$$
$$g_{<2>} = \{-\sin\varphi,\ \cos\varphi,\ 0\}, \quad g_3 = g^3 = g_{<3>} = \{0,\ 0,\ 1\}. \tag{B.3}$$
aR = ax cos φ + ay sin φ,
aφ = ay cos φ − ax sin φ, (B.4)
az = az ;
ax = aR cos φ − aφ sin φ,
ay = aφ cos φ + aR sin φ, (B.5)
az = az ;
$$a_1 = a^1 = a_R, \qquad a_2 = R\, a_\varphi, \quad a^2 = \frac{1}{R}\, a_\varphi, \qquad a_3 = a^3 = a_z; \tag{B.6}$$
$$g_{11} = g^{11} = 1, \quad g_{22} = R^2, \quad g^{22} = \frac{1}{R^2}, \quad g_{33} = g^{33} = 1, \tag{B.10}$$
$$e_{123} = e_{12}{}^3 = e^1{}_{23} = e^1{}_2{}^3 = R, \qquad e^{123} = e^{12}{}_3 = e_1{}^{23} = e_1{}^2{}_3 = \frac{1}{R}, \tag{B.11}$$
and in addition the ε-tensor also contains the corresponding permutations. Also
$$(\operatorname{grad} a)_{RR} = \frac{\partial a_R}{\partial R}, \quad (\operatorname{grad} a)_{R\varphi} = \frac{1}{R}\frac{\partial a_R}{\partial \varphi} - \frac{a_\varphi}{R}, \quad (\operatorname{grad} a)_{Rz} = \frac{\partial a_R}{\partial z},$$
$$(\operatorname{grad} a)_{\varphi R} = \frac{\partial a_\varphi}{\partial R}, \quad (\operatorname{grad} a)_{\varphi\varphi} = \frac{1}{R}\frac{\partial a_\varphi}{\partial \varphi} + \frac{a_R}{R}, \quad (\operatorname{grad} a)_{\varphi z} = \frac{\partial a_\varphi}{\partial z}, \tag{B.17}$$
$$(\operatorname{grad} a)_{zR} = \frac{\partial a_z}{\partial R}, \quad (\operatorname{grad} a)_{z\varphi} = \frac{1}{R}\frac{\partial a_z}{\partial \varphi}, \quad (\operatorname{grad} a)_{zz} = \frac{\partial a_z}{\partial z};$$
$$[(\operatorname{grad} a)\cdot b]_R = \frac{\partial a_R}{\partial R}\, b_R + \frac{\partial a_R}{\partial \varphi}\frac{b_\varphi}{R} + \frac{\partial a_R}{\partial z}\, b_z - \frac{a_\varphi b_\varphi}{R},$$
$$[(\operatorname{grad} a)\cdot b]_\varphi = \frac{\partial a_\varphi}{\partial R}\, b_R + \frac{\partial a_\varphi}{\partial \varphi}\frac{b_\varphi}{R} + \frac{\partial a_\varphi}{\partial z}\, b_z + \frac{a_R b_\varphi}{R}, \tag{B.18}$$
$$[(\operatorname{grad} a)\cdot b]_z = \frac{\partial a_z}{\partial R}\, b_R + \frac{\partial a_z}{\partial \varphi}\frac{b_\varphi}{R} + \frac{\partial a_z}{\partial z}\, b_z;$$
$$\Delta a = \frac{\partial^2 a}{\partial R^2} + \frac{1}{R}\frac{\partial a}{\partial R} + \frac{1}{R^2}\frac{\partial^2 a}{\partial \varphi^2} + \frac{\partial^2 a}{\partial z^2}; \tag{B.20}$$
$$(\Delta a)_R = \frac{\partial^2 a_R}{\partial R^2} + \frac{1}{R}\frac{\partial a_R}{\partial R} - \frac{a_R}{R^2} + \frac{1}{R^2}\frac{\partial^2 a_R}{\partial \varphi^2} + \frac{\partial^2 a_R}{\partial z^2} - \frac{2}{R^2}\frac{\partial a_\varphi}{\partial \varphi},$$
$$(\Delta a)_\varphi = \frac{\partial^2 a_\varphi}{\partial R^2} + \frac{1}{R}\frac{\partial a_\varphi}{\partial R} - \frac{a_\varphi}{R^2} + \frac{1}{R^2}\frac{\partial^2 a_\varphi}{\partial \varphi^2} + \frac{\partial^2 a_\varphi}{\partial z^2} + \frac{2}{R^2}\frac{\partial a_R}{\partial \varphi}, \tag{B.21}$$
$$(\Delta a)_z = \frac{\partial^2 a_z}{\partial R^2} + \frac{1}{R}\frac{\partial a_z}{\partial R} + \frac{1}{R^2}\frac{\partial^2 a_z}{\partial \varphi^2} + \frac{\partial^2 a_z}{\partial z^2}.$$
d ui = (d R, d φ, d z),
d Ai = (R d φ d z, R d z d R, R d R d φ), (B.22)
d V = R d R d φ d z.
d x = (d R, R d φ, d z),
(B.23)
d A = (R d φ d z, d z d R, R d R d φ).
x = r sin ϑ cos φ,
y = r sin ϑ sin φ, (B.24)
z = r cos ϑ,
$$r = \sqrt{x^2 + y^2 + z^2}, \quad 0 \leqq r < \infty, \qquad \vartheta = \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}}, \quad 0 \leqq \vartheta \leqq \pi, \qquad \varphi = \arctan\frac{y}{x}, \quad 0 \leqq \varphi < 2\pi. \tag{B.25}$$
B.2.2 Bases
The Cartesian coordinates of the covariant basis, the contravariant basis, and the
physical basis are
$$g_1 = g^1 = g_{<1>} = \{\sin\vartheta\cos\varphi,\ \sin\vartheta\sin\varphi,\ \cos\vartheta\};$$
$$g_2 = \{r\cos\vartheta\cos\varphi,\ r\cos\vartheta\sin\varphi,\ -r\sin\vartheta\}, \quad g^2 = \Big\{\frac{1}{r}\cos\vartheta\cos\varphi,\ \frac{1}{r}\cos\vartheta\sin\varphi,\ -\frac{1}{r}\sin\vartheta\Big\}, \quad g_{<2>} = \{\cos\vartheta\cos\varphi,\ \cos\vartheta\sin\varphi,\ -\sin\vartheta\}; \tag{B.26}$$
$$g_3 = \{-r\sin\vartheta\sin\varphi,\ r\sin\vartheta\cos\varphi,\ 0\}, \quad g^3 = \Big\{-\frac{\sin\varphi}{r\sin\vartheta},\ \frac{\cos\varphi}{r\sin\vartheta},\ 0\Big\}, \quad g_{<3>} = \{-\sin\varphi,\ \cos\varphi,\ 0\}.$$
$$a_1 = a^1 = a_r, \qquad a_2 = r\, a_\vartheta, \quad a^2 = \frac{1}{r}\, a_\vartheta, \qquad a_3 = r\sin\vartheta\ a_\varphi, \quad a^3 = \frac{1}{r\sin\vartheta}\, a_\varphi; \tag{B.29}$$
arϑ = axx cos ϑ sin ϑ cos2 φ + axy cos ϑ sin ϑ cos φ sin φ
− axz sin2 ϑ cos φ + ayx cos ϑ sin ϑ cos φ sin φ
+ ayy cos ϑ sin ϑ sin2 φ − ayz sin2 ϑ sin φ
+ azx cos2 ϑ cos φ + azy cos2 ϑ sin φ − azz cos ϑ sin ϑ,
aϑr = axx cos ϑ sin ϑ cos2 φ + axy cos ϑ sin ϑ cos φ sin φ
+ axz cos2 ϑ cos φ + ayx cos ϑ sin ϑ cos φ sin φ
+ ayy cos ϑ sin ϑ sin2 φ + ayz cos2 ϑ sin φ
− azx sin2 ϑ cos φ − azy sin2 ϑ sin φ − azz cos ϑ sin ϑ,
(B.30)
aϑϑ = axx cos2 ϑ cos2 φ + axy cos2 ϑ cos φ sin φ
− axz cos ϑ sin ϑ cos φ + ayx cos2 ϑ cos φ sin φ
+ ayy cos2 ϑ sin2 φ − ayz cos ϑ sin ϑ sin φ
− azx cos ϑ sin ϑ cos φ − azy cos ϑ sin ϑ sin φ + azz sin2 ϑ,
axy = arr sin2 ϑ cos φ sin φ + arϑ cos ϑ sin ϑ cos φ sin φ
+ arφ sin ϑ cos2 φ − aϑr cos ϑ sin ϑ cos φ sin φ
+ aϑϑ cos2 ϑ cos φ sin φ + aϑφ cos ϑ cos2 φ
− aφr sin ϑ sin2 φ − aφϑ cos ϑ sin2 φ − aφφ cos φ sin φ,
ayx = arr sin2 ϑ cos φ sin φ + arϑ cos ϑ sin ϑ cos φ sin φ
− arφ sin ϑ sin2 φ + aϑr cos ϑ sin ϑ cos φ sin φ
+ aϑϑ cos2 ϑ cos φ sin φ − aϑφ cos ϑ sin2 φ
+ aφr sin ϑ cos2 φ + aφϑ cos ϑ cos2 φ − aφφ cos φ sin φ,
(B.31)
ayy = arr sin2 ϑ sin2 φ + arϑ cos ϑ sin ϑ sin2 φ
+ arφ sin ϑ cos φ sin φ + aϑr cos ϑ sin ϑ sin2 φ
+ aϑϑ cos2 ϑ sin2 φ + aϑφ cos ϑ cos φ sin φ
+ aφr sin ϑ cos φ sin φ + aφϑ sin ϑ cos φ sin φ + aφφ cos2 φ,
$$a^3{}_3 = a_3{}^3 = a_{\varphi\varphi}.$$
$$(\Delta a)_\varphi = \frac{\partial^2 a_\varphi}{\partial r^2} + \frac{2}{r}\frac{\partial a_\varphi}{\partial r} + \frac{1}{r^2}\frac{\partial^2 a_\varphi}{\partial \vartheta^2} + \frac{\cot\vartheta}{r^2}\frac{\partial a_\varphi}{\partial \vartheta} + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2 a_\varphi}{\partial \varphi^2} + \frac{2}{r^2\sin\vartheta}\frac{\partial a_r}{\partial \varphi} + \frac{2\cot\vartheta}{r^2\sin\vartheta}\frac{\partial a_\vartheta}{\partial \varphi} - \frac{a_\varphi}{r^2\sin^2\vartheta}.$$