Tensor Analysis

Heinz Schade, Klaus Neemann

Translated by
Andrea Dziubek and Edmond Rusjan

Mathematics Subject Classification 2010

Authors

Prof. Dr.-Ing. Heinz Schade
App. 430
Erlenweg 72
14532 Kleinmachnow
Germany
[email protected]

Dr.-Ing. Klaus Neemann
Internationales Studienkolleg
Technische Universität Berlin
Straße des 17. Juni 145
10623 Berlin
Germany
[email protected]

Translators

Prof. Dr. Andrea Dziubek
State University of New York
Polytechnic Institute
100 Seymour Road
Utica NY 13502, USA
[email protected]

Prof. Dr. Edmond Rusjan
State University of New York
Polytechnic Institute
100 Seymour Road
Utica NY 13502, USA
[email protected]

ISBN 978-3-11-040425-8
e-ISBN (PDF) 978-3-11-040426-5
e-ISBN (EPUB) 978-3-11-040549-1

Library of Congress Cataloging-in-Publication Data

Names: Schade, Heinz, author. | Neemann, Klaus, author.
Title: Tensor analysis / Heinz Schade, Klaus Neemann.
Other titles: Tensoranalysis. English
Description: Berlin ; Boston : De Gruyter, [2018] | Series: De Gruyter
textbook | Text in English, translated from German. | Includes
bibliographical references.
Identifiers: LCCN 2018023807| ISBN 9783110404258 | ISBN 9783110404265 (e-book
(pdf) | ISBN 9783110405491 (e-book (epub)
Subjects: LCSH: Calculus of tensors. | Vector analysis.
Classification: LCC QA433 .S2713 2018 | DDC 515/.63--dc23
LC record available at https://lccn.loc.gov/2018023807

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2018 Walter de Gruyter GmbH, Berlin/Boston

Cover image: Klaus Neemann and urfinguss / iStock / Getty Images
Typesetting: VTeX UAB, Lithuania
Printing and binding: CPI books GmbH, Leck


Preface

The first systematic presentation of tensor calculus was published in 1900 by Ricci and
Levi-Civita and hence the subject was initially often called Ricci calculus. Gradually,
the term tensor calculus became generally accepted. Tensor calculus was essential
for the formulation of the theory of relativity, which has in turn stimulated its further
development.
Tensor calculus is on the one hand an area of mathematics with applications
in differential geometry and on the other hand an essential resource for theoretical
physics and, increasingly, also for theoretical engineering. The theory of tensors con-
sists of two parts: tensor algebra and tensor analysis. The term tensor analysis is to-
day often (as in the title of this book) used for both parts. Tensor algebra deals with
the general theory of tensors, while tensor analysis is the theory of tensor fields, i. e.
tensors, which depend on the points in space, in particular in the three-dimensional
Euclidean space of our intuition.
In mathematics, tensor algebra is primarily interesting as a part of multilinear
algebra. Consequently, mathematical presentations usually use algebraic methods.
Scalars, vectors, and finally tensors are introduced as elements of sets. Then opera-
tions between these elements are defined that satisfy certain rules. Finally, the prop-
erties of these tensors are investigated. This approach is very abstract for physics stu-
dents, and even more for engineering students, and leads to a more general concept
of vectors and tensors than we need here.¹
In physics and in engineering we are interested in tensors, because physical quan-
tities can be described as tensors (scalars and vectors are special cases of tensors).
Physical equations are equations between tensors and tensor analysis enables us to
formulate physical equations and to calculate with them. If we want to express a phys-
ical quantity, such as velocity or force, quantitatively, we need a coordinate system in
the space of our intuition. The simplest coordinate system is usually a Cartesian coor-
dinate system. We will see that we also need a coordinate system to quantitatively de-
scribe all other physical quantities, with the exception of scalar quantities. This means
that as long as we only use tensor analysis for computations with physical quantities
we can, mathematically speaking, limit ourselves to tensors which can be represented
in a three-dimensional Cartesian basis.
¹ The original German edition of this book had a final chapter on this mathematical approach to tensor
analysis. We omitted this chapter from the English edition because it appeared that this approach was
of less interest to the readers of this book.

This book is about these tensors. It builds tensor analysis in a more intuitive way
by defining operations between tensors in terms of their coordinates in a Cartesian
basis. This naturally leads to the two tensor analysis notations: the so-called symbolic
or direct notation and the so-called coordinate or index notation. Both notations have
advantages and it is best to learn both notations simultaneously and to be able to
translate between them like between languages. Hence the symbolic notation must be
able to express, as much as possible, the same information as the coordinate notation
with its algorithmic character can express. For example, the order of a tensor is clear
in the coordinate notation from the number of the indices and hence it should be also
clear in the symbolic notation. We aimed to derive a symbolic notation that does that.
The book has five chapters. In Chapter 1 we collect the tools used later in the book,
including the so-called Kronecker symbols and the most important facts about deter-
minants and matrices. Chapter 2 presents tensors in symbolic notation and in Carte-
sian coordinates while Chapter 4 presents tensors in curvilinear coordinates. Chapter 3
deals with the algebra of second-order tensors, which is closely related to the theory of
square order-3 matrices. It also includes some additional theorems about square matri-
ces. Chapter 5 is an introduction to the theory of the representation of tensor functions.
We investigate permissible representations of physical quantities under the condition
that a physical equation must satisfy tensorial homogeneity. This is similar to dimen-
sional analysis, where an equation must satisfy dimensional homogeneity. (Tensorial
homogeneity means that only tensors of the same order can be added or set equal to
each other. For example, torque cannot be equal to work, because torque is a vector
and work is a scalar.)
Important formulas are indented and labeled by a number. Intermediate results,
which are cited later, for example, in a proof or in an auxiliary calculation, are left-
aligned and labeled by a letter on the right side. The most important formulas and
theorems (which are worth memorizing) are highlighted by a gray background and
typeset in a font different from the rest of the text. Italics are used to emphasize text.
The book can be used as a textbook for a one-semester course. It is also well suited
for self-study. Both, and especially the latter, require solving exercise problems and
hence the text contains assignment problems at appropriate places. They are marked
with the symbol of a pen on the left side. Only a limited number of problems is in-
cluded, so all problems should be solved (unless the corresponding part of the text
contains nothing new for the reader), and preferably before continuing reading the
text. The relatively small number of problems makes it possible to write in some detail
how the problems can be solved (see Appendix A).
The bibliography at the end of the book has been compiled by the translators.
Without the critical questions and suggestions for improvements from many stu-
dents and assistants, and the generous support of our Institute, we would not have
been able to write this book. We want to especially thank (now Professor) Frank
Kameier, who professionally and didactically accompanied the course development
for many years, Thomas Lauke for careful proofreading and preparing the manuscript
for printing, Evelyn Kulzer for drawing the pictures, Andrea Dziubek and Edmond Rusjan
for their conscientious translation into English and for initiating this translation,
and last but not least Dr. Konrad Kieling from the Walter de Gruyter publishing house
for enabling this English edition.

Kleinmachnow and Berlin, Germany Heinz Schade

March 2018 Klaus Neemann

Acknowledgments of the Translators
First and foremost, we would like to thank the authors, Professor Heinz Schade and
Dr. Klaus Neemann, for writing this beautiful, reader-friendly, concise, and self-contained
book, for giving us the opportunity to translate it and for their guidance,
help, and patience during the translation. We would like to thank the publisher, Wal-
ter de Gruyter, and in particular Dr. Konrad Kieling, for making this happen and for
their support and patience. We are very grateful to SUNY Poly, Utica, and in particular
to our Dean, Professor Andrew Russell, for their support and for making this project
possible, and to our alumni Paul Lastowicka and Christopher Indolfi, for their diligent
reading of translation drafts and for offering suggestions for improvements.

Utica, NY, USA Andrea Dziubek

March 2018 Edmond Rusjan


Contents

Preface

Acknowledgments of the Translators

1 Algebraic Tools
1.1 The Summation Convention
1.2 N-tuples
1.2.1 Definitions
1.2.2 Linear Operations
1.2.3 Linear Independence
1.3 Determinants
1.3.1 Definitions
1.3.2 Computing Determinants
1.3.3 Computations with Determinants
1.4 Kronecker Symbols
1.4.1 δij
1.4.2 δp...q
1.4.3 εi...j
1.4.4 Representation of a Determinant by εi...j
1.4.5 εijk
1.5 Matrices
1.5.1 Definitions
1.5.2 Arithmetic Operations and First Conclusions
1.5.3 Equations Involving Matrices and Equations Involving Matrix Elements
1.5.4 Elementary Transformations, Normal Form, Equivalent Matrices, Similar Matrices
1.5.5 Orthogonal Matrices
1.6 Algorithms
1.6.1 Computing the Value of a Determinant
1.6.2 Solution of Systems of Linear Equations with the Same Regular Coefficient Matrix (“Division by a Regular Matrix”, Gauss–Jordan Algorithm)
1.6.3 Determining the Rank of a Matrix or a Determinant

2 Tensor Analysis in Symbolic Notation and in Cartesian Coordinates
2.1 Cartesian Coordinates, Points, Position Vectors
2.1.1 Position Vectors and Point Coordinates
2.1.2 Transformations of Cartesian Coordinate Systems
2.1.3 Properties of the Transformation Coefficients
2.1.4 Transformation Law for Basis Vectors
2.1.5 Transformation Law for Point Coordinates
2.2 Vectors
2.2.1 Vectors, Vector Components, and Vector Coordinates
2.2.2 The Transformation Law for Vector Coordinates
2.3 Tensors
2.3.1 Second-Order Tensors
2.3.2 Tensors of Arbitrary Order
2.3.3 Symmetries in Physics
2.4 Symbolic Notation, Index Notation, and Matrix Notation
2.5 Equality, Addition, and Subtraction of Tensors. Multiplication of a Tensor by a Scalar. Linear Independence
2.6 Transposed, Isometric, Symmetric, and Antimetric Tensors
2.7 Tensor Multiplication of Tensors
2.7.1 Definition
2.7.2 Properties
2.7.3 Tensors, Tensor Components, and Tensor Coordinates
2.7.4 Tensor Equations, Transformation Equations, and Representation Equations
2.8 δ-Tensor, ε-Tensor, Isotropic Tensors
2.8.1 The δ-Tensor
2.8.2 The ε-Tensor
2.8.3 Isotropic Tensors
2.9 Scalar Multiplication of Tensors
2.9.1 Definition
2.9.2 Properties
2.9.3 Contraction and Trace
2.9.4 Multiple Scalar Products
2.10 Vector Multiplication of Tensors
2.10.1 Definition
2.10.2 Properties
2.10.3 Triple Product
2.11 Summary of Tensor-Algebraic Operations
2.12 Differential Operations
2.12.1 The Fundamental Theorem of Tensor Analysis
2.12.2 Gradient
2.12.3 (Total) Differential
2.12.4 Divergence
2.12.5 Curl
2.12.6 Laplace Operator
2.13 Rules for Indices and Underlines
2.14 Integrals of Tensor Fields
2.14.1 Line Integrals of Tensor Coordinates
2.14.2 Normal Vector and Surface Vector of a Surface Element
2.14.3 Surface Integrals of Tensor Coordinates
2.14.4 Volume Integrals of Tensor Coordinates
2.14.5 Integrals of Higher-Order Tensor Fields
2.15 Gauss Theorem and Stokes Theorem
2.15.1 Gauss Theorem
2.15.2 Stokes Theorem

3 Algebra of Second-Order Tensors
3.1 Additive Decomposition of a Tensor
3.2 The Determinant of a Tensor
3.3 The Vector Associated with an Antimetric Tensor
3.4 The Cotensor of a Tensor
3.5 The Rank of a Tensor
3.6 The Inverse Tensor
3.7 Orthogonal Tensors
3.8 Tensors as Linear Vector Functions
3.8.1 Rank 3
3.8.2 Rank 2
3.8.3 Rank 1
3.8.4 Rank 0
3.9 Reciprocal Bases
3.9.1 Definition
3.9.2 Orthogonality Relations
3.9.3 Orthogonal Bases and Orthonormal Bases
3.9.4 Reciprocal Bases in the Plane
3.10 Representation of a Tensor by Vectors
3.10.1 Rank 3
3.10.2 Rank 2
3.10.3 Rank 1
3.11 Eigenvalues and Eigenvectors. The Characteristic Equation
3.11.1 Eigenvalues and Eigenvectors
3.11.2 The Characteristic Equation and Main Invariants
3.11.3 Classification of Tensors by the Type of Their Eigenvalues, Eigenvalue Theorems
3.11.4 Theorems about Eigenvectors
3.11.5 Eigenvalues and Eigenvectors of Square Matrices
3.12 Symmetric Tensors
3.12.1 Principal Axes Transformation
3.12.2 Eigenvalues and Rank of a Tensor
3.12.3 Eigenvalues and Definiteness of a Tensor
3.12.4 Symmetric Matrices
3.13 Orthogonal Polar Tensors
3.13.1 Rotation in a Plane
3.13.2 Transformation to an Eigendirection
3.13.3 The Orthogonal Tensor as a Function of Rotation Angle and Rotation Axis or Reflection Axis
3.13.4 Rotation and Coordinate Transformation
3.14 Powers of Tensors. Cayley–Hamilton Theorem
3.14.1 Powers with Integer Exponents
3.14.2 Powers with Real Exponents
3.14.3 Cayley–Hamilton Theorem
3.15 Basic Invariants
3.16 Polar Decomposition of a Tensor

4 Tensor Analysis in Curvilinear Coordinates
4.1 Curvilinear Coordinates
4.1.1 Curvilinear Coordinate Systems
4.1.2 Coordinate Surfaces and Coordinate Curves
4.1.3 Holonomic Bases
4.1.4 Rectilinear and Cartesian Coordinate Systems
4.1.5 Orthogonal Coordinate Systems
4.2 Holonomic Tensor Coordinates
4.2.1 Introduction
4.2.2 Transformations Between Two Curvilinear Coordinate Systems
4.2.3 The Summation Convention
4.2.4 The δ-Tensor
4.2.5 Raising and Lowering Indices
4.2.6 The ε-Tensor
4.2.7 Isotropic Tensors
4.2.8 Tensor Algebra in Holonomic Coordinates
4.3 Physical Bases and Tensor Coordinates
4.4 Tensor-Analytical Operations
4.4.1 Partial Derivative and Differential of the Position Vector
4.4.2 Partial Derivative and Total Differential of the Holonomic Bases, Christoffel Symbols
4.4.3 Christoffel Symbols and Metric Coefficients
4.4.4 Partial Derivatives of Tensors. Partial and Covariant Derivatives of Tensor Coordinates
4.4.5 The Total Differential of Tensors. The Total and the Absolute Differential of Tensor Coordinates
4.4.6 Derivatives with Respect to a Parameter
4.4.7 Gradient
4.4.8 Divergence and Curl
4.4.9 Physical Coordinates of Tensor-Analytical Operations
4.4.10 Second Covariant Derivative of a Tensor Coordinate. Laplace Operator
4.4.11 Integrals of Tensor Fields
4.5 Fundamentals of the Theory of Surfaces
4.5.1 Surface Coordinates and Moving Frames
4.5.2 Surface Vectors and Surface Tensors
4.5.3 Metric Properties
4.5.4 Curvature Properties
4.5.5 Derivative Equations, Integrability Conditions

5 Representation of Tensor Functions
5.1 The Basic Idea of the Representation of Tensor Functions
5.2 Generalized Cayley–Hamilton Theorem
5.3 Invariants of Vectors and Second-Order Tensors
5.3.1 Invariants of Vectors
5.3.2 Independent Invariants of a Second-Order Tensor
5.3.3 Irreducible Invariants of Second-Order Tensors
5.3.4 Summary
5.4 Isotropic Tensor Functions
5.4.1 Invariance Conditions
5.4.2 Scalar-Valued Functions
5.4.3 Vector-Valued Functions
5.4.4 Tensor-Valued Functions
5.4.5 Summary
5.5 Considering Anisotropy

References

Appendices

A Solutions to the Problems

B Cylindrical Coordinates and Spherical Coordinates
B.1 Cylindrical Coordinates
B.1.1 Transformation Equations for Point Coordinates
B.1.2 Bases
B.1.3 Transformation Equations for Tensor Coordinates
B.1.4 δ-Tensor and ε-Tensor
B.1.5 Christoffel Symbols
B.1.6 Differential Operators
B.1.7 Curve-, Area-, and Volume Elements
B.2 Spherical Coordinates
B.2.1 Transformation Equations for Point Coordinates
B.2.2 Bases
B.2.3 Transformation Equations for Tensor Coordinates
B.2.4 δ-Tensor and ε-Tensor
B.2.5 Christoffel Symbols
B.2.6 Differential Operators
B.2.7 Curve-, Area-, and Volume Elements

Index

1 Algebraic Tools
In this chapter, we introduce the summation convention, the Kronecker delta sym-
bol δij , the generalized Kronecker delta symbol δp...q , and the Levi-Civita permutation
symbol εi...j . At the same time, we review the most important facts about N-tuples,
determinants, and matrices. We will obtain some more, less elementary properties of
square matrices later by generalizing tensor algebra results.

1.1 The Summation Convention

1. We distinguish two types of letter subscripts and superscripts:
– Running indices assume values from an index set of natural numbers. They suc-
cessively assume each value from the given index set. For example, if i takes the
values from 1 to 4, then ai denotes the set (a1 , a2 , a3 , a4 ) and we say that the index
i runs from 1 to 4. We will use lower-case Latin letters (except x, y, z) for running
indices.
– Labels have a fixed meaning. We will use other letters for labels. For example, we
write the (physical) cylindrical coordinates of a vector a as (aR , aϕ , az ).

2. In some sections, we need to distinguish between lower and upper indices. Where
upper indices can be confused with exponents, we will set the exponents in brackets.

3. For the calculation with matrices and tensors, an agreement proposed by Einstein
which is called summation convention has been adopted for running indices. It exists
in different versions; we introduce it in the following form:

If a running index appears twice in a term, summation over the values of the index
is implied, without explicitly writing the summation sign.

A term is a mathematical expression separated by a plus sign, a minus sign, an equality
sign, or an inequality sign.
For example, if i assumes the values 1, 2, 3, then $a_i b_i$ is, according to the summation
convention, $\sum_{i=1}^{3} a_i b_i$, which is $a_1 b_1 + a_2 b_2 + a_3 b_3$. This is independent of the
meaning of $a_i$ and $b_i$; for example, $(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$ do not have to be vectors.
We often abbreviate $a_i a_i$ as $a_i^2$ or $a_i^{(2)}$. However, $a_{ij} a_{ij}$, which is, according to the
summation convention, the double sum $\sum_i \sum_j a_{ij} a_{ij}$, cannot be written as a square
because $a_{ij}^{(2)}$ is defined differently later.
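
The implied sums can also be spelled out as explicit loops. The following is a minimal sketch in plain Python, not part of the original text; the names `a`, `b`, `a2` and the numerical values are assumptions chosen only for illustration. Python lists are zero-based, so the index value i = 1 corresponds to list position 0.

```python
# Illustrative data (assumed values, not from the text).
a = [1, 2, 3]   # (a_1, a_2, a_3)
b = [4, 5, 6]   # (b_1, b_2, b_3)

# a_i b_i: the index i appears twice in the term, so we sum over i = 1, 2, 3.
ai_bi = sum(a[i] * b[i] for i in range(3))

# a_i a_i: the sum of the squares of the elements.
ai_ai = sum(a[i] * a[i] for i in range(3))

# a_ij a_ij: both i and j appear twice, so this is a double sum.
a2 = [[1, 2],
      [3, 4]]
aij_aij = sum(a2[i][j] * a2[i][j] for i in range(2) for j in range(2))

print(ai_bi, ai_ai, aij_aij)  # 32 14 30
```

The loops make no assumption about what the numbers mean, which mirrors the remark that the convention is independent of the meaning of the symbols.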


4. The summation convention has the following consequences for running indices:
– A running index can only appear once or twice in a term.
If a running index appears once, it is called a free index. A free index succes-
sively takes each value of its index set. If a running index appears twice, it is called
a summation index. A summation index indicates summation over the values of
its index set.
For example, if an index i has the index set 1, 2 and a second index j has the
index set 1, 2, 3, then the equation aij bj = ci abbreviates the two equations

a11 b1 + a12 b2 + a13 b3 = c1 ,

a21 b1 + a22 b2 + a23 b3 = c2 .

These two equations can also be written as aik bk = ci or amn bn = cm , but not as
aii bi = ci , because in the last equation i appears three times in a term and this is
not defined.
It is often necessary to rename indices. For example, if we substitute Ai = αij Bj
into aj = Ai Cij we get formally aj = αij Bj Cij , but then j appears three times in a
term. To avoid this, we can write the first equation as Ai = αik Bk , and now we get
an acceptable equation, aj = αik Bk Cij .
– Running indices must have the same index set in all terms of an equation.
For example, in aij bj = ci , if i runs from 1 to 3 in aij , it cannot run from 1 to 2
in ci , and the same is true for j.
– A free index must match in all terms of an equation. The order of indices in an equa-
tion does not matter.
For example, the equation aij bj = ck is not permissible, while ai bj = cij and
bj ai = cij are permissible and have the same meaning.

The last rule has one exception: we do not add a subscript to the number zero, i. e. the
equation αi = 0 means all αi are zero. We will also need the logical negation of this
statement. To express “not all αi are zero” we introduce our own inequality sign =\ and
write αi =\ 0 (read: αi not all zero). On the other hand, the equation with the common
inequality sign αi ≠ 0 means all αi are nonzero. To not deviate more than necessary
from common practice, we will use the usual inequality sign if an equation has both
meanings. This is the case when the equation does not have free indices, so we write
a ≠ 0 and ai bi ≠ 0.
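
The distinction between free indices and summation indices can be sketched in plain Python as follows. This is an illustration under assumed data, not part of the original text: a free index becomes the position in the resulting list, a summation index becomes a loop inside `sum`. All variable names are hypothetical.

```python
# a_ij b_j = c_i with i in {1, 2} (free index) and j in {1, 2, 3} (summation index).
a = [[1, 2, 3],
     [4, 5, 6]]
b = [1, 0, 2]

c = [sum(a[i][j] * b[j] for j in range(3)) for i in range(2)]

# Renaming the indices (i -> m, j -> k) does not change the result.
c_renamed = [sum(a[m][k] * b[k] for k in range(3)) for m in range(2)]

# a_i b_j = c_ij: two free indices and no summation. The order of the
# factors does not matter, so b_j a_i gives the same c_ij.
p = [1, 2]
q = [3, 4, 5]
c_ij = [[p[i] * q[j] for j in range(3)] for i in range(2)]
c_ij_swapped = [[q[j] * p[i] for j in range(3)] for i in range(2)]

print(c)      # [7, 16]
print(c_ij)   # [[3, 4, 5], [6, 8, 10]]
```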

Problem 1.1.
Write out the following expressions:

for N = 4, i. e. i = 1, 2, 3, 4:
A. $a_i B_i$, B. $A_{ii}$, C. $\dfrac{\partial u_i}{\partial x_i}$, D. $\dfrac{\partial^2 \varphi}{\partial x_i^2}$,

for N = 3:
E. $a_{ij} b_{ij}$, F. $a_{ii} b_{jj}$,

for N = 2:
G. $\dfrac{\partial u_i}{\partial x_j} \dfrac{\partial u_i}{\partial x_j}$.

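Problem 1.2.
Substitute
A. $u_i = A_{ik} n_k$ in $\varphi = u_k v_k$,
B. $u_i = B_{ij} v_j$ and $C_{ij} = p_i q_j$ in $w_i = C_{mi} u_m$,
C. $\phi = \tau_{ij} d_{ij}$, $\tau_{ij} = \eta \left( \dfrac{\partial v_i}{\partial x_j} + \dfrac{\partial v_j}{\partial x_i} \right)$, $d_{ij} = \dfrac{1}{2} \left( \dfrac{\partial v_i}{\partial x_j} + \dfrac{\partial v_j}{\partial x_i} \right)$, and
$q_i = -\kappa \dfrac{\partial T}{\partial x_i}$ (κ constant) in $\rho T \dfrac{Ds}{Dt} = \phi - \dfrac{\partial q_i}{\partial x_i}$.

Expand the parentheses.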

5. In the rare cases that we do not want to apply the summation convention, we underline
the corresponding running index. For example, $A_{ij}$ with i, j = 1, 2, 3 represents
the matrix

$$\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix},$$

$A_{ii}$ represents the sum $A_{11} + A_{22} + A_{33}$ of the diagonal elements, and $A_{i\underline{i}}$ represents
the set $(A_{11}, A_{22}, A_{33})$ of the diagonal elements. Underlined indices are neither free nor
summation indices. We call these indices bound indices, because if a term has a bound
index, then the same index must appear either as a free index or as a summation index
in the same term. These indices do not count in the rule that a running index can
appear at most twice in a term.
This means $\alpha_i A_{i\underline{i}}$ is permissible and represents $\alpha_1 A_{11} + \alpha_2 A_{22} + \alpha_3 A_{33}$, and $\lambda_{\underline{i}} a_{ij}$ is
also permissible and represents the matrix

$$\begin{pmatrix} \lambda_1 a_{11} & \lambda_1 a_{12} & \lambda_1 a_{13} \\ \lambda_2 a_{21} & \lambda_2 a_{22} & \lambda_2 a_{23} \\ \lambda_3 a_{31} & \lambda_3 a_{32} & \lambda_3 a_{33} \end{pmatrix}.$$

A short calculation shows that it is possible to multiply a free index by a bound index,
but that it is not possible to multiply a summation index by a bound index: From
$a_{ij} b_{jk} = c_{ik}$ it follows that $\lambda_{\underline{i}} a_{ij} b_{jk} = \lambda_{\underline{i}} c_{ik}$, but from $a_i b_i = c_i d_i$ it does not follow
that $\lambda_{\underline{i}} a_i b_i = \lambda_{\underline{i}} c_i d_i$.
If an index is bound to a free index, we can convert the free index into a summation
index by multiplication. For example, $\lambda_{\underline{i}} a_i$ becomes $\lambda_{\underline{i}} a_i b_i$. However, such an expression
is not associative; it should be read as $(\lambda_{\underline{i}} a_i) b_i$. The other association $\lambda_{\underline{i}} (a_i b_i)$ is
not defined, which becomes clear when we substitute $a_i b_i = c$.
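
The different roles of free, summation, and bound (underlined) indices can be made concrete with explicit loops. The following plain Python sketch is an illustration only; the names `A`, `alpha`, `lam` and the numbers are assumptions, and the comments describe which index is underlined in the corresponding index expression.

```python
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
alpha = [1, 10, 100]
lam = [2, 3, 4]
n = 3

# A_ii without underline: i is a summation index, we get the sum of the
# diagonal elements.
diagonal_sum = sum(A[i][i] for i in range(n))

# A_ii with one index underlined: no summation, we get the set of the
# diagonal elements (one entry per value of the free index i).
diagonal = [A[i][i] for i in range(n)]

# alpha_i A_ii with one index of A underlined: the bound index does not
# count, so i appears twice and is a summation index.
weighted_sum = sum(alpha[i] * A[i][i] for i in range(n))

# lambda_i a_ij with the index of lambda underlined: i and j remain free,
# and row i of the matrix is multiplied by lambda_i.
row_scaled = [[lam[i] * A[i][j] for j in range(n)] for i in range(n)]

print(diagonal_sum)   # 15
print(diagonal)       # [1, 5, 9]
print(weighted_sum)   # 951
print(row_scaled)     # [[2, 4, 6], [12, 15, 18], [28, 32, 36]]
```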

Problem 1.3.
A. Are the following equations allowed by the summation convention?

1. Ai Bjj = Ci Dkk , 2. Ai Bi = Ci Dj , 3. Ai Bi = Ci Di ,

4. Ai Bi = Ci , 5. Am Bn = Cmn , 6. Ai Bk = Ci Dj ,

7. Ai Bk = Ak Bi , 8. μmn Am = cn Fii , 9. A = αm Cmm ,

10. Aii = Bij Cjji .

B. Write out all permissible equations for N = 2.

1.2 N-tuples
1.2.1 Definitions

A sequence of N numbers is called an N-tuple. We write an N-tuple as $(a_1, a_2, \dots, a_N)$
or, for short, $(a_i)$. The length N of the sequence is called the size of the N-tuple and does
not appear in the abbreviation.

$$(a_i) := (a_1, a_2, \dots, a_N). \tag{1.1}$$

The order of elements matters: changing the order of elements in an N-tuple gives a
different N-tuple. Special names are often used for small values of N. We call 2-tuples
ordered pairs, 3-tuples ordered triples, 4-tuples ordered quadruples, etc.
For example, the three Cartesian coordinates of a point form an N-tuple, in par-
ticular, an ordered triple.
We call an N-tuple of arbitrary size with all elements zero a zero-N-tuple and
write (0).

1.2.2 Linear Operations

For N-tuples of equal size we define the following so-called linear operations: equality,
addition, subtraction, and multiplication by a number:
– Two N-tuples are equal if all homologous elements, i. e. elements at identical positions
in both N-tuples and subscribed by the same index, are equal.
– Addition and subtraction of two N-tuples are defined by adding or subtracting
their homologous elements.
– Multiplication of an N-tuple by a number is defined by multiplying every element
of the N-tuple by that number.

Equality, addition, and subtraction are defined only for N-tuples of equal size.
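
As a small illustration (not from the original text), the linear operations can be sketched with Python tuples. The helper names `add`, `subtract`, and `scale` are hypothetical; equality of tuples in Python already compares homologous elements.

```python
def _check_size(a, b):
    # The operations are defined only for N-tuples of equal size.
    if len(a) != len(b):
        raise ValueError("N-tuples must have equal size")

def add(a, b):
    """Add homologous elements of two N-tuples."""
    _check_size(a, b)
    return tuple(x + y for x, y in zip(a, b))

def subtract(a, b):
    """Subtract homologous elements of two N-tuples."""
    _check_size(a, b)
    return tuple(x - y for x, y in zip(a, b))

def scale(alpha, a):
    """Multiply every element of an N-tuple by the number alpha."""
    return tuple(alpha * x for x in a)

print(add((1, 2, 3), (4, 5, 6)))       # (5, 7, 9)
print(subtract((4, 5, 6), (1, 2, 3)))  # (3, 3, 3)
print(scale(2, (1, 2, 3)))             # (2, 4, 6)
print((1, 2, 3) == (3, 2, 1))          # False: the order of elements matters
```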

1.2.3 Linear Independence

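1. A set of P N-tuples of equal size $(\overset{1}{a}_i), (\overset{2}{a}_i), \dots, (\overset{P}{a}_i)$ is called linearly independent if
the equation

$$\alpha_1 (\overset{1}{a}_i) + \alpha_2 (\overset{2}{a}_i) + \cdots + \alpha_P (\overset{P}{a}_i) = (0) \tag{1.2}$$

is satisfied only if all $\alpha_j$ are zero. This solution is called the trivial solution. Using the
summation convention we abbreviate

$$\alpha_j (\overset{j}{a}_i) = (0) \quad \text{only if} \quad \alpha_j = 0, \qquad i = 1, \dots, N, \quad j = 1, \dots, P. \tag{1.3}$$

If equation (1.2) also has nontrivial solutions, i. e. if at least one of the $\alpha_j$ is nonzero,

$$\alpha_j (\overset{j}{a}_i) = (0), \quad \alpha_j \mathrel{=\!\!\!\backslash} 0, \qquad i = 1, \dots, N, \quad j = 1, \dots, P, \tag{1.4}$$

then the N-tuples $(\overset{j}{a}_i)$ are called linearly dependent.
2. Equation (1.2) represents a system of N homogeneous, linear equations to determine
the P unknown $\alpha_j$:

$$\begin{aligned}
\alpha_1 \overset{1}{a}_1 + \alpha_2 \overset{2}{a}_1 + \cdots + \alpha_P \overset{P}{a}_1 &= 0, \\
\alpha_1 \overset{1}{a}_2 + \alpha_2 \overset{2}{a}_2 + \cdots + \alpha_P \overset{P}{a}_2 &= 0, \\
&\;\;\vdots \\
\alpha_1 \overset{1}{a}_N + \alpha_2 \overset{2}{a}_N + \cdots + \alpha_P \overset{P}{a}_N &= 0.
\end{aligned}$$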
If the number of unknowns is larger than the number of equations, i. e. if P > N,
the system always has nontrivial solutions. More than N N-tuples are always linearly
dependent, while N or fewer than N N-tuples can be linearly independent.
3. A set of N-tuples where each pair is linearly dependent is called collinear; a set of
N-tuples where each triple is linearly dependent is called coplanar.
4. The left side of (1.2) is an N-tuple, which is obtained by multiplying each of the N-
tuples (of a given set of N-tuples) by a number and then summing up these products.
This is called a linear combination of these N-tuples.
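
Whether equation (1.2) has only the trivial solution can be checked numerically. The sketch below is an illustration and not the method of the text: it uses row reduction with exact fractions, a technique that has not been introduced at this point, and the function names `rank` and `linearly_independent` are hypothetical. A set of P N-tuples is linearly independent exactly when the reduction finds P pivot rows.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivot rows found by exact row reduction of equally sized N-tuples."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivot rows found so far
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def linearly_independent(tuples):
    """True if the only linear combination that gives (0) is the trivial one."""
    return rank(tuples) == len(tuples)

print(linearly_independent([(1, 0, 0), (0, 1, 0)]))    # True
print(linearly_independent([(1, 2, 3), (2, 4, 6)]))    # False (collinear pair)
print(linearly_independent([(1, 0), (0, 1), (1, 1)]))  # False (P = 3 > N = 2)
```

The last call illustrates the statement that more than N N-tuples are always linearly dependent.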

1.3 Determinants
We assume the reader is familiar with the algebra of determinants; however, we summarize
here the most important definitions and theorems (without proofs) for later
reference.

1.3.1 Definitions

1. We can assign a number to a collection of numbers $a_{ij}$ arranged in a square table.
This number is called the determinant of this square table and we write

$$\det \underset{\sim}{a} = \det a_{ij} = \begin{vmatrix} a_{11} & a_{12} & \dots & a_{1N} \\ a_{21} & a_{22} & \dots & a_{2N} \\ \vdots & & & \vdots \\ a_{N1} & a_{N2} & \dots & a_{NN} \end{vmatrix}. \tag{1.5}$$

A determinant with N rows and N columns is called a determinant of order N. Clearly,
each row or column of a determinant is an N-tuple. Let

$$A_{ij \dots k} := a_{i1} a_{j2} \cdots a_{kN} \tag{1.6}$$

be a product of the elements of a determinant of order N with exactly one element
from each row and from each column, i. e. the indices ij . . . k represent an arbitrary
permutation of the numbers 1, . . . , N. Since there are N! permutations of N numbers,
this yields N! products.
The value of the determinant is the algebraic sum of these N! products. The prod-
uct A12...N and all products which are obtained by an even number of index exchanges
from the sequence of indices 1, 2, . . . , N have a positive sign and all products which
are obtained by an odd number of exchanges have a negative sign.
This definition can easily be written as a formula: Let
– $p_m(A_{ij \dots k})$ with m = 1, . . . , N! be the products $A_{ij \dots k}$ in any order and
– $\alpha_m$ be the number of index exchanges needed to transform the index sequence
1, 2, . . . , N into the index sequence i, j, . . . , k (which is the index sequence of the
permutation needed to generate the product $p_m(A_{ij \dots k})$).
Then

$$\det \underset{\sim}{a} := \sum_{m=1}^{N!} (-1)^{\alpha_m} \, p_m(A_{ij \dots k}). \tag{1.7}$$
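
Definition (1.7) translates directly into a short program. The following plain Python sketch is an illustration, not part of the original text; the function names are hypothetical. The sign is computed from the number of inversions of the permutation, which has the same parity as any number of index exchanges that produces it.

```python
from itertools import permutations

def sign(perm):
    """Return (-1)**alpha for a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for x in range(len(perm))
        for y in range(x + 1, len(perm))
        if perm[x] > perm[y]
    )
    return -1 if inversions % 2 else 1

def det_by_definition(a):
    """Algebraic sum of the N! products with one element from each row and column."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        # A_{ij...k} = a_{i1} a_{j2} ... a_{kN}: the row indices are permuted,
        # the column indices run through 1, ..., N in order.
        product = 1
        for col in range(n):
            product *= a[perm[col]][col]
        total += sign(perm) * product
    return total

print(det_by_definition([[1, 2], [3, 4]]))                   # -2
print(det_by_definition([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```

Since the number of products grows as N!, this is useful only for small N.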

2. A determinant that is obtained from the original determinant by omitting any
number of rows and an equal number of columns is called a subdeterminant of the
original determinant. If the omitted columns have the same indices as the omitted
rows, the subdeterminant is called a principal subdeterminant. The subdeterminant
$\Delta_{ij}$ of order N − 1, which is obtained by omitting the i-th row and the j-th column, is
called a minor;¹ minors corresponding to the main diagonal are also called principal
minors. A minor multiplied by $(-1)^{i+j}$ is called the cofactor corresponding to the element
$a_{ij}$, for which we write $b_{ij}$:

$$b_{ij} := (-1)^{i+j} \Delta_{ij}. \tag{1.8}$$

Cofactors are also called adjuncts or algebraic complements.

3. A determinant of order N has rank $R \le N$ if at least one subdeterminant of order R
is nonzero and all higher-order subdeterminants are zero. For “the determinant $\underset{\sim}{a}$ has
the rank R” we also write $\operatorname{rank} \det \underset{\sim}{a} = R$.

4. A determinant of order N is called regular if it has the rank N, it is called singular if
its rank is smaller than N, and it is called M-times singular if its rank is N − M; then
we also say the determinant has rank defect M.

5. The summation convention does not apply across the symbol det: det aij = 1 is a
valid equation; however, det aij = δij is not permissible.

1.3.2 Computing Determinants

1. For a determinant of order two, according to (1.7), we have

$$\det \underset{\sim}{a} = a_{11} a_{22} - a_{21} a_{12}. \tag{1.9}$$

To easily remember this formula, we write out the elements of the determinant:

[Figure: square table with the rows (a11, a12) and (a21, a22); a bold line marks the diagonal through a11 and a22, a thin line marks the diagonal through a21 and a12.]

This square table has two diagonals: the diagonal indicated by the bold line is called
the main diagonal, and the diagonal indicated by the thin line is called the antidiagonal.
To compute the value of a determinant, we subtract from the product of the elements
on the main diagonal the product of the elements on the antidiagonal; we shorten this
rule to “main diagonal minus antidiagonal”.

For a determinant of order three, according to (1.7), we have

$$\det \underset{\sim}{a} = a_{11} a_{22} a_{33} - a_{21} a_{12} a_{33} - a_{31} a_{22} a_{13} - a_{11} a_{32} a_{23} + a_{21} a_{32} a_{13} + a_{31} a_{12} a_{23}. \tag{1.10}$$

To easily remember this formula we first write out the elements of the determinant and
then we repeat the first and the second row below.

[Figure: rectangular table with the rows (a11, a12, a13), (a21, a22, a23), (a31, a32, a33), (a11, a12, a13), (a21, a22, a23); lines mark the three main diagonals and the three antidiagonals.]

This rectangular table has three main diagonals and three antidiagonals. The value of
the determinant is the difference between the sum of the products of the elements on
each main diagonal and the sum of the products of the elements on each antidiagonal.
Again we shorten this rule to “main diagonals minus antidiagonals” (rule of Sarrus).
Unfortunately, there are no analogous computational patterns or rules for higher-order
determinants.

¹ The term minor is also used with a different meaning in the literature.
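
The two memory rules can be written as short functions. This plain Python sketch is an illustration only; the function names `det2` and `det3_sarrus` are hypothetical. For order three, the table is extended by repeating the first and the second row, exactly as described above, and the diagonals are read off the extended table.

```python
def det2(a):
    """Order two: main diagonal minus antidiagonal."""
    return a[0][0] * a[1][1] - a[1][0] * a[0][1]

def det3_sarrus(a):
    """Order three: main diagonals minus antidiagonals (rule of Sarrus)."""
    # Rows 1, 2, 3 followed by copies of rows 1 and 2.
    t = a + a[:2]
    # Main diagonals run from the left column down to the right column.
    main = sum(t[r][0] * t[r + 1][1] * t[r + 2][2] for r in range(3))
    # Antidiagonals run from the right column down to the left column.
    anti = sum(t[r][2] * t[r + 1][1] * t[r + 2][0] for r in range(3))
    return main - anti

print(det2([[1, 2], [3, 4]]))                          # -2
print(det3_sarrus([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```

The three products collected in `main` are the positive terms of (1.10) and the three products in `anti` are its negative terms.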

2. We can compute the value of a determinant with the cofactor (or Laplace) expansion
formula: we pick any row or column, multiply every element by its cofactor, and add
these products. Then we have

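\[ \det \underset{\sim}{a} = a_{i\underline{k}}\, b_{i\underline{k}} = a_{\underline{i}k}\, b_{\underline{i}k}. \tag{1.11} \]

Here an underlined index is excluded from the summation convention: the first expression is the expansion along the k-th column (sum over i only), the second is the expansion along the i-th row (sum over k only).

For example, for N = 3 the expression $a_{i\underline{k}}\, b_{i\underline{k}}$ represents

for k = 1: a11 b11 + a21 b21 + a31 b31 ,
for k = 2: a12 b12 + a22 b22 + a32 b32 ,
for k = 3: a13 b13 + a23 b23 + a33 b33 .

All three expressions are equal to $\det \underset{\sim}{a}$.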

3. In practice, we compute the value of a determinant by first transforming the deter-
minant into triangular form, i. e. a determinant which has only zeros below the main
diagonal or only zeros above the main diagonal. To do this, we follow the rules for
computations with determinants which we summarize in Section 1.3.3. Once we have
transformed a determinant into triangular form, its value is, according to the cofactor
expansion, the product of the elements on the main diagonal.
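
The two computational routes described above (cofactor expansion and reduction to triangular form) can be compared in a short sketch. The function names and the sample matrix are illustrative assumptions, not taken from the text; exact rational arithmetic is used so that both routes give identical values.

```python
from fractions import Fraction


def det_cofactor(a):
    """Cofactor (Laplace) expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for k in range(n):
        # Delete row 0 and column k to obtain the minor.
        minor = [row[:k] + row[k + 1:] for row in a[1:]]
        total += (-1) ** k * a[0][k] * det_cofactor(minor)
    return total


def det_triangular(a):
    """Reduce to upper triangular form, then multiply the diagonal."""
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    sign = 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: determinant is zero
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign  # exchanging two rows changes the sign
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            # Adding a multiple of another row leaves the value unchanged.
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= m[i][i]
    return result


# Hypothetical sample determinant of order three.
example = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
```

For determinants of higher order the triangular route needs far fewer multiplications than the recursive expansion, which is why it is preferred in practice.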

1.3.3 Computations with Determinants

The most important rules for computations with determinants are the following:
I. The value of a determinant does not change,
a) if we add to a row a linear combination of the remaining rows;
b) if we add to a column a linear combination of the remaining columns;
c) if we exchange rows and columns (reflect the determinant about the main diagonal).
II. The sign of a determinant changes if we exchange two rows or columns.
III. A determinant is zero if and only if its rows are linearly dependent. (According to
condition Ic, this is equivalent to its columns being linearly dependent.)


IV. Determinants which differ in a single row (or column) are added and subtracted as follows:

\[
\begin{vmatrix}
a_{11} & a_{12} & \dots & a_{1N} \\
a_{21} & a_{22} & \dots & a_{2N} \\
\vdots & & & \vdots \\
a_{N1} & a_{N2} & \dots & a_{NN}
\end{vmatrix}
\pm
\begin{vmatrix}
a'_{11} & a'_{12} & \dots & a'_{1N} \\
a_{21} & a_{22} & \dots & a_{2N} \\
\vdots & & & \vdots \\
a_{N1} & a_{N2} & \dots & a_{NN}
\end{vmatrix}
=
\begin{vmatrix}
a_{11} \pm a'_{11} & a_{12} \pm a'_{12} & \dots & a_{1N} \pm a'_{1N} \\
a_{21} & a_{22} & \dots & a_{2N} \\
\vdots & & & \vdots \\
a_{N1} & a_{N2} & \dots & a_{NN}
\end{vmatrix}.
\tag{1.12}
\]

V. To multiply a determinant by a number we multiply any of its rows or columns by this number.
VI. Determinant multiplication is given by

(det aim )(det bmj ) = det(aim bmj ). (1.13)

With condition Ic, we establish that

det(aim bmj ) = det(ami bmj ) = det(aim bjm ) = det(ami bjm ).

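VII. If bij is the cofactor of the element aij of a determinant $\det \underset{\sim}{a}$, then

\[ a_{ij} b_{ik} = a_{ji} b_{ki} = \begin{cases} \det \underset{\sim}{a} & \text{for } j = k, \\ 0 & \text{for } j \ne k. \end{cases} \tag{1.14} \]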

Problem 1.4.
Compute the following determinants (consult also Section 1.6.1):

\[
\text{A. } \Delta = \begin{vmatrix} 1 & 0 & -1 \\ 3 & 1 & -3 \\ 1 & 2 & -2 \end{vmatrix},
\qquad
\text{B. } \Delta = \begin{vmatrix} 1 & 2 & -4 & 4 \\ 3 & 6 & 9 & 0 \\ -2 & 8 & 1 & 9 \\ 4 & 2 & -2 & 0 \end{vmatrix}.
\]

1.4 Kronecker Symbols

In this section, we define some quantities which we need later and we explore their
most important properties.

1.4.1 δij

1. In δij , both indices must have the same index set, 1, . . . , N. We define δij as follows.

\[ \delta_{ij} := \begin{cases} 1 & \text{for equal indices, i.e. } i = j, \\ 0 & \text{for different indices, i.e. } i \ne j. \end{cases} \tag{1.15} \]

The term δij is often called the Kronecker symbol.

Problem 1.5.
Compute for N = 5
A. δii , B. δii .

2. The term δij is clearly symmetric in its indices, i. e. its value does not change if we
change the order of the two indices.

δij = δji . (1.16)

3. For example, Ai δij means for N = 3

Ai δij = A1 δ1j + A2 δ2j + A3 δ3j .

Substituting 1, 2, and 3 for j, we get

for j = 1: Ai δi1 = A1 ,
for j = 2: Ai δi2 = A2 ,
for j = 3: Ai δi3 = A3 .

We can summarize these three equations as Ai δij = Aj , and we get similar results for
N ≠ 3. If A has additional indices (possibly with a different index set) not appearing
in δij , these indices are not affected by multiplication with δij .

Ai...jmp...q δmn = Ai...jnp...q . (1.17)

Let us write this important formula in words: If we sum over one index of δmn , we
replace this index in the other quantity (here the m in Ai...jmp...q ) by the second index
of δmn (here the n) and we drop the δmn .
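
The substitution rule (1.17) can be checked numerically with a minimal sketch. The values stored in `A` and `B` are hypothetical, and the indices are stored 0-based, which does not affect the rule.

```python
N = 3  # length of the index set (assumed for this illustration)


def delta(i, j):
    """Kronecker symbol: 1 for equal indices, 0 otherwise."""
    return 1 if i == j else 0


# Hypothetical quantity with one index, A_i.
A = [5, 7, 11]

# A_i delta_ij, summed over the repeated index i, should reproduce A_j.
contracted = [sum(A[i] * delta(i, j) for i in range(N)) for j in range(N)]

# Hypothetical quantity with two indices, B_im; the index i is not touched
# by the multiplication with delta_mn, only m is replaced by n.
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B_contracted = [
    [sum(B[i][m] * delta(m, n) for m in range(N)) for n in range(N)]
    for i in range(N)
]
```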

Problem 1.6.
Simplify the following expressions:
A. δij δjk , B. δi2 δik δ3k , C. δ1k Ak , D. δi2 δjk Aij .

4. Analogously, we prove that

λm Ai...jmp...q δmn = λn Ai...jnp...q . (1.18)

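1.4.2 $\delta^{i\dots j}_{p\dots q}$

1. In $\delta^{i\dots j}_{p\dots q}$ the number M of the upper and the lower indices is the same. We also say that $\delta^{i\dots j}_{p\dots q}$ has M pairs of indices; in addition, all 2M indices are assumed to have the same index set 1, . . . , N. Then we define

\[
\delta^{ij\dots k}_{pq\dots r} :=
\begin{vmatrix}
\delta_{ip} & \delta_{iq} & \dots & \delta_{ir} \\
\delta_{jp} & \delta_{jq} & \dots & \delta_{jr} \\
\vdots & & & \vdots \\
\delta_{kp} & \delta_{kq} & \dots & \delta_{kr}
\end{vmatrix}.
\tag{1.19}
\]

We call $\delta^{i\dots j}_{p\dots q}$ the generalized Kronecker symbol. Clearly, $\delta^{i}_{j} = \delta_{ij}$, so this name is plausible.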
2. Let us derive some rules for the values of the elements of $\delta^{i\dots j}_{p\dots q}$. We first consider an element with at least two equal upper or lower indices. Then the corresponding rows or columns of the determinant are equal, which implies that the determinant is zero, independent of the values of the other indices. This must be the case, for example, if the number M of index pairs is greater than the length N of the index set, and thus such $\delta^{i\dots j}_{p\dots q}$ are zero.
Next, we consider an element with one upper index which does not appear among the lower indices (or an element with a lower index which does not appear among the upper indices). Then the elements of the corresponding row (or column) are all zero and the determinant is again zero, independent of the values of the remaining indices.
So only those elements can be nonzero whose upper indices are all different, whose lower indices are all different, and whose lower indices differ from their upper indices at most in their order.
We now consider the case where both indices of every index pair are equal, i. e. the order of the upper and lower indices is the same. Then all diagonal elements of the determinant are one and all remaining elements are zero, so the determinant and hence also the corresponding element of $\delta^{i\dots j}_{p\dots q}$ is one.

Every change in the order of two lower indices (while keeping the order of the
upper indices) exchanges the corresponding columns of the determinant and hence
changes the sign of the determinant.
This leads us to the following result:

\[
\delta^{i\dots j}_{p\dots q} =
\begin{cases}
1 & \text{if all upper indices are distinct and the sequence of the lower indices is an even permutation of the sequence of the upper indices,} \\
-1 & \text{if all upper indices are distinct and the sequence of the lower indices is an odd permutation of the sequence of the upper indices,} \\
0 & \text{otherwise, i.e. if some of the upper indices are equal or when the lower indices are not a permutation of the upper indices.}
\end{cases}
\tag{1.20}
\]

Moreover,

\[ \delta^{i\dots j}_{p\dots q} = \delta^{p\dots q}_{i\dots j}, \tag{1.21} \]

since the value of a determinant does not change if we exchange rows and columns.
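
As a plausibility check, the determinant definition (1.19) can be compared with the permutation rule (1.20) for all index combinations. This is only a sketch: the helper names are assumptions, and the check is run for N = 3 with M = 2 and M = 3 index pairs.

```python
from itertools import product


def delta(i, j):
    return 1 if i == j else 0


def det(a):
    """Cofactor expansion along the first row (the empty determinant is 1)."""
    if not a:
        return 1
    return sum(
        (-1) ** k * a[0][k] * det([row[:k] + row[k + 1:] for row in a[1:]])
        for k in range(len(a))
    )


def gen_delta(upper, lower):
    """Generalized Kronecker symbol as the determinant (1.19)."""
    return det([[delta(u, l) for l in lower] for u in upper])


def by_permutation(upper, lower):
    """Value according to the permutation rule (1.20)."""
    if len(set(upper)) < len(upper) or sorted(upper) != sorted(lower):
        return 0
    # Position of each lower index within the upper sequence.
    perm = [upper.index(l) for l in lower]
    inversions = sum(
        perm[x] > perm[y]
        for x in range(len(perm))
        for y in range(x + 1, len(perm))
    )
    return -1 if inversions % 2 else 1


def rules_agree(n, m):
    index_tuples = list(product(range(1, n + 1), repeat=m))
    return all(
        gen_delta(u, l) == by_permutation(u, l)
        for u in index_tuples
        for l in index_tuples
    )
```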

Problem 1.7.
A. Write out all nonzero elements of δpq and compute their values
1. for N = 2, 2. for N = 3.
B. Compute δij for N = 3.

1.4.3 εi...j

1. For some of the more frequently used generalized Kronecker symbols we introduce another notation. We define

\[ \varepsilon_{ij\dots k} := \delta^{12\dots M}_{ij\dots k} = \delta^{ij\dots k}_{12\dots M}. \tag{1.22} \]

On the one hand, elements of εi...j are nonzero only if none of the indices i . . . j is larger
than M. On the other hand, generalized Kronecker symbols for N < M do not exist
anyway. Therefore, without loss of generality, we have N = M, i. e. the length of the
index set of all indices equals the number of indices in εi...j .
The term εi...j is called permutation symbol, Levi-Civita symbol, or alternating symbol.

Problem 1.8.
Compute for N = 2 all elements of δij and εij .

2. According to (1.22), the values of εi...j are the following:

\[
\varepsilon_{i\dots j} =
\begin{cases}
1 & \text{for } \varepsilon_{12\dots M} \text{ and for all even permutations of these indices,} \\
-1 & \text{for all odd permutations of these indices,} \\
0 & \text{otherwise (i.e. if an index appears at least twice).}
\end{cases}
\tag{1.23}
\]

For M indices, εi...j has $M^M$ elements (number of combinations of M elements including repetitions in M classes), and M! of these elements (number of permutations of M elements without repetitions) are nonzero, where half of them are +1 and the other half are −1.
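
The counting statement can be illustrated by enumerating all elements for a small M. The sketch below assumes M = 3 and computes each value from the number of inversions of the index sequence; the function name `epsilon` is an illustrative choice.

```python
from itertools import product
from math import factorial


def epsilon(*idx):
    """Permutation symbol for len(idx) indices taken from 1..len(idx)."""
    m = len(idx)
    if sorted(idx) != list(range(1, m + 1)):
        return 0  # some index is repeated
    inversions = sum(
        idx[x] > idx[y] for x in range(m) for y in range(x + 1, m)
    )
    return -1 if inversions % 2 else 1


M = 3
values = [epsilon(*idx) for idx in product(range(1, M + 1), repeat=M)]
nonzero = [v for v in values if v != 0]
```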
Clearly, εi...j is antimetric with respect to each index pair, i. e. exchanging two in-
dices only changes the sign.
From (1.23), we get a formula that is sometimes useful. Let
– εi...j be an arbitrary element with M different indices;
– pk (εi...j ) with k = 1, . . . , M! be the elements in any order, which can be formed by
permutation of the indices; and
– αk be the number of index exchanges necessary to change the order of the indices of εi...j into the order of the indices of pk (εi...j ).

Then we get (without summation over k)

\[ \varepsilon_{i\dots j} = (-1)^{\alpha_k}\, p_k(\varepsilon_{i\dots j}). \tag{1.24} \]
This formula trivially holds when some of the indices of εi...j are repeated, but then
each term is zero anyway.
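From (1.22) and (1.19), we obtain

\[
\varepsilon_{ij\dots k} =
\begin{vmatrix}
\delta_{1i} & \delta_{1j} & \dots & \delta_{1k} \\
\delta_{2i} & \delta_{2j} & \dots & \delta_{2k} \\
\vdots & & & \vdots \\
\delta_{Mi} & \delta_{Mj} & \dots & \delta_{Mk}
\end{vmatrix}
=
\begin{vmatrix}
\delta_{i1} & \delta_{i2} & \dots & \delta_{iM} \\
\delta_{j1} & \delta_{j2} & \dots & \delta_{jM} \\
\vdots & & & \vdots \\
\delta_{k1} & \delta_{k2} & \dots & \delta_{kM}
\end{vmatrix}.
\tag{1.25}
\]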

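3. Let us derive two more useful formulas. From (1.25), it follows that

\[
\varepsilon_{ij\dots k}\, \varepsilon_{pq\dots r} =
\begin{vmatrix}
\delta_{i1} & \delta_{i2} & \dots & \delta_{iM} \\
\delta_{j1} & \delta_{j2} & \dots & \delta_{jM} \\
\vdots & & & \vdots \\
\delta_{k1} & \delta_{k2} & \dots & \delta_{kM}
\end{vmatrix}
\begin{vmatrix}
\delta_{1p} & \delta_{1q} & \dots & \delta_{1r} \\
\delta_{2p} & \delta_{2q} & \dots & \delta_{2r} \\
\vdots & & & \vdots \\
\delta_{Mp} & \delta_{Mq} & \dots & \delta_{Mr}
\end{vmatrix}.
\]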

Using the multiplication theorem for determinants (1.13), we have for the upper left
element of this product of two determinants δi1 δ1p +δi2 δ2p + ⋅ ⋅ ⋅ +δiM δMp = δim δmp = δip .

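Analogously, we compute the remaining elements of this product of two determinants and we obtain

\[
\varepsilon_{ij\dots k}\, \varepsilon_{pq\dots r} =
\begin{vmatrix}
\delta_{ip} & \delta_{iq} & \dots & \delta_{ir} \\
\delta_{jp} & \delta_{jq} & \dots & \delta_{jr} \\
\vdots & & & \vdots \\
\delta_{kp} & \delta_{kq} & \dots & \delta_{kr}
\end{vmatrix}
\]

or, with (1.19),

\[
\varepsilon_{ij\dots k}\, \varepsilon_{pq\dots r} = \delta^{ij\dots k}_{pq\dots r} =
\begin{vmatrix}
\delta_{ip} & \delta_{iq} & \dots & \delta_{ir} \\
\delta_{jp} & \delta_{jq} & \dots & \delta_{jr} \\
\vdots & & & \vdots \\
\delta_{kp} & \delta_{kq} & \dots & \delta_{kr}
\end{vmatrix}.
\tag{1.26}
\]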

The product of εij...k and δpq , where the number of indices of ε and the length of the
index set of all indices is N, is

εij...k δpq = εpj...k δiq + εip...k δjq + ⋅ ⋅ ⋅ + εij...p δkq

= εqj...k δip + εiq...k δjp + ⋅ ⋅ ⋅ + εij...q δkp . (1.27)

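This formula is easy to understand. According to (1.20), for an index set of length N we have $\delta^{pij\dots kl}_{q12\dots N-1,N} = 0$ for all values of the indices, since the number N + 1 of index pairs exceeds N; with (1.19), it follows that

\[
\begin{vmatrix}
\delta_{pq} & \delta_{p1} & \delta_{p2} & \dots & \delta_{p,N-1} & \delta_{pN} \\
\delta_{iq} & \delta_{i1} & \delta_{i2} & \dots & \delta_{i,N-1} & \delta_{iN} \\
\delta_{jq} & \delta_{j1} & \delta_{j2} & \dots & \delta_{j,N-1} & \delta_{jN} \\
\vdots & & & & & \vdots \\
\delta_{kq} & \delta_{k1} & \delta_{k2} & \dots & \delta_{k,N-1} & \delta_{kN} \\
\delta_{lq} & \delta_{l1} & \delta_{l2} & \dots & \delta_{l,N-1} & \delta_{lN}
\end{vmatrix}
= 0.
\]

Expanding this determinant along the first column (cofactor expansion) and using (1.26) gives

\[
\delta_{pq}\, \varepsilon_{ij\dots kl}\, \varepsilon_{12\dots N}
- \delta_{iq}\, \varepsilon_{pj\dots kl}\, \varepsilon_{12\dots N}
+ \delta_{jq}\, \varepsilon_{pi\dots kl}\, \varepsilon_{12\dots N}
- \dots
+ (-1)^{N-1} \delta_{kq}\, \varepsilon_{pij\dots l}\, \varepsilon_{12\dots N}
+ (-1)^{N} \delta_{lq}\, \varepsilon_{pij\dots k}\, \varepsilon_{12\dots N} = 0.
\]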

The factor ε12...N is one and can be omitted. Now we bring all terms except the first
to the right side and perform as many index exchanges in the epsilons as we need to
move the index p to the M-th position in the M-th term on the right side, while keeping
the order of the remaining indices unchanged. Clearly, M − 1 exchanges are necessary,
and we get

εij...kl δpq = εpj...kl δiq + εip...kl δjq + ⋅ ⋅ ⋅ + εij...pl δkq + εij...kp δlq .

Or, if we change the notation of the indices slightly, we get the first equation (1.27).
The second equation we get from the first equation, if we exchange p and q and use
the symmetry of δpq .

1.4.4 Representation of a Determinant by εi...j

1. The system of εi...j has properties which are similar to the properties of determinants;
for example, a determinant is zero if two rows (or columns) are equal and it changes
sign if we exchange two rows (or columns). The same is true for the elements of εi...j
with respect to the indices, so it is plausible to write εi...j according to (1.25) as a deter-
minant. Conversely, we can assume that a determinant can also be written using εi...j .
To see this, we multiply (1.25) by ai1 aj2 ⋅ ⋅ ⋅ akM . Then we have

\[
\varepsilon_{ij\dots k}\, a_{i1} a_{j2} \cdots a_{kM} =
\begin{vmatrix}
\delta_{1i} & \delta_{1j} & \dots & \delta_{1k} \\
\delta_{2i} & \delta_{2j} & \dots & \delta_{2k} \\
\vdots & & & \vdots \\
\delta_{Mi} & \delta_{Mj} & \dots & \delta_{Mk}
\end{vmatrix}
a_{i1} a_{j2} \cdots a_{kM}.
\]

If, on the right side, we multiply ai1 into the first column of the determinant, aj2 into the second column, and so on and take (1.17) into account, we get

\[
\varepsilon_{ij\dots k}\, a_{i1} a_{j2} \cdots a_{kM} =
\begin{vmatrix}
a_{11} & a_{12} & \dots & a_{1M} \\
a_{21} & a_{22} & \dots & a_{2M} \\
\vdots & & & \vdots \\
a_{M1} & a_{M2} & \dots & a_{MM}
\end{vmatrix}.
\]

We get the same result if we start with

\[
\varepsilon_{ij\dots k}\, a_{1i} a_{2j} \cdots a_{Mk} =
\begin{vmatrix}
\delta_{i1} & \delta_{i2} & \dots & \delta_{iM} \\
\delta_{j1} & \delta_{j2} & \dots & \delta_{jM} \\
\vdots & & & \vdots \\
\delta_{k1} & \delta_{k2} & \dots & \delta_{kM}
\end{vmatrix}
a_{1i} a_{2j} \cdots a_{Mk}
\]

and then multiply the first row of the determinant by a1i , the second row by a2j , etc. This gives

\[ \det \underset{\sim}{a} = \varepsilon_{ij\dots k}\, a_{i1} a_{j2} \cdots a_{kM} = \varepsilon_{ij\dots k}\, a_{1i} a_{2j} \cdots a_{Mk}. \tag{1.28} \]
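
Formula (1.28) translates directly into a sum over permutations, because all terms with a repeated index vanish. The following sketch evaluates both forms of (1.28); the function names and the sample matrix are illustrative assumptions.

```python
from itertools import permutations


def sign(perm):
    """+1 for an even permutation, -1 for an odd one (inversion count)."""
    inversions = sum(
        perm[x] > perm[y]
        for x in range(len(perm))
        for y in range(x + 1, len(perm))
    )
    return -1 if inversions % 2 else 1


def det_first_form(a):
    """epsilon_{ij...k} a_{i1} a_{j2} ... a_{kM}: the row indices are permuted."""
    m = len(a)
    total = 0
    for rows in permutations(range(m)):
        term = sign(rows)
        for col in range(m):
            term *= a[rows[col]][col]
        total += term
    return total


def det_second_form(a):
    """epsilon_{ij...k} a_{1i} a_{2j} ... a_{Mk}: the column indices are permuted."""
    m = len(a)
    total = 0
    for cols in permutations(range(m)):
        term = sign(cols)
        for row in range(m):
            term *= a[row][cols[row]]
        total += term
    return total


# Hypothetical sample determinant of order three.
example = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
```

Since the number of terms grows like M!, this form is useful for derivations rather than for numerical work.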

Using the definition (1.22) of εi...j , we can also write

\[ \det \underset{\sim}{a} = \delta^{ij\dots k}_{12\dots M}\, a_{i1} a_{j2} \cdots a_{kM}. \tag{a} \]

The value of $\delta^{ij\dots k}_{12\dots M}$ does not change if we change the order of two index pairs, so we also have

\[ \det \underset{\sim}{a} = \delta^{ji\dots k}_{21\dots M}\, a_{i1} a_{j2} \cdots a_{kM} = \delta^{ji\dots k}_{21\dots M}\, a_{j2} a_{i1} \cdots a_{kM}. \]

And if we rename i to j and j to i in the last equation, we get

\[ \det \underset{\sim}{a} = \delta^{ij\dots k}_{21\dots M}\, a_{i2} a_{j1} \cdots a_{kM}. \]

So we can also obtain $\det \underset{\sim}{a}$ if we replace in (a) the lower indices by a permutation of the numbers 1, 2, . . . , M and the second indices of the terms in the product ai1 aj2 ⋅ ⋅ ⋅ akM by the same permutation. If we replace in (a) the number indices by running indices, i. e. if we write

\[ \delta^{ij\dots k}_{pq\dots r}\, a_{ip} a_{jq} \cdots a_{kr}, \]

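this represents an M-fold summation over all M values of the indices p, q, . . . , r. However, of the $M^M$ summands, only those M! terms are nonzero in which the sequence p, q, . . . , r is a permutation of the sequence 1, 2, . . . , M. Since all these nonzero summands are equal to $\det \underset{\sim}{a}$, from (1.26) it follows that

\[
\det \underset{\sim}{a}
= \frac{1}{M!}\, \delta^{ij\dots k}_{pq\dots r}\, a_{ip} a_{jq} \cdots a_{kr}
= \frac{1}{M!}\, \varepsilon_{ij\dots k}\, \varepsilon_{pq\dots r}\, a_{ip} a_{jq} \cdots a_{kr}
= \frac{1}{M!}
\begin{vmatrix}
\delta_{ip} & \delta_{iq} & \dots & \delta_{ir} \\
\delta_{jp} & \delta_{jq} & \dots & \delta_{jr} \\
\vdots & & & \vdots \\
\delta_{kp} & \delta_{kq} & \dots & \delta_{kr}
\end{vmatrix}
a_{ip} a_{jq} \cdots a_{kr}.
\]

Finally, if we multiply the first row of the determinant by aip , the second row by ajq , etc., we get

\[
\det \underset{\sim}{a} = \frac{1}{M!}
\begin{vmatrix}
a_{pp} & a_{qp} & \dots & a_{rp} \\
a_{pq} & a_{qq} & \dots & a_{rq} \\
\vdots & & & \vdots \\
a_{pr} & a_{qr} & \dots & a_{rr}
\end{vmatrix},
\]

or, after exchanging rows and columns (rule Ic),

\[
\det \underset{\sim}{a}
= \frac{1}{M!}\, \varepsilon_{ij\dots k}\, \varepsilon_{pq\dots r}\, a_{ip} a_{jq} \cdots a_{kr}
= \frac{1}{M!}
\begin{vmatrix}
a_{pp} & a_{pq} & \dots & a_{pr} \\
a_{qp} & a_{qq} & \dots & a_{qr} \\
\vdots & & & \vdots \\
a_{rp} & a_{rq} & \dots & a_{rr}
\end{vmatrix}.
\tag{1.29}
\]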

2. We can also compute the cofactors of the elements of a determinant in terms of εi...j .
We start with (1.29)1 and multiply both sides by δmn , which gives

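\[ (\det \underset{\sim}{a})\, \delta_{mn} = \frac{1}{M!}\, \varepsilon_{ij\dots k}\, \delta_{mn}\, \varepsilon_{pq\dots r}\, a_{ip} a_{jq} \cdots a_{kr}. \tag{a} \]

Next, we replace εij...k δmn using (1.27)₁ and we get

\[
\begin{aligned}
(\det \underset{\sim}{a})\, \delta_{mn}
&= \frac{1}{M!} \left( \varepsilon_{mj\dots k}\, \delta_{in} + \varepsilon_{im\dots k}\, \delta_{jn} + \dots + \varepsilon_{ij\dots m}\, \delta_{kn} \right) \varepsilon_{pq\dots r}\, a_{ip} a_{jq} \cdots a_{kr} \\
&= \frac{1}{M!} \left( \varepsilon_{mj\dots k}\, \varepsilon_{pq\dots r}\, a_{np} a_{jq} \cdots a_{kr} + \varepsilon_{im\dots k}\, \varepsilon_{pq\dots r}\, a_{ip} a_{nq} \cdots a_{kr} + \dots + \varepsilon_{ij\dots m}\, \varepsilon_{pq\dots r}\, a_{ip} a_{jq} \cdots a_{nr} \right).
\end{aligned}
\]

To ensure that the factor anp ajq ⋅ ⋅ ⋅ akr appears in every term, we rename the summation indices: in the second term we rename i to j and exchange p and q, and so on, until in the last term we rename i to k and exchange p and r. We have

\[
(\det \underset{\sim}{a})\, \delta_{mn} = \frac{1}{M!} \left( \varepsilon_{mj\dots k}\, \varepsilon_{pq\dots r}\, a_{np} a_{jq} \cdots a_{kr} + \varepsilon_{jm\dots k}\, \varepsilon_{qp\dots r}\, a_{jq} a_{np} \cdots a_{kr} + \dots + \varepsilon_{kj\dots m}\, \varepsilon_{rq\dots p}\, a_{kr} a_{jq} \cdots a_{np} \right).
\]

If we now exchange an index pair in each term except the first term in both epsilons, which does not change the value of the terms, we see that all M terms are equal. Thus we obtain

\[ (\det \underset{\sim}{a})\, \delta_{mn} = \frac{1}{(M-1)!}\, \varepsilon_{mj\dots k}\, \varepsilon_{pq\dots r}\, a_{jq} \cdots a_{kr}\, a_{np}. \tag{b} \]

Next, we introduce

\[ b_{mp} := \frac{1}{(M-1)!}\, \varepsilon_{mj\dots k}\, \varepsilon_{pq\dots r}\, a_{jq} \cdots a_{kr} \]

and we want to show that the so-defined bmp is the cofactor to the element amp . For m = p = 1, we get

\[ b_{11} = \frac{1}{(M-1)!}\, \varepsilon_{1j\dots k}\, \varepsilon_{1q\dots r}\, a_{jq} \cdots a_{kr}. \]

Since the number of the indices of the epsilons equals the length of the index set and since all epsilons in which an index appears at least twice are zero, we can substitute εj...k for ε1j...k and εq...r for ε1q...r if the M − 1 indices of the epsilons range from 2 to M. We obtain

\[ b_{11} = \frac{1}{(M-1)!}\, \varepsilon_{j\dots k}\, \varepsilon_{q\dots r}\, a_{jq} \cdots a_{kr}, \quad \text{for all indices: } 2, \dots, M. \]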

But this is, according to (1.29), the minor to a11 , which is, according to (1.8), equal to
the cofactor b11 . Analogously, we obtain for m = 1, p = 2,

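\[ b_{12} = \frac{1}{(M-1)!}\, \varepsilon_{1j\dots k}\, \varepsilon_{2q\dots r}\, a_{jq} \cdots a_{kr} = -\frac{1}{(M-1)!}\, \varepsilon_{1j\dots k}\, \varepsilon_{q2\dots r}\, a_{jq} \cdots a_{kr}. \]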

We can again substitute εj...k for ε1j...k and εq...r for εq2...r if the indices j . . . k can take
values from 2 to M and the indices q . . . r can take values 1 and from 3 to M. Without
the minus sign this is the minor to a12 ; with the minus sign it is the cofactor b12 . Anal-
ogously, the same can be shown for the remaining elements of bmp , so we get for the
cofactor of the element aip of a determinant

\[ b_{ip} = \frac{1}{(M-1)!}\, \varepsilon_{ij\dots k}\, \varepsilon_{pq\dots r}\, a_{jq} \cdots a_{kr}, \tag{1.30} \]

where the epsilons have M indices.

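With this, from (b) we obtain

\[ (\det \underset{\sim}{a})\, \delta_{mn} = b_{mp}\, a_{np}. \]

Since δij is symmetric we can change the order of m and n on the right side, so we can also write

\[ (\det \underset{\sim}{a})\, \delta_{mn} = a_{mp}\, b_{np}. \]

If we substitute εpq...r δmn in (a) using (1.27)₂, we similarly obtain

\[ (\det \underset{\sim}{a})\, \delta_{mn} = b_{im}\, a_{in} = a_{im}\, b_{in}. \]

So we have

\[ (\det \underset{\sim}{a})\, \delta_{ij} = b_{ik} a_{jk} = a_{ik} b_{jk} = b_{ki} a_{kj} = a_{ki} b_{kj}. \tag{1.31} \]

In this way, we have also proved (1.14).

If we take the determinant of (1.31) on both sides, then with (1.13) it follows that $(\det \underset{\sim}{a})^{M} = (\det \underset{\sim}{b})(\det \underset{\sim}{a})$, where M is the order of the determinant, and thus for $\det \underset{\sim}{a} \ne 0$ we obtain

\[ \det \underset{\sim}{b} = (\det \underset{\sim}{a})^{M-1}. \tag{1.32} \]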

1.4.5 εijk

1. We will need the εi...j mostly for the case M = 3 and we wish to write out some
formulas for this case.

\[
\varepsilon_{ijk} =
\begin{cases}
1 & \text{for } ijk = 123 \text{ and its cyclic permutations 231 and 312,} \\
-1 & \text{for } ijk = 321 \text{ and its cyclic permutations 213 and 132,} \\
0 & \text{for all remaining index combinations, i.e. if an index appears at least twice.}
\end{cases}
\tag{1.33}
\]

We can easily convince ourselves that this definition is consistent with (1.23).

2. Clearly, εijk is antimetric with respect to each index pair.

εijk = εjki = εkij = −εkji = −εjik = −εikj . (1.34)

Problem 1.9.
Write out the following expressions for N = 3:
A. vi = εijk ωj rk .
Let vi , ωi , and ri be the Cartesian coordinates of the three vectors v, ω, and r. Which frequently used vector algebraic relation between v, ω, and r is expressed by this equation?
B. Ωj = εijk .
Let vi and Ωi be the Cartesian coordinates of the two vector fields v and Ω, i. e. vi and Ωi are functions of the coordinates xi = (x1 , x2 , x3 ) = (x, y, z). Which frequently used vector analytic relation between Ω and v is expressed by this equation?

3. By equating two homologous indices (i. e. indices which are at the same position in
both epsilons) we obtain from (1.26), after straightforward computations, the so-called
Grassmann identity.

εijk εmnk = εikj εmkn = εkij εkmn = δim δjn − δin δjm . (1.35)

Let us describe this important formula in words. Summing over two homologous in-
dices in two epsilons (for M = 3) gives the difference of two terms which are both
products of two deltas. In the first term of the difference (the minuend), the indices of
both deltas are the homologous indices of the epsilons; in the second term (the sub-
trahend), the indices are not homologous indices of the epsilons.
Setting in (1.35) n = j gives εijk εmjk = δim δjj − δij δjm = 3 δim − δim = 2 δim ,

εijk εmjk = 2 δim . (1.36)

If we further substitute m = i, we obtain

εijk εijk = 6. (1.37)
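
The identities (1.35) to (1.37) involve only finitely many index combinations, so they can be verified by brute force. The following sketch encodes (1.33) directly; the function and variable names are illustrative.

```python
from itertools import product

EVEN = {(1, 2, 3), (2, 3, 1), (3, 1, 2)}
ODD = {(3, 2, 1), (2, 1, 3), (1, 3, 2)}


def eps(i, j, k):
    """Permutation symbol for M = 3 according to (1.33)."""
    if (i, j, k) in EVEN:
        return 1
    if (i, j, k) in ODD:
        return -1
    return 0


def delta(i, j):
    return 1 if i == j else 0


R = range(1, 4)

# (1.35): sum over one pair of homologous indices.
grassmann_ok = all(
    sum(eps(i, j, k) * eps(m, n, k) for k in R)
    == delta(i, m) * delta(j, n) - delta(i, n) * delta(j, m)
    for i, j, m, n in product(R, repeat=4)
)

# (1.36): sum over two pairs of homologous indices.
double_ok = all(
    sum(eps(i, j, k) * eps(m, j, k) for j in R for k in R) == 2 * delta(i, m)
    for i in R
    for m in R
)

# (1.37): sum over all three index pairs.
full_sum = sum(eps(i, j, k) ** 2 for i, j, k in product(R, repeat=3))
```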

4. The product of εijk and δpq is, according to (1.27),

εijk δpq = εpjk δiq + εipk δjq + εijp δkq

= εqjk δip + εiqk δjp + εijq δkp .

1.5 Matrices
1.5.1 Definitions

A rectangular table of numbers is called a matrix. The numbers are the elements of
the matrix, with elements aligned horizontally forming the rows and elements aligned
vertically forming the columns. A matrix with M rows and N columns is also called an
M, N-matrix. A matrix with only one row is called a row matrix, a matrix with only
one column is called a column matrix, a matrix with N rows and N columns is called
a square matrix of order N. A matrix which can be obtained from another matrix by
dropping an arbitrary number of rows and columns is called a submatrix of the other matrix.
We denote a matrix by a letter with an underlying tilde and its elements by the
same letter with two indices, where the first index refers to the row and the second
index refers to the column. We write

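\[
\begin{pmatrix}
A_{11} & A_{12} & \dots & A_{1N} \\
A_{21} & A_{22} & \dots & A_{2N} \\
\vdots & & & \vdots \\
A_{M1} & A_{M2} & \dots & A_{MN}
\end{pmatrix}.
\]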

We call the element Aij at the intersection of the i-th row and the j-th column the
i, j-element of the matrix. If we take both indices as running indices and let i range
from 1 to M and j from 1 to N, then Aij is a symbol for the entire matrix. So we have
three equivalent ways to write this:

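\[
\begin{pmatrix}
A_{11} & A_{12} & \dots & A_{1N} \\
A_{21} & A_{22} & \dots & A_{2N} \\
\vdots & & & \vdots \\
A_{M1} & A_{M2} & \dots & A_{MN}
\end{pmatrix}
= A_{ij} = \underset{\sim}{A}.
\tag{1.39}
\]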

We call a matrix whose elements are all zero a zero matrix $\underset{\sim}{0}$, and we call a square matrix whose only nonzero elements are elements with equal indices (the elements on the main diagonal) a diagonal matrix or a diagonalized matrix. A diagonal matrix in which the elements on the main diagonal are all one is called an identity matrix or a one matrix and is denoted by $\underset{\sim}{I}$. Clearly the elements of the identity matrix are the δij , so we will refer to them by this symbol.
The determinant computed from the elements of a square matrix is called the de-
terminant of a matrix. The cofactors corresponding to the elements of this determinant
are also called cofactors of the elements of the matrix. A determinant computed from
the elements of a submatrix which was obtained from a not necessarily square matrix
by deleting an arbitrary number of rows and columns is called a subdeterminant of
the matrix.
A matrix has rank R, if it has at least one nonzero subdeterminant of order R and
no higher-order nonzero subdeterminant. By Theorem III from Section 1.3.3 on deter-
minants a matrix with rank R has exactly R linearly independent rows and exactly R
linearly independent columns. We denote the rank of the matrix $\underset{\sim}{A}$ by rank $\underset{\sim}{A}$.
A square matrix of order N is called regular if it has rank N and it is called singular
if it has a lower rank; it is called M-times singular if it has the rank N − M, or in other
words, if it has the rank defect M.

1.5.2 Arithmetic Operations and First Conclusions

1. A matrix is a special case of an N-tuple; the easiest way to see this is to imagine all
rows of a matrix written one after another. Now we can define linear operations and
the terms linearly independent and linearly dependent for matrices.
– Equality: Two matrices $\underset{\sim}{A}$ and $\underset{\sim}{B}$ are equal if and only if their homologous elements are equal:

\[ \underset{\sim}{A} = \underset{\sim}{B} \iff A_{ij} = B_{ij}. \tag{1.40} \]

– Addition (subtraction): A matrix $\underset{\sim}{C}$ is the sum (difference) of two matrices $\underset{\sim}{A}$ and $\underset{\sim}{B}$ if and only if the same is true for the homologous elements of the three matrices:

\[ \underset{\sim}{A} \pm \underset{\sim}{B} = \underset{\sim}{C} \iff A_{ij} \pm B_{ij} = C_{ij}. \tag{1.41} \]

Clearly equality, addition, and subtraction of matrices are only possible if the matrices have the same number of rows and columns.
– Multiplication of a matrix by a number: To multiply a matrix $\underset{\sim}{A}$ by a number α, we multiply every element of the matrix by this number:

\[ \alpha \underset{\sim}{A} = \underset{\sim}{B} \iff \alpha A_{ij} = B_{ij}. \tag{1.42} \]

– Linear independence: Q M, N-matrices $\overset{1}{\underset{\sim}{A}}, \overset{2}{\underset{\sim}{A}}, \dots, \overset{Q}{\underset{\sim}{A}}$ are called linearly independent, if the equation

\[ \alpha_i \overset{i}{\underset{\sim}{A}} = \underset{\sim}{0} \iff \alpha_i \overset{i}{A}_{mn} = 0 \tag{1.43} \]

has only the trivial solution, where all αi are zero. Otherwise the matrices are called linearly dependent.

Since the equations (1.43) represent MN homogeneous linear equations for the Q unknown αi , more than MN matrices are always linearly dependent.

2. Now we define multiplication of two matrices. A matrix $\underset{\sim}{C}$ is the product of two matrices $\underset{\sim}{A}$ and $\underset{\sim}{B}$ if and only if their elements satisfy the relation Cik = Aij Bjk :

\[ \underset{\sim}{A}\, \underset{\sim}{B} = \underset{\sim}{C} \iff A_{ij} B_{jk} = C_{ik}. \tag{1.44} \]

Multiplication of matrices requires that the number of columns of the first factor is
equal to the number of rows of the second factor. The so-called Falk scheme is a con-
venient way to perform the multiplication.

For Aij Bjk = Cik :

                          B11 B12 B13
                          B21 B22 B23
                          B31 B32 B33
                          B41 B42 B43
    A11 A12 A13 A14       C11 C12 C13
    A21 A22 A23 A24       C21 C22 C23

For Aij Bjk Ckl = Dil (with the intermediate product Fik = Aij Bjk ):

                          B11 B12 B13
                          B21 B22 B23       C11 C12 C13 C14
                          B31 B32 B33       C21 C22 C23 C24
                          B41 B42 B43       C31 C32 C33 C34
    A11 A12 A13 A14       F11 F12 F13       D11 D12 D13 D14
    A21 A22 A23 A24       F21 F22 F23       D21 D22 D23 D24

The product of two matrices is not commutative in general; in the example given above $\underset{\sim}{B}\, \underset{\sim}{A}$ is not even defined. The product of a matrix and the identity matrix is, however, the original matrix, independent of the order of the factors:

\[ \underset{\sim}{A}\, \underset{\sim}{I} = \underset{\sim}{I}\, \underset{\sim}{A} = \underset{\sim}{A} \iff A_{ij} \delta_{jk} = \delta_{ij} A_{jk} = A_{ik}. \tag{1.45} \]

When the matrix $\underset{\sim}{A}$ is rectangular, however, the order of the identity matrix differs for the two cases: if we multiply $\underset{\sim}{A}$ by the identity matrix from the right, then the order of the identity matrix equals the number of columns of $\underset{\sim}{A}$; if we multiply $\underset{\sim}{A}$ by the identity matrix from the left, then the order of the identity matrix equals the number of rows of $\underset{\sim}{A}$.
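
The element form (1.44) of the matrix product, the shape requirement, and the role of the identity matrix can be sketched as follows. The matrices are hypothetical and have the same shapes as in the first Falk scheme (a 2, 4-matrix times a 4, 3-matrix); the function names are illustrative.

```python
def matmul(a, b):
    """C_ik = A_ij B_jk, summed over the repeated index j."""
    if len(a[0]) != len(b):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


def identity(n):
    """Identity matrix of order n; its elements are the delta_ij."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2, 0, -1], [3, 0, 1, 2]]                  # 2 rows, 4 columns
B = [[1, 0, 2], [0, 1, 1], [2, 1, 0], [1, 1, 1]]   # 4 rows, 3 columns
C = matmul(A, B)                                   # 2 rows, 3 columns


def product_is_defined(x, y):
    try:
        matmul(x, y)
        return True
    except ValueError:
        return False
```

Multiplying `A` from the right requires `identity(4)`, multiplying it from the left requires `identity(2)`, in line with the remark on rectangular matrices.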

If $\underset{\sim}{A}$ is square and regular, the converse is also true. If the result of the product (1.45) of a regular square matrix $\underset{\sim}{A}$ and an (initially unknown) matrix $\underset{\sim}{I}$ is again the original matrix $\underset{\sim}{A}$, then $\underset{\sim}{I}$ is the identity matrix.
This is easy to see: If $\underset{\sim}{A}$ is square, then according to (1.44), $\underset{\sim}{I}$ also has to be square and has the same order as $\underset{\sim}{A}$, say N. Then (1.45) forms a system of $N^2$ nonhomogeneous linear equations for the $N^2$ elements of $\underset{\sim}{I}$. If $\underset{\sim}{A}$ is regular, this equation system clearly always has a unique solution. Since (1.45) is satisfied for every matrix $\underset{\sim}{A}$ by the identity matrix $\underset{\sim}{I}$ and since it has a unique solution if $\underset{\sim}{A}$ is a regular square matrix, the identity matrix is the only solution.
The product of more than two matrices is associative.

3. Next we define the transpose of a matrix. To any matrix $\underset{\sim}{A}$ we can associate a matrix $\underset{\sim}{A}^T$, which is obtained from the original matrix by exchanging the rows and columns; the matrix $\underset{\sim}{A}^T$ is called the transpose matrix of $\underset{\sim}{A}$ and we read “A transpose”. We denote the i, j-element of the matrix $\underset{\sim}{A}^T$ (not: the transpose of the i, j-element of the matrix $\underset{\sim}{A}$)² by $(A_{ij})^T$ or $A^T_{ij}$. Then

\[ (A_{ij})^T \equiv A^T_{ij} := A_{ji}, \tag{1.46} \]

i. e. the i, j-element of the matrix $\underset{\sim}{A}^T$ is equal to the j, i-element of the matrix $\underset{\sim}{A}$.

Clearly $(A^T_{ij})^T = A^T_{ji} = A_{ij}$, i. e. the transpose of the transpose matrix is the original matrix:

\[ (\underset{\sim}{A}^T)^T = \underset{\sim}{A} \iff (A^T_{ij})^T = A_{ij}. \tag{1.47} \]

Aim Bmj is the i, j-element of the matrix $\underset{\sim}{A}\, \underset{\sim}{B}$ and $(A_{im} B_{mj})^T$ is the i, j-element of the matrix $(\underset{\sim}{A}\, \underset{\sim}{B})^T$, and so with (1.46) we get

\[ (A_{im} B_{mj})^T = A_{jm} B_{mi} = B_{mi} A_{jm} = B^T_{im} A^T_{mj}. \]

On the far left is the i, j-element³ of the matrix $(\underset{\sim}{A}\, \underset{\sim}{B})^T$, next to it is the j, i-element of the matrix $\underset{\sim}{A}\, \underset{\sim}{B}$, the next term cannot be interpreted as an element of a matrix, and the far-right term is the i, j-element of the matrix $\underset{\sim}{B}^T \underset{\sim}{A}^T$. Comparing the far-left and the far-right expressions shows that the i, j-elements of the matrices $(\underset{\sim}{A}\, \underset{\sim}{B})^T$ and $\underset{\sim}{B}^T \underset{\sim}{A}^T$ are equal, and since this is true for all i and j these two matrices are equal. We have now derived the important formula

\[ (\underset{\sim}{A}\, \underset{\sim}{B})^T = \underset{\sim}{B}^T \underset{\sim}{A}^T \iff (A_{im} B_{mj})^T = B^T_{im} A^T_{mj}. \tag{1.48} \]

2 The transpose of a matrix element is not defined. So we should really read $A^T_{ij}$ as “A transpose ij” and not “A ij transpose”.
3 We will read $(A_{im} B_{mj})^T$ as “A im B mj (in parentheses) transpose”, but let us keep in mind that we really mean the i, j-element of the matrix $(\underset{\sim}{A}\, \underset{\sim}{B})^T$.
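
Formulas (1.47) and (1.48) can be confirmed on a small numerical example. The matrices below are hypothetical, and the helper names are assumptions for this sketch.

```python
def transpose(a):
    """A^T_ij = A_ji: exchange rows and columns."""
    return [list(column) for column in zip(*a)]


def matmul(a, b):
    """C_ik = A_ij B_jk, summed over j."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 2, 3], [4, 5, 6]]        # 2 rows, 3 columns
B = [[1, 0], [2, -1], [0, 3]]     # 3 rows, 2 columns

lhs = transpose(matmul(A, B))             # (A B)^T
rhs = matmul(transpose(B), transpose(A))  # B^T A^T
```

Note that the order of the factors is reversed on the right side; for these shapes A^T B^T would be a 3, 3-matrix and could not equal the 2, 2-matrix (A B)^T.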

4. Another operation defined for square matrices is the inversion of a matrix. If we can associate to a square matrix $\underset{\sim}{A}$ a matrix $\underset{\sim}{A}^{-1}$, such that the product of the two matrices is the identity matrix $\underset{\sim}{I}$,

\[ \underset{\sim}{A}\, \underset{\sim}{A}^{-1} = \underset{\sim}{I} \iff A_{ij} A^{(-1)}_{jk} = \delta_{ik}, \tag{1.49} \]

then the matrix $\underset{\sim}{A}^{-1}$ is called the inverse matrix of $\underset{\sim}{A}$.⁴

If $\underset{\sim}{A}$ is known, then (1.49) represents a nonhomogeneous system of linear equations for the elements of the matrix $\underset{\sim}{A}^{-1}$. According to the rules for systems of linear equations we can solve the system for the $A^{(-1)}_{jk}$ if and only if $\underset{\sim}{A}$ is regular and then the solution is unique. The solution can be derived from (1.31). To see this, we multiply $(\det \underset{\sim}{A})\, \delta_{ij} = B_{ki} A_{kj}$ by $A^{(-1)}_{jm}$:

\[ (\det \underset{\sim}{A})\, \delta_{ij} A^{(-1)}_{jm} = B_{ki} A_{kj} A^{(-1)}_{jm} = B_{ki} \delta_{km}, \]

\[ (\det \underset{\sim}{A})\, A^{(-1)}_{im} = B_{mi}, \]

\[ A^{(-1)}_{ij} = \frac{B_{ji}}{\det \underset{\sim}{A}}, \tag{1.50} \]

where Bij is the cofactor of Aij . (Note the different positions of i and j on the two sides of this equation!)
We conclude that a matrix is invertible if and only if it is square and regular. In practice we usually use the Gauss–Jordan algorithm to compute the inverse matrix rather than (1.50).
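
Formula (1.50) can be turned into a small routine, mainly to make the exchanged index positions visible. This is a sketch with assumed helper names and a hypothetical regular matrix; it uses exact fractions and is not meant as an efficient method.

```python
from fractions import Fraction


def minor(a, i, k):
    """Delete row i and column k."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Cofactor expansion along the first row (the empty determinant is 1)."""
    if not a:
        return 1
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def inverse(a):
    """A^(-1)_ij = B_ji / det A, where B_ji is the cofactor of A_ji."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular and has no inverse")
    n = len(a)
    return [
        # Element i, j of the inverse uses the cofactor with indices j, i.
        [Fraction((-1) ** (i + j) * det(minor(a, j, i)), d) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # hypothetical regular matrix, det A = 39
A_inv = inverse(A)
identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```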
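Finally, from (1.49) it also follows that

\[ \underset{\sim}{A}^{-1} \underset{\sim}{A} = \underset{\sim}{I} \iff A^{(-1)}_{ij} A_{jk} = \delta_{ik}, \tag{1.51} \]

which means that the inverse of $\underset{\sim}{A}^{-1}$ is again $\underset{\sim}{A}$, i. e.

\[ (\underset{\sim}{A}^{-1})^{-1} = \underset{\sim}{A}. \tag{1.52} \]

To see this we multiply (1.49) from the right by $\underset{\sim}{A}$:

\[ \underset{\sim}{A}\, \underset{\sim}{A}^{-1} \underset{\sim}{A} = \underset{\sim}{I}\, \underset{\sim}{A} = \underset{\sim}{A}. \]

If we write for a moment the product $\underset{\sim}{A}^{-1} \underset{\sim}{A}$ as $\underset{\sim}{C}$, the last equation becomes

\[ \underset{\sim}{A}\, \underset{\sim}{C} = \underset{\sim}{A}, \]

and since $\underset{\sim}{A}$ is square and regular, it follows from the converse of (1.45) that $\underset{\sim}{C}$ is the identity matrix, which proves (1.51) and (1.52). Substituting (1.50) into (1.49)₂ and (1.51)₂ we finally obtain

\[ (\det \underset{\sim}{A})\, \delta_{ik} = A_{ij} B_{kj} = B_{ji} A_{jk}; \tag{1.53} \]

compare also with (1.31).

4 $A^{(-1)}_{ij}$ is the i, j-element of the matrix $\underset{\sim}{A}^{-1}$ and we should read “A inverse ij”. As for the transpose matrix, the inverse of a matrix is defined, but the inverse of a matrix element is not defined.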

Problem 1.10.
Compute the inverse of the following matrices and verify your result, i. e. show that the product of the original matrix and the computed inverse is the identity matrix (compare also with Section 1.6.2):

\[
\text{A. } \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & 3 \\ 1 & 1 & 2 \end{pmatrix},
\qquad
\text{B. } \begin{pmatrix} 0 & 1 & 1 & -1 \\ 1 & 2 & -1 & 1 \\ 1 & 3 & -1 & 0 \\ -1 & -2 & 1 & -2 \end{pmatrix}.
\]

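Problem 1.11.
A. Show that the operations transpose and inverse of a matrix commute, i. e. show

\[ (\underset{\sim}{A}^{-1})^T = (\underset{\sim}{A}^T)^{-1} =: \underset{\sim}{A}^{-T}. \tag{1.54} \]

B. Show that analogously to (1.48) the following holds:

\[ (\underset{\sim}{A}\, \underset{\sim}{B})^{-1} = \underset{\sim}{B}^{-1} \underset{\sim}{A}^{-1}. \tag{1.55} \]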

1.5.3 Equations Involving Matrices and Equations Involving Matrix Elements

We distinguished between matrices (which we denoted by letters with an underlying tilde) and their elements (which we denoted by letters with two indices) and we defined arithmetic operations for matrices (which are not numbers) by writing out the arithmetic operations which have to be carried out between their elements (which are numbers). In this way, we found two notations for matrix computations: Every equation for matrices can be written as an equation for the elements of the matrices, or in other words, every equation for matrices can be translated into an equivalent equation for matrix elements. From the definition of matrix equality (1.40) it follows that an equation for matrix elements has the property that the order of free indices coincides in all its terms. Conversely, an equation for matrix elements can only be translated into an equation for matrices when the order of free indices coincides in all terms. This is not the case in (1.46) and (1.50), for example.
We will meet the same facts again when we compute with tensors.

1.5.4 Elementary Transformations, Normal Form, Equivalent Matrices, Similar Matrices

1. The following transformations are called elementary transformations of an M, N-matrix:
– exchange of two rows or two columns;
– multiplication of a row or a column by a nonzero number;
– addition of a linear combination of the remaining rows (columns) to a row (column).
Theorems II, V, Ia, and Ib for determinants immediately imply that elementary trans-
formations do not change the rank of a matrix.

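2. The normal form of any M, N-matrix with rank R is defined as the matrix

\[
\underset{\sim}{D} =
\begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & 0 & \cdots & 0
\end{pmatrix},
\tag{1.56}
\]

in which the first R rows and the first R columns carry the ones on the diagonal, and the
remaining M − R rows and N − R columns contain only zeros.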

Obviously, every M, N-matrix of rank R can be transformed into this normal form by
elementary transformations and every M, N-matrix, which can be transformed into
this normal form by elementary transformations, has rank R, so the name normal form
makes sense.


Problem 1.12.
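Determine the rank of the matrices (see also Section 1.6.3)

\[
\text{A.}\;\begin{pmatrix} 0 & 3 & 6 & 9 & -3 \\ 0 & 0 & 2 & -4 & 8 \\ 1 & 2 & -3 & 4 & 5 \\ 1 & 5 & 5 & 9 & 10 \end{pmatrix},
\qquad
\text{B.}\;\begin{pmatrix} 1 & -3 & 2 & 0 \\ 4 & -11 & 10 & -1 \\ -2 & 8 & -5 & 3 \end{pmatrix}.
\]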

3. The elements of a set are equivalent and form an equivalence class, if there exists a
relation A ∼ B between two arbitrary elements A and B of the set with the following
properties. The relation is
– reflexive, i. e. A ∼ A (the relation exists also for B = A);
– symmetric, i. e. if A ∼ B then B ∼ A;
– transitive, i. e. if A ∼ B and B ∼ C then A ∼ C.

Such a relation is called an equivalence relation on the set.

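4. It can be shown that for any pair $\underset{\sim}{A}$ and $\underset{\sim}{B}$ of M, N-matrices of equal rank R, there
exists a pair of regular square matrices $\underset{\sim}{S}$ and $\underset{\sim}{T}$, such that

\[
\underset{\sim}{A} = \underset{\sim}{S}\,\underset{\sim}{B}\,\underset{\sim}{T}
\iff A_{ij} = S_{im} B_{mn} T_{nj}, \tag{1.57}
\]

where clearly the matrix $\underset{\sim}{S}$ has order M and the matrix $\underset{\sim}{T}$ has order N.

Problem 1.13.
Show that (1.57) is an equivalence relation for the two matrices $\underset{\sim}{A}$ and $\underset{\sim}{B}$.

All matrices with the same number of rows, the same number of columns, and
the same rank form an equivalence class. Two matrices with this property (or in other
words, two matrices $\underset{\sim}{A}$ and $\underset{\sim}{B}$ satisfying the relation
$\underset{\sim}{A} = \underset{\sim}{S}\,\underset{\sim}{B}\,\underset{\sim}{T}$, where $\underset{\sim}{S}$ and $\underset{\sim}{T}$ are regular)
are called equivalent.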

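5. In particular, if two equivalent matrices $\underset{\sim}{A}$ and $\underset{\sim}{B}$ are square, then $\underset{\sim}{S}$ and $\underset{\sim}{T}$ have the
same number of rows and columns. If, in addition, $\underset{\sim}{S}$ and $\underset{\sim}{T}$ are inverses of each other,
we have

\[
\underset{\sim}{A} = \underset{\sim}{S}\,\underset{\sim}{B}\,\underset{\sim}{S}^{-1}, \tag{1.58}
\]

and we call the matrices $\underset{\sim}{A}$ and $\underset{\sim}{B}$ similar.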


1.5.5 Orthogonal Matrices

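1. A square matrix $\underset{\sim}{A}$ is called orthogonal, if its inverse equals its transpose. Then
$A_{ik} A^{\mathrm T}_{kj} = A_{ik} A_{jk} = \delta_{ij}$, and at the same time
$A^{\mathrm T}_{ik} A_{kj} = A_{ki} A_{kj} = \delta_{ij}$. Thus we get
the following equivalent notations for orthogonal matrices:

\[
\begin{gathered}
\underset{\sim}{A}^{-1} = \underset{\sim}{A}^{\mathrm T}, \qquad
\underset{\sim}{A}\,\underset{\sim}{A}^{\mathrm T} = \underset{\sim}{A}^{\mathrm T}\,\underset{\sim}{A} = \underset{\sim}{I}, \\
A_{ik} A_{jk} = \delta_{ij}, \qquad A_{ki} A_{kj} = \delta_{ij}.
\end{gathered}
\tag{1.59}
\]

The last two equations show that the sum of the squares of the elements of each row (or
column) is one and the sum of the products of corresponding elements of two different
rows (or two different columns) is zero.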
2. Orthogonal matrices also satisfy

det Aij = ±1. (1.60)

The easiest way to prove this is to expand $(\det A_{ij})^2$ using the rules for multiplication
of determinants and then transpose the second determinant. Then we have

\[
\begin{vmatrix}
A_{11} & A_{12} & \cdots & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2N} \\
\vdots & & & \vdots \\
A_{N1} & A_{N2} & \cdots & A_{NN}
\end{vmatrix}
\begin{vmatrix}
A_{11} & A_{21} & \cdots & A_{N1} \\
A_{12} & A_{22} & \cdots & A_{N2} \\
\vdots & & & \vdots \\
A_{1N} & A_{2N} & \cdots & A_{NN}
\end{vmatrix}
=
\begin{vmatrix}
A_{1i} A_{1i} & A_{1i} A_{2i} & \cdots & A_{1i} A_{Ni} \\
A_{2i} A_{1i} & A_{2i} A_{2i} & \cdots & A_{2i} A_{Ni} \\
\vdots & & & \vdots \\
A_{Ni} A_{1i} & A_{Ni} A_{2i} & \cdots & A_{Ni} A_{Ni}
\end{vmatrix}.
\]

According to (1.59) this is equal to $\det \delta_{ij} = 1$, which completes the proof.
Note that the converse is not true: From $\det A_{ij} = \pm 1$ it does not follow that $A_{ij}$ is
orthogonal.
3. If the determinant of an orthogonal matrix has the value +1, we call the matrix
proper orthogonal. If the determinant has the value −1, we call the matrix improper
orthogonal.⁵
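
The conditions (1.59) and (1.60) can be checked numerically. The following is a minimal sketch with plain Python lists; the helper names (`is_orthogonal`, `det2`), the tolerance, and the three example matrices are assumptions made for illustration. The last matrix shows that a determinant of one alone does not make a matrix orthogonal.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_orthogonal(A, tol=1e-12):
    """Check A A^T = I, i.e. A_ik A_jk = delta_ij, up to rounding errors."""
    P = matmul(A, transpose(A))
    n = len(A)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

phi = 0.3   # arbitrary angle in radians
rotation = [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]    # proper orthogonal
reflection = [[1.0, 0.0], [0.0, -1.0]]          # improper orthogonal
shear = [[1.0, 1.0], [0.0, 1.0]]                # determinant 1, not orthogonal

for name, M in (("rotation", rotation), ("reflection", reflection), ("shear", shear)):
    print(name, is_orthogonal(M), det2(M))
```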

5 A more common name for proper orthogonal is special orthogonal, but there is no corresponding
commonly accepted name for det Aij = −1 matrices, so we prefer the names proper and improper.


1.6 Algorithms
In this section we summarize some fundamental algorithms from linear algebra with-
out proof.
A determinant or a matrix is transformed column by column from left to right. We
describe the steps for an arbitrary column. The row which has the same diagonal ele-
ment as the column which is being transformed is called the corresponding row. The
steps of the algorithms are carried out in the given order, skipping any step whose
assumptions are not satisfied.

1.6.1 Computing the Value of a Determinant

The aim is to transform a determinant into a triangular determinant. Then we compute
the value of the determinant as the product of the elements on the main diagonal.
I. If the diagonal element and all elements below this diagonal element are zero, the
determinant has the value zero. (stop)
II. If the diagonal element is zero, exchange the corresponding row with a row below
to make the diagonal element nonzero, taking into account the sign change of the
determinant.
III. Make the elements below the diagonal element zero by adding to each row a
convenient multiple of the corresponding row.
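
Steps I to III can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `determinant` is chosen here, the input is a square list of rows, and exact rational arithmetic (`fractions.Fraction`) is used so that the test for zero in steps I and II is not disturbed by rounding.

```python
from fractions import Fraction

def determinant(rows):
    """Value of a determinant by reduction to triangular form (steps I to III)."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    sign = 1
    for c in range(n):                        # columns from left to right
        # Look for a nonzero element on or below the diagonal
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:                     # step I: the determinant is zero
            return Fraction(0)
        if pivot != c:                        # step II: a row exchange flips the sign
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        for r in range(c + 1, n):             # step III: clear the column below
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    result = Fraction(sign)
    for i in range(n):                        # product of the diagonal elements
        result *= a[i][i]
    return result

print(determinant([[0, 2], [3, 4]]))   # -6
print(determinant([[1, 2], [2, 4]]))   # 0
```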

1.6.2 Solution of Systems of Linear Equations with the Same Regular Coefficient
Matrix (“Division by a Regular Matrix”, Gauss–Jordan Algorithm)

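The aim is to compute the matrix $\underset{\sim}{X}$ in the equation $\underset{\sim}{A}\,\underset{\sim}{X} = \underset{\sim}{B}$, where $\underset{\sim}{A}$ is a regular (and
hence a square) matrix. (According to the rules for matrix multiplication, $\underset{\sim}{A}$, $\underset{\sim}{X}$, and $\underset{\sim}{B}$
have the same number of rows, and $\underset{\sim}{X}$ and $\underset{\sim}{B}$ have the same number of columns.)
If $\underset{\sim}{X}$ has only one column (so $\underset{\sim}{B}$ also has only one column), $\underset{\sim}{A}\,\underset{\sim}{X} = \underset{\sim}{B}$ represents a
system of linear equations, which has a unique solution. If $\underset{\sim}{X}$ has two columns (so $\underset{\sim}{B}$
also has two columns), $\underset{\sim}{A}\,\underset{\sim}{X} = \underset{\sim}{B}$ represents two systems of linear equations with the
same coefficient matrix and both equation systems have unique solutions, etc. If $\underset{\sim}{B}$ is
the identity matrix, then $\underset{\sim}{X}$ is the inverse matrix of $\underset{\sim}{A}$.

If we introduced the concept of division by a matrix, we could describe the solution to
this problem as division by a regular matrix. To solve the problem, we write the two
known matrices, $\underset{\sim}{A}$ and $\underset{\sim}{B}$, side by side and transform this double matrix using
elementary row transformations until $\underset{\sim}{A}$ becomes the identity matrix:

\[
\left(\underset{\sim}{A} \;\middle|\; \underset{\sim}{B}\right)
\;\longrightarrow\; \cdots \;\longrightarrow\;
\left(\underset{\sim}{I} \;\middle|\; \underset{\sim}{X}\right).
\]

This will transform $\underset{\sim}{B}$ into the desired matrix. The described
method is called the Gauss–Jordan algorithm.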


I. If the diagonal element and all elements below the diagonal element of the left
side of the double matrix are zero, then $\underset{\sim}{A}$ is singular. (stop)
II. If the diagonal element is zero, exchange the corresponding row with a row below
to make the diagonal element nonzero.
III. Divide the corresponding row by the value of the diagonal element.
IV. Make the elements above and below the diagonal element zero by adding to each
row a convenient multiple of the corresponding row.
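
The four steps can be written down almost literally. The sketch below is an illustration under stated assumptions: the function name `gauss_jordan` and the use of exact fractions are choices made here, and a singular coefficient matrix is reported by raising a `ValueError`.

```python
from fractions import Fraction

def gauss_jordan(A, B):
    """Solve A X = B by transforming the double matrix (A | B) into (I | X)."""
    n = len(A)
    # Double matrix: each row of A followed by the matching row of B
    m = [[Fraction(x) for x in ra] + [Fraction(x) for x in rb]
         for ra, rb in zip(A, B)]
    for c in range(n):                         # columns of A from left to right
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:                      # step I
            raise ValueError("coefficient matrix is singular")
        if pivot != c:                         # step II
            m[c], m[pivot] = m[pivot], m[c]
        d = m[c][c]                            # step III
        m[c] = [x / d for x in m[c]]
        for r in range(n):                     # step IV: above and below
            if r != c and m[r][c] != 0:
                f = m[r][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [row[n:] for row in m]              # right half is X

# One right-hand side: 2x + y = 3, x + 3y = 5
X = gauss_jordan([[2, 1], [1, 3]], [[3], [5]])
print(X[0][0], X[1][0])    # 4/5 7/5

# Identity matrix on the right: X is the inverse of A
inverse = gauss_jordan([[2, 1], [1, 3]], [[1, 0], [0, 1]])
```

Passing several columns in `B` solves several systems with the same coefficient matrix in one pass, as described above.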

1.6.3 Determining the Rank of a Matrix or a Determinant

The aim is the transformation of the matrix or of the elements of the determinant into
normal form. Then the rank of the matrix or the determinant is equal to the number
of nonzero (diagonal) elements.
I. If the diagonal element and all elements below the diagonal are zero, all elements
of this column are zero. Exchange the column with the rightmost column whose
elements are not all zero.
II. If the diagonal element is zero but not all elements below the diagonal are zero,
then exchange the corresponding row with a row below to make the diagonal el-
ement nonzero.
III. Make the elements below the diagonal element zero by adding to each row a con-
venient multiple of the corresponding row.
IV. Replace the elements to the right of the diagonal element by zeros. (Make these el-
ements zero by adding to each column a convenient multiple of the column which
is being transformed.)
V. Replace the diagonal element by the number one. (Divide the corresponding row
by the value of the diagonal element.)
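
A possible implementation of steps I to V is sketched below. The function name `rank_and_normal_form`, the return value (rank together with the transformed matrix), and the use of exact fractions are assumptions made for this illustration; the loop ends as soon as no column with a nonzero element is left.

```python
from fractions import Fraction

def rank_and_normal_form(rows):
    """Transform a matrix into normal form (steps I to V) and return (rank, matrix)."""
    a = [[Fraction(x) for x in row] for row in rows]
    m, n = len(a), len(a[0])
    r = 0                                      # current diagonal position
    while r < min(m, n):
        # Step I: column is zero from the diagonal down, so exchange it with the
        # rightmost column that still has a nonzero element in rows r .. m-1
        if all(a[i][r] == 0 for i in range(r, m)):
            other = next((j for j in range(n - 1, r, -1)
                          if any(a[i][j] != 0 for i in range(r, m))), None)
            if other is None:
                break                          # only zeros are left
            for row in a:
                row[r], row[other] = row[other], row[r]
        # Step II: bring a nonzero element onto the diagonal
        pivot = next(i for i in range(r, m) if a[i][r] != 0)
        a[r], a[pivot] = a[pivot], a[r]
        # Step III: clear the column below the diagonal element
        for i in range(r + 1, m):
            f = a[i][r] / a[r][r]
            a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        # Step IV: clear the row to the right by column operations
        for j in range(r + 1, n):
            f = a[r][j] / a[r][r]
            for i in range(m):
                a[i][j] -= f * a[i][r]
        # Step V: scale the diagonal element to one
        d = a[r][r]
        a[r] = [x / d for x in a[r]]
        r += 1
    return r, a

rank, normal_form = rank_and_normal_form([[1, 2, 3], [2, 4, 6], [1, 0, 1]])
print(rank)    # 2
```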


2 Tensor Analysis in Symbolic Notation and in Cartesian Coordinates
We are now prepared to explore in the rest of the book the analysis of tensors, which
can be quantitatively represented in Cartesian coordinate systems in the space we live
in, as is the case with tensors which represent physical quantities.
Since we live in a three-dimensional space, we only need the formulas of Chap-
ter 1 for N = 3. In Chapter 2, we develop tensor analysis in Cartesian coordinates. In
Chapter 3, we focus on special properties of the simplest tensors, besides scalars and
vectors, namely second-order tensors. In Chapter 4, we will extend tensor analysis to
include curvilinear coordinates. In Chapter 5, we introduce a method called theory of
the representation of tensor functions, which, similarly to dimensional analysis, can
be used to reduce the number of possible combinations of physical quantities in an

2.1 Cartesian Coordinates, Points, Position Vectors

2.1.1 Position Vectors and Point Coordinates

To describe a point in our three-dimensional world quantitatively, we need a coordinate
system. In this chapter we consider only Cartesian coordinate systems. A Cartesian
coordinate system consists of an origin and three pairwise perpendicular unit
vectors fixed at the origin. We call these unit vectors a (Cartesian) basis and denote
them by (ex , ey , ez ) or by (e1 , e2 , e3 ) or, remembering the summation convention, briefly
by ei .
We can then identify a point in a given Cartesian coordinate system either by its
coordinates xi = (x1 , x2 , x3 ) = (x, y, z) or by the position vector x from the origin of
the coordinate system to this point, so we have the following alternative notations:

x = x ex + y ey + z ez
= x1 e1 + x2 e2 + x3 e3 (2.1)
= xi ei .

2.1.2 Transformations of Cartesian Coordinate Systems

Consider two Cartesian coordinate systems in an arbitrary orientation relative to
each other, as shown in the following figure (to keep it simple, the figure is only
two-dimensional, but the considerations are also valid in three dimensions). We label one
of these coordinate systems by a tilde, to distinguish one from another, and we call it
the tilde coordinate system. We call the other coordinate system the no-tilde coordinate
system. The relative position of the two coordinate systems is then characterized
as follows.

I. The origin $\tilde O$ of the tilde coordinate system has the coordinates $(\beta_x, \beta_y, \beta_z)$ in the
no-tilde coordinate system, so its position vector is

β = βx ex + βy ey + βz ez = βi ei . (2.2)

II. The basis vectors ẽi of the tilde coordinate system are represented in the no-tilde
system as

ẽx = α11 ex + α21 ey + α31 ez = αi1 ei ,

ẽy = α12 ex + α22 ey + α32 ez = αi2 ei ,
ẽz = α13 ex + α23 ey + α33 ez = αi3 ei , (2.3)

ẽj = αij ei .

The transformation between the no-tilde and the tilde coordinate systems is clearly
completely described by the βi and αij . The αij are called the transformation coefficients
or the transformation matrix.

2.1.3 Properties of the Transformation Coefficients

1. Let $\tilde e^{\,j}_i$ be the coordinates of the vector $\tilde e_j$ in the basis $e_i$, i. e. let $\tilde e_j = \tilde e^{\,j}_i e_i$. Comparing
this expression with (2.3) yields

\[
\alpha_{ij} = \tilde e^{\,j}_i, \tag{2.4}
\]

i. e. αij is the i-coordinate of the vector ẽj represented in the basis ei .


2. Another geometric interpretation of the αij follows from scalar multiplication of
(2.3) by ek:¹

ẽj ⋅ ek = αij ei ⋅ ek = αij δik = αkj .

Since ẽj ⋅ ek is the cosine of the angle between ẽj and ek , we get

αij = cos(ei , ẽj ), (2.5)

where the αij are the so-called direction cosines. We see immediately that the indices of
αij are not interchangeable; αij is not the same as αji . For example, in the figure on the
facing page, ex and ẽy form an obtuse angle, and the corresponding direction cosine
is therefore negative; ey and ẽx form an acute angle, and the corresponding direction
cosine is therefore positive.

3. Since both the ei and the ẽi are pairwise perpendicular unit vectors, they satisfy the
relations

ei ⋅ ej = δij , ẽi ⋅ ẽj = δij .

With (2.3), this gives

δij = ẽi ⋅ ẽj = αmi em ⋅ αnj en = αmi αnj em ⋅ en = αmi αnj δmn = αmi αmj ,

i. e. according to (1.59), the transformation coefficients form an orthogonal matrix, and
we have

αik αjk = δij , αki αkj = δij . (2.6)

These two equivalent relations are called orthogonality relations.
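
To make (2.5) and (2.6) concrete, the following sketch builds the transformation coefficients for one specific case and checks both orthogonality relations. The choice of a tilde system rotated by 30 degrees about the z-axis, the helper names, and the tolerance are assumptions made for illustration.

```python
import math

def dot(u, v):
    """Scalar product of two coordinate triples."""
    return sum(x * y for x, y in zip(u, v))

phi = math.radians(30.0)   # assumed rotation angle of the tilde system about e_z

# Basis vectors, each given by its coordinates in the no-tilde system
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
e_tilde = [[math.cos(phi), math.sin(phi), 0.0],
           [-math.sin(phi), math.cos(phi), 0.0],
           [0.0, 0.0, 1.0]]

# (2.5): alpha_ij is the cosine of the angle between e_i and e~_j
alpha = [[dot(e[i], e_tilde[j]) for j in range(3)] for i in range(3)]

def delta(i, j):
    return 1.0 if i == j else 0.0

# (2.6): alpha_ik alpha_jk = delta_ij and alpha_ki alpha_kj = delta_ij
rows_ok = all(abs(sum(alpha[i][k] * alpha[j][k] for k in range(3)) - delta(i, j)) < 1e-12
              for i in range(3) for j in range(3))
cols_ok = all(abs(sum(alpha[k][i] * alpha[k][j] for k in range(3)) - delta(i, j)) < 1e-12
              for i in range(3) for j in range(3))

print(alpha[0][1], alpha[1][0])   # alpha_12 is negative, alpha_21 is positive
print(rows_ok, cols_ok)
```

The opposite signs of alpha_12 and alpha_21 in this example reflect the remark above that the indices of the direction cosines are not interchangeable.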

4. Orthogonal matrices satisfy, according to (1.60),

\[
\det \underset{\sim}{\alpha} = \pm 1. \tag{2.7}
\]

For a geometric visualization of these two cases, we first assume that both coordinate
systems have the same orientation: either both are right-handed systems or both are
left-handed systems. We can then convert one system into the other system by a motion.²

1 Assuming familiarity with the scalar product of two vectors from elementary vector analysis.
2 By a motion, we mean a composition of a rotation and a translation (but not a reflection). Some
authors call this a direct, proper, or rigid motion.


If the coordinate transformation can be interpreted as a motion, we can associate
a transformation matrix $\alpha_{ij}(t)$ to every point in time t between the initial time t0 and
the final time t1 (when both coordinate systems coincide), i. e. for all t with t0 ≤ t ≤ t1.
These matrices, and consequently their determinants, are continuous functions of t.
At the final time, t1, we have $\alpha_{ij}(t_1) = \delta_{ij}$ and hence $\det \underset{\sim}{\alpha}(t_1) = 1$. Since the determinant
is continuous and, according to (2.7), can only take the values ±1, its value cannot jump
from 1 to −1 at any point in time, so $\det \underset{\sim}{\alpha} = 1$ at all times and for all such
transformations; in this case the transformation matrix is proper orthogonal.
The simplest case of a coordinate transformation between two Cartesian coordinate
systems with different orientations is a reflection; then two of the three basis vectors
do not change, and the third basis vector reverses its direction. In this case, the
transformation matrix is a diagonal matrix with two ones and one negative one, for
example

\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},
\]

and its determinant is −1. An arbitrary transformation between two coordinate sys-
tems with different orientations can be composed of a reflection followed by a motion.
Since the determinant of a transformation matrix does not change during a motion, it
has for every transformation, which changes the orientation of the coordinate system,
the value −1. In other words, the transformation is improper orthogonal.
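
The continuity argument can be illustrated numerically. In the sketch below, the family of rotations about the z-axis stands in for a motion, and the matrix with the entry −1 is the reflection given above; the helper names, the chosen angles, and the tolerance are assumptions made for illustration.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(A):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def rotation_z(phi):
    """Transformation matrix of a rotation by phi about the z-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

# Along a motion the determinant stays at +1 for every intermediate angle
dets_motion = [det3(rotation_z(0.1 * step)) for step in range(32)]

# A reflection composed with a rotation has determinant -1
# (the order of the two factors does not affect the determinant)
det_composed = det3(matmul(rotation_z(0.7), reflection))

print(min(dets_motion), max(dets_motion), det_composed)
```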

2.1.4 Transformation Law for Basis Vectors

Formula (2.3) can be easily solved for the ei with the help of the orthogonality relation.
We formally multiply (2.3) by αkj , which, due to the summation convention, means
summation over j. With (2.6), we get

αkj ẽj = αkj αij ei = δki ei = ek .

After renaming indices, we obtain the following transformation law for basis vectors:

ei = αij ẽj , ẽi = αji ej . (2.8)

Since we only used the orthogonality relation, these relations (as well as the trans-
formation laws we will derive later) hold, whether the coordinate system changes its
orientation during the transformation or not. In other words, the αij in (2.8) can repre-
sent a proper or an improper orthogonal matrix.


2.1.5 Transformation Law for Point Coordinates

The position vector to a point P can be expressed, as shown in the figure on page 32,
in both coordinate systems by

x = xi ei , x̃ = x̃i ẽi , x = β + x̃.

If all quantities in the last equation are expressed with respect to the no-tilde basis,
we obtain

xi ei = βi ei + x̃i αki ek .

We want to factor out the basis, so we need to rename the indices in the second term.
Then we have

xi ei = βi ei + x̃j αij ei = (βi + αij x̃j ) ei .

Since the representation (2.1) of a position vector with respect to a basis is unique, we
get

xi = βi + αij x̃j .

To solve this equation for x̃j , we multiply it by αik :

αik xi = αik βi + αij αik x̃j = αik βi + δjk x̃j = αik βi + x̃k ,
x̃k = αik xi − αik βi .

We have thus derived the transformation law for point coordinates:

xi = αij x̃j + βi , x̃i = αji xj − αji βj . (2.9)
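
Both directions of (2.9) are easy to try out. The following sketch assumes a tilde system that is rotated by 90 degrees about the z-axis and shifted by β = (1, 2, 0); the function names and the example point are chosen here for illustration.

```python
def to_no_tilde(x_tilde, alpha, beta):
    """(2.9), first equation: x_i = alpha_ij x~_j + beta_i."""
    return [sum(alpha[i][j] * x_tilde[j] for j in range(3)) + beta[i]
            for i in range(3)]

def to_tilde(x, alpha, beta):
    """(2.9), second equation: x~_i = alpha_ji x_j - alpha_ji beta_j."""
    return [sum(alpha[j][i] * (x[j] - beta[j]) for j in range(3))
            for i in range(3)]

# Assumed example. The columns of alpha are the tilde basis vectors
# expressed in the no-tilde system, see (2.3): e~_x = e_y, e~_y = -e_x.
alpha = [[0, -1, 0],
         [1, 0, 0],
         [0, 0, 1]]
beta = [1, 2, 0]

x = [1, 3, 0]
x_tilde = to_tilde(x, alpha, beta)
print(x_tilde)                              # [1, 0, 0]
print(to_no_tilde(x_tilde, alpha, beta))    # [1, 3, 0]
```

The point lies one unit along the tilde x-axis from the tilde origin, and transforming back returns the original coordinates.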

2.2 Vectors
2.2.1 Vectors, Vector Components, and Vector Coordinates

1. We first met vectors 3 as arrows in space; the basis vectors of a Cartesian coordinate
system are one such example. In order to represent such a vector quantitatively, we

3 Latin: carrier, driver (agent noun to vehere [to carry]); short for radius vector (traveling ray, i. e.
position vector: the ray from a fixed point to a moving point, e. g. in the case of planetary motion from the
sun at the focus of the elliptic orbit to the planet). The term vector thus evolved from the term position
vector.

need, similarly as for a point, a coordinate system. Analogously to (2.1) we get for a
vector a in a Cartesian coordinate system with basis ei in different notations

a = ax ex + ay ey + az ez
= a1 e1 + a2 e2 + a3 e3 (2.10)
= ai ei .

2. Here, we would like to conceptually distinguish between the components and the
coordinates of a vector. We call the quantity ax ex the x-component of the vector a, and
we call the quantity ax the x-coordinate of the vector a (in the given coordinate system
or with respect to the given basis). The components of a vector are themselves vectors,
but the coordinates are not; the components of a vector are characterized by their mag-
nitude and direction, the coordinates are characterized by their absolute value and
sign. Conceptually distinguishing between components and coordinates of a vector is
not common practice, but it is meaningful.

2.2.2 The Transformation Law for Vector Coordinates

1. We can obtain the transformation law for the coordinates of an oriented line from
the transformation law for point coordinates if we realize that (compare the [again
only two-dimensional] figure on this page)

a = y − x = ỹ − x̃,

and so

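\[
\begin{aligned}
a_i e_i &= (y_i - x_i)\, e_i, & a_i &= y_i - x_i, \\
\tilde a_i \tilde e_i &= (\tilde y_i - \tilde x_i)\, \tilde e_i, & \tilde a_i &= \tilde y_i - \tilde x_i.
\end{aligned}
\]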


Here we could again cancel out the bases ei and ẽi , because the decomposition of a vec-
tor into components is unique, like the decomposition of a position vector. According
to (2.9) we have

xi = αij x̃j + βi , yi = αij ỹj + βi ,

which gives

\[
a_i = y_i - x_i = \alpha_{ij} (\tilde y_j - \tilde x_j) = \alpha_{ij} \tilde a_j.
\]

Conversely, according to (2.9) we get

\[
\begin{gathered}
\tilde x_i = \alpha_{ji} x_j - \alpha_{ji} \beta_j, \qquad \tilde y_i = \alpha_{ji} y_j - \alpha_{ji} \beta_j, \\
\tilde a_i = \tilde y_i - \tilde x_i = \alpha_{ji} (y_j - x_j) = \alpha_{ji} a_j.
\end{gathered}
\]

Thus we obtained the transformation law for the coordinates of an oriented line

\[
a_i = \alpha_{ij} \tilde a_j, \qquad \tilde a_i = \alpha_{ji} a_j. \tag{2.11}
\]

We arrive at the same result, if in

\[
a = a_i e_i = \tilde a_i \tilde e_i
\]

we substitute (2.8)₁ for $e_i$:

\[
a = a_i \alpha_{ij} \tilde e_j = \tilde a_i \tilde e_i.
\]

Clearly, these are two different representations of the vector a with respect to the tilde
basis. The representation of a vector with respect to a basis is unique, so we can cancel
out the basis and obtain an equation for coordinates only. This changes the summation
index of the basis into a free index. For all terms in the equation for coordinates
to match in this free index, we have to rename the summation indices before canceling
out the basis, so that the index of the tilde basis is the same on both sides of the
equation. For example, here we can replace on the left side j by i and i by j, and we get

\[
a_j \alpha_{ji} \tilde e_i = \tilde a_i \tilde e_i.
\]

Then, after canceling out $\tilde e_i$, we get

\[
\tilde a_i = \alpha_{ji} a_j,
\]

which is the transformation law (2.11)₂. Conversely we obtain (2.11)₁ if we replace in
the original equation $\tilde e_i$ by (2.8)₂.

2. Two typical examples of physical quantities described by vectors are velocities and
angular velocities. However, there is an important difference between these two types
of quantities. Velocity is characterized by its magnitude and direction, and angular
velocity is characterized by its magnitude and the direction of rotation. We readily
associate a velocity with an oriented line and therefore with a vector; its magnitude is the
speed and its direction is the direction of motion. For angular
velocity we need a convention, for example the right-hand screw rule, to associate
the direction of rotation with the direction of the vector. Let us, for the lack of a better
argument, arbitrarily agree to use the right-hand screw rule. Then we associate the
angular velocity with a vector whose magnitude is the magnitude of the angular velocity and
whose direction, together with the direction of rotation, forms a right-handed screw.
Vectors which describe physical quantities such as velocity are called polar vectors
and vectors which describe physical quantities like angular velocity are called axial
vectors (or pseudovectors).
We generally assume in physics that a physical process is unaffected by a trans-
lation, rotation, or reflection. If we move or reflect the coordinate system in the same
way, we consequently expect that the physical quantities have the same coordinates
for the different cases. (We can neglect here the violation of reflection invariance in
elementary particle physics.)

Clearly, polar vectors satisfy these invariances inherently. For a quantity associated
with an axial vector, both the direction of rotation and the corresponding vector have
to be transformed. In a motion, the correlation between the direction of rotation and
the direction of the corresponding vector remains unchanged; a reflection, however,
reverses this correlation: a right-hand screw correlation becomes a left-hand screw
correlation and vice versa. If we do not adjust the correlation, then the signs of the
vector coordinates are reversed in the reflected coordinate system, i. e. the invariance
requirement is violated. To preserve reflection invariance, we introduce the following
convention.

A physical quantity which is characterized by a magnitude and a direction of rotation
is associated with a vector corresponding to the right-hand screw rule in a
right-handed system and with a vector corresponding to the left-hand screw rule in
a left-handed system.

This is a convention in the sense that the reflection invariance would also be satisfied
if we associate with such a quantity in a right-handed system a vector corresponding
to a left-hand screw rule and in a left-handed system a vector corresponding to the
right-hand screw rule.
In a motion (translation and rotation) where the correlation between the direction
of rotation and the corresponding vector is preserved, we do not have this problem.
Certainly this convention has to hold also if the same physical quantity is represented
in two different coordinate systems. A polar vector is invariant under an arbitrary
change of coordinates, i. e. we have $a = \tilde a$, which explains why the vector
commonly is denoted by a in both coordinate systems. An axial vector, however, is
invariant only under a motion; for a coordinate transformation with a change of
orientation we have $a = -\tilde a$. This may seem unfamiliar, as it means that a vector describing
an angular velocity, angular momentum, or torque depends on the orientation of
the coordinate system in which it is represented. In other words, such a vector is not
independent of the coordinate system.
Since we have $\det \underset{\sim}{\alpha} = 1$ for a change of coordinates without change of orientation
and $\det \underset{\sim}{\alpha} = -1$ for a change of coordinates together with a change of orientation, we
can summarize the transformation law for axial vectors by writing

\[
a = (\det \underset{\sim}{\alpha})\, \tilde a,
\]

and we obtain as transformation law for the coordinates of an axial vector

\[
a_i = (\det \underset{\sim}{\alpha})\, \alpha_{ij} \tilde a_j, \qquad
\tilde a_i = (\det \underset{\sim}{\alpha})\, \alpha_{ji} a_j. \tag{2.12}
\]
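
A familiar quantity that behaves according to (2.12) is the cross product of two polar vectors. The sketch below is an illustration under an explicit assumption that goes beyond the text at this point: the cross product is evaluated with the same coordinate formula in both coordinate systems, which corresponds to the screw-rule convention above. The function names and the example vectors are chosen here.

```python
def cross(a, b):
    """Coordinate formula of the cross product, used unchanged in both systems."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def polar_to_tilde(a, alpha):
    """(2.11), second equation: a~_i = alpha_ji a_j."""
    return [sum(alpha[j][i] * a[j] for j in range(3)) for i in range(3)]

# Reflection of the z-axis: changes the orientation, det alpha = -1
alpha = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
det_alpha = -1

a = [1, 2, 3]     # polar vectors (made-up coordinates)
b = [4, 5, 6]

c = cross(a, b)                                           # no-tilde system
c_tilde = cross(polar_to_tilde(a, alpha), polar_to_tilde(b, alpha))

polar_rule = polar_to_tilde(c, alpha)                     # (2.11) applied to c
axial_rule = [det_alpha * v for v in polar_rule]          # (2.12) applied to c

print(c_tilde)      # [3, -6, -3]
print(polar_rule)   # [-3, 6, 3]
print(axial_rule)   # [3, -6, -3]
```

Only the axial rule reproduces the coordinates computed directly in the reflected system.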

3. From the decomposition (2.10) and the transformation laws (2.11) and (2.12) we find
the following definition of a polar vector and an axial vector.

A vector is a quantity which according to (2.10) can be represented in a Cartesian

coordinate system and whose coordinates transform according to (2.11) or (2.12) if
the Cartesian coordinate system is changed. If the transformation law (2.11) holds,
the vector is called polar, and if the transformation law (2.12) holds, the vector is
called axial.

4. Finally we remark that the Cartesian coordinates of a vector depend only on the ba-
sis of the chosen coordinate system, but the coordinates of a position vector depend
also on its origin. This shows that position vectors are not vectors in the sense of the
definition above; however, the difference of two position vectors, which can be interpreted
geometrically as a directed line from one point to another point, is a vector in
the sense of the definition above; compare the figure on page 36 and the derivation
of (2.11).

2.3 Tensors
2.3.1 Second-Order Tensors

1. Let $B_i$ be the coordinates of an arbitrary vector $B = B_i e_i$ with respect to a basis $e_i$,
and let $T_{ij}$ be a matrix which assigns the coordinates $A_i$ of another vector A to the $B_i$
according to the relation

\[
A_i = T_{ij} B_j. \tag{2.13}
\]

Let $\tilde e_j$ be another arbitrary basis such that between the coordinates $\tilde A_i$ and $\tilde B_j$ of the two
vectors A and B the analogous relation

\[
\tilde A_i = \tilde T_{ij} \tilde B_j \tag{2.14}
\]

holds. Then, what is the relation between the two matrices Tij and T̃ij ?
Let us assume initially that the two vectors A and B are polar. Then according
to (2.11) we have

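\[
A_i = \alpha_{ik} \tilde A_k, \qquad B_j = \alpha_{jn} \tilde B_n.
\]

Substituting this into (2.13), we get

\[
\alpha_{ik} \tilde A_k = T_{ij} \alpha_{jn} \tilde B_n.
\]

Multiplication by $\alpha_{im}$ gives with (2.14)

\[
\begin{gathered}
\underbrace{\alpha_{im} \alpha_{ik}}_{\delta_{mk}} \tilde A_k = T_{ij} \alpha_{im} \alpha_{jn} \tilde B_n, \\
\tilde A_m = \alpha_{im} \alpha_{jn} T_{ij} \tilde B_n = \tilde T_{mn} \tilde B_n, \\
(\alpha_{im} \alpha_{jn} T_{ij} - \tilde T_{mn})\, \tilde B_n = 0.
\end{gathered}
\]

For brevity we temporarily introduce $S_{mn} := \alpha_{im} \alpha_{jn} T_{ij}$, i. e. we have

\[
(S_{mn} - \tilde T_{mn})\, \tilde B_n = 0. \tag{a}
\]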

Now we choose three linearly independent polar vectors a, b, and c, so that every polar
vector B can be written as a linear combination of these three vectors; we have

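\[
B = \alpha\, a + \beta\, b + \gamma\, c, \qquad
\tilde B_n = \alpha\, \tilde a_n + \beta\, \tilde b_n + \gamma\, \tilde c_n,
\]

where α, β, and γ are scalar coefficients (not to be confused with the transformation
coefficients αij and the origin coordinates βi).
If we substitute the last equation into (a), we see that (a) is satisfied for arbitrary B, if
(a) is satisfied for three linearly independent vectors. If we now substitute in (a) for B
successively a, b, and c, and write out the equations for m = 1, we obtain

\[
\begin{aligned}
(S_{11} - \tilde T_{11})\, \tilde a_1 + (S_{12} - \tilde T_{12})\, \tilde a_2 + (S_{13} - \tilde T_{13})\, \tilde a_3 &= 0, \\
(S_{11} - \tilde T_{11})\, \tilde b_1 + (S_{12} - \tilde T_{12})\, \tilde b_2 + (S_{13} - \tilde T_{13})\, \tilde b_3 &= 0, \\
(S_{11} - \tilde T_{11})\, \tilde c_1 + (S_{12} - \tilde T_{12})\, \tilde c_2 + (S_{13} - \tilde T_{13})\, \tilde c_3 &= 0.
\end{aligned}
\]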

This is a homogeneous system of linear equations for

(S11 − T̃11 ), (S12 − T̃12 ), and (S13 − T̃13 )

with nonzero coefficient determinant; i. e. the system of linear equations has only the
trivial solution

S11 = T̃11 , S12 = T̃12 , S13 = T̃13 .

For m = 2 and m = 3 we find analogously that also the other elements of the two
matrices Sij and T̃ij must be equal, i. e. we have

T̃mn = αim αjn Tij ,

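and from this we compute the inverse relationship by multiplying by $\alpha_{pm} \alpha_{qn}$:

\[
\alpha_{pm} \alpha_{qn} \tilde T_{mn}
= \underbrace{\alpha_{pm} \alpha_{im}}_{\delta_{pi}}\, \underbrace{\alpha_{qn} \alpha_{jn}}_{\delta_{qj}}\, T_{ij}
= T_{pq},
\]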

Tij = αim αjn T̃mn , T̃ij = αmi αnj Tmn . (2.15)
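
The transformation law (2.15) can be verified on a small example. The sketch below assumes a tilde system rotated by 90 degrees about the z-axis and made-up integer coordinates for T and B, so that all comparisons are exact; the function names are chosen here for illustration.

```python
R = range(3)

def vector_to_tilde(a, alpha):
    """(2.11): a~_i = alpha_ji a_j."""
    return [sum(alpha[j][i] * a[j] for j in R) for i in R]

def tensor_to_tilde(T, alpha):
    """(2.15), second equation: T~_ij = alpha_mi alpha_nj T_mn."""
    return [[sum(alpha[m][i] * alpha[n][j] * T[m][n] for m in R for n in R)
             for j in R] for i in R]

def tensor_to_no_tilde(T_tilde, alpha):
    """(2.15), first equation: T_ij = alpha_im alpha_jn T~_mn."""
    return [[sum(alpha[i][m] * alpha[j][n] * T_tilde[m][n] for m in R for n in R)
             for j in R] for i in R]

def apply(T, b):
    """(2.13): A_i = T_ij B_j."""
    return [sum(T[i][j] * b[j] for j in R) for i in R]

alpha = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]      # assumed transformation matrix
T = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]          # made-up tensor coordinates
B = [1, 0, 2]                                   # made-up vector coordinates

A = apply(T, B)
T_tilde = tensor_to_tilde(T, alpha)

# The relation A~_i = T~_ij B~_j holds in the tilde system as well
lhs = vector_to_tilde(A, alpha)
rhs = apply(T_tilde, vector_to_tilde(B, alpha))
print(lhs == rhs)                                   # True
print(tensor_to_no_tilde(T_tilde, alpha) == T)      # True
```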

This is clearly a generalization of the transformation law (2.11) for coordinates of po-
lar vectors. Similarly as before, where we interpreted the quantities ai and ã i , related
by (2.11), as the coordinates of a (coordinate system-independent) polar vector a, we
interpret the matrices Tij and T̃ ij , related by (2.15), as the coordinates of a (coordinate
system-independent) second-order polar tensor4 T with respect to the bases ei and ẽi .
Thus, if in any coordinate system the coordinates Bi of an arbitrary polar vector B
(or equivalently, the coordinates of three linearly independent polar vectors) are as-
signed to the coordinates Ai of a polar vector by always the same relation, either (2.13)
or (2.14), then the Tij are the coordinates of a polar tensor of order 2.
Since $T_{ij} = T^{\mathrm T}_{ji}$ we arrive at the same transformation law if we start with the
equations $A_i = T^{\mathrm T}_{ji} B_j$ and $\tilde A_i = \tilde T^{\mathrm T}_{ji} \tilde B_j$, i. e. if we sum over the first index of the matrix.

4 Latin: tensioner (agent noun to tendere [to tension]); identified initially only with the quantity that
we today call the (Cauchy) stress tensor. (The word stress tensor is etymologically a pleonasm.)


Because the αij are orthogonal and since the transposed matrix of an orthogonal
matrix is equal to its inverse, it follows from (1.58) that all coordinate matrices of a
second-order polar tensor are similar; however, not all matrices which are similar to
a coordinate matrix of a second-order polar tensor are Cartesian coordinates of this
tensor.

Problem 2.1.
Prove the converse: If the relation (2.13) or (2.14) is valid in any Cartesian coordinate
system and if the Bi are the coordinates of a polar vector and the Tij are the coordinates
of a second-order polar tensor (i. e. they satisfy the transformation equations (2.11) and
(2.15)), then the Ai are the coordinates of a polar vector. (Note the difference: For the
original theorem the Bi must be the coordinates of an arbitrary vector, but the converse
theorem only requires that they are the coordinates of a vector.)

2. To develop an intuition for second-order polar tensors, let us consider the (Cauchy)
stress tensor, from which the tensor originally has its name. The stress tensor describes
the relation between the stress vector σ, i. e. the surface force per unit area
at a point of a surface, and the orientation n of the surface element through this point.
The stress vector depends, even in the simplest case of a fluid at rest, on the orientation
of the surface element: it is always perpendicular to the surface element. Let p be the
pressure at a given surface element. Then we have

\[
\sigma_i = -p\, n_i.
\]

For a fluid in motion (or a deformable solid body) it is shown in mechanics that the
equations

σx (x, t, n) = πxx (x, t) nx + πxy (x, t) ny + πxz (x, t) nz ,

σy (x, t, n) = πyx (x, t) nx + πyy (x, t) ny + πyz (x, t) nz ,
σz (x, t, n) = πzx (x, t) nx + πzy (x, t) ny + πzz (x, t) nz

hold, or, using the summation convention,

σi (x, t, n) = πij (x, t) nj ,

i. e. every coordinate of the stress vector σ is a homogeneous linear function of all

coordinates of the normal vector n. (Setting

πij = −p δij

we obtain the formula for the fluid at rest as a special case of the formula above.)
Since both n and σ are polar vectors and since we can substitute for n in particular the
three linearly independent unit vectors ex , ey , and ez , the πij are, according to our
definition (2.13), the coordinates of a polar tensor of order 2, which we call the stress tensor π.


The stress tensor describes the state of stress at a given point by assigning a stress vec-
tor σ to every normal vector n; hence, we can also say the tensor π maps the vector n
to the vector σ.
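
The mapping from n to σ can be written out directly. In the following sketch the pressure value and the coordinates of the second stress tensor are made up for illustration, and the function name `stress_vector` is an assumption; only the relation σi = πij nj is taken from the text.

```python
def stress_vector(pi, n):
    """sigma_i = pi_ij n_j for a given normal vector n."""
    return [sum(pi[i][j] * n[j] for j in range(3)) for i in range(3)]

# Fluid at rest: pi_ij = -p delta_ij (p = 2 in arbitrary units)
p = 2
pi_rest = [[-p if i == j else 0 for j in range(3)] for i in range(3)]
print(stress_vector(pi_rest, [0, 0, 1]))      # [0, 0, -2]

# A made-up state of stress with shear contributions
pi_general = [[-2, 1, 0],
              [1, -2, 0],
              [0, 0, -2]]

# Inserting the basis vectors for n returns the columns of pi_ij
for n in ([1, 0, 0], [0, 1, 0], [0, 0, 1]):
    print(n, stress_vector(pi_general, n))
```

For the fluid at rest the stress vector is antiparallel to n for every orientation; for the second example it is in general not perpendicular to the surface element.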

3. If in (2.13) Ai and Bi are the coordinates of two axial vectors, we again obtain the
transformation law (2.15); so also for this case the Tij are the coordinates of a second-
order polar tensor. If, however, one of the two vectors A and B is a polar vector and the
other vector is an axial vector, then we obtain the transformation law

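\[
T_{ij} = (\det \underset{\sim}{\alpha})\, \alpha_{im} \alpha_{jn} \tilde T_{mn}, \qquad
\tilde T_{ij} = (\det \underset{\sim}{\alpha})\, \alpha_{mi} \alpha_{nj} T_{mn}, \tag{2.16}
\]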

and the Tij are called the coordinates of an axial tensor of order 2. Often the word
pseudotensor is used instead of axial tensor.

2.3.2 Tensors of Arbitrary Order

1. We can show similarly that if in any Cartesian coordinate system the coordinates Bi
of an arbitrary vector are assigned to the coordinates Aij of a tensor of order 2 by the
analogous relation

Aij = Tijk Bk ,

then the relation between Tijk and T̃ijk with respect to the bases ei and ẽi , where A and
B are both either polar or axial, is given by the transformation law

Tijk = αim αjn αkp T̃mnp , T̃ijk = αmi αnj αpk Tmnp

or, if one of the two vectors A and B is polar and the other is axial, by the transformation

Tijk = (det α) ̃ T̃ijk = (det α)

∼ αim αjn αkp Tmnp , ∼ αmi αnj αpk Tmnp .

In the former case the Tijk are called the coordinates of a polar tensor T of order 3, and
in the latter case they are called the coordinates of an axial tensor T of order 3.
We obtain the same transformation laws, if we start with the relations

$$
A_{ij} = T^{*}_{ikj}\,B_k \quad\text{or}\quad A_{ij} = T^{**}_{kij}\,B_k .
$$


2. A scalar is also called a tensor of order 0 and a vector is also called a tensor of
order 1. Thus we have the following rule.

The order of a tensor equals the number of underlines of the tensor and the number
of indices of its coordinates.

3. The transformation equations for polar tensors and for axial tensors are different.
For polar tensors we have the following laws.

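$$
\begin{aligned}
a &= \tilde a, & \tilde a &= a, \\
a_i &= \alpha_{im}\,\tilde a_m, & \tilde a_i &= \alpha_{mi}\,a_m, \\
a_{ij} &= \alpha_{im}\,\alpha_{jn}\,\tilde a_{mn}, & \tilde a_{ij} &= \alpha_{mi}\,\alpha_{nj}\,a_{mn}, \\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.17}
$$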

For axial tensors we have the following laws.

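$$
\begin{aligned}
a &= (\det\underset{\sim}{\alpha})\,\tilde a, & \tilde a &= (\det\underset{\sim}{\alpha})\,a, \\
a_i &= (\det\underset{\sim}{\alpha})\,\alpha_{im}\,\tilde a_m, & \tilde a_i &= (\det\underset{\sim}{\alpha})\,\alpha_{mi}\,a_m, \\
a_{ij} &= (\det\underset{\sim}{\alpha})\,\alpha_{im}\,\alpha_{jn}\,\tilde a_{mn}, & \tilde a_{ij} &= (\det\underset{\sim}{\alpha})\,\alpha_{mi}\,\alpha_{nj}\,a_{mn}, \\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.18}
$$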
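
As a hedged illustration of the polar transformation law for second-order tensors, the following Python sketch transforms a coordinate matrix into the tilde system and back. The particular orthogonal matrix alpha, the sample coordinates, and the function names to_tilde and from_tilde are assumptions made for this example; exact fractions are used so that the round trip can be compared without rounding errors.

```python
from fractions import Fraction as F

R = range(3)

# An exactly orthogonal matrix with determinant +1 (assumed for illustration).
alpha = [[F(2, 3), F(-1, 3), F(2, 3)],
         [F(2, 3), F(2, 3), F(-1, 3)],
         [F(-1, 3), F(2, 3), F(2, 3)]]

def to_tilde(a):
    """a~_ij = alpha_mi alpha_nj a_mn."""
    return [[sum(alpha[m][i] * alpha[n][j] * a[m][n] for m in R for n in R)
             for j in R] for i in R]

def from_tilde(at):
    """a_ij = alpha_im alpha_jn a~_mn."""
    return [[sum(alpha[i][m] * alpha[j][n] * at[m][n] for m in R for n in R)
             for j in R] for i in R]

# Hypothetical coordinates of a second-order polar tensor.
a = [[F(1), F(2), F(0)],
     [F(0), F(3), F(1)],
     [F(4), F(0), F(5)]]

a_tilde = to_tilde(a)
a_back = from_tilde(a_tilde)   # the round trip recovers the original coordinates
print(a_back == a)
```

For an axial tensor the same sketch would apply with an additional factor det α in both functions; for the proper rotation assumed here this factor equals 1, so the two laws coincide.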

4. For both polar and axial vectors we have in the two coordinate systems

$$
\underline{a} = a_i\,\underline{e}_i, \qquad \underline{a} = \tilde a_i\,\tilde{\underline{e}}_i .
$$

Similarly, tensors of order 2 or higher can be defined from their representation with
respect to a Cartesian coordinate system and either the transformation law (2.17) or
(2.18), respectively, for the coordinates so obtained. However, we need to postpone
this definition, until we know how to represent a tensor in a coordinate system.

5. The tensor whose coordinates are all zero is called the zero tensor. We denote such
a tensor by $0$, $\underline{0}$, $\underline{\underline{0}}$, etc.

6. Sometimes it is convenient to have a symbol for a tensor of arbitrary order. We

denote such a tensor by upper-case large cursive script letters (A , B , etc.) and its
coordinates by ai...j , bi...j , etc.

2.3.3 Symmetries in Physics

General invariances in physics, such as invariance under a translation, a rotation, or a

reflection, are also called symmetries in physics. Besides these three symmetries, there
are the invariance under repetition, the so-called reproducibility, and the invariance


of physical quantities and physical equations under a change of units. All these in-
variances have consequences for the formulation of physical equations, namely it fol-
lows that physical equations do not change under a symmetry operation (e. g. when a
physical experiment is repeated). As a result, certain measurable properties of phys-
ical processes cannot appear in a physical equation (i. e. strictly speaking, they are
no physical quantities), and not all mathematically defined combinations of physical
quantities are also physically permissible. We only mention these consequences here;
a sound justification is beyond the scope of this text.
From the reproducibility of a physical process it follows that points of time can-
not appear explicitly in a physical equation but only time intervals; and from the in-
variance under translation it follows that points (position coordinates) cannot appear
explicitly in a physical equation but only distances between points.
From the invariance of physical properties under a change of units it follows that
all terms of a physical equation must be measurable in the same unit (in other words,
they must be dimensionally homogeneous). For example, energy and temperature are
not measured in the same units, so an energy cannot be equal to a temperature. This
symmetry is systematically utilized in dimensional analysis, a method which is known
to reduce the number of mathematically possible combinations of physical quantities.
The reflection invariance leads to the distinction between polar and axial tensors;
and from the rotational invariance it follows that physical quantities must all be (po-
lar or axial) tensors and that in a physical equation all terms must be tensors of the
same order that must all be either polar or axial. (Analogous to the concept of dimen-
sional homogeneity, we could call this property tensorial homogeneity.) For exam-
ple, a torque (vector!) cannot be equal to a work (scalar!), although both quantities
are measured in the unit Joule, i. e. we have dimensional homogeneity; and a velocity
(polar vector!) cannot be equal to an angular velocity (axial vector!), even if we elim-
inated their dimensional inhomogeneity by multiplying the axial vector with a polar
scalar. These symmetries are systematically exploited in the theory of representation
of tensor functions (see Chapter 5).
There are no practical reasons for the use of left-handed systems in physics and
technology. Thus we could drop the distinction between polar tensors and axial tensors
if we limited ourselves to right-handed systems. But then we could not use
the reflection invariance to exclude equations which are physically not permissible
(i. e. which are irrelevant for the description of physical processes).

2.4 Symbolic Notation, Index Notation, and Matrix Notation

We distinguished between tensors themselves (which we denoted by underlined let-
ters) and their Cartesian coordinates (which we denoted by indexed letters). Likewise


we can define mathematical operations for the tensors themselves and for their coor-
dinates. As a result we have two different notations of tensor calculus, which we call
symbolic notation and index notation. We require tensor equations to be independent
of a coordinate system in symbolic notation (i. e. between tensors) as well as in index
notation (i. e. between tensor coordinates).
It is important to master both notations and to be able to translate from one nota-
tion into the other. Every equation in symbolic notation can be translated into an equa-
tion in index notation; the converse is not always possible, but it is in most practically
occurring cases. We think it is best to grow up virtually bilingual from the beginning
and will therefore write all operations in both notations. When performing the com-
putations we use whichever notation is more convenient. Usually this means we do
the computations in index notation and then translate the results into symbolic nota-
tion. Occasionally we will also speak seemingly inaccurately, for example of a vector
ai , since the coordinates of a vector in an unspecified Cartesian coordinate system are
after all also a symbol for the vector itself.
Let us finally agree to regard the coordinates of vectors as column matrices (rather
than row matrices). Then we can write all tensor equations which contain only scalars,
vectors, and second-order tensors also as matrix equations. Writing tensor equations
in matrix notation combines the conciseness of symbolic notation with the algorithmic
character of index notation. It fails, however, for tensors of order higher than 2
(we will see that this is already the case for the vector product of two vectors); and it
is not obvious how many columns a matrix $\underset{\sim}{a}$ has (for example, whether it represents
the coordinates of a vector or of a second-order tensor). Nevertheless, matrix notation
is widely used (and often confused with symbolic notation). For this reason we will
write the definitions of tensor algebraic operations not only in symbolic notation and
index notation, but whenever possible also in matrix notation; however, we will not
use it here.

2.5 Equality, Addition, and Subtraction of Tensors. Multiplication of a Tensor by a Scalar. Linear Independence

In the following sections we define mathematical operations for tensors, i. e. we de-
velop a calculus for tensors. We will define an operation between tensors (i. e. in sym-
bolic notation) by writing out the operation that has to be carried out between the
coordinates of the involved tensors (in index notation).5

5 Since tensor coordinates are (for us: real) numbers (extension to complex numbers is straightforward),
the rules of arithmetic apply to tensor equations in index notation.


We start by defining the linear operations equality, addition, subtraction, and

multiplication by a scalar. In the next sections we will then introduce additional alge-
braic operations, in particular three multiplications as well as differential and integral
operations for tensors.
1. Two tensors are equal, if they match (in the same coordinate system) in all homol-
ogous coordinates.

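$$
\begin{aligned}
a &= A, & a &= A, & a &= A, \\
\underline{a} &= \underline{B}, & a_i &= B_i, & \underset{\sim}{a} &= \underset{\sim}{B}, \\
\underline{\underline{a}} &= \underline{\underline{C}}, & a_{ij} &= C_{ij}, & \underset{\sim}{a} &= \underset{\sim}{C}, \\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.19}
$$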

Conversely, this implies that two tensors are not equal if they differ in at least one pair
of homologous coordinates.
Two tensors are added or subtracted by adding or subtracting their homologous coordinates.

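$$
\begin{aligned}
a \pm b &= A, & a \pm b &= A, & a \pm b &= A, \\
\underline{a} \pm \underline{b} &= \underline{B}, & a_i \pm b_i &= B_i, & \underset{\sim}{a} \pm \underset{\sim}{b} &= \underset{\sim}{B}, \\
\underline{\underline{a}} \pm \underline{\underline{b}} &= \underline{\underline{C}}, & a_{ij} \pm b_{ij} &= C_{ij}, & \underset{\sim}{a} \pm \underset{\sim}{b} &= \underset{\sim}{C}, \\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.20}
$$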

To multiply a tensor by a scalar we multiply all coordinates by this scalar.

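$$
\begin{aligned}
\alpha\,a &= A, & \alpha\,a &= A, & \alpha\,a &= A, \\
\alpha\,\underline{a} &= \underline{B}, & \alpha\,a_i &= B_i, & \alpha\,\underset{\sim}{a} &= \underset{\sim}{B}, \\
\alpha\,\underline{\underline{a}} &= \underline{\underline{C}}, & \alpha\,a_{ij} &= C_{ij}, & \alpha\,\underset{\sim}{a} &= \underset{\sim}{C}, \\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.21}
$$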

2. For completeness, we still need to convince ourselves that the quantities A, Bi , Cij
defined in this way are tensor coordinates, i. e. that they satisfy the corresponding
transformation law, provided that the quantities on the left side of these equations
are tensor coordinates. This is easy to see, e. g. for polar tensors we have

$$
C_{ij} = a_{ij} \pm b_{ij} = \alpha_{im}\,\alpha_{jn}\,(\tilde a_{mn} \pm \tilde b_{mn}) = \alpha_{im}\,\alpha_{jn}\,\tilde C_{mn} .
$$

So we can only equate, add, or subtract tensors, which obey the same transformation
law, and hence have the same order and are either both polar or both axial.
3. We call Q tensors $\overset{1}{A}, \overset{2}{A}, \ldots, \overset{Q}{A}$ which have the same order and are either all polar
or all axial, linearly independent, if the equation

$$
\alpha_i\,\overset{i}{A} = 0
$$


is only satisfied for αi = 0; otherwise these tensors are called linearly dependent. More
than three vectors, more than nine second-order tensors, or in general more than 3^N
tensors of order N are clearly always linearly dependent.

2.6 Transposed, Isometric, Symmetric, and Antimetric Tensors

1. Here it is useful to remember that the coordinates of a second-order tensor form a
square matrix. So we write

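$$
a_{ij} :=
\begin{pmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{pmatrix}
=
\begin{pmatrix}
a_{xx} & a_{xy} & a_{xz} \\
a_{yx} & a_{yy} & a_{yz} \\
a_{zx} & a_{zy} & a_{zz}
\end{pmatrix} .
\tag{2.22}
$$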

2. Two second-order tensors whose coordinates differ only by the order of the indices
are called transposed tensors. Clearly, their coordinate matrices are transposed ma-
trices. Similar to transposed matrices we also denote a transposed tensor by the same
letter with a T as superscript.

$$
a^{T}_{ij} = a_{ji} . \tag{2.23}
$$

3. Tensors whose coordinates differ only in the order of indices are called isomers. The
number of isomers of a tensor increases quickly with its order: A second-order tensor
has two isomers, namely the tensor itself and the transposed tensor. A tensor of order
3 has already six isomers, namely Aijk , Ajki , Akij , Akji , Ajik , and Aikj . Evidently tensors
of order n have n! isomers, and we usually do not have a symbolic notation for them.

4. A second-order tensor which is equal to its transposed tensor is called symmetric;

for such a tensor we have

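$$
\underline{\underline{a}} = \underline{\underline{a}}^{T}, \qquad a_{ij} = a_{ji}, \qquad \underset{\sim}{a} = \underset{\sim}{a}^{T} . \tag{2.24}
$$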

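5. A second-order tensor which is equal to the negative of its transposed tensor is called
antimetric, antisymmetric, skew-symmetric, or alternating; for such a tensor we have

$$
\underline{\underline{a}} = -\underline{\underline{a}}^{T}, \qquad a_{ij} = -a_{ji}, \qquad \underset{\sim}{a} = -\underset{\sim}{a}^{T} . \tag{2.25}
$$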

6. We can uniquely decompose a second-order tensor using the formula

$$
a_{ij} = \underbrace{\tfrac{1}{2}\,(a_{ij} + a_{ji})}_{a_{(ij)}} + \underbrace{\tfrac{1}{2}\,(a_{ij} - a_{ji})}_{a_{[ij]}} \tag{2.26}
$$


into a symmetric part a(ij) and an antimetric part a[ij] , which is easy to see if we expand
the formula. We analogously say that a higher-order tensor is symmetric or antimetric
with respect to an index pair; e. g. we have6 $a_{[ijk]l} = \tfrac{1}{2}\,(a_{ijkl} - a_{kjil})$.

7. Finally, for completeness, we need to show that the transposed coordinate matrices
of a tensor in different coordinate systems are again the coordinates of a tensor. This
is easy to see using the transformation law.
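Let $a_{ij}$ and $\tilde a_{ij}$ be the coordinates of a second-order polar tensor in two different
coordinate systems, i. e. we have

$$
a_{ij} = \alpha_{im}\,\alpha_{jn}\,\tilde a_{mn},
$$

and let $\tilde a^{T}_{nm} = \tilde a_{mn}$ be the transposed coordinate matrix in the tilde system. This gives

$$
a^{T}_{ij} = a_{ji} = \alpha_{jm}\,\alpha_{in}\,\tilde a_{mn} = \alpha_{in}\,\alpha_{jm}\,\tilde a^{T}_{nm}, \qquad \text{Q. E. D.}^{7}
$$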

Thus, symmetry or antimetry of a tensor is a property of the tensor itself and not just
of its coordinate matrix in a particular coordinate system.
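
The decomposition (2.26) can be sketched in a few lines of Python. The coordinate values below are hypothetical (deliberately not those of Problem 2.2), and the function names are assumptions made for this example.

```python
def transpose(a):
    """Coordinates of the transposed tensor: (a^T)_ij = a_ji."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def symmetric_part(a):
    """a_(ij) = (a_ij + a_ji) / 2."""
    return [[(a[i][j] + a[j][i]) / 2 for j in range(3)] for i in range(3)]

def antimetric_part(a):
    """a_[ij] = (a_ij - a_ji) / 2."""
    return [[(a[i][j] - a[j][i]) / 2 for j in range(3)] for i in range(3)]

# Hypothetical coordinates of a second-order tensor.
a = [[2.0, 4.0, 0.0],
     [0.0, 6.0, 8.0],
     [2.0, 0.0, 1.0]]

s = symmetric_part(a)
w = antimetric_part(a)

# The two parts add up to the original coordinates.
total = [[s[i][j] + w[i][j] for j in range(3)] for i in range(3)]
print(s, w, total == a)
```

Here s equals its transpose, w equals the negative of its transpose, and the diagonal of w vanishes, as an antimetric tensor requires.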

Problem 2.2.
A tensor (in a given Cartesian coordinate system) has the coordinates

$$
a_{ij} =
\begin{pmatrix}
1 & 0 & 2 \\
4 & -3 & 6 \\
2 & -4 & 5
\end{pmatrix} .
$$
A. Write out the coordinates of the transposed tensor.
B. Decompose the tensor into its symmetric and its antimetric part and check your
result, i. e. show that the sum of both parts gives the original tensor.

6 Some authors denote by a[ijk]l the average of all possible permutations of the indices i, j, k, where
an odd number of index exchanges results in a minus sign, i. e.
$a_{[ijk]l} = \frac{1}{3!}\,(a_{ijkl} + a_{jkil} + a_{kijl} - a_{kjil} - a_{jikl} - a_{ikjl})$.

7 Quod erat demonstrandum, meaning: what was to be demonstrated.


2.7 Tensor Multiplication of Tensors

2.7.1 Definition

We introduce three different products of tensors: the tensor product,8 the scalar prod-
uct,9 and the vector product,10 and we again define them by specifying the arithmetic
operations that need to be performed between the coordinates of the two factors.
The tensor or dyadic product is a generalization of the usual product of two
scalars, so we denote it similarly by writing the two factors next to each other without
a symbol between them.11 It is defined for tensors of arbitrary order.

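$$
\begin{aligned}
a\,b &= A, & a\,b &= A, & a\,b &= A, \\
a\,\underline{b} &= \underline{B}, & a\,b_i &= B_i, & a\,\underset{\sim}{b} &= \underset{\sim}{B}, \\
\underline{a}\,b &= \underline{C}, & a_i\,b &= C_i, & \underset{\sim}{a}\,b &= \underset{\sim}{C}, \\
\underline{a}\,\underline{b} &= \underline{\underline{D}}, & a_i\,b_j &= D_{ij}, & \underset{\sim}{a}\,\underset{\sim}{b}^{T} &= \underset{\sim}{D}, \\
a\,\underline{\underline{b}} &= \underline{\underline{E}}, & a\,b_{ij} &= E_{ij}, & a\,\underset{\sim}{b} &= \underset{\sim}{E}, \\
\underline{\underline{a}}\,b &= \underline{\underline{F}}, & a_{ij}\,b &= F_{ij}, & \underset{\sim}{a}\,b &= \underset{\sim}{F}, \\
\underline{a}\,\underline{\underline{b}} &= \underline{\underline{\underline{G}}}, & a_i\,b_{jk} &= G_{ijk}, \\
\underline{\underline{a}}\,\underline{b} &= \underline{\underline{\underline{H}}}, & a_{ij}\,b_k &= H_{ijk}, \\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.27}
$$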

The name tensor product indicates that the tensor product of two vectors is a second-
order tensor.

Problem 2.3.
Two vectors have (in a given coordinate system) the coordinates ai = (2, 3, 4),
bi = (−2, 1, 5). Compute the coordinates of the tensor products a b and b a.

8 Frequently also called the dyadic product.

9 Frequently also called the dot product.
10 Frequently also called the cross product.
11 Some authors use ⊗ to denote the tensor product; we introduce ⊗ in Section 2.10.1 with another


2.7.2 Properties

1. First, we need to convince ourselves that the result of the so-defined product be-
tween tensor coordinates consists again of tensor coordinates. We do this by means of
the equation ai bj = Dij . Writing this equation in two different Cartesian coordinate
systems, we have

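$$
a_i\,b_j = D_{ij} \quad\text{and}\quad \tilde a_i\,\tilde b_j = \tilde D_{ij},
$$

and if, for example, ai and bj are the coordinates of two polar vectors, we have

$$
a_i = \alpha_{im}\,\tilde a_m \quad\text{and}\quad b_j = \alpha_{jn}\,\tilde b_n .
$$

This gives

$$
D_{ij} = a_i\,b_j = \alpha_{im}\,\tilde a_m\,\alpha_{jn}\,\tilde b_n = \alpha_{im}\,\alpha_{jn}\,\tilde a_m\,\tilde b_n = \alpha_{im}\,\alpha_{jn}\,\tilde D_{mn},
$$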

i. e. the Dij are according to (2.15) the coordinates of a second-order polar tensor.
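
This argument can be checked numerically. The following Python sketch forms the tensor product once in the tilde system and once in the other system and compares the result with the second-order transformation law. The orthogonal matrix alpha, the vector coordinates, and all function names are assumptions made for this illustration.

```python
from fractions import Fraction as F

R = range(3)

# An exactly orthogonal matrix (assumed for illustration).
alpha = [[F(2, 3), F(-1, 3), F(2, 3)],
         [F(2, 3), F(2, 3), F(-1, 3)],
         [F(-1, 3), F(2, 3), F(2, 3)]]

def vec_from_tilde(vt):
    """a_i = alpha_im a~_m."""
    return [sum(alpha[i][m] * vt[m] for m in R) for i in R]

def dyad(a, b):
    """D_ij = a_i b_j."""
    return [[a[i] * b[j] for j in R] for i in R]

# Hypothetical coordinates of two polar vectors in the tilde system.
a_t = [F(1), F(2), F(3)]
b_t = [F(-1), F(0), F(4)]

D_t = dyad(a_t, b_t)                                  # product in the tilde system
D = dyad(vec_from_tilde(a_t), vec_from_tilde(b_t))    # product in the other system

# Second-order law applied to the tilde coordinates of the product.
D_law = [[sum(alpha[i][m] * alpha[j][n] * D_t[m][n] for m in R for n in R)
          for j in R] for i in R]
print(D == D_law)
```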

2. Clearly, the tensor product of two polar tensors or of two axial tensors gives a polar
tensor, and the tensor product of two tensors where one is polar and the other is axial
gives an axial tensor.

3. If we compare the equations in (2.27) in the two notations, we realize that for the
translation the order of the indices must be the same on the left side and on the right
side of the equation. So sometimes we may have to change the order of the factors or
introduce isomers. For example in

aj bi ck + dk eji = fijk ,

we must first rewrite the two terms on the left side, i. e.

bi aj ck + eijT dk = fijk ,

and then we obtain the translation

b a c + eT d = f .

If we have to generate isomers of tensors with order higher than 2, then we would
need additional rules to be able to translate the equation into symbolic notation. An
example is the following equation:

aik bj = cijk .


Problem 2.4.
Translate, as far as possible, into the other two notations:
A. a b c = d, B. ai bkl cj = dijkl , C. αi bkj = Aijk , D. aTmn = bnm .

4. Tensor coordinates are real numbers; so all operations in index notation must sat-
isfy the rules for calculations with real numbers, e. g. the commutative law and the
associative law for multiplication must hold. We need to check for each law if it is also
satisfied in symbolic notation.
Clearly addition and tensor multiplication are distributive also in symbolic nota-
tion. We have, for example, in index notation ai bj + ai cj = ai (bj + cj ), and translated
a b + a c = a(b + c).
We can similarly prove the associative law, e. g. we have (ai bj ) ck = ai (bj ck ), and
translated (a b) c = a(b c).
However, the commutative law does not hold in general. It only holds if one of
the factors is a scalar: a bij = bij a gives a b = b a, but ai bj = bj ai cannot be
translated directly because of the different order of the free indices on the two sides of
the equation. It is easy to see that in general a b ≠ b a. We set a b =: A and b a =: B.
Then we have

ai bj = Aij and bm an = Bmn .

Since the equation on the right side is commutative in index notation, we can also
write an bm = Bmn and then rename the indices for easier comparison with the equa-
tion on the left side, i. e. we replace n by i and m by j. This gives ai bj = Bji , so we
have Aij = Bji , the tensors A and B are transposed, i. e. we have a b = (b a)T , and thus
a b ≠ b a in general.
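
A small numerical sketch of this result, with hypothetical vector coordinates (not those of Problem 2.3) and assumed function names:

```python
def dyad(a, b):
    """Coordinates of the tensor product: (a b)_ij = a_i b_j."""
    return [[a[i] * b[j] for j in range(3)] for i in range(3)]

def transpose(t):
    """(t^T)_ij = t_ji."""
    return [[t[j][i] for j in range(3)] for i in range(3)]

a = [1, 0, 2]
b = [3, -1, 0]

ab = dyad(a, b)   # A_ij = a_i b_j
ba = dyad(b, a)   # B_ij = b_i a_j

print(ab == transpose(ba), ab == ba)
```

For these values the first comparison holds and the second does not, in line with a b = (b a)T.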
5. A tensor product is zero if and only if at least one factor is zero. Consider, for example,
the tensor product C = A B and assume that exactly one coordinate of each of the two
factors is nonzero in the coordinate system under consideration. Then exactly one
coordinate of the product is nonzero as well. If all the coordinates of one factor are zero,
then all the coordinates of the product are zero. Conversely, if all the coordinates of the
product are zero, then all the coordinates of at least one factor are zero.
6. We can factor out a factor which is common to several tensor products, if the factor
is nonzero. For example, from a b = a c we first get a(b − c) = 0. If a ≠ 0, it then
follows from the last paragraph that b − c = 0, which gives b = c.
7. We can now prove the so-called quotient rule: if the tensor product of two factors,
where one of the factors is a nonzero tensor of order m, is a tensor of order (m+n), then
the other factor is a tensor of order n. It is a polar tensor if the two tensors are either
both polar or both axial, and it is axial if of the two given tensors one is polar and the
other is axial. Consider, for example, the relation

$$
a_i\,b_j = D_{ij} \quad\text{and}\quad \tilde a_i\,\tilde b_j = \tilde D_{ij}
$$


in two Cartesian coordinate systems and let ai be a nonzero polar vector and Dij a
second-order polar tensor, i. e.

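$$
a_i \neq 0, \qquad a_i = \alpha_{im}\,\tilde a_m, \qquad D_{ij} = \alpha_{im}\,\alpha_{jn}\,\tilde D_{mn} .
$$

Then we should get consequently

$$
b_i = \alpha_{im}\,\tilde b_m .
$$

We prove this as follows:

$$
a_i\,b_j = D_{ij} = \alpha_{im}\,\alpha_{jn}\,\tilde D_{mn} = \alpha_{im}\,\alpha_{jn}\,\tilde a_m\,\tilde b_n = a_i\,\alpha_{jn}\,\tilde b_n .
$$

Since $a_i \neq 0$, we can cancel out $a_i$, and by this we get $b_j = \alpha_{jn}\,\tilde b_n$, which is the required
statement, up to the names of the indices.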
The quotient rule is obviously the converse of the theorem that the tensor product
of a tensor of order m and a tensor of order n is a tensor of order (m + n).

8. Let us summarize the properties of tensor products that we proved above.

– The tensor product of a tensor of order m and a tensor of order n is a tensor of

order (m + n).
– If the tensor product of two factors, of which one is a nonzero tensor of order m,
is a tensor of order (m + n), then the other factor is a tensor of order n (quotient
rule).
– The three tensors of a tensor multiplication (i. e. the two factors and the prod-
uct) are either all polar, or two of them are axial and the third is polar.
– For translating into the symbolic notation, the index order in a tensor product
must match on the left and on the right side of an equation.
– Addition and tensor multiplication are distributive also in symbolic notation.
– Tensor multiplication is associative also in symbolic notation.
– Tensor multiplication is in symbolic notation in general only commutative, if
one factor is a scalar.
– A tensor product is zero if and only if at least one factor is zero.
– A nonzero common factor in an equation involving tensor products can be can-
celed out.


2.7.3 Tensors, Tensor Components, and Tensor Coordinates

1. We can now use tensor products to associate second- and higher-order tensors with
their coordinates in a Cartesian coordinate system, analogously to (2.10). If we replace
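the left side of the equation a b = c using the representation (2.10), we obtain with ai bj = cij

$$
\underline{a}\,\underline{b} = a_i\,\underline{e}_i\,b_j\,\underline{e}_j = a_i\,b_j\,\underline{e}_i\,\underline{e}_j = c_{ij}\,\underline{e}_i\,\underline{e}_j = \underline{\underline{c}},
$$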

i. e.

c = cij ei ej .

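Here ei ej is a tensor product of two basis vectors. Since i and j take the values 1 to 3, we
can write ei ej as a square matrix with three rows, whose elements are second-order
tensors:

$$
\underline{e}_i\,\underline{e}_j =
\begin{pmatrix}
\underline{e}_1\underline{e}_1 & \underline{e}_1\underline{e}_2 & \underline{e}_1\underline{e}_3 \\
\underline{e}_2\underline{e}_1 & \underline{e}_2\underline{e}_2 & \underline{e}_2\underline{e}_3 \\
\underline{e}_3\underline{e}_1 & \underline{e}_3\underline{e}_2 & \underline{e}_3\underline{e}_3
\end{pmatrix} .
$$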

Generalizing (2.10) gives the following relation between a tensor of arbitrary order and
its coordinates with respect to the basis ei of a coordinate system.

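$$
\begin{aligned}
\underline{a} &= a_i\,\underline{e}_i, \\
\underline{\underline{a}} &= a_{ij}\,\underline{e}_i\,\underline{e}_j, \\
\underline{\underline{\underline{a}}} &= a_{ijk}\,\underline{e}_i\,\underline{e}_j\,\underline{e}_k, \\
&\text{etc.}
\end{aligned}
\tag{2.28}
$$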

2. Just as any vector a can be written as a linear combination of the three basis vec-
tors ei with the coordinates ai as coefficients, we can also represent any second-order
tensor a as a linear combination of the nine basis tensors ei ej with the coordinates aij
as coefficients, and the same holds for tensors of higher order. Similarly as for vectors,
we also distinguish for tensors of arbitrary order between their components and coor-
dinates; for example, we call the tensor axy ex ey the x, y-component of a in the given
coordinate system, and we call the quantity axy its x, y-coordinate. The components
of a tensor are also tensors, whereas the coordinates are not.

3. We can now generalize our definition of a vector from Section 2.2.2 to tensors of
arbitrary order.


A tensor is a quantity which, according to (2.28), can be represented in a Cartesian
coordinate system and whose coordinates transform, according to (2.17) or (2.18),
if the Cartesian coordinate system is changed. If the transformation law (2.17) applies,
the tensor is called polar, and if the transformation law (2.18) applies, the
tensor is called axial.

2.7.4 Tensor Equations, Transformation Equations, and Representation Equations

At this point it is useful to distinguish three different types of equations with tensors,
namely tensor equations, transformation equations, and representation equations:
– Tensor equations relate different tensors (tensor equations in symbolic notation)
or coordinates of different tensors in the same coordinate system (tensor equa-
tions in index notation), e. g. a b = c or aij bk = cijk .
– Transformation equations relate the coordinates of the same tensor in different
coordinate systems, e. g. $a_i = \alpha_{ij}\,\tilde a_j$.
– Representation equations relate a tensor itself and its coordinates in a given coor-
dinate system, e. g. a = aij ei ej .

The summation convention applies to all three types of equations.

If tensors describe physical quantities, only tensor equations can be used to rep-
resent physical facts, because only tensor equations connect different tensors. In con-
trast, transformation equations and representation equations are mathematical equa-
tions without physical content.

2.8 δ-Tensor, ε-Tensor, Isotropic Tensors

2.8.1 The δ-Tensor

Clearly, αim αjn δmn = αim αjm = δij , i. e. the δij satisfy the transformation law (2.17)
for the coordinates of a second-order polar tensor. Thus we can interpret the δij as the
Cartesian coordinates of a second-order polar tensor

δ = δij ei ej (2.29)

which is called the δ-tensor, identity tensor, metric tensor, or fundamental tensor.
Clearly, it has the property that its coordinates are the same in every Cartesian co-
ordinate system, namely they are equal to the δij defined in (1.15).


2.8.2 The ε-Tensor

1. Now we want to show that the εijk satisfy the transformation law (2.18) for the coor-
dinates of a third-order axial tensor, i. e. that they are the Cartesian coordinates of an
axial tensor

ε = εijk ei ej ek . (2.30)

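Since the εijk are defined by (1.33), independently of a coordinate system, they must
satisfy the equation $\varepsilon_{ijk} = (\det\underset{\sim}{\alpha})\,\alpha_{mi}\,\alpha_{nj}\,\alpha_{pk}\,\varepsilon_{mnp}$. Let

$$
\tilde\varepsilon_{ijk} = (\det\underset{\sim}{\alpha})\,\alpha_{mi}\,\alpha_{nj}\,\alpha_{pk}\,\varepsilon_{mnp},
$$

and then we have to show, using the properties of the εijk , that $\tilde\varepsilon_{ijk} = \varepsilon_{ijk}$.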
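Only six of the 27 terms in the triple sum on the right side are nonzero, according
to (1.33):

$$
\tilde\varepsilon_{ijk} = (\det\underset{\sim}{\alpha})\,\bigl[\alpha_{1i}\,\alpha_{2j}\,\alpha_{3k} + \alpha_{2i}\,\alpha_{3j}\,\alpha_{1k} + \alpha_{3i}\,\alpha_{1j}\,\alpha_{2k}
- \alpha_{3i}\,\alpha_{2j}\,\alpha_{1k} - \alpha_{1i}\,\alpha_{3j}\,\alpha_{2k} - \alpha_{2i}\,\alpha_{1j}\,\alpha_{3k}\bigr] .
$$

The bracket can clearly be written as a determinant, i. e.

$$
\tilde\varepsilon_{ijk} = (\det\underset{\sim}{\alpha})
\begin{vmatrix}
\alpha_{1i} & \alpha_{1j} & \alpha_{1k} \\
\alpha_{2i} & \alpha_{2j} & \alpha_{2k} \\
\alpha_{3i} & \alpha_{3j} & \alpha_{3k}
\end{vmatrix} .
$$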

Since changing the order of two indices in $\tilde\varepsilon_{ijk}$ exchanges two columns and changes
the sign of the determinant, it follows that $\tilde\varepsilon_{ijk}$, like $\varepsilon_{ijk}$, is antimetric with respect to
all index pairs; i. e. analogously to (1.34) we have

$$
\tilde\varepsilon_{ijk} = \tilde\varepsilon_{jki} = \tilde\varepsilon_{kij} = -\tilde\varepsilon_{kji} = -\tilde\varepsilon_{jik} = -\tilde\varepsilon_{ikj} .
$$

If two indices have the same value, then two columns of the determinant are equal,
i. e. the determinant is zero; so just like $\varepsilon_{ijk}$, $\tilde\varepsilon_{ijk}$ is also zero, if at least two indices are
equal. Finally, it remains to show that $\tilde\varepsilon_{123} = 1$, and indeed for i = 1, j = 2, k = 3 we
have

$$
\tilde\varepsilon_{123} = (\det\underset{\sim}{\alpha})^{2} = 1 .
$$

So we have $\tilde\varepsilon_{ijk} = \varepsilon_{ijk}$, which completes the proof.
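
The proof can be accompanied by a brute-force check. The following Python sketch evaluates (det α) αmi αnj αpk εmnp for a proper rotation and for a reflection and compares the result with εijk . The two orthogonal matrices, the zero-based index convention, and the function names are assumptions made for this illustration.

```python
from fractions import Fraction as F
from itertools import product

R = range(3)

def eps(i, j, k):
    """Permutation symbol for zero-based indices: +1, -1, or 0."""
    return (i - j) * (j - k) * (k - i) // 2

def det3(m):
    """Determinant written as eps_ijk m_1i m_2j m_3k."""
    return sum(eps(i, j, k) * m[0][i] * m[1][j] * m[2][k]
               for i, j, k in product(R, repeat=3))

def eps_tilde(alpha):
    """(det alpha) alpha_mi alpha_nj alpha_pk eps_mnp for all i, j, k."""
    d = det3(alpha)
    return {(i, j, k): d * sum(eps(m, n, p) * alpha[m][i] * alpha[n][j] * alpha[p][k]
                               for m, n, p in product(R, repeat=3))
            for i, j, k in product(R, repeat=3)}

# Assumed example matrices: a proper rotation and the corresponding reflection.
rotation = [[F(2, 3), F(-1, 3), F(2, 3)],
            [F(2, 3), F(2, 3), F(-1, 3)],
            [F(-1, 3), F(2, 3), F(2, 3)]]
reflection = [[-x for x in row] for row in rotation]

eps_ref = {(i, j, k): eps(i, j, k) for i, j, k in product(R, repeat=3)}
print(eps_tilde(rotation) == eps_ref, eps_tilde(reflection) == eps_ref)
```

Without the factor det α the reflection would change the sign of every coordinate, which is why the εijk are the coordinates of an axial and not of a polar tensor.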
2. Some authors use, instead of the ε-tensor, a polar tensor whose coordinates coin-
cide in a right-handed system with the coordinates of the ε-tensor, i. e. with the εijk
from (1.33). To distinguish the two tensors we call this tensor E. Then we have

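$$
\underline{\underline{\underline{E}}} = E_{ijk}\,\underline{e}_i\,\underline{e}_j\,\underline{e}_k, \qquad
E_{ijk} =
\begin{cases}
\varepsilon_{ijk} & \text{in right-handed systems,} \\
-\varepsilon_{ijk} & \text{in left-handed systems.}
\end{cases}
\tag{2.31}
$$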
We will not use the tensor E.


2.8.3 Isotropic Tensors

Tensors whose coordinates do not change by a transformation to another Cartesian

coordinate system, such as the δ-tensor and the ε-tensor, are called isotropic tensors.
Clearly, every polar scalar is isotropic, and there are no isotropic axial scalars,
other than zero, nor are there any (polar or axial) isotropic vectors, other than the
zero vector. It can be shown that the most general isotropic tensors of order 2 and 3
have the coordinates A δij and A εijk , respectively, where A is an arbitrary polar scalar.
It can further be shown that the most general isotropic tensors of higher order
are sums of tensor products of δ-tensors and ε-tensors. Therefore the most general
combination of two δ-tensors constitutes the most general isotropic tensor of order 4;
we can easily convince ourselves that it has the form

aijkl = A δij δkl + B δik δjl + C δil δjk , (2.32)

where A, B, and C are arbitrary polar scalars.12
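
That the form (2.32) is indeed isotropic can be checked by brute force. The Python sketch below builds aijkl from three assumed values of A, B, and C and applies the fourth-order polar transformation law with an assumed orthogonal matrix; the function names are likewise hypothetical. The check only illustrates that (2.32) is isotropic; it does not show that (2.32) is the most general such tensor.

```python
from fractions import Fraction as F
from itertools import product

R = range(3)

# An exactly orthogonal matrix (assumed for illustration).
alpha = [[F(2, 3), F(-1, 3), F(2, 3)],
         [F(2, 3), F(2, 3), F(-1, 3)],
         [F(-1, 3), F(2, 3), F(2, 3)]]

def delta(i, j):
    return 1 if i == j else 0

def iso4(A, B, C):
    """a_ijkl = A d_ij d_kl + B d_ik d_jl + C d_il d_jk."""
    return {(i, j, k, l): A * delta(i, j) * delta(k, l)
                          + B * delta(i, k) * delta(j, l)
                          + C * delta(i, l) * delta(j, k)
            for i, j, k, l in product(R, repeat=4)}

def transform4(a):
    """Result_ijkl = alpha_im alpha_jn alpha_kp alpha_lq a_mnpq."""
    return {(i, j, k, l): sum(alpha[i][m] * alpha[j][n] * alpha[k][p] * alpha[l][q]
                              * a[m, n, p, q]
                              for m, n, p, q in product(R, repeat=4))
            for i, j, k, l in product(R, repeat=4)}

a_iso = iso4(2, -3, 5)            # arbitrary illustrative scalars
a_iso_transformed = transform4(a_iso)
print(a_iso_transformed == a_iso)
```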

2.9 Scalar Multiplication of Tensors

2.9.1 Definition

The scalar product or inner product or dot product of two tensors is denoted by a dot
between the two factors. It is obtained from the corresponding tensor product by set-
ting equal the two indices adjacent to the gap between the two factors, i. e. by summing
over these indices.
It is therefore defined for two tensors of order at least one.

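$$
\begin{aligned}
\underline{a} \cdot \underline{b} &= A, & a_i\,b_i &= A, & \underset{\sim}{a}^{T}\,\underset{\sim}{b} &= A, \\
\underline{a} \cdot \underline{\underline{b}} &= \underline{B}, & a_i\,b_{ij} &= B_j, & \underset{\sim}{a}^{T}\,\underset{\sim}{b} &= \underset{\sim}{B}^{T}, \\
\underline{\underline{a}} \cdot \underline{b} &= \underline{C}, & a_{ij}\,b_j &= C_i, & \underset{\sim}{a}\,\underset{\sim}{b} &= \underset{\sim}{C}, \\
\underline{\underline{a}} \cdot \underline{\underline{b}} &= \underline{\underline{D}}, & a_{ij}\,b_{jk} &= D_{ik}, & \underset{\sim}{a}\,\underset{\sim}{b} &= \underset{\sim}{D}, \\
\underline{\underline{a}} \cdot \underline{\underline{\underline{b}}} &= \underline{\underline{\underline{E}}}, & a_{ij}\,b_{jkl} &= E_{ikl}, \\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.33}
$$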

Its name stems from the fact that the scalar product of two vectors is a scalar.
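
The first four rows of (2.33) can be written out as explicit sums. The following Python sketch is only an illustration; the function names and the sample coordinates are assumptions.

```python
R = range(3)

def dot_vv(a, b):
    """A = a_i b_i."""
    return sum(a[i] * b[i] for i in R)

def dot_vt(a, b):
    """B_j = a_i b_ij."""
    return [sum(a[i] * b[i][j] for i in R) for j in R]

def dot_tv(a, b):
    """C_i = a_ij b_j."""
    return [sum(a[i][j] * b[j] for j in R) for i in R]

def dot_tt(a, b):
    """D_ik = a_ij b_jk."""
    return [[sum(a[i][j] * b[j][k] for j in R) for k in R] for i in R]

# Hypothetical coordinates.
u = [1, 2, 3]
v = [4, 0, -1]
t = [[1, 2, 0],
     [0, 1, 3],
     [2, 0, 1]]

print(dot_vv(u, v))   # scalar
print(dot_vt(u, t))   # vector
print(dot_tv(t, u))   # vector
print(dot_tt(t, t))   # second-order tensor
```

In each case the summation runs over the two indices adjacent to the dot, so the order of the result is the sum of the orders of the factors minus two.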

12 Equation (2.32) is another example of an equation that (without additional conventions) cannot be
translated into symbolic notation.


2.9.2 Properties

1. First, we need to convince ourselves that the product of tensor coordinates defined
in this way are again tensor coordinates. We do this using the example of the equation
ai bij = Bj for polar tensors. Writing this equation in two different Cartesian coordinate
systems, we get

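$$
a_i\,b_{ij} = B_j \quad\text{and}\quad \tilde a_i\,\tilde b_{ij} = \tilde B_j,
$$

and since the ai are the coordinates of a polar vector and the bij are the coordinates of
a polar tensor, we have

$$
a_i = \alpha_{ik}\,\tilde a_k \quad\text{and}\quad b_{ij} = \alpha_{im}\,\alpha_{jn}\,\tilde b_{mn} .
$$

This gives

$$
B_j = a_i\,b_{ij} = \alpha_{ik}\,\tilde a_k\,\alpha_{im}\,\alpha_{jn}\,\tilde b_{mn} = \delta_{km}\,\alpha_{jn}\,\tilde a_k\,\tilde b_{mn} = \alpha_{jn}\,\tilde a_m\,\tilde b_{mn} = \alpha_{jn}\,\tilde B_n,
$$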

thus the Bj are the coordinates of a polar vector.

2. Clearly, the scalar product of two polar tensors or two axial tensors gives a polar
tensor and the scalar product of a polar tensor and an axial tensor gives an axial tensor.

3. Comparing the equations from (2.33) in the two notations shows that for the trans-
lation the order of the free indices must be equal on both sides of an equation. Fur-
thermore, to be able to write the summation symbolically as a scalar product, the two
summation indices must be adjacent to the gap between both tensors.

Problem 2.5.
Translate, as far as possible, into the other two notations:
A. a ⋅ b = c, B. b ⋅ a = c, C. aik bij = cjk , D. ai bk ai ck = d,

E. aik bijk = cj , F. (a ⋅ b)T = c, G. (a ⋅ b ⋅ c)T = d.

Hint: Also in F and G (where transposed tensors in symbolic notation appear) only
translate but do not transpose, i. e. the ‘T’ for transposition remains in the equation
after translation. (In order to translate an equation from index notation we may have
to first perform some transpositions; we never have to do this when we translate from
symbolic notation.)

4. Addition and scalar multiplication are distributive also in symbolic notation, e. g.

ai bij + ai cij = ai (bij + cij ) gives in symbolic notation

a ⋅ b + a ⋅ c = a ⋅ (b + c).


5. Clearly a ⋅ b = b ⋅ a; if both factors are vectors, the commutative law also holds in
symbolic notation. If one of the factors is of second order or higher, the commutative
law does not hold in general, e. g. a ⋅ b is not equal to b ⋅ a. To show this, we assume
ai bij = bij ai . To be able to translate this equation, we need to transpose bij on the
right side. Then we have ai bij = bTji ai , which gives a ⋅ b = bT ⋅ a, but in general bT ⋅ a
is not equal to b ⋅ a.
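
A short numerical sketch of this statement, with assumed function names and hypothetical coordinates for the vector a and the second-order tensor b:

```python
R = range(3)

def vec_dot_tensor(a, b):
    """(a . b)_j = a_i b_ij."""
    return [sum(a[i] * b[i][j] for i in R) for j in R]

def tensor_dot_vec(b, a):
    """(b . a)_i = b_ij a_j."""
    return [sum(b[i][j] * a[j] for j in R) for i in R]

def transpose(b):
    """(b^T)_ij = b_ji."""
    return [[b[j][i] for j in R] for i in R]

a = [1, 2, 3]
b = [[1, 2, 0],
     [0, 1, 3],
     [2, 0, 1]]

left = vec_dot_tensor(a, b)               # a . b
right = tensor_dot_vec(transpose(b), a)   # b^T . a
swapped = tensor_dot_vec(b, a)            # b . a

print(left == right, left == swapped)
```

The two sides would agree for every vector a only if b were symmetric, i. e. equal to its transposed tensor.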

6. The associative law holds in symbolic notation for an arbitrary sequence of tensor and
scalar multiplications, if and only if in index notation no pair of summation indices is
separated by other indices. The equation (aij bj ) ck = aij (bj ck ) can be translated into
(a ⋅ b) c = a ⋅ (b c), but the equation (aij bj ) ci = aij (bj ci ) cannot be translated into
(a ⋅ b) ⋅ c = a ⋅ (b ⋅ c). In symbolic notation the associative law holds, if for each tensor
the number of dots adjacent to it is less than or equal to its order. This is true for a ⋅ b c but
not for a ⋅ b ⋅ c. (In this form, the statement is also true for the multiple scalar products
defined below.)

Problem 2.6.
For which sequences of the factors on the left side can you translate the equation
ai bk ai cj = djk into symbolic notation? For each case, check if the associative law
holds for all five possible associations.

7. Next, we ask ourselves in which cases we can cancel a common factor in scalar products.
We start with the equation a ⋅ A = b ⋅ A, which we can also write as (a − b) ⋅ A = 0
or (ai − bi ) Ai = 0 .
Clearly, we cannot conclude from a zero scalar product of two vectors that one
of its factors is zero. Even if $A_i \neq 0$, ai − bi does not need to be zero; it can also
be perpendicular to Ai . However, if the equation above is satisfied for three linearly
independent vectors $\overset{1}{A}_i$, $\overset{2}{A}_i$, and $\overset{3}{A}_i$, then certainly ai − bi = 0, because no nonzero vector can
be perpendicular to three linearly independent vectors simultaneously. In particular
it follows from (ai − bi ) ei = 0 or ai ei = bi ei that ai = bi ; we have already used
this uniqueness of the decomposition of a vector with respect to a basis several times.
Thus, since every vector can be represented as a linear combination of three Cartesian
basis vectors, we can alternatively require that the equation above be satisfied for any
vector A.
To generalize this proof to the equation ai...jk Ak = bi...jk Ak , we start by writing the
proof for ai Ai = bi Ai more formally. The three equations

(ai − bi ) Ai = 0 or (a1 − b1 ) A1 + (a2 − b2 ) A2 + (a3 − b3 ) A3 = 0,

(ai − bi ) Bi = 0 or (a1 − b1 ) B1 + (a2 − b2 ) B2 + (a3 − b3 ) B3 = 0,
(ai − bi ) Ci = 0 or (a1 − b1 ) C1 + (a2 − b2 ) C2 + (a3 − b3 ) C3 = 0,


with known Ai , Bi , and Ci , form a system of homogeneous linear equations for the
ai − bi . This system has only the trivial solution ai − bi = 0, if and only if the three
vectors A, B, and C are linearly independent.
This proof generalizes to the next higher-order tensors in the equation a ⋅ A = b ⋅ A
or (aij − bij ) Aj = 0 by using the same argument. The three equations

(aij − bij ) Aj = 0,
(aij − bij ) Bj = 0,
(aij − bij ) Cj = 0,

with known Aj , Bj , and Cj , form a system of homogeneous linear equations for ai1 −
bi1 , ai2 − bi2 , and ai3 − bi3 , for each value of the free index i. These three equations
have only the trivial solution, if and only if the three vectors A, B, and C are linearly
independent.
The proof for the equation ai...jk Ak = bi...jk Ak is analogous. A vector which is
a common factor in an equation with several scalar products can be cancelled out,
if and only if arbitrary values can be substituted for this vector (e. g. three linearly
independent vectors).
The equation a ⋅ A = b ⋅ A or (ai − bi ) Aij = 0 with known A forms a system of
homogeneous linear equations for a − b . This system has only the trivial solution, if
the determinant of Aij is nonzero. This condition must hold for the equation a⋅A = b⋅A
or (aij − bij ) Ajk = 0 for each value of the free index i separately. We conclude that a
second-order tensor can be canceled out of several scalar products, if and only if its
determinant does not vanish.
The equation a ⋅ A = b ⋅ A or (ai − bi ) Aijk = 0 with known A forms a system of
nine homogeneous linear equations for the three coordinates ai − bi : one equation
for each value of the free indices j and k. At most three of these nine equations can
be linearly independent. The system of equations has only the trivial solution, if and
only if three equations are linearly independent. So we can cancel out the factor A, if
the matrix

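\[
A_{ijk} =
\begin{pmatrix}
A_{111} & A_{112} & A_{113} & A_{121} & A_{122} & A_{123} & A_{131} & A_{132} & A_{133} \\
A_{211} & A_{212} & A_{213} & A_{221} & A_{222} & A_{223} & A_{231} & A_{232} & A_{233} \\
A_{311} & A_{312} & A_{313} & A_{321} & A_{322} & A_{323} & A_{331} & A_{332} & A_{333}
\end{pmatrix}
\]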

has rank 3. To generalize the proof to equations of the form ai...jk Aklm = bi...jk Aklm we
follow the same approach as above.
In the above matrix formed by the Aijk , we call i a row index, since there corre-
sponds to each value of i one row of the matrix, and we call j and k column indices,
since there corresponds to each value pair jk a column of the matrix. With this notion
we can now say under which condition a tensor of order 3 or higher can be canceled
out. In the equation ai...jk Akm...n = bi...jk Akm...n we can cancel out Akm...n , if the matrix
formed by the coordinates Akm...n , with the summation index k as row index and with
the free indices m . . . n as column indices, has rank 3.
Clearly, the much more general condition that the common factor can assume ar-
bitrary values is also sufficient.
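The rank criterion can be tried out numerically. The sketch below assumes NumPy; the helper name `can_cancel` and the sample tensors are hypothetical and serve only as an illustration. The summation index is taken as the first array axis, so reshaping to three rows reproduces the matrix described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def can_cancel(A):
    """Rank test for A_{km...n}: summation index k is the row index."""
    matrix = A.reshape(3, -1)      # columns run over all tuples of free indices
    return np.linalg.matrix_rank(matrix) == 3

A_full = rng.standard_normal((3, 3, 3))          # generic third-order tensor
u = np.array([1.0, 0.0, 0.0])
M = rng.standard_normal((3, 3))
A_deficient = np.einsum('i,jk->ijk', u, M)       # A_ijk = u_i M_jk, rank 1

d = np.array([0.0, 1.0, -2.0])                   # nonzero a_i - b_i, perpendicular to u
product = np.einsum('i,ijk->jk', d, A_deficient) # (a_i - b_i) A_ijk

print(can_cancel(A_full), can_cancel(A_deficient))
print(np.allclose(product, 0.0))
```

For the rank-deficient tensor the product vanishes although d is not zero, so the factor cannot be cancelled there.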

8. We showed at the beginning of this section that the scalar product of two tensors
results again in a tensor. We can now prove the converse, which is also called the
quotient rule and of which we already used a special case in Section 2.3.1, when we
introduced tensors: if the scalar product of two factors is a tensor of order (m + n − 2)
and one factor is a tensor of order m that can take on arbitrary values, then the other
factor is a tensor of order n. We show this for the example aij bjk = cik for polar tensors.
In two different Cartesian coordinate systems we have

\[
a_{ij} b_{jk} = c_{ik} \quad\text{and}\quad \tilde{a}_{ij} \tilde{b}_{jk} = \tilde{c}_{ik} .
\]

Assuming a and c are polar tensors, we have

\[
\tilde{a}_{mj} = \alpha_{pm} \alpha_{qj} a_{pq}, \qquad c_{ik} = \alpha_{im} \alpha_{kn} \tilde{c}_{mn} .
\]

This gives

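\[
\begin{aligned}
a_{ij} b_{jk} = c_{ik} &= \alpha_{im} \alpha_{kn} \tilde{c}_{mn} = \alpha_{im} \alpha_{kn} \tilde{a}_{mj} \tilde{b}_{jn} \\
&= \underbrace{\alpha_{im} \alpha_{pm}}_{\delta_{ip}} \alpha_{kn} \alpha_{qj} a_{pq} \tilde{b}_{jn} = \alpha_{kn} \alpha_{qj} a_{iq} \tilde{b}_{jn}, \\
a_{iq} \left( b_{qk} - \alpha_{kn} \alpha_{qj} \tilde{b}_{jn} \right) &= 0 .
\end{aligned}
\]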

If aiq can take any value, we can cancel it out and then we get

\[
b_{qk} = \alpha_{qj} \alpha_{kn} \tilde{b}_{jn},
\]

i. e. b is a polar tensor of order 2.

9. We recall the geometric interpretation of the scalar product of two vectors: It
is well known that it is equal to the product of the lengths of the two vectors and the
cosine of the enclosed angle, i. e.

a ⋅ b = |a||b| cos(a, b).

If the two vectors form an acute angle, then the scalar product is positive; if they are
perpendicular to each other, it is zero; if they enclose an obtuse angle, it is negative.

10. For the scalar product of a tensor A of arbitrary order with the δ-tensor it follows
as a special case of (1.17) that

A ⋅δ =A; (2.34)


scalar multiplication of a tensor with the δ-tensor does not change the tensor. With
respect to scalar multiplication, the δ-tensor has the same property as the number one
for multiplication in arithmetic. This explains why the δ-tensor is also called identity
tensor.

11. Finally we note the important identity

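\[
(a \cdot b)^{T} = b^{T} \cdot a^{T}, \qquad
(a_{ij} b_{jk})^{T} = b^{T}_{ij} a^{T}_{jk}, \qquad
(\underset{\sim}{a}\, \underset{\sim}{b})^{T} = \underset{\sim}{b}^{T} \underset{\sim}{a}^{T}, \tag{2.35}
\]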

which we can easily prove analogously to (1.48) by translating it into index notation.

Problem 2.7.
Prove the following identities by translating into index notation:
A. (a b)T = b a, B. a ⋅ b = bT ⋅ a, C. (a ⋅ b ⋅ c)T = cT ⋅ bT ⋅ aT .

12. In summary, the scalar product has the following properties.

– The scalar product of a tensor of order m and a tensor of order n is a tensor of
order (m + n − 2).
– If the scalar product of two factors is a tensor of order (m + n − 2) and if one
factor is a tensor of order m, which can assume arbitrary values, then the other
factor is a tensor of order n (quotient rule).
– The three tensors of a scalar multiplication (i. e. the two factors and the prod-
uct) are either all polar, or two of them are axial and the third is polar.
– For translation the order of the free indices in a scalar product must be the same
on both sides of an equation.
– Addition and scalar multiplication are distributive also in symbolic notation.
– Scalar multiplication is in general commutative only if both factors are vectors.
– Any sequence of tensor and scalar multiplications is associative also in sym-
bolic notation, if for each tensor the sum of its adjacent dots is smaller than or equal
to its order.
– A scalar product can be zero, even if none of its factors is zero.
– A common factor in an equation with several scalar products that can assume
arbitrary values can be canceled out. In particular we have:
– A vector as a common factor in a scalar product can be cancelled out if we
can substitute three linearly independent vectors for it.
– A second-order tensor as a common factor in a scalar product can be can-
celled out if its determinant is nonzero.
– A higher-order tensor as a common factor in a scalar product can be can-
celled out if the matrix of its coordinates, with the summation index as row
index and the free indices as column indices, has rank 3.


2.9.3 Contraction and Trace

1. Setting equal an index in both factors of a tensor product of two tensors defines,
according to the summation convention, an arithmetic operation. This operation is
called contraction of the two tensors on the two indices. A contraction is defined only
in index notation. It builds a tensor of order (m + n − 2) from a tensor of order m and
a tensor of order n. The scalar product is a special case of a contraction, namely a
contraction on two adjacent indices.
2. The operation defined by setting equal two indices of a single tensor is also called
contraction. It reduces the order of the original tensor by two. The simplest contraction
of a tensor, the contraction of a second-order tensor, has a special name; it is called
the trace of a tensor and it is also expressed in symbolic notation.

tr a = b, aii = b. (2.36)

So the term contraction is used for two similar but different operations. The contrac-
tion of the two factors of a tensor multiplication is simultaneously a contraction of
their tensor product.
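Both kinds of contraction can be written directly as index expressions. The sketch below assumes NumPy's `einsum`; the random sample tensors are hypothetical. It checks that contracting the two factors gives the same result as contracting their tensor product, and that the trace is the contraction of a second-order tensor.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((3, 3, 3))                 # third-order tensor a_ijk
b = rng.standard_normal((3, 3))                    # second-order tensor b_mn

product = np.einsum('ijk,mn->ijkmn', a, b)         # tensor product, order 5
contracted = np.einsum('ijk,jn->ikn', a, b)        # contraction of the factors: a_ijk b_jn
via_product = np.einsum('ijkjn->ikn', product)     # same contraction applied to the product

trace_b = np.einsum('ii->', b)                     # tr b = b_ii

print(contracted.ndim)                             # order m + n - 2 = 3
print(np.allclose(contracted, via_product))
```

Note that the contraction on j joins indices that are not adjacent, so it is not a scalar product and has no direct symbolic form.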

2.9.4 Multiple Scalar Products

We denote the double scalar product or double inner product or double dot product
by two adjacent dots. It is obtained by contracting the two index pairs adjacent to the
gap between the two tensors.

a ⋅⋅ b = A, aij bij = A,
a ⋅⋅ b = B, aij bijk = Bk , (2.37)
a ⋅⋅ b = C, aijk bjk = Ci ,
etc. etc.

Multiple scalar products are defined accordingly, e. g.

a ⋅⋅⋅ b = A, aijk bijk = A. (2.38)

It is straightforward to generalize the highlighted properties of the single scalar prod-
uct to multiple scalar products.
Some authors define the double scalar product and similarly multiple scalar prod-
ucts by a mirrored contraction of two neighboring index pairs; we suggest denoting it
by two vertical dots to distinguish it from our definition, i. e.


a : b = D, aij bji = D,
a : b = E, aij bjik = Ek , (2.39)
a : b = F, aijk bkji = F.

(As a mnemonic hint, the horizontal dots are reminiscent of the summation indices on the two
sides of the dots, and the vertical dots are reminiscent of a plane of reflection.)
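The difference between the two conventions shows up as soon as the tensors are not symmetric. A minimal sketch, assuming NumPy and arbitrary sample values:

```python
import numpy as np

a = np.array([[1.0, 2.0, 0.0],
              [3.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
b = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 4.0, 1.0]])

horizontal = np.einsum('ij,ij->', a, b)   # a .. b = a_ij b_ij (definition of the text)
vertical = np.einsum('ij,ji->', a, b)     # a : b  = a_ij b_ji (mirrored contraction)

print(horizontal, vertical)               # 7.0 6.0
```

In matrix terms the horizontal version equals the trace of a bᵀ and the vertical version the trace of a b.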

Problem 2.8.
Translate, using the definitions in (2.37), into symbolic notation:
A. aij bij = c, B. aij bji = c, C. aij bkjl cik = dl , D. aij bikl cklj = d.
Are parentheses necessary in symbolic notation in C and D?

Problem 2.9.
A. tr δ, B. δ ⋅ δ, C. δ ⋅⋅ δ.

Problem 2.10.
A. Under which condition is the double scalar product of two second-order tensors
B. Under which condition can bij be cancelled out in aij bij = 0 ?

Problem 2.11.
A. Prove that the double scalar product of a symmetric tensor and an antimetric ten-
sor is zero.

a(ij) b[ij] = 0. (2.40)

B. Prove also the converse: If the double scalar product of two tensors is zero and one
of the tensors is an arbitrary symmetric (antimetric) tensor, then the other tensor
is antimetric (symmetric).
C. Prove also the following theorem: If the double scalar product of a second-order
tensor with the ε-tensor is zero, then this tensor is symmetric (the converse is ac-
cording to (2.40) true anyway).

εijk ajk = 0 ⇐⇒ ajk = akj . (2.41)

D. How many linearly independent symmetric tensors are necessary to write an ar-
bitrary symmetric tensor as their linear combination?


2.10 Vector Multiplication of Tensors

2.10.1 Definition

1. Tensor multiplication assigns a (second-order) tensor to two vectors and scalar mul-
tiplication assigns a scalar. A vector can be assigned to two vectors by contracting
twice with the ε-tensor. We define

a × b := ε ⋅⋅ a b = A

and call A the vector product or outer product or cross product of the vectors a and b.
We write in index notation

(a × b)i := εijk aj bk = Ai .

Clearly, we can also write εijk between the two vectors or behind them; if the contrac-
tion is translatable, we obtain the expressions

(a × b)i := εijk aj bk = −aj εjik bk = aj bk εjki ,

i. e. we have three equivalent definitions for the vector product of two vectors:

a × b := ε ⋅⋅ a b = −a ⋅ ε ⋅ b = a b ⋅⋅ ε.

Like the scalar product, the vector product is also a special case of a contraction.
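The three equivalent definitions can be compared numerically. The sketch assumes NumPy; the array `eps` holds the coordinates of the ε-tensor in a right-handed Cartesian system, with the indices 1, 2, 3 of the text mapped to 0, 1, 2, and the sample vectors are arbitrary.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0     # even permutations
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0    # odd permutations

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 4.0])

front = np.einsum('ijk,j,k->i', eps, a, b)     # eps_ijk a_j b_k
middle = -np.einsum('j,jik,k->i', a, eps, b)   # -a_j eps_jik b_k
behind = np.einsum('j,k,jki->i', a, b, eps)    # a_j b_k eps_jki

print(np.allclose(front, middle), np.allclose(front, behind))
print(np.allclose(front, np.cross(a, b)))
```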
To compute the vector product of two vectors, we remember that the elements of
εijk are nonzero only if all three indices are different. For example, for i = 1 only ε123
and ε132 are nonzero. Thus we have

(a × b)1 = ε123 a2 b3 + ε132 a3 b2 = a2 b3 − a3 b2 ,

(a × b)2 = ε231 a3 b1 + ε213 a1 b3 = a3 b1 − a1 b3 ,
(a × b)3 = ε312 a1 b2 + ε321 a2 b1 = a1 b2 − a2 b1 ,

i. e. we can write for the components of the vector product

a × b = (a2 b3 − a3 b2 ) e1 + (a3 b1 − a1 b3 ) e2 + (a1 b2 − a2 b1 ) e3

or in the more familiar form

\[
a \times b =
\begin{vmatrix}
e_1 & e_2 & e_3 \\
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3
\end{vmatrix} .
\]


2. The ε-tensor is an axial tensor, so the vector product of two polar vectors or of two
axial vectors gives an axial vector, and the vector product of a polar and an axial vector
gives a polar vector. In other words, the three vectors of a vector product are either all
axial or two are polar and the third is axial.

3. Some authors use the polar E-tensor from (2.31) instead of the axial ε-tensor to de-
fine the vector product. With this definition the result of the vector product of two polar
vectors is a polar vector. We denote this vector product by the symbol $\tilde{\times}$ to distinguish
it from our definition:

\[
a \mathbin{\tilde{\times}} b := E \cdot\!\cdot\, a\, b, \qquad
(a \mathbin{\tilde{\times}} b)_i = E_{ijk} a_j b_k . \tag{2.42}
\]

Clearly, this definition is necessary, if the introduction of axial tensors is abandoned
altogether. For example, the angular velocity vector is then assigned to the direction of
rotation by the right-hand screw rule, independent of the coordinate system. The ad-
vantage of this definition is that all tensors are independent of the coordinate system,
and the disadvantage is that the reflection symmetry of physical equations is violated.
We will not use this vector product here.

4. We recall the geometric interpretation of the vector product a × b. Its magnitude
is equal to the area of the parallelogram spanned by the vectors a and b:

|a × b| = |a||b| sin(a , b).

The angle is the smaller of the two angles enclosed by the vectors a and b, so that it is
at most 180∘ and thus the sine is nonnegative. The vector a × b is perpendicular to the
vectors a and b (i. e. to the plane spanned by them) and forms a right-hand screw with
the direction of rotation obtained by rotating the vector a about the smaller of the two
angles enclosed by a and b into b in a right-handed system, and a left-hand screw in
a left-handed system.

5. Finally, we generalize also the vector product to higher-order tensors. According to
the definition of the vector product, we can write the ε-tensor in front of the two vec-
tors, between the two vectors, or behind them. We generalize to higher-order tensors
using the formulation where the ε-tensor is written between the two factors.


a × b := −a ⋅ ε ⋅ b = A,    (a × b)j := −ai εijk bk = ai εkji bk = Aj ,
a × b := −a ⋅ ε ⋅ b = B,    (a × b)jm := −ai εijk bkm = ai εkji bkm = Bjm ,
a × b := −a ⋅ ε ⋅ b = C,    (a × b)mj := −ami εijk bk = ami εkji bk = Cmj ,
a × b := −a ⋅ ε ⋅ b = D,    (a × b)mjn := −ami εijk bkn = ami εkji bkn = Dmjn ,

If the first factor of a vector product is a vector, then we can write the ε-tensor also in
front of the two factors, i. e.

a × b = −a ⋅ ε ⋅ b = ε ⋅⋅ a b,

(a × b)jm = −ai εijk bkm = εjik ai bkm ,

and if the last factor of a vector product is a vector, then we can write the ε-tensor also
behind the two factors, i. e.

a × b = −a ⋅ ε ⋅ b = a b ⋅⋅ ε,

(a × b)mj = −ami εijk bk = ami bk εikj .

The vector product of two vectors is often written as (a × b)i = εijk aj bk .

Instead of the standard definition we can also introduce its negative and so avoid
the minus sign. If we denote this unusual vector product with ⊗ (read: circled cross),
then we have

a ⊗ b = −a × b,
A ⊗ B = −A × B ,
(a ⊗ b)q := ap εpqr br ,
(A ⊗ B )i...jqm...n := ai...jp εpqr brm...n .
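The generalized definitions can be evaluated with the same index expressions. The sketch below assumes NumPy; the random sample tensors are hypothetical. It also illustrates that a vector product with a second-order factor acts on the index adjacent to the cross, one column or row at a time.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

rng = np.random.default_rng(2)
a = rng.standard_normal(3)           # vector a_i
b = rng.standard_normal(3)           # vector b_k
A = rng.standard_normal((3, 3))      # second-order tensor a_mi
B = rng.standard_normal((3, 3))      # second-order tensor b_km

a_x_b = -np.einsum('i,ijk,k->j', a, eps, b)        # (a x b)_j
a_x_B = -np.einsum('i,ijk,km->jm', a, eps, B)      # (a x B)_jm
A_x_b = -np.einsum('mi,ijk,k->mj', A, eps, b)      # (A x b)_mj
A_x_B = -np.einsum('mi,ijk,kn->mjn', A, eps, B)    # (A x B)_mjn

a_circ_b = np.einsum('p,pqr,r->q', a, eps, b)      # circled cross: a_p eps_pqr b_r

print(np.allclose(a_x_b, np.cross(a, b)))
print(np.allclose(a_circ_b, -a_x_b))
```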


Problem 2.12.
Which of the following expressions are ambiguous without parentheses?
A. a × b c, B. a × b ⋅ c, C. a ⋅ b × c, D. a ⋅ b × c.
Hint: First, check if both meanings, in A e. g. (a × b) c and a × (b c), are defined.
If this is true, then check if the associative law holds. An expression is not unique
without parentheses only if both meanings are defined and if the associative law does
not hold.

Problem 2.13.
Translate the following expressions into the other notation. Then simplify each ex-
pression in index notation and translate the result into symbolic notation, thus ob-
taining a tensor algebraic identity in symbolic notation.
A. εijk am bj εmin ck dn , B. εijk cm bkm εqpi aj dq , C. a × [(b ⋅ c) × d],
D. a × (b × c). E. Compare the expressions B and C.

Problem 2.14.
Use the Grassmann identity (“backwards”) to rewrite the expression ai bk − ak bi and
translate it into symbolic notation.

2.10.2 Properties

Since the vector products in (2.43) are defined by scalar products, we can include the
vector products in the associative law for scalar products.
From a × b := −a ⋅ ε ⋅ b it follows that for both factors the cross is equivalent to the
dot of a scalar product.
So a sequence of products of tensors, including vector products, is associative if
for each tensor the sum of dots and crosses on each side of the tensor is at most equal
to the order of the tensor. For example, this is not the case for b in a × b × c = a ⋅ ε ⋅ b ⋅ ε ⋅ c
and, as we know, a × (b × c) ≠ (a × b) × c.
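A minimal numerical counterexample, assuming NumPy and using Cartesian unit vectors as sample values:

```python
import numpy as np

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

left = np.cross(np.cross(e1, e1), e2)    # (e1 x e1) x e2 = 0
right = np.cross(e1, np.cross(e1, e2))   # e1 x (e1 x e2) = e1 x e3 = -e2

print(left, right)
```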
The remaining statements about vector products can be proven analogously to
their corresponding statements about scalar products.

– The vector product of a tensor of order m and a tensor of order n gives a tensor
of order (m + n − 1).
– The three tensors of a vector multiplication (i. e. the two factors and the prod-
uct) are either all axial or two of them are polar and the third is axial.
– For the translation the order of the free indices of a vector product must be the
same on both sides of an equation.
– Addition and vector multiplication are distributive also in symbolic notation.


– A vector product is not only zero if one of the factors is zero.

– Vector multiplication is in general not commutative.
– Any sequence of products of tensors is associative also in symbolic form, if
and only if the sum of the scalar product dots and the vector product crosses
on both sides of each tensor is at most equal to the order of the tensor.
– A common factor that can assume arbitrary values can be canceled out in an
equation with several vector products.

2.10.3 Triple Product

1. A vector product followed by a scalar product between three vectors is often called
the triple product of these three vectors; we denote the triple product by square brack-
ets. Using (1.28) we have the following formulas.

\[
[a, b, c] = a \times b \cdot c = a \cdot b \times c = A, \qquad
\varepsilon_{ijk} a_i b_j c_k =
\begin{vmatrix}
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3 \\
c_1 & c_2 & c_3
\end{vmatrix} = A. \tag{2.45}
\]

(Here we do not need to set parentheses because only (a×b)⋅c is defined, but a×(b⋅c)
is not defined.)
If e. g. all three vectors are polar, then the triple product is an axial scalar; instead
of axial scalar often the term pseudoscalar is used.

2. We recall the geometric interpretation of the triple product. Its magnitude is equal
to the volume of the parallelepiped spanned by the three vectors a, b, and c; it is zero
if the three vectors are coplanar. This includes the case that one of the three vectors is
zero.
The triple product is positive if the vector product a × b forms an acute angle
with c. If this is the case for three polar vectors in a right-handed system, then we say
the three vectors a, b, and c (in this order) form a right-handed system. If this is true
for three polar vectors in a left-handed system, then we say they form (in this order) a
left-handed system.
A triple product changes its sign if we interchange two of the vectors. The sign
does not change if we permute the three vectors cyclically.
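These properties can be verified on a small example. The sketch assumes NumPy; the function name `triple` and the sample vectors are hypothetical.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

def triple(a, b, c):
    """Triple product [a, b, c] = eps_ijk a_i b_j c_k."""
    return np.einsum('ijk,i,j,k->', eps, a, b, c)

a = np.array([1.0, 2.0, 0.0])
b = np.array([0.0, 1.0, 3.0])
c = np.array([2.0, 0.0, 1.0])

value = triple(a, b, c)
print(value)                                   # 13.0
print(np.linalg.det(np.array([a, b, c])))      # same value as a determinant
print(triple(b, c, a), triple(b, a, c))        # cyclic: same sign, swap: opposite sign
```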


3. For the product of two triple products we have

\[
[a, b, c][d, e, f] \overset{(2.45)}{=}
\begin{vmatrix}
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3 \\
c_1 & c_2 & c_3
\end{vmatrix}
\begin{vmatrix}
d_1 & e_1 & f_1 \\
d_2 & e_2 & f_2 \\
d_3 & e_3 & f_3
\end{vmatrix}
\overset{(1.13)}{=}
\begin{vmatrix}
a \cdot d & a \cdot e & a \cdot f \\
b \cdot d & b \cdot e & b \cdot f \\
c \cdot d & c \cdot e & c \cdot f
\end{vmatrix},
\]

\[
[a, b, c][d, e, f] =
\begin{vmatrix}
a \cdot d & a \cdot e & a \cdot f \\
b \cdot d & b \cdot e & b \cdot f \\
c \cdot d & c \cdot e & c \cdot f
\end{vmatrix} . \tag{2.46}
\]
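Identity (2.46) can be checked numerically. In the sketch below, which assumes NumPy, the rows of the hypothetical arrays `M` and `N` hold random sample vectors a, b, c and d, e, f; the matrix of scalar products is then M Nᵀ.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3))     # rows: a, b, c
N = rng.standard_normal((3, 3))     # rows: d, e, f

lhs = np.linalg.det(M) * np.linalg.det(N)    # [a, b, c][d, e, f]
gram = M @ N.T                               # gram[r, s] = (row r of M) . (row s of N)
rhs = np.linalg.det(gram)

print(np.isclose(lhs, rhs))
```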

2.11 Summary of Tensor-Algebraic Operations

For convenience we summarize the tensor-algebraic operations in the following table.
If an operation is defined for tensors of different orders, we give a typical example for
each case.

Name                                        Index notation         Symbolic notation

For one tensor:
  transpose (order 2 only)                  aTij := aji            aT
  contraction (order 2 and higher)          aijik = bjk
    also multiple:                          aijij = b
    in particular for order 2:
    computation of the trace                aii = b                tr a = b

For two tensors:
  equality
  (only for the same order and polarity)    aij = bij              a = b
  addition (subtraction)
  (only for the same order and polarity)    aij ± bij = cij        a ± b = c
  tensor multiplication                     ai bjk = cijk          a b = c
  contraction (order 1 and higher)          aijk bmkn = cijmn
    also multiple:                          aijk bmki = cjm
    in particular: scalar multiplication    aijk bkm = cijm        a ⋅ b = c
      also multiple:                        aijk bjk = ci          a ⋅⋅ b = c
    in particular: vector multiplication    ak εijk bim = cjm      a × b = c

For three vectors:
  computation of the triple product         εijk ai bj ck = d      [a, b, c] = d


2.12 Differential Operations

We now turn to tensor fields, i. e. tensors whose coordinates are functions of the posi-
tion. Analogously to the three products we define three differential operators for tensor
fields: gradient, divergence, and curl.

2.12.1 The Fundamental Theorem of Tensor Analysis

If we differentiate a scalar field a(x, y, z) with respect to the spatial coordinates, we get
the coordinates (𝜕a/𝜕x, 𝜕a/𝜕y, 𝜕a/𝜕z) of a vector field, which is called the gradient of
a. We can show in general that the derivatives of the Cartesian coordinates of a tensor
field of order n with respect to the spatial coordinates form the Cartesian coordinates
of a tensor field of order (n + 1). This statement is called the fundamental theorem of
tensor analysis.
We prove it here using the example of a second-order polar tensor field aij (x),
where the argument x stands for the Cartesian spatial coordinates (x1 , x2 , x3 ). Accord-
ing to (2.17) we have

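\[
\begin{aligned}
a_{ij}(x) &= \alpha_{im} \alpha_{jn} \tilde{a}_{mn}(x) = \alpha_{im} \alpha_{jn} \tilde{a}_{mn}(x(\tilde{x})) = \alpha_{im} \alpha_{jn} \tilde{a}_{mn}(\tilde{x}), \\
\mathrm{d} a_{ij} &= \alpha_{im} \alpha_{jn} \frac{\partial \tilde{a}_{mn}}{\partial \tilde{x}_q} \, \mathrm{d} \tilde{x}_q, \\
\frac{\partial a_{ij}}{\partial x_p} &= \alpha_{im} \alpha_{jn} \frac{\partial \tilde{a}_{mn}}{\partial \tilde{x}_q} \frac{\partial \tilde{x}_q}{\partial x_p} .
\end{aligned}
\]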

Further from (2.9) it follows that

\[
\frac{\partial \tilde{x}_q}{\partial x_p} = \alpha_{pq},
\]

and thus we have

\[
\frac{\partial a_{ij}}{\partial x_p} = \alpha_{im} \alpha_{jn} \alpha_{pq} \frac{\partial \tilde{a}_{mn}}{\partial \tilde{x}_q},
\]

which is the transformation law for the coordinates of a third-order polar tensor.

2.12.2 Gradient

In symbolic notation we start with the tensor field a(x) = aij ei ej . If we differentiate
with respect to xk , we have, because the bases are independent of the position,

\[
\frac{\partial a}{\partial x_k} = \frac{\partial a_{ij}}{\partial x_k} \, e_i e_j .
\]


Since the 𝜕aij /𝜕xk are the coordinates of a tensor of order 3, we obtain from 𝜕a/𝜕xk
a tensor of order 3 by multiplying it by ek . This tensor is called the gradient13 of the
original tensor a.
The gradient of a polar tensor field of order n gives a polar tensor field of order
(n + 1), and the gradient of an axial tensor field of order n gives an axial tensor field of
order (n + 1).
Since the tensor product of basis vectors is not commutative, we have two differ-
ent definitions of the gradient, depending on whether we multiply 𝜕a/𝜕xk by ek from
the left or from the right; they can be distinguished as the left gradient and the right
gradient. Both definitions are used in the literature, so one has to check each time,
when the gradient of a tensor of at least order 1 is used, how the gradient is defined.
These two different definitions of the gradient depend on which convention is
used for the order of the indices in the spatial derivative of tensor coordinates. If the
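denominator index is put behind the numerator indices, i. e.

\[
\frac{\partial a_{ij}}{\partial x_k} = A_{ijk},
\]

then contraction with the corresponding basis gives the right gradient

\[
\underbrace{\frac{\partial a_{ij}}{\partial x_k} \, e_i e_j e_k}_{\operatorname{grad}_R a} = \underbrace{A_{ijk} \, e_i e_j e_k}_{A} ;
\]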

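if the denominator index is put in front of the numerator indices, i. e.

\[
\frac{\partial a_{ij}}{\partial x_k} = B_{kij},
\]

then contraction with the corresponding basis gives the left gradient

\[
\underbrace{\frac{\partial a_{ij}}{\partial x_k} \, e_k e_i e_j}_{\operatorname{grad}_L a} = \underbrace{B_{kij} \, e_k e_i e_j}_{B} .
\]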

Introducing the del operator14

\[
\nabla := \frac{\partial}{\partial x_k} \, e_k \tag{2.47}
\]
we can define the two gradients as tensor products with the del operator

gradR A := A ∇, gradL A := ∇ A , (2.48)

13 Gradiens Latin: (steepest) stepping (increase) (present participle of gradi [to step/walk]).
14 Also called Nabla operator.


where in the first case del acts on the tensor from the right and in the second case it
acts on the tensor from the left.
Since tensor multiplication of a tensor by a scalar is commutative, the two gradi-
ents of a scalar coincide, i. e.

grad a = a ∇ = ∇ a.

Thus the left and the right gradient are generalizations of the gradient from vector
calculus. However, for tensors of at least order 1 we have to make a decision. We choose
here the right gradient, because the order “numerator index in front of denominator
index” is how we read a differential quotient ∂aij /∂xk . So we agree on the following
convention.

The last index of a gradient is the differentiation index.

With this convention we can drop the index R in gradR from now on.
For example, we write

\[
\operatorname{grad} a = a \, \nabla = a_{ij} \, e_i e_j \frac{\partial}{\partial x_k} e_k = \frac{\partial a_{ij}}{\partial x_k} \, e_i e_j e_k .
\]

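For tensors of various order we have the following definitions.

\[
\begin{aligned}
\operatorname{grad} a &:= \frac{\partial a}{\partial x_k} \, e_k, \\
\operatorname{grad} a &:= \frac{\partial a_i}{\partial x_k} \, e_i e_k, \\
\operatorname{grad} a &:= \frac{\partial a_{ij}}{\partial x_k} \, e_i e_j e_k,
\end{aligned} \tag{2.49}
\]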

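In both notations this is written as follows.

\[
\begin{aligned}
\operatorname{grad} a &= A, & \frac{\partial a}{\partial x_k} &= A_k, \\
\operatorname{grad} a &= A, & \frac{\partial a_i}{\partial x_k} &= A_{ik}, \\
\operatorname{grad} a &= A, & \frac{\partial a_{ij}}{\partial x_k} &= A_{ijk}, \\
&\text{etc.} & &\text{etc.}
\end{aligned} \tag{2.50}
\]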
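The convention "differentiation index last" maps naturally onto array axes. The following sketch assumes NumPy and approximates the derivatives by central differences; the function names `field` and `grad`, the sample field, and the step size `h` are hypothetical choices made for illustration.

```python
import numpy as np

def field(x):
    """Hypothetical sample vector field a_i(x), for illustration only."""
    return np.array([x[0] * x[1], x[1] * x[2] ** 2, np.sin(x[0])])

def grad(f, x, h=1e-6):
    """Right gradient by central differences: result[..., k] = d f[...] / d x_k."""
    parts = []
    for k in range(3):
        step = np.zeros(3)
        step[k] = h
        parts.append((f(x + step) - f(x - step)) / (2 * h))
    return np.stack(parts, axis=-1)      # the differentiation index goes last

x0 = np.array([0.5, -1.0, 2.0])
A = grad(field, x0)                      # A_ik = d a_i / d x_k
print(A.shape)                           # order n + 1: (3, 3)
print(np.round(A, 6))
```

The left gradient would place the differentiation index first, which for a vector field is the transposed array.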


Some authors use the del operator instead of the grad symbol in symbolic notation
and use it in computations like a vector. However, we need additional rules for com-
putations with this “symbolic vector”, since it is not a real vector but only behaves like
a vector in some ways. For that reason we use del only in definitions and to distinguish
the different differential operations in tensor analysis. We will not use it in computa-
tions, where we will use index notation instead, like we did in tensor algebra.

2.12.3 (Total) Differential

1. Consider two points with the coordinates xi and xi +dxi , i. e. with the position vectors
x and x + d x, which do not have to be neighboring points. The difference d x between
the two position vectors is called the differential of the position vector x.
Differentials of the position vector in two different coordinate systems are related
by

\[
\mathrm{d} x_i = \frac{\partial x_i}{\partial \tilde{x}_j} \, \mathrm{d} \tilde{x}_j,
\]

and according to (2.9) we have

d xi = αij d x̃j , (2.51)

i. e. the differential of the position vector is a polar vector. We write in different notations

\[
\mathrm{d} x = \mathrm{d} x_i \, e_i = \underbrace{\mathrm{d} x \, e_x}_{\mathrm{d} x_1} + \underbrace{\mathrm{d} y \, e_y}_{\mathrm{d} x_2} + \underbrace{\mathrm{d} z \, e_z}_{\mathrm{d} x_3}, \tag{2.52}
\]

where d x1 , d x2 , and d x3 are the components of d x in the coordinate system under
consideration.

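With our definition of the gradient of a tensor we define the (total) differential of
a tensor in the two notations.

\[
\begin{aligned}
\mathrm{d} a &:= \operatorname{grad} a \cdot \mathrm{d} x, & \mathrm{d} a &:= \frac{\partial a}{\partial x_k} \, \mathrm{d} x_k, \\
\mathrm{d} a &:= (\operatorname{grad} a) \cdot \mathrm{d} x, & \mathrm{d} a_i &:= \frac{\partial a_i}{\partial x_k} \, \mathrm{d} x_k, \\
\mathrm{d} a &:= (\operatorname{grad} a) \cdot \mathrm{d} x, & \mathrm{d} a_{ij} &:= \frac{\partial a_{ij}}{\partial x_k} \, \mathrm{d} x_k, \\
&\text{etc.} & &\text{etc.}
\end{aligned} \tag{2.53}
\]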


The differential of a tensor depends clearly on the point x at which we compute the
gradient and on the differential d x. Geometrically, it represents the increase of the
tensor field on the tangential hyperplane of the tensor field at the point x.
Sometimes instead of “the differential of a tensor in index notation” we also speak
of “the differential of tensor coordinates”.
If we use the left gradient instead of the right gradient in the formulas above, then
in symbolic notation, the order of the factors changes.
Since the basis vectors are independent of the position, the differential of a tensor
and the differential of its coordinates are directly related. For example, we have

d a = d (ai ei ) = d ai ei ,

and for tensors of various order we write the following equations.

d a = d a,
d a = d ai ei ,
d a = d aij ei ej ,


2. For neighboring points with the position vectors x + d x and x, the differential
of tensors and the differential of tensor coordinates, respectively, are equal to the dif-
ferences of the tensors and tensor coordinates at the two points, up to second order
in d x.

d a = a(x + d x) − a(x),
d a = a(x + d x) − a(x),
d a = a(x + d x) − a(x),

d a = a(xp + d xp ) − a(xp ),
d ai = ai (xp + d xp ) − ai (xp ),
d aij = aij (xp + d xp ) − aij (xp ),
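For neighboring points, the differential and the actual difference of the field values should differ only by terms of second order in d x. The sketch below, assuming NumPy and a hypothetical sample field with its analytic gradient, checks this by shrinking d x by a factor of ten and observing that the discrepancy shrinks by roughly a factor of one hundred.

```python
import numpy as np

def field(x):
    """Hypothetical sample vector field a_i(x), for illustration only."""
    return np.array([x[0] * x[1], x[1] * x[2] ** 2, np.sin(x[0])])

def grad_field(x):
    """Analytic right gradient A_ik = d a_i / d x_k of the sample field."""
    return np.array([[x[1], x[0], 0.0],
                     [0.0, x[2] ** 2, 2.0 * x[1] * x[2]],
                     [np.cos(x[0]), 0.0, 0.0]])

def differential_error(x, dx):
    """Norm of [a(x + dx) - a(x)] - (grad a) . dx."""
    da = np.einsum('ik,k->i', grad_field(x), dx)   # d a_i = (d a_i / d x_k) d x_k
    return np.linalg.norm(field(x + dx) - field(x) - da)

x0 = np.array([0.5, -1.0, 2.0])
dx = 1e-3 * np.array([1.0, -2.0, 0.5])

err = differential_error(x0, dx)
err_small = differential_error(x0, dx / 10)
print(err, err_small, err / err_small)
```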


2.12.4 Divergence

We obtain the divergence15 from the gradient, if we replace the tensor product of the
tensor field and the del by the scalar product, i. e.

divR A := A ⋅ ∇, divL A := ∇ ⋅ A . (2.56)

So the divergence of a tensor field is defined for tensor fields of at least order 1, and
since the scalar product of two vectors is commutative, the right divergence and left
divergence of a vector field coincide; thus both divergences are generalizations of the
divergence from vector calculus.
Clearly, the divergence of a polar tensor field of order n gives a polar tensor field of
order (n − 1), and the divergence of an axial tensor field of order n gives an axial tensor field
of order (n − 1).
We will use the right divergence from now on and thus drop the index R in divR .
So we agree on the following convention.

The differentiation index in a divergence is equal to the last index of the tensor
whose divergence is computed.

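For example, we have

\[
\operatorname{div} a = a \cdot \nabla = a_{ij} \, e_i e_j \cdot \frac{\partial}{\partial x_k} e_k = \frac{\partial a_{ij}}{\partial x_k} \, e_i \underbrace{e_j \cdot e_k}_{\delta_{jk}} = \frac{\partial a_{ij}}{\partial x_j} \, e_i .
\]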

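For tensors of various order we have the following definitions.

\[
\begin{aligned}
\operatorname{div} a &:= \frac{\partial a_k}{\partial x_k}, \\
\operatorname{div} a &:= \frac{\partial a_{ik}}{\partial x_k} \, e_i, \\
\operatorname{div} a &:= \frac{\partial a_{ijk}}{\partial x_k} \, e_i e_j,
\end{aligned} \tag{2.57}
\]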

15 Divergentia Latin: divergence, also source strength (verbal noun to modern Latin divergere [to di-
verge], derived from vergere [to slope]).


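In the two notations this is written as follows.

\[
\begin{aligned}
\operatorname{div} a &= A, & \frac{\partial a_k}{\partial x_k} &= A, \\
\operatorname{div} a &= A, & \frac{\partial a_{ik}}{\partial x_k} &= A_i, \\
\operatorname{div} a &= A, & \frac{\partial a_{ijk}}{\partial x_k} &= A_{ij}, \\
&\text{etc.} & &\text{etc.}
\end{aligned} \tag{2.58}
\]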
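For a second-order tensor field the right and the left divergence generally differ. The sketch below assumes NumPy and central differences; the sample field a_ij = x_i x_j² and the helper `grad` are hypothetical. The divergence is obtained by contracting the differentiation index of the gradient with the last (right divergence) or the first (left divergence) tensor index.

```python
import numpy as np

def tensor_field(x):
    """Hypothetical sample field a_ij(x) = x_i * x_j**2, for illustration only."""
    return np.outer(x, x ** 2)

def grad(f, x, h=1e-5):
    """Right gradient by central differences; the differentiation index is last."""
    parts = []
    for k in range(3):
        step = np.zeros(3)
        step[k] = h
        parts.append((f(x + step) - f(x - step)) / (2 * h))
    return np.stack(parts, axis=-1)

x0 = np.array([0.5, -1.0, 2.0])
G = grad(tensor_field, x0)               # G[i, j, k] = d a_ij / d x_k

div_right = np.einsum('ijj->i', G)       # d a_ij / d x_j (convention of the text)
div_left = np.einsum('iji->j', G)        # d a_ij / d x_i (left divergence)
print(np.round(div_right, 6), np.round(div_left, 6))
```

For this sample field the analytic results are x_i² + 2 x_i (x_1 + x_2 + x_3) for the right divergence and 5 x_j² for the left divergence.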

Problem 2.15.
Translate into the other notation:
A. div grad a = b, B. grad div a = c, C. (∂ai /∂xj ) (∂bj /∂xk ) = cik , D. (∂ai /∂xj ) (∂bi /∂xj ) = c.

2.12.5 Curl

We obtain the curl from the gradient, if we replace the tensor product of the tensor
field and the del by the vector product.
So also the curl of a tensor field is defined for tensor fields of at least order 1. Since
the sign of a vector product changes if we change the order of the factors, and since
we want both the right curl and the left curl to coincide for a vector field with the curl
from vector calculus, we define

curlR A := A ⊗ ∇ = −A × ∇, curlL A := ∇ × A . (2.59)

The curl of a polar tensor field of order n clearly gives an axial tensor field of order n,
and the curl of an axial tensor field of order n gives a polar tensor field of order n.
From now on we use the right curl, so we will drop the index R in curlR .
We write e. g.

curl a = a ⊗ ∇.

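We translate, starting from the right side, so we obtain

\[
a_{mi} \, \varepsilon_{ijk} \frac{\partial}{\partial x_k} = \frac{\partial a_{mi}}{\partial x_k} \, \varepsilon_{ijk} = (\operatorname{curl} a)_{mj}, \qquad
\operatorname{curl} a = \frac{\partial a_{mi}}{\partial x_k} \, \varepsilon_{ijk} \, e_m e_j .
\]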

In contrast to the gradient and the divergence we have to be careful in ami εijk ∂/∂xk
to pull the differentiation in front of the epsilon and not the coordinate behind the
epsilon; e. g. in εijk ∂ami /∂xk = (curl a)jm the order of the free indices is reversed, so it
would be a different tensor.
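For tensors of various order we have the following definitions.

\[
\begin{aligned}
\operatorname{curl} a &:= \frac{\partial a_i}{\partial x_k} \, \varepsilon_{ijk} \, e_j, \\
\operatorname{curl} a &:= \frac{\partial a_{mi}}{\partial x_k} \, \varepsilon_{ijk} \, e_m e_j, \\
\operatorname{curl} a &:= \frac{\partial a_{mni}}{\partial x_k} \, \varepsilon_{ijk} \, e_m e_n e_j,
\end{aligned} \tag{2.60}
\]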

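In both notations this is written as follows.

\[
\begin{aligned}
\operatorname{curl} a &= A, & \frac{\partial a_i}{\partial x_k} \, \varepsilon_{ijk} &= A_j, \\
\operatorname{curl} a &= A, & \frac{\partial a_{mi}}{\partial x_k} \, \varepsilon_{ijk} &= A_{mj}, \\
\operatorname{curl} a &= A, & \frac{\partial a_{mni}}{\partial x_k} \, \varepsilon_{ijk} &= A_{mnj}, \\
&\text{etc.} & &\text{etc.}
\end{aligned} \tag{2.61}
\]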
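The index form of the curl can be evaluated by contracting a numerical gradient with the ε-tensor. The sketch assumes NumPy and central differences; the sample fields and the helper `grad` are hypothetical. It also shows that writing the free indices in the order jm instead of mj yields the transposed array, i. e. a different tensor.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

def grad(f, x, h=1e-5):
    """Right gradient by central differences; the differentiation index is last."""
    parts = []
    for k in range(3):
        step = np.zeros(3)
        step[k] = h
        parts.append((f(x + step) - f(x - step)) / (2 * h))
    return np.stack(parts, axis=-1)

def vector_field(x):
    """Hypothetical rigid-rotation field (-x_2, x_1, 0) in the 1-based labels of the text."""
    return np.array([-x[1], x[0], 0.0])

def tensor_field(x):
    """Hypothetical second-order field whose rows are multiples of vector_field."""
    return np.outer([1.0, 2.0, 3.0], vector_field(x))

x0 = np.array([0.3, -0.7, 1.1])
curl_vec = np.einsum('ik,ijk->j', grad(vector_field, x0), eps)     # (curl a)_j
curl_mj = np.einsum('mik,ijk->mj', grad(tensor_field, x0), eps)    # (curl a)_mj
curl_jm = np.einsum('ijk,mik->jm', eps, grad(tensor_field, x0))    # free indices reversed

print(np.round(curl_vec, 6))
print(np.allclose(curl_jm, curl_mj.T))
```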

Problem 2.16.
A. grad x, B. div x, C. curl x.

Problem 2.17.
For each of the following expressions, derive a tensor analytical identity in index notation and then translate this identity into symbolic notation:
A. div (λ a),   B. curl (λ a),   C. curl grad a,   D. div curl a,   E. curl curl a,
F. $\varepsilon_{ijk}\,\varepsilon_{mnk}\,\dfrac{\partial a_{pj}}{\partial x_i}\,b_m$,   G. $\varepsilon_{ijk}\,\varepsilon_{mnk}\,\dfrac{\partial a_{pj} b_i}{\partial x_m}$.
Hint: The last two problems are slightly more difficult, because they have two free
indices. If this is the case, we have to check, before the translation, if the order of the
free indices matches in all terms.


2.12.6 Laplace Operator

The divergence of a gradient is also called the Laplace operator or the Laplacian and
is denoted by Δ.

Δ := div grad . (2.62)

The Laplace operator of a polar tensor field of order n gives a polar tensor field of
order n, and the Laplace operator of an axial tensor field of order n gives an axial tensor
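field of order n. For example, translating $\operatorname{div}\operatorname{grad}\underline{\underline{a}}$ gives

\[
\frac{\partial}{\partial x_k}\,\frac{\partial a_{ij}}{\partial x_k} = \frac{\partial^2 a_{ij}}{\partial x_k^2},
\]

so for tensors of various order we have the following equations.

\[
\begin{aligned}
\Delta a &= \frac{\partial^2 a}{\partial x_k^2},\\
\Delta \underline{a} &= \frac{\partial^2 a_i}{\partial x_k^2}\,\underline{e}_i,\\
\Delta \underline{\underline{a}} &= \frac{\partial^2 a_{ij}}{\partial x_k^2}\,\underline{e}_i\,\underline{e}_j .
\end{aligned}
\tag{2.63}
\]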

2.13 Rules for Indices and Underlines

We would like to mention that the notation we have introduced leads to rules which
must be satisfied by all members of a well-posed equation, in index notation and in
symbolic notation. While doing computations, it often turns out worthwhile to check
if these rules are satisfied.
An equation in index notation must satisfy the rules from Section 1.1, where we
introduced the summation convention: An index can appear only once (as a free index)
or twice (as a summation index) in a term, and all terms of an equation must match in their free indices.
For an equation in symbolic notation, the number of underlines must match in
all terms of an equation while taking into consideration the effect of the arithmetic
symbols. Each dot reduces the order of the tensor by two, each vector product and
each divergence reduces the order of the tensor by one, and each gradient increases
the order of the tensor by one. The other operations, i. e. the tensor product, the curl,
the differential, and the Laplace operator, do not change the order of the tensor.
Finally, for translating between the two notations the order of the free indices in
index notation must match in all terms before or after the translation.
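The rules for index notation are mechanical enough to be checked by a small program. The sketch below is only an illustration under our own conventions: each term is given as a string containing all its index letters (e. g. "ijj" for a_ij b_j), and the function names are hypothetical. It does not check the order of the free indices, which matters only for the translation into symbolic notation.

```python
from collections import Counter


def free_indices(term):
    """Return the sorted free indices of one term.

    'term' is a string of index letters. An index may occur once (free
    index) or twice (summation index); anything else is ill-formed.
    """
    counts = Counter(term)
    for index, n in counts.items():
        if n > 2:
            raise ValueError(f"index {index!r} occurs {n} times in one term")
    return sorted(index for index, n in counts.items() if n == 1)


def equation_is_consistent(terms):
    """All terms of an equation must carry the same free indices."""
    return len({tuple(free_indices(term)) for term in terms}) == 1


# a_ij b_j = c_i : the free index i matches on both sides
print(equation_is_consistent(["ijj", "i"]))
# a_ij b_j = c_k : the free indices differ
print(equation_is_consistent(["ijj", "k"]))
```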


2.14 Integrals of Tensor Fields

The coordinate of a tensor field (in the following we assume as the simplest case a
polar scalar) is a function of the three spatial coordinates. So we can compute the
following spatial integrals:

∫ a(x, y, z) d x,

∫ a(x, y, z) d y, (2.64)

∫ a(x, y, z) d z,

∫∫ a(x, y, z) d y d z,

∫∫ a(x, y, z) d z d x, (2.65)

∫∫ a(x, y, z) d x d y,

∫∫∫ a(x, y, z) d x d y d z. (2.66)

We will see that the first three integrals can be interpreted as line integrals, the second
three as surface integrals, and the last as a volume integral.

2.14.1 Line Integrals of Tensor Coordinates

1. Clearly, the line element from a point $\underline{x}$ on a curve to a neighboring point $\underline{x} + \mathrm{d}\underline{x}$ on the curve is the differential $\mathrm{d}\underline{x}$ of the position vector at the point $\underline{x}$, so we can combine the three single integrals from (2.64), using (2.52), as the vector integral

\[
\int a(\underline{x})\,\mathrm{d}\underline{x}, \qquad \int a(\underline{x})\,\mathrm{d}x_i,
\tag{2.67}
\]

where the so-defined vector line element $\mathrm{d}\underline{x}$ is, according to (2.51) (and in agreement with our intuition), a polar vector.
(The three spatial coordinates of the argument can be combined as $\underline{x}$, independently of whether the equation is written in symbolic notation or in index notation; i. e. we can do this also in the single integrals of (2.64).) These integrals, where the differential is a vector line element, are called line integrals of the second kind.
If the differential is the magnitude $\mathrm{d}x$ of the vector line element instead of the vector line element itself, we have

\[
\int a(\underline{x})\,\mathrm{d}x .
\tag{2.68}
\]

Such an integral is called a line integral of the first kind.


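2. The domain of integration in a line integral is a segment of a curve in space. If we assign a coordinate u to any point of the curve, we can describe the curve segment by

\[
\begin{aligned}
x &= x(u),\\
y &= y(u),\\
z &= z(u);
\end{aligned}
\qquad u_1 \leqq u \leqq u_2,
\tag{2.69}
\]

we can combine these three equations into vector form:

\[
\underline{x} = \underline{x}(u), \qquad x_i = x_i(u).
\tag{2.70}
\]

This gives for a line element

\[
\mathrm{d}x_i = \frac{\mathrm{d}x_i}{\mathrm{d}u}\,\mathrm{d}u
\tag{2.71}
\]

and for its magnitude

\[
\mathrm{d}x = \sqrt{(\mathrm{d}x_i)^2} = \sqrt{\left(\frac{\mathrm{d}x_i}{\mathrm{d}u}\right)^2}\,\mathrm{d}u .
\tag{2.72}
\]

Thus we obtain for the line integrals (2.67) and (2.68) along the curve segment (2.69)

\[
\int a(\underline{x})\,\mathrm{d}x_i = \int_{u_1}^{u_2} a(\underline{x}(u))\,\frac{\mathrm{d}x_i}{\mathrm{d}u}\,\mathrm{d}u,
\tag{2.73}
\]

\[
\int a(\underline{x})\,\mathrm{d}x = \int_{u_1}^{u_2} a(\underline{x}(u))\,\sqrt{\left(\frac{\mathrm{d}x_i}{\mathrm{d}u}\right)^2}\,\mathrm{d}u .
\tag{2.74}
\]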
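The formulas (2.73) and (2.74) reduce both kinds of line integrals to ordinary integrals over the curve parameter u, which can be approximated numerically. The following is a minimal sketch, assuming a midpoint rule and a curve given together with its derivative; the function name `line_integrals` and the sample helix (radius 1, pitch 0.5, integrand a = 1) are our own illustrative choices.

```python
import math


def line_integrals(a, curve, dcurve, u1, u2, n=2000):
    """Midpoint-rule approximations of (2.73) and (2.74).

    a      -- scalar field, called as a(x) with x a list of three coordinates
    curve  -- u -> [x(u), y(u), z(u)]
    dcurve -- u -> [dx/du, dy/du, dz/du]
    Returns (second_kind, first_kind): a list of three numbers and a number.
    """
    du = (u2 - u1) / n
    second = [0.0, 0.0, 0.0]
    first = 0.0
    for m in range(n):
        u = u1 + (m + 0.5) * du
        value = a(curve(u))
        t = dcurve(u)
        for i in range(3):
            second[i] += value * t[i] * du                          # (2.73)
        first += value * math.sqrt(sum(c * c for c in t)) * du     # (2.74)
    return second, first


R, h = 1.0, 0.5  # radius and pitch of the sample helix


def helix(p):
    return [R * math.cos(p), R * math.sin(p), h * p / (2 * math.pi)]


def dhelix(p):
    return [-R * math.sin(p), R * math.cos(p), h / (2 * math.pi)]


second, first = line_integrals(lambda x: 1.0, helix, dhelix, 0.0, 2 * math.pi)
print(second, first)
```

With a = 1 the integral of the second kind is the vector from the starting point to the end point of the curve segment, and the integral of the first kind is its arc length.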

Problem 2.18.
A point mass with mass m is moving along one turn of a helix around the z-axis with
radius a and slope h. The forces acting on the point mass are the gravitational force
F 1 = − m g ez and the elastic force F 2 = − λ x. Compute the work W = ∫ F ⋅ d x exerted
on the point mass.

Hint: A parametric representation of a helix is

x = a cos φ, y = a sin φ, z = h φ/(2π).


2.14.2 Normal Vector and Surface Vector of a Surface Element

1. A surface element is characterized by its magnitude d A and its normal vector n (the
unit vector perpendicular to the surface element). The product of the normal vector
and the magnitude of the surface element is called the surface vector d A.

d A = n d A. (2.75)

The normal vector n and the surface vector d A are not yet uniquely defined, since
a surface element has two unit vectors (negatives of each other, in other words, ori-
ented in opposite directions) perpendicular to the surface element. We determine the
orientation with a convention, distinguishing between open surfaces (surfaces with a
boundary curve) and closed surfaces (surfaces without a boundary curve); in addition
we restrict ourselves to two-sided16 surfaces.
– We can assign a direction of rotation to an open surface (and so also to its surface
elements) by the way we travel along the boundary curve. Let us agree to define the
normal vector and the surface vector of a surface element by their magnitudes and
by the rotational direction of a boundary curve. Then n and d A are axial vectors. If
an observer travels along the boundary curve and sees the surface on the left side,
then n (and d A) and the direction of rotation are associated according to the right-
hand screw rule. For example, if the surface is the ring between two concentric
circles, then the boundary has two parts (the outer and the inner circle), and while
traveling along each part the surface must be on the same (e. g. left) side of the
boundary curve, so the observer travels in different directions along the two parts.
– A closed surface does not have a boundary curve; instead it has an inner side and
an outer side. We adopt the following convention.

The normal vector and the surface vector of the surface element of a closed
surface always point outward from the enclosed volume.

Here n and d A are polar vectors.

16 A surface is two-sided, if we cannot go from one side to the other side without passing through the
boundary. The most famous example of a one-sided surface is the Möbius strip.


For example, if the enclosed volume is the space between two concentric
spheres, then the closed surface has two parts (the surface of the outer sphere
and the surface of the inner sphere), and the normal vector points, according to
our convention, radially outward from each surface element of the outer sphere
and radially inward from each surface element of the inner sphere.

The magnitude of both a polar vector and an axial vector is a polar scalar, i. e. the area
of a surface element is a polar scalar for open surfaces and for closed surfaces and
thus (in agreement with our intuition) independent of the orientation of the coordinate system.

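2. Consider an infinitesimal tetrahedron, with one face oriented in an arbitrary direction and the other three faces perpendicular to the coordinate axes. The inclined face has the magnitude $\mathrm{d}A$ and the (outwardly oriented) normal vector $\underline{n}$; the lengths of the edges along the coordinate axes are $|\mathrm{d}a_x|$, $|\mathrm{d}a_y|$, and $|\mathrm{d}a_z|$. The value of $\mathrm{d}A$ is half of the area of the parallelogram spanned by the edge vectors $\mathrm{d}\underline{u}$ and $\mathrm{d}\underline{v}$ (see the figure above). According to Section 2.10.1, No. 4 we can write, using the vector product,

\[
\underline{n}\,\mathrm{d}A = \tfrac{1}{2}\,\mathrm{d}\underline{u} \times \mathrm{d}\underline{v}.
\]

We further see from the figure that

\[
\begin{aligned}
\underline{n}\,\mathrm{d}A
&= \tfrac{1}{2}\,\bigl(-|\mathrm{d}a_z|\,\underline{e}_z + |\mathrm{d}a_y|\,\underline{e}_y\bigr) \times \bigl(-|\mathrm{d}a_z|\,\underline{e}_z - |\mathrm{d}a_x|\,\underline{e}_x\bigr)\\
&= \tfrac{1}{2}\,\bigl(|\mathrm{d}a_x|\,|\mathrm{d}a_z|\,\underbrace{\underline{e}_z \times \underline{e}_x}_{\underline{e}_y}
 - |\mathrm{d}a_y|\,|\mathrm{d}a_z|\,\underbrace{\underline{e}_y \times \underline{e}_z}_{\underline{e}_x}
 - |\mathrm{d}a_x|\,|\mathrm{d}a_y|\,\underbrace{\underline{e}_y \times \underline{e}_x}_{-\underline{e}_z}\bigr)\\
&= \tfrac{1}{2}\,\bigl(|\mathrm{d}a_x|\,|\mathrm{d}a_z|\,\underline{e}_y - |\mathrm{d}a_y|\,|\mathrm{d}a_z|\,\underline{e}_x + |\mathrm{d}a_x|\,|\mathrm{d}a_y|\,\underline{e}_z\bigr)\\
&= \underbrace{\tfrac{1}{2}\,|\mathrm{d}a_x|\,|\mathrm{d}a_z|}_{|\mathrm{d}A_y|}\,\underline{e}_y
 - \underbrace{\tfrac{1}{2}\,|\mathrm{d}a_y|\,|\mathrm{d}a_z|}_{|\mathrm{d}A_x|}\,\underline{e}_x
 + \underbrace{\tfrac{1}{2}\,|\mathrm{d}a_x|\,|\mathrm{d}a_y|}_{|\mathrm{d}A_z|}\,\underline{e}_z\\
&= -|\mathrm{d}A_x|\,\underline{e}_x + |\mathrm{d}A_y|\,\underline{e}_y + |\mathrm{d}A_z|\,\underline{e}_z .
\end{aligned}
\tag{a}
\]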

Here |d Ax |, |d Ay |, and |d Az | are the areas of the faces of the tetrahedron perpendicular
to the coordinate axes; geometrically speaking they are the projections of the inclined
surface onto the coordinate planes. The signs differ because the x-component of d A
points in the negative x-direction, while the y-component and the z-component point
in the direction of the positive y-axis and z-axis, respectively. If we compare the last
row of (a) with the general form of the vector d A with respect to the basis ei ,

\[
\mathrm{d}\underline{A} = \mathrm{d}A_x\,\underline{e}_x + \mathrm{d}A_y\,\underline{e}_y + \mathrm{d}A_z\,\underline{e}_z ,
\]

we find the following conclusion.

The absolute values of the coordinates of an arbitrarily oriented surface element are the magnitudes of the projections of this surface element onto the coordinate planes.

3. If we bring in (a) all terms to the left side, we have

n d A + ex |d Ax | − ey |d Ay | − ez |d Az | = 0,

or with nx = ex , ny = −ey , nz = −ez

n d A + nx |d Ax | + ny |d Ay | + nz |d Az | = 0.

The last equation says that the sum of the (outwardly oriented) surface vectors over
the total surface of the tetrahedron is equal to the zero vector. Then, for any closed
surface, the integral over the surface vectors of all surface elements also gives
the zero vector, since we can split the enclosed volume into infinitesimal tetrahedrons,
and the surface vectors on the shared inner faces of two adjacent subvolume elements cancel during integration;
on the other hand, the integral over the magnitudes of all surface elements gives the area A.

\[
\oint \mathrm{d}\underline{A} = \underline{0}, \qquad \oint \mathrm{d}A = A.
\tag{2.76}
\]
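The statement that the outward surface vectors of a tetrahedron add up to the zero vector is easy to check for a concrete tetrahedron. The following sketch uses edge lengths 0.3, 0.5 and 0.2 along the negative x-axis and the positive y- and z-axes, matching the signs in (a); the helper names and the numbers are our own assumptions for illustration.

```python
def cross(a, b):
    """Vector product of two vectors given as lists of three numbers."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def sub(a, b):
    return [p - q for p, q in zip(a, b)]


def face_vector(p, q, r, inside):
    """Outward surface vector of the triangle p, q, r.

    'inside' is any point of the enclosed volume; it fixes the orientation.
    """
    n = [0.5 * c for c in cross(sub(q, p), sub(r, p))]
    if sum(ni * di for ni, di in zip(n, sub(inside, p))) > 0:
        n = [-c for c in n]  # flip a vector that points into the volume
    return n


vertices = [[0.0, 0.0, 0.0], [-0.3, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.2]]
centroid = [sum(v[i] for v in vertices) / 4 for i in range(3)]
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

vectors = []
for f in faces:
    p, q, r = (vertices[i] for i in f)
    vectors.append(face_vector(p, q, r, centroid))

total = [sum(v[i] for v in vectors) for i in range(3)]
print(vectors[3])  # surface vector of the inclined face
print(total)       # sum over the closed surface
```

The surface vector of the inclined face has a negative x-coordinate and positive y- and z-coordinates, as in the last row of (a), and the sum over all four faces vanishes up to rounding.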

2.14.3 Surface Integrals of Tensor Coordinates

1. The double integral ∫∫ a(x, y, z) d y d z of (2.65) is an integration of a over the projection of the surface of integration onto the y, z-plane, i. e. we write ∫ a(x, y, z) d Ax .
The same applies to the other two double integrals in (2.65), so we have

∫ a(x, y, z) d Ax := ± ∫∫ a(x, y, z) d y d z,

∫ a(x, y, z) d Ay := ± ∫∫ a(x, y, z) d z d x, (2.77)

∫ a(x, y, z) d Az := ± ∫∫ a(x, y, z) d x d y,

where the sign is determined by the orientation of the surface element.

We can combine the three double integrals in vector form as follows:

\[
\int a(\underline{x})\,\mathrm{d}\underline{A}, \qquad \int a(\underline{x})\,\mathrm{d}A_i .
\tag{2.78}
\]

Such an integral, where the differential is a vector surface element, is called a surface
integral of the second kind. If the differential is the magnitude of the surface element
instead of the vector surface element we have

\[
\int a(\underline{x})\,\mathrm{d}A,
\tag{2.79}
\]

and such an integral is called a surface integral of the first kind.

2. The domain of integration of a surface integral is a segment of a surface in space. If
we define two coordinates u and v on the surface and if the surface segment is bounded
by two curves v = v1 (u) and v = v2 (u), then the surface segment is described by

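\[
\begin{aligned}
x &= x(u, v),\\
y &= y(u, v),\\
z &= z(u, v),
\end{aligned}
\qquad
\begin{aligned}
u_1 &\leqq u \leqq u_2,\\
v_1(u) &\leqq v \leqq v_2(u),
\end{aligned}
\tag{2.80}
\]

which can be combined as

\[
\underline{x} = \underline{x}(u, v), \qquad x_i = x_i(u, v).
\tag{2.81}
\]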


A surface element of this surface is a parallelogram spanned by two line elements
along the coordinate lines. The surface vector of this surface element is then equal
to the vector product of these two line elements, where the orientation of the surface
vector changes if we change the order of the line elements. From now on we choose
the order of the line elements, i. e. the order of the surface coordinates u and v, such
that the orientation of the surface vector is in agreement with our convention.

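Along a u-line only u changes and v remains constant; i. e. for a line element along a
u-line we have $\mathrm{d}x_i = (\partial x_i/\partial u)\,\mathrm{d}u$. Analogously we have for a line element along a
v-line $\mathrm{d}x_i = (\partial x_i/\partial v)\,\mathrm{d}v$. This gives for the surface vector of a surface element

\[
\mathrm{d}A_i = \varepsilon_{ijk}\,\frac{\partial x_j}{\partial u}\,\frac{\partial x_k}{\partial v}\,\mathrm{d}u\,\mathrm{d}v
\tag{2.82}
\]

and for its coordinates

\[
\mathrm{d}A_x = \mathrm{d}y\,\mathrm{d}z
= \left(\frac{\partial y}{\partial u}\frac{\partial z}{\partial v} - \frac{\partial z}{\partial u}\frac{\partial y}{\partial v}\right)\mathrm{d}u\,\mathrm{d}v
= \begin{vmatrix} \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v}\\[2ex] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} \end{vmatrix}\,\mathrm{d}u\,\mathrm{d}v
= \frac{\partial(y, z)}{\partial(u, v)}\,\mathrm{d}u\,\mathrm{d}v,
\tag{2.83}
\]

\[
\mathrm{d}A_y = \mathrm{d}z\,\mathrm{d}x
= \left(\frac{\partial z}{\partial u}\frac{\partial x}{\partial v} - \frac{\partial x}{\partial u}\frac{\partial z}{\partial v}\right)\mathrm{d}u\,\mathrm{d}v
= \begin{vmatrix} \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v}\\[2ex] \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \end{vmatrix}\,\mathrm{d}u\,\mathrm{d}v
= \frac{\partial(z, x)}{\partial(u, v)}\,\mathrm{d}u\,\mathrm{d}v,
\tag{2.84}
\]

\[
\mathrm{d}A_z = \mathrm{d}x\,\mathrm{d}y
= \left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial y}{\partial u}\frac{\partial x}{\partial v}\right)\mathrm{d}u\,\mathrm{d}v
= \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v}\\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}\,\mathrm{d}u\,\mathrm{d}v
= \frac{\partial(x, y)}{\partial(u, v)}\,\mathrm{d}u\,\mathrm{d}v .
\tag{2.85}
\]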

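Hence we obtain for the surface integral (2.78)

\[
\int a(\underline{x})\,\mathrm{d}A_i
= \int_{u_1}^{u_2} \int_{v_1(u)}^{v_2(u)} a(\underline{x}(u, v))\,\varepsilon_{ijk}\,\frac{\partial x_j}{\partial u}\,\frac{\partial x_k}{\partial v}\,\mathrm{d}u\,\mathrm{d}v .
\tag{2.86}
\]

The magnitude of the surface element is given by

\[
\mathrm{d}A = \sqrt{(\mathrm{d}A_i)^2}
= \sqrt{\varepsilon_{ijk}\,\frac{\partial x_j}{\partial u}\,\frac{\partial x_k}{\partial v}\;\varepsilon_{imn}\,\frac{\partial x_m}{\partial u}\,\frac{\partial x_n}{\partial v}}\;\mathrm{d}u\,\mathrm{d}v .
\]

Applying the Grassmann identity (1.35) gives

\[
\begin{aligned}
\mathrm{d}A = \sqrt{(\mathrm{d}A_i)^2}
&= \sqrt{(\delta_{jm}\,\delta_{kn} - \delta_{jn}\,\delta_{km})\,\frac{\partial x_j}{\partial u}\,\frac{\partial x_k}{\partial v}\,\frac{\partial x_m}{\partial u}\,\frac{\partial x_n}{\partial v}}\;\mathrm{d}u\,\mathrm{d}v\\
&= \sqrt{\left(\frac{\partial x_j}{\partial u}\right)^2 \left(\frac{\partial x_k}{\partial v}\right)^2 - \left(\frac{\partial x_j}{\partial u}\,\frac{\partial x_j}{\partial v}\right)^2}\;\mathrm{d}u\,\mathrm{d}v .
\end{aligned}
\]

With the commonly used notations

\[
\begin{aligned}
E &:= \left(\frac{\partial x_j}{\partial u}\right)^2 = \left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2 + \left(\frac{\partial z}{\partial u}\right)^2,\\
F &:= \frac{\partial x_j}{\partial u}\,\frac{\partial x_j}{\partial v} = \frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v} + \frac{\partial z}{\partial u}\frac{\partial z}{\partial v},\\
G &:= \left(\frac{\partial x_j}{\partial v}\right)^2 = \left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2 + \left(\frac{\partial z}{\partial v}\right)^2,
\end{aligned}
\tag{2.87}
\]

we obtain for the surface integral (2.79)

\[
\int a(\underline{x})\,\mathrm{d}A
= \int_{u_1}^{u_2} \int_{v_1(u)}^{v_2(u)} a(\underline{x}(u, v))\,\sqrt{EG - F^2}\;\mathrm{d}u\,\mathrm{d}v .
\tag{2.88}
\]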
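The relation between (2.82) and (2.88) can be illustrated numerically: the magnitude of the surface vector must agree with √(EG − F²). The sketch below is only an example under our own assumptions; it uses the lateral surface of a cone, x = u cos v, y = u sin v, z = u with 0 ≦ u ≦ 1 and 0 ≦ v ≦ 2π, a midpoint rule, and hypothetical function names.

```python
import math


def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def surface_element(xu, xv):
    """Return dA_i / (du dv) from (2.82) and sqrt(E G - F^2) from (2.87)."""
    dA = [sum(eps(i, j, k) * xu[j] * xv[k]
              for j in range(3) for k in range(3))
          for i in range(3)]
    E = sum(c * c for c in xu)
    G = sum(c * c for c in xv)
    F = sum(p * q for p, q in zip(xu, xv))
    return dA, math.sqrt(E * G - F * F)


def cone_tangents(u, v):
    """Tangent vectors dx/du and dx/dv of x = u cos v, y = u sin v, z = u."""
    xu = [math.cos(v), math.sin(v), 1.0]
    xv = [-u * math.sin(v), u * math.cos(v), 0.0]
    return xu, xv


def cone_area(n=50):
    """Surface integral of the first kind (2.88) with integrand a = 1."""
    du, dv = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for p in range(n):
        for q in range(n):
            xu, xv = cone_tangents((p + 0.5) * du, (q + 0.5) * dv)
            total += surface_element(xu, xv)[1] * du * dv
    return total


dA, root = surface_element(*cone_tangents(0.7, 1.1))
area = cone_area()
print(math.sqrt(sum(c * c for c in dA)), root, area)
```

For this cone the elementary formula for the lateral area gives π√2, which the numerical value of the integral should reproduce.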


Problem 2.19.
Consider a hemispherical shell exposed to the internal pressure p = ϱ g(H − z). Com-
pute the vertical force Fz = ∫ p d Az acting on the hemispherical shell.

Hint: A parametric representation of a hemispherical shell is

x = R sin u cos v, y = R sin u sin v, z = R cos u.

2.14.4 Volume Integrals of Tensor Coordinates

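1. Consider a volume element spanned by the three line elements $\mathrm{d}\underline{x}_1$, $\mathrm{d}\underline{x}_2$, and $\mathrm{d}\underline{x}_3$ pointing in the positive coordinate directions of a Cartesian coordinate system. For the corresponding triple product we have according to (2.52), independently of the orientation of the coordinate system,

\[
\mathrm{d}V = [\mathrm{d}\underline{x}_1, \mathrm{d}\underline{x}_2, \mathrm{d}\underline{x}_3]
= \mathrm{d}\underline{x}_1 \times \mathrm{d}\underline{x}_2 \cdot \mathrm{d}\underline{x}_3
= \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z\;\underbrace{\underline{e}_x \times \underline{e}_y \cdot \underline{e}_z}_{1}
= \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z,
\]

i. e. the domain of integration in the triple integral (2.66) is a volume, i. e.

\[
\int a(\underline{x})\,\mathrm{d}V := \iiint a(x, y, z)\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z .
\tag{2.89}
\]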

The volume element d V is a triple product of three polar vectors and so, according to
Section 2.10.3, No. 1, an axial scalar, i. e. it changes its sign under a transformation to a
coordinate system with a different orientation according to (2.18). However, under the
reflection the lower and the upper surface are swapped, so the sign changes again,
and thus the volume is (in agreement with our intuition) a polar scalar.

2. If the volume of integration is bounded by two surfaces z = z1 (x, y) and z = z2 (x, y) which intersect in the two curves y = y1 (x) and y = y2 (x), then we get by taking the integration limits into account

\[
\int a(\underline{x})\,\mathrm{d}V
= \int_{x_1}^{x_2} \int_{y_1(x)}^{y_2(x)} \int_{z_1(x, y)}^{z_2(x, y)} a(\underline{x})\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z .
\tag{2.90}
\]

Sometimes the integration limits are conveniently described in a different coordinate system u, v, w. The volume of integration reads in these coordinates

\[
\begin{aligned}
x &= x(u, v, w), & u_1 &\leqq u \leqq u_2,\\
y &= y(u, v, w), & v_1(u) &\leqq v \leqq v_2(u),\\
z &= z(u, v, w), & w_1(u, v) &\leqq w \leqq w_2(u, v).
\end{aligned}
\tag{2.91}
\]

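Similarly as before, the volume element is spanned in these coordinates by the three
line elements pointing in positive direction of the coordinate lines and its value is the
absolute value of the corresponding triple product. For a line element in u-direction we
have $\mathrm{d}x_i = (\partial x_i/\partial u)\,\mathrm{d}u$, for a line element in v-direction and in w-direction we have
accordingly $\mathrm{d}x_i = (\partial x_i/\partial v)\,\mathrm{d}v$ and $\mathrm{d}x_i = (\partial x_i/\partial w)\,\mathrm{d}w$. So we have for the volume
element in the coordinates u, v, w

\[
\mathrm{d}V = \left|\,\varepsilon_{ijk}\,\frac{\partial x_i}{\partial u}\,\frac{\partial x_j}{\partial v}\,\frac{\partial x_k}{\partial w}\,\right|\,\mathrm{d}u\,\mathrm{d}v\,\mathrm{d}w .
\tag{2.92}
\]

Or, with (2.45),

\[
\mathrm{d}V = \left|\;\begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w}\\[2ex]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w}\\[2ex]
\dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w}
\end{vmatrix}\;\right|\,\mathrm{d}u\,\mathrm{d}v\,\mathrm{d}w
= \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right|\,\mathrm{d}u\,\mathrm{d}v\,\mathrm{d}w .
\]

Thus we obtain for the volume integral in the coordinates u, v, w

\[
\int a(\underline{x})\,\mathrm{d}V
= \int_{u_1}^{u_2} \int_{v_1(u)}^{v_2(u)} \int_{w_1(u, v)}^{w_2(u, v)} a(\underline{x}(u, v, w))\,\left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right|\,\mathrm{d}u\,\mathrm{d}v\,\mathrm{d}w .
\tag{2.93}
\]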
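As an illustration of (2.92) and (2.93), the volume of the unit ball can be computed in spherical coordinates. This is only a sketch under our own assumptions: the coordinates are x = u sin v cos w, y = u sin v sin w, z = u cos v with 0 ≦ u ≦ 1, 0 ≦ v ≦ π, 0 ≦ w ≦ 2π, the integral is approximated by a coarse midpoint rule, and all function names are hypothetical.

```python
import math


def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def volume_factor(xu, xv, xw):
    """|eps_ijk dx_i/du dx_j/dv dx_k/dw|, the factor of du dv dw in (2.92)."""
    return abs(sum(eps(i, j, k) * xu[i] * xv[j] * xw[k]
                   for i in range(3) for j in range(3) for k in range(3)))


def spherical_tangents(u, v, w):
    """Partial derivatives of the position vector in spherical coordinates."""
    sv, cv, sw, cw = math.sin(v), math.cos(v), math.sin(w), math.cos(w)
    xu = [sv * cw, sv * sw, cv]
    xv = [u * cv * cw, u * cv * sw, -u * sv]
    xw = [-u * sv * sw, u * sv * cw, 0.0]
    return xu, xv, xw


def volume_integral(a, n=20):
    """Midpoint-rule approximation of (2.93) over the unit ball."""
    du, dv, dw = 1.0 / n, math.pi / n, 2 * math.pi / n
    total = 0.0
    for p in range(n):
        u = (p + 0.5) * du
        for q in range(n):
            v = (q + 0.5) * dv
            for r in range(n):
                w = (r + 0.5) * dw
                x = [u * math.sin(v) * math.cos(w),
                     u * math.sin(v) * math.sin(w),
                     u * math.cos(v)]
                factor = volume_factor(*spherical_tangents(u, v, w))
                total += a(x) * factor * du * dv * dw
    return total


ball_volume = volume_integral(lambda x: 1.0)
print(ball_volume, 4 * math.pi / 3)
```

The triple product evaluates to u² sin v here, so the approximation should approach 4π/3 as the grid is refined.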


2.14.5 Integrals of Higher-Order Tensor Fields

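1. In the formulas above we assumed for simplicity that the integrand is a scalar field.
However, we can easily substitute $a_{i \ldots j}(x, y, z)$ for $a(x, y, z)$ in these formulas; also
the coordinates of a tensor field of higher order are only functions of the three spatial coordinates.
For example, if the integrand is a second-order tensor we have the following types
of integrals.

\[
\begin{aligned}
\int \underline{\underline{a}}\,\mathrm{d}x &= \underline{\underline{A}}, & \int a_{ij}\,\mathrm{d}x &= A_{ij},\\
\int \underline{\underline{a}}\,\mathrm{d}\underline{x} &= \underline{\underline{\underline{B}}}, & \int a_{ij}\,\mathrm{d}x_k &= B_{ijk},\\
\int \underline{\underline{a}}\,\mathrm{d}A &= \underline{\underline{C}}, & \int a_{ij}\,\mathrm{d}A &= C_{ij},\\
\int \underline{\underline{a}}\,\mathrm{d}\underline{A} &= \underline{\underline{\underline{D}}}, & \int a_{ij}\,\mathrm{d}A_k &= D_{ijk},\\
\int \underline{\underline{a}}\,\mathrm{d}V &= \underline{\underline{E}}, & \int a_{ij}\,\mathrm{d}V &= E_{ij}.
\end{aligned}
\tag{2.94}
\]

Here the products of the integrand and the differential are tensor products; we can
clearly replace them in integrals of the second kind by scalar products or vector products,
and then we get e. g. the formulas

\[
\begin{aligned}
\int \underline{\underline{a}} \cdot \mathrm{d}\underline{x} &= \underline{F}, & \int a_{ij}\,\mathrm{d}x_j &= F_i,\\
\int \underline{\underline{a}} \times \mathrm{d}\underline{x} &= \underline{\underline{G}}, & \int a_{ij}\,\varepsilon_{klj}\,\mathrm{d}x_k &= G_{il}.
\end{aligned}
\tag{2.95}
\]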

2. A line integral over a closed curve (the boundary curve of a surface) or a surface
integral over a closed surface (the boundary surface of a volume) are also called a
closed line integral and a closed surface integral, respectively, and we denote such
integrals by ∮.

3. If we integrate a tensor over a given spatial domain (e. g. a given curve between
two points P and Q), then the result, as for any definite integral, does not depend any
more on the integration variables. So integration of a tensor field gives a tensor which
is independent of the position.


2.15 Gauss Theorem and Stokes Theorem

2.15.1 Gauss Theorem

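1. Consider the Cartesian coordinates $a_{i \ldots j}$ of a tensor field with the following properties:
– the $a_{i \ldots j}$ are continuously differentiable in a given volume,
– the closed surface of this volume is at least piecewise continuously differentiable,
– if the closed surface (as e. g. for the volume enclosed by the two concentric
spheres) has several parts, then the integration is performed over all properly
oriented subsurfaces.

Then the Gauss theorem holds.

\[
\begin{aligned}
\int \operatorname{grad} a\,\mathrm{d}V &= \oint a\,\mathrm{d}\underline{A}, & \int \frac{\partial a}{\partial x_k}\,\mathrm{d}V &= \oint a\,\mathrm{d}A_k,\\
\int \operatorname{grad}\underline{a}\,\mathrm{d}V &= \oint \underline{a}\,\mathrm{d}\underline{A}, & \int \frac{\partial a_j}{\partial x_k}\,\mathrm{d}V &= \oint a_j\,\mathrm{d}A_k,\\
\int \operatorname{grad}\underline{\underline{a}}\,\mathrm{d}V &= \oint \underline{\underline{a}}\,\mathrm{d}\underline{A}, & \int \frac{\partial a_{ij}}{\partial x_k}\,\mathrm{d}V &= \oint a_{ij}\,\mathrm{d}A_k,\\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.96}
\]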

Here d A is, according to Section 2.14.2, a polar vector.

2. In these equations we can replace the tensor product on the right side with a scalar
product, and thus the gradient on the left side with a divergence, if we multiply the
equations in index notation (starting with the second one) with δjk .

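\[
\begin{aligned}
\int \operatorname{div}\underline{a}\,\mathrm{d}V &= \oint \underline{a} \cdot \mathrm{d}\underline{A}, & \int \frac{\partial a_k}{\partial x_k}\,\mathrm{d}V &= \oint a_k\,\mathrm{d}A_k,\\
\int \operatorname{div}\underline{\underline{a}}\,\mathrm{d}V &= \oint \underline{\underline{a}} \cdot \mathrm{d}\underline{A}, & \int \frac{\partial a_{ik}}{\partial x_k}\,\mathrm{d}V &= \oint a_{ik}\,\mathrm{d}A_k,\\
&\text{etc.} & &\text{etc.}
\end{aligned}
\tag{2.97}
\]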
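The first equation of (2.97) can be checked numerically for a simple volume. The following sketch is only an illustration under our own assumptions: the volume is the unit cube, the sample field is a = (x y, y z, z x), the divergence is approximated by central differences, and both integrals by a midpoint rule; all names are hypothetical.

```python
def a(x, y, z):
    # sample vector field a = (x y, y z, z x)
    return (x * y, y * z, z * x)


def divergence(f, x, y, z, h=1e-5):
    """Central-difference approximation of d a_k / d x_k."""
    return ((f(x + h, y, z)[0] - f(x - h, y, z)[0])
            + (f(x, y + h, z)[1] - f(x, y - h, z)[1])
            + (f(x, y, z + h)[2] - f(x, y, z - h)[2])) / (2 * h)


def gauss_both_sides(f, n=10):
    """Return (volume integral of div a, closed surface integral of a_k dA_k)
    for the unit cube 0 <= x, y, z <= 1."""
    d = 1.0 / n
    mids = [(m + 0.5) * d for m in range(n)]
    volume_side = sum(divergence(f, x, y, z)
                      for x in mids for y in mids for z in mids) * d ** 3
    surface_side = 0.0
    for p in mids:
        for q in mids:
            # outward surface vectors: +e_k on the face at 1, -e_k at 0
            surface_side += (f(1.0, p, q)[0] - f(0.0, p, q)[0]) * d ** 2
            surface_side += (f(p, 1.0, q)[1] - f(p, 0.0, q)[1]) * d ** 2
            surface_side += (f(p, q, 1.0)[2] - f(p, q, 0.0)[2]) * d ** 2
    return volume_side, surface_side


volume_side, surface_side = gauss_both_sides(a)
print(volume_side, surface_side)
```

For this field div a = x + y + z, so both sides should be close to 3/2.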


3. We can also replace the tensor product on the right side with a vector product, and
thus the gradient on the left side with a curl, if we multiply (2.96) in index notation
(starting with the second one) with εjpk .

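\[
\begin{aligned}
\int \operatorname{curl}\underline{a}\,\mathrm{d}V &= \oint \underline{a} \otimes \mathrm{d}\underline{A} = -\oint \underline{a} \times \mathrm{d}\underline{A},
& \int \frac{\partial a_j}{\partial x_k}\,\varepsilon_{jpk}\,\mathrm{d}V &= \oint a_j\,\varepsilon_{jpk}\,\mathrm{d}A_k,\\
\int \operatorname{curl}\underline{\underline{a}}\,\mathrm{d}V &= \oint \underline{\underline{a}} \otimes \mathrm{d}\underline{A} = -\oint \underline{\underline{a}} \times \mathrm{d}\underline{A},
& \int \frac{\partial a_{ij}}{\partial x_k}\,\varepsilon_{jpk}\,\mathrm{d}V &= \oint a_{ij}\,\varepsilon_{jpk}\,\mathrm{d}A_k .
\end{aligned}
\tag{2.98}
\]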


The equations (2.97) and (2.98) are also versions of the Gauss theorem.

4. When we stated the Gauss theorem we used right derivatives. It is easy to see that
if we use left derivatives, then in symbolic notation we need to change the order of the
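factors and replace ⊗ by ×. This gives e. g. for the second equation of (2.98)

\[
\int \mathrm{d}V\,\operatorname{curl}_L\,\underline{\underline{a}} = \oint \mathrm{d}\underline{A} \times \underline{\underline{a}} .
\]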

5. We first prove the basic version of the Gauss theorem, i. e. the equation

\[
\int \frac{\partial a}{\partial x_k}\,\mathrm{d}V = \oint a\,\mathrm{d}A_k .
\]

Here we assume that a(x, y, z) is a scalar field, which is continuously differentiable with respect to the spatial coordinates.
Let us consider for the moment only the z-coordinate of this equation,

\[
\int \frac{\partial a}{\partial z}\,\mathrm{d}V = \oint a\,\mathrm{d}A_z ,
\]

and let us assume a volume of integration such that the projection of its surface onto
the x, y-plane splits into two parts z = z1 (x, y) and z = z2 (x, y), where the two
parts intersect in a curve C, and let us further assume that to each value pair (x, y)
of the projected surface corresponds exactly one point on z1 and one point on z2 . In
addition we assume that the functions z1 and z2 are continuous and at least piecewise
continuously differentiable. Then we have

\[
\begin{aligned}
\int \frac{\partial a}{\partial z}\,\mathrm{d}V
&= \int_{x_1}^{x_2} \mathrm{d}x \int_{y_1(x)}^{y_2(x)} \mathrm{d}y \int_{z_1(x, y)}^{z_2(x, y)} \frac{\partial a}{\partial z}\,\mathrm{d}z\\
&= \int_{x_1}^{x_2} \mathrm{d}x \int_{y_1(x)}^{y_2(x)} \mathrm{d}y\,\bigl[a(x, y, z_2(x, y)) - a(x, y, z_1(x, y))\bigr].
\end{aligned}
\]

We can split the last integral into two parts and interpret the first as an integral of a
over the surface z = z2 (x, y) and the second as an integral of a over the surface z =
z1 (x, y), i. e.

\[
\int \frac{\partial a}{\partial z}\,\mathrm{d}V = \int_{z_2(x, y)} a\,\mathrm{d}A_z - \int_{z_1(x, y)} a\,\mathrm{d}A_z ,
\]

where the two surfaces are oriented so that the z-coordinates of all surface vectors
point into the positive z-direction. To have the surface vector oriented outward on the
closed surface, as we agreed on, we need to change the sign of the integral over z1 ,
which gives

\[
\int \frac{\partial a}{\partial z}\,\mathrm{d}V = \oint a\,\mathrm{d}A_z ,
\]

and this is what we wanted to prove.
Now we extend the proof to volumes in which the two surface parts z1 and z2 do
not intersect along a curve C but are connected by a cylindrical shell whose generating
lines are parallel to the z-axis. Such a cylinder does not contribute to the closed surface
integral ∮ a d Az because everywhere on the shell we have d Az = 0; so we conclude
that the basic version of the Gauss theorem is also true for these volumes.
We finally extend the proof to volumes with arbitrarily shaped closed surfaces,
e. g. multiply connected volumes and volumes with inner closed surfaces, as long as
all parts of the closed surfaces are continuous and at least piecewise continuously
differentiable. We can always split these volumes into volumes of the type which we
already handled earlier, i. e. for which the Gauss theorem holds.
Adding the Gauss theorems for the subvolumes on the left side gives the volume
integral over the total volume. When we add the surface integrals on the right side,
the parts for the shared faces of adjacent subvolumes cancel out, because the surface
vectors of a subvolume are always oriented outward, so for adjacent subvolumes they
point in opposite directions; only the surface integral over the outer surface remains,
including possibly existing inner parts of the closed surface, where the surface vector
is also oriented outward of the volume.
The proof of the basic version of the Gauss theorem for the other two coordinates
is similar. To obtain the remaining versions of the Gauss theorem we replace in the
basic version the scalar field a(x, y, z) with the coordinates of a tensor field of higher
order or perform further algebraic operations.
The only difference between the coordinates of a tensor field of any order and a
scalar field is how they transform under a change to another coordinate system. Since
we did not make use of the invariance of the scalar tensor field with respect to a coor-
dinate transformation in our proof, this proof also holds for the coordinates of a tensor
field of higher order.

2.15.2 Stokes Theorem

1. Consider the Cartesian coordinates $a_{i \ldots j}$ of a tensor field with the following properties:
– the $a_{i \ldots j}$ are continuously differentiable on a given two-sided surface,
– the boundary curve of this surface is at least piecewise continuously differentiable, and
– if the boundary curve (as e. g. for the surface between two concentric circles) has
several parts, then the closed line integral is performed over all properly oriented parts.

Then the Stokes theorem holds.


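\[
\begin{aligned}
\int \operatorname{grad} a \otimes \mathrm{d}\underline{A} &= -\int \operatorname{grad} a \times \mathrm{d}\underline{A} = \oint a\,\mathrm{d}\underline{x},
& \int \frac{\partial a}{\partial x_i}\,\varepsilon_{ijk}\,\mathrm{d}A_k &= \oint a\,\mathrm{d}x_j,\\
\int \operatorname{grad}\underline{a} \otimes \mathrm{d}\underline{A} &= -\int \operatorname{grad}\underline{a} \times \mathrm{d}\underline{A} = \oint \underline{a}\,\mathrm{d}\underline{x},
& \int \frac{\partial a_n}{\partial x_i}\,\varepsilon_{ijk}\,\mathrm{d}A_k &= \oint a_n\,\mathrm{d}x_j,\\
\int \operatorname{grad}\underline{\underline{a}} \otimes \mathrm{d}\underline{A} &= -\int \operatorname{grad}\underline{\underline{a}} \times \mathrm{d}\underline{A} = \oint \underline{\underline{a}}\,\mathrm{d}\underline{x},
& \int \frac{\partial a_{mn}}{\partial x_i}\,\varepsilon_{ijk}\,\mathrm{d}A_k &= \oint a_{mn}\,\mathrm{d}x_j .
\end{aligned}
\tag{2.99}
\]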

Here d A is, according to Section 2.14.2, an axial vector.

2. If we replace the tensor product on the right side with a scalar product, by multiply-
ing the equations in index notation (starting with the second one) with δnj , we obtain
the following equations.

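\[
\begin{aligned}
\int \operatorname{curl}\underline{a} \cdot \mathrm{d}\underline{A} &= \oint \underline{a} \cdot \mathrm{d}\underline{x},
& \int \frac{\partial a_j}{\partial x_i}\,\varepsilon_{jki}\,\mathrm{d}A_k &= \oint a_j\,\mathrm{d}x_j,\\
\int \operatorname{curl}\underline{\underline{a}} \cdot \mathrm{d}\underline{A} &= \oint \underline{\underline{a}} \cdot \mathrm{d}\underline{x},
& \int \frac{\partial a_{mj}}{\partial x_i}\,\varepsilon_{jki}\,\mathrm{d}A_k &= \oint a_{mj}\,\mathrm{d}x_j .
\end{aligned}
\tag{2.100}
\]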
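The first equation of (2.100) can be checked numerically on a curved surface. The sketch below is only an illustration under our own assumptions: the surface is the paraboloid cap x = u cos v, y = u sin v, z = 1 − u² with 0 ≦ u ≦ 1 and 0 ≦ v ≦ 2π, whose boundary is the unit circle in the plane z = 0, the sample field is a = (z, x, y), and all function names are hypothetical. The order (u, v) of the surface coordinates gives a surface vector with positive z-coordinate, which matches a counterclockwise boundary according to the right-hand screw rule.

```python
import math


def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def a(x):
    # sample vector field a = (z, x, y)
    return [x[2], x[0], x[1]]


def curl(f, x, h=1e-5):
    """(curl a)_k = (d a_j / d x_i) eps_jki, via central differences."""
    da = []  # da[i][j] = d a_j / d x_i
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        da.append([(p - m) / (2 * h) for p, m in zip(f(xp), f(xm))])
    return [sum(da[i][j] * eps(j, k, i) for i in range(3) for j in range(3))
            for k in range(3)]


def surface_side(f, n=40):
    """Integral of curl a . dA over the paraboloid cap, with dA from (2.82)."""
    du, dv = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for p in range(n):
        u = (p + 0.5) * du
        for q in range(n):
            v = (q + 0.5) * dv
            x = [u * math.cos(v), u * math.sin(v), 1.0 - u * u]
            xu = [math.cos(v), math.sin(v), -2.0 * u]
            xv = [-u * math.sin(v), u * math.cos(v), 0.0]
            dA = [sum(eps(i, j, k) * xu[j] * xv[k]
                      for j in range(3) for k in range(3))
                  for i in range(3)]
            c = curl(f, x)
            total += sum(c[k] * dA[k] for k in range(3)) * du * dv
    return total


def line_side(f, n=400):
    """Closed line integral of a . dx along the unit circle in z = 0."""
    dv = 2 * math.pi / n
    total = 0.0
    for q in range(n):
        v = (q + 0.5) * dv
        x = [math.cos(v), math.sin(v), 0.0]
        dx = [-math.sin(v), math.cos(v), 0.0]
        total += sum(fi * di for fi, di in zip(f(x), dx)) * dv
    return total


left, right = surface_side(a), line_side(a)
print(left, right)
```

For this field the curl is (1, 1, 1), and both sides should be close to π.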


3. We can also replace in (2.99) the tensor product on the right side with a vector
product, if we multiply the equations in index notation (starting with the second one)
with εnpj = −εnjp . This gives e. g. for the second equation

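\[
\int \frac{\partial a_n}{\partial x_i}\,\varepsilon_{ijk}\,\mathrm{d}A_k\,(-\varepsilon_{njp}) = \oint a_n\,\varepsilon_{npj}\,\mathrm{d}x_j .
\]

However, if we want to avoid the uncommon notation with the double scalar product
for the ε-tensor

\[
\int (\operatorname{grad}\underline{a} \times \mathrm{d}\underline{A}) \cdot\!\cdot\; \underline{\underline{\underline{\varepsilon}}} = -\oint \underline{a} \times \mathrm{d}\underline{x},
\]

then the left side cannot be translated into symbolic notation.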

4. We stated the Stokes theorem using right derivatives. If we prefer to use left deriva-
tives, we again need to change the order of the factors and replace ⊗ by ×. Then we
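have e. g. for the second equation of (2.99)

\[
\int \mathrm{d}\underline{A} \times \operatorname{grad}_L\,\underline{a} = \oint \mathrm{d}\underline{x}\,\underline{a} .
\]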

We see from all these equations that left derivatives lead to symbolic equations with
the commonly used vector product × and right derivatives lead to symbolic equations
with the uncommon vector product ⊗.

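5. We again start with proving the basic version of the Stokes theorem, i. e.

\[
\int \frac{\partial a}{\partial x_i}\,\varepsilon_{ijk}\,\mathrm{d}A_k = \oint a\,\mathrm{d}x_j ,
\]

where, as before, we assume that a(x, y, z) is a continuously differentiable scalar field.

We define two coordinates u and v on the surface of integration and then we have,
according to (2.82),

\[
\begin{aligned}
I_j := \int \frac{\partial a}{\partial x_i}\,\varepsilon_{ijk}\,\mathrm{d}A_k
&= \iint \frac{\partial a}{\partial x_i}\,\varepsilon_{ijk}\,\varepsilon_{kmn}\,\frac{\partial x_m}{\partial u}\,\frac{\partial x_n}{\partial v}\,\mathrm{d}u\,\mathrm{d}v\\
&= \iint \frac{\partial a}{\partial x_i}\,\varepsilon_{kij}\,\varepsilon_{kmn}\,\frac{\partial x_m}{\partial u}\,\frac{\partial x_n}{\partial v}\,\mathrm{d}u\,\mathrm{d}v\\
&= \iint (\delta_{im}\,\delta_{jn} - \delta_{in}\,\delta_{jm})\,\frac{\partial a}{\partial x_i}\,\frac{\partial x_m}{\partial u}\,\frac{\partial x_n}{\partial v}\,\mathrm{d}u\,\mathrm{d}v\\
&= \iint \left(\frac{\partial a}{\partial x_i}\,\frac{\partial x_i}{\partial u}\,\frac{\partial x_j}{\partial v} - \frac{\partial a}{\partial x_i}\,\frac{\partial x_j}{\partial u}\,\frac{\partial x_i}{\partial v}\right)\mathrm{d}u\,\mathrm{d}v\\
&= \iint \left(\frac{\partial a}{\partial u}\,\frac{\partial x_j}{\partial v} - \frac{\partial a}{\partial v}\,\frac{\partial x_j}{\partial u}\right)\mathrm{d}u\,\mathrm{d}v .
\end{aligned}
\]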

Now we consider the x-coordinate in the basic version of the theorem, i. e. we set the
only free index j to 1, and we choose u = x, v = y, i. e. the surface of integration is
given as z = z(x, y). Let us consider for the moment only surfaces whose projection
onto the x, y-plane is a one-to-one image of the surface, whose boundary intersects
with the coordinate lines x = const in at most two points, and let the surface be sufficiently
smooth so that z = z(x, y) is continuous and at least piecewise continuously
differentiable. Then we have with

\[
\frac{\partial x_1}{\partial v} = \frac{\partial x}{\partial y} = 0
\qquad\text{and}\qquad
\frac{\partial x_1}{\partial u} = \frac{\partial x}{\partial x} = 1
\]

\[
I_1 = -\int_{x_1}^{x_2} \mathrm{d}x \int_{y_1(x)}^{y_2(x)} \frac{\partial a}{\partial y}\,\mathrm{d}y
= -\int_{x_1}^{x_2} \mathrm{d}x\,\bigl[a(x, y_2(x)) - a(x, y_1(x))\bigr].
\]

The last integral has two parts. The first can be interpreted as an integral of a along
the curve y = y1 (x) and the second as an integral of a along the curve y = y2 (x), so
we have

\[
I_1 = \int_{y_1(x)} \mathrm{d}x\,a - \int_{y_2(x)} \mathrm{d}x\,a .
\]

Here both parts of the curve are oriented so that a traveler on the curves moves towards
increasing x. We choose the orientation of the boundary curve as shown in the
figure on the last page; then, if we use a right-handed system (as in the figure), the
z-component of the surface vector points in the positive z-direction. To make sure that
both parts of the curve have this orientation, we need to change the sign of the integral
over y2 , which gives

\[
I_1 = \int \frac{\partial a}{\partial x_i}\,\varepsilon_{i1k}\,\mathrm{d}A_k = \oint a\,\mathrm{d}x ,
\]

which is what we wanted to prove. It is easy to see that this equation also holds, if we
change the orientation of the boundary curve or if we use a left-handed system.
Similarly as for the Gauss theorem we now extend the proof to surfaces where the
boundary of the projected surface of integration intersects with the coordinate lines
x = const not only in two points x1 and x2 , but is partially tangent to these coordinate
lines. Such a part of the boundary does not contribute to the line integral, since we
have d x = 0 everywhere, so we conclude that the equation we want to prove also
holds for those surfaces.
Finally, we can extend the proof to surfaces of any shape, including multiply con-
nected two-sided surfaces, as long as all parts of the boundary are continuous and
at least piecewise continuously differentiable. We can always cut these surfaces into
parts of the type of surfaces for which we have already shown that the Stokes theorem
holds. When we add the Stokes theorems for these subsurfaces, we obtain on the left
side the surface integral over the total surface, and in the closed line integrals on the
right side adjacent curves are each traveled through twice in opposite directions, so
that the shared parts of the boundary cancel out and only the closed line integral over
the outer boundary of the surface remains. For example, in the case of a doubly con-
nected surface with an outer and an inner boundary, both boundary pieces must be
oriented so that a traveler along the curve sees the surface in both cases on the same
side (e. g. on the left side) of the curve. (It is easy to see that if we split a one-sided
surface, like the Möbius strip, the cutting curve is traveled through twice in the same direction.)
Again, since the three Cartesian coordinates are equivalent, the proofs for the
other two coordinates of the first version of the Stokes theorem are analogous. Using
the same argument as for the Gauss theorem, we can replace the scalar field a(x, y, z)
with the coordinates of a higher-order tensor field, and hence extend the proof to the
other versions of the Stokes theorem.

3 Algebra of Second-Order Tensors
The coordinates of a second-order tensor clearly form a square matrix, and many of
the concepts and results transfer from matrices to second-order tensors. One example
is the transpose operation: the definition of a transposed tensor (2.23) corresponds
to the definition of a transposed matrix (1.46) and, similarly, the equations (1.48) and
(2.35) for the transpose of a product of matrices and of the scalar product of tensors
correspond to each other.

3.1 Additive Decomposition of a Tensor

1. We have already seen a (unique) decomposition of a tensor, namely the decomposition (2.26) into its symmetric part $a_{(ij)}$ and its antimetric part $a_{[ij]}$:

\[
a_{ij} = \underbrace{\tfrac{1}{2}\,(a_{ij} + a_{ji})}_{a_{(ij)}} + \underbrace{\tfrac{1}{2}\,(a_{ij} - a_{ji})}_{a_{[ij]}} .
\tag{3.1}
\]

Sometimes this decomposition is called Cartesian decomposition, because it has the
form of a complex number with Cartesian coordinates in the complex plane (or Argand
plane) where the transposed tensor corresponds to the complex conjugate number, the
symmetric part to the real part, and the antimetric part to the imaginary part.
2. There is another unique additive decomposition of a tensor, namely the decomposition into an isotropic tensor and a deviator; a deviator is a tensor whose trace is zero.
We write this decomposition as

\[
a_{ij} = \hat{a}\,\delta_{ij} + \mathring{a}_{ij} .
\tag{3.2}
\]

By deriving unique rules to compute $\hat{a}$ and $\mathring{a}_{ij}$ from the coordinates $a_{ij}$ of an arbitrary
tensor, we can show that a unique decomposition is always possible. For this, we compute
the trace of (3.2). Since $\mathring{a}_{ii}$ is zero by definition, it follows that $a_{ii} = 3\,\hat{a}$ or

\[
\hat{a} = \tfrac{1}{3}\,a_{ii} ,
\tag{3.3}
\]

and when we substitute this into (3.2) and solve for $\mathring{a}_{ij}$ we get

\[
\mathring{a}_{ij} = a_{ij} - \tfrac{1}{3}\,a_{kk}\,\delta_{ij} .
\tag{3.4}
\]
3. These two decompositions are related as follows:
$$ a_{ij} = \hat{a}\,\delta_{ij} + \underbrace{(a_{(ij)} - \hat{a}\,\delta_{ij}) + a_{[ij]}}_{\mathring{a}_{ij}}. \tag{3.5} $$


Using the transformation laws,1 it is easy to see that each part of this decomposition,
i. e. the isotropic part $\hat{a}\,\delta_{ij}$, the symmetric part of the deviator
$\mathring{a}_{(ij)} := a_{(ij)} - \hat{a}\,\delta_{ij}$, the antimetric part $a_{[ij]}$, the symmetric part $a_{(ij)}$,
and the deviatoric part $\mathring{a}_{ij}$, transforms into a tensor of the same class under a
transformation to another coordinate system.
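
The decompositions (3.1) to (3.5) can be tried out numerically. The following is a minimal sketch in plain Python; the function name `decompose` and the coordinate values are assumptions chosen for illustration and are not taken from the text. Exact fractions are used so that the check involves no rounding.

```python
from fractions import Fraction

def decompose(a):
    """Split a 3x3 coordinate matrix (integer entries) into its isotropic part,
    the symmetric part of its deviator, and its antimetric part, cf. (3.5)."""
    n = range(3)
    a_hat = Fraction(sum(a[i][i] for i in n), 3)                      # (3.3)
    iso = [[a_hat if i == j else Fraction(0) for j in n] for i in n]  # a_hat * delta_ij
    sym = [[Fraction(a[i][j] + a[j][i], 2) for j in n] for i in n]    # a_(ij), (3.1)
    anti = [[Fraction(a[i][j] - a[j][i], 2) for j in n] for i in n]   # a_[ij], (3.1)
    sym_dev = [[sym[i][j] - iso[i][j] for j in n] for i in n]         # symmetric part of the deviator
    return iso, sym_dev, anti

# Hypothetical example coordinates.
a = [[2, 1, 0], [3, 4, -1], [2, 5, 6]]
iso, sym_dev, anti = decompose(a)

# Adding the three parts must give back the original coordinates, cf. (3.5).
total = [[iso[i][j] + sym_dev[i][j] + anti[i][j] for j in range(3)] for i in range(3)]
print(total == a)
```

The same check (adding the isotropic part, the symmetric deviatoric part, and the antimetric part) is the one suggested in Problem 3.1.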

Problem 3.1.
Consider a tensor (in a given Cartesian coordinate system) with the coordinate matrix

$$ a_{ij} = \begin{pmatrix} 4 & 1 & -7 \\ -3 & 5 & 8 \\ 3 & 2 & 9 \end{pmatrix}. $$

Compute (by finding the corresponding coordinate matrices) its isotropic part, its de-
viatoric part, its symmetric part, its antimetric part, and the symmetric part of its de-
viator. Check your result by adding the isotropic part, the symmetric deviatoric part,
and the antimetric part.

Problem 3.2.
Show that the antimetry of a tensor is invariant under a transformation of coordinates.

3.2 The Determinant of a Tensor

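1. For the determinant A of the Cartesian coordinates $a_{ij}$ of a tensor,

$$ A := \det a_{ij} = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \tag{3.6} $$

we have with (1.29)

$$ A := \det a_{ij} = \tfrac{1}{6}\,\varepsilon_{ijk}\,\varepsilon_{pqr}\,a_{ip}\,a_{jq}\,a_{kr} = \tfrac{1}{6} \begin{vmatrix} a_{pp} & a_{pq} & a_{pr} \\ a_{qp} & a_{qq} & a_{qr} \\ a_{rp} & a_{rq} & a_{rr} \end{vmatrix}. \tag{3.7} $$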

Since the right side of this equation is a scalar, the determinant of the coordinate ma-
trix of a tensor is also a scalar, i. e. it is independent of a chosen coordinate system.
Moreover, for a polar tensor it is a polar scalar, and for an axial tensor it is an axial
scalar. This justifies introducing a symbol for the determinant of the tensor itself, and thus
we write A = det a.

1 See Section 2.6, No. 7.

2. A scalar obtained from the coordinates of a tensor is called an invariant of the ten-
sor. So the trace and the determinant of a tensor are invariants of this tensor.
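3. The determinant of an antimetric tensor is zero. If we interchange in (3.7) the roles of
i and p, j and q, and k and r, then we get $A = \tfrac{1}{6}\,\varepsilon_{pqr}\,\varepsilon_{ijk}\,a_{pi}\,a_{qj}\,a_{rk}$. For an
antimetric tensor each of the three factors changes sign when its indices are interchanged,
$a_{pi} = -a_{ip}$, so comparing this with (3.7) gives $A = (-1)^3 A = -A$, i. e. A = 0.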
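
As an illustration of (3.7), the sixfold sum can be evaluated directly. The sketch below is an assumption-laden toy: the helper names `eps` and `det_eps` and both coordinate matrices are hypothetical, and integer entries are assumed so that the division by 6 is exact.

```python
from fractions import Fraction
from itertools import product

def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2: +1 for even and -1 for odd
    permutations, 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def det_eps(a):
    """Determinant via (3.7): A = (1/6) eps_ijk eps_pqr a_ip a_jq a_kr."""
    total = sum(
        eps(i, j, k) * eps(p, q, r) * a[i][p] * a[j][q] * a[k][r]
        for i, j, k, p, q, r in product(range(3), repeat=6)
    )
    return Fraction(total, 6)

general = [[2, 1, 0], [3, 4, -1], [2, 5, 6]]        # hypothetical example
antimetric = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]   # satisfies a_ij = -a_ji

print(det_eps(general), det_eps(antimetric))
```

For the antimetric example the sum vanishes, in agreement with No. 3.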

3.3 The Vector Associated with an Antimetric Tensor

By the relation $A_k = \tfrac{1}{2}\,\varepsilon_{ijk}\,a_{ij}$ we can associate with any tensor $a_{ij}$ a vector $A_k$. Conversely,
multiplying by $\varepsilon_{mnk}$ gives
$\varepsilon_{mnk} A_k = \tfrac{1}{2}\,\varepsilon_{ijk}\,\varepsilon_{mnk}\,a_{ij} = \tfrac{1}{2}\,(\delta_{im}\delta_{jn} - \delta_{in}\delta_{jm})\,a_{ij} = \tfrac{1}{2}\,(a_{mn} - a_{nm}) = a_{[mn]}$.
If $a_{ij}$ is symmetric, then $A_k$ is zero, i. e. $A_k$ depends only on the antimetric
part of $a_{ij}$. The antimetric part of $a_{ij}$ and the vector $A_k$ are uniquely associated (because
in a three-dimensional space an antimetric tensor and a vector are both determined
by three parameters). We have

$$ A_i = \tfrac{1}{2}\,\varepsilon_{ijk}\,a_{[jk]}, \qquad a_{[ij]} = \varepsilon_{ijk}\,A_k. \tag{3.8} $$

We call $A_k$ the vector associated with the tensor $a_{[ij]}$,2 or we say $a_{[ij]}$ is the tensor associated with $A_k$.
A polar antimetric tensor is associated with an axial vector, and an axial antimetric
tensor is associated with a polar vector.
If we introduce a symbol for the antimetric part of a tensor, e. g. [ a ], we can write
the two equations (3.8) in symbolic notation as follows:3
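$$ A = \tfrac{1}{2}\,\varepsilon \cdot\cdot\, [a] = \tfrac{1}{2}\,(\delta \times [a]) \cdot\cdot\, \delta, \qquad [a] = \varepsilon \cdot A = -\delta \times A. \tag{3.9} $$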
Multiplying (3.8)2 with the coordinates Bm of a vector B and contracting the indices m
and j gives the often used identity for antimetric tensors

[ a ] ⋅ B = B × A. (3.10)
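
The relations (3.8) and the identity (3.10) can be checked on a concrete antimetric tensor. This is only a sketch: the helper names and the numerical values of the tensor and of B are assumptions made for illustration.

```python
from fractions import Fraction

def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def associated_vector(a):
    """A_i = (1/2) eps_ijk a_jk, cf. (3.8); integer coordinates assumed."""
    r = range(3)
    return [Fraction(sum(eps(i, j, k) * a[j][k] for j in r for k in r), 2) for i in r]

def cross(u, v):
    """Coordinates of the vector product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

anti = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]   # hypothetical antimetric tensor
A = associated_vector(anti)

# Second equation of (3.8): a_[ij] = eps_ijk A_k recovers the tensor from the vector.
rebuilt = [[sum(eps(i, j, k) * A[k] for k in range(3)) for j in range(3)] for i in range(3)]

B = [1, 4, -2]                                                        # hypothetical vector
lhs = [sum(anti[i][j] * B[j] for j in range(3)) for i in range(3)]   # [a] . B
rhs = cross(B, A)                                                     # B x A
print(A, rebuilt == anti, lhs == rhs)
```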

Problem 3.3.
Consider the antimetric tensor given by

$$ a_{[ij]} = \begin{pmatrix} 0 & a_{12} & -a_{31} \\ -a_{12} & 0 & a_{23} \\ a_{31} & -a_{23} & 0 \end{pmatrix}. $$

Compute the coordinates of the associated vector.

2 Those authors who consider only polar tensors as tensors and who use only the polar ε-tensor call
Ak the axial vector associated with a[ij] . This axial vector is then clearly in our sense also a polar vector.
3 It is not very common to use the ε-tensor in symbolic notation; it can be replaced by a suitable vector
product with the δ-tensor.

3.4 The Cotensor of a Tensor

1. Let aij be the coordinate matrix of a second-order tensor. The cofactors bij of aij are
given, with (1.30), as

$$ b_{mn} = \tfrac{1}{2}\,\varepsilon_{mij}\,\varepsilon_{npq}\,a_{ip}\,a_{jq}. \tag{3.11} $$
This means that the cofactors of a second-order tensor also form the coordinate matrix
of a second-order tensor. The so-defined tensor b is called the cotensor of the tensor a.
Clearly, the cotensor of a second-order tensor is always a polar tensor.

2. In particular, for an antimetric tensor we obtain a simple expression for its cotensor,
if we use the vector associated with the tensor according to (3.8). We have

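$$ \begin{aligned}
b_{ip} &= \tfrac{1}{2}\,\varepsilon_{ijk}\,\varepsilon_{pqr}\,\varepsilon_{jqm} A_m\,\varepsilon_{krn} A_n \\
&= \tfrac{1}{2}\,\varepsilon_{jki}\,\varepsilon_{jqm}\,\varepsilon_{pqr}\,\varepsilon_{nkr}\,A_m A_n \\
&= \tfrac{1}{2}\,(\delta_{kq}\delta_{im} - \delta_{km}\delta_{iq})(\delta_{pn}\delta_{qk} - \delta_{pk}\delta_{qn})\,A_m A_n \\
&= \tfrac{1}{2}\,(3\,\delta_{im}\delta_{pn} - \delta_{im}\delta_{pn} - \delta_{im}\delta_{pn} + \delta_{pm}\delta_{in})\,A_m A_n \\
&= \tfrac{1}{2}\,(\delta_{im}\delta_{pn} + \delta_{pm}\delta_{in})\,A_m A_n \\
&= \tfrac{1}{2}\,(A_i A_p + A_p A_i),
\end{aligned} $$

$$ b_{ip} = A_i A_p. \tag{3.12} $$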
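
The result (3.12) can be cross-checked by evaluating (3.11) term by term. In this sketch the names `eps` and `cotensor` and the example values are assumptions; the associated vector A is entered by hand from (3.8) for the chosen tensor, and integer coordinates are assumed so that the factor 1/2 divides exactly.

```python
from itertools import product

def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def cotensor(a):
    """b_mn = (1/2) eps_mij eps_npq a_ip a_jq, cf. (3.11)."""
    r = range(3)
    return [[sum(eps(m, i, j) * eps(n, p, q) * a[i][p] * a[j][q]
                 for i, j, p, q in product(r, repeat=4)) // 2
             for n in r] for m in r]

anti = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]   # hypothetical antimetric tensor
A = [3, 2, 1]                                 # its associated vector according to (3.8)

b = cotensor(anti)
dyad = [[A[i] * A[p] for p in range(3)] for i in range(3)]   # A_i A_p
print(b == dyad)
```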

3.5 The Rank of a Tensor

1. The terms rank, regular, and singular also transfer from the matrix of a tensor to
the tensor itself.
For the Cartesian coordinates aij of a tensor a we have the following table.

Rank   Name                              Property

3      regular                           A ≠ 0
2      single singular, rank defect 1    A = 0, bij ≠ 0
1      double singular, rank defect 2    bij = 0, aij ≠ 0
0      triple singular, rank defect 3    aij = 0

Since the property of the coordinate matrices aij and bij , to be zero or nonzero, is in-
variant under coordinate transformations, we conclude that all coordinate matrices of
a tensor have the same rank. We call this property the rank of the associated tensor.

2. An antimetric tensor is, as shown in Section 3.2, always singular. If it is not triple
singular, then it is, according to (3.12), single singular.

3.6 The Inverse Tensor

Let $a_{ij}$ be a coordinate matrix of a regular tensor a. Then it has an inverse matrix $a^{(-1)}_{ij}$
and (1.49) shows that

$$ a_{ij}\,a^{(-1)}_{jk} = \delta_{ik}. \tag{a} $$

The transformation equation between the coordinates $a_{ij}$ of the tensor a in the original
coordinate system and its coordinates $\tilde{a}_{ij}$ in another coordinate system is given
by (2.17)

$$ a_{ij} = \alpha_{im}\,\alpha_{jn}\,\tilde{a}_{mn}. \tag{b} $$

The matrix $a^{(-1)}_{ij}$ can also be interpreted as the coordinate matrix of a tensor $a^{-1}$ in the
original coordinate system. The transformation equation between the $a^{(-1)}_{ij}$ and the
coordinates $\tilde{a}^{(-1)}_{ij}$ of $a^{-1}$ in the other coordinate system is

$$ a^{(-1)}_{jk} = \alpha_{jp}\,\alpha_{kq}\,\tilde{a}^{(-1)}_{pq}. \tag{c} $$

We will show that the coordinate matrices of these two tensors are inverses of each
other also in the other coordinate system (and thus in every Cartesian coordinate system).
For this, we substitute (b) and (c) into (a), and we obtain

$$ \alpha_{im}\,\underbrace{\alpha_{jn}\,\tilde{a}_{mn}\,\alpha_{jp}}_{\tilde{a}_{mn}\,\delta_{np}}\,\alpha_{kq}\,\tilde{a}^{(-1)}_{pq} = \delta_{ik}, $$

$$ \alpha_{im}\,\alpha_{kq}\,\tilde{a}_{mn}\,\tilde{a}^{(-1)}_{nq} = \delta_{ik}. $$

Multiplication with $\alpha_{ir}\,\alpha_{ks}$ gives

$$ \underbrace{\alpha_{im}\,\alpha_{ir}}_{\delta_{mr}}\;\underbrace{\alpha_{kq}\,\alpha_{ks}}_{\delta_{qs}}\;\tilde{a}_{mn}\,\tilde{a}^{(-1)}_{nq} = \underbrace{\alpha_{kr}\,\alpha_{ks}}_{\delta_{rs}}, $$

$$ \tilde{a}_{rn}\,\tilde{a}^{(-1)}_{ns} = \delta_{rs}, \qquad \text{Q. E. D.} $$

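For every regular tensor a there exists exactly one tensor $a^{-1}$ with the property that
the coordinate matrices of both tensors are inverses of each other in every Cartesian
coordinate system. We call these two tensors the inverses of each other and then (1.49), (1.51),
(1.52), (1.54), (1.50), and (1.53) give

$$ a \cdot a^{-1} = \delta, \qquad a_{ij}\,a^{(-1)}_{jk} = \delta_{ik}, $$

$$ a^{-1} \cdot a = \delta, \qquad a^{(-1)}_{ij}\,a_{jk} = \delta_{ik}, $$

$$ (a^{-1})^{-1} = a, \qquad \bigl(a^{(-1)}_{ij}\bigr)^{(-1)} = a_{ij}, $$

$$ (a^{-1})^{T} = (a^{T})^{-1} =: a^{-T}, \qquad \bigl(a^{(-1)}_{ij}\bigr)^{T} = \bigl(a^{T}_{ij}\bigr)^{(-1)} =: a^{(-T)}_{ij}, $$

$$ a^{-1} = \frac{1}{A}\,b^{T}, \qquad a^{(-1)}_{ij} = \frac{1}{A}\,b_{ji}, \tag{3.14} $$

$$ A\,\delta = b^{T} \cdot a = a \cdot b^{T}, \qquad A\,\delta_{ik} = b_{ji}\,a_{jk} = a_{ij}\,b_{kj}, \tag{3.15} $$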

where A is the determinant and b is the cotensor of a.
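
Equations (3.14) and (3.15) translate into a short computation of the inverse from the cotensor. The following is a sketch under stated assumptions: the function names and the example coordinates are hypothetical, entries are assumed to be integers, and the cofactors are written with cyclic indices, which absorbs the alternating sign.

```python
from fractions import Fraction

def cotensor(a):
    """Cofactors b_ij of a 3x3 matrix, written with cyclic indices."""
    return [[a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
             - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]

def inverse(a):
    """Inverse coordinates from (3.14): a^(-1)_ij = b_ji / A."""
    b = cotensor(a)
    A = sum(a[0][j] * b[0][j] for j in range(3))   # expansion along the first row
    if A == 0:
        raise ValueError("the tensor is singular, so no inverse exists")
    return [[Fraction(b[j][i], A) for j in range(3)] for i in range(3)]

a = [[2, 1, 0], [3, 4, -1], [2, 5, 6]]   # hypothetical regular tensor
a_inv = inverse(a)

# First line of the relations above: a_ij a^(-1)_jk = delta_ik.
check = [[sum(a[i][j] * a_inv[j][k] for j in range(3)) for k in range(3)] for i in range(3)]
print(check == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```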

Problem 3.4.
Consider the tensor with the coordinates

$$ A_{ij} = \begin{pmatrix} 0 & 3 & 2 \\ -3 & 0 & -1 \\ -2 & 1 & 0 \end{pmatrix}. $$

Find (using all you know):

A. its determinant,
B. the coordinates of its cotensor,
C. if possible, the coordinates of its inverse tensor.

3.7 Orthogonal Tensors

It is easy to show, analogously to how it was derived for the inverse tensor, that the
coordinate matrix of a tensor is proper or improper orthogonal in all coordinate sys-
tems, if it is proper or improper orthogonal in one coordinate system (compare Prob-
lem 3.5). We therefore call such a tensor proper orthogonal or improper orthogonal,
respectively, and then, with (1.59) and (1.60), we have

$$ a^{T} = a^{-1}, \qquad a \cdot a^{T} = a^{T} \cdot a = \delta, $$

$$ a_{ik}\,a_{jk} = \delta_{ij}, \qquad a_{ki}\,a_{kj} = \delta_{ij}, \tag{3.16} $$

$$ A = \pm 1. $$

Problem 3.5.
Prove that if the coordinate matrix of a tensor is proper or improper orthogonal in one
coordinate system, then it is proper or improper orthogonal in all coordinate systems.
Hint: First show that if the coordinate matrix of a tensor is orthogonal in one co-
ordinate system, then it is orthogonal in all coordinate systems.

Problem 3.6.
Show that a proper orthogonal tensor is equal to its cotensor and that an improper
orthogonal tensor is equal to the negative of its cotensor.
Hint: The coordinates $A_{ij}$ of a regular tensor and the coordinates $B_{ij}$ of its cotensor
are related, with (1.50), as $B_{ij} = A^{(-1)}_{ji} \det A$.

3.8 Tensors as Linear Vector Functions

1. The scalar product of a tensor and a vector results in a vector:

U = a ⋅ X, Ui = aij Xj . (3.17)

The tensor a assigns a vector U to each vector X. We also say U is a function of X; since
X is a vector, we call it a vector function, and since U is a vector, we call it a vector-
valued function. We also say the tensor a maps the vector X to the vector U. The set
of all vectors X that are mapped by the tensor a is called the domain and, conversely,
the set of all vectors U is called the range of the mapping. An element U of the range
is also referred to as the image of the corresponding X in the domain of the mapping
and, conversely, X is the inverse image of U. Writing the map (3.17) as U = f(X), we have
f(X + Y) = f(X) + f(Y) and f(λ X) = λ f(X); we refer to such a function as linear. In
summary, we can call a tensor a linear vector-valued vector function and we call the
mapping that it generates a linear map.
2. This property of a tensor suggests a geometric interpretation of its coordinates: if
we successively substitute for $X_i$ the Cartesian coordinates $\overset{\alpha}{e}_i = \delta_{i\alpha}$ of the unit vectors
of the underlying coordinate system, then we get for the image of this Cartesian basis

$$ \overset{\alpha}{U}_i = a_{ij}\,\overset{\alpha}{e}_j = a_{ij}\,\delta_{j\alpha}, \qquad \overset{\alpha}{U}_i = a_{i\alpha}. \tag{3.18} $$
This means that the columns of the coordinate matrix aij are the coordinates of the
images of the unit vectors of the underlying coordinate system.
3. We consider two collinear vectors Xi and Xi∗ = λ Xi . Their images Ui = aij Xj and
Ui∗ = aij Xj∗ are related as follows:

Ui∗ = aij Xj∗ = aij λ Xj = λ aij Xj = λ Ui .

Two collinear vectors are thus mapped to two vectors that are again collinear and have
the same length ratio. Such a map is called affine.
We will see that further properties of a map depend on the rank of the tensor.
Hence we consider the values which the rank of a tensor can take separately.

3.8.1 Rank 3

1. If the tensor $a_{ij}$ has rank 3, then the images of the unit vectors are linearly independent,
so they are not coplanar, and the inverse tensor $a^{(-1)}_{ij}$ exists and provides the inverse map:

$$ X = a^{-1} \cdot U, \qquad X_i = a^{(-1)}_{ij}\,U_j. \tag{3.19} $$

If we interpret vectors geometrically as position vectors, then to each vector corresponds
a point in space, and the tensor aij assigns to each point in space, one-to-one,
another point in space. We also say the tensor aij maps the three-dimensional vector
space of the Xi one-to-one to the three-dimensional vector space of the Ui .

2. Clearly, the zero vector is mapped to the zero vector; since the map is one-to-one,
no vector which is different from zero is mapped to the zero vector.

3. If the tensor is orthogonal, then we get for the square of the image of any vector Xi
with (3.17)

$$ U_i^2 = a_{ij} X_j\,a_{ik} X_k = a_{ij}\,a_{ik}\,X_j X_k = \delta_{jk}\,X_j X_k = X_j^2, $$

i. e. each vector is mapped to a vector of the same length. The scalar product of the
image vectors Ui = aij Xj and Vi = aij Yj of any two vectors Xi and Yi satisfies

Ui Vi = aij Xj aik Yk = aij aik Xj Yk = δjk Xj Yk = Xj Yj ,

i. e. scalar products are preserved under the mapping.

Now, Ui Vi = U V cos(Ui , Vi ) and Xi Yi = X Y cos(Xi , Yi ), and since the lengths
remain the same, i. e. U = X and V = Y , we also have cos(Ui , Vi ) = cos(Xi , Yi ),
i. e. the angles between two vectors also remain the same under the mapping. Such
a map, which is length and angle preserving (or isometric and isogonal), is called a
congruent map or an orthogonal map.
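
A small numerical illustration of No. 3: the sketch below applies a proper orthogonal tensor to two vectors and compares lengths and scalar products before and after the mapping. The coordinate matrix (a rotation about the third axis built from the 3-4-5 triangle) and the vectors X and Y are assumptions chosen so that all arithmetic stays exact.

```python
from fractions import Fraction as F

# Hypothetical proper orthogonal coordinate matrix: a_ik a_jk = delta_ij, det = +1.
a = [[F(3, 5), F(-4, 5), 0],
     [F(4, 5), F(3, 5), 0],
     [0, 0, 1]]

def apply(a, x):
    """U_i = a_ij X_j, cf. (3.17)."""
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    """Scalar product of two vectors given by Cartesian coordinates."""
    return sum(ui * vi for ui, vi in zip(u, v))

X, Y = [1, 2, 3], [-2, 0, 5]   # hypothetical vectors
U, V = apply(a, X), apply(a, Y)

print(dot(U, U) == dot(X, X))   # lengths are preserved
print(dot(U, V) == dot(X, Y))   # scalar products, and hence angles, are preserved
```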

4. A regular tensor maps a Cartesian basis to three linearly independent (i. e. non-
coplanar) vectors, which generally are neither unit vectors nor perpendicular to each
other. In particular, an orthogonal tensor maps a Cartesian basis to three mutually
perpendicular unit vectors, i. e. again a Cartesian basis. Later we will prove that if
the tensor is proper orthogonal, then the orientation of the basis is preserved, i. e.

a right-handed system is mapped to a right-handed system and a left-handed system

is mapped to a left-handed system; the map represents a rotation. If the tensor is im-
proper orthogonal, then it changes the orientation: a right-handed system is mapped
to a left-handed system and a left-handed system to a right-handed system; the map
represents a rotation combined with a reflection.

3.8.2 Rank 2

1. If the tensor aij has rank 2, then both the rows and the columns of its coordinate
matrix are coplanar, but not collinear. If we interpret both the rows and the columns as
the coordinates of position vectors or as points in space, then they each define a plane
through the origin, which we call the row plane and the column plane, respectively.
The images of the unit vectors then lie in the column plane. Since any vector in the
domain can be represented as a linear combination of the unit vectors and since this
mapping is linear, its image in the image space can also be represented as a linear
combination of the U i ; i. e. geometrically the tensor aij maps each point of the space
to a point in the column plane. We also say the tensor aij maps the three-dimensional
vector space of the Xi onto the two-dimensional vector space of the Ui .

2. Since the matrix of the aij is singular, the equation aij Xj = 0 has nontrivial solu-
tions. This means that there exist vectors Xj whose image is the zero vector, i. e.

aij Xj = 0. (3.20)

These vectors are called null vectors; they span the null space of the single singular
tensor aij . Geometrically, the null space is a straight line through the origin; the di-
rection of this line is called the null direction. If we interpret again the rows of aij as
position vectors, then this line is, according to (3.20), perpendicular to these position
vectors and thus is perpendicular to the row plane. Let X and Y be any two vectors in
the domain, with the images U = a ⋅ X and V = a ⋅ Y. Then we get by subtracting U
− V = a ⋅ (X − Y). Clearly, the images U and V are equal if and only if the difference
between X and Y is pointing in the null direction. Thus two vectors whose projections
onto the row plane are equal have the same image. In other words, if we decompose a
vector X into a component X T in the row plane and a component X N perpendicular to
the row plane, then we get a ⋅ (X T + X N ) = a ⋅ X T .
Two different vectors in the row plane have different images; geometrically, a
rank-2 tensor maps each point in the row plane one-to-one to a point in the column
plane, and the corresponding transposed tensor accordingly maps each point in the
column plane (of the original tensor) one-to-one to a point in the row plane.
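
The statements of No. 2 can be made concrete with a small example. In this sketch the rank-2 coordinate matrix is a hypothetical one (its third row is the sum of the first two), and the null direction is obtained as the vector product of two non-collinear rows, which is perpendicular to the row plane.

```python
def cross(u, v):
    """Coordinates of the vector product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def apply(a, x):
    """U_i = a_ij X_j, cf. (3.17)."""
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]

# Hypothetical single singular tensor: rows are coplanar but not collinear.
a = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 2]]

n = cross(a[0], a[1])                            # normal of the row plane = null direction
X = [2, -1, 3]                                   # hypothetical vector
X_shifted = [X[i] + 5 * n[i] for i in range(3)]  # differs from X only along the null direction

print(apply(a, n))                               # image of a null vector, cf. (3.20)
print(apply(a, X) == apply(a, X_shifted))        # same projection onto the row plane, same image
```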

3. For an antimetric tensor, which we know is either single singular or the zero tensor,
the vector associated to the tensor points in the null direction. From (3.8)2 it follows
by multiplying by Aj that

aij Aj = εijk Aj Ak ,

and this is zero, according to (2.40).

Problem 3.7.
Consider a tensor (in a given coordinate system) with the coordinate matrix
$$ a_{ij} = \begin{pmatrix} 1 & -1 & -1 \\ -2 & 1 & 3 \\ -1 & 0 & 2 \end{pmatrix}. $$
A. Convince yourself that it is single singular.
B. Determine the unit vector which spans the null space.
C. Find a Cartesian basis of the image plane.
Hint: Let U 1 and U 2 be two linearly independent vectors in the image plane;
first find a vector V 1 = U 1 + α U 2 which is perpendicular to U 1 .

3.8.3 Rank 1

If the tensor aij has rank 1, then both the rows and the columns of its coordinate ma-
trix are collinear. Geometrically, both the rows and the columns define a straight line
through the origin, which we call the row line and the column line.
The images of all vectors lie on the column line; the tensor maps the three-
dimensional vector space of the Xi onto the one-dimensional vector space of the Ui .
There are two linearly independent null directions which are both perpendicular
to the row line. These null directions span the null space of the double singular tensor
aij . Geometrically, the null space is a plane through the origin, whose normal vector
lies on the row line; we call it the null plane.
A tensor of rank 1 maps each point of the row line one-to-one to a point of the
column line; the corresponding transposed tensor accordingly maps each point of the
column line (of the original tensor) one-to-one to a point of the row line.

Problem 3.8.
Consider the tensor with the coordinate matrix
$$ a_{ij} = \begin{pmatrix} 1 & -1 & 2 \\ -1 & 1 & -2 \\ 3 & -3 & 6 \end{pmatrix}. $$
A. Convince yourself that it is double singular.
B. Determine two linearly independent null vectors.

C. Determine the unit vector normal to the null plane.

D. Find the unit vector in the direction of the image line.

3.8.4 Rank 0

Finally, if the tensor aij has rank 0, then clearly it transforms any vector Xi into the zero
vector: the tensor is the zero tensor.

3.9 Reciprocal Bases

3.9.1 Definition

1. Let three vectors $g_1$, $g_2$, and $g_3$ be noncoplanar, i. e. we have $[g_1, g_2, g_3] \neq 0$.
Together they are called a basis or frame and we write $g_i$.

2. Now we define the basis $g^i$, reciprocal to the basis $g_i$, as the three vectors

$$ g^1 = \frac{g_2 \times g_3}{[g_1, g_2, g_3]}, \qquad g^2 = \frac{g_3 \times g_1}{[g_1, g_2, g_3]}, \qquad g^3 = \frac{g_1 \times g_2}{[g_1, g_2, g_3]}. \tag{3.21} $$

We also want to write these equations in coordinate notation. If we choose a Cartesian
coordinate system and denote e. g. the coordinates of the vector $g_1$ in this coordinate
system with $g_{1i}$ and the coordinates of the vector $g^1$ with $g^1_i$, then the equations (3.21) read

$$ g^1_i = \frac{\varepsilon_{ijk}\,g_{2j}\,g_{3k}}{\varepsilon_{lmn}\,g_{1l}\,g_{2m}\,g_{3n}}, \qquad g^2_j = \frac{\varepsilon_{ijk}\,g_{3k}\,g_{1i}}{\varepsilon_{lmn}\,g_{1l}\,g_{2m}\,g_{3n}}, \qquad g^3_k = \frac{\varepsilon_{ijk}\,g_{1i}\,g_{2j}}{\varepsilon_{lmn}\,g_{1l}\,g_{2m}\,g_{3n}}. \tag{3.22} $$

Since the triple product in the denominator is nonzero, the three vectors $g^i$ are
uniquely determined, and since three vectors, which are each perpendicular to the
other two vectors of a basis, cannot be coplanar, the $g^i$ also form a basis.
If the $g_i$ are polar vectors, then the $g^i$ are also polar vectors.
We can summarize the three equations (3.21) in one equation,

$$ g^i = \frac{1}{2}\,\varepsilon_{ijk}\,\frac{g_j \times g_k}{[g_1, g_2, g_3]}, \tag{3.23} $$

as we see immediately, if we sequentially substitute 1, 2, and 3 for i.

3.9.2 Orthogonality Relations

1. Clearly, the three vectors $g^i$ are, according to (3.21), constructed so that e. g.
$g^1$ obeys

$$ g_1 \cdot g^1 = 1, \qquad g_2 \cdot g^1 = 0, \qquad g_3 \cdot g^1 = 0. $$

Similar equations hold for $g^2$ and $g^3$, and we can summarize these nine equations into
the so-called orthogonality relations

$$ g_i \cdot g^j = \delta_{ij}, \qquad g_{ik}\,g^j_k = \delta_{ij}. \tag{3.24} $$

2. For given $g_i$ and unknown $g^i$, these orthogonality relations form an inhomogeneous
system of nine linear equations for nine unknowns, whose coefficient determinant is
nonzero. Since such a system has exactly one solution, $g^i$ is, with (3.21), the only basis
which satisfies together with $g_i$ the orthogonality relations (3.24); the equations (3.21)
and (3.24) are therefore equivalent. Conversely, $g_i$ is reciprocal to $g^i$, because we can
change the order of the factors in (3.24).
Writing (3.24) as a matrix equation gives

$$ \begin{pmatrix} g_{11} & g_{12} & g_{13} \\ g_{21} & g_{22} & g_{23} \\ g_{31} & g_{32} & g_{33} \end{pmatrix} \begin{pmatrix} g^1_1 & g^2_1 & g^3_1 \\ g^1_2 & g^2_2 & g^3_2 \\ g^1_3 & g^2_3 & g^3_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. $$

The matrix formed by the coordinates of the three vectors $g_i$ as rows is the inverse of
the matrix formed by the coordinates of the three vectors $g^i$ as columns. From this we
can compute the basis reciprocal to a given basis with the Gauss–Jordan algorithm.

3. We want to compute $g_k\,g^k$; clearly this is a second-order tensor, for which we
temporarily introduce the unknown X. Then we have in coordinate notation

$$ X_{ij} = g_{ki}\,g^k_j. $$

Multiplying by $g_{mj}$ gives with (3.24)

$$ X_{ij}\,g_{mj} = g_{ki}\,g^k_j\,g_{mj} = g_{ki}\,\delta_{km} = g_{mi} = g_{mj}\,\delta_{ij}, $$

$$ (X_{ij} - \delta_{ij})\,g_{mj} = 0 $$

or, translated back into symbolic notation,

$$ (X - \delta) \cdot g_m = 0. $$

Thus all three vectors $g_m$ are null vectors of the tensor X − δ, i. e. the tensor X − δ must
be triple singular. In other words, it must be the zero tensor. This yields the following
relations (which are also called orthogonality relations):

$$ g_k\,g^k = \delta, \qquad g_{ki}\,g^k_j = \delta_{ij}. \tag{3.25} $$
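
The definition (3.21) and both sets of orthogonality relations can be checked on a simple nonorthogonal basis. The sketch below is illustrative only: the function names and the basis vectors are assumptions, and integer coordinates are assumed so that the division by the triple product is exact.

```python
from fractions import Fraction

def cross(u, v):
    """Coordinates of the vector product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    """Scalar product of two vectors given by Cartesian coordinates."""
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal_basis(g):
    """Reciprocal basis g^1, g^2, g^3 of the basis g_1, g_2, g_3 according to (3.21)."""
    g1, g2, g3 = g
    triple = dot(g1, cross(g2, g3))               # triple product [g_1, g_2, g_3]
    if triple == 0:
        raise ValueError("the vectors are coplanar and do not form a basis")
    pairs = ((g2, g3), (g3, g1), (g1, g2))
    return [[Fraction(c, triple) for c in cross(u, v)] for u, v in pairs]

g = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]             # hypothetical nonorthogonal basis g_i
g_up = reciprocal_basis(g)                        # reciprocal basis g^i

# (3.24): g_i . g^j = delta_ij
gram = [[dot(g[i], g_up[j]) for j in range(3)] for i in range(3)]
# (3.25): the sum of the dyads g_k g^k has the coordinates delta_ij
dyad_sum = [[sum(g[k][i] * g_up[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(g_up, gram == dyad_sum)
```

Because (3.24) states that the two coordinate matrices are inverses of each other, the same result could also be obtained by inverting the matrix of the $g_i$ with the Gauss–Jordan algorithm.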

3.9.3 Orthogonal Bases and Orthonormal Bases

If the three vectors of a basis are mutually perpendicular, we call the basis orthogonal,
and if these vectors are unit vectors, we call it a normalized basis. If the basis (as the
basis of a Cartesian coordinate system) is both orthogonal and normalized, it is called
orthonormal.
For a pair of reciprocal bases it follows from (3.21) and (3.24) that if the original
basis is orthogonal, then its reciprocal basis is also orthogonal, and the homologous
vectors of both bases are collinear and their magnitudes are reciprocal. If the original
basis is orthonormal, then it is identical to its reciprocal basis.
The orthogonality relations (3.24) and (3.25) for a Cartesian basis $e_i$ are, according
to (2.4),4 given by

$$ e_i \cdot e_j = \delta_{ij}, \qquad \overset{i}{e}_k\,\overset{j}{e}_k = \delta_{ij}, \qquad \alpha_{ki}\,\alpha_{kj} = \delta_{ij}, $$

$$ e_k\,e_k = \delta, \qquad \overset{k}{e}_i\,\overset{k}{e}_j = \delta_{ij}, \qquad \alpha_{ik}\,\alpha_{jk} = \delta_{ij}. $$

Problem 3.9.
The following is given with respect to a Cartesian basis ei :
A. the (orthogonal) basis g 1 = 3 e1 , g 2 = 2 e2 , g 3 = e3 ;
B. the (nonorthogonal) basis, whose vectors are given by the three edges of a regular
tetrahedron with edge length one, lying in the first octant, with g 1 = e1 and g 2
lying in the e1 , e2 -plane.

Compute the reciprocal bases.

3.9.4 Reciprocal Bases in the Plane

1. Let $g_1$ and $g_2$ be two linearly independent vectors. Then there is, in the plane spanned
by $g_1$ and $g_2$, exactly one pair of vectors $g^1$ and $g^2$ which satisfies the orthogonality
relations (3.24), with i and j running naturally only from 1 to 2. We call this basis the basis
reciprocal to $g_1$ and $g_2$. The easiest way to see why this is plausible is geometrically:
for example, $g^1$ must be in the plane spanned by $g_1$ and $g_2$, it must be perpendicular to
$g_2$, it must enclose an acute angle with $g_1$, and its length must be such that $g_1 \cdot g^1 = 1$.

4 In (2.4) the general Cartesian basis is denoted by $\tilde{e}_i$.

2. In a Cartesian basis in this plane, all four vectors have only two coordinates, and
the two-dimensional Cartesian coordinates of $g^1$ and $g^2$ are computed, analogously to
(3.22), from the equations

$$ g^1_i = \frac{\varepsilon_{ij}\,g_{2j}}{\varepsilon_{mn}\,g_{1m}\,g_{2n}}, \qquad g^2_j = \frac{\varepsilon_{ij}\,g_{1i}}{\varepsilon_{mn}\,g_{1m}\,g_{2n}}. \tag{3.27} $$

It is easy to verify that the so-defined two-dimensional vectors satisfy the orthogonality
relations.

3.10 Representation of a Tensor by Vectors

In Section 3.8 we analyzed the equation U = a ⋅ X by successively substituting for X
the vectors of a Cartesian basis. With the concept of a reciprocal basis, we are now in
the position to generalize this idea to any basis.
Let a be any tensor and $g_i$ be any basis and let the image of $g_i$ under the mapping
a be

$$ h_i = a \cdot g_i. \tag{3.28} $$

Tensor multiplication by the reciprocal basis $g^i$ gives with (3.25)

$$ h_i\,g^i = a \cdot g_i\,g^i = a \cdot \delta = a, $$

$$ a = h_i\,g^i = h_1\,g^1 + h_2\,g^2 + h_3\,g^3. \tag{3.29} $$

Any tensor a can therefore be represented by six vectors in this way; here the $h_i$ are the
images of the basis $g_i$ under the mapping by the tensor a, and the basis $g^i$ is reciprocal
to the basis $g_i$. One can start from an arbitrary basis $g_i$ and hence the representation
(3.29) is possible in infinitely many ways: the basis $g_i$ can be freely specified and then
all three vectors $h_i$ are uniquely determined by a.
The basis $g^i$ is always noncoplanar. With respect to the $h_i$, we need to distinguish
between the cases where a has rank 3, 2, or 1.
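
The representation (3.29) can be verified numerically by forming the images (3.28) and summing the three dyads. In this sketch the tensor, the basis, and its reciprocal basis are hypothetical example values; the reciprocal basis is entered by hand and satisfies the orthogonality relations (3.24) for the chosen basis.

```python
a = [[2, 1, 0], [3, 4, -1], [2, 5, 6]]       # hypothetical tensor coordinates a_ij

g = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]        # freely chosen basis (lower index)
g_up = [[1, -1, 0], [0, 1, -1], [0, 0, 1]]   # its reciprocal basis (upper index)

# (3.28): images of the basis vectors under the mapping by a.
h = [[sum(a[r][c] * gi[c] for c in range(3)) for r in range(3)] for gi in g]

# (3.29): sum over i of the dyads formed by the i-th image and the i-th reciprocal vector.
rebuilt = [[sum(h[i][r] * g_up[i][c] for i in range(3)) for c in range(3)]
           for r in range(3)]

print(h)
print(rebuilt == a)
```

Choosing a different basis changes the three image vectors but not the sum of the dyads, which is the sense in which the representation is possible in infinitely many ways.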

3.10.1 Rank 3

1. If a has rank 3, then the $h_i$ are also noncoplanar. The tensor $a^{-1}$, the inverse of the
tensor a, exists and so does the basis $h^i$, reciprocal to $h_i$.
2. The converse of (3.29) is also true: if a tensor a can be written in the form (3.29) and
both the $g^i$ and the $h_i$ are linearly independent, then the tensor a has rank 3.
Then U = a ⋅ X gives

$$ U = h_1\,g^1 \cdot X + h_2\,g^2 \cdot X + h_3\,g^3 \cdot X. $$

We set $g^1 \cdot X = \alpha_1$, $g^2 \cdot X = \alpha_2$, and $g^3 \cdot X = \alpha_3$, and we see that the $\alpha_i$ can take any
value, since the $g^i$ are assumed linearly independent and X can be chosen arbitrarily.
So we have

$$ U = \alpha_1\,h_1 + \alpha_2\,h_2 + \alpha_3\,h_3, $$

i. e. U is a linear combination of the $h_i$, which are also assumed linearly independent.
Therefore we have a mapping from the three-dimensional space of the vectors X onto
the three-dimensional space of the vectors U, which means that a must have rank 3.

3. Scalar multiplication of (3.28) by $a^{-1}$ from the left gives $g_i = a^{-1} \cdot h_i$.
In coordinate notation we see immediately that the transpose of (3.29) is
$a^T = g^i\,h_i$. Then scalar multiplication by $h^j$ from the right gives $a^T \cdot h^j = g^i\,\delta_{ij}$ or
$g^j = a^T \cdot h^j$.
Finally, scalar multiplication of the last equation from the left by $a^{-T}$ gives
$h^j = a^{-T} \cdot g^j$.
Hence there are four equivalent relations,

$$ h_i = a \cdot g_i, \qquad g^i = a^T \cdot h^i, \qquad g_i = a^{-1} \cdot h_i, \qquad h^i = a^{-T} \cdot g^i, $$

and solving for the tensors using the orthogonality relations (3.25) gives

$$ a = h_i\,g^i, \qquad a^T = g^i\,h_i, \qquad a^{-1} = g_i\,h^i, \qquad a^{-T} = h^i\,g_i. $$

4. We summarize the results of this section.

– Any rank-3 tensor a can be represented as follows:

$$ a = h_i\,g^i, \qquad a^T = g^i\,h_i, \qquad a^{-1} = g_i\,h^i, \qquad a^{-T} = h^i\,g_i. \tag{3.30} $$

Here the $g_i$ and $g^i$ and also the $h_i$ and $h^i$ are pairs of reciprocal bases.

– The converse is also true: if a tensor a can be represented in the form (3.30)1
and both the $h_i$ and $g^i$ form a basis, then the tensor has rank 3.
– If we solve the equations (3.30) for the four bases, we obtain

$$ h_i = a \cdot g_i, \qquad g^i = a^T \cdot h^i, \qquad g_i = a^{-1} \cdot h_i, \qquad h^i = a^{-T} \cdot g^i. \tag{3.31} $$

– For any tensor a, we can freely choose one of the four bases $g_i$, $g^i$, $h_i$, or $h^i$, and
then the other three are uniquely determined.

Problem 3.10.
Represent the regular tensor with the coordinates

$$ a_{ij} = \begin{pmatrix} 2 & 3 & -1 \\ 4 & -2 & 3 \\ 1 & 2 & 1 \end{pmatrix} $$

in the form a = hi g i , where the g i are given by their coordinates

g 1 = (1, 0, −1), g 2 = (3, 1, −3), g 3 = (1, 2, −2).

Hint: The hi can be computed using the Gauss–Jordan algorithm in one step.

3.10.2 Rank 2

1. If a has rank 2, we choose the basis $g_i$ in (3.28) so that $g_1$ and $g_2$ lie in the row plane
of a and $g_3$ is a null vector of a. Then we have $h_3 = 0$, and $g^1$ and $g^2$ are perpendicular
to $g_3$, i. e. they also lie in the row plane of a. Then (3.29) reduces to

$$ a = h_1\,g^1 + h_2\,g^2. \tag{3.32} $$

We can represent any single singular tensor by four vectors in this way; here $g^1$
and $g^2$ are the basis reciprocal to $g_1$ and $g_2$ in the row plane of the tensor, and $h_1$ and
$h_2$ are the images of $g_1$ and $g_2$ under the mapping by the tensor and they form a basis
in the column plane of the tensor.

2. The converse of (3.32) is also true: if a tensor a can be written in the form
$a = h_1\,g^1 + h_2\,g^2$, and both $h_1$ and $h_2$ and also $g^1$ and $g^2$ are not collinear, then a has rank
2. Then it follows from U = a ⋅ X that

$$ U = h_1\,g^1 \cdot X + h_2\,g^2 \cdot X. $$

If we set $g^1 \cdot X = \alpha_1$ and $g^2 \cdot X = \alpha_2$, then these two equations form for given $g^1$, $g^2$, $\alpha_1$,
and $\alpha_2$ a system of two linear equations for the three $X_i$. For each value pair $(\alpha_1, \alpha_2)$
there are infinitely many solutions and, conversely, the value pair $(\alpha_1, \alpha_2)$ can take
any values for suitable $X_i$; for arbitrary X the image vector U sweeps over the entire
plane spanned by $h_1$ and $h_2$, i. e. a has rank 2.

3. Transposition of (3.32) gives

$$ a^T = g^1\,h_1 + g^2\,h_2. $$

Instead, we can also write $a^T = g^i\,h_i$, where i runs only from 1 to 2. Scalar multiplication
of this equation from the right by $h^j$, the basis reciprocal to $h_j$ in the column plane
of a, gives

$$ a^T \cdot h^j = g^i\,h_i \cdot h^j = g^i\,\delta_{ij} = g^j. $$

4. We summarize the results of this section.

– Any rank-2 tensor a can be represented as follows:

$$ a = h_1\,g^1 + h_2\,g^2, \qquad a^T = g^1\,h_1 + g^2\,h_2. \tag{3.33} $$

Here $h_i$ and $g^i$ are linearly independent.

– The converse is also true: if a tensor a can be represented in the form (3.33)1
and both the $g^i$ and the $h_i$ are linearly independent, then the tensor has rank 2.
– Solving the equations (3.33) for $h_i$ and $g^i$ yields

$$ h_i = a \cdot g_i, \qquad g^i = a^T \cdot h^i. \tag{3.34} $$

Here the $g_i$ are the basis reciprocal to the $g^i$ in the plane and the $h^i$ are the basis
reciprocal to the $h_i$ in the plane.
– The $g_i$ and $g^i$ lie in the row plane of a, and the $h_i$ and $h^i$ lie in the column plane
of a.
– For any rank-2 tensor a, we can freely choose one of the four vector pairs
$g_i$, $g^i$, $h_i$, or $h^i$, and then the other three vector pairs are uniquely determined.

Problem 3.11.
A. Verify that the single singular tensor from Problem 3.7 can be represented in the
form $a = h_1\,g^1 + h_2\,g^2$, with $h_1 = (-1, 2, 1)$ and $h_2 = (0, 1, 1)$.
B. Compute $g^1$ and $g^2$. (This can be done using the Gauss–Jordan algorithm in one
step.)
C. Write a as the sum of $h_1\,g^1$ and $h_2\,g^2$.

3.10.3 Rank 1

1. Finally, if a has rank 1, then we choose the basis in (3.28) so that $g_1$ lies on the row
line of a and so that $g_2$ and $g_3$ are null vectors of a. Then $h_2 = h_3 = 0$, and $g^1$ also lies
on the row line of a. Equation (3.29) reduces to

$$ a = h_1\,g^1, \qquad a^T = g^1\,h_1. \tag{3.35} $$

This means that any double singular tensor can be represented as a tensor product
of two nonzero vectors, where $h_1$ lies on the column line of the tensor, $g^1$ lies on the
row line of the tensor, and $h_1$ is the image of the vector $g_1$ reciprocal to $g^1$, under the
mapping with the tensor. (Two vectors are reciprocal if they are collinear and their
scalar product is equal to one.)

2. The converse is also true: if a tensor a can be written in the form (3.35)1 and $h_1$ and
$g^1$ are nonzero, then the tensor has rank 1. Then from U = a ⋅ X we have $U = h_1\,g^1 \cdot X$,
i. e. the image vector U is pointing in the direction of $h_1$ for all X.

3. Let $h^1$ be the vector reciprocal to $h_1$. Then from (3.35) we get by scalar multiplication
from the right by $g_1$ and $h^1$, respectively,

$$ h_1 = a \cdot g_1, \qquad g^1 = a^T \cdot h^1. \tag{3.36} $$

4. We summarize the results.

– Any rank-1 tensor a can be represented as follows:

$$ a = h_1\,g^1, \qquad a^T = g^1\,h_1. \tag{3.35} $$

– The converse is also true: if a tensor a can be represented in the form (3.35)1
and $g^1$ and $h_1$ are nonzero, then the tensor has rank 1.
– If we solve for $h_1$ and $g^1$, then the equations (3.35) become

$$ h_1 = a \cdot g_1, \qquad g^1 = a^T \cdot h^1. \tag{3.36} $$

Here $g_1$ is the reciprocal vector to $g^1$ and $h^1$ is the reciprocal vector to $h_1$.

– We know $g_1$ and $g^1$ lie on the row line of a and $h_1$ and $h^1$ lie on the column line
of a.
– We can freely choose one of the four vectors $g_1$, $g^1$, $h_1$, or $h^1$ of a rank-1 tensor
a, and then the other three are uniquely determined.

3.11 Eigenvalues and Eigenvectors. The Characteristic Equation

3.11.1 Eigenvalues and Eigenvectors

1. We want to investigate under which conditions the vectors associated with the trans-
formation Ui = aij Xj are collinear, i. e.

Ui = λ Xi , (3.37)

where the aij are real, λ is a scalar, and Xi ≠ 0. This is the case if

aij Xj = λ Xi , (3.38)
(aij − λ δij ) Xj = 0. (3.39)

Clearly, this condition is only satisfied if the tensor (aij − λ δij ) is singular, i. e. if

det (aij − λ δij ) = 0, (3.40)

and if Xj is a null vector of this singular tensor. The values λ, for which the tensor (aij −
λ δij ) is singular, are called the eigenvalues of the tensor aij . As scalars, they are clearly
invariants of the tensor. The null vectors of (aij − λ δij ) are called the eigenvectors of
aij , the null directions of (aij − λ δij ) are called the eigendirections of aij , and equation
(3.39) is called the eigenvalue equation of aij .

2. In addition to the eigenvalue problem above, we also introduce the eigenvalue prob-
lem aTij Zj = μ Zi for the transposed tensor. We have

Zj aji = μ Zi , (3.41)
Zi (aij − μ δij ) = 0. (3.42)

According to the position of the eigenvectors, (3.42) is also called a left-eigenvalue
problem and (3.39) a right-eigenvalue problem.
Clearly, (3.40) is the condition determining the eigenvalues of both eigenvalue
problems, so the eigenvalues are the same for both problems, i. e.

μ = λ. (3.43)

The eigenvectors of both problems are also related. We mention here only the simplest
relation: right and left eigenvectors which belong to different eigenvalues are orthogonal.
Let $X_m$ be a right eigenvector corresponding to an eigenvalue $\lambda_m$ and $Z_n$ be a left
eigenvector corresponding to an eigenvalue $\lambda_n$, i. e. we have

$$a \cdot X_m = \lambda_m X_m, \qquad Z_n \cdot a = \lambda_n Z_n.$$

Scalar multiplication of the first equation from the left with $Z_n$ and of the second
equation from the right with $X_m$ gives

$$Z_n \cdot a \cdot X_m = \lambda_m\, Z_n \cdot X_m, \qquad Z_n \cdot a \cdot X_m = \lambda_n\, Z_n \cdot X_m.$$

The left sides of both equations are the same, so we set the right sides equal and get

$$(\lambda_m - \lambda_n)\, Z_n \cdot X_m = 0.$$

For $\lambda_m \neq \lambda_n$ it follows that

$$X_m \cdot Z_n = 0 \quad \text{for } m \neq n. \tag{3.44}$$

From now on we will consider only the right-eigenvalue problem.
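
The orthogonality relation (3.44) can be illustrated with a small example. The sketch below assumes a triangular coordinate matrix chosen for this purpose, with right and left eigenvectors found by hand; none of these numbers appear in the text, and the function names are illustrative.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def right_apply(a, x):
    """Coordinates of a . X, i.e. a_ij X_j."""
    return [dot(row, x) for row in a]


def left_apply(z, a):
    """Coordinates of Z . a, i.e. Z_i a_ij."""
    return [dot(z, col) for col in zip(*a)]


# Example coordinate matrix (assumed); it is not symmetric.
a = [[2, 1, 0],
     [0, 3, 0],
     [0, 0, 5]]

eigenvalues = [2, 3, 5]
right = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]    # right eigenvectors X_m
left = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]    # left eigenvectors Z_n

# Confirm that the hand-computed vectors solve both eigenvalue problems.
for lam, x, z in zip(eigenvalues, right, left):
    assert right_apply(a, x) == [lam * xi for xi in x]
    assert left_apply(z, a) == [lam * zi for zi in z]

# products[m][n] = X_m . Z_n; the entries with m != n vanish, as in (3.44).
products = [[dot(x, z) for z in left] for x in right]
print(products)
```

Note that the right eigenvectors `[1, 0, 0]` and `[1, 1, 0]` are not orthogonal to each other; the orthogonality holds only between a right and a left eigenvector belonging to different eigenvalues.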

3.11.2 The Characteristic Equation and Main Invariants

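1. The condition for determining the eigenvalues follows from (3.7) and is

$$\det(a_{ij} - \lambda\,\delta_{ij}) = \frac{1}{6}\,\varepsilon_{ijk}\,\varepsilon_{pqr}\,(a_{ip} - \lambda\,\delta_{ip})(a_{jq} - \lambda\,\delta_{jq})(a_{kr} - \lambda\,\delta_{kr}) = 0. \tag{3.45}$$

This is a cubic equation for λ, the so-called characteristic equation of the tensor
$a_{ij}$.⁵ Generally, there exist three (not necessarily real) eigenvalues with (at least) one
eigendirection corresponding to each eigenvalue.
Expanding the parentheses gives

$$\begin{aligned}
&\frac{1}{6}\,\varepsilon_{ijk}\varepsilon_{pqr}\,a_{ip}a_{jq}a_{kr}
 - \frac{\lambda}{6}\,\varepsilon_{ijk}\varepsilon_{pqr}\,(\delta_{ip}a_{jq}a_{kr} + \delta_{jq}a_{kr}a_{ip} + \delta_{kr}a_{ip}a_{jq}) \\
&\quad + \frac{\lambda^2}{6}\,\varepsilon_{ijk}\varepsilon_{pqr}\,(\delta_{ip}\delta_{jq}a_{kr} + \delta_{jq}\delta_{kr}a_{ip} + \delta_{kr}\delta_{ip}a_{jq})
 - \frac{\lambda^3}{6}\,\varepsilon_{ijk}\varepsilon_{pqr}\,\delta_{ip}\delta_{jq}\delta_{kr} = 0.
\end{aligned}$$

For the coefficient of $\lambda^3$, we have $\varepsilon_{ijk}\varepsilon_{pqr}\delta_{ip}\delta_{jq}\delta_{kr} = \varepsilon_{ijk}\varepsilon_{ijk}$, which equals the sum of
the squares of all nonzero coordinates of the ε-tensor, hence 6. The coefficients of the
remaining powers of λ are clearly scalars and so they are, like the eigenvalues, also
invariants of the tensor $a_{ij}$. We write for the last equation

$$A - A'\lambda + A''\lambda^2 - \lambda^3 = 0, \tag{3.46}$$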

5 The left side of this equation is called the characteristic polynomial.

where, using (1.36), (3.11), and (3.7), we have

$$A'' = \frac{1}{2}\,\varepsilon_{ijk}\,\varepsilon_{pqr}\,\delta_{ip}\,\delta_{jq}\,a_{kr} = a_{ii}, \tag{3.47}$$

$$A' = \frac{1}{2}\,\varepsilon_{ijk}\,\varepsilon_{pqr}\,\delta_{ip}\,a_{jq}\,a_{kr} = b_{ii}, \tag{3.48}$$

$$A = \frac{1}{6}\,\varepsilon_{ijk}\,\varepsilon_{pqr}\,a_{ip}\,a_{jq}\,a_{kr} = \det a_{ij}. \tag{3.49}$$

We call these three invariants the first, second, and third main invariant of the tensor
$a_{ij}$ (after the power of the coordinates of $a_{ij}$). According to (1.35), we can also write $A'$ as

$$A' = \frac{1}{2}\,(\delta_{jq}\delta_{kr} - \delta_{jr}\delta_{kq})\,a_{jq}a_{kr},$$

$$A' = \frac{1}{2}\,(a_{jj}a_{kk} - a_{jk}a_{kj}) = \frac{1}{2}\,[\operatorname{tr}^2 a - \operatorname{tr}(a \cdot a)]. \tag{3.50}$$

2. For a polar tensor, clearly, the eigenvalues and the main invariants are polar scalars.
For an axial tensor, the eigenvalues are, according to (3.39), axial scalars, and then
according to (3.46), both $A$ and $A''$ are axial scalars, while $A'$ is a polar scalar.
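
As a numerical illustration of (3.46) to (3.50), the following sketch computes the three main invariants of an example coordinate matrix and compares the two expressions for A′: the trace formula (3.50) and the sum of the principal second-order subdeterminants, which is the trace of the cotensor in (3.48). The matrix and all function names are assumptions made for this example.

```python
def trace(a):
    return sum(a[i][i] for i in range(3))


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(a):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def main_invariants(a):
    """Return (A'', A', A) following (3.47), (3.50), and (3.49)."""
    first = trace(a)
    second = (trace(a) ** 2 - trace(matmul(a, a))) / 2
    third = det3(a)
    return first, second, third


def sum_of_principal_minors(a):
    """A' written as b_ii, the trace of the cotensor, see (3.48)."""
    pairs = [(1, 2), (0, 2), (0, 1)]
    return sum(a[p][p] * a[q][q] - a[p][q] * a[q][p] for p, q in pairs)


def characteristic_polynomial(a, lam):
    """Left side of (3.46)."""
    first, second, third = main_invariants(a)
    return third - second * lam + first * lam ** 2 - lam ** 3


def det_shifted(a, lam):
    """det(a_ij - lam * delta_ij), the left side of (3.45)."""
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)


a = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]   # example values (assumed)

print(main_invariants(a))
print(characteristic_polynomial(a, 1), det_shifted(a, 1))
```

For any trial value of λ the two functions `characteristic_polynomial` and `det_shifted` should agree, since (3.46) is only a rearrangement of (3.45).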

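3. Using (3.50), we can now derive a simple expression for the cotensor. We have

$$\begin{aligned}
b_{ip} &\overset{(3.11)}{=} \frac{1}{2}\,\varepsilon_{ijk}\,\varepsilon_{pqr}\,a_{jq}\,a_{kr} \\
&\overset{(1.26)}{=} \frac{1}{2}
\begin{vmatrix}
\delta_{ip} & \delta_{iq} & \delta_{ir} \\
\delta_{jp} & \delta_{jq} & \delta_{jr} \\
\delta_{kp} & \delta_{kq} & \delta_{kr}
\end{vmatrix} a_{jq}\,a_{kr}
= \frac{1}{2}
\begin{vmatrix}
\delta_{ip} & \delta_{iq} & \delta_{ir} \\
a_{pq} & a_{qq} & a_{rq} \\
a_{pr} & a_{qr} & a_{rr}
\end{vmatrix} \\
&= \frac{1}{2}\,[\delta_{ip}(a_{qq}a_{rr} - a_{qr}a_{rq}) + \delta_{iq}(a_{rq}a_{pr} - a_{pq}a_{rr}) + \delta_{ir}(a_{pq}a_{qr} - a_{qq}a_{pr})] \\
&= \frac{1}{2}\,[\delta_{ip}(a_{qq}a_{rr} - a_{qr}a_{rq}) + a_{ri}a_{pr} - a_{pi}a_{rr} + a_{pq}a_{qi} - a_{qq}a_{pi}],
\end{aligned}$$

$$b_{ip} = A'\,\delta_{ip} - A''\,a_{pi} + a_{pq}\,a_{qi}, \qquad b^T = A'\,\delta - A''\,a + a^2. \tag{3.51}$$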
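
Formula (3.51) can be cross-checked against a direct computation of the cotensor. The sketch below assumes that the coordinates $b_{ij}$ defined by (3.11) are the cofactors of the $a_{ij}$, which is how the ε-expression above evaluates; the example matrix and the function names are illustrative assumptions.

```python
def cofactor_matrix(a):
    """Cotensor coordinates b_ij, computed as the cofactor of a_ij."""
    b = [[0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [k for k in range(3) if k != i]
            cols = [k for k in range(3) if k != j]
            minor = (a[rows[0]][cols[0]] * a[rows[1]][cols[1]]
                     - a[rows[0]][cols[1]] * a[rows[1]][cols[0]])
            b[i][j] = (-1) ** (i + j) * minor
    return b


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [list(col) for col in zip(*a)]


a = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]   # example values (assumed), deliberately not symmetric

b = cofactor_matrix(a)
first = sum(a[i][i] for i in range(3))    # A'' = a_ii, eq. (3.47)
second = sum(b[i][i] for i in range(3))   # A'  = b_ii, eq. (3.48)
a_squared = matmul(a, a)

# Right side of (3.51): A' delta - A'' a + a^2.
from_formula = [[second * (1 if i == j else 0) - first * a[i][j] + a_squared[i][j]
                 for j in range(3)] for i in range(3)]

print(from_formula == transpose(b))
print(matmul(a, transpose(b)))   # a multiple of the unit matrix
```

The last line prints det a times the unit matrix, which is the familiar property of the transposed cofactor matrix.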

3.11.3 Classification of Tensors by the Type of Their Eigenvalues, Eigenvalue Theorems

1. The characteristic equation (3.45) has at least one real root (we assume real coordi-
nates aij ). Thus each tensor with real coordinates has at least one real eigenvalue. The
following cases are possible:

I. There are three distinct real eigenvalues.

II. There are three distinct eigenvalues, one is real and two are complex conjugate.
III. There are two distinct real eigenvalues, one has multiplicity 1 and the other has
multiplicity 2.
IV. There is only one real eigenvalue, with multiplicity 3.

2. It is easy to show that a symmetric tensor cannot have a complex eigenvalue
(cases I, III, and IV). We prove this by contradiction, by assuming the existence of
a complex eigenvalue, as well as a complex eigendirection. With the imaginary unit
written as $\mathrm{i}$, we have

$$a_{ij}\,(X_j + \mathrm{i}\,Y_j) = (\lambda + \mathrm{i}\,\mu)(X_i + \mathrm{i}\,Y_i).$$

Splitting this equation into real and imaginary parts gives

$$a_{ij}X_j = \lambda X_i - \mu Y_i, \qquad a_{ij}Y_j = \mu X_i + \lambda Y_i.$$

We multiply the first equation by $Y_i$ and the second equation by $X_i$, so we have

$$a_{ij}Y_iX_j = \lambda Y_iX_i - \mu Y_i^2, \qquad a_{ij}X_iY_j = \mu X_i^2 + \lambda X_iY_i.$$

Due to the symmetry of $a_{ij}$ the left sides are equal. Setting the right sides equal gives
$\mu\,(X_i^2 + Y_i^2) = 0$.
Since the eigenvector is nonzero by assumption, it follows that μ = 0, i. e. the
eigenvalue must be real.

3. An antimetric tensor has three eigenvalues,

$$\lambda_1 = 0, \qquad \lambda_2 = \mathrm{i}\,\sqrt{A_i^2}, \qquad \lambda_3 = -\mathrm{i}\,\sqrt{A_i^2}, \tag{3.52}$$

where $A_i$ is the vector associated with the tensor according to (3.8), the $\mathrm{i}$ before the
root is the imaginary unit (case II), and $A_i$ is also an eigenvector corresponding to $\lambda_1$.
To see this, we write out the characteristic equation (3.46). We showed at the end of
Section 3.2 that the determinant of an antimetric tensor is zero. According to (3.48) and
(3.12), we have $A' = A_i^2$, and the trace of an antimetric tensor clearly vanishes.
Thus the characteristic equation reads $\lambda\,(A_i^2 + \lambda^2) = 0$, from which (3.52) follows
immediately. The determining equations (3.39) for the eigenvector corresponding to
$\lambda_1 = 0$ are $a_{ij}X_j = 0$. According to (3.8)$_2$, this gives $\varepsilon_{ijk}A_kX_j = 0$, and then we
immediately recognize that $X_i = A_i$ is a solution, because $\varepsilon_{ijk}$ is antimetric and $A_kA_j$
is symmetric.
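
The statements about antimetric tensors can be verified for a concrete vector. The sketch below builds the coordinate matrix from $a_{ij} = \varepsilon_{ijk}A_k$, the form used in the argument above; the chosen vector and all names are assumptions made for illustration.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


A = [1, 2, 2]   # associated vector (assumed values), A_i^2 = 9

# a_ij = eps_ijk A_k
a = [[sum(levi_civita(i, j, k) * A[k] for k in range(3)) for j in range(3)]
     for i in range(3)]

first = sum(a[i][i] for i in range(3))                      # A'' (trace)
second = (first ** 2
          - sum(a[j][k] * a[k][j] for j in range(3) for k in range(3))) / 2  # A'
third = det3(a)                                             # A (determinant)

image_of_A = [sum(a[i][j] * A[j] for j in range(3)) for i in range(3)]


def characteristic_polynomial(lam):
    """Left side of (3.46) for this tensor."""
    return third - second * lam + first * lam * lam - lam * lam * lam


eigenvalues = [0, 3j, -3j]   # from (3.52) with sqrt(A_i^2) = 3
residuals = [abs(characteristic_polynomial(lam)) for lam in eigenvalues]

print(a)
print(first, second, third, image_of_A, residuals)
```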

4. A tensor which is neither symmetric nor antimetric can clearly belong to any of the
four cases mentioned above.

5. Comparing the characteristic equation (3.46) with its factored form
$(\lambda_1 - \lambda)(\lambda_2 - \lambda)(\lambda_3 - \lambda) = 0$ gives the so-called Vieta formulas

$$\begin{aligned}
A &= \lambda_1\lambda_2\lambda_3, \\
A' &= \lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1, \\
A'' &= \lambda_1 + \lambda_2 + \lambda_3.
\end{aligned} \tag{3.53}$$

A tensor is regular if and only if no eigenvalue is zero, i. e. it is singular if and only if
at least one eigenvalue is zero. If only one eigenvalue is zero, the tensor has rank 2,
since then clearly $A' \neq 0$ and according to (3.48) also $b_{ij} \neq 0$.⁶ We will make further
statements for symmetric tensors in Section 3.12.2.
6. If the coordinate matrix of a tensor a in an appropriate coordinate system is
triangular, i. e.

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix},$$

then the diagonal elements of this matrix are the eigenvalues of the tensor.
In this case the coordinate matrix of the tensor (a − λ δ),

$$\begin{pmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ 0 & a_{22} - \lambda & a_{23} \\ 0 & 0 & a_{33} - \lambda \end{pmatrix},$$

is also triangular, so that the characteristic equation det (a − λ δ) = 0 is

$$(a_{11} - \lambda)(a_{22} - \lambda)(a_{33} - \lambda) = 0.$$
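
As a small worked check of the Vieta formulas (3.53) together with the statement about triangular matrices, consider the following coordinate matrix. The numerical values are chosen here purely for illustration and do not come from the text.

```latex
\begin{aligned}
a_{ij} &= \begin{pmatrix} 1 & 4 & 5 \\ 0 & 2 & 6 \\ 0 & 0 & 3 \end{pmatrix},
\qquad \lambda_1 = 1,\quad \lambda_2 = 2,\quad \lambda_3 = 3, \\
A'' &= a_{ii} = 6 = \lambda_1 + \lambda_2 + \lambda_3, \\
A'  &= \tfrac{1}{2}\,[\operatorname{tr}^2 a - \operatorname{tr}(a \cdot a)]
     = \tfrac{1}{2}\,(36 - 14) = 11
     = \lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1, \\
A   &= \det a_{ij} = 6 = \lambda_1\lambda_2\lambda_3, \\
A - A'\lambda + A''\lambda^2 - \lambda^3
    &= 6 - 11\lambda + 6\lambda^2 - \lambda^3
     = (1 - \lambda)(2 - \lambda)(3 - \lambda).
\end{aligned}
```

The elements above the diagonal do not enter any of the three invariants here, because the diagonal of $a \cdot a$ contains only the squares $a_{11}^2$, $a_{22}^2$, $a_{33}^2$ when the matrix is triangular.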

7. If in one row or column of the coordinate matrix of a tensor only the diagonal
element aii is nonzero, then this diagonal element is an eigenvalue of the tensor,
because expanding the determinant in the characteristic equation det (aij − λ δij )
= 0 with respect to that row or column leads to the factor (aii − λ).
If, in particular, in one column only the diagonal element is nonzero, then the
associated coordinate direction is the eigendirection corresponding to this eigenvalue.
To prove this statement, we assume, without loss of generality, that this is the case in
the first column and compute the eigenvectors corresponding to the eigenvalue on the
diagonal position by solving the system of equations

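$$\begin{pmatrix} 0 & a_{12} & a_{13} \\ 0 & a_{22} - a_{11} & a_{23} \\ 0 & a_{32} & a_{33} - a_{11} \end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$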

6 If two eigenvalues are zero, then $A' = 0$ and hence also $b_{ii} = 0$, from which we of course cannot
conclude that also $b_{ij} = 0$.

A nontrivial solution of this system is given by the coordinates of the basis vector $e_1$,
i. e. by $X_1 = 1$ and $X_2 = X_3 = 0$.
Clearly, the converse is also true: if a coordinate direction is an eigendirection,
then the diagonal element in the associated column of the coordinate matrix is the
corresponding eigenvalue, and the remaining elements of this column are zero. If we
assume, again without loss of generality, that the $x_1$-direction is an eigendirection,
then from the system of equations

$$\begin{pmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

it follows immediately that $a_{11} = \lambda$, $a_{21} = a_{31} = 0$.

3.11.4 Theorems about Eigenvectors

In this section, we derive theorems about eigenvectors of tensors with real coordinates,
and we obtain statements about the four classes of tensors, which we distinguished
at the beginning of Section 3.11.3.
1. The following statement is fundamental.

Theorem 1. To each eigenvalue there exists at least one eigendirection; in particular,
a real eigenvalue has at least one real eigendirection. A complex eigenvalue cannot have
a real eigendirection.

From (3.39) it follows immediately that each eigenvalue has at least one eigendirection.
To ensure full generality, we assume that both the eigenvalue and the eigendirection
in (3.39) are complex, i. e.

a ⋅ (X + i Y) = (λ + i μ)(X + i Y).

Splitting this equation into real and imaginary parts gives

a ⋅ X = λ X − μ Y, a ⋅ Y = μ X + λ Y.

If we assume that the eigenvalue is real, i. e. μ = 0 , it follows that

a ⋅ X = λ X, a ⋅ Y = λ Y.

These two equations can be satisfied by X ≠ 0, Y = 0, so a real eigenvalue has at least
one real eigendirection.
If we assume conversely that the eigenvalue is complex, but the eigendirection is
real, i. e. μ ≠ 0 and Y = 0, then it follows from the imaginary part that μ X = 0 or
X = 0. Since X and Y cannot both be zero, our assumption has led to a contradiction,
i. e. a complex eigenvalue cannot have a real eigendirection.
2. The following theorems follow immediately from the fact that the null directions
of the tensor (a − λ δ) are the eigendirections of the tensor a corresponding to the
eigenvalue λ.

Theorem 2. If, for an eigenvalue λ, the tensor (a − λ δ) has rank 2, then there exists only
one eigendirection corresponding to this eigenvalue.

Theorem 3. If, for an eigenvalue λ, the tensor (a−λ δ) has rank 1, then there exist two dis-
tinct (and thus linearly independent) eigendirections corresponding to this eigenvalue.

All directions which can be represented as linear combinations of distinct
eigendirections, i. e. all vectors lying in the plane spanned by the two eigendirections, are
eigendirections. There exist infinitely many different eigendirections; we also say they
form an eigenplane.

Theorem 4. If, for an eigenvalue λ, the tensor (a − λ δ) has rank 0, then there exist three
linearly independent eigendirections corresponding to this eigenvalue.

Thus all directions are eigendirections; every direction can be represented as a
linear combination of three linearly independent eigendirections. Rank 0 means that
the tensor (a − λ δ) must be the zero tensor, i. e. we have a = λ δ; in other words, the
tensor a must be isotropic. Clearly, the converse is also true: if a tensor is isotropic, it
has an eigenvalue of multiplicity 3, with three linearly independent eigendirections.
3. We will show in two steps that the eigendirections corresponding to distinct eigen-
values are linearly independent.

Theorem 5. Two distinct eigenvalues cannot have a shared eigendirection.

We prove this by contradiction: let us assume that q is the common eigendirection
corresponding to two eigenvalues λ1 and λ2 of the tensor a. Then we have

a ⋅ q = λ1 q = λ2 q.

But since q as an eigenvector cannot be the zero vector, it follows from λ1 q = λ2 q that
λ1 = λ2 .

Theorem 6. The eigendirections corresponding to three distinct eigenvalues are linearly
independent.

We again prove this by contradiction. Let λ1, λ2, and λ3 be the three different
eigenvalues. Then, according to Theorem 1, there exists at least one eigendirection for each
eigenvalue, and, according to Theorem 5, these eigendirections are distinct. Let q1 , q2 ,
and q3 be unit vectors in these eigendirections. Then we have

a ⋅ q1 = λ1 q1 , a ⋅ q2 = λ2 q2 , a ⋅ q3 = λ3 q3 .

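We now assume that $q_1$, $q_2$, and $q_3$ are linearly dependent. Then

$$\alpha_i\, q_i = 0, \qquad \alpha_i \neq 0$$

must hold. Let $\alpha_3 \neq 0$. Then we have with $\beta_1 = -\alpha_1/\alpha_3$, $\beta_2 = -\alpha_2/\alpha_3$

$$q_3 = \beta_1 q_1 + \beta_2 q_2.$$

Since $q_1$, $q_2$, and $q_3$ are unit vectors and all are distinct, clearly neither $\beta_1$ nor $\beta_2$ can
be zero. We now substitute the last equation into $a \cdot q_3 = \lambda_3 q_3$, and we obtain

$$\begin{aligned}
a \cdot (\beta_1 q_1 + \beta_2 q_2) &= \lambda_3\,(\beta_1 q_1 + \beta_2 q_2), \\
\beta_1 \underbrace{a \cdot q_1}_{\lambda_1 q_1} + \beta_2 \underbrace{a \cdot q_2}_{\lambda_2 q_2} &= \beta_1\lambda_3\, q_1 + \beta_2\lambda_3\, q_2, \\
\beta_1(\lambda_1 - \lambda_3)\, q_1 + \beta_2(\lambda_2 - \lambda_3)\, q_2 &= 0.
\end{aligned}$$

Since $\lambda_1 - \lambda_3 \neq 0$, $\lambda_2 - \lambda_3 \neq 0$, and since $q_1$ and $q_2$ are distinct eigenvectors, clearly
$\beta_1$ and $\beta_2$ (and thus $\alpha_1$ and $\alpha_2$) must be zero, which contradicts the conclusion above
that both are nonzero. So the assumption that the $q_i$ are linearly dependent is wrong.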

4. Next we want to prove an important relation between the multiplicity of an
eigenvalue and the number of the corresponding linearly independent eigendirections.

Theorem 7. The number of the linearly independent eigendirections corresponding to
an eigenvalue is at most equal to the multiplicity of the eigenvalue.

Conversely we conclude the following statement.

Theorem 8. If an eigenvalue has three linearly independent eigendirections, it has
multiplicity 3.

Theorem 9. If an eigenvalue has two distinct eigendirections, it can have multiplicity 2
or 3.

Theorem 10. If an eigenvalue has only one eigendirection, it can have multiplicity 1, 2,
or 3.

We show first that a tensor a with an eigenvalue of multiplicity 3 can have up to
three linearly independent eigendirections.
It is sufficient to give an example for each of these three cases. Let us consider a
tensor with the coordinate matrix

$$\begin{pmatrix} \lambda_1 & a_{12} & a_{13} \\ 0 & \lambda_1 & a_{23} \\ 0 & 0 & \lambda_1 \end{pmatrix}.$$

According to Section 3.11.3, No. 6, such a matrix has an eigenvalue $\lambda_1$ of multiplicity 3.
The coordinate matrix of the tensor $(a - \lambda_1 \delta)$ is

$$\begin{pmatrix} 0 & a_{12} & a_{13} \\ 0 & 0 & a_{23} \\ 0 & 0 & 0 \end{pmatrix}.$$

For $a_{12} \neq 0$, $a_{23} \neq 0$ this matrix has rank 2 and hence the tensor has, according
to Theorem 2, only one eigendirection. For $a_{12} = 0$, $a_{23} \neq 0$ the matrix has rank 1
and hence the tensor has, according to Theorem 3, two distinct eigendirections. For
$a_{12} = a_{13} = a_{23} = 0$ the matrix has rank 0 and hence the tensor has, according to
Theorem 4, three linearly independent eigendirections.
We now show that an eigenvalue $\lambda_1$ of multiplicity 2 can have at most two distinct
eigendirections.
The rank of the tensor $(a - \lambda_1 \delta)$ corresponding to an eigenvalue $\lambda_1$ of multiplicity
2 cannot be zero, because otherwise $(a - \lambda_1 \delta)$ would be the zero tensor, i. e. we would
have $a = \lambda_1 \delta$, and this tensor has the eigenvalue $\lambda_1$ of multiplicity 3. So the rank of
$(a - \lambda_1 \delta)$ must be at least one, and thus the tensor a has, according to Theorem 2 and
Theorem 3, at most two distinct eigendirections.
According to Theorem 1, a real eigenvalue has at least one (real) eigendirection. So
it only remains to show that an eigenvalue of multiplicity 2 can also have two distinct
eigendirections. An example is a tensor with the coordinate matrix

$$\begin{pmatrix} \lambda_1 & a_{12} & a_{13} \\ 0 & \lambda_1 & a_{23} \\ 0 & 0 & \lambda_2 \end{pmatrix},$$

where $\lambda_2 \neq \lambda_1$. According to Section 3.11.3, No. 6, such a tensor has an eigenvalue $\lambda_1$ of
multiplicity 2 and an eigenvalue $\lambda_2$ of multiplicity 1, and the coordinate matrix of the
tensor $(a - \lambda_1 \delta)$ is given by

$$\begin{pmatrix} 0 & a_{12} & a_{13} \\ 0 & 0 & a_{23} \\ 0 & 0 & \lambda_2 - \lambda_1 \end{pmatrix}.$$

If $a_{12} = 0$, the matrix has rank 1 and thus the tensor has, according to Theorem 3, two
distinct eigendirections corresponding to the eigenvalue of multiplicity 2.
Now, let λ1 be an eigenvalue of a with multiplicity 1 and let q1 be an eigendirec-
tion corresponding to λ1 . We choose q1 as a unit vector in the x1 -direction of a Cartesian
coordinate system, and we further choose two arbitrary unit vectors, which are orthog-
onal to q1 and orthogonal to each other, as the x2 - and x3 -directions. In this coordinate
system, the coordinate matrix of a is, according to Section 3.11.3, No. 7,

$$\begin{pmatrix} \lambda_1 & a^*_{12} & a^*_{13} \\ 0 & a^*_{22} & a^*_{23} \\ 0 & a^*_{32} & a^*_{33} \end{pmatrix},$$

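and the characteristic equation is

$$\begin{vmatrix} \lambda_1 - \lambda & a^*_{12} & a^*_{13} \\ 0 & a^*_{22} - \lambda & a^*_{23} \\ 0 & a^*_{32} & a^*_{33} - \lambda \end{vmatrix}
= (\lambda_1 - \lambda) \begin{vmatrix} a^*_{22} - \lambda & a^*_{23} \\ a^*_{32} & a^*_{33} - \lambda \end{vmatrix} = 0.$$

For $\lambda = \lambda_1$, the first factor is zero and the equation is satisfied by assumption. If,
for $\lambda = \lambda_1$, the second factor were also zero, then $\lambda_1$ would not be an eigenvalue of
multiplicity 1. Since this is excluded by assumption, the second-order subdeterminant is
nonzero for $\lambda = \lambda_1$, so the coordinate matrix of $(a - \lambda_1 \delta)$ has rank 2, and thus,
according to Theorem 2, the tensor has only one eigendirection corresponding to $\lambda_1$.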

5. A tensor with fewer than three linearly independent eigendirections is called
defective, and a tensor with three linearly independent eigendirections is called nondefective.
According to Theorem 7, a defective tensor must have an eigenvalue with a
multiplicity of at least 2.
For a nondefective tensor we can choose the $g_i$ in (3.28) to be three linearly
independent eigenvectors and then we have $h_i = \lambda_i g_i$. Substituting this into (3.29) gives

$$a = \lambda_i\, g_i\, g^i, \tag{3.54}$$

where $g^i$ is the basis reciprocal to the linearly independent eigenvectors $g_i$.
Transposing (3.54) gives $a^T = \lambda_i\, g^i g_i$, and applying scalar multiplication from the right by $g^j$
gives further

$$a^T \cdot g^j = \lambda_j\, g^j,$$

i. e. the $g^j$ are, with (3.42), at the same time left eigenvectors of a.
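
The representation (3.54) can be tried out on a small nondefective tensor. The sketch below assumes an example coordinate matrix with three known eigenvectors and builds the reciprocal basis from cross products divided by the scalar triple product, which is one standard construction in three dimensions and is not taken from the text; all names are illustrative.

```python
from fractions import Fraction


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def reciprocal_basis(g1, g2, g3):
    """Return g^1, g^2, g^3 with g_i . g^j = delta_ij."""
    volume = dot(g1, cross(g2, g3))
    return [[Fraction(c, volume) for c in cross(g2, g3)],
            [Fraction(c, volume) for c in cross(g3, g1)],
            [Fraction(c, volume) for c in cross(g1, g2)]]


a = [[2, 1, 0],
     [0, 3, 0],
     [0, 0, 5]]                              # example tensor (assumed)
eigenvalues = [2, 3, 5]
g = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]        # right eigenvectors g_i
g_up = reciprocal_basis(*g)                  # reciprocal basis g^i

# a = lambda_i g_i g^i, eq. (3.54), written out in coordinates.
rebuilt = [[sum(lam * g[m][i] * g_up[m][j] for m, lam in enumerate(eigenvalues))
            for j in range(3)] for i in range(3)]

# Z . a for each reciprocal vector, to test the left-eigenvector property.
left_images = [[dot(z, col) for col in zip(*a)] for z in g_up]

print(rebuilt == a)
print(all(img == [lam * zi for zi in z]
          for img, lam, z in zip(left_images, eigenvalues, g_up)))
```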

6. The multiplicity of an eigenvalue as a root of the characteristic equation (3.45) is
called the multiplicity of the eigenvalue, or more precisely, its algebraic multiplicity.
In addition, we can also define the geometric multiplicity of an eigenvalue: it is the
number of linearly independent eigendirections corresponding to this eigenvalue. Using
these terms, we can distinguish between defective and nondefective tensors in the
following way. If a tensor has an eigenvalue whose geometric multiplicity is less than
its algebraic multiplicity, then the tensor is defective. Conversely, a tensor is nondefective
if the algebraic multiplicity of each eigenvalue equals its geometric multiplicity.
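
The two multiplicities can be compared mechanically. The following sketch is restricted, as an assumption made to keep it short, to triangular coordinate matrices, so that the eigenvalues and their algebraic multiplicities can be read off the diagonal (Section 3.11.3, No. 6); the geometric multiplicity is obtained as 3 minus the rank of (a − λ δ), following Theorems 2 to 4. The example matrices and function names are illustrative.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def multiplicities(triangular):
    """Map each eigenvalue to the pair (algebraic, geometric) multiplicity."""
    diagonal = [triangular[i][i] for i in range(3)]
    result = {}
    for lam in set(diagonal):
        shifted = [[triangular[i][j] - (lam if i == j else 0) for j in range(3)]
                   for i in range(3)]
        result[lam] = (diagonal.count(lam), 3 - rank(shifted))
    return result


def is_defective(triangular):
    return any(geo < alg for alg, geo in multiplicities(triangular).values())


single_direction = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]   # (a - 2 delta) has rank 2
with_eigenplane = [[2, 0, 0], [0, 2, 0], [0, 0, 7]]    # (a - 2 delta) has rank 1
mixed = [[2, 1, 0], [0, 2, 0], [0, 0, 7]]              # double eigenvalue, rank 2

for name, m in [("single_direction", single_direction),
                ("with_eigenplane", with_eigenplane),
                ("mixed", mixed)]:
    print(name, multiplicities(m), is_defective(m))
```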

7. We end this section by summarizing the theorems for the four classes of tensors,
which we distinguished at the beginning of the previous section.

I: The tensor has three distinct real eigenvalues.

Then there exist three linearly independent real eigendirections, one for
each eigenvalue; the tensor is nondefective.

II: The tensor has three distinct eigenvalues, one is real and two are complex
conjugate.
Then there exist three linearly independent eigendirections, the real
eigenvalue has a real eigendirection, and the two complex eigenvalues have complex
eigendirections; the tensor is nondefective.
III: The tensor has two distinct real eigenvalues, one of multiplicity 1 and one of
multiplicity 2.
Then two cases are possible:
a) If the tensor (a−λ δ), with λ being an eigenvalue of multiplicity 2, has rank
2, then there exists one real eigendirection corresponding to this eigen-
value and another real eigendirection (different from the first) correspond-
ing to the second eigenvalue of multiplicity 1; the tensor is defective.
b) If the tensor (a − λ δ), with λ being an eigenvalue of multiplicity 2, has
rank 1, then there exist two linearly independent real eigendirections cor-
responding to this eigenvalue and spanning an eigenplane, and another
real eigendirection not lying in the eigenplane and corresponding to the
eigenvalue of multiplicity 1; the tensor is nondefective.
IV: The tensor has one real eigenvalue of multiplicity 3.
Then three cases are possible:
a) If the tensor (a − λ δ) has rank 2, then there exists only one real eigendi-
rection; the tensor is defective.
b) If the tensor (a − λ δ) has rank 1, there exist two linearly independent real
eigendirections, which span an eigenplane; the tensor is defective.
c) If the tensor (a − λ δ) has rank 0, then the tensor is isotropic and each
direction is an eigendirection; the tensor is nondefective.

Problem 3.12.
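Consider tensors with the following coordinate matrices:⁷

$$\text{A. } \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\text{B. } \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\text{C. } \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

$$\text{D. } \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix}, \qquad
\text{E. } \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix}, \qquad
\text{F. } \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$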

For each tensor, find the eigenvalues and, for each eigenvalue, find a set of linearly
independent eigendirections. In which cases is the tensor nondefective, and in which
cases do there exist three mutually orthogonal eigendirections?

7 Source: Adalbert Duschek, August Hochrainer: Grundzüge der Tensorrechnung in analytischer
Darstellung, Vol. 1: Tensoralgebra, 5th Edition, pp. 113–115. Wien: Springer, 1968.

3.11.5 Eigenvalues and Eigenvectors of Square Matrices

1. The terms eigenvalue and eigenvector carry over to any square matrix of order N.
To such a matrix $\underset{\sim}{a}$ we can assign an eigenvalue equation

$$(\underset{\sim}{a} - \lambda\,\underset{\sim}{I})\,\underset{\sim}{X} = \underset{\sim}{0}. \tag{3.55}$$

We call the numbers λ, which satisfy this equation, the eigenvalues of the matrix and
we call the column matrices $\underset{\sim}{X} \neq \underset{\sim}{0}$, which solve this equation for a particular value
of λ, the (right) eigenvectors of the matrix corresponding to the eigenvalue λ.
We can interpret (3.55) as a homogeneous system of linear equations for the N
elements of the column matrix $\underset{\sim}{X}$. For this system to have a nontrivial solution, the
coefficient determinant must be zero, i. e. the characteristic equation

$$\det(\underset{\sim}{a} - \lambda\,\underset{\sim}{I}) = 0 \tag{3.56}$$

must hold. Using (1.29), we have for the left side

$$\det(\underset{\sim}{a} - \lambda\,\underset{\sim}{I})
= \frac{1}{N!}\,\varepsilon_{ij \ldots k}\,\varepsilon_{pq \ldots r}\,(a_{ip} - \lambda\,\delta_{ip})(a_{jq} - \lambda\,\delta_{jq}) \cdots (a_{kr} - \lambda\,\delta_{kr}), \tag{3.57}$$

$$ij \ldots k,\ pq \ldots r:\ N \text{ indices},$$

which is clearly a polynomial of degree N in λ. We can also write the characteristic
equation in the form

$$\alpha - \alpha'\lambda + \alpha''\lambda^2 + \cdots + (-1)^{N-1}\alpha^{(N-1)}\lambda^{N-1} + (-1)^N\lambda^N = 0. \tag{3.58}$$

Such an equation has N not necessarily distinct and generally complex solutions
(roots) $\lambda_1, \lambda_2, \ldots, \lambda_N$, and therefore it can be factored into the form

$$(\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_N) = 0.$$

By expanding, we obtain for the coefficients $\alpha^{(k)}$ (read: α k-prime)

$$\begin{aligned}
\alpha &= \lambda_1\lambda_2 \cdots \lambda_N, \\
\alpha' &= \lambda_2\lambda_3 \cdots \lambda_N + \lambda_1\lambda_3\lambda_4 \cdots \lambda_N + \cdots + \lambda_1\lambda_2 \cdots \lambda_{N-1}, \\
&\;\;\vdots \\
\alpha^{(N-1)} &= \lambda_1 + \lambda_2 + \cdots + \lambda_N.
\end{aligned} \tag{3.59}$$

Here the general coefficient $\alpha^{(k)}$ is clearly the sum of all the products with (N − k)
different factors, which can be composed of the N roots $\lambda_i$, and there are $\binom{N}{k}$ such
products.

By expanding (3.57) and sorting according to the powers of λ, we obtain,
alternatively to (3.59), a representation of the coefficients $\alpha^{(k)}$ as functions of the matrix
elements $a_{ij}$.
For the constant term, it follows from (1.29) that

$$\alpha = \frac{1}{N!}\,\varepsilon_{ij \ldots k}\,\varepsilon_{pq \ldots r}\,a_{ip}\,a_{jq} \cdots a_{kr} = \det \underset{\sim}{a}. \tag{3.60}$$

For the linear term, we obtain

$$\alpha' = \frac{1}{N!}\,\varepsilon_{ij \ldots k}\,\varepsilon_{pq \ldots r}\,(\delta_{ip}\,a_{jq} \cdots a_{kr} + a_{ip}\,\delta_{jq} \cdots a_{kr} + \cdots + a_{ip}\,a_{jq} \cdots \delta_{kr}).$$

Expanding this and exchanging the order of the factors and the corresponding indices
of the epsilons, gives, e. g. for the second term,

$$\frac{1}{N!}\,\varepsilon_{ij \ldots k}\,\varepsilon_{pq \ldots r}\,a_{ip}\,\delta_{jq} \cdots a_{kr}
= \frac{1}{N!}\,\varepsilon_{ji \ldots k}\,\varepsilon_{qp \ldots r}\,\delta_{jq}\,a_{ip} \cdots a_{kr},$$

but this is equal to the first term, i. e. all N terms are equal, and we have

$$\alpha' = \frac{1}{(N-1)!}\,\varepsilon_{ij \ldots k}\,\varepsilon_{iq \ldots r}\,a_{jq} \cdots a_{kr}. \tag{3.61}$$

According to (1.30), this is, for i = 1, the minor of the element $a_{11}$ and analogously,
for i = 2, the minor of the element $a_{22}$, etc. This means $\alpha'$ is the sum of the principal
minors of the matrix $\underset{\sim}{a}$ (actually, of the cofactors, but on the main diagonal minors and
cofactors coincide).
For $\alpha''$, we have in (1.29) instead of the product $a_{ip}\,a_{jq} \cdots a_{kr}$ the sum of the $\binom{N}{2}$
products, in each of which two factors a are replaced by δ. Analogously to $\alpha'$, we can
show that all terms are equal, i. e.

$$\alpha'' = \frac{1}{(N-2)!\,2!}\,\varepsilon_{ijk \ldots l}\,\varepsilon_{ijp \ldots q}\,a_{kp} \cdots a_{lq}.$$

This in turn is the sum of the principal subdeterminants of order (N − 2) of the matrix.
For the coefficient of $\lambda^k$ we get

$$\alpha^{(k)} = \frac{1}{(N-k)!\,k!}\,\varepsilon_{i \ldots j m \ldots n}\,\varepsilon_{i \ldots j p \ldots q}\,a_{mp} \cdots a_{nq}, \tag{3.62}$$

$$i \ldots j:\ k \text{ indices}, \qquad m \ldots n,\ p \ldots q:\ N - k \text{ indices},$$

which is the sum of the $\binom{N}{k}$ principal subdeterminants of order (N − k) of the matrix.
For the coefficient $\alpha^{(N-1)}$ of $\lambda^{N-1}$, we finally obtain the sum of the elements of the
main diagonal; we call this, analogously to the tensors, the trace of the square matrix.
We have

$$\alpha^{(N-1)} = a_{ii} =: \operatorname{tr} \underset{\sim}{a}. \tag{3.63}$$
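
The two representations of the coefficients, (3.59) in terms of the roots and (3.62) in terms of principal subdeterminants, can be compared directly. The sketch below assumes a triangular matrix of order 4, so that the roots are known from the diagonal; the matrix and all function names are illustrative, and the determinant routine is a plain Laplace expansion that is only suitable for small N.

```python
from itertools import combinations
from math import prod


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def alpha(a, k):
    """alpha^(k): sum of the principal subdeterminants of order N - k, see (3.62)."""
    n = len(a)
    return sum(det([[a[i][j] for j in keep] for i in keep])
               for keep in combinations(range(n), n - k))


def elementary_symmetric(roots, order):
    """Sum of all products of `order` different roots, see (3.59)."""
    return sum(prod(c) for c in combinations(roots, order))


a = [[1, 5, 0, 2],
     [0, 2, 7, 1],
     [0, 0, 3, 4],
     [0, 0, 0, 4]]          # example values (assumed); eigenvalues 1, 2, 3, 4
roots = [a[i][i] for i in range(4)]
n = len(a)

coefficients = [alpha(a, k) for k in range(n)]
from_roots = [elementary_symmetric(roots, n - k) for k in range(n)]


def characteristic_polynomial(lam):
    """Left side of (3.58)."""
    return (sum((-1) ** k * coefficients[k] * lam ** k for k in range(n))
            + (-1) ** n * lam ** n)


print(coefficients, from_roots)
print([characteristic_polynomial(lam) for lam in roots])
```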

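For example, for the easiest case of a square matrix of order 2, the characteristic
equation is

$$\alpha - \alpha'\lambda + \lambda^2 = 0, \tag{3.64}$$

and for the coefficients of this equation we have

$$\alpha = \lambda_1\lambda_2 = \det \underset{\sim}{a} = a_{11}a_{22} - a_{12}a_{21},$$

$$\alpha' = \lambda_1 + \lambda_2 = \operatorname{tr} \underset{\sim}{a} = a_{11} + a_{22}.$$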
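
For order 2 the characteristic equation (3.64) is a quadratic and can be solved in closed form. The sketch below is one way to do this, using the complex square root from Python's standard `cmath` module so that complex roots are handled as well; the function name and the two example matrices are assumptions for illustration.

```python
import cmath


def eigenvalues_2x2(a):
    """Roots of alpha - alpha' * lam + lam**2 = 0, eq. (3.64)."""
    alpha = a[0][0] * a[1][1] - a[0][1] * a[1][0]    # determinant
    alpha_prime = a[0][0] + a[1][1]                  # trace
    root = cmath.sqrt(alpha_prime ** 2 - 4 * alpha)
    return (alpha_prime + root) / 2, (alpha_prime - root) / 2


symmetric = [[2, 1],
             [1, 2]]      # real eigenvalues
rotation = [[0, -1],
            [1, 0]]       # rotation by a right angle, no real eigenvalue

print(eigenvalues_2x2(symmetric))
print(eigenvalues_2x2(rotation))
```

The second matrix shows that a real matrix of even order need not have any real eigenvalue.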
2. The theorems from Sections 3.11.3 and 3.11.4 about eigenvalues and eigenvectors
carry over, to a large extent, from tensors to square matrices.
A square matrix of order N with real elements has, according to (3.58), exactly
N (real and/or complex) eigenvalues, if we count the eigenvalues according to their
multiplicity. If N is odd, we always have at least one real eigenvalue; if N is even, all
eigenvalues can be complex.
The number of the linearly independent eigendirections corresponding to an
eigenvalue λ is determined by the rank defect of the matrix $\underset{\sim}{a} - \lambda\,\underset{\sim}{I}$. If $\underset{\sim}{a} - \lambda\,\underset{\sim}{I}$ has the
rank defect M, then there exist M linearly independent eigendirections corresponding
to this eigenvalue. Because of $\det(\underset{\sim}{a} - \lambda\,\underset{\sim}{I}) = 0$, the rank defect is at least M = 1, i. e.
for each eigenvalue, there exists at least one corresponding eigendirection, for a real
eigenvalue a real eigendirection, for a complex eigenvalue a complex eigendirection.
If all eigenvalues are real and distinct, then there are N linearly independent
eigenvectors, which then form a basis for the N-dimensional vector space of the
column matrices of order N.
We can make additional statements, if the matrix $\underset{\sim}{a}$ has a particular shape. In a
triangular matrix, the diagonal elements are also eigenvalues; if in the i-th column of
$\underset{\sim}{a}$ only the diagonal element is nonzero, then this diagonal element is an eigenvalue
of $\underset{\sim}{a}$ and the corresponding eigenvector $\underset{\sim}{X}$ contains only a one in the i-th row, while all
the remaining elements are zero.

3.12 Symmetric Tensors

3.12.1 Principal Axes Transformation

1. Every tensor whose coordinate matrix has a diagonal form in a suitable Cartesian
coordinate system, i. e. it contains only zeros outside the main diagonal, is clearly sym-
metric. We will show in this section that, conversely, for every symmetric tensor, there
exists (at least) one Cartesian coordinate system in which its coordinate matrix has a
diagonal form. We call the coordinate directions of such a coordinate system the prin-
cipal axes of the tensor, and the transformation of an arbitrary Cartesian coordinate
system to the principal axes we call a principal axes transformation. We assume for
the rest of this section that all Cartesian coordinate systems and by this also the
system of principal axes are right-handed systems; for this we have to choose the order of
the principal axes appropriately. With this restriction, we do not need to distinguish
between polar and axial tensors for the rest of this section.
2. We will prove that a symmetric tensor always has a system of principal axes by
demonstrating how to find such a system of principal axes. According to Section 3.11.3,
No. 2, a symmetric tensor always has three (not necessarily distinct) real eigenvalues
and according to Section 3.11.4, No. 1, Theorem 1, it has at least one corresponding real
eigendirection. Using these facts, we transform the tensor a in a first step to a Cartesian
coordinate system, in which the x̃1 -axis coincides with this eigendirection. According
to Section 3.11.3, No. 7 and because of the symmetry, which is, according to Section 2.6,
No. 7, invariant under a coordinate transformation, the tensor a has, in this coordinate
system, the coordinate matrix

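$$\tilde{a}_{ij} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \tilde{a}_{22} & \tilde{a}_{23} \\ 0 & \tilde{a}_{32} & \tilde{a}_{33} \end{pmatrix}. \tag{a}$$

The corresponding characteristic equation is

$$\begin{vmatrix} \lambda_1 - \lambda & 0 & 0 \\ 0 & \tilde{a}_{22} - \lambda & \tilde{a}_{23} \\ 0 & \tilde{a}_{32} & \tilde{a}_{33} - \lambda \end{vmatrix} = 0,$$

so the two eigenvalues $\lambda_2$ and $\lambda_3$ are determined by the equation

$$\begin{vmatrix} \tilde{a}_{22} - \lambda & \tilde{a}_{23} \\ \tilde{a}_{32} & \tilde{a}_{33} - \lambda \end{vmatrix} = 0.$$

The eigenvalue equation corresponding to (a) for $\lambda_2$ reads

$$\begin{pmatrix} \lambda_1 - \lambda_2 & 0 & 0 \\ 0 & \tilde{a}_{22} - \lambda_2 & \tilde{a}_{23} \\ 0 & \tilde{a}_{32} & \tilde{a}_{33} - \lambda_2 \end{pmatrix}
\begin{pmatrix} \tilde{X}_1 \\ \tilde{X}_2 \\ \tilde{X}_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

or, after the multiplication,

$$\begin{aligned}
(\lambda_1 - \lambda_2)\,\tilde{X}_1 &= 0, \\
(\tilde{a}_{22} - \lambda_2)\,\tilde{X}_2 + \tilde{a}_{23}\,\tilde{X}_3 &= 0, \\
\tilde{a}_{32}\,\tilde{X}_2 + (\tilde{a}_{33} - \lambda_2)\,\tilde{X}_3 &= 0.
\end{aligned} \tag{b}$$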

If $\lambda_2$ and $\lambda_1$ are distinct, (b) can only be satisfied if $\tilde{X}_1 = 0$, i. e. if the
eigendirection corresponding to $\lambda_2$ is perpendicular to the $\tilde{x}_1$-axis, which is at the same time an
eigendirection corresponding to the eigenvalue $\lambda_1$. Along the way, we proved an
important theorem.
The eigendirections corresponding to two distinct eigenvalues of a symmetric tensor
are perpendicular to each other.
It follows immediately from Section 3.11.3, No. 7 that, for three different
eigenvalues, the corresponding eigendirections form the basis of a system of principal
axes and, according to Section 3.11.4, No. 4, Theorem 7, this principal axes system is
unique.
If $\lambda_2$ and $\lambda_1$ are equal, we have both $\lambda_1 - \lambda_2 = 0$ and

$$(\tilde{a}_{22} - \lambda_1)(\tilde{a}_{33} - \lambda_1) - \tilde{a}_{23}^2 = 0.$$

Then the tensor $a - \lambda_1 \delta$ is at least double singular and in addition to the eigendirection
$\tilde{X}_1 = 1$, $\tilde{X}_2 = \tilde{X}_3 = 0$, there is at least one more eigendirection corresponding to the
eigenvalue $\lambda_1$. For the second eigendirection we can choose $\tilde{X}_1 = 0$, which then lies
in the $\tilde{x}_2$, $\tilde{x}_3$-plane, i. e. it is perpendicular to the first eigendirection, and both
eigendirections span an eigenplane.
Regardless of whether λ1 and λ2 are equal, we can choose the two eigendirections
as the x̂1- and x̂2-axes of a new Cartesian coordinate system and transform the
tensor a again. Using the same arguments as before, we have for the coordinate matrix
in the new coordinate system

\[
\hat a_{ij} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \hat a_{33} \end{pmatrix}.
\]

But then, according to Section 3.11.3, No. 7, the x̂3-axis is also an eigendirection,
corresponding to the eigenvalue $\lambda_3 = \hat a_{33}$; so we have shown that every
symmetric tensor has at least one system of principal axes and that in any system of
principal axes the elements on the main diagonal of the coordinate matrix are eigenvalues
and the principal axes are eigendirections.
Thus a symmetric tensor with three different eigenvalues has only one system of
principal axes. For an eigenvalue of multiplicity 2, there exist two linearly independent
eigendirections, which span an eigenplane, and every Cartesian basis with one basis
vector perpendicular to this plane forms a system of principal axes. A tensor with an
eigenvalue of multiplicity 3 is isotropic; in this case every Cartesian basis is a system
of principal axes.
3. A symmetric tensor is always nondefective and can therefore always be represented
in the form (3.54). Since, according to Section 3.9.3, a basis $q_i$ of a system of principal
axes is identical to its reciprocal basis, we can replace both the $g_i$ and the $g^i$ in
(3.54) by $q_i$, which gives

\[
a = \lambda_i\, q_i\, q_i .
\]


4. In our proof of the principal axes transformation, we transformed the coordinate
matrix of a symmetric tensor, step by step, into diagonal form. For practical computations,
it is not necessary to follow this stepwise procedure; we can compute the diagonal
form with a single transformation. To this end, we solve the eigenvalue problem
$a_{ij} X_j = \lambda X_i$, form a Cartesian basis from the eigenvectors $q_i$, and use
their coordinates $\overset{j}{q}_i$, as in (2.4), as transformation coefficients
$\alpha_{ij} = \overset{j}{q}_i$. Applying the transformation
$\hat a_{ij} = \alpha_{mi}\, \alpha_{nj}\, a_{mn}$ then results in a coordinate representation
with respect to the basis $q_i$ in which only the diagonal elements are nonzero, and these
are the eigenvalues: $\hat a_{ij} = \lambda_i \delta_{ij}$; the order of the eigenvalues on
the main diagonal depends on the order in which the basis is constructed from the
corresponding eigenvectors.
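
The single-step transformation can be checked numerically. The following is a minimal sketch, assuming a small example matrix and hand-computed eigenvectors that are not taken from the text; the names `a`, `q`, `alpha` and `a_hat` are chosen here for illustration only.

```python
import math

# Hypothetical symmetric coordinate matrix a_ij (illustration only).
a = [[3.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 7.0]]

s = 1.0 / math.sqrt(2.0)
# Orthonormal eigenvectors for the eigenvalues 2, 4, 7, ordered so that
# q[0] x q[1] = q[2], i.e. the new basis is right-handed.
q = [[s, -s, 0.0],
     [s, s, 0.0],
     [0.0, 0.0, 1.0]]

# alpha[i][j] is the i-th coordinate of the j-th eigenvector.
alpha = [[q[j][i] for j in range(3)] for i in range(3)]

# Principal axes transformation: a_hat_ij = alpha_mi alpha_nj a_mn.
a_hat = [[sum(alpha[m][i] * alpha[n][j] * a[m][n]
              for m in range(3) for n in range(3))
          for j in range(3)]
         for i in range(3)]

for row in a_hat:
    print([round(value, 12) for value in row])
```

Swapping two of the eigenvectors in `q` permutes the diagonal entries of `a_hat` accordingly (and, unless one vector is also reversed, changes the handedness of the basis).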

5. We summarize the results of this section; they are valid under the assumption that
all coordinate systems are right-handed systems and so they are valid for both polar
and for axial tensors.

– A system of principal axes is a (right-handed) Cartesian coordinate system in
which the coordinate matrix of the tensor has a diagonal form.
– Any tensor which has a system of principal axes is symmetric and every symmetric
tensor has (at least) one system of principal axes.
– Any principal axis is an eigendirection of the tensor and each triple of mutually
orthogonal eigendirections forms a system of principal axes.
– In every system of principal axes the elements on the main diagonal of the coordinate
matrix are the eigenvalues of the tensor, i. e.

\[
\hat a_{ij} = \lambda_i \delta_{ij} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}. \tag{3.66}
\]

– The eigenvalues λi are determined by the characteristic equation

\[
\det (a_{ij} - \lambda\, \delta_{ij}) = 0. \tag{3.67}
\]

– The transformation coefficients αij of the principal axes transformation
$\hat a_{ij} = \alpha_{mi}\, \alpha_{nj}\, a_{mn}$ are also the coordinates $\overset{j}{q}_i$
of the three orthonormal eigenvectors $q_j$ of the tensor in the original coordinate
system, i. e.

\[
\alpha_{ij} = \overset{j}{q}_i . \tag{3.68}
\]


– The coordinates $\overset{j}{q}_i$ of three orthonormal eigenvectors $q_j$ of the tensor
in the original coordinate system are determined by the eigenvalue equation

\[
(a_{ij} - \lambda_k\, \delta_{ij})\, \overset{k}{q}_j = 0. \tag{3.69}
\]

– Between the coordinates $\hat a_{ij}$ of the tensor in a system of principal axes, its
eigenvalues λi, its coordinates aij in the original coordinate system, and the
coordinates $\overset{j}{q}_i$ of the three orthonormal eigenvectors in the original
coordinate system, we have the following relations:

\[
\begin{aligned}
\hat a_{ij} &= \overset{i}{q}_m\, \overset{j}{q}_n\, a_{mn} = \lambda_i\, \delta_{ij}, \\
a_{ij} &= \overset{m}{q}_i\, \overset{n}{q}_j\, \hat a_{mn} = \lambda_k\, \overset{k}{q}_i\, \overset{k}{q}_j, \\
a &= \lambda_k\, q_k\, q_k .
\end{aligned} \tag{3.70}
\]

– For the number of systems of principal axes we have:

– If only eigenvalues with multiplicity 1 exist, there is only one system of prin-
cipal axes.
– If an eigenvalue of multiplicity 2 exists, then any coordinate system with
one axis in the eigendirection corresponding to the eigenvalue with multi-
plicity 1 is a system of principal axes.
– If an eigenvalue of multiplicity 3 exists, then every coordinate system is a
system of principal axes; the tensor is isotropic.

Problem 3.13.
Consider a tensor which has in its original coordinate system the coordinate matrix

\[
t_{ij} = \begin{pmatrix} 5 & 0 & 4 \\ 0 & 9 & 0 \\ 4 & 0 & 5 \end{pmatrix}.
\]

Transform the tensor to principal axes, i. e. find a right-handed Cartesian basis (in
terms of the coordinates of the original system) in which the coordinate matrix of the
tensor has a diagonal form, and write out this coordinate matrix.


3.12.2 Eigenvalues and Rank of a Tensor

From the existence of a system of principal axes (3.66) it further follows that if all eigen-
values are nonzero, the tensor a has rank 3; if one eigenvalue is zero, it has rank 2; if
two eigenvalues are zero, it has rank 1; and if all three eigenvalues are zero, it has rank
zero. Thus we have, for a symmetric tensor, a simple relation between the rank of the
tensor and the number of its vanishing eigenvalues.

Rank    Eigenvalues

3       λ1, λ2, λ3 ≠ 0
2       λ1 = 0; λ2, λ3 ≠ 0
1       λ1 = λ2 = 0; λ3 ≠ 0
0       λ1 = λ2 = λ3 = 0

3.12.3 Eigenvalues and Definiteness of a Tensor

1. We consider the quadratic form of an arbitrary tensor a with the Cartesian coordi-
nates aij ,

Q = aij Xi Xj , (3.71)

where X is an arbitrary nonzero vector. If, independently of the choice of X, the
quadratic form Q is always positive, a is called positive definite; if Q is always negative,
a is called negative definite. If Q is always positive or zero or always negative or
zero, a is called positive semidefinite or negative semidefinite, respectively. Finally,
if Q can take both signs, a is called indefinite. That is, any positive definite tensor is
also positive semidefinite, any negative definite tensor is also negative semidefinite,
and any tensor which is neither positive semidefinite nor negative semidefinite is
indefinite.⁸

2. If we compute the quadratic form (3.71) in the system of principal axes, we get

\[
Q = \lambda_1 X_1^2 + \lambda_2 X_2^2 + \lambda_3 X_3^2 , \tag{3.72}
\]

i. e. the sign of Q depends for X ≠ 0 only on the sign of the λj, and because of Vieta’s
formulas (3.53)$_1$, the same is true for the determinant of the tensor. Thus, for a sym-
metric tensor, we find the following relations between the sign of Q and the signs of
its eigenvalues and of its determinant.

8 Except for the zero tensor, which is both positive and negative semidefinite, the three sets of positive
semidefinite, negative semidefinite, and indefinite tensors are disjoint.


Quadratic form                        Tensor                  Eigenvalues            Determinant

Q > 0                                 positive definite       all > 0                > 0
Q ≧ 0                                 positive semidefinite   all ≧ 0                ≧ 0
Q < 0                                 negative definite       all < 0                < 0
Q ≦ 0                                 negative semidefinite   all ≦ 0                ≦ 0
Q takes positive and negative sign    indefinite              with different signs   no statement possible

Thus positive and negative definite tensors are always regular.
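
The two tables (rank in Section 3.12.2 and definiteness above) can be read as simple decision rules on the eigenvalues. The following is a minimal sketch of these rules; the function names and the tolerance `tol`, which decides when a computed eigenvalue counts as zero, are assumptions made here and are not part of the text.

```python
def rank_from_eigenvalues(eigenvalues, tol=1e-12):
    """Rank of a symmetric tensor: the number of nonvanishing eigenvalues."""
    return sum(1 for lam in eigenvalues if abs(lam) > tol)


def definiteness(eigenvalues, tol=1e-12):
    """Definiteness class of a symmetric tensor from its real eigenvalues."""
    has_positive = any(lam > tol for lam in eigenvalues)
    has_negative = any(lam < -tol for lam in eigenvalues)
    has_zero = any(abs(lam) <= tol for lam in eigenvalues)
    if has_positive and has_negative:
        return "indefinite"
    if has_positive:
        return "positive semidefinite" if has_zero else "positive definite"
    if has_negative:
        return "negative semidefinite" if has_zero else "negative definite"
    # Only zero eigenvalues: the zero tensor (see the footnote above).
    return "positive and negative semidefinite"


print(rank_from_eigenvalues([0.0, 0.0, 5.0]), definiteness([0.0, 0.0, 5.0]))
print(rank_from_eigenvalues([-1.0, 2.0, 3.0]), definiteness([-1.0, 2.0, 3.0]))
```

A definite tensor never reaches the `has_zero` branches, which is the statement that definite tensors are regular.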

Problem 3.14.
We can generalize the eigenvalue problem of a tensor a to the case that the eigenvalue
λ is associated with a second tensor b, i. e.

aij Xj = λ bij Xj .

Let a and b be two symmetric tensors. Determine the conditions under which the gen-
eralized eigenvalue problem has only real eigenvalues.

3. With the quadratic form Q = aij Xi Xj we can interpret the eigendirections of a sym-
metric tensor a in a different way than in Section 3.11.1, No. 1. To this end, we assume
X to be a unit vector and ask for those directions, for which the quadratic form takes
its maximal or minimal values. Thus we formulated an extremum problem,

\[
Q = a_{ij} X_i X_j \longrightarrow \text{extremum},
\]

for the coordinates of the vector X with the constraint Xi Xi = 1 or

1 − δij Xi Xj = 0.

A common method for solving such problems is to multiply the left side of the con-
straint with an initially undetermined factor λ (the so-called Lagrange multiplier) and
add this term to the function whose extrema are to be determined, i. e.

h = aij Xi Xj + λ (1 − δij Xi Xj ) .

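The necessary condition for the extrema of Q is then that the partial derivatives of the
auxiliary function h with respect to the coordinates of X must be zero, i. e.

\[
\frac{\partial h}{\partial X_k}
= (a_{ij} - \lambda\, \delta_{ij}) \Bigl( \underbrace{\frac{\partial X_i}{\partial X_k}}_{\delta_{ik}} X_j + X_i \underbrace{\frac{\partial X_j}{\partial X_k}}_{\delta_{jk}} \Bigr)
= (a_{kj} - \lambda\, \delta_{kj})\, X_j + (a_{ik} - \lambda\, \delta_{ik})\, X_i = 0.
\]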


From the symmetry of aij and δij it follows that

(aij − λ δij ) Xj = 0,

i. e. we obtained the familiar eigenvalue problem for the tensor a. Thus the quadratic
form takes its extrema in the eigendirections of a.
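
The extremal property can be illustrated numerically. The following is a minimal sketch, assuming an example matrix with the known eigenvalues 2, 4 and 7 (it is not taken from the text): the quadratic form is sampled on random unit vectors and compared with its values in the eigendirections. The helper names and the fixed random seed are choices made here.

```python
import math
import random

# Hypothetical symmetric coordinate matrix with eigenvalues 2, 4, 7.
a = [[3.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 7.0]]


def quadratic_form(a, x):
    """Q = a_ij X_i X_j."""
    return sum(a[i][j] * x[i] * x[j] for i in range(3) for j in range(3))


def random_unit_vector(rng):
    """Random direction, normalized so that X_i X_i = 1."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-9:
            return [c / norm for c in v]


rng = random.Random(0)
samples = [quadratic_form(a, random_unit_vector(rng)) for _ in range(10000)]
q_min, q_max = min(samples), max(samples)

s = 1.0 / math.sqrt(2.0)
q_smallest = quadratic_form(a, [s, -s, 0.0])   # eigendirection of eigenvalue 2
q_largest = quadratic_form(a, [0.0, 0.0, 1.0])  # eigendirection of eigenvalue 7
print(q_min, q_max, q_smallest, q_largest)
```

All sampled values lie between the smallest and the largest eigenvalue, and these bounds are attained in the corresponding eigendirections; the eigendirection of the middle eigenvalue is a stationary point of Q on the unit sphere but neither a maximum nor a minimum.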

3.12.4 Symmetric Matrices

A symmetric matrix, i. e. a matrix for which $\underset{\sim}{A} = \underset{\sim}{A}^{\mathrm T}$,
must be square. Furthermore, we restrict ourselves in this section to matrices with real
elements, without mentioning it every time.
Several results from symmetric tensors (with real coordinates) carry over to sym-
metric matrices of order N (with real elements).

1. A fundamental statement for symmetric matrices is the following.

Theorem 1. A symmetric matrix has only real eigenvalues, and to each eigenvalue
corresponds at least one real eigenvector (i. e. an eigenvector where all elements are real).

We can show, analogously to the corresponding proof for symmetric tensors in
Section 3.11.3, No. 2, that a symmetric matrix cannot have complex eigenvalues.
Furthermore, from (3.55) it follows that for any real eigenvalue there exists at least one
corresponding real eigenvector.

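2. We can generalize the concept of orthogonality to an N-dimensional space, using
the scalar product of two vectors. If, for the matrix product of two column matrices
with N elements, we have $\underset{\sim}{X}^{\mathrm T} \underset{\sim}{Y} = 0$
(i. e. $X_i Y_i = 0$), we call the column matrices $\underset{\sim}{X}$ and
$\underset{\sim}{Y}$ mutually orthogonal. From N mutually orthogonal column matrices,
normalized to unit length, we can construct an orthogonal matrix $\underset{\sim}{R}$, and
then we can assign, according to (1.58), a similar matrix
$\tilde{\underset{\sim}{a}} = \underset{\sim}{R}^{\mathrm T} \underset{\sim}{a}\, \underset{\sim}{R}$
to a matrix $\underset{\sim}{a}$. Clearly, this is a generalization of the
transformation law (2.15) for the coordinates of a second-order tensor. Similar matrices
have the same characteristic equation (and thus the same eigenvalues), which in our
case follows, using (1.13), from

\[
\det (\tilde{\underset{\sim}{a}} - \lambda\, \underset{\sim}{E})
= \det (\underset{\sim}{R}^{\mathrm T} \underset{\sim}{a}\, \underset{\sim}{R} - \lambda\, \underset{\sim}{R}^{\mathrm T} \underset{\sim}{E}\, \underset{\sim}{R})
= (\det \underset{\sim}{R}^{\mathrm T})\, \det (\underset{\sim}{a} - \lambda\, \underset{\sim}{E})\, (\det \underset{\sim}{R})
= \det (\underset{\sim}{a} - \lambda\, \underset{\sim}{E}) = 0.
\]

With these generalizations we come to the following theorem.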

Theorem 2. A symmetric matrix has at least one system of principal axes consisting
of mutually orthogonal eigenvectors. In the system of principal axes the matrix has a
diagonal form; the elements on the main diagonal are the eigenvalues of the matrix.


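The principal axes transformation follows the same procedure as for tensors. In
the first step we choose an orthogonal matrix $\underset{\sim}{R}_1$, whose first column
is an eigenvector $\underset{\sim}{X}_1$ of $\underset{\sim}{a}$. Because of the symmetry,
as in Section 3.12.1, the elements in the first row and the first column of the resulting
similar matrix $\tilde{\underset{\sim}{a}}$ are all zero except the element on the diagonal,
which is the eigenvalue λ1 corresponding to the eigenvector $\underset{\sim}{X}_1$, so we have

\[
\tilde{\underset{\sim}{a}} = \underset{\sim}{R}_1^{\mathrm T}\, \underset{\sim}{a}\, \underset{\sim}{R}_1 =
\begin{pmatrix}
\lambda_1 & 0 & 0 & \dots & 0 \\
0 & \tilde a_{22} & \tilde a_{23} & \dots & \tilde a_{2N} \\
0 & \tilde a_{23} & \tilde a_{33} & \dots & \tilde a_{3N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & \tilde a_{2N} & \tilde a_{3N} & \dots & \tilde a_{NN}
\end{pmatrix}.
\]

The first equation in the eigenvalue problem
$\tilde{\underset{\sim}{a}}\, \tilde{\underset{\sim}{X}} = \lambda\, \tilde{\underset{\sim}{X}}$
of the matrix $\tilde{\underset{\sim}{a}}$ is now decoupled from the remaining equations,
i. e. the eigenvector $\tilde{\underset{\sim}{X}}_1$ has the elements $\tilde X_{11} = 1$,
$\tilde X_{12} = \dots = \tilde X_{1N} = 0$, while further eigenvectors
$\tilde{\underset{\sim}{X}}_i$ must satisfy $\tilde X_{i1} = 0$ (if λi ≠ λ1) or we
can choose $\tilde X_{i1} = 0$ (if λi = λ1). Thus the eigenvector
$\tilde{\underset{\sim}{X}}_1$ is orthogonal to all remaining eigenvectors
$\tilde{\underset{\sim}{X}}_i$. Therefore, we can, using the principal submatrix

\[
\begin{pmatrix}
\tilde a_{22} & \tilde a_{23} & \dots & \tilde a_{2N} \\
\tilde a_{23} & \tilde a_{33} & \dots & \tilde a_{3N} \\
\vdots & \vdots & \ddots & \vdots \\
\tilde a_{2N} & \tilde a_{3N} & \dots & \tilde a_{NN}
\end{pmatrix},
\]

find a second eigenvalue λ2, construct an orthogonal matrix $\underset{\sim}{R}_2$ whose
first two columns are the eigenvectors $\underset{\sim}{X}_1$ and $\underset{\sim}{X}_2$,
and transform the matrix $\underset{\sim}{a}$. This gives

\[
\hat{\underset{\sim}{a}} = \underset{\sim}{R}_2^{\mathrm T}\, \underset{\sim}{a}\, \underset{\sim}{R}_2 =
\begin{pmatrix}
\lambda_1 & 0 & 0 & \dots & 0 \\
0 & \lambda_2 & 0 & \dots & 0 \\
0 & 0 & \hat a_{33} & \dots & \hat a_{3N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \hat a_{3N} & \dots & \hat a_{NN}
\end{pmatrix}.
\]

After N such steps the matrix is fully diagonalized. If multiple eigenvalues exist, then
we have, as in Section 3.12.1, more than one system of principal axes.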
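
The first step of this procedure can be sketched in code. The text does not say how the orthogonal matrix with a prescribed first column is obtained; the sketch below assumes one possible choice, a Householder reflection, which is a technique swapped in here for illustration. The example matrix, the eigenvector `x1` and the helper names are likewise assumptions.

```python
import math


def matmul(p, q):
    """Matrix product of two matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]


def transpose(p):
    return [list(row) for row in zip(*p)]


# Hypothetical symmetric matrix with eigenvalues 2, 4, 7.
a = [[3.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 7.0]]
n = len(a)

s = 1.0 / math.sqrt(2.0)
x1 = [s, s, 0.0]  # unit eigenvector for the eigenvalue 4

# Householder reflection R1 = E - 2 v v^T / (v^T v) with v = x1 - e1.
# It is orthogonal (and symmetric), and its first column equals x1.
# Assumption: x1 is not already the first unit vector (otherwise v = 0).
v = [x1[i] - (1.0 if i == 0 else 0.0) for i in range(n)]
vv = sum(c * c for c in v)
R1 = [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv for j in range(n)]
      for i in range(n)]

# Similar matrix: first row and column are zero except for the eigenvalue.
a_tilde = matmul(transpose(R1), matmul(a, R1))
for row in a_tilde:
    print([round(value, 12) for value in row])
```

The remaining eigenvalues can then be sought in the principal submatrix obtained by deleting the first row and column of `a_tilde`, as described above.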

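3. Any symmetric matrix of order N belongs to one of the five definiteness classes.
Let $\underset{\sim}{a}$ be a symmetric matrix of order N and let $\underset{\sim}{X}$ be a
nonzero column matrix with N elements. Then its quadratic form is

\[
Q = \underset{\sim}{X}^{\mathrm T}\, \underset{\sim}{a}\, \underset{\sim}{X}, \qquad Q = X_i\, a_{ij}\, X_j . \tag{3.73}
\]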


Now, if the number Q is positive, independently of the choice of $\underset{\sim}{X}$, we
call the matrix $\underset{\sim}{a}$ positive definite; if it is always negative, we call
$\underset{\sim}{a}$ negative definite. If Q is always positive or zero, or always negative
or zero, we call $\underset{\sim}{a}$ positive semidefinite or negative semidefinite,
respectively. If finally Q can take both signs, we call $\underset{\sim}{a}$ indefinite.
With these definitions we come to the following theorem.

Theorem 3. If a symmetric matrix is positive definite, it has only positive eigenvalues; if
it is negative definite, it has only negative eigenvalues; if it is positive semidefinite, it has
no negative eigenvalues; and if it is negative semidefinite, it has no positive eigenvalues.

We can explain this statement as follows. If we choose $\underset{\sim}{X}$ to be an
arbitrary eigenvector corresponding to the eigenvalue λ, we have

\[
Q = \underset{\sim}{X}^{\mathrm T}\, \underset{\sim}{a}\, \underset{\sim}{X}
= \underset{\sim}{X}^{\mathrm T}\, \lambda\, \underset{\sim}{X}
= \lambda\, \underset{\sim}{X}^{\mathrm T} \underset{\sim}{X}.
\]

Since $\underset{\sim}{X}$ has only real elements, $\underset{\sim}{X}^{\mathrm T} \underset{\sim}{X}$
is positive, i. e. λ and Q have the same sign.
Since, according to (3.59) and (3.60), the determinant of a symmetric matrix is
equal to the product of its N eigenvalues, we further have the following theorem.

Theorem 4. The determinant of a positive definite symmetric matrix is positive.

4. A submatrix which is obtained from a symmetric matrix by eliminating a number
of columns and rows with the same indices is called a principal submatrix.

Theorem 5. The principal submatrices of a symmetric matrix have the same definiteness
as the matrix itself; for example, if the matrix is positive definite, then all its principal
submatrices are also positive definite.

We prove this statement as follows. If we choose $\underset{\sim}{X}$ in (3.73) such that
$X_1 = 0$, then from the $N^2$ terms on the right side of (3.73) all those with index one
vanish, i. e. we obtain the quadratic form of the principal submatrix that we obtained from
$\underset{\sim}{a}$ by eliminating the first row and the first column. If
$\underset{\sim}{a}$ is positive definite, then Q is also positive for the chosen
$\underset{\sim}{X}$, and because this is true for all nonzero $\underset{\sim}{X}$ with
$X_1 = 0$, the principal submatrix under consideration is itself also positive definite.
With analogous considerations for $\underset{\sim}{X}$ in which one or several other
elements vanish, we can generalize this to all principal submatrices; and if we follow the
same argument for a matrix with a different definiteness, for example a negative
semidefinite matrix, we see immediately that our argument remains valid.
Because the main diagonal elements of a matrix are also the principal submatrices
of order one, we have in particular the following theorem.

Theorem 6. The main diagonal elements of a positive definite symmetric matrix are all
positive; for a negative definite symmetric matrix they are all negative; for a positive
semidefinite symmetric matrix they are all positive or zero; and for a negative semidefinite
symmetric matrix they are all negative or zero.


5. The converse of all these theorems is not true, e. g. if all eigenvalues of a matrix
are real, we cannot conclude that the matrix is symmetric; and if the determinant
of a symmetric matrix is positive, it does not follow that the matrix is positive
definite.
3.13 Orthogonal Polar Tensors

3.13.1 Rotation in a Plane

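For the following, we need a matrix which maps a position vector X in the x, y-plane
by rotating about the z-axis through an angle ϑ to a vector U. Clearly, if r is the length
of X and φ the angle between X and the x-axis, the coordinates of these vectors are

\[
\begin{aligned}
X_1 &= r \cos\varphi, \qquad X_2 = r \sin\varphi, \\
U_1 &= r \cos(\varphi + \vartheta) = r\, (\cos\varphi \cos\vartheta - \sin\varphi \sin\vartheta)
     = X_1 \cos\vartheta - X_2 \sin\vartheta, \\
U_2 &= r \sin(\varphi + \vartheta) = r\, (\sin\varphi \cos\vartheta + \cos\varphi \sin\vartheta)
     = X_2 \cos\vartheta + X_1 \sin\vartheta,
\end{aligned}
\]

or, in matrix form,

\[
\begin{pmatrix} U_1 \\ U_2 \end{pmatrix}
= \begin{pmatrix} \cos\vartheta & -\sin\vartheta \\ \sin\vartheta & \cos\vartheta \end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}. \tag{3.74}
\]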
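
As a quick check of (3.74), the following minimal sketch applies the rotation to a vector in the plane; the function name `rotate_in_plane` and the sample values are chosen here for illustration.

```python
import math


def rotate_in_plane(x, theta):
    """Apply (3.74): rotate (X1, X2) through the angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


# Rotating the unit vector along the x-axis by 90 degrees gives, up to
# rounding, the unit vector along the y-axis.
u = rotate_in_plane((1.0, 0.0), math.pi / 2.0)
print(u)
```

Because the matrix in (3.74) is orthogonal, the length of the vector is unchanged by the mapping.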

3.13.2 Transformation to an Eigendirection

1. Any tensor (with real coordinates), in particular any orthogonal tensor, has (at least)
one real eigenvalue and correspondingly (at least) one real eigendirection. Since an or-
thogonal tensor represents a length preserving mapping, all its real eigenvalues must
be, according to (3.37), either +1 or −1.
According to Section 3.11.3, in particular (3.53)$_1$, the following cases can occur:
I. All three eigenvalues are 1, then $\det \underset{\sim}{a} = 1$.
II. All three eigenvalues are −1, then $\det \underset{\sim}{a} = -1$.
III. Two eigenvalues are 1, one is −1, then $\det \underset{\sim}{a} = -1$.
IV. Two eigenvalues are −1, one is 1, then $\det \underset{\sim}{a} = 1$.
V. Two eigenvalues are complex conjugate to each other, one is 1; since the product
of two complex conjugate numbers is positive, then $\det \underset{\sim}{a} = 1$.
VI. Two eigenvalues are complex conjugate to each other, one is −1, then
$\det \underset{\sim}{a} = -1$.

Thus, in all cases where $\det \underset{\sim}{a} = 1$, one eigenvalue is 1; and in all cases
where $\det \underset{\sim}{a} = -1$, one eigenvalue is −1.

2. It is reasonable to assume that in a coordinate system where one basis vector is
an eigenvector, an orthogonal tensor has a particularly simple form which is easy to
interpret. Therefore, we assume in the following that the basis vector e3 of our coordinate
system is an eigenvector of the tensor and that the corresponding eigenvalue has
the same sign as the determinant of the tensor. The coordinate matrix with respect to
such a basis is then, according to Section 3.11.3, No. 7,

\[
\begin{pmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & \pm 1 \end{pmatrix}.
\]

Since the sum of the squares in the last row must be 1, we have a31 = a32 = 0, so

\[
\begin{pmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & 0 \\ 0 & 0 & \pm 1 \end{pmatrix}.
\]

The sum of the squares in the first column is $a_{11}^2 + a_{21}^2 = 1$, which can be
satisfied, without loss of generality, by a11 = cos ϑ, a21 = sin ϑ, i. e.

\[
\begin{pmatrix} \cos\vartheta & a_{12} & 0 \\ \sin\vartheta & a_{22} & 0 \\ 0 & 0 & \pm 1 \end{pmatrix}.
\]

Since the sign of det aij must match the sign of a33, it follows that

\[
\begin{vmatrix} \cos\vartheta & a_{12} \\ \sin\vartheta & a_{22} \end{vmatrix}
= a_{22} \cos\vartheta - a_{12} \sin\vartheta = 1.
\]

The sum of the squares in the first row gives a12 = ± sin ϑ, and from that it follows
with the previous equation that a12 = − sin ϑ and a22 = cos ϑ.
Thus, if $\det \underset{\sim}{a} = 1$, i. e. if the tensor is proper orthogonal, we have for
the coordinate matrix of our orthogonal tensor

\[
a_{ij} = \begin{pmatrix} \cos\vartheta & -\sin\vartheta & 0 \\ \sin\vartheta & \cos\vartheta & 0 \\ 0 & 0 & +1 \end{pmatrix} \tag{3.75}
\]

and if $\det \underset{\sim}{a} = -1$, i. e. if it is improper orthogonal, we have

\[
a_{ij} = \begin{pmatrix} \cos\vartheta & -\sin\vartheta & 0 \\ \sin\vartheta & \cos\vartheta & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{3.76}
\]

3. These formulas are easy to interpret with (3.74). The mapping given by a proper
orthogonal tensor does not change the z-coordinate of any position vector and it ro-
tates the projection of the position vector onto the x, y-plane by the angle ϑ; i. e. the
mapping of the tensor is a rotation through the angle ϑ about the eigendirection cor-
responding to its positive eigenvalue. This explains why a proper orthogonal tensor
is also called a rotation tensor. The mapping given by an improper orthogonal tensor
changes only the sign of the z-coordinate, and it again rotates the projection of the
position vector onto the x, y-plane by the angle ϑ; i. e. the mapping of the tensor is
a rotation through the angle ϑ about the eigendirection corresponding to its negative
eigenvalue, followed by a reflection through the plane orthogonal to this eigendirection.

4. Since the mapping by a tensor has to be independent of the coordinate system, this
intuitive interpretation implies that an orthogonal tensor must be invariant under a
rotation about the eigendirection, i. e. it has the same coordinates in all coordinate
systems where the z-axis points in this eigendirection. Hence, such a tensor can be
diagonalized only for ϑ = 0 or ϑ = π .

Problem 3.15.
Find the eigenvalues of the orthogonal tensor
\[
a_{ij} = \begin{pmatrix} \cos\vartheta & -\sin\vartheta & 0 \\ \sin\vartheta & \cos\vartheta & 0 \\ 0 & 0 & \pm 1 \end{pmatrix}.
\]


5. We end this section by reviewing the six cases we distinguished earlier, starting
with the three cases where the tensor is proper orthogonal:
– ϑ = 0: All eigenvalues are 1. The tensor is the identity tensor, it represents a rotation
by 0°, i. e. every vector is mapped onto itself (Case I).
– 0 < ϑ < π: Two eigenvalues are complex conjugate, one is 1. The tensor represents
a rotation by the angle ϑ (Case V).
– ϑ = π: Two eigenvalues are −1, one is 1. The tensor represents a rotation by 180°
(Case IV).

Next we consider the three cases where the tensor is improper orthogonal:
– ϑ = 0: Two eigenvalues are 1, one is −1. The tensor represents a reflection through
the plane orthogonal to the eigendirection of the negative eigenvalue (Case III).
– 0 < ϑ < π: Two eigenvalues are complex conjugate, one is −1. The tensor rep-
resents the same reflection as before, combined with a rotation by the angle ϑ
(Case VI).
– ϑ = π: All three eigenvalues are −1. The tensor is the negative identity tensor; it
is also called the inversion tensor, because it maps any vector to a vector with the
same length pointing in the opposite direction (Case II).

Problem 3.16.
A. Find the quantities α, β, γ, δ, ε, ζ in the matrix

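\[
\alpha_{ij} = \begin{pmatrix} \alpha & \beta & \delta \\ -\alpha\sqrt{2} & \gamma & \varepsilon \\ \alpha & -\beta & \zeta \end{pmatrix},
\]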

such that the αij represent the transformation matrix between two Cartesian coor-
dinate systems (with or without change of orientation).
Hint: Consider first which of the two orthogonality relations (2.6) is more con-
venient for the computations.
B. How many solutions does the problem have? What additional information do you
need, so that the problem has a unique solution? How can the different solutions
be interpreted geometrically?


3.13.3 The Orthogonal Tensor as a Function of Rotation Angle and Rotation Axis or
Reflection Axis

Rotation

1. A proper orthogonal polar tensor (rotation tensor) can be represented by its axis of
rotation, which we assume to be given by the axial unit vector n and its rotation angle
ϑ. To this end, we have to express the rotated vector U as a function of n, ϑ, and the
original vector X. We choose as a basis for this representation:
– the unit vector n, representing the axis of rotation,
– the projection a of the vector X onto the plane orthogonal to n, and
– the vector n × X.

This basis is orthogonal, but not normalized. To simplify our derivation, we assume
that X and hence U is a polar vector and that the underlying coordinate system is a
right-handed system.

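According to the figure, we have

\[
U = \overrightarrow{OQ} = \overrightarrow{OM} + \overrightarrow{MR} + \overrightarrow{RQ}. \tag{a}
\]

We set

\[
\overrightarrow{OM} = \alpha\, n, \qquad \overrightarrow{MR} = \beta\, a, \qquad \overrightarrow{RQ} = \gamma\, n \times X,
\]

and now, from the figure, we have to find α, β, γ, and a as a function of n, ϑ, and X.
Since n is a unit vector, we have

\[
\alpha = \overline{OM} = \overline{OP} \cos(n, X) = |X| \cos(n, X) = n \cdot X.
\]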


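Furthermore,

\[
\beta = \frac{\overline{MR}}{\overline{MP}} = \frac{\overline{MR}}{\overline{MQ}} = \cos\vartheta.
\]

For RQ, we have, on the one hand, by taking the length of
$\overrightarrow{RQ} = \gamma\, n \times X$,

\[
\overline{RQ} = \gamma\, |X| \sin(n, X) = \gamma\, \overline{MP},
\]

and on the other hand, from the triangle MRQ,

\[
\overline{RQ} = \overline{MQ} \sin\vartheta = \overline{MP} \sin\vartheta.
\]

Setting both terms equal gives γ = sin ϑ. Finally,

\[
a = \overrightarrow{MP} = \overrightarrow{OP} - \overrightarrow{OM} = X - (n \cdot X)\, n.
\]

Substituting this into (a) gives

\[
U = (n \cdot X)\, n + \cos\vartheta\, \bigl(X - (n \cdot X)\, n\bigr) + \sin\vartheta\; n \times X.
\]

To obtain this equation in the form U = a ⋅ X, we translate it into coordinate notation:

\[
U_i = n_j X_j\, n_i + \cos\vartheta\, (X_i - n_j X_j\, n_i) + \sin\vartheta\; \varepsilon_{ijk}\, n_j X_k ,
\]

from which we get by factoring out $X_j$

\[
U_i = [\, n_i n_j + (\delta_{ij} - n_i n_j) \cos\vartheta - \varepsilon_{ijk}\, n_k \sin\vartheta\, ]\, X_j ,
\]

so that

\[
a_{ij} = n_i n_j + (\delta_{ij} - n_i n_j) \cos\vartheta - \varepsilon_{ijk}\, n_k \sin\vartheta. \tag{3.77}
\]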
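
Formula (3.77) translates directly into code. The following is a minimal sketch, assuming zero-based indices and a compact closed-form expression for the permutation symbol; the function names and the sample axis are chosen here for illustration.

```python
import math


def levi_civita(i, j, k):
    """Permutation symbol for indices 0, 1, 2 (+1, -1 or 0)."""
    return (i - j) * (j - k) * (k - i) // 2


def rotation_tensor(n, theta):
    """Coordinates a_ij from (3.77) for a unit axis vector n and angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[n[i] * n[j]
             + ((1.0 if i == j else 0.0) - n[i] * n[j]) * c
             - sum(levi_civita(i, j, k) * n[k] for k in range(3)) * s
             for j in range(3)]
            for i in range(3)]


# With n along the z-axis, (3.77) reduces to the matrix (3.75).
theta = math.pi / 3.0
a_z = rotation_tensor([0.0, 0.0, 1.0], theta)

# A general (hypothetical) axis; the axis itself is mapped onto itself.
n = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
a = rotation_tensor(n, theta)
a_times_n = [sum(a[i][j] * n[j] for j in range(3)) for i in range(3)]
print(a_times_n)
```

The first two terms of (3.77) form the symmetric part of the tensor and the last term its antimetric part.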

2. Thus an orthogonal tensor is completely determined by four pieces of information,

– the direction of the rotation axis, represented by the unit vector ni (two pieces
of information, e. g. two coordinates with respect to a Cartesian basis; the third
coordinate of ni is then determined by normalizing the vector to unit length);
– the rotation angle ϑ; and
– the sign of the determinant.

We further see that the coordinates of an orthogonal tensor do not change under a
rotation of the coordinate system about its axis of rotation. In all these coordinate
systems, the coordinates of ni and clearly also the scalar ϑ have the same value. Finally,
we see that (3.77) is the decomposition of aij in its symmetric and antimetric parts.
3. We can also write ni and ϑ as functions of aij. To this end we contract (3.77), so we
obtain

\[
a_{ii} = 1 + 2 \cos\vartheta, \qquad \cos\vartheta = \tfrac{1}{2}\, (a_{ii} - 1). \tag{3.78}
\]


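We further compute $\varepsilon_{ijm}\, a_{ij}$. Here we can replace aij, according to
(2.40), by its antimetric part, so we have

\[
\begin{aligned}
\varepsilon_{ijm}\, a_{ij} &= -\, \varepsilon_{ijm}\, \varepsilon_{ijk}\, n_k \sin\vartheta \\
&= -\, (\delta_{jj}\, \delta_{mk} - \delta_{jk}\, \delta_{mj})\, n_k \sin\vartheta \\
&= -\, (3 - 1)\, \delta_{mk}\, n_k \sin\vartheta,
\end{aligned}
\]

\[
n_m = -\, \frac{\varepsilon_{ijm}\, a_{ij}}{2 \sin\vartheta}. \tag{3.79}
\]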
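
Equations (3.78) and (3.79) can be sketched as a small routine. The function name, the clamping of the cosine against rounding errors and the error raised for sin ϑ = 0 are assumptions made here; (3.79) itself gives no axis in that case.

```python
import math


def levi_civita(i, j, k):
    """Permutation symbol for indices 0, 1, 2 (+1, -1 or 0)."""
    return (i - j) * (j - k) * (k - i) // 2


def axis_and_angle(a):
    """Rotation axis n and angle theta of a rotation tensor, from (3.78), (3.79)."""
    cos_theta = 0.5 * (a[0][0] + a[1][1] + a[2][2] - 1.0)
    # Clamp to [-1, 1] so that rounding errors cannot break acos.
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    sin_theta = math.sin(theta)
    if abs(sin_theta) < 1e-12:
        raise ValueError("(3.79) does not determine the axis for theta = 0 or pi")
    n = [-sum(levi_civita(i, j, m) * a[i][j]
              for i in range(3) for j in range(3)) / (2.0 * sin_theta)
         for m in range(3)]
    return n, theta


# Rotation by 90 degrees about the z-axis, written out from (3.75).
a = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
n, theta = axis_and_angle(a)
print(n, theta)
```

Since `math.acos` returns values between 0 and π, the routine always reports an angle between 0° and 180°, with the orientation of the axis carrying the sense of rotation.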

Thus we can uniquely assign an angle of rotation between 0° and 180° and an axis
of rotation, arbitrarily oriented in space, to any rotation tensor. Rotations by an angle
between 0° and −180° correspond to rotations by an angle between 0° and 180°
with reversely oriented axis of rotation. Substituting ϑ → −ϑ does not change (3.78) and
results with (3.79) in the substitution ni → −ni.

Reflection

An improper orthogonal tensor, which reflects any vector through the plane orthogonal
to n, can be represented by the unit vector n. To this end, we have to express the
reflected vector U as a function of the original vector X and the normal vector n.

From the figure we have

\[
U = \overrightarrow{OQ} = \overrightarrow{ON} + \overrightarrow{NQ} = -\overrightarrow{OM} + \overrightarrow{MP}.
\]


Following the same considerations as in the last section, we have

\[
\overrightarrow{OM} = (n \cdot X)\, n, \qquad \overrightarrow{MP} = X - (n \cdot X)\, n,
\]

\[
U = -(n \cdot X)\, n + X - (n \cdot X)\, n = X - 2\, (n \cdot X)\, n
\]

or in coordinate notation

\[
U_i = X_i - 2\, n_j X_j\, n_i .
\]

Factoring out $X_j$ we obtain

\[
U_i = (\delta_{ij} - 2\, n_i n_j)\, X_j ,
\]

\[
a_{ij} = \delta_{ij} - 2\, n_i n_j , \tag{3.80}
\]

where aij is a symmetric tensor which has the eigenvalue −1 with multiplicity 1 and
the eigenvalue 1 with multiplicity 2, and n is the unit eigenvector corresponding to the
eigenvalue −1.

Rotary Reflection

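A rotary reflection (or improper rotation) is a rotation about the axis n by the angle ϑ,
combined with a reflection through the plane orthogonal to n. Then we have

\[
\begin{aligned}
a_{ij} &= (\delta_{ik} - 2\, n_i n_k)\, [\, n_k n_j + (\delta_{kj} - n_k n_j) \cos\vartheta - \varepsilon_{kjm}\, n_m \sin\vartheta\, ] \\
&= n_i n_j - 2\, n_i n_j + (\delta_{ij} - 2\, n_i n_j - n_i n_j + 2\, n_i n_j) \cos\vartheta
   - (\varepsilon_{ijm}\, n_m - 2\, \varepsilon_{kjm}\, n_i n_k n_m) \sin\vartheta,
\end{aligned}
\]

\[
a_{ij} = -\, n_i n_j + (\delta_{ij} - n_i n_j) \cos\vartheta - \varepsilon_{ijm}\, n_m \sin\vartheta. \tag{3.81}
\]

From (3.81) it follows that $a_{ii} = -1 + 2 \cos\vartheta$ and
$\varepsilon_{ijk}\, a_{ij} = -\varepsilon_{ijk}\, \varepsilon_{ijm}\, n_m \sin\vartheta$; hence we
obtain the angle and the axis of rotation of a rotary reflection analogously to (3.78) and
(3.79) as

\[
\cos\vartheta = \tfrac{1}{2}\, (a_{ii} + 1), \qquad n_m = -\, \frac{\varepsilon_{ijm}\, a_{ij}}{2 \sin\vartheta}. \tag{3.82}
\]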
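
The composition leading to (3.81) can be verified numerically. The following is a minimal sketch, assuming a sample axis and angle chosen here; it builds the reflection (3.80) and the rotation (3.77), multiplies them, and compares the product with the closed form (3.81). All function names are hypothetical.

```python
import math


def levi_civita(i, j, k):
    """Permutation symbol for indices 0, 1, 2 (+1, -1 or 0)."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    return 1.0 if i == j else 0.0


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def reflection_tensor(n):
    """(3.80): reflection through the plane orthogonal to the unit vector n."""
    return [[delta(i, j) - 2.0 * n[i] * n[j] for j in range(3)] for i in range(3)]


def rotation_tensor(n, theta, sign=1.0):
    """(3.77) for sign = +1, (3.81) for sign = -1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[sign * n[i] * n[j] + (delta(i, j) - n[i] * n[j]) * c
             - sum(levi_civita(i, j, m) * n[m] for m in range(3)) * s
             for j in range(3)]
            for i in range(3)]


n = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]  # hypothetical unit axis
theta = 0.7                            # hypothetical angle in radians

product = matmul(reflection_tensor(n), rotation_tensor(n, theta))
closed_form = rotation_tensor(n, theta, sign=-1.0)
trace = sum(closed_form[i][i] for i in range(3))
print(trace, -1.0 + 2.0 * math.cos(theta))
```

The same check with the factors in reverse order gives the same result, since the rotation about n and the reflection through the plane orthogonal to n commute.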

Problem 3.17.
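One of the solutions to Problem 3.16 is

\[
\alpha_{ij} = \begin{pmatrix}
\tfrac{1}{2} & \tfrac{1}{2}\sqrt{2} & \tfrac{1}{2} \\
-\tfrac{1}{2}\sqrt{2} & 0 & \tfrac{1}{2}\sqrt{2} \\
\tfrac{1}{2} & -\tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}
\end{pmatrix}.
\]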

Find the axis and the angle of rotation and interpret the transformation geometrically.


3.13.4 Rotation and Coordinate Transformation

The equations (2.8)

ei = αij ẽj , ẽi = αji ej (3.83)

represent a mapping between two Cartesian bases which are rotated relative to each other.

However, the transformation laws (2.17) for tensor coordinates can be interpreted in
two ways: either as a transformation law for the coordinates of a tensor between two
coordinate systems which are rotated relative to each other, or as a transformation law
for the coordinates of two tensors in the same coordinate system which are rotated
relative to each other. For example, in the first interpretation, the equations

\[
a_i = \alpha_{ij}\, \tilde a_j , \qquad \tilde a_i = \alpha_{ji}\, a_j \tag{3.84}
\]

relate the coordinates of the vector a, from the figure on the facing page, in the no-tilde
coordinate system to the coordinates of this vector in the tilde coordinate system.
Then, according to (2.4), $\alpha_{ij} = \overset{j}{\tilde e}_i$ is the transformation matrix
whose columns are the basis vectors of the tilde coordinate system, written as a linear
combination of the basis vectors of the no-tilde coordinate system. In the second
interpretation, the equations (3.84) relate the coordinates of the two vectors a and ã in the
no-tilde coordinate system. Then α = αij ei ej is the rotation tensor, which maps the vector
ã to the vector a, i. e. it rotates it through the angle ϑ, which is positive in our case, as
can be seen from the figure.
If we interpret (3.84)$_1$ as an equation for the coordinates of the same vector a in
two different coordinate systems, we cannot translate it into symbolic notation. If we
interpret it as an equation for the coordinates of two different vectors, a and ã, in the
same coordinate system, then we can translate it as a = α ⋅ ã.
Analogously, the equations

xi = αij x̃j , x̃i = αji xj (3.85)


either relate the coordinates of the point P, from the figure above, in the no-tilde
coordinate system to the coordinates of this point in the tilde coordinate system, or they
relate the coordinates of the two points, P and P̃, in the no-tilde coordinate system. In
the first case it is not possible to translate equation (3.85)$_1$ into symbolic notation, in
the second case we can translate it into x = α ⋅ x̃ with α as rotation tensor.
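
The two readings use the same arithmetic and differ only in what the numbers mean. The following is a minimal sketch, assuming a rotation about the z-axis by a sample angle; the variable names are chosen here for illustration.

```python
import math

theta = math.pi / 6.0  # hypothetical rotation angle
c, s = math.cos(theta), math.sin(theta)

# alpha_ij: the columns are the tilde basis vectors, written in the
# no-tilde basis (first reading), or equally the coordinates of the
# rotation tensor in the no-tilde basis (second reading).
alpha = [[c, -s, 0.0],
         [s, c, 0.0],
         [0.0, 0.0, 1.0]]

a_tilde = [1.0, 0.0, 0.0]

# a_i = alpha_ij a~_j, the first equation of (3.84)
a = [sum(alpha[i][j] * a_tilde[j] for j in range(3)) for i in range(3)]

# a~_i = alpha_ji a_j, the second equation of (3.84)
a_tilde_back = [sum(alpha[j][i] * a[j] for j in range(3)) for i in range(3)]

print(a, a_tilde_back)
```

In the first reading, `a_tilde` and `a` are the coordinates of one and the same vector (here the first tilde basis vector) in the two coordinate systems; in the second reading, `a` is a different vector, obtained by rotating `a_tilde` through the angle `theta` within the no-tilde coordinate system.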

3.14 Powers of Tensors. Cayley–Hamilton Theorem

3.14.1 Powers with Integer Exponents

1. Applying the transformation U = a ⋅ X a second time gives V = a ⋅ U = a ⋅ a ⋅ X. This
suggests introducing the square of the tensor a as a tensor which maps X to V and
defining higher powers of a analogously. Then we have

\[
\begin{aligned}
a^2 &:= a \cdot a, & a^{(2)}_{ij} &:= a_{ik}\, a_{kj}, \\
a^3 &:= a \cdot a \cdot a, & a^{(3)}_{ij} &:= a_{ik}\, a_{kl}\, a_{lj}, \\
&\ \text{etc.,} & &\ \text{etc.}
\end{aligned} \tag{3.86}
\]

Here $a^{(2)}_{ij}$ is the i, j-coordinate of the tensor $a^2$ and not the square of the
i, j-coordinate of the tensor a, i. e. we say “a square i j”. These powers, where the
exponent is a natural number, are defined for arbitrary tensors.
2. For all tensors, except the zero tensor, we define the zeroth power to be the identity
tensor; we have

\[
a^0 := \delta, \qquad a^{(0)}_{ij} := \delta_{ij}. \tag{3.87}
\]

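3. If the inverse of a tensor exists, i. e. if the original tensor is regular, we define the
minus-one power to be the inverse tensor and further powers with negative integer
exponents as

\[
\begin{aligned}
a^{-2} &:= a^{-1} \cdot a^{-1}, & a^{(-2)}_{ij} &:= a^{(-1)}_{ik}\, a^{(-1)}_{kj}, \\
a^{-3} &:= a^{-1} \cdot a^{-1} \cdot a^{-1}, & a^{(-3)}_{ij} &:= a^{(-1)}_{ik}\, a^{(-1)}_{kl}\, a^{(-1)}_{lj}, \\
&\ \text{etc.,} & &\ \text{etc.}
\end{aligned}
\]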

4. The n-th power of an eigenvalue of a is an eigenvalue of $a^n$, and all powers of a have
the same eigenvectors as a itself.
We start by proving this theorem for powers with positive integer exponents, using
the example

\[
b_{ij} = a_{im}\, a_{mn}\, a_{nj} .
\]


Let λ be an eigenvalue and let Xj be an eigenvector of aij, i. e. we have

$$ a_{ij}\,X_j = \lambda\,X_i. $$

Then

$$ b_{ij}\,X_j = a_{im}\,a_{mn}\,a_{nj}\,X_j = a_{im}\,a_{mn}\,\lambda\,X_n = a_{im}\,\lambda^2\,X_m = \lambda^3\,X_i, $$

i. e. $\lambda^3$ is an eigenvalue and Xi is an eigenvector of $b_{ij} = a^{(3)}_{ij}$. The
extension of the proof to arbitrary positive integer exponents is clear. We leave the proof
for negative integer exponents to the following problem.
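
The statement about eigenvalues of powers can be checked numerically. The following is a minimal sketch in plain Python; the tensor `a`, its eigenvector `X`, and the helper names `matmul` and `apply` are illustrative assumptions and do not come from the text.

```python
def matmul(a, b):
    """Coordinates c_ij = a_ik b_kj of the scalar product of two tensors."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(a, x):
    """Coordinates u_i = a_ij x_j of the image of the vector x."""
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]


# Hypothetical example tensor; X is an eigenvector with eigenvalue lam = 3,
# because a . X = (3, 3, 0) = 3 X.
a = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 5]]
X = [1, 1, 0]
lam = 3

a2 = matmul(a, a)    # coordinates of the second power
a3 = matmul(a2, a)   # coordinates of the third power

# X stays an eigenvector, and the eigenvalue is raised to the same power.
assert apply(a, X) == [lam * c for c in X]
assert apply(a2, X) == [lam ** 2 * c for c in X]
assert apply(a3, X) == [lam ** 3 * c for c in X]

# The 1,2-coordinate of the second power is not the square of the
# 1,2-coordinate of a.
print(a2[0][1], a[0][1] ** 2)
```

Integer coordinates are used so that all comparisons are exact.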

Problem 3.18.
Prove that two tensors which are inverses of each other have reciprocal eigenvalues
and the same eigenvectors; thus the above theorem is valid for all integer powers of a
tensor, as long as negative powers of the tensor exist.

Problem 3.19.
Prove for integer values of n that

$$ (A^n)^T = (A^T)^n. \tag{3.89} $$

Problem 3.20.
The proper orthogonal tensor a rotates the vector X about the axis of rotation n
through the angle ϑ to the vector U. Show that the tensor $a^3$ rotates X about the same
axis of rotation through the angle 3ϑ to a vector W.

3.14.2 Powers with Real Exponents

1. For positive semidefinite symmetric tensors, we can also define powers with arbitrary
real exponents.
A symmetric tensor with eigenvalues $\lambda_k$ and three orthonormal eigenvectors
$\overset{k}{q}$ can be represented with (3.70) in the form

$$ a = \lambda_k\,\overset{k}{q}\,\overset{k}{q}, \qquad a_{ij} = \lambda_k\,\overset{k}{q}_i\,\overset{k}{q}_j. $$

We now define its n-th power $a^n$ by

$$ a^n := (\lambda_k)^n\,\overset{k}{q}\,\overset{k}{q}, \qquad a^{(n)}_{ij} := (\lambda_k)^n\,\overset{k}{q}_i\,\overset{k}{q}_j. \tag{3.90} $$

Clearly, for integer values of n, the power defined in this way is always uniquely
determined and identical to the integer power of the tensor defined in Section 3.14.1.

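If we restrict ourselves to positive semidefinite tensors, the eigenvalues $\lambda_k$ are
positive or zero. If, in addition, we require that $(\lambda_k)^n$ is also positive or zero,
then the n-th power of the tensor is also always uniquely defined by (3.90). (If we allow
negative values for $(\lambda_k)^n$, then, for example, $(\lambda_k)^{1/2}$ has two values.)
A tensor defined in this way can be called the n-th power of aij, if it satisfies the
following rules:

$$
\begin{aligned}
a^m \cdot a^n &= a^{m+n}, & a^{(m)}_{ij}\,a^{(n)}_{jk} &= a^{(m+n)}_{ik}, \\
(a^m)^n &= a^{mn}, & \bigl(a^{(m)}_{ij}\bigr)^{(n)} &= a^{(mn)}_{ij}.
\end{aligned}
\tag{3.91}
$$

Using (3.90) we have

$$
\begin{aligned}
a^{(m)}_{ij}\,a^{(n)}_{jk}
&= (\lambda_r)^m\,\overset{r}{q}_i\,\overset{r}{q}_j\,(\lambda_s)^n\,\overset{s}{q}_j\,\overset{s}{q}_k
\overset{(3.26)}{=} (\lambda_r)^m\,(\lambda_s)^n\,\overset{r}{q}_i\,\underbrace{\overset{r}{q}_j\,\overset{s}{q}_j}_{\delta_{rs}}\,\overset{s}{q}_k \\
&\overset{(1.18)}{=} (\lambda_r)^m\,(\lambda_r)^n\,\overset{r}{q}_i\,\overset{r}{q}_k
= (\lambda_r)^{m+n}\,\overset{r}{q}_i\,\overset{r}{q}_k = a^{(m+n)}_{ik}, \\
\bigl(a^{(m)}_{ij}\bigr)^{(n)}
&= \bigl((\lambda_k)^m\,\overset{k}{q}_i\,\overset{k}{q}_j\bigr)^{(n)}
\overset{(3.90)}{=} (\lambda_k)^{mn}\,\overset{k}{q}_i\,\overset{k}{q}_j = a^{(mn)}_{ij}, \quad \text{Q. E. D.}
\end{aligned}
$$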

With the definition (3.90), we assign to any positive semidefinite symmetric tensor a,
for any positive real n, exactly one positive semidefinite symmetric tensor $a^n$, and we
call this tensor the n-th power of a, because for all these tensors the rules (3.91) are
satisfied.
If we restrict ourselves to positive definite symmetric tensors, our considerations
and the formulas (3.90) and (3.91) are also valid for negative real exponents.
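
As an illustration of the definition (3.90), the following sketch builds real powers of a symmetric tensor from its eigenvalues and orthonormal eigenvectors and checks the first rule in (3.91). The example tensor, its eigenpairs, and the helper names `spectral_power`, `matmul`, and `close` are assumptions chosen for the example; the eigenpairs are entered by hand rather than computed.

```python
import math


def spectral_power(eigenvalues, eigenvectors, n):
    """Coordinates a^(n)_ij = sum over k of (lambda_k)^n q^k_i q^k_j, cf. (3.90)."""
    return [[sum(lam ** n * q[i] * q[j]
                 for lam, q in zip(eigenvalues, eigenvectors))
             for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Coordinates c_ik = a_ij b_jk of the scalar product of two tensors."""
    return [[sum(a[i][j] * b[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]


def close(a, b, tol=1e-12):
    """True if two coordinate matrices agree up to rounding."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


# Hypothetical positive definite tensor [[2, 1, 0], [1, 2, 0], [0, 0, 4]]
# with eigenvalues 1, 3, 4 and the orthonormal eigenvectors listed below.
s = 1.0 / math.sqrt(2.0)
eigenvalues = [1.0, 3.0, 4.0]
eigenvectors = [[s, -s, 0.0], [s, s, 0.0], [0.0, 0.0, 1.0]]

a = spectral_power(eigenvalues, eigenvectors, 1)
root = spectral_power(eigenvalues, eigenvectors, 0.5)        # square root
third = spectral_power(eigenvalues, eigenvectors, 1.0 / 3.0)

# The square of the square root gives back the tensor ...
assert close(matmul(root, root), a)
# ... and the exponents add under the scalar product.
assert close(matmul(root, third),
             spectral_power(eigenvalues, eigenvectors, 0.5 + 1.0 / 3.0))
```

Taking the nonnegative real root of each eigenvalue is what makes the result unique, as required in the text.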

2. The definition (3.90) can be generalized to a more general class of tensors, namely
the nonnegative tensors. A tensor is called nonnegative, if it is not defective and if its
eigenvalues $\lambda_k$ are positive or zero.
Using (3.54), a nonnegative tensor can be represented in the form

$$ a = \lambda_k\,g_k\,g^k, \qquad a_{ij} = \lambda_k\,g_{ki}\,g^{k}{}_{j}, \tag{3.92} $$

where $g^k$ is the basis reciprocal to the eigenvectors $g_k$. The n-th power of a, with
n > 0, is then defined by

$$ a^n := (\lambda_k)^n\,g_k\,g^k, \qquad a^{(n)}_{ij} := (\lambda_k)^n\,g_{ki}\,g^{k}{}_{j}; \tag{3.93} $$

if we demand $(\lambda_k)^n$ to be positive or zero, $a^n$ is defined uniquely by (3.93) and is
also nonnegative. It is easy to show that the powers of a nonnegative tensor as defined
by (3.93) also satisfy the rules (3.91).

If, in particular, all eigenvalues of a nonnegative tensor are positive, the tensor is
called positive. Then the formulas (3.92) and (3.93) also hold for negative exponents.
3. We summarize our definitions of the tensor powers in a table.

Exponent of a tensor power     Admissible tensors
--------------------------     ----------------------------------
natural number                 all tensors
zero                           all tensors except the zero tensor
integer number                 regular tensors
positive real number           nonnegative tensors
real number                    positive tensors

Problem 3.21.
A. Show that the tensor of Problem 3.13 is positive definite.
B. Compute the coordinates of its positive definite square root in the original coor-
dinate system and check your result, i. e. verify that the square of the square root
gives the original tensor.

3.14.3 Cayley–Hamilton Theorem

1. Substituting (3.51) into (3.15) gives $A\,\delta = A'\,a - A''\,a^2 + a^3$, or

$$ A\,\delta - A'\,a + A''\,a^2 - a^3 = 0. \tag{3.94} $$

This equation is called the Cayley–Hamilton theorem. If we replace the powers of the
tensor with the powers of the eigenvalues, it has the same form as the characteristic
equation (3.46); we also say: a tensor satisfies its characteristic equation. To further
generalize (3.94), we scalar multiply by $a^n$, where n is an integer number, and we obtain

$$ A\,a^n - A'\,a^{n+1} + A''\,a^{n+2} - a^{n+3} = 0. \tag{3.95} $$

This equation is also called the Cayley–Hamilton theorem. Thus, we can write any
integer power of a tensor a as a linear combination of $a^2$, a, and δ; and the coefficients
in this equation are polynomials of the main invariants of the tensor.
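
The two forms (3.94) and (3.95) can be verified on a concrete tensor. The sketch below assumes that A is the determinant, A′ the sum of the principal second-order minors, and A″ the trace; the example tensor and the function names are hypothetical, and only nonnegative n is covered.

```python
def matmul(a, b):
    """Coordinates c_ij = a_ik b_kj of the scalar product of two tensors."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def main_invariants(a):
    """Return (A, A', A''): determinant, sum of principal 2x2 minors, trace."""
    det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
           - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
           + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    minors = (a[0][0] * a[1][1] - a[0][1] * a[1][0]
              + a[1][1] * a[2][2] - a[1][2] * a[2][1]
              + a[2][2] * a[0][0] - a[2][0] * a[0][2])
    trace = a[0][0] + a[1][1] + a[2][2]
    return det, minors, trace


def cayley_hamilton_residual(a, n=0):
    """Left-hand side of (3.95) for n >= 0; n = 0 reproduces (3.94)."""
    A, A1, A2 = main_invariants(a)
    p0 = [[int(i == j) for j in range(3)] for i in range(3)]  # a^0 = delta
    for _ in range(n):
        p0 = matmul(p0, a)          # now p0 holds a^n
    p1 = matmul(p0, a)              # a^(n+1)
    p2 = matmul(p1, a)              # a^(n+2)
    p3 = matmul(p2, a)              # a^(n+3)
    return [[A * p0[i][j] - A1 * p1[i][j] + A2 * p2[i][j] - p3[i][j]
             for j in range(3)] for i in range(3)]


# Hypothetical nonsymmetric example tensor with integer coordinates.
a = [[1, 2, 0],
     [3, -1, 4],
     [0, 5, 2]]
zero = [[0] * 3 for _ in range(3)]

assert cayley_hamilton_residual(a) == zero        # (3.94)
assert cayley_hamilton_residual(a, n=2) == zero   # (3.95) with n = 2
```

Because all coordinates are integers, the residual vanishes exactly and no tolerance is needed.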
2. For tensors which have at most two distinct eigenvalues $\lambda_1$ and $\lambda_2$ we can
alternatively find all eigenvalues, if we refrain from specifying their multiplicity, from the
equation

$$ (\lambda_1 - \lambda)(\lambda_2 - \lambda) = 0 $$


or

$$ \lambda_1 \lambda_2 - (\lambda_1 + \lambda_2)\,\lambda + \lambda^2 = 0; \tag{3.96} $$

this equation is also called the characteristic equation. We multiply this equation by an
eigenvector X,

$$ \lambda_1 \lambda_2\,X - (\lambda_1 + \lambda_2)\,\lambda\,X + \lambda^2\,X = 0, $$

and take into account that

$$ a \cdot X = \lambda\,X, \qquad a^2 \cdot X = \lambda^2\,X, $$

which gives

$$ [\lambda_1 \lambda_2\,\delta - (\lambda_1 + \lambda_2)\,a + a^2] \cdot X = 0. $$

If the tensor has three linearly independent eigenvectors, we can substitute for X three
linearly independent vectors and therefore factor out X.
Thus, for a nondefective tensor with at most two distinct eigenvalues we found,
in addition to (3.94), also the following version of the Cayley–Hamilton theorem:

$$ \lambda_1 \lambda_2\,\delta - (\lambda_1 + \lambda_2)\,a + a^2 = 0. \tag{3.97} $$

3.15 Basic Invariants

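In addition to the equations (3.47) to (3.49) and (3.53), we can find a third representation
for the three main invariants of a tensor. To this end, we take the trace of the
Cayley–Hamilton theorem (3.94) and substitute (3.50) for A′ and (3.47) for A″, so

$$
\begin{aligned}
&3A - \tfrac{1}{2}\,(\operatorname{tr}^2 a - \operatorname{tr} a^2)\operatorname{tr} a + \operatorname{tr} a \operatorname{tr} a^2 - \operatorname{tr} a^3 = 0, \\
&A = \tfrac{1}{6}\,(\operatorname{tr}^3 a - \operatorname{tr} a^2 \operatorname{tr} a - 2 \operatorname{tr} a \operatorname{tr} a^2 + 2 \operatorname{tr} a^3).
\end{aligned}
$$

Together with (3.47) and (3.50) we get

$$
\begin{aligned}
A'' &= \operatorname{tr} a, \\
A' &= \tfrac{1}{2}\,(\operatorname{tr}^2 a - \operatorname{tr} a^2), \\
A &= \tfrac{1}{6}\,(\operatorname{tr}^3 a - 3 \operatorname{tr} a \operatorname{tr} a^2 + 2 \operatorname{tr} a^3).
\end{aligned}
\tag{3.98}
$$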

Clearly, tr a, tr a2 , and tr a3 form, in addition to the main invariants and the eigenval-
ues, a third set of invariants of a second-order tensor; we call them the basic invariants.

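In summary, we have the following expressions.

$$
\begin{aligned}
A'' &= a_{ii} = \lambda_1 + \lambda_2 + \lambda_3 = \operatorname{tr} a, \\
A' &= b_{ii} = \lambda_1 \lambda_2 + \lambda_2 \lambda_3 + \lambda_3 \lambda_1 = \tfrac{1}{2}\,(\operatorname{tr}^2 a - \operatorname{tr} a^2), \\
A &= \det \underset{\sim}{a} = \lambda_1 \lambda_2 \lambda_3 = \tfrac{1}{6}\,(\operatorname{tr}^3 a - 3 \operatorname{tr} a \operatorname{tr} a^2 + 2 \operatorname{tr} a^3).
\end{aligned}
\tag{3.99}
$$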

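If we solve for the basic invariants, we have, after elementary computations, the
following formulas.

$$
\begin{aligned}
\operatorname{tr} a &= \lambda_1 + \lambda_2 + \lambda_3 = A'', \\
\operatorname{tr} a^2 &= \lambda_1^2 + \lambda_2^2 + \lambda_3^2 = (A'')^2 - 2A', \\
\operatorname{tr} a^3 &= \lambda_1^3 + \lambda_2^3 + \lambda_3^3 = (A'')^3 - 3A'A'' + 3A.
\end{aligned}
\tag{3.100}
$$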
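
The conversions between main invariants and basic invariants can be sketched as follows. The triangular example tensor, whose eigenvalues 1, 2, 3 can be read off the diagonal, and the function names are assumptions made for this illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Coordinates c_ij = a_ik b_kj of the scalar product of two tensors."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def main_from_basic(t1, t2, t3):
    """(A, A', A'') from tr a, tr a^2, tr a^3, following (3.98)."""
    A2 = t1
    A1 = Fraction(t1 ** 2 - t2, 2)
    A = Fraction(t1 ** 3 - 3 * t1 * t2 + 2 * t3, 6)
    return A, A1, A2


def basic_from_main(A, A1, A2):
    """(tr a, tr a^2, tr a^3) from A, A', A'', following (3.100)."""
    return A2, A2 ** 2 - 2 * A1, A2 ** 3 - 3 * A1 * A2 + 3 * A


# Hypothetical triangular tensor with eigenvalues 1, 2, 3.
a = [[1, 2, 3],
     [0, 2, 4],
     [0, 0, 3]]
a2 = matmul(a, a)
a3 = matmul(a2, a)
basic = (trace(a), trace(a2), trace(a3))

main = main_from_basic(*basic)
# Compare with the eigenvalue expressions in (3.99).
assert main == (1 * 2 * 3, 1 * 2 + 2 * 3 + 3 * 1, 1 + 2 + 3)
# Solving back reproduces the basic invariants.
assert basic_from_main(*main) == basic
```

`Fraction` keeps the divisions by 2 and 6 exact, so the round trip can be compared with `==`.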

If a tensor has only two distinct eigenvalues, we can, in general, solve the first two
equations in (3.99) and the first two equations in (3.100) for the two distinct eigenval-
ues and then substitute these equations into the third equation of (3.99) or (3.100).
Thus the three main invariants and the three basic invariants are not independent
from each other, but only two of them are.
If a tensor has only one eigenvalue (of multiplicity 3), it has only one independent
main invariant and only one independent basic invariant.
The three different sets of invariants (eigenvalues, main invariants, and basic in-
variants) usually do not form a complete set of invariants; a general second-order ten-
sor has seven independent invariants. We will come back to this topic in Chapter 5.

3.16 Polar Decomposition of a Tensor

1. In this section we consider the following theorem. Any regular tensor Fij can be
written in the form

$$ F_{ij} = V_{ik}\,R_{kj} = R_{ik}\,U_{kj}, \tag{3.101} $$

where Rij is orthogonal, Uij and Vij are symmetric and positive definite, and Rij, Uij, and
Vij are uniquely determined. This decomposition is called polar, because it resembles
the decomposition of a complex number into polar coordinates, i. e. into modulus and
argument: Uij and Vij are positive definite tensors, just as the modulus of a complex
number is positive; the determinant of Rij is ±1, just as the exponential factor
of a complex number has modulus 1; and Rij can be interpreted as a rotation or as a
rotary reflection, respectively (compare Section 3.13.3).
2. We prove the decomposition theorem (3.101) in six steps.
I. We first show that the scalar product of any tensor with its transposed tensor gives
a positive semidefinite symmetric tensor. If the original tensor is regular, then the
scalar product is even positive definite and thus with (3.53)₁ also regular.
Both are easy to see. We note that, for arbitrary Fij, $F_{ij} F^T_{jk} = F_{ij} F_{kj}$ is symmetric.
Furthermore A := Fij Fkj Xi Xk is the square of the vector Fij Xi and thus it is always
greater than or equal to zero, i. e. Fij Fkj is positive semidefinite.
If, in particular, Fij is regular, it does not have a null direction, i. e. for Xi ≠ 0
we have Fij Xi ≠ 0, and hence A > 0. Then, according to Section 3.12.3, Fij Fkj is
positive definite, and its eigenvalues are all positive.
II. Next we show that any regular tensor Fij can be represented in the form

$$ F_{ij} = V_{ik}\,R_{kj}, \tag{3.102} $$

where Vij is symmetric and positive definite and Rij is orthogonal.

This is also easy to see. If Fij is regular, then, according to step I, the product
$F_{ik} F^T_{kj}$ is symmetric and positive definite, i. e. with Section 3.14.2 there exists a
symmetric and positive definite square root

$$ V_{ij} := \sqrt{F_{ik} F^T_{kj}}, \qquad V^{(2)}_{ij} = V_{ik}\,V_{kj} = F_{ik}\,F^T_{kj}. \tag{3.103} $$

Here $\sqrt{F_{ik} F^T_{kj}}$ is the i, j-coordinate of the tensor $(F \cdot F^T)^{1/2}$ and not the
square root of the i, j-coordinate of the tensor $(F \cdot F^T)$.
Since a positive definite symmetric tensor is regular, $V^{(-1)}_{ij}$ exists and hence the
tensor

$$ R_{ij} := V^{(-1)}_{ik}\,F_{kj} \tag{3.104} $$

also exists. Multiplying the last equation by Vmi gives

$$ V_{mi}\,R_{ij} = V_{mi}\,V^{(-1)}_{ik}\,F_{kj} = \delta_{mk}\,F_{kj} = F_{mj}, $$

i. e. the tensors defined by (3.103) and (3.104) satisfy equation (3.102).

It remains to show that Rij is orthogonal. For this we exploit that the inverse of
a (regular) symmetric tensor is again symmetric. We can show this as follows. Let
$V = V^T$. Then we have $(V^{-1})^T = (V^T)^{-1} = V^{-1}$. Now we compute

$$
\begin{aligned}
R_{ik}\,R^T_{kj} &\overset{(3.104)}{=} V^{(-1)}_{im}\,F_{mk}\,\bigl(V^{(-1)}_{kn}\,F_{nj}\bigr)^T
= V^{(-1)}_{im}\,F_{mk}\,F^T_{kn}\,\bigl(V^{(-1)}_{nj}\bigr)^T \\
&= V^{(-1)}_{im}\,V_{mk}\,V_{kn}\,V^{(-1)}_{nj} = \delta_{ik}\,\delta_{kj} = \delta_{ij}.
\end{aligned}
$$

So the transpose of the tensor Rij is equal to the inverse of Rij, i. e. Rij is orthogonal.
III. Uniqueness of the decomposition is easy to prove by contradiction.
Let Fij = vik rkj be a second decomposition. Then we have Vik Rkj = vik rkj and,
if we transpose both sides, $R^T_{ik} V_{kj} = r^T_{ik} v_{kj}$. This gives

$$
\begin{aligned}
V^{(2)}_{in} &= V_{ik}\,V_{kn} = V_{ik}\,\delta_{km}\,V_{mn} = V_{ik}\,R_{kj}\,R^T_{jm}\,V_{mn} \\
&= v_{ik}\,r_{kj}\,r^T_{jm}\,v_{mn} = v_{ik}\,\delta_{km}\,v_{mn} = v_{ik}\,v_{kn} = v^{(2)}_{in}.
\end{aligned}
$$

Now, since Vij is symmetric and positive definite, $V^{(2)}_{ij}$ is also symmetric and
positive definite: $V^{(2)}_{ij} = V_{ik} V_{kj} = V_{ik} V_{jk}$ is symmetric, and
$A := V^{(2)}_{ij} X_i X_j = V_{ik} V_{jk} X_i X_j$ is the square of the vector Vik Xi, and since
Vij is regular, Vij has no null direction, i. e. the square is always positive.
From the fact that a symmetric and positive definite tensor has only one symmetric
and positive definite square root, it follows that Vij = vij and hence Vik Rkj =
Vik rkj. Multiplying by $V^{(-1)}_{mi}$ then gives Rmj = rmj.
IV. Next we show (also to practice symbolic notation) that any regular tensor F can
also be represented as

$$ F = S \cdot U, \tag{3.105} $$

where S is orthogonal and U is symmetric and positive definite. To this end we
define

$$ U := \sqrt{F^T \cdot F}, \qquad U^2 = U \cdot U = F^T \cdot F, \tag{3.106} $$

and

$$ S := F \cdot U^{-1}. \tag{3.107} $$

Analogously to step II, U is again symmetric and positive definite, and if we scalar
multiply (3.107) from the right by U, we get S ⋅ U = F, i. e. the two tensors U and S
satisfy equation (3.105). It remains to show that S is orthogonal. This follows, with
$(U^{-1})^T = U^{-1}$ and $S^T = U^{-1} \cdot F^T$, from

$$ S^T \cdot S = U^{-1} \cdot F^T \cdot F \cdot U^{-1} = U^{-1} \cdot U \cdot U \cdot U^{-1} = \delta. $$

V. We show (again by contradiction) that this second decomposition is also unique.

Let F = s ⋅ u be a second decomposition. Then we have S ⋅ U = s ⋅ u and
$U \cdot S^T = u \cdot s^T$. Furthermore, we have

$$ U^2 = U \cdot \delta \cdot U = U \cdot S^T \cdot S \cdot U = u \cdot s^T \cdot s \cdot u = u \cdot \delta \cdot u = u^2, $$

and since $U^2$ and $u^2$ are both symmetric and positive definite, both have only one
symmetric and positive definite square root, i. e. we have U = u and hence S ⋅ U =
s ⋅ U. Scalar multiplication by $U^{-1}$ from the right gives S = s.

VI. It remains to show that R = S.

From F = V ⋅ R it follows that $F = R \cdot R^T \cdot V \cdot R$. Since V is symmetric and positive
definite, $T := R^T \cdot V \cdot R$ is also symmetric and positive definite, i. e. in coordinate
notation $T_{ij} := R^T_{im} V_{mn} R_{nj} = R_{mi} R_{nj} V_{mn}$. So we can interpret Vij and Tij
with (2.17) as Cartesian coordinates of the same tensor in two different coordinate systems,
which are related by the transformation Rij. Because the decomposition F = S ⋅ U
is unique, it follows that S = R and $U = R^T \cdot V \cdot R$.

3. Scalar multiplication of the last equation by R from the left and by $R^T$ from the right
allows us to solve for V: $R \cdot U \cdot R^T = V$. Then we obtain a relation between V and U
which reads in coordinate notation

$$ U_{ij} = R^T_{im}\,V_{mn}\,R_{nj}, \qquad V_{ij} = R_{im}\,U_{mn}\,R^T_{nj} $$

or

$$ V_{ij} = R_{im}\,R_{jn}\,U_{mn}, \qquad U_{ij} = R_{mi}\,R_{nj}\,V_{mn}. \tag{3.108} $$

So we can again interpret Vij and Uij as the Cartesian coordinates of the same tensor in
two Cartesian coordinate systems, which are related by the transformation matrix Rij.
Thus both tensors have the same eigenvalues, and the unit eigenvectors $\overset{k}{q}_V$ and
$\overset{k}{q}_U$ corresponding to the eigenvalue $\lambda_k$ are related by

$$ (\overset{k}{q}_V)_i = R_{ij}\,(\overset{k}{q}_U)_j, \qquad (\overset{k}{q}_U)_i = R_{ji}\,(\overset{k}{q}_V)_j. \tag{3.109} $$

Since the Rij are the coordinates of the tensor R, we can write equations (3.108) and
(3.109) also in symbolic notation. For example, for (3.108)₁ and (3.109)₁ we have

$$ V = R \cdot U \cdot R^T, \qquad \overset{k}{q}_V = R \cdot \overset{k}{q}_U. \tag{3.110} $$

The tensor R rotates the tensor U to the tensor V and it rotates the eigendirections
$\overset{k}{q}_U$ of U to the eigendirections $\overset{k}{q}_V$ of V.
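
A numerical sketch of the decomposition is given below. It does not follow the construction used in the proof (square root of F ⋅ Fᵀ); instead it uses the Newton iteration R ← ½ (R + R⁻ᵀ), a well-known alternative that converges to the orthogonal factor for a regular F, and then obtains V and U from R. The example tensor `F` and all function names are assumptions.

```python
def matmul(a, b):
    """Coordinates c_ij = a_ik b_kj of the scalar product of two tensors."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def inverse(a):
    """Inverse of a regular 3x3 matrix via cofactors (cyclic index formula)."""
    cof = [[a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
            - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    det = sum(a[0][j] * cof[0][j] for j in range(3))
    return [[cof[j][i] / det for j in range(3)] for i in range(3)]


def polar_decomposition(F, steps=100, tol=1e-15):
    """Return (V, R, U) with F = V.R = R.U, using a Newton iteration for R."""
    R = [row[:] for row in F]
    for _ in range(steps):
        R_inv_T = transpose(inverse(R))
        R_next = [[0.5 * (R[i][j] + R_inv_T[i][j]) for j in range(3)]
                  for i in range(3)]
        change = max(abs(R_next[i][j] - R[i][j])
                     for i in range(3) for j in range(3))
        R = R_next
        if change < tol:
            break
    V = matmul(F, transpose(R))   # from F = V.R and R.R^T = delta
    U = matmul(transpose(R), F)   # from F = R.U and R^T.R = delta
    return V, R, U


def close(a, b, tol=1e-10):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


# Hypothetical regular example tensor.
F = [[2.0, 1.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 3.0]]
V, R, U = polar_decomposition(F)
delta = [[float(i == j) for j in range(3)] for i in range(3)]

assert close(matmul(R, transpose(R)), delta)           # R is orthogonal
assert close(V, transpose(V)) and close(U, transpose(U))
assert close(matmul(V, R), F) and close(matmul(R, U), F)
assert close(V, matmul(matmul(R, U), transpose(R)))    # cf. (3.110)
```

The iteration is chosen here only because it needs nothing beyond a 3×3 inverse; with an eigenvalue routine at hand, the square-root construction of the proof could be coded directly.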

Problem 3.22.
Find, for the tensor

$$ a_{ij} = \begin{pmatrix} 2 & 2 & 1 \\ 1 & -2 & 2 \\ 2 & -1 & -2 \end{pmatrix}, $$

the two factors of the polar decomposition aij = Vik Rkj and check your result, i. e.
verify that Vik Rkj = aij.

4 Tensor Analysis in Curvilinear Coordinates

4.1 Curvilinear Coordinates

4.1.1 Curvilinear Coordinate Systems

1. Until now, we described a point P in space using a Cartesian coordinate system.
Such a coordinate system consists of an origin O and an orthonormal basis ei at this
origin, and we described the point P by the three coordinates xi of the position vector
$x = \overrightarrow{OP} = x_i\,e_i$.
Let us consider a set of transformation equations (one-to-one, except at singular
points)

$$ u^i = u^i(x_j), \qquad x_i = x_i(u^j). \tag{4.1} $$

Then every point in space is uniquely described by the three variables ui . Thus we
can also call the ui point coordinates, i. e. each set of transformation equations (4.1),
together with a Cartesian coordinate system, defines again a coordinate system. All
these coordinate systems are called curvilinear coordinate systems.¹
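As an example we consider cylindrical coordinates. They are described by the
transformation equations

$$ x_1 = u^1 \cos u^2, \qquad x_2 = u^1 \sin u^2, \qquad x_3 = u^3. $$

We obtain conversely

$$ u^1 = \sqrt{x_1^2 + x_2^2}, \qquad u^2 = \arctan\frac{x_2}{x_1}, \qquad u^3 = x_3 $$

or in the usual notation $x_1 = x$, $x_2 = y$, $u^1 = R$, $u^2 = \varphi$, $x_3 = u^3 = z$

$$
\begin{aligned}
x &= R \cos\varphi, & y &= R \sin\varphi, & z &= z, \\
R &= \sqrt{x^2 + y^2}, & \varphi &= \arctan\frac{y}{x}, & z &= z.
\end{aligned}
\tag{4.2}
$$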
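2. From $dx_i = (\partial x_i/\partial u^j)\,du^j$ it follows that
$\partial x_i/\partial x_k = \delta_{ik} = (\partial x_i/\partial u^j)(\partial u^j/\partial x_k)$, and
analogously from $du^i = (\partial u^i/\partial x_j)\,dx_j$ it follows that
$\partial u^i/\partial u^k = \delta_{ik} = (\partial u^i/\partial x_j)(\partial x_j/\partial u^k)$,
i. e. the partial derivatives of the transformation equations satisfy the orthogonality
relations

$$ \frac{\partial x_i}{\partial u^j}\,\frac{\partial u^j}{\partial x_k} = \delta_{ik}, \qquad \frac{\partial u^i}{\partial x_j}\,\frac{\partial x_j}{\partial u^k} = \delta_{ik}. \tag{4.3} $$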

¹ Since the transformation equations (2.9) between two Cartesian coordinate systems are special cases
of (4.1), Cartesian coordinate systems are special cases of curvilinear coordinate systems.


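3. Computing the determinant of both sides of (4.3) and using the multiplication theorem
for determinants (1.13), for the two Jacobians

$$
\frac{\partial(x_1, x_2, x_3)}{\partial(u^1, u^2, u^3)} :=
\begin{vmatrix}
\dfrac{\partial x_1}{\partial u^1} & \dfrac{\partial x_1}{\partial u^2} & \dfrac{\partial x_1}{\partial u^3} \\[2mm]
\dfrac{\partial x_2}{\partial u^1} & \dfrac{\partial x_2}{\partial u^2} & \dfrac{\partial x_2}{\partial u^3} \\[2mm]
\dfrac{\partial x_3}{\partial u^1} & \dfrac{\partial x_3}{\partial u^2} & \dfrac{\partial x_3}{\partial u^3}
\end{vmatrix},
\qquad
\frac{\partial(u^1, u^2, u^3)}{\partial(x_1, x_2, x_3)} :=
\begin{vmatrix}
\dfrac{\partial u^1}{\partial x_1} & \dfrac{\partial u^1}{\partial x_2} & \dfrac{\partial u^1}{\partial x_3} \\[2mm]
\dfrac{\partial u^2}{\partial x_1} & \dfrac{\partial u^2}{\partial x_2} & \dfrac{\partial u^2}{\partial x_3} \\[2mm]
\dfrac{\partial u^3}{\partial x_1} & \dfrac{\partial u^3}{\partial x_2} & \dfrac{\partial u^3}{\partial x_3}
\end{vmatrix},
$$

we obtain the relation

$$ \frac{\partial(x_1, x_2, x_3)}{\partial(u^1, u^2, u^3)}\;\frac{\partial(u^1, u^2, u^3)}{\partial(x_1, x_2, x_3)} = 1. \tag{4.4} $$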

Since we can interchange the rows and columns of a determinant, it does not mat-
ter if we take the row or the column index as the numerator index (unlike with the
coordinate matrices 𝜕ai /𝜕xj of the gradient of a vector, where it does matter).

4. We can show that the coordinate transformation (4.1) is one-to-one at all the points
where the Jacobian is neither zero nor infinite. The transformation is called regular at
these points and singular at all other points.
In the example of the cylindrical coordinates we have

$$
\frac{\partial(x, y, z)}{\partial(R, \varphi, z)} =
\begin{vmatrix}
\cos\varphi & -R \sin\varphi & 0 \\
\sin\varphi & R \cos\varphi & 0 \\
0 & 0 & 1
\end{vmatrix} = R,
$$

which gives with (4.4)

$$ \frac{\partial(R, \varphi, z)}{\partial(x, y, z)} = \frac{1}{R}. $$

Thus cylindrical coordinates are singular at all points where $R = \sqrt{x^2 + y^2} = 0$, i. e.
on the z-axis. At these points the transformation (4.2) is indeed not one-to-one, since
the value of φ is not unique at these points.
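
The cylindrical example can be reproduced numerically. The sketch below codes the transformation (4.2) and both Jacobian matrices and checks (4.3) and (4.4) at one regular point. It uses `math.atan2` instead of arctan(y/x) so that φ falls in the correct quadrant, which is an assumption about the intended branch; the function names and the sample point are hypothetical.

```python
import math


def to_cartesian(R, phi, z):
    """x_i(u^j) for cylindrical coordinates, cf. (4.2)."""
    return (R * math.cos(phi), R * math.sin(phi), z)


def to_cylindrical(x, y, z):
    """u^i(x_j); atan2 picks the branch of arctan(y/x) with phi in (-pi, pi]."""
    return (math.hypot(x, y), math.atan2(y, x), z)


def jacobian_x_by_u(R, phi, z):
    """Matrix of dx_i/du^j (row index i, column index j)."""
    return [[math.cos(phi), -R * math.sin(phi), 0.0],
            [math.sin(phi), R * math.cos(phi), 0.0],
            [0.0, 0.0, 1.0]]


def jacobian_u_by_x(x, y, z):
    """Matrix of du^i/dx_j; not defined on the z-axis, where R = 0."""
    R2 = x * x + y * y
    R = math.sqrt(R2)
    return [[x / R, y / R, 0.0],
            [-y / R2, x / R2, 0.0],
            [0.0, 0.0, 1.0]]


def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


u = (2.0, 0.7, -1.0)              # a regular point: R = 2, phi = 0.7, z = -1
x = to_cartesian(*u)
J = jacobian_x_by_u(*u)
K = jacobian_u_by_x(*x)

product = matmul(J, K)            # orthogonality relation (4.3)
assert all(abs(product[i][j] - (i == j)) < 1e-12
           for i in range(3) for j in range(3))
assert abs(det3(J) - u[0]) < 1e-12           # Jacobian equals R
assert abs(det3(J) * det3(K) - 1.0) < 1e-12  # relation (4.4)
```

Calling `jacobian_u_by_x` at a point with x = y = 0 raises `ZeroDivisionError`, which reflects the singularity on the z-axis.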

4.1.2 Coordinate Surfaces and Coordinate Curves

The surfaces with ui = const are called coordinate surfaces. Except at singular points,
they cover the space simply, i. e. every point lies on exactly one coordinate surface
of each of the three families of coordinate surfaces. Each pair of coordinate surfaces
intersects along a coordinate curve, so that, again except for singular points, three
coordinate curves intersect at each point.

In the example of the cylindrical coordinates, the equations for the coordinate surfaces
are R = √(x² + y²) = const, or simpler, x² + y² = const; φ = arctan(y/x) = const,
or simpler, y/x = const; and z = const.
The surfaces with x² + y² = const are cylinders which are coaxial with the z-axis;
the surfaces y/x = const are planes through the z-axis; and the surfaces z = const
are planes perpendicular to the z-axis.
On the coordinate surfaces R = √(x² + y²) = const, only φ and z vary; on the
coordinate surfaces φ = arctan(y/x) = const, only R and z vary; and on the coordinate
surfaces z = const, only R and φ vary. On the curve of intersection of the surfaces
R = const and φ = const, only z varies; on the curve of intersection of the surfaces
φ = const and z = const, only R varies; and on the curve of intersection of the
surfaces z = const and R = const, only φ varies.

4.1.3 Holonomic Bases

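1. At every point, the differentials $dx_i$ and $du^i$ of the two coordinate systems are
related by

$$
\begin{aligned}
dx_1 &= \frac{\partial x_1}{\partial u^1}\,du^1 + \frac{\partial x_1}{\partial u^2}\,du^2 + \frac{\partial x_1}{\partial u^3}\,du^3, \\
dx_2 &= \frac{\partial x_2}{\partial u^1}\,du^1 + \frac{\partial x_2}{\partial u^2}\,du^2 + \frac{\partial x_2}{\partial u^3}\,du^3, \\
dx_3 &= \frac{\partial x_3}{\partial u^1}\,du^1 + \frac{\partial x_3}{\partial u^2}\,du^2 + \frac{\partial x_3}{\partial u^3}\,du^3.
\end{aligned}
$$

On the curve of intersection of the surfaces $u^2$ = const and $u^3$ = const we have
at any point $du^2 = du^3 = 0$. Thus the polar vector with the Cartesian coordinates
$(\partial x_1/\partial u^1,\ \partial x_2/\partial u^1,\ \partial x_3/\partial u^1)$ is a (nonnormalized) tangent
vector to this coordinate curve. We consider it as the vector $g_1$ of a basis. Then the basis
defined by

$$ g_i = \frac{\partial x_j}{\partial u^i}\,e_j \tag{4.5} $$

with the Cartesian coordinates

$$ g_{ij} = \frac{\partial x_j}{\partial u^i} \tag{4.6} $$

consists of vectors tangent to the coordinate curves of the $u^i$-coordinates at every
point. This basis is called the natural or covariant basis of the $u^i$-coordinates at the
point under consideration.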

2. Clearly, the (nonnormalized) polar vector with the Cartesian coordinates
$(\partial u^1/\partial x_1,\ \partial u^1/\partial x_2,\ \partial u^1/\partial x_3)$ is perpendicular to the surface
$u^1(x_1, x_2, x_3)$ = const at the point $(x_1, x_2, x_3)$. We consider this vector as the vector
$g^1$ of another basis. The basis defined by

$$ g^i = \frac{\partial u^i}{\partial x_j}\,e_j \tag{4.7} $$

with the Cartesian coordinates

$$ g^{i}{}_{j} = \frac{\partial u^i}{\partial x_j} \tag{4.8} $$

has thus the property that its vectors are perpendicular to the coordinate surfaces
of the $u^i$-coordinates at each point. We call this basis the contravariant basis of the
$u^i$-coordinates at this point.

3. The covariant and the contravariant bases are not the only bases which we can
associate to a point in a curvilinear coordinate system. However, computations with
these bases are particularly simple, as we can easily convince ourselves, because they
are reciprocal bases. Both of these bases are called holonomic bases.
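4. If we contract (4.5) with $\partial u^i/\partial x_k$ and (4.7) with $\partial x_k/\partial u^i$ and then solve
for the Cartesian bases, we get

$$
\frac{\partial u^i}{\partial x_k}\,g_i = \frac{\partial u^i}{\partial x_k}\,\frac{\partial x_j}{\partial u^i}\,e_j \overset{(4.3)}{=} e_k, \qquad
\frac{\partial x_k}{\partial u^i}\,g^i = \frac{\partial x_k}{\partial u^i}\,\frac{\partial u^i}{\partial x_j}\,e_j \overset{(4.3)}{=} e_k.
$$

Thus we have

$$ e_i = \frac{\partial u^j}{\partial x_i}\,g_j = \frac{\partial x_i}{\partial u^j}\,g^j. \tag{4.9} $$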
5. A system of transformation equations (4.1) associates a curvilinear coordinate sys-
tem to any given Cartesian coordinate system, i. e. we can associate a triple of curvi-
linear coordinates and a pair of holonomic bases to each point in space.
6. We have

$$
\begin{aligned}
\frac{\partial(x_1, x_2, x_3)}{\partial(u^1, u^2, u^3)} &= \det\left(\frac{\partial x_i}{\partial u^j}\right) \overset{(4.6)}{=} \det\,(g_{ji}) \overset{(2.45)}{=} [g_1, g_2, g_3], \\
\frac{\partial(u^1, u^2, u^3)}{\partial(x_1, x_2, x_3)} &= \det\left(\frac{\partial u^i}{\partial x_j}\right) \overset{(4.8)}{=} \det\,(g^{i}{}_{j}) \overset{(2.45)}{=} [g^1, g^2, g^3],
\end{aligned}
$$

i. e.

$$ \frac{\partial(x_1, x_2, x_3)}{\partial(u^1, u^2, u^3)} = [g_1, g_2, g_3], \qquad \frac{\partial(u^1, u^2, u^3)}{\partial(x_1, x_2, x_3)} = [g^1, g^2, g^3]. \tag{4.10} $$

Both triple products have, according to (4.4), the same sign. If the ei form a right-handed
system and if the triple products are positive in this right-handed system, the
$g_i$ and the $g^i$ also form right-handed systems. If the triple products are negative in this
right-handed system, the $g_i$ and the $g^i$ form left-handed systems (see Section 2.10.3).
Since we can change the orientation of a basis by interchanging two basis vectors,
we can, without significant loss of generality, avoid many sign rules and case distinc-
tions if we restrict ourselves to right-handed systems. So for the rest of Chapter 4 we
make the following assumption.

The order of the basis vectors ei of the Cartesian coordinate system xi, in which the
curvilinear coordinate system $u^i(x_j)$ is embedded, and the order of the basis vectors
$g_i$ and $g^i$ of its holonomic bases are such that all three bases form right-handed
systems.
Then the Jacobians (4.10) are positive, axial tensors are also independent of the coor-
dinate system, and their coordinates transform in the same way as the coordinates of
polar tensors. Thus we do not need to distinguish between polar and axial tensors.
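
To make the two holonomic bases concrete, the sketch below uses a simple nonorthogonal coordinate system that is not taken from the text: x₁ = u¹ + (u²)², x₂ = u², x₃ = u³, with inverse u¹ = x₁ − (x₂)², u² = x₂, u³ = x₃. The partial derivatives are entered by hand, and all function names are hypothetical. The checks cover the reciprocity of the two bases, the triple products in (4.10), and the representation (4.9) of the Cartesian basis.

```python
def cartesian(u):
    """x_i(u^j) of the assumed example coordinate system."""
    u1, u2, u3 = u
    return (u1 + u2 * u2, u2, u3)


def covariant_basis(u):
    """Rows are g_1, g_2, g_3 with Cartesian coordinates dx_j/du^i, cf. (4.6)."""
    u1, u2, u3 = u
    return [[1, 0, 0],
            [2 * u2, 1, 0],
            [0, 0, 1]]


def contravariant_basis(x):
    """Rows are g^1, g^2, g^3 with Cartesian coordinates du^i/dx_j, cf. (4.8)."""
    x1, x2, x3 = x
    return [[1, -2 * x2, 0],
            [0, 1, 0],
            [0, 0, 1]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def triple(a, b, c):
    """Triple product [a, b, c] = a . (b x c)."""
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return dot(a, cross)


u = (1, 2, 3)
x = cartesian(u)                    # (5, 2, 3)
g_lo = covariant_basis(u)           # tangent to the coordinate curves
g_up = contravariant_basis(x)       # normal to the coordinate surfaces

# Reciprocity: g_i . g^j = delta_ij.
assert all(dot(g_lo[i], g_up[j]) == (i == j)
           for i in range(3) for j in range(3))

# Both triple products are positive and their product is 1, cf. (4.4), (4.10).
assert triple(*g_lo) * triple(*g_up) == 1 and triple(*g_lo) > 0

# (4.9): e_i = (du^j/dx_i) g_j recovers the Cartesian basis.
e = [[sum(g_up[j][i] * g_lo[j][k] for j in range(3)) for k in range(3)]
     for i in range(3)]
assert e == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

In this example g₁ ⋅ g₂ = 4 at the chosen point, so the covariant basis is not orthogonal and the two bases differ, which is the general situation described above.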

4.1.4 Rectilinear and Cartesian Coordinate Systems

The transformation matrices 𝜕xi /𝜕uj and 𝜕ui /𝜕xj are, according to (4.6) and (4.8), the
Cartesian coordinates of the two reciprocal bases for the ui -coordinates. These matri-
ces are in general functions of position and thus the two bases are also different at
different positions.
The transformation matrices, and hence also the reciprocal bases, are clearly independent
of position, if and only if the transformation equations (4.1) have the form

$$ u^i = A^{i}{}_{j}\,x_j + B^i, \qquad x_i = A_{ij}\,u^j + B_i \tag{4.11} $$

and if the transformation coefficients $A^{i}{}_{j}$, $A_{ij}$, $B^i$, and $B_i$ are independent of
position.
Since the reciprocal bases are independent of position, all coordinate curves are
straight lines and all coordinate surfaces are planes. Equally named coordinate curves
are parallel at all points (i. e. all $u^1$ lines are parallel, all $u^2$ lines are parallel, and all
$u^3$ lines are parallel). Equally named coordinate surfaces are, at all points, parallel
planes. Such coordinate systems are called rectilinear and thus rectilinear coordinate
systems are a special case of curvilinear coordinate systems. (For cylindrical coordinates,
all R-curves and all z-curves are lines, but the φ-curves are not. Thus cylindrical
coordinates do not form a rectilinear coordinate system.)
The transformation coefficients of a rectilinear coordinate system satisfy the relations

$$
\begin{aligned}
A^{i}{}_{j} &= \frac{\partial u^i}{\partial x_j} = g^{i}{}_{j}, & A_{ij} &= \frac{\partial x_i}{\partial u^j} = g_{ji}, \\
A^{i}{}_{j}\,A_{jk} &= \delta_{ik}, & A_{ij}\,A^{j}{}_{k} &= \delta_{ik}, \\
B^i &= -A^{i}{}_{j}\,B_j, & B_i &= -A_{ij}\,B^j.
\end{aligned}
\tag{4.12}
$$

The transformation matrices $A^{i}{}_{j}$ and $A_{ij}$ are inverses of each other and hence regular.
If furthermore the transformation matrices are special orthogonal, the $u^i$ form a
Cartesian coordinate system, the two bases $g_i$ and $g^i$ are orthonormal and coincide,
and the equations (4.11) simplify to the transformation equations (2.9).
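
The relations between the two sets of coefficients of a rectilinear coordinate system can be checked with a small example. In the sketch below, `A_ux` and `B_u` stand for the matrix and shift of the first equation in (4.11), and `A_xu` and `B_x` for those of the second; these names and the numerical values are assumptions made for the illustration.

```python
def affine(M, b, v):
    """Coordinates M_ij v_j + b_i of an affine map."""
    return [sum(M[i][j] * v[j] for j in range(3)) + b[i] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Hypothetical constant transformation matrix du^i/dx_j and its inverse
# dx_i/du^j (entered by hand).
A_ux = [[1, 1, 0],
        [0, 1, 1],
        [0, 0, 1]]
A_xu = [[1, -1, 1],
        [0, 1, -1],
        [0, 0, 1]]
B_u = [1, 2, 3]

identity = [[int(i == j) for j in range(3)] for i in range(3)]
assert matmul(A_ux, A_xu) == identity and matmul(A_xu, A_ux) == identity

# Shift of the inverse transformation: minus the inverse matrix applied to B_u.
B_x = [-sum(A_xu[i][j] * B_u[j] for j in range(3)) for i in range(3)]
# ... and conversely B_u is minus A_ux applied to B_x.
assert B_u == [-sum(A_ux[i][j] * B_x[j] for j in range(3)) for i in range(3)]

x = [4, 5, 6]
u = affine(A_ux, B_u, x)          # first equation in (4.11)
assert affine(A_xu, B_x, u) == x  # second equation leads back to x
```

Since the matrices do not depend on position, the rows of `A_ux` and the columns of `A_xu` give the same contravariant and covariant basis vectors at every point.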

Problem 4.1.
Consider an oblique, rectilinear coordinate system in the plane, sharing the origin
with a Cartesian coordinate system and with axes at angles α and β with respect to the
axes of this Cartesian coordinate system.
A. Write out the transformation equations $x_i = x_i(u^j)$ and $u^i = u^i(x_j)$ for the
coordinates of a point P.
B. Find the Cartesian coordinates of the holonomic bases. Which basis vectors are
unit vectors?

4.1.5 Orthogonal Coordinate Systems

A coordinate system in which the coordinate curves at a point are mutually orthogonal
and hence the covariant basis vectors at each point are also mutually orthogonal is
called an orthogonal coordinate system. Cylindrical coordinates are an example of an
orthogonal coordinate system. According to Section 3.9.3, in an orthogonal coordinate
system, the coordinate surfaces at a point and hence the contravariant basis vectors at
every point are also orthogonal. The homologous vectors of both bases have the same
direction and their lengths are reciprocal. The direction and length of the basis vectors
can vary from point to point, as for example in cylindrical coordinates.

Problem 4.2.
Find the Cartesian coordinates of the covariant basis and of the contravariant basis in
cylindrical coordinates.

4.2 Holonomic Tensor Coordinates

4.2.1 Introduction

1. Just as we can represent a tensor, according to (2.28), by coordinates with respect
to a Cartesian basis, i. e.

$$ a = a_i\,e_i, \qquad a = a_{ij}\,e_i\,e_j, \qquad \text{etc.}, $$

we can represent the tensor also with respect to the two holonomic bases of a curvilinear
coordinate system. Clearly, already a vector has different coordinates with respect
to the covariant basis and to the contravariant basis. The coordinates with respect to
the covariant basis are called contravariant coordinates and are denoted by an upper
index, the coordinates with respect to the contravariant basis are called covariant
coordinates and are denoted by a lower index. We have the representations

$$ a = a^i\,g_i = a_i\,g^i. $$

For a second-order tensor, we have, in addition to the covariant basis $g_i\,g_j$ and the
contravariant basis $g^i\,g^j$, also mixed bases, i. e. we have the four representations

$$ a = a^{ij}\,g_i\,g_j = a^{i}{}_{j}\,g_i\,g^j = a_{i}{}^{j}\,g^i\,g_j = a_{ij}\,g^i\,g^j, $$

where we call e. g. the $a^{i}{}_{j}$ contravariant–covariant coordinates and the corresponding
basis $g_i\,g^j$ covariant–contravariant basis.

Lower indices of bases and tensor coordinates are called covariant and upper indices
are called contravariant.

We obtain the following representations of a tensor with respect to the two holonomic
bases of a curvilinear coordinate system.

$$
\begin{aligned}
a &= a^i\,g_i = a_i\,g^i, \\
a &= a^{ij}\,g_i\,g_j = a^{i}{}_{j}\,g_i\,g^j = a_{i}{}^{j}\,g^i\,g_j = a_{ij}\,g^i\,g^j,
\end{aligned}
\tag{4.13}
$$

Clearly, these equations generalize the representation equations (2.28) of a tensor with
respect to a Cartesian basis.
All bases of a tensor defined in (4.13) are called holonomic bases, and all corresponding
coordinates are called holonomic coordinates.
In the case of nonrectilinear coordinates, the bases change from position to position
and a tensor field depends on position both through the coordinates and the
bases.
2. In addition to coordinates, we can again introduce components. We write a vector
a with respect to, for example, the covariant basis $g_i$, i. e.

$$ a = a^1\,g_1 + a^2\,g_2 + a^3\,g_3, $$

and we call $a^1 g_1$ the vector component corresponding to $a^1$. Again, vector components
are vectors, but the vector coordinates are not; they are signed (real) numbers.
In general, the holonomic basis vectors of a curvilinear coordinate system are not unit
vectors, so a part of the length of a vector component is contained in the coordinate
and another part in the basis. Thus, contrary to Cartesian coordinates, the magnitude
of a vector coordinate and the magnitude of the corresponding vector component are
generally not equal.

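3. Using the orthogonality relations (3.24) between reciprocal bases we obtain from
(4.13), by an appropriate scalar multiplication, the converse equations

$$
\begin{aligned}
a^i &= a \cdot g^i, \\
a_i &= a \cdot g_i, \\
a^{ij} &= a \cdot\cdot\, g^i g^j = g^i \cdot a \cdot g^j, \\
a^{i}{}_{j} &= a \cdot\cdot\, g^i g_j = g^i \cdot a \cdot g_j, \\
a_{i}{}^{j} &= a \cdot\cdot\, g_i g^j = g_i \cdot a \cdot g^j, \\
a_{ij} &= a \cdot\cdot\, g_i g_j = g_i \cdot a \cdot g_j, \\
a^{ijk} &= a \cdot\cdot\cdot\, g^i g^j g^k = g^i \cdot a \cdot\cdot\, g^j g^k = g^i g^j \cdot\cdot\, a \cdot g^k = g^i g^j g^k \cdot\cdot\cdot\, a.
\end{aligned}
$$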
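
For a vector, the first two converse equations can be tried out directly. The sketch below takes a position-independent basis and a vector given by Cartesian coordinates, builds the reciprocal basis from cross products divided by the triple product (a standard construction, assumed here rather than quoted from the text), and computes both kinds of coordinates. The basis, the vector, and all names are illustrative.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Assumed covariant basis g_1, g_2, g_3, given by Cartesian coordinates.
g_lo = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]

# Reciprocal (contravariant) basis g^1, g^2, g^3.
volume = dot(g_lo[0], cross(g_lo[1], g_lo[2]))
g_up = [tuple(c / volume for c in cross(g_lo[(k + 1) % 3], g_lo[(k + 2) % 3]))
        for k in range(3)]

a_hat = (3, 2, 1)                       # Cartesian coordinates of the vector a

a_up = [dot(a_hat, g) for g in g_up]    # contravariant coordinates a . g^i
a_lo = [dot(a_hat, g) for g in g_lo]    # covariant coordinates a . g_i

# Both representations reproduce the same vector.
from_up = [sum(a_up[i] * g_lo[i][k] for i in range(3)) for k in range(3)]
from_lo = [sum(a_lo[i] * g_up[i][k] for i in range(3)) for k in range(3)]
assert from_up == list(a_hat) and from_lo == list(a_hat)
print(a_up, a_lo)
```

The two coordinate triples differ although they describe the same vector, because the basis is neither orthogonal nor normalized.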


4. Clearly we could also write a position vector with respect to the holonomic bases
of a curvilinear coordinate system, i. e.

$$ x = x^i\,g_i = x_i\,g^i. $$

However, in general neither the $x^i$ nor the $x_i$ depend in a simple manner on the
curvilinear position coordinates $u^i$, so they are not used.

5. In equations where both Cartesian and holonomic curvilinear coordinates appear,
we find it useful to distinguish the Cartesian coordinates from the covariant coordinates
by denoting e. g. the Cartesian coordinates of a vector as $\hat{a}_i$, or written out as
$(a_x, a_y, a_z)$.

Problem 4.3.
A tensor T and a basis $g_i$ are given by their Cartesian coordinates, i. e.

$$
\hat{T}_{ij} = \begin{pmatrix} 1 & -1 & 2 \\ 2 & 2 & -1 \\ -1 & -2 & 1 \end{pmatrix}, \qquad
\begin{aligned}
g_1 &= (1, 0, 0), \\
g_2 &= (1, 1, 0), \\
g_3 &= (1, 1, 1).
\end{aligned}
$$

Find the covariant coordinates $T_{ij}$ of the tensor T in the basis $g^i\,g^j$.

4.2.2 Transformations Between Two Curvilinear Coordinate Systems

1. Let two curvilinear coordinate systems be given by their transformation equations
with respect to the same Cartesian coordinate system, i. e.

$$
\begin{aligned}
u^i &= u^i(x_j), & x_i &= x_i(u^j), \\
\tilde{u}^i &= \tilde{u}^i(x_j), & x_i &= x_i(\tilde{u}^j).
\end{aligned}
$$

By eliminating this Cartesian coordinate system from the above equations, we obtain
analogous transformation equations between the two curvilinear coordinate systems
themselves, i. e.

$$ u^i = u^i(\tilde{u}^j), \qquad \tilde{u}^i = \tilde{u}^i(u^j). \tag{4.16} $$

2. As in Section 4.1.1, we obtain orthogonality relations analogous to (4.3) by writing out the differential of the first equation and taking the appropriate derivative. We have

\[ \frac{\partial u^i}{\partial \tilde{u}^j} \frac{\partial \tilde{u}^j}{\partial u^k} = \delta^i_k. \tag{4.17} \]

Because the tilde and the no-tilde coordinate systems are on an equal footing, it is trivial to write down the equation corresponding to the second equation in (4.3).
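3. From (4.9) it follows that

\[
\mathbf{e}_i = \frac{\partial u^j}{\partial x_i}\, \mathbf{g}_j = \frac{\partial \tilde{u}^j}{\partial x_i}\, \tilde{\mathbf{g}}_j, \qquad
\mathbf{e}_i = \frac{\partial x_i}{\partial u^j}\, \mathbf{g}^j = \frac{\partial x_i}{\partial \tilde{u}^j}\, \tilde{\mathbf{g}}^j,
\]

\[
\frac{\partial u^j}{\partial x_i}\, \mathbf{g}_j = \frac{\partial \tilde{u}^j}{\partial x_i}\, \tilde{\mathbf{g}}_j, \qquad
\frac{\partial x_i}{\partial u^j}\, \mathbf{g}^j = \frac{\partial x_i}{\partial \tilde{u}^j}\, \tilde{\mathbf{g}}^j.
\]

Contraction with $\partial x_i/\partial \tilde{u}^m$ and $\partial \tilde{u}^m/\partial x_i$, respectively, gives

\[
\underbrace{\frac{\partial u^j}{\partial x_i} \frac{\partial x_i}{\partial \tilde{u}^m}}_{\partial u^j/\partial \tilde{u}^m} \mathbf{g}_j
= \underbrace{\frac{\partial \tilde{u}^j}{\partial x_i} \frac{\partial x_i}{\partial \tilde{u}^m}}_{\delta^j_m} \tilde{\mathbf{g}}_j, \qquad
\underbrace{\frac{\partial \tilde{u}^m}{\partial x_i} \frac{\partial x_i}{\partial u^j}}_{\partial \tilde{u}^m/\partial u^j} \mathbf{g}^j
= \underbrace{\frac{\partial \tilde{u}^m}{\partial x_i} \frac{\partial x_i}{\partial \tilde{u}^j}}_{\delta^m_j} \tilde{\mathbf{g}}^j,
\]

\[
\tilde{\mathbf{g}}_m = \frac{\partial u^j}{\partial \tilde{u}^m}\, \mathbf{g}_j, \qquad
\tilde{\mathbf{g}}^m = \frac{\partial \tilde{u}^m}{\partial u^j}\, \mathbf{g}^j. \tag{4.18}
\]

Since the two coordinate systems are on an equal footing, we can exchange the tilde and the no-tilde quantities in these equations.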
The two equations (4.18) are clearly the transformation equations for the covariant and for the contravariant bases of two curvilinear coordinate systems. They generalize the transformation equations (2.8) between the bases of two Cartesian coordinate systems. The matrices $\partial u^j/\partial \tilde{u}^m$ and $\partial \tilde{u}^m/\partial u^j$ in these equations are, according to (4.17), inverses of each other and therefore regular; they are called the transformation matrices between the two curvilinear coordinate systems. Clearly they generalize the (orthogonal) transformation matrices $\alpha_{ij}$ and $\alpha_{ji} = \alpha^T_{ij}$ between two Cartesian coordinate systems, which are also inverses of each other.
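As a hedged numerical illustration of (4.17) and (4.18), the following Python sketch takes cylindrical coordinates $(R, \varphi, z)$ as the no-tilde system and spherical coordinates $(r, \vartheta, \varphi)$ as the tilde system. The choice of systems, the sample point, and all variable names are assumptions made for this example; the basis vectors are written down from the standard formulas for these two systems.

```python
import math

# Assumed sample point, given in spherical coordinates (r, theta, phi).
r, theta, phi = 2.0, 0.7, 1.1
s, c = math.sin(theta), math.cos(theta)
sp, cp = math.sin(phi), math.cos(phi)
R = r * s                                  # cylindrical radius (z = r * c)

# du_dut[j][m] = d u^j / d u~^m, from R = r sin(theta), phi = phi, z = r cos(theta)
du_dut = [[s,   r * c,  0.0],
          [0.0, 0.0,    1.0],
          [c,   -r * s, 0.0]]
# dut_du[m][j] = d u~^m / d u^j, from r = sqrt(R^2 + z^2), theta = atan2(R, z)
dut_du = [[s,     0.0, c],
          [c / r, 0.0, -s / r],
          [0.0,   1.0, 0.0]]

# (4.17): the two transformation matrices are inverses of each other.
product = [[sum(du_dut[i][j] * dut_du[j][k] for j in range(3)) for k in range(3)]
           for i in range(3)]

# Covariant bases in Cartesian coordinates (standard results, assumed here).
g_cyl = [(cp, sp, 0.0), (-R * sp, R * cp, 0.0), (0.0, 0.0, 1.0)]
g_sph = [(s * cp, s * sp, c),
         (r * c * cp, r * c * sp, -r * s),
         (-r * s * sp, r * s * cp, 0.0)]

# (4.18): g~_m = (d u^j / d u~^m) g_j
g_sph_from_cyl = [tuple(sum(du_dut[j][m] * g_cyl[j][k] for j in range(3))
                        for k in range(3)) for m in range(3)]
```

The matrix product reproduces the Kronecker symbol, and the spherical basis obtained from the cylindrical one via (4.18) agrees with the directly computed spherical basis.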
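4. From the equations (4.18) we immediately obtain the transformation equations for tensor coordinates of the same type. For example, we have

\[
\mathbf{a} = a^i \mathbf{g}_i = \tilde{a}^i \tilde{\mathbf{g}}_i = \tilde{a}^i \frac{\partial u^j}{\partial \tilde{u}^i}\, \mathbf{g}_j
\quad \text{or} \quad
a^i = \frac{\partial u^i}{\partial \tilde{u}^k}\, \tilde{a}^k.
\]

Hence the transformation equations between the holonomic coordinates of the same type for a tensor in two curvilinear coordinate systems are

\[
\begin{aligned}
\tilde{a}^i &= \frac{\partial \tilde{u}^i}{\partial u^m}\, a^m, &
\tilde{a}_i &= \frac{\partial u^m}{\partial \tilde{u}^i}\, a_m, \\
\tilde{a}^{ij} &= \frac{\partial \tilde{u}^i}{\partial u^m} \frac{\partial \tilde{u}^j}{\partial u^n}\, a^{mn}, &
\tilde{a}^i{}_j &= \frac{\partial \tilde{u}^i}{\partial u^m} \frac{\partial u^n}{\partial \tilde{u}^j}\, a^m{}_n, \\
\tilde{a}_i{}^j &= \frac{\partial u^m}{\partial \tilde{u}^i} \frac{\partial \tilde{u}^j}{\partial u^n}\, a_m{}^n, &
\tilde{a}_{ij} &= \frac{\partial u^m}{\partial \tilde{u}^i} \frac{\partial u^n}{\partial \tilde{u}^j}\, a_{mn}.
\end{aligned}
\tag{4.19}
\]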

These equations generalize the transformation equations (2.17) for the coordinates of
a tensor in two Cartesian coordinate systems (when both are right-handed systems).
5. Thus a covariant index of a tensor coordinate transforms like a covariant index of a basis and “oppositely” to a contravariant index of a tensor coordinate or a basis. This is the case because contraction of a covariant index with a contravariant index (and vice versa) results in an index-free expression, which as such is independent of the coordinate system, and this does not depend on whether these indices are indices of a tensor or of a basis: $a_i b^i = c$, $a_i \mathbf{g}^i = \mathbf{a}$, $\mathbf{g}_i \mathbf{g}^i = \boldsymbol{\delta}$. (This property of covariant and contravariant indices explains the names covariant and contravariant: covariant means changing similarly, contravariant means changing oppositely.)
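The “opposite” transformation behavior can be made concrete with a small numerical sketch. The two constant matrices below stand for a hypothetical linear change of coordinates; they, the sample coordinates, and the variable names are assumptions of this example and are not taken from the text.

```python
# Hypothetical constant transformation matrices (a linear change of coordinates).
J = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]     # J[i][m] = d u~^i / d u^m
K = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]   # K[m][i] = d u^m / d u~^i, the inverse of J
R3 = range(3)

# Orthogonality relation (4.17): J K = identity
identity = [[sum(J[i][m] * K[m][k] for m in R3) for k in R3] for i in R3]

a_up = [1, 2, 3]     # contravariant coordinates a^m
b_lo = [4, -1, 2]    # covariant coordinates b_m

a_up_t = [sum(J[i][m] * a_up[m] for m in R3) for i in R3]   # a~^i = (d u~^i / d u^m) a^m
b_lo_t = [sum(K[m][i] * b_lo[m] for m in R3) for i in R3]   # b~_i = (d u^m / d u~^i) b_m

c = sum(a_up[i] * b_lo[i] for i in R3)        # a^i b_i in the no-tilde system
c_t = sum(a_up_t[i] * b_lo_t[i] for i in R3)  # a~^i b~_i in the tilde system
```

The coordinates themselves change, but the contraction of an upper with a lower index gives the same number in both systems.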
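6. The transformation law for tensor coordinates depends only on the number of the upper and lower indices, but not on their order; according to (4.19) we have

\[
\tilde{a}^i{}_j = \frac{\partial \tilde{u}^i}{\partial u^m} \frac{\partial u^n}{\partial \tilde{u}^j}\, a^m{}_n, \qquad
\tilde{a}_j{}^i = \frac{\partial \tilde{u}^i}{\partial u^m} \frac{\partial u^n}{\partial \tilde{u}^j}\, a_n{}^m.
\]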

Problem 4.4.
Compute the transformation equations between the Cartesian coordinates and the
holonomic cylindrical coordinates of a vector.


Problem 4.5.
A. For the oblique coordinate system from Problem 4.1, compute the transformation equations $A^i = A^i(\hat{A}_j)$, $\hat{A}_i = \hat{A}_i(A^j)$, and $A_i = A_i(\hat{A}_j)$ between the holonomic and the Cartesian vector coordinates.
B. Add the representations $\mathbf{A} = A^i \mathbf{g}_i$ and $\mathbf{A} = A_i \mathbf{g}^i$ to the graphs (in each case as addition of two vectors). Show that the contravariant vector coordinates can be constructed by parallel projection and that the covariant vector coordinates can be constructed by perpendicular projection onto the coordinate lines.
Hint: First verify geometrically the computed representations $\hat{A}_i = \hat{A}_i(A^j)$ and $A_i = A_i(\hat{A}_j)$ using auxiliary lines.

4.2.3 The Summation Convention

The distinction between upper and lower indices in holonomic bases and tensor co-
ordinates requires that we adjust the summation convention and its consequent rules
for the use of running indices. This follows from the representation equations (4.13)
and (4.14) and from the transformation equations (4.18) and (4.19).
We modify the summation convention for equations in which, besides tensors,
only holonomic tensor coordinates, holonomic bases, and transformation matrices
appear, i. e. for tensor equations, transformation equations, and representation equa-
tions (compare Section 2.7.4) we have the following rule.

If a running index appears twice in a term, once as a lower index and once as an
upper index, summation over the values of the index from one to three is implied,
without explicitly writing the summation sign. In a transformation matrix, the index
in the “numerator” counts as an upper index and the index in the “denominator”
counts as a lower index.


This has the following implications for running indices (see Section 1.1, No. 4):
– A running index can appear only once or twice in a term. If it appears twice, one
must be in upper position and one in lower position.
– Each running index must take the values one, two, three. (In Chapters 2 to 4 we
restrict ourselves to the three-dimensional space of our intuition.)
– All terms of an equation must match in their free indices, also with respect to their
positions, but not with respect to their order.

In the rare cases where we do not want to apply the summation convention, we again
underline the corresponding running index.

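4.2.4 The δ-Tensor

Holonomic Coordinates

1. For the holonomic coordinates of the δ-tensor a special notation is used, and we write

\[
\boldsymbol{\delta} = g^{ij}\, \mathbf{g}_i \mathbf{g}_j = \delta^i_j\, \mathbf{g}_i \mathbf{g}^j = \delta_i^j\, \mathbf{g}^i \mathbf{g}_j = g_{ij}\, \mathbf{g}^i \mathbf{g}^j. \tag{4.20}
\]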

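Conversely, using (4.14), we obtain for the coordinates

\[
\begin{aligned}
g^{ij} &= \boldsymbol{\delta} \cdot\cdot\, \mathbf{g}^i \mathbf{g}^j = \mathbf{g}^i \cdot \mathbf{g}^j, \\
\delta^i_j &= \boldsymbol{\delta} \cdot\cdot\, \mathbf{g}^i \mathbf{g}_j = \mathbf{g}^i \cdot \mathbf{g}_j \overset{(3.24)}{=} \delta_{ij}, \\
\delta_i^j &= \boldsymbol{\delta} \cdot\cdot\, \mathbf{g}_i \mathbf{g}^j = \mathbf{g}_i \cdot \mathbf{g}^j \overset{(3.24)}{=} \delta_{ij}, \\
g_{ij} &= \boldsymbol{\delta} \cdot\cdot\, \mathbf{g}_i \mathbf{g}_j = \mathbf{g}_i \cdot \mathbf{g}_j,
\end{aligned}
\]

or, using the fact that the scalar product of two vectors does not depend on the order of its factors, we write the following.

\[
\begin{aligned}
\delta_i^j = \delta^j_i &= \mathbf{g}_i \cdot \mathbf{g}^j = \mathbf{g}^j \cdot \mathbf{g}_i = \delta_{ij}, \\
g^{ij} = g^{ji} &= \mathbf{g}^i \cdot \mathbf{g}^j, \\
g_{ij} = g_{ji} &= \mathbf{g}_i \cdot \mathbf{g}_j.
\end{aligned}
\tag{4.21}
\]

Here (in accordance with the definition (1.19) of the generalized Kronecker symbols), $\delta^i_j$ and $\delta_i^j$ have the same meaning as $\delta_{ij}$; so we will set one index of the Kronecker symbol as an upper index if this is required to satisfy the rules for computations with running indices for curvilinear coordinates. Because the order of the indices does not matter in the Kronecker symbol, often both indices are set on top of each other (unlike for mixed tensor coordinates, where the order does matter).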


2. The $g^{ij}$ and $g_{ij}$ are also called metric coefficients (“measure coefficients”). This makes sense, because we need them to compute the length of a vector (to “measure” its length) from the three holonomic coordinates; if, for example, the covariant coordinates $a_i$ of a vector $\mathbf{a}$ are given, then we compute its length as

\[
a = \sqrt{\mathbf{a} \cdot \mathbf{a}} = \sqrt{a_i \mathbf{g}^i \cdot a_j \mathbf{g}^j} = \sqrt{a_i a_j\, \mathbf{g}^i \cdot \mathbf{g}^j} = \sqrt{a_i a_j g^{ij}},
\]

i. e. we need both the covariant coordinates $a_i$ and the contravariant metric coefficients. This explains why the δ-tensor is called the metric tensor (a name we had already mentioned earlier).
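The length formula can be tried out on a small example. The following Python sketch assumes an oblique basis and a vector chosen purely for illustration; the names g_lo, g_up, G_up, and naive are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Assumed oblique basis g_i and its reciprocal basis g^i (Cartesian coordinates).
g_lo = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
g_up = [(1.0, -1.0, 0.0), (0.0, 1.0, -1.0), (0.0, 0.0, 1.0)]

a = (2.0, -1.0, 3.0)                      # Cartesian coordinates of a
a_lo = [dot(a, g) for g in g_lo]          # covariant coordinates a_i = a . g_i

# Contravariant metric coefficients g^{ij} = g^i . g^j
G_up = [[dot(gi, gj) for gj in g_up] for gi in g_up]

# |a| = sqrt(a_i a_j g^{ij})
length = math.sqrt(sum(a_lo[i] * a_lo[j] * G_up[i][j]
                       for i in range(3) for j in range(3)))

# Ignoring the metric (treating a_i like Cartesian coordinates) gives a different number.
naive = math.sqrt(sum(x * x for x in a_lo))
```

With the metric coefficients the result agrees with the Cartesian length of the vector; without them it does not.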

3. Using (4.21), we can also interpret the coordinates of the δ-tensor geometrically; they are the scalar products of two basis vectors. We compute for the mixed coordinates in every curvilinear coordinate system, according to the definition (3.21) of reciprocal bases or the orthogonality relations (3.24), the same values, namely 1 if both indices are equal and 0 if they are not equal. The covariant coordinates and the contravariant coordinates of the δ-tensor, however, depend on the coordinate system; coordinates with the same indices are the square of the length of the corresponding basis vector, e. g. we have

\[ g_{11} = |\mathbf{g}_1|^2. \]

Coordinates with different indices are the product of the lengths of the corresponding two basis vectors and the cosine of the enclosed angle, e. g.

\[ g^{13} = |\mathbf{g}^1|\, |\mathbf{g}^3| \cos(\mathbf{g}^1, \mathbf{g}^3). \]

Problem 4.6.
Compute the covariant and contravariant cylindrical coordinates of the δ-tensor.

Properties of the Metric Coefficients

1. The metric coefficients of a curvilinear coordinate system at a point (or shorter, the metric of a curvilinear coordinate system) are defined by the lengths of the three basis vectors and the angles enclosed by them, i. e. by six parameters. All coordinate systems which have the same values for these six parameters (at every point) have the same metric coefficients. Geometrically, these are coordinate systems whose bases can be obtained from each other by a rotation or by a rotary reflection.

2. Now we investigate some properties of the metric coefficients. What conditions must a square matrix of order 3 satisfy, so that it can be interpreted as a matrix of metric coefficients?


The most obvious property or condition is that the metric coefficients are symmetric, i. e.

\[ g_{ij} = g_{ji}, \qquad g^{ij} = g^{ji}. \tag{4.22} \]

Next, from the condition

\[ a^i a^j g_{ij} = a_i a_j g^{ij} = \mathbf{a} \cdot \mathbf{a} > 0, \]

which is true for any vector $\mathbf{a}$ (except the zero vector), it follows that the matrices of the metric coefficients $g_{ij}$ and $g^{ij}$ are positive definite. From this, according to Theorem 5 and Theorem 4 from Section 3.12.4, it follows that the determinant and all principal subdeterminants (e. g. for a matrix of order 3 the principal minors and the elements on the main diagonal) are positive.
First, the diagonal elements must be positive, i. e.

\[ g_{\underline{i}\underline{i}} > 0, \qquad g^{\underline{i}\underline{i}} > 0. \tag{4.23} \]

This we see also from their geometric interpretation as the square of the length of the basis vectors. Secondly, the principal minors must be positive, i. e. the subdeterminants of order 2, which are obtained by dropping a row and its associated column, so we must have

\[
\begin{vmatrix} g_{\underline{i}\underline{i}} & g_{ij} \\ g_{ji} & g_{\underline{j}\underline{j}} \end{vmatrix} > 0, \qquad
\begin{vmatrix} g^{\underline{i}\underline{i}} & g^{ij} \\ g^{ji} & g^{\underline{j}\underline{j}} \end{vmatrix} > 0, \qquad i \neq j. \tag{4.24}
\]

Lastly, the determinant of the metric coefficients itself must also be positive, and this must be true independently of the orientation of the basis vectors. We have e. g.

\[
\det g_{ij} =
\begin{vmatrix} g_{11} & g_{12} & g_{13} \\ g_{21} & g_{22} & g_{23} \\ g_{31} & g_{32} & g_{33} \end{vmatrix}
=
\begin{vmatrix}
\mathbf{g}_1 \cdot \mathbf{g}_1 & \mathbf{g}_1 \cdot \mathbf{g}_2 & \mathbf{g}_1 \cdot \mathbf{g}_3 \\
\mathbf{g}_2 \cdot \mathbf{g}_1 & \mathbf{g}_2 \cdot \mathbf{g}_2 & \mathbf{g}_2 \cdot \mathbf{g}_3 \\
\mathbf{g}_3 \cdot \mathbf{g}_1 & \mathbf{g}_3 \cdot \mathbf{g}_2 & \mathbf{g}_3 \cdot \mathbf{g}_3
\end{vmatrix}
= [\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3]\, [\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3],
\]

which is clearly always positive. Writing it as a square, we have

\[ \det g_{ij} = [\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3]^2 > 0, \qquad \det g^{ij} = [\mathbf{g}^1, \mathbf{g}^2, \mathbf{g}^3]^2 > 0. \tag{4.25} \]
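The positivity conditions (4.23) to (4.25) can be checked on an example. The sketch below uses an assumed oblique basis and, as a second case, the same basis with two vectors exchanged (a left-handed basis); all names (metric, det3, right, left) are illustrative choices.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def metric(basis):
    """Covariant metric coefficients g_ij = g_i . g_j."""
    return [[dot(gi, gj) for gj in basis] for gi in basis]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

right = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]  # assumed right-handed basis
left = [right[1], right[0], right[2]]                         # two vectors exchanged

G = metric(right)
diagonal = [G[i][i] for i in range(3)]                        # (4.23)
minors = [G[i][i] * G[j][j] - G[i][j] * G[j][i]               # (4.24)
          for i in range(3) for j in range(i + 1, 3)]

vol_right = dot(right[0], cross(right[1], right[2]))          # triple product [g_1, g_2, g_3]
vol_left = dot(left[0], cross(left[1], left[2]))
det_right = det3(G)                                           # (4.25)
det_left = det3(metric(left))
```

The triple product changes its sign with the orientation of the basis, whereas the determinant of the metric coefficients stays positive.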

3. Because the two holonomic bases are reciprocal, one is uniquely determined by the other, and thus the contravariant metric coefficients are uniquely determined by the covariant metric coefficients and vice versa. Using this, we obtain relations between the two types of metric coefficients. We have

\[
g_{ij}\, g^{jk} = \mathbf{g}_i \cdot \mathbf{g}_j\, \mathbf{g}^j \cdot \mathbf{g}^k = \mathbf{g}_i \cdot \boldsymbol{\delta} \cdot \mathbf{g}^k = \mathbf{g}_i \cdot \mathbf{g}^k = \delta_i^k,
\]

\[ g_{ij}\, g^{jk} = \delta_i^k, \tag{4.26} \]

i. e. the matrices of the two types of metric coefficients are inverses of each other, and one can be computed from the other using the Gauss–Jordan algorithm.
4. Clearly we have

\[ \det g_{ij}\, \det g^{ij} \overset{(1.13)}{=} \det (g_{ij}\, g^{jk}) \overset{(4.26)}{=} 1. \]

If we introduce the symbol $g$ for the determinant of $g_{ij}$, we have

\[ g := \det g_{ij}, \qquad \det g^{ij} = \frac{1}{g}, \qquad \det g_{ij}\, \det g^{ij} = 1. \tag{4.27} \]

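5. Using (2.46) and (4.25), we further have

\[
[\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3]\, [\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3] =
\begin{vmatrix}
\mathbf{g}_1 \cdot \mathbf{g}_1 & \mathbf{g}_1 \cdot \mathbf{g}_2 & \mathbf{g}_1 \cdot \mathbf{g}_3 \\
\mathbf{g}_2 \cdot \mathbf{g}_1 & \mathbf{g}_2 \cdot \mathbf{g}_2 & \mathbf{g}_2 \cdot \mathbf{g}_3 \\
\mathbf{g}_3 \cdot \mathbf{g}_1 & \mathbf{g}_3 \cdot \mathbf{g}_2 & \mathbf{g}_3 \cdot \mathbf{g}_3
\end{vmatrix} = g
\]

and accordingly

\[ [\mathbf{g}^1, \mathbf{g}^2, \mathbf{g}^3]^2 = \frac{1}{g}. \]

So we obtain, together with (4.10),

\[
\begin{aligned}
[\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3] &= \frac{\partial(x_1, x_2, x_3)}{\partial(u^1, u^2, u^3)} = \sqrt{\det g_{ij}} = \sqrt{g}, \\
[\mathbf{g}^1, \mathbf{g}^2, \mathbf{g}^3] &= \frac{\partial(u^1, u^2, u^3)}{\partial(x_1, x_2, x_3)} = \sqrt{\det g^{ij}} = \frac{1}{\sqrt{g}}.
\end{aligned}
\tag{4.28}
\]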
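The relations (4.26) to (4.28) can be verified numerically for a sample basis. The following Python sketch assumes a right-handed oblique basis whose triple product is 2, so that $g = 4$; the basis and the helper names are assumptions of this example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

g_lo = [(2.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]   # assumed covariant basis
vol_lo = dot(g_lo[0], cross(g_lo[1], g_lo[2]))                # [g_1, g_2, g_3]

# Reciprocal (contravariant) basis g^i
g_up = [tuple(c / vol_lo for c in cross(g_lo[1], g_lo[2])),
        tuple(c / vol_lo for c in cross(g_lo[2], g_lo[0])),
        tuple(c / vol_lo for c in cross(g_lo[0], g_lo[1]))]
vol_up = dot(g_up[0], cross(g_up[1], g_up[2]))                # [g^1, g^2, g^3]

G_lo = [[dot(a, b) for b in g_lo] for a in g_lo]              # g_ij
G_up = [[dot(a, b) for b in g_up] for a in g_up]              # g^ij

# (4.26): g_ij g^jk = delta_i^k
product = [[sum(G_lo[i][j] * G_up[j][k] for j in range(3)) for k in range(3)]
           for i in range(3)]

g = det3(G_lo)                                                # (4.27)
det_up = det3(G_up)
```

For this basis the sketch gives $g = 4$, $\det g^{ij} = 1/4$, and the two triple products $2 = \sqrt{g}$ and $1/2 = 1/\sqrt{g}$.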

6. From the geometric interpretation of the metric coefficients as scalar products of basis vectors, it follows immediately that for orthogonal coordinate systems all metric coefficients with different indices are zero, and that homologous covariant and contravariant metric coefficients are reciprocal, i. e.

\[
g_{ij} = g_{\underline{i}\underline{i}}\, \delta_{ij}, \qquad g^{ij} = g^{\underline{i}\underline{i}}\, \delta_{ij}, \qquad g^{\underline{i}\underline{i}}\, g_{\underline{i}\underline{i}} = 1. \tag{4.29}
\]

For Cartesian coordinates we have

\[ g_{ij} = g^{ij} = \delta_{ij}. \tag{4.30} \]


Problem 4.7.
Find the transformation equation for $g := \det g_{ij}$, when changing from one curvilinear coordinate system to another.
Hint: Start with the transformation equations (4.19) for the covariant tensor coordinates $g_{ij}$.

4.2.5 Raising and Lowering Indices

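Clearly we have

\[
\begin{aligned}
g^{ij}\, \mathbf{g}_j &= \mathbf{g}^i \cdot \mathbf{g}^j\, \mathbf{g}_j = \mathbf{g}^i \cdot \boldsymbol{\delta} = \mathbf{g}^i, \\
g_{ij}\, \mathbf{g}^j &= \mathbf{g}_i \cdot \mathbf{g}_j\, \mathbf{g}^j = \mathbf{g}_i \cdot \boldsymbol{\delta} = \mathbf{g}_i.
\end{aligned}
\]

It follows further from

\[ \mathbf{a} = a^i \mathbf{g}_i = a_i \mathbf{g}^i, \]

after scalar multiplication by $\mathbf{g}^j$ and $\mathbf{g}_j$, respectively, that

\[
\begin{aligned}
a^i\, \mathbf{g}_i \cdot \mathbf{g}^j &= a_i\, \mathbf{g}^i \cdot \mathbf{g}^j, & a^i \delta_i^j &= a_i g^{ij}, & a^j &= g^{ji} a_i, \\
a^i\, \mathbf{g}_i \cdot \mathbf{g}_j &= a_i\, \mathbf{g}^i \cdot \mathbf{g}_j, & a^i g_{ij} &= a_i \delta^i_j, & a_j &= g_{ji} a^i.
\end{aligned}
\]

The same applies to higher-order tensors; for example, from

\[
\mathbf{a} = a^{ij}\, \mathbf{g}_i \mathbf{g}_j = a^i{}_j\, \mathbf{g}_i \mathbf{g}^j = a_i{}^j\, \mathbf{g}^i \mathbf{g}_j = a_{ij}\, \mathbf{g}^i \mathbf{g}^j
\]

it follows after scalar multiplication by $\mathbf{g}^k$ from the right that

\[
\begin{aligned}
a^{ij}\, \mathbf{g}_i\, \mathbf{g}_j \cdot \mathbf{g}^k &= a^i{}_j\, \mathbf{g}_i\, \mathbf{g}^j \cdot \mathbf{g}^k & &\text{and} & a_i{}^j\, \mathbf{g}^i\, \mathbf{g}_j \cdot \mathbf{g}^k &= a_{ij}\, \mathbf{g}^i\, \mathbf{g}^j \cdot \mathbf{g}^k, \\
a^{ij}\, \mathbf{g}_i\, \delta_j^k &= a^i{}_j\, \mathbf{g}_i\, g^{jk} & &\text{and} & a_i{}^j\, \mathbf{g}^i\, \delta_j^k &= a_{ij}\, \mathbf{g}^i\, g^{jk}, \\
a^{ik} &= g^{kj} a^i{}_j & &\text{and} & a_i{}^k &= g^{kj} a_{ij}.
\end{aligned}
\]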


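Thus we can raise or lower an index in a basis or in tensor coordinates by contracting with the corresponding metric coefficient, i. e.

\[
\begin{aligned}
\mathbf{g}^i &= g^{im}\, \mathbf{g}_m, \\
\mathbf{g}_i &= g_{im}\, \mathbf{g}^m, \\
a^i &= g^{im} a_m, \\
a_i &= g_{im} a^m, \\
a^{ij} &= g^{im} a_m{}^j = g^{jn} a^i{}_n = g^{im} g^{jn} a_{mn}, \\
a^i{}_j &= g^{im} a_{mj} = g_{jn} a^{in} = g^{im} g_{jn} a_m{}^n, \\
a_i{}^j &= g_{im} a^{mj} = g^{jn} a_{in} = g_{im} g^{jn} a^m{}_n, \\
a_{ij} &= g_{im} a^m{}_j = g_{jn} a_i{}^n = g_{im} g_{jn} a^{mn}.
\end{aligned}
\tag{4.31}
\]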
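A few of the relations (4.31) are spelled out in the following Python sketch. The metric coefficients belong to an assumed oblique basis with $\mathbf{g}_1 = (1, 0, 0)$, $\mathbf{g}_2 = (1, 1, 0)$, $\mathbf{g}_3 = (1, 1, 1)$; the sample coordinates and all variable names are hypothetical.

```python
R3 = range(3)

# Metric coefficients of the assumed basis g_1 = (1,0,0), g_2 = (1,1,0), g_3 = (1,1,1)
G_lo = [[1, 1, 1], [1, 2, 2], [1, 2, 3]]       # g_ij
G_up = [[2, -1, 0], [-1, 2, -1], [0, -1, 1]]   # g^ij (inverse of g_ij)

# Vector: lower and raise again
a_up = [3, -4, 3]                                                     # a^m
a_lo = [sum(G_lo[i][m] * a_up[m] for m in R3) for i in R3]            # a_i = g_im a^m
a_up_again = [sum(G_up[i][m] * a_lo[m] for m in R3) for i in R3]      # a^i = g^im a_m

# Second-order tensor: a^{ij} -> a^i_j -> a_{ij} -> a^{ij}
A_upup = [[1, 2, 0], [0, 1, -1], [3, 0, 2]]
A_mixed = [[sum(G_lo[j][n] * A_upup[i][n] for n in R3) for j in R3]   # a^i_j = g_jn a^in
           for i in R3]
A_lolo = [[sum(G_lo[i][m] * A_mixed[m][j] for m in R3) for j in R3]   # a_ij = g_im a^m_j
          for i in R3]
A_back = [[sum(G_up[i][m] * G_up[j][n] * A_lolo[m][n] for m in R3 for n in R3)
           for j in R3] for i in R3]                                  # a^ij = g^im g^jn a_mn
```

Lowering an index and raising it again returns the original coordinates, because the two matrices of metric coefficients are inverses of each other.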

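4.2.6 The ε-Tensor

Holonomic Coordinates

1. We denote the holonomic coordinates of the ε-tensor by e. There are eight different versions of e:

\[
\boldsymbol{\varepsilon} = e^{ijk}\, \mathbf{g}_i \mathbf{g}_j \mathbf{g}_k = e^{ij}{}_k\, \mathbf{g}_i \mathbf{g}_j \mathbf{g}^k = \dots = e_{ijk}\, \mathbf{g}^i \mathbf{g}^j \mathbf{g}^k. \tag{4.32}
\]

From (4.14) and (2.45) we obtain the converse equations

\[
\begin{aligned}
e^{ijk} &= \boldsymbol{\varepsilon} \cdot\cdot\cdot\, \mathbf{g}^i \mathbf{g}^j \mathbf{g}^k = [\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}^k], \\
e^{ij}{}_k &= \boldsymbol{\varepsilon} \cdot\cdot\cdot\, \mathbf{g}^i \mathbf{g}^j \mathbf{g}_k = [\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}_k], \\
e_{ijk} &= \boldsymbol{\varepsilon} \cdot\cdot\cdot\, \mathbf{g}_i \mathbf{g}_j \mathbf{g}_k = [\mathbf{g}_i, \mathbf{g}_j, \mathbf{g}_k].
\end{aligned}
\]

According to (2.46), we can write the square of these triple products as determinants:

\[
[\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}^k]\, [\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}^k] =
\begin{vmatrix}
g^{\underline{i}\underline{i}} & g^{ij} & g^{ik} \\
g^{ji} & g^{\underline{j}\underline{j}} & g^{jk} \\
g^{ki} & g^{kj} & g^{\underline{k}\underline{k}}
\end{vmatrix},
\]

\[
[\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}_k]\, [\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}_k] =
\begin{vmatrix}
g^{\underline{i}\underline{i}} & g^{ij} & \delta^i_k \\
g^{ji} & g^{\underline{j}\underline{j}} & \delta^j_k \\
\delta^i_k & \delta^j_k & g_{\underline{k}\underline{k}}
\end{vmatrix}.
\]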


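Thus we obtain the converse equations to (4.32) as follows.

\[
\begin{aligned}
e^{ijk} &= [\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}^k] = \pm \sqrt{
\begin{vmatrix}
g^{\underline{i}\underline{i}} & g^{ij} & g^{ik} \\
g^{ji} & g^{\underline{j}\underline{j}} & g^{jk} \\
g^{ki} & g^{kj} & g^{\underline{k}\underline{k}}
\end{vmatrix}}, \\
e^{ij}{}_k &= [\mathbf{g}^i, \mathbf{g}^j, \mathbf{g}_k] = \pm \sqrt{
\begin{vmatrix}
g^{\underline{i}\underline{i}} & g^{ij} & \delta^i_k \\
g^{ji} & g^{\underline{j}\underline{j}} & \delta^j_k \\
\delta^i_k & \delta^j_k & g_{\underline{k}\underline{k}}
\end{vmatrix}}, \\
e_{ijk} &= [\mathbf{g}_i, \mathbf{g}_j, \mathbf{g}_k] = \pm \sqrt{
\begin{vmatrix}
g_{\underline{i}\underline{i}} & g_{ij} & g_{ik} \\
g_{ji} & g_{\underline{j}\underline{j}} & g_{jk} \\
g_{ki} & g_{kj} & g_{\underline{k}\underline{k}}
\end{vmatrix}}.
\end{aligned}
\tag{4.33}
\]

All eight versions can be obtained from each other by raising or lowering single indices and by replacing g with δ as necessary.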

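2. The sign of a holonomic coordinate of the ε-tensor changes if we change the order of two indices, e. g. i and j. The sign does not change if we raise or lower an index. For example, writing $g_{1i}$ for the i-th Cartesian coordinate of $\mathbf{g}_1$ (and analogously for the other basis vectors), we have

\[
\begin{aligned}
[\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}^3]
&\overset{(2.45)}{=} \varepsilon_{ijk}\, g_{1i}\, g_{2j}\, g^3_k
\overset{(3.22)}{=} \varepsilon_{ijk}\, g_{1i}\, g_{2j}\, \frac{\varepsilon_{pqk}\, g_{1p}\, g_{2q}}{\varepsilon_{lmn}\, g_{1l}\, g_{2m}\, g_{3n}} \\
&= \frac{(\delta_{ip} \delta_{jq} - \delta_{iq} \delta_{jp})\, g_{1i}\, g_{2j}\, g_{1p}\, g_{2q}}{\varepsilon_{lmn}\, g_{1l}\, g_{2m}\, g_{3n}}
= \frac{g_{1i}\, g_{1i}\, g_{2j}\, g_{2j} - g_{1i}\, g_{2i}\, g_{1j}\, g_{2j}}{\varepsilon_{lmn}\, g_{1l}\, g_{2m}\, g_{3n}}
= \frac{\begin{vmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{vmatrix}}{[\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3]}.
\end{aligned}
\]

Since the numerator is, according to (4.24), always positive, $[\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}^3]$ has the same sign as $[\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3]$.

Properties of the Holonomic Coordinates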

1. Using (4.33) and (4.28), we have

\[
e_{123} = [\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3] = \sqrt{g}, \qquad
e^{123} = [\mathbf{g}^1, \mathbf{g}^2, \mathbf{g}^3] = \frac{1}{\sqrt{g}}, \qquad
e_{123}\, e^{123} = 1.
\]

Since a triple product is zero if two of its vectors are equal and since it changes its sign when we change the order of two of its vectors, we have for the entirely covariant coordinates and the entirely contravariant coordinates of the ε-tensor²

\[
e_{ijk} = \sqrt{g}\, \varepsilon_{ijk}, \qquad e^{ijk} = \frac{1}{\sqrt{g}}\, \varepsilon_{ijk}, \qquad e^{ijk} = \frac{1}{g}\, e_{ijk}. \tag{4.34}
\]

2. The holonomic coordinates of the ε-tensor only change their sign if we exchange two indices, while keeping their positions (upper or lower). The sign does not change when the indices are changed cyclically, while keeping their positions; for example, we have

\[
e^{ij}{}_k = e^j{}_k{}^i = e_k{}^{ij} = -e_k{}^{ji} = -e^{ji}{}_k = -e^i{}_k{}^j.
\]

3. Holonomic coordinates of the ε-tensor are zero:

– if at least two indices in the same positions (both either lower or upper) are equal
(because then two vectors in the corresponding triple product are equal),
– for orthogonal coordinate systems, also, if at least two indices in different posi-
tions are equal (because then two homologous vectors of the corresponding re-
ciprocal bases are collinear).

4. We can interpret (1.26) as an equation between the Cartesian coordinates of the ε-tensor and the Cartesian coordinates of the δ-tensor, i. e. as a tensor equation in Cartesian coordinates. Its translation³ e. g. into contravariant curvilinear coordinates is

\[
e^{ijk}\, e^{pqr} =
\begin{vmatrix}
g^{ip} & g^{iq} & g^{ir} \\
g^{jp} & g^{jq} & g^{jr} \\
g^{kp} & g^{kq} & g^{kr}
\end{vmatrix},
\tag{4.35}
\]

where indices can be lowered or raised and indices in mixed positions can be set both in upper or lower position. Thus we obtain the translated Grassmann identity (1.35):

\[ e^{ijk}\, e^{pq}{}_k = g^{ip} g^{jq} - g^{iq} g^{jp}. \tag{4.36} \]

Here, again we can lower free indices.

5. We have

\[
\mathbf{g}^i \overset{(3.23)}{=} \frac{1}{2}\, \varepsilon_{ijk}\, \frac{\mathbf{g}_j \times \mathbf{g}_k}{[\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3]}
\overset{(4.28)}{=} \frac{1}{2}\, \frac{\varepsilon_{ijk}}{\sqrt{g}}\, \mathbf{g}_j \times \mathbf{g}_k
\overset{(4.34)}{=} \frac{1}{2}\, e^{ijk}\, \mathbf{g}_j \times \mathbf{g}_k,
\]

which we can solve for $\mathbf{g}_i \times \mathbf{g}_j$ by contraction with $e_{ipq}$ as follows:

\[
\mathbf{g}^i = \frac{1}{2}\, e^{ijk}\, \mathbf{g}_j \times \mathbf{g}_k, \qquad \mathbf{g}_i \times \mathbf{g}_j = e_{ijk}\, \mathbf{g}^k. \tag{4.37}
\]

² The rule from Section 4.2.3, according to which free indices must be in the same position in all terms of an equation, applies only to equations which contain besides tensors only holonomic tensor coordinates, holonomic bases, and transformation coefficients. This is not the case for an equation with an $\varepsilon_{ijk}$.
³ See Section
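The relations (4.34) and (4.37) lend themselves to a numerical check. The following Python sketch assumes a right-handed oblique basis with $\sqrt{g} = 2$; the basis, the closed-form expression used for the permutation symbol, and all helper names are assumptions made for this illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2 (value +1, -1, or 0)."""
    return (i - j) * (j - k) * (k - i) // 2

def close(u, v):
    return all(abs(x - y) < 1e-12 for x, y in zip(u, v))

R3 = range(3)
g_lo = [(2.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]   # assumed right-handed basis
sqrt_g = dot(g_lo[0], cross(g_lo[1], g_lo[2]))                # [g_1, g_2, g_3] = sqrt(g)
g_up = [tuple(c / sqrt_g for c in cross(g_lo[1], g_lo[2])),   # reciprocal basis g^i
        tuple(c / sqrt_g for c in cross(g_lo[2], g_lo[0])),
        tuple(c / sqrt_g for c in cross(g_lo[0], g_lo[1]))]

# Holonomic coordinates of the epsilon tensor as triple products
e_lo = [[[dot(g_lo[i], cross(g_lo[j], g_lo[k])) for k in R3] for j in R3] for i in R3]
e_up = [[[dot(g_up[i], cross(g_up[j], g_up[k])) for k in R3] for j in R3] for i in R3]

# (4.34): e_ijk = sqrt(g) eps_ijk and e^ijk = eps_ijk / sqrt(g)
ok_434 = all(abs(e_lo[i][j][k] - sqrt_g * eps(i, j, k)) < 1e-12
             and abs(e_up[i][j][k] - eps(i, j, k) / sqrt_g) < 1e-12
             for i in R3 for j in R3 for k in R3)

# (4.37): g_i x g_j = e_ijk g^k
ok_437 = all(close(cross(g_lo[i], g_lo[j]),
                   tuple(sum(e_lo[i][j][k] * g_up[k][c] for k in R3) for c in R3))
             for i in R3 for j in R3)
```

Both flags evaluate to True for this basis; for a left-handed basis the sign convention in (4.28) would have to be reconsidered.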

Problem 4.8.
How many holonomic coordinates does the ε-tensor have? How many of them are not
zero in cylindrical coordinates? Find the nonzero holonomic cylindrical coordinates
of the ε-tensor.

Problem 4.9.
Complete the expression eijk em nk to obtain the Grassmann identity.

4.2.7 Isotropic Tensors

The δ-tensor, the ε-tensor, and combinations of both are, according to Section 2.8.3,
characterized by the property that their Cartesian coordinates are equal in all Carte-
sian coordinate systems. We called them isotropic tensors.
Clearly, holonomic curvilinear coordinates of isotropic tensors do not have this
property: the metric coefficients can have different values in different curvilinear co-
ordinate systems. However, since the metric coefficients depend only on the length
and the relative position of the basis vectors, which do not change under a rotation
of the basis (at the point under consideration), and since the holonomic curvilinear
coordinates of isotropic tensors depend only on the metric coefficients, the following
property holds. The holonomic curvilinear coordinates of isotropic tensors are invari-
ant under a rotation of the basis. (Note that the bases of all Cartesian coordinate sys-
tems at a point can be obtained from each other by a rotation or a rotary reflection.)
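The invariance of the metric coefficients under a rotation of the basis can be illustrated as follows. The basis, the rotation angle, and the stretched comparison basis in this Python sketch are assumed example values, and the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def metric(basis):
    """Covariant metric coefficients g_ij = g_i . g_j."""
    return [[dot(gi, gj) for gj in basis] for gi in basis]

def rotate_z(v, angle):
    """Rotate the vector v about the z-axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

g = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]   # assumed basis
g_rot = [rotate_z(v, 0.4) for v in g]                      # rigidly rotated basis
g_str = [(2.0 * v[0], v[1], v[2]) for v in g]              # stretched basis (not a rotation)

G = metric(g)
G_rot = metric(g_rot)
G_str = metric(g_str)
```

The rotated basis has the same metric coefficients as the original one, whereas the stretched basis does not.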

4.2.8 Tensor Algebra in Holonomic Coordinates

We can derive rules for computations with holonomic coordinates from the rules for
computations in symbolic notation or (e. g. when operations are only defined in co-
ordinate notation) from the rules for computations in Cartesian coordinates. Thus we
find as a third notation, in addition to the symbolic notation and the notation in Carte-
sian coordinates, the notation of an equation in holonomic curvilinear coordinates,
with analogous translation rules.

Equality, Addition, and Subtraction

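For example, from $\mathbf{a} = \mathbf{b}$ it follows that $a^i \mathbf{g}_i = b^i \mathbf{g}_i$ and also that $a_i \mathbf{g}^i = b_i \mathbf{g}^i$. Since the decomposition of two vectors $\mathbf{a}$ and $\mathbf{b}$ with respect to a basis $\mathbf{g}_i$ or $\mathbf{g}^i$ is unique, we can cancel out the basis and obtain the equations for coordinates

\[ a^i = b^i \quad \text{and} \quad a_i = b_i. \]

Analogously, for second-order tensors it follows from $\mathbf{a} = \mathbf{b}$ that

\[ a^{ij} = b^{ij}, \qquad a^i{}_j = b^i{}_j, \qquad a_i{}^j = b_i{}^j, \qquad a_{ij} = b_{ij}. \]

Thus two tensors are equal if we have in symbolic notation, in coordinate notation in Cartesian coordinates, and in coordinate notation in holonomic curvilinear coordinates

\[
\begin{array}{lll}
a = A, & a = A, & a = A, \\
\mathbf{a} = \mathbf{B}, & \hat{a}_i = \hat{B}_i, & a^i = B^i, \\
 & & a_i = B_i, \\
\mathbf{a} = \mathbf{C}, & \hat{a}_{ij} = \hat{C}_{ij}, & a^{ij} = C^{ij}, \\
 & & a^i{}_j = C^i{}_j, \\
 & & a_i{}^j = C_i{}^j, \\
 & & a_{ij} = C_{ij}.
\end{array}
\]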

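Analogously, from $\mathbf{a} \pm \mathbf{b} = \mathbf{c}$ it follows that

\[ a^i \pm b^i = c^i \quad \text{and} \quad a_i \pm b_i = c_i, \]

and, in general, for addition and subtraction of tensors in the three notations

\[
\begin{array}{lll}
a \pm b = A, & a \pm b = A, & a \pm b = A, \\
\mathbf{a} \pm \mathbf{b} = \mathbf{B}, & \hat{a}_i \pm \hat{b}_i = \hat{B}_i, & a^i \pm b^i = B^i, \\
 & & a_i \pm b_i = B_i, \\
\mathbf{a} \pm \mathbf{b} = \mathbf{C}, & \hat{a}_{ij} \pm \hat{b}_{ij} = \hat{C}_{ij}, & a^{ij} \pm b^{ij} = C^{ij}, \\
 & & a^i{}_j \pm b^i{}_j = C^i{}_j, \\
 & & a_i{}^j \pm b_i{}^j = C_i{}^j, \\
 & & a_{ij} \pm b_{ij} = C_{ij}.
\end{array}
\]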


Thus, we can write an equation whose terms are vectors in holonomic curvilinear coordinates in two equivalent versions: either in contravariant coordinates or in covariant coordinates; an equation whose terms are second-order tensors can be written in four equivalent versions. In general, we can write an equation whose terms are tensors of order n in $2^n$ equivalent versions (corresponding to the number of holonomic curvilinear coordinates).

Transposed, Symmetric, and Antimetric Tensors

1. If we assume that one of the two coordinate systems in the transformation equations (4.19) is a Cartesian coordinate system, we have

\[
\begin{aligned}
a^{ij} &= \frac{\partial u^i}{\partial x_m} \frac{\partial u^j}{\partial x_n}\, \hat{a}_{mn} = \frac{\partial u^j}{\partial x_n} \frac{\partial u^i}{\partial x_m}\, \hat{a}^T_{nm} = (a^T)^{ji}, \\
a^i{}_j &= \frac{\partial u^i}{\partial x_m} \frac{\partial x_n}{\partial u^j}\, \hat{a}_{mn} = \frac{\partial x_n}{\partial u^j} \frac{\partial u^i}{\partial x_m}\, \hat{a}^T_{nm} = (a^T)_j{}^i.
\end{aligned}
\]

The transpose of a tensor in Cartesian coordinates is given by (2.23), and with the equations above we have in holonomic curvilinear coordinates

\[
\begin{array}{ll}
\hat{a}^T_{ij} = \hat{a}_{ji}, & (a^T)^{ij} = a^{ji}, \\
 & (a^T)^i{}_j = a_j{}^i, \\
 & (a^T)_i{}^j = a^j{}_i, \\
 & (a^T)_{ij} = a_{ji},
\end{array}
\tag{4.39}
\]

i. e. we transpose a tensor by exchanging the two free indices of the holonomic curvi-
linear tensor coordinates while keeping their positions.

2. According to (2.26) and with (4.39), we have for the symmetric part of a second-order tensor

\[
\begin{array}{ll}
\hat{a}_{(ij)} = \frac{1}{2} (\hat{a}_{ij} + \hat{a}_{ji}), & a^{(ij)} = \frac{1}{2} (a^{ij} + a^{ji}), \\
 & a^{(i}{}_{j)} = \frac{1}{2} (a^i{}_j + a_j{}^i), \\
 & a_{(i}{}^{j)} = \frac{1}{2} (a_i{}^j + a^j{}_i), \\
 & a_{(ij)} = \frac{1}{2} (a_{ij} + a_{ji})
\end{array}
\]

and for the antimetric part of a second-order tensor

\[
\begin{array}{ll}
\hat{a}_{[ij]} = \frac{1}{2} (\hat{a}_{ij} - \hat{a}_{ji}), & a^{[ij]} = \frac{1}{2} (a^{ij} - a^{ji}), \\
 & a^{[i}{}_{j]} = \frac{1}{2} (a^i{}_j - a_j{}^i), \\
 & a_{[i}{}^{j]} = \frac{1}{2} (a_i{}^j - a^j{}_i), \\
 & a_{[ij]} = \frac{1}{2} (a_{ij} - a_{ji}).
\end{array}
\]

Tensor Multiplication

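For example, from $\mathbf{a}\, \mathbf{b} = \mathbf{c}$ it follows that

\[
a^{ij}\, \mathbf{g}_i \mathbf{g}_j\, b^k\, \mathbf{g}_k = a^{ij} b^k\, \mathbf{g}_i \mathbf{g}_j \mathbf{g}_k = c^{ijk}\, \mathbf{g}_i \mathbf{g}_j \mathbf{g}_k \quad \text{or} \quad a^{ij} b^k = c^{ijk}
\]

or

\[
a^i{}_j\, \mathbf{g}_i \mathbf{g}^j\, b^k\, \mathbf{g}_k = a^i{}_j b^k\, \mathbf{g}_i \mathbf{g}^j \mathbf{g}_k = c^i{}_j{}^k\, \mathbf{g}_i \mathbf{g}^j \mathbf{g}_k \quad \text{or} \quad a^i{}_j b^k = c^i{}_j{}^k.
\]

Six more equivalent versions are possible, because three indices i, j, and k can be arranged into $2^3 = 8$ different combinations of upper and lower indices. We note that in computations with these tensor products we can pull the coordinates in front of the bases (because they are numbers), but we cannot change the order of the bases, because tensor products of vectors are not commutative, i. e. $\mathbf{g}_i \mathbf{g}_j \neq \mathbf{g}_j \mathbf{g}_i$. Furthermore, to be able to cancel out bases, they must match not only in their order, but also in the positions of their indices, i. e. $\mathbf{g}_i \mathbf{g}_j \neq \mathbf{g}_i \mathbf{g}^j$. As a result of these two conditions, after canceling out the bases, the free indices in a tensor equation match in both their order and their positions, i. e.

\[
\begin{array}{lll}
a\, b = A, & a\, b = A, & a\, b = A, \\
a\, \mathbf{b} = \mathbf{B}, & a\, \hat{b}_i = \hat{B}_i, & a\, b^i = B^i, \\
 & & a\, b_i = B_i, \\
\mathbf{a}\, b = \mathbf{C}, & \hat{a}_i\, b = \hat{C}_i, & a^i\, b = C^i, \\
 & & a_i\, b = C_i, \\
\mathbf{a}\, \mathbf{b} = \mathbf{D}, & \hat{a}_i \hat{b}_j = \hat{D}_{ij}, & a^i b^j = D^{ij}, \\
 & & a^i b_j = D^i{}_j, \\
 & & a_i b^j = D_i{}^j, \\
 & & a_i b_j = D_{ij}.
\end{array}
\]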

Contraction and Its Special Cases

For example, from $\mathbf{a} \cdot \mathbf{b} = \mathbf{c}$ it follows analogously that

\[
a^{ij}\, \mathbf{g}_i \mathbf{g}_j \cdot b_k\, \mathbf{g}^k = a^{ij} b_k\, \mathbf{g}_i\, \mathbf{g}_j \cdot \mathbf{g}^k = a^{ij} b_k\, \mathbf{g}_i\, \delta_j^k = a^{ij} b_j\, \mathbf{g}_i = c^i\, \mathbf{g}_i
\]

or

\[ a^{ij} b_j = c^i. \]

The other three possible positions of the indices j and k are

\[
a^i{}_j b^j = c^i, \qquad a^{ij} \underbrace{b^k g_{jk}}_{b_j} = c^i, \qquad a^i{}_j \underbrace{b_k g^{jk}}_{b^j} = c^i.
\]

Since, according to (4.31), we have $b^k g_{jk} = b_j$ and $b_k g^{jk} = b^j$, we essentially have for a contravariant i the two notations

\[ a^{ij} b_j = c^i \quad \text{and} \quad a^i{}_j b^j = c^i. \]

Since the right sides and thus also the left sides are equal (unlike in the equivalent formulations for indices in different positions), we interpret both equations as different notations of the same version, i. e. of the contravariant version of the vector equation $\mathbf{a} \cdot \mathbf{b} = \mathbf{c}$. In addition we have the covariant version $a_i{}^j b_j = a_{ij} b^j = c_i$. We note that summation indices must be in different positions, but it does not matter which of the two indices is in upper position and which is in lower position:

\[
\begin{array}{lll}
\mathbf{a} \cdot \mathbf{b} = A, & \hat{a}_i \hat{b}_i = A, & a^i b_i = a_i b^i = A, \\
\mathbf{a} \cdot \mathbf{b} = \mathbf{B}, & \hat{a}_i \hat{b}_{ij} = \hat{B}_j, & a^i b_{ij} = a_i b^i{}_j = B_j, \\
 & & a^i b_i{}^j = a_i b^{ij} = B^j.
\end{array}
\]

If a contraction cannot be written as scalar product, we have analogously

\[
\begin{array}{ll}
\hat{a}_{ijk} \hat{b}_j = \hat{A}_{ik}, & a^{ijk} b_j = a^i{}_j{}^k b^j = A^{ik}, \\
 & a^{ij}{}_k b_j = a^i{}_{jk} b^j = A^i{}_k, \\
 & a_i{}^{jk} b_j = a_{ij}{}^k b^j = A_i{}^k, \\
 & a_i{}^j{}_k b_j = a_{ijk} b^j = A_{ik}.
\end{array}
\]

This can be proven using the transformation law (4.19).
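That the two notations of a contraction give the same result can also be seen numerically. The Python sketch below uses the metric coefficients of an assumed oblique basis with $\mathbf{g}_1 = (1, 0, 0)$, $\mathbf{g}_2 = (1, 1, 0)$, $\mathbf{g}_3 = (1, 1, 1)$ and hypothetical sample coordinates.

```python
R3 = range(3)

# Metric coefficients of the assumed basis g_1 = (1,0,0), g_2 = (1,1,0), g_3 = (1,1,1)
G_lo = [[1, 1, 1], [1, 2, 2], [1, 2, 3]]       # g_ij
G_up = [[2, -1, 0], [-1, 2, -1], [0, -1, 1]]   # g^ij

A_upup = [[1, 2, 0], [0, 1, -1], [3, 0, 2]]    # a^{ij}
b_lo = [1, -2, 1]                              # b_j

# Other index positions obtained with (4.31)
b_up = [sum(G_up[j][k] * b_lo[k] for k in R3) for j in R3]                       # b^j = g^jk b_k
A_mixed = [[sum(G_lo[j][n] * A_upup[i][n] for n in R3) for j in R3] for i in R3]  # a^i_j
A_lo_up = [[sum(G_lo[i][m] * A_upup[m][j] for m in R3) for j in R3] for i in R3]  # a_i^j

# Contravariant version of a . b = c in its two notations
c_up_1 = [sum(A_upup[i][j] * b_lo[j] for j in R3) for i in R3]    # a^{ij} b_j
c_up_2 = [sum(A_mixed[i][j] * b_up[j] for j in R3) for i in R3]   # a^i_j b^j

# Covariant version, and the same result obtained by lowering the index of c^i
c_lo_1 = [sum(A_lo_up[i][j] * b_lo[j] for j in R3) for i in R3]   # a_i^j b_j
c_lo_2 = [sum(G_lo[i][m] * c_up_1[m] for m in R3) for i in R3]    # c_i = g_im c^m
```

The position of the pair of summation indices does not affect the result, whereas the position of the free index distinguishes the contravariant from the covariant version.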

We abstain from writing out the analogous equations for multiple scalar products,
vector products, and triple products.

Summary

1. Tensor equations in holonomic curvilinear coordinates must satisfy the three rules from Section 4.2.3.

2. To translate a tensor-algebraic equation from Cartesian coordinates into holonomic curvilinear coordinates, every free index must be replaced by either a covariant index or a contravariant index (i. e. the same index in each term must be replaced by the same type of holonomic index) and we have to replace each pair of summation indices with two holonomic indices in different positions. This means that we have to replace $\delta_{ij}$ with either $g^{ij}$, $\delta^i_j$, $\delta_i^j$, or $g_{ij}$ and ε... with e...
To translate from curvilinear coordinates into Cartesian coordinates, we have to write all upper indices as lower indices, and we have to replace $g^{ij}$ and $g_{ij}$ with $\delta_{ij}$ and e... with ε... .

3. Furthermore, when translating from symbolic notation to holonomic curvilinear coordinates and vice versa, the order of the free indices must match in every term.

4. Thus the equations (4.31) on raising and lowering of indices are simply translations of identities such as $\hat{a}_i = \delta_{ij} \hat{a}_j$ into curvilinear coordinates, where the index of the vector coordinate is set in different positions on the left and on the right side.

Problem 4.10.
Translate (in one possible way) into holonomic curvilinear coordinates
A. the equations from Problem 2.5;
B. a × (b × c) = a ⋅ c b − a ⋅ b c.

4.3 Physical Bases and Tensor Coordinates

1. Consider a vector $\mathbf{a}$ and the two bases of a curvilinear coordinate system. Then we can always decompose $\mathbf{a}$ uniquely into its components with respect to the two bases $\mathbf{g}_i$ and $\mathbf{g}^i$, i. e.

\[
\mathbf{a} = a^1 \mathbf{g}_1 + a^2 \mathbf{g}_2 + a^3 \mathbf{g}_3 = a_1 \mathbf{g}^1 + a_2 \mathbf{g}^2 + a_3 \mathbf{g}^3.
\]

Since in general the basis vectors are not unit vectors, one part of the length of the components $a^{\underline{i}} \mathbf{g}_{\underline{i}}$ and $a_{\underline{i}} \mathbf{g}^{\underline{i}}$ is contained in the coordinates and the other part in the basis vectors. If the vector, and thus its components, represent a physical quantity, such as force, then one part of the physical dimension is in the coordinates and the other part is in the basis vectors. This is often inconvenient, so we define for each holonomic basis another basis, whose basis vectors have the same directions, but are unit vectors. We call this basis the corresponding physical basis and denote this basis and its corresponding coordinates by a star. Thus we have

\[
\mathbf{g}^*_i = \frac{\mathbf{g}_i}{\sqrt{g_{\underline{i}\underline{i}}}}, \qquad \mathbf{g}^{*i} = \frac{\mathbf{g}^i}{\sqrt{g^{\underline{i}\underline{i}}}}, \tag{4.46}
\]

and for a vector $\mathbf{a}$

\[
\mathbf{a} = a^i \mathbf{g}_i = a^{*i} \mathbf{g}^*_i = a_i \mathbf{g}^i = a^*_i \mathbf{g}^{*i}. \tag{4.47}
\]

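The so-defined physical coordinates $a^{*i}$ and $a^*_i$ clearly represent the lengths of the corresponding components, and they have the same physical dimension as the vector itself. We can generalize (4.47) analogously to higher-order tensors; substituting (4.46), we obtain for the relation between the physical coordinates and the holonomic coordinates

\[
\begin{aligned}
a^{*i} &= a^i \sqrt{g_{\underline{i}\underline{i}}}, &
a^*_i &= a_i \sqrt{g^{\underline{i}\underline{i}}}, \\
a^{*ij} &= a^{ij} \sqrt{g_{\underline{i}\underline{i}}} \sqrt{g_{\underline{j}\underline{j}}}, &
a^{*i}{}_j &= a^i{}_j \sqrt{g_{\underline{i}\underline{i}}} \sqrt{g^{\underline{j}\underline{j}}}, \\
a^*_i{}^j &= a_i{}^j \sqrt{g^{\underline{i}\underline{i}}} \sqrt{g_{\underline{j}\underline{j}}}, &
a^*_{ij} &= a_{ij} \sqrt{g^{\underline{i}\underline{i}}} \sqrt{g^{\underline{j}\underline{j}}}.
\end{aligned}
\tag{4.48}
\]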


2. That some of the indices in these equations are underlined, i. e. that the summation convention cannot be applied, already indicates that we pay a price for the dimensional equality of the physical coordinates. The powerful elegance of tensor calculus, which is expressed in the summation convention, exists only for holonomic coordinates. In practice, we therefore perform computations as long as possible in holonomic coordinates, and convert only the final result into physical coordinates, if necessary.
3. For example, in general the two physical bases $\mathbf{g}^*_i$ and $\mathbf{g}^{*i}$, corresponding to the reciprocal bases $\mathbf{g}_i$ and $\mathbf{g}^i$, are not reciprocal, i. e. $\mathbf{g}^*_i$ and $\mathbf{g}^{*i}$ do not satisfy the orthogonality relations (3.24) and (3.25).
However, for an important group of curvilinear coordinates, namely the orthogonal coordinates, this is still the case. Here, the homologous vectors of the two holonomic bases have the same direction, i. e. the corresponding physical bases coincide. We then write

\[ \mathbf{g}^*_i = \mathbf{g}^{*i} =: \mathbf{g}_{\langle i \rangle} \tag{4.49} \]

and, for example, for a vector

\[ \mathbf{a} = a_{\langle i \rangle}\, \mathbf{g}_{\langle i \rangle}. \tag{4.50} \]


Thus the physical bases corresponding to an orthogonal coordinate system are orthonormal at every point, but in general their direction changes from point to point. This explains why orthogonal physical coordinates are also called locally Cartesian coordinates.
This change of direction differentiates orthonormal bases from Cartesian bases. For tensor algebra it does not matter if tensors or bases depend on the position. Consequently, all tensor-algebraic equations have the same form in orthogonal physical coordinates and in Cartesian coordinates. To translate into orthogonal physical coordinates, we only have to replace the lower indices of the Cartesian coordinates by indices in angle brackets.

Problem 4.11.
A. Compute the Cartesian coordinates of the physical basis in cylindrical coordinates.
B. Compute the transformation equations between the Cartesian coordinates and the
physical cylindrical coordinates of a vector.
Hint: Using the results of Problems 4.2 and 4.4, the desired equations can be
found immediately or with only few intermediate steps.
C. What is the value of the physical cylindrical coordinates of the δ-tensor and of the
ε-tensor? (No computations necessary.)
D. Translate into orthogonal physical coordinates:
(a ⋅ b)T = c, αik bi j = cjk , a × (b × c) = a ⋅ c b − a ⋅ b c.

4.4 Tensor-Analytical Operations

Let us again consider tensor fields, i. e. we assume that bases and tensors and hence
also the tensor coordinates are functions of the three curvilinear coordinates ui of a
point in space.
We introduce an abbreviated notation for the partial derivative of a quantity G
with respect to the curvilinear coordinates ui .

\[
G_{,i} := \frac{\partial G}{\partial u^{i}} . \tag{4.51}
\]

Here G can be the position vector, a basis, a tensor, or a tensor coordinate. For the total
differential d G of a quantity G, we have the following expression.

d G = G,i d ui . (4.52)

The partial derivative of a quantity is, like the quantity itself, also a function of position.
The total differential of a quantity depends on the coordinates $u^i$ and also on their
differentials $\mathrm{d}u^i$. For small $\mathrm{d}u^i$, i. e. for nearby points with the position vectors
$\underline{x} + \mathrm{d}\underline{x}$ and $\underline{x}$, the differential of a tensor or of a tensor coordinate is, up to
second-order terms in the $\mathrm{d}u^i$, equal to the difference of the tensor (or the tensor
coordinate) between the two points, i. e.

d G = G(x + d x) − G(x). (4.53)
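
The approximation (4.53) can be checked numerically. The following is a minimal sketch, assuming a hypothetical scalar field `G` given as a function of cylindrical coordinates (the field, the evaluation point, and the step direction are illustrative choices, not taken from the text). It compares $G_{,i}\,\mathrm{d}u^i$ from (4.52) with the actual difference of the field between the two points.

```python
import math

def G(u):
    """Hypothetical scalar field as a function of u = (R, phi, z)."""
    R, phi, z = u
    return R**2 * math.sin(phi) + z * R

def grad_G(u):
    """Analytic partial derivatives G_{,i} of the field above."""
    R, phi, z = u
    return (2 * R * math.sin(phi) + z, R**2 * math.cos(phi), R)

def differential_error(u, du):
    """|G(u + du) - G(u) - G_{,i} du^i|, the remainder neglected in (4.53)."""
    dG = sum(g * d for g, d in zip(grad_G(u), du))
    difference = G([a + b for a, b in zip(u, du)]) - G(u)
    return abs(difference - dG)

u = (1.5, 0.7, 2.0)          # illustrative point
step = (0.3, -0.2, 0.1)      # illustrative direction of du
for h in (1e-1, 1e-2, 1e-3):
    du = tuple(h * s for s in step)
    print(h, differential_error(u, du))
```

Reducing the step by a factor of ten reduces the remainder by roughly a factor of one hundred, which is the second-order behaviour stated in the text.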

4.4.1 Partial Derivative and Differential of the Position Vector

Differentiating $\underline{x} = x_j\,\underline{e}_j$ with respect to $u^i$ gives, since the Cartesian basis is
independent of position, $\underline{x}_{,i} = (\partial x_j/\partial u^i)\,\underline{e}_j$, which, according to (4.5), is $\underline{g}_i$;

x,i = g i , (4.54)

i. e. the partial derivative of the position vector with respect to the curvilinear coordi-
nates gives the covariant basis. From (4.52) it further follows that d x = g i d ui , i. e.
the differentials of the curvilinear coordinates ui are the contravariant coordinates of
the vector d x. (The position vector x, as we know, is not a vector; the differential d x of
the position vector, however, is a vector.) This property of the coordinate differentials
explains why we write the index as an upper index. In Cartesian and in curvilinear
coordinates, we have

d x = d xi ei = d ui g i . (4.55)

4.4.2 Partial Derivative and Total Differential of the Holonomic Bases, Christoffel Symbols

1. Let us compute the partial derivatives of the two bases with respect to the curvilinear
coordinates ui and write the result with respect to the original basis.
For the covariant basis we get

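\[
\underline{g}_{i,j} \overset{(4.5)}{=} \frac{\partial^2 x_k}{\partial u^i\,\partial u^j}\,\underline{e}_k
\overset{(4.9)}{=} \frac{\partial^2 x_k}{\partial u^i\,\partial u^j}\,\frac{\partial u^m}{\partial x_k}\,\underline{g}_m .
\]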

Here, the expressions

\[
\Gamma^m_{ij} := \frac{\partial^2 x_k}{\partial u^i\,\partial u^j}\,\frac{\partial u^m}{\partial x_k} \tag{4.56}
\]

are called Christoffel symbols (of the second kind). Note that they are symmetric in the
lower indices.

\[
\Gamma^m_{ij} = \Gamma^m_{ji} . \tag{4.57}
\]

However, they are not tensor coordinates, since they do not satisfy the transformation
law for single contravariant and twice covariant tensor coordinates, as we will see in
Problem 4.13. If the ui are rectilinear, for example, Cartesian coordinates, the Christof-
fel symbols are zero.
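To derive the partial derivative of the contravariant basis, we differentiate
$\underline{g}^i \cdot \underline{g}_j = \delta^i_j$ with respect to $u^k$. We have

\[
\begin{aligned}
\underline{g}^i{}_{,k} \cdot \underline{g}_j + \underline{g}^i \cdot \underline{g}_{j,k} &= 0,\\
\underline{g}^i{}_{,k} \cdot \underline{g}_j &= -\Gamma^m_{jk}\,\underline{g}_m \cdot \underline{g}^i = -\Gamma^m_{jk}\,\delta^i_m = -\Gamma^i_{jk} .
\end{aligned}
\]

Tensor multiplication with $\underline{g}^j$ gives

\[
\underline{g}^i{}_{,k} \cdot \underline{g}_j\,\underline{g}^j = \underline{g}^i{}_{,k} \cdot \underline{\underline{\delta}} = \underline{g}^i{}_{,k} = -\Gamma^i_{jk}\,\underline{g}^j .
\]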

Thus, the partial derivatives of the bases with respect to the curvilinear coordinates are

\[
\underline{g}_{i,j} = \Gamma^m_{ij}\,\underline{g}_m , \qquad \underline{g}^i{}_{,j} = -\Gamma^i_{jm}\,\underline{g}^m . \tag{4.58}
\]

2. The total differential of a holonomic basis is given by (4.52) as

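\[
\begin{aligned}
\mathrm{d}\underline{g}_i &= \underline{g}_{i,j}\,\mathrm{d}u^j = \Gamma^m_{ij}\,\mathrm{d}u^j\,\underline{g}_m ,\\
\mathrm{d}\underline{g}^i &= \underline{g}^i{}_{,j}\,\mathrm{d}u^j = -\Gamma^i_{jm}\,\mathrm{d}u^j\,\underline{g}^m .
\end{aligned}
\]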

For small d ui , they are the difference between the corresponding basis vectors of two
neighboring points.
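
As a worked illustration of (4.58), consider cylindrical coordinates. The sketch below assumes the usual parametrization $x_1 = R\cos\varphi$, $x_2 = R\sin\varphi$, $x_3 = z$ with $(u^1, u^2, u^3) = (R, \varphi, z)$; this parametrization is an assumption made here for illustration. Differentiating the covariant basis and reading off the coefficients gives the nonzero Christoffel symbols directly.

```latex
\begin{aligned}
\underline{g}_1 &= \cos\varphi\,\underline{e}_1 + \sin\varphi\,\underline{e}_2, \qquad
\underline{g}_2 = -R\sin\varphi\,\underline{e}_1 + R\cos\varphi\,\underline{e}_2, \qquad
\underline{g}_3 = \underline{e}_3,\\
\underline{g}_{1,2} &= \underline{g}_{2,1} = -\sin\varphi\,\underline{e}_1 + \cos\varphi\,\underline{e}_2
 = \tfrac{1}{R}\,\underline{g}_2
 \;\Longrightarrow\; \Gamma^2_{12} = \Gamma^2_{21} = \tfrac{1}{R},\\
\underline{g}_{2,2} &= -R\cos\varphi\,\underline{e}_1 - R\sin\varphi\,\underline{e}_2
 = -R\,\underline{g}_1
 \;\Longrightarrow\; \Gamma^1_{22} = -R .
\end{aligned}
```

All other derivatives $\underline{g}_{i,j}$ vanish under this assumption, so all other Christoffel symbols are zero.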

4.4.3 Christoffel Symbols and Metric Coefficients

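An alternative to (4.56) is to express the Christoffel symbols as a function of the metric
coefficients.

To do this we differentiate $\underline{g}_i = g_{ij}\,\underline{g}^j$ with respect to $u^k$. Then we obtain

\[
\begin{aligned}
\underline{g}_{i,k} &= g_{ij,k}\,\underline{g}^j + g_{ij}\,\underline{g}^j{}_{,k} ,\\
\Gamma^m_{ik}\,\underline{g}_m &= g_{ij,k}\,\underline{g}^j - g_{ij}\,\Gamma^j_{km}\,\underline{g}^m .
\end{aligned}
\]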


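Scalar multiplication by $\underline{g}^n$ gives

\[
\Gamma^m_{ik}\,\underline{g}_m \cdot \underline{g}^n = g_{ij,k}\,\underline{g}^j \cdot \underline{g}^n - g_{ij}\,\Gamma^j_{km}\,\underline{g}^m \cdot \underline{g}^n ,
\]

and with (4.21) we obtain

\[
\Gamma^n_{ik} = g_{ij,k}\,g^{jn} - g_{ij}\,g^{mn}\,\Gamma^j_{km} .
\]

Multiplying by $g_{np}$ gives

\[
g_{np}\,\Gamma^n_{ik} + g_{ij}\,g^{mn}\,g_{np}\,\Gamma^j_{km} = g_{ij,k}\,g^{jn}\,g_{np}
\]

and (4.26) yields

\[
g_{pj}\,\Gamma^j_{ik} + g_{ij}\,\Gamma^j_{kp} = g_{ip,k} .
\]

Cyclic permutations of the free indices i, p, and k give

\[
\begin{aligned}
g_{kj}\,\Gamma^j_{pi} + g_{pj}\,\Gamma^j_{ik} &= g_{pk,i} ,\\
g_{ij}\,\Gamma^j_{kp} + g_{kj}\,\Gamma^j_{pi} &= g_{ki,p} .
\end{aligned}
\]

If we multiply the first of the last three equations by $-\tfrac{1}{2}$, the other two by $+\tfrac{1}{2}$, and add
all three equations, we get

\[
g_{kj}\,\Gamma^j_{ip} = \tfrac{1}{2}\,(g_{pk,i} + g_{ki,p} - g_{ip,k}) .
\]

Multiplication by $g^{km}$ finally gives (independently of the underlying Cartesian coordinate
system)

\[
\Gamma^m_{ip} = \tfrac{1}{2}\,g^{mk}\,(g_{ki,p} + g_{kp,i} - g_{ip,k}) . \tag{4.60}
\]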
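
Formula (4.60) lends itself to a direct numerical evaluation. The following is a minimal sketch, assuming the cylindrical metric $g_{11} = 1$, $g_{22} = R^2$, $g_{33} = 1$ and approximating the derivatives $g_{ij,k}$ by central differences; the function names `metric`, `inverse3`, `metric_derivative` and `christoffel` are hypothetical helpers introduced only for this illustration.

```python
def metric(u):
    """Covariant metric g_ij for cylindrical coordinates u = (R, phi, z) (assumed)."""
    R = u[0]
    return [[1.0, 0.0, 0.0], [0.0, R * R, 0.0], [0.0, 0.0, 1.0]]

def inverse3(a):
    """Inverse of a 3x3 matrix via cofactors (gives the contravariant metric g^ij)."""
    c = [[a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
          - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3]
          for j in range(3)] for i in range(3)]
    det = sum(a[0][j] * c[0][j] for j in range(3))
    return [[c[j][i] / det for j in range(3)] for i in range(3)]

def metric_derivative(u, k, h=1e-5):
    """Central-difference approximation of g_ij,k at the point u."""
    up, um = list(u), list(u)
    up[k] += h
    um[k] -= h
    gp, gm = metric(up), metric(um)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]

def christoffel(u):
    """Gamma[m][i][p] = 1/2 g^{mk} (g_ki,p + g_kp,i - g_ip,k), cf. (4.60)."""
    ginv = inverse3(metric(u))
    dg = [metric_derivative(u, k) for k in range(3)]  # dg[k][i][j] = g_ij,k
    return [[[0.5 * sum(ginv[m][k] * (dg[p][k][i] + dg[i][k][p] - dg[k][i][p])
                        for k in range(3))
              for p in range(3)] for i in range(3)] for m in range(3)]

gamma = christoffel((2.0, 0.3, 1.0))
print(gamma[0][1][1], gamma[1][0][1], gamma[1][1][0])
```

With zero-based list indices, `gamma[0][1][1]` corresponds to $\Gamma^1_{22}$ and `gamma[1][0][1]` to $\Gamma^2_{12}$. Only the metric function needs to be exchanged to treat another coordinate system.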

Problem 4.12.
Compute the Christoffel symbols in cylindrical coordinates.

Problem 4.13.
Derive the transformation law for Christoffel symbols when changing between two
curvilinear coordinate systems.


4.4.4 Partial Derivatives of Tensors. Partial and Covariant Derivatives of Tensor Coordinates

1. Let us now compute the partial derivatives of tensors with respect to the curvilinear
coordinates ui and write them with respect to the original basis or bases. We obtain

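\[
\begin{aligned}
\underline{a}_{,k} &= (a^i\,\underline{g}_i)_{,k} = a^i{}_{,k}\,\underline{g}_i + a^i\,\underline{g}_{i,k}
 = a^i{}_{,k}\,\underline{g}_i + a^i\,\Gamma^m_{ik}\,\underline{g}_m = (a^i{}_{,k} + \Gamma^i_{mk}\,a^m)\,\underline{g}_i\\
&= (a_i\,\underline{g}^i)_{,k} = a_{i,k}\,\underline{g}^i + a_i\,\underline{g}^i{}_{,k}
 = a_{i,k}\,\underline{g}^i - a_i\,\Gamma^i_{km}\,\underline{g}^m = (a_{i,k} - \Gamma^m_{ik}\,a_m)\,\underline{g}^i ,\\
\underline{\underline{a}}_{,k} &= (a^{ij}\,\underline{g}_i\,\underline{g}_j)_{,k}
 = a^{ij}{}_{,k}\,\underline{g}_i\,\underline{g}_j + a^{ij}\,\underline{g}_{i,k}\,\underline{g}_j + a^{ij}\,\underline{g}_i\,\underline{g}_{j,k}\\
&= a^{ij}{}_{,k}\,\underline{g}_i\,\underline{g}_j + a^{ij}\,\Gamma^m_{ik}\,\underline{g}_m\,\underline{g}_j + a^{ij}\,\underline{g}_i\,\Gamma^m_{jk}\,\underline{g}_m\\
&= (a^{ij}{}_{,k} + \Gamma^i_{mk}\,a^{mj} + \Gamma^j_{mk}\,a^{im})\,\underline{g}_i\,\underline{g}_j\\
&= (a^i{}_j\,\underline{g}_i\,\underline{g}^j)_{,k} = \cdots ,
\end{aligned}
\]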

The coefficients of the original bases in these equations are called covariant deriva-
tives or absolute derivatives of the curvilinear tensor coordinates and they are denoted
by a vertical line; i. e. we have

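\[
\begin{aligned}
a|_k &:= a_{,k} ,\\
a^i|_k &:= a^i{}_{,k} + \Gamma^i_{mk}\,a^m ,\\
a_i|_k &:= a_{i,k} - \Gamma^m_{ik}\,a_m ,\\
a^{ij}|_k &:= a^{ij}{}_{,k} + \Gamma^i_{mk}\,a^{mj} + \Gamma^j_{mk}\,a^{im} ,\\
a^i{}_j|_k &:= a^i{}_{j,k} + \Gamma^i_{mk}\,a^m{}_j - \Gamma^m_{jk}\,a^i{}_m ,\\
a_i{}^j|_k &:= a_i{}^j{}_{,k} - \Gamma^m_{ik}\,a_m{}^j + \Gamma^j_{mk}\,a_i{}^m ,\\
a_{ij}|_k &:= a_{ij,k} - \Gamma^m_{ik}\,a_{mj} - \Gamma^m_{jk}\,a_{im} ,
\end{aligned}
\tag{4.61}
\]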


Using these covariant derivatives of tensor coordinates, we can write the partial deriva-
tives of tensors as follows.

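\[
\begin{aligned}
a_{,k} &= a|_k ,\\
\underline{a}_{,k} &= a^i|_k\,\underline{g}_i = a_i|_k\,\underline{g}^i ,\\
\underline{\underline{a}}_{,k} &= a^{ij}|_k\,\underline{g}_i\,\underline{g}_j = a^i{}_j|_k\,\underline{g}_i\,\underline{g}^j
 = a_i{}^j|_k\,\underline{g}^i\,\underline{g}_j = a_{ij}|_k\,\underline{g}^i\,\underline{g}^j ,
\end{aligned}
\tag{4.62}
\]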



2. The partial derivatives of tensors clearly depend on the coordinate system. Using
the chain rule, we obtain the transformation law for derivatives with respect to the
curvilinear coordinates

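\[
\frac{\partial \underline{\underline{A}}}{\partial \tilde{u}^i} = \frac{\partial u^m}{\partial \tilde{u}^i}\,\frac{\partial \underline{\underline{A}}}{\partial u^m} , \tag{4.63}
\]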
i. e. the index of the partial derivative of a tensor transforms like a covariant index.
Since, on both sides of the equations (4.62), the index k must transform similarly,
the covariant derivative of the holonomic coordinates of a tensor of n-th order gives the
holonomic coordinates of a tensor of order n + 1, i. e. the covariant derivative adds an-
other covariant index. On the other hand, the partial derivative of a tensor coordinate
with respect to the curvilinear coordinates is not a tensor coordinate.
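
The difference between the partial and the covariant derivative of a tensor coordinate can be made concrete with a vector field that is constant in space. The following is a minimal sketch, assuming cylindrical coordinates $(R, \varphi, z)$ with the nonzero Christoffel symbols $\Gamma^1_{22} = -R$, $\Gamma^2_{12} = \Gamma^2_{21} = 1/R$, and the constant field $\underline{a} = \underline{e}_1$, whose contravariant cylindrical coordinates are $a^1 = \cos\varphi$, $a^2 = -\sin\varphi/R$, $a^3 = 0$. The helper names are hypothetical, and partial derivatives are approximated by central differences.

```python
import math

def christoffel_cyl(u):
    """G[i][m][k] = Gamma^i_{mk} for cylindrical coordinates (assumed values)."""
    R = u[0]
    G = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    G[0][1][1] = -R
    G[1][0][1] = G[1][1][0] = 1.0 / R
    return G

def a_contra(u):
    """Contravariant coordinates a^i of the constant Cartesian vector e_1."""
    R, phi, _ = u
    return [math.cos(phi), -math.sin(phi) / R, 0.0]

def partial(f, u, k, h=1e-6):
    """Central-difference approximation of the partial derivatives f^i_{,k}."""
    up, um = list(u), list(u)
    up[k] += h
    um[k] -= h
    return [(p - m) / (2 * h) for p, m in zip(f(up), f(um))]

def covariant_derivative(f, u):
    """Returns c[i][k] = a^i|_k = a^i_{,k} + Gamma^i_{mk} a^m, cf. (4.61)."""
    G, a = christoffel_cyl(u), f(u)
    d = [partial(f, u, k) for k in range(3)]  # d[k][i] = a^i_{,k}
    return [[d[k][i] + sum(G[i][m][k] * a[m] for m in range(3))
             for k in range(3)] for i in range(3)]

u0 = (1.7, 0.6, -0.3)                      # illustrative point
print(partial(a_contra, u0, 1))            # partial derivatives w.r.t. phi: nonzero
print(covariant_derivative(a_contra, u0))  # covariant derivatives: zero up to rounding
```

The coordinates $a^i$ vary from point to point only because the basis does; the Christoffel terms in (4.61) remove exactly this contribution.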

4.4.5 The Total Differential of Tensors. The Total and the Absolute Differential
of Tensor Coordinates

1. The total differential of a tensor is given by (4.52) as

d A = A,i d ui . (4.64)

As we would expect, it is independent of the coordinates; the index of the partial
derivative of a tensor is covariant, the index of the coordinate differentials is contravariant.

2. For tensor coordinates we must distinguish two different differentials (similar to
two different derivatives). The total differential is, according to (4.52),

\[
\mathrm{d}a^{i\ldots j}_{m\ldots n} = a^{i\ldots j}_{m\ldots n,k}\,\mathrm{d}u^k . \tag{4.65}
\]

This represents, up to second-order terms in $\mathrm{d}u^k$, the increase $\mathrm{d}a^{i\ldots j}_{m\ldots n}$ of the tensor
coordinates between two neighboring points. Since the partial derivative of a tensor
coordinate is not a tensor coordinate, also the total differential of a tensor coordinate
is not a tensor coordinate.
The expression analogous to (4.65), but with the covariant derivative of the tensor
coordinates in place of the partial derivative, is called the absolute differential, to
distinguish it from the total differential, and it is denoted by $\delta a^{i\ldots j}_{m\ldots n}$.

\[
\delta a^{i\ldots j}_{m\ldots n} = a^{i\ldots j}_{m\ldots n}|_k\,\mathrm{d}u^k . \tag{4.66}
\]

The absolute differential of a holonomic tensor coordinate is, just like the covariant
derivative, also a holonomic tensor coordinate, and it is a tensor coordinate of the
same order and type as the original coordinate.


3. The total differential of, for example, a vector, is

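\[
\mathrm{d}\underline{a} \overset{(4.64)}{=} \underline{a}_{,k}\,\mathrm{d}u^k \overset{(4.62)}{=} a^i|_k\,\underline{g}_i\,\mathrm{d}u^k \overset{(4.66)}{=} \delta a^i\,\underline{g}_i .
\]

In general, we have

\[
\begin{aligned}
\mathrm{d}a &= \delta a ,\\
\mathrm{d}\underline{a} &= \delta a^i\,\underline{g}_i = \delta a_i\,\underline{g}^i ,\\
\mathrm{d}\underline{\underline{a}} &= \delta a^{ij}\,\underline{g}_i\,\underline{g}_j = \delta a^i{}_j\,\underline{g}_i\,\underline{g}^j
 = \delta a_i{}^j\,\underline{g}^i\,\underline{g}_j = \delta a_{ij}\,\underline{g}^i\,\underline{g}^j ,
\end{aligned}
\tag{4.67}
\]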


Thus the absolute differentials of tensor coordinates are the coordinates of the total
differential of a tensor with respect to the holonomic bases. They are zero if and only if
the total differential of the tensor is zero, i. e. if the tensor is equal at the two neigh-
boring points. In particular, for a vector, this means geometrically that the two vectors
can be obtained from each other by a parallel translation.
We translate differentials of tensor coordinates from Cartesian coordinates into
holonomic curvilinear coordinates, according to (4.67), as follows.

We replace the total differential of a Cartesian tensor coordinate by the absolute
differential of a holonomic curvilinear tensor coordinate, e. g.

\[
\mathrm{d}\underline{a} = \mathrm{d}\hat{a}_i\,\underline{e}_i = \delta a^i\,\underline{g}_i = \delta a_i\,\underline{g}^i .
\]

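4. The following computation shows the relation between the total differentials $\mathrm{d}\underline{a}$,
$\mathrm{d}a^i$, the differentials $\mathrm{d}u^i$, and the absolute differentials $\delta a^i$, using a vector as an
example.

\[
\mathrm{d}\underline{a} = \mathrm{d}(a^i\,\underline{g}_i)
 = \underbrace{\mathrm{d}a^i}_{a^i{}_{,k}\,\mathrm{d}u^k}\,\underline{g}_i
 + a^i \underbrace{\mathrm{d}\underline{g}_i}_{\Gamma^j_{ik}\,\underline{g}_j\,\mathrm{d}u^k}
 = \underbrace{(a^i{}_{,k} + \Gamma^i_{jk}\,a^j)}_{a^i|_k}\,\underline{g}_i\,\mathrm{d}u^k ,
\]

\[
\mathrm{d}\underline{a} = \underbrace{\mathrm{d}u^k\,\overbrace{(a^i{}_{,k} + \Gamma^i_{jk}\,a^j)}^{a^i|_k}}_{\delta a^i}\,\underline{g}_i . \tag{4.68}
\]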


4.4.6 Derivatives with Respect to a Parameter

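If the curvilinear coordinates $u^i$ depend on a parameter $t$, then the derivatives of a
tensor coordinate with respect to this parameter are given by (4.65) and (4.66) as

\[
\begin{aligned}
\frac{\mathrm{d}a^{i\ldots j}_{m\ldots n}}{\mathrm{d}t} &= a^{i\ldots j}_{m\ldots n,k}\,\frac{\mathrm{d}u^k}{\mathrm{d}t} ,\\
\frac{\delta a^{i\ldots j}_{m\ldots n}}{\mathrm{d}t} &= a^{i\ldots j}_{m\ldots n}|_k\,\frac{\mathrm{d}u^k}{\mathrm{d}t} .
\end{aligned}
\]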
Only the second of these two derivatives with respect to a parameter is again a tensor
coordinate, and it is a tensor coordinate of the same order and type as the original
tensor coordinate.
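From (4.67), we analogously obtain

\[
\begin{aligned}
\frac{\mathrm{d}\underline{a}}{\mathrm{d}t} &= \frac{\delta a^i}{\mathrm{d}t}\,\underline{g}_i = \frac{\delta a_i}{\mathrm{d}t}\,\underline{g}^i ,\\
\frac{\mathrm{d}\underline{\underline{a}}}{\mathrm{d}t} &= \frac{\delta a^{ij}}{\mathrm{d}t}\,\underline{g}_i\,\underline{g}_j = \frac{\delta a^i{}_j}{\mathrm{d}t}\,\underline{g}_i\,\underline{g}^j
 = \frac{\delta a_i{}^j}{\mathrm{d}t}\,\underline{g}^i\,\underline{g}_j = \frac{\delta a_{ij}}{\mathrm{d}t}\,\underline{g}^i\,\underline{g}^j ,
\end{aligned}
\tag{4.70}
\]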

4.4.7 Gradient

1. The partial derivative (4.62) of a tensor is not independent of the coordinates; it
has a covariant index. Thus it is not a tensor. However, we can make it a tensor, if
we multiply it by the corresponding contravariant basis. Depending on whether we
multiply it from the left or from the right, we obtain the left or the right gradient.
Using the chain rule, we obtain for the del operator4 (2.47)
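\[
\underline{\nabla} := \frac{\partial}{\partial x_k}\,\underline{e}_k = \frac{\partial u^i}{\partial x_k}\,\frac{\partial}{\partial u^i}\,\underline{e}_k \overset{(4.7)}{=} \frac{\partial}{\partial u^i}\,\underline{g}^i ,
\]

\[
\underline{\nabla} = \frac{\partial}{\partial u^i}\,\underline{g}^i .
\]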
(The partial derivative of a tensor with respect to the curvilinear coordinates trans-
forms, according to (4.63), like a covariant vector coordinate.)
Thus, according to (2.48), we have for the (right) gradient

grad A = A ∇ = A,k g k (4.72)

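or, using (4.62), written out for tensors of different order, we obtain the following
formulas.

\[
\begin{aligned}
\operatorname{grad} a &= a|_k\,\underline{g}^k ,\\
\operatorname{grad} \underline{a} &= a^i|_k\,\underline{g}_i\,\underline{g}^k = a_i|_k\,\underline{g}^i\,\underline{g}^k ,\\
\operatorname{grad} \underline{\underline{a}} &= a^{ij}|_k\,\underline{g}_i\,\underline{g}_j\,\underline{g}^k = a^i{}_j|_k\,\underline{g}_i\,\underline{g}^j\,\underline{g}^k\\
&= a_i{}^j|_k\,\underline{g}^i\,\underline{g}_j\,\underline{g}^k = a_{ij}|_k\,\underline{g}^i\,\underline{g}^j\,\underline{g}^k ,
\end{aligned}
\tag{4.73}
\]

4 Here e. g. $a\,\underline{\nabla} = (\partial a/\partial u^i)\,\underline{g}^i = a_{,i}\,\underline{g}^i$; i. e. we do not compute the partial derivative of $\underline{g}^i$!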


In other words, the covariant derivatives of the coordinates of a tensor are the coordi-
nates of the gradient of this tensor, and we translate partial derivatives into holonomic
curvilinear coordinates as follows.

We replace the partial derivative of a Cartesian tensor coordinate with the covariant
derivative of a holonomic curvilinear tensor coordinate.

2. Since the δ-tensor and the ε-tensor are independent of position, their gradients
must be zero and thus, according to (4.73), the covariant derivatives of their coordi-
nates must be zero, i. e.

\[
\begin{aligned}
g^{ij}|_k &= g_{ij}|_k = 0 ,\\
e^{ijk}|_m &= e^{ij}{}_k|_m = \cdots = e_{ijk}|_m = 0 .
\end{aligned}
\]

(The first of these two equations is also called the Ricci theorem.) This implies that,
in general, the partial derivatives of the coordinates of the δ-tensor and the ε-tensor
are nonzero. We can compute them from the corresponding covariant derivative and
the definition (4.61). For the partial derivatives of the δ-tensor we again obtain the
relations (4.60), which we derived earlier.

4.4.8 Divergence and Curl

1. The (right) divergence of a tensor follows from (2.56) as

div A = A ⋅ ∇ = A,k ⋅ g k . (4.75)

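Using (4.62), we have e. g. $\operatorname{div}\underline{a} = a^i|_k\,\underline{g}_i \cdot \underline{g}^k = a^i|_k\,\delta^k_i = a^k|_k$, and for tensors of
various orders we write the following expressions.

\[
\begin{aligned}
\operatorname{div}\underline{a} &= a^k|_k ,\\
\operatorname{div}\underline{\underline{a}} &= a^{ik}|_k\,\underline{g}_i = a_i{}^k|_k\,\underline{g}^i ,
\end{aligned}
\tag{4.76}
\]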


2. Analogously, we have for the (right) curl of a tensor, according to (2.59),

curl A = A ⊗ ∇ = −A × ∇ = −A,k × g k . (4.77)

With (4.62), we have e. g. curl a = −ai |k g i × g k = −ai |k ei kj g j , and for tensors of
various orders we obtain the following formulas.

curl a = ai |k ei jk g j = ai |k eijk g j = ai |k eij k g j = ai |k ei j k g j ,

curl a = ami |k ei jk g m g j = am i |k eijk g m g j = ⋅ ⋅ ⋅ , (4.78)


3. If we substitute (4.61) for the covariant derivative $a_i|_k$, we see, because of the
antisymmetry of the ε-tensor and the symmetry of the Christoffel symbols with respect to the
two lower indices, that we can write for the curl of a vector

\[
\operatorname{curl}\underline{a} = a_{i,k}\,e^{ijk}\,\underline{g}_j = a_{i,k}\,e^{i}{}_{j}{}^{k}\,\underline{g}^j . \tag{4.79}
\]

Although these equations contain partial derivatives, which are not tensor coordi-
nates, they are very convenient for computing the curl.
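
The step behind (4.79) can be written out explicitly. The short derivation below uses only the definition (4.61) of $a_i|_k$, the symmetry (4.57), and the antisymmetry of $e^{ijk}$; the relabeling of summation indices in the second line is the only manipulation added here.

```latex
\begin{aligned}
a_i|_k\,e^{ijk} &= (a_{i,k} - \Gamma^m_{ik}\,a_m)\,e^{ijk}
                 = a_{i,k}\,e^{ijk} - \Gamma^m_{ik}\,e^{ijk}\,a_m ,\\
\Gamma^m_{ik}\,e^{ijk} &= \Gamma^m_{ki}\,e^{kji}
                        = -\,\Gamma^m_{ik}\,e^{ijk}
 \quad\Longrightarrow\quad \Gamma^m_{ik}\,e^{ijk} = 0 ,\\
a_i|_k\,e^{ijk} &= a_{i,k}\,e^{ijk} .
\end{aligned}
```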

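Problem 4.14.
The following equations are given either in symbolic notation, in Cartesian coordinates,
or in holonomic curvilinear coordinates. Translate each of them into the other
two notations.

\[
\text{A. } \underline{a} \times \operatorname{curl}\underline{a} = \underline{b}, \qquad
\text{B. } \frac{\partial \hat{a}_i}{\partial x_j} + \frac{\partial \hat{a}_j}{\partial x_i} = \hat{b}_{ij},
\]

\[
\text{C. } a^i\,b_j|_i = c_j, \qquad
\text{D. } e^{i}{}_{jk}\,e_{imn}\,a^j\,b^k\,c^m\,d^n = f .
\]

Simplify the last expression, using the Grassmann identity, before translating.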

4.4.9 Physical Coordinates of Tensor-Analytical Operations

In the course of our presentation we arrived naturally at the holonomic coordinates of

the tensors grad a, div a, curl a, etc. When the coordinates of these tensors are men-
tioned in physics books, e. g. in cylindrical or spherical coordinates, what is usually
meant are the physical coordinates. It is not very difficult to compute the physical co-
ordinates from arbitrary holonomic coordinates, using the transformation equations
between holonomic and physical tensor coordinates.
Because these conversions are often used, we compute them here step-by-step for
one example, i. e. for the divergence of a second-order tensor in cylindrical coordi-
nates. In holonomic coordinates, we have

\[
(\operatorname{div}\underline{\underline{a}})^i = a^{im}|_m .
\]


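Substituting successively 1, 2, and 3 for the free index and summing over the summation
index, we obtain

\[
\begin{aligned}
(\operatorname{div}\underline{\underline{a}})^1 &= a^{11}|_1 + a^{12}|_2 + a^{13}|_3 ,\\
(\operatorname{div}\underline{\underline{a}})^2 &= a^{21}|_1 + a^{22}|_2 + a^{23}|_3 ,\\
(\operatorname{div}\underline{\underline{a}})^3 &= a^{31}|_1 + a^{32}|_2 + a^{33}|_3 .
\end{aligned}
\]

Now we compute the covariant derivatives of the tensor coordinates in these equations.
Using (4.61), we have

\[
a^{ij}|_j = a^{ij}{}_{,j} + \Gamma^i_{mj}\,a^{mj} + \Gamma^j_{mj}\,a^{im} .
\]

Taking into account that for cylindrical coordinates only $\Gamma^1_{22}$, $\Gamma^2_{12}$, and $\Gamma^2_{21}$ are nonzero,
we have

\[
\begin{aligned}
a^{11}|_1 &= a^{11}{}_{,1} ,\\
a^{12}|_2 &= a^{12}{}_{,2} + \Gamma^1_{22}\,a^{22} + \Gamma^2_{12}\,a^{11} ,\\
a^{13}|_3 &= a^{13}{}_{,3} ,\\
a^{21}|_1 &= a^{21}{}_{,1} + \Gamma^2_{21}\,a^{21} ,\\
a^{22}|_2 &= a^{22}{}_{,2} + \Gamma^2_{12}\,a^{12} + \Gamma^2_{12}\,a^{21} ,\\
a^{23}|_3 &= a^{23}{}_{,3} ,\\
a^{31}|_1 &= a^{31}{}_{,1} ,\\
a^{32}|_2 &= a^{32}{}_{,2} + \Gamma^2_{12}\,a^{31} ,\\
a^{33}|_3 &= a^{33}{}_{,3} .
\end{aligned}
\]

Next we substitute these covariant derivatives into the equations for the contravariant
coordinates of the divergence; we write out the partial derivatives and we substitute
for the Christoffel symbols their values $\Gamma^1_{22} = -R$, $\Gamma^2_{12} = \Gamma^2_{21} = 1/R$, so we obtain

\[
\begin{aligned}
(\operatorname{div}\underline{\underline{a}})^1 &= \frac{\partial a^{11}}{\partial R} + \frac{\partial a^{12}}{\partial \varphi} - R\,a^{22} + \frac{1}{R}\,a^{11} + \frac{\partial a^{13}}{\partial z} ,\\
(\operatorname{div}\underline{\underline{a}})^2 &= \frac{\partial a^{21}}{\partial R} + \frac{1}{R}\,a^{21} + \frac{\partial a^{22}}{\partial \varphi} + \frac{1}{R}\,a^{12} + \frac{1}{R}\,a^{21} + \frac{\partial a^{23}}{\partial z} ,\\
(\operatorname{div}\underline{\underline{a}})^3 &= \frac{\partial a^{31}}{\partial R} + \frac{\partial a^{32}}{\partial \varphi} + \frac{1}{R}\,a^{31} + \frac{\partial a^{33}}{\partial z} .
\end{aligned}
\]

We thus computed the divergence of a tensor in (one type of) holonomic cylindrical
coordinates, and it only remains to convert the left and the right side, using (4.48),
into physical cylindrical coordinates. We compute

\[
g_{11} = g^{11} = 1, \qquad g_{22} = R^2, \qquad g^{22} = \frac{1}{R^2}, \qquad g_{33} = g^{33} = 1 .
\]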


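Using (4.48), we find that $a_R = a^1$, $a_\varphi = R\,a^2$, $a_z = a^3$, and

\[
\begin{aligned}
a^{11} &= a_{RR}, & a^{12} &= \frac{1}{R}\,a_{R\varphi}, & a^{13} &= a_{Rz}, & a^{21} &= \frac{1}{R}\,a_{\varphi R},\\
a^{22} &= \frac{1}{R^2}\,a_{\varphi\varphi}, & a^{23} &= \frac{1}{R}\,a_{\varphi z}, & a^{31} &= a_{zR}, & a^{32} &= \frac{1}{R}\,a_{z\varphi}, \qquad a^{33} = a_{zz},
\end{aligned}
\]

and thus we have

\[
\begin{aligned}
(\operatorname{div}\underline{\underline{a}})_R &= (\operatorname{div}\underline{\underline{a}})^1
 = \frac{\partial a_{RR}}{\partial R} + \frac{\partial}{\partial \varphi}\Big(\frac{a_{R\varphi}}{R}\Big) - R\,\frac{a_{\varphi\varphi}}{R^2} + \frac{1}{R}\,a_{RR} + \frac{\partial a_{Rz}}{\partial z} ,\\
(\operatorname{div}\underline{\underline{a}})_\varphi &= R\,(\operatorname{div}\underline{\underline{a}})^2
 = R\,\Big( \frac{\partial}{\partial R}\Big(\frac{a_{\varphi R}}{R}\Big) + \frac{1}{R}\,\frac{a_{\varphi R}}{R} + \frac{\partial}{\partial \varphi}\Big(\frac{a_{\varphi\varphi}}{R^2}\Big)\\
&\qquad\qquad + \frac{1}{R}\,\frac{a_{R\varphi}}{R} + \frac{1}{R}\,\frac{a_{\varphi R}}{R} + \frac{\partial}{\partial z}\Big(\frac{a_{\varphi z}}{R}\Big) \Big) ,\\
(\operatorname{div}\underline{\underline{a}})_z &= (\operatorname{div}\underline{\underline{a}})^3
 = \frac{\partial a_{zR}}{\partial R} + \frac{\partial}{\partial \varphi}\Big(\frac{a_{z\varphi}}{R}\Big) + \frac{a_{zR}}{R} + \frac{\partial a_{zz}}{\partial z} ,
\end{aligned}
\]

or, reordered,

\[
\begin{aligned}
(\operatorname{div}\underline{\underline{a}})_R &= \frac{\partial a_{RR}}{\partial R} + \frac{1}{R}\,\frac{\partial a_{R\varphi}}{\partial \varphi} + \frac{\partial a_{Rz}}{\partial z} + \frac{a_{RR} - a_{\varphi\varphi}}{R} ,\\
(\operatorname{div}\underline{\underline{a}})_\varphi &= \frac{\partial a_{\varphi R}}{\partial R} + \frac{1}{R}\,\frac{\partial a_{\varphi\varphi}}{\partial \varphi} + \frac{\partial a_{\varphi z}}{\partial z} + \frac{a_{R\varphi} + a_{\varphi R}}{R} ,\\
(\operatorname{div}\underline{\underline{a}})_z &= \frac{\partial a_{zR}}{\partial R} + \frac{1}{R}\,\frac{\partial a_{z\varphi}}{\partial \varphi} + \frac{\partial a_{zz}}{\partial z} + \frac{a_{zR}}{R} .
\end{aligned}
\]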
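
The reordered formulas can be cross-checked numerically against the Cartesian divergence. The following is a minimal sketch, assuming a hypothetical smooth Cartesian tensor field `cart_tensor`, the parametrization $x_1 = R\cos\varphi$, $x_2 = R\sin\varphi$, $x_3 = z$, and central differences for all derivatives; every function name is introduced only for this illustration. The physical cylindrical coordinates are obtained by projecting the Cartesian coordinates onto the orthonormal physical basis.

```python
import math

def cart_tensor(x):
    """Hypothetical smooth Cartesian tensor field a_ij(x1, x2, x3)."""
    x1, x2, x3 = x
    return [[x1 * x2, x3**2, math.sin(x1)],
            [x2**2, x1 * x3, x2 * x3],
            [math.cos(x2), x1 + x3, x1 * x2 * x3]]

def to_cart(u):
    R, phi, z = u
    return [R * math.cos(phi), R * math.sin(phi), z]

def frame(u):
    """Rows: Cartesian coordinates of the physical basis g<R>, g<phi>, g<z>."""
    c, s = math.cos(u[1]), math.sin(u[1])
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def phys_tensor(u):
    """Physical cylindrical coordinates a<ab> = Q_ai Q_bj a_ij."""
    Q, T = frame(u), cart_tensor(to_cart(u))
    return [[sum(Q[a][i] * T[i][j] * Q[b][j] for i in range(3) for j in range(3))
             for b in range(3)] for a in range(3)]

def diff(f, p, k, h=1e-5):
    """Central difference of a matrix-valued function f with respect to p[k]."""
    pp, pm = list(p), list(p)
    pp[k] += h
    pm[k] -= h
    fp, fm = f(pp), f(pm)
    return [[(fp[i][j] - fm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]

def div_cart_in_frame(u):
    """Cartesian right divergence d a_ij / d x_j, projected onto the physical basis."""
    x = to_cart(u)
    D = [diff(cart_tensor, x, j) for j in range(3)]
    v = [sum(D[j][i][j] for j in range(3)) for i in range(3)]
    Q = frame(u)
    return [sum(Q[a][i] * v[i] for i in range(3)) for a in range(3)]

def div_cyl(u):
    """Reordered formulas for (div a)_R, (div a)_phi, (div a)_z."""
    R, a = u[0], phys_tensor(u)
    dR, dphi, dz = (diff(phys_tensor, u, k) for k in range(3))
    return [dR[0][0] + dphi[0][1] / R + dz[0][2] + (a[0][0] - a[1][1]) / R,
            dR[1][0] + dphi[1][1] / R + dz[1][2] + (a[0][1] + a[1][0]) / R,
            dR[2][0] + dphi[2][1] / R + dz[2][2] + a[2][0] / R]

u0 = (1.3, 0.8, -0.4)   # illustrative point (R, phi, z)
print(div_cyl(u0))
print(div_cart_in_frame(u0))
```

Both routes give the same three numbers up to the finite-difference error, for any smooth field substituted for `cart_tensor`.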
Problem 4.15.
Compute in physical cylindrical coordinates:
A. grad a, B. div a, C. curl a, D. grad a, E. (grad a) ⋅ b.
The equations (B.1) to (B.13) from the appendix can be assumed to be known.

4.4.10 Second Covariant Derivative of a Tensor Coordinate. Laplace Operator

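1. According to (4.54), together with (4.18), and also according to (4.63), the index of
the partial derivative of the position vector or of a tensor transforms like a covariant
index. This enabled us to introduce the covariant derivative (4.61) of a tensor coordinate
in terms of the partial derivative of the tensor. For the second derivative this is not
the case: the partial derivative of (4.54) is $\underline{x}_{,ij} = \underline{g}_{i,j} = \Gamma^m_{ij}\,\underline{g}_m$, and the partial derivative
of (4.63) is

\[
\frac{\partial^2 \underline{\underline{A}}}{\partial \tilde{u}^j\,\partial \tilde{u}^i}
 = \frac{\partial u^m}{\partial \tilde{u}^i}\,\frac{\partial u^n}{\partial \tilde{u}^j}\,\frac{\partial^2 \underline{\underline{A}}}{\partial u^n\,\partial u^m}
 + \frac{\partial^2 u^m}{\partial \tilde{u}^j\,\partial \tilde{u}^i}\,\frac{\partial \underline{\underline{A}}}{\partial u^m} , \tag{4.80}
\]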


i. e. while the index i in A,i transforms covariantly, the two indices in A,ij do not trans-
form like tensor indices. This is the reason why we do not use the second partial deriva-
tive of the position vector or of a tensor. Instead, we introduce the second covariant
derivative. Since the covariant derivative of a tensor coordinate is again a tensor co-
ordinate, the covariant derivative of the covariant derivative of a tensor coordinate is
defined and is again a tensor coordinate. We call it the second covariant derivative of
the original tensor coordinate and write

\[
a^{i\ldots j}_{m\ldots n}|_{pq} := a^{i\ldots j}_{m\ldots n}|_p|_q . \tag{4.81}
\]

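2. The second covariant derivative of a tensor coordinate is clearly a coordinate of
the gradient of the gradient of the tensor under consideration. From $\underline{\underline{B}} = \operatorname{grad}\underline{\underline{A}}$,
$\underline{\underline{C}} = \operatorname{grad}\underline{\underline{B}} = \operatorname{grad}\operatorname{grad}\underline{\underline{A}}$, we obtain $b^{i\ldots j}{}_p = a^{i\ldots j}|_p$, $c^{i\ldots j}{}_{pq} = b^{i\ldots j}{}_p|_q = a^{i\ldots j}|_{pq}$ or

\[
\operatorname{grad}\operatorname{grad}\underline{\underline{A}} = a^{i\ldots j}|_{pq}\,\underline{g}_i \cdots \underline{g}_j\,\underline{g}^p\,\underline{g}^q \tag{4.82}
\]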

and other versions in other holonomic coordinates.

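3. We compute the tensor $\operatorname{curl}\operatorname{grad}\underline{\underline{A}}$ as follows:

\[
\begin{aligned}
\operatorname{grad}\underline{\underline{A}} &= a^{i\ldots j}|_k\,\underline{g}_i \cdots \underline{g}_j\,\underline{g}^k = b^{i\ldots j}{}_k\,\underline{g}_i \cdots \underline{g}_j\,\underline{g}^k = \underline{\underline{B}} ,\\
\operatorname{curl}\operatorname{grad}\underline{\underline{A}} &= \operatorname{curl}\underline{\underline{B}} = b^{i\ldots j}{}_k|_n\,e^{kmn}\,\underline{g}_i \cdots \underline{g}_j\,\underline{g}_m = a^{i\ldots j}|_{kn}\,e^{kmn}\,\underline{g}_i \cdots \underline{g}_j\,\underline{g}_m .
\end{aligned}
\]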

As we know, all Cartesian coordinates of this tensor are zero, which follows immedi-
ately from the fact that the second partial derivative of a quantity does not depend on
the order of differentiation. However, if all Cartesian coordinates of a tensor are zero,
according to the transformation law (4.19) for tensor coordinates, also all holonomic
coordinates of this tensor are zero in any curvilinear coordinate system. Thus we have
$a^{i\ldots j}|_{kn}\,e^{kmn} = 0$. For $m = 1$, for example, this gives

\[
a^{i\ldots j}|_{32}\,e^{312} + a^{i\ldots j}|_{23}\,e^{213} = (a^{i\ldots j}|_{32} - a^{i\ldots j}|_{23})\,e^{312} = 0 .
\]

Since $e^{312} \neq 0$, it follows that $a^{i\ldots j}|_{32} = a^{i\ldots j}|_{23}$. For $m = 2$ and $m = 3$ we have
$a^{i\ldots j}|_{13} = a^{i\ldots j}|_{31}$, $a^{i\ldots j}|_{12} = a^{i\ldots j}|_{21}$. Thus, in general we obtain the following formula.

\[
a^{i\ldots j}_{m\ldots n}|_{pq} = a^{i\ldots j}_{m\ldots n}|_{qp} . \tag{4.83}
\]

The second covariant derivative of a tensor coordinate also does not depend on the
order of differentiation.
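4. We finally compute $\Delta\underline{\underline{A}} := \operatorname{div}\operatorname{grad}\underline{\underline{A}}$. For example, we have

\[
\begin{aligned}
\operatorname{grad}\underline{a} &= a^i|_k\,\underline{g}_i\,\underline{g}^k = a^i|_m\,g^{mk}\,\underline{g}_i\,\underline{g}_k ,\\
\operatorname{div}\operatorname{grad}\underline{a} &= (a^i|_m\,g^{mk})|_k\,\underline{g}_i = g^{mk}\,a^i|_{mk}\,\underline{g}_i + \underbrace{g^{mk}|_k}_{=\,0}\,a^i|_m\,\underline{g}_i ,
\end{aligned}
\]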


or, in general,

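\[
\begin{aligned}
\Delta a &= g^{mn}\,a|_{mn} ,\\
\Delta\underline{a} &= g^{mn}\,a^i|_{mn}\,\underline{g}_i = g^{mn}\,a_i|_{mn}\,\underline{g}^i ,\\
\Delta\underline{\underline{a}} &= g^{mn}\,a^{ij}|_{mn}\,\underline{g}_i\,\underline{g}_j = g^{mn}\,a^i{}_j|_{mn}\,\underline{g}_i\,\underline{g}^j\\
&= g^{mn}\,a_i{}^j|_{mn}\,\underline{g}^i\,\underline{g}_j = g^{mn}\,a_{ij}|_{mn}\,\underline{g}^i\,\underline{g}^j ,
\end{aligned}
\tag{4.84}
\]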
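
As a worked illustration of the first line of (4.84), the Laplacian of a scalar can be written out in cylindrical coordinates. The sketch below assumes $(u^1, u^2, u^3) = (R, \varphi, z)$ with the metric coefficients $g^{11} = g^{33} = 1$, $g^{22} = 1/R^2$ and the only Christoffel symbol with two equal lower indices, $\Gamma^1_{22} = -R$; the second covariant derivative follows from applying (4.61) to the covariant coordinates $a|_m = a_{,m}$.

```latex
\begin{aligned}
a|_{mn} &= (a_{,m})|_n = a_{,mn} - \Gamma^k_{mn}\,a_{,k} ,\\
\Delta a &= g^{mn}\,a|_{mn}
          = a_{,11} + \frac{1}{R^2}\,\bigl(a_{,22} - \Gamma^1_{22}\,a_{,1}\bigr) + a_{,33}\\
         &= \frac{\partial^2 a}{\partial R^2} + \frac{1}{R}\,\frac{\partial a}{\partial R}
          + \frac{1}{R^2}\,\frac{\partial^2 a}{\partial \varphi^2} + \frac{\partial^2 a}{\partial z^2} .
\end{aligned}
```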


Problem 4.16.
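Translate into the other two notations:

\[
\text{A. } \frac{\partial^2 \hat{a}_m}{\partial x_k^2} = \hat{b}_m, \qquad
\text{B. } \frac{\partial^2 \hat{a}_m}{\partial x_m\,\partial x_k} = \hat{b}_k, \qquad
\text{C. } \mathrm{d}\underline{a} = (\operatorname{grad}\underline{a}) \cdot \mathrm{d}\underline{x} .
\]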

Problem 4.17.
Compute in physical cylindrical coordinates: A. Δ a, B. Δ a.

Problem 4.18.
A. Show, using the definition of the covariant derivative, that the following holds:

\[
a_i|_{kl} - a_i|_{lk} = (\Gamma^m_{il,k} - \Gamma^m_{ik,l} + \Gamma^n_{il}\,\Gamma^m_{nk} - \Gamma^n_{ik}\,\Gamma^m_{nl})\,a_m .
\]

B. Prove that the bracket represents the coordinates of the zero tensor of fourth order.
The tensor

\[
R^m{}_{ijk} := \Gamma^m_{ik,j} - \Gamma^m_{ij,k} + \Gamma^n_{ik}\,\Gamma^m_{nj} - \Gamma^n_{ij}\,\Gamma^m_{nk}
\]

is called (mixed) Riemann curvature tensor. It is characteristic for Euclidean
spaces that this tensor is zero.

4.4.11 Integrals of Tensor Fields

Line, Surface, and Volume Elements

Let us first derive expressions for a line element, a surface element, and a volume
element in terms of the curvilinear coordinates.

1. A line element is, according to (4.55), given as

d x = d ui g i . (4.85)


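We denote the line elements of the coordinate curves by $\mathrm{d}\underline{x}_1$, $\mathrm{d}\underline{x}_2$, $\mathrm{d}\underline{x}_3$; so here the
index is not a covariant index. Since the basis vectors $\underline{g}_i$ are by construction tangent
to the coordinate curves, it follows from (4.85) that

\[
\mathrm{d}\underline{x}_1 = \mathrm{d}u^1\,\underline{g}_1, \qquad \mathrm{d}\underline{x}_2 = \mathrm{d}u^2\,\underline{g}_2, \qquad \mathrm{d}\underline{x}_3 = \mathrm{d}u^3\,\underline{g}_3 . \tag{4.86}
\]

This explains why we usually use the contravariant coordinates of the vector $\mathrm{d}\underline{x}$. From
(4.85), together with (4.86), we obtain in component form

\[
\mathrm{d}\underline{x} = \underbrace{\mathrm{d}u^1\,\underline{g}_1}_{\mathrm{d}\underline{x}_1} + \underbrace{\mathrm{d}u^2\,\underline{g}_2}_{\mathrm{d}\underline{x}_2} + \underbrace{\mathrm{d}u^3\,\underline{g}_3}_{\mathrm{d}\underline{x}_3} . \tag{4.87}
\]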

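2. A surface element $\mathrm{d}\underline{A}$ is analogously given by

\[
\mathrm{d}\underline{A} = \mathrm{d}A_i\,\underline{g}^i . \tag{4.88}
\]

We denote the surface elements of the coordinate planes by $\mathrm{d}\underline{A}_1$, $\mathrm{d}\underline{A}_2$, $\mathrm{d}\underline{A}_3$; i. e. here
again the index is not a covariant index. Since the basis vectors $\underline{g}^i$ are by construction
perpendicular to the coordinate surfaces at each point, it follows from (4.88) that

\[
\mathrm{d}\underline{A}_1 = \mathrm{d}A_1\,\underline{g}^1, \qquad \mathrm{d}\underline{A}_2 = \mathrm{d}A_2\,\underline{g}^2, \qquad \mathrm{d}\underline{A}_3 = \mathrm{d}A_3\,\underline{g}^3 . \tag{4.89}
\]

This again explains why we commonly use the covariant coordinates of the vector $\mathrm{d}\underline{A}$.
From (4.88), together with (4.89), analogously to (4.87), we obtain

\[
\mathrm{d}\underline{A} = \underbrace{\mathrm{d}A_1\,\underline{g}^1}_{\mathrm{d}\underline{A}_1} + \underbrace{\mathrm{d}A_2\,\underline{g}^2}_{\mathrm{d}\underline{A}_2} + \underbrace{\mathrm{d}A_3\,\underline{g}^3}_{\mathrm{d}\underline{A}_3} . \tag{4.90}
\]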

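3. The three vectors $\mathrm{d}\underline{A}_i$ and the three vectors $\mathrm{d}\underline{x}_i$ are related as follows. For example,
the surface element with the surface vector $\mathrm{d}\underline{A}_1$ is spanned by the line elements $\mathrm{d}\underline{x}_2$
and $\mathrm{d}\underline{x}_3$, i. e. we have

\[
\mathrm{d}\underline{A}_1 = \pm\,\mathrm{d}\underline{x}_2 \times \mathrm{d}\underline{x}_3, \qquad
\mathrm{d}\underline{A}_2 = \pm\,\mathrm{d}\underline{x}_3 \times \mathrm{d}\underline{x}_1, \qquad
\mathrm{d}\underline{A}_3 = \pm\,\mathrm{d}\underline{x}_1 \times \mathrm{d}\underline{x}_2 . \tag{4.91}
\]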

The sign is determined by the orientation of the three vectors d Ai .

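Substituting the line elements into (4.91) gives

\[
\begin{aligned}
\mathrm{d}\underline{A}_1 &\overset{(4.89)}{=} \mathrm{d}A_1\,\underline{g}^1 \overset{(4.91)}{=} \pm\,\mathrm{d}\underline{x}_2 \times \mathrm{d}\underline{x}_3 \overset{(4.86)}{=} \pm\,\mathrm{d}u^2\,\underline{g}_2 \times \mathrm{d}u^3\,\underline{g}_3\\
&\overset{(3.21)}{=} \pm\,[\underline{g}_1, \underline{g}_2, \underline{g}_3]\,\mathrm{d}u^2\,\mathrm{d}u^3\,\underline{g}^1 \overset{(4.28)}{=} \pm\sqrt{g}\,\mathrm{d}u^2\,\mathrm{d}u^3\,\underline{g}^1 ,
\end{aligned}
\]

i. e.

\[
\begin{aligned}
\mathrm{d}A_1 &= \pm\sqrt{g}\,\mathrm{d}u^2\,\mathrm{d}u^3 ,\\
\mathrm{d}A_2 &= \pm\sqrt{g}\,\mathrm{d}u^3\,\mathrm{d}u^1 ,\\
\mathrm{d}A_3 &= \pm\sqrt{g}\,\mathrm{d}u^1\,\mathrm{d}u^2 .
\end{aligned}
\tag{4.92}
\]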

Here again the sign is determined by the orientation of the d Ai .

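4. The volume element spanned by the line elements $\mathrm{d}\underline{x}_1$, $\mathrm{d}\underline{x}_2$, and $\mathrm{d}\underline{x}_3$ is given by
(assuming right-handed systems we dropped the absolute-value signs)

\[
\mathrm{d}V = [\mathrm{d}\underline{x}_1, \mathrm{d}\underline{x}_2, \mathrm{d}\underline{x}_3] . \tag{4.93}
\]

Substituting the line elements, using (4.86), into (4.93), we obtain

\[
\mathrm{d}V = [\mathrm{d}u^1\,\underline{g}_1, \mathrm{d}u^2\,\underline{g}_2, \mathrm{d}u^3\,\underline{g}_3] = [\underline{g}_1, \underline{g}_2, \underline{g}_3]\,\mathrm{d}u^1\,\mathrm{d}u^2\,\mathrm{d}u^3
\]

or, using (4.28),

\[
\mathrm{d}V = \sqrt{g}\,\mathrm{d}u^1\,\mathrm{d}u^2\,\mathrm{d}u^3 . \tag{4.94}
\]

Integrals in Curvilinear Coordinates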

1. In (2.94) and (2.95) we defined various types of integrals of tensor fields, both in
symbolic notation and in Cartesian coordinates. If, in the equations in symbolic nota-
tion, we write all tensors as the sum of their components with respect to a Cartesian
coordinate system, we can pull the bases in front of the integral and cancel them (they
are independent of position) and thus obtain equations in Cartesian coordinates. We
have, e. g. for

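\[
\int \underline{\underline{a}}\,\mathrm{d}\underline{x} = \underline{\underline{B}} , \tag{a}
\]

\[
\int \hat{a}_{ij}\,\underline{e}_i\,\underline{e}_j\,\mathrm{d}x_k\,\underline{e}_k = \underline{e}_i\,\underline{e}_j\,\underline{e}_k \int \hat{a}_{ij}\,\mathrm{d}x_k = \hat{B}_{ijk}\,\underline{e}_i\,\underline{e}_j\,\underline{e}_k ,
\]

\[
\int \hat{a}_{ij}\,\mathrm{d}x_k = \hat{B}_{ijk} . \tag{b}
\]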

A version in holonomic curvilinear coordinates, i. e. a translation of (a) and (b) into
holonomic curvilinear coordinates, does not exist in general. Clearly, we can write the
tensors, e. g. in (a), in component form with respect to any curvilinear coordinate sys-
tem, but only in the case of rectilinear coordinates we can pull the bases in front of the
integrals and cancel them out. Curvilinear bases depend on position and they cannot
be pulled in front of the integrals. Thus, in the case of curvilinear coordinates, inte-
grals over spatial domains and hence integral theorems, including the theorems of
Gauss and Stokes, cannot be written in “true” coordinate notation. The best we can
do is to write the bases, using (4.5) or (4.7), with respect to a Cartesian basis and then
pull this Cartesian basis in front of the integrals and cancel them out, e. g. we write

\[
\int \underline{a}\,\mathrm{d}V = \int a_i\,\underline{g}^i\,\mathrm{d}V \overset{(4.7)}{=} \int a_i\,\frac{\partial u^i}{\partial x_j}\,\underline{e}_j\,\mathrm{d}V = \underline{e}_j \int a_i\,\frac{\partial u^i}{\partial x_j}\,\mathrm{d}V .
\]

In this way, we obtain an expression for the integrand which contains curvilinear co-
ordinates. However, according to the transformation law (4.19) for tensor coordinates,
the integrand consists of the Cartesian coordinates of the respective tensor as a func-
tion of the curvilinear coordinates. Such expressions are sometimes useful, but let us
keep in mind that ultimately they are expressions in Cartesian tensor coordinates.

2. However, there are two special cases where the integrand can be written in “true”
curvilinear coordinates. First, if the integral is a scalar combination of tensor coordi-
nates, then the integral does not contain a basis. An example is the Gauss theorem
(2.97)1 in the form

\[
\int a^i|_i\,\mathrm{d}V = \oint a^i\,\mathrm{d}A_i \tag{4.95}
\]

and the Stokes theorem (2.100)$_1$ in the form

\[
\int a_i|_k\,e^{ijk}\,\mathrm{d}A_j = \oint a_i\,\mathrm{d}u^i . \tag{4.96}
\]

Second, we obtain “true” expressions in curvilinear coordinates for coordinate systems,
where one basis vector is independent of position. This is the case, e. g. for cylindrical
coordinates, where the third basis vector (if we list the indices in the usual order)
is independent of position and moreover is a unit vector, i. e. $\underline{g}_3 = \underline{g}^3 = \underline{e}_3$. Then all
tensor coordinates whose free indices have the value three are also Cartesian coordinates,
and we can integrate over them. This gives, for example, the Gauss theorem
(2.96)$_1$ in the form

\[
\int a|_3\,\mathrm{d}V = \oint a\,\mathrm{d}A_3 \tag{4.97}
\]

or, using (2.98)$_1$, in the form

\[
\int a_i|_k\,e^{i3k}\,\mathrm{d}V = \oint a_i\,e^{i3k}\,\mathrm{d}A_k . \tag{4.98}
\]
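
The remark that curvilinear bases cannot be pulled in front of an integral can be illustrated numerically. The following is a minimal sketch, assuming cylindrical coordinates with $\sqrt{g} = R$ in (4.94) and a hypothetical radial unit field whose only physical coordinate is $a_{\langle R\rangle} = 1$; the function name and the midpoint-rule discretization are illustrative choices. The correct volume integral uses the Cartesian coordinates $(\cos\varphi, \sin\varphi, 0)$ of the field, while the naive version integrates the physical coordinate as if its basis were constant.

```python
import math

def integrate_radial_field(n_r=50, n_phi=64, n_z=4, radius=1.0, height=1.0):
    """Midpoint-rule integrals of the radial unit field over a full cylinder.

    Returns (cartesian, naive): 'cartesian' integrates the Cartesian coordinates
    of the field, 'naive' integrates the physical coordinate a<R> = 1 directly.
    """
    dR, dphi, dz = radius / n_r, 2 * math.pi / n_phi, height / n_z
    cartesian = [0.0, 0.0, 0.0]
    naive = 0.0
    for i in range(n_r):
        R = (i + 0.5) * dR
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            for _ in range(n_z):
                dV = R * dR * dphi * dz      # dV = sqrt(g) du1 du2 du3, sqrt(g) = R
                cartesian[0] += math.cos(phi) * dV
                cartesian[1] += math.sin(phi) * dV
                naive += 1.0 * dV
    return cartesian, naive

cartesian, naive = integrate_radial_field()
print(cartesian)   # components of the vector integral
print(naive)       # integral of the physical coordinate alone
```

The vector integral vanishes by symmetry, whereas the naive integral of the physical coordinate returns the volume of the cylinder, which is not a coordinate of the integral in any fixed basis.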

4.5 Fundamentals of the Theory of Surfaces

The theory of surfaces is the part of differential geometry which investigates the
properties of curved surfaces. This topic is not only important for many technical
applications, but also historically relevant, as it was the starting point for the development of
tensor analysis in the 19th century. From today’s perspective, however, it is easier to
derive the fundamental equations of the theory of surfaces as a special case of tensor
analysis in three-dimensional curvilinear coordinate systems. We will take this path here.

4.5.1 Surface Coordinates and Moving Frames

1. Let $u^1$ and $u^2$ be two coordinates defined on a surface, which uniquely determine
the position of any point P on the surface. Then the position vector of the point P is
given in a Cartesian coordinate system as $\underline{x} = \overrightarrow{OP} = x_i\,\underline{e}_i$,

\[
\underline{x} = \underline{x}(u^1, u^2), \qquad x_i = x_i(u^1, u^2) . \tag{4.99}
\]

We further assume that the surface is sufficiently smooth and that the functions xi are
sufficiently often differentiable with respect to u1 and u2 . The coordinates u1 and u2 are
also called surface parameters or surface coordinates.

2. From the perspective of the surrounding space, the surface can be interpreted as
a coordinate surface in a three-dimensional curvilinear coordinate system. The third
coordinate u3 has the same value for all points on the surface u3 = c. Points with
u3 ≠ c are not included in the surface. With this interpretation we can assign two basis
vectors to the surface, i. e. the covariant basis vectors g 1 and g 2 , which are, according
to (4.5), tangent to the coordinate curves of the u1 - and u2 -coordinates, respectively,
i. e.
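\[
\underline{g}_1 = \frac{\partial x_i}{\partial u^1}\,\underline{e}_i = \underline{x}_{,1}, \qquad
\underline{g}_2 = \frac{\partial x_i}{\partial u^2}\,\underline{e}_i = \underline{x}_{,2} . \tag{4.100}
\]

Together, $\underline{g}_1$ and $\underline{g}_2$ span the tangent plane of the surface at the point P.
If we need a basis in the three-dimensional space, we add to $\underline{g}_1$ and $\underline{g}_2$ the normal
vector $\underline{n}$ of the surface at the point P, so we have

\[
\underline{n} = \frac{\underline{g}_1 \times \underline{g}_2}{|\underline{g}_1 \times \underline{g}_2|} . \tag{4.101}
\]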


The basis g 1 , g 2 , n is called a moving frame of the surface and, in general, it is varying
from point to point. The normal vector n has length 1 and is perpendicular to both
g 1 and g 2 . The lengths of g 1 and g 2 and the angle enclosed by them depend on the
definition of the coordinates u1 and u2 .
The normal vector can be interpreted as a covariant basis vector g 3 = (𝜕xi /𝜕u3 ) ei ,
if the third coordinate u3 is at every point normal to the surface and scaled such that
g 3 = n. This interpretation turns out to be useful, when we specialize the equations of
tensor analysis from the three-dimensional space to the surface. However, according
to definition (4.101), g 3 = n does not depend on u3 ; it is via g 1 and g 2 only determined
by the surface coordinates u1 and u2 .
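
The construction of the moving frame in (4.100) and (4.101) can be sketched numerically. The surface x = (u1, u2, (u1² + u2²)/2) and the helper names below are assumptions chosen only for illustration; they do not come from the text.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def moving_frame(u1, u2):
    """Moving frame (g1, g2, n) of the assumed surface x = (u1, u2, (u1**2 + u2**2) / 2)."""
    g1 = (1.0, 0.0, u1)                  # x,1 as in (4.100)
    g2 = (0.0, 1.0, u2)                  # x,2 as in (4.100)
    c = cross(g1, g2)
    length = math.sqrt(dot(c, c))
    n = tuple(ci / length for ci in c)   # unit normal as in (4.101)
    return g1, g2, n

g1, g2, n = moving_frame(0.5, -1.0)
print(n, dot(n, g1), dot(n, g2))
```

The printed scalar products illustrate that n is perpendicular to both tangent vectors, while g1 and g2 themselves are in general neither unit vectors nor orthogonal.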

3. For our further investigations we find it useful to extend the summation convention.
As before, we denote coordinate indices in the three-dimensional space by small Latin
letters i, j, k, . . . with the value set 1, 2, 3; and as before we sum over Latin indices, which
appear twice, from 1 to 3. However, to analyze a surface, we need only two coordinates.
We denote indices which only take the values 1, 2 by small Greek letters α, β, . . . , and
we sum correspondingly over Greek indices appearing twice from 1 to 2. Thus, we write
(4.99)2 and (4.100) compactly as

x_i = x_i(u^\alpha), (4.102)

\mathbf{g}_\alpha = \frac{\partial x_i}{\partial u^\alpha} \mathbf{e}_i = \mathbf{x}_{,\alpha}. (4.103)

4. Consider a surface where a second set of coordinates ũ^β is defined, which is related
one-to-one (except at singular points) to the first set of coordinates u^α, by the
transformation equations

u^\alpha = u^\alpha(\tilde{u}^\beta), \quad \tilde{u}^\alpha = \tilde{u}^\alpha(u^\beta). (4.104)

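Then, analogously to (4.103), we have for the covariant basis vectors g̃ α

\tilde{\mathbf{g}}_\alpha = \frac{\partial x_i}{\partial \tilde{u}^\alpha} \mathbf{e}_i.

Using the chain rule, we get

\tilde{\mathbf{g}}_\alpha = \frac{\partial x_i}{\partial \tilde{u}^\alpha} \mathbf{e}_i = \frac{\partial x_i}{\partial u^\beta} \frac{\partial u^\beta}{\partial \tilde{u}^\alpha} \mathbf{e}_i = \frac{\partial u^\beta}{\partial \tilde{u}^\alpha} \underbrace{\frac{\partial x_i}{\partial u^\beta} \mathbf{e}_i}_{\mathbf{g}_\beta},

\tilde{\mathbf{g}}_\alpha = \frac{\partial u^\beta}{\partial \tilde{u}^\alpha} \mathbf{g}_\beta. (4.105)
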
This equation corresponds to the transformation law (4.18) for covariant basis vectors,
with the only difference that here summation is only from 1 to 2.


The new basis vectors g̃ α can be represented as a linear combination of the old ba-
sis vectors g α (as we can expect from the transformation equations (4.104)), and thus,
they also lie in the tangent plane spanned by g α . In other words, only one tangent
plane exists at a point P and it contains the tangent vectors of all curves through the
point P.

5. The basis vectors g^i, reciprocal to the moving frame, are given by (3.21) and with
g_3 = n as

\mathbf{g}^1 = \frac{\mathbf{g}_2 \times \mathbf{n}}{[\mathbf{g}_1, \mathbf{g}_2, \mathbf{n}]}, \quad \mathbf{g}^2 = \frac{\mathbf{n} \times \mathbf{g}_1}{[\mathbf{g}_1, \mathbf{g}_2, \mathbf{n}]}, \quad \mathbf{g}^3 = \frac{\mathbf{g}_1 \times \mathbf{g}_2}{[\mathbf{g}_1, \mathbf{g}_2, \mathbf{n}]}.

From the properties of the vector product, we know that g^1 and g^2 are perpendicular
to n and that n is perpendicular to g_1 and g_2. Thus g^1 and g^2 lie in the plane spanned
by g_1 and g_2, while g^3 is perpendicular to g_1 and g_2. Hence g^3 and n are collinear and
since [g_1, g_2, n] = g_1 × g_2 ⋅ n = |g_1 × g_2|, we have g^3 = n. The g^α are again called
contravariant basis vectors. According to Section 3.9.4, they satisfy, analogously to (3.24),
the orthogonality relations for reciprocal bases in a plane. We have

\mathbf{g}^\alpha \cdot \mathbf{g}_\beta = \delta^\alpha_\beta. (4.106)

For the second orthogonality relation (3.25) we obtain with g_3 = g^3 = n

\mathbf{g}_\alpha \mathbf{g}^\alpha + \mathbf{n} \mathbf{n} = \boldsymbol{\delta}. (4.107)
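
The reciprocal basis and the two orthogonality relations (4.106) and (4.107) can be checked numerically. The following is a minimal sketch; the surface x = (u1, u2, (u1² + u2²)/2), the evaluation point, and all variable names are assumptions for illustration.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

# Moving frame of the assumed surface at u1 = 0.5, u2 = -1
g1 = (1.0, 0.0, 0.5)
g2 = (0.0, 1.0, -1.0)
c = cross(g1, g2)
n = tuple(ci / math.sqrt(dot(c, c)) for ci in c)

triple = dot(cross(g1, g2), n)                      # [g1, g2, n]
g_up1 = tuple(x / triple for x in cross(g2, n))     # contravariant basis vector g^1
g_up2 = tuple(x / triple for x in cross(n, g1))     # contravariant basis vector g^2

# (4.106): g^alpha . g_beta should give the 2x2 unit matrix
gram = [[dot(gu, gl) for gl in (g1, g2)] for gu in (g_up1, g_up2)]

# (4.107): the dyadic sum g_alpha g^alpha + n n should give the 3x3 unit matrix
dyad_sum = [[g1[i] * g_up1[j] + g2[i] * g_up2[j] + n[i] * n[j] for j in range(3)]
            for i in range(3)]
print(gram)
print(dyad_sum)
```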

4.5.2 Surface Vectors and Surface Tensors

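1. The differential of the position vector on the surface is given, in terms of the
coordinates u^α and ũ^α, as

d\mathbf{x} = d x_i \, \mathbf{e}_i = \frac{\partial x_i}{\partial u^\alpha} \mathbf{e}_i \, d u^\alpha = d u^\alpha \, \mathbf{g}_\alpha = \underbrace{\frac{\partial x_i}{\partial \tilde{u}^\alpha} \mathbf{e}_i}_{\tilde{\mathbf{g}}_\alpha} d \tilde{u}^\alpha = d \tilde{u}^\alpha \, \tilde{\mathbf{g}}_\alpha.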


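Thus, the differential of the position vector d x also lies in the tangent plane.
Substitution of the transformation law (4.105) for the basis vectors gives

d \tilde{u}^\alpha \, \tilde{\mathbf{g}}_\alpha = d u^\beta \, \mathbf{g}_\beta = d u^\beta \frac{\partial \tilde{u}^\alpha}{\partial u^\beta} \tilde{\mathbf{g}}_\alpha.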


So the coordinate differentials satisfy the transformation law

d \tilde{u}^\alpha = \frac{\partial \tilde{u}^\alpha}{\partial u^\beta} d u^\beta, (4.108)

which is, according to (4.19), the transformation law for contravariant vector coordinates.

2. A quantity a, which can be represented in the form a = a^α g_α = ã^α g̃_α and whose
coefficients a^α and ã^α are related by the transformation equations (4.108), is called a
surface vector, and the a^α and ã^α are called its contravariant coordinates. Using the
contravariant basis vectors, we can also write a surface vector in the form a = a_α g^α =
ã_α g̃^α, i. e. in terms of its covariant coordinates.

Problem 4.19.
Derive the transformation laws for the contravariant basis vectors g α and for the co-
variant coordinates aα of a surface vector a.

Surface tensors of higher order are defined similarly, and their coordinates must
analogously satisfy the transformation laws (4.19). Thus, computations with surface
tensors are performed in the same way as with tensors in a three-dimensional curvilin-
ear coordinate system. The only difference is that the values of the indices are limited
to 1, 2.

Problem 4.20.
Metric coefficients on a curved surface can be defined by the scalar product
g α ⋅ g β = gαβ , just like in a three-dimensional curvilinear coordinate system. Show
that the metric coefficients gαβ are the entirely covariant coordinates of a second-
order surface tensor.

4.5.3 Metric Properties

1. Now we ask how we can measure lengths, angles, and surface areas on a curved
surface; these quantities are also called metric properties of a surface. The fundamen-
tal idea is to replace the curved surface in the infinitesimal neighborhood of a point P
by its tangent plane; there we can compute lengths, angles, and surface areas, using
the tools of plane geometry. The transition to surface patches of finite size then follows
directly by integration.
2. The differential of the position vector was introduced in Section 4.5.2 as

d x = g α d uα . (4.109)

The scalar product of the differential with itself is

d x ⋅ d x = (g α d uα ) ⋅ (g β d uβ ) = g α ⋅ g β d uα d uβ .


The scalar products of the (covariant) basis vectors of the surface form, analogously
to (4.21), the (covariant) metric coefficients of the surface, i. e.

gαβ := g α ⋅ g β , (4.110)

and then

d x ⋅ d x = gαβ d uα d uβ . (4.111)

This equation is called the first fundamental form of the surface. It offers another way
to show, different from Problem 4.20, that the metric coefficients are tensor coordi-
nates; since the left side is a scalar d x ⋅ d x and the differentials d uα and d uβ of the
coordinates are, according to (4.108), the contravariant coordinates of a surface vector,
the gαβ are, according to the quotient rule, the covariant coordinates of a second-order
surface tensor. The metric coefficients are symmetric, since they are defined as scalar
products, i. e.

gαβ = gβα , (4.112)

and furthermore, because d x ⋅ d x > 0, their matrix is positive definite. The following
notations are commonly used:

g11 =: E, g22 =: G, g12 = g21 =: F. (4.113)

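Substituting the definitions (4.110) and (4.103) gives

E = \mathbf{g}_1 \cdot \mathbf{g}_1 = \left( \frac{\partial x_i}{\partial u^1} \right)^2, \quad G = \mathbf{g}_2 \cdot \mathbf{g}_2 = \left( \frac{\partial x_i}{\partial u^2} \right)^2, \quad F = \mathbf{g}_1 \cdot \mathbf{g}_2 = \frac{\partial x_i}{\partial u^1} \frac{\partial x_i}{\partial u^2},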

and with u1 = u and u2 = v these are the relations (2.87). Analogously to (4.21), we
compute the contravariant metric coefficients as

g^{\alpha\beta} := \mathbf{g}^\alpha \cdot \mathbf{g}^\beta. (4.114)

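Using the orthogonality relations (4.106) and (4.107) and since g_α ⋅ n = n ⋅ g^α = 0, we
obtain

g_{\alpha\gamma} g^{\gamma\beta} = \mathbf{g}_\alpha \cdot \mathbf{g}_\gamma \, \mathbf{g}^\gamma \cdot \mathbf{g}^\beta = \mathbf{g}_\alpha \cdot (\boldsymbol{\delta} - \mathbf{n} \mathbf{n}) \cdot \mathbf{g}^\beta = \mathbf{g}_\alpha \cdot \mathbf{g}^\beta - \mathbf{g}_\alpha \cdot \mathbf{n} \, \mathbf{n} \cdot \mathbf{g}^\beta = \delta_\alpha^\beta,

i. e. the matrices of the two types of metric coefficients are inverses of each other, so
we have

g_{\alpha\gamma} g^{\gamma\beta} = \delta_\alpha^\beta. (4.115)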


Because the metric coefficients on a surface are only 2,2-matrices, we can easily com-
pute the inverse and we obtain

g^{11} = \frac{g_{22}}{g}, \quad g^{22} = \frac{g_{11}}{g}, \quad g^{12} = g^{21} = -\frac{g_{12}}{g}, (4.116)

where g is the determinant of the covariant metric coefficients,

g = g11 g22 − g12 g12 . (4.117)
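
As a small numerical illustration of (4.113) to (4.117), the following sketch computes the metric coefficients and their inverse for one point of an assumed surface x = (u1, u2, (u1² + u2²)/2). The surface and the function names are hypothetical choices, not taken from the text.

```python
def metric(u1, u2):
    """Covariant metric coefficients (E, F, G) of the assumed surface x = (u1, u2, (u1**2 + u2**2) / 2)."""
    g1 = (1.0, 0.0, u1)                       # x,1
    g2 = (0.0, 1.0, u2)                       # x,2
    E = sum(a * a for a in g1)                # g_11
    F = sum(a * b for a, b in zip(g1, g2))    # g_12 = g_21
    G = sum(b * b for b in g2)                # g_22
    return E, F, G

def inverse_metric(E, F, G):
    """Contravariant metric coefficients (g^11, g^12, g^22) from (4.116) and (4.117)."""
    g = E * G - F * F
    return G / g, -F / g, E / g

E, F, G = metric(0.5, -1.0)
g_up11, g_up12, g_up22 = inverse_metric(E, F, G)

# Matrix product g_{alpha gamma} g^{gamma beta}; by (4.115) it should be the unit matrix
product = [[E * g_up11 + F * g_up12, E * g_up12 + F * g_up22],
           [F * g_up11 + G * g_up12, F * g_up12 + G * g_up22]]
print(E, F, G, product)
```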

Problem 4.21.
Show, using a surface vector as an example, that similarly as in three dimensions, we
can again use the metric coefficients gαβ and g αβ to raise and lower indices.

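Using the definition (4.110) and the Grassmann identity (compare the solution to
Problem 2.13 A), we can also write the determinant as

g = \underbrace{\mathbf{g}_1 \cdot \mathbf{g}_1}_{g_{11}} \, \underbrace{\mathbf{g}_2 \cdot \mathbf{g}_2}_{g_{22}} - \underbrace{\mathbf{g}_1 \cdot \mathbf{g}_2}_{g_{12}} \, \underbrace{\mathbf{g}_1 \cdot \mathbf{g}_2}_{g_{12}} = (\mathbf{g}_1 \times \mathbf{g}_2) \cdot (\mathbf{g}_1 \times \mathbf{g}_2) = |\mathbf{g}_1 \times \mathbf{g}_2|^2,

\sqrt{g} = |\mathbf{g}_1 \times \mathbf{g}_2|. (4.118)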

Furthermore, g is also the determinant of the covariant metric coefficients of the mov-
ing frame. Since g 1 × g 2 and n are, according to (4.101), collinear and n has length 1, it
follows with (4.25)1 that

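\det g_{ij} = [\mathbf{g}_1, \mathbf{g}_2, \mathbf{n}]^2 = (\mathbf{g}_1 \times \mathbf{g}_2 \cdot \mathbf{n})^2 = |\mathbf{g}_1 \times \mathbf{g}_2|^2 \, |\mathbf{n}|^2 = |\mathbf{g}_1 \times \mathbf{g}_2|^2 = g,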

det gij = g > 0. (4.119)

3. For a curve defined on a surface, the surface coordinates uα are functions of a curve
parameter s, i. e. we have for the position vector to a point on the curve

x = x (uα (s)) = x(s).

Using the chain rule and (4.103), we obtain for the vector element tangent to the curve

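
d\mathbf{x} = \frac{d\mathbf{x}}{ds} \, ds = \mathbf{x}_{,\alpha} \frac{d u^\alpha}{ds} \, ds = \mathbf{g}_\alpha \frac{d u^\alpha}{ds} \, ds = \mathbf{t} \, ds.

Here

\mathbf{t} = \frac{d\mathbf{x}}{ds}, \quad t^\alpha = \frac{d u^\alpha}{ds} (4.120)

is the vector tangent to the curve.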


To find the length L of a finite curve element between the point A (with the pa-
rameter value sA ) and the point B (with the parameter value sB ) we integrate over
|d x| = √d x ⋅ d x. Using (4.111) and (4.120), we get


L = \int_{s_A}^{s_B} \sqrt{g_{\alpha\beta} \, t^\alpha t^\beta} \, ds. (4.121)
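
The length integral (4.121) can be evaluated numerically once the metric coefficients and the curve are given. The sketch below assumes the surface x = (u1, u2, (u1² + u2²)/2) and uses the composite Simpson rule; both the surface and the integration scheme are illustrative choices, not part of the text.

```python
import math

def metric(u1, u2):
    """(g11, g12, g22) of the assumed surface x = (u1, u2, (u1**2 + u2**2) / 2)."""
    return 1.0 + u1 * u1, u1 * u2, 1.0 + u2 * u2

def curve_length(u, du, s_a, s_b, steps=1000):
    """Approximate (4.121) with the composite Simpson rule (steps must be even).

    u(s) returns the surface coordinates (u1, u2), du(s) their derivatives (t1, t2).
    """
    def integrand(s):
        g11, g12, g22 = metric(*u(s))
        t1, t2 = du(s)
        return math.sqrt(g11 * t1 * t1 + 2.0 * g12 * t1 * t2 + g22 * t2 * t2)

    h = (s_b - s_a) / steps
    total = integrand(s_a) + integrand(s_b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(s_a + i * h)
    return total * h / 3.0

# Coordinate curve u2 = 0, parametrized by s = u1 between 0 and 1
length = curve_length(lambda s: (s, 0.0), lambda s: (1.0, 0.0), 0.0, 1.0)
# Closed form of the integral of sqrt(1 + s**2) over [0, 1], used only as a check
exact = 0.5 * (math.sqrt(2.0) + math.asinh(1.0))
print(length, exact)
```

Note that the parameter s in this sketch is not scaled to unit tangent length; (4.121) holds for any curve parameter.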

4. For two surface curves intersecting at a point P on the surface, we can compute the
angle φ between the two curves from the two tangent vectors t 1 and t 2 . From

t 1 ⋅ t 2 = |t 1 | |t 2 | cos φ

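it follows with (4.120) that

\cos\varphi = \frac{\mathbf{t}_1 \cdot \mathbf{t}_2}{|\mathbf{t}_1| \, |\mathbf{t}_2|} = \frac{g_{\alpha\beta} \, \underset{1}{t}{}^\alpha \, \underset{2}{t}{}^\beta}{\sqrt{g_{\mu\nu} \, \underset{1}{t}{}^\mu \, \underset{1}{t}{}^\nu} \; \sqrt{g_{\rho\sigma} \, \underset{2}{t}{}^\rho \, \underset{2}{t}{}^\sigma}}. (4.122)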
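
Equation (4.122) translates directly into a few lines of code. In the following sketch the metric values belong to the assumed surface x = (u1, u2, (u1² + u2²)/2) at u1 = u2 = 1, where E = G = 2 and F = 1; the function name is hypothetical.

```python
import math

def angle_between(metric, t1, t2):
    """Angle from (4.122); metric = (g11, g12, g22), t1 and t2 hold contravariant coordinates."""
    g11, g12, g22 = metric

    def inner(a, b):
        # g_{alpha beta} a^alpha b^beta
        return g11 * a[0] * b[0] + g12 * (a[0] * b[1] + a[1] * b[0]) + g22 * a[1] * b[1]

    cos_phi = inner(t1, t2) / math.sqrt(inner(t1, t1) * inner(t2, t2))
    return math.acos(cos_phi)

# Angle between the u1- and u2-coordinate curves, whose tangents are g1 and g2
phi = angle_between((2.0, 1.0, 2.0), (1.0, 0.0), (0.0, 1.0))
print(math.degrees(phi))
```

For the coordinate curves the formula reduces to cos φ = F / √(E G), so the coordinate net is orthogonal exactly where F = 0.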

5. We consider two curve elements d x1 and d x2 , which are tangent to the u1 - and
u2 -coordinate curves,

d x 1 = g 1 d u1 , d x 2 = g 2 d u2 .

The surface area of the parallelogram spanned by the two vectors is given by the length
of the vector product

d A = d x1 × d x 2 = (g 1 × g 2 ) d u1 d u2 ,

or, using (4.118),

d A = |g 1 × g 2 | d u1 d u2 = √g d u1 d u2 .

Hence, we compute the surface area A of a finite surface area patch bounded by u1A ≤
u1 ≤ u1B and u2A (u1 ) ≤ u2 ≤ u2B (u1 ) with the following integral:

A = \int_{u^1_A}^{u^1_B} \int_{u^2_A(u^1)}^{u^2_B(u^1)} \sqrt{g} \, d u^1 \, d u^2. (4.123)

We thus derived once more the equation for a surface integral of the first kind from
Section 2.14.3; to see this, we compare the last equation with (2.88) and write g using
the notations E, F, G from (4.113), so we have g = E G − F^2.
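
A minimal sketch of the area integral (4.123), using the midpoint rule. The example surface is a sphere of assumed radius R with u1 as polar angle and u2 as azimuth, for which g11 = R², g22 = R² sin² u1, g12 = 0 and hence √g = R² sin u1; the function name and the step count are illustrative assumptions.

```python
import math

def surface_area(sqrt_g, u1_a, u1_b, u2_a, u2_b, steps=200):
    """Midpoint-rule approximation of (4.123); u2_a and u2_b are functions of u1."""
    h1 = (u1_b - u1_a) / steps
    area = 0.0
    for i in range(steps):
        u1 = u1_a + (i + 0.5) * h1
        lo, hi = u2_a(u1), u2_b(u1)
        h2 = (hi - lo) / steps
        for j in range(steps):
            u2 = lo + (j + 0.5) * h2
            area += sqrt_g(u1, u2) * h1 * h2
    return area

R = 2.0  # assumed radius
area = surface_area(lambda u1, u2: R * R * math.sin(u1),
                    0.0, math.pi,
                    lambda u1: 0.0, lambda u1: 2.0 * math.pi)
print(area, 4.0 * math.pi * R * R)
```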


4.5.4 Curvature Properties

1. Planes are defined by the fact that their normal vector n points everywhere in the
same direction, i. e. its differential always satisfies d n = 0. On a curved surface,
the direction of n varies from point to point, i. e. we have d n ≠ 0. This suggests
investigating d n to gain further insight into the nature of curvature.

2. According to Section 4.5.1, the normal vector n is a function of the surface coordi-
nates uα , i. e. its differential is given by

d n = n,β d uβ . (4.124)

We could compute n,β directly from the defining equation (4.101). However, this is complicated
and messy, because we have to use the product rule and the quotient rule several
times. It turns out to be more efficient to use some of the properties of the moving
frame.
The normal vector n is by definition a unit vector, so its scalar product with itself is

n ⋅ n = 1.

Partial differentiation of this equation with respect to the surface coordinates uβ gives

n,β ⋅ n = 0 .

Hence the vectors n,β are perpendicular to n, i. e. they lie in the tangent plane of the
surface at the point P. We can thus represent the n,β as a linear combination of the
basis vectors g α which span this tangent plane. Denoting the coefficients of this linear
combination by −bα β , we first obtain

n,β = −bα β g α . (4.125)

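We will show that the b^α_β are the contravariant–covariant coordinates of a second-order
surface tensor, which is called the curvature tensor. To do this we compute the
scalar product of the differential of the position vector with the differential of the
normal vector at the point P, and we obtain, using (4.109), (4.124), and (4.125),

- d\mathbf{n} \cdot d\mathbf{x} = -(\mathbf{n}_{,\beta} \, d u^\beta) \cdot (\mathbf{g}_\alpha \, d u^\alpha) = b^\gamma{}_\beta \, \underbrace{\mathbf{g}_\gamma \cdot \mathbf{g}_\alpha}_{g_{\gamma\alpha}} \, d u^\alpha d u^\beta = \underbrace{b^\gamma{}_\beta \, g_{\gamma\alpha}}_{b_{\alpha\beta}} \, d u^\alpha d u^\beta.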

The left side is the scalar −d n⋅d x, the coordinate differentials d uα and d uβ are accord-
ing to (4.108) the contravariant coordinates of a surface vector, and we already know
from (4.111) that the metric coefficients gγα are the covariant coordinates of a surface
tensor of second order. Hence it follows from the quotient rule that the bγ β are also
tensor coordinates, or more precisely, the coordinates of a surface tensor of second
order. Thus, we can lower the first index using the metric coefficients, and we obtain
the covariant coordinates bαβ of the curvature tensor, i. e. we obtain

−d n ⋅ d x = bαβ d uα d uβ . (4.126)

This equation is called the second fundamental form of a surface.

3. To compute the bαβ , we use the fact that n is perpendicular to the g α , i. e. that their
scalar product satisfies

n ⋅ g α = 0.

Taking the partial derivative of this equation with respect to the surface coordinates
uβ gives

n,β ⋅ g α = −n ⋅ g α,β ,

from which we get by substituting (4.125) for n,β

− bγ β g γ ⋅ g α = −n ⋅ g α,β ,

i. e. we obtain

bαβ = n ⋅ g α,β . (4.127)

If we substitute the definition (4.101) for n and use |g 1 × g 2 | = √g, we can write with
(4.118) the covariant coordinates of the curvature tensor as a triple product, i. e.

b_{\alpha\beta} = \frac{1}{\sqrt{g}} \, [\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_{\alpha,\beta}]. (4.128)

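According to (2.45), this equation reads in coordinate notation, using the definition
(4.103) of the basis vectors g α ,

b_{\alpha\beta} = \frac{1}{\sqrt{g}} \, \varepsilon_{ijk} \frac{\partial x_i}{\partial u^1} \frac{\partial x_j}{\partial u^2} \frac{\partial^2 x_k}{\partial u^\alpha \partial u^\beta}. (4.129)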

The curvature tensor is clearly symmetric, i. e.

bαβ = bβα , (4.130)

because the order of the partial derivatives with respect to uα and uβ does not matter.
The covariant coordinates of the curvature tensor are often denoted by

b11 =: L, b22 =: N, b12 = b21 =: M. (4.131)
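
The two expressions (4.127) and (4.128) for the covariant coordinates of the curvature tensor can be compared numerically. The sketch below assumes the surface x = (u1, u2, u1² + u2²/2), whose second derivatives are constant; the surface, the evaluation point, and the names are chosen only for illustration.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def curvature_covariant(u1, u2):
    """b_11, b_12, b_22 of the assumed surface x = (u1, u2, u1**2 + u2**2 / 2), computed two ways."""
    g1 = (1.0, 0.0, 2.0 * u1)              # x,1
    g2 = (0.0, 1.0, u2)                    # x,2
    second = {(1, 1): (0.0, 0.0, 2.0),     # g_{1,1}
              (1, 2): (0.0, 0.0, 0.0),     # g_{1,2} = g_{2,1}
              (2, 2): (0.0, 0.0, 1.0)}     # g_{2,2}
    c = cross(g1, g2)
    sqrt_g = math.sqrt(dot(c, c))          # (4.118)
    n = tuple(ci / sqrt_g for ci in c)     # (4.101)
    by_normal = {k: dot(n, v) for k, v in second.items()}            # (4.127)
    by_triple = {k: dot(c, v) / sqrt_g for k, v in second.items()}   # (4.128)
    return by_normal, by_triple

by_normal, by_triple = curvature_covariant(0.5, 1.0)
print(by_normal)
print(by_triple)
```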


Problem 4.22.
The equation z = f (x, y) = x y defines a surface over the Cartesian x, y-plane. Compute
the coordinates bα β and bαβ of the curvature tensor for this surface.
Hint: Use the Cartesian coordinates x and y as surface parameters.

4. In order to derive a coordinate-independent expression for the curvature, it is
reasonable to use the invariants of the curvature tensor. Since the coordinates of the
curvature tensor form a 2,2-matrix, we have, according to (3.65), only two main invariants:
the trace and the determinant. Both invariants have a name in surface theory; the trace
is called mean curvature (after division by 2),

H := \frac{1}{2} \, b^\alpha{}_\alpha = \frac{1}{2} \, (b^1{}_1 + b^2{}_2), (4.132)

and the determinant is called Gaussian curvature,

K := \det b^\alpha{}_\beta = b^1{}_1 \, b^2{}_2 - b^1{}_2 \, b^2{}_1. (4.133)

Problem 4.23.
Write the mean curvature and the Gaussian curvature in terms of E, F, G, L, M, N.

5. We extend our considerations about the curvature of a surface by investigating
curves which result from the intersections of the curved surface and a plane in normal
direction, i. e. a plane at the point P on the curved surface, which contains the
normal vector; such a plane is also called a normal section of the surface. We again
ask for the curvature of such a curve. To answer this question, we find it reasonable
to compute how the tangent vector changes along the curve. To obtain comparable
results, we require that the tangent vector t always has the same length, i. e. that the
curve parameter s is scaled so that

|\mathbf{t}| = \left| \frac{d\mathbf{x}}{ds} \right| = 1.

Then we can define the absolute value of the change of the tangent vector as a measure
for the curvature κ of the curve. In addition, it makes sense to assign a sign to the
curvature, depending on which side of the curve the center of the curvature lies (i. e. the
center of the circle, which approximates the curve at the point P under consideration),
i. e.

\kappa := \pm \left| \frac{d\mathbf{t}}{ds} \right| = \pm \left| \frac{d^2\mathbf{x}}{ds^2} \right|.

From the scaling it follows that t ⋅ t = 1, and hence by differentiating with respect to
the curve parameter,

\frac{d\mathbf{t}}{ds} \cdot \mathbf{t} = 0,


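i. e. d t/d s is perpendicular to t. Since the curve lies in the plane of the normal
section, which is spanned by t and n, d t/d s points in the direction of the normal
vector n or opposite to it. Since in addition |n| = 1, we can write for the curvature

\kappa = \frac{d\mathbf{t}}{ds} \cdot \mathbf{n},

which also determines the sign; if d t/d s and n have the same direction, κ is positive;
if they have opposite directions, it is negative. Furthermore, it follows from t ⋅ n = 0
that

\frac{d\mathbf{t}}{ds} \cdot \mathbf{n} = -\mathbf{t} \cdot \frac{d\mathbf{n}}{ds},

and so we finally obtain with (4.120) the curvature of a normal section as

\kappa = -\frac{d\mathbf{x}}{ds} \cdot \frac{d\mathbf{n}}{ds}. (4.134)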
The plane of a normal section can be rotated arbitrarily around the normal vector. So it
makes sense to ask for which planes the curvature attains its maximum and minimum
values. The answer to this question is the solution to the extremum problem

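\kappa = -\frac{d\mathbf{x}}{ds} \cdot \frac{d\mathbf{n}}{ds} \longrightarrow \text{extremum}

for the tangent vector t, under the constraint t ⋅ t = 1, i. e.

1 - \frac{d\mathbf{x}}{ds} \cdot \frac{d\mathbf{x}}{ds} = 0.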
We introduce a Lagrange multiplier k and the corresponding Lagrange function

h = -\frac{d\mathbf{x}}{ds} \cdot \frac{d\mathbf{n}}{ds} + k \left( 1 - \frac{d\mathbf{x}}{ds} \cdot \frac{d\mathbf{x}}{ds} \right),

substitute the first fundamental form (4.111) and the second fundamental form (4.126),
and use (4.120) to obtain

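h = b_{\alpha\beta} \frac{d u^\alpha}{ds} \frac{d u^\beta}{ds} + k \left( 1 - g_{\alpha\beta} \frac{d u^\alpha}{ds} \frac{d u^\beta}{ds} \right) = b_{\alpha\beta} \, t^\alpha t^\beta + k \, (1 - g_{\alpha\beta} \, t^\alpha t^\beta).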

Thus, the necessary condition 𝜕h/𝜕t γ = 0 for the extrema to exist leads, analogously
to Section 3.12.3, to the generalized eigenvalue problem

(bαβ − k gαβ ) t β = 0. (4.135)

In the theory of surfaces these eigenvalues are called principal curvatures. Prob-
lem 3.14 shows that they are always real, because the bαβ are symmetric and the gαβ
are symmetric and positive definite. The corresponding eigendirections are called
principal curvature directions. The principal curvatures k1 and k2 are as usual computed
by solving the characteristic equation det (bαβ − k gαβ ) = 0. If we use the
mixed coordinates bα β , we can write the characteristic equation, according to the
multiplication theorem (1.13) for determinants, as
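
\det (b_{\alpha\beta} - k \, g_{\alpha\beta}) = \det \left( g_{\alpha\mu} (b^\mu{}_\beta - k \, \delta^\mu_\beta) \right) = (\det g_{\alpha\mu}) \, \left( \det (b^\mu{}_\beta - k \, \delta^\mu_\beta) \right) = 0,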

or, because det gαμ > 0, also as

\det (b^\alpha{}_\beta - k \, \delta^\alpha_\beta) = \begin{vmatrix} b^1{}_1 - k & b^1{}_2 \\ b^2{}_1 & b^2{}_2 - k \end{vmatrix} = \underbrace{(b^1{}_1 \, b^2{}_2 - b^1{}_2 \, b^2{}_1)}_{K} - \underbrace{(b^1{}_1 + b^2{}_2)}_{2H} \, k + k^2 = 0.

Using (4.132) for the mean curvature H and (4.133) for the Gaussian curvature K, we
obtain

k^2 - 2 H k + K = 0. (4.136)

Conversely, Vieta’s formulas give

H = \frac{1}{2} \, (k_1 + k_2), \quad K = k_1 k_2. (4.137)

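
The chain from the covariant coordinates b_αβ to the mixed coordinates, the invariants H and K, and the roots of (4.136) can be sketched as follows. The input numbers are assumed sample values (they belong to the surface x = (u1, u2, u1² + u2²/2) at u1 = 0.5, u2 = 1); the function name is hypothetical.

```python
import math

def principal_curvatures(metric, b):
    """Return (H, K, k1, k2) from metric = (g11, g12, g22) and b = (b11, b12, b22)."""
    g11, g12, g22 = metric
    b11, b12, b22 = b
    g = g11 * g22 - g12 * g12                        # (4.117)
    up11, up12, up22 = g22 / g, -g12 / g, g11 / g    # (4.116)
    # mixed coordinates b^alpha_beta = g^{alpha gamma} b_{gamma beta}
    m11 = up11 * b11 + up12 * b12
    m12 = up11 * b12 + up12 * b22
    m21 = up12 * b11 + up22 * b12
    m22 = up12 * b12 + up22 * b22
    H = 0.5 * (m11 + m22)                            # (4.132)
    K = m11 * m22 - m12 * m21                        # (4.133)
    root = math.sqrt(max(H * H - K, 0.0))            # discriminant of (4.136); clamped against rounding
    return H, K, H + root, H - root

s3 = math.sqrt(3.0)
H, K, k1, k2 = principal_curvatures((2.0, 1.0, 2.0), (2.0 / s3, 0.0, 1.0 / s3))
print(H, K, k1, k2)
```

The relations (4.137) then serve as a consistency check on the computed roots.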
The principal curvature directions corresponding to the principal curvatures k1 and k2
are orthogonal to each other. We see this from the eigenvalue equations

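b_{\alpha\beta} \, \underset{1}{t}{}^\beta = k_1 \, g_{\alpha\beta} \, \underset{1}{t}{}^\beta, \quad b_{\alpha\beta} \, \underset{2}{t}{}^\beta = k_2 \, g_{\alpha\beta} \, \underset{2}{t}{}^\beta,

if we multiply each equation by the other eigenvector and subtract the two equations;
then we have

\underbrace{b_{\alpha\beta} \, \underset{2}{t}{}^\alpha \, \underset{1}{t}{}^\beta - b_{\alpha\beta} \, \underset{1}{t}{}^\alpha \, \underset{2}{t}{}^\beta}_{0} = \underbrace{k_1 \, g_{\alpha\beta} \, \underset{2}{t}{}^\alpha \, \underset{1}{t}{}^\beta - k_2 \, g_{\alpha\beta} \, \underset{1}{t}{}^\alpha \, \underset{2}{t}{}^\beta}_{(k_1 - k_2) \, g_{\alpha\beta} \, \underset{1}{t}{}^\alpha \, \underset{2}{t}{}^\beta}.

The left side is zero due to the symmetry of bαβ ; on the right side we can use the
symmetry of gαβ . Thus we get for k1 ≠ k2

g_{\alpha\beta} \, \underset{1}{t}{}^\alpha \, \underset{2}{t}{}^\beta = 0.

With (4.122), however, this is equivalent to cos φ = 0, i. e. the angle between the
eigendirections \underset{1}{t}{}^\alpha and \underset{2}{t}{}^\beta is φ = 90°.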
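
The orthogonality of the principal curvature directions can be verified numerically for a concrete case. The sketch below solves (4.135) for each eigenvalue and evaluates g_αβ t1^α t2^β; the metric, the curvature coordinates, and the eigenvalues are assumed sample values (surface x = (u1, u2, u1² + u2²/2) at u1 = 0.5, u2 = 1), and the helper names are hypothetical.

```python
import math

def principal_direction(metric, b, k):
    """A nonzero solution t of (b_{alpha beta} - k g_{alpha beta}) t^beta = 0 for an eigenvalue k."""
    g11, g12, g22 = metric
    b11, b12, b22 = b
    # rows of the singular matrix b - k g
    r1 = (b11 - k * g11, b12 - k * g12)
    r2 = (b12 - k * g12, b22 - k * g22)
    # both rows are parallel, so a vector annihilated by the larger row solves both equations
    row = r1 if abs(r1[0]) + abs(r1[1]) >= abs(r2[0]) + abs(r2[1]) else r2
    return (-row[1], row[0])

def inner(metric, a, b):
    """g_{alpha beta} a^alpha b^beta."""
    g11, g12, g22 = metric
    return g11 * a[0] * b[0] + g12 * (a[0] * b[1] + a[1] * b[0]) + g22 * a[1] * b[1]

s3 = math.sqrt(3.0)
metric = (2.0, 1.0, 2.0)
b = (2.0 / s3, 0.0, 1.0 / s3)
k1 = (1.0 + 1.0 / s3) / s3      # roots of (4.136) for these sample values
k2 = (1.0 - 1.0 / s3) / s3
t1 = principal_direction(metric, b, k1)
t2 = principal_direction(metric, b, k2)
print(t1, t2, inner(metric, t1, t2))
```

The directions are not normalized here; orthogonality in the sense of (4.122) does not depend on their lengths.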
Problem 4.24.
Find the principal curvatures and the principal curvature directions for the surface
z = f (x, y) = x y of Problem 4.22.


4.5.5 Derivative Equations, Integrability Conditions

1. In the last section we used the partial derivative n,β of the normal vector to gather
more information about the curvature of a surface. In this section we generalize our
considerations to the covariant basis vectors g 1 and g 2 , i. e. we ask, in all generality,
how the moving frame varies along the surface.

2. We can write the partial derivative of a covariant basis using (4.58) in terms of the
Christoffel symbols as follows:

\mathbf{g}_{i,j} = \Gamma^m_{ij} \, \mathbf{g}_m.

This gives with g 3 = n for the moving frame of a surface

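\mathbf{g}_{\alpha,\beta} = \Gamma^\gamma_{\alpha\beta} \, \mathbf{g}_\gamma + \Gamma^3_{\alpha\beta} \, \mathbf{n}, \quad \mathbf{n}_{,\beta} = \Gamma^\gamma_{3\beta} \, \mathbf{g}_\gamma + \Gamma^3_{3\beta} \, \mathbf{n}. (a)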

From Section 4.5.4 we already know that n,β can be represented as a linear combination
of the g α , so comparing (a) with (4.125) gives

\Gamma^\gamma_{3\beta} = -b^\gamma{}_\beta = -g^{\gamma\alpha} \, b_{\alpha\beta}, \quad \Gamma^3_{3\beta} = 0. (4.138)

In the next step we use g α ⋅ n = 0 and after taking the partial derivatives, we have

n,β ⋅ g α = −n ⋅ g α,β .

Substituting n,β and g α,β and using (a) together with (4.138), we obtain

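- b^\gamma{}_\beta \, \underbrace{\mathbf{g}_\gamma \cdot \mathbf{g}_\alpha}_{g_{\gamma\alpha}} = -\mathbf{n} \cdot (\Gamma^\gamma_{\alpha\beta} \, \mathbf{g}_\gamma + \Gamma^3_{\alpha\beta} \, \mathbf{n}) = -\Gamma^\gamma_{\alpha\beta} \, \underbrace{\mathbf{g}_\gamma \cdot \mathbf{n}}_{0} - \Gamma^3_{\alpha\beta} \, \underbrace{\mathbf{n} \cdot \mathbf{n}}_{1},

b_{\alpha\beta} = \Gamma^3_{\alpha\beta}. (4.139)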

Thus, we finally obtain for the partial derivative of the moving frame

\mathbf{g}_{\alpha,\beta} = \Gamma^\gamma_{\alpha\beta} \, \mathbf{g}_\gamma + b_{\alpha\beta} \, \mathbf{n}, (4.140)

\mathbf{n}_{,\beta} = -b^\gamma{}_\beta \, \mathbf{g}_\gamma. (4.141)

These equations are known in the theory of surfaces as the derivative equations of
Gauss and Weingarten. The bαβ are, according to (4.128), the covariant coordinates of
the curvature tensor. The Christoffel symbols Γ^γ_αβ can be computed from (4.60).
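
The Gauss derivative equation (4.140) can be checked numerically at a single point. Instead of (4.60), the sketch below obtains the Christoffel symbols by scalar multiplication of (4.140) with the contravariant basis vectors, which with (4.106) gives Γ^γ_αβ = g^γ ⋅ g_α,β; this shortcut, the surface x = (u1, u2, u1² + u2²/2), and the evaluation point are assumptions made for illustration.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

# Assumed surface at u1 = 0.5, u2 = 1: first and second derivatives of the position vector
g_low = [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]              # g_1, g_2
second = [[(0.0, 0.0, 2.0), (0.0, 0.0, 0.0)],           # g_{1,1}, g_{1,2}
          [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]]           # g_{2,1}, g_{2,2}

c = cross(g_low[0], g_low[1])
sqrt_g = math.sqrt(dot(c, c))
n = tuple(ci / sqrt_g for ci in c)
g_up = [tuple(x / sqrt_g for x in cross(g_low[1], n)),  # g^1, using [g1, g2, n] = sqrt(g)
        tuple(x / sqrt_g for x in cross(n, g_low[0]))]  # g^2

max_residual = 0.0
for a in range(2):
    for b in range(2):
        gamma = [dot(g_up[k], second[a][b]) for k in range(2)]   # Gamma^k_{ab}
        b_ab = dot(n, second[a][b])                              # (4.127)
        rebuilt = [gamma[0] * g_low[0][i] + gamma[1] * g_low[1][i] + b_ab * n[i]
                   for i in range(3)]                            # right side of (4.140)
        max_residual = max(max_residual,
                           max(abs(rebuilt[i] - second[a][b][i]) for i in range(3)))
print(max_residual)
```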

3. A by-product of obtaining the derivative equations is that, using (4.139), we can
write the coordinates of the curvature tensor in terms of the Christoffel symbols. The
Christoffel symbols themselves are related to the partial derivatives of the metric coefficients
according to (4.60), and thus the curvature properties of a surface are already
determined by the metric coefficients.
Thus, we can, for a given set of metric coefficients, interpret the derivative equa-
tions (4.140) and (4.141) as the determining partial differential equations of the moving
frame. However, the metric coefficients cannot be chosen arbitrarily. Since (4.140) and
(4.141) consist of a total of eighteen equations for only nine vector coordinates, certain
integrability conditions must be satisfied; otherwise the moving frame cannot be con-
sistently determined.
The integrability conditions follow from the interchangeability of the mixed sec-
ond partial derivatives. From (4.140) we have
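
\mathbf{g}_{\alpha,\beta\delta} = (\Gamma^\gamma_{\alpha\beta} \, \mathbf{g}_\gamma + b_{\alpha\beta} \, \mathbf{n})_{,\delta} = \Gamma^\gamma_{\alpha\beta,\delta} \, \mathbf{g}_\gamma + \Gamma^\gamma_{\alpha\beta} \, \mathbf{g}_{\gamma,\delta} + b_{\alpha\beta,\delta} \, \mathbf{n} + b_{\alpha\beta} \, \mathbf{n}_{,\delta}.

The second term on the right side can be written with the help of renaming indices
and equation (4.140) as

\Gamma^\gamma_{\alpha\beta} \, \mathbf{g}_{\gamma,\delta} = \Gamma^\nu_{\alpha\beta} \, \mathbf{g}_{\nu,\delta} = \Gamma^\nu_{\alpha\beta} \, (\Gamma^\gamma_{\nu\delta} \, \mathbf{g}_\gamma + b_{\nu\delta} \, \mathbf{n}),

and from (4.141) it further follows that the fourth term equals

b_{\alpha\beta} \, \mathbf{n}_{,\delta} = -b_{\alpha\beta} \, b^\gamma{}_\delta \, \mathbf{g}_\gamma.

Thus we obtain for g_α,βδ , after sorting the coefficients of g_γ and n,

\mathbf{g}_{\alpha,\beta\delta} = (\Gamma^\gamma_{\alpha\beta,\delta} + \Gamma^\nu_{\alpha\beta} \Gamma^\gamma_{\nu\delta} - b_{\alpha\beta} \, b^\gamma{}_\delta) \, \mathbf{g}_\gamma + (b_{\alpha\beta,\delta} + \Gamma^\nu_{\alpha\beta} \, b_{\nu\delta}) \, \mathbf{n}.

We now exchange β and δ, so

\mathbf{g}_{\alpha,\delta\beta} = (\Gamma^\gamma_{\alpha\delta,\beta} + \Gamma^\nu_{\alpha\delta} \Gamma^\gamma_{\nu\beta} - b_{\alpha\delta} \, b^\gamma{}_\beta) \, \mathbf{g}_\gamma + (b_{\alpha\delta,\beta} + \Gamma^\nu_{\alpha\delta} \, b_{\nu\beta}) \, \mathbf{n},

and subtract the two expressions. Since g_α,βδ − g_α,δβ = 0, it follows from the right
sides that

0 = \left( \Gamma^\gamma_{\alpha\beta,\delta} - \Gamma^\gamma_{\alpha\delta,\beta} + \Gamma^\nu_{\alpha\beta} \Gamma^\gamma_{\nu\delta} - \Gamma^\nu_{\alpha\delta} \Gamma^\gamma_{\nu\beta} - (b_{\alpha\beta} \, b^\gamma{}_\delta - b_{\alpha\delta} \, b^\gamma{}_\beta) \right) \mathbf{g}_\gamma + (b_{\alpha\beta,\delta} - b_{\alpha\delta,\beta} + \Gamma^\nu_{\alpha\beta} \, b_{\nu\delta} - \Gamma^\nu_{\alpha\delta} \, b_{\nu\beta}) \, \mathbf{n}.

Since g_γ and n are linearly independent, this equation is satisfied only if the
parentheses in front of g_γ and n are zero, and this gives the following integrability
conditions:

\Gamma^\gamma_{\alpha\beta,\delta} - \Gamma^\gamma_{\alpha\delta,\beta} + \Gamma^\nu_{\alpha\beta} \Gamma^\gamma_{\nu\delta} - \Gamma^\nu_{\alpha\delta} \Gamma^\gamma_{\nu\beta} = b_{\alpha\beta} \, b^\gamma{}_\delta - b_{\alpha\delta} \, b^\gamma{}_\beta, (a)

b_{\alpha\beta,\delta} - \Gamma^\nu_{\alpha\delta} \, b_{\nu\beta} = b_{\alpha\delta,\beta} - \Gamma^\nu_{\alpha\beta} \, b_{\nu\delta}.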


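The expression on the left side of (a) is called the (mixed) Riemann curvature tensor of
the surface,

R^\mu{}_{\alpha\beta\gamma} := \Gamma^\mu_{\alpha\gamma,\beta} - \Gamma^\mu_{\alpha\beta,\gamma} + \Gamma^\nu_{\alpha\gamma} \Gamma^\mu_{\nu\beta} - \Gamma^\nu_{\alpha\beta} \Gamma^\mu_{\nu\gamma}, (4.142)

i. e. we can write the integrability conditions as

R^\mu{}_{\alpha\beta\gamma} = b^\mu{}_\beta \, b_{\alpha\gamma} - b^\mu{}_\gamma \, b_{\alpha\beta}, (4.143)

b_{\alpha\beta,\delta} - \Gamma^\nu_{\alpha\delta} \, b_{\nu\beta} = b_{\alpha\delta,\beta} - \Gamma^\nu_{\alpha\beta} \, b_{\nu\delta}. (4.144)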

In a three-dimensional Cartesian coordinate system all Christoffel symbols are zero
by (4.60), because the metric coefficients gij = δij are constant. Hence the three-dimensional
Riemann curvature tensor Ri jkl is the zero tensor of fourth order (see Problem 4.18).
This statement is also true for all three-dimensional curvilinear coordinate
systems, which are defined with respect to a Cartesian coordinate system. However, for
a curved surface, i. e. a two-dimensional structure embedded in the three-dimensional
space, this statement is not true. If the coordinates of the curvature tensor are nonzero,
then it follows from (4.143) that the two-dimensional Riemann curvature tensor Rμ αβγ
is a nonzero tensor.
We can bring the integrability condition (4.144) into a more familiar form, if we
subtract the term Γ^ν_βδ b_αν on both sides and use the symmetry of the Christoffel symbols
Γ^ν_βδ = Γ^ν_δβ with respect to the lower indices. Then we have

\underbrace{b_{\alpha\beta,\delta} - \Gamma^\nu_{\alpha\delta} \, b_{\nu\beta} - \Gamma^\nu_{\beta\delta} \, b_{\alpha\nu}}_{b_{\alpha\beta}|_\delta} = \underbrace{b_{\alpha\delta,\beta} - \Gamma^\nu_{\alpha\beta} \, b_{\nu\delta} - \Gamma^\nu_{\delta\beta} \, b_{\alpha\nu}}_{b_{\alpha\delta}|_\beta}.

Comparing this equation with (4.61) shows that it represents the covariant derivatives
of the entirely covariant coordinates of the curvature tensor, i. e. instead of (4.144) we
can also write

bαβ |γ − bαγ |β = 0. (4.145)

In the theory of surfaces (4.145) is called the Mainardi–Codazzi equation.

Problem 4.25.
Obtain the integrability conditions (4.143) and (4.144) from the relation Ri jkl = 0 for
the three-dimensional Riemann curvature tensor.

4. If we consider the entirely covariant coordinates of the Riemann curvature tensor,
we can make further statements. From Rμ βγδ = bμ γ bβδ − bμ δ bβγ it follows with (4.143),
after multiplication by gαμ , that

Rαβγδ = bαγ bβδ − bαδ bβγ .


If we change the order of the first two indices, we get

Rβαγδ = bβγ bαδ − bβδ bαγ = − (bαγ bβδ − bαδ bβγ ) = −Rαβγδ ,

and correspondingly, if we change the order of the last two indices, we obtain

Rαβδγ = bαδ bβγ − bαγ bβδ = − (bαγ bβδ − bαδ bβγ ) = −Rαβγδ .

Thus, the entirely covariant coordinates of the Riemann curvature tensor are antimet-
ric with respect to the first two indices and also with respect to the last two indices.
In other words, 12 of the 16 coordinates are zero, namely the coordinates where the
first and second index or the third and fourth index are equal. For the remaining four
coordinates we have R1212 = −R1221 = R2121 = −R2112 , i. e. the entirely covariant coordi-
nates of the Riemann curvature tensor are symmetric also with respect to the first and
second index pair. It remains to compute only one coordinate, and we choose R1212 , so
we write

R1212 = b11 b22 − b12 b21 = det bμν .

Using the multiplication theorem (1.13) and with (4.133), we can write this determinant
in terms of the Gaussian curvature K, i. e.

R1212 = det bμν = det (bλ ν gλμ ) = (det bλ ν ) (det gλμ ) = K g.

In the last step we used properties of symmetry and antimetry that are similar to those
of permutation symbols, so we write

Rαβγδ = K g εαβ εγδ .

However, because of the permutation symbols this is not an equation between tensor
coordinates. To make this an equation for tensor coordinates, we first have to introduce
the holonomic coordinates of a two-dimensional ε-tensor. For the shared set of index
values 1,2 we set μ = i, ν = j; then we can write εμν as εμν = εij3 . We define the
covariant coordinates analogously as

eμν := eij3 . (4.146)

Then it follows from (4.34)1 with (4.119) that

eμν = eij3 = √g εij3 = √g εμν ,

i. e. we finally obtain for the Riemann curvature tensor

Rαβγδ = K eαβ eγδ . (4.147)
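
The structure expressed by (4.147) can be illustrated numerically: for any symmetric b_αβ, the sixteen coordinates R_αβγδ = b_αγ b_βδ − b_αδ b_βγ agree with K e_αβ e_γδ, where K follows from R_1212 = K g. The metric and curvature values below are arbitrary assumed sample data, not taken from a particular surface.

```python
import math

# Assumed sample data: covariant metric coefficients and symmetric curvature coordinates
g_cov = [[2.0, 1.0], [1.0, 2.0]]
b_cov = [[1.2, 0.3], [0.3, 0.5]]

g = g_cov[0][0] * g_cov[1][1] - g_cov[0][1] * g_cov[1][0]          # (4.117)
K = (b_cov[0][0] * b_cov[1][1] - b_cov[0][1] * b_cov[1][0]) / g    # from R_1212 = det b = K g
eps = [[0.0, 1.0], [-1.0, 0.0]]                                    # two-dimensional permutation symbol
e_cov = [[math.sqrt(g) * eps[m][n] for n in range(2)] for m in range(2)]   # e_{mu nu} = sqrt(g) eps_{mu nu}

max_diff = 0.0
for a in range(2):
    for b in range(2):
        for c in range(2):
            for d in range(2):
                riemann = b_cov[a][c] * b_cov[b][d] - b_cov[a][d] * b_cov[b][c]
                max_diff = max(max_diff, abs(riemann - K * e_cov[a][b] * e_cov[c][d]))
print(K, max_diff)
```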


Problem 4.26.
Show that the eμν satisfy the transformation law for the covariant coordinates of a
second-order surface tensor.

Problem 4.27.
Compute, using the Christoffel symbols, the Gaussian curvature of a spherical surface
and of a cylindrical surface.

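5. For completeness it remains to check if further integrability conditions can be
obtained from (4.141). We proceed similarly to No. 3 and compute

\mathbf{n}_{,\beta\delta} = (-b^\gamma{}_\beta \, \mathbf{g}_\gamma)_{,\delta} = -b^\gamma{}_{\beta,\delta} \, \mathbf{g}_\gamma - b^\gamma{}_\beta \, \mathbf{g}_{\gamma,\delta}.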

Substituting (4.140) and sorting with respect to the g γ and n gives

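\mathbf{n}_{,\beta\delta} = -(b^\gamma{}_{\beta,\delta} + b^\nu{}_\beta \, \Gamma^\gamma_{\nu\delta}) \, \mathbf{g}_\gamma - b^\nu{}_\beta \, b_{\nu\delta} \, \mathbf{n}.

We again exchange β and δ, so

\mathbf{n}_{,\delta\beta} = -(b^\gamma{}_{\delta,\beta} + b^\nu{}_\delta \, \Gamma^\gamma_{\nu\beta}) \, \mathbf{g}_\gamma - b^\nu{}_\delta \, b_{\nu\beta} \, \mathbf{n},

and subtract the two expressions. Because n,βδ − n,δβ = 0 it follows from the right
sides that

0 = -(b^\gamma{}_{\beta,\delta} - b^\gamma{}_{\delta,\beta} + b^\nu{}_\beta \, \Gamma^\gamma_{\nu\delta} - b^\nu{}_\delta \, \Gamma^\gamma_{\nu\beta}) \, \mathbf{g}_\gamma - (b^\nu{}_\beta \, b_{\nu\delta} - b^\nu{}_\delta \, b_{\nu\beta}) \, \mathbf{n},

i. e.

b^\gamma{}_{\beta,\delta} + \Gamma^\gamma_{\nu\delta} \, b^\nu{}_\beta = b^\gamma{}_{\delta,\beta} + \Gamma^\gamma_{\nu\beta} \, b^\nu{}_\delta, (a)

b_{\nu\delta} \, b^\nu{}_\beta - b_{\nu\beta} \, b^\nu{}_\delta = 0 (b)

must hold. Again we can write (a) in terms of covariant derivatives, if we subtract
Γ^ν_βδ b^γ_ν on both sides and take Γ^ν_βδ = Γ^ν_δβ into account. Then we have

\underbrace{b^\gamma{}_{\beta,\delta} + \Gamma^\gamma_{\nu\delta} \, b^\nu{}_\beta - \Gamma^\nu_{\beta\delta} \, b^\gamma{}_\nu}_{b^\gamma{}_\beta|_\delta} = \underbrace{b^\gamma{}_{\delta,\beta} + \Gamma^\gamma_{\nu\beta} \, b^\nu{}_\delta - \Gamma^\nu_{\delta\beta} \, b^\gamma{}_\nu}_{b^\gamma{}_\delta|_\beta}.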

This equation is equivalent to (4.145), with the only difference that here we used the
mixed coordinates of the curvature tensor.


We can write (b), by comparing it with (4.143), in terms of the Riemann curvature
tensor. Thus we get with (4.147)

0 = R^\nu{}_{\nu\beta\delta} = g^{\mu\nu} \, R_{\mu\nu\beta\delta} = K \, g^{\mu\nu} \, e_{\mu\nu} \, e_{\beta\delta}.

This equation is satisfied since g μν is symmetric and eμν is antimetric.

Thus, no further integrability conditions can be obtained from (4.141).

5 Representation of Tensor Functions
5.1 The Basic Idea of the Representation of Tensor Functions
1. Modeling a physical process often means finding a function relating two tensors.
From the context we can assume that such a function exists, but it is not known and it
can often be determined only from experiments. Typical examples of such functions
are the stress–displacement relation T = f (D) for an elastic solid, where the (sym-
metric) stress tensor T depends on an (also symmetric) displacement tensor D; or the
yield-stress condition σ = f (T) in plasticity theory, which relates the experimentally
determined yield-stress σ to the three-dimensional stress state described by the stress
tensor T.
The functions f or f in the examples are called, according to the order of the ten-
sors, scalar-valued or tensor-valued tensor functions. Contrary to what one may ex-
pect, these functions cannot have an arbitrary form, because the transformation prop-
erties of a tensor impose certain restrictions. For example, if in a given Cartesian co-
ordinate system σ = f (Tij ) and Tij = fij (Dkl ), then, under a change of the coordinate
system, substituting the transformed coordinates T̃ ij into the function f must result in
the same scalar σ = f (T̃ ij ), and similarly, substituting the transformed coordinates
D̃ kl into the function f must give the transformed coordinates T̃ ij = fij (D̃ kl ). If we
assume that the tensors are polar, then, according to the transformation equations (2.17),
a scalar-valued function f must satisfy

f (Tmn ) = f (αim αjn Tij ), (a)

and a tensor-valued function f is restricted by the condition

αmi αnj fmn (Dkl ) = fij (αpk αql Dpq ). (b)

2. How to evaluate conditions of the type (a) or (b) and how to use them to construct
functions relating tensors is the subject of a theory called the theory of the representa-
tion of tensor functions. The basic idea is that a scalar-valued function cannot depend
on single tensor coordinates, but only on scalar combinations of tensor coordinates,
which are clearly invariant under a coordinate transformation. This explains why, in-
stead of the theory of the representation of tensor functions, we also speak of the the-
ory of invariants. We will show that for any symmetric tensor, such as the stress ten-
sor T, the three basic invariants tr T, tr T 2 , and tr T 3 form a complete set of invariants,
so we can write the scalar-valued function f in our example as a function of the three
basic invariants (or, according to Section 3.15, of another equivalent set of invariants,
such as the eigenvalues or the main invariants), so we have

σ = f (tr T, tr T 2 , tr T 3 ).
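
As an illustration of a scalar-valued function written in terms of the basic invariants, the following sketch evaluates the von Mises equivalent stress. This particular yield function is not taken from the text; it is only a well-known example that happens to depend on tr T and tr T 2 alone, and the function names and stress values are assumptions.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def basic_invariants(t):
    """Return (tr T, tr T^2, tr T^3)."""
    t2 = matmul(t, t)
    return trace(t), trace(t2), trace(matmul(t2, t))

def yield_function(i1, i2, i3):
    """Sample f(tr T, tr T^2, tr T^3): von Mises equivalent stress (i3 is unused)."""
    j2 = 0.5 * (i2 - i1 * i1 / 3.0)   # second invariant of the stress deviator
    return math.sqrt(3.0 * j2)

uniaxial = [[200.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
shear = [[0.0, 50.0, 0.0], [50.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

sigma_uniaxial = yield_function(*basic_invariants(uniaxial))
sigma_shear = yield_function(*basic_invariants(shear))
print(sigma_uniaxial, sigma_shear)
```

Because only invariants enter, the result is the same in every Cartesian coordinate system.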


These considerations also apply to tensor-valued functions, if we make them
scalar-valued by introducing an auxiliary tensor. In our example, T = f (D) then gives

T ⋅⋅ H = f (D, H),

where the auxiliary tensor H must appear in such a way that it can be canceled out in
the end.
Before we turn to the details, we have to clarify how many tensor invariants are
needed for a (yet to be defined) representation to be complete. For this we first need to
generalize the Cayley–Hamilton theorem, which in its basic version relates only pow-
ers of the same tensor, while in the theory of the representation of tensor functions,
products of different tensors often appear as well. Since the tensors in physical appli-
cations are often polar, we restrict ourselves to polar tensors for the rest of this chapter
(this includes scalars and vectors as tensors of order zero and one, respectively), with-
out mentioning this every time. In the few cases where we consider axial tensors, or if
the distinction between polar and axial tensors is important for the argument, we will
mention this explicitly.

3. In physics, the representation of tensor functions is as important as dimensional
analysis. Physical quantities are the product of a numerical value and a unit, but the
quantity itself is independent of the chosen unit, i. e. if we change the unit, the numerical
value changes correspondingly. This property carries over to equations between phys-
ical quantities, where we say that an equation must be dimensionally homogeneous.
Specifically, it follows that physical quantities cannot be related by an arbitrary func-
tion, but only by functions which are invariant under a change of units. This condition
of invariance under a change of units has its analog for tensors in the relations (a) and
(b), which result from the requirement that an equation must be invariant under a
change of the coordinate system.
In dimensional analysis we use the condition of invariance under a change of
units to replace relations between quantities by relations between dimensionless com-
binations of quantities (which do not need units). In many cases, this leads to restric-
tions on the permissible functions. Analogously, in the theory of the representation of
tensor functions, we change to scalar combinations of tensor coordinates (i. e. tensor
invariants), which again leads to restrictions on the permissible functions.

5.2 Generalized Cayley–Hamilton Theorem

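1. According to (1.20), δpqrs = 0, and then (1.19) gives

\[
\begin{vmatrix}
\delta_{ip} & \delta_{iq} & \delta_{ir} & \delta_{is} \\
\delta_{jp} & \delta_{jq} & \delta_{jr} & \delta_{js} \\
\delta_{kp} & \delta_{kq} & \delta_{kr} & \delta_{ks} \\
\delta_{lp} & \delta_{lq} & \delta_{lr} & \delta_{ls}
\end{vmatrix} = 0.
\]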

Contraction with api bqj crk gives, if we multiply the first row by api , the second row by
bqj , and the third row by crk ,

\[
\begin{vmatrix}
a_{pp} & a_{pq} & a_{pr} & a_{ps} \\
b_{qp} & b_{qq} & b_{qr} & b_{qs} \\
c_{rp} & c_{rq} & c_{rr} & c_{rs} \\
\delta_{lp} & \delta_{lq} & \delta_{lr} & \delta_{ls}
\end{vmatrix} = 0.
\]

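We expand this determinant along the last row, so we obtain

\[
\begin{aligned}
&-\delta_{lp}
\begin{vmatrix} a_{pq} & a_{pr} & a_{ps} \\ b_{qq} & b_{qr} & b_{qs} \\ c_{rq} & c_{rr} & c_{rs} \end{vmatrix}
+\delta_{lq}
\begin{vmatrix} a_{pp} & a_{pr} & a_{ps} \\ b_{qp} & b_{qr} & b_{qs} \\ c_{rp} & c_{rr} & c_{rs} \end{vmatrix} \\
&-\delta_{lr}
\begin{vmatrix} a_{pp} & a_{pq} & a_{ps} \\ b_{qp} & b_{qq} & b_{qs} \\ c_{rp} & c_{rq} & c_{rs} \end{vmatrix}
+\delta_{ls}
\begin{vmatrix} a_{pp} & a_{pq} & a_{pr} \\ b_{qp} & b_{qq} & b_{qr} \\ c_{rp} & c_{rq} & c_{rr} \end{vmatrix}
= 0.
\end{aligned}
\]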

Multiplying δlp , δlq , and δlr into the first, second, and third row of the first three
determinants, respectively, gives

\[
\begin{aligned}
&-\begin{vmatrix} a_{lq} & a_{lr} & a_{ls} \\ b_{qq} & b_{qr} & b_{qs} \\ c_{rq} & c_{rr} & c_{rs} \end{vmatrix}
+\begin{vmatrix} a_{pp} & a_{pr} & a_{ps} \\ b_{lp} & b_{lr} & b_{ls} \\ c_{rp} & c_{rr} & c_{rs} \end{vmatrix}
-\begin{vmatrix} a_{pp} & a_{pq} & a_{ps} \\ b_{qp} & b_{qq} & b_{qs} \\ c_{lp} & c_{lq} & c_{ls} \end{vmatrix} \\
&+\delta_{ls}
\begin{vmatrix} a_{pp} & a_{pq} & a_{pr} \\ b_{qp} & b_{qq} & b_{qr} \\ c_{rp} & c_{rq} & c_{rr} \end{vmatrix}
= 0.
\end{aligned}
\]

We expand each of the determinants and obtain

− alq bqr crs − alr bqs crq − als bqq crr + als bqr crq + alr bqq crs + alq bqs crr
+ app blr crs + apr bls crp + aps blp crr − aps blr crp − apr blp crs − app bls crr
− app bqq cls − apq bqs clp − aps bqp clq + aps bqq clp + apq bqp cls + app bqs clq
+ δls (app bqq crr + apq bqr crp + apr bqp crq − apr bqq crp − apq bqp crr
− app bqr crq ) = 0.

Translating this expression into symbolic notation gives

− a ⋅ b ⋅ c − a ⋅ c ⋅ b − a tr b tr c + a tr(b ⋅ c) + a ⋅ c tr b + a ⋅ b tr c
+ b ⋅ c tr a + b tr(a ⋅ c) + b ⋅ a tr c − b ⋅ c ⋅ a − b ⋅ a ⋅ c − b tr a tr c
− c tr a tr b − c ⋅ a ⋅ b − c ⋅ b ⋅ a + c ⋅ a tr b + c tr(a ⋅ b) + c ⋅ b tr a
+ δ (tr a tr b tr c + tr(a ⋅ b ⋅ c) + tr(a ⋅ c ⋅ b) − tr(a ⋅ c) tr b − tr(a ⋅ b) tr c
− tr(b ⋅ c) tr a) = 0.

Finally, after sorting these terms and reversing the signs, we obtain

a ⋅ (b ⋅ c + c ⋅ b) + b ⋅ (a ⋅ c + c ⋅ a) + c ⋅ (a ⋅ b + b ⋅ a)
− (b ⋅ c + c ⋅ b) tr a − (a ⋅ c + c ⋅ a) tr b − (a ⋅ b + b ⋅ a) tr c
+ [tr b tr c − tr(b ⋅ c)] a + [tr a tr c − tr(a ⋅ c)] b
+ [tr a tr b − tr(a ⋅ b)] c (5.1)
− [tr a tr b tr c + tr(a ⋅ b ⋅ c) + tr(c ⋅ b ⋅ a)
− tr(b ⋅ c) tr a − tr(a ⋅ c) tr b − tr(a ⋅ b) tr c] δ = 0.
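
The identity just obtained holds for arbitrary second-order tensors in three dimensions, so it can be checked with any three sample matrices. The sketch below evaluates the left-hand side term by term; the matrices and the function name `generalized_ch_lhs` are assumptions for illustration only. With integer entries the result is exactly the zero matrix.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def add(*ms):
    """Elementwise sum of several 3x3 matrices."""
    return [[sum(m[i][j] for m in ms) for j in range(3)] for i in range(3)]

def scale(s, m):
    return [[s * x for x in row] for row in m]

DELTA = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def generalized_ch_lhs(a, b, c):
    """Left-hand side of the generalized Cayley-Hamilton theorem, row by row."""
    ta, tb, tc = trace(a), trace(b), trace(c)
    bc = add(matmul(b, c), matmul(c, b))   # b.c + c.b
    ac = add(matmul(a, c), matmul(c, a))   # a.c + c.a
    ab = add(matmul(a, b), matmul(b, a))   # a.b + b.a
    t_bc, t_ac, t_ab = trace(matmul(b, c)), trace(matmul(a, c)), trace(matmul(a, b))
    row1 = add(matmul(a, bc), matmul(b, ac), matmul(c, ab))
    row2 = add(scale(-ta, bc), scale(-tb, ac), scale(-tc, ab))
    row3 = add(scale(tb * tc - t_bc, a), scale(ta * tc - t_ac, b),
               scale(ta * tb - t_ab, c))
    scalar = (ta * tb * tc + trace(matmul(matmul(a, b), c))
              + trace(matmul(matmul(c, b), a))
              - t_bc * ta - t_ac * tb - t_ab * tc)
    return add(row1, row2, row3, scale(-scalar, DELTA))

a = [[1, 2, 0], [-1, 3, 4], [2, 0, 1]]
b = [[0, 1, 1], [5, -2, 0], [3, 1, 2]]
c = [[2, 0, -1], [1, 1, 0], [0, 4, 3]]
residual = generalized_ch_lhs(a, b, c)
print(residual)
```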

2. If we set a = b = c, then (5.1) simplifies to

6 a3 − 6 tr a a2 + 3 [tr2 a − tr a2 ] a − [tr3 a − 3 tr a tr a2 + 2 tr a3 ] δ = 0.

After dividing by −6 and using (3.98) we see that this is the Cayley–Hamilton theorem
(3.94), so we call (5.1) the generalized Cayley–Hamilton theorem.

3. For the representation of tensor functions, the first row of (5.1) is of particular im-
portance, so we introduce the following symbol:

Σ := a ⋅ (b ⋅ c + c ⋅ b) + b ⋅ (a ⋅ c + c ⋅ a) + c ⋅ (a ⋅ b + b ⋅ a) . (5.2)

Solving (5.1) for the first row, we then obtain

Σ = (b ⋅ c + c ⋅ b) tr a + (a ⋅ c + c ⋅ a) tr b + (a ⋅ b + b ⋅ a) tr c
− [tr b tr c − tr(b ⋅ c)] a − [tr a tr c − tr(a ⋅ c)] b
− [tr a tr b − tr(a ⋅ b)] c (5.3)
+ [tr a tr b tr c + tr(a ⋅ b ⋅ c) + tr(c ⋅ b ⋅ a)
− tr(b ⋅ c) tr a − tr(a ⋅ c) tr b − tr(a ⋅ b) tr c] δ.

5.3 Invariants of Vectors and Second-Order Tensors

Let us recall our definition of an invariant. An invariant is a scalar combination of
tensor coordinates which (for polar scalars) does not change under a change of the
coordinate system, or (for axial scalars) at most its sign changes. Following our def-
inition from Section 5.1, however, we use the term invariant in this chapter only for
polar scalars.

Problem 5.1.
Show, using the transformation equations (2.17) or (2.18), that tr T and tr T 2 are invari-
ants of a (polar or axial) second-order tensor.

In Section 3.15, we saw that the eigenvalues, the main invariants, and the basic
invariants are three different sets of three invariants of a second-order tensor. The el-
ements of each of these three sets are independent of each other, i. e. no element of a
set can be computed from the other two elements of the same set; but the three sets are
not independent of each other, i. e. if we know the invariants of one set, then we can
compute the invariants of the other sets. However, it is not clear if the three invariants
of a set form a complete set, i. e. if further invariants exist, which cannot be computed
from these invariants. A complete set of invariants is called a basis (of invariants).
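
That one set of invariants determines another can be made concrete for the basic and the main invariants. The sketch below uses the standard relations A″ = tr T, A′ = ½ (tr2 T − tr T 2 ), A = (tr3 T − 3 tr T tr T 2 + 2 tr T 3 )/6, which are presumably the relations referred to in Section 3.15; the sample tensor and the function names are assumptions.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def basic_invariants(t):
    """Return (tr T, tr T^2, tr T^3)."""
    t2 = matmul(t, t)
    return trace(t), trace(t2), trace(matmul(t2, t))

def main_from_basic(t1, t2, t3):
    """Main invariants (A'', A', A) expressed through the basic invariants."""
    return (Fraction(t1),
            Fraction(t1 * t1 - t2, 2),
            Fraction(t1 ** 3 - 3 * t1 * t2 + 2 * t3, 6))

def det3(t):
    return (t[0][0] * (t[1][1] * t[2][2] - t[1][2] * t[2][1])
            - t[0][1] * (t[1][0] * t[2][2] - t[1][2] * t[2][0])
            + t[0][2] * (t[1][0] * t[2][1] - t[1][1] * t[2][0]))

def sum_principal_minors(t):
    return (t[0][0] * t[1][1] - t[0][1] * t[1][0]
            + t[0][0] * t[2][2] - t[0][2] * t[2][0]
            + t[1][1] * t[2][2] - t[1][2] * t[2][1])

T = [[2, 1, 0], [1, 3, -1], [0, -1, 5]]   # arbitrary sample tensor
basic = basic_invariants(T)
main = main_from_basic(*basic)
print(basic, main)
```

For this sample the values computed from the traces agree with the trace, the sum of the principal minors, and the determinant computed directly.
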
The easiest way to compute invariants is from appropriate contractions of tensor
coordinates. Examples are the main invariants of a second-order tensor. Such invariants
are polynomials of tensor coordinates and we call them polynomial invariants.
In the theory of the representation of tensor functions we usually consider only poly-
nomial invariants and we use the following terms. A polynomial invariant which can
be written as a polynomial of other polynomial invariants is called reducible (with
respect to these invariants); otherwise it is called irreducible. A complete set of irre-
ducible invariants, i. e. a set such that all other polynomial invariants can be written
in terms of these irreducible invariants, is called an integrity basis. However, it is also
possible that polynomial invariants are related by a polynomial and yet none of them
can be written as a polynomial of the other invariants. Such polynomials are called
syzygies;1 an example is equation (2.46) for the scalar products of the six vectors a, b,
c, d, e, and f .
An example of nonpolynomial invariants are the eigenvalues of a second-order
tensor. We computed them in Section 3.11.2 from a cubic equation whose coefficients
are polynomial invariants, but an eigenvalue cannot be written as a polynomial of
tensor coordinates.

1 Greek: pair (from syn [together] and zygon [yoke]), i. e. literally, a pair coupled together by a yoke,
e. g. in astronomy a generic term for conjunction and opposition of two planets, and in poetry the
juxtaposition of two equal metrical feet.

The polynomial invariants are a subset of all the invariants of a tensor. Since
reducibility refers only to polynomial relations, an irreducible invariant may still
depend on other invariants in a nonpolynomial way; hence there can be more irreducible
invariants than independent invariants. In other words, an integrity basis can contain
more elements than a basis.

5.3.1 Invariants of Vectors

A single vector u has only one irreducible invariant, i. e. its square ui ui .

From two vectors u and v we can form a total of three irreducible invariants:
– the squares ui ui and vi vi of each of the vectors;
– the scalar product ui vi .

The scalar product is an example of a so-called simultaneous invariant, i. e. an invariant
which consists of coordinates of different tensors.
As the number of vectors increases, the number of irreducible invariants quickly
increases. For three vectors u, v, w, there are already seven independent scalar
combinations:
– the squares ui ui , vi vi , wi wi , of each of the vectors;
– the scalar products ui vi , ui wi , vi wi , of each of the vector pairs;
– the triple product εijk ui vj wk of all three vectors.

The triple product plays a particular role. If the vectors u, v, w are polar, then the first
six invariants are also polar scalars, whereas the triple product is, due to the ε-tensor,
an axial scalar. However, according to our definition, an axial scalar is no invariant,
i. e. from three (polar) vectors u, v, w, we can form six irreducible invariants.
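
The different behavior of the scalar products and the triple product shows up as soon as the coordinate transformation is a reflection. The following sketch is only an illustration: the three vectors, the mirror transformation, and the helper names are assumptions, and the triple product is evaluated by the same formula εijk ui vj wk in both coordinate systems.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def triple(u, v, w):
    """eps_ijk u_i v_j w_k, written out as u . (v x w)."""
    cross = [v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0]]
    return dot(u, cross)

def transform(alpha, u):
    """Transformed coordinates u~_m = alpha_im u_i of a polar vector."""
    return [sum(alpha[i][m] * u[i] for i in range(3)) for m in range(3)]

mirror = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # reflection z -> -z, determinant -1

u, v, w = [1, 2, 3], [0, 1, 4], [2, -1, 1]
u2, v2, w2 = (transform(mirror, x) for x in (u, v, w))

scalar_products_before = [dot(u, u), dot(v, v), dot(w, w), dot(u, v), dot(u, w), dot(v, w)]
scalar_products_after = [dot(u2, u2), dot(v2, v2), dot(w2, w2),
                         dot(u2, v2), dot(u2, w2), dot(v2, w2)]
print(scalar_products_before == scalar_products_after)   # polar scalars are unchanged
print(triple(u, v, w), triple(u2, v2, w2))                # the axial scalar changes sign
```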

5.3.2 Independent Invariants of a Second-Order Tensor

1. Let us first determine the number of independent invariants of an arbitrary
second-order tensor T. To this end, we decompose the tensor into its symmetric part S and
its antimetric part A. We can write the antimetric part A, according to Section 3.3, in
terms of its corresponding vector b, so we have

Tij = Sij + Aij = Sij + εijk bk .

If we assume that T is polar, then S and A are also polar and b is axial.
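
The decomposition can be carried out directly on a coordinate matrix. The sketch below assumes the convention Aij = εijk bk stated above, so that b1 = A23 , b2 = A31 , b3 = A12 ; the sample matrix and the function names are illustrative assumptions.

```python
from fractions import Fraction

def decompose(t):
    """Split T into symmetric part S, antimetric part A, and vector b with A_ij = eps_ijk b_k."""
    s = [[Fraction(t[i][j] + t[j][i], 2) for j in range(3)] for i in range(3)]
    a = [[Fraction(t[i][j] - t[j][i], 2) for j in range(3)] for i in range(3)]
    b = [a[1][2], a[2][0], a[0][1]]
    return s, a, b

def antimetric_from_vector(b):
    """Coordinate matrix eps_ijk b_k of the antimetric tensor belonging to b."""
    b1, b2, b3 = b
    return [[0, b3, -b2], [-b3, 0, b1], [b2, -b1, 0]]

def add(m1, m2):
    return [[m1[i][j] + m2[i][j] for j in range(3)] for i in range(3)]

T = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]   # arbitrary sample tensor
S, A, b = decompose(T)
rebuilt = add(S, antimetric_from_vector(b))
b_squared = sum(x * x for x in b)        # invariant of the antimetric part
print(b, rebuilt == T, b_squared)
```
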
The decomposition into a symmetric part and an antimetric part is, according to
Section 2.6, No. 7, independent of the coordinate system. So we can also use the
principal axis system of S to represent the coordinates of the full tensor T, and thus we
have for its coordinate matrix

\[
\tilde{T}_{ij} =
\begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix}
+ \begin{pmatrix} 0 & \beta_3 & -\beta_2 \\ -\beta_3 & 0 & \beta_1 \\ \beta_2 & -\beta_1 & 0 \end{pmatrix},
\]

\[
\tilde{T}_{ij} =
\begin{pmatrix} \sigma_1 & \beta_3 & -\beta_2 \\ -\beta_3 & \sigma_2 & \beta_1 \\ \beta_2 & -\beta_1 & \sigma_3 \end{pmatrix}.
\tag{5.4}
\]

Here the σi are the (not necessarily different) eigenvalues of S and the βi are the co-
ordinates of b in the principal axis system of S. We already know from Section 3.11.2
that the eigenvalues σi are invariants. The coordinates of b can be interpreted as the
projections of the vector b onto the principal axes of S; hence the βi are also invariants,
because the direction of the principal axes of S and the direction of b are independent
of the coordinate system.
Thus the principal axis system of the symmetric part S is a distinguished coordi-
nate system of the tensor T; the coordinate matrix of T is then (and only then) anti-
metric, except on the main diagonal.
We conclude that, in general, a second-order tensor has six independent invari-
ants; more than six independent invariants are not possible, because the tensor is
uniquely determined by the σi and βi . However, these invariants are not polynomial
invariants, i. e. they cannot be written as polynomials of tensor coordinates in an ar-
bitrary Cartesian coordinate system.
2. If the symmetric part S has multiple eigenvalues, the number of independent in-
variants is reduced.
For an eigenvalue of multiplicity two, we choose the x, y-plane as the eigenplane
of the principal axis system and we put the x-axis such that b lies in the x, z-plane.
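Then the tensor T has the coordinate matrix

\[
\tilde{T}_{ij} =
\begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_1 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix}
+ \begin{pmatrix} 0 & \beta_3 & 0 \\ -\beta_3 & 0 & \beta_1 \\ 0 & -\beta_1 & 0 \end{pmatrix}
= \begin{pmatrix} \sigma_1 & \beta_3 & 0 \\ -\beta_3 & \sigma_1 & \beta_1 \\ 0 & -\beta_1 & \sigma_3 \end{pmatrix},
\]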

i. e. only four independent invariants exist.

For an eigenvalue of multiplicity three, every coordinate system is also a princi-
pal axis system of S. So we can put the x-axis in the direction of b and we get for the
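coordinate matrix of T

\[
\tilde{T}_{ij} =
\begin{pmatrix} \sigma & 0 & 0 \\ 0 & \sigma & 0 \\ 0 & 0 & \sigma \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \beta \\ 0 & -\beta & 0 \end{pmatrix}
= \begin{pmatrix} \sigma & 0 & 0 \\ 0 & \sigma & \beta \\ 0 & -\beta & \sigma \end{pmatrix},
\]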

i. e. T has only two independent invariants.

3. If the tensor is symmetric, we have A = 0. A symmetric tensor has at most three
independent invariants, namely if all its eigenvalues are different. If the tensor has an
eigenvalue of multiplicity two, it has only two invariants; and if it has an eigenvalue
of multiplicity three, only one invariant exists.
If the tensor is antimetric, we have S = 0. An antimetric tensor has only one
independent invariant, i. e. the square of the corresponding vector.
In addition, we consider orthogonal tensors. In a coordinate system in which the
z-axis is the axis of rotation or the axis of rotary reflection, an orthogonal tensor has,
according to (3.75) or (3.76), the coordinate matrix

\[
\begin{pmatrix} \cos\vartheta & -\sin\vartheta & 0 \\ \sin\vartheta & \cos\vartheta & 0 \\ 0 & 0 & \pm 1 \end{pmatrix}.
\]

This is a representation of a tensor with respect to the principal axis system of its sym-
metric part, so we note that with the angle ϑ only one independent invariant exists.

5.3.3 Irreducible Invariants of Second-Order Tensors

1. Irreducible invariants are by definition polynomial invariants, which are obtained
from appropriate contractions of tensor coordinates. For second-order tensors, we can
write the polynomial invariants also as the trace of a sequence of scalar products be-
tween these tensors. We begin with a few statements, which we need later:
I. The trace of a sequence of scalar products of second-order tensors does not change
if we cyclically permute the order of the factors in the scalar products, i. e.

tr(a ⋅ b ⋅ . . . ⋅ c ⋅ d) = aij bjk ⋅ ⋅ ⋅ cmn dni = bjk ⋅ ⋅ ⋅ cmn dni aij = tr(b ⋅ . . . ⋅ c ⋅ d ⋅ a),

so we have

tr(a ⋅ b ⋅ . . . ⋅ c ⋅ d) = tr(b ⋅ . . . ⋅ c ⋅ d ⋅ a) = ⋅ ⋅ ⋅ = tr(d ⋅ a ⋅ b ⋅ . . . ⋅ c). (5.5)

II. The trace of a sequence of scalar products of second-order tensors does not change
if we transpose each factor, while reversing the order of the factors, i. e.

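\[
\operatorname{tr}(a \cdot b \cdot \ldots \cdot c \cdot d) = a_{ij} b_{jk} \cdots c_{mn} d_{ni}
= d^{T}_{in} c^{T}_{nm} \cdots b^{T}_{kj} a^{T}_{ji}
= \operatorname{tr}(d^{T} \cdot c^{T} \cdot \ldots \cdot b^{T} \cdot a^{T}),
\]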

so we have

tr(a ⋅ b ⋅ . . . ⋅ c ⋅ d) = tr(dT ⋅ cT ⋅ . . . ⋅ bT ⋅ aT ). (5.6)

In particular, if all factors are equal, we have

tr an = tr (aT )n = tr (an )T , (5.7)

where n is a natural number.

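III. The trace of a scalar product of a symmetric tensor s = sT and an antimetric tensor
a = −aT is zero. Since

(s ⋅ a)T = aT ⋅ sT = −a ⋅ s (a)

and also

\[
\operatorname{tr}(s \cdot a) \overset{(5.7)}{=} \operatorname{tr}\,(s \cdot a)^{T}
\overset{(a)}{=} -\operatorname{tr}(a \cdot s) \overset{(5.5)}{=} -\operatorname{tr}(s \cdot a),
\]

we obtain

tr(s ⋅ a) = 0. (5.8)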
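
Statements I to III are easy to confirm on sample matrices. In the sketch below all matrices and the helper `chain` (a product of several factors) are assumptions for illustration; integer entries keep the comparisons exact.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def chain(*ms):
    """Scalar product m1 . m2 . ... of several second-order tensors."""
    result = ms[0]
    for m in ms[1:]:
        result = matmul(result, m)
    return result

a = [[1, 2, 0], [-1, 3, 4], [2, 0, 1]]
b = [[0, 1, 1], [5, -2, 0], [3, 1, 2]]
c = [[2, 0, -1], [1, 1, 0], [0, 4, 3]]
s = [[1, 2, 3], [2, 0, -1], [3, -1, 4]]    # symmetric sample tensor
w = [[0, 5, -2], [-5, 0, 1], [2, -1, 0]]   # antimetric sample tensor

# I: cyclic permutation of the factors, cf. (5.5)
cyclic_ok = trace(chain(a, b, c)) == trace(chain(b, c, a)) == trace(chain(c, a, b))
# II: transposing every factor and reversing the order, cf. (5.6)
transpose_ok = trace(chain(a, b, c)) == trace(
    chain(transpose(c), transpose(b), transpose(a)))
# III: symmetric times antimetric, cf. (5.8)
sym_anti_trace = trace(matmul(s, w))
print(cyclic_ok, transpose_ok, sym_anti_trace)
```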

2. In order to write the irreducible invariants of a second-order tensor as the trace of a
sequence of scalar products, the main question is how many of these invariants form
an integrity basis. Answering this question here in all generality would be beyond
the scope of this book. We restrict ourselves to the case that the factors in the scalar
product depend only on two different tensors a and b. We will follow a systematic
approach and increase the number of factors in the scalar product step by step;
the number of these factors is also called the degree of the invariant.
With only one factor, there are only two ways to form invariants in the described
manner, namely the trace of each of the two tensors themselves, i. e.

I11 := tr a, I12 := tr b. (5.9)

Scalar products of two factors can be formed in four different ways: a2 , b2 , a ⋅ b, b ⋅ a;
after taking cyclic permutations (5.5) into account, only three invariants of degree
two remain:

I21 := tr a2 , I22 := tr b2 , I23 := tr(a ⋅ b). (5.10)

From three factors, there are altogether eight possible scalar products: a3 , b3 , a2 ⋅ b, a ⋅
b⋅a, b⋅a2 , b2 ⋅a, b⋅a⋅b, a⋅b2 ; after computing the traces and taking cyclical permutations
according to (5.5) into account, only four invariants of degree three remain:

I31 := tr a3 , I32 := tr b3 ,
I33 := tr(a2 ⋅ b), I34 := tr(a ⋅ b2 ). (5.11)

The invariants of degree one, two, and three, which we have considered so far, are all
irreducible, because none can be written in terms of the others. However, for invariants
of degree four, a fundamental restriction appears, since invariants such as tr(a3 ⋅b) and
tr(b3 ⋅ a) are reducible. To see this, we solve the Cayley–Hamilton theorem (3.94) for a3
and scalar multiply it from the right by b, so we have

a3 ⋅ b = A″ a2 ⋅ b − A′ a ⋅ b + A b.

Computing the trace gives

tr(a3 ⋅ b) = A″ tr(a2 ⋅ b) − A′ tr(a ⋅ b) + A tr b,

i. e. tr(a3 ⋅ b) can be written in terms of invariants of degree less than or equal to three,
since A″, A′, and A are, according to (3.99), also functions of tr a, tr a2 , and tr a3 . The
same is true for tr(b3 ⋅ a), because a and b can be interchanged in the above derivation.
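
The reduction of tr(a3 ⋅ b) can be checked on sample matrices. The sketch assumes the usual expressions for the main invariants, namely tr a, ½ (tr2 a − tr a2 ), and det a; the matrices and variable names are chosen only for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def det3(t):
    return (t[0][0] * (t[1][1] * t[2][2] - t[1][2] * t[2][1])
            - t[0][1] * (t[1][0] * t[2][2] - t[1][2] * t[2][0])
            + t[0][2] * (t[1][0] * t[2][1] - t[1][1] * t[2][0]))

a = [[1, 2, 0], [-1, 3, 4], [2, 0, 1]]
b = [[0, 1, 1], [5, -2, 0], [3, 1, 2]]
a2 = matmul(a, a)
a3 = matmul(a2, a)

main_2 = trace(a)                                  # first main invariant (tr a)
main_1 = Fraction(trace(a) ** 2 - trace(a2), 2)    # second main invariant
main_0 = det3(a)                                   # third main invariant (det a)

degree_four = trace(matmul(a3, b))
reduced = (main_2 * trace(matmul(a2, b))
           - main_1 * trace(matmul(a, b))
           + main_0 * trace(b))
print(degree_four, reduced)
```

The right-hand side contains only traces of products with at most three factors, which is what makes the invariant of degree four reducible.
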
We can extend these considerations to invariants of the form tr(c ⋅ a3 ⋅ d) and
tr(c ⋅ b3 ⋅ d), where c and d are arbitrary second-order tensors; in particular they can
be scalar products of the factors a and b. Such invariants can be reduced, using the
Cayley–Hamilton theorem (3.94), to invariants of a smaller degree. In other words, in-
variants with a term of power three (or higher) are reducible. Hence we only need to
investigate sequences of scalar products with the elements a2 , b2 , a, b.
For scalar products with at least four factors, we can derive another relation, us-
ing the generalized Cayley–Hamilton theorem, which will narrow down the search for
irreducible invariants. For this we set in (5.1) a = b = n and scalar multiply from the
right by an arbitrary tensor d. For the first row we get with (5.2)

Σ ⋅ d = [n ⋅ (n ⋅ c + c ⋅ n) + n ⋅ (n ⋅ c + c ⋅ n) + c ⋅ (n ⋅ n + n ⋅ n)] ⋅ d

= 2 [n ⋅ c ⋅ n + n2 ⋅ c + c ⋅ n2 ] ⋅ d = 2 [n ⋅ c ⋅ n ⋅ d + n2 ⋅ c ⋅ d + c ⋅ n2 ⋅ d] .

Computing the trace, taking cyclic permutations into account, yields

tr(Σ ⋅ d) = 2 [tr(n ⋅ c ⋅ n ⋅ d) + tr(n2 ⋅ c ⋅ d) + tr(n2 ⋅ d ⋅ c)] ,

or, solved for the first trace,

tr(n ⋅ c ⋅ n ⋅ d) = − (tr(n2 ⋅ c ⋅ d) + tr(n2 ⋅ d ⋅ c)) + ½ tr(Σ ⋅ d).

Using (5.3), we can write tr(Σ ⋅ d) in terms of invariants of at least one degree less than
the degree of tr(n⋅c⋅n⋅d), tr(n2 ⋅c⋅d), and tr(n2 ⋅d⋅c). In the theory of the representation
of tensor functions we also say that tr(n ⋅ c ⋅ n ⋅ d) and − (tr(n2 ⋅ c ⋅ d) + tr(n2 ⋅ d ⋅ c)) are
equivalent and we write

tr(n ⋅ c ⋅ n ⋅ d) ≡ − (tr(n2 ⋅ c ⋅ d) + tr(n2 ⋅ d ⋅ c)) . (5.12)
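
The equivalence (5.12) says that the two sides differ only by terms of lower degree. The sketch below makes this difference explicit: it evaluates Σ from the right-hand side of (5.3) with a = b = n, which contains products of at most two tensors, and compares tr(Σ ⋅ d) with the identity tr(Σ ⋅ d) = 2 [tr(n ⋅ c ⋅ n ⋅ d) + tr(n2 ⋅ c ⋅ d) + tr(n2 ⋅ d ⋅ c)] derived above. Sample matrices and function names are assumptions.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(3)] for i in range(3)]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def chain(*ms):
    result = ms[0]
    for m in ms[1:]:
        result = matmul(result, m)
    return result

DELTA = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def sigma_rhs(a, b, c):
    """Right-hand side of (5.3): Sigma through products of at most two tensors."""
    ta, tb, tc = trace(a), trace(b), trace(c)
    bc = add(matmul(b, c), matmul(c, b))
    ac = add(matmul(a, c), matmul(c, a))
    ab = add(matmul(a, b), matmul(b, a))
    t_bc, t_ac, t_ab = trace(matmul(b, c)), trace(matmul(a, c)), trace(matmul(a, b))
    scalar = (ta * tb * tc + trace(chain(a, b, c)) + trace(chain(c, b, a))
              - t_bc * ta - t_ac * tb - t_ab * tc)
    return add(scale(ta, bc), scale(tb, ac), scale(tc, ab),
               scale(-(tb * tc - t_bc), a), scale(-(ta * tc - t_ac), b),
               scale(-(ta * tb - t_ab), c), scale(scalar, DELTA))

n = [[1, 2, 0], [-1, 3, 4], [2, 0, 1]]
c = [[0, 1, 1], [5, -2, 0], [3, 1, 2]]
d = [[2, 0, -1], [1, 1, 0], [0, 4, 3]]
n2 = matmul(n, n)

degree_four_sum = (trace(chain(n, c, n, d)) + trace(chain(n2, c, d))
                   + trace(chain(n2, d, c)))
lower_degree_part = trace(matmul(sigma_rhs(n, n, c), d))   # equals tr(Sigma . d)
print(2 * degree_four_sum, lower_degree_part)
```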

Equivalent invariants have the same degree and differ only in their (reducible) invari-
ants of lower degrees; in particular, we can replace an irreducible invariant in an in-
tegrity basis by an equivalent invariant. If a computation shows that an invariant is
equivalent to zero, then this invariant is reducible. The important implication of the
equivalence relation (5.12) is that invariants with two factors which are equal but ap-
pear separated from each other can be written as the sum of two other invariants in
which this factor appears squared.
With these preparations we can now reduce our search for irreducible invariants of
degree four to the following scalar products: a2 ⋅ b ⋅ a, a ⋅ b ⋅ a2 , b2 ⋅ a ⋅ b, b ⋅ a ⋅ b2 ,
a ⋅ b ⋅ a ⋅ b, b ⋅ a ⋅ b ⋅ a, a ⋅ b2 ⋅ a, b ⋅ a2 ⋅ b, a2 ⋅ b2 , b2 ⋅ a2 . After computing the trace
and using cyclic permutation we have tr(a2 ⋅ b ⋅ a) = tr(a ⋅ b ⋅ a2 ) = tr(a3 ⋅ b) and
tr(b2 ⋅ a ⋅ b) = tr(b ⋅ a ⋅ b2 ) = tr(b3 ⋅ a), so these invariants are reducible. With the
same reasoning we have tr(a ⋅ b ⋅ a ⋅ b) = tr(b ⋅ a ⋅ b ⋅ a) and tr(a2 ⋅ b2 ) = tr(b2 ⋅ a2 )
= tr(a ⋅ b2 ⋅ a) = tr(b ⋅ a2 ⋅ b), so only two invariants remain, i. e. tr(a ⋅ b ⋅ a ⋅ b) and
tr(a2 ⋅ b2 ), and we have to check if they are irreducible. Setting in (5.12) n = a, c = d = b
we obtain

tr(a ⋅ b ⋅ a ⋅ b) ≡ −2 tr(a2 ⋅ b2 ),

i. e. both invariants are equivalent. Hence only one irreducible invariant of degree four
exists and we choose

I41 := tr(a2 ⋅ b2 ). (5.13)

When investigating scalar products with five factors, we consider cyclic permutations
from the outset and then we see that there are only two ways of computing invariants
of degree five, which are composed of the elements a2 , b2 , a, b, namely tr(a ⋅ b ⋅ a ⋅ b2 )
and tr(b ⋅ a ⋅ b ⋅ a2 ). From (5.12) it follows that

tr(a ⋅ b ⋅ a ⋅ b2 ) ≡ −2 tr(a2 ⋅ b3 ),
tr(b ⋅ a ⋅ b ⋅ a2 ) ≡ −2 tr(b2 ⋅ a3 );

however, because of a3 and b3 both invariants are reducible. In other words, no irre-
ducible invariants of degree five exist.
In order to find the irreducible invariants of degree six, which can be formed from
the elements a2 , b2 , a, b, we start, after taking cyclic permutations into account, from
the invariants tr(a ⋅ b2 ⋅ a ⋅ b2 ), tr(b ⋅ a2 ⋅ b ⋅ a2 ), tr(a ⋅ b ⋅ a ⋅ b ⋅ a ⋅ b), tr(a2 ⋅ b2 ⋅ a ⋅ b),
tr(a2 ⋅ b ⋅ a ⋅ b2 ). We then investigate, using (5.12), which of them are irreducible. For
the first two invariants we have

tr(a ⋅ b2 ⋅ a ⋅ b2 ) ≡ −2 tr(a2 ⋅ b4 ),
tr(b ⋅ a2 ⋅ b ⋅ a2 ) ≡ −2 tr(b2 ⋅ a4 );

but because of a4 and b4 these invariants are reducible. For the third invariant we have

tr(a ⋅ (b ⋅ a ⋅ b) ⋅ a ⋅ b) ≡ −tr(a2 ⋅ b ⋅ a ⋅ b2 ) − tr(a2 ⋅ b2 ⋅ a ⋅ b),

i. e. the third invariant is equivalent to the (negative) sum of the last two invariants.
For the last two invariants we obtain

tr(a2 ⋅ b2 ⋅ a ⋅ b) = tr(a ⋅ (a ⋅ b2 ) ⋅ a ⋅ b) ≡ −tr(a3 ⋅ b3 ) − tr(a2 ⋅ b ⋅ a ⋅ b2 ),

and since tr(a3 ⋅ b3 ) is reducible, tr(a2 ⋅ b2 ⋅ a ⋅ b) ≡ −tr(a2 ⋅ b ⋅ a ⋅ b2 ),

i. e. they are equivalent and can only be reduced as a sum, but not individually, to in-
variants of lower degree. So we can regard one of them as irreducible and we choose

I61 := tr(a2 ⋅ b ⋅ a ⋅ b2 ). (5.14)

It turns out that we do not have to look further for invariants of degree seven or higher,
because they will always be reducible. This follows from two facts. First, a factor can
appear in irreducible invariants only as itself or squared. Second, the equivalence
relation (5.12) states that invariants in which a factor appears twice can be replaced by
invariants in which this factor appears squared. From the elements a2 , b2 , a, b we can form, without repetition, at
most invariants of degree six, because for higher degrees at least one of the elements
appears twice. If the element appearing twice is a or b, we can replace the invariant
under consideration, using (5.12), with equivalent invariants with a2 or b2 ; however,
the elements a2 and b2 already appear, so that applying (5.12) again leads to invariants
with a4 or b4 , which are reducible. If the element appearing twice is a2 or b2 , then
applying (5.12) once is already sufficient to reduce the invariant under consideration
to invariants of lower degree.

3. We summarize the results of our investigations.

We computed invariants as the trace of a sequence of scalar products of second-
order tensors, and we restricted our search for invariants to the case that the factors
in the sequence of scalar products are two different tensors a and b. Using the Cayley–
Hamilton theorem (3.94) and its generalization (5.1), we could show that invariants
with more than six factors are always reducible. Finally, we were able to find with
(5.9), (5.10), (5.11), (5.13), and (5.14) a total of 11 irreducible invariants (some of which
are simultaneous invariants, since they are built from the coordinates of different
tensors):

I11 = tr a, I12 = tr b,
I21 = tr a2 , I22 = tr b2 , I23 = tr(a ⋅ b),
I31 = tr a3 , I32 = tr b3 , I33 = tr(a2 ⋅ b), I34 = tr(a ⋅ b2 ), (5.15)
I41 = tr(a2 ⋅ b2 ),
I61 = tr(a2 ⋅ b ⋅ a ⋅ b2 ).

It remains to answer the question if these irreducible invariants form a complete set.
We will do this by means of examples.
4. As a first example, we consider a single tensor T and set in (5.15) a = T,
b = T T . Since the trace does not change if we transpose its argument and since,
according to (5.7), transposition and exponentiation are interchangeable within a
trace, we see immediately that some of the invariants in (5.15) are the same. We have

I11 = I12 = tr T, I21 = I22 = tr T 2 , I31 = I32 = tr T 3 ,

and further investigation of I33 and I34 shows that

I34 = tr(T ⋅ (T T )2 ) = tr(T 2 ⋅ T T ) = I33 .

Thus a single second-order tensor has, in general, seven irreducible invariants:

I1 := Tii = tr T,
I2 := Tij Tji = tr T 2 ,
I3 := Tij Tij = tr(T ⋅ T T ),
I4 := Tij Tjk Tki = tr T 3 , (5.16)

I5 := Tij Tjk Tik = tr(T 2 ⋅ T T ),

I6 := Tij Tjk Tlk Til = tr(T 2 ⋅ (T T )2 ),
I7 := Tij Tjk Tlk Tlm Tnm Tin = tr(T 2 ⋅ T T ⋅ T ⋅ (T T )2 ).

These invariants in fact form an integrity basis. Both T and T T are needed and it is
not possible to write all invariants in terms of traces using only T. We can see this,
for example, if we compare I2 and I3 . In index notation we have I2 = Tij Tji and I3 =
Tij Tij ; both are scalars obtained by contracting tensor coordinates, but without the
transposed tensor, we cannot write I3 as a trace.2
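
The agreement between the index form and the trace form of the seven invariants in (5.16) can be checked by writing out the contractions as explicit sums. The sample tensor and the variable names below are assumptions for illustration.

```python
from itertools import product

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def chain(*ms):
    result = ms[0]
    for m in ms[1:]:
        result = matmul(result, m)
    return result

T = [[1, 2, 0], [-1, 3, 4], [2, 0, 1]]   # arbitrary nonsymmetric sample tensor
Tt = transpose(T)
T2, Tt2 = matmul(T, T), matmul(Tt, Tt)
R = range(3)

by_trace = [
    trace(T), trace(T2), trace(matmul(T, Tt)), trace(matmul(T2, T)),
    trace(matmul(T2, Tt)), trace(matmul(T2, Tt2)), trace(chain(T2, Tt, T, Tt2)),
]
by_index = [
    sum(T[i][i] for i in R),
    sum(T[i][j] * T[j][i] for i, j in product(R, repeat=2)),
    sum(T[i][j] * T[i][j] for i, j in product(R, repeat=2)),
    sum(T[i][j] * T[j][k] * T[k][i] for i, j, k in product(R, repeat=3)),
    sum(T[i][j] * T[j][k] * T[i][k] for i, j, k in product(R, repeat=3)),
    sum(T[i][j] * T[j][k] * T[l][k] * T[i][l]
        for i, j, k, l in product(R, repeat=4)),
    sum(T[i][j] * T[j][k] * T[l][k] * T[l][m] * T[n][m] * T[i][n]
        for i, j, k, l, m, n in product(R, repeat=6)),
]
print(by_trace == by_index)
```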

2 This explains why (5.15) is not a complete set of irreducible invariants of the tensors a and b;
to obtain an integrity basis, we have to include also the transposed tensors aT and bT , i. e. we have
to investigate four different tensors.

If we compare our results with Section 5.3.2, we see that, in general, a second-
order tensor has six independent invariants, but seven irreducible invariants. We can
clearly write all irreducible invariants in terms of the independent invariants from Sec-
tion 5.3.2, No. 1, if we write the traces in (5.16) with respect to the principal axis system
of the symmetric part of T.
Symmetric tensors obey T T = T. Then I2 = I3 and I4 = I5 , and we can reduce
I6 = tr T 4 and I7 = tr T 6 , using the Cayley–Hamilton theorem (3.95), to the remaining
invariants. So, from the irreducible invariants in (5.16), only the three basic invariants
I1 = tr T, I2 = tr T 2 , I4 = tr T 3 remain.
Antimetric tensors satisfy T T = −T; hence I1 = 0. We know T 2 is symmetric
because Tij Tjk = (−Tji ) (−Tkj ) = Tkj Tji , so I2 is nonzero, but we have I3 = −I2 .
Here T 3 is again antimetric because Tij Tjk Tkl = (−Tji ) (−Tkj ) (−Tlk ) = −Tlk Tkj Tji ,
so we have I4 = 0 and thus also I5 = −I4 = 0. It further follows that I6 = tr T 4 and
I7 = −tr T 6 , so the Cayley–Hamilton theorem (3.95) helps us reduce I6 and I7 to the
only remaining invariant I2 = tr T 2 .
For orthogonal tensors we have T T = T −1 . Then I3 = I6 = I7 = 3. In other words,
I3 , I6 , and I7 do not contain any information about a particular orthogonal tensor
and thus do not count as invariants. It further follows that I5 = I1 , so we are left
with the three basic invariants I1 = tr T, I2 = tr T 2 , I4 = tr T 3 . The determinant of
an orthogonal tensor is, according to Section 3.13.2, equal to ±1, and the cotensor is,
according to (3.14), equal to the tensor itself, up to the sign. Thus, we have for the main
invariants, according to (3.47)–(3.49), A″ = tr T, A′ = ±tr T, A = ±1; then it follows
from (3.100) that tr T 2 = tr2 T ∓ 2 tr T, tr T 3 = tr3 T ∓ 3 tr2 T ± 3, so an orthogonal tensor
has only one irreducible invariant I1 = tr T.
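
The statements about orthogonal tensors can be checked with an exact rational rotation matrix (upper signs) and its negative, a rotary reflection (lower signs). The two elementary rotations and all names in this sketch are assumptions chosen so that the arithmetic stays exact.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def chain(*ms):
    result = ms[0]
    for m in ms[1:]:
        result = matmul(result, m)
    return result

def seven_invariants(t):
    """I1 ... I7 of (5.16), computed as traces."""
    tt = transpose(t)
    t2, tt2 = matmul(t, t), matmul(tt, tt)
    return (trace(t), trace(t2), trace(matmul(t, tt)), trace(matmul(t2, t)),
            trace(matmul(t2, tt)), trace(matmul(t2, tt2)),
            trace(chain(t2, tt, t, tt2)))

# (cos, sin) = (3/5, 4/5) about z and (5/13, 12/13) about x give exact rotations.
rot_z = [[F(3, 5), F(-4, 5), 0], [F(4, 5), F(3, 5), 0], [0, 0, 1]]
rot_x = [[1, 0, 0], [0, F(5, 13), F(-12, 13)], [0, F(12, 13), F(5, 13)]]
rotation = matmul(rot_z, rot_x)                    # determinant +1
reflection = [[-x for x in row] for row in rotation]   # determinant -1

I1, I2, I3, I4, I5, I6, I7 = seven_invariants(rotation)
J1, J2, J3, J4, J5, J6, J7 = seven_invariants(reflection)
print(I1, I2, I4)
print(J1, J2, J4)
```
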
5. As our next example we consider the case of two symmetric tensors U and V, which
is also important for physical applications. Now we do not need to distinguish between
the tensors and their transposed tensors, so we certainly can obtain an integrity basis
from (5.15). We set a = U, b = V in (5.15) and keep the first 10 invariants. Only the
invariant of degree six does not belong to the integrity basis because it turns out to
be reducible. On the one hand, from the symmetry of the tensors and by taking cyclic
permutations and transposition (5.5)–(5.7) into account, it follows that

tr(U 2 ⋅ V ⋅ U ⋅ V 2 ) = tr((V 2 )T ⋅ U T ⋅ V T ⋅ (U 2 )T ) = tr(U 2 ⋅ V 2 ⋅ U ⋅ V),

and on the other hand, from (5.12), we get

tr(U 2 ⋅ V ⋅ U ⋅ V 2 ) = tr(U ⋅ (U ⋅ V) ⋅ U ⋅ V 2 ) ≡ −tr(U 3 ⋅ V 3 ) − tr(U 2 ⋅ V 2 ⋅ U ⋅ V),

where tr(U 3 ⋅ V 3 ) is reducible.
The comparison of these two relations yields

tr(U 2 ⋅ V ⋅ U ⋅ V 2 ) ≡ −tr(U 2 ⋅ V ⋅ U ⋅ V 2 ),

which is only possible if tr(U 2 ⋅ V ⋅ U ⋅ V 2 ) ≡ 0, and thus it is reducible.

So we find that the integrity basis of two symmetric tensors U and V consists of
ten irreducible invariants:
– the basic invariants
tr U, tr U 2 , tr U 3 and tr V, tr V 2 , tr V 3 of each of the tensors and
– the simultaneous invariants
tr(U ⋅ V), tr(U 2 ⋅ V), tr(U ⋅ V 2 ), tr(U 2 ⋅ V 2 ).

Problem 5.2.
Find all irreducible invariants which can be computed from two antimetric tensors A
and B, and compare the result with the invariants of two vectors from Section 5.3.1.
Hint: To show that tr(A2 ⋅ B2 ) is reducible, represent the two antimetric tensors A
and B temporarily by their corresponding vectors.

6. As a last example, we investigate the invariants of a symmetric tensor S and an
antimetric tensor A. Also for this case we can obtain an integrity basis from (5.15),
since the tensors and their transposed tensors differ at most by a sign. If we set
a = S, b = A in (5.15), some invariants are zero, because not only the trace of an
antimetric tensor and its (also antimetric) third power is zero but also, according to
(5.8), the trace of the scalar product of a symmetric tensor and an antimetric tensor is
zero. Thus, we have I12 = I23 = I32 = I33 = 0. Contrary to the case of two symmetric
tensors, here the invariant of degree six is not reducible. Using cyclic permutations
and transposition (5.5)–(5.7), we initially have

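\[ \operatorname{tr}(S^2 \cdot A \cdot S \cdot A^2) = \operatorname{tr}\bigl((A^2)^{\mathrm T} \cdot S^{\mathrm T} \cdot A^{\mathrm T} \cdot (S^2)^{\mathrm T}\bigr) = -\operatorname{tr}(A^2 \cdot S \cdot A \cdot S^2), \]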

and from (5.12) we further obtain that

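\[ \operatorname{tr}(S^2 \cdot A \cdot S \cdot A^2) = \operatorname{tr}(S \cdot (S \cdot A) \cdot S \cdot A^2) \equiv -\underbrace{\operatorname{tr}(S^3 \cdot A^3)}_{=\,0} - \operatorname{tr}(S^2 \cdot A^2 \cdot S \cdot A). \]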

Comparing these two equations leads to the trivial statement
$\operatorname{tr}(S^2 \cdot A \cdot S \cdot A^2) \equiv \operatorname{tr}(S^2 \cdot A \cdot S \cdot A^2)$,
i. e. we obtain no additional information about this invariant.
We thus obtain for the integrity basis of a symmetric tensor S and an antimetric
tensor A a total of seven irreducible invariants:
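– the basic invariants
$\operatorname{tr} S$, $\operatorname{tr} S^2$, $\operatorname{tr} S^3$ and $\operatorname{tr} A^2$ of each of the tensors,
– and also the simultaneous invariants
$\operatorname{tr}(S \cdot A^2)$, $\operatorname{tr}(S^2 \cdot A^2)$, $\operatorname{tr}(S^2 \cdot A \cdot S \cdot A^2)$.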
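
The vanishing traces and the sign change under transposition used above can be illustrated with a small Python sketch. The sample matrices and helper names are our own assumptions; the dictionary keys only label the quantities and do not reproduce the numbering of the invariants in (5.15).

```python
from functools import reduce

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def tr_prod(*factors):
    """Trace of the product of the given matrices, multiplied left to right."""
    return trace(reduce(matmul, factors))

S = [[2, 1, 0], [1, 3, -1], [0, -1, 1]]     # symmetric sample matrix
A = [[0, 2, -1], [-2, 0, 3], [1, -3, 0]]    # antimetric sample matrix, A^T = -A
S2, A2 = matmul(S, S), matmul(A, A)
A3 = matmul(A2, A)

# Traces that vanish for any symmetric S and antimetric A.
vanishing = {
    "tr A": trace(A),
    "tr A^3": trace(A3),
    "tr(S.A)": tr_prod(S, A),
    "tr(S^2.A)": tr_prod(S2, A),
}

sixth = tr_prod(S2, A, S, A2)              # tr(S^2 . A . S . A^2)
sixth_transposed = tr_prod(A2, S, A, S2)   # tr(A^2 . S . A . S^2)
print(vanishing, sixth, sixth_transposed)
```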

Problem 5.3.
Find all irreducible invariants which can be formed from a polar symmetric tensor S
and a vector u, and compare the result with the invariants of a symmetric tensor and
an antimetric tensor. Also, distinguish the cases where u is polar or axial.

5.3.4 Summary

We finally summarize the results of Section 5.3 in a table. Here T denotes an arbitrary
tensor, R is an orthogonal tensor, S, U, V are symmetric tensors, A, B are antimetric
tensors, and u, v, w are vectors. All vectors and tensors are polar.

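Arguments   Integrity basis

u           $u \cdot u$

u, v        $u \cdot u$, $v \cdot v$, $u \cdot v$

u, v, w     $u \cdot u$, $v \cdot v$, $w \cdot w$, $u \cdot v$, $u \cdot w$, $v \cdot w$

T           $\operatorname{tr} T$, $\operatorname{tr} T^2$, $\operatorname{tr} T^3$, $\operatorname{tr}(T \cdot T^{\mathrm T})$, $\operatorname{tr}(T^2 \cdot T^{\mathrm T})$,
            $\operatorname{tr}(T^2 \cdot (T^{\mathrm T})^2)$, $\operatorname{tr}(T^2 \cdot T^{\mathrm T} \cdot T \cdot (T^{\mathrm T})^2)$

S           $\operatorname{tr} S$, $\operatorname{tr} S^2$, $\operatorname{tr} S^3$

A           $\operatorname{tr} A^2$

R           $\operatorname{tr} R$

U, V        $\operatorname{tr} U$, $\operatorname{tr} U^2$, $\operatorname{tr} U^3$, $\operatorname{tr} V$, $\operatorname{tr} V^2$, $\operatorname{tr} V^3$,
            $\operatorname{tr}(U \cdot V)$, $\operatorname{tr}(U^2 \cdot V)$, $\operatorname{tr}(U \cdot V^2)$, $\operatorname{tr}(U^2 \cdot V^2)$

A, B        $\operatorname{tr} A^2$, $\operatorname{tr} B^2$, $\operatorname{tr}(A \cdot B)$

S, A        $\operatorname{tr} S$, $\operatorname{tr} S^2$, $\operatorname{tr} S^3$, $\operatorname{tr} A^2$, $\operatorname{tr}(S \cdot A^2)$,
            $\operatorname{tr}(S^2 \cdot A^2)$, $\operatorname{tr}(S^2 \cdot A \cdot S \cdot A^2)$

S, u        $\operatorname{tr} S$, $\operatorname{tr} S^2$, $\operatorname{tr} S^3$, $u \cdot u$, $u \cdot S \cdot u$, $u \cdot S^2 \cdot u$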

5.4 Isotropic Tensor Functions

5.4.1 Invariance Conditions

Tensors cannot be related by arbitrary functions, because the function value must
satisfy the corresponding transformation law for tensor coordinates under a change to
another Cartesian coordinate system. Let polar vectors v, ..., w and polar second-order
tensors M, ..., N be arguments of tensor functions F of various orders. Under
a change of the Cartesian coordinate system we have for the coordinates of the
arguments, according to (2.17), the transformation equations

\[ \tilde v_i = \alpha_{pi} v_p, \;\dots,\; \tilde w_j = \alpha_{qj} w_q, \]
\[ \tilde M_{kl} = \alpha_{pk}\alpha_{ql} M_{pq}, \;\dots,\; \tilde N_{mn} = \alpha_{pm}\alpha_{qn} N_{pq}. \]

Depending on the order of the tensors, we obtain different conditions for the func-
tion F :
– A polar scalar s = f(v, ..., w, M, ..., N) must remain unchanged under a
transformation of the coordinate system,

\[ f(v_i, \dots, w_j, M_{kl}, \dots, N_{mn}) = f(\tilde v_i, \dots, \tilde w_j, \tilde M_{kl}, \dots, \tilde N_{mn}), \]

and then it follows for the function f that

\[ f(v_i, \dots, w_j, M_{kl}, \dots, N_{mn}) = f(\alpha_{pi} v_p, \dots, \alpha_{qj} w_q, \alpha_{pk}\alpha_{ql} M_{pq}, \dots, \alpha_{pm}\alpha_{qn} N_{pq}). \tag{5.17} \]

– For a polar vector u = f(v, ..., w, M, ..., N), substituting the transformed
coordinates $\tilde v_i, \dots, \tilde w_j, \tilde M_{kl}, \dots, \tilde N_{mn}$ must give the transformed vector coordinates

\[ \tilde u_r = \alpha_{ur} u_u = f_r(\tilde v_i, \dots, \tilde w_j, \tilde M_{kl}, \dots, \tilde N_{mn}). \]

In the original coordinates, we have

\[ u_u = f_u(v_i, \dots, w_j, M_{kl}, \dots, N_{mn}), \]

so it follows for the function f that

\[ \alpha_{ur} f_u(v_i, \dots, w_j, M_{kl}, \dots, N_{mn}) = f_r(\alpha_{pi} v_p, \dots, \alpha_{qj} w_q, \alpha_{pk}\alpha_{ql} M_{pq}, \dots, \alpha_{pm}\alpha_{qn} N_{pq}). \tag{5.18} \]

– For a polar second-order tensor T = f(v, ..., w, M, ..., N), substituting the
transformed coordinates $\tilde v_i, \dots, \tilde w_j, \tilde M_{kl}, \dots, \tilde N_{mn}$ must give the transformed
tensor coordinates

\[ \tilde T_{rs} = \alpha_{ur}\alpha_{vs} T_{uv} = f_{rs}(\tilde v_i, \dots, \tilde w_j, \tilde M_{kl}, \dots, \tilde N_{mn}). \]

With

\[ T_{uv} = f_{uv}(v_i, \dots, w_j, M_{kl}, \dots, N_{mn}), \]

we then obtain for the function f

\[ \alpha_{ur}\alpha_{vs} f_{uv}(v_i, \dots, w_j, M_{kl}, \dots, N_{mn}) = f_{rs}(\alpha_{pi} v_p, \dots, \alpha_{qj} w_q, \alpha_{pk}\alpha_{ql} M_{pq}, \dots, \alpha_{pm}\alpha_{qn} N_{pq}). \tag{5.19} \]

If the transformation coefficients αij form an arbitrary orthogonal matrix (we also say
that they encompass all orthogonal transformations), we call (5.17), (5.18) and (5.19)
isotropic tensor functions.
The functions (5.17), (5.18), and (5.19) can in general further depend on polar
scalars; since this does not impose any restrictions on the functions, we did not in-
clude scalars in the list of arguments.
We derived the invariance conditions here only for polar tensors, but they can be
easily transferred to axial tensors by means of the transformation equations (2.18).

5.4.2 Scalar-Valued Functions

In order to satisfy the invariance condition (5.17), a scalar cannot depend on individual
tensor coordinates, because, in general, coordinates change under a transformation
of the coordinate system, so a scalar can only depend on combinations of coordinates
which are invariants and thus also scalars. If it is possible to form an integrity basis
with P irreducible invariants I1 , . . . , IP from the coordinates v, . . . , w, M, . . . , N, then
the scalar-valued function f satisfies

\[ s = f(v, \dots, w, M, \dots, N) = f(I_1, I_2, \dots, I_P). \tag{5.20} \]

The theory of the representation of tensor functions does not provide any more infor-
mation about the function f , so for a particular physical process one has to rely on
experiments. The type and the number P of the invariants I1 , . . . , IP depend on the
particular case and must be determined according to the rules from Section 5.3.
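
To illustrate why a scalar may depend on the arguments only through invariants, the following Python sketch compares two functions of a vector v and a symmetric tensor S under a change of the Cartesian coordinate system. Everything in it is an assumption made for illustration: the rational orthogonal matrix `ALPHA`, the sample data, and the hypothetical functions `f_invariant` and `g_coordinate`. Exact fractions are used so that the comparison does not depend on rounding.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

# Orthogonal matrix with rational entries: a 3-4-5 rotation about z,
# a 5-12-13 rotation about x, and a reflection of the z-axis.
RZ = [[F(3, 5), F(-4, 5), 0], [F(4, 5), F(3, 5), 0], [0, 0, 1]]
RX = [[1, 0, 0], [0, F(5, 13), F(-12, 13)], [0, F(12, 13), F(5, 13)]]
PZ = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
ALPHA = matmul(matmul(RZ, RX), PZ)

def transform_vector(alpha, v):
    """v~_i = alpha_pi v_p."""
    return matvec(transpose(alpha), v)

def transform_tensor(alpha, m):
    """M~_kl = alpha_pk alpha_ql M_pq."""
    return matmul(matmul(transpose(alpha), m), alpha)

def invariants(v, s):
    """Integrity basis of a symmetric tensor S and a vector v."""
    s2 = matmul(s, s)
    s3 = matmul(s2, s)
    return (trace(s), trace(s2), trace(s3),
            dot(v, v), dot(v, matvec(s, v)), dot(v, matvec(s2, v)))

def f_invariant(v, s):
    """Hypothetical function that depends on v and S only through invariants."""
    i1, i2, i3, i4, i5, i6 = invariants(v, s)
    return i1 * i2 - i3 + 2 * i4 * i5 - i6

def g_coordinate(v, s):
    """Hypothetical function of a single coordinate; it is not a scalar."""
    return s[0][0]

v = [1, 2, 3]
S = [[2, 1, 0], [1, 3, -1], [0, -1, 1]]
v_t, S_t = transform_vector(ALPHA, v), transform_tensor(ALPHA, S)
print(f_invariant(v, S) == f_invariant(v_t, S_t))    # invariance condition (5.17)
print(g_coordinate(v, S) == g_coordinate(v_t, S_t))
```

The first function satisfies the invariance condition for this transformation, the second one does not, because the coordinate S11 changes with the coordinate system.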

5.4.3 Vector-Valued Functions

1. We can obtain a scalar from a vector by means of the scalar product of the vector
and another vector. This suggests that, with the help of the scalar product of the vector-
valued function f and an arbitrary auxiliary vector h and by adding the auxiliary vector
to the argument list, we can reduce the problem of finding a representation of a vector-
valued function to the problem of finding a representation of a scalar-valued function:

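\[ u \cdot h = \mathbf{f}(v, \dots, w, M, \dots, N) \cdot h = f(v, \dots, w, M, \dots, N, h), \]

where $\mathbf{f}$ denotes the vector-valued function and f the scalar-valued function.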

Then f is, according to Section 5.4.2, a function of the P irreducible invariants $I_1, \dots, I_P$,
which can be formed from the v, ..., w, M, ..., N and the simultaneous invariants
with a vector h. However, since h is only an auxiliary vector, which must cancel out
from the vector-valued function f at the end, we need only those simultaneous invariants
which are linear in h. These simultaneous invariants have the form $J_i \cdot h$, where
$J_i$ is a set of Q vectors, which are computed from the v, ..., w, M, ..., N and which
have to be determined for each case individually. The vectors $J_i$ are also called the
generators of the representation and a complete set of generators is called a function
basis. We start by writing u ⋅ h as a linear combination of the $J_i \cdot h$, i. e.

\[ u \cdot h = k_1 J_1 \cdot h + \dots + k_Q J_Q \cdot h, \tag{a} \]

where the coefficients $k_1, \dots, k_Q$ are scalars which can still depend on the invariants
$I_1, \dots, I_P$ of the integrity basis for v, ..., w, M, ..., N, i. e. we have

\[ k_i = k_i(I_1, \dots, I_P). \]

Since each term in (a) is a scalar product with the auxiliary vector h, we can cancel
h out, and thus we obtain a representation of the original vector-valued function f,
which inherently satisfies the invariance condition (5.18) due to its construction. We
have

\[ u = k_1 J_1 + \dots + k_Q J_Q. \tag{5.21} \]

We show by means of two examples how to determine the generators J i of this repre-
sentation for a particular case.

2. In the first example, we seek a representation of a vector u which depends only on
another vector v,

\[ u = f(v). \]

Scalar multiplication with an auxiliary vector h gives

u ⋅ h = f (v, h).

According to Section 5.3.1, the vector v has only one irreducible invariant, i. e. its
square v ⋅ v, and from the vectors v and h we can form also only one irreducible si-
multaneous invariant, and this is linear in h, namely the scalar product v ⋅ h. So the
representation of the vector u has only one generator, namely the vector v itself, and
we have

u = k(v ⋅ v) v.

The function k(v⋅v) cannot be determined further from the theory of the representation
of tensor functions. For example, for a physical process, the missing information must
be obtained from experiments.
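
A short Python sketch can make the invariance condition for this representation concrete. It is an illustration under our own assumptions: the coefficient function `k`, the rotary reflection `ALPHA`, and the counterexample `g` are hypothetical and not taken from the text.

```python
from fractions import Fraction as F

def transpose(a):
    return [list(col) for col in zip(*a)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

# Rotary reflection about the z-axis with rational entries (cos = 3/5, sin = 4/5).
ALPHA = [[F(3, 5), F(-4, 5), 0], [F(4, 5), F(3, 5), 0], [0, 0, -1]]

def transform_vector(alpha, v):
    """v~_i = alpha_pi v_p."""
    return matvec(transpose(alpha), v)

def k(x):
    """Hypothetical coefficient function of the invariant v . v."""
    return 1 + x * x

def f(v):
    """Representation u = k(v . v) v."""
    return [k(dot(v, v)) * c for c in v]

def g(v):
    """Counterexample: singles out the first coordinate direction."""
    return [v[0], 0, 0]

v = [1, 2, 3]
f_ok = transform_vector(ALPHA, f(v)) == f(transform_vector(ALPHA, v))
g_ok = transform_vector(ALPHA, g(v)) == g(transform_vector(ALPHA, v))
print(f_ok, g_ok)
```

For this sample transformation the function built from the generator v satisfies the vector invariance condition, while the coordinate-dependent function does not.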

3. In a second example, we extend the function f from No. 2 and include a symmetric
tensor S, so we seek a representation for

u = f (v, S)

or, after scalar multiplication by an auxiliary vector h,

u ⋅ h = f (v, S, h).

As in No. 2, the scalar product v ⋅ h is a simultaneous invariant which is linear in h.
In addition, other linear simultaneous invariants can be formed with the help of the
scalar product of S and v on the left and h on the right; since S is symmetric, the order
does not matter. Doing the same with the integer powers of S yields initially the invariants
$v \cdot S \cdot h$, $v \cdot S^2 \cdot h$, $v \cdot S^3 \cdot h$, etc. Since v ⋅ h can also be written as v ⋅ δ ⋅ h, we see that
$v \cdot S^3 \cdot h$ (and correspondingly any expression with higher powers of S) is reducible,
because we can write $S^3$, using the Cayley–Hamilton theorem (3.94), in terms of $S^2$, S,
and δ. Hence there are three generators for the representation of the vector u, i. e.

\[ u = k_1\, v + k_2\, v \cdot S + k_3\, v \cdot S^2. \]

The coefficients $k_1, k_2, k_3$ are scalar-valued functions of the invariants of v and S. Since
here we consider only polar vectors and tensors, it follows from the result of
Problem 5.3 that

\[ k_i = f(\operatorname{tr} S, \operatorname{tr} S^2, \operatorname{tr} S^3, v \cdot v, v \cdot S \cdot v, v \cdot S^2 \cdot v). \]
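
The reduction of the third power of S by the Cayley–Hamilton theorem can be checked with a small Python sketch. We assume here the common form S³ = I₁ S² − I₂ S + I₃ δ with I₁ = tr S, I₂ = ((tr S)² − tr S²)/2, and I₃ = det S; the notation of (3.94) may differ from this. The sample data and helper names are our own.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

S = [[2, 1, 0], [1, 3, -1], [0, -1, 1]]   # symmetric sample tensor
v, h = [1, 2, 3], [-1, 0, 2]              # sample vector and auxiliary vector
delta = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
S2 = matmul(S, S)
S3 = matmul(S2, S)

# Principal invariants (assumed sign convention, see the lead-in).
I1 = trace(S)
I2 = F(trace(S) ** 2 - trace(S2), 2)
I3 = det3(S)

# S^3 expressed in terms of S^2, S, and delta.
S3_reduced = [[I1 * S2[i][j] - I2 * S[i][j] + I3 * delta[i][j]
               for j in range(3)] for i in range(3)]

# v . S^3 . h expressed by the lower-order simultaneous invariants.
lhs = dot(v, matvec(S3, h))
rhs = (I1 * dot(v, matvec(S2, h)) - I2 * dot(v, matvec(S, h))
       + I3 * dot(v, h))
print(S3 == S3_reduced, lhs == rhs)
```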

Problem 5.4.
Find the representation of a vector u which depends on a polar tensor T for the follow-
ing cases:
– T is symmetric or antimetric;
– u is polar or axial.

5.4.4 Tensor-Valued Functions

1. It is easy to generalize our work from Section 5.4.3 for vector-valued functions to
tensor-valued functions; we simply introduce an auxiliary tensor H and use the double
scalar product; this again ensures that the result inherently satisfies the invariance
condition (5.19) for tensor-valued functions. Then, from

T = f (v, . . . , w, M, . . . , N)

we first get

T ⋅⋅ H = f (v, . . . , w, M, . . . , N, H).

The generators $J_i$ are now second-order tensors, because in order for H to cancel out
at the end, the simultaneous invariants with the auxiliary tensor must have the linear
form $J_i \cdot\cdot H$. Then we can represent $T \cdot\cdot H$ as the linear combination of all Q
simultaneous invariants, i. e. we have

\[ T \cdot\cdot H = k_1 J_1 \cdot\cdot H + \dots + k_Q J_Q \cdot\cdot H, \]

and, after canceling H out, we obtain for the original tensor-valued function f

\[ T = k_1 J_1 + \dots + k_Q J_Q. \tag{5.22} \]

As in Section 5.4.3, the coefficients k1 , . . . , kQ are scalars which can depend on the
invariants I1 , . . . , IP of an integrity basis for v, . . . , w, M, . . . , N, i. e.

ki = ki (I1 , . . . , IP ).

The type and number Q of the generators J 1 , . . . , J Q can only be determined for each
particular case; we will again explain this with two examples.

2. We first consider a second-order tensor T which depends only on a symmetric
second-order tensor S, so we seek a representation for

\[ T = f(S). \]

The double scalar product with an auxiliary tensor H gives

T ⋅⋅ H = f (S, H).

Since S is symmetric and because we only need the invariants which are linear in H,
we can start from (5.15) and set there a = S and b = H. From this we obtain the three
basic invariants tr S, tr S2 , tr S3 of the symmetric tensor S and, using the symmetry of
S, the invariants which are linear in H,

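\[ \operatorname{tr} H = H_{ii} = \delta_{ij} H_{ij} = \delta \cdot\cdot H, \]
\[ \operatorname{tr}(S \cdot H) = S_{ij} H_{ji} = S_{ij} H_{ij} = S \cdot\cdot H, \]
\[ \operatorname{tr}(S^2 \cdot H) = S_{ij} S_{jk} H_{ki} = S_{kj} S_{ji} H_{ki} = S^2 \cdot\cdot H. \]

Hence we have three generators δ, S, $S^2$, and the representation of the tensor T is

\[ T = k_1\, \delta + k_2\, S + k_3\, S^2, \]

where the coefficients $k_1, k_2, k_3$ are scalar-valued functions of the three basic invariants
of S, i. e.

\[ k_i = f(\operatorname{tr} S, \operatorname{tr} S^2, \operatorname{tr} S^3). \]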

Since we assumed that the tensor S is symmetric, we immediately obtain from the the-
ory of the representation of tensor functions that the tensor T must also be symmetric.
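
Both statements, the invariance condition for tensor-valued functions and the symmetry of T, can be illustrated numerically. In the following Python sketch the coefficient functions inside `f`, the orthogonal matrix `ALPHA`, and the sample tensor are hypothetical choices of ours.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

# Rotary reflection about the z-axis with rational entries (cos = 3/5, sin = 4/5).
ALPHA = [[F(3, 5), F(-4, 5), 0], [F(4, 5), F(3, 5), 0], [0, 0, -1]]

def transform_tensor(alpha, m):
    """T~_rs = alpha_ur alpha_vs T_uv."""
    return matmul(matmul(transpose(alpha), m), alpha)

def f(s):
    """T = k1 delta + k2 S + k3 S^2 with hypothetical coefficient functions."""
    s2 = matmul(s, s)
    s3 = matmul(s2, s)
    k1, k2, k3 = trace(s), trace(s2) - 1, 2 + trace(s3)
    return [[k1 * (i == j) + k2 * s[i][j] + k3 * s2[i][j]
             for j in range(3)] for i in range(3)]

S = [[2, 1, 0], [1, 3, -1], [0, -1, 1]]   # symmetric sample tensor
T = f(S)

lhs = transform_tensor(ALPHA, T)        # transform the function value
rhs = f(transform_tensor(ALPHA, S))     # evaluate f at the transformed argument
print(lhs == rhs, T == transpose(T))
```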

3. As a second example, we consider a tensor T which depends on a symmetric tensor
S and also on a vector v, i. e. we are looking for a representation for

\[ T = f(S, v). \]

The double scalar product with the auxiliary tensor H gives

T ⋅⋅ H = f (S, v, H).

We can keep the invariants of S and v from Section 5.4.3, No. 3, so we only need to
compute the simultaneous invariants which are linear in H. As in No. 2 we initially
have

\[ \delta \cdot\cdot H, \quad S \cdot\cdot H, \quad S^2 \cdot\cdot H. \]

Another simultaneous invariant can be formed with the vector v :

v v ⋅⋅ H.

In order to form the simultaneous invariants with both S and v, we start from v v ⋅⋅ H
and form scalar products with v v and S or S2 on the right or on the left; we can neglect
higher powers of S, because of the Cayley–Hamilton theorem (3.95). This gives

\[ (S \cdot v\,v) \cdot\cdot H, \quad (v\,v \cdot S) \cdot\cdot H, \quad (S^2 \cdot v\,v) \cdot\cdot H, \quad (v\,v \cdot S^2) \cdot\cdot H, \]
\[ (S \cdot v\,v \cdot S) \cdot\cdot H, \quad (S^2 \cdot v\,v \cdot S) \cdot\cdot H, \quad (S \cdot v\,v \cdot S^2) \cdot\cdot H, \quad (S^2 \cdot v\,v \cdot S^2) \cdot\cdot H. \]

It remains to check if these simultaneous invariants are irreducible. Clearly,
$(S \cdot v\,v) \cdot\cdot H$ and $(v\,v \cdot S) \cdot\cdot H$ are irreducible. However, the scalar products $S \cdot v\,v \cdot S$,
$S^2 \cdot v\,v$, and $v\,v \cdot S^2$ are linked via the generalized Cayley–Hamilton theorem (5.1),
for example, if we set a = c = S and b = v v. Thus, from the three invariants
$(S^2 \cdot v\,v) \cdot\cdot H$, $(v\,v \cdot S^2) \cdot\cdot H$, and $(S \cdot v\,v \cdot S) \cdot\cdot H$ only two are irreducible,
and we choose $S^2 \cdot v\,v$ and $v\,v \cdot S^2$ for the function basis. The remaining invariants
$(S^2 \cdot v\,v \cdot S) \cdot\cdot H$, $(S \cdot v\,v \cdot S^2) \cdot\cdot H$, and $(S^2 \cdot v\,v \cdot S^2) \cdot\cdot H$ are reducible, since, according to
the generalized Cayley–Hamilton theorem (5.1), the scalar products $S^2 \cdot v\,v \cdot S$, $S \cdot v\,v \cdot S^2$,
and $S^2 \cdot v\,v \cdot S^2$ can be written in terms of δ, S, $S^2$, v v, $S \cdot v\,v$, $v\,v \cdot S$, $S^2 \cdot v\,v$, $v\,v \cdot S^2$.
Hence the function f has eight generators, and we can represent the tensor T as

\[ T = k_1\, \delta + k_2\, S + k_3\, S^2 + k_4\, v\,v + k_5\, S \cdot v\,v + k_6\, v\,v \cdot S + k_7\, S^2 \cdot v\,v + k_8\, v\,v \cdot S^2. \]

The coefficients $k_1, \dots, k_8$ are, as in Section 5.4.3, No. 3, scalar-valued functions of the
invariants of S and v, i. e.

\[ k_i = f(\operatorname{tr} S, \operatorname{tr} S^2, \operatorname{tr} S^3, v \cdot v, v \cdot S \cdot v, v \cdot S^2 \cdot v). \]

In contrast to No. 2, here we cannot conclude from the symmetry of S that T is also
symmetric. This is only true if $k_5 = k_6$ and $k_7 = k_8$. But if T is symmetric, then we can
represent T also in the form

\[ T = k_1^*\, \delta + k_2^*\, S + k_3^*\, S^2 + k_4^*\, v\,v + k_5^*\, (S \cdot v\,v + v\,v \cdot S) + k_6^*\, S \cdot v\,v \cdot S. \]

If T is symmetric, then $S^2 \cdot v\,v$ and $v\,v \cdot S^2$ can only appear as a sum, and according to the
generalized Cayley–Hamilton theorem (5.1), we can replace this sum by the generator
$S \cdot v\,v \cdot S$, which is symmetric from the outset. The coefficients $k_1^*, \dots, k_6^*$ are again
scalar-valued functions of the invariants of S and v.
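
The symmetry statement for the representation with eight generators can be illustrated with a small Python sketch. The constant coefficients below are hypothetical stand-ins for the coefficient functions, and the helper names and sample data are our own; the sketch only tests symmetry and does not use the generalized Cayley–Hamilton theorem.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def outer(a, b):
    """Dyad a b with coordinates a_i b_j."""
    return [[x * y for y in b] for x in a]

def generators(s, v):
    """delta, S, S^2, vv, S.vv, vv.S, S^2.vv, vv.S^2 in this order."""
    delta = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    s2 = matmul(s, s)
    vv = outer(v, v)
    return [delta, s, s2, vv, matmul(s, vv), matmul(vv, s),
            matmul(s2, vv), matmul(vv, s2)]

def combine(coeffs, gens):
    """Linear combination k_1 J_1 + ... + k_Q J_Q."""
    return [[sum(k * g[i][j] for k, g in zip(coeffs, gens))
             for j in range(3)] for i in range(3)]

def is_symmetric(m):
    return m == transpose(m)

S = [[2, 1, 0], [1, 3, -1], [0, -1, 1]]   # symmetric sample tensor
v = [1, 2, 3]                             # sample vector
gens = generators(S, v)

T_general = combine([1, 2, 3, 4, 5, 6, 7, 8], gens)   # k5 != k6, k7 != k8
T_paired = combine([1, 2, 3, 4, 5, 5, 7, 7], gens)    # k5 == k6, k7 == k8
print(is_symmetric(T_general), is_symmetric(T_paired))
```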

Problem 5.5.
Find the representation of a polar second-order tensor T which depends on a polar
vector or on an axial vector v.

5.4.5 Summary

We summarize the results of Section 5.4 in a table. Here S denotes a polar symmetric
tensor, A a polar antimetric tensor, v a polar vector, and u an axial vector.

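Vector-valued functions

Argument   Function basis (polar)                      Function basis (axial)

v          $v$
u                                                      $u$
S          –                                           –
A                                                      $\epsilon \cdot\cdot A$
S, v       $v$, $S \cdot v$, $S^2 \cdot v$

Tensor-valued functions

Argument   Function basis (polar)                      Function basis (axial)

v          $\delta$, $v\,v$                            $\epsilon \cdot v$
u          $\delta$, $u\,u$, $\epsilon \cdot u$
S          $\delta$, $S$, $S^2$
S, v       $\delta$, $S$, $S^2$, $v\,v$, $S \cdot v\,v$, $v\,v \cdot S$, $S^2 \cdot v\,v$, $v\,v \cdot S^2$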

5.5 Considering Anisotropy

1. In the previous section, we called a tensor function isotropic if the transformation
coefficients in the invariance conditions (5.17), (5.18) and (5.19) include all orthogonal
transformations. We could also consider only a subset of certain permissible orthog-
onal transformations, e. g. only rotations about a fixed axis. Such considerations are
important in physics and material science, if the problem is to describe the properties
of a material in more detail; many materials, such as crystals and composite mate-
rials, are characterized by a directional dependency, i. e. they behave differently un-
der rotations, depending on the direction of the axis of rotation. In such cases, often
certain rotations about distinguished axes and angles of rotation exist, where the be-
havior of these materials remains the same. These rotations can be used to classify
a material; we also say that the material possesses a certain symmetry or belongs to
a certain symmetry group. Mathematically, such a symmetry group is defined as the
set of orthogonal transformations under which the behavior of the material remains
unchanged.

2. The invariance conditions from Section 5.4.1 can also be used if only a subset of
all orthogonal transformations is allowed. Without further elaboration, we only men-
tion here that, in general, the number of the invariants increases if the permissible
transformations are restricted. An example is the restriction to only proper orthogonal
transformations. Then we also speak of hemitropic invariants and hemitropic tensor
functions. They are formed similarly as in Sections 5.3 and 5.4, but now we also have
to take the simultaneous invariants with the ε-tensor into account, because for proper
orthogonal transformations, we do not need to distinguish between polar and axial
tensors. Another example is the restriction to rotations about a certain axis. If we
choose this axis as the z-axis of a Cartesian coordinate system, then the matrix of the
transformation coefficients has, according to (3.75) and with Section 3.13.4, the form

\[ \alpha_{ij} = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Then it follows from the transformation equations (2.17) that the coordinate u3 of a
vector and the coordinate T33 of a second-order tensor remain unchanged under the
transformation and thus they have to be counted as invariants.
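
The following Python sketch illustrates this for one rotation about the z-axis. The angle (cos φ = 3/5, sin φ = 4/5), the sample vector, and the sample tensor are arbitrary assumptions of ours, chosen so that the arithmetic stays exact.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# Rotation about the z-axis with cos(phi) = 3/5 and sin(phi) = 4/5.
ALPHA = [[F(3, 5), F(-4, 5), 0], [F(4, 5), F(3, 5), 0], [0, 0, 1]]

def transform_vector(alpha, v):
    """u~_i = alpha_pi u_p."""
    return matvec(transpose(alpha), v)

def transform_tensor(alpha, m):
    """T~_ij = alpha_mi alpha_nj T_mn."""
    return matmul(matmul(transpose(alpha), m), alpha)

u = [1, 2, 3]
T = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
u_t = transform_vector(ALPHA, u)
T_t = transform_tensor(ALPHA, T)

print(u_t[2] == u[2], T_t[2][2] == T[2][2])   # u3 and T33 are unchanged
print(u_t[0] == u[0])                         # other coordinates change
```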

3. All tensors which are invariant under the transformations of a certain symmetry
group form an anisotropy class. We explain this by means of three examples with
second-order tensors.
The anisotropy class of general anisotropy includes all tensors; its symmetry
group consists of the identity transformation $\alpha_{ij} = \delta_{ij}$ and the inversion
$\alpha_{ij} = -\delta_{ij}$, because only under these transformations do the coordinates of an
arbitrary tensor remain unchanged.

We have

\[ \tilde T_{ij} = \alpha_{mi}\alpha_{nj} T_{mn} = \delta_{mi}\delta_{nj} T_{mn} = T_{ij}. \]

On the other hand, the anisotropy class of isotropy includes all tensors whose coor-
dinates are invariant under arbitrary orthogonal transformations. We already intro-
duced such tensors as isotropic tensors; they have the form

T = k δ,

because for the transformed coordinates, we have, according to the orthogonality
relation (2.6),

\[ \tilde T_{ij} = \alpha_{mi}\alpha_{nj} (k\, \delta_{mn}) = k\, \alpha_{mi}\alpha_{mj} = k\, \delta_{ij}. \]

As a third example, we consider the anisotropy class of transverse isotropy. Its sym-
metry group includes all rotations and rotary reflections with a fixed axis. The gen-
eral form of a transversely isotropic tensor of second order follows from Problem 5.5.
A transversely isotropic tensor can be seen as a tensor which depends only on the
direction n of the axis of rotation or rotary reflection: T = f (n). Thus, it remains to
ensure in the solution to Problem 5.5 that n is an (axial) unit vector, and we have

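\[ T = \alpha\, \delta + \beta\, n\,n + \gamma\, \varepsilon \cdot n. \tag{5.23} \]

Since n ⋅ n = 1, here the α, β, γ are, unlike in Problem 5.5, not functions, but arbitrary
constants.
If we choose the z-axis of the Cartesian coordinate system as the axis of rotation
or rotary reflection, i. e. if $n_1 = n_2 = 0$, $n_3 = 1$, then T has the coordinate matrix

\[
T_{ij} = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & \alpha \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \beta \end{pmatrix}
+ \begin{pmatrix} 0 & \gamma & 0 \\ -\gamma & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} \alpha & \gamma & 0 \\ -\gamma & \alpha & 0 \\ 0 & 0 & \alpha + \beta \end{pmatrix}.
\]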

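It is easy to see that, if we evaluate the transformation equations, the coordinates of
T remain unchanged under a rotation of the coordinate system about the z-axis, i. e.
we have

\[
\begin{aligned}
\tilde T_{ij} &= \alpha_{mi}\,\alpha_{nj}\,T_{mn} = \alpha^{\mathrm T}_{im}\,T_{mn}\,\alpha_{nj} \\
&= \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \alpha & \gamma & 0 \\ -\gamma & \alpha & 0 \\ 0 & 0 & \alpha + \beta \end{pmatrix}
\begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix}
\alpha\cos\varphi - \gamma\sin\varphi & \alpha\sin\varphi + \gamma\cos\varphi & 0 \\
-\alpha\sin\varphi - \gamma\cos\varphi & \alpha\cos\varphi - \gamma\sin\varphi & 0 \\
0 & 0 & \alpha + \beta
\end{pmatrix}
\begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix}
\alpha\cos^2\varphi - \gamma\sin\varphi\cos\varphi + \alpha\sin^2\varphi + \gamma\sin\varphi\cos\varphi &
-\alpha\cos\varphi\sin\varphi + \gamma\sin^2\varphi + \alpha\cos\varphi\sin\varphi + \gamma\cos^2\varphi & 0 \\
-\alpha\cos\varphi\sin\varphi - \gamma\cos^2\varphi + \alpha\cos\varphi\sin\varphi - \gamma\sin^2\varphi &
\alpha\sin^2\varphi + \gamma\sin\varphi\cos\varphi + \alpha\cos^2\varphi - \gamma\sin\varphi\cos\varphi & 0 \\
0 & 0 & \alpha + \beta
\end{pmatrix} \\
&= \begin{pmatrix} \alpha & \gamma & 0 \\ -\gamma & \alpha & 0 \\ 0 & 0 & \alpha + \beta \end{pmatrix}.
\end{aligned}
\]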

The class of transversely isotropic tensors has some subclasses of tensors which we
have already seen earlier. If we set α = cos ϑ, β = ±1−cos ϑ, γ = − sin ϑ, then we obtain
for T, according to (3.77) and (3.81), the general form of an orthogonal tensor, with the
angle of rotation ϑ and the axis of rotation or rotary reflection n. If we set α = β = 0,
then T is antimetric and γ n is the corresponding vector of T. If we choose γ = 0, then
the tensor is symmetric with an eigenvalue α of multiplicity 2 and another eigenvalue
α + β of multiplicity 1; n is the eigendirection corresponding to the eigenvalue α + β
of multiplicity 1.
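
The form (5.23) and the special cases just mentioned can be reproduced with a short Python sketch. The function names, the constants, and the rational angles are our own assumptions; `levi_civita` uses a closed formula for the ε-symbol that is valid for indices 0, 1, 2.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def transversely_isotropic(a, b, c, n):
    """T_ij = a delta_ij + b n_i n_j + c eps_ijk n_k."""
    return [[a * (i == j) + b * n[i] * n[j]
             + c * sum(levi_civita(i, j, k) * n[k] for k in range(3))
             for j in range(3)] for i in range(3)]

n = [0, 0, 1]                        # axis of rotation chosen as the z-axis
T = transversely_isotropic(2, 5, 7, n)

# Rotation about the z-axis with cos(phi) = 5/13 and sin(phi) = 12/13.
ALPHA = [[F(5, 13), F(-12, 13), 0], [F(12, 13), F(5, 13), 0], [0, 0, 1]]
T_rotated = matmul(matmul(transpose(ALPHA), T), ALPHA)

# Special case: a = cos(theta), b = 1 - cos(theta), c = -sin(theta)
# with cos(theta) = 3/5 and sin(theta) = 4/5 gives an orthogonal tensor.
R = transversely_isotropic(F(3, 5), 1 - F(3, 5), -F(4, 5), n)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(T)
print(T_rotated == T, matmul(R, transpose(R)) == identity)
```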
We summarize the results of our discussion on anisotropy classes of second-order
tensors in a table. For the anisotropy classes, the number of tensors T which are
contained in the class increases from the first class to the last; the tensors of a
particular anisotropy class are always included in the next anisotropy class. For the
symmetry groups, the number of orthogonal matrices $\alpha_{ij}$ which are included in the group
increases from the last group to the first; the matrices of a symmetry group are always
contained in the previous symmetry group.

Anisotropy class      General tensor T                                                   Symmetry group $\alpha_{ij}$

Isotropy              $k\, \delta$                                                        arbitrary
Transverse isotropy   $\alpha\, \delta + \beta\, n\,n + \gamma\, \varepsilon \cdot n$     $\begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $n = e_z$
General anisotropy    arbitrary                                                          $\pm\delta_{ij}$

4. Anisotropy can also be taken into account by isotropic tensor functions. To explain
this we consider the stress–strain relation for an elastic body. When we introduced the
function T = f(D) between the symmetric stress tensor T and the symmetric strain
tensor D in Section 5.1, we tacitly assumed an isotropic solid, and, according to
Section 5.4.4, No. 2, the function f has the following representation:

\[ T = k_1\, \delta + k_2\, D + k_3\, D^2, \qquad k_i = f(\operatorname{tr} D, \operatorname{tr} D^2, \operatorname{tr} D^3). \]

If, on the other hand, the solid is transversely isotropic, then it has a distinguished
direction n, which we have to include in the argument list of the function f , i. e. we are
looking for a representation for T = f (D, n). We can use the result from Section 5.4.4,
No. 3, if in addition we take into account that T is symmetric and that n is a unit vector,
so that n ⋅ n is not an invariant:

\[ T = k_1^*\, \delta + k_2^*\, D + k_3^*\, D^2 + k_4^*\, n\,n + k_5^*\, (D \cdot n\,n + n\,n \cdot D) + k_6^*\, D \cdot n\,n \cdot D, \tag{a} \]

\[ k_i^* = f(\operatorname{tr} D, \operatorname{tr} D^2, \operatorname{tr} D^3, n \cdot D \cdot n, n \cdot D^2 \cdot n). \]

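This expression includes the stress–strain relation for an isotropic solid as a special
case, if we set $k_4^* = k_5^* = k_6^* = 0$ and if $k_1^*$, $k_2^*$, and $k_3^*$ do not depend on n.
For small strains we can approximate (a) by a linear relation between the coordinates
of T and D. Then $k_3^* = k_6^* = 0$, $k_2^* = \alpha$, and $k_5^* = \beta$ are constant, and $k_1^*$ and $k_4^*$
are linear functions of the invariants tr D and n ⋅ D ⋅ n which are linear in D, so we have

\[ k_1^* = \kappa \operatorname{tr} D + \lambda\, n \cdot D \cdot n, \qquad k_4^* = \mu \operatorname{tr} D + \nu\, n \cdot D \cdot n. \]

Then (a) simplifies in index notation to

\[ T_{ij} = (\kappa\, D_{pp} + \lambda\, n_p D_{pq} n_q)\, \delta_{ij} + (\mu\, D_{pp} + \nu\, n_p D_{pq} n_q)\, n_i n_j + \alpha\, D_{ij} + \beta\, (D_{ip} n_p n_j + n_i n_p D_{pj}). \]

If we introduce appropriate Kronecker symbols, we can factor out $D_{kl}$ as follows:

\[
\begin{aligned}
T_{ij} &= (\kappa\, \delta_{pk}\delta_{pl} D_{kl} + \lambda\, n_p \delta_{pk} D_{kl} \delta_{lq} n_q)\, \delta_{ij}
+ (\mu\, \delta_{pk}\delta_{pl} D_{kl} + \nu\, n_p \delta_{pk} D_{kl} \delta_{lq} n_q)\, n_i n_j \\
&\quad + \alpha\, \delta_{ik}\delta_{jl} D_{kl} + \beta\, (\delta_{ik} D_{kl} \delta_{lp} n_p n_j + n_i n_p \delta_{pk} D_{kl} \delta_{lj}) \\
&= \{\kappa\, \delta_{ij}\delta_{kl} + \lambda\, \delta_{ij} n_k n_l + \mu\, n_i n_j \delta_{kl} + \nu\, n_i n_j n_k n_l + \alpha\, \delta_{ik}\delta_{jl}
+ \beta\, (\delta_{ik} n_j n_l + n_i n_k \delta_{jl})\}\, D_{kl}.
\end{aligned}
\]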

Since both $T_{ij}$ and $D_{kl}$ are symmetric, we can write

\[
\begin{aligned}
T_{ij} = \{&\kappa\, \delta_{ij}\delta_{kl} + \lambda\, \delta_{ij} n_k n_l + \mu\, n_i n_j \delta_{kl} + \nu\, n_i n_j n_k n_l + \tfrac{1}{2}\alpha\, (\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}) \\
&+ \tfrac{1}{2}\beta\, (\delta_{ik} n_j n_l + n_i n_k \delta_{jl} + \delta_{il} n_j n_k + n_i n_l \delta_{jk})\}\, D_{kl}.
\end{aligned}
\]

The expression in the curly brackets can be interpreted as the coordinates of a
fourth-order elasticity tensor E with the symmetry properties

\[ E_{ijkl} = E_{jikl} = E_{ijlk}, \]

so we can shorten this to

\[ T_{ij} = E_{ijkl} D_{kl}. \]

The constants α, β, κ, λ, μ, and ν must be determined from experiments (possibly by
making further physical assumptions).
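
As a plausibility check, the elasticity tensor can be assembled numerically and compared with the linear stress–strain relation in index notation. The following Python sketch is only an illustration: the values of the six constants, the unit vector n, the strain coordinates, and all function names are arbitrary assumptions of ours.

```python
from fractions import Fraction as F
from itertools import product

def d(i, j):
    """Kronecker symbol."""
    return 1 if i == j else 0

def elasticity(n, kappa, lam, mu, nu, alpha, beta):
    """Coordinates E_ijkl of the expression in the curly brackets."""
    E = {}
    for i, j, k, l in product(range(3), repeat=4):
        E[i, j, k, l] = (
            kappa * d(i, j) * d(k, l) + lam * d(i, j) * n[k] * n[l]
            + mu * n[i] * n[j] * d(k, l) + nu * n[i] * n[j] * n[k] * n[l]
            + F(1, 2) * alpha * (d(i, k) * d(j, l) + d(i, l) * d(j, k))
            + F(1, 2) * beta * (d(i, k) * n[j] * n[l] + n[i] * n[k] * d(j, l)
                                + d(i, l) * n[j] * n[k] + n[i] * n[l] * d(j, k)))
    return E

def stress_from_E(E, D):
    """T_ij = E_ijkl D_kl."""
    return [[sum(E[i, j, k, l] * D[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def stress_direct(n, D, kappa, lam, mu, nu, alpha, beta):
    """Linear relation written with tr D, n.D.n, D, and D.n (D symmetric)."""
    tr_D = D[0][0] + D[1][1] + D[2][2]
    Dn = [sum(D[i][p] * n[p] for p in range(3)) for i in range(3)]
    nDn = sum(n[i] * Dn[i] for i in range(3))
    return [[(kappa * tr_D + lam * nDn) * d(i, j)
             + (mu * tr_D + nu * nDn) * n[i] * n[j]
             + alpha * D[i][j] + beta * (Dn[i] * n[j] + n[i] * Dn[j])
             for j in range(3)] for i in range(3)]

n = [F(2, 3), F(1, 3), F(2, 3)]             # unit vector with rational coordinates
consts = (1, 2, 3, 4, 5, 6)                 # kappa, lam, mu, nu, alpha, beta
D = [[1, 2, 0], [2, -1, 3], [0, 3, 4]]      # symmetric sample strain

E = elasticity(n, *consts)
T_E = stress_from_E(E, D)
T_direct = stress_direct(n, D, *consts)
symmetric_E = all(E[i, j, k, l] == E[j, i, k, l] == E[i, j, l, k]
                  for i, j, k, l in product(range(3), repeat=4))
print(T_E == T_direct, symmetric_E)
```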

References

We would like to recommend a few books for further reading and we start with our
favorites:
– Bernard Schutz: Geometrical Methods of Mathematical Physics. Cambridge University
Press, 1980.
– Marcelo Epstein: Differential Geometry, Basic Notions and Physical Examples.
Springer, 2014.
– Theodore Frankel: The Geometry of Physics – An Introduction, 3rd Edition. Cam-
bridge University Press, 1997, 2004, 2012.

Schutz encourages his readers to develop both a pictorial way of thinking and a feel-
ing for geometrical tools in physics. He emphasizes the idea that tensors are geomet-
rical objects independent of any coordinate system. The book has a large chapter on
physical applications including thermodynamics, mechanics, electromagnetism, and
gauge theories.
Epstein believes, just as we do, that what is true and good must also be beautiful,
symmetric, and well balanced. He introduces the concept of a continuum as well as
the field theories in the language of topology and modern differential geometry, and
emphasizes the richness of these ideas using illustrations from continuum mechanics
and materials science.
Frankel aims to provide a working knowledge of geometric techniques for a deeper
understanding of both classical and modern physics and engineering at the level of
advanced undergraduate and graduate students. He promotes a language with which
mathematicians and scientists can communicate and includes applications to fluid
dynamics, electromagnetism, thermodynamics, elasticity, and relativity.
Tensors naturally appear in the study of curves, surfaces, and manifolds, so many
differential geometry books include chapters on tensor analysis.
An elementary account of the geometry of curves and surfaces, which only re-
quires a standard background in calculus and linear algebra, and an introduction to
the main ideas of differential geometry are given in:
– Barrett O’Neill: Elementary Differential Geometry, 2nd Edition. Elsevier, 1997,

For linear algebra and multilinear algebra, we recommend Strang and Greub, respectively:
– Gilbert Strang: Introduction to Linear Algebra, 5th Edition. Wellesley–Cambridge
Press, 2016.
– Werner Greub: Multilinear Algebra, 2nd Edition. Springer, 1978.


The most important facts about a large number of curvilinear coordinate systems can
be found in:
– Parry Moon, Domina Eberle Spencer: Field Theory Handbook. Springer, 1988.

A generalization of tensor analysis is presented in:

– Parry Moon, Domina Eberle Spencer: Theory of Holors. Cambridge University
Press, 2005.

Readers who would like to learn more about the representation theory, introduced
in Chapter 5, are encouraged to consult the specialized literature on the subject. We
recommend:
– Jean-Paul Boehler (ed.): Application of Tensor Functions in Solid Mechanics. CISM
Courses and Lectures, No. 292. Springer, 1987.
– A. J. M. Spencer: Theory of invariants. In: A. Cemal Eringen (ed.): Continuum
Physics, Vol. I. Academic Press, 1971.

For problems that go beyond the scope of this book, we refer to the following books:
– A. M. Goodbody: Cartesian Tensors. Chichester: Ellis Horwood Limited, 1982.
– Jerald L. Ericksen: Tensor fields (Appendix to: Clifford Truesdell, R. A. Toupin: The
Classical Field Theories). In: Handbuch der Physik, Vol. III/1. Springer, 1960.
– Mikhail Itskov: Tensor Algebra and Tensor Analysis for Engineers, 4th Edition.
Springer, 2015.
– David Lovelock, Hanno Rund: Tensors, Differential Forms, and Variational Prin-
ciples. Dover, 1989.

The following book is a very readable introduction to differential forms and their ap-
plications. Differential forms were first introduced by Cartan and they generalize an-
timetric tensors:
– Harley Flanders: Differential Forms with Applications to the Physical Sciences.
Dover, 1989.

For further reading we recommend more of our favorites:

– Ralph Abraham, Jerrold E. Marsden, Tudor Ratiu: Manifolds, Tensor Analysis, and
Applications, 2nd Edition. Addison Wesley, 1988.
– Philippe G. Ciarlet: An Introduction to Differential Geometry with Applications to
Elasticity. Springer, 2005.
– Darryl D. Holm: Geometric Mechanics – Part I: Dynamics and Symmetry, 2nd Edi-
tion. Imperial College Press, 2011.
– Darryl D. Holm: Geometric Mechanics – Part II: Rotating, Translating and Rolling,
2nd Edition. Imperial College Press, 2011.
– Bo-Yu Hou, Bo-Yuan Hou: Differential Geometry for Physicists. World Scientific,

– Charles W. Misner, Kip S. Thorne, John Archibald Wheeler: Gravitation. Macmillan,
W. H. Freeman and Company, 1973.
– Mikio Nakahara: Geometry, Topology, and Physics, 3rd Edition. CRC Press, 2017.
– M. Spivak: A Comprehensive Introduction to Differential Geometry, Vols. 1–5.
Publish or Perish, Inc., 1979.
– Shlomo Sternberg: Curvature in Mathematics and in Physics. Dover, 2012.

A Solutions to the Problems

Problem 1.1.
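A. $a_i B_i = a_1 B_1 + a_2 B_2 + a_3 B_3 + a_4 B_4$.
B. $A_{ii} = A_{11} + A_{22} + A_{33} + A_{44}$.
C. $\dfrac{\partial u_i}{\partial x_i} = \dfrac{\partial u_1}{\partial x_1} + \dfrac{\partial u_2}{\partial x_2} + \dfrac{\partial u_3}{\partial x_3} + \dfrac{\partial u_4}{\partial x_4}$.
D. $\dfrac{\partial^2 \varphi}{\partial x_i^2}$ stands for $\dfrac{\partial^2 \varphi}{\partial x_i\, \partial x_i}$:
   $\dfrac{\partial^2 \varphi}{\partial x_i^2} = \dfrac{\partial^2 \varphi}{\partial x_1^2} + \dfrac{\partial^2 \varphi}{\partial x_2^2} + \dfrac{\partial^2 \varphi}{\partial x_3^2} + \dfrac{\partial^2 \varphi}{\partial x_4^2}$.
E. In order to gain experience, let us carry out the two summations one by one:

\[
\begin{aligned}
a_{ij} b_{ij} &= a_{1j} b_{1j} + a_{2j} b_{2j} + a_{3j} b_{3j} \\
&= a_{11} b_{11} + a_{12} b_{12} + a_{13} b_{13} + a_{21} b_{21} + a_{22} b_{22} + a_{23} b_{23} + a_{31} b_{31} + a_{32} b_{32} + a_{33} b_{33}.
\end{aligned}
\]

F. \[
\begin{aligned}
a_{ii} b_{jj} &= (a_{11} + a_{22} + a_{33})(b_{11} + b_{22} + b_{33}) \\
&= a_{11} b_{11} + a_{22} b_{11} + a_{33} b_{11} + a_{11} b_{22} + a_{22} b_{22} + a_{33} b_{22} + a_{11} b_{33} + a_{22} b_{33} + a_{33} b_{33}.
\end{aligned}
\]

G. \[
\begin{aligned}
\frac{\partial u_i}{\partial x_j} \frac{\partial u_i}{\partial x_j}
&= \frac{\partial u_1}{\partial x_j} \frac{\partial u_1}{\partial x_j} + \frac{\partial u_2}{\partial x_j} \frac{\partial u_2}{\partial x_j} \\
&= \left(\frac{\partial u_1}{\partial x_1}\right)^2 + \left(\frac{\partial u_1}{\partial x_2}\right)^2 + \left(\frac{\partial u_2}{\partial x_1}\right)^2 + \left(\frac{\partial u_2}{\partial x_2}\right)^2.
\end{aligned}
\]
(As soon as we substitute numbers for the indices, i. e. when the summation
convention does not apply anymore, we can clearly write squares.)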
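
The difference between the double sums in E and F can also be seen by writing the summations as explicit loops. The following Python sketch uses arbitrary sample coordinates and function names of our own, with indices running from 0 to 2 instead of 1 to 3.

```python
def double_sum(a, b):
    """a_ij b_ij: both indices are shared by the two factors."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def trace_product(a, b):
    """a_ii b_jj: each factor is summed over its own index pair."""
    return sum(a[i][i] for i in range(3)) * sum(b[j][j] for j in range(3))

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]

print(double_sum(a, b), trace_product(a, b))
```

In general the two expressions give different numbers, because in F the two summations are independent of each other.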

Problem 1.2.
A. During substitution we have to ensure two things:
– Substitution must not lead to an index appearing more than twice in a term.
We can avoid this by renaming those summation indices which appear also
in one of the other terms (as free indices or as summation indices).
– The quantity to be substituted (here u) must have the same index in the equa-
tion in which it is inserted and in the equation we use to substitute. If this is
not the case, we must rename one of the two indices.
In this problem it is easiest to rename the index k in φ = uk vk to i: φ = ui vi ; this
gives the solution φ = Aik nk vi .
B. If we want to keep i as a free index, we can write the substitution equations e. g.
as um = Bmj vj and Cmi = pm qi . Substituting then gives wi = pm qi Bmj vj .



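C. If we substitute τij and dij and multiply the terms, we get

$$\phi = \tau_{ij}\,d_{ij} = \frac{\eta}{2}\left(\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i}\right)\left(\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i}\right) = \frac{\eta}{2}\left(\frac{\partial v_i}{\partial x_j}\frac{\partial v_i}{\partial x_j} + 2\,\frac{\partial v_j}{\partial x_i}\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i}\frac{\partial v_j}{\partial x_i}\right).$$

If we rename in the third term j to i and i to j, we see that the first and the third
terms are equal, so we get

$$\phi = \eta\left(\frac{\partial v_i}{\partial x_j}\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i}\frac{\partial v_i}{\partial x_j}\right).$$

So we get the solution to this problem as

$$\varrho\,T\,\frac{\mathrm{D}s}{\mathrm{D}t} = \eta\left(\frac{\partial v_i}{\partial x_j}\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i}\frac{\partial v_i}{\partial x_j}\right) + \kappa\,\frac{\partial^2 T}{\partial x_i^2}.$$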

Problem 1.3.
A. 1. Yes (i is a free index on the left and on the right).
2. No (no free index on the left, i and j are free indices on the right).
3. No (no free index on the left, i is a free index on the right).
4. Yes (i is a free index on the left and on the right).
5. Yes (m and n are free indices on the left and on the right).
6. No (i and k are free indices on the left, i and j are free indices on the right).
7. Yes (i and k are free indices on the left and on the right).
8. Yes (n is a free index on the left and on the right).
9. Yes (no free index on the left and on the right).
10. Yes (i is a free index on the left and on the right).
B. 1. A1 (B11 + B22 ) = C1 (D11 + D22 ), A2 (B11 + B22 ) = C2 (D11 + D22 ).
4. A1 B1 = C1 , A2 B2 = C2 .
5. A1 B1 = C11 , A1 B2 = C12 , A2 B1 = C21 , A2 B2 = C22 .
7. We obtain four equations:
A1 B1 = A1 B1 , A1 B2 = A2 B1 , A2 B1 = A1 B2 , A2 B2 = A2 B2 .
The first equation and the last equation are trivial and the other two equations
are equivalent, so that the only nontrivial statement is A1 B2 = A2 B1 .
8. μ11 A1 + μ21 A2 = c1 (F11 + F22 ), μ12 A1 + μ22 A2 = c2 (F11 + F22 ).
9. A = α1 C11 + α2 C22 .
10. A11 = B11 C111 + B12 C221 , A22 = B21 C112 + B22 C222 .


Problem 1.4.
A. We calculate a determinant of order three with a zero element most conveniently
by an expansion with respect to the row (or column) containing the zero. Expan-
sion with respect to the first row gives Δ = 1(−2 + 6) − 1(6 − 1) = −1.
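B. A determinant of order larger than three is best transformed into triangular form;
see also Section 1.6.1. Each step below states the operation that was applied.

Dividing the second column by 2 and the second row by 3 gives

$$\Delta = \begin{vmatrix} 1 & 2 & -4 & 4 \\ 3 & 6 & 9 & 0 \\ -2 & 8 & 1 & 9 \\ 4 & 2 & -2 & 0 \end{vmatrix} = 6 \begin{vmatrix} 1 & 1 & -4 & 4 \\ 1 & 1 & 3 & 0 \\ -2 & 4 & 1 & 9 \\ 4 & 1 & -2 & 0 \end{vmatrix}.$$

Adding the first row, multiplied by −1, 2, and −4, to the second, third, and fourth rows, and then interchanging the second and fourth rows gives

$$\Delta = 6 \begin{vmatrix} 1 & 1 & -4 & 4 \\ 0 & 0 & 7 & -4 \\ 0 & 6 & -7 & 17 \\ 0 & -3 & 14 & -16 \end{vmatrix} = -6 \begin{vmatrix} 1 & 1 & -4 & 4 \\ 0 & -3 & 14 & -16 \\ 0 & 6 & -7 & 17 \\ 0 & 0 & 7 & -4 \end{vmatrix}.$$

Adding the second row, multiplied by 2, to the third row, and then dividing the third row by 3 gives

$$\Delta = -6 \begin{vmatrix} 1 & 1 & -4 & 4 \\ 0 & -3 & 14 & -16 \\ 0 & 0 & 21 & -15 \\ 0 & 0 & 7 & -4 \end{vmatrix} = -18 \begin{vmatrix} 1 & 1 & -4 & 4 \\ 0 & -3 & 14 & -16 \\ 0 & 0 & 7 & -5 \\ 0 & 0 & 7 & -4 \end{vmatrix}.$$

Finally, adding the third row, multiplied by −1, to the fourth row gives

$$\Delta = -18 \begin{vmatrix} 1 & 1 & -4 & 4 \\ 0 & -3 & 14 & -16 \\ 0 & 0 & 7 & -5 \\ 0 & 0 & 0 & 1 \end{vmatrix} = -18 \cdot 1 \cdot (-3) \cdot 7 \cdot 1 = 378.$$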
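
The reduction to triangular form can be cross-checked mechanically. The sketch below is an assumption-laden illustration, not the book's scheme: it uses plain row elimination with exact fractions (no column divisions), and the function name `determinant` is hypothetical.

```python
from fractions import Fraction

def determinant(rows):
    """Determinant via reduction to upper triangular form, in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Look for a nonzero pivot; a row interchange flips the sign.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det

delta = determinant([[1, 2, -4, 4],
                     [3, 6, 9, 0],
                     [-2, 8, 1, 9],
                     [4, 2, -2, 0]])
print(delta)
```

The intermediate matrices differ from those above because the order of operations differs, but the value of the determinant is the same.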

Problem 1.5.
A. δii = (δ11 , δ22 , δ33 , δ44 , δ55 ) = (1, 1, 1, 1, 1).
B. δii = δ11 + δ22 + δ33 + δ44 + δ55 = 5.

Problem 1.6.
A. δij δjk = δik .
B. δi2 δik is equal to δk2 or δ2k , depending on whether we substitute the i in the first
δ by the k in the second δ, or if we substitute the i in the second δ by the 2 in the
first δ; since δk2 = δ2k , the result is the same: δk2 δ3k = δ32 = 0.
C. δ1k Ak = A1 .
D. δi2 δjk Aij = A2k .


Problem 1.7.
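A. Following the arguments from Section 1.4.2, No. 2, we get:
1. for N = 2: $\delta^{12}_{12} = \delta^{21}_{21} = 1$, $\delta^{12}_{21} = \delta^{21}_{12} = -1$,
2. for N = 3: $\delta^{12}_{12} = \delta^{21}_{21} = \delta^{13}_{13} = \delta^{31}_{31} = \delta^{23}_{23} = \delta^{32}_{32} = 1$,
$\delta^{12}_{21} = \delta^{21}_{12} = \delta^{13}_{31} = \delta^{31}_{13} = \delta^{23}_{32} = \delta^{32}_{23} = -1$.
B. $\delta^{ij}_{ij}$ is the sum of all elements with identical upper and lower indices and the
solution to the sub-problem A 2 shows that this sum is six. We obtain the same result
using definition (1.19):

$$\delta^{ij}_{ij} = \begin{vmatrix} \delta_{ii} & \delta_{ij} \\ \delta_{ji} & \delta_{jj} \end{vmatrix} = \delta_{ii}\,\delta_{jj} - \delta_{ij}\,\delta_{ji} = 3 \cdot 3 - 3 = 6.$$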
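
The values in A and the sum in B can be enumerated directly. This sketch assumes that definition (1.19) is the 2×2 determinant of ordinary Kronecker deltas, as the formula in B suggests; the function names are hypothetical.

```python
from itertools import product

def delta(i, j):
    """Ordinary Kronecker delta."""
    return 1 if i == j else 0

def delta2(i, j, k, l):
    """Generalized delta with upper indices i, j and lower indices k, l,
    written as a 2x2 determinant of ordinary deltas (assumed form of (1.19))."""
    return delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)

N = 3
# Contraction over both index pairs: sum of all elements with equal upper and lower indices.
contracted = sum(delta2(i, j, i, j) for i, j in product(range(1, N + 1), repeat=2))
print(contracted)
```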

Problem 1.8.

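$$\begin{pmatrix} \delta_{11} & \delta_{12} \\ \delta_{21} & \delta_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} \varepsilon_{11} & \varepsilon_{12} \\ \varepsilon_{21} & \varepsilon_{22} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$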

Problem 1.9.
The εijk are different from zero only if all three indices are different. If only one index is
a free index and if we have chosen a number for this index, only two elements, of the
nine possible elements with this free index, are nonzero. This is used in the following.
A. vi = εijk ωj rk :
v1 = ε123 ω2 r3 + ε132 ω3 r2 = ω2 r3 − ω3 r2 ,
v2 = ε231 ω3 r1 + ε213 ω1 r3 = ω3 r1 − ω1 r3 ,
v3 = ε312 ω1 r2 + ε321 ω2 r1 = ω1 r2 − ω2 r1 .
These three equations are usually summarized as the vector product of ω and r:
v = ω × r.
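B. $\Omega_j = \varepsilon_{ijk}\,\dfrac{\partial v_i}{\partial x_k}$:
$$\Omega_1 = \varepsilon_{312}\,\frac{\partial v_3}{\partial x_2} + \varepsilon_{213}\,\frac{\partial v_2}{\partial x_3} = \frac{\partial v_3}{\partial x_2} - \frac{\partial v_2}{\partial x_3},$$
$$\Omega_2 = \varepsilon_{123}\,\frac{\partial v_1}{\partial x_3} + \varepsilon_{321}\,\frac{\partial v_3}{\partial x_1} = \frac{\partial v_1}{\partial x_3} - \frac{\partial v_3}{\partial x_1},$$
$$\Omega_3 = \varepsilon_{231}\,\frac{\partial v_2}{\partial x_1} + \varepsilon_{132}\,\frac{\partial v_1}{\partial x_2} = \frac{\partial v_2}{\partial x_1} - \frac{\partial v_1}{\partial x_2}.$$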
These three equations are usually summarized as the curl of v: Ω = curl v.
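
As an illustration of Part A, the sum vi = εijk ωj rk can be evaluated by brute force over all index values. This is only a sketch: indices run over 0, 1, 2 instead of 1, 2, 3, the closed-form expression for ε is one common choice, and the sample vectors are hypothetical.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}: +1, -1, or 0."""
    return (j - i) * (k - i) * (k - j) // 2

def cross(omega, r):
    """v_i = eps_ijk omega_j r_k, summed over j and k."""
    return [sum(epsilon(i, j, k) * omega[j] * r[k]
                for j in range(3) for k in range(3))
            for i in range(3)]

v = cross([1, 2, 3], [4, 5, 6])
print(v)
```

Only two of the nine terms per component are nonzero, which matches the observation made at the start of this solution.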


Problem 1.10.
We compute the inverse of the matrix using the Gauss–Jordan algorithm (Section 1.6.2).
We have
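$$\left(\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 2 & 1 & 3 & 0 & 1 & 0 \\ 1 & 1 & 2 & 0 & 0 & 1 \end{array}\right).$$

Adding the first row, multiplied by −2 and −1, to the second and third rows:

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & 1 & -1 & -2 & 1 & 0 \\ 0 & 1 & 0 & -1 & 0 & 1 \end{array}\right).$$

Adding the second row, multiplied by −1, to the third row:

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & 1 & -1 & -2 & 1 & 0 \\ 0 & 0 & 1 & 1 & -1 & 1 \end{array}\right).$$

Adding the third row, multiplied by −2 and 1, to the first and second rows:

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -1 & 2 & -2 \\ 0 & 1 & 0 & -1 & 0 & 1 \\ 0 & 0 & 1 & 1 & -1 & 1 \end{array}\right).$$

Check:

$$\begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & 3 \\ 1 & 1 & 2 \end{pmatrix} \begin{pmatrix} -1 & 2 & -2 \\ -1 & 0 & 1 \\ 1 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$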


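$$\left(\begin{array}{cccc|cccc} 0 & 1 & 1 & -1 & 1 & 0 & 0 & 0 \\ 1 & 2 & -1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 3 & -1 & 0 & 0 & 0 & 1 & 0 \\ -1 & -2 & 1 & -2 & 0 & 0 & 0 & 1 \end{array}\right).$$

Interchanging the first two rows:

$$\left(\begin{array}{cccc|cccc} 1 & 2 & -1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 & 0 & 0 \\ 1 & 3 & -1 & 0 & 0 & 0 & 1 & 0 \\ -1 & -2 & 1 & -2 & 0 & 0 & 0 & 1 \end{array}\right).$$

Adding the first row, multiplied by −1 and 1, to the third and fourth rows:

$$\left(\begin{array}{cccc|cccc} 1 & 2 & -1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 1 & 0 & 1 \end{array}\right).$$

Adding the second row, multiplied by −2 and −1, to the first and third rows:

$$\left(\begin{array}{cccc|cccc} 1 & 0 & -3 & 3 & -2 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & -1 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 1 & 0 & 1 \end{array}\right).$$

Multiplying the third row by −1:

$$\left(\begin{array}{cccc|cccc} 1 & 0 & -3 & 3 & -2 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & -1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 1 & 0 & 1 \end{array}\right).$$

Adding the third row, multiplied by 3 and −1, to the first and second rows, and multiplying the fourth row by −1:

$$\left(\begin{array}{cccc|cccc} 1 & 0 & 0 & 3 & 1 & 4 & -3 & 0 \\ 0 & 1 & 0 & -1 & 0 & -1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 & 0 & -1 & 0 & -1 \end{array}\right).$$

Adding the fourth row, multiplied by −3 and 1, to the first and second rows:

$$\left(\begin{array}{cccc|cccc} 1 & 0 & 0 & 0 & 1 & 7 & -3 & 3 \\ 0 & 1 & 0 & 0 & 0 & -2 & 1 & -1 \\ 0 & 0 & 1 & 0 & 1 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 & 0 & -1 & 0 & -1 \end{array}\right).$$

Check:

$$\begin{pmatrix} 0 & 1 & 1 & -1 \\ 1 & 2 & -1 & 1 \\ 1 & 3 & -1 & 0 \\ -1 & -2 & 1 & -2 \end{pmatrix} \begin{pmatrix} 1 & 7 & -3 & 3 \\ 0 & -2 & 1 & -1 \\ 1 & 1 & -1 & 0 \\ 0 & -1 & 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$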
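
The Gauss–Jordan scheme used in this problem can be sketched in a few lines. The function name `gauss_jordan_inverse` is hypothetical, and the order of the row operations may differ from the one shown above; exact fractions are used so that the result can be compared entry by entry.

```python
from fractions import Fraction

def gauss_jordan_inverse(rows):
    """Invert a square matrix by reducing [A | I] to [I | A^(-1)]."""
    n = len(rows)
    # Augment each row of A with the corresponding row of the unit matrix.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(rows)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]      # row interchange if needed
        p = m[col][col]
        m[col] = [x / p for x in m[col]]         # scale the pivot row to 1
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

inverse_3 = gauss_jordan_inverse([[1, 0, 2], [2, 1, 3], [1, 1, 2]])
inverse_4 = gauss_jordan_inverse([[0, 1, 1, -1], [1, 2, -1, 1],
                                  [1, 3, -1, 0], [-1, -2, 1, -2]])
print(inverse_3)
print(inverse_4)
```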


Problem 1.11.
A. We have several options (as is often the case with proofs). The easiest way is to
transpose both sides of (1.54), according to (1.46). We consider each i, j-element of
the two matrices in the following equation:
$$\left(A^{\mathrm{T}}_{ij}\right)^{(-1)} = A^{(-1)}_{ji}, \qquad \left(A^{(-1)}_{ij}\right)^{\mathrm{T}} = A^{(-1)}_{ji}.$$

The right sides are equal and we establish the claim by equating the left sides.
B. The usual calculation rules (commutativity, associativity, distributivity) clearly
hold for calculations between elements, so we prove the claim for the elements
(we can also say: in element notation).
We start with the definition of the inverse in the form (1.51):

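$$(A_{ij} B_{jk})^{(-1)} A_{kl} B_{lm} = \delta_{im}.$$

To bring $A_{kl} B_{lm}$ to the right side, we multiply by $B^{(-1)}_{mn}$:

$$(A_{ij} B_{jk})^{(-1)} A_{kl}\,\underbrace{B_{lm} B^{(-1)}_{mn}}_{\delta_{ln}} = \underbrace{\delta_{im} B^{(-1)}_{mn}}_{B^{(-1)}_{in}},$$

$$(A_{ij} B_{jk})^{(-1)} A_{kn} = B^{(-1)}_{in}.$$

Multiplication by $A^{(-1)}_{np}$ gives

$$(A_{ij} B_{jk})^{(-1)}\,\underbrace{A_{kn} A^{(-1)}_{np}}_{\delta_{kp}} = B^{(-1)}_{in} A^{(-1)}_{np},$$

$$(A_{ij} B_{jp})^{(-1)} = B^{(-1)}_{in} A^{(-1)}_{np},$$

which was to be shown.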


Problem 1.12.
Here we use the algorithm from Section 1.6.3 to find the rank. We have
0 3 6 9 −3 : 3
0 0 2 −4 8 :2
1 2 −3 4 5
1 5 5 9 10
0 1 2 3 −1 ←
0 0 1 −2 4 ←
1 2 −3 4 5 ←
1 5 5 9 10
1 2 −3 4 5 −1
0 1 2 3 −1
0 0 1 −2 4
1 5 5 9 10 ←
1 2 −3 4 5 → Zeros
0 1 2 3 −1 −3
0 0 1 −2 4
0 3 8 5 5 ←
1 0 0 0 0
0 1 2 3 −1 → Zeros
0 0 1 −2 4 −2
0 0 2 −4 8 ←
1 0 0 0 0
0 1 0 0 0
0 0 1 −2 4 → Zeros
0 0 0 0 0
1 0 0 0 0 The matrix
0 1 0 0 0 has rank 3.
0 0 1 0 0
0 0 0 0 0


1 −3 2 0 −4 2
4 −11 10 −1 ←
−2 8 −5 3 ←
1 −3 2 0 → Zeros
0 1 2 −1 −2
0 2 −1 3 ←
1 0 0 0
0 1 2 −1 → Zeros
0 0 −5 5 : (−5)
1 0 0 0
0 1 0 0
0 0 1 −1 → Zeros
1 0 0 0 The matrix
0 1 0 0 has rank 3.
0 0 1 0
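
The two ranks can be confirmed with a short elimination routine. Note the assumption: the sketch below uses ordinary row echelon reduction and counts the pivots, which is a swapped-in technique and not the exact scheme of Section 1.6.3 (which also clears rows with column operations); the function name `rank` is hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank as the number of pivots found during row echelon reduction."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

rank_first = rank([[0, 3, 6, 9, -3], [0, 0, 2, -4, 8],
                   [1, 2, -3, 4, 5], [1, 5, 5, 9, 10]])
rank_second = rank([[1, -3, 2, 0], [4, -11, 10, -1], [-2, 8, -5, 3]])
print(rank_first, rank_second)
```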

Problem 1.13.
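Reflexivity: For $\underset{\sim}{A}$ we can take $\underset{\sim}{S} = \underset{\sim}{T} = \underset{\sim}{I}$:
$A_{ij} = \delta_{im} A_{mn} \delta_{nj}$.
Symmetry: From $A_{ij} = S_{im} B_{mn} T_{nj}$ it follows that

$$S^{(-1)}_{pi} A_{ij} T^{(-1)}_{jq} = \underbrace{S^{(-1)}_{pi} S_{im}}_{\delta_{pm}}\, B_{mn}\, \underbrace{T_{nj} T^{(-1)}_{jq}}_{\delta_{nq}} = B_{pq}.$$

Transitivity: From $A_{ij} = S_{im} B_{mn} T_{nj}$, $B_{mn} = P_{mr} C_{rs} Q_{sn}$ it follows that

$$A_{ij} = \underbrace{S_{im} P_{mr}}_{=:\,U_{ir}}\, C_{rs}\, \underbrace{Q_{sn} T_{nj}}_{=:\,V_{sj}} = U_{ir} C_{rs} V_{sj}.$$

Here, $\underset{\sim}{U}$ and $\underset{\sim}{V}$ are each the product of two regular matrices and hence they are,
according to (1.13), also regular.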

Problem 2.1.
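Assumption: $A_i = T_{ij} B_j$, (2.13) =: (a),
$\tilde{A}_i = \tilde{T}_{ij} \tilde{B}_j$, (2.14) =: (b),
$T_{ij} = \alpha_{im} \alpha_{jn} \tilde{T}_{mn}$, (2.15) =: (c),
$B_j = \alpha_{jk} \tilde{B}_k$, (2.11) =: (d).
Claim: $A_i = \alpha_{im} \tilde{A}_m$.
Proof:
$$A_i \overset{(a)}{=} T_{ij} B_j \overset{(c),\,(d)}{=} \alpha_{im}\, \underbrace{\alpha_{jn} \tilde{T}_{mn} \alpha_{jk}}_{\overset{(2.6)}{=}\ \delta_{nk} \tilde{T}_{mn}\ \overset{(1.17)}{=}\ \tilde{T}_{mk}}\, \tilde{B}_k,$$
$$A_i = \alpha_{im} \tilde{T}_{mk} \tilde{B}_k \overset{(b)}{=} \alpha_{im} \tilde{A}_m,$$
which was to be shown.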


Problem 2.2.
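$$a^{\mathrm{T}}_{ij} = \begin{pmatrix} 1 & 4 & 2 \\ 0 & -3 & -4 \\ 2 & 6 & 5 \end{pmatrix}.$$

$$a_{(ij)} = \frac{1}{2}\left[\begin{pmatrix} 1 & 0 & 2 \\ 4 & -3 & 6 \\ 2 & -4 & 5 \end{pmatrix} + \begin{pmatrix} 1 & 4 & 2 \\ 0 & -3 & -4 \\ 2 & 6 & 5 \end{pmatrix}\right] = \frac{1}{2}\begin{pmatrix} 2 & 4 & 4 \\ 4 & -6 & 2 \\ 4 & 2 & 10 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 2 \\ 2 & -3 & 1 \\ 2 & 1 & 5 \end{pmatrix},$$

$$a_{[ij]} = \frac{1}{2}\left[\begin{pmatrix} 1 & 0 & 2 \\ 4 & -3 & 6 \\ 2 & -4 & 5 \end{pmatrix} - \begin{pmatrix} 1 & 4 & 2 \\ 0 & -3 & -4 \\ 2 & 6 & 5 \end{pmatrix}\right] = \frac{1}{2}\begin{pmatrix} 0 & -4 & 0 \\ 4 & 0 & 10 \\ 0 & -10 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 0 \\ 2 & 0 & 5 \\ 0 & -5 & 0 \end{pmatrix}.$$

Check: $a_{(ij)} + a_{[ij]} = a_{ij}$,

$$\begin{pmatrix} 1 & 2 & 2 \\ 2 & -3 & 1 \\ 2 & 1 & 5 \end{pmatrix} + \begin{pmatrix} 0 & -2 & 0 \\ 2 & 0 & 5 \\ 0 & -5 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 2 \\ 4 & -3 & 6 \\ 2 & -4 & 5 \end{pmatrix}.$$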
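
The same decomposition can be reproduced numerically. A minimal sketch with plain nested lists follows; the variable names are illustrative only.

```python
a = [[1, 0, 2], [4, -3, 6], [2, -4, 5]]
n = len(a)

# Symmetric part a_(ij) = (a_ij + a_ji) / 2
sym = [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]
# Antimetric part a_[ij] = (a_ij - a_ji) / 2
anti = [[(a[i][j] - a[j][i]) / 2 for j in range(n)] for i in range(n)]
# Check: the two parts add up to the original matrix.
recomposed = [[sym[i][j] + anti[i][j] for j in range(n)] for i in range(n)]

print(sym)
print(anti)
```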

Problem 2.3.

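$$a\,b \mathrel{\hat{=}} \begin{pmatrix} -4 & 2 & 10 \\ -6 & 3 & 15 \\ -8 & 4 & 20 \end{pmatrix}, \qquad b\,a \mathrel{\hat{=}} \begin{pmatrix} -4 & -6 & -8 \\ 2 & 3 & 4 \\ 10 & 15 & 20 \end{pmatrix}.$$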
−8 4 20 10 15 20

Problem 2.4.
We cannot translate the sub-problems A, B, and C into matrix notation, because the
order of their tensors is greater than 2. Before translating into symbolic notation, we
need to rewrite each equation so that the order of the (free) indices is the same in all
terms; this is possible in all cases (B, C, and D) (otherwise translation into symbolic
notation would not be possible).
A. a b c = d ⇐⇒ aij bk cl = dijkl .

B. ai bkl cj = dijkl ,
ai cj bkl = dijkl ⇐⇒ a c b = d.


C. αi bkj = Aijk ,
αi bTjk = Aijk ⇐⇒ α bT = A.
D. aTmn = bnm ,
anm = bnm ⇐⇒ a = b ⇐⇒ $\underset{\sim}{a} = \underset{\sim}{b}$.

Problem 2.5.
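A. a ⋅ b = c ⇐⇒ aij bjk = cik ⇐⇒ $\underset{\sim}{a}\,\underset{\sim}{b} = \underset{\sim}{c}$.

B. b ⋅ a = c ⇐⇒ bij ajk = cik ⇐⇒ $\underset{\sim}{b}\,\underset{\sim}{a} = \underset{\sim}{c}$.
C. aik bij = cjk :
We have two ways to rewrite the left side, so that it can be translated:
aik bij = bTji aik and aik bij = aTki bij .
They lead to different translations:
– bTji aik = cjk can be translated immediately, i. e.
bTji aik = cjk ⇐⇒ bT ⋅ a = c ⇐⇒ $\underset{\sim}{b}^{\mathrm{T}}\,\underset{\sim}{a} = \underset{\sim}{c}$.

– In aTki bij = cjk we first set cjk = cTkj , so
aTki bij = cTkj ⇐⇒ aT ⋅ b = cT ⇐⇒ $\underset{\sim}{a}^{\mathrm{T}}\,\underset{\sim}{b} = \underset{\sim}{c}^{\mathrm{T}}$.
Both translations are correct; we will show later that they can be converted into
each other.
D. ai bk ai ck = d,
ai ai bk ck = d ⇐⇒ a ⋅ a b ⋅ c = d ⇐⇒ $(\underset{\sim}{a}^{\mathrm{T}}\,\underset{\sim}{a})\,(\underset{\sim}{b}^{\mathrm{T}}\,\underset{\sim}{c}) = d$.
E. This equation cannot be translated into symbolic notation, because symbolic no-
tation is usually not defined for isomers of tensors of order 3.
F. (a ⋅ b)T = c ⇐⇒ (aij bjk )T = cik ⇐⇒ $(\underset{\sim}{a}\,\underset{\sim}{b})^{\mathrm{T}} = \underset{\sim}{c}$.
G. (a ⋅ b ⋅ c)T = d ⇐⇒ (aij bjk ckl )T = dil ⇐⇒ $(\underset{\sim}{a}\,\underset{\sim}{b}\,\underset{\sim}{c})^{\mathrm{T}} = \underset{\sim}{d}$.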

Problem 2.6.
In the following forms the equation can be translated:

1. ai ai cj bk = djk , 2. cj ai ai bk = djk , 3. cj bk ai ai = djk .

In case 1, we initially translate a ⋅ a c b = d. Then all five associations

(a ⋅ a)(c b) = a ⋅ ((a c) b) = a ⋅ (a (c b)) = ((a ⋅ a) c) b = (a ⋅ (a c)) b

are equivalent.

In case 2, we initially translate c a ⋅ a b = d. Again, all five associations

(c a) ⋅ (a b) = c ((a ⋅ a) b) = c (a ⋅ (a b)) = ((c a) ⋅ a) b = (c (a ⋅ a)) b

are equivalent.


In case 3, we initially translate c b a ⋅ a = d. All five associations

(c b)(a ⋅ a) = c ((b a) ⋅ a) = c (b (a ⋅ a)) = ((c b) a) ⋅ a = (c (b a)) ⋅ a

are again equivalent here.

So in all three cases parentheses need not be set; all five possible ways to inter-
pret the expression lead to the same result.

Problem 2.7.
A. (a b)T is translated to (ai bj )T ; carrying out the transposition gives (ai bj )T = aj bi . To make
the order of the free indices on both sides equal, we change on the right side the
order of the factors aj bi = bi aj , which gives (ai bj )T = bi aj . This equation can
be translated and we get (a b)T = b a, Q. E. D.
B. Translating a ⋅ b gives ai bij . Now we have ai bij = bTji ai , which translates into
a ⋅ b = bT ⋅ a, Q. E. D.
C. Translating (a ⋅ b ⋅ c)T gives (aij bjk ckl )T . Now we have (aij bjk ckl )T = alj bjk cki =
cki bjk alj = cTik bTkj aTjl . Translation of the leftmost and the rightmost terms gives
(a ⋅ b ⋅ c)T = cT ⋅ bT ⋅ aT , Q. E. D.

Problem 2.8.
A. aij bij = c ⇐⇒ a ⋅⋅ b = c.
B. aij bji = c, aij bTij = c ⇐⇒ a ⋅⋅ bT = c.
C. aij bkjl cik = dl :

Option 1: aij cik bkjl = dl ⇐⇒ a ⋅⋅ (c ⋅ b) = d.

We do not need to set parentheses in this translation, since (a ⋅ ⋅ c) ⋅ b is not
defined; however, we recommend setting parentheses, so that we do not have to
check this.

Option 2: cki aij bkjl = dl ⇐⇒ (cT ⋅ a) ⋅⋅ b = d.
In this translation we need the parentheses, because cT ⋅ (a ⋅ ⋅ b) = d is also
defined, but gives something different, i. e. cki ajl bjli = dk .
D. aij bikl cklj = d ⇐⇒ a ⋅⋅ (b ⋅⋅ c) = d.
Again, we do not need parentheses here, because (a ⋅⋅ b) ⋅⋅ c is not defined, but
we still recommend them.

Problem 2.9.
A. tr δ = δii = δ11 + δ22 + δ33 = 3.
B. δij δjk = δik , so we have δ ⋅ δ = δ.
C. δij δij = (δ11 )2 + (δ12 )2 + (δ13 )2 + (δ21 )2 + ⋅ ⋅ ⋅ + (δ33 )2 = 3.


Problem 2.10.
A. aij bij = bij aij can be translated into a ⋅⋅ b = b ⋅⋅ a; therefore the double scalar
product of two tensors of order two is always commutative.
B. If we choose nine different tensors $\overset{1}{b}_{ij}, \overset{2}{b}_{ij}, \ldots, \overset{9}{b}_{ij}$ for $b_{ij}$, we get from $a_{ij} b_{ij} = 0$
a homogeneous system of linear equations for the nine unknown $a_{ij}$, i. e.

$$a_{11}\,\overset{1}{b}_{11} + a_{12}\,\overset{1}{b}_{12} + \cdots + a_{33}\,\overset{1}{b}_{33} = 0,$$
$$a_{11}\,\overset{2}{b}_{11} + a_{12}\,\overset{2}{b}_{12} + \cdots + a_{33}\,\overset{2}{b}_{33} = 0,$$
$$\vdots$$
$$a_{11}\,\overset{9}{b}_{11} + a_{12}\,\overset{9}{b}_{12} + \cdots + a_{33}\,\overset{9}{b}_{33} = 0.$$

Clearly, we can cancel out the $b_{ij}$ from $a_{ij} b_{ij} = 0$ if and only if the system of equa-
tions has only the trivial solution, i. e. if the nine tensors $\overset{k}{b}_{ij}$, i, j = 1, 2, 3, k =
1, . . . , 9 are linearly independent. Since every tensor $b_{ij}$ can be written as a lin-
ear combination of nine linearly independent tensors $\overset{k}{b}_{ij}$, we can cancel out $b_{ij}$ if
and only if $b_{ij}$ can take arbitrary values, and the criterion for this is that we can
substitute nine linearly independent tensors.

Problem 2.11.
A. $a_{(ij)}\,b_{[ij]} \overset{(2.26)}{=} \tfrac{1}{2}(a_{ij} + a_{ji}) \cdot \tfrac{1}{2}(b_{ij} - b_{ji}) = \tfrac{1}{4}(a_{ij} b_{ij} + a_{ji} b_{ij} - a_{ij} b_{ji} - a_{ji} b_{ji})$.
We have aij bij = aji bji and aji bij = aij bji , as we can see immediately, after
renaming in both equations on the right side the summation index i to j and the
summation index j to i, i. e. the bracket is zero, which is what we wanted to prove.
B. We want to show that, for arbitrary tensors a(ij) , it follows from a(ij) bij = 0 that
bij is antimetric. It can be shown analogously that, for arbitrary values of b[ij] , it
follows from aij b[ij] = 0 that aij is symmetric.
We can prove this analogously to Part A of this problem: by assumption, we have

$$\tfrac{1}{2}(a_{ij} + a_{ji})\,b_{ij} = 0, \qquad a_{ij} b_{ij} + a_{ji} b_{ij} = 0,$$

where aij can take arbitrary values. Renaming the indices in the second term gives

aij bij + aij bji = 0, aij (bij + bji ) = 0.

Since aij can take any value, we can cancel out aij in the last equation and it follows
that bij + bji = 0, i. e. bij is antimetric.


C. For i = 1, it follows that ε123 a23 + ε132 a32 = 0, a23 − a32 = 0, a23 = a32 ; for i = 2
and i = 3, it correspondingly follows that a31 = a13 , a12 = a21 . This proves the
claim.
D. From six linearly independent symmetric tensors $\overset{i}{A}_{mn}$, i = 1, . . . , 6, we can con-
struct any symmetric tensor $B_{mn}$ by a linear combination $\alpha_i\,\overset{i}{A}_{mn} = B_{mn}$. Because of
the symmetry with respect to m and n, only six of the nine equations $\alpha_i\,\overset{i}{A}_{mn} = B_{mn}$
are different. For given $\overset{i}{A}_{mn}$ and $B_{mn}$, they form a system of linear equations for the
six $\alpha_i$. Because the $\overset{i}{A}_{mn}$ are linearly independent, the coefficient matrix of these
equations is regular and the solution to the system of equations is unique.
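
The statement of Part A, that the double contraction of a symmetric tensor with an antimetric tensor vanishes, can be spot-checked numerically. The tensors below are hypothetical sample data (borrowed from other problems in this appendix); a single example illustrates the claim but does not prove it.

```python
a = [[1, 0, 2], [4, -3, 6], [2, -4, 5]]
b = [[4, 1, -7], [-3, 5, 8], [3, 2, 9]]
n = 3

# Symmetric part of a and antimetric part of b.
a_sym = [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]
b_anti = [[(b[i][j] - b[j][i]) / 2 for j in range(n)] for i in range(n)]

# Double contraction a_(ij) b_[ij], summed over i and j.
contraction = sum(a_sym[i][j] * b_anti[i][j] for i in range(n) for j in range(n))
print(contraction)
```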

Problem 2.12.
A. (a × b) c ≙ εijk aj bk cl .
a × (b c) ≙ εijk aj bk cl .
Both readings are defined and mean the same thing.
B. (a × b) ⋅ c ≙ εijk aj bk ci .
The expression a × (b ⋅ c) is not defined.
C. (a ⋅ b) × c ≙ εijk ajm bm ck .
a ⋅ (b × c) ≙ ajm εmik bi ck = εmik ajm bi ck .
Both readings are defined and mean something different.
D. (a ⋅ b) × c ≙ εijk am bmj ck .
a ⋅ (b × c) ≙ − ai bij εjkl cl = εkjl ai bij cl .
Both readings are defined and mean the same thing.

Only the term C is ambiguous without parentheses.

Problem 2.13.
A. εijk am bj εmin ck dn
= εijk bj ck εinm dn am ≙ (b × c) ⋅ (d × a)
= εijk εinm bj ck dn am
= (δjn δkm − δjm δkn ) bj ck dn am
= bj ck dj ak − bj ck dk aj
= bj dj ck ak − bj aj ck dk ≙ b ⋅ d c ⋅ a − b ⋅ a c ⋅ d,
(b × c) ⋅ (d × a) = b ⋅ d c ⋅ a − b ⋅ a c ⋅ d.
Here and in the following examples, other formulations are also permissible in
symbolic notation, e. g. here we can write for the left side (c × b) ⋅ (a × d).
B. εijk aj bkm cm is a vector product with i as the only free index.


The indices i and q in εqpi are summation indices, so they belong to a vector
product with the above vector product as one of its factors. We set the free index
p as first index, so we have

εpiq εijk aj bkm cm dq ≙ [a × (b ⋅ c)] × d
= εpiq εkij aj bkm cm dq
= (δpk δqj − δpj δqk ) aj bkm cm dq
= aq bpm cm dq − ap bqm cm dq
= aq dq bpm cm − ap dq bqm cm ≙ a ⋅ d b ⋅ c − a d ⋅ b ⋅ c,
[a × (b ⋅ c)] × d = a ⋅ d b ⋅ c − a d ⋅ b ⋅ c.

C. a × [(b ⋅ c) × d] ≙ εijk aj εkpq bpm cm dq
= εijk εpqk aj bpm cm dq
= (δip δjq − δiq δjp ) aj bpm cm dq
= aj bim cm dj − aj bjm cm di
= dj aj bim cm − di aj bjm cm ≙ d ⋅ a b ⋅ c − d a ⋅ b ⋅ c,
a × [(b ⋅ c) × d] = d ⋅ a b ⋅ c − d a ⋅ b ⋅ c.
D. a × (b × c) ≙ εijk aj εkmn bm cn
= εijk εmnk aj bm cn
= (δim δjn − δin δjm ) aj bm cn
= an bi cn − am bm ci
= an cn bi − am bm ci ≙ a ⋅ c b − a ⋅ b c,
a × (b × c) = a ⋅ c b − a ⋅ b c.
E. The terms B and C are both of the form a × b ⋅ c × d and they differ only by the order
in which the two vector products have to be evaluated, in other words, by setting
the parentheses. The expressions are different; therefore the associative law does
not hold.
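
The expansion formula from Part D, a × (b × c) = a ⋅ c b − a ⋅ b c, can be spot-checked for concrete vectors. In this sketch the sample vectors are hypothetical, indices run over 0, 1, 2, and the vector product is evaluated through the ε symbol as in the derivation above.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (j - i) * (k - i) * (k - j) // 2

def cross(u, v):
    """(u x v)_i = eps_ijk u_j v_k."""
    return [sum(epsilon(i, j, k) * u[j] * v[k]
                for j in range(3) for k in range(3))
            for i in range(3)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 10]

lhs = cross(a, cross(b, c))
rhs = [dot(a, c) * b[i] - dot(a, b) * c[i] for i in range(3)]
print(lhs, rhs)
```

Changing the grouping to `cross(cross(a, b), c)` generally gives a different vector, in line with the remark in Part E that the associative law does not hold.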

Problem 2.14.

ai bk − ak bi = am bn (δim δkn − δkm δin ) = am bn εjik εjmn .

For translation, the order of the free indices must be the same in all terms:

ai bk − bi ak = εikj εjmn am bn ,
a b − b a = ε ⋅ (a × b).


Problem 2.15.

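A. div grad a = b ⇐⇒ $\dfrac{\partial}{\partial x_i}\dfrac{\partial a_{mn}}{\partial x_i} = \dfrac{\partial^2 a_{mn}}{\partial x_i^2} = b_{mn}$.
B. grad div a = c ⇐⇒ $\dfrac{\partial}{\partial x_i}\dfrac{\partial a_{mn}}{\partial x_n} = \dfrac{\partial^2 a_{mn}}{\partial x_i\,\partial x_n} = c_{mi}$.
C. $\dfrac{\partial a_i}{\partial x_j}\dfrac{\partial b_j}{\partial x_k} = c_{ik}$ ⇐⇒ (grad a) ⋅ (grad b) = c.
D. $\dfrac{\partial a_i}{\partial x_j}\dfrac{\partial b_i}{\partial x_j} = c$ ⇐⇒ (grad a) ⋅⋅ (grad b) = c.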

Problem 2.16.
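A. grad x ≙ $\dfrac{\partial x_i}{\partial x_j} = \delta_{ij}$ ⇐⇒ grad x = δ.
B. div x ≙ $\dfrac{\partial x_i}{\partial x_i} = 3$ ⇐⇒ div x = 3.
C. curl x ≙ $\dfrac{\partial x_i}{\partial x_k}\,\varepsilon_{ijk} = \delta_{ik}\,\varepsilon_{ijk} = 0$ ⇐⇒ curl x = 0.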

Problem 2.17.

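A. div (λ a) ≙ $\dfrac{\partial (\lambda a_i)}{\partial x_i} = \lambda\,\dfrac{\partial a_i}{\partial x_i} + \dfrac{\partial \lambda}{\partial x_i}\,a_i$,
div (λ a) = λ div a + (grad λ) ⋅ a.
B. curl (λ a) ≙ $\dfrac{\partial (\lambda a_i)}{\partial x_k}\,\varepsilon_{ijk} = \lambda\,\dfrac{\partial a_i}{\partial x_k}\,\varepsilon_{ijk} + \dfrac{\partial \lambda}{\partial x_k}\,\varepsilon_{ijk}\,a_i$,
curl (λ a) = λ curl a + (grad λ) × a.
C. curl grad a ≙ $\dfrac{\partial}{\partial x_k}\dfrac{\partial a}{\partial x_i}\,\varepsilon_{ijk} = \dfrac{\partial^2 a}{\partial x_k\,\partial x_i}\,\varepsilon_{ijk} \overset{(2.40)}{=} 0$,
curl grad a = 0.
D. div curl a ≙ $\dfrac{\partial}{\partial x_j}\left(\dfrac{\partial a_i}{\partial x_k}\,\varepsilon_{ijk}\right) = \dfrac{\partial^2 a_i}{\partial x_j\,\partial x_k}\,\varepsilon_{ijk} \overset{(2.40)}{=} 0$,
div curl a = 0.
E. curl curl a ≙ $\dfrac{\partial}{\partial x_k}\left(\dfrac{\partial a_m}{\partial x_n}\,\varepsilon_{min}\right)\varepsilon_{ijk} = \dfrac{\partial^2 a_m}{\partial x_k\,\partial x_n}\,\varepsilon_{inm}\,\varepsilon_{ijk}$
$= (\delta_{nj}\,\delta_{mk} - \delta_{nk}\,\delta_{mj})\,\dfrac{\partial^2 a_m}{\partial x_k\,\partial x_n} = \dfrac{\partial^2 a_k}{\partial x_k\,\partial x_j} - \dfrac{\partial^2 a_j}{\partial x_k\,\partial x_k}$
$= \dfrac{\partial}{\partial x_j}\dfrac{\partial a_k}{\partial x_k} - \dfrac{\partial}{\partial x_k}\dfrac{\partial a_j}{\partial x_k}$,
curl curl a = grad div a − div grad a.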


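F. We start by simplifying the given expression:

$$\varepsilon_{ijk}\,\varepsilon_{mnk}\,\frac{\partial a_{pj}}{\partial x_i}\,b_m = (\delta_{im}\,\delta_{jn} - \delta_{in}\,\delta_{jm})\,\frac{\partial a_{pj}}{\partial x_i}\,b_m = \frac{\partial a_{pn}}{\partial x_m}\,b_m - \frac{\partial a_{pm}}{\partial x_n}\,b_m = \frac{\partial a_{pn}}{\partial x_m}\,b_m - b_m\,\frac{\partial a^{\mathrm{T}}_{mp}}{\partial x_n}.$$

Then we write the expression in a form that can be translated directly:

$$\varepsilon_{ijk}\,\varepsilon_{mnk}\,\frac{\partial a_{pj}}{\partial x_i}\,b_m = \left(\frac{\partial a_{pj}}{\partial x_i}\,\varepsilon_{jki}\right)\varepsilon_{mnk}\,b_m.$$

Setting equal the right sides gives

$$\left(\frac{\partial a_{pj}}{\partial x_i}\,\varepsilon_{jki}\right)\varepsilon_{mnk}\,b_m = \frac{\partial a_{pn}}{\partial x_m}\,b_m - b_m\,\frac{\partial a^{\mathrm{T}}_{mp}}{\partial x_n},$$

and in all the terms the order of the free indices is such that p comes before n, so
the translation is

(curl a) × b = (grad a) ⋅ b − b ⋅ grad(aT ).

G. $$\varepsilon_{ijk}\,\varepsilon_{mnk}\,\frac{\partial (a_{pj} b_i)}{\partial x_m} = (\delta_{im}\,\delta_{jn} - \delta_{in}\,\delta_{jm})\,\frac{\partial (a_{pj} b_i)}{\partial x_m} = \frac{\partial (a_{pn} b_m)}{\partial x_m} - \frac{\partial (a_{pm} b_n)}{\partial x_m}$$
$$= a_{pn}\,\frac{\partial b_m}{\partial x_m} + \frac{\partial a_{pn}}{\partial x_m}\,b_m - a_{pm}\,\frac{\partial b_n}{\partial x_m} - \frac{\partial a_{pm}}{\partial x_m}\,b_n$$
$$= a_{pn}\,\frac{\partial b_m}{\partial x_m} + \frac{\partial a_{pn}}{\partial x_m}\,b_m - a_{pm}\left(\frac{\partial b_m}{\partial x_n}\right)^{\mathrm{T}} - \frac{\partial a_{pm}}{\partial x_m}\,b_n,$$
$$\varepsilon_{ijk}\,\varepsilon_{mnk}\,\frac{\partial (a_{pj} b_i)}{\partial x_m} = \frac{\partial \left(a_{pj}\,(-\varepsilon_{ikj})\,b_i\right)}{\partial x_m}\,(-\varepsilon_{knm}) = \frac{\partial (a_{pj}\,\varepsilon_{ikj}\,b_i)}{\partial x_m}\,\varepsilon_{knm},$$
$$\frac{\partial (a_{pj}\,\varepsilon_{ikj}\,b_i)}{\partial x_m}\,\varepsilon_{knm} = a_{pn}\,\frac{\partial b_m}{\partial x_m} + \frac{\partial a_{pn}}{\partial x_m}\,b_m - a_{pm}\left(\frac{\partial b_m}{\partial x_n}\right)^{\mathrm{T}} - \frac{\partial a_{pm}}{\partial x_m}\,b_n,$$

curl (a × b) = a div b + (grad a) ⋅ b − a ⋅ gradT b − (div a) b.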

Problem 2.18.

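$$W = \int F \cdot \mathrm{d}x = \int (F_x\,\mathrm{d}x + F_y\,\mathrm{d}y + F_z\,\mathrm{d}z) = \int \left[-\lambda x\,\mathrm{d}x - \lambda y\,\mathrm{d}y - (\lambda z + m g)\,\mathrm{d}z\right]$$
$$= -\int \left[\lambda x\,\frac{\mathrm{d}x}{\mathrm{d}\varphi} + \lambda y\,\frac{\mathrm{d}y}{\mathrm{d}\varphi} + (\lambda z + m g)\,\frac{\mathrm{d}z}{\mathrm{d}\varphi}\right]\mathrm{d}\varphi.$$

We substitute the parametric representation of the helix and obtain, after elementary
calculations,

$$W = -\tfrac{1}{2}\,\lambda h^2 - m g h.$$

The work of the external forces exerted on the mass point is $-\tfrac{1}{2}\lambda h^2 - m g h$, i. e. the
mass point must do the work of $\tfrac{1}{2}\lambda h^2 + m g h$ to move up along the helix against the
external forces.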

Problem 2.19.

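$$F_z = \int p\,\mathrm{d}A_z = \int \varrho g (H - z)\,\mathrm{d}A_z = \iint \varrho g\,\bigl(H - z(u, v)\bigr)\,\frac{\partial(x, y)}{\partial(u, v)}\,\mathrm{d}u\,\mathrm{d}v.$$

After short calculations we obtain

$$\frac{\partial(x, y)}{\partial(u, v)} = R^2 \sin u \cos u,$$

$$F_z = \int_0^{2\pi} \mathrm{d}v \int_0^{\pi/2} \varrho g (H - R \cos u)\,R^2 \sin u \cos u\,\mathrm{d}u = \varrho g R^2 \int_0^{2\pi} \mathrm{d}v \int_0^{\pi/2} (H - R \cos u) \sin u \cos u\,\mathrm{d}u.$$

For example, if we substitute η = cos u, we obtain

$$F_z = \varrho g \left(\pi R^2 H - \frac{2\pi}{3} R^3\right).$$
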
In mechanics it is shown that the vertical force equals the weight of the volume above
the area; in our case this is the weight of the indicated volume. This volume is the
difference between a circular cylinder with radius R and height H and a hemisphere
with radius R. Using this fact and the corresponding stereometric formulas we can
compute the vertical force without having to solve an integral.
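
The closed-form result can be compared with a numerical evaluation of the double integral. The sketch below makes several assumptions: the values of `R` and `H` are hypothetical, the factor ϱ g is divided out, the inner integral is taken over 0 ≤ u ≤ π/2, and a simple composite Simpson rule stands in for the analytic integration.

```python
import math

def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

R, H = 1.0, 2.0  # hypothetical radius and liquid height

# F_z / (rho g): the v-integration contributes the factor 2*pi.
inner = simpson(lambda u: (H - R * math.cos(u)) * math.sin(u) * math.cos(u),
                0.0, math.pi / 2)
numeric = 2 * math.pi * R**2 * inner

# Cylinder volume minus hemisphere volume.
closed_form = math.pi * R**2 * H - 2 * math.pi / 3 * R**3
print(numeric, closed_form)
```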

Problem 3.1.

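For the tensor

$$a_{ij} = \begin{pmatrix} 4 & 1 & -7 \\ -3 & 5 & 8 \\ 3 & 2 & 9 \end{pmatrix}$$

we obtain its isotropic part as

$$\hat{a} = \tfrac{1}{3}\,a_{ii} = 6, \qquad \hat{a}\,\delta_{ij} = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 6 \end{pmatrix},$$

its deviatoric part as

$$\mathring{a}_{ij} = a_{ij} - \hat{a}\,\delta_{ij}, \qquad \mathring{a}_{ij} = \begin{pmatrix} -2 & 1 & -7 \\ -3 & -1 & 8 \\ 3 & 2 & 3 \end{pmatrix},$$

its symmetric part as

$$a_{(ij)} = \tfrac{1}{2}(a_{ij} + a_{ji}), \qquad a_{(ij)} = \begin{pmatrix} 4 & -1 & -2 \\ -1 & 5 & 5 \\ -2 & 5 & 9 \end{pmatrix},$$

its antimetric part as

$$a_{[ij]} = \tfrac{1}{2}(a_{ij} - a_{ji}), \qquad a_{[ij]} = \begin{pmatrix} 0 & 2 & -5 \\ -2 & 0 & 3 \\ 5 & -3 & 0 \end{pmatrix},$$

and the symmetric part of its deviator as

$$\mathring{a}_{(ij)} = a_{(ij)} - \hat{a}\,\delta_{ij} = \tfrac{1}{2}(\mathring{a}_{ij} + \mathring{a}_{ji}), \qquad \mathring{a}_{(ij)} = \begin{pmatrix} -2 & -1 & -2 \\ -1 & -1 & 5 \\ -2 & 5 & 3 \end{pmatrix}.$$

$$\begin{pmatrix} 6 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 6 \end{pmatrix} + \begin{pmatrix} -2 & -1 & -2 \\ -1 & -1 & 5 \\ -2 & 5 & 3 \end{pmatrix} + \begin{pmatrix} 0 & 2 & -5 \\ -2 & 0 & 3 \\ 5 & -3 & 0 \end{pmatrix} = \begin{pmatrix} 4 & 1 & -7 \\ -3 & 5 & 8 \\ 3 & 2 & 9 \end{pmatrix},$$

$$\hat{a}\,\delta_{ij} + \mathring{a}_{(ij)} + a_{[ij]} = a_{ij}.$$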
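
The additive decomposition into isotropic part, symmetric deviator, and antimetric part can be reproduced in a few lines. This is a sketch with plain lists; the variable names are illustrative.

```python
a = [[4, 1, -7], [-3, 5, 8], [3, 2, 9]]
n = 3

# Isotropic part: one third of the trace.
a_hat = sum(a[i][i] for i in range(n)) / 3

# Symmetric and antimetric parts.
sym = [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]
anti = [[(a[i][j] - a[j][i]) / 2 for j in range(n)] for i in range(n)]

# Symmetric part of the deviator: subtract the isotropic part on the diagonal.
dev_sym = [[sym[i][j] - (a_hat if i == j else 0) for j in range(n)]
           for i in range(n)]

# The three parts add up to the original tensor.
recomposed = [[(a_hat if i == j else 0) + dev_sym[i][j] + anti[i][j]
               for j in range(n)] for i in range(n)]
print(a_hat, dev_sym, anti)
```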

Problem 3.2.
We can prove the statement as follows:
For second-order polar tensors, we have, according to (2.17),

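$$a_{ij} = \alpha_{im}\,\alpha_{jn}\,\tilde{a}_{mn}.$$

For $\tilde{a}_{mn} = -\tilde{a}_{nm}$ it then follows that

$$a_{ij} = -\alpha_{im}\,\alpha_{jn}\,\tilde{a}_{nm} = -\alpha_{jn}\,\alpha_{im}\,\tilde{a}_{nm} = -a_{ji}, \quad \text{Q. E. D.}$$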

Problem 3.3.
Evaluating (3.8) gives
$A_1 = \tfrac{1}{2}\,(\varepsilon_{123}\,a_{[23]} + \varepsilon_{132}\,a_{[32]}) = a_{23}$
and the two cyclic permutations, thus

(A1 , A2 , A3 ) = (a23 , a31 , a12 ).

Hence, the coordinates of a vector satisfy the transformation law for certain coordi-
nates of an antimetric tensor.


Problem 3.4.
The given tensor is antimetric.
A. Consequently, its determinant is zero.
B. The cotensor satisfies, according to (3.12), bij = Ai Aj , and from the solution to
Problem 3.3 it follows that (A1 , A2 , A3 ) = (a23 , a31 , a12 ), i. e. (A1 , A2 , A3 ) =
(−1, −2, 3). Thus, we obtain the following coordinates of the cotensor:

1 2 −3
bij = ( 2 4 −6) .
−3 −6 9

C. Since an antimetric tensor is singular, its inverse tensor does not exist.

Problem 3.5.
Let aij be the coordinate matrix of the tensor a in a given coordinate system; then we
have, according to (2.17),

$$a_{ij} = \alpha_{im}\,\alpha_{jn}\,\tilde{a}_{mn},$$

where $\tilde{a}_{mn}$ is the coordinate matrix of the tensor in another coordinate system.
Furthermore, let aij be proper or improper orthogonal. Then we have, according to (1.59),
in both cases

aij akj = δik .

Substituting the first equation for aij and for akj in the second equation gives

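$$\alpha_{im}\,\alpha_{jn}\,\tilde{a}_{mn}\,\alpha_{kp}\,\alpha_{jq}\,\tilde{a}_{pq} = \delta_{ik},$$
$$\alpha_{im}\,\tilde{a}_{mn}\,\alpha_{kp}\,\delta_{nq}\,\tilde{a}_{pq} = \delta_{ik},$$
$$\alpha_{im}\,\alpha_{kp}\,\tilde{a}_{mn}\,\tilde{a}_{pn} = \delta_{ik}.$$

Multiplication by $\alpha_{ir}\,\alpha_{ks}$ gives

$$\underbrace{\alpha_{im}\,\alpha_{ir}}_{\delta_{mr}}\,\underbrace{\alpha_{kp}\,\alpha_{ks}}_{\delta_{ps}}\,\tilde{a}_{mn}\,\tilde{a}_{pn} = \delta_{ik}\,\alpha_{ir}\,\alpha_{ks} = \underbrace{\alpha_{kr}\,\alpha_{ks}}_{\delta_{rs}},$$

$$\tilde{a}_{rn}\,\tilde{a}_{sn} = \delta_{rs},$$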

i. e. the coordinate matrix of the tensor a is orthogonal also in the other coordinate
system (and hence in every coordinate system).
Finally we need to show that $\tilde{a}_{ij}$ is proper orthogonal if $a_{ij}$ is proper orthogonal and
vice versa. To show this, we compute the determinant of the transformation equation
between $a_{ij}$ and $\tilde{a}_{ij}$ and use Theorem 1.13 for the multiplication of determinants. We
obtain

$$\det a_{ij} = (\det \alpha_{ij})\,(\det \alpha_{ij})\,(\det \tilde{a}_{ij}).$$

Since the transformation matrix $\alpha_{ij}$ is orthogonal, its determinant is ±1 and it follows
that

$$\det a_{ij} = \det \tilde{a}_{ij},$$

i. e. the coordinate matrices $a_{ij}$ and $\tilde{a}_{ij}$ are either both proper orthogonal or both im-
proper orthogonal, which proves the statement.

Problem 3.6.
Orthogonal tensors satisfy $A^{(-1)}_{ji} = A^{\mathrm{T}}_{ji} = A_{ij}$, so it follows from (1.50) that

$$B_{ij} = A_{ij}\,\det A_{ij}.$$

Proper orthogonal tensors have det Aij = 1 and hence Bij = Aij ; improper orthogonal
tensors have det Aij = −1 and thus Bij = − Aij .

Problem 3.7.
A. A tensor is single singular, if the rows of its coordinate matrix are coplanar, but
not collinear. For the tensor under consideration, we see that the sum of the first
two rows is the third row and that the second row is not a multiple of the first
row. Hence the three rows are coplanar, but not collinear. If we do not see this, we
need to convince ourselves that the determinant of the coordinate matrix is zero
and that a minor is nonzero.
B. In order to find the null direction, we have to solve the homogeneous system of
linear equations aij Xj = 0. Because the coefficient matrix of this system of equa-
tions is single singular, we can omit one equation and we can choose freely one
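unknown, e. g. as 1. If we omit the third equation and set X3 = 1, we obtain

$$X_1 - X_2 = 1, \qquad -2 X_1 + X_2 = -3.$$

Adding the first row, multiplied by 2, to the second row, then adding the new second row, multiplied by −1, to the first row and multiplying the second row by −1 gives

$$\left(\begin{array}{cc|c} 1 & -1 & 1 \\ -2 & 1 & -3 \end{array}\right) \to \left(\begin{array}{cc|c} 1 & -1 & 1 \\ 0 & -1 & -1 \end{array}\right) \to \left(\begin{array}{cc|c} 1 & 0 & 2 \\ 0 & 1 & 1 \end{array}\right), \qquad X = (2, 1, 1),$$

i. e. the null direction (in the given coordinate system) is given by the unit vector

$$\mathring{X} = \pm\left(\tfrac{1}{3}\sqrt{6},\ \tfrac{1}{6}\sqrt{6},\ \tfrac{1}{6}\sqrt{6}\right).$$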

C. The columns of the coordinate matrix lie in the image plane, e. g. two linearly
independent vectors in the image plane are

U 1 = (1, −2, −1) and U 2 = (−1, 1, 0).


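$V_1 = U_1 + \alpha\,U_2$ is perpendicular to $U_1$ if the following holds:

$$U_1 \cdot V_1 = U_1 \cdot U_1 + \alpha\,U_1 \cdot U_2 = 0,$$

$$\alpha = -\frac{U_1 \cdot U_1}{U_1 \cdot U_2} = -\frac{6}{-3} = 2,$$

$$V_1 = (-1, 0, -1).$$

The normalized vectors $\mathring{U}_1 = \left(\tfrac{1}{6}\sqrt{6},\ -\tfrac{1}{3}\sqrt{6},\ -\tfrac{1}{6}\sqrt{6}\right)$ and $\mathring{V}_1 = \left(-\tfrac{1}{2}\sqrt{2},\ 0,\ -\tfrac{1}{2}\sqrt{2}\right)$
form a Cartesian basis in the image plane.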

Problem 3.8.
A. For the tensor to be double singular, for example, the rows of the coordinate ma-
trix must be collinear. We see immediately that the second row is the negative of
the first row and that the third row is three times the first row.
B. In order to solve the system of equations aij Xj = 0, we can omit two equations
and we can freely choose two unknowns in the remaining equation. If we omit the
second and third equations, we obtain for X2 = 1, X3 = 0:

X1 − 1 = 0, X1 = 1

and for X2 = 0, X3 = 1:

X1 + 2 = 0, X1 = −2.

Thus, two linearly independent null vectors are

X 1 = (1, 1, 0) and X 2 = (−2, 0, 1).

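C. The unit vector normal to the null plane is in the direction of the row line, i. e.
$$\mathring{Y} = \pm\left(\tfrac{1}{6}\sqrt{6},\ -\tfrac{1}{6}\sqrt{6},\ \tfrac{1}{3}\sqrt{6}\right).$$
D. The image line is the column line; its direction is given by the unit vector
$$\mathring{U} = \pm\left(\tfrac{1}{11}\sqrt{11},\ -\tfrac{1}{11}\sqrt{11},\ \tfrac{3}{11}\sqrt{11}\right).$$
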
Problem 3.9.
A. For an orthogonal basis it immediately follows from

$$g_1 = 3\,e_1, \qquad g_2 = 2\,e_2, \qquad g_3 = e_3$$

according to Section 3.9.3 that

$$g^1 = \tfrac{1}{3}\,e_1, \qquad g^2 = \tfrac{1}{2}\,e_2, \qquad g^3 = e_3.$$


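B. We first compute the three vectors of the original basis, i. e.

$$g_1 = e_1.$$

For $g_2$ we have
$$g_2 \cdot e_3 = 0, \qquad g_2 \cdot g_1 = g_2 \cdot e_1 = \cos 60^\circ = \tfrac{1}{2}, \qquad g_2 \cdot g_2 = 1.$$
Using the first two equations we obtain for the third equation
$$g_2 \cdot g_2 = (g_2 \cdot e_1)^2 + (g_2 \cdot e_2)^2 = \tfrac{1}{4} + (g_2 \cdot e_2)^2 = 1, \qquad g_2 \cdot e_2 = \tfrac{1}{2}\sqrt{3},$$
$$g_2 = \tfrac{1}{2}\,e_1 + \tfrac{1}{2}\sqrt{3}\,e_2.$$
Finally, we have for $g_3$
$$g_3 \cdot g_1 = \cos 60^\circ = \tfrac{1}{2}, \qquad g_3 \cdot g_2 = \cos 60^\circ = \tfrac{1}{2}, \qquad g_3 \cdot g_3 = 1.$$
This implies that
$$g_3 \cdot e_1 = \tfrac{1}{2},$$
$$g_3 \cdot g_2 = \tfrac{1}{2}\,g_3 \cdot e_1 + \tfrac{1}{2}\sqrt{3}\,g_3 \cdot e_2 = \tfrac{1}{4} + \tfrac{1}{2}\sqrt{3}\,g_3 \cdot e_2 = \tfrac{1}{2}, \qquad g_3 \cdot e_2 = \tfrac{1}{6}\sqrt{3},$$
$$g_3 \cdot g_3 = (g_3 \cdot e_1)^2 + (g_3 \cdot e_2)^2 + (g_3 \cdot e_3)^2 = \tfrac{1}{4} + \tfrac{1}{12} + (g_3 \cdot e_3)^2 = 1, \qquad g_3 \cdot e_3 = \tfrac{1}{3}\sqrt{6},$$
$$g_3 = \tfrac{1}{2}\,e_1 + \tfrac{1}{6}\sqrt{3}\,e_2 + \tfrac{1}{3}\sqrt{6}\,e_3.$$
The easiest method to compute the basis reciprocal to $g_i$ is to use the Gauss–Jordan
algorithm. The rows of the left block are $g_1$, $g_2$, $g_3$:

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ \tfrac{1}{2} & \tfrac{1}{2}\sqrt{3} & 0 & 0 & 1 & 0 \\ \tfrac{1}{2} & \tfrac{1}{6}\sqrt{3} & \tfrac{1}{3}\sqrt{6} & 0 & 0 & 1 \end{array}\right).$$

Adding the first row, multiplied by $-\tfrac{1}{2}$, to the second and third rows:

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & \tfrac{1}{2}\sqrt{3} & 0 & -\tfrac{1}{2} & 1 & 0 \\ 0 & \tfrac{1}{6}\sqrt{3} & \tfrac{1}{3}\sqrt{6} & -\tfrac{1}{2} & 0 & 1 \end{array}\right).$$

Adding the second row, multiplied by $-\tfrac{1}{3}$, to the third row and multiplying the second row by $\tfrac{2}{\sqrt{3}} = \tfrac{2}{3}\sqrt{3}$:

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -\tfrac{1}{3}\sqrt{3} & \tfrac{2}{3}\sqrt{3} & 0 \\ 0 & 0 & \tfrac{1}{3}\sqrt{6} & -\tfrac{1}{3} & -\tfrac{1}{3} & 1 \end{array}\right).$$

Multiplying the third row by $\tfrac{3}{\sqrt{6}} = \tfrac{1}{2}\sqrt{6}$:

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -\tfrac{1}{3}\sqrt{3} & \tfrac{2}{3}\sqrt{3} & 0 \\ 0 & 0 & 1 & -\tfrac{1}{6}\sqrt{6} & -\tfrac{1}{6}\sqrt{6} & \tfrac{1}{2}\sqrt{6} \end{array}\right).$$

The columns of the right block are $g^1$, $g^2$, $g^3$.

Thus, we obtain the reciprocal basis:

$$g^1 = e_1 - \tfrac{1}{3}\sqrt{3}\,e_2 - \tfrac{1}{6}\sqrt{6}\,e_3,$$
$$g^2 = \tfrac{2}{3}\sqrt{3}\,e_2 - \tfrac{1}{6}\sqrt{6}\,e_3,$$
$$g^3 = \tfrac{1}{2}\sqrt{6}\,e_3.$$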
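
The defining property of a reciprocal basis, that the scalar product of the i-th reciprocal vector with the j-th original vector equals δij, gives an independent check of Part B. The sketch below deliberately uses a different technique than the Gauss–Jordan scheme in the text: the well-known cross-product formula for reciprocal vectors in three dimensions. All names are illustrative, and floating-point arithmetic is used, so comparisons need a tolerance.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Original basis from Part B (unit vectors with mutual angles of 60 degrees).
g = [(1.0, 0.0, 0.0),
     (0.5, math.sqrt(3) / 2, 0.0),
     (0.5, math.sqrt(3) / 6, math.sqrt(6) / 3)]

volume = dot(g[0], cross(g[1], g[2]))  # triple product of the basis

# Reciprocal vectors via cyclic cross products divided by the triple product.
reciprocal = [tuple(c / volume for c in cross(g[1], g[2])),
              tuple(c / volume for c in cross(g[2], g[0])),
              tuple(c / volume for c in cross(g[0], g[1]))]

for vec in reciprocal:
    print(vec)
```

The printed vectors can be compared with the closed-form coordinates obtained above.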

Problem 3.10.
According to (3.30)2 , we have g i hi = aT or g m hn = aTmn = anm ,
1 2 3
g1 g1 g1 h1 h2 h3
1 1 1 a11 a21 a31
1 2 3
(g 2 g2 g 2 ) (h2 1 h2
h3 ) = (a12
a22 a32 ) .
1 2 3
h1 h2 h3 a13 a23 a33
g3 g3 g3 3 3 3

g1 g2 g3
1 3 1 2 4 1 1
0 1 2 3 −2 2
−1 −3 −2 −1 3 1 ←
1 3 1 2 4 1 ←
0 1 2 3 −2 2 −3
0 0 −1 1 7 2
1 0 −5 −7 10 −5 ←
0 1 2 3 −2 2 ←
0 0 −1 1 7 2 −5 2 −1
1 0 0 −12 −25 −15 h1
0 1 0 5 12 6 h2
0 0 1 −1 −7 −2 h3

Thus we get

h1 = (−12, −25, −15), h2 = (5, 12, 6), h3 = (−1, −7, −2).

The representation a = hi g i is then

2 3 −1 −12 0 12 15 5 −15 −1 −2 2
(4 −2 3) = (−25 0 25) + (36 12 −36) + (−7 −14 14) .
1 2 1 −15 0 15 18 6 −18 −2 −4 4


Problem 3.11.
A. In order to be able to represent a with h1 and h2 in this form, h1 and h2 have to lie
in the column plane of a. The easiest way to verify this is to prove that the matrix
formed by h1 , h2 , and two columns of a has rank 2, so

−1 0 1 −1 2 1 −1
2 1 −2 1 ←
1 1 −1 0 ←
1 0 −1 1
0 1 0 −1
0 1 0 −1

Since the last two rows are equal, we can omit one row when we determine the
rank, and since the first two rows are not collinear, the matrix has rank 2.
B. We have hi g i = a, hm g n = amn ,

h1 h1 1 1 1 a11 a12 a13

1 2 g g2 g3
( 1 ) = ( a21 a22 a23 ) .
( h2 h2 ) g2 2
1 2 1 a31 a32 a33
h3 h3
1 2

To determine the g n , the first two rows are sufficient:

h1 h2
−1 0 1 −1 −1 2 −1
2 1 −2 1 3 ←
1 0 −1 1 1 g1
0 1 0 −1 1 g2

So we obtain g 1 = (−1, 1, 1), g 2 = (0, −1, 1).

C. The representation a = h1 g 1 + h2 g 2 is then

1 −1 −1 1 −1 −1 0 0 0
(−2 1 3) = (−2 2 2) + (0 −1 1) .
−1 0 2 −1 1 1 0 −1 1

Problem 3.12.
Determining the eigenvalues:
All coordinate matrices are triangular matrices. Hence, according to Section 3.11.3,
No. 6, the diagonal elements are the eigenvalues: In the cases A to C, the tensor has
the eigenvalue λ = 1 of multiplicity 3. In the cases D to F, the tensor has the eigenvalue
λ = 1 of multiplicity 2 and the eigenvalue λ = 2 of multiplicity 1.


Determining the eigendirections:

A. The eigenvalue equation is

0 1 0 x1 0
(0 0 1) (x2 ) = (0) .
0 0 0 x3 0

The coefficient matrix has rank 2, i. e. according to Section 3.11.4, Theorem 2, only
one eigendirection exists and the tensor is defective. From the equation we obtain
x2 = 0, x3 = 0. We can freely choose x1 , so the eigendirection is given by the unit
vector

x̊ = (1, 0, 0).

B. The eigenvalue equation is

0 0 0 x1 0
(0 0 1) (x2 ) = (0) .
0 0 0 x3 0

The coefficient matrix has rank 1, i. e. according to Section 3.11.4, Theorem 3, two
distinct eigendirections exist and the tensor is again defective. Here we obtain
from the equation x3 = 0. We can freely choose x1 and x2 , so two eigendirections
are, for example, given by the unit vectors

x̊ 1 = (1, 0, 0), x̊ 2 = (0, 1, 0).

C. The eigenvalue equation is

0 0 0 x1 0
(0 0 0) (x2 ) = (0) .
0 0 0 x3 0

The coefficient matrix has rank 0, i. e. according to Section 3.11.4, Theorem 4, three
linearly independent eigendirections exist, or in other words, every direction is an
eigendirection. The tensor is nondefective and we can also find three mutually or-
thogonal eigenvectors. Accordingly, the equation does not provide any restriction
for the coordinates of the eigenvectors and a set of mutually orthogonal eigenvec-
tors is, for example,

x̊ 1 = (1, 0, 0), x̊ 2 = (0, 1, 0), x̊ 3 = (0, 0, 1).

D. For λ = 1 the eigenvalue equation is

0 1 0 x1 0
(0 0 1) (x2 ) = (0) .
0 0 1 x3 0


The coefficient matrix has rank 2, only one eigendirection corresponds to the
eigenvalue with multiplicity two, and the tensor is defective. From the equation
we obtain x2 = 0, x3 = 0; the eigendirection is given by the unit vector

x̊ 1 = (1, 0, 0).

For λ = 2, the eigenvalue equation is

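\[ \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \]

which gives

\[ -x_1 + x_2 = 0, \qquad -x_2 + x_3 = 0. \]

With x3 = 1 we obtain the eigenvector x = (1, 1, 1), i. e. the eigendirection is given
by the unit vector

x̊ 2 = (√3/3, √3/3, √3/3).
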
E. For λ = 1, the eigenvalue equation is

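\[ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \]
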
The coefficient matrix has rank 1, two eigendirections correspond to the eigen-
value with multiplicity two, and the tensor is nondefective. From the equation we
only obtain x3 = 0; two orthogonal eigendirections are, for example, given by the
unit vectors

x̊ 1 = (1, 0, 0), x̊ 2 = (0, 1, 0).

For λ = 2, the eigenvalue equation is

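\[ \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \]

which gives

\[ -x_1 = 0, \qquad -x_2 + x_3 = 0. \]

One eigenvector is x = (0, 1, 1), i. e. the eigendirection is given by

x̊ 3 = (0, √2/2, √2/2).
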
This eigendirection is not perpendicular to the eigenplane corresponding to λ = 1,
i. e. no three mutually orthogonal eigendirections exist.


F. For λ = 1, the eigenvalue equation is

\[ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \]

The coefficient matrix again has rank 1, two eigendirections correspond to the
eigenvalue with multiplicity two, and the tensor is nondefective. From the equation
we obtain only x3 = 0; two orthogonal eigendirections are given by the unit
vectors

x̊ 1 = (1, 0, 0), x̊ 2 = (0, 1, 0).

For λ = 2, the eigenvalue equation is

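\[ \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \]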

which gives

− x1 = 0, −x2 = 0.

The eigendirection is given by the unit vector

x̊ 3 = (0, 0, 1).

This eigendirection is perpendicular to the eigenplane corresponding to λ = 1,
i. e. three mutually orthogonal eigendirections exist.
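
The rank arguments used in cases A to F can be checked mechanically. The following is a minimal sketch, not part of the original solution: the coordinate matrices are reconstructed from the eigenvalue equations above by adding λ back onto the diagonal, and the helper names `rank` and `eigendirection_count` are hypothetical. It counts the linearly independent eigendirections as 3 minus the rank of the coefficient matrix.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a small matrix by Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def eigendirection_count(a, lam):
    """Number of linearly independent eigendirections belonging to lam."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)

# Coordinate matrices reconstructed from the eigenvalue equations (assumption).
cases = {
    "A": [[1, 1, 0], [0, 1, 1], [0, 0, 1]],
    "B": [[1, 0, 0], [0, 1, 1], [0, 0, 1]],
    "C": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "D": [[1, 1, 0], [0, 1, 1], [0, 0, 2]],
    "E": [[1, 0, 0], [0, 1, 1], [0, 0, 2]],
    "F": [[1, 0, 0], [0, 1, 0], [0, 0, 2]],
}
counts = {name: eigendirection_count(m, 1) for name, m in cases.items()}
print(counts)
```

A tensor is defective exactly when this count is smaller than the multiplicity of the eigenvalue, which is the case for A, B, and D.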

Problem 3.13.
A principal axes transformation requires finding three mutually perpendicular eigen-
directions. Such a triple of eigendirections exists only for symmetric tensors. See also
the results of the last problem.
We start by finding the eigenvalues of the tensor. The characteristic equation is

\[ \begin{vmatrix} 5-\lambda & 0 & 4 \\ 0 & 9-\lambda & 0 \\ 4 & 0 & 5-\lambda \end{vmatrix} = 0. \]

Expanding with respect to the second row gives

(9 − λ)[(5 − λ)(5 − λ) − 16] = 0,

and after elementary computations

λ1 = λ2 = 9, λ3 = 1.

From the eigenvalue equation for the eigenvalue λ = 9 with multiplicity two, we
obtain the orthonormal eigenvectors

q1 = (√2/2, 0, √2/2) and q2 = (0, 1, 0),

and from the eigenvalue equation for the eigenvalue λ = 1, we obtain the normalized
eigenvector

q3 = (√2/2, 0, −√2/2),

which is orthogonal to the other two eigenvectors. We still have to check if the three
eigendirections form, in this order, a right-handed system. For this to be true, the value
of the determinant
\[ \begin{vmatrix} \tfrac12\sqrt2 & 0 & \tfrac12\sqrt2 \\ 0 & 1 & 0 \\ \tfrac12\sqrt2 & 0 & -\tfrac12\sqrt2 \end{vmatrix} \]

must be positive. This is clearly not the case, i. e. we have to reverse the direction of
the third eigendirection, and thus we obtain as a principal axes system

q1 = (√2/2, 0, √2/2), q2 = (0, 1, 0), q3 = (−√2/2, 0, √2/2).

The coordinate matrix of the tensor in this principal axes system is

\[ \hat a_{ij} = \begin{pmatrix} 9 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
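
As a numerical cross-check of the principal axes transformation, the sketch below (an illustration only; the helper names are hypothetical) takes the coordinate matrix read off from the characteristic equation, builds the transformation matrix whose rows are q1, q2, q3, and confirms that the determinant is +1 and that the transformed matrix is diagonal with entries 9, 9, 1.

```python
import math

# Coordinate matrix read off from the characteristic equation of Problem 3.13.
a = [[5.0, 0.0, 4.0], [0.0, 9.0, 0.0], [4.0, 0.0, 5.0]]
s = math.sqrt(2) / 2
# Rows are the eigenvectors q1, q2, q3 of the right-handed principal axes system.
q = [[s, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, s]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(x):
    return [[x[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Transformation of the coordinate matrix into the principal axes system.
a_principal = matmul(matmul(q, a), transpose(q))
orientation = det3(q)
print(orientation, a_principal)
```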

Problem 3.14.
We assume that the generalized eigenvalue problem has a complex eigenvalue and a
complex eigendirection. Then, similarly to Section 3.11.3, No. 2, it follows that

aij (Xj + iYj ) = (λ + iμ) bij (Xj + iYj ) .

Splitting into real and imaginary part gives

aij Xj = bij (λ Xj − μ Yj ) , aij Yj = bij (μ Xj + λ Yj ) .

Contraction of the first equation with Yi and of the second equation with Xi and com-
puting the difference gives

\[ \underbrace{a_{ij} Y_i X_j - a_{ij} X_i Y_j}_{0} = \lambda\, \underbrace{(b_{ij} Y_i X_j - b_{ij} X_i Y_j)}_{0} - \mu\, (b_{ij} Y_i Y_j + b_{ij} X_i X_j). \]

The first two expressions are zero, because aij and bij are symmetric, and what remains
is

μ (bij Yi Yj + bij Xi Xj ) = 0.

To obtain μ = 0 from this equation, the expression in parentheses must be nonzero,
which is the case if the tensor b is positive or negative definite.

Problem 3.15.
Solving the characteristic equation gives

λ1 = ±1, λ2 = cos ϑ + i sin ϑ, λ3 = cos ϑ − i sin ϑ.

Problem 3.16.
A. The two orthogonality relations yield six equations, which can be solved for the
six unknowns α, β, γ, δ, ε, ζ . Here αki αkj = δij (a statement about the sums of
the squares and the sums of the products of the columns) is more useful than
αik αjk = δij (an analogous statement about the row sums) because in the first case
the six equations are partially decoupled.
From αk1 αk1 = 1 it follows that α = ±1/2.
From αk1 αk2 = 0 and αk2 αk2 = 1 it follows, independently of the sign of α, that
β = ±√2/2, γ = 0.
From αk1 αk3 = 0, αk2 αk3 = 0, and αk3 αk3 = 1 it follows, independently of the
sign of α and β, that

δ = ζ = ±1/2, ε = ±√2/2.

Thus the general form of the transformation matrix is

\[ \alpha_{ij} = \begin{pmatrix} \pm\tfrac12 & \pm\tfrac12\sqrt2 & \pm\tfrac12 \\ \mp\tfrac12\sqrt2 & 0 & \pm\tfrac12\sqrt2 \\ \pm\tfrac12 & \mp\tfrac12\sqrt2 & \pm\tfrac12 \end{pmatrix}, \]

where we can choose in each column either the upper or the lower sign, indepen-
dently of the other columns.
B. Thus the problem has $2^3 = 8$ solutions. We can characterize one of these solutions
e. g. by specifying the signs of α, β, and δ, which corresponds to choosing
the upper or the lower sign in the three columns. According to (2.4), the columns
of αij represent the coordinates of the rotated basis vectors ẽj with respect to the
original basis ei ; every sign change of a column can be interpreted as reversing
the orientation of the corresponding basis vector.
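
The count of eight solutions can be confirmed by brute force. The sketch below is only an illustration with hypothetical helper names: it takes the three columns of the general transformation matrix with the upper signs, flips the sign of each column independently, and checks the orthogonality relation for every combination. It also counts how many of the eight matrices have determinant +1.

```python
import itertools
import math

h = 0.5
r = math.sqrt(2) / 2
# Columns of the transformation matrix with the upper signs chosen throughout.
base_columns = [(h, -r, h), (r, 0.0, -r), (h, r, h)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def is_orthogonal(columns, tol=1e-12):
    """Check that the columns are orthonormal."""
    return all(abs(dot(columns[i], columns[j]) - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))

solutions = []
for signs in itertools.product((1, -1), repeat=3):
    columns = [tuple(sign * v for v in col) for sign, col in zip(signs, base_columns)]
    determinant = dot(columns[0], cross(columns[1], columns[2]))
    solutions.append((signs, is_orthogonal(columns), determinant))

proper_count = sum(1 for _, _, determinant in solutions if determinant > 0)
print(len(solutions), proper_count)
```

Each sign change of a column reverses the sign of the determinant, so half of the solutions are proper orthogonal and half are improper orthogonal.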


Problem 3.17.
To decide whether to use (3.78) or (3.82) to compute the angle of rotation, we first have
to check if the transformation matrix is proper orthogonal or improper orthogonal.
The determinant is +1, so the transformation matrix describes a rotation, and (3.78)
yields cos ϑ = 0, ϑ = ±π/2. We choose ϑ = +π/2. Then sin ϑ = 1, and from (3.79) it
follows after a short computation that

n = (−√2/2, 0, −√2/2).

The transformation matrix describes a rotation of 90° about an axis in the x, z-plane
which forms an angle of 45° with both the negative x-axis and the negative z-axis. This
is equivalent to a rotation of −90° about an axis in the x, z-plane which forms an angle
of 45° with both the positive x-axis and the positive z-axis; this is the case
for ϑ = −π/2, sin ϑ = −1.

Problem 3.18.
We can prove this as follows. Let λ be an eigenvalue and let Xi be an eigenvector of aij
corresponding to λ. Then we have

aij Xj = λ Xi .

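Multiplication by $a^{(-1)}_{ki}$ gives

\[ a^{(-1)}_{ki}\, a_{ij} X_j = \lambda\, a^{(-1)}_{ki} X_i, \qquad X_k = \lambda\, a^{(-1)}_{ki} X_i, \qquad a^{(-1)}_{ki} X_i = \lambda^{-1} X_k, \quad \text{Q. E. D.} \]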

Problem 3.19.
We can prove this by mathematical induction:
Basic step: n = 0.
For n = 0, according to (3.87), this statement is satisfied.
Inductive step:
Assumption: For n = k we have $(A^k)^T = (A^T)^k$. (a)

1. Statement: For n = k + 1 we have $(A^{k+1})^T = (A^T)^{k+1}$.

2. Statement: For n = k − 1 we have $(A^{k-1})^T = (A^T)^{k-1}$.

Proof of statement 1:

\[ (A^{k+1})^T \overset{(3.86)}{=} (A \cdot A^k)^T \overset{(2.35)}{=} (A^k)^T \cdot A^T \overset{(a)}{=} (A^T)^k \cdot A^T = (A^T)^{k+1}, \quad \text{Q. E. D.} \]

Proof of statement 2:

\[ (A^{k-1})^T \overset{(3.86)}{=} (A^{-1} \cdot A^k)^T \overset{(2.35)}{=} (A^k)^T \cdot (A^{-1})^T \overset{(a),\,(1.54)}{=} (A^T)^k \cdot (A^T)^{-1} = (A^T)^{k-1}, \quad \text{Q. E. D.} \]

Problem 3.20.
We have U = a ⋅ X and we set V := a ⋅ U. Then we have

\[ W = a^3 \cdot X = a^2 \cdot a \cdot X = a^2 \cdot U = a \cdot \underbrace{a \cdot U}_{V} = a \cdot V. \]

The tensor a rotates (in each case through the angle ϑ) first the vector X into the
vector U, then the vector U into the vector V, and finally the vector V into the vector W,
i. e. $a^3$ rotates the vector X through the angle 3 ϑ into the vector W.
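
A quick numerical illustration of this statement, assuming for concreteness a rotation about the z-axis (the problem itself does not fix the axis, and `rotation_z` is a hypothetical helper): the cube of the rotation matrix for the angle ϑ should coincide with the rotation matrix for the angle 3 ϑ.

```python
import math

def rotation_z(theta):
    """Coordinate matrix of a rotation through theta about the z-axis (assumed axis)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

theta = 0.4
a = rotation_z(theta)
a_cubed = matmul(a, matmul(a, a))
target = rotation_z(3 * theta)
max_deviation = max(abs(a_cubed[i][j] - target[i][j]) for i in range(3) for j in range(3))
print(max_deviation)
```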

Problem 3.21.
Problem 3.13 shows that the tensor has the following eigenvalues and the following
corresponding normalized eigenvectors, which form a right-handed coordinate system:

λ1 = λ2 = 9, q1 = (√2/2, 0, √2/2), q2 = (0, 1, 0),
λ3 = 1, q3 = (−√2/2, 0, √2/2).

A. Since the original tensor is symmetric and all its eigenvalues are positive, the ten-
sor is positive definite.
B. We can compute the coordinates of the positive definite square root b = √a, using
(3.90), from

\[ b_{ij} = \sqrt{\lambda_k}\; \overset{k}{q}_i\, \overset{k}{q}_j . \]

We can also write this equation in the form

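\[ b_{ij} = \begin{pmatrix} \sqrt{\lambda_1}\,\overset{1}{q}_1 & \sqrt{\lambda_2}\,\overset{2}{q}_1 & \sqrt{\lambda_3}\,\overset{3}{q}_1 \\ \sqrt{\lambda_1}\,\overset{1}{q}_2 & \sqrt{\lambda_2}\,\overset{2}{q}_2 & \sqrt{\lambda_3}\,\overset{3}{q}_2 \\ \sqrt{\lambda_1}\,\overset{1}{q}_3 & \sqrt{\lambda_2}\,\overset{2}{q}_3 & \sqrt{\lambda_3}\,\overset{3}{q}_3 \end{pmatrix} \begin{pmatrix} \overset{1}{q}_1 & \overset{1}{q}_2 & \overset{1}{q}_3 \\ \overset{2}{q}_1 & \overset{2}{q}_2 & \overset{2}{q}_3 \\ \overset{3}{q}_1 & \overset{3}{q}_2 & \overset{3}{q}_3 \end{pmatrix} \]

or in the form

\[ b_{ij} = \sqrt{\lambda_1} \begin{pmatrix} \overset{1}{q}_1 \\ \overset{1}{q}_2 \\ \overset{1}{q}_3 \end{pmatrix} \left( \overset{1}{q}_1, \overset{1}{q}_2, \overset{1}{q}_3 \right) + \sqrt{\lambda_2} \begin{pmatrix} \overset{2}{q}_1 \\ \overset{2}{q}_2 \\ \overset{2}{q}_3 \end{pmatrix} \left( \overset{2}{q}_1, \overset{2}{q}_2, \overset{2}{q}_3 \right) + \sqrt{\lambda_3} \begin{pmatrix} \overset{3}{q}_1 \\ \overset{3}{q}_2 \\ \overset{3}{q}_3 \end{pmatrix} \left( \overset{3}{q}_1, \overset{3}{q}_2, \overset{3}{q}_3 \right). \]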

Both ways lead to

\[ b_{ij} = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 3 & 0 \\ 1 & 0 & 2 \end{pmatrix}. \]

We can verify this result by showing that the square of bij is aij .
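
The verification can also be done numerically. The following sketch (illustrative only) builds the square root from the spectral sum over the eigenvalues and eigenvectors listed above and then squares the result; the matrix `a` is the coordinate matrix of Problem 3.13.

```python
import math

a = [[5.0, 0.0, 4.0], [0.0, 9.0, 0.0], [4.0, 0.0, 5.0]]
s = math.sqrt(2) / 2
# Pairs of eigenvalue and normalized eigenvector from Problem 3.13.
eigenpairs = [(9.0, (s, 0.0, s)), (9.0, (0.0, 1.0, 0.0)), (1.0, (-s, 0.0, s))]

# Spectral sum: b_ij = sum over k of sqrt(lambda_k) * q_i * q_j.
b = [[sum(math.sqrt(lam) * q[i] * q[j] for lam, q in eigenpairs) for j in range(3)]
     for i in range(3)]
b_squared = [[sum(b[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(b)
```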

Problem 3.22.
Straightforward computations give

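\[ a_{ik} a^T_{kj} = \begin{pmatrix} 9 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 9 \end{pmatrix}, \qquad V_{ij} = \sqrt{a_{ik} a^T_{kj}} = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix}. \]

Using the Gauss–Jordan algorithm we obtain from Vik Rkj = aij

\[ R_{ij} = \begin{pmatrix} \tfrac23 & \tfrac23 & \tfrac13 \\ \tfrac13 & -\tfrac23 & \tfrac23 \\ \tfrac23 & -\tfrac13 & -\tfrac23 \end{pmatrix}, \]

and we can easily verify that Vik Rkj = aij .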

Problem 4.1.
A. We see from the figure that

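\[
\begin{aligned}
OB &= x_1, & OA &= OC \cos\alpha = u^1 \cos\alpha, \\
BP &= x_2, & AB &= CP \sin\beta = u^2 \sin\beta, \\
OC &= u^1, & BD &= OC \sin\alpha = u^1 \sin\alpha, \\
CP &= u^2, & DP &= CP \cos\beta = u^2 \cos\beta,
\end{aligned}
\]

and therefore

\[ x_1 = u^1 \cos\alpha + u^2 \sin\beta, \qquad x_2 = u^1 \sin\alpha + u^2 \cos\beta. \]

Multiplying the first equation by cos β and the second by −sin β and adding, and
likewise multiplying by −sin α and cos α and adding, gives

\[
\begin{aligned}
x_1 \cos\beta - x_2 \sin\beta &= u^1 (\cos\alpha \cos\beta - \sin\alpha \sin\beta) = u^1 \cos(\alpha+\beta), \\
-x_1 \sin\alpha + x_2 \cos\alpha &= u^2 (-\sin\alpha \sin\beta + \cos\alpha \cos\beta) = u^2 \cos(\alpha+\beta),
\end{aligned}
\]

so that

\[
u^1 = \frac{\cos\beta}{\cos(\alpha+\beta)}\, x_1 - \frac{\sin\beta}{\cos(\alpha+\beta)}\, x_2, \qquad
u^2 = -\frac{\sin\alpha}{\cos(\alpha+\beta)}\, x_1 + \frac{\cos\alpha}{\cos(\alpha+\beta)}\, x_2 .
\]

B. $\underset{i}{g}{}_j = \dfrac{\partial x_j}{\partial u^i}$ gives

\[
g_1 = \left( \frac{\partial x_1}{\partial u^1}, \frac{\partial x_2}{\partial u^1} \right) = (\cos\alpha, \sin\alpha), \qquad
g_2 = \left( \frac{\partial x_1}{\partial u^2}, \frac{\partial x_2}{\partial u^2} \right) = (\sin\beta, \cos\beta);
\]

$\overset{i}{g}{}_j = \dfrac{\partial u^i}{\partial x_j}$ gives

\[
\begin{aligned}
g^1 &= \left( \frac{\partial u^1}{\partial x_1}, \frac{\partial u^1}{\partial x_2} \right) = \left( \frac{\cos\beta}{\cos(\alpha+\beta)}, -\frac{\sin\beta}{\cos(\alpha+\beta)} \right), \\
g^2 &= \left( \frac{\partial u^2}{\partial x_1}, \frac{\partial u^2}{\partial x_2} \right) = \left( -\frac{\sin\alpha}{\cos(\alpha+\beta)}, \frac{\cos\alpha}{\cos(\alpha+\beta)} \right).
\end{aligned}
\]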

In this case the covariant basis vectors are unit vectors, but the contravariant basis
vectors are not.
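
A short numerical sketch can illustrate both statements. The angles are arbitrary sample values chosen for illustration and `oblique_bases` is a hypothetical helper; the script checks that the two bases are reciprocal and compares the lengths of the basis vectors.

```python
import math

def oblique_bases(alpha, beta):
    """Cartesian coordinates of the covariant and contravariant basis vectors."""
    c = math.cos(alpha + beta)
    covariant = [(math.cos(alpha), math.sin(alpha)),
                 (math.sin(beta), math.cos(beta))]
    contravariant = [(math.cos(beta) / c, -math.sin(beta) / c),
                     (-math.sin(alpha) / c, math.cos(alpha) / c)]
    return covariant, contravariant

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# Sample angles (assumption, any values with cos(alpha + beta) != 0 would do).
alpha, beta = math.radians(20.0), math.radians(15.0)
g_cov, g_con = oblique_bases(alpha, beta)

# Reciprocity: the scalar product of g^i and g_j should be the Kronecker delta.
reciprocity = [[dot(g_con[i], g_cov[j]) for j in range(2)] for i in range(2)]
lengths_cov = [math.sqrt(dot(g, g)) for g in g_cov]
lengths_con = [math.sqrt(dot(g, g)) for g in g_con]
print(reciprocity, lengths_cov, lengths_con)
```

The contravariant basis vectors come out with the length 1/cos(α + β), which equals 1 only if α + β = 0.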


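Problem 4.2.

\[
\begin{aligned}
x &= R \cos\varphi: & \frac{\partial x}{\partial R} &= \cos\varphi, & \frac{\partial x}{\partial \varphi} &= -R \sin\varphi, & \frac{\partial x}{\partial z} &= 0; \\
y &= R \sin\varphi: & \frac{\partial y}{\partial R} &= \sin\varphi, & \frac{\partial y}{\partial \varphi} &= R \cos\varphi, & \frac{\partial y}{\partial z} &= 0; \\
z &= z: & \frac{\partial z}{\partial R} &= 0, & \frac{\partial z}{\partial \varphi} &= 0, & \frac{\partial z}{\partial z} &= 1;
\end{aligned}
\]

\[
\begin{aligned}
R = \sqrt{x^2 + y^2}: \quad \frac{\partial R}{\partial x} &= \frac{2x}{2\sqrt{x^2 + y^2}} = \frac{x}{R} = \cos\varphi, \\
\frac{\partial R}{\partial y} &= \frac{2y}{2\sqrt{x^2 + y^2}} = \frac{y}{R} = \sin\varphi, \\
\frac{\partial R}{\partial z} &= 0;
\end{aligned}
\]

\[
\begin{aligned}
\varphi = \arctan\frac{y}{x}: \quad \frac{\partial \varphi}{\partial x} &= \frac{1}{1 + (y/x)^2} \left( -\frac{y}{x^2} \right) = -\frac{x^2}{x^2 + y^2}\, \frac{y}{x^2} = -\frac{y}{R^2} = -\frac{1}{R} \sin\varphi, \\
\frac{\partial \varphi}{\partial y} &= \frac{1}{1 + (y/x)^2}\, \frac{1}{x} = \frac{x^2}{x^2 + y^2}\, \frac{1}{x} = \frac{x}{R^2} = \frac{1}{R} \cos\varphi, \\
\frac{\partial \varphi}{\partial z} &= 0;
\end{aligned}
\]

$\underset{i}{g}{}_j = \dfrac{\partial x_j}{\partial u^i}$ gives

\[
\begin{aligned}
g_1 &= \left( \frac{\partial x}{\partial R}, \frac{\partial y}{\partial R}, \frac{\partial z}{\partial R} \right) = (\cos\varphi, \sin\varphi, 0), \\
g_2 &= \left( \frac{\partial x}{\partial \varphi}, \frac{\partial y}{\partial \varphi}, \frac{\partial z}{\partial \varphi} \right) = (-R \sin\varphi, R \cos\varphi, 0), \\
g_3 &= \left( \frac{\partial x}{\partial z}, \frac{\partial y}{\partial z}, \frac{\partial z}{\partial z} \right) = (0, 0, 1);
\end{aligned}
\]

$\overset{i}{g}{}_j = \dfrac{\partial u^i}{\partial x_j}$ gives

\[
\begin{aligned}
g^1 &= \left( \frac{\partial R}{\partial x}, \frac{\partial R}{\partial y}, \frac{\partial R}{\partial z} \right) = (\cos\varphi, \sin\varphi, 0), \\
g^2 &= \left( \frac{\partial \varphi}{\partial x}, \frac{\partial \varphi}{\partial y}, \frac{\partial \varphi}{\partial z} \right) = \left( -\frac{1}{R} \sin\varphi, \frac{1}{R} \cos\varphi, 0 \right), \\
g^3 &= \left( \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}, \frac{\partial z}{\partial z} \right) = (0, 0, 1).
\end{aligned}
\]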


Problem 4.3.
Equation (4.14) relates contravariant coordinates and Cartesian coordinates. We have

\[ T^{ij} = g^i \cdot T \cdot g^j = \overset{i}{g}_m\, \hat T_{mn}\, \overset{j}{g}_n . \]

In order to apply this equation, we first have to compute the Cartesian coordinates of
the basis reciprocal to $g_i$. We get, for example using the Gauss–Jordan algorithm,

\[ g^1 = (1, -1, 0), \qquad g^2 = (0, 1, -1), \qquad g^3 = (0, 0, 1), \]

and further, using Falk’s scheme, in two steps

\[
\begin{array}{c|c}
\hat T_{mn} & \overset{j}{g}_n \\ \hline
\overset{i}{g}_m & T^{ij}
\end{array}
, \qquad
T^{ij} = \begin{pmatrix} 2 & -6 & 3 \\ -1 & 6 & -2 \\ 1 & -3 & 1 \end{pmatrix}.
\]

Problem 4.4.
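If we identify the tilde coordinates in (4.19), i. e. in

\[ \tilde a^i = \frac{\partial \tilde u^i}{\partial u^m}\, a^m, \qquad \tilde a_i = \frac{\partial u^m}{\partial \tilde u^i}\, a_m, \]

with cylindrical coordinates and the no-tilde coordinates with Cartesian coordinates,
and then use the solutions to Problem 4.2, we get

\[
\begin{aligned}
a^1 &= \frac{\partial R}{\partial x} a_x + \frac{\partial R}{\partial y} a_y + \frac{\partial R}{\partial z} a_z = a_x \cos\varphi + a_y \sin\varphi, \\
a^2 &= \frac{\partial \varphi}{\partial x} a_x + \frac{\partial \varphi}{\partial y} a_y + \frac{\partial \varphi}{\partial z} a_z = -a_x \frac{\sin\varphi}{R} + a_y \frac{\cos\varphi}{R}, \\
a^3 &= \frac{\partial z}{\partial x} a_x + \frac{\partial z}{\partial y} a_y + \frac{\partial z}{\partial z} a_z = a_z; \\
a_1 &= \frac{\partial x}{\partial R} a_x + \frac{\partial y}{\partial R} a_y + \frac{\partial z}{\partial R} a_z = a_x \cos\varphi + a_y \sin\varphi, \\
a_2 &= \frac{\partial x}{\partial \varphi} a_x + \frac{\partial y}{\partial \varphi} a_y + \frac{\partial z}{\partial \varphi} a_z = -a_x R \sin\varphi + a_y R \cos\varphi, \\
a_3 &= \frac{\partial x}{\partial z} a_x + \frac{\partial y}{\partial z} a_y + \frac{\partial z}{\partial z} a_z = a_z .
\end{aligned}
\]

Conversely, if we identify the tilde coordinates with Cartesian coordinates and the
no-tilde coordinates with cylindrical coordinates, we get

\[
\begin{aligned}
a_x &= \frac{\partial x}{\partial R} a^1 + \frac{\partial x}{\partial \varphi} a^2 + \frac{\partial x}{\partial z} a^3 = a^1 \cos\varphi - a^2 R \sin\varphi \\
&= \frac{\partial R}{\partial x} a_1 + \frac{\partial \varphi}{\partial x} a_2 + \frac{\partial z}{\partial x} a_3 = a_1 \cos\varphi - a_2 \frac{\sin\varphi}{R}, \\
a_y &= \frac{\partial y}{\partial R} a^1 + \frac{\partial y}{\partial \varphi} a^2 + \frac{\partial y}{\partial z} a^3 = a^1 \sin\varphi + a^2 R \cos\varphi \\
&= \frac{\partial R}{\partial y} a_1 + \frac{\partial \varphi}{\partial y} a_2 + \frac{\partial z}{\partial y} a_3 = a_1 \sin\varphi + a_2 \frac{\cos\varphi}{R}, \\
a_z &= \frac{\partial z}{\partial R} a^1 + \frac{\partial z}{\partial \varphi} a^2 + \frac{\partial z}{\partial z} a^3 = a^3 \\
&= \frac{\partial R}{\partial z} a_1 + \frac{\partial \varphi}{\partial z} a_2 + \frac{\partial z}{\partial z} a_3 = a_3 .
\end{aligned}
\]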

Problem 4.5.
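A. With the solutions to Problem 4.1, we obtain

\[
\begin{aligned}
A^i = \frac{\partial u^i}{\partial x_j}\, \hat A_j \quad \text{gives} \quad
A^1 &= \frac{\cos\beta}{\cos(\alpha+\beta)}\, A_x - \frac{\sin\beta}{\cos(\alpha+\beta)}\, A_y, \\
A^2 &= -\frac{\sin\alpha}{\cos(\alpha+\beta)}\, A_x + \frac{\cos\alpha}{\cos(\alpha+\beta)}\, A_y; \\
\hat A_i = \frac{\partial x_i}{\partial u^j}\, A^j \quad \text{gives} \quad
A_x &= A^1 \cos\alpha + A^2 \sin\beta, \\
A_y &= A^1 \sin\alpha + A^2 \cos\beta; \\
A_i = \frac{\partial x_j}{\partial u^i}\, \hat A_j \quad \text{gives} \quad
A_1 &= A_x \cos\alpha + A_y \sin\alpha, \\
A_2 &= A_x \sin\beta + A_y \cos\beta.
\end{aligned}
\]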

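\[
\begin{aligned}
& A = \overrightarrow{PQ}, \\
& A^1 = JI = PR, \quad A^1 g_1 = \overrightarrow{PR}, \\
& A^2 = HC = RQ, \quad A^2 g_2 = \overrightarrow{RQ}, \\
& A_x = KL = PG = PF + FG, \quad PF = PR \cos\alpha = A^1 \cos\alpha, \quad FG = RD = RQ \sin\beta = A^2 \sin\beta, \\
& A_y = EB = GQ = GD + DQ, \quad GD = FR = PR \sin\alpha = A^1 \sin\alpha, \quad DQ = RQ \cos\beta = A^2 \cos\beta, \\
& A^1 g_1 = \overrightarrow{PR}, \quad |A^1 g_1| = PR = A^1, \\
& A^2 g_2 = \overrightarrow{RQ}, \quad |A^2 g_2| = RQ = A^2 .
\end{aligned}
\]

\[
\begin{aligned}
& A = \overrightarrow{PQ}, \\
& A_x = UV = PN = GQ, \quad A_y = MF = PG = NQ, \\
& A_1 = SK = PH = TL = TN + NL, \quad TN = PN \cos\alpha = A_x \cos\alpha, \quad NL = NQ \sin\alpha = A_y \sin\alpha, \\
& A_2 = JD = PE = IC = IG + GC, \quad IG = PG \cos\beta = A_y \cos\beta, \quad GC = GQ \sin\beta = A_x \sin\beta, \\
& A_1 g^1 = \overrightarrow{PR}, \quad |A_1 g^1| = PR = \frac{PH}{\cos(\alpha+\beta)} = \frac{A_1}{\cos(\alpha+\beta)}, \\
& A_2 g^2 = \overrightarrow{RQ} = \overrightarrow{PB}, \quad |A_2 g^2| = PB = \frac{PE}{\cos(\alpha+\beta)} = \frac{A_2}{\cos(\alpha+\beta)} .
\end{aligned}
\]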


Problem 4.6.
From the solution to Problem 4.2 we know
$g_{11} = 1$, $g_{22} = R^2$, $g_{33} = 1$, $g^{11} = 1$, $g^{22} = 1/R^2$, $g^{33} = 1$.
The off-diagonal elements are all zero, because cylindrical coordinates are orthogonal.

Problem 4.7.
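According to (4.19), the transformation equations are

\[ \tilde g_{ij} = \frac{\partial u^m}{\partial \tilde u^i}\, \frac{\partial u^n}{\partial \tilde u^j}\, g_{mn} . \]

Computing the determinant and using (1.13) yields

\[ \det(\tilde g_{ij}) = \det\left( \frac{\partial u^m}{\partial \tilde u^i} \right) \det\left( \frac{\partial u^n}{\partial \tilde u^j} \right) \det(g_{mn}), \]

\[ \tilde g = \left[ \frac{\partial(u^1, u^2, u^3)}{\partial(\tilde u^1, \tilde u^2, \tilde u^3)} \right]^2 g . \]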

A quantity satisfying this transformation law is called a pseudoscalar of weight 2.

The determinant of the Cartesian coordinates of a second-order tensor is, accord-
ing to (3.7), a scalar (and is called the determinant of the tensor); the determinant of its
covariant coordinates clearly is a pseudoscalar of weight 2. Analogously, we can show
that the determinant of the mixed coordinates of a second-order tensor is a scalar and
that the determinant of its contravariant coordinates is a pseudoscalar of weight −2.

Problem 4.8.
A tensor of order three has $3^3 = 27$ Cartesian coordinates, each of which can be
translated into $2^3 = 8$ different holonomic coordinates, so the ε-tensor has 8 ⋅ 27 = 216
holonomic coordinates. Because cylindrical coordinates are orthogonal, all the
coordinates of the ε-tensor which match in two indices are zero, so only 8 ⋅ 6 = 48
coordinates are nonzero.
The easiest way to compute these coordinates is from (4.33):

\[ e_{ijk} = [g_i, g_j, g_k], \qquad e_{ij}{}^{k} = [g_i, g_j, g^k], \quad \text{etc.} \]

Since the basis vectors are mutually orthogonal, the magnitude of each triple product
is equal to the product of the lengths of the three vectors spanning it, and a
coordinate is positive or negative, depending on whether the three
vectors of the triple product form a right-handed system or a left-handed system.

Problem 4.9.
\[ e^{ijk}\, e^{m}{}_{nk} = g^{im}\, \delta^j_n - \delta^i_n\, g^{jm} . \]


Problem 4.10.
Possible translations are:
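A. a ⋅ b = c ⇐⇒ $a_{ij}\, b^{j}{}_{k} = c_{ik}$,
b ⋅ a = c ⇐⇒ $b^{ij}\, a_{jk} = c^{i}{}_{k}$,
$\hat a_{ik} \hat b_{ij} = \hat c_{jk}$ ⇐⇒ $a_{ik}\, b^{i}{}_{j} = c_{jk}$,
$\hat a_i \hat b_k \hat a_i \hat c_k = d$ ⇐⇒ $a_i\, b_k\, a^i\, c^k = d$,
$\hat a_{ijk} \hat b_i \hat c_k = \hat d_j$ ⇐⇒ $a_{ijk}\, b^i c^k = d_j$,
$\hat a_{ijk} \hat b_i \hat c_j = \hat d_k$ ⇐⇒ $a_{ijk}\, b^i c^j = d_k$,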
(a ⋅ b)T = c ⇐⇒ (aij bjk )T = ci k ,
(a ⋅ b ⋅ c)T = d ⇐⇒ (ai j bj k ck l )T = di l .

B. eijk aj ekmn bm cn = aj cj bi − aj bj ci .

Problem 4.11.
Since cylindrical coordinates are orthogonal, our considerations from Section 4.3,
No. 3 apply here as well.
A. From the solution to Problem 4.2 it follows immediately, after normalizing the ba-
sis vectors, that

g <1> = (cos φ, sin φ, 0), g <2> = (− sin φ, cos φ, 0), g <3> = (0, 0, 1).

B. According to (4.48), we have a<i> = ai √gii , and with g11 = 1, g22 = R2 , g33 = 1,
it follows that

ar = ax cos φ + ay sin φ, ax = ar cos φ − aφ sin φ,

aφ = −ax sin φ + ay cos φ, ay = ar sin φ + aφ cos φ,
az = az , az = az .

C. Since the g <i> are locally Cartesian and since the coordinates of δ and ε are equal
in all Cartesian coordinate systems, it follows that g <ij> = δij , e<ijk> = εijk .
D. (a ⋅ b)T = c ⇐⇒ (a<ij> b<jk>)T = c<ik>,
aik bi j = cjk ⇐⇒ a<ik> b<ij> = c<jk>,
a × (b × c) = a ⋅ c b − a ⋅ b c ⇐⇒ εijk a<j> εkmn b<m> c<n>
= a<j> c<j> b<i> − a<j> b<j> c<i>.


Problem 4.12.
The easiest way to compute the Christoffel symbols is by (4.60).
For cylindrical coordinates we get

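\[
\begin{aligned}
g_{11} &= 1, & g_{11,i} &= 0, \\
g_{22} &= R^2, & g_{22,1} &= 2R, \quad g_{22,2} = g_{22,3} = 0, \\
g_{33} &= 1, & g_{33,i} &= 0, \\
g^{11} &= 1, & g^{22} &= \frac{1}{R^2}, \quad g^{33} = 1.
\end{aligned}
\]

This gives

\[ \Gamma^1_{ip} = \tfrac12\, g^{11} (g_{1i,p} + g_{1p,i} - g_{ip,1}), \]

which is nonzero only for i = p = 2, and then we have $\Gamma^1_{22} = -R$;

\[ \Gamma^2_{ip} = \tfrac12\, g^{22} (g_{2i,p} + g_{2p,i} - g_{ip,2}), \]

which is nonzero only for i = 2, p = 1 and i = 1, p = 2, so we have $\Gamma^2_{12} = \Gamma^2_{21} = 1/R$;

\[ \Gamma^3_{ip} = \tfrac12\, g^{33} (g_{3i,p} + g_{3p,i} - g_{ip,3}), \]

which is zero for all i and p.

So the only nonzero Christoffel symbols are $\Gamma^1_{22} = -R$, $\Gamma^2_{12} = \Gamma^2_{21} = 1/R$.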
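
The same result can be reproduced numerically from the metric coefficients alone. The sketch below is an illustration, not the book's method: it approximates the derivatives of the metric by central differences, applies the formula Γ^k_ij = ½ g^kl (g_li,j + g_lj,i − g_ij,l), and uses the fact that the cylindrical metric is diagonal to invert it. The function names and the sample point are assumptions.

```python
def metric(u):
    """Covariant metric coefficients g_ij of cylindrical coordinates u = (R, phi, z)."""
    R = u[0]
    return [[1.0, 0.0, 0.0], [0.0, R * R, 0.0], [0.0, 0.0, 1.0]]

def christoffel(u, h=1e-5):
    """Christoffel symbols gamma[k][i][j] from central differences of the metric."""
    n = 3
    dg = []  # dg[l][i][j] approximates the derivative of g_ij with respect to u^l
    for l in range(n):
        up, down = list(u), list(u)
        up[l] += h
        down[l] -= h
        gp, gm = metric(up), metric(down)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)])
    g = metric(u)
    g_inv = [1.0 / g[k][k] for k in range(n)]  # valid only because the metric is diagonal
    return [[[0.5 * g_inv[k] * (dg[j][k][i] + dg[i][k][j] - dg[k][i][j])
              for j in range(n)] for i in range(n)] for k in range(n)]

R = 2.0
gamma = christoffel((R, 0.3, 1.0))
nonzero = [(k + 1, i + 1, j + 1) for k in range(3) for i in range(3) for j in range(3)
           if abs(gamma[k][i][j]) > 1e-6]
print(nonzero)
```

The list `nonzero` holds the index triples (k, i, j) of the nonvanishing symbols, counted from 1 as in the text.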

Problem 4.13.
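We can derive the transformation law as follows. Starting with equation (4.58)$_1$ in the
tilde system, we substitute the transformation equations (4.18)$_1$ for the covariant basis
vectors, so we have

\[
\tilde\Gamma^i_{jk}\, \tilde g_i = \frac{\partial \tilde g_j}{\partial \tilde u^k}
= \frac{\partial}{\partial \tilde u^k} \left( \frac{\partial u^m}{\partial \tilde u^j}\, g_m \right)
= \frac{\partial u^m}{\partial \tilde u^j}\, \frac{\partial g_m}{\partial \tilde u^k} + \frac{\partial^2 u^m}{\partial \tilde u^j\, \partial \tilde u^k}\, g_m ;
\]

with

\[
\frac{\partial g_m}{\partial \tilde u^k} = \frac{\partial g_m}{\partial u^n}\, \frac{\partial u^n}{\partial \tilde u^k} \overset{(4.58)}{=} \frac{\partial u^n}{\partial \tilde u^k}\, \Gamma^l_{mn}\, g_l
\quad \text{and} \quad
g_l \overset{(4.18)}{=} \frac{\partial \tilde u^i}{\partial u^l}\, \tilde g_i
\]

we obtain

\[
\begin{aligned}
\tilde\Gamma^i_{jk}\, \tilde g_i &= \frac{\partial u^m}{\partial \tilde u^j}\, \frac{\partial u^n}{\partial \tilde u^k}\, \frac{\partial \tilde u^i}{\partial u^l}\, \Gamma^l_{mn}\, \tilde g_i + \frac{\partial^2 u^m}{\partial \tilde u^j\, \partial \tilde u^k}\, \frac{\partial \tilde u^i}{\partial u^m}\, \tilde g_i, \\
\tilde\Gamma^i_{jk} &= \frac{\partial \tilde u^i}{\partial u^l}\, \frac{\partial u^m}{\partial \tilde u^j}\, \frac{\partial u^n}{\partial \tilde u^k}\, \Gamma^l_{mn} + \frac{\partial^2 u^m}{\partial \tilde u^j\, \partial \tilde u^k}\, \frac{\partial \tilde u^i}{\partial u^m} .
\end{aligned}
\]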

The first term is the transformation law for single contravariant and twice covariant
tensor coordinates. The second term shows that Christoffel symbols are not tensor co-
ordinates. If the no-tilde coordinates are Cartesian, the Christoffel symbols vanish in
the no-tilde coordinate system and the transformation law reduces to Definition 4.56
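of the Christoffel symbols. Starting from equation (4.58)$_2$, we analogously obtain the
equivalent transformation law

\[
\tilde\Gamma^i_{jk} = \frac{\partial \tilde u^i}{\partial u^l}\, \frac{\partial u^m}{\partial \tilde u^j}\, \frac{\partial u^n}{\partial \tilde u^k}\, \Gamma^l_{mn} - \frac{\partial^2 \tilde u^i}{\partial u^m\, \partial u^n}\, \frac{\partial u^m}{\partial \tilde u^j}\, \frac{\partial u^n}{\partial \tilde u^k} .
\]
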
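Problem 4.14.
A. a × curl a = b ⇐⇒ $\varepsilon_{ijk}\, \hat a_j\, \dfrac{\partial \hat a_m}{\partial x_n}\, \varepsilon_{mkn} = \hat b_i$ ⇐⇒ $e_{ijk}\, a^j\, a_m|_n\, e^{mkn} = b_i$.
B. $\dfrac{\partial \hat a_i}{\partial x_j} + \dfrac{\partial \hat a_j}{\partial x_i} = \hat b_{ij}$ ⇐⇒ $a_i|_j + a_j|_i = b_{ij}$ ⇐⇒ grad a + grad$^T$ a = b.
C. $a^i\, b_j|_i = c_j$ ⇐⇒ $\hat a_i\, \dfrac{\partial \hat b_j}{\partial x_i} = \hat c_j$ ⇐⇒ (grad b) ⋅ a = c.
D. \[
\begin{aligned}
e^{i}{}_{jk}\, e_{imn}\, a^j b^k c^m d^n &= e^{i}{}_{jk}\, a^j b^k\, e_{imn}\, c^m d^n \\
&= (g_{jm}\, g_{kn} - g_{jn}\, g_{km})\, a^j b^k c^m d^n = a^j c_j\, b^k d_k - a^j d_j\, b^k c_k, \\
\iff \varepsilon_{ijk}\, \varepsilon_{imn}\, \hat a_j \hat b_k \hat c_m \hat d_n &= \hat a_j \hat c_j\, \hat b_k \hat d_k - \hat a_j \hat d_j\, \hat b_k \hat c_k, \\
\iff (a \times b) \cdot (c \times d) &= a \cdot c\;\, b \cdot d - a \cdot d\;\, b \cdot c .
\end{aligned}
\]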

Problem 4.15.
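A. \[
\begin{aligned}
(\operatorname{grad} a)_i &\overset{(4.73)}{=} a|_i \overset{(4.61)}{=} a_{,i}, \\
(\operatorname{grad} a)_R &\overset{(B.6)}{=} (\operatorname{grad} a)_1 = a_{,1} = \frac{\partial a}{\partial R}, \\
(\operatorname{grad} a)_\varphi &= \frac1R\, (\operatorname{grad} a)_2 = \frac1R\, a_{,2} = \frac1R\, \frac{\partial a}{\partial \varphi}, \\
(\operatorname{grad} a)_z &= (\operatorname{grad} a)_3 = a_{,3} = \frac{\partial a}{\partial z} .
\end{aligned}
\]

B. \[
\begin{aligned}
\operatorname{div} a &\overset{(4.76)}{=} a^i|_i \overset{(4.61)}{=} a^i{}_{,i} + \Gamma^i_{mi}\, a^m \overset{(B.13)}{=} a^1{}_{,1} + a^2{}_{,2} + a^3{}_{,3} + \Gamma^2_{12}\, a^1 \\
&= \frac{\partial a^1}{\partial R} + \frac{\partial a^2}{\partial \varphi} + \frac{\partial a^3}{\partial z} + \frac1R\, a^1
\overset{(B.6)}{=} \frac{\partial a_R}{\partial R} + \frac{\partial}{\partial \varphi} \left( \frac{a_\varphi}{R} \right) + \frac{\partial a_z}{\partial z} + \frac{a_R}{R} \\
&= \underbrace{\frac{\partial a_R}{\partial R} + \frac{a_R}{R}}_{\frac1R \frac{\partial}{\partial R}(R\, a_R)} + \frac1R\, \frac{\partial a_\varphi}{\partial \varphi} + \frac{\partial a_z}{\partial z} .
\end{aligned}
\]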

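C. $(\operatorname{curl} a)^j = e^{ijk}\, a_{i,k}$,

\[
\begin{aligned}
(\operatorname{curl} a)_R &\overset{(B.6)}{=} (\operatorname{curl} a)^1 = e^{312}\, a_{3,2} + e^{213}\, a_{2,3} \overset{(B.11)}{=} \frac1R \left( \frac{\partial a_3}{\partial \varphi} - \frac{\partial a_2}{\partial z} \right) \\
&\overset{(B.6)}{=} \frac1R \left[ \frac{\partial a_z}{\partial \varphi} - \frac{\partial}{\partial z} (R\, a_\varphi) \right] = \frac1R\, \frac{\partial a_z}{\partial \varphi} - \frac{\partial a_\varphi}{\partial z} ;
\end{aligned}
\]

analogously we find

\[
(\operatorname{curl} a)_\varphi = \frac{\partial a_R}{\partial z} - \frac{\partial a_z}{\partial R}
\quad \text{and} \quad
(\operatorname{curl} a)_z = \underbrace{\frac{\partial a_\varphi}{\partial R} + \frac{a_\varphi}{R}}_{\frac1R \frac{\partial}{\partial R}(R\, a_\varphi)} - \frac1R\, \frac{\partial a_R}{\partial \varphi} .
\]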

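D. $(\operatorname{grad} a)^{i}{}_{j} \overset{(4.73)}{=} a^i|_j \overset{(4.61)}{=} a^i{}_{,j} + \Gamma^i_{mj}\, a^m$,

\[
(\operatorname{grad} a)_{RR} \overset{(B.9)}{=} (\operatorname{grad} a)^{1}{}_{1} = a^1{}_{,1} + \Gamma^1_{m1}\, a^m \overset{(B.13)}{=} a^1{}_{,1} = \frac{\partial a^1}{\partial R} \overset{(B.6)}{=} \frac{\partial a_R}{\partial R} ;
\]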

analogously we find the remaining coordinates; compare with (B.17).

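E. Here we have two options: analogously to the earlier problems we compute

\[ [(\operatorname{grad} a) \cdot b]^i = a^i|_m\, b^m, \quad \text{etc.}, \]

or we use the fact that cylindrical coordinates are orthogonal, and thus the scalar
product is an algebraic operation where the factors are composed of the physical
cylindrical coordinates, similarly as in Cartesian coordinates. Using the solution
to part D or (B.17), we obtain

\[
\begin{aligned}
[(\operatorname{grad} a) \cdot b]_R &= (\operatorname{grad} a)_{RR}\, b_R + (\operatorname{grad} a)_{R\varphi}\, b_\varphi + (\operatorname{grad} a)_{Rz}\, b_z \\
&= \frac{\partial a_R}{\partial R}\, b_R + \left( \frac1R\, \frac{\partial a_R}{\partial \varphi} - \frac{a_\varphi}{R} \right) b_\varphi + \frac{\partial a_R}{\partial z}\, b_z \\
&= b_R\, \frac{\partial a_R}{\partial R} + \frac{b_\varphi}{R}\, \frac{\partial a_R}{\partial \varphi} + b_z\, \frac{\partial a_R}{\partial z} - \frac{a_\varphi\, b_\varphi}{R}
\end{aligned}
\]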

and analogously the remaining coordinates; compare with (B.18).


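Problem 4.16.
A. $\dfrac{\partial^2 \hat a_m}{\partial x_k^2} = \hat b_m$ ⇐⇒ $a_m|_{kp}\, g^{kp} = b_m$ ⇐⇒ div grad a = Δ a = b.

B. $\dfrac{\partial^2 \hat a_k}{\partial x_m\, \partial x_k} = \hat b_m$ ⇐⇒ $a^k|_{km} = b_m$ ⇐⇒ grad div a = b.

C. d a = (grad a) ⋅ d x ⇐⇒ $\delta a_{ij} = a_{ij}|_k\, \mathrm{d}u^k$ ⇐⇒ $\mathrm{d}\hat a_{ij} = \dfrac{\partial \hat a_{ij}}{\partial x_k}\, \mathrm{d}x_k$.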

Problem 4.17.
Similarly as in the last problem, we have two options:
– We first perform the computations in holonomic coordinates and translate the re-
sult into physical coordinates.
– We compute the Laplace operator as the product of the physical coordinates of the
gradient and the divergence, which we have computed earlier.

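A. Option 1:

\[
\Delta a \overset{(4.84)}{=} g^{mn}\, a|_{mn} \overset{(4.81)}{=} g^{mn}\, a|_m|_n
\overset{(4.61)}{=} g^{mn} \left[ (a|_m)_{,n} - \Gamma^j_{mn}\, a|_j \right] \overset{(4.61)}{=} g^{mn} \left( a_{,mn} - \Gamma^j_{mn}\, a_{,j} \right).
\]

In cylindrical coordinates we have with (B.10) and (B.13)

\[
\begin{aligned}
\Delta a &= g^{11}\, a_{,11} + g^{22} \left( a_{,22} - \Gamma^1_{22}\, a_{,1} \right) + g^{33}\, a_{,33} \\
&= \frac{\partial^2 a}{\partial R^2} + \frac{1}{R^2} \left( \frac{\partial^2 a}{\partial \varphi^2} + R\, \frac{\partial a}{\partial R} \right) + \frac{\partial^2 a}{\partial z^2}, \\
\Delta a &= \underbrace{\frac{\partial^2 a}{\partial R^2} + \frac1R\, \frac{\partial a}{\partial R}}_{\frac1R \frac{\partial}{\partial R} \left( R\, \frac{\partial a}{\partial R} \right)} + \frac{1}{R^2}\, \frac{\partial^2 a}{\partial \varphi^2} + \frac{\partial^2 a}{\partial z^2} .
\end{aligned}
\]

Option 2:

\[ b = \operatorname{grad} a \iff b_R = \frac{\partial a}{\partial R}, \quad b_\varphi = \frac1R\, \frac{\partial a}{\partial \varphi}, \quad b_z = \frac{\partial a}{\partial z}, \]

\[
\begin{aligned}
\Delta a = \operatorname{div} b &= \frac{\partial b_R}{\partial R} + \frac{b_R}{R} + \frac1R\, \frac{\partial b_\varphi}{\partial \varphi} + \frac{\partial b_z}{\partial z} \\
&= \frac{\partial^2 a}{\partial R^2} + \frac1R\, \frac{\partial a}{\partial R} + \frac{1}{R^2}\, \frac{\partial^2 a}{\partial \varphi^2} + \frac{\partial^2 a}{\partial z^2} .
\end{aligned}
\]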
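
The cylindrical form of the Laplace operator can be checked against a function whose Cartesian Laplacian is known. The sketch below is only an illustration under stated assumptions: the test function a = x²y + z² (with Δ a = 2y + 2 in Cartesian coordinates), the sample point, and the step size are arbitrary choices, and the derivatives are approximated by central differences.

```python
import math

def a(R, phi, z):
    """Test function a = x^2 y + z^2, written in cylindrical coordinates."""
    x, y = R * math.cos(phi), R * math.sin(phi)
    return x * x * y + z * z

def laplace_cylindrical(f, R, phi, z, h=1e-3):
    """Laplacian from the cylindrical formula, with central differences."""
    f0 = f(R, phi, z)
    d2_R = (f(R + h, phi, z) - 2 * f0 + f(R - h, phi, z)) / h**2
    d_R = (f(R + h, phi, z) - f(R - h, phi, z)) / (2 * h)
    d2_phi = (f(R, phi + h, z) - 2 * f0 + f(R, phi - h, z)) / h**2
    d2_z = (f(R, phi, z + h) - 2 * f0 + f(R, phi, z - h)) / h**2
    return d2_R + d_R / R + d2_phi / R**2 + d2_z

R, phi, z = 1.5, 0.7, -0.3
numeric = laplace_cylindrical(a, R, phi, z)
exact = 2 * R * math.sin(phi) + 2  # Cartesian result 2y + 2 at the same point
print(numeric, exact)
```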


B. We compute only the R-coordinate and refer to (B.21) for the other two coordinates.
Option 1:

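\[ (\Delta a)_i \overset{(4.84)}{=} g^{mn}\, a_i|_{mn} \overset{(4.81)}{=} g^{mn}\, a_i|_m|_n, \quad \text{with } b_{im} := a_i|_m, \]

\[ (\Delta a)_i = g^{mn}\, b_{im}|_n, \]

\[
\begin{aligned}
a_i|_{mn} = b_{im}|_n &\overset{(4.61)}{=} b_{im,n} - \Gamma^j_{in}\, b_{jm} - \Gamma^j_{mn}\, b_{ij} \\
&\overset{(4.61)}{=} (a_{i,m} - \Gamma^j_{im}\, a_j)_{,n} - \Gamma^j_{in} (a_{j,m} - \Gamma^k_{jm}\, a_k) - \Gamma^j_{mn} (a_{i,j} - \Gamma^k_{ij}\, a_k) \\
&= a_{i,mn} - \Gamma^j_{im,n}\, a_j - \Gamma^j_{im}\, a_{j,n} - \Gamma^j_{in}\, a_{j,m} + \Gamma^j_{in}\, \Gamma^k_{jm}\, a_k - \Gamma^j_{mn}\, a_{i,j} + \Gamma^j_{mn}\, \Gamma^k_{ij}\, a_k .
\end{aligned}
\]

Since we have to contract this expression with $g^{mn}$ and since $g^{mn}$ is nonzero
in cylindrical coordinates only if m = n, only $a_i|_{mm}$ remains. We have

\[ a_i|_{mm} = a_{i,mm} - \Gamma^j_{im,m}\, a_j - 2\, \Gamma^j_{im}\, a_{j,m} - \Gamma^j_{mm}\, a_{i,j} + \Gamma^j_{im}\, \Gamma^k_{jm}\, a_k + \Gamma^j_{mm}\, \Gamma^k_{ij}\, a_k . \]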

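Taking (B.13) into account we obtain

\[
\begin{aligned}
a_1|_{11} &= a_{1,11} = \frac{\partial^2 a_1}{\partial R^2}, \\
a_1|_{22} &= a_{1,22} - \Gamma^2_{12,2}\, a_2 - 2\, \Gamma^2_{12}\, a_{2,2} - \Gamma^1_{22}\, a_{1,1} + \Gamma^2_{12}\, \Gamma^1_{22}\, a_1 \\
&= \frac{\partial^2 a_1}{\partial \varphi^2} - \frac2R\, \frac{\partial a_2}{\partial \varphi} + R\, \frac{\partial a_1}{\partial R} - a_1, \\
a_1|_{33} &= a_{1,33} = \frac{\partial^2 a_1}{\partial z^2},
\end{aligned}
\]

\[
\begin{aligned}
(\Delta a)_1 &= g^{11}\, a_1|_{11} + g^{22}\, a_1|_{22} + g^{33}\, a_1|_{33} \\
&= \frac{\partial^2 a_1}{\partial R^2} + \frac{1}{R^2} \left( \frac{\partial^2 a_1}{\partial \varphi^2} - \frac2R\, \frac{\partial a_2}{\partial \varphi} + R\, \frac{\partial a_1}{\partial R} - a_1 \right) + \frac{\partial^2 a_1}{\partial z^2}, \\
(\Delta a)_R &= \frac{\partial^2 a_R}{\partial R^2} + \frac1R\, \frac{\partial a_R}{\partial R} - \frac{a_R}{R^2} + \frac{1}{R^2}\, \frac{\partial^2 a_R}{\partial \varphi^2} + \frac{\partial^2 a_R}{\partial z^2} - \frac{2}{R^2}\, \frac{\partial a_\varphi}{\partial \varphi} .
\end{aligned}
\]
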

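Option 2: From b = grad a, Δ a = div b it follows that

\[
\begin{aligned}
(\Delta a)_R = (\operatorname{div} b)_R &= \frac{\partial b_{RR}}{\partial R} + \frac1R\, \frac{\partial b_{R\varphi}}{\partial \varphi} + \frac{\partial b_{Rz}}{\partial z} + \frac{b_{RR} - b_{\varphi\varphi}}{R} \\
&= \frac{\partial^2 a_R}{\partial R^2} + \frac1R\, \frac{\partial}{\partial \varphi} \left( \frac1R\, \frac{\partial a_R}{\partial \varphi} - \frac{a_\varphi}{R} \right) + \frac{\partial^2 a_R}{\partial z^2}
+ \frac1R \left( \frac{\partial a_R}{\partial R} - \frac1R\, \frac{\partial a_\varphi}{\partial \varphi} - \frac{a_R}{R} \right) \\
&= \frac{\partial^2 a_R}{\partial R^2} + \frac1R\, \frac{\partial a_R}{\partial R} - \frac{a_R}{R^2} + \frac{1}{R^2}\, \frac{\partial^2 a_R}{\partial \varphi^2} + \frac{\partial^2 a_R}{\partial z^2} - \frac{2}{R^2}\, \frac{\partial a_\varphi}{\partial \varphi} .
\end{aligned}
\]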

Problem 4.18.
A. According to (4.61), we have

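\[ a_i|_k = a_{i,k} - \Gamma^m_{ik}\, a_m =: b_{ik}, \]

\[
\begin{aligned}
a_i|_{kl} = b_{ik}|_l &= b_{ik,l} - \Gamma^n_{il}\, b_{nk} - \Gamma^n_{kl}\, b_{in} \\
&= (a_{i,k} - \Gamma^m_{ik}\, a_m)_{,l} - \Gamma^n_{il} (a_{n,k} - \Gamma^m_{nk}\, a_m) - \Gamma^n_{kl} (a_{i,n} - \Gamma^m_{in}\, a_m) \\
&= \underbrace{a_{i,kl}}_{1} - \Gamma^m_{ik,l}\, a_m - \underbrace{\Gamma^m_{ik}\, a_{m,l}}_{2} - \underbrace{\Gamma^n_{il}\, a_{n,k}}_{3} + \Gamma^n_{il}\, \Gamma^m_{nk}\, a_m - \underbrace{\Gamma^n_{kl}\, a_{i,n}}_{4} + \underbrace{\Gamma^n_{kl}\, \Gamma^m_{in}\, a_m}_{5}, \\
a_i|_{lk} &= \underbrace{a_{i,lk}}_{1} - \Gamma^m_{il,k}\, a_m - \underbrace{\Gamma^m_{il}\, a_{m,k}}_{3} - \underbrace{\Gamma^n_{ik}\, a_{n,l}}_{2} + \Gamma^n_{ik}\, \Gamma^m_{nl}\, a_m - \underbrace{\Gamma^n_{lk}\, a_{i,n}}_{4} + \underbrace{\Gamma^n_{lk}\, \Gamma^m_{in}\, a_m}_{5} .
\end{aligned}
\]

Here equal terms are labeled by numbers, and we get

\[ a_i|_{kl} - a_i|_{lk} = \left( \Gamma^m_{il,k} - \Gamma^m_{ik,l} + \Gamma^n_{il}\, \Gamma^m_{nk} - \Gamma^n_{ik}\, \Gamma^m_{nl} \right) a_m . \]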

B. On the left side are the entirely covariant coordinates of a third-order tensor, and
with the quotient rule the terms in parentheses represent the single contravariant
and triple covariant coordinates $R^m{}_{ikl}$ of a fourth-order tensor. According to (4.83),
the tensor on the left side is the zero tensor for arbitrary $a_m$, i. e. we have
$R^m{}_{ikl}\, a_m = 0$ for arbitrary $a_m$, and this implies that $R^m{}_{ikl} = 0$.

Problem 4.19.
To derive the transformation law for the contravariant basis vectors, we cannot refer
to the Cartesian basis, as we did in Section 4.2.2, because only the relation between
the surface coordinates $u^\alpha$ and $\tilde u^\alpha$ is one-to-one, but the relation (4.102) between
$u^\alpha$ or $\tilde u^\alpha$ and the Cartesian coordinates $x_i$ of the three-dimensional space is not
one-to-one. Instead, we have to use the transformation law (4.105) for the covariant
basis vectors and the orthogonality relations (4.106) and (4.107).
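Since the relation between the surface coordinates $u^\alpha$ and $\tilde u^\alpha$ is one-to-one, we can
substitute the two equations (4.104) into each other, i. e.

\[ u^\beta = u^\beta [\, \tilde u^\alpha (u^\gamma)\, ], \]

and then we obtain, using the chain rule, analogous to (4.17), the relation

\[ \frac{\partial u^\beta}{\partial u^\gamma} = \frac{\partial u^\beta}{\partial \tilde u^\alpha}\, \frac{\partial \tilde u^\alpha}{\partial u^\gamma} = \delta^\beta_\gamma . \]

Therefore, contraction of (4.105) with $\partial \tilde u^\alpha / \partial u^\gamma$ gives

\[ \frac{\partial \tilde u^\alpha}{\partial u^\gamma}\, \tilde g_\alpha = \underbrace{\frac{\partial \tilde u^\alpha}{\partial u^\gamma}\, \frac{\partial u^\beta}{\partial \tilde u^\alpha}}_{\delta^\beta_\gamma}\, g_\beta = g_\gamma . \]

Substitution of $g_\gamma$ into the orthogonality relation (4.107) gives

\[ \delta = g_\gamma\, g^\gamma + n\, n = \frac{\partial \tilde u^\alpha}{\partial u^\gamma}\, \tilde g_\alpha\, g^\gamma + n\, n, \]

so scalar multiplication with $\tilde g^\beta$ from the left gives

\[ \underbrace{\tilde g^\beta \cdot \delta}_{\tilde g^\beta} = \frac{\partial \tilde u^\alpha}{\partial u^\gamma}\, \underbrace{\tilde g^\beta \cdot \tilde g_\alpha}_{\delta^\beta_\alpha}\, g^\gamma + \underbrace{\tilde g^\beta \cdot n}_{0}\, n, \]

\[ \tilde g^\beta = \frac{\partial \tilde u^\beta}{\partial u^\gamma}\, g^\gamma . \]

This is the desired transformation law for the contravariant basis vectors, analogous
to (4.18).
To derive the transformation law for the covariant coordinates $a_\alpha$ of a surface
vector a, we compute the scalar product of the relation

\[ \tilde a_\beta\, \tilde g^\beta = a_\beta\, g^\beta \]

with $\tilde g_\alpha$ and substitute on the right side the transformation law (4.105), so we have

\[ \tilde a_\beta\, \underbrace{\tilde g^\beta \cdot \tilde g_\alpha}_{\delta^\beta_\alpha} = a_\beta\, g^\beta \cdot \tilde g_\alpha = a_\beta\, \underbrace{g^\beta \cdot g_\gamma}_{\delta^\beta_\gamma}\, \frac{\partial u^\gamma}{\partial \tilde u^\alpha} . \]

This gives the transformation law for the covariant coordinates of a surface vector,
analogous to (4.19),

\[ \tilde a_\alpha = \frac{\partial u^\beta}{\partial \tilde u^\alpha}\, a_\beta . \]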


Problem 4.20.
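Writing the covariant metric coefficients as scalar products of the covariant basis
vectors, we have in two different surface coordinate systems

\[ \tilde g_{\alpha\beta} = \tilde g_\alpha \cdot \tilde g_\beta \quad \text{and} \quad g_{\alpha\beta} = g_\alpha \cdot g_\beta . \]

Substituting the transformation law (4.105) into the left equation gives

\[
\tilde g_{\alpha\beta} = \left( \frac{\partial u^\gamma}{\partial \tilde u^\alpha}\, g_\gamma \right) \cdot \left( \frac{\partial u^\delta}{\partial \tilde u^\beta}\, g_\delta \right)
= \frac{\partial u^\gamma}{\partial \tilde u^\alpha}\, \frac{\partial u^\delta}{\partial \tilde u^\beta}\, \underbrace{g_\gamma \cdot g_\delta}_{g_{\gamma\delta}}
= \frac{\partial u^\gamma}{\partial \tilde u^\alpha}\, \frac{\partial u^\delta}{\partial \tilde u^\beta}\, g_{\gamma\delta} .
\]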

Comparing this equation with (4.19) shows that it is the transformation law for the
entirely covariant coordinates of a second-order tensor, here in particular of a surface
tensor, because the summation runs only from 1 to 2.

Problem 4.21.
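The scalar product of $a = a^\alpha g_\alpha = a_\alpha g^\alpha$, once with $g^\beta$ and once with $g_\beta$, together with
(4.106), (4.110) and (4.114), yields

\[
\begin{aligned}
a^\alpha\, g_\alpha \cdot g^\beta &= a_\alpha\, g^\alpha \cdot g^\beta, & a^\alpha\, \delta^\beta_\alpha &= a_\alpha\, g^{\alpha\beta}, & a^\beta &= g^{\beta\alpha}\, a_\alpha, \\
a^\alpha\, g_\alpha \cdot g_\beta &= a_\alpha\, g^\alpha \cdot g_\beta, & a^\alpha\, g_{\alpha\beta} &= a_\alpha\, \delta^\alpha_\beta, & a_\beta &= g_{\beta\alpha}\, a^\alpha .
\end{aligned}
\]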

Problem 4.22.
If we take the Cartesian coordinates as surface parameters $u^1 = x$, $u^2 = y$, the
parametric representation of the surface z = x y is given as

\[ x = x\, e_1 + y\, e_2 + x y\, e_3 . \]

The covariant basis vectors are, using (4.100),

\[ g_1 = \frac{\partial x_i}{\partial x}\, e_i = e_1 + y\, e_3, \qquad g_2 = \frac{\partial x_i}{\partial y}\, e_i = e_2 + x\, e_3 . \]

For the next computations we need the metric coefficients. We compute the covariant
metric coefficients $g_{\alpha\beta}$, using (4.110), as follows:

\[ g_{11} = g_1 \cdot g_1 = 1 + y^2, \qquad g_{22} = g_2 \cdot g_2 = 1 + x^2, \qquad g_{12} = g_{21} = g_1 \cdot g_2 = x y . \]

The determinant is

\[ g = \det g_{\alpha\beta} = g_{11}\, g_{22} - g_{12}\, g_{21} = 1 + x^2 + y^2 . \]

Using (4.116), we obtain the contravariant metric coefficients $g^{\alpha\beta}$ as

\[ g^{11} = \frac{g_{22}}{g} = \frac{1 + x^2}{g}, \qquad g^{22} = \frac{g_{11}}{g} = \frac{1 + y^2}{g}, \qquad g^{12} = g^{21} = -\frac{g_{12}}{g} = \frac{-x y}{g} . \]


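To compute the covariant coordinates of the curvature tensor $b_{\alpha\beta} = n \cdot g_{\alpha,\beta}$, we need,
according to (4.127), the normal vector n and the partial derivatives $g_{\alpha,\beta}$ of the covariant
basis vectors.
We compute the vector product $g_1 \times g_2$,

\[ g_1 \times g_2 = -y\, e_1 - x\, e_2 + e_3, \]

and obtain, from (4.101), with $|g_1 \times g_2| = \sqrt{y^2 + x^2 + 1} = \sqrt{g}$, the normal vector, i. e.

\[ n = \frac{g_1 \times g_2}{|g_1 \times g_2|} = \frac{1}{\sqrt{g}}\, (-y\, e_1 - x\, e_2 + e_3) . \]

Taking the partial derivatives of the covariant basis vectors gives

\[ g_{1,1} = \frac{\partial g_1}{\partial x} = 0, \qquad g_{2,2} = \frac{\partial g_2}{\partial y} = 0, \qquad g_{1,2} = g_{2,1} = \frac{\partial g_1}{\partial y} = e_3 . \]

Hence we obtain for the covariant coordinates $b_{\alpha\beta}$ of the curvature tensor

\[ b_{11} = n \cdot g_{1,1} = 0, \qquad b_{22} = n \cdot g_{2,2} = 0, \qquad b_{12} = b_{21} = n \cdot g_{1,2} = \frac{1}{\sqrt{g}} . \]

For the mixed coordinates $b^\alpha{}_\beta = g^{\alpha\gamma}\, b_{\gamma\beta}$ we obtain

\[
\begin{aligned}
b^1{}_1 &= g^{1\gamma}\, b_{\gamma 1} = g^{12}\, b_{21} = -\frac{x y}{g \sqrt{g}}, & b^1{}_2 &= g^{1\gamma}\, b_{\gamma 2} = g^{11}\, b_{12} = \frac{1 + x^2}{g \sqrt{g}}, \\
b^2{}_1 &= g^{2\gamma}\, b_{\gamma 1} = g^{22}\, b_{21} = \frac{1 + y^2}{g \sqrt{g}}, & b^2{}_2 &= g^{2\gamma}\, b_{\gamma 2} = g^{21}\, b_{12} = -\frac{x y}{g \sqrt{g}} .
\end{aligned}
\]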

We note that the matrix of the covariant coordinates bαβ of the curvature tensor is
symmetric, but the matrix of the mixed coordinates bα β is not symmetric. We further
note that the metric coefficients g12 = g21 are zero only if x = 0 or y = 0. Thus, in
general, g 1 ⋅ g 2 ≠ 0, i. e. the x- and y-coordinate curves do not form an orthogonal grid
on the surface z = x y, but only in the x, y-plane.
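
The chain of computations for the surface z = x y can be retraced numerically at a sample point. The following sketch is an illustration only (the function name `surface_tensors` and the sample point are assumptions): it builds the basis vectors, the metric, the unit normal, and the curvature tensor directly from their definitions.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def surface_tensors(x, y):
    """Metric determinant and curvature tensor of z = x*y at the point (x, y)."""
    g1 = (1.0, 0.0, y)  # covariant basis vector g_1 = e_1 + y e_3
    g2 = (0.0, 1.0, x)  # covariant basis vector g_2 = e_2 + x e_3
    g_cov = [[dot(g1, g1), dot(g1, g2)], [dot(g2, g1), dot(g2, g2)]]
    g = g_cov[0][0] * g_cov[1][1] - g_cov[0][1] * g_cov[1][0]
    g_con = [[g_cov[1][1] / g, -g_cov[0][1] / g], [-g_cov[1][0] / g, g_cov[0][0] / g]]
    raw_normal = cross(g1, g2)
    norm = math.sqrt(dot(raw_normal, raw_normal))
    n = tuple(c / norm for c in raw_normal)
    zero, e3 = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    derivatives = [[zero, e3], [e3, zero]]  # partial derivatives g_{alpha,beta}
    b_cov = [[dot(n, derivatives[i][j]) for j in range(2)] for i in range(2)]
    b_mixed = [[sum(g_con[i][k] * b_cov[k][j] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return g, b_cov, b_mixed

x, y = 1.0, 2.0
g, b_cov, b_mixed = surface_tensors(x, y)
print(g, b_cov, b_mixed)
```

At this sample point the mixed coordinates b¹₂ and b²₁ differ, which illustrates the remark that the matrix of the mixed coordinates is not symmetric.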

Problem 4.23.
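Using the covariant coordinates of the curvature tensor, we obtain the mean curvature
as $H = \tfrac12\, b^\alpha{}_\alpha = \tfrac12\, g^{\alpha\beta}\, b_{\beta\alpha}$ and the Gaussian curvature as
$K = \det b^\alpha{}_\beta = \det (g^{\alpha\gamma}\, b_{\gamma\beta}) = (\det g^{\alpha\gamma}) (\det b_{\gamma\beta}) = \tfrac1g \det b_{\gamma\delta}$
(compare (1.13) and (4.27)). Substituting (4.113), (4.116), and (4.131) gives

\[
\begin{aligned}
H &= \frac12 \left( g^{11}\, b_{11} + 2\, g^{12}\, b_{12} + g^{22}\, b_{22} \right) = \frac{G L - 2 F M + E N}{2\, (E G - F^2)}, \\
K &= \frac{b_{11}\, b_{22} - b_{12}\, b_{12}}{g_{11}\, g_{22} - g_{12}\, g_{12}} = \frac{L N - M^2}{E G - F^2} .
\end{aligned}
\]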


Problem 4.24.
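Using the results of Problem 4.22, we obtain for the mean curvature H and the Gaussian
curvature K

\[ H = \frac12 \left( b^1{}_1 + b^2{}_2 \right) = \frac{-x y}{g \sqrt{g}}, \qquad K = b^1{}_1\, b^2{}_2 - b^1{}_2\, b^2{}_1 = \frac{-(1 + x^2 + y^2)}{g^3} = -\frac{1}{g^2} . \]

Solving the characteristic equation

\[ k^2 - 2 H k + K = k^2 + \frac{2 x y}{g \sqrt{g}}\, k - \frac{1}{g^2} = 0, \]

we obtain the principal curvatures as

\[ k_{1,2} = \frac{-x y \pm \sqrt{x^2 y^2 + g}}{g \sqrt{g}} = \frac{-x y \pm \sqrt{(1 + x^2)(1 + y^2)}}{g \sqrt{g}} . \]

The principal curvature directions are determined by $(b^\alpha{}_\beta - k\, \delta^\alpha_\beta)\, t^\beta = 0$, and with
$g_{11} = 1 + y^2$, $g_{22} = 1 + x^2$, $g_{12} = x y$ we have

\[
\left[ \frac{1}{g \sqrt{g}} \begin{pmatrix} -g_{12} & g_{22} \\ g_{11} & -g_{12} \end{pmatrix} - \frac{-g_{12} \pm \sqrt{g_{11}\, g_{22}}}{g \sqrt{g}} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right] \begin{pmatrix} t^1 \\ t^2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\]

\[
\begin{pmatrix} \mp\sqrt{g_{11}\, g_{22}} & g_{22} \\ g_{11} & \mp\sqrt{g_{11}\, g_{22}} \end{pmatrix} \begin{pmatrix} t^1 \\ t^2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} .
\]

Solving these equations we obtain the principal curvature directions (in nonnormalized
form) as

\[ \underset{1}{t}{}^\beta = \begin{pmatrix} \sqrt{g_{11}\, g_{22}} \\ g_{11} \end{pmatrix}, \qquad \underset{2}{t}{}^\beta = \begin{pmatrix} -\sqrt{g_{11}\, g_{22}} \\ g_{11} \end{pmatrix} . \]

It is easy to verify by substitution that $g_{\alpha\beta}\, \underset{1}{t}{}^\alpha\, \underset{2}{t}{}^\beta = 0$, i. e. the principal curvature
directions are perpendicular to each other.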
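
These closed-form results can be confirmed at a sample point. The sketch below is illustrative (the function name and the point are assumptions): it evaluates H and K from the mixed coordinates of the curvature tensor, solves the characteristic equation, and forms the metric product of the two nonnormalized principal directions.

```python
import math

def curvature_summary(x, y):
    """Mean and Gaussian curvature, principal curvatures, and direction check for z = x*y."""
    g11, g22, g12 = 1 + y * y, 1 + x * x, x * y
    g = g11 * g22 - g12 * g12
    s = g * math.sqrt(g)
    b = [[-g12 / s, g22 / s], [g11 / s, -g12 / s]]  # mixed coordinates from Problem 4.22
    H = 0.5 * (b[0][0] + b[1][1])
    K = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    root = math.sqrt(H * H - K)  # roots of k^2 - 2 H k + K = 0
    k1, k2 = H + root, H - root
    t1 = (math.sqrt(g11 * g22), g11)
    t2 = (-math.sqrt(g11 * g22), g11)
    metric_product = (g11 * t1[0] * t2[0] + g12 * (t1[0] * t2[1] + t1[1] * t2[0])
                      + g22 * t1[1] * t2[1])
    return g, H, K, k1, k2, metric_product

x, y = 1.0, 2.0
g, H, K, k1, k2, metric_product = curvature_summary(x, y)
print(H, K, k1, k2, metric_product)
```

Since K is negative everywhere, the two principal curvatures always have opposite signs, as expected for a saddle surface.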

Problem 4.25.
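If we set m = μ, i = α, j = β, k = γ for the indices of the three-dimensional Riemann
curvature tensor in Problem 4.18 and write out the third term of the summation over
n, we have

\[
0 = \underbrace{\Gamma^\mu_{\alpha\gamma,\beta} - \Gamma^\mu_{\alpha\beta,\gamma} + \Gamma^\nu_{\alpha\gamma}\, \Gamma^\mu_{\nu\beta} - \Gamma^\nu_{\alpha\beta}\, \Gamma^\mu_{\nu\gamma}}_{R^\mu{}_{\alpha\beta\gamma}} + \Gamma^3_{\alpha\gamma}\, \Gamma^\mu_{3\beta} - \Gamma^3_{\alpha\beta}\, \Gamma^\mu_{3\gamma} .
\]

From this we obtain, using (4.138)$_1$ and (4.139), the integrability condition (4.143):

\[
R^\mu{}_{\alpha\beta\gamma} = \underbrace{\Gamma^3_{\alpha\beta}}_{b_{\alpha\beta}}\, \underbrace{\Gamma^\mu_{3\gamma}}_{-b^\mu{}_\gamma} - \underbrace{\Gamma^3_{\alpha\gamma}}_{b_{\alpha\gamma}}\, \underbrace{\Gamma^\mu_{3\beta}}_{-b^\mu{}_\beta} = b^\mu{}_\beta\, b_{\alpha\gamma} - b^\mu{}_\gamma\, b_{\alpha\beta} .
\]

Setting m = 3, i = α, j = β, k = γ and using (4.138) and (4.139), we further obtain

\[
0 = \underbrace{\Gamma^3_{\alpha\gamma,\beta}}_{b_{\alpha\gamma,\beta}} - \underbrace{\Gamma^3_{\alpha\beta,\gamma}}_{b_{\alpha\beta,\gamma}} + \Gamma^\nu_{\alpha\gamma}\, \underbrace{\Gamma^3_{\nu\beta}}_{b_{\nu\beta}} - \Gamma^\nu_{\alpha\beta}\, \underbrace{\Gamma^3_{\nu\gamma}}_{b_{\nu\gamma}} + \underbrace{\Gamma^3_{\alpha\gamma}}_{b_{\alpha\gamma}}\, \underbrace{\Gamma^3_{3\beta}}_{0} - \underbrace{\Gamma^3_{\alpha\beta}}_{b_{\alpha\beta}}\, \underbrace{\Gamma^3_{3\gamma}}_{0},
\]

\[ b_{\alpha\beta,\gamma} - \Gamma^\nu_{\alpha\gamma}\, b_{\nu\beta} = b_{\alpha\gamma,\beta} - \Gamma^\nu_{\alpha\beta}\, b_{\nu\gamma} ; \]

this is the integrability condition (4.144).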

Problem 4.26.
From the definition (4.146) it follows, for two different surface coordinate systems, with
(4.33) and g̃ 3 = g 3 = n, that

ẽμν = [g̃ μ , g̃ ν , n], eμν = [g μ , g ν , n].

Substituting the transformation law (4.105) into the left equation gives

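$$\tilde{e}_{\mu\nu} = \left[ \frac{\partial u^\alpha}{\partial \tilde{u}^\mu}\, g_\alpha,\ \frac{\partial u^\beta}{\partial \tilde{u}^\nu}\, g_\beta,\ n \right] = \frac{\partial u^\alpha}{\partial \tilde{u}^\mu}\, \frac{\partial u^\beta}{\partial \tilde{u}^\nu}\, [g_\alpha, g_\beta, n] = \frac{\partial u^\alpha}{\partial \tilde{u}^\mu}\, \frac{\partial u^\beta}{\partial \tilde{u}^\nu}\, e_{\alpha\beta}.$$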

This is, similar to Problem 4.20, the transformation law for the entirely covariant co-
ordinates of a surface tensor of second order.

Problem 4.27.
We can compute the Gaussian curvature from the Riemann curvature tensor. Accord-
ing to (4.147), we have K = R1212 /g. Using (4.142), we further obtain

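$$\begin{aligned}
R_{1212} &= g_{1\mu}\, (\Gamma^\mu_{22,1} - \Gamma^\mu_{21,2} + \Gamma^\nu_{22}\, \Gamma^\mu_{\nu 1} - \Gamma^\nu_{21}\, \Gamma^\mu_{\nu 2}) \\
&= g_{1\mu}\, (\Gamma^\mu_{22,1} - \Gamma^\mu_{21,2} + \Gamma^1_{22}\, \Gamma^\mu_{11} + \Gamma^2_{22}\, \Gamma^\mu_{21} - \Gamma^1_{21}\, \Gamma^\mu_{12} - \Gamma^2_{21}\, \Gamma^\mu_{22}).
\end{aligned}$$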

A spherical surface is given in spherical coordinates by r = const, and correspondingly,
a cylindrical surface is given in cylindrical coordinates by R = const. Thus, for
further calculations we conveniently use the formulas from Appendix B. Both coordinate
systems are orthogonal, i. e. when summing over μ only $g_{11}$ is nonzero. We note
that in the appendix the radial direction is in both cases the 1-direction; however, in
surface theory the radial direction is the 3-direction, i. e. we need to rename the indices
1 → 2 and 2 → 3. This gives

$$\begin{aligned}
K g &= g_{11}\, (\Gamma^1_{22,1} - \Gamma^1_{21,2} + \Gamma^1_{22}\, \Gamma^1_{11} + \Gamma^2_{22}\, \Gamma^1_{21} - \Gamma^1_{21}\, \Gamma^1_{12} - \Gamma^2_{21}\, \Gamma^1_{22}) \\
&\mathrel{\hat{=}} g_{22}\, (\Gamma^2_{33,2} - \Gamma^2_{32,3} + \Gamma^2_{33}\, \Gamma^2_{22} + \Gamma^3_{33}\, \Gamma^2_{32} - \Gamma^2_{32}\, \Gamma^2_{23} - \Gamma^3_{32}\, \Gamma^2_{33}).
\end{aligned}$$

In spherical coordinates, using (B.33), (B.35), and (B.36), we have $g_{22} = r^2$, $g = r^4 \sin^2\vartheta$,
$\Gamma^2_{33} = -\sin\vartheta \cos\vartheta$, $\Gamma^3_{32} = \cot\vartheta$; the other Christoffel symbols are zero. So we
obtain the Gaussian curvature of a spherical surface as

$$\begin{aligned}
K &= \frac{g_{22}}{g}\, (\Gamma^2_{33,2} - \Gamma^3_{32}\, \Gamma^2_{33}) \\
&= \frac{r^2}{r^4 \sin^2\vartheta} \Big( \underbrace{\frac{\partial}{\partial\vartheta} (-\sin\vartheta \cos\vartheta)}_{-\cos^2\vartheta + \sin^2\vartheta} + \underbrace{\sin\vartheta \cos\vartheta \cot\vartheta}_{\cos^2\vartheta} \Big) = \frac{1}{r^2}.
\end{aligned}$$

In cylindrical coordinates, according to (B.13), all Christoffel symbols that appear here
are zero, so the Gaussian curvature of a cylinder surface is K = 0.
Alternatively, we can find the Gaussian curvature from its definition (4.133). We
start with K = det bα β . Using (4.139), we have

$$b^\alpha{}_\beta = g^{\alpha\mu}\, b_{\mu\beta} = g^{\alpha\mu}\, \Gamma^3_{\mu\beta},$$

and thus, using (1.13) and (4.27), we obtain

$$K = \det b^\alpha{}_\beta = (\det g^{\alpha\mu})\, (\det b_{\mu\beta}) = \frac{1}{g}\, \det \Gamma^3_{\mu\beta}.$$

In order to further obtain expressions in spherical and cylindrical coordinates, we rename
the indices 1 → 2, 2 → 3, 3 → 1 and obtain

$$K = \frac{1}{g_{22}\, g_{33}} \begin{vmatrix} \Gamma^1_{22} & \Gamma^1_{23} \\ \Gamma^1_{32} & \Gamma^1_{33} \end{vmatrix}.$$

In spherical coordinates, we have, using (B.33) and (B.36),

$$K = \frac{1}{r^4 \sin^2\vartheta} \begin{vmatrix} -r & 0 \\ 0 & -r \sin^2\vartheta \end{vmatrix} = \frac{1}{r^2};$$

in cylindrical coordinates, we have, using (B.10) and (B.13),

$$K = \frac{1}{R^2} \begin{vmatrix} -R & 0 \\ 0 & 0 \end{vmatrix} = 0.$$

On a cylinder surface we have K = 0. This means we can unfold the surface into a
plane, whereas such an unfolding is not possible with a spherical surface, because
K ≠ 0.
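Both ways of computing K in Problem 4.27 can be reproduced numerically. The sketch below is only an illustration under stated assumptions: the Christoffel symbols and metric coefficients are typed in from (B.10), (B.13), (B.33), (B.35), and (B.36), the derivative with respect to ϑ is replaced by a central difference, and the function names are chosen here.

```python
import math

def gauss_curvature_sphere(r, theta, h=1e-5):
    """K = (g22/g) (Gamma^2_33,2 - Gamma^3_32 Gamma^2_33), with renamed indices."""
    gamma2_33 = lambda t: -math.sin(t) * math.cos(t)   # (B.36)
    gamma3_32 = math.cos(theta) / math.sin(theta)      # cot(theta), (B.36)
    g22 = r ** 2                                       # (B.33)
    g = r ** 4 * math.sin(theta) ** 2                  # (B.35)
    # central difference for the derivative of Gamma^2_33 with respect to theta
    d_gamma = (gamma2_33(theta + h) - gamma2_33(theta - h)) / (2 * h)
    return g22 / g * (d_gamma - gamma3_32 * gamma2_33(theta))

def gauss_curvature_det(gamma1, g22, g33):
    """K = det(Gamma^1_{mu beta}) / (g22 g33) for a 2x2 array of Christoffel symbols."""
    det = gamma1[0][0] * gamma1[1][1] - gamma1[0][1] * gamma1[1][0]
    return det / (g22 * g33)

r, theta, R = 2.0, 0.9, 3.0
sphere = [[-r, 0.0], [0.0, -r * math.sin(theta) ** 2]]   # Gamma^1_22, Gamma^1_33
cylinder = [[-R, 0.0], [0.0, 0.0]]                        # Gamma^1_22 only
print(gauss_curvature_sphere(r, theta), 1 / r ** 2)
print(gauss_curvature_det(sphere, r ** 2, (r * math.sin(theta)) ** 2))
print(gauss_curvature_det(cylinder, R ** 2, 1.0))
```

For the sphere both functions should agree with 1/r² up to the discretization error of the difference quotient; for the cylinder the determinant vanishes identically.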

Problem 5.1.
If T is a polar tensor, then its coordinates transform according to (2.17), so

Tij = αim αjn T̃mn .

Setting the indices equal, i = j, and using the orthogonality relation (2.6), we obtain
for the trace of T

tr T = Tii = αim αin T̃mn = δmn T̃mn = T̃mm ,


i. e. the trace is independent of the underlying coordinate system and thus it is an
invariant of the tensor T.
Similarly, we obtain for the trace of T 2

tr T 2 = Tik Tki = αim αkn T̃mn αkp αiq T̃pq = αim αiq αkn αkp T̃mn T̃pq
= δmq δnp T̃mn T̃pq = T̃qp T̃pq .

If T is polar, then tr T and tr T 2 are also polar.

If T is axial, then similar computations, using the transformation law (2.18), give

$$T_{ii} = (\det \underset{\sim}{\alpha})\, \tilde{T}_{mm}, \qquad T_{ij}\, T_{ji} = \tilde{T}_{qp}\, \tilde{T}_{pq}.$$

Here tr T 2 is a polar scalar, but tr T is an axial scalar, and hence, according to our
convention from Chapter 5, tr T does not count as an invariant.
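The statements of Problem 5.1 can be illustrated numerically. This is a minimal sketch with assumed sample data: the tensor coordinates, the angle, and the helper names are chosen here for illustration, and the transformation is written in the form $T_{ij} = \alpha_{im} \alpha_{jn} \tilde{T}_{mn}$ used in the solution, with the additional factor det α for an axial tensor.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def transform(alpha, t_new, axial=False):
    """T_ij = alpha_im alpha_jn T~_mn, times det(alpha) for an axial tensor."""
    factor = det3(alpha) if axial else 1.0
    return [[factor * sum(alpha[i][m] * alpha[j][n] * t_new[m][n]
                          for m in range(3) for n in range(3))
             for j in range(3)] for i in range(3)]

def trace(t):
    return sum(t[i][i] for i in range(3))

def trace_of_square(t):
    return sum(t[i][k] * t[k][i] for i in range(3) for k in range(3))

c, s = math.cos(0.4), math.sin(0.4)
reflection = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, -1.0]]   # orthogonal, det = -1
T_new = [[1.0, 2.0, 0.5], [-1.0, 3.0, 4.0], [0.0, 2.5, -2.0]]

polar = transform(reflection, T_new)
axial = transform(reflection, T_new, axial=True)
print(trace(polar), trace(axial), trace(T_new))
print(trace_of_square(polar), trace_of_square(axial), trace_of_square(T_new))
```

Under the improper orthogonal transformation the trace of the polar tensor is unchanged, the trace of the axial tensor changes its sign, and the trace of the square is unchanged in both cases.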

Problem 5.2.
All odd powers of antimetric tensors are antimetric, i. e. the trace of these powers is
zero; but the even powers are symmetric. Furthermore, according to (5.8), the trace of a
scalar product of a symmetric and an antimetric tensor is zero, so we see, by comparing
with (5.15), that we have the following nonzero invariants: tr A2 , tr B2 , tr(A⋅B), tr(A2 ⋅B2 ),
tr(A2 ⋅ B ⋅ A ⋅ B2 ).
The invariant of degree six can be shown to be reducible, in the same way as we
showed this for two symmetric tensors in Section 5.3.3, No. 5.
Only the investigation of the invariant of degree four requires more effort. If we
write the antimetric tensor A, using (3.8)2 , in terms of its corresponding vector a, we
obtain, using the Grassmann identity, for the coordinates of the square A2

Aij Ajk = εijl εjkm al am = (δlk δim − δlm δik ) al am = ai ak − am am δik .

The same holds for the coordinates of the square B2 , so we compute the coordinates
of A2 ⋅ B2 . We have

Aij Ajk Bkp Bpq = (ai ak − am am δik ) (bk bq − bl bl δkq )

= ak bk ai bq − am am bi bq − bl bl ai aq + am am bl bl δiq .

Then the trace is

tr(A2 ⋅ B2 ) = Aij Ajk Bkp Bpi = ak bk ai bi + am am bi bi . (a)


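Using (3.8)1 , we can reintroduce the coordinates of the antimetric tensors on the right
side. Using the Grassmann identity and taking antimetry into account, we get initially

$$\begin{aligned}
a_k b_k &= \frac{1}{4}\, \varepsilon_{klm}\, \varepsilon_{kpq}\, A_{lm} B_{pq} = \frac{1}{4}\, (\delta_{lp} \delta_{mq} - \delta_{lq} \delta_{mp})\, A_{lm} B_{pq} \\
&= \frac{1}{4}\, (A_{pq} B_{pq} - A_{qp} B_{pq}) = -\frac{1}{2}\, A_{qp} B_{pq} = -\frac{1}{2}\, \operatorname{tr}(A \cdot B).
\end{aligned}$$

Similarly, we obtain $a_m a_m = -\frac{1}{2} \operatorname{tr} A^2$ and $b_i b_i = -\frac{1}{2} \operatorname{tr} B^2$; then, from (a) it follows that
$\operatorname{tr}(A^2 \cdot B^2)$ can be written in terms of the lower-order invariants and thus is reducible,
i. e.

$$\operatorname{tr}(A^2 \cdot B^2) = \frac{1}{4}\, \operatorname{tr}^2 (A \cdot B) + \frac{1}{4}\, \operatorname{tr} A^2\, \operatorname{tr} B^2.$$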
4 4

Hence the integrity basis for two antimetric tensors consists of only three irreducible invariants:
– the only irreducible invariants, according to Section 5.3.3, No. 4, of the tensors A
and B themselves: I1 = tr A2 , I2 = tr B2 ;
– the only irreducible simultaneous invariant of A and B: I3 = tr(A ⋅ B).

According to Section 3.3, we can uniquely associate a vector to an antimetric tensor,

and two vectors have, according to Section 5.3.1, three irreducible invariants, i. e. the
same number of invariants as two antimetric tensors.
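The reduction of tr(A² ⋅ B²) found in Problem 5.2 can be checked with integer arithmetic. The sketch below assumes the association $A_{ij} = \varepsilon_{ijk} a_k$ between an antimetric tensor and its vector (the identity does not depend on the sign convention); the sample vectors and helper names are illustrative choices.

```python
def eps(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def antimetric(a):
    """A_ij = eps_ijk a_k (assumed form of the association between A and a)."""
    return [[sum(eps(i, j, k) * a[k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matmul(p, q):
    """Scalar product of two second-order tensors, (P.Q)_ik = P_ij Q_jk."""
    return [[sum(p[i][j] * q[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]

def trace(p):
    return sum(p[i][i] for i in range(3))

A = antimetric([1, 2, 3])
B = antimetric([-1, 0, 2])

lhs = trace(matmul(matmul(A, A), matmul(B, B)))
rhs = (trace(matmul(A, B)) ** 2 + trace(matmul(A, A)) * trace(matmul(B, B))) / 4
print(lhs, rhs)
```

Both numbers coincide, so for this sample the invariant of degree four is indeed expressed by tr(A ⋅ B), tr A², and tr B².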

Problem 5.3.
We can compute the following invariants from the coordinates of a symmetric tensor
S and a vector u:
– the basic invariants of the symmetric tensor S:

I1 = tr S, I2 = tr S2 , I3 = tr S3 ;

– the square of the vector u:

I4 = u ⋅ u;

– the simultaneous invariants of S and u:

I5 = u ⋅ S ⋅ u, I6 = u ⋅ S2 ⋅ u, I7 = [u, S ⋅ u, S2 ⋅ u].

Simultaneous invariants which contain S3 can be reduced to I1 to I7 , using the Cayley–Hamilton
theorem (3.94). Scalar quantities such as $\varepsilon_{ijk}\, u_i\, S_{jk}$ and $\varepsilon_{ijk}\, u_i\, S^{(2)}_{jk}$ are zero, because
of the symmetry of S and due to (2.40), so they do not belong to the simultaneous invariants.
If S is polar and u is axial, then I1 to I7 are polar, i. e. the integrity basis then consists
of seven irreducible invariants. If, on the other hand, both S and u are polar, only I1 to
I6 are polar, but $I_7 = \varepsilon_{ijk}\, u_i\, S_{jl}\, u_l\, S^{(2)}_{km}\, u_m$ is axial because of the ε-tensor, i. e. in this case
the integrity basis has only six irreducible invariants.
We can, according to Section 3.3, associate an axial vector to a polar antimetric
tensor and, according to Section 5.3.3, No. 6, a polar antimetric tensor and a polar
symmetric tensor have seven irreducible invariants, i. e. the same number as a polar
symmetric tensor and an axial vector.

Problem 5.4.
We obtain from u = f (T), by introducing an auxiliary vector, u ⋅ h = f (T, h). Simultaneous
invariants of T and h which are linear in h can only be computed using the ε-tensor.
If T is symmetric, then, according to (2.40), we have εijk Tjk hi = 0; the same result
is obtained for the tensor powers T 2 , T 3 , which are also symmetric. So a representation
u = f (T), in which a vector u depends on a symmetric tensor T, does not exist.
If T is antimetric, then T 2 is symmetric and T 3 is again antimetric; but using the
Cayley–Hamilton theorem (3.94), T 3 can be written in terms of T. So with εijk Tjk hi only
one irreducible invariant exists which is linear in h. If T is polar, then ε ⋅⋅ T is axial
because of the ε-tensor. Thus, a representation u = f (T) of a vector u, which depends
on an antimetric tensor T, only exists if the vector u is also axial, and it is

u = k(tr T 2 ) ε ⋅⋅ T,

or, in terms of the axial vector t associated with T,

u = k ∗ (t ⋅ t) t.

The coefficients k and k ∗ are polar scalar-valued functions of the polar invariants
tr T 2 and t ⋅ t, respectively.

Problem 5.5.
We get from T = f (v), after introducing an auxiliary tensor, T ⋅ ⋅ H = f (v, H). So,
three invariants exist which are linear in H:

Hii = δ ⋅⋅ H, vi vj Hij = v v ⋅⋅ H, εijk vk Hij = (ε ⋅ v) ⋅⋅ H.

If v is polar, then v v is also polar, but ε ⋅ v is axial because of the ε-tensor. Since we
are looking for a representation for a polar tensor T, we have only the two generators
δ and v v, and the representation is

T = k1 (v ⋅ v) δ + k2 (v ⋅ v) v v.

If v is axial, then both v v and ε ⋅ v are polar, and the representation is

T = k1 (v ⋅ v) δ + k2 (v ⋅ v) v v + k3 (v ⋅ v) ε ⋅ v.
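That the representation for a polar vector v is isotropic can be tested directly: the tensor function must satisfy f(Q ⋅ v) = Q ⋅ f(v) ⋅ Qᵀ for every orthogonal Q. The sketch below assumes two arbitrary coefficient functions k1 and k2, which are purely hypothetical, and checks this condition for one improper orthogonal matrix; all names are chosen for illustration.

```python
import math

def k1(s):
    return 1.0 + s        # hypothetical coefficient function k1(v.v)

def k2(s):
    return 0.5 * s        # hypothetical coefficient function k2(v.v)

def f(v):
    """T = k1(v.v) delta + k2(v.v) v v for a polar vector v."""
    s = sum(x * x for x in v)
    return [[k1(s) * (i == j) + k2(s) * v[i] * v[j] for j in range(3)]
            for i in range(3)]

def apply(q, v):
    """Transformed vector Q.v."""
    return [sum(q[i][j] * v[j] for j in range(3)) for i in range(3)]

def conjugate(q, t):
    """Transformed tensor Q.T.Q^T."""
    return [[sum(q[i][m] * q[j][n] * t[m][n] for m in range(3) for n in range(3))
             for j in range(3)] for i in range(3)]

def max_difference(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(3) for j in range(3))

c, s = math.cos(1.1), math.sin(1.1)
Q = [[-1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]   # improper orthogonal
v = [0.3, -1.2, 2.0]
print(max_difference(f(apply(Q, v)), conjugate(Q, f(v))))
```

The printed difference should vanish up to rounding errors. A term k3 ε ⋅ v would violate this condition for a polar v under an improper Q, which is why it is excluded in that case.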


B Cylindrical Coordinates and Spherical Coordinates
The most commonly used curvilinear coordinate systems in practice are the cylindri-
cal and spherical coordinates. We summarize here the most important formulas for
these two coordinate systems. The radial cylindrical coordinate is denoted by R and
the radial spherical coordinate is denoted by r.

B.1 Cylindrical Coordinates

B.1.1 Transformation Equations for Point Coordinates

The transformation equations are

x = R cos φ, y = R sin φ, (B.1)

$$\begin{aligned}
R &= \sqrt{x^2 + y^2}, & 0 &\leqq R < \infty, \\
\varphi &= \arctan\frac{y}{x}, & 0 &\leqq \varphi < 2\pi.
\end{aligned} \tag{B.2}$$



B.1.2 Bases

The Cartesian coordinates of the covariant basis, the contravariant basis, and the
physical basis are

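$$\begin{aligned}
g_1 &= g^1 = g_{<1>} = \{\cos\varphi,\ \sin\varphi,\ 0\}; \\
g_2 &= \{-R \sin\varphi,\ R \cos\varphi,\ 0\}, \\
g^2 &= \left\{-\frac{1}{R} \sin\varphi,\ \frac{1}{R} \cos\varphi,\ 0\right\}, \\
g_{<2>} &= \{-\sin\varphi,\ \cos\varphi,\ 0\}; \\
g_3 &= g^3 = g_{<3>} = \{0,\ 0,\ 1\}.
\end{aligned} \tag{B.3}$$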

B.1.3 Transformation Equations for Tensor Coordinates

In the following formulas, we denote the Cartesian coordinates, e. g. of a vector, by $a_x$,
$a_y$, $a_z$, its covariant cylindrical coordinates by $a_1$, $a_2$, $a_3$, its contravariant cylindrical
coordinates by $a^1$, $a^2$, $a^3$, and its physical cylindrical coordinates by $a_R$, $a_\varphi$, $a_z$:

aR = ax cos φ + ay sin φ,
aφ = ay cos φ − ax sin φ, (B.4)
az = az ;

ax = aR cos φ − aφ sin φ,
ay = aφ cos φ + aR sin φ, (B.5)
az = az ;

$$\begin{aligned}
&a_1 = a^1 = a_R, \\
&a_2 = R\, a_\varphi, \qquad a^2 = \frac{1}{R}\, a_\varphi, \\
&a_3 = a^3 = a_z;
\end{aligned} \tag{B.6}$$

aRR = axx cos2 φ + (axy + ayx ) cos φ sin φ + ayy sin2 φ,

aRφ = axy cos2 φ − (axx − ayy ) cos φ sin φ − ayx sin2 φ,
aRz = axz cos φ + ayz sin φ,
aφR = ayx cos2 φ − (axx − ayy ) cos φ sin φ − axy sin2 φ,
aφφ = ayy cos2 φ − (axy + ayx ) cos φ sin φ + axx sin2 φ, (B.7)
aφz = ayz cos φ − axz sin φ,
azR = azx cos φ + azy sin φ,
azφ = azy cos φ − azx sin φ,
azz = azz ;


axx = aRR cos2 φ − (aRφ + aφR ) cos φ sin φ + aφφ sin2 φ,

axy = aRφ cos2 φ + (aRR − aφφ ) cos φ sin φ − aφR sin2 φ,
axz = aRz cos φ − aφz sin φ,
ayx = aφR cos2 φ + (aRR − aφφ ) cos φ sin φ − aRφ sin2 φ,
ayy = aφφ cos2 φ + (aRφ + aφR ) cos φ sin φ + aRR sin2 φ, (B.8)
ayz = aφz cos φ + aRz sin φ,
azx = azR cos φ − azφ sin φ,
azy = azφ cos φ + azR sin φ,
azz = azz ;

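$$\begin{aligned}
&a_{11} = a^{11} = a_1{}^1 = a^1{}_1 = a_{RR}, \\
&a_{12} = a^1{}_2 = R\, a_{R\varphi}, \qquad a^{12} = a_1{}^2 = \frac{1}{R}\, a_{R\varphi}, \\
&a_{13} = a^{13} = a_1{}^3 = a^1{}_3 = a_{Rz}, \\
&a_{21} = a_2{}^1 = R\, a_{\varphi R}, \qquad a^{21} = a^2{}_1 = \frac{1}{R}\, a_{\varphi R}, \\
&a_{22} = R^2\, a_{\varphi\varphi}, \qquad a^{22} = \frac{1}{R^2}\, a_{\varphi\varphi}, \qquad a_2{}^2 = a^2{}_2 = a_{\varphi\varphi}, \\
&a_{23} = a_2{}^3 = R\, a_{\varphi z}, \qquad a^{23} = a^2{}_3 = \frac{1}{R}\, a_{\varphi z}, \\
&a_{31} = a^{31} = a_3{}^1 = a^3{}_1 = a_{zR}, \\
&a_{32} = a^3{}_2 = R\, a_{z\varphi}, \qquad a^{32} = a_3{}^2 = \frac{1}{R}\, a_{z\varphi}, \\
&a_{33} = a^{33} = a_3{}^3 = a^3{}_3 = a_{zz}.
\end{aligned} \tag{B.9}$$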
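The relations (B.4) to (B.6) between Cartesian, physical, covariant, and contravariant coordinates of a vector can be sketched in a few lines. The function names and the sample values are assumptions made for illustration; the scale factors 1, R, 1 are the square roots of the metric coefficients of cylindrical coordinates.

```python
import math

def cylindrical_components(a_cart, R, phi):
    """Return physical, covariant, and contravariant cylindrical coordinates."""
    ax, ay, az = a_cart
    physical = (ax * math.cos(phi) + ay * math.sin(phi),     # a_R,   (B.4)
                ay * math.cos(phi) - ax * math.sin(phi),     # a_phi, (B.4)
                az)                                          # a_z,   (B.4)
    scale = (1.0, R, 1.0)          # sqrt(g_11), sqrt(g_22), sqrt(g_33)
    covariant = tuple(h * p for h, p in zip(scale, physical))       # (B.6)
    contravariant = tuple(p / h for h, p in zip(scale, physical))   # (B.6)
    return physical, covariant, contravariant

def cartesian_from_physical(physical, phi):
    """Inverse transformation (B.5)."""
    aR, aphi, az = physical
    return (aR * math.cos(phi) - aphi * math.sin(phi),
            aphi * math.cos(phi) + aR * math.sin(phi),
            az)

a = (1.0, -2.0, 0.5)
physical, covariant, contravariant = cylindrical_components(a, R=3.0, phi=0.6)
print(cartesian_from_physical(physical, 0.6))                  # recovers a
print(sum(lo * up for lo, up in zip(covariant, contravariant)),
      sum(x * x for x in a))                                   # a_i a^i = a . a
```

The round trip reproduces the Cartesian coordinates, and the contraction of covariant with contravariant coordinates gives the square of the vector, independent of R and φ.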

B.1.4 δ-Tensor and ε-Tensor

The only nonzero coordinates are

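$$g_{11} = g^{11} = 1, \qquad g_{22} = R^2, \qquad g^{22} = \frac{1}{R^2}, \qquad g_{33} = g^{33} = 1, \tag{B.10}$$

$$e_{123} = e_{12}{}^3 = e^1{}_{23} = e^1{}_2{}^3 = R, \qquad e^{123} = e_1{}^{23} = e^{12}{}_3 = e_1{}^2{}_3 = \frac{1}{R}, \tag{B.11}$$

and in addition the ε-tensor also contains the corresponding permutations. Also

$$g := \det g_{ij} = R^2. \tag{B.12}$$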


B.1.5 Christoffel Symbols

The only nonzero Christoffel symbols are

$$\Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{R}, \qquad \Gamma^1_{22} = -R. \tag{B.13}$$

B.1.6 Differential Operators

For each operator the physical coordinates are given:

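$$\operatorname{grad} a = \left\{ \frac{\partial a}{\partial R},\ \frac{1}{R} \frac{\partial a}{\partial\varphi},\ \frac{\partial a}{\partial z} \right\}; \tag{B.14}$$

$$\operatorname{div} a = \frac{\partial a_R}{\partial R} + \frac{a_R}{R} + \frac{1}{R} \frac{\partial a_\varphi}{\partial\varphi} + \frac{\partial a_z}{\partial z}; \tag{B.15}$$

$$\operatorname{curl} a = \left\{ \frac{1}{R} \frac{\partial a_z}{\partial\varphi} - \frac{\partial a_\varphi}{\partial z},\ \frac{\partial a_R}{\partial z} - \frac{\partial a_z}{\partial R},\ \frac{\partial a_\varphi}{\partial R} - \frac{1}{R} \frac{\partial a_R}{\partial\varphi} + \frac{a_\varphi}{R} \right\}; \tag{B.16}$$

$$\begin{aligned}
(\operatorname{grad} a)_{RR} &= \frac{\partial a_R}{\partial R}, &
(\operatorname{grad} a)_{R\varphi} &= \frac{1}{R} \frac{\partial a_R}{\partial\varphi} - \frac{a_\varphi}{R}, &
(\operatorname{grad} a)_{Rz} &= \frac{\partial a_R}{\partial z}, \\
(\operatorname{grad} a)_{\varphi R} &= \frac{\partial a_\varphi}{\partial R}, &
(\operatorname{grad} a)_{\varphi\varphi} &= \frac{1}{R} \frac{\partial a_\varphi}{\partial\varphi} + \frac{a_R}{R}, &
(\operatorname{grad} a)_{\varphi z} &= \frac{\partial a_\varphi}{\partial z}, \\
(\operatorname{grad} a)_{zR} &= \frac{\partial a_z}{\partial R}, &
(\operatorname{grad} a)_{z\varphi} &= \frac{1}{R} \frac{\partial a_z}{\partial\varphi}, &
(\operatorname{grad} a)_{zz} &= \frac{\partial a_z}{\partial z};
\end{aligned} \tag{B.17}$$

$$\begin{aligned}
[(\operatorname{grad} a) \cdot b]_R &= \frac{\partial a_R}{\partial R}\, b_R + \frac{\partial a_R}{\partial\varphi}\, \frac{b_\varphi}{R} + \frac{\partial a_R}{\partial z}\, b_z - \frac{a_\varphi b_\varphi}{R}, \\
[(\operatorname{grad} a) \cdot b]_\varphi &= \frac{\partial a_\varphi}{\partial R}\, b_R + \frac{\partial a_\varphi}{\partial\varphi}\, \frac{b_\varphi}{R} + \frac{\partial a_\varphi}{\partial z}\, b_z + \frac{a_R b_\varphi}{R}, \\
[(\operatorname{grad} a) \cdot b]_z &= \frac{\partial a_z}{\partial R}\, b_R + \frac{\partial a_z}{\partial\varphi}\, \frac{b_\varphi}{R} + \frac{\partial a_z}{\partial z}\, b_z;
\end{aligned} \tag{B.18}$$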


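$$\begin{aligned}
(\operatorname{div} a)_R &= \frac{\partial a_{RR}}{\partial R} + \frac{1}{R} \frac{\partial a_{R\varphi}}{\partial\varphi} + \frac{\partial a_{Rz}}{\partial z} + \frac{a_{RR} - a_{\varphi\varphi}}{R}, \\
(\operatorname{div} a)_\varphi &= \frac{\partial a_{\varphi R}}{\partial R} + \frac{1}{R} \frac{\partial a_{\varphi\varphi}}{\partial\varphi} + \frac{\partial a_{\varphi z}}{\partial z} + \frac{a_{R\varphi} + a_{\varphi R}}{R}, \\
(\operatorname{div} a)_z &= \frac{\partial a_{zR}}{\partial R} + \frac{1}{R} \frac{\partial a_{z\varphi}}{\partial\varphi} + \frac{\partial a_{zz}}{\partial z} + \frac{a_{zR}}{R};
\end{aligned} \tag{B.19}$$

$$\Delta a = \frac{\partial^2 a}{\partial R^2} + \frac{1}{R} \frac{\partial a}{\partial R} + \frac{1}{R^2} \frac{\partial^2 a}{\partial\varphi^2} + \frac{\partial^2 a}{\partial z^2}; \tag{B.20}$$

$$\begin{aligned}
(\Delta a)_R &= \frac{\partial^2 a_R}{\partial R^2} + \frac{1}{R} \frac{\partial a_R}{\partial R} - \frac{a_R}{R^2} + \frac{1}{R^2} \frac{\partial^2 a_R}{\partial\varphi^2} + \frac{\partial^2 a_R}{\partial z^2} - \frac{2}{R^2} \frac{\partial a_\varphi}{\partial\varphi}, \\
(\Delta a)_\varphi &= \frac{\partial^2 a_\varphi}{\partial R^2} + \frac{1}{R} \frac{\partial a_\varphi}{\partial R} - \frac{a_\varphi}{R^2} + \frac{1}{R^2} \frac{\partial^2 a_\varphi}{\partial\varphi^2} + \frac{\partial^2 a_\varphi}{\partial z^2} + \frac{2}{R^2} \frac{\partial a_R}{\partial\varphi}, \\
(\Delta a)_z &= \frac{\partial^2 a_z}{\partial R^2} + \frac{1}{R} \frac{\partial a_z}{\partial R} + \frac{1}{R^2} \frac{\partial^2 a_z}{\partial\varphi^2} + \frac{\partial^2 a_z}{\partial z^2}.
\end{aligned} \tag{B.21}$$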
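The scalar Laplace operator (B.20) can be cross-checked against its Cartesian form with finite differences. The sketch below is an illustration only: the test field x²y + z², whose Cartesian Laplacian is 2y + 2, the step size, and the function names are assumptions chosen here.

```python
import math

def f(R, phi, z):
    """Test field x^2 y + z^2 written in cylindrical coordinates."""
    return R ** 3 * math.cos(phi) ** 2 * math.sin(phi) + z ** 2

def laplace_cylindrical(func, R, phi, z, h=1e-3):
    """Evaluate (B.20) with central difference quotients of step h."""
    f0 = func(R, phi, z)
    d2R = (func(R + h, phi, z) - 2 * f0 + func(R - h, phi, z)) / h ** 2
    dR = (func(R + h, phi, z) - func(R - h, phi, z)) / (2 * h)
    d2phi = (func(R, phi + h, z) - 2 * f0 + func(R, phi - h, z)) / h ** 2
    d2z = (func(R, phi, z + h) - 2 * f0 + func(R, phi, z - h)) / h ** 2
    return d2R + dR / R + d2phi / R ** 2 + d2z

R, phi, z = 1.5, 0.8, 0.3
numeric = laplace_cylindrical(f, R, phi, z)
exact = 2 * R * math.sin(phi) + 2      # 2 y + 2 with y = R sin(phi)
print(numeric, exact)
```

The two values agree up to the discretization error of the difference quotients, which is of the order h².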

B.1.7 Curve-, Area-, and Volume Elements

The elements are

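$$\begin{aligned}
\mathrm{d}u^i &= (\mathrm{d}R,\ \mathrm{d}\varphi,\ \mathrm{d}z), \\
\mathrm{d}A_i &= (R\, \mathrm{d}\varphi\, \mathrm{d}z,\ R\, \mathrm{d}z\, \mathrm{d}R,\ R\, \mathrm{d}R\, \mathrm{d}\varphi), \\
\mathrm{d}V &= R\, \mathrm{d}R\, \mathrm{d}\varphi\, \mathrm{d}z.
\end{aligned} \tag{B.22}$$

In (B.23) the physical coordinates are given:

$$\begin{aligned}
\mathrm{d}x &= (\mathrm{d}R,\ R\, \mathrm{d}\varphi,\ \mathrm{d}z), \\
\mathrm{d}A &= (R\, \mathrm{d}\varphi\, \mathrm{d}z,\ \mathrm{d}z\, \mathrm{d}R,\ R\, \mathrm{d}R\, \mathrm{d}\varphi).
\end{aligned} \tag{B.23}$$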


B.2 Spherical Coordinates

B.2.1 Transformation Equations for Point Coordinates

The transformation equations are

x = r sin ϑ cos φ,
y = r sin ϑ sin φ, (B.24)
z = r cos ϑ,

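$$\begin{aligned}
r &= \sqrt{x^2 + y^2 + z^2}, & 0 &\leqq r < \infty, \\
\vartheta &= \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}}, & 0 &\leqq \vartheta \leqq \pi, \\
\varphi &= \arctan\frac{y}{x}, & 0 &\leqq \varphi < 2\pi.
\end{aligned} \tag{B.25}$$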


B.2.2 Bases

The Cartesian coordinates of the covariant basis, the contravariant basis, and the
physical basis are

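$$\begin{aligned}
g_1 &= g^1 = g_{<1>} = \{\sin\vartheta \cos\varphi,\ \sin\vartheta \sin\varphi,\ \cos\vartheta\}; \\
g_2 &= \{r \cos\vartheta \cos\varphi,\ r \cos\vartheta \sin\varphi,\ -r \sin\vartheta\}, \\
g^2 &= \left\{ \frac{1}{r} \cos\vartheta \cos\varphi,\ \frac{1}{r} \cos\vartheta \sin\varphi,\ -\frac{1}{r} \sin\vartheta \right\}, \\
g_{<2>} &= \{\cos\vartheta \cos\varphi,\ \cos\vartheta \sin\varphi,\ -\sin\vartheta\}; \\
g_3 &= \{-r \sin\vartheta \sin\varphi,\ r \sin\vartheta \cos\varphi,\ 0\}, \\
g^3 &= \left\{ -\frac{\sin\varphi}{r \sin\vartheta},\ \frac{\cos\varphi}{r \sin\vartheta},\ 0 \right\}, \\
g_{<3>} &= \{-\sin\varphi,\ \cos\varphi,\ 0\}.
\end{aligned} \tag{B.26}$$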

B.2.3 Transformation Equations for Tensor Coordinates

We denote the Cartesian coordinates, e. g. of a vector, by $a_x$, $a_y$, $a_z$, its covariant spherical
coordinates by $a_1$, $a_2$, $a_3$, its contravariant spherical coordinates by $a^1$, $a^2$, $a^3$, and
its physical spherical coordinates by $a_r$, $a_\vartheta$, $a_\varphi$:

ar = ax sin ϑ cos φ + ay sin ϑ sin φ + az cos ϑ,

aϑ = ax cos ϑ cos φ + ay cos ϑ sin φ − az sin ϑ, (B.27)
aφ = −ax sin φ + ay cos φ;

ax = ar sin ϑ cos φ + aϑ cos ϑ cos φ − aφ sin φ,

ay = ar sin ϑ sin φ + aϑ cos ϑ sin φ + aφ cos φ, (B.28)
az = ar cos ϑ − aϑ sin ϑ;

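$$\begin{aligned}
&a_1 = a^1 = a_r, \\
&a_2 = r\, a_\vartheta, \qquad a^2 = \frac{1}{r}\, a_\vartheta, \\
&a_3 = r \sin\vartheta\, a_\varphi, \qquad a^3 = \frac{1}{r \sin\vartheta}\, a_\varphi;
\end{aligned} \tag{B.29}$$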


arr = axx sin2 ϑ cos2 φ + axy sin2 ϑ cos φ sin φ

+ axz cos ϑ sin ϑ cos φ + ayx sin2 ϑ cos φ sin φ
+ ayy sin2 ϑ sin2 φ + ayz cos ϑ sin ϑ sin φ
+ azx cos ϑ sin ϑ cos φ + azy cos ϑ sin ϑ sin φ + azz cos2 ϑ,

arϑ = axx cos ϑ sin ϑ cos2 φ + axy cos ϑ sin ϑ cos φ sin φ
− axz sin2 ϑ cos φ + ayx cos ϑ sin ϑ cos φ sin φ
+ ayy cos ϑ sin ϑ sin2 φ − ayz sin2 ϑ sin φ
+ azx cos2 ϑ cos φ + azy cos2 ϑ sin φ − azz cos ϑ sin ϑ,

arφ = −axx sin ϑ cos φ sin φ + axy sin ϑ cos2 φ

− ayx sin ϑ sin2 φ + ayy sin ϑ cos φ sin φ
− azx cos ϑ sin φ + azy cos ϑ cos φ,

aϑr = axx cos ϑ sin ϑ cos2 φ + axy cos ϑ sin ϑ cos φ sin φ
+ axz cos2 ϑ cos φ + ayx cos ϑ sin ϑ cos φ sin φ
+ ayy cos ϑ sin ϑ sin2 φ + ayz cos2 ϑ sin φ
− azx sin2 ϑ cos φ − azy sin2 ϑ sin φ − azz cos ϑ sin ϑ,
aϑϑ = axx cos2 ϑ cos2 φ + axy cos2 ϑ cos φ sin φ
− axz cos ϑ sin ϑ cos φ + ayx cos2 ϑ cos φ sin φ
+ ayy cos2 ϑ sin2 φ − ayz cos ϑ sin ϑ sin φ
− azx cos ϑ sin ϑ cos φ − azy cos ϑ sin ϑ sin φ + azz sin2 ϑ,

aϑφ = −axx cos ϑ cos φ sin φ + axy cos ϑ cos2 φ

− ayx cos ϑ sin2 φ + ayy cos ϑ cos φ sin φ
+ azx sin ϑ sin φ − azy sin ϑ cos φ,

aφr = −axx sin ϑ cos φ sin φ − axy sin ϑ sin2 φ

− axz cos ϑ sin φ + ayx sin ϑ cos2 φ
+ ayy sin ϑ cos φ sin φ + ayz cos ϑ cos φ,

aφϑ = −axx cos ϑ cos φ sin φ − axy cos ϑ sin2 φ

+ axz sin ϑ sin φ + ayx cos ϑ cos2 φ
+ ayy cos ϑ cos φ sin φ − ayz sin ϑ cos φ,

aφφ = axx sin2 φ − axy cos φ sin φ

− ayx cos φ sin φ + ayy cos2 φ;


axx = arr sin2 ϑ cos2 φ + arϑ cos ϑ sin ϑ cos2 φ

− arφ sin ϑ cos φ sin φ + aϑr cos ϑ sin ϑ cos2 φ
+ aϑϑ cos2 ϑ cos2 φ − aϑφ cos ϑ cos φ sin φ
− aφr sin ϑ cos φ sin φ − aφϑ cos ϑ cos φ sin φ + aφφ sin2 φ,

axy = arr sin2 ϑ cos φ sin φ + arϑ cos ϑ sin ϑ cos φ sin φ
+ arφ sin ϑ cos2 φ + aϑr cos ϑ sin ϑ cos φ sin φ
+ aϑϑ cos2 ϑ cos φ sin φ + aϑφ cos ϑ cos2 φ
− aφr sin ϑ sin2 φ − aφϑ cos ϑ sin2 φ − aφφ cos φ sin φ,

axz = arr cos ϑ sin ϑ cos φ − arϑ sin2 ϑ cos φ

+ aϑr cos2 ϑ cos φ − aϑϑ cos ϑ sin ϑ cos φ
− aφr cos ϑ sin φ + aφϑ sin ϑ sin φ,

ayx = arr sin2 ϑ cos φ sin φ + arϑ cos ϑ sin ϑ cos φ sin φ
− arφ sin ϑ sin2 φ + aϑr cos ϑ sin ϑ cos φ sin φ
+ aϑϑ cos2 ϑ cos φ sin φ − aϑφ cos ϑ sin2 φ
+ aφr sin ϑ cos2 φ + aφϑ cos ϑ cos2 φ − aφφ cos φ sin φ,
ayy = arr sin2 ϑ sin2 φ + arϑ cos ϑ sin ϑ sin2 φ
+ arφ sin ϑ cos φ sin φ + aϑr cos ϑ sin ϑ sin2 φ
+ aϑϑ cos2 ϑ sin2 φ + aϑφ cos ϑ cos φ sin φ
+ aφr sin ϑ cos φ sin φ + aφϑ cos ϑ cos φ sin φ + aφφ cos2 φ,

ayz = arr cos ϑ sin ϑ sin φ − arϑ sin2 ϑ sin φ

+ aϑr cos2 ϑ sin φ − aϑϑ cos ϑ sin ϑ sin φ
+ aφr cos ϑ cos φ − aφϑ sin ϑ cos φ,

azx = arr cos ϑ sin ϑ cos φ + arϑ cos2 ϑ cos φ

− arφ cos ϑ sin φ − aϑr sin2 ϑ cos φ
− aϑϑ cos ϑ sin ϑ cos φ + aϑφ sin ϑ sin φ,

azy = arr cos ϑ sin ϑ sin φ + arϑ cos2 ϑ sin φ

+ arφ cos ϑ cos φ − aϑr sin2 ϑ sin φ
− aϑϑ cos ϑ sin ϑ sin φ − aϑφ sin ϑ cos φ,

azz = arr cos2 ϑ − arϑ cos ϑ sin ϑ

− aϑr cos ϑ sin ϑ + aϑϑ sin2 ϑ;


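$$\begin{aligned}
&a_{11} = a^{11} = a_1{}^1 = a^1{}_1 = a_{rr}, \\
&a_{12} = a^1{}_2 = r\, a_{r\vartheta}, \qquad a^{12} = a_1{}^2 = \frac{1}{r}\, a_{r\vartheta}, \\
&a_{13} = a^1{}_3 = r \sin\vartheta\, a_{r\varphi}, \qquad a^{13} = a_1{}^3 = \frac{1}{r \sin\vartheta}\, a_{r\varphi}, \\
&a_{21} = a_2{}^1 = r\, a_{\vartheta r}, \qquad a^{21} = a^2{}_1 = \frac{1}{r}\, a_{\vartheta r}, \\
&a_{22} = r^2\, a_{\vartheta\vartheta}, \qquad a^{22} = \frac{1}{r^2}\, a_{\vartheta\vartheta}, \qquad a_2{}^2 = a^2{}_2 = a_{\vartheta\vartheta}, \\
&a_{23} = r^2 \sin\vartheta\, a_{\vartheta\varphi}, \qquad a^{23} = \frac{1}{r^2 \sin\vartheta}\, a_{\vartheta\varphi}, \\
&a_2{}^3 = \frac{1}{\sin\vartheta}\, a_{\vartheta\varphi}, \qquad a^2{}_3 = \sin\vartheta\, a_{\vartheta\varphi}, \\
&a_{31} = a_3{}^1 = r \sin\vartheta\, a_{\varphi r}, \qquad a^{31} = a^3{}_1 = \frac{1}{r \sin\vartheta}\, a_{\varphi r}, \\
&a_{32} = r^2 \sin\vartheta\, a_{\varphi\vartheta}, \qquad a^{32} = \frac{1}{r^2 \sin\vartheta}\, a_{\varphi\vartheta}, \\
&a_3{}^2 = \sin\vartheta\, a_{\varphi\vartheta}, \qquad a^3{}_2 = \frac{1}{\sin\vartheta}\, a_{\varphi\vartheta}, \\
&a_{33} = r^2 \sin^2\vartheta\, a_{\varphi\varphi}, \qquad a^{33} = \frac{1}{r^2 \sin^2\vartheta}\, a_{\varphi\varphi}, \\
&a_3{}^3 = a^3{}_3 = a_{\varphi\varphi}.
\end{aligned} \tag{B.32}$$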
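The long transformation formulas for second-order tensors in this section are the written-out form of $a_{<ij>} = g_{<i>} \cdot a \cdot g_{<j>}$ with the physical basis (B.26), so they can be cross-checked with a short routine. This is a sketch under that assumption; the function names and sample values are chosen for illustration.

```python
import math

def physical_basis(theta, phi):
    """Rows are the Cartesian coordinates of g_<1>, g_<2>, g_<3> from (B.26)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, st * sp, ct],
            [ct * cp, ct * sp, -st],
            [-sp, cp, 0.0]]

def to_spherical(a_cart, theta, phi):
    """Physical spherical coordinates a_<ij> from Cartesian coordinates a_mn."""
    e = physical_basis(theta, phi)
    return [[sum(e[i][m] * a_cart[m][n] * e[j][n]
                 for m in range(3) for n in range(3))
             for j in range(3)] for i in range(3)]

def to_cartesian(a_sph, theta, phi):
    """Cartesian coordinates a_mn from physical spherical coordinates a_<ij>."""
    e = physical_basis(theta, phi)
    return [[sum(e[i][m] * a_sph[i][j] * e[j][n]
                 for i in range(3) for j in range(3))
             for n in range(3)] for m in range(3)]

theta, phi = 0.7, 2.1
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
back = to_cartesian(to_spherical(a, theta, phi), theta, phi)
print(max(abs(back[m][n] - a[m][n]) for m in range(3) for n in range(3)))

# Coefficient of a_yz in a_rr: set a_yz = 1 and all other coordinates to zero.
only_yz = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
print(to_spherical(only_yz, theta, phi)[0][0],
      math.cos(theta) * math.sin(theta) * math.sin(phi))
```

Setting a single Cartesian coordinate to one, as in the last lines, isolates the coefficient of that coordinate in any of the nine formulas.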

B.2.4 δ-Tensor and ε-Tensor

The only nonzero coordinates are

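$$\begin{aligned}
&g_{11} = g^{11} = 1, \qquad g_{22} = r^2, \qquad g^{22} = \frac{1}{r^2}, \\
&g_{33} = r^2 \sin^2\vartheta, \qquad g^{33} = \frac{1}{r^2 \sin^2\vartheta};
\end{aligned} \tag{B.33}$$

$$\begin{aligned}
&e_{123} = e^1{}_{23} = r^2 \sin\vartheta, \qquad e_1{}^2{}_3 = e^{12}{}_3 = \sin\vartheta, \\
&e_{12}{}^3 = e^1{}_2{}^3 = \frac{1}{\sin\vartheta}, \qquad e_1{}^{23} = e^{123} = \frac{1}{r^2 \sin\vartheta},
\end{aligned} \tag{B.34}$$

and the corresponding permutations of the ε-tensor. Furthermore

$$g := \det g_{ij} = r^4 \sin^2\vartheta. \tag{B.35}$$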

B.2.5 Christoffel Symbols

The only nonzero Christoffel symbols are

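$$\begin{aligned}
&\Gamma^1_{22} = -r, \qquad \Gamma^1_{33} = -r \sin^2\vartheta, \\
&\Gamma^2_{12} = \Gamma^2_{21} = \Gamma^3_{13} = \Gamma^3_{31} = \frac{1}{r}, \\
&\Gamma^2_{33} = -\sin\vartheta \cos\vartheta, \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\vartheta.
\end{aligned} \tag{B.36}$$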
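The Christoffel symbols (B.36) can be reproduced from the metric coefficients (B.33). The sketch below assumes the standard formula $\Gamma^k_{ij} = \frac{1}{2} g^{kl} (g_{jl,i} + g_{il,j} - g_{ij,l})$, specialized to a diagonal metric, and replaces the partial derivatives by central differences; indices run from 0 to 2 in the code, and all names are illustrative.

```python
import math

def metric_diag(u):
    """Diagonal metric coefficients g_11, g_22, g_33 of spherical coordinates (B.33)."""
    r, theta, _phi = u
    return [1.0, r * r, (r * math.sin(theta)) ** 2]

def christoffel(u, k, i, j, h=1e-6):
    """Gamma^k_ij for a diagonal metric, with indices 0, 1, 2 for r, theta, phi."""
    def dg(a, b, c):
        # partial derivative of g_ab with respect to u^c by a central difference
        if a != b:
            return 0.0
        up, down = list(u), list(u)
        up[c] += h
        down[c] -= h
        return (metric_diag(up)[a] - metric_diag(down)[a]) / (2 * h)
    g_kk = metric_diag(u)[k]
    return 0.5 / g_kk * (dg(j, k, i) + dg(i, k, j) - dg(i, j, k))

u = (2.0, 0.9, 0.4)
r, theta = u[0], u[1]
print(christoffel(u, 0, 1, 1), -r)                                  # Gamma^1_22
print(christoffel(u, 0, 2, 2), -r * math.sin(theta) ** 2)           # Gamma^1_33
print(christoffel(u, 1, 0, 1), 1 / r)                               # Gamma^2_12
print(christoffel(u, 1, 2, 2), -math.sin(theta) * math.cos(theta))  # Gamma^2_33
print(christoffel(u, 2, 1, 2), math.cos(theta) / math.sin(theta))   # Gamma^3_23
```

Each printed pair should agree up to the error of the difference quotient. Replacing `metric_diag` by the coefficients (B.10) gives the cylindrical symbols (B.13) in the same way.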


B.2.6 Differential Operators

For each operator the physical coordinates are given:

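$$\operatorname{grad} a = \left\{ \frac{\partial a}{\partial r},\ \frac{1}{r} \frac{\partial a}{\partial\vartheta},\ \frac{1}{r \sin\vartheta} \frac{\partial a}{\partial\varphi} \right\}; \tag{B.37}$$

$$\operatorname{div} a = \frac{\partial a_r}{\partial r} + \frac{2 a_r}{r} + \frac{1}{r} \frac{\partial a_\vartheta}{\partial\vartheta} + \frac{\cot\vartheta\, a_\vartheta}{r} + \frac{1}{r \sin\vartheta} \frac{\partial a_\varphi}{\partial\varphi}; \tag{B.38}$$

$$\begin{aligned}
(\operatorname{curl} a)_r &= \frac{1}{r} \frac{\partial a_\varphi}{\partial\vartheta} + \frac{\cot\vartheta\, a_\varphi}{r} - \frac{1}{r \sin\vartheta} \frac{\partial a_\vartheta}{\partial\varphi}, \\
(\operatorname{curl} a)_\vartheta &= \frac{1}{r \sin\vartheta} \frac{\partial a_r}{\partial\varphi} - \frac{\partial a_\varphi}{\partial r} - \frac{a_\varphi}{r}, \\
(\operatorname{curl} a)_\varphi &= \frac{\partial a_\vartheta}{\partial r} + \frac{a_\vartheta}{r} - \frac{1}{r} \frac{\partial a_r}{\partial\vartheta};
\end{aligned} \tag{B.39}$$

$$\begin{aligned}
(\operatorname{grad} a)_{rr} &= \frac{\partial a_r}{\partial r}, \\
(\operatorname{grad} a)_{r\vartheta} &= \frac{1}{r} \frac{\partial a_r}{\partial\vartheta} - \frac{a_\vartheta}{r}, \\
(\operatorname{grad} a)_{r\varphi} &= \frac{1}{r \sin\vartheta} \frac{\partial a_r}{\partial\varphi} - \frac{a_\varphi}{r}, \\
(\operatorname{grad} a)_{\vartheta r} &= \frac{\partial a_\vartheta}{\partial r}, \\
(\operatorname{grad} a)_{\vartheta\vartheta} &= \frac{1}{r} \frac{\partial a_\vartheta}{\partial\vartheta} + \frac{a_r}{r}, \\
(\operatorname{grad} a)_{\vartheta\varphi} &= \frac{1}{r \sin\vartheta} \frac{\partial a_\vartheta}{\partial\varphi} - \frac{\cot\vartheta\, a_\varphi}{r}, \\
(\operatorname{grad} a)_{\varphi r} &= \frac{\partial a_\varphi}{\partial r}, \\
(\operatorname{grad} a)_{\varphi\vartheta} &= \frac{1}{r} \frac{\partial a_\varphi}{\partial\vartheta}, \\
(\operatorname{grad} a)_{\varphi\varphi} &= \frac{1}{r \sin\vartheta} \frac{\partial a_\varphi}{\partial\varphi} + \frac{a_r}{r} + \frac{\cot\vartheta\, a_\vartheta}{r};
\end{aligned} \tag{B.40}$$

$$\begin{aligned}
[(\operatorname{grad} a) \cdot b]_r &= \frac{\partial a_r}{\partial r}\, b_r + \frac{\partial a_r}{\partial\vartheta}\, \frac{b_\vartheta}{r} + \frac{\partial a_r}{\partial\varphi}\, \frac{b_\varphi}{r \sin\vartheta} - \frac{a_\vartheta b_\vartheta + a_\varphi b_\varphi}{r}, \\
[(\operatorname{grad} a) \cdot b]_\vartheta &= \frac{\partial a_\vartheta}{\partial r}\, b_r + \frac{\partial a_\vartheta}{\partial\vartheta}\, \frac{b_\vartheta}{r} + \frac{\partial a_\vartheta}{\partial\varphi}\, \frac{b_\varphi}{r \sin\vartheta} + \frac{a_r b_\vartheta - \cot\vartheta\, a_\varphi b_\varphi}{r}, \\
[(\operatorname{grad} a) \cdot b]_\varphi &= \frac{\partial a_\varphi}{\partial r}\, b_r + \frac{\partial a_\varphi}{\partial\vartheta}\, \frac{b_\vartheta}{r} + \frac{\partial a_\varphi}{\partial\varphi}\, \frac{b_\varphi}{r \sin\vartheta} + \frac{a_r b_\varphi + \cot\vartheta\, a_\vartheta b_\varphi}{r};
\end{aligned} \tag{B.41}$$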


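$$\begin{aligned}
(\operatorname{div} a)_r &= \frac{\partial a_{rr}}{\partial r} + \frac{2 a_{rr}}{r} + \frac{1}{r} \frac{\partial a_{r\vartheta}}{\partial\vartheta} + \frac{1}{r \sin\vartheta} \frac{\partial a_{r\varphi}}{\partial\varphi} - \frac{a_{\vartheta\vartheta} + a_{\varphi\varphi}}{r} + \frac{\cot\vartheta\, a_{r\vartheta}}{r}, \\
(\operatorname{div} a)_\vartheta &= \frac{\partial a_{\vartheta r}}{\partial r} + \frac{2 a_{\vartheta r}}{r} + \frac{1}{r} \frac{\partial a_{\vartheta\vartheta}}{\partial\vartheta} + \frac{1}{r \sin\vartheta} \frac{\partial a_{\vartheta\varphi}}{\partial\varphi} + \frac{\cot\vartheta\, (a_{\vartheta\vartheta} - a_{\varphi\varphi})}{r} + \frac{a_{r\vartheta}}{r}, \\
(\operatorname{div} a)_\varphi &= \frac{\partial a_{\varphi r}}{\partial r} + \frac{2 a_{\varphi r}}{r} + \frac{1}{r} \frac{\partial a_{\varphi\vartheta}}{\partial\vartheta} + \frac{1}{r \sin\vartheta} \frac{\partial a_{\varphi\varphi}}{\partial\varphi} + \frac{\cot\vartheta\, (a_{\vartheta\varphi} + a_{\varphi\vartheta})}{r} + \frac{a_{r\varphi}}{r};
\end{aligned} \tag{B.42}$$

$$\Delta a = \frac{\partial^2 a}{\partial r^2} + \frac{2}{r} \frac{\partial a}{\partial r} + \frac{1}{r^2} \frac{\partial^2 a}{\partial\vartheta^2} + \frac{\cot\vartheta}{r^2} \frac{\partial a}{\partial\vartheta} + \frac{1}{r^2 \sin^2\vartheta} \frac{\partial^2 a}{\partial\varphi^2}; \tag{B.43}$$

$$\begin{aligned}
(\Delta a)_r &= \frac{\partial^2 a_r}{\partial r^2} + \frac{2}{r} \frac{\partial a_r}{\partial r} + \frac{1}{r^2} \frac{\partial^2 a_r}{\partial\vartheta^2} + \frac{\cot\vartheta}{r^2} \frac{\partial a_r}{\partial\vartheta} + \frac{1}{r^2 \sin^2\vartheta} \frac{\partial^2 a_r}{\partial\varphi^2} \\
&\quad - \frac{2 a_r}{r^2} - \frac{2}{r^2} \frac{\partial a_\vartheta}{\partial\vartheta} - \frac{2 \cot\vartheta\, a_\vartheta}{r^2} - \frac{2}{r^2 \sin\vartheta} \frac{\partial a_\varphi}{\partial\varphi}, \\
(\Delta a)_\vartheta &= \frac{\partial^2 a_\vartheta}{\partial r^2} + \frac{2}{r} \frac{\partial a_\vartheta}{\partial r} + \frac{1}{r^2} \frac{\partial^2 a_\vartheta}{\partial\vartheta^2} + \frac{\cot\vartheta}{r^2} \frac{\partial a_\vartheta}{\partial\vartheta} + \frac{1}{r^2 \sin^2\vartheta} \frac{\partial^2 a_\vartheta}{\partial\varphi^2} \\
&\quad + \frac{2}{r^2} \frac{\partial a_r}{\partial\vartheta} - \frac{a_\vartheta}{r^2 \sin^2\vartheta} - \frac{2 \cot\vartheta}{r^2 \sin\vartheta} \frac{\partial a_\varphi}{\partial\varphi}, \\
(\Delta a)_\varphi &= \frac{\partial^2 a_\varphi}{\partial r^2} + \frac{2}{r} \frac{\partial a_\varphi}{\partial r} + \frac{1}{r^2} \frac{\partial^2 a_\varphi}{\partial\vartheta^2} + \frac{\cot\vartheta}{r^2} \frac{\partial a_\varphi}{\partial\vartheta} + \frac{1}{r^2 \sin^2\vartheta} \frac{\partial^2 a_\varphi}{\partial\varphi^2} \\
&\quad + \frac{2}{r^2 \sin\vartheta} \frac{\partial a_r}{\partial\varphi} + \frac{2 \cot\vartheta}{r^2 \sin\vartheta} \frac{\partial a_\vartheta}{\partial\varphi} - \frac{a_\varphi}{r^2 \sin^2\vartheta}.
\end{aligned} \tag{B.44}$$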

B.2.7 Curve-, Area-, and Volume Elements

The elements are

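$$\begin{aligned}
\mathrm{d}u^i &= (\mathrm{d}r,\ \mathrm{d}\vartheta,\ \mathrm{d}\varphi), \\
\mathrm{d}A_i &= (r^2 \sin\vartheta\, \mathrm{d}\vartheta\, \mathrm{d}\varphi,\ r^2 \sin\vartheta\, \mathrm{d}\varphi\, \mathrm{d}r,\ r^2 \sin\vartheta\, \mathrm{d}r\, \mathrm{d}\vartheta), \\
\mathrm{d}V &= r^2 \sin\vartheta\, \mathrm{d}r\, \mathrm{d}\vartheta\, \mathrm{d}\varphi.
\end{aligned} \tag{B.45}$$

In (B.46) the physical coordinates are given:

$$\begin{aligned}
\mathrm{d}x &= (\mathrm{d}r,\ r\, \mathrm{d}\vartheta,\ r \sin\vartheta\, \mathrm{d}\varphi), \\
\mathrm{d}A &= (r^2 \sin\vartheta\, \mathrm{d}\vartheta\, \mathrm{d}\varphi,\ r \sin\vartheta\, \mathrm{d}\varphi\, \mathrm{d}r,\ r\, \mathrm{d}r\, \mathrm{d}\vartheta).
\end{aligned} \tag{B.46}$$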
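As a plausibility check of the volume element and of the radial area element, one can integrate them numerically over a ball and over a sphere. The sketch below uses a simple midpoint rule; the function names, the number of subintervals, and the radius are assumptions made for illustration.

```python
import math

def ball_volume(radius, n=200):
    """Midpoint-rule integral of dV = r^2 sin(theta) dr dtheta dphi over a ball."""
    dr, dtheta = radius / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            total += r ** 2 * math.sin(theta) * dr * dtheta
    return 2 * math.pi * total          # the integrand does not depend on phi

def sphere_area(radius, n=200):
    """Midpoint-rule integral of the radial area element r^2 sin(theta) dtheta dphi."""
    dtheta = math.pi / n
    total = sum(radius ** 2 * math.sin((j + 0.5) * dtheta) * dtheta for j in range(n))
    return 2 * math.pi * total

radius = 1.5
print(ball_volume(radius), 4 / 3 * math.pi * radius ** 3)
print(sphere_area(radius), 4 * math.pi * radius ** 2)
```

The numerical values approach the familiar results 4πr³/3 and 4πr² as the number of subintervals grows.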


δ-tensor 55, 171–175, 194 – , reciprocal 109–112, 179
δij 10, 11, 55 – , reciprocal, in the plane 111, 112
i...j basis vectors of a surface
δp...q 11, 12
ε-tensor 56, 64, 66, 176–179, 194, 218 – , contravariant 205
εi...j 12–18 – , covariant 203
εijk 18–20, 56
absolute – basis 31
– derivative see covariant derivative – coordinate system 31, 164
– differential 191, 192 – coordinates 31, 167, 174
addition – decomposition of a tensor 99
– of determinants 9 Cayley–Hamilton theorem 152, 153, 224
– of matrices 21 – , generalized 224
– of N-tuples 4 characteristic equation 118, 119, 128, 153
– of tensors 47, 70, 180 Christoffel symbols 188, 189
additive decomposition of a tensor 99, 100 cofactor 7, 17, 21
adjunct see cofactor collinear 5
affine map 106 column
algebraic complement see cofactor – index 60
alternating see antimetric – line 108
– tensor see permutation symbol – matrix 20
angle-preserving 106 – plane 107
anisotropy 244–248 components
antidiagonal of a determinant 7 – of a tensor 54
antimetric – of a vector 36
– (part of a) tensor 48, 64, 99, 101–103, 108, congruent map see orthogonal map
120, 182, 228, 234, 246 contraction 63, 70, 183
– (with respect to two indices) 13 contravariant
antisymmetric see antimetric – basis 162
axial – basis vectors 205
– scalar 69, 88 – coordinates 165, 169, 170
– tensor 43, 56, 66, 119 – index 166
– vector 38, 66, 82, 101 coordinate 161
– curve 161
bases – notation see notations of tensors
– , holonomic 166 – system see coordinates
basic invariants 153, 154 – transformation 31, 32, 148, 149, 160
basis coordinates 31
– , Cartesian 31 – , Cartesian 31, 167, 174
– , contravariant 162 – , contravariant 165, 169, 170
– , covariant 162 – , covariant 165, 169, 170
– , holonomic 162, 163 – , curvilinear 159–165, 201, 202
– , normalized 111 – , holonomic 165–184
– , orthogonal 111 – , locally Cartesian 186
– , orthonormal 111 – of a tensor 54, 165–167
– , physical 184–186 – of a vector 36


– , orthogonal 174, 185 direction cosines 33

– , physical 184–186, 195–197 divergence 76, 77, 194
coplanar 5 division by a matrix 29, 30
cotensor 102, 119 dot product see scalar multiplication
covariant double scalar multiplication 63, 64
– basis 162 dyadic product see tensor multiplication
– basis vectors 203
– coordinates 165, 169, 170 eigendirection 117, 118, 122–130, 134, 140–143
– derivative 190, 191 eigenplane 123
– index 166 eigenvalue equation 117, 128
curl 77, 78, 195 eigenvalue problem 117, 118
curvature – , generalized 136
– , Gaussian 212 – , left 117
– , mean 212 – , right 117
– tensor 210 eigenvalues 117–121, 126, 128–143, 149, 153,
– tensor, Riemann 217–219 157
curve eigenvector 117, 122–130, 137, 149, 157
– element see line element elementary transformations 26
– integral see line integral equality
curvilinear – of determinants 8
– coordinate system 159, 160 – of matrices 21
– coordinates 159–165, 201, 202 – of N-tuples 4
cylindrical coordinates 309–313 – of tensors 47, 70, 180
decomposition of a tensor – class 27
– , additive 99, 100 – relation 27
– , Cartesian 99 equivalent
– , polar 154–157 – invariants 231
defective tensor 126 – matrices 27
definiteness 135–140 – set of elements 27
del operator 72, 74, 193 E-tensor 56, 66
derivative expansion formula, Laplace 8
– , covariant 190, 191
– , partial see partial derivative Falk’s scheme 22
– , second covariant 197–199 first fundamental form 207
– with respect to a parameter 193 frame see basis
derivative equation frame, moving 204
– of Gauss 215 function basis 239
– of Weingarten 215 fundamental
determinant 6–9, 15–18, 29 – form, first 207
– of a matrix 21, 139 – form, second 211
– of a tensor 100, 101, 135 – tensor see δ-tensor
– , regular 7 fundamental theorem of tensor analysis 71
– , singular 7
deviator 99 Gauss
diagonal matrix 21 – , derivative equation of 215
difference see subtraction – theorem 91–94
differential, total see total differential Gauss–Jordan algorithm 29, 30
differential geometry 202 Gaussian curvature 212


generalized – tensor 103, 104

– Cayley–Hamilton theorem 224 inversion tensor 143
– eigenvalue problem 136 irreducible 225
– Kronecker symbol 11 isomer 48
generator 239 isotropic
gradient 71–74, 193, 194 – tensor 57, 99, 123, 179, 245
Grassmann identity 19, 178 – tensor function 238

Hamilton–Cayley theorem see Cayley–Hamilton Kronecker symbol 10–20, 171

theorem – , generalized 11
hemitropic tensor function 244
holonomic label 1
– bases 166 Lagrange multiplier 136, 213
– basis 162, 163 Laplace operator 79, 198
– coordinates 165–184 left
homologous 4 – curl 77
– divergence 76
identity – gradient 72, 75, 193
– matrix 21 left-eigenvalue problem 117
– tensor see δ-tensor left-hand screw rule 39
improper orthogonal 28, 34, 104, 107 left-handed system 33, 39
indefinite 135, 139 length-preserving 106
index Levi-Civita tensor see permutation symbol
– , bound 3 line element 80, 199
– , contravariant 166 line integral 80, 81
– , covariant 166 – , closed 90
– , free 2 – of first kind 80
– , running 1 – of second kind 80
– , summation 2 linear
inequality sign 2 – combination 5
integrals of tensor fields 80–90, 199–202 – coordinate system 164
integrity basis 225 – equations 29, 30
invariance condition for tensor functions 221, – map 105
236–238 – operations 4
invariant – vector function 105–109
– , degree of an 229 linearly
– tensor functions 221 – dependent 5, 22, 48
invariants 101, 117, 118, 153, 154, 225 – independent 5, 22, 47
– , basic 153, 154 locally Cartesian coordinates 186
– , equivalent 231
– , independent 226, 227 main diagonal
– , irreducible 225 – of a determinant 7
– , main 153 – of a matrix 21
– , polynomial 225, 228 main invariants 119, 153
– , reducible 225 Mainardi–Codazzi equation 217
– , simultaneous 226 map
inverse – , affine 106
– image 105 – , linear 105
– of a matrix 24, 25 – , orthogonal 106


matrices – of tensors 45, 46, 79, 184

– , equivalent 27 null
– , similar 27 – space 107, 108
matrix 20–28 – vector 107
– , diagonalized 21
– , inverse 24 one matrix see identity matrix
– notation see notations of tensors one-sided surface 82
– , orthogonal 28 order of a tensor 44
– , regular 21 orientation 33, 82
– , singular 21 orthogonal
– , square 20, 99, 128, 130, 137–140 – basis 111
– , transpose 23 – coordinate system 165
mean curvature 212 – coordinates 174, 185
metric 172 – , improper 28, 34, 104, 107
– coefficients 172–175, 188, 189, 207, 208 – map 106
– tensor see δ-tensor – matrix 28
minor 7 – , proper 28, 34, 104, 106
– , principal 7 – tensor 104–106, 140–149, 228, 234, 246
motion 33 orthogonality relations 33, 110, 111, 168
moving frame 204 orthonormal basis 111
multiple scalar multiplication 63, 64, 183 outer product see vector multiplication
– , double scalar 63, 64 pair 4
– , multiple scalar 63, 64, 183 – of indices 11
– of a determinant by a number 9 parallel translation 192
– of a matrix by a number 21 partial derivative 186
– of a tensor by a scalar 47 – of a basis 187, 188
– of determinants 9 – of a position vector 187
– of matrices 22 – of a tensor 190, 191
– of an N-tuple by a constant number 4 – of a tensor coordinate 190, 191
– , scalar 57–64, 70, 183 permutation symbol 12
– , tensor 50–55, 70, 182 physical
– , vector 65–70, 96, 183 – basis 184–186
– coordinates 184–186, 195–197
N-tuple 4, 5, 21 point 31
Nabla operator see del operator – coordinates 159
natural basis see covariant basis polar
negative – decomposition of a tensor 154–157
– definite 135, 139 – scalar 88
– semidefinite 135, 139 – tensor 41, 43, 56, 119
nondefective tensor 126 – vector 38, 66, 74, 82
nonnegative tensor 151 position vector 31, 39, 167, 186
normal positive
– form of a matrix 26 – definite 135, 139, 173
– section 212 – semidefinite 135, 139
– vector 82–85, 203 – tensor 152
normalized basis 111 powers of a tensor 149–153
notations principal
– of matrices 25 – axes transformation 130–134


– curvature 213
– curvature directions 214
– minor 7
– submatrix 139
product see multiplication
proper orthogonal 28, 34, 104, 106
pseudoscalar 69, 291
pseudotensor see axial tensor
pseudovector see axial vector

quadratic form 135
quadruple 4
quotient rule 52, 61

raising and lowering of indices 175, 176, 184, 208
rank
– of a determinant 7, 30
– of a matrix 21, 30
– of a tensor 102, 103, 135
rank defect
– of a determinant 7
– of a matrix 21
reciprocal
– basis 109–112, 179
– basis in the plane 111, 112
– vector 116
reducible 225
reflection 34, 38, 146, 147
reflexive 27
regular
– coordinate transformation 160
– determinant 7
– matrix 21
– tensor 102, 106, 121, 136, 154
representation
– of tensor equations 55
– of tensor functions 221
representation of a tensor
– by eigendirections 151
– by eigenvectors 112–116
reproducibility 44
Ricci theorem 194
Riemann curvature tensor 199, 217–219
right
– curl 77
– divergence 76
– gradient 72, 75, 193
right-eigenvalue problem 117
right-hand screw rule 39
right-handed system 33, 39
rotary reflection 142, 147, 172
rotation 38, 140, 142, 144–146, 148, 149, 172
– angle 144
– axis 144
– tensor 142
row
– index 60
– line 108
– matrix 20
– plane 107
rule of Sarrus 8
– for indices 79, 184
– for underlines 79

scalar 44
– , axial 69, 88
– multiplication 57–64, 70, 183
– , polar 88
scalar-valued function 221, 238
second fundamental form 211
similar matrices 27
simultaneous invariant 226
singular
– determinant 7
– matrix 21
– tensor 102, 107–109, 121
size of an N-tuple 4
skew symmetric see antimetric
special orthogonal see proper orthogonal
spherical coordinates 314–320
square matrix 20, 99, 128–130, 137–140
Stokes theorem 94–98
subdeterminant
– of a determinant 6
– of a matrix 21
submatrix 20
subtraction
– of determinants 9
– of matrices 21
– of N-tuples 4
– of tensors 47, 70, 180
sum see addition
summation convention 1–4, 170, 171, 204
surface
– coordinates see surface parameter
– element 82, 86, 200


– one-sided 82
– parameter 203
– tensor 206
– two-sided 82
surface integral 85–88
– , closed 90
– of first kind 85
– of second kind 85
surface vector
– (of a surface element) 82–86
– (in the theory of surfaces) 206
symbolic notation see notations of tensors, 70
symmetric
– relation 27
– (part of a) tensor 48, 64, 99, 120, 130–140, 150, 181, 228, 234, 239, 241, 242, 246
– (in two indices) 10
symmetries in physics 44, 45
symmetry group 244
syzygy 225

tangent
– plane 203
– vector 208, 212
tensor 40–45, 54
– , antimetric 48, 64, 99, 101–103, 108, 120, 182, 228, 234, 246
– , axial 43, 56, 66, 119
– components 166
– coordinates 166
– equations 55, 184
– field 71–98, 186–202
– , inverse 103, 104
– , isotropic 57, 99, 123, 179, 245
– multiplication 50–55, 70, 182
– , nonnegative 151
– of a vector 101
– , orthogonal 104–106, 140–149, 228, 234, 246
– , polar 41, 43, 56, 119
– , positive 152
– , regular 102, 106, 121, 136, 154
– , singular 102, 107–109, 121
– , symmetric 48, 64, 99, 120, 130–140, 150, 181, 228, 234, 239, 241, 242, 246
– , transverse isotropic 245
tensor function 221, 236
– , hemitropic 244
– , isotropic 238
– , scalar-valued 221, 238
– , tensor-valued 221, 240–243
– , vector-valued 238–240, 243
tensor functions
– , invariance condition for 221, 236–238
– , invariant 221
– , representation of 221
tensor-valued function 221, 240–243
theory of surfaces 202
total differential 74–76, 186
– of a basis 187, 188
– of a tensor 191, 192
– of a tensor coordinate 191, 192
trace 63, 70
– of a square matrix 129
transformation
– coefficients see transformation matrix
– equations 55
transformation law
– for basis vectors 34, 162
– for derivatives with respect to the coordinates of the position vector 191
– for point coordinates 35, 168–170
– for tensor coordinates 44, 169
– for vector coordinates 36–39
transformation matrix 32–34, 164
transitive 27
translation 38, 46, 78, 184, 192, 194
transpose
– of a matrix 23, 25
– of a tensor 48, 70, 181
transverse isotropy 245, 247
triangular form 8
triangular matrix 121
triple 4
triple product 69, 70, 183
two-sided surface 82, 94

value of a determinant 6–8, 29
– in triangular form 8
vector 35–39, 44
– , axial 38, 66, 82, 101
– components 35, 36, 166
– coordinates 35, 36, 166
– multiplication 65–70, 96, 183
– of an antimetric tensor 101
– , polar 38, 66, 74, 82
– , reciprocal 116


vector function 105
– , linear 105–109
vector-valued function 105, 238–240, 243
Vieta’s formulas 121
volume
– element 88, 201
– integral 88, 89

weight of a pseudoscalar 291
Weingarten
– , derivative equation of 215

zero
– matrix 20
– tensor 44
zero-N-tuple 4
