
TENSORS

(DRAFT COPY)

LARRY SUSANKA

Abstract. The purpose of this note is to define tensors with respect to a


fixed finite dimensional real vector space and indicate what is being done
when one performs common operations on tensors, such as contraction and
raising or lowering indices. We include discussion of relative tensors, inner
products, symplectic forms, interior products, Hodge duality and the Hodge
star operator and the Grassmann algebra.
All of these concepts and constructions are extensions of ideas from linear
algebra including certain facts about determinants and matrices, which we use
freely. None of them requires additional structure, such as that provided by a
differentiable manifold.
Sections 2 through 10 provide an introduction to tensors. In sections 11
through 24 we show how to perform routine operations involving tensors. In
sections 25 through 27 we explore additional structures related to spaces of
alternating tensors.
Our aim is modest. We attempt only to bridge the gap between linear
algebra and its (relatively) benign notation and the vast world of tensor appli-
cations. We define everything carefully and consistently, and this is a concise
repository of proofs which otherwise may be spread out over a book (or merely
referenced) in the study of an application.
Many of these applications occur in the context of differentiable manifolds, or solid-state physics, or electrodynamics, and each of these important subjects comes equipped with its own challenges.
Having read these notes, the reader should find that the vocabulary, notational issues and constructions of tensor methods, at least, are less of an obstacle.

Date: March 19, 2017.



Contents
1. Some Notation 5
2. R^n, R^{n*} and R^{n**} 6
3. V, V^* and V^{**} 8
4. Change of Basis 10
5. The Einstein Summation Convention 13
6. Tensors and the (Outer) Tensor Product 14
7. Tensor Coordinates and Change of Basis 18
8. Tensor Character 20
9. Relative Tensors 23
10. An Identification of Hom(V, V) and T^1_1(V) 26
11. Contraction and Trace 26
12. Evaluation 29
13. Symmetry and Antisymmetry 30
14. A Basis for Λ_r(V) 35
15. Determinants and Orientation 38
16. Change of Basis for Alternating Tensors and Laplace Expansion 39
17. Evaluation in Λ^s(V) and Λ_s(V): the Interior Product 41
18. Bilinear Forms 43
19. An Isomorphism Induced by a Nondegenerate Bilinear Form 46
20. Raising and Lowering Indices 47
21. Four Facts About Tensors of Order 2 52
22. Matrices of Symmetric Bilinear Forms and 2-Forms 55
23. Cartesian Tensors 61
24. Volume Elements and Cross Product 64
25. An Isomorphism Between Λ_r(V) and Λ_{n-r}(V) 68
26. The Hodge ∗ Operator and Hodge Duality 72
27. The Grassmann Algebra 74
References 77
Index 78

1. Some Notation

In these notes we will be working with a few sets and functions repeatedly, so
we lay out these critters up front. The page number references their introduction.
R^n  all n × 1 "column matrices" with real entries (Page 6.)
Standard ordered basis: e given by e_1, . . . , e_n
R^{n*}  all 1 × n "row matrices" with real entries (Page 7.)
Standard dual ordered basis: e^* given by e^1, . . . , e^n
The standard inner product on R^n generates the Euclidean geometry on R^n.
Denoted x · y or ⟨x, y⟩ or x^t y for x, y ∈ R^n ("t" indicates transpose.)
V  generic real n-dimensional vector space
with two ordered bases a given by a_1, . . . , a_n and b given by b_1, . . . , b_n
A : V → R^n defined by v = \sum_{i=1}^n A^i(v) a_i for any v ∈ V (Page 8.)
B : V → R^n defined by v = \sum_{i=1}^n B^i(v) b_i for any v ∈ V (Page 10.)

Matrix M has ij-th entry M^i_j. M itself can be denoted (M^i_j).
Define matrix A by A^i_j = A^i(b_j). Note b_j = \sum_{i=1}^n A^i(b_j) a_i = \sum_{i=1}^n A^i_j a_i.
Define matrix B by B^i_j = B^i(a_j). Note a_j = \sum_{i=1}^n B^i(a_j) b_i = \sum_{i=1}^n B^i_j b_i.
A and B are called matrices of transition. (Page 10.)
We calculate that BA = I = (δ^i_j), the n × n identity matrix. (Page 10.)


Suppose v = \sum_{i=1}^n x^i a_i = \sum_{i=1}^n y^i b_i ∈ V. (Page 11.)
Then x^i = \sum_{j=1}^n y^j A^i_j = \sum_{j=1}^n y^j A^i(b_j)
and y^i = \sum_{j=1}^n x^j B^i_j = \sum_{j=1}^n x^j B^i(a_j).

If x and y in R^n represent v ∈ V in bases a and b respectively (Page 11.)
then y = Bx and B = (∂y^i/∂x^j) and x = Ay and A = (∂x^i/∂y^j).

The entries of B represent the rate of change of a b coordinate with respect to


variation of an a coordinate, when other a coordinates are left unchanged.
V^*  all real linear functionals on V
a^* given by a^1, . . . , a^n: ordered basis of V^* dual to a
b^* given by b^1, . . . , b^n: ordered basis of V^* dual to b
We show a^j = \sum_{i=1}^n A^j(b_i) b^i and b^j = \sum_{i=1}^n B^j(a_i) a^i (Page 12.)

Suppose θ = \sum_{j=1}^n g_j a^j = \sum_{j=1}^n f_j b^j ∈ V^*. (Page 12.)
Then g_j = \sum_{i=1}^n f_i B^i_j = \sum_{i=1}^n f_i B^i(a_j) and f_j = \sum_{i=1}^n g_i A^i_j = \sum_{i=1}^n g_i A^i(b_j).
If τ and σ in R^{n*} represent θ ∈ V^* in bases a^* and b^* respectively (Page 13.)
then σ = τA and τ = σB.
Bilinear form G has matrix G_a = (G_{i,j}(a)) (Page 43.)
If invertible, define G^{i,j}(a) by (G_a)^{-1} = G^a = (G^{i,j}(a)) (Page 45.)


2. R^n, R^{n*} and R^{n**}

We presume throughout that the reader has seen the basics of linear algebra,
including, at least, the concepts of ordered basis and dimension of a real vector
space and the fact that a linear transformation is determined by its effect
on an ordered basis of its domain. Facts about symmetric and skew symmetric
matrices and how they can be brought to standard forms and several facts about
determinants are referenced and used. Beyond that, the reader need only steel
him-or-her-self for the index-fest which (no way around it) ensues.

Rest assured that with some practice you too will be slinging index-laden mon-
strosities about with the aplomb of a veteran.

We represent members of the real vector space R^n as n × 1 "column matrices"

x = \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix} = \sum_{i=1}^n x^i e_i

where for each i = 1, . . . , n we define e_i = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} ← ith row.

A real number “is” a 1 × 1 matrix.

We abandon the standard practice in basic calculus and algebra textbooks whereby
members of Rn are denoted by rows. The typographical convenience of this prac-
tice for the authors of these books is inarguable, saving the reams of paper in every
elementary class which would be devoted to whitespace in the textbook using our
convention. Our choice is the one most commonly adopted for the subject at hand.
Here, rows of numbers will have a different meaning, to be defined shortly.

Superscripts and subscripts abound in the discussion while exponents are scarce,
so it should be presumed that a notation such as xi refers to the ith coordinate or
ith instance of x rather than the ith power of x.
Any x ∈ R^n has one and only one representation as \sum_{j=1}^n x^j e_j where the x^j are real numbers. The ordered list e_1, e_2, . . . , e_n is called an ordered basis of R^n by virtue of this property, and is called the standard ordered basis of R^n, denoted e.

Any linear transformation σ from R^n to R^m is identified with an m × n matrix (σ^i_j) with ith row and jth column entry given by σ^i_j = σ^i(e_j). Then

σ(x) = \begin{pmatrix} σ^1(x) \\ \vdots \\ σ^m(x) \end{pmatrix} = \sum_{i=1}^m σ^i(x) e_i = \sum_{i=1}^m σ^i\Big( \sum_{j=1}^n x^j e_j \Big) e_i = \sum_{i=1}^m \Big( \sum_{j=1}^n σ^i(e_j) x^j \Big) e_i = (σ^i_j) x.

So a linear transformation σ "is" left matrix multiplication by the m × n matrix (σ^i_j) = (σ^i(e_j)). We will usually not distinguish between a linear transformation from R^n to R^m and this matrix, writing either σ(x) or σx for σ evaluated at x.
from Rn to Rm and this matrix, writing either σ(x) or σx for σ evaluated at x.
Be aware, however, that a linear transformation is a function while matrix mul-
tiplication by a certain matrix is an arrangement of arithmetic operations which
can be employed, when done in specific order, to evaluate a function at one of its
domain members.
With this in mind, the set of linear transformations from R^n to R, also called the dual of R^n, is identified with the set of 1 × n "row matrices," which we will denote R^{n*}. Individual members of R^{n*} are called linear functionals on R^n.
Any linear functional σ on R^n is identified with the row matrix
(σ_1, σ_2, . . . , σ_n) = (σ(e_1), σ(e_2), . . . , σ(e_n))
acting on R^n by left matrix multiplication:

σ(x) = \sum_{j=1}^n σ(e_j) x^j = (σ_1, σ_2, . . . , σ_n) \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix} = σx.

The entries of row matrices are distinguished from each other by commas. These
commas are not strictly necessary, and can be omitted. But without them there
might be no obvious visual cue that you have moved to the next column: (3 − 2x −
4 − y) versus (3, −2x, −4 − y).
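As a concrete illustration of these conventions (a small sketch in Python with NumPy; the particular vector and functional are invented and not from the text), a member of R^3 is a column, a functional is a row, and evaluation is left matrix multiplication:

    import numpy as np

    # A member of R^3 as a 3 x 1 "column matrix".
    x = np.array([[1.0], [2.0], [3.0]])

    # A linear functional on R^3 as a 1 x 3 "row matrix".
    sigma = np.array([[4.0, -1.0, 0.5]])

    # Functional evaluation is left matrix multiplication by the row.
    print(float(sigma @ x))      # 4*1 + (-1)*2 + 0.5*3 = 3.5

    # The product in the other order is the n x n matrix mentioned later in this section.
    print((x @ sigma).shape)     # (3, 3)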
R^{n*} itself has ordered basis e^* given by the list e^1, e^2, . . . , e^n where each e^i = (e_i)^t, the transpose of e_i. The ordered basis e^* is called the standard ordered basis of R^{n*}.
Any σ ∈ R^{n*} is \sum_{j=1}^n σ(e_j) e^j and this representation is the only representation of σ as a nontrivial linear combination of the e^j.
The "double dual" of R^n, denoted R^{n**}, is the set of linear transformations from R^{n*} to R. These are also called linear functionals on R^{n*}. This set of functionals is identified with R^n by the evaluation map E : R^n → R^{n**} defined for x ∈ R^n by

E(x)(σ) = σ(x) = σx ∀ σ ∈ R^{n*}.

Note that if E(x) is the 0 transformation then x must be 0, and since R^n and R^{n**} have the same dimension E defines an isomorphism onto R^{n**}.

We will not normally distinguish between R^n and R^{n**}. The notations x(σ) and σ(x) and σx will be used interchangeably for x ∈ R^n and σ ∈ R^{n*}.
Note that the matrix product xσ is an n × n matrix, not a real number. It will normally be obvious from context if we intend functional evaluation x(σ) or this matrix product. Members of R^n = R^{n**} "act on" members of R^{n*} by matrix multiplication on the right.

3. V, V^* and V^{**}

If V is a generic n-dimensional real vector space with ordered basis a given by a_1, . . . , a_n there is a unique function A : V → R^n defined by

A(v) = \begin{pmatrix} A^1(v) \\ \vdots \\ A^n(v) \end{pmatrix} = \sum_{i=1}^n A^i(v) e_i when v = \sum_{i=1}^n A^i(v) a_i.

A is called the coordinate map for the ordered basis a, and A(v) is said
to represent v in ordered basis a. The individual entries in A(v) are called
“the first coordinate,” “the second coordinate” and so on. The word component
is used synonymously with coordinate.
The function A associates \sum_{i=1}^n A^i(v) a_i ∈ V with \sum_{i=1}^n A^i(v) e_i ∈ R^n.
A is an isomorphism of V onto R^n. Denote its inverse by Â.

Â(x) = \sum_{i=1}^n x^i a_i ∈ V when x = \sum_{i=1}^n x^i e_i ∈ R^n.
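For a concrete (invented) example of the coordinate map, the coordinates A(v) can be computed numerically by solving the linear system whose coefficient columns are the basis vectors; a minimal NumPy sketch:

    import numpy as np

    # An ordered basis a of V = R^2, stored as the columns of a matrix.
    a_basis = np.array([[1.0, 1.0],
                        [0.0, 2.0]])    # a_1 = (1,0)^t, a_2 = (1,2)^t

    v = np.array([3.0, 4.0])

    # A(v) solves a_basis @ A(v) = v, giving the coordinates A^1(v), A^2(v).
    Av = np.linalg.solve(a_basis, v)
    print(Av)                            # [1. 2.]  since v = 1*a_1 + 2*a_2

    # The inverse map rebuilds v from its coordinate column.
    print(a_basis @ Av)                  # [3. 4.]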

A real valued linear transformation on V is called a linear functional on V. The set of all these is denoted V^* and called the dual of V. V^* has dimension n and has ordered basis a^* given by a^1, . . . , a^n where each a^i is defined by

a^j(v) = A^j(v) when v = \sum_{i=1}^n A^i(v) a_i.

In other words, a^j "picks off" the jth coordinate of v ∈ V when you represent v in terms of the ordered basis a.
This ordered basis is said to be the ordered basis dual to a.
If you know that a_1 is a member of an ordered basis, but do not know a_2, . . . , a_n, you cannot determine a^1. The coefficient of a_1 in the sum v = \sum_{i=1}^n A^i(v) a_i depends on every member of the ordered basis, not just a_1.
If σ ∈ R^{n*} define Â^*(σ) ∈ V^* by Â^*(σ)(v) = σ(A(v)).

Â^*(σ) = \sum_{i=1}^n σ_i a^i ∈ V^* when σ = \sum_{i=1}^n σ_i e^i ∈ R^{n*}.

Â^* is an isomorphism of R^{n*} onto V^*. We will denote its inverse by A^*.



A^*(θ) = \sum_{i=1}^n g_i e^i when θ = \sum_{i=1}^n g_i a^i.

A^* is called the coordinate map for ordered basis a^*. The row matrix A^*(θ) = (A^*_1(θ), . . . , A^*_n(θ)) is said to represent θ in the dual ordered basis a^*.
The four isomorphisms described here are summarized below.

A : V → R^n and Â : R^n → V, with a_i ↔ e_i;    A^* : V^* → R^{n*} and Â^* : R^{n*} → V^*, with a^i ↔ e^i.

To keep track of this small menagerie of "A"s remember: an even number of "hats" on A sends a member of V or V^* to its representative in R^n or R^{n*}.
These are not merely isomorphisms. They are isomorphisms coordinated in a way that allows you to evaluate functionals on vectors by moving them down to R^n and R^{n*} and using their images under these isomorphisms to perform the calculation.
So for θ ∈ V^*, σ ∈ R^{n*}, v ∈ V and x ∈ R^n:

A^*(θ) A(v) = θ(v) and Â^*(σ)(Â(x)) = σx.

V^{**}, the "double dual" of V, is the set of linear functionals on V^*.
As before, the evaluation map, defined for each v ∈ V by
E(v)(θ) = θ(v) ∀ θ ∈ V^*,
provides an isomorphism of V onto V^{**} and we will not normally distinguish between members of V and members of V^{**}. The notations v(θ) and θ(v) are used interchangeably for v ∈ V and θ ∈ V^*.
Our motivation for creating these two structures (one involving V , the other
Rn ) bears some examination. Suppose we are using vector methods to help us
understand a question from physics, such as surface tension on a soap bubble or
deformation in a crystal lattice. In these applications a vector space V can arise
which is natural for the problem. Physics is simple in V . If V itself is simple
(we don't intend to rule out the possibility that V = R^n) all is well. But often V is complicated, perhaps a quotient space in a function space or a subspace of R^k for larger dimension k. The representation of V in R^n provided by an ordered basis for V, however, may be easier to understand and provides a venue for number crunching. The isomorphisms bring the necessary geometry from V down to R^n
where we can work undisturbed by irrelevant aspects of V . You choose where you
want to work, picking the description to match what you are trying to do.

4. Change of Basis

If V possesses two ordered bases a given by a_1, . . . , a_n and b given by b_1, . . . , b_n, we want to understand how the isomorphism A : V → R^n and its inverse Â are related to the analogous isomorphisms B and B̂ given by

B(v) = \begin{pmatrix} B^1(v) \\ \vdots \\ B^n(v) \end{pmatrix} where v = \sum_{i=1}^n B^i(v) b_i.

Specifically, if x and y in R^n represent the same vector v in V with respect to different bases a and b respectively, how are x and y related?
Let A be the matrix with ith row and jth column entry A^i_j = A^i(b_j) and let B be the matrix with ith row and jth column entry B^i_j = B^i(a_j). The columns of A are
the coordinates of the b basis vectors in terms of the a basis vectors. The columns
of B are the coordinates of the a basis vectors in terms of the b basis vectors.
We call A the matrix of transition from b to a and B the matrix of transi-
tion from a to b. They will be used to aid the translation between representations
of vectors in terms of the bases a and b.
Crudely (and possibly incorrectly when there is no notion of distance in V )
speaking, if all the vectors of b are longer than those of a, the entries of B will all
be smaller than 1 because the coordinates of vectors must become smaller in the
new basis b. Conversely, if the members of b are all short the entries of B will be
large.
For each i

b_i = \sum_{j=1}^n A^j(b_i) a_j = \sum_{j=1}^n A^j(b_i) \sum_{k=1}^n B^k(a_j) b_k = \sum_{k=1}^n \Big( \sum_{j=1}^n B^k(a_j) A^j(b_i) \Big) b_k.

By uniqueness of coefficients we have, for each k,

\sum_{j=1}^n B^k(a_j) A^j(b_i) = \sum_{j=1}^n B^k_j A^j_i = δ^k_i ≡ \begin{cases} 0, & if i ≠ k; \\ 1, & if i = k. \end{cases}

This means that the matrices A and B are inverse to each other.¹

BA = I, where I is the n × n identity matrix (δ^i_j).




So if A(v) = x and B(v) = y then v must be equal to

\sum_{j=1}^n A^j(v) a_j = \sum_{j=1}^n A^j(v) \sum_{i=1}^n B^i(a_j) b_i = \sum_{i=1}^n \Big( \sum_{j=1}^n B^i(a_j) A^j(v) \Big) b_i = \sum_{i=1}^n B^i(v) b_i.

¹The size of the identity matrix is usually suppressed. If that might cause confusion the notation I(n) = (δ^i_j(n)) can be used. The symbol δ^i_j is called the Kronecker delta.

So for each i,

y^i = B^i(v) = \sum_{j=1}^n B^i(a_j) A^j(v) = \sum_{j=1}^n B^i(a_j) x^j.

y = Bx if A(v) = x and B(v) = y.

The representation in R^n of a vector in V created using the ordered basis b is


obtained from the representation using a by left multiplying the representation by
the matrix B.
Another way of expressing this fact is the following, for each i:

y^i = \sum_{j=1}^n x^j B^i_j = \sum_{j=1}^n x^j B^i(a_j) when v = \sum_{j=1}^n x^j a_j = \sum_{j=1}^n y^j b_j ∈ V.
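A short numerical sketch of these matrices of transition (the bases below are invented for illustration; this is not an example from the text): the columns of A hold the b vectors written in terms of a, B is its inverse, and coordinate columns obey y = Bx and x = Ay.

    import numpy as np

    # Basis a = standard basis of R^2; basis b given by b_1 = 2 a_1, b_2 = a_1 + a_2.
    A = np.array([[2.0, 1.0],
                  [0.0, 1.0]])       # matrix of transition from b to a
    B = np.linalg.inv(A)             # matrix of transition from a to b, BA = I

    x = np.array([3.0, 4.0])         # coordinates of some v in basis a
    y = B @ x                        # coordinates of the same v in basis b

    print(np.allclose(A @ y, x))     # True: x = Ay and y = Bx describe one vector v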

Thought of as a linear function from R^n to R^n, the relationship y = Bx is differentiable and we find that the derivative is B = (∂y^i/∂x^j). Similarly, A = (∂x^i/∂y^j). These are called the Jacobian matrices of the linear function from R^n to R^n given by the coordinate change.
Repeating for emphasis, a vector v has coordinates in both basis a and basis b.
If you change the value of one coefficient a little, say on basis vector aj , you change
the vector v and this new vector could have slight changes to all of its coefficients
in basis b.
The ijth entry of B is the rate of change of the ith coordinate of a vector in basis
b, the coefficient on basis vector bi , with respect to changes in the jth coordinate
of the vector in basis a, other a-coordinates remaining constant.
It is for this reason a symbolic notation such as
db/da = B = "from a to b" and da/db = A = "from b to a"
is sometimes used, making explicit reference to the bases involved. This reminder
is sometimes used, making explicit reference to the bases involved. This reminder
is particularly convenient when many bases are in play.
Some texts use the “derivative” notation for these matrices, and the partial
derivative notation for their entries, throughout. We do not. The reason for this
is purely typographical. When I change to this partial derivative notation many of
the equations scattered throughout this text are harder to look at, and we are only
going to be referring to two bases in these notes so a more flexible (and specific)
notation is not required.
The partial derivative notation is used in applications involving differentiable
manifolds, where it is appropriate and convenient.
Next, we create the ordered basis b^* with members b^1, . . . , b^n dual to b and defined for each j by

b^j(v) = B^j(v) when v = \sum_{i=1}^n B^i(v) b_i.

b^j "picks off" the jth coordinate of v ∈ V when you represent v in terms of the ordered basis b.

We create isomorphisms B^* and B̂^* as before, as indicated below.

B : V → R^n and B̂ : R^n → V, with b_i ↔ e_i;    B^* : V^* → R^{n*} and B̂^* : R^{n*} → V^*, with b^i ↔ e^i.

We will need to know how to represent each functional b^j in terms of a^*.
Suppose b^j = \sum_{i=1}^n C^j_i a^i for j = 1, . . . , n. Then:

δ^j_k = b^j(b_k) = \sum_{i=1}^n C^j_i a^i(b_k) = \sum_{i=1}^n C^j_i a^i\Big( \sum_{w=1}^n A^w(b_k) a_w \Big) = \sum_{i=1}^n C^j_i A^i(b_k) = \sum_{i=1}^n C^j_i A^i_k.

That means the matrix C is the inverse of A. We already know that A^{-1} = B: that is, C = B. A symmetrical result holds for a description of the a^j in terms of b^*.

To reiterate: for each j

b^j = \sum_{i=1}^n B^j(a_i) a^i = \sum_{i=1}^n B^j_i a^i and a^j = \sum_{i=1}^n A^j(b_i) b^i = \sum_{i=1}^n A^j_i b^i.

Now suppose we have a member θ = \sum_{j=1}^n f_j b^j = \sum_{j=1}^n g_j a^j in V^*.

θ = \sum_{k=1}^n g_k a^k = \sum_{k=1}^n g_k \sum_{j=1}^n A^k(b_j) b^j = \sum_{j=1}^n \Big( \sum_{k=1}^n g_k A^k(b_j) \Big) b^j = \sum_{j=1}^n \sum_{k=1}^n g_k A^k_j b^j.

This means that for each j

f_j = \sum_{i=1}^n g_i A^i(b_j) = \sum_{i=1}^n g_i A^i_j when θ = \sum_{j=1}^n f_j b^j = \sum_{j=1}^n g_j a^j.

We can look at the same calculation from a slightly different standpoint. Suppose σ, τ ∈ R^{n*} represent the same linear transformation θ ∈ V^* with respect to bases b^* and a^* respectively.

Specifically, θ(v) = σB(v) = τA(v) ∀ v ∈ V. So for each i

σ_i = σe_i = σB(b_i) = θ(b_i) = θ\Big( \sum_{j=1}^n A^j(b_i) a_j \Big) = \sum_{j=1}^n A^j(b_i) θ(a_j) = \sum_{j=1}^n A^j(b_i) τA(a_j) = \sum_{j=1}^n A^j(b_i) τe_j = \sum_{j=1}^n τ_j A^j(b_i) = \sum_{j=1}^n τ_j A^j_i.

σ = τA if A^*(θ) = τ and B^*(θ) = σ.

The representation in R^{n*} of a linear transformation in V^* created using the ordered basis b^* is obtained from the representation using a^* by right multiplying the representation by the matrix A.

All these arrows are reversible: A and B send V down to R^n, and A^* and B^* send V^* down to R^{n*}. Passing from the a-representation to the b-representation is left multiplication by the matrix B in R^n, and right multiplication by the matrix A in R^{n*}.
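Continuing the same invented example for covectors (again just a sketch, not from the text): the a^*-representative τ and the b^*-representative σ of one functional θ satisfy σ = τA, and the number θ(v) comes out the same either way.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 1.0]])     # transition from b to a, as before
    B = np.linalg.inv(A)

    x = np.array([3.0, 4.0])       # a vector, represented in basis a
    y = B @ x                      # the same vector, represented in basis b

    tau = np.array([5.0, -2.0])    # a covector, represented in basis a*
    sigma = tau @ A                # the same covector, represented in basis b*

    print(tau @ x, sigma @ y)      # both 7.0: theta(v) does not depend on the basis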

We now raise an important point on change of basis. R^n and R^{n*} are vector spaces and we do not intend to preclude the possibility that V or V^* could be either of these. Still, when R^n and R^{n*} are considered as range of A and A^* we will never change basis there, but always use e and e^*. Though the representative of a vector from V or V^* might change as we go along, we take the point of view that this happens only as a result of a change in basis in V.

5. The Einstein Summation Convention

At this point it becomes worthwhile to use a common notational contrivance: in an indexed sum of products, it often (very often) happens that an index symbol being summed over occurs once as subscript and once as superscript in each product in the sum. The formula b_j = \sum_{i=1}^n A^i_j a_i for j = 1, . . . , n provides an example.
Henceforth, if you see a formula in these notes involving a product
within which one subscript and one superscript are indicated by the same
symbol then a summation over all the values of that index is presumed.
If an index is not used as an index of summation it is presumed that
the formula has distinct instances, one for each potential value of that
index.
This is called the Einstein summation convention. With this convention the n different formulae found above are represented by b_j = A^i_j a_i, and if anything else is intended that must be explicitly explained in situ.

In these notes we will never use this convention if, within a product, the
same symbol is used more than once as subscript or more than once as
superscript.
So for us a_i x_i is not a sum. Neither is a_i a_i x^i or (a_i + a_i) x^i. But (a_i)^2 x^i and a_i x^i + a_i x^i do indicate summation on i, and the latter is the same as a_i x^i + a_j x^j.
If we intend symbols such as a_i x_i to denote a sum we must use the old conventional notation \sum_{i=1}^n a_i x_i.
Sometimes the notation can be ambiguous, as in ai xi . Rather than elaborate
on our convention, converting a notational convenience into a mnemonic nuisance,
we will simply revert to the old summation notation whenever confusion might
occur.
Be aware that many texts which focus primarily on tensor representations in-
volving special types of vector spaces—inner product spaces—do not require that
a subscript occur “above and below” if summation is intended, only that it be re-
peated twice in a term. For them, a_i x_i is a sum. We will never use this version of
the summation convention.
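The convention translates directly into numerical work; NumPy's einsum, for example, makes the summed index pairs explicit. A tiny sketch (illustrative only, not part of the original text):

    import numpy as np

    A = np.random.rand(3, 3)          # entries A^i_j, first axis playing the superscript
    x = np.random.rand(3)             # entries x^j

    # y^i = A^i_j x^j: the repeated index j is summed.
    y = np.einsum('ij,j->i', A, x)
    print(np.allclose(y, A @ x))      # True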

6. Tensors and the (Outer) Tensor Product

We define the product space

V^r_s = V^* × · · · × V^* × V × · · · × V    (r copies of V^*, then s copies of V)

where r is the number of times that V^* occurs (they are all listed first) and s is the number of times that V occurs. We will define V^r_s(k) by

V^r_s(k) = \begin{cases} V^*, & if 1 ≤ k ≤ r; \\ V, & if r + 1 ≤ k ≤ r + s. \end{cases}

A function T : V^r_s → R is called multilinear if it is linear in each domain factor separately.
Making this precise is notationally awkward. In fact, to the beginner this whole
business looks like a huge exercise in creative but only marginally successful
management of notational messes. Here is an example.
T is multilinear if whenever v(i) ∈ V^r_s(i) for each i then the function
T(v(1), . . . , v(j − 1), x, v(j + 1), . . . , v(r + s))
in the variable x ∈ V^r_s(j) is linear on V^r_s(j) for j = 1, . . . , r + s. Sometimes we say, colloquially, that T is linear in each of its slots separately. x in the formula above is said to be in the jth slot of T.
It is, obviously, necessary to tweak this condition at the beginning and end of
a subscript range, where j − 1 or j + 1 might be “out of bounds” and it is left to
the reader to do the right thing here, and in similar cases later in the text.

A tensor² on V is any multilinear function T as described above. r + s is called the order or rank of the tensor T.
T is said to have contravariant order r and covariant order s. If s = 0 the tensor is called contravariant while if r = 0 it is called covariant. If neither r nor s is zero, T is called a mixed tensor.
A covariant tensor of order 1 is simply a linear functional, and in this context is called a 1-form, a covariant vector or a covector. A contravariant tensor of order 1 is often called a contravector or, simply, a vector.
The sum of two tensors defined on V^r_s is a tensor defined on V^r_s, and also a constant multiple of a tensor defined on V^r_s is a tensor defined on V^r_s. So the tensors defined on V^r_s constitute a real vector space denoted T^r_s(V). For convenience we define T^0_0(V) to be R. Members of R are called, variously, real numbers, constants, invariants or scalars.
Tensors can be multiplied to form new tensors of higher order.
If T : V^r_s → R is a tensor and if T̃ : V^{r̃}_{s̃} → R is a tensor then the tensor product of T with T̃ (in that order) is denoted T ⊗ T̃ and defined to be the multilinear function

T ⊗ T̃ : V^{r+r̃}_{s+s̃} → R

given for (v(1), . . . , v(r + r̃ + s + s̃)) ∈ V^{r+r̃}_{s+s̃} by the product

(T ⊗ T̃)(v(1), . . . , v(r + r̃ + s + s̃))
    = T(v(1), . . . , v(r), v(r + r̃ + 1), . . . , v(r + r̃ + s))
    · T̃(v(r + 1), . . . , v(r + r̃), v(r + r̃ + s + 1), . . . , v(r + r̃ + s + s̃)).

Sometimes this product is called the outer product of the two tensors. T ⊗ T̃ is of order r + r̃ + s + s̃, contravariant of order r + r̃ and covariant of order s + s̃.
This process of creating tensors of higher order is definitely not commutative: T ⊗ T̃ ≠ T̃ ⊗ T in general. But this process is associative so a tensor represented as T ⊗ S ⊗ R is unambiguous.
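In coordinates (introduced just below) the tensor product simply multiplies every coefficient of one tensor by every coefficient of the other. A NumPy sketch with invented arrays, anticipating that description:

    import numpy as np

    T = np.random.rand(3, 3)      # coordinates T^i_j of a mixed tensor
    S = np.random.rand(3, 3)      # coordinates S_kl of a covariant tensor

    # (T tensor S) has coordinates T^i_j S_kl, an order-4 array.
    TS = np.einsum('ij,kl->ijkl', T, S)
    ST = np.einsum('kl,ij->klij', S, T)
    print(TS.shape)                                            # (3, 3, 3, 3)

    # The two products have the same entries only after permuting slots:
    # the tensor product is not commutative as an operation on tensors.
    print(np.allclose(TS, np.transpose(ST, (2, 3, 0, 1))))     # True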
Suppose a is an ordered basis of V, identified with V^{**}, and a^* the dual ordered basis of V^*. We use in the calculation below, for the first time, the Einstein summation convention: repeated indices which occur exactly once as superscript and exactly once as subscript in a term connote summation over the range of that index.
If T ∈ T^r_s(V) and v = (v(1), . . . , v(r + s)) ∈ V^r_s then

T(v) = T( A^*_{i_1}(v(1)) a^{i_1}, . . . , A^*_{i_r}(v(r)) a^{i_r}, A^{i_{r+1}}(v(r + 1)) a_{i_{r+1}}, . . . , A^{i_{r+s}}(v(r + s)) a_{i_{r+s}} )
     = A^*_{i_1}(v(1)) · · · A^{i_{r+s}}(v(r + s)) T(a^{i_1}, . . . , a_{i_{r+s}})
     = T(a^{i_1}, . . . , a_{i_{r+s}}) a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{i_{r+1}} ⊗ · · · ⊗ a^{i_{r+s}} (v).

2The word tensor is adapted from the French “tension” meaning “strain” in English. Under-
standing the physics of deformation of solids was an early application.

From this exercise in multilinearity we conclude that the n^{r+s} tensors of the form a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{i_{r+1}} ⊗ · · · ⊗ a^{i_{r+s}} span T^r_s(V) and it is fairly easy to show they constitute a linearly independent set and hence a basis, the standard basis for T^r_s(V) for ordered basis a.
The n^{r+s} indexed numbers T(a^{i_1}, . . . , a_{i_{r+s}}) determine the tensor T and we say that this indexed collection of numbers "is" the tensor T in the ordered basis a.
We introduce the notation

T^{i_1,...,i_r}_{i_{r+1},...,i_{r+s}}(a) = T(a^{i_1}, . . . , a^{i_r}, a_{i_{r+1}}, . . . , a_{i_{r+s}}).

Alternatively, after letting j_k = i_{r+k} for k = 1, . . . , s we define

T^{i_1,...,i_r}_{j_1,...,j_s}(a) = T(a^{i_1}, . . . , a^{i_r}, a_{j_1}, . . . , a_{j_s}).

For each ordered basis a, T^{···}_{···}(a) is a real valued function whose domain consists of all the ordered (r+s)-tuples of integers from the index set and whose value on any given index combination is shown above.
So as a multiple sum over r + s different integer indices

T = T^{i_1,...,i_r}_{j_1,...,j_s}(a) a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{j_1} ⊗ · · · ⊗ a^{j_s}.

A tensor which is the tensor product of r different vectors and s different covectors is of order t = r + s. Tensors which can be represented this way are called simple. It is fairly obvious but worth noting that not all tensors are simple.
To see this note that the selection of r vectors and s covectors to form a simple tensor involves nt coefficients to represent these vectors and covectors in basis a when V has dimension n. On the other hand, a general tensor of order t is determined by the n^t numbers T^{i_1,...,i_r}_{j_1,...,j_s}(a). Solving n^t equations in nt unknowns is generally not possible for integers n and t exceeding 1, unless n = t = 2.

Getting back to the discussion from above, we associate T with the tensor ÃT ∈ T^r_s(R^n) defined by

ÃT = T^{i_1,...,i_r}_{j_1,...,j_s}(a) e_{i_1} ⊗ · · · ⊗ e_{i_r} ⊗ e^{j_1} ⊗ · · · ⊗ e^{j_s}.

Ã is a combination of Â and Â^* applied, as appropriate, factor by factor. We leave the definition of the inverse of Ã to the imagination of the reader.

Ã sends T^r_s(V) down to T^r_s(R^n); its inverse sends a representative back up.

If you wish to do a calculation with a tensor T to produce a number you can use its representation ÃT, with respect to a given ordered basis a in V, as a tensor on R^n.
Specifically, we can evaluate T on a member (v(1), . . . , v(r + s)) of V^r_s by calculating the value of T^{i_1,...,i_r}_{j_1,...,j_s}(a) e_{i_1} ⊗ · · · ⊗ e_{i_r} ⊗ e^{j_1} ⊗ · · · ⊗ e^{j_s} on the member (A^*(v(1)), . . . , A^*(v(r)), A(v(r + 1)), . . . , A(v(r + s))) of (R^n)^r_s.
We have arrived at several closely related definitions of, or uses for, the creatures like T^{i_1,...,i_r}_{j_1,...,j_s}(a) seen peppering books which apply tensors.

Most simply, they are nothing more than the numbers T(a^{i_1}, . . . , a^{i_r}, a_{j_1}, . . . , a_{j_s}), the coefficients of the simple tensors a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{j_1} ⊗ · · · ⊗ a^{j_s} in the explicit representation of the multilinear function, using the basis a.
More abstractly, T^{···}_{···} is a function from the set of all ordered bases of V to the set of real valued functions whose domain is the n^{r+s} possible values of our index set. The value of T^{···}_{···} on an ordered basis a is the function T^{···}_{···}(a).
The value of this function on an index combination is denoted T^{i_1,...,i_r}_{j_1,...,j_s}(a), and
these numbers are called the coordinates of T in the ordered basis a. For
various index combinations but fixed ordered basis these numbers might be anything
(though of course they depend on T and a.) These values are unrelated to each
other for different index combinations in the sense that they can be prescribed
at will or measured and found to be any number if you are creating a tensor from
scratch in a particular basis. But once they are all specified or known in one ordered
basis their values are determined in all other bases, and we shall see soon how these
numbers must change in a new ordered basis.
Finally, we can think of T^{···}_{···} as defining a function from the set of all ordered bases of V to the set of tensors in T^r_s(R^n).
For each ordered basis a, T is assigned its own representative tensor ÃT ∈ T^r_s(R^n). Not every member of T^r_s(R^n) can be a representative of this particular tensor using different bases. The real number coefficients must change in a coordinated way when moving from ordered basis to ordered basis, as we shall see below.
There are advantages and disadvantages to each point of view.
The list of coefficients using a single basis would suffice to describe a tensor com-
pletely, just as the constant of proportionality suffices to describe a direct variation
of one real variable with another. But specifying the constant of proportionality
and leaving it at that misses the point of direct variation, which is a functional
relationship. Likewise, the coefficients which define a tensor are just numbers and
thinking of them alone downplays the relationship they are intended to represent.
Guided by the impression that people think more easily about relationships taking place in some familiar environment you have the option of regarding tensors as living in R^n through a representative tensor there, one representative for each ordered basis of V. The utility of this viewpoint depends, of course, on familiarity with tensors in R^n.

7. Tensor Coordinates and Change of Basis

Suppose T is the same tensor from above and b is a new ordered basis of V with dual ordered basis b^* of V^*.
Representing this tensor in terms of both the old ordered basis a and the new ordered basis b we have

T = T(b^{w_1}, . . . , b^{w_r}, b_{h_1}, . . . , b_{h_s}) b_{w_1} ⊗ · · · ⊗ b_{w_r} ⊗ b^{h_1} ⊗ · · · ⊗ b^{h_s}
  = T(a^{i_1}, . . . , a^{i_r}, a_{j_1}, . . . , a_{j_s}) a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{j_1} ⊗ · · · ⊗ a^{j_s}
  = T(a^{i_1}, . . . , a^{i_r}, a_{j_1}, . . . , a_{j_s})
      · (B^{w_1}(a_{i_1}) b_{w_1}) ⊗ · · · ⊗ (B^{w_r}(a_{i_r}) b_{w_r}) ⊗ (A^{j_1}(b_{h_1}) b^{h_1}) ⊗ · · · ⊗ (A^{j_s}(b_{h_s}) b^{h_s})
  = T(a^{i_1}, . . . , a^{i_r}, a_{j_1}, . . . , a_{j_s})
      · B^{w_1}(a_{i_1}) · · · B^{w_r}(a_{i_r}) A^{j_1}(b_{h_1}) · · · A^{j_s}(b_{h_s})
      · b_{w_1} ⊗ · · · ⊗ b_{w_r} ⊗ b^{h_1} ⊗ · · · ⊗ b^{h_s}.

In a shorter form we have

T^{w_1,...,w_r}_{h_1,...,h_s}(b) = T^{i_1,...,i_r}_{j_1,...,j_s}(a) B^{w_1}_{i_1} · · · B^{w_r}_{i_r} A^{j_1}_{h_1} · · · A^{j_s}_{h_s}.

Behold the—rather messy—formula which must be satisfied by the coordinates


of representations of a tensor in different bases.
We have already seen and used the coordinates of one mixed tensor, given by δ^i_j a_i ⊗ a^j. It is worth checking directly that the coordinates of this (rather unusual) tensor are the same in any ordered basis.
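The boxed law is easy to test numerically. The sketch below (an invented basis change and random coordinates, purely for illustration) applies it to a tensor with r = s = 1 and confirms that the δ^i_j tensor keeps the same coordinates:

    import numpy as np

    n = 3
    A = np.array([[2., 1., 0.],
                  [0., 1., 0.],
                  [0., 0., 1.]])          # matrix of transition from b to a
    B = np.linalg.inv(A)                  # matrix of transition from a to b

    T_a = np.random.rand(n, n)            # coordinates T^i_j(a) of a mixed tensor

    # T^w_h(b) = T^i_j(a) B^w_i A^j_h
    T_b = np.einsum('ij,wi,jh->wh', T_a, B, A)
    print(np.allclose(T_b, B @ T_a @ A))              # True: for r = s = 1 this is B T A

    # The coordinates of delta^i_j are unchanged in the new basis.
    delta_b = np.einsum('ij,wi,jh->wh', np.eye(n), B, A)
    print(np.allclose(delta_b, np.eye(n)))            # True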
In deriving the formula for new tensor coordinates after change of basis we used
a technique we will revisit.
In this case our assumption was simple: we assumed we were in possession of a
tensor T , a multilinear real valued function on a product space formed from copies
of V and V ∗ . No coordinates are mentioned, and play no role in the definition.
With that coordinate-free definition in hand, we defined coordinates in any basis
and showed a relationship which must hold between coordinates in different bases.
You will most likely never see an actual tensor (as opposed to the “Let T be
a tensor ....” type of tensor) defined in any way except through a coordinate
representation in a basis. The existence of a tensor T underlying a multilinear
function defined in a basis is part of the theory of the subject you were studying
when the putative tensor appeared.
The change of basis formula is then a testable consequence for quantities you
can measure of that subject-specific theory.
We will see this kind of thing in action—coordinate-free definition followed by
implications for coordinates—later when we discuss topics such as contraction and
raising and lowering indices.
This essential idea will be raised again in the next section.

At this point it seems appropriate to discuss the meaning of covariant and contravariant tensors and how their coordinates vary under change of basis, with the goal of getting some intuition about their differences. We illustrate this with tensors of rank 1: contravariant vectors (vectors) and covariant vectors (covectors.)
The physical scenario we have in mind consists of two types of items in the air
in front of your eyes with you seated, perhaps, at a desk.
First, we have actual physical displacements, say of various dust particles that
you witness. Second, we have stacks of flat paper, numbered like pages in a book,
each stack having uniform “air gap” between pages throughout its depth, though
that gap might vary from stack to stack. We make no restriction about the angle
any particular stack must have relative to the desk. We consider these pages as
having indeterminate extent, perhaps very large pages, and the stacks to be as deep
as required, though of uniform density.
The magnitude of a displacement will be indicated by the length of a line segment
connecting start to finish, which we can give numerically should we decide on a
standard of length. Direction of the displacement is indicated by the direction of
the segment together with an “arrow head” at the finish point.
The magnitude of a stack will be indicated by the density of pages in the stack
which we can denote numerically by reference to a “standard stack” if we decide
on one. The direction of the stack is in the direction of increasing page number.
We now define a coordinate system on the space in front of you, measuring
distances in centimeters, choosing an origin, axes and so on in some reasonable way
with z axis pointing “up” from your desk.
Consider the displacement of a dust particle which moves straight up 100 cen-
timeters from your desk, and a stack of pages laying on your desk with density 100
pages per centimeter “up” from your desk.
If you decide to measure distances in meters rather than centimeters, the vertical
coordinate of displacement drops to 1, decreasing by a factor of 100. The numerical
value of the density of the stack, however, increases to 10,000.
When the “measuring stick” in the vertical direction increases in length by a
factor of 100, coordinates of displacement drop by that factor and displacement is
called contravariant because of this.
On the other hand, the stack density coordinate changes in the same way as the
basis vector length, so we would describe the stack objects as covariant.
We haven’t really shown that these stack descriptions and displacements can be
regarded as vector spaces, though no doubt you have seen the geometrical procedure
for scalar multiplication and vector addition of displacements. There are purely
geometrical ways of combining stacks to produce a vector space structure on stacks
too: two intersecting stacks create parallelogram “columns” and the sum stack has
sheets that extend the diagonals of these columns.
But the important point is that if stacks and displacements are to be thought of
as occurring in the same physical universe, and if a displacement is to be represented
as a member u of R^3, then a stack must be represented as a linear functional, a member σ of R^{3*}.

There is a physical meaning associated with the number σu. It is the number
of pages of the stack corresponding to σ crossing the shaft of the displacement
corresponding to u, where this number is positive when the motion was in the
direction of increasing “page number.”
It is obvious on physical grounds that this number must be invariant: it cannot
depend on the vagaries of the coordinate system used to calculate it. That is the
meaning of
σu = σ (AB) u = (σA) (Bu)
and why tensor coordinates of the two types change as they do.

8. Tensor Character

In applications, a real valued function is often defined by specified operations on


the coordinates of several vector and/or covector variables in a convenient ordered
basis a. It is then proposed that this function “is” a tensor.
This hypothesis has an underlying assumption, which must be remembered, and
two parts, which must be verified.
The often unstated (or at least underemphasized and forgotten by beginners) assumption is that there actually is a vector space V which "is" reality, or closer to it anyway, whose coordinates in a basis generate a representative tensor in T^r_s(R^n). You will likely be working with nothing but representative tensors, but the "real thing" is a member of T^r_s(V).
The first part of the hypothesis to be verified is that the coordinate process is
linear in each vector or covector (each “slot”) separately.
Secondly, it must be shown or simply assumed (possibly for physical reasons) that the process by which you produced the coordinates has tensor character: that is, had you applied the same process by which you calculated the tensor coordinates T^{i_1,...,i_r}_{j_1,...,j_s}(a) in the first coordinate system a to a second coordinate system b, the tensor coordinates T^{i_1,...,i_r}_{j_1,...,j_s}(b) thereby produced would be the properly transformed tensor coordinates from the first basis.
In order to provide examples and make the connection with classical treatments
we first do some spadework. We specialize the example to consider a tensor of rank
3, selecting that as the “sweet spot” between too simple and too complicated.
The symbols x and y are intended to be the representatives in ordered basis a^* of generic covectors, while z is the representative in ordered basis a of a generic vector. In our examples below we assume the dimension of these underlying spaces to be at least three (so x, y and z have at least three coordinates each) though that is not really relevant to the issue.
A coordinate polynomial for x, y and z is any polynomial involving the co-
ordinates of x, y and z and, after simplification perhaps, no other indeterminates.
If, after simplification, each nonzero term in this polynomial is of third degree
and involves one coordinate from each of x, y and z then the polynomial is said to
be multilinear in x, y and z.

So the polynomial x_1 y_2 z^2 − 7 x_2 y_2 z^3 is multilinear in x, y and z but x_1 x_3 z^2 − 7 x_2 y_2 z^1 is not and x_1 y_2 z^1 − 7 x_2 y_2 is not. Though 8 x_3 y_2 − x_2 y_1 is not multilinear in x, y and z, it is multilinear in x and y.
Multilinearity of a coordinate polynomial is necessary if it is to represent the effect of a member T of T^2_1(V) on generic vectors and covectors in its domain: it encapsulates multilinearity of T.
More generally, sometimes the coordinates of a proposed tensor T of order k are
given by formulae involving an explicit ordered basis a. Coordinates of the appro-
priate combination of k different generic vectors or covectors in this ordered basis a
are named, as was x, y and z above, and the calculated tensor coordinates affixed
as coefficient for each coordinate monomial, creating a coordinate polynomial.
Multilinearity means that the polynomial must (possibly after simplification and
collection of like terms) be the sum of polynomial terms each of which is degree k
and involves one coordinate from each of the k generic vectors or covectors. The
(simplified) polynomial is called the coordinate polynomial for T in ordered
basis a.
Going back to our example of order 3, if the polynomial x_1 y_2 z^3 − 7 x_2 y_2 z^1 is the coordinate polynomial for a tensor T ∈ T^2_1 in ordered basis a where V has dimension 3, the tensor coordinates are T^{1,2}_3(a) = 1, T^{2,2}_1(a) = −7 and in this ordered basis the other twenty-five tensor coefficients are 0.
After satisfying ourselves that a proposed coordinate polynomial is multilinear, which is often obvious, we still have some work to do. The numbers 1, −7 and 0, the coordinates in ordered basis a, came from somewhere. Usually it is an explicit formula involving the basis a. There is only one tensor T could be:

T = a_1 ⊗ a_2 ⊗ a^3 − 7 a_2 ⊗ a_2 ⊗ a^1.
The question is, do the formulae that produced 1, −7 and 0 for basis a have tensor
character: will they, if applied in a second basis b yield the necessary coefficients
in that new basis?
Let's get specific. If basis b happens to be b_1 = 2a_1 and b_2 = a_1 + a_2 and b_3 = a_3 then

A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} and B = A^{-1} = \begin{pmatrix} 1/2 & −1/2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

So T^{2,2}_1(b) must equal

T^{i_1,i_2}_{j_1}(a) B^2_{i_1} B^2_{i_2} A^{j_1}_1 = T^{1,2}_3(a) B^2_1 B^2_2 A^3_1 + T^{2,2}_1(a) B^2_2 B^2_2 A^1_1 = 1 · 0 · 1 · 0 + (−7) · 1 · 1 · 2 = −14.

In the new basis the coordinate polynomial will have a term

T^{2,2}_1(b) x̄_2 ȳ_2 z̄^1 = −14 x̄_2 ȳ_2 z̄^1

where x̄, ȳ and z̄ are the representatives in basis b of the same two covectors and the same vector, in the same order, that x, y and z represent in basis a.
In the notation we have used up to this section, T will have a term −14 b_2 ⊗ b_2 ⊗ b^1 when represented in basis b.
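Checks of this sort are easy to hand to a computer. The sketch below redoes the order-3 example with NumPy (array indices are 0-based, so T[0, 1, 2] holds T^{1,2}_3(a)) and recovers the transformed coordinate found above:

    import numpy as np

    A = np.array([[2., 1., 0.],
                  [0., 1., 0.],
                  [0., 0., 1.]])          # transition from b to a for the basis above
    B = np.linalg.inv(A)                  # transition from a to b

    T_a = np.zeros((3, 3, 3))             # coordinates T^{i1,i2}_j(a)
    T_a[0, 1, 2] = 1.0                    # T^{1,2}_3(a) =  1
    T_a[1, 1, 0] = -7.0                   # T^{2,2}_1(a) = -7

    # T^{w1,w2}_h(b) = T^{i1,i2}_j(a) B^{w1}_{i1} B^{w2}_{i2} A^j_h
    T_b = np.einsum('abc,wa,xb,ch->wxh', T_a, B, B, A)
    print(T_b[1, 1, 0])                   # -14.0, the coordinate T^{2,2}_1(b)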

There are 27 coordinates to be checked in a generic second basis if you need to


demonstrate tensor character when the tensor coordinates (or coordinate polyno-
mial coefficients) are supplied as formulae involving the basis.
To show that formulae producing coefficients in a basis of a member of T^r_s(V) have tensor character involves n^{r+s} potentially painful calculations involving coordinates in a generic second basis and the associated matrices of transition, the boxed equations in Section 7.
It is fortunate that, in applications, it is rare to find tensors for which either of n or r + s exceeds 4: that would still, at worst, require 256 calculations, though symmetries among the coordinates can cut this down considerably.
Taking an alternative point of view, if it is a theory that the procedure which
produces coordinates in a basis should have tensor character, then the value of each
coordinate in each basis is a prediction: it provides an opportunity for experimental
verification of the theory.
Here’s a simple example of a procedure that fails to produce coefficients with
tensor character, using the same bases a and b.
Let us suppose that we define a multilinear (actually, this one is linear) function to be the sum of the first three coordinates of a covector in a basis. So using basis a the corresponding coordinate polynomial is x_1 + x_2 + x_3 which corresponds to tensor a_1 + a_2 + a_3. But using basis b the procedure produces coordinate polynomial x̄_1 + x̄_2 + x̄_3 which corresponds to tensor b_1 + b_2 + b_3. These tensors are not the same, therefore this process does not define a tensor.
On a cautionary note, we mention a practice common in older tensor treatments
in which tensors are defined using coordinate polynomials only, without explicit
elaboration of the structure we have built from V and V ∗ .
You might see symbols T^{ij} and S^{kl} named as contravariant tensors. In our language, these are the coefficients of tensors T and S in a basis, and when covectors x, y, u and w are represented in terms of that basis

T(x, y) = T^{ij} x_i y_j and S(u, w) = S^{kl} u_k w_l.

Note that all the numbers on the right side of both equations above are dependent on the explicit basis, but tensor character is invoked and we conclude that the sum of these products is invariant. In a new "barred basis"

T(x, y) = T̄^{ij} x̄_i ȳ_j and S(u, w) = S̄^{kl} ū_k w̄_l.
Usually, T and T(x, y) and S and S(u, w) are not mentioned explicitly, nor are the bases named. You are supposed to simply understand they are there.
Tensor operations are defined on these coordinate polynomials alone. Sometimes
it is understood that the index i is associated with x, j with y, k with u and l with
w. When an index symbol is associated with a variable symbol, statements can
arise that seem odd to us.
For instance, the tensor products T^{ij} S^{kl} and S^{kl} T^{ij} refer to the coordinate polynomials

(T^{ij} x_i y_j)(S^{kl} u_k w_l) and (S^{kl} u_k w_l)(T^{ij} x_i y_j)

which are the same by the ordinary commutative and distributive properties of real
numbers. Tensor product commutes using these conventions! Of course, what is
happening behind the scenes is that you have decided on an order for your four
covectors (perhaps alphabetical) and are simply sticking to it.
Many people find the silent mental gymnastics needed to keep the older notation
straight to be unsettling, and tend to lose track in really complex settings. The
gyrations we have chosen to replace them are, at least, explicit.

9. Relative Tensors

Certain types of processes that define a linear or multilinear function for each basis don't seem to be that useful or reflect much other than the features of the particular basis. For instance if v = v^i(a) a_i ∈ V the functions

f_a, g_a : V → R given by f_a(v) = \sum_{i=1}^n v^i(a) and g_a(v) = v^1(a)

(same formula for any basis) are both linear but are not covectors.
They correspond to linear coordinate polynomials
x^1 + · · · + x^n and x^1 in any basis.

They certainly might be useful for something (giving us useful information about
each basis perhaps) but whatever that something might be it will probably not
describe “physical law.”
But among those processes that define a linear or multilinear function for each
basis, it is not only tensors which have physical importance or which have consis-
tent, basis-independent properties. We will discuss some of these entities (relative
tensors) after a few paragraphs of preliminary material.
The set of matrices of transition with matrix multiplication forms a group called
the general linear group over R. We denote this set with this operation by
GL(n, R).
A real valued function f : GL(n, R) → R is called multiplicative if f (CD) =
f (C)f (D) for every pair of members C and D of GL(n, R).
Any nontrivial (i.e. nonzero) multiplicative function must take the identity matrix to 1, and also a matrix and its inverse matrix must be sent to reciprocal real numbers. Any such function must satisfy f(A^{-1}BA) = f(B), for instance.
Given two nontrivial multiplicative functions, both their product and their ratio
are multiplicative. If a nontrivial multiplicative function is always positive, any real
power of that function is multiplicative.
In this section we will be interested in several kinds of multiplicative functions.
First, let det denote the determinant function on square matrices. Its properties are explored in any linear algebra text. We find in [8] p.154 that det(MN) = det(M) det(N) for compatible square matrices M and N. Since det(I) = 1 this yields det(M^{-1}) = (det(M))^{-1} when M is invertible. We then find that det(M^{-1}NM) = det(N).


Determinant is an invariant under similarity transformations.


The determinant is also multiplicative on GL(n, R).
So is the absolute value of the determinant function, |det|, and also the ratio sign = |det|/det. This last function is 1 when applied to matrices with positive determinant and −1 otherwise.
If w is any real number, the functions f and g defined by

f(C) = sign(C) |det(C)|^w and g(C) = |det(C)|^w

are multiplicative and, in fact, these functions exhaust the possibilities among continuous multiplicative functions³ on GL(n, R).
f (C) is negative exactly when det(C) is negative, and is called an odd weighting
function with weight w with reference to this fact. g is always positive, and is
called an even weighting function with weight w.
A relative tensor is defined using ordinary tensors but with a different trans-
formation law when going to a new coordinate system. These transformation laws
take into account how volumes could be measured in V , or the “handedness” of
a coordinate basis there, and relative tensors are vital when theories of physical
processes refer to these properties in combination with multilinearity.
Specifically, one measures or calculates or otherwise determines numbers called coordinates of a proposed relative tensor T in one basis a, the numbers T^{i_1,...,i_r}_{j_1,...,j_s}(a), and in this basis the tensor density T is the multilinear function

T = T^{i_1,...,i_r}_{j_1,...,j_s}(a) a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{j_1} ⊗ · · · ⊗ a^{j_s}.

The difference is that when switching to a new coordinate system b the coordinates of a relative tensor change according to one of the two formulae

T^{k_1,...,k_r}_{h_1,...,h_s}(b) = sign(B) |det(B)|^w T^{i_1,...,i_r}_{j_1,...,j_s}(a) B^{k_1}_{i_1} · · · B^{k_r}_{i_r} A^{j_1}_{h_1} · · · A^{j_s}_{h_s}

or

T^{k_1,...,k_r}_{h_1,...,h_s}(b) = |det(B)|^w T^{i_1,...,i_r}_{j_1,...,j_s}(a) B^{k_1}_{i_1} · · · B^{k_r}_{i_r} A^{j_1}_{h_1} · · · A^{j_s}_{h_s}.

Relative tensors of the first kind are called odd relative tensors of weight w
while tensors of the second kind are called even relative tensors of weight w.
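For a relative tensor of rank 0 (a relative scalar) the transformation is just multiplication by the weighting function, which makes a tiny numerical sketch possible (the numbers are invented; compare the dust and volume discussion below):

    import numpy as np

    B = 0.5 * np.eye(3)              # transition from a to b when each b vector is twice as long
    detB = np.linalg.det(B)          # 1/8
    w = 1                            # weight

    rho_a = 4.0                      # coordinate of an even relative scalar of weight w in basis a
    rho_b = abs(detB) ** w * rho_a
    print(rho_b)                     # 0.5

    # An odd relative scalar of the same weight also carries the sign of det(B).
    print(np.sign(detB) * abs(detB) ** w * rho_a)    # 0.5 here, since det(B) > 0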
If C is the matrix of transition from ordered basis b to ordered basis c then CB is the matrix of transition from ordered basis a directly to ordered basis c.
Because of this, it is not hard to show that the coordinates T^{k_1,...,k_r}_{h_1,...,h_s}(c) for a relative tensor in ordered basis c can be derived from the coordinates in ordered
basis a directly, or through the intermediary coordinates in ordered basis b using
two steps: that is, the multiplicative nature of the weight function is exactly what
we need for the coordinate change requirement to be consistent.
An even relative tensor of weight 0 is just a tensor, so the concept of relative
tensor generalizes that of tensor. If it needs to be emphasized, ordinary tensors
are called absolute tensors to distinguish them from relative tensors with more
interesting weighting functions.
3See [2] page 349.

This subject has a long history and many applications, so a lot of vocabulary is
to be expected.
Odd relative tensors are described variously (and synonymously) as
• axial tensors
• pseudotensors
• twisted tensors
• oriented tensors.

The “axial” adjective seems more common when the relative tensor is of rank 0
or 1: axial scalars or axial vectors or covectors, but sometimes you will see
these relative tensors referred to as pseudoscalars or pseudovectors or pseu-
docovectors.
An even relative tensor (for example an ordinary tensor) is sometimes called a
polar tensor.
If the weight w is positive (usually 1) the phrase tensor density is likely to be
used for the relative tensor, while if the weight is negative (often −1) we take a
tip from [19] and call these tensor capacities, though many sources refer to any
relative tensor of nonzero weight as a tensor density.
These adjectives can be combined: e.g., axial vector capacity or pseudoscalar
density.
The columns of the change of basis matrix B represent the old basis vectors a in terms of the new basis vectors b. The number |det(B)| is often interpreted as measuring the "relative volume" of the parallelepiped formed with edges along basis a as compared to the parallelepiped formed using b as edges. So, for example, if the new basis vectors b are all just twice as long as basis vectors a, then |det(B)| = 2^{-n}.
With that situation in mind, imagine V to be filled with uniformly distributed dust. Since the unit of volume carved out by basis b is so much larger, the measured density of dust using basis b would be greater by a factor of 2^n than that measured using basis a. On the other hand, the numerical measure assigned to the "volume"
or “capacity” of a given geometrical shape in V would be smaller by just this factor
when measured using basis b as compared to the number obtained using basis a as
a standard volume.
The vocabulary “tensor density” for negative weights and “tensor capacity” for
positive weights acknowledges and names these (quite different) behaviors.
One can form the product of relative tensors (in a specified order) to produce
another relative tensor. The weight of the product is the sum of the weights of the
factors. The product relative tensor will be odd exactly when an odd number of
the factor relative tensors are odd.
Many of the constructions involving tensors apply also to relative tensors with
little or no change. These new objects will be formed by applying the construction
to the “pure tensor” part of the relative tensor and applying the ordinary distribu-
tive property (or other arithmetic) to any weighting function factors, as we did in
the last paragraph.

10. An Identification of Hom(V, V) and T^1_1(V)

There is an identification of the vector space Hom(V, V) of all linear functions F : V → V and the tensor space T^1_1 which we describe now.
Whenever you see a function with vector output called a tensor, and they are common, it is this association to which they refer.
Define Ψ(F) : V^* × V → R to be the function Ψ(F)(ω, v) = ω(F(v)).
Ψ(F) is obviously multilinear and Ψ(F + cG) = Ψ(F) + cΨ(G) for any other linear map G : V → V and any scalar c. So we have created a member of T^1_1 and a homomorphism from Hom(V, V) into T^1_1, and each space has dimension n².
If Ψ(F) is the zero tensor then ω(F(v)) = 0 for every ω and v so F is the zero transformation. That means the kernel of Ψ is trivial and we have created an isomorphism Ψ : Hom(V, V) → T^1_1. You will note that no reference was made to a basis in the definition of Ψ.
Consider the map Θ : T11 → Hom(V, V ) given, for T = Tji (a) ai ⊗ aj ∈ T11 , by
Θ(T )(v) = Tji (a) v j ai = Tji (a) aj (v) ai whenever v = v j aj .
If v = v j aj and ω = ωj aj
(Ψ ◦ Θ)(T )(ω, v) = ω( Θ(T )(v) ) = ωk ak Tji (a) aj (v) ai = ωi Tji (a) v j
= Tji (a) ai ⊗ aj (ω, v) = T (ω, v).
It follows that Θ = Ψ−1 .
It may not be obvious that the formula for Θ is independent of basis, but it must
be since the definition of Ψ is independent of basis, and the formula for Θ produces
Ψ−1 in any basis.
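The identification can be checked numerically. The sketch below is not part of the original text: it uses Python with numpy, identifies V with R^n through a fixed basis a, and takes the matrix of F in that basis as the coordinates T^i_j(a) of Ψ(F).

    import numpy as np

    n = 3
    rng = np.random.default_rng(0)

    F = rng.normal(size=(n, n))       # matrix of a linear map F : V -> V in the basis a
    T = F                             # coordinates T^i_j(a) of Psi(F)

    omega = rng.normal(size=n)        # coordinates of a covector omega = omega_i a^i
    v = rng.normal(size=n)            # coordinates of a vector v = v^j a_j

    lhs = omega @ (F @ v)                      # omega(F(v))
    rhs = np.einsum('ij,i,j->', T, omega, v)   # T^i_j omega_i v^j
    assert np.isclose(lhs, rhs)                # Psi(F)(omega, v) = omega(F(v))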

11. Contraction and Trace

At this point we are going to define a process called contraction which allows us
to reduce the order of a relative tensor by 2: by one covariant and one contravariant
order value. We will initially define this new tensor by a process which uses a basis.
Specifically, we consider T ∈ Tsr (V ) where both r and s are at least 1. Select
one contravariant index position α and one covariant index position β. We will
define the tensor Cα β T, called the contraction of T with respect to the αth
contravariant and βth covariant index.
Define in a particular ordered basis a the numbers
H^{i_1,...,i_{r-1}}_{j_1,...,j_{s-1}}(a) = T^{i_1,...,i_{α-1},k,i_α,...,i_{r-1}}_{j_1,...,j_{β-1},k,j_β,...,j_{s-1}}(a).

This serves to define real numbers for all the required index values in T^{r-1}_{s-1}(V),
and so these real numbers H^{i_1,...,i_{r-1}}_{j_1,...,j_{s-1}}(a) do define some tensor in T^{r-1}_{s-1}(V).
r−1

Any process, including this one, that creates a multilinear functional in one
ordered basis defines a tensor: namely, the tensor obtained in other bases by trans-
forming the coordinates according to the proper pattern for that order of tensor.

It is worth belaboring the point raised in Section 8: that is not what people
usually mean when they say that a process creates or “is” a tensor. What they
usually mean is that the procedure outlined, though it might use a basis
inter alia, would produce the same tensor whichever basis had been
initially picked. We illustrate this idea by demonstrating the tensor character of
contraction.
Contracting T in the αth contravariant against the βth covariant coordinate in
the ordered basis b gives
H^{w_1,...,w_{r-1}}_{h_1,...,h_{s-1}}(b) = T^{w_1,...,w_{α-1},k,w_α,...,w_{r-1}}_{h_1,...,h_{β-1},k,h_β,...,h_{s-1}}(b)
  = T^{i_1,...,i_r}_{j_1,...,j_s}(a) B^{w_1}_{i_1} ··· B^{w_{α-1}}_{i_{α-1}} B^{k}_{i_α} B^{w_α}_{i_{α+1}} ··· B^{w_{r-1}}_{i_r} A^{j_1}_{h_1} ··· A^{j_{β-1}}_{h_{β-1}} A^{j_β}_{k} A^{j_{β+1}}_{h_β} ··· A^{j_s}_{h_{s-1}}.

We now note that B^{k}_{i_α} A^{j_β}_{k} = δ^{j_β}_{i_α}. So we make four changes to the last line:

• Factor out the sum B^{k}_{i_α} A^{j_β}_{k} and replace it with δ^{j_β}_{i_α}.
• Set q = j_β = i_α to eliminate δ^{j_β}_{i_α} from the formula.
• Define ī_m = i_m for m = 1, ..., α−1 and ī_m = i_{m+1} for m = α, ..., r−1.
• Define j̄_m = j_m for m = 1, ..., β−1 and j̄_m = j_{m+1} for m = β, ..., s−1.

The last line in the equation from above then becomes

T^{ī_1,...,ī_{α−1},q,ī_α,...,ī_{r−1}}_{j̄_1,...,j̄_{β−1},q,j̄_β,...,j̄_{s−1}}(a) B^{w_1}_{ī_1} ··· B^{w_{r−1}}_{ī_{r−1}} A^{j̄_1}_{h_1} ··· A^{j̄_{s−1}}_{h_{s−1}}
  = H^{ī_1,...,ī_{r−1}}_{j̄_1,...,j̄_{s−1}}(a) B^{w_1}_{ī_1} ··· B^{w_{r−1}}_{ī_{r−1}} A^{j̄_1}_{h_1} ··· A^{j̄_{s−1}}_{h_{s−1}}.

This is exactly how the coordinates of H must change in going from ordered basis
a to b if the contraction process is to generate the same tensor when applied
to the coordinates of T in these two bases. Contraction can be applied to the
representation of a tensor in any ordered basis and will yield the same tensor.
So we define the tensor C^α_β T by

(C^α_β T)^{···}_{···} = H^{···}_{···}.

Whenever T ∈ T^1_1 (for instance, a tensor such as Ψ(F) from Section 10 above)
we have an interesting case. Then the number (C^1_1 T)(a) = T^k_k(a) = T(a^k, a_k) is
called the trace of T (or the trace of F : V → V when T = Ψ(F) as formed in the
previous section.)

We have just defined a number trace(T) = C^1_1 T ∈ T^0_0 = R for each member T
of T^1_1(V). No basis is mentioned here: the trace does not depend on the basis. It
is an invariant.
This usage of the word “trace” is related to the trace of a matrix as follows.
Suppose θ ∈ V* and v ∈ V. In the ordered basis a the tensor T is T^i_j(a) a_i ⊗ a^j,
and θ and v have representations as θ = θ_i a^i and v = v^i a_i. So if [T^i_j(a)] is the
matrix with ijth entry T^i_j(a),

T(θ, v) = T^i_j(a) a_i ⊗ a^j (θ, v) = T^i_j(a) θ_i v^j
        = (θ_1, ..., θ_n) [T^i_j(a)] (v^1, ..., v^n)^t
        = A*(θ) [T^i_j(a)] A(v).

In other words, down in R^n the tensor T^i_j(a) e_i ⊗ e^j is, essentially, “matrix
multiply in the middle” with the representative row matrix A*(θ) of θ on the left
and the column A(v) on the right. trace(T) is the ordinary trace of this “middle
matrix,” the sum of its diagonal entries.

Two matrices P and Q are said to be similar if there is an invertible matrix M
with P = M^{-1} Q M, and the process of converting Q into P in this way is called a
similarity transformation.

If we change basis to b the “matrix in the middle” from above becomes

[T^i_j(b)] = [B^i_w T^w_h(a) A^h_j] = B [T^i_j(a)] A = A^{-1} [T^i_j(a)] A.

Any invertible matrix A (this requires a tiny proof ) can be used to generate a
new ordered basis b of V , and we recover by direct calculation the invariance of
trace under similarity transformations.
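A quick numerical check of this invariance, offered as a sketch rather than anything from the text: it uses numpy, a random “middle matrix” for T in basis a, and a random invertible matrix A of transition with B = A^{-1}.

    import numpy as np

    n = 4
    rng = np.random.default_rng(1)

    T_a = rng.normal(size=(n, n))   # the "middle matrix" of T in basis a
    A = rng.normal(size=(n, n))     # matrix of transition (invertible with probability 1)
    B = np.linalg.inv(A)

    T_b = B @ T_a @ A               # the "middle matrix" in basis b: A^{-1} T_a A
    assert np.isclose(np.trace(T_a), np.trace(T_b))   # trace is invariant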

It is interesting to see how this relates to the linear transformation F : V → V


and associated Ψ(F ) and the matrix of F under change of basis, and we leave that
comparison to the reader.

If T ∈ T^r_s(V) is any tensor with 0 < α ≤ r and 0 < β ≤ s, the contraction C^α_β T
is related to the trace of a certain matrix, though it is a little harder to think about
with all the indices floating around. First, we fix all but the αth contravariant and
βth covariant index, thinking of all the rest as constant for now. We define the
matrix M(a) to be the matrix with ijth entry

M^i_j(a) = T^{i_1,...,i_{α-1},i,i_{α+1},...,i_r}_{j_1,...,j_{β-1},j,j_{β+1},...,j_s}(a).

So for each index combination with entries fixed except in these two spots, we
have a member of T^1_1(V) and M(a) is the matrix of its representative, using ordered
basis a, in T^1_1(R^n). The contraction is the trace of this matrix: the trace
of a different matrix for each index combination.
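That description translates directly into the following sketch, which is not from the text; it uses numpy, identifies V with R^n through the basis a, and stores the contravariant axis of T first.

    import numpy as np

    n = 3
    rng = np.random.default_rng(2)

    # T^i_{jk}(a): axis 0 is the contravariant slot, axes 1 and 2 are covariant slots.
    T = rng.normal(size=(n, n, n))

    # Contract the contravariant index against the first covariant index.
    C = np.trace(T, axis1=0, axis2=1)    # C_k = T^m_{mk}

    # The same thing by hand: for each fixed k, the trace of the matrix M^i_j = T^i_{jk}.
    C_by_hand = np.array([np.trace(T[:, :, k]) for k in range(n)])
    assert np.allclose(C, C_by_hand)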

Contraction of relative tensors proceeds in an identical fashion. Contraction


does not change the weight of a relative tensor, and the properties “even” or “odd”
persist, unchanged by contraction.

12. Evaluation

Suppose T ∈ T^r_s(V). By definition, T is a multilinear function defined on V^r_s =
V* × ··· × V* × V × ··· × V, where there are r copies of V* followed by s copies of
V in the domain.
If θ ∈ V ∗ and 1 ≤ k ≤ r we can evaluate T at θ in its kth slot yielding the
function
E(θ, k)(T ) = T (· · · , θ, · · · )
where the dots indicate remaining “unfilled” slots of T , if any.
It is obvious that E(θ, k)(T ) is also a tensor, and a member of Tsr−1 (V).
Suppose a is an ordered basis of V and θ = θ_i(a) a^i. Then

T = T^{i_1,...,i_r}_{j_1,...,j_s}(a) a_{i_1} ⊗ ··· ⊗ a_{i_r} ⊗ a^{j_1} ⊗ ··· ⊗ a^{j_s}

so

E(θ, k)(T)(a)
  = T^{i_1,...,i_r}_{j_1,...,j_s}(a) a_{i_1} ⊗ ··· ⊗ a_{i_{k-1}} ⊗ a_{i_k}(θ) ⊗ a_{i_{k+1}} ⊗ ··· ⊗ a_{i_r} ⊗ a^{j_1} ⊗ ··· ⊗ a^{j_s}
  = T^{i_1,...,i_r}_{j_1,...,j_s}(a) θ_{i_k}(a) a_{i_1} ⊗ ··· ⊗ a_{i_{k-1}} ⊗ a_{i_{k+1}} ⊗ ··· ⊗ a_{i_r} ⊗ a^{j_1} ⊗ ··· ⊗ a^{j_s}
  = G^{i_1,...,i_{r-1}}_{j_1,...,j_s}(a) a_{i_1} ⊗ ··· ⊗ a_{i_{r-1}} ⊗ a^{j_1} ⊗ ··· ⊗ a^{j_s}

where

G^{i_1,...,i_{r-1}}_{j_1,...,j_s}(a) = T^{i_1,...,i_{k-1},w,i_k,...,i_{r-1}}_{j_1,...,j_s}(a) θ_w(a).
G = E(θ, k)(T ) is the tensor you would obtain by taking the tensor product of
T with θ and then contracting in the appropriate indices:
G = E(θ, k)(T) = C^k_{s+1}(T ⊗ θ).

By a calculation identical to the above, evaluation at a member v ∈ V in slot
r + k, one of the last s slots of T, is a tensor in T^r_{s-1}(V), and that tensor is

H = E(v, k)(T) = C^{r+1}_k(T ⊗ v).

In ordered basis a, if v = v^i(a) a_i then H has coordinates

H^{i_1,...,i_r}_{j_1,...,j_{s-1}}(a) = T^{i_1,...,i_r}_{j_1,...,j_{k-1},w,j_k,...,j_{s-1}}(a) v^w(a).
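The claim that evaluation is a tensor product followed by a contraction can be checked numerically; the sketch below is an illustration under assumptions of my own (numpy, a T^1_1 tensor, evaluation in its single contravariant slot), not a construction from the text.

    import numpy as np

    n = 3
    rng = np.random.default_rng(3)

    T = rng.normal(size=(n, n))       # coordinates T^i_j(a)
    theta = rng.normal(size=n)        # coordinates theta_i(a) of a covector

    # Evaluate T at theta in its contravariant slot ...
    E = np.einsum('ij,i->j', T, theta)          # E_j = T^w_j theta_w

    # ... and compare with forming T (x) theta, then contracting the new
    # covariant slot against the contravariant slot of T.
    prod = np.einsum('ij,k->ijk', T, theta)     # (T (x) theta)^i_{jk}
    E2 = np.trace(prod, axis1=0, axis2=2)       # contract i against k
    assert np.allclose(E, E2)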

By combining the two operations, we see that evaluating T in any number of


its slots at specific choices of members of V or V ∗ (as appropriate for the slots
involved) will produce a tensor. If k slots are to be filled, that tensor can be formed
by the k-fold tensor product of T with the specified members of V or V ∗ , followed
by k contractions.
We make an obvious remark: if we evaluate a tensor at all of its slots, we get
a tensor of order 0, an invariant or (synonymous) scalar. If we evaluate the
representation of that same tensor at the representations of those same vectors in
any basis we get the same numerical output.
This remark has a converse.
Recall that the coordinates in a basis of a member of V^r_s form a member of
(R^n)^r_s. Suppose you determine a multilinear function for each basis of V which acts on
(R^n)^r_s. Suppose further that, for each v ∈ V^r_s, if you evaluate the multilinear
function corresponding to a basis at the coordinates corresponding to v in that
basis you always get the same (that is, basis independent) numerical output. Then
the process by which you created the multilinear function in each basis has tensor
character: you have defined a tensor in T^r_s(V).
Some sources find this observation important enough to name, usually calling it
(or something equivalent to it) The Quotient Theorem.4

4 Our statement has a generalization. Suppose that we identify a fixed collection of slots of the
multilinear function defined as above using a basis. Suppose that whenever you fill those particular
slots with coordinates of vectors/covectors in a basis the resulting contracted multilinear function
is independent of the basis: had you used coordinates in a different basis the new coefficients
would be the same as the old ones, properly transformed so as to have tensor character. Then the
means by which you formed the original coefficients has tensor character.
Combining our previous remarks about products and contraction of relative ten-
sors, we see that evaluation of a relative tensor on vectors or covectors can be
performed at any of its slots, and the process does not change the weight of a
relative tensor. The properties “even” or “odd” persist unchanged by evaluation.
Evaluation of a relative tensor on relative vectors or relative covectors could
change both weight and whether the resulting relative tensor is even or odd, and
we leave it to the reader to deduce the outcome in this case.

13. Symmetry and Antisymmetry

It is obvious that, for a tensor of order 2, T (x, y) need not bear any special
relation to T (y, x). For one thing, T (x, y) and T (y, x) won’t both be defined unless
the domain is either V02 or V20 . But even then they need not be related in any
specific way.
A covariant or contravariant relative tensor T of order L is called symmetric
when switching exactly two of its arguments never changes its value. Specifically,
for each pair of distinct positive integers j and k not exceeding L and all v ∈ VL0
or V0L , whichever is the domain of T,
T (v(1), . . . , v(L))
= T (v(1), . . . , v(j − 1), v(k), v(j + 1), . . . , v(k − 1), v(j), v(k + 1), . . . , v(L)).

A covariant or contravariant relative tensor is called (any or all of) antisym-


metric, alternating or skew symmetric if switching any two of its arguments
introduces a minus sign but otherwise does not change its value. Specifically:
T (v(1), . . . , v(L))
= − T (v(1), . . . , v(j − 1), v(k), v(j + 1), . . . , v(k − 1), v(j), v(k + 1), . . . , v(L)).

The following facts about permutations are discussed in many sources, including
[9] p. 46. A permutation of the set {1, . . . , L}, the set consisting of the first
L positive integers, is a one-to-one and onto function from this set to itself. Let
PL denote the set of all permutations on this set. The composition of any two
permutations is also a permutation. There are L! distinct permutations of a set
containing L elements. Any permutation can be built by switching two elements
of the set at a time gradually building up the final permutation by composition of


a sequence of such switches. A “switch,” involving as it does exactly two distinct
elements, is also called a 2-cycle. You might be able to build P by this process
in many ways. If you can build a permutation as the composition of an even
number of 2-cycles it is called an even permutation while if you can build a
permutation as the composition of an odd number of 2-cycles it is called an odd
permutation. It is a fact that a permutation is even or odd but cannot be both.
There are exactly L!/2 even permutations and L!/2 odd ones. If P is even, define
sgn(P) = 1 and if P is odd define sgn(P) = −1. For any permutations P and Q
we have sgn(P ◦ Q) = sgn(P ) sgn(Q).
If P is a permutation of {1, . . . , L} and T is a covariant or contravariant relative
tensor of order L let TP denote the tensor whose values are given by
TP (v(1), . . . , v(L)) = T (v(P (1)), . . . , v(P (L))).
In other words, TP is the relative tensor obtained from T by reordering the argu-
ments using the permutation P . The function that sends T to TP is sometimes
called a braiding map. Braiding maps are isomorphisms.
Using this notation, T is symmetric precisely if TP = T whenever P is a 2-cycle.
T is alternating precisely when TP = −T whenever P is a 2-cycle.
Since every permutation is a composition of 2-cycles, this means that T is sym-
metric precisely when TP = T for all permutations P , while T is alternating pre-
cisely if TP = sgn(P )T for any permutation P .
A covariant or contravariant relative tensor T of order L can undergo a process
called symmetrization. It is defined in a coordinate-independent way by
Sym(T) = (1/L!) Σ_{P∈P_L} T_P

where the sum is over all L! members of the set of permutations P_L of {1, ..., L}.
Antisymmetrization is accomplished by the coordinate-independent formula

Alt(T) = (1/L!) Σ_{P∈P_L} sgn(P) T_P.

It is sometimes said that Sym(T ) is the symmetric part of T , while Alt(T ) is


said to be the skew part of T .
For instance if a^1, a^2 are 1-forms then

Sym(a^1 ⊗ a^2) = (1/2)( a^1 ⊗ a^2 + a^2 ⊗ a^1 )   and   Alt(a^1 ⊗ a^2) = (1/2)( a^1 ⊗ a^2 − a^2 ⊗ a^1 )

and if a^3 is a third 1-form then

Alt(a^1 ⊗ a^2 ⊗ a^3) = (1/6)( a^1 ⊗ a^2 ⊗ a^3 − a^2 ⊗ a^1 ⊗ a^3 + a^2 ⊗ a^3 ⊗ a^1
                              − a^3 ⊗ a^2 ⊗ a^1 + a^3 ⊗ a^1 ⊗ a^2 − a^1 ⊗ a^3 ⊗ a^2 ).
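These operations are easy to realize numerically. The following is a small sketch of my own (numpy plus itertools, covariant tensors stored as multidimensional coordinate arrays over a basis), not code from the text.

    import numpy as np
    from itertools import permutations
    from math import factorial

    def sign(p):
        # sign of a permutation of 0, ..., L-1 given as a tuple
        s, a = 1, list(p)
        for i in range(len(a)):
            while a[i] != i:
                j = a[i]
                a[i], a[j] = a[j], a[i]
                s = -s
        return s

    def sym(T):
        L = T.ndim
        return sum(np.transpose(T, p) for p in permutations(range(L))) / factorial(L)

    def alt(T):
        L = T.ndim
        return sum(sign(p) * np.transpose(T, p) for p in permutations(range(L))) / factorial(L)

    rng = np.random.default_rng(4)
    a1, a2, a3 = rng.normal(size=(3, 4))            # three 1-forms on a 4-dimensional V
    T = np.einsum('i,j,k->ijk', a1, a2, a3)         # a1 (x) a2 (x) a3
    A = alt(T)
    assert np.allclose(A, -np.transpose(A, (1, 0, 2)))   # Alt(T) is alternating
    assert np.allclose(alt(A), A)                        # Alt is a projection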
Both of these operations are linear when viewed as functions on the space of
tensors of a given order. Applied to a representation of T as T = T_{i_1,...,i_L}(a) a^{i_1} ⊗
··· ⊗ a^{i_L} in a basis, this gives Sym(T) = T_{i_1,...,i_L}(a) Sym(a^{i_1} ⊗ ··· ⊗ a^{i_L}) and
Alt(T) = T_{i_1,...,i_L}(a) Alt(a^{i_1} ⊗ ··· ⊗ a^{i_L}).
In some sources, the Alt operation is indicated on coordinates of a tensor by
using square brackets on the indices, such as T[i1 ,...,iL ] . So
T[i1 ,...,iL ] (a) = Alt(T )i1 ,...,iL (a).
In these sources round brackets may be used to indicate symmetrization:
T(i1 ,...,iL ) (a) = Sym(T )i1 ,...,iL (a).
One also sees these notations spread across tensor products as in the expression
S[i1 ,...,im Tj1 ,...,jL ] , used to indicate the coordinates of Alt(S ⊗ T ).
It is not necessary to involve all indices in these operations. Sometimes you may
see brackets enclosing only some of the indices, as
T[i1 ,...,ik ],ik+1 ,...,iL (a) or T(i1 ,...,ik ),ik+1 ,...,iL (a).
The intent here is that these tensors are calculated on a selection of L vectors
by permuting the vectors in the first k “slots” of T in all possible ways, leaving the
last L − k arguments fixed. After summing the results of this (with factor “−1”
where appropriate) divide by k!.
In the calculation to follow, we let Q = P^{-1} for a generic permutation P, and
note that sgn(P) = sgn(Q). For a particular basis tensor a^{i_1} ⊗ ··· ⊗ a^{i_L},

Alt(a^{i_1} ⊗ ··· ⊗ a^{i_L})(v(1), ..., v(L))
  = (1/L!) Σ_{P∈P_L} sgn(P) (a^{i_1} ⊗ ··· ⊗ a^{i_L})(v(P(1)), ..., v(P(L)))
  = (1/L!) Σ_{P∈P_L} sgn(P) v(P(1))^{i_1} ··· v(P(L))^{i_L}
  = (1/L!) Σ_{Q∈P_L} sgn(Q) v(1)^{i_{Q(1)}} ··· v(L)^{i_{Q(L)}}
  = (1/L!) Σ_{Q∈P_L} sgn(Q) (a^{i_{Q(1)}} ⊗ ··· ⊗ a^{i_{Q(L)}})(v(1), ..., v(L)).

In a basis, if T = T_{i_1,...,i_L}(a) a^{i_1} ⊗ ··· ⊗ a^{i_L} then

Alt(T) = (1/L!) Σ_{Q∈P_L} sgn(Q) T_{i_1,...,i_L}(a) a^{i_{Q(1)}} ⊗ ··· ⊗ a^{i_{Q(L)}}
       = (1/L!) Σ_{Q∈P_L} sgn(Q) T_{i_{Q(1)},...,i_{Q(L)}}(a) a^{i_1} ⊗ ··· ⊗ a^{i_L}.

(Sum on an index pair i_m and i_{Q(q)} whenever m = Q(q).)

So the permutations can be applied to the domain factors or to the


basis superscripts in a basis representation or to the coefficients in a
basis to calculate Alt(T). Obviously, the same fact pertains for Sym(T).
TENSORS (DRAFT COPY) 33

It is worthwhile to check that Sym(T ) is symmetric, while Alt(T ) is alternating.


It is also true that if T is symmetric then Sym(T ) = T and Alt(T ) = 0. If T is
alternating then Sym(T ) = 0 and Alt(T ) = T .
Our choice of scaling constant 1/L! in the definitions of Sym and Alt makes these
linear functions, when restricted to tensors of a specified rank, into projections onto
the spaces of symmetric and alternating tensors of that rank, respectively.
Choice of constant here and in the wedge product below are not uni-
versally adopted, even within the work of a single author, and in each
new source you must note the convention used there. Formulas differ by
various “factorial” factors from text to text as a regrettable consequence.
Note that the coefficients of any covariant 2-tensor T can be written as

T_{i,j} = (T_{i,j} + T_{j,i})/2 + (T_{i,j} − T_{j,i})/2

so any such tensor is the sum of a symmetric and an alternating tensor.
But if the rank exceeds 2 it is not true that every covariant or contravariant
relative tensor is the sum of an alternating with a symmetric tensor, and at this
point it might be a useful exercise to verify the following facts concerning T =
a1 ⊗ a2 ⊗ a3 − a2 ⊗ a1 ⊗ a3 and S = T − Alt(T ).
• T is neither symmetric nor alternating yet Sym(T ) = 0.
• S is not symmetric and not alternating yet Alt(S) = Sym(S) = 0.
• Conclude that S is not the sum of a symmetric with an alternating tensor.
If S and T are both covariant relative tensors, of orders s and t respectively, we
define the wedge product of S with T (in that order) by

S ∧ T = ((s + t)!/(s! t!)) Alt(S ⊗ T).

The wedge product is in some sources called the exterior product, not to be
confused with the ordinary tensor (outer) product.
It is not too hard to show that if S and R are both covariant relative tensors of
order s with the same weighting function and if T is a covariant relative tensor of
order t and if k is a constant then
S ∧ T = (−1)^{st} T ∧ S   and   (kS + R) ∧ T = k S ∧ T + R ∧ T.

If R is a covariant relative tensor of order r we would like to know that (R ∧


S) ∧ T = R ∧ (S ∧ T ): that is, wedge product is associative. This is a bit harder to
show, but follows from a straightforward, if messy, calculation.
In preparation we state the following facts about permutations. Suppose P ∈ P_k
and 0 < k ≤ L. Define P̄ ∈ P_L to be the permutation

P̄(i) = P(i), if i ≤ k;   P̄(i) = i, if L ≥ i > k.

First, any member Q of P_L is Q ∘ P̄^{-1} ∘ P̄ and so can be written in k! different
ways as R ∘ P̄ where R = Q ∘ P̄^{-1} for various P ∈ P_k.
Second, sgn(P̄) = sgn(P) = sgn(P̄^{-1}) and sgn(R ∘ P̄) = sgn(R) sgn(P̄).

Alt( Alt(a^{i_1} ⊗ ··· ⊗ a^{i_k}) ⊗ a^{i_{k+1}} ⊗ ··· ⊗ a^{i_L} )
  = Alt( (1/k!) Σ_{P∈P_k} sgn(P) a^{i_{P(1)}} ⊗ ··· ⊗ a^{i_{P(k)}} ⊗ a^{i_{k+1}} ⊗ ··· ⊗ a^{i_L} )
  = (1/(k! L!)) Σ_{R∈P_L, P∈P_k} sgn(R) sgn(P̄) a^{i_{R∘P̄(1)}} ⊗ ··· ⊗ a^{i_{R∘P̄(k)}} ⊗ a^{i_{R(k+1)}} ⊗ ··· ⊗ a^{i_{R(L)}}
  = (1/(k! L!)) Σ_{R∈P_L, P∈P_k} sgn(R ∘ P̄) a^{i_{R∘P̄(1)}} ⊗ ··· ⊗ a^{i_{R∘P̄(L)}}.

If we focus on a fixed permutation i_{Q(1)}, ..., i_{Q(L)} of the superscripts in the last
double sum above we see that each is repeated k! times, once for permutation
R = Q ∘ P̄^{-1} for each permutation P. So the last double sum above is

(1/(k! L!)) Σ_{Q∈P_L, P∈P_k} sgn(Q) a^{i_{Q(1)}} ⊗ ··· ⊗ a^{i_{Q(L)}}
  = k! (1/(k! L!)) Σ_{Q∈P_L} sgn(Q) a^{i_{Q(1)}} ⊗ ··· ⊗ a^{i_{Q(L)}} = Alt( a^{i_1} ⊗ ··· ⊗ a^{i_L} ).

The obvious modification shows also that

Alt( a^{i_1} ⊗ ··· ⊗ a^{i_k} ⊗ Alt(a^{i_{k+1}} ⊗ ··· ⊗ a^{i_L}) ) = Alt( a^{i_1} ⊗ ··· ⊗ a^{i_L} ).

Suppose R = R_{i_1,...,i_r}(a) a^{i_1} ⊗ ··· ⊗ a^{i_r} and S = S_{j_1,...,j_s}(a) a^{j_1} ⊗ ··· ⊗ a^{j_s} and
T = T_{k_1,...,k_t}(a) a^{k_1} ⊗ ··· ⊗ a^{k_t} are any covariant tensors.
By linearity of Alt and the result above we find that

Alt(Alt(R ⊗ S) ⊗ T ) = Alt(R ⊗ S ⊗ T ) = Alt(R ⊗ Alt(S ⊗ T )).

We remark in passing that the argument above is easily adapted to show that
Sym satisfies:

Sym(Sym(R ⊗ S) ⊗ T ) = Sym(R ⊗ S ⊗ T ) = Sym(R ⊗ Sym(S ⊗ T )).

From these facts about Alt, it follows immediately that

R ∧ (S ∧ T) = ((r + s + t)!/(r! s! t!)) Alt(R ⊗ S ⊗ T) = (R ∧ S) ∧ T.
Since the operation ∧ is associative, we have found that an expression involving
three or more wedge factors, such as R ∧ S ∧ T , is unambiguous. We note also that
this wedge product is an alternating tensor, and its order is the sum of the orders
of its factors.
As a special but common case, if β^1, ..., β^L ∈ V* an induction argument gives

β^1 ∧ ··· ∧ β^L = Σ_{Q∈P_L} sgn(Q) β^{Q(1)} ⊗ ··· ⊗ β^{Q(L)}.

So this product of 1-forms can be evaluated on v_1, ..., v_L ∈ V by

(β^1 ∧ ··· ∧ β^L)(v_1, ..., v_L) = Σ_{Q∈P_L} sgn(Q) (β^{Q(1)} ⊗ ··· ⊗ β^{Q(L)})(v_1, ..., v_L)
                                 = Σ_{Q∈P_L} sgn(Q) β^{Q(1)}(v_1) ··· β^{Q(L)}(v_L) = det( β^i(v_j) )

where det stands for the determinant function5, in this case applied to the L × L
matrix with ith row, jth column entry β^i(v_j).

5 The sum in the last line is usually taken to be the definition of the determinant of an L × L
matrix with these numerical entries.
And it follows that if Q ∈ PL then the wedge product of 1-forms satisfies

β^1 ∧ ··· ∧ β^L = sgn(Q) β^{Q(1)} ∧ ··· ∧ β^{Q(L)}.
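The determinant identity above is easy to confirm numerically; the snippet below is an illustrative sketch of my own (numpy and itertools, with covectors and vectors represented by their coordinate arrays in a basis), not part of the text.

    import numpy as np
    from itertools import permutations

    def sign(p):
        s, a = 1, list(p)
        for i in range(len(a)):
            while a[i] != i:
                j = a[i]
                a[i], a[j] = a[j], a[i]
                s = -s
        return s

    rng = np.random.default_rng(5)
    L, n = 3, 5
    beta = rng.normal(size=(L, n))   # L covectors beta^1, ..., beta^L
    v = rng.normal(size=(L, n))      # L vectors v_1, ..., v_L

    # (beta^1 ^ ... ^ beta^L)(v_1, ..., v_L) from the permutation sum ...
    wedge_value = sum(
        sign(Q) * np.prod([beta[Q[k]] @ v[k] for k in range(L)])
        for Q in permutations(range(L))
    )

    # ... agrees with det of the L x L matrix with entries beta^i(v_j).
    M = beta @ v.T
    assert np.isclose(wedge_value, np.linalg.det(M))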


Every part of the discussion in this section works with minor modifications if
all tensors involved are contravariant rather than covariant tensors, and these do
occur in applications.
As with tensor products, the weight of a wedge product of relative covariant or
contravariant tensors is the sum of the weights of the factors and the wedge product
of relative tensors will be odd exactly when an odd number of the factor relative
tensors are odd.

14. A Basis for Λr (V )

Define Λs (V) to be the subset of Ts0 (V ) consisting of tensors T with Alt(T ) = T :


that is, Λs (V ) is the set of alternating members of Ts0 (V ), a condition trivially
satisfied when s = 0 or 1.
Members of Λs (V ) are called s-forms, specifically the s-forms on V .
The alternating members of T^s_0(V) are denoted Λ_s(V). They are called s-vectors.
Note that 1-forms on V ∗ are identified with V , so Λ1 (V ∗ ) = Λ1 (V ) = V and,
generally Λs (V ) = Λs (V ∗ ).
We confine our work, mostly, to Λs (V ) in this section but invite the reader to
make the obvious notational adaptations needed to state and prove the same results
in Λs (V ).
If S ∈ Λ^s(V) and T ∈ Λ^t(V) there is no reason to suppose that S ⊗ T, which is in
T^0_{s+t}(V), will be alternating. However S ∧ T will be alternating, so wedge product
defines an operation on forms that produces a form as output.
Suppose R ∈ Λr (V ). In an ordered basis a, R is written
R = Ri1 ,...,ir (a) ai1 ⊗ · · · ⊗ air
where, because R is alternating, the coordinates satisfy the antisymmetry condition
Ri1 ,...,ir (a) = sgn(Q) RiQ(1) ,...,iQ(r) (a) for Q ∈ Pr .

Because of antisymmetry, if there is any repeat of an index value within one


term the coefficient of that term must be 0. Nonzero coefficients must be among
those in which the list of index values consists of r distinct integers. Let’s focus on
a particular list i1 , . . . , ir and for specificity we choose them so that 1 ≤ i1 < · · · <
ir ≤ n. A list of indices like this will be called increasing.
The part of the sum given for R in the a coordinates involving any way of
rearranging this particular list of integers is
Σ_{Q∈P_r} R_{i_{Q(1)},...,i_{Q(r)}}(a) a^{i_{Q(1)}} ⊗ ··· ⊗ a^{i_{Q(r)}}   (Sum on Q only.)
  = Σ_{Q∈P_r} R_{i_1,...,i_r}(a) sgn(Q) a^{i_{Q(1)}} ⊗ ··· ⊗ a^{i_{Q(r)}}   (Sum on Q only.)
  = R_{i_1,...,i_r}(a) a^{i_1} ∧ ··· ∧ a^{i_r}.   (This is not a sum.)

We learn two important things from this.


First, Λr (V) is spanned by wedge products of the form
ai1 ∧ · · · ∧ air . (These are increasing indices.)

Second, when we gather together all the terms as above, the coefficient on
ai1 ∧ · · · ∧ air (increasing indices) is the same as the coefficient on ai1 ⊗ · · · ⊗ air
in the representation of alternating R as a member of Tr0 (V ) using or-
dered basis a.
It remains to show that these increasing-index wedge products are linearly inde-
pendent, and so form a basis of Λr (V ).
Suppose we have a linear combination of these tensors
Ti1 ,...,ir (a) ai1 ∧ · · · ∧ air (Sum here on increasing indices only.)
Evaluating this tensor at a particular (a_{j_1}, ..., a_{j_r}) with 1 ≤ j_1 < ··· < j_r ≤ n we
find that all terms are 0 except possibly
Tj1 ,...,jr (a) aj1 ∧ · · · ∧ ajr (aj1 , . . . , ajr ) (This is not a sum.)
Expanding this wedge product in terms of Alt and evaluating, we find that this is
Tj1 ,...,jr (a). So if this linear combination is the zero tensor, every coefficient is 0:
that is, these increasing wedge products form a linearly independent set.
So tensors of the form a^{i_1} ∧ ··· ∧ a^{i_r} where 1 ≤ i_1 < ··· < i_r ≤ n constitute a
basis of Λ^r(V), the standard basis of Λ^r(V) for ordered basis a.

R_{i_1,...,i_r}(a) a^{i_1} ⊗ ··· ⊗ a^{i_r}   (R is alternating, sum here on all indices.)
  = R_{i_1,...,i_r}(a) a^{i_1} ∧ ··· ∧ a^{i_r}.   (Sum here on increasing indices only.)

There are n!/(n−r)! ways to form r-tuples of distinct numbers between 1 and n.
There are r! distinct ways of rearranging any particular list of this kind so there
are n!/((n−r)! r!) ways of selecting r increasing indices from among n index values.

So the dimension of Λ^r(V) is

n!/((n − r)! r!) = (n choose r).

Note that if r = n or r = 0 the dimension is 1. If r > n the dimension of Λr (V )


is 0. Λ1 (V ) = T10 (V ) = V ∗ so if r = 1 this dimension is n, and this is the case when
r = n − 1 as well.
Though not used as often in applications, we might as well consider the space of
symmetric tensors of covariant order r and find it’s dimension too.
Any symmetric covariant order r tensor can be written as a linear combination
of tensors of the form
Sym(ai1 ⊗ · · · ⊗ air )
for arbitrary choice of i1 , . . . , ir between 1 and n.
But any permutation of a particular list choice will yield the same symmetrized
tensor here so we examine the collection of all Sym(ai1 ⊗ · · · ⊗ air ) where i1 , . . . , ir
is a nondecreasing list of indices.
This collection constitutes a linearly independent set by the similar argument
carried out for alternating tensors above, and hence constitutes a basis for the space
of symmetric tensors of order r. We want to count how many lists i1 , . . . , ir satisfy
1 ≤ i1 ≤ i2 ≤ i3 ≤ · · · ≤ ir ≤ n.
At first blush, counting how many ways this can be done seems to be a much
harder proposition than counting the number of strictly increasing sequences as
we did for alternating tensors, but it can be converted into that calculation by
examining the equivalent sequence
1 ≤ i1 < i2 + 1 < i3 + 2 < · · · < ir + r − 1 ≤ n + r − 1.
Thus, the dimension of the space of symmetric tensors of order r is

(n + r − 1)!/((n − 1)! r!) = (n + r − 1 choose r).
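Both counts are easy to check by brute force; the sketch below (itertools and math.comb, my own illustration rather than anything in the text) enumerates the index lists directly.

    from itertools import combinations, combinations_with_replacement
    from math import comb

    n, r = 5, 3

    # strictly increasing index lists <-> the standard basis of the alternating tensors of order r
    assert len(list(combinations(range(1, n + 1), r))) == comb(n, r)

    # nondecreasing index lists <-> the corresponding basis of the symmetric tensors of order r
    assert len(list(combinations_with_replacement(range(1, n + 1), r))) == comb(n + r - 1, r)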

Returning from this symmetric-tensor interlude, there is a simpler formula


(or, at least, one with fewer terms) for calculating the wedge product of
two tensors known, themselves, to be alternating. This involves “shuffle
permutations.”
Suppose s, t and L are positive integers and L = s + t. An s-shuffle in PL
is a permutation P in PL with P (1) < P (2) < · · · < P (s) and also P (s + 1) <
P (s + 2) < · · · < P (L).
It is clear that for each permutation Q there are s! t! different permutations P
for which the sets {Q(1), Q(2), . . . , Q(s)} and {P (1), P (2), . . . , P (s)} are equal, and
only one of these is a shuffle permutation.
We let the symbol Ss,L denote the set of these shuffle permutations.
If a is an ordered basis of V and S and T are alternating,

S ∧ T = (L!/(s! t!)) Alt(S ⊗ T) = (1/(s! t!)) Σ_{P∈P_L} sgn(P) (S ⊗ T)_P
      = (1/(s! t!)) Σ_{P∈P_L} sgn(P) S_{i_1,...,i_s} T_{i_{s+1},...,i_L} ( a^{i_1} ∧ ··· ∧ a^{i_s} ⊗ a^{i_{s+1}} ∧ ··· ∧ a^{i_L} )_P.

This is a huge sum involving indices for which i1 , . . . , is is in increasing order and
also is+1 , . . . , iL is in increasing order, as well as the sum over permutations.
Suppose i1 , . . . , is , is+1 , . . . , iL is a particular choice of such indices.
Although it does not affect the result below, as a practical matter if performing
one of these calculations by hand we note that if any index value among the i1 , . . . , is
is repeated in is+1 , . . . , iL the result after summation on the permutations will be
0 so we may discard such terms and restrict our attention to terms with entirely
distinct indices.
If P and Q are two permutations in PL and {Q(1), Q(2), . . . , Q(s)} is the same set
as {P (1), P (2), . . . , P (s)} then P can be transformed into Q by composition with a
sequence of 2-cycles that switch members of the common set {Q(1), Q(2), . . . , Q(s)},
followed by a sequence of 2-cycles that switch members of the set {Q(s + 1), Q(s +
2), . . . , Q(L)}. Each of these switches, if applied to a term in the last sum above,
would introduce a minus sign factor in one of the two wedge product factors, and
also introduce a minus sign factor into sgn(P ). The net result is no change, so each
permutation can be transformed into an s-shuffle permutation without altering the
sum above. Each shuffle permutation would be repeated s! t! times.
We conclude:

S ∧ T = Σ_{P∈S_{s,L}} sgn(P) (S ⊗ T)_P

when S, T are alternating, and where S_{s,L} is the set of s-shuffle permutations.
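For small orders one can confirm that the shuffle formula and the defining formula for the wedge product agree; the following sketch is my own illustration in numpy (alternating tensors built with an Alt routine as above), not code from the text.

    import numpy as np
    from itertools import permutations
    from math import factorial

    def sign(p):
        s, a = 1, list(p)
        for i in range(len(a)):
            while a[i] != i:
                j = a[i]
                a[i], a[j] = a[j], a[i]
                s = -s
        return s

    def alt(T):
        L = T.ndim
        return sum(sign(p) * np.transpose(T, p) for p in permutations(range(L))) / factorial(L)

    rng = np.random.default_rng(6)
    n, s, t = 3, 2, 2
    L = s + t
    S = alt(rng.normal(size=(n,) * s))    # an alternating covariant tensor of order s
    T = alt(rng.normal(size=(n,) * t))    # an alternating covariant tensor of order t
    ST = np.einsum('ij,kl->ijkl', S, T)   # S (x) T

    full = factorial(L) / (factorial(s) * factorial(t)) * alt(ST)     # definition of S ^ T

    shuffles = [p for p in permutations(range(L))
                if list(p[:s]) == sorted(p[:s]) and list(p[s:]) == sorted(p[s:])]
    shuffle_sum = sum(sign(p) * np.transpose(ST, p) for p in shuffles)
    assert np.allclose(full, shuffle_sum)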

15. Determinants and Orientation

The single tensor a1 ∧ · · · ∧ an spans Λn (V ) when V has dimension n. If σ is any


member of Λn (V ) then, in ordered basis a, σ has only one coordinate σ1,...,n (a). It
is common to skip the subscript entirely in this case and write σ = σ(a) a1 ∧· · ·∧an .
We can be explicit about the effect of σ on vectors in V . Suppose Hj = Ai (Hj )ai
for j = 1, . . . , n are generic members of V . Form matrix H(a) with ijth entry
H(a)ij = Ai (Hj ). H(a) is the n × n matrix formed from the columns A(Hj ).
Suppose that σ ∈ Λn (V ). Then for some constant k,
σ(H1 , . . . , Hn ) = k a1 ∧ · · · ∧ an (H1 , . . . , Hn ) = k det(H(a)).


You may recall various properties we cited earlier from [8] p.154: det(M N ) =
det(M ) det(N ) for compatible square matrices M and N , det(I) = 1 and the fact
that det(N ) = det(M −1 N M ) when M and N are compatible and M invertible.
A couple of other facts are also useful when dealing with determinants.
First, by examining the effect of replacing permutations P by Q = P^{-1} we see
that

det(M) = Σ_{P∈P_n} sgn(P) M^{P(1)}_1 ··· M^{P(n)}_n = Σ_{Q∈P_n} sgn(Q) M^1_{Q(1)} ··· M^n_{Q(n)}

which implies that the determinant of a matrix and its transpose are equal.

Second, define M^{1,...,i−1,i+1,...,n}_{1,...,j−1,j+1,...,n} to be the (n − 1) × (n − 1) matrix obtained from
the square matrix M by deleting the ith row and jth column. In [8] p.146 (and
also in this work on page 41) it is shown that

det(M) = Σ_{i=1}^{n} (−1)^{i+j} M^i_j det( M^{1,...,i−1,i+1,...,n}_{1,...,j−1,j+1,...,n} )   for each j = 1, ..., n.

Calculating the determinant this way, in terms of smaller determinants for a


particular j, is called a Laplace expansion or, more descriptively, “expanding
the determinant around the jth column.” Coupled with the remark concerning the
transpose found above we can also “expand the determinant around any row.”
Returning now from this determinant theory interlude, if Hj = aj for each j then
H(a) is the identity matrix and so det(H(a)) = 1 and we identify the constant k
as σ(a1 , · · · , an ) = σ(a). So for any choice of these vectors Hj

σ(H1 , . . . , Hn ) = σ(a1 , · · · , an ) det(H(a)) = σ(a) det(H(a)).

In addition to its utility as a tool to calculate σ(H1 , . . . , Hn ), this provides a way


of classifying bases of V into two groups, called orientations. Given any particular
nonzero σ, if σ(a1 , . . . , an ) is positive the ordered basis a belongs to one orientation,
called the orientation determined by σ. If σ(a1 , . . . , an ) is negative the ordered
basis a is said to belong to the opposite orientation. This division of the bases
into two groups does not depend on σ (though the signs associated with each group
could switch.)

16. Change of Basis for Alternating Tensors and Laplace Expansion

If we change coordinates to another ordered basis b the coordinates of the r-form


R change as
R_{j_1,...,j_r}(b) = R_{h_1,...,h_r}(a) A^{h_1}_{j_1} ··· A^{h_r}_{j_r}.
We can choose j1 , . . . , jr to be increasing indices if we are thinking of R as a member
of Λr (V ) but the sum over h1 , . . . , hr is over all combinations of indices, not just
increasing combinations. We will find a more convenient expression.
Let’s focus on a particular increasing combination of indices i_1, ..., i_r. Then
each term in the sum on the right involving these particular indices in any specified
order is of the form

R_{i_{Q(1)},...,i_{Q(r)}}(a) A^{i_{Q(1)}}_{j_1} ··· A^{i_{Q(r)}}_{j_r}   (This is not a sum.)
  = R_{i_1,...,i_r}(a) sgn(Q) A^{i_{Q(1)}}_{j_1} ··· A^{i_{Q(r)}}_{j_r}   (This is not a sum.)

for some permutation Q ∈ P_r. You will note that the sum over all permutations
yields R_{i_1,...,i_r}(a) times the determinant of the r × r matrix obtained from A by
selecting rows i_1, ..., i_r and columns j_1, ..., j_r. If you denote this matrix A^{i_1,...,i_r}_{j_1,...,j_r}
we obtain the change of basis formula in Λ^r(V):

R_{j_1,...,j_r}(b) = R_{i_1,...,i_r}(a) det( A^{i_1,...,i_r}_{j_1,...,j_r} )   (Sum on increasing indices only.)

It is worth noting here that if T is an r-vector (an alternating member of T^r_0(V))
then by a calculation identical to the one above we have

T^{j_1,...,j_r}(b) = T^{i_1,...,i_r}(a) det( B^{j_1,...,j_r}_{i_1,...,i_r} )   (Sum on increasing indices only.)

As examples, consider the n-form σ = σ(a) a^1 ∧ ··· ∧ a^n and the n-vector x =
x(a) a_1 ∧ ··· ∧ a_n. The change of basis formula6 gives

σ(b) = σ(a) det(A)   and   x(b) = x(a) det(B).

This implies immediately

b^1 ∧ ··· ∧ b^n = det(B) a^1 ∧ ··· ∧ a^n   and   b_1 ∧ ··· ∧ b_n = det(A) a_1 ∧ ··· ∧ a_n

with interesting consequences when det(B) = det(A) = 1.


Let’s see how this meshes with the calculation shown above for σ evaluated on
the Hj .
Form matrix H(b) from columns B(Hj ). Recall that B(Hj ) = BA(Hj ). This
implies that H(b) = BH(a) and so det(H(b)) = det(B) det(H(a)).
From above,
σ(H1 , . . . , Hn ) = σ(b) det(H(b)) = σ(a) det(A) det(BH(a))
= σ(a) det(A) det(B) det(H(a)) = σ(a) det(H(a)).
We make an important observation: Suppose h1 , . . . , hr are members of V ∗ . If
there is a linear relation among them then one of them, say h1 , can be written in
terms of the others: h1 = c2 h2 + · · · + cr hr . Using standard summation notation
h_1 ∧ ··· ∧ h_r = Σ_{i=2}^{r} c_i h_i ∧ h_2 ∧ ··· ∧ h_r = 0

because each tensor in the sum is alternating and there is a repeated factor in each
wedge product.
On the other hand, suppose b1 , . . . , br is a linearly independent list of members
of V ∗ . Then this list can be extended to an ordered basis of V ∗ dual to an ordered
basis b of V . The n-form b1 ∧ · · · ∧ bn is nonzero so
0 ≠ b^1 ∧ ··· ∧ b^n = (b^1 ∧ ··· ∧ b^r) ∧ (b^{r+1} ∧ ··· ∧ b^n)

which implies that b^1 ∧ ··· ∧ b^r ≠ 0.

We conclude that

h_1 ∧ ··· ∧ h_r = 0 if and only if h_1, ..., h_r is a dependent list in V*.

6Note the similarity to the change of basis formula for a tensor capacity in the case of n-forms,
and tensor density in case of n-vectors. The difference, of course, is that here the determinant
incorporates the combined effect of the matrices of transition, and is not an extra factor.

A similar result holds for s-vectors in Λs (V ∗ ).


We can use the change of basis formula and the associativity of wedge product
to prove a generalization of the Laplace expansion for determinants. The shuffle
permutations come in handy here.
Suppose A is an invertible n × n matrix and b is an ordered basis of V . So A
can be regarded as a matrix of transition to a new basis a.
From above, we see that b_1 ∧ ··· ∧ b_n = det(A) a_1 ∧ ··· ∧ a_n.

But if 0 < r < n, we also have b_1 ∧ ··· ∧ b_n = (b_1 ∧ ··· ∧ b_r) ∧ (b_{r+1} ∧ ··· ∧ b_n).

Applying the change of basis formula to the factors on the right, we see that

(b_1 ∧ ··· ∧ b_r) ∧ (b_{r+1} ∧ ··· ∧ b_n)   (Sum on increasing indices only below.)
  = [ det( A^{j_1,...,j_r}_{1,...,r} ) a_{j_1} ∧ ··· ∧ a_{j_r} ] ∧ [ det( A^{k_1,...,k_{n−r}}_{r+1,...,n} ) a_{k_1} ∧ ··· ∧ a_{k_{n−r}} ].

The nonzero terms in the expanded wedge product consist of all and only those
pairs of increasing sequences for which j_1, ..., j_r, k_1, ..., k_{n−r} constitutes an r-shuffle
permutation of 1, ..., n. So

[ det( A^{j_1,...,j_r}_{1,...,r} ) a_{j_1} ∧ ··· ∧ a_{j_r} ] ∧ [ det( A^{k_1,...,k_{n−r}}_{r+1,...,n} ) a_{k_1} ∧ ··· ∧ a_{k_{n−r}} ]
    (Sum on increasing indices only in the line above.)
  = Σ_{Q∈S_{r,n}} det( A^{Q(1),...,Q(r)}_{1,...,r} ) det( A^{Q(r+1),...,Q(n)}_{r+1,...,n} ) a_{Q(1)} ∧ ··· ∧ a_{Q(n)}
    (Sum on shuffle permutations only in the line above and the line below.)
  = Σ_{Q∈S_{r,n}} sgn(Q) det( A^{Q(1),...,Q(r)}_{1,...,r} ) det( A^{Q(r+1),...,Q(n)}_{r+1,...,n} ) a_1 ∧ ··· ∧ a_n.

We conclude that

det(A) = Σ_{Q∈S_{r,n}} sgn(Q) det( A^{Q(1),...,Q(r)}_{1,...,r} ) det( A^{Q(r+1),...,Q(n)}_{r+1,...,n} ).

Note that the Laplace expansion around the first column appears when r = 1.
More generally, if P is any permutation of 1, ..., n then

det(A) = sgn(P) Σ_{Q∈S_{r,n}} sgn(Q) det( A^{Q(1),...,Q(r)}_{P(1),...,P(r)} ) det( A^{Q(r+1),...,Q(n)}_{P(r+1),...,P(n)} )

which recovers, among other things, the Laplace expansion around any column.
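The shuffle expansion of det(A) above can be tested directly; the following is a sketch of my own (numpy, itertools; the r-shuffle is specified by the increasing list of its first r values), not code from the text.

    import numpy as np
    from itertools import combinations

    n, r = 5, 2
    rng = np.random.default_rng(7)
    A = rng.normal(size=(n, n))

    def sign(p):
        s, a = 1, list(p)
        for i in range(len(a)):
            while a[i] != i:
                j = a[i]
                a[i], a[j] = a[j], a[i]
                s = -s
        return s

    total = 0.0
    for top in combinations(range(n), r):            # the rows Q(1) < ... < Q(r)
        rest = [i for i in range(n) if i not in top]
        Q = list(top) + rest                         # the r-shuffle as a permutation of 0, ..., n-1
        minor1 = A[np.ix_(top, range(r))]            # rows Q(1..r),   columns 1..r
        minor2 = A[np.ix_(rest, range(r, n))]        # rows Q(r+1..n), columns r+1..n
        total += sign(Q) * np.linalg.det(minor1) * np.linalg.det(minor2)

    assert np.isclose(total, np.linalg.det(A))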

17. Evaluation in Λ^s(V) and Λ_s(V): the Interior Product

One often sees a contraction or evaluation operator applied to s-forms or s-


vectors, which we define now.
It is called the interior product, indicated here using the “angle” symbol, ⌟.
It involves one vector and one s-form, or one covector and one s-vector.

If v is a vector and c ∈ Λ^0(V) = R we define the “angle” operation on c,
called the interior product of v with c, by v ⌟ c = 0.
More generally, for θ ∈ Λ^r(V) we define v ⌟ θ, the interior product of v with θ,
to be the contraction of the tensor product of θ against v in the first index of θ.

This is the tensor we called E(v, 1)(θ) in Section 12, and considered as an evaluation
process it is obvious that v ⌟ θ ∈ Λ^{r−1}(V) for each r > 0.

We note that interior product is linear, both in v and θ.
We examine a few cases and extract properties of this operation. We suppose
coordinates below are in a fixed basis a of V.

If θ ∈ Λ^1(V) then v ⌟ θ = θ(v) = θ_i v^i.

If θ = θ_{1,2} a^1 ∧ a^2 ∈ Λ^2(V) we have

v ⌟ θ = θ(v, ·) = θ_{1,2} v^1 a^2 + θ_{2,1} v^2 a^1 = θ_{1,2} ( v^1 a^2 − v^2 a^1 ).

If w is another vector,

w ⌟ (v ⌟ θ) = θ_{1,2} v^1 w^2 − θ_{1,2} v^2 w^1 = −v ⌟ (w ⌟ θ).

This readily generalizes to θ ∈ Λ^r(V) for any r ≥ 1:

v ⌟ (w ⌟ θ) = −w ⌟ (v ⌟ θ)

as can be seen by examining the pair of evaluations θ(v, w, ...) and θ(w, v, ...).
This implies that for any θ

v ⌟ (v ⌟ θ) = 0.

You will note that

v ⌟ (a^1 ∧ a^2) = (v ⌟ a^1) ∧ a^2 + (−1)^1 a^1 ∧ (v ⌟ a^2).

We will prove a similar formula for θ of higher order.


If θ ∈ Λ^r(V) and τ ∈ Λ^s(V) for r and s at least 1 then

v ⌟ (θ ∧ τ) = (v ⌟ θ) ∧ τ + (−1)^r θ ∧ (v ⌟ τ)

We prove the result when v = a_1. Since a is an arbitrary basis and interior
product is linear in v, this will suffice for the general result.

We further presume that θ = a^{i_1} ∧ ··· ∧ a^{i_r} and τ = a^{j_1} ∧ ··· ∧ a^{j_s} for increasing
indices. If we can prove the result in this case then by linearity of wedge product
in each factor and linearity of the evaluation map we will have the result we seek.

We consider four cases. In each case, line (A) is v ⌟ (θ ∧ τ), line (B) is (v ⌟ θ) ∧ τ,
while line (C) is (−1)^r θ ∧ (v ⌟ τ). In each case, the first is the sum of the last two.
Case One: i_1 = 1 and j_1 ≠ 1.

(A) a_1 ⌟ ( a^{i_1} ∧ ··· ∧ a^{i_r} ∧ a^{j_1} ∧ ··· ∧ a^{j_s} ) = a^{i_2} ∧ ··· ∧ a^{i_r} ∧ a^{j_1} ∧ ··· ∧ a^{j_s}.
(B) ( a_1 ⌟ ( a^{i_1} ∧ ··· ∧ a^{i_r} ) ) ∧ a^{j_1} ∧ ··· ∧ a^{j_s} = a^{i_2} ∧ ··· ∧ a^{i_r} ∧ a^{j_1} ∧ ··· ∧ a^{j_s}.
(C) (−1)^r a^{i_1} ∧ ··· ∧ a^{i_r} ∧ ( a_1 ⌟ ( a^{j_1} ∧ ··· ∧ a^{j_s} ) ) = 0.

Case Two: i_1 ≠ 1 and j_1 = 1.

(A) a_1 ⌟ ( a^{i_1} ∧ ··· ∧ a^{i_r} ∧ a^{j_1} ∧ ··· ∧ a^{j_s} )
  = a_1 ⌟ ( (−1)^r a^{j_1} ∧ a^{i_1} ∧ ··· ∧ a^{i_r} ∧ a^{j_2} ∧ ··· ∧ a^{j_s} )
  = (−1)^r a^{i_1} ∧ ··· ∧ a^{i_r} ∧ a^{j_2} ∧ ··· ∧ a^{j_s}.
(B) ( a_1 ⌟ ( a^{i_1} ∧ ··· ∧ a^{i_r} ) ) ∧ a^{j_1} ∧ ··· ∧ a^{j_s} = 0 ∧ a^{j_1} ∧ ··· ∧ a^{j_s} = 0.
(C) (−1)^r a^{i_1} ∧ ··· ∧ a^{i_r} ∧ ( a_1 ⌟ ( a^{j_1} ∧ ··· ∧ a^{j_s} ) )
  = (−1)^r a^{i_1} ∧ ··· ∧ a^{i_r} ∧ a^{j_2} ∧ ··· ∧ a^{j_s}.

Case Three: i_1 ≠ 1 and j_1 ≠ 1.

(A) = (B) = (C) = 0.

Case Four: i_1 = 1 and j_1 = 1.

(A) a_1 ⌟ ( a^{i_1} ∧ ··· ∧ a^{i_r} ∧ a^{j_1} ∧ ··· ∧ a^{j_s} ) = a_1 ⌟ 0 = 0.
(B) ( a_1 ⌟ ( a^{i_1} ∧ ··· ∧ a^{i_r} ) ) ∧ a^{j_1} ∧ ··· ∧ a^{j_s} = a^{i_2} ∧ ··· ∧ a^{i_r} ∧ a^{j_1} ∧ ··· ∧ a^{j_s}.
(C) (−1)^r a^{i_1} ∧ ··· ∧ a^{i_r} ∧ ( a_1 ⌟ ( a^{j_1} ∧ ··· ∧ a^{j_s} ) )
  = (−1)^r a^{i_1} ∧ ··· ∧ a^{i_r} ∧ a^{j_2} ∧ ··· ∧ a^{j_s}
  = (−1)^r (−1)^{r−1} a^{i_2} ∧ ··· ∧ a^{i_r} ∧ a^{i_1} ∧ a^{j_2} ∧ ··· ∧ a^{j_s}
  = (−1) a^{i_2} ∧ ··· ∧ a^{i_r} ∧ a^{j_1} ∧ a^{j_2} ∧ ··· ∧ a^{j_s}.

The interior product is defined for a covector-and-s-vector pair by identical


means, yielding the analogous properties.
Applied to relative tensors, the weight of an interior product of relative tensors
is, as with wedge and tensor products, the sum of the weights of the two factors.
An interior product of relative tensors will be odd when one but not both of the
two factors is odd.
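In coordinates the interior product is just evaluation in the first slot, which makes the sign identities established above easy to verify numerically. The sketch below is my own illustration (numpy, an alternating tensor produced by an Alt routine), not a construction from the text.

    import numpy as np
    from itertools import permutations
    from math import factorial

    def sign(p):
        s, a = 1, list(p)
        for i in range(len(a)):
            while a[i] != i:
                j = a[i]
                a[i], a[j] = a[j], a[i]
                s = -s
        return s

    def alt(T):
        L = T.ndim
        return sum(sign(p) * np.transpose(T, p) for p in permutations(range(L))) / factorial(L)

    def interior(v, theta):
        # v -| theta : evaluate theta at v in its first slot
        return np.tensordot(v, theta, axes=([0], [0]))

    rng = np.random.default_rng(8)
    n = 4
    theta = alt(rng.normal(size=(n, n, n)))   # a 3-form on a 4-dimensional V
    v, w = rng.normal(size=(2, n))

    assert np.allclose(interior(v, interior(w, theta)), -interior(w, interior(v, theta)))
    assert np.allclose(interior(v, interior(v, theta)), 0)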

18. Bilinear Forms

Since V ∗ and V are both n dimensional, there are many isomorphisms between
them. The obvious isomorphism sending basis to dual basis is basis-dependent: it is
a feature of the basis, not the space and will not in general reflect geometry, which
usually comes from a nondegenerate bilinear form on V . This structure encapsulates
physical properties and physical invariants of great interest to practical folk such as
physicists and engineers. Sometimes it will correspond to the analogue of ordinary
Euclidean distance in V . Sometimes it is created by a Lorentzian metric on V ,
coming perhaps from relativistic space-time considerations.
A bilinear form on V is simply a member of T20 (V ).
Suppose a is an ordered basis for V . The coordinates of a bilinear form G in the
ordered basis a are Gi,j (a) = G(ai , aj ). Define the matrix representation of G
with respect to ordered basis a to be the matrix Ga with ijth entry Ga ij = Gi,j (a).

We wish to distinguish between the tensor G with coordinates Gi,j (a) in the
ordered basis a and the matrix Ga with ijth entry Gi,j (a). They refer to distinct
(obviously related) ideas.
G is the tensor

G = G_{i,j}(a) a^i ⊗ a^j.

If v = v^i(a) a_i and w = w^i(a) a_i,

G(v, w) = G( v^i(a) a_i , w^j(a) a_j ) = G_{i,j}(a) v^i(a) w^j(a) = A(v)^t G_a A(w).




Evaluating G(v, w) in Rn as A(v)t Ga A(w) takes advantage of our common


knowledge of how to multiply matrices, but we are careful (and it is important)
to note that the matrix Ga is not the representative tensor of G in ordered basis
a. That representative tensor is
Gi,j (a) ei ⊗ ej .

Any bilinear form can be written as

G = G_{i,j}(a) a^i ⊗ a^j
  = (1/2)( G_{i,j}(a) a^i ⊗ a^j + G_{i,j}(a) a^j ⊗ a^i ) + (1/2)( G_{i,j}(a) a^i ⊗ a^j − G_{i,j}(a) a^j ⊗ a^i ).

The first term is symmetric while the second is alternating.

A bilinear form is called degenerate if there is v ∈ V with


v 6= 0 and G(v, w) = 0 ∀ w ∈ V.
Otherwise the bilinear form is called nondegenerate.
The two most common types of bilinear forms are symplectic forms and inner
products.
A symplectic form on V is an alternating nondegenerate bilinear form on
V . Symplectic forms are used many places but one common application is in the
Hamiltonian formulation of mechanics.
The matrix of a symplectic form (and in fact any 2-form) satisfies Ga t = − Ga :
that is, the matrix is skew symmetric. The fact that a symplectic form is non-
degenerate implies that this matrix is invertible. The inverse matrix is also skew
symmetric.
An inner product on V is a symmetric nondegenerate bilinear form on V .
Sometimes an inner product is called a metric tensor.
A symmetric bilinear form is called non-negative if
G(v, v) ≥ 0 ∀ nonzero v ∈ V
and positive definite if
G(v, v) > 0 ∀ nonzero v ∈ V.

Negative definite and non-positive symmetric bilinear forms are defined by


making the obvious adaptations of these definitions.
We specifically do not require our inner product to be positive definite.
But if an inner product is positive definite it can be used to define lengths of
vectors and angles between vectors by calculations which are identical to the case
of the standard inner product in Rn . Some of the geometrical ideas spawned by
positive definite symmetric G are listed below.
The length of a vector v is defined to be ‖v‖_G = √(G(v, v)). When the inner
product G is understood by all we drop the subscript and write ‖v‖. We define the
angle between nonzero vectors v and w to be

The angle between nonzero v and w:   θ_{v,w} = arccos( G(v, w) / (‖v‖ ‖w‖) ).
It is a bit of work to show that this definition of length scales with linear multiples
as we would expect7 and satisfies the Polarization Identity8, the Parallelogram Law9
and the Triangle10 and Cauchy-Schwarz11 Inequalities, and that an angle is always
defined between pairs of nonzero vectors and never violates our understanding of
how angles should behave.12 We forgo the proofs of these facts.
A vector space endowed with this notion of distance and angle (i.e. one that
comes from, or could have come from, a positive definite inner product) is called
Euclidean.
The matrix of an inner product (and in fact any symmetric covariant tensor of
rank 2) is symmetric: Ga = Ga t . The inverse of the matrix of an inner product is
symmetric too.
If G is nondegenerate then G_a must be invertible. Let G^a be the inverse of G_a.
We will denote the ijth entry of G^a by (G^a)^i_j = G^{i,j}(a).

G_a and G^a are inverse matrices, so

G^{i,k}(a) G_{k,j}(a) = G_{i,k}(a) G^{k,j}(a) = δ^i_j

for each i and j.
Note bene: Matrices have entries Mji in the ith row and jth column. For bilinear
forms we associate these matrix entries with tensor coefficients whose
indices are both high or both low. With this convention we can utilize the
Einstein summation notation in certain calculations we need to perform later.

7 Scaling of Length: ‖cv‖ = |c| ‖v‖ for any c ∈ R and v ∈ V.
8 Polarization Identity: G(v, w) = (1/4)‖v + w‖^2 − (1/4)‖v − w‖^2.
9 Parallelogram Law: ‖v + w‖^2 + ‖v − w‖^2 = 2‖v‖^2 + 2‖w‖^2.
10 Triangle Inequality: ‖v + w‖ ≤ ‖v‖ + ‖w‖.
11 Cauchy-Schwarz Inequality: |G(v, w)| ≤ ‖v‖ ‖w‖.
12 For instance, if v, s and w are nonzero vectors we would hope that θ_{v,w} ≤ θ_{v,s} + θ_{s,w} with
equality when s = cv + dw for positive real numbers c and d.

19. An Isomorphism Induced by a Nondegenerate Bilinear Form

Suppose we are given a symplectic form or inner product G = G_{i,j}(a) a^i ⊗ a^j
with matrix G_a in ordered basis a as indicated above.

The ijth entry of G_a is G_{i,j}(a) and the ijth entry of the inverse matrix G^a
is a number we have denoted G^{i,j}(a). This collection of numbers looks like the
coefficients of a tensor, and we will see that it is.
In this section we will be working with a single basis a for V , and will temporarily
suspend explicit reference to that basis in most coordinates.
We define ♭ : V → V*, using the musical “flat” symbol, by

♭(u)(w) = G(u, w) = A(u)^t G_a A(w)   ∀ w ∈ V.

If (and, generally, only if) G_a is the identity matrix, left multiplication by
A(u)^t G_a corresponds to simple “dot product” involving the coordinates of u against
the coordinates of w.

If ♭(u) is the zero transformation then u must be 0 since G is nondegenerate. So
♭ is one-to-one and since the vector spaces V and V* have the same dimension, ♭
is an isomorphism onto V*.

If u = u^i a_i ∈ V then ♭(u) = G_{i,j} u^i a^j ∈ V*.

Consider the following calculation, and its implication in the box below.
♭( σ_m G^{m,w} a_w ) = G_{w,j} σ_m G^{m,w} a^j = σ_m δ^m_j a^j = σ_j a^j = σ.

Define the function ♯ : V* → V, using the musical “sharp” symbol, by

♯(σ) = G^{i,j} σ_i a_j   for σ = σ_i a^i ∈ V*.

We see that ♯ = ♭^{-1}.

So ♯(σ) evaluated on τ will, generally, look like the sum of simple products σ_i τ_i
only when G_a (and so G^a) is the identity matrix.

♭ was defined without reference to a basis, and so both it and its inverse ♯ have
meaning independent of basis. The formulas we create involving their effect on
vectors and covectors in a basis are valid in any basis.
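For a concrete check, one can represent everything in a single basis; the snippet below is a minimal sketch under my own assumptions (numpy, a randomly generated symmetric invertible matrix standing in for G_a), not part of the text.

    import numpy as np

    n = 4
    rng = np.random.default_rng(9)

    M = rng.normal(size=(n, n))
    G = M + M.T + n * np.eye(n)      # a symmetric matrix, invertible in practice, playing G_a
    G_inv = np.linalg.inv(G)         # the matrix G^a with entries G^{i,j}(a)

    u = rng.normal(size=n)           # coordinates u^i of a vector
    flat_u = G @ u                   # (flat u)_j = G_{i,j} u^i   (G symmetric)
    assert np.allclose(G_inv @ flat_u, u)   # sharp undoes flat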
We use the musical “flat” symbol ♭ and the “sharp” symbol ♯ as mnemonic aids.
♭ lowers or “flattens” the index on a coefficient turning a vector, usually associated
with an arrow, into a covector, which is usually associated with a family of parallel
planes (or hyperplanes). The ♯ symbol raises the index, turning a “flat” object into
a “sharp” one.
We can use ♭ to create G*, a tensor of order 2, on V*. G* will be a member of
T^2_0(V) and is defined by:

G*(σ, τ) = G( ♯(τ), ♯(σ) )   ∀ σ, τ ∈ V*.

The reversal of order you see in σ and τ above is not relevant when G is sym-
metric, as it usually will be. In the symplectic case, this reversal avoids a minus
sign in the formula.
If σ = σ_i a^i and τ = τ_i a^i then

G*(σ, τ) = G( τ_m G^{m,w} a_w , σ_L G^{L,s} a_s )
         = τ_m σ_L G^{m,w} G^{L,s} G_{w,s} = τ_m σ_L δ^m_s G^{L,s}
         = σ_L τ_m G^{L,m}.

So G* is the tensor

G* = G^{L,m} a_L ⊗ a_m.

The tensor G* is said to be the inner product conjugate to G.

We can evaluate G* down in R^{n*} as:

G*(σ, τ) = A*(σ) G^a A*(τ)^t.

We call G^a the matrix of G*.


The bilinear form G and its conjugate G* induce tensors on R^n and R^{n*} by
transporting vectors to V and V* using A and A*:

ÃG(x, y) = G( A(x), A(y) )   and   ÃG*(σ, τ) = G*( A*(σ), A*(τ) ).

We have produced the nondegenerate tensor G* defined on V* × V* and an
isomorphism ♭ from V to V*. We can carry this process forward, mimicking our
construction of G* and ♭, to produce a nondegenerate bilinear form on V** = V
and an isomorphism from V* to V.

You will find that this induced bilinear form is G (i.e. the tensor conjugate to
G* is G) and the isomorphism is ♯.

20. Raising and Lowering Indices

Suppose, once again, we are given a nondegenerate bilinear form G with representation
G_{i,j}(a) a^i ⊗ a^j in ordered basis a. Though the initial calculations make no symmetry
assumptions, in practice G will be an inner product or, possibly, a symplectic form.

In this section as in the last we will be working with a single basis a for V and
will temporarily suspend reference to that basis in most coordinates.

The functions ♭ : V → V* and ♯ = ♭^{-1} and the conjugate tensor G* from the
last section were defined using G alone and so are not dependent on coordinate
systems, though we have found representations for their effect in a basis.

♭ and ♯ can be used to modify tensors in a procedure called “raising or lowering
indices.”
This process is entirely dependent on a fixed choice of G and extends the procedure
by which we created G* from G.

We illustrate the process on vectors and covectors before introducing a notational
change to help us with higher orders.

Suppose τ = τ_i a^i is in V*.

♯ is an isomorphism from V* to V, and τ is linear on V, so the composition τ ∘ ♯
is a linear real valued function on V*: that is, a vector.

Define the “raised index version of τ” to be this vector, and denote its coordinates
in basis a by τ^k, so τ ∘ ♯ = τ^k a_k.

If σ = σ_i a^i is in V*,

τ ∘ ♯(σ) = τ_j a^j( ♯(σ) ) = τ_j a^j( G^{k,L} σ_k a_L ) = τ_j G^{k,j} σ_k = τ_j G^{k,j} a_k(σ).

So τ^k = τ_j G^{k,j} for each k. From this you can see that τ ∘ ♯ could be obtained
by the tensor product of τ with G* followed by a contraction against the second
index of G*.

Suppose x = x^i a_i is in V.

The linear function ♭ has domain V and range V*, and x has domain V* and
range R, so the composition x ∘ ♭ is a covector.

If v = v^k a_k ∈ V,

x ∘ ♭(v) = x^j a_j( ♭(v) ) = x^j a_j( G_{k,L} v^k a^L ) = x^j G_{k,j} v^k = x^j G_{k,j} a^k(v).

Define the “lowered index version of x” to be this covector, and denote its coordinates
in basis a by x_k, so x ∘ ♭ = x_k a^k.

As you can see, the covector x ∘ ♭ could have been obtained by tensor product
of x with G followed by contraction against the second index of G.

We have, in compact form, for vectors and covectors:

x_k = x^j G_{k,j}   and   τ^k = τ_j G^{k,j}   (independent of basis)

Since ♭ = ♯^{-1} we find that x ∘ ♭ ∘ ♯ = x and τ ∘ ♯ ∘ ♭ = τ, so raising a lowered
index and lowering a raised index gets you back where you started.
There is a common ambiguity which can cause problems and must be mentioned,
centered around using the same base symbol x for coordinates xi and also coordi-
nates xi . That invites the practice of using the same symbol x to denote both the
vector and the covector obtained by lowering its index, and a similar practice for
a covector τ and its raised index version. The version which is intended must be
deduced from context or made explicit somehow.
We will now introduce the notational change mentioned above.
The domain of our tensors in Tsr (V ) was V ∗ × · · · × V ∗ × V × · · · × V where all the
copies of V are listed last. But nothing we did required that they be last. It was
a notational convenience only that prompted this. It was the number of factors of
V and V ∗ which mattered, not where they were on the list. It is more convenient
for the remainder of this section for us to allow V and V ∗ to appear anywhere as
factors in the domain of a tensor and we create a notation to accommodate this.
A tensor formerly indicated as T^{i_1,...,i_r}_{j_1,...,j_s} in T^r_s(V) will now be written with a
separate column in the block of indices to the right of T for each factor in the
domain. Factors of V* in the domain correspond to raised indices while lowered
indices correspond to domain factors V. The “column” notation does not use the
comma to separate indices.

So a tensor with coordinates T^a_b^c_{de}^{fg} (seven index columns: up, down, up,
down, down, up, up) has domain V* × V × V* × V × V × V* × V* and has contravariant
order 4 and covariant order 3. To recover our previous situation merely slide all
factors of V* to the left and pack the indices to the left in order, inserting commas
where they are required.
Suppose T is any tensor of order k, contravariant of order r and covariant of
order s, with coordinates given in this column notation. The domain of T is a set
W which is a product of copies of V and V ∗ in some order: we do not require all the
copies of V ∗ to be on the left. We presume that the ith index of T is contravariant:
that is, the index in column i is “high” and that V ∗ is the ith factor of W .
We let Wi be the factor space obtained by substituting V in place of V ∗ at the
ith factor.
Define the function ♭_i : W_i → W by

♭_i( (v(1), ..., v(k)) ) = ( v(1), ..., ♭(v(i)), ..., v(k) ).

In other words, ♭_i leaves a member of W_i unchanged except in its ith factor, where
the vector v(i) is replaced by the covector ♭(v(i)).

We now consider the tensor T ∘ ♭_i : W_i → R. This tensor is still of rank k but
now its contravariant rank is r − 1 and its covariant rank is s + 1. It is said to be
a version of T whose ith index is low, and the process of forming T ∘ ♭_i from
T is called lowering the ith index on T.
Similarly, if the jth index of T is covariant then the index in column j is “low”
and V is the jth factor of W .
We let W^j be the factor space obtained by substituting V* in place of V as the
jth factor.

Define the function ♯^j : W^j → W by

♯^j( (v(1), ..., v(k)) ) = ( v(1), ..., ♯(v(j)), ..., v(k) ).

In other words, ♯^j leaves a member of W^j unchanged except in its jth factor, where
the covector v(j) is replaced by the vector ♯(v(j)).

We now consider the tensor T ∘ ♯^j : W^j → R. This tensor has contravariant rank
r + 1 and covariant rank s − 1. It is said to be a version of T whose jth index
is high, and the process of forming T ∘ ♯^j from T is called raising the jth index
on T.
Generally, for a fixed tensor of order k there will be 2^k different index position
choices. Confusingly, all of these different tensors can be, and in some treatments
routinely are, denoted by the same symbol T.

Presumably, you will start out with one specific known version of T as above.
It will have, perhaps, coordinates T^i_j. The other three versions in this rank-2 case
will be

• T ∘ ♭_1 with coordinates denoted T_{ij}
• T ∘ ♯^2 with coordinates denoted T^{ij}
• T ∘ ♭_1 ∘ ♯^2 with coordinates denoted T_i{}^j.
When dealing with coordinates, there can be no ambiguity here. T has a column
for each index and a position (high or low) for an index in each column. If the
index is not where it should be it must have been raised by ] or lowered by [ at that
domain factor. The problems arise when coordinate-free statements are preferred
and in that case using the same symbol for all versions of T is bad notation. The
practice is, however, ubiquitous.
By means identical to the rank 1 case considered above, we find the coordinates
of the various versions of rank-2 T to be related by

T_{ij} = T_i{}^L G_{j,L} = T^L{}_j G_{i,L} = T^{LM} G_{i,L} G_{j,M}

and

T^{ij} = T^i{}_L G^{j,L} = T_L{}^j G^{i,L} = T_{LM} G^{i,L} G^{j,M}.

Once again, to raise an index on a version of T you take the tensor
product of that version with G* and contract against the second index
of G*. To lower an index on a version of T you take the tensor product
of that version with G and contract at the second index of G.

If G (and hence G*) is symmetric the choice of index of contraction on G or G*
is irrelevant. If G is not symmetric, it does matter.

The statement in bold above applies to tensors of any rank.
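In coordinates these operations are index contractions with the matrices of G and G*, so they are one-liners with numpy's einsum. The sketch below is my own illustration (a random symmetric invertible matrix standing in for G_a), not code from the text.

    import numpy as np

    n = 3
    rng = np.random.default_rng(10)

    M = rng.normal(size=(n, n))
    G = M + M.T + n * np.eye(n)      # matrix of a symmetric nondegenerate G
    G_inv = np.linalg.inv(G)         # matrix of the conjugate tensor G*

    T = rng.normal(size=(n, n))      # a version of T with coordinates T^{ij} (both indices high)

    T_low_first = np.einsum('lj,il->ij', T, G)         # T_i^j  = T^{Lj} G_{i,L}
    T_low_both  = np.einsum('lm,il,jm->ij', T, G, G)   # T_{ij} = T^{LM} G_{i,L} G_{j,M}

    # raising both indices again with G* recovers the original coordinates
    T_back = np.einsum('lm,il,jm->ij', T_low_both, G_inv, G_inv)
    assert np.allclose(T_back, T)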
We can apply this process to the tensor G itself. Raising both indices gives
coordinates
GLk Gj,k Gi,L = δki Gj,k = Gj,i = Gi,j .
It is in anticipation of this result that we used the raised-index notation for the
coordinates of G? . The tensor obtained from G by raising both indices using G
actually is the conjugate metric tensor, so we have avoided a clash of notations.
We also note that $G_k{}^i = \delta^i_k = G^i{}_k$.
It is possible to contract a tensor by two contravariant or two covariant
index positions, by first raising or lowering one of the two and then contracting
this new pair as before. We restrict our attention here to symmetric G.
We illustrate the process with an example. Let us suppose given a tensor with
coordinates $T^{ab}{}_{cde}{}^{fg}$, and that we wish to contract by the two contravariant indices
in the 6th and 7th columns. We first lower the 7th index to form
$$T^{ab}{}_{cde}{}^{f}{}_{g} = T^{ab}{}_{cde}{}^{fh}\, G_{gh}.$$
We then contract to form $C^{67}T$ as
$$(C^{67} T)^{ab}{}_{cde} = T^{ab}{}_{cde}{}^{k}{}_{k} = T^{ab}{}_{cde}{}^{kh}\, G_{kh}.$$

Done in the other order we have
$$T^{ab}{}_{cde\,k}{}^{k} = T^{ab}{}_{cde}{}^{hk}\, G_{kh}$$
which is the same by symmetry of G.
Contracting in the 4th and 5th index positions, which are covariant, is defined
by first raising one of these indices and then contracting at the new pair.
$$(C_{45} T)^{ab}{}_{c}{}^{fg} = T^{ab}{}_{c}{}^{k}{}_{k}{}^{fg} = T^{ab}{}_{chk}{}^{fg}\, G^{kh}.$$
In this case too, symmetry of G guarantees that the result does not depend on
which of these two indices was raised.
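The double contraction can be phrased the same way. Here is a small sketch (my own labels; an order-5 tensor stands in for the order-7 example above) in which the two trailing contravariant slots are contracted against the metric, with a check that the answer does not depend on which of the two is lowered first.

```python
import numpy as np

rng = np.random.default_rng(1)
G = np.diag([1.0, 1.0, -1.0])               # matrix of a symmetric G
T = rng.normal(size=(3, 3, 3, 3, 3))        # slots: a(high) b(high) c(low) f(high) g(high)

# lower g and trace with f:  (CT)^{ab}_c = T^{ab}_c^{fg} G_{gf}
C1 = np.einsum('abcfg,gf->abc', T, G)
# lower f and trace with g gives the same result because G is symmetric
C2 = np.einsum('abcfg,fg->abc', T, G)

print(np.allclose(C1, C2))                  # True
```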
To put some of this in perspective, we consider a very special case of an
ordered basis a and an inner product G with matrix $G_a = I = G^a$, the
identity matrix. This means that the inner product in Rn created by G using
this ordered basis is the ordinary Euclidean inner product, or dot product. If
T is a member of V (that is, a contravariant tensor of order 1) or a member of V ∗
(a covariant tensor of order 1) the act of raising or lowering the index corresponds,
in Rn , to taking the transpose of the representative there. The numerical values of
coordinates of any tensor do not change when an index is raised or lowered in this
very special but rather common case.
Once again restricting attention to symmetric G, indices can be raised or
lowered on alternating tensors just as with any tensor, though the result could not
be alternating unless all indices are raised or lowered together. In that case, though,
the transformed tensor will be alternating as we now show.
If R is an r-form, it can be represented in a basis as Rj1 ...jr aj1 ⊗ · · · ⊗ ajr
where the sum is over all indices, or as Rj1 ...jr aj1 ∧ · · · ∧ ajr where the sum is over
increasing indices.
In the first representation, the fact that R is alternating is encapsulated in the
equation
Rj1 ...jr = sgn(Q) RjQ(1) ...jQ(r) for any permutation Q ∈ Pr .

Raising all indices on R yields
$$R^{j_1 j_2 \ldots j_r} = R_{i_1 i_2 \ldots i_r}\, G^{i_1 j_1} G^{i_2 j_2} \cdots G^{i_r j_r}$$
where the sums on the right are over all indices.
Raising indices with $j_1$ and $j_2$ permuted gives
$$R^{j_2 j_1 \ldots j_r} = R_{i_1 i_2 \ldots i_r}\, G^{i_1 j_2} G^{i_2 j_1} \cdots G^{i_r j_r}
= -R_{i_2 i_1 \ldots i_r}\, G^{i_1 j_2} G^{i_2 j_1} \cdots G^{i_r j_r}
= -R_{i_2 i_1 \ldots i_r}\, G^{i_2 j_1} G^{i_1 j_2} \cdots G^{i_r j_r} = -R^{j_1 j_2 \ldots j_r}$$
so the raised index version of R is alternating as stated.
We recall a notation we used earlier and let $(G^a)^{i_1,\ldots,i_r}_{j_1,\ldots,j_r}$ denote the matrix obtained
from $G^a$ by selecting rows $i_1, \ldots, i_r$ and columns $j_1, \ldots, j_r$.
By means identical to that used for the change of basis formula for wedge products
found on page 39 we conclude that for alternating R
$$R^{j_1 \ldots j_r} = R_{i_1 \ldots i_r}\, \det\!\left( (G^a)^{i_1,\ldots,i_r}_{j_1,\ldots,j_r}\right) \qquad\text{(Sum here on increasing indices only.)}$$
We observe also that if T is an r-vector (an alternating member of $\mathcal{T}^r_0(V)$) then
we can lower all indices using matrix $G_a$ rather than $G^a$ to obtain for alternating T
$$T_{j_1 \ldots j_r} = T^{i_1 \ldots i_r}\, \det\!\left( (G_a)^{j_1,\ldots,j_r}_{i_1,\ldots,i_r}\right) \qquad\text{(Sum here on increasing indices only.)}$$
Finally, the various processes we have described can be adapted with the obvious
modifications to raise or lower the index on a relative tensor. The procedure does
not alter the weighting function.

21. Four Facts About Tensors of Order 2

In this section we will consider how to calculate all four types of tensors of order
2 on V and V ∗ using matrices acting in specific ways on the representatives of
vectors and covectors in Rn and Rn∗ . In this text, these matrices are called the
matrix representations of the relevant tensors.
We will then show how the matrix representations change as the basis changes
from basis a to b in V .
Then we will make three different kinds of observations about these matrices.
We will examine how these matrices are related in a fixed ordered basis if the
tensors involved are all “raised or lowered versions” of the same tensor with respect
to an inner product.
Then we will comment on how these matrices change when the matrices of tran-
sition are of a special type—orthogonal matrices.
Finally, we will examine the determinants of these matrices.
For the following calculations v, w are generic members of V , while θ, τ are to
be generic members of V ∗ .
Suppose P is a tensor with domain V × V. We saw that $P_{ij}(b) = P_{kL}(a)\, A^k_i A^L_j$.
We will determine this fact directly to show exactly how this carries over to a
calculation with matrices in Rn.
Let matrix $M_I(a)$ be the matrix with ijth entry $P_{ij}(a)$ and $M_I(b)$ be the matrix
with ijth entry $P_{ij}(b)$.
$$P(v, w) = P_{ij}(a)\, A^i(v)\, A^j(w) = A(v)^t\, M_I(a)\, A(w)
= (A\,B(v))^t\, M_I(a)\, A\,B(w) = B(v)^t\, A^t\, M_I(a)\, A\, B(w).$$
We conclude that
Case I:  $M_I(b) = A^t\, M_I(a)\, A$  where $M_I$ has ijth entry $P_{ij}$.
Suppose Q is a tensor with domain V × V ∗. Then $Q_i{}^j(b) = Q_k{}^L(a)\, A^k_i B^j_L$.
Let matrix $M_{II}(a)$ be the matrix with ijth entry $Q_i{}^j(a)$ and $M_{II}(b)$ be the
matrix with ijth entry $Q_i{}^j(b)$.
$$Q(v, \theta) = Q_i{}^j(a)\, A^i(v)\, A^*_j(\theta) = A(v)^t\, M_{II}(a)\, A^*(\theta)^t
= (A\,B(v))^t\, M_{II}(a)\, (B^*(\theta)\,B)^t = B(v)^t\, A^t\, M_{II}(a)\, B^t\, B^*(\theta)^t.$$
We conclude that
Case II:  $M_{II}(b) = A^t\, M_{II}(a)\, B^t$  where $M_{II}$ has ijth entry $Q_i{}^j$.

Suppose R is a tensor with domain V ∗ × V. Then $R^i{}_j(b) = R^k{}_L(a)\, B^i_k A^L_j$.
Let matrix $M_{III}(a)$ be the matrix with ijth entry $R^i{}_j(a)$ and $M_{III}(b)$ be the
matrix with ijth entry $R^i{}_j(b)$.
$$R(\theta, v) = R^i{}_j(a)\, A^*_i(\theta)\, A^j(v) = A^*(\theta)\, M_{III}(a)\, A(v)
= B^*(\theta)\, B\, M_{III}(a)\, A\, B(v).$$
We conclude that
Case III:  $M_{III}(b) = B\, M_{III}(a)\, A$  where $M_{III}$ has ijth entry $R^i{}_j$.

Suppose S is a tensor with domain V ∗ × V ∗. Then $S^{ij}(b) = S^{kL}(a)\, B^i_k B^j_L$.
Let matrix $M_{IV}(a)$ be the matrix with ijth entry $S^{ij}(a)$ and $M_{IV}(b)$ be the
matrix with ijth entry $S^{ij}(b)$.
$$S(\theta, \tau) = S^{ij}(a)\, A^*_i(\theta)\, A^*_j(\tau) = A^*(\theta)\, M_{IV}(a)\, A^*(\tau)^t
= B^*(\theta)\, B\, M_{IV}(a)\, (B^*(\tau)\,B)^t = B^*(\theta)\, B\, M_{IV}(a)\, B^t\, B^*(\tau)^t.$$
We conclude that
Case IV:  $M_{IV}(b) = B\, M_{IV}(a)\, B^t$  where $M_{IV}$ has ijth entry $S^{ij}$.
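A quick numerical check of the four rules (a sketch with invented data; A, B and the M's below are arbitrary) evaluates each tensor two ways: once from a-basis data and once from b-basis data with the transformed matrix, using the coordinate relations A(v) = A B(v) for vectors and A*(θ) = B*(θ) B for covectors as in the computations above.

```python
import numpy as np

n = 4
rng = np.random.default_rng(2)
A = rng.normal(size=(n, n))                   # matrix of transition (assumed invertible)
B = np.linalg.inv(A)
MI, MII, MIII, MIV = (rng.normal(size=(n, n)) for _ in range(4))

vb, wb = rng.normal(size=n), rng.normal(size=n)     # B(v), B(w)
tb, sb = rng.normal(size=n), rng.normal(size=n)     # B*(theta), B*(tau)
va, wa = A @ vb, A @ wb                             # A(v) = A B(v)
ta, sa = tb @ B, sb @ B                             # A*(theta) = B*(theta) B

checks = [
    (va @ MI   @ wa,  vb @ (A.T @ MI   @ A)   @ wb),   # Case I
    (va @ MII  @ ta,  vb @ (A.T @ MII  @ B.T) @ tb),   # Case II
    (ta @ MIII @ va,  tb @ (B   @ MIII @ A)   @ vb),   # Case III
    (ta @ MIV  @ sa,  tb @ (B   @ MIV  @ B.T) @ sb),   # Case IV
]
print(all(np.isclose(x, y) for x, y in checks))         # True
```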

We will now consider the situation where these four tensors are obtained
by raising or lowering indices of (any) one of them with the services of
an inner product G.
We will let $M_I = (P_{ij})$ as above and relate the other matrices from above to this
one. So
$$P_i{}^j(a) = P_{ik}(a)\, G^{kj}(a) \qquad\text{so}\qquad M_{II}(a) = M_I(a)\, G^a$$
$$P^i{}_j(a) = G^{ik}(a)\, P_{kj}(a) \qquad\text{so}\qquad M_{III}(a) = G^a\, M_I(a)$$
$$P^{ij}(a) = G^{ik}(a)\, P_{kL}(a)\, G^{Lj}(a) \qquad\text{so}\qquad M_{IV}(a) = G^a\, M_I(a)\, G^a$$
Since $G_a^{-1} = G^a$ you can modify the above to write any of them in terms of the
others and the matrices $G_a$ and $G^a$.
We make explicit note of this calculation in an important and common case:
When these four tensors are all “raised or lowered” versions
of each other, and if $G_a$ and $G^a$ are the identity matrices,
then the matrix of each of these tensors is the same.
A basis for which matrix $G_a$ (and so $G^a$ too) is the identity matrix is called
orthonormal with respect to G. We will have more to say about such bases
later.
We now consider “special” matrices of transition.
Sometimes the change of basis matrix A satisfies
$$A^t = A^{-1} = B \qquad\text{and so}\qquad B^t = B^{-1} = A.$$
This happens exactly when $AA^t = I$ where I is the n × n identity matrix, or
$$\sum_{k=1}^n A^k_i\, A^k_j = \delta^i_j.$$

The columns of A form an orthonormal ordered basis of Rn with respect


to the usual inner product there. Similarly, so do the rows. Matrices like this
are called orthogonal.
The matrix representations of the four types of tensors of order 2,
the original cases I through IV, all transform in the same way under
coordinate changes with orthogonal matrices of transition.
Finally, we take determinants of the equations in cases I through IV.
The determinants of matrices A and At are the same. So are the determinants
of matrices B and Bt . The determinants of matrices A and B are reciprocals of
each other.
So for the matrices of the mixed tensors, Cases II and III, we find that
det (MII (b)) = det (MII (a)) and det (MIII (b)) = det (MIII (a)) .
The determinants of these matrices are invariant.
On the other hand, in Cases I and IV, the purely contravariant and covariant
cases, respectively, these determinants do depend on basis. We find that

$$\det\left(M_I(b)\right) = (\det(A))^2\,\det\left(M_I(a)\right) \qquad\text{and}\qquad \det\left(M_{IV}(b)\right) = (\det(B))^2\,\det\left(M_{IV}(a)\right).$$
Of particular interest is the case where the underlying tensors are versions of an
inner product G. So
$$M_I(a) = G_a \quad\text{and}\quad M_I(b) = G_b \qquad\text{while}\qquad M_{IV}(a) = G^a \quad\text{and}\quad M_{IV}(b) = G^b,$$
and all four matrices are invertible.
In this case we have
$$\det(A) = \operatorname{sign}(A)\,\sqrt{\frac{\det\left(G_b\right)}{\det\left(G_a\right)}} \qquad\text{and}\qquad \det(B) = \operatorname{sign}(B)\,\sqrt{\frac{\det\left(G^b\right)}{\det\left(G^a\right)}}.$$

22. Matrices of Symmetric Bilinear Forms and 2-Forms

We discuss here various details regarding matrix representation of symmetric


bilinear forms and 2-forms which can be useful in calculations. Both types of rank
2 tensors may be present in an application, each contributing to different aspects
of the geometry.
We accumulate several facts first.
The tensor G we will work with is given, in basis a, by Gi,j (a)ai ⊗ aj .
Define the set W = { v ∈ V | G(v, w) = 0 for all w ∈ V }
= { v ∈ V | G(w, v) = 0 for all w ∈ V }.
If W = {0} then G is either an inner product or a symplectic form.
If W 6= {0} then an ordered basis w1 , . . . , wk of W can be extended to an ordered
basis y1 , . . . , yn−k , w1 , . . . , wk of V .
If Y is the span of y1 , . . . , yn−k then G restricted to Y is either an inner product
or a symplectic form, and the last k columns and k rows of the matrix of G with
respect to the basis y1 , . . . , yn−k , w1 , . . . , wk of V contain only zero entries.
Recall from section 21 that the matrix Gb for bilinear form G in ordered basis
b can be obtained from the matrix Ga in ordered basis a as Gb = At Ga A.
If A is any orthogonal matrix and At M A = N the matrices N and M are called
orthogonally equivalent.
We will start with symmetric bilinear form G.
The matrix of any symmetric bilinear form is symmetric. In [8] p.369 we find that
any symmetric matrix is orthogonally equivalent to a diagonal matrix
with r negative diagonal entries listed first, then s positive diagonal entries and zero
diagonal entries listed last. We find there that the numbers r and s of negative and
positive entries on the diagonal are independent of basis when the matrix of G is diagonal,
a result referred to as Sylvester’s Law of Inertia.

s is the dimension of any subspace of V of maximal dimension upon which G is


positive definite. r is the dimension of any subspace of V of maximal dimension
upon which G is negative definite. r + s is the dimension of any subspace of V of
maximal dimension upon which G is nondegenerate. r + s is the rank of any matrix
representation of G.

The ordered pair (s, r) is called the signature of G and the number s − r is
sometimes called the index of G. Be warned: this vocabulary is not used in a
consistent way in the literature.

Writing this out in more detail, suppose that $G_a$ is the matrix of symmetric G
with respect to a. Then there is an orthogonal matrix B, which can be used as a
matrix of transition from ordered basis a to a new ordered basis b, for which
(by orthogonality) $B^t = B^{-1} = A$ and, finally, for which the matrix $G_b$ is
diagonal.

$$G_b = A^t\, G_a\, A = A^{-1}\, G_a\, A = \operatorname{diag}\!\big(\,G_{1,1}(b),\;\ldots,\;G_{r,r}(b),\;G_{r+1,r+1}(b),\;\ldots,\;G_{r+s,r+s}(b),\;0,\;\ldots,\;0\,\big)$$

The orthogonal matrix B can be chosen so that
$$G(b_i, b_j) = e_i^t\, G_b\, e_j = G_{i,j}(b) = 0 \quad\text{unless } i = j$$
and also
$$G = \sum_{i=1}^n G_{i,i}(b)\; b^i\otimes b^i \qquad\text{where } G_{i,i}(b) \text{ is } \begin{cases} \text{negative}, & \text{if } 1\le i\le r;\\ \text{positive}, & \text{if } r+1\le i\le r+s;\\ 0, & \text{if } r+s+1\le i\le n.\end{cases}$$

Now suppose G is an inner product on V : that is, r + s = n.


Taking the ordered basis b as above, create a new ordered basis c using the diagonal
matrix of transition
$$K = \operatorname{diag}\!\left(\frac{1}{\sqrt{-G_{1,1}(b)}},\;\ldots,\;\frac{1}{\sqrt{-G_{r,r}(b)}},\;\frac{1}{\sqrt{G_{r+1,r+1}(b)}},\;\ldots,\;\frac{1}{\sqrt{G_{n,n}(b)}}\right).$$
In other words, $c_j = \frac{1}{\sqrt{\pm G_{j,j}(b)}}\, b_j$ for each j, where the sign is chosen to make
the root real. Note: $K^t \ne K^{-1}$, in general, so we lose orthogonal equivalence
at this point—and similarity too.


The new matrix of G, obtained as $G_c = K^t\, G_b\, K$, is still diagonal and
$$G = \sum_{i=1}^n G_{i,i}(c)\; c^i\otimes c^i, \qquad G_{i,i}(c) = \begin{cases} -1, & \text{if } 1\le i\le r;\\ \;\;\,1, & \text{if } r+1\le i\le r+s = n.\end{cases}$$

Whenever the matrix of inner product G with respect to an ordered basis is


diagonal with only ±1 on the diagonal and with the negative entries first the
ordered basis is called orthonormal with respect to this inner product. It
should be pointed out that some texts reverse this order.

There is an orthonormal ordered basis for any inner product.

Suppose c and w are any two orthonormal ordered bases and $H = (h^i_j)$ is the
matrix of transition to w from c. So
$$G(w_K, w_L) = \begin{cases} -1, & \text{if } L = K \le r;\\ \;\;\,1, & \text{if } r < L = K \le n;\\ \;\;\,0, & \text{if } L \ne K;\end{cases}$$
$$= G\!\left(h^i_K\, c_i,\; h^j_L\, c_j\right) = h^i_K\, h^j_L\, G(c_i, c_j) = -\sum_{i=1}^r h^i_K\, h^i_L \;+\; \sum_{i=r+1}^n h^i_K\, h^i_L.$$

So the transpose of matrix H can be tweaked to produce H−1 . Specif-


ically, divide the transpose of H into four blocks at row and column r.
Multiply the off-diagonal blocks by −1 and you have H−1 .
The multiplication can be accomplished by multiplying the first r rows and
then the first r columns of Ht by −1, and each such multiplication will affect the
determinant by a factor of −1. The combined effect is to leave the determinant
unchanged, and since det (Ht ) = det (H) we conclude that det H−1 = det (H) =


±1.
A matrix of transition H between two bases which are orthonormal
with respect to any inner product has determinant ±1.

Also, if G is positive definite (i.e. r = 0) then Ht = H−1 .


A matrix of transition H between two bases which are orthonormal
with respect to any positive definite inner product is an orthogonal
matrix: that is, it satisfies Ht = H−1 .
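For a small numerical illustration (an example of my own, not from the text), take the plane with G having matrix diag(−1, 1) in an orthonormal basis; a hyperbolic “boost” is then a matrix of transition between two such orthonormal bases, and the block-sign tweak of its transpose really does produce its inverse.

```python
import numpy as np

t = 0.7
c, s = np.cosh(t), np.sinh(t)
H = np.array([[c, s],
              [s, c]])                   # matrix of transition between orthonormal bases
Gc = np.diag([-1.0, 1.0])                # r = 1 negative entry listed first

print(np.allclose(H.T @ Gc @ H, Gc))     # the new basis is again orthonormal
print(np.isclose(abs(np.linalg.det(H)), 1.0))

# tweak H^t: negate the off-diagonal blocks at row/column r = 1
Hinv = H.T.copy()
Hinv[:1, 1:] *= -1
Hinv[1:, :1] *= -1
print(np.allclose(Hinv @ H, np.eye(2)))  # this really is the inverse of H
```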

A diagonalization algorithm for any symmetric bilinear form by a change of


basis from any specified basis is found using a modification of the Gram-Schmidt
process, which we briefly outline below. Though the matrices of transition we
create have determinant ±1, they will not in general be orthogonal even in the
positive definite case.
Suppose a is any ordered basis of n dimensional V and G is a symmetric bilinear
form on V .
Define the set of basis vectors S = { ai ∈ a | G(ai , aj ) = 0 for all j = 1, . . . , n }.
Note that it is quite possible for all the diagonal entries G(ai , ai ) of the matrix
for G in this basis to be 0. The basis vectors in S (if there are any) are those ai for
which the entire ith row of that matrix is filled with zeroes, and by symmetry the
entire ith column is filled with zeroes too.
We will now re-order a so that all these vectors occur last. If S contains k vectors
the matrix of G in basis a has nothing but zeroes in the bottom k × n block, and
nothing but zeroes in the rightmost n × k block. So the rank of this matrix cannot
exceed n − k. Any reordering of members of an ordered basis involves a matrix of
transition with determinant ±1.
With this preliminary tweaking out of the way, we proceed with the algorithm.
The first step involves three possible cases and the last two both lead to Step 2.
Step 1a: If the matrix of G is diagonal (in particular, it is diagonal if it is the
zero matrix) we are done.
Step 1b: If the matrix of G is not diagonal and G(ai , ai ) ≠ 0 for some smallest i,
reorder the members of the ordered basis a if necessary so that G(a1 , a1 ) ≠ 0. Again,
this reordering of members of the ordered basis involves a matrix of transition with
determinant ±1.
Step 1c: If the matrix of G is not diagonal and G(ai , ai ) = 0 for all i then
G(a1 , ai ) ≠ 0 for some smallest i. Replace a1 by a1 + ai . The span of the new
list of basis vectors is still V and now G(a1 , a1 ) ≠ 0. This change of ordered basis
involves a matrix of transition with determinant 1.
Step 2: Let
$$b_1 = a_1 \qquad\text{and}\qquad b_i = a_i - \frac{G(a_i, a_1)}{G(a_1, a_1)}\, a_1 \quad\text{for } i = 2, \ldots, n.$$
The matrix of transition between these bases has determinant 1. Note that if v is
a linear combination of the last n − 1 basis vectors b2 , . . . , bn then G(b1 , v) = 0.
Step 3: The matrix of G in the ordered basis b1 , . . . , bn has all zeros in the first
row and in the first column except possibly for the first diagonal entry, which is
G(b1 , b1 ). It also has the bottom k rows and the last k columns filled with zeroes.
Step 4: If this matrix is actually diagonal, we are done. If the matrix is not yet
diagonal, reorder the b1 , . . . , bn−1 if necessary (leaving b1 alone) so that G(b2 , b2 ) ≠ 0
or, if that is impossible, replace b2 with b2 + bi for some smallest i so that the
inequality holds. Let
$$c_1 = b_1 = a_1 \qquad\text{and}\qquad c_2 = b_2 \qquad\text{and}\qquad c_i = b_i - \frac{G(b_i, b_2)}{G(b_2, b_2)}\, b_2 \quad\text{for } i = 3, \ldots, n.$$
Once again, the matrix of transition has determinant 1. This time the matrix of
G in the ordered basis c1 , . . . , cn has all zeros in the first two rows and the first two
columns except for the first two diagonal entries, which are G(c1 , c1 ) and G(c2 , c2 ).
Of course the last k rows and the right-most k columns are still filled with zeroes.
We carry on in the pattern of Step 4 for (at most) n − 3 − k more steps yielding
an ordered basis for V for which the matrix of G is diagonal. And we can do a final
re-ordering, if we wish, so that the negative diagonal entries are listed first followed by
the positive diagonal entries.
The matrix of transition which produces this diagonal form from a specific basis
a, the product of all the intermediary matrices of transition, will have determinant
±1 but it will not, in general, be orthogonal.
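The algorithm translates directly into code. The sketch below is my own (names and tolerances invented for it); it follows the steps above, uses simple swaps so its matrix of transition has determinant ±1, and returns that matrix K together with the diagonal matrix $K^t G_a K$. Counting the negative and positive diagonal entries of the result recovers the signature.

```python
import numpy as np

def diagonalize_symmetric_form(Ga, tol=1e-12):
    """Return (K, D) with D = K^T Ga K diagonal, for symmetric Ga."""
    Ga = np.asarray(Ga, dtype=float)
    n = Ga.shape[0]
    K, M = np.eye(n), Ga.copy()        # K accumulates the matrix of transition

    def change(E):                     # apply a change of basis with matrix E
        nonlocal K, M
        K = K @ E
        M = E.T @ M @ E

    for p in range(n):
        sub = M[p:, p:]
        if np.all(np.abs(sub) < tol):  # remaining block is zero: done (Step 1a)
            break
        if np.abs(np.diag(sub)).max() < tol:
            # Step 1c: all remaining diagonal entries vanish; add one basis
            # vector to another to create a nonzero diagonal entry
            i, j = map(int, np.argwhere(np.abs(sub) >= tol)[0])
            E = np.eye(n); E[p + j, p + i] = 1.0
            change(E)
        # Step 1b: swap a nonzero diagonal entry into position p
        i = p + int(np.abs(np.diag(M[p:, p:])).argmax())
        if i != p:
            E = np.eye(n); E[:, [p, i]] = E[:, [i, p]]
            change(E)
        # Step 2: make the later basis vectors G-orthogonal to the p-th one
        E = np.eye(n)
        E[p, p + 1:] = -M[p, p + 1:] / M[p, p]
        change(E)
    return K, M

Ga = np.array([[0., 1., 2.],
               [1., 0., 0.],
               [2., 0., 3.]])
K, D = diagonalize_symmetric_form(Ga)
print(np.allclose(K.T @ Ga @ K, D))                       # True, and D is diagonal
print((np.diag(D) < -1e-12).sum(), (np.diag(D) > 1e-12).sum())   # r and s
```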
We conclude this section with a similar discussion for a 2-form S.
In this case any matrix of S is skew symmetric. We find in [1] p.163 that there
is a unique (dependent on S of course) natural number r and there is an ordered
basis b for which the matrix $S_b$ has the form
$$S_b = \begin{pmatrix} 0 & I & 0\\ -I & 0 & 0\\ 0 & 0 & 0 \end{pmatrix} \qquad\text{where each } I \text{ is an } r\times r \text{ identity matrix.}$$

If you start with a matrix $S_a$ for any 2-form S in ordered basis a the new matrix
is of the form $S_b = A^t\, S_a\, A$, just as above. Once again $A^t \ne A^{-1}$ (at least, not
necessarily) so $S_b$ will not in general be either orthogonally equivalent or
similar to $S_a$.

The rank of any matrix like Sb must be an even number, so if S is a sym-


plectic form on V the dimension of V must be even.

The entries of $S_b$ are zero except for
$$\begin{cases} -1, & \text{in row } r+i,\ \text{column } i \text{ for } 1\le i\le r;\\ \;\;\,1, & \text{in row } i,\ \text{column } r+i \text{ for } 1\le i\le r.\end{cases}$$
So with this basis all $S(b_i, b_j)$ are 0 except
$$S(b_{r+i}, b_i) = e_{r+i}^t\, S_b\, e_i = -1 \qquad\text{and}\qquad S(b_i, b_{r+i}) = e_i^t\, S_b\, e_{r+i} = 1 \qquad\text{for } 1\le i\le r.$$
So
$$S = \sum_{i=1}^r \left( b^i\otimes b^{r+i} - b^{r+i}\otimes b^i\right) = \sum_{i=1}^r b^i\wedge b^{r+i}.$$

There is an algorithm similar to the Gram-Schmidt process which will produce


an ordered basis of the kind we are discussing for any 2-form.
Suppose a is any ordered basis of n dimensional V and S is a 2-form.
For the purpose of this construction only, we will call a matrix “skew-diagonal”
if its entries are all zero except for 2 × 2 blocks along the diagonal of the form
$$\begin{pmatrix} 0 & x\\ -x & 0\end{pmatrix} \qquad\text{where } x \text{ is a real number.}$$
Our goal is to create from the ordered basis a a new ordered basis for which the
matrix of S is skew-diagonal using matrices of transition whose determinants are 1
(not ±1.) It is then an easy process to create a basis in the form specified above,
though the last matrix of transition will not have determinant 1, necessarily.
Step 1: If the matrix of S is skew-diagonal (in particular, it is skew-diagonal
if it is the zero matrix) we are done. Otherwise, S is not the zero tensor. Since
S(v, v) = 0 for all v ∈ V , if S is not the zero tensor we must be able to reorder
the members of the ordered basis a (if necessary) so that S(a1 , a2 ) ≠ 0. Because
we are reordering basis members two at a time, this reordering can be done with a
matrix of transition with determinant 1.
Step 2: Let
$$b_1 = a_1 \qquad b_2 = a_2 \qquad\text{and}\qquad b_i = a_i - \frac{S(a_i, a_2)}{S(a_1, a_2)}\, a_1 + \frac{S(a_i, a_1)}{S(a_1, a_2)}\, a_2 \quad\text{for } i = 3, \ldots, n.$$
The matrix of transition between these bases has determinant 1. If v is any linear
combination of the last n − 2 basis vectors b3 , . . . , bn then S(b1 , v) = S(b2 , v) = 0.
Step 3: The matrix of S in the ordered basis b1 , . . . , bn has all zeros in the first
two rows and in the first two columns except for a 2 × 2 skew-diagonal block on the
far upper left, whose entries are
$$\begin{pmatrix} 0 & S(b_1, b_2)\\ -S(b_1, b_2) & 0\end{pmatrix}.$$
If the matrix of S in this new ordered basis is skew-diagonal we are done.
Step 4: If the matrix is not yet skew-diagonal, reorder the b1 , . . . , bn if necessary
(leaving b1 and b2 alone) so that S(b3 , b4 ) ≠ 0. This reordering can be done with a
matrix of transition with determinant 1. Let
$$c_i = b_i \;\text{ for } i = 1, \ldots, 4 \qquad\text{and}\qquad c_i = b_i - \frac{S(b_i, b_4)}{S(b_3, b_4)}\, b_3 + \frac{S(b_i, b_3)}{S(b_3, b_4)}\, b_4 \quad\text{for } i = 5, \ldots, n.$$
Once again, the matrix of transition has determinant 1. This time the matrix of S
in the ordered basis c1 , . . . , cn has all zeros in the first four rows and the first four
columns except for the first two 2 × 2 diagonal blocks, whose entries are those of
the matrices
$$\begin{pmatrix} 0 & S(c_1, c_2)\\ -S(c_1, c_2) & 0\end{pmatrix} \qquad\text{and}\qquad \begin{pmatrix} 0 & S(c_3, c_4)\\ -S(c_3, c_4) & 0\end{pmatrix}.$$
We carry on in this fashion yielding an ordered basis for V for which the matrix of
S is skew-diagonal. The product of all these matrices of transition has determinant
1. If p is the end-product ordered basis obtained after r iterations of this process
$$S = \sum_{i=1}^r S(p_{2i-1}, p_{2i})\; p^{2i-1}\wedge p^{2i}.$$
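The analogous procedure for a 2-form is sketched below (my own code and names; unlike the text it reorders basis vectors with a single permutation, so the accumulated matrix of transition has determinant ±1 rather than necessarily 1). It brings a skew-symmetric matrix $S_a$ to skew-diagonal form by congruence.

```python
import numpy as np

def skew_diagonalize(Sa, tol=1e-12):
    """Return (K, M) with M = K^T Sa K skew-diagonal, for skew-symmetric Sa."""
    Sa = np.asarray(Sa, dtype=float)
    n = Sa.shape[0]
    K, M = np.eye(n), Sa.copy()

    def change(E):
        nonlocal K, M
        K = K @ E
        M = E.T @ M @ E

    p = 0
    while p + 1 < n:
        sub = M[p:, p:]
        if np.all(np.abs(sub) < tol):
            break
        # find S(b_i, b_j) != 0 and reorder so it sits in positions p, p+1
        i, j = map(int, np.argwhere(np.abs(sub) >= tol)[0])   # i < j by skew symmetry
        order = [i, j] + [k for k in range(n - p) if k not in (i, j)]
        E = np.eye(n)
        E[p:, p:] = np.eye(n - p)[:, order]
        change(E)
        # Steps 2 and 4: make the later vectors S-orthogonal to b_p and b_{p+1}
        E = np.eye(n)
        E[p,     p + 2:] =  M[p + 1, p + 2:] / M[p, p + 1]
        E[p + 1, p + 2:] = -M[p,     p + 2:] / M[p, p + 1]
        change(E)
        p += 2
    return K, M

Sa = np.array([[ 0.,  2.,  1.,  0.],
               [-2.,  0.,  0.,  1.],
               [-1.,  0.,  0.,  3.],
               [ 0., -1., -3.,  0.]])
K, M = skew_diagonalize(Sa)
print(np.allclose(K.T @ Sa @ K, M))
print(np.round(M, 10))        # zero except 2x2 blocks [[0, x], [-x, 0]] on the diagonal
```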

An interesting and immediate consequence of this representation result is that a
2-form S on a space of dimension 2n is nondegenerate (and therefore a symplectic
form) if and only if $\underbrace{S\wedge\cdots\wedge S}_{n\ \text{factors}} \ne 0$.
An n-fold wedge product, as seen in the last line, is sometimes denoted S∧n .
This particular 2n-form on the 2n-dimensional V , if nonzero, is called the volume
element generated by the symplectic form S.

23. Cartesian Tensors

We presume in this section that our vector space V is endowed with a preferred
and agreed-upon positive definite inner product, yielding preferred bases—the or-
thonormal bases—and the orthogonal matrices of transition which connect them.
The matrices of transition between these orthonormal bases have determinant ±1,
so these bases form two groups called orientations. Two bases have the same
orientation if the matrices of transition between them have determinant 1.
We suppose, first, that a multilinear function is defined or proposed whose coor-
dinates are given for orthonormal bases. If these coordinates have tensor character
when transforming from one of these orthonormal bases to another the multilinear
function defined in each orthonormal basis is said to have Cartesian character
and to define a Cartesian tensor.
If the coordinates are only intended to be calculated in, and transformed among,
orthonormal coordinate systems of a pre-determined orientation, tensor character
restricted to these bases is called direct Cartesian character and we are said to
have defined a direct Cartesian tensor.
Of course, ordinary tensors are both Cartesian and direct Cartesian tensors.
Cartesian tensors are direct Cartesian tensors.
Roughly speaking, there are more coordinate formulae in a basis that yield Carte-
sian tensors and direct Cartesian tensors than coordinate formulae that have (un-
restricted) tensor character. Our meaning here is clarified in the next paragraph.
The coordinates of Cartesian or direct Cartesian tensors given by coordinate
formulae can certainly be consistently transformed by the ordinary rules for change
of basis to bases other than those intended for that type of tensor. Often the original
coordinate formulae make sense and can be calculated in these more general bases
too. But there is no guarantee that these transformed coordinates will match those
calculated directly, and in common examples they do not.
In general, when the vocabulary “Cartesian tensor” is used and coor-
dinates represented, every basis you see will be presumed to be orthonor-
mal unless explicit mention is made to the contrary. If direct Cartesian
tensors are discussed, all these orthonormal bases will be presumed to


have the given orientation.
With that assumption, there are important simplifications in some of the stan-
dard calculations involving tensors. These simplifications can be so significant that
some folks will never use anything but Cartesian tensors if a natural inner product
can be found. However these simplifications can be misleading, concealing interesting
phenomena behind the symmetry of these bases.
For instance, raising or lowering an index using the inner product has no effect on
the coordinates of a Cartesian tensor, so coordinates $T^{i_1,\ldots,i_r}_{j_1,\ldots,j_s}(a)$ in some treatments
are all written as $T_{j_1,\ldots,j_{r+s}}(a)$ and only the total order rather than covariant or
contravariant order is significant. It is then natural to modify the Einstein summation
notation to indicate summation on any doubly-repeated index.
This conflation of $2^{r+s}$ different tensors is handy but rather confusing if you ever
decide to leave the orthonormal realm.
As we remarked above, sometimes formulae for coordinates in a basis with Carte-
sian or direct Cartesian character are intended to be taken literally in any basis,
and in this wider setting might not have (unrestricted) tensor character.
Even or odd relative tensors with nontrivial weighting function provide common
examples of this. Relative tensors on an orientation class of orthonormal bases are
direct Cartesian tensors. Even relative tensors of any weight are Cartesian tensors
in orthonormal bases. Relative tensors and the “background” tensor they agree
with in these orthonormal bases are very different objects conceptually and the
fact that their coordinates are numerically identical in these special bases might be
convenient but cannot be counted as illuminating.
We give two examples of Cartesian tensors.
Define for basis a the function $T : V\times V\to \mathbb{R}$ given by $T(v, w) = \sum_{i=1}^n v^i(a)\, w^i(a)$
when $v = v^i(a)\, a_i$ and $w = w^i(a)\, a_i$. In the paragraph below we check that this
formula (obviously multilinear) is invariant when the matrix of transition B to a
new basis b is orthogonal, using the fact that the inverse of B is its transpose.
We have $v = v^i(a)\, a_i = v^i(a)\, B^k_i\, b_k$ and $w = w^i(a)\, a_i = w^i(a)\, B^k_i\, b_k$ so the
formula for T applied to the coordinates of these vectors in basis b is
$$\sum_{k=1}^n v^i(a)\, B^k_i\; w^j(a)\, B^k_j = \sum_{k=1}^n v^i(a)\, w^j(a)\, B^k_i\, B^k_j = \sum_{i=1}^n v^i(a)\, w^j(a)\,\delta^i_j = \sum_{i=1}^n v^i(a)\, w^i(a).$$

Considering the same problem from another (nicer) point of view, the numbers
$$\delta^i{}_j = \begin{cases} 0, & \text{if } i\ne j;\\ 1, & \text{if } i = j\end{cases}$$
define the coordinates of a mixed tensor and this tensor has the same coordinates
in any basis, orthonormal or not.
Given any inner product G, we saw earlier that $\delta_i{}^j$ are the coordinates of G and
$\delta^i{}_j$ are the coordinates of G? .
But since G is positive definite, $\delta^{ij}$, $\delta^i{}_j$, $\delta_i{}^j$ and $\delta_{ij}$ are all the same in any
orthonormal basis, and so they are (all of them) coordinates of Cartesian tensors,
and usually all represented using common coordinates $\delta_{ij}$. The “dot product”
Cartesian tensor we worked with above is $\delta_{ij}\; a^i\otimes a^j$.
As a second example, consider the function
$$\epsilon_{i_1 \ldots i_n} = \begin{cases} -1, & \text{if the ordered list } i_1, \ldots, i_n \text{ is an odd permutation of the integers } 1, \ldots, n;\\ \;\;\,1, & \text{if the ordered list } i_1, \ldots, i_n \text{ is an even permutation of the integers } 1, \ldots, n;\\ \;\;\,0, & \text{otherwise.}\end{cases}$$
This is called the permutation symbol, or Levi-Civita symbol, of order n .


These are the coordinates of a direct Cartesian tensor, another example of a
tensor with constant coordinates, though in this case to retain “tensor character”
we need to restrict the bases even more than for the Kronecker delta. Here we must
look only at orthonormal bases of a fixed orientation.
If $v^1, \ldots, v^n$ are any members of V ∗ , a quick examination shows that
$$v^1\wedge\cdots\wedge v^n = \epsilon_{i_1 \ldots i_n}\; v^{i_1}\otimes\cdots\otimes v^{i_n}.$$
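Concretely (a small sketch of my own), the permutation symbol can be tabulated directly, and the displayed identity, evaluated on the basis $(a_1, \ldots, a_n)$, says that contracting ε with the coordinate columns of $v^1, \ldots, v^n$ reproduces the determinant of the matrix of coordinates.

```python
import itertools
import numpy as np

def levi_civita(n):
    eps = np.zeros((n,) * n)
    for p in itertools.permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        eps[p] = (-1) ** inversions
    return eps

eps = levi_civita(3)
V = np.random.default_rng(3).normal(size=(3, 3))    # column k: coordinates of v^{k+1}
lhs = np.einsum('ijk,i,j,k->', eps, V[:, 0], V[:, 1], V[:, 2])
print(np.isclose(lhs, np.linalg.det(V)))             # True
```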

If a and b are any bases, we saw that if σ is any n-form, the coefficients σ(a)
and σ(b) defined by
$$\sigma = \sigma(a)\; a^1\wedge\cdots\wedge a^n = \sigma(b)\; b^1\wedge\cdots\wedge b^n$$
are related by
$$\sigma(b) = (\det(B))^{-1}\,\sigma(a).$$

That means three things. First, if a and b are both orthonormal bases of the
predetermined orientation, then det(B) = 1 so the fixed numbers $\epsilon_{i_1 \ldots i_n}$ satisfy
$$\epsilon_{i_1 \ldots i_n}\; a^{i_1}\otimes\cdots\otimes a^{i_n} = \epsilon_{i_1 \ldots i_n}\; b^{i_1}\otimes\cdots\otimes b^{i_n}$$
as suggested above: the numbers $\epsilon_{i_1 \ldots i_n}$ are the coordinates of a covariant direct
Cartesian tensor.
Second, we see that these numbers are the coordinates of a covariant odd rel-
ative Cartesian tensor with weighting function sign(B): that is, if you switch to
an orthonormal basis of the “other” orientation you multiply by the sign of the
determinant of the matrix of transition to the new basis.
Finally, we see that when transferring to an arbitrary basis these constant coefficients
require the weighting function $(\det(B))^{-1}$ in the new basis, so these constant
coefficients define an odd covariant relative tensor of weight −1.
I have also seen it stated that these same coefficients define both an odd covariant
relative tensor of weight −1 and an odd contravariant relative tensor of weight 1. That is
because in these sources there is no notational distinction on the coefficients between
“raised or lowered index” versions of a tensor, and what they mean is that when a
is an orthonormal basis of the predetermined orientation, the n-vector
$$\epsilon^{i_1 \ldots i_n}\; a_{i_1}\otimes\cdots\otimes a_{i_n} = a_1\wedge\cdots\wedge a_n$$
transforms to arbitrary bases with weighting function of weight 1.
This kind of thing is fine as long as you stay within the orthonormal bases, but
when you explicitly intend to leave this realm, as you must if the statement is to
have nontrivial interpretation, it is a bad idea.
If you want to consider the Cartesian tensor $\delta_{ij}$ or direct Cartesian tensor $\epsilon_{i_1,\ldots,i_k}$,
or any such tensors whose coordinate formulae make sense in bases more general
than their intended bases of application, you have two options.
You can insist on retaining the coordinate formulae in a more general base, in
which case you may lose tensor character, and might not even have a relative tensor.
Alternatively, you can abandon the coordinate formulae outside their intended
bases of application and allow coordinates to transform when moving to these bases
by the usual coordinate change rules, thereby retaining tensor character.

24. Volume Elements and Cross Product

In this section we explore initial versions of ideas we explore in more depth later.
A volume element on V is simply a nonzero member of Λn (V ). This vo-
cabulary is usually employed (obviously) when one intends to calculate volumes.
An ordered basis a of V can be used to form a parallelogram or parallelepiped
Para1 ,...,an in V consisting of the following set of points:
P ara1 ,...,an is formed from all ci ai where each ci satisfies 0 ≤ ci ≤ 1.
We may decide that we want to regard the volume of this parallelepiped as a
standard against which the volumes of other parallelepipeds are measured. In that
case we would pick the volume element Vola :
V ola = a1 ∧ · · · ∧ an
V ola assigns value 1 to (a1 , . . . , an ), interpreted as the volume of P ara1 ,...,an .

More generally, for any ordered list of vectors H1 , . . . , Hn in V we interpret


|V ola (H1 , . . . , Hn )| to be the volume of the parallelepiped with edges H1 , . . . , Hn .
This might be consistent with what you already know or think about volumes
in light of the following facts:
• A sends each ai to the corners of the unit cube in Rn which, barring evidence
to the contrary, would seem to have volume 1.
• More generally, A sends each point in P ara1 ,...,an ⊂ V to the corresponding
point in P are1 ,...,en ⊂ Rn .
• Still more generally, if H1 , . . . , Hn is any ordered list of vectors in V then
A sends each point in P arH1 ,...,Hn to P arAi (H1 )ei ,...,Ai (Hn )ei which is a
parallelepiped with edges Ai (H1 )ei , . . . , Ai (Hn )ei .
• If H1 is replaced by 2H1 the new parallelepiped ought to have twice the
volume. Linearity of V ola in each coordinate reflects this.
• You might have calculated in dimensions 1, 2 and 3 that a parallelepiped
with edges Ai (H1 )ei , . . . , Ai (Hn )ei has length, area or volume | det(H(a))|
and have reason to believe that this formula for volume makes sense in
higher dimensions too. Justification could come by filling out larger or
smaller shapes with translates of “rational multiples” of a standard cube


and adding to get total volume: integration.
• We saw on page 40 in Section 14 that | det(H(a))| = |V ola (H1 , . . . , Hn )|
making it seem consistent to assign this number to the volume of P arH1 ,...,Hn .
However you decide, once you come to the conclusion that P ara1 ,...,an should be
your fiduciary volume standard and that V ola should be the volume-measurement
tool it is easy to find $Vol_a$ from any nonzero $\sigma \in \Lambda_n(V)$:
$$Vol_a = \frac{1}{\sigma(a_1, \ldots, a_n)}\;\sigma = a^1\wedge\cdots\wedge a^n.$$
But how would one come to that conclusion in the first place?
First, it might have arisen as the volume element S ∧n for a symplectic form
S, as outlined on page 61. This form can be used to decide an orientation for
bases and has other physical significance related to the Hamiltonian formulation
of mechanics on the even dimensional phase space, which we cannot explore here.
Contact forms play a similar role on the extended phase space, which includes a
time parameter and is therefore odd dimensional.
Second, we might have in hand a fixed inner product G. We saw that we can
create an orthonormal ordered basis a for any inner product. This is where the
volume element comes from in many situations: as V ola for orthonormal a with
respect to inner product G.
Of course this is not quite enough to pin down a volume element. We saw in
Section 22 that if b is another orthonormal ordered basis then V ola and V olb are
different by a factor of the determinant of the matrix of transition between bases,
which is ±1. So to identify a unique volume element, we must also decide on one
of the two orientations for our orthonormal ordered bases. Sometimes we are only
interested in |V ola | rather than the signed or oriented volume provided by V ola ,
and the issue is irrelevant. In other situations the distinction is crucial.
When given an inner product G and an orthonormal basis a of specified orien-
tation the volume element a1 ∧ · · · ∧ an can be written, in light of the formulae on
pages 40 and 55, in terms of another basis b as
$$Vol_a = a^1\wedge\cdots\wedge a^n = \frac{1}{\det(B)}\; b^1\wedge\cdots\wedge b^n = \operatorname{sign}(B)\sqrt{\frac{\det\left(G_b\right)}{\det\left(G_a\right)}}\;\, b^1\wedge\cdots\wedge b^n = \operatorname{sign}(B)\sqrt{\left|\det\left(G_b\right)\right|}\;\, b^1\wedge\cdots\wedge b^n$$
where B is the matrix of transition to the new basis.
So that is the picture if you are given an inner product on V from which you
glean enough geometric information to understand volume in V .
But this merely pushes our conundrum back one level. It remains: from which
forehead springs this symplectic form or this metric tensor?
In an application, say from physics, a metric tensor can be thought of as the
source itself, the most basic reflection and encapsulation of simple local physics,
from which grows the idea of global distances measured by numerous little yard-
sticks trapped on locally flat pieces of a curvy universe. In other words, some
theories about the world generate (locally) an inner product when phrased in ge-
ometrical language. These theories are satisfying to us because they tap into and
salvage at least part of our natural internal models of how the external world should
be.
Some of these theories also have the added advantage that they seem to be true,
in the sense that repeated tests of these theories have so far turned up, uniformly,
repeated confirmations.
When the metric tensor at each point of this universe is positive definite it is
called Riemannian and induces the usual idea of distance on this curvy surface,
a Riemannian metric on what is called a Riemannian manifold. Small pieces
of a Riemannian manifold exhibit the ordinary Euclidean geometry of Rn .
When the metric tensor defined at each point has Lorentz signature—that is,
when diagonalized its matrix has one negative entry and three positive entries—we
find ourselves on a Lorentzian manifold, the space-time of Einstein’s gen-
eral relativity. Small pieces of a Lorentzian manifold exhibit the Minkowski
geometry of special relativity.
In that world with inner product G, some vectors satisfy G(v, v) > 0 and a v
like that is said to indicate a “space-like” displacement. No massive or massless
particle can move in this way. If G(v, v) = 0, v can indicate the displacement of
a massless particle such as light, and lies on the “light-cone.” When G(v, v) < 0
the “time” coordinate of v is large enough in proportion to its “space” coordinates
that the vector could indicate the displacement of a massive particle. The vector
is called “time-like.”
There is another construction one sees when given specific inner product G and
orientation to yield V ola for orthonormal a. This is a means of creating a vector in
V , called cross product, from a given ordered list H1 , . . . , Hn−1 of members of V .
The cross product is defined in two steps as follows.
If you evaluate V ola using H1 , . . . , Hn−1 in its first n − 1 slots, there is one slot
remaining. Thus V ola (H1 , . . . , Hn−1 , ) is a member T of V ∗ , a different member
of V ∗ for each choice of these various Hi . This is the first step.
For the second step, raise the index of T using G? to yield a member of V , the
cross product
$$T(a)^i\, a_i = T(a)_k\, G^{k,i}(a)\; a_i \;\in\; V.$$
The raised index version of the tensor T is often denoted H1 × · · · × Hn−1 . The
case of n = 3 and Ga = I is most familiar, yielding the usual cross product of pairs
of vectors.
The covector (unraised) T has some interesting properties.
For example, it is linear when thought of as a function from V to V ∗ , where
the member of V replaces Hi for any fixed i. Also, if Hi and Hj are switched for
distinct subscripts, the new T becomes the negative of the old T . Because of this, if
there is any linear relation among the Hi then T = 0. On the other hand, if the Hi
are linearly independent then they can be extended to an ordered basis by adding
one vector Hn and then T(Hn) ≠ 0 which means T ≠ 0. Conclusion: T = 0
exactly when H1 , . . . , Hn−1 is a dependent list.
The tensor T is calculated on any vector $H_n$ as det(H(a)) and as we saw on page
39 this determinant can be expanded around the last column as
$$T(H_n) = \det(H(a)) = \sum_{i=1}^n (-1)^{i+n}\, A^i(H_n)\,\det\!\left( H(a)^{1,\ldots,i-1,i+1,\ldots,n}_{1,\ldots,n-1}\right)$$
and so T is
$$T = T(a)_i\, a^i = \sum_{i=1}^n (-1)^{i+n}\,\det\!\left( H(a)^{1,\ldots,i-1,i+1,\ldots,n}_{1,\ldots,n-1}\right)\, a^i \;\in\; V^*.$$
i=1

T is nonzero exactly when H1 , . . . , Hn−1 constitutes an independent collection


of vectors.
An explicit formula for the raised-index version of T (the version in V ) is
$$T(a)^i\, a_i = \sum_{i=1}^n \left( \sum_{k=1}^n (-1)^{k+n}\,\det\!\left( H(a)^{1,\ldots,k-1,k+1,\ldots,n}_{1,\ldots,n-1}\right) G^{k,i}(a)\right) a_i.$$
If a has been chosen so that $G_a$ is diagonal, this formula simplifies to
$$T(a)^i\, a_i = \sum_{i=1}^n \left( (-1)^{i+n}\,\det\!\left( H(a)^{1,\ldots,i-1,i+1,\ldots,n}_{1,\ldots,n-1}\right) G^{i,i}(a)\right) a_i.$$
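A direct transcription of these formulas (a sketch; the function and variable names are mine) computes the covector T by cofactor expansion and then raises its index with the matrix of G?.

```python
import numpy as np

def cross_product(Hcols, Gstar):
    """Hcols: n x (n-1) matrix whose columns are the a-coordinates of H_1,...,H_{n-1}.
    Gstar: matrix of G? in basis a.  Returns (T, raised version of T)."""
    n = Hcols.shape[0]
    T = np.empty(n)
    for i in range(n):                        # 0-based i corresponds to index i+1 above
        minor = np.delete(Hcols, i, axis=0)   # delete row i, keep all n-1 columns
        T[i] = (-1) ** (i + 1 + n) * np.linalg.det(minor)
    return T, Gstar @ T

n = 4
rng = np.random.default_rng(4)
H = rng.normal(size=(n, n - 1))
Ga = np.diag([1.0, 1.0, 1.0, -1.0])           # a diagonal (Lorentz-like) metric
T, cross = cross_product(H, np.linalg.inv(Ga))

# the defining property: T(w) is the determinant of H with w appended as last column
w = rng.normal(size=n)
print(np.isclose(T @ w, np.linalg.det(np.column_stack([H, w]))))     # True

# with the Euclidean metric in R^3 this is the familiar cross product: e1 x e2 = e3
_, c3 = cross_product(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]), np.eye(3))
print(np.allclose(c3, [0.0, 0.0, 1.0]))                               # True
```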

Now suppose we let $H_n = H_1\times\cdots\times H_{n-1}$ and calculate det(H(a)).
$$\begin{aligned}
\det(H(a)) &= \sum_{i=1}^n (-1)^{i+n}\, A^i(H_n)\,\det\!\left( H(a)^{1,\ldots,i-1,i+1,\ldots,n}_{1,\ldots,n-1}\right)\\
&= \sum_{i=1}^n (-1)^{i+n}\,\det\!\left( H(a)^{1,\ldots,i-1,i+1,\ldots,n}_{1,\ldots,n-1}\right)\left( \sum_{k=1}^n (-1)^{k+n}\,\det\!\left( H(a)^{1,\ldots,k-1,k+1,\ldots,n}_{1,\ldots,n-1}\right) G^{k,i}(a)\right)\\
&= \sum_{k=1}^n\sum_{i=1}^n (-1)^{i+n+k+n}\, G^{k,i}(a)\,\det\!\left( H(a)^{1,\ldots,i-1,i+1,\ldots,n}_{1,\ldots,n-1}\right)\det\!\left( H(a)^{1,\ldots,k-1,k+1,\ldots,n}_{1,\ldots,n-1}\right).
\end{aligned}$$

If a has been chosen so that $G_a$ is diagonal the last line becomes
$$\det(H(a)) = \sum_{i=1}^n G^{i,i}(a)\left(\det\!\left( H(a)^{1,\ldots,i-1,i+1,\ldots,n}_{1,\ldots,n-1}\right)\right)^2.$$

If H1 , . . . , Hn−1 constitutes an independent collection of vectors and the Gi,i (a)


all have the same sign, this is nonzero so H1 , . . . , Hn−1 , H1 × · · · × Hn−1 constitutes
an ordered basis of V . If the Gi,i (a) are all positive, that ordered basis has the
same orientation as a.
Considering a different issue, if σ is any volume element and we change ordered
basis from a to b then, as we saw earlier, the (single) coordinate changes according
to σ(b) = det(A)σ(a). This implies
a1 ∧ · · · ∧ an = det(A) b1 ∧ · · · ∧ bn .
So if det(A) = 1 then for any Hn


T (Hn ) = det(H(a)) = det(A) det(H(b)) = det(H(b)).

Going back to the issue of the formation of cross product, we conclude that step
one in the process of creating the coordinates of the cross product in an ordered
basis is the same, up to numerical multiple, from basis to basis. When the matrix
of transition has determinant 1, the volume element is the same for these two bases,
so step one of the process has the same form regardless of other aspects of the new
basis. Step two, however, involves the matrix Gb which changes as B Ga Bt .

25. An Isomorphism Between Λr (V ) and Λn−r (V )

We have seen that the dimension of the space $\Lambda_r(V)$ is $\frac{n!}{r!\,(n-r)!}$ for any r with
$0 \le r \le n$, which is the same as the dimensions of
$$\Lambda^r(V) = \Lambda_r(V^*) \quad\text{and}\quad \Lambda_{n-r}(V) \quad\text{and}\quad \Lambda^{n-r}(V) = \Lambda_{n-r}(V^*).$$
So they are all isomorphic as vector spaces.
Our goal in this section is to create a specific isomorphism between $\Lambda^r(V)$ and
$\Lambda_{n-r}(V)$, and also between $\Lambda_r(V)$ and $\Lambda^{n-r}(V)$, and we want this construction
to be independent of basis in some sense.
It is required that a volume element σ, a member of $\Lambda_n(V)$, be specified for the construction to proceed.
If a is any basis of V with dual basis a∗ the n-form a1 ∧ · · · ∧ an is a multiple of
σ. We presume that basis a has been chosen so that a1 ∧ · · · ∧ an = σ.
For a choice of ai1 , . . . , air we define
Formit a (ai1 ⊗ · · · ⊗ air ) = σ(ai1 , . . . , air , ·, . . . , ·) : V n−r → R.
It has various interesting properties. It is multilinear and alternating, so it is a
member of Λn−r (V ). Also, if P is any permutation of {1, . . . , r} then
Formit a (aiP (1) ⊗ · · · ⊗ aP (r) ) = sgn(P ) Formit a (ai1 ⊗ · · · ⊗ air ).
So if there are any repeated terms among the aik then Formit a (ai1 ⊗· · · ⊗air ) = 0.
Having defined Formit a on a basis of T0r (V ) consisting of tensors of the form
ai1 ⊗ · · · ⊗ air , we can extend the definition of Formit a by linearity to all of T0r (V ).
Any linear combination of members of Λn−r (V ) is also in Λn−r (V ), so the range of
Formit a is in Λn−r (V ).
We note (evaluate on (n − r)-tuples (aj1 , . . . ajn−r ) in V n−r ) that
Formit a (ai1 ⊗· · ·⊗air ) = Formit a (ai1 ∧· · ·∧air ) = sgn(P ) Formit a (aiP (1) ∧· · ·∧aiP (r) )
for any permutation P of {1, . . . , r} and any choice of i1 , . . . , ir .
We will only be interested here in Formit a evaluated on members of Λr (V ) and
we officially restrict, therefore, to this domain:
$$\mathrm{Formit}_a : \Lambda^r(V) \to \Lambda_{n-r}(V).$$
It is pretty easy to see that if ai1 , . . . , air is a listing of basis vectors in increasing
order and air+1 , . . . , ain is a listing of the basis vectors left unused among the
ai1 , . . . , air then

Formit a (ai1 ∧ · · · ∧ air )(air+1 , . . . , ain ) = σ(ai1 , . . . , air , air+1 , . . . , ain ) = ±1

while if any of the vectors air+1 , . . . , ain repeat a vector on the list ai1 , . . . , air then

Formit a (ai1 ∧ · · · ∧ air )(air+1 , . . . , ain ) = σ(ai1 , . . . , air , air+1 , . . . , ain ) = 0.

We deduce from these facts that Formit a (ai1 ∧ · · · ∧ air ) = ± air+1 ∧ · · · ∧ ain .

Recall that for each increasing index choice i1 , . . . , ir drawn from {1, . . . , n}
there is exactly one shuffle permutation P in Sr,n ⊂ Pn for which P (k) = ik for
k = 1, . . . , r. For this shuffle, we also have (by definition of shuffle) P (r + 1) <
P(r + 2) < · · · < P(n). There are n! members of $P_n$ but only $\frac{n!}{r!\,(n-r)!}$ members of
$S_{r,n}$.

So for our shuffle permutation P we can see that

Formit a (aP (1) ∧ · · · ∧ aP (r) ) = sgn(P ) aP (r+1) ∧ · · · ∧ aP (n) .

Suppose $T \in \Lambda^r(V)$ with
$$T = T^{i_1,\ldots,i_r}(a)\; a_{i_1}\wedge\cdots\wedge a_{i_r} \quad\text{(Sum on increasing indices.)} = \sum_{P\in S_{r,n}} T^{P(1),\ldots,P(r)}(a)\; a_{P(1)}\wedge\cdots\wedge a_{P(r)},$$
where the final sum above is only over the shuffle permutations. Then
$$\mathrm{Formit}_a(T) = T^{i_1,\ldots,i_r}(a)\;\mathrm{Formit}_a\!\left(a_{i_1}\wedge\cdots\wedge a_{i_r}\right) \quad\text{(Sum on increasing indices.)} = \sum_{P\in S_{r,n}} \operatorname{sgn}(P)\, T^{P(1),\ldots,P(r)}(a)\; a^{P(r+1)}\wedge\cdots\wedge a^{P(n)}.$$

This map is obviously onto Λn−r (V ) and in view of the equality of dimension
provides an isomorphism from Λr (V ) onto Λn−r (V ).

It is hardly pellucid that this isomorphism does not depend on specific choice of
basis a.

We let b be a second basis of V with σ = b1 ∧ · · · ∧ bn and form isomorphism


Formit b .

Suppose m1 , . . . mr is an increasing sequence and P is the shuffle permutation


P ∈ Sr,n with mj = P (j) for j = 1, . . . , r.
By the change of basis formula for wedge product on page 39 we have
$$\begin{aligned}
\mathrm{Formit}_a\!\left(b_{m_1}\wedge\cdots\wedge b_{m_r}\right) &= \mathrm{Formit}_a\!\left( \det\!\left(A^{j_1,\ldots,j_r}_{m_1,\ldots,m_r}\right)\, a_{j_1}\wedge\cdots\wedge a_{j_r}\right) \quad\text{(Sum over increasing indices only.)}\\
&= \sum_{Q\in S_{r,n}} \operatorname{sgn}(Q)\,\det\!\left(A^{Q(1),\ldots,Q(r)}_{m_1,\ldots,m_r}\right)\, a^{Q(r+1)}\wedge\cdots\wedge a^{Q(n)}\\
&= \sum_{Q\in S_{r,n}} \operatorname{sgn}(Q)\,\det\!\left(A^{Q(1),\ldots,Q(r)}_{P(1),\ldots,P(r)}\right) \cdot \det\!\left(A^{Q(r+1),\ldots,Q(n)}_{k_1,\ldots,k_{n-r}}\right)\, b^{k_1}\wedge\cdots\wedge b^{k_{n-r}}.
\end{aligned}$$
(Sum on the last line over both shuffles and increasing indices.)

We would like to deduce that the n-r-form in the last two lines above is
Formit b (bP (1) ∧ · · · ∧ bP (r) ) = sgn(P ) bP (r+1) ∧ · · · ∧ bP (n)
and we will prove this by showing that Formit a (bP (1) ∧· · ·∧bP (r) ) and Formit b (bP (1) ∧
· · · ∧ bP (r) ) agree when evaluated on any (n − r)-tuple of vectors of the form
(bi1 , . . . , bin−r ) for integers i1 , . . . , in−r between 1 and n.
First, if there is any duplication among these integers then alternating Formit a (bP (1) ∧
· · ·∧bP (r) ) and bP (r+1) ∧· · ·∧bP (n) must both be zero when evaluated at (bi1 , . . . , bin−r ).
So we may presume that all the i1 , . . . , in−r are distinct.
Further, we can re-order the vectors in the n-r-tuple and both Formit a (bP (1) ∧

· · · ∧ bP (r) )(bi1 , . . . , bin−r ) and bP (r+1) ∧ · · · ∧ bP (n) (bi1 , . . . , bin−r ) will be multi-
plied by the signum of the reordering permutation. So we may presume, to confirm
equality, that i1 , . . . , in−r is an increasing sequence.
If i1 , . . . , in−r contains any of the integers P (1), . . . , P (r) then
Formit b (bP (1) ∧ · · · ∧ bP (r) )(bi1 , . . . , bin−r ) = σ(bP (1) , . . . , bP (r) , bi1 , . . . , bin−r ) = 0.
Also in this case
Formit a (bP (1) ∧ · · · ∧ bP (r) )(bi1 , . . . , bin−r )
   
Q(1),...,Q(r) Q(r+1),...,Q(n)
X
= sgn(Q) det A P (1),...,P (r) det A i1 ,...,in−r .
Q∈Sr,n

By the generalization of the Laplace expansion formula found on page 41, this
is the determinant of a matrix related to A with columns reordered and with (at
least) two columns duplicated. It is, therefore, also zero.
So it remains to show equality when ij = P (r + j) for j = 1, . . . , n − r. In that
case
Formit a (bP (1) ∧ · · · ∧ bP (r) )(bP (r+1) , . . . , bP (n) ) = sgn(P ) det(A) = sgn(P )
while we also have
 
sgn(P ) bP (r+1) ∧ · · · ∧ bP (n) (bP (r+1) , . . . , bP (n) ) = sgn(P ).

In other words, Formit$_a$ and Formit$_b$ agree when evaluated on every (n − r)-tuple of basis vectors
and so are equal. We no longer need to (and will not) indicate choice of basis as
subscript on the function Formit a , using instead Formit σ with reference to the
specified volume element σ. By the way, had the other orientation been given by
choosing −σ as volume element, we find that Formit σ = −Formit −σ .
We extend Formit$_\sigma$ slightly and define
Formit$_\sigma$ : $\Lambda^0(V) \to \Lambda_n(V)$ to be the map sending a constant k to kσ.
We clarify the definition of Formit$_\sigma$ : $\Lambda^n(V) \to \Lambda_0(V)$. It is the map given by
$\mathrm{Formit}_\sigma(k\; a_1\wedge\cdots\wedge a_n) = k$ for constant k, where a is any basis with $a^1\wedge\cdots\wedge a^n = \sigma$.

Recapitulating, we have defined isomorphisms
$$\mathrm{Formit}_\sigma : \Lambda^r(V) \to \Lambda_{n-r}(V) \qquad\text{for } 0\le r\le n.$$
When $0 < r < n$ and for any basis a for which $\sigma = a^1\wedge\cdots\wedge a^n$ we have
$$\mathrm{Formit}_\sigma\!\left( a_{P(1)}\wedge\cdots\wedge a_{P(r)}\right) = \operatorname{sgn}(P)\; a^{P(r+1)}\wedge\cdots\wedge a^{P(n)}$$
for any shuffle permutation $P\in S_{r,n}$.
More generally, if b is any basis at all,
$$\mathrm{Formit}_\sigma\!\left( b_{P(1)}\wedge\cdots\wedge b_{P(r)}\right) = \operatorname{sgn}(P)\,\sigma(b_1, \ldots, b_n)\; b^{P(r+1)}\wedge\cdots\wedge b^{P(n)}$$
for any shuffle permutation $P\in S_{r,n}$.


Whenever a and b are bases for which $\sigma = a^1\wedge\cdots\wedge a^n = b^1\wedge\cdots\wedge b^n$ the change
of basis formula tells us that the two n-vectors $a_1\wedge\cdots\wedge a_n$ and $b_1\wedge\cdots\wedge b_n$ are
equal. This n-vector is related to and determined by σ, but is not the “raised index
version of σ.” There is no inner product here to raise or lower indices.
Using this n-vector we define, by means identical to those used above, isomorphisms
$$\mathrm{Vecit}_\sigma : \Lambda_r(V) \to \Lambda^{n-r}(V) \qquad\text{for } 0\le r\le n.$$
When $0 < r < n$ and for basis a for which $\sigma = a^1\wedge\cdots\wedge a^n$ we have
$$\mathrm{Vecit}_\sigma\!\left( a^{P(1)}\wedge\cdots\wedge a^{P(r)}\right) = \operatorname{sgn}(P)\; a_{P(r+1)}\wedge\cdots\wedge a_{P(n)}$$
for any shuffle permutation $P\in S_{r,n}$.
More generally, if b is any basis at all,
$$\mathrm{Vecit}_\sigma\!\left( b^{P(1)}\wedge\cdots\wedge b^{P(r)}\right) = \frac{\operatorname{sgn}(P)}{\sigma(b_1, \ldots, b_n)}\; b_{P(r+1)}\wedge\cdots\wedge b_{P(n)}$$
for any shuffle permutation $P\in S_{r,n}$.
One natural question is how Vecit$_\sigma$ is related to $\mathrm{Formit}_\sigma^{-1}$ for each r.

Using basis a with $\sigma = a^1\wedge\cdots\wedge a^n$,
$$\mathrm{Vecit}_\sigma\circ\mathrm{Formit}_\sigma\!\left(a_1\wedge\cdots\wedge a_r\right) = \mathrm{Vecit}_\sigma\!\left(a^{r+1}\wedge\cdots\wedge a^n\right) = \operatorname{sgn}(Q)\; a_1\wedge\cdots\wedge a_r$$
where Q is the permutation that converts the list
$$r+1, \ldots, n, 1, \ldots, r$$
to natural order. This is accomplished by $r(n-r)$ switches of consecutive members
on the list, and we find that
$$\mathrm{Vecit}_\sigma\circ\mathrm{Formit}_\sigma\!\left(a_1\wedge\cdots\wedge a_r\right) = (-1)^{r(n-r)}\; a_1\wedge\cdots\wedge a_r.$$

To resolve the issue for a generic basis tensor ai1 ∧ · · · ∧ air we can form a basis c
by permutation of the members of the basis a and apply this calculation (watching
signs if the permutation is odd) in this new basis c. The same formula holds.

Vecit$_\sigma$ ◦ Formit$_\sigma$ is $(-1)^{r(n-r)}$ times the identity map.

So if r is even or n is odd this is always the identity map. A minus sign is


introduced only in case n is even and r is odd.

26. The Hodge ∗ Operator and Hodge Duality

Our goal in this section is to identify explicit isomorphisms between $\Lambda_r(V)$ and
$\Lambda_{n-r}(V)$ and also between $\Lambda^r(V)$ and $\Lambda^{n-r}(V)$ for each r.
In this section we presume given an inner product G on V and an
orientation.
From these two specifications, we create the n-form σ = a1 ∧ · · · ∧ an
where a is an orthonormal basis of requisite orientation.
This is the world of direct Cartesian tensors if the inner product happens to be
positive definite, though in important cases it is not.
Our inner product has signature (p, q): that is, when diagonalized the matrix of
G has p positive and q = n − p negative entries.
For various orders r, we define “raising” isomorphisms $\mathrm{Raise} : \Lambda_r(V)\to\Lambda^r(V)$
and “lowering” isomorphisms $\mathrm{Lower} : \Lambda^r(V)\to\Lambda_r(V)$ by raising or lowering all
indices on each tensor, as described on page 51. These isomorphisms, for a given
r, are inverse to each other.
We define isomorphisms for various r values
$$\mathrm{Formit}_\sigma\circ\mathrm{Raise} : \Lambda_r(V)\to\Lambda_{n-r}(V) \qquad\text{and}\qquad \mathrm{Vecit}_\sigma\circ\mathrm{Lower} : \Lambda^r(V)\to\Lambda^{n-r}(V)$$
which will all be denoted ∗ and called “the” Hodge star operator.
The (n − r)-form $*\tau \in \Lambda_{n-r}(V)$ will be called the Hodge dual of $\tau \in \Lambda_r(V)$.
The (n − r)-vector $*x \in \Lambda^{n-r}(V)$ will be called the Hodge dual of $x \in \Lambda^r(V)$.
With our definition of ∗ on $\Lambda_r(V)$ we calculate that $*\,1 = \sigma$ and $*\,\sigma = (-1)^q\, 1$.
As this hints, and as was the case with Formit σ and Vecit σ , the Hodge operator
∗ is closely related to its inverse.
Our inner product G has signature (p, q), so q is the number of −1 entries on
the diagonal of the matrix of Ga = Ga for orthonormal basis a.
Suppose $0 < r < n$ and P is any shuffle permutation in $S_{r,n}$ and that exactly k
of the vectors $a_{P(1)}, \ldots, a_{P(r)}$ satisfy $G(a_{P(i)}, a_{P(i)}) = -1$. So $q - k$ of the vectors
$a_{P(r+1)}, \ldots, a_{P(n)}$ satisfy that equation. It follows that
$$\begin{aligned}
*\circ *\left( a_{P(1)}\wedge\cdots\wedge a_{P(r)}\right) &= \mathrm{Vecit}_\sigma\circ\mathrm{Lower}\circ\mathrm{Vecit}_\sigma\circ\mathrm{Lower}\left( a_{P(1)}\wedge\cdots\wedge a_{P(r)}\right)\\
&= \mathrm{Vecit}_\sigma\circ\mathrm{Lower}\circ\mathrm{Vecit}_\sigma\!\left( (-1)^k\; a^{P(1)}\wedge\cdots\wedge a^{P(r)}\right)\\
&= \mathrm{Vecit}_\sigma\circ\mathrm{Lower}\!\left( \operatorname{sgn}(P)\,(-1)^k\; a_{P(r+1)}\wedge\cdots\wedge a_{P(n)}\right)\\
&= \mathrm{Vecit}_\sigma\!\left( \operatorname{sgn}(P)\,(-1)^{q-k}(-1)^k\; a^{P(r+1)}\wedge\cdots\wedge a^{P(n)}\right)\\
&= \operatorname{sgn}(P)\,(-1)^{q-k}(-1)^k\,(-1)^{r(n-r)}\operatorname{sgn}(P)\; a_{P(1)}\wedge\cdots\wedge a_{P(r)}\\
&= (-1)^{r(n-r)+q}\; a_{P(1)}\wedge\cdots\wedge a_{P(r)}.
\end{aligned}$$

In other words,
$*\circ *$ is $(-1)^{r(n-r)+q}$ times the identity map.
The cases r = 0 and r = n, not covered above, are calculated directly with the
same outcome. Also, the calculation of $*\circ *\left( a^{P(1)}\wedge\cdots\wedge a^{P(r)}\right)$ is very similar,
again yielding the same conclusion.
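For example, take n = 3, G positive definite and a a properly oriented orthonormal basis, so q = 0 and $\sigma = a^1\wedge a^2\wedge a^3$. The shuffle formula gives, grade by grade,
$$*\,1 = \sigma,\qquad *\,a^1 = a^2\wedge a^3,\qquad *\,a^2 = -\,a^1\wedge a^3,\qquad *\,a^3 = a^1\wedge a^2,$$
$$*\,(a^2\wedge a^3) = a^1,\qquad *\,(a^1\wedge a^3) = -\,a^2,\qquad *\,(a^1\wedge a^2) = a^3,\qquad *\,\sigma = 1,$$
and the same formulas hold with every index moved to the opposite position for the ∗ acting on multivectors. Here $r(n-r)+q$ is even for every r, so ∗ ◦ ∗ is the identity in each grade, as stated. With n = 4 and a Lorentz signature (q = 1), by contrast, $r(n-r)+q = 5$ when r = 2, so ∗ ◦ ∗ = −1 on 2-forms.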

The Hodge ∗ can be used to form an inner product on each $\Lambda^r(V)$ and each
$\Lambda_r(V)$. We illustrate the construction on $\Lambda^r(V)$.
For each $x, y \in \Lambda^r(V)$ the n-vector $x\wedge *y$ is a multiple of the standard n-vector
$a_1\wedge\cdots\wedge a_n$. Define
$$\langle x, y\rangle_r \;\text{ to be } k \;\text{ when } x\wedge *y = k\; a_1\wedge\cdots\wedge a_n$$
where a is properly oriented and orthonormal.

< x, y >r is obviously linear in each factor separately.



(ai1 ∧ · · · ∧ air ) ∧ aj1 ∧ · · · ∧ ajn−r is nonzero exactly when all the subscripts
are different. So (ai1 ∧ · · · ∧ air ) ∧ (∗ aj1 ∧ · · · ∧ ajr ) is nonzero exactly when all the
subscripts i1 , . . . , ir are different and constitute a reordering of j1 , . . . , jr . Assuming
both are in increasing order, they must be the same indexing. The wedge product
is of the form aP (1) ∧ · · · ∧ aP (r) ∧ ∗ aP (1) ∧ · · · ∧ aP (r) for P ∈ Sr,n .


For any shuffle $P \in S_{r,n}$ let
$$M_a(P) = G\!\left(a_{P(1)}, a_{P(1)}\right) \cdots G\!\left(a_{P(r)}, a_{P(r)}\right).$$

In other words, Ma (P ) is the product of the diagonal entries of the matrix Ga (all 1
or −1) which correspond to positions P (j) along that diagonal for any j = 1, . . . , r.
We calculate
$$\begin{aligned}
x\wedge *y &= x^{i_1,\ldots,i_r}(a)\; a_{i_1}\wedge\cdots\wedge a_{i_r} \wedge *\left( y^{j_1,\ldots,j_r}(a)\; a_{j_1}\wedge\cdots\wedge a_{j_r}\right)\\
&= \sum_{P\in S_{r,n}} x^{P(1),\ldots,P(r)}(a)\; y^{P(1),\ldots,P(r)}(a)\;\operatorname{sgn}(P)\, M_a(P)\; a_{P(1)}\wedge\cdots\wedge a_{P(n)}\\
&= \left(\sum_{P\in S_{r,n}} x^{P(1),\ldots,P(r)}(a)\; y^{P(1),\ldots,P(r)}(a)\, M_a(P)\right) a_1\wedge\cdots\wedge a_n
\end{aligned}$$
which is obviously symmetrical: that is, $\langle x, y\rangle_r = \langle y, x\rangle_r$.


It also follows by inspection of the sum above that we actually have an inner
product: the bilinear function is nondegenerate and the standard basis of Λr (V )
for basis a is an orthonormal basis of Λr (V ) with respect to this inner product.
We note also that
$$\left\langle\, a_{P(1)}\wedge\cdots\wedge a_{P(r)}\;,\; a_{P(1)}\wedge\cdots\wedge a_{P(r)}\,\right\rangle_r = M_a(P).$$
In particular, for any signature (p, q) of G we will have G = < · , · >1 .
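For example, with n = 3, positive definite G and properly oriented orthonormal a,
$$\left(a_1\wedge a_2\right)\wedge *\left(a_1\wedge a_2\right) = \left(a_1\wedge a_2\right)\wedge a_3 = a_1\wedge a_2\wedge a_3,$$
so $\langle a_1\wedge a_2, a_1\wedge a_2\rangle_2 = 1 = M_a(P)$ for the shuffle P selecting $\{1, 2\}$. If instead $G(a_1, a_1) = -1$ (a signature with the negative entry listed first) the same computation gives $\langle a_1\wedge a_2, a_1\wedge a_2\rangle_2 = G(a_1, a_1)\,G(a_2, a_2) = -1$: the induced inner products on $\Lambda^r(V)$ are generally indefinite when G is.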

27. The Grassmann Algebra

Given a vector space V of dimension n ≥ 2 we define the Grassmann algebra


G(V ) for V to be the free sum

Λ0 (V ) ⊕ Λ1 (V ) ⊕ Λ2 (V ) ⊕ · · · ⊕ Λn (V ) ⊕ · · ·
together with wedge product as the multiplication on the algebra, calculated by
distribution of the nonzero terms of the formal sums, and where wedge product
against a scalar (a member of Λ0 (V )) is given by ordinary scalar multiplication.
The Grassmann algebra is also called the exterior algebra, with reference to
the wedge or exterior product.
We note that all summands beyond Λn (V ) consist of the zero tensor only.

This means, first, that members of G(V ) are all formal sums of the form $\theta = \sum_{i=0}^{\infty} \lceil\theta\rceil_i$ where each $\lceil\theta\rceil_i \in \Lambda_i(V)$. There can be at most n+1 nonzero summands.
$\lceil\theta\rceil_i$ is called the grade-i part of θ. This representation is unique for each θ: that
is, two members of G(V ) are equal if and only if their grade-i parts are equal for
i = 0, . . . , n.
Though Λr (V ) is not actually contained in G(V ), an isomorphic copy of each
Λr (V ) is in G(V ). We do not normally distinguish, unless it is critical for some
rare reason, between a member of Λr (V ) and the corresponding member of G(V )
which is 0 except for a grade-r part.
G(V ) is a vector space with scalar multiplication given by ordinary scalar mul-
tiplication distributed across the formal sum to each grade, while addition is cal-
culated by adding the parts of corresponding grade. The member of G(V ) which is
the zero form at each grade acts as additive identity.
If a is a basis of V then 1 together with the set of all
$$a^{i_1}\wedge\cdots\wedge a^{i_r}$$
for all r between 1 and n and increasing sequences $i_1, \ldots, i_r$ forms a basis for G(V ),
which therefore has dimension
$$1 + n + \cdots + \frac{n!}{r!\,(n-r)!} + \cdots + n + 1 = 2^n.$$

Looking at individual grades, θ ∧ τ for θ, τ ∈ G(V ) is given by
$$\lceil\theta\wedge\tau\rceil_i = \sum_{r=0}^{i} \lceil\theta\rceil_r \wedge \lceil\tau\rceil_{i-r}$$
for each i between 0 and n, while $\lceil\theta\wedge\tau\rceil_i$ is the zero i-form for i > n.
Sometimes formulae produce reference to a negative grade. Any part or coeffi-
cient involved is interpreted to be the appropriate form of zero, just as one would
do for a reference to a grade larger than n.
If θ is nonzero only in grade i while τ is nonzero only in grade j then θ ∧ τ can
only be nonzero in grade i + j. By virtue of this property, G(V ) is called a graded
algebra.
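Because the product is determined grade by grade in this way it is easy to compute with.
The following is a minimal Python sketch (an illustration only: the dictionary representation
and the names wedge_basis and wedge are ours, not notation from the text) which stores a
multi-form or multi-vector as a map from increasing index tuples to coefficients and
multiplies two of them by distributing across grades exactly as above.

def wedge_basis(I, J):
    # Wedge of two basis blades given by increasing index tuples I and J.
    # Returns (sign, K) with K increasing, or (0, None) if an index repeats.
    if set(I) & set(J):
        return 0, None
    merged = list(I) + list(J)
    # the sign is that of the permutation which sorts the concatenation,
    # computed here from its inversion count
    inversions = sum(1 for s in range(len(merged))
                       for t in range(s + 1, len(merged))
                       if merged[s] > merged[t])
    return (-1) ** inversions, tuple(sorted(merged))

def wedge(theta, tau):
    # theta and tau are dicts {increasing index tuple: coefficient};
    # the empty tuple () holds the grade-0 (scalar) part.
    out = {}
    for I, a in theta.items():
        for J, b in tau.items():
            sign, K = wedge_basis(I, J)
            if sign:
                out[K] = out.get(K, 0) + sign * a * b
    return {K: c for K, c in out.items() if c != 0}

# (2 + b_1) wedged with (b_2 + b_1 ∧ b_3), where b_1, b_2, b_3 are basis elements:
theta = {(): 2, (1,): 1}
tau = {(2,): 1, (1, 3): 1}
print(wedge(theta, tau))   # {(2,): 2, (1, 3): 2, (1, 2): 1}

The printed result says the product is 2 b_2 + 2 b_1 ∧ b_3 + b_1 ∧ b_2, the term with the
repeated factor b_1 having been discarded, in agreement with distribution of the nonzero
terms of the formal sums.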
Members of G(V ) are called multi-forms.
Replacing V by V ∗ produces the Grassmann algebra G(V ∗ ), whose members are
called multi-vectors and which also forms a graded algebra with wedge product.
If the field we are working with is R, we note that the Grassmann algebra of
multi-forms G(V ) is
R ⊕ V ∗ ⊕ Λ2 (V ) ⊕ · · · ⊕ Λn (V ) ⊕ {0} ⊕ · · ·
while the Grassmann algebra of multi-vectors G(V ∗ ) is
R ⊕ V ⊕ Λ2 (V ) ⊕ · · · ⊕ Λn (V ) ⊕ {0} ⊕ · · · .

When a volume element σ is specified, the two families of isomorphisms Formit_σ and
Vecit_σ from Section 25 induce (vector space) isomorphisms (when applied at each grade)
between G(V) and G(V∗).
Given the additional structure of an inner product G on V and a
choice of orientation we can, similarly, use the Hodge ∗ map to induce (vector
space) automorphisms on G(V ) and on G(V ∗ ).
The lowering and raising isomorphisms Lower and Raise induce vector space
isomorphisms between G(V ) and G(V ∗ ).
Further, at the end of Section 26 we saw how to use G to create inner products
on each Λr(V) and, using G⋆, on each Λr(V).
By defining for multi-vectors x and y in G(V∗) the sum

< x, y > = Σ_{i=0}^{n} ⟨ ⌈x⌉_i , ⌈y⌉_i ⟩_i
these inner products induce an inner product on all of G(V ∗ ).
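For example (taking the grade-0 pairing of two scalars to be their ordinary product), if a
is orthonormal for a positive definite G and x = 3 + a_1 while y = 1 + 2 a_1 − a_2, then
< x, y > = ⟨3, 1⟩_0 + ⟨a_1, 2 a_1 − a_2⟩_1 = 3 + 2 = 5. Parts of different grade never
interact: a scalar and a vector, for instance, are automatically orthogonal in G(V∗).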
An analogous procedure using the dual inner product G⋆ gives us an inner product on
the space G(V) of multi-forms.
The interior product on s-forms and s-vectors induces in the obvious way an
interior product on G(V ∗ ) and on G(V ), lowering the highest nonzero grade (if
any) of a multi-vector or multi-form by one.
For instance, if v ∈ V and θ ∈ G(V) we define

v ⌟ θ = Σ_{i=0}^{n} v ⌟ ⌈θ⌉_i.
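For instance, if θ = c + α + α ∧ β for a scalar c and 1-forms α and β then, assuming the
usual conventions (v ⌟ ω)(w) = ω(v, w) and (α ∧ β)(v, w) = α(v)β(w) − α(w)β(v), we get

v ⌟ θ = α(v) + α(v) β − β(v) α.

The scalar part contributes nothing, the grade-1 part α contributes the scalar α(v), and
the grade-2 part contributes the 1-form α(v) β − β(v) α.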
The eponymous Hermann Grassmann, around and after 1840, wrote a series of
works within which he essentially invented linear and multilinear algebra as we
know it, including the first use of vectors. His writing style was considered to be
unmotivated and opaque by the few who read his work. In this, as in other matters,
he was a bit ahead of his time.
The value of his ideas did not become widely known for seventy years, and then
mostly by reference, through the work of other mathematicians and physicists such
as Peano, Clifford, Cartan, Gibbs and Hankel.
Grassmann interpreted nonzero wedge products of vectors such as W = x_1 ∧ ··· ∧ x_r
as representing an r-dimensional subspace of V, in light of the fact that this wedge
product is characterized (up to constant multiple) by the property that, for y ∈ V,
W ∧ y = 0 if and only if y is in the span of x_1, …, x_r.
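For example, in R^3 with standard basis e take W = e_1 ∧ e_2. For y = c^1 e_1 + c^2 e_2 + c^3 e_3
we get W ∧ y = c^3 e_1 ∧ e_2 ∧ e_3, which is zero exactly when c^3 = 0, that is, exactly when
y lies in the span of e_1 and e_2.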
He thought of the various operations on the Grassmann algebra as constituting
a calculus of subspaces or “extensions,” combining and giving information about
geometrical objects and associated physical processes in a very direct and intuitive
way. For instance
( x_1 ∧ ··· ∧ x_r ) ∧ ( y_1 ∧ ··· ∧ y_s )
will be nonzero if and only if x1 , . . . , xr , y1 , . . . , ys form a linearly independent set
of vectors. In that case the wedge product determines the combined span of the
two spaces the factors represent, which will have dimension r + s.
On the other hand, if the combined span of the two spaces is all of V but
neither of the two tensors is, individually, 0 and if V is given an inner product and
orientation, the multi-vector
∗( ∗( x_1 ∧ ··· ∧ x_r ) ∧ ∗( y_1 ∧ ··· ∧ y_s ) )

determines the intersection of the two spaces, which will have dimension r + s − n.
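As a check in a familiar case, take V = R^3 with the usual inner product and orientation,
and let the two subspaces be represented by e_1 ∧ e_2 and e_2 ∧ e_3. Then ∗(e_1 ∧ e_2) = e_3
and ∗(e_2 ∧ e_3) = e_1, so ∗( ∗(e_1 ∧ e_2) ∧ ∗(e_2 ∧ e_3) ) = ∗( e_3 ∧ e_1 ) = e_2, which spans
the line where the two planes intersect and has dimension 2 + 2 − 3 = 1.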
His work, reformulated and clarified with modern notation, plays a large and
increasing role in many areas, from current theoretical physics to practical engi-
neering.
References
[1] Abraham, R. and Marsden, J., Foundations of Mechanics. Addison-Wesley Publishing Co.,
Reading, MA, 1985
[2] Aczél, J., Lectures on Functional Equations and Their Applications. Dover Publications, Inc.,
New York, NY, 2006
[3] Akivis, M. and Goldberg, V., An Introduction to Linear Algebra and Tensors. Dover Publi-
cations, Inc., New York, NY, 1972
[4] Arnold, V., Mathematical Methods of Classical Mechanics. Springer-Verlag New York, Inc.,
New York, NY, 1980
[5] Bishop, R. and Crittenden, R., Geometry of Manifolds. Academic Press, New York, NY, 1964
[6] Bishop, R. and Goldberg, S., Tensor Analysis on Manifolds. Dover Publications, Inc., New
York, NY, 1980
[7] Flanders, H., Differential Forms with Applications to the Physical Sciences. Dover Publica-
tions, Inc., New York, NY, 1989
[8] Hoffman, K. and Kunze, R., Linear Algebra. Prentice-Hall Inc., Englewood Cliffs, NJ, 1971
[9] Hungerford, T., Algebra. Springer-Verlag Inc., New York, NY, 1974
[10] José, J., and Saletan, E., Classical Dynamics: A Contemporary Approach. Cambridge Uni-
versity Press, Cambridge, U.K., 1998
[11] Misner, C. and Thorne, K. and Wheeler, J., Gravitation. W. H. Freeman and Co., San
Francisco, CA, 1973
[12] Nelson, E., Tensor Analysis. Princeton University Press, Princeton, NJ, 1967
[13] Spivak, M., Calculus on Manifolds. W.A. Benjamin, Inc., Menlo Park, CA, 1965
[14] Spivak, M., Differential Geometry Vol. I. Publish or Perish, Inc., Berkeley, CA, 1970
[15] Stacy, B. D., Tensors in Euclidean 3-Space from a Course Offered by J. W. Lee. Notes
transcribed and organized by B. D. Stacy, 1981
[16] Sussman, G., and Wisdom, J., and Mayer, M., Structure and Interpretation of Classical
Mechanics. The MIT Press, Cambridge, MA, 2001
[17] Synge, J. L., and Schild, A., Tensor Calculus. Dover Publications, Inc., New York, NY, 1978
[18] Wald, R., General Relativity. University of Chicago Press, Chicago, IL, 1984
[19] Weinreich, G., Geometrical Vectors. University of Chicago Press, Chicago, IL, 1998
Index
∗, 72 Ss,L , 37
1-form, 15 T00 (V ), 15
2-cycle, 31 Tsr (V ), 15
< , >r , 73 ⊗, 15
A, 8 |v|, 45
A∗ , 8 ∧, 33
Ã, 16 n-form, 35
A,
e 16
e n-vector, 35
A∗ , 8 sgn(P ), 31
A, 8 sign(M ), 24
At , 5 trace(T ), 27
Alt(T ), 31 Abraham, R., 77
Cβα T , 26 absolute tensor, 24
E, 7, 9 Aczél, J., 77
GL(n, R), 23 Akivis, M. A., 77
G? , 46 alternating tensor, 30
H1 × · · · × Hn−1 , 66 angle
P ara1 ,...,an , 64 from a positive definite inner product, 45
S ∧n , 61 “angle” operator, y, 41
Sym(T ), 31 antisymmetric tensor, 30
T ab cde f g , 49 antisymmetrization, 31
T i1 ...irj1 ...js , 49 aplomb, 6
T^{i_1 ,...,i_r}_{j_1 ,...,j_s}(a), 16 Arnold, V. I., 77
V ∗ , 8 axial
scalar, 25
Vsr , 14
tensor, 25
Vsr (k), 14
vector or covector, 25
V ola , 64
Λs (V ), 35 bad
Λs (V ), 35 idea, 64
♭, 46 notation, 50
Formit σ , 71 basis
Vecit σ , 71 dual, 8
G(V ), 74 ordered, 6
G(V ∗ ), 75 orthonormal ordered, 54
♯, 46 standard ordered basis for Rn , 6
δik , 10 standard ordered basis for Rn∗ , 7
det, 23 bilinear form, 43
det(H), 35 Bishop, R. L., 77
det Aji11,...,i r
,...,jr , 39
brackets on indices, 32
i1 ... in , 63 braiding map, 31
∂y^i/∂x^j, 11 capacity
⟨ , ⟩_{G,a}, 47 tensor, 25
⟨ , ⟩_{G⋆, a∗}, 47 Cartesian
⌈θ⌉_i, 74 character, 61
y, 41 tensor, 61
Rn , 6 Cauchy-Schwarz Inequality, 45
Rn∗ , 7 character
a∗ , 8 Cartesian, 61
e∗ , 7 direct Cartesian, 61
e, 6 tensor, 20
A, 10 component, 8
B, 10 conjugate
Hom(V, V ), 26 metric tensor, 47
Pn , 30 contact forms, 65
contraction functional
by one contravariant index and one linear, 7, 8
covariant index, 26
by two contravariant or two covariant general linear group, 23
indices, 50 general relativity, 66
contravariant, 15 geometry
vector, 15 Euclidean, 5, 45, 66
contravector, 15 Minkowski, 66
coordinate Goldberg, S. I., 77
map, 8, 9 Goldberg, V. V., 77
polynomial, 20 grade-i part, 74
polynomial for a tensor in a basis, 21 graded algebra, 75
coordinates Gram-Schmidt process, 58
of a functional in a dual basis, 9 Grassmann algebra, 74
of a tensor in an ordered basis, 17 Grassmann, Hermann, 76
of a vector in an ordered basis, 8 gymnastics
covariant, 15 mental, 23
vector, 15
Hamiltonian mechanics, 44
covector, 15
Hodge
Crittenden, R. J., 77
dual, 72
critters, 5
“star” (or simply ∗) operator, 72
cross product, 66
Hoffman, K. M., 77
degenerate, 44 Hungerford, T. W., 77
density
increasing
tensor, 25
index list, 36
determinant, 23, 35
index of a symmetric bilinear form, 56
direct
inner product, 44
Cartesian character, 61
Lorentzian, 66
Cartesian tensor, 61
on G(V ∗ ) and G(V ), 75
dot product, 51
on Λr (V ) and Λr (V ), 73
dual
Riemannian, 66
Hodge, 72
interior product, 41, 76
of Rn , 7
invariant, 15, 29
of Rn∗ , 7
of V , 8 Jacobian matrix, 11
of V ∗ , 9 José, J. V., 77
of basis, 8
Kronecker delta, 10
Einstein Kunze, R. A., 77
general relativity, 66
summation convention, 13 Laplace expansion, 39, 41
Euclidean Lee, J. W., 77
geometry, 5, 45, 66 length, 45
inner product on Rn , 51 Levi-Civita symbol, 63
vector space, 45 linear functional, 7
evaluation map, 7, 9 Lorentzian
even inner product, 66
permutation, 31 manifold, 66
relative tensor, 24 metric tensor, 66
weighting function, 24 lower index, 47
exterior
algebra, 74 manifold
product, 33 Lorentzian, 66
Riemannian, 66
fest Marsden, J. E., 77
index, 6 matrix
Flanders, H., 77 representation, 43, 52
form (n-form), 35 Mayer, M. E., 77
menagerie tensor (outer), 15
small, 9 wedge, 33
metric pseudoscalar, 25
Riemannian, 66 pseudotensor, 25
tensor, 44 pseudovector or pseudocovector, 25
Lorentzian, 66
Riemannian, 66 Quotient Theorem, 30
Minkowski geometry, 66
raise index, 47
Misner, C. W., 77
rank, 15
mixed, 15
regrettable
multi-forms, 75
multi-vector, 75 consequence, 33
multilinear, 14 relative tensor, 24
coordinate polynomial, 20 even, 24
multiplicative function, 23 odd, 24
representation
negative definite, 44 matrix, 43, 52
Nelson, E., 77 representative
non-negative, 44 of a functional, 9
non-positive, 45 of a tensor, 17
nondegenerate, 44 of a vector, 8
Riemannian
odd manifold, 66
permutation, 31 metric, 66
relative tensor, 24 metric tensor, 66
weighting function, 24
order, 15 Saletan, E. J., 77
contravariant, 15 scalar, 15
covariant, 15 Schild, A., 77
ordered basis, 6 shuffle permutation, 37, 69
orientation, 39, 61 signature of a symmetric bilinear form, 56
of orthonormal ordered basis, 65 similar
oriented matrices, 28
tensor, 25 similarity transformation, 24, 28
orthogonal matrix, 54 simple tensor, 16
orthogonally equivalent, 55 skew part, 31
orthonormal ordered basis skew symmetric matrix, 44
with respect to an inner product, 54, 57 skew symmetric tensor, 30
with respect to the usual inner product slot, 14
on Rn , 54 space-time, 66
outer product, 15 Spivak, M., 77
Stacy, B. D., 77
parallelepiped, 64 standard basis
Parallelogram of Λr (V ), 36
Law, 45 of Tsr (V ), 16
parallelogram, 64 Sussman, G. J., 77
permutation, 30 Sylvester’s Law of Inertia, 56
shuffle, 37, 69 symmetric
symbol, 63 matrix, 45
polar part, 31
tensor, 25 symmetric tensor, 30
Polarization Identity, 45 symmetrization, 31
positive definite, 44 symplectic form, 44
product Synge, J. L., 77
cross, 66
dot, 51 tensor, 15
exterior, 33 absolute, 24
inner, 44 alternating, 30
interior, 41, 76 antisymmetric, 30
axial, 25
capacity, 25
Cartesian, 61
character, 20
density, 25
direct Cartesian, 61
metric, 44
oriented, 25
product (outer), 15
pseudotensor, 25
relative, 24
even, 24
odd, 24
simple, 16
skew symmetric, 30
symmetric, 30
twisted, 25
Thorne, K. S., 77
tiny
proof, 28
trace, 27
transition
matrices, 10
transpose, 5
Triangle Inequality, 45
tweak, 14
twisted
tensor, 25
vector, 15
n-vector, 35
volume element, 61, 64
Wald, R. M., 77
wedge product, 33
weight
of a relative tensor, 24
weighting function, 24
Weinreich, G., 77
Wheeler, J. A., 77
Wisdom, J., 77