TENSORS
(DRAFT COPY)
LARRY SUSANKA
Contents
1. Some Notation
2. R^n, R^{n*} and R^{n**}
3. V, V^* and V^{**}
4. Change of Basis
5. The Einstein Summation Convention
6. Tensors and the (Outer) Tensor Product
7. Tensor Coordinates and Change of Basis
8. Tensor Character
9. Relative Tensors
10. An Identification of Hom(V, V) and T^1_1(V)
11. Contraction and Trace
12. Evaluation
13. Symmetry and Antisymmetry
14. A Basis for Λ^r(V)
15. Determinants and Orientation
16. Change of Basis for Alternating Tensors and Laplace Expansion
17. Evaluation in Λ_s(V) and Λ^s(V): the Interior Product
18. Bilinear Forms
19. An Isomorphism Induced by a Nondegenerate Bilinear Form
20. Raising and Lowering Indices
21. Four Facts About Tensors of Order 2
22. Matrices of Symmetric Bilinear Forms and 2-Forms
23. Cartesian Tensors
24. Volume Elements and Cross Product
25. An Isomorphism Between Λ^r(V) and Λ^{n−r}(V)
26. The Hodge ∗ Operator and Hodge Duality
27. The Grassmann Algebra
References
Index
1. Some Notation
In these notes we will be working with a few sets and functions repeatedly, so
we lay out these critters up front.
R^n: all n × 1 “column matrices” with real entries.
Standard ordered basis: e, given by e_1, . . . , e_n.
R^{n*}: all 1 × n “row matrices” with real entries.
Standard dual ordered basis: e^*, given by e^1, . . . , e^n.
The standard inner product on R^n generates the Euclidean geometry on R^n,
denoted x · y or ⟨x, y⟩ or x^t y for x, y ∈ R^n (“t” indicates transpose.)
V: a generic real n-dimensional vector space,
with two ordered bases a given by a_1, . . . , a_n and b given by b_1, . . . , b_n.
A : V → R^n defined by v = \sum_{i=1}^n A^i(v)\, a_i for any v ∈ V.
B : V → R^n defined by v = \sum_{i=1}^n B^i(v)\, b_i for any v ∈ V.
Matrix M has ij-th entry M^i_j; M itself can be denoted (M^i_j).
Define matrix A by A^i_j = A^i(b_j). Note b_j = \sum_{i=1}^n A^i(b_j)\, a_i = \sum_{i=1}^n A^i_j\, a_i.
Define matrix B by B^i_j = B^i(a_j). Note a_j = \sum_{i=1}^n B^i(a_j)\, b_i = \sum_{i=1}^n B^i_j\, b_i.
A and B are called matrices of transition.
We calculate that BA = I = (δ^i_j), the n × n identity matrix.
Suppose v = \sum_{i=1}^n x^i a_i = \sum_{i=1}^n y^i b_i ∈ V. Then
x^i = \sum_{j=1}^n y^j A^i_j = \sum_{j=1}^n y^j A^i(b_j)
and y^i = \sum_{j=1}^n x^j B^i_j = \sum_{j=1}^n x^j B^i(a_j).
If x and y in R^n represent v ∈ V in bases a and b respectively,
then y = Bx with B = (∂y^i/∂x^j), and x = Ay with A = (∂x^i/∂y^j).
Suppose θ = \sum_{j=1}^n g_j a^j = \sum_{j=1}^n f_j b^j ∈ V^*. Then
g_j = \sum_{i=1}^n f_i B^i_j = \sum_{i=1}^n f_i B^i(a_j) and f_j = \sum_{i=1}^n g_i A^i_j = \sum_{i=1}^n g_i A^i(b_j).
If τ and σ in R^{n*} represent θ ∈ V^* in bases a^* and b^* respectively,
then σ = τA and τ = σB.
Bilinear form G has matrix G_a = ((G_a)_{ij}) = (G_{i,j}(a)).
If invertible, define G^{i,j}(a) by G_a^{-1} = G^a = (G^{i,j}(a)).
We presume throughout that the reader has seen the basics of linear algebra,
including, at least, the concepts of ordered basis and dimension of a real vector
space and the fact that a linear transformation is determined by its effect
on an ordered basis of its domain. Facts about symmetric and skew symmetric
matrices and how they can be brought to standard forms and several facts about
determinants are referenced and used. Beyond that, the reader need only steel
him- or herself for the index-fest which (no way around it) ensues.
Rest assured that with some practice you too will be slinging index-laden mon-
strosities about with the aplomb of a veteran.
\[
x = \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix} = \sum_{i=1}^n x^i e_i
\qquad\text{where for each } i = 1, \dots, n \text{ we define } e_i =
\begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} \longleftarrow i\text{-th row.}
\]
We abandon the standard practice in basic calculus and algebra textbooks whereby
members of Rn are denoted by rows. The typographical convenience of this prac-
tice for the authors of these books is inarguable, saving the reams of paper in every
elementary class which would be devoted to whitespace in the textbook using our
convention. Our choice is the one most commonly adopted for the subject at hand.
Here, rows of numbers will have a different meaning, to be defined shortly.
Superscripts and subscripts abound in the discussion while exponents are scarce,
so it should be presumed that a notation such as x^i refers to the i-th coordinate or
i-th instance of x rather than the i-th power of x.

Any x ∈ R^n has one and only one representation as \sum_{j=1}^n x^j e_j where the x^j are
real numbers. The ordered list e_1, e_2, . . . , e_n is called an ordered basis of R^n by
virtue of this property, and is called the standard ordered basis of R^n, denoted e.
Any linear transformation σ from R^n to R^m is identified with an m × n matrix
(σ^i_j) with i-th row and j-th column entry given by σ^i_j = σ^i(e_j). Then
\[
\sigma(x) = \begin{pmatrix} \sigma^1(x) \\ \vdots \\ \sigma^m(x) \end{pmatrix}
= \sum_{i=1}^m \sigma^i(x)\, e_i
= \sum_{i=1}^m \sigma^i\!\left( \sum_{j=1}^n x^j e_j \right) e_i
= \sum_{i=1}^m \sum_{j=1}^n \sigma^i(e_j)\, x^j\, e_i
= \big( \sigma^i_j \big)\, x.
\]
So a linear transformation σ “is” left matrix multiplication by the m × n matrix
(σ^i_j) = (σ^i(e_j)). We will usually not distinguish between a linear transformation
from R^n to R^m and this matrix, writing either σ(x) or σx for σ evaluated at x.
Be aware, however, that a linear transformation is a function while matrix mul-
tiplication by a certain matrix is an arrangement of arithmetic operations which
can be employed, when done in specific order, to evaluate a function at one of its
domain members.
With this in mind, the set of linear transformations from R^n to R, also called
the dual of R^n, is identified with the set of 1 × n “row matrices,” which we will
denote R^{n*}. Individual members of R^{n*} are called linear functionals on R^n.
Any linear functional σ on R^n is identified with the row matrix
\[
(\sigma_1, \sigma_2, \dots, \sigma_n) = (\sigma(e_1), \sigma(e_2), \dots, \sigma(e_n))
\]
acting on R^n by left matrix multiplication:
\[
\sigma(x) = \sum_{j=1}^n \sigma(e_j)\, x^j = (\sigma_1, \sigma_2, \dots, \sigma_n) \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix} = \sigma x.
\]
The entries of row matrices are distinguished from each other by commas. These
commas are not strictly necessary, and can be omitted. But without them there
might be no obvious visual cue that you have moved to the next column:
(3 − 2x − 4 − y) versus (3, −2x, −4 − y).
R^{n*} itself has ordered basis e^* given by the list e^1, e^2, . . . , e^n where each e^i =
(e_i)^t, the transpose of e_i. The ordered basis e^* is called the standard ordered
basis of R^{n*}.

Any σ ∈ R^{n*} is \sum_{j=1}^n σ(e_j)\, e^j and this is the only representation
of σ as a linear combination of the e^j.
The “double dual” of R^n, denoted R^{n**}, is the set of linear transformations
from R^{n*} to R. These are also called linear functionals on R^{n*}. This set of
functionals is identified with R^n by the evaluation map E : R^n → R^{n**} defined for
x ∈ R^n by
\[
E(x)(\sigma) = \sigma(x) = \sigma x \qquad \forall\, \sigma \in R^{n*}.
\]
Note that if E(x) is the 0 transformation then x must be 0, and since R^n and
R^{n**} have the same dimension, E defines an isomorphism onto R^{n**}.
We will not normally distinguish between R^n and R^{n**}. The notations x(σ) and
σ(x) and σx will be used interchangeably for x ∈ R^n and σ ∈ R^{n*}.
Note that the matrix product xσ is an n × n matrix, not a real number. It
will normally be obvious from context if we intend functional evaluation x(σ) or
this matrix product. Members of R^n = R^{n**} “act on” members of R^{n*} by matrix
multiplication on the right.
3. V, V^* and V^{**}
A is called the coordinate map for the ordered basis a, and A(v) is said
to represent v in ordered basis a. The individual entries in A(v) are called
“the first coordinate,” “the second coordinate” and so on. The word component
is used synonymously with coordinate.
The function A associates \sum_{i=1}^n A^i(v)\, a_i ∈ V with \sum_{i=1}^n A^i(v)\, e_i ∈ R^n.
A is an isomorphism of V onto R^n. Denote its inverse by A^{-1}:
\[
A^{-1}(x) = \sum_{i=1}^n x^i a_i \in V \quad\text{when}\quad x = \sum_{i=1}^n x^i e_i \in R^n.
\]
For each j define a^j ∈ V^* to be the j-th coordinate function: a^j(v) = A^j(v). In
other words, a^j “picks off” the j-th coordinate of v ∈ V when you represent v
in terms of the ordered basis a. The ordered list a^1, . . . , a^n is said to be the
ordered basis dual to a.

If you know that a_1 is a member of an ordered basis, but do not
know a_2, . . . , a_n, you cannot determine a^1. The coefficient of a_1 in the sum
v = \sum_{i=1}^n A^i(v)\, a_i depends on every member of the ordered basis, not just a_1.
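A quick numeric illustration of this point, as a numpy sketch (the bases below are our own arbitrary choices): the rows of the inverse of the matrix whose columns are the a_i give the dual basis, so changing a_2 while keeping a_1 fixed changes a^1.

```python
import numpy as np

a1 = np.array([1.0, 0.0])

for a2 in (np.array([0.0, 1.0]), np.array([1.0, 1.0])):
    basis = np.column_stack([a1, a2])   # columns are a_1, a_2
    dual = np.linalg.inv(basis)         # row j holds the covector a^{j+1}
    print("a_2 =", a2, " gives a^1 =", dual[0])

# a^1 = [1, 0] in the first case but [1, -1] in the second:
# a^1 depends on the whole basis, not on a_1 alone.
```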
If σ ∈ R^{n*}, define a member of V^* by v ↦ σ(A(v)); this is the inverse (A^*)^{-1} of
the coordinate map A^* defined below:
\[
(A^*)^{-1}(\sigma) = \sum_{i=1}^n \sigma_i a^i \in V^* \quad\text{when}\quad \sigma = \sum_{i=1}^n \sigma_i e^i \in R^{n*},
\]
\[
A^*(\theta) = \sum_{i=1}^n g_i e^i \quad\text{when}\quad \theta = \sum_{i=1}^n g_i a^i.
\]
A^* is called the coordinate map for ordered basis a^*. The row matrix A^*(θ) =
( A^*_1(θ), . . . , A^*_n(θ) ) is said to represent θ in the dual ordered basis a^*.
The four isomorphisms described here are the coordinate map A : V → R^n (with
a_i ↔ e_i), the coordinate map A^* : V^* → R^{n*} (with a^i ↔ e^i), and their two
inverses.
4. Change of Basis
This means that the matrices A and B are inverse to each other.¹

¹ The size of the identity matrix is usually suppressed. If that might cause confusion the
notation I(n) = (δ^i_j(n)) can be used. The symbol δ^i_j is called the Kronecker delta.
So for each i,
\[
y^i = B^i(v) = \sum_{j=1}^n B^i(a_j)\, A^j(v) = \sum_{j=1}^n B^i(a_j)\, x^j.
\]
b^j “picks off” the j-th coordinate of v ∈ V when you represent v in terms of the
ordered basis b.
As before, the coordinate maps B : V → R^n (with b_i ↔ e_i) and B^* : V^* → R^{n*}
(with b^i ↔ e^i) are isomorphisms, pictured just as for a above.

That means the matrix C is the inverse of A. We already know that A^{-1} = B:
that is, C = B. A symmetrical result holds for a description of the a^j in terms of
b^*.
Now suppose we have a member θ = \sum_{j=1}^n f_j b^j = \sum_{j=1}^n g_j a^j in V^*.
\[
\theta = \sum_{k=1}^n g_k a^k = \sum_{k=1}^n g_k \sum_{j=1}^n A^k(b_j)\, b^j = \sum_{j=1}^n \left( \sum_{k=1}^n g_k A^k(b_j) \right) b^j = \sum_{j=1}^n \left( \sum_{k=1}^n g_k A^k_j \right) b^j.
\]
We can look at the same calculation from a slightly different standpoint. Suppose
σ, τ ∈ R^{n*} represent the same linear transformation θ ∈ V^* with respect to bases
b^* and a^* respectively.
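Before continuing, here is a minimal numpy sketch of this bookkeeping (the two bases of R^2 below are our own arbitrary choices), checking BA = I, y = Bx and σ = τA:

```python
import numpy as np

# Two ordered bases of R^2, stored as columns (arbitrary invertible choices).
a = np.array([[1.0, 1.0],
              [0.0, 2.0]])
b = np.array([[2.0, 0.0],
              [1.0, 1.0]])

A = np.linalg.solve(a, b)   # A^i_j = A^i(b_j), so b = aA and A = a^{-1} b
B = np.linalg.solve(b, a)   # B^i_j = B^i(a_j), so a = bB and B = b^{-1} a
assert np.allclose(B @ A, np.eye(2))          # BA = I

x = np.array([3.0, -1.0])                     # represents some v in basis a
y = B @ x                                     # represents the same v in basis b
assert np.allclose(a @ x, b @ y)              # both name the same vector

tau = np.array([[1.0, 4.0]])                  # represents some theta in basis a*
sigma = tau @ A                               # represents theta in basis b*
assert np.allclose(tau @ x, sigma @ y)        # theta(v) is basis independent
```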
All these arrows are reversible: V maps down to R^n by the coordinate maps A and B,
V^* maps down to R^{n*} by A^* and B^*, and in the coordinate spaces, left multiplication
by the matrix B carries the a-representation of a vector to its b-representation, while
right multiplication by the matrix A carries the a^*-representation of a covector to its
b^*-representation.
We now raise an important point on change of basis. R^n and R^{n*} are vector
spaces and we do not intend to preclude the possibility that V or V^* could be
either of these. Still, when R^n and R^{n*} are considered as ranges of A and A^*
we will never change basis there, but always use e and e^*. Though the
representative of a vector from V or V^* might change as we go along, we take the
point of view that this happens only as a result of a change in basis in V.
5. The Einstein Summation Convention

Under this convention, an index which appears exactly once as a superscript and
exactly once as a subscript in a term connotes summation over the range of that
index. In these notes we will never use this convention if, within a product, the
same symbol is used more than once as subscript or more than once as
superscript.

So for us a_i x_i is not a sum. Neither is a_i a_i x^i or (a_i + a_i) x^i. But (a_i)^2 x^i and
a_i x^i + a_i x^i do indicate summation on i, and the latter is the same as a_i x^i + a_j x^j.
If we intend symbols such as a_i x_i to denote a sum we must use the old conven-
tional notation \sum_{i=1}^n a_i x_i.

Sometimes the notation can be ambiguous. Rather than elaborate
on our convention, converting a notational convenience into a mnemonic nuisance,
we will simply revert to the old summation notation whenever confusion might
occur.
Be aware that many texts which focus primarily on tensor representations in-
volving special types of vector spaces—inner product spaces—do not require that
a subscript occur “above and below” if summation is intended, only that it be
repeated twice in a term. For them, a_i x_i is a sum. We will never use this version of
the summation convention.
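For readers following along in code, numpy's einsum mirrors the convention we do use (a small sketch with arbitrary data): an index letter repeated across operands is summed over its range.

```python
import numpy as np

a = np.array([2.0, 3.0, 5.0])   # coordinates a_i
x = np.array([1.0, 4.0, 6.0])   # coordinates x^i

s = np.einsum('i,i->', a, x)    # a_i x^i: summation on the repeated index i
assert np.isclose(s, np.dot(a, x))   # 2*1 + 3*4 + 5*6 = 44.0
```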
6. Tensors and the (Outer) Tensor Product

A tensor on V is a multilinear function with domain a product
V^r_s = V^* × · · · × V^* × V × · · · × V,
where r is the number of times that V^* occurs (they are all listed first) and s is the
number of times that V occurs. We will define V^r_s(k) by
\[
V^r_s(k) = \begin{cases} V^*, & \text{if } 1 \le k \le r; \\ V, & \text{if } r + 1 \le k \le r + s. \end{cases}
\]
Sometimes this product is called the outer product of the two tensors. T ⊗ T̃ is
of order r + r̃ + s + s̃, contravariant of order r + r̃ and covariant of order s + s̃.
This process of creating tensors of higher order is definitely not commutative:
T ⊗ T̃ ≠ T̃ ⊗ T in general. But this process is associative, so a tensor represented
as T ⊗ S ⊗ R is unambiguous.
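A two-line numpy sketch of the outer product and its non-commutativity (our own arbitrary coordinate arrays):

```python
import numpy as np

S = np.array([1.0, 2.0])            # coordinates of a vector
T = np.array([3.0, 4.0, 5.0])       # coordinates of another vector

ST = np.einsum('i,j->ij', S, T)     # (S ⊗ T)^{ij} = S^i T^j
TS = np.einsum('i,j->ij', T, S)     # (T ⊗ S)^{ij} = T^i S^j
assert ST.shape != TS.shape         # not even the same shape here
assert np.allclose(ST, TS.T)        # the entries are transposes of each other
```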
Suppose a is an ordered basis of V, identified with V^{**}, and a^* the dual ordered
basis of V^*. We use in the calculation below, for the first time, the Einstein
summation convention: repeated indices which occur exactly once as superscript
and exactly once as subscript in a term connote summation over the range of that
index.

If T ∈ T^r_s(V) and v = (v(1), . . . , v(r + s)) ∈ V^r_s then
\[
T(v) = T\big( A^*_{i_1}(v(1))\, a^{i_1}, \dots, A^*_{i_r}(v(r))\, a^{i_r},\;
A^{i_{r+1}}(v(r+1))\, a_{i_{r+1}}, \dots, A^{i_{r+s}}(v(r+s))\, a_{i_{r+s}} \big)
\]
² The word tensor is adapted from the French “tension,” meaning “strain” in English. Under-
standing the physics of deformation of solids was an early application.
From this exercise in multilinearity we conclude that the n^{r+s} tensors of the form
a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{i_{r+1}} ⊗ · · · ⊗ a^{i_{r+s}} span T^r_s(V) and it is fairly easy to show they
constitute a linearly independent set and hence a basis, the standard
basis for T^r_s(V) for ordered basis a.

The n^{r+s} indexed numbers T(a^{i_1}, . . . , a^{i_r}, a_{i_{r+1}}, . . . , a_{i_{r+s}}) determine the tensor T
and we say that this indexed collection of numbers “is” the tensor T in the ordered
basis a. We introduce the notation
\[
T^{i_1,\dots,i_r}_{i_{r+1},\dots,i_{r+s}}(a) = T(a^{i_1}, \dots, a^{i_r}, a_{i_{r+1}}, \dots, a_{i_{r+s}}),
\qquad
T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a) = T(a^{i_1}, \dots, a^{i_r}, a_{j_1}, \dots, a_{j_s}).
\]
For each ordered basis a, T^{\cdots}_{\cdots}(a) is a real valued function whose domain consists
of all the ordered (r+s)-tuples of integers from the index set and whose value on any
given index combination is shown above. So as a multiple sum over r + s different
integer indices
\[
T = T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\; a_{i_1} ⊗ \dots ⊗ a_{i_r} ⊗ a^{j_1} ⊗ \dots ⊗ a^{j_s}.
\]
A tensor which is the tensor product of r different vectors and s different cov-
ectors is of order t = r + s. Tensors which can be represented this way are called
simple. It is fairly obvious but worth noting that not all tensors are simple.
To see this, note that the selection of r vectors and s covectors to form a simple
tensor involves nt coefficients to represent these vectors and covectors in basis
a when V has dimension n. On the other hand, a general tensor of order t is
determined by the n^t numbers T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a). Solving n^t equations in nt unknowns is
generally not possible for integers n and t exceeding 1, unless n = t = 2.
Getting back to the discussion from above, we associate T with the representative
tensor ÃT ∈ T^r_s(R^n), where Ã denotes the invertible correspondence between
T^r_s(V) and T^r_s(R^n) determined by the coordinate maps for the ordered basis a.
If you wish to do a calculation with a tensor T to produce a number you can use
its representation ÃT, with respect to a given ordered basis a in V, as a tensor on
R^n. Specifically, we can evaluate T on a member (v(1), . . . , v(r + s)) ∈ V^r_s by
calculating the value of T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\, e_{i_1} ⊗ \dots ⊗ e_{i_r} ⊗ e^{j_1} ⊗ \dots ⊗ e^{j_s} on the member
(A^*(v(1)), . . . , A^*(v(r)), A(v(r+1)), . . . , A(v(r+s))) of (R^n)^r_s.
We have arrived at several closely related definitions of, or uses for, the creatures
like T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a) seen peppering books which apply tensors.

Most simply, they are nothing more than the numbers T(a^{i_1}, . . . , a^{i_r}, a_{j_1}, . . . , a_{j_s}),
the coefficients of the simple tensors a_{i_1} ⊗ · · · ⊗ a_{i_r} ⊗ a^{j_1} ⊗ · · · ⊗ a^{j_s} in the explicit
representation of the multilinear function, using the basis a.

More abstractly, T^{\cdots}_{\cdots} is a function from the set of all ordered bases of V to the
set of real valued functions whose domain is the n^{r+s} possible values of our index
set. The value of T^{\cdots}_{\cdots} on an ordered basis a is the function T^{\cdots}_{\cdots}(a).
The value of this function on an index combination is denoted T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a), and
these numbers are called the coordinates of T in the ordered basis a. For
various index combinations but fixed ordered basis these numbers might be anything
(though of course they depend on T and a.) These values are unrelated to each
other for different index combinations in the sense that they can be prescribed
at will or measured and found to be any number if you are creating a tensor from
scratch in a particular basis. But once they are all specified or known in one ordered
basis their values are determined in all other bases, and we shall see soon how these
numbers must change in a new ordered basis.
Finally, we can think of T^{\cdots}_{\cdots} as defining a function from the set of all ordered
bases of V to the set of tensors in T^r_s(R^n).

For each ordered basis a, T is assigned its own representative tensor ÃT ∈
T^r_s(R^n). Not every member of T^r_s(R^n) can be a representative of this particular ten-
sor using different bases. The real number coefficients must change in a coordinated
way when moving from ordered basis to ordered basis, as we shall see below.
There are advantages and disadvantages to each point of view.
The list of coefficients using a single basis would suffice to describe a tensor com-
pletely, just as the constant of proportionality suffices to describe a direct variation
of one real variable with another. But specifying the constant of proportionality
and leaving it at that misses the point of direct variation, which is a functional
relationship. Likewise, the coefficients which define a tensor are just numbers and
thinking of them alone downplays the relationship they are intended to represent.
Guided by the impression that people think more easily about relationships
taking place in some familiar environment you have the option of regarding tensors
as living in R^n through a representative tensor there, one representative for each
ordered basis of V. The utility of this viewpoint depends, of course, on familiarity
with tensors in R^n.
Suppose T is the same tensor from above and b is a new ordered basis of V with
dual ordered basis b^* of V^*. Representing this tensor in terms of both the old
ordered basis a and the new ordered basis b we have
\begin{align*}
T &= T(b^{w_1}, \dots, b^{w_r}, b_{h_1}, \dots, b_{h_s})\; b_{w_1} ⊗ \dots ⊗ b_{w_r} ⊗ b^{h_1} ⊗ \dots ⊗ b^{h_s} \\
&= T(a^{i_1}, \dots, a^{i_r}, a_{j_1}, \dots, a_{j_s})\; a_{i_1} ⊗ \dots ⊗ a_{i_r} ⊗ a^{j_1} ⊗ \dots ⊗ a^{j_s} \\
&= T(a^{i_1}, \dots, a^{i_r}, a_{j_1}, \dots, a_{j_s}) \cdot (B^{w_1}(a_{i_1})\, b_{w_1}) ⊗ \dots ⊗ (B^{w_r}(a_{i_r})\, b_{w_r})
⊗ (A^{j_1}(b_{h_1})\, b^{h_1}) ⊗ \dots ⊗ (A^{j_s}(b_{h_s})\, b^{h_s}).
\end{align*}
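Gathering coefficients gives the coordinate change rule; for a tensor of type (1,1) it reads T^w_h(b) = B^w_i T^i_j(a) A^j_h, which the following numpy sketch (arbitrary data) confirms is just a two-sided matrix product:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))     # matrix of transition A^i_j (invertible)
B = np.linalg.inv(A)                # B = A^{-1}
Ta = rng.standard_normal((n, n))    # coordinates T^i_j(a)

Tb = np.einsum('wi,ij,jh->wh', B, Ta, A)   # T^w_h(b) = B^w_i T^i_j(a) A^j_h
assert np.allclose(Tb, B @ Ta @ A)
```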
At this point it seems appropriate to discuss the meaning of covariant and con-
travariant tensors and how their coordinates vary under change of basis, with the
goal of getting some intuition about their differences. We illustrate this with tensors
of rank 1: contravariant vectors (vectors) and covariant vectors (covectors.)
The physical scenario we have in mind consists of two types of items in the air
in front of your eyes with you seated, perhaps, at a desk.
First, we have actual physical displacements, say of various dust particles that
you witness. Second, we have stacks of flat paper, numbered like pages in a book,
each stack having uniform “air gap” between pages throughout its depth, though
that gap might vary from stack to stack. We make no restriction about the angle
any particular stack must have relative to the desk. We consider these pages as
having indeterminate extent, perhaps very large pages, and the stacks to be as deep
as required, though of uniform density.
The magnitude of a displacement will be indicated by the length of a line segment
connecting start to finish, which we can give numerically should we decide on a
standard of length. Direction of the displacement is indicated by the direction of
the segment together with an “arrow head” at the finish point.
The magnitude of a stack will be indicated by the density of pages in the stack
which we can denote numerically by reference to a “standard stack” if we decide
on one. The direction of the stack is in the direction of increasing page number.
We now define a coordinate system on the space in front of you, measuring
distances in centimeters, choosing an origin, axes and so on in some reasonable way
with z axis pointing “up” from your desk.
Consider the displacement of a dust particle which moves straight up 100 cen-
timeters from your desk, and a stack of pages laying on your desk with density 100
pages per centimeter “up” from your desk.
If you decide to measure distances in meters rather than centimeters, the vertical
coordinate of displacement drops to 1, decreasing by a factor of 100. The numerical
value of the density of the stack, however, increases to 10,000.
When the “measuring stick” in the vertical direction increases in length by a
factor of 100, coordinates of displacement drop by that factor and displacement is
called contravariant because of this.
On the other hand, the stack density coordinate changes in the same way as the
basis vector length, so we would describe the stack objects as covariant.
We haven’t really shown that these stack descriptions and displacements can be
regarded as vector spaces, though no doubt you have seen the geometrical procedure
for scalar multiplication and vector addition of displacements. There are purely
geometrical ways of combining stacks to produce a vector space structure on stacks
too: two intersecting stacks create parallelogram “columns” and the sum stack has
sheets that extend the diagonals of these columns.
But the important point is that if stacks and displacements are to be thought of
as occurring in the same physical universe, and if a displacement is to be represented
as a member u of R^3, then a stack must be represented as a linear functional, a
member σ of R^{3*}.
There is a physical meaning associated with the number σu. It is the number
of pages of the stack corresponding to σ crossing the shaft of the displacement
corresponding to u, where this number is positive when the motion was in the
direction of increasing “page number.”
It is obvious on physical grounds that this number must be invariant: it cannot
depend on the vagaries of the coordinate system used to calculate it. That is the
meaning of
σu = σ (AB) u = (σA) (Bu)
and why tensor coordinates of the two types change as they do.
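A numeric sketch of this invariance (arbitrary data): a change of basis sends the column u to Bu and the row σ to σA, and the page count σu survives untouched.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))      # matrix of transition (invertible here)
B = np.linalg.inv(A)

u = rng.standard_normal((3, 1))      # a displacement, column in basis a
sigma = rng.standard_normal((1, 3))  # a stack (covector), row in basis a*

pages = (sigma @ u).item()
pages_new = ((sigma @ A) @ (B @ u)).item()   # same pairing in basis b
assert np.isclose(pages, pages_new)
```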
8. Tensor Character
which are the same by the ordinary commutative and distributive properties of real
numbers. Tensor product commutes using these conventions! Of course, what is
happening behind the scenes is that you have decided on an order for your four
covectors (perhaps alphabetical) and are simply sticking to it.
Many people find the silent mental gymnastics needed to keep the older notation
straight to be unsettling, and tend to lose track in really complex settings. The
gyrations we have chosen to replace them are, at least, explicit.
9. Relative Tensors
Certain types of processes that define a linear or multilinear function for each
basis don't seem to be that useful or reflect much other than the features of the
particular basis. For instance, if v = v^i(a)\, a_i ∈ V the functions
\[
f_a, g_a : V → R \quad\text{given by}\quad f_a(v) = \sum_{i=1}^n v^i(a) \quad\text{and}\quad g_a(v) = v^1(a)
\]
(same formula for any basis) are both linear but are not covectors.
They correspond to linear coordinate polynomials
x^1 + · · · + x^n and x^1 in any basis.
They certainly might be useful for something (giving us useful information about
each basis perhaps) but whatever that something might be it will probably not
describe “physical law.”
But among those processes that define a linear or multilinear function for each
basis, it is not only tensors which have physical importance or which have consis-
tent, basis-independent properties. We will discuss some of these entities (relative
tensors) after a few paragraphs of preliminary material.
The set of matrices of transition with matrix multiplication forms a group called
the general linear group over R. We denote this set with this operation by
GL(n, R).
A real valued function f : GL(n, R) → R is called multiplicative if f(CD) =
f(C)f(D) for every pair of members C and D of GL(n, R).
Any nontrivial (i.e. nonzero) multiplicative function must take the identity ma-
trix to 1, and a matrix and its inverse matrix must be sent to reciprocal real
numbers. Any such function must satisfy f(A^{-1}BA) = f(B), for instance.
Given two nontrivial multiplicative functions, both their product and their ratio
are multiplicative. If a nontrivial multiplicative function is always positive, any real
power of that function is multiplicative.
In this section we will be interested in several kinds of multiplicative functions.
First, let det denote the determinant function on square matrices. Its proper-
ties are explored in any linear algebra text. We find in [8] p.154 that det(MN) =
det(M) det(N) for compatible square matrices M and N. Since det(I) = 1 this
yields det(M^{-1}) = (det(M))^{-1} when M is invertible. We then find that
det(M^{-1}NM) = det(N).
The difference is that when switching to a new coordinate system b the coordi-
nates of a relative tensor change according to one of the two formulae
\[
T^{k_1,\dots,k_r}_{h_1,\dots,h_s}(b) = \operatorname{sign}(B)\, |\det(B)|^w\; T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\, B^{k_1}_{i_1} \cdots B^{k_r}_{i_r}\, A^{j_1}_{h_1} \cdots A^{j_s}_{h_s}
\]
or
\[
T^{k_1,\dots,k_r}_{h_1,\dots,h_s}(b) = |\det(B)|^w\; T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\, B^{k_1}_{i_1} \cdots B^{k_r}_{i_r}\, A^{j_1}_{h_1} \cdots A^{j_s}_{h_s}.
\]
Relative tensors of the first kind are called odd relative tensors of weight w
while tensors of the second kind are called even relative tensors of weight w.
If C is the matrix of transition from ordered basis b to ordered basis c then C B
is the matrix of transition from ordered basis a directly to ordered basis c.
Because of this, it is not hard to show that the coordinates T^{k_1,\dots,k_r}_{h_1,\dots,h_s}(c) for a
relative tensor in ordered basis c can be derived from the coordinates in ordered
basis a directly, or through the intermediary coordinates in ordered basis b using
two steps: that is, the multiplicative nature of the weight function is exactly what
we need for the coordinate change requirement to be consistent.
An even relative tensor of weight 0 is just a tensor, so the concept of relative
tensor generalizes that of tensor. If it needs to be emphasized, ordinary tensors
are called absolute tensors to distinguish them from relative tensors with more
interesting weighting functions.
³ See [2] page 349.
This subject has a long history and many applications, so a lot of vocabulary is
to be expected.
Odd relative tensors are described variously (and synonymously) as
• axial tensors
• pseudotensors
• twisted tensors
• oriented tensors.
The “axial” adjective seems more common when the relative tensor is of rank 0
or 1: axial scalars or axial vectors or covectors, but sometimes you will see
these relative tensors referred to as pseudoscalars or pseudovectors or
pseudocovectors.
An even relative tensor (for example an ordinary tensor) is sometimes called a
polar tensor.
If the weight w is positive (usually 1) the phrase tensor density is likely to be
used for the relative tensor, while if the weight is negative (often −1) we take a
tip from [19] and call these tensor capacities, though many sources refer to any
relative tensor of nonzero weight as a tensor density.
These adjectives can be combined: e.g., axial vector capacity or pseudoscalar
density.
The columns of the change of basis matrix B represent the old basis vectors a
in terms of the new basis vectors b. The number |det(B)| is often interpreted as
measuring the “relative volume” of the parallelepiped formed with edges along basis
a as compared to the parallelepiped formed using b as edges. So, for example, if the
new basis vectors b are all just twice as long as the basis vectors a, then |det(B)| = 2^{−n}.
With that situation in mind, imagine V to be filled with uniformly distributed
dust. Since the unit of volume carved out by basis b is so much larger, the measured
density of dust using basis b would be greater by a factor of 2n than that measured
using basis a. On the other hand, the numerical measure assigned to the “volume”
or “capacity” of a given geometrical shape in V would be smaller by just this factor
when measured using basis b as compared to the number obtained using basis a as
a standard volume.
The vocabulary “tensor density” for negative weights and “tensor capacity” for
positive weights acknowledges and names these (quite different) behaviors.
One can form the product of relative tensors (in a specified order) to produce
another relative tensor. The weight of the product is the sum of the weights of the
factors. The product relative tensor will be odd exactly when an odd number of
the factor relative tensors are odd.
Many of the constructions involving tensors apply also to relative tensors with
little or no change. These new objects will be formed by applying the construction
to the “pure tensor” part of the relative tensor and applying the ordinary distribu-
tive property (or other arithmetic) to any weighting function factors, as we did in
the last paragraph.
11. Contraction and Trace

At this point we are going to define a process called contraction which allows us
to reduce the order of a relative tensor by 2: by one covariant and one contravariant
order value. We will initially define this new tensor by a process which uses a basis.
Specifically, we consider T ∈ T^r_s(V) where both r and s are at least 1. Select
one contravariant index position α and one covariant index position β. We will
define the tensor C^α_β T, called the contraction of T with respect to the αth
contravariant and βth covariant index.

Define in a particular ordered basis a the numbers
\[
H^{i_1,\dots,i_{r-1}}_{j_1,\dots,j_{s-1}}(a) = T^{i_1,\dots,i_{\alpha-1},k,i_\alpha,\dots,i_{r-1}}_{j_1,\dots,j_{\beta-1},k,j_\beta,\dots,j_{s-1}}(a).
\]
This serves to define real numbers for all the required index values in T^{r-1}_{s-1}(V)
and so these real numbers H^{i_1,\dots,i_{r-1}}_{j_1,\dots,j_{s-1}}(a) do define some tensor in T^{r-1}_{s-1}(V).
Any process, including this one, that creates a multilinear functional in one
ordered basis defines a tensor: namely, the tensor obtained in other bases by trans-
forming the coordinates according to the proper pattern for that order of tensor.
It is worth belaboring the point raised in Section 8: that is not what people
usually mean when they say that a process creates or “is” a tensor. What they
usually mean is that the procedure outlined, though it might use a basis
inter alia, would produce the same tensor whichever basis had been
initially picked. We illustrate this idea by demonstrating the tensor character of
contraction.
Contracting T in the αth contravariant against the βth covariant coordinate in
the ordered basis b gives
\begin{align*}
H^{w_1,\dots,w_{r-1}}_{h_1,\dots,h_{s-1}}(b) &= T^{w_1,\dots,w_{\alpha-1},k,w_\alpha,\dots,w_{r-1}}_{h_1,\dots,h_{\beta-1},k,h_\beta,\dots,h_{s-1}}(b) \\
&= T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\, B^{w_1}_{i_1} \cdots B^{w_{\alpha-1}}_{i_{\alpha-1}}\, B^{k}_{i_\alpha}\, B^{w_\alpha}_{i_{\alpha+1}} \cdots B^{w_{r-1}}_{i_r}\;
A^{j_1}_{h_1} \cdots A^{j_{\beta-1}}_{h_{\beta-1}}\, A^{j_\beta}_{k}\, A^{j_{\beta+1}}_{h_\beta} \cdots A^{j_s}_{h_{s-1}}.
\end{align*}
We now note that B^k_{i_\alpha} A^{j_\beta}_k = \delta^{j_\beta}_{i_\alpha}. So we make four changes to the last line:
• Factor out the sum B^k_{i_\alpha} A^{j_\beta}_k and replace it with \delta^{j_\beta}_{i_\alpha}.
• Set q = j_\beta = i_\alpha to eliminate \delta^{j_\beta}_{i_\alpha} from the formula.
• Define \bar{i}_m = i_m if m = 1, \dots, \alpha − 1, and \bar{i}_m = i_{m+1} if m = \alpha, \dots, r − 1.
• Define \bar{j}_m = j_m if m = 1, \dots, \beta − 1, and \bar{j}_m = j_{m+1} if m = \beta, \dots, s − 1.
The last line in the equation from above then becomes
\begin{align*}
T^{\bar i_1,\dots,\bar i_{\alpha-1},q,\bar i_\alpha,\dots,\bar i_{r-1}}_{\bar j_1,\dots,\bar j_{\beta-1},q,\bar j_\beta,\dots,\bar j_{s-1}}(a)\,
B^{w_1}_{\bar i_1} \cdots B^{w_{r-1}}_{\bar i_{r-1}}\; A^{\bar j_1}_{h_1} \cdots A^{\bar j_{s-1}}_{h_{s-1}}
= H^{\bar i_1,\dots,\bar i_{r-1}}_{\bar j_1,\dots,\bar j_{s-1}}(a)\,
B^{w_1}_{\bar i_1} \cdots B^{w_{r-1}}_{\bar i_{r-1}}\; A^{\bar j_1}_{h_1} \cdots A^{\bar j_{s-1}}_{h_{s-1}}.
\end{align*}
This is exactly how the coordinates of H must change in going from ordered basis
a to b if the contraction process is to generate the same tensor when applied
to the coordinates of T in these two bases. Contraction can be applied to the
representation of a tensor in any ordered basis and will yield the same tensor.
So we define the tensor C^α_β T to be the tensor whose coordinates in any ordered
basis are obtained from the coordinates of T in that basis by the summation above.
Whenever T ∈ T^1_1 (for instance, a tensor such as Ψ(F) from Section 10 above)
we have an interesting case. Then the number (C^1_1 T)(a) = T^k_k(a) = T(a^k, a_k) is
called the trace of T (or the trace of F : V → V when T = Ψ(F) as formed in the
previous section.)

We have just defined a number trace(T) = C^1_1 T ∈ T^0_0 = R for each member T
of T^1_1(V). No basis is mentioned here: the trace does not depend on the basis. It
is an invariant.
This usage of the word “trace” is related to the trace of a matrix as follows.
Suppose θ ∈ V^* and v ∈ V. In the ordered basis a the tensor T is T^i_j(a)\, a_i ⊗ a^j
and θ and v have representations θ = θ_i a^i and v = v^i a_i. So if (T^i_j(a)) is the
matrix with these entries,
\[
T(\theta, v) = \theta_i\, T^i_j(a)\, v^j = A^*(\theta)\; \big(T^i_j(a)\big)\; A(v).
\]
In other words, down in R^n the tensor T^i_j(a)\, e_i ⊗ e^j is, essentially, “matrix
multiply in the middle” with the representative row matrix A^*(θ) of θ on the left
and the column A(v) on the right. trace(T) is the ordinary trace of this “middle
matrix,” the sum of its diagonal entries.
Any invertible matrix A (this requires a tiny proof ) can be used to generate a
new ordered basis b of V , and we recover by direct calculation the invariance of
trace under similarity transformations.
If T ∈ T^r_s(V) is any tensor with 0 < α ≤ r and 0 < β ≤ s, the contraction C^α_β T
is related to the trace of a certain matrix, though it is a little harder to think about
with all the indices floating around. First, we fix all but the αth contravariant and
βth covariant index, thinking of all the rest as constant for now. We define the
matrix M(a) to be the matrix with ij-th entry
\[
M^i_j(a) = T^{i_1,\dots,i_{\alpha-1},i,i_{\alpha+1},\dots,i_r}_{j_1,\dots,j_{\beta-1},j,j_{\beta+1},\dots,j_s}(a).
\]
So for each index combination with entries fixed except in these two spots, we
have a member of T^1_1(V) and M(a) is the matrix of its representative, using ordered
basis a, in T^1_1(R^n). The contraction is the trace of this matrix: the trace
of a different matrix for each index combination.
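A numpy sketch (arbitrary data) for T ∈ T^1_2, contracting the contravariant index against the first covariant index:

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((3, 3, 3))   # T[i, j1, j2] plays T^i_{j1 j2}(a)

H = np.einsum('kkj->j', T)           # H_j(a) = T^k_{k j}(a)

# The same numbers arise as traces of matrices, one matrix for each
# fixed value of the leftover index:
for j in range(3):
    assert np.isclose(H[j], np.trace(T[:, :, j]))
```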
12. Evaluation
Suppose T ∈ T^r_s(V), so that there are r slots for members of V^* and s slots for
members of V in the domain.
If θ ∈ V^* and 1 ≤ k ≤ r we can evaluate T at θ in its kth slot, yielding the
function
\[
E(\theta, k)(T) = T(\cdots, \theta, \cdots)
\]
where the dots indicate remaining “unfilled” slots of T, if any.
It is obvious that E(θ, k)(T) is also a tensor, and a member of T^{r−1}_s(V).
Suppose a is an ordered basis of V and θ = θ_i(a)\, a^i. Then
\[
T = T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\; a_{i_1} ⊗ \dots ⊗ a_{i_r} ⊗ a^{j_1} ⊗ \dots ⊗ a^{j_s}
\]
so
\begin{align*}
E(\theta,k)(T)(a) &= T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\; a_{i_1} ⊗ \dots ⊗ a_{i_{k-1}} ⊗ a_{i_k}(\theta) ⊗ a_{i_{k+1}} ⊗ \dots ⊗ a_{i_r} ⊗ a^{j_1} ⊗ \dots ⊗ a^{j_s} \\
&= T^{i_1,\dots,i_r}_{j_1,\dots,j_s}(a)\, \theta_{i_k}(a)\; a_{i_1} ⊗ \dots ⊗ a_{i_{k-1}} ⊗ a_{i_{k+1}} ⊗ \dots ⊗ a_{i_r} ⊗ a^{j_1} ⊗ \dots ⊗ a^{j_s} \\
&= G^{i_1,\dots,i_{r-1}}_{j_1,\dots,j_s}(a)\; a_{i_1} ⊗ \dots ⊗ a_{i_{r-1}} ⊗ a^{j_1} ⊗ \dots ⊗ a^{j_s}
\end{align*}
where
\[
G^{i_1,\dots,i_{r-1}}_{j_1,\dots,j_s}(a) = T^{i_1,\dots,i_{k-1},w,i_k,\dots,i_{r-1}}_{j_1,\dots,j_s}(a)\; \theta_w(a).
\]
G = E(θ, k)(T) is the tensor you would obtain by taking the tensor product of
T with θ and then contracting in the appropriate indices:
\[
G = E(\theta, k)(T) = C^k_{s+1}(T ⊗ \theta).
\]
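A numpy sketch of this identity (arbitrary data) for T ∈ T^2_1 with θ placed in the first contravariant slot:

```python
import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((3, 3, 3))   # T[i1, i2, j] plays T^{i1 i2}_j(a)
theta = rng.standard_normal(3)       # coordinates theta_w(a)

# Evaluate theta in the first slot directly ...
direct = np.einsum('aij,a->ij', T, theta)

# ... or form T ⊗ theta and contract the first contravariant index
# against the new last covariant index: C^1_2 (T ⊗ theta).
P = np.einsum('aij,w->aijw', T, theta)
via_product = np.einsum('aija->ij', P)
assert np.allclose(direct, via_product)
```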
Suppose that for each ordered basis of V you specify a multilinear function on
(R^n)^r_s. Suppose further that, for each v ∈ V^r_s, if you evaluate the multilinear
function corresponding to a basis at the coordinates corresponding to v in that
basis you always get the same (that is, basis independent) numerical output. Then
the process by which you created the multilinear function in each basis has tensor
character: you have defined a tensor in T^r_s(V).
Some sources find this observation important enough to name, usually calling it
(or something equivalent to it) The Quotient Theorem.4
Combining our previous remarks about products and contraction of relative ten-
sors, we see that evaluation of a relative tensor on vectors or covectors can be
performed at any of its slots, and the process does not change the weight of a
relative tensor. The properties “even” or “odd” persist unchanged by evaluation.
Evaluation of a relative tensor on relative vectors or relative covectors could
change both weight and whether the resulting relative tensor is even or odd, and
we leave it to the reader to deduce the outcome in this case.
13. Symmetry and Antisymmetry

It is obvious that, for a tensor of order 2, T(x, y) need not bear any special
relation to T(y, x). For one thing, T(x, y) and T(y, x) won't both be defined unless
the domain is either V^0_2 or V^2_0. But even then they need not be related in any
specific way.

A covariant or contravariant relative tensor T of order L is called symmetric
when switching exactly two of its arguments never changes its value. Specifically,
for each pair of distinct positive integers j and k not exceeding L and all v ∈ V^0_L
or V^L_0, whichever is the domain of T,
\[
T(v(1), \dots, v(L)) = T(v(1), \dots, v(j-1), v(k), v(j+1), \dots, v(k-1), v(j), v(k+1), \dots, v(L)).
\]
The following facts about permutations are discussed in many sources, including
[9] p. 46. A permutation of the set {1, . . . , L}, the set consisting of the first
L positive integers, is a one-to-one and onto function from this set to itself. Let
PL denote the set of all permutations on this set. The composition of any two
permutations is also a permutation. There are L! distinct permutations of a set
containing L elements. Any permutation can be built by switching two elements
⁴ Our statement has a generalization. Suppose that we identify a fixed collection of slots of the
multilinear function defined as above using a basis. Suppose that whenever you fill those particular
slots with coordinates of vectors/covectors in a basis the resulting contracted multilinear function
is independent of the basis: had you used coordinates in a different basis the new coefficients
would be the same as the old ones, properly transformed so as to have tensor character. Then the
means by which you formed the original coefficients has tensor character.
where the sum is over all L! members of the set of permutations P_L of {1, . . . , L}.
Antisymmetrization is accomplished by the coordinate-independent formula
\[
\operatorname{Alt}(T) = \frac{1}{L!} \sum_{P \in P_L} \operatorname{sgn}(P)\, T_P.
\]
Representing T as T_{i_1,\dots,i_L}(a)\, a^{i_1} ⊗ \dots ⊗ a^{i_L} in a basis, this gives
Sym(T) = T_{i_1,\dots,i_L}(a)\, Sym(a^{i_1} ⊗ \dots ⊗ a^{i_L}) and
Alt(T) = T_{i_1,\dots,i_L}(a)\, Alt(a^{i_1} ⊗ \dots ⊗ a^{i_L}).
In some sources, the Alt operation is indicated on coordinates of a tensor by
using square brackets on the indices, such as T_{[i_1,\dots,i_L]}. So
\[
T_{[i_1,\dots,i_L]}(a) = \operatorname{Alt}(T)_{i_1,\dots,i_L}(a).
\]
In these sources round brackets may be used to indicate symmetrization:
\[
T_{(i_1,\dots,i_L)}(a) = \operatorname{Sym}(T)_{i_1,\dots,i_L}(a).
\]
One also sees these notations spread across tensor products as in the expression
S_{[i_1,\dots,i_m} T_{j_1,\dots,j_L]}, used to indicate the coordinates of Alt(S ⊗ T).
It is not necessary to involve all indices in these operations. Sometimes you may
see brackets enclosing only some of the indices, as
\[
T_{[i_1,\dots,i_k],i_{k+1},\dots,i_L}(a) \quad\text{or}\quad T_{(i_1,\dots,i_k),i_{k+1},\dots,i_L}(a).
\]
The intent here is that these tensors are calculated on a selection of L vectors
by permuting the vectors in the first k “slots” of T in all possible ways, leaving the
last L − k arguments fixed. After summing the results of this (with factor “−1”
where appropriate) divide by k!.
In the calculation to follow, we let Q = P^{-1} for a generic permutation P, and
note that sgn(P) = sgn(Q). For a particular basis tensor a^{i_1} ⊗ · · · ⊗ a^{i_L},
\begin{align*}
\operatorname{Alt}(a^{i_1} ⊗ \dots ⊗ a^{i_L})(v(1), \dots, v(L))
&= \frac{1}{L!} \sum_{P \in P_L} \operatorname{sgn}(P)\, (a^{i_1} ⊗ \dots ⊗ a^{i_L})(v(P(1)), \dots, v(P(L))) \\
&= \frac{1}{L!} \sum_{P \in P_L} \operatorname{sgn}(P)\, v(P(1))^{i_1} \cdots v(P(L))^{i_L} \\
&= \frac{1}{L!} \sum_{Q \in P_L} \operatorname{sgn}(Q)\, v(1)^{i_{Q(1)}} \cdots v(L)^{i_{Q(L)}} \\
&= \frac{1}{L!} \sum_{Q \in P_L} \operatorname{sgn}(Q)\, (a^{i_{Q(1)}} ⊗ \dots ⊗ a^{i_{Q(L)}})(v(1), \dots, v(L)).
\end{align*}
So
\begin{align*}
\operatorname{Alt}(T) &= \frac{1}{L!} \sum_{Q \in P_L} \operatorname{sgn}(Q)\, T_{i_1,\dots,i_L}(a)\; a^{i_{Q(1)}} ⊗ \dots ⊗ a^{i_{Q(L)}} \\
&= \frac{1}{L!} \sum_{Q \in P_L} \operatorname{sgn}(Q)\, T_{i_{Q(1)},\dots,i_{Q(L)}}(a)\; a^{i_1} ⊗ \dots ⊗ a^{i_L}.
\end{align*}
(Sum on an index pair i_m and i_{Q(q)} whenever m = Q(q).)
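Here is a brief Python sketch of Alt acting on the coordinate array of a covariant tensor (the helper names are ours):

```python
import math
from itertools import permutations
import numpy as np

def sgn(p):
    """Sign of the permutation p, given as a tuple of 0, ..., L-1."""
    s, q = 1, list(p)
    for i in range(len(q)):
        while q[i] != i:
            j = q[i]
            q[i], q[j] = q[j], q[i]
            s = -s
    return s

def alt(T):
    """Antisymmetrize the coordinate array T_{i_1,...,i_L}(a)."""
    L = T.ndim
    return sum(sgn(P) * np.transpose(T, P)
               for P in permutations(range(L))) / math.factorial(L)

T = np.random.default_rng(4).standard_normal((3, 3, 3))
A = alt(T)
assert np.allclose(A, -A.swapaxes(0, 1))   # swapping two slots flips the sign
```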
The wedge product is in some sources called the exterior product, not to be
confused with the ordinary tensor (outer) product.
It is not too hard to show that if S and R are both covariant relative tensors of
order s with the same weighting function and if T is a covariant relative tensor of
order t and if k is a constant then
\[
S ∧ T = (−1)^{st}\, T ∧ S \quad\text{and}\quad (kS + R) ∧ T = k\, S ∧ T + R ∧ T.
\]
First, any member Q of P_L is (Q ◦ P^{-1}) ◦ P and so can be written in k! different
ways as R ◦ P where R = Q ◦ P^{-1} for various P ∈ P_k.
Second, sgn(P^{-1}) = sgn(P) and sgn(R ◦ P) = sgn(R)\,sgn(P).

Suppose R = R_{i_1,\dots,i_r}(a)\, a^{i_1} ⊗ \dots ⊗ a^{i_r} and S = S_{j_1,\dots,j_s}(a)\, a^{j_1} ⊗ \dots ⊗ a^{j_s} and
T = T_{k_1,\dots,k_t}(a)\, a^{k_1} ⊗ \dots ⊗ a^{k_t} are any covariant tensors.
By linearity of Alt and the result above we find that
We remark in passing that the argument above is easily adapted to show that
Sym satisfies a similar product formula.

If β^1, . . . , β^L are members of V^* then
\begin{align*}
β^1 ∧ \dots ∧ β^L(v_1, \dots, v_L) &= \sum_{Q \in P_L} \operatorname{sgn}(Q)\, (β^{Q(1)} ⊗ \dots ⊗ β^{Q(L)})(v_1, \dots, v_L) \\
&= \sum_{Q \in P_L} \operatorname{sgn}(Q)\, β^{Q(1)}(v_1) \cdots β^{Q(L)}(v_L) = \det\big( β^i(v_j) \big)
\end{align*}
where det stands for the determinant function⁵, in this case applied to the L × L
matrix with i-th row, j-th column entry β^i(v_j).
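A numpy check of this determinant formula, with our own arbitrary covectors and vectors:

```python
from itertools import permutations
import numpy as np

def sgn(p):
    # sign of a permutation tuple of 0, ..., L-1
    s, q = 1, list(p)
    for i in range(len(q)):
        while q[i] != i:
            j = q[i]
            q[i], q[j] = q[j], q[i]
            s = -s
    return s

rng = np.random.default_rng(5)
L = 3
beta = rng.standard_normal((L, L))   # row i holds the covector beta^{i+1}
v = rng.standard_normal((L, L))      # column j holds the vector v_{j+1}

wedge = sum(sgn(Q) * np.prod([beta[Q[k]] @ v[:, k] for k in range(L)])
            for Q in permutations(range(L)))
assert np.isclose(wedge, np.linalg.det(beta @ v))   # det(beta^i(v_j))
```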
And it follows that if Q ∈ P_L then the wedge product of 1-forms satisfies
\[
β^{Q(1)} ∧ \dots ∧ β^{Q(L)} = \operatorname{sgn}(Q)\; β^1 ∧ \dots ∧ β^L.
\]
Among the coordinates of an alternating covariant tensor R of order r, the only
ones which can be nonzero are those in which the list of index values consists of r
distinct integers. Let's focus on a particular list i_1, . . . , i_r, and for specificity we
choose them so that 1 ≤ i_1 < · · · < i_r ≤ n. A list of indices like this will be called
increasing.
The part of the sum given for R in the a coordinates involving any way of
rearranging this particular list of integers is
\begin{align*}
&\sum_{Q \in P_r} R_{i_{Q(1)},\dots,i_{Q(r)}}(a)\; a^{i_{Q(1)}} ⊗ \dots ⊗ a^{i_{Q(r)}} \quad\text{(sum on Q only)} \\
&\qquad = R_{i_1,\dots,i_r}(a) \sum_{Q \in P_r} \operatorname{sgn}(Q)\; a^{i_{Q(1)}} ⊗ \dots ⊗ a^{i_{Q(r)}}. \quad\text{(sum on Q only)}
\end{align*}
Second, when we gather together all the terms as above, the coefficient on
a^{i_1} ∧ · · · ∧ a^{i_r} (increasing indices) is the same as the coefficient on a^{i_1} ⊗ · · · ⊗ a^{i_r}
in the representation of alternating R as a member of T^0_r(V) using or-
dered basis a.
It remains to show that these increasing-index wedge products are linearly inde-
pendent, and so form a basis of Λ^r(V).
Suppose we have a linear combination of these tensors
\[
T_{i_1,\dots,i_r}(a)\; a^{i_1} ∧ \dots ∧ a^{i_r}. \quad\text{(Sum here on increasing indices only.)}
\]
Evaluating this tensor at a particular (a_{j_1}, \dots, a_{j_r}) with 1 ≤ j_1 < · · · < j_r ≤ n we
find that all terms are 0 except possibly
\[
T_{j_1,\dots,j_r}(a)\; a^{j_1} ∧ \dots ∧ a^{j_r}(a_{j_1}, \dots, a_{j_r}). \quad\text{(This is not a sum.)}
\]
Expanding this wedge product in terms of Alt and evaluating, we find that this is
T_{j_1,\dots,j_r}(a). So if this linear combination is the zero tensor, every coefficient is 0:
that is, these increasing wedge products form a linearly independent set.
So tensors of the form a^{i_1} ∧ · · · ∧ a^{i_r} where 1 ≤ i_1 < · · · < i_r ≤ n constitute a
basis of Λ^r(V), the standard basis of Λ^r(V) for ordered basis a.
\[
R_{i_1,\dots,i_r}(a)\; a^{i_1} ⊗ \dots ⊗ a^{i_r} \quad\text{(R is alternating, sum here on all indices)}
\]
\[
= R_{i_1,\dots,i_r}(a)\; a^{i_1} ∧ \dots ∧ a^{i_r}. \quad\text{(Sum here on increasing indices only.)}
\]
There are n!/(n−r)! ways to form r-tuples of distinct numbers between 1 and n.
There are r! distinct ways of rearranging any particular list of this kind, so there
are n!/((n−r)!\, r!) ways of selecting r increasing indices from among n index values.
So the dimension of Λ^r(V) is
\[
\frac{n!}{(n-r)!\; r!} = \binom{n}{r}.
\]
This is a huge sum involving indices for which i1 , . . . , is is in increasing order and
also is+1 , . . . , iL is in increasing order, as well as the sum over permutations.
Suppose i1 , . . . , is , is+1 , . . . , iL is a particular choice of such indices.
Although it does not affect the result below, as a practical matter if performing
one of these calculations by hand we note that if any index value among the i1 , . . . , is
is repeated in is+1 , . . . , iL the result after summation on the permutations will be
0 so we may discard such terms and restrict our attention to terms with entirely
distinct indices.
If P and Q are two permutations in PL and {Q(1), Q(2), . . . , Q(s)} is the same set
as {P (1), P (2), . . . , P (s)} then P can be transformed into Q by composition with a
sequence of 2-cycles that switch members of the common set {Q(1), Q(2), . . . , Q(s)},
followed by a sequence of 2-cycles that switch members of the set {Q(s + 1), Q(s +
2), . . . , Q(L)}. Each of these switches, if applied to a term in the last sum above,
would introduce a minus sign factor in one of the two wedge product factors, and
also introduce a minus sign factor into sgn(P ). The net result is no change, so each
permutation can be transformed into an s-shuffle permutation without altering the
sum above. Each shuffle permutation would be repeated s! t! times.
We conclude:
\[
S ∧ T = \sum_{P \in S_{s,L}} \operatorname{sgn}(P)\, (S ⊗ T)_P
\]
when S and T are alternating, and where S_{s,L} denotes the s-shuffle permutations.
You may recall various properties we cited earlier from [8] p.154: det(MN) =
det(M) det(N) for compatible square matrices M and N, det(I) = 1 and the fact
that det(N) = det(M^{-1}NM) when M and N are compatible and M invertible.
A couple of other facts are also useful when dealing with determinants.
First, by examining the effect of replacing permutations P by Q = P^{-1} we see
that
\[
\det(M) = \sum_{P \in P_n} \operatorname{sgn}(P)\, M^{P(1)}_1 \cdots M^{P(n)}_n = \sum_{Q \in P_n} \operatorname{sgn}(Q)\, M^1_{Q(1)} \cdots M^n_{Q(n)}
\]
which implies that the determinant of a matrix and its transpose are equal.
because each tensor in the sum is alternating and there is a repeated factor in each
wedge product.

On the other hand, suppose b^1, . . . , b^r is a linearly independent list of members
of V^*. Then this list can be extended to an ordered basis of V^* dual to an ordered
basis b of V. The n-form b^1 ∧ · · · ∧ b^n is nonzero so
\[
0 ≠ b^1 ∧ \dots ∧ b^n = b^1 ∧ \dots ∧ b^r ∧ b^{r+1} ∧ \dots ∧ b^n
\]
⁶ Note the similarity to the change of basis formula for a tensor capacity in the case of n-forms,
and tensor density in the case of n-vectors. The difference, of course, is that here the determinant
incorporates the combined effect of the matrices of transition, and is not an extra factor.
The nonzero terms in the expanded wedge product consist of all and only those
pairs of increasing sequences for which j_1, . . . , j_r, k_1, . . . , k_{n−r} constitutes an r-
shuffle permutation of 1, . . . , n. So
\begin{align*}
&\det\!\Big( A^{j_1,\dots,j_r}_{1,\dots,r} \Big)\, a^{j_1} ∧ \dots ∧ a^{j_r} ∧ \det\!\Big( A^{k_1,\dots,k_{n-r}}_{r+1,\dots,n} \Big)\, a^{k_1} ∧ \dots ∧ a^{k_{n-r}} \\
&\text{(sum on increasing indices only in the line above)} \\
&\quad = \sum_{Q \in S_{r,n}} \det\!\Big( A^{Q(1),\dots,Q(r)}_{1,\dots,r} \Big) \det\!\Big( A^{Q(r+1),\dots,Q(n)}_{r+1,\dots,n} \Big)\, a^{Q(1)} ∧ \dots ∧ a^{Q(n)} \\
&\text{(sum on shuffle permutations only in the line above and the line below)} \\
&\quad = \sum_{Q \in S_{r,n}} \operatorname{sgn}(Q) \det\!\Big( A^{Q(1),\dots,Q(r)}_{1,\dots,r} \Big) \det\!\Big( A^{Q(r+1),\dots,Q(n)}_{r+1,\dots,n} \Big)\; a^{1} ∧ \dots ∧ a^{n}.
\end{align*}
We conclude that
\[
\det(A) = \sum_{Q \in S_{r,n}} \operatorname{sgn}(Q) \det\!\Big( A^{Q(1),\dots,Q(r)}_{1,\dots,r} \Big) \det\!\Big( A^{Q(r+1),\dots,Q(n)}_{r+1,\dots,n} \Big).
\]
Note that the Laplace expansion around the first column appears when r = 1.
More generally, if P is any permutation of 1, . . . , n then
\[
\det(A) = \sum_{Q \in S_{r,n}} \operatorname{sgn}(P)\, \operatorname{sgn}(Q)\, \det\!\Big( A^{Q(1),\dots,Q(r)}_{P(1),\dots,P(r)} \Big) \det\!\Big( A^{Q(r+1),\dots,Q(n)}_{P(r+1),\dots,P(n)} \Big)
\]
which recovers, among other things, the Laplace expansion around any column.
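As a sketch, here is the r = 1 case—cofactor expansion down the first column—in plain Python/numpy (recursion chosen for clarity, not efficiency):

```python
import numpy as np

def det_laplace(M):
    """Determinant via Laplace expansion around the first column."""
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for i in range(n):
        minor = np.delete(np.delete(M, i, axis=0), 0, axis=1)
        total += (-1) ** i * M[i, 0] * det_laplace(minor)
    return total

M = np.random.default_rng(6).standard_normal((4, 4))
assert np.isclose(det_laplace(M), np.linalg.det(M))
```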
If w is another vector,
\[
w ⌟ (v ⌟ θ) = θ_{1,2}\, v^1 w^2 − θ_{1,2}\, v^2 w^1 = −\, v ⌟ (w ⌟ θ).
\]
In general,
\[
v ⌟ (w ⌟ θ) = −\, w ⌟ (v ⌟ θ)
\]
as can be seen by examining the pair of evaluations θ(v, w, . . .) and θ(w, v, . . .).
This implies that for any θ,
\[
v ⌟ (v ⌟ θ) = 0.
\]
For an r-form θ and any form τ, the interior product acts as an antiderivation:
\[
v ⌟ (θ ∧ τ) = (v ⌟ θ) ∧ τ + (−1)^r\, θ ∧ (v ⌟ τ).
\]
Since V ∗ and V are both n dimensional, there are many isomorphisms between
them. The obvious isomorphism sending basis to dual basis is basis-dependent: it is
a feature of the basis, not the space and will not in general reflect geometry, which
usually comes from a nondegenerate bilinear form on V . This structure encapsulates
physical properties and physical invariants of great interest to practical folk such as
physicists and engineers. Sometimes it will correspond to the analogue of ordinary
Euclidean distance in V . Sometimes it is created by a Lorentzian metric on V ,
coming perhaps from relativistic space-time considerations.
A bilinear form on V is simply a member of T^0_2(V).
Suppose a is an ordered basis for V. The coordinates of a bilinear form G in the
ordered basis a are G_{i,j}(a) = G(a_i, a_j). Define the matrix representation of G
with respect to ordered basis a to be the matrix G_a with ij-th entry (G_a)_{ij} = G_{i,j}(a).
We wish to distinguish between the tensor G with coordinates G_{i,j}(a) in the
ordered basis a and the matrix G_a with ij-th entry G_{i,j}(a). They refer to distinct
(obviously related) ideas.
G is the tensor
\[
G = G_{i,j}(a)\, a^i ⊗ a^j = \frac{1}{2}\Big( G_{i,j}(a)\, a^i ⊗ a^j + G_{i,j}(a)\, a^j ⊗ a^i \Big)
+ \frac{1}{2}\Big( G_{i,j}(a)\, a^i ⊗ a^j − G_{i,j}(a)\, a^j ⊗ a^i \Big).
\]
Consider the following calculation, and its implication in the box below.
\[
♭\big( \sigma_m\, G^{m,w}(a)\, a_w \big) = G_{w,j}(a)\, \sigma_m\, G^{m,w}(a)\, a^j = \sigma_m\, \delta^m_j\, a^j = \sigma_j\, a^j = \sigma.
\]
So ♯(σ) evaluated on τ will, generally, look like the sum of simple products σ_i τ_i
only when G_a (and so G^a) is the identity matrix.

♭ was defined without reference to a basis, and so both it and its inverse ♯ have
meaning independent of basis. The formulas we create involving their effect on
vectors and covectors in a basis are valid in any basis.

We use the musical “flat” symbol ♭ and the “sharp” symbol ♯ as mnemonic aids.
♭ lowers or “flattens” the index on a coefficient, turning a vector, usually associated
with an arrow, into a covector, which is usually associated with a family of parallel
planes (or hyperplanes.) The ♯ symbol raises the index, turning a “flat” object into
a “sharp” one.
We can use ♭ to create G^⋆, a tensor of order 2, on V^*. G^⋆ will be a member of
T^2_0(V) and is defined by G^⋆(σ, τ) = G(♯(τ), ♯(σ)).

The reversal of order you see in σ and τ here is not relevant when G is sym-
metric, as it usually will be. In the symplectic case, this reversal avoids a minus
sign in the formula.
If σ = σ_i a^i and τ = τ_i a^i then
\[
G^⋆(\sigma, \tau) = G\big( \tau_m\, G^{m,w}\, a_w,\; \sigma_L\, G^{L,s}\, a_s \big)
\]
This process is entirely dependent on a fixed choice of G and extends the proce-
dure by which we created G^⋆ from G.
We illustrate the process on vectors and covectors before introducing a notational
change to help us with higher orders.
Suppose τ = τ_i a^i is in V^*.
♯ is an isomorphism from V^* to V, and τ is linear on V, so the composition τ ◦ ♯
is a linear real valued function on V^*: that is, a vector.
Define the “raised index version of τ” to be this vector, and denote its coordinates
in basis a by τ^k, so τ ◦ ♯ = τ^k a_k.
If σ = σ_i a^i is in V^*,
\[
\tau \circ ♯(\sigma) = \tau_j\, a^j(♯(\sigma)) = \tau_j\, a^j\big( G^{k,L}\, \sigma_k\, a_L \big) = \tau_j\, G^{k,j}\, \sigma_k = \tau_j\, G^{k,j}\, a_k(\sigma).
\]
So τ^k = τ_j G^{k,j} for each k. From this you can see that τ ◦ ♯ could be obtained
by the tensor product of τ with G^⋆ followed by a contraction against the second
index of G^⋆.

Suppose x = x^i a_i is in V.
The linear function ♭ has domain V and range V^*, and x has domain V^* and
range R, so the composition x ◦ ♭ is a covector.
If v = v^k a_k ∈ V,
\[
x \circ ♭(v) = x^j\, a_j(♭(v)) = x^j\, a_j\big( G_{k,L}\, v^k\, a^L \big) = x^j\, G_{k,j}\, v^k = x^j\, G_{k,j}\, a^k(v).
\]
Define the “lowered index version of x” to be this covector, and denote its coor-
dinates in basis a by x_k, so x ◦ ♭ = x_k a^k.
As you can see, the covector x ◦ ♭ could have been obtained by tensor product
of x with G followed by contraction against the second index of G.
We have, in compact form, for vectors and covectors:
\[
\tau^k = \tau_j\, G^{k,j} \qquad\text{and}\qquad x_k = x^j\, G_{k,j}.
\]
Contracting in the 4th and 5th index positions, which are covariant, is defined
by first raising one of these indices and then contracting at the new pair:
\[
(C_{45}\, T)^{ab}{}_{cfg} = T^{ab}{}_{c}{}^{k}{}_{kfg} = T^{ab}{}_{chkfg}\, G^{kh}.
\]
In this case too, symmetry of G guarantees that the result does not depend on
which of these two indices was raised.
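A numpy sketch of lowering and raising a single index (with our own symmetric positive definite G, chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(7)
M = rng.standard_normal((3, 3))
G = M @ M.T + 3 * np.eye(3)      # a symmetric invertible G_{k,j}(a)
Ginv = np.linalg.inv(G)          # G^{k,j}(a)

x = rng.standard_normal(3)                    # x^j
x_low = np.einsum('j,kj->k', x, G)            # x_k = x^j G_{k,j}
x_up = np.einsum('j,kj->k', x_low, Ginv)      # tau^k = tau_j G^{k,j}
assert np.allclose(x_up, x)                   # flat then sharp: back to x
```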
To put some of this in perspective, we consider a very special case of an
ordered basis a and an inner product G with matrix G_a = I = G^a, the
identity matrix. This means that the inner product in R^n created by G using
this ordered basis is the ordinary Euclidean inner product, or dot product. If
T is a member of V (that is, a contravariant tensor of order 1) or a member of V^*
(a covariant tensor of order 1) the act of raising or lowering the index corresponds,
in R^n, to taking the transpose of the representative there. The numerical values of
coordinates of any tensor do not change when an index is raised or lowered in this
very special but rather common case.
Once again restricting attention to symmetric G, indices can be raised or
lowered on alternating tensors just as with any tensor, though the result could not
be alternating unless all indices are raised or lowered together. In that case, though,
the transformed tensor will be alternating as we now show.
If R is an r-form, it can be represented in a basis as R_{j_1\dots j_r}\, a^{j_1} ⊗ \dots ⊗ a^{j_r}
where the sum is over all indices, or as R_{j_1\dots j_r}\, a^{j_1} ∧ \dots ∧ a^{j_r} where the sum is over
increasing indices.
In the first representation, the fact that R is alternating is encapsulated in the
equation
\[
R_{j_1\dots j_r} = \operatorname{sgn}(Q)\, R_{j_{Q(1)}\dots j_{Q(r)}} \quad\text{for any permutation } Q \in P_r.
\]
Lowering all r indices at once gives, for alternating T,
\[
T_{j_1 \dots j_r} = T^{i_1 \dots i_r}\, \det\!\Big( (G_a)^{i_1,\dots,i_r}_{j_1,\dots,j_r} \Big). \quad\text{(Sum here on increasing indices only.)}
\]
Finally, the various processes we have described can be adapted with the obvious
modifications to raise or lower the index on a relative tensor. The procedure does
not alter the weighting function.
21. Four Facts About Tensors of Order 2

In this section we will consider how to calculate all four types of tensors of order
2 on V and V^* using matrices acting in specific ways on the representatives of
vectors and covectors in R^n and R^{n*}. In this text, these matrices are called the
matrix representations of the relevant tensors.
We will then show how the matrix representations change as the basis changes
from basis a to b in V .
Then we will make three different kinds of observations about these matrices.
We will examine how these matrices are related in a fixed ordered basis if the
tensors involved are all “raised or lowered versions” of the same tensor with respect
to an inner product.
Then we will comment on how these matrices change when the matrices of tran-
sition are of a special type—orthogonal matrices.
Finally, we will examine the determinants of these matrices.
For the following calculations v, w are generic members of V , while θ, τ are to
be generic members of V ∗ .
Suppose P is a tensor with domain V × V. We saw that P_{ij}(b) = P_{kL}(a)\, A^k_i\, A^L_j.
We will determine this fact directly to show exactly how this carries over to a
calculation with matrices in R^n.
Let matrix M_I(a) be the matrix with ij-th entry P_{ij}(a) and M_I(b) be the matrix
with ij-th entry P_{ij}(b).
\[
P(v, w) = P_{ij}(a)\, A^i(v)\, A^j(w) = A(v)^t\, M_I(a)\, A(w) = (A\,B(v))^t\, M_I(a)\, (A\,B(w)) = B(v)^t\, A^t\, M_I(a)\, A\, B(w).
\]
We conclude that
\[
M_I(b) = A^t\, M_I(a)\, A.
\]
Suppose Q is a tensor with domain V × V^*. Then Q_i{}^j(b) = Q_k{}^L(a)\, A^k_i\, B^j_L.
Let matrix M_{II}(a) be the matrix with ij-th entry Q_i{}^j(a) and M_{II}(b) be the
matrix with ij-th entry Q_i{}^j(b).
\begin{align*}
Q(v, \theta) &= Q_i{}^j(a)\, A^i(v)\, A^*_j(\theta) = A(v)^t\, M_{II}(a)\, A^*(\theta)^t \\
&= (A\,B(v))^t\, M_{II}(a)\, (B^*(\theta)\,B)^t = B(v)^t\, A^t\, M_{II}(a)\, B^t\, B^*(\theta)^t.
\end{align*}
We conclude that
\[
M_{II}(b) = A^t\, M_{II}(a)\, B^t.
\]
Let matrix M_{III}(a) be the matrix with ij-th entry R^i{}_j(a) and M_{III}(b) be the
matrix with ij-th entry R^i{}_j(b). The analogous calculation lets us conclude that
\[
M_{III}(b) = B\, M_{III}(a)\, A.
\]
And for a tensor S with domain V^* × V^*, letting M_{IV}(a) be the matrix with
ij-th entry S^{ij}(a), we conclude that
\[
M_{IV}(b) = B\, M_{IV}(a)\, B^t.
\]
We will now consider the situation where these four tensors are obtained
by raising or lowering indices of (any) one of them with the services of
an inner product G.
We will let M_I = (P_{ij}) as above and relate the other matrices from above to this
one. So
\begin{align*}
P_i{}^j(a) &= P_{ik}(a)\, G^{kj}(a) & M_{II}(a) &= M_I(a)\, G^a \\
P^i{}_j(a) &= G^{ik}(a)\, P_{kj}(a) & M_{III}(a) &= G^a\, M_I(a) \\
P^{ij}(a) &= G^{ik}(a)\, P_{kL}(a)\, G^{Lj}(a) & M_{IV}(a) &= G^a\, M_I(a)\, G^a
\end{align*}
Since G_a^{-1} = G^a you can modify the above to write any of them in terms of the
others and the matrices G_a and G^a.
We make explicit note of this calculation in an important and common case:

When these four tensors are all “raised or lowered” versions
of each other, and if G_a and G^a are the identity matrices,
then the matrix of each of these tensors is the same.

A basis for which matrix G_a (and so G^a too) is the identity matrix is called
orthonormal with respect to G. We will have more to say about such bases
later.
We now consider “special” matrices of transition.
Sometimes the change of basis matrix A satisfies
\[
A^t = A^{-1} = B \quad\text{and so}\quad B^t = B^{-1} = A.
\]
Whatever the matrices of transition,
\[
\det(M_I(b)) = (\det(A))^2 \det(M_I(a))
\]
and
\[
\det(M_{IV}(b)) = (\det(B))^2 \det(M_{IV}(a)).
\]
Of particular interest is the case where the underlying tensors are versions of an
inner product G. So
\[
M_I(a) = G_a \quad\text{and}\quad M_I(b) = G_b
\]
while M_{IV}(a) = G^a and M_{IV}(b) = G^b, and
\[
\det(B) = \operatorname{sign}(B)\, \sqrt{ \frac{\det G^b}{\det G^a} }.
\]
The matrix of any symmetric bilinear form is symmetric. In [8] p.369 we find that
any symmetric matrix is orthogonally equivalent to a diagonal matrix
with r negative diagonal entries listed first, then s positive diagonal entries, and zero
diagonal entries listed last. We find there that the numbers r and s of negative and
positive entries on the diagonal of G_a when G_a is diagonal are independent of basis,
a result referred to as Sylvester's Law of Inertia.
The ordered pair (s, r) is called the signature of G and the number s − r is
sometimes called the index of G. Be warned: this vocabulary is not used in a
consistent way in the literature.
Writing this out in more detail, suppose that Ga is the matrix of symmetric G
with respect to a. Then there is an orthogonal matrix B, which can be used as a
matrix of transition from ordered basis a to a new ordered basis b, and for which
(by orthogonality) and Bt = B−1 = A and, finally, for which the matrix Gb is
diagonal.
Gb = At Ga A = A−1 Ga A
G1,1 (b)
..
.
Gr,r (b)
Gr+1,r+1 (b)
=
..
.
Gr+s,r+s (b)
0
..
.
0
and also
$$G = \sum_{i=1}^{n} G_{i,i}(b)\; b^i \otimes b^i \quad\text{where } G_{i,i}(b) \text{ is } \begin{cases} \text{negative}, & \text{if } 1 \le i \le r; \\ \text{positive}, & \text{if } r+1 \le i \le r+s; \\ 0, & \text{if } r+s+1 \le i \le n. \end{cases}$$
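As an illustration, numpy's eigh routine produces exactly such an orthogonal diagonalization; the matrix below is a made-up example, not one from the text.

    import numpy as np

    G_a = np.array([[0., 1., 0.],
                    [1., 0., 0.],
                    [0., 0., 2.]])
    vals, B = np.linalg.eigh(G_a)   # columns of B: orthonormal eigenvectors
    G_b = B.T @ G_a @ B             # = A^{-1} G_a A, diagonal
    r = int((vals < 0).sum())       # one negative entry here
    s = int((vals > 0).sum())       # two positive entries
    print(np.round(np.diag(G_b), 10), (s, r))   # [-1.  1.  2.] (2, 1)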
Taking the ordered basis b as above, create a new ordered basis c using the diagonal matrix of transition
$$K = \operatorname{diag}\left( \frac{1}{\sqrt{-G_{1,1}(b)}},\ \dots,\ \frac{1}{\sqrt{-G_{r,r}(b)}},\ \frac{1}{\sqrt{G_{r+1,r+1}(b)}},\ \dots,\ \frac{1}{\sqrt{G_{n,n}(b)}} \right).$$
In other words, $c_j = \frac{1}{\sqrt{\pm G_{j,j}(b)}}\, b_j$ for each j, where the sign is chosen to make the root real. Note: $K^t \neq K^{-1}$, in general, so we lose orthogonal equivalence.
Suppose c and w are any two orthonormal ordered bases and $H = (h^i_j)$ is the matrix of transition between them, so that $w_K = h^i_K\, c_i$. Then
$$G(w_K, w_L) = \begin{cases} -1, & \text{if } L = K \le r; \\ 1, & \text{if } r < L = K \le n; \\ 0, & \text{if } L \neq K \end{cases}$$
while also
$$G(w_K, w_L) = G\big( h^i_K c_i,\ h^j_L c_j \big) = h^i_K h^j_L\, G(c_i, c_j) = -\sum_{i=1}^{r} h^i_K h^i_L + \sum_{i=r+1}^{n} h^i_K h^i_L.$$
These equations say that $H^t G_c H = G_c$: the matrix of G is left unchanged, and since $\det(H^t) = \det(H)$ we conclude that $\det(H^{-1}) = \det(H) = \pm 1$.
A matrix of transition H between two bases which are orthonormal
with respect to any inner product has determinant ±1.
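A hedged illustration of this boxed fact, using the two-dimensional inner product with matrix diag(−1, 1) and a hyperbolic “boost” H (a standard example, not drawn from these notes):

    import numpy as np

    phi = 0.7
    H = np.array([[np.cosh(phi), np.sinh(phi)],
                  [np.sinh(phi), np.cosh(phi)]])
    J = np.diag([-1., 1.])
    assert np.allclose(H.T @ J @ H, J)            # H carries one J-orthonormal
                                                  # basis to another
    assert np.isclose(abs(np.linalg.det(H)), 1.0) # yet H is not orthogonal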
Step 3: The matrix of G in the ordered basis b1 , . . . , bn has all zeros in the first
row and in the first column except possibly for the first diagonal entry, which is
G(b1 , b1 ). It also has the bottom k rows and the last k columns filled with zeroes.
Step 4: If this matrix is actually diagonal, we are done. If the matrix is not yet
diagonal, reorder the b1 , . . . , bn−1 if necessary (leaving b1 alone) so that G(b2 , b2 ) 6=
0 or, if that is impossible, replace b2 with b2 + bi for some smallest i so that the
inequality holds. Let
$$c_1 = b_1 = a_1, \qquad c_2 = b_2 \qquad\text{and}\qquad c_i = b_i - \frac{G(b_i, b_2)}{G(b_2, b_2)}\, b_2 \ \text{ for } i = 3, \dots, n.$$
Once again, the matrix of transition has determinant 1. This time the matrix of
G in the ordered basis c1 , . . . , cn has all zeros in the first two rows and the first two
columns except for the first two diagonal entries, which are G(c1 , c1 ) and G(c2 , c2 ).
Of course the last k rows and the right-most k columns are still filled with zeroes.
We carry on in the pattern of Step 4 for (at most) n − 3 − k more steps yielding
an ordered basis for V for which the matrix of G is diagonal. And we can do a final
re-ordering, if we wish, so that the negative eigenvalues are listed first followed by
the positive eigenvalues on the diagonal.
The matrix of transition which produces this diagonal form from a specific basis
a, the product of all the intermediary matrices of transition, will have determinant
±1 but it will not, in general, be orthogonal.
We conclude this section with a similar discussion for a 2-form S.
In this case any matrix of S is skew symmetric. We find in [1] p.163 that there is a unique (dependent on S of course) natural number r and there is an ordered basis b for which the matrix $S_b$ has the block form
$$S_b = \begin{pmatrix} 0 & I_r & 0 \\ -I_r & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
If you start with a matrix $S_a$ for any 2-form S in ordered basis a the new matrix is of the form $S_b = A^t\, S_a\, A$, just as above. Once again $A^t \neq A^{-1}$ (at least, not necessarily) so $S_b$ will not in general be either orthogonally equivalent or similar to $S_a$.
The entries of $S_b$ are zero except for
$$\begin{cases} -1, & \text{in row } r+i, \text{ column } i, \text{ for } 1 \le i \le r; \\ 1, & \text{in row } i, \text{ column } r+i, \text{ for } 1 \le i \le r. \end{cases}$$
So with this basis all $S(b_i, b_j)$ are 0 except
$$S(b_{r+i}, b_i) = e_{r+i}^t\, S_b\, e_i = -1 \quad\text{and}\quad S(b_i, b_{r+i}) = e_i^t\, S_b\, e_{r+i} = 1 \qquad\text{for } 1 \le i \le r.$$
So
$$S = \sum_{i=1}^{r} \big( b^i \otimes b^{r+i} - b^{r+i} \otimes b^i \big) = \sum_{i=1}^{r} b^i \wedge b^{r+i}.$$
We carry on in this fashion yielding an ordered basis for V for which the matrix of S is skew-diagonal. The product of all these matrices of transition has determinant 1. If p is the end-product ordered basis obtained after r iterations of this process,
$$S = \sum_{i=1}^{r} S(p_{2i-1}, p_{2i})\; p^{2i-1} \wedge p^{2i}.$$
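A small sketch of the canonical skew form, with made-up values r = 2 and n = 5:

    import numpy as np

    r, n = 2, 5
    S_b = np.zeros((n, n))
    for i in range(r):
        S_b[i, r + i] = 1.0      # +1 in row i, column r+i
        S_b[r + i, i] = -1.0     # -1 in row r+i, column i
    assert np.allclose(S_b, -S_b.T)               # skew symmetric
    assert np.linalg.matrix_rank(S_b) == 2 * r    # the rank determines r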
23. Cartesian Tensors

We presume in this section that our vector space V is endowed with a preferred and agreed-upon positive definite inner product, yielding preferred bases—the orthonormal bases—and the orthogonal matrices of transition which connect them. The matrices of transition between these orthonormal bases have determinant ±1, so these bases form two groups called orientations. Two bases have the same orientation if the matrices of transition between them have determinant 1.
We suppose, first, that a multilinear function is defined or proposed whose coor-
dinates are given for orthonormal bases. If these coordinates have tensor character
when transforming from one of these orthonormal bases to another the multilinear
function defined in each orthonormal basis is said to have Cartesian character
and to define a Cartesian tensor.
If the coordinates are only intended to be calculated in, and transformed among,
orthonormal coordinate systems of a pre-determined orientation, tensor character
restricted to these bases is called direct Cartesian character and we are said to
have defined a direct Cartesian tensor.
Of course, ordinary tensors are both Cartesian and direct Cartesian tensors.
Cartesian tensors are direct Cartesian tensors.
Roughly speaking, there are more coordinate formulae in a basis that yield Carte-
sian tensors and direct Cartesian tensors than coordinate formulae that have (un-
restricted) tensor character. Our meaning here is clarified in the next paragraph.
The coordinates of Cartesian or direct Cartesian tensors given by coordinate
formulae can certainly be consistently transformed by the ordinary rules for change
of basis to bases other than those intended for that type of tensor. Often the original
coordinate formulae make sense and can be calculated in these more general bases
too. But there is no guarantee that these transformed coordinates will match those
calculated directly, and in common examples they do not.
In general, when the vocabulary “Cartesian tensor” is used and coordinates represented, every basis you see will be presumed to be orthonormal unless explicit mention is made to the contrary. If direct Cartesian tensors are under discussion, these bases will also be presumed to have the predetermined orientation.
Considering the same problem from another (nicer) point of view, the numbers
$$\delta^i{}_j = \begin{cases} 0, & \text{if } i \neq j; \\ 1, & \text{if } i = j \end{cases}$$
define the coordinates of a mixed tensor and this tensor has the same coordinates
in any basis, orthonormal or not.
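This invariance is a one-line check in any computational setting; a sketch with a random matrix of transition:

    import numpy as np

    # The identity matrix transforms, as a mixed (once covariant, once
    # contravariant) tensor, to B I A = I in every basis.
    rng = np.random.default_rng(3)
    n = 4
    A = rng.standard_normal((n, n))
    B = np.linalg.inv(A)
    assert np.allclose(B @ np.eye(n) @ A, np.eye(n))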
Given any inner product G, we saw earlier that $\delta_i{}^j$ are the coordinates of G and $\delta^i{}_j$ are the coordinates of $G^\star$.
But since G is positive definite, $\delta^i{}_j$, $\delta_i{}^j$, $\delta_{ij}$ and $\delta^{ij}$ are all the same in any orthonormal basis, and so they are (all of them) coordinates of Cartesian tensors, usually all represented using the common symbol $\delta_{ij}$. The “dot product” Cartesian tensor we worked with above is $\delta_{ij}\; a^i \otimes a^j$.
As a second example, consider the function
$$\epsilon_{i_1 \dots i_n} = \begin{cases} -1, & \text{if the ordered list } i_1, \dots, i_n \text{ is an odd permutation of the integers } 1, \dots, n; \\ 1, & \text{if the ordered list } i_1, \dots, i_n \text{ is an even permutation of the integers } 1, \dots, n; \\ 0, & \text{otherwise.} \end{cases}$$
If a and b are any bases, we saw that if σ is any n-form, the coefficients σ(a) and σ(b) defined by
$$\sigma = \sigma(a)\; a^1 \wedge \dots \wedge a^n = \sigma(b)\; b^1 \wedge \dots \wedge b^n$$
are related by
$$\sigma(b) = (\det(B))^{-1}\, \sigma(a).$$
That means three things. First, if a and b are both orthonormal bases of the predetermined orientation, then det(B) = 1 so the fixed numbers $\epsilon_{i_1 \dots i_n}$ satisfy
$$\epsilon_{i_1 \dots i_n}\; a^{i_1} \otimes \dots \otimes a^{i_n} = \epsilon_{i_1 \dots i_n}\; b^{i_1} \otimes \dots \otimes b^{i_n}$$
as suggested above: the numbers $\epsilon_{i_1 \dots i_n}$ are the coordinates of a covariant direct Cartesian tensor.
Second, we see that these numbers are the coordinates of a covariant odd rel-
ative Cartesian tensor with weighting function sign(B): that is, if you switch to
an orthonormal basis of the “other” orientation you multiply by the sign of the
determinant of the matrix of transition to the new basis.
Finally, we see that when transferring to an arbitrary basis these constant coefficients require the weighting function $(\det(B))^{-1}$ in the new basis, so these constant coefficients define an odd covariant relative tensor of weight −1.
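The weight −1 claim can be checked by brute force in low dimension. The sketch below transforms the constant coefficients covariantly over all index choices and confirms that the factor $\det(A) = \det(B)^{-1}$ appears; the helper sgn is ours, not from the text.

    import numpy as np
    from itertools import permutations

    def sgn(p):
        # signum of a permutation of (0, ..., n-1), by counting inversions
        return (-1) ** sum(p[i] > p[j]
                           for i in range(len(p))
                           for j in range(i + 1, len(p)))

    rng = np.random.default_rng(4)
    n = 3
    A = rng.standard_normal((n, n))
    for J in permutations(range(n)):
        # eps'(J) = sum over I of eps(I) A^{I(1)}_{J(1)} ... A^{I(n)}_{J(n)}
        total = sum(sgn(I) * np.prod([A[I[k], J[k]] for k in range(n)])
                    for I in permutations(range(n)))
        assert np.isclose(total, np.linalg.det(A) * sgn(J))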
I have also seen it stated that these same coefficients define both an odd covariant
relative tensor of weight −1 and an odd contravariant tensor of weight 1. That is
because in these sources there is no notational distinction on the coefficients between
“raised or lowered index” versions of a tensor, and what they mean is that when a
is an orthonormal basis of the predetermined orientation, the n-vector
i1 ... in ai1 ⊗ · · · ⊗ ain = a1 ∧ · · · ∧ an
transforms to arbitrary bases with weighting function of weight 1.
This kind of thing is fine as long as you stay within the orthonormal bases, but
when you explicitly intend to leave this realm, as you must if the statement is to
have nontrivial interpretation, it is a bad idea.
If you want to consider the Cartesian tensor $\delta_{ij}$ or direct Cartesian tensor $\epsilon_{i_1,\dots,i_n}$, or any such tensors whose coordinate formulae make sense in bases more general than their intended bases of application, you have two options.
You can insist on retaining the coordinate formulae in a more general basis, in which case you may lose tensor character, and might not even have a relative tensor.
Alternatively, you can abandon the coordinate formulae outside their intended bases of application and allow coordinates to transform when moving to these bases by the usual coordinate change rules, thereby retaining tensor character.
24. Volume Elements and Cross Product

In this section we explore initial versions of ideas we will explore in more depth later.
A volume element on V is simply a nonzero member of $\Lambda^n(V)$. This vocabulary is usually employed (obviously) when one intends to calculate volumes.
An ordered basis a of V can be used to form a parallelogram or parallelepiped $Par_{a_1,\dots,a_n}$ in V consisting of the following set of points:
$$Par_{a_1,\dots,a_n} \text{ is formed from all } c^i a_i \text{ where each } c^i \text{ satisfies } 0 \le c^i \le 1.$$
We may decide that we want to regard the volume of this parallelepiped as a standard against which the volumes of other parallelepipeds are measured. In that case we would pick the volume element $Vol_a$:
$$Vol_a = a^1 \wedge \dots \wedge a^n.$$
$Vol_a$ assigns value 1 to $(a_1, \dots, a_n)$, interpreted as the volume of $Par_{a_1,\dots,a_n}$.
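Concretely (and with the conventions above), $Vol_a(v_1, \dots, v_n)$ is the determinant of the matrix whose columns hold the a-coordinates of the $v_i$; a made-up numerical example:

    import numpy as np

    # Columns: a-coordinates of three vectors spanning a parallelepiped.
    V = np.array([[1., 1., 0.],
                  [0., 1., 0.],
                  [0., 0., 2.]])
    print(np.linalg.det(V))   # 2.0, the signed volume relative to Par_{a_1,a_2,a_3}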
Many theories about the world generate (locally) an inner product when phrased in geometrical language. These theories are satisfying to us because they tap into and salvage at least part of our natural internal models of how the external world should be.
Some of these theories also have the added advantage that they seem to be true,
in the sense that repeated tests of these theories have so far turned up, uniformly,
repeated confirmations.
When the metric tensor at each point of this universe is positive definite it is
called Riemannian and induces the usual idea of distance on this curvy surface,
a Riemannian metric on what is called a Riemannian manifold. Small pieces
of a Riemannian manifold exhibit the ordinary Euclidean geometry of Rn .
When the metric tensor defined at each point has Lorentz signature—that is,
when diagonalized its matrix has one negative entry and three positive entries—we
find ourselves on a Lorentzian manifold, the space-time of Einstein’s gen-
eral relativity. Small pieces of a Lorentzian manifold exhibit the Minkowski
geometry of special relativity.
In that world with inner product G, some vectors satisfy G(v, v) > 0 and a v
like that is said to indicate a “space-like” displacement. No massive or massless
particle can move in this way. If G(v, v) = 0, v can indicate the displacement of
a massless particle such as light, and lies on the “light-cone.” When G(v, v) < 0
the “time” coordinate of v is large enough in proportion to its “space” coordinates
that the vector could indicate the displacement of a massive particle. The vector
is called “time-like.”
There is another construction one sees when given a specific inner product G and orientation to yield $Vol_a$ for orthonormal a. This is a means of creating a vector in V, called the cross product, from a given ordered list $H_1, \dots, H_{n-1}$ of members of V. The cross product is defined in two steps as follows.
If you evaluate $Vol_a$ using $H_1, \dots, H_{n-1}$ in its first n − 1 slots, there is one slot remaining. Thus $Vol_a(H_1, \dots, H_{n-1}, \cdot)$ is a member T of V∗, a different member of V∗ for each choice of these various $H_i$. This is the first step.
For the second step, raise the index of T using $G^\star$ to yield a member of V, the cross product
$$T(a)^i\, a_i = T(a)_k\, G^{k,i}(a)\, a_i \in V.$$
The raised index version of the tensor T is often denoted $H_1 \times \dots \times H_{n-1}$. The case of n = 3 and $G_a = I$ is most familiar, yielding the usual cross product of pairs of vectors.
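A sketch for n = 3 with an orthonormal basis and $G_a = I$, checking step one and step two against numpy's built-in cross product (the vectors are arbitrary choices):

    import numpy as np

    H1 = np.array([1., 2., 0.])
    H2 = np.array([0., 1., 3.])
    # Step one: T_k = Vol_a(H1, H2, a_k), a determinant for each k.
    T = np.array([np.linalg.det(np.column_stack([H1, H2, e]))
                  for e in np.eye(3)])
    # Step two: raising with G^a = I just reinterprets T as a vector.
    assert np.allclose(T, np.cross(H1, H2))    # (6, -3, 1)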
The covector (unraised) T has some interesting properties.
For example, it is linear when thought of as a function from V to V ∗ , where
the member of V replaces Hi for any fixed i. Also, if Hi and Hj are switched for
distinct subscripts, the new T becomes the negative of the old T . Because of this, if
there is any linear relation among the $H_i$ then T = 0. On the other hand, if the $H_i$ are linearly independent then they can be extended to an ordered basis by adding one vector $H_n$, and then $T(H_n) \neq 0$, which means $T \neq 0$. Conclusion: T = 0 exactly when $H_1, \dots, H_{n-1}$ is a dependent list.
Going back to the issue of the formation of cross product, we conclude that step
one in the process of creating the coordinates of the cross product in an ordered
basis is the same, up to numerical multiple, from basis to basis. When the matrix
of transition has determinant 1, the volume element is the same for these two bases,
so step one of the process has the same form regardless of other aspects of the new
basis. Step two, however, involves the matrix $G^b$, which changes as $G^b = B\, G^a\, B^t$.
25. An Isomorphism Between $\Lambda_r(V)$ and $\Lambda^{n-r}(V)$

We have seen that the dimension of the space $\Lambda^r(V)$ is $\frac{n!}{r!\,(n-r)!}$ for any r with 0 ≤ r ≤ n, which is the same as the dimensions of
$$\Lambda_r(V) = \Lambda^r(V^*) \quad\text{and}\quad \Lambda^{n-r}(V) \quad\text{and}\quad \Lambda_{n-r}(V) = \Lambda^{n-r}(V^*).$$
So they are all isomorphic as vector spaces.
Our goal in this section is to create a specific isomorphism between $\Lambda_r(V)$ and $\Lambda^{n-r}(V)$, and also between $\Lambda^r(V)$ and $\Lambda_{n-r}(V)$, and we want this construction to be independent of basis in some sense.
It is required that a volume element σ, a member of $\Lambda^n(V)$, be specified for the construction to proceed.
If a is any basis of V with dual basis a∗ the n-form $a^1 \wedge \dots \wedge a^n$ is a multiple of σ. We presume that basis a has been chosen so that $a^1 \wedge \dots \wedge a^n = \sigma$.
For a choice of $a_{i_1}, \dots, a_{i_r}$ we define
$$Formit_a(a_{i_1} \otimes \dots \otimes a_{i_r}) = \sigma(a_{i_1}, \dots, a_{i_r}, \cdot, \dots, \cdot) : V^{n-r} \to \mathbb{R}.$$
It has various interesting properties. It is multilinear and alternating, so it is a member of $\Lambda^{n-r}(V)$. Also, if P is any permutation of {1, . . . , r} then
$$Formit_a(a_{i_{P(1)}} \otimes \dots \otimes a_{i_{P(r)}}) = \operatorname{sgn}(P)\, Formit_a(a_{i_1} \otimes \dots \otimes a_{i_r}).$$
So if there are any repeated terms among the $a_{i_k}$ then $Formit_a(a_{i_1} \otimes \dots \otimes a_{i_r}) = 0$.
Having defined $Formit_a$ on a basis of $T^r_0(V)$ consisting of tensors of the form $a_{i_1} \otimes \dots \otimes a_{i_r}$, we can extend the definition of $Formit_a$ by linearity to all of $T^r_0(V)$. Any linear combination of members of $\Lambda^{n-r}(V)$ is also in $\Lambda^{n-r}(V)$, so the range of $Formit_a$ is in $\Lambda^{n-r}(V)$.
We note (evaluate on (n − r)-tuples $(a_{j_1}, \dots, a_{j_{n-r}})$ in $V^{n-r}$) that
$$Formit_a(a_{i_1} \otimes \dots \otimes a_{i_r}) = Formit_a(a_{i_1} \wedge \dots \wedge a_{i_r}) = \operatorname{sgn}(P)\, Formit_a(a_{i_{P(1)}} \wedge \dots \wedge a_{i_{P(r)}})$$
for any permutation P of {1, . . . , r} and any choice of $i_1, \dots, i_r$.
We will only be interested here in $Formit_a$ evaluated on members of $\Lambda_r(V)$ and we officially restrict, therefore, to this domain:
$$Formit_a : \Lambda_r(V) \to \Lambda^{n-r}(V).$$
It is pretty easy to see that if $a_{i_1}, \dots, a_{i_r}$ is a listing of basis vectors in increasing order and $a_{i_{r+1}}, \dots, a_{i_n}$ is a listing of the basis vectors left unused among the $a_{i_1}, \dots, a_{i_r}$ then
$$Formit_a(a_{i_1} \wedge \dots \wedge a_{i_r})(a_{i_{r+1}}, \dots, a_{i_n}) = \sigma(a_{i_1}, \dots, a_{i_r}, a_{i_{r+1}}, \dots, a_{i_n}) = \pm 1,$$
while if any of the vectors $a_{i_{r+1}}, \dots, a_{i_n}$ repeat a vector on the list $a_{i_1}, \dots, a_{i_r}$ then the evaluation there is 0.
We deduce from these facts that $Formit_a(a_{i_1} \wedge \dots \wedge a_{i_r}) = \pm\, a^{i_{r+1}} \wedge \dots \wedge a^{i_n}$.
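A tiny worked case (ours, using the definitions above): take n = 3, r = 2 and $\sigma = a^1 \wedge a^2 \wedge a^3$. Then $Formit_a(a_1 \wedge a_2)(v) = \sigma(a_1, a_2, v) = v^3$, so $Formit_a(a_1 \wedge a_2) = a^3$, while $Formit_a(a_1 \wedge a_3)(v) = \sigma(a_1, a_3, v) = -v^2$, so $Formit_a(a_1 \wedge a_3) = -a^2$. In each case the output is, up to a sign, the wedge of the unused dual basis covectors.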
Recall that for each increasing index choice $i_1, \dots, i_r$ drawn from {1, . . . , n} there is exactly one shuffle permutation P in $S_{r,n} \subset P_n$ for which $P(k) = i_k$ for k = 1, . . . , r. For this shuffle, we also have (by definition of shuffle) $P(r+1) < P(r+2) < \dots < P(n)$. There are $n!$ members of $P_n$ but only $\frac{n!}{r!\,(n-r)!}$ members of $S_{r,n}$.
Suppose $T \in \Lambda_r(V)$ with
$$T = T^{i_1,\dots,i_r}(a)\; a_{i_1} \wedge \dots \wedge a_{i_r} = \sum_{P \in S_{r,n}} T^{P(1),\dots,P(r)}(a)\; a_{P(1)} \wedge \dots \wedge a_{P(r)},$$
where the final sum above is only over the shuffle permutations. Then
$$Formit_a(T) = \sum_{P \in S_{r,n}} T^{P(1),\dots,P(r)}(a)\, \operatorname{sgn}(P)\; a^{P(r+1)} \wedge \dots \wedge a^{P(n)}.$$
This map is obviously onto $\Lambda^{n-r}(V)$ and in view of the equality of dimension provides an isomorphism from $\Lambda_r(V)$ onto $\Lambda^{n-r}(V)$.
It is hardly pellucid that this isomorphism does not depend on specific choice of
basis a.
We would like to deduce that the (n − r)-form in the last two lines above is
$$Formit_b(b_{P(1)} \wedge \dots \wedge b_{P(r)}) = \operatorname{sgn}(P)\; b^{P(r+1)} \wedge \dots \wedge b^{P(n)}$$
and we will prove this by showing that $Formit_a(b_{P(1)} \wedge \dots \wedge b_{P(r)})$ and $Formit_b(b_{P(1)} \wedge \dots \wedge b_{P(r)})$ agree when evaluated on any (n − r)-tuple of vectors of the form $(b_{i_1}, \dots, b_{i_{n-r}})$ for integers $i_1, \dots, i_{n-r}$ between 1 and n.
First, if there is any duplication among these integers then the alternating forms $Formit_a(b_{P(1)} \wedge \dots \wedge b_{P(r)})$ and $b^{P(r+1)} \wedge \dots \wedge b^{P(n)}$ must both be zero when evaluated at $(b_{i_1}, \dots, b_{i_{n-r}})$. So we may presume that all the $i_1, \dots, i_{n-r}$ are distinct.
Further, we can re-order the vectors in the (n − r)-tuple and both $Formit_a(b_{P(1)} \wedge \dots \wedge b_{P(r)})(b_{i_1}, \dots, b_{i_{n-r}})$ and $b^{P(r+1)} \wedge \dots \wedge b^{P(n)}(b_{i_1}, \dots, b_{i_{n-r}})$ will be multiplied by the signum of the reordering permutation. So we may presume, to confirm equality, that $i_1, \dots, i_{n-r}$ is an increasing sequence.
If $i_1, \dots, i_{n-r}$ contains any of the integers $P(1), \dots, P(r)$ then
$$Formit_b(b_{P(1)} \wedge \dots \wedge b_{P(r)})(b_{i_1}, \dots, b_{i_{n-r}}) = \sigma(b_{P(1)}, \dots, b_{P(r)}, b_{i_1}, \dots, b_{i_{n-r}}) = 0.$$
Also in this case
$$Formit_a(b_{P(1)} \wedge \dots \wedge b_{P(r)})(b_{i_1}, \dots, b_{i_{n-r}}) = \sum_{Q \in S_{r,n}} \operatorname{sgn}(Q)\, \det\Big( A^{Q(1),\dots,Q(r)}_{P(1),\dots,P(r)} \Big)\, \det\Big( A^{Q(r+1),\dots,Q(n)}_{i_1,\dots,i_{n-r}} \Big).$$
By the generalization of the Laplace expansion formula found on page 41, this is the determinant of a matrix related to A with columns reordered and with (at least) two columns duplicated. It is, therefore, also zero.
So it remains to show equality when $i_j = P(r+j)$ for j = 1, . . . , n − r. In that case
$$Formit_a(b_{P(1)} \wedge \dots \wedge b_{P(r)})(b_{P(r+1)}, \dots, b_{P(n)}) = \operatorname{sgn}(P)\, \det(A) = \operatorname{sgn}(P)$$
while we also have
$$\operatorname{sgn}(P)\; b^{P(r+1)} \wedge \dots \wedge b^{P(n)}(b_{P(r+1)}, \dots, b_{P(n)}) = \operatorname{sgn}(P).$$
In other words, $Formit_a$ and $Formit_b$ agree when evaluated on a basis of $V^{n-r}$ and so are equal. We no longer need to (and will not) indicate choice of basis as subscript on the function $Formit_a$, using instead $Formit_\sigma$ with reference to the specified volume element σ. By the way, had the other orientation been given by choosing −σ as volume element, we find that $Formit_\sigma = -Formit_{-\sigma}$.
We extend $Formit_\sigma$ slightly and define
$$Formit_\sigma : \Lambda_0(V) \to \Lambda^n(V) \ \text{ to be the map sending a constant } k \text{ to } k\sigma.$$
When 0 < r < n and for any basis a for which $\sigma = a^1 \wedge \dots \wedge a^n$ we have
$$Formit_\sigma(a_{P(1)} \wedge \dots \wedge a_{P(r)}) = \operatorname{sgn}(P)\; a^{P(r+1)} \wedge \dots \wedge a^{P(n)} \quad\text{for any shuffle } P \in S_{r,n}.$$
To resolve the issue for a generic basis tensor $a_{i_1} \wedge \dots \wedge a_{i_r}$ we can form a basis c by permutation of the members of the basis a and apply this calculation (watching signs if the permutation is odd) in this new basis c. The same formula holds.
26. The Hodge ∗ Operator and Hodge Duality

For each r, composing with the raising and lowering maps of an inner product G produces $Formit_\sigma \circ \mathrm{Raise} : \Lambda^r(V) \to \Lambda^{n-r}(V)$ and $Vecit_\sigma \circ \mathrm{Lower} : \Lambda_r(V) \to \Lambda_{n-r}(V)$, which will all be denoted ∗ and called “the” Hodge star operator.
The (n − r)-form $*\tau \in \Lambda^{n-r}(V)$ will be called the Hodge dual of $\tau \in \Lambda^r(V)$.
The (n − r)-vector $*x \in \Lambda_{n-r}(V)$ will be called the Hodge dual of $x \in \Lambda_r(V)$.
With our definition of ∗ on $\Lambda^r(V)$ we calculate that $*1 = \sigma$ and $*\sigma = (-1)^q\, 1$. As this hints, and as was the case with $Formit_\sigma$ and $Vecit_\sigma$, the Hodge operator ∗ is closely related to its inverse.
Our inner product G has signature (p, q), so q is the number of −1 entries on the diagonal of the matrix $G_a = G^a$ for an orthonormal basis a.
Suppose 0 < r < n and P is any shuffle permutation in $S_{r,n}$ and that exactly k of the vectors $a_{P(1)}, \dots, a_{P(r)}$ satisfy $G(a_{P(i)}, a_{P(i)}) = -1$. So q − k of the vectors $a_{P(r+1)}, \dots, a_{P(n)}$ satisfy that equation. It follows that
\begin{align*}
* \circ * \big( a_{P(1)} \wedge \dots \wedge a_{P(r)} \big)
&= Vecit_\sigma \circ \mathrm{Lower} \circ Vecit_\sigma \circ \mathrm{Lower} \big( a_{P(1)} \wedge \dots \wedge a_{P(r)} \big) \\
&= Vecit_\sigma \circ \mathrm{Lower} \circ Vecit_\sigma \Big( (-1)^k\; a^{P(1)} \wedge \dots \wedge a^{P(r)} \Big) \\
&= Vecit_\sigma \circ \mathrm{Lower} \Big( \operatorname{sgn}(P)\, (-1)^k\; a_{P(r+1)} \wedge \dots \wedge a_{P(n)} \Big) \\
&= Vecit_\sigma \Big( \operatorname{sgn}(P)\, (-1)^{q-k} (-1)^k\; a^{P(r+1)} \wedge \dots \wedge a^{P(n)} \Big) \\
&= \operatorname{sgn}(P)\, (-1)^{q-k} (-1)^k (-1)^{r(n-r)} \operatorname{sgn}(P)\; a_{P(1)} \wedge \dots \wedge a_{P(r)} \\
&= (-1)^{r(n-r)+q}\; a_{P(1)} \wedge \dots \wedge a_{P(r)}.
\end{align*}
In other words,
$$* \circ * = (-1)^{r(n-r)+q} \ \text{times the identity map.}$$
The cases r = 0 and r = n, not covered above, are calculated directly with the same outcome. Also, the calculation of $* \circ * \big( a^{P(1)} \wedge \dots \wedge a^{P(r)} \big)$ is very similar, again yielding the same conclusion.
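For instance (a quick check of the exponent, ours rather than the text's): with n = 3 and a positive definite inner product, q = 0 and r(n − r) is even for r = 0, 1, 2, 3, so ∗ ∘ ∗ is the identity in every grade. With n = 4 and a metric of Lorentz signature, q = 1, and on $\Lambda_2(V)$ we get $* \circ * = (-1)^{2 \cdot 2 + 1} = -1$ times the identity.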
The Hodge ∗ can be used to form an inner product on each $\Lambda^r(V)$ and each $\Lambda_r(V)$. We illustrate the construction on $\Lambda_r(V)$.
For an orthonormal basis a and a shuffle $P \in S_{r,n}$ let $M_a(P) = G_{P(1),P(1)}(a) \cdots G_{P(r),P(r)}(a)$. In other words, $M_a(P)$ is the product of the diagonal entries of the matrix $G_a$ (all 1 or −1) which correspond to positions P(j) along that diagonal for any j = 1, . . . , r.
We calculate
\begin{align*}
x \wedge * y &= x^{i_1,\dots,i_r}(a)\; a_{i_1} \wedge \dots \wedge a_{i_r} \wedge * \big( y^{j_1,\dots,j_r}(a)\; a_{j_1} \wedge \dots \wedge a_{j_r} \big) \\
&= \sum_{P \in S_{r,n}} x^{P(1),\dots,P(r)}(a)\; y^{P(1),\dots,P(r)}(a)\; \operatorname{sgn}(P)\, M_a(P)\; a_{P(1)} \wedge \dots \wedge a_{P(n)} \\
&= \sum_{P \in S_{r,n}} x^{P(1),\dots,P(r)}(a)\; y^{P(1),\dots,P(r)}(a)\; M_a(P)\; a_1 \wedge \dots \wedge a_n.
\end{align*}
The common coefficient of $a_1 \wedge \dots \wedge a_n$ on the last line is the inner product $\langle x, y \rangle_r$ of x and y.
27. The Grassmann Algebra

The Grassmann algebra G(V) consists of the formal sums drawn from
$$\Lambda^0(V) \oplus \Lambda^1(V) \oplus \Lambda^2(V) \oplus \dots \oplus \Lambda^n(V) \oplus \dots$$
together with wedge product as the multiplication on the algebra, calculated by distribution of the nonzero terms of the formal sums, and where wedge product against a scalar (a member of $\Lambda^0(V)$) is given by ordinary scalar multiplication.
The Grassmann algebra is also called the exterior algebra, with reference to
the wedge or exterior product.
We note that all summands beyond Λn (V ) consist of the zero tensor only.
This means, first, that members of G(V) are all formal sums of the form $\theta = \sum_{i=0}^{\infty} \lceil\theta\rceil_i$ where each $\lceil\theta\rceil_i \in \Lambda^i(V)$. There can be at most n + 1 nonzero summands.
$\lceil\theta\rceil_i$ is called the grade-i part of θ. This representation is unique for each θ: that is, two members of G(V) are equal if and only if their grade-i parts are equal for i = 0, . . . , n.
Though Λr (V ) is not actually contained in G(V ), an isomorphic copy of each
Λr (V ) is in G(V ). We do not normally distinguish, unless it is critical for some
rare reason, between a member of Λr (V ) and the corresponding member of G(V )
which is 0 except for a grade-r part.
G(V ) is a vector space with scalar multiplication given by ordinary scalar mul-
tiplication distributed across the formal sum to each grade, while addition is cal-
culated by adding the parts of corresponding grade. The member of G(V ) which is
the zero form at each grade acts as additive identity.
The wedge product in G(V) is computed grade by grade:
$$\lceil \theta \wedge \tau \rceil_i = \sum_{j=0}^{i} \lceil\theta\rceil_j \wedge \lceil\tau\rceil_{i-j}$$
for each i between 0 and n, while $\lceil \theta \wedge \tau \rceil_i$ is the zero i-form for i > n.
Sometimes formulae produce reference to a negative grade. Any part or coefficient involved is interpreted to be the appropriate form of zero, just as one would do for a reference to a grade larger than n.
If θ is nonzero only in grade i while τ is nonzero only in grade j then θ ∧ τ can
only be nonzero in grade i + j. By virtue of this property, G(V ) is called a graded
algebra.
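As an illustration of graded multiplication, here is a minimal sketch (ours, not the author's) that represents a multi-form by its nonzero coefficients on basis wedges and computes wedge products by sorting indices:

    # Represent a multi-form as a dict from sorted index tuples to
    # coefficients; the empty tuple holds the grade-0 part.
    def sort_sign(idx):
        # bubble-sort the indices, tracking the signum; repeats give 0
        idx, s = list(idx), 1
        for i in range(len(idx)):
            for j in range(len(idx) - 1 - i):
                if idx[j] > idx[j + 1]:
                    idx[j], idx[j + 1] = idx[j + 1], idx[j]
                    s = -s
        return (0, None) if len(set(idx)) < len(idx) else (s, tuple(idx))

    def wedge(theta, tau):
        out = {}
        for I, x in theta.items():
            for J, y in tau.items():
                s, K = sort_sign(I + J)
                if s:
                    out[K] = out.get(K, 0) + s * x * y
        return out

    theta = {(1, 2): 1.0, (): 2.0}   # grade-2 part plus a grade-0 part
    tau = {(3,): 1.0}                # a grade-1 multi-form
    print(wedge(theta, tau))         # {(1, 2, 3): 1.0, (3,): 2.0}

Note how the grades of the factors add, as the graded algebra property requires.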
Members of G(V ) are called multi-forms.
Replacing V by V ∗ produces the Grassmann algebra G(V ∗ ), whose members are
called multi-vectors and which also forms a graded algebra with wedge product.
If the field we are working with is $\mathbb{R}$, we note that the Grassmann algebra of multi-forms G(V) is
$$\mathbb{R} \oplus V^* \oplus \Lambda^2(V) \oplus \dots \oplus \Lambda^n(V) \oplus \{0\} \oplus \dots$$
while the Grassmann algebra of multi-vectors G(V∗) is
$$\mathbb{R} \oplus V \oplus \Lambda_2(V) \oplus \dots \oplus \Lambda_n(V) \oplus \{0\} \oplus \dots.$$
The eponymous Hermann Grassmann, around and after 1840, wrote a series of
works within which he essentially invented linear and multilinear algebra as we
know it, including the first use of vectors. His writing style was considered to be
unmotivated and opaque by the few who read his work. In this, as in other matters,
he was a bit ahead of his time.
The value of his ideas did not become widely known for seventy years, and then
mostly by reference, through the work of other mathematicians and physicists such
as Peano, Clifford, Cartan, Gibbs and Hankel.
Grassmann interpreted nonzero wedge products of vectors such as $W = x_1 \wedge \dots \wedge x_r$ to represent an r-dimensional subspace of V, in light of the fact that this wedge product is characterized (up to constant multiple) by the property that $W \wedge y = 0$ for $y \in V$ if and only if y is in the span of $x_1, \dots, x_r$.
He thought of the various operations on the Grassmann algebra as constituting
a calculus of subspaces or “extensions,” combining and giving information about
geometrical objects and associated physical processes in a very direct and intuitive
way. For instance,
$$( x_1 \wedge \dots \wedge x_r ) \wedge ( y_1 \wedge \dots \wedge y_s )$$
will be nonzero if and only if $x_1, \dots, x_r, y_1, \dots, y_s$ form a linearly independent set
of vectors. In that case the wedge product determines the combined span of the
two spaces the factors represent, which will have dimension r + s.
On the other hand, if the combined span of the two spaces is all of V but
neither of the two tensors is, individually, 0 and if V is given an inner product and
orientation, the multi-vector
$$*\Big( \big( *( x_1 \wedge \dots \wedge x_r ) \big) \wedge \big( *( y_1 \wedge \dots \wedge y_s ) \big) \Big)$$
determines the intersection of the two spaces, which will have dimension r + s − n.
His work, reformulated and clarified with modern notation, plays a large and
increasing role in many areas, from current theoretical physics to practical engi-
neering.
References
[1] Abraham, R. and Marsden, J., Foundations of Mechanics. Addison-Wesley Publishing Co.,
Reading, MA, 1985
[2] Aczél, J., Lectures on Functional Equations and Their Applications. Dover Publications, Inc.,
New York, NY, 2006
[3] Akivis, M. and Goldberg, V., An Introduction to Linear Algebra and Tensors. Dover Publi-
cations, Inc., New York, NY, 1972
[4] Arnold, V., Mathematical Methods of Classical Mechanics. Springer-Verlag New York, Inc.,
New York, NY, 1980
[5] Bishop, R. and Crittenden, R., Geometry of Manifolds. Academic Press, New York, NY, 1964
[6] Bishop, R. and Goldberg, S., Tensor Analysis on Manifolds. Dover Publications, Inc., New
York, NY, 1980
[7] Flanders, H., Differential Forms with Applications to the Physical Sciences. Dover Publica-
tions, Inc., New York, NY, 1989
[8] Hoffman, K. and Kunze, R., Linear Algebra. Prentice-Hall Inc., Englewood Cliffs, NJ, 1971
[9] Hungerford, T., Algebra. Springer-Verlag Inc., New York, NY, 1974
[10] José, J., and Saletan, E., Classical Dynamics: A Contemporary Approach. Cambridge University Press, Cambridge, U.K., 1998
[11] Misner, C. and Thorne, K. and Wheeler, J., Gravitation. W. H. Freeman and Co., San Francisco, CA, 1973
[12] Nelson, E., Tensor Analysis. Princeton University Press, Princeton, NJ, 1967
[13] Spivak, M., Calculus on Manifolds. W.A. Benjamin, Inc., Menlo Park, CA, 1965
[14] Spivak, M., Differential Geometry Vol. I. Publish or Perish, Inc., Berkeley, CA, 1970
[15] Stacy, B. D., Tensors in Euclidean 3-Space from a Course Offered by J. W. Lee. Notes
transcribed and organized by B. D. Stacy, 1981
[16] Sussman, G., and Wisdom, J., and Mayer, M., Structure and Interpretation of Classical
Mechanics. The MIT Press, Cambridge, MA, 2001
[17] Synge, J. L., and Schild, A., Tensor Calculus. Dover Publications, Inc., New York, NY, 1978
[18] Wald, R., General Relativity. University of Chicago Press, Chicago, IL, 1984
[19] Weinreich, G., Geometrical Vectors. University of Chicago Press, Chicago, IL, 1998
Index

∗, 72
⟨ , ⟩_r, 73
⟨ , ⟩_{G,a}, 47
⟨ , ⟩_{G⋆,a∗}, 47
⌈θ⌉_i, 74
⌟ (the “angle” operator), 41
♭, 46
♯, 46
⊗, 15
∧, 33
|v|, 45
∂y^i/∂x^j, 11
δ^k_i, 10
ε_{i_1...i_n}, 63
Λ^s(V), 35
Λ_s(V), 35
1-form, 15
2-cycle, 31
A, 8
A∗, 8
Ã, 16
A (matrix of transition), 10
A^t, 5
Alt(T), 31
a∗, 8
B (matrix of transition), 10
C^α_β T, 26
det, 23
det(H), 35
det A^{i_1,...,i_r}_{j_1,...,j_r}, 39
E, 7, 9
e, 6
e∗, 7
Formit_σ, 71
G⋆, 46
G(V), 74
G(V∗), 75
GL(n, R), 23
H_1 × ··· × H_{n−1}, 66
Hom(V, V), 26
n-form, 35
n-vector, 35
P_n, 30
Par_{a_1,...,a_n}, 64
R^n, 6
R^{n∗}, 7
S^{∧n}, 61
S_{s,L}, 37
sgn(P), 31
sign(M), 24
Sym(T), 31
T^0_0(V), 15
T^r_s(V), 15
T^{i_1,...,i_r}_{j_1,...,j_s}(a), 16
T^{i_1...i_r}_{j_1...j_s}, 49
T^{ab}_{cde}{}^{fg}, 49
trace(T), 27
V∗, 8
V^r_s, 14
V^r_s(k), 14
Vecit_σ, 71
Vol_a, 64

Abraham, R., 77
absolute tensor, 24
Aczél, J., 77
Akivis, M. A., 77
alternating tensor, 30
angle, from a positive definite inner product, 45
“angle” operator ⌟, 41
antisymmetric tensor, 30
antisymmetrization, 31
aplomb, 6
Arnold, V. I., 77
axial: scalar, 25; tensor, 25; vector or covector, 25
bad: idea, 64; notation, 50
basis: dual, 8; ordered, 6; orthonormal ordered, 54; standard ordered basis for R^n, 6; standard ordered basis for R^{n∗}, 7
bilinear form, 43
Bishop, R. L., 77
braiding map, 31
brackets on indices, 32
capacity tensor, 25
Cartesian: character, 61; tensor, 61
Cauchy-Schwarz Inequality, 45
character: Cartesian, 61; direct Cartesian, 61; tensor, 20
component, 8
conjugate metric tensor, 47
contact forms, 65
contraction: by one contravariant index and one covariant index, 26; by two contravariant or two covariant indices, 50
contravariant, 15; vector, 15
contravector, 15
coordinate: map, 8, 9; polynomial, 20; polynomial for a tensor in a basis, 21
coordinates: of a functional in a dual basis, 9; of a tensor in an ordered basis, 17; of a vector in an ordered basis, 8
covariant, 15; vector, 15
covector, 15
Crittenden, R. J., 77
critters, 5
cross product, 66
degenerate, 44
density tensor, 25
determinant, 23, 35
direct: Cartesian character, 61; Cartesian tensor, 61
dot product, 51
dual: Hodge, 72; of R^n, 7; of R^{n∗}, 7; of V, 8; of V∗, 9; of basis, 8
Einstein: general relativity, 66; summation convention, 13
Euclidean: geometry, 5, 45, 66; inner product on R^n, 51; vector space, 45
evaluation map, 7, 9
even: permutation, 31; relative tensor, 24; weighting function, 24
exterior: algebra, 74; product, 33
fest, index, 6
Flanders, H., 77
form (n-form), 35
functional, linear, 7, 8
general linear group, 23
general relativity, 66
geometry: Euclidean, 5, 45, 66; Minkowski, 66
Goldberg, S. I., 77
Goldberg, V. V., 77
grade-i part, 74
graded algebra, 75
Gram-Schmidt process, 58
Grassmann algebra, 74
Grassmann, Hermann, 76
gymnastics, mental, 23
Hamiltonian mechanics, 44
Hodge: dual, 72; “star” (or simply ∗) operator, 72
Hoffman, K. M., 77
Hungerford, T. W., 77
increasing index list, 36
index of a symmetric bilinear form, 56
inner product, 44: Lorentzian, 66; on G(V∗) and G(V), 75; on Λ^r(V) and Λ_r(V), 73; Riemannian, 66
interior product, 41, 76
invariant, 15, 29
Jacobian matrix, 11
José, J. V., 77
Kronecker delta, 10
Kunze, R. A., 77
Laplace expansion, 39, 41
Lee, J. W., 77
length, 45
Levi-Civita symbol, 63
linear functional, 7
Lorentzian: inner product, 66; manifold, 66; metric tensor, 66
lower index, 47
manifold: Lorentzian, 66; Riemannian, 66
Marsden, J. E., 77
matrix representation, 43, 52
Mayer, M. E., 77
tensor: axial, 25; capacity, 25; Cartesian, 61; character, 20; density, 25; direct Cartesian, 61; metric, 44; oriented, 25; product (outer), 15; pseudotensor, 25; relative, 24 (even, 24; odd, 24); simple, 16; skew symmetric, 30; symmetric, 30; twisted, 25
Thorne, K. S., 77
tiny proof, 28
trace, 27
transition matrices, 10
transpose, 5
Triangle Inequality, 45
tweak, 14
twisted: tensor, 25; vector, 15
n-vector, 35
volume element, 61, 64
Wald, R. M., 77
wedge product, 33
weight of a relative tensor, 24
weighting function, 24
Weinreich, G., 77
Wheeler, J. A., 77
Wisdom, J., 77