Notes on
Linear and Multilinear Algebra
2301-610
Wicharn Lewkeeratiyutkul
Department of Mathematics and Computer Science
Faculty of Science
Chulalongkorn University
August 2014
Contents
Preface

1 Vector Spaces
1.1 Vector Spaces and Subspaces
1.2 Basis and Dimension
1.3 Linear Maps
1.4 Matrix Representation
1.5 Change of Bases
1.6 Sums and Direct Sums
1.7 Quotient Spaces
1.8 Dual Spaces

2 Multilinear Algebra
2.1 Free Vector Spaces
2.2 Multilinear Maps and Tensor Products
2.3 Determinants
2.4 Exterior Products
Bibliography
Preface
This book grew out of the lecture notes for the course 2301-610 Linear and
Multilinear Algebra, given at the Department of Mathematics, Faculty of Science,
Chulalongkorn University, which I have taught over the past five years.
Linear Algebra is one of the most important subjects in Mathematics, with
numerous applications in pure and applied sciences. A more theoretical linear
algebra course emphasizes linear maps between vector spaces, while a more
applied course works mainly with matrices. Matrices have the advantage of being
easier to compute with, while it is often easier to establish results by working
with linear maps. This book tries to establish a close connection between the two
aspects of the subject.
I would like to thank my students who took this course with me and proofread
the early drafts. Special thanks go to Chao Kusollerschariya, who provided
technical help with LaTeX and suggested several easier proofs, and to Detchat
Samart, who supplied the proofs on polynomials.
Please do not distribute.
Wicharn Lewkeeratiyutkul
Chapter 1
Vector Spaces
In this chapter, we will study an abstract theory of vector spaces and linear maps
between vector spaces. A vector space is a generalization of the space of vectors in
the 2- or 3-dimensional Euclidean space. We can add two vectors and multiply a
vector by a scalar. In a general framework, we still can add vectors, but the scalars
don’t have to be numbers; they are required to satisfy some algebraic properties
which constitute a field. A vector space is defined to be a non-empty set that
satisfies certain axioms that generalize the addition and scalar multiplication of
vectors in R2 and R3 . This will allow our theory to be applicable to a wide range
of situations.
Once we have some vector spaces, we can construct new vector spaces from
existing ones by taking subspaces, direct sums and quotient spaces. We then
introduce a basis for a vector space, which can be regarded as choosing a coor-
dinate system. Once we fix a basis for the vector space, every other element can
be written uniquely as a linear combination of elements in the basis.
We also study linear maps between vector spaces. A linear map is a function that
preserves the vector space operations. If we fix bases for vector spaces V and
W , a linear map from V into W can be represented by a matrix. This allows a
computational treatment of the theory. The set of all linear maps between two
vector spaces is a vector space itself. The case when the target space is a scalar
field is of particular importance, called the dual space of the vector space.
A field is a set F together with two operations + and · and two distinguished elements 0 and 1 in F satisfying the following axioms:

(i) ∀x, y, z ∈ F , (x + y) + z = x + (y + z);

(ii) ∀x ∈ F , x + 0 = 0 + x = x;

(iii) ∀x ∈ F ∃y ∈ F , x + y = y + x = 0;

(iv) ∀x, y ∈ F , x + y = y + x;

(v) ∀x, y, z ∈ F , (x · y) · z = x · (y · z);

(vi) ∀x ∈ F , x · 1 = 1 · x = x;

(vii) ∀x ∈ F − {0} ∃y ∈ F , x · y = y · x = 1;

(viii) ∀x, y ∈ F , x · y = y · x;

(ix) ∀x, y, z ∈ F , x · (y + z) = x · y + x · z.
A vector space over a field F is a set V together with an addition + on V and a scalar multiplication · : F × V → V satisfying the following axioms:

(i) ∀u, v, w ∈ V , (u + v) + w = u + (v + w);

(ii) ∃0̄ ∈ V ∀v ∈ V , v + 0̄ = 0̄ + v = v;

(iii) ∀v ∈ V ∃w ∈ V , v + w = w + v = 0̄;

(iv) ∀u, v ∈ V , u + v = v + u;

(v) ∀m, n ∈ F ∀v ∈ V , (m + n) · v = m · v + n · v;

(vi) ∀m ∈ F ∀u, v ∈ V , m · (u + v) = m · u + m · v;

(vii) ∀m, n ∈ F ∀v ∈ V , (mn) · v = m · (n · v);

(viii) ∀v ∈ V , 1 · v = v.
The following basic properties follow from the axioms:

(i) ∀v ∈ V , 0 · v = 0̄;

(ii) ∀k ∈ F , k · 0̄ = 0̄;

(iii) ∀v ∈ V ∀k ∈ F , k · v = 0̄ ⇔ k = 0 or v = 0̄.
Example 1.1.5. The following sets with the given addition and scalar multiplication are vector spaces over F .

(i) The set of n-tuples

F n = {(x1 , x2 , . . . , xn ) | xi ∈ F, i = 1, 2, . . . , n},

with coordinatewise addition and scalar multiplication.

(ii) The set Mm×n (F ) of m × n matrices over F , with the usual matrix addition and scalar multiplication. Note that if m = n, we write Mn (F ) for Mn×n (F ).

(iv) The space S of all sequences in F , S = {(xn ) | xn ∈ F for all n ∈ N}, with coordinatewise operations.

(v) The set of all F -valued functions on a non-empty set X,

F(X) = {f : X → F },

with the pointwise operations.
Once we have some vector spaces to begin with, there are several methods to
construct new vector spaces from the old ones. We first consider a subset which
is also a vector space under the same operations.
(i) W is a subspace of V ;
(ii) ∀v ∈ W ∀w ∈ W , v + w ∈ W and ∀v ∈ W ∀k ∈ F, kv ∈ W ;
(iii) ∀v ∈ W ∀w ∈ W ∀α ∈ F ∀β ∈ F , αv + βw ∈ W .
Example 1.1.8.
(v) By Example 1.1.5 (v), the set of real-valued functions F([a, b]) is a vector
space over R. Now let
Then C([a, b]) is a subspace of F([a, b]). This follows from the standard
fact from calculus that a sum of two continuous functions is still continuous
and a multiplication of a continuous function by a scalar is also continuous.
(vi) Let S be the sequence space in Example 1.1.5 (iv). The following subsets are subspaces of S:

ℓ1 = { (xn ) ∈ S : ∑_{n=1}^∞ |xn | < ∞ },

ℓ∞ = { (xn ) ∈ S : sup_{n∈N} |xn | < ∞ },

c = { (xn ) ∈ S : lim_{n→∞} xn exists }.

These subspaces play an important role and will be studied in greater detail in functional analysis.
(ii) S ⊆ hSi;
(iii) S ⊆ T ⇒ hSi ⊆ hT i;
(iv) Since S ⊆ hSi, by (iii), hSi ⊆ hhSii. On the other hand, hhSii is the smallest
subspace of V containing hSi, and hSi is itself a subspace of V containing hSi.
It follows that hhSii ⊆ hSi. Hence hhSii = hSi.
v = α1 v1 + · · · + αn vn and w = β1 w1 + · · · + βm wm .
It follows that v + w = α1 v1 + · · · + αn vn + β1 w1 + · · · + βm wm and kv = (kα1 )v1 + · · · + (kαn )vn are again linear combinations of elements of S.
Example 1.1.15.
(i) Let V = F n , where F is a field. Let ei = (0, . . . , 0, 1, 0, . . . , 0) be the n-tuple with 1 in the i-th coordinate and 0 elsewhere, for i = 1, . . . , n. Then {e1 , . . . , en } spans F n .
(iii) The set of monomials {1, x, x2 , x3 , . . . } spans the vector space F [x] because
any polynomial in F [x] can be written as a linear combination of monomials.
(iv) Let S be the space of sequences in F . For each k ∈ N, let

ek = (0, . . . , 0, 1, 0, 0, . . . ),

where ek has 1 in the k-th coordinate, and 0's elsewhere. Then {ek }_{k=1}^∞ does not span S. For example, the sequence (1, 1, 1, . . . ) cannot be written as a (finite) linear combination of the ek 's. In fact,

span {ek }_{k=1}^∞ = {(xn ) ∈ S | xn = 0 for all but finitely many n's}.
Exercises
1.1.1. Determine which of the following subsets of R4 are subspaces of R4 .
(i) U = {(a, b, c, d) | a + 2b = c + 2d}.
1.1.7. An abelian group hV, +i is called divisible if for any non-zero integer n,
nV = V , i.e. if for every u ∈ V and for any non-zero integer n, there exists v ∈ V
such that u = nv.
Prove that an abelian group hV, +i is a vector space over Q if and only if V
is divisible, all of whose non-zero elements are of infinite order.
k1 v1 + · · · + kn vn = 0 ⇒ k1 = · · · = kn = 0.
Remark.
Example 1.2.2.
(iii) Let V = F [x]. Then the set {1, x, x2 , x3 , . . . } is linearly independent. This
follows from the fact that a polynomial a0 + a1 x + · · · + an xn = 0 if and
only if ai = 0 for i = 0, 1, . . . , n.
(iv) Let S be the vector space of sequences in F defined in Example 1.1.15 (iv).
For each k ∈ N, let ek be the k-th coordinate sequence. Then {ek }_{k=1}^∞ is
linearly independent in S. We leave it to the reader to verify this fact.
(v) Let V = C([0, 1]), the space of continuous real-valued functions defined on
[0, 1]. Let f (x) = 2^x and g(x) = 3^x for any x ∈ [0, 1]. Then {f, g} is linearly
independent in C([0, 1]). Indeed, let α, β ∈ R be such that αf + βg = 0.
Then α 2^x + β 3^x = 0 for any x ∈ [0, 1]. Taking x = 0 gives α + β = 0, and taking x = 1 gives
2α + 3β = 0. Solving these equations, we obtain α = β = 0.
k1 v1 + k2 v2 + · · · + kn vn + kv = 0.
Example 1.2.5.
Theorem 1.2.6. Let B be a basis for a vector space V . Then any element in V
can be written uniquely as a linear combination of elements in B.
αi − βi = 0 for i = 1, . . . , k;
αi = 0 for i = k + 1, . . . , n;
−βj = 0 for j = k + 1, . . . , m.
Hence m = n = k and v is written uniquely as the linear combination ∑_{i=1}^n αi vi .
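For instance, in R2 with the basis {(1, 1), (−1, 1)}, the vector (3, 1) is written uniquely as

(3, 1) = 2 (1, 1) + (−1)(−1, 1),

since the coefficients α, β in (3, 1) = α(1, 1) + β(−1, 1) must satisfy α − β = 3 and α + β = 1, which forces α = 2 and β = −1.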
Next we show that every vector space has a basis. The proof of this requires
the Axiom of Choice in an equivalent form of Zorn’s Lemma which we recall first.
Theorem 1.2.9. In a vector space, every linearly independent set can be extended
to a basis. In particular, every vector space has a basis.
Proof. The second statement follows from the first one by noting that the empty
set is a linearly independent set and thus can be extended to a basis. To prove
the first statement, let So be a linearly independent set in a vector space V . Set
Since each vi belongs to some Sαi and E′ is a totally ordered set, there must be
Sβ in E′ such that vi ∈ Sβ for i = 1, . . . , n. Since Sβ is linearly independent, we
have ki = 0 for i = 1, . . . , n. This shows that T is an upper bound of E′ in E.
By Zorn's Lemma, E has a maximal element S∗ . Thus S∗ is a linearly independent set containing So . If there is v ∉ hS∗ i, by Theorem 1.2.3, S∗ ∪ {v} is a
linearly independent set containing So . This contradicts the maximality of S∗ .
Hence hS∗ i = V , which implies that S∗ is a basis for V .
The proof of the existence of a basis for a vector space above relies on the
Zorn’s Lemma, which is equivalent to the Axiom of Choice. Any proof that
requires the Axiom of Choice is nonconstructive. It gives the existence without
explaining how to find one. In this situation, it implies that every vector space
contains a basis, but it does not tell us how to construct one. If the vector
space is finitely generated, i.e. spanned by a finite set, then we can construct
a basis from the spanning set. In general, we know that a vector space has a
basis but we may not be able to give one such example. For example, consider
the vector space S = {(xn ) | xn ∈ F for all n ∈ N}. We have seen that the set
of coordinate sequences {ek }_{k=1}^∞ is a linearly independent subset of S and hence
can be extended to a basis for S, but we do not have an explicit description of
such a basis.
A basis for a vector space is not unique. However, any two bases for the
same vector space have the same cardinality. We begin by proving the following
theorem.
Theorem 1.2.10. For any vector space V , if V has a spanning set with n ele-
ments, then any subset of V with more than n elements is linearly dependent.
We examine the scalars ai1 that multiply v1 and split the proof into two cases.
Case 1: ai1 = 0 for i = 1, . . . , m. In this case, the sums in (1.1) do not involve
v1 . Let W = span{v2 , . . . , vn }. Then W is spanned by a set with n − 1 elements,
R ⊆ W and |R| = m > n > n − 1. It follows that R is linearly dependent.
In fact, the above Corollary is true if V has infinite bases too, but the proof
requires arguments involving infinite cardinal numbers, which is beyond the scope
of this book, so we state it as a fact below and omit the proof.
Theorem 1.2.12. All bases for a vector space have the same cardinality.
Example 1.2.14.
(i) dim({0}) = 0.
(ii) dim F n = n.
Proof. Let B be a basis for W . Then |B| = dim W = dim V . By Corollary 1.2.16,
B is a basis for V . Hence W = hBi = V .
Exercises
1.2.1. Prove that {1, √2, √3} is linearly independent over Q, but linearly dependent over R.
1.2.2. Prove that {sin x, cos x} is a linearly independent subset of C([0, π]).
1.2.5. Let A and B be linearly independent subsets of a vector space V such that
A∩B = ∅. Show that A∪B is linearly independent if and only if hAi∩hBi = {0}.
1.2.7. Let V be a vector space over a field F and S ⊆ V with |S| ≥ 2. Show
that S is linearly dependent if and only if some element of S can be written as a
linear combination of the other elements in S.
1.2.8. Let S be a subset of a vector space V . Show that S is a basis for V if and
only if S is a minimal spanning subset of V .
We can combine conditions (i) and (ii) into a single condition as follows:

∀u, v ∈ V ∀α, β ∈ F , T (αu + βv) = αT (u) + βT (v).

Hence T is linear.
The above proposition says that a linear map preserves a linear combination
of two elements. By mathematical induction, it preserves any finite linear
combination of elements in a vector space.
(i) The zero map T : V → W defined by T (v) = 0̄ for all v ∈ V . The zero map
will be denoted by 0.
On the space S of sequences in F , define R(x1 , x2 , x3 , . . . ) = (0, x1 , x2 , . . . ) and
L(x1 , x2 , x3 , . . . ) = (x2 , x3 , x4 , . . . ). The map R is called the right-shift operator and the map L is called the
left-shift operator. Both are linear maps on S.
Definition 1.3.6. Let T : V → W be a linear map. Define the kernel and the
image of T to be the following sets:

ker T = {v ∈ V | T (v) = 0̄} and im T = {T (v) | v ∈ V }.
(ii) im T is a subspace of W ;
Thus im T is a subspace of W .
(iv) Suppose that T is 1-1. It is clear that {0} ⊆ ker T . Let u ∈ ker T . Then
T (u) = 0 = T (0). Since T is 1-1, u = 0. Hence ker T = {0}.
Conversely, assume that ker T = {0}. Let u, v ∈ V be such that T (u) = T (v).
Then T (u − v) = T (u) − T (v) = 0. Thus u − v = 0, i.e. u = v. This shows that
T is 1-1.
The next theorem states the relation between the dimensions of the kernel
and the image of a linear map.
Then

T ( ∑_{i=k+1}^n αi vi ) = ∑_{i=k+1}^n αi T (vi ) = 0.

Hence ∑_{i=k+1}^n αi vi ∈ ker T . Since A is a basis for ker T , there exist α1 , . . . , αk ∈ F
such that

∑_{i=k+1}^n αi vi = ∑_{i=1}^k αi vi .

It follows that

∑_{i=1}^k αi vi + ∑_{i=k+1}^n (−αi ) vi = 0.
2x − y = 0, x + 2y − z = 0, z − 5x = 0.
Solving this system of equations, we see that y = 2x, z = 5x, where x is a free
variable. Hence ker T = {(x, 2x, 5x) | x ∈ R} = h(1, 2, 5)i. Moreover,
Since (2, 1, −5) = −2(−1, 2, 0) − 5(0, −1, 1), im T = h(−1, 2, 0), (0, −1, 1)i. Hence
rank T = 2 and the nullity of T is 1.
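Note that these dimensions agree with the relation between the dimensions of the kernel and the image discussed above:

dim ker T + rank T = 1 + 2 = 3 = dim R3 .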
The next theorem states that a function defined on a basis of a vector space
can be uniquely extended to a linear map on the entire vector space. Hence a
linear map on a vector space is uniquely determined on its basis.
Theorem 1.3.11. Let B be a basis for a vector space V . Then for any vector
space W and a function t : B → W , there is a unique linear map T : V → W
which extends t.
Clearly, this map is well-defined and T extends t. To show that T is linear, let
u, v ∈ V and r, s ∈ F . Then
u = ∑_{i=1}^m αi ui and v = ∑_{j=1}^n βj vj

for some ui , vj ∈ B and αi , βj ∈ F , where (after renumbering) ui = vi for i = 1, . . . , k are the basis elements common to both representations. Hence

T (ru + sv) = ∑_{i=1}^k (rαi + sβi ) t(ui ) + ∑_{i=k+1}^m rαi t(ui ) + ∑_{j=k+1}^n sβj t(vj )
            = r ∑_{i=1}^m αi t(ui ) + s ∑_{j=1}^n βj t(vj )
            = r T (u) + s T (v).
Uniqueness: Assume that S and T are linear maps from V into W that are
extensions of t. Let v ∈ V . Then v can be written uniquely as v = ∑_{i=1}^n ki vi for
some v1 , . . . , vn ∈ B and k1 , . . . , kn ∈ F . Then S(v) = ∑_{i=1}^n ki S(vi ) = ∑_{i=1}^n ki t(vi ).
Do the same for T . We can see that S(v) = T (v) for any v ∈ V .
We can state the above theorem in terms of the universal mapping property,
which will be useful later.
Let iB : B ,→ V denote the inclusion map defined by iB (x) = x for any x ∈ B.
Then the above theorem can be restated as:
For any vector space W and a function t : B → W , there exists a unique linear map T : V → W such that T ◦ iB = t. (Diagrammatically: the triangle formed by iB : B → V , t : B → W and T : V → W commutes.)
Example 1.3.14.
a1 v1 + · · · + an vn ←→ (a1 , . . . , an ).
(ii) Mm×n (F ) ≅ Mn×m (F ). The linear maps Φ : Mm×n (F ) → Mn×m (F ) and
Ψ : Mn×m (F ) → Mm×n (F ) defined by Φ(A) = At and Ψ(B) = B t are inverses of each other.
(i) T is 1-1;
(ii) T is onto;
Proof. The condition T S = IV implies that T is onto and S is 1-1. The conclusion
now follows from Theorem 1.3.16.
Remark. Theorem 1.3.16 and Corollary 1.3.17 may not hold if V is infinite
dimensional. See problem 1.3.13.
The next proposition shows that a composition of linear maps is still linear.
(x · y) · z = x · (y · z) for any x, y, z ∈ V .
Proposition 1.3.25. Let V be a vector space over a field F . Define the product
on L(V ) by ST = S ◦ T for any S, T ∈ L(V ). Then L(V ) is a unital algebra. If
dim V > 1, it is a non-commutative algebra.
Proof. By Proposition 1.3.20, L(V ) is a vector space over F . By linearity of S,
for any S, T1 , T2 ∈ L(V ) and α, β ∈ F ,

S(αT1 + βT2 ) = α ST1 + β ST2 .

On the other hand, by the definition of addition, for any S1 , S2 , T ∈ L(V ) and
α, β ∈ F ,

(αS1 + βS2 )T = α S1 T + β S2 T.
The associativity of the product follows from the associativity of the composition
of functions. Moreover, IV T = T IV = T . Hence L(V ) is a unital algebra. If
dim V > 1, choose a linearly independent subset {x, y} of V and extend it to a
basis for V . Define S(x) = y, S(y) = y, T (x) = x and T (y) = x and extend them
to linear maps on V . It is easy to see that ST (x) ≠ T S(x).
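For instance, spelling this out: ST (x) = S(T (x)) = S(x) = y, while T S(x) = T (S(x)) = T (y) = x, and y ≠ x because {x, y} is linearly independent. (In the language of the next section, if V = F 2 , x = (1, 0) and y = (0, 1), then with respect to the basis {x, y} the matrices of S and T are the 2 × 2 matrices with rows (0, 0), (1, 1) and (1, 1), (0, 0), respectively, and these two matrices do not commute.)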
(ii) The set F [x] of polynomials over F is a unital commutative algebra under
the usual polynomial operations. The polynomial 1 is the multiplicative
identity.
(iii) Let X be a non-empty set and F a field. The set of all F -valued functions
F(X) = {f : X → F } is a unital commutative algebra under the point-
wise operations. The constant function 1(x) = 1 for any x ∈ X, is the
multiplicative identity.
(iv) The space C([a, b]) of continuous functions on [a, b] is a unital commutative
algebra under the pointwise operations.
Exercises
1.3.1. Fix a matrix Q ∈ Mn (F ) and let W = {A ∈ Mn (F ) | AQ = QA}.
T −1 [W ] = {u ∈ U | T (u) ∈ W }
is a subspace of U .
1.3.5. Let V be a vector space over a field F with dim V = 1. Show that
if T : V → V is a linear map, then there exists a unique scalar k such that
T (v) = kv for any v ∈ V .
1.3.10. Let Vi be vector spaces over a field F and fi : Vi → Vi+1 linear maps.
Consider a sequence
· · · −→ Vi−1 −−fi−1−→ Vi −−fi−→ Vi+1 −→ · · ·

0 −→ V1 −→ V2 −→ · · · −→ Vn −→ 0.

Prove that

∑_{i=1}^n (−1)^i dim Vi = 0.
1.3.13. Give an example to show that Theorem 1.3.16 and Corollary 1.3.17 may
not hold if V is infinite dimensional.
1.3.14. Prove that the set {Tij } in Proposition 1.3.21 is a basis for L(V, W ).
In other words, an ordered basis for a vector space is a basis such that the order
of its elements is taken into account. We still use the usual notation {v1 , . . . , vn }
for ordered basis (v1 , . . . , vn ).
Proposition 1.4.3. Let V be a vector space over a field F with dim V = n. Fix
an ordered basis B for V . Then the map v 7→ [v]B is a linear isomorphism from
V onto F n .
v = α1 v1 + · · · + αn vn ←→ [α1 . . . αn ]t .
It is easy to see that the map in each direction is a linear map and is an inverse
of each other.
Theorem 1.4.4. Let V and W be vector spaces over a field F with dim V = n
and dim W = m. Fix ordered bases B for V and C for W , respectively. If
T : V → W is a linear map, then there is a unique m × n matrix A such that
For each j ∈ {1, . . . , n}, there exist a1j , . . . , amj ∈ F such that
T (vj ) = ∑_{i=1}^m aij wi . (1.4)
Now we obtain all entries aij ’s of A. Hence if A satisfies (1.3), then A must be in
this form. Now we show that the matrix A defined this way satisfies (1.3). Let
v ∈ V and write v = k1 v1 + · · · + kn vn , where k1 , . . . , kn ∈ F . Then
T (v) = T ( ∑_{j=1}^n kj vj ) = ∑_{j=1}^n kj T (vj )
      = ∑_{j=1}^n kj ∑_{i=1}^m aij wi
      = ∑_{i=1}^m ( ∑_{j=1}^n aij kj ) wi .

Hence [T (v)]C is an m × 1 matrix whose i-th row is ∑_{j=1}^n aij kj . On the other
hand, A[v]B is an m × 1 matrix whose i-th row is obtained by multiplying the i-th
row of A by the only column of [v]B . Hence the i-th row of A[v]B is ∑_{j=1}^n aij kj .
Remark. We can give an alternative proof of the existence part of the above
theorem as follows. For each vj in B, we write
T (vj ) = ∑_{i=1}^m aij wi .
Form an m × n matrix A with the (i, j)-entry aij given by the above equation.
Hence

[T (vj )]C = [a1j . . . amj ]^t .

On the other hand, A[vj ]B is the j-th column of A. Hence [T (vj )]C = A[vj ]B .
Now we can view [T ( · )]C and A[ · ]B = LA ([ · ]B ) as composite functions of linear
maps and hence both of them are linear. We have established the equality of
these two linear maps on the ordered basis B = {v1 , . . . , vn } and thus they must
be equal on all elements v ∈ V .
Definition 1.4.5. The unique matrix A in Theorem 1.4.4 is called the matrix
representation of T with respect to the ordered bases B and C, respectively, and
is denoted by [T ]B,C . Hence

[T (v)]C = [T ]B,C [v]B for any v ∈ V.

In other words, the following diagram commutes: the coordinate maps [ ]B : V → F n and [ ]C : W → F m , together with T : V → W and multiplication by [T ]B,C from F n to F m , satisfy [ ]C ◦ T = L[T ]B,C ◦ [ ]B .
Let B = {(1, 1, 0), (0, 1, 1), (1, 0, 1)} and C = {(1, 1), (−1, 1)} be bases for R3 and
R2 , respectively. Find [T ]B,C .
Solution. If we rotate the points (1,0) and (0,1) on the plane counterclockwise
by θ-angle, using elementary geometry, we see that they get moved to the points
(cos θ, sin θ) and (− sin θ, cos θ), respectively. Hence
Thus

[Tθ ]B = ( cos θ   −sin θ
           sin θ    cos θ ),

where B is the standard ordered basis. If (x, y) ∈ R2 , then

[Tθ (x, y)]B = [Tθ ]B [x y]^t = [ x cos θ − y sin θ , x sin θ + y cos θ ]^t ,

which implies Tθ (x, y) = (x cos θ − y sin θ, x sin θ + y cos θ).
By Theorem 1.4.10 and Corollary 1.4.11, we see that linear maps and matrices
are two aspects of the same thing. We can prove theorems about matrices by
working with linear maps instead. See, e.g., Exercises 1.4.1-1.4.3. On the other
hand, matrices have an advantage of being easier to calculate with.
In the remaining part of this section, we discuss the rank of a matrix. The
rank of a linear map is the dimension of its image. We will define the rank of a
matrix to be the dimension of its column space, which turns out to be the same
as the dimension of its row space. We will establish the relation between the rank
of a matrix and the rank of the corresponding linear map.
Theorem 1.4.14. Let A be an m × n matrix over a field F . Then the row rank
and the column rank of A are equal.
vk = (βk1 , . . . , βkn ) ∈ F n .
For i = 1, . . . , m, write

ri = ∑_{k=1}^d αik vk = ∑_{k=1}^d αik (βk1 , . . . , βkn ).

For k = 1, . . . , d, set xk = (α1k , . . . , αmk ) ∈ F m . Hence for j = 1, . . . , n,

cj = (a1j , . . . , amj )
   = ( ∑_{k=1}^d α1k βkj , . . . , ∑_{k=1}^d αmk βkj )
   = ∑_{k=1}^d βkj (α1k , . . . , αmk )
   = ∑_{k=1}^d βkj xk ,

which shows that

hc1 , . . . , cn i ⊆ hx1 , . . . , xd i .
Hence the column rank of A ≤ d = the row rank of A. But this is true for
any matrix A. Thus the column rank of At ≤ the row rank of At . Since the row
space and the column space of At are the column space and the row space of A,
respectively, this says that the row rank of A ≤ the column rank of A. Therefore
the column rank of A equals the row rank of A.
Remark. The elementary row operations preserve the row space of a matrix.
Hence the rank of a matrix is still preserved under the elementary row operations.
We can apply these operations to the matrix until it is in a reduced echelon form.
Then the rank of the matrix is the number of non-zero row vectors in the reduced
echelon form. However, the elementary row operations do not preserve the column
space (but they do preserve the column rank, which equals the row rank).
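For instance, consider the matrix A ∈ M3 (R) with rows (1, 2, 3), (2, 4, 6) and (1, 1, 1). Subtracting twice the first row from the second and the first row from the third gives rows (1, 2, 3), (0, 0, 0) and (0, −1, −2); after interchanging the last two rows the matrix is in echelon form with two non-zero rows, so rank A = 2.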
It follows that im LA = the column space of A. Thus (ii) follows from (i).
Exercises
1.4.1. Recall that if A ∈ Mm×n (F ), then the linear map LA : F n → F m is given
by LA (x) = Ax, for all x ∈ F n . Prove the following statements:
(i) if B and C are standard ordered bases for F n and F m , respectively, then
[LA ]B,C = A;
(i) ker T ≅ ker LA ;

(ii) im T ≅ im LA ;
Theorem 1.5.1. Let B and B′ be ordered bases for a vector space V . Then there
exists a unique square matrix P such that

[v]B′ = P [v]B for any v ∈ V. (1.7)

(Diagrammatically: the coordinate maps [ ]B , [ ]B′ : V → F n and multiplication by P from F n to F n form a commutative triangle.)
Proof. The proof of this theorem is similar to the proof of Theorem 1.4.4. Let
B = {v1 , . . . , vn } and B′ = {v1′ , . . . , vn′ } be ordered bases for V . First, assume
that there is a matrix P such that (1.7) holds. For each vj ∈ B, [vj ]B is the n × 1
column matrix with 1 in the j-th row and 0 in the other positions. Thus P [vj ]B
is the j-th column of P . Hence for (1.7) to hold, the j-th column of P must be
[vj ]B′ . It follows that P will be of the form:

P = [ [v1 ]B′ [v2 ]B′ . . . [vn ]B′ ].
It remains to show that the matrix P defined above satisfies (1.7). The proof is
the same as that of Theorem 1.4.4 and we leave it as an exercise.
Definition 1.5.2. The matrix P with the property above is called the transition
matrix from B to B′ . Notice that this is the same as [IV ]B,B′ .
The proof of Theorem 1.5.1 gives a method for finding a transition matrix.
Let B = {v1 , . . . , vn } and B′ = {v1′ , . . . , vn′ } be ordered bases for V . The j-th
column of the transition matrix from B to B′ is the coordinate vector of vj with
respect to B′ . More precisely, for j = 1, . . . , n, write

vj = ∑_{i=1}^n pij vi′ .
Example 1.5.3. Let B = {(1, 0), (0, 1)} and B′ = {(1, 1), (−1, 1)} be ordered
bases for R2 . Find the transition matrix from B to B′ and the transition matrix
from B′ to B.
Solution. Note that

(1, 0) = (1/2)(1, 1) − (1/2)(−1, 1),
(0, 1) = (1/2)(1, 1) + (1/2)(−1, 1).

Hence the transition matrix from B to B′ is

P = (  1/2   1/2
      −1/2   1/2 ).

Similarly,
Hence
But then the identity matrix I is the unique matrix such that [v]B = I[v]B for
any v ∈ V . Thus QP = I. For the same reason, P Q = I. This shows that P and
Q are inverses of each other.
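To illustrate with Example 1.5.3: for v = (3, 1) we have [v]B = [3 1]^t and

P [v]B = [ (3 + 1)/2 , (−3 + 1)/2 ]^t = [2 −1]^t = [v]B′ ,

which agrees with (3, 1) = 2(1, 1) − (−1, 1). The transition matrix Q from B′ to B has columns [(1, 1)]B = [1 1]^t and [(−1, 1)]B = [−1 1]^t , and a direct computation gives QP = P Q = I.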
The next theorem shows the relation between the matrix representations of
the same linear map with respect to different ordered bases.
(Commutative diagram: T : V → W across the top; the coordinate maps [ ]B , [ ]B′ : V → F n and [ ]C , [ ]C′ : W → F m ; the transition matrices P and Q acting on F n and F m ; and the matrix representations [T ]B,C , [T ]B′,C′ : F n → F m .)
Replacing w = T (v) in (1.10) and applying the other identities above, we have
for any v ∈ V . But then [T ]B′,C′ is the unique matrix such that
Exercises
1.5.1. If A = [aij ] is a square matrix in Mn (F ), define the trace of A to be the
sum of all entries in the main diagonal:
tr(A) = ∑_{i=1}^n aii .
(i) If V is finite dimensional, prove that it is impossible to find two linear maps
S and T on V such that ST − T S = IV .
(ii) Show that the statement in (i) is not true if V is infinite dimensional.
(Take V = F [x], S(f )(x) = f 0 (x) and T (f )(x) = xf (x) for any f ∈ F [x].)
1.5.5. Let V be a finite-dimensional vector space with dim V = n. Show that two
n × n matrices A and B are similar if and only if they are matrix representations
of the same linear map on V with respect to (possibly) different ordered bases.
1.5.7. Show that if A and B are similar matrices, then rank A = rank B.
Definition 1.6.1. Let V and W be vector spaces over the same field F . Define
V × W = {(v, w) | v ∈ V, w ∈ W },
(v, w) + (v 0 , w0 ) = (v + v 0 , w + w0 )
k(v, w) = (kv, kw)
Then

( ∑_{i=1}^n αi vi , ∑_{j=1}^m βj wj ) = (0, 0).

Hence

∑_{i=1}^n αi vi = 0 and ∑_{j=1}^m βj wj = 0.
Proposition 1.6.2 shows that the dimension of V × W is the sum of the dimen-
sions of V and W if they are finite-dimensional. It suggests that the Cartesian
product V × W is really a “sum” and not a product of vector spaces. That is
why we call it the external direct sum. The adjective external is to emphasize
that we construct a new vector space from the existing ones.
Next, we turn to constructing a new subspace from existing ones. We know
that an intersection of subspaces is still a subspace, but a union of subspaces may
not be a subspace. The sum of subspaces will play a role of union as we will see
below.
W1 + W2 = {w1 + w2 | w1 ∈ W1 , w2 ∈ W2 }.
W1 + W2 = hW1 ∪ W2 i .
Example 1.6.5.
R2 = h(1, 0)i + h(0, 1)i = h(1, 0)i + h(1, 1)i = h(1, −1)i + h(1, 1)i .
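For instance, the last equality holds because every (x, y) ∈ R2 can be written as

(x, y) = ((x − y)/2)(1, −1) + ((x + y)/2)(1, 1) ∈ h(1, −1)i + h(1, 1)i .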
Example 1.6.8.
(ii) Let V = Mn (R) and let W1 and W2 be the subspaces of symmetric matrices
and of skew-symmetric matrices, respectively:
u1 + u2 = ∑_{i=1}^k (αi + βi ) vi + ∑_{i=1}^n αk+i wi + ∑_{i=1}^m βk+i wi′ .

∑_{i=1}^k αi vi + ∑_{i=1}^n βi wi + ∑_{i=1}^m βi′ wi′ = 0. (1.13)

Then

∑_{i=1}^k αi vi + ∑_{i=1}^n βi wi = − ∑_{i=1}^m βi′ wi′ ∈ W1 ∩ W2 .

Hence

− ∑_{i=1}^m βi′ wi′ = ∑_{i=1}^k γi vi

for some γ1 , . . . , γk in F , which implies

∑_{i=1}^k γi vi + ∑_{i=1}^m βi′ wi′ = 0.

By linear independence of B2 , the γi and βi′ are all zero. Now, (1.13) reduces to

∑_{i=1}^k αi vi + ∑_{i=1}^n βi wi = 0.
The next proposition shows the relation between internal and external direct
sums.
Proof. Exercise.
In the future, we will talk about a direct sum without stating whether it is
internal or external. We also write V ⊕ W to denote the (external) direct sum of
V and W . It should be clear from the context whether it is internal or external.
Moreover, by Proposition 1.6.11, we can regard it as an internal direct sum or
an external direct sum without confusion. We sometimes omit the adjective
“internal” or “external” and simply talk about the direct sum of vector spaces.
It is easy to show that T is linear and satisfies T |V1 = T1 and T |V2 = T2 . This
finishes the uniqueness and existence of the map T .
Proposition 1.6.13 is the universal mapping property for the direct sum. If
we let ι1 : V1 → V1 ⊕ V2 and ι2 : V2 → V1 ⊕ V2 be the inclusion maps of V1 and V2
into V1 ⊕ V2 , respectively, then it can be summarized by the following diagram:
(Diagram: the inclusions ι1 : V1 → V1 ⊕ V2 and ι2 : V2 → V1 ⊕ V2 , the maps T1 : V1 → W and T2 : V2 → W , and the induced map T : V1 ⊕ V2 → W with T ◦ ι1 = T1 and T ◦ ι2 = T2 .)
This proposition can also be interpreted for the external direct sum if we define
ι1 : V1 → V1 ⊕ V2 and ι2 : V2 → V1 ⊕ V2 by ι1 (v1 ) = (v1 , 0) and ι2 (v2 ) = (0, v2 ) for
any v1 ∈ V1 and v2 ∈ V2 .
There is also another universal mapping property of the direct sum in terms
of the projection maps.
Proposition 1.6.14. Let V1 and V2 be vector spaces over the same field. For
i = 1, 2, define πi : V1 ⊕ V2 → Vi by πi (v1 , v2 ) = vi for any v1 ∈ V1 and v2 ∈ V2 .
Then given any vector space W and linear maps T1 : W → V1 and T2 : W → V2 ,
there is a unique linear map T : W → V1 ⊕V2 such that π1 ◦T = T1 and π2 ◦T = T2 .
(Diagram: the projections π1 : V1 ⊕ V2 → V1 and π2 : V1 ⊕ V2 → V2 , the maps T1 : W → V1 and T2 : W → V2 , and the induced map T : W → V1 ⊕ V2 with π1 ◦ T = T1 and π2 ◦ T = T2 .)
Proof. Exercise.
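One natural way to carry out this exercise is to define T coordinatewise:

T (w) = (T1 (w), T2 (w)) for all w ∈ W.

Then T is linear because T1 and T2 are, and π1 ◦ T = T1 , π2 ◦ T = T2 . Conversely, any linear map T with πi ◦ T = Ti must send w to (T1 (w), T2 (w)), which gives uniqueness.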
Next, we will define a sum and a direct sum for a finite number of subspaces.
W1 + · · · + Wn = {w1 + · · · + wn | w1 ∈ W1 , . . . , wn ∈ Wn }.
W1 + · · · + Wn = hW1 ∪ · · · ∪ Wn i .
(i) V = W1 + · · · + Wn , and
Denote it by V = W1 ⊕ · · · ⊕ Wn .
The second condition in the above definition can be replaced by one of the
following equivalent statements below.
v = w1 + · · · + wn = w10 + · · · + wn0 ,
v = w1 + · · · + wi−1 + 0 + wi+1 + · · · + wn ,
for i = 1, 2, . . . , n.
W1 × · · · × Wn = {(w1 , . . . , wn ) | w1 ∈ W1 , . . . , wn ∈ Wn }.
We list important results for a finite direct sum of vector spaces whose proofs
are left as exercises.
Proof. Exercise.
Proof. Exercise.
In the above definition, we see that a direct product and an external direct
sum are the same when we have a finite number of vector spaces. Next, we
consider the general case when we have an arbitrary number of vector spaces. In
this case, the definitions of a direct product and an external direct sum will be
different. But there is a close relation between the internal direct sum and the
external direct sum.
Definition 1.6.23. Let {Vα }α∈Λ be a family of vector spaces. Define the Cartesian product

∏_{α∈Λ} Vα = { v : Λ → ∪_{α∈Λ} Vα : v(α) ∈ Vα for all α ∈ Λ }.
Define

⊕_{α∈Λ} Vα = { v ∈ ∏_{α∈Λ} Vα : v(α) = 0 for all but finitely many α ∈ Λ }.

It is easy to see that ⊕_{α∈Λ} Vα is a subspace of ∏_{α∈Λ} Vα . We call ⊕_{α∈Λ} Vα the
(external) direct sum of {Vα }α∈Λ . Note that the direct product and the external
direct sum of {Vα }α∈Λ are the same when the index set Λ is finite.
Definition 1.6.24. Let V be a vector space over a field F . Let {Vα }α∈Λ be a
family of subspaces of V such that

(i) V = h ∪_{α∈Λ} Vα i;

(ii) for each β ∈ Λ, Vβ ∩ h ∪_{α∈Λ−{β}} Vα i = {0}.

Then we say that V is the (internal) direct sum of {Vα }α∈Λ and denote it by
V = ⊕_{α∈Λ} Vα . An element in ⊕_{α∈Λ} Vα can be written as a finite sum ∑_{α∈Λ} vα ,
where vα ∈ Vα for each α ∈ Λ and vα = 0 for all but finitely many α's. Moreover,
this representation is unique.
Theorem 1.6.25. Let {Vα }α∈Λ be a family of vector spaces. Form the external
direct sum V = ⊕_{α∈Λ} Vα . For each α ∈ Λ, let Wα be the subspace of V defined
by

Wα = {v ∈ V | v(β) = 0 for all β ∈ Λ − {α}}.

Then Wα ≅ Vα for each α ∈ Λ and V = ⊕_{α∈Λ} Wα as an internal direct sum.
On the other hand, let V be a vector space over a field F and {Wα }α∈Λ a
family of subspaces of V such that V = ⊕_{α∈Λ} Wα as an internal direct sum.
Form the external direct sum W = ⊕_{α∈Λ} Wα . Then V ≅ W .
Proof. Exercise.
Exercises
1.6.1. Let V = Mn (R) be a vector space over R. Define
(ii) P 2 = P and Q2 = Q;
(i) L(U ⊕ V, W ) ≅ L(U, W ) ⊕ L(V, W );

(ii) L(U, V ⊕ W ) ≅ L(U, V ) ⊕ L(U, W );

(iii) L( ⊕_{i=1}^n Vi , W ) ≅ ⊕_{i=1}^n L(Vi , W );

(iv) L(U, ⊕_{i=1}^n Vi ) ≅ ⊕_{i=1}^n L(U, Vi ).
v + W = {v + w | w ∈ W }.
(i) u + W = v + W ⇔ u − v ∈ W ;
(ii) u + W 6= v + W ⇒ (u + W ) ∩ (v + W ) = ∅.
(u + W ) + (v + W ) = (u + v) + W
k(v + W ) = kv + W,
the operations defined above. The space V /W is called the quotient space of V
modulo W .
Proof. Exercise.
T is surjective. To show that T is 1-1, we will prove that ker T = {ker t} (recall
that ker t is the zero in V / ker t). Let v + ker t ∈ ker T . Then
Proof. Exercise.
Proof. Exercise.
Exercises
1.7.1. Let W be a subspace of a vector space V . Define a relation ∼ on V by
u∼v if u − v ∈ W.
Prove that ∼ is an equivalence relation on V and the equivalence classes are the
affine spaces of V .
(Diagram: T : V → W , the quotient maps p : V → V /A and q : W → W/B, and the induced map T̃ : V /A → W/B, with T̃ ◦ p = q ◦ T .)
V ∗ = L(V, F ) = Hom(V, F ).
Example 1.8.2.
Ta (x1 , . . . , xn ) = a · x = a1 x1 + · · · + an xn ,
(iii) For each a ∈ F , define Ea : F [x] → F by Ea (p) = p(a) for each p ∈ F [x].
Then Ea is a linear functional on F [x].

(iv) Define T : C([a, b]) → R by T (f ) = ∫_a^b f (x) dx for each f ∈ C([a, b]). Then
T is a linear functional on C([a, b]).

(v) For any square matrix, its trace is the sum of all elements in the main
diagonal of the matrix. Define tr : Mn (F ) → F as follows:

tr([aij ]) = ∑_{i=1}^n aii .
Proof. Let f ∈ V ∗ . We will show that f = ∑_{i=1}^n f (vi ) vi∗ . To see this, let
g = ∑_{i=1}^n f (vi ) vi∗ . Then for j = 1, . . . , n,

g(vj ) = ∑_{i=1}^n f (vi ) vi∗ (vj ) = ∑_{i=1}^n f (vi ) δij = f (vj ).
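For a concrete illustration, take V = R2 with the basis {v1 , v2 } = {(1, 1), (−1, 1)}. The dual basis {v1∗ , v2∗ } consists of the functionals

v1∗ (x, y) = (x + y)/2 and v2∗ (x, y) = (y − x)/2,

since these are the unique linear functionals satisfying vi∗ (vj ) = δij .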
g = k0 f0 + k1 f1 + · · · + km fm ,
Definition 1.8.4. Let V be a vector space. For any subset S of V , the annihilator
of S, denoted by S ◦ , is defined by
(ii) S ◦ is a subspace of V ∗ ;
T t (f ) = f ◦ T for any f ∈ W ∗ .
(ii) (IV )t = IV ∗ .
Hence T t : W ∗ → V ∗ is linear.
(ii) Note that
Hence (IV )t = IV ∗ .
(iii) Let S, T ∈ L(V, W ) and α, β ∈ F . Then for any f ∈ W ∗ ,
Hence (αS + βT )t = αS t + βT t .
(iv) Let S ∈ L(U, V ) and T ∈ L(V, W ). Then for any f ∈ W ∗ ,
(T S)t (f ) = f ◦ (T ◦ S) = S t (f ◦ T ) = S t (T t (f )).
Hence (T S)t = S t T t .
(v) Assume that T ∈ L(V, W ) is invertible. Then there is S ∈ L(W, V ) such that
ST = IV and T S = IW . Then
of B and C, respectively. Let A = [aij ] = [T ]B,C and B = [bij ] = [T t ]C∗ ,B∗ . Then
for j = 1, . . . , n,

T (vj ) = ∑_{k=1}^m akj wk

and for i = 1, . . . , m,

T t (wi∗ ) = ∑_{k=1}^n bki vk∗ .
If V is a vector space, its dual space V ∗ is also a vector space and hence we
can again define the dual space (V ∗ )∗ of V ∗ . In the sense that we will describe
below, the second dual V ∗∗ = (V ∗ )∗ is closely related to the original space V ;
in particular, when V is finite-dimensional, V and V ∗∗ are isomorphic via a
canonical (basis-free) linear isomorphism.
To establish the main result about the double dual space, the following propo-
sition will be useful.
Proof. Assume that v 6= 0. Then {v} is linearly independent and thus can be
extended to a basis B for V . Let t : B → F be defined by t(v) = 1 and t(x) = 0
for any x ∈ B − {v}. Extend t to a linear functional f on V . Hence f ∈ V ∗ and
f (v) 6= 0.
Theorem 1.8.12. Let V be a vector space over a field F . For each v ∈ V , define
v̂ : V ∗ → F by v̂(f ) = f (v) for any f ∈ V ∗ . Then
Hence V ≅ V ∗∗ , via the canonical map Φ, if V is finite-dimensional.

(v + w)ˆ(f ) = f (v + w) = f (v) + f (w) = v̂(f ) + ŵ(f ) = (v̂ + ŵ)(f ).

(αv)ˆ(f ) = f (αv) = αf (v) = α v̂(f ).
Exercises
1.8.1. Consider C as a vector space over R. Prove that the dual basis for {1, i} is
{Re, Im}, where Re and Im are the real part and the imaginary part, respectively,
of a complex number.
(i) (U + W )◦ = U ◦ ∩ W ◦ .
(ii) (U ∩ W )◦ = U ◦ + W ◦ .
(iii) If V = U ⊕ W , then V ∗ = U ◦ ⊕ W ◦ .
Prove that
(ii) ◦ M is a subspace of V .
1.8.9. Let f , g ∈ V ∗ be such that ker f ⊆ ker g. Prove that g = αf for some
α ∈ F.
(ii) im T t = (ker T )◦ .
Chapter 2

Multilinear Algebra

2.1 Free Vector Spaces
Recall that if V is a vector space over F with a basis B, then B has the following universal mapping property: given a vector space W and a function t : B → W , there exists a unique linear map T : V → W such that T ◦ iB = t, where iB : B ,→ V is the inclusion map.
We now define a free vector space on a non-empty set by the universal mapping
property.
A pair (V, i), where V is a vector space over F and i : X → V is a function, is called a free vector space on X if it has the following universal mapping property: given a vector space W and a function t : X → W , there exists a unique linear map T : V → W such that T ◦ i = t.
Hence if V is a vector space over F with a basis B, then (V, iB ) is a free vector
space on B, where iB : B ,→ V is the inclusion map.
If (V, i) is a free vector space on a non-empty set X, we will soon see that
i(X) forms a basis for V . Since i is injective, we can identify X with a subset
i(X) of V and simply say that V is a vector space containing X as a basis. The
term “free” means there is no relationship between the elements of X. The point
of view here is that, starting from an arbitrary set, we can construct a vector
space for which the given set is a basis.
Proposition 2.1.3. Let F be a field and X a non-empty set. Then there exists
a free vector space over F on X.
Proof. Define

FX = { f : X → F | f (x) = 0 for all but finitely many x ∈ X },

a vector space under the pointwise operations, and for each x ∈ X let δx ∈ FX be
defined by δx (x) = 1 and δx (y) = 0 for y ∈ X − {x}.
It follows that FX is a vector space over F containing {δx }x∈X as a basis. Let
iX : X → FX be defined by iX (x) = δx for each x ∈ X. It is readily checked that
the universal mapping property is satisfied. Hence (FX , iX ) is a free vector space
over F on X.
With a slight abuse of notation, we will identify the function δx with the
element x ∈ X itself. Then we can view FX as a vector space containing X as a
basis. A typical element in FX can be written as ∑_{i=1}^n αi xi , where n ∈ N, αi ∈ F and xi ∈ X.
In general, there are several ways to construct a free vector space on a non-
empty set. However, the universal mapping property will show that different
constructions of a free vector space on the same set are all isomorphic. Hence, a
free vector space is uniquely determined up to isomorphism.
Proof. Let (V1 , i1 ) and (V2 , i2 ) be free vector spaces on a non-empty set X. By
the universal mapping property of (V1 , i1 ) applied to the function i2 : X → V2 ,
there exists a unique linear map T1 : V1 → V2 such that T1 ◦ i1 = i2 . Similarly,
by the universal mapping property of (V2 , i2 ) applied to i1 : X → V1 , there exists
a unique linear map T2 : V2 → V1 such that T2 ◦ i2 = i1 . Then

(T2 ◦ T1 ) ◦ i1 = T2 ◦ i2 = i1 and (T1 ◦ T2 ) ◦ i2 = T1 ◦ i1 = i2 .

But IV1 is the unique linear map from V1 into V1 satisfying IV1 ◦ i1 = i1 , and
likewise IV2 is the unique linear map from V2 into V2 satisfying IV2 ◦ i2 = i2 .
Hence T2 ◦ T1 = IV1 and T1 ◦ T2 = IV2 . This shows that T1 and T2 are inverses of
each other. Hence T : V1 → V2 is a linear isomorphism such that T ◦ i1 = i2 .
Exercises
2.1.1. Let (V, i) be a free vector space on a non-empty set X. Given a vector
space U and a function j : X → U , show that (U, j) is a free vector space on X
if and only if there is a unique linear map f : U → V such that f ◦ j = i.
2.1.2. Let (V, i) be a free vector space on a non-empty set X. Prove directly
from the universal mapping property that i(X) spans V .
Hint: Let W be the span of i(X) and iW : W → V the inclusion map. Since
i(X) ⊆ W , the map i factors as i = iW ◦ ϕ for a function ϕ : X → W . Apply the
universal mapping property to this factorization to show that iW is surjective.
2.1.3. Let (V, i) be a free vector space on a non-empty set X. Prove that i(X)
is a basis for V .
Examples.
(1) Let V be a vector space. Then the dual pairing ω : V × V ∗ → F defined by
ω(v, f ) = f (v) for any v ∈ V and f ∈ V ∗ is a bilinear form.
(2) If V is an algebra, a multiplication · : V × V → V is a bilinear map.
(3) Let A be an n × n matrix over F . The map L : F n × F n → F defined by
det(r1 , . . . , rn ) = det A.
Proposition 2.2.2. Let V1 , . . . , Vn and W be vector spaces over F . Then the set
of multilinear maps Mul(V1 , . . . , Vn ; W ) is a vector space over F under addition
and scalar multiplication defined by
for any (v1 , v2 , . . . , vn ) ∈ V1 × · · · × Vn . The first equality above follows from the
linearity of T and the second one follows from the linearity in the second and
first variables, respectively.
From this proposition, we see that the theory of linear maps cannot be applied
to multilinear maps directly. However, we can transform a multilinear map into a
linear map on a certain vector space, apply the theory of linear algebra to this
induced linear map, and then transfer the information back to the original multilinear
map. In the process of doing so, we will construct a new vector space which is
very important in its own right. It is called a tensor product of vector spaces. We
begin by considering a tensor product of two vector spaces.
Let U and V be vector spaces over F . We would like to define a new vector
space U ⊗ V which is the “product” of U and V . (Note that the direct product
U × V is really the “sum” of U and V .) The space U ⊗ V will consist of formal
elements of the form
(α1 u1 ) ⊗ v1 + · · · + (αn un ) ⊗ vn .
u1 ⊗ v1 + · · · + un ⊗ vn . (2.2)
However, this representation is not unique. We can have different formal sums
(2.2) that represent the same element in U ⊗ V . This will be a problem when we
define a function on the tensor product U ⊗ V . To get around this problem, we
will introduce the universal mapping property of a tensor product. In fact, we
will define a tensor product U ⊗ V to be the universal object that turns a bilinear
map on U × V into a linear map on U ⊗ V . Any linear map on the tensor product
U ⊗ V will be defined through the universal mapping property.
A pair (X, b), where X is a vector space and b : U × V → X is a bilinear map, is called a tensor product of U and V if it has the following universal mapping property: given any vector space W and a bilinear map ϕ : U × V → W , there exists a unique linear map φ : X → W such that φ ◦ b = ϕ.
There are several ways to define a tensor product of vector spaces. If the
vector spaces are finite-dimensional, we can give an elementary construction. On
the other hand, one can construct a tensor product of modules, in which case a
construction of a tensor product of vector spaces is a special case. Here, we will
adopt a middle ground in which we construct a tensor product of two vector
spaces, not necessarily finite-dimensional.
Theorem 2.2.5. Let U and V be vector spaces. Then a tensor product of U and
V exists.
Proof. Let U and V be vector spaces over a field F . Let (FU ×V , i) denote the
free vector space on U × V . Here U × V is the Cartesian product of U and V
with no algebraic structure. Then

FU ×V = { ∑_{finite} αj (uj , vj ) | (uj , vj ) ∈ U × V and αj ∈ F }.
Let T be the subspace of FU ×V generated by all elements of the form

(u + u′ , v) − (u, v) − (u′ , v),   (u, v + v ′ ) − (u, v) − (u, v ′ ),
(αu, v) − α(u, v),   (u, αv) − α(u, v),

where u, u′ ∈ U , v, v ′ ∈ V and α ∈ F . Let π : FU ×V → FU ×V /T be the quotient
map and set b = π ◦ i : U × V → FU ×V /T .
Now let W be a vector space and ϕ : U × V → W a bilinear map. By the
universal mapping property of the free vector space (FU ×V , i), there is a unique
linear map ψ : FU ×V → W such that ψ ◦ i = ϕ. Since ϕ is bilinear, ψ
sends each of the vectors which generate T to zero, so T ⊆ ker ψ. Hence by the
universal mapping property of the quotient space, there exists a unique linear
map φ : FU ×V /T → W such that φ ◦ π = ψ. Hence

φ ◦ b = φ ◦ π ◦ i = ψ ◦ i = ϕ.
Proof. The proof here is the same as the proof of uniqueness of a free vector
space on a non-empty set (Theorem 2.1.4). We repeat it here for the sake of
completeness. Let (X1 , b1 ) and (X2 , b2 ) be tensor products of U and V . Note
that b1 and b2 are bilinear maps from U × V into X1 and X2 , respectively. By
the universal mapping property of (X1 , b1 ), there exists a unique linear map
F1 : X1 → X2 such that F1 ◦ b1 = b2 . Similarly, there exists a unique linear map
F2 : X2 → X1 such that F2 ◦ b2 = b1 .
(Diagram: the bilinear maps b1 : U × V → X1 and b2 : U × V → X2 , together with the induced linear maps F1 : X1 → X2 and F2 : X2 → X1 .)
Hence F2 ◦ F1 ◦ b1 = b1 . But then IX1 is the unique linear map from X1 into X1
such that IX1 ◦ b1 = b1 . Thus F2 ◦ F1 = IX1 . Similarly, F1 ◦ F2 = IX2 .
Given any vector space W and a bilinear map ϕ : U × V → W , there exists a unique linear map φ : U ⊗ V → W such that φ ◦ b = ϕ, where b : U × V → U ⊗ V is the associated bilinear map, written b(u, v) = u ⊗ v.
Theorem 2.2.8. Let U and V be vector spaces. If B and C are bases for U and
V , respectively, then {u ⊗ v | u ∈ B, v ∈ C} is a basis for U ⊗ V .
For k = 1, . . . , n, define ϕk : B → F by ϕk (u) = 1 if u = uk , and ϕk (u) = 0 otherwise.
= ak` .
Φ(y + z) = y (y ∈ Y , z ∈ Z).
U ⊗ V ≠ {u ⊗ v | u ∈ U, v ∈ V }.

But

U ⊗ V = span{u ⊗ v | u ∈ U, v ∈ V }
      = { ∑_{i=1}^n ui ⊗ vi | n ∈ N, ui ∈ U, vi ∈ V }.
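For instance, a standard example of an element that is not a simple tensor: in F 2 ⊗ F 2 with the standard basis {e1 , e2 }, the element e1 ⊗ e1 + e2 ⊗ e2 cannot be written as u ⊗ v. Indeed, writing u = a e1 + b e2 and v = c e1 + d e2 gives

u ⊗ v = ac (e1 ⊗ e1 ) + ad (e1 ⊗ e2 ) + bc (e2 ⊗ e1 ) + bd (e2 ⊗ e2 ),

and by Theorem 2.2.8 this would require ac = bd = 1 and ad = bc = 0, which is impossible.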
F ⊗ V ≅ V ≅ V ⊗ F.
Define ϕ : F × V → V by ϕ(k, v) = kv for any k ∈ F and v ∈ V .
It is easy to see that ϕ is a bilinear map. Then there is a unique linear map
Φ : F ⊗ V → V such that Φ ◦ b = ϕ. In particular, Φ(k ⊗ v) = kv for any k ∈ F
and v ∈ V . Now define Ψ : V → F ⊗ V by Ψ(v) = 1 ⊗ v for any v ∈ V . By
Proposition 2.2.7, Ψ is linear. Moreover, Φ ◦ Ψ = IV . To see that Ψ ◦ Φ = IF ⊗V ,
consider, for any k ∈ F and v ∈ V ,
U ⊗ V ≅ V ⊗ U.
ϕ(u, v) = b2 (v, u) = v ⊗ u.
Similarly, define ψ : V × U → U ⊗ V by
ψ(v, u) = b1 (u, v) = u ⊗ v.
Then ϕ and ψ are bilinear maps and hence there exists a unique pair of linear
maps Φ : U ⊗ V → V ⊗ U and Ψ : V ⊗ U → U ⊗ V such that Φ ◦ b1 = ϕ and
Ψ ◦ b2 = ψ. Note that
(U ⊗ V ) ⊗ W ≅ U ⊗ (V ⊗ W ).
By Proposition 2.2.7, we see that ϕw is bilinear. Then there exists a unique linear
map φw : U ⊗ V → U ⊗ (V ⊗ W ) such that
It is easy to see that φw+w0 = φw + φw0 and φkw = kφw for any w, w0 ∈ W and
k ∈ F . Define a bilinear map φ : (U ⊗ V ) × W → U ⊗ (V ⊗ W ) by
U ⊗ (V ⊕ W ) ≅ (U ⊗ V ) ⊕ (U ⊗ W ).
Proof. Define ϕ : U × (V ⊕ W ) → (U ⊗ V ) ⊕ (U ⊗ W ) by
Then f1 and f2 are bilinear maps and hence there exist linear maps ψ1 : U ⊗ V →
U ⊗ (V ⊕ W ) and ψ2 : U ⊗ W → U ⊗ (V ⊕ W ) such that
Now, define Ψ : (U ⊗ V ) ⊕ (U ⊗ W ) → U ⊗ (V ⊕ W ) by
Hence Φ ◦ ψ1 (x) = (x, 0) for any x ∈ U ⊗ V . Similarly, Φ ◦ ψ2 (y) = (0, y) for any
y ∈ U ⊗ W . Hence for any x ∈ U ⊗ V and y ∈ U ⊗ W ,
(Diagram: the canonical multilinear map t : V1 × · · · × Vn → V1 ⊗ · · · ⊗ Vn , a multilinear map ϕ : V1 × · · · × Vn → W , and the induced linear map φ : V1 ⊗ · · · ⊗ Vn → W with φ ◦ t = ϕ.)
We can show that a tensor product V1 ⊗ · · · ⊗ Vn exists and is unique up to
isomorphism. The proof is similar to the case n = 2 and will only be sketched
here. The uniqueness part is routine. For the existence, consider the free vector
space FV1 ×···×Vn on V1 ×· · ·×Vn modulo the subspace T generated by the elements
of the form
Theorem 2.2.16. Let V1 , . . . , Vn be vector spaces over the same field F . For
k = 1, . . . , n, there is a unique linear isomorphism
Φk : ( ⊗_{i=1}^k Vi ) ⊗ ( ⊗_{i=k+1}^n Vi ) → ⊗_{i=1}^n Vi
Proof. Exercise.
Exercises
Bil(U, V ; W ) ≅ L(U ⊗ V, W ),
The unique linear map Ψ is called the tensor product of S and T , denoted by
S ⊗ T . Hence
(i) S ⊗ (T + T 0 ) = S ⊗ T + S ⊗ T 0 ;
(ii) (S + S 0 ) ⊗ T = S ⊗ T + S 0 ⊗ T ;
2.3 Determinants
In this section, we will define the determinant function. Here, we do not need
the fact that F is a field. It suffices to assume that F is a commutative ring
with identity. However, we will develop the theory on vector spaces over a field
as before, but keep in mind that what we are doing here works in a more general
situation where vector spaces are replaced by modules over a commutative ring
with identity.
and skew-symmetric if

f (vσ(1) , . . . , vσ(n) ) = (sgn σ) f (v1 , . . . , vn ) (2.8)

for any v1 , . . . , vn ∈ V and any σ ∈ Sn .
Proof. Assume that f is alternating. First, let us consider the case n = 2. Note
that for any u, v ∈ V ,

0 = f (u + v, u + v) = f (u, u) + f (u, v) + f (v, u) + f (v, v) = f (u, v) + f (v, u).

Hence f (u, v) = −f (v, u) for any u, v ∈ V . This argument can be generalized to
arbitrary n: for any v1 , . . . , vn ∈ V ,
f (v1 , . . . , vi , . . . , vj , . . . , vn ) = −f (v1 , . . . , vj , . . . , vi , . . . , vn ).
This shows that (2.8) holds for a transposition σ = (i j) ∈ Sn and hence holds
for any σ ∈ Sn .
On the other hand, assume that f is skew-symmetric. Let (v1 , . . . , vn ) ∈ V n
and vi = vj for some i 6= j. Let σ be the transposition (i j). Then sgn σ = −1
and thus
f (v1 , . . . , vi , . . . , vj , . . . , vn ) = −f (v1 , . . . , vj , . . . , vi , . . . , vn )
= −f (v1 , . . . , vi , . . . , vj , . . . , vn ),
Next, we will consider a multilinear map on the vector space F n over the field
F . We can view an element in (F n )n as an n × n matrix whose i-th row is the
i-th component in (F n )n .
By multilinearity,

f (X1 , . . . , Xn ) = f ( ∑_{j1=1}^n a1j1 ej1 , . . . , ∑_{jn=1}^n anjn ejn )
                   = ∑_{j1=1}^n · · · ∑_{jn=1}^n a1j1 . . . anjn f (ej1 , . . . , ejn ).
Since f is alternating, f (ej1 , . . . , ejn ) = 0 unless ej1 , . . . , ejn are all distinct; that
is, unless the set {j1 , . . . , jn } is {1, . . . , n} in some order. Hence the sum above reduces
to a sum of n! terms over all the permutations in Sn :

f (X1 , . . . , Xn ) = ∑_{σ∈Sn} a1σ(1) . . . anσ(n) f (eσ(1) , . . . , eσ(n) )
                   = ∑_{σ∈Sn} (sgn σ) a1σ(1) . . . anσ(n) f (e1 , . . . , en )
                   = r ∑_{σ∈Sn} (sgn σ) a1σ(1) . . . anσ(n) .
where each Xi = (ai1 , . . . , ain ) and verify that it satisfies the desired property.
To see that f is multilinear, we will show that f is linear in the first coordinate.
For the other coordinates, the proof is similar. Assume that X1′ = (b11 , . . . , b1n ).
Then

f (αX1 + βX1′ , . . . , Xn )
   = r ∑_{σ∈Sn} (sgn σ) [αa1σ(1) + βb1σ(1) ] . . . anσ(n)
   = α r ∑_{σ∈Sn} (sgn σ) a1σ(1) . . . anσ(n) + β r ∑_{σ∈Sn} (sgn σ) b1σ(1) . . . anσ(n)
   = αf (X1 , . . . , Xn ) + βf (X1′ , . . . , Xn ).
Let σ ∈ An and τ = σ(j k). If i ∉ {j, k}, then τ (i) = σ(j k)(i) = σ(i) and thus
aiτ (i) = aiσ(i) . Moreover, τ (j) = σ(j k)(j) = σ(k) implies ajτ (j) = ajσ(k) = akσ(k)
since Xj = Xk . Similarly, akτ (k) = akσ(j) = ajσ(j) . This shows that for any
σ ∈ An and τ = σ(j k),
Thus each term in the first sum in (2.11) will cancel out with the corresponding
term in the second sum so that the total sum is zero. Hence f is alternating.
Remark. By Theorem 2.3.3 and Definition 2.3.4, it follows that any alternat-
ing multilinear function f : Mn (F ) → F is a scalar multiple of the determinant
function.
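For instance, for n = 2 the permutation sum has 2! = 2 terms, coming from the identity and the transposition (1 2), so for A = [aij ] ∈ M2 (F ),

det A = a11 a22 − a12 a21 ,

and for n = 3 the six permutations in S3 give the usual expansion with three positive and three negative products.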
Proof. Let A = [aij ] and At = [bij ] where bij = aji . Note that if σ ∈ Sn
is such that σ(i) = j, then i = σ −1 (j) and thus aσ(i)i = ajσ−1 (j) . Moreover,
sgn σ = sgn σ −1 for any σ ∈ Sn . Hence
det(At ) = ∑_{σ∈Sn} (sgn σ) b1σ(1) . . . bnσ(n)
         = ∑_{σ∈Sn} (sgn σ) aσ(1)1 . . . aσ(n)n
         = ∑_{σ∈Sn} (sgn σ −1 ) a1σ−1(1) . . . anσ−1(n) .
Since the last sum is taken over all permutations in Sn , it must equal det A.
det T = det([T ]B ),
[T ]B′ = P [T ]B P −1 ,
Proof. Let [S] and [T ] be the matrix representations of S and T (with respect to
a certain ordered basis), respectively. Then [ST ] = [S][T ]. Hence
Theorem 2.4.1. Let V be a vector space over a field F and k a positive integer.
Then there exists a vector space X over F , together with a k-linear alternating
map a : V k → X satisfying the universal mapping property: given a vector space
W and a k-linear alternating map ϕ : V k → W , there exists a unique linear map
φ : X → W such that φ ◦ a = ϕ.
(Diagram: a : V k → X, a k-linear alternating map ϕ : V k → W , and the induced linear map φ : X → W with φ ◦ a = ϕ.)
Moreover, the pair (X, a) satisfying the universal mapping property above is
unique up to isomorphism.
a(v1 , . . . , vk ) = v1 ⊗ · · · ⊗ vk + T.
(Diagram: f : V k → V ⊗k , the quotient map π : V ⊗k → V ⊗k /T , a = π ◦ f , a k-linear alternating map ϕ : V k → W , and the induced linear map φ : V ⊗k /T → W .)
because ϕ is alternating. This shows that ϕ sends the elements that generate
T to zero. Hence T ⊆ ker ϕ. Then by the universal mapping property of the
quotient space, there is a unique linear map φ : V ⊗k /T → W such that φ ◦ π = ϕ.
Hence
φ ◦ a = φ ◦ π ◦ f = ϕ ◦ f = ϕ.
Definition 2.4.2. The vector space X in Theorem 2.4.1 is called the k-th exterior
power of V and is denoted by ⋀^k V . Hence ⋀^k V is a vector space together with a
k-linear alternating map a : V k → ⋀^k V satisfying the universal mapping property:
given any vector space W and a k-linear alternating map ϕ : V k → W , there is a
unique linear map φ : ⋀^k V → W such that φ ◦ a = ϕ.
Proof. The first two properties follow from the multilinearity of a. The last two
properties follow from the fact that a is alternating and skew-symmetric.
By the universal mapping property for the tensor product, there is a unique linear map
π : V ⊗k → ⋀^k V such that

π(x1 ⊗ · · · ⊗ xk ) = x1 ∧ · · · ∧ xk
Fβ (vα ) = δαβ for all α, β ∈ I. Let B∗ = {f1 , . . . , fn } be the dual basis of B for
V ∗ . Then fj (vi ) = δij for i, j = 1, . . . , n. Define fα : V k → F by
fα (x1 , . . . , xk ) = ∑_{σ∈Sk} (sgn σ) f_{iσ(1)} (x1 ) . . . f_{iσ(k)} (xk ).
Then fα is a k-linear alternating map. The proof of this fact is similar to the
existence part of the proof of Theorem 2.3.3 and is omitted here. By the universal
mapping property, there is a unique linear map Fα : ⋀^k V → F such that Fα ◦ a = fα .
for any v1 , . . . , vn ∈ V .
T (vi ) = ∑_{j=1}^n aij vj .
Note that the obtained matrix A = [aij ] is the transpose of the matrix represen-
tation of T . But then the determinant of a matrix is equal to the determinant of
its transpose, and thus det T = det A. Now, let us consider
T (v1 ) ∧ · · · ∧ T (vn ) = ( ∑_{j1=1}^n a1j1 vj1 ) ∧ · · · ∧ ( ∑_{jn=1}^n anjn vjn )
   = ∑_{j1=1}^n · · · ∑_{jn=1}^n a1j1 . . . anjn (vj1 ∧ · · · ∧ vjn )
   = ∑_{σ∈Sn} a1σ(1) . . . anσ(n) (vσ(1) ∧ · · · ∧ vσ(n) )
   = ∑_{σ∈Sn} (sgn σ) a1σ(1) . . . anσ(n) (v1 ∧ · · · ∧ vn )
   = (det T )(v1 ∧ · · · ∧ vn ).
Exercises
2.4.1. Let V be a vector space and v1 , . . . , vk ∈ V . If {v1 , . . . , vk } is linearly
dependent, show that v1 ∧ · · · ∧ vk = 0. In particular, if dim V = n and k > n,
then v1 ∧ · · · ∧ vk = 0.
for any v1 , . . . , vn ∈ V . Moreover, det T is the only scalar satisfying the above
equality for any v1 , . . . , vn ∈ V .
2.4.5. Let V be a vector space over a field F and k a positive integer. Show
that there exists a vector space X over F , together with a symmetric k-linear
map s : V k → X satisfying the universal mapping property: given a vector space
W and a symmetric k-linear map ϕ : V k → W , there exists a unique linear map
φ : X → W such that φ ◦ s = ϕ. Moreover, show that the pair (X, s) satisfying
the universal mapping property above is unique up to isomorphism.
The pair (X, s) satisfying the above universal mapping property for symmetric
k-linear maps is called the k-th symmetric product of V , denoted by S k (V ).
Chapter 3

Canonical Forms
3.1 Polynomials
Definition 3.1.1. A polynomial f (x) ∈ F [x] is said to be monic if the coefficient
of the highest degree term of f (x) is 1. A polynomial f (x) ∈ F [x] is said to be
constant if f (x) = c for some c ∈ F . Equivalently, f (x) is constant if f (x) = 0
or deg f (x) = 0.
Definition 3.1.2. Let f (x), g(x) ∈ F [x], with g(x) 6= 0. We say that g(x)
divides f (x), denoted by g(x) | f (x), if there is a polynomial q(x) ∈ F [x] such
that f (x) = q(x)g(x).
Theorem 3.1.3 (Division Algorithm). Let f (x), g(x) ∈ F [x], with g(x) 6= 0.
Then there exist unique polynomials q(x) and r(x) in F [x] such that
Proof. First, we will show the existence part. If f (x) = 0, take q(x) = 0 and
r(x) = 0. If f (x) 6= 0 and deg f (x) < deg g(x), take q(x) = 0 and r(x) = f (x).
Assume that deg f (x) ≥ deg g(x). We will prove the theorem by induction on
deg f (x). If deg f (x) = 0, then deg g(x) = 0, i.e., f (x) = a and g(x) = b for some
a, b ∈ F − {0}. Then f (x) = ab^{−1} g(x) + 0, with q(x) = ab^{−1} and r(x) = 0. Next,
let f (x), g(x) ∈ F [x] with deg f (x) = n > 0 and deg g(x) = m ≤ n. Assume that
the statement holds for any polynomial of degree less than n. Let an and bm be the
leading coefficients of f (x) and g(x), respectively, and write

h(x) = f (x) − an bm^{−1} x^{n−m} g(x). (1)

Then either h(x) = 0 or deg h(x) < n. If h(x) = 0, take q(x) = an bm^{−1} x^{n−m} and
r(x) = 0. If deg h(x) < n, by the induction hypothesis, there exist q′(x) and r′(x)
in F [x] such that

h(x) = q′(x) g(x) + r′(x), (2)

where either r′(x) = 0 or deg r′(x) < deg g(x). Combining (1) and (2) together,
we have that f (x) = (an bm^{−1} x^{n−m} + q′(x)) g(x) + r′(x), as desired.
where qi (x), ri (x) ∈ F [x] and ri (x) = 0 or deg ri (x) < deg g(x), for i = 1, 2. Then
(q1 (x) − q2 (x))g(x) = r2 (x) − r1 (x). If r2 (x) − r1 (x) ≠ 0, then q1 (x) − q2 (x) ≠ 0,
which implies

deg g(x) ≤ deg((q1 (x) − q2 (x))g(x)) = deg(r2 (x) − r1 (x)) < deg g(x),

a contradiction. Hence r1 (x) = r2 (x), and consequently q1 (x) = q2 (x) since g(x) ≠ 0.
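As a quick illustration of the Division Algorithm: dividing f (x) = x3 + x + 1 by g(x) = x2 + 1 in Q[x] gives

x3 + x + 1 = x · (x2 + 1) + 1,

so q(x) = x and r(x) = 1, with deg r(x) = 0 < 2 = deg g(x).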
Corollary 3.1.4. Let p(x) ∈ F [x] and α ∈ F . Then p(α) = 0 if and only if
p(x) = (x − α)q(x) for some q(x) ∈ F [x].
Proof. Assume that p(α) = 0. By the Division Algorithm, there exist q(x), r(x)
in F [x] such that p(x) = (x − α)q(x) + r(x) where deg r(x) < 1 or r(x) = 0, i.e.
r(x) is constant. By the assumption, we have r(α) = p(α) = 0, which implies
r(x) = 0. So p(x) = (x − α)q(x). The converse is obvious.
Definition 3.1.5. Let p(x) ∈ F [x] and α ∈ F . We say that α is a root or a zero
of p(x) if p(α) = 0. Hence the above corollary says that α is a root of p(x) if and
only if x − α is a factor of p(x).
Definition 3.1.7. Let f1 (x), . . . , fn (x) ∈ F [x]. A monic polynomial g(x) ∈ F [x]
is said to be the greatest common divisor of f1 (x), . . . , fn (x) if it satisfies these
two properties:
(ii) for any h(x) ∈ F [x], if h(x) | fi (x) for i = 1, . . . , n, then h(x) | g(x).
Definition 3.1.8. Let f1 (x), . . . , fn (x) ∈ F [x]. A monic polynomial g(x) ∈ F [x]
is said to be the least common multiple of f1 (x), . . . , fn (x) if it satisfies these
two properties:
(ii) for any h(x) ∈ F [x], if fi (x) | h(x) for i = 1, . . . , n, then g(x) | h(x).
Proposition 3.1.9. Let f1 (x), . . . , fn (x) be nonzero polynomials in F [x] and let
Proof. Let
P = { ∑_{i=1}^n pi (x)fi (x) | pi (x) ∈ F [x], i = 1, . . . , n }.
Algorithm, there exist a(x), r(x) ∈ F [x] such that f1 (x) = a(x)d(x) + r(x), where
r(x) = 0 or deg r(x) < deg d(x). It follows that r(x) ∈ P. Indeed,
But then d(x) is an element in P with the smallest degree. Hence r(x) = 0, which
implies d(x) | f1 (x). Similarly, we have d(x) | fi (x) for i = 1, . . . , n. It follows
that d(x) | g(x). Thus there exists σ(x) ∈ F [x] such that
n
X
g(x) = σ(x)d(x) = σ(x)pi (x)fi (x).
i=1
Definition 3.1.10. Two polynomials f (x) and g(x) in F [x] are said to be rela-
tively prime if gcd(f (x), g(x)) = 1.
Corollary 3.1.11. Two polynomials f (x) and g(x) in F [x] are relatively prime
if and only if there exist p(x), q(x) ∈ F [x] such that p(x)f (x) + q(x)g(x) = 1.
Proof. If gcd(f (x), g(x)) = 1, the implication follows from Proposition 3.1.9. Con-
versely, suppose there exist p(x), q(x) ∈ F [x] such that p(x)f (x) + q(x)g(x) = 1.
Let d(x) = gcd(f (x), g(x)). Then d(x) | f (x) and d(x) | g(x). It follows easily
that d(x) | p(x)f (x) + q(x)g(x). Hence d(x) | 1, which implies d(x) = 1.
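As a computational aside (not part of the original notes), the polynomials p(x) and q(x) in Corollary 3.1.11 can be found with the extended Euclidean algorithm. The sketch below uses SymPy and assumes its gcdex(f, g, x) returns a triple (p, q, d) with p·f + q·g = d = gcd(f, g).

import sympy as sp

x = sp.symbols('x')
f = x**3 - 1                  # (x - 1)(x^2 + x + 1)
g = x**2 - 1                  # (x - 1)(x + 1)
p, q, d = sp.gcdex(f, g, x)   # p*f + q*g = d = gcd(f, g)
assert sp.expand(p*f + q*g - d) == 0
print(d)                      # x - 1, so f and g are not relatively prime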
Definition 3.1.12. Let p(x), q(x), r(x) ∈ F [x] with r(x) ≠ 0. We say that p(x)
is congruent to q(x) modulo r(x), denoted by p(x) ≡ q(x) mod r(x), if r(x) | (p(x) − q(x)).
Then σj (x) and ϕj (x) are relatively prime for each j. By Proposition 3.1.9, for
each i, there exist pi (x), qi (x) ∈ F [x] such that 1 = pi (x)ϕi (x) + qi (x)σi (x). Let
ri (x)pi (x)ϕi (x) = ri (x) − ri (x)qi (x)σi (x) ≡ ri (x) mod σi (x)
and
rj (x)pj (x)ϕj (x) ≡ 0 mod σi (x) for j 6= i.
Note that the notion of irreducibility depends on the field F . For example,
f (x) = x^2 + 1 is irreducible over R, but not irreducible over C because
x^2 + 1 = (x + i)(x − i) over C.
Lemma 3.1.15. Let f (x), g(x) and h(x) be polynomials in F [x] where f (x) is
irreducible. If f (x) | g(x)h(x), then either f (x) | g(x) or f (x) | h(x).
Proof. Assume that f (x) is irreducible, f (x) | g(x)h(x), but f (x) ∤ g(x). We will
show that gcd(f (x), g(x)) = 1. Let d(x) = gcd(f (x), g(x)). Since d(x) | f (x),
we can write f (x) = d(x)k(x) for some k(x) ∈ F [x]. By irreducibility of f (x),
d(x) or k(x) is a constant. If k(x) = k is a constant, then d(x) = k^{-1} f (x), which
implies that f (x) | g(x), a contradiction. Hence, d(x) is a constant, i.e. d(x) = 1.
By Proposition 3.1.9, we have 1 = p(x)f (x)+q(x)g(x) for some p(x), q(x) ∈ F [x].
Thus h(x) = p(x)f (x)h(x)+q(x)g(x)h(x). Since f (x) divides g(x)h(x), it divides
the term on the right-hand side of this equation and hence f (x) | h(x).
Theorem 3.1.16 (Unique Factorization). Every nonconstant polynomial in F [x]
can be written as a product of irreducible polynomials in F [x]. Moreover, this
factorization is unique up to the order of the factors and multiplication by nonzero
constants; that is, if
f (x) = g1 (x) · · · gn (x) = h1 (x) · · · hm (x),
where the gi (x) and hj (x) are irreducible,
then n = m and we can renumber the indices so that gi (x) = αi hi (x) for some
αi ∈ F for i = 1, . . . , n.
Proof. We will prove the theorem by induction on deg f (x). Obviously, any
polynomial of degree 1 is irreducible. Let n > 1 be an integer and assume that
any polynomial of degree less than n can be written as a product of irreducible
polynomials. Let f (x) ∈ F [x] with deg f (x) = n. If f (x) is irreducible, we are
done. Otherwise, we can write f (x) = g(x)h(x) for some g(x), h(x) ∈ F [x], where
deg g(x) < n and deg h(x) < n. By the induction hypothesis, both g(x) and h(x)
can be written as products of irreducible polynomials, and hence so can f (x).
Next, we will prove uniqueness of the factorization, again by induction
on deg f (x). This is clear in case deg f (x) = 1. Now let f (x) ∈ F [x] with
deg f (x) > 1, assume that the uniqueness holds for polynomials of smaller degree,
and suppose that
f (x) = g1 (x) · · · gn (x) = h1 (x) · · · hm (x),
where gi (x) and hj (x) are all irreducible. Hence g1 (x) | h1 (x) · · · hm (x). It
follows easily by a generalization of Lemma 3.1.15 that g1 (x) | hi (x) for some
i = 1, . . . , m. By renumbering the irreducible factors in the second factorization
if necessary, we may assume that i = 1. Since g1 (x) and h1 (x) are irreducible,
g1 (x) = α1 h1 (x) for some α1 ∈ F . Thus
α1 g2 (x) · · · gn (x) = h2 (x) · · · hm (x).
Note that the polynomial above has degree less than deg f (x). Hence, by the induction
hypothesis, m = n and, after a further renumbering, gj (x) = αj hj (x) for some αj ∈ F
for each j = 2, . . . , n.
This finishes the induction and the proof of the theorem.
Theorem. For a field F , the following statements are equivalent:
(i) F is algebraically closed;
(ii) every nonconstant polynomial in F [x] splits over F , i.e., it can be written as a product of linear factors;
(iii) every irreducible polynomial in F [x] has degree 1.
Proof. (i) ⇒ (ii). Let p(x) ∈ F [x] − F and n = deg p(x). We will prove by
induction on n. If n = 1, then we are done. Assume that n > 1 and every non-
constant polynomial of degree n − 1 in F [x] splits over F . Since F is algebraically
closed, p(x) has a root α ∈ F . By Corollary 3.1.4, p(x) = (x − α)q(x) for some
q(x) ∈ F [x]. Then deg q(x) = n−1, and hence q(x) splits over F by the induction
hypothesis. Thus p(x) also splits over F .
(ii) ⇒ (iii). Let q(x) be an irreducible polynomial in F [x]. Then deg q(x) ≥ 1
and hence q(x) splits over F by the assumption. If deg q(x) > 1, then any linear
factor of q(x) is its nonconstant proper factor, contradicting irreducibility of q(x).
Thus deg q(x) = 1.
(iii) ⇒ (i). Let f (x) be a nonconstant polynomial over F . By Theorem 3.1.16,
f (x) can be written as a product of linear factors. Hence there exists α ∈ F such
that (x − α) | f (x), i.e. α is a root of f (x), by Corollary 3.1.4. This shows that
F is algebraically closed.
3.2 Diagonalization
Throughout this chapter, V will be a finite-dimensional vector space over a field
F and T : V → V a linear operator on V .
[U −1 T U ]B = [U −1 ]B [T ]B [U ]B = P −1 [T ]B P = A.
Let C = {U (v1 ), . . . , U (vn )}. Since B is an ordered basis for V and U is a linear
isomorphism, we see that C is a basis for V and [T ]C = A by (3.1).
Proof. Exercise.
Proposition 3.2.5. Let V be a vector space over a field F with dim V = n and
T : V → V a linear operator. Then T is diagonalizable if and only if there is a
basis B = {v1 , . . . , vn } for V and scalars λ1 , . . . , λn ∈ F , not necessarily distinct,
such that
T vj = λj vj for j = 1, . . . , n.
T vj = λj vj for j = 1, . . . , n.
Proof. This follows immediately from Proposition 3.2.5 and Corollary 3.2.4.
α1 λ1 v1 + α2 λ2 v2 + · · · + αk λk vk = 0. (3.4)
α1 λk v1 + α2 λk v2 + · · · + αk λk vk = 0. (3.5)
Notice that we use the assumption that V is finite-dimensional in the fourth
equivalence.
Remark. Note that the matrix xIn −A is in Mn (F [x]) with each entry in xIn −A
being a polynomial in F [x]. In this case, F [x] is a ring but not a field. We can
extend the definition of the determinant of a matrix over a field to that of a
matrix over a ring. However, we cannot define the characteristic polynomial of
a linear operator T to be det(xIV − T ) because xIV − T is not a linear operator
on a vector space V . We define its characteristic polynomial using its matrix
representation instead.
Example. Define T : R2 → R2 by
Solution. Let B = {(1, 0), (0, 1)} be the standard ordered basis for R2 . Let
A = [T ]_B = \begin{pmatrix} 1 & 4 \\ 3 & 2 \end{pmatrix}.
Example. Define T : R2 → R2 by
T (x, y) = (x + y, y).
Example. Define T : R2 → R2 by
Hence
[Ap1 . . . Apn ] = [λ1 p1 . . . λn pn ].
Example. Let
A = \begin{pmatrix} 2 & 0 & -2 \\ 0 & 1 & 0 \\ -2 & 0 & 5 \end{pmatrix}.
Find an invertible matrix P and a diagonal matrix D such that P^{-1}AP = D.
Solution.
χA (x) = \det \begin{pmatrix} x-2 & 0 & 2 \\ 0 & x-1 & 0 \\ 2 & 0 & x-5 \end{pmatrix} = (x − 1)^2 (x − 6).
Hence the eigenvalues of A are 1, 1 and 6. Solving (A − I)v = 0 and (A − 6I)v = 0,
we may take, for instance, the eigenvectors (2, 0, 1) and (0, 1, 0) for the eigenvalue 1
and (1, 0, −2) for the eigenvalue 6. If
P = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & -2 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 6 \end{pmatrix},
then we have P^{-1}AP = D.
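A quick numerical check of this example (an added illustration using NumPy, not part of the original notes):

import numpy as np

A = np.array([[ 2., 0., -2.],
              [ 0., 1.,  0.],
              [-2., 0.,  5.]])
P = np.array([[2., 0.,  1.],
              [0., 1.,  0.],
              [1., 0., -2.]])          # columns are eigenvectors for 1, 1, 6
D = np.diag([1., 1., 6.])
print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # True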
Proof. We outline the calculations and leave the details to the reader. Note that
\begin{pmatrix} B & C \\ O & D \end{pmatrix} = \begin{pmatrix} I_m & C \\ O & D \end{pmatrix} \begin{pmatrix} B & O \\ O & I_n \end{pmatrix},
where the zero matrices are of the suitable sizes. It is easy to verify that
\det \begin{pmatrix} I_m & C \\ O & D \end{pmatrix} = \det D \quad\text{and}\quad \det \begin{pmatrix} B & O \\ O & I_n \end{pmatrix} = \det B.
mi ≤ di ≤ ni for i = 1, . . . , k.
Hence
n = Σ_{i=1}^{k} mi ≤ Σ_{i=1}^{k} di ≤ Σ_{i=1}^{k} ni = n.
Exercises
In these exercises, let V be a finite-dimensional vector space over a field F and
T : V → V a linear operator.
3.2.3. Let S and T be linear operators on V . Show that ST and T S have the
same set of eigenvalues.
Hint: Separate the cases whether 0 is an eigenvalue.
3.2.4. If A and B are similar square matrices, show that χA = χB . Hence similar
matrices have the same set of eigenvalues.
3.2.7. Let λ ∈ F and suppose there is a non-zero v ∈ V such that T (v) = λv.
Prove that there is a non-zero linear functional f ∈ V ∗ such that T t (f ) = λf . In
other words, if λ is an eigenvalue of T , then it is an eigenvalue of T t .
3.3 Minimal Polynomial
For a polynomial p(x) = a0 + a1 x + · · · + an x^n ∈ F [x] and a linear operator T on V , define
p(T ) = a0 I + a1 T + · · · + an T^n .
In other words, the map p(x) 7→ p(T ) is an algebra homomorphism from the
polynomial algebra F [x] into the algebra of linear operators L(V ). Note that any
two polynomials in T commute:
p(T )q(T ) = (pq)(T ) = (qp)(T ) = q(T )p(T ).
Similarly, for an n × n matrix A over F , define
p(A) = a0 In + a1 A + · · · + an A^n .
Then p(A) is an n × n matrix over F and the map p(x) 7→ p(A) is an algebra ho-
momorphism from the polynomial algebra F [x] into the algebra of n × n matrices
Mn (F ).
Proof. We will prove only the first part of the theorem. By Lemma 3.3.2, there is
a polynomial p(x) such that p(T ) = 0. By the Well-Ordering Principle, let m(x)
be a polynomial over F of smallest degree such that m(T ) = 0. By dividing all
the coefficients by the leading coefficient, we can choose m(x) to be monic. Now
let f (x) ∈ F [x] be a polynomial such that f (T ) = 0. By the Division Algorithm
for polynomials (Theorem 3.1.3), there exist q(x), r(x) ∈ F [x] such that
f (x) = q(x)m(x) + r(x), where r(x) = 0 or deg r(x) < deg m(x).
Since f (T ) = m(T ) = 0, it follows that r(T ) = 0. But m(x) is a nonzero polynomial
of smallest degree such that m(T ) = 0, so we must have r(x) = 0 and
f (x) = q(x)m(x). Thus m(x) | f (x).
Now, let m(x) and m′(x) be monic polynomials of smallest degree such that
m(T ) = m′(T ) = 0. By the argument above, m(x) | m′(x) and m′(x) | m(x).
This implies that m′(x) = c m(x) for some c ∈ F . Since m(x) and m′(x) are
monic, we see that c = 1 and that m(x) = m′(x).
where deg r(x) < deg(x − λ) or r(x) = 0, i.e., r(x) = r is a constant. Thus
Recall that for any square matrix P , adj P = (Cof P )^t is a matrix satisfying
P adj P = (det P )In . Thus adj C is an n × n matrix whose entries are polynomials
of degree ≤ n − 1. Hence we can write adj C as
adj C = M_{n−1} x^{n−1} + M_{n−2} x^{n−2} + · · · + M_1 x + M_0 ,
where each M_i is an n × n matrix over F . Comparing the coefficients of the powers of x
on both sides of C adj C = (det C) In = χA (x) In , we obtain
k_n In = M_{n−1}
k_{n−1} In = M_{n−2} − A M_{n−1}
⋮
k_1 In = M_0 − A M_1
k_0 In = −A M_0 .
Multiply on the left the first equation by A^n , the second equation by A^{n−1} , and
so on. We then have
k_n A^n = A^n M_{n−1}
k_{n−1} A^{n−1} = A^{n−1} M_{n−2} − A^n M_{n−1}
⋮
k_1 A = A M_0 − A^2 M_1
k_0 In = −A M_0 .
Adding all of these equations, the terms on the right-hand side cancel, and we obtain
k_n A^n + k_{n−1} A^{n−1} + · · · + k_1 A + k_0 In = 0.
Hence χA (A) = 0.
Proof. This follows immediately from Theorem 3.3.3, Theorem 3.3.6 and Corol-
lary 3.3.7.
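As a small numerical illustration of the Cayley–Hamilton theorem (added here, not part of the original notes), take the 2 × 2 matrix A = (1 4; 3 2) used in an earlier example; substituting A into its own characteristic polynomial gives the zero matrix.

import numpy as np

A = np.array([[1., 4.],
              [3., 2.]])
# for a 2x2 matrix, chi_A(x) = x^2 - (tr A) x + (det A)
chi_of_A = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
print(np.allclose(chi_of_A, 0))   # True: chi_A(A) = 0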
Example. Define T : R^2 → R^2 by T (x, y) = (3x − 2y, 2x − y).
Find χT and mT .
Solution. Let B = {(1, 0), (0, 1)} be the standard basis for R^2 . Let
A = [T ]_B = \begin{pmatrix} 3 & -2 \\ 2 & -1 \end{pmatrix}.
Then
χT (x) = χA (x) = \det \begin{pmatrix} x-3 & 2 \\ -2 & x+1 \end{pmatrix} = (x − 3)(x + 1) + 4 = (x − 1)^2 .
Since mA divides χA and they have the same roots, we see that mA (x) = x − 1
or mA (x) = (x − 1)2 . If p(x) = x − 1, then p(A) = A − I 6= 0. Hence mT (x) =
mA (x) = (x − 1)2 .
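The same check can be done numerically (an added illustration, not part of the notes): x − 1 fails to annihilate A while (x − 1)^2 does.

import numpy as np

A = np.array([[3., -2.],
              [2., -1.]])
I = np.eye(2)
print(np.allclose(A - I, 0))               # False: x - 1 does not annihilate A
print(np.allclose((A - I) @ (A - I), 0))   # True: m_A(x) = (x - 1)^2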
Example. Define T : R^3 → R^3 by T (x, y, z) = (3x − 2y, −2x + 3y, 5z).
Find χT and mT .
Solution. Let B = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} be the standard basis for R^3 . Let
A = [T ]_B = \begin{pmatrix} 3 & -2 & 0 \\ -2 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}.
Then
χT (x) = χA (x) = \det \begin{pmatrix} x-3 & 2 & 0 \\ 2 & x-3 & 0 \\ 0 & 0 & x-5 \end{pmatrix} = (x − 5)^2 (x − 1).
Since mA (x) divides χA (x) and has the same roots, mA (x) = (x − 5)(x − 1) or
mA (x) = (x − 5)^2 (x − 1). A direct computation shows that (A − 5I)(A − I) = 0,
and hence mT (x) = mA (x) = (x − 5)(x − 1).
p(T ) = (T − α1 I) . . . (T − αj I) . . . (T − αk I),
we can switch the order of the terms in the parentheses so that T − αj I is the last
one on the right-hand side. Then (T − αj I)(vi ) = 0, which implies p(T )(vi ) = 0.
Hence p(T ) = 0. It follows that mT (x) | p(x). But then p(x) is a product of
distinct linear factors, and so is mT (x).
It follows that
q1 (T )τ1 (T ) + · · · + qk (T )τk (T ) = I.
Let v ∈ V and vi = qi (T )τi (T )(v) for i = 1, . . . , k. Then v = v1 + · · · + vk and
O O ... Ak
Proof. (a) Let C be a basis for W and extend it to a basis B for V . Let A = [T ]B
and B = [TW ]C . Then χA (x) and χB (x) are the characteristic polynomials of T
where Ii denotes the identity matrix of size dim Vi for each i. This finishes the
proof of part (a).
To prove part (b), we will show that
(ii) for any p(x) ∈ F [x], if mTi (x) | p(x) for i = 1, . . . , k, then mT (x) | p(x).
The first statement follows from Proposition 3.3.11 (ii). To show the second
statement, let p(x) ∈ F [x] be such that mTi (x) | p(x) for i = 1, . . . , k. Then
p(x) = qi (x)mTi (x) for some qi (x) ∈ F [x]. In particular, if vi ∈ Vi , then
p(T )(vi ) = qi (T )mTi (T )(vi ) = 0.
Since V = V1 ⊕ · · · ⊕ Vk , this shows that p(T ) = 0, which implies mT (x) | p(x). This finishes the proof of
the second statement and of part (b).
Exercises
In these exercises, unless otherwise stated, V is a finite-dimensional vector space
over a field F .
3.3.1. Find the characteristic polynomials and the minimal polynomials of the
following matrices, and determine whether they are diagonalizable.
(a) \begin{pmatrix} 3 & 1 & -1 \\ 2 & 2 & -1 \\ 2 & 2 & 0 \end{pmatrix}   (b) \begin{pmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{pmatrix}.
3.3.2. Let P : R2 → R2 be defined by P (x, y) = (x, 0). Find the minimal poly-
nomial of P .
3.3.4. Show that T is invertible if and only if the constant term in the minimal
polynomial of T is non-zero. Moreover, if T is invertible, then T −1 = p(T ) for
some p(x) ∈ F [x].
3.3.11. Let A and B be nonsingular complex square matrices such that ABA =
B. Prove that
Proof. (i) We will prove this statement under the assumption that F is alge-
braically closed. In this case, S and T are diagonalizable. (In the general case,
we can extend V to a new vector space over an algebraically closed field Ω con-
taining F . Then S and T will be diagonalizable over Ω.) Since S and T commute,
they are simultaneously diagonalizable by Theorem 3.3.15. Thus there is an
ordered basis B for V such that [S]B and [T ]B are diagonal matrices. Hence
Vi = ker(T − λi I)mi , i = 1, . . . , k.
Then
(ii) V = V1 ⊕ · · · ⊕ Vk .
σi (T )τi (T ) = mT (T ) = 0.
Note that τ1 (x), . . . , τk (x) have no common factors in F [x], and thus
gcd(τ1 (x), . . . , τk (x)) = 1. By Proposition 3.1.9, there exist q1 (x), . . . , qk (x) ∈ F [x]
such that q1 (x)τ1 (x) + · · · + qk (x)τk (x) = 1. Hence
q1 (T )τ1 (T ) + · · · + qk (T )τk (T ) = I.
Let v ∈ V and vi = qi (T )τi (T )(v) for i = 1, . . . , k. Then v = v1 + · · · + vk and
σi (T )(v) = 0. (3.9)
P
Write v = j6=i vj , where vj ∈ Vj for j = 1, . . . , k and j 6= i. Then τi (T )(vj ) = 0
for all j 6= i. Hence
X
τi (T )(v) = τi (T )(vj ) = 0. (3.10)
j6=i
Note that gcd(σi , τi ) = 1. By Proposition 3.1.9, there exist p(x), q(x) ∈ F [x]
such that
p(x)σi (x) + q(x)τi (x) = 1.
Thus
p(T )σi (T ) + q(T )τi (T ) = I.
By (3.9) and (3.10), it follows that
(iii) dim Vi = ni .
Proof. (i) Let Ti = T |Vi . Then (Ti − λi I)^{mi} = 0 on Vi . Hence the minimal
polynomial mTi (x) of Ti divides σi (x) = (x − λi )^{mi} . Thus mTi (x) = (x − λi )^{pi}
for some integer pi . It follows that χTi (x) = (x − λi )^{qi} for some integer qi . By
Proposition 3.3.13(i), we have
We will choose an ordered basis Bi for Vi so that [Ti − λi IVi ]Bi has a nice
form. Note that (Ti − λi IVi )mi = 0 on Vi . In this case, we say that Ti − λi IVi is
a nilpotent operator. We will investigate a nilpotent operator more carefully.
Proof. Exercise.
Proof. Let j be the smallest integer for which {v, T (v), . . . , T j (v)} is linearly
dependent. The existence of j follows from the assumption that V is finite-
dimensional. It follows that {v, T (v), . . . , T j−1 (v)} is linearly independent. We
will show that T^k (v) ∈ ⟨{v, T (v), . . . , T^{j−1}(v)}⟩ for all k ≥ 0, by induction on k.
This is clear for 0 ≤ k ≤ j − 1. Suppose T^s (v) ∈ ⟨{v, T (v), . . . , T^{j−1}(v)}⟩. Write
T^s (v) = c0 v + c1 T (v) + · · · + c_{j−1} T^{j−1}(v). Applying T , we get
T^{s+1}(v) = c0 T (v) + c1 T^2 (v) + · · · + c_{j−1} T^{j}(v).
Since the set {v, T (v), . . . , T^j (v)} is linearly dependent, T^j (v) can be written as
a linear combination of v, T (v), . . . , T^{j−1}(v). Hence T^{s+1}(v) ∈ ⟨{v, T (v), . . . , T^{j−1}(v)}⟩.
Hence V = ⟨{v, T (v), . . . , T^{j−1}(v)}⟩. It follows that {v, T (v), . . . , T^{j−1}(v)} is a
basis for V . Since dim V = n, we see that j = n.
[T ]B = Nk ,
Proof. Let vi = T^{k−i}(v) for i = 1, . . . , k. Then T (v1 ) = 0 and T (vi ) = vi−1 for
i = 2, . . . , k. It follows that [T ]B = Nk where Nk is defined by (3.11).
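Equation (3.11) lies outside this excerpt, but the computation in the proof determines Nk explicitly: since T (v1 ) = 0 and T (vi ) = vi−1 , the matrix of T with respect to B = {v1 , . . . , vk } is the k × k matrix with 1's on the superdiagonal and 0's elsewhere,
\[
N_k = \begin{pmatrix}
0 & 1 &        &        &   \\
  & 0 & 1      &        &   \\
  &   & \ddots & \ddots &   \\
  &   &        & 0      & 1 \\
  &   &        &        & 0
\end{pmatrix}.
\]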
(iii) V = W ⊕ W′.
y ∉ W + W′, but T (y) = w′ ∈ W′. (3.15)
(i) V = W1 ⊕ · · · ⊕ Wr ;
V = W1 ⊕ W′ = W1 ⊕ · · · ⊕ Wr .
While the cyclic subspaces that constitute the cyclic decomposition in Theo-
rem 3.4.13 are not unique, the number of cyclic subspaces in the direct sum and
their respective dimensions are uniquely determined by the information of the
operator T alone.
V = W1 ⊕ · · · ⊕ Wr ,
where Wi ’s are T -cyclic subspaces such that Ind T = dim W1 ≥ · · · ≥ dim Wr and
dim Wi = Ind(T |Wi ) for i = 1, . . . , r. Then
(i) r = dim(ker T );
This shows that Wi ∩ ker T^q is spanned by {T^{ki−q}(v), . . . , T^{ki−1}(v)} and thus has
dimension q.
Now, applying (3.16) and (3.18) to q = 1, we see that r = dim(ker T ). In
general,
dim(ker T^q) = Σ_{i=1}^{r} dim(Wi ∩ ker T^q) = Σ_{ki ≤ q−1} ki + Σ_{ki ≥ q} q.
Hence
dim(ker T^{q−1}) = Σ_{ki ≤ q−2} ki + Σ_{ki ≥ q−1} (q − 1) = Σ_{ki ≤ q−1} ki + Σ_{ki ≥ q} (q − 1).
It follows that
dim(ker T^q) − dim(ker T^{q−1}) = the number of indices i with ki ≥ q.
where
(i) k = k1 ≥ k2 ≥ · · · ≥ kr ;
(ii) k1 + · · · + kr = n = dim V .
Proof. It follows from Theorem 3.4.13 and Proposition 3.4.11. The uniqueness
part follows from Proposition 3.4.14.
V = V1 ⊕ · · · ⊕ Vk .
Vi = Wi1 ⊕ · · · ⊕ Wiri .
By Proposition 3.4.11, there is an ordered basis Bij for Wij such that
Finally, we have
V = ⊕_{i=1}^{k} Vi = ⊕_{i=1}^{k} ⊕_{j=1}^{ri} Wij .
(i) For each i, each entry on the main diagonal of Jij is λi , and the number
of λi ’s on the main diagonal of J is equal to ni . Hence the sum (over j) of
the orders of the Jij ’s is ni .
(iii) For each i, the number of blocks Jij equals the dimension of the eigenspace
ker(T − λi I).
(iv) For each i the number of blocks Jij with size q × q equals
(v) The Jordan canonical form is unique up to the order of the Jordan blocks.
Each Jordan block Jij corresponds to the subspace Wij in the cyclic decomposi-
tion above. Hence the largest Jordan block Jij is of size mi × mi .
Parts (iii) and (iv) follow from Proposition 3.4.14. The knowledge of (i)-(iv)
shows that the Jordan canonical form is unique up to the order of the Jordan
blocks.
Corollary 3.4.18. Let A be a square matrix such that mA (x) splits over F .
Then A is similar to a matrix in the Jordan canonical form (3.19). Moreover,
two matrices are similar if and only if they have the same Jordan canonical form,
except possibly for a permutation of the blocks.
Solution. We can extract the following information about the Jordan canonical
form J of T :
• J has size 7 × 7.
With these 3 properties above, the Jordan canonical form of T is one of the
following matrices:
the block-diagonal matrix with Jordan blocks
\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\ \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\ \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix},\ (3)
on its diagonal, or the one with Jordan blocks
\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\ (2),\ (2),\ \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix},\ (3).
The first matrix occurs when dim ker(T − 2I) = 2 and the second one occurs
when dim ker(T − 2I) = 3.
Solution. Note that since mT (x) = (x − 5)^3 , we see that ker(T − 5I)^3 = V .
From the given information, we know that
With these 3 pieces of information above, the possible Jordan canonical form of
T can be one of the following matrices:
the block-diagonal matrix with Jordan blocks of sizes 3, 2, 2,
\begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{pmatrix},\ \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix},\ \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix},
or the one with Jordan blocks of sizes 3, 3, 1,
\begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{pmatrix},\ \begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{pmatrix},\ (5).
Hence the only possible Jordan canonical form is the first matrix above.
Solution. Two matrices are similar if and only if they have the same Jordan
canonical form. Hence we will find 3 × 3 matrices in Jordan canonical form such
that A2 = 0.
Let p(x) = x^2 . Then p(A) = 0, which implies that mA (x) | p(x). Hence
mA (x) = x or mA (x) = x^2 . If mA (x) = x, then A = 0. If mA (x) = x^2 , then A
has two Jordan blocks, of sizes 2 × 2 and 1 × 1, with 0 on the diagonal:
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
Since J is upper triangular, det J and tr J are the product and the sum,
respectively, of its diagonal entries. But the diagonal entries of J are precisely
the eigenvalues of A. The result now follows.
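For a concrete matrix, a computer algebra system can produce the Jordan canonical form directly. The sketch below (an added illustration; it assumes SymPy's Matrix.jordan_form(), which returns a pair (P, J) with A = P J P^{-1}) recovers the 3 × 3 form discussed in the example above.

import sympy as sp

# any nonzero 3x3 matrix with A^2 = 0 should have the Jordan form found above
A = sp.Matrix([[0, 0, 0],
               [1, 0, 0],
               [2, 0, 0]])
assert A**2 == sp.zeros(3, 3)
P, J = A.jordan_form()   # A = P * J * P**(-1)
print(J)                 # one 2x2 and one 1x1 Jordan block, both with eigenvalue 0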
Exercises
3.4.1. Find the characteristic polynomial and the minimal polynomial of the matrix
A = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 4 \end{pmatrix}.
3.4.5. If A ∈ M5 (C) with χA (x) = (x − 2)3 (x + 7)2 and mA (x) = (x − 2)2 (x + 7),
what is the Jordan canonical form for A?
3.4.6. How many possible Jordan canonical forms are there for a 6 × 6 complex
matrix A with χA (x) = (x + 2)4 (x − 1)2 ?
(i) Describe all the possibilities for the Jordan canonical form of A.
(i) Find all the possibilities for the Jordan canonical form of A.
Chapter 4
Inner Product Spaces
The following proposition gives a formula that shows how to recover a sesqui-
linear form from its quadratic form.
We also have a Polarization identity for a symmetric bilinear form, which will
be given as an exercise.
- nondegenerate if
f (x, y) = 0 ∀y ∈ V ⇒ x = 0, and
f (y, x) = 0 ∀y ∈ V ⇒ x = 0.
- positive semi-definite if f (x, x) ≥ 0 for all x ∈ V ;
- positive definite if
∀x ∈ V, x ≠ 0 ⇒ f (x, x) > 0.
These identities, together with the Polarization identity, imply that f is hermi-
tian.
Proof. We will prove this for a sesquilinear form over a complex vector space.
Let A = f (x, x), B = |f (x, y)| and C = f (y, y). If B = 0, the result follows
trivially. Suppose B 6= 0. Let α = B/f (y, x). Then |α| = 1 and αf (y, x) = B.
By Corollary 4.1.7, we also have αf (x, y) = B. For any r ∈ R,
Exercises
4.1.1. If f : V × V → F is a symmetric bilinear form on V and q(v) = f (v, v) is
its associated quadratic form, show that for any x, y ∈ V ,
f (x, y) = (1/4) [ q(x + y) − q(x − y) ] = (1/2) [ q(x + y) − q(x) − q(y) ].
4.1.2. For any z = (z1 , . . . , zn ) and w = (w1 , . . . , wn ) in Cn , define
(z, w) = z1 w1 + · · · + zn wn .
Remark. In (i), p is an integer in the set {1, . . . , n}. In (ii) and (iii), we assume
that n = 2m is even.
Prove that A = B.
4.1.6. Compute the matrix representations of the bilinear forms in Problem 4.1.3.
Show that
[T ]_B^t [f ]_B [T ]_B = [f ]_B
A real (or complex) vector space equipped with an inner product is called a real
(or complex) inner product space.
Proof. Easy.
From Definition 4.2.1 and Proposition 4.2.2, we see that if F = R, then the
inner product is linear in both variables, and if F = C, then the inner product is
linear in the first variable and conjugate-linear in the second variable. Hence the
real inner product is a positive definite, symmetric bilinear form and the complex
inner product is a positive definite hermitian sesquilinear form.
In other words, kxk is the square-root of the associated quadratic form of x. The
Cauchy-Schwarz inequality (Theorem 4.1.9) can be written as
A vector space equipped with a norm is called a normed linear space, or simply
a normed space. Property (iii) is referred to as the triangle inequality.
Proof. It is easy to see that kxk ≥ 0 and kxk = 0 if and only if x = 0. For any
x ∈ V and α ∈ F, kαxk^2 = hαx, αxi = α ᾱ hx, xi = |α|^2 kxk^2 , so kαxk = |α| kxk.
For the triangle inequality, let x, y ∈ V . Then
kx + yk2 = hx + y, x + yi
= hx, xi + hx, yi + hy, xi + hy, yi
= kxk2 + 2 Rehx, yi + kyk2
≤ kxk2 + 2|hx, yi| + kyk2
≤ kxk2 + 2 kxk kyk + kyk2
= (kxk + kyk)2 .
(1) If F = R, then
hx, yi = (1/4) [ kx + yk^2 − kx − yk^2 ].
(2) If F = C, then
hx, yi = (1/4) [ kx + yk^2 − kx − yk^2 + i kx + iyk^2 − i kx − iyk^2 ].
Proof. The complex case is Proposition 4.1.4. The real case is easy and is left as
an exercise.
Examples.
1. F^n is an inner product space with respect to the following inner product:
hx, yi = Σ_{i=1}^{n} xi ȳi = x1 ȳ1 + x2 ȳ2 + · · · + xn ȳn ,
for x = (x1 , . . . , xn ), y = (y1 , . . . , yn ) ∈ F^n .
2. ℓ^2 = { (xn ) | Σ_{n=1}^{∞} |xn |^2 < ∞ }. If x = (xn ) and y = (yn ) ∈ ℓ^2 , then
hx, yi = Σ_{i=1}^{∞} xi ȳi
W ⊥ = { x ∈ V | x ⊥ W }.
Note that we can always construct an orthonormal set from an orthogonal set
of nonzero vectors by dividing each vector by its norm.
Examples.
(2) {(1, −1, 0), (1, 1, 0), (0, 0, 1)} is an orthogonal set in R^3 , but not an orthonormal
set. By dividing each element by its norm, we obtain an orthonormal set
{ (1/√2, −1/√2, 0), (1/√2, 1/√2, 0), (0, 0, 1) }.
(3) {e^{2nπix}}_{n=−∞}^{∞} is an orthonormal set in C[0, 1] because
∫_0^1 e^{2nπix} e^{−2mπix} dx = ∫_0^1 e^{2(n−m)πix} dx = δnm .
and
kxk^2 = Σ_{i=1}^{n} |αi |^2 = Σ_{i=1}^{n} | hx, ui i |^2 .
Proof. If x = Σ_{i=1}^{n} αi ui , then
hx, uj i = h Σ_{i=1}^{n} αi ui , uj i = Σ_{i=1}^{n} αi hui , uj i = αj .
= hx, uj i − hx, uj i = 0.
This implies that x − PN (x) ⊥ uj for j = 1, . . . , n. If y = Σ_{j=1}^{n} cj uj ∈ N , then
hx − PN (x), yi = h x − PN (x), Σ_{j=1}^{n} cj uj i = Σ_{j=1}^{n} c̄j hx − PN (x), uj i = 0.
span{x1 , . . . , xn } = span{u1 , . . . , un }.
Define
zn = xn − Σ_{i=1}^{n−1} hxn , ui i ui .
span{u1 , . . . , un } ⊆ span{x1 , . . . , xn }.
span{x1 , . . . , xn } ⊆ span{u1 , . . . , un }.
Example.
(1) {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is an orthonormal basis for R3 .
(2) { (1/√2, −1/√2, 0), (1/√2, 1/√2, 0), (0, 0, 1) } is an orthonormal basis for R^3 .
span{x1 , . . . , xn } = span{u1 , . . . , un }.
Example. Apply the Gram–Schmidt process to the basis {(1, 1, 0), (0, 1, 1), (1, 0, 1)}
for R^3 to obtain an orthonormal basis.
Solution. Let x1 = (1, 1, 0), x2 = (0, 1, 1) and x3 = (1, 0, 1). First, set
u1 = x1 / kx1 k = (1/√2, 1/√2, 0).
Next, let
z2 = x2 − hx2 , u1 i u1 = (0, 1, 1) − (1/√2)(1/√2, 1/√2, 0) = (−1/2, 1/2, 1).
Then set
u2 = z2 / kz2 k = √(2/3) (−1/2, 1/2, 1) = (−1/√6, 1/√6, 2/√6).
Now let
z3 = x3 − hx3 , u1 i u1 − hx3 , u2 i u2
   = (1, 0, 1) − (1/√2)(1/√2, 1/√2, 0) − (1/√6)(−1/√6, 1/√6, 2/√6)
   = (2/3, −2/3, 2/3).
Finally, set
u3 = z3 / kz3 k = (√3/2)(2/3, −2/3, 2/3) = (1/√3, −1/√3, 1/√3).
We have an orthonormal basis
{ (1/√2, 1/√2, 0), (−1/√6, 1/√6, 2/√6), (1/√3, −1/√3, 1/√3) }.
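The same computation in floating point (an added NumPy sketch, not part of the original notes):

import numpy as np

def gram_schmidt(vectors):
    # classical Gram-Schmidt: subtract projections onto the earlier u_i, then normalize
    basis = []
    for x in vectors:
        z = x - sum(np.dot(x, u) * u for u in basis)
        basis.append(z / np.linalg.norm(z))
    return basis

u1, u2, u3 = gram_schmidt([np.array([1., 1., 0.]),
                           np.array([0., 1., 1.]),
                           np.array([1., 0., 1.])])
print(np.round(u2, 4))   # approximately (-1/sqrt(6), 1/sqrt(6), 2/sqrt(6))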
V = W ⊕ W ⊥.
x = PW (x) + (x − PW (x)) ∈ W + W ⊥ .
= hv, wi.
To show uniqueness, let w′ ∈ V be such that f (v) = hv, wi = hv, w′i for any
v ∈ V . By Proposition 4.2.3, w = w′.
Exercises
4.2.6. In each of the following parts, apply Gram-Schmidt process to the given
basis for R3 to produce an orthonormal basis, and write the given element x as
a linear combination of the elements in the orthonormal basis thus obtained.
(a) {(1, 0, −1), (0, 1, 1), (1, 2, 3)}, and x = (2, 1, −2);
(b) {(1, 1, 1), (0, 1, 1), (0, 0, 3)}, and x = (3, 3, 1).
Prove also that the equality holds if and only if {v1 , . . . , vk } is an orthonormal
basis for V .
Setting r = 1, we have
hT x, yi + hT y, xi = 0.
Setting r = i, we have
hT x, yi − hT y, xi = 0.
Remark. Part (ii) may not hold for a real inner product space. For example,
let V = R^2 and let T be the 90°-rotation, i.e. T (x, y) = (−y, x) for any (x, y) ∈ R^2 .
Then hT v, vi = 0 for each v ∈ V , but T ≠ 0.
Then
hx, Syi = hx, T ∗ yi for all x, y ∈ V .
1. T ∗∗ = T ;
3. (T S)∗ = S ∗ T ∗ ;
Hence T ∗∗ = T .
Hence (T S)∗ = S ∗ T ∗ .
(T −1 )∗ T ∗ = T ∗ (T −1 )∗ = I ∗ = I.
It has all the properties listed in the previous theorem. Since we are mainly
interested in the case where V = W , we will restrict ourselves to this setting.
(LA )∗ = LA∗ ,
- T is said to be normal if T T ∗ = T ∗ T ;
If V is a real inner product space and T is unitary, then we may say that T is
orthogonal. It is clear that if T is self-adjoint or unitary, then it is normal.
If F = R, then
- A is said to be symmetric if At = A ;
hT x, xi = hx, T xi = hT x, xi,
0 = hT (x + y), x + yi = hT x, xi + hT x, yi + hT y, xi + hT y, yi,
hT x, yi = hy, T xi = hT y, xi.
The first equality follows from the fact that the inner product is real and the
second one follows because T is self-adjoint. It follows that hT x, yi = 0.
(i) T is unitary;
(iv) T ∗ T = I.
(ii) ⇒ (iii). We use the Polarization identity (Proposition 4.2.7). We will prove
it when F = C. The real case can be done the same way. For all x, y ∈ V ,
hx, yi = (1/4) [ kx + yk^2 − kx − yk^2 ] + (i/4) [ kx + iyk^2 − kx − iyk^2 ]
       = (1/4) [ kT x + T yk^2 − kT x − T yk^2 ] + (i/4) [ kT x + iT yk^2 − kT x − iT yk^2 ]
       = hT x, T yi.
R(x1 , x2 , . . . ) = (0, x1 , x2 , . . . ).
Then kRxk = kxk for all x ∈ `2 , but R is not surjective and thus not invertible.
(i) A is unitary;
(iv) A∗ A = In ;
Proof. The proof that (i), (ii), (iii) and (iv) are equivalent is similar to the proof
of Theorem 4.3.10. We now show that (iv) and (v) are equivalent. Let A = [aij ]
and A∗ = [bij ], where bij = āji . Then A∗ A = [cij ], where
cij = (A∗ A)ij = Σ_{k=1}^{n} bik akj = Σ_{k=1}^{n} āki akj . (4.4)
The fact that A∗ A = In is equivalent to (A∗ A)ij = δij for i, j ∈ {1, . . . , n}. The
i-th column vector of A is Ci = (a1i , . . . , ani ), for i = 1, . . . , n. Hence
hCi , Cj i = Σ_{k=1}^{n} aki ākj .
It follows that
hCj , Ci i = Σ_{k=1}^{n} āki akj , the complex conjugate of hCi , Cj i. (4.5)
From (4.4) and (4.5), we see that (iv) and (v) are equivalent.
That (vi) is equivalent to the other statements follows from the fact that A is
unitary if and only if At is unitary and that the row vectors of A are the column
vectors of At .
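For a concrete matrix, conditions (iv) and (v) are easy to test numerically; the sketch below (an added illustration using NumPy, not part of the notes) checks both for a rotation matrix, which is unitary (indeed orthogonal).

import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# (iv): A*A = I_n
print(np.allclose(A.conj().T @ A, np.eye(2)))          # True
# (v): the columns of A form an orthonormal set
gram = [[np.vdot(A[:, j], A[:, i]) for j in range(2)] for i in range(2)]
print(np.allclose(gram, np.eye(2)))                    # True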
(i) A is orthogonal;
(iv) At A = In ;
Exercises
4.3.1. Let V be a (finite-dimensional) inner product space. If P is an orthogonal
projection onto a subspace of V , prove that P 2 = P and P ∗ = P . Conversely, if
P is a linear operator on V such that P 2 = P and P ∗ = P , show that P is an
orthogonal projection onto a subspace of V .
(ii) [T ∗ ]B = [T ]∗B ;
T (W ⊥ ) = T (W )⊥ .
4.3.7. Show that every linear operator T on a complex inner product space V
can be written uniquely in the form
T = T1 + iT2 , where T1 and T2 are self-adjoint operators on V .
(a) P is self-adjoint;
(b) P is normal;
In the last equality, we use the fact that an eigenvalue of a self-adjoint operator
is real. Since λ 6= µ, we have hu, vi = 0. Hence the eigenspaces associated with λ
and µ are orthogonal.
The complex spectral theorem says that a linear operator on a complex inner
product space can be diagonalized by an orthonormal basis precisely when it is
normal. However, if an inner product space is real, a linear operator is diagonal-
ized by an orthonormal basis precisely when it is self-adjoint. To prove this, we
need an important lemma. A linear operator on a real inner product space may
not have an eigenvalue, but a self-adjoint operator always has one.
Proof. Let V be a real inner product space. Assume first that V has an orthonor-
mal basis B consisting of eigenvectors of T . Then [T ]B is a diagonal matrix, say,
[T ]B = diag(λ1 , . . . , λn ), where λi ’s are all real. Thus
Proof. We will give a proof for the complex case. Let A be a complex matrix.
First, note that if P is an invertible matrix, then D = P −1 AP is equivalent to
P D = AP . Moreover, if D is a diagonal matrix and P = [u1 . . . un ], where each
ui is the i-th column of P , then
A = P DP −1 = P DP t .
Example. Define
A = \begin{pmatrix} 1 & 2 \\ 2 & -2 \end{pmatrix}.
Find an orthogonal matrix P and a diagonal matrix D such that A = P DP^{-1} .
Solution.
χA (x) = \det \begin{pmatrix} x-1 & -2 \\ -2 & x+2 \end{pmatrix} = x^2 + x − 6 = (x + 3)(x − 2).
Hence the eigenvalues of A are −3 and 2.
Thus B = { (1/√5, −2/√5), (2/√5, 1/√5) } is an orthonormal basis for R^2 consisting of eigenvectors of A. Let
P = \begin{pmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} -3 & 0 \\ 0 & 2 \end{pmatrix}.
Example. Define
A = \begin{pmatrix} 5 & 4 & 2 \\ 4 & 5 & 2 \\ 2 & 2 & 2 \end{pmatrix}.
Find an orthogonal matrix P and a diagonal matrix D such that A = P DP^{-1} .
Solution. Solving the equation det(xI3 − A) = 0, we have x = 1, 1, 10, which are the
eigenvalues of A. For the eigenvalue 1 we may choose the orthonormal eigenvectors
u1 = (1/√5, 0, −2/√5) and u2 = (−4/√45, 5/√45, −2/√45),
and for the eigenvalue 10,
u3 = (2, 2, 1)/k(2, 2, 1)k = (2/3, 2/3, 1/3).
Let
P = \begin{pmatrix} 1/\sqrt{5} & -4/\sqrt{45} & 2/3 \\ 0 & 5/\sqrt{45} & 2/3 \\ -2/\sqrt{5} & -2/\sqrt{45} & 1/3 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 10 \end{pmatrix}.
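In floating point, this orthogonal diagonalization is exactly what numpy.linalg.eigh computes for a real symmetric matrix; the sketch below (an added illustration, not part of the notes) redoes the example above.

import numpy as np

A = np.array([[5., 4., 2.],
              [4., 5., 2.],
              [2., 2., 2.]])
evals, P = np.linalg.eigh(A)   # ascending eigenvalues; orthonormal eigenvectors as columns
print(np.round(evals, 6))                          # [ 1.  1. 10.]
print(np.allclose(P @ np.diag(evals) @ P.T, A))    # A = P D P^t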
Consider, for example, the matrix A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}.
It is easy to show that A has eigenvalues 1 and 2 and that V1 = span{(1, 0)} and
V2 = span{(1, 1)}. Hence {(1, 0), (1, 1)} is a basis for R2 consisting of eigenvectors
of A. However, we cannot choose vectors in V1 and V2 that are orthogonal. Hence
there is no orthonormal basis for R2 consisting of eigenvectors of A.
2. A real matrix can be orthogonally diagonalizable over C, but not over R.
For example, consider the following real orthogonal matrix:
A = \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix}.
However, the only real matrices which are orthogonally diagonalizable (over R)
are symmetric matrices. Hence A is not orthogonally diagonalizable over R.
hT x, xi ≥ 0 for any x ∈ V .
(i) T is positive;
λkxk2 = hλx, xi = hT x, xi ≥ 0,
which implies λ ≥ 0.
(ii) ⇒ (iii). Assume T is self-adjoint and all eigenvalues of T are nonnegative.
By the Spectral theorem, there is an orthonormal basis B = {u1 , . . . , un } for V
consisting of eigenvectors of T . Assume that T uj = λj uj for j = 1, . . . , n. Then
λj ≥ 0 for all j. Define P uj = √λj uj for j = 1, . . . , n and extend it to a linear
operator on V . Clearly,
P^2 uj = P (√λj uj ) = √λj P uj = λj uj = T uj
for j = 1, . . . , n.
P ∗ P = P P = P 2 = T.
Proof. Exercise.
j and that
ker(P − √λj I) ⊆ ker(T − λj I).
This shows that the only eigenvalues of P are √λ1 , . . . , √λk . Since P is self-
adjoint, it is diagonalizable and thus
V = ker(P − √λ1 I) ⊕ · · · ⊕ ker(P − √λk I).
It follows that
ker(P − √λj I) = ker(T − λj I) for j = 1, . . . , k.
Hence on each subspace ker(T − λj I) of V , P = √λj I. Thus the positive square
root P of T is uniquely determined.
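The construction of the positive square root (take √λj on each eigenspace) translates directly into a computation for a positive semi-definite real symmetric matrix; a short sketch (added illustration, not part of the notes):

import numpy as np

def sqrt_psd(T):
    # positive square root: sqrt of each eigenvalue on the corresponding eigenspace
    evals, U = np.linalg.eigh(T)
    return U @ np.diag(np.sqrt(np.clip(evals, 0.0, None))) @ U.T

T = np.array([[2., 1.],
              [1., 2.]])       # eigenvalues 1 and 3, both positive
P = sqrt_psd(T)
print(np.allclose(P @ P, T))   # True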
U ∗ U = (T P −1 )∗ (T P −1 ) = (P −1 )∗ T ∗ T P −1 = P −1 P 2 P −1 = I.
Hence U is unitary.
Suppose T = U1 P1 = U2 P2 , where U1 , U2 are unitary and P1 , P2 are positive
definite. Then
T ∗ T = (P1∗ U1∗ )(U1 P1 ) = P1∗ IP1 = P12 .
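Numerically, the polar decomposition T = UP can be obtained from the singular value decomposition: writing T = WΣV*, one may take U = WV* and P = VΣV*. A sketch (an added illustration, not the notes' construction):

import numpy as np

def polar(T):
    # polar decomposition T = U P via the SVD T = W diag(s) Vh
    W, s, Vh = np.linalg.svd(T)
    U = W @ Vh                          # unitary factor
    P = Vh.conj().T @ np.diag(s) @ Vh   # positive semi-definite factor
    return U, P

T = np.array([[1., 2.],
              [0., 3.]])
U, P = polar(T)
print(np.allclose(U @ P, T), np.allclose(U.T @ U, np.eye(2)))   # True True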
Exercises
4.4.1. Given
A = \begin{pmatrix} 0 & 2 & -1 \\ 2 & 3 & -2 \\ -1 & -2 & 0 \end{pmatrix},
find an orthogonal matrix P that diagonalizes A.
4.4.2. Let T be a normal operator on a complex finite-dimensional inner product
space and let σ(T ) denote the set of eigenvalues of T . Prove that
(a) T is self-adjoint if and only if σ(T ) ⊆ R;
(ii) I = E1 + · · · + En ;
(iii) T = λ1 E1 + · · · + λn En .
Conversely, if there exist orthogonal projections E1 , . . . , En satisfying (i)-(iii)
above, show that T is normal.
4.4.6. Let a1 , . . . , an , b1 , . . . , bn ∈ F for some n ∈ N, where a1 , . . . , an are distinct. Show that there is a
polynomial p(x) ∈ F[x] such that p(ai ) = bi for i = 1, . . . , n. This is called the
Lagrange Interpolation Theorem.
4.4.7. Let T be a linear operator on a finite-dimensional complex inner product
space. Show that T is normal if and only if there is a polynomial p ∈ C[x] such
that T ∗ = p(T ).
Bibliography
[1] Sheldon Axler, Linear Algebra Done Right, Second Edition, Springer, New
York, 1997.
[3] William C. Brown, A Second Course in Linear Algebra, John Wiley & Sons,
New York, 1988.
[5] Kenneth Hoffman and Ray Kunze, Linear Algebra, Second Edition, Prentice
Hall, New Jersey, 1971.
[7] Seymour Lipschutz, Linear Algebra SI (Metric) Edition, McGraw Hill, Sin-
gapore, 1987.
[8] Aigli Papantonopoulou, Algebra : Pure and Applied, First Edition, Prentice
Hall, New Jersey, 2002.
[9] Steven Roman, Advanced Linear Algebra, Third Edition, Springer, New York
2008.
[10] Surjeet Singh and Qazi Zameeruddin, Modern Algebra, Vikas Publishing
House PVT Ltd., New Delhi, 1988.