
Some notes on linear algebra

Throughout these notes, k denotes a field (often called the scalars in


this context). Recall that this means that there are two binary operations
on k, denoted + and ·, such that (k, +) is an abelian group, · is commutative
and associative and distributes over addition, there exists a multiplicative
identity 1, and, for all t ∈ k with t ≠ 0, there exists a multiplicative inverse for
t, denoted t−1 . The main example in this course will be k = C, but we shall
also consider the cases k = R and k = Q. Another important case is k = Fq ,
a finite field with q elements, where q = pn is necessarily a prime power (and
p is a prime number). For example, for q = p, Fp = Z/pZ. If we drop the
requirement that multiplication be commutative, then such a k is called a
division algebra or skew field. It is hard to find noncommutative examples
of division rings, though. For example, by a theorem of Wedderburn, a finite
division ring is a field. One famous example is the quaternions

H = R + R · i + R · j + R · k.

A typical element of H is then of the form a0 + a1 i + a2 j + a3 k. Here


i2 = j 2 = k 2 = −1, and

ij = k = −ji; jk = i = −kj; ki = j = −ik.

These rules then force the rules for multiplication for H, and a somewhat
lengthy check shows that H is a (noncommutative) division algebra.
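
As an illustration (not part of the original notes), here is a small Python sketch of H: a quaternion a0 + a1 i + a2 j + a3 k is stored as a 4-tuple of real numbers, multiplication is forced by the rules above, and the inverse of a nonzero element is its conjugate divided by its norm. The helper names qmul and qinv are ad hoc.

# Quaternions as 4-tuples (a0, a1, a2, a3) over R; multiplication follows
# i^2 = j^2 = k^2 = -1, ij = k = -ji, jk = i = -kj, ki = j = -ik.
def qmul(p, q):
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qinv(p):
    # every nonzero quaternion is invertible: p^{-1} = conjugate(p) / |p|^2
    a0, a1, a2, a3 = p
    n = a0*a0 + a1*a1 + a2*a2 + a3*a3
    return (a0/n, -a1/n, -a2/n, -a3/n)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)    # ij = k = -ji
p = (1.0, 2.0, -1.0, 3.0)
print(qmul(p, qinv(p)))    # (1.0, 0.0, 0.0, 0.0) up to rounding
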
We shall occasionally relax the condition that every nonzero element has
a multiplicative inverse. Thus a ring R is a set with two binary operations
+ and ·, that (R, +) is an abelian group, · is associative and distributes over
addition, and there exists a multiplicative identity 1. If · is commutative as
well, we call R a commutative ring. A standard example of a noncommuta-
tive ring is the ring Mn (R) of n × n matrices with coefficients in R with the
usual operations of matrix addition and multiplication, for n > 1, and with
multiplicative identity the n × n identity matrix I. (For n = 1, it is still
a ring but in fact it is isomorphic to R and hence is commutative.) More

generally, if k is any field, for example k = C and n > 1, then the set Mn (k)
of n × n matrices with coefficients in k with the usual operations of matrix
addition and multiplication is a noncommutative ring.

1 Definition of a vector space


Definition 1.1. A k-vector space or simply a vector space is a triple
(V, +, ·), where (V, +) is an abelian group (the vectors), and there is a func-
tion k × V → V (scalar multiplication), whose value at (t, v) is just denoted
t · v or tv, such that

1. For all s, t ∈ k and v ∈ V , s(tv) = (st)v.

2. For all s, t ∈ k and v ∈ V , (s + t)v = sv + tv.

3. For all t ∈ k and v, w ∈ V , t(v + w) = tv + tw.

4. For all v ∈ V , 1 · v = v.

A vector subspace W of a k-vector space V is a subgroup W of (V, +)


such that, for all t ∈ k and w ∈ W , tw ∈ W , i.e. W is a subgroup closed
under scalar multiplication. It is then straightforward to check that W is
again a k-vector space. For example, {0} and V are always vector subspaces
of V .

It is a straightforward consequence of the axioms that 0v = 0 for all


v ∈ V , where the first 0 is the element 0 ∈ F and the second is 0 ∈ V ,
that, for all t ∈ F , t0 = 0 (both 0’s here are the zero vector), and that, for
all v ∈ V , (−1)v = −v. Hence, W is a vector subspace of V ⇐⇒ W is
nonempty and is closed under addition and scalar multiplication.

Example 1.2. (0) The set {0} is a k-vector space.


(1) The n-fold Cartesian product k n is a k-vector space, where k n is a group
under componentwise addition and scalar multiplication, i.e.

(a1 , . . . , an ) + (b1 , . . . , bn ) = (a1 + b1 , . . . , an + bn )


t(a1 , . . . , an ) = (ta1 , . . . , tan ).

(2) If X is a set and k X denotes the group of functions from X to k, then


k X is a k-vector space. Here addition is defined pointwise: (f + g)(x) =
f (x) + g(x), and similarly for scalar multiplication: (tf )(x) = tf (x).

(3) With X a set as in (2), we consider the subset k[X] ⊆ k X of functions
f : X → k such that the set {x ∈ X : f (x) ≠ 0} is a finite subset of X,
then k[X] is easily seen to be a vector space under pointwise addition and
scalar multiplication of functions, so that k[X] is a vector subspace of k X .
Of course, k[X] = k X if and only if X is finite. In general, for a function
f : X → k, we define the support of f or Supp f by:

Supp f = {x ∈ X : f (x) ≠ 0}.

Then k[X] is the subset of k X consisting of all functions whose support is


finite.
Given x ∈ X, let δx ∈ k[X] be the function defined by

δx (y) = 1 if y = x, and δx (y) = 0 otherwise.

The function δx is often called the delta function at x. Then clearly, for
every f ∈ k[X],

f = ∑_{x∈X} f (x) δx ,

where in fact the above sum is finite (and therefore meaningful). We will
often identify the function δx with the element x ∈ X. In this case, we
can view X as a subset of k[X], and every element of k[X] can be uniquely
written as ∑_{x∈X} tx · x, where the tx ∈ k and tx = 0 for all but finitely
many x. The case X = {1, . . . , n} corresponds to k n , and the element δi
to the vector ei = (0, . . . , 1, . . . , 0) (the ith component is 1 and all other
components are 0).
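
A concrete model (again only an illustration, with ad hoc helper names, not part of the notes) is to store an element of k[X] as a Python dict sending each x in its support to a nonzero coefficient; pointwise addition and scalar multiplication then keep the support finite.

# Elements of k[X] as dicts {x: coefficient}; missing keys mean coefficient 0.
def add(f, g):
    h = dict(f)
    for x, t in g.items():
        h[x] = h.get(x, 0) + t
        if h[x] == 0:          # drop zero coefficients to keep the support minimal
            del h[x]
    return h

def scale(t, f):
    return {x: t * c for x, c in f.items()} if t != 0 else {}

def delta(x):                  # the delta function at x
    return {x: 1}

f = add(scale(3, delta('a')), scale(-1, delta('b')))   # f = 3*delta_a - delta_b
print(f)                              # {'a': 3, 'b': -1}
print(add(f, scale(-3, delta('a'))))  # {'b': -1}
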

For a general ring R, the analogue of a vector space is an R-module,


with some care if R is not commutative:

Definition 1.3. A left R-module M is a triple (M, +, ·), where (M, +) is


an abelian group and there is a function R × M → M , whose value at (r, m)
is just denoted r · m or rm, such that

1. For all r, s ∈ R and m ∈ M , r(sm) = (rs)m.

2. For all r, s ∈ R and m ∈ M , (r + s)m = rm + sm.

3. For all r ∈ R and m, n ∈ M , r(m + n) = rm + rn.

4. For all m ∈ M , 1 · m = m.

Left submodules of an R-module M are defined in the obvious way.
A right R-module is an abelian group M with a multiplication by R on
the right, i.e. a function R × M → M whose value at (r, m) is denoted
mr, satisfying (ms)r = m(sr) and the remaining analogues of (2), (3), (4)
above. Thus, if R is commutative, there is no difference between left and
right R-modules.

Example 1.4. (1) The n-fold Cartesian product Rn is a left R-module,


where Rn is a group under componentwise addition and scalar multiplica-
tion, i.e.

(a1 , . . . , an ) + (b1 , . . . , bn ) = (a1 + b1 , . . . , an + bn )


r(a1 , . . . , an ) = (ra1 , . . . , ran ).

Of course, it can also be made into a right R-module.


(2) If X is a set and RX denotes the group of functions from X to R, then
RX is a left R-module using pointwise addition and scalar multiplication.
Similarly the subset R[X], defined in the obvious way, is a submodule.
(3) Once we replace fields by rings, there are many more possibilities for
R-modules. For example, if I is an ideal in R (or a left ideal if R is not
commutative), then I is an R-submodule of R and the quotient ring R/I
(in the commutative case, say) is also an R-module. For example, Z/nZ
is a Z-module, but it looks very different from a Z-module of the form Zn .
For another example, which will be more relevant to us, if R = Mn (k) is
the (noncommutative) ring of n × n matrices with coefficients in the field k,
then k n is a left R-module, by defining A · v to be the usual multiplication
of the matrix A with the vector v (viewed as a column vector).

2 Linear maps
Linear maps are the analogue of homomorphisms and isomorphisms:

Definition 2.1. Let V1 and V2 be two k-vector spaces and let F : V1 → V2


be a function (= map). Then F is linear or a linear map if it is a group
homomorphism from (V1 , +) to (V2 , +), i.e. is additive, and satisfies: For all
t ∈ k and v ∈ V1 , F (tv) = tF (v). The composition of two linear maps is
again linear. For example, for all V1 and V2 , the constant function F = 0 is
linear. If V1 = V2 = V , then the identity function Id : V → V is linear.
The function F is a linear isomorphism or briefly an isomorphism if it
is both linear and a bijection; in this case, it is easy to check that F −1 is

also linear. We say that V1 and V2 are isomorphic, written V1 ∼
= V2 , if there
exists an isomorphism F from V1 to V2 .
If F : V1 → V2 is a linear map, then we define

Ker F = {v ∈ V1 : F (v) = 0};


Im F = {w ∈ V2 : there exists v ∈ V1 such that F (v) = w}.

Lemma 2.2. Let F : V1 → V2 be a linear map. Then Ker F is a subspace of


V1 and Im F is a subspace of V2 . Moreover, F is injective ⇐⇒ Ker F = {0}
and F is surjective ⇐⇒ Im F = V2 .

To find examples of linear maps, we use the following:

Lemma 2.3. Let W be a vector space and let w1 , . . . , wn ∈ W . There is

a unique linear map F : k n → W defined by F (t1 , . . . , tn ) = ∑_{i=1}^n ti wi . It
satisfies: For every i, F (ei ) = wi .

This lemma may be generalized as follows:

Lemma 2.4. Let X be a set and let V be a vector space. For each function
f : X → V , there is a unique linear map F : k[X] → V defined by
F (∑_{x∈X} tx · x) = ∑_{x∈X} tx f (x) (a finite sum). In particular, F is specified
by the requirement that F (x) = f (x) for all x ∈ X. Finally, every linear
function k[X] → V is of this form.

For this reason, k[X] is sometimes called the free vector space on the set
X.

3 Linear independence and span


Let us introduce some terminology:

Definition 3.1. Let V be a k-vector space and let v1 , . . . , vd ∈ V be a


sequence of vectors. A linear combination of v1 , . . . , vd is a vector of the
form t1 v1 + · · · + td vd , where the ti ∈ k. The span of {v1 , . . . , vd } is the set
of all linear combinations of v1 , . . . , vd . Thus

span{v1 , . . . , vd } = {t1 v1 + · · · + td vd : ti ∈ k for all i}.

By definition (or logic), span ∅ = {0}.

With V and v1 , . . . , vd as above, we have the following properties of span:

(i) span{v1 , . . . , vd } is a vector subspace of V containing vi for every
i. In fact, it is the image of the linear map F : k d → V defined by
F (t1 , . . . , td ) = ∑_{i=1}^d ti vi above.

(ii) If W is a vector subspace of V containing v1 , . . . , vd , then

span{v1 , . . . , vd } ⊆ W.

In other words, span{v1 , . . . , vd } is the smallest vector subspace of V


containing v1 , . . . , vd .

(iii) For every v ∈ V , span{v1 , . . . , vd } ⊆ span{v1 , . . . , vd , v}, and equality


holds if and only if v ∈ span{v1 , . . . , vd }.

Definition 3.2. A sequence of vectors v1 , . . . , vd such that

V = span{v1 , . . . , vd }

will be said to span V . Thus, v1 , . . . , vd span V ⇐⇒ the linear map
F : k d → V defined by F (t1 , . . . , td ) = ∑_{i=1}^d ti vi is surjective.

Definition 3.3. A k-vector space V is a finite dimensional vector space if


there exist v1 , . . . , vd ∈ V such that V = span{v1 , . . . , vd }. The vector space
V is infinite dimensional if it is not finite dimensional.

For example, k n is finite-dimensional. But, if X is an infinite set, then


k[X] and hence k X are not finite-dimensional vector spaces.

Definition 3.4. A sequence w1 , . . . , wr ∈ V is linearly independent if the


following holds: if there exist ti ∈ k such that

t1 w1 + · · · + tr wr = 0,

then ti = 0 for all i. The sequence w1 , . . . , wr is linearly dependent if it is


not linearly independent. Note that w1 , . . . , wr ∈ V are linearly indepen-
dent ⇐⇒ Ker F = {0}, where F : k r → V is the linear map defined by
F (t1 , . . . , tr ) = ∑_{i=1}^r ti wi ⇐⇒ the map F defined above is injective. It
then follows that the vectors w1 , . . . , wr are linearly independent if and only
if, given ti , si ∈ k, 1 ≤ i ≤ r, such that

t1 w1 + · · · + tr wr = s1 w1 + · · · + sr wr ,

then ti = si for all i.

Note that the definition of linear independence does not depend only
on the set {w1 , . . . , wr }—if there are any repeated vectors wi = wj , then we
can express 0 as the nontrivial linear combination wi − wj . Likewise if one
of the wi is zero then the set is linearly dependent.
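
For k = R or C one can test linear independence numerically (an illustrative NumPy sketch, not part of the notes): w1 , . . . , wr are independent exactly when the matrix whose columns are the wi has rank r, i.e. when the map F above is injective.

import numpy as np

def independent(vectors):
    A = np.column_stack(vectors)            # columns are the given vectors
    return np.linalg.matrix_rank(A) == len(vectors)

w1, w2, w3 = np.array([1., 0., 2.]), np.array([0., 1., 1.]), np.array([1., 1., 3.])
print(independent([w1, w2]))        # True
print(independent([w1, w2, w3]))    # False: w3 = w1 + w2
print(independent([w1, w1]))        # False: a repeated vector is always dependent
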

Definition 3.5. Let V be a k-vector space. The vectors v1 , . . . , vd are


a basis of V if they are linearly independent and V = span{v1 , . . . , vd }.
Equivalently, the vectors v1 , . . . , vd are a basis of V ⇐⇒ the linear map
F : k d → V defined by F (t1 , . . . , td ) = ∑_{i=1}^d ti vi is an isomorphism.

Remark 3.6. There are analogous notions of linear independence, span,


and basis for an infinite dimensional vector space V : If {vi : i ∈ I} is a
collection of vectors in V indexed by a set I, then {vi : i ∈ I} is linearly
independent if the following holds: if there exist ti ∈ k such that ti = 0 for
all but finitely many i ∈ I and ∑_{i∈I} ti vi = 0, then ti = 0 for all i. The span
of {vi : i ∈ I} is the subset

{ ∑_{i∈I} ti vi : ti ∈ k, ti = 0 for all but finitely many i ∈ I },

and {vi : i ∈ I} spans V if its span is equal to V . Finally, {vi : i ∈ I} is a


basis of V if it is linearly independent and spans V , or equivalently if every
v ∈ V can be uniquely expressed as ∑_{i∈I} ti vi , where ti ∈ k and ti = 0 for
all but finitely many i ∈ I.
With this definition, the set X (or equivalently the set {δx : x ∈ X}) is
a basis for k[X].

We summarize the salient facts about span, linear independence, and


bases as follows:

1. Suppose that V = span{v1 , . . . , vn } and that w1 , . . . , w` are linearly


independent vectors in V . Then ` ≤ n.

2. If V is a finite dimensional k-vector space, then every two bases for


V have the same number of elements—call this number the dimension
of V which we write as dim V or dimk V if we want to emphasize the
field k. Thus for example dim k n = n, and, more generally, if X is a
finite set, then dim k[X] = #(X).

3. If V = span{v1 , . . . , vd } then some subsequence of v1 , . . . , vd is a basis


for V . Hence dim V ≤ d, and if dim V = d then v1 , . . . , vd is a basis
for V .

4. If V is a finite-dimensional vector space and w1 , . . . , w` are linearly
independent vectors in V , then there exist vectors

w`+1 , . . . , wr ∈ V

such that w1 , . . . , w` , w`+1 , . . . , wr is a basis for V . Hence dim V ≥ `,


and if dim V = ` then w1 , . . . , w` is a basis for V .

5. If V is a finite-dimensional vector space and W is a vector subspace


of V , then dim W ≤ dim V . Moreover dim W = dim V if and only if
W =V.

6. If V is a finite-dimensional vector space and F : V → V is linear, then


F is injective ⇐⇒ F is surjective ⇐⇒ F is an isomorphism.
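
Fact 3 above can be illustrated numerically when k = R. The sketch below (helper name ad hoc, not part of the notes) extracts a basis from a spanning list by keeping each vector that increases the rank.

import numpy as np

def extract_basis(vectors):
    basis = []
    for v in vectors:
        if np.linalg.matrix_rank(np.column_stack(basis + [v])) > len(basis):
            basis.append(v)            # v is not in the span of the vectors kept so far
    return basis

v1 = np.array([1., 0., 0.])
v2 = np.array([2., 0., 0.])            # redundant: a multiple of v1
v3 = np.array([0., 1., 0.])
v4 = np.array([1., 1., 0.])            # redundant: v1 + v3
print(len(extract_basis([v1, v2, v3, v4])))    # 2 = dim span{v1, v2, v3, v4}
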

Remark 3.7. If V is a finite dimensional k-vector space and W ⊆ V , then


W is a vector subspace of V if and only if it is of the form span{v1 , . . . , vd }
for some v1 , . . . , vd ∈ V . This is no longer true if V is not finite dimensional.

4 The group algebra


We apply the construction of k[X] to the special case where X = G is a
group (written multiplicatively). In this case, we can view k[G] as functions
f : G → k with finite support or as formal sums ∑_{g∈G} tg · g, where tg = 0 for
all but finitely many g. It is natural to try to extend the multiplication in G
to a multiplication on k[G], by defining the product of the formal symbols
g and h to be gh (using the product in G) and then expanding by using the
obvious choice of rules. Explicitly, we define

(∑_{g∈G} sg · g)(∑_{g∈G} tg · g) = ∑_{g∈G} ( ∑_{h1 h2 = g} s_{h1} t_{h2} ) · g.

It is straightforward to check that, with our finiteness assumptions, the inner


sums in the formula above are all finite and the coefficients of the product
as defined above are 0 for all but finitely many g.
If we view elements of k[G] instead as functions, then the product of two
functions f1 and f2 is called the convolution f1 ∗f2 , and is defined as follows:

(f1 ∗ f2 )(g) = ∑_{h1 h2 = g} f1 (h1 ) f2 (h2 ) = ∑_{h∈G} f1 (h) f2 (h−1 g) = ∑_{h∈G} f1 (gh−1 ) f2 (h).

Again, it is straightforward to check that the sums above are finite and that
Supp f1 ∗ f2 is finite as well.
With either of the above descriptions, one can then check that k[G] is a
ring. The messiest calculation is associativity of the product. For instance,
we can write

f1 ∗ (f2 ∗ f3 )(g) = ∑_{x∈G} f1 (x)(f2 ∗ f3 )(x−1 g) = ∑_{x∈G} ∑_{y∈G} f1 (x) f2 (y) f3 (y−1 x−1 g)
                  = ∑_{xyz=g} f1 (x) f2 (y) f3 (z).

Similarly,

(f1 ∗ f2 ) ∗ f3 (g) = ∑_{z∈G} (f1 ∗ f2 )(gz−1 ) f3 (z) = ∑_{x∈G} ∑_{z∈G} f1 (x) f2 (x−1 gz−1 ) f3 (z)
                  = ∑_{xyz=g} f1 (x) f2 (y) f3 (z).

Thus f1 ∗ (f2 ∗ f3 ) = (f1 ∗ f2 ) ∗ f3 . Left and right distributivity can also be


checked by a (somewhat easier) calculation. If 1 ∈ G is the identity element,
let 1 denote the element of k[G] which is 1 · 1 (the first 1 is 1 ∈ k and the
second is 1 ∈ G). Thus we identify 1 with the element δ1 ∈ k[G]. Then
1 ∗ f = f ∗ 1 = f . In fact, we leave it as an exercise to show that, for all
h ∈ G, and all f ∈ k[G],
(δh ∗ f )(g) = f (h−1 g);
(f ∗ δh )(g) = f (gh−1 ).
In other words, convolution of the function f with δh has the effect of trans-
lating the function f , i.e. shifting the variable.
The ring k[G] is commutative ⇐⇒ G is abelian. It is essentially never
a division ring. For example, if g ∈ G has order 2, so that g 2 = 1, then the
element 1 + g ∈ k[G] is a zero divisor: let −g denote the element (−1) · g,
where −1 ∈ k is the additive inverse of 1. Then
(1 + g)(1 − g) = 1 + g − g − g 2 = 1 − 1 = 0.
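
The following sketch (not part of the notes) implements k[G] for the cyclic group G = Z/nZ, written additively, with elements stored as dicts of finite support and the product given by convolution. It checks the translation property of δh and the zero-divisor example above, with g the element of order 2 in Z/4Z.

def convolve(f1, f2, n):
    # (f1 * f2)(g) = sum over h1 + h2 = g (mod n) of f1(h1) f2(h2)
    out = {}
    for h1, s in f1.items():
        for h2, t in f2.items():
            g = (h1 + h2) % n
            out[g] = out.get(g, 0) + s * t
    return {g: c for g, c in out.items() if c != 0}

n = 4
delta = lambda h: {h: 1}
f = {0: 2, 1: -1, 3: 5}
print(convolve(delta(1), f, n))     # a translate of f: {1: 2, 2: -1, 0: 5}

one_plus_g  = {0: 1, 2: 1}          # 1 + g, where g = 2 has order 2 in Z/4Z
one_minus_g = {0: 1, 2: -1}         # 1 - g
print(convolve(one_plus_g, one_minus_g, n))   # {}: the zero element of k[G]
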

5 Linear functions and matrices


Let F : k n → k m be a linear function. Then
F (t1 , . . . , tn ) = F (∑_i ti ei ) = ∑_i ti F (ei ).

We can write F (ei ) in terms of the basis e1 , . . . , em of k m : suppose that
F (ei ) = ∑_{j=1}^m aji ej = (a1i , . . . , ami ). We can then associate to F an m × n
matrix with coefficients in k as follows: Define

      ( a11  a12  . . .  a1n )
A =   (  ..   ..   ..    ..  )
      ( am1  am2  . . .  amn ).

Then the columns of A are the vectors F (ei ). Thus every linear map
F : k n → k m corresponds to an m × n matrix with coefficients in k. We
denote by Mm,n (k) the set of all such. Clearly Mm,n (k) is a vector space of
dimension mn, with basis

{Eij , 1 ≤ i ≤ m, 1 ≤ j ≤ n},

where Eij is the m × n matrix whose (i, j)th entry is 1 and such that all
other entries are 0. Equivalently, Mm,n (k) ∼ = k[X], where X = {1, . . . , m} ×
{1, . . . , n} is the set of ordered pairs (i, j) with 1 ≤ i ≤ m and 1 ≤ j ≤ n. If
m = n, we abbreviate Mm,n (k) by Mn (k); it is the space of square matrices
of size n.
Composition of linear maps corresponds to matrix multiplication: If
F : k n → k m and G : k m → k ` are linear maps corresponding to the matrices
A and B respectively, then G ◦ F : k n → k ` corresponds to the ` × n matrix
BA. Here, the (i, j) entry of BA is given by

∑_{k=1}^m bik akj .

Matrix multiplication is associative and distributes over addition of ma-


trices; moreover (tB)A = B(tA) = t(BA) if this is defined. The identity
matrix I ∈ Mn (k) is the matrix whose diagonal entries are 1 and such that
all other entries are 0; it corresponds to the identity map Id : k n → k n . Thus
as previously mentioned Mn (k) is a ring.
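
Over k = R this can be checked with NumPy (an illustrative sketch, not part of the notes): the matrix of a linear map F : k n → k m has F (ei ) as its ith column, and composing maps multiplies matrices.

import numpy as np

A = np.array([[1., 2.],
              [0., 1.],
              [3., 0.]])             # a linear map F : k^2 -> k^3
B = np.array([[1., 0., -1.],
              [2., 1.,  0.]])        # a linear map G : k^3 -> k^2

print(A @ np.array([1., 0.]))        # F(e_1) = the first column of A: [1. 0. 3.]

v = np.array([5., -2.])
print(B @ (A @ v))                   # G(F(v)) ...
print((B @ A) @ v)                   # ... agrees with the matrix BA applied to v
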
A matrix A ∈ Mn (k) is invertible if there exists a matrix B ∈ Mn (k)
such that BA = I. Then necessarily AB = I, and we write B = A−1 .
If F : k n → k n is the linear map corresponding to the matrix A, then A
is invertible ⇐⇒ F is an isomorphism ⇐⇒ F is injective ⇐⇒ F
is surjective. We have the following formula for two invertible matrices
A1 , A2 ∈ Mn (k): if A1 , A2 are invertible, then so is A1 A2 , and

(A1 A2 )−1 = A2−1 A1−1 .

The subset of Mn (k) consisting of invertible matrices is then a group, de-
noted by GLn (k).
Suppose that V1 and V2 are two finite dimensional vector spaces, and
choose bases v1 , . . . , vn of V1 and w1 , . . . , wm of V2 , so that n = dim V1 and
m = dim V2 . If F : V1 → V2 is linear, then, given the bases v1 , . . . , vn and
w1 , . . . , wm we again get an m × n matrix A by the formula: A = (aij ),
where aij is defined by

F (vi ) = ∑_{j=1}^m aji wj .

We say that A is the matrix associated to the linear map F and the bases
v1 , . . . , vn and w1 , . . . , wm . Given any collection of vectors u1 , . . . , un in V2 ,
there is a unique linear map F : V1 → V2 defined by F (vi ) = ui for all i.
In fact, if we let H : k n → V1 be the “change of basis” map defined by
H(t1 , . . . , tn ) = ∑_{i=1}^n ti vi , then H is an isomorphism and so has an inverse
H−1 , which is again linear. Let G : k n → V2 be the map G(t1 , . . . , tn ) =
∑_{i=1}^n ti ui . Then the map F is equal to G ◦ H−1 , proving the existence and
uniqueness of F .
A related question is the following: Suppose that v1 , . . . , vn is a basis of
k n and that w1 , . . . , wm is a basis of k m . Let F be a linear map k n → k m .
Then we have associated a matrix A to F , using the standard bases of k n
and k m , i.e. F (ei ) = ∑_{j=1}^m aji ej . We can also associate a matrix B to
F by using the bases v1 , . . . , vn and w1 , . . . , wm , i.e. F (vi ) = ∑_{j=1}^m bji wj .
What is the relationship between A and B? This question can be answered
most simply in terms of linear maps. Let G : k n → k m be the linear map
corresponding to the matrix B in the usual way: G(ei ) = ∑_{j=1}^m bji ej . As
above, let H1 : k n → k n be the “change of basis” map in k n defined by
H1 (ei ) = vi , and let H2 : k m → k m be the corresponding change of basis
map in k m , defined by H2 (ej ) = wj . Then clearly (by applying both sides
to vi for every i)
F = H2 ◦ G ◦ H1−1 .
Thus, if H1 corresponds to the n×n (square) matrix C1 and H2 corresponds
to the m × m matrix C2 , then F , which corresponds to the matrix A, also
corresponds to the matrix C2 · B · C1−1 , and thus we have the following
equality of m × n matrices:

A = C2 · B · C1−1 .

A case that often arises is when n = m, so that A and B are square matrices,
and the two bases v1 , . . . , vn and w1 , . . . , wn are equal. In this case C1 =

C2 = C, say, and the above formula reads

A = C · B · C −1 .

We say that A is obtained by conjugating B by C.


Similarly, suppose that V is a finite dimensional vector space and that
F : V → V is a linear map. Choosing a basis v1 , . . . , vn of V identifies F
with a square matrix A. Equivalently, if H1 : k n → V is the isomorphism
H1 (t1 , . . . , tn ) = ∑_i ti vi defined by the basis v1 , . . . , vn , then A corresponds
to the linear map G1 : k n → k n defined by G1 = H1−1 ◦ F ◦ H1 . Choosing
a different basis w1 , . . . , wn of V gives a different isomorphism H2 : k n → V
defined by H2 (t1 , . . . , tn ) = ∑_i ti wi , and hence a different matrix B corre-
sponding to G2 = H2−1 ◦ F ◦ H2 . Then, using (H2−1 ◦ H1 )−1 = H1−1 ◦ H2 , we
have
G2 = (H2−1 ◦ H1 ) ◦ G1 ◦ (H2−1 ◦ H1 )−1 = H ◦ G1 ◦ H −1 ,
say, where H = H2−1 ◦ H1 : k n → k n is invertible, and corresponds to the
invertible matrix C. Thus
B = CAC −1 .
It follows that two different choices of bases in V lead to matrices A, B ∈
Mn (k) which are conjugate.
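
A numerical check (illustrative sketch for k = R, V = R^2, not part of the notes): the matrices of one linear map F in two different bases are conjugate, with the conjugating matrix C coming from H2−1 ◦ H1 .

import numpy as np

M  = np.array([[2., 1.],
               [0., 3.]])            # F in the standard basis
P1 = np.array([[1., 1.],
               [0., 1.]])            # columns: the basis v1, v2 (the map H1)
P2 = np.array([[1., 0.],
               [1., 1.]])            # columns: the basis w1, w2 (the map H2)

A = np.linalg.inv(P1) @ M @ P1       # matrix of F in the basis v1, v2
B = np.linalg.inv(P2) @ M @ P2       # matrix of F in the basis w1, w2
C = np.linalg.inv(P2) @ P1           # matrix of H2^{-1} o H1

print(np.allclose(B, C @ A @ np.linalg.inv(C)))   # True: B = C A C^{-1}
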

Definition 5.1. Let V be a vector space. We let GL(V ) be the group of


isomorphisms from V to itself. If V is finite dimensional, with dim V = n,
then GL(V ) ∼ = GLn (k) by choosing a basis of V . More precisely, fixing a
basis {v1 , . . . , vn }, we define Φ : GL(V ) → GLn (k) by setting Φ(F ) = A,
where A is the matrix of F with respect to the basis {v1 , . . . , vn }. Choos-
ing a different basis {w1 , . . . , wn } replaces Φ by CΦC −1 , where C is the
appropriate change of basis matrix.

6 Determinant and trace


We begin by recalling the standard properties of the determinant. For every
n, we have a function det : Mn (k) → k, which is a linear function of the
columns of A (i.e. is a linear function of the ith column if we hold all of
the other columns fixed) and is alternating: given i ≠ j, 1 ≤ i, j ≤ n, if
A′ is the matrix obtained by switching the ith and jth columns of A, then
det A′ = − det A. It is uniquely determined by these properties and the normalizing condition that

det I = 1.

One can show that det A can be evaluated by expanding about the ith row
for any i:

det A = ∑_k (−1)^{i+k} aik det Aik .

Here (ai1 , . . . , ain ) is the ith row of A and Aik is the (n − 1) × (n − 1) matrix
obtained by deleting the ith row and k th column of A. det A can also be
evaluated by expanding about the j th column of A:

det A = ∑_k (−1)^{k+j} akj det Akj .

In terms of the symmetric group Sn ,


det A = ∑_{σ∈Sn} (sign σ) aσ(1),1 · · · aσ(n),n ,

where sign(σ) (sometimes denoted ε(σ)) is +1 if σ is even and −1 if σ is


odd.
For example, if A is an upper triangular matrix (aij = 0 if i > j), and
in particular if A is a diagonal matrix (aij = 0 for i ≠ j) then det A is the
product of the diagonal entries of A.
Another useful fact is the following:

det A = det(t A),

where, if A = (aij ), then t A is the matrix with (i, j)th entry aji .
We have the important result:

Proposition 6.1. For all n × n matrices A and B, det AB = det A det B.

Proposition 6.2. The matrix A is invertible ⇐⇒ det A ≠ 0, and in this


case
1
det(A−1 ) = .
det A
Hence det : GLn (k) → k ∗ is a homomorphism, where k ∗ is the multiplicative
group of nonzero elements of k.

Corollary 6.3. If C is an invertible n×n matrix and A is any n×n matrix,


then det A = det(CAC −1 ). Thus, the determinant of a linear map can be
read off from its matrix with respect to any basis.

Proof. Immediate from det(CAC −1 ) = det C · det A · det(C −1 ) = det C ·


det A · (det C)−1 = det A.

An important consequence of the corollary is the following: let V be a
finite dimensional vector space and let F : V → V be a linear map. Choosing
a basis v1 , . . . , vn for V gives a matrix A associated to F and the basis
v1 , . . . , vn . Changing the basis v1 , . . . , vn replaces A by CAC −1 for some
invertible matrix C. Since det A = det(CAC −1 ), in fact the determinant
det F is well-defined, i.e. independent of the choice of basis.
Thus for example if v1 , . . . , vn is a basis of k n and, for every i, there
exists a di ∈ k such that Avi = di vi , then det A = d1 · · · dn .
One important application of determinants is to finding eigenvalues and
eigenvectors. Recall that, if F : k n → k n (or F : V → V , where V is a
finite dimensional vector space) is a linear map, a nonzero vector v ∈ k n
(or V ) is an eigenvector of F with eigenvalue λ ∈ k if F (v) = λv. Since
F (v) = λv ⇐⇒ λv − F (v) = 0 ⇐⇒ v ∈ Ker(λ Id −F ), we see that λ
is an eigenvalue of F ⇐⇒ Ker(λ Id −F ) ≠ {0} ⇐⇒ det(λI − A) = 0, where
A is the matrix associated to F (and some choice of basis in the case of a
finite dimensional vector space V ). Defining the characteristic polynomial
pA (t) = det(tI − A), it is easy to see that pA is a polynomial of degree n
whose roots are the eigenvalues of F .
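
For a concrete 2 × 2 example over R (an illustrative NumPy sketch, not part of the notes), the characteristic polynomial is t^2 − (Tr A) t + det A, and its roots are the eigenvalues.

import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])
coeffs = [1., -np.trace(A), np.linalg.det(A)]     # t^2 - (Tr A) t + det A
print(np.roots(coeffs))                           # [3. 1.]
print(np.linalg.eigvals(A))                       # the same eigenvalues
print(np.linalg.det(3. * np.eye(2) - A))          # ~0, so 3 is an eigenvalue
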
A linear map F : V → V is diagonalizable if there exists a basis v1 , . . . , vn
of V consisting of eigenvectors; similarly for a matrix A ∈ Mn (k). This
says that the matrix A for F in the basis v1 , . . . , vn is a diagonal matrix
(the only nonzero entries are along the diagonal), and in fact aii = λi ,
where λi is the eigenvalue corresponding to the eigenvector vi . Hence, for
all positive integers d (and all integers if F is invertible), v1 , . . . , vn are also
eigenvectors for F d or Ad , and the eigenvalue corresponding to vi is λdi . If the
characteristic polynomial pA has n distinct roots, then A is diagonalizable.
In general, not every matrix is diagonalizable, even for an algebraically closed
field, but we do have the following:

Proposition 6.4. If k is algebraically closed, for example if k = C, and if


V is a finite dimensional vector space and F : V → V is a linear map, then
there exists at least one eigenvector for F .

Remark 6.5. Let A ∈ Mn (k). If f (t) = ∑_{i=0}^d ai t^i ∈ k[t] is a polynomial in
t, then we can apply f to the matrix A: f (A) = ∑_{i=0}^d ai A^i ∈ Mn (k). More-
over, evaluation at A defines a homomorphism evA : k[t] → Mn (k), viewing
Mn (k) as a (non-commutative) ring. There is a unique monic polynomial
mA (t) such that (i) mA (A) = 0 and (ii) if f (t) ∈ k[t] is any polynomial such
that f (A) = 0, then mA (t) divides f (t) in the ring k[t]. In fact, we can take
mA (t) to be the monic generator of the principal ideal Ker evA .

Then one has the Cayley-Hamilton theorem: If pA (t) is the characteristic
polynomial of A, then pA (A) = 0 and hence mA (t) divides pA (t). For
example, if A is diagonalizable with eigenvalues λ1 , . . . , λn , then

pA (t) = (t − λ1 ) · · · (t − λn ),

and it is easy to see that pA (A) = 0 in this case. On the other hand,
in this case the minimal polynomial is the product of the factors (t − λi )
for distinct eigenvalues λi . Hence we see that, in the case where A is
diagonalizable, mA (t) divides pA (t), but they are equal ⇐⇒ there are no
repeated eigenvalues.
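
A numerical check of these statements for k = R (illustrative sketch, not part of the notes; np.poly returns the coefficients of the characteristic polynomial of a square array):

import numpy as np

A = np.array([[2., 1., 0.],
              [0., 3., 0.],
              [0., 0., 5.]])
p = np.poly(A)                     # coefficients of p_A(t) = det(tI - A), leading term first
pA = sum(c * np.linalg.matrix_power(A, len(p) - 1 - i) for i, c in enumerate(p))
print(np.allclose(pA, 0))          # True: p_A(A) = 0 (Cayley-Hamilton)

D = np.diag([2., 2., 5.])          # diagonalizable, with the repeated eigenvalue 2
print(np.allclose((D - 2*np.eye(3)) @ (D - 5*np.eye(3)), 0))   # True: m_D(t) = (t-2)(t-5)
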
Finally, we define the trace of a matrix. If A = (aij ) ∈ Mn (k), we define

Tr A = ∑_{i=1}^n aii .

Thus the trace of a matrix is the sum of its diagonal entries. Clearly,
Tr : Mn (k) → k is a linear map. As such, it is much simpler than the
very nonlinear function det. However, they are related in several ways. For
example, expanding out the characteristic polynomial pA (t) = det(t Id −A)
as a function of t, one can check that

pA (t) = t^n − (Tr A) t^{n−1} + · · · + (−1)^n det A.

In a related interpretation, Tr is the derivative of det at the identity. One


similarity between Tr and det is the following:
Proposition 6.6. For all A, B ∈ Mn (k),

Tr(AB) = Tr(BA).

Proof. By definition of matrix multiplication, AB is the matrix whose (i, j)
entry is ∑_{r=1}^n air brj , and hence

Tr(AB) = ∑_{i=1}^n ∑_{r=1}^n air bri .

Since Tr(BA) is obtained by reversing the roles of A and B,


Tr(BA) = ∑_{i=1}^n ∑_{r=1}^n bir ari = ∑_{i=1}^n ∑_{r=1}^n ari bir = Tr(AB),

after switching the indices r and i.

Corollary 6.7. For A ∈ Mn (k) and C ∈ GLn (k),

Tr(CAC −1 ) = Tr A.

Hence, if V is a finite dimensional vector space and F : V → V is a linear


map, then Tr F is well-defined.
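
Both statements are easy to confirm numerically over k = R (illustrative sketch, not part of the notes):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
C = np.eye(3) + np.triu(rng.standard_normal((3, 3)), 1)   # invertible: unit upper triangular

print(np.isclose(np.trace(A @ B), np.trace(B @ A)))                 # True
print(np.isclose(np.trace(C @ A @ np.linalg.inv(C)), np.trace(A)))  # True
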

7 Direct sums
Definition 7.1. Let V1 and V2 be two vector spaces. We define the direct
sum or external direct sum V1 ⊕ V2 to be the product V1 × V2 , with + and
scalar multiplication defined componentwise:

(v1 , v2 ) + (w1 , w2 ) = (v1 + w1 , v2 + w2 ) for all v1 , w1 ∈ V1 and v2 , w2 ∈ V2 ;


t(v1 , v2 ) = (tv1 , tv2 ) for all v1 ∈ V1 , v2 ∈ V2 , and t ∈ k.

It is easy to see that, with this definition, V1 ⊕ V2 is a vector space. It


contains subspaces V1 ⊕ {0} ∼ = V1 and {0} ⊕ V2 ∼ = V2 , and every element of
V1 ⊕ V2 can be uniquely written as a sum of an element of V1 ⊕ {0} and an
element of {0} ⊕ V2 . Define i1 : V1 → V1 ⊕ V2 by i1 (v) = (v, 0), and similarly
define i2 : V2 → V1 ⊕ V2 by i2 (w) = (0, w). Then i1 and i2 are linear and i1
is an isomorphism from V1 to V1 ⊕ {0} and likewise for i2 .

The direct sum V1 ⊕ V2 has the following “universal” property:

Lemma 7.2. If V1 , V2 , W are vector spaces and F1 : V1 → W , F2 : V2 → W


are linear maps, then the function F1 ⊕ F2 : V1 ⊕ V2 → W defined by

(F1 ⊕ F2 )(v1 , v2 ) = F1 (v1 ) + F2 (v2 )

is linear, and, writing F = F1 ⊕ F2 , satisfies F (v1 , 0) = F1 (v1 ), F (0, v2 ) = F2 (v2 ), i.e. F1 = F ◦ i1 ,


F2 = F ◦ i2 . Conversely, given a linear map G : V1 ⊕ V2 → W , if we define
G1 : V1 → W by G1 (v1 ) = G(v1 , 0) and G2 : V2 → W by G2 (v2 ) = G(0, v2 ),
so that G1 = G ◦ i1 and G2 = G ◦ i2 , then G1 and G2 are linear and
G = G1 ⊕ G2 .

The direct sum of V1 and V2 also has a related property that makes it
into the product of V1 and V2 . Let π1 : V1 ⊕ V2 → V1 be the projection onto
the first factor: π1 (v1 , v2 ) = v1 , and similarly for π2 : V1 ⊕ V2 → V2 . It is
easy to check from the definitions that the πi are linear and surjective, with
π1 ◦i1 = Id on V1 and π2 ◦i2 = Id on V2 . Note that π1 ◦i2 = 0 and π2 ◦i1 = 0.

Lemma 7.3. If V1 , V2 , W are vector spaces and G1 : W → V1 , G2 : W → V2
are linear maps, then the function (G1 , G2 ) : W → V1 ⊕ V2 defined by

(G1 , G2 )(w) = (G1 (w), G2 (w))

is linear, and satisfies π1 ◦ (G1 , G2 ) = G1 , π2 ◦ (G1 , G2 ) = G2 . Conversely,


given a linear map G : W → V1 ⊕ V2 , if we define G1 : W → V1 by G1 = π1 ◦
G, G2 : W → V2 by π2 ◦G, then G1 and G2 are linear and G = (G1 , G2 ).

Remark 7.4. It may seem strange to introduce the notation ⊕ for the
Cartesian product. One reason for doing so is that we can define the direct
sum and direct product for an arbitrary collection Vi , i ∈ I of vector spaces
Vi . In this case, when I is infinite, the direct sum and direct product have
very different properties.

Finally, if we have vector spaces V, W, U, X and linear maps F : V → U


and G : W → X, we can define the linear map F ⊕ G : V ⊕ W → U ⊕ X
(same notation as Lemma 7.2) by the formula

(F ⊕ G)(v, w) = (F (v), G(w)).

The direct sum V1 ⊕ · · · ⊕ Vd = ⊕_{i=1}^d Vi of finitely many vector spaces
is similarly defined, and has similar properties. It is also easy to check for
example that

V1 ⊕ V2 ⊕ V3 ∼
= (V1 ⊕ V2 ) ⊕ V3 ∼
= V1 ⊕ (V2 ⊕ V3 ).

It is straightforward to check the following (proof as exercise):

Lemma 7.5. Suppose that v1 , . . . , vn is a basis for V1 and that w1 , . . . , wm


is a basis for V2 . Then (v1 , 0), . . . , (vn , 0), (0, w1 ), . . . , (0, wm ) is a basis for
V1 ⊕ V2 . Hence V1 and V2 are finite dimensional ⇐⇒ V1 ⊕ V2 is finite
dimensional, and in this case

dim(V1 ⊕ V2 ) = dim V1 + dim V2 .

In particular, we can identify k n ⊕ k m with k n+m . Of course, in some


sense we originally defined k n as k ⊕ · · · ⊕ k (n times).
A linear map F : k n → k r is the same thing as an r × n matrix A, and
a linear map G : k m → k r is the same thing as an r × m matrix B. It is
then easy to check that the linear map F ⊕ G : k n ⊕ k m ∼ = k n+m → k r is the
r × (n + m) block matrix ( A  B ). Likewise, given linear maps F : k r → k n and
G : k r → k m corresponding to matrices A, B, the linear map (F, G) : k r →
k n+m corresponds to the (n + m) × r block matrix with A above B. Finally, given
linear maps F : k n → k r and G : k m → k s corresponding to matrices A, B, the
linear map F ⊕ G : k n+m → k r+s corresponds to the (r + s) × (n + m) block diagonal matrix

( A  O )
( O  B ).

In particular, in the case n = r and m = s, A, B, and this block diagonal matrix
are square matrices (of sizes n, m, and n + m respectively). We thus have:

Proposition 7.6. In the above notation,

Tr(F ⊕ G) = Tr F + Tr G.
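
For k = R the block description and Proposition 7.6 can be checked directly (illustrative NumPy sketch, not part of the notes):

import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])                 # F : k^2 -> k^2
B = np.array([[5., 0., 1.],
              [0., 6., 0.],
              [0., 0., 7.]])             # G : k^3 -> k^3

M = np.block([[A, np.zeros((2, 3))],
              [np.zeros((3, 2)), B]])    # matrix of F ⊕ G on k^2 ⊕ k^3 = k^5
print(M.shape)                           # (5, 5)
print(np.trace(M), np.trace(A) + np.trace(B))   # 23.0 23.0
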

8 Internal direct sums


The term “external direct sum” suggests that there should also be a notion
of an “internal direct sum.” Suppose that W1 and W2 are subspaces of a
vector space V . Define

W1 + W2 = {w1 + w2 : w1 ∈ W1 , w2 ∈ W2 }.

It is easy to see directly that W1 + W2 is a subspace of V . To see this


another way, let i1 : W1 → V and i2 : W2 → V denote the inclusions, which
are linear. Then as we have seen above there is an induced linear map
i1 ⊕ i2 : W1 ⊕ W2 → V , defined by

(i1 ⊕ i2 )(w1 , w2 ) = w1 + w2 ,

viewed as an element of V . Thus W1 + W2 = Im(i1 ⊕ i2 ), and hence is


a subspace of V by Lemma 2.2. The sum W1 + · · · + Wd of an arbitrary
number of subspaces is defined in a similar way.

Definition 8.1. The vector space V is an internal direct sum of the two
subspaces W1 and W2 if the linear map (i1 ⊕ i2 ) : W1 ⊕ W2 → V defined
above is an isomorphism.

We shall usually omit the word “internal” and understand the statement
that V is a direct sum of the subspaces W1 and W2 to always mean that it
is the internal direct sum.

Proposition 8.2. Let W1 and W2 be subspaces of a vector space V . Then


V is the direct sum of W1 and W2 ⇐⇒ the following two conditions hold:

(i) W1 ∩ W2 = {0}.

(ii) V = W1 + W2 .
Proof. The map (i1 ⊕ i2 ) : W1 ⊕ W2 → V is an isomorphism ⇐⇒ it is
injective and surjective. Now Ker(i1 ⊕ i2 ) = {(w1 , w2 ) ∈ W1 ⊕ W2 : w1 +
w2 = 0}. But w1 + w2 = 0 ⇐⇒ w2 = −w1 , and this is only possible if
w1 ∈ W1 ∩ W2 . Hence

Ker(i1 ⊕ i2 ) = {(w, −w) ∈ W1 ⊕ W2 : w ∈ W1 ∩ W2 }.

In particular, W1 ∩ W2 = {0} ⇐⇒ Ker(i1 ⊕ i2 ) = {0} ⇐⇒ i1 ⊕ i2 is


injective. Finally, since Im(i1 ⊕ i2 ) = W1 + W2 , we see that W1 + W2 = V
⇐⇒ i1 ⊕ i2 is surjective. Putting this together, V is the direct sum of W1
and W2 ⇐⇒ i1 ⊕ i2 is both injective and surjective ⇐⇒ both (i) and (ii)
above hold.

More generally, it is easy to check the following:


Proposition 8.3. Let W1 , . . . , Wd be subspaces of a vector space V . Then V
is the direct sum of W1 , . . . , Wd , i.e. the induced linear map W1 ⊕· · ·⊕Wd →
V is an isomorphism, ⇐⇒ the following two conditions hold:
(i) Given wi ∈ Wi , if w1 + · · · + wd = 0, then wi = 0 for every i.

(ii) V = W1 + · · · + Wd .
In general, to write a finite dimensional vector space as a direct sum
W1 ⊕ · · · ⊕ Wd , where none of the Wi is 0, is to decompose the vector
space into hopefully simpler pieces. For example, every finite dimensional
vector space V is a direct sum V = L1 ⊕ · · · ⊕ Ln , where the Li are one
dimensional subspaces. Note however that, if V is one dimensional, then it
is not possible to write V ∼= W1 ⊕ W2 , where both of W1 , W2 are nonzero.
So, in this sense, the one dimensional vector spaces are the building blocks
for all finite dimensional vector spaces.
Definition 8.4. If W is a subspace of V , then a complement to W is a
subspace W ′ of V such that V is the direct sum of W and W ′ . Given W ,
it is easy to check that a complement to W always exists, and in fact there
are many of them.
One important way to realize V as a direct sum is as follows: suppose
that V = W1 ⊕W2 . On the vector space W1 ⊕W2 , we have the projection map
p1 : W1 ⊕ W2 → W1 ⊕ W2 defined by p1 (w1 , w2 ) = (w1 , 0). Let p : V → W1

denote the corresponding linear map via the isomorphism V ∼ = W1 ⊕ W2 .
Concretely, given v ∈ V , p(v) is defined as follows: write v (uniquely) as
w1 + w2 , where w1 ∈ W1 and w2 ∈ W2 . Then p(v) = w1 . The linear map p
has the following properties:

1. Im p = W1 .

2. For all w ∈ W1 , p(w) = w.

3. Ker p = W2 .

The next proposition says that we can reverse this process, and will be
a basic tool in decomposing V as a direct sum.

Proposition 8.5. Let V be a vector space and let p : V → V be a linear map.


Let W be the subspace Im p, and suppose that, for all w ∈ W , p(w) = w.
Then, for W ′ = Ker p, V is the direct sum of W and W ′ .

Proof. We must show that W and W ′ satisfy (i) and (ii) of Proposition 8.3.
First suppose that w ∈ W ∩ W ′ = W ∩ Ker p. Then p(w) = w since w ∈ W ,
and p(w) = 0 since w ∈ W ′ = Ker p. Hence w = 0, i.e. W ∩ W ′ = {0}. Now
let v ∈ V . Then w = p(v) ∈ W , and v = p(v) + (v − p(v)) = w + w′ , say,
where we have set w = p(v) and w′ = v − p(v) = v − w. Since w ∈ W ,

p(v − w) = p(v) − p(w) = w − w = 0,

where we have used p(w) = w since w ∈ W . Thus w′ ∈ Ker p and hence
v = w + w′ , where w ∈ W and w′ ∈ W ′ . Hence W + W ′ = V . It follows
that both (i) and (ii) of Proposition 8.3 are satisfied, and so V is the direct
sum of W and W ′ .

Definition 8.6. Let V be a vector space and W a subspace. A linear map


p : V → V is a projection onto W if Im p = W and p(w) = w for all w ∈ W .
Note that p is not in general uniquely determined by W .

Proposition 8.7. Let V be a finite dimensional vector space and let W be


a subspace. If p : V → V is a projection onto W , then

Tr p = dim W.

Proof. There exists a basis w1 , . . . , wa , wa+1 , . . . , wn of V for which w1 , . . . , wa


is a basis of W and wa+1 , . . . , wn is a basis of Ker p. Here a = dim W . In this
basis, the matrix for p is a diagonal matrix A with diagonal entries aii = 1
for i ≤ a and aii = 0 for i > a. Thus Tr p = Tr A = a = dim W .
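
A concrete (non-orthogonal) example over R, as an illustrative sketch that is not part of the notes: the projection of R^3 onto W = span{e1 , e2 } along the complement spanned by (1, 1, 1).

import numpy as np

P = np.array([[1., 0., -1.],
              [0., 1., -1.],
              [0., 0.,  0.]])            # p(x, y, z) = (x - z, y - z, 0)

print(np.allclose(P @ P, P))             # True: applying p twice is the same as once
print(P @ np.array([4., -2., 0.]))       # vectors already in W are fixed: [ 4. -2.  0.]
print(P @ np.array([1., 1., 1.]))        # the complement is the kernel: [0. 0. 0.]
print(np.trace(P))                       # 2.0 = dim W
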

9 New vector spaces from old
There are many ways to construct new vector spaces out of one or more
given spaces; the direct sum V1 ⊕ V2 is one such example. As we saw there,
we also want to be able to construct new linear maps from old ones. We
give here some other examples:
Quotient spaces: Let V be a vector space, and let W be a subspace of
V . Then, in particular, W is a subgroup of V (under addition), and since
V is abelian W is automatically normal. Thus we can consider the group of
cosets
V /W = {v + W : v ∈ V }.
It is a group under coset addition. Also, given a coset v + W , we can define
scalar multiplication by the rule

t(v + W ) = tv + W.

We must check that this is well-defined (independent of the choice of repre-


sentative of the coset v + W ): if v and v + w are two elements of the same
coset, then t(v + w) = tv + tw, and this is in the same coset as tv since
tw ∈ W . It is then easy to check that the vector space axioms hold for V /W
under coset addition and scalar multiplication as defined above.
If V is finite dimensional, then so is W . Let w1 , . . . , wa be a basis
for W . As we have seen, we can complete the linearly independent vectors
w1 , . . . , wa to a basis w1 , . . . , wa , wa+1 , . . . , wn for V . It is then easy to check
that wa+1 + W, . . . , wn + W form a basis for V /W . In particular,

dim(V /W ) = dim V − dim W.

We have the natural surjective homomorphism π : V → V /W , and it is easy


to check that it is linear. Given any linear map F : V /W → U , where U is
a vector space, G = F ◦ π : V → U is a linear map satisfying: G(w) = 0 for
all w ∈ W . We can reverse this process: If G : V → U is a linear map such
that G(w) = 0 for all w ∈ W , then define F : V /W → U by the formula:
F (v + W ) = G(v). If we replace v by some other representative in the coset
v + W , necessarily of the form v + w with w ∈ W , then G(v + w) = G(v) +
G(w) = G(v) is unchanged. Hence the definition of F is independent of the
choice of representative, so there is an induced homomorphism F : V /W →
U . It is easy to check that F is linear. Summarizing:
Proposition 9.1. Let V be a vector space and let W be a subspace. For a
vector space U , there is a bijection (described above) from the set of linear

maps F : V /W → U to the set of linear maps G : V → U such that G(w) =
0 for all w ∈ W .

Dual spaces and spaces of linear maps: Let V be a vector space and
define the dual vector space V ∗ (sometimes written V ∨ ) by:

V ∗ = {F ∈ k V : F is linear}.

(Recall that k V denotes the set of functions from V to k.) Thus the elements
of V ∗ are linear maps V → k. Similarly, if V and W are two vector spaces,
define
Hom(V, W ) = {F ∈ W V : F is linear}.
(Here, again, W V denotes the set of functions from V to W .) Then the
elements of Hom(V, W ) are linear maps F : V → W , and V ∗ = Hom(V, k).

Proposition 9.2. Hom(V, W ) is a vector subspace of W V , and hence a


vector space in its own right under pointwise addition and scalar multiplica-
tion. In particular, V ∗ is a vector space under pointwise addition and scalar
multiplication.

Proof. By definition, Hom(V, W ) is a subset of W V . It is closed under


pointwise addition since, if F1 : V → W and F2 : V → W are both linear,
then so is F1 + F2 (proof as exercise). Similarly, Hom(V, W ) is closed under
scalar multiplication. Since 0 ∈ Hom(V, W ), and Hom(V, W ) is a subgroup
of W V closed under scalar multiplication, Hom(V, W ) is a vector subspace
of W V .

Suppose that V is finite dimensional with basis v1 , . . . , vn . We define the


dual basis v1∗ , . . . , vn∗ of V ∗ as follows: vi∗ : V → k is the unique linear map
such that vi∗ (vj ) = 0 if i ≠ j and vi∗ (vi ) = 1. (Recall that every linear map
V → k is uniquely specified by its values on a basis, and every possible set of
values arises in this way.) We summarize the above requirement by writing
vi∗ (vj ) = δij , where δij , the Kronecker delta function, is defined to be 1 if
i = j and 0 otherwise. It is easy to check that v1∗ , . . . , vn∗ is in fact a basis
of V ∗ , hence dim V ∗ = dim V if V is finite dimensional. In particular, for
V = k n with the standard basis e1 , . . . , en , V ∗ ∼ = k n with basis e∗1 , . . . , e∗n .
But this isomorphism changes in a complicated way with respect to linear
maps, and cannot be defined intrinsically (i.e. without choosing a basis).
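
For V = k n with k = R, the dual basis can be computed explicitly (illustrative sketch, not part of the notes): if the columns of a matrix P are the basis vectors v1 , . . . , vn , then the functionals vi∗ are the rows of P −1 , since the (i, j) entry of P −1 P is δij .

import numpy as np

P = np.array([[1., 1.],
              [0., 1.]])            # columns: v1 = (1, 0), v2 = (1, 1)
dual = np.linalg.inv(P)             # row i is the functional v_i^*

print(dual @ P)                     # the identity matrix: v_i^*(v_j) = delta_ij
v = np.array([3., 2.])
print(dual @ v)                     # coordinates of v in the basis v1, v2: [1. 2.]
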
More generally, suppose that V and W are both finite dimensional,
with bases v1 , . . . , vn and w1 , . . . , wm respectively. Define the linear map

vi∗ wj : V → W as follows:

vi∗ wj (v` ) = 0 if ` ≠ i, and vi∗ wj (v` ) = wj if ` = i.

In other words, for all v ∈ V , (vi∗ wj )(v) = vi∗ (v)wj . Suppose that F : V → W
corresponds to the matrix A = (aij ) with respect to the bases v1 , . . . , vn
and w1 , . . . , wm . In other words, F (vi ) = ∑_{j=1}^m aji wj . Then by definition
F = ∑_{i,j} aji vi∗ wj . It then follows that the vi∗ wj , 1 ≤ i ≤ n, 1 ≤ j ≤ m,

are a basis for Hom(V, W ) (we saw a special case of this earlier for V ∼ = kn ,
W ∼= k m , where we denoted e∗j ei by Eij ). Thus

dim Hom(V, W ) = (dim V )(dim W ).

The “universal” properties of direct sums can be expressed as follows:

Proposition 9.3. Let V, W, U be vector spaces. Then

Hom(V ⊕ W, U ) ∼
= Hom(V, U ) ⊕ Hom(W, U );
Hom(U, V ⊕ W ) ∼
= Hom(U, V ) ⊕ Hom(U, W ).

In fact, the statements of Lemma 7.2 and Lemma 7.3 give an explicit
construction of the isomorphisms.
Next we see how linear maps behave for these constructions. Suppose
that V1 and V2 are two vector spaces and that G : V1 → V2 is a linear map.
Given an element F of V2∗ , in other words a linear map F : V2 → k, we can
consider the composition on the right with G: define

G∗ (F ) = F ◦ G : V1 → k.

Since the composition of two linear maps is linear, G∗ (F ) ∈ V1∗ , and a


straightforward calculation shows that G∗ (F1 + F2 ) = G∗ (F1 ) + G∗ (F2 ) and
that G∗ (tF ) = tG∗ (F ), so that G∗ is linear. In fact, one can show the
following (proof omitted):

Proposition 9.4. The function G ↦ G∗ is a linear function from the vector


space Hom(V1 , V2 ) to the vector space Hom(V2∗ , V1∗ ), and it is an isomor-
phism if V1 and V2 are finite dimensional.

Remark 9.5. In case V1 = k n and V2 = k m , a linear map from V1 to V2 is the


same thing as an m×n matrix A = (aij ). Using the isomorphisms (k n )∗ ∼ = kn
and (k m )∗ ∼
= k m via the dual bases e∗i , there is a natural isomorphism from

Hom(k n , k m ) = Mm,n (k) to Hom(k m , k n ) = Mn,m (k). In other words, we
have a natural way to associate an n × m matrix to the m × n matrix A. It
is easy to check that this is the transpose matrix
t A = (aji ).

Clearly, A ↦ t A is an isomorphism from Mm,n (k) to Mn,m (k). In fact, its

inverse is again transpose, and more precisely we have the obvious formula:
for all A ∈ Mm,n (k),

tt A = t (t A) = A.
Note the reversal of order in Proposition 9.4. In fact, this extends to
compositions as follows:
Proposition 9.6. Suppose that V1 , V2 , V3 are vector spaces and that G1 : V1 →
V2 and G2 : V2 → V3 are linear maps. Then

(G2 ◦ G1 )∗ = G∗1 ◦ G∗2 .

Proof. By definition, given F ∈ V3∗ ,

(G2 ◦ G1 )∗ (F ) = F ◦ (G2 ◦ G1 ) = (F ◦ G2 ) ◦ G1 ,

since function composition is associative. On the other hand,

G∗1 ◦ G∗2 (F ) = G∗1 (G∗2 (F )) = G∗1 (F ◦ G2 ) = (F ◦ G2 ) ◦ G1 ,

proving the desired equality.

Remark 9.7. For the case of matrices, i.e. V1 = k n , V2 = k m , V3 = k p , the


reversal of order in Proposition 9.6 is the familiar formula
t (AB) = t B · t A.

The change in the order above is just a fact of life, and it is the only
possible formula along these lines if we want to line up domains and ranges
in the right way. However, there is one case where we can keep the original
order of composition. Suppose that V1 = V2 = V , say, and that G is
invertible, i.e. is a linear isomorphism. Then we have the operation G ↦
G−1 , and it also reverses the order of operations: (G1 ◦ G2 )−1 = G2−1 ◦ G1−1 .
So if we combine this with G ↦ G∗ , we get a function G ↦ (G−1 )∗ , from
invertible linear maps on V to invertible linear maps on V ∗ , given as follows:


if F ∈ V ∗ , i.e. F : V → k is a linear function, define

G · F = F ◦ (G−1 ).

(Note: a calculation shows that, if G is invertible, then G∗ is invertible, and
in fact we have the formula (G∗ )−1 = (G−1 )∗ , so that these two operations
commute.) In this case, the order is preserved, since, if G1 , G2 are two
isomorphisms, then
((G1 ◦ G2 )−1 )∗ = (G2−1 ◦ G1−1 )∗ = (G1−1 )∗ ◦ (G2−1 )∗ ,

and so the order is unchanged. For example, taking V1 = V2 = k n in the


above notation, for all G1 , G2 ∈ GLn (k) and F ∈ Hom(k n , W ),
(G1 ◦ G2 ) · F = G1 · (G2 · F ).
This says that the above defines a group action of GLn (k) on Hom(k n , W ).
In the case of Hom(V, W ), we have to consider linear maps for both
the domain and range. Given two vector spaces V1 and V2 and a linear
map G : V1 → V2 , we can define G∗ as before, by right composition: if
F ∈ Hom(V2 , W ), in other words if F : V2 → W is a linear map, we can
consider the linear map G∗ F = F ◦ G : V1 → W . Thus G∗ is a map from
Hom(V2 , W ) to Hom(V1 , W ), and it is linear. By the same kind of arguments
as above for the dual space, if V1 , V2 , V3 are vector spaces and G1 : V1 → V2
and G2 : V2 → V3 are linear maps, then
(G2 ◦ G1 )∗ = G∗1 ◦ G∗2 .
On the other hand, given two vector spaces W1 and W2 and a linear map
H : W1 → W2 , we can define a function H∗ : Hom(V, W1 ) → Hom(V, W2 )
by using left composition: for F ∈ Hom(V, W1 ), we set
H∗ (F ) = H ◦ F.
Note that the composition is linear, since both F and H were assumed
linear, and hence H∗ (F ) ∈ Hom(V, W2 ). It is again easy to check that H∗ is
a linear map from Hom(V, W1 ) to Hom(V, W2 ). Left composition preserves
the order, though, since given H1 : W1 → W2 and H2 : W2 → W3 , we have
(H2 ◦ H1 )∗ (F ) = (H2 ◦ H1 ) ◦ F = H2 ◦ (H1 ◦ F )
= (H2 )∗ ((H1 )∗ (F )) = ((H2 )∗ ◦ (H1 )∗ )(F ).
Finally, suppose given G : V1 → V2 and H : W1 → W2 . We use the same sym-
bol to denote G∗ : Hom(V2 , W1 ) → Hom(V1 , W1 ) and G∗ : Hom(V2 , W2 ) →
Hom(V1 , W2 ), and also use H∗ to denote both H∗ : Hom(V1 , W1 ) → Hom(V1 , W2 )
and H∗ : Hom(V2 , W1 ) → Hom(V2 , W2 ). With this understanding, given
F : V2 → W1 ,
G∗ ◦ H∗ (F ) = H ◦ F ◦ G = H∗ ◦ G∗ (F ).
Hence:

Proposition 9.8. The functions G∗ ◦ H∗ : Hom(V2 , W1 ) → Hom(V1 , W2 )
and H∗ ◦ G∗ : Hom(V2 , W1 ) → Hom(V1 , W2 ) agree.

We paraphrase this by saying that right and left composition commute.


Equivalently, there is a commutative diagram

                     H∗
Hom(V2 , W1 ) −−−−−−→ Hom(V2 , W2 )
      |                     |
   G∗ |                     | G∗
      ↓                     ↓
Hom(V1 , W1 ) −−−−−−→ Hom(V1 , W2 ).
                     H∗

More on duality: Let V be a vector space and V ∗ its dual. If V = k n ,


then V ∗ ∼ = k n , and in fact a reasonable choice of basis is the dual basis
e∗1 , . . . , e∗n . Thus, if V is a finite dimensional vector space, then V ∼ = V∗
since both are isomorphic to k n . However, the isomorphism depends on the
choice of a basis of V .
However, let V ∗∗ = (V ∗ )∗ be the dual of the dual of V . We call V ∗∗ the
double dual of V . There is a “natural” linear map ev : V → V ∗∗ defined as
follows: First, define the function E : V ∗ × V → k by

E(f, v) = f (v),

and then set


ev(v)(f ) = E(f, v) = f (v).
In other words, ev(v) is the function on V ∗ defined by evaluation at v. For
example, if V = k n with basis e1 , . . . , en , and e∗1 , . . . , e∗n is the dual basis of
(k n )∗ , then it is easy to see that

ev(ei )(e∗j ) = δij .

This says that ev(ei ) = ei∗∗ , where e1∗∗ , . . . , en∗∗ is the dual basis to e∗1 , . . . , e∗n .
Thus:

Proposition 9.9. If V is finite dimensional, then ev : V → V ∗∗ is an iso-


morphism.

Remark 9.10. If V is not finite dimensional, then V ∗ and hence V ∗∗ tend to


be much larger than V , and so the map V → V ∗∗ , which is always injective,
is not surjective.

Another example of duality arises as follows: Given two vector spaces V
and W and F ∈ Hom(V, W ), we have defined F ∗ : Hom(W, k) = W ∗ →
Hom(V, k) = V ∗ , and the linear map Hom(V, W ) → Hom(W ∗ , V ∗ ) de-
fined by F ↦ F ∗ is an isomorphism. Repeating this construction, given
F ∈ Hom(V, W ) we have defined F ∗ ∈ Hom(W ∗ , V ∗ ), and thus F ∗∗ ∈
Hom(V ∗∗ , W ∗∗ ). If V and W are finite dimensional, then V ∗∗ ∼ = V , by
an explicit isomorphism which is the inverse of ev, and similarly W ∗∗ ∼
= W.
Thus we can view F ∗∗ as an element of Hom(V, W ), and it is straightforward
to check that, via this identification,
F ∗∗ = F.
In fact, translating this statement into the case V = k n , W = k m , this
statement reduces to the identity tt A = A for all A ∈ Mm,n (k) of Remark 9.5.

10 Tensor products
Tensor products are another very useful way of producing new vector spaces
from old. We begin by recalling a definition from linear algebra:
Definition 10.1. Let V, W, U be vector spaces. A function F : V × W → U
is bilinear if it is linear in each variable when the other is held fixed: for all
w ∈ W , the function f (v) = F (v, w) is a linear function from V to U , and
for all v ∈ V , the function g(w) = F (v, w) is a linear function from W to U .
Multilinear maps F : V1 × V2 × · · · × Vn → U are defined in a similar way.
Bilinear maps occur throughout linear algebra. For example, for any field
k we have the standard inner product, often denoted by ⟨·, ·⟩ : k n × k n → k,
and it is defined by: if v = (v1 , . . . , vn ) and w = (w1 , . . . , wn ), then

⟨v, w⟩ = ∑_{i=1}^n vi wi .

More generally, if B : k n × k m → k is any bilinear function, then it is easy to


see that there exist unique aij ∈ k such that, for all v = (v1 , . . . , vn ) ∈ k n ,
w = (w1 , . . . , wm ) ∈ k m ,

B(v, w) = ∑_{i,j} aij vi wj .

Here, aij = B(ei , ej ) and the formula is obtained by expanding out


B(∑_i vi ei , ∑_j wj ej )

using the defining properties of bilinearity.
Another important example of a bilinear function is composition of linear
maps: given three vector spaces V, W, U , we have the composition function

C : Hom(V, W ) × Hom(W, U ) → Hom(V, U )

defined by C(F, G) = G ◦ F , and it is easily checked to be bilinear. Corre-


spondingly, we have matrix multiplication

Mn,m (k) × Mm,` (k) → Mn,` (k),

and the statement that matrix multiplication distributes over matrix addi-
tion (on both sides) and commutes with scalar multiplication of matrices (in
the sense that (tA)B = A(tB) = t(AB)) is just the statement that matrix
multiplication is bilinear.
The tensor product V ⊗ W of two vector spaces is a new vector space
which is often most usefully described by a “universal property” with respect
to bilinear maps: First, for all v ∈ V , w ∈ W , there is a symbol v ⊗ w ∈
V ⊗ W , which is bilinear, i.e. for all v, vi ∈ V , w, wi ∈ W and t ∈ k,

(v1 + v2 ) ⊗ w = (v1 ⊗ w) + (v2 ⊗ w);


v ⊗ (w1 + w2 ) = (v ⊗ w1 ) + (v ⊗ w2 );
(tv) ⊗ w = v ⊗ (tw) = t(v ⊗ w).

Second, for every vector space U and bilinear function F : V × W → U ,


there is a unique linear function F̂ : V ⊗ W → U such that, for all v ∈ V
and w ∈ W , F (v, w) = F̂ (v ⊗ w).
One can show that, for two vector spaces V and W , the tensor product
V ⊗ W exists and is uniquely characterized, up to a unique isomorphism,
by the above universal property. The construction is not very illuminating:
even if V and W are finite dimensional, the construction of V ⊗W starts with
the very large (infinite dimensional if k is infinite) vector space k[V ×W ], the
free vector space corresponding to the set V ×W and then taking the quotient
by the subspace generated by elements of the form (v1 + v2 , w) − (v1 , w) −
(v2 , w), (v, w1 + w2 ) − (v, w1 ) − (v, w2 ), (tv, w) − t(v, w), (v, tw) − t(v, w).
The important features of the tensor product are as follows:

1. If V and W are finite dimensional, v1 , . . . , vn is a basis of V and


w1 , . . . , wm is a basis of W , then vi ⊗ wj , 1 ≤ i ≤ n, 1 ≤ j ≤ m, is a
basis of V ⊗ W . Thus, in this case V ⊗ W is finite dimensional as well
and
dim(V ⊗ W ) = (dim V ) · (dim W ).

2. Given vector spaces V1 , V2 , W1 , W2 and linear maps G : V1 → V2 ,
H : W1 → W2 , then there is an induced linear map, denoted G ⊗ H,
from V1 ⊗ W1 to V2 ⊗ W2 . It satisfies: for all v ∈ V1 and w ∈ W1 ,

(G ⊗ H)(v ⊗ w) = G(v) ⊗ H(w).

Moreover, given another pair of vector spaces V3 , W3 , and linear maps


G′ : V2 → V3 and H ′ : W2 → W3 , then

(G′ ⊗ H ′ ) ◦ (G ⊗ H) = (G′ ◦ G) ⊗ (H ′ ◦ H).

3. There is a “natural” isomorphism S : V ⊗ W ∼


= W ⊗ V , which satisfies

S(v ⊗ w) = w ⊗ v.

Thus, as with direct sum (but unlike Hom), tensor product is sym-
metric.

4. There is a “natural” isomorphism

(V1 ⊕ V2 ) ⊗ W ∼
= (V1 ⊗ W ) ⊕ (V2 ⊗ W ),

and similarly for the second factor, so that tensor product distributes
over direct sum.

5. There are “natural” isomorphisms

V1 ⊗ (V2 ⊗ V3 ) ∼
= (V1 ⊗ V2 ) ⊗ V3 ∼
= V1 ⊗ V2 ⊗ V3 ,

where by definition V1 ⊗ V2 ⊗ V3 has a universal property with respect


to trilinear maps V1 × V2 × V3 → U .

6. There is a “natural” linear map L from V ∗ ⊗W to Hom(V, W ), which is


an isomorphism if V is finite dimensional. (Note that this is consistent
with the statement that, if both V and W are finite dimensional, then
dim(V ⊗ W ) = dim(Hom(V, W )) = (dim V )(dim W ).) Here natural
means in particular the following: Suppose that V1 , V2 , W1 , W2 are
vector spaces, and that G : V1 → V2 and H : W1 → W2 are linear
maps. We have defined the linear map H∗ ◦ G∗ = G∗ ◦ H∗ from
Hom(V2 , W1 ) to Hom(V1 , W2 ). On the other hand, we have the linear
map G∗ : V2∗ → V1∗ and hence there is a linear map G∗ ⊗H : V2∗ ⊗W1 →
V1∗ ⊗ W2 , and these commute with the isomorphisms L : V2∗ ⊗ W1 →

Hom(V2 , W1 ) and L′ : V1∗ ⊗ W2 → Hom(V1 , W2 ) in the sense that the
following diagram is commutative:

                      L
V2∗ ⊗ W1 −−−−−−→ Hom(V2 , W1 )
       |                    |
G∗ ⊗ H |                    | G∗ ◦ H∗
       ↓                    ↓
V1∗ ⊗ W2 −−−−−−→ Hom(V1 , W2 ).
                      L′

In particular, if V is finite dimensional then there is a “natural” iso-


morphism from Hom(V, V ) to V ∗ ⊗ V . Using this, we can give an intrinsic
definition of the trace as follows:

Proposition 10.2. The function E : V ∗ × V → k defined by

E(f, v) = f (v)

is bilinear. Using the isomorphism V ∗ ⊗ V ∼


= Hom(V, V ), the corresponding
linear map V ∗ ⊗ V → k is identified with the trace.

Warning: In general, it can be somewhat tricky to work with tensor


products. For example, every element of a tensor product V ⊗ W can be
written as a finite sum of the form ∑_i vi ⊗ wi , where vi ∈ V and wi ∈ W .
But it is not in general possible to write every element of V ⊗ W in the
form v ⊗ w, for a single choice of v ∈ V , w ∈ W . Also, the expression of
an element as ∑_i vi ⊗ wi is far from unique. What this means in practice,
for example, is that we cannot try to define a linear map G : V ⊗ W → U
by simply defining G on elements of the form v ⊗ w, and expect that G is
always well defined. Instead, it is best to use the universal property of the
tensor product when attempting to define linear maps from V ⊗ W to some
other vector space.
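
For V = k n and W = k m with k = R, the tensor product can be made concrete with NumPy (an illustrative sketch, not part of the notes): in the basis {vi ⊗ wj }, the matrix of G ⊗ H is the Kronecker product np.kron(G, H), compatible with composition as in property 2 above, and a simple tensor v ⊗ w corresponds to a rank-one matrix, which illustrates the warning that not every element has this form.

import numpy as np

G  = np.array([[1., 2.],
               [0., 1.]])
H  = np.array([[0., 1., 1.],
               [1., 0., 2.],
               [0., 0., 1.]])
Gp = np.array([[2., 0.],
               [1., 1.]])
Hp = 3. * np.eye(3)

print(np.kron(G, H).shape)                            # (6, 6): dim(V ⊗ W) = 2 * 3
lhs = np.kron(Gp, Hp) @ np.kron(G, H)
rhs = np.kron(Gp @ G, Hp @ H)
print(np.allclose(lhs, rhs))                          # True: (G' ⊗ H')(G ⊗ H) = (G'G) ⊗ (H'H)

v, w = np.array([1., 2.]), np.array([3., 1.])
print(np.linalg.matrix_rank(np.outer(v, w)))          # 1: a simple tensor v ⊗ w
print(np.linalg.matrix_rank(np.eye(2)))               # 2: e1 ⊗ e1 + e2 ⊗ e2 is not simple
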
