William A. Adkins
Steven H. Weintraub
Algebra
An Approach via Module Theory
Springer-Verlag
Graduate Texts in Mathematics 136
William A. Adkins
Steven H. Weintraub
Department of Mathematics
Louisiana State University
Baton Rouge, LA 70803
USA
Editorial Board
S. Axler, Mathematics Department, San Francisco State University, San Francisco, CA 94132, USA
F.W. Gehring, Mathematics Department, East Hall, University of Michigan, Ann Arbor, MI 48109, USA
K.A. Ribet, Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720-3840, USA
This chapter also has Dixon's proof of a criterion for similarity of matrices
based solely on rank computations.
In Chapter 6 we discuss duality and investigate bilinear, sesquilinear,
and quadratic forms, with the assistance of module theory, obtaining com-
plete results in a number of important special cases. Among these are the
cases of skew-symmetric forms over a PID, sesquilinear (Hermitian) forms
over the complex numbers, and bilinear and quadratic forms over the real
numbers, over finite fields of odd characteristic, and over the field with two
elements (where the Arf invariant enters in the case of quadratic forms).
Chapter 7 has two sections. The first discusses semisimple rings and
modules (deriving Wedderburn's theorem), and the second develops some
multilinear algebra. Our results in both of these sections are crucial for
Chapter 8.
Our final chapter, Chapter 8, is the capstone of the book, dealing with
group representations mostly, though not entirely, in the semisimple case.
Although perhaps not the most usual of topics in a first-year graduate
course, it is a beautiful and important part of mathematics. We view a
representation of a group G over a field F as an F(G)-module, and so this
chapter applies (or illustrates) much of the material we have developed in
this book. Particularly noteworthy is our treatment of induced representa-
tions. Many authors define them more or less ad hoc, perhaps mentioning as
an aside that they are tensor products. We define them as tensor products
and stick to that point of view (though we provide a recognition principle
not involving tensor products), so that, for example, Frobenius reciprocity
merely becomes a special case of adjoint associativity of Hom and tensor
product.
The interdependence of the chapters is as follows:
[Diagram: chart of the interdependence of the chapters, involving Chapters 1-3, Sections 4.1-4.3 and 4.4-4.6, and Chapters 5-8.]
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . V
Chapter 1 Groups . . . . . . . . . . . . . . . . . . . . . 1
Chapter 2 Rings . . . . . . . . . . . . . . . . . . . . . 49
Appendix . . . . . . . . . . . . . . . . . . . . . . . . . 507
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . 510
Index of Notation . . . . . . . . . . . . . . . . . . . . . 511
Index of Terminology . . . . . . . . . . . . . . . . . . . . 517
Chapter 1
Groups
In this chapter we introduce groups and prove some of the basic theorems in group theory. One of these, the structure theorem for finitely generated abelian groups, we do not prove here; instead we derive it as a corollary of the more general structure theorem for finitely generated modules over a PID (see Theorem 3.7.22).
the additive notation. In this chapter, we will write e for the identity of gen-
eral groups, i.e., those written multiplicatively, but when we study group
representation theory in Chapter 8, we will switch to 1 as the identity for
multiplicatively written groups.
To present some examples of groups we must give the set G and the operation · : G × G → G and then check that this operation satisfies (a), (b), and (c) of Definition 1.1. For most of the following examples, the fact that the operation satisfies (a), (b), and (c) follows from properties of the various number systems with which you should be quite familiar. Thus details of the verification of the axioms are generally left to the reader.
(1.2) Examples.
(1) The set Z of integers with the operation being ordinary addition of integers is a group with identity e = 0, and the inverse of m ∈ Z is −m. Similarly, we obtain the additive group Q of rational numbers, R of real numbers, and C of complex numbers.
(2) The set Q* of nonzero rational numbers with the operation of ordinary multiplication is a group with identity e = 1, and the inverse of a ∈ Q* is 1/a. Q* is abelian, but this is one example of an abelian group that is not normally written with additive notation. Similarly, there are the abelian groups R* of nonzero real numbers and C* of nonzero complex numbers.
(3) The set Zn = {0, 1, ..., n − 1} with the operation of addition modulo n is a group with identity 0, and the inverse of x ∈ Zn is n − x. Recall that addition modulo n is defined as follows. If x, y ∈ Zn, take x + y ∈ Z and divide by n to get x + y = qn + r where 0 ≤ r < n. Then define x + y (mod n) to be r. (A small computational sketch of this example and of Example (6) follows the list.)
(4) The set Un of complex nth roots of unity, i.e., Un = {exp((2kπi)/n) : 0 ≤ k ≤ n − 1}, with the operation of multiplication of complex numbers is a group with the identity e = 1 = exp(0), and the inverse of exp((2kπi)/n) is exp((2(n − k)πi)/n).
(5) Let Zn* = {m : 1 ≤ m < n and m is relatively prime to n}. Under the operation of multiplication modulo n, Zn* is a group with identity 1. Details of the verification are left as an exercise.
(6) If X is a set let S_X be the set of all bijective functions f : X → X. Recall that a function is bijective if it is one-to-one and onto. Functional composition gives a binary operation on S_X, and with this operation it becomes a group. S_X is called the group of permutations of X or the symmetric group on X. If X = {1, 2, ..., n} then the symmetric group on X is usually denoted Sn, and an element α of Sn can be conveniently indicated by a 2 × n matrix

    α = ( 1    2    ...  n
          α(1) α(2) ...  α(n) )
where the entry in the second row under k is the image α(k) of k under the function α. To conform with the conventions of functional composition, the product αβ will be read from right to left, i.e., first do β and then do α. For example,

    ( 1 2 3 4 ) ( 1 2 3 4 )   ( 1 2 3 4 )
    ( 3 2 4 1 ) ( 3 4 1 2 ) = ( 4 1 3 2 ).
(7) Let GL(n, R) denote the set of n × n invertible matrices with real entries. Then GL(n, R) is a group under matrix multiplication. Let SL(n, R) = {T ∈ GL(n, R) : det T = 1}. Then SL(n, R) is a group under matrix multiplication. (In this example, we are assuming familiarity with basic properties of matrix multiplication and determinants. See Chapter 4 for details.) GL(n, R) (respectively, SL(n, R)) is known as the general linear group (respectively, special linear group) of degree n over R.
(8) If X is a set let P(X) denote the power set of X, i.e., P(X) is the set of all subsets of X. Define a product on P(X) by the formula A Δ B = (A \ B) ∪ (B \ A). A Δ B is called the symmetric difference of A and B. It is a straightforward exercise to verify the associative law for the symmetric difference. Also note that A Δ A = ∅ and ∅ Δ A = A Δ ∅ = A. Thus P(X) with the symmetric difference operation is a group with ∅ as identity and every element as its own inverse. Note that P(X) is an abelian group.
(9) Let C(R) be the set of continuous real-valued functions defined on R
and let D(R) be the set of differentiable real-valued functions defined
on R. Then C(R) and D(R) are groups under the operation of function
addition.
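The verifications in Examples (3) and (6) are easy to mechanize. The following minimal Python sketch (ours, not the text's; all names are illustrative) checks the group axioms for Z6 and reproduces the right-to-left composition convention for Sn:

    # Group axioms for Z_6 under addition mod 6 (Example (3)).
    n = 6
    Zn = range(n)

    def add(x, y):
        return (x + y) % n

    assert all(add(add(a, b), c) == add(a, add(b, c))
               for a in Zn for b in Zn for c in Zn)      # associativity
    assert all(add(0, a) == a == add(a, 0) for a in Zn)  # identity e = 0
    assert all(add(a, (n - a) % n) == 0 for a in Zn)     # inverse of a is n - a

    # Composition in S_4 (Example (6)): alpha*beta means first beta, then alpha.
    alpha = {1: 3, 2: 2, 3: 4, 4: 1}
    beta = {1: 3, 2: 4, 3: 1, 4: 2}
    alphabeta = {k: alpha[beta[k]] for k in beta}
    assert alphabeta == {1: 4, 2: 1, 3: 3, 4: 2}   # the product computed in (6)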
One way to explicitly describe a group with only finitely many elements is to give a table listing the multiplications. For example, the group {1, −1} has the multiplication table

          1    −1
     1    1    −1
    −1   −1     1

while the table
e a b c
e e a b c
a a e c b
b b c e a
c c b a e
is the table of a group called the Klein 4-group. Note that in these tables
each entry of the group appears exactly once in each row and column.
Also the multiplication is read from left to right; that is, the entry at the intersection of the row headed by α and the column headed by β is the product αβ. Such a table is called a Cayley diagram of the group. They are sometimes useful for an explicit listing of the multiplication in small groups.
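As a quick illustration of the preceding observation, a Cayley table can be generated by machine and the "each element exactly once in each row and column" property tested directly. A small Python sketch (assuming the Klein 4-group realized as Z2 × Z2; the encoding is ours):

    # The Klein 4-group as Z_2 x Z_2 with componentwise addition mod 2.
    elems = [(0, 0), (0, 1), (1, 0), (1, 1)]

    def mult(x, y):
        return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

    # Every row and every column of the Cayley table is a permutation
    # of the group elements.
    for x in elems:
        row = [mult(x, y) for y in elems]
        col = [mult(y, x) for y in elems]
        assert sorted(row) == sorted(elems) == sorted(col)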
The following result collects some elementary properties of a group:
The results in part (5) of Proposition 1.3 are known as the cancellation
laws for a group.
The associative law for a group G shows that a product of the elements a, b, c of G can be written unambiguously as abc. Since the multiplication is binary, what this means is that any two ways of multiplying a, b, and c (so that the order of occurrence in the product is the given order) produce the same element of G. With three elements there are only two choices for multiplication, that is, (ab)c and a(bc), and the law of associativity says that these are the same element of G. If there are n elements of G then the law of associativity combined with induction shows that we can write a1 a2 ... an unambiguously, i.e., it is not necessary to include parentheses to indicate which sequence of binary multiplications occurred to arrive at an element of G involving all of the ai. This is the content of the next proposition.
(1.4) Proposition. Any two ways of multiplying the elements a1, a2, ..., an in a group G in the order given (i.e., removal of all parentheses produces the juxtaposition a1 a2 ... an) produce the same element of G.
Proof. If n = 3 the result is clear from the associative law in G.
Let n > 3 and consider two elements g and h obtained as products of a1, a2, ..., an in the given order. Writing g and h in terms of the last multiplication used to obtain them gives

    g = (a1 ... ai)(ai+1 ... an)

and

    h = (a1 ... aj)(aj+1 ... an).

Since i and j are less than n, the induction hypothesis implies that the products a1 ... ai, ai+1 ... an, a1 ... aj, and aj+1 ... an are unambiguously defined elements in G. Without loss of generality we may assume that i ≤ j. If i = j then g = h and we are done. Thus assume that i < j. Then, by the induction hypothesis, parentheses can be rearranged so that

    g = (a1 ... ai)((ai+1 ... aj)(aj+1 ... an))

and

    h = ((a1 ... ai)(ai+1 ... aj))(aj+1 ... an).

Letting A = (a1 ... ai), B = (ai+1 ... aj), and C = (aj+1 ... an), the induction hypothesis implies that A, B, and C are unambiguously defined elements of G. Then

    g = A(BC) = (AB)C = h

and the proposition follows by the principle of induction. □
Since products of n elements of G are unambiguous once the order has been specified, we will write a1 a2 ... an for such a product, without any specification of parentheses. Note that the only property of a group used in Proposition 1.4 is the associative property. Therefore, Proposition 1.4 is valid for any associative binary operation. We will use this fact to be able to write unambiguous multiplication of elements of a ring in later chapters. A convenient notation for a1 ... an is ∏_{i=1}^n ai. If ai = a for all i then ∏_{i=1}^n a is denoted a^n and called the nth power of a. Negative powers of a are defined by a^{-n} = (a^{-1})^n where n > 0, and we set a^0 = e. With these notations the standard rules for exponents are valid.
Proof. Part (1) follows from Proposition 1.4 while part (2) is an easy exercise
using induction.
Proof. If H is a subgroup then (1) and (2) are satisfied, as was observed in the previous paragraph. If (1) and (2) are satisfied and a ∈ H, then a^{-1} ∈ H by (2) and e = aa^{-1} ∈ H by (1). Thus conditions (a), (b), and (c) in the definition of a group are satisfied for H, and hence H is a subgroup of G. □
(2.2) Remarks. (1) Conditions (1) and (2) of Proposition 2.1 can be replaced by the following single condition.
(1)' If a, b ∈ H then ab^{-1} ∈ H.
(2.8) Examples. You should provide proofs (where needed) for the claims
made in the following examples.
(1) The additive group Z is an infinite cyclic group generated by the number 1.
(2) The multiplicative group Q* is generated by the set S = {1/p : p is a prime number} ∪ {−1}.
(3) The group Zn is cyclic with generator 1.
(4) The group Un is cyclic with generator exp(2πi/n).
(5) The even integers are a subgroup of Z. More generally, all the multiples of a fixed integer n form a subgroup of Z, and we will see shortly that these are all the subgroups of Z.
(6) If

    α = ( 1 2 3
          2 3 1 )

then H = {e, α, α²} is a subgroup of the symmetric group S3. Also, S3 is generated by α and

    β = ( 1 2 3
          2 1 3 ).

(7) If

    β = ( 1 2 3        and    γ = ( 1 2 3
          2 1 3 )                   3 2 1 ),

then S3 = (β, γ).
(8) A matrix A = [a_{ij}] is upper triangular if a_{ij} = 0 for i > j. The subset T(n, R) ⊆ GL(n, R) of invertible upper triangular matrices is a subgroup of GL(n, R).
(9) If G is a group let Z(G), called the center of G, be defined by

    Z(G) = {a ∈ G : ab = ba for all b ∈ G}.

Then Z(G) is a subgroup of G.
(2.9) Definition. The order of G, denoted |G|, is the cardinality of the set G. The order of an element a ∈ G, denoted o(a), is the order of the subgroup generated by a. (In general, |X| will denote the cardinality of the set X, with |X| = ∞ used to indicate an infinite set.)
(2) If o(a) < ∞, then o(a) is the smallest positive integer n such that a^n = e.
(3) a^k = e if and only if o(a) | k.
Proof. (1) If a^n ≠ e for any n > 0, then a^r ≠ a^s for any r ≠ s, since a^r = a^s implies a^{r−s} = e = a^{s−r}, and if r ≠ s, then r − s > 0 or s − r > 0, which is excluded by our hypothesis. Thus, if a^n ≠ e for n > 0, then |(a)| = ∞, so o(a) = ∞. If a^n = e then let a^m be any element of (a). Writing m = qn + r where 0 ≤ r < n, we see that a^m = a^{qn+r} = a^{qn}a^r = (a^n)^q a^r = e^q a^r = a^r. Thus (a) = {e, a, a², ..., a^{n−1}} and o(a) ≤ n < ∞.
(2) By part (1), if o(a) < ∞ then there is an n > 0 such that a^n = e, and for each such n the argument in (1) shows that (a) = {e, a, ..., a^{n−1}}. If we choose n as the smallest positive integer such that a^n = e, then we claim that the powers a^i are all distinct for 0 ≤ i ≤ n − 1. Suppose that a^i = a^j for 0 ≤ i < j ≤ n − 1. Then a^{j−i} = e and 0 < j − i < n, contradicting the choice of n. Thus o(a) = n = smallest positive integer such that a^n = e.
(3) Assume that a^k = e, let n = o(a), and write k = nq + r where 0 ≤ r < n. Then e = a^k = a^{nq+r} = a^{nq}a^r = a^r. Part (2) shows that we must have r = 0, so that k = nq. □
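Part (2) of this proposition translates directly into a procedure for computing orders. A hedged Python sketch (the function name order_mod is ours), applied to the groups Zn* of Example 1.2 (5):

    # o(a) in Z_n* is the least k > 0 with a^k = 1 (mod n), by part (2).
    from math import gcd

    def order_mod(a, n):
        assert gcd(a, n) == 1          # a must lie in Z_n*
        k, x = 1, a % n
        while x != 1:
            x = (x * a) % n
            k += 1
        return k

    assert order_mod(2, 7) == 3        # 2^3 = 8 = 1 (mod 7)
    assert order_mod(3, 7) == 6        # 3 generates Z_7*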
Proof. We give the proof of (1). Suppose a^{-1}b ∈ H and let b = ah for some h ∈ H. Then bh' = a(hh') for all h' ∈ H and ah1 = (ah)(h^{-1}h1) = b(h^{-1}h1) for all h1 ∈ H. Thus aH = bH. Conversely, suppose aH = bH. Then b = be = ah for some h ∈ H. Therefore, a^{-1}b = h ∈ H. □
(2.15) Theorem. Let H be a subgroup of G. Then the left cosets (right cosets) of H form a partition of G.
Proof. Define a relation ∼_L on G by setting a ∼_L b if and only if a^{-1}b ∈ H. Note that
(1) a ∼_L a,
(2) a ∼_L b implies b ∼_L a (since a^{-1}b ∈ H implies that b^{-1}a = (a^{-1}b)^{-1} ∈ H), and
(3) a ∼_L b and b ∼_L c implies a ∼_L c.
Thus, ∼_L is an equivalence relation on G and the equivalence classes of ∼_L, denoted [a]_L, partition G. (See the appendix.) That is, the equivalence classes [a]_L and [b]_L are identical or they do not intersect. But

    [a]_L = {b ∈ G : a ∼_L b}
          = {b ∈ G : a^{-1}b ∈ H}
          = {b ∈ G : b = ah for some h ∈ H}
          = aH.

Thus, the left cosets of H partition G, and similarly for the right cosets.
(2.21) Remark. The converse of Theorem 2.17 is false in the sense that if m is an integer dividing |G|, then there need not exist a subgroup H of G with |H| = m. A counterexample is given in Exercise 31. It is true, however, when m is prime. This will be proved in Theorem 4.7.
If |G| < ∞, then Corollaries 2.18 and 2.19 show that the exponent of G divides the order of G.
There is a simple multiplication formula relating indices for a chain of subgroups K ⊆ H ⊆ G.
(2.25) Examples.
(1) If G = Z and H = 2Z is the subgroup of even integers, then the cosets of H consist of the even integers and the odd integers. Thus, [Z : 2Z] = 2.

    { ( a 0
        0 1 ) : a ∈ R* }

Therefore, the set of cosets of H in G is in one-to-one correspondence with the set of nonzero real numbers.
(5) Groups of order ≤ 5. Let G be a group with |G| ≤ 5. If |G| = 1, 2, 3, or 5 then Corollary 2.20 shows that G is cyclic. Suppose now that |G| = 4. Then every element a ≠ e ∈ G has order 2 or 4. If G has an element a of order 4 then G = (a) and G is cyclic. If G does not have any element of order 4 then G = {e, a, b, c} where a² = b² = c² = e, since each nonidentity element must have order 2. Now consider the product ab. If ab = e then ab = a², so b = a by cancellation. But a and b are distinct elements. Similarly, ab cannot be a or b, so we must have ab = c. A similar argument shows that ba = c, ac = b = ca, bc = a = cb. Thus, G has the Cayley diagram of the Klein 4-group. Therefore, we have shown that there are exactly two nonisomorphic groups of order 4, namely, the cyclic group of order 4 and the Klein 4-group.
The left cosets of a subgroup were seen (in the proof of Theorem 2.15) to be a partition of G by describing an explicit equivalence relation on G. There are other important equivalence relations that can be defined on a group G. We will conclude this section by describing one such equivalence relation.
identity element), but it is not a group except in the trivial case G = {e}, since an inverse will not exist (using the multiplication on P'(G)) for any subset S of G with |S| > 1. If S ∈ P'(G) let S^{-1} = {s^{-1} : s ∈ S}. Note, however, that S^{-1} is not the inverse of S under the multiplication of P'(G) except when S contains only one element. If H is a subgroup of G, then HH = H, and if |H| < ∞, then Remark 2.2 (2) implies that this equality is equivalent to H being a subgroup of G. If H is a subgroup of G then H^{-1} = H since subgroups are closed under inverses.
Now consider the following question. Suppose H, K ∈ P'(G) are subgroups of G. Then under what conditions is HK a subgroup of G? The following lemma gives one answer to this question; another answer will be provided later in this section after the concept of normal subgroup has been introduced.
to aN = Na for all a ∈ G.
(3.6) Proposition. If N ⊲ G, then the coset space G/N ⊆ P'(G) forms a group under the multiplication inherited from P'(G).
Proof. By Lemma 3.3, G/N is closed under the multiplication on P'(G). Since the multiplication on P'(G) is already associative, it is only necessary to check the existence of an identity and inverses. But the coset N = eN satisfies

    (eN)(aN) = eaN = aN = aeN = (aN)(eN),

so N is an identity of G/N. Also

    (a^{-1}N)(aN) = a^{-1}aN = N = aa^{-1}N = (aN)(a^{-1}N),

so that a^{-1}N is an inverse of aN. Therefore, the axioms for a group structure on G/N are satisfied.
(3.8) Remark. If N ⊲ G and |G| < ∞, then Lagrange's theorem (Theorem 2.17) shows that |G/N| = [G : N] = |G|/|N|.
(3.9) Examples.
(1) If G is abelian, then every subgroup of G is normal.
(2) SL(n, R) is a normal subgroup of GL(n, R). Indeed, if A ∈ GL(n, R) and B ∈ SL(n, R) then

    det(ABA^{-1}) = det(A) det(B) det(A)^{-1} = det(B) = 1,

so that ABA^{-1} ∈ SL(n, R).
(3.14) Theorem. (Third isomorphism theorem) Let N ⊲ G, H ⊲ G and assume that N ⊆ H. Then

    G/H ≅ (G/N)/(H/N).

Proof. Letting

    S1 = {H : H is a subgroup of G containing N}

and

    S2 = {subgroups of G/N},

define α : S1 → S2 by α(H) = H/N = Im(π|_H). Suppose H1/N = H2/N where H1, H2 ∈ S1. We claim that H1 = H2. Let h1 ∈ H1. Then h1N ∈ H2/N, so h1N = h2N where h2 ∈ H2. Therefore, H1 ⊆ H2, and a similar argument shows that H2 ⊆ H1 so that H1 = H2. Thus α is one-to-one. If K ∈ S2 then π^{-1}(K) ∈ S1 and α(π^{-1}(K)) = K, so that α is surjective. We conclude that α is a 1-1 correspondence between S1 and S2.
Now consider properties (1) and (2). The fact that H1 ⊆ H2 if and only if H1/N ⊆ H2/N is clear. To show that [H2 : H1] = [H2/N : H1/N] it is necessary to show that the set of cosets aH1 (for a ∈ H2) is in one-to-one correspondence with the set of cosets (aN)(H1/N) (for aN ∈ H2/N). This is left as an exercise.
Suppose H ⊲ G. Then H/N ⊲ G/N since

    (aN)(H/N)(aN)^{-1} = (aHa^{-1})/N = H/N.

Conversely, let H/N be a normal subgroup of G/N. Then if π1 : G/N → (G/N)/(H/N) is the natural map, we see that Ker(π1 ∘ π) = H. Thus, H ⊲ G. □
(3.18) Examples.
(1) Aut(Z) ≅ Z2. To see this let φ ∈ Aut(Z). Then if φ(1) = r it follows that φ(m) = mr, so that Z = Im(φ) = (r). Therefore, r must be a generator of Z, i.e., r = ±1. Hence φ(m) = m or φ(m) = −m for all m ∈ Z.
(2) Let G = {(a, b) : a, b ∈ Z}. Then Aut(G) is not abelian. Indeed,

    Aut(G) ≅ GL(2, Z) = { ( a b
                            c d ) : a, b, c, d ∈ Z and ad − bc = ±1 }.
(3.22) Example.
(1) The group S3 has Z(S3) = {e} (check this). Thus Inn(S3) ≅ S3. Recall that S3 = {e, α, α², β, αβ, α²β} (see Example 2.8 (6)). Note that α and β satisfy α³ = e = β² and βα = α²β. The elements α and α² have order 3 and β, αβ, and α²β all have order 2. Thus if φ ∈ Aut(S3) then φ(α) ∈ {α, α²} and φ(β) ∈ {β, αβ, α²β}. Since S3 is generated by α and β, the automorphism φ is completely determined once φ(α) and φ(β) are specified. Thus |Aut(S3)| ≤ 6 and we conclude that

    Aut(S3) = Inn(S3) ≅ S3.
given by

    m ↦ φ_m.

Furthermore, this is an isomorphism of groups since

    φ_{m1}(φ_{m2}(r)) = φ_{m1}(m2 r) = m1 m2 r = φ_{m1 m2}(r). □
Now

    Ker(Φ) = {a ∈ G : ab = b for all b ∈ G} = {e}.

Thus, Φ is injective, so by the first isomorphism theorem G ≅ Im(Φ) ⊆ S_G. □
(4.2) Remark. The homomorphism Φ is called the left regular representation of G. If |G| < ∞ then Φ is an isomorphism only when |G| ≤ 2, since if |G| > 2 then |S_G| = |G|! > |G|. This same observation shows that Theorem 4.1 is primarily of interest in showing that nothing is lost if one chooses to restrict consideration to permutation groups. As a practical matter, the size of S_G is so large compared to that of G that rarely is much insight gained with the use of the left regular representation of G in S_G. It does, however, suggest the possibility of looking for smaller permutation groups that might contain a copy of G. One possibility for this will be considered now.
(4.4) Corollary. Let H be a subgroup of the finite group G and assume that |G| does not divide [G : H]!. Then there is a subgroup N ⊆ H such that N ≠ {e} and N ⊲ G.
Proof. Let N be the kernel of the permutation representation Φ_H. By Proposition 4.3, N is the largest normal subgroup of G contained in H. To see that N ≠ {e}, note that G/N ≅ Im(Φ_H), which is a subgroup of S_{G/H}. Thus,

    |G|/|N| = |Im(Φ_H)| divides |S_{G/H}| = [G : H]!.

Since |G| does not divide [G : H]!, we must have that |N| > 1, so that N ≠ {e}. □
(4.5) Corollary. Let H be a subgroup of the finite group G such that

    (|H|, ([G : H] − 1)!) = 1.

Then H ⊲ G.
Proof. Let N = Ker(Φ_H). Then N ⊆ H and G/N ≅ Im(Φ_H), so |G/N| divides [G : H]! = (|G|/|H|)!. Therefore,
Proof. The orbits of G form a partition of X, and hence |X| = Σ |Gx| where the sum is over a set consisting of one representative of each orbit of G. The result then follows from Lemma 4.10.
(4.12) Remark. Note that Lemma 4.11 generalizes the class equation (Corollary 2.28), which is the special case of Lemma 4.11 when X = G and G acts on X by conjugation.
The three parts of the following theorem are often known as the three
Sylow theorems:
(4.14) Theorem. (Sylow) Let G be a finite group and let p be a prime dividing |G|.
Proof. Let m = |G| and write m = p^n k where k is not divisible by p and n ≥ 1. We will first prove that G has a p-Sylow subgroup by induction on m. If m = p then G itself is a p-Sylow subgroup. Thus, suppose that m > p and consider the class equation of G (Corollary 2.28):

    (4.1)    |G| = |Z(G)| + Σ [G : C(a)],

where the sum is over a complete set of nonconjugate a not in Z(G). There are two possibilities to consider:
(1) For some a, [G : C(a)] is not divisible by p. In that case, |C(a)| = |G|/[G : C(a)] = p^n k' for some k' dividing k. Then p^n divides |C(a)| and |C(a)| < |G|, so by induction C(a) has a subgroup H of order p^n, which is then also a p-Sylow subgroup of G.
(2) [G : C(a)] is divisible by p for all a ∉ Z(G). Then, since |G| is divisible by p, we see from Equation (4.1) that p divides |Z(G)|. By Cauchy's theorem (Theorem 4.7), there is an x ∈ Z(G) of order p. Let N = (x). If n = 1 (i.e., p divides |G|, but p² does not) then N itself is a p-Sylow subgroup of G. Otherwise, note that since N ⊆ Z(G), it follows that N ⊲ G (Exercise 21). Consider the projection map π : G → G/N. Now |G/N| = p^{n−1} k < |G|, so by induction, G/N has a subgroup H with |H| = p^{n−1}, and then π^{-1}(H) is a p-Sylow subgroup of G.
Thus, we have established that G has a p-Sylow subgroup P. Let X be the set of all subgroups of G conjugate to P. (Of course, any subgroup conjugate to P has the same order as P, so it is also a p-Sylow subgroup of G.) The group G acts on X by conjugation, and since all elements of X are conjugate to P, there is only one orbit. By Lemma 4.11, we have |X| = [G : G(P)]. But P ⊆ G(P), so [G : G(P)] divides k and, in particular, is not divisible by p. Thus, |X| is relatively prime to p.
Now let H be an arbitrary p-subgroup of G, and consider the action of H on X by conjugation. Again by Lemma 4.11,
It is clear from the construction that α1 and α2 are disjoint cycles. Continuing in this manner we eventually arrive at a factorization

    α = α1 α2 ... αk.

If α = (i1 ... ir) then α = (i1 ir) ... (i1 i2), so that an r-cycle α can be written as a product of r − 1 = (o(α) − 1) transpositions. Hence, if α ≠ e ∈ Sn is written in its cycle decomposition α = α1 ... αk, then α is the product of f(α) = Σ_{i=1}^k (o(αi) − 1) transpositions. We also set f(e) = 0. Now suppose that

    α = (a1 b1)(a2 b2) ... (at bt)

is written as an arbitrary product of transpositions. We claim that f(α) − t is even. To see this note that

    (a i1 i2 ... ir b j1 ... js)(a b) = (a j1 ... js)(b i1 ... ir)

and (since (a b)² = e)

    (a j1 ... js)(b i1 ... ir)(a b) = (a i1 i2 ... ir b j1 ... js),

where it is possible that no ik or jk is present. Hence, if a and b both occur in the same cycle in the cycle decomposition of α it follows that f(α · (a b)) = f(α) − 1, while if they occur in different cycles or are both not moved by α then f(α · (a b)) = f(α) + 1. In any case

    f(α · (a b)) ≡ f(α) + 1 (mod 2).
(5.6) Remark. Note that the above argument gives a method for computing sgn(α). Namely, decompose α = α1 ... αk into a product of disjoint cycles and compute f(α) = Σ_{i=1}^k (o(αi) − 1). Then sgn(α) = 1 if f(α) is even and sgn(α) = −1 if f(α) is odd.
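Remark 5.6 is easily turned into code. A short Python sketch (ours; a permutation is stored as a dictionary on {1, ..., n}):

    # sgn(alpha) via Remark 5.6: f(alpha) is the sum of (length - 1)
    # over the disjoint cycles of alpha.
    def sgn(alpha):
        seen, f = set(), 0
        for start in alpha:
            if start in seen:
                continue
            k, length = start, 0
            while k not in seen:       # trace the cycle through start
                seen.add(k)
                k = alpha[k]
                length += 1
            f += length - 1
        return 1 if f % 2 == 0 else -1

    assert sgn({1: 2, 2: 1, 3: 3}) == -1   # a transposition is odd
    assert sgn({1: 2, 2: 3, 3: 1}) == 1    # a 3-cycle is even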
    αβα^{-1}(α(i_{r−1})) = α(ir)
    αβα^{-1}(α(ir)) = α(i1),

so that αβα^{-1} = (α(i1) ... α(ir)).
(2) Let β = (i1 ... ir) and γ = (j1 ... jr) be any two r-cycles in Sn. Define α ∈ Sn by α(ik) = jk for 1 ≤ k ≤ r and extend α to a permutation in any manner. Then by part (1), αβα^{-1} = γ.

is the cycle decomposition of β. Thus, β and αβα^{-1} have the same cycle structure.
The converse is analogous to the proof of Proposition 5.9 (2); it is left to the reader.
    D2n = {α^i β^j : 0 ≤ i < n, j = 0, 1}.
It is easy to check that o(α) = n and that βαβ = α^{-1}. If the vertices of P are numbered n, 1, 2, ..., n − 1 counterclockwise starting at (1, 0), then D2n is identified as a subgroup of Sn by

    α ↦ (1 2 ... n)
    β ↦ (1 n−1)(2 n−2) ... ((n−1)/2 (n+1)/2)    when n is odd,
    β ↦ (1 n−1)(2 n−2) ... (n/2 − 1  n/2 + 1)   when n is even.

Thus, we have arrived at a concrete representation of the dihedral group that was described by means of generators and relations in Example 2.8 (13).
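The two permutations above can be fed to a small closure computation to confirm that they do generate a group of order 2n. A Python sketch under our own conventions (a permutation is the tuple (g(1), ..., g(n)); the helper names are ours):

    # Generate the subgroup of S_n spanned by alpha = (1 2 ... n) and the
    # reflection beta with beta(i) = n - i for i < n, beta(n) = n.
    def compose(s, t):                 # first t, then s
        return tuple(s[t[i] - 1] for i in range(len(t)))

    def generated(gens):
        e = tuple(range(1, len(gens[0]) + 1))
        group, frontier = {e}, [e]
        while frontier:
            g = frontier.pop()
            for s in gens:
                h = compose(s, g)
                if h not in group:
                    group.add(h)
                    frontier.append(h)
        return group                   # finite closure = generated subgroup

    n = 5
    alpha = tuple(i % n + 1 for i in range(1, n + 1))
    beta = tuple(n - i if i < n else n for i in range(1, n + 1))
    assert len(generated([alpha, beta])) == 2 * n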
(5.12) Examples.
(1) If X is the rectangle in R² with vertices (0, 1), (0, 0), (2, 0), and (2, 1), labelled from 1 to 4 in the given order, then the symmetry group of X is the subgroup

    H = {e, (1 3)(2 4), (1 2)(3 4), (1 4)(2 3)}

of S4, which is isomorphic to the Klein 4-group.
(2) D6 ≅ S3 since D6 is generated as a subgroup of S3 by the permutations α = (1 2 3) and β = (2 3).
(3) D8 is a (nonnormal) subgroup of S4 of order 8. If α = (1 2 3 4) and β = (1 3) then

    D8 = {e, α, α², α³, β, αβ, α²β, α³β}.

There are two other subgroups of S4 conjugate to D8 (exercise).
The homomorphisms π_N and π_H are called the natural projections, while ι_N and ι_H are known as the natural injections. The word canonical is used interchangeably with natural when referring to projections or injections. Note the following relationships among these homomorphisms:

    Ker(π_H) = Im(ι_N)
    Ker(π_N) = Im(ι_H)
    π_H ∘ ι_H = 1_H
    π_N ∘ ι_N = 1_N

(1_G refers to the identity homomorphism of the group G). In particular, N × H contains a normal subgroup

    Im(ι_N) = Ker(π_H) ≅ N

and a normal subgroup

    Im(ι_H) = Ker(π_N) ≅ H.
(6.4) Examples.
(1) Recall that if G is a group then the center of G, denoted Z(G), is the set of elements that commute with all elements of G. It is a normal subgroup of G. Now, if N and H are groups, then it is an easy exercise (do it) to show that Z(N × H) = Z(N) × Z(H). As a consequence, one obtains the fact that the product of abelian groups is abelian.
(2) The group Z2 × Z2 is isomorphic to the Klein 4-group. Therefore, the two nonisomorphic groups of order 4 are Z4 and Z2 × Z2.
(3) All the hypotheses in the definition of internal direct product are necessary for the validity of Proposition 6.3. For example, let G = S3, N = A3, and H = ((1 2)). Then N ⊲ G but H is not a normal subgroup of G. It is true that G = NH and N ∩ H = {e}, but G ≇ N × H since G is not abelian, but N × H is abelian.
(4) In the previous example S3 is the semidirect product of N = A3 and H = ((1 2)).
    (e, h)(n, e)(e, h)^{-1} = (e, h)(n, e)(e, h^{-1})
                            = (φ_h(n), h)(e, h^{-1})
                            = (φ_h(n) φ_h(e), hh^{-1})
                            = (φ_h(n), e).
(6.10) Examples.
(1) Let φ : H → Aut(N) be defined by φ(h) = 1_N for all h ∈ H. Then N ×_φ H is just the direct product of N and H.
(2) If φ : Z2 → Aut(Zn) is defined by 1 ↦ φ1(a) = −a, where Z2 = {0, 1}, then Zn ×_φ Z2 ≅ D2n.
(3) The construction in Example (2) works for any abelian group A in place of Zn and gives a group A ×_φ Z2. Note that A ×_φ Z2 ≇ A × Z2 unless a² = e for all a ∈ A.
(4) Zp² is a nonsplit extension of Zp by Zp. Indeed, define π : Zp² → Zp by π(r) = r (mod p). Then Ker(π) is the unique subgroup of Zp² of order p, i.e., Ker(π) = (p) ⊆ Zp². But then any nonzero homomorphism α : Zp → Zp² must have |Im(α)| = p and, since there is only one subgroup of Zp² of order p, it follows that Im(α) = Ker(π). Therefore, π ∘ α = 0 ≠ 1_{Zp}, so that the extension is nonsplit.
(6.11) Remark. Note that all semidirect products arise via the construction of Theorem 6.9 as follows. Suppose G = NH is a semidirect product. Define φ : H → Aut(N) by φ_h(n) = hnh^{-1}. Then the map Φ : G → N ×_φ H, defined by Φ(nh) = (n, h), is easily seen to be an isomorphism. Note that Φ is well defined by Lemma 6.5 and is a homomorphism by Theorem 6.9 (4).
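The recipe of Theorem 6.9 is concrete enough to compute with directly. A Python sketch of Example 6.10 (2) with n = 7 (encodings and names ours), checking the dihedral relation bab^{-1} = a^{-1}:

    # Z_7 x_phi Z_2 with phi_1(a) = -a; elements are pairs (n1, h1) and
    # (n1, h1)(n2, h2) = (n1 + phi_{h1}(n2), h1 + h2).
    n = 7
    elems = [(i, j) for i in range(n) for j in range(2)]

    def phi(h, a):
        return a % n if h == 0 else (-a) % n

    def mult(x, y):
        return ((x[0] + phi(x[1], y[0])) % n, (x[1] + y[1]) % 2)

    def inv(x):                        # brute-force inverse in a finite group
        return next(y for y in elems if mult(x, y) == (0, 0))

    a, b = (1, 0), (0, 1)              # a rotation and a reflection
    assert mult(mult(b, a), inv(b)) == inv(a)   # b a b^{-1} = a^{-1}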
|G| and study groups with particularly simple prime factorizations for their order. First note that groups of prime order are cyclic (Corollary 2.20), so that every group of order 2, 3, 5, 7, 11, or 13 is cyclic. Next we consider groups of order p² and pq where p and q are distinct primes.
(7.1) Proposition. If p is a prime and G is a group of order p², then G ≅ Zp² or G ≅ Zp × Zp.
Proof. If G has an element of order p², then G ≅ Zp². Assume not. Let e ≠ a ∈ G. Then o(a) = p. Set N = (a). Let b ∈ G with b ∉ N, and set H = (b). Then N ≅ Zp and H ≅ Zp, and by Corollary 4.6, N ⊲ G and H ⊲ G; so

    G ≅ N × H ≅ Zp × Zp

by Proposition 6.3. □
(7.2) Proposition. Let p and q be primes such that p > q and let G be a group of order pq.
(1) If q does not divide p − 1, then G ≅ Zpq.
(2) If q | p − 1, then G ≅ Zpq or G ≅ Zp ×_φ Zq, where

    φ : Zq → Aut(Zp) ≅ Z_{p−1}

is a nontrivial homomorphism. All nontrivial homomorphisms produce isomorphic groups.
(7.4) Remark. The results obtained so far completely describe all groups of order ≤ 15, except for groups of order 8 and 12. We shall analyze each of these two cases separately.
Groups of Order 8
We will consider first the case of abelian groups of order 8.
Proof. Since G is not abelian, it is not cyclic, so G does not have an element of order 8. Similarly, if a² = e for all a ∈ G, then G is abelian (Exercise 8); therefore, there is an element a ∈ G of order 4. Let b be an element of G not in (a). Since [G : (a)] = 2, the subgroup (a) ⊲ G. But |G/(a)| = 2 so that b² ∈ (a). Since o(b) is 2 or 4, we must have b² = e or b² = a². Since (a) ⊲ G, b^{-1}ab is in (a) and has order 4. Since G is not abelian, it follows that b^{-1}ab = a³. Therefore, G has two generators a and b subject to one of the following sets of relations:
(1) a⁴ = e, b² = e, b^{-1}ab = a³;
(2) a⁴ = e, b² = a², b^{-1}ab = a³.
Groups of Order 12
To classify groups of order 12, we start with the following result.
(7.8) Proposition. Let G be a group of order p²q, where p and q are distinct primes. Then G is the semidirect product of a p-Sylow subgroup H and a q-Sylow subgroup K.
Proof. If p > q then H ⊲ G by Corollary 4.6.
If q > p then 1 + kq | p² for some k ≥ 0. Since q > p, this can
Proof. Exercise.
    Order   Abelian groups          Nonabelian groups      Number
    2       Z2                                             1
    3       Z3                                             1
    4       Z4, Z2 × Z2                                    2
    5       Z5                                             1
    6       Z6                      S3                     2
    7       Z7                                             1
    8       Z8, Z4 × Z2,            Q, D8                  5
            Z2 × Z2 × Z2
    9       Z9, Z3 × Z3                                    2
    10      Z10                     D10                    2
    11      Z11                                            1
    12      Z12, Z2 × Z6            A4, D12, Z3 ×_φ Z4     5
    13      Z13                                            1
    14      Z14                     D14                    2
    15      Z15                                            1
1.8 Exercises
    S = { ( 1 m n
            0 1 b
            0 0 1 ) : m0 | m, n0 | n, b0 | b }
    A = (  0 1        and    B = ( 0 i
          −1 0 )                   i 0 ).

(a) Show that A and B satisfy the relations A⁴ = I, A² = B², B^{-1}AB = A^{-1}. (Thus, Q is a concrete representation of the quaternion group.)
(b) Prove that |Q| = 8 and list all the elements of Q in terms of A and B.
(c) Compute Z(Q) and prove that Q/Z(Q) is abelian.
(d) Prove that every subgroup of Q is normal.
25. Let n be a fixed positive integer. Suppose a group G has exactly one subgroup H of order n. Prove that H ⊲ G.
26. Let H ⊲ G and assume that G/H is abelian. Show that every subgroup K ⊆ G containing H is normal.
27. Let G be the multiplicative subgroup of GL(2, C) generated by

    α = ( 1 2 3 4 5 6
          6 5 4 1 2 3 )

    β = ( 1 2 3 4 5 6 7 8
          8 1 3 6 5 7 4 2 )

    γ = ( 1 2 3 4 5 6 7 8 9
          2 3 4 5 6 7 8 9 1 )

    ( 1 2 3 4 5 6 7 8 9
      5 8 9 2 1 4 3 6 7 )

    5 4 1 7 10 2 6 9 8
(b) Prove that a permutation α is even if and only if there is an even number of cycles of even order in the cycle decomposition of α.
43. Show that if a subgroup G of Sn contains an odd permutation, then G has a normal subgroup H with [G : H] = 2.
44. For α ∈ Sn, let

then j(α) = 5.) Show that sgn(α) = 1 if j(α) is even and sgn(α) = −1 if j(α) is odd. Thus, j provides a method of determining if a permutation is even or odd without the factorization into disjoint cycles.
45. (a) Prove that Sn is generated by the transpositions (1 2), (1 3), ..., (1 n).
(b) Prove that Sn is generated by (1 2) and (1 2 ... n).
46. In the group S4 compute the number of permutations conjugate to each of the following permutations: e = (1), α = (1 2), β = (1 2 3), γ = (1 2 3 4), and δ = (1 2)(3 4).
47. (a) Find all the subgroups of the dihedral group D8.
(b) Show that D8 is not isomorphic to the quaternion group Q. Note, however, that both groups are nonabelian groups of order 8. (Hint: Count the number of elements of order 2 in each group.)
48. Construct two nonisomorphic nonabelian groups of order p³ where p is an odd prime.
49. Show that any group of order 312 has a nontrivial normal subgroup.
50. Show that any group of order 56 has a nontrivial normal subgroup.
51. Show Aut(Z2 × Z2) ≅ S3.
52. How many elements are there of order 7 in a simple group of order 168? (See
Exercise 38 for the definition of simple.)
53. Classify all groups (up to isomorphism) of order 18.
54. Classify all groups (up to isomorphism) of order 20.
Chapter 2
Rings
(1.1) Definition. A ring (R, +, ·) is a set R together with two binary operations + : R × R → R (addition) and · : R × R → R (multiplication) satisfying the following properties.
(a) (R, +) is an abelian group. We write the identity element as 0.
(b) a · (b · c) = (a · b) · c (· is associative).
(c) a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c (· is left and right distributive over +).
As in the case of groups, it is conventional to write ab instead of a · b. A ring will be denoted simply by writing the set R, with the multiplication and addition being implicit in most cases. If R ≠ {0} and multiplication on R has an identity element, i.e., there is an element 1 ∈ R with a1 = 1a = a for all a ∈ R, then R is said to be a ring with identity. In this case 1 ≠ 0 (see Proposition 1.2). We emphasize that the ring R = {0} is not a ring with identity.
    0 = ab − ac = a(b − c),

and since a is not a zero divisor and a ≠ 0, this implies that b − c = 0, i.e., b = c. The other half is similar.
(1.6) Remark. The conclusion of Proposition 1.5 is valid under much weaker
hypotheses. In fact, there is a theorem of Wedderburn that states that the
commutativity follows from the finiteness of the ring. Specifically, Wedder-
burn proved that any finite division ring is automatically a field. This result
requires more background than the elementary Proposition 1.5 and will not
be presented here.
(1.10) Examples.
(1) 2Z = {even integers} is a subring of the ring Z of integers. 2Z does
not have an identity and thus it fails to be an integral domain, even
though it has no zero divisors.
(2) Zn, the integers under addition and multiplication modulo n, is a ring with identity. Zn has zero divisors if and only if n is composite. Indeed, if n = rs for 1 < r < n, 1 < s < n, then rs = 0 in Zn and r ≠ 0, s ≠ 0 in Zn, so Zn has zero divisors. Conversely, if Zn has zero divisors then there is an equation ab = 0 in Zn with a ≠ 0, b ≠ 0 in Zn. By choosing representatives of a and b in Z we obtain an equation ab = nk in Z where we may assume that 0 < a < n and 0 < b < n. Therefore, every prime divisor of k divides either a or b, so that after enough divisions we arrive at an equation rs = n where 0 < r < n and 0 < s < n, i.e., n is composite.
(3) Example (2) combined with Proposition 1.5 shows that Zn is a field if and only if n is a prime number. In particular, we have identified some finite fields, namely, Zp for p a prime.
(4) There are finite fields other than the fields Zp. We will show how to construct some of them after we develop the theory of polynomial rings.
Table 1.1. Multiplication and addition for a field with four elements

    +  | 0 1 a b        ·  | 0 1 a b
    0  | 0 1 a b        0  | 0 0 0 0
    1  | 1 0 b a        1  | 0 1 a b
    a  | a b 0 1        a  | 0 a b 1
    b  | b a 1 0        b  | 0 b 1 a
For now we can present a specific example via explicit addition and multiplication tables. Let F = {0, 1, a, b} have addition and multiplication defined by Table 1.1. One can check directly that (F, +, ·) is a field with 4 elements. Note that the additive group (F, +) ≅ Z2 × Z2 and that the multiplicative group (F*, ·) ≅ Z3.
(5) Let Z[i] = {m + ni : m, n ∈ Z}. Then Z[i] is a subring of the field of complex numbers called the ring of gaussian integers. As an exercise, check that the units of Z[i] are {±1, ±i}.
(6) Let d ≠ 0, 1 ∈ Z be square-free (i.e., n² does not divide d for any n > 1) and let Q[√d] = {a + b√d : a, b ∈ Q} ⊆ C. Then Q[√d] is a subfield of C called a quadratic field.
(7) Let X be a set and P(X) the power set of X. Then (P(X), Δ, ∩) is a commutative ring with identity where addition in P(X) is the symmetric difference (see Example 1.2 (8)) and the product of A and B is A ∩ B. For this ring, 0 = ∅ ∈ P(X) and 1 = X ∈ P(X).
(8) Let R be a ring with identity and let M_{m,n}(R) be the set of m × n matrices with entries in R. If m = n we will write M_n(R) in place of M_{n,n}(R). If A = [a_{ij}] ∈ M_{m,n}(R) we let ent_{ij}(A) = a_{ij} denote the entry of A in the ith row and jth column for 1 ≤ i ≤ m, 1 ≤ j ≤ n. If A, B ∈ M_{m,n}(R) then the sum is defined by the formula

    ent_{ij}(A + B) = ent_{ij}(A) + ent_{ij}(B),

while if A ∈ M_{m,n}(R) and B ∈ M_{n,p}(R) the product AB ∈ M_{m,p}(R) is defined by the formula

    ent_{ij}(AB) = Σ_{k=1}^n ent_{ik}(A) ent_{kj}(B).

    δ_{ij} = { 1 if i = j,
               0 if i ≠ j.
There are mn matrices E_{ij} (1 ≤ i ≤ m, 1 ≤ j ≤ n) in M_{m,n}(R) that are particularly useful in many calculations concerning matrices. E_{ij} is defined by the formula ent_{kl}(E_{ij}) = δ_{ki}δ_{lj}; that is, E_{ij} has a 1 in the ij position and 0 elsewhere. Therefore, any A = [a_{ij}] ∈ M_{m,n}(R) can be written as

    A = Σ_{i=1}^m Σ_{j=1}^n a_{ij} E_{ij}.

Note that the symbol E_{ij} does not contain notation indicating which M_{m,n}(R) the matrix belongs to. This is determined from the context. There is the following matrix product rule for the matrices E_{ij} (when the matrix multiplications are defined):

    (1.2)    E_{ij} E_{kl} = δ_{jk} E_{il}.

In case m = n, note that E_{ii}E_{ii} = E_{ii}, and when n > 1, E_{11}E_{12} = E_{12} while E_{12}E_{11} = 0. Therefore, if n > 1 then the ring M_n(R) is not commutative and there are zero divisors in M_n(R). The matrices E_{ij} ∈ M_n(R) are called matrix units, but they are definitely not (except for n = 1) units in the ring M_n(R). A unit in the ring M_n(R) is an invertible matrix, so the group of units of M_n(R) is called the general linear group GL(n, R) of degree n over the ring R.
(9) There are a number of important subrings of M_n(R). To mention a few, there is the ring of diagonal matrices

    D_n(R) = {A ∈ M_n(R) : ent_{ij}(A) = 0 if i ≠ j},

the ring of upper triangular matrices

    T_n(R) = {A ∈ M_n(R) : ent_{ij}(A) = 0 if i > j},

and the ring of lower triangular matrices

    T_n'(R) = {A ∈ M_n(R) : ent_{ij}(A) = 0 if i < j}.

All three of these subrings of M_n(R) are rings with identity, namely the identity of M_n(R). The subrings of strictly upper triangular matrices ST_n and strictly lower triangular matrices ST_n' do not have an identity. A matrix is strictly upper triangular if all entries on and below the diagonal are 0, and strictly lower triangular means that all entries on and above the diagonal are 0.
(10) Let F be a subfield of the real numbers R and let x, y ∈ F with x > 0 and y > 0. Define a subring Q(−x, −y; F) of M2(C) by

    Q(−x, −y; F) = { ( a + b√−x          c√−y + d√−x√−y
                       c√−y − d√−x√−y    a − b√−x         ) : a, b, c, d ∈ F }.
(In these formulas √−x and √−y denote the square roots with positive imaginary parts.) It is easy to check that Q(−x, −y; F) is closed under matrix addition and matrix multiplication so that it is a subring of M2(C). Let

    1 = ( 1 0      i = ( √−x   0         j = ( 0    √−y      k = (  0        √−x√−y
          0 1 ),          0   −√−x ),          √−y  0 ),           −√−x√−y   0       ).
It is easy to check that R[X] is a ring with these operations (do it). R[X] is called the ring of polynomials in the indeterminate X with coefficients in R. Notice that the indeterminate X is nowhere mentioned in our definition of R[X]. To show that our description of R[X] agrees with that with which you are probably familiar, we define X as a function on Z⁺ as follows:

    X(n) = { 1 if n = 1,
             0 if n ≠ 1.
Then the function X^n satisfies

    X^n(m) = { 1 if m = n,
               0 if m ≠ n.

Therefore, any f ∈ R[X] can be written uniquely as

    f = Σ_{n=0}^∞ f(n) X^n

where the summation is actually finite since only finitely many f(n) ≠ 0. Note that X⁰ means the identity of R[X], which is the function 1 : Z⁺ → R defined by

    1(n) = { 1 if n = 0,
             0 if n > 0.

We do not need to assume that R is commutative in order to define the polynomial ring R[X], but many of the theorems concerning polynomial rings will require this hypothesis, or even that R be a field. However, for some applications to linear algebra it is convenient to have polynomials over noncommutative rings.
(13) We have defined the polynomial ring R[X] very precisely as functions from Z⁺ to R that are 0 except for at most finitely many nonnegative integers. We can similarly define the polynomials in several variables. Let (Z⁺)ⁿ be the set of all n-tuples of nonnegative integers and, if R is a commutative ring with identity, define R[X1, ..., Xn] to be the set of all functions f : (Z⁺)ⁿ → R such that f(α) = 0 for all but at most finitely many α ∈ (Z⁺)ⁿ. Define ring operations on R[X1, ..., Xn] by

    (f + g)(α) = f(α) + g(α)   and   (fg)(α) = Σ_{β+γ=α} f(β) g(γ).

If α = (α1, ..., αn) ∈ (Z⁺)ⁿ we write X^α = X1^{α1} ... Xn^{αn}, and we leave it as an exercise to check the following formula, which corresponds to our intuitive understanding of what a polynomial in several variables is. If f ∈ R[X1, ..., Xn] then we can write

    f = Σ_{α ∈ (Z⁺)ⁿ} a_α X^α
    f(X) = Σ_{n=0}^∞ a_n X^n.

Since we cannot compute infinite sums (at least without a topology and a concept of limit), this expression is simply a convenient way to keep track of f(n) for all n. In fact, a_n = f(n) is the meaning of the above equation. With this convention, the multiplication and addition of formal power series proceed by the rules you learned in calculus for manipulating power series. A useful exercise to become familiar with algebra in the ring of formal power series is to verify that f ∈ R[[X]] is a unit if and only if f(0) is a unit in R.
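That exercise has a constructive core: if f(0) is invertible, the coefficients of f^{-1} can be computed one degree at a time from the relation Σ_{m≤k} b_m a_{k−m} = 0 for k > 0. A Python sketch over the rationals (names and conventions ours):

    # Invert a formal power series term by term, given a[0] invertible;
    # a[k] is the coefficient of X^k.
    from fractions import Fraction

    def series_inverse(a, N):
        b = [Fraction(1) / a[0]]
        for k in range(1, N + 1):
            s = sum(b[m] * (a[k - m] if k - m < len(a) else 0)
                    for m in range(k))
            b.append(-s / a[0])        # solve the degree-k relation for b_k
        return b

    # (1 - X)^{-1} = 1 + X + X^2 + ...
    assert series_inverse([Fraction(1), Fraction(-1)], 4) == [Fraction(1)] * 5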
(15) Let G be a group and let R be a ring with identity. Let R(G) be the set of all functions f : G → R such that f(a) ≠ 0 for at most a finite number of a ∈ G. Define multiplication and addition on R(G) by the formulas

    (f + g)(a) = f(a) + g(a)
    (fg)(a) = Σ_{b∈G} f(b) g(b^{-1}a).

Note that the summation used in the definition of product in R(G) is a finite sum since f(b) ≠ 0 for at most finitely many b ∈ G. The ring
(1.11) Proposition. Let R be a ring and let a1, ..., am, b1, ..., bn ∈ R. Then

    (a1 + ... + am)(b1 + ... + bn) = Σ_{i=1}^m Σ_{j=1}^n a_i b_j.

Proof. Exercise. □
Every ring R has at least two ideals, namely, {0} and R are both ideals of R. For division rings, these are the only ideals, as we see from the following observation.
(2.3) Lemma. If R is a division ring, then the only ideals of R are R and {0}.
Proof. Let I ⊆ R be an ideal such that I ≠ {0}. Let a ≠ 0 ∈ I and let b ∈ R. Then the equation ax = b is solvable in R, so b ∈ I. Therefore, I = R. □
(2.5) Remarks. (1) In fact, the converse of Lemma 2.2 is also true; that
is, every ideal is the kernel of some ring homomorphism. The proof of this
requires the construction of the quotient ring, which we will take up next.
(2) The converse of Lemma 2.3 is false. See Remark 2.28.
(2.8) Theorem. (Third isomorphism theorem) Let R be a ring and let I and J be ideals of R with I ⊆ J. Then J/I is an ideal of R/I and

    R/J ≅ (R/I)/(J/I).

Proof. Define a function f : R/I → R/J by f(a + I) = a + J. It is easy to check that this is a well-defined ring homomorphism. Then

    Ker(f) = {a + I : a + J = J} = {a + I : a ∈ J} = J/I.

The result then follows from the first isomorphism theorem.
    RXR = { Σ_{i=1}^n r_i x_i s_i : r_i, s_i ∈ R, x_i ∈ X, n ≥ 1 }.

    RX = { Σ_{i=1}^n r_i x_i : r_i ∈ R, x_i ∈ X, n ≥ 1 }.

Proof. (2) Every ideal containing X certainly must contain RXR. It is only necessary to observe that RXR is indeed an ideal of R, and hence it is the smallest ideal of R containing X. Parts (1) and (3) are similar and are left for the reader. □
(2.13) Remarks. (1) The description given in Lemma 2.12 (2) of the ideal generated by X is valid for rings with identity. There is a similar, but more complicated, description for rings without an identity. We do not present it since we shall not have occasion to use such a description.
(2) If X = {a} then the ideal generated by X in a commutative ring R with identity is the set Ra = {ra : r ∈ R}. Such an ideal is said to be principal. An integral domain R in which every ideal of R is principal is called a principal ideal domain (PID).

    ∏_{i=1}^n I_i = { Σ_{j=1}^m a_{1j} a_{2j} ... a_{nj} : a_{ij} ∈ I_i (1 ≤ i ≤ n) and m is arbitrary }.
Proof. Exercise.
(2.17) Corollary. In a ring with identity there are always maximal ideals.
Proof.
    J = {x ∈ R : ax ∈ I}.

Then I ⊆ J since I is an ideal, and J is clearly an ideal. Also, J ≠ I since b ∈ J but b ∉ I, and J ≠ R since 1 ∉ J. Therefore, I is not maximal. The theorem follows by contraposition.
(2.23) Examples.
(1) We compute all the ideals of the ring of integers Z. We already know that all subgroups of Z are of the form nZ = {nr : r ∈ Z}. But if s ∈ Z and nr ∈ nZ then s(nr) = nrs = (nr)s, so that nZ is an ideal of Z. Therefore, the ideals of Z are the subsets nZ of multiples of a fixed integer n. The quotient ring Z/nZ is the ring Zn of integers modulo n. It was observed in the last section that Zn is a field if and only if n is a prime number.
(2) Define φ : Z[X] → Z by φ(a0 + a1X + ... + anX^n) = a0. This is a surjective ring homomorphism, and hence, Ker(φ) is a prime ideal. In fact, Ker(φ) = (X) = ideal generated by X.
Now define ψ : Z[X] → Z2[X] by
    aug( Σ_{g∈G} r_g g ) = Σ_{g∈G} r_g.
Proof. We will first do the special case of the theorem where a1 = 1 and a_j = 0 for j > 1. For each j > 1, since I1 + I_j = R, we can find b_j ∈ I1 and c_j ∈ I_j with b_j + c_j = 1. Then ∏_{j≥2}(b_j + c_j) = 1, and since each b_j ∈ I1, it follows that 1 = ∏_{j≥2}(b_j + c_j) ∈ I1 + ∏_{j≥2} I_j. Therefore, there

    b − a ∈ ∩_{i=1}^n I_i.

The converse is clear. □
There is another version of the Chinese remainder theorem. In order to state it we will need to introduce the concept of cartesian, or direct, product of rings. We will only be concerned with finite cartesian products. Thus let R1, ..., Rn be finitely many rings and let ∏_{i=1}^n R_i = R1 × ... × Rn denote the cartesian product set. On the set ∏_{i=1}^n R_i we may define addition and multiplication componentwise, i.e.,

    (a1, ..., an) + (b1, ..., bn) = (a1 + b1, ..., an + bn)
    (a1, ..., an)(b1, ..., bn) = (a1b1, ..., anbn),

to make ∏_{i=1}^n R_i into a ring called the direct product of R1, ..., Rn. Given
(2.25) Corollary. Let R be a commutative ring with identity and let I1, ..., In be ideals of R such that I_i + I_j = R if i ≠ j. Define

    f : R → ∏_{i=1}^n R/I_i

Proof. Surjectivity follows from Theorem 2.24, and it is clear from the definition of f that Ker(f) = I1 ∩ ... ∩ In.
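For R = Z and I_i = (n_i) with the n_i pairwise relatively prime, this is the classical Chinese remainder theorem, and the isomorphism can be inverted explicitly. A Python sketch (names ours; pow(q, -1, m) computes an inverse mod m):

    # Reconstruct x mod n_1 ... n_k from its residues, moduli pairwise coprime.
    def crt(residues, moduli):
        n = 1
        for m in moduli:
            n *= m
        x = 0
        for r, m in zip(residues, moduli):
            q = n // m                   # q = 0 mod every other modulus
            x += r * q * pow(q, -1, m)   # q * q^{-1} = 1 mod m
        return x % n

    assert crt([2, 3, 2], [3, 5, 7]) == 23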
Proof. The only ideals of D are {0} and D, so that the only ideals of M_n(D) are {0} = M_n({0}) and M_n(D).
    Q(R) = {a/b ∈ F : a, b ∈ R, b ≠ 0}.

Of course, if c ≠ 0 ∈ R then a/b = (ac)/(bc) ∈ F since a/b just means ab^{-1}. Thus Q(R) is obtained from R in the same manner in which the rational numbers Q are obtained from the integers Z. We will now consider the converse situation. Suppose we are given an integral domain R. Can we find a field F that contains an isomorphic copy of R as a subring? The answer is yes, and in fact we will work more generally by starting with a commutative ring R and a subset S of R of elements that we wish to be able to invert. Thus we are looking for a ring R' that contains R as a subring and such that S ⊆ (R')*, i.e., every element of S is a unit of R'. In the case of an integral domain R we may take S = R \ {0} to obtain Q(R).
Note that if S' is any nonempty subset of R, then the set S consisting of
all finite products of elements of S' is multiplicatively closed. If S' contains
no zero divisors, then neither does S.
We leave to the reader the routine check that these operations make R_S into a commutative ring with identity. Observe that 0/s = 0/s', s/s = s'/s' for any s, s' ∈ S, and 0/s is the additive identity of R_S, while s/s is the multiplicative identity. If s, t ∈ S then (s/t)(t/s) = (st)/(st) = 1 ∈ R_S so that (s/t)^{-1} = t/s.
Now we define φ : R → R_S. Choose s ∈ S and define φ_s : R → R_S by φ_s(a) = (as)/s. We claim that if s' ∈ S then φ_{s'} = φ_s. Indeed, φ_s(a) = (as)/s and φ_{s'}(a) = (as')/s', but (as)/s = (as')/s' since ass' = as's. Thus we may define φ to be φ_s for any s ∈ S.
    φ(a)φ(b) = ((as)/s)((bs)/s) = (abs²)/s² = φ(ab)

and

    φ(a + b) = ((a + b)s)/s = (as² + bs²)/s² = (as)/s + (bs)/s = φ(a) + φ(b).

    Ker(φ) = {a ∈ R : (as)/s = 0/s}
           = {a ∈ R : as' = 0 for some s' ∈ S}
           = {0}
If a = φ(b)(φ(c))^{-1} = φ(b')(φ(c'))^{-1}, then φ(b)φ(c') = φ(b')φ(c); hence φ(bc') = φ(b'c). But φ is injective, so bc' = b'c. Now apply the homomorphism φ' to conclude that φ'(b)(φ'(c))^{-1} = φ'(b')(φ'(c'))^{-1} = β(a), so β is well defined.

Claim. β is bijective.

    γ(β(a)) = γ(φ'(b)(φ'(c))^{-1}) = φ(b)(φ(c))^{-1} = a.

Similarly, one shows that β(γ(a')) = a', so that β is bijective.
It remains to check that β is a homomorphism. We will check that β preserves multiplication and leave the similar calculation for addition as an exercise. Let a1 = φ(b1)(φ(c1))^{-1} and a2 = φ(b2)(φ(c2))^{-1}. Since φ is a ring homomorphism,

    β(a1a2) = φ'(b1b2)(φ'(c1c2))^{-1} = β(a1)β(a2).

This completes the proof of Theorem 3.5.
(3.6) Examples.
(1) Q(Z) = Q.
(2) Let p ∈ Z be a prime and let S_p = {1, p, p², ...} ⊆ Z. Then

    Z_{S_p} = { a/b ∈ Q : b is a power of p }.

    R_S = { a/b ∈ Q(R) : b ∉ P }.
    X(n) = { 1 if n = 1,
             0 if n ≠ 1,

then every f ∈ R[X] can be written uniquely in the form

    f = Σ_{n=0}^∞ a_n X^n

where the sum is finite since only finitely many a_n = f(n) ∈ R are not 0. With this notation the multiplication formula becomes

    ( Σ_{n=0}^∞ a_n X^n )( Σ_{n=0}^∞ b_n X^n ) = Σ_{n=0}^∞ c_n X^n

where

    c_n = Σ_{m=0}^n a_m b_{n−m}.
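The multiplication formula is just a convolution of coefficient sequences, as a short Python sketch (ours) makes explicit:

    # c_n = sum_{m=0}^{n} a_m b_{n-m}, with a[i] the coefficient of X^i.
    def poly_mult(a, b):
        c = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                c[i + j] += ai * bj
        return c

    # (1 + X)(1 - X + X^2) = 1 + X^3
    assert poly_mult([1, 1], [1, -1, 1]) == [1, 0, 0, 1]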
It is traditional to denote elements of R[X] by a symbol f(X), but it is important to recognize that f(X) does not mean a function f on the set
(4.2) Lemma. Let R be a commutative ring and let f(X), g(X) ∈ R[X]. Then
(1) deg(f(X) + g(X)) ≤ max{deg(f(X)), deg(g(X))};
(2) deg(f(X)g(X)) ≤ deg(f(X)) + deg(g(X)); and
(3) equality holds in (2) if the leading coefficient of f(X) or g(X) is not a zero divisor. In particular, equality holds in (2) if R is an integral domain.

For (2) and (3), suppose deg(f(X)) = n ≥ 0 and deg(g(X)) = m ≥ 0. Then

    f(X) = a0 + a1X + ... + anX^n    with an ≠ 0

and

    g(X) = b0 + b1X + ... + bmX^m    with bm ≠ 0.

Therefore,

    f(X)g(X) = a0b0 + (a0b1 + a1b0)X + ... + anbmX^{m+n},

so that deg(f(X)g(X)) ≤ n + m, with equality if and only if anbm ≠ 0. If an or bm is not a zero divisor, then it is certainly true that anbm ≠ 0.
The case for f(X) = 0 or g(X) = 0 is handled separately and is left to the reader.
(4.8) Remark. This result is false if R is not an integral domain. For example, if R = Z2 × Z2 then all four elements of R are roots of the quadratic polynomial X² − X ∈ R[X].
(4.9) Corollary. Let R be an integral domain and let f(X), g(X) ∈ R[X] be polynomials of degree ≤ n. If f(a) = g(a) for n + 1 distinct a ∈ R, then f(X) = g(X).
Proof. The polynomial h(X) = f(X) − g(X) is of degree ≤ n and has more than n roots. Thus h(X) = 0 by Corollary 4.7. □
In the case R is a field, there is the following complement to Corol-
lary 4.9.
(4.10) Proposition. (Lagrange Interpolation) Let F be afield and let ao, a1,
..., an be n + 1 distinct elements of F. Let co, c1, ..., Cn be arbitrary (not
necessarily distinct) elements of F. Then there exists a unique polynomial
f (X) E F[X] of degree < n such that f (a,) = ci for 0 < i < n.
Proof. Uniqueness follows from Corollary 4.9, so it is only necessary to
demonstrate existence. To see existence, for 0 < i < n, let P, (X) E FIX]
be defined by
Note that deg Pi (X) = n and P, (ai) = b,i. Thus, we may take
n
(4.2) f(X) _ >c:Pi(X)
i=o
to conclude the proof. 0
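Equation (4.2) can be evaluated mechanically. A Python sketch over Q (our encoding: a polynomial is the list of its coefficients, index k holding the coefficient of X^k):

    from fractions import Fraction

    def lagrange_coeffs(points):
        # points: [(a_i, c_i)]; returns the coefficients of f, deg f <= n.
        n = len(points)
        coeffs = [Fraction(0)] * n
        for i, (ai, ci) in enumerate(points):
            num = [Fraction(1)]        # running product of the (X - a_j)
            denom = Fraction(1)
            for j, (aj, _) in enumerate(points):
                if j == i:
                    continue
                num = [Fraction(0)] + num           # multiply by X ...
                num = [num[k] - aj * (num[k + 1] if k + 1 < len(num) else 0)
                       for k in range(len(num))]    # ... then subtract a_j * p
                denom *= ai - aj
            for k, nk in enumerate(num):
                coeffs[k] += ci * nk / denom        # add c_i P_i(X)
        return coeffs

    # f(0) = 1, f(1) = 2, f(2) = 5 gives f = 1 + X^2.
    assert lagrange_coeffs([(0, 1), (1, 2), (2, 5)]) == [1, 0, 1]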
    P_i(X) = (X − a0)(X − a1) ... (X − a_{i−1})

for 1 ≤ i ≤ n. Any polynomial f(X) ∈ F[X] of degree at most n can be written uniquely as

    (4.3)    f(X) = Σ_{i=0}^n α_i P_i(X).

The coefficients α_i can be computed from the values f(a_i) for 0 ≤ i ≤ n. The details are left as an exercise. The expression in Equation (4.3) is known as Newton's form of interpolation; it is of particular importance in numerical computations.
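One standard way to compute the α_i (the text leaves the details as an exercise) is by divided differences: α_i = f[a_0, ..., a_i]. A Python sketch under that assumption:

    from fractions import Fraction

    def newton_coeffs(points):
        # points: [(a_i, f(a_i))]; returns [alpha_0, ..., alpha_n].
        a = [Fraction(x) for x, _ in points]
        d = [Fraction(y) for _, y in points]   # 0th divided differences
        alphas = [d[0]]
        for level in range(1, len(points)):
            d = [(d[k + 1] - d[k]) / (a[k + level] - a[k])
                 for k in range(len(d) - 1)]
            alphas.append(d[0])
        return alphas

    # Nodes 0, 1, 2 with values 1, 2, 5: 1 + X + X(X - 1) = 1 + X^2.
    assert newton_coeffs([(0, 1), (1, 2), (2, 5)]) == [1, 1, 1]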
If R is a PID that is not a field, then it is not true that the polynomial
ring R[X] is a PID. In fact, if I = (p) is a nonzero proper ideal of R, then
J = (p, X) is not a principal ideal (see Example 2.23 (2)). It is, however,
true that every ideal in the ring R[X] has a finite number of generators.
This is the content of the following result.
Then, as above,
(4.18) Remarks.
(1) According to Corollary 4.6, F is algebraically closed if and only if the nonconstant irreducible polynomials in F[X] are precisely the polynomials of degree 1.
(2) The complex numbers C are algebraically closed. This fact, known
as the fundamental theorem of algebra, will be assumed known
at a number of points in the text. We shall not, however, present a
proof.
(3) If F is an arbitrary field, then there is a field K such that F ⊆ K and K is algebraically closed. One can even guarantee that every element of K is algebraic over F, i.e., if a ∈ K then there is a polynomial f(X) ∈ F[X] such that f(a) = 0. Again, this is a fact that we shall not prove since it involves a few subtleties of set theory that we do not wish to address.
(5.4) Examples.
(1) Let F be a field and let

    R = F[X², X³] = {p(X) ∈ F[X] : p(X) = a0 + a2X² + a3X³ + ... + anX^n}.

Then X² and X³ are irreducible in R, but they are not prime since X² | (X³)² = X⁶, but X² does not divide X³, and X³ | X⁴·X² = X⁶, but X³ does not divide either X⁴ or X². All of these statements are easily seen by comparing degrees.
(2) Let Z[√−3] = {a + b√−3 : a, b ∈ Z}. In Z[√−3] the element 2 is irreducible but not prime.
Proof. Suppose that 2 = (a + b√−3)(c + d√−3) with a, b, c, d ∈ Z. Then taking complex conjugates gives 2 = (a − b√−3)(c − d√−3), so multiplying these two equations gives 4 = (a² + 3b²)(c² + 3d²). Since the equation in integers a² + 3b² = 2 has no solutions, it follows that we must have a² + 3b² = 1 or c² + 3d² = 1, and this forces a = ±1, b = 0 or c = ±1, d = 0. Therefore, 2 is irreducible in Z[√−3]. Note that 2 is not a unit in Z[√−3] since the equation 2(a + b√−3) = 1 has no solution with a, b ∈ Z.
To see that 2 is not prime, note that 2 divides 4 = (1 + √−3)(1 − √−3), but 2 does not divide either of the factors 1 + √−3 or 1 − √−3 in Z[√−3]. We conclude that 2 is not a prime in the ring Z[√−3]. □
Proof. Lemma 5.3 shows that if a is prime, then a is irreducible. Now assume that a is irreducible and suppose that a | bc. Let d = gcd{a, b}. Thus a = de and b = df. Since a = de and a is irreducible, either d is a unit or e is a unit. If e is a unit, then a | b because a is an associate of d and d | b. If d is a unit, then d = ar' + bs' for some r', s' ∈ R (since d ∈ (a, b)). Therefore, 1 = ar + bs for some r, s ∈ R, and hence, c = arc + bsc. But a | arc and a | bsc (since a | bc by assumption), so a | c as required.
    I1 ⊆ I2 ⊆ I3 ⊆ ...

be a chain of ideals of R and let I = ∪_{n=1}^∞ I_n. Then I is an ideal of R so that I = (a1, ..., am) for some a_i ∈ R. But a_i ∈ I = ∪_{n=1}^∞ I_n for 1 ≤ i ≤ m, so a_i ∈ I_{n_i} for some n_i. Since we have a chain of ideals, it follows that there is an n such that a_i ∈ I_n for all i. Thus, for any k ≥ n there is an inclusion

    (a1, ..., am) ⊆ I_k ⊆ I = (a1, ..., am)

so that I_k = I = I_n for all k ≥ n, and R satisfies the ACC on ideals. □
(5.11) Remark. In light of Definition 5.9 and Proposition 5.10, the Hilbert
basis theorem (Theorem 4.14) and its corollary (Corollary 4.15) are often
stated: if R is a Noetherian commutative ring, then the polynomial rings
R[X] and R[X₁, ..., X_n] are Noetherian.
By the ascending chain condition, this must stop at some (b_n). Therefore
b_n is a prime and we conclude that every a ≠ 0 ∈ R that is not a unit is
divisible by some prime.

Therefore, if a ≠ 0 is not a unit, write a = p₁c₁ where p₁ is a prime.
Thus (a) ⊊ (c₁). If c₁ is a unit, stop. Otherwise, write c₁ = p₂c₂ with p₂
a prime, so that (c₁) ⊊ (c₂). Continue in this fashion to obtain a chain of
ideals

    (a) ⊊ (c₁) ⊊ (c₂) ⊊ ⋯ .

By the ACC this must stop at some c_n, and by the construction it follows
that this c_n = u is a unit. Therefore,

    a = p₁c₁ = p₁p₂c₂ = ⋯ = p₁p₂ ⋯ p_n u

so that a is factored into a finite product of primes times a unit.
Now consider uniqueness of the factorization. Suppose that

    a = p₀p₁ ⋯ p_n = q₀q₁ ⋯ q_m

where p₁, ..., p_n, q₁, ..., q_m are primes while p₀ and q₀ are units of R. We
will use induction on k = min{m, n}. If n = 0 then a is a unit, so a = q₀ and
hence m = 0. Also m = 0 implies n = 0. Thus the result is true for k = 0.

Suppose that k > 0 and suppose that the result is true for all elements
b ∈ R that have a factorization with fewer than k prime elements. Then
p_n | q₀q₁ ⋯ q_m, so p_n divides some q_j since p_n is a prime element. After
reordering, if necessary, we can assume that p_n | q_m. But q_m is prime, so
q_m = p_n c implies that c is a unit. Thus, p_n and q_m are associates. Let

    a′ = p₀p₁ ⋯ p_{n−1} = (q₀c)q₁ ⋯ q_{m−1}.

Then a′ has a factorization with fewer than k prime factors, so the induction
hypothesis implies that n − 1 = m − 1 and q_i is an associate of p_{σ(i)} for
some permutation σ ∈ S_{n−1}, and the argument is complete. □
(5.14) Remark. Note that the proof of Theorem 5.12 actually shows that if
R is any commutative Noetherian ring, then every nonzero element a ∈ R
has a factorization into irreducible elements, i.e., any a ≠ 0 ∈ R can be factored
as a = up₁ ⋯ p_n where u is a unit of R and p₁, ..., p_n are irreducible (not
necessarily prime) elements, but this factorization is not necessarily unique; however,
this is not a particularly useful result.
(5.15) Examples.
(1) In F[X², X³] there are two different factorizations of X⁶ into irreducible
elements, namely, (X²)³ = X⁶ = (X³)².
(2) In Z[√−3] there is a factorization

    4 = 2 · 2 = (1 + √−3)(1 − √−3)

into two essentially different products of irreducibles.
(3) If F is a field and p(X) ∈ F[X] is an irreducible polynomial, then
p(X) is prime since F[X] is a PID. Thus the ideal (p(X)) is maximal
according to Corollary 5.8. Hence the quotient ring F[X]/(p(X)) is
a field. If p ∈ Z is a prime number, let F_p denote the field Z_p. Then
F₂[X]/(X² + X + 1) is a field with 4 elements, F₃[X]/(X² + 1) is a field
with 9 elements, while F₂[X]/(X³ + X + 1) is a field with 8 elements.
In fact, one can construct for any prime p ∈ Z and n ≥ 1 a field F_q
with q = pⁿ elements by producing an irreducible polynomial of degree
n in the polynomial ring F_p[X]. (It turns out that F_q is unique up to
isomorphism, but the proof of this requires Galois theory, which we do
not treat here.)
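As a concrete companion to example (3), this sketch (ours, not the text's; the pair representation of elements is an assumption of the example) realizes F₂[X]/(X² + X + 1) and confirms it is a field with 4 elements:

    # Arithmetic in F_4 = F_2[X]/(X^2 + X + 1): elements are pairs (a0, a1)
    # representing a0 + a1*X, with coefficients taken mod 2.

    def mul(u, v):
        a0, a1 = u
        b0, b1 = v
        # (a0 + a1 X)(b0 + b1 X) = a0 b0 + (a0 b1 + a1 b0) X + a1 b1 X^2,
        # and X^2 = X + 1 in the quotient ring.
        return ((a0 * b0 + a1 * b1) % 2, (a0 * b1 + a1 * b0 + a1 * b1) % 2)

    elements = [(0, 0), (1, 0), (0, 1), (1, 1)]
    one = (1, 0)

    # every nonzero element has a multiplicative inverse, so this is a field
    for u in elements[1:]:
        assert any(mul(u, v) == one for v in elements[1:])
    print("F_2[X]/(X^2+X+1) is a field with", len(elements), "elements")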
(4) Let H(C) be the ring of complex analytic functions on the entire complex
plane. (Consult any textbook on complex analysis for verification
of the basic properties of the ring H(C).) The units of H(C) are precisely
the complex analytic functions f : C → C such that f(z) ≠ 0
for all z ∈ C. Furthermore, if a ∈ C then the function z − a divides
f(z) ∈ H(C) if and only if f(a) = 0. From this it is easy to see (exercise)
that the irreducible elements of H(C) are precisely the functions
(z − a)f(z) where f(z) ≠ 0 for all z ∈ C. Thus, a complex analytic
function g(z) can be written as a finite product of irreducible elements
if and only if g has only finitely many zeros. Therefore, the complex
analytic function sin(z) cannot be written as a finite product of irreducible
elements in the ring H(C). (Incidentally, according to Remark
5.14, this shows that the ring H(C) is not Noetherian.)
(5) Let R be the subring of Q[X] consisting of all polynomials whose constant
term is an integer, i.e.,

    R = {f(X) ∈ Q[X] : f(0) ∈ Z}.
(5.17) Examples.
(1) Z together with v(n) = |n| is a Euclidean domain.
(2) If F is a field, then F[X] together with v(p(X)) = deg(p(X)) is a
Euclidean domain (Theorem 4.4).
(5.18) Lemma. If R is a Euclidean domain and a ∈ R \ {0}, then v(1) ≤ v(a).
Furthermore, v(1) = v(a) if and only if a is a unit.

Proof. First note that v(1) ≤ v(1 · a) = v(a). If a is a unit, let ab = 1. Then
v(a) ≤ v(ab) = v(1), so v(1) = v(a). Conversely, suppose that v(1) = v(a)
and divide 1 by a. Thus, 1 = aq + r where r = 0 or v(r) < v(a) = v(1). But
v(1) ≤ v(r) for any r ≠ 0, so the latter possibility cannot occur. Therefore,
r = 0, so 1 = aq and a is a unit. □
This set has a smallest element n₀. Choose a ∈ I with v(a) = n₀. We claim
that I = Ra. Since a ∈ I, it is certainly true that Ra ⊆ I. Now let b ∈ I.
Then b = aq + r for q, r ∈ R with r = 0 or v(r) < v(a). But r = b − aq ∈ I,
so v(a) ≤ v(r) if r ≠ 0. Therefore, we must have r = 0 so that b = aq ∈ Ra.
Hence, I = Ra is principal. □
Since v(a₂) > v(a₃) > v(a₄) > ⋯, this process must terminate after a finite
number of steps, i.e., a_{n+1} = 0 for some n. For this n we have

    a_{n−1} = a_n q_{n−1} + 0.

Claim. a_n = gcd{a₁, a₂}.

Proof. If a, b ∈ R then denote the gcd of {a, b} by the symbol (a, b). Theorem
5.5 shows that the gcd of {a, b} is a generator of the ideal generated
by {a, b}.

Now we claim that (a_i, a_{i+1}) = (a_{i+1}, a_{i+2}) for 1 ≤ i ≤ n − 1. Since
a_i = a_{i+1}q_i + a_{i+2}, it follows that
This result gives an algorithmic procedure for computing the gcd of two
elements in a Euclidean domain. By reversing the sequence of steps used to
compute d = (a, b) one can arrive at an explicit expression d = ra + sb for
the gcd of a and b. We illustrate with some examples.
(5.20) Examples.
(1) We use the Euclidean algorithm to compute (1254, 1110) and write it
as r · 1254 + s · 1110. Using successive divisions we get
    1254 = 1110 · 1 + 144
    1110 = 144 · 7 + 102
    144 = 102 · 1 + 42
    102 = 42 · 2 + 18
    42 = 18 · 2 + 6
    18 = 6 · 3 + 0.

Thus, (1254, 1110) = 6. Working backward by substituting into successive
remainders, we obtain

    6 = 42 − 18 · 2
      = 42 − (102 − 42 · 2) · 2 = 5 · 42 − 2 · 102
      = 5 · (144 − 102) − 2 · 102 = 5 · 144 − 7 · 102
      = 5 · 144 − 7 · (1110 − 144 · 7) = 54 · 144 − 7 · 1110
      = 54 · (1254 − 1110) − 7 · 1110
      = 54 · 1254 − 61 · 1110.
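The backward substitution can be automated with the extended Euclidean algorithm; the following sketch is our own illustration (the name ext_gcd is invented here):

    # Extended Euclidean algorithm over Z: returns (d, r, s) with
    # d = gcd(a, b) and d = r*a + s*b.

    def ext_gcd(a, b):
        r0, s0, t0 = a, 1, 0   # invariant: r0 = s0*a + t0*b
        r1, s1, t1 = b, 0, 1   # invariant: r1 = s1*a + t1*b
        while r1 != 0:
            q = r0 // r1
            r0, r1 = r1, r0 - q * r1
            s0, s1 = s1, s0 - q * s1
            t0, t1 = t1, t0 - q * t1
        return r0, s0, t0

    d, r, s = ext_gcd(1254, 1110)
    print(d, r, s)                      # 6 54 -61
    assert d == r * 1254 + s * 1110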
(2) Let f(X) = X² − X + 1 and g(X) = X³ + 2X² + 2 ∈ Z₃[X]. Then

    g(X) = X f(X) + 2X + 2

and

    f(X) = (2X + 2)².

Thus, (f(X), g(X)) = (2X + 2) = (X + 1) and X + 1 = 2g(X) −
2X f(X).
(3) We now give an example of how to solve a system of congruences in a
Euclidean domain by using the Euclidean algorithm. The reader should
refer back to the proof of the Chinese remainder theorem (Theorem
2.24) for the logic of our argument.

Consider the following system of congruences (in Z):

    x ≡ −1 (mod 15)
    x ≡ 3 (mod 11)
    x ≡ 6 (mod 8).

Applying the Euclidean algorithm to the pair (15, 88) gives

    1 = 13 − 2 · 6 = 13 − (15 − 13) · 6
      = −15 · 6 + 13 · 7 = −15 · 6 + (88 − 15 · 5) · 7 = 88 · 7 − 15 · 41
      = 616 − 15 · 41,

so 616 ≡ 1 (mod 15) and 616 ≡ 0 (mod 88). Similarly, by applying the
Euclidean algorithm to the pair (120, 11), we obtain that −120 ≡ 1
(mod 11) and −120 ≡ 0 (mod 120), and by applying it to the pair
(165, 8), we obtain −495 ≡ 1 (mod 8) and −495 ≡ 0 (mod 165). Then
our solution is

    x ≡ −1 · (616) + 3 · (−120) + 6 · (−495) = −3946 (mod 1320)

or, more simply,

    x ≡ 14 (mod 1320).
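The whole computation follows the proof of the Chinese remainder theorem and can be scripted; a brief sketch (ours, not the text's; crt and ext_gcd are invented names, and pairwise coprime moduli are assumed):

    # Illustrative CRT solver, following the strategy of the example: for each
    # modulus m, find e with e = 1 (mod m) and e = 0 (mod M/m).

    def ext_gcd(a, b):
        if b == 0:
            return a, 1, 0
        d, r, s = ext_gcd(b, a % b)
        return d, s, r - (a // b) * s

    def crt(residues, moduli):
        M = 1
        for m in moduli:
            M *= m
        x = 0
        for a, m in zip(residues, moduli):
            n = M // m                      # product of the other moduli
            d, r, s = ext_gcd(m, n)         # r*m + s*n = 1 since gcd = 1
            x += a * s * n                  # s*n = 1 (mod m), = 0 (mod n)
        return x % M

    print(crt([-1, 3, 6], [15, 11, 8]))     # 14, as computed above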
and since v(α), v(β) ∈ N it follows that v(α) = v(β) = 2. Thus if 2 is not
irreducible in Z[√d], there is a number α = a + b√d ∈ Z[√d] such that

    v(α) = a² − db² = ±2.

But if d ≤ −3 and b ≠ 0 then

    a² − db² = a² + (−d)b² ≥ 0 + 3 · 1 = 3 > 2,

while if b = 0 then

    a² − db² = a² ≠ ±2
(5.24) Remarks.
(1) A complex number is algebraic if it is a root of a polynomial with
integer coefficients, and a subfield F ⊆ C is said to be algebraic if
every element of F is algebraic. If F is algebraic, the integers of F are
those elements of F that are roots of a monic polynomial with integer
coefficients. In the quadratic field F = Q(√−3), every element of the
ring Z[√−3] is an integer of F, but it is not true, as one might expect,
that Z[√−3] is all of the integers of F. In fact, the following can be
shown: Let d ≠ 0, 1 be a square-free integer with d ≡ 1 (mod 4). Then the
ring of integers of the field Q(√d) is

    R = {a + b(1 + √d)/2 : a, b ∈ Z}.

We leave it as an exercise for the reader to prove that for d = −3 the
ring R is in fact a Euclidean domain. (Compare with Proposition 5.22.)
(2) So far all the examples we have seen of principal ideal domains have
been Euclidean domains. Let

    R = {a + b(1 + √−19)/2 : a, b ∈ Z}.

It can be shown that R is a PID that is not a Euclidean domain.
Write

    a = p₁^{m₁} ⋯ p_r^{m_r}  and  b = p₁^{n₁} ⋯ p_r^{n_r}

where 0 ≤ m_i and 0 ≤ n_i for each i. Let k_i = min{m_i, n_i} for 1 ≤ i ≤ r
and let

    d = p₁^{k₁} ⋯ p_r^{k_r}.

We claim that d is a gcd of a and b. It is clear that d | a and d | b, so let e
be any other common divisor of a and b. Since e | a, we may write a = ec
Our goal is to prove that if R is a UFD then the polynomial ring R[X]
is also a UFD. This will require some preliminaries.

(6.3) Definition. Let R be a UFD and let f(X) ≠ 0 ∈ R[X]. A gcd of the
coefficients of f(X) is called the content of f(X) and is denoted cont(f(X)).
The polynomial f(X) is said to be primitive if cont(f(X)) = 1.
(6.4) Lemma. (Gauss's lemma) Let R be a UFD and let f(X), g(X) be
nonzero polynomials in R[X]. Then

    cont(f(X)g(X)) = cont(f(X)) cont(g(X)).

In particular, if f(X) and g(X) are primitive, then the product f(X)g(X)
is primitive.

Proof. Write f(X) = cont(f(X))f₁(X) and g(X) = cont(g(X))g₁(X)
where f₁(X) and g₁(X) are primitive polynomials. Then

    f(X)g(X) = cont(f(X)) cont(g(X)) f₁(X)g₁(X),

so it is sufficient to check that f₁(X)g₁(X) is primitive. Now let

    f₁(X) = a₀ + a₁X + ⋯ + a_mX^m

and

    g₁(X) = b₀ + b₁X + ⋯ + b_nXⁿ,

and suppose that the coefficients of f₁(X)g₁(X) have a common divisor d
other than a unit. Let p be a prime divisor of d. Then p must divide all of
the coefficients of f₁(X)g₁(X), but since f₁(X) and g₁(X) are primitive,
p does not divide all the coefficients of f₁(X) nor all of the coefficients of
g₁(X). Let a_r be the first coefficient of f₁(X) not divisible by p and let b_s
be the first coefficient of g₁(X) not divisible by p. Consider the coefficient
of X^{r+s} in f₁(X)g₁(X). This coefficient is of the form

    a_r b_s + (a_{r+1}b_{s−1} + a_{r+2}b_{s−2} + ⋯) + (a_{r−1}b_{s+1} + a_{r−2}b_{s+2} + ⋯).

Every term after a_r b_s is divisible by p, but p does not divide a_r b_s, so p
cannot divide this coefficient, which is a contradiction. □
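Gauss's lemma is easy to sanity-check by machine for R = Z; the following small verification is our own illustration (content and poly_mul are invented names; polynomials are stored as coefficient lists):

    # Check that the content (gcd of coefficients) is multiplicative over Z.
    from math import gcd
    from functools import reduce

    def content(poly):
        return reduce(gcd, poly)

    def poly_mul(f, g):
        h = [0] * (len(f) + len(g) - 1)
        for i, a in enumerate(f):
            for j, b in enumerate(g):
                h[i + j] += a * b
        return h

    f = [6, 10, 4]        # 6 + 10X + 4X^2, content 2
    g = [9, 3]            # 9 + 3X, content 3
    assert content(poly_mul(f, g)) == content(f) * content(g)  # both are 6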
The content of the left side is ad and the content of the right side is cb, so
ad = ucb where u ∈ R is a unit. Substituting this in Equation (6.1) and
dividing by cb gives

    u f₂(X) = f₁(X).

Thus the two polynomials differ by multiplication by the unit u ∈ R and
the scalars satisfy β = a/b = u(c/d) = uα.

    f(X) = a₁a₂f₁(X)f₂(X),
(6.7) Corollary. Let R be a UFD with quotient field F. Then the irreducible
elements of R[X] are the irreducible elements of R and the primitive polynomials
f(X) ∈ R[X] which are irreducible in F[X].
Proof. Write

    f(X) = c g₁(X) ⋯ g_r(X)

with each g_i(X) primitive, and factor c = u d₁ ⋯ d_s into a unit u and
irreducible elements d_i of R; this gives a factorization

    f(X) = u d₁ ⋯ d_s g₁(X) ⋯ g_r(X)

of f(X) into a product of irreducible elements of R[X].
It remains to check uniqueness. Thus suppose we also have a factorization

    f(X) = v b₁ ⋯ b_t q₁′(X) ⋯ q_k′(X)

where each q_i′(X) is a primitive polynomial in R[X] and each b_i is an irreducible
element of R. By Corollary 6.7, this is what any factorization into
irreducible elements of R[X] must look like. Since this also gives a factorization
in F[X] and factorization there is unique, it follows that r = k and
q_i(X) is an associate of q_i′(X) in F[X] (after reordering if necessary). But
if primitive polynomials are associates in F[X], then they are associates in
R[X] (Lemma 6.5). Furthermore,
(6.9) Corollary. Let R be a UFD. Then R[X₁, ..., X_n] is also a UFD. In
particular, F[X₁, ..., X_n] is a UFD for any field F.

Proof. Exercise. □
(6.10) Example. We have seen some examples in Section 2.5 of rings that
are not UFDs, namely, F[X², X³] and some of the quadratic rings (see
Proposition 5.22 and Lemma 5.23). We wish to present one more example
of a Noetherian function ring that is not a UFD. Let

    S¹ = {(x, y) ∈ R² : x² + y² = 1}

be the unit circle in R² and let I ⊆ R[X, Y] be the set of all polynomials
f(X, Y) such that f(x, y) = 0 for all (x, y) ∈ S¹. Then I is a prime ideal of R[X, Y]
and R[X, Y]/I can be viewed as a ring of functions on S¹ by means of

    f(X, Y) + I ↦ f̄

where f̄(x, y) = f(x, y). We leave it for the reader to check that this is
well defined. Let T be the set of all f(X, Y) + I ∈ R[X, Y]/I such that
f̄(x, y) ≠ 0 for all (x, y) ∈ S¹. Then let the ring R be defined by localizing
at the multiplicatively closed set T, i.e.,

    R = (R[X, Y]/I)_T.
(6.12) Corollary. Let p be a prime number and let f_p(X) ∈ Q[X] be the
polynomial

    f_p(X) = X^{p−1} + X^{p−2} + ⋯ + X + 1.

Then f_p(X) is irreducible in Q[X].
2.7 Exercises
10. Let R be a ring and let R^op ("op" for opposite) be the abelian group R,
together with a new multiplication a ∘ b defined by a ∘ b = ba, where ba
denotes the given multiplication on R. Verify that R^op is a ring and that the
identity function 1_R : R → R^op is a ring homomorphism if and only if R is
a commutative ring.
11. (a) Let A be an abelian group. Show that End(A) is a ring. (End(A) is
defined in Example 1.10 (11).)
(b) Let F be a field and V an F-vector space. Show that End_F(V) is a ring.
Here, End_F(V) denotes the set of all F-linear endomorphisms of V, i.e.,

    End_F(V) = {h ∈ End(V) : h(av) = ah(v) for all v ∈ V, a ∈ F}.

In this definition, End(V) means the abelian group endomorphisms of V,
and the ring structure is the same as that of End(V), as in part (a).
12. (a) Let R be a ring without zero divisors. Show that if ab = 1 then ba = 1
as well. Thus, a and b are units of R.
(b) Show that if a, b, c ∈ R with ab = 1 and ca = 1, then b = c, and thus
a (and b) are units. Conclude that if ab = 1 but ba ≠ 1, then neither a
nor b are units.
(c) Let F be a field and F[X] the polynomial ring with coefficients from
F. F[X] is an F-vector space in a natural way, so by Exercise 11, R =
End_F(F[X]) is a ring. Give an example of a, b ∈ R with ab = 1 but
ba ≠ 1.
13. (a) Let x and y be arbitrary positive real numbers. Show that the quaternion
algebras Q(−x, −y; R) and Q(−1, −1; R) are isomorphic.
(b) Show that the quaternion algebras Q(−1, −3; Q), Q(−1, −7; Q), and
Q(−1, −11; Q) are all distinct.
(c) Analogously to Example 1.10 (10), we may define indefinite quaternion
algebras by allowing x or y to be negative. Show that, for any nonzero
real number x and any subfield F of R, Q(1, x; F) and Q(1, −x; F) are
isomorphic.
(d) Show that for any nonzero real number x, Q(1, x; R) and Q(1, 1; R) are
isomorphic.
(e) Show that for any subfield F of R, Q(1, 1; F) is isomorphic to M₂(F),
the ring of 2 × 2 matrices with coefficients in F. (Thus, Q(1, 1; F) is not
a division ring.)
14. Verify that Z[i]/(3 + i) ≅ Z₁₀.
15. Let F be a field and let R ⊆ F[X] × F[Y] be the subring consisting of all
pairs (f(X), g(Y)) such that f(0) = g(0). Verify that F[X, Y]/(XY) ≅ R.
16. Let R be a ring with identity and let I be an ideal of R. Prove that
5_
17. Let F be a field and let R = {[ a b ; 0 0 ] ∈ M₂(F)}. Verify that R is a ring. Does
R have an identity? Prove that the set I = {[ 0 b ; 0 0 ] ∈ R} is a maximal ideal
of R.
18. (a) Given the complex number z = 1 + i, let φ : R[X] → C be the substitution
homomorphism determined by z. Compute Ker(φ).
(b) Give an explicit isomorphism between the complex numbers C and the
quotient ring R[X]/(X² − 2X + 2).
19. Let R = C([0, 1]) be the ring of continuous real-valued functions on the
interval [0, 1]. Let T ⊆ [0, 1], and let

    I(T) = {f ∈ R : f(x) = 0 for all x ∈ T}.

(a) Prove that I(T) is an ideal of R.
(b) If x ∈ [0, 1] and M_x = I({x}), prove that M_x is a maximal ideal of R
and R/M_x ≅ R.
20. Let S be a commutative ring, let R ⊆ S be a subring, and let d ∈ R be an
element with a square root √d ∈ S. Define

    R[√d] = {a + b√d : a, b ∈ R} ⊆ S.

(a) Prove that R[√d] is a commutative ring with identity.
(b) Prove that Z[√d] is an integral domain.
(c) If F is a field, prove that F[√d] is a field.
21. (a) If R = Z or R = Q and d is not a square in R, show that R[√d] ≅
R[X]/(X² − d) where (X² − d) is the principal ideal of R[X] generated
by X² − d.
(b) If R = Z or R = Q and d₁, d₂, and d₁/d₂ are not squares in R \ {0},
show that R[√d₁] and R[√d₂] are not isomorphic.
(The most desirable proof of these assertions is one which works for both
R = Z and R = Q, but separate proofs for the two cases are acceptable.)
(c) Let R₁ = Z_p[X]/(X² − 2) and R₂ = Z_p[X]/(X² − 3). Determine if
R₁ ≅ R₂ in case p = 2, p = 5, or p = 11.
22. Recall that R* denotes the group of units of the ring R.
(a) Show that (Z[√−1])* = {±1, ±i}.
(b) If d < −1 show that (Z[√d])* = {±1}.
(c) Show that
(d) Let d > 0 ∈ Z not be a perfect square. Show that if Z[√d] has one unit
other than ±1, it has infinitely many.
(e) It is known that the hypothesis in part (d) is always satisfied. Find a
unit in Z[√d] other than ±1 for 2 ≤ d ≤ 15, d ≠ 4, 9.
23. Let F ⊇ Q be a field. An element a ∈ F is said to be an algebraic integer if
for some monic polynomial p(X) ∈ Z[X], we have p(a) = 0. Let d ∈ Z be a
nonsquare.
(a) Show that if a ∈ F is an algebraic integer, then a is a root of an irreducible
monic polynomial p(X) ∈ Z[X]. (Hint: Gauss's lemma.)
(b) Verify that a + b√d ∈ Q[√d] is a root of the quadratic polynomial

    p(X) = X² − 2aX + (a² − b²d) ∈ Q[X].

(c) Determine the set of algebraic integers in the fields Q[√−1] and Q[√−3].
(See Remark 5.24 (1).)
24. If R is a ring with identity, then Aut(R), called the automorphism group of
R, denotes the set of all ring isomorphisms φ : R → R.
    φ : F[X, Y] → F[T]

such that φ(X) = T² and φ(Y) = T³. Show that Ker(φ) is the principal
ideal generated by Y² − X³. What is Im(φ)?
38. Prove Proposition 4.10 (Lagrange interpolation) as a corollary of the Chinese
remainder theorem.
39. Let F be a field and let a₀, a₁, ..., a_n be n + 1 distinct elements of F.
If f : F → F is a function, define the successive divided differences of f,
denoted f[a₀, ..., a_k], by means of the inductive formula:

    f[a₀] = f(a₀)
    f[a₀, a₁] = (f[a₁] − f[a₀])/(a₁ − a₀)
    f[a₀, ..., a_k] = (f[a₁, ..., a_k] − f[a₀, ..., a_{k−1}])/(a_k − a₀).

    f = f₀ + f₁g + ⋯ + f_d g^d.
42. Let K and L be fields with K ⊆ L. Suppose that f(X), g(X) ∈ K[X].
(a) If f(X) divides g(X) in L[X], prove that f(X) divides g(X) in K[X].
(b) Prove that the greatest common divisor of f(X) and g(X) in K[X] is
the same as the greatest common divisor of f(X) and g(X) in L[X]. (We
will always choose the monic generator of the ideal (f(X), g(X)) as the
greatest common divisor in a polynomial ring over a field.)
43. (a) Suppose that R is a Noetherian ring and I ⊆ R is an ideal. Show that
R/I is Noetherian.
(b) If R is Noetherian and S is a subring of R, is S Noetherian?
(c) Suppose that R is a commutative Noetherian ring and S is a nonempty
multiplicatively closed subset of R containing no zero divisors. Prove
that R_S (the localization of R away from S) is also Noetherian.
The two equations represent the left and right divisions of f(X) by g(X). In
the special case that g(X) = X − a for a ∈ R, prove the following version of
these equations (noncommutative remainder theorem):

Let f(X) = a₀ + a₁X + ⋯ + a_nXⁿ ∈ R[X] and let a ∈ R. Then there are
unique q_L(X) and q_R(X) ∈ R[X] such that

    f(X) = q_R(X)(X − a) + f_R(a)

and

    f(X) = (X − a)q_L(X) + f_L(a)

where

    f_R(a) = Σ_{k=0}^{n} a_k a^k  and  f_L(a) = Σ_{k=0}^{n} a^k a_k

are, respectively, the right and left evaluations of f(X) at a ∈ R. (Hint: Use
the formula

    X^k − a^k = (X^{k−1} + X^{k−2}a + ⋯ + Xa^{k−2} + a^{k−1})(X − a)
              = (X − a)(X^{k−1} + aX^{k−2} + ⋯ + a^{k−2}X + a^{k−1}).

Then multiply on either the left or the right by a_k, and sum over k to get
the division formulas and the remainders.)
45. Let R be a UFD and let a and b be nonzero elements of R. Show that
ab = [a, b](a, b) where [a, b] = lcm{a, b} and (a, b) = gcd{a, b}.
46. Let R be a UFD. Show that d is a gcd of a and b (a, b ∈ R \ {0}) if and only
if d divides both a and b and there is no prime p dividing both a/d and b/d.
(In particular, a and b are relatively prime if and only if there is no prime p
dividing both a and b.)
47. Let R be a UFD and let {r_i}_{i=1}^{n} be a finite set of pairwise relatively prime
nonzero elements of R (i.e., r_i and r_j are relatively prime whenever i ≠ j).
Let a = ∏_{i=1}^{n} r_i and let a_i = a/r_i. Show that the set {a_i}_{i=1}^{n} is relatively
prime.
48. Let R be a UFD and let F be the quotient field of R. Show that d ∈ R is a
square in R if and only if it is a square in F (i.e., if the equation a² = d has
a solution with a ∈ F then, in fact, a ∈ R). Give a counterexample if R is
not a UFD.
49. Let x, y, z be integers with gcd{x, y, z} = 1. Show that there is an integer a
such that gcd{x + ay, z} = 1. (Hint: The Chinese remainder theorem may
be helpful.)
53. For what values of a (mod 77) does the following system of simultaneous
congruences have a solution?

    x ≡ 6 (mod 21)
    x ≡ 9 (mod 33)
    x ≡ a (mod 77).
54. (a) Solve the following system of simultaneous congruences in Q[X]:

    f(X) ≡ −3 (mod X + 1)
    f(X) ≡ 12X (mod X² − 2)
    f(X) ≡ −4X (mod X³).

(b) Solve this system in Z₅[X].
(c) Solve this system in Z₃[X].
(d) Solve this system in Z₂[X].
55. Suppose that m₁, m₂ ∈ Z are not relatively prime. Then prove that there are
integers a₁, a₂ for which there is no solution to the system of congruences:

    x ≡ a₁ (mod m₁)
    x ≡ a₂ (mod m₂).
56. Let R be a UFD. Prove that

    f(X, Y) = X⁴ + 2Y²X³ + 3Y³X² + 4YX + 5Y + 6Y²

is irreducible in the polynomial ring R[X, Y].
57. Prove that if R is a UFD and if f(X) is a monic polynomial with a root in
the quotient field of R, then that root is in R. (This result is usually called
the rational root theorem.)
58. Suppose that R is a UFD and S ⊆ R \ {0} is a multiplicatively closed subset.
Prove that R_S is a UFD.
59. Let F be a field and let F[[X]] be the ring of formal power series with
coefficients in F. If f = Σ_{n=0}^{∞} a_nXⁿ ≠ 0 ∈ F[[X]], let o(f) = min{n : a_n ≠ 0}
and define o(0) = ∞. o(f) is usually called the order of the power series f.
Prove the following facts:
(a) o(fg) = o(f) + o(g).
(cr) m(ab) = (ma)b.
(dr) m1 = m.
(1.2) Remarks.
(1) If R is a commutative ring then any left R-module also has the structure
of a right R-module by defining mr = rm. The only axiom that
requires a check is axiom (cr). But

    m(ab) = (ab)m = (ba)m = b(am) = b(ma) = (ma)b.

(2) More generally, if the ring R has an antiautomorphism (that is, an
additive homomorphism φ : R → R such that φ(ab) = φ(b)φ(a)) then
any left R-module has the structure of a right R-module by defining
ma = φ(a)m. Again, the only axiom that needs checking is axiom (cr):

    (ma)b = φ(b)(ma)
          = φ(b)(φ(a)m)
          = (φ(b)φ(a))m
          = φ(ab)m
          = m(ab).

An example of this situation occurs for the group ring R(G) where R
is a ring with identity and G is a group (see Example 2.1.10 (15)). In
this case the antiautomorphism is given by

    φ(Σ_{g∈G} a_g g) = Σ_{g∈G} a_g g^{−1}.
The theories of left R-modules and right R-modules are entirely par-
allel, and so, to avoid doing everything twice, we must choose to work on
one side or the other. Thus, we shall work primarily with left R-modules
unless explicitly indicated otherwise and we will define an R-module (or
module over R) to be a left R -module. (Of course, if R is commutative, Re-
mark 1.2 (1) shows there is no difference between left and right R-modules.)
Applications of module theory to the theory of group representations will,
however, necessitate the use of both left and right modules over noncommu-
tative rings. Before presenting a collection of examples some more notation
will be introduced.
(1.4) Definition.
(1) Let F be a field. Then an F-module V is called a vector space over F.
(2) If V and W are vector spaces over the field F then a linear transfor-
mation from V to W is an F-module homomorphism from V to W.
(1.5) Examples.
(1) Let G be any abelian group and let g ∈ G. If n ∈ Z then define the
scalar multiplication ng by

    ng = g + ⋯ + g (n terms) if n > 0;
    ng = 0 if n = 0;
    ng = (−g) + ⋯ + (−g) (−n terms) if n < 0.

Using this scalar multiplication G is a Z-module. Furthermore, if G
and H are abelian groups and f : G → H is a group homomorphism,
then f is also a Z-module homomorphism since (if n > 0)

    f(ng) = f(g + ⋯ + g) = f(g) + ⋯ + f(g) = nf(g).
(2) Let R be an arbitrary ring. Then Rⁿ is both a left and a right R-module
via the scalar multiplications

    a(b₁, ..., b_n) = (ab₁, ..., ab_n)

and

    (b₁, ..., b_n)a = (b₁a, ..., b_na).

(3) Let R be an arbitrary ring. Then the set of matrices M_{m,n}(R) is both
a left and a right R-module via left and right scalar multiplication of
matrices, i.e.,

    ent_{ij}(aA) = a ent_{ij}(A)

and

    ent_{ij}(Aa) = (ent_{ij}(A))a.
(4) As a generalization of the above example, the matrix multiplication
maps

    R × R/I → R/I,  (a, b + I) ↦ ab + I,

and

    R/I × R → R/I,  (a + I, b) ↦ ab + I.
(7) M is defined to be an R-algebra if M is both an R-module and a ring,
with the ring addition being the same as the module addition, and the
multiplication on M and the scalar multiplication by R satisfying the
following identity: For every r ∈ R, m₁, m₂ ∈ M,

    r(m₁m₂) = (rm₁)m₂ = m₁(rm₂).
    Hom_R(R, M) ≅ M

as Z-modules via the map Φ : Hom_R(R, M) → M where Φ(f) = f(1).
(11) Let R be a commutative ring, let M be an R-module, and let S ⊆
End_R(M) be a subring. (Recall from Example 1.5 (8) that End_R(M)
is a ring, in fact, an R-algebra.) Then M is an S-module by means of
the scalar multiplication map S × M → M defined by (f, m) ↦ f(m).
(12) As an important special case of Example 1.5 (11), let T ∈ End_R(M)
and define a ring homomorphism φ : R[X] → End_R(M) by sending
X to T and a ∈ R to a1_M. (See the polynomial substitution theorem
(Theorem 2.4.1).) Thus, if

    f(X) = a₀ + a₁X + ⋯ + a_mX^m,

then

    φ(f(X)) = a₀1_M + a₁T + ⋯ + a_mT^m.

We will denote φ(f(X)) by the symbol f(T) and we let Im(φ) = R[T].
That is, R[T] is the subring of End_R(M) consisting of "polynomials"
in T. Then M is an R[T]-module by means of the multiplication

    f(T)m = f(T)(m).

Using the homomorphism φ : R[X] → R[T] we see that M is an R[X]-module
using the scalar multiplication

    f(X)m = f(T)(m).

This example is an extremely important one. It provides the basis for
applying the theory of modules over principal ideal domains to the
study of linear transformations; it will be developed fully in Section
4.4.
(13) We will present a concrete example of the situation presented in Example
1.5 (12). Let F be a field and define a linear transformation
T : F² → F² by T(u₁, u₂) = (u₂, 0). Then T² = 0, so if f(X) =
a₀ + a₁X + ⋯ + a_mX^m ∈ F[X], it follows that f(T) = a₀1_{F²} + a₁T.
Therefore the scalar multiplication f(X)u for u ∈ F² is given by

    f(X)(u₁, u₂) = a₀(u₁, u₂) + a₁(u₂, 0) = (a₀u₁ + a₁u₂, a₀u₂).
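As a quick check of this formula (our own illustration, not part of the text; F is taken to be Q and vectors are stored as pairs), one can compute f(X) · u directly and see that only a₀ and a₁ matter:

    # The F[X]-module structure on F^2 induced by T(u1, u2) = (u2, 0).

    def T(u):
        u1, u2 = u
        return (u2, 0.0)

    def act(coeffs, u):
        # compute f(X) . u = f(T)(u) for f = coeffs[0] + coeffs[1] X + ...
        result, power = (0.0, 0.0), u
        for a in coeffs:
            result = (result[0] + a * power[0], result[1] + a * power[1])
            power = T(power)      # next power of T applied to u
        return result

    u = (2.0, 5.0)
    # T^2 = 0, so only a0 and a1 matter: f(T)u = a0*u + a1*T(u)
    assert act([3.0, 4.0, 7.0, 9.0], u) == (3.0 * 2.0 + 4.0 * 5.0, 3.0 * 5.0)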
(2.2) Examples.
(1) If R is any ring then the R-submodules of the R-module R are precisely
the left ideals of the ring R.
(2) If G is any abelian group then G is a Z-module and the Z-submodules
of G are just the subgroups of G.
(3) Let f : M → N be an R-module homomorphism. Then Ker(f) ⊆ M
and Im(f) ⊆ N are R-submodules (exercise).
(4) Continuing with Example 1.5 (12), let V be a vector space over a
field F and let T ∈ End_F(V) be a fixed linear transformation. Let V_T
denote V with the F[X]-module structure determined by the linear
transformation T. Then a subset W ⊆ V is an F[X]-submodule of the
module V_T if and only if W is a linear subspace of V and T(W) ⊆ W,
i.e., W must be a T-invariant subspace of V. To see this, note that
X · w = T(w), and if a ∈ F, then a · w = aw; that is to say, the
action of the constant polynomial a ∈ F[X] on V is just ordinary
scalar multiplication, while the action of the polynomial X on V is
the action of T on V. Thus, an F[X]-submodule of V_T must be a T-invariant
subspace of V. Conversely, if W is a linear subspace of V
such that T(W) ⊆ W then T^m(W) ⊆ W for all m ≥ 1. Hence, if
f(X) ∈ F[X] and w ∈ W then f(X) · w = f(T)(w) ∈ W so that W is
closed under scalar multiplication and thus W is an F[X]-submodule
of V_T.
    (N + P)/P ≅ N/(N ∩ P).

Proof. Let π : M → M/P be the natural projection map and let π₀ be
the restriction of π to N. Then π₀ is an R-module homomorphism with
Ker(π₀) = N ∩ P and Im(π₀) = (N + P)/P. The result then follows from
the first isomorphism theorem. □
rather than ({x₁, ..., x_n}) for the submodule generated by S. There is the
following simple description of (S).
(2.11) Remarks.
(1) We have µ({0}) = 0 by Lemma 2.9 (1), and M ≠ {0} is cyclic if and
only if µ(M) = 1.
(2) The concept of cyclic R-module generalizes the concept of cyclic group.
Thus an abelian group G is cyclic (as an abelian group) if and only if
it is a cyclic Z-module.
(3) If R is a PID, then any R-submodule M of R is an ideal, so µ(M) ≤ 1.
(4) For a general ring R, it is not necessarily the case that if N is a submodule
of the R-module M, then µ(N) ≤ µ(M). For example, if R is
a polynomial ring over a field F in k variables, M = R, and N ⊆ M
is the submodule consisting of polynomials whose constant term is 0,
then µ(M) = 1 but µ(N) = k. Note that this holds even if k = ∞. We
shall prove in Corollary 6.4 that this phenomenon cannot occur if R is
a PID. Also see Remark 6.5.
Proof. Let S = {x₁, ..., x_k} ⊆ N be a minimal generating set for N and, if
π : M → M/N is the natural projection map, choose T = {y₁, ..., y_t} ⊆ M
so that {π(y₁), ..., π(y_t)} is a minimal generating set for M/N. We claim
that S ∪ T generates M, so that µ(M) ≤ k + t = µ(N) + µ(M/N). To see
this suppose that x ∈ M. Then π(x) = a₁π(y₁) + ⋯ + a_tπ(y_t). Let y =
    IM = {Σ_{i=1}^{n} a_i m_i : n ∈ N, a_i ∈ I, m_i ∈ M}.
group is torsion. The converse is not true. For a concrete example, take
G = Q/Z. Then |G| = ∞, but every element of Q/Z has finite order
since q(p/q + Z) = p + Z = 0 ∈ Q/Z. Thus (Q/Z)_τ = Q/Z.
(2) An abelian group is torsion-free if it has no elements of finite order
other than 0. As an example, take G = Zⁿ for any natural number n.
Another useful example to keep in mind is the additive group Q.
(3) Let V = F² and consider the linear transformation T : F² → F²
defined by T(u₁, u₂) = (u₂, 0). See Example 1.5 (13). Then the F[X]-module
V_T determined by T is a torsion module. In fact Ann(V_T) =
(X²). To see this, note that T² = 0, so X² · u = 0 for all u ∈ V. Thus,
(X²) ⊆ Ann(V_T). The only ideals of F[X] properly containing (X²)
are (X) and the whole ring F[X], but X ∉ Ann(V_T) since X · (0, 1) =
(1, 0) ≠ (0, 0). Therefore, Ann(V_T) = (X²).
The following two observations are frequently useful; the proofs are left
as exercises:
Proof. Exercise.
(2.21) Proposition. Let F be a field and let V be a vector space over F, i.e.,
an F-module. Then V is torsion-free.
Proof. Exercise.
    (x₁, ..., x_n) + (y₁, ..., y_n) = (x₁ + y₁, ..., x_n + y_n)
    a(x₁, ..., x_n) = (ax₁, ..., ax_n)

where the 0 element is, of course, (0, ..., 0). The R-module thus constructed
is called the direct sum of M₁, ..., M_n and is denoted

    M₁ ⊕ ⋯ ⊕ M_n  (or ⊕_{i=1}^{n} M_i).
(3.1) Theorem. Let M be an R-module and let M₁, ..., M_n be submodules
of M such that
(1) M = M₁ + ⋯ + M_n, and
(2) for 1 ≤ i ≤ n,

    M_i ∩ (Σ_{j≠i} M_j) = (0).

Then

    M ≅ M₁ ⊕ ⋯ ⊕ M_n.

Proof. Let f_i : M_i → M be the inclusion map, that is, f_i(x) = x for all
x ∈ M_i and define

    f : M₁ ⊕ ⋯ ⊕ M_n → M

by

    f(x₁, ..., x_n) = x₁ + ⋯ + x_n.

Then f is surjective by (1), and if f(x₁, ..., x_n) = 0, then each

    x_i ∈ M_i ∩ (Σ_{j≠i} M_j) = (0)

so that (x₁, ..., x_n) = 0 and f is an isomorphism. □
Our primary emphasis will be on the finite direct sums of modules just
constructed, but for the purpose of allowing for potentially infinite rank
free modules, it is convenient to have available the concept of an arbitrary
direct sum of R-modules. This is described as follows. Let {M_j}_{j∈J} be
Then

    M ≅ ⊕_{j∈J} M_j.

Proof. Exercise. □
(3.6) Definition.
(1) The sequence (3.1), if exact, is said to be a short exact sequence.
(2) The sequence (3.1) is said to be a split exact sequence (or just split)
if it is exact and if Im(f) = Ker(g) is a direct summand of M.
Proof. □
(3.8) Example. Let p and q be distinct primes. Then we have short exact
sequences

(3.2)    0 → Z_p → Z_{pq} → Z_q → 0

and
defined by

    φ_*(f) = φ ∘ f for all f ∈ Hom_R(M, N)

and

    ψ*(g) = g ∘ ψ for all g ∈ Hom_R(M₁, N).

It is straightforward to check that φ_*(f + g) = φ_*(f) + φ_*(g) and ψ*(f + g) =
ψ*(f) + ψ*(g) for appropriate f and g. That is, φ_* and ψ* are homomorphisms
of abelian groups, and if R is commutative, then they are also
R-module homomorphisms.
Given a sequence of R-modules and R-module homomorphisms

    ⋯ → Hom_R(M_{i+1}, N) → Hom_R(M_i, N) → ⋯

A natural question is to what extent does exactness of sequence (3.6)
imply exactness of sequences (3.7) and (3.8). One result along these lines
is the following.
(3.9)

    0 = (ψ_* ∘ φ_*)(1_{M₁}) = ψ ∘ φ.

Thus Im(φ) ⊆ Ker(ψ). Now let N = Ker(ψ) and let ι : N → M be the
inclusion. Since ψ_*(ι) = ψ ∘ ι = 0, exactness of Equation (3.10) implies that
ι = φ_*(α) for some α ∈ Hom_R(N, M₁). Thus,

    Im(φ) ⊇ Im(ι) = N = Ker(ψ),

and we conclude that sequence (3.9) is exact.
Again, exactness of sequence (3.11) is left as an exercise.
    0 → M₁ →^φ M →^ψ M₂ → 0

is a short exact sequence, the sequences (3.10) and (3.12) need not be short
exact, i.e., neither ψ_* nor φ* need be surjective. Following are some examples
to illustrate this.
    0 → 0 → 0 → Z_n

and ψ_* is certainly not surjective.
These examples show that Theorem 3.10 is the best statement that
can be made in complete generality concerning preservation of exactness
under application of HomR. There is, however, the following criterion for
the preservation of short exact sequences under Hom:
and

    (ψ_* ∘ β_*)(f) = (ψ ∘ β) ∘ f
                  = (1_{M₂}) ∘ f
                  = f
                  = (1_{Hom_R(N,M₂)})(f).

Thus, ψ_* ∘ β_* = 1_{Hom_R(N,M₂)} so that ψ_* is surjective and β_* is a splitting
of exact sequence (3.18).
where ι(m) = (m, 0) is the canonical injection and π(m₁, m₂) = m₂ is the
canonical projection.
(3.14) Remarks.
(1) Notice that isomorphism (3.20) is given explicitly by

    Φ(f) = (π₁ ∘ f, π₂ ∘ f)

where f ∈ Hom_R(N, M₁ ⊕ M₂) and π_i(m₁, m₂) = m_i (for i = 1, 2);
while isomorphism (3.21) is given explicitly by

    Ψ(f) = (f ∘ ι₁, f ∘ ι₂)

where f ∈ Hom_R(M₁ ⊕ M₂, N), ι₁ : M₁ → M₁ ⊕ M₂ is given by
ι₁(m) = (m, 0) and ι₂ : M₂ → M₁ ⊕ M₂ is given by ι₂(m) = (0, m).
(2) Corollary 3.13 actually has a natural extension to arbitrary (not necessarily
finite) direct sums. We conclude this section by stating this
extension. The proof is left as an exercise for the reader.
Proof. Exercise. □
When the ring R is implicit from the context, we will sometimes write
linearly dependent (or just dependent) and linearly independent (or just
independent) in place of the more cumbersome R-linearly dependent or
R-linearly independent. In case S contains only finitely many elements
x₁, x₂, ..., x_n, we will sometimes say that x₁, x₂, ..., x_n are R-linearly dependent
or R-linearly independent instead of saying that S = {x₁, ..., x_n}
is R-linearly dependent or R-linearly independent.
(4.2) Remarks.
(1) To say that S ⊆ M is R-linearly independent means that whenever
there is an equation

    a₁x₁ + ⋯ + a_nx_n = 0

where x₁, ..., x_n are distinct elements of S and a₁, ..., a_n are in R,
then

    a₁ = ⋯ = a_n = 0.

(2) Any set S that contains a linearly dependent set is linearly dependent.
(3) Any subset of a linearly independent set S is linearly independent.
(4) Any set that contains 0 is linearly dependent since 1 · 0 = 0.
(5) A set S ⊆ M is linearly independent if and only if every finite subset
of S is linearly independent.
It is clear that conditions (1) and (2) in the definition of basis can be
replaced by the single condition:
(1′) S ⊆ M is a basis of M ≠ {0} if and only if every x ∈ M can be written
uniquely as

    x = a₁x₁ + ⋯ + a_nx_n

for a₁, ..., a_n ∈ R and x₁, ..., x_n ∈ S.
(4.4) Definition. An R-module M is a free R-module if it has a basis.
(4.6) Examples.
(1) If R is a field then R-linear independence and R-linear dependence in
a vector space V over R are the same concepts used in linear algebra.
(2) Rⁿ is a free module with basis S = {e₁, ..., e_n} where

    e_i = (0, ..., 0, 1, 0, ..., 0)

with the 1 in the i-th position.
(3) M_{m,n}(R) is a free R-module with basis

    S = {E_{ij} : 1 ≤ i ≤ m, 1 ≤ j ≤ n},

where E_{ij} denotes the matrix with 1 in the (i, j) position and 0 elsewhere.
(4) The ring R[X] is a free R-module with basis {Xⁿ : n ∈ Z⁺}. As in
Example 4.6 (2), R[X] is also a free R[X]-module with basis {1}.
(5) If G is a finite abelian group then G is a Z-module, but no nonempty
subset of G is Z-linearly independent. Indeed, if g ∈ G then |G| · g = 0
but |G| ≠ 0. Therefore, finite abelian groups can never be free Z-modules,
except in the trivial case G = {0} when ∅ is a basis.
(6) If R is a commutative ring and I ⊆ R is an ideal, then I is an R-module.
However, if I is not a principal ideal, then I is not free as an
R-module. Indeed, no generating set of I can be linearly independent
since the equation (−a₂)a₁ + a₁a₂ = 0 is valid for any a₁, a₂ ∈ R.
(7) If M₁ and M₂ are free R-modules with bases S₁ and S₂ respectively,
then M₁ ⊕ M₂ is a free R-module with basis S₁′ ∪ S₂′, where
    0 = ax = Σ_{j∈J} (aa_j)x_j.
    g = Σ_{i=1}^{m} Σ_{j=1}^{n} a_{ij} f_{ij}.

Then

    g(v_k) = a_{k1}w₁ + ⋯ + a_{kn}w_n = f(v_k)

for 1 ≤ k ≤ m, so g = f since the two homomorphisms agree on a basis
of M. Thus, {f_{ij} : 1 ≤ i ≤ m, 1 ≤ j ≤ n} generates Hom_R(M, N), and
(4.12) Remarks.
(1) A second (essentially equivalent) way to see the same thing is to write
M ≅ ⊕_{i=1}^{m} R and N ≅ ⊕_{j=1}^{n} R. Then, Corollary 3.13 shows that

    Hom_R(M, N) ≅ ⊕_{i=1}^{m} ⊕_{j=1}^{n} Hom_R(R, R).

But any f ∈ Hom_R(R, R) can be written as f = f(1) · 1_R. Thus
Hom_R(R, R) ≅ R so that

    Hom_R(M, N) ≅ ⊕_{i=1}^{m} ⊕_{j=1}^{n} R.

(2) The hypothesis of finite generation of M and N is crucial for the validity
of Theorem 4.11. For example, if R = Z and M = ⊕_{i=1}^{∞} Z is the
free Z-module on the index set N, then Corollary 4.10 shows that

    Hom_R(M, Z) ≅ ∏_{i=1}^{∞} Z.

But the Z-module ∏_{i=1}^{∞} Z is not a free Z-module. (For a proof of this fact
(which uses cardinality arguments), see I. Kaplansky, Infinite Abelian
Groups, University of Michigan Press, (1968) p. 48.)
    Ψ((a_j)_{j∈J}) = Σ_{j∈J} a_j x_j.

Thus, Proposition 4.14 states that every module has a free presentation.
    0 → M₁ → M →^f F → 0

of R-modules is split exact.

Proof. Let S = {x_j}_{j∈J} be a basis of the free module F. Since f is surjective,
for each j ∈ J there is an element y_j ∈ M such that f(y_j) = x_j. Define
h : S → M by h(x_j) = y_j. By Proposition 4.9, there is a unique h̃ ∈
(4.19) Remark. It is a theorem that any two bases of a free module over
a commutative ring R have the same cardinality. This result is proved
for finite-dimensional vector spaces by showing that any set of vectors of
cardinality larger than that of a basis must be linearly dependent. The
same procedure works for free modules over any commutative ring R, but
it does require the theory of solvability of homogeneous linear equations
over a commutative ring. However, the result can be proved for R a PID
without the theory of solvability of homogeneous linear equations over R;
we prove this result in Section 3.6. The result for general commutative rings
then follows by an application of Proposition 4.13.
are not free. We will conclude this section with the fact that all modules
over division rings, in particular, vector spaces, are free modules. In Section
3.6 we will study in detail the theory of free modules over a PID.
The proof of Theorem 4.20 actually proved more than the existence of
a basis of V. Specifically, the following more precise result was proved.
Proof. Exercise.
Notice that the above proof used the existence of inverses in the division
ring D in a crucial way. We will return in Section 3.6 to study criteria that
ensure that a module is free if the ring R is assumed to be a PID. Even
when R is a PID, e.g., R = Z, we have seen examples of R-modules that
are not free, so we will still be required to put restrictions on the module
M to ensure that it is free.
    0 → M₁ → M →^ψ P → 0

splits.
(2) There is an R-module P′ such that P ⊕ P′ is a free R-module.
(3) For any R-module N and any surjective R-module homomorphism ψ :
M → P, the homomorphism

    ψ_* : Hom_R(N, M) → Hom_R(N, P)

is surjective.
(4) For any surjective R-module homomorphism φ : M → N, the homomorphism

    φ_* : Hom_R(P, M) → Hom_R(P, N)

is surjective.
    φ_*(g ∘ ι) = φ ∘ (g ∘ ι)
              = f ∘ π ∘ ι
              = f ∘ 1_P
              = f.

Hence, φ_* : Hom_R(P, M) → Hom_R(P, N) is surjective.
(4) ⟹ (1). A short exact sequence

    0 → M₁ → M →^ψ P → 0,

in particular, includes a surjection ψ : M → P. Now take N = P in part
(4). Thus,

    ψ_* : Hom_R(P, M) → Hom_R(P, P)

is surjective. Choose β : P → M with ψ_*(β) = 1_P. Then β splits the short
exact sequence and the result is proved. □
    = −aa₁ + ((ab)/2)a₂
    = −aa₁ + aa₂b/2
    = −2a + 3a
    = a

so that α is a splitting of the surjective map φ. Hence, F ≅ Ker(φ) ⊕ I
and by Theorem 5.1, I is a projective R-module.
    F = P ⊕ P′ = (⊕_{j∈J} P_j) ⊕ P′,

and hence, each P_j is also a direct summand of the free R-module F. Thus,
P_j is projective.

Conversely, suppose that P_j is projective for every j ∈ J and let P_j′ be
an R-module such that P_j ⊕ P_j′ = F_j is free. Then

    P ⊕ (⊕_{j∈J} P_j′) = ⊕_{j∈J} (P_j ⊕ P_j′) ≅ ⊕_{j∈J} F_j.

Since the direct sum of free modules is free (Example 4.6 (8)), it follows
that P is a direct summand of a free module, and hence P is projective.
(5.10) Examples.
(1) If I ⊆ R is the principal ideal I = (a) where a ≠ 0, then I is an
invertible ideal. Indeed, let b = 1/a ∈ K. Then any x ∈ I is divisible
by a in R so that bx = (1/a)x ∈ R, while a(1/a) = 1.
(2) Let R = Z[√−5] and let I = (2, 1 + √−5). Then it is easily checked
that I is not principal, but I is an invertible ideal. To see this, let
a₁ = 2, a₂ = 1 + √−5, b₁ = −1, and b₂ = (1 − √−5)/2. Then

    a₁b₁ + a₂b₂ = −2 + 3 = 1.

Furthermore, a₁b₂ and a₂b₂ are in R, so it follows that b₂I ⊆ R, and
we conclude that I is an invertible ideal.
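These identities can be verified numerically; in the sketch below (ours, not the text's) elements u + v√−5 of Q(√−5) are stored as pairs of rational numbers via Python's Fraction:

    # Check the inverse-ideal identities for I = (2, 1 + sqrt(-5)).
    from fractions import Fraction as F

    def mul(x, y):
        u, v = x
        s, t = y
        # (u + v w)(s + t w) = us + (ut + vs) w + vt w^2, with w^2 = -5
        return (u * s - 5 * v * t, u * t + v * s)

    a1, a2 = (F(2), F(0)), (F(1), F(1))          # 2 and 1 + sqrt(-5)
    b1, b2 = (F(-1), F(0)), (F(1, 2), F(-1, 2))  # -1 and (1 - sqrt(-5))/2

    p, q = mul(a1, b1), mul(a2, b2)
    assert (p[0] + q[0], p[1] + q[1]) == (F(1), F(0))   # a1*b1 + a2*b2 = 1

    # b2*a1 and b2*a2 have integer coordinates, i.e., lie in R, so b2*I <= R
    for a in (a1, a2):
        u, v = mul(b2, a)
        assert u.denominator == 1 and v.denominator == 1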
    x = φ(ψ(x)) = Σ_{j∈J} ψ_j(x)a_j

(5.6)    b_j = ψ_j(x)/x.

The element b_j ∈ K depends on j ∈ J but not on the element x ≠ 0 ∈ I.
To see this, suppose that x′ ≠ 0 ∈ I is another element of I. Then

    x′ψ_j(x) = ψ_j(x′x) = ψ_j(xx′) = xψ_j(x′)

    x = Σ_{i=1}^{n} ψ_i(x)a_i = Σ_{i=1}^{n} (b_i x)a_i = x(Σ_{i=1}^{n} b_i a_i).
for the fundamental structure theorem for finitely generated modules over
PIDs which will be developed in Section 3.7.
We start with the following definition:
Since we will not be concerned with the fine points of cardinal arithmetic,
we shall not distinguish among infinite cardinals so that

    free-rank_R(M) ∈ Z⁺ ∪ {∞}.

Since a basis is a generating set of M, we have the inequality µ(M) ≤
free-rank_R(M). We will see in Corollary 6.18 that for an arbitrary commutative
ring R and for every free R-module M, free-rank_R(M) = µ(M) and all
bases of M have this cardinality.
Proof. We will first present a proof for the case where free-rank_R(M) < ∞.
This case will then be used in the proof of the general case. For those who
are only interested in the case of finitely generated modules, the proof of
the second case can be safely omitted.

Case 1. free-rank_R(M) < ∞.

We will argue by induction on k = free-rank_R(M). If k = 0 then
M = (0) so N = (0) is free of free-rank 0. If k = 1, then M is cyclic so
M = (x) for some nonzero x ∈ M. If N = (0) we are done. Otherwise, let
I = {a ∈ R : ax ∈ N}. Since I is an ideal of R and R is a PID, I = (d);
since N ≠ (0), d ≠ 0. If y ∈ N then y = ax = rdx ∈ (dx) so that N = (dx)
is a free cyclic R-module. Thus free-rank_R(N) = 1 and the result is true
for k = 1.
Assume by induction that the result is true for all M with free-rank k,
and let M be a module with free-rank_R(M) = k + 1. Let S = {x₁, ..., x_{k+1}}
be a basis of M and let M_k = (x₁, ..., x_k). If N ⊆ M_k we are done by
induction. Otherwise N ∩ M_k is a submodule of M_k which, by induction, is
free of free-rank ℓ ≤ k. Let {y₁, ..., y_ℓ} be a basis of N ∩ M_k. By Theorem
2.5

    N/(N ∩ M_k) ≅ (N + M_k)/M_k ⊆ M/M_k = (x_{k+1} + M_k).

By the k = 1 case of the theorem, (N + M_k)/M_k is a free cyclic submodule
of M/M_k with basis dx_{k+1} + M_k where d ≠ 0. Choose y_{ℓ+1} ∈ N so that
y_{ℓ+1} = dx_{k+1} + x′ for some x′ ∈ M_k. Then (N + M_k)/M_k = (y_{ℓ+1} + M_k).

We claim that S′ = {y₁, ..., y_ℓ, y_{ℓ+1}} is a basis of N. To see this, let y ∈ N.
Then y + M_k = a_{ℓ+1}(y_{ℓ+1} + M_k) so that y − a_{ℓ+1}y_{ℓ+1} ∈ N ∩ M_k, which
implies that y − a_{ℓ+1}y_{ℓ+1} = a₁y₁ + ⋯ + a_ℓy_ℓ. Thus S′ generates N. Suppose
that a₁y₁ + ⋯ + a_{ℓ+1}y_{ℓ+1} = 0. Then a_{ℓ+1}(dx_{k+1} + x′) + a₁y₁ + ⋯ + a_ℓy_ℓ = 0
so that a_{ℓ+1}dx_{k+1} ∈ M_k. But S is a basis of M so we must have a_{ℓ+1}d = 0;
since d ≠ 0 this forces a_{ℓ+1} = 0. Thus a₁y₁ + ⋯ + a_ℓy_ℓ = 0 which implies
that a₁ = ⋯ = a_ℓ = 0 since {y₁, ..., y_ℓ} is linearly independent. Therefore
S′ is linearly independent and hence a basis of N, so that N is free with
free-rank_R(N) ≤ ℓ + 1 ≤ k + 1. This proves the theorem in Case 1.
Case 2. free-rank_R(M) = ∞.

Since (0) is free with basis ∅, we may assume that N ≠ (0). Let S =
{x_j}_{j∈J} be a basis of M. For any subset K ⊆ J let M_K = ({x_k}_{k∈K})
and let N_K = N ∩ M_K. Let T be the set of all triples (K, K′, f) where
K′ ⊆ K ⊆ J and f : K′ → N_K is a function such that {f(k)}_{k∈K′} is a
basis of N_K. We claim that T ≠ ∅.
Claim. K = J.

Assuming the claim is true, it follows that M_K = M, N_K = N ∩ M_K =
N, and {f(k)}_{k∈K′} is a basis of N. Thus, N is a free module (since it has
a basis), and since S was an arbitrary basis of M, we conclude that N has
a basis of cardinality ≤ free-rank_R(M), which is what we wished to prove.
It remains to verify the claim. Suppose that K ≠ J and choose j ∈
J \ K. Let L = K ∪ {j}. If N_K = N_L then (K, K′, f) < (L, K′, f),
contradicting the maximality of (K, K′, f) in T. If N_K ≠ N_L, then

    N_L/(N_L ∩ M_K) ≅ (N_L + M_K)/M_K ⊆ M_L/M_K = (x_j + M_K).

By Case 1, (N_L + M_K)/M_K is a free cyclic submodule with basis dx_j + M_K
where d ≠ 0. Choose z ∈ N_L so that z = dx_j + w for some w ∈ M_K.
Then (N_L + M_K)/M_K = (z + M_K). Now let L′ = K′ ∪ {j} and define
f′ : L′ → N_L by

    f′(k) = f(k) if k ∈ K′,  and  f′(j) = z.
Suppose that

    b_j z + Σ_{k∈K′} b_k f(k) = 0

so that

    db_j x_j + b_j w + Σ_{k∈K′} b_k f(k) = 0.

That is, db_j x_j ∈ M_K ∩ (x_j) = (0), and since S = {x_t}_{t∈J} is a basis
of M, we must have db_j = 0. But d ≠ 0, so b_j = 0. This implies that
Σ_{k∈K′} b_k f(k) = 0. But {f(k)}_{k∈K′} is a basis of N_K, so we must have
b_k = 0 for all k ∈ K′. Thus {f′(k)}_{k∈L′} is a basis of N_L. We conclude that
(K, K′, f) < (L, L′, f′), which contradicts the maximality of (K, K′, f).
Therefore, the claim is verified, and the proof of the theorem is complete. □
(6.4) Corollary. Let M be a finitely generated module over the PID R and
let N ⊆ M be a submodule. Then N is finitely generated and

    µ(N) ≤ µ(M).

Proof. Let

    0 → K → F →^φ M → 0

be a free presentation of M such that free-rank(F) = µ(M) < ∞, and let
N₁ = φ^{−1}(N). By Theorem 6.2, N₁ is free with

    µ(N₁) ≤ free-rank(N₁) ≤ free-rank(F) = µ(M).

Since N = φ(N₁), we have µ(N) ≤ µ(N₁), and the result is proved. □
(6.5) Remark. The hypothesis that R be a PID in Theorem 6.2 and Corollaries
6.3 and 6.4 is crucial. For example, consider the ring R = Z[X] and
let M = R and N = (2, X). Then M is a free R-module and N is a submodule
of M that is not free (Example 4.6 (6)). Moreover, R = Z[√−5],
P = (2, 1 + √−5) gives an example of a projective R-module P that is
not free (Example 5.6 (3)). Also note that 2 = µ(N) > µ(M) = 1 and
2 = µ(P) > 1 = µ(R).

Recall that if M is a free module over an integral domain R, then M is
torsion-free (Proposition 4.8). The converse of this statement is false even
under the restriction that R be a PID. As an example, consider the Z-module
Q. It is clear that Q is a torsion-free Z-module, and it is a simple
exercise to show that it is not free. There is, however, a converse if the
module is assumed to be finitely generated (and the ring R is a PID).
    q₀x = Σ_{i=1}^{l} c_i q₀ y_i
        = Σ_{i=1}^{l} c_i (q₀/a_i) a_i y_i
        = Σ_{i=1}^{l} c_i (q₀/a_i) b_i x₁
        = (Σ_{i=1}^{l} c_i (q₀/a_i) b_i) x₁.

Therefore,

    bq₀x₁ = aq₀x = a(Σ_{i=1}^{l} c_i (q₀/a_i) b_i) x₁.
    0 → M_τ → M → M/M_τ → 0.
(6.12) Remarks.
(1) If R is a field, then every nonzero x ∈ M is primitive.
(2) The element x ∈ R is a primitive element of the R-module R if and
only if x is a unit.
(3) The element (2, 0) ∈ Z² is not primitive since (2, 0) = 2 · (1, 0).
(4) If R = Z and M = Q, then no element of M is primitive.
(6.15) Lemma. Let R be a PID and let M be a free R-module with basis
S = {x_j}_{j∈J}. If x = Σ_{j∈J} a_j x_j ∈ M, then x is primitive if and only if
gcd({a_j}_{j∈J}) = 1.

Proof. Let d = gcd({a_j}_{j∈J}). Then x = d(Σ_{j∈J}(a_j/d)x_j), so if d is not a
unit then x is not primitive. Conversely, if d = 1 and x = ay then

    Σ_{j∈J} a_j x_j = x = ay = a(Σ_{j∈J} b_j x_j) = Σ_{j∈J} ab_j x_j.
Either this chain stops at some i, which means that x_i is primitive, or (6.1)
is an infinite properly ascending chain of submodules of M. We claim that
the latter possibility cannot occur. To see this, let N = ⋃_{i=0}^{∞} (x_i). Then N
is a submodule of the finitely generated module M over the PID R so that
N is also finitely generated, by {y₁, ..., y_k} say (Corollary 6.4). Since (x₀) ⊆
(x₁) ⊆ ⋯, there is an i such that {y₁, ..., y_k} ⊆ (x_i). Thus N = (x_i) and
hence (x_i) = (x_{i+1}) = ⋯, which contradicts having an infinite properly
    as + bu = 0.

Multiplying the first equation by u, multiplying the second by v, and adding
shows that a = 0, while multiplying the first by −s, multiplying the second
by r, and adding shows that b = 0. Hence, {x, x₂} is linearly independent
and, therefore, a basis of M.

Now suppose that µ(M) = k > 2 and that the result is true for all free
R-modules of rank < k. By Theorem 6.6 there is a basis {x₁, ..., x_k} of M.
Let x = Σ_{i=1}^{k} a_i x_i. If a_k = 0 then x ∈ M₁ = (x₁, ..., x_{k−1}), so by induction
there is a basis {x, x₂′, ..., x_{k−1}′} of M₁. Then {x, x₂′, ..., x_{k−1}′, x_k} is
a basis of M containing x. Now suppose that a_k ≠ 0 and let y = Σ_{i=1}^{k−1} a_i x_i.
If y = 0 then x = a_k x_k, and since x is primitive, it follows that a_k is a unit
of R and {x₁, ..., x_{k−1}, x} is a basis of M containing x in this case. If
y ≠ 0 then there is a primitive y′ such that y = by′ for some b ∈ R. In
particular, y′ ∈ M₁ so that M₁ has a basis {y′, x₂′, ..., x_{k−1}′} and hence
M has a basis {y′, x₂′, ..., x_{k−1}′, x_k}. But x = a_k x_k + y = a_k x_k + by′ and
gcd(a_k, b) = 1 since x is primitive. By the previous case (k = 2) we conclude
that the submodule (x_k, y′) has a basis {x, y″}. Therefore, M has a basis
{x, x₂′, ..., x_{k−1}′, y″} and the argument is complete when k = µ(M) < ∞.

If k = ∞ let {x_j}_{j∈J} be a basis of M and let x = Σ_{i=1}^{n} a_i x_{j_i} for
some finite subset I = {j₁, ..., j_n} ⊆ J. If N = (x_{j₁}, ..., x_{j_n}) then x is
a primitive element in the finitely generated module N, so the previous
argument applies to show that there is a basis {x, x₂′, ..., x_n′} of N. Then
{x, x₂′, ..., x_n′} ∪ {x_j}_{j∈J\I} is a basis of M containing x. □
(6.17) Corollary. If M is a free module over a PID R, then every basis of
M contains µ(M) elements.

Proof. In case µ(M) < ∞, the proof is by induction on µ(M). If µ(M) = 1
then M = (x). If {x₁, x₂} ⊆ M then x₁ = a₁x and x₂ = a₂x, so that
a₂x₁ − a₁x₂ = 0, and we conclude that no subset of M with more than one
element is linearly independent.

Now suppose that µ(M) = k > 1 and assume the result is true for all
free R-modules N with µ(N) < k. Let S = {x_j}_{j∈J} ⊆ M be any basis of
M and choose x ∈ S. Since x is primitive (being an element of a basis),
Theorem 6.16 applies to give a basis {x, y₂, ..., y_k} of M with precisely
µ(M) = k elements. Let N = M/(x) and let π : M → N be the projection
map. It is clear that N is a free R-module with basis π(S) \ {π(x)}. By
Proposition 2.12 it follows that µ(N) ≥ k − 1, and since {π(y₂), ..., π(y_k)}
generates N, we conclude that µ(N) = k − 1. By induction, it follows that
|S| − 1 < ∞ and |S| − 1 = k − 1, i.e., |S| = k, and the proof is complete in
case µ(M) < ∞.

In case µ(M) = ∞, we are claiming that no basis of M can contain a
finite number k ∈ Z⁺ of elements. This is proved by induction on k, the
proof being similar to the case µ(M) finite, which we have just done. We
leave the details to the reader. □
(6.18) Corollary. Let R be any commutative ring with identity and let M be
a free R-module. Then every basis of M contains µ(M) elements.

Proof. Let I be any maximal ideal of R (recall that maximal ideals exist
by Theorem 2.2.16). Since R is commutative, the quotient ring R/I = K
is a field (Theorem 2.2.18), and hence it is a PID. By Proposition 4.13,
the quotient module M/IM is a finitely generated free K-module so that
Corollary 6.17 applies to show that every basis of M/IM has µ(M/IM)
elements. Let S = {x_j}_{j∈J} be an arbitrary basis of the free R-module M
and let π : M → M/IM be the projection map. According to Proposition
4.13, the set π(S) = {π(x_j)}_{j∈J} is a basis of M/IM over K, and therefore,
(6.19) Remarks.
(1) If M is a free R-module over a commutative ring R, then we have
proved that free-rank(M) = µ(M) = the number of elements in any
basis of M. This common number we shall refer to simply as the rank
of M, denoted rank_R(M) or rank(M) if the ring R is implicit. If R is
a field we shall sometimes write dim_R(M) (the dimension of M over
R) in place of rank_R(M). Thus, a vector space M (over R) is finite
dimensional if and only if dim_R(M) = rank_R(M) < ∞.
(2) Corollary 6.18 is the invariance of rank theorem for finitely generated
free modules over an arbitrary commutative ring R. The invariance of
rank theorem is not valid for an arbitrary (possibly noncommutative)
ring R. As an example, consider the Z-module M = ⊕_{n∈N} Z, which
is the direct sum of countably many copies of Z. It is simple to check
that M ≅ M ⊕ M. Thus, if we define R = End_Z(M), then R is a
noncommutative ring, and Corollary 3.13 shows that

    R = End_Z(M)
      = Hom_Z(M, M)
      ≅ Hom_Z(M, M ⊕ M)
      ≅ Hom_Z(M, M) ⊕ Hom_Z(M, M)
      ≅ R ⊕ R.

The isomorphisms are isomorphisms of Z-modules. We leave it as an
exercise to check that the isomorphisms are also isomorphisms of R-modules,
so that R ≅ R², and hence, the invariance of rank does
not hold for the ring R. There is, however, one important class of
noncommutative rings for which the invariance of rank theorem holds,
namely, division rings. This will be proved in Proposition 7.1.14.
(6.20) Corollary. If M and N are free modules over a PID R, at least one of
which is finitely generated, then M ≅ N if and only if rank(M) = rank(N).

Proof. If M and N are isomorphic, then µ(M) = µ(N) so that rank(M) =
rank(N). Conversely, if rank(M) = rank(N), then Proposition 4.9 gives a
homomorphism f : M → N, which takes a basis of M to a basis of N. It is
easy to see that f must be an isomorphism. □
    0 → K → R^k →^φ M → 0

where K = Ker(φ). Since M is free, Corollary 4.16 gives R^k ≅ M ⊕ K, and
according to Theorem 6.2, K is also free of finite rank. Therefore,
and the maximality of (s₁) in S shows that (s₁) = (c(w)) = (d). In particular,
(s₁) = (d) so that s₁ | a₁, and we conclude that

    z = b₁(s₁x₁) + Σ_{j∈J′} b_j x_j

    N = (s₁x₁) ⊕ N₁,

and the claim is proved.
By Theorem 6.2, N₁ is a free R-module since it is a submodule of the
free R-module M. Furthermore, by the claim we see that

    rank(N₁) = rank(N) − 1 = n − 1.

Applying the induction hypothesis to the pair N₁ ⊆ M₁, we conclude that
there is a basis S′ of M₁ and a subset {x₂, ..., x_n} of S′, together with
nonzero elements s₂, ..., s_n of R, such that

(6.6)    {s₂x₂, ..., s_nx_n} is a basis of N₁

and

(6.7)    s_i | s_{i+1} for 2 ≤ i ≤ n − 1.

Let S = S′ ∪ {x₁}. Then the theorem is proved once we have shown that
s₁ | s₂.

To verify that s₁ | s₂, consider the element s₂x₂ ∈ N₁ ⊆ N and
(6.25) Remark. In Section 3.7, we will prove that the elements {s1, ... , sn}
are determined just by the rank n submodule N and not by the particular
choice of a basis S of M. These elements are called the invariant factors of
the submodule N in the free module M.
Proof. Since µ(M) = n, let {v₁, ..., v_n} be a generating set of M and

Equation (7.2) follows from Equation (7.1). The proof is now completed
by observing that Ann(w_i) ≠ R for any i since, if Ann(w_i) = R, then
    Z² = Z·(1, 0) ⊕ Z·(0, 1).

More generally, if M is a free R-module of rank n, then any choice of basis
{v₁, ..., v_n} provides a cyclic decomposition

    M = Rv₁ ⊕ ⋯ ⊕ Rv_n

with Ann(v_i) = 0 for all i. Therefore, there is no hope that the cyclic factors
themselves are uniquely determined. What does turn out to be unique,
however, is the chain of annihilator ideals

    Ann(w₁) ⊇ Ann(w₂) ⊇ ⋯

where we require that Ann(w_i) ≠ R, which simply means that we do not
allow copies of (0) in our direct sums of cyclic submodules. We reduce the
uniqueness of the annihilator ideals to the case of finitely generated torsion
R-modules by means of the following result. If M is an R-module, recall
that the torsion submodule M_τ of M is defined by

    M_τ = {x ∈ M : Ann(x) ≠ (0)}.
(7.6) Corollary. Two finitely generated torsion modules over a PID are isomorphic
if and only if they have the same chain of invariant ideals.

Proof. □

(7.7) Remark. In some cases the principal ideals Ann(w_j) have a preferred
generator a_j. In this case the generators {a_j}_{j=1}^{k} are called the invariant
factors of M.
Proof. (1) Since Ann(M) = (s_n) = (me(M)) by Theorem 7.1 and the
definition of me(M), it follows that if aM = 0, i.e., a ∈ Ann(M), then
me(M) | a.
(2) Clearly s_n divides s₁ ⋯ s_n.
(3) Suppose that p | s₁ ⋯ s_n = co(M). Then p divides some s_i, but
(s_i) ⊇ (s_n), so s_i | s_n. Hence, p | s_n = me(M). □
(7.10) Remark. There are, unfortunately, no standard names for these in-
variants. The notation we have chosen reflects the common terminology in
the two cases R = Z and R = F[X]. In the case R = Z, me(M) is the
exponent and co(M) is the order of the finitely generated torsion Z-module
so by unique factorization in R we may write
s_1 = u_1 p_1^{e_{11}} ⋯ p_k^{e_{1k}}, …, s_n = u_n p_1^{e_{n1}} ⋯ p_k^{e_{nk}},
where the divisibility conditions imply that
0 ≤ e_{1j} ≤ e_{2j} ≤ ⋯ ≤ e_{nj} for 1 ≤ j ≤ k.
Then the proof of Theorem 7.12 shows that M is the direct sum of cyclic
submodules with annihilators {(p_j^{e_{ij}}) : e_{ij} > 0}, and the theorem is
proved. □
(7.14) Definition. The prime powers {p_j^{e_{ij}} : e_{ij} > 0, 1 ≤ j ≤ k} are called
the elementary divisors of M.
(7.16) 0 ≤ e_{1j} ≤ e_{2j} ≤ ⋯ ≤ e_{nj} for 1 ≤ j ≤ k.
We show that the set of invariant factors (Equation (7.15)) can be recon-
structed from the set of prime powers in Equation (7.17). Indeed, if
e_j = max_{1≤i≤n} e_{ij}, 1 ≤ j ≤ k,
then the inequalities (7.16) imply that s_n is an associate of p_1^{e_1} ⋯ p_k^{e_k}. Delete
{p_1^{e_1}, …, p_k^{e_k}} from the set of prime powers in set (7.17), and repeat the process with
the set of remaining elementary divisors to obtain s_{n−1}. Continue until all
prime powers have been used. At this point, all invariant factors have been
recovered. Notice that the number n of invariant factors is easily recovered
from the set of elementary divisors of M. Since s_1 divides every s_i, it follows
that every prime dividing s_1 must also be a prime divisor of every s_i.
Therefore, in the set of elementary divisors, n is the maximum number of
occurrences of p^{e_{ij}} for a single prime p. □
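The procedure in this proof is effective, and for R = Z it is easy to implement. The following Python sketch (function and helper names are ours) recovers the invariant factors from a multiset of elementary divisors, given as actual prime powers, by repeatedly extracting the highest remaining power of each prime:

    from collections import Counter
    from math import prod

    def invariant_factors(elementary_divisors):
        """Recover the invariant factors s_1 | s_2 | ... | s_n of a finite
        abelian group from its multiset of elementary divisors p**e."""
        def prime_base(q):
            # q is a prime power; its base is its smallest prime divisor
            p = 2
            while q % p:
                p += 1
            return p

        remaining = Counter(elementary_divisors)
        factors = []
        while remaining:
            # highest remaining power of each prime that still occurs
            best = {}
            for q in remaining:
                p = prime_base(q)
                best[p] = max(best.get(p, 0), q)
            factors.append(prod(best.values()))  # next (largest) invariant factor
            for q in best.values():              # delete the used prime powers
                remaining[q] -= 1
                if remaining[q] == 0:
                    del remaining[q]
        return factors[::-1]                     # ascending divisibility order

For instance, invariant_factors([4, 4, 3, 9, 5, 7, 49]) returns [84, 8820], in agreement with the worked example that follows.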
Then the elementary divisors of M are 2², 2², 3, 3², 5, 7, 7². Using the
algorithm from Theorem 7.15, we can recover the invariant factor descrip-
tion of M as follows. The largest invariant factor is the product of the
highest power of each prime occurring in the set of elementary divisors,
i.e., the least common multiple of the set of elementary divisors. That is,
s_2 = 7²·5·3²·2² = 8820. Note that the number of invariant factors of
M is 2 since powers of the primes 2, 3, and 7 occur twice in the set of ele-
mentary divisors, while no prime has three powers among this set. Deleting
7², 5, 3², 2² from the set of elementary divisors, we obtain s_1 = 7·3·2² = 84.
This uses all the elementary divisors, so we obtain
M ≅ Z_84 × Z_8820.
(7.17) Lemma. Let M be a module over a PID R and suppose that x ∈ M_τ.
If Ann(x) = (r) and a ∈ R with (a, r) = d (recall that (a, r) = gcd{a, r}),
then Ann(ax) = (r/d).
Proof. Since (r/d)(ax) = (a/d)(rx) = 0, it follows that (r/d) ⊆ Ann(ax).
If b(ax) = 0, then r | (ba), so ba = rc for some c ∈ R. But (a, r) = d, so
there are s, t ∈ R with rs + at = d. Then rct = bat = b(d − rs), and we see
that bd = r(ct + bs). Therefore, b ∈ (r/d) and hence Ann(ax) = (r/d). □
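A concrete instance (with R = Z and numbers of our choosing): take x with Ann(x) = (12) and a = 8. Then d = (8, 12) = 4, and the lemma gives Ann(8x) = (12/4) = (3); indeed 3·(8x) = 2·(12x) = 0, and no proper divisor of 3 can annihilate 8x.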
(7.18) Lemma. Let M be a module over a PID R, and let x_1, …, x_n ∈ M_τ
with Ann(x_i) = (r_i) for 1 ≤ i ≤ n. If {r_1, …, r_n} is a pairwise relatively
prime subset of R and x = x_1 + ⋯ + x_n, then Ann(x) = (a) = (∏_{i=1}^n r_i).
Conversely, if y ∈ M_τ is an element such that Ann(y) = (b) = (∏_{i=1}^n s_i),
where {s_1, …, s_n} is a pairwise relatively prime subset of R, then we may
write y = y_1 + ⋯ + y_n where Ann(y_i) = (s_i) for all i.
Proof. Let x = x_1 + ⋯ + x_n. Then a = ∏_{i=1}^n r_i ∈ Ann(x) so that (a) ⊆
Ann(x). It remains to check that Ann(x) ⊆ (a). Thus, suppose that bx = 0.
By the Chinese remainder theorem (Theorem 2.2.24), there are c_1, …, c_n ∈
R such that
c_i ≡ 1 (mod (r_i)),
c_i ≡ 0 (mod (r_j)), if j ≠ i.
Then, since (r_j) = Ann(x_j), we conclude that c_ix_j = 0 if i ≠ j, so for each
i with 1 ≤ i ≤ n we have 0 = c_i(bx) = b(c_ix_i).
Therefore, bc_i ∈ Ann(x_i) = (r_i), and since c_i ≡ 1 (mod (r_i)), it follows that
r_i | b for 1 ≤ i ≤ n. But {r_1, …, r_n} is pairwise relatively prime, and thus
a = r_1 ⋯ r_n divides b, i.e., b ∈ (a).
For the converse, suppose that Ann(y) = (b) = (∏_{i=1}^n s_i),
where the set {s_1, …, s_n} is pairwise relatively prime. As in the above
paragraph, apply the Chinese remainder theorem to get c_1, …, c_n ∈ R
such that
c_i ≡ 1 (mod (s_i)),
c_i ≡ 0 (mod (s_j)), if j ≠ i.
Since b is the least common multiple of {s_1, …, s_n}, it follows that
1 ≡ c_1 + ⋯ + c_n (mod (b)),
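A concrete instance of the converse (again with R = Z and numbers of our choosing): suppose Ann(y) = (60) with 60 = 4·3·5. The Chinese remainder theorem produces c_1 = 45, c_2 = 40, c_3 = 36, and c_1 + c_2 + c_3 = 121 ≡ 1 (mod 60). Setting y_1 = 45y, y_2 = 40y, and y_3 = 36y gives y = y_1 + y_2 + y_3 with Ann(y_1) = (4), Ann(y_2) = (3), and Ann(y_3) = (5).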
we define
(7.21) s_m = p_1^{f_{m1}} ⋯ p_k^{f_{mk}}.
Note that f_{mj} > 0 for 1 ≤ j ≤ k.
Delete {p_1^{f_{m1}}, …, p_k^{f_{mk}}} from the set S and repeat the above process
with the remaining prime powers until no further positive prime powers are
available. Since a prime power for a particular prime p is used only once at
each step, this will produce elements s_1, …, s_m ∈ R. From the inductive
description of the construction of s_i, it is clear that every prime dividing s_i
also divides s_{i+1} to at least as high a power (because of Equation (7.21)).
Thus,
s_i | s_{i+1} for 1 ≤ i < m.
Therefore, we may write
(7.22) s_1 = u_1 p_1^{f_{11}} ⋯ p_k^{f_{1k}}, …, s_m = u_m p_1^{f_{m1}} ⋯ p_k^{f_{mk}},
where
(7.23) {p_j^{f_{ij}} : f_{ij} > 0} = {p_j^{e_{ij}} : e_{ij} > 0},
where repetitions of prime powers are allowed and where
(7.24) 0 ≤ f_{1j} ≤ f_{2j} ≤ ⋯ ≤ f_{mj} for 1 ≤ j ≤ k
by Equation (7.20).
For each i (1 ≤ i ≤ m), choose w_{ij} ∈ S with Ann(w_{ij}) = (p_j^{f_{ij}}),
and let x_i = w_{i1} + ⋯ + w_{ik}. Lemma 7.18 shows that Ann(x_i) = (s_i) for
1 ≤ i ≤ m, and thus,
Rx_i ≅ R/(s_i) ≅ ⊕_{j=1}^k R/(p_j^{f_{ij}}) ≅ ⊕_{j=1}^k Rw_{ij}.
Therefore,
M ≅ ⊕_{α,β} Rw_{αβ} ≅ ⊕_{i=1}^m ⊕_{j=1}^k Rw_{ij} ≅ Rx_1 ⊕ ⋯ ⊕ Rx_m,
where Ann(x_i) = (s_i). Since s_i | s_{i+1} for 1 ≤ i < m, it follows that
{s_1, …, s_m} are the invariant factors of M, and since the set of prime
power factors of {s_1, …, s_m} (counting multiplicities) is the same as the
set of prime power factors of {t_1, …, t_n} (see Equation (7.23)), the proof
is complete.
S = {d_{ij} : 1 ≤ i ≤ k; 1 ≤ j ≤ t_i}
is the set of elementary divisors of M.
Proof. By Theorem 7.1, the elementary
divisors of M_i are the prime power factors of {s_{i1}, …, s_{it_i}}. Then
M = ⊕_{i=1}^k M_i ≅ ⊕_{i,j} Rw_{ij},
where Ann(w_{ij}) = (s_{ij}). The result now follows from Proposition 7.19. □
(7.26) co(M) = ∏_{i=1}^k co(M_i).
the finite abelian group up to isomorphism. Also, any finite abelian group is
isomorphic to a unique group
Z_{s_1} × ⋯ × Z_{s_k}
(7.23) Example. We will carry out the above procedure for n = 600 =
2³·3·5². There are three primes, namely, 2, 3, and 5. The exponent of 2
is 3, and we can write 3 = 1 + 1 + 1, 3 = 1 + 2, and 3 = 3. Thus there are
three partitions of 3. The exponent of 3 is 1, so there is only one partition,
while the exponent of 5 is 2, which has two partitions, namely, 2 = 1 + 1
and 2 = 2. Thus there are 3·1·2 = 6 distinct abelian groups of order 600.
They are
Z_2 × Z_2 × Z_2 × Z_3 × Z_5 × Z_5 ≅ Z_2 × Z_10 × Z_30
Z_2 × Z_2 × Z_2 × Z_3 × Z_25 ≅ Z_2 × Z_2 × Z_150
Z_2 × Z_4 × Z_3 × Z_5 × Z_5 ≅ Z_10 × Z_60
Z_2 × Z_4 × Z_3 × Z_25 ≅ Z_2 × Z_300
Z_8 × Z_3 × Z_5 × Z_5 ≅ Z_5 × Z_120
Z_8 × Z_3 × Z_25 ≅ Z_600
where the groups on the right are expressed in invariant factor form and
those on the left are decomposed following the elementary divisors.
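The bookkeeping in this example is easy to mechanize: an abelian group of order n amounts to a choice of one partition of each prime exponent in n. A minimal Python sketch (function names are ours) that lists the groups of order n by their elementary divisors:

    from itertools import product

    def partitions(n, max_part=None):
        """Yield the partitions of n as non-increasing tuples."""
        if max_part is None:
            max_part = n
        if n == 0:
            yield ()
            return
        for k in range(min(n, max_part), 0, -1):
            for rest in partitions(n - k, k):
                yield (k,) + rest

    def factor(n):
        """Prime factorization {p: e} of n by trial division."""
        f, p = {}, 2
        while p * p <= n:
            while n % p == 0:
                f[p] = f.get(p, 0) + 1
                n //= p
            p += 1
        if n > 1:
            f[n] = f.get(n, 0) + 1
        return f

    def abelian_groups(n):
        """Each group of order n as a tuple of elementary divisors p**e."""
        choices = [[tuple(p**e for e in lam) for lam in partitions(k)]
                   for p, k in factor(n).items()]
        return [sum(c, ()) for c in product(*choices)]

Here len(abelian_groups(600)) == 6, and (8, 3, 25), i.e., Z_8 × Z_3 × Z_25 ≅ Z_600, is among the six.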
We will conclude this section with the following result concerning the
structure of finite subgroups of the multiplicative group of a field. This
is an important result, which combines the structure theorem for finite
abelian groups with a bound on the number of roots of a polynomial with
coefficients in a field.
(7.25) Corollary. Suppose that F is a finite field with q elements. Then F*
is a cyclic group with q − 1 elements, and every element of F is a root of
the polynomial X^q − X.
Proof. Exercise.
Proof. Let H be a finite subgroup of C* with |H| = n. Then every element
z of H has the property that z^n = 1. In other words, z is a root of the
equation X^n = 1. Since this equation has at most n roots in C and since
every element of G_n is a root of this equation, we have z ∈ G_n. Thus, we
conclude that H ⊆ G_n and hence H = G_n because n = |H| = |G_n|. □
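For a concrete instance of the corollary, a generator of F_p* can be found by brute force. A small Python sketch (assuming p is prime; the function name is ours):

    def generator_mod_p(p):
        """Return a generator of the cyclic group (Z/pZ)*, p prime."""
        for g in range(2, p):
            # g is a generator iff its powers hit all p - 1 nonzero residues
            seen, x = set(), 1
            for _ in range(p - 1):
                x = x * g % p
                seen.add(x)
            if len(seen) == p - 1:
                return g
        return 1  # p == 2: the trivial group is generated by 1

For example, generator_mod_p(7) returns 3, and pow(3, 6, 7) == 1, as the relation X^q − X predicts.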
(8.3) Remarks.
(1) A submodule S of M that satisfies condition (3) of Proposition 8.2
is called a pure submodule of M. Thus, a submodule of a finitely
generated module over a PID is pure if and only if it is complemented.
(2) If R is a field, then every subspace S C M satisfies condition (3) so that
every subspace of a finite-dimensional vector space is complemented.
Actually, this is true without the finite dimensionality assumption, but
our argument has only been presented in the more restricted case. The
fact that arbitrary subspaces of vector spaces are complemented follows
from Corollary 4.21.
(3) The implication (3) ⇒ (1) is false without the hypothesis that M be
finitely generated. As an example, consider a free presentation of the
Z-module Q:
0 → S → M → Q → 0.
Since M/S ≅ Q and Q is torsion-free, it follows that S satisfies con-
dition (3) of Proposition 8.2. However, if S is complemented, then a
complement T ≅ Q; so Q is a submodule of a free Z-module M, and
hence Q would be free, but Q is not a free Z-module.
Proof. Let S = (v_1, …, v_m) where m = rank S. Extend this to a basis
{v_1, …, v_n} of M. Then T = (v_{m+1}, …, v_n) is a complement of S in M
and T ≅ M/S. Thus,
rank M = n = m + (n − m) = rank S + rank(M/S).
(8.7) Proposition. Let R be a PID and let f : M → N be an R-module
homomorphism of finite-rank free R-modules. Then
(1) Ker(f) is a pure submodule, but
(2) Im(f) need not be pure.
(2) f is an injection.
(3) f is a surjection.
3.9 Exercises
1. If M is an abelian group, then End_Z(M), the set of abelian group endomor-
phisms of M, is a ring under addition and composition of functions.
(a) If M is a left R-module, show that the function φ : R → End_Z(M)
defined by φ(r)(m) = rm is a ring homomorphism. Conversely, show
that any ring homomorphism φ : R → End_Z(M) determines a left R-
module structure on M.
(b) Show that giving a right R-module structure on M is the same as giving
a ring homomorphism φ : R^op → End_Z(M).
2. Show that an abelian group G admits the structure of a Z_n-module if and
only if nG = (0).
3. Show that the subring Z[q] of Q is not finitely generated as a Z-module if
q ∉ Z.
4. Let M be an S-module and suppose that R ⊆ S is a subring. Then M is also
an R-module by Example 1.5 (10). Suppose that N ⊆ M is an R-submodule.
Let SN = (sn : s ∈ S, n ∈ N).
(a) If S = Q and R = Z, show that SN is the S-submodule of M generated
by N.
(b) Show that the conclusion of part (a) need not hold if S = R and R = Q.
10. Let R be a commutative ring with 1 and let I and J be ideals of R. Prove
that R/I ≅ R/J as R-modules if and only if I = J. Suppose instead we ask
that R/I and R/J be isomorphic rings. Is the same conclusion valid? (Hint:
Consider F[X]/(X − a) where a ∈ F and show that F[X]/(X − a) ≅ F as
rings.)
11. Prove Theorem 2.7.
12. Prove Lemma 2.9.
11. Prove Theorem 2.7.
12. Prove Lemma 2.9.
13. Let M be an R-module and let f E EndR(M) be an idempotent endomor-
phism of M, i.e., f o f = f. (That is, f is an idempotent element of the ring
EndR(M).) Show that
M ≅ Ker(f) ⊕ Im(f).
14. Prove the remaining cases in Theorem 3.10.
15. Let R be a PID and let a and b ∈ R be nonzero elements. Then show
that Hom_R(R/Ra, R/Rb) ≅ R/Rd where d = (a, b) is the greatest common
divisor of a and b.
16. Compute Hom_Z(Q, Z).
17. Give examples of short exact sequences of R-modules
0 → M_1 → M → M_2 → 0
and
0 → N_1 → N → N_2 → 0
such that
(a) M_1 ≅ N_1, M ≅ N, M_2 ≇ N_2;
(b) M_1 ≅ N_1, M ≇ N, M_2 ≅ N_2;
(c) M_1 ≇ N_1, M ≅ N, M_2 ≅ N_2.
18. Show that there is a split exact sequence
0 → mZ_{mn} → Z_{mn} → nZ_{mn} → 0
of Z_{mn}-modules if and only if (m, n) = 1.
19. Let N_1 and N_2 be submodules of an R-module M. Show that there is an
exact sequence
0 → N_1 ∩ N_2 → N_1 ⊕ N_2 → N_1 + N_2 → 0
where the first map is p(x) = (x, x) and the second is m(x, y) = x − y.
20. Let R be an integral domain and let a and b be nonzero elements of R. Let
M = R/R(ab) and let N = Ra/R(ab). Then M is an R-module and N is a
submodule. Show that N is a complemented submodule of M if and only if
there are u, v ∈ R such that ua + vb = 1.
21. Let R be a ring, M a finitely generated R-module, and φ : M → R^n a
surjective R-module homomorphism. Show that Ker(φ) is finitely generated.
(Note that this is valid even when M has submodules that are not finitely
generated.) (Hint: Consider the short exact sequence
0 → K → M → R^n → 0.)
22. Suppose that
0 → M_1 → M → M_2 → 0
      |f     |g     |h
      v      v      v
0 → N_1 → N → N_2 → 0
is a commutative diagram of R-modules and R-module homomorphisms.
Assume that the rows are exact and that f and h are isomorphisms. Then
prove that g is an isomorphism.
23. Let R be a commutative ring and S a multiplicatively closed subset of R
containing no zero divisors. If M is an R-module, then Ms was defined in
Exercise 6. Prove that the operation of forming quotients with elements of
S is exact. Precisely:
(a) Suppose that M′ →^f M →^g M″ is a sequence of R-modules and homo-
morphisms which is exact at M. Show that the sequence
M′_S →^{f_S} M_S →^{g_S} M″_S
is an exact sequence of R_S-modules and homomorphisms.
(b) As a consequence of part (a), show that if M′ is a submodule of M, then
M′_S can be identified with an R_S-submodule of M_S.
(c) If N and P are R-submodules of M, show (under the identification
of part (b)) that (N + P)_S = N_S + P_S and (N ∩ P)_S = N_S ∩ P_S.
(That is, formation of fractions commutes with finite sums and finite
intersections.)
(d) If N is a submodule of M, show that
(M/N)_S ≅ (M_S)/(N_S).
A = [ 1 1 −1 ]
    [ 0 2  3 ]
Show that the two rows of A are linearly independent over R, but that any
two of the three columns are linearly dependent over R.
30. Let V be a finite-dimensional complex vector space. Then V is also a vector
space over R. Show that dim_R V = 2 dim_C V. (Hint: If
B = {v_1, …, v_n}
is a basis of V over C, show that
B′ = {v_1, …, v_n, iv_1, …, iv_n}
is a basis of V over R.)
31. Extend Exercise 30 as follows. Let L be a field and let K be a subfield of L.
If V is a vector space over L, then it is also a vector space over K. Prove
that
dim_K V = [L : K] dim_L V
where [L : K] = dim_K L is the dimension of L as a vector space over K.
(Note that we are not assuming that dim_K L < ∞.)
32. Let K ⊆ L be fields and let V be a vector space over L. Suppose that
B = {u_α}_{α∈Γ} is a basis of V as an L-module, and let W be the K-submodule
of V generated by B. Let U ⊆ W be any K-submodule, and let U_L be the
L-submodule of V generated by U. Prove that
U_L ∩ W = U.
That is, taking L-linear combinations of elements of U does not produce any
new elements of W.
33. Let K ⊆ L be fields and let A ∈ M_{m,n}(K), b ∈ M_{m,1}(K). Show that the matrix
equation AX = b has a solution X ∈ M_{n,1}(K) if and only if it has a solution
X ∈ M_{n,1}(L).
34. Prove that the Lagrange interpolation polynomials (Proposition 2.4.10) and
the Newton interpolation polynomials (Remark 2.4.11) each form a basis of
the vector space of polynomials of degree ≤ n with coefficients from F.
35. Let F denote the set of all functions from Z⁺ to Z⁺, and let M be the
free Q-module with basis F. Define a multiplication on M by the formula
(f·g)(n) = f(n) + g(n) for all f, g ∈ F, and extend this multiplication by
linearity to all of M. Let f_m be the function f_m(n) = δ_{mn} for all m, n ∈ Z⁺.
Show that each f_m is irreducible (in fact, prime) as an element of the ring
M. Now consider the function f(n) = 1 for all n ∈ Z⁺. Show that f does not
have a factorization into irreducible elements in M. (Hint: It may help to
think of f as the "infinite monomial"
X_0^{f(0)} X_1^{f(1)} ⋯ X_m^{f(m)} ⋯.)
(Compare this exercise with Example 2.5.15.)
36. Let F be a field, and let
P = {p_α(X) : p_α(X) is an irreducible monic polynomial in F[X]}.
We will say that a rational function h(X) = f(X)/g(X) ∈ F(X) is proper
if deg(f(X)) < deg(g(X)). Let F(X)_pr denote the set of all proper rational
functions in F(X).
(a) Prove that F(X) ≅ F[X] ⊕ F(X)_pr as F-modules.
(b) Prove that
B = { X^j/(p_α(X))^k : p_α(X) ∈ P; 0 ≤ j < deg(p_α(X)), k ≥ 1 }
is a basis of F(X)_pr as an F-module. The expansion of a proper rational
function with respect to the basis B is known as the partial fraction
expansion; it should be familiar from elementary calculus.
37. Prove that Q is not a projective Z-module.
38. Let
R = {f : [0, 1] → R : f is continuous and f(0) = f(1)}
and let
M = {f : [0, 1] → R : f is continuous and f(0) = −f(1)}.
Then R is a ring under addition and multiplication of functions, and M is
an R-module. Show that M is a projective R-module that is not free. (Hint:
Show that M ⊕ M ≅ R ⊕ R.)
39. Show that submodules of projective modules need not be projective. (Hint:
Consider pZ_{p²} ⊆ Z_{p²} as Z_{p²}-modules.) Over a PID, show that submodules
of projective modules are projective.
40. (a) If R is a Dedekind domain, prove that R is Noetherian.
(b) If R is an integral domain that is a local ring (i.e., R has a unique
maximal ideal), show that any invertible ideal I of R is principal.
(c) Let R be an integral domain and S ⊆ R \ {0} a multiplicatively closed
subset. If I is an invertible ideal of R, show that I_S is an invertible ideal
of R_S.
(d) Show that in a Dedekind domain R, every nonzero prime ideal is maxi-
mal. (Hint: Let M be a maximal ideal of R containing a prime ideal P,
and let S = R \ M. Apply parts (b) and (c).)
41. Show that Z[√−3] is not a Dedekind domain.
42. Show that Z[X] is not a Dedekind domain. More generally, let R be any
integral domain that is not a field. Show that R[X] is not a Dedekind domain.
43. Suppose R is a PID and M = R(x) is a cyclic R-module with Ann M = (a) ≠
(0). Show that if N is a submodule of M, then N is cyclic with Ann N = (b)
where b is a divisor of a. Conversely, show that M has a unique submodule
N with annihilator (b) for each divisor b of a.
44. Let R be a PID, M an R-module, x ∈ M with Ann(x) = (a) ≠ (0). Factor
a = up_1^{n_1} ⋯ p_k^{n_k} with u a unit and p_1, …, p_k distinct primes. Let y ∈ M
with Ann(y) = (b) ≠ (0), where b = vp_1^{m_1} ⋯ p_k^{m_k} (v a unit) with 0 ≤ m_i < n_i
for 1 ≤ i ≤ k. Show that Ann(x + y) = (a).
45. Let R be a PID, let M be a free R-module of finite rank, and let N C M be a
submodule. If M/N is a torsion R-module, prove that rank(M) = rank(N).
46. Let R be a PID and let M and N be free R-modules of the same finite rank.
Then an R-module homomorphism f : M - N is an injection if and only if
N/ Im(f) is a torsion R-module.
47. Let u = (a, b) ∈ Z².
(a) Show that there is a basis of Z² containing u if and only if a and b are
relatively prime.
(b) Suppose that u = (5, 12). Find a v ∈ Z² such that {u, v} is a basis of
Z².
48. Let M be a torsion module over a PID R and assume Ann(M) = (a) ≠ (0).
If a = p_1^{n_1} ⋯ p_k^{n_k} where p_1, …, p_k are the distinct prime factors of a, then
show that M_{p_i} = q_iM where q_i = a/p_i^{n_i}. Recall that if p ∈ R is a prime,
then M_p denotes the p-primary component of M.
49. Let M be a torsion-free R-module over a PID R, and assume that x ∈ M is
a primitive element. If px = qx′, show that q | p.
50. Find a basis and the invariant factors for the submodule of Z³ generated by
x_1 = (1, 0, −1), x_2 = (4, 3, −1), x_3 = (0, 9, 3), and x_4 = (3, 12, 3).
51. Find a basis for the submodule of Q[X]³ generated by
68. If f(X_1, …, X_n) ∈ R[X_1, …, X_n], the degree of f is the highest degree of
a monomial in f with nonzero coefficient, where
The theme of the present chapter will be the application of the structure theo-
rem for finitely generated modules over a PID (Theorem 3.7.1) to canonical form
theory for a linear transformation from a vector space to itself. The fundamental
results will be presented in Section 4.4. We will start with a rather detailed in-
troduction to the elementary aspects of matrix algebra, including the theory of
determinants and matrix representation of linear transformations. Most of this
general theory will be developed over an arbitrary (in most instances, commuta-
tive) ring, and we will only specialize to the case of fields when we arrive at the
detailed applications in Section 4.4.
ent_ij(AB) = [row_i(A)][col_j(B)].
Let I_n ∈ M_n(R) be defined by ent_ij(I_n) = δ_ij, where δ_ij is the Kronecker
delta function, i.e., δ_ii = 1 and δ_ij = 0 if i ≠ j. The following lemma contains
some basic properties of matrix multiplication. In part (c), the concept of
center of a ring is needed. If R is a ring, then the center of R, denoted
C(R), is defined by
C(R) = {a ∈ R : ab = ba for all b ∈ R}.
Note that C(R) is a subring of R and R is commutative if and only if
R = C(R).
Proof. Exercise.
Recall (from Example 2.1 (8)) that the matrix E_ij ∈ M_n(R) is defined
to be the matrix with 1 in the ijth position and 0 elsewhere. E_ij is called
a matrix unit, but it should not be confused with a unit in the matrix ring
(1.2) Lemma. Let {E_ij : 1 ≤ i, j ≤ n} be the set of matrix units in M_n(R)
and let A = [a_ij] ∈ M_n(R). Then
(1) E_ij E_kl = δ_jk E_il.
(2) Σ_{i=1}^n E_ii = I_n.
(3) A = Σ_{i,j=1}^n a_ij E_ij = Σ_{i,j=1}^n E_ij a_ij.
(4) E_ij A E_kl = a_jk E_il.
Proof. Exercise.
Remark. When speaking of matrix units, we will generally mean the ma-
trices E_ij ∈ M_n(R). However, for every m, n there is a set {E_ij}_{i=1,j=1}^{m,n} ⊆
M_{m,n}(R) where E_ij has 1 in the (i, j) position and 0 elsewhere. Then, with
appropriately adjusted indices, items (3) and (4) in Lemma 1.2 are valid.
Moreover, {E_ij}_{i=1,j=1}^{m,n} is a basis of M_{m,n}(R) as both a left R-module and
a right R-module. Hence, if R is commutative so that rank makes sense
(Remark 3.6.19), then it follows that rank_R(M_{m,n}(R)) = mn.
and
row_k(E_lj A) = [row_k(E_lj)]A = δ_lk row_j(A).
Comparing entries in these n pairs of matrices shows that a_sj = 0 if s ≠ j
and a_jj = a_11. Since j is arbitrary, this shows that A = a_11 I_n is a scalar
matrix. Since A must commute with all scalar matrices, it follows that
a_11 ∈ C(R).
A unit in the ring M_n(R) is a matrix A such that there is some matrix B
with AB = BA = I_n. Such matrices are said to be invertible or unimodular.
The set of invertible matrices in M_n(R) is denoted GL(n, R) and is called
the general linear group over R of dimension n. Note that GL(n, R) is, in
fact, a group since it is the group of units of a ring.
If A = [a_ij] ∈ M_{m,n}(R), then the transpose of A, denoted A^t ∈
M_{n,m}(R), is defined by ent_ij(A^t) = a_ji.
The following formulas for transpose are straightforward and are left as an
exercise.
Tr(AB) = Tr(BA).
Tr(S⁻¹AS) = Tr(A).
Proof. (1)
Tr(AB) = Σ_{i=1}^m ent_ii(AB)
       = Σ_{i=1}^m Σ_{k=1}^n ent_ik(A) ent_ki(B)
       = Σ_{k=1}^n Σ_{i=1}^m ent_ki(B) ent_ik(A)
       = Σ_{k=1}^n ent_kk(BA)
       = Tr(BA).
(1.8) Definition. Let R be a ring and let {E_ij}_{i,j=1}^n ⊆ M_n(R) be the set of
matrix units. Then we define a number of particularly useful matrices in
M_n(R).
(1) For a ∈ R and i ≠ j define
T_ij(a) = I_n + aE_ij.
The matrix T_ij(a) is called an elementary transvection. T_ij(a) differs
from I_n only in the ijth position, where T_ij(a) has an a.
(2) If a ∈ R* is a unit of R and 1 ≤ i ≤ n, then
D_i(a) = I_n − E_ii + aE_ii
is called an elementary dilation. D_i(a) agrees with the identity matrix
I_n except that it has an a (rather than a 1) in the ith diagonal position.
(3) The matrix
P_ij = I_n − E_ii − E_jj + E_ij + E_ji
is called an elementary permutation matrix. Note that P_ij is obtained
from the identity I_n by interchanging rows i and j (or columns i and
j).
(1.9) Definition. If R is a ring, then matrices of the form D_i(a), T_ij(a), and
P_ij are called elementary matrices over R. The integer n is not included
in the notation for the elementary matrices, but is determined from the
context.
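For concreteness, the three kinds of elementary matrices translate directly into code. A minimal numpy sketch (function names are ours, with 0-based indices):

    import numpy as np

    def transvection(n, i, j, a):
        """T_ij(a) = I_n + a*E_ij  (requires i != j)."""
        T = np.eye(n)
        T[i, j] = a
        return T

    def dilation(n, i, a):
        """D_i(a) = I_n - E_ii + a*E_ii  (a must be a unit)."""
        D = np.eye(n)
        D[i, i] = a
        return D

    def permutation(n, i, j):
        """P_ij: the identity with rows i and j interchanged."""
        P = np.eye(n)
        P[[i, j]] = P[[j, i]]
        return P

    # Left multiplication by T_ij(a) adds a times row j to row i:
    A = np.arange(9.0).reshape(3, 3)
    B = transvection(3, 0, 2, 5.0) @ A
    assert np.allclose(B[0], A[0] + 5.0 * A[2])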
A = [ A_11 ⋯ A_1t ]
    [  ⋮        ⋮  ]
    [ A_s1 ⋯ A_st ]
where each block A_ij is a matrix of size m_i × n_j with entries in R. Two
particularly important partitions of A are the partition by rows
A = [ A_1 ]   where A_i = row_i(A) ∈ M_{1,n}(R)
    [  ⋮  ]
    [ A_m ]
and the partition by columns. Suppose that
A = [ A_11 ⋯ A_1t ]   and   B = [ B_11 ⋯ B_1u ]
    [  ⋮        ⋮  ]             [  ⋮        ⋮  ]
    [ A_s1 ⋯ A_st ]             [ B_t1 ⋯ B_tu ]
are partitioned matrices. Can the product C = AB be computed as a
partitioned matrix C = [C_ij] where C_ij = Σ_{k=1}^t A_ik B_kj? The answer is yes,
provided all of the required multiplications make sense. In fact, parts (5)
and (6) of Lemma 1.1 are special cases of this type of multiplication for
the partitions that come from rows and columns. Specifically, the equation
AB = [ A_1 ] B = [ A_1B ]
     [  ⋮  ]     [  ⋮   ]
     [ A_m ]     [ A_mB ]
and the corresponding column formula, in which each column col_s(B) is
assembled from the columns col_r(B_ij) of the blocks, are of this type.
From Equation (1.2) we conclude that
ent_qr(C_ij) = ent_αβ(C)
            = Σ_{γ=1}^n a_αγ b_γβ
            = Σ_{γ=1}^{n_1} a_αγ b_γβ + Σ_{γ=n_1+1}^{n_2} a_αγ b_γβ + ⋯ + Σ_{γ=n_{t−1}+1}^n a_αγ b_γβ
            = ent_qr(A_i1 B_1j) + ent_qr(A_i2 B_2j) + ⋯ + ent_qr(A_it B_tj),
and the result is proved. □
A = [ A_1 0  ⋯ 0  ]
    [ 0  A_2 ⋯ 0  ]
    [ ⋮          ⋮ ]
    [ 0  0  ⋯ A_r ]
is a block diagonal matrix, then we say that A is the direct sum of the
matrices A_1, …, A_r, and we denote this direct sum by
A = A_1 ⊕ ⋯ ⊕ A_r.
Thus, if A_i ∈ M_{m_i,n_i}(R), then A_1 ⊕ ⋯ ⊕ A_r ∈ M_{m,n}(R) where m = Σ_{i=1}^r m_i
and n = Σ_{i=1}^r n_i.
The following result contains some straightforward results concerning
the algebra of direct sums of matrices:
(1.15) Lemma. Let R be a ring, and let A_1, …, A_r and B_1, …, B_r be
matrices over R of appropriate sizes. (The determination of the needed size
is left to the reader.) Then
(1) (⊕_{i=1}^r A_i) + (⊕_{i=1}^r B_i) = ⊕_{i=1}^r (A_i + B_i),
(2) (⊕_{i=1}^r A_i)(⊕_{i=1}^r B_i) = ⊕_{i=1}^r (A_iB_i),
(3) (⊕_{i=1}^r A_i)⁻¹ = ⊕_{i=1}^r A_i⁻¹ if A_i ∈ GL(n_i, R), and
(4) Tr(⊕_{i=1}^r A_i) = Σ_{i=1}^r Tr(A_i) if A_i ∈ M_{n_i}(R).
Proof. Exercise.
(1.16) Definition. Let R be a commutative ring, let A ∈ M_{m_1,n_1}(R), and let
B ∈ M_{m_2,n_2}(R). Then the tensor product or Kronecker product of A and
B, denoted A ⊗ B ∈ M_{m_1m_2,n_1n_2}(R), is the partitioned matrix
(1.3) A ⊗ B = [ a_11 B  ⋯  a_{1n_1} B ]
              [   ⋮            ⋮     ]
              [ a_{m_1 1} B ⋯ a_{m_1 n_1} B ]
(1.17) Examples.
(1) I_m ⊗ I_n = I_{mn}.
(2) I_m ⊗ B = ⊕_{i=1}^m B.
(3) [ a b ] ⊗ I_2 = [ a 0 b 0 ]
    [ c d ]         [ 0 a 0 b ]
                    [ c 0 d 0 ]
                    [ 0 c 0 d ]
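These examples are immediate to reproduce numerically: numpy's kron implements exactly the partitioned matrix [a_ij B] of Definition 1.16. Checking a numerical variant of Example (3):

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    # A ⊗ I_2 places a_ij * I_2 in block position (i, j)
    print(np.kron(A, np.eye(2, dtype=int)))
    # [[1 0 2 0]
    #  [0 1 0 2]
    #  [3 0 4 0]
    #  [0 3 0 4]]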
Proof. By Proposition 1.14, C_ij, the (i, j) block of (A_1 ⊗ B_1)(A_2 ⊗ B_2), is
given by
C_ij = ( Σ_{k=1}^{n_1} ent_ik(A_1) ent_kj(A_2) ) B_1B_2,
Proof. □
Proof. Exercise. □
D(B) = aD(A).
(2) If A, B, C E Mn(R) are identical in all rows except for the ith row
and
rowi(C) = rowi(A) + rowi(B),
then
D(C) = D(A) + D(B).
Note that property (2) does not say that D(A + B) = D(A) + D(B).
This is definitely not true if n > 1.
One may also speak of n-linearity on columns, but Proposition 2.9
will show that there is no generality gained in considering both types of
n-linearity. Therefore, we shall concentrate on rows.
(2.2) Examples.
(1) Let D_1 and D_2 be n-linear functions. Then for any choice of a and b in
R, the function D : M_n(R) → R defined by D(A) = aD_1(A) + bD_2(A)
is also an n-linear function. That is, the set of n-linear functions on
M_n(R) is closed under addition and scalar multiplication of functions,
i.e., it is an R-module.
(2) Let σ ∈ S_n be a permutation and define D_σ : M_n(R) → R by the
formula
D_σ(A) = a_{1σ(1)} a_{2σ(2)} ⋯ a_{nσ(n)}
where A = [a_ij]. It is easy to check that D_σ is an n-linear function,
but it is not a determinant function since it is not alternating.
(3) Let f : S_n → R be any function and define D_f : M_n(R) → R by the
formula D_f = Σ_{σ∈S_n} f(σ)D_σ. Applying this to a specific A = [a_ij] ∈
M_n(R) gives
0 = D(B) = D(A_i + A_j, A_i + A_j)
         = D(A_i, A_i + A_j) + D(A_j, A_i + A_j)
         = D(A_i, A_i) + D(A_i, A_j) + D(A_j, A_i) + D(A_j, A_j)
         = D(A_i, A_j) + D(A_j, A_i)
         = D(A) + D(P_ij A)
by Proposition 1.12, where only rows i and j are displayed. Thus, D(P_ij A) =
−D(A), and the lemma is proved. □
(2.5) Remark. Lemma 2.4 is the reason for giving the name "alternating"
to the property that D(A) = 0 for a matrix A that has two equal rows.
Indeed, suppose D has the property given by Lemma 2.4, and let A be a
matrix with rows i and j equal. Then PijA = A, so from the property of
Let E_i = row_i(I_n) for 1 ≤ i ≤ n, and consider all n × n matrices
formed by using the matrices E_i as rows. To develop a convenient notation,
let Ω_n denote the set of all functions ω : {1, 2, …, n} → {1, 2, …, n},
and if ω ∈ Ω_n, let P_ω denote the n × n matrix with row_i(P_ω) = E_{ω(i)}. For
example, if n = 3 and ω(1) = 2, ω(2) = 1, and ω(3) = 2, then
P_ω = [ 0 1 0 ]
      [ 1 0 0 ]
      [ 0 1 0 ]
Since ω⁻¹ = (i_t j_t)(i_{t−1} j_{t−1}) ⋯ (i_1 j_1), the second equality, together with
Proposition 1.13 (3), shows that
col_i(P_ω) = (E_{ω⁻¹(i)})^t.
Therefore, again using Proposition 1.13 (3), we see that AP_ω is the matrix
defined by
col_i(AP_ω) = col_{ω⁻¹(i)}(A).
To summarize, left multiplication by P_ω permutes the rows of A, following
the permutation ω, while right multiplication by P_ω permutes the columns
of A, following the permutation ω⁻¹.
Recall that the sign of a permutation a, denoted sgn(o), is +1 if a is
a product of an even number of transpositions and -1 if a is a product of
an odd number of transpositions.
Proof. (1) If ω ∉ S_n then ω(i) = ω(j) for some i ≠ j, so that P_ω has two
rows that are equal. Thus, D(P_ω) = 0.
(2) If ω ∈ S_n, write ω = (i_1 j_1) ⋯ (i_t j_t) as a product of transpositions
to get (by Proposition 1.12)
P_ω = P_{i_1j_1} ⋯ P_{i_tj_t}.
By Lemma 2.4, we conclude that
D(P_ω) = (−1)^t D(I_n) = sgn(ω)D(I_n),
and the lemma is proved. □
and thus,
A = [ Σ_{j_1=1}^n a_{1j_1} E_{j_1} ]
    [ Σ_{j_2=1}^n a_{2j_2} E_{j_2} ]
    [          ⋮            ]
    [ Σ_{j_n=1}^n a_{nj_n} E_{j_n} ]
If D is a determinant function, we may compute D(A) using (2.1), the
n-linearity of D, and Lemma 2.7 as follows (the rows of each argument are
listed in brackets, separated by semicolons):
D(A) = D[ Σ_{j_1} a_{1j_1}E_{j_1} ; Σ_{j_2} a_{2j_2}E_{j_2} ; ⋯ ; Σ_{j_n} a_{nj_n}E_{j_n} ]
     = Σ_{j_1=1}^n a_{1j_1} D[ E_{j_1} ; Σ_{j_2} a_{2j_2}E_{j_2} ; ⋯ ; Σ_{j_n} a_{nj_n}E_{j_n} ]
     = Σ_{j_1=1}^n Σ_{j_2=1}^n a_{1j_1}a_{2j_2} D[ E_{j_1} ; E_{j_2} ; ⋯ ; Σ_{j_n} a_{nj_n}E_{j_n} ]
     = Σ_{j_1=1}^n ⋯ Σ_{j_n=1}^n a_{1j_1}a_{2j_2} ⋯ a_{nj_n} D[ E_{j_1} ; E_{j_2} ; ⋯ ; E_{j_n} ]
     = Σ_{ω∈Ω_n} a_{1ω(1)}a_{2ω(2)} ⋯ a_{nω(n)} D(P_ω).
Thus, we have arrived at the uniqueness part of the following result since
formula (2.2) is the formula which must be used to define any determinant
function.
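Formula (2.2) can also be read as a (hopelessly inefficient) algorithm. A minimal Python sketch (function names are ours), summing over all permutations exactly as in the computation above:

    from itertools import permutations
    from math import prod

    def sign(p):
        """Sign of the permutation p (a tuple of 0..n-1), via cycle lengths."""
        seen, s = set(), 1
        for i in range(len(p)):
            if i not in seen:
                length, j = 0, i
                while j not in seen:
                    seen.add(j)
                    j = p[j]
                    length += 1
                if length % 2 == 0:   # an even-length cycle is an odd permutation
                    s = -s
        return s

    def det(A):
        """det A = sum over sigma of sgn(sigma) a_{1,sigma(1)} ... a_{n,sigma(n)}."""
        n = len(A)
        return sum(sign(p) * prod(A[i][p[i]] for i in range(n))
                   for p in permutations(range(n)))

    assert det([[1, 2], [3, 4]]) == -2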
But σ ↦ σ⁻¹ is a bijection of S_n, and hence, we conclude
det(A) = Σ_{σ∈S_n} sgn(σ) a_{σ⁻¹(σ(1)) σ(1)} ⋯ a_{σ⁻¹(σ(n)) σ(n)}
       = Σ_{σ∈S_n} sgn(σ) a_{σ⁻¹(1) 1} ⋯ a_{σ⁻¹(n) n}
       = Σ_{σ∈S_n} sgn(σ⁻¹) a_{σ⁻¹(1) 1} ⋯ a_{σ⁻¹(n) n}
       = Σ_{τ∈S_n} sgn(τ) a_{τ(1) 1} ⋯ a_{τ(n) n}
       = det(A^t).
Here we have used the fact that sgn(σ⁻¹) = sgn(σ) and that
a_{σ⁻¹(σ(1)) σ(1)} ⋯ a_{σ⁻¹(σ(n)) σ(n)} = a_{σ⁻¹(1) 1} ⋯ a_{σ⁻¹(n) n}.
This last equation is valid because R is commutative and {σ(1), …, σ(n)}
is just a reordering of {1, …, n} for any σ ∈ S_n. □
det(S⁻¹AS) = det(A).
That is, similar matrices have the same determinant.
Proof. Exercise. □
Similarly to the proof of Theorem 2.11, one can obtain the formula for
the determinant of a direct sum of two matrices.
There is also a simple formula for the determinant of the tensor product
of two square matrices.
and
(2) Σ_{k=1}^n (−1)^{k+i} a_ik det(A_jk) = δ_ij det(A)
for all A ∈ M_n(R). A direct calculation shows that D_ij(I_n) = δ_ij, so that
Equation (2.6) yields (1) of the theorem.
Formula (2) (cofactor expansion along row j) is obtained by applying
formula (1) to the matrix A^t and using the fact (Proposition 2.10) that
det(A) = det(A^t). □
(2.16) Definition. If A ∈ M_n(R), then we define the cofactor matrix of A,
denoted Cofac(A), by the formula ent_ij(Cofac(A)) = (−1)^{i+j} det(A_ij), and
Adj(A) = (Cofac(A))^t.
(2.19) Examples.
(1) A matrix A E Mn(Z) is invertible if and only if det(A) = ±1.
(2) If F is a field, a matrix A ∈ M_n(F[X]) is invertible if and only if
det(A) ∈ F* = F \ {0}.
(3) If F is a field and A ∈ M_n(F[X]), then for each a ∈ F, evaluation of
each entry of A at a ∈ F gives a matrix A(a) ∈ M_n(F). If det(A) =
f(X) ≠ 0 ∈ F[X], then A(a) is invertible whenever f(a) ≠ 0. Thus,
A(a) is invertible for all but finitely many a ∈ F.
(2.23) Remarks.
(1) M-rank(A) = 0 means that Ann(F_1(A)) ≠ 0. That is, there is a nonzero
a ∈ R with aa_ij = 0 for all a_ij. Note that this is stronger than saying
that every element of A is a zero divisor. For example, if A = [2 3] ∈
M_{1,2}(Z_6), then every element of A is a zero divisor in Z_6, but there is
no single nonzero element of Z_6 that annihilates both entries in the
matrix.
(2) If A ∈ M_n(R), then M-rank(A) = n means that det(A) is not a zero
divisor of R.
(3) To say that M-rank(A) = t means that there is an a ≠ 0 ∈ R with
aD = 0 for all (t + 1) × (t + 1) minors D of A, but there is no nonzero
b ∈ R which annihilates all t × t minors of A by multiplication. In
particular, if det A[α | β] is not a zero divisor of R for some α ∈ Q_{s,m},
β ∈ Q_{s,n}, then M-rank(A) ≥ s.
Proof. Exercise. □
x_i = bd_i if 1 ≤ i ≤ t + 1,
x_i = 0 if t + 2 ≤ i ≤ n.
Then X ≠ 0 since x_{t+1} = b det A[α | β] ≠ 0. But Equation (2.7) and the
fact that b ∈ Ann(F_{t+1}(A)) show that
AX = [ b Σ_{j=1}^{t+1} a_1j d_j ]   [ b det A[(1, …, t, t + 1) | (1, …, t, t + 1)] ]
     [          ⋮          ] = [                     ⋮                      ] = 0.
     [ b Σ_{j=1}^{t+1} a_mj d_j ]   [ b det A[(1, …, t, m) | (1, …, t, t + 1)] ]
Proof. If A ∈ M_n(R) then F_n(A) = (det A), so M-rank(A) < n if and only
if det A is a zero divisor. In particular, if R is an integral domain then
M-rank(A) < n if and only if det A = 0. □
There are still two other concepts of rank which can be defined for
matrices with entries in a commutative ring.
(2.29) Corollary.
(1) If R is a commutative ring and A ∈ M_{m,n}(R), then
Proof. Exercise. □
(2.30) Remarks.
(1) If R is an integral domain, then all four ranks of a matrix A ∈ M_{m,n}(R)
are equal, and we may speak unambiguously of the rank of A. This will
be denoted by rank(A).
(2) The condition that R be an integral domain in Corollary 2.29 (2) is
necessary. As an example of a matrix that has all four ranks different,
consider A ∈ M_4(Z_210) defined by
A = [ 0 2 3 5 ]
    [ 2 0 6 0 ]
    [ 3 0 3 0 ]
    [ 0 0 0 7 ]
Σ_{i=1}^m ( Σ_{j=1}^n a_ij x_j ) w_i = 0
since AX = 0. Therefore, S is R-linearly dependent. □
Proof. Suppose that α = (α_1, …, α_t), β = (β_1, …, β_t), and let C =
AB[α | β]. Thus,
ent_ij(C) = row_{α_i}(A) col_{β_j}(B) = Σ_{k=1}^n a_{α_i k} b_{k β_j},
so that
C = [ Σ_{k=1}^n a_{α_1 k} b_{k β_1} ⋯ Σ_{k=1}^n a_{α_1 k} b_{k β_t} ]
    [              ⋮                           ⋮             ]
    [ Σ_{k=1}^n a_{α_t k} b_{k β_1} ⋯ Σ_{k=1}^n a_{α_t k} b_{k β_t} ]
Using n-linearity of the determinant function, we conclude that
(2.9) det C = Σ_{k_1=1}^n ⋯ Σ_{k_t=1}^n a_{α_1 k_1} ⋯ a_{α_t k_t} det [ b_{k_1 β_1} ⋯ b_{k_1 β_t} ]
                                                                     [      ⋮           ⋮      ]
                                                                     [ b_{k_t β_1} ⋯ b_{k_t β_t} ]
If k_i = k_j for i ≠ j, then the ith and jth rows of the matrix on the right are
equal, so the determinant is 0. Thus the only possible nonzero determinants
on the right occur if the sequence (k_1, …, k_t) is a permutation of a sequence
γ ∈ Q_{t,n}, and we conclude that
det C = Σ_{γ∈Q_{t,n}} det A[α | γ] det B[γ | β].
(2.35) Examples.
(1) The Cauchy-Binet formula gives another verification of the fact that
det(AB) = det A det B for square matrices A and B. In fact, the only
element of Q_{n,n} is the sequence γ = (1, 2, …, n), and A[γ | γ] = A,
B[γ | γ] = B, and AB[γ | γ] = AB, so the product formula for
determinants follows.
Equation (2.11) easily gives the following result, which shows that de-
terminantal rank is not changed by multiplication by nonsingular matrices.
D-rank(UAV) = D-rank(A).
where
s(γ) = Σ_{j=1}^t γ_j
and
γ_1 < ⋯ < γ_k = p < γ_{k+1} < ⋯ < γ_{k+r−1} < q < γ_{k+r} < ⋯ < γ_t
and
A[α | γ′] = A[α | γ]P_ω,
where ω is the r-cycle (k + r − 1, k + r − 2, …, k). Similarly,
A[ᾱ | γ̄′] = A[ᾱ | γ̄]P_{ω′},
where ω′ is a (q − p + 1 − r)-cycle. Thus,
det(A[α | γ′]) det(A[ᾱ | γ̄′])
  = (−1)^{s(γ)+(r−1)+(q−p)−r} det(A[α | γ]) det(A[ᾱ | γ̄]).
[f(m)]_B = [f]_B [m]_B.
However, when we try to apply this to m = rv, we get f(m) = f(rv) =
rf(v) = r(sv) = (rs)v, so [f(m)]_B = [rs], while [f]_B [m]_B = [s][r] = [sr]. If
R is commutative, these are equal, but in general they are not.
On the other hand, this formulation of the problem points the way to
its solution. Namely, recall that we have the ring R^op (Remark 3.1.2 (3))
whose elements are the elements of R, whose addition is the same as that of
R, but whose multiplication is given by r·s = sr, where on the right-hand
side we have the multiplication of R. Then, indeed, the equation
good ring, neither better nor worse than R itself. Thus, we adopt a second
solution. Let op : R → R^op be the function that is the identity on elements,
i.e., op(t) = t for every t ∈ R. Then we have op(sr) = op(r) op(s), where the
multiplication on the right-hand side, written as usual as juxtaposition, is
the multiplication in R^op. This notation also has the advantage of reminding
us that t ∈ R, but op(t) ∈ R^op.
Note that if R is commutative then R^op = R and op is the identity,
which in this case is a ring homomorphism. In fact, op : R → R^op is a ring
homomorphism if and only if R is commutative. In most applications of
matrices, it is the case of commutative rings that is of primary importance.
If you are just interested in the commutative case (as you may well be),
we advise you to simply mentally (not physically) erase "op" whenever
it appears, and you will have formulas that are perfectly legitimate for a
commutative ring R.
After this rather long introduction, we will now proceed to the for-
mal mathematics of associating matrices with homomorphisms between free
modules.
If R is a ring, M is a free R-module of rank n, and B = {v_1, …, v_n}
is a basis of M, then we may write v ∈ M as v = a_1v_1 + ⋯ + a_nv_n for
unique a_1, …, a_n ∈ R. This leads us to the definition of coordinates. Define
ψ : M → M_{n,1}(R^op) by
ψ(v) = [v]_B = [ op(a_1) ]
               [    ⋮    ]
               [ op(a_n) ]
The n × 1 matrix [v]_B is called the coordinate matrix of v with respect to
the basis B.
Suppose that B′ = {v_1′, …, v_n′} is another basis of M and define the
matrix P^B_{B′} ∈ M_n(R^op) by the formula
(3.1) P^B_{B′} = [[v_1]_{B′} [v_2]_{B′} ⋯ [v_n]_{B′}].
That is, col_j(P^B_{B′}) = [v_j]_{B′}. The matrix P^B_{B′} is called the change of basis
matrix from the basis B to the basis B′. Since v_j = Σ_{i=1}^n δ_ij v_i, it follows
that if B = B′ then P^B_B = I_n.
(3.1) Proposition. Let M be a free R-module of rank n, and let B, B′, and
B″ be bases of M. Then
(1) for any v ∈ M, [v]_{B′} = P^B_{B′}[v]_B;
(2) P^B_{B″} = P^{B′}_{B″} P^B_{B′}; and
(3) P^B_{B′} is invertible and (P^B_{B′})⁻¹ = P^{B′}_B.
Proof. (1) Consider the map ψ″ : M → M_{n,1}(R^op)
defined by ψ″(v) = P^B_{B′}[v]_B. To show that ψ″ = ψ′ we need only show
that they agree on elements of the basis B. If B = {v_1, …, v_n}, then
[v_i]_B = e_i = E_{i1} ∈ M_{n,1}(R) since v_i = Σ_{j=1}^n δ_ij v_j. Thus,
ψ″(v_i) = P^B_{B′} e_i = col_i(P^B_{B′}) = [v_i]_{B′} = ψ′(v_i),
as required.
(2) For any v ∈ M we have
[v]_{B″} = P^{B′}_{B″}[v]_{B′} = P^{B′}_{B″} P^B_{B′}[v]_B,
so (2) follows from (1).
(3) By (2),
P^{B′}_B P^B_{B′} = P^B_B = I_n.
Thus, P^B_{B′} is invertible and (P^B_{B′})⁻¹ = P^{B′}_B. □
(3.2) Lemma. Let R be a ring and let M be a free R-module. If B′ is any
basis of M with n elements and P ∈ GL(n, R^op) is any invertible matrix,
then there is a basis B of M such that
P = P^B_{B′}.
Proof. Let B′ = {v_1′, …, v_n′} and suppose that P = [op(p_ij)]. Let v_j =
Σ_{i=1}^n p_ij v_i′. Then B = {v_1, …, v_n} is easily checked to be a basis of M,
and by construction, P^B_{B′} = P. □
Remark. Note that the notation for the change of basis matrix
P^B_{B′} has been chosen so that the formula in Proposition 3.1 (2) is easy to
remember. That is, a superscript and an adjacent (on the right) subscript
that are equal cancel, as in
P^{B′}_{B″} P^B_{B′} = P^B_{B″}.
The same mnemonic device will be found to be useful in keeping track of
superscripts and subscripts in Propositions 3.5 and 3.6.
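In the commutative case (where, as advised above, "op" may be ignored), Proposition 3.1 (1) is easy to test numerically. A small numpy sketch with two bases of Q², written as the columns of B and Bp (both bases are our own choices):

    import numpy as np

    B  = np.array([[1.0, 1.0],
                   [0.0, 1.0]])   # basis B, as columns
    Bp = np.array([[1.0, 0.0],
                   [1.0, 1.0]])   # basis B', as columns

    # col_j(P) = [v_j]_{B'}: solve Bp P = B for the change of basis matrix
    P = np.linalg.solve(Bp, B)

    v_B = np.array([2.0, 3.0])    # coordinates of some v relative to B
    v = B @ v_B                   # the vector v itself
    assert np.allclose(Bp @ (P @ v_B), v)   # [v]_{B'} = P [v]_B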
(3.3) Definition. Let M and N be finite rank free R-modules with bases
B = {v_1, …, v_m} and C = {w_1, …, w_n} respectively. For each f ∈
Hom_R(M, N) define the matrix of f with respect to B, C, denoted [f]^B_C,
by
col_j([f]^B_C) = [f(v_j)]_C.
(3.4) Remarks.
(1) Note that every matrix A ∈ M_{n,m}(R^op) is the matrix [f]^B_C for a unique
f ∈ Hom_R(M, N). Indeed, if B = {v_1, …, v_m} and C = {w_1, …, w_n}
are bases of M and N respectively, and if A = [op(a_ij)], then define f ∈
Hom_R(M, N) by f(v_j) = Σ_{i=1}^n a_ij w_i. Such an f exists and is unique
by Proposition 3.4.9; it is clear from the construction that A = [f]^B_C.
Thus the mapping f ↦ [f]^B_C gives a bijection between Hom_R(M, N)
and M_{n,m}(R^op).
(2) Suppose that R is a commutative ring. Then we already know (see
Theorem 3.4.11) that Hom_R(M, N) is a free R-module of rank mn, as
is the R-module M_{n,m}(R); hence they are isomorphic as R-modules. A
choice of basis B for M and C for N provides an explicit isomorphism
Φ : Hom_R(M, N) → M_{n,m}(R),
defined by Φ(f) = [f]^B_C. We leave it as an exercise for the reader to
check that Φ(f_ij) = E_ji, where {E_ij} is the standard basis of
M_{n,m}(R), while {f_ij} is the basis of Hom_R(M, N) constructed there.
(3.5) Proposition. [f(v)]_C = [f]^B_C [v]_B.
Proof. If
[v]_B = [ op(b_1) ]
        [    ⋮    ]
        [ op(b_m) ]
then
f(v) = f( Σ_{j=1}^m b_j v_j )
     = Σ_{j=1}^m b_j f(v_j)
     = Σ_{j=1}^m b_j ( Σ_{i=1}^n a_ij w_i )
     = Σ_{i=1}^n ( Σ_{j=1}^m b_j a_ij ) w_i.
(3.7) Remark. From this proposition we can see that matrix multiplication
is associative. Let M, N, P, and Q be free R-modules with bases B, C, D,
and E respectively, and let f, g, and h be
R-module homomorphisms. Then, by Proposition 3.6,
C(End_R(M)) = R·1_M.
That is, a homomorphism f : M → M commutes with every other homo-
morphism g : M → M if and only if f = r·1_M for some r ∈ R.
For the remainder of this section we shall assume that the ring R is
commutative.
an equation of linear dependence with not all of {a, a_1, …, a_n} equal
to zero. Then
0 = af(v) = f(av) = Σ_{i=1}^n a_i w_i.
Since {w_1, …, w_n} is a basis of N and R is an integral domain, it follows
that a_i = 0 for all i. Hence av = 0, and thus v = 0 (similarly), and we
conclude that f is an injection.
Proof. The equivalence of (5), (6), and (7) follows from Remark 2.18, while
the equivalence of (1), (2), and (3) to (5), (6), and (7), respectively, is a
consequence of Corollary 3.8.
Now clearly (1) implies (4). On the other hand, assume that f is a
surjection. Then there is a short exact sequence
0 → Ker(f) → M → N → 0.
(3.15) Proposition. Let F be a field and let M and N be vector spaces over
F of dimension n. Then the following are equivalent.
(1) f is an isomorphism.
(2) f is injective.
(3) f is surjective.
Proof. (1) is immediate from Proposition 3.13 and Theorem 2.17 (2), while
part (2) follows from Corollary 2.27. □
(3.18) Definition.
(1) Let R be a commutative ring. Matrices A, B ∈ M_{n,m}(R) are said to
be equivalent if and only if there are invertible matrices P ∈ GL(n, R)
and Q ∈ GL(m, R) such that
B = PAQ.
Equivalence of matrices is an equivalence relation on M_{n,m}(R).
(2) If M and N are finite rank free R-modules, then we will say that R-
module homomorphisms f and g in Hom_R(M, N) are equivalent if
there are invertible endomorphisms h_1 ∈ End_R(M) and h_2 ∈ End_R(N)
such that h_2 f h_1⁻¹ = g. That is, f and g are equivalent if and only if
there is a commutative diagram
M --f--> N
 |h_1     |h_2
 v        v
M --g--> N
where the vertical maps are isomorphisms. Again, equivalence of ho-
momorphisms is an equivalence relation on Hom_R(M, N).
(3.19) Proposition.
(1) Two matrices A, B ∈ M_{n,m}(R) are equivalent if and only if there are
bases B, B′ of a free module M of rank m and bases C, C′ of a free
module N of rank n such that A = [f]^B_C and B = [f]^{B′}_{C′}. That is, two
matrices are equivalent if and only if they represent the same R-module
homomorphism with respect to different bases.
(2) If M and N are free R-modules of rank m and n respectively, then
homomorphisms f and g ∈ Hom_R(M, N) are equivalent if and only if
there are bases B, B′ of M and C, C′ of N such that
[f]^B_C = [g]^{B′}_{C′}.
Proof. (1) Since every invertible matrix is a change of basis matrix (Lemma
3.2), the result is immediate from Proposition 3.16.
(2) Suppose that [f]^B_C = [g]^{B′}_{C′}. Then Equation (3.3) gives
(3.5) [f]^B_C = [g]^{B′}_{C′} = P^C_{C′} [g]^B_C (P^B_{B′})⁻¹.
The matrices P^B_{B′} and P^C_{C′} are invertible so that we may write (by Corollary
3.9) P^B_{B′} = [h_1]^B_B and P^C_{C′} = [h_2]^C_C where h_1 ∈ End_R(M) and h_2 ∈ End_R(N)
are invertible. Thus Equation (3.5) gives
[f]^B_C = [h_2]^C_C [g]^B_C ([h_1]^B_B)⁻¹ = [h_2 g h_1⁻¹]^B_C.
[f]^B_C = [ D_r 0 ]
          [ 0   0 ]
0 = f( Σ_{i=1}^m a_i v_i ) = Σ_{i=1}^r a_i f(v_i) = Σ_{i=1}^r a_i (s_i w_i).
Since {s_1w_1, …, s_rw_r} is linearly independent, this implies that a_i = 0 for
1 ≤ i ≤ r. But then
Σ_{i=r+1}^m a_i v_i = 0,
and since {v_{r+1}, …, v_m} is a basis of Ker(f), we conclude that a_i = 0 for
all i, and the claim is verified.
It is clear from the construction that
[f]^{B′}_{C′} = [ D_r 0 ]
              [ 0   0 ]
(3.21) Remark. In the case where the ring R is a field, the invariant factors
are all 1. Therefore, if f : M → N is a linear transformation between finite-
dimensional vector spaces, then there is a basis B of M and a basis C of N
such that the matrix of f is
[f]^B_C = [ I_r 0 ]
          [ 0   0 ]
Proof. Exercise. □
If M = N, then Proposition 3.16 becomes the following result:
(3.23) Proposition. Let f ∈ End_R(M) and let B, B′ be two bases for the
free R-module M. Then
[f]_{B′} = P^B_{B′} [f]_B (P^B_{B′})⁻¹.
Proof. □
(3.25) Corollary.
(1) Two matrices A, B ∈ M_n(R) are similar if and only if there are bases
B and B′ of a free R-module M of rank n and f ∈ End_R(M) such
that A = [f]_B and B = [f]_{B′}. That is, two n × n matrices are similar
if and only if they represent the same R-module homomorphism with
respect to different bases.
(2) Let M be a free R-module of rank n and let f, g ∈ End_R(M) be endo-
morphisms. Then f and g are similar if and only if there are bases B
and B′ of M such that
[f]_B = [g]_{B′}.
That is, f is similar to g if and only if the two homomorphisms are
represented by the same matrix with respect to appropriate bases.
Proof. Exercise.
ν(f) = ν([f]_B),
where B is a basis of M. According to Corollary 3.25, the definition of ν(f)
is independent of the choice of basis of M because ν is a class function. The
most important class functions that we have met so far are the trace and
the determinant (Lemma 1.6 (2) and Corollary 2.12). Thus, the trace and
the determinant can be defined for any endomorphism of a free R-module
of finite rank. We formally record this observation.
(1) There is a function Tr : End_R(M) → R defined by
Tr(f) = Tr([f]_B),
where B is any basis of M. Tr(f) will be called the trace of the homo-
morphism f; it is independent of the choice of basis B.
(2) There is a multiplicative function det : End_R(M) → R defined by
det(f) = det([f]_B),
where B is any basis of M. det(f) will be called the determinant of the
homomorphism f; it is independent of the choice of basis B.
Proof.
(3.28) Proposition. Let R be a commutative ring, and let M_1, M_2, N_1, and
N_2 be finite rank free R-modules. If B_i is a basis of M_i and C_i is a basis of
N_i (i = 1, 2), then let B_1 ∪ B_2 and C_1 ∪ C_2 be the natural bases of M_1 ⊕ M_2
and N_1 ⊕ N_2 respectively (see Example 3.4.6 (7)). If f_i ∈ Hom_R(M_i, N_i)
for i = 1, 2, then f_1 ⊕ f_2 ∈ Hom_R(M_1 ⊕ M_2, N_1 ⊕ N_2) and
[f_1 ⊕ f_2]^{B_1∪B_2}_{C_1∪C_2} = [f_1]^{B_1}_{C_1} ⊕ [f_2]^{B_2}_{C_2}.
Proof. Exercise.
[f]_B = [ A B ]
        [ 0 D ]
where A ∈ M_r(R) if and only if the submodule N = (v_1, …, v_r) is an
invariant submodule of f.
Proof. If [f]_B = [t_ij], then the block form means that t_ij = 0 for r + 1 ≤ i ≤
n, 1 ≤ j ≤ r. Thus, if 1 ≤ j ≤ r it follows that
f(v_j) = Σ_{i=1}^r t_ij v_i ∈ N.
In Proposition 3.30, if the block B = 0, then not only is (v_1, …, v_k)
an invariant submodule, but the complementary submodule (v_{k+1}, …, v_n)
is also invariant under f. From this observation, extended to an arbitrary
number of blocks, we conclude:
Proof. □
T_A : M_{n,1}(R) → M_{n,1}(R)
defined by T_A(v) = Av. We shall usually identify M_{n,1}(R) with R^n via the
standard basis {E_{i1} : 1 ≤ i ≤ n} and speak of T_A as a map from R^n to R^n.
In practice, in studying endomorphisms of a free R-module, eigenvalues
and eigenvectors play a key role (for the matrix of f depends on a choice
of basis, while eigenvalues and eigenvectors are intrinsically defined). We
shall consider them further in Section 4.4.
(3.37) Remark. The only place in the above proof where we used that R is
a PID is to prove that the joint eigenspaces M_i ∩ N_j are free submodules of
M. If R is an arbitrary commutative ring and f and g are commuting diago-
nalizable endomorphisms of a free R-module M, then the proof of Theorem
3.36 shows that the joint eigenspaces M_i ∩ N_j are projective R-modules.
There is a basis of common eigenvectors if and only if these submodules are
in fact free. We will show by example that this need not be the case.
Thus, let R be a commutative ring for which there exists a finitely
generated projective R-module P which is not free (see Example 3.5.6 (3)
or Theorem 3.5.11). Let Q be an R-module such that P ⊕ Q = F is a free
R-module of finite rank n, and let
M = F ⊕ F = P_1 ⊕ Q_1 ⊕ P_2 ⊕ Q_2,
where P_1 = P_2 = P and Q_1 = Q_2 = Q. Then M is a free R-module of rank
2n. Let λ_1 ≠ λ_2 and µ_1 ≠ µ_2 be elements of R and define f, g ∈ End_R(M)
by letting f be multiplication by λ_1 on P_1 ⊕ Q_1 and by λ_2 on P_2 ⊕ Q_2,
and letting g be multiplication by µ_1 on P_1 ⊕ Q_2 and by µ_2 on P_2 ⊕ Q_1.
Then f is diagonalizable with eigenspaces
M_1 = Ker(f − λ_1 1_M) = P_1 ⊕ Q_1 ≅ F
M_2 = Ker(f − λ_2 1_M) = P_2 ⊕ Q_2 ≅ F,
and g is diagonalizable with eigenspaces
N_1 = Ker(g − µ_1 1_M) = P_1 ⊕ Q_2 ≅ F
N_2 = Ker(g − µ_2 1_M) = P_2 ⊕ Q_1 ≅ F.
Moreover, fg = gf. However, there is no basis of common eigenvectors of
f and g since the joint eigenspace M_1 ∩ N_1 = P_1 is not free.
(4.1) Proposition. Let V and W be vector spaces over the field F, and
suppose that T ∈ End_F(V), S ∈ End_F(W). Then
(4.2) Theorem. Let V be a vector space over the field F, and let T_1, T_2 ∈
End_F(V). Then the R-modules V_{T_1} and V_{T_2} are isomorphic if and only if
T_1 and T_2 are similar.
Proof. By Proposition 4.1, an R-module isomorphism (recall R = F[X])
P : V_{T_2} → V_{T_1}
consists of an invertible linear transformation P : V → V such that PT_2 =
T_1P, i.e., T_1 = PT_2P⁻¹. Thus V_{T_1} and V_{T_2} are isomorphic (as R-modules)
if and only if the linear transformations T_1 and T_2 are similar. Moreover,
we have seen that the similarity transformation P produces the R-module
isomorphism from V_{T_2} to V_{T_1}.
This theorem, together with Corollary 3.25, gives the theoretical un-
derpinning for our approach in this section. We will be studying linear
transformations T by studying the R-modules VT, so Theorem 4.2 says
that, on the one hand, similar transformations are indistinguishable from
this point of view, and on the other hand, any result, property, or invariant
we derive in this manner for a linear transformation T holds for any trans-
formation similar to T. Let us fix T. Then by Corollary 3.25, as we vary
the basis B of V, we obtain similar matrices [T]_B. Our objective will be to
find bases in which the matrix of T is particularly simple, and hence the
structure and properties of T are particularly easy to understand.
(4.3) Proposition. Let V be a vector space over the field F and suppose that
dim_F(V) = n < ∞. If T ∈ End_F(V), then the R-module (R = F[X]) V_T is
a finitely generated torsion R-module.
Proof. Since the action of constant polynomials on elements of V_T is just
the scalar multiplication on V determined by F, it follows that any F-
generating set of V is a priori an R-generating set for V_T. Thus µ(V_T) ≤
n = dim_F(V). (Recall that µ(M) (Definition 3.2.9) denotes the minimum
number of generators of the R-module M.)
Let v ∈ V. We need to show that Ann(v) ≠ (0). Consider the elements
v, T(v), …, T^n(v) ∈ V. These are n + 1 elements in an n-dimensional vector
space V, and hence they must be linearly dependent. Therefore, there are
scalars a_0, a_1, …, a_n ∈ F, not all zero, such that
a_0v + a_1T(v) + ⋯ + a_nT^n(v) = 0.
(4.5) Remark. It is worth pointing out that the proofs of Proposition 4.3
and Corollary 4.4 show that a polynomial g(X) is in Ann(VT) if and only
if g(T) = 0 E EndF(V).
(4.6) Definition.
The monic polynomials f_1(X), …, f_k(X) in Equation (4.8) are called
the invariant factors of the linear transformation T.
The invariant factor f_k(X) of T is called the minimal polynomial
m_T(X) of T.
The characteristic polynomial c_T(X) of T is the product of all the
invariant factors of T, i.e., c_T(X) = f_1(X) ⋯ f_k(X).
(1) m_T(X) | q(X).
(2) m_T(X) divides c_T(X).
(3) If p(X) is any irreducible polynomial dividing c_T(X), then p(X) divides
m_T(X).
dim_F(V) = deg(f(X)).
(2) If V_T ≅ Rv_1 ⊕ ⋯ ⊕ Rv_k where Ann(v_i) = (f_i(X)) as in Equation (4.7),
then
(4.12) Σ_{i=1}^k deg(f_i(X)) = dim(V) = deg(c_T(X)).
Proof. (1) The map η : R → V_T defined by η(q(X)) = q(T)(v)
is surjective, and V_T ≅ R/Ker(η) as R-modules. But Ker(η) = (f(X)),
and as F-modules, R/(f(X)) has a basis {1, X, …, X^{n−1}} where n =
deg(f(X)). Thus, dim_F(V) = n = deg(f(X)).
(2) and (3) are immediate from (1) and the definitions. □
(4.13) C(f(X)) = [ 0 0 ⋯ 0 0 −a_0     ]
                 [ 1 0 ⋯ 0 0 −a_1     ]
                 [ 0 1 ⋯ 0 0 −a_2     ]
                 [ ⋮             ⋮    ]
                 [ 0 0 ⋯ 1 0 −a_{n−2} ]
                 [ 0 0 ⋯ 0 1 −a_{n−1} ]
(4.13) Examples.
(1) For each λ ∈ F, C(X − λ) = [λ] ∈ M_1(F).
(2) diag(a_1, …, a_n) = ⊕_{i=1}^n C(X − a_i).
(3) C(X² + 1) = [ 0 −1 ]
               [ 1  0 ]
(4) If A = C(X − a) ⊕ C(X² − 1), then
A = [ a 0 0 ]
    [ 0 0 1 ]
    [ 0 1 0 ]
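Equation (4.13) translates directly into code. A numpy sketch (the function name is ours) building C(f(X)) from the coefficient list of a monic polynomial:

    import numpy as np

    def companion(coeffs):
        """C(f) for monic f(X) = X^n + a_{n-1}X^{n-1} + ... + a_0,
        given coeffs = [a_0, ..., a_{n-1}]: 1's on the subdiagonal
        and -a_0, ..., -a_{n-1} down the last column."""
        n = len(coeffs)
        C = np.zeros((n, n))
        C[1:, :-1] = np.eye(n - 1)
        C[:, -1] = -np.array(coeffs)
        return C

    print(companion([1, 0]))   # Example (3): C(X^2 + 1) = [[0, -1], [1, 0]]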
But
T^n(e_1) = −a_0e_1 − a_1T(e_1) − ⋯ − a_{n−1}T^{n−1}(e_1),
i.e.,
T^n(e_1) + a_{n−1}T^{n−1}(e_1) + ⋯ + a_1T(e_1) + a_0e_1 = 0.
(4.16) Corollary. Let V be any finite-dimensional vector space over the field
F, let T ∈ End_F(V), and suppose that the R-module V_T is cyclic with
Ann(V_T) = (f(X)). Then there is a basis B of V such that [T]_B = C(f(X)).
Proof. Let V_T ≅ Rv_1 ⊕ ⋯ ⊕ Rv_k where R = F[X] and Ann(v_i) = (f_i(X)),
and where f_i(X) | f_{i+1}(X) for 1 ≤ i < k. Let deg(f_i) = n_i. Then B_i =
{v_i, T(v_i), …, T^{n_i−1}(v_i)} is a basis of the cyclic submodule Rv_i. Since
submodules of V_T are precisely the T-invariant subspaces of V, it follows
that T|_{Rv_i} ∈ End_F(Rv_i) and Corollary 4.16 applies to give
(4.15) [T|_{Rv_i}]_{B_i} = C(f_i(X)).
By Equation (4.12), n_1 + ⋯ + n_k = n, and hence, B = ∪_{i=1}^k B_i is a basis of
V, and Proposition 3.32 and Equation (4.15) apply to give Equation (4.14).
□
(4.18) Corollary. Two linear transformations T_1 and T_2 on V have the same
rational canonical form if and only if they are similar.
Proof. Two linear transformations T_1 and T_2 have the same rational canon-
ical form if and only if they have the same invariant factors, which occurs if
and only if the R-modules V_{T_1} and V_{T_2} are isomorphic. Now apply Theorem
4.2. □
(4.20) Lemma. Let F be a field and let f(X) ∈ F[X] be a monic polynomial
of degree n. Then the matrix XI_n − C(f(X)) ∈ M_n(F[X]) and
(4.16) det(XI_n − C(f(X))) = f(X).
det(XI_n − C(f(X))) = det [ X  0 ⋯ 0  0  a_0         ]
                           [ −1 X ⋯ 0  0  a_1         ]
                           [ 0 −1 ⋯ 0  0  a_2         ]
                           [ ⋮                 ⋮      ]
                           [ 0  0 ⋯ −1 X  a_{n−2}     ]
                           [ 0  0 ⋯ 0 −1  X + a_{n−1} ]
= X det [ X ⋯ 0  0  a_1         ]
        [ −1 ⋯ 0 0  a_2         ]
        [ ⋮             ⋮       ]
        [ 0 ⋯ −1 X  a_{n−2}     ]
        [ 0 ⋯ 0 −1  X + a_{n−1} ]
  + a_0(−1)^{n+1} det [ −1 X ⋯ 0  0 ]
                      [ 0 −1 ⋯ 0  0 ]
                      [ ⋮          ⋮ ]
                      [ 0  0 ⋯ −1 X ]
                      [ 0  0 ⋯ 0 −1 ]
= X(X^{n−1} + a_{n−1}X^{n−2} + ⋯ + a_1) + a_0(−1)^{n+1}(−1)^{n−1}
= f(X),
and the lemma is proved.
Proof. By Lemma 4.22, if Equation (4.17) is true for one basis, it is true
for any basis. Thus, we may choose the basis B so that [T]_B is in rational
canonical form, i.e.,
c_A(T) = 0.
V_T = Rv_1 ⊕ ⋯ ⊕ Rv_n,
where Ann(v_i) = (X − λ_i). Note that
me(Rv_i) = Ann(v_i) = (X − λ_i),
so Proposition 3.7.21 implies that
m_T(X) = me(V_T)
       = lcm{me(Rv_1), …, me(Rv_n)}
       = ∏_{i=1}^n (X − λ_i)
       = f(X).
Also by Proposition 3.7.21, we see that c_T(X) = f(X). Therefore m_T(X) =
c_T(X), and Lemma 4.11 (3) shows that the R-module V_T is cyclic with
annihilator (f(X)). Thus the rational canonical form of T is
[T]_{B_0} = C(f(X)) = [ 0 0 ⋯ 0 0 −a_0     ]
                      [ 1 0 ⋯ 0 0 −a_1     ]
                      [ 0 1 ⋯ 0 0 −a_2     ]
                      [ ⋮             ⋮    ]
                      [ 0 0 ⋯ 1 0 −a_{n−2} ]
                      [ 0 0 ⋯ 0 1 −a_{n−1} ]
where f(X) = X^n + a_{n−1}X^{n−1} + ⋯ + a_1X + a_0 and the basis B_0 is chosen
appropriately.
(4.28) Definition.
(1) A linear transformation T : V → V is diagonalizable if V has a basis
B such that [T]_B is a diagonal matrix.
(2) A matrix A ∈ M_n(F) is diagonalizable if it is similar to a diagonal
matrix.
(4.29) Remark. Recall that we have already introduced the concept of diago-
nalizability in Definition 3.35. Corollary 3.34 states that T is diagonalizable
if and only if V possesses a basis of eigenvectors of T. Recall that v ≠ 0 ∈ V
is an eigenvector of T if the subspace (v) is T-invariant, i.e., T(v) = λv for
some λ ∈ F. The element λ ∈ F is an eigenvalue of T. We will consider
criteria for diagonalizability of a linear transformation based on properties
of the invariant factors.
[T]_B = ⊕_{i=1}^t λ_i I_{n_i},
then let B_i = {v_{i1}, …, v_{in_i}} and let V_i = (B_i). Then T(v) = λ_i v for all
v ∈ V_i, so V_i is a T-invariant subspace of V and hence an F[X]-submodule
of V_T. From Example 4.26, we see that me(V_i) = X − λ_i, and, as in Example
4.27,
(4.20) V_T = V_1 ⊕ ⋯ ⊕ V_t.
c_T(X) = co(V_T)
       = co(T_1) ⋯ co(T_t)
       = c_{T_1}(X) ⋯ c_{T_t}(X)
       = (X − λ_1)^{n_1} ⋯ (X − λ_t)^{n_t},
as claimed.
where
ε(i, j) = 1 if i < j,
ε(i, j) = 0 if i > j.
(4.35) Remark. Note that the hypothesis on the field F is certainly satisfied
for any k if the field F is the field C of complex numbers.
From Theorem 4.30, we see that there are essentially two reasons why a
linear transformation may fail to be diagonalizable. The first is that mT(X)
may factor into linear factors, but the factors may fail to be distinct; the
second is that mT(X) may have an irreducible factor that is not linear. For
example, consider the linear transformations T_i : F² → F², which are given
by multiplication by the matrices
[ 1 1 ]        [ 0 −1 ]
[ 0 1 ]  and   [ 1  0 ].
We shall concentrate our attention on the first problem and deal with the
second one later. The approach will be via the primary decomposition theo-
rem for finitely generated torsion modules over a PID (Theorems 3.7.12 and
3.7.13). We will begin by concentrating our attention on a single primary
cyclic R-module.
(4.21) J_{λ,n} = [ λ 1 0 ⋯ 0 ]
                 [ 0 λ 1 ⋯ 0 ]
                 [ ⋮         ⋮ ]
                 [ 0 0 ⋯ λ 1 ]
                 [ 0 0 ⋯ 0 λ ]
The matrix H = J_{λ,n} − λI_n satisfies H^{n−1} ≠ 0, but
(J_{λ,n} − λI_n)^n = H^n = 0.
Therefore, if we let T_{λ,n} : F^n → F^n be the linear transformation obtained
by multiplying by J_{λ,n}, we conclude that
m_{T_{λ,n}}(X) = (X − λ)^n = c_{T_{λ,n}}(X)
(since deg(c_{T_{λ,n}}(X)) = n), so that Lemma 4.11 (3) shows that the F[X]-
module (F^n)_{T_{λ,n}} is cyclic.
[T]_B = J_{λ,n}.
Suppose
g(X) = Σ_{k=1}^n a_k(X − λ)^{n−k} ∈ Ann(v) = Ann(V).
But deg(g(X)) < n, so this can only occur if g(X) = 0, in which case
a_1 = ⋯ = a_n = 0. Thus B is linearly independent and hence a basis of V.
Now we compute the matrix [T]_B. To do this note that
T(v_k) = T((T − λ)^{n−k}(v))
       = (T − λ)(T − λ)^{n−k}(v) + λ(T − λ)^{n−k}(v)
       = (T − λ)^{n−(k−1)}(v) + λ(T − λ)^{n−k}(v)
       = v_{k−1} + λv_k if k ≥ 2, and T(v_1) = λv_1.
Therefore, [T]_B = J_{λ,n}, as required. □
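The two properties of J_{λ,n} used above, that H = J_{λ,n} − λI_n satisfies H^{n−1} ≠ 0 while H^n = 0, are easy to verify numerically. A short numpy sketch (the function name is ours):

    import numpy as np

    def jordan_block(lam, n):
        """J_{lam,n}: lam on the diagonal, 1's just above it."""
        return lam * np.eye(n) + np.diag(np.ones(n - 1), k=1)

    J = jordan_block(5.0, 4)
    H = J - 5.0 * np.eye(4)                        # the nilpotent part
    assert np.linalg.matrix_power(H, 3).any()      # H^{n-1} != 0
    assert not np.linalg.matrix_power(H, 4).any()  # H^n == 0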
[T]_B = J = ⊕_{i=1}^t J_i.
Proof. (1) ⇔ (2) is Theorem 4.30, while (1) ⇔ (3) is Corollary 3.34. □
Proof. (1) ⇒ (2). If [T]_B is in Jordan canonical form, then the basis B
consists of generalized eigenvectors.
(2) ⇒ (3). Let B = {v_1, …, v_n} be a basis of V, and assume
(T − λ_i)^{k_i}(v_i) = 0,
i.e., each v_i is assumed to be a generalized eigenvector of T. Then
m_T(X) = me(V_T) = lcm{(X − λ_1)^{k_1}, …, (X − λ_n)^{k_n}}
is a product of linear factors.
(3) ⇒ (1). This is Theorem 4.38. □
(4.46) Remarks.
(1) Ker(T − λ1_V) = {v ∈ V : T(v) = λv} is called the eigenspace of λ.
(2) {v ∈ V : (T − λ)^k(v) = 0 for some k ∈ N} is called the generalized
eigenspace of the eigenvalue λ. Note that the generalized eigenspace
corresponding to the eigenvalue λ is nothing more than the (X − λ)-
primary component of the torsion module V_T. Moreover, it is clear
from the definition of primary component of V_T that the generalized
eigenspace of T corresponding to λ is Ker(T − λ1_V)^r where r is the
exponent of X − λ in m_T(X) = me(V_T).
(3) Note that Lemma 4.43 implies that distinct (generalized) eigenspaces
are linearly independent.
Proof. First note that 1 ≤ ν_geom(λ) since λ is an eigenvalue. Now let V_T =
⊕_{i=1}^s V_i be the primary decomposition of the F[X]-module V_T, and assume
(by reordering, if necessary) that V_1 is (X − λ)-primary. Then T − λ : V_i → V_i
is an isomorphism for i > 1 because Ann(V_i) is relatively prime to X − λ
for i > 1. Now write
V_1 ≅ W_1 ⊕ ⋯ ⊕ W_s
as a sum of cyclic submodules. Since V_1 is (X − λ)-primary, it follows that
Ann(W_k) = ((X − λ)^{q_k}) for some q_k ≥ 1. The Jordan canonical form of
T|_{W_k} is J_{λ,q_k} (by Proposition 4.37). Thus we see that W_k contributes 1 to
the geometric multiplicity of λ and q_k to the algebraic multiplicity of λ. □
(4.48) Corollary.
The geometric multiplicity of λ is the number of Jordan blocks J_{λ,q} in
the Jordan canonical form of T with value λ.
The algebraic multiplicity of λ is the sum of the sizes of the Jordan
blocks of T with value λ.
If m_T(X) = (X − λ)^q p(X) where (X − λ) does not divide p(X), then
q is the size of the largest Jordan block of T with value λ.
If λ_1, …, λ_k are the distinct eigenvalues of T, then
µ(V_T) = max{ν_geom(λ_i) : 1 ≤ i ≤ k}.
Proof. (1), (2), and (3) are immediate from the above proposition, while (4)
follows from the algorithm of Theorem 3.7.15 for recovering the invariant
factors from the elementary divisors of a torsion R-module. □
4.4 Canonical Form Theory 251
Proof.
We have seen that in order for T to have a Jordan canonical form, the
minimal polynomial mT(X) must be a product of linear factors. We will
conclude this section by developing a generalized Jordan canonical form to
handle the situation when this is not the case. In addition, the important
case F = R, the real numbers, will be developed in more detail, taking into
account the special relationship between the real numbers and the complex
numbers. First we will do the case of a general field F. Paralleling our
previous development, we begin by considering the case of a primary cyclic
R-module (compare Definition 4.36 and Proposition 4.37).
particular, that the Jordan matrix J),,, is the some as the irreducible Jordan
block J(X_A),n.
[T]a=J'=®J;
i=1
A 12 0 0 0
0 A 12 0 0
(4.25) JR = . . .
0 0 0 ... A 12
0 0 0 . 0 A
[T)8 = ® J,
i=1
In case (4.26), T has a Jordan canonical form J,,,.; in case (4.27) we will
show that T has a real Jordan canonical form JR where z = a + bi.
Thus, suppose that p(X) = (X -a)2+b2 and let z = a+bi. By hypothe-
sis VT is isomorphic as an R[XI-module to the R[XJ-module R[X]/(p(X )'').
Recall that the module structure on V7 is given by Xu = T(u), while the
module structure on R[X]/(p(X)') is given by polynomial multiplication.
Thus we can analyze how T acts on the vector space V by studying how X
acts on R[XJ/(p(X)') by multiplication. This will be the approach we will
follow, without explicitly carrying over the basis to V.
Consider the C[X]-module IV = C[XJ/(p(X)'). The annihilator of W
is p(X)', which factors in C[XJ as p(X)' = q(X)'q(X)' where q(X)
((X - a) + bi). Since q(X)' and q(X )' are relatively prime in CJX],
(4.28) 4(X)r92(X
1 = q(X)r91(X) + )
for g, (X) and 92(X) in C[XJ. Averaging Equation (4.28) with it's complex
conjugate gives
4.4 Canonical Form Theory 255
= f(X) +7(X).
Thus, by Theorem 3.7.12, we may write
(4.29) W = C[X J/(p(X )r) = (v) ® (w),
where v = f (X) + (p(X )r), w = 7(X) + (p(X )r), Ann(v) = (4(X)') and
Ann(w) = (q(X)'). Equation (4.29) provides a primary cyclic decomposi-
tion of C[XJ/(p(X)r) and, as in the proof of Theorem 4.37,
C = {q(X )r-kw : 1 < k < r} U {q(X )r-kv : 1 < k < r}
is a basis of W = C(XJ/(p(X)r) over C. For 1 < k < r, let
1
Vk = (q(X)r-kw + q(X )r-kV) E W
2
and
= 2 (q(X)r-1k-1),w .+ q(X)r-(k-1).vl
Similarly,
0 0 a -b
L0 0 b a
(4.61) Remark. Our entire approach in this section has been to analyze a
linear transformation T by analyzing the structure of the F[X]-module VT.
Since Tl and T2 are similar precisely when VT, and VT, are isomorphic, each
and every invariant we have derived in this section-characteristic and min-
imal polynomials, rational, Jordan, and generalized Jordan canonical forms,
(generalized) eigenvalues and their algebraic and geometric multiplicities,
etc.-is the same for similar linear transformations Ti and T2.
J5,1
®
must 0 0 transformations
transformation to Since
multiplicity
(1)),
(2)).
there
0 0 6 0 0 0 0 0 6 0 0 0 analysis, we 0 05 0 0 0 sizes
3+2+1+1
4.48
4.48
1 5 000 0 1 50 0 0 0
J5,2 possibility
1 50 0 00 linear
minimal
take
Js,,..
= thus
(D multiplicity
linear
block
all
5 00 0
Jordan
5 0 000 0 5 0 0 0 0 0 similar
0 0 a eigenvalue
are
geometric
may
the
a the J6.2 single is T
= = is eigenvalue = -8)7,
T we only
of so the
2, I,
8 1 0 0 0 0 0
0 8 1 0 0 0 0
0 0 8 0 0 0 0
J8.3®J8,2®41 ®J8,1 = 0 0 0 8 1 0 0
0 0 0 0 8 0 0
0 0 0 0 0 8 0
0 0 0 0 0 0 8
[T]c =A= -2 1 2
-2 -1 4
/ ` 1 0
v,=( }, v2=( o >, V3= ( 1
\ \ 1
Thus, if
1 1 0
C3 1 , 0 1 I I,
1 1 1
1 1 1
then
0 0 2
[21
Ker(T - 1v) _ ( 3 = (vl),
2 1
JJ
so in this example we are in case (b). To find the basis for the Jordan
canonical form, we find a vector v2 E F3 with (T - 2)(V3) = v2. By solving
a system of linear equations we find that we may take
-3
v3= -2
0
Also,
J -3
Ker(T - 1v) = ( -1 ) _ (vi).
1 /
Thus, if B = {v1, v2, v3} (note the order), then B is a basis of F3 and
[T]!3 = J.
As a practical matter, there is a second method for finding a suitable
basis of F3. Note that dim(Ker(T - 2.1v )2) = 2. Pick any vector v3, which
is in Ker(T - 2 lv)2 but not in Ker(T - 2 1v), say v3 = [2 0 -11'.
Then let v2 = (T - 2)(v3) _ [-1 -2 -11'. If B' _ {v1, v3, v3}, then we
also have [T] g, = J. O
That is, [TJg is already in Jordan canonical form. Compute p(VT), the
cyclic decomposition of VT, and the rational canonical form of T.
0 1 1 1 0 0 0 0
where
C = {w1, w21 Tw2, T2w2, W3, Tw3, T2w3, T3w3}.
11
264 Chapter 4. Linear Algebra
(5.7) Remark. We have given information about the rational canonical form
in the above examples in order to fully illustrate the situation. However, as
we have remarked, it is the Jordan canonical form that is really of interest.
In particular, we note that while the methods of the current section produce
the Jordan canonical form via computation of generalized eigenvectors, and
then the rational canonical form is computed (via Lemma 3.7.18) from the
Jordan canonical form (e.g., Examples 5.3 and 5.6), it is the other direction
that is of more interest. This is because the rational canonical form of T can
be computed via elementary row and column operations from the matrix
XI,, - [71a. The Jordan canonical form can then be computed from this
information. This approach to computations will be considered in Chapter
5.
wl ... Wk,
Swl ... Swk,
Wkl+l Wk3
Swk, +1 Swk,
Wt
e4 e8
e3 e7
e2 e6 elo e12
el e5 e9 ell e,3
Thus, the bottom row is a basis of Ker(T - 2.1 v), the bottom two rows are
a basis of Ker(T - 2 lv)2, etc. These tables suggest that the top vectors
of each column are vectors that are in Ker(T - \1v)k for some k, but
they are not also (T - Alv)w for some other generalized eigenvector w.
This observation can then be formalized into a computational scheme to
produce the generators w; of the primary cyclic submodules W;.
We first introduce some language to describe entries in Table 5.1. It
is easiest to do this specifically in the example considered in Table 5.2, as
otherwise the notation is quite cumbersome, but the generalization is clear.
We will number the rows of Table 5.1 from the bottom up. We then say (in
the specific case of Table 5.2) that {e13} is the tail of row 1, {elo, e,2} is
266 Chapter 4. Linear Algebra
the tail of row 2, the tail of row 3 is empty, and {e4, es} is the tail of row
4. Thus, the tail of any row consists of those vectors in the row that are at
the top of some column.
Clearly, in order to determine the basis B of W it suffices to find the
tails of each row. We do this as follows:
For i = 0, 1, 2,... , let Va') = Ker(S') (where, as above, we let S =
T - Alv). Then
Now for our algorithm. We begin at the top, with i = r, and work
down. By Claim 5.8, we may choose a complement Va') of Vai-1) in V')
which contains the subspace S(VAi+1)). Let Va,) be any complement of
4.5 Computational Examples 267
S(Vr') in V. Then a basis for Va gives the tail of row i of Table 5.1.
W
(Note that Var+l) = {O}, so at the first stage of the process Va = Vary
is any complement of V(' ' in V(r) = W. Also, at the last stage of the
process 0a°) = {0}, so Vale = Vain and Vale is any complement of S(Va2))
in Vlll, the eigenspace of T corresponding to A.)
(5.9) Example. We will illustrate this algorithm with the linear transforma-
tion T : F13 - F13 whose Jordan basis is presented in Table 5.2. Of course,
this transformation is already in Jordan canonical form, so the purpose is
just to illustrate how the various subspace V\ , and V.\ relate to the
basis B = {el, ... , e13}. Since there is only one eigenvalue, for simplicity,
let V(') = Vt, with a similar convention for the other spaces. Then
V5=V4=F13
V4 = (el, ... , el3)
V3 = (el, e2, e3, e5, e6, e7, e9, elo, ell, e,2, e,3)
V2 = (el, e2, es, e6, e9, elo, ell, e12, e13)
V, = (el, e5, e9, ell, e,3)
Vo = {0},
while, for the complementary spaces we may take
V5 = {0}
V4 = (e4, es)
V3 = (e3, e7)
V2 = (e2, e6, elo, e,2)
V, = (el, es, e9, ell, e13).
Since V4 = F13, we conclude that there are 4 rows in the Jordan table of
T, and since V4 = V4, we conclude that the tail of row 4 is {e4, e8}. Since
S(e4) = e3 and S(es) = e7, we see that S(V4) = V3 and hence the tail
of row 3 is empty. Now S(e3) = e2 and S(e7) = e6 so that we may take
V2 = (elo, e,2), giving {elo, e12} as the tail of row 2. Also, S(e2) = el,
S(e6) = e5, S(elo) = e9, and S(e,2) = ell, so we conclude that V, = (e,3),
i.e., the tail of row 1 is {e13}.
Examples 5.3 and 5.4 illustrate simple cases of the algorithm described
above for producing the Jordan basis. We will present one more example,
which is (slightly) more complicated.
-1 1 1
-3
[T]c - A - -8
3 1
2 5 3
2 0 -1
in the standard basis C. Find the Jordan canonical form of T and a basis
B of V such that [TJB is in Jordan canonical form.
Solution. We compute that
CT(X) = CA(X) _ (X - 2)4.
Thus, there is only one eigenvalue, namely, 2, and v,lg(2) = 4. Now find the
eigenspace of 2, i.e.,
1
Vl')=Ker(T-2.1v)=`/
1
2 ,
1
0
I \,
0 2
0
Then compute
-3
-3
V1 = (T - 2 lv)(w1) = -8
2
and
1
vJ=(T-2.1v )(w2)= 1
2
0
Setting
v1 = (T - 2 ly)(w1), v2 = wl,
v4=W2,
4.6 Inner Product Spaces and Normal Linear Transformations 269
we obtain a basis B = {v1, v2, v3, v4} such that [T]g = J. The table
corresponding to Table 5.1 is
V2 V4
V1 V3
(6.2) Examples.
(1) The standard inner product on Fn is defined by
(U: = n
v) F, ujvj
j=1
(2) If A E M,n,n(F), then let A' = A` where q denotes the matrix ob-
tained from A by conjugating all the entries. A' is called the Hermitian
transpose of A. If we define
(A : B) = Tr(AB')
270 Chapter 4. Linear Algebra
The norm of a vector v is well defined by Definition 6.2 (4). There are
a number of standard inequalities related to the norm.
llu+v112=(u+v:u+v)
= IIuI12+(u:v)+(v:u)+IIv112
= IIuI12 + 2Re(u: v) + IIv112
IIui12 + 211ullIIvIl + IIv112
_ (IIuII + IIvII)2
Taking square roots gives the triangle inequality.
Thus, we can define the angle between u and v by means of the equation
Proof. Exercise. 11
(W1)1 = W,
(2)
(3) dim W + dim Wl = dim V, and
(4)
(6.12) Remark. If V is a vector space over F, its dual space V' is defined
to be V* = HomF(V, F). If V is finite-dimensional, then, by Corollary
3.4.10, V and V' have the same dimension and so are isomorphic, but in
general there is no natural isomorphism between the two. However, an inner
product on V gives a canonical isomorphism 0: V -' V' defined as follows:
For y = V, 0(y) E V' is the homomorphism O(y)(x) = (x : y). To see that
0 is an isomorphism, one only needs to observe that 0 is injective since
dim V = dim V. But if y 0 0 then 0(y)(y) = (V: y) > 0, so 0(y) 0 and
Ker(4) _ (0).
(Tv:w)=(v:T'w)
for all v, w E V. T' is called the adjoint of T.
Proof. Let w E V. Then h,,, : V -+ F defined by h,,,(v) = (Tv : w) is an
element of the dual space V. Thus (by Remark 6.12), there exists a unique
w E V such that
(Tv:w)=(v:w)
for all v E V. Let T'(w) = w. We leave it as an exercise to verify that
T' E EndF(V). 0
= Qij.
(6.17) Remarks.
(1) If T is self-adjoint or unitary, then it is normal.
(2) If F = C, then a self-adjoint linear transformation (or matrix) is called
Hermitian, while if F = R, then a self-adjoint transformation (or ma-
trix) is called symmetric. If F = R, then a unitary transformation (or
matrix) is called orthogonal.
(3) Lemma 6.15 shows that the concept of normal is essentially the same
for transformations on finite-dimensional vector spaces and for matri-
ces. A similar comment applies for self-adjoint and unitary.
Proof. (1) This follows immediately from the fact that (aT")' = a(T')"
and the definition of normality.
(2)
VC, and in fact, Tc is self-adjoint. By part (1) applied to Tc we see that all
the eigenvalues of Tc are real and mTQ(X) is a product of distinct (real)
linear factors. Thus, mTa (X) E R(X ). If f (X) E R[X], then
(6.3) f(Tc)(u, v) = (f(T)(u), f(T)(v))
Equation (6.3) shows that MT(X) = mTo (X) and we conclude that T is
diagonalizable. Part (2) is completed exactly as in Corollary 6.21. 0
4.7 Exercises
1. Suppose R is a finite ring with JR[ = s. Then show that Mm,n(R) is finite
with JM,","(R)l = s". In particular, [M"(R)I = s"'.
2. Prove Lemma I.I.
3. Prove Lemma 1.2.
4. (a) Suppose that A = [al a", ] and B E M,..,, (R). Then show that
AB=j:"1a;row{(B).
(b) Suppose
S={[a aa,bER}.
Verify that S is a subring of M2(R) and show that S is isomorphic to the
field of complex numbers C.
6. Let R be a commutative ring.
(a) If 1 < j < n prove that Ejj = P, 1 Ell P1t. Thus, the matrices E;; are
all similar.
(b) If A, B E M"(R) define [A, B] = AB - BA. The matrix [A, B] is called
the commutator of A and B and we will say that a matrix C E Mn(R)
is a commutator if C = [A, B] for some A, B E M,,(R). If i 54 j show
that E;, and E11 - E are commutators.
(c) If C is a commutator, show that 7r(C) = 0. Conclude that In is not a
commutator in any Mn(R) for which n is not a multiple of the charac-
teristic of R. What about 12 E M2(Z2)?
7. If S is a ring and a E S then the centralizer of a, denoted C(a), is the set
C(a) = {b E S : ab = ba}. That is, it is the subset of S consisting of elements
which commute with a.
a Verify that C(a) is a subring of S.
b What is C(1).
4.7 Exercises 279
21. (a) Suppose that A has the block decomposition A = [A. A, . Prove that
det A = (det Al) (det A2 ).
(b) More generally, suppose that A = [Aij] is a block upper triangular (re-
spectively, lower block triangular) matrix, i.e., Aii is square and Aij = 0
if i > j (respectively, i < j). Show that detA = fli(detAii).
22. If R is a commutative ring, then a derivation on R is a function b : R -> R
such that 5(a + b) = b(a) + b(b) and b(ab) = ab(b) + b(a)b.
(a) Prove that b(al ... an) = >2 1(a, ... ai-ib(ai)ai+1 ... an).
(b) If b is a derivation on R and A E Mn(R) let Ai be the n x n matrix
obtained from A by applying b to the elements of the ith row. Show that
b(det A) = I:n 1 det Ai.
23. If A E Mn(Q[X]) then detA E Q[X]. Use this observation and your knowl-
edge of polynomials to calculate det A for each of the following matrices,
without doing any calculation.
1 1 2 3
(a) A= 1 2-X2 2 3
2 3 1 5
2 3 1 9-X2
1 1 1 .. 1
1 1-X 1 .. 1
(b) A= 1 1 2-X 1
1 1 1 ... m-X
24. Let F be a field and consider the "generic" matrix [Xij] with entries in the
polynomial ring F[Xij] in the n2 indeterminants Xs- (1 < i, i < n). Show
that det[Xij] is an irreducible polynomial in F[Xijj. (Hint: Use Laplace's
expansion to argue by induction on n.)
25. Let A E Mn(Z) be the matrix with entii(A) = 2 (1 < i < n), entij(A) = 1
if Ii - jI = 1, and entij(A) = 0 if ji - jI > 1. Compute det(A).
26. Let An E Mn(Z) be the matrix with entii(An) = i for 1 < i < n and aij = 1
if i # j. Show that det(An) = (n - 1)!.
27. Let A E Mn(Z) be a matrix such that entij(A) = ±1 for all i and j. Show
that 2n-1 divides det(A).
28. Let R be a commutative ring and let a, b E R. Define a matrix A(a, b) E
Mn(R by entii(A(a, b)) = a for all i and entij(A(a, b)) = b if i j. Compute
det(A a, b)). (Hint: First find the Jordan canonical form of A(a, b).)
29. Let V(xi, ...,xn) be the Vandermonde determinant:
1 x1 X1 x1
. . . xn-1
1 x
X2 X2z 2
V(x1, ... ,xn) = .
1 xn x2n xri-1
n J
(a) Prove that
detV(x1, ... xn) = 11 (xi -xj).
1<i<j<n
(b) Suppose that ti, ... , to+1 are n + 1 distinct elements of a field F. Let
Pi (X) (1 < i < n + 1) be the Lagrange interpolation polynomials deter-
mined by t1, ... , to+1. Thus,
B = {Pi(X), ... ,Pn+1(X)}
4.7 Exercises 281
op(A-1)` = (op(A`))-'
46. Let P3=If (X)EZ[X]:degf(X)<3}. LetA={1, X, X2, X3} and
B = {l, X, X(2). X(3)}
49. Let R be a commutative ring. We will say that A and B E M8(R) are
permutation similar if there is a permutation matrix P such tl,,.t P-1 AP =
B. Show that A and B are permutation similar if and only if there is a free
R-module M of rank n, a basis B = {v1, ... , vn} of M, and a permutation
v E Sn such that A = [f]2i and B = [f)c where f E EndR(M) and C =
{v,(1), ... ,vv(n)}.
50. Let R be a commutative ring, and let
B= {Eli, ... ,E1n, E21, ... ,E2n, ... ,Eml, ... ,E,nn}
be the basis of M,n,n(R) consisting of the matrix units in the given order.
Another basis of M,n,n(R) is given by the matrix units in the following order:
C={E11,...,Em1,E12,...,Em2,...,E1n...,Emn}.
If A E Mm(R) then GA E EndR(Mm,n(R)) will denote left multiplication by
A, while 1ZB will denote right multiplication by B, where B E Mn(R).
(a) Show that AE,, = Ek akiEki and E,,B = E 1 bleE{r.
1
58. Let F be a field and let A E Mn(F). Show that A and At have the same
minimal polynomial.
59. Let K be a field and let F be a subfield. Let A E Mn(F).
Show that the minimal polynomial of A is the same whether A is considered
in M. (F) or Mn(K).
60. An algebraic integer is a complex number which is a root of a monic polyno-
mial with integer coefficients. Show that every algebraic integer is an eigen-
value of a matrix A E Mn(Z) for some n.
61. Let A E Mn(R) be an invertible matrix.
(a) Show that det(X-'In - A-') = (-X)-ndet(A-')cA(X).
b If
CA(X) = Xn +a, Xn-1 + ... + an-IX + an
and
CA-1(X) =Xn+b,X'-'+ +bn_,X+bn,
then show that bi = (-1)' for 1 < i < n where we set
ao=1.
62. If A E MM(F) (where F is a field) is nilpotent, i.e., Ak = 0 for some k,
prove that A' = 0. Is the same result true if F is a general commutative
ring rather than a field?
63. Let F be an infinite field and let Y = {Aj}jEJ C Mn(F) be a commuting
family of diagonalizable matrices. Show that there is a matrix B E Mn(F)
and a family fj(X)}jEJ C F[X] of polynomials of degree < n-1 such that
Aj = fj(B). (Hint: By Theorem 3.36 there is a matrix P E GL(n, F) such
that
P-'AjP = diag(Aij, ... , Ani).
Let t,, ... , to be n distinct points in F, and let
B = Pdiag(tl, ... ,tn)P-'.
Use Lagrange interpolation to get a polynomial fj (X) of degree _< n -1 such
that fj(ti) = Aij for all i, j. Show that { fj(X)}jEJ works.)
64. Let F be a field and V a finite-dimensional vector space over F. Let S E
EndF(V) and define Ads : EndF(V) EndF(V) by
Ads(T) = [S, T] = ST - TS.
(a) If S is nilpotent, show that Ads is nilpotent.
b If S is diagonalizable, show that Ads is diagonalizable.
65. Let Ni, N2 E M,(F) be nilpotent matrices. Show that N, and N2 are similar
if and only if
rank(Ni) = rank(N2) for all k > 1.
66. Let F be an algebraically closed field and V a finite-dimensional vector space
over F.
(a) Suppose that T, S E EndF(V). Prove that T and S are similar if and
only if
dim(Ker(T - A1V )k) = dim(Ker(S - Alv)k)
for all A E F and k E N. (This result is known as Weyr's theorem.)
4.7 Exercises 285
-2 0 0 1 1 0 1 0
1 1 0 1
(f)
4 3 -2 0
(e)
2 0 1 -2 -2 1 5 0 '
-1 0 0 0 2 0 -1 3
74. In each case below, you are given some of the following information for
a linear transformation T : V -. V, V a vector space over the complex
numbers C: (1) characteristic polynomial for T; (2) minimal polynomial for
T; (3) algebraic multiplicity of each eigenvalue; (4) geometric multiplicity
of each eigenvalue; (5) rank(VT) as an C[X]-module; (6) the elementary
divisors of the module VT. Find all possibilities for T consistent with the
given data (up to similarity) and for each possibility give the rational and
Jordan canonical forms and the rest of the data.
(a) CT(X) = (X - 2)4(X - 3)2.
(b) CT(X) = X2(X - 4)' and MT(X) = X(X - 4)3.
(c) dim V = 6 and MT(X) = (X + 3)2(X + 1)2.
d) CT X = X(X - 1)4(X - 2)5 , v m(1) = 2, and v`eom(2) = 2.
e cT(X = (X - 5)(X - 7)(X - (X - 11).
f) dimV=4andmr(X)=X-1.
75. Recall that a matrix A E MM(F) is idempotent if A2 = A.
(a) What are the possible minimal polynomials of an idempotent A?
b If A is idempotent and rank A = r, show that A is similar to B =
Ir ®On-r
76. If T : Cn Cn denotes a linear transformation, find all possible Jordan
canonical forms of a T satisfying the given data:
(a) cT(X) = (X - 4)3(X - 5)2.
(b) n=6andmT(X)=(X -9)3.
(c) n = 5 and mr(X) _ (X - 6)2(X - 7).
d T has an eigenvalue 9 with algebraic multiplicity 6 and geometric mul-
tiplicity 3 (and no other eigenvalues).
(e) T has an eigenvalue 6 with algebraic multiplicity 3 and geometric mul-
tiplicity 3, and eigenvalue 7 with algebraic multiplicity 3 and geometric
multiplicity 1 (and no other eigenvalues).
77. (a) Show that the matrix A E M3(F) (F a field) is uniquely determined up
to similarity by CA(X) and mA(X).
(b) Give an example of two matrices A, B E M4(F) with the same charac-
teristic and minimal polynomials, but with A and B not similar.
78. Let A E If all the roots of the characteristic polynomial CA(X) are
real numbers, show that A is similar to a matrix B E MM(R).
79. Let F be an algebraically closed field, and let A E Mn(F).
(a) Show that A is nilpotent if and only if all the eigenvalues of A are zero.
b Show that Tr(Ar) = ai + + an where are the eigenvalues
of A counted with multiplicity.
(c) If char(F) = 0 show that A is nilpotent if and only if Tr(A') = 0 for all
r E N. (Hint: Use Newton's identities, Exercise 61 of Chapter 2.)
80. Prove that every normal complex matrix has a normal square root, i.e., if
A E M. (C) is normal, then there is a normal B E A. (C) such that B2 = A.
81. Prove that every Hermitian matrix with nonnegative eigenvalues has a Her-
mitian square root.
82. Show that the following are equivalent:
a) U E M,,(C) is unitary.
b) The columns of U are orthonormal.
c) The rows of U are orthonormal.
83. (a) Show that every matrix A E M,(C) is unitarily similar to an upper
triangular matrix T, i.e., UAU' = T, where U is unitary.
(b) Show that a normal complex matrix is unitarily similar to a diagonal
matrix.
4.7 Exercises 287
84. Show that a commuting family of normal matrices has a common basis of
orthogonal eigenvectors, i.e., there is a unitary U such that UA3U` = D3 -
for all Aj in the commuting family. (D3 denotes a diagonal matrix.)
85. A complex matrix A E M,,(C) is normal if and only if there is a polynomial
f (X) E C[X] of degree at most n - 1 such that A' = f (A). (Hint: Apply
Lagrange interpolation.)
86. Show that B = ®A; is normal if and only if each Al is normal.
87. Prove that a normal complex matrix is Hermitian if and only if all its eigen-
values are real.
88. Prove that a normal complex matrix is unitary if and only if all its eigenvalues
have absolute value 1.
89. Prove that a normal complex matrix is skew-Hermitian (A' = -A) if and
only if all its eigenvalues are purely imaginary.
90. If A E let H(A) = I (A + A') and let S(A) = (A - A'). H(A) is
called the Hermitian part of A and S(A) is called the skew-Hermitian
z part
of A. These should be thought of as analogous to the real and imaginary
parts of a complex number. Show that A is normal if and only if H(A) and
S(A) commute.
91. Let A and B be self-adjoint linear transformations. Then AB is self-adjoint
if and only if A and B commute.
92. Give an example of an inner product space V and a linear transformation T :
V -+ V with T'T = lv, but T not invertible. (Of course, V will necessarily
be infinite dimensional.)
93. (a) If S is a skew-Hermitian matrix, show that I - S is nonsingular and the
matrix
U=(I+S)(I-S)-1
is unitary.
(b) Every unitary matrix U which does not have -1 as an eigenvalue can be
written as
U = (I + S)(I - S)-1
for some skew-Hermitian matrix S.
94. This exercise will develop the spectral theorem from the point of view of
projections.
(a) Let V be a vector space. A linear transformation E : V -* V is called
a projection if E2 = E. Show that there is a one-to-one correspondence
between projections and ordered pairs of subspaces (V1, V2) of V with
V1ED V2=V given by
E H (Ker(E), Im(E)).
(b) If V is an inner product space, a projection E is called orthogonal if E _
E. Show that if E is an orthogonal projection, then Im(E)' = Ker(E)
and Ker(E)1 = Im(E), and conversely.
(c) A set of (orthogonal) projections {E1, ... , E,.} is called complete if
E;Ej=O ori$jand
lv
Show that any set of (orthogonal) projections {E1, ... , E,.} with EiEE _
0 for i 54 j is a subset of a complete set of (orthogonal) projections.
(d) Prove the following result.
Let T : V --> T be a diagonalizable (resp., normal) linear transformation
on the finite-dimensional vector space V. Then there is a unique set of
288 Chapter 4. Linear Algebra
distinct scalars {Al, ... , A,.} and a unique complete set of projections
(reap., orthogonal projections) {E,, ... , Er} with
T=AiEi+...+ArEr.
Also, show that are the eigenvalues of T and {1m(E;)};al are
the associated eigenapaces.
(e) Let T and {E;) be as in part (d) Let U : V - ' V be an arbitrary
linear transformation. Show that U = UT if and only if E1U = UEt
for 1 < i < r.
(f) Let F be an infinite field. Show that there are polynomials pi(X) E FIX)
for 1 < i < r with pi (T) = E. (Hint: See Exercise 63.)
Chapter 5
Matrices over PIDs
T 2
Rrn Rn
a
M-0
Th
Rm Rn -# N -. 0
where the hi are isomorphisms and iri are the canonical projections. Define
0: M -i N by 0(x) = ir2(h2(y)) where irl(y) = x. It is necessary to check
that this definition is consistent; i.e., if irl(yi) = irl(y2), then ir2(h2(yl)) =
ir2(h2(y2)). But if 7rl(yi) = irl(y2), then yl - y2 E Ker(7rl) = Im(f), so
y1 - y2 = f(z) for some z E R'n. Then
h2(yi - 112) = h2f(z) = ghl(z) E Im(g) = Ker(7r2)
so a2h2(yl - y2) = 0 and the definition of 0 is consistent. We will leave it
to the reader to check that 0 is an R-module isomorphism.
by,OB(ej) = vj for 1 < j < n where A = {e1, en} is the standard basis
of the free F[X]-module F[X]n and B = {v1, ... , vn} is a given basis of V.
Thus,
(1.3) f1(T)(vi)+...+fn(T)(vn)
VG13(f1(X), ... ,fn(X)) =
for all (f1(X), ... , f. (X)) E F[X]n. Let K = Ker(08). If A = [T]B then
A = [aij] E Mn(F) where
n
T(vj) _ aijvi.
i=1
Let
n
(1.5) pj(X) = Xej - >aijei E F[X]n for 1 < j < n.
i=1
>hj(X)Xe, = E hj(X)aijei
j=1 i,j=1
and since {ei, ... , en} is a basis of F[X]n, it follows that
n
(1.8) hi(X)'X =Ehj(X)a;j.
j=1
If some hi (X) 96 0, choose i so that hi (X) has maximal degree, say,
deghi(X) = r > deghj(X) (1 < j < n).
Then the left-hand side of Equation (1.8) has degree r + 1 while the right-
hand side has degree < r. Thus hi(X) = 0 for 1 < i < n and C is a basis of
K.
-013 L(1f(x)1)
l
9(X)
- [xf(x)_9(X)1
X9(X)
Note that
(f(X)1
l g(X) J
- lao+bll + (Xf(X)-9'(X)
bo Xg(X) JJ
L
where
9(X) - bo
9'(X) = X
f(X) +9(X) - ao - bl
f(X) = X
Since f (X) and g(X) are arbitrary, we see that
(1.10) F[X]2/Im(OB) F2
Proof. If A and B are similar, then B = P-'AP for some P E GL(n, F).
Then
Rm g M -* 0
where h1 and h2 are isomorphisms. If A = {el, ... en} is the standard
basis on Rn, then V = {vl, ... , vn} and W = {w1, ... , wn} are generating
sets of the R-module M, where vi = 9r1(ei) and w3 = ir2(e,). Note that
we are not assuming that these generating sets are minimal, i.e., we do not
assume that µ(M) = n. From the diagram (1.13) we see that the generators
v; and w3 are related by
5.1 Equivalence and Similarity 295
wj = ir1(h21(ej))
n
_ 11pijei)
i=1
(1.15) = tPiv.
That is, wj is a linear combination of the vi where the scalars come from
the jeh column of P-1 in the equivalence equation B = PAQ-1.
(1.9) Example. We will apply the general analysis of Example 1.8 to a spe-
cific numerical example. Let M be an abelian group with three generators
v1i v2, and v3, subject to the relations
6v1 + 4v2 + 2v3 = 0
-2v1 + 2v2 + 6v3 = 0.
That is, M = Z3/K where K is the subgroup of Z3 generated by yi =
(6,4,2) and y2 = (-2,2,6), so there is a finite free presentation of M
(1.16) 0, Z2- TA
+ Z3-- 11
-+ M->0
where TA denotes multiplication by the matrix
6 -2
A= 4 2
2 6
then
2 0
B=PAQ= 0 10
0 0
3 2 1
P-1= 2 1 0
1 0 0
(2.2) Theorem. Let R be a PID, let a1, ..., a,a E R, and let d =
gcd{a1, ... ,an}. Then there is a matrix A E Mn(R) such that
(1) rowl(A)=[a1 an], and
(2) det(A) = d.
Prof. The proof is by induction on n. If n = 1 the theorem is trivially
true. Suppose the theorem is true for n - 1 and let Al E Mn-1(R) be a
matrix with rowl(A1) = [a1 an_1] and det(A1) = d1 = gcd{a1,...,
an-11- Since
d = gcd{al,...,an}
= gcd{gcd{a1i ... a, _1}, an}
= gcd{dl, an},
it follows that there are u, v E R such that ud1 - van = d. Now define A
by
an
AI 0
A=
0
L U
Since d1 ai for 1 < i < n - 1, it follows that A E M. (R) and row, (A) =
[a1 an ]. Now compute the determinant of A by cofactor expansion
along the last column. Thus,
det(A) = udet(A1) + (-1)n+1an det(Ain)
where A1n denotes the minor of A obtained by deleting row 1 and column n.
Note that A1n is obtained from AI by moving row,(A1) = (a1 an_1 ]
to the n -1 row, moving all other rows up by one row, and then multiplying
the new row n -1 (= [ a1 an- I ]) by v/d1. That is, using the language
of elementary matrices
A1n = Dn-1(v/d1)Pn-1,n-2 ... P3.2 P21A1.
Thus,
(2.3) Remark. Note that the proof of Theorem 2.2 is completely algorith-
mic, except perhaps for finding u, v with ud, - van = d; however, if R is a
Euclidean domain, that is algorithmic as well. It is worth noting that the ex-
istence part of Theorem 2.2 follows easily from Theorem 3.6.16. The details
are left as an exercise; however, that argument is not at all algorithmic.
(2.4) Corollary. Suppose that R is a PID, that a1, ..., a, are relatively
prime elements of R, and 1 < i < n. Then there is a unimodular matrix A2
with rowi(A2) _ [a, an ] and a unimodular matrix B2 with col2(B2) _
[a, ...
t
an] .
and det A = 1, which is guaranteed by the Theorem 2.2. Then let Ai = P12A
and let B2 = Al P1. where P12 is the elementary permutation matrix that
interchanges rows 1 and i (or columns 1 and i). (]
then we may use the matrix A in the induction step. Observe that 10-9 = 1,
so the algorithm gives us a matrix
25 15 7 9
B- 3 2 0 0
10 6 3 0
25 15 7 10
bt\d/+.+bm(d)=1
so {b1, ... bm} is relatively prime. By Theorem 2.2 there is a matrix U1 E
GL(m, R) such that rows (Ul) = [ b1 ... bm ]. Then
and ci = uilal + + uimam so that d I ci for all i > 2. Hence ci = aid for
ai E R. Now, if U is defined by
al
rl
Al = T2,(-72) ... T,.1(-7m)PjjA =
rm
amp
al')
1
0
UA=A,=
0
am
0
(2.7) Examples.
(1) If R = Z, then a complete set of nonassociates consists of the nonneg-
ative integers; while if m E Z is a nonzero integer, then a complete set
of residues modulo m consists of the m integers 0, 1, ... , Iml - 1. A
complete set of residues modulo 0 consists of all of Z.
(2) If F is a field, then a complete set of nonassociates consists of {0, 1};
while if a E F \ {0}, a complete set of residues modulo a is {0}.
(3) If F is a field and R = F[X] then a complete set of nonassociates of R
consists of the monic polynomials together with 0.
Thus, if the matrix A is in Hermite normal form, then A looks like the
following matrix:
0
302 Chapter 5. Matrices over PIDs
0
Al=U1A=
B,
0
where B1 E M,_l,n_1(R). By the induction hypothesis, there is an in-
vertible matrix V E GL(m - 1, R) (which may be taken as a product of
elementary matrices if R is Euclidean) such that VB1 is in Hermite normal
form. Let
1 0
U2
0 V
Then U2A1 = A2 is in Hermite normal form except that the entries a,,,,
(i > 1) may not be in P(a;a,). This can be arranged by first adding a
multiple of row 2 to row 1 to arrange that a,,,, E P(a2n, ), then adding
a multiple of row 3 to row 1 to arrange that a1n3 E P(a3n, ), etc. Since
a;3 = 0 if j < n;, a later row operation does not change the columns before
n;, so at the end of this sequence of operations A will have been reduced
to Hermite normal form, and if R was Euclidean then only elementary row
operations will have been used. 0
If we choose a complete set of nonassociates for R so that it contains 1
(as the unique representative for the units) and a complete set of represen-
tatives modulo 1 to be {0}, then the Hermite form of any U E GL(n, R) is
5.2 Hermite Normal Form 303
I,,. This is easy to see since a square matrix in Hermite normal form must
be upper triangular and the determinant of such a matrix is the product
of the diagonal elements. Thus, if a matrix in Hermite normal form is in-
vertible, then it must have units on the diagonal, and by our choice of 1 as
the representative of the units, the matrix must have all 1's on the diago-
nal. Since the only representative modulo 1 is 0, it follows that all entries
above the diagonal must also be 0, i.e., the Hermite normal form of any
U E GL(n, R) is I,,.
If we apply this observation to the case of a unimodular matrix U with
entries in a Euclidean domain R, it follows from Theorem 2.9 that U can be
reduced to Hermite normal form, i.e., I,,, by a finite sequence of elementary
row operations. That is,
El...E1U=I
where each Ej is an elementary matrix. Hence, U = El 1 . . . ET' is itself a
product of elementary matrices. Therefore, we have arrived at the following
result.
4 2 9 5
A= 6 3 4 3
8 4 1 -1
The left multiplications used in the reduction are U1, ..., U7 E GL(3, Z),
while Al = U1A and A; = U;A;_1 for i > 1. Then
304 Chapter 5. Matrices over PIDs
-1 1 0 2 1 -5 -2
UI = 0 1 0 Al = Ul A = 6 3 4 3
0 0 1 8 4 1 -1
1 0 0 2 1 -5 -2
U2= -3 1 0 A2 = U2A1 = 0 0 19 9
-4 0 1 0 0 21 7
r1 0 0 2 1 -5 -2
U3= 1
0 10 -9 A3 = U3A2 = 0 0 1 27
-1 1 0 0 2 -2
1 0 0 2 1 -5 -2
U4= 0 1 0 A4 = U4A3 = 0 0 1 27
0 -2 1 0 0 0 -56
1 0 0 2 1 -5 -2
U5 = 0 1 0 A5 = U5A4 = 0 0 1 27
0 0 -1 0 0 0 56
1 5 0 2 1 0 133
U8= 0 1 0 A6=UBA5= 0 0 1 27
0 0 1 0 0 0 56
0 -2 2 1 0 21
U7= 10 1 0 A7=U7A6= 0 0 1 27
0 1 0 0 0 56
0 U 0
0 0
so kin,u,i = 0 for s > 1, and hence, u,1 = 0 for s > 1. Therefore,
_ u11
U- 0 Ui
If H = [ro i(H)]
Hi and K = [ro i(K)]
KI where Hi, K1 E M,n_i,n(R) then
H1 = Ui Ki and H1 and K1 are in Hermite normal form. By induction on
the number of rows we can conclude that nj = tj for 2 < j < r. Moreover,
by partitioning U in the block form U = [ u" u12 ] where Ul 1 E Mr (R)
U2i Usz
and by successively comparing cola, (H) = U Colnj (K) we conclude that
U21 = 0 and that U1 i is upper triangular. Thus,
U11 U12 Uir
0 U22 u2r U12
U=
0 0 urr
0 U22
since u,.. = 0 for -y < s while k7n,+, = 0 if 'y > s + 1. Since u,, = 1, we
conclude that
h.,,n.+i = ks,n.+, + u,,,+1ks+1,n.+1
Therefore,
-f=l
Therefore, h,,,,,+,+, = k,,n,+,+, since they both belong to the same residue
class modulo Hence u,,,+j+I = 0. Therefore, we have shown
k,+j+1,n.+,r,
and, since the last m - r rows of H and K are zero, it follows that H =
UK = K and the uniqueness of the Hermite normal form is proved. 0
We will conclude this section with the following simple application of
the Hermite normal form. Recall that if R is any ring, then the (two-sided)
ideals of the matrix ring M,, (R) are precisely the sets M,' (J) where J is an
ideal of R (Theorem 2.2.26). In particular, if R is a division ring then the
only ideals of M,, (R) are (0) and the full ring M,, (R). There are, however,
many left ideals of the ring M,,(R), and if R is a PID, then the Hermite
normal form allows one to compute explicitly all the left ideals of M,(R),
namely, they are all principal.
(2.14) Theorem. Let R be a PID and let J C Mn(R) be a left ideal. Then
there is a matrix A E Mn(R) such that J = (A), i.e., J is a principal left
ideal of Mn(R).
Proof. Mn(R) is finitely generated as an R-module (in fact, it is free of
rankn2), and the left ideal J is an R-submodule of Mn(R). By Theo-
rem 3.6.2, J is finitely generated as an R-module. Suppose that (as an
R-module)
J=(B1,...,Bk)
where B; E Mn(R). Consider the matrix
5.3 Smith Normal Form 307
B1
B= E
Bk
A=P11B1+ - + P1kBk-
.0
where Q = P-1 = [Qt , it follows that B; = Qi1A. Therefore, J C (A),
and the proof is complete. 0
Dr 0
(3.1) UAV =
0 0
where r = rank(A) and Dr = diag(sl, ... , s,.) with si 0 0 (1 < i < r) and
Si I si+l for 1 < i < r - 1. Furthermore, if R is a Euclidean domain, then
the matrices U and V can be taken to be a product of elementary matrices.
Proof. Consider the homomorphism TA : R" - Rm defined by multiplica-
tion by the matrix A. By Proposition 4.3.20, there is a basis B of R' and
a basis C of R' such that
[Dr
(3.2) [TA]" _ 0]
where Dr = diag(sl, ... , Sr) with si $ 0 (1 < i < r) and si I si+l for
1 < i < r - 1. If C' and B' denote the standard bases on R' and R"
respectively, then the change of basis formula (Proposition 4.3.16) gives
Since A = [TA]C, and since the change of basis matrices are invertible,
Equation (3.3) implies Equation (3.1). The last statement is a consequence
of Theorem 2.10.
(3.2) Remarks.
(1) The matrix [r 00] is called the Smith normal form of A after H. J.
Smith, who studied matrices over Z.
(2) Since the elements s1, ... , Sr are the invariant factors of the submodule
Im(TA), they are unique (up to multiplication by units of R). We shall
call these elements the invariant factors of the matrix A. Thus, two
matrices A, B E M,,n(R) are equivalent if and only if they have the
same invariant factors. This observation combined with Theorem 1.7
gives the following criterion for the similarity of matrices over fields.
(3.3) Theorem. Let F be a field and let A, B E Mn(F). Then A and B are
similar if and only if the matrices XI7, - A and XI, - B E Mn(F[X]) have
the same invariant factors.
Proof.
for 2 < i < m. By applying a similar process to the elements of the first
row, we may also assume that elementary row and column operations have
produced a matrix B in which b11 bit and b11 I bl, for 2 < i < m and
I
2 < j < n. Then subtracting multiples of the first row and column of B
produces an equivalent matrix B = [b11J ® C where C E Mm-l,,,-l. We
may arrange that b11 divides every element of C. If this-is not the case
already, then simply add a row of C to the first row of B, producing an
equivalent matrix to which the previous process can be applied. Since each
repetition reduces v(B), only a finite number of repetitions are possible
before we achieve a matrix B = [b11J ® C in which b11 divides every entry
of C. If C is not zero, repeat the process with C. This process will end with
the production, using only elementary row and column operations, of the
Smith normal form.
4 -4 0 16
13 A 14
0 0 0
1 0 0 2 1 -3 -1
1 0 0
0 1 0 1 -1 -3 1
0 1 0
0 0 1 4 -4 0 16
0 0 1
0 0 0
0 1 0 1 -1 -3 1
1 0 0
1 0 0 2 1 -3 -1 0 1 0
0 0 1 4 -4 0 16
0 0 1
0 0 0
0 1 0 1 -1 -3 1
0 0
H 1 -2 0 0 3 3 -3
1
0 1 0
0 -4 1 0 0 12 12
0 0 1
0 1 0 1 0 0 0
1 3 -1-
1 0 0
-+ 1 -2 0 0 3 3 -3
0 1 0
0 -4 1 0 0 12 12
0 0 1 _
1 2 0-
0 1 0 1 0 0 0
1 -1 1
1 -2 0 0 3 0 0
0 1 0
0 -4 1 0 0 12 12
0 0 1
1 2 -2
0 1 0 1 0 0 0
1 -1 2
1 -2 0 0 3 0 0
0 1 -1
0 -4 1 0 0 12 0
0 0 1
= U S V
(3.6) Remark. Theorem 3.3 and the algorithm of Remark 3.4 explain the
origin of the adjective rational in rational canonical form. Specifically, the
invariant factors of a linear transformation can be computed by "ratio-
nal" operations, i.e., addition, subtraction, multiplication, and division of
polynomials. Contrast this with the determination of the Jordan canonical
form, which requires the complete factorization of polynomials. This gives
an indication of why the rational canonical form is of some interest, even
though the Jordan canonical form gives greater insight into the geometry
of linear transformations.
From Equation (3.6), we conclude that t32 = t23 = t33 = 0, t12 = t21,
t13 = t31 = pt22 = ps for some s E R. Therefore, T must have the form
tll t12 PS
T = t12 s 0
Ps 0 0
and hence, det(T) = -p2s3. Since this can never be a unit of the ring R,
it follows that the matrix equation AT = TAt has no invertible solution T.
Thus, A is not similar to At.
312 Chapter 5. Matrices over PIDs
(3.9) Remark. Theorem 1.7 is valid for any commutative ring R. That is,
if A, B E MM(R), then A and B are similar if and only if the polynomial
matrices XI, - A and X In - B are equivalent in Mn (R[X ]) . The proof we
have given for Theorem 1.7 goes through with no essential modifications.
With this in mind, a consequence of Example 3.8 is that the polynomial
matrix
X -p 0
X13-A= 0 X -1 E M3(R[X])
0 0X
is not equivalent to a diagonal matrix. This is clear since the proof of
Proposition 3.7 would show that A and At were similar if XI3 - A was
equivalent to a diagonal matrix.
What this suggests is that the theory of equivalence for matrices with
entries in a ring that is not a PID (e.g., R[X] when R is a PID that is
not a field) is not so simple as the theory of invariant factors. Thus, while
Theorem 1.7 (extended to A E M,,(R)) translates the problem of similar-
ity of matrices in M,(R) into the problem of equivalence of matrices in
M,(R[X]), this merely replaces one difficult problem with another that is
equally difficult, except in the fortuitous case of R = F a field, in which
case the problem of equivalence in M,,(F[X]) is relatively easy to handle.
The invariant factors of a matrix A E M,n,n (R) (R a PID) can be
computed from the determinantal divisors of A. Recall (see the discus-
sion prior to Definition 4.2.20) that Q,,,,,, denotes the set of all sequences
a = (i1, ... , ip) of p integers with 1 < it < i2 < < ip m. If a E Qp,,,,,,
,3 E Qj,,,, and A E M,n...n(R), then A[a 10] denotes the submatrix of A
whose row indices are in a and whose column indices are in 0. Also recall
(Definition 4.2.20) that the determinantal rank of A, denoted D-rank(A),
is the largest t such that there is a submatrix A[a 1,31 (where a E Qt,,,, and
/3 E Qt,,,) with det A[a 10] 0. Since R is an integral domain, all the ranks
of a matrix are the same, so we will write rank(A) for this common num-
ber. For convenience, we will repeat the following definition (see Definition
4.2.21):
:3.10) Definition. Let R be a PID, let A E M,n,,, (R), and let k be an integer
such that 1 < k _< min{m, n}. If det A[a 1 a] = 0 for all a E Qk,m,
l3 E Qk,n, then we set dk (A) = 0. Otherwise, we set
dk(A) = gcd{det A[a 10] : a E Qk,m., 0 E Qk,n}.
dk (A) is called the kth determinantal divisor of A. For convenience in some
formulas, we set do(A) = 1.
Thus, if dk(A) = 0 then det B[a I /3] = 0 for all a E Qk,n, Q E Qk,n and
hence dk(B) = 0. If dk(A) 34 0 then dk(A) I detA[w I r] for all w E Qk,,n,
r E Qk.n, so Equation (3.7) shows that dk(A) I det B[a I ,o] for all a E Qk,m,
(3 E Qk,n Therefore, dk(A) I dk(B).
Since it is also true that A = U-'BV-1, we conclude that dk(A) = 0
if and only if dk(B) = 0 and if dk(A) 54 0 then
A[0 Dr 0
0
where Dr = diag(al, ... ,Sr) with s; 36 0 (1 < i < r) and s, I s;+l for
1<i<r-1.Ifa=(i1,i2...,ik)EQk,rthen detA[aIa]=s;, ...sik,
while det A[(3 I ry] = 0 for all other /3 E Qk,,n, -y E Qk,n. Then, since si 18 +1
for 1 < i < r - 1, it follows that
(3.8) dk(A) =
at Sk if1<k<r
0 if r + 1 < k < min{m, n}.
From Equation (3.8) we see that the diagonal entries of A, i.e., sl, ..., sr,
can be computed from the determinantal divisors of A. Specifically,
al = d1(A)
d2(A)
8 =
di(A)
Sr __
4(A)
4-1(A).
By Lemma 3.11, Equations (3.9) are valid for computing the invariant
factors of any matrix A E M,n,n(R).
314 Chapter 5. Matrices over PIDs
(3.12) Examples.
(1) Let
-2 0 10
A= 0 -3 -4 E M3(Z).
1 2 -1
Then the Smith normal form of A is diag(1, 1, 8). To see this note that
ent31(A) = 1, so d1(A) = 1;
detA[(1,2)I(1,2)]=6 and detA[(2,3)l(2,3)]=11,
so d2(A) = 1, while det A = 8, so d3(A) = 8. Thus, s1(A) = 1, s2(A) _
1, and s3(A) = 8.
(2) Let
X(X - 1)3 0 0
B 0 X-1 0 E M3(Q[X]).
0 0 X
Then d1(B) = 1, d2(B) = X(X - 1), and d3(X) = X2(X - 1)4.
Therefore, the Smith normal form of B is diag(1, X (X -1), X (X -1)3 ).
UAV=I j ]=B
where Dr = diag(s1, ... ,Sr) with s, i 0 for all i and s; I s;+1 for 1 < i <
r - 1, then by Proposition 1.3
(3.10) M Coker(TB) ?° (RI (81)) ®. ® (R/(Sr)) ® Rm
Therefore, we see that the si 54 1 are precisely the invariant factors of the
torsion submodule Mr of M. This observation combined with Equation
(3.9) provides a determinantal formula for the invariant factors of M. We
record the results in the following theorem.
(3.13) Theorem. Let R be a PID and let A E Mm,n(R). Suppose that the
Smith normal form of A is [ ° o]. Suppose that s, = 1 for 1 < i < k (take
k = 0 if s1 # 1) and s; 96 1 fork < i < r. If M = Coker(TA) where
TA : Rn --y Rm is multiplication by A, then
(1) µ(M) = m - k;
5.3 Smith Normal Form 315
(2) rank(M/M,) = m - r;
(3) the invariant factors of M, are t; = sk+i for 1 < i < r - A*,; and
(4) dk+1(A) for l < i < r - k.
ti = dk+i-1(A)
Proof. All parts follow from the observations prior to the theorem; details
are left as an exercise. 0
If we apply this theorem to the presentation of VT from Proposition
1.5, we arrive at a classical description of the minimal polynomial mT(X)
due to Frobenius.
(3.13) 0 < el.i G e2j < < erg (1 < j < k).
The prime power factors eiJ > 0}, counted according to the number
of times each occurs in the Equation (3.12), are called the elementary di-
visors of the matrix A. Of course, the elementary divisors of A are nothing
more than the elementary divisors of the torsion submodule of Coker(TA),
where, as usual, TA : R" R' is multiplication by the matrix A (Theorem
3.13 (3)). For example, let
A = diag(12, 36, 360, 0, 0).
A is already in Smith normal form, so the invariant factors of A are 12 =
22 3, 36 = 22 . 32, and 360 = 23 . 32 . 5. Hence the elementary divisors of
A are
{22, 3,2 2 , 32, 23.32, 5}.
ej= maxeij
1<i<r
1<j<k
then sr is an associate of pi' ... p','. Delete {pj', ... , pk' } from the set
of elementary divisors and repeat the process with the set of remaining
elementary divisors to obtain sr_l. Continue this process until all the ele-
mentary divisors have been used. At this point the remaining si are 1 and
we have recovered the invariant factors from the set of elementary divisors.
0
Then ss = 23 .3 2 7. 112 = 60984. Deleting 23, 32, 7, 112 from the set of
elementary divisors leaves the set
12,2 2 , 22, 32, 32, 11}.
Thus, s4 = 22 32 . 11 = 396. Deleting 22, 32, 11 leaves the set {2, 22, 32)
so that 83 = 22.32 = 36. Deleting 22 and 32 gives a set {2} so that 32 = 2.
Since the set obtained by deleting 2 from {2} is empty, we must have that
sl = 1. Therefore, A is equivalent to the matrix
1 0 0 0 0 0-
0 2 0 0 0 0
0 0 36 0 0 0
0 0 0 396 0 0
0 0 0 0 60984 0
0 0 0 0 0 0
0 0 0 0 0 0
where Dr = diag(t1, ... , tr). Then the prime power factors of the t; (1 <
i < r) are the elementary divisors of A.
Proof. Let p be any prime that divides some t; and arrange the tj according
to ascending powers of p, i.e.,
ti3=Pe'91
tir = Pe,gr
where (p, q;) = 1 for 1 < i < r and 0 < e1 < e2 < - - - < er. Then the
exact power of p that divides the determinantal divisor dk(A) is pej+-.-+e,,
for 1 <_ k <_ r and hence the exact power of p that divides the kth invariant
factor sk(A) = dk(A)/dk_I(A) is pek for 1 < k < r. (Recall that do(A) is
defined to be 1.) Thus pek is an elementary divisor for 1 < k < r. Applying
this process to all the primes that divide some t; completes the proof. 0
Dr 0 0
0 E, 0 ,
0 0 0
2 -4 1 3
2 -3 0 2
E M4 (Q) .
[T]e=A= 0 - 1 1 2
1 -1 -1 0
X-2 4 -1 -3
XI4-A= -2 X+3 0 -2
0 1 X-1 -2
-1 1 1 X
1 -1 -1 -X
-1) -2 X+3 0 -2
DiP14 0 1 X-1 -2
X-2 4 -1 -3
-1 -1 -X
0X+1 -2 -2X - 2
T21(2)
T41(-(X - 2)) `0 [10
1
X+2 X-3 X2-2X-3
X-1 -2
1 0 0 0
0X+1 -2 -2X - 2
0 1 X-1 -2 T12(
Ti3 1
0 X+2 X-3 X2-2X-3 T14( )
1 0 0 0
0 1 X-1 -2
P23 r-+
0 X+1 -2 -2X-2
0 X+2 X-3 X2-2X-3
5.4 Computational Examples 321
1 0 0 0
T42(-(X +2)) 0 1 X-1 -2
T32 - X + 1 0 0 -X2 - 1 0
0 0 -X2-1 X2+1
1 0 0 0
0 1 0 0
T23(-(X - 1))
0 0 X2+1 0 T24(2)
D3(-1)
0 0 X2+1 X2+1
1 0 0 0
0 1 0 0
0 0X 2 +1 0 T43(-1).
0 0 0 X2+1
0 0 0 -1
0 0 1 0
Since MT (X) does not split into linear factors over the field Q, it follows
that T does not possess a Jordan canonical form.
Our next goal is to produce a basis B' of V such that [71s, = R. Our
calculation will be based on Example 1.8, particularly Equation (1.15).
That is, supposing that
P(X)(XI4 - A)Q(X) = diag(1, 1, X2 + 1, X2 + 1)
then
VT °-` Q(XJwI Q[x]w2
where w3 = F 1Pij+zv; if P(X)-' _ [p,,]. But from our caclulations
above (and Lemma 4.1.11), we conclude that
P(X)-1 = P14Di(-1)T21(-2)T41(X - 2)P23T32(X + 1)T42(X + 2).
Therefore,
X-2 X+2 0 1
8' = {w1, T(w1) = -4v1 - 3v2 - v3 - v4, w2, T(w2) = 2v1 + 2v2 + v4}
is a basis of V such that [T]n- = R. Moreover, S-IAS = R where
0 -4 1 2
S
0 -1 0 0
-P8.
0 -1 0 1
(4.2) Remark. Continuing with the above example, if the field in Example
4.1 is Q[i], rather than Q, then mT(X) = X2 + 1 = (X + i)(X - i),
so T is diagonalizable. A basis of each eigenspace can be read off from
the caclulations done above. In fact, it follows immediately from Lemma
3.7.17 that Ann((X - i)wj) = (X + i) for j = 1, 2. Thus, the eigenspace
corresponding to the eigenvalue -i, that is, Ker(T + i), has a basis
{(X - i)w1i (X - i)w2}
and similarly for the eigenvalue i. Therefore, diag(i, i, -i, -i) = S1'AS1,
where
-4 2+i -4 2-i
Si =
-3+i 2 -3-i 2
-1 0 -1 0
-1 1 -1 1
and
0 0 1 -1 0 0 0 -1
It remains to compute the change of basis matrices which transform
A into R and J, respectively. As in Example 4.1, the computation of these
matrices is based upon Equation (1.15) and Lemma 3.7.17. We start by
computing P(X)-1:
P(X)-1 = P12D1(-1)T21(X)T31(-2)T41(-1)P23D2(4)
T32(-(X - 1))T42(X - 1)T34(-1)D3(4)T43(-(X - 1))14
X -(X - 1) X+3 -1
_ -1 0 0 0
-2 4 0 0
-1 X-1 -(X - 1) 1
Now for the Jordan canonical form. From Lemma 3.7.17, we see that
(4.4) Ann((X - 1)w2) = ((X + 1)2)
and
(4.5) Ann((X + 1)2w2) = ((X - 1)).
Equation (4.5) implies that
w2 = (X + 1)2w2 = -8v1 + 4v2 - 8v3 + 8v4
is an eigenvector of T with eigenvalue 1. Let w4 = (X - 1)w2 = -4v3 and
let w3 = (X + 1)w4 = -4v1 + 4v2 + 4v4 (see Proposition 4.4.37). Then
The main applications of the Smith normal form and the description
of equivalence of matrices via invariant factors and elementary divisors are
5.4 Computational Examples 325
amlxl+ - - +annxn=0
-
where ail E Z and b; E Z for all i, j. System (4.8) is also called a linear
diophantine system. We let A = [a;3] E Mm,n(Z), X = [xl . . . xn ]', and
B = [b1 bm ]t. Then the system of Equations (4.8) can be written in
matrix form as
(4.9) AX = B.
Now transform A to Smith normal form
(4.10) UAV
A. 0
0 0
slyl = Cl
s2y2 = C2
(4.12) sryr = Cr
0 = Cr+1
0=cn,.
The solution of the system (4.12) can be easily read off; there is a solution
if and only if s;Icifor1<i<randc;=0forr+l<i<m.Ifthereisa
solution, all other solutions are obtained by arbitrarily specifying the n - r
parameters y,.+ 1, ... , yn. Observing that X = VY we can then express the
solutions in terms of the original variables xl, ... , xn.
5.4 Computational Examples 327
4 -4 0 16 16
We leave it as an exercise for the reader to verify (via elementary row and
column operations) that if
0 0
1 1 2 -2
U= 1
1
-2 0 and
V= 0 1 -1 2
0 0 -1
0 -4 1
0 0
1
0 1
then
1 0 0 0
UAV = 0 3 0 0 = S.
0 0 12 0
Let
1
C=UB= 6
12
Then the system AX = B is transformed into the system SY = C, i.e.,
yl = 1
3112 = 6
12y3 = 12.
This system has the solutions
(4.8) Remark. The method just described will work equally well to solve
systems of linear equations with coefficients from any PID.
328 Chapter 5. Matrices over PIDs
for all A E F and k E N. This can be reduced to a finite number of rank con-
ditions if the eigenvalues of A are known. But knowledge of the eigenvalues
involves solving polynomial equations, which is intrinsically difficult.
S1={['
S2={[1 ?]:ci40}
S3= d] :bc#O,a+d=2,ad-be=1}.
[0
We leave the verification of this description of the orbit of Al as an exercise.
In this section we will present a very simple criterion for the similar-
ity of two matrices A and B (linear transformations), which depends only
on the computation of three matrices formed from A and B. This has the
effect of providing explicit (albeit complicated) equations and inequations
for the orbit OA of A under similarity. Unlike the invariant factor and ele-
mentary divisor theory for linear transformations, which was developed in
5.5 A Rank Criterion for Similarity 329
the nineteenth century, the result we present now is of quite recent vin-
tage. The original condition (somewhat more complicated than the one we
present) was proved by C. I. Byrnes and M. A. Gauger in a paper published
in 1977 (Decidability criteria for the similarity problem, with applications
to the moduli of linear dynamical systems, Advances in Mathematics, Vol.
25, pp 59-90). The approach we will follow is due to J. D. Dixon (An is(>-
morphism criterion for modules over a principal ideal domain, Linear and
Multilinear Algebra, Vol. 8, pp. 69-72 (1979)) and is based on a numerical
criterion for two finitely generated torsion modules over a PID R to be
isomorphic. This result is then applied to the F[X]-modules VT and VS,
where S, T E EndF(V), to get the similarity criterion.
...ED R/(sn)
and
N ?` R/(ti) ®... ® R/(tm)
Proof. This follows immediately from Lemma 5.2 and Proposition 3.3.15.
where
v(ai) _ [1 1 1 0 0J
with ei ones. Then define
v(M) v(si)
i=1
and
m
v(N) = 1: v(tj).
j=1
Notice that the matrix v(M) determines M up to isomorphism since one
can recover the elementary divisors of M from v(M). To see this, choose
the largest ti > 0 such that v(M) - v(p2') has nonnegative entries. Then
pt` is an elementary divisor of M. Subtract v(pi') from v(M) and repeat
the process until the zero vector is obtained. (See the proof of Proposition
3.7.19.)
r, j=
Let si = pi" ... pe"' t pIf'' .
l al, fji}
"prf''' and define dijl = minffe
+
= £(R/(si, tj)).
Therefore,
n m
(v(M) : v(N)) = E(v(si) : v(tj))
i=1 j=1
n m
_ e(R/(si, tj))
i=1 j=1
=(M:N).
Similarly, (M : M) = (v(M) : v(M)) and (N : N) = (v(N) : v(N)). By the
Cauchy-Schwartz inequality in Rk we conclude that
(M : N)2 = (v(M) : v(N) )2
< (v(M) : v(M))(v(N) : v(N))
_ (M : M) (N : N),
as required. Moreover, equality holds if and only if v(M) and v(N) are
linearly dependent over R, and since the vectors have integral coordinates,
it follows that we must have v(M) and v(N) linearly dependent over Q,
i.e.,
sv(M) = tv(N)
332 Chapter 5. Matrices over PIDs
Tfij(vk) =T(bikvj)
= bikT(vj)
n
= bik atjvi
l=1
n
ai j bik vt
t=1
n
Eaijfi! (Vk).
I=1
Proof. Since TS,T = LS -1T, the result is immediate from Lemma 5.10.
and hence:
as F[X]-modules. Hence, they have the same rank as F-modules, and thus
rAA = rAB = rBB follows immediately from Lemma 5.12.
Conversely, assume that
2
(5.11) rAB = rAArBB
it follows that
5.5 A Rank Criterion for Similarity 335
But
{(i, j) : 1 < i, j < k and min{i, j} = t}] = 2k - 2t + 1,
so
k
11
5.6 Exercises
0 0 7-6X
0 -4+X 2X E 1113(ZS[X]).
2+4X 5 0
M,, (R)) such that det B is an associate of d..(A), the m`h determinantal
divisor of A. (Hint: First put A in Smith normal form.)
16. Let S = {vi, ... , vk } C M",1(R where R is a PID. Show that S can
be extended to a basis of M",1(R) if and only if dk(A) = 1, where
A= [vi ... Vk ].
17. Suppose that A E M3(Z) and det A = 210. Compute the Smith normal form
of A. More generally, suppose that R is a PID and A E M"(R) is a matrix
such that det A is square-free. Then compute the Smith normal form of A.
18. Let A E M" (Z) and assume that det A y6 0. Then the inverse of A exists in
M"(Q), and by multiplying by a common denominator t of all the nonzero
entries of A-, we find that to-1 E M"(Z). Show that the least positive
integer t such that tA'1 E M"(Z) is t = Is"(A)l where s"(A) is the n1h
invariant factor of A.
19. Let A, B E M"(Z) such that AB = kI" for some k 54 0. Show that the
invariant factors of A are divisors of k.
20. Let R be a PID and let A E M"(R), B E M,"(R). Show that the elementary
divisors of A®B are the product of elementary divisors of A and of B. More
precisely, if p' is an elementary divisor of A® B where p E R is a prime, then
pr = pkpl where pk is an elementary divisor of A and p' is an elementary
divisor of B, and conversely, if pk is an elementary divisor of A and p' is an
elementary divisor of B, then pk+i is an elementary divisor of A 0 B.
21. Let A E M4(F) where F is a field. If A has an invariant factor s(X) of degree
2 show that the Smith normal form of XI4 - A is diag(1, 1, s( X), s(X)).
Conclude that CA(X) is a perfect square in F[X).
22. Find all integral solutions to the following systems AX = B of equations:
0 2 1 [51
(b) A= 1 -1 0 , B 1
2 0 -1 7
(c) A= [68 19
14 22]' B= [7]'
23. Show that the matrices A = [ 8 i ] and B = [ 232 -15 ] in M2 (Q) are
similar.
24. Show that the matrices
canonical form and compare your results with the same calculations done in
Chapter 4.
27. Find an example of a unimodular matrix A E M3(Z) such that A is not
similar to A'. (Compare with Example 3.8.)
28. Show that the matrix A = 2X X I is not equivalent in M2(Z(XJ) to a
2
diagonal matrix. (Hint: Use `Fitting ideals.)
29. Let Z" have the standard basis lei, ... ,e"} and let K C Z" be the sub-
module generated by f, _ j a,jej where a,j E Z and 1 < i < n. Let
A = [a,j] E MM(Z) and let d = det A. Show that Z/K is torsion if and only
ifdetA=d00andifd#0showthatIZ/Kl=l 1.
30. Suppose that an abelian group G has generators xl, x2, and x3 subject to
the relations xl - 3x3 = 0 and xl + 2x2 + 5x3 = 0. Determine the invariant
factors of G and Cl iif G is finite.
31. Suppose that an abelian group G has generators xi, x2, and x3 subject to
the relations 2x1 - x2 = 0, xl - 3x2 = 0, and x1 + x2 + x3 = 0. Determine
the invariant factors of G and Cl lif G is finite.
32. Verify the claim of Example 5.1.
33. Let F be a field and let A E M"(F), B E Mm(F). Show that the matrix
equation
AX - XB = 0
for X E M",,"(F) has only the trivial solution X = 0 if and only if the
characteristic polynomials cA(X) and CB(X) are relatively prime in FIX].
In particular, if F is algebraically closed, this equation has only the trivial
solution if and only if A and B have no eigenvalues in common.
34. Let F be a field. Suppose that A = Al ® A2 E M"(F) where Al E Mk(F)
and A2 E Mm(F) and assume that cA,(X and CA3(.X) are relatively prime.
Prove that if B E M"(F) commutes with A, then B is also a direct sum
B = B1 ® B2 where B1 E Mk(F) and B2 E M,,,(F).
35. Let F be a field. Recall that C(f(X)) denotes the companion matrix of the
monic polynomial f (X) E FIX]. If deg(f (X)) = n and deg(g(X)) = n, show
that
rank(C(f (X)) ® In - I" ® C(9(X ))) = deg(lcm{ f (X), 9(X)}).
6.1 Duality
Recall that if R is a commutative ring, then HomR(M, N) denotes the set
of all R-module homomorphisms from M to N. It has the structure of an R-
module by means of the operations (f + g) (x) = f (x) + g(x) and (a!) (x) =
a(f (x)) for all x E M, a E R. Moreover, if M = N then HomR(M, M) =
EndR(M) is a ring under the multiplication (fg)(x) = f (g(x)). An R-
module A, which is also a ring, is called an R-algebra if it satisfies the
extra axiom a(xy) = (ax)y = x(ay) for all x, y E A and a E R. Thus
EndR(M) is an R-algebra. Recall (Theorem 3.4.11) that if M and N are
finitely generated free R-modules (R a commutative ring) of rank m and n
respectively, then HomR(M, N) is a free R-module of rank mn.
In this section R will always denote a commutative ring so that
HomR(M, N) will always have the structure of an R-module.
v,(vi)=bij= 1 ifi=j
0 ifi#j.
(1.4) Example. Let R = Z and M = V. Consider the basis B = {vl =
(1, 0), v2 = (0,1)}. Then vi (a, b) = a and vz (a, b) = b. Now consider the
342 Chapter 6. Bilinear and Quadratic Forms
basis C = {wl = (1, 1), w2 = (1, 2)}. Then (a, b) = (2a - b)wl + (b - a)w2
so that wi (a, b) = 2a - b and w2 (a, b) = b - a. Therefore, vl 34 w' and
v2 # w2. Moreover, if D = Jul = (1,0), u2 = (1,1)} then ui(a,b) = a - b
and u2 (a, b) = b so that ui 54 vl even though ul = vl. The point is that
an element v, in a dual basis depends on the entire basis and not just the
single element vi.
Since 77(v1) and v;' agree on a basis of M', they are equal. Hence, q(M)
(vl', ... vn) = M" so that q is surjective, and hence, is an isomorphism.
0
(1.8) Remark. For general R-modules M, the map q : M - M" need not
be either injective or surjective. (See Example 1.9 below.) When q happens
to be an isomorphism, the R-module M is said to be reflexive. According
to Theorem 1.7, free R-modules of finite rank are reflexive. We shall prove
below that finitely generated projective modules are also reflexive, but first
some examples of nonrefiexive modules are presented.
(1.9) Examples.
(1) Let R be a PID that is not a field and let M be any finitely generated
nonzero torsion module over R. Then according to Exercise 9 of Chap-
ter 3, M' = HomR(M, R) = (0). Thus, M" = (0) and the natural
map q : M -' M" is clearly not injective.
(2) Let R = Q, and let M = ®nENQ be a vector space over Q of countably
infinite dimension. Then M' °-' fl EN Q. Since
M=®QC fQ,
nEN nEN
by
4)(wi, w2)(8) =w1(Oot1)+w2(Oot2)
where 0 E (M1 ® M2)' and i, : Mi -+ M1 ® M2 is the canonical injection.
and
_ (wl,w2)(01,02).
Therefore,
4)o 41 = 1(MI(DM2)..
and
W o 4) = 1Mi ®Mr,
and the lemma is proved. 0
InM2).. w
(M1 (D M1"eMM'
That is,
W 077 = (711, 712)
6.1 Duality 345
Proof.
((Y rl) (v1, v2)) (wl, W2) =' (77(v1, v2)) (w1, w2)
= (rl(vl, v2)(w1 o rrl), rl(vl, v2)(w2 o l2))
= ((w1 o 7r1)(vl, v2) , (w2 o rr2)(vl, v2))
= (w1(vI), w2(v2))
Convention. For the remainder of this section R will denote a PID and M
will denote a free R-module of finite rank unless explicitly stated otherwise.
(1.14) Definition.
(1) If N is a submodule of M, then we define the hull of N, denoted
Hull(N)={x'EM:rx'EN for somer540ER}.
If A is a subset of M, then we define Hull(A) = Hull((A)).
(2) If A is a subset of M then define the annihilator of A to be the following
subset of the dual module M*:
K(A) = Ann(A)
={wEM*:Ker(w)DA}
={wEM*w(x)=0 for all xEAl
C M*.
346 Chapter 6. Bilinear and Quadratic Forms
K'(B) = Ann(B)
={xEM:w(x)=O forallwEB}
C M.
(1.15) Remarks.
(1) If N is a submodule of M, then M/ Hull(N) is torsion-free, so Hull(N)
is always a complemented submodule (see Proposition 3.8.2); further-
more, Hull(N) = N if and only M/N is torsion-free, i.e., N itself is
complemented. In particular, if R is a field then Hull(N) = N for all
subspaces of the vector space M.
(2) If A is a subset of M, then the annihilator of A in the current context
of duality, should not be confused with the annihilator of A as an
ideal of R (see Definition 3.2.13). In fact, since M is a free R-module
and R is a PID, the ideal theoretic annihilator of any subset of M is
automatically (0).
(3) Note that Ann(A) = Ann(Hull(A)). To see this note that w(ax') _
0 r* aw(x') = 0. But R has no zero divisors, so aw(x') = 0 if and only
if w(x') = 0. Also note that Ann(A) is a complemented submodule
of M' for the same reason. Namely, aw(x) = 0 for all x E A and
a#OER *w(x)=0 forallxEA.
(4) Similarly, Ann(B) = Ann(Hull(B)) and Ann(B) is a complemented
submodule of M for any subset B C M*.
(1.16) Proposition. Let M be a free R-module of finite rank, let A, A1, and
A2 be subsets of M, and let B, B1, and B2 be subsets of M. Then the
following properties of annihilators are valid:
(1) If Al C A2, then K(A1) J K(A2).
(2) K(A) = K(Hull(A)).
(3) K(A) E C(M').
(4) K({0}) = M' and K(M) = {0}.
6.1 Duality 347
(5) K'(K(A)) A.
(1') If B1 C B2, then K'(B1) D K'(B2).
(2') K'(B) = K'(Hull(B)).
(3') KO(B) E C(M).
(4') K'({O}) = M and K'(M') = {0}.
(5') K(K'(B)) B.
Proof. Exercise. 0
The following result is true for any reflexive R-module (and not just
finite rank free modules). Since the work is the same, we will state it in
that context:
and
(1.19) Theorem. Let M be a free R-module of finite rank. Then the function
K : C(M) - C(M')
is a one-to-one correspondence with inverse K*.
Proof. We claim that for every complemented submodule S C M and T C
M', we have
K'(K(S)) = S
and
K(K'(T)) = T.
We will prove the first of these equalities; the second is similar.
First note the K'(K(S)) 2 S for every complemented submodule S C
M by Proposition 1.16 (5), so Corollary 3.8.5 implies that it suffices to show
that rank(K'(K(S))) = rank(S). But
rank(S) = rank(M) - rank(K(S))
and
rank(K(S)) = rank(M') - rank(K'(K(S)))
by Theorem 1.18. Since rank(M) = rank(M'), the result follows.
6.1 Duality 349
forallxEM.
(1.21) Remarks.
(1) f' : N' M' is an R-module homomorphism.
(2) Ad : HomR(M, N) - HomR(N', M'), defined by Ad(f) = f', is an
R-module homomorphism.
(3) If M and N are free, Ker(f) is always a complemented submodule of
M, but Im(f) need not be complemented. (See Proposition 3.8.7.)
(1.22) Theorem. Let M and N be free R-modules of finite rank and let
f E HomR(M, N). Then
(1) Ann(Im(f )) = Ker(f') C N',
(2) rank(Im(f')) = rank(Im(f))), and
(3) Im(f') = Ann(Ker(f)) C M' if Im(f') is a complemented submodule
of M*.
Proof. (1) Let w E N'. Then
w E Ker(f') p f'(w) = 0
qwo f =0
aw(f(x))=0 tlxEM
aw(y)=0 VyEIm(f)
a w E Ann(Im(f)).
(2) Since f' : N' -+ M', Proposition 3.8.8 gives
rank(N') = rank(Im(f')) + rank(Ker(f'))
while Theorem 1.18 shows
rank(N) = rank(Im(f)) + rank(Ann(Im(f))).
Since rank(N) = rank(N'), (2) follows from (1).
(3) Now let r E M'. Then
r E Im(f') a r= f' (w) for some w E N'
a r(x) = w(f (x)) Vx E M.
If x E Ker(f) then f (x) = 0, so w(f (x)) = 0. Therefore, r(x) = 0, and we
conclude that r E Ann(Ker(f )). Hence, Im(f') C Ann(Ker(f )).
By Theorem 1.18 and part (2),
rank(Ann(Ker(f))) = rank(M) - rank(Ker(f))
= rank(Im(f)) = rank(Im(f')).
Since Im(f') is assumed to be complemented, we conclude that Im(f') _
Ann(Ker(f)). 0
350 Chapter 6. Bilinear and Quadratic Forms
Proof.
(1.24) Proposition. Let M and N be free R-modules of finite rank with bases
B and C, respectively, and let f E HomR(M, N). Then
[f"]C _ ([.f]CC)t
But then
aii = wi (.f(vj))
= (w: o f)(vj)
= (.f*(w2M(vj)
= bpi.
(2.2) Examples.
(1) Every ring has the trivial conjugation c(r) = r. Since Aut(Q) = {1Q),
it follows that the trivial conjugation is the only one on Q. The same
is true for the ring Z.
(2) The field C has the conjugation c(z) = z, where the right-hand side
is complex conjugation. (This is where the name "conjugation" for a
function c as above comes from.)
(3) The field Q[/] and the ring (where d is not a square) both
have the conjugation c(a + bVd-) = a - bf .
Because of Example 2.2 (2), we will write 'r, instead of c(r), to denote
conjugation.
(2.5) Proposition.
(1) Let 0 be a bilinear (reap., sesquilinear) form on M. Then am
M - M', defined by
am(y)(x) = O(x, y)
is an R-homomorphism (reap., R-antihomomorphism).
(2) Let a : M --+ M' be an R-homomorphism (neap., R-antihomo-
morphism). Then 0Q : M x M - R, defined by
ma(x, y) = a(y)(x)
is a bilinear (reap., sesquilinear) form on M.
Proof. Exercise.
(2.6) Examples.
(1) Fix s E R. Then ¢(rl, r2) = rlsr2 (reap., = rl02) is a bi- (reap.,
sesqui-) linear form on R.
(2) O(x, y) = xty is a bilinear form on Mn,l(R), and O(x, y) = x1V is a
sesquilinear form on Mn,1(R). Note that ji is obtained from y by entry
by entry conjugation.
(3) More generally, for any A E Mn(R), O(x, y) = xtAy is a bilinear form,
and O(x, y) = xtA'y is a sesquilinear form on Mn,l (R).
(4) Let M = Mn,,n(R). Then 4,(A, B) = Tr(AtB) (reap., ¢(A, B) _
TT(AtB)) is a bi- (reap., sesqui-) linear form on M.
(5) Let M be the space of continuous real- (reap., complex-) valued func-
tions on [0, 1]. Then
Of, 9) = f0
1 .f (x)9(x) dx
O(,f, 9) = fo f(x)9(x) dx
1
We will often have occasion to state theorems that apply to both bi-
linear and sesquilinear forms. We thus, for convenience, adopt the language
that 0 is a b/s-linear form means 0 is a bilinear or sesquilinear form. Also,
6.2 Bilinear and Sesquilinear Forms 353
the theorems will often have a common proof for both cases. We will then
write the proof for the sesquilinear case, from which the proof for the bi-
linear case follows by taking the conjugation to be trivial (i.e., r = r for all
r E R).
We will start our analysis by introducing the appropriate equivalence
relation on b/s-linear forms.
Our object in this section will be to derive some general facts about
b/s-linear forms, to derive canonical forms for them, and to classify them
up to isometry in favorable cases. Later on we will introduce the related
notion of a quadratic form and investigate it. We begin by considering the
matrix representation of a b/s-linear form with respect to a given basis.
these sets is an R-module in a natural way, i.e., via addition and scalar
multiplication of R-valued functions.
One obvious question is how the matrices of a given form with respect
to different bases are related. This is easy to answer.
O(x, y) = [x]c[0)c[v]c
But, also,
O(x, y) = [x1B[01e[v1e,
and if P = PP, then Proposition 4.3.1 gives
[x)g = P[x}c and [y] B = P[y]c.
Thus,
6.2 Bilinear and Sesquilinear Forms 355
_ 0(vi, vi)v:
i=1
In order to see that this is true we need only check that am(vj)(vk) =
0(vk, vj), which is immediate from the definition of ao (see Proposition
2.5). Then from the definition of A = [ao18. (Definition 4.3.3), we see that
A is the matrix with entij (A) = 0(vi, vj), and this is precisely the definition
of [0113 (Definition 2.8).
(2.18) Remarks.
(1) We do not define skew-Hermitian if 2 divides 0 in R.
(2) Let 0 be a b/s-linear form on M and let A be the matrix of ¢ (with
respect to any basis). Then the conditions on m in Definition 2.17
correspond to the following conditions on A:
(a) 0 is symmetric if and only if At = A;
(b) 0 is skew-symmetric if and only if At = -A and all the diagonal
entries of A are zero;
(c) 0 is Hermitian if and only if A 54 A and At = A; and
(d) 0 is skew-Hermitian if and only if A # A and At = -A (and hence
every diagonal entry of A satisfies a = -a).
(3) In practice, most forms that arise are one of these four types.
We introduce a bit of terminological shorthand. A symmetric bilinear
form will be called (+1)-symmetric and a skew-symmetric bilinear form
will be called (-1)-symmetric; when we wish to consider both possibilities
simultaneously we will refer to the form as c-symmetric. Similar language
applies with c-Hermitian. When we wish to consider a form that is ei-
ther symmetric (bilinear) or Hermitian (sesquilinear) we will refer to it
as (+1)-symmetric b/s-linear, with a similar definition for (-1)-symmetric
b/s-linear. When we wish to consider all four cases at once we will refer to
an c-symmetric b/s-linear form.
ao(y)(x) _ O(x, y) 0 0.
6.2 Bilinear and Sesquilinear Forms 357
Proof. This follows immediately from Lemma 2.16 and Proposition 4.3.17.
0
Note that if N is any submodule of M, the restriction 46N = 'IN of
any c-symmetric b/s-linear form on M to N is an c-symmetric b/s-linear
form on N. However, the restriction of a non-singular b/s-linear form is
not necessarily non-singular. For example, let 0 be the b/s-linear form on
M2,1(R) with matrix [ i a ] . If N1 = ([ o ]) and N2 = ([0]), then 01N, is
non-singular, but cIN2 is singular and, indeed, degenerate.
The following is standard terminology:
(2.24) Remark. Let N1 have a basis B1, N2 a basis B2, and let M = N1®N2,
in which case B = Bl U B2 is a basis of M. Then M = N1 1 N2 if and only
if
A 0
[0Jg = 0B
Conversely, if [O]B is of this form, and if N1 (reap., N2) denotes the span of
B1 (reap., B2)1 then M = N1 1 N2. In this case we will also say 0 _ 01 102
where 4i = 01,y, .
(2.26) Proposition. Let M be a finite rank free module over a PID R, and let
0 be an e-symmetric b/s-linear form on M. Then 0 is isometric to 00 1 01
defined on M° 1 M1, where 41o is identically zero on M° and 01 is non-
degenerate on M1. Furthermore, 00 and 01 are uniquely determined up to
isometry.
Proof. Note that M° is a pure submodule of M (since it is the kernel of a
homomorphism (Proposition 3.8.7)) and so it is complemented. Choose a
complement M1. Then M1 is free and M °_f M°®M1. We let db = 411M0 and
01 = 01 M, . Of course M° and M1 are orthogonal since M° is orthogonal
to all of M, so we have M = M° 1 Ml with 0 = 00 1 01. Also, if
mlEM1with 01(m'1,ml)=Ofor all m'EM1,then 41(m,ml)=0 for all
m E M, i.e., m1 E M°. Since M = M° ® M1, M° n M1 = (0) and so 01 is
non-degenerate.
The construction in the above paragraph is well defined except for the
choice of M1. We now show that different choices of M1 produce isometric
forms. Let it : M -+ M/M° = M'. Then M' has a form 0' defined as
follows: If x', y' E M', choose x, y E M with w(x) = x' and 7r(y) = y'.
Set 41'(x', y') = 41(x, y), and note that this is independent of the choice
of x and y. But now note that regardless of the choice of M1, not only is
RIM, : M1 - M' an isomorphism, but is in fact an isometry between 01
and 0'. 0
The effect of this proposition is to reduce the problem of classifying
e-symmetric b/s-linear forms to that of classifying non-degenerate ones. It
also says that the following definition does indeed give an invariant of such
a form.
and
0 0 0
0 -1 1
0 1 -3
(2.30) Examples.
(1) Ml = M°.
(2) If N C M°, then N1 = M.
(3) Let 0 be the b/s-linear form on M2,1(R) whose matrix with respect to
the standard basis is
(a) [o 1J
(b) [1 OJ;
360 rChapter 6. Bilinear and Quadratic Forms
(c) 0l ; and
l0
1 OJJ
(d) 0 0
(2.33) Remark. Note that in Lemma 2.31 and Proposition 2.32 there is no
restriction on 0, just on GIN. The reader should reexamine Examples 2.30
in light of the above lemma and proposition.
6.2 Bilinear and Sesquilinear Forms 361
(2.34) Corollary. Let R be a PID and M a free R-module of finite rank. Let
N be a pure submodule of M with 41N and OINJ. both non-singular. Then
(N')' = N.
Proof. We have M=NIN' = (N')' _L N-L. But it is easy to check
that (N1)1 D N, so they are equal. 0
(n - k) I
r0 1 0 ell 1[ -e2
0 e2
1 ... 0 ek
-1 0 -el 0J 0 -ek 0
basis {w1, ... ,w,,,} of M' such that {f1w1, ... , fmwm} is a basis of the
submodule Im(am). Since fl I f2 I I fm, we see that lm(am) C f1 M',
i.e., O(v1, v2) is divisible by fl for every v1, v2 E M. Let x1, ..., x,,, be the
dual basis of M, that is, wi(xj) = b,j. Let yl E M with am(yl) = f1w1. If
ax1 + by, = 0, then
0 = O(xi, ax, + by,)
f0
)s = [ l 0f ]
.
Note that N is a pure submodule. To see this, suppose az = bx1 +cy1 where
a, b, c E R; by cancelling common factors we can assume that gcd{a, b, c} =
1. Then O(xl, ax) = .(xl, bxl + cyl) = cfl, while O(xl, az) = a4(x, z) =
adfi since O(v1, v2) is divisible by fl for all v1, v2 E M. Thus, ad = c, i.e.,
arc.
A similar computation with 0(yi, az) shows that a I b, so a is a common
divisor of a, b, and c, i.e., a is a unit and z E N. By Proposition 2.32,
M = Hull(N 1 N'). But, in fact, M = N 1 N'. To see this, let us
consider the form 0' defined by
O'(vi, v2) = fl 10(Vl, V2)
Then N' is also the orthogonal complement of N with respect to ,b' and
O'IN is non-singular, so M = N 1 N', i.e., io = ip 1 ¢1 with ml = OINI
Note that N ' / Im(ay,) has "invariant factors" (in the above sense)
f1 and fl, so we see that f2 = fl. Then the "invariant factors" of
(N-L)'/ Im(am,) are f3, ..., f,,,, and the theorem follows by induction.
(2.36) Corollary. Let R be a PID, and let m be a non-singular skew-
symmetric bilinear form on a free R-module M of finite rank. Then
rank(M) = 2n is even, and 0 is isometric to
n[0101'
Proof.
(2.37) Examples.
(1) Consider the skew-symmetric bilinear form 0 over Z with matrix
0 2 0 -2
A= -2 0 -2 -8
0 2 0 4
2 8 -4 0
from which we see that the invariant factors are (2, 2, 6, 6) and, hence,
that 0 is isometric to
[ -2 0, 1 [ -6 0]
(2) Let A be any invertible n x n skew-symmetric matrix over a field F,
i.e., At = -A and the diagonal entries are 0. Then A is the matrix of
a non-singular skew-symmetric form over F, and hence, PAP = mJ
where m = n/2 and J = [ 01 0]. Then
det(A)(det(P) )2 = det(nJ) = 1.
In particular, det(A) is a square in F. Now let R = Z[Y] where Y =
{Xij : 1 < i < j < n}, that is, R is the polynomial ring over Z in
the (2) indeterminates X;j for 1 < i < j < n. Let F = Q(Y) be the
quotient field of R and let A E Mn(F) be the skew-symmetric matrix
with entij = X;j for 1 < i < j < n, ent;, = -X.- for 1 < j < i < n,
and entii = 0. Then det(A) = P(Xij)2 for some element P(Xij) E F.
But R is a UFD, and hence, the equation Z2 = det(A) has a solution
in F if and only if it has a solution in R. Thus, P(Xij) E R, and since
P(Xi3) is a root of a quadratic equation, there are two possibilities for
the solution. We choose the solution as follows. Choose integers xij so
364 Chapter 6. Bilinear and Quadratic Forms
that the evaluation of the matrix A at the integers xij gives the matrix
mJ. Then choose P(Xij) so that the polynomial evaluation P(xij) =
det(mJ) = 1. Then we will call the polynomial P(Xi3) E Z[Xi)J the
generic Pfaffian and we will denote it Pf(A).
If S is any commutative ring with identity, then the evaluation Xij
bi3 induces a ring homomorphism
71: -+ S.
for some elements a1, a2, ..., an E R. Here [ail denotes the form on
R whose matrix is [ail, i.e., the form O(rl, r2) = rlair2. (The terminol-
ogy "diagonalizable" is used because in the obvious basis 0 has the matrix
diag(al, a2, ... an)-)
6.2 Bilinear and Sesquilinear Forms 365
[b c,
with a 54 0. Let e = b/a. Then
1 [ab
Ie c] [0 1] - [0 d,
1)
(with d = ae2 + c) and 0 is diagonalized.
Now suppose that rank(M) > 3. Find x as above with 0(x, x) = a 0 0
and write M = N I N1 as above. Pick y E N1. If 0(y, y) 0 0, then
OINl is odd and we are done (because we can apply induction). Thus,
suppose 0(y, y) = 0. Since VP = OIN1 is non-singular, there is z E N1 with
0(y, z) = b # 0. If 0(z, z) 0 0, then ik is odd and we are again done by
induction. Thus suppose 0(z, z) = 0. Let M1 be the subspace of M with
basis {x, y, z} and let 01 = OIM, . Then, in this basis 01 has the matrix
a 0 0
A= 0 0 b
0 b 0
in a basis, which we will simply denote by {x', y', z'}. Now let N be the
subspace of M spanned by x', and so, as above, M = N 1 N1. But now
0 = OI N1 is odd. This is because y' E N1 and t,i(y', y') = be 0 0. Thus,
we may apply induction and we are done. 0
(2.43) Example. We will diagonalize the symmetric bilinear form with ma-
trix
6.2 Bilinear and Sesquilinear Forms 367
0 1 2
A= 1 0 3
2 3 0
The reader should not be under the impression that, just because we
have been able to diagonalize symmetric or Hermitian forms, we have been
able to classify them. However, there are a number of important cases where
we can achieve a classification.
for some ri E R. Note that the multiplicative group R' has even order,
so the squares form a subgroup of index 2. Then det(4)) is a square or a
nonsquare accordingly as there are an even or an odd number of nonsquares
among the {r,}. Thus, the theorem will be proved once we show that the
form [ri] I [rj] with r, and rj both nonsquares is equivalent to the form
[1] 1 (s] for some a (necessarily a square).
Thus, let [0]g = [ ] in a basis B = {v1, V2} of M. R has an odd
o°
number of elements, say 2k + 1, of which k + 1 are squares. Let A = {a2r1 :
a E R} and B = {1-b2r2 : b E R}. Then A and B both have k+1 elements,
soAf1B00.Thus, forsome ao,b0ER,
ar 1 = 1 - ba re,
4r, +b 0 r2 = 1.
Let N be the subspace of M spanned by a0v1 + bov2. Then 4)]N = [1] and
M = N 1 N- L, so, in an appropriate basis, 0 has the matrix [ o , J, as
claimed. 0
Pt[0Js1P = 101s21
and taking determinants shows that ab2 = ccab1 (where c = det(P)); so
g : NIL -+ N2' defined by 9(vi) = c-1vs gives an isometry between cIN.
and 0IN.L.
Next let m > 3 and consider the submodule N12 of M with basis
{v1, v2}, where v1 generates N1, and v2 = f(v1) generates N2. Then
are also isometric. But in this case M = N12 1 N12, from which it readily
follows that
(N: nN12)1N 2=N; ,
yielding the theorem.
Now suppose WI N1, is singular. Then there is a 0 0 w E N12 with
¢(v, w) = 0 for all v E N12. Suppose there is a v3 E M with q5(v3i w) 0 0.
(Such an element v3 certainly exists if 0 is non-singular on M.) Of course,
v3 0 N12, so {v1, v2, v3} form a basis for a submodule N123 of M of rank
3 and 0IN1 is non-singular. (To see this, consider the matrix of 0 in the
basis {v1, w, v3} of N123.) Now for i = 1, 2,
N123=Ni1(Ni nN123)
and w E Ni n N12 3 with 4,(w, w) = 0, so there is a basis {w, w1 } of
Ni n N123 with 0 having matrix [° aj in that basis (with ai = di). We
claim that any two such forms are isometric, and this will follow if, given
anyaERwitha=Z, there is abERwith
Lb 1J 11 a110 1J - 11 0J
If char(R) 96 2, take b = -a/2 (and note that b = 6). If char(R) = 2, let
c E R with c 34 c (which exists as we are assuming that 0 is Hermitian)
370 Chapter 6. Bilinear and Quadratic Forms
and let b = ac/(c+c). Hence, tINi nN,23 and 0IN2 nN,23 are isometric, and
M = N123 1 N12 3 (as OI N, 2 3 is non-singular); so, as in the case 01 N, 2 is
non-singular, it follows that 0IN. and OINz are isometric.
It remains to deal with the case that O(v, w) = 0 for every v E M.
We claim that in this case N1 = N2, so the theorem is true. To see this,
let B1 2 = {V1, V21 and extend B1 2 to a basis B = {v1 i ... , vm } of M. Let
A = [O]B. Then
PtAP = rA
L0 0
*
(the right-hand side being a block matrix). This is [¢]B' in the basis
13 ={v1i712,7/3,...,vm}
and then Nl = N2 is the subspace with basis
{w, v3, ... , vm}
Note that such an element exists by the proof of Theorem 2.42. Let N11
be the subspace of M generated by v1 and let N21 be the subspace of M
generated by v2. Then
M=N111(NinN1)1NIL =N211(N,nN2)1N2.
Then the case n = 1 of the theorem implies that
(NL1nN1)1Ni and (N21nN2)1N2
are isometric, and then the inductive hypothesis implies that NIL and N2
are isometric, proving the theorem.
(2.48) Corollary. Let 01, 02, and 13 be forms on modules of finite rank
over a field R, all three of which are either symmetric or Hermitian. If
char(R) = 2, assume all three are Hermitian. If 01 is non-singular and
01 1 02 and 01 1 03 are isometric, then 02 and 03 are isometric.
Proof.
(2.49) Remark. Witt's theorem is false in the case we have excluded. Note
that
1 0 0 1 0 0
0 0 1 and 0 1 0
0 1 0 0 0 1
are isometric forms on (Fr2)3, as they are both odd and non-singular, but
I 0 1J and [01
1]
are not isometric on (F2)2, as the first is even and the second is odd.
r[1] 1 s[-1]
with r+s = n = rank(M). Furthermore, the integers r ands are well defined
and 45 is determined up to isometry by rank(o) = n, and signature(O) =
r-s
Proof. Except for the fact that r and s are well defined, this is all a direct
corollary of Theorem 2.42. (Any two of n, r, and s determine the third, and
we could use any two of these to classify 0. However, these determine and
are determined by the rank and signature, which are the usual invariants
that are used.) Thus, we need to show that r and s are well defined by 0.
To this end, let M+ be a subspace of M of largest dimension with OPM+
positive definite. We claim that r = rank(M+). Let B = {v1, ... be a
basis of M with
[0113 = Ir ®-I,.
If Ml = (v1, ... , vr), then OIM, is positive definite. Thus, rank(M+) > r.
This argument also shows that if M_ is a subspace of M of largest possible
dimension with 01M_ negative definite, then rank(M_) > s.
We claim that r = rank(M+) and s = rank(M_). If not, then the
above two inequalities imply
Proof. It is tempting, but wrong, to try to prove this as follows: The form
0 is diagonalizable, so just diagonalize it and inspect the diagonal entries.
The mistake here is that to diagonalize the form 0 we take PAP, whereas
to diagonalize A we take PAP-', and these will usually be quite different.
For an arbitrary matrix P there is no reason to suppose that the diagonal
entries of P=AP are the eigenvalues of A, which are the diagonal entries of
PAP`.
On the other hand, this false argument points the way to a correct
argument: First note that we may write similarity as (P)-'AP. Thus if P
is a matrix with P' = (P)'', then the matrix B = PtAP will have the
same eigenvalues as A.
Let us regard A as the matrix of a linear transformation a on R" where
R = R or R = C. Then A is either real symmetric or complex Hermitian.
In other words, a is self-adjoint in the language of Definition 4.6.16. But
then by Theorem 4.6.23, there is an orthonormal basis B = {v,, ... , v"} of
R" with B = [a]B diagonal, i.e., P-'AP = B where P is the matrix whose
columns are v1, ..., v,,, and furthermore B E M"(R), where n = rank(A).
But then the condition that B is orthonormal is exactly the condition that
Pt = (15)-1, and we are done. 0
1 0 -3 1
L 0 1 1 -1
To do this, calculate that the determinants of the principal minors are
bo(A) = 1, 61(A) = -2, 62(A) = 3, 63(A) = -3, 64(A) = -59
giving 3 sign changes, so this form diagonalizes to
-1 0 0 0
0 -1 0 0
0 0 -1 0
0 0 0 1
choices. Choose a basis B for M. Then we have the dual basis B' of M*.
Recall that [f']B. = ([f]a)t (Proposition 1.24), so
[f']s = [a0']e* ([f]e)`
On the other hand, if B = {v;}, then we have a basis C' of M' given by
C' = (am(v1)}. Then there is also a basis C dual to C' (using the canonical
isomorphism between M and M**). By definition, I,,, the identity
matrix, where n = rank(M), so we have more simply
4':M-+R
defined by
4i(x) = t'(x, x)
is a quadratic form on M with associated bilinear form
Proof. Part (1) is obvious from Definition 3.1 (2) and is merely stated for
emphasis. To prove (2), suppose that ' is associated to 4;. Then for every
x E M,
44'(x) = 4'(2x) = 4'(x + x) = 24'(x) + 45(x, x).
Thus,
(3.1) 45(x, x) = 24'(x)
and 0 is even.
Conversely, suppose that 0 is even. Let B = {vi}gEJ be a basis for M
and choose an ordering of the index set I. (Of course, if rank(M) = n is
finite, then I = 11, ... , n} certainly has an order, but it is a consequence
of Zorn's lemma that every set has an order.) To define O(x, y) it suffices to
define ii(vi, vj) for i, j E I. Define i/i(vi, v3) as follows: ?P(vi, vj) = 0(vi, vi)
if i < j, tI'(vi, vj) = 0 if j < i, and ?P(vi, vi) is any solution of the equation
2:fi(vi, vi) = QS(vi, vi). Since 0 is assumed to be even, this equation has a
solution. Then 4'(x) = 1li(x, x) is a quadratic form with associated bilinear
form 4'.
(3) This is a direct consequence of Equation (3.1). 0
(3.3) Lemma. (1) Let 4ii be a quadratic form on a module M1 over R with
associated bilinear form 0j, for i = 1, 2. Then 4i : M1 (D M2 -+ R, defined
by
4'(x1,x2) = 01(x1) +'02(x2),
is a quadratic form on M1 ® M2 with associated bilinear form 451 102 (In -
We call 4' the orthogonal direct sum of 4'1 and 4'2. Thus Lemma 3.3
tells us that the procedures of forming orthogonal direct sums of associated
bilinear and quadratic forms are compatible.
It follows from Theorem 3.2 that if 2 is not a zero divisor in R, the
classification problem for quadratic forms over R reduces to that for even
symmetric bilinear forms. (Recall that if 2 is a unit of R, then every sym-
metric bilinear form is even.) Thus we have already dealt with a number of
important cases-R = R, R = C, or R a finite field of odd characteristic.
We will now study the case of quadratic forms over the field F2 of 2 ele-
ments. In this situation it is common to call a quadratic form 4' associated
to the even symmetric bilinear form 0 a quadratic refinement of 0, and we
shall use this terminology. Note that in this case condition (1) of Definition
3.1 simply reduces to the condition that 4'(0) = 0, and this is implied by
condition (2)---set y = 0. Thus, we may neglect condition (1).
f(x) _'F1(x)+4'2(x)
is an R-linear function f : M R, i.e., f E M*.
(2) If 'F1 is any quadratic refinement of ¢ and f E M' is arbitrary, then
t2 ='F1 + f is also a quadratic refinement of 0.
Proof. These are routine computations, which are left to the reader.
e(1) = 3, o(1) = 1
and
Arf(4') = 1 if 14-1(0)l = o(m) and I0-1(1)1 = e(m).
The form 0 is called even or odd according as Arf(0) = 0 or 1.
4,(x) = 1 4,(y) = 0 0
4(x) = 0 Cy) = 1 0
4(x) = 0 4,(y) = 0 4(z) = I
O(x) = 1 Cy) =1 $(z) = 1.
The first three have Arf(4b) = 0; the last one has Arf($) = 1. But then
it is easy to check that AutF, (M) S3, the permutation group on 3 ele-
ments, acting by permuting x, y, and z. We observed that 0 is completely
symmetric in x, y, and z, so S3 leaves 0 invariant and permutes the first
three possibilites for 4' transitively; hence, they are all equivalent. (As there
is only one 4' with Arf(4') = 1, it certainly forms an equivalence class by
itself.)
Now let m > 1. We have that 4' is isometric to 41 1 . . . 1 ',,, (by
Lemma 3.3) and
Arf(4') = Arf(41) + - + Arf(4'm)
(by Corollary 3.9), so Arf(4') = 0 or 1 accordingly as there are an even
number or an odd number of the forms 4'i with Arf(4'i) = 1. Each 4i has
rank 2, and we have just seen that all rank 2 forms 4i with Arf(4'i) = 0 are
isometric. Thus to complete the proof we need only show that if ' is the
unique rank 2 form with 1, then T 1'P is isometric to +i 1 4D2
with Arf(4,) = 0 for i = 1, 2.
Let 0 be the bilinear form associated to 4' _' 1 41 and let M have a
basis B = {21, y1, x2, 112} in which
0 1 0 0
[Ole _ 1 0 0 0
0 0 0 1
0 0 1 0
i=1
(2) Let c = (a1 i b1, ... , a., bm) E (F2)2m be arbitrary. If v E M, then
v = Ei ` 1(rix; + siyi), for ri, si E F2, and we define
m m m
-tc(v) _ risi + airi + bisi.
i=1 i=1 i=1
The function 4 : M -- F2 is a quadratic refinement of 0, and
m
Arf(4) _ aibi.
i=1
Proof. Exercise.
384 Chapter 6. Bilinear and Quadratic Forms
(3.12) Remark. The reader should not get the impression that there is
a canonical 4?, obtained by taking c = (0, ... , 0) of which the others are
modifications; the formula for Lc depends on the choice of symplectic basis,
and this is certainly not canonical.
(3.13) Definition.
(1) Let 0 be an arbitrary b/s-linear form on a free module M over a ring
R. Then the isometry group Isom(q) is defined by Isom(0) =
If E AutR(M) : 0(f (x), f (y)) = OX, y) for all x, y E M}.
(2) Let 4P be an arbitrary quadratic form on a free module M over a ring
R. Then the isometry group Isom(4)) is defined by
Isom(') = If E AutR(M) : 4?(f (x)) = 4i(x) for all x E M}.
But
6.3 Quadratic Forms 385
for all x, y E M. Comparing Equation (3.5) and Equation (3.6) gives the
result. O
(3.16) Remarks.
(1) If M has infinite rank then f need not be an isomorphism in the
situation of Corollary 3.15. For an example, let M = Q°° =
with ¢ the bilinear form on M given by
o°
O((xl, x2, ... ), (y1, y2, ... )) _ xiyi
i=1
Then f : M M, defined by
[f]`[f]=12.
Geometrically, this means that an isometry of R2 is determined by a pair
of orthonormal vectors of R2, namely, the first and second columns of [f].
Hence, the isometries of R2 (with respect to the standard inner product)
are one of the two types:
386 Chapter 6. Bilinear and Quadratic Forms
_ cos8 -sing
Pe sin 8 cos 0
(3.19) Lemma. Let M, 4), M1, and y be as in Definition 3.18, and let 0 be
the symmetric bilinear form associated to it.
(1) fm, E Isom(4)), and (fM,)2 = 'Al.
(2) If g E Isom(4'), then 9fM,9-1 = f9(nf,.
(3) fy is given by the formula
Proof. Exercise. 0
Since a hyperplane reflection f,,, is an isometry, it is certainly true that
+(x) = 4)(f,,,(x)). The following lemma gives a simple criterion for the
existence of a hyperplane reflection that interchanges two given points.
Then fw(x) = y.
6.3 Quadratic Forms 387
Proof. Exercise. 0
As an easy application of reflections, we will present another proof of
Witt's theorem (Theorem 2.47). The difficult part of the proof of Theorem
2.47 was the n = 1 step in the induction. When reflections are available,
this step is very easy.
The next two lemmas are technical results needed in the proof of the
Cartan-Dieudonn6 theorem.
If O (x) 36 0 for all x 3& 0 E M, then this follows from the first two
cases. Thus, suppose there exists x 0 0 E M with 4i(x) = 0. Choose y E M
such that 4(x, y) 54 0, which is possible since 0 is non-singular. Since
O(x, rx) = rq$(x, x), it is clear that B = {x, y} is a linearly independent
subset of M and, hence, a basis. Replacing y by a multiple of y, we may
assume that O(x, y) = 1. Furthermore, if r E R, then
.O(y + rx, y + rx) = 0(y, y) + 2r4(x, y)
so by replacing x with a multiple of x we can also assume that ¢(y, y) = 0.
That is, we have produced a basis B of M such that
0 1
1
0
ab [0 1 [0 1
01 Lc 0J
This equation implies that
or b]
[9)s = I 0 ao l ] [9)s = I b ,
Since
a
[0
01] =
-0
a-[.0, a 0
1 0'
1
[ 0 0(y, z)1
('01B =
0(y, z) O(z, z) J
390 Chapter 6. Bilinear and Quadratic Forms
Proof. (1) Let g = fy, fy. and let Nj = Ker(fy, - 1). Then
dim(Ni n - r.
(2) This follows immediately from (1).
(3.26) Corollary.
(1) If dim M = 2 then every isometry of determinant -1 is a reflection.
(2) If dim M = 3 and g is an isometry of determinant 1, then g is the
product of 2 reflections.
Proof. Exercise.
6.4 Exercises
0 0 0 0 1 2 1 0
0 0 0 0 0 1 2 0
0 0 0 0 1 0 0 2
0 1 1 -1 2 1
-1 0 5 -3 2 -3
-1 -5 0 3 -7 4
(a)
1 3 -3 0 6 5
-2 -2 7 -6 0 1
-1 3 -4 -5 -1 0
0 0 -6 -6 -6 -8
0 0 -6 -7 -7 -9
6 6 0 -1 -5 -7
(b)
6 7 1 0 -6 -8
6 7 5 6 0 0
8 9 7 8 0 0
12. Find the signature of each of the following forms over R. Note that this also
gives their diagonalization over R.
6.4 Exercises 393
5 40 9 5 -11 3
(a) 40 50 12 (b) -11 3 -12
9 12 3 3 -12 0
2 -4 6 1 3 4
(c) -4 13 -12 (d) 3 9 5
6 -12 18 4 5 0
2 1 0 0 0 0
1 2 1 0 0 0
(e) 0 1 2 1 0 0
0 0 1 0 1 0
0 0 0 1 -2 1
0 0 0 0 1 -2
Also, diagonalize each of these forms over Q.
13. Carry out the details of the proof of Lemma 2.41.
14. Analogous to the definition of even, we could make the following definition:
Let R be a PID and p a prime not dividing 2 (e.g., R = Z and p an odd
prime). A form 0 is p-ary if O(x, x) E pR for every x E M. Show that if 0 is
p-ary, then O(x, y) E pR for every x, y E M.
15. Prove Proposition 3.4.
16. Prove Lemma 3.6.
17. Prove Proposition 3.11.
18. Prove Lemma 3.19.
19. Prove Lemma 3.20.
20. Let f E AUtR(M) where R = R or C and where rank(M) < oo. If fk = Im
for some k, prove that there is a non-singular form 0 on M with f E Isom(O).
21. Diagonalize the following forms over the indicated fields:
13 4 6
(a) 4 7 8 over F2, F3, F5, and Q
6 8 2
1 4 1
(b) 4 -2 10 over F2, F3, F5, and Q
-1 10 4
22. A symmetric matrix A = (a;2 J E M (R) is called diagonally dominant if
a1+ ? laii l
i#l
for 1 < i < n. If the inequality is strict, then A is called strictly diagonally
dominant. Let 0 be the bilinear form on R" whose matrix (in the standard
basis) is A.
(a) If A is diagonally dominant, show that 0 is positive semidefinite, i.e.,
O(x, x) >0 for all x E R".
(b) If A is strictly diagonally dominant, show that 0 is positive definite.
23. Let 0 be an arbitrary positive (or negative) semidefinite form. Show that
is non-degenerate if and only if it is positive (negative) definite.
24. Let R be a ring with a (possibly trivial) conjugation. Show that
{P E GL(n, R) : Pt = (F)_'}
is a subgroup of GL n, R). If R = R, with trivial conjugation, this group is
called the orthogonal group and denoted O(n), and if R = C, with complex
conjugation, it is called the unitary group and denoted U(n).
394 Chapter 6. Bilinear and Quadratic Forms
A= ab b
c
This chapter will be concerned with collecting a number of results and construc-
tions concerning modules over (primarily) noncommutative rings that will be
needed to study group representation theory in Chapter 8.
(1.8) Definition.
(1) If R is a ring (not necessarily commutative) and M is an R-module,
then a chain of submodules of M is a sequence {M;} o of submodules
of M such that
(1.1) (0)=M0 5M15M2
The length of the chain is n.
(2) We say that a chain (Njo is a refinement of the chain {M;} o
if each M, is equal to N3 for some j. Refinement of chains defines a
partial order on the set C of all chains of submodules of M.
(3) A maximal element of C (if it exists) is called a composition series of
M.
398 Chapter 7. Topics in Module Theory
(1.9) Remarks.
(1) Note that the chain (1.1) is a composition series if and only if each of
the modules M;/M;_1 (1 < i < n) is a simple module.
(2) Our primary interest will be in decomposing a module as a direct sum
of simple modules. Note that if M = ® 1 M; where M; is a simple
R-module, then M has a composition series
n
(0) 511115Ml®M25...5®M:=M.
;=1
(1.10) Examples.
(1) Let D be a division ring and let M be a D-module with a basis
{xl,...,xm}. LetMo=(0)andfor1<i<nletMi=(xi,...,x,).
Then {M;} o is a chain of submodules of length n, and since
composition series and hence it may be refined until its length is e(M), at
which time it will be a composition series.
Proof. Let
(0)=Ko5K15...5Kn=K
be a composition series of K, and let
(0)=Lo 5L15...5L,n=L
7.1 Simple and Semisimple Rings and Modules 401
be a composition series for L. For 0 _< i < n, let Mi = q(Ki), and for
n + 1 < i < n + m, let M. = O-1(Li_n) Then {Mi}n o'" is a chain of
submodules of M and
Mi/Mi_1
Ki1Ki_1 for 1 < i < n
for n+ l< i< n+ m
so that {M}°on is a composition series of M. Thus, £(M) = n + m. O
M a` ®Rl(p:')
i=1
Then it is an easy exercise to check that M is of finite length and
k
C(M) _ ei.
i=1
(1.4) M ®(I'aMa)
aEA
where {Ma}aEA is a set of pairwise distinct (i.e., M. MP if a ,l3)
simple modules. Equation (1.4) is said to be a simple factorization of the
semisimple module M. Notice that this is analogous to the prime factor-
ization of elements in a PID. This analogy is made even more compelling
by the following uniqueness result for the simple factorization.
(1.5) M ® (raMa)
aEA
and
(1.6) N ® (AONO)
OEB
where {Ma}aEA and (N0}OEB are the distinct simple factors of M and N,
respectively. If M is isomorphic to N, then there is a bijection V) : A B
such that Ma ^_' No(,,,) for all a E A. Moreover, Iral < oo if and only if
IA,p(a)I < oo and in this case Iral = IAw(a)I
Proof. Let ' : M - N be an isomorphism and let a E A be given. We may
write M Ma®M' with M' = ®yEA\{a} (ryM.y)®raMa where rQ is ra
with one element deleted. Then by Proposition 3.3.15,
HomR(M, N) = HomR(Ma, N) ® HomR(M', N)
(1.7) = (®AO HomR(Ma, NO)) ® HomR(M', N).
8EB
By Schur's lemma, HomR(Ma, NO) = (0) unless M. ^_' NO. Therefore,
in Equation (1.7) we will have HomR(Ma, N) = 0 or HomR(M0, N)
AO HomR(M0, NO) for a unique 3 E B. The first alternative cannot occur
since the isomorphism 0 : M - N is identified with (0 0 1-1, 10 0 1.2) where
tl : M. --+ M is the canonical injection (and 12: M' M is the injection).
If HomR(M,,, N) = 0 then 0 o t1 = 0, which means that 01m. = 0. This is
impossible since 0 is injective. Thus the second case occurs and we define
0(a) _ 3 where HorR(M0, NO) # (0). Thus we have defined a function
0:A B, which is one-to-one by Schur's lemma. It remains to check that
0 is surjective. But given 0 E B, we may write N NO ® N'. Then
HomR(M, N) = HomR(M, NO) ® HomR(M, N')
and
HomR(M, NO) = [ (11 HomR(Ma, NO)).
aEA ra
Since 0 is surjective, we must have HomR(M, NO) # (0), and thus, Schur's
lemma implies that
HomR(M, NO) H Hom(M0, NO)
ra
for a unique a E A. Then i (a) = /3, so ip is surjective.
According to Proposition 3.3.15 and Schur's lemma,
M - (D (I'QMQ) ® (ApNN)
aEA (3EB
with distinct simple factors {MQ}QEA and {Nfl}pEB. Then there is a bijec-
tion tli : A - B such that MQ N p(Q) for all a E A. Moreover, jr,,, I < 00
if and only if IA,(Q) I < oo and in this case II',, I = IAy(Q) I.
Proof. Take 0 = 1M in Theorem 1.18. 0
(1.20) Remarks.
(1) While it is true in Corollary 1.19 that MQ - N,,(,) (isomorphism as
R-modules), it is not necessarily true that MQ = N,,(0). For example,
let R = F be a field and let M be a vector space over F of dimension
s. Then for any choice of basis {ml, ... , m,} of M, we obtain a direct
sum decomposition
Rm,.
(2) In Theorem 1.18 we have been content to distinguish between finite and
infinite index sets rQ, but we are not distinguishing between infinite
sets of different cardinality. Using the theory of cardinal arithmetic,
one can refine Theorem 1.18 to conclude that II'QI = IA>G(a)I for all
a E A, where ISI denotes the cardinality of the set S.
M'=N®(®Mi).
iEJ
S={PcI:Mp2, ®MiandMpnN=(0)}.
iEP
(1.8) x=xp,
where xp, 0 0 E Mp, for {pi, ... ,pk} C P. Since C is a chain, there is
an index ct E A such that {po, pi, ... ,pie} C PQ. Equation (1.8) shows
that Mpo 9t EIEP, MI, which contradicts the fact that PQ E S. Therefore,
we must have P E S, and Zorn's lemma applies to conclude that S has a
maximal element J.
Claim. N®((DiEJM1)
If this were not true, then there would be an index io E I such that
Mio 0 N + Mj. This implies that Mio ¢ N and Mio ¢ M. Since Mio n N
and M,, n Mj are proper submodules of Mio, it follows that M,o n N = (0)
and MionMJ = (0) because Mio is a simple R-module. Therefore, {io}UJ E
S, contradicting the maximality of J. Hence, the claim is proved. 0
(1.22) Corollary. If an R-module M is a sum of simple submodules, then
M is semisimple.
Proof. Take N = (0) in Theorem 1.21. 0
Proof. (1) . (2) follows from Lemma 1.21, and (3) (1) is immediate
from Corollary 1.22. It remains to prove (2) = (3).
Let Ml be a submodule of M. First we observe that every submodule of
Ml is complemented in Ml. To see this, suppose that N is any submodule of
MI. Then N is complemented in M, so there is a submodule N' of M such
7.1 Simple and Semisimple Rings and Modules 405
Since all R-modules are assumed projective, we have that RIM is pro-
jective, and hence (by Theorem 3.5.1) this sequence splits. Therefore,
R M ® N for some submodule N C R, which is isomorphic (as an
R-module) to RIM. Then by Theorem 1.23, R is semisimple. 0
(1.29) Corollary. Let R be a semisimple ring and let M be an R-module.
Then M is irreducible (simple) if and only if M is indecomposable.
Proof. 11
R = ®(I'aMQ),
oEA
R= ®Np
OEB
where each No is simple. We will show that B is finite, and then both
finiteness statements in the corollary are immediate from Theorem 1.30.
Consider the identity element 1 E R. By the definition of direct sum,
we have
1= 1: rpnp
OE B
for some elements rp E R, np E N0, with all but finitely many rp equal to
zero. Of course, each Np is a left R-submodule of R, i.e., a left ideal.
Now suppose that B is infinite. Then there is a go E B for which
r00 = 0. Let n be any nonzero element of Npo. Then
nE ® Np.
pEB\{po}
Thus,
n E No. n( @ No) _ {0},
pEB\{po}
408 Chapter 7. Topics in Module Theory
where {M,}k 1 are the distinct simple R-modules and nl, ... , nk are pos-
itive integers. Then R is anti-isomorphic to R°P, and
R°P = EndR(R)
k k
(
HomR (® niMi, ® niMil/
i=1 i=1
k
®HomR(niMM, niMi)
i=1
k
^_' ®EndR(niM,),
i=1
EndD(D") = ® Pij
as a right (resp., left) D-module, and each Pij is certainly simple (on either
side).
Remark. In the language of the next section, this definition becomes "A
ring R with identity is simple if it is simple as an (R, R)-bimodule."
From this lemma and Theorem 1.28, we see that if R is a field, then
every R-module (i.e., vector space) is semisimple and there is nothing more
to say. For the remainder of this section, we will assume that R is a PID
that is not a field.
Let M be a finitely generated R-module. Then by Corollary 3.6.9, we
have that M F ® M where F is free (of finite rank) and Mr is the
torsion submodule of M. If F 76 (0) then Lemma 1.37 shows that M is
not semisimple. It remains to consider the case where M = M i.e., where
M is a finitely generated torsion module. Recall from Theorem 3.7.13 that
each such M is a direct sum of primary cyclic R-modules.
(0) -A p`-1 M 5 M,
Proof. First suppose that M is cyclic, and me(M) = (pi' ... pr). Then
the primary decomposition of M is given by
MSM (R/(pi'))ED ...(D (Rl k'')),
(1.40) Remark. In the two special cases of finite abelian groups and linear
transformations that we considered in some detail in Chapters 3 and 4,
Theorem 1.39 takes the following form:
(1) A finite abelian group is semisimple if and only if it is the direct product
of cyclic groups of prime order, and it is simple if and only if it is cyclic
of prime order.
(2) Let V be a finite-dimensional vector space over a field F and let
T : V -+ V be a linear transformation. Then VT is a semisimple F[XJ-
module if and only if the minimal polynomial mT(X) of T is a product
of distinct irreducible factors and is simple if and only if its character-
istic polynomial cT(X) is equal to its minimal polynomial mT(X), this
polynomial being irreducible (see Lemma 4.4.11.) If F is algebraically
closed (so that the only irreducible polynomials are linear ones) then
VT is semisimple if and only if T is diagonalizable and simple if and
only if V is one-dimensional (see Corollary 4.4.32).
412 Chapter 7. Topics in Module Theory
(2.2) Examples.
(1) Every left R-module is an (R, Z)-bimodule, and every right S-module
is a (Z, S)-bimodule.
(2) If R is a commutative ring, then every left or right R-module is an
(R, R)-bimodule in a natural way. Indeed, if M is a left R-module,
then according to Remark 3.1.2 (1), M is also a right R-module by
means of the operation r,ir = rm. Then Equation (2.1) is
r(ms) = r(sm) = (rs)m = (sr)m = s(rm) = (rm)s.
(3) If T is a ring and R and S are subrings of T (possibly with R = S = T),
then T is an (R, S)-bimodule. Note that Equation (2.1) is simply the
associative law in T.
(4) If M and N are left R-modules, then the abelian group HomR(M, N)
has the structure of an (EndR(N), EndR(M))-bimodule, as follows. If
f E HomR(M, N), ¢ E EndR(M), and V) E EndR(N), then define
f ¢ = f o 0 and f = o f . These definitions provide a left EndR(N)-
module and a right EndR(M)-module structure on HomR(M, N), and
Equation (2.1) follows from the associativity of composition of func-
tions.
(5) Recall that a ring T is an R-algebra, if T is an R-module and the R-
module structure on T and the ring structure of T are compatible, i.e.,
r(tit2) = (rtl)t2 = ti(rt2) for all r E R and t1, t2 E T. If T happens
to be an (R,S)-bimodule, such that r(tlt2) = (rtl)t2 = tl(rt2) and
(tlt2)s = tl(t2s) = (tls)t2 for all r E R, s E S, and t1, t2 E T, then we
7.2 Multilinear Algebra 413
(2.4) RxS={>r;xsj:nENandr1ER,siESfor1<i<n}.
i-i
(2.3) Examples.
(1) If R is a ring, then a left R-submodule of R is a left ideal, a right
R-submodule is a right ideal, and an (R, R)-bisubmodule of R is a
(two-sided) ideal.
(2) As a specific example, let R = M2(Q) and let x = [a o]. Then the left
R-submodule of R generated by {x} is
xR = { [a0 ]:abEQ},
01
b
When considering bimodules, there are (at least) three distinct types
of homomorphisms that can be considered. In order to keep them straight,
we will adopt the following notational conventions. If M and N are left
R-modules (in particular, both could be (R, S)-bimodules, or one could be
an (R, S)-bimodule and the other a (R, T)-bimodule), then HomR(11/, N)
will denote the set of (left) R-module homomorphisms from M to N. If M
and N are right S-modules, then Hom_S(M, N) will denote the set of all
(right) S-module homomorphisms. If M and N are (R, S)-bimodules, then
Hom(R,s)(M, N) will denote the set of all (R, S)-bimodule homomorphisms
from M to N. With no additional hypotheses, the only algebraic structure
that can be placed upon these sets of homomorphisms is that of abelian
groups, i.e., addition of homomorphisms is a homomorphism in each situa-
tion described. The first thing to be considered is what additional structure
is available.
= (rif(mts))t+ (r2f(m2s))t
= rj (f (mis)t) + r2(f (mis)t)
= ri (sf t)(mi) + r2(sft)(m2),
where the third equality follows from the (R, S)-bimodule structure on M,
while the next to last equality is a consequence of the (R, T)-bimodule
structure on N. Thus, s f t is an R-module homomorphism for all s E S,
t E T, and f E HomR(M, N).
Now observe that, if s1i 32 E S and m E M, then
(s1(s2f )) (m) = (s2f)(ms1)
= f((ms1)s2)
= f(m(s1s2))
= ((s1s2)f) (m)
so that HomR(M, N) satisfies axiom (ci) of Definition 3.1.1. The other
axioms are automatic, so HomR(M, N) is a left S-module. Similarly, if t1i
t2 ETandmEM, then
((ft1)t2) (m) = ((ft1)(m))t2
= (f (m)t1) t2
= f(m)(t1t2)
= (f(t1t2)) (m)
Thus, HomR(M, N) is a right T-module by Definition 3.1.1 (2). We have
only checked axiom (Cr), the others being automatic.
It remains to check the compatibility of the left S-module and right
T-module structures. But, if s E S, t E T, f E HomR(M, N), and m E M,
then
((s f )t) (m) = (s f)(m)t = f (ms)t = (f t)(ms) = s(f t)(m).
Thus, (s f )t = s(f t) and HomR(M, N) is an (S, T)-bimodule, which com-
pletes the proof of the proposition. 0
Proved in exactly the same way is the following result concerning the
bimodule structure on the set of right R-module homomorphisms.
(2.6) Corollary.
(1) If M is a left R-module, then M' = HomR(M, R) is a right R-module.
(2) If M and N are (R, R)-bimodules, then HomR(M, N) is an (R, R)-
bimodule, and EndR(M) is an (R, R)-bialgebra. In particular, this is the
case when the ring R is commutative.
Proof. Exercise. 0
Remark. If M and N are both (R, S)-bimodules, then the set of bimod-
ule homomorphisms Hom(R,S)(M, N) has only the structure of an abelian
group.
(2.9) Ml 0-+ M -i M2 -i 0
is a sequence of (R, S)-bimodules and (R, S)-bimodule homomorphisms,
then the sequence (2.9) is exact if and only if the sequence
(2.11) miM_M2-+0
7.2 Multilinear Algebra 417
where c(m,n) E Z and all but finitely many of the integers c(m,n) are 0. Note
that F can be given the structure of an (R, T)-bimodule via the multipli-
cation
k k
(2.14) r E ci(mi, ni) t = > ce(rmei net)
e=1 i_1
where rER,tET,andc1,...,ckEZ.
Let K C F be the subgroup of F generated by the subset Hl U H2 U H3
where the three subsets H1, H2, and H3 are defined by
H1={(ml+m2,n)-(m1,n)-(m2,n)m1,m2EM,nEN}
H2={(m,ni+n2)-(m,ni)-(m,n2):mEM,n1,n2EN}
H3={(ms,n)-(m,sn):mEM, nEN, aES}.
418 Chapter 7. Topics in Module Theory
(2.9) Definition. With the notation introduced above, the tensor product of
the (R, S)-bimodule M and the (S, T)-bimodule N, denoted M ®s N, is the
quotient (R,T)-bimodule
M®sN=F/K.
If 7r : F - F/K is the canonical projection map, then we let m ®s n =
ir((m, n)) for each (m, n) E M x N C F. When S is clear from the context
we will frequently write m ®n in place of m ®s n.
Proof. Exercise. 0
Indeed, the tensor product M ®s N is obtained from the cartesian
product M x N by "forcing" the relations (2.16)-(2.18), but no others,
to hold. This idea is formalized in Theorem 2.12, the statement of which
requires the following definition.
Note that conditions (1), (2), and (3) simply state that for each m E M
the function gm : N -- Q defined by gm (n) = 9(m, n) is in Hom-T(N, Q)
and for each n E N the function g" : M -' Q defined by g"(m) = g(m, n)
is in HomR(M, Q). Condition (4) is compatibility with the S-module struc-
tures on M and N.
If it : F -+ M ®S N = F/K is the canonical projection map and
e : M x N F is the inclusion map that sends (m, n) to the basis element
(m, n) E F, then we obtain a map B : M x N - M ® N. According
to Proposition 2.10, the function 0 is S-middle linear. The content of the
following theorem is that every S-middle linear map "factors" through 0.
This can, in fact, be taken as the fundamental defining property of the
tensor product.
(2.13) Remarks.
(1) If M is a right R-module and N is a left R-module, then M OR N is
an abelian group.
(2) If M and N are both (R, R)-bimodules, then M OR N is an (R, R)-
bimodule. A particular (important) case of this occurs when R is a
commutative ring. In this case every left R-module is automatically a
right R-module, and vice-versa. Thus, over a commutative ring R, it
is meaningful to speak of the tensor product of R-modules, without
explicit attention to the subtleties of bimodule structures.
(3) Suppose that M is a left R-module and S is a ring that contains R as
a subring. Then we can form the tensor product S®R M which has the
structure of an (S, Z)-bimodule, i.e, S OR M is a left S-module. This
construction is called change of rings and it is useful when one would
like to be able to multiply elements of M by scalars from a bigger ring.
For example, if V is any vector space over R, then C ®R V is a vector
space over the complex numbers. This construction has been implicitly
used in the proof of Theorem 4.6.23.
(4) If R is a commutative ring, M a free R-module, and 0 a bilinear form
on M, then 0: M x M - R is certainly middle linear, and so 0 induces
an R-module homomorphism
4:M®RM-.R.
7.2 Multilinear Algebra 421
(2.14) Corollary.
(1) Let M and M' be (R, S)-bimodules, let N and N' be (S,T)-bimodules,
and suppose that f : M -+ M' and g : N -+ N' are bimodule homo-
morphisms. Then there is a unique (R,T)-bimodule homomorphism
(2.20) f ®g=f Os g: M®sN -+M'®sN'
satisfying (f ®g) (m (9 n) = f (m) ®g(n) for all m E M, n E N.
(2) If M" is another (R, S)-bimodule, N" is an (S, T)-bimodule, and f" :
M' -. M", g" : N' -. N" are bimodule homomorphisme, then letting
f®g:M®N-+M'®N' and f'®g':M'®N'---WON" be defined
as in part (1), we have
(f'®9)(f®g)=(f'f)0(9g):M®N-+M"®N".
Proof. (1) Let F be the free abelian group on M x N used in the definition of
M ®s N, and let h : F M®s N' be the unique Z-module homomorphism
such that h(m, n) = f (m) Os g(n). Since f and g are bimodule homomor-
phisms, it is easy to check that h is an S-middle linear map, so by Theorem
2.12, there is a unique bimodule homomorphism h : M 0 N -+ M' ® N'
such that h = h o 0 where 0 : M x N -+ M 0 N is the canonical map. Let
f ®g = h. Then
(f 0 g)(m (9 n) = h(m (9 n) = h o 0(m, n) = h(m, n) = f (m) 0 g(n)
as claimed.
(2) is a routine calculation, which is left as an exercise. 0
(2.15) Proposition. Let M be an (R, S)-bimodule. Then there are (R, S)-
bimodule isomorphisms
R®RM5M and M®sS:L, M.
(M ®s N) OT MOs (N®TP).
f: (M os N) OT P - M®s(N®TP)
satisfying f ((m®n)®p) = m®(n®p). Similarly, there is an (R, U)-bimodule
homomorphism
g:M®s(NOT P) (M®sN)®TP
satisfying gg(m (9 (n ®p)) = (m (9 n) ®p. Clearly, §1 (respectively j g-) is the
identity on elements of the form (m ®n) ®p (respectively, m ®(n (&p)), and
since these elements generate the respective tensor products, we conclude
that f and g are isomorphisms. O
M®sN®®(MM®sN3)
iEI jEJ
of (R, T) -bimodules.
Proof. Exercise. O
7.2 Multilinear Algebra 423
(2.19) Remark. When one is taking Horn and tensor product of various
bimodules, it can be somewhat difficult to keep track of precisely what
type of module structure is present on the given Hom or tensor product.
The following is a useful mnemonic device for keeping track of the various
module structures when forming Hom and tensor products. We shall write
RMS to indicate that M is an (R, S)-birnodule. When we form the tensor
product of an (R, S)-birnodule and an (S, T)-birnodule, then the resulting
module has the structure of an (R, T)-bimodule (Definition 2.9). This can
be indicated mnemonically by
(2.21) RMS ®S SNT = RPT
Note that the two subscripts "S" on the bimodules appear adjacent to
the subscript "S" on the tensor product sign, and after forming the tensor
product they all disappear leaving the outside subscripts to denote the
bimodule type of the answer (= tensor product).
A similar situation holds for Horn, but with one important differ-
ence. Recall from Proposition 2.4 that if M is an (R, S)-bimodule and N
is an (R, T)-bimodule, then HornR(M, N) has the structure of an (S,T)-
bimodule. (Recall that HomR(M, N) denotes the left R-module homomor-
phisms.) In order to create a simple mnemonic device similar to that of
Equation (2.21), we make the following definition. If M and N are left R-
modules, then we will write M fiR N for HomR(M, N). Using mR in place
of OR, we obtain the same convention about matching subscripts disap-
pearing, leaving the outer subscripts to give the bimodule type, provided
that the order of the subscripts of the module on the left of the OR sign are
reversed. Thus, Proposition 2.4 is encoded in this context as the statement
the result being an (S, R)-birnodule (see Proposition 2.5). Note that we
must reverse the subscripts on M and interchange the position of M and
N.
We shall now investigate the connection between Horn and tensor prod-
uct. This relationship will allow us to deduce the effect of tensor products
on exact sequences, using the known results for Hom (Theorems 2.7 and
2.8 in the current section, which are generalizations of Theorems 3.3.10 and
3.3.12).
(2.22)
Homs(Mi, HomT(N, P)) f Homs(M2, HomT(N, P))
1 02
(IN O).
HOMT(N (&s Ml, P) ____+ HOMT(N OS M2, P)
Remark. Note that Theorems 2.20 and 2.21 are already important results
in case M1=M2=Mandt=1M.
As a simple application of adjoint associativity, there is the following
result.
HomR(P, A) HomR(P, B)
and this completes the proof.
426 Chapter 7. Topics in Module Theory
(2.29) 0
is short exact, the tensored sequence (2.25) need not be part of a short exact
sequence, i.e., the initial map need not be injective. For a simple situation
where this occurs, take m = n in Example 2.32 (1). Then exact sequence
(2.30) becomes
Zn001- ' Zn - Zn -+ 0.
The map 1 is the zero map, so it is certainly not an injection.
This example, plus our experience with Hom, suggests that we consider
criteria to ensure that tensoring a short exact sequence with a fixed module
produces a short exact sequence. We start with the following result, which
is exactly analogous to Theorem 2.8 for Hom.
(2.33) 0-pM1-.M-.M2-+0
is a split short exact sequence of (S, R) -bimodules, then
(2.35) 0 N ®T M1
14 N ®T M 1- N ®T M2 --i 0
is a split short exact sequence of (R, S)-bimodules.
Proof. We will do sequence (2.34); (2.35) is similar and is left as an exercise.
Let of : M - M1 split 0, and consider the map
a®1:Al OR N-'M1OR N.
Then
((a®1)(O®1))(m(9 n) = (aO(9 1)(m®n) = (1(9 1)(m(& n) = m®n
so that 0®1 is an injection, which is split by a®1. The rest of the exactness
is covered by Theorem 2.23. 0
(2.26) Remark. Theorems 2.7 and 2.23 show that given a short exact se-
quence, applying Hom or tensor product will give a sequence that is exact
on one end or the other, but in general not on both. Thus Horn and tensor
product are both called half exact, and more precisely, Hom is called left
exact and tensor product is called right exact. We will now investigate some
conditions under which the tensor product of a module with a short exact
sequence always produces a short exact sequence. It was precisely this type
of consideration for Hom that led us to the concept of projective module.
In fact, Theorem 3.5.1 (4) shows that if P is a projective R-module and
7.2 Multilinear Algebra 429
0- M1 0+M &M2-i0
is a short exact sequence of R-modules, then the sequence
MI OR N°-(D(MlOR Rj)=$Mlj
jEJ jEJ
where each MI j is isomorphic to M1 as a left S-module, and similarly
M®RN ®jEJMj, where each Mj is isomoprhic to M as a left S-module.
Furthermore, the map t ®1 : M1®R N -+ M OR N is given as a direct sum
®(tj:Mij--+Mj)
jEJ
where each tj agrees with i under the above identifications. But then, since
t is an injection, so is each tj, and hence so is t ®1.
Now suppose that N is projective as a left R-module. Then there is a
left R-module N' such that N ® N' = F where F is a free left R-module.
We have already shown that
t®1:Mi®RF-+M®RF
is an injection. But using Proposition 2.18 again,
so we may write t(91 = c1® t2 where (in particular) c1 = L(91 : M1®R N --+
M OR F. Since c ® 1 is an injection, so is c1, as claimed. Thus the proof is
complete in the case that N is projective as a left R-module. The proof in
case N is projective as a right T-module is identical.
Note that we have not used the right T-module structures in the above
proof. This is legitimate, since if a homomorphism is injective as a map of
left S-modules, and it is an (S, T)-bimodule map, then it is injective as an
(S, T)-bimodule map.
(2.36) 0--.M1
is a short exact sequence of (S, R)-bimodules, then
( : M' OR P - HomR(M, P)
defined by Equation (2.39) is an (S, T) -bimodule isomorphism.
Proof. Since S is an (S, T)-bimodule homomorphism, it is only necessary
to prove that it is bijective. To achieve this first suppose that M is free of
finite rank k as a left R-module. Let B = {vl, ... , vk} be a basis of M and
let {vi, ... ,v;} be the basis of M' dual to B. Note that every element of
M' OR P can be written as x = Fk 1 of ®pi for pt, ... , Pk E P. Suppose
that ((x) = 0, i.e., (((x))(m) = 0 for every m E M. But ((x)(vi) = pi so
that pi = 0 for 1 < i < k. That is, x = 0 and we conclude that ( is injective.
Given any f E HomR(M, P), let
k
xf=>2v1 ®f(vi)
Then (((xf))(vi) = f (vi) for 1 < i < k, i.e., S(xf) and f agree on a basis
of M; hence, ((xf) = f and C is a surjection, and the proof is complete in
case M is free of rank k.
Now suppose that M is finitely generated and projective, and let N be
a left R-module such that F = M ® N is finitely generated and free. Then
S : F' OR P - HomR(F, P) is a Z-module isomorphism, and
F'®RP= (M®N)'®RPQ (M'(DN')ORP°` (M* OR P)®(N'®R P)
while
CF=CM®(N
CM : M' OR P HomR(M, P)
CN : N' OR P -' HomR(N, P).
Since (F is an isomorphism, it follows that CM and (N are isomorphisms as
well. In particular, CM is bijective and the proof is complete. 0
(2.33) Corollary. Let M be an (R, S)-bimodule, which is finitely generated
and projective as a left R-module, and let P be an arbitrary (T, R)-bimodule.
Then
(P®RM)'
as (S,T)-bimodules.
Proof. From Proposition 2.32, there is an isomorphism
7.2 Multilinear Algebra 433
Then E is a basis for M and F is a basis for N. With respect to these bases,
there is the following result:
434 Chapter 7. Topics in Module Theory
Proof. Exercise. l7
I
(fl ®f2]Y _
an11 B an12B
7.3 Exercises
(a) Show that if M satisfies the DCC, then any nonempty set of submodules
of M contains a minimal element.
(b) Show that e(M) < oo if and only if M satisfies both the ACC (ascending
chain condition) and DCC.
5. Let R = { [o b] a, b E R; c E Q}. R is a ring under matrix addition and
:
multiplication. Show that R satisfies the ACC and DCC on left ideals, but
neither chain condition is valid for right ideals. Thus R is of finite length as
a left R-module, but e(R) = oo as a right R-module.
6. Let R be a ring without zero divisors. If R is not a division ring, prove that
R does not have a composition series.
7. Let f : Ml --, M2 be an R-module homomorphism.
a If f is injective, prove that e(M1) < t(M2)-
b) If f is surjective, prove that £(M2) < e(Ml).
8. Let M be an R-module of finite length and let K and N be submodules of
M. Prove the following length formula:
e(K + N) + e(K n N) = e(K) + e(N).
9. a) Compute e(Z,-).
b) Compute e(Z
(G3 ® Zq .
c) Compute e where is any finite abelian group.
d) More generally, compute e(M) for any finitely generated torsion module
over a PID R.
10. Compute the length of M = F[X]/(f(X)) as an F[X]-module if f(X) is
a polynomial of degree n with two distinct irreducible factors. What is the
length of M as an F-module?
11. Let F be a field, let V be a finite-dimensional vector space over F, and let
T E EndF(V). We shall say that T is semisimple if the F[XI-module VT is
semisimple. If A E we shall say that A is semisimple if the linear
transformation TA : F' -, F' (multiplication by A) is semisimple. Let F2
be the field with 2 elements and let F = F2 (Y) be the rational function field
in the indeterminate Y, and let K = F[XJ/(X2 +Y). Since X2 + Y E FIX]
is irreducible, K is a field containing F as a subfield. Now let
A=C(X2+Y)= L0 0Y ] EM2(F).
Show that A is semisimple when considered in M2(F) but A is not semisimple
when considered in M2(K). Thus, semisimplicity of a matrix is not neces-
sarily preserved when one passes to a larger field. However, prove that if L
is a subfield of the complex numbers C, then A E M,,(L) is semisimple if
and only if it is also semisimple as a complex matrix.
12. Let V be a vector space over R and let T E EndR(V) be a linear transforma-
tion. Show that T = S + N where S is a semisimple linear transformation,
N is nilpotent, and SN = NS.
13. Prove that the modules Mi and N3 in the proof of Lemma 1.33 are simple,
as claimed.
14. Prove Lemma 1.37.
15. If D is a division ring and n is a positive integer, prove that EndD(D") is a
simple ring.
16. Give an example of a semisimple commutative ring that is not a field.
17. (a) Prove that if R is a semisimple ring and I is an ideal, then R/I is
semisimple.
(b) Show (by example) that a subring of a semisimple ring need not be
semisimple.
436 Chapter 7. Topics in Module Theory
18. Let R be a ring that is semisimple as a left R-module. Show that R is simple
if and only if all simple left R-modules are isomorphic.
19. Let M be a finitely generated abelian group. Compute each of the following
pups:
a Homy M, Q/Z .
b Homz(Q/Z, M).
c) M Oz Q/Z.
20. Let M be an (R, S)-bimodule and N an (S, T)-bimodule. Suppose that
E x; ®y; = 0 in M Os N. Prove that there exists a finitely generated
(R, S)-bisubmodule Mo of M and a finitely generated (S, T)-bisubmodule
No of N such that E x; 0 y; = 0 in Mo Os No.
21. Let R be an integral domain and let M be an R-module. Let Q be the
quotient field of R and define m : M - Q OR M by 0(x) = 1®x. Show that
Ker(4') = M, = torsion submodule of M. (Hint: If 10 x = 0 E Q OR M
then 1(9 x = 0 in (Re`) OR M M for some c 54 0 E R. Then show that
cx=0.)
22. Let R be a PID and let M be a free R-module with N a submodule. Let Q be
the quotient field and let 4': M Q OR M be the map 4'(x) = 1®x. Show
that N is a pure submodule of M if and only if Q Q. (0(N ) fl Im(4') = O(N).
23. Let R be a PID and let M be a finitely generated R-module. If Q is the
quotient field of R, show that M OR Q is a vector space over Q of dimension
equal to rankR(M/M,).
24. Let R be a commutative ring and S a multiplicatively closed subset of R
containing no zero divisors. Let Rs be the localization of R at S. If M is
an R-module, then the Rs-module Ms was defined in Exercise 6 of Chapter
3. Show that Ms Rs OR M where the isomorphism is an isomorphism of
Rs-modules.
25. If S is an R-algebra, show that S OR
M and N be finitely generated R-modules over a PID R. Compute
M ®R N. As a special case, if M is a finite abelian group with invariant
factors st, ..., st (where as usual we assume that s;
divides s;+1), show that M®z M is a finite group of order n;=1 aie-2)+1
27. Let F be a field and K a field containing F. Suppose that V is a finite-
dimensional vector space over F and let T E EndF(V). If B = {v;} is a
basis of V, then C = {1 ®B = 110 vi I is a basis of K ®F V. Show that
)10T]c = IT)s If S E ndF(V), show that 1®T is similar to 1®S if and
only if S is similar to T.
28. Let V be a complex inner product space and T : V V a normal linear
transformation. Prove that T is self-adjoint if and only if there is a real inner
product space W, a self-adjoint linear transformation S : W W, and an
isomorphism 0: C OR W -+ V making the following diagram commute.
®s
C ®R W C OR W
where (I, J) denotes the ideal of S®RT generated by I ®RT and S®R J.
30. (a) Let F be a field and K a field containing F. If f (X) E F[X ), show that
there is an isomorphism of K-algebras:
K®F (F[X]/(f(X))) ` K[Xll(f(X)).
(b) By choosing F, f (X), and K appropriately, find an example of two fields
K and L containing F such that the F-algebra K ®F L has nilpotent
elements.
31. Let F be a field. Show that F[X, Y) ?f F[X]®F F[Y] where the isomorphism
is an isomorphism of F-algebras.
32. Let G1 and Gs be groups, and let F be a field. Show that
(1.3) Definition. Let G be a finite group. A field F is called good for G (or
simply good) if the following conditions are satisfied.
(1) The characteristic of F is relatively prime to the order of G.
(2) If m denotes the exponent of G, then the equation X1 - 1 = 0 has m
distinct roots in F.
Remark. Actually, (2) implies (1), and also, (2) implies that the equation
Xk - 1 = 0 has k distinct roots in F, for every k dividing m (Exercise 34,
Chapter 2). Furthermore, since the roots of Xk - 1 = 0 form a subgroup
of the multiplicative group F', it follows from Theorem 3.7.24 that there is
(at least) one root C = tk such that these roots are
440 Chapter 8. Group Representations
We shall reserve the use of the symbol ((or (k) to denote this. We further
assume that these roots have been consistently chosen, in the following
sense:
If k1 divides k2, then ((k2)k2/k, = (k,
Note that the field C (or, in fact, any algebraically closed field of
characteristic zero) is good for every finite group, and in C we may simply
choose (k = exp(21ri/k) for every positive integer k.
(1.4) Examples.
(1) Let M be an F vector space of dimension 1 and define a : G Aut(M)
by a(g) = lm for all g E G. We call M the trivial representation of
degree 1, and we denote it by r.
(2) M = F(G) as an F(G)-module. This is a representation of degree n.
As an F vector space, M has a basis {g : g E G}, and an element
go E G acts on M by
be the dihedral group of order 2m, and let F be a field that is good
for G. The representations of D2m are described in Tables 1.1 and 1.2,
using the following matrices:
r k
(1.1) Ak = I o S'k and B= 0 0
Representation x y degree
T 1 1 1
1 -1 1
4'k Ak B 2 1<k<(m-1)/2
Representation x y degree
1 1 1
1 -1 1
-1 1 1
-1 -1 1
(8) Suppose that M has a basis 8 such that for every g E G, [0'(9)113 is
a matrix with exactly one nonzero entry in every row and column.
Then M is called a monomial representation of G. For example, the
representations of D2,,, given above are monomial. If all nonzero en-
tries of [v(g)] g are 1, then the monomial representation is called a
permutation representation. (The reader should check that this defi-
nition agrees with the definition of permutation representation given
in Example 1.4 (3).)
(9) Let X = {1, x, x2, ... } with the multiplication x'xi = x'+'. Then
X is a monoid, i.e., it satisfies all of the group axioms except for the
existence of inverses. Then one can define the monoid ring exactly as
in the case of the group ring. If this is done, then F(X) is just the
polynomial ring F[x]. Let M be an F vector space and let T : M -- M
be a linear transformation. We have already seen that M becomes an
F(X)-module via x'(m) = T'(m), for m E M and i > 0. Thus, we
have an example of a monoid representation. Now we may identify Z
with
-2 -1 2
E (agg = ag.
9EG gEG
(1.5) Definition.
(1) M is an irreducible representation of G if M is an irreducible F(G)-
module. (See Definition 7.1.1.)
(2) M is an indecomposable representation of G if M is an indecomposable
F(G)-module. (See Definition 7.1.6.)
(1.6) Example. Let F be good for Zn. Then the regular representation R(Zn)
is isomorphic to
00 ®61 ® ... ®9n_1
Proof. Consider the F-basis {1, g, ... ,gn-1} of 7Z(Zn). In this basis, a(g)
has the matrix
0 0 0 1
1 0 ... 0 0
0 1 ... 0 0
0 0 ... 0 0
0 0 ... 1 0
444 Chapter 8. Group Representations
(In fact, this statement is true without the requirement that p be prime.)
(3) Let P be the set of vertices of a regular m-gon, and let D2m act on P
in the usual manner (Section 1.5). Then, as
')+ ®01 ®02 ®... ®0(.n-1)/2 if m is odd,
F (P) ®0jp-1 ifmiseven.
lV,++®V,-+®01®02ED
8.1 Examples and General Results 445
R(D2m)
ifmisodd,
1 10++ ®+G+- ®+6-+ 20, S . ® 20;j if m is even.
m = agg where ag E F.
9EG
gom = E ag(9o9)
9EC
and
m = > ago9(909)
gEG
for every go E G.
Now suppose that G is finite. Then by Equation (1.2)
= 1
n j a2(9-1)f(at(9)) (a1(9o)(v1))
9EG
= 1 E a2(9-1)f(Q1(99o))(v1)
n gEC
= 1 E a2(9o)(12(9019-1)f(a1(99o))(VI)
n gEG
8.1 Examples and General Results 447
= 1 E C2(90)C2((99o)-1)f(Cl(99o))(ul)
n gEG
Let g' = gg0. As g runs through the elements of G, so does g'. Thus,
Av(f)(o1(9o)(vl)) = n E a2(9o)Q2(9 -1)f(a1(9))(v1)
g'EG
= 02 (go) Av(f)(vl)
as required, so Equation (1.3) is satisfied.
Also, if f E Homc(V1, V2)1 then for every g E G,
C2(9)f = fol(9)
Hence, in this case
= 1 f°1(9-1)01(9)
gEC
= 1 F, fcl(e)
gEC
f
gEG
= 1 F, 9-1P(9(t(v)))
n 9EG
= 1n F, 9-1P(t(9(v)))
9EG
= 1 >2 9-1(9(v))
n 9EG
nv
n
=v
as required.
Now suppose that char(F) divides the order of G. Recall that we have
the augmentation map e as defined in Example 1.4 (11), giving a short
exact sequence of F(G)-modules
However,
E(a1: g) =(Je(Eg) =an=OE F
gEG 9EG
since char(F) I n. Thus, a(r) C Ro, contradicting Ro fl a(r) = (0). 0
8.1 Examples and General Results 449
(1.15) Example. Consider Example 1.4 (10) again, but this time with
F = Fy, the field of p elements. Then Tp = [o i] = [o i], so we may
regard T as giving an F-representation of Z.. As in Example 1.12, this is
indecomposable but not irreducible.
Proof. 0
as claimed. 0
The second basic result we have is Schur's lemma, which we amplify a
bit in our situation.
450 Chapter 8. Group Representations
EndR(M)_{[b b :a,bER}=C
under the isomorphism [ b a ] '--» a + bi.
We close this section with the following lemma, which will be important
later.
(2.1) Theorem. Let G be a finite group and F a good field for G. Then
G is abelian if and only if every irreducible F-representation of G is one-
dimensional.
Proof. First assume that G is abelian and let M be an F-representation of
G. By Corollary 1.17, if deg(M) is infinite, M cannot be irreducible. Thus
we may assume that deg(M) < oo. Now the representation is given by a
homomorphism a : G -+ Aut(M), so
a(g)a(h) = a(gh) = a(hg) = a(h)a(g) for all g, h E G.
By Lemma 1.20, each a(g) is diagonalizable, so
S = {a(g) : g E G}
is a set of mutually commuting diagonalizable transformations. By Theorem
4.3.36, S is simultaneously diagonalizable. If B = {v1, ... ,vk} is a basis of
M in which they are all diagonal, then
(ao(g)ao(h))(1) = (ao(h)ao(g))(1)
ao(g)(h) = ao(h)(g)
gh = hg
and G is abelian.
452 Chapter 8. Group Representations
(2.2) Theorem. Let G be a finite group and F a good field for G. Then G
is abelian if and only if G has n distinct irreducible F-representations.
Proof. Let G be abelian. We shall construct n distinct F-representations of
G. By Theorem 3.7.22, we know that we may write G as a direct sum of
cyclic groups
(2.2) G = Zn, ® Zn2 ® ... ®Zn.
with n = n1 n3. Since F is good for G, it is also good for each of
the cyclic groups Zni, and by Example 1.4 (6), the cyclic group Zn, has
the ni distinct F-representations 0k for 0 < k < ni - 1; to distinguish
these representations for different i, we shall denote them ok' . Thus, On'
Zn, - F. If rri : G -+ Zn; denotes the projection, then
(2.3) on,7ri : G --+ F' = Aut(F)
defines a one-dimensional representation (and, hence, an irreducible repre-
sentation) of G. Thus,
{ok'7ri: 1<i<s, 0<k<ni-1}
is a collection of it irreducible F-representations of G; by Corollary 1.17,
this is all of them.
On the other hand, suppose that G is not abelian, and let {Mi}iE, be
the set of irreducible representations of G. Since G is not abelian, Theorem
2.1 implies that deg(Mi) > 1 for some i. Then, as in the proof of Corollary
1.17,
III = F, 1 < F, deg(Mi) < deg(F(G)) = n,
iE1 iE1
so III < n, as claimed. D
(2.3) Corollary. Let G be a finite abelian group and F a good field for G. If
M1, ... , Mn denote the distinct irreducible F-representations of G, then
n
F(G) Mi.
i-1
(2.4) Corollary. Let G be a finite abelian group and F a good field for G. If
M is an irreducible representation of G, then
8.3 Decomposition of the Regular Representation 453
EndG(M) = F.
Observe that an excellent field is good and that the field C is excellent
for every G. Our objective in this section is to count the number of irre-
ducible representations of a finite group G over an excellent field F, and
to determine their multiplicities in the regular representation F(G). The
answers turn out to be both simple and extremely useful.
(3.3) Lemma. Let F be an excellent field for G, and let P and Q be isotypic
representations of G of the same type. If P 5 m1M and Q m2M, then
as F-algebras
(3.1) Homc(P, Q) 25 Mm2,mi (F)
Q'+ M.
Then f7i E EndG(M), and since F is excellent for G, Lemma 1.18 (Schur's
lemma) implies that fji is given by multiplication by some element a3i E F.
Then the isomorphism of the lemma is given by
f A= [ai,).
(3.4) Theorem. (Frobenius) Let F be an excellent field for the finite group
G.
(1) The number of distinct irreducible F-representations of G is equal to
the number t of distinct conjugacy classes of elements of G.
(2) If {Mi}i=1 are the distinct irreducible F-representations of G, then the
multiplicity of Mi in the regular representation R of G is equal to its
-
degree di = deg(Mi) for l < i < t.
(3) Et=1 d? = n = JGJ.
®miMi
i=1
for some positive integers m1, ... , mq.
(1) We shall prove this by calculating dimF C in two ways, where C is
the center of R, i.e.,
C={rER:rr'=r'r forallr'ER}.
C is clearly an F-algebra.
First we compute dimF (C) directly. Let {Ci };_ 1 be the sets of mutually
conjugate elements of G, i.e., for each i, and each 91 and g2 E Ci, there
8.3 Decomposition of the Regular Representation 455
c;= 1: g.
9EC,
E 9i 9
g,ECi
E gig
9cEC+
1: 9(9-19:9)
g EC,
99,
9; EC;
where the fourth equality holds because C1 is a conjugacy class. This im-
mediately implies that
(3.4) C _D (cl, ... , ct),
x = a9g E C.
gEG
That is, any two mutually conjugate elements have the same coefficient,
and hence, C C (c1, ... , ct). Together with Equation (3.4), this implies that
C = (cl, ... , ct), and since c1, ... , ct are obviously F-linearly independent
elements of R, it follows that
456 Chapter 8. Group Representations
(3.5) dimF(C) = t.
Now for our second calculation of dimF(C). We know in general that
(3.6) R a, HomR(R, R) = EndR(R).
We will calculate dixF(C') where C' is the center of EndR(R). Of course,
dimF(C') = duuF(C) by Equation (3.6). But
EndR(R) = HomR(R, R)
9
HomR(®m$Mi, $m, M,)
i=1 ,j=1
9 9
®®HomR(m;M;, m,M,).
i=1 j=1
where Ci is the center of the matrix algebra Mn, (F); but by Lemma 4.1.3,
Ci = FIm, C Mm, (F) so that dimF(Ci) = 1. By Equation (3.8), it follows
that
dunF (C') = q.
Hence, q = t, as claimed.
(2) We shall prove this by calculating dimF(M;) in two ways. First, by
definition,
by Schur's lemma and Lemma 3.3 again. This matrix space has dimension
m; over F, so m; = d;, as claimed.
(3) By part (2) and Equation (3.7),
n = dimF(R)
a
= dimF (®M., (F))
i=1
9
_ dimp(M., (F))
:=1
a
= E m?
:=1
c
= d 2
as claimed. D
(3.5) Remark. Note that this theorem generalizes the results of Section 8.2.
For a group G is abelian if and only if every conjugacy class of G consists of
a single element, in which case there are n = IGI conjugacy classes. Then G
has n distinct irreducible F-representations, each of degree 1 and appearing
with multiplicity 1 in the regular representation (and n = E "j 12).
l),
f1 {x, xm-1}, {x2, xm-2} ,. .,
{X(m-1)/2 x(m+1)/2}
{y, xy, x2y, ... ,x"'-1y}.
There are (m + 3)/2 conjugacy classes and in Example 1.4 (7), we con-
structed (m + 3)/2 distinct irreducible representations over a good field F,
so by Theorem 3.4 we have found all of them.
For m even G has the following conjugacy classes:
{1}, {x, X'- 1), {x2, xm-2}, ... , {xI-1, x"}, {x1},
{x;y : i is even}, {x'y : i is odd}.
458 Chapter 8. Group Representations
[d1 0
P(±1) = 0
[±i 0
P(fi) = 0 :i
0
P(±j) = I±i 0
0 01
p(±k) =
1
±1
0
Note that in the matrices, i is the complex number i. We must check that p is
irreducible. This can be done directly, but it is easier to make the following
observation. If p were not irreducible it would have to be isomorphic to
a'(Qi) ®7r`(aj) for some i, j E {0, 1, 2, 3}, but it cannot be, for p(-1) is
nontrivial, but (7r' (vi) ®7r' (off)) (-1) is trivial for any choice of i and j.
VZ2ED Z2
_ {1, (12)(34), (13)(24), (14)(23)}
={1,I,J,K}
and
S ° c Z3 = {1, (1 2 3), (13 2)} = {1, T, T2}.
We compute that A4 has 4 conjugacy classes
{1}, {I, J, K}, {T, TI, TJ, TK}, {T2, T2I, ,T2J, T2K},
so we expect 4 irreducible representations whose degrees satisfy ri 1 d =
12, giving dl = d2 = d3 = 1 and d4 = 3. (Alternatively, we find that V is
the commutator subgroup of G, and so we have exactly 3 one-dimensional
representations of G by Proposition 2.5. Then the equation Ei=1 d? = 12
and dl = d2 = d3 = 1 with di > 1 for i > 3 forces t = 4 and d4 = 3.)
The three one-dimensional representations of G are 7r* (O..) for 0 < i <
2, where Bi are the representations of the cyclic group S constructed in
Example 1.4 (6).
Now we need to find a three-dimensional representation. Let
M=C4={(zl,z2,z3,z4):ziEC, 1 <i<4}.
Then S4, and hence, A4, acts on C4 by permuting the coordinates, i.e.,
460 Chapter 8. Group Representations
9(zl, z2, z3, z4) = (z9(1), z9(2), zg(3), Z9(4)) for g E S4.
Consider
M0 = {(z1, z2, z3, z4) : z1 + 22 + Z3 + z4 = 0}.
This subspace of M is invariant under S4, so it gives a three-dimensional
representation a of S4, and we consider its restriction to A4, which we
still denote by a. We claim that this representation is irreducible, and the
argument is the same as the final observation in Example 3.8: The subgroup
V acts trivially in each of the representations 7r* (0j) for i = 0, 1, 2, but
nontrivially in the representation a.
(3.11) Theorem. (Burnside) Let F be an excellent field for the finite group
G and let p : G -+ Aut(V) be an irreducible F- representation of G. Then
{p(g):gEG}
spans EndF(V).
Proof. We first claim that for any field F, if po : G -+ Aut(F(G)) is the
regular representation of G, then {po(g)} is a linearly independent set in
Aut(F(G)). For suppose
a = a9po(g) = 0.
9EG
Then
0 = a(1) = E agg
9EG
so a9 =0 for each gE C.
Now let F be an excellent field for G. Then by Theorem 3.4 we have
an isomorphism 0 : Vo -+ F(G) with Vo = ®2=id2V2, po : G -+ Aut(Vo),
where pt : G -+ Aut(VV) are the distinct irreducible F-representations of G.
Choose a basis B, for each V and let 13 be the basis of Vo that is the union
of these bases. If MM(g) = [pt(g)]B,, then for each g E G, [po(g)]B is the
block diagonal matrix
diag(Mj(g), M2(g), ... M2(g),... , Mt(g), ... , Mt(9))
,
where M2(g) is repeated d, = dim(U) times. (Of course, M1(g) = [1] ap-
pears once.) By the first paragraph of the proof, we have that the dimension
of {po(g) : g E G} is equal to n, so we see that
t
n<E42
where q2 is the dimension of the span of {pi(g) : g E G}. But this span is a
subspace of End(V2), a space of dimension d?. Thus, we have
t t
n<I:gt<Ed?=n
i=1 i=1
where the latter equality is Theorem 3.4 (3), so we have qt = d, for 1 < i <
t, proving the theorem.
462 Chapter8. Group Representations
8.4 Characters
In this section we develop the theory of characters. In practice, characters
are a tool whose usefulness, especially in characteristic zero, can hardly be
overemphasized. We will begin without restricting the characteristic.
(4.2) Examples.
(1) If a = drr, then X, (g) = d for every g E G.
(2) If a is any representation of degree d, then X,(1) = d.
(3) If a is the regular representation of C, then X,(1) = n = IGI and
X,(9) = 0 for all g 1. (To see this, consider [a(g)] in the basis
{g : g E G} of F(G).)
(4.3) Lemma. If g1 and 92 are conjugate elements of G, then for any rep-
resentation a,
(4.2) Xa(91) = Xa(92)
Proof. (1) is obvious, and (2) follows immediately from Proposition 7.2.35
and Lemma 4.1.20. 0
Our next goal is to derive the basic orthogonality results for charac-
ters. Along the way, we will derive a bit more: orthogonality for matrix
coefficients.
(1) Suppose that V1 and V2 are distinct. Then for any i1, ji, i2, j2,
(2) Suppose that V1 = V2 (so that a1 = a2) and Bl = Cit. Let d = deg(Vi).
Then
1 1 1 1/d if it = j2 and ji = i2,
n E Pil j, (9)gi2 j2 (9 ) = 0 otherwise.
9EG
Proof. Let Qi be the projection of V1 onto its ith summand F (as determined
by the basis B1) and let aj be the inclusion of F onto the jth summand of
V2 (as determined by the basis 82). Then f = aj$i E Hom(Vj, V2). Note
that [aj(33JBs = Eji where Eji is the matrix with 1 in the jith position and
0 elsewhere (see Section 5.1). Let us compute Av(f). By definition
Av(f) = n F, a2(9-1)(ai#i)a1(9)-
gEG
so the sums in question are just the entries of [Av(f ))81 (as we vary i, j
and the entry of the matrix.)
Consider case (1). Then Av(f) E HomG (Vi , V2) = (0) by Schur's
lemma, so f is the zero map and all matrix entries are 0, as claimed.
Now for case (2). Then Av(f) E HomG(V, V) = F, by Schur's lemma,
with every element a homothety, represented by a scalar matrix. Thus all
the off-diagonal entries of [Av(f )Jg, are 0, showing that the sum in Equation
(4.4) is zero if i1 0 j2. Since a1 = 0`2i we may rewrite the sum (replacing g
by g-1) as
1 1
E Pi2is (9 )
9EG
n F pii(9)4ii(9-1)
gEC
for any i, j (by varying the choice of f and the diagonal element in ques-
tion).
Now consider
fo=a1f1+a2l32+....+ad$d.
Since there are d summands, we see that the diagonal entries of Av(fo) are
all equal to dx. But fo is the identity! Hence, Av(fo) = fo has its diagonal
entries equal to one, so dx = 1 and x = 11d, as claimed.
d3 d2 dl dz
1 1:
n EEPii(9)gjj(9-1) =EE n P;;(9)9jj(9-1)
gEG i=1 j=1 i=1 j=1 gEG
Proof. The proof follows that of Proposition 4.5, with f = aj oiol (h). Then
the matrix corresponding to that in Equation (4.5) is
P1j(h9)4i1(9-1) P1j(h9)gi2(9-1) .
P2j(h9)gi1(9-1) P2j(h9)4i2(9-1) ..
this sum being independent of i. As in the proof of Corollary 4.7, the sum
we are interested in is
d d d
EE - >P:t (h9)9jj (9
n
1)
_ 1
1
> Pjj (h9)9jj (9
I)
1=1 j=1 gEG j=1 gEG
x1+...+xd
Then
d(xi + . + xd) = Tr(Av(fo))
= 1
n > Tr(aj(9-Ih9))
gEG
= 1 E Tr(a1(h))
n
gEG
_ Tr(o, I(h))
= xo, (h),
yielding the desired equality.
(4.9) Remarks.
(1) We should caution the reader that there are in fact three alternatives in
Proposition 4.5: that VI and V2 are distinct, that they are equal, or that
they are isomorphic but unequal. In the latter case the sum in Equation
(4.4) and the corresponding sum in the statement of Proposition 4.8
may vary (see Exercise 14). Note that we legitimately reduced the third
8.4 Characters 467
case to the second in the proof of Corollary 4.7; the point there is that
while the individual matrix entries of isomorphic representations will
differ, their traces will be the same.
(2) We should caution the reader that the sum in Proposition 4.8 (2) with
i1 = j2 but jl 36 i2 may well be nonzero and that the quantities
XI, ... , xd in the proof of the proposition may well be unequal (see
Exercise 15).
01i...,Ot
(in some order).
Let ICiI be the number of elements of Ci. Then Corollary 4.7 gives a
sort of orthogonality relation on the rows of A; namely,
1 if g E C;,
fi (9)
0 otherwise.
(4.13) Lemma. Let F be an excellent field for G. Then
A={Xi,...,Xt}
is a basis for the space of class functions on G.
Proof. By Lemma 4.3, the characters are class functions. There are t of
them, so to show that they are a basis it suffices to show that they are
linearly independent. Suppose
aix, + a2X2 + - + atXt = 0 where a, E F
where, as in Definition 4.10, we have written X; for X,,. Note that X,!9
defined by X; (9) = Xi(9-1), is also a class function. Then for each i,
(a1X1 + a2X2 + ... + atXt)X* = 0
But, by the orthogonality relations, this sum is just a;/n, so that ai/n = 0
and a; = 0 for each i, as required. 0
(4.14) Remark. If x is the character of the representation V defined by
o : G - Aut(V), then X*(9) = X(9-1) is indeed the character of a rep-
resentation; namely, with the given action of G, V is a left F(G)-module.
The action of G given by g '- o(g-1) gives V the structure of a right F(G)-
module. Then V' = HomG(V, F(G)) is a left F(G)-module with character
X'. (In terms of matrices, if B is a basis for V, B' is the dual basis of V',
and X' is defined by a' : G -+ Aut(V*), then
(a*(9)le- =
(4.15) Proposition. Let A and B be the above matrices, and let Xi and f;
be defined as above. Then for every 1 < i < t,
(1) Xi = j=1 aitfi, and
(2) fi = >. ..1 bijxj.
Proof. (1) is easy. We need only show that both sides agree on ck for 1
k < t. Since fj(ck) = 6jk,
t
F, a11fj(Ck) = aik = Xi(Ck)
j=1
as claimed.
8.4 Characters 469
Now (1) says that the change of basis matrix PB is just A. Then
PA = (Ps')' = A-' = B,
giving (2). 0
bki = !k(C-i)
ft
bkjXj (ci)
j=1
t
( 1) l Xj(ci)
E-1
n EXj(ci)Xj(ck1)
is
u .
j-1
1
i(V, W) = (V, W) = (Xv, Xw) =
n 1: Xv(g)Xw(g-')
gEG
Proof. Note that the hypotheses imply that R is semisimple, and recall
Schur's lemma. The proof then becomes straightforward, and we leave it
for the reader. 0
Remark. If G is finite, F is of characteristic zero or prime to the order of
G, and V is of finite degree, then we have
i(V, W) = dimF V. ®F(G) W.
Xo=Xo=Xv+
i.e., if and only if Xo is real valued.
We have the following important result:
Xo(9-1) = Xo(9)-
Proof. Let g E G. Then g has finite order k, say. By Lemma 1.20, V has a
basis B with [o(g)[B diagonal, and in fact,
[0'(g)]B = diag(S°', C°', ... , (ad)
X(91) = X(92)
Proof. The only if is trivial. As for the if part, suppose that X(gl) = X(92)
for every complex character. Then by Lemma 4.20,
1)
X(92 = X(91),
so if XI, , Xt are the irreducible characters,
t t
EXj(91)Xj(92 1) = EXj(91)Xj(91) =
E
t
IXi(91)12 > 0
j=1 j=1 j=1
since each term is nonnegative and X1(91) = 1 (X1 being the character of the
trivial representation r). Then by Corollary 4.16, gl and g2 are conjugate.
0
(4.23) Proposition. Let G be a finite group. Then every complex character
of G is real valued if and only if every element in G is conjugate to its oum
inverse.
472 Chapter 8. Group Representations
Proof. Suppose that every g is conjugate to g-1. Then for every complex
character X, X(g) = X(g-1) = (g), so X(g) is real.
Conversely, suppose that Xi(g) is real for every irreducible complex
representation ai and every g E G. Since a1 = r, X1(g) = 1, so
e e
0< XI(g)2
= Xi(g)Xi((g_1)-1)
j=1 j=1
Lemma 4.20 also gives a handy way of encoding the orthogonality rela-
tions (Corollaries 4.7 and 4.16) for complex characters. Recall the character
table A = [a11] = [X,(c1)J of Definition 4.10. Let
(4.9) ej ei = (
l / n l1 E
f I X.1(92 1)Xi(9i 1)9291
nJ
J\ / 9i.92EC
The interior sum, and hence the double sum, is zero if i $ j by Proposition
4.8. If i = j the interior sum is nXi(h-1)/di (again by Proposition 4.8), and
so in this case the double sum is
d'2
(nXi(h-1)/di) h = ei.
n hEC
t t didj
_ E j=1 i=1
n
gEG
Xi(h-1g)Xj(g-1)
n 2
_ (d=) (nXi(h-1)/di)
n
n
_ dixi(h-1)
i=1
n
_ Xi(1)Xi(h-1)
i=1
= f0 ifh54 1
In ifh=1
where the third equality is by Proposition 4.8 and the last is by Corollary
4.16. Thus, al = 1 and ah = 0 for h # 1, giving el + + et = 1. 0
{(kgei:k=0,...,m-1, gEG).
(Here we consider C(G) as an additive abelian group.) Since e? = ei, we
have
n -e2
n
= (Xi(g-1)g) ei E Mi.
di ei = di
(X7, X7 _ (1 . 102+15.22+20. 12
+12.02+12.02)
= 3,
so y has three irreducible components. We compute
(r,X7)=(1.1.10+15.1.2+20.1.1
+12.1.0+12.1.0)
= 1,
and since we have already computed X4, we may compute
1,
Cl C2 C3 C4 C5
Oil 1 1 1 1 1
a3 3 x2 x3 x4 x4
a3 3 Y2 113 3/4 Y5
a4 4 0 1 -1 -1
(k5 5 -11 0 0
so
0=1+3z2+4.0+5.1
so Z2 = -2, and by evaluation on c3, c4, and c5 we obtain z3 = 0, z4 =
z5 = 1. Now let us use orthogonality.
(-1)
X3 = X3, X3 = (4),
We use orthogonality once more to get
x,
(or vice-versa, but there is no order on X3 and X3, so we make this choice).
Hence, we find the complete "expanded" character table of A5:
478 Chapter 8. Group Representations
C, C2 C3 C4 Cs
1 15 20 12 12
a, 1 1 1 1 1
a3 3 -1 0 (1+v"5-)/2 (1-f)/2
C"3 3 -1 0 (1-v')/2 (1+f)/2
a4 4 0 1 -1 -1
as 5 1 -1 0 0
(4.30) Example. Let us show how to determine all the irreducible complex
characters of the symmetric group Ss of order 120.
First we see from Corollary 1.5.10 that S5 has seven conjugacy classes,
with representatives 1, (12), (123), (12)(34), (1234), (123)(45), and (12345),
so we expect seven irreducible representations. One of them is r, of course,
and another, also of degree 1, is e given by e(g) = sgn(g) = ±1 E Aut(C),
the sign of the permutation g.
Observe that the representations 3 and ry of As constructed in Example
4.29 are actually restrictions of representations of Ss, so the representations
a4 and as of As constructed there are restrictions of representations a4 and
as of S5. Since a4 and as are irreducible, so are a4 and as. Furthermore,
since a4 and
a's are irreducible, so are e ® 64 and e ® as (by Exercise 13),
and we may compute that the characters of a4 and e ® 54 (respectively, as
and e (& &5) are unequal, so these representations are distinct.
Hence, we have found six irreducible representations, of degrees 1, 1,
4, 4, 5, 5, so we expect one more of degree d, with
12+12+42+42+52+52+d2 = 120
so d = 6. If we call this a6i then we have (using X(a) for )(o, for conve-
nience),
(4.31) Example. It is easy to check from Example 3.9 that A4 has the
following expanded character table (where we have listed the conjugacy
classes in the same order as there):
8.5 Induced Representations 479
Cl C2 C3 C4
1 3 4 4
r = ir'(8o) 1 1 1 1
(2
7T (91) 1 1 (
S2
it (92 ) 1 1
a 3 -1 0 0
and for i = 0, 1, or 2
(a®a,7r'(9;))=12(1.1.9+3.1.1)=1,
so
Note that this makes sense as we may regard F(G) as an (F(G), F(H))-
bimodule, and then the result of induction is an F(G)-module. If V =
IndH (W ), we say that V is induced from W and call V an induced repre-
sentation.
Proof. The first statement of the proposition is clear from the isomorphism
F(G) = ®9iF(H)
iEI
of right F(H)-modules. As for the converse, note that there is a one-to-one
correspondence
{Wi} «--.. { left cosets of H }
given by
Wi «-' {g E G : g(Wi) = Wi }.
Pick coset representatives {9i}iEl with gi = 1. Define a function a
G x W V by a(g, w) = g(w). This clearly extends to an F-linear trans-
formation a : F(G) x W V, and since
a(9h, w) = 9h(w) = g(hw) = a(9, hw)
it readily follows that a is F(H)-middle linear and so defines
a:F(G)®F(H)W V.
Now we define Q : V -4 F(G) ®F(H) W. Since V = ®Wi, it suffices to
define /3 : Wi - F(G) ®F(H) W for each i E I. We let /3(wi) = gi ® g,-1(wi),
482 Chapter 8. Group Representations
for wi E Wi. Let us check that & and 6 are inverses of each other, estab-
lishing the claimed isomorphism.
First,
)3a(9 ® w) = 3(g(w))
= Q(9ih(w))
= Q(9i(hw))
=gi®hw
=gih®w
=g®w,
as required.
Proof. Let V = IndH (W). Then we may identify W with W, in the state-
ment of Theorem 5.6. Then K = {g E G : g(WW) = Wi} acting on W by
the above formula, so V = IndK(W) as well.
(5.8) Corollary.
(1) Let V be a transitive permutation representation on the set P = {pi }iEI
and letH={gEG:g(p,)=p,}. Then V=IndH(r).
(2) Let V be a transitive monomial representation With respect to the basis
B = {bi}, and let H = {g E G : g9Fbi) = Fb,}. Let a = Fb1 as a
representation of H. Then V = Indi(a).
Proof.
8.5 Induced Representations 483
by Theorem 5.6. But in the statement of the corollary we have just grouped
the a, into isomorphism classes, there being [N(a) : H] of these in each
class. 0
484 Chapter 8. Group Representations
V = IndG (Vi).
Proof. (1) In the notation of the proof of Theorem 5.11, we have that, as
an F(H)-module, V = Eg(W1), so V is a sum of simple F(H)-modules
and hence is semisimple by Lemma 4.3.20.
(2) Clearly, all the Uj are conjugate to a1 and all conjugates appear.
Let V, be as in Theorem 5.11, with V, = ®;_i k, (W1) for some group
elements ki E G. Then if gj is as in the proof of Theorem 5.11,
m1 Wj Vj
= gi (Vi )
M1
= ®gi(ki(W,))
i=1
m, Wi
so mi = m1 by Corollary 7.1.18.
(3) This is merely a restatement of (2).
(4) Since g(W1) = W, for g E H', V1 = >J g(W,) where the summa-
tion is over left coset representatives of H1 in N(a). 0
The next formula turns out to be tremendously useful, and we will see
many examples of its use. Recall that we defined the intertwining number
i(V,W)=(V,W)
of two representations in Definition 4.17.
Proof. By definition,
But
where the second equality is the adjoint associativity of Hom and tensor
product (Theorem 7.2.20). Again, by definition,
m = (M, F(G))
_ (F(G), M)
_ (IndH(T), M)
_ (T, ROH(M))
_ (T, dr)
=d
as claimed.
so
2
Indv°(T) = ®Tr'(9,)
i=O
(since
deg(Indv' (T)) = [A4 : V] deg(T) = 3. 1 = 3
and the right-hand side is a representation of degree 3). Also, for i = 0, 1, 2
andj=1,2,3
0 = (A,, T) = (Ai, Reso(x'(91))) = (Indy(A,),fr'(9 )),
so we must have IndX(A,) = a (as both are of degree 3).
Continuing with this example, since we have a split extension, we have
a subgroup S of A4 isomorphic to Z3, and we identify S with Z3 via this
isomorphism. (We have given S and this isomorphism explicitly in Example
3.9). Now S has three irreducible representations 90 = r, 91 i and 92. Because
we have a splitting, R.ess' (7r' (6;)) = 9;, or, more generally,
Example. Note that D2m has an abelian subgroup of index 2 and its irre-
ducible complex representations all have dimension at most 2. The same is
true for Q8. Also, A4 has an abelian subgroup of index 3 and its irreducible
complex representations all have dimension at most 3.
V =®gi(W)=®Wi
i=1 icl
(5.21) Example. In Example 4.29, we found the character table of A5. Let
us here adopt an alternate approach, finding the irreducible characters via
induced representations. We still know, of course, that they must have de-
grees 1, 3, 3, 4, 5 and we still denote them as in Example 4.29.
Of course, al = r. The construction of a4 was so straightforward that
an alternative is hardly necessary, but we shall give one anyway (as we shall
need most of the work in any case). Let G = As and let H = A4 included
in the obvious way (as permutations of {1, 2, 3, 4) C- (1, 2, 3, 4, 5). Then
a system of left coset representatives for H is
{1,(12)(45), (12)(35), (13)(25), (23)(15)} = {gi, ... ,g5}.
We choose as representatives for the conjugacy classes of C
{1, (14)(23), (123), (12345), (13524)} = {cl, ... ,c5}.
Of course, g' (c1)gi = g, for every i. Otherwise, one can check that
9i 1(cj)gi 0 H except in the following cases:
Now, following Example 3.9, let W = ir'(01) (or 7r'(02)) and consider
Indo (W) = V. Again, it is easy to compute Xv:
Xv(c1) = 5, Xv(c2) = 1,
Xv(c3) = exp(21ri/3) +exp(4iri/3) _ -1,
XV (C4) = XV (C5) = 0-
Now r does not appear in W by Frobenius reciprocity, so that implies here
(by considering degrees) that V is irreducible (or alternatively one may
calculate that (Xv, XV) = 1) so V = a5 and its character is given above.
Now we are left with determining the characters of the two irreducible
representations of degree 3. To find these, let H = Z5 be the subgroup
generated by the 5-cycle (1 2 3 4 5). Then H has a system of left coset
representatives
{1, (14)(23), (243), (142), (234), (143),
(12)(34), (13)(24), (123), (134), (124), (132)}
= (91,...,912)
Again, g, 1(c1)gj = c1 for every i. Otherwise, one can check that gi 1(cj)gi
H except in the following cases: g11(c4 )91 = c4, 91 1(c5)91 = cs, and
g21((12345))g2 = (15432) _ (12345)-1 and gz1((13524))g2 =
(14253) = (13524)-1.
Now let W = 9j and let V = IndH (W). Again by Theorem 5.20 we
compute Xv(c1) = 12, Xv(c2) = Xv(c3) = 0, and
Xv(c4) = exp(2iri/5) + exp(8ai/5) if 1 or 4,
Xv(c5) = exp(47ri/5) +exp(6ai/5)
Xv(c4) = exp(47ri/5) + exp(6ai/5)
if j - 2 or 3.
Xv(c5) = exp(2iri/5) + exp(8iri/5)
In any case, one has that r does not appear in V (by either Frobenius
reciprocity or calculating (Xv,X1) = 0) and a4 and as each appear in
V with multiplicity 1 (by calculating (Xv, X4) = (Xv, Xs) = 1), so their
complement is an irreducible representation of degree 3 (which checks with
(Xv, Xv) = 3), whose character is Xv - X4 - X5-
Choose j = 1 (or 4) and denote this representation by a3, and choose
j = 2 (or 3) and denote this representation by a3 (and note that they are
distinct as their characters are unequal). Then we may calculate that
X3(c1) = X3(c,) = 3
X3(2) = X3(2) = -1
X3(C3) = X3(03) = 0
X3(c4) = 1 + exp(2iri/5) + exp(87ri/5) = (1 + f)/2
%3(cs) = 1 + exp(4iri/5) + exp(6ai/5) = (1 - f)12
and vice-versa for Xs, agreeing with Example 4.29.
492 Chapter 8. Group Representations
Our last main result in this section is Mackey's theorem, which will
generalize Corollary 5.10, but, more importantly, give a criterion for an
induced representation to be irreducible. We begin with a pair of subgroups
K, H of G. A K-H double coset is
KgH={kgh:kEK, hEH}.
It is easy to check that the K-H double cosets partition G (though, unlike
for ordinary cosets, they need not have the same cardinality).
We shall also refine our previous notation slightly. Let a : H -+ Aut(W)
be a representation of H. For g E C, we shall set H9 = g-'Hg and we will
let a9 be the representation a9 : H9 -. Aut(W) by a9(h) = a(ghg-1), for
h E H. Finally, let us set H. = H9 fl K. We regard any representation of
H, given by a : H - Aut(W), as a representation of H9 by o9 and, hence,
as a representation of H. by the restriction of a9 to H9. (In particular, this
applies to the regular representation F(H) of H.)
where the sum is taken over a complete set of K-H double coset represen-
tatives.
Proof. For simplicity, let us write the right-hand side as ®9(F(K)(&F(H))9.
Define maps a and f3 as follows:
For g' E G, write g' as g' = kgh for k E K, h E H, and g one of
the given double coset representatives, and let a(g') _ (k ® h),. We must
check that a is well defined. Suppose that g' = kgh with k E K and
h E H. We need to show (k ®h)9 = (k (& h)9. Now kgh = g' = kgh gives
9_1k-'!g = hh 1, and then
(k ®h)g = (k(k-11) ®(hh-1)h)9
®g_1(k-lk)9(hh-1)h)9
= (k
_ (k ®h)9,
as required. Then a extends to a map on F(G) by linearity. Conversely,
define 3, on K x H by 3.9 (k, h) = kgh E G and extend Q9 to a map
fag : F(K) x F(H) -. F(G)
by linearity. Then for any x E H9, we have
39(k,9-lxgh),
Note that the subgroup H. depends not only on the double coset KgH,
but on the choice of representative g. However, the modules involved in the
statement of Mackey's theorem are independent of this choice. We continue
to use the notation of the preceding proof.
(5.23) Proposition. Let g and g be in the same K-H double coset. Then
(F(K) ® F(H))g is isomorphic to (F(K) ® F(H))y as (F(K), F(H))-
bimodules.
Proof. Let g= kgh with k E K, h E H, and define a: K x H K x H by
a(k, h) = (kk, hh). Extend to
a : F(K) x F(H) -' F(K) x F(H)
by linearity, thus giving
a : F(K) x F(H) -+ (F(K) 0 F(H))g.
We show that a is middle linear:
Let x E Hg be arbitrary. Then
a(kx, h) = kxk ®hh
= k(xk) ®hh
= k ®g-'(xk)ghh
and
a(k, g-'xgh) = kk ®hg-'xgh
= k ® (9-lkg)hg-'xgh.
But g = kgh, so
g-lkgh(h-1g_lk-1)x(kgh)h
(9 'kg)hg-'xgh =
= g-lxkghh,
F(K) ®F(H9) W.
where the direct sum is over a complete set of H-H double cosets, H9 =
H9 n H, and W9 is the representation of H9 on W defined by a9.
Proof. The required isomorphism is a consequence of the following chain of
equalities and isomorphisms.
Endc(V) = HomG(V, V)
= HomH(W1 V)
as in the proof of Frobenius reciprocity
L' HomG(V, W)
as by our assumption on F, F(H) is semisimple
8.5 Induced Representations 495
(5.26) Corollary.
(1) Let G be a finite group, H a subgroup of G, and F a field of character-
istic zero or relatively prime to the order of H. Let W be a represen-
tation of H and set V = IndH(W). Then EndG(V) = F if and only
if EndH (W) = F and HomH9 (W9, ResH9 (W)) = 0 for every g E G,
gVH.
(2) Let F be an algebraically closed field of characteristic zero or relatively
prime to the order of G. Then V is irreducible if and only if W is
irreducible and, for each g E G, g V H, the H9-representations W9
and ResH9 (W) are disjoint (i.e., have no mutually isomorphic irre-
ducible components). In particular, if H is a normal subgroup of G, V
is irreducible if and only if W is irreducible and distinct from all its
conjugates.
1--iV--iG--*-+S-+1
with G a group of order n = 2"' (27` -1). G has the one-dimensional complex
representations n' (01) for i = 0, ... , p - 1. Also, if a is any nontrivial
complex representation of V, G has the representation Indy(a) of degree
[G : V) = 21-1. Now a is disjoint from all its conjugates (as Ker(a) may be
considered to be an F2-vector space of dimension m-1, and GL(m-1, F2)
does not have an element of order p), so by Lemma 5.25, a is irreducible.
As (2" - 1)2 + 2m-1(1)2 = n, these 2' complex representations are all of
the irreducible complex representations of G. (Note that if m = 2, then
G = A4, so this is a generalization of Example 5.29.)
(6.4) Lemma. Let Ql, ... , Qk partition P into domains of transitivity. Then
CP=CQi®...®CQk,
Proof.
Proof. 0
(Recall that in this situation all of the subgroups Gp, for P E P, are
conjugate and we may choose H to be any one of them.)
Because of Lemma 6.4, we shall almost always restrict our attention
to transitive representations, though we state the next two results more
generally.
Proof. We know that al and 0`2 are equivalent if and only if their characters
Xi and X2 are equal. But if X is the character of a permutation representa-
tion or of G on a set P, then its character is given by
X(9) = I{p E P : a(9)(p) = p}I
0
Q = {a(g)(p) : g E H}.
Proof. Clear from the remark preceding Theorem 6.9, identifying domains
of imprimitivity with left cosets. 0
(6.14) Examples.
(1) Note that 1-fold transitive is just transitive.
(2) The permutation representation of D2n on the vertices of an n-gon is
doubly transitive if n = 3, but only singly (= 1-fold) transitive if n > 3.
(3) The natural permutation representation of S on {1, ... ,n} is n-fold
transitive, and of An on 11, ... , n} is (n - 2)-fold (but not (n- I)-fold)
transitive.
(4) The natural permutation representation of Sn on (ordered or un-
ordered) pairs of elements of {1, ... ,n} is transitive but not doubly
500 Chapter 8. Group Representations
transitive for n > 3, for there is no g E S,, taking {(1, 2), (2,3)1 to
{(1,2), (3,4)}.
=1 X(g)X(g)
gEG
=1 X(g)X(g)
gEG
= (X, X).
Note that r is a subrepresentation of a by Lemma 6.4; also note that
(X, X) = 2 if and only if in the decomposition of a into irreducibles there
are exactly two distinct summands, yielding the theorem.
(6.18) Remark. The converse of Proposition 6.15 is false. For example, let
p be a prime and consider the permutation representation of D2p on the
8.6 Permutation Representations 501
C(m+3)/2 41 P
by
x'y vertex pi fixed by x'y
and further, for any g E D2m, g(x'y)g-1 fixes the vertex g(p;), so the two
actions of G are isomorphic.
(2) Consider D2m for m even. Then (cf. Example 3.7) we have
C1 = {1}, C2 = {x, xm-1},
... , Cm/2 = {x?-1, x+l},
C;J+1 = {X?), CJ+2 = {x'y : i is even}, C;1+3 = WY: i is odd}.
Again, ry1 = r and ry; = r®t/i+- for i = 2, ... , m/2 (as conjugation
by x is trivial on C; but conjugation by y is not). We will determine the
502 Chapter 8. Group Representations
Also, xi(x'y)x-j
Xz+2(x3)=XZ+3(x3)=0 for j# 2.
Now y(xiy)y-1 = yxi = x-ly, so y fixes x:y when 2i = 0, i.e., i = 0 or
m/2. Hence, if m/2 is odd,
X -+2(y) = XZ+3(y) = 1,
while if m/2 is even
'Y2+2='+b++®-+®y
and
'Y +3 = 'Y
2
where
_ 02 ®04 ®06 ®...(D Ok,
1 1 T T2
{1} 1 1 1 1
{I, J, K} 3 3 0 0
{T, TI, TJ, TK} 4 0 1 1
(7.2) Definition. Let R be a ring. Let F be the free abelian group with basis
(P: P is a projective R-module},
and let N be the subgroup spanned by
{M - M1 - M2 : there is a short exact sequence of 7Z-modules:
-+M--iM2--i0}.
Then let K(R) = F/N.
(7.3) Remark. The reader will recall that we defined a good field F for
G in Definition 1.3 and an excellent one in Definition 3.1. We often used
the hypothesis of excellence, but in our examples goodness sufficed. This
is no accident. If F is a good field and F' an excellent field containing F,
then all F'-representations of G are in fact defined over F, i.e., every F'-
representation is of the form V = F' ®F W, where W is an F(G)-module.
(In other words, if V is defined by a : G Aut(V), then V has a basis 13
such that for every g E G, [a(g)]g is a matrix with coefficients in F.) This
was conjectured by Schur and proven by Brauer. We remark that there is no
proof of this on "general principles"; Brauer actually proved more, showing
how to write all representations as linear combinations of particular kinds
of induced representations. (Of course, in the easy case where G is abelian,
we showed this result in Corollary 2.3.)
8.8 Exercises
1. Verify the assertions of Example 1.8 directly (i.e., not as consequences of our
general theory or by character computations). In particular, for (3) and (4),
find explicit bases exhibiting the isomorphisms.
2. Let 7r : C -i H be an epimorphism. Show that a representation a of H is
irreducible if and only if 7r' (a) is an irreducible representation of G.
3. Let H be a subgroup of G. Show that if Resy(a) is irreducible, then so is a,
but not necessarily conversely.
4. Verify the last assertion of Example 1.7.
5. Show that r and Ra are the only irreducible Q-representations of Zp.
6. Find all irreducible and all indecomposable Fy-representations of Zp. (Fp
denotes the unique field with p elements.)
7. In Example 3.8, compute 7r'(8;) ®p in two ways: by finding an explicit basis
and by using characters.
8. Do the same for the representations a' (9) ® a of Example 3.9.
9. In Example 3.8 (resp., 3.9) prove directly that p (reap., a) is irreducible.
10. In Example 3.10, verify that the characteristic polynomials of a(U) and
a'(U) are as claimed. Also, verify that 7r' (7/'_) ® w'(01) = 7r' (&1) both
directly and by using characters.
11. Verify that Ds and Qs have the same character table (cf. Remark 4.32).
12. Show that the representation p of Q8 constructed in Example 3.8 cannot be
defined over R, although its character is real valued, but that 2p can (cf.
Remark 4.33).
13. Let F and G be arbitrary. If a is an irreducible F-representation of G and
(3 is a 1-dimensional F-representation of G, show that a ®Q is irreducible.
14. Find an example to illustrate Remark 4.9 (1). (Hint: Let G = D6.)
15. (a) Find an example to illustrate both phenomena remarked on in Remark
4.9 (2). (Hint: Let G = D6.)
(b) Show that neither phenomenon remarked on in Remark 4.9 (2) can occur
if h is in the center of the group G.
16. Prove Lemma 4.18.
17. Prove Proposition 4.24.
18. Show that the following is the expanded character table of S4:
C1 C2 C3 C4 C5
1 6 3 8 6
r 1 1 1 1 1
a 3 1 -1 0 1
7r'(Ol) 2 0 2 -1 0
7r'(7(i_)®a 3 -1 -1 0 1
7r'(7/i_) 1 -1 1 1 -1
19. Show that the following is the expanded character table of S5 in two ways-
by the method of Example 4.30 and the method of Example 5.28.
Ci C2 C3 C4 C5 C6 C7
1 10 15 20 20 30 24
r 1 1 1 1 1 1 1
54 4 2 0 1 -1 0 1
as 5 1 1 -1 1 -1 0
as 6 0 -2 0 0 0 1
E®&5 5 -1 1 -1 -1 1 0
E®a4 4 -2 0 1 1 0 -1
C 1 -1 1 1 -1 -1 1
506 Chapter 8. Group Representations
X = U [x].
zEX
508 Appendix
Now suppose that [x] n [y] ; 0. Then there is an element z E [x] n [y].
Then x - z and y - z. Therefore, by symmetry, z - y and then, by
transitivity, we conclude that x - y. Thus y E [xJ and another application
of transitivity shows that [y] C [x] and by symmetry we conclude that
[xi C [y] and hence [x] = [y]. 0
The standard example of a partially ordered set is the power set P(Y)
of a nonempty set Y, where A < B means A C B.
If X is a partially ordered set, we say that X is totally ordered if
whenever x, y E X then x < y or y < x. A chain in a partially ordered set
X is a subset C C X such that C with the partial order inherited from X
is a totally ordered set.
If S C X is nonempty, then an upper bound for S is an element xo E X
(not necessarily in S) such that
s < x0 for all s E S.
A maximal element of X is an element m E X such that
if m < x, then m = x.
A.1 Equivalence Relations and Zorn's Lemma 509
This list consists of all the symbols used in the text. Those without a page
reference are standard set theoretic symbols; they are presented to establish
the notation that we use for set operations and functions. The rest of the
list consists of symbols defined in the text. They appear with a very brief
description and a reference to the first occurrence in the text.
C set inclusion
A C B but A #,B
n set intersection
U set union
A\B everything in A but not in B
AxB cartesian product of A and B
JAI cardinality of A (E N U {oo})
f:X-+Y function from X to Y
a'-+ f(a) a is sent to f (a) by f
IX :X-*X identity function from X to X
fIZ restriction of f to the subset Z
N natural numbers = { 1, 2, ... }
e, 1 group identity 1
Z integers 2
Z+ nonnegative integers 54
Q rational numbers 2
R real numbers 2
C complex numbers 2
Q' nonzero rational numbers 2
R' nonzero real numbers 2
C. nonzero complex numbers 2
Z integers modulo n 2
Zn integers relatively prime to n (multiplication mod n) 2
512 Index of Notation
39-9
90000
111111