Module Introduction
KEITH CONRAD
1. Introduction
One of the most basic concepts in linear algebra is linear combinations: of vectors, of polynomials,
of functions, and so on. For example, the polynomial 7 − 2T + 3T² is a linear combination of 1, T,
and T²: 7 · 1 − 2 · T + 3 · T². The coefficients used for linear combinations in a vector space are in
a field, but there are many places where we meet linear combinations with coefficients in a ring.
Example 1.1. In the ring Z[√−5] let p be the ideal (2, 1 + √−5), so by definition
p = {2x + (1 + √−5)y : x, y ∈ Z[√−5]} = Z[√−5] · 2 + Z[√−5] · (1 + √−5),
which is the set of all linear combinations of 2 and 1 + √−5 with coefficients in Z[√−5].
Such linear combinations do not provide unique representation: for x and y in Z[√−5],
x · 2 + y · (1 + √−5) = (x − (1 + √−5)) · 2 + (y + 2) · (1 + √−5).
So we can’t treat x and y as “coordinates” of 2x + (1 + √−5)y. In a vector space like Rⁿ or Cⁿ,
whenever there is a duplication of representations with a spanning set we can remove a vector from
the spanning set, but in p this is not the case: if we could reduce the spanning set {2, 1 + √−5} of
p to a single element α, then p = Z[√−5]α = (α) would be a principal ideal in Z[√−5], but it can
be shown that p is not a principal ideal.
Instead of reducing the size of {2, 1 + √−5} to get a nice spanning set for p with Z[√−5]-coefficients,
we can try to do something else to standardize the representation of elements of p as
linear combinations: restrict coefficients to Z. That works in this case because
p = {2x + (1 + √−5)y : x, y ∈ Z[√−5]}
  = {2(a + b√−5) + (1 + √−5)(c + d√−5) : a, b, c, d ∈ Z}
  = {2a + 2√−5·b + (1 + √−5)c + (−5 + √−5)d : a, b, c, d ∈ Z}
  = Z · 2 + Z · 2√−5 + Z · (1 + √−5) + Z · (−5 + √−5).
Describing p in terms of linear combinations with Z-coefficients made the spanning set grow. We
can shrink the spanning set back to {2, 1 + √−5} because the new members of this spanning set,
2√−5 and −5 + √−5, are Z-linear combinations of 2 and 1 + √−5:
2√−5 = (−1)2 + 2(1 + √−5)  and  −5 + √−5 = (−3)2 + (1 + √−5).
Thus 2√−5 and −5 + √−5 are in Z · 2 + Z · (1 + √−5), so
p = Z · 2 + Z · (1 + √−5) = {2m + (1 + √−5)n : m, n ∈ Z}.
Using coefficients in Z rather than in Z[√−5], there is unique representation: if 2m + (1 + √−5)n =
2m′ + (1 + √−5)n′ then
(2(m − m′) + (n − n′)) + (n − n′)√−5 = 0.
The real and imaginary parts are 0, so n = n′ and then m = m′. Thus, we can regard m and n as
“coordinates” for 2m + (1 + √−5)n, and the situation looks a lot closer to ordinary linear algebra.
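These computations in Z[√−5] are easy to check by machine. Here is a small Python sketch (the helper names are ours, not from the text) that stores a + b√−5 as the integer pair (a, b), confirms that the shifted coefficients above give the same element of p, and recovers the unique Z-coordinates m and n:

```python
# Elements of Z[sqrt(-5)] as integer pairs (a, b) meaning a + b*sqrt(-5).
def add(z, w):
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    # (a + b√-5)(c + d√-5) = (ac - 5bd) + (ad + bc)√-5
    a, b = z
    c, d = w
    return (a*c - 5*b*d, a*d + b*c)

TWO = (2, 0)
G = (1, 1)  # 1 + √-5

def comb(x, y):
    # x·2 + y·(1 + √-5) with x, y in Z[√-5]
    return add(mul(x, TWO), mul(y, G))

# Non-unique representation over Z[√-5]: shifting (x, y) to
# (x - (1+√-5), y + 2) gives the same element.
x, y = (4, -7), (3, 2)          # arbitrary sample coefficients
x2 = add(x, (-1, -1))           # x - (1 + √-5)
y2 = add(y, (2, 0))             # y + 2
assert comb(x, y) == comb(x2, y2)

# Over Z the representation 2m + (1+√-5)n IS unique: the element
# determines n (the √-5 coordinate) and then m.
def z_comb(m, n):
    return (2*m + n, n)         # 2m + n(1+√-5) = (2m+n) + n√-5

a, b = z_comb(5, -3)
n = b                           # read off n from the √-5 part
m = (a - n) // 2                # then recover m
assert (m, n) == (5, -3)
```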
Warning. That the set {2, 1 + √−5} has its Z[√−5]-linear combinations coincide with its Z-linear
combinations is something of a fluke. In the ring Z[√d], the set of Z-linear combinations of
two elements generally does not coincide with the set of their Z[√d]-linear combinations.
The use of linear combinations with coefficients coming from a ring rather than a field suggests
the concept of “vector space over a ring,” which for historical reasons is not called a vector space
but instead a module.
In the special case when R = F is a field, an F -module is just an F -vector space by another
name.
Example 1.3. The set Rⁿ of ordered n-tuples of elements of R, with componentwise addition and scalar
multiplication by R as in linear algebra, is an R-module.
Example 1.4. Every ideal I in R is an R-module with addition and scalar multiplication being
the operations in R.
Example 1.5. A quotient ring R/I for an ideal I is an R-module where r(x mod I) := rx mod I
for r ∈ R and x ∈ R. (It is easy to check that this is well-defined and satisfies the axioms.) So I
and R/I are both R-modules, whereas in the language of ring theory, ideals and quotient rings are
not the same kind of object: an ideal is almost never a ring, for instance.
Example 1.6. The ring R[T ] is an R-module using obvious addition and scalar multiplication.
Example 1.7. The set Map(R, R) of functions f : R → R under pointwise addition of functions
and the scalar multiplication (rf )(x) = r(f (x)) is an R-module.
That addition in M is commutative actually follows from the other axioms for R-modules: distributivity
both ways on (1 + 1)(m + m′) shows m + m + m′ + m′ = m + m′ + m + m′, and canceling
the leftmost m’s and rightmost m′’s (addition on M is a group law) gives m + m′ = m′ + m for all
m and m′ in M.
We will show how many concepts from linear algebra (linear dependence/independence, basis,
linear transformation, subspace) can be formulated for modules. When the scalars are not a field,
we encounter genuinely new phenomena. For example, the intuitive idea of linear independence in
a vector space as meaning no vector is a linear combination of the others is the wrong way to think
about linear independence in modules. As another example, in a vector space each spanning set
can be refined to a basis, but a module with a finite generating set does not have to have a basis
(this is related to nonprincipal ideals). In the last section, we’ll see how the concept of a module
gives a new way to think about a question purely about matrices acting on vector spaces: for a
field F and matrices A and B in Matₙ(F), deciding whether A and B are conjugate in Matₙ(F) is the
same as deciding whether two F[T]-module structures on Fⁿ (one using A, the other using B) are isomorphic.
2. Basic definitions
In linear algebra the concepts of linear combination, linear transformation, isomorphism, sub-
space, and quotient space all make sense when the coefficients are in a ring, not just a field, so they
can all be adapted to the setting of modules with no real changes.
Linear combinations are the basic way to create new elements of a module from old ones, just
as in linear algebra in Rⁿ. For instance, an ideal (a, b, c) = Ra + Rb + Rc in R is nothing other
than the set of R-linear combinations of a, b, and c in R.
Example 2.2. We can view the ideal I = (1 + 2i) in Z[i] as both a Z[i]-module and as a Z-module
in a natural way. As a Z[i]-module, we can get everywhere in I from 1+2i: I = Z[i](1+2i). As a Z-
module, we can get everywhere in I from 1+2i and i(1+2i) = −2+i since I = Z(1+2i)+Zi(1+2i) =
Z(1 + 2i) + Z(−2 + i).
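A quick computational sketch of this example, using Python’s complex numbers with integer parts to stand for Z[i] (an illustration, not part of the development): any Z[i]-multiple of 1 + 2i splits into a Z-combination of 1 + 2i and i(1 + 2i) = −2 + i.

```python
# Gaussian integers as Python complex numbers with integer parts.
g = 1 + 2j          # generator of the ideal I = (1 + 2i) in Z[i]

# As a Z-module we use the two generators g and i*g = -2 + i:
assert 1j * g == -2 + 1j

# Any Z[i]-multiple (a + bi)*g splits into the Z-combination a*g + b*(i*g),
# so I = Z(1+2i) + Z(-2+i).  Check on a small window of coefficients:
for a in range(-3, 4):
    for b in range(-3, 4):
        assert (a + b*1j) * g == a*g + b*(1j*g)
```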
Mostly we will be interested in modules with finite spanning sets, but there are some important
examples of modules that require infinite spanning sets. So let’s give the general definition of
spanning set, allowing infinite ones.
Definition 2.3. A spanning set of an R-module M is a subset {mᵢ}_{i∈I} of M such that every
m ∈ M is a finite R-linear combination of the mᵢ’s:
m = ∑_{i∈I} rᵢmᵢ,
where rᵢ ∈ R and all but finitely many rᵢ are 0. A module with a finite spanning set is called
finitely generated. Such modules are analogous to finite-dimensional vector spaces
(a vector space has a finite spanning set if and only if it is finite-dimensional), but the analogy is
rather weak: finitely generated modules can be very complicated when R is not a field.
In words, this says ϕ sends R-linear combinations to R-linear combinations with the same
coefficients. (Taking r = r′ = 1 makes this the additive condition and taking r′ = 0 makes
this the scaling condition.) We can also characterize a linear transformation as one satisfying
ϕ(rm + m′) = rϕ(m) + ϕ(m′), but that description is asymmetric while the concept of linear
transformation is not, so don’t use that description!
Example 2.10. The R-linear maps ϕ : R → R are exactly the functions ϕ(x) = ax for some a ∈ R.
Indeed, if ϕ is R-linear then ϕ(x) = ϕ(x · 1) = xϕ(1) = xa = ax where a = ϕ(1). Conversely,
for a ∈ R define ϕₐ : R → R by ϕₐ(x) = ax; this is R-linear since
(2.1) ϕₐ(x + y) = a(x + y) = ax + ay = ϕₐ(x) + ϕₐ(y),  ϕₐ(rx) = a(rx) = arx = rax = rϕₐ(x).
Example 2.11. For positive integers m and n, each matrix A ∈ Mat_{n×m}(R) defines an R-linear
transformation Rᵐ → Rⁿ by v ↦ Av (the usual product of a matrix and a vector). Note the flip in
the order of m and n in the size of the matrix and in the domain and target of the transformation.
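A minimal sketch of Example 2.11 in Python, taking R = Z and a 2 × 3 matrix (the function name mat_vec is ours): the map v ↦ Av goes from R³ to R², and R-linearity can be checked directly.

```python
# A matrix A with n rows and m columns gives a map R^m -> R^n, v |-> Av.
# Sketch over R = Z.
def mat_vec(A, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]          # 2x3 matrix: maps Z^3 -> Z^2 (note the flip)

v = [1, 0, -2]
w = [0, 3, 1]
r = 5

# R-linearity: A(v + w) = Av + Aw and A(rv) = r(Av).
vw = [v[i] + w[i] for i in range(3)]
rv = [r * x for x in v]
assert mat_vec(A, vw) == [mat_vec(A, v)[i] + mat_vec(A, w)[i] for i in range(2)]
assert mat_vec(A, rv) == [r * x for x in mat_vec(A, v)]
```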
Remark 2.12. In Section 3 we’ll define the concept of a basis for a finitely generated module.
Some finitely generated modules do not have a basis, such as a nonprincipal finitely generated ideal
(Example 3.7). For finitely generated modules M and N that have bases, it will turn out (just as
in linear algebra) that every linear transformation M → N can be described with a matrix as in
Example 2.11.
For modules without bases, there is nothing like a matrix available to describe a linear trans-
formation between them. This is the fundamental difference between vector spaces and modules:
matrices are insufficient for describing linear maps between general modules, even between general
finitely generated modules.
Like group homomorphisms, an R-linear transformation ϕ is injective if and only if ker ϕ = {0}.
Being injective is a property that does not involve R-linearity, so we can think about whether ϕ is
injective just by treating ϕ as a group homomorphism, which makes it clear that this property is
the same as having kernel {0}.
Example 2.17. Viewing R as an R-module, its submodules are precisely its ideals. This is very
important!
Example 2.18. In Z[i] the Z[i]-submodules all have the form Z[i]α since all ideals in Z[i] are prin-
cipal. In Z[i] its Z[i]-submodules (ideals) are also Z-submodules, but there are more Z-submodules
in Z[i] than ideals. For example, Z + Z · 2i = {a + 2bi : a, b ∈ Z} is a Z-submodule of Z[i] that is
not an ideal (it isn’t preserved under multiplication by i).
Definition 2.19. If N ⊂ M is a submodule, then the quotient group M/N has the natural scalar
multiplication r(m mod N ) := rm mod N (easily checked to be well-defined and to satisfy the
axioms to be an R-module). We call M/N a quotient module.
In particular, for ideals J ⊂ I ⊂ R, we can say that I/J is an ideal in R/J or (without mentioning
R/J) that I/J is an R-module.
It is straightforward to check that if ϕ : M → N is an R-linear transformation, the kernel and
image of ϕ are both R-modules (one is a submodule of M and the other is a submodule of N)
and the homomorphism theorems for groups carry over to theorems about linear transformations
of R-modules. For example, an R-linear map ϕ : M → N induces an injective R-linear map
ϕ̄ : M/ker ϕ → N that is an isomorphism of M/ker ϕ with im ϕ.
The direct sum of two R-modules M and N is
M ⊕ N = {(m, n) : m ∈ M, n ∈ N}
with componentwise addition and the scalar multiplication r(m, n) := (rm, rn).
More generally, for a collection of R-modules {Mᵢ}_{i∈I}, their direct sum is the R-module
⊕_{i∈I} Mᵢ = {(mᵢ)_{i∈I} : mᵢ ∈ Mᵢ and all but finitely many mᵢ are 0},
with componentwise operations; the direct product ∏_{i∈I} Mᵢ is defined the same way but without
the finiteness condition.
The construction of the direct sum and direct product of R-modules appears different only when
the index set I is infinite. In particular, we may write M ⊕ N or M × N for this common notion
when given just two R-modules M and N .
Note M ≅ M ⊕ {0} and N ≅ {0} ⊕ N inside M ⊕ N.
As in group theory, there is a criterion for a module to be isomorphic to the direct sum of two
submodules by addition: if L is an R-module with submodules M and N, we have M ⊕ N ≅ L by
(m, n) ↦ m + n if and only if M + N = L and M ∩ N = {0}. (Addition from M ⊕ N to L is R-linear
by the way the R-module structure on M ⊕ N is defined. Then the property M + N = L makes
the addition map surjective and the property M ∩ N = {0} makes the addition map injective.)
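This criterion can be illustrated with L = Z², M = Z·(1, 0), and N = Z·(1, 1), an example of our choosing rather than the text’s: M + N = L and M ∩ N = {0}, and indeed each (a, b) decomposes uniquely.

```python
# Internal direct sum sketch in L = Z^2 with submodules M = Z*(1,0)
# and N = Z*(1,1): M + N = L and M ∩ N = {0}, so the addition map
# M ⊕ N -> L is a bijection.  Checked on a finite window of Z^2.
def decompose(a, b):
    # (a, b) = (a - b)*(1, 0) + b*(1, 1): the unique decomposition.
    return (a - b, b)

for a in range(-5, 6):
    for b in range(-5, 6):
        m, n = decompose(a, b)
        # m*(1, 0) + n*(1, 1) = (m + n, n) recovers (a, b)
        assert (m + n, n) == (a, b)

# M ∩ N = {0}: (t, 0) = (s, s) forces s = t = 0.
assert [(t, s) for t in range(-5, 6) for s in range(-5, 6)
        if (t, 0) == (s, s)] == [(0, 0)]
```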
Let M be an R-module. Recall a spanning set {mᵢ}_{i∈I} is a subset such that each m ∈ M can be written as
m = ∑_{i∈I} rᵢmᵢ
with rᵢ ∈ R and all but finitely many rᵢ equal to 0.
Definition 3.1. In an R-module M, a subset {mᵢ}_{i∈I} is called linearly independent if the only
relation ∑ᵢ rᵢmᵢ = 0 (with all but finitely many rᵢ equal to 0) is the one where all rᵢ are 0, and
linearly dependent otherwise: there is a relation ∑ᵢ rᵢmᵢ = 0 where some rᵢ is not 0. A subset of
M is called a basis if it is a linearly independent spanning set. A module that has a basis is called
a free module, and if the basis is finite then M is called a finite-free module.
Remark 3.2. In a vector space, {vᵢ}_{i∈I} is linearly independent if no vᵢ is a linear combination
of the other vⱼ’s. The equivalence of that with “∑ᵢ cᵢvᵢ = 0 ⇒ all cᵢ = 0” uses division by nonzero
scalars in a field. In a module over a ring that is not a field, we can’t divide by scalars, and the two
viewpoints of linear independence in vector spaces really are not the same in the setting of modules.
The description in terms of ∑ᵢ cᵢvᵢ = 0 is the right one to carry over to modules.
Example 3.5. The ideal p = (2, 1 + √−5) of Z[√−5] can be regarded as both a Z[√−5]-module and
a Z-module. The set {2, 1 + √−5} is linearly dependent over Z[√−5] since r₁ · 2 + r₂(1 + √−5) = 0
where r₁ = 1 + √−5 and r₂ = −2, while it is linearly independent over Z and in fact is a basis of p
as a Z-module from Example 1.1.
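The dependence relation over Z[√−5] and the independence over Z are both easy to verify mechanically; here is a sketch in Python (helper names are ours) with a + b√−5 stored as the pair (a, b):

```python
# Elements of Z[√-5] as pairs (a, b) = a + b√-5.
def add(z, w):
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    a, b = z
    c, d = w
    return (a*c - 5*b*d, a*d + b*c)

# Dependence over Z[√-5]: r1·2 + r2·(1+√-5) = 0 with r1 = 1+√-5, r2 = -2.
r1, r2 = (1, 1), (-2, 0)
assert add(mul(r1, (2, 0)), mul(r2, (1, 1))) == (0, 0)

# Independence over Z: m·2 + n·(1+√-5) = (2m+n) + n√-5 vanishes only
# when n = 0 and then m = 0; brute-force over a small window.
for m in range(-10, 11):
    for n in range(-10, 11):
        if (2*m + n, n) == (0, 0):
            assert (m, n) == (0, 0)
```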
It is not hard to check that in a module, every subset of a linearly independent subset is linearly
independent and every superset of a linearly dependent set is linearly dependent (the calculation
goes exactly as in linear algebra), but beware that a lot of the geometric intuition you have about
linear independence and bases for vector spaces in linear algebra over fields breaks down for modules
over rings. The reason is that rings in general behave differently from fields: two nonzero elements
need not be multiples of each other and some nonzero elements could be zero divisors.
Perhaps the most basic intuition about linear independence in vector spaces that has to be used
with caution in a module is the meaning of linear independence. In a module, linear independence
is defined to mean “no nontrivial linear relations,” but in a vector space linear independence
has an entirely different-looking but equivalent formulation: no member of the subset is a linear
combination of the other members in the subset. This latter condition is a valid property of linearly
independent subsets in a module (check!) but it is not a characterization of linear independence in
modules in general:
A natural example is the ring C^∞(R) of smooth functions R → R with pointwise operations.
The set of functions in C^∞(R) whose derivatives of all orders at 0 vanish (it contains the function
that is e^(−1/x²) for x ≠ 0 and 0 at x = 0) is an ideal and can be shown not to be finitely
generated.
(3) A finite-free R-module can strictly contain a finite-free R-module with a basis of the same
size (this never occurs in linear algebra: a subspace with the same finite dimension must
be the entire space). For example, with R = Z we can use M = Zᵈ and N = (2Z)ᵈ where
d ≥ 1. More generally, for every domain R that is not a field and every nonzero non-unit
a ∈ R we can use M = Rᵈ and N = (Ra)ᵈ.
What is a Z-module? It’s an abelian group M equipped with a map Z × M → M such that for
all m, m′ ∈ M and a, a′ ∈ Z,
(1) 1m = m,
(2) a(m + m′) = am + am′,
(3) (a + a′)m = am + a′m and (aa′)m = a(a′m).
Armed with these conditions, it’s easy to see by induction that for a ∈ Z⁺,
am = m + m + ··· + m  (a times),
(−a)m = a(−m) = −m − m − ··· − m  (a times),
and we also have 0m = 0. Thus multiplication by Z on a Z-module is the usual concept of integral
multiples in an abelian group (or integral powers if we used multiplicative notation for the group
law in M ), so an abelian group M has only one Z-module structure.
In other words, a Z-module is just another name for an abelian group, and its Z-submodules are
just its subgroups. A Z-linear transformation between two Z-modules is just a group homomorphism
(“additive map”) between abelian groups, because the scaling condition ϕ(am) = aϕ(m)
with a ∈ Z is true for all group homomorphisms ϕ. (Warning. A nonabelian group is not a Z-module!
Modules always have commutative addition by definition, and we also saw at the end of
Section 1 that commutativity is logically forced by the other conditions for being a module. You might also
recall the exercise from group theory that if (gh)² = g²h² for all elements g and h then gh = hg,
i.e., g and h commute.)
There are many concepts related to abelian groups that generalize in a useful way to R-modules.
If the concept can be expressed in terms of Z-linear combinations, then replace Z with R and
presto: you have the concept for R-modules. Below is a table of comparison of such concepts.
Example 4.1. Let G = C×. This is an abelian group, so a Z-module, but written multiplicatively
(each m ∈ Z acts on z ∈ C× by z ↦ zᵐ). Its torsion subgroup is the group of all roots of unity, an infinite
subgroup. So the torsion subgroup of an abelian group need not be finite (though it is finite when
the group is finitely generated).
⁴The word “torsion” means “twistiness”. It entered group theory from algebraic topology: the nonorientability of
some spaces is related to the presence of nonzero elements of finite order in a homology group of the space. Then it
was natural to use the term torsion to refer to elements of finite order in an arbitrary abelian group, and replacing
integral multiples with ring multiples led to the term being used in module theory for the analogous concept.
Example 4.2. Let U = Z[√2]× = {±(1 + √2)ⁿ : n ∈ Z}. This is an abelian group, so a Z-module (think
multiplicatively!). It is generated by {−1, 1 + √2} and has torsion subgroup {±1}.
Example 4.3. If I is a nonzero ideal in a domain R, then R/I is a torsion R-module: pick
c ∈ I − {0}, and then for all m ∈ R/I we have cm = 0 in R/I since Rc ⊂ I. As a special case,
Z/kZ is a torsion abelian group when k is nonzero.
Example 4.4. Every vector space V over a field F is torsion-free as an F-module: if cv = 0 and
c ≠ 0 then multiplying by c⁻¹ gives c⁻¹(cv) = 0, so (c⁻¹c)v = 0, so v = 0. Thus the only F-torsion
element in V is 0.
The last entry in our table above says that for a domain R, the R-module analogue of a finite
abelian group is a finitely generated torsion R-module. One may think at first that the module
analogue of a finite abelian group should be an R-module that is a finite set, but (for general R)
that is much too restrictive to be a useful definition.
Another way R-modules extend features of abelian groups is the structure of the mappings
between them. For two abelian groups A and B, written additively, the set of all group homomor-
phisms from A to B is also an abelian group, denoted Hom(A, B), where the sum ϕ + ψ of two
homomorphisms ϕ, ψ : A → B is defined by pointwise addition: (ϕ + ψ)(a) := ϕ(a) + ψ(a). So
the set of all homomorphisms A → B can be given the same type of algebraic structure as A and
B. (This is not true for nonabelian groups. For groups G and H, the pointwise product of two
homomorphisms G → H is not usually a homomorphism if H is nonabelian.) Taking B = A, the
additive group Hom(A, A) is a ring using composition as multiplication.
Something similar happens for two R-modules M and N : the set of all R-linear maps M → N
is an R-module, denoted HomR (M, N ), using pointwise addition and scaling, and when we take
N = M the set of R-linear maps M → M is a ring where the multiplication is composition.
Example 4.5. We saw in Example 2.11 that each A ∈ Mat_{n×m}(R) leads to a function Rᵐ → Rⁿ
given by v ↦ Av (the usual product of a matrix and a vector) and this is an R-linear transformation.
It can be shown that all R-linear maps Rᵐ → Rⁿ arise in this way, so Hom_R(Rᵐ, Rⁿ) ≅ Mat_{n×m}(R)
as R-modules. Similarly, Hom_R(Rⁿ, Rⁿ) ≅ Matₙ(R) as rings.
The ring structure on Hom(M, M ) (all homomorphisms of M as an abelian group) gives a concise
abstract way to define what an R-module is. For each r ∈ R, the scaling map m 7→ rm on M is a
group homomorphism, and the axioms of an R-module are exactly that the map R → Hom(M, M )
associating to each r the function “multiply by r” is a ring homomorphism: the second axiom says
m 7→ rm is in Hom(M, M ) and the first and third axioms say that sending r to “multiply elements
of M by r” is additive and multiplicative and preserves multiplicative identities. So an R-module
“is” an abelian group M together with a specified ring homomorphism R → Hom(M, M ), with the
image of each r ∈ R in Hom(M, M ) being interpreted as the mapping “multiply by r” on M . This
is analogous to saying an action of a group G on a set X “is” a group homomorphism G → Sym(X).
Let’s take a look at the meaning of isomorphisms of modules in the context of ideals in a ring: for
ideals I, J ⊂ R, when are I and J isomorphic as R-modules (i.e., when does there exist a bijective
R-linear map between I and J)?
Even when two ideals are not principal, the simple scaling idea in the previous example completely
accounts for how two ideals could be isomorphic modules, at least in a domain:
Theorem 5.2. If R is a domain with fraction field K, then ideals I and J in R are isomorphic
as R-modules if and only if I = cJ for some c ∈ K×. In particular, I ≅ R as an R-module if and
only if I is a nonzero principal ideal.
In K, if x, x′ ≠ 0, we get
ϕ(x)/x = ϕ(x′)/x′.
So set c = ϕ(x)/x ∈ K×, where x ∈ I − {0}; we just saw that this ratio is independent of the choice
of x. Then ϕ(x) = cx for all x ∈ I: this is obvious if x = 0 and holds by design of c if x ≠ 0. Thus,
J = ϕ(I) = cI.
Taking J = R, we have as a special case I ≅ R as an R-module if and only if I = cR for some
c ∈ K×. Necessarily c = c · 1 ∈ cR = I ⊂ R, so I = cR with c ∈ R − {0}. Such I are the nonzero
principal ideals in R.
Example 5.3. Let R = Z[√−5], p = (3, 1 + √−5) and q = (3, 1 − √−5). Complex conjugation is
Z-linear and from the description
p = {3a + (1 + √−5)b : a, b ∈ Z[√−5]}
we get that the conjugate of p is (3, 1 − √−5) = q. So p and q are isomorphic as Z-modules since complex conjugation
is a Z-module isomorphism.
But are p and q isomorphic as R-modules? Complex conjugation on R is not R-linear (it sends cx
to c̄x̄ rather than to cx̄), so the isomorphism we gave between p and q as Z-modules
doesn’t tell us how things go as R-modules. In any event, if p and q are to be somehow isomorphic
as R-modules then we definitely need a new bijection between them to show this! The previous
theorem tells us that the only way p and q can be isomorphic R-modules is if there is a nonzero
element of the fraction field Q[√−5] that scales p to q, and it is not clear what such an element
might be.
It turns out that p and q are isomorphic as R-modules,⁵ with one isomorphism ϕ : p → q being
ϕ(x) = ((2 + √−5)/3)·x.
(Of course this is also a Z-module isomorphism.) The way this isomorphism is discovered involves
some concepts in algebraic number theory. Here is a variant, also explained by algebraic number
theory, where the opposite conclusion holds: consider R′ = Z[√−14] and the ideals
p′ = (3, 1 + √−14) and q′ = (3, 1 − √−14),
which are complex conjugates of each other, so p′ ≅ q′ as Z-modules using complex conjugation. It turns out that p′ and q′
are not isomorphic as R′-modules, but proving that requires some work.
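One can at least verify by direct computation that multiplication by (2 + √−5)/3 sends the generators of p into q (a numeric sketch with helper names of our choosing, not a proof that ϕ is an isomorphism):

```python
# Elements of Z[√-5] as integer pairs (a, b) standing for a + b√-5.
def add(z, w):
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    a, b = z
    c, d = w
    return (a*c - 5*b*d, a*d + b*c)

def phi(z):
    # phi(x) = (2 + √-5)x/3; the division by 3 must be exact for x in p.
    a, b = mul((2, 1), z)
    assert a % 3 == 0 and b % 3 == 0
    return (a // 3, b // 3)

# phi sends the generators 3 and 1+√-5 of p to elements of q:
assert phi((3, 0)) == (2, 1)        # phi(3) = 2 + √-5
assert phi((1, 1)) == (-1, 1)       # phi(1+√-5) = -1 + √-5

# Both images lie in q = (3, 1-√-5):
#   2 + √-5 = 3·1 + (1-√-5)·(-1)  and  -1 + √-5 = (1-√-5)·(-1).
assert add((3, 0), mul((-1, 0), (1, -1))) == (2, 1)
assert mul((-1, 0), (1, -1)) == (-1, 1)
```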
Example 6.1. Let V = R². This is an R-vector space. It has a two-element spanning set over R,
e.g., the standard basis vectors (1, 0) and (0, 1). It is torsion-free as an R-module, as are all vector
spaces (Example 4.4). But when we introduce a matrix A acting on V, we can turn V into a finitely
generated torsion module over the domain R[T]. Here’s how that works.
Let A = [3 2; 5 4], the 2 × 2 matrix with rows (3, 2) and (5, 4). The effect of A on R² lets us put
an action of R[T] on R² through polynomial values at A:⁶ for f(T) ∈ R[T] and v ∈ R² declare
f(T) · v = f(A)v.
In this R[T]-module structure, R² is cyclic: every vector can be reached from (1, 0), since for each
(x, y) ∈ R² there are a, b ∈ R such that
(aT + b) · (1, 0) = (x, y).
(Concretely, (aT + b) · (1, 0) = aA(1, 0) + b(1, 0) = (3a + b, 5a), so take a = y/5 and b = x − 3y/5.)
⁵The ideals (2, 1 + √−5) and (2, 1 − √−5) are also isomorphic R-modules, but something even stronger holds: they
are equal, since the generators of each ideal are in the other ideal. This follows from (1 + √−5) + (1 − √−5) = 2.
The reason R² is a torsion module for this R[T]-action comes from the characteristic polynomial
of A: by the Cayley–Hamilton theorem χ_A(A) = O, so χ_A(T) · v = χ_A(A)v = 0 for every v ∈ R². Here
χ_A(T) = det(T · I₂ − A)
       = det([T 0; 0 T] − [3 2; 5 4])
       = det [T − 3  −2; −5  T − 4]
       = T² − 7T + 2.
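A quick machine check of this computation and of the Cayley–Hamilton identity χ_A(A) = O for this A (a sketch; the helper matmul is ours):

```python
# Verify χ_A(T) = T^2 - 7T + 2 for A = [[3, 2], [5, 4]] and the
# Cayley-Hamilton identity A^2 - 7A + 2I = O, which makes R^2 a
# torsion R[T]-module for this action of T.
A = [[3, 2], [5, 4]]

trace = A[0][0] + A[1][1]                      # coefficient of -T
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]        # constant term
assert (trace, det) == (7, 2)                  # so χ_A(T) = T^2 - 7T + 2

def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A2 = matmul(A, A)
# A^2 - 7A + 2I should be the zero matrix.
CH = [[A2[i][j] - 7*A[i][j] + 2*(1 if i == j else 0) for j in range(2)]
      for i in range(2)]
assert CH == [[0, 0], [0, 0]]
```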
Example 6.2. If instead we take A = I₂ = [1 0; 0 1], then we can make R² into an R[T]-module by letting
T act as the identity matrix: f(T) · v = f(I₂)v. Now
f(I₂) = [f(1) 0; 0 f(1)] = f(1)I₂,
so f(I₂)v = f(1)v is just a scalar multiple of v. Thus the only place that the new scalar multiplication
by R[T] can move a given vector is to an R-multiple of itself. Therefore R² in this
new R[T]-module structure is not a cyclic module as it was in the previous example. But R² is
still finitely generated as an R[T]-module (the standard basis is a spanning set) and it is a torsion
module since (T − 1)v = I₂v − v = 0 (so all elements are killed by T − 1).
Studying linear operators on Rⁿ from the viewpoint of torsion modules over R[T] is the key
to unlocking the structure of matrices in a conceptual way, because the structure theory for finite
abelian groups carries over to finitely generated torsion modules over a PID (like R[T]). For
⁶When we substitute A for T in a polynomial f(T), the constant term c₀ becomes c₀I₂.
example, every finitely generated torsion module over a PID is a direct sum of cyclic modules,
generalizing the fact that any finite abelian group is a direct sum of cyclic groups.
Let’s now revisit the topic of isomorphisms of modules, this time with vector spaces over a
field F viewed as F [T ]-modules using an F -linear operator in the role of T -multiplication. Say
A, B ∈ Matₙ(F) where F is a field and n ≥ 1. Generalizing Examples 6.1 and 6.2, we can view
V = Fⁿ as an F[T]-module in two ways, by letting the action of T on V be A or B:
(6.1) f (T ) · v = f (A)(v)
or
(6.2) f (T ) · v = f (B)(v)
for f(T) ∈ F[T]. Let V_A be V with scalar multiplication by F[T] as in (6.1) (so T · v = Av) and let
V_B be V with scalar multiplication by F[T] as in (6.2) (T · v = Bv). Whether or not V_A and V_B
are isomorphic F[T]-modules turns out to be equivalent to whether or not A and B are conjugate
matrices:
An F[T]-module isomorphism ϕ : V_A → V_B is a bijection satisfying
ϕ(v + v′) = ϕ(v) + ϕ(v′) and ϕ(f(T) · v) = f(T) · ϕ(v)
for all v, v′ ∈ V and f(T) ∈ F[T]. The second equation is the same as ϕ(f(A)v) = f(B)ϕ(v).
Polynomials are sums of monomials and knowing multiplication by T determines multiplication
by Tⁱ for all i ≥ 1, so the above conditions on ϕ are equivalent to
ϕ(v + v′) = ϕ(v) + ϕ(v′), ϕ(cv) = cϕ(v), ϕ(Av) = Bϕ(v)
for all v and v′ in V and c in F. The first two equations say ϕ is F-linear and the last says
ϕ(Av) = Bϕ(v) for all v ∈ V. So ϕ : V → V is an F-linear bijection satisfying ϕ(Av) = Bϕ(v).
Since V = Fⁿ, every F-linear map ϕ : V → V is a matrix transformation: for some U ∈ Matₙ(F),
ϕ(v) = Uv.
Indeed, if there were such a matrix U then letting v run over the standard basis e₁, ..., eₙ tells us the
i-th column of U is ϕ(eᵢ), so turn around and define U to be the matrix [ϕ(e₁) ··· ϕ(eₙ)] ∈ Matₙ(F)
having i-th column ϕ(eᵢ). Then ϕ and U have the same values on the eᵢ’s and both are linear on
Fⁿ, so they have the same value at every vector in Fⁿ. Since ϕ is a bijection, U is invertible, i.e.,
U ∈ GLₙ(F). Now the condition ϕ(Av) = Bϕ(v) for all v ∈ V means
U(Av) = B(Uv) ⟺ Av = U⁻¹BUv
for all v ∈ V = Fⁿ. Letting v = e₁, ..., eₙ tells us that A and U⁻¹BU have the same i-th column
for all i, so they are the same matrix: A = U⁻¹BU, so B = UAU⁻¹.
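The equivalence just proved can be illustrated numerically. In the sketch below (with matrices of our choosing, not from the text), U ∈ GL₂(Z) conjugates A to B, and ϕ(v) = Uv intertwines the two T-actions:

```python
# If B = U A U^{-1} then phi(v) = Uv satisfies phi(Av) = B phi(v).
# Integer sketch with U in GL_2(Z) (det U = 1, so U^{-1} is integral).
def matmul(X, Y):
    n, k, m = len(X), len(Y), len(Y[0])
    return [[sum(X[i][t]*Y[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

A = [[3, 2], [5, 4]]
U = [[1, 1], [0, 1]]
Uinv = [[1, -1], [0, 1]]              # inverse of U
assert matmul(U, Uinv) == [[1, 0], [0, 1]]

B = matmul(matmul(U, A), Uinv)        # B = U A U^{-1}
assert matmul(U, A) == matmul(B, U)   # UA = BU, i.e. phi intertwines A and B

v = [[7], [-2]]                       # a sample column vector
assert matmul(U, matmul(A, v)) == matmul(B, matmul(U, v))
```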
Conversely, suppose B = UAU⁻¹ for some U ∈ GLₙ(F), and set ϕ(v) = Uv, an F-linear bijection of V.
For ϕ to be an F[T]-module isomorphism from V_A to V_B we need
ϕ(f(T) · v) = f(T) · ϕ(v), i.e., ϕ(Tⁱ · v) = Tⁱ · ϕ(v),
for all v ∈ V and for i ≥ 0. For this to hold, it suffices to check ϕ(T · v) = T · ϕ(v) for all v ∈ V.
This last condition just says ϕ(Av) = Bϕ(v) for all v ∈ V. Since B = UAU⁻¹, UA = BU, so
ϕ(Av) = U(Av) = (UA)v = (BU)v = Bϕ(v), as desired.
The same reasoning with the field F replaced by a commutative ring R shows that two R[T]-module
structures on Rⁿ, with T acting as A or as B in Matₙ(R), are isomorphic if
and only if B = UAU⁻¹ for some U ∈ GLₙ(R).⁷ These two module structures on Rⁿ are torsion
thanks to the Cayley–Hamilton theorem in Matₙ(R). Thus the conjugation problem in Matₙ(R)
for a general commutative ring R is a special case of the isomorphism problem for finitely generated
torsion R[T]-modules. Unfortunately, R[T] when R is not a field is usually too complicated for
there to be a nice classification of the finitely generated torsion R[T]-modules. Even the case when
R = F[X] for a field F, so that R[T]-modules can be thought of as F[X, Y]-modules, is very hard.
For example, see https://math.stackexchange.com/questions/641169/.
⁷The group GLₙ(R) of invertible matrices consists of exactly those U ∈ Matₙ(R) where det U ∈ R×, which is not
the same as the condition det U ≠ 0 except when R is a field (precisely the case when R× = R − {0}).