
PMath 334 Notes - Rings and Fields

Laurent W. Marcoux

April 14, 2024


Preface

Preface to the Second Edition - April 14, 2024

A number of typos from the first edition have now been corrected. Presumably,
many others remain, and there may even be new ones! Please read the preface
below, and bring any remaining typos/errors to my attention.
In particular, I would like to thank the following people for already having
brought some typos and/or errors to my attention: T. Dieudonné, W. Dong, M.
Hürlimann, Z. Kun, M. Nguyen, J. Nordmann, A. Okell, N. Saloojee, S. Steel, J.
Tzoganakis, Z. Wang, and X. Yin.

Preface to the First Edition - April 19, 2021

The following is a set of class notes for the PMath 334 course I am currently
teaching at the University of Waterloo in January, 2021. They are a work in progress,
and – this being the “first edition” – they are replete with typos. A student should
approach these notes with the same caution with which he or she would approach buzz saws;
they can be very useful, but you should be alert and thinking the whole time you have
them in your hands. Enjoy.

Just one short comment about the Exercises at the end of each chapter. These
are of varying degrees of difficulty. Some are much easier than Assignment Questions,
and some are of a comparable level of difficulty. Those few Exercises that are marked by
three asterisks are definitely worth doing, as they are crucial to understanding the
underlying concepts. The marked exercises are also of varying levels of difficulty, but
it is better for the reader to discover some things on his/her own, since the reader
will then understand and retain those things better. Besides, the only way to learn
mathematics is to do mathematics.
In our humble opinion, an excellent approach to reading these notes is as follows.
● One first gathers the examples of rings from the second and third chapters.
One then reads the statements of the theorems/propositions/corollaries,
etc., and interprets those results for each of those examples. The purpose
of the theory is to understand and unify the examples.

● To learn the proofs, we recommend that one read the statement of a given
theorem or proposition, and try to prove the result oneself. If one gets
stuck at a certain point in the proof, one reads the proof until one gets past
that point, and then one resumes the process of proving the result oneself.
Also, one should keep in mind that if one doesn’t know where to start, one can
always start with the definition, which means that one always knows where to start.
Just saying.

I strongly recommend that the reader consult other textbooks as well as these
notes. As ridiculous as this may sound, there are other people who can write as well
as if not better than your humble author, and it is important that the reader find the
source which best suits the reader. Moreover, by consulting multiple sources, the
reader will discover results not covered in any single reference. I shall only mention
two, namely the book of I.N. Herstein [Her69], and a wonderful little book on Galois
(and Field) Theory by I. Stewart [Ste04].

I would like to thank the following people for bringing some typos to my atten-
tion: T. Cooper, J. Demetrioff, R. Gambhir, A. Murphy (more than once), K. Na,
and B. Virsik. Any remaining typos and mistakes are the fault of my colleagues...
well.... ok, maybe they’re partly my fault too.
Finally, I would like to thank Mr. Nic Banks for providing me with a short list of
examples of Euclidean domains beyond the much shorter and apparently universally
standard list which appears in so many elementary ring theory textbooks I consulted.

April 14, 2024



The reviews are in!

He is a writer for the ages, the ages of four to eight.


Dorothy Parker

This paperback is very interesting, but I find it will never replace a


hardcover book - it makes a very poor doorstop.
Alfred Hitchcock

It was a book to kill time for those who like it better dead.
Rose Macaulay

That’s not writing, that’s typing.


Truman Capote

Only the mediocre are always at their best.


Jean Giraudoux
Contents

Preface
Chapter 1. A brief overview
    1. Groups
    Supplementary Examples
    Appendix
    Exercises for Chapter 1
Chapter 2. An introduction to Rings
    1. Definitions and basic properties
    2. Polynomial Rings – a first look
    3. New rings from old
    4. Basic results
    Supplementary Examples
    Appendix
    Exercises for Chapter 2
Chapter 3. Integral Domains and Fields
    1. Integral domains - definitions and basic properties
    2. The characteristic of a ring
    3. Fields - an introduction
    Supplementary Examples
    Appendix
    Exercises for Chapter 3
Chapter 4. Homomorphisms, ideals and quotient rings
    1. Homomorphisms and ideals
    2. Ideals
    3. Cosets and quotient rings
    4. The Isomorphism Theorems
    Supplementary Examples
    Appendix
    Exercises for Chapter 4
Chapter 5. Prime ideals, maximal ideals, and fields of quotients
    1. Prime and maximal ideals
    2. From integral domains to fields
    Supplementary Examples
    Appendix
    Exercises for Chapter 5
Chapter 6. Euclidean Domains
    1. Euclidean Domains
    2. The Euclidean algorithm
    3. Unique Factorisation Domains
    Supplementary Examples
    Appendix
    Exercises for Chapter 6
Chapter 7. Factorisation in polynomial rings
    1. Divisibility in polynomial rings over a field
    2. Reducibility in polynomial rings over a field
    3. Factorisation of elements of Z[x] over Q
    Supplementary Examples
    Appendix
    Exercises for Chapter 7
Chapter 8. Vector spaces
    1. Definition and basic properties
    Supplementary Examples
    Appendix
    Exercises for Chapter 8
Chapter 9. Extension fields
    1. A return to our roots (of polynomials)
    Supplementary Examples
    Appendix
    Exercises for Chapter 9
Chapter 10. Straight-edge and Compasses constructions
    1. An ode to Wantzel
    2. Enter fields
    3. Back to geometry
    Appendix
    Exercises for Chapter 10
Appendix A. The Axiom of Choice
    1. Introduction
    2. An apology for the Axiom of Choice
    3. The prosecution rests its case
    4. So what to do?
    5. Equivalences
Bibliography
Index
CHAPTER 1

A brief overview

Somewhere on this globe, every ten seconds, there is a woman giving


birth to a child. She must be found and stopped.
Sam Levenson

1. Groups
1.1. Pure Mathematics is, we would argue, the study of mathematical objects
and structures, and of the relationships between these. We seek to understand and
classify these structures. This is a very vague statement, of course. What kinds of
objects and structures are we dealing with, and what do we mean by “relationships”?
The objects and structures are many and varied. Vector spaces, continuous func-
tions, integers, real and complex numbers, groups, rings, fields, algebras, topological
spaces and manifolds are just a very tiny sample of the cornucopia of objects which
come under the purview of modern mathematics. In dealing with various members
of a single category – and here we are not using category in the strict mathematical
sense, for categories are also a structure unto themselves – we often use functions
or morphisms between them to see how they are related. For example, an injective
function f from a non-empty set A to a non-empty set B suggests that B must have
at least as many elements as A. (In fact, we take this as the definition of “at least
as many elements” when A and B are infinite!)

1.2. The two principal structures we shall examine in this course are rings and
fields. The typical approach in most books begins with the definition of a ring, for
example, and then produces a (hopefully long) list of examples of rings to demon-
strate their importance. This process is, however, essentially the inverse of how
such a concept develops in the first place. In practice, one normally starts with a
long list of examples of objects, each of which has its own intrinsic value. Through
inspiration and hard work, one may discover certain commonalities between those
objects, and one may observe that any other object which shares that commonality
will – by necessity – behave in a certain way as suggested by the initial objects of
interest. The advantage of doing this is that if one can prove that certain behaviour
is a result of the commonality, as opposed to being specific to one of the examples,

then one may deduce that such behaviour will apply to all objects in this collection
and beyond.
But enough of generalities. Let us illustrate our point through an example. Since
both rings and fields exhibit a group structure under addition, let us momentarily
digress to examine these.

1.3. Example. Consider the following list of structures.


(a) Z = {. . . , −2, −1, 0, 1, 2, 3, . . .}, the set of integers. Note that Z admits a
binary operation, called addition, which is denoted by +. To say that
addition is a binary operation on Z means that + is actually a function on
the set of ordered pairs of integers Z × Z with range included in Z. That is,
+∶ Z×Z → Z
(a, b) ↦ a + b.
We also say that Z is closed under addition.
(b) Consider next the set N = {1, 2, 3, . . .} of natural numbers. Again, ad-
dition, also denoted by +, is a binary operation on N: that is, we have a
function
+∶ N×N → N
(a, b) ↦ a + b.
(c) Let n ∈ N, and consider next the set of n × n complex matrices Mn (C).
Given A = [ai,j ] and B = [bi,j ] in Mn (C), we can define the sum of A and
B to be
A + B ∶= [ai,j + bi,j ].
In a similar manner to the first two examples, we find that Mn (C) is closed
under addition.
(d) Let
C([0, 1], R) = {f ∶ [0, 1] → R ∶ f is continuous}.
Given f, g ∈ C([0, 1], R), we may define their sum f + g via
(f + g)(x) ∶= f (x) + g(x) for all x ∈ [0, 1].
From first-year calculus, we know that since f and g are each continuous
on [0, 1], so is f + g. Once again, we say that C([0, 1], R) is closed under
addition.
(e) Let Q ∶= {p/q ∶ p, q ∈ Z, q ≠ 0} denote the set of rational numbers, and set
Q∗ ∶= Q ∖ {0}. Given r1 = p1 /q1 , r2 = p2 /q2 ∈ Q∗ , the product r1 ⋅ r2 = (p1 p2 )/(q1 q2 ) ∈ Q∗ .
Thus Q∗ is closed under multiplication.
(f) If we set Q+ ∶= {r ∈ Q ∶ r > 0}, then Q+ is again closed under multiplication.

Examples (a), (c), (d), (e) and (f) have (at least the following) two things in
common:

(i) in each case - there is a neutral element; that is, an element z in the set
which satisfies a + z = a = z + a for all a in the original set.
For example, in the case of Z, we simply set z = 0, whereas for Mn (C),
we need to set z = [0i,j ], the zero matrix. In the case of C([0, 1], R), we
would set z to be the zero function, that is, the function z(x) = 0, x ∈ [0, 1].
In the case of Q∗ and Q+ , the number 1 = 1/1 serves as a neutral element
under multiplication.
Note that N does not possess a neutral element. Regardless of which
z ∈ N one chooses, n + z ≠ n for any n ∈ N.
(ii) Note also that for each element a in each of the first, third and fourth
examples, there is an element b such that a + b = z = b + a, where z is the
neutral element from item (i) above. This coincides with what we normally
refer to as the negative or the additive inverse of a. For example, in the
case of f ∈ C([0, 1], R), we set −f to be the function (−f )(x) = −(f (x)),
x ∈ [0, 1]. Again – that −f is continuous when f is continuous is proven in
a first-year Calculus course.
Given r = p/q ∈ Q∗ (in using this notation we are assuming that p, q ∈ Z),
we note that p ≠ 0 ≠ q, and so s ∶= q/p ∈ Q∗ , and sr = 1 = rs. That is, the
product of r and s yields the neutral element from above. Of course, we
normally refer to s as the (multiplicative) inverse of r.
Once again, N fails to have this property.
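For readers who like to experiment, properties (i) and (ii) for (Q∗ , ⋅) can be checked with exact rational arithmetic. The following Python sketch (our illustration, not part of the notes) uses the standard-library Fraction type; the particular value r = −3/7 is an arbitrary choice.

```python
from fractions import Fraction

# Property (i): 1 = 1/1 is a neutral element for (Q*, .)
one = Fraction(1, 1)
r = Fraction(-3, 7)            # an arbitrary nonzero rational r = p/q
assert r * one == r == one * r

# Property (ii): every nonzero r = p/q has the inverse s = q/p
s = Fraction(r.denominator, r.numerator)
assert r * s == one == s * r
```

Exact fractions matter here: with floating-point numbers, r * s might only approximate 1.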

1.4. Example. The next example appears, at least on the surface, rather dif-
ferent from those above. Let ∆ denote an equilateral triangle, with vertices labelled
1, 2 and 3.
figure 1. ∆ (an equilateral triangle with vertex 1 at the top, vertex 3 at the bottom left, and vertex 2 at the bottom right)
Let us think of ∆ as sitting in the x, y-plane inside the 3-dimensional
real vector space R3 . We equip R3 with its usual Euclidean distance (or metric) as
follows: if x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ) ∈ R3 , the distance between x and y is
defined as

d(x, y) = √(∣x1 − y1 ∣2 + ∣x2 − y2 ∣2 + ∣x3 − y3 ∣2 ).
We first consider isometries of the triangle. By an isometry, we mean a map
φ ∶ ∆ → R3 such that d(x, y) = d(φ(x), φ(y)) for all x, y ∈ ∆. That is, φ preserves distance.

We then define a symmetry of ∆ to be an isometry of ∆ with the property that
φ(∆) = {φ(x) ∶ x ∈ ∆} = ∆.
For example, we might rotate the triangle ∆ clockwise through 0 radians, 2π/3
radians, or through 4π/3 radians about its centre. Let us call these three symmetries
ϱ0 , ϱ1 and ϱ2 respectively.
figure 2. the action of ϱ2 (the triangle with 1 at the top, 3 at the bottom left and 2 at the bottom right is carried to the triangle with 2 at the top, 1 at the bottom left and 3 at the bottom right)
We might also fix one of the vertices, for example the top vertex (labelled 1 in
Figure 1), and reflect ∆ along the vertical line from the top vertex to the midpoint
of the horizontal line segment connecting the leftmost vertex (labelled 3 in Figure
1) to the rightmost vertex (labelled 2 in Figure 1). Let us call this symmetry φ1 ;
denote the symmetry similarly obtained by fixing vertex 2 by φ2 , and denote that
obtained by fixing vertex 3 by φ3 .
figure 3. the action of φ1 (the triangle with 1 at the top, 3 at the bottom left and 2 at the bottom right is carried to the triangle with 1 at the top, 2 at the bottom left and 3 at the bottom right)
We have so far discovered six distinct symmetries of ∆. Are there others?

A moment’s thought will hopefully convince the reader that any symmetry must
take distinct vertices of ∆ to distinct vertices of ∆. There are only six ways of doing
this: once we have chosen where to send vertex 1, there remain two choices for where
to send vertex 2, and then vertex 3 must be sent to the remaining vertex. The total
number of choices is therefore 6 = 3⋅2⋅1. Since we already have produced six distinct
symmetries above, we indeed have them all. Let us denote by symm(∆) the set
symm(∆) = {ϱ0 , ϱ1 , ϱ2 , φ1 , φ2 , φ3 }
of all symmetries of the equilateral triangle ∆.

Here is where it gets interesting. Suppose that we choose to compose two of
these symmetries, that is: first we perform one symmetry, and then another. For
example, we might choose to first do ϱ2 , and then φ1 . What would the result look
like?
We can answer this by considering what the maps do to each of the vertices.
Note that ϱ2 sends vertex 1 to vertex 3, vertex 2 to vertex 1 and vertex 3 to vertex
2. We abbreviate this to

ϱ2 (1) = 3, ϱ2 (2) = 1, ϱ2 (3) = 2.

Similarly, φ1 takes vertex 1 to vertex 1, vertex 2 to vertex 3, and vertex 3 to vertex
2, which we may write as

φ1 (1) = 1, φ1 (2) = 3, φ1 (3) = 2.

Thus
φ1 ○ ϱ2 (1) = φ1 (ϱ2 (1)) = φ1 (3) = 2,
and similarly φ1 ○ ϱ2 (2) = 1, φ1 ○ ϱ2 (3) = 3.
figure 4. The action of φ1 ○ ϱ2 (the triangle with 1 at the top, 3 at the bottom left and 2 at the bottom right is carried to the triangle with 2 at the top, 3 at the bottom left and 1 at the bottom right).
But this is precisely what happens if we simply apply φ3 to ∆! In other words,

φ1 ○ ϱ2 = φ3 .

In analogy to the three examples from Example 1.3, a quick calculation shows
that ϱ0 serves as a neutral element for symm(∆). Also, given any symmetry α in
symm(∆), there exists a second symmetry β such that β ○ α = ϱ0 = α ○ β.
For example, if α = ϱ1 , then we may set β = ϱ2 , while if α = φ1 , then we choose
β = φ1 as well.
So, in terms of its behaviour using composition as the binary operation, symm(∆)
behaves more like (Z, +) than (N, +) does!
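Since each symmetry is determined by its action on the vertices, the computations above can be verified mechanically. Here is a Python sketch (our illustration, not part of the notes); the variable names rho0, rho2, phi1, etc. mirror the symbols ϱ0, ϱ2, φ1 of the text, and each symmetry is stored as the tuple of images of the vertices 1, 2, 3.

```python
rho0 = (1, 2, 3)   # the identity rotation
rho2 = (3, 1, 2)   # rho2(1) = 3, rho2(2) = 1, rho2(3) = 2, as in the text
phi1 = (1, 3, 2)   # fixes vertex 1 and swaps vertices 2 and 3
phi2 = (3, 2, 1)   # fixes vertex 2
phi3 = (2, 1, 3)   # fixes vertex 3

def compose(f, g):
    """(f o g)(i) = f(g(i)), with symmetries stored as tuples of images."""
    return tuple(f[g[i] - 1] for i in range(3))

rho1 = compose(rho2, rho2)           # rotating twice by 4*pi/3 is rotating by 2*pi/3

assert compose(phi1, rho2) == phi3   # phi1 o rho2 = phi3, as computed above
assert compose(rho2, phi1) == phi2   # but rho2 o phi1 = phi2
assert compose(rho1, rho2) == rho0   # rho1 and rho2 are mutually inverse
assert compose(phi1, phi1) == rho0   # each reflection is its own inverse
```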

1.5. Each of the sets we have defined in Example 1.3 and Example 1.4 is inter-
esting in its own right. But now we have discovered that, except for the example of
the natural numbers, they all share a few properties in common. It is this kind of
analysis that leads mathematicians to make definitions such as the following.

1.6. Definition. A non-empty set G equipped with a binary operation


⋅∶G×G→G
is said to be a group if the following conditions hold:
(a) the operation ⋅ is associative: that is,
(g ⋅ h) ⋅ k = g ⋅ (h ⋅ k) for all g, h, k ∈ G.
(b) There exists an element e ∈ G, called a neutral element or an identity
element for G such that
e ⋅ g = g = g ⋅ e for all g ∈ G.
(c) Given g ∈ G, there exists an element h ∈ G such that
g ⋅ h = e = h ⋅ g,
where e is the neutral element of G defined above. The element h is referred
to as an inverse of g.
The group G is said to be abelian if g ⋅ h = h ⋅ g for all g, h ∈ G. It is customary to
use “+” to denote the binary operation for abelian groups (instead of “⋅”).

1.7. Remark. Since a group consists of the non-empty set G along with the
binary operation ⋅, we normally write (G, ⋅) to denote a group. Informally, when
the binary operation is understood and there is no risk of confusion, we also refer
to a group G, and we even drop the ⋅ from the notation, writing gh to denote g ⋅ h.

We shall leave it as an exercise for the reader to show that if G is a group, then
it admits exactly one identity element e, and that given g ∈ G, the element h ∈ G
satisfying g ⋅ h = e = h ⋅ g is also uniquely defined. For this reason, we speak of the
identity element of a group G, as well as the inverse of g ∈ G. We also adopt the
notation g −1 to denote the unique inverse of g.

1.8. Example. Although we have not verified the associativity of addition in
each of the examples from Example 1.3, this can indeed be done, and so it follows
that (Z, +), (Mn (C), +) and (C([0, 1], R), +) are all (abelian) groups under addition.
Furthermore, (symm(∆), ○) from Example 1.4 is a group under composition.
The fact that ϱ2 ○ φ1 = φ2 ≠ φ3 = φ1 ○ ϱ2 means that symm(∆) is not abelian.
The set N of natural numbers fails to be a group under addition, since it admits
neither a neutral element, nor inverses.

1.9. Definition. A subgroup H of a group (G, ⋅) is a non-empty subset of G
which is a group using the binary operation ⋅ inherited from G.

1.10. Example. Let H = 2Z = {2m ∶ m ∈ Z}. Then H is a subgroup of (Z, +).



1.11. Example. Recall that symm(∆) = {ϱ0 , ϱ1 , ϱ2 , φ1 , φ2 , φ3 } as defined in
Example 1.4 is a group using composition ○.
We leave it to the reader to verify that H ∶= {ϱ0 , ϱ1 , ϱ2 } is a group under ○, and
thus H is a subgroup of symm(∆).
1.12. Example. The set Q+ ∶= {q ∈ Q ∶ q > 0} is a group under multiplication.
While Q+ is a subset of Q, it is not a subgroup of (Q, +), since we are not using
the same binary operation. That is, in the first case we are using multiplication as
our operation, while in the case of (Q, +), we are using addition. Since 1 ∈ Q+ but
−1 ∈/ Q+ , we see that Q+ is not closed under addition, and so (Q+ , +) is not a group.
On the other hand, (Q+ , ⋅) is a subgroup of (Q∗ , ⋅), where Q∗ = {q ∈ Q ∶ q ≠ 0}.

Supplementary Examples
S1.1. Example. Let n ∈ N. Let GLn (C) ∶= {T ∈ Mn (C) ∶ T is invertible}, and
let ⋅ represent usual matrix multiplication.
Then (GLn (C), ⋅) is a group, and the n × n identity matrix

In = diag(1, 1, . . . , 1) ∈ Mn (C)

is the identity of this group. We refer to GLn (C) as the general linear group in Mn (C).
Also,
Un (C) ∶= {U ∈ Mn (C) ∶ U is unitary}
is a group, called the unitary group of Mn (C).
S1.2. Example. Let n ∈ N. Then SLn (C) ∶= {T ∈ Mn (C) ∶ det T = 1} is a
group, again using usual matrix multiplication as the group operation.
This group is referred to as the special linear group of Mn (C).
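That SLn (C) is closed under multiplication comes from the multiplicativity of the determinant, det(AB) = det(A) det(B), so a product of determinant-one matrices again has determinant one. The following Python sketch (our illustration, not part of the notes) makes this concrete for two 2 × 2 integer matrices.

```python
def mat_mul(A, B):
    """Product of 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = [[1, 1], [0, 1]]   # det = 1
B = [[1, 0], [1, 1]]   # det = 1
C = mat_mul(A, B)
assert det2(C) == det2(A) * det2(B) == 1   # so C again lies in SL_2
```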
S1.3. Example. More generally, let Ω ⊆ T ∶= {z ∈ C ∶ ∣z∣ = 1} be a multiplicative
group. Let
Dn (Ω) ∶= {T ∈ Mn (C) ∶ det T ∈ Ω}.
Then Dn (Ω) is a group, using usual matrix multiplication as the group operation.
S1.4. Example. The set of rational numbers Q is an abelian group under
addition, while the set Q∗ of non-zero rational numbers is an abelian group under
multiplication. As we have seen, the set Q+ of strictly positive rational numbers is
also an abelian group under multiplication.
S1.5. Example. Let (V, +, ⋅) be a vector space over R. (Here, ⋅ refers to scalar
multiplication.) Then (V, +) is an abelian group.
In particular, given a natural number n, (Rn , +) is an additive, abelian group.
This explains why the examples given in Example 1.3 are (abelian) groups.
S1.6. Example. Amongst the most important classes of groups are the so-
called permutation groups.
Let ∅ ≠ X be a non-empty set. A permutation of X is a bijective function
f ∶ X → X. Typically, if X = {1, 2, 3, . . . , n} for some natural number n, then
we use symbols such as σ or ϱ to denote permutations. We refer to the set of all
permutations of X as the symmetric group of X and (here we shall) denote it by
Σ(X). The group operation is composition of functions.
Note that the identity map ι ∶ X → X given by ι(x) = x, x ∈ X is a bijection.
Given two bijections f, g of X, their composition f ○ g is again a bijection, and
f ○ ι = f = ι ○ f . Thus ι serves as the identity of Σ(X) under composition. Also,
since f is a bijection, given y ∈ X, we may write y = f (x) for a unique choice of
x ∈ X. Thus we may define the bijection f −1 ∶ X → X via f −1 (y) = x. Clearly
f −1 ○ f = ι = f ○ f −1 , so that f −1 is the inverse for f under composition.
In general, Σ(X) is non-abelian. (For which sets X is it abelian?)

A permutation group is any subset of Σ(X) which is itself a group under
composition.

The fact that we refer to the set of permutations as the symmetric group, and
that we refer to the group symm(∆) from Example 1.4 as a group of symmetries is
not merely a coincidence. As we observed in the discussion of that Example, any
symmetry of ∆ is really just a permutation of the vertices, and any permutation of
the vertices leads to a symmetry of ∆.
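The parenthetical question above (for which sets X is Σ(X) abelian?) can at least be explored by brute force when X is finite. The sketch below (our Python illustration, not part of the notes) generates all permutations of an n-element set and tests whether composition commutes.

```python
from itertools import permutations

def sigma_is_abelian(n):
    """Brute-force test of commutativity for Sigma({1, ..., n})."""
    perms = list(permutations(range(n)))       # all bijections of an n-set
    def comp(f, g):                            # (f o g)(i) = f(g(i))
        return tuple(f[g[i]] for i in range(n))
    return all(comp(f, g) == comp(g, f) for f in perms for g in perms)

# The evidence suggests Sigma(X) is abelian exactly when X has at most 2 elements.
assert [n for n in range(1, 5) if sigma_is_abelian(n)] == [1, 2]
```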
S1.7. Example. The set Zn = {0, 1, 2, . . . , n − 1} is an abelian group under
addition modulo n. It is an interesting exercise to prove that if p ∈ N is a prime
number, then every group G with p elements behaves exactly like Zp under addition.
Technically speaking, (Zp , +) is the unique group of order p (i.e. with p elements)
up to group isomorphism.
To be explicit, if p ∈ N is a prime number and G = {g1 , g2 , . . . , gp } is a group
using the binary operation ⋅, then there exists a bijective map ϱ ∶ Zp → G such
that ϱ(a + b) = ϱ(a) ⋅ ϱ(b) for all a, b ∈ Zp . Such a map is referred to as a group
isomorphism. The proof of this for p = 2, 3 or p = 5 is certainly within reach.
We shall have more to say about so-called homomorphisms of groups in Chap-
ter 4.
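While the isomorphism claim requires a proof, the group axioms for (Zn , +) themselves can be confirmed exhaustively for small n. The following Python sketch (our illustration, not part of the notes) checks closure, associativity, the neutral element 0, and inverses by brute force.

```python
def is_group_mod_n(n):
    """Exhaustively verify the group axioms for Z_n = {0, ..., n-1} under + mod n."""
    G = range(n)
    add = lambda a, b: (a + b) % n
    closed   = all(add(a, b) in G for a in G for b in G)
    assoc    = all(add(add(a, b), c) == add(a, add(b, c))
                   for a in G for b in G for c in G)
    neutral  = all(add(a, 0) == a == add(0, a) for a in G)
    inverses = all(any(add(a, b) == 0 for b in G) for a in G)
    return closed and assoc and neutral and inverses

assert all(is_group_mod_n(n) for n in (2, 3, 5, 12))
```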
S1.8. Example. If (G, ⋅) and (H, ∗) are groups, then so is G × H ∶= {(g, h) ∶
g ∈ G, h ∈ H} under the operation ●, where
(g1 , h1 ) ● (g2 , h2 ) ∶= (g1 ⋅ g2 , h1 ∗ h2 ).
We leave the verification of this to the reader.
S1.9. Example. The discrete Heisenberg group H3 (Z) is the set of 3 × 3
matrices of the form
⎡1 a b ⎤
⎢0 1 c ⎥ ,
⎣0 0 1 ⎦
where a, b, c ∈ Z, and where
⎡1 a b ⎤ ⎡1 x y ⎤   ⎡1 a + x y + az + b⎤
⎢0 1 c ⎥ ⎢0 1 z ⎥ = ⎢0   1       c + z ⎥ .
⎣0 0 1 ⎦ ⎣0 0 1 ⎦   ⎣0   0         1   ⎦
If one replaces Z by R, one obtains the continuous Heisenberg group.
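The displayed multiplication rule is a straightforward computation; the following Python sketch (our illustration, not part of the notes) double-checks it for one sample choice of integer entries.

```python
def heis(a, b, c):
    """The Heisenberg matrix with upper-triangular entries a, b, c."""
    return [[1, a, b], [0, 1, c], [0, 0, 1]]

def mat_mul3(A, B):
    """Product of 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

a, b, c, x, y, z = 2, -1, 3, 5, 4, -2        # arbitrary sample integers
lhs = mat_mul3(heis(a, b, c), heis(x, y, z))
rhs = heis(a + x, y + a * z + b, c + z)      # the formula from the text
assert lhs == rhs
```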
S1.10. Example. The set T ∶= {z ∈ C ∶ ∣z∣ = 1} is a group, using usual complex
multiplication as the operation.

Appendix
A1.1. Niels Henrik Abel (5 August 1802 - 6 April 1829) is known for a wide
variety of results that bear his name. Perhaps his most famous result is the proof
that there is no general formula in radicals for the roots of a polynomial p of degree
five over the real numbers. (The well-known quadratic formula does exactly this
for polynomials of degree two. In the case of polynomials of degree three or four,
solutions in radicals also exist, but they are not as simple, and are consequently
less well known.)

Abel lived in poverty and died of tuberculosis. The Abel Prize in mathematics
is named in his honour. (There is no Nobel Prize in mathematics.) Given how
prestigious and merited the Nobel Peace Prize always is, a prize in the world of
mathematics is... (we leave it to the reader to complete this sentence).

Exercises for Chapter 1

Exercise 1.1.
Let (G, ⋅) be a group. Prove that if e, f ∈ G and gf = ge = g = eg = f g for all
g ∈ G, then e = f . That is, prove that the identity element of a group is unique.

Exercise 1.2.
Let (G, ⋅) be a group. Prove that if g, h1 , and h2 ∈ G and if gh2 = gh1 = e = h1 g =
h2 g, then h1 = h2 . This shows that the inverse of g is unique, allowing us to refer to
the inverse of g, which we then denote by g −1 .

Exercise 1.3.
Let (G, ⋅) be a group. Suppose that g 2 ∶= g ⋅ g = e for all g ∈ G. Prove that G is
abelian.

Exercise 1.4.
In Example 1.4, we determined that (symm(∆), ○) is a non-abelian group with
six elements, and we determined all of its elements by observing what each symmetry
does to the vertices of the triangle ∆.

(a) Determine (symm(◻), ○), the group of symmetries of a square – in particular,
how many elements must it have?
(b) Is this group abelian?

Exercise 1.5.
Let Γ denote an isosceles triangle which is not equilateral. Determine
(symm(Γ), ○).

Exercise 1.6.
A permutation of a non-empty set X is a bijective function f ∶ X → X. Denote
by Σ(X) the set of all permutations of X. As we saw in Example S1.6, using
composition as the operation: f ○ g(x) ∶= f (g(x)) for all x ∈ X, we have that
(Σ(X), ○) is a group.
In the case where X = {1, 2, . . . , n}, we may write a permutation by specifying
its value at each of the elements of X as a 2 × n matrix: for example,
σ = [  1    2    3   ⋯  n − 1   n
      a1   a2   a3   ⋯  an−1    an ]
represents the permutation σ ∈ Σ(X) that satisfies σ(j) = aj , 1 ≤ j ≤ n.
(a) What is the relationship (if any) between Σ({1, 2, 3}) and symm(∆)?
(b) What is the relationship (if any) between Σ({1, 2, 3, 4}) and symm(◻)?

Exercise 1.7.
Let Q∗ ∶= {q ∈ Q ∶ q ≠ 0}. Let ⋅ represent the usual product of rational numbers.
Prove that (Q∗ , ⋅) is an abelian group.

Exercise 1.8.
For those of you with a background in linear algebra: let V be a vector space
over C and let
L(V) ∶= {T ∶ V → V ∶ T is linear}.
Set G(V) ∶= {T ∈ L(V) ∶ T is invertible}. Using composition of linear maps ○ as the
binary operation, prove that (G(V), ○) is a group.

Exercise 1.9.
Let (G, ⋅) be a group and g, h and k ∈ G. Suppose that
g ⋅ h = g ⋅ k.
Prove that h = k.

We say that the cancellation law holds for groups.

Exercise 1.10.
Let n ∈ N and let ω ∶= e2πi/n ∈ C. Let Ωn ∶= {1, ω, ω 2 , . . . , ω n−1 }. Prove that Ωn
is an abelian group under complex multiplication, and that if n is prime, then the
only subgroups of Ωn are {1} and Ωn itself.
CHAPTER 2

An introduction to Rings

I can speak Esperanto like a native.


Spike Milligan

1. Definitions and basic properties


1.1. As we indicated in the first Chapter, the method of writing down a defi-
nition of a mathematical structure followed by examples is not the way that such
structures are typically discovered. The reason for teaching the material in this
order, however, is expediency. It is simply that much faster.
Granted an unlimited amount of time, it would be instructive to approach an
entire course through carefully selected examples which gradually lead the students
to discover on their own the desired definitions, propositions and theorems. Over
a period of twelve weeks, however, this is not practical, and so we submit to the
common, if more prosaic, practice of definition, example, theorem and proof.
The student whose primary goal is to learn the material (as opposed to the student
whose primary goal is to obtain a good grade) would be well-served to spend as
much time as their other courses allow to understand what each of the results we
shall prove means in the examples we give, as well as in examples that they should
endeavour to discover on their own.
Let us keep in mind that the purpose of the theory is to try to unite and un-
derstand the examples, which should be important in their own right. We are not
trying to motivate the examples using the definitions!

1.2. Definition. A ring R is a non-empty set equipped with two binary oper-
ations, known as addition (denoted by +), and multiplication (denoted by ⋅, or
simply by juxtaposition of two elements). The operations must satisfy (A1) - (A5)
and (M1)-(M4) below:
(A1) a + b ∈ R for all a, b ∈ R.
(A2) (a + b) + c = a + (b + c) for all a, b, c ∈ R.
(A3) There exists a neutral element 0 ∈ R such that
a + 0 = a = 0 + a for all a ∈ R.

(A4) Given a ∈ R, there exists an element b ∈ R such that


a + b = 0 = b + a.
(A5) a + b = b + a for all a, b ∈ R.
Taken together, conditions (A1) - (A4) are the statement that (R, +) is a group, and
condition (A5) is the statement that (R, +) is in fact abelian.
We also require that
(M1) a ⋅ b ∈ R for all a, b ∈ R.
(M2) a ⋅ (b ⋅ c) = (a ⋅ b) ⋅ c for all a, b, c ∈ R.
(M3) a ⋅ (b + c) = (a ⋅ b) + (a ⋅ c) for all a, b, c ∈ R.
(M4) (a + b) ⋅ c = (a ⋅ c) + (b ⋅ c) for all a, b, c ∈ R.
We say that R is unital if there exists an element 1 ≠ 0 in R such that 1 ⋅ a = a =
a ⋅ 1 for all a ∈ R. This element 1 is also referred to as a multiplicative identity.

1.3. Remark. Condition (M1) above states that R is closed under multiplica-
tion, (M2) says that multiplication is associative on R, while conditions (M3) and
(M4) state that multiplication is left and right distributive with respect to addition.
We point out that some authors require that all rings be unital. It is true that
given a non-unital ring, there is a way of attaching a multiplicative identity to it
(this will eventually appear as an Assignment question), however we shall refrain
from doing so because the abstract construction is not always natural in terms of
the (non-unital) rings we shall consider.
We point out that the decision to require that 1 ≠ 0 is to ensure that R = {0} is
not considered a unital ring.

1.4. Example. Given an abelian group (G, +), we can always turn it into a
ring by defining g1 ⋅ g2 = 0 for all g1 , g2 ∈ G.

1.5. Examples. In some cases, the examples of (abelian) groups we studied in
Chapter 1 were actually examples of rings, as we shall now see.
(a) In Example 1.1.3, we saw that (Z, +) is an abelian group under addition.
That (Z, ⋅) satisfies (M1)-(M4) is a standard result. Thus (Z, +, ⋅) is a ring.
Since 1 ∈ Z and 1 ⋅ a = a = a ⋅ 1 for all a ∈ Z, (Z, +, ⋅) is a unital ring.
(b) Again, from Example 1.1.3, Mn (C) is an abelian group under addition.
Given A = [ai,j ] and B = [bi,j ] ∈ Mn (C), recall from linear algebra that we
define
A ⋅ B ∶= [ci,j ],
where for each 1 ≤ i, j ≤ n, we set ci,j = ∑_{k=1}^{n} ai,k bk,j . That (Mn (C), +, ⋅)
satisfies (M1)-(M4) is a standard exercise from linear algebra. Note that
1 ∶= In , the n × n identity matrix, serves as a multiplicative identity for
(Mn (C), +, ⋅).

(c) Yet again, as in Example 1.1.3, we set


C([0, 1], R) = {f ∶ [0, 1] → R ∶ f is continuous}.
As we saw there, (C([0, 1], R), +) is an abelian group using (point-wise)
defined addition. Given f, g ∈ C([0, 1], R), we may define their product to
be f g, where
(f g)(x) ∶= f (x)g(x) for all x ∈ [0, 1].
That (C([0, 1], R), +, ⋅) is a (unital) ring with these operations is left as an
exercise for the reader.

1.6. Example. Let n ∈ N. Recall from Math 135 that we define


Zn ∶= {0, 1, 2, . . . , n − 1},
equipped with addition +̇ and multiplication ⊗, modulo n. That is, given a, b ∈ Zn ,
we define
a+̇b = (a + b) mod n
and
a ⊗ b ∶= ab mod n.
For example, if n = 12, then Z12 = {0, 1, 2, . . . , 10, 11} and
8+̇9 = 17 mod 12 = 5,
while
8 ⊗ 10 = 80 mod 12 = 8.
Of course, in practice, we keep the “mod n” in our heads, and we use the usual
notation + for addition and ⋅ (or juxtaposition) for multiplication. Thus we typically
write
8+9=5
and
8(10) = 8 ⋅ 10 = 8
in Z12 .
We leave it to the reader to verify that (Zn , +, ⋅) is a ring. Note that it is
commutative.
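For readers who enjoy experimenting, arithmetic in Zn is easy to check by machine. The following Python sketch (not part of the formal development; the `%` operator performs reduction mod n) reproduces the two computations in Z12 above:

```python
n = 12  # working in Z_12

def add_mod(a, b):
    # addition in Z_n: add in Z, then reduce modulo n
    return (a + b) % n

def mul_mod(a, b):
    # multiplication in Z_n: multiply in Z, then reduce modulo n
    return (a * b) % n
```

Here add_mod(8, 9) returns 5 and mul_mod(8, 10) returns 8, in agreement with the computations above.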

1.7. Examples. The following are also rings, and the verification of this is left
to the reader.
(a) (C, +, ⋅). The complex numbers.
(b) (R, +, ⋅). The real numbers.
(c) (Q, +, ⋅). The rational numbers.
16 2. AN INTRODUCTION TO RINGS

(d) (7Z, +, ⋅). Here, you will recall that


7Z ∶= {7x ∶ x ∈ Z} = {. . . , −14, −7, 0, 7, 14, . . .}.
The addition and multiplication that we define on 7Z are those inherited
from the fact that 7Z ⊆ Z.
More generally, for any k ∈ Z, kZ ∶= {km ∶ m ∈ Z} is a ring. This ring is
unital if and only if ∣k∣ = 1; i.e. if and only if k ∈ {−1, 1}.
(e) (Z[√2], +, ⋅), where Z[√2] ∶= {a + b√2 ∶ a, b ∈ Z}.
(f) (Q[i], +, ⋅), where Q[i] ∶= {a + bi ∶ a, b ∈ Q}, and where i2 = −1.

1.8. Example. A similar construction allows us to define an example of a


group ring.
Let (G, ⋅) be a group, and R be a ring. We define the group ring R⟨G⟩ as follows:
the elements of R⟨G⟩ are formal finite sums of the form
∑_{j=1}^m rj gj ,

where rj ∈ R and gj ∈ G, 1 ≤ j ≤ m.
The sum of two elements of R⟨G⟩ is performed as follows: let a = ∑_{j=1}^p rj gj and
b = ∑_{i=1}^q si hi be two elements of R⟨G⟩. Set E ∶= {g1 , g2 , . . . , gp } ∪ {h1 , h2 , . . . , hq }.
After a change of notation, we may rewrite E = {y1 , y2 , . . . , yk } where all of the yj ’s
are distinct (and k ≤ p + q). We may then write a = ∑_{j=1}^k aj yj and b = ∑_{j=1}^k bj yj ,
noting that it is entirely possible that aj = 0 or that bj = 0 for some of the j’s.
The purpose of this was to write both a and b as a combination of the same set
of group elements.
Then
a + b = (∑_{j=1}^k aj yj ) + (∑_{j=1}^k bj yj ) ∶= ∑_{j=1}^k (aj + bj )yj .

Multiplication is performed by allowing elements of R to commute with elements


of G:
(∑_{j=1}^p rj gj ) (∑_{i=1}^q si hi ) = ∑_{j=1}^p ∑_{i=1}^q rj si (gj hi ).

Let’s have a look at this in action through two concrete examples of group rings.
(a) Let G = {e, ω, ω 2 }, and equip G with the multiplication determined by the
following table:

⋅     e     ω     ω2
e     e     ω     ω2
ω     ω     ω2    e
ω2    ω2    e     ω

The verification that G is a group is left to the reader. Suppose that


R = Q, the rational numbers. To see what addition and multiplication
look like in Q⟨G⟩, consider the following. A general element of Q⟨G⟩ looks
like a = r1 ⋅ e + r2 ω + r3 ω 2 , where r1 , r2 , r3 ∈ Q. Suppose also that b =
s1 e + s2 ω + s3 ω 2 ∈ Q⟨G⟩. Then
a + b = (r1 e + r2 ω + r3 ω 2 ) + (s1 e + s2 ω + s3 ω 2 )
= (r1 + s1 ) e + (r2 + s2 ) ω + (r3 + s3 ) ω 2 ,

and
a ⋅ b = (r1 e + r2 ω + r3 ω 2 )(s1 e + s2 ω + s3 ω 2 )
= r1 s1 e2 + r1 s2 e ⋅ ω + r1 s3 e ⋅ ω 2 + r2 s1 ω ⋅ e + r2 s2 ω 2 + r2 s3 ω 3
+ r3 s1 ω 2 ⋅ e + r3 s2 ω 3 + r3 s3 ω 4
= (r1 s1 + r2 s3 + r3 s2 ) e + (r1 s2 + r2 s1 + r3 s3 ) ω + (r1 s3 + r2 s2 + r3 s1 ) ω 2 .
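The multiplication rule in Q⟨G⟩ is mechanical, and can be automated. In the Python sketch below (a helper of our own, with integer coefficients standing in for rationals), an element r1 e + r2 ω + r3 ω2 is stored as the list [r1, r2, r3], indexed by the exponent of ω:

```python
def qg_mul(a, b):
    # product in Q<G> for G = {e, w, w^2}: store r1*e + r2*w + r3*w^2 as [r1, r2, r3]
    c = [0, 0, 0]
    for j in range(3):
        for i in range(3):
            # w^j * w^i = w^((j+i) mod 3), since w^3 = e
            c[(j + i) % 3] += a[j] * b[i]
    return c
```

One checks, for instance, that qg_mul([1, 2, 3], [4, 5, 6]) returns [31, 31, 28], in agreement with the coefficient formula displayed above.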
(b) As a second example, let n ∈ N. A matrix P ∈ Mn (C) is said to be a
permutation matrix if every row and every column of P contains exactly
one “1”, and zeroes everywhere else. The reason for this nomenclature is
that if we view elements of Mn (C) as linear transformations on Cn written
with respect to the standard orthonormal basis B ∶= {e1 , e2 , . . . , en }, then a
permutation matrix corresponds to a linear transformation which permutes
the basis B; i.e. P ej = eσ(j) , where σ ∶ {1, 2, . . . , n} → {1, 2, . . . , n} is a
bijection.
Consider the set Pn of all n × n permutation matrices. We leave it to
the reader to prove that Pn is a group. In the case where n = 3, we find
that
P3 = {P1 , P2 , P3 , P4 , P5 , P6 }
= { [ 1 0 0 ; 0 1 0 ; 0 0 1 ], [ 0 1 0 ; 0 0 1 ; 1 0 0 ], [ 0 0 1 ; 1 0 0 ; 0 1 0 ],
[ 1 0 0 ; 0 0 1 ; 0 1 0 ], [ 0 0 1 ; 0 1 0 ; 1 0 0 ], [ 0 1 0 ; 1 0 0 ; 0 0 1 ] },
where [ r1 ; r2 ; r3 ] lists the rows of a 3 × 3 matrix.


A general element of the group ring Z⟨P3 ⟩ looks like
a = ∑_{j=1}^6 aj Pj ,

where aj ∈ Z, 1 ≤ j ≤ 6. As an example of a product in this ring, consider


(2P2 + 4P5 )(3P3 + 1P4 − 7P5 )
= 6P2 P3 + 2P2 P4 − 14P2 P5 + 12P5 P3 + 4P5 P4 − 28P5 P5
= 6P1 + 2P5 − 14P6 + 12P6 + 4P2 − 28P1
= −22P1 + 4P2 + 2P5 − 2P6 .
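This product can also be verified numerically. The Python sketch below (helper names are ours; the six matrices are the ones listed above, encoded as tuples of rows) multiplies the permutation matrices directly and collects coefficients:

```python
from itertools import product

def mat_mul(A, B):
    # 3x3 integer matrix product
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

P = {1: ((1,0,0),(0,1,0),(0,0,1)), 2: ((0,1,0),(0,0,1),(1,0,0)),
     3: ((0,0,1),(1,0,0),(0,1,0)), 4: ((1,0,0),(0,0,1),(0,1,0)),
     5: ((0,0,1),(0,1,0),(1,0,0)), 6: ((0,1,0),(1,0,0),(0,0,1))}

def zp3_mul(a, b):
    # a, b: elements of Z<P3> stored as {index j: coefficient of P_j}
    c = {}
    for (j, rj), (i, si) in product(a.items(), b.items()):
        M = mat_mul(P[j], P[i])
        k = next(m for m in P if P[m] == M)  # identify P_j P_i within the group
        c[k] = c.get(k, 0) + rj * si
    return c

# (2 P2 + 4 P5)(3 P3 + 1 P4 - 7 P5)
result = zp3_mul({2: 2, 5: 4}, {3: 3, 4: 1, 5: -7})
```

The dictionary result comes out as {1: −22, 2: 4, 5: 2, 6: −2}, matching the hand computation above.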

1.9. Example. Let V be a vector space over K, where K = R or C. Let


L(V) ∶= {T ∶ V → V ∶ T is K-linear}.
Then (L(V), +, ○) is a ring. That is, given T, R ∈ L(V), we define T + R to be the
map satisfying (T + R)(v) = T v + Rv for all v ∈ V, while T ○ R is defined through
composition, namely T ○ R(v) = T (Rv) for all v ∈ V.

1.10. Remark. Notice that in the example above, L(V) is not only a ring, but
it is in fact a vector space over K. We say that it is an algebra over K.

As a second example, note that (C([0, 1], R), +, ⋅) is an algebra over R.

2. Polynomial Rings – a first look


2.1. One particular class of rings which will be of special importance to us is
the class of polynomial rings. Later, we shall devote an entire chapter to studying
factorisation in polynomial rings with coefficients in a field.

2.2. Definition.
(a) Let n ∈ N and x1 , x2 , . . . , xn be a finite collection of abstract symbols (also
called indeterminates). The set {x1 , x2 , . . . , xn } is often called an alpha-
bet. A word in {x1 , x2 , . . . , xn } is a formal product (i.e. a concatenation)
W ∶= xi1 xi2 ⋯xik
where k ∈ N and ij ∈ {1, 2, . . . , n} for all 1 ≤ j ≤ k. That is, it is just a
concatenation of finitely many elements of the alphabet. The length of the
word W above is defined to be k. By convention, we define W0 to be the
empty word, with the understanding that W0 W = W = W W0 for any word
W ∈ W.
Let us denote by Wk the set of all words of length k in {x1 , x2 , . . . , xn },
and by W the set
W ∶= ∪_{k=0}^∞ Wk .

Given a commutative, unital ring R, we then define


R⟨x1 , x2 , . . . , xn ⟩ ∶= {r0 ⋅ 1 + ∑_{j=1}^m rj Wj ∶ m ∈ N, rj ∈ R and Wj ∈ W, 1 ≤ j ≤ m},

and refer to this as the ring of polynomials in n non-commuting


variables x1 , x2 , . . . , xn over the commutative, unital ring R. Alter-
natively, the terminology R⟨x1 , x2 , . . . , xn ⟩ is the ring of polynomials in
n non-commuting variables with coefficients in R is often used.

Given A ∶= ∑_{j=1}^p rj Wj and B ∶= ∑_{i=1}^q si Vi ∈ R⟨x1 , x2 , . . . , xn ⟩, we set
A ⋅ B = (∑_{j=1}^p rj Wj ) ⋅ (∑_{i=1}^q si Vi ) = ∑_{j=1}^p ∑_{i=1}^q rj si Wj Vi .

For example, if n = 3, then W1 ∶= x3 x1 x1 x2 x3 x2 x2 = x3 x1² x2 x3 x2² defines


an element of W7 . Also,
(5x1 x3 x2 + 3x2 x3 )(−x2 + 3x3 x1 )
= −5x1 x3 x2² + 15x1 x3 x2 x3 x1 − 3x2 x3 x2 + 9x2 x3² x1 .
Observe that 1 ⋅ W0 serves as the multiplicative identity element of
R⟨x1 , x2 , . . . , xn ⟩, which means that R⟨x1 , x2 , . . . , xn ⟩ is unital.
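As an informal sketch of how this multiplication can be mechanised, one may store a polynomial in non-commuting variables as a dictionary from words (tuples of symbols) to coefficients; words then multiply by concatenation. The Python fragment below (helper names are ours) reproduces the displayed product:

```python
def free_mul(A, B):
    # A, B: dicts mapping words (tuples of variable names) to coefficients
    C = {}
    for W, r in A.items():
        for V, s in B.items():
            WV = W + V  # the product of two words is their concatenation
            C[WV] = C.get(WV, 0) + r * s
    return C

A = {('x1', 'x3', 'x2'): 5, ('x2', 'x3'): 3}  # 5 x1 x3 x2 + 3 x2 x3
B = {('x2',): -1, ('x3', 'x1'): 3}            # -x2 + 3 x3 x1
C = free_mul(A, B)
```

Since words are never reordered, the same code works whether or not the coefficient ring R is commutative.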
(b) We can also insist as part of our hypotheses above that the indeterminates
x1 , x2 , . . . , xn must commute with one another – i.e. that xi xj = xj xi for
all 1 ≤ i, j ≤ n. In this case, we write
R[x1 , x2 , . . . , xn ]
and call this the ring of polynomials in n commuting variables x1 , x2 ,
. . . , xn over the commutative, unital ring R.
Note that if n = 1, then R⟨x1 ⟩ = R[x1 ] is just the ring of polynomials in x1 . In
this setting, the expression “commuting variables” is extraneous.

2.3. The case where n = 1 will be very close to our hearts. Given a commutative,
unital ring R, we have defined the ring of polynomials in x with coefficients in R to
be the set of formal symbols
R[x] = R⟨x⟩ = {p0 W0 + p1 x + ⋯ + pm xm ∶ m ≥ 1, pk ∈ R, 0 ≤ k ≤ m}.
In practice, we typically do one of two things:
● either we drop the W0 notation and simply write p0 + p1 x + ⋯ + pm xm , or
● we denote W0 by x0 and write p0 x0 + p1 x + ⋯ + pm xm .
In these notes, we will tend to drop the W0 altogether. Hopefully, this should
not cause any confusion, as the notation agrees with the usual notation (that we’ve
all come to know and love) for polynomials in x.
Two elements p0 + p1 x + ⋯ + pm xm (where pm ≠ 0) and q0 + q1 x + ⋯ + qn xn (where
qn ≠ 0) are considered equal if and only if n = m and pk = qk , 0 ≤ k ≤ n.
As we have just seen, R[x] is a ring using the operations (say m ≤ n)
(p0 + p1 x + ⋯ + pm xm ) + (q0 + q1 x + ⋯ + qn xn )
= (p0 + q0 ) + (p1 + q1 )x + ⋯ + (pm + qm )xm + qm+1 xm+1 + ⋯ + qn xn ,

and
(p0 + p1 x + ⋯ + pm xm )(q0 + q1 x + ⋯ + qn xn )
= p0 q0 + (p0 q1 + p1 q0 )x1 + (p0 q2 + p1 q1 + p2 q0 )x2 + ⋯ + pm qn xn+m .
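The multiplication rule above is exactly the convolution of coefficient sequences, which makes for a short sketch in Python (coefficients stored as lists, p[k] being the coefficient of x^k; the helper name is ours):

```python
def poly_mul(p, q):
    # product in R[x]: r[k] = sum over i + j = k of p[i] * q[j]
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r
```

For instance, (1 + 2x)(3 + x + x²) = 3 + 7x + 3x² + 2x³, and indeed poly_mul([1, 2], [3, 1, 1]) returns [3, 7, 3, 2].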

2.4. Terminology. Given 0 ≠ p(x) ∶= p0 + p1 x + p2 x2 + ⋯ + pm xm ∈ R[x], we


define the degree of p(x) to be
deg p(x) = max{k ∶ pk ≠ 0}.
If p(x) = 0, then the degree of p(x) is undefined.
Note that any polynomial r(x) of degree at most m ≥ 1 can be written as
r(x) = r0 + r1 x + ⋯ + rm xm by adding terms whose coefficients are equal to zero if
necessary. Thus, for example, given a polynomial r(x) of degree 2, we may always
write r(x) = r0 + r1 x + r2 x2 + 0x3 + 0x4 .
(1) If deg p(x) = m and pm = 1, we say that p(x) is a monic polynomial;
(2) if p(x) = p0 , we say that p(x) is a constant polynomial.
Thus if p(x) and q(x) are polynomials of degree m ≥ 0 and n ≥ 0 respectively,
we find that either p(x) + q(x) = 0 or
deg(p(x) + q(x)) ≤ max(deg(p(x)), deg(q(x))),
while
deg(p(x)q(x)) ≤ deg(p(x)) + deg(q(x)).
The reason that we need not have equality in either of the two inequalities above
is that
● if n = m and pm = −qm , then p(x) + q(x) has degree strictly less than m (or
is undefined when p(x) + q(x) = 0), while
● if pm qn = 0, then p(x)q(x) has degree strictly less than m + n.
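The second degree drop genuinely occurs when R has zero divisors. Here is a small Python check in Z12, reusing the convolution rule for polynomial products with coefficients reduced mod 12 (the helper names are ours):

```python
def poly_mul_mod(p, q, n):
    # product in Z_n[x]: convolve coefficient lists and reduce mod n
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] = (r[i + j] + pi * qj) % n
    return r

def degree(p):
    nz = [k for k, c in enumerate(p) if c != 0]
    return max(nz) if nz else None  # the degree of the zero polynomial is undefined

# in Z12: (1 + 2x)(1 + 6x) = 1 + 8x + 12x^2 = 1 + 8x, so the degree is 1 < 1 + 1
prod = poly_mul_mod([1, 2], [1, 6], 12)
```

The leading coefficients 2 and 6 multiply to 12 ≡ 0 in Z12, which is exactly the situation described in the second bullet point.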

3. New rings from old


3.1. We have seen a number of examples of rings. Let us now examine a few
constructions that allow us to build new rings from these.
3.2. Example. Let R be a ring.
(a) Let n ∈ N. Then
Mn (R) = {[ri,j ] ∶ ri,j ∈ R, 1 ≤ i, j ≤ n}
is a ring, using usual matrix addition and multiplication (and the product
and sum from R).
(b) Let n ∈ N. Then
Tn (R) = {T = [ti,j ] ∶ ti,j ∈ R, 1 ≤ i, j ≤ n, ti,j = 0 if j < i}
is a ring, again using usual matrix addition and multiplication.

(c) Let ∅ ≠ X be a set. Then


RX ∶= {f ∶ X → R ∶ f a function}
is a ring, using point-wise addition and multiplication.

3.3. Example.
(a) Direct products. Let Λ be a non-empty set, which may be finite or
infinite, countable or uncountable. Suppose that for each λ ∈ Λ, we are
given a ring Rλ . We define the direct product of the family {Rλ ∶ λ ∈ Λ}
to be the set
∏_{λ∈Λ} Rλ ∶= {(rλ )λ ∶ rλ ∈ Rλ , λ ∈ Λ}.

Thus, if Λ = {1, 2}, then


∏_{λ∈{1,2}} Rλ = R1 × R2 = {(r1 , r2 ) ∶ r1 ∈ R1 , r2 ∈ R2 }

is the usual Cartesian product of R1 and R2 .


We define addition and multiplication co-ordinate-wise:
(rλ )λ + (sλ )λ ∶= (rλ + sλ )λ ,
and
(rλ )λ ⋅ (sλ )λ ∶= (rλ ⋅ sλ )λ .

An equivalent formulation of this definition is to first let R ∶= ∪λ∈Λ Rλ


(as a set, with no operations). We then define
∏_{λ∈Λ} Rλ ∶= {f ∶ Λ → R ∶ f (λ) ∈ Rλ for all λ ∈ Λ}.

Any such function f is referred to as a choice function for Λ. Writing


rλ ∶= f (λ), λ ∈ Λ and (rλ )λ for f , we arrive at the equivalence of the two
formulations.
(For those with a knowledge of the Axiom of Choice, we point out
that since 0λ ∈ Rλ for each λ, the zero function f (λ) = 0λ lies in ∏λ∈Λ Rλ ≠ ∅
without appealing to this axiom. In other words, we are not dealing with
Bertrand Russell’s socks here, but rather with his shoes.)
(b) Direct sums. As before, we let Λ be a non-empty set, which may be
finite or infinite, countable or uncountable. Suppose that for each λ ∈ Λ,
we are given a ring Rλ . We define the direct sum of the family {Rλ ∶ λ ∈ Λ}
to be the set
⊕_{λ∈Λ} Rλ ∶= {(rλ )λ ∈ ∏_{λ∈Λ} Rλ ∶ rλ = 0 for all but finitely many λ ∈ Λ}.

Again, we define addition and multiplication co-ordinate-wise.



Observe that if Λ is finite (for example if Λ = {1, 2, . . . , n} for some


n ∈ N), then
∏_{λ∈Λ} Rλ = ∏_{j=1}^n Rj = ⊕_{j=1}^n Rj = {(r1 , r2 , . . . , rn ) ∶ rj ∈ Rj , 1 ≤ j ≤ n}.

Also, if Rλ = R for all λ ∈ Λ, then


∏_{λ∈Λ} Rλ = ∏_{λ∈Λ} R = R^Λ ,
as defined in Example 2.3.2(c).

3.4. Example. Suppose that R1 = R, R2 = Q, R3 = Z4 and R4 = M2 (Z). For


each of these rings, we shall use the usual operations of addition and multiplication.
Let Λ ∶= {1, 2, 3, 4}. Then
∏_{λ∈Λ} Rλ ∶= R1 × R2 × R3 × R4
= R × Q × Z4 × M2 (Z)
= {(x, q, a, T ) ∶ x ∈ R, q ∈ Q, a ∈ Z4 , T ∈ M2 (Z)}.
Since the set Λ is finite,
∏_{λ∈Λ} Rλ = ⊕_{λ∈Λ} Rλ .
It is important to keep in mind that Λ is an ordered set. That is, if we define
Ω = {2, 4, 1, 3}, then
∏_{ω∈Ω} Rω ∶= R2 × R4 × R1 × R3
= Q × M2 (Z) × R × Z4
= {(q, T, x, a) ∶ x ∈ R, q ∈ Q, a ∈ Z4 , T ∈ M2 (Z)}
= ⊕ω∈Ω Rω ,
but
∏_{λ∈Λ} Rλ ≠ ∏_{ω∈Ω} Rω .

3.5. Example. Suppose that R = (Z, +, ⋅). Then


∏_{n∈N} R = ∏_{n∈N} Z = {(z1 , z2 , z3 , . . .) ∶ zn ∈ Z, n ≥ 1},
and
⊕n∈N R = ⊕n∈N Z = {(z1 , z2 , z3 , . . .) ∶ zn ∈ Z, n ≥ 1, {k ∈ N ∶ zk ≠ 0} is finite}.
Thus e ∶= (1, 1, 1, . . .) ∈ ∏n∈N Z, but e ∈/ ⊕n∈N Z.
We add and multiply terms coordinate-wise. Thus if z = (z1 , z2 , z3 , . . .) and
w = (w1 , w2 , w3 , . . .) ∈ ∏n∈N Z, then
z + w = (z1 + w1 , z2 + w2 , z3 + w3 , . . .),

and
z ⋅ w = (z1 w1 , z2 w2 , z3 w3 , . . .).
It follows that the element e = (1, 1, 1, . . .) defined above is the identity element
of ∏n∈N Z. We leave it as an exercise for the reader to prove that ⊕n∈N Z is non-unital.
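The non-unitality exercise can be explored informally: any candidate identity u ∈ ⊕n∈N Z has finite support, so it fails to act as an identity on a sequence whose nonzero entry lies outside that support. Here is a small Python model (elements of the direct sum stored as dictionaries of their finitely many nonzero coordinates; the encoding is ours):

```python
def ds_mul(a, b):
    # coordinate-wise product of finitely supported sequences {index: nonzero value}
    return {n: a[n] * b[n] for n in set(a) & set(b) if a[n] * b[n] != 0}

u = {0: 1, 1: 1, 2: 1}  # a candidate identity, supported only on coordinates {0, 1, 2}
z = {5: 7}              # nonzero only in coordinate 5
```

Here ds_mul(u, z) is the zero element rather than z, so u is not an identity; the same argument defeats every finitely supported candidate.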

3.6. Example. Let R be a ring and X ⊆ R. Let us define


RX ∶= ∩{S ⊆ R ∶ X ⊆ S and S is a ring}.
Then RX is a ring, and in fact it is the smallest ring in R that contains X, in the
sense that if T is any ring satisfying X ⊆ T ⊆ R, then RX ⊆ T .
We say that RX is the ring generated by X.

3.7. Example. Let R = M2 (R), and write [ a b ; c d ] for the 2 × 2 matrix with rows
(a, b) and (c, d). Let Y = {T1 ∶= [ 1 0 ; 0 2 ], T2 ∶= [ 0 1 ; 0 0 ]}. If S ⊆
M2 (R) is a ring and T1 ∈ S, then T1² ∈ S, and therefore X1 ∶= 2T1 − T1² = [ 1 0 ; 0 0 ] ∈ S.
From this we see that X2 ∶= T1 − X1 = [ 0 0 ; 0 2 ] ∈ S as well. This in turn implies that
[ m 0 ; 0 0 ] = mX1 ∶= X1 + X1 + ⋯ + X1 (m times)
and nX2 , pT2 ∈ S for all m, n, p ∈ N. Since (additive) inverses are also in S, we see
that
W ∶= {[ m p ; 0 2n ] ∶ m, n, p ∈ Z} ⊆ S.
Thus
W ⊆ ∩{V ⊆ R ∶ Y ⊆ V and V is a ring}.
By definition, W consists of all upper-triangular 2×2 matrices whose (1, 1) and (1, 2)
entries are arbitrary integers, and whose (2, 2) entry is an arbitrary even integer.
To see that W = RY , it suffices to know that W is a ring. We could go through
the entire list of conditions from Definition 2.2 above, but it will be much easier to
apply the Subring Test below (see Proposition 4.10).
That W ≠ ∅ is obvious from its definition. If W1 ∶= [ m1 p1 ; 0 2n1 ] and W2 ∶= [ m2 p2 ; 0 2n2 ] ∈ W, then
W1 − W2 = [ m1 − m2 , p1 − p2 ; 0 , 2(n1 − n2 ) ] ∈ W,
and
W1 W2 = [ m1 m2 , m1 p2 + p1 (2n2 ) ; 0 , 2(2n1 n2 ) ] ∈ W.

By the Subring Test, W is a ring, and therefore it must be the smallest ring that
contains Y , i.e., W = RY .

4. Basic results
4.1. We now have at hand a large number of rings. It is time that we prove our
first results. We shall adopt the common notation whereby we express multiplication
through juxtaposition, that is: we write ab to mean a ⋅ b. We also introduce the
notation a − b to mean a + (−b).

4.2. Proposition. Let (R, +, ⋅) be a ring and a, b, c ∈ R. Then


(a) If a + b = a + c, then b = c.
(b) a0 = 0 = 0a.
(c) a(−b) = −(ab) = (−a)b.
(d) (−a)(−b) = ab.
(e) a(b − c) = ab − ac, and (a − b)c = ac − bc.
If R is unital, then
(f) (−1)a = −a, and
(g) (−1)(−1) = 1.
Proof.
(a) This is the cancellation law that was stated as Exercise 1.9.
Let’s prove this. Indeed, recall that (R, +) is an abelian group. Since
a ∈ R, there exists an element x ∈ R such that x + a = 0, the neutral element
of (R, +).

Then a + b = a + c implies that


b = 0 + b = (x + a) + b = x + (a + b) = x + (a + c) = (x + a) + c = 0 + c = c.
Observe that associativity of addition was crucial to this proof.
(b) Note that a0 + 0 = a0 = a(0 + 0) = a0 + a0. By the cancellation law, a0 = 0.
The proof that 0a = 0 is similar, and is left as an exercise.
(c) Now 0 = a0 = a(b + (−b)) = ab + a(−b). Since the additive inverse of ab is
unique (Exercise 1.2), it follows that a(−b) = −(ab). Again, the proof that
(−a)b = −(ab) is similar and is left as an exercise.
(d) By item (c) above,
(−a)(−b) = −(a(−b)) = −(−(ab)).
That is, (−a)(−b) is the additive inverse of −(ab). But ab is also the additive
inverse of −(ab), and so by uniqueness,
(−a)(−b) = ab.

(e) Now, using item (c) above,


a(b − c) = a(b + (−c)) = ab + a(−c) = ab + (−(ac)) = ab − ac.
Also,
(a − b)c = (a + (−b))c = ac + (−b)c = ac − bc.
(f) We see from item (c) above that
(−1)a = −(1a) = −a.
(g) From item (d),
(−1)(−1) = (1)(1) = 1.

4.3. Proposition. Let (R, +, ⋅) be a ring.


(a) If R has a multiplicative identity, then it is unique.
(b) If x ∈ R has a multiplicative inverse, then it is unique.
Proof.
(a) Suppose that e, f ∈ R and that er = f r = r = rf = re for all r ∈ R, so that
both e and f are multiplicative identities for R.
Then
e = ef = f.
(b) Suppose that r ∈ R and that x, y ∈ R satisfy ry = yr = 1 = xr = rx. Then
y = (1)y = (xr)y = x(ry) = x(1) = x.

4.4. Remark. Proposition 4.3 is what allows us to adopt the notation 1 to


denote the unique multiplicative identity of R (when it exists), and to adopt the
notation x−1 for the unique multiplicative inverse of x (when it exists).

4.5. Definition. Let (R, +, ⋅) be a unital ring. An element r ∈ R is said to be


invertible if it admits a multiplicative inverse. That is, r is invertible if there exists
s ∈ R such that rs = 1 = sr.
We denote the set of invertible elements of R by inv(R).
By Remark 4.4, when r is invertible, the element s satisfying sr = 1 = rs is unique
and is denoted by r−1 .
4.6. Examples.
(a) The only invertible elements of (Z, +, ⋅) are 1 and −1, with inverses 1 and
−1 respectively.
(b) Every non-zero element of R is invertible (with respect to the usual multi-
plication of real numbers).
(c) From linear algebra, we recall that if n ≥ 1 is an integer, then a matrix
T ∈ Mn (R) is invertible if and only if its determinant is non-zero.

4.7. Definition. A non-empty subset S of a ring (R, +, ⋅) is called a subring


of R if (S, +, ⋅) is a ring using the addition and multiplication operations
inherited from R.

4.8. Example. Let S = 2Z ∶= {. . . , −4, −2, 0, 2, 4, . . .} denote the set of all even
integers. We leave it as an exercise for the reader to show that 2Z is a subring of Z.

4.9. Remark. In defining a subring S of a ring R, it is crucial that we maintain


the same operations as those used in R. For example, we know that (Z, +, ⋅) is a
ring using the usual multiplication and addition operations. By Example 2.3.2(a),
M2 (Z) is also a ring.
Let S ∶= {[ 0 r ; s 0 ] ∶ r, s ∈ Z}, and for A = [ 0 r1 ; s1 0 ] and B = [ 0 r2 ; s2 0 ], define
A + B = [ 0 , r1 + r2 ; s1 + s2 , 0 ]
(this is the usual addition) and
A ∗ B ∶= [ 0 0 ; 0 0 ].
It is not difficult to verify that (S, +, ∗) is a ring, but it is not a subring of (M2 (Z), +, ⋅),
because we have changed the definition of multiplication in S.
Note that under the usual definition of multiplication (say ⋅) in M2 (Z), S1 ∶= [ 0 1 ; 0 0 ]
and S2 ∶= [ 0 0 ; 1 0 ] ∈ S, but S1 ⋅ S2 = [ 1 0 ; 0 0 ] ∈/ S. Thus (S, +, ⋅) is not a ring, and
therefore it cannot be a subring of (M2 (Z), +, ⋅).

4.10. Proposition. (The Subring Test) Let ∅ ≠ S ⊆ R, where (R, +, ⋅) is


a ring. Then S is a subring of R if and only if a − b ∈ S and ab ∈ S for all a, b ∈ S.
Proof.
Suppose first that S is a subring of R. If a, b ∈ S, then, since S is a ring, −b ∈ S
and a − b = a + (−b) ∈ S. Also, a, b ∈ S implies that ab ∈ S.

Next, suppose that S ⊆ R, and that a − b, ab ∈ S for all a, b ∈ S. Let a, b, c ∈ S.


Then – since the addition and multiplication on S are those inherited from R, it
follows that (a + b) + c = a + (b + c) and a(bc) = (ab)c, since a, b, c ∈ R and addition
and multiplication are associative in R.
Similarly, multiplication distributes over addition in S because it does in R, and
we are using the same addition and multiplication. Furthermore, a + b = b + a since
this holds in R.
Now 0 = a + (−a) ∈ S by hypothesis, so S has a neutral element under addition.
Also, 0, a ∈ S implies that 0 + (−a) = −a ∈ S, so S includes the additive inverses of
each of its elements. Finally, a + b = a − (−b) ∈ S, so that S is closed under addition.

Thus (S, +) is an abelian group, closed under multiplication, which distributes


over addition. That is, (S, +, ⋅) is a ring under the operations inherited from R, and
so it is a subring of (R, +, ⋅).

4.11. Remark. We point out that in the Subring Test, it is crucial that we
show that a − b ∈ S (together with ab ∈ S), rather than merely that a + b ∈ S and ab ∈ S, for all a, b ∈ S.
Consider N = {1, 2, 3, . . .}, equipped with the usual addition and multiplication
it inherits as a subset of the ring Z. For m, n ∈ N, clearly m + n ∈ N and mn ∈ N.
But (N, +, ⋅) is not a ring because – for example – 2 ∈ N, but −2 ∈/ N. Alternatively,
(N, +, ⋅) is not a ring because it does not have a neutral element under addition - i.e.
0 ∈/ N.
4.12. Examples.
(a) Let R ∶= (C([0, 1], R), +, ⋅) and S = {f ∈ R ∶ f (x) = 0 for all x ∈ [1/2, 8/9]}.
Then S is a subring of R.
(b) For any ring R and any natural number n, Tn (R) is a subring of Mn (R).
(c) Q is a subring of R, and Z is a subring of Q and of R.
(d) Z5 is not a subring of Z10 . (Why not?)

4.13. Example. Let (R, +, ⋅) be a ring. We define the centre of R to be the


set
Z(R) ∶= {z ∈ R ∶ zr = rz for all r ∈ R}.
Suppose that w, z ∈ Z(R), and r ∈ R. Then
● (wz)r = w(zr) = w(rz) = (wr)z = (rw)z = r(wz).
Since r ∈ R was arbitrary, we conclude that wz ∈ Z(R).
● (w − z)r = wr − zr = rw − rz = r(w − z).
Again, since r ∈ R was arbitrary, we conclude that w − z ∈ Z(R).
By the Subring Test, Z(R) is a subring of R.
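For a concrete feel for the centre, one can compute Z(M2 (Z2 )) by brute force over all sixteen matrices. The Python sketch below (our own encoding: rows as tuples, arithmetic mod 2) finds exactly the two scalar matrices 0 and I, consistent with the general fact that, for a commutative unital ring R, the centre of Mn (R) consists of the scalar matrices:

```python
from itertools import product

def mmul(A, B):
    # 2x2 matrix product with entries in Z_2
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

# all 16 matrices over Z_2, as ((a, b), (c, d))
mats = [((a, b), (c, d)) for a, b, c, d in product(range(2), repeat=4)]
# keep those commuting with every matrix
centre = [Z for Z in mats if all(mmul(Z, R) == mmul(R, Z) for R in mats)]
```

Running this yields centre = [0, I], i.e. Z(M2 (Z2 )) has exactly two elements.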

Supplementary Examples
S2.1. Example. Let R = Z. Then M2 (Z) = {[ a b ; c d ] ∶ a, b, c, d ∈ Z} is a ring,
where
[ a1 b1 ; c1 d1 ] + [ a2 b2 ; c2 d2 ] = [ a1 + a2 , b1 + b2 ; c1 + c2 , d1 + d2 ],
and
[ a1 b1 ; c1 d1 ] [ a2 b2 ; c2 d2 ] = [ a1 a2 + b1 c2 , a1 b2 + b1 d2 ; c1 a2 + d1 c2 , c1 b2 + d1 d2 ].

S2.2. Example. Let D((0, 1), R) = {f ∶ (0, 1) → R ∶ f is differentiable on (0, 1)}.


Define (f + g)(x) = f (x) + g(x) and (f g)(x) = f (x)g(x) for all x ∈ (0, 1).
Then D((0, 1), R) is a ring. The details are left to the reader.
S2.3. Example. Most readers are familiar with the ring of complex numbers
C ∶= {a + bi ∶ a, b ∈ R, i2 = −1}. Perhaps less well-known is the ring of real quater-
nions:
Define H ∶= {a1 + b i + c j + d k ∶ a, b, c, d ∈ R}, where i, j, k satisfy the following
multiplication table:
× 1 i j k

1 1 i j k
i i -1 k -j
j j -k -1 i
k k j -i -1
We define
(a1 1+b1 i+c1 j+d1 k)+(a2 1+b2 i+c2 j+d2 k) = (a1 +a2 )1+(b1 +b2 ) i+(c1 +c2 ) j+(d1 +d2 ) k.
Multiplication behaves analogously to complex multiplication, only following the
multiplication table above. Thus, for example,
(π + 2j + 3k)(e + 2i) = πe + 2e j + 3e k + 2π i + 4 j i + 6 k i
= π e + 2π i + (2e + 6) j + (3e − 4) k.
We leave it to the reader to find the (long) formula for multiplying two arbitrary
real quaternions.
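Rather than expanding the closed formula, one can let the multiplication table do the work. The Python sketch below (names are ours) stores a quaternion as a dictionary from the basis symbols '1', 'i', 'j', 'k' to real coefficients, and multiplies term by term using the table above:

```python
# (sign, unit) for each product of basis elements, read off the table above
TABLE = {
    ('1','1'): (1,'1'), ('1','i'): (1,'i'), ('1','j'): (1,'j'), ('1','k'): (1,'k'),
    ('i','1'): (1,'i'), ('i','i'): (-1,'1'), ('i','j'): (1,'k'), ('i','k'): (-1,'j'),
    ('j','1'): (1,'j'), ('j','i'): (-1,'k'), ('j','j'): (-1,'1'), ('j','k'): (1,'i'),
    ('k','1'): (1,'k'), ('k','i'): (1,'j'), ('k','j'): (-1,'i'), ('k','k'): (-1,'1'),
}

def h_mul(p, q):
    # p, q: quaternions stored as {basis symbol: coefficient}; multiply term by term
    r = {'1': 0, 'i': 0, 'j': 0, 'k': 0}
    for u, a in p.items():
        for v, b in q.items():
            sign, w = TABLE[(u, v)]
            r[w] += sign * a * b
    return r
```

Substituting numerical values for π and e, h_mul({'1': pi, 'j': 2, 'k': 3}, {'1': e, 'i': 2}) reproduces the product computed above.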

Note that by replacing R by Q above, we obtain the rational quaternions.


S2.4. Example. Let
C0 (R, R) ∶= {f ∶ R → R ∶ f is continuous and lim_{x→∞} f (x) = 0 = lim_{x→−∞} f (x)}.
Then C0 (R, R) is a ring under pointwise addition and multiplication: that is, (f g)(x) =
f (x)g(x) and (f + g)(x) = f (x) + g(x) for all x ∈ R.
It is non-unital.

S2.5. Example. Let R ∶= Z2 × Z3 be the direct product of Z2 and Z3 . Given


(a, b) ∈ R, we have that a ∈ Z2 and b ∈ Z3 . We define
(a1 , b1 ) + (a2 , b2 ) ∶= (a1 + a2 , b1 + b2 )
and
(a1 , b1 )(a2 , b2 ) ∶= (a1 a2 , b1 b2 ),
where a1 + a2 , a1 a2 are determined in Z2 while b1 + b2 , b1 b2 are determined by the
operations in Z3 .
Note that R has 6 elements, as does Z6 . We invite the reader to write out the
addition and multiplication tables for R and Z6 and to look for similarities. The
relation will be made more explicit in Chapter 4.

S2.6. Example. Let R ∶= Z2 × Z2 be the direct product of Z2 and Z2 . Given


(a, b) ∈ R, we have that a ∈ Z2 and b ∈ Z2 . We define
(a1 , b1 ) + (a2 , b2 ) ∶= (a1 + a2 , b1 + b2 )
and
(a1 , b1 )(a2 , b2 ) ∶= (a1 a2 , b1 b2 ),
where a1 + a2 , b1 + b2 , a1 a2 and b1 b2 are determined by the operations in Z2 .
This ring has 4 elements, as does Z4 . Note, however, that if (a, b) ∈ R, then
(a, b) + (a, b) = (0, 0) is the zero element of R. On the other hand, 1 ∈ Z4 and
1 + 1 = 2 ≠ 0 in Z4 .
There is something fundamentally different about these two rings. That too will
be made explicit below.

S2.7. Example. Let


R = {[ w z ; −z̄ w̄ ] ∶ w, z ∈ C} ⊆ M2 (C),
where z̄ denotes the complex conjugate of z.
Then R is a ring.
Can you see any relationship between R and the ring H of real quaternions from
Example S2.3?

S2.8. Example. Let ∅ ≠ X be a non-empty set, and denote by P(X) the


power set of X, that is:
P(X) = {Y ∶ Y ⊆ X}.
Given Y, Z ∈ P(X), set Y ⊕ Z ∶= (Y ∖ Z) ∪ (Z ∖ Y ), and Y ⊗ Z ∶= Y ∩ Z.
Then (P(X), ⊕, ⊗) is a (commutative) ring. The additive identity is ∅, while
the multiplicative identity is X. We leave it to the reader to verify the details.
Suppose that X = {a, b}. Is there any relationship between (P(X), ⊕, ⊗) and
Z2 × Z2 ? Or between (P(X), ⊕, ⊗) and Z4 ? If so, what is the relationship? If not,
why does this fail?
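The ring axioms for (P(X), ⊕, ⊗) can be verified exhaustively for small X. In Python, frozensets model subsets, with `^` (symmetric difference) playing the role of ⊕ and `&` (intersection) the role of ⊗; the script below (our own check, not part of the notes) tests the key identities over X = {a, b}:

```python
from itertools import chain, combinations

X = frozenset({'a', 'b'})
# all 2^|X| subsets of X
subsets = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(X), r) for r in range(len(X) + 1))]

ok = all(
    (A ^ A == frozenset())                      # A is its own additive inverse
    and (A & X == A)                            # X is the multiplicative identity
    and ((A & (B ^ C)) == ((A & B) ^ (A & C)))  # multiplication distributes over addition
    for A in subsets for B in subsets for C in subsets
)
```

Note the additive-inverse check: every element satisfies Y ⊕ Y = ∅, which already hints that (P(X), ⊕, ⊗) behaves like Z2 × Z2 rather than Z4.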

S2.9. Example. Let (G, +) denote an abelian group. An endomorphism of


G is a map
φ∶G→G
that satisfies φ(g + h) = φ(g) + φ(h) for all g, h ∈ G.
Let End(G) ∶= {φ ∶ φ is an endomorphism of G}. For φ, ψ ∈ End(G), define
(φ+̇ψ)(g) = φ(g) + ψ(g),
and
(φ ∗ ψ)(g) = φ(ψ(g)), g ∈ G.
We leave it to the reader to verify that (End(G), +̇, ∗) is a ring.
S2.10. Example. We can also consider the so-called ring of formal power
series in x, which extends the notion of a polynomial ring over a ring.
Let R be a commutative ring. Denote by R[[x]] the ring

R[[x]] ∶= {∑_{n=0}^∞ rn xn ∶ rn ∈ R for all n}.
(There is no notion of convergence here – this is just notation! )
We define addition and multiplication on R[[x]] by setting:
∑_{n=0}^∞ rn xn + ∑_{n=0}^∞ sn xn ∶= ∑_{n=0}^∞ (rn + sn )xn ,
and
(∑_{n=0}^∞ rn xn ) (∑_{n=0}^∞ sn xn ) ∶= ∑_{n=0}^∞ tn xn ,
where for all n ≥ 0,
tn ∶= ∑_{j=0}^n rj sn−j .
We leave it to the reader to verify that this is indeed a ring.

Note that when we say that there is no notion of convergence here, we mean
that we could just as easily have defined an element of R[[x]] to be a sequence (rn )_{n=0}^∞
with rn ∈ R for all n ≥ 0, and (rn )n + (sn )n = (rn + sn )n .
The reason for preferring the “series” notation is that it helps to “explain” how
one might come up with the multiplication
(rn )n ∗ (sn )n ∶= (tn )n ,
where
tn ∶= ∑_{j=0}^n rj sn−j .
In other words, the xn , n ≥ 0, is really just a placeholder for the coefficient rn ∈ R.
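In practice one can only ever store finitely many coefficients of a formal power series; truncated to the first N coefficients, the product rule above becomes a short Python routine (our own sketch):

```python
def ps_mul(r, s):
    # first N coefficients of the product of two formal power series,
    # where N = min(len(r), len(s)) and t_n = sum_{j=0}^{n} r_j s_{n-j}
    N = min(len(r), len(s))
    return [sum(r[j] * s[n - j] for j in range(n + 1)) for n in range(N)]
```

For instance, (1 − x)(1 + x + x² + ⋯) = 1 in R[[x]], and correspondingly ps_mul([1, -1, 0, 0, 0], [1, 1, 1, 1, 1]) returns [1, 0, 0, 0, 0].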

Appendix
A2.1. As mentioned in the introductions to each of the first two chapters, math-
ematics typically evolves by considering a large number of objects that one is in-
terested in, and having one or more clever individuals notice a common link which
relates those objects to one another.
In the present case, examples arose out of number theory (integers, rational
numbers, real numbers, complex numbers), linear algebra (linear transformations,
matrices), set theory (Boolean rings), analysis (continuous functions, differentiable
functions), amongst other areas.
The formulation of the notion of a ring began in the 1870’s with the study
of polynomials and algebraic integers, in part by Richard Dedekind. The term
“Zahlring” (German for “number ring”) was introduced by David Hilbert in the
1890s. Adolf Fraenkel gave the first axiomatic definition of a ring in 1914, but his
definition was stricter than the current definition. The modern axiomatic definition
of a (commutative) ring was given by Emmy Noether.

A2.2. Some authors – especially algebraists – require a ring to have a multi-


plicative identity. For example, Fraenkel required this, although Noether did not,
despite the fact that she was an algebraist (and an excellent one at that). There is
a sickening tendency of some people to refer to rings without identity as rngs. Let
us agree that this vox nihili should never be mentioned in public again. (Not that
the current author has a strong opinion on this, mind you.)
It is true that it is always possible to “add” an identity to a ring in order to
make it unital, but this construction is not always natural. (For example, the set
C0 (R, R) = {f ∶ R → R ∣ f is continuous and lim_{x→∞} f (x) = 0 = lim_{x→−∞} f (x)}

defined in Example S2.4 above becomes a ring using pointwise addition, but with
pointwise multiplication replaced by convolution:

(f ∗ g)(x) ∶= ∫_{−∞}^{∞} f (t)g(x − t) dt.

There is no reasonable function h ∶ R → R which could serve as an identity for this


function under convolution. Instead, one has to artificially introduce an abstract
symbol and declare it to be an identity by convention.
On the other hand, it is possible to find a sequence (δn )n of functions in C0 (R, R)
satisfying
lim_n (f ∗ δn ) = f,
but this requires us to understand what we mean by “limits of functions”, which
in this setting is a bit beyond the scope of this course. We mention in passing
that the sequence is referred to as an approximate identity for the ring. These
notions tend to be popular in Analysis, especially Harmonic analysis (commutative
and non-commutative) and Operator Algebras.

A2.3. The ring H of real quaternions was first described by the Irish mathe-
matician William Rowan Hamilton. Note that if ω1 , ω2 ∈ H, then ω1 ω2 ≠ ω2 ω1
in general, but if 0 ≠ ω ∈ H, then ω is invertible. A non-commutative ring in which
every non-zero element is invertible is referred to as a division ring or a skew
field.
A2.4. In some textbooks, the degree of the zero polynomial is defined to be
“−∞”, with the understanding that −∞ + (−∞) = −∞ = −∞ + n = n + (−∞) for all
n ∈ N. In this way, we see that
deg (p(x) + q(x)) ≤ max(deg(p(x)), deg(q(x)))
for all polynomials p(x), q(x) – that is, we don’t have to separately argue that
p(x) + q(x) might be equal to zero.
We have chosen not to use this terminology, because there is a tendency for
people to think of “−∞” as a number as opposed to a symbol, and to think of the
above “sums” as being actual addition, instead of mere convention.
A2.5 The ring of polynomials revisited. Recall that if R is a commutative
ring, then we defined the ring of polynomials R[x] to be the set of formal symbols:
R[x] = R⟨x⟩ = {p0 + p1 x + ⋯ + pm xm ∶ m ≥ 1, pk ∈ R, 0 ≤ k ≤ m}.
Let us now give an equivalent, but slightly different and very useful definition of the
ring of polynomials R[x].

Let (R, +, ⋅) be a commutative, unital ring and consider the ring


PR ∶= ⊕_{n=0}^∞ R
∶= {(p0 , p1 , p2 , . . .) ∶ pn ∈ R for all n and pn = 0 except for finitely many n ≥ 0}.
Given p = (p0 , p1 , p2 , . . .) and q = (q0 , q1 , q2 , . . .) ∈ PR , we define
p + q ∶= (p0 + q0 , p1 + q1 , p2 + q2 , . . .)
and
p ∗ q ∶= (r0 , r1 , r2 , . . .),
where
rm ∶= ∑_{k=0}^m pk qm−k , m ≥ 0.
That is,
p ∗ q ∶= (p0 q0 , p0 q1 + p1 q0 , p0 q2 + p1 q1 + p2 q0 , . . .).
(If this multiplication reminds you of the multiplication of polynomials, that’s a
good thing!)
Exercise. We leave it as an exercise for the reader to verify that (PR , +, ∗) is a
ring with these two operations.

Note that if R is unital, say 1R ∈ R, then


e ∶= (1R , 0, 0, 0, . . .) ∈ PR

satisfies e ∗ p = p = p ∗ e for all p ∈ PR , and thus e is the multiplicative identity of PR .

Define x ∶= (0, 1R , 0, 0, 0, . . .) ∈ PR . If m ≥ 1, then


xm = (0, 0, . . . , 0, 1R , 0, 0, 0, . . .),
where the 1R term occurs in the (m + 1)-st position. In other words,
xm = (a0 , a1 , a2 , . . . , am , am+1 , am+2 , . . .),
where ak = 0 if k ≠ m and am = 1R .
Given r ∈ R and p = (p0 , p1 , p2 , . . .) ∈ PR , we define
r ⋅ p ∶= (rp0 , rp1 , rp2 , . . .) ∈ PR .
In other words, if we identify r ∈ R with the element r̂ ∶= (r, 0, 0, 0, . . .) ∈ PR , then
we have that
r ⋅ p = r̂ ∗ p.
This allows us to write a general element p = (p0 , p1 , p2 , . . .) ∈ PR (where n ∈ N is
chosen such that pm = 0 if m > n) in the form
p = p0 ⋅ e + p1 ⋅ x + p2 ⋅ x2 + ⋯ + pn ⋅ xn .
It is clear that p = q if and only if pk = qk for all k ≥ 0, and we have specifically chosen
the addition and multiplication in PR to mimic the addition and multiplication of
polynomials.
In the language of Chapter Four, R[x] and PR are isomorphic. They are
precisely the same ring, just expressed with different notation.

We invite the reader to think of a way of expressing R[x1 , x2 ] in a similar form.



Exercises for Chapter 2

Exercise 2.1.
Consider the set Pn of n×n permutation matrices in Mn (C). Thinking of Mn (C)
as a vector space over C, is
span{Pn } = Mn (C)?

Exercise 2.2.
Let

                       ⎧⎡1 0 0⎤  ⎡0 1 0⎤  ⎡0 0 1⎤⎫
H = {H1 , H2 , H3 } ∶= ⎨⎢0 1 0⎥ , ⎢0 0 1⎥ , ⎢1 0 0⎥⎬ .
                       ⎩⎣0 0 1⎦  ⎣1 0 0⎦  ⎣0 1 0⎦⎭
Prove that H is a group and write out its multiplication table (as we did with
G in Example 2.1.8).
(a) Is there a relationship between H and the group G from that example?
(b) Is there a relationship between Q⟨H⟩ and Q⟨G⟩?
If relationships do exist, how would you describe them?

Exercise 2.3.
Find all subrings of (Z, +, ⋅).

Exercise 2.4.
Let R and S be subrings of a commutative ring T .
(a) Define
RS ∶= {∑_{i=1}^{n} ri si ∶ n ≥ 1, ri ∈ R, si ∈ S, 1 ≤ i ≤ n}.
Prove that RS is a subring of T . Is it always true that R ⊆ RS? Either
prove that it is, or find a counterexample to show that it is not always true.
(b) Did we need to take all finite sums in the expression for RS above? In
other words, if
A ∶= {rs ∶ r ∈ R, s ∈ S},
is A a subring of T ? Again, you must justify your answer!
(c) Prove that R ∩ S is a subring of T . Is the commutativity of T required
here?
(d) Was commutativity of T required in part (a) above? If not, prove part (a)
without assuming that T is commutative. Otherwise, provide an example
of a non-commutative ring T where RS is not a subring of T .

Exercise 2.5.
Let R and S be rings. Describe all subrings of R ⊕ S in terms of the subrings of
R and S.

Exercise 2.6.
Let G = Z ⊕ Z, equipped with the operation (g1 , h1 ) + (g2 , h2 ) ∶= (g1 + g2 , h1 + h2 ).
(a) Prove that G is an abelian group.
(b) Describe End(G), the endomorphism ring of G as defined in Example 2.9.

Exercise 2.7.
Let R be a ring and a, b ∈ R. Show that in general,
(a + b)2 ∶= (a + b)(a + b) ≠ a2 + 2ab + b2 .
State the correct formula, and find the formula for (a + b)3 .

Exercise 2.8.
Consider M2 (R) equipped with usual matrix addition and multiplication. If
L ⊆ M2 (R) is a subring with the additional property that A ∈ M2 (R) and M ∈ L implies that
AM ∈ L, we say that L is a left ideal of M2 (R).
Find all left ideals of M2 (R). Can you generalise this to Mn (R) for n ≥ 3?

Note. As you might imagine, one can also define right ideals in a similar
manner. What is the relationship between left and right ideals of Mn (R)?

Exercise 2.9.
Let (R, +, ⋅) be a ring. Define a new multiplication ∗ on R by setting
a ∗ b ∶= b ⋅ a for all a, b ∈ R.
Prove that (R, +, ∗) is also a ring.
This ring is referred to as the opposite ring of R.

Exercise 2.10.
Let R be a ring and suppose that a2 = a for all a ∈ R. Prove that R is commu-
tative; that is, prove that ab = ba for all a, b ∈ R.
A ring R for which a2 = a for all a ∈ R is called a Boolean ring. Provide an
example of such a ring.
CHAPTER 3

Integral Domains and Fields

If you want to know what God thinks of money, just look at the people
he gave it to.
Dorothy Parker

1. Integral domains - definitions and basic properties


1.1. Recall from Exercise 1.9 that one of the nicer properties that holds in a
group (G, ⋅) is the cancellation law, namely:
if g, h and k ∈ G and
g ⋅ h = g ⋅ k,
then h = k.
If (R, +, ⋅) is a ring, then (R, +) is an abelian group, and so the cancellation law
holds for (R, +). That is, given a, b, c ∈ R, the equation
a+b=a+c
implies that b = c.

The cancellation law is the group analogue of a familiar technique we employ


when solving equations: for example, if we are working with real numbers, we solve
the equation
5x = 35
by multiplying both sides of the equation by 1/5 = 5−1 to get
x = 5−1 ⋅ 5x = 5−1 ⋅ 35 = 7.
In other words,
5x = 5 ⋅ 7
implies that x = 7.

Unfortunately, the cancellation law is not some universal truth that holds in all
algebraic structures at all times. In particular, the cancellation law does not have

to hold for multiplication in a ring. For example, consider the matrices


A ∶= ⎡1 0⎤ ,  B ∶= ⎡  0   0⎤ ,  C ∶= ⎡ 0      0    ⎤ ,
     ⎣0 0⎦        ⎣−100  π⎦        ⎣12   e3 + 19⎦
all of which lie in the ring M2 (R).
A simple calculation shows that
AC = ⎡0 0⎤ = AB,
     ⎣0 0⎦
but clearly B ≠ C.

In this Chapter, we shall isolate this “bad behaviour”, and we shall give a special
name to those rings which avoid it.
1.2. Definition. Let R be a ring. An element 0 ≠ r ∈ R is said to be a left
divisor of zero if there exists 0 ≠ x ∈ R such that rx = 0.
Similarly, 0 ≠ s ∈ R is said to be a right divisor of zero if there exists 0 ≠ y ∈ R
such that ys = 0.
Finally, 0 ≠ t ∈ R is said to be a (joint) divisor of zero if it is both a left and
a right divisor of 0.
We remark that if R is a commutative ring, then every left (or right) divisor of
zero is automatically a joint divisor of zero.
1.3. Examples.
(a) Let R = (C([0, 1], R), +, ⋅). Let f ∈ R be the function

f (x) = { 0          if 0 ≤ x ≤ 1/2
        { x − 1/2    if 1/2 ≤ x ≤ 1.
Let g ∈ R be the function
g(x) = { 1/2 − x    if 0 ≤ x ≤ 1/2
       { 0          if 1/2 ≤ x ≤ 1.
Then f ≠ 0 ≠ g, but f g = 0 = gf , and thus f is a joint divisor of zero (as
is g).
(b) Let V = {x = (xn )n≥1 ∈ R^N ∶ limn xn = 0}. We leave it as an exercise for the
reader to prove that V is a vector space over R under the operations
(xn )n + (yn )n ∶= (xn + yn )n ,
λ(xn )n ∶= (λxn )n ,
for all (xn )n , (yn )n ∈ V and λ ∈ R.
Define the linear map
S∶ V → V
(x1 , x2 , x3 , . . .) ↦ (0, x1 , x2 , x3 , x4 , . . .).

Again - we leave it to the reader to verify that S ∈ L(V). Note that x ∈ V


and Sx = 0 implies that x = 0; that is, S is injective. Clearly S ≠ 0.
Suppose that T ∈ L(V) and that ST = 0. Then for all y ∈ V, ST y =
S(T y) = 0 implies that T y = 0. But then T y = 0 for all y ∈ V implies that
T = 0. This shows that S is not a left divisor of zero.
On the other hand, let R ∈ L(V) be the map
R∶ V → V
(x1 , x2 , x3 , . . .) ↦ (x1 , 0, 0, 0, . . .).
(One should verify that this is indeed a linear map on V!)
Then R ≠ 0 since R(1, 0, 0, . . .) = (1, 0, 0, . . .), but for any x = (xn )n ∈ V,
RSx = R(Sx) = R(0, x1 , x2 , . . .) = (0, 0, 0, . . .) = 0.
Thus S is a right divisor of zero.
(c) If (R, +, ⋅) is a unital ring and 0 ≠ r ∈ R is invertible, then it is neither a
left nor a right divisor of zero. Indeed, denoting by r−1 the multiplicative
inverse of r, we see that the equation r ⋅ x = 0 implies that
x = 1 ⋅ x = (r−1 ⋅ r) ⋅ x = r−1 ⋅ (r ⋅ x) = r−1 ⋅ 0 = 0,
proving that r is not a left divisor of zero. The argument that r is not a
right divisor of zero is similar and is left to the reader.
(d) Suppose that R = (Z24 , +, ⋅). Then, with addition and multiplication done
“mod 24”, as described in Example 2.1.6, we see that 6 ⋅ 8 = 48 mod 24 = 0.
Since Z24 is a commutative ring, we find that 6 is a joint divisor of zero.
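A brute-force search (a quick Python sketch of ours, not part of the notes) lists every divisor of zero in Z24; they turn out to be exactly the non-zero elements sharing a common factor with 24:

```python
n = 24
# a is a divisor of zero if some non-zero b gives a * b = 0 (mod n)
zero_divisors = sorted({a for a in range(1, n)
                        for b in range(1, n) if (a * b) % n == 0})
print(zero_divisors)
```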

1.4. Definition. A ring (D, +, ⋅) is said to be an integral domain if


● D is unital.
● D is commutative, and
● if a, b ∈ D and a ⋅ b = 0, then either a = 0 or b = 0. In other words, D has no
divisors of zero.
(Note that since D is commutative, if it were to admit any kind of divisor of zero,
it would have to be a joint divisor of zero.)

1.5. Examples.
(a) The motivating example of an integral domain, and probably the origin of
the terminology, is the ring (Z, +, ⋅) of integers.
(b) Each of (Q, +, ⋅), (R, +, ⋅) and (C, +, ⋅) is also an integral domain.
(c) From Example 1.3 above,
(C([0, 1], R), +, ⋅)
is not an integral domain.
(d) M2 (R) is not commutative - it has no chance of being an integral domain.

1.6. Examples.
(a) The ring of Gaussian integers
Z[i] ∶= {a + bi ∶ a, b ∈ Z, i2 = −1}
is an integral domain.
(b) The subring (2Z, +, ⋅) of Z is not an integral domain, because it is not
unital!
(c) The ring (R[x], +, ⋅) of polynomials in one variable with coefficients in R is
an integral domain. We leave it to the reader to verify this.
Again - we remind the reader that polynomial rings will be of special importance
later. Despite the simplicity of the proof, the following result is crucial, and hence
is designated a “theorem”.

1.7. Theorem. Suppose that D is an integral domain. Then so is the ring


D[x] of polynomials with coefficients in D.
Proof. Let p(x) = p0 + p1 x + p2 x2 + ⋯ + pm xm and q(x) = q0 + q1 x + q2 x2 + ⋯ + qn xn
be polynomials of degree m and n respectively. (Note that the hypothesis that their
degrees are defined means that the polynomials are non-zero.)
Then pm ≠ 0 ≠ qn , and
r(x) ∶= p(x)q(x) = r0 + r1 x + r2 x2 + ⋯ + rn+m xn+m
is a polynomial of degree n + m, given that rn+m = pm qn ≠ 0 since D is an integral
domain.
Thus D[x] is an integral domain.

1.8. Remark. The following useful fact was embedded in the above proof. For
ease of reference, we isolate it as a remark.

If D is an integral domain and 0 ≠ p(x), q(x) ∈ D[x], then


deg(p(x)q(x)) = deg(p(x)) + deg(q(x)).

1.9. Proposition. Let (R, +, ⋅) be a unital, commutative ring. The following


statements are equivalent.
(a) R is an integral domain.
(b) If a, b, c ∈ R with 0 ≠ a, and if ab = ac, then b = c.
Proof.
(a) implies (b). Suppose that (R, +, ⋅) is an integral domain, and let a, b, c ∈ R.
If a ≠ 0 and ab = ac, then a(b − c) = 0 implies that b − c = 0, i.e. b = c.

(b) implies (a). Now suppose that (R, +, ⋅) is a unital commutative ring and
that condition (b) holds. If there exists a ≠ 0 and b ∈ R with ab = 0, then
ab = 0 = a0. By hypothesis, b = 0. But then a is not a left divisor of zero.
Since R is commutative, a is not a right divisor of zero either. In other
words, R has no divisors of zero, and as such, it is an integral domain.

1.10. Proposition 1.9 identifies integral domains as precisely those commutative,


unital rings where the cancellation law (for multiplication) holds. As pointed out at
the beginning of this section, this is useful in trying to solve equations.
Suppose, for example, that one is given a polynomial equation
p0 + p1 x + p2 x2 + ⋯ + pn xn = 0
with coefficients pk ∈ D, where (D, +, ⋅) is an integral domain and pn ≠ 0. Suppose
furthermore that the polynomial factors into linear terms. (This hypothesis is non-
trivial, and need not hold in general. For example, the polynomial q(x) = x2 + 1 with
coefficients in R does not factor into linear terms.)
We may then write
pn (x − α1 )(x − α2 )⋯(x − αn ) = 0,
with each αj ∈ D. The fact that we are in an integral domain D means that the
only solutions to this equation are found by taking x = αj for some j.
When the ring of coefficients is not an integral domain, it is possible that we
may have many other solutions. For example, consider the polynomial equation
q(x) = (x − ⎡1 0⎤) (x − ⎡0 0⎤) = ⎡0 0⎤
            ⎣0 0⎦       ⎣0 1⎦    ⎣0 0⎦
in one variable with coefficients in M2 (R) (which is not an integral domain). Then
q(⎡1 0⎤) = ⎡0 0⎤ ⎡1 0⎤ = ⎡0 0⎤ ,
  ⎣0 1⎦    ⎣0 1⎦ ⎣0 0⎦   ⎣0 0⎦
even though
⎡1 0⎤ ∈/ { ⎡1 0⎤ , ⎡0 0⎤ } .
⎣0 1⎦     ⎣0 0⎦   ⎣0 1⎦

1.11. Proposition. Let 2 ≤ n ∈ N. Then (Zn , +, ⋅) is an integral domain if and


only if n is prime.
Proof. Suppose first that n is prime. Suppose that a, b ∈ Zn and that a ⋅ b = 0. Then
n divides ab. But n prime then implies that n either divides a or that n divides b.
That is, a = 0 or b = 0 in Zn . Thus Zn has no divisors of zero.

Next, suppose that n is not prime, say that n = m1 m2 , where 1 < m1 , m2 < n.
Then m1 ≠ 0 ≠ m2 ∈ Zn , and m1 m2 = n mod n = 0 in Zn , proving that m1 and m2
are joint divisors of zero in Zn . Thus Zn is not an integral domain.
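Proposition 1.11 is easy to confirm experimentally; the sketch below (our own helpers, with a naive primality test) checks the equivalence for small n:

```python
def has_zero_divisors(n):
    """True if some non-zero a, b in Z_n satisfy a * b = 0 (mod n)."""
    return any((a * b) % n == 0 for a in range(1, n) for b in range(1, n))

def is_prime(n):
    return n >= 2 and all(n % d != 0 for d in range(2, n))

for n in range(2, 50):
    assert has_zero_divisors(n) == (not is_prime(n))
print("Z_n is an integral domain exactly when n is prime (2 <= n < 50)")
```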

1.12. Exercise. Let p ∈ N be a prime number. Then Q(√p) = {a + b√p ∶
a, b ∈ Q} is an integral domain, using the usual addition and multiplication of real
numbers.

2. The characteristic of a ring


2.1. In dealing with integers, rational numbers and real numbers, we know that
if we keep adding a (non-zero) number to itself many times, we can never end up
getting the value 0. It might also seem counterintuitive that such a thing could ever
happen under any circumstance (which I believe is sufficiently vague to classify as
a true statement). Ah, but then we look at the watch that grandmaman gave us,
and realise that if it is one o’clock now, and if we keep moving forward in time (as
we are wont to do), then – twelve hours from now, and twelve hours from then – it
will be one o’clock again. Somehow, adding twelve times one hour is, insofar as the
face of our grandmaman’s watch is concerned, the same as adding zero hours.
We shall have more to say about grandmaman’s watch once we begin to examine
quotient rings.
In the meantime, let us observe that the phenomenon observed on the face
of our grandmaman’s watch occurs in a number of rings, and it is an extremely
useful phenomenon to keep in mind. (Just one more reason to appreciate your
grandmaman.)

2.2. Notation. Let (R, +, ⋅) be a ring and n ∈ N. Given r ∈ R, we denote by


n r the element
n r ∶= ∑_{j=1}^{n} r = r + r + ⋯ + r (n times).

2.3. Definition. Let R be a ring. We define the characteristic of R as
char(R) ∶= min{n ∈ N ∶ n r = 0 for all r ∈ R},
if such a natural number n ∈ N exists. Otherwise, we say that char(R) = 0.

2.4. Examples.
(a) Each of the following rings has characteristic zero: Z, Q, R and C.
(b) If n ∈ N, then char (Zn , +, ⋅) = n. In particular, char(Z12 ) = 12, which
explains in part what is going on with grandmaman’s watch.

2.5. Proposition. Let R be a unital ring. If char(R) ≠ 0, then


char(R) = min{n ∈ N ∶ n 1 = 0}.

Proof. Suppose first that γ ∶= char(R) ≠ 0. Then γ r = 0 for all r ∈ R, and so


γ 1 = 0. In particular, γ ∈ {n ∈ N ∶ n 1 = 0}. It follows that {n ∈ N ∶ n 1 = 0} ≠ ∅ and
that
min{n ∈ N ∶ n 1 = 0} ≤ γ = char(R).

Next, if κ ∶= min{n ∈ N ∶ n 1 = 0}, then for any r ∈ R,


κ r = r + r + ⋯ + r = (1 + 1 + ⋯ + 1)r = (κ 1)r = 0r = 0,
and so char(R) ≤ κ. That is,
char(R) ≤ min{n ∈ N ∶ n 1 = 0}.
Combining these inequalities yields the desired result.


The above Proposition will make our work significantly easier when it comes to
computing the characteristic of a unital ring, even those that on the surface appear
rather complicated.
2.6. Examples.
(a) Consider R = Z2 ⊕ Z3 = {(a, b) ∶ a ∈ Z2 , b ∈ Z3 }. We claim that char(R) = 6.
Indeed, the identity of R is e ∶= (1, 1). Note that as an ordered list, we
have that
(1e, 2e, 3e, 4e, 5e, 6e) = ((1, 1), (0, 2), (1, 0), (0, 1), (1, 2), (0, 0)).
That is, 6 = min{n ∈ N ∶ n e = 0}, and so by the above Proposition 2.5,
char(R) = 6.
(b) Consider next R = Z2 ⊕ Z4 = {(a, b) ∶ a ∈ Z2 , b ∈ Z4 }. Let e ∶= (1, 1) denote
the multiplicative identity of R.
Then
(1e, 2e, 3e, 4e) = ((1, 1), (0, 2), (1, 3), (0, 0)).
By Proposition 2.5, we deduce that
char(R) = 4.
(c) Let R = Z5 ⟨x, y, z⟩ denote the ring of polynomials in three non-commuting
variables x, y and z with coefficients in Z5 . Then R is a unital ring with
identity e = 1 ∈ Z5 . Thus
(1e, 2e, 3e, 4e, 5e) = (1, 2, 3, 4, 0),
and so by Proposition 2.5,
char(Z5 ⟨x, y, z⟩) = 5.
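Proposition 2.5 turns computations like (a) and (b) into a mechanical loop: keep adding the identity to itself until it reaches zero. A small sketch (our own helper, not part of the notes) reproduces the two examples above:

```python
def ring_char(mods):
    """Smallest n >= 1 with n * (1, ..., 1) = (0, ..., 0) in Z_{m1} ⊕ ... ⊕ Z_{mk}."""
    e = tuple(1 % m for m in mods)
    total, n = e, 1
    while any(total):
        # add the identity e once more, coordinate by coordinate
        total = tuple((t + s) % m for t, s, m in zip(total, e, mods))
        n += 1
    return n

print(ring_char([2, 3]))  # 6, as in Example 2.6(a)
print(ring_char([2, 4]))  # 4, as in Example 2.6(b)
```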

2.7. Remark. To learn mathematics, it is not sufficient to merely read math-


ematics, although reading mathematics is certainly an important step that cannot
be overlooked. In addition to reading, one should always be asking oneself questions
about what one is reading.
● What was the key step in the proof of a result? Where was it used?
● Can we eliminate any of the hypotheses? Can we generalise the proof?
In the example above, we saw that char(Z2 ⊕Z3 ) = 6 = 2⋅3, whereas char(Z2 ⊕Z4 ) =
4 = max{2, 4}.
Why do these results appear to be different? What (if anything) is really going
on?

These are the kinds of questions one should be asking oneself throughout the
learning process. Yes, it is time-consuming, but the better one understands math-
ematics, the less one has to memorise. In a perfect world, one would only need to
memorise definitions. (I’ve yet to figure out how to get around that!)
For example, rather than trying to memorise a proof line-by-line, one would only
need to remember the key step, relying upon oneself to fill in the routine steps of
the proof. The time invested in learning these things at the beginning is time saved
later on.

2.8. Proposition. Let R be a commutative ring, and suppose that p ∶= char(R)


is prime. If r, s ∈ R, then
(r + s)p = rp + sp .

Proof. Consider
(r + s)p = rp + C(p, 1)rp−1 s + C(p, 2)rp−2 s2 + ⋯ + C(p, p − 1)r sp−1 + sp ,
where, for 1 ≤ k ≤ p − 1, the binomial coefficient is
C(p, k) = p!/(k!(p − k)!).
Now, p is prime and 1 ≤ k, (p − k) < p implies that p divides neither k! nor (p − k)!.
Since p divides the numerator p!, it follows that p divides C(p, k), 1 ≤ k ≤ p − 1. But then
C(p, k) rp−k sk = 0, 1 ≤ k ≤ p − 1,
and so
(r + s)p = rp + 0 + 0 + ⋯ + 0 + sp = rp + sp .

Again, the reader should be asking (and hopefully answering) the question:
where was the commutativity of R used in this proof?
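Proposition 2.8 (sometimes nicknamed the “freshman’s dream”) is easy to sanity-check in Zp, where char(Zp) = p; the sketch below (ours, not from the notes) verifies it exhaustively for p = 7:

```python
p = 7  # prime, so char(Z_7) = 7
for r in range(p):
    for s in range(p):
        # (r + s)^p and r^p + s^p, both computed mod p
        assert pow(r + s, p, p) == (pow(r, p, p) + pow(s, p, p)) % p
print("(r + s)^7 = r^7 + s^7 for all r, s in Z_7")
```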

3. Fields - an introduction
3.1. In Section 1 of this Chapter, we identified a property of rings which is
equivalent to the ring satisfying the cancellation law for multiplication, namely: we
asked that the ring be an integral domain. We argued that this is a requirement if
we hope to solve equations that involve elements of that ring.
Note that (Z, +, ⋅) is an integral domain, and yet if we only knew about inte-
gers and we didn’t know anything about rational numbers, we could not solve the
equation
2x = 1024.
That is, we could not do it by simply multiplying both sides by 1/2, because 1/2 ∈/ Z. (Trial and
error would still work, I suppose.)
This leads us to consider rings which are even more special than integral do-
mains. In some ways, these new rings will behave almost as well as the real numbers
themselves.

3.2. Definition. A field (F, +, ⋅) is a commutative, unital ring in which every


non-zero element admits a multiplicative inverse. That is, if 0 ≠ x ∈ F, then there
exists y ∈ F such that xy = 1 = yx. As we have seen, such an element y is unique
and denoted by x−1 .

3.3. Remark. We remark that by Example 1.3(d), every field F is automatically


an integral domain. Since F is unital, it must have at least two elements, namely 0
and 1.

3.4. Examples.
(a) Q, R and C are all fields, using the usual notions of addition and multipli-
cation.
(b) (Z, +, ⋅) is not a field, since 0 ≠ 2 ∈ Z does not admit a multiplicative inverse
in Z.
(c) Let F ∶= {0, 1, x, y}, equipped with the following addition and multiplication
operations.

+ 0 1 x y ⋅ 0 1 x y
0 0 1 x y 0 0 0 0 0
1 1 0 y x and 1 0 1 x y .
x x y 0 1 x 0 x y 1
y y x 1 0 y 0 y 1 x

Then one can verify that (F, +, ⋅) is a field. Note that F has 4 = 2^2 elements.
Coincidence? We shall see.
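The tables in part (c) can be checked mechanically. The sketch below (our own encoding of the two tables as Python dictionaries) confirms that both operations are commutative, that 1 is a multiplicative identity, and that every non-zero element has an inverse; a full verification would also check associativity and distributivity:

```python
elems = ['0', '1', 'x', 'y']
idx = {e: i for i, e in enumerate(elems)}
add = {'0': ['0', '1', 'x', 'y'], '1': ['1', '0', 'y', 'x'],
       'x': ['x', 'y', '0', '1'], 'y': ['y', 'x', '1', '0']}
mul = {'0': ['0', '0', '0', '0'], '1': ['0', '1', 'x', 'y'],
       'x': ['0', 'x', 'y', '1'], 'y': ['0', 'y', '1', 'x']}

for a in elems:
    assert mul['1'][idx[a]] == a                       # 1 is the identity
    for b in elems:
        assert add[a][idx[b]] == add[b][idx[a]]        # + is commutative
        assert mul[a][idx[b]] == mul[b][idx[a]]        # * is commutative
for a in ['1', 'x', 'y']:
    assert any(mul[a][idx[b]] == '1' for b in elems)   # inverses exist
print("every non-zero element of F has a multiplicative inverse")
```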

3.5. Theorem. If (D, +, ⋅) is an integral domain and D is finite, then it is a


field.
Proof. Suppose that (D, +, ⋅) is an integral domain with m ∈ N elements, say
D = {0, 1, d3 , d4 , . . . , dm }.
Clearly 1 = 1−1 is invertible. Now suppose that 3 ≤ k ≤ m. Then
dk D ∶= {dk d ∶ d ∈ D} ⊆ D.
Moreover, if there exist x, y ∈ D with dk x = dk y, then x = y by virtue of the fact that
D is an integral domain.
Thus dk D has exactly m elements, and from this we conclude that dk D = D. In
particular, there exists x ∈ D such that dk x = 1. Since D is commutative, xdk = 1 as
well.
That is, every non-zero element of D is invertible, and so D is a field.
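The proof is constructive in spirit: since dk D = D, scanning the products dk d must eventually hit 1. The sketch below (our own, for D = Z11, a finite integral domain by Proposition 1.11) finds each inverse exactly this way:

```python
n = 11  # prime, so Z_11 is a finite integral domain

def inverse(d):
    """Scan d * Z_n until the product 1 appears; d * Z_n = Z_n guarantees a hit."""
    return next(x for x in range(n) if (d * x) % n == 1)

for d in range(1, n):
    assert (d * inverse(d)) % n == 1
print({d: inverse(d) for d in range(1, n)})
```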

3.6. Corollary. Let 2 ≤ n ∈ N. Then (Zn , +, ⋅) is a field if and only if n is


prime.
Proof. Suppose first that (Zn , +, ⋅) is a field. Then, by Remark 3.3, Zn is an integral
domain. By Proposition 1.11, we see that n is prime.

Conversely, suppose that 2 ≤ n ∈ N is prime. By Proposition 1.11, (Zn , +, ⋅) is an


integral domain (with n < ∞ elements). By Theorem 3.5 above, it is a field.


3.7. Example. If p ∈ N is prime, then (Q(√p), +, ⋅) is a field.
Recall that Q(√p) = {a + b√p ∶ a, b ∈ Q}. Clearly Q(√p) ⊆ R. To prove that it
is a ring, it suffices to apply the Subring Test, Proposition 2.4.10.
Note that if r1 ∶= a1 + b1 √p, r2 ∶= a2 + b2 √p ∈ Q(√p), then
r1 − r2 = (a1 − a2 ) + (b1 − b2 )√p ∈ Q(√p),
since Q is a ring, which implies that a1 − a2 , b1 − b2 ∈ Q.
Moreover,
r1 ⋅ r2 = (a1 + b1 √p)(a2 + b2 √p) = (a1 a2 + p b1 b2 ) + (a1 b2 + a2 b1 )√p ∈ Q(√p),
since a1 a2 + p b1 b2 , a1 b2 + a2 b1 ∈ Q.
By the Subring Test, Q(√p) is a subring of R. Since R is commutative, so
is Q(√p), and e = 1 + 0√p = 1 serves as the identity element of Q(√p) under
multiplication.
Recall that R is a field, and so it is an integral domain. As such, R has no divisors
of zero. From this it follows that Q(√p) has no divisors of zero, for otherwise they
would be divisors of zero in R.
So far we have that Q(√p) is an integral domain. It remains to show that
every non-zero element of Q(√p) is invertible.

Suppose that 0 ≠ r ∶= a + b√p ∈ Q(√p). By Exercise 3.7 below, a2 − pb2 ≠ 0.
Thinking of r as an element of R, where we know it is invertible, we may write
r−1 = 1/(a + b√p) = (1/(a + b√p)) ⋅ ((a − b√p)/(a − b√p)) = (a − b√p)/(a2 − pb2 ).
That is,
r−1 = a/(a2 − pb2 ) − (b/(a2 − pb2 ))√p ∈ Q(√p).
Thus Q(√p) is a field.
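The inverse formula can be exercised with exact rational arithmetic; the sketch below (our own helpers, using Python's Fraction, with the choice p = 2) represents a + b√p as the pair (a, b), multiplies r by the displayed inverse, and recovers 1:

```python
from fractions import Fraction

P = 2  # any prime p

def mul(r1, r2):
    """(a1 + b1*sqrt(p))(a2 + b2*sqrt(p)) = (a1*a2 + p*b1*b2) + (a1*b2 + a2*b1)*sqrt(p)."""
    (a1, b1), (a2, b2) = r1, r2
    return (a1 * a2 + P * b1 * b2, a1 * b2 + a2 * b1)

def inv(r):
    """r^{-1} = a/(a^2 - p*b^2) - (b/(a^2 - p*b^2))*sqrt(p); the denominator is non-zero."""
    a, b = r
    d = a * a - P * b * b
    return (a / d, -b / d)

r = (Fraction(3), Fraction(5))   # r = 3 + 5*sqrt(2)
assert mul(r, inv(r)) == (1, 0)  # r * r^{-1} = 1
```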

Supplementary Examples.
S3.1. Example. We invite the student to verify that
Q(√2, √3) ∶= {a + b√2 + c√3 + d√6 ∣ a, b, c, d ∈ Q}
is a subfield of R. That is, it is a subset of R which is a field using the addition and
multiplication it inherits from R.

S3.2. Example. Let R = ∏_{n=1}^{∞} Z2 = {(an )n≥1 ∶ an ∈ Z2 , n ≥ 1} be the product
ring as defined in the previous Chapter.
If a ∶= (an )n≥1 ∈ R, then
a + a = (an )n + (an )n ∶= (an + an )n .
But an ∈ Z2 , so an + an = 0 for all n ≥ 1, and therefore a + a = 0.
This shows that R has finite characteristic (equal to 2), despite the fact that it
is an infinite ring.

S3.3. Example. Let D be an integral domain. If e ∈ D is an idempotent, i.e.


if e = e2 , then e ∈ {0, 1}. Indeed,
0 = e2 − e = e(e − 1)
implies that e = 0 or e = 1.
On the other hand, for any real number r ∈ R,
E ∶= ⎡1 r⎤ ∈ M2 (R)
     ⎣0 0⎦

satisfies E 2 = E. This is a very serpentine way of showing that M2 (R) is not an


integral domain.

S3.4. Example. Let D4 denote the set of all 2 × 2 matrices of the form
⎡a    b  ⎤
⎣b  a + b⎦ ,
where a and b ∈ Z2 . We add and multiply elements of D4 using the usual matrix
operations.
We invite the reader to prove that D4 is a field under these operations.
Note. This example is due to R.A. Dean, and thus our choice of notation.

S3.5. Example. If D is an integral domain, then D[x] is not a field. After
all, q(x) ∶= x ∈ D[x], but if p(x) ∶= (q(x))−1 ∈ D[x], then as we have seen (see
Remark 1.8)
deg(p(x)q(x)) = deg(p(x)) + deg(q(x)) = deg(p(x)) + 1 ≥ 1,
so that p(x)q(x) ≠ 1.

S3.6. Example. Let D be an integral domain, and let D[[x]] denote the ring
of formal power series defined in Example 2.4.1.
We leave it as an exercise for the reader to prove that D[[x]] is an integral
domain. (Hint. If 0 ≠ p(x) = ∑_{n=0}^{∞} pn x^n ∈ D[[x]], let ξ(p(x)) ∶= min{k ≥ 0 ∶ pk ≠ 0}.
Show that if 0 ≠ q(x) ∈ D[[x]], then ξ(p(x)q(x)) = ξ(p(x)) + ξ(q(x)). Why is this
helpful?)
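One can probe the hint numerically with truncated power series over Z (an integral domain); the helpers below are ours and only see the first few coefficients, so the check is valid as long as ξ(p) + ξ(q) falls within the truncation:

```python
def cauchy_trunc(p, q):
    """First min(len(p), len(q)) coefficients of the product of two power series."""
    n = min(len(p), len(q))
    return [sum(p[i] * q[k - i] for i in range(k + 1)) for k in range(n)]

def xi(p):
    """Index of the first non-zero coefficient."""
    return next(k for k, c in enumerate(p) if c != 0)

p = [0, 0, 3, 1, 0, 2]   # xi(p) = 2
q = [0, 5, 0, 0, 1, 0]   # xi(q) = 1
assert xi(cauchy_trunc(p, q)) == xi(p) + xi(q)  # = 3
```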

S3.7. Example. Let R be a commutative ring and q(x) = qn xn + qn−1 xn−1 +


⋯ + q1 x + q0 ∈ R[x]. We define the derivative of q(x) to be

q ′ (x) ∶= nqn xn−1 + (n − 1)qn−1 xn−2 + ⋯ + 2q2 x + 1q1 + 0q0 ∈ R[x].

We leave it as an exercise for the reader to check that the usual rules of differentiation
of real functions holds in this setting:
● (p(x) + q(x))′ = p′ (x) + q ′ (x);
● (r0 q(x))′ = r0 q ′ (x) for all r0 ∈ R;
● (p(x)q(x))′ = p′ (x)q(x) + p(x)q ′ (x).
Let a ∈ R. If (x − a)2 ∣ q(x) in the sense that q(x) = (x − a)2 g(x) for some
g(x) ∈ R[x], then (x − a) ∣ q ′ (x). To see this, we use the third property above to
write

q ′ (x) = ((x − a)2 g(x))′
= [(x − a)2 ]′ g(x) + (x − a)2 g ′ (x)
= ((x − a)′ (x − a) + (x − a)(x − a)′ )g(x) + (x − a)2 g ′ (x)
= (x − a)((x − a)′ g(x) + (x − a)′ g(x) + (x − a)g ′ (x))
= (x − a)(2g(x) + (x − a)g ′ (x)).

Thus (x − a) ∣ q ′ (x).

We emphasise that elements of R[x] are not functions acting on a set (e.g. the
real line). As such, this is an abstract notion of derivative, which we have named
“derivative” because it behaves like the derivative of real-valued functions on R.
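The formal derivative and the divisibility fact above are easy to check on coefficient lists; in the sketch below (our own helpers) q(x) = (x − 2)^2 (x + 1) over Z, and evaluating q′ at 2 gives 0, so (x − 2) divides q′(x):

```python
def deriv(q):
    """Formal derivative of q = [q0, q1, ..., qn], coefficients of 1, x, ..., x^n."""
    return [k * q[k] for k in range(1, len(q))] or [0]

# q(x) = (x - 2)^2 (x + 1) = x^3 - 3x^2 + 4, so q'(x) = 3x^2 - 6x = 3x(x - 2)
q = [4, 0, -3, 1]
dq = deriv(q)
print(dq)  # [0, -6, 3]
assert sum(c * 2**k for k, c in enumerate(dq)) == 0  # q'(2) = 0
```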

S3.8. Example. Let R be a unital ring. An element m ∈ R is said to be


nilpotent of order k ≥ 1 if mk = 0 ≠ mk−1 . (Note that by convention, m0 ∶= 1.
Using this convention, we see that 0 is always nilpotent of order 1. )
Observe that if 0 ≠ m ∈ R is a nilpotent, then (even if R is commutative) R is not
an integral domain. Indeed, if m is nilpotent of order k ≥ 2, then m1 mk−1 = mk = 0,
although m ≠ 0 ≠ mk−1 .
Suppose that R is commutative and consider the set nil(R) ∶= {m ∈ R ∶ m =
0 or m is nilpotent of order k ≥ 1}. We claim that nil(R) is a subring of R. Indeed,

if m1 , m2 ∈ nil(R) with orders k1 and k2 respectively, then


(m1 − m2 )^(k1+k2) = ∑_{j=0}^{k1+k2} (−1)^j C(k1 + k2 , j) m1^(k1+k2−j) m2^j .
Note that in each term a product m1^i m2^j appears with i + j = k1 + k2 , so that
either i ≥ k1 or j ≥ k2 . In the first case m1^i = 0, and in the second case m2^j = 0.
Thus m1 − m2 ∈ nil(R), and it has order at most k1 + k2 .
Also, if k ∶= min(k1 , k2 ), then
(m1 m2 )k = mk1 mk2 = 0,
since at least one of these terms is zero. (Note that the fact that R is commutative
was important to both of these proofs.)
In fact, if r ∈ R and m ∈ nil(R) is nilpotent of order k, then (rm)k = rk mk =
rk 0 = 0, so rm ∈ nil(R). We shall examine such subrings in greater detail below,
calling them ideals of R.
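For R = Zn the nilpotents are easy to enumerate; the sketch below (our own helper, not part of the notes) computes nil(Z12) and confirms the closure under subtraction proved above:

```python
def nil(n):
    """Nilpotent elements of Z_n, including 0."""
    out = []
    for m in range(n):
        x, k = m, 1
        while x != 0 and k <= n:   # an order, if one exists, is at most n
            x = (x * m) % n        # next power of m mod n
            k += 1
        if x == 0:
            out.append(m)
    return out

print(nil(12))  # [0, 6]
assert all((a - b) % 12 in nil(12) for a in nil(12) for b in nil(12))
```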
S3.9. Example. Let Λ ≠ ∅ be a set with at least two elements, and for each
λ ∈ Λ, suppose that Dλ is an integral domain. Then ∏λ∈Λ Dλ is not an integral
domain.
Indeed, let λ1 ≠ λ2 ∈ Λ, and let d ∶= (dλ )λ , e = (eλ )λ be elements of ∏λ Dλ , where
dλ = 0 if λ ≠ λ1 , otherwise dλ1 = 1, while eλ = 0 if λ ≠ λ2 , otherwise eλ2 = 1. Then
d ≠ 0 ≠ e in ∏λ Dλ , and yet de = 0.
Since d, e ∈ ⊕λ Dλ , we see that ⊕λ Dλ is not an integral domain either when Λ
has more than one element.
S3.10. Example. We invite the reader to verify that
1 0 1 0 1 0 1 0
D ∶= {[ ],[ ],[ ],[ ]} ⊆ M2 (Z2 )
0 1 0 1 0 1 0 1
is an integral domain, using usual matrix addition and multiplication (with entries
added and multiplied in Z2 !!!).

Appendix
A3.1. Fields play a central role in algebra and number theory, in particular
in the search for solutions to equations. It can be shown that they have a strong
relation to Euclidean geometry by using field theory to demonstrate that one cannot,
using a compass and straightedge, trisect an (arbitrary) angle or find a square whose
area is the same as that as a circle of radius 1.
The solutions to a quadratic equation ax2 + bx + c = 0 where a, b and c ∈ R and
0 ≠ a are given by
x = (−b ± √(b2 − 4ac)) / (2a).
There exist formulae, albeit more complicated than the one above, to identify the
solutions of cubic and quartic equations. In 1799, the Italian mathematician Paolo
Ruffini claimed that no such formula exists for a quintic equation (i.e. one of
degree 5), although his argument contained errors which were finally resolved by the
Norwegian Niels Henrik Abel, this being the same Abel whose name is used to
describe abelian groups.
The crowning result in this area is due to Évariste Galois, who derived neces-
sary and sufficient conditions for a polynomial equation to be solvable algebraically.
Galois’ contributions to algebra are extensive and quite amazing, especially consid-
ering that he died at the age of 20 of peritonitis, caused by a gunshot wound suffered
in a duel. Go figure.
The original word for a field was the German word Körper , introduced by
Richard Dedekind in 1871 (for subfields of the complex numbers), while the expres-
sion field was introduced much later (1893) by E. Hastings Moore. We mention
in passing that Körper is actually the German word for “body”, and the equivalent
word corps is used to describe a field in French, while cuerpo is used to describe a
field in Spanish. So English is the outlier, pretty much. How surprised the reader
chooses to be by this is really up to the reader, frankly.

Exercises for Chapter 3

Exercise 3.1.
Our definition of a joint divisor of zero 0 ≠ r in a ring R says that there exist
0 ≠ x, y ∈ R such that
xr = 0 = ry.
If r is a joint divisor of zero in R, is it always the case that there exists 0 ≠ z ∈ R
such that
zr = 0 = rz?

Exercise 3.2.
Consider the ring of polynomials in two commuting variables (Z[x, y], +, ⋅) with
coefficients in (the commutative, unital ring) Z, as defined in Example 2.1.8(b).
Prove that (Z[x, y], +, ⋅) is an integral domain.

Exercise 3.3.
Prove that the ring of Gaussian integers Z[i] is indeed a ring, as the name
suggests. (Hint: use the Subring Test. Which ring should Z[i] be contained in?)

Exercise 3.4.
For which rings (R, +, ⋅) is the ring of polynomials R[x] in one variable with
coefficients in R an integral domain? Formulate and prove your conjecture.

Exercise 3.5.
Let n1 , n2 , n3 ∈ N and suppose that R = Zn1 ⊕ Zn2 ⊕ Zn3 . Find
char(R),
and formulate a conjecture of what the characteristic of the ring
S ∶= ⊕kj=1 Znj
should be, given n1 , n2 , . . . , nk ∈ N. Better yet, try to prove your conjecture!

Exercise 3.6.
Find a unital ring R and a polynomial p(x) = p0 + p1 x + ⋯ + pn xn of degree at
least two such that the equation
p(x) = 0
admits infinitely many solutions in R.

Exercise 3.7.
Let a, b ∈ Q and suppose that p ∈ N is prime. Prove that a2 − pb2 ≠ 0. This was
used in Example 3.7 above.

Exercise 3.8. ***


Let D be an integral domain, and suppose that char(D) = 0. Given d ∈ D and
n ≥ 1, recall that n d = d + d + ⋯ + d, the sum of n copies of d. If n = 0, let us define
n d = 0, and for an integer n ≤ −1, we shall define n d = −((−n)d) = −(d + d + d + ⋯ + d)
(−n times).
Prove that if 0 ≠ d, then nd = 0 if and only if n = 0.
We note that in some texts (e.g. [Her75]), this is taken as the definition of
characteristic zero for integral domains. While this works for integral domains, this
fails for more general commutative, unital rings.

Exercise 3.9. ***


Let (D, +, ⋅) be an integral domain. Prove that if char(D) ≠ 0, then char(D)
is a prime number.

Exercise 3.10.
Let H denote the ring of real Hamiltonians, defined in Example 2.4.1. Prove
that H satisfies all of the properties of a field, except that multiplication need not
be commutative. As pointed out in the Appendix to Chapter 2, we say that H is a
division ring or a skew field .
CHAPTER 4

Homomorphisms, ideals and quotient rings

If at first you don’t succeed... so much for skydiving.


Henny Youngman

1. Homomorphisms and ideals


1.1. Given two mathematical objects sharing a common structure – be they
both sets, vector spaces, groups, rings or whatever – one is interested in studying
the relationship between them. A common theme in mathematics is the notion of
a homomorphism. The nature of a homomorphism depends upon the nature of the
mathematical structure which our objects of interest share in common.
In the case of (arbitrary) sets A and B, there is essentially no algebraic structure,
and so the set homomorphisms from A to B then consist of arbitrary maps whose
domain is A and whose codomain is B. The kind of information one might glean
from these relates to the only structure the sets have: their size. For example, if one
knows that there exists an injective function f ∶ A → B (i.e. a one-to-one function),
then the size of B must be at least as great as the size of A. Indeed, this is how
we compare the sizes of infinite sets – through the existence or non-existence of
injections from the first to the second.
In the case of two vector spaces V and W over the same field (e.g. both vector
spaces over R), the morphisms are precisely the functions from V to W which pre-
serve the vector space structure. What is the “vector space structure”? The two
basic properties of a vector space V over R are that given two vectors x and y in V
and a scalar κ ∈ R, we may define their sum x + y and scalar multiple κx as elements
of V. As such, we ask that our vector space homomorphisms T ∶ V → W satisfy
T (x + y) = T x + T y and T (κx) = κT (x) for all x, y ∈ V and κ ∈ R. You will recall
from your first linear algebra course that these are precisely the functions we refer
to as linear maps.
A group (G, ⋅) admits a single binary operation, namely multiplication. Given
two groups (G, ⋅) and (H, ∗), a group homomorphism φ ∶ G → H is a map that
satisfies
φ(g1 ⋅ g2 ) = φ(g1 ) ∗ φ(g2 ) for all g1 , g2 ∈ G.

Observe that the first multiplication, namely g1 ⋅ g2 , is taking place in (G, ⋅), while
the second product φ(g1 ) ∗ φ(g2 ) is taking place in (H, ∗). (Something similar
happened with linear maps above: the sum of x and y occurred in V, whereas the
sum of T x and T y occurred in W.) More often than not, multiplication in both
G and H is denoted simply by juxtaposition, and the reader will often see a group
homomorphism defined as a map φ ∶ G → H which satisfies φ(g1 g2 ) = φ(g1 )φ(g2 )
for all g1 , g2 ∈ G. The reader is expected to keep track of which multiplication is
being used where.
Which brings us to rings. Rings (R, +, ⋅) admit two operations, as did vector
spaces. It is not surprising, therefore, that – like vector space homomorphisms – our
notion of a homomorphism for a ring should require that we respect both of these
operations.

1.2. Definition. Let (R, +, ⋅) and (S, +̇, ∗) be two rings. A map φ ∶ R → S is
said to be a ring homomorphism if
φ(r1 + r2 ) = φ(r1 )+̇φ(r2 )
and
φ(r1 ⋅ r2 ) = φ(r1 ) ∗ φ(r2 )
for all r1 , r2 ∈ R.

1.3. Similar to the case when dealing with group homomorphisms, the reader
is often expected to keep track of the two different notions of addition and of mul-
tiplication arising in R and S, and a ring homomorphism from R to S is (infor-
mally) defined as a map φ ∶ R → S that satisfies φ(r1 + r2 ) = φ(r1 ) + φ(r2 ) and
φ(r1 r2 ) = φ(r1 )φ(r2 ) for all r1 , r2 ∈ R.
If there is a risk of confusion, we shall adopt the very strict notation from
Definition 1.2. In general, however, the notation given in the previous paragraph
should not cause misunderstanding, and we shall bow to tradition and use it with
almost reckless abandon.

1.4. The homomorphisms which appear in different branches of algebra themselves
share a great many properties. If one has seen them at work in the theory
of groups, for example, then one may already have a very good idea of what to ex-
pect from ring homomorphisms. In particular, the so-called First, Second and Third
Isomorphism Theorems each have their variants for groups, rings, fields, algebras
and even vector spaces. (Unsurprisingly, our main focus in this course will be the
ring-theoretic versions of these results.)
In particular, we remind the reader of the First Homomorphism Theorem for
(real) Vector spaces, which says that if V and W are vector spaces over R, and if
T ∶ V → W is an R-linear map, then
ran T ∶= T (V) = {T v ∶ v ∈ V} ≃ V/ ker T,
where ker T = {v ∈ V ∶ T v = 0} is the kernel of the linear map T .

Here ≃ refers to a vector space isomorphism; that is, a map φ ∶ ran T →


V/ ker T which is linear and bijective.
The First Homomorphism Theorem for Vector spaces requires the notion of a
quotient vector space of a vector space with one of its subspaces. As we shall soon
see, a similar result holds for ring homomorphisms between two rings R and S. The
notion of a quotient ring will also make its appearance. Unlike the vector space
setting, however, it will not be fruitful to consider the quotient of a ring by an
arbitrary subring. Instead, we shall see that the kernel of a ring homomorphism is
a subring which has special properties, and these are the subrings which will be of
most interest to us.
1.5. Definition. Let R and S be rings and let φ ∶ R → S be a (ring) homomor-
phism. The kernel of φ is the set
ker φ ∶= {r ∈ R ∶ φ(r) = 0}.

1.6. Remark. Note that if R, S, and φ are as above, then K ∶= ker φ is a


subring of R. To show this, we shall use the Subring Test, Proposition 2.4.10.
Note first that 0R ∈ ker φ ≠ ∅. Indeed,
φ(0R ) + 0S = φ(0R ) = φ(0R + 0R ) = φ(0R ) + φ(0R ).
By Proposition 2.4.2, 0S = φ(0R ), and therefore 0R ∈ ker φ.
Let k1 , k2 ∈ K. Then
φ(k1 − k2 ) = φ(k1 ) − φ(k2 ) = 0 − 0 = 0,
and
φ(k1 k2 ) = φ(k1 )φ(k2 ) = 0 ⋅ 0 = 0.
Thus k1 − k2 and k1 k2 ∈ K, showing that K is a subring of R.

1.7. Example. Let R and S be arbitrary rings and define φ ∶ R → S via φ(r) = 0
for all r ∈ R. Then φ is a ring homomorphism, known as the trivial homomorphism
from R to S. As the reader may have already surmised, not much information about
R and S can be gleaned from studying this map.

1.8. Example. Let p ∈ N be a prime number and consider R = Q(√p) = S.
Define
φ ∶ Q(√p) → Q(√p)
a + b√p ↦ a − b√p.
Then
φ((a1 + b1 √p) + (a2 + b2 √p)) = φ((a1 + a2 ) + (b1 + b2 )√p)
= (a1 + a2 ) − (b1 + b2 )√p
= (a1 − b1 √p) + (a2 − b2 √p)
= φ(a1 + b1 √p) + φ(a2 + b2 √p).

Similarly,
φ((a1 + b1 √p)(a2 + b2 √p)) = φ((a1 a2 + pb1 b2 ) + (a1 b2 + b1 a2 )√p)
= (a1 a2 + pb1 b2 ) − (a1 b2 + b1 a2 )√p
= (a1 − b1 √p)(a2 − b2 √p)
= φ(a1 + b1 √p)φ(a2 + b2 √p).
Thus φ is a ring homomorphism of Q(√p) into (in fact onto) itself.
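This computation is easy to spot-check by machine. In the sketch below (our own illustration, not part of the notes), an element a + b√p of Q(√p) is stored as a pair of rationals; the names add, mul and conj are our choices for the ring operations and for the map φ.

```python
from fractions import Fraction

# Represent a + b*sqrt(p) in Q(sqrt(p)) as the pair (a, b) of rationals.
# p is fixed below; conj is the map phi(a + b*sqrt(p)) = a - b*sqrt(p).
p = 5

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (a1 + b1*sqrt(p))(a2 + b2*sqrt(p)) = (a1*a2 + p*b1*b2) + (a1*b2 + b1*a2)*sqrt(p)
    a1, b1 = x
    a2, b2 = y
    return (a1 * a2 + p * b1 * b2, a1 * b2 + b1 * a2)

def conj(x):
    return (x[0], -x[1])

# Spot-check the two homomorphism identities on a sample pair of elements.
u = (Fraction(1, 2), Fraction(3))
v = (Fraction(-2), Fraction(5, 7))
assert conj(add(u, v)) == add(conj(u), conj(v))
assert conj(mul(u, v)) == mul(conj(u), conj(v))
```

Of course, a finite check of this kind illustrates the identities; it does not replace the algebraic verification above.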

1.9. Example. Let n ∈ N. The map


φ∶ Z → Zn
a ↦ a mod n
is called the canonical homomorphism of Z onto Zn . That it is indeed a ring
homomorphism is left as an exercise for the reader.
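The requested verification can also be run by brute force; the sketch below (with n = 12 an arbitrary choice of ours) tests both identities over a range of integers.

```python
# Check that reduction mod n respects both ring operations, i.e. that the
# canonical map phi(a) = a mod n satisfies phi(a + b) = phi(a) + phi(b) and
# phi(a * b) = phi(a) * phi(b), where the right-hand sides are computed in Z_n.
n = 12

def phi(a):
    return a % n

for a in range(-20, 21):
    for b in range(-20, 21):
        assert phi(a + b) == (phi(a) + phi(b)) % n
        assert phi(a * b) == (phi(a) * phi(b)) % n
```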

1.10. Example. Let R, S and T be rings, and set A = R ⊕ S ⊕ T , so that A


is also a ring. The map φ1 ∶ A → R defined by φ1 (r, s, t) = r for all (r, s, t) ∈ A is a
ring homomorphism, as is the map φ2 ∶ A → S defined by φ2 (r, s, t) = s and the map
φ3 ∶ A → T defined by φ3 (r, s, t) = t for all (r, s, t) ∈ A.

This is again a good time to make sure that we are thinking while reading these
notes: what is so special about having three rings R, S and T ? Would it have worked
with four rings? With twenty? With infinitely many? If not, what goes wrong?
Is the map ψ ∶ A → S ⊕ R defined by ψ(r, s, t) = (s, r) a ring homomorphism? So
many questions! That is indeed the nature of mathematics.

1.11. Example. Consider the map


φ ∶ M2 (R) → R
[ a b ; c d ] ↦ a.
Then
φ([ a1 b1 ; c1 d1 ] + [ a2 b2 ; c2 d2 ]) = a1 + a2 ,
but
φ([ a1 b1 ; c1 d1 ][ a2 b2 ; c2 d2 ]) = φ([ a1 a2 + b1 c2  a1 b2 + b1 d2 ; c1 a2 + d1 c2  c1 b2 + d1 d2 ])
= a1 a2 + b1 c2
≠ a1 a2
= φ([ a1 b1 ; c1 d1 ])φ([ a2 b2 ; c2 d2 ])

in general. Yes, it can happen for some matrices, but it is not always true. For
example, if A ∶= [ 0 1 ; 0 0 ] and B ∶= [ 0 0 ; 1 0 ], then φ(A) = 0 = φ(B), but
φ(AB) = φ([ 1 0 ; 0 0 ]) = 1 ≠ 0 = φ(A)φ(B).
Thus φ is not a ring homomorphism.
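The counterexample can be confirmed directly. In the sketch below (our own), 2 × 2 matrices are stored as nested tuples, and matmul is an illustrative helper.

```python
# The counterexample above, checked directly: phi reads off the (1,1) entry.

def matmul(X, Y):
    # Product of two 2x2 matrices given as nested tuples.
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def phi(X):
    return X[0][0]

A = ((0, 1), (0, 0))
B = ((0, 0), (1, 0))
assert phi(A) == 0 and phi(B) == 0
assert phi(matmul(A, B)) == 1                 # phi(AB) = 1 ...
assert phi(matmul(A, B)) != phi(A) * phi(B)   # ... but phi(A)phi(B) = 0
```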

1.12. Definition. Let R and S be rings and let φ ∶ R → S be a homomorphism.


● If φ is injective, then we say that φ is a monomorphism.
● If φ is surjective, we say that φ is an epimorphism.
● If φ is bijective, we say that it is an isomorphism. In this case, we say
that R and S are isomorphic, and we write R ≃ S.
● Finally, a homomorphism of a ring R into itself is referred to as an endo-
morphism, while a bijective endomorphism is called an automorphism
of R.

1.13. Remark. From the point of view of set theory, there is not much differ-
ence between the sets A = {a, b, c} and B = {1, 2, 3}. True, one consists of letters and
the other of numbers, but those are not set-theoretic properties. The fact that each
set has three elements is a set-theoretic property, as is the fact that each set admits
a total of 8 subsets. In fact, from the point of view of Set Theory, B is just the set A
(after relabelling the entries of A). If we wish to impress people, we might say that
A and B are isomorphic as sets. Why we would want to impress people, especially
the kinds of people who would be impressed by such a statement, is another matter
altogether.
In much the same way, however, one should think of two isomorphic rings R
and S as being essentially the same, although presented differently. While one ring
might consist of numbers and another of functions, or matrices, or polynomials with
coefficients in some ring, any ring-theoretic property of R will be possessed by S
and vice-versa. That is, as rings, S is just the ring R, with the elements and
the operations relabelled.

1.14. Examples. The proofs of the following facts are computations which we
leave to the reader.
(a) Consider the map
α ∶ T2 (C) → T3 (C)
[ t11 t12 ; 0 t22 ] ↦ [ t11 0 t12 ; 0 0 0 ; 0 0 t22 ].
Then α is a monomorphism of T2 (C) into T3 (C). Clearly it is not surjective,
and so it is not an isomorphism.

(b) Consider the map
β ∶ T3 (C) → T2 (C)
[ t11 t12 t13 ; 0 t22 t23 ; 0 0 t33 ] ↦ [ t22 t23 ; 0 t33 ].
Then β is an epimorphism of T3 (C) onto T2 (C). Note that
β([ 1 0 0 ; 0 2 −9 ; 0 0 6 ]) = [ 2 −9 ; 0 6 ] = β([ −π 2 43 ; 0 2 −9 ; 0 0 6 ]),
and thus β is not injective. In particular, β is not an isomorphism.
(c) Let S ∈ Mn (C) be an invertible matrix. Consider the map
γ ∶ Mn (C) → Mn (C)
A ↦ S −1 AS.
Then
γ(A + B) = S −1 (A + B)S = S −1 AS + S −1 BS = γ(A) + γ(B),
and
γ(AB) = S −1 (AB)S = (S −1 AS)(S −1 BS) = γ(A) γ(B).
Thus γ is a homomorphism.
If γ(A) = γ(B), then S −1 AS = S −1 BS, whence
A = S(S −1 AS)S −1 = S(S −1 BS)S −1 = B,
showing that γ is injective.
Also, if X ∈ Mn (C), then Y ∶= SXS −1 ∈ Mn (C), and
γ(Y ) = S −1 Y S = X,
showing that γ is surjective.
Hence γ is an isomorphism.
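Example (c) can be spot-checked with exact integer arithmetic by picking S with determinant 1, so that S −1 also has integer entries. Everything named below (matmul, matadd, the choice of S and the sample matrices) is our own illustrative choice in M2.

```python
# A numerical spot-check of the conjugation map gamma(A) = S^{-1} A S on 2x2
# integer matrices, using an S whose inverse is also integral (det S = 1).

def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matadd(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))

S = ((1, 1), (0, 1))
S_inv = ((1, -1), (0, 1))   # inverse of S, since det S = 1

def gamma(A):
    return matmul(matmul(S_inv, A), S)

A = ((2, 3), (5, 7))
B = ((0, -1), (4, 2))
assert gamma(matadd(A, B)) == matadd(gamma(A), gamma(B))
assert gamma(matmul(A, B)) == matmul(gamma(A), gamma(B))
```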

1.15. Exercise. Let
R ∶= {[ a b ; −b a ] ∶ a, b ∈ R} ⊆ M2 (R).
Verify that R is a subring of M2 (R) using the usual matrix operations. Define
η ∶ C → R
a + bi ↦ [ a b ; −b a ].
Verify that η is an isomorphism of rings.
This is an example of what we meant when we suggested that isomorphic rings
are just a single ring, presented in more than one way. The isomorphism η allows us

to think of complex numbers as certain 2 × 2 matrices with real entries, multiplied


and added using usual matrix operations.
Any ring theoretic property of C (the fact that it is a field, for example) will be
shared by R and vice-versa. We should be careful to note that C is isomorphic to
R = η(C), and not to M2 (R), which is only the co-domain of η. (This comment also
gives the reader the opportunity to brush up on the difference between the range
and the codomain of a function, in case “it’s been a while”.)
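A quick numerical spot-check of this identification (an illustration only, not a substitute for the verification the exercise asks for): sending a + bi to the matrix with rows (a, b) and (−b, a) turns complex addition and multiplication into matrix addition and multiplication. The helpers below are our own.

```python
# eta(a + bi) = [[a, b], [-b, a]]; we check the homomorphism identities on a
# sample pair of Gaussian integers (exact in floating point at this size).

def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matadd(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))

def eta(z):
    a, b = z.real, z.imag
    return ((a, b), (-b, a))

z, w = 3 + 4j, -2 + 1j
assert eta(z + w) == matadd(eta(z), eta(w))
assert eta(z * w) == matmul(eta(z), eta(w))
```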

1.16. Exercise. Let R be a ring, and suppose that R is isomorphic to a field


F. Prove that R must be a field.

The following observations are basic, but very useful.

1.17. Proposition. Let R and T be rings and suppose that φ ∶ R → T is a


homomorphism. Let S be a subring of R.
(a) If r ∈ R and n ∈ N, then
φ(nr) = nφ(r) and φ(rn ) = φ(r)n .
In particular, φ(0) = 0 and φ(−r) = −φ(r) for all r ∈ R.
(b)
φ(S) = {φ(s) ∶ s ∈ S} is a subring of T.
In particular, ran φ = φ(R) is a subring of T .
(c) If S is commutative, then so is φ(S).
(d) If R is unital, T ≠ {0} and φ is surjective, then T is unital and 1T = φ(1R ).
Proof. Since this may be the first time we see this kind of argument, we shall be
extremely pedantic and we shall use 0R and 1R to denote the additive and multi-
plicative identities of R, and a similar notation will be used for other rings. In the
future, however, the reader will be expected to understand which ring one is working
with, and therefore which identities one is using. That is, the same notation 0 will
be used to denote the additive identity in more than one ring at a time.
(a) Note that φ(0R ) = φ(0R + 0R ) = φ(0R ) + φ(0R ). Thus
0T = φ(0R ) − φ(0R )
= (φ(0R ) + φ(0R )) − φ(0R )
= φ(0R ) + (φ(0R ) − φ(0R ))
= φ(0R ) + 0T
= φ(0R )
Meanwhile, for all r ∈ R,
0T = φ(0R ) = φ(r + (−r)) = φ(r) + φ(−r),
implying that φ(−r) = −φ(r). (Note that R and T are abelian groups under
addition, so that we have 0T = φ(−r) + φ(r) for free.)

We shall prove that φ(nr) = nφ(r) by induction, and leave the (similar)
proof that φ(rn ) = φ(r)n for all r ∈ R and n ∈ N to the reader.
Let r ∈ R be arbitrary. First note that if n = 1, then φ(1r) = φ(r) =
1φ(r), which provides the initial step in our induction process.
In general, if φ(nr) = nφ(r), then
φ((n + 1)r) = φ(nr + 1r) = φ(nr + r)
= φ(nr) + φ(r) = nφ(r) + φ(r)
= (n + 1)φ(r).
This completes the induction step, and the proof that φ(nr) = nφ(r) for
all n ∈ N, r ∈ R.
(b) Since 0R = 0S ∈ S, 0T = φ(0S ) ∈ φ(S), and thus the latter is non-empty.
Let us apply the Subring Test: Let x, y ∈ φ(S). Then there exist r, s ∈ S
such that x = φ(r) and y = φ(s). Thus
x − y = φ(r) − φ(s) = φ(r − s) ∈ φ(S),
and
xy = φ(r)φ(s) = φ(rs) ∈ φ(S),
since r, s ∈ S and S a subring of R implies that r − s and rs ∈ S.
By the Subring Test, φ(S) is a subring of T .
(c) If S is commutative, then given x, y ∈ φ(S), there exist r, s ∈ S such that
x = φ(r) and y = φ(s). Thus
xy = φ(r)φ(s) = φ(rs) = φ(sr) = φ(s)φ(r) = yx,
proving that φ(S) is commutative as well.
(d) Let t ∈ T be arbitrary. Since φ is surjective, t = φ(r) for some r ∈ R. Thus
t = φ(r) = φ(1R ⋅ r) = φ(1R )φ(r) = φ(1R ) ⋅ t,
and similarly,
t = φ(r) = φ(r ⋅ 1R ) = φ(r)φ(1R ) = t ⋅ φ(1R ).
Since t ∈ T was arbitrary, φ(1R ) is a multiplicative identity for T . Since
this must be unique, φ(1R ) = 1T .

Here is a wonderful little application of these results.
1.18. Proposition. Let n ∈ N and consider the decimal expansion of n,
n = ak 10k + ak−1 10k−1 + ⋯ + a2 102 + a1 10 + a0 .
Then n is divisible by 9 if and only if the sum ak + ak−1 + ⋯ + a2 + a1 + a0 of its digits
is divisible by 9.
Proof. Define the map
β∶ N → Z9
m ↦ m mod 9.

We leave it to the reader to verify that β is a ring homomorphism. Thus


β(n) = β(ak 10k + ak−1 10k−1 + ⋯ + a2 102 + a1 10 + a0 )
= β(ak )β(10k ) + β(ak−1 )β(10k−1 ) + ⋯ + β(a2 )β(102 ) + β(a1 )β(10) + β(a0 ).

But β(10) = 1, and so by Proposition 1.17 above, β(10j ) = β(10)j = 1j = 1 for all
j ≥ 1. Thus
n mod 9 = β(n)
= β(ak ) + β(ak−1 ) + ⋯ + β(a2 ) + β(a1 ) + β(a0 )
= (ak + ak−1 + ⋯ + a2 + a1 + a0 ) mod 9.
In particular, n is divisible by 9 if and only if β(n) = n mod 9 = 0. But β(n) = 0
if and only if the sum of the digits of n is divisible by 9.
Putting these two things together, we see that n is divisible by 9 if and only if
the sum of its digits is divisible by 9, as claimed.
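Proposition 1.18 can be tested exhaustively on a range of integers; digit_sum below is our own helper, and the bound on the range is an arbitrary choice.

```python
# Check, for every n up to 9999, that n is divisible by 9 exactly when the sum
# of its decimal digits is.

def digit_sum(n):
    return sum(int(d) for d in str(n))

for n in range(1, 10000):
    assert (n % 9 == 0) == (digit_sum(n) % 9 == 0)
```

The same style of argument (since 10 ≡ 1 mod 3 as well) yields the familiar divisibility-by-3 test, which the reader may enjoy checking the same way.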

2. Ideals
2.1. We now turn our attention to a special class of subrings of a given ring R.
Elements of this class will be called ideals of R. They will be of interest because
they can always be realised as the kernels of some homomorphism from R into some
ring S, and as mentioned in paragraph 1.4 above, they will play a central role in
establishing the three Isomorphism Theorems for rings.

2.2. Definition. Let R be a ring. A left ideal (resp. right ideal) of R is


a subring K of R with the property that if k ∈ K and r ∈ R, then rk ∈ K (resp.
kr ∈ K).
If K is both a left and a right ideal of R, we say that K is an ideal (sometimes
also a two-sided ideal) of R. We write K ⊲ R when K is a (two-sided) ideal of R.

2.3. Note that the condition that rk ∈ K for all r ∈ R and k ∈ K implies in
particular that jk ∈ K for all j, k ∈ K. As such, in the definition of an ideal, it
suffices to ask that K be an additive subgroup of R with the property that rk and
kr ∈ K for all r ∈ R, k ∈ K.
That is, by combining these remarks with the Subring Test 2.4.10, we see that
to check that K is an ideal of R, it suffices to prove that
● k − j ∈ K for all j, k ∈ K; and
● rk and kr ∈ K for all k ∈ K, r ∈ R.
For ease of reference, we shall call this the Ideal Test.

We also point out that if R is commutative, then every left or right ideal is
automatically a two-sided ideal.
2.4. Example. Consider the ring Z. Let m ∈ N and set K = mZ = {mz ∶ z ∈ Z}.
By a (hopefully by now) routine application of the Subring Test, we see that K is a
subring of Z.
If k ∈ K, then k = mz for some z ∈ Z. Thus for any r ∈ Z,
kr = rk = r(mz) = m(rz),
showing that rk, kr ∈ K. Thus K is an ideal of Z.

2.5. Example. Let R = M3 (Z), and let
K = {[ x 0 0 ; y 0 0 ; z 0 0 ] ∶ x, y, z ∈ Z}.
That K is a subring of R is left as a computation for the reader. If R = [ri,j ] ∈
M3 (Z) and K = [ x 0 0 ; y 0 0 ; z 0 0 ] ∈ K, then
RK = [ r11 r12 r13 ; r21 r22 r23 ; r31 r32 r33 ][ x 0 0 ; y 0 0 ; z 0 0 ]
= [ r11 x + r12 y + r13 z 0 0 ; r21 x + r22 y + r23 z 0 0 ; r31 x + r32 y + r33 z 0 0 ] ∈ K.
Thus K is a left ideal of M3 (Z). Note that L ∶= [ 1 0 0 ; 0 0 0 ; 0 0 0 ] ∈ K and
S ∶= [ 1 2 3 ; 0 0 0 ; 0 0 0 ] ∈ R, but
LS = S ∉ K,
proving that K is not a right ideal, and a fortiori not a two-sided ideal.
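Example 2.5 lends itself to a quick numerical check. The sketch below (our own, not from the notes) stores 3 × 3 integer matrices as nested tuples; matmul, in_K and the sample matrices are all illustrative choices.

```python
# K consists of the matrices whose last two columns vanish; it absorbs
# multiplication on the left but not on the right.

def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def in_K(M):
    # True when columns 2 and 3 (indices 1 and 2) of M are zero.
    return all(M[i][j] == 0 for i in range(3) for j in (1, 2))

K_elt = ((4, 0, 0), (-1, 0, 0), (7, 0, 0))
R_sample = ((1, 2, 3), (4, 5, 6), (7, 8, 9))
assert in_K(matmul(R_sample, K_elt))   # left multiplication stays in K

L = ((1, 0, 0), (0, 0, 0), (0, 0, 0))
S_mat = ((1, 2, 3), (0, 0, 0), (0, 0, 0))
assert matmul(L, S_mat) == S_mat       # LS = S ...
assert not in_K(matmul(L, S_mat))      # ... which does not lie in K
```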
The most important example of an ideal stems from the following result. Al-
though its proof is routine, its significance is profound. As we shall eventually see,
every ideal of R arises in this manner, for an appropriate choice of S and corre-
sponding homomorphism φ.

2.6. Theorem. Let R and S be rings and let φ ∶ R → S be a homomorphism.


Then
K ∶= ker φ
is a two-sided ideal in R.
Moreover, φ is injective if and only if ker φ = {0}.
Proof. Let k1 , k2 ∈ K. Then φ(k1 ) = 0 = φ(k2 ). Thus
φ(k1 − k2 ) = φ(k1 ) − φ(k2 ) = 0 − 0 = 0,

whence k1 − k2 ∈ K.
If r ∈ R and k ∈ K are arbitrary, then
φ(rk) = φ(r)φ(k) = φ(r)0 = 0 = 0φ(r) = φ(k)φ(r) = φ(kr),
proving that rk and kr both lie in K.
By the Ideal Test, K is a two-sided ideal of R.

By Proposition 1.17 above, we know that φ(0) = 0 for any homomorphism. If φ


is injective, then we conclude that 0 ≠ r implies that 0 ≠ φ(r), and thus ker φ = {0}.

Conversely, suppose that ker φ = {0}. If φ(r1 ) = φ(r2 ) for some r1 , r2 ∈ R, then
0 = φ(r1 ) − φ(r2 ) = φ(r1 − r2 ),
implying that r1 − r2 ∈ ker φ = {0}, whence r1 = r2 . Thus φ is injective.

2.7. Because of limitations of time, we shall mostly concentrate on two-sided


ideals in our discussions below. Our readers should nevertheless ask themselves
which results hold for one-sided ideals and, for those that do not, what can be done
to “repair” them.

2.8. Proposition. Let R be a ring and J, K be ideals of R. The following are


also ideals of R.
(a) J + K ∶= {j + k ∶ j ∈ J, k ∈ K}.
(b) J ∩ K ∶= {m ∶ m ∈ J and m ∈ K}.
(c) JK ∶= {∑qi=1 ji ki ∶ q ≥ 1, ji ∈ J, ki ∈ K, 1 ≤ i ≤ q}.
Proof. In all cases, we shall apply the Ideal Test from paragraph 2.3.
(a) Let a = j1 + k1 and b = j2 + k2 ∈ J + K, where, as the notation suggests,
j1 , j2 ∈ J and k1 , k2 ∈ K. Then
a − b = (j1 + k1 ) − (j2 + k2 ) = (j1 − j2 ) + (k1 − k2 ).
But J and K are rings, by virtue of their being ideals, and thus j1 − j2 ∈ J
and k1 − k2 ∈ K. It follows that a − b ∈ J + K.
If r ∈ R, then ra = rj1 + rk1 , while ar = j1 r + k1 r. Since J and K are
ideals, rj1 and j1 r ∈ J, while rk1 and k1 r ∈ K. Hence ra, ar ∈ J + K.
By the Ideal Test, J + K ⊲ R.
(b) Let a, b ∈ J ∩ K. Since each of J and K is a ring, a − b ∈ J and a − b ∈ K,
whence a − b ∈ J ∩ K.
Also, given r ∈ R, the fact that both J and K are ideals implies that
ra, ar ∈ J and ra, ar ∈ K, whence ar, ra ∈ J ∩ K.
By the Ideal Test, J ∩ K ⊲ R.

(c) Let a = ∑pm=1 jm km and b = ∑qn=1 j′n k′n ∈ JK, where jm , j′n ∈ J and
km , k′n ∈ K for 1 ≤ m ≤ p, 1 ≤ n ≤ q. Since J is a ring, −j′n ∈ J for
all 1 ≤ n ≤ q, and so
a − b = ( ∑pm=1 jm km ) − ( ∑qn=1 j′n k′n )
= ( ∑pm=1 jm km ) + ( ∑qn=1 (−j′n ) k′n )
lies in JK.
If r ∈ R, then rjm ∈ J and km r ∈ K for all 1 ≤ m ≤ p, so that
ra = r ( ∑pm=1 jm km ) = ∑pm=1 (rjm ) km ∈ JK,
and
ar = ( ∑pm=1 jm km ) r = ∑pm=1 jm (km r) ∈ JK.
By the Ideal Test, JK ⊲ R.

2.9. Definition. Let R be a ring and F ⊆ R. We denote by ⟨F ⟩ the smallest


ideal of R which contains F , i.e.,
⟨F ⟩ ∶= ∩{K ⊲ R ∶ F ⊆ K}.
We refer to this as the ideal generated by F .
If F = {k} consists of a single element, we typically write ⟨k⟩ instead of ⟨{k}⟩;
also, we say in this case that ⟨F ⟩ is a principal or singly-generated ideal.

2.10. Remarks.
(a) It is not hard to show that (and we leave it as an exercise for the reader to
show that) if R is a unital ring and ∅ ≠ F ⊆ R, then
⟨F ⟩ = {∑ni=1 ri xi si ∶ n ∈ N, ri , si ∈ R, xi ∈ F, 1 ≤ i ≤ n}.
When R is non-unital, there is a slight issue that arises with this description.
For example, suppose that
R = {[ 0 x ; 0 0 ] ∶ x ∈ R} ,
which forms a non-unital subring of M2 (R). Let a = [ 0 1 ; 0 0 ], and consider
F = {a}. For any r, s ∈ R, it is straightforward to verify that ras = [ 0 0 ; 0 0 ],

and so a would not belong to the set described upon the right-hand side of
the above equation, although a ∈ ⟨a⟩ by definition. (The set on the right
hand side of the equation would just contain the zero matrix in this case.)
As such, this description of the ideal generated by a set F does not hold in
general when R is non-unital.
(b) What is true in general is that if R is any ring, a ∈ R, and we define
ma ∶= a + a + ⋯ + a (m times), then ma ∈ ⟨a⟩ for all m ∈ N, and so (−m)a ∶=
−(ma) ∈ ⟨a⟩ as well. Writing Za ∶= {ma ∶ m ∈ Z}, we find that in general,
⟨a⟩ = {ma + ra + as + ∑nk=1 rk ask ∶ m ∈ Z, n ≥ 1, r, s, rk , sk ∈ R, 1 ≤ k ≤ n}.

If we apply this to the example in part (a) above, we find that
⟨a⟩ = {[ 0 m ; 0 0 ] ∶ m ∈ Z} .

2.11. Exercise. Find all ideals of Z, proving thereby that every ideal of Z is
a principal ideal. (An integral domain in which every ideal is principal is called a
principal ideal domain, denoted by pid. Thus Z is a pid. We shall return to these
later.)

2.12. Exercise. Let R = Z, and suppose that J = ⟨6⟩, K = ⟨4⟩. Find J + K,


J ∩ K and JK. Generalise your result to arbitrary ideals J = ⟨m⟩ and K = ⟨n⟩ of Z.
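Before attempting Exercise 2.12 on paper, one might run a small experiment. The sketch below (our own; the truncation of the ideals to a finite range of generators is an assumption of the experiment) lists the small non-negative elements of J + K, J ∩ K and JK, leaving the reader to spot the pattern. This is exploration, not a proof.

```python
# Finite slices of J = <6> and K = <4> in Z, and of the three derived ideals.
N = 49
J = {6 * z for z in range(-N, N)}
K = {4 * z for z in range(-N, N)}

J_plus_K = {j + k for j in J for k in K}
J_cap_K = J & K
JK = {j * k for j in J for k in K}   # single products jk; finite sums of such
                                     # products follow the same pattern here

print(sorted(x for x in J_plus_K if 0 <= x < N))
print(sorted(x for x in J_cap_K if 0 <= x < N))
print(sorted(x for x in JK if 0 <= x < N))
```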

2.13. Example. Let R = (C([0, 1], R), +, ⋅). We leave it to the reader to verify
that J ∶= {f ∈ R ∶ f (x) = 0, x ∈ [0, 1/3]} and K ∶= {g ∈ R ∶ g(x) = 0, x ∈ [1/4, 1]} are
ideals of R.
● Consider J ∩ K. If h ∈ J ∩ K, then h ∈ J implies that h(x) = 0 for all
x ∈ [0, 1/3], and h ∈ K implies that h(x) = 0 for all x ∈ [1/4, 1]. But then
h(x) = 0 for all x ∈ [0, 1], so h = 0.
That is, J ∩ K = {0}.
● Next, consider JK. If f ∈ J and g ∈ K, then
f (x)g(x) = 0 ⋅ g(x) = 0 if x ∈ [0, 1/3], and f (x)g(x) = f (x) ⋅ 0 = 0 if x ∈ [1/4, 1].
Thus f (x)g(x) = 0 for all x ∈ [0, 1], so f g = 0.
More generally, if h ∈ JK, then h = ∑ni=1 fi gi for some n ∈ N and some
fi ∈ J and gi ∈ K, 1 ≤ i ≤ n. Since each fi gi = 0 from the above computation,
we see that h = ∑ni=1 0 = 0.
Thus JK = {0}.

● As for J + K, let h ∈ R be any continuous function such that h(x) = 0,
x ∈ [1/4, 1/3]. Define
f (x) = 0 for x ∈ [0, 1/3] and f (x) = h(x) for x ∈ [1/3, 1],
and
g(x) = h(x) for x ∈ [0, 1/4] and g(x) = 0 for x ∈ [1/4, 1].
Then f and g are continuous on [0, 1] (you should check this!), f ∈ J and
g ∈ K, so h = f + g ∈ J + K.
Thus {h ∈ C([0, 1], R) ∶ h(x) = 0, x ∈ [1/4, 1/3]} ⊆ J + K. Conversely, if
f ∈ J and g ∈ K, then f + g is continuous because each of f and g is, and
for x ∈ [1/4, 1/3], we see that
(f + g)(x) = f (x) + g(x) = 0 + 0 = 0.
Hence J + K ⊆ {h ∈ C([0, 1], R) ∶ h(x) = 0, x ∈ [1/4, 1/3]}. Combining these two
containments,
J + K = {h ∈ C([0, 1], R) ∶ h(x) = 0, x ∈ [1/4, 1/3]}.

3. Cosets and quotient rings


3.1. The notion of cosets appears not only in ring theory, but in many areas of
algebra, including the theories of linear algebra, groups, rings, fields and algebras
over a field. Hopefully, many of you will have already come across this concept in
linear algebra.
For example, if we consider the real vector space V = R3 , then, given any subspace
W of V, we may consider the cosets x+W ∶= {x+w ∶ w ∈ W}. What are the subspaces
of V? The answer depends upon the dimension of W.
● If dim W = 0, then W = {0}, the origin of the vector space. In this case,
x + W = {x}, a single point in V - which we may think of as the origin
translated by x.
● If dim W = 1, then W is a line through the origin. In this case, x + W is a
line parallel to W, passing through x + 0 = x.
● If dim W = 2, then W is a plane through the origin. In this case, x + W is
a plane parallel to W, passing through x.
● Finally, if dim W = 3, then W = V and x + W = V is the entire space.
Note that a line, a plane and the entire vector space each consist of infinitely
many points, but they are single quantities unto themselves: we are dealing with a
single line, a single plane or a single copy of the entire space. Each coset is a set.
The sets have many points in them (except in the case where W = {0} above!), but
that’s neither here nor there.
The concept of a coset is not purely an abstract construction, however. You will
recall our earlier reference to your grandmaman’s watch, and our earlier threat to
resume this thread of conversation. As you shall now see, that threat was anything

but idle. Assuming that your grandmaman was a high-ranking member of the army
and had a 24-hour watch with the hours indicated 0000, 0100, 0200, . . . , 2200, 2300
but with the date notably absent from the face of her watch (it’s been a while since
she retired, obviously), we see that 1400 hours refers not just to two in the afternoon
today, but in fact two in the afternoon tomorrow, the next day, yesterday and in
fact every day. If we take a philosophical leap of faith and make the simplifying
assumption that time has no beginning and no end (a mild exaggeration, perhaps),
then 1400 on your grandmaman’s watch would really indicate a member of {1400 +
2400n ∶ n ∈ Z}, which is the set of all two-in-the-afternoons throughout eternity,
moving forwards or backwards in time. In other words, two in the afternoon on
your grandmaman’s watch is just the set 1400 + 2400Z, a coset of the subring 2400Z
of Z.
The collection of all two-in-the-afternoons we think of as a single object, denoted
by 1400 on the face of your grandmaman’s watch. Observe that if we wanted to know
what her watch would say three hours from now, we would add 0300 + 2400Z to
1400 + 2400Z to get 1700 + 2400Z, i.e. 5 in the afternoon. Of course, if we wanted to
know the time tomorrow plus three hours from now, we would add 2700+2400Z to the
current time 1400+2400Z to get 4100+2400Z. But there is no 4100 on the face of your
grandmaman’s watch, because this is indicated by 1700, since 4100 = 1700 + 2400,
and thus 4100 + 2400Z is represented by 1700 + 2400Z.
The point is, if you understand your grandmaman’s watch, then you understand
cosets. Of course, the advent of smart-phone technology is quickly making your
grandmaman and indeed everyone’s grandmaman obsolete, but that unfortunate
circumstance is way beyond the scope of these lecture notes.
To summarise: all of this suggests that the concept of a coset is either not as
complicated as we are tempted to believe it might be, or that your grandmaman
possessed some crazy-mad sophisticated technology. Your humble author prefers the
former explanation.
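The watch can even be modelled in a few lines. In the sketch below (our own), a time of day is the coset t + 2400Z of the subring 2400Z in Z, and the face of the watch displays the unique representative lying in [0, 2400).

```python
# Grandmaman's watch as coset arithmetic in Z modulo 2400Z.

def watch(t):
    return t % 2400   # canonical representative of the coset t + 2400Z

assert watch(1400 + 300) == 1700    # three hours after 1400 hours
assert watch(1400 + 2700) == 1700   # tomorrow plus three hours: same display
assert watch(4100) == watch(1700)   # 4100 + 2400Z = 1700 + 2400Z
```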

3.2. Definition. Let R be a ring and L ⊆ R be a subring of R. A coset of L is


a set of the form
x + L ∶= {x + m ∶ m ∈ L}.
(In other words, it is a translation of L by x.) We say that x is a representative
of the coset x + L.
The collection of all cosets of L in R is denoted by

R/L ∶= {x + L ∶ x ∈ R}.

3.3. The astute reader (hopefully you!) will have noticed that we referred to
x above as a representative of the coset x + L, as opposed to the representative of
x + L. There’s a very good reason for this. Unless L = {0} is the trivial subring,
then the representative of x + L is not unique, as the next result proves.

3.4. Proposition. Let R be a ring and L be a subring of R. Let x, y ∈ R. Then


the following statements are equivalent:
(a) x + L = y + L; that is, both x and y are representatives of the coset x + L.
(b) x − y ∈ L.
Proof.
(a) Suppose that x + L = y + L. Since 0 ∈ L, x + 0 = x ∈ y + L and so there exists
m ∈ L such that x = y + m. That is, x − y = m ∈ L.
(b) Suppose next that m ∶= x − y ∈ L, so that x = y + m. Then for any n ∈ L,
x + n = (y + m) + n = y + (m + n) ∈ y + L. Hence x + L ⊆ y + L.
But we also have that y = x − m, and hence given any n ∈ L, y + n =
(x + (−m)) + n = x + (n − m) ∈ L, proving that y + L ⊆ x + L.
Taken together, these two statements show that x + L = y + L.

3.5. Examples.
(a) Let R = Z and L = ⟨12⟩. Let us show that 7 + L = 19 + L ≠ 4 + L.

Observe that
7 + ⟨12⟩ = 7 + {. . . , −24, −12, 0, 12, 24, 36, . . .}
= {. . . , −17, −5, 7, 19, 31, 43, . . .}
= 19 + ⟨12⟩,
but that
4 + ⟨12⟩ = 4 + {. . . , −24, −12, 0, 12, 24, 36, . . .}
= {. . . , −20, −8, 4, 16, 28, 40, . . .}
≠ 7 + ⟨12⟩.
(For example, 4 ∈ 4 + ⟨12⟩, but 4 ∉ 7 + ⟨12⟩.)
(b) Note that
D3 (R) = {[ d1 0 0 ; 0 d2 0 ; 0 0 d3 ] ∶ d1 , d2 , d3 ∈ R}
is a subring of T3 (R). It is not an ideal of T3 (R) since, for example, the
matrix
E12 ∶= [ 0 1 0 ; 0 0 0 ; 0 0 0 ] ∈ T3 (R) and D1 ∶= [ 1 0 0 ; 0 0 0 ; 0 0 0 ] ∈ D3 (R),
but
E12 = D1 ⋅ E12 ∉ D3 (R).

Fix a matrix, say T = [ 17 2 π − e2 ; 0 −12.1 −√2 ; 0 0 1 ] ∈ T3 (R). Then
T + D3 (R) = {[ x 2 π − e2 ; 0 y −√2 ; 0 0 z ] ∶ x, y, z ∈ R} .
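Proposition 3.4 below gives a mechanical test for coset equality which we can already run on Example 3.5(a): two representatives name the same coset of L exactly when their difference lies in L. The helper below is our own sketch for the ring Z with L = ⟨12⟩.

```python
# Two integers represent the same coset of <m> in Z precisely when their
# difference is a multiple of m.

def same_coset(x, y, m=12):
    return (x - y) % m == 0

assert same_coset(7, 19)      # 7 + <12> = 19 + <12>
assert not same_coset(7, 4)   # 4 + <12> is a different coset
```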

3.6. In linear algebra, one shows that if V is a vector space (over a field F) and
W is a subspace of V, then we can view the set V/W of cosets of W in V as a vector
space using the operations
(x + W) + (y + W) ∶= (x + y) + W
and
κ(x + W) ∶= (κx) + W
for all x, y ∈ V and κ ∈ F. Unfortunately, in the case of rings, it is not true that the
cosets of a subring can always be made into a ring using the canonical operations.
This is where the notion of an ideal takes on all of its importance.

3.7. Theorem. Let R be a ring and L be a subring of R. The set R/L of all
cosets of L in R becomes a ring using the operations
(x + L) + (y + L) ∶= (x + y) + L
and
(x + L)(y + L) ∶= (xy) + L
for x + L, y + L ∈ R/L if and only if L is an ideal of R.
Proof.
● Suppose first that L is an ideal of R. Let us prove that R/L is a ring under the
operations defined above. Note that the Subring Test does not apply here – R/L
is not contained in anything that we already know to be a ring. As such, we are
forced to use the definition of a ring, which involves checking a great many things.
Life isn’t always easy.
The first thing that we have to do is to verify that the operations are well-defined.
(See the Appendix to this Chapter to learn more about when we have to check that
operations are well-defined.)

Note that if x1 + L = x2 + L and y1 + L = y2 + L, then m1 ∶= x1 − x2 and m2 ∶=


y1 − y2 ∈ L by Proposition 3.4. Thus
(x1 + y1 ) − (x2 + y2 ) = (x1 − x2 ) + (y1 − y2 ) = m1 + m2 ∈ L,
and so
(x1 + y1 ) + L = (x2 + y2 ) + L,
again by Proposition 3.4.

Similarly,
(x1 y1 )−(x2 y2 ) = x1 y1 −x1 y2 +x1 y2 −x2 y2 = x1 (y1 −y2 )+(x1 −x2 )y2 = x1 m2 +m1 y2 ∈ L,
since L is an ideal. Thus
(x1 y1 ) + L = (x2 y2 ) + L.
That is, the operations of addition and multiplication we are using are well-defined.
Now we get to the nitty-gritty of checking that R/L is a ring. Sigh. Nah, “chin up”
– let’s do it.

For all r, s, t ∈ R, (r + L) + (s + L) ∶= (r + s) + L ∈ R/L, so that R/L is closed under


addition. Also,
● ((r + L) + (s + L)) + (t + L) = ((r + s) + L) + (t + L) = ((r + s) + t) + L. Since
addition in R is associative we may continue:
((r + s) + t) + L = (r + (s + t)) + L
= (r + L) + ((s + t) + L)
= (r + L) + ((s + L) + (t + L)).
That is,
((r + L) + (s + L)) + (t + L) = (r + L) + ((s + L) + (t + L)).
Hence addition is associative in R/L.
● For all r + L ∈ R/L, (r + L) + (0 + L) = (r + 0) + L = r + L = (0 + L) + (r + L).
That is, 0 + L is an additive identity for R/L.
● For all r + L ∈ R/L,
(r + L) + ((−r) + L) = (r − r) + L
=0+L
= ((−r) + r) + L
= ((−r) + L) + (r + L).
Hence (−r) + L = −(r + L) is the additive inverse of r + L in R/L.
● (r + L) + (s + L) = (r + s) + L = (s + r) + L = (s + L) + (r + L), and so (R/L, +)
is an abelian group under addition.
Next, note that (r + L)(s + L) ∶= (rs) + L ∈ R/L, so that R/L is closed under
multiplication. Also,
● ((r + L)(s + L))(t + L) = ((rs) + L)(t + L) = ((rs)t) + L. Since multiplication
in R is associative we may continue:
((rs)t) + L = (r(st)) + L
= (r + L)((st) + L)
= (r + L)((s + L)(t + L)).
That is,
((r + L)(s + L))(t + L) = (r + L)((s + L)(t + L)).

Hence multiplication is associative in R/L.


● ((r + L) + (s + L))(t + L) = ((r + s) + L)(t + L) = ((r + s)t) + L. Since
multiplication distributes over addition in R we find that:
((r + s)t) + L = (rt + st) + L
= (rt + L) + (st + L)
= (r + L)(t + L) + (s + L)(t + L).
That is,
((r + L) + (s + L))(t + L) = (r + L)(t + L) + (s + L)(t + L).
● Finally, (t + L)((r + L) + (s + L)) = (t + L)((r + s) + L) = (t(r + s)) + L. Since
multiplication distributes over addition in R we find that:
(t(r + s)) + L = (tr + ts) + L
= (tr + L) + (ts + L)
= (t + L)(r + L) + ((t + L)(s + L)).
That is,
(t + L)((r + L) + (s + L)) = (t + L)(r + L) + (t + L)(s + L).
Thus multiplication distributes over addition in R/L.
We have now verified that conditions (A1) - (A5) and (M1) - (M4) from Defini-
tion 2.1.2 hold for R/L, and thus it is a ring.
● Conversely, suppose that L is a subring, but not an ideal of R. We argue by
contradiction. To that end, suppose that R/L is a ring under the operations above.
An easy calculation shows that 0 + L must be the zero element of R/L.
The fact that L is a subring but not an ideal of R implies that there exist m ∈ L
and r ∈ R such that either rm ∉ L or mr ∉ L. Consider the case where rm ∉ L. Then
m ∈ L implies that m + L = 0 + L and so
0 + L = (r + L)(0 + L) = (r + L)(m + L) = (rm) + L.
By Proposition 3.4, we see that rm ∈ L, a contradiction. The case where mr ∉ L is
handled similarly and is left to the reader.

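To see why the absorbing property is genuinely needed, consider Q with the subring Z, which is not an ideal (for instance, (1/2) ⋅ 1 = 1/2 ∉ Z). The following Python sketch, our own illustration rather than anything from the notes, exhibits two representatives of the same coset of Z whose products with a fixed coset land in different cosets:

```python
from fractions import Fraction

# R = Q, and L = Z is a subring of Q that is NOT an ideal of Q.
# Two rationals represent the same coset of Z iff they differ by an integer.
def same_coset(a, b):
    return (a - b).denominator == 1

x1, x2 = Fraction(1, 2), Fraction(3, 2)   # x1 + Z = x2 + Z
y = Fraction(1, 2)

assert same_coset(x1, x2)
assert same_coset(x1 + y, x2 + y)      # coset addition is well-defined here...
assert not same_coset(x1 * y, x2 * y)  # ...but coset multiplication is not:
                                       # 1/4 + Z and 3/4 + Z are different cosets
```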
The following result, while easy to prove, is of great importance, because it
establishes what ideals of a ring are. They are those subrings which arise as the
kernel of some homomorphism of the ring. Why is this of interest? As stated more
than once already in these course notes, the way that we investigate the relation
between two comparable mathematical objects (in this course the objects under
study are rings) is to use maps between those objects that respect the structure of
those objects. This is why we are particularly interested in ring homomorphisms.
They help us to understand how those two rings are related.
If R and S are rings, then the kernel of a ring homomorphism φ ∶ R → S gives
us a measure of how much information we are losing from the first ring when we

carry it into the second. Anything in the kernel gets sent to 0, and so (for example)
if r ∈ ker φ, then so are r^2 , r^3 , r + r^2 + r^3 , etc., so the homomorphism by itself can’t
distinguish between these elements.
We get extra information from the kernel as well; for example, if the kernel of
the ring homomorphism φ has exactly 10 elements in it, then not only are these 10
elements sent to 0, but in fact any element s in the range of φ is then the image of
exactly 10 elements of R. The reason for this is that φ(r1 ) = s = φ(r2 ) if and only
if φ(r2 − r1 ) = 0; i.e. if r2 = r1 + k for some k ∈ ker φ. In particular, the map φ is
injective if and only if its kernel is {0}, in which case φ loses no information – φ(R)
is isomorphic to R in this case. So the point is that ring homomorphisms are the
object we really want to study.
We originally defined an ideal K of a ring R as a subring with the special property
that k ∈ K and r ∈ R implies that both rk and kr lie in K. Let us (very) temporarily
change the name of such subrings to “multiplication absorbing subrings”. Now the
question becomes: why should we care about multiplication absorbing subrings?
Why did we study them in the first place?
Half of the reason lies in Theorem 2.6, which states that kernels of homomor-
phisms are in fact multiplication absorbing subrings. The other half of the reason
is Theorem 3.8 below, which says that these are the only kinds of multiplication
absorbing subrings. That is, multiplication absorbing subrings and kernels of ring
homomorphisms are just two ways of looking at the same thing. So why bother with
this alternate description? Suppose that we didn’t know that these two objects were
the same. If one were to present you with a subring L of a ring R and ask you if
it is the kernel of some homomorphism (and therefore very important), how would
you check if this is so? How would you look for a ring S and a homomorphism
φ ∶ R → S? You can’t say “Hey, Theorem 3.7 is the answer!”, because Theorem 3.7
only tells you that a subring L is a multiplication absorbing subring. Knowing that
L is the kernel of a homomorphism if and only if it is a multiplication absorbing
subring tells us that instead of looking for some ring S and some homomorphism
φ ∶ R → S whose kernel is L, you only have to apply the multiplication absorbing
subring Test, which – in our infinite wisdom – we actually refer to as the Ideal Test.
Alternatively, it suffices to check that R/L is a ring under the canonical operations.
So why don’t we use the expression “multiplication absorbing subring”? My
guess is that it is because other mathematicians are jealous of me and my fancy
terminology, but this is just speculation.

3.8. Theorem. Let R be a ring, and let K be an ideal of R. Then there exists
a ring S and a homomorphism φ ∶ R → S such that K = ker φ.
Proof. By Theorem 3.7 above, R/K is a ring using the canonical operations. Let

φ ∶ R → R/K
r ↦ r + K.

Then

φ(r1 + r2 ) = (r1 + r2 ) + K = (r1 + K) + (r2 + K) = φ(r1 ) + φ(r2 ),

and
φ(r1 r2 ) = (r1 r2 ) + K = (r1 + K)(r2 + K) = φ(r1 ) φ(r2 ).
Thus φ is a ring homomorphism.
The additive identity element of R/K is 0+K, since (r+K)+(0+K) = (r+0)+K =
r + K for all r ∈ R. By Proposition 3.4,

φ(r) = r + K = 0 + K

if and only if r = r − 0 ∈ K. That is, ker φ = K.


3.9. Example. Let R[x] denote the ring of polynomials with real coefficients,
and fix a real number r. Define the map
δr ∶ R[x] → R
p(x) ↦ p(r).

Then
δr ((pq)(x)) = (pq)(r) = p(r)q(r) = δr (p(x)) δr (q(x)),
and
δr ((p + q)(x)) = (p + q)(r) = p(r) + q(r) = δr (p(x)) + δr (q(x)).
Thus δr is a homomorphism, often referred to as evaluation at r.
Note that p(x) ∈ ker δr if and only if 0 = δr (p(x)) = p(r). That is,

ker δr = {p(x) ∶ p(r) = 0}.

Let f (x) = x − r ∈ R[x]. Then f (r) = 0, so f ∈ ker δr . Note that ⟨f (x)⟩ ⊆ ker δr ,
since if g(x) ∈ ⟨f (x)⟩, then g(x) = h(x)f (x) for some h(x) ∈ R[x] (why don’t we
need to consider finite sums of functions here?), and thus

δr (g(x)) = g(r) = h(r)f (r) = h(r)0 = 0.

Conversely, if p(x) ∈ ker δr , then p(r) = 0, and so r is a root of p(x). But then
f (x) = x−r divides p(x), and so there exists k(x) ∈ R[x] such that p(x) = k(x)f (x) ∈
⟨f (x)⟩.
(Note: the fact that p(r) = 0 implies that x − r divides p(x) is something that
we know from high school in the setting of polynomials with coefficients in R, but
which we shall prove in greater generality in a later chapter (see Corollary 7.1.4)).
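The containment ker δr ⊆ ⟨f (x)⟩ can also be checked numerically: synthetic division of p(x) by (x − r) returns a quotient together with a remainder, and the remainder is exactly p(r). The sketch below is our own illustration; the function name divmod_x_minus_r is not from the notes.

```python
def divmod_x_minus_r(coeffs, r):
    """Synthetic division of p(x) by (x - r).

    `coeffs` lists p(x) with the highest-degree coefficient first.  Returns
    (quotient, remainder); the remainder always equals p(r)."""
    quotient, acc = [], 0
    for c in coeffs:
        acc = acc * r + c
        quotient.append(acc)
    return quotient[:-1], quotient[-1]

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3), so p(2) = 0:
q, rem = divmod_x_minus_r([1, -6, 11, -6], 2)
assert rem == 0          # p(x) lies in ker δ_2 ...
assert q == [1, -4, 3]   # ... and p(x) = (x - 2)(x^2 - 4x + 3) ∈ <x - 2>
```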

3.10. Example. Let κ ∈ N and consider the map

φ∶ Z/⟨κ⟩ → Zκ
n + ⟨κ⟩ ↦ n mod κ.

Once again, we must check that φ is well-defined, mustn’t we? But if n + ⟨κ⟩ =
m + ⟨κ⟩, then m − n ∈ ⟨κ⟩, i.e. m − n = κz for some z ∈ Z. Thus

φ(n + ⟨κ⟩) = n mod κ = (n + (m − n)) mod κ = m mod κ = φ(m + ⟨κ⟩),

proving that φ is well-defined.


Then

φ((n + ⟨κ⟩) + (m + ⟨κ⟩)) = φ((n + m) + ⟨κ⟩)


= (n + m) mod κ
= n mod κ + m mod κ
= φ(n + ⟨κ⟩) + φ(m + ⟨κ⟩),

and similarly (the verification is left to the reader)

φ((n + ⟨κ⟩)(m + ⟨κ⟩)) = φ(n + ⟨κ⟩)φ(m + ⟨κ⟩),

proving that φ is a homomorphism.


For any 0 ≤ n < κ, φ(n + ⟨κ⟩) = n mod κ, and thus φ is surjective.
Furthermore, n + ⟨κ⟩ ∈ ker φ if and only if n mod κ = 0 mod κ, and this happens
if and only if κ divides n. That is, it happens if and only if n ∈ ⟨κ⟩, or equivalently,
when n + ⟨κ⟩ = 0 + ⟨κ⟩.
Thus the kernel of φ is trivial, and so φ is injective.
By definition, φ is an isomorphism, and so
Z/⟨κ⟩ ≃ Zκ .
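One can confirm this isomorphism exhaustively for a small modulus, modelling the coset n + ⟨κ⟩ by its canonical representative n mod κ. A quick sketch (with κ = 6 chosen arbitrarily, not taken from the notes):

```python
KAPPA = 6  # any modulus works; 6 is an arbitrary choice

# Model the coset n + <κ> of Z/<κ> by its canonical representative n mod κ.
def phi(n):
    return n % KAPPA

# Well-defined: representatives differing by a multiple of κ have equal images.
assert phi(5) == phi(5 + 7 * KAPPA)

# Homomorphism property, checked exhaustively on representatives.
for n in range(KAPPA):
    for m in range(KAPPA):
        assert phi(n + m) == (phi(n) + phi(m)) % KAPPA
        assert phi(n * m) == (phi(n) * phi(m)) % KAPPA

# Bijective: the κ distinct cosets hit every element of Z_κ exactly once.
assert sorted(phi(n) for n in range(KAPPA)) == list(range(KAPPA))
```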

4. The Isomorphism Theorems


4.1. The importance of the First Isomorphism Theorem we shall prove below
cannot be overstated. Such a result holds in most categories (i.e. vector spaces,
groups, rings, fields, algebras, etc.), and it is a sublimely useful tool in analysing
homomorphisms.
The other two Isomorphism Theorems are also quite useful, though the First
Isomorphism Theorem really stands apart.

4.2. Theorem. (The First Isomorphism Theorem) Let R and S be rings


and suppose that φ ∶ R → S is a homomorphism. Let K ∶= ker φ. The map
τ∶ R/K → φ(R)
r + K ↦ φ(r)
is an isomorphism. In other notation, φ(R) ≃ R/ ker φ.
Proof. Since τ is defined on cosets in terms of a (generally non-unique) represen-
tative of that coset, we shall need to prove that τ is well-defined.
Suppose that r +K = s+K for some r, s ∈ R. Then r −s ∈ K, and so φ(r) −φ(s) =
φ(r − s) = 0, i.e. φ(r) = φ(s). Then τ (r + K) = φ(r) = φ(s) = τ (s + K), proving that
τ is indeed well-defined.
Note that
τ ((r + K) + (s + K)) = τ ((r + s) + K) = φ(r + s) = φ(r) + φ(s) = τ (r + K) + τ (s + K),
and that
τ ((r + K)(s + K)) = τ ((rs) + K) = φ(rs) = φ(r) φ(s) = τ (r + K) τ (s + K),
proving that τ is a homomorphism.

By definition, for φ(r) ∈ φ(R), τ (r+K) = φ(r), proving that τ is surjective. Also,
if r + K ∈ ker τ , then 0 = τ (r + K) = φ(r), proving that r ∈ ker φ. But K = ker φ,
and thus r + K = 0 + K is the zero element of R/K. That is, by Theorem 2.6, τ is
injective.
Thus τ is a bijective homomorphism (i.e. an isomorphism) from R/K onto φ(R),
completing the proof.

4.3. Theorem. (The Second Isomorphism Theorem) Let M and N be


ideals of a ring R. Recall that
M + N = {m + n ∶ m ∈ M, n ∈ N }.
Then M + N is an ideal of R, and
(M + N )/N ≃ M /(M ∩ N ).

Proof. First we note that by Proposition 2.8, M + N and M ∩ N are ideals of R,


and as such, they are rings. Clearly N is a subring of M + N , while M ∩ N is a
subring of M . We claim that M ∩ N ⊲ M . In essence, this is the fact that if S is
a subring of a ring R and if J is an ideal of R that happens to be contained in S,
then J is also an ideal of S. Indeed, clearly J is a subring of S, and if j ∈ J, s ∈ S,
then s ∈ R and so sj, js ∈ J, proving that J ⊲ S.
By Theorem 3.7, M /(M ∩ N ) is a ring under the induced operations.

Consider the map


τ ∶ M + N → M /(M ∩ N )
m + n ↦ m + (M ∩ N ).
Since it is possible that m1 + n1 = m2 + n2 for some m1 ≠ m2 ∈ M and n1 ≠ n2 ∈ N ,
we must again ensure that τ is well-defined.
If m1 + n1 = m2 + n2 , then m1 − m2 = n2 − n1 ∈ M ∩ N , and so

τ (m1 + n1 ) = m1 + (M ∩ N ) = m2 + (M ∩ N ) = τ (m2 + n2 ).

Hence τ is well-defined.
Next, note that

τ ((m1 + n1 ) + (m2 + n2 )) = τ ((m1 + m2 ) + (n1 + n2 ))


= (m1 + m2 ) + (M ∩ N )
= (m1 + (M ∩ N )) + (m2 + (M ∩ N ))
= τ (m1 + n1 ) + τ (m2 + n2 ),

while (keeping in mind that N ⊲ R implies that m1 n2 , n1 m2 , n1 n2 ∈ N ),

τ ((m1 + n1 )(m2 + n2 )) = τ (m1 m2 + (n1 m2 + m1 n2 + n1 n2 ))


= (m1 m2 ) + (M ∩ N )
= (m1 + (M ∩ N ))(m2 + (M ∩ N ))
= τ (m1 + n1 ) τ (m2 + n2 ),

proving that τ is a homomorphism.


For any m ∈ M , and hence for any m + (M ∩ N ) in M /(M ∩ N ), we see that

τ (m + 0) = m + (M ∩ N ).

Thus τ is surjective.
Finally, suppose that n ∈ N . Then n = 0 + n ∈ M + N and τ (n) = τ (0 + n) =
0 + (M ∩ N ) is the zero element of M /(M ∩ N ). Thus n ∈ ker τ . Since n ∈ N was
arbitrary, N ⊆ ker τ .
Conversely, suppose that x = (m + n) ∈ ker τ . Then τ (x) = m + (M ∩ N ) =
0 + (M ∩ N ), and so m = m − 0 ∈ (M ∩ N ) ⊆ N . But n ∈ N and N is an ideal of R,
so x = m + n ∈ N . That is, ker τ ⊆ N .
Taken together, these two inclusions imply that ker τ = N .
By the First Isomorphism Theorem,
τ (M + N ) = M /(M ∩ N ) ≃ (M + N )/ ker τ = (M + N )/N .


4.4. Theorem. (The Third Isomorphism Theorem) Let M and N be


ideals in a ring R, and suppose that N ⊆ M . Then
R/M ≃ (R/N )/(M /N ).

Proof. As we argued in the previous Theorem, since N is an ideal of R, it is a


subring of R, and if n ∈ N , m ∈ M , then m ∈ R and thus mn, nm ∈ N . In other
words, N is an ideal of M .
Consider the map
τ ∶ R/N → R/M
r + N ↦ r + M.
Again - since we are defining τ in terms of a particular representative of a coset, the
first thing we must do is to verify that it is well-defined.
Suppose that r + N = s + N . Then r − s ∈ N ⊆ M , and thus
τ (r + N ) = r + M = s + M = τ (s + N ).
That is, τ is indeed well-defined.
For any r + M ∈ R/M , we have that r + N ∈ R/N , and r + M = τ (r + N ), so that
τ is surjective.
As for ker τ , note that r + N ∈ ker τ if and only if r + M = τ (r + N ) = 0 + M , that
is, if and only if r ∈ M . In other words, r + N ∈ ker τ if and only if r + N ∈ M /N .
By the First Isomorphism Theorem,
R/M ≃ (R/N )/(M /N ).

4.5. Example. Let us look at the Second Isomorphism Theorem (and its proof)
through the prism of an explicit example. It will be enlightening, to say the least.
Suppose that R = Z (already our example is special - for one thing, it is com-
mutative), M = ⟨12⟩ = 12Z, and N = ⟨20⟩ = 20Z. We leave it to the reader as an
exercise to prove that M + N = ⟨4⟩ = 4Z and M ∩ N = ⟨60⟩ = 60Z.
The Second Isomorphism Theorem says that
4Z/20Z = (M + N )/N ≃ M /(M ∩ N ) = 12Z/60Z.
So what does an element of 4Z/20Z look like? In general,
(M + N )/N = {(m + n) + N ∶ m ∈ M, n ∈ N } = {m + N ∶ m ∈ M },
since n + N = N for all n ∈ N . In this specific case, we are looking at the cosets of
the ideal 20Z of the ring 4Z, so the cosets look like
4Z/20Z = {0 + 20Z, 4 + 20Z, 8 + 20Z, 12 + 20Z, 16 + 20Z}.

Similarly,
12Z/60Z = {0 + 60Z, 12 + 60Z, 24 + 60Z, 36 + 60Z, 48 + 60Z}.
How did our proof proceed? We considered the map
τ ∶ 12Z + 20Z → 12Z/60Z
12r + 20s ↦ 12r + 60Z.
Let’s think about this for a second. Ok, long enough. We remarked above that
12Z + 20Z = 4Z. So our map τ is really a map from 4Z to 12Z/60Z. We must specify
what τ does to a multiple of 4!
When looking at an element of 4Z, we are first asking ourselves to express a
multiple of four in the form 12r + 20s. For example, 8 ∈ 4Z, so you might write
8 = 12(−1) + 20(1). Then again, I might write 8 = 12(4) + 20(−2).
As we have defined τ , you would get τ (8) = 12(−1) + 60Z = −12 + 60Z, while I
would compute τ (8) = 12(4) + 60Z = 48 + 60Z. Fortunately,
−12 + 60Z = {. . . , −72, −12, 48, 108, . . .} = 48 + 60Z,
so the answer you computed agrees with my answer. But what if someone else wrote
8 = 12r + 20s for entirely different values of r and s? Would their candidate for τ (8)
agree with ours?
That is what we are doing when we check whether or not τ is well-defined. We
are verifying that independent of how we express 8 = 12r + 20s, all of us agree on
what τ (8) should be. After all, the function τ sees only the number 8. It is you, I
and that other person who see the r’s and s’s. Our computations are different, but
if the formula for τ is going to make sense, we had better all agree on the value of
τ (8).
Of course, we can’t just check this for 8; we have to check that we all agree on
the formula for τ (k) whenever k ∈ 4Z. But if two people write k = 12r1 + 20s1 and
k = 12r2 + 20s2 respectively, then 12r1 − 12r2 = 20s2 − 20s1 . That means that
12r1 − 12r2 = 20(s2 − s1 ) ∈ ⟨20⟩ = 20Z,
and in particular, 12r1 − 12r2 is divisible by 5. Clearly it is divisible by 12 also. But
then it is divisible by lcm(5, 12) = 60, so 12r1 − 12r2 ∈ ⟨60⟩, and
τ (k) = 12r1 + ⟨60⟩ = 12r2 + ⟨60⟩.
In other words, while the two people may have written k completely differently, they
agree on what τ (k) is. Thus τ is well-defined.
The computation that shows that τ is a homomorphism is (this humble author
hopes) relatively straight-forward:
τ ((12r1 + 20s1 ) + (12r2 + 20s2 )) = τ (12(r1 + r2 ) + 20(s1 + s2 ))
= 12(r1 + r2 ) + ⟨60⟩
= (12r1 + ⟨60⟩) + (12r2 + ⟨60⟩)
= τ (12r1 + 20s1 ) + τ (12r2 + 20s2 )

and

τ ((12r1 + 20s1 )(12r2 + 20s2 )) = τ ((12r1 )(12r2 ) + (12r1 )(20s2 )


+ (20s1 )(12r2 ) + (20s1 )(20s2 ))
= τ ((12(12r1 r2 )) + 20(12r1 s2 + 12s1 r2 + 20s1 s2 ))
= 12(12r1 r2 ) + ⟨60⟩
= ((12r1 )(12r2 )) + ⟨60⟩
= (12r1 + ⟨60⟩) (12r2 + ⟨60⟩)
= τ (12r1 + 20s1 ) τ (12r2 + 20s2 ).

To show that τ is onto, we consider an arbitrary element y of 12Z/60Z, so y =
12k + 60Z, where k ∈ {0, 1, 2, 3, 4}. Then 12k = 12k + 0 ∈ 12Z + 20Z = 4Z, so τ acts on
this element and by definition,

τ (12k) = τ (12k + 0) = 12k + ⟨60⟩ = 12k + 60Z = y,

and τ is onto.
Finally, we compute ker τ . Now x = 12r + 20s ∈ ker τ if τ (x) = 0 + 60Z. But
τ (x) = 12r + 60Z, and so we need 12r ∈ 60Z, which means that r must be a multiple
of 5, i.e. r = 5b for some b ∈ Z. But then x = 12(5b) + 20s = 20(3b + s) ∈ ⟨20⟩; i.e.
x ∈ N . Conversely, if x ∈ ⟨20⟩ = 20Z, then we can write x = 0 + 20s, and by definition,
τ (x) = 0 + ⟨60⟩ = 0 + 60Z. Thus x ∈ ker τ .
We have proven that ker τ = ⟨20⟩. Finally, the First Isomorphism Theorem now
assures us that
4Z/20Z = (M + N )/N ≃ ran τ = M /(M ∩ N ) = 12Z/60Z.
And what, pray tell, is the isomorphism given by the First Isomorphism The-
orem? Let’s call it Φ. Then – keeping in mind that we can always write 4k =
(12(2) + 20(−1))k = 24k − 20k – we find that
Φ ∶ 4Z/20Z → 12Z/60Z

4k + ⟨20⟩ = (24k − 20k) + ⟨20⟩ ↦ 24k + ⟨60⟩.

In particular, (choosing k = 0, 1, 2, 3 and 4 below), we have


● Φ(0 + ⟨20⟩) = 0 + ⟨60⟩;
● Φ(4 + ⟨20⟩) = 24 + ⟨60⟩;
● Φ(8 + ⟨20⟩) = 48 + ⟨60⟩;
● Φ(12 + ⟨20⟩) = 72 + ⟨60⟩ = 12 + ⟨60⟩;
● Φ(16 + ⟨20⟩) = 96 + ⟨60⟩ = 36 + ⟨60⟩.
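The whole example can be replayed by machine, modelling each coset by a representative: the identities M + N = ⟨gcd(12, 20)⟩ and M ∩ N = ⟨lcm(12, 20)⟩, and the table of values of Φ. A sketch (the helper Phi and the representative lists are our own modelling choices):

```python
from math import gcd

M, N = 12, 20
# M + N = <gcd(12, 20)> = 4Z and M ∩ N = <lcm(12, 20)> = 60Z:
assert gcd(M, N) == 4 and M * N // gcd(M, N) == 60

# Phi : 4Z/20Z → 12Z/60Z, 4k + <20> ↦ 24k + <60>; on representatives:
Phi = lambda x: (6 * x) % 60   # 24k = 6·(4k), reduced mod 60

reps = [0, 4, 8, 12, 16]                      # the five cosets of 20Z in 4Z
images = [Phi(x) for x in reps]
assert images == [0, 24, 48, 12, 36]          # matches the list computed by hand
assert sorted(images) == [0, 12, 24, 36, 48]  # Phi is a bijection onto 12Z/60Z
```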

4.6. Examples. The details of the following example are left to the reader.
(a) Let K be an ideal of the ring R. The map
π ∶ R → R/K
r ↦ r+K
is a homomorphism, called the canonical homomorphism from R onto
R/K. Its kernel is ker π = K.
(b) For example, if K = ⟨m⟩ as an ideal of Z, then the map
π ∶ Z → Z/⟨m⟩
z ↦ z + mZ
is the canonical homomorphism.

4.7. Theorem. Let R be a unital ring.


(a) The map
φ∶ Z → R
n ↦ n1
is a homomorphism.
(b) If char(R) = κ > 0, then φ(Z) ≃ Zκ .
(c) If char(R) = 0, then φ(Z) ≃ Z.
Proof.
(a) Let n, m ∈ Z and observe that φ(n+m) = (n+m)1 = n 1+m 1 = φ(n)+φ(m)
and that φ(nm) = (nm) 1 = (n1) (m1) = φ(n)φ(m), proving that φ is a
homomorphism.
(b) Note that ker φ = ⟨κ⟩, and thus by the First Isomorphism Theorem,
combined with Example 3.10, we see that
φ(Z) ≃ Z/ ker φ = Z/⟨κ⟩ ≃ Zκ .
(c) If char(R) = 0, then for all n > 0, φ(n) = n 1 ≠ 0 and so φ(−n) = −φ(n) ≠ 0.
In other words, ker φ = {0}. But then φ is a bijective homomorphism
between Z and φ(Z), so
φ(Z) ≃ Z.

4.8. Example.
(a) Let n ∈ N and R = Tn (R). Then

{kIn ∶ k ∈ Z} = {diag(k, k, . . . , k) ∶ k ∈ Z}
is isomorphic to Z.

(b) In Z5 [i], S ∶= {0+0i, 1+0i, 2+0i, 3+0i, 4+0i} is a subring which is isomorphic
to Z5 .

4.9. Corollary. Let p ∈ N be a prime number.


(a) Suppose that F is a field of characteristic p. Then F contains a subfield
isomorphic to Zp .
(b) Suppose that F is a field of characteristic 0. Then F contains a subfield
isomorphic to Q.
Proof.
(a) By Theorem 4.7, since F is unital and satisfies char(F) = p, F contains a
subring G which is isomorphic to Zp . But p being a prime number implies
that Zp is a field, so G is a field.
(b) Again, by Theorem 4.7, F contains a subring S which is isomorphic to
Z. Since F is a field of characteristic zero, for 0 ≠ n ∈ Z, n 1 ≠ 0, and so
(n 1)−1 ∈ F. Let
G ∶= {(m 1)(n 1)−1 ∶ m, n ∈ Z, n ≠ 0}.
We leave it as an exercise for the reader to verify that the map
τ ∶ Q → G
m/n ↦ (m 1)(n 1)−1
is an isomorphism.
But Q is a field, and thus so is G. Hence F contains the subfield G
which is isomorphic to Q.

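Part (a) relies on the fact that Zp is a field when p is prime, i.e. that every nonzero class is invertible. For any particular small prime this can be checked by brute force; a sketch with p = 7 (our own illustration):

```python
p = 7  # any small prime

# In Z_p with p prime, every nonzero class has a multiplicative inverse,
# so Z_p is a field; we simply search for each inverse.
inverses = {a: next(b for b in range(1, p) if (a * b) % p == 1)
            for a in range(1, p)}

assert all((a * inverses[a]) % p == 1 for a in range(1, p))
# e.g. the inverse of 3 in Z_7 is 5, since 3 · 5 = 15 ≡ 1 (mod 7)
assert inverses[3] == 5
```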

Supplementary Examples.
S4.1. Example. Let R and S be rings. The map ζ ∶ R → S defined by ζ(r) = 0
for all r ∈ R is a homomorphism, called the zero homomorphism.
Needless to say, it’s not the most interesting homomorphism around. Having
said that, one should look at Exercise 4.9 below.

S4.2. Example. ∗∗∗ Let R[x] denote the commutative ring of polynomials in
one variable with real coefficients. Then
R[x]/⟨x^2 + 1⟩ ≃ C.
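One way to make this isomorphism concrete: modelling the coset of p(x) by its remainder a + bx upon division by x^2 + 1, coset multiplication uses x^2 ≡ −1, which is exactly the multiplication rule for complex numbers a + bi. A sketch (the pair encoding and the function mult are ours, not the notes'):

```python
# Model a coset p(x) + <x^2 + 1> by its remainder a + bx, stored as a pair (a, b).
def mult(p, q):
    (a, b), (c, d) = p, q
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd·x^2 ≡ (ac − bd) + (ad + bc)x,
    # precisely how (a + bi)(c + di) multiplies in C.
    return (a * c - b * d, a * d + b * c)

z, w = (1, 2), (3, -1)                 # stand-ins for 1 + 2i and 3 − i
assert mult(z, w) == (5, 5)            # (1 + 2i)(3 − i) = 5 + 5i
assert mult((0, 1), (0, 1)) == (-1, 0) # the coset of x squares to −1: it plays the role of i
```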

S4.3. Example. Let L ∶= {a + bi ∶ a, b ∈ Z and a − b ∈ 2Z} ⊆ Z[i]. Then L is an


ideal of Z[i] and
Z[i]/L ≃ Z2 .

S4.4. Example. Let R ∶= (C([0, 1], R), +, ⋅) be the ring of continuous real-
valued functions on [0, 1]. Let x0 ∈ [0, 1], and consider the map

εx0 ∶ C([0, 1], R) → R


f ↦ f (x0 ).

For hopefully obvious reasons, we refer to εx0 as evaluation at x0 . Then

εx0 (f + g) = (f + g)(x0 ) = f (x0 ) + g(x0 ) = εx0 (f ) + εx0 (g), and


εx0 (f g) = (f g)(x0 ) = f (x0 )g(x0 ) = εx0 (f ) εx0 (g),

proving that εx0 is a homomorphism of C([0, 1], R) into R. The kernel of this map
is the ideal
Mx0 ∶= {f ∈ C([0, 1], R) ∶ f (x0 ) = 0},
and it can be shown that if N ⊲ C([0, 1], R) is an ideal and Mx0 ⊆ N ⊆ C([0, 1], R),
then either N = Mx0 or N = C([0, 1], R). In other words, no proper ideal of C([0, 1], R)
properly contains Mx0 . We say that Mx0 is a maximal ideal of C([0, 1], R). Maximal
ideals will prove important below.

S4.5. Example. Let F be a field and {0} ≠ K ⊲ F be a non-trivial ideal of F.


Let 0 ≠ k ∈ K. Then k⁻¹ ∈ F, so 1 = k⁻¹k ∈ K. But then for all b ∈ F, b = b ⋅ 1 ∈ K, so
F ⊆ K, implying that K = F.
In other words, the only ideals of a field F are {0} and F itself. We say that F
is simple.

S4.6. Example. Let R = Z, M = 24Z and N = 15Z. Note that M and N are
ideals of Z, and
M + N = {m + n ∶ m ∈ 24Z, n ∈ 15Z}.
Note that 3 = 24 ⋅ 2 + 15 ⋅ (−3) ∈ M + N , and so 3z = 24 ⋅ (2z) + 15 ⋅ (−3z) ∈ M + N for
all z ∈ Z. Moreover, any element of M + N is divisible by 3, since each of 24 and 15
are. Hence
M + N = 3Z.
Also, we leave it to the reader to verify that M ∩ N = 120Z. By the Second Isomor-
phism Theorem,
3Z/15Z ≃ 24Z/120Z.
Hopefully this will remind the reader of the well-known equation
3/15 = 24/120,
which our result generalises.
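For ideals of Z, the sum ⟨m⟩ + ⟨n⟩ is generated by gcd(m, n) and the intersection ⟨m⟩ ∩ ⟨n⟩ by lcm(m, n), so the two computations above reduce to one-liners (a quick sketch using the standard library, our own addition):

```python
from math import gcd

m, n = 24, 15
assert gcd(m, n) == 3             # M + N = <24> + <15> = <3> = 3Z
assert m * n // gcd(m, n) == 120  # M ∩ N = <lcm(24, 15)> = 120Z
```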

S4.7. Example. Suppose that R is a unital ring and that s ∈ R is invertible.


The map
Ads ∶ R → R
x ↦ s−1 xs
is an isomorphism of R onto R. The details are left to the reader. We remind the
reader that an isomorphism of a ring R onto itself is referred to as an automor-
phism of R.
In some contexts (for example if R = Mn (C) for some n ≥ 1), such an automor-
phism is said to be inner, and one is interested in knowing whether all isomorphisms
are inner. For example, it can be shown that all linear automorphisms of Mn (C)
are inner. (i.e. automorphisms φ that satisfy φ(κX) = κ(φ(X)) for all κ ∈ C,
X ∈ Mn (C).) The reader may wish to try proving this when n = 2.

S4.8. Example. Suppose that n ≥ 1 is an integer and let u ∈ Mn (C) be a


unitary matrix. The map
Adu ∶ Mn (C) → Mn (C)
x ↦ u∗ xu

is an automorphism of Mn (C). Unlike the case where s ∈ Mn (C) is simply invertible


as above, when s = u is unitary, these automorphisms preserve adjoints: Adu (x∗ ) =
(Adu (x))∗ .
If you haven’t seen linear algebra yet, then you may feel free to ignore this
example. Or better yet, go learn linear algebra and come back to this example!

S4.9. Example. Let R = Z, and let K = 24Z, so that K ⊲ Z. The K-radical of


Z as defined in paragraph A4.3 below is
radK ∶= {r ∈ Z ∶ r^n ∈ 24Z for some n ≥ 1}.
Thus, r ∈ radK if 24 ∣ r^n for some n ≥ 1. Since 24 = 2^3 ⋅ 3, in order to have r ∈ radK ,
we need 2 ∣ r and 3 ∣ r, and thus 6 ∣ r. Moreover, if r ∈ 6Z, then r = 2 ⋅ 3 ⋅ r0 for some
r0 ∈ Z, so r^3 ∈ K, whence r ∈ radK .
We have proven that
rad24Z = 6Z.
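The computation rad24Z = 6Z can be double-checked by brute force over one period of residues; the helper in_radical below, with its small power bound, is our own illustration:

```python
K = 24

def in_radical(r, max_power=8):
    """Brute-force test of whether r^n ∈ 24Z for some small n (illustration only)."""
    return any(pow(r, n, K) == 0 for n in range(1, max_power + 1))

# The elements of rad_{24Z} among 0, 1, ..., 59 are exactly the multiples of 6.
assert [r for r in range(60) if in_radical(r)] == list(range(0, 60, 6))
```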
S4.10. Example. In Example 3.3.4 we gave an example of a field F = {0, 1, x, y}
with 4 elements. Note that for each element b ∈ F, b + b = 0, and so char(F) = 2.
The characteristic of a field is not the number of elements in the field.

Appendix
A4.1. The question of when one has to check whether a function is well-defined
or not is not as difficult as it is sometimes made out to be. Basically, it comes down
to this: one is trying to describe a rule for a set of objects by specifying what that
rule does to one particular element of that set. Without knowing in advance that
the result is independent of which element of the set you chose, how do you know
that your rule even makes sense? To say that a function is “well-defined” is the
statement that the rule that you have for describing that function makes sense.

For example: suppose that to each National Hockey League (NHL) team we
wish to assign a natural number. Here is the “rule”.

Pick an arbitrary player of that team, and the natural number


that we assign to the team will be that person’s age.
Does this “rule” make sense?
NHL hockey teams have many players. Suppose that the team has two members,
one whose age is 23, and the other whose age is 30. Which of these two numbers
was the correct one to assign to the team?
The answer is that the “rule” we formulated does not make sense. In other
words, the “function” that we described is not well-defined.

Here is another possible “rule” - this time we try to assign a colour to each NHL
hockey team:
Pick an arbitrary player on that team. The colour we assign to
that team is the colour of that player’s right eye.
Does this “rule” make sense?
Again - no! It is entirely possible that one player on the team might have brown
eyes, while another member might have green eyes. The function we have described
to assign a colour to the country is not well-defined.

Undaunted, let’s try one more time. Let’s assign a number to an NHL team
according to the following “rule”.
Pick an arbitrary player of that team. The number we assign
to that team is the number of teammates that person has (not
counting himself).
This is an interesting one. If the team has 23 players, then – and this is the crucial
element – regardless of which player on the team you pick, that player will have
22 teammates. While the set of teammates of player A will differ from the set of
teammates of player B (for example, our definition states that player B is a teammate
of player A, but not of himself), nevertheless, the number of teammates does not
depend upon the particular player we chose. This way of assigning a number to a

team by choosing an individual player from that team and specifying a rule in terms
of something about that particular player makes sense. The function

f ∶ NHL teams → N
team Y ↦ number of teammates of a player x on team Y

is well-defined.

A4.2. Let’s see how this works in (mathematical) practice.


Suppose that we wish to determine whether or not the map

φ∶ Z/⟨12⟩ → Z/⟨4⟩
n + ⟨12⟩ ↦ n + ⟨4⟩

is well-defined. Again – in this case, note that 2 + ⟨12⟩ = 26 + ⟨12⟩. The function φ
does not see “2” nor “26”. It only sees the coset

{. . . , −22, −10, 2, 14, 26, 38, . . .}.

So how does φ work? It says: pick an arbitrary element from this set, say n, and
then

φ({. . . , −22, −10, 2, 14, 26, 38, . . .}) = {. . . , n − 4, n, n + 4, n + 8, n + 12, . . .}.

Here’s the beauty of it: if we had picked n = 2, then the result would be

φ({. . . , −22, −10, 2, 14, 26, 38, . . .}) = φ(2 + ⟨12⟩)


= {. . . , 2 − 4, 2, 2 + 4, 2 + 8, 2 + 12, . . .}
= {. . . , −2, 2, 6, 10, 14, . . .}.

If we had picked n = 26, we would have obtained

φ({. . . , −22, −10, 2, 14, 26, 38, . . .}) = φ(26 + ⟨12⟩)


= {. . . , 26 − 4, 26, 26 + 4, 26 + 8, . . .}
= {. . . , 22, 26, 30, 34, . . .}.

But
{. . . , 22, 26, 30, 34, . . .} = {. . . , −2, 2, 6, 10, 14, . . .}!!!
In fact, no matter which two representatives n1 and n2 of n + ⟨12⟩ we choose,
the fact that n1 + ⟨12⟩ = n2 + ⟨12⟩ implies that n1 − n2 ∈ ⟨12⟩ ⊆ ⟨4⟩, and thus n1 + ⟨4⟩ =
n2 + ⟨4⟩.
If you understand this example, you should understand “well-definedness”, and
you are on the path to a better and healthier mathematical lifestyle.
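The same verification can be run by machine: congruence mod 12 forces congruence mod 4 (so the rule into Z/⟨4⟩ makes sense), whereas the analogous rule into, say, Z/⟨5⟩ would not be well-defined. A sketch (the modulus-5 counterexample is our own addition):

```python
# φ : Z/<12> → Z/<4>, n + <12> ↦ n + <4>, is well-defined because <12> ⊆ <4>:
for n1 in range(-30, 30):
    for n2 in range(-30, 30):
        if (n1 - n2) % 12 == 0:        # same coset of <12> ...
            assert (n1 - n2) % 4 == 0  # ... forces the same coset of <4>

# The analogous "rule" into Z/<5> is NOT well-defined: 2 and 26 represent the
# same coset of <12>, but different cosets of <5>.
assert (2 - 26) % 12 == 0
assert (2 - 26) % 5 != 0
```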

A4.2. The reader might be asking himself/herself: “why is it that if W is any


subspace of a vector space V, then V/W forms a vector space under the usual
operations on cosets, but in the case of a ring R, we must quotient by an ideal
as opposed to an arbitrary subring S?” Indeed, I would be very happy to learn
that the reader has now adopted the strategy I suggested of asking himself/herself
(mathematical) questions while reading the notes.
The reason is that in the vector space setting, all subspaces occur as the kernels
of (vector space) homomorphisms! So it really comes down to the idea that you
always want to quotient by kernels of morphisms; it’s just that the question of
which substructures are kernels of morphisms depends upon which mathematical
objects you are looking at.
To see that any subspace W of a vector space V over a field F is the kernel of a
vector space homomorphism (without first appealing to the fact that we can quotient V
by W), note that we can find a basis {wλ }λ∈Λ for W, which we can then extend to a
basis B ∶= {wλ }λ∈Λ ∪ {yγ }γ∈Γ for V. (The proof of this depends upon Zorn’s Lemma,
which appears in Appendix A of these course notes.)
Recall that we can define a linear map (i.e. a vector space homomorphism)
φ ∶ V → V simply by specifying what it does to a basis (and extending by linearity!).
To that end, set
φ(wλ ) = 0 for all λ ∈ Λ
φ(yγ ) = yγ for all γ ∈ Γ.

Clearly wλ ∈ ker φ, and so W = span{wλ }λ∈Λ ⊆ ker φ.


Conversely, suppose that v ∈ ker φ. Since v ∈ V and B is a basis for V, we can
find α1 , α2 , . . . , αm , β1 , β2 , . . . , βn ∈ F and λ1 , λ2 , . . . , λm ∈ Λ, γ1 , γ2 , . . . , γn ∈ Γ such
that
v = ∑_{i=1}^{m} αi wλi + ∑_{j=1}^{n} βj yγj .

Thus
0 = φ(v) = ∑_{i=1}^{m} αi φ(wλi ) + ∑_{j=1}^{n} βj φ(yγj )
= ∑_{i=1}^{m} αi 0 + ∑_{j=1}^{n} βj yγj
= ∑_{j=1}^{n} βj yγj .

Since {yγ }γ∈Γ is linearly independent, this means that βj = 0 for all 1 ≤ j ≤ n, and
thus v = ∑_{i=1}^{m} αi wλi ∈ W. That is, ker φ ⊆ W.
Together, these two containments prove that W = ker φ.
So - the issue is that it is “easier” to be the kernel of a vector space homomor-
phism – any subspace of that vector space will do – than it is to be the kernel of a

ring homomorphism – where you must be an ideal. But in both cases, kernels of
homomorphisms are the key! Cray cray.
A4.3. Let R be a commutative ring, and let K ⊲ R be an ideal of R. The
K-radical of R is the set
radK ∶= {r ∈ R ∶ there exists n ∈ N such that r^n ∈ K}.

Caveat. Unfortunately, there are several different notions within rings that are
referred to as radicals of the ring. The one above is but one, and it is not as
important as the Jacobson radical, which we shall see in the Assignments.

A4.4. Proposition. Let R be a commutative ring and K ⊲ R be an ideal of R. Then radK is


an ideal of R which contains K.

Proof. As always, we would like to apply the Ideal Test to verify that this is the
case.
That K ⊆ radK is easy to verify.
Let r, s ∈ radK and choose n, m ∈ N such that r^n , s^m ∈ K. Consider

(r − s)^{m+n} = r^{m+n} − \binom{m+n}{1} r^{m+n−1} s + \binom{m+n}{2} r^{m+n−2} s^2 − ⋯
+ (−1)^{m+n−1} \binom{m+n}{m+n−1} r s^{m+n−1} + (−1)^{m+n} s^{m+n} .

For 0 ≤ j ≤ m + n, either j ≤ m, in which case r^{m+n−j} ∈ K (since m + n − j ≥ n), or
m + 1 ≤ j ≤ m + n, in which case s^j ∈ K.
thus the sum itself is an element of K.
That is, r − s ∈ radK .
Let r ∈ radK and fix n ∈ N such that r^n ∈ K. Let t ∈ R. Then, using the
commutativity of R,
(tr)^n = t^n r^n ∈ K,
since r^n ∈ K and K ⊲ R. Hence rt = tr ∈ radK .
By the Ideal Test, radK is an ideal of R.
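To make the definition concrete, here is a small brute-force computation (our own illustration, not part of the notes): in the finite commutative ring Z12 with K = ⟨4⟩, the K-radical turns out to be ⟨2⟩, an ideal containing K, just as the proposition predicts.

```python
# Brute-force K-radical in the finite commutative ring Z_12, with K = <4>.
# rad_K = {r : r^m in K for some m >= 1}. (Illustrative sketch only; the
# helper names are our own, not notation from the notes.)

def principal_ideal(k, n):
    """The principal ideal <k> of Z_n."""
    return {(k * t) % n for t in range(n)}

def radical(K, n):
    """All r in Z_n some power of which lands in K."""
    rad = set()
    for r in range(n):
        power, seen = r % n, set()
        while power not in seen:          # powers of r eventually cycle
            if power in K:
                rad.add(r)
                break
            seen.add(power)
            power = (power * r) % n
    return rad

n = 12
K = principal_ideal(4, n)                 # K = {0, 4, 8}
print(sorted(radical(K, n)))              # [0, 2, 4, 6, 8, 10], i.e. <2>
```

Note that the output is itself an ideal of Z12 containing K, in line with the proposition above.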


Exercises for Chapter 4

Exercise 4.1.
(a) Find all subrings of Z.
(b) Find all ideals of Z.
(c) Let m, n ∈ Z be relatively prime. (That is, gcd(m, n) = 1.) Show that
Zm ⊕ Zn is isomorphic to Zmn .

Exercise 4.2.
(a) Find a subring of M3 (C) which is not an ideal.
(b) Find all ideals of M3 (C).
(c) Can you generalise this result to Mn (C)?

Exercise 4.3.
Let R = C([0, 1], R) and let K ∶= {f ∈ C([0, 1], R) ∶ f (x) = 0, x ∈ [1/3, 7/8]}. Prove or
disprove that K is an ideal of R.

For those who enjoy a serious challenge: Let 0 ≠ L ≠ R be a non-trivial, proper


ideal of R = C([0, 1], R). Show that there exists x0 ∈ [0, 1] such that f (x0 ) = 0 for
all f ∈ L.

Exercise 4.4.
A proper ideal M of a ring R (i.e. an ideal that is not R itself) is said to be
maximal if whenever N ⊲ R and M is a proper subset of N , we have that N = R.
That is, M is not properly contained in any proper ideal of R.
(a) If the set K of R = C([0, 1], R) in Exercise 4.3 is in fact an ideal of R, is it
maximal?
(b) Find a maximal ideal of R.

Exercise 4.5.
Let R be a ring and

Dn (R) = {diag(d1 , d2 , . . . , dn ) ∶ dj ∈ R, 1 ≤ j ≤ n}
denote the ring of all diagonal matrices in Mn (R).
Find all ideals of Dn (R).

Exercise 4.6.
Consider the ideals J = ⟨m⟩ and K = ⟨n⟩ of Z. Find J + K, J ∩ K, and JK.

Exercise 4.7. Suppose that S is a subring of a ring R.


(a) Suppose that M ⊲ R is an ideal of R such that M ⊆ S. Is M an ideal of S?
Prove that it is, or give a counterexample to show that it need not be.
(b) Suppose that N ⊲ S. Is N an ideal of R? Prove that it is, or give a
counterexample to show that it need not be.

Exercise 4.8.
Let L be a subring of a ring R. Let r1 , r2 ∈ R. Prove that either r1 + L = r2 + L,
or (r1 + L) ∩ (r2 + L) = ∅.
In other words, the cosets of L partition the ring R.
Another way to understand this is the following. Recall that a relation ∼ on a
set X is said to be an equivalence relation if
● x ∼ x for all x ∈ X;
● if x ∼ y, then y ∼ x; and
● if x ∼ y and y ∼ z, then x ∼ z.
The equivalence class of x ∈ X under this relation is denoted by [x] ∶= {y ∈ X ∶
x ∼ y}.
(a) Prove that in any set X equipped with any equivalence relation ∼, the
equivalence classes partition the set X. That is, X = ∪x∈X [x], and for
x ≠ y ∈ X, either [x] = [y] or [x] ∩ [y] = ∅.
(b) Define a relation ∼ on R via: r ∼ s if r − s ∈ L. Prove that ∼ is an equiva-
lence relation on R.

Exercise 4.9.
Let m, n ∈ N with m < n. Describe all ring homomorphisms from Mn (C) to
Mm (C).

Exercise 4.10.
Let (G, +) be an abelian group. A map φ ∶ G → G is said to be additive if
φ(g + h) = φ(g) + φ(h) for all g, h ∈ G. (In essence, this says that φ is a group homo-
morphism from G into itself. A group homomorphism (resp. ring homomorphism)
from a group (resp. a ring) into itself is referred to as a group endomorphism
(resp. a ring endomorphism).)
We denote by End (G, +) the set of all additive maps on G.
(a) Prove that End (G, +) is a ring under the operations
(φ + ψ)(g) ∶= φ(g) + ψ(g),
and
(φ ∗ ψ)(g) ∶= φ ○ ψ(g) = φ(ψ(g))
for all g ∈ G.
Given a ring R and an element r ∈ R, define the map Lr ∶ R → R by Lr (x) ∶= rx.
In other words, Lr is left multiplication by r.

(b) Let (R, +, ⋅) be a ring and recall that (R, +) is an abelian group. Prove that
the map
λ ∶ (R, +, ⋅) → (End (R, +), +, ∗)
r → Lr
is an injective ring homomorphism.
In some contexts, this is referred to as the left regular representation of the ring
R. Where was commutativity of (R, +) used?

Exercise 4.11. In the same way that we worked our way through the Second Iso-
morphism Theorem in Example 4.5, work your way through the Third Isomorphism
Theorem for the special case where R = Z, M = ⟨10⟩, and N = ⟨50⟩.
CHAPTER 5

Prime ideals, maximal ideals, and fields of quotients

We owe a lot to Thomas Edison – if it weren’t for him, we’d be


watching television by candlelight.
Milton Berle

1. Prime and maximal ideals


1.1. One of the most important rings – if not the most important – is the ring of
integers (Z, +, ⋅). In studying more general rings, we are often interested in how
much of the structure of Z carries over to a more general setting.
When studying the natural numbers, the prime numbers occupy a special and
central role. They have two extremely important properties that are closely related.
For one thing, we know that every natural number greater than or equal to 2 can be
factored as a product of primes (in a unique way except for the order of the factors).
Secondly, we know that if we try to factor a prime number p as a product p = ab
where a, b ∈ N, then either a = 1 (and b = p), or b = 1 (and a = p). If we extend our
notion of “prime” to integers by agreeing that an integer q is prime if ∣q∣ is a prime
natural number (so that −7 is now prime, for example), these two results become
● every integer that is neither zero nor invertible (i.e. every integer of absolute
value at least equal to 2) can be factored in an essentially unique way as
a product of primes; that is, it is unique if we don’t care about the order
of the terms, and we don’t worry about multiplying (an even number of)
factors by −1; and
● if q ∈ Z is a prime number and q ∣ ab for some a, b ∈ Z, then q ∣ a or q ∣ b.
The first item refers to the fact that we are agreeing that the technically different
factorisations 315 = 3 ⋅ 3 ⋅ 5 ⋅ 7 = (−7) ⋅ (−3) ⋅ 3 ⋅ 5 should be considered basically the same
because the order is irrelevant (Z is commutative, after all); also −3 = 3 ⋅ −1 and
−7 = 7 ⋅ −1, and the two invertible elements −1 and −1 are “cancelling each other out
anyway”. Notice that in Z, 1 and −1 are the only invertible elements. This will play
a role later on when looking at factorisation in more general rings.
The “long game” we are chasing is to try to solve polynomial equations f (x) = 0
where f (x) ∈ F[x] is a non-trivial polynomial with coefficients in a field. If we
could factor f (x) = am (x − α1 )(x − α2 )⋯(x − αm ) for some αk ∈ F, 1 ≤ k ≤ m,

then we could spot the solutions to the equation immediately as {α1 , α2 , . . . , αm }.


(Why are these the only solutions?) As such, part of our strategy for attacking this
problem is to develop a theory of factorisation of polynomials with coefficients in
a field. Evolution teaches us that learning from others and from past experience is
not the worst way to move forward. We know how to factor integers as a product
of primes. Is there some way to extend the notion of primeness to rings other than
the integers? In particular, would this make sense in the polynomial ring F[x]?
Secondly, the first property, when applied to a prime number, says that if p ∈ Z is
prime and we try to factor it as p = ab for some integers a and b, then one of a and
b must have absolute value one, and as such, it must be invertible. We can think of
this as saying that prime numbers cannot be reduced into two “smaller” components
(where I am purposefully avoiding trying to ascribe a precise mathematical meaning
to “reduced ” and “smaller ” just now). In fact, this characterises prime numbers,
as does the second property listed above. That is, if an integer has one of the two
properties, it automatically has the other. If we manage to extend both of these
properties to more general rings, will they always describe the same set? Ohhh,
the mystery deepens and the tension is palpable!
In the next three Chapters, we shall develop abstract notions of primeness,
irreducibility and of factorisation. We shall actually define these notions in more
general rings than just polynomial rings over a field, and we shall seek to understand
what is so special about the case of F[x].
Also, in this Chapter, we shall rediscover how to construct the rational numbers
from the integers, and we shall see how to extend this construction to integral
domains to produce fields of quotients.

We remind the reader that a subset B of a set A is said to be a proper subset


if B ≠ A. Thus a subset K of a ring R is a proper ideal if it is an ideal of R and
K ≠ R. Note that both prime and maximal ideals of a ring R are – by definition –
proper ideals (when they exist).

1.2. Definition. Let R be a ring. A proper ideal K of R is said to be prime


if r, s ∈ R and rs ∈ K implies that either r ∈ K or s ∈ K.
A proper ideal M of R is said to be maximal if whenever an ideal K ⊲ R satisfies
M ⊆ K ⊆ R, either K = M , or K = R.

1.3. The notion of a maximal element in a partially ordered set is not the same
as the notion of a maximum element. A ring may have many maximal ideals (or
none - depending upon the ring). The key observation is that a maximal element
need only be bigger than the elements to which it is comparable – a maximum
element must first be comparable to everything in the partially ordered set, and
must then be bigger than everything. We discuss the definition at greater length in
the Appendix to this section.

1.4. Examples.
(a) Recall that Z is a principal ideal domain, that is, every ideal K of Z is of
the form K = ⟨k⟩ for some k ∈ Z. In order for K to be a proper ideal, we
cannot have k = 1 nor k = −1. Note also that ⟨k⟩ = ⟨−k⟩, showing that there
is no loss of generality in assuming that k ≥ 0.
● Suppose first that k = 0. If r, s ∈ Z and rs ∈ ⟨0⟩ = {0}, then either
r = 0 ∈ ⟨0⟩ or s = 0 ∈ ⟨0⟩, and so ⟨0⟩ is a prime ideal.
● Suppose next that k = p ∈ N is a prime number. If r, s ∈ Z and rs ∈ ⟨p⟩,
then rs = p m for some integer m. Since p divides the right-hand side
of the equation and is prime, it must divide the left-hand side of the
equation, and indeed, it must divide either r or s. By relabelling if
necessary, we may suppose that p divides r, and thus r ∈ ⟨p⟩. This
shows that ⟨p⟩ is a prime ideal.
● Suppose that k = k1 k2 for some 2 ≤ k1 , k2 ∈ N. That is, suppose that k
is not prime. Then with r = k1 , s = k2 , we see that rs = k ∈ ⟨k⟩, but
r, s ∈/ ⟨k⟩.
Together, these show that – other than ⟨0⟩ – an ideal K = ⟨k⟩ is a prime ideal
if and only if k is a prime number (or the negative of a prime number).
(b) Now let us determine the maximal ideals of Z. Again, we have seen that
all ideals of Z are of the form ⟨k⟩ for some 0 ≤ k ∈ Z, so these are the
only ones we need consider, and we can ignore the case where k = 1, since
K = ⟨1⟩ = ⟨−1⟩ = Z is not a proper ideal of Z.
Suppose first that k = 0. Then ⟨0⟩ is contained in any ideal – for
example, ⟨0⟩ ⊆ ⟨2⟩ ≠ Z, showing that ⟨0⟩ is not maximal.
Similarly, if k > 0 is a composite number, say k = k1 k2 where 2 ≤ k1 , k2 <
k, then
⟨k⟩ ⊆ ⟨k1 ⟩,
and these two ideals are distinct since k1 ∈ ⟨k1 ⟩ but k1 ∈/ ⟨k⟩. Hence ⟨k⟩ is
not maximal when k is composite.
Finally, suppose that k is prime. If ⟨k⟩ ⊆ ⟨m⟩ for some m ∈ N, then
k ∈ ⟨m⟩ implies that m divides k. But k prime then implies that either
m = 1 or m = k. If m = 1, then ⟨m⟩ = Z is not a proper ideal, while m = k
implies that ⟨m⟩ = ⟨k⟩.
Thus k prime implies that ⟨k⟩ is not properly contained in a proper
ideal of Z, whence ⟨k⟩ is maximal.
We conclude that (except for the zero ideal which is prime but not
maximal), the maximal and prime ideals of Z coincide.
(c) Let 2 ≤ n ∈ N. Recall from the Assignments that Mn (R) is simple. That is,
it has no ideals other than the trivial ideals {0} and Mn (R) itself.
The matrix units E11 and E22 lie in Mn (R), and E11 E22 = 0, although
neither of E11 nor E22 lies in {0}. This shows that {0} is not a prime ideal
in Mn (R).
It is, however, maximal. (Why?)

(d) Let 2 ≤ n ∈ N. Let 1 ≤ k ≤ n be an integer, and set


Mk ∶= {T = [ti,j ] ∈ Tn (C) ∶ tk,k = 0} .
We leave it to the reader to show that Mk is indeed an ideal of Tn (C).
To see that it is maximal, consider the following. If N ⊲ Tn (C) is any ideal
which properly contains Mk , then there exists R ∈ N such that R ∈/ Mk .
Thus rk,k ≠ 0. Let A = [ai,j ] ∈ Tn (C) be arbitrary, and let T = [ti,j ] ∈
Tn (C), where ti,j = ai,j if (i, j) ≠ (k, k) and tk,k = 0. Then T ∈ Mk and
A = ak,k r_{k,k}^{−1} Ek,k REk,k + T .
That is, A ∈ ⟨R, Mk ⟩ ⊆ N . Since A ∈ Tn (C) was arbitrary, we conclude
that Tn (C) ⊆ N . That is, there does not exist a proper ideal of Tn (C)
which properly contains Mk , and thus Mk is maximal.
Are there any other maximal ideals in Tn (C)?
(e) Note that each maximal ideal Mk of Tn (C), 1 ≤ k ≤ n listed in (d) above
is also a prime ideal. Suppose that R = [ri,j ] and S = [si,j ] ∈ Tn (C) and
A = [ai,j ] ∶= RS ∈ Mk . Then ak,k = 0, but note that ak,k = rk,k sk,k , so that
either rk,k = 0, in which case R ∈ Mk , or sk,k = 0, in which case S ∈ Mk .
That is, Mk is a prime ideal of Tn (C).
Are there any other prime ideals in Tn (C)?
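For those who like to let a computer do the bookkeeping, the dichotomy in parts (a) and (b) can be spot-checked by brute force. The sketch below (our own illustration; the helper names are not from the notes) searches a finite window of integers for a witness pair r, s with rs ∈ ⟨k⟩ but r, s ∉ ⟨k⟩:

```python
# Finite spot-check of Example 1.4 (a): in Z, the ideal <k> is prime exactly
# when k = 0 or |k| is a prime number. A witness against primality is a pair
# (r, s) with rs in <k> but neither r nor s in <k>. (Sketch only: a finite
# search can fail to find a witness without proving primality.)

def in_ideal(x, k):
    """Is x an element of the principal ideal <k> of Z?"""
    return x == 0 if k == 0 else x % abs(k) == 0

def witness_against_prime(k, window=30):
    """Search [1, window)^2 for r, s with rs in <k>, r and s not in <k>."""
    for r in range(1, window):
        for s in range(1, window):
            if in_ideal(r * s, k) and not in_ideal(r, k) and not in_ideal(s, k):
                return (r, s)
    return None

print(witness_against_prime(5))   # None: <5> is prime
print(witness_against_prime(0))   # None: <0> is prime, since Z is a domain
print(witness_against_prime(6))   # (2, 3): 2 * 3 = 6 in <6>, but 2, 3 are not
```

Of course, a finite search can refute primality but never prove it; the arguments above are still doing the real work.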

1.5. Remark. Hopefully we have not become so tired that we have stopped
constantly asking ourselves new questions while reading. For example, in Exam-
ple 1.4(a), what was so special about Z that made ⟨0⟩ a prime ideal? Unlike the
case of Tn (C) – where ⟨0⟩ is not a prime ideal – Z has the property that if r, s ∈ Z
and rs = 0, then r = 0 or s = 0.
This should look very familiar. It is one of the defining properties of an in-
tegral domain. Thus, in every integral domain, ⟨0⟩ is a prime ideal. Of course,
integral domains are commutative (and unital) by definition. Can we think of a
non-commutative ring R where ⟨0⟩ is a prime ideal?
Again - this is the kind of question that should come to mind while reading these
notes, and you should always be thinking while reading any mathematical text.

1.6. Example. The ideal K ∶= ⟨x^3 + 1⟩ ⊆ Z3 [x] is not prime.

To see this, note that p(x) = x^2 + 2x + 1 and q(x) = x + 1 lie in Z3 [x] and
p(x)q(x) = x^3 + 2x^2 + x + x^2 + 2x + 1 = x^3 + 3x^2 + 3x + 1 = x^3 + 1,
though neither p(x) nor q(x) lies in K. (This is Exercise 5.2.)
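The arithmetic in this example is easy to check by machine. Here is a small sketch (ours; polynomials are stored as coefficient lists, lowest degree first) that multiplies the two polynomials with coefficients reduced mod 3:

```python
# Machine check of Example 1.6: (x^2 + 2x + 1)(x + 1) = x^3 + 1 in Z_3[x].
# A polynomial is a list of coefficients [c0, c1, c2, ...], lowest degree first.

def poly_mul_mod(p, q, m):
    """Multiply polynomials p and q, reducing coefficients mod m."""
    prod = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] = (prod[i + j] + a * b) % m
    return prod

p = [1, 2, 1]                    # x^2 + 2x + 1
q = [1, 1]                       # x + 1
print(poly_mul_mod(p, q, 3))     # [1, 0, 0, 1], i.e. x^3 + 1 in Z_3[x]
```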

1.7. Theorem. Let R be a commutative, unital ring. Suppose that L ⊲ R. The


following statements are equivalent.
(a) R/L is an integral domain.
(b) L is a prime ideal.
Proof.

(a) implies (b). First we argue that L is a proper ideal of R. Indeed, since
L is an ideal of R, R/L is a ring. The hypothesis that R/L is an integral
domain means that it is unital, and it is not hard to see that 1 + L must
be the multiplicative identity of R/L. Since 1 + L ≠ 0 + L (in a unital ring,
the multiplicative and additive identities must be different), we see that
1 − 0 = 1 ∈/ L, so L ≠ R.
Suppose that r, s ∈ R and rs ∈ L. Then r + L, s + L ∈ R/L and (r + L)(s +
L) = rs + L = 0 + L. Since R/L is an integral domain (by hypothesis), either
r + L = 0 + L, in which case r ∈ L, or s + L = 0 + L, in which case s ∈ L. This
shows that L is prime.
(b) implies (a). We have seen that R/L is a ring, and (r + L)(s + L) =
(rs) + L = (sr) + L = (s + L)(r + L), where the middle equality holds because
R is commutative. Moreover, 1+L is easily seen to act as the multiplicative
identity of R/L, so that the latter is unital. Thus we need only show that
(r + L)(s + L) = 0 + L implies that either r + L = 0 or s + L = 0. But
(r + L)(s + L) = 0 + L implies that rs + L = 0 + L, which happens if and only
if rs ∈ L. Since L is assumed to be prime, this shows that either r ∈ L (or
equivalently r + L = 0 + L) or s ∈ L (or equivalently s + L = 0 + L). Thus R/L
is an integral domain.

1.8. Theorem. Let R be a commutative, unital ring, and let M ⊲ R be an ideal


of R. The following statements are equivalent.
(a) M is a maximal ideal of R.
(b) R/M is a field.
Proof.
(a) implies (b). Suppose that M is a maximal ideal of R. Arguing as in
part (b) of Theorem 1.7 above, we see that R/M is a commutative ring
with multiplicative identity 1 + M . It suffices to prove that if r0 ∈ R and
r0 + M ≠ 0 + M , then r0 + M is invertible in R/M .
To that end, note that if r0 + M ≠ 0 + M , then r0 ∈/ M . Consider the set
N ∶= {sr0 + m ∶ m ∈ M, s ∈ R}. If t ∈ R, and sr0 + m ∈ N , then
t(sr0 + m) = (ts)r0 + tm ∈ N,
since tm ∈ M because M ⊲ R. Since R is commutative,
(sr0 + m)t ∈ N.
Also,
(s1 r0 + m1 ) − (s2 r0 + m2 ) = (s1 − s2 )r0 + (m1 − m2 ) ∈ N,
since s1 − s2 ∈ R is clear, while m1 − m2 ∈ M since M ⊲ R is an ideal. By
the Ideal Test, N ⊲ R. That M ⊆ N is clear, and since 1r0 + 0 = r0 ∈ N but
r0 ∈/ M , we see that M is a proper subset of N .

The hypothesis that M is a maximal ideal of R and N is an ideal of


R, combined with the fact that M ⊊ N ⊆ R now implies that N = R, and
thus 1 ∈ N . We can therefore write 1 = sr0 + m for some s ∈ R, m ∈ M , and
observe that

(s + M )(r0 + M ) = (sr0 ) + M
= (sr0 + m) + M
=1+M
= (r0 + M )(s + M ),

proving that r0 + M is invertible, as required.


(b) implies (a). We prove the contrapositive: suppose that M is not maximal. Then, either M is not a proper ideal
of R, or there exists an ideal N ⊲ R satisfying

M ⊂ N ⊂ R.

(Here we remind the reader that ⊂ means “proper subset”, as opposed to


⊆, which includes the possibility of equality.) If M = R, then

R/M = R/R = {0 + R}

is just the zero ring. In particular, it is not unital, and thus R/M is certainly
not a field. Suppose therefore that there exists N ⊲ R as above.
Since N ≠ R, N does not contain any invertible elements, and in par-
ticular, 1 ∈/ N .
Let n ∈ N ∖ M . Then n + M ≠ 0 + M , and for all r ∈ R,

(r + M )(n + M ) = (rn) + M = (n + M )(r + M ).

But rn ∈ N , since N is an ideal. If (rn) + M = 1 + M , then rn − 1 ∈ M , say


m = rn − 1, and thus 1 = rn − m ∈ N , since rn ∈ N and m ∈ M ⊂ N . This
contradicts the above paragraph.
This shows that the non-zero element n + M of R/M is not invertible, and thus R/M
is not a field.

An alternative argument that proves the same thing consists of observ-


ing that N /M is a proper ideal of R/M , since 1 + M ∈/ N /M . (After all,
if 1 + M ∈ N /M , then 1 + M = n + M and so 1 − n ∈ M ⊆ N , i.e. 1 ∈ N ,
contradicting the fact that N is a proper ideal of R.)
The presence of multiple ways of proving this result brings to mind
certain allusions to taxidermy, but this is neither the time nor the place to
discuss this.


1.9. Corollary. Let R be a commutative, unital ring. If L ⊲ R is maximal,


then L is prime.
Proof. Suppose that L ⊲ R is a maximal ideal. By Theorem 1.8, R/L is a field.
But every field is an integral domain, so by Theorem 1.7, L is prime.

1.10. Example. The converse to this is false. Recall that {0} is a prime ideal
in any integral domain, but it is not maximal in the integral domain Z, for example.
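Theorems 1.7 and 1.8 and Corollary 1.9 can all be watched in action in the quotients Z/⟨n⟩ ≃ Zn. The following brute-force sketch (our own helper names) classifies small values of n:

```python
# Watch Theorems 1.7 and 1.8 in action in Z/<n>, realised as Z_n.

def is_field(n):
    """Is every non-zero class of Z_n invertible?"""
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))

def has_zero_divisors(n):
    """Do there exist non-zero a, b in Z_n with ab = 0?"""
    return any((a * b) % n == 0 for a in range(1, n) for b in range(1, n))

for n in [2, 3, 4, 5, 6, 7, 12]:
    print(n, "field" if is_field(n) else "zero divisors" if has_zero_divisors(n) else "domain, not field")
# n prime:     Z_n is a field        => <n> is maximal (hence also prime)
# n composite: Z_n has zero divisors => <n> is not even prime
```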

2. From integral domains to fields


2.1. Our goal in this section is to use integral domains to construct fields. Our
inspiration will be the construction of the field of rational numbers from the set
of integers. Let us review (or learn, if we’ve never seen this before) how this is
done. If the student is not familiar with the notion of equivalence relations and
equivalence classes on a non-empty set, now might be the time to look at the
Appendix to this Chapter. Perhaps we should rephrase that. Now is the time to
look at the Appendix to this Chapter.
2.2. We know that the field of rational numbers Q is obtained from the integral
domain Z by considering quotients of the form
p/q , where p, q ∈ Z, q ≠ 0.
What we would like to do now is to show that starting with any integral domain D,
we can always use a similar construction to obtain a field F, which we shall refer to
as the field of quotients of D.
What we shall have to do is to examine more closely our construction of Q from
Z, and to figure out how to adapt it to our situation.

2.3. One of the issues we face in constructing the rational numbers from the
integers as above is the “non-uniqueness” of the representation of a rational number.
As we know all too well,
2/7 = 4/14 = (−6)/(−21),
and indeed, we can find infinitely many ways of describing the same rational number.
If we want to say that they all represent the same rational number, one approach
might be to set up an equivalence relation as follows:
Let X ∶= Z × (Z ∖ {0}) ∶= {(a, b) ∶ a, b ∈ Z, b ≠ 0}. We define a relation ∼ on X via
(a, b) ∼ (c, d)
if ad = bc.
Let us verify that this is indeed an equivalence relation.

● (a, b) ∼ (a, b) since ab = ba.


● If (a, b) ∼ (c, d), then ad = bc, so (c, d) ∼ (a, b).
● If (a, b) ∼ (c, d) and (c, d) ∼ (x, y), then ad = bc and cy = dx. Thus (keeping
in mind that we are working in Z!),
d(ay) = (ad)y = (bc)y = b(cy) = b(dx) = d(bx).
Since d ≠ 0, we conclude that ay = bx, or equivalently that (a, b) ∼ (x, y).
Together, these three statements show that ∼ is an equivalence relation on X. Let
us denote by [(a, b)] the equivalence class of (a, b). That is,
[(a, b)] ∶= {(c, d) ∈ X ∶ (c, d) ∼ (a, b)}.
We denote by X/ ∼ the set of all equivalences classes of elements of X; that is,
X/ ∼∶= {[(a, b)] ∶ (a, b) ∈ X}.
In this way, we may identify the rational number a/b ∈ Q with the equivalence class
[(a, b)] ∈ X/ ∼.
Notice, incidentally, that the proof that ∼ is an equivalence relation on X only
used the fact that the cancellation law holds in Z (see the third item above). But
in fact, the cancellation law holds in any integral domain, so that we obtain the
following result using exactly the same proof as that above.

2.4. Proposition. Let D be an integral domain. The relation ∼ defined on


X ∶= D × (D ∖ {0}) by
(a, b) ∼ (c, d) if ad = bc
is an equivalence relation on X.

Let us also remember how we add and multiply rational numbers:


a/b + c/d = (ad + bc)/(bd)  and  (a/b) ⋅ (c/d) = (ac)/(bd).
If we translate the notation to the ordered pair notation, the next result becomes
less surprising.

Proposition 2.4 will be the key to the following theorem.


2.5. Theorem. Let D be an integral domain. Let ∼ be the equivalence relation
defined on X ∶= D × (D ∖ {0}) in Proposition 2.4 above, and let F ∶= X/ ∼= {[(a, b)] ∶
(a, b) ∈ X}. Then F is a field using the operations:
[(a, b)] + [(c, d)] ∶= [(ad + bc, bd)] and [(a, b)][(c, d)] ∶= [(ac, bd)].

Proof. Notice that we are dealing with equivalence classes. Already, this should
trigger a Pavlovian effect, and if nothing else, we should start salivating every time
a bell rings.
A fortiori, we are defining the operations of addition and multiplication using
representatives of these equivalence classes. This is the mathematical equivalent
of someone hitting your leg under the kneecap with a rubber mallet, and your leg
should pop forward and trigger the question: are these operations well-defined ? Let’s
check!
Suppose that [(a1 , b1 )] = [(a, b)] and that [(c1 , d1 )] = [(c, d)] in F. Then
(a1 , b1 ) ∼ (a, b), so a1 b = ab1 , and similarly c1 d = cd1 . Thus
(ad + bc)(b1 d1 ) = ab1 dd1 + bd1 b1 c
= a1 bdd1 + bb1 c1 d
= (a1 d1 + b1 c1 )(bd).
In other words,
[(ad + bc, bd)] = [(a1 d1 + b1 c1 , b1 d1 )],
proving that addition is indeed well-defined.
The proof that the multiplication operation above is well-defined is similar and
is left as an exercise for the reader.

There remains to show that (F, +, ⋅) is a field using these operations.


(A1) That F is closed under addition is clear, since for all [(a, b)], [(c, d)] ∈ F,
we have bd ≠ 0 (because D is an integral domain), and so [(ad + bc, bd)] ∈ F.
(A2)
([(a, b)] + [(c, d)]) + [(x, y)] = [(ad + bc, bd)] + [(x, y)]
= [((ad + bc)y + (bd)x, bdy)]
= [(ady + bcy + bdx, bdy)]
= [(a, b)] + [(cy + dx, dy)]
= [(a, b)] + ([(c, d)] + [(x, y)] ) .
Thus addition is associative.
(A3) Consider α ∶= [(0, 1)] ∈ F. For any [(a, b)] ∈ F,
α + [(a, b)] = [(0, 1)] + [(a, b)] = [(0b + 1a, 1b)]
= [(a, b)] = [(a1 + b0, b1)]
= [(a, b)] + [(0, 1)] = [(a, b)] + α.
Thus α = [(0, 1)] is the neutral element under addition. (We are thinking
of [(0, 1)] as “0/1”. That it is a neutral element under addition should not
be surprising!!)
Notice that (0, 1) ∼ (0, d) for all d ∈ D ∖ {0}, so that [(0, 1)] = [(0, d)].
(A4) Let [(a, b)] ∈ F. Then
[(a, b)] + [(−a, b)] = [(ab − ba, b^2 )] = [(0, b^2 )]
= [(0, 1)] = [(−ab + ba, b^2 )]
= [(−a, b)] + [(a, b)].
Thus [(−a, b)] = −[(a, b)] is the additive inverse of [(a, b)].

(A5) For all [(a, b)], [(c, d)] ∈ F,


[(a, b)] + [(c, d)] = [(ad + bc, bd)] = [(cb + da, db)] = [(c, d)] + [(a, b)],
where the middle equality holds because (D, +) is an abelian group.
We have shown that (F, +) is an abelian group.
The proofs that (F, +, ⋅) satisfies conditions (M1), (M2), (M3) and (M4) of Defi-
nition 2.1.2 are similar to those above and are left as an exercise for the reader.
We conclude that (F, +, ⋅) is a ring. We also leave it as an exercise for the
reader to verify that multiplication is commutative in F, and that [(1, 1)] serves as
a multiplicative identity for this ring. There remains to show that every non-zero
element is invertible.

From above, for all 0 ≠ d ∈ D, [(0, d)] = [(0, 1)] = α is the neutral element of F under
addition. Suppose that [(a, b)] ∈ F with a ≠ 0 ≠ b. Then
[(a, b)][(b, a)] = [(ab, ba)] = [(1, 1)] = [(ba, ab)] = [(b, a)][(a, b)],
and so [(b, a)] = [(a, b)]−1 is the multiplicative inverse of [(a, b)].
At long last, we conclude that (F, +, ⋅) is a field.
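The construction of Theorem 2.5 is short enough to implement directly. The sketch below (the class name `Quot` and the tuple representation are our own choices, not notation from the notes) builds the field of quotients of D = Z: equality of classes is the relation ∼, and the operations are exactly those of the theorem.

```python
# The field of quotients of the integral domain D = Z, following Theorem 2.5.
# Quot(a, b) stands for the equivalence class [(a, b)], with b != 0.

class Quot:
    def __init__(self, a, b):
        assert b != 0
        self.a, self.b = a, b

    def __eq__(self, other):          # [(a, b)] = [(c, d)] iff ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):         # [(a, b)] + [(c, d)] = [(ad + bc, bd)]
        return Quot(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):         # [(a, b)][(c, d)] = [(ac, bd)]
        return Quot(self.a * other.a, self.b * other.b)

    def inv(self):                    # [(a, b)]^{-1} = [(b, a)] when a != 0
        assert self.a != 0
        return Quot(self.b, self.a)

zero, one = Quot(0, 1), Quot(1, 1)
x = Quot(2, 7)
assert x + zero == x                  # alpha = [(0, 1)] is neutral
assert x + Quot(-2, 7) == zero        # [(-a, b)] is the additive inverse
assert x * x.inv() == one             # [(b, a)] is the multiplicative inverse
assert Quot(2, 7) == Quot(4, 14)      # well-definedness: (2, 7) ~ (4, 14)
print("field axioms spot-checked")
```

For D = Z this reproduces (an isomorphic copy of) Q, which is exactly Example 2.6 below.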

2.6. Example. Let us apply this construction with D = Z. The resulting field
F is then F ∶= {[(p, q)] ∶ p, q ∈ Z, q ≠ 0}. We claim that this is isomorphic to Q.
Consider the map
Θ ∶ F → Q
[(p, q)] ↦ p/q .
Once again, we are defining a map on equivalence classes of sets, and since we
are defining the map in terms of a particular choice of representative, the first thing
that we must do is to check that our map Θ is well-defined.
Suppose that [(p, q)] = [(r, s)]. By definition, q ≠ 0 ≠ s are integers, and (p, q) ∼
(r, s), from which we find that ps = qr. But then dividing both sides by qs ≠ 0, we
get
Θ([(p, q)]) = p/q = r/s = Θ([(r, s)]),
proving that Θ is indeed well-defined.
Now
Θ([(p, q)] + [(r, s)]) = Θ([(ps + rq, qs)])
= (ps + rq)/(qs)
= p/q + r/s
= Θ([(p, q)]) + Θ([(r, s)]),

and
Θ([(p, q)][(r, s)]) = Θ([(pr, qs)])
= (pr)/(qs)
= (p/q) ⋅ (r/s)
= Θ([(p, q)]) Θ([(r, s)]).
Thus Θ is a ring homomorphism.
To see that Θ is injective, it suffices to show that ker Θ = {[(0, 1)]}. But [(p, q)] ∈
ker Θ if and only if Θ([(p, q)]) = p/q = 0, which happens if and only if p = 0. Hence
ker Θ = {[(0, q)] ∶ q ∈ Z, 0 ≠ q} = {[(0, 1)]},
since q ≠ 0 implies that (0, q) ∼ (0, 1). Hence Θ is injective.
Finally, to see that Θ is surjective, and hence an isomorphism of rings, note that
for any p/q ∈ Q with p ∈ Z and 0 ≠ q ∈ Z, we have that
p/q = Θ([(p, q)]).
Since Θ is an isomorphism, we have just constructed (an isomorphic copy of) Q
from Z.

Of course, the fact that we are able to construct a field using an integral domain
is quite interesting, but if that field were to have no other connection to the integral
domain, then it would be an idle curiosity at best. In much the same way that the
rational numbers contain an isomorphic copy of the integers, we find that the field
F of quotients of an integral domain D always includes an isomorphic copy of D; in
other words – up to a matter of notation – it always includes D.

2.7. Theorem. Let D be an integral domain and let F = {[(a, b)] ∶ a, b ∈ D, b ≠


0} be its field of quotients. The map
θ∶ D → F
a ↦ [(a, 1)]
is an injective homomorphism, and thus θ(D) ⊆ F is an integral domain which is
isomorphic to D.
Proof. That θ(D) is an integral domain which is isomorphic to D will follow auto-
matically once we show that θ is an injective homomorphism, since it is automatically
surjective as a map onto its range θ(D).
Now
θ(a + b) = [(a + b, 1)] = [(a, 1)] + [(b, 1)] = θ(a) + θ(b),
and
θ(ab) = [(ab, 1)] = [(a, 1)][(b, 1)] = θ(a) θ(b),

so that θ is a homomorphism. If θ(a) = [(0, 1)], then [(a, 1)] = [(0, 1)], so (a, 1) ∼
(0, 1). But this happens if and only if a ⋅ 1 = 1 ⋅ 0 = 0, i.e. if a = 0. Thus ker θ = {0},
whence θ is injective.

2.8. Example. If D is an integral domain, then so is D[x]. The corresponding


field is
F = {[(p(x), q(x))] ∶ p(x), q(x) ∈ D[x], q(x) ≠ 0},
with
[(p(x), q(x))] + [(r(x), s(x))] ∶= [(p(x)s(x) + q(x)r(x), q(x)s(x))],
and
[(p(x), q(x))] [(r(x), s(x))] ∶= [(p(x)r(x), q(x)s(x))].

It is standard to write p(x)/q(x) to denote the equivalence class [(p(x), q(x))]. In
particular, it is understood that
p(x)/q(x) = r(x)/s(x)
if and only if p(x)s(x) = r(x)q(x). Using this notation, the above summation and
multiplication become the familiar
p(x)/q(x) + r(x)/s(x) = (p(x)s(x) + q(x)r(x))/(q(x)s(x))
and
(p(x)/q(x)) ⋅ (r(x)/s(x)) = (p(x)r(x))/(q(x)s(x)) .

Supplementary Examples.

S5.1. Example. Let R = C([0, 1], R), equipped with the usual point-wise ad-
dition and multiplication. This ring has an additional structure, namely that it is
a vector space over R. (We shall discuss vector spaces over general fields in a later
Chapter.)
Given x0 ∈ [0, 1], the map
εx0 ∶ C([0, 1], R) → R
f → f (x0 )
is easily seen to be a ring homomorphism. It is also linear and non-zero, since if
g(x) = 1 for all x ∈ [0, 1], then g ∈ C([0, 1], R) and εx0 (g) = 1 ≠ 0. By the First
Isomorphism Theorem,
R = ran εx0 ≃ C([0, 1], R)/ ker εx0 .
That ker εx0 is an ideal follows from Chapter 4. Since the co-dimension of ker εx0
is one, this implies that ker εx0 is a maximal ideal of C([0, 1], R).
Less easy to show is the fact that every maximal ideal of C([0, 1], R) is of the
form ker εx0 for some x0 ∈ [0, 1].

S5.2. Example. Let D be an integral domain, and let R = D × D = {(d1 , d2 ) ∶


di ∈ D, i = 1, 2}, equipped with point-wise addition and multiplication.
Let J ≠ D be a prime ideal of D. Then for any a, b ∈ D, ab ∈ J implies that a ∈ J
or b ∈ J. Let k ∈ J , and d ∈ D ∖ J . Then
(d, k) ⋅ (k, d) = (dk, kd) = (dk, dk) ∈ J × J,
though neither of (k, d) nor (d, k) belongs to J × J. Thus J × J is not a prime ideal
of R.
The reader would be well-served to generalise this result to ideals of the form
J1 × J2 of R.

S5.3. Example. If F is a field, then {0} and F are the only ideals of F, and so
{0} is a maximal ideal.

S5.4. Example. Let R = Z[x], the ring of polynomials in x with coefficients in


Z. Let K = ⟨x⟩, the ideal generated by x.
Suppose that p(x), q(x) ∈ Z[x] and that p(x)q(x) ∈ K. Writing p(x) = pm x^m +
pm−1 x^{m−1} + ⋯ + p1 x + p0 and q(x) = qn x^n + qn−1 x^{n−1} + ⋯ + q1 x + q0 , we see that if
p0 ≠ 0 ≠ q0 , then the constant term of p(x)q(x) is equal to p0 q0 ≠ 0 (as Z is an
integral domain), contradicting the fact that every element of K = ⟨x⟩ has zero
constant term. In other words, either p0 = 0 (i.e. p(x) ∈ K) or q0 = 0 (i.e. q(x) ∈ K) – or both.
Either way, this shows that K is a prime ideal.

S5.5. Example. Let R = Z[x], the ring of polynomials in x with coefficients


in Z. Once again, let K = ⟨x⟩, the ideal generated by x. Then K = {xq(x) ∶ q(x) ∈
Z[x]}. Let N ∶= {2m + xq(x) ∶ m ∈ Z, q(x) ∈ Z[x]}. We invite the reader to verify
that N ⊲ Z[x], and that K ⊊ N ⊊ Z[x].
In other words, K is not maximal. This also follows from the fact that
Z[x]/⟨x⟩ ≃ Z,
which is not a field.

S5.6. Example. The ideal K = ⟨x^2 + 1⟩ is a prime ideal in Z[x]. It is left as
an exercise for the reader to prove that
Z[x]/⟨x^2 + 1⟩ ≃ Z[i],
the ring of Gaussian integers. Since the latter is an integral domain but not a field,
⟨x^2 + 1⟩ is a prime ideal which is not a maximal ideal of Z[x].
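The asserted isomorphism can at least be probed numerically: reducing a product of classes a + bx modulo x^2 + 1 (that is, replacing x^2 by −1) should match multiplication of the corresponding Gaussian integers a + bi. A quick sketch (ours, not from the notes):

```python
# Probe Z[x]/<x^2 + 1> ~ Z[i]: multiply classes a + b x with x^2 set to -1,
# and compare with Gaussian-integer multiplication via Python's complex type.

def mul_mod_x2_plus_1(p, q):
    """(p0 + p1 x)(q0 + q1 x) with x^2 replaced by -1."""
    (p0, p1), (q0, q1) = p, q
    return (p0 * q0 - p1 * q1, p0 * q1 + p1 * q0)

def gauss_mul(p, q):
    """Multiply a + b i and c + d i, returning (real part, imaginary part)."""
    z = complex(*p) * complex(*q)
    return (int(z.real), int(z.imag))

for p in [(1, 2), (3, -1), (0, 1)]:
    for q in [(2, 5), (-4, 3)]:
        assert mul_mod_x2_plus_1(p, q) == gauss_mul(p, q)
print(mul_mod_x2_plus_1((0, 1), (0, 1)))   # (-1, 0): the class of x squares to -1
```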

S5.7. Example. Note that Z × {0} is a prime ideal of Z × Z, but it is not


maximal, since it is contained in the larger ideal Z × 2Z.

S5.8. Example. Let p ∈ N be a prime number and


K ∶= {q(x) = qn x^n + qn−1 x^{n−1} + ⋯ + q1 x + p q0 ∶ n ∈ N, qj ∈ Z, 0 ≤ j ≤ n} ⊆ Z[x].

That is, K consists of all polynomials whose constant term is divisible by p.


Then
Z[x]/K ≃ Z/⟨p⟩ ≃ Zp
is a field, so K is a maximal ideal of Z[x], and therefore it is also a prime ideal of
Z[x].
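For readers who like to experiment, the isomorphism above is induced by the map sending a polynomial to its constant term reduced mod p. The following Python sketch is our own illustration (not part of the notes); polynomials are represented as lists of integer coefficients, constant term first, and the helper names are ours.

```python
p = 5  # any prime number

def phi(f):
    """Image of f in Z[x]/K ≅ Z_p: the constant term reduced mod p."""
    return (f[0] if f else 0) % p

def poly_add(f, g):
    """Coefficient-wise sum of two integer polynomials."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Product of two non-zero integer polynomials."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out
```

A polynomial lies in K exactly when phi sends it to 0, and phi respects sums and products, so K is the kernel of a surjective homomorphism onto Zp – which is why the quotient is a field.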

S5.9. Example. Returning to C([0, 1], R): let K ∶= {f ∈ C([0, 1], R) ∶ f (x) = 0 for all x ∈ [1/3, 2/3]}. We invite the reader to verify that this is an ideal of C([0, 1], R).
Let g(x) = 0 for x ∈ [0, 1/2] and g(x) = x − 1/2 for x ∈ [1/2, 1], and let h(x) = 1/2 − x for x ∈ [0, 1/2] and h(x) = 0 for x ∈ [1/2, 1].
Then g(x)h(x) = 0 ∈ K, but neither g(x) nor h(x) is in K, so K is not prime.

In fact, essentially the same argument shows that if R is a commutative ring which is not an integral domain, then the zero ideal {0} is not a prime ideal of R.

S5.10. Example. If we are not dealing with unital rings, there need not exist
maximal ideals in general (even if we assume Zorn’s Lemma).
Consider R = (Q, +, ∗), where + is usual addition of rational numbers, but
a ∗ b ∶= 0 for all a, b ∈ Q.
We leave it to the reader to verify that R is indeed a ring.
What should an ideal of R look like? In essence, K ⊲ R if and only if k1 − k2 ∈ K for all k1, k2 ∈ K, while the condition a ∗ k = k ∗ a = 0 ∈ K is trivially verified. In other words, K must be
a subgroup of Q under addition, and any subgroup of Q under addition is an ideal
of R.
Suppose that M ⊊ Q is a proper subgroup under addition.
As per the first Assignment, there exists a prime number p ∈ N and an integer
N ∈ N such that n ≥ N implies that 1/p^n ∉ M.
Let N ∶= {m + k/p^N ∶ m ∈ M, k ∈ Z}. We claim that N is a subgroup of (Q, +) and
therefore an ideal of R = (Q, +, ∗), and that M ⊊ N ⊊ R, so that M is not a maximal
ideal of R.
● Let a ∶= m1 + k1/p^N and b ∶= m2 + k2/p^N ∈ N. Then
b − a = (m2 + k2/p^N) − (m1 + k1/p^N) = (m2 − m1) + (k2 − k1)/p^N ∈ N.
Thus N is an additive subgroup of Q, and so it is an ideal of R using ∗ as
our multiplication.
● Clearly M ⊂ N and M ≠ N since 1/p^N ∈ N ∖ M.
To see that N ≠ Q, note that 1/p^{N+1} ∉ N. Indeed, suppose otherwise. Then
1/p^{N+1} = m + k/p^N for some m ∈ M, k ∈ Z,
and so
1/p^N = pm + k/p^{N−1} ∈ M,
a contradiction.

Appendix

A5.1. Earlier in this Chapter, and also in the Exercises to the previous Chapter,
we defined the notion of a maximal ideal M in a ring R, and we also mentioned that
there is a difference between the notions of maximal elements of partially ordered
sets, and of maximum elements of partially ordered sets. Let us now examine this
more closely. We shall begin, as a wise man used to say, at the “big-inning”.
A5.2. Definition. Let ∅ ≠ X be a set. A relation ρ on a non-empty set X is
a subset of the set X × X = {(x, y) ∶ x, y ∈ X} of ordered pairs of elements of X.
Often, we write x ρ y to mean the ordered pair (x, y) ∈ ρ. This is especially true
when dealing, for example, with the usual relation ≤ for real numbers: no one writes
(x, y) ∈≤; we all write x ≤ y. Incidentally, the notation ≤ is used not only for the
relation “less than or equal to” for real numbers; it frequently appears to indicate
a specific kind of a relation known as a partial order on an arbitrary set, which we
now define.
A5.3. Example. Let X = Z. We might define a relation ∼ on X by setting
x ∼ y if ∣x − y∣ ≤ 1. Thus 3 ∼ 3, 3 ∼ 4, 4 ∼ 3, and 4 ∼ 5, but (3, 5) ∉ ∼ (i.e. 3 ≁ 5), since ∣3 − 5∣ = 2 > 1.
Note that there need not exist an explicit “formula” to describe ρ. Any subset
of X × X defines a relation.
A5.4. Definition. A relation ≤ on a non-empty set X is said to be a partial
order if, given x, y, and z ∈ X,
(a) x ≤ x;
(b) if x ≤ y and y ≤ x, then x = y; and
(c) if x ≤ y and y ≤ z, then x ≤ z.
We refer to the pair (X, ≤) as a partially ordered set, or more succinctly as a
poset.
A5.5. The prototypical example of a partial order is the partial order of inclu-
sion on the power set P (A) of a non-empty set A. That is, we set P (A) = {B ∶ B ⊆
A} and for B, C ∈ P (A), we set B ⊆ C to mean that every member of B is a member
of C. It is easy to see that ⊆ is a partial order on P (A).
The word partial refers to the fact that given two elements B and C in P (A),
they might not be comparable. For example, if A = {x, y}, B = {x} and C = {y},
then it is not the case that B ⊆ C, nor is it the case that C ⊆ B. Only some subsets
of P (A) are comparable to each other.
In dealing with the natural numbers, we observe that they come equipped with
a partial order which we typically denote by ≤. In this setting, however, any two
natural numbers are comparable. This leads us to the following notion.
A5.6. Definition. If (X, ρ) is a partially ordered set, and if x, y ∈ X implies that
either xρy or yρx, then we say that ρ is a total order on X.

A5.7. Example.
(a) It follows from the above discussion that (N, ≤) is a totally ordered set.
(b) It also follows from the above discussion that if A is a non-empty set with
at least two members, then (P (A), ⊆) is partially ordered, but not totally
ordered by inclusion.

A5.8. Definition. Let (X, ≤) be a partially ordered set. An element m ∈ X is


said to be a maximal element of X if x ∈ X and m ≤ x implies that x = m.
The element ω ∈ X is said to be a maximum element of X if x ∈ X implies
that x ≤ ω.

A5.9. The key difference between these two notions is that a maximum element
is comparable to every other element of X, and is bigger than or equal to it, whereas
a maximal element need not be comparable to every other element of the set X –
it just needs to be bigger than or equal to those elements to which it is actually
comparable.

A5.10. Example. Let A = {∅, {x}, {y}, {z}, {x, y}, {x, z}, {y, z}} be the set of
proper subsets of X = {x, y, z}, and partially order A by inclusion ⊆. Then clearly
{x} ⊆ {x, y}.
Observe that {x, y} is a maximal element of A. Nothing is bigger than {x, y} in
A. But {z} ⊆/ {x, y}, and so {x, y} is not a maximum element of A. Note that {x, z}
is also a maximal element. Maximal elements need not be unique. (Are maximum
elements unique when they exist? Think about it!)
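Since A is finite, the maximal/maximum distinction can even be checked by brute force. The following Python sketch is ours (not from the notes); it encodes the poset A of proper subsets of X above as frozensets ordered by inclusion.

```python
from itertools import combinations

X = ('x', 'y', 'z')
# A = the set of all proper subsets of X, partially ordered by inclusion
A = [frozenset(c) for r in range(len(X)) for c in combinations(X, r)]

def is_maximal(s):
    """No member of A strictly contains s."""
    return not any(s < t for t in A)

def is_maximum(s):
    """Every member of A is contained in s."""
    return all(t <= s for t in A)

maximal = [s for s in A if is_maximal(s)]
```

Exactly the three two-element subsets come out maximal, and no element of A is a maximum, matching the discussion above.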

A5.11. Remark. We have not addressed the question of whether or not max-
imal ideals always exist. As it so happens, the existence of maximal ideals in unital
rings is equivalent to Zorn’s Lemma, or equivalently to the Axiom of Choice, both
of which are discussed in the Appendix at the end of these course notes.

A5.12. We have seen that a partial order on a non-empty set is a specific kind
of relation that mimics the notion of inclusion between elements of the power set
P (X) of a set X, or the simple concept of less than or equal to between two real
numbers.
Another important relation tries to mimic the concept of two elements being
equivalent. The name chosen for such a relation couldn’t be better suited.

A5.13. Definition. A relation ≡ on a non-empty set X is said to be an equiv-


alence relation if for all x, y, z ∈ X,
(a) x ≡ x;
(b) if x ≡ y, then y ≡ x; and
(c) if x ≡ y and y ≡ z, then x ≡ z.

Given x ∈ X, we define the equivalence class of x to be the set


[x] ∶= {y ∶ y ≡ x},
and we denote by X/ ≡ the set of all equivalence classes of elements of X. Thus
X/ ≡∶= {[x] ∶ x ∈ X}.
We refer to x as a representative of its equivalence class.
A5.14. Thus an equivalence relation differs from a partial order because of
item (b) above. In the case of a partial order, the condition x ≤ y and y ≤ x implies
x = y is referred to as anti-symmetry. The idea is that a partial order is only
symmetric when it has to be. In the case of an equivalence relation, the condition
that x ≡ y implies that y ≡ x is referred to as symmetry. Equivalence was born to
be symmetric.
A5.15. Example. The most obvious notion of equivalence is equality. Let
∅ ≠ X be a set, and define x ≡ y if and only if x = y. It is easy to see that this is an
equivalence relation. For each x ∈ X, the equivalence class of x is [x] = {x}.
A5.16. Example. Consider X = Z. For x, y ∈ Z, set x ≡ y if x − y ∈ 2Z; i.e. if
x − y = 2m for some m ∈ Z.
Again - we leave it to the reader to verify that this is indeed an equivalence
relation.
There are precisely two equivalence classes for Z under this relation, namely
[0] = 2Z = {2m ∶ m ∈ Z}, and [1] = {2m + 1 ∶ m ∈ Z}.
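The two classes can also be computed mechanically. A small Python sketch (ours, not the notes'; it uses the remainder x % 2 as a label for the class of x) applied to a finite window of Z:

```python
from collections import defaultdict

# x ≡ y iff x - y ∈ 2Z; group a window of integers into classes
classes = defaultdict(list)
for x in range(-4, 5):
    classes[x % 2].append(x)   # x % 2 is the same for every y ≡ x
```

Only the two labels 0 and 1 occur, and together the two lists partition the window – a preview of Exercise A5.17 below.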
A5.17. Exercise. Note that in both of the above examples, the underlying set
X is a disjoint union of its equivalence classes under the equivalence relation.
Let ∅ ≠ X be a non-empty set and ≡ be an equivalence relation on X.
(a) Prove that if x, y ∈ X, then either [x] = [y], or [x] ∩ [y] = ∅.
(b) Prove that X = ⊍{[x] ∶ [x] ∈ X/ ≡}.
A5.18. One of the most important things to remember about equivalence classes is that their representatives are, in general, non-unique. For example, in Example A5.16
above, [0] = [2] = [246] = [−5484726492], so that 0, 2, 246 and −5484726492 are all
representatives of the same equivalence class
2Z = {. . . , −4, −2, 0, 2, 4, . . .}.
Similarly, [1] = [−2222444466669]. When dealing with an equivalence class as an
element of X/ ≡, it is important to remember that in general you are looking at a
set of elements, and that you can’t really see a particular representative of that set.
As a result, any argument or definition that you make about the equivalence class
that uses that representative had better not depend upon the representative that
you have chosen!

Hey, this sounds familiar: we’ve seen this kind of thing before, haven’t we? (Yes!
is the correct rhetorical answer to this not-so rhetorical question.) In the Appendix

to Chapter 4, we discussed the notion of well-definedness of functions, especially in


relation to cosets of a ring. The point is that if R is a ring, and K is an ideal of
that ring, then we may establish an equivalence relation by setting x ∼ y if and only
if x − y ∈ K. The equivalence class of x is precisely [x] = {x + k ∶ k ∈ K}, which we
define to be the coset [x] = x + K. The point is that when K is an ideal of R, we
have discovered a way to turn the collection of equivalence classes of elements of R
(under the above relation) into a ring. Wicked!

Exercises for Chapter 5

Exercise 5.1.
Let 2 ≤ n ∈ N.
(a) Find all maximal ideals of Tn (C).
(b) Find all prime ideals of Tn (C).

Exercise 5.2.
Prove that p(x) = x^2 + 2x + 1 does not lie in the ideal ⟨x^3 + 1⟩ of Z3 [x].

Exercise 5.3.
Let D be an integral domain. Let ∼ be the equivalence relation defined on
X ∶= D × (D ∖ {0}) in Proposition 2.4 above, and let F ∶= {[(a, b)] ∶ (a, b) ∈ X}.
Prove that the multiplication operation on F given by
[(a, b)][(c, d)] ∶= [(ac, bd)]
is well-defined.

Exercise 5.4.
With the hypotheses and notation from Theorem 2.5 above,
(a) Show that conditions (M1), (M2), (M3) and (M4) of Definition 2.1.2 are
satisfied by F.
(b) Prove that [(1, 1)] serves as a multiplicative identity for F.

Exercise 5.5.
Let R be a ring, and let M and N be prime ideals of R. Either prove that M ∩N
is a prime ideal of R, or find a counterexample to this statement if it is false.

Exercise 5.6.
Let R be a ring, and let M and N be maximal ideals of R. Either prove that
M ∩ N is a maximal ideal of R, or find a counterexample to this statement if it is
false.

Exercise 5.7.
Let R and S be rings and φ ∶ R → S be a homomorphism.
(a) Suppose that P ⊲ S is a prime ideal. Either prove that φ−1 (P ) ∶= {x ∈ R ∶
φ(x) ∈ P } is a prime ideal of R, or find a counterexample to this statement
if it is false.
(b) Suppose that M ⊲ S is a maximal ideal. Either prove that φ−1 (M ) ∶= {x ∈ R ∶ φ(x) ∈ M } is a maximal ideal of R, or find a counterexample to this statement if it is false.
(c) Let K ⊲ R be an ideal. Prove that φ(K) need not be an ideal of S, but
that φ(K) is an ideal of S if φ is surjective.

Exercise 5.8.
Let R be a unital ring.
(a) Let K ≠ R be an ideal of R. Prove that there exists a maximal ideal M
such that K ⊆ M .
(b) Let r ∈ R be non-invertible. Prove that there exists a maximal ideal M
such that r ∈ M .
(c) Suppose that R has a unique maximal ideal. Prove that the set N of
non-invertible elements of R form an ideal in R.

Exercise 5.9.
Let R be a finite, commutative and unital ring. Prove that every prime ideal is
maximal.

Exercise 5.10.
Let R be a commutative unital ring. Suppose that every proper ideal of R is a
prime ideal. Prove that R is a field.
CHAPTER 6

Euclidean Domains

Don’t ever wrestle with a pig. You’ll both get dirty, but the pig will
enjoy it.
Cale Yarborough

1. Euclidean Domains
1.1. Algebraists study algebraic structures. That is neither as deep nor as shal-
low as it may sound. Analysts study algebraic structures that admit some sort of
topological structure as well. The rational numbers are a perfect example of what
we are trying to get at here.
From the point of view of algebra - the rational numbers are a perfectly legitimate
object of study. Heck, they form a field, and fields - from an algebraic standpoint –
are swell. (That is not a technical term.) Of course, in some (algebraic) ways, the
rational numbers are lacking: as a field, Q is not algebraically closed. This can
be a severe handicap for an algebraist. But all in all, they are still really, really swell.
From the point of view of your average Analyst, the rational numbers are severely
lacking not only because they are algebraically incomplete, but more significantly
because they are not (analytically) complete; that is, it is possible to find Cauchy
sequences of rational numbers whose limits are not rational numbers. As a conse-
quence, basic analytical properties such as the Least Upper Bound property and
the Intermediate Value Theorem fail abjectly when dealing with the rationals as
the base field, and this suggests to the Analyst that the real numbers (or, for the
more daring and debonair – the complex numbers) are the field with which it is
preferable to work. Limits, one of the many “only true love”’s of the Analyst, tend
to be of little or no concern to Algebraists. Since they are interested in alternative,
more algebraic properties, the failure of the Intermediate Value Theorem is neither
here nor there nor anywhere else to them.
This is not a bad nor a good thing. It is just a thing.

A passing note on Euclidean domains before we embark upon a detailed study


of them. The three examples we develop in this Chapter, namely Z, polynomial
rings over integral domains (and its generalisation – formal power series), and the
Gaussian integers, are the same three examples that appear in most of the (great

many) books that we have consulted. What’s more – these are typically the only
three that appear in those references. See the Appendix for one more example -
so-called Dedekind domains.1
This is not to say that they are not important: the techniques we shall develop to study them will be exceedingly important in the study of irreducibility/reducibility of polynomials in the coming chapters.

1.2. Definition. Let D be an integral domain. A Euclidean norm or valu-


ation on D is a map
µ ∶ D ∖ {0} → N0 ∶= N ∪ {0} = {0, 1, 2, 3, . . .}
satisfying
(a) for all a, b ∈ D with b ≠ 0, there exist q, r ∈ D such that
a = qb + r,
where either r = 0 or µ(r) < µ(b); and
(b) for all a, b ∈ D ∖ {0}, µ(a) ≤ µ(ab).
A Euclidean domain is an integral domain equipped with a valuation µ.

1.3. Example. Suppose that E is a Euclidean domain with valuation µ. Then ν(d) ∶= 2^µ(d), d ∈ E ∖ {0}, defines a valuation on E as well.

1.4. Example. Item (a) in the definition of a valuation should look eerily famil-
iar. In particular, it should look very much like a property enjoyed by the integers
referred to as the division algorithm. More precisely, the division algorithm for
Z says that if a and b are integers, with b ≠ 0, then there exist integers q and r such that
a = bq + r,
where 0 ≤ r < ∣b∣.
If we set µ(d) = ∣d∣ for all d ∈ Z ∖ {0}, then clearly conditions (a) and (b) of
Definition 1.2 hold. Since Z is an integral domain, we have come to the not-so-
surprising conclusion that (Z, +, ⋅) is a Euclidean domain.
Observe that the conclusion is not-so-surprising in large part because the defi-
nition of a Euclidean domain is meant to capture exactly this property of Z.
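In Python, floor division produces exactly such a pair (q, r); here is a small sketch of our own (not from the notes). One caveat: Python's convention gives a remainder with the sign of b rather than 0 ≤ r < |b|, but the valuation inequality µ(r) = |r| < |b| = µ(b) still holds whenever r ≠ 0, which is all Definition 1.2 asks for.

```python
def div_z(a, b):
    """Division with remainder in Z (b != 0): return (q, r) with a = q*b + r.

    Floor division gives r = a - (a // b) * b with |r| < |b|, so the
    valuation mu(d) = |d| satisfies mu(r) < mu(b) whenever r != 0.
    """
    q = a // b
    return q, a - q * b
```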

1.5. The second most common and important example of a Euclidean domain is
that of a polynomial ring with coefficients in a field. In light of condition (a) above,
it behooves us (admit it – how many times have you seen the word “behoove” in
a mathematical text, eh?) to prove some version of the division algorithm in this
setting.
1 As it turns out, our plea for more examples (from an earlier version of these notes) was answered by N. Banks. We refer the reader to the Appendix for a brief discussion of these.

1.6. Theorem. (The division algorithm for polynomial rings)


Let F be a field and f (x), g(x) ∈ F[x] with g(x) ≠ 0. Then there exist unique
polynomials q(x), r(x) ∈ F[x] satisfying
(i) f (x) = q(x)g(x) + r(x), and
(ii) either r(x) = 0 or deg (r(x)) < deg (g(x)).
Proof.
(i) Let us first address the question of the existence of q(x) and r(x).
● If f (x) = 0 or if deg (f (x)) < deg (g(x)), then we set q(x) = 0 and
r(x) = f (x).
● There remains the case where f (x) ≠ 0 and deg (g(x)) ≤ deg (f (x)).
We argue by induction on the degree of f (x). Since f (x) ≠ 0, we have
that 0 ≤ deg(f (x)).
The base case is where deg(f (x)) = 0. Then f (x) = a0 for some a0 ∈ F.
Since g(x) ≠ 0, we must have deg (g(x)) = 0, and thus g(x) = b0 for
some 0 ≠ b0 ∈ F.
Setting q(x) = a0 b0^{−1} and r(x) = 0 does the trick.
Next, suppose that N ≥ 1, that deg (f (x)) = N , and that the de-
sired result holds when f (x) is replaced by any polynomial h(x) with
deg (h(x)) < N .
The case where deg (f (x)) < deg (g(x)) having already been done, we
may now suppose that m ∶= deg (g(x)) ≤ N . Let us write
f(x) = aN x^N + aN−1 x^{N−1} + ⋯ + a1 x + a0,
g(x) = bm x^m + bm−1 x^{m−1} + ⋯ + b1 x + b0.
Let
k(x) ∶= f(x) − (aN bm^{−1}) x^{N−m} g(x)
and observe that deg (k(x)) < N . (Why?)
Since deg (k(x)) < N , our induction hypothesis ensures that there exist
u(x), r(x) ∈ F[x] such that r(x) = 0 or deg (r(x)) < deg (g(x)), and
k(x) = u(x)g(x) + r(x).
But then
f(x) = k(x) + (aN bm^{−1}) x^{N−m} g(x)
     = (u(x) + (aN bm^{−1}) x^{N−m}) g(x) + r(x).
Setting q(x) = u(x) + (aN bm^{−1}) x^{N−m} completes the proof of existence.
(ii) Next, let us see why q(x) and r(x) satisfying the properties listed above
are unique.
To that end, suppose that u(x) and s(x) are also polynomials satisfying
conditions (i) and (ii) in the statement of the theorem. Thus
f (x) = u(x)g(x) + s(x),
and either s(x) = 0 or deg (s(x)) < deg (g(x)).

Consider

0 = f (x) − f (x) = (u(x) − q(x))g(x) + (s(x) − r(x)),

or equivalently,

(u(x) − q(x)) g(x) = r(x) − s(x).

It immediately follows that

deg ((u(x) − q(x)) g(x)) = deg (s(x) − r(x)).

But from condition (ii) applied to each of r(x) and s(x) we find that either
s(x) − r(x) = 0 or deg (s(x) − r(x)) < deg (g(x)).
On the other hand, if u(x) − q(x) ≠ 0, then deg ((u(x) − q(x)) g(x)) ≥
deg (g(x)).
From this it follows that we must have u(x) − q(x) = 0, i.e. u(x) = q(x).
But then

s(x) = f (x) − u(x)g(x) = f (x) − q(x)g(x) = r(x).

1.7. Example. Let f(x) = 5x^4 + 3x^3 + 1 and g(x) = 3x^2 + 2x + 1 ∈ Z7[x].
To divide f(x) by g(x), we proceed with long division the way that we would with integers. Thus, to keep track of the “x^2 place” and the “x^1 place”, we write

f(x) = 5x^4 + 3x^3 + 0x^2 + 0x^1 + 1.

(The reader should not look at us with that expression: the natural number 1004 = 1 ⋅ 10^3 + 0 ⋅ 10^2 + 0 ⋅ 10^1 + 4 ⋅ 1. You’ve seen this kind of thing before.)
One next has to figure out how many 3x^2’s “fit” into 5x^4, and this requires us to divide 5 by 3 in Z7. Since 5 = 3 ⋅ 4 in Z7, we shall require 4g(x) to begin with. Performing that calculation yields the following.

                  4x^2
3x^2+2x+1 ) 5x^4 +3x^3 +0x^2 +0x +1
            5x^4 +1x^3 +4x^2
                  2x^3 +3x^2 +0x

Next, we must decide how many 3x^2’s “fit” into 2x^3, and again – this requires us to divide 2 by 3 in Z7. Since 2 = 3 ⋅ 3 in Z7, we continue the above long division with 3x. Observe that we implicitly use the calculation that 3 − 6 = 0 − 3 = 4 in Z7.

                  4x^2 +3x
3x^2+2x+1 ) 5x^4 +3x^3 +0x^2 +0x +1
            5x^4 +1x^3 +4x^2
                  2x^3 +3x^2 +0x
                  2x^3 +6x^2 +3x
                        4x^2 +4x +1
Finally, we must decide how many 3x^2’s “fit” into 4x^2, and again – this requires us to divide 4 by 3 in Z7. Since 4 = 3 ⋅ 6 in Z7, we finish the above long division with 6. Again, observe that we implicitly use the calculation that 4 − 5 = 6 and 1 − 6 = 2 in Z7.

                  4x^2 +3x +6
3x^2+2x+1 ) 5x^4 +3x^3 +0x^2 +0x +1
            5x^4 +1x^3 +4x^2
                  2x^3 +3x^2 +0x
                  2x^3 +6x^2 +3x
                        4x^2 +4x +1
                        4x^2 +5x +6
                               6x +2

Our conclusion is that
5x^4 + 3x^3 + 1 = (4x^2 + 3x + 6)(3x^2 + 2x + 1) + (6x + 2),
i.e.
f(x) = (4x^2 + 3x + 6) g(x) + (6x + 2),
and so q(x) = 4x^2 + 3x + 6 and r(x) = 6x + 2 in this example.
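The hand computation above can be automated. The sketch below is our own (the name poly_divmod_modp and the coefficient-list representation are our conventions, not the notes'); polynomials are lists of coefficients from the constant term up, so f(x) = 5x^4 + 3x^3 + 1 is [1, 0, 0, 3, 5].

```python
def poly_divmod_modp(f, g, p):
    """Divide f by g in Z_p[x] (p prime, g nonzero).

    Returns (q, r) with f = q*g + r and either r = [] (the zero
    polynomial) or deg r < deg g.
    """
    def trim(h):                      # drop trailing zero coefficients
        while h and h[-1] % p == 0:
            h.pop()
        return h

    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    inv_lead = pow(g[-1], p - 2, p)   # inverse of g's leading coeff (Fermat)
    q = [0] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = (r[-1] * inv_lead) % p    # next coefficient of the quotient
        q[shift] = c
        for i, gc in enumerate(g):    # r <- r - c * x^shift * g
            r[shift + i] = (r[shift + i] - c * gc) % p
        trim(r)
    return trim(q), r
```

Running it on the example reproduces q(x) = 4x^2 + 3x + 6 and r(x) = 6x + 2.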

1.8. A simple reason why we consider the division algorithm for polynomial
rings with coefficients in a field as opposed to polynomial rings with coefficients in,
say, an integral domain is because we must be able to divide the leading coefficient
of the numerator f (x) by the leading coefficient of the divisor g(x). Since g(x) =
bm x^m + bm−1 x^{m−1} + ⋯ + b0 ≠ 0, and since (without loss of generality) the leading coefficient bm ∈ F is non-zero, given any f(x) = an x^n + an−1 x^{n−1} + ⋯ + a0, we know that we can divide an by bm. Thus we know that this applies to the remainder term of f(x)/g(x), so long as the degree of that remainder term is at least that of g(x).
The ring R = Z is an integral domain. If f(x) = 7x^5 + 1 and g(x) = 2x^2, then for any q(x) ∈ Z[x],
deg (f(x) − q(x)g(x)) ≥ 5 = deg f(x),
since no choice of q(x) ∈ Z[x] can make the coefficient of x^5 equal to zero.

1.9. Theorem. Let F be a field. The map


δ ∶ F[x] ∖ {0} → N0 ∶= {0, 1, 2, . . .}
f (x) ↦ deg(f (x))
defines a valuation on the integral domain F[x]. Thus F[x] is a Euclidean domain.

Proof.
(a) It is an immediate consequence of the division algorithm for polynomial
rings, Theorem 1.6 that there exist polynomials q(x), r(x) ∈ F[x] such that
f (x) = q(x)g(x) + r(x),
where r(x) = 0 or δ(r(x)) = deg(r(x)) < deg(g(x)) = δ(g(x)), and thus
condition (a) of Definition 1.2 is satisfied.
(b) If 0 ≠ f (x), g(x) ∈ F[x], then
δ(f (x)) ∶= deg(f (x))
≤ deg(f (x)) + deg(g(x))
= deg(f (x) g(x))
= δ(f (x)g(x)).
Thus δ is a valuation on F[x].

1.10. We have observed earlier in this manuscript that Z is a principal ideal


domain (i.e. a pid). As it turns out, this is a consequence of the fact that it is a
Euclidean domain.

1.11. Theorem. Let E be a Euclidean domain with valuation µ. Then E is a


principal ideal domain.
Proof. Suppose that K ⊲ E is an ideal. If K = {0}, then K = ⟨0⟩ is singly-generated.
Otherwise, if K ≠ {0}, we can choose m ∈ K such that
µ(m) = min{µ(k) ∶ 0 ≠ k ∈ K}.
(This uses the fact that any non-empty subset of N admits a minimum element. We
say that N is well-ordered.) We claim that K = ⟨m⟩. If 0 ≠ k ∈ K, then by condition
(a) of Definition 1.2 of a valuation, we can find q, r ∈ E such that
k = qm + r,
where r = 0 or µ(r) < µ(m). Observe first that m ∈ K implies that qm ∈ K and so
r = k − qm ∈ K as well. Next, note that if r ≠ 0, then µ(r) < µ(m), which contradicts
our choice of m. Thus r = 0, and so k = qm ∈ ⟨m⟩. This completes the proof.


1.12. Corollary. Let F be a field. Then F[x] is a principal ideal domain.


Proof. This is an immediate consequence of Theorem 1.9 and Theorem 1.11.

2. The Euclidean algorithm


2.1. In this section we shall use valuations on an integral domain to obtain a
generalisation of the familiar Euclidean algorithm for integers.

2.2. Definition. Let R be a commutative ring, and let a, b ∈ R with b ≠ 0. We


say that b divides a if there exists q ∈ R such that a = qb. We write b∣a when this
holds; otherwise we write b ∤ a.

2.3. Exercise. Let R be a commutative ring and a, b, c ∈ R. The following


results are readily verified.
(a) If a, b ≠ 0, a∣b and b∣c, then a∣c.
(b) If a ≠ 0, a∣b and a∣c, then a∣(b + c) and a∣(b − c).
(c) If a ≠ 0 and a ∣ b, then a∣bz for all z ∈ R.

2.4. Definition. Let R be a commutative ring, and a, b ∈ R. An element d ∈ R


is said to be a greatest common divisor of a and b if
(i) d∣a and d∣b – that is, d is a divisor of both a and b; and
(ii) If c ∈ R, c∣a and c∣b, then c∣d.
We shall write d = gcd(a, b) to denote the fact that d is a greatest common divisor
of a and b (when such a divisor exists).

2.5. Examples.
(a) 7 and −7 are both greatest common divisors of 35 and −49 in Z. This
highlights the fact that even if a greatest common divisor exists, it need
not be unique.
(b) Let R be a commutative ring, 0 ≠ a, b ∈ R, and d be a gcd of a and b.
Suppose that x ∈ R is invertible.
Now d ∣ a implies that a = dm for some m ∈ R, so a = (dx)(x^{−1}m),
implying that dx ∣ a. Similarly, dx ∣ b, showing that dx is a common divisor
of a and b.
Moreover, if c ∣ a and c ∣ b, then c ∣ d, so d = cn for some n ∈ R, implying
that dx = cnx, and thus c ∣ dx. Hence dx is a gcd of a and b as well.
Applying this to the example above, we saw that 7 is a gcd of 35 and
−49 in Z. Since −1 ∈ Z is invertible (with inverse equal to itself), we see
that −7 is also a gcd of 35 and −49. We have simply generalised this to
other rings which may or may not have many more invertible elements.

(c) Let F be a field and a, b ∈ F. Given any 0 ≠ d ∈ F, a = d(d^{−1}a) and b = d(d^{−1}b), showing that d is a common divisor of a and b. If 0 ≠ c ∈ F, then d = c(c^{−1}d), so c ∣ d. In other words, d is a gcd of a and b. That’s
about as far from unique as you can get.
(d) Let us apply the argument from (b) to the case R = Q[x].
Consider f(x) = (x−1)^2 and g(x) = (x−1)(x+2) ∈ Q[x]. It follows from
the Euclidean algorithm for polynomial rings that h(x) = (x−1) is a greatest
common divisor of f (x) and g(x). (See Question 1 on Assignment 4.)
On the other hand, if u(x) ∈ Q[x] is any invertible element, then
u(x)h(x) is also a gcd for f (x) and g(x). Of course, in Q[x], the only
invertible elements are the non-zero scalar functions u(x) = u0 with u0 ≠ 0,
but as we have seen, if we had a general commutative ring, there could be
many more.

The greatest common divisor d of two non-zero integers a and b can always be
expressed as a Z-linear combination of a and b. This notion carries over to Euclidean
domains as well.

2.6. Theorem. Let E be a Euclidean domain. If a, b ∈ E, then d ∶= gcd(a, b)


exists in E. Moreover, there exist r, s ∈ E such that

d = ra + sb.

Proof. Let K ∶= {xa + yb ∶ x, y ∈ E}. We leave it as an exercise for the reader to


prove that K ⊲ E is an ideal of E. By Theorem 1.11, E is a principal ideal domain,
and thus
K = ⟨d⟩
for some d ∈ E. By definition of K, d = ra + sb for some r, s ∈ E.
There remains to prove that d = gcd(a, b). Note that a = 1a + 0b and b = 0a + 1b ∈
K = ⟨d⟩, implying that there exist w0 , z0 ∈ E such that a = w0 d and b = z0 d. In other
words, d is a common divisor of a and b.
Next, suppose that k ∈ E divides both a and b. Then, by Exercise 2.3, k divides
ra and k divides sb, and thus k divides ra + sb = d. This shows that d is a greatest
common divisor of a and b.

The above proof, while “short and sweet”, has the disadvantage that it does not
tell us how to find a gcd, even when one exists. The process of finding the gcd(a, b)
of two elements a, b of a Euclidean domain E is called the Euclidean algorithm,
and it will be left as a homework assignment question. See also Exercise 6.7 below.
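For the familiar case E = Z, the extended Euclidean algorithm computes d together with the coefficients r and s of Theorem 2.6 at the same time. The following is a standard sketch of our own (the function name is ours; the homework version for a general Euclidean domain is the same loop, with each step supplied by condition (a) of Definition 1.2).

```python
def extended_gcd(a, b):
    """Return (d, r, s) with d = r*a + s*b and d a gcd of a and b in Z."""
    old_r, rem = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while rem != 0:
        q = old_r // rem              # one division-algorithm step
        old_r, rem = rem, old_r - q * rem
        old_s, s = s, old_s - q * s   # keep expressing old_r as r*a + s*b
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t
```

Note that the output need only be *a* gcd: for a = 35 and b = −49 the loop happens to return −7, an associate of 7, in line with Example 2.5 (a).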

2.7. Example. Let E = Q[x], so that E is a Euclidean domain with valuation δ = deg(⋅). Let f(x) = x^2 − 2x + 1 and g(x) = x^2 + x − 2.
Then f(x) = (x−1)^2 and g(x) = (x−1)(x+2), and we recall from Example 2.5 (d) above that d(x) = (x − 1) is a greatest common divisor of f(x) and g(x).
Note that with r(x) = −1/3 and s(x) = 1/3 ∈ Q[x],
r(x)f(x) + s(x)g(x) = (1/3)(−f(x) + g(x))
                    = (1/3)(−(x − 1)(x − 1) + (x − 1)(x + 2))
                    = (1/3)(x − 1)(−(x − 1) + x + 2) = (x − 1) = d(x).
This example was perhaps a little “too easy”. We’ll examine a more interesting
example in the homework assignment questions, once we develop the Euclidean
algorithm.
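The same computation can be carried out mechanically in Q[x] by iterating the division algorithm. The sketch below is ours (not the notes' homework solution): it uses exact rational arithmetic via fractions.Fraction, lists coefficients from the constant term up, and normalises the answer to be monic, since gcds are only determined up to invertible elements.

```python
from fractions import Fraction

def poly_gcd(f, g):
    """A monic gcd of f and g in Q[x], computed by repeated division."""
    def trim(h):                      # drop trailing zero coefficients
        while h and h[-1] == 0:
            h.pop()
        return h

    def poly_mod(a, b):               # remainder of a upon division by b
        a = a[:]
        while len(a) >= len(b):
            c = a[-1] / b[-1]
            shift = len(a) - len(b)
            for i, bc in enumerate(b):
                a[shift + i] -= c * bc
            trim(a)
        return a

    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    while g:                          # the Euclidean algorithm
        f, g = g, poly_mod(f, g)
    return [c / f[-1] for c in f]     # normalise to a monic polynomial
```

For f(x) = x^2 − 2x + 1 and g(x) = x^2 + x − 2 it returns x − 1, as in the example above.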

3. Unique Factorisation Domains


3.1. In the next Chapter we shall focus our attention on factorisation in poly-
nomial rings over a field. But first let us get a deeper sense of how factorisation
works in integral domains. The two main concepts we now introduce are those of
prime elements and irreducible elements. We shall also study a property of certain
rings relating to increasing sequences of ideals which will prove extremely useful –
see Definition 3.15 below.
3.2. Definition. Let D be an integral domain. An element q ∈ D is said to be:
(a) a unit in D if q is invertible. We only introduce this terminology because it
is the standard one, and now the reader will have seen it should the reader
wish to consult other sources. To remove all possibility of confusion between
this notion and the notion of a multiplicative unit (i.e. a multiplicative
identity), we shall simply refer to these elements as invertible in these
notes.
(b) We say that q is a prime element if q ≠ 0, q is not invertible, and whenever
q∣(bc) for some b, c ∈ D, then q∣b or q∣c.
(c) Let 0 ≠ q ∈ D be a non-invertible element. Then q is said to be irreducible
if, whenever we can write
q = uv
for some u, v ∈ D, then either u or v is invertible. Otherwise, we say that q
is reducible. (Invertible elements and 0 are not considered to be reducible,
nor are they considered to be irreducible.)
(d) Finally, we say that two elements a, b ∈ D are associates if there exists an
invertible element u ∈ D such that a = ub.

Note that the relation a ∼ b if a and b are associates is an equivalence relation


on D.

3.3. Examples.
(a) It is not hard to see that the only invertible elements of Z are 1 and −1;
the same is true of Z[x]. Thus the only associates of, say, q(x) = 12x^4 + 3x^2 − 2x + 1 are q(x) itself and −q(x) = −12x^4 − 3x^2 + 2x − 1.
(b) Consider D = Z, the integers.
● The invertible elements of Z are exactly {−1, 1}.
● An element 0 ≠ r ∈ Z is reducible if we can write it as r = uv where
neither u nor v is invertible. Since r ≠ 0, clearly neither u nor v can
be equal to zero. Since they do not belong to {−1, 1} either, we see
that 0 ≠ r ∈ Z is reducible exactly when it is either a composite natural
number, or the negative of a composite natural number.
To be irreducible, therefore, r must either be a prime natural number,
or the negative of a prime natural number.
● We leave it to the reader to verify that 0 ≠ r ∈ Z is “ring” prime (i.e.
prime in the sense of Definition 3.2 (b)) if and only if it is a prime
natural number, or the negative of a prime natural number. Observe,
therefore, that in Z, the notions of prime elements and irreducible
elements coincide. That prime elements are always irreducible is not
unique to Z – see Proposition 3.4 below.
● Finally, two elements a and b of Z are associates if and only if b ∈
{a, −a}.
(c) The invertible elements of Q[x] are the non-zero scalar polynomials. Thus
if q(x) ∈ Q[x], the associates of q(x) are {r0 q(x) ∶ 0 ≠ r0 ∈ Q}. We shall have much more to say about irreducible and prime elements of Q[x] later.

3.4. Proposition. Let D be an integral domain. If p ∈ D is prime, then p is


irreducible.
Proof. Let p ∈ D be prime. We shall argue by contradiction. To that end, suppose
that p is reducible. Then we can find u, v ∈ D, neither of which is invertible, such
that
p = uv.
Since p is prime, either p ∣ u or p ∣ v. By relabelling if necessary (keeping in mind
that D is commutative), we may assume without loss of generality that p ∣ u. Thus
there exists r ∈ D such that u = pr, whence
p = p 1 = uv = prv.
Since D is an integral domain, it satisfies the cancellation law, and so
1 = rv = vr.
But then v is invertible, a contradiction.
Thus p is irreducible.


To prove that the converse is false requires more work. We begin by introducing
the following definition.

3.5. Definition. Let D be an integral domain. A multiplicative norm on D is a function ν ∶ D → N ∪ {0} satisfying:
(i) ν(x) = 0 if and only if x = 0;
(ii) ν(xy) = ν(x)ν(y) for all x, y ∈ D.

3.6. Examples.
(a) Again, we turn to Z for inspiration. Here we may define ν(d) = ∣d∣ for all
d ∈ Z. Clearly this is a multiplicative norm on Z.
(b) Let Z[i] ∶= {a + bi ∶ a, b ∈ Z} denote the Gaussian integers. We may define
a multiplicative norm on Z[i] via
ν(a + bi) ∶= a² + b².
That this is indeed a multiplicative norm is left as an exercise for the reader.
(c) Let D = Z[√−5] ∶= {a + b√−5 ∶ a, b ∈ Z}. Then
ν(a + b√−5) ∶= a² + 5b²
defines a multiplicative norm on D. Again, the verification of this (as well
as the verification of the fact that D is an integral domain) is left to the
reader.
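For the computationally inclined, the defining property ν(xy) = ν(x)ν(y) is easy to test by machine. The little Python sketch below (the function names and the pair representation of a + b√−5 are ours, purely for illustration) verifies multiplicativity of the norm on Z[√−5] over a sample of elements:

```python
def norm_gauss(a, b):
    """ν(a + bi) = a² + b² on the Gaussian integers Z[i]."""
    return a * a + b * b

def norm_zsqrt5(a, b):
    """ν(a + b√−5) = a² + 5b² on Z[√−5]."""
    return a * a + 5 * b * b

def mul_zsqrt5(x, y):
    """(a + b√−5)(c + d√−5) = (ac − 5bd) + (ad + bc)√−5."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)

# Condition (ii): ν(xy) = ν(x)ν(y), checked on a sample of elements.
for x in [(1, 2), (3, -1), (0, 4), (-2, 3)]:
    for y in [(2, 1), (-1, -1), (5, 0)]:
        assert norm_zsqrt5(*mul_zsqrt5(x, y)) == norm_zsqrt5(*x) * norm_zsqrt5(*y)
```

Of course, a finite sample proves nothing – the reader should still do the exercise – but it is a reassuring sanity check.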

3.7. Theorem. Let D be an integral domain and ν ∶ D → N ∪ {0} be a multiplicative norm on D. Then
(a) ν(u) = 1 for every invertible element u ∈ D; and
(b) if {x ∈ D ∶ ν(x) = 1} = {x ∈ D ∶ x is invertible}, then p ∶= ν(y) ∈ N prime
implies that y is irreducible in D.

Proof.
(a) First note that ν(1) = ν(1 ⋅ 1) = ν(1)ν(1) implies that ν(1) ∈ {0, 1}. But
1 ≠ 0 implies that ν(1) ≠ 0, so ν(1) = 1.
Now suppose that u ∈ D is invertible. Then
1 = ν(1) = ν(u ⋅ u⁻¹) = ν(u) ν(u⁻¹).
Since ν(u), ν(u⁻¹) ∈ N, it follows that ν(u) = ν(u⁻¹) = 1.
(b) Suppose next that ν(x) = 1 implies that x is invertible, and let y ∈ D be an
element such that p ∶= ν(y) ∈ N is prime.
If y = uv for some u, v ∈ D, then
p = ν(y) = ν(uv) = ν(u)ν(v),
128 6. EUCLIDEAN DOMAINS

and thus either p = ν(u), in which case ν(v) = 1 and so v is invertible by


hypothesis, or p = ν(v), in which case ν(u) = 1, and so u is invertible by
hypothesis. Either way, we find that y is irreducible in D.

3.8. Example. Although an element of Z is prime if and only if it is irreducible,


and although prime elements of an integral domain are always irreducible, in general
the two concepts are different, as we shall now see.

Let D = Z[√−5], equipped with the multiplicative norm
ν(a + b√−5) = a² + 5b².


Let r ∶= 1 + 2√−5. Then ν(r) = 1 + 5(2²) = 21. We shall prove that r is irreducible,
but not prime in D.

r = 1 + 2√−5 is irreducible:
Suppose that r = uv for some u, v ∈ D. Then 21 = ν(r) = ν(uv) = ν(u)ν(v), and
thus (by relabelling u and v if necessary), we may assume that either ν(u) = 1 and
ν(v) = 21, or that ν(u) = 3 and ν(v) = 7.
Writing u = u1 + u2√−5, we see that ν(u) ∈ {1, 3} implies that u1² + 5u2² ∈ {1, 3},
whence u2 = 0 and u = u1 ∈ {−1, 1}. Thus u is invertible, proving that r is irreducible.

r = 1 + 2√−5 is not prime:
Note that
r(1 − 2√−5) = (1 + 2√−5)(1 − 2√−5) = 1 − 4(−5) = 21 = 3 ⋅ 7.
Clearly r divides the left-hand side of this equation. If it were prime, then it would
have to divide 3 or 7.
Suppose 3 = rx for some x ∈ D. Then
9 = ν(3) = ν(r)ν(x) = 21ν(x),
which is impossible with ν(x) ∈ N. Similarly, if 7 = rx for some x ∈ D, then
49 = ν(7) = ν(r)ν(x) = 21ν(x),

which is impossible with ν(x) ∈ N. Thus r = 1 + 2√−5 is not prime.
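The crux of the argument above – that D contains no element of norm 3 or 7 – can also be confirmed by a brute-force search, as in the hypothetical Python sketch below (any element of norm at most 21 must have ∣a∣ ≤ 4 and ∣b∣ ≤ 2, so a small box of pairs suffices):

```python
def norm(a, b):
    """ν(a + b√−5) = a² + 5b²."""
    return a * a + 5 * b * b

# Every element of norm ≤ 21 has |a| ≤ 4 and |b| ≤ 2, so this box is big enough.
norms = {norm(a, b) for a in range(-4, 5) for b in range(-2, 3)}

assert 3 not in norms and 7 not in norms   # no possible factor of norm 3 or 7
assert 21 in norms                         # ν(1 + 2√−5) = 21 is attained
```

This is exactly the observation used twice in the proof: 9 = 21 ν(x) and 49 = 21 ν(x) have no solutions because nothing in D has the required small norm.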

3.9. Remarks. Let D be an integral domain.


(a) If 0 ≠ a, b ∈ D and ⟨a⟩ = ⟨b⟩, then a and b must be associates.
Indeed, since a ∈ ⟨a⟩ = ⟨b⟩, there exists u ∈ D such that a = bu. Similarly,
since b ∈ ⟨b⟩ ⊆ ⟨a⟩, there exists v ∈ D such that b = av.
Thus a1 = a = bu = avu. Since D is an integral domain, it satisfies
the cancellation law, and so vu = uv = 1, proving that both u and v are
invertible. From this it follows that a and b are associates.

Of course, the contrapositive of this is that if a and b are not associates, then ⟨a⟩ ≠ ⟨b⟩.
(b) This fails if we are dealing with a general (non-unital) ring. Consider

          ⎧ ⎡0 x y⎤               ⎫
    R  =  ⎨ ⎢0 0 z⎥ ∶ x, y, z ∈ R ⎬ ,
          ⎩ ⎣0 0 0⎦               ⎭

              ⎡0 1 0⎤            ⎡0 1 1⎤
  and let a = ⎢0 0 1⎥, and b = ⎢0 0 1⎥. Then a and b cannot be associates,
              ⎣0 0 0⎦            ⎣0 0 0⎦

since R has no identity element, and as such, no invertible elements.
Now a ∈ ⟨a⟩ by definition, and so a2 ∈ ⟨a⟩, since the latter is an ideal of
R. But then b = a + a2 ∈ ⟨a⟩, whence ⟨b⟩ ⊆ ⟨a⟩.
Conversely, b ∈ ⟨b⟩ by definition, and arguing as above, a = b − b2 ∈ ⟨b⟩,
whence ⟨a⟩ ⊆ ⟨b⟩.
Putting these two containments together,

⟨a⟩ = ⟨b⟩

even though a and b are not associates.


(c) If 0 ≠ d ∈ D is reducible and d = ab for non-invertible a, b ∈ D, then ⟨d⟩ ⊊ ⟨a⟩
and ⟨d⟩ ⊊ ⟨b⟩.
We shall prove that ⟨d⟩ ⊊ ⟨a⟩; the other case is similar.
Suppose otherwise. Then a ∈ ⟨d⟩ implies that a = dx for some x ∈ D.
Thus
d = ab = dxb.
Again, since D is an integral domain, the cancellation law implies that
xb = bx = 1. Hence b is invertible, a contradiction.
Thus ⟨d⟩ ⊊ ⟨a⟩.
(d) If p ∈ D is irreducible and q is an associate of p, then q is also irreducible.
Suppose that q = pu for some invertible element u ∈ D, and write q = ab
for some a, b ∈ D. We must prove that a or b is invertible.
Now q = ab implies that pu = ab, or equivalently that p = u⁻¹ab = (u⁻¹a)b. Since p is irreducible, either u⁻¹a is invertible, or b is invertible.

Of course, if b is invertible, we are done. On the other hand, if u⁻¹a is invertible, then a⁻¹ = (u⁻¹a)⁻¹u⁻¹, showing that a is invertible as well.
(It is not hard to show that the invertible elements of a unital ring form
a group.)

When dealing with principal ideal domains, the distinction between irreducibility
and primeness disappears, and we obtain one more equivalent property.

3.10. Theorem. Let D be a pid, and p ∈ D. The following statements are equivalent.
(a) p is prime.
(b) p is irreducible.
(c) ⟨p⟩ is maximal.
Proof.
(a) implies (b): Since every principal ideal domain is an integral domain, this
follows immediately from Proposition 3.4.
(b) implies (a): Suppose now that p ∈ D is irreducible. If a, b ∈ D and p ∣ (ab),
then there exists u ∈ D such that ab = pu = up.
Consider K ∶= ⟨a, p⟩ ⊲ D. Since D is a principal ideal domain, K = ⟨k⟩
for some k ∈ K. Thus a, p ∈ ⟨k⟩, implying that p = kx and a = ky for some
x, y ∈ D. But p is irreducible, so by definition, either k is invertible, or x is
invertible.
● If x is invertible, then k = px−1 , and thus a = ky = px−1 y is divisible by
p.
● If k is invertible, then K = D, and so 1 ∈ K = ⟨a, p⟩, which implies that
there exist r, s ∈ D such that
1 = ra + sp.
Hence
b = rab + spb = rup + sbp = (ru + sb)p,
so that b is divisible by p.
We have shown that if p ∣ (ab), then p ∣ a or p ∣ b. That is, p is prime.
(b) implies (c). Suppose that p is irreducible. Let K ⊲ D, and suppose that
⟨p⟩ ⊆ K ⊆ D. Since D is a pid, K = ⟨k⟩ for some k ∈ D. Thus p ∈ ⟨k⟩, so
k ∣ p; i.e. there exists y ∈ D such that p = ky. Since p is irreducible, either k
or y is invertible. If k is invertible, then K = D, whereas if y is invertible,
then p and k are associates, and so by Remark 3.9, K = ⟨p⟩.
Thus ⟨p⟩ is maximal.
(c) implies (b). Suppose that ⟨p⟩ is maximal. If p were reducible, then we
could write p = ab, where neither a nor b is invertible. By Remark 3.9,
⟨p⟩ ⊊ ⟨a⟩.
By the maximality of ⟨p⟩, it follows that ⟨a⟩ = D, and thus 1 ∈ ⟨a⟩. But
since D is commutative, ⟨a⟩ = {xa ∶ x ∈ D}, and thus there exists b ∈ D
such that ba = 1 = ab, proving that a is invertible, a contradiction of our
hypothesis. Thus p must be irreducible.

A particularly useful application of the above theorem is the following:

3.11. Corollary. Let F be a field. A polynomial p(x) ∈ F[x] is prime if and only if p(x) is irreducible.
Proof. This is an immediate corollary of the above theorem combined with the fact
that if F is a field, then F[x] is a pid (Corollary 1.12).

3.12. Definition. An integral domain D is said to be a unique factorisation domain – denoted ufd – if
(a) every non-invertible element of D other than zero can be factored as a
product of finitely many irreducible elements of D, and
(b) if p1 p2 ⋯pr = q1 q2 ⋯qs are two factorisations of the same (non-zero, non-
invertible) element of D, then r = s and there exists a permutation σ of
{1, 2, . . . , r} such that pj and qσ(j) are associates, 1 ≤ j ≤ r.

3.13. Example. Both of these properties are familiar in Z, where – as we have seen (or by applying Theorem 3.10 above) – an integer is irreducible if and only if it is prime. Thus Z is a ufd.

3.14. Example. Note that every field F is a ufd, since F admits no non-zero, non-invertible elements. This is cheap, but true.
Our next goal is to prove that every pid is a ufd. We shall use the very important concept of Noetherian rings alluded to at the beginning of this section to do this.

3.15. Definition. A ring R is said to be Noetherian – or alternatively, R satisfies the ascending chain condition – if whenever
J1 ⊆ J2 ⊆ J3 ⊆ ⋯ ⊆ Jn ⊆ ⋯
is an increasing sequence of ideals in R, there exists N ≥ 1 such that m ≥ N implies
that Jm = JN .

3.16. Examples.
(a) Let n ∈ N and R = Tn (R). If K ⊲ Tn (R) and K ∶= [ki,j ] ∈ K, then for
each 1 ≤ i ≤ j ≤ n, Li,j = Ei,i KEj,j ∈ K. Thus we see that K is spanned by
the (non-zero) matrix units it contains. Since we only have finitely many
matrix units to choose from, Tn (R) can have at most finitely many ideals.
In particular, any increasing chain of ideals must stabilise and so Tn (R) is
Noetherian.
We shall determine all of the ideals of Tn (R) in the Assignments.
(b) Let n ∈ N and R = Mn (R). Then, as seen in the Assignments, R is simple;
that is, the only ideals of R are {0} and R itself. Clearly any ascending
chain must stabilise.

(c) A more interesting example is R = Z. Since Z is a principal ideal domain,


given any ideal J ⊲ Z, there exists m ∈ Z such that J = ⟨m⟩.
Suppose that
J1 ⊆ J2 ⊆ J3 ⊆ ⋯ ⊆ Jn ⊆ ⋯
is an increasing sequence of ideals in Z, and choose mk ∈ Z such that
Jk = ⟨mk ⟩ for each k ≥ 1. Note that
⟨m1 ⟩ ⊆ ⟨m2 ⟩ ⊆ ⟨m3 ⟩ ⊆ ⋯
implies that mk ∈ ⟨mk+1 ⟩, or equivalently that mk+1 ∣mk for each k ≥ 1.
Since m1 has at most finitely many non-trivial (i.e. non-invertible)
factors, there must exist N ∈ N such that n ≥ N implies that ∣mn ∣ = ∣mN ∣,
and thus
Jn = ⟨mn ⟩ = ⟨mN ⟩ = JN for all n ≥ N.
Thus Z is Noetherian.
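To see the stabilisation concretely: if mk+1 ∣ mk for all k, then each step of a strictly increasing chain ⟨m1⟩ ⊊ ⟨m2⟩ ⊊ ⋯ strips away at least one prime factor of m1, so the chain has length at most one more than the number of prime factors of m1. A toy Python illustration (the chain-building function is ours):

```python
def ascending_chain(m):
    """Build the chain of ideals <m> ⊆ <m/p> ⊆ ... in Z by repeatedly
    stripping the smallest prime factor; it must terminate at <1> = Z."""
    n = abs(m)
    chain = [n]
    while n > 1:
        p = next(p for p in range(2, n + 1) if n % p == 0)  # smallest prime factor of n
        n //= p
        chain.append(n)
    return chain

print(ascending_chain(360))   # [360, 180, 90, 45, 15, 5, 1]
```

Since 360 = 2³ ⋅ 3² ⋅ 5 has six prime factors, the chain above has seven terms and then can grow no further – exactly the phenomenon in the proof.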
(d) Let R = ∏_{n=1}^∞ R ∶= {(xn)_{n=1}^∞ ∶ xn ∈ R for all n ≥ 1}. For each k ≥ 1, set
Jk = {(xn )n ∈ R ∶ xm = 0 if m ≥ k}.
We leave it to the reader to verify that Jk is an ideal of R for each k ≥ 1,
and clearly
J1 ⊆ J2 ⊆ J3 ⊆ ⋯ ⊆ Jn ⊆ ⋯.
That Jk ≠ Jk+1 for all k ≥ 1 is also a (very routine) exercise. Thus R is not
Noetherian.
(e) Let R = C([0, 1], R). For each n ≥ 1, let
Jn ∶= {f ∈ C([0, 1], R) ∶ f(x) = 0 for all x ∈ [0, 1/n]}.
Again, we leave it to the reader to verify that Jn is an ideal of R for each
n ≥ 1, and that
J1 ⊆ J2 ⊆ J3 ⊆ ⋯ ⊆ Jn ⊆ ⋯.
We also leave it to the reader to verify that Jn ≠ Jn+1 for any n ≥ 1.
Thus C([0, 1], R) is not Noetherian.

3.17. Theorem. Every pid is Noetherian.


Proof. Suppose that D is a pid and that
J1 ⊆ J2 ⊆ J3 ⊆ ⋯ ⊆ Jk ⊆ ⋯
is an ascending chain of ideals of D. Let J = ⋃_{k=1}^∞ Jk.
We claim that J is an ideal of D. That J ≠ ∅ is clear, since J1 ≠ ∅ and J1 ⊆ J.
Next, given a, b ∈ J, there exist k1 and k2 ∈ N such that a ∈ Jk1 and b ∈ Jk2 . It follows
that a, b ∈ Jk0 , where k0 = max(k1 , k2 ). Since Jk0 is an ideal of D, a + b ∈ Jk0 ⊆ J.
Thus J is closed under addition.
If a ∈ J and r ∈ D, then a ∈ Jk for some k ≥ 1, and Jk ⊲ D, so ra = ar ∈ Jk ⊆ J.
By the Ideal Test, J is an ideal of D.
Since D is a pid, there exists an element m ∈ D such that J = ⟨m⟩.

Now m ∈ J implies that there exists N ≥ 1 such that m ∈ JN . But then


J = ⟨m⟩ ⊆ JN ⊆ J,
implying that JN = J. If k ≥ N , then J = JN ⊆ Jk ⊆ J, further implying that Jk = J
for all k ≥ N .
By definition, D is Noetherian.

3.18. Corollary. Suppose that D is a pid. If 0 ≠ d ∈ D and d is not invertible, then d is a product of finitely many irreducible elements of D.
Proof. The proof will proceed in two steps. First we shall show that d admits an irreducible divisor. Then we shall prove that it is a product of irreducible divisors.

Step 1. Suppose that every divisor of d is reducible. Since d ∣ d is obvious, we can then write d = a1 b1, where a1, b1 are not invertible. Let J1 ∶= ⟨b1⟩.
Clearly b1 ∣ b1, and thus b1 ∣ d. By hypothesis, b1 is reducible, so we may write b1 = a2 b2 for non-invertible a2, b2 ∈ D. Set J2 = ⟨b2⟩.
In general, given k ≥ 2 and having written d = a1 a2 ⋯ak bk , where each a1 , a2 , . . . , ak
and bk are non-invertible, we note that bk ∣ bk and hence bk ∣ d. By hypothesis, bk is
reducible, and so we may write bk = ak+1 bk+1 for non-invertible ak+1 , bk+1 ∈ D. We
set Jk+1 = ⟨bk+1 ⟩.
By Remark 3.9, Jk = ⟨bk ⟩ ⊊ ⟨bk+1 ⟩ = Jk+1 , k ≥ 1. But then
J1 ⊆ J2 ⊆ J3 ⊆ ⋯ ⊆ Jn ⊆ ⋯
is an increasing sequence of ideals of D which does not stabilise, contradicting the
fact that every pid is Noetherian.
This contradiction proves that d admits an irreducible divisor.

Step 2. To prove that d is a product of irreducible divisors, we shall once again argue by contradiction. Suppose, to the contrary, that d cannot be written as a
finite product of irreducible elements of D. Then d itself is reducible, otherwise it
is a product of (one) irreducible element, namely itself.
From Step 1, there exist an irreducible element q1 and an element y1 ∈ D such
that d = q1 y1 . We claim that y1 is not invertible. Indeed, if y1 were invertible,
then d and q1 would be associates. By Remark 3.9 (d), this would mean that d is
irreducible, contradicting our current hypothesis.
If y1 were irreducible, then d = q1 y1 would be a product of irreducible elements,
contradicting our current hypothesis. Thus, applying Step 1 to y1 , we may write
y1 = q2 y2 , where q2 is irreducible and y2 ∈ D. Note that d = q1 y1 = q1 q2 y2 .
If y2 were irreducible, then d would be a product of irreducible elements, a con-
tradiction. But y2 is not invertible - otherwise y1 and q2 are associates, contradicting
the fact that y1 is reducible.

In general, given k ≥ 2 and q1, q2, . . . , qk ∈ D irreducible, y1, y2, . . . , yk−1 ∈ D reducible and satisfying yj = qj+1 yj+1 for all 1 ≤ j ≤ k − 1, we get
d = q1 q2 ⋯qk yk .
If yk were irreducible, then clearly d would be a product of irreducible elements of
D, a contradiction.
But as always, yk is not invertible, otherwise yk−1 and qk would be associates,
and yk−1 would have been irreducible, a contradiction.
Observe that by Remark 3.9, ⟨yk ⟩ ≠ ⟨yk+1 ⟩ whenever yk and yk+1 are not asso-
ciates. Combining this with the fact that yk+1 ∣ yk implies that
⟨y1⟩ ⊊ ⟨y2⟩ ⊊ ⟨y3⟩ ⊊ ⋯,
contradicting the fact that every pid is Noetherian (Theorem 3.17).
We conclude that d can indeed be written as a finite product of irreducible
elements of D, as claimed.

3.19. Theorem. Every pid is a ufd.


Proof. Let D be a pid, and let 0 ≠ d ∈ D be a non-invertible element. By Corol-
lary 3.18, we can write
d = q1 q2 ⋯qm
for some integer m ≥ 1, where each qj is an irreducible element of D.
Suppose that d = p1 p2 ⋯pn , where n ∈ N and each pj is irreducible in D. Without
loss of generality, we may assume that m ≤ n. (Otherwise, we interchange the
sequence of pj ’s and qi ’s.)

Recall that an element of a pid is irreducible if and only if it is prime. Thus each
qj is prime, 1 ≤ j ≤ m. Since q1 ∣ p1 p2 ⋯pn , there exists k ≥ 1 such that q1 ∣ pk . By
reordering the terms p1 , p2 , . . . , pn (principal ideal domains are commutative, after
all!), we may assume that k = 1. Thus q1 ∣ p1 , and so there exists u1 ∈ D such that
p1 = q1 u1 . But p1 is irreducible, and thus either q1 is invertible, or u1 is. But q1 is
irreducible, so it is non-invertible. Hence u1 is invertible.
That is,
d = p1 p2 ⋯pn = q1 u1 p2 ⋯pn = q1 q2 ⋯qm .
Since every pid is an integral domain, the cancellation law holds, and so
(u1⁻¹ q2) q3 ⋯ qm = p2 p3 ⋯ pn.
Note that u1⁻¹ q2 is irreducible (hence prime), being the associate of an irreducible
element in D.
Since (u1⁻¹ q2) ∣ p2 p3 ⋯ pn, there exists 2 ≤ k ≤ n such that (u1⁻¹ q2) ∣ pk. By
reordering the pj 's if necessary, we may assume that k = 2, i.e. that (u1⁻¹ q2) ∣ p2.
Write p2 = (u1⁻¹ q2) u2 for some u2 ∈ D. Since p2 is irreducible, either (u1⁻¹ q2)
is invertible, or u2 is. But (u1⁻¹ q2) is irreducible, so it is non-invertible. Hence u2 is
invertible.

Thus
(u1⁻¹ q2) q3 ⋯ qm = p2 p3 ⋯ pn = (u1⁻¹ q2) u2 p3 p4 ⋯ pn.
Again, we may apply the cancellation law to conclude that
q3 q4 ⋯ qm = u2 p3 p4 ⋯pn ,
or equivalently, that
(u2⁻¹ q3) q4 ⋯ qm = p3 p4 ⋯ pn.
Proceeding in this manner, we see that after m steps (and potentially m reorderings
of the pj terms), we may assume that p1 = u1 q1 and pj = (uj−1⁻¹ uj) qj, 2 ≤ j ≤ m. At
that stage, we have
q1 q2 ⋯ qm = (u1 q1)(u1⁻¹ u2 q2)(u2⁻¹ u3 q3)⋯(um−1⁻¹ um qm) pm+1 pm+2 ⋯ pn
= um (q1 q2 ⋯ qm) pm+1 pm+2 ⋯ pn.
Applying the cancellation law yet again yields
1 = um pm+1 pm+2 ⋯ pn .
If n > m, then this contradicts the fact that each pj, m < j ≤ n, is irreducible, hence non-invertible. Thus n = m, and each pair (qj, pj) consists of associates, as required.

The next result is familiar - but it’s certainly nice to have a formal proof of this
fact.

3.20. Corollary. Z is a unique factorisation domain.


Proof. As we have already noted, Z is a pid, from which the claim follows.

More importantly for us – and this will certainly be useful in the next Chapter
– we have:
3.21. Corollary. Let F be a field. Then F[x] is a unique factorisation domain.
Proof. By Corollary 1.12, F[x] is a principal ideal domain. The result now follows
from Theorem 3.19.


Supplementary Examples.

S6.1. Example. Most sources list at most three, and often just two, examples
of Euclidean domains. The first two are invariably the ring Z of integers and poly-
nomial rings with coefficients in a field, which we have dutifully mentioned above.
Let us now consider the Gaussian integers, which are typically mentioned third, if at all.
Recall that the Gaussian integers are the set
Z[i] ∶= {a + bi ∶ a, b ∈ Z},
where i² = −1. We think of these as a unital (and necessarily commutative) subring
of C. Note that C is a field, and thus it is an integral domain. If a + bi ≠ 0 ≠ c + di
are non-zero Gaussian integers, then they are also non-zero complex numbers, so
(a + bi)(c + di) = (ac − bd) + (ad + bc)i ≠ 0.
As with Z, we may define a Euclidean valuation on Z[i] ∖ {0} through the map
µ(a + bi) ∶= ∣a + bi∣² = a² + b².
That µ satisfies condition (b) of a Euclidean valuation is left as an exercise for
the reader to verify. Let us verify condition (a).
Let a = w + xi and b = y + zi ∈ Z[i]. We must find q, r ∈ Z[i] such that a = qb + r,
where either r = 0 or µ(r) < µ(b). We consider two cases.
● Suppose first that b = k for some 0 < k ∈ N. It is not hard to see that we
can write
w = u1 k + v1 ,   x = u2 k + v2 ,
where max(∣v1∣, ∣v2∣) ≤ k/2. Thus
a = w + xi = (u1 k + v1 ) + (u2 k + v2 )i = (u1 + u2 i)k + (v1 + v2 i) = qk + r,
where q = (u1 + u2 i) and r = (v1 + v2 i). Note that if r ≠ 0, then
µ(r) = µ(v1 + v2 i) = v1² + v2² ≤ (k/2)² + (k/2)² < k² = µ(k) = µ(b).
● Now suppose that b = y + zi ≠ 0. Let k = b̄b = (y − zi)(y + zi), and observe
that 0 < y² + z² = k ∈ N. Let us now apply the above argument to ab̄ and k
to see that we may write
ab̄ = qk + r0 ,
where r0 = 0 or µ(r0) < µ(k).
We pause to observe that for all s + ti ∈ Z[i], µ(s + ti) = s² + t² = ∣s + ti∣²,
the square of the absolute value of the complex number s + ti, and thus
µ((m + ni)(s + ti)) = ∣(m + ni)(s + ti)∣²
= ∣m + ni∣² ∣s + ti∣²
= µ(m + ni) µ(s + ti).
SUPPLEMENTARY EXAMPLES. 137

Thanks to this observation, we see that
µ(a − qb) µ(b̄) = µ(ab̄ − q b̄b)
= µ(ab̄ − qk)
= µ(r0)
< µ(k) = µ(b̄b)
= µ(b̄) µ(b).
Since µ(b̄) = µ(b) = y² + z² ≠ 0 (recall that b ≠ 0), we may divide both sides
of this inequality by the positive integer µ(b̄) to obtain
µ(a − qb) < µ(b).
Let r = a − qb. Then µ(r) < µ(b) and
a = qb + r,
as required to prove that µ is a valuation on Z[i], and completing the proof
that Z[i] is a Euclidean domain.
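The two bullet points above amount to a simple recipe: round the exact complex quotient a/b coordinate-wise to the nearest Gaussian integer. Here is a minimal Python sketch of that recipe (function names ours; Gaussian integers are modelled as Python complex numbers with integer real and imaginary parts):

```python
def gauss_divmod(a, b):
    """Divide a by b in Z[i], returning q, r with a = q*b + r.
    q is the nearest Gaussian integer to the exact quotient a/b."""
    w = a * b.conjugate()                 # a·conj(b); dividing by µ(b) gives a/b
    n = (b * b.conjugate()).real          # µ(b) = y² + z², a positive integer
    q = complex(round(w.real / n), round(w.imag / n))
    return q, a - q * b

def mu(z):
    """µ(s + ti) = s² + t², the square of the complex absolute value."""
    return int(z.real) ** 2 + int(z.imag) ** 2

a, b = complex(27, 23), complex(8, 1)
q, r = gauss_divmod(a, b)
assert a == q * b + r
assert r == 0 or mu(r) < mu(b)            # remainder strictly smaller in norm
```

Rounding each coordinate moves a/b by at most 1/2 in each direction, so ∣a/b − q∣² ≤ 1/2 < 1, which is precisely why µ(r) < µ(b).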

S6.2. Example. In Exercise 6.6, you are asked to prove that if E is a Euclidean
domain with Euclidean valuation µ, then
(a) µ(1) = min{µ(d) ∶ 0 ≠ d ∈ E}, and
(b) x ∈ E is invertible if and only if µ(x) = µ(1).
Note that if F is a field and δ = deg(⋅) is the usual valuation on F[x], then
δ(1) = deg(1) = 0
is the minimum possible degree for any non-zero element of F[x]. It is achieved at
all p(x) = p0 ≠ 0.

In Z with valuation µ(n) = ∣n∣, we see that µ(1) = 1 is the minimum possible
value of µ (why?), achieved at n = 1 and n = −1.

S6.3. Example. Consider the Euclidean domain E ∶= Q[x], equipped with the
valuation δ(q(x)) ∶= deg(q(x)), q(x) ∈ Q[x].
Let’s see how to use the division algorithm to find the greatest common divisor
of f (x) = x4 + x3 − x − 1 and g(x) = x3 + 1.
By definition, we may find q⁽¹⁾(x), r⁽¹⁾(x) ∈ E such that
f(x) = q⁽¹⁾(x) g(x) + r⁽¹⁾(x),
and either r⁽¹⁾(x) = 0 or deg(r⁽¹⁾(x)) < deg(g(x)). This is precisely the division
algorithm: the computation

                          x + 1
x³ + 1 ) x⁴ + x³ + 0x² −  x − 1
         x⁴            +  x
              x³ + 0x² − 2x − 1
              x³            + 1
                       −2x − 2

shows that f(x) = (x + 1)(x³ + 1) + (−2x − 2), so we set q⁽¹⁾(x) = x + 1 and r⁽¹⁾(x) = −2x − 2. Anything that divides both f(x) and g(x) must divide f(x) − q⁽¹⁾(x)g(x) = r⁽¹⁾(x), so we look for common divisors of g(x) and r⁽¹⁾(x).
The advantage of doing this is that we have reduced the problem to the case
where the maximum degree of the two polynomials whose greatest common divisor
we are hunting has been reduced from deg(f (x)) = 4 to deg(g(x)) = 3. If we continue
doing this process, it must eventually stop, since we cannot have a degree that is
negative!
Consider

                    −(1/2)x² + (1/2)x − (1/2)
−2x − 2 ) x³ + 0x² + 0x + 1
          x³ +  x²
              −x² + 0x + 1
              −x² −  x
                     x + 1
                     x + 1
                         0

Thus g(x) = x³ + 1 = (−(1/2)x² + (1/2)x − (1/2)) r⁽¹⁾(x), and so the greatest common divisor
of g(x) = x³ + 1 and r⁽¹⁾(x) = −2x − 2 must be r⁽¹⁾(x) = −2x − 2.
But then – as we argued above – this must also divide f(x), and so r⁽¹⁾(x) =
gcd(f(x), g(x)). Note that r⁽¹⁾(x) and h(x) ∶= x + 1 are associates, since −1/2 ∈ Q[x]
is invertible. So we could just as well say that

h(x) = x + 1 = gcd(f (x), g(x)).

Indeed,

f(x) = x⁴ + x³ − x − 1 = (x³ − 1)(x + 1) = (x − 1)(x² + x + 1)(x + 1),

while
g(x) = x³ + 1 = (x + 1)(x² − x + 1).
We leave it to the reader to verify that (x² + x + 1) and (x² − x + 1) have no common factors.
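The long divisions in this example can be replayed with exact rational arithmetic. In the Python sketch below (our own throwaway representation: a polynomial is the list of its coefficients, highest degree first), the Euclidean algorithm recovers the monic greatest common divisor x + 1 of f(x) and g(x):

```python
from fractions import Fraction as F

def polydivmod(a, b):
    """Division with remainder in Q[x]; polynomials are coefficient
    lists, highest degree first."""
    a, b = [F(c) for c in a], [F(c) for c in b]
    q = []
    while len(a) >= len(b):
        c = a[0] / b[0]                   # next quotient coefficient
        q.append(c)
        pad = b + [F(0)] * (len(a) - len(b))
        a = [ai - c * bi for ai, bi in zip(a, pad)][1:]   # leading term cancels
    return q, a

def polygcd(a, b):
    """Euclidean algorithm in Q[x]; returns the monic gcd."""
    a, b = [F(c) for c in a], [F(c) for c in b]
    while any(b):
        _, r = polydivmod(a, b)
        while r and r[0] == 0:            # strip leading zeros of the remainder
            r = r[1:]
        a, b = b, r
    return [c / a[0] for c in a]          # normalise: make the gcd monic

f = [1, 1, 0, -1, -1]                     # f(x) = x⁴ + x³ − x − 1
g = [1, 0, 0, 1]                          # g(x) = x³ + 1
q1, r1 = polydivmod(f, g)
assert q1 == [1, 1] and r1 == [0, -2, -2] # f = (x + 1)g + (−2x − 2)
assert polygcd(f, g) == [1, 1]            # gcd(f, g) = x + 1
```

The two assertions are exactly the two long divisions carried out above; the algorithm then stops because the second remainder is zero.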

S6.4. Example. It is interesting to look for the least common multiple lcm(f(x), g(x)), where f(x), g(x) ∈ Q[x] are the polynomials from Example S6.3 above.
The definition of the least common multiple is drawn from the definition used
for positive integers. We shall say that h(x) = lcm(f (x), g(x)) if
● f (x) ∣ h(x) and g(x) ∣ h(x), so that h(x) is a multiple of both f (x) and
g(x), and
● if k(x) ∈ Q[x] is any multiple of both f (x) and g(x), then h(x) ∣ k(x).
The decomposition
f(x) = (x − 1)(x² + x + 1)(x + 1)
g(x) = (x + 1)(x² − x + 1)
obtained there shows that (x + 1), (x − 1), (x² + x + 1) and (x² − x + 1) must all divide
the least common multiple. Thus
h(x) ∶= lcm(f(x), g(x)) = (x + 1)(x − 1)(x² + x + 1)(x² − x + 1) = x⁶ − 1.
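As a sanity check, the product of these four factors really is x⁶ − 1; the short Python sketch below (coefficient lists, highest degree first – a representation of our own choosing) multiplies them out:

```python
def polymul(a, b):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# (x + 1), (x − 1), (x² + x + 1), (x² − x + 1)
factors = [[1, 1], [1, -1], [1, 1, 1], [1, -1, 1]]
h = [1]
for p in factors:
    h = polymul(h, p)
assert h == [1, 0, 0, 0, 0, 0, -1]   # x⁶ − 1, as claimed
```
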

S6.5. Example. We know that Z is a pid, and that ⟨17⟩ is a prime ideal of Z.
Then
Z/⟨17⟩ ≃ Z17
is a field, as we have seen earlier.

S6.6. Example. Consider the ring R[x] of polynomials with coefficients in R.


Since R is a field, we see that R[x] is a pid, by Corollary 1.12.
By Theorem 3.17, R[x] is Noetherian.
For each n ∈ N, set Kn ∶= ⟨xⁿ⟩ = {xⁿ q(x) ∶ q(x) ∈ R[x]}. It is not hard to verify that
⋯ ⊊ K3 ⊊ K2 ⊊ K1 .
In other words, we can find a decreasing sequence of distinct ideals of R[x].
A ring R is said to be Artinian – or alternatively, R satisfies the descending
chain condition – if whenever
J1 ⊇ J2 ⊇ J3 ⊇ ⋯ ⊇ Jn ⊇ ⋯
is a decreasing sequence of ideals in R, there exists N ≥ 1 such that m ≥ N implies
that Jm = JN .
We conclude that R[x] is Noetherian, but not Artinian.

S6.7. Example. Suppose that R is a ring and that K ⊲ R is an ideal. We leave


it as an Assignment Exercise for the reader to prove that the ideals of R/K are of
the form J/K, where J ⊲ R is an ideal which contains K.
From this it readily follows that if R is Noetherian, then so is R/K. (Check!)

S6.8. Example. Suppose that R and S are rings and that R is Noetherian. If
φ ∶ R → S is a homomorphism, then φ(R) is Noetherian as well.
Indeed, by the First Isomorphism Theorem,
φ(R) ≃ R/ker φ,
and we saw in the previous example that R/K is Noetherian for any ideal K of R.
S6.9. Example. Let D be a pid, and let K = ⟨b⟩ ⊲ D be an ideal of D. Then
D/K is a pid as well.
Again, we remark that any ideal of D/K is of the form J/K, where J ⊲ D is an
ideal which contains K. Since J = ⟨d⟩ for some d ∈ D, we see that J/K = ⟨d + K⟩ ⊲
D/K.
S6.10. Example. For those of you with a good background in Complex Anal-
ysis, you may be interested to learn that the ring
H(D) ∶= {f ∶ D → C ∣ f is holomorphic}
is not Artinian, since the ideals Kn ∶= {f ∈ H(D) ∶ f has a zero of order at least n at 0}, n ≥ 1, are descending and all distinct.
APPENDIX 141

Appendix

A6.1. Although we won't prove it, it can be shown that if D is a ufd, then so is D[x]. In particular, Z[x] is a ufd. Note that this does not follow from Theorem 3.19, since Z[x] is not a pid: for example, the ideal ⟨2, x⟩ of Z[x] is not principal.

A6.2. As we have already alluded to, in a Euclidean domain we may apply a version of the division algorithm finitely often to find the greatest common divisor of two elements. In this context, it is referred to as the Euclidean algorithm. Its proof will appear as an Assignment question.

A6.3. Recall that we defined a multiplicative norm ν on an integral domain D to be a function ν ∶ D → N ∪ {0} satisfying
(i) ν(x) = 0 if and only if x = 0; and
(ii) ν(xy) = ν(x)ν(y) for all x, y ∈ D.
Let’s find out what these might look like when D = Z, and use that inspiration
to figure out what’s happening in more general pid’s.

In the case where D = Z and ν ∶ Z → N ∪ {0} is a multiplicative norm, by definition we know that ν(0) = 0. Furthermore, by Theorem 3.7, we also know that
ν(1) = ν(−1) = 1, since 1 and −1 are invertible in Z.
Now, for each 2 ≤ n ∈ Z, we can factor n as a product n = p1 p2 ⋯pk of prime
numbers (each positive). By definition of a multiplicative norm (and with the aid of a routine induction argument), we find that
ν(n) = ν(p1 )ν(p2 )⋯ν(pk ).
In other words, if we know the value of ν(p) for each prime number p ∈ N, then we
now know the value of ν(n) for each n ≥ 2. We also know that
ν(−n) = ν(−1)ν(n) = 1 ⋅ ν(n),
which means that we know the value of ν(m) for all m ∶= (−n) ≤ −2. Thus knowing
the value of ν(p) for every prime is equivalent to understanding the value of ν(n)
for all n ∈ Z.
What restrictions do we have on the possible values of ν(p), p prime? None,
other than ν(p) ≥ 1 (since p ≠ 0 implies that ν(p) ≠ 0). Thus for each p ≥ 2 prime,
we can choose mp ∈ N, set ν(p) ∶= mp , and set
ν(n) = mp1 mp2 ⋯ mpk
whenever n = p1 p2 ⋯pk . What makes this well-defined is that Z is a pid, which allows
us to
● write every non-zero, non-invertible n ∈ Z as a product of prime elements
of Z (i.e. ∣p∣ ∈ N is a prime number in the usual sense),
● observe that prime elements and irreducible elements coincide, and

● use the fact that every pid is a ufd to note that the decomposition of
non-zero, non-invertible elements of Z into products of primes is essen-
tially unique (up to the order of the terms and replacing primes by their
associates). Since ν(a) = ν(b) when a and b are associates, and since
ν(a)ν(b) = ν(b)ν(a) because multiplication in N ∪ {0} is commutative, this
means that ν(n) is well-defined by the above formula.
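The recipe just described – choose arbitrary values on the primes and extend multiplicatively – can be implemented directly. In the Python sketch below (names ours), make_norm converts any assignment f of positive integers to primes into the corresponding multiplicative norm on Z:

```python
def make_norm(f):
    """Extend an assignment f (primes -> positive integers) to the
    multiplicative norm ν on Z that it determines."""
    def nu(n):
        if n == 0:
            return 0                      # ν(0) = 0
        n, result = abs(n), 1             # ν(−n) = ν(−1)ν(n) = ν(n)
        p = 2
        while n > 1:
            while n % p == 0:             # one factor of f(p) per factor of p
                result *= f(p)
                n //= p
            p += 1
        return result
    return nu

nu = make_norm(lambda p: p * p)           # e.g. choose ν(p) := p² on every prime
assert nu(12) == nu(3) * nu(4) == 144     # multiplicativity in action
assert nu(-6) == nu(6) == 36 and nu(1) == 1
```

The well-definedness argument in the bullet points above is exactly what guarantees that nu does not depend on the order in which the prime factors are peeled off.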
This allows us to show that there exist uncountably many multiplicative
norms on Z; indeed, if we set P ∶= {2, 3, 5, 7, 11, . . .} to denote the set of prime natural
numbers, then each function f ∶ P → N determines a multiplicative norm
νf (p) ∶= f (p) for all p ∈ P,
and the correspondence is bijective. How many such functions are there? Precisely
∣N∣^∣P∣, where ∣X∣ denotes the cardinality of a set X. In our case,
∣N∣^∣P∣ = (ℵ₀)^ℵ₀ = 2^ℵ₀ = c,
the cardinality of the continuum. Of course, c = ∣R∣, so there are as many
multiplicative norms on Z as there are points in R, in the sense that there exists a
bijection between these two sets. Finding that bijection is another matter, however.
(We point out that these cardinality arguments are just culture. Don’t worry if
you haven’t seen them before.)
A6.4. As mentioned above, we can extend this analysis to more general pid’s.
If D is a pid, then D is a ufd. To define a multiplicative norm ν on D, it suffices to
choose a positive integer mp for every irreducible (equivalently every prime) element
of D, and then to set
● ν(0) = 0;
● ν(u) = 1 whenever u ∈ D is invertible, and
● ν(p) ∶= mp .
The value of ν(d) for an arbitrary non-zero, non-invertible element of D is then
entirely determined by its essentially unique factorisation as a product of primes.
The moral of the story is that pids are good.
A6.5. Don’t look at these notes that way. You’ve seen this phenomenon before.
Recall that if V and W are vector spaces over R, and if B ∶= {bλ }λ∈Λ is a basis for
V, then to define a linear map T ∶ V → W it suffices to choose any wλ ∈ W, λ ∈ Λ
and to set each T bλ ∶= wλ .
If v ∈ V is arbitrary, then there exist m ≥ 1, real numbers r1 , r2 , . . . , rm and
indices λ1 , λ2 , . . . , λm such that
v = r1 bλ1 + r2 bλ2 + ⋯ + rm bλm .
If T is to be linear, then we must have
T v = T (r1 bλ1 + r2 bλ2 + ⋯ + rm bλm )
= r1 T bλ1 + r2 T bλ2 + ⋯ + rm T bλm
= r1 wλ1 + r2 wλ2 + ⋯ + rm wλm .

(We leave it to the reader to check that this is well-defined.)


In other words – in vector space theory, it suffices to know what a linear map does to a basis to completely understand that linear map. In the setting of a pid, it suffices to know what a multiplicative norm does to the prime elements (i.e. the irreducible elements) of that pid to completely understand the multiplicative norm.

It is not that these situations are identical: but there are definite parallels, and
understanding one allows us to better understand the other. This is what makes
mathematics interesting – not just the attention that we get from rich and attractive
people at political pyjama parties.
A6.6. In an earlier version of this text, I put out a plea for more examples
of Euclidean domains. After consulting a number of elementary abstract algebra
textbooks, I was struck by the fact that virtually every such textbook always referred
to the three examples we mention in Section 6.1, and to no other examples.

Mr. Nic Banks answered my plea for more examples. A discrete valuation
ring (a.k.a. dvr) D may be defined as a pid with precisely one non-zero prime
ideal. By Exercise 6.11 below, this is the same as asking that it has precisely one
non-zero maximal ideal. (The expression local pid is also used.) It can be shown
that every dvr is a Euclidean domain. As explained to me by Mr. Banks, the
motivating examples for dvr’s come from number theory and algebraic geometry.
In particular, “dvr’s have very simple ideal factorisations ... and ideal factorisations
are essential in problems like Fermat’s Last Theorem”.

A concrete example of a dvr is the set of p-adic integers, where p ∈ N is a prime number. Unfortunately, this is a bit beyond the scope of these notes, but not much.
We include the definition for those willing to challenge themselves.

Definition. An inverse system of rings is a sequence (Rn)n together with a sequence (φn)n of homomorphisms
⋯ Ð→ Rn+1 Ð→ Rn Ð→ ⋯ Ð→ R2 Ð→ R1 ,
where the map Rn+1 Ð→ Rn is φn.
The inverse limit
R ∶= lim←Ð Rn
is the subset of the direct product ∏n Rn ∶= {(rn )n ∶ rn ∈ Rn for all n ≥ 1} for which
φn (rn+1 ) = rn for all n ≥ 1. For each m ≥ 1, this gives rise to a projection map
πm ∶ R → Rm such that πm ((rn )n ) = rm .
(For those who are interested in such things – know that inverse limits may be
defined in any category.)

If we fix a prime number p ∈ N, the ring of p-adic integers Zp is the inverse


limit
Zp = lim←Ð Z/pn Z
144 6. EUCLIDEAN DOMAINS

of the inverse system of rings (Z/pn Z) using the homomorphisms


φn ∶ Z/pn+1 Z → Z/pn Z
z + ⟨pn+1 ⟩ ↦ z + ⟨pn ⟩.
(Note: Here is another example of the lack of imagination of certain mathematicians
insofar as choosing good notation is concerned: this version of Zp – i..e. the inverse
limit – has nothing to do with the field Zp ∶= Z/⟨p⟩. It is a mystery as deep
and almost as important as the mystery behind how they get the caramel into the
Caramilk bar to determine why the same notation would have been chosen for both.)

The identity of this ring is e ∶= (1 + ⟨p⟩, 1 + ⟨p2 ⟩, 1 + ⟨p3 ⟩, . . .), and the map
τ∶ Z → lim←Ð Z/pn Z
z ↦ (z + ⟨p⟩, z + ⟨p2 ⟩, z + ⟨p3 ⟩, . . .)
is an injective ring homomorphism. Thus Zp has characteristic 0, and contains (an
isomorphic copy of) Z as a subring.
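To see the compatibility condition behind the inverse limit in action, one can model τ(z) by finitely many of its truncations z mod p^n. The following sketch is ours (not part of the text), and of course only checks finitely many coordinates of an element of Zp.

```python
# A finite model of the p-adic setup above (our illustration): the image
# tau(z) of an integer z in Z_p is determined by its truncations z mod p^n,
# and these satisfy the compatibility condition phi_n(r_{n+1}) = r_n that
# defines the inverse limit.

def truncations(z, p, N):
    """Return (z mod p, z mod p^2, ..., z mod p^N)."""
    return tuple(z % p ** n for n in range(1, N + 1))

def is_compatible(seq, p):
    """Check phi_n(r_{n+1}) = r_n, i.e. reducing the (n+1)-st entry
    modulo p^n recovers the n-th entry."""
    return all(seq[n + 1] % p ** (n + 1) == seq[n]
               for n in range(len(seq) - 1))
```

For example, `truncations(10, 3, 5)` gives `(1, 1, 10, 10, 10)`, a compatible sequence, while `(1, 2, 5)` fails the compatibility test; and distinct integers already have distinct truncation sequences once N is large enough, which is the injectivity of τ.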

Aren’t you glad I asked?

A6.7. Let us recall our definition of a valuation on an integral domain D: a


valuation on D is a map
µ ∶ D ∖ {0} → N0 ∶= N ∪ {0} = {0, 1, 2, 3, . . .}
satisfying
(a) for all a, b ∈ D with b ≠ 0, there exist q, r ∈ D such that
a = qb + r,
where either r = 0 or µ(r) < µ(b); and
(b) for all a, b ∈ D ∖ {0}, µ(a) ≤ µ(ab).
A Euclidean domain is an integral domain equipped with a valuation µ.

A6.8. Suppose that we are given a map


ν ∶ D ∖ {0} → {0, 1, 2, 3, . . .}
satisfying only the first condition (a), that is: for all a, b ∈ D with b ≠ 0, there exist
q, r ∈ D such that
a = qb + r,
where either r = 0 or ν(r) < ν(b).
For 0 ≠ x ∈ D, define µ(x) ∶= min{ν(xy) ∶ 0 ≠ y ∈ D}. (Note that µ(x) exists
because {0, 1, 2, . . .} is well-ordered.)
We claim that µ is a valuation on D.
● Clearly µ(a) ∈ {0, 1, 2, . . .} for all 0 ≠ a ∈ D.
APPENDIX 145

● If 0 ≠ a, b ∈ D, then
µ(a) ∶= min{ν(ay) ∶ 0 ≠ y ∈ D}
≤ min{ν(aby) ∶ 0 ≠ y ∈ D}
= µ(ab).
● Now let a, b ∈ D with b ≠ 0. Choose 0 ≠ y0 ∈ D such that ν(by0 ) = min{ν(by) ∶
0 ≠ y ∈ D} = µ(b). Since b, y0 are both non-zero and D is an integral domain,
by0 ≠ 0.
Now, choose q, r ∈ D such that
a = q(by0 ) + r,
where either r = 0 or ν(r) < ν(by0 ).
Suppose that r ≠ 0. Then ν(r) < ν(by0 ) = µ(b). Moreover,
µ(r) ∶= min{ν(ry) ∶ 0 ≠ y ∈ D} ≤ ν(r ⋅ 1) = ν(r),
and so
µ(r) ≤ ν(r) < µ(b).
Thus µ satisfies both conditions from A6.7, and so µ is a valuation on D,
as claimed.
Thus if ν satisfies only condition (a) of a valuation, then it induces a valuation
µ on D by setting µ(x) = min{ν(xy) ∶ 0 ≠ y ∈ D} for each 0 ≠ x ∈ D. Fascinating.

Exercises for Chapter 6

Exercise 6.1.
Divide f (x) = 3x4 + 2x3 + x + 2 by g(x) = x2 + 4 in Z5 [x].

Exercise 6.2.
Prove that p(x) = x2 + 2x + 1 does not lie in the ideal ⟨x3 + 1⟩ of Z3 [x].

Exercise 6.3.
Let D be a Euclidean domain, a, b, c ∈ D ∖ {0}, and suppose that d is a gcd of
a and b. Show that if a ∣ (bc), then ad−1 ∣ c. (Here ad−1 denotes the unique element
e ∈ D satisfying a = ed; it exists because d ∣ a.)

Exercise 6.4.
(a) Prove Fermat’s Theorem on sums of squares: Let p ∈ N be an odd
prime. Then p = a2 + b2 for some a, b ∈ N if and only if p ≡ 1 mod 4.
(b) Conclude that if p ∈ N is prime and p ≡ 3 mod 4, then p is a prime element
in the ring Z[i] of Gaussian integers.

Exercise 6.5.
Verify that the map µ ∶ Z[i] ∖ {0} → N defined by µ(a + bi) = a2 + b2 satisfies
µ((a + bi) (c + di)) = µ(a + bi) µ(c + di),
which was left as an exercise when establishing the fact that µ is a Euclidean valu-
ation on the set Z[i] of Gaussian integers.

Exercise 6.6.
Let E be a Euclidean domain with Euclidean valuation µ. Prove that
(a) µ(1) = min{µ(d) ∶ 0 ≠ d ∈ E}, and
(b) x ∈ E is invertible if and only if µ(x) = µ(1).

Exercise 6.7. The Euclidean algorithm


Let E be a Euclidean domain with Euclidean valuation µ. Let 0 ≠ a, b ∈ E.
Choose q1 , r1 ∈ E such that
a = q1 b + r1 ,
where either r1 = 0 or 0 ≤ µ(r1 ) < µ(b).
(a) Prove that if r1 ≠ 0, then the set of common divisors of a and b is the same
as the set of common divisors of b and r1 .

Let k ≥ 1, and suppose that we have chosen r1 , r2 , . . . , rk as above.


If rk = 0, we stop.
If rk ≠ 0, choose qk+1 , rk+1 ∈ E such that
rk−1 = qk+1 rk + rk+1 (where we set r0 ∶= b),
where either rk+1 = 0 or 0 ≤ µ(rk+1 ) < µ(rk ).
(b) Prove that either r1 = 0, or there exists m ≥ 2 such that rm = 0 ≠ rm−1 .
(c) Prove that either
● r1 = 0, in which case b is a greatest common divisor of a and b, or that
● rm−1 is a greatest common divisor of a and b, where m is chosen as in
part (b).
This process to find the greatest common divisor of two elements of a Euclidean
domain is called the Euclidean algorithm.
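To see the Euclidean algorithm run somewhere other than Z or F[x], here is a sketch (our code, with sample inputs of our own choosing) in the Gaussian integers Z[i], using the valuation µ(a + bi) = a2 + b2 of Exercise 6.5. Rounding the exact quotient to the nearest Gaussian integer guarantees that the remainder has strictly smaller valuation.

```python
from fractions import Fraction

def norm(z):
    """mu(a+bi) = a^2 + b^2 for the Gaussian integer z = (a, b)."""
    a, b = z
    return a * a + b * b

def divmod_gauss(x, y):
    """Division with remainder in Z[i]: x = q*y + r with norm(r) < norm(y)."""
    (a, b), (c, d) = x, y
    n = norm(y)
    # Exact quotient x/y = x * conj(y) / norm(y), rounded coordinatewise to
    # the nearest integer; each coordinate is then off by at most 1/2, so
    # norm(r) <= (1/4 + 1/4) * norm(y) < norm(y).
    q = (round(Fraction(a * c + b * d, n)), round(Fraction(b * c - a * d, n)))
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r

def gcd_gauss(x, y):
    """Euclidean algorithm in Z[i]; the result is a gcd, unique up to units."""
    while y != (0, 0):
        _, r = divmod_gauss(x, y)
        x, y = y, r
    return x
```

For instance, since 5 = (2 + i)(2 − i) and 3 + i = (1 + i)(2 − i), the gcd of 5 and 3 + i is an associate of 2 − i, of valuation 5; the algorithm above finds it in two steps.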

Exercise 6.8.
Let E be a Euclidean domain with Euclidean valuation µ. Prove that if 0 ≠ x ∈ E
and x is not invertible, then for any 0 ≠ a ∈ E,
µ(a) < µ(ax).

Exercise 6.9.
Let
f (x) = x10 − 3x9 + 3x8 − 11x7 − 11x5 + 19x4 − 13x3 + 8x2 − 9x + 3
and
g(x) = x6 − 3x5 + 3x4 − 9x3 + 5x2 − 5x + 2

be elements of E ∶= Q[x]. Find the greatest common divisor d(x) of f (x) and g(x).

Exercise 6.10.
Let E be a Euclidean domain with Euclidean valuation µ. In Example 1.3, we
saw that if ν(d) ∶= 2^µ(d) , d ∈ E, then ν is again a Euclidean valuation on E.
Observe that ν = f ○ µ, where f (x) = 2^x , x ≥ 0.
Do there exist other functions g ∶ [0, ∞) → [0, ∞) such that φg ∶= g ○ µ is a
Euclidean valuation on E whenever µ is? Can we characterise such functions g?
(We do not yet have an answer to this ourselves, although it is entirely possible that
such an answer is known.)

Exercise 6.11.
Let D be a pid. Prove that the following conditions are equivalent.
(a) D admits a unique non-zero prime ideal.
(b) D admits a unique non-zero maximal ideal.
CHAPTER 7

Factorisation in polynomial rings

A computer once beat me at chess, but it was no match for me at


kick boxing.
Emo Philips

1. Divisibility in polynomial rings over a field


From our work in the previous Chapter, we know that if F is a field, then F[x] is
a unique factorisation domain. As such, given any non-zero, non-invertible element
0 ≠ f (x) ∈ F[x], we can factor f (x) as a product of irreducible elements in an
essentially unique way, meaning up to the order of the terms and the replacement
of some factors by their associates.
In this Chapter we further investigate what it means for an element of F[x]
to be irreducible. Along the way, we shall more closely investigate the notion of
factorisation over more general integral domains, specifically in the setting of Z[x] ⊆
Q[x].

1.1. Definition. Let F be a field and 0 ≠ f (x) ∈ F[x]. We say that α ∈ F is a


zero or a root of f (x) if f (α) = 0.
We define the multiplicity of the root α to be
mult(α) ∶= max{k ∈ N ∶ (x − α)k ∣ f (x) but (x − α)k+1 ∤ f (x)}.

1.2. Example. In R[x], α = 3 is a root of multiplicity 2 for the polynomial


f (x) = x3 − 7x2 + 15x − 9. (Check!)
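One way to do the check is to divide by (x − 3) repeatedly and count how many divisions are exact. The sketch below is our code, not the text's; polynomials are coefficient lists, highest degree first.

```python
# Verify Example 1.2: alpha = 3 is a root of multiplicity 2 of
# f(x) = x^3 - 7x^2 + 15x - 9.

def divide_linear(coeffs, alpha):
    """Synthetic division by (x - alpha). Returns (quotient, remainder);
    the remainder equals the value of the polynomial at alpha."""
    q = []
    acc = 0
    for c in coeffs:
        acc = acc * alpha + c
        q.append(acc)
    return q[:-1], q[-1]

def multiplicity(coeffs, alpha):
    """Largest k such that (x - alpha)^k divides the polynomial."""
    k = 0
    while True:
        quot, rem = divide_linear(coeffs, alpha)
        if rem != 0:
            return k
        coeffs, k = quot, k + 1
```

Here `multiplicity([1, -7, 15, -9], 3)` returns 2, matching the example; indeed f (x) = (x − 3)^2 (x − 1).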

1.3. Proposition. Let F be a field, 0 ≠ f (x) ∈ F[x] and α ∈ F. Then f (α) = r0 ,


where r(x) = r0 is the remainder of “f (x)/(x − α)”. More precisely, there exists
q(x) ∈ F[x] such that f (x) = q(x)(x − α) + f (α).
Proof. Using the Division Algorithm, we may write f (x) = q(x)(x − α) + r(x),
where either 0 ≤ deg r(x) < deg(x − α) = 1, or r(x) = 0.

In the first case, we deduce that r(x) = r0 is a constant polynomial with r0 ≠ 0.


Note that in this case,
f (α) = q(α)(α − α) + r(α) = q(α)0 + r0 = r0 .
In the second case, r(x) = r0 = 0, and
f (α) = q(α)(α − α) = 0 = r0 .

The following result is an immediate consequence of Proposition 1.3 above.

1.4. Corollary. Let F be a field, 0 ≠ f (x) ∈ F[x] and α ∈ F. The following


statements are equivalent:
(a) f (α) = 0, i.e. α is a root of f (x); and
(b) x − α divides f (x).

1.5. Lemma. Let F be a field, 0 ≠ f (x) ∈ F[x], and α ∈ F. If k = mult(α) is


the multiplicity of α, then we may write
f (x) = (x − α)k g(x),
where g(α) ≠ 0 and deg(g(x)) = deg(f (x)) − k.
Proof. By definition of multiplicity, (x − α)k ∣ f (x), and so we may write f (x) =
(x − α)k g(x) for some g(x) ∈ F[x].
Suppose that g(α) = 0. Then, by Corollary 1.4, (x − α) ∣ g(x), and so we may
write g(x) = (x − α)h(x) for some h(x) ∈ F[x]. But then
f (x) = (x − α)k g(x) = (x − α)k+1 h(x),
contradicting the fact that mult(α) = k.
Thus g(α) ≠ 0.
Finally, by Remark 3.1.8, since every field is an integral domain,
deg(f (x)) = deg((x − α)k g(x)) = deg((x − α)k ) + deg(g(x)) = k + deg(g(x)),
completing the proof.

1.6. Theorem. Let F be a field and 0 ≠ f (x) ∈ F[x]. The number of roots of
f (x), counted according to multiplicity, is at most deg f (x).
Proof. We shall argue by induction on the degree of f (x). To that end, let n ∶=
deg(f (x)).
If n = 0, then f (x) = a0 ≠ 0, and so f (x) has no roots; i.e. it has zero roots. This
establishes the first step of the induction argument.
Now suppose that N = deg(f (x)) ∈ N and that the result holds for all polyno-
mials h(x) of degree less than N . That is, suppose that if deg(h(x)) < N , then the
number of roots of h(x), counted according to multiplicity, is at most deg (h(x)).

If f (x) has no roots, then clearly we are done. Otherwise, let α ∈ F be a root of
f (x) of multiplicity k ≥ 1. By Lemma 1.5, we may write
f (x) = (x − α)k g(x)
for some 0 ≠ g(x) ∈ F[x], and g(α) ≠ 0. In particular, this shows that N =
deg (f (x)) ≥ k, and from above,
deg (g(x)) = N − k < N.
By our induction hypothesis, the number of roots of g(x) counted according to
multiplicity is at most deg(g(x)). But β ∈ F is a root of f (x) if and only if β = α or
β is a root of g(x). (In fact, since g(α) ≠ 0, these form disjoint sets, but we don’t
even need this.)
Hence the number of roots of f (x), counted according to multiplicity, is at most
k + deg (g(x)) = k + (N − k) = N = deg (f (x)),
completing the induction step and the proof.
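Over a finite field Zp the bound of the theorem can be tested exhaustively. The brute-force sketch below is ours; the example polynomials are our own choices.

```python
# Illustrate Theorem 1.6 over Z_p: a non-zero polynomial of degree n has at
# most n roots. Coefficient lists are written highest degree first.

def poly_eval_mod(coeffs, x, p):
    """Evaluate a polynomial at x, modulo p (Horner's rule)."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc

def count_roots_mod(coeffs, p):
    """Number of distinct roots of the polynomial in Z_p."""
    return sum(1 for x in range(p) if poly_eval_mod(coeffs, x, p) == 0)
```

For instance, x^4 − 1 attains the bound over Z5 (its roots are 1, 2, 3, 4), while x^2 + 1 has no roots at all over Z3.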

1.7. Example. The above theorem fails in general if we consider polynomials


with coefficients in commutative rings which are not fields. For example, consider
R = C([0, 1], R). We define three functions in R, namely
● r0 (y) = 0;
● r1 (y) = 0 for 0 ≤ y ≤ 1/2, and r1 (y) = (y − 1/2)^2 for 1/2 ≤ y ≤ 1; and
● r2 (y) = 0 for 0 ≤ y ≤ 1/2, and r2 (y) = sin(y − 1/2) for 1/2 ≤ y ≤ 1.
We leave it to the reader to verify that each of these three functions is indeed
continuous on [0, 1]. Consider
f (x) = r2 (y)x2 + r1 (y)x + r0 (y).
(Observe that the coefficients just so happen to be functions of y ∈ [0, 1] but this is
a polynomial in x with strange coefficients!)
For each k ∈ N, consider the continuous function αk defined by
αk (y) ∶= (y − 1/2)^k for 0 ≤ y ≤ 1/2, and αk (y) ∶= 0 for 1/2 ≤ y ≤ 1.
(Again – that each of these functions lies in C([0, 1], R) is left as an exercise.)
Note that for each such k ≥ 1,
f (αk ) = r2 (y)(αk (y))^2 + r1 (y)αk (y) + r0 (y) = 0.
Thus f (x) has infinitely many roots in C([0, 1], R), despite being a polynomial of
degree 2 with coefficients in that ring.
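The mechanism is easy to check numerically: on [0, 1/2] the coefficients vanish, and on [1/2, 1] each αk vanishes. The sketch below (our code) evaluates f (αk ) pointwise and confirms that it is identically zero.

```python
import math

# A pointwise check of Example 1.7: each alpha_k is a root of
# f(x) = r2*x^2 + r1*x + r0 in the ring C([0,1], R).

def r0(y):
    return 0.0

def r1(y):
    return 0.0 if y <= 0.5 else (y - 0.5) ** 2

def r2(y):
    return 0.0 if y <= 0.5 else math.sin(y - 0.5)

def alpha(k, y):
    return (y - 0.5) ** k if y <= 0.5 else 0.0

def f_at(k, y):
    """Evaluate f(alpha_k) at the point y; one of the two factors in each
    term is exactly zero, so the value is exactly 0.0."""
    a = alpha(k, y)
    return r2(y) * a * a + r1(y) * a + r0(y)
```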

1.8. In the example above, we see that R is a commutative, unital ring, but it
is not an integral domain. Indeed, although we shall not require it here, it is not
hard to see that Theorem 1.6 extends to polynomials over an integral domain.
Suppose that D is an integral domain, and that 0 ≠ f (x) ∈ D[x]. As seen in
Chapter 5 (more specifically, see Theorem 5.2.7), we can think of D as being a
subring of its field F of quotients, and thus we may also think of f (x) as an element
of F[x]. By Theorem 1.6 above, f (x) has at most deg (f (x)) roots in F. But any
root of f (x) in D (i.e. any element β ∈ D such that f (β) = 0) is automatically a
root of f (x) in F, and so f (x) has at most deg (f (x)) roots in D.

1.9. Example.
● Let n ∈ N and let f (x) = xn −1 ∈ C[x]. If ω ∶= e2πi/n , then ω, ω 2 , . . . , ω n−1 , ω n =
1 are n distinct roots of f (x). By Theorem 1.6, these are the only roots of
f (x) in C.
● Observe that f (x) = x4 − 1 only has two roots in R. Indeed,
f (x) = x4 − 1 = (x2 − 1)(x2 + 1).
Now g(x) = x2 − 1 has two real roots, namely a1 = 1 and a2 = −1, but
h(x) = x2 + 1 has no real roots.

Let us recall Corollary 6.3.21.

1.10. Theorem. Let F be a field. Then F[x] is a ufd. As such, if f (x) ∈ F[x]
is a polynomial of degree at least one, then
(a) f (x) can be factored in F[x] as a product of irreducible polynomials, and
(b) except for the order of the terms and for non-zero scalar terms (which
correspond to the invertible elements of F[x]), the factorisation is unique.

1.11. Remark. Recall also from Theorem 6.3.10 that an element of F[x] is
prime if and only if it is irreducible. Thus every non-invertible element of F[x]
can be factored as a product of finitely many prime elements of F[x], in the same
(essentially) unique way.

1.12. Proposition. Let F be a field, and let q(x), u(x), v(x) ∈ F[x]. Suppose
that q(x) is irreducible and that
q(x) ∣ u(x)v(x).
Then, either q(x) ∣ u(x) or q(x) ∣ v(x).
Proof. By Remark 1.11, q(x) is prime in F[x]. The result now follows from this.


1.13. Example.
(a) The polynomial f (x) = x4 −1 ∈ R[x] can be factored as a product irreducible
polynomials over R as
f (x) = (x2 − 1)(x2 + 1) = (x − 1)(x + 1)(x2 + 1).
The significance of item (b) in Theorem 1.10 is that we can also factor f (x)
as
f (x) = (3(x2 + 1)) ⋅ (π(x + 1)) ⋅ ((1/(3π))(x − 1)),

and each of these terms is also irreducible.
On the other hand, the three terms in the second factorisation are
merely associates of the three terms in the first factorisation, written in a
different order.
(b) Note that although g(x) = x2 + 1 is irreducible in R[x], it is reducible in
C[x]. Indeed, there we can factor
f (x) = x4 − 1 = (x − 1)(x + 1)(x − i)(x + i).
Those last two factors do not lie in R[x].

1.14. Culture. Although we shall not prove it here, the field C has a won-
derfully useful property. Every polynomial f (x) ∈ C[x] can be factored into linear
terms. Stated another way, every complex polynomial of degree n ∈ N has exactly n
roots, counted according to multiplicity.
A field that enjoys this property is said to be algebraically closed. Thus C
is algebraically closed. Since f (x) = x2 + 1 can not be factored into linear terms in
R[x], R is not algebraically closed.
While it might not seem like much, this is actually the reason why people who
study operator theory (a version of linear algebra where the linear maps act on
normed vector spaces) tend to use C as opposed to R for their base field. The
eigenvalues of a matrix T ∈ Mn (R) can be found by considering the characteristic
polynomial of T . If this polynomial has no real roots, then T has no eigenvalues
whatsoever. But for any Y ∈ Mn (C), the characteristic polynomial always factors into
linear terms, and so Y has exactly n eigenvalues, counted according to multiplicity.
The usefulness of the eigenvalues of a matrix (and of their generalisation to operators,
namely spectrum) cannot be overstated.
Hold on, that’s a bit hyperbolic. What we mean by that is that eigenvalues are
really important. That is, they are really important to the study of matrices, and to
those disciplines (I’m talking to you, Physics) that rely heavily upon matrix theory.
If you are alone in a cage with a starving lion, a new hair-do and your wits, you
will generally be excused for not having the eigenvalues of a matrix be one of the first
things that spring to your mind, and your estimation of their general usefulness will
undoubtedly not be of the impossible to overstate variety. But if you are invited to
a pyjama party at Angela Merkel’s house, well, the sky’s the limit.

2. Reducibility in polynomial rings over a field


2.1. We wish to factor polynomials in F[x]. The question of whether a polyno-
mial q(x) can be factored is as much a question of the field as of the polynomial, as
we shall soon see. In other words – one can not speak of an irreducible polynomial –
instead, one has to speak of a polynomial being irreducible over a given field. Indeed,
it is precisely this notion that is at the heart of extension theory for fields, which we
shall examine in Chapter 9.
We mention in passing that we shall adopt the popular convention whereby we
use the expression q(x) is irreducible over F to mean the same thing as q(x) is
irreducible in F[x].
2.2. Recall that if D is an integral domain, then 0 ≠ q ∈ D is reducible if and
only if we can find two non-invertible elements u and v ∈ D such that
q = uv.
Since D[x] is an integral domain whenever D is, it follows that 0 ≠ q(x) ∈ D[x]
is reducible if and only if there exist non-invertible elements u(x) and v(x) ∈ D[x]
such that
q(x) = u(x)v(x).
2.3. Remarks.
(a) Every field F is clearly an integral domain. Moreover, it is a routine exercise
(left to the reader) to show that u(x) ∈ F[x] is invertible if and only if
u(x) = u0 for some 0 ≠ u0 ∈ F. It follows that in this case, q(x) ∈ F[x] is
reducible if and only if there exist u(x) and v(x) of degree at least one such
that
q(x) = u(x)v(x).
That is, q(x) is reducible if and only if we can factor q(x) = u(x)v(x) with
1 ≤ deg (u(x)), deg (v(x)) < deg(q(x)).
Thus 0 ≠ r(x) ∈ F[x] is irreducible if and only if it can not be expressed
as a product of two elements of F[x], each having degree at least one and
therefore strictly lower than that of r(x).
(b) This is not the case for polynomials over general integral domains. The
polynomial f (x) = 2x + 6 ∈ Z[x] is reducible, because we may write f (x) =
g(x)h(x), where g(x) = 2 and h(x) = x + 3, neither of which is invertible in
Z[x]. Clearly deg (g(x)) = 0 while deg (h(x)) = 1.

2.4. Examples.
(a) For example, the polynomial f (x) = x2 − 2 factors as
f (x) = x2 − 2 = (x − √2)(x + √2) over R,
but it is irreducible over Q. If it were reducible over Q, we could write
f (x) = g(x)h(x),

where g(x), h(x) ∈ Q[x] would necessarily each have degree one. Thus
g(x) = a1 x + a0 and h(x) = b1 x + b0 , with ai , bi ∈ Q and a1 ≠ 0 ≠ b1 . But then
f (−a0 /a1 ) = g(−a0 /a1 )h(−a0 /a1 ) = 0,
contradicting the fact that f (x) does not have any rational roots.
Similarly, the polynomial g(x) = x2 + 1 is irreducible as a polynomial in
R[x], but it is reducible over C, since g(x) = (x − i)(x + i) in C[x].
(b) Using Proposition 2.5 below, it is not hard to see that the polynomial
f (x) = 9x2 + 3 is irreducible over Q. On the other hand, the fact that
g(x) = 3 and h(x) = 3x2 + 1 are both non-invertible in Z[x] implies that
f (x) = g(x)h(x)
is in fact reducible over Z.

2.5. Proposition. Let F be a field and f (x) ∈ F[x]. Suppose that deg(f (x)) ∈
{2, 3}. The following statements are equivalent.
(a) f (x) is reducible in F[x].
(b) f (x) has a root in F.
Proof.
(a) implies (b). Suppose that f (x) is reducible over F. By Remark 2.3, we may
write f (x) = g(x)h(x) for some g(x), h(x) ∈ F[x], each of degree at least
one. Since deg (f (x)) = deg(g(x)) + deg(h(x)) (why?) lies in {2, 3}, one of
g(x) and h(x) must have degree equal to 1. By relabelling if necessary, we
may assume that it is g(x). But then
g(x) = a1 x + a0 ∈ F[x]
with a1 ≠ 0 has a root in F, namely α = −a0 /a1 . Hence α is a root of f (x)
in F as well.
(b) implies (a). Suppose that α ∈ F is a root of multiplicity 1 ≤ k ≤ deg(f (x))
for f (x). By Corollary 1.4, x − α divides f (x), and so
f (x) = (x − α)g(x)
for some g(x) ∈ F[x] of degree deg (g(x)) = deg (f (x)) − 1 ∈ {1, 2}. By
Remark 2.3, f (x) is reducible over F.

2.6. Examples. This is an especially useful device, particularly when working


over small fields, where one can try to locate roots by trial and error!
(a) Consider f (x) = x3 + 3x + 2 ∈ Z5 [x]. If it were to be reducible over Z5 , it
would have to have a root α ∈ Z5 .
Now
f (0) = 2 f (1) = 1 f (2) = 1
f (3) = 3 f (4) = 3

shows that f (x) has no roots in Z5 , whence – by Proposition 2.5 – f (x) is


irreducible over Z5 . (Compare this with trying to factor f (x) over Z5 !)
(b) Consider f (x) = x3 + 3x + 2 ∈ Z3 [x]. Note that f (0) = 2, but f (1) = 0 in
Z3 . By Proposition 2.5, f (x) is reducible over Z3 , and (x − 1) is one of its
factors. Indeed,
f (x) = (x − 1)(x2 + x + 1)
= (x − 1)(x2 − 2x + 1)
= (x − 1)(x − 1)(x − 1)
= (x − 1)3
is a factorisation of f (x) into irreducible factors over Z3 . (Why is h(x) =
x − 1 irreducible over Z3 ?)
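The trial-and-error search above is easy to mechanise; the sketch below (our code) reproduces the table of values over Z5 and the factorisation claim over Z3.

```python
# Brute-force check of Example 2.6 (a) and (b) for f(x) = x^3 + 3x + 2:
# over Z_5 there are no roots (hence f is irreducible, by Proposition 2.5),
# while over Z_3 we have f(1) = 0 and f(x) = (x - 1)^3.

def f_mod(x, p):
    return (x ** 3 + 3 * x + 2) % p

# the table of values over Z_5 from part (a):
table5 = {x: f_mod(x, 5) for x in range(5)}

def cube_mod3(x):
    """(x - 1)^3 evaluated modulo 3."""
    return ((x - 1) ** 3) % 3
```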
We have already done all of the work necessary to prove the next result. Now
we just need to reap the rewards of our efforts.
2.7. Theorem. Let F be a field and p(x) ∈ F[x]. The following statements are
equivalent.
(a) p(x) is prime.
(b) p(x) is irreducible over F.
(c) ⟨p(x)⟩ is a maximal ideal of F[x].
(d) F[x]/⟨p(x)⟩ is a field.
Proof. Since F is a field, F[x] is a pid, by Corollary 6.1.12. The result is now a
simple application of Theorem 6.3.10 and Theorem 5.1.8 to this setting.

2.8. Proposition. Let p ∈ N be a prime number and suppose that q(x) ∈ Zp [x]
is an irreducible polynomial of degree n over Zp . Then
G ∶= Zp [x]/⟨q(x)⟩
is a field with p^n elements.
Proof. By Theorem 2.7, it is clear that G is a field. The elements of G are of the
form
f (x) + ⟨q(x)⟩,
where f (x) ∈ Zp [x] is any polynomial. By the division algorithm, we can always
write f (x) = h(x)q(x) + r(x), where r(x) = 0 or 0 ≤ deg(r(x)) < deg (q(x)) = n.
That is,
f (x) + ⟨q(x)⟩ = r(x) + ⟨q(x)⟩.
Note that
r(x) = rn−1 xn−1 + rn−2 xn−2 + ⋯ + r1 x + r0
for some rj ∈ Zp , 0 ≤ j ≤ n − 1, which means that we have at most p^n distinct cosets in
G. On the other hand, given any two distinct choices of r(x) and s(x) = sn−1 xn−1 +

sn−2 xn−2 + ⋯ + s1 x + s0 , the fact that deg(s(x) − r(x)) < n and s(x) − r(x) ≠ 0 implies
that the cosets are different.
This shows that we have exactly p^n cosets in G. In other words, G is a field with
exactly p^n elements, as claimed.

2.9. Example. Consider q(x) = x3 + 2x + 2 ∈ Z3 [x]. Now q(0) = q(1) = q(2) =


2 ≠ 0 in Z3 . Since q(x) is a polynomial of degree 3 with no roots in Z3 , we see from
Proposition 2.5 above that q(x) is irreducible.
By Proposition 2.8, we conclude that G ∶= Z3 [x]/⟨q(x)⟩ is a field with 3^3 = 27
elements.
The operations on G are the usual operations in a quotient ring:
(r(x) + ⟨q(x)⟩) + (s(x) + ⟨q(x)⟩) = (r(x) + s(x)) + ⟨q(x)⟩,
and
(r(x) + ⟨q(x)⟩)(s(x) + ⟨q(x)⟩) = (r(x)s(x)) + ⟨q(x)⟩.
Of course, if we start with deg (r(x)), deg (s(x)) < 3, then deg (r(x) + s(x)) < 3,
and so r(x) + s(x) is the “canonical” choice of representative for the sum of the two
cosets, if we agree that we always want the representatives to have degree less than
3. (This is “nice”, but not strictly necessary, mind you.)
As for multiplication, it is not unlikely that deg (r(x)s(x)) ≥ 3, in which case we
use the Division Algorithm to write
(r(x)s(x)) = t(x)q(x) + w(x),
where w(x) = 0 or deg (w(x)) < deg (q(x)) = 3. As before, w(x)+⟨q(x)⟩ = (r(x)s(x))+
⟨q(x)⟩, and so
(r(x) + ⟨q(x)⟩)(s(x) + ⟨q(x)⟩) = w(x) + ⟨q(x)⟩.
For example,
((x2 + 1) + ⟨q(x)⟩) ⋅ ((x2 + x + 2) + ⟨q(x)⟩) = (x4 + x3 + 3x2 + x + 2) + ⟨q(x)⟩
= (x4 + x3 + x + 2) + ⟨q(x)⟩,
since 3 = 0 in Z3 .
Now, by using the Division Algorithm, we obtain that
x4 + x3 + x + 2 = (x + 1)(x3 + 2x + 2) + (−2x2 ) = (x + 1)(x3 + 2x + 2) + (x2 ),
and so
((x2 + 1) + ⟨q(x)⟩) ⋅ ((x2 + x + 2) + ⟨q(x)⟩) = (x4 + x3 + x + 2) + ⟨q(x)⟩
= (x2 ) + ⟨q(x)⟩.
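This arithmetic is easily mechanised. The sketch below (our code) redoes the multiplication above, and also confirms the two claims of Proposition 2.8 for this example: G has 27 cosets, and every non-zero coset is invertible.

```python
# Arithmetic in G = Z_3[x]/<q(x)> with q(x) = x^3 + 2x + 2, as in the
# example above. Polynomials are coefficient lists, highest degree first.

P = 3
Q = [1, 0, 2, 2]  # q(x) = x^3 + 2x + 2

def poly_mul(a, b):
    """Product of two polynomials with coefficients in Z_P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            out[i + j] = (out[i + j] + ca * cb) % P
    return out

def poly_mod(a, m):
    """Remainder of a upon division by the monic polynomial m, over Z_P."""
    a = [c % P for c in a]
    while len(a) >= len(m):
        c = a[0]
        for i in range(len(m)):
            a[i] = (a[i] - c * m[i]) % P
        a.pop(0)
    return a

def normalize(a):
    """Strip leading zero coefficients (keeping at least one entry)."""
    i = 0
    while i < len(a) - 1 and a[i] == 0:
        i += 1
    return a[i:]

# the 3^3 = 27 canonical coset representatives of degree < 3:
RESIDUES = [[a, b, c] for a in range(P) for b in range(P) for c in range(P)]
```

With these helpers, `poly_mod(poly_mul([1, 0, 1], [1, 1, 2]), Q)` returns `[1, 0, 0]`, i.e. the coset of x2, matching the computation above; and a brute-force search finds a multiplicative inverse for each of the 26 non-zero residues.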

3. Factorisation of elements of Z[x] over Q


3.1. As previously mentioned, the question of whether or not a given polynomial
is reducible or irreducible depends to a large extent over which domain we are trying
to factor it.
For example, in Z[x], the polynomial p(x) = 6x is reducible, since we may write
p(x) = q(x)r(x), where q(x) = 2x and r(x) = 3, neither of which is invertible. The
same polynomial, when viewed as an element of Q[x], is irreducible.
In light of this, it may be surprising to learn that the reducibility of a polynomial
q(x) ∈ Z[x] over Q implies its reducibility over Z. We say it may be... let’s learn it
and find out just how surprised we are!

3.2. Definition. Let 0 ≠ q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ Z[x] be a


polynomial of degree n. We define the content of q(x) to be
content(q(x)) ∶= ∣gcd(qn , qn−1 , . . . , q1 , q0 )∣.
If content(q(x)) = 1, we say that q(x) is primitive.

3.3. Examples.
(a) Let q(x) = 10x2 + 4x + 18. Then content(q(x)) = 2.
(b) Let r(x) = 10x2 + 7x + 18. Then content(r(x)) = 1, and r(x) is primitive.
(c) Let s(x) = −5. Then content(s(x)) = 5.
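These computations are easily mechanised; the sketch below (our code) computes the contents above, and also illustrates Gauss' Lemma (3.6 below) on a product involving the primitive polynomial from (b).

```python
from math import gcd

def content(coeffs):
    """The content |gcd| of the coefficients of a non-zero integer polynomial."""
    g = 0
    for c in coeffs:
        g = gcd(g, c)  # math.gcd ignores signs, so the result is already >= 0
    return abs(g)

def poly_mul(a, b):
    """Multiply two integer polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            out[i + j] += ca * cb
    return out
```

For instance, the product of the primitive polynomials 10x2 + 7x + 18 and 3x + 1 (the second factor is our own example) is again primitive.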

3.4. Note. We remark that many authors define the content of a non-zero poly-
nomial q(x) ∈ Z[x] without the absolute value sign. The only difference is that this
allows both 2 and −2 to be the content of q(x) in the example above; and we would
have to modify our notion of primitive to include the case where
gcd(qn , qn−1 , . . . , q1 , q0 ) = −1.
We are simply trying to be pragmatic.
The notion of content can be extended to polynomials over a general integral
domain D. In that setting, the notion of an absolute value makes no sense, and we
have to agree that the content of a non-zero polynomial over D should be the set of
all greatest common divisors of its coefficients.

3.5. The proof of Gauss’ Lemma below is extremely clever – almost too clever,
but it has the advantage that it is relatively easy to follow. It relies on the following
simple but effective observation.

Given a polynomial r(x) = rn xn + rn−1 xn−1 + ⋯ + r1 x + r0 ∈ Z[x], and given a


prime number p ∈ N, we shall denote by r̄(x) the polynomial
r̄(x) ∶= r̄n xn + r̄n−1 xn−1 + ⋯ + r̄1 x + r̄0 ∈ Zp [x],
where r̄j ∶= rj mod p ∈ Zp , 0 ≤ j ≤ n.

We leave it as an exercise for the reader to prove that the map


∆ ∶ Z[x] → Zp [x]
r(x) ↦ r̄(x)
is a homomorphism.

3.6. Gauss’ Lemma. Suppose that 0 ≠ q(x), r(x) ∈ Z[x] are primitive. Then
q(x)r(x) is also primitive.
Proof. Set γ ∶= content(q(x)r(x)). Then γ ∈ N, and our goal is to
prove that γ = 1. The “trick” is to realise that if γ ≠ 1, then there exists a prime
number p which divides γ, and thus p divides each coefficient of q(x)r(x).
Now the magic happens. Write q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 and r(x) =
rm xm + rm−1 xm−1 + ⋯ + r1 x + r0 , where n = deg(q(x)) and m = deg(r(x)). Then
q̄(x) ∶= q̄n xn + q̄n−1 xn−1 + ⋯ + q̄1 x + q̄0
and
r̄(x) ∶= r̄m xm + r̄m−1 xm−1 + ⋯ + r̄1 x + r̄0
lie in Zp [x], the ring of polynomials over the field Zp , where each q̄j ∶= qj mod p ∈ Zp ,
0 ≤ j ≤ n (and a similar statement holds for each r̄j ).
The fact that q(x) is primitive, i.e. that content(q(x)) = 1, implies that not
every coefficient of q(x) is divisible by p, and hence q̄(x) ≠ 0 in Zp [x]. Similarly,
the fact that r(x) is primitive implies that r̄(x) ≠ 0 in Zp [x]. Now Zp is a field and
hence an integral domain, and therefore Zp [x] is also an integral domain. But then
∆(q(x)r(x)) = q̄(x) r̄(x) ≠ 0
in Zp [x].
If p had divided γ, then we would have ∆(q(x)r(x)) = 0, since each coefficient of
q(x)r(x) would then be divisible by p!
Thus γ is not divisible by any prime number, and so γ = 1, i.e. q(x)r(x) is
primitive.

3.7. Remark. Let F and E be fields, and suppose that F ⊆ E. Suppose further-
more that f (x) ∈ F[x] is reducible over F. It is then easy to see that f (x) is also
reducible over E. For if we can write
f (x) = g(x)h(x)
where g(x), h(x) ∈ F[x] each have degree at least one, then it is clear that g(x), h(x) ∈
E[x] as well, proving that f (x) is reducible over E.
If f (x) is irreducible over F, then unfortunately no universal conclusions are
possible. For example, if we set p(x) = x2 + 1, then p(x) ∈ Q[x], Q ⊆ R ⊆ C, and p(x)
is irreducible over both Q and R, but p(x) = (x + i)(x − i) is reducible over C.

The “problem”, so to speak, with D[x] where D is an integral domain (as


opposed to a field) is that any non-invertible element d0 ∈ D gives rise to a non-
invertible polynomial g(x) = d0 , and “skews” the issue of whether a polynomial in
D[x] is reducible in F[x] when F is a field and D ⊆ F.
For example, we recall that q(x) = 9x2 + 3 = 3(3x2 + 1) is reducible in Z[x], but
irreducible in Q[x].
In light of these examples, the next result becomes interesting.

3.8. Theorem. Let q(x) ∈ Z[x]. If q(x) is reducible over Q, then q(x) is
reducible over Z. Moreover, if
q(x) = g(x)h(x)
in Q[x] with deg (g(x)), deg (h(x)) < deg (q(x)), then it is possible to factor
q(x) = g1 (x)h1 (x)
in Z[x] with deg (g1 (x)) = deg (g(x)) and deg (h1 (x)) = deg (h(x)).
Proof.
Suppose that q(x) is reducible over Q, and choose g(x), h(x) ∈ Q[x] such that
q(x) = g(x)h(x) with deg g(x), deg h(x) ≥ 1. We shall prove that q(x) is reducible
over Z by proving that there exist polynomials g0 (x), h0 (x) ∈ Z[x] such that
● deg (g0 (x)) = deg (g(x)),
● deg (h0 (x)) = deg (h(x)), and
● q(x) = g0 (x)h0 (x).
Case One. Suppose that q(x) is primitive.
Write g(x) = an xn +an−1 xn−1 +⋯+a1 x1 +a0 , h(x) = bm xm +bm−1 xm−1 +⋯+b1 x+b0 ,
where an ≠ 0 ≠ bm .
Set κ = min{k ∈ N ∶ k g(x) ∈ Z[x]} and λ ∶= min{k ∈ N ∶ k h(x) ∈ Z[x]}. (Note
that if we write each coefficient of g(x) as a quotient of two relatively prime integers,
then κ is just the least common multiple of the absolute values of those denominators.
A similar remark holds for h(x).)
Now
κλ q(x) = (κg(x))(λh(x)).
Write κg(x) = γκg g0 (x) and λh(x) = γλh h0 (x) where γκg ∶= content (κg(x)) and
γλh ∶= content (λh(x)). Thus g0 and h0 are primitive elements of Z[x], and
κλ q(x) = γκg γλh (g0 (x)h0 (x)).
Since g0 (x) and h0 (x) are primitive, it follows from Gauss’s Lemma 3.6 that g0 (x)h0 (x)
is also primitive. Hence
κλ = content (κλq(x))
= content (γκg γλh (g0 (x)h0 (x)))
= γκg γλh .

But then (keeping in mind that Z is an integral domain and so we have cancellation
there!)
q(x) = g0 (x)h0 (x)
factors in Z[x] as well, since
deg(g0 (x)) = deg(g(x)) ≥ 1 and deg(h0 (x)) = deg(h(x)) ≥ 1.
Case Two. If q(x) is not primitive, then we can write q(x) = γq q0 (x), where
q0 (x) ∈ Z[x] is primitive, and note that
q0 (x) = (g1 (x))(h(x)),
where g1 (x) = (1/γq ) g(x) ∈ Q[x] has the same degree as g(x). By Case One above,
q0 (x) factors over Z, say q0 (x) = r(x)s(x) with r(x), s(x) ∈ Z[x] having degree at
least one, and thus q(x) = (γq r(x))s(x) factors over Z as well.

3.9. Corollary. Suppose that q(x) = xn + qn−1 xn−1 + qn−2 xn−2 + ⋯ + q1 x + q0 is


a monic polynomial in Z[x], and that q0 ≠ 0. If q(x) has a root in Q, then it has a
root – say α – in Z, and α ∣ q0 .
Proof. By Corollary 1.4, the fact that β ∈ Q is a root of q(x) implies that x − β
divides q(x) over Q.
Thus we may write q(x) = (x − β)h(x) for some h(x) ∈ Q[x]. As per the first
paragraph of the proof of Theorem 3.8 above, we see that we can then find two
polynomials g0 (x), h0 (x) in Z[x] such that
● deg (g0 (x)) = deg (x − β) = 1,
● deg (h0 (x)) = deg (h(x)) = n − 1, and
● q(x) = g0 (x)h0 (x).
Write g0 (x) = (a1 x − a0 ) and h0 (x) = bn−1 xn−1 + bn−2 xn−2 + ⋯ + b1 x + b0 ∈ Z[x]. Since
q(x) is monic, we see that a1 bn−1 = 1, and so there is no loss of generality in assuming
that a1 = 1 = bn−1 . Let α = a0 . Then q(α) = g0 (α)h0 (α) = 0 ⋅ h0 (α) = 0, and q0 = −αb0
is divisible by α.
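Corollary 3.9 yields a finite test for the rational roots of a monic integer polynomial: only the divisors of the constant term q0 need to be tried. A sketch (our code, on an example polynomial of our own choosing):

```python
# Integer-root test for a monic integer polynomial with non-zero constant
# term: by Corollary 3.9, any rational root is an integer dividing q0.

def eval_poly(coeffs, x):
    """Evaluate an integer polynomial (highest degree first) at x."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc

def integer_roots_monic(coeffs):
    """All integer roots, found by testing the divisors of the constant term."""
    q0 = coeffs[-1]
    candidates = {d for k in range(1, abs(q0) + 1) if q0 % k == 0
                  for d in (k, -k)}
    return sorted(x for x in candidates if eval_poly(coeffs, x) == 0)
```

For example, x3 − 4x2 + x + 6 has integer roots −1, 2 and 3, each of which divides 6; and x2 − 2 has none, recovering Example 2.4 (a).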

3.10. Example. Consider q(x) = x4 − 2x2 + 8x + 1 ∈ Z[x]. We wish to determine


whether or not q(x) is reducible over Q.

Suppose that q(x) is reducible over Q (and hence over Z). A moment’s thought
shows that we can find polynomials g(x) and h(x) ∈ Z[x] such that q(x) = g(x)h(x)
and either:
● deg(g(x)) = 1, deg(h(x)) = 3, or
● deg(g(x)) = 2 = deg(h(x)).

Case One. deg(g(x)) = 1, deg(h(x)) = 3.


In this case, by Corollary 3.9 above, q(x) has a root α ∈ Z which divides q0 = 1.
That is, α ∈ {−1, 1}. But
q(−1) = 1 − 2 − 8 + 1 = −8 ≠ 0 ≠ 8 = 1 − 2 + 8 + 1 = q(1).
Since q(x) has no roots in Z, it has no roots in Q, and so this decomposition is not
possible.
Case Two. deg(g(x)) = 2 = deg(h(x)).
In this case, we can write g(x) = a2 x2 + a1 x + a0 and h(x) = b2 x2 + b1 x + b0 ∈ Z[x].
Thus
q(x) = x4 − 2x2 + 8x + 1
= (a2 x2 + a1 x + a0 )(b2 x2 + b1 x + b0 )
= (a2 b2 )x4 + (a2 b1 + a1 b2 )x3 + (a2 b0 + a1 b1 + a0 b2 )x2 + (a1 b0 + a0 b1 )x + a0 b0 .

Since a2 b2 = 1, we either have a2 = 1 = b2 or a2 = −1 = b2 . In the second case, we


shall simply multiply both g(x) and h(x) by −1; in other words, without loss of
generality, we may assume that a2 = 1 = b2 . Thus
q(x) = x4 + (a1 + b1 )x3 + (b0 + a1 b1 + a0 )x2 + (a1 b0 + a0 b1 )x + a0 b0 .
Let us now try to solve for ai , bi , 0 ≤ i ≤ 2. From a0 b0 = 1, we see that either
a0 = 1 = b0 , or a0 = −1 = b0 .
● If a0 = 1 = b0 , then by comparing the coefficients of x and x3 we find that
8 = a1 b0 + a0 b1 = a1 + b1 = 0,
a contradiction, while
● if a0 = −1 = b0 , then by comparing the coefficients of x and x3 we find that
8 = a1 b0 + a0 b1 = −a1 − b1 = 0,
another contradiction.
We conclude that this decomposition is also impossible.
Hence q(x) is irreducible over Z, and thus q(x) is irreducible over Q as well.
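The coefficient comparison in Case Two is a finite search, so it can also be checked by machine. The sketch below (Python is our choice of language; the search window on a1, b1 is an ad hoc bound) reproduces the argument that no factorisation into two monic integer quadratics exists.

```python
from itertools import product

# q(x) = x^4 - 2x^2 + 8x + 1; try q = (x^2 + a1*x + a0)(x^2 + b1*x + b0)
# with integer coefficients.  Since a0*b0 = 1, we have a0 = b0 = ±1, and
# we let a1, b1 range over a window large enough for any putative factor.
target = (1, 8, -2, 0, 1)              # coefficients q0, q1, q2, q3, q4
found = []
for a0, b0 in [(1, 1), (-1, -1)]:
    for a1, b1 in product(range(-10, 11), repeat=2):
        expanded = (a0 * b0,
                    a1 * b0 + a0 * b1,  # coefficient of x
                    b0 + a1 * b1 + a0,  # coefficient of x^2
                    a1 + b1,            # coefficient of x^3
                    1)
        if expanded == target:
            found.append((a0, a1, b0, b1))
```

An empty `found` list agrees with the contradiction obtained above: the x and x^3 coefficients force a1 + b1 to be both 0 and ±8.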

3.11. We next introduce the first of two irreducibility tests. Recall that the
map
∆ ∶ Z[x] → Zp [x]
r(x) ↦ r̄(x)
defined in paragraph 3.5 is a homomorphism.
3. FACTORISATION OF ELEMENTS OF Z[x] OVER Q 163

3.12. The Mod-p Test. Let q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ Z[x] be
a polynomial with deg(q(x)) = n ≥ 1, let p ∈ N be a prime number, and denote by
q̄(x) ∈ Zp [x] the image of q(x) under the homomorphism ∆ above.
If deg(q̄(x)) = deg(q(x)) and q̄(x) is irreducible over Zp , then q(x) is irreducible
over Q.
Proof. We argue the contrapositive of our desired statement. Suppose that q(x) is
reducible over Q. By Theorem 3.8, q(x) is then reducible over Z; that is, we can write
q(x) = g(x)h(x),
where g(x), h(x) ∈ Z[x] both have degree at least one, so that max(deg(g(x)), deg(h(x))) < n.
From the comment preceding the statement of this result,
q̄(x) = ḡ(x) h̄(x).
Since, by hypothesis, deg(q̄(x)) = n = deg(g(x)) + deg(h(x)), while deg(ḡ(x)) ≤ deg(g(x))
and deg(h̄(x)) ≤ deg(h(x)), we must have deg(ḡ(x)) = deg(g(x)) ≥ 1 and deg(h̄(x)) = deg(h(x)) ≥ 1.
Thus q̄(x) is reducible in Zp [x].

3.13. Remarks.
(a) The converse of this statement is false! Consider q(x) = x4 + 1. It can be
shown that q(x) is irreducible over Q, but q̄(x) = x4 + 1 is reducible in Zp [x]
for all primes p.
The proof of this relies on group theory and will be omitted.
(b) It is tempting to ask, and we shall allow ourselves to give into the tempta-
tion, whether the irreducibility of q(x) over Zp in the Mod-p Test implies
that q(x) is irreducible not only over Q, but also over Z.
Alas, this fails. Consider the polynomial p(x) = 2x2 + 4 ∈ Z[x]. Then
p̄(x) = 2x2 + 4 ∈ Z5 [x] is a polynomial of degree 2, and thus is reducible
over Z5 if and only if it has a root in Z5 . But
p̄(0) = 4, p̄(1) = 1, p̄(2) = 2, p̄(3) = 2, p̄(4) = 1,
showing that p̄(x) has no roots in Z5 , and is therefore irreducible over Z5 .
By the Mod-p Test, p(x) is irreducible over Q. On the other hand, we can
factor p(x) = g(x)h(x) as a product of non-invertible elements by setting
g(x) = 2 and h(x) = x2 + 2 ∈ Z[x], proving that p(x) is reducible over Z.

3.14. Eisenstein’s Criterion. Let q(x) = qn xn +qn−1 xn−1 +⋯+q1 x+q0 ∈ Z[x]
be a polynomial of degree n. Suppose that there exists a prime number p ∈ N such
that
(i) p ∤ qn ,
(ii) p ∣ qj for 0 ≤ j ≤ n − 1, but
(iii) p2 ∤ q0 .

Then q(x) is irreducible over Q.


Proof.
We shall argue by contradiction. Suppose that q(x) is reducible over Q. By
Theorem 3.8, it follows that it is reducible over Z with factors of degree at least one.
Thus we can find an integer 1 ≤ m < n, and polynomials g(x) = am xm +
am−1 xm−1 + ⋯a1 x + a0 and h(x) = bn−m xn−m + bn−m−1 xn−m−1 + ⋯b1 x1 + b0 ∈ Z[x]
such that
q(x) = g(x)h(x).
Then q0 = a0 b0 . Since p ∣ q0 but p2 ∤ q0 , it follows that either
● p ∣ a0 , p2 ∤ a0 and p ∤ b0 , or
● p ∣ b0 , p2 ∤ b0 and p ∤ a0 .
By interchanging g(x) and h(x) if necessary, we may assume without loss of gener-
ality that the first case holds.
Note that p ∤ qn implies that p ∤ am and p ∤ bn−m .
Let κ0 ∶= min{1 ≤ k ≤ m ∶ p ∣ ak−1 but p ∤ ak }. This minimum exists, since
p ∣ a0 while p ∤ am ; moreover, by minimality of κ0 , p ∣ ai for all 0 ≤ i < κ0 .
Observe that there exists 0 ≤ j ≤ κ0 such that
qκ0 = b0 aκ0 + b1 aκ0 −1 + ⋯ + bj aκ0 −j .
Since p ∣ aκ0 −i for 1 ≤ i ≤ j, we have that p ∣ bi aκ0 −i for 1 ≤ i ≤ j, but p ∤ b0 aκ0 , whence
p ∤ qκ0 . This forces κ0 = n, as per the hypotheses of the theorem.
But the definition of κ0 meant that κ0 ≤ m < n, a contradiction. This completes
the proof.
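The hypotheses of Eisenstein's Criterion are purely arithmetic, so they are easy to verify mechanically. Here is a small Python sketch of our own (the function name is ours, and this is only a hypothesis checker, not a proof of the criterion).

```python
def eisenstein_applies(coeffs, p):
    """Check the hypotheses of Eisenstein's Criterion for the prime p.

    coeffs lists q0, q1, ..., qn (constant term first): p must divide
    every coefficient except the leading one, and p^2 must not divide q0.
    """
    q0, qn = coeffs[0], coeffs[-1]
    return (qn % p != 0
            and all(c % p == 0 for c in coeffs[:-1])
            and q0 % (p * p) != 0)

# x^3 + 9x^2 + 12x + 6 with p = 3 (cf. Example S7.8 below): criterion holds.
ok = eisenstein_applies([6, 12, 9, 1], 3)
# 3x^4 + 5x + 1 (cf. Example 3.15(c)): no prime works; p = 2 already fails.
not_ok = eisenstein_applies([1, 5, 0, 0, 3], 2)
```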

3.15. Examples.
(a) Consider q(x) = 7x3 − 6x2 + 3x + 9 ∈ Z[x].
Let us set p = 2 and apply the Mod-2 Test.
Then q̄(x) = x3 − 0x2 + x + 1 = x3 + x + 1 ∈ Z2 [x]. This has degree three.
If it were to be reducible over Z2 , it would need to have roots in Z2 , by
Proposition 2.5.
But q̄(0) = 0 + 0 + 1 = 1 ≠ 0 and q̄(1) = 1 + 1 + 1 = 1 ≠ 0 in Z2 , so that
q̄(x) is irreducible over Z2 , and therefore q(x) is irreducible over Q by the
Mod-2 Test.
(b) Consider q(x) = 25x5 − 9x4 + 3x2 − 12 ∈ Z[x].
Set p = 3. Then p ∣ −12, p ∣ 3, p ∣ −9, while p ∤ 25 and p2 ∤ 12. By
Eisenstein’s Criterion, q(x) is irreducible over Q.
(c) Consider q(x) = 3x4 + 5x + 1 ∈ Z[x].
Since the constant term in q(x) is 1, Eisenstein’s Criterion will not help
us here. Let us try the Mod-p Test. Note that we can’t take p = 3, because
that would reduce the degree of q(x), which we are not allowed to do. Let
us try p = 2.

In this case q̄(x) = x4 + x + 1 ∈ Z2 [x]. If q̄(x) is reducible in Z2 [x],
then we can find r(x), s(x) ∈ Z2 [x] such that
q̄(x) = r(x)s(x)
and max(deg (r(x)), deg (s(x))) < 4. Hence, after relabelling r(x) and s(x)
if necessary, we are led to two possibilities:
(i) deg(r(x)) = 1 and deg(s(x)) = 3, or
(ii) deg(r(x)) = 2 = deg(s(x)).
(The case where deg(s(x)) = 1 and deg(r(x)) = 3 is the case where we
simply relabel r(x) and s(x).)
If deg(r(x)) = 1, then r(x) = 1x + r0 . But then r0 ∈ Z2 is a root of r(x),
meaning that it is a root of q̄(x). Since q̄(0) = 1 = q̄(1), we see that this is
impossible.
Thus we must consider the case where r(x) = x2 + r1 x + r0 and s(x) =
x2 + s1 x + s0 . (The fact that these both have degree two means that the
coefficients of x2 are both 1, which we have suppressed in the notation
above.)
Thus
q̄(x) = x4 + x + 1
= r(x)s(x)
= (x2 + r1 x + r0 )(x2 + s1 x + s0 )
= x4 + (r1 + s1 )x3 + (r0 + r1 s1 + s0 )x2 + (r1 s0 + r0 s1 )x + r0 s0 .
Now we try to solve for the ri ’s and si ’s. Since r0 s0 = 1, we must have
r0 = 1 = s0 . Hence the coefficient of x is 1 = r1 + s1 , while the coefficient of
x3 is 0 = r1 + s1 , a contradiction.
Thus q̄(x) is irreducible over Z2 , and so by the Mod-2 Test, q(x) is
irreducible over Q.
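The Mod-2 computations in parts (a) and (c) are entirely mechanical, and a short script can carry them out. A Python sketch of our own (helper names are ours; for degree at most 3, root-freeness over Zp suffices for irreducibility, by Proposition 2.5):

```python
def reduce_mod(coeffs, p):
    """Reduce an integer polynomial (constant term first) modulo p,
    dropping leading zeros, since the degree may fall."""
    r = [c % p for c in coeffs]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r

def has_root_mod_p(coeffs, p):
    """Brute-force search for a root of the reduced polynomial in Z_p."""
    return any(sum(c * pow(a, i, p) for i, c in enumerate(coeffs)) % p == 0
               for a in range(p))

# Example 3.15(a): q(x) = 7x^3 - 6x^2 + 3x + 9 reduces to x^3 + x + 1 mod 2.
q = [9, 3, -6, 7]
qbar = reduce_mod(q, 2)
degree_preserved = len(qbar) == len(q)
root_free = not has_root_mod_p(qbar, 2)  # degree 3 + no roots => irreducible
```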

Supplementary Examples.

S7.1. Example. Let p ∈ N be a prime number. The pth cyclotomic polyno-


mial
Φp (x) ∶= (x^p − 1)/(x − 1) = x^{p−1} + x^{p−2} + ⋯ + x + 1
is irreducible over Q.
Proof. As we have seen in Theorem 3.8, if Φp (x) is reducible over Q, then Φp (x)
is reducible over Z.
Now,
g(x) ∶= Φp (x + 1) = \frac{(x + 1)^p − 1}{(x + 1) − 1}
= \frac{x^p + \binom{p}{1} x^{p−1} + ⋯ + \binom{p}{p−1} x + 1 − 1}{x}
= x^{p−1} + \binom{p}{1} x^{p−2} + ⋯ + \binom{p}{p−2} x + \binom{p}{p−1}
= x^{p−1} + \binom{p}{1} x^{p−2} + ⋯ + \binom{p}{p−2} x + p.
Astoundingly, this polynomial g(x) satisfies Eisenstein’s Criterion for the prime p, and thus
g(x) is irreducible over Q.
Suppose now that
Φp (x) = r(x)s(x) ∈ Z[x]
is a non-trivial factorisation (i.e. max(deg(r(x)), deg(s(x))) < p − 1). Then
g(x) = Φp (x + 1) = r(x + 1)s(x + 1)
is also a non-trivial factorisation (check! ), contradicting the irreducibility of g(x).
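The binomial expansion above can be double-checked numerically. The following Python sketch (ours, using the standard-library binomial function) computes the coefficients of Φp(x + 1) and verifies Eisenstein's hypotheses for p = 7.

```python
from math import comb

def shifted_cyclotomic_coeffs(p):
    """Coefficients (constant term first) of Phi_p(x + 1) = ((x+1)^p - 1)/x."""
    # (x + 1)^p = sum_k C(p, k) x^k; subtracting 1 kills the k = 0 term,
    # and dividing by x shifts every exponent down by one.
    return [comb(p, k) for k in range(1, p + 1)]

coeffs = shifted_cyclotomic_coeffs(7)
monic = coeffs[-1] == 1
eisenstein_at_7 = (all(c % 7 == 0 for c in coeffs[:-1])
                   and coeffs[0] % 49 != 0)
```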

S7.2. Example. Let p(x) = x2 + 1 ∈ R[x]. Note that p(x) is irreducible since
deg p(x) = 2 and p(x) has no roots in R; i.e. p(x) ≥ 1 > 0 for all x ∈ R.
Note, however that the polynomial q(x) = x4 + 2x2 + 1 = (x2 + 1)(x2 + 1) is
reducible, despite the fact that it has no roots over R.
S7.3. Example. One of the most beautiful results concerning the field of com-
plex numbers is that it is algebraically closed. This is the statement that if
p(x) ∈ C[x] is a non-constant polynomial, then p(x) admits a root in C, and is
usually referred to as the Fundamental theorem of algebra.
The standard, and the most accessible proof of this derives from the theory of
complex variables, and is beyond the scope of this course. It is remarkable, to say
the least, that such a “purely algebraic” concept can be proven using very “analytic
techniques”. This is why mathematics is better than chocolate, and almost as good
as potato chips. Of course, here we are not being so impudent as to compare

mathematics to bacon and hickory-flavoured potato chips, which are in a league of


their own.

As an immediate consequence – of the Fundamental Theorem of Algebra, and


not of the sublime and incomparable exquisiteness of bacon and hickory-flavoured
potato chips – if p(x) ∈ C[x] is a polynomial of degree n ≥ 1, there exist α1 , α2 , . . . , αn
(not necessarily distinct) and β ∈ C such that
p(x) = β(x − α1 )(x − α2 )⋯(x − αn ).

S7.4. Example. This has the following remarkable consequence. Suppose


that p(x) ∈ R[x] is a monic polynomial of degree n ≥ 1. From the Fundamental
Theorem of algebra, thinking of p(x) as a polynomial in C[x], we can find n elements
α1 , α2 , ⋯, αn ∈ C such that
p(x) = (x − α1 )(x − α2 )⋯(x − αn ).
Since the coefficients of p(x) are all real numbers, we see that p̄(x) = p(x), where
p̄(x) denotes the polynomial obtained by conjugating each coefficient of p(x). Thus
p(x) = p̄(x) = (x − ᾱ1 )(x − ᾱ2 )⋯(x − ᾱn ).
In other words, α ∈ C is a root of p(x) if and only if ᾱ is a root of p(x).
Suppose that p(x) is irreducible over R. If deg(p(x)) ≠ 1, then p(x) admits no
root in R, and so from above, we may find a non-real root α ∈ C of p(x), whose
complex conjugate ᾱ ≠ α is also a root of p(x). Thus
p(x) = (x − α)(x − ᾱ)q(x) = (x2 − 2Re(α)x + ∣α∣2 )q(x).
Note that r(x) ∶= x2 − 2Re(α)x + ∣α∣2 ∈ R[x]. Thus
r(x)q(x) = p(x) = p̄(x) = r̄(x)q̄(x) = r(x)q̄(x).
Since R[x] is an integral domain (by virtue of the fact that R is a field), we see that
q(x) = q̄(x), implying that q(x) ∈ R[x] as well.
Next, since r(x) is a polynomial of degree 2, the irreducibility of p(x) over R
implies that q(x) must be invertible in R[x], and thus must have degree zero. This
shows that the only irreducible polynomials over R have degree 1 or 2.
The contrapositive of this statement says that if deg(p(x)) ≥ 3, then p(x) ∈ R[x]
is reducible.
S7.5. Example. Consider p(x) = (3/11)x3 + (5/11)x + (1/11) ∈ Q[x]. Then q(x) ∶= 11p(x) =
3x3 + 5x + 1 ∈ Z[x] is primitive. Let’s apply the Mod-2 Test.
q̄(x) = x3 + x + 1 ∈ Z2 [x]
is a polynomial of degree 3 with no roots in Z2 , and thus is irreducible there. By
the Mod-2 Test, q(x) is irreducible over Q, and thus p(x) is irreducible over Q as
well, being an associate of q(x).

S7.6. Example. Consider the polynomial p(x) = x3 + 5 ∈ Q[x]. The Mod-2


Test (resp. the Mod-3 Test) fails, since p̄(x) has a root in Z2 (resp. in Z3 ). That it
fails does not mean that p(x) is irreducible - it simply means that the Test doesn’t
tell us one way or another.
Note that the Mod-5 Test also fails, since p̄(x) = x3 ∈ Z5 [x] has x = 0 as a root
of multiplicity 3.
On the other hand, with p = 7, we find that p̄(x) = x3 + 5 has no roots in Z7
(check!), and so p̄(x) is irreducible in Z7 [x] (it has degree 3 and no roots). By the
Mod-7 Test, p(x) ∈ Q[x] is irreducible.
S7.7. Example. Let’s look at the polynomial p(x) = x3 + 5 ∈ Q[x] once again.
Since the coefficients are integers, we know that it is reducible over Q only if it is
reducible over Z, and furthermore, we can choose the factors of p(x) in Z[x] to have
the same degrees as those in Q[x]. Since deg(p(x)) = 3, this can only happen if
one (or more) of the irreducible factors has degree 1. Furthermore, the fact that the
leading coefficient of p(x) is equal to 1 means that the leading coefficients of the
factors must either both be 1 or −1. Without loss of generality, we may assume that
they are both one, for otherwise we simply multiply each factor by −1.
Thus there exists α ∈ Z such that q(x) = (x − α) ∣ p(x). Equivalently,
p(x) = (x − α)r(x)
for some r(x) = x2 + r1 x + r0 ∈ Z[x]. In particular, p(x) has a root in Z! Thus
5 = −αr0 , where α, r0 ∈ Z. This forces α, r0 ∈ {−5, −1, 1, 5}. But p(−5) = −120,
p(−1) = 4, p(1) = 6 and p(5) = 130. Since none of these values is zero, p(x) has no
roots in Z, and so p(x) must be irreducible over Z, and hence irreducible over Q.
Fortunately, we have arrived to the same conclusion as before. Some might say
that that was to be expected, but perhaps we can view this as an opportunity for
us to show a bit more humility and learn to feel less entitled.
S7.8. Example. Let q(x) = x3 + 9x2 + 12x + 6 ∈ Z[x]. Let p = 3 ∈ N, so that p
is prime. Clearly p ∤ 1, while p ∣ 9, p ∣ 12, p ∣ 6 but p2 = 9 ∤ 6.
By Eisenstein’s Criterion, q(x) is irreducible over Q.
S7.9. Example. Let p(x) = x3 + 5 ∈ Q[x] yet again. If r = 5, then r ∤ 1, r ∣ 0,
r ∣ 0 and r ∣ 5, but r2 = 25 ∤ 5, so again, by Eisenstein’s Criterion (applied with the
prime r), p(x) is irreducible over Q.
The reader may or may not agree that this evokes certain memories of cats and
taxidermists.
S7.10. Example. Suppose that F is a field and that p(x), q(x) ∈ F[x] satisfy
gcd(p(x), q(x)) = 1. (We say that p(x) and q(x) are relatively prime.)
If r(x) ∈ F[x] and deg(r(x)) < deg(p(x)) + deg(q(x)), then there exist polyno-
mials f (x) and g(x) ∈ F[x] such that
r(x)/(p(x)q(x)) = f (x)/p(x) + g(x)/q(x).

Indeed, by Euclid’s algorithm, we know that we may write


1 = gcd(p(x), q(x)) = t(x)p(x) + s(x)q(x)
for some t(x), s(x) ∈ F[x]. Thus
r(x)/(p(x)q(x)) = r(x)t(x)p(x)/(p(x)q(x)) + r(x)s(x)q(x)/(p(x)q(x)) = r(x)t(x)/q(x) + r(x)s(x)/p(x).
What extra conditions would we need to impose upon f (x) and g(x) to guarantee
that they are unique?
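A concrete instance of this construction can be checked with exact rational arithmetic. The sketch below (Python with the standard-library Fraction type; the specific choice p(x) = x − 1, q(x) = x + 1, r(x) = 1 is ours) verifies the Bézout identity and the resulting partial-fraction split at a few sample points.

```python
from fractions import Fraction

# Here 1 = t(x)p(x) + s(x)q(x) with the constant polynomials t = -1/2
# and s = 1/2, so r/(pq) = rt/q + rs/p becomes
#     1/((x-1)(x+1)) = (-1/2)/(x+1) + (1/2)/(x-1).
t = Fraction(-1, 2)
s = Fraction(1, 2)

def lhs(x):
    return Fraction(1) / ((x - 1) * (x + 1))

def rhs(x):
    return t / (x + 1) + s / (x - 1)

samples = [Fraction(n) for n in (2, 3, 5, -4)]
bezout = all(t * (x - 1) + s * (x + 1) == 1 for x in samples)
agree = all(lhs(x) == rhs(x) for x in samples)
```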

Appendix

A7.1. As we have seen, every Euclidean domain is a Principal Ideal Domain


pid, and every Principal Ideal Domain is a Unique Factorisation Domain ufd. I’ve
got this friend (whose identity there is absolutely no good reason to reveal) who
remembers this using the mnemonic Ed Piduffed. This friend tells me that he’s
remembered this result for over 40 years using this mnemonic. Well, you don’t get
to choose your friends, I suppose.
A7.2. The proof of the Fundamental Theorem of Algebra using Complex Anal-
ysis relies on a result known as Liouville’s Theorem which states that if f ∶ C → C
is a non-constant holomorphic function, then f cannot be bounded. That is, there does not exist 0 < M ∈ R
such that
∣f (z)∣ ≤ M for all z ∈ C.

Suppose that p(z) = p0 + p1 z + ⋯ + pn z n ∈ C[x] is a polynomial of degree n ≥ 2
which has no roots in C. Then p(z) is holomorphic on C and so h(z) ∶= 1/p(z) is also
holomorphic on C (because p(z) ≠ 0 for all z ∈ C).
The trick is to show that h(z) is bounded. To do this, one first finds 0 < R ∈ R
such that
M1 ∶= sup{∣h(z)∣ ∶ ∣z∣ > R} < ∞,
and then uses the so-called compactness of {z ∈ C ∶ ∣z∣ ≤ R} to argue that
M2 ∶= sup{∣h(z)∣ ∶ ∣z∣ ≤ R} < ∞.
Finally, setting M ∶= max(M1 , M2 ), we find that
∣h(z)∣ ≤ M for all z ∈ C,
a contradiction of Liouville’s Theorem.
This is helpful to know only insofar as one is more willing to believe Liouville’s
result without proof than the Fundamental Theorem of Algebra, or one actually
knows Liouville’s Theorem.

Exercises for Chapter 7

Exercise 7.1.
Prove that p(x) = x2 − 2 is irreducible over Q.

Exercise 7.2.
Prove that p(x) = x3 + x2 + 2 is irreducible over Q.

Exercise 7.3.
Let q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ Q[x] be a polynomial of degree n ≥ 1.
Prove that there exist integers a, b ∈ Z and p0 , p1 , . . . , pn ∈ Z with gcd(p0 , p1 , . . . , pn ) =
1 such that
q(x) = (a/b)(pn xn + pn−1 xn−1 + ⋯ + p1 x + p0 ).

Exercise 7.4.
Let p(x) = x4 − 2x3 + x + 1. Prove that p(x) is irreducible over Q.

Exercise 7.5.
Let p(x) = 16x5 − 9x4 + 3x2 + 6x − 21 ∈ Q[x]. Prove that p(x) is irreducible
over Q.

Exercise 7.6.
Let F be a field. Prove that F[x, y] is not a pid.

Exercise 7.7.
Apply the division algorithm to f (x) = 6x4 −2x3 +x2 −3x+1 and g(x) = x2 +x−2 ∈
Z7 [x] to find the quotient and remainder upon dividing f (x) by g(x).

Exercise 7.8.
Find all of the roots of p(x) = 5x3 + 4x2 − x + 9 in Z12 .

Exercise 7.9.
Find an invertible element p(x) of Z4 [x] such that deg(p(x)) > 1.

Exercise 7.10.
Give two different factorisations of p(x) = x2 + x + 8 in Z10 [x].
CHAPTER 8

Vector spaces

There are so many different kinds of stupidity, and cleverness is one


of the worst.
Thomas Mann

1. Definition and basic properties


1.1. Most of us will have seen vector spaces (over R) in a first course in linear
algebra. Vector spaces – at least finite-dimensional vector spaces over a general field
– are in many ways similar to the familiar n-dimensional Rn as a vector space over
R.
Our goal in this Chapter is not to fully develop the theory of vector spaces.
Rather, we concentrate only on those results we shall need to discuss field extensions
in the next chapter. As such, we shall demonstrate that every vector space admits
a basis, and in the case where the vector space V admits a finite basis over a field F,
we shall then prove that every basis for V over F has the same number of elements,
allowing us to define the dimension of that space. (A more general result asserts
that for any vector space V over F, given any two bases B and C for V, there exists
a bijection from B to C. We shall not require that here.)
1.2. Definition. Let F be a field. A vector space V over F is a non-empty
set whose elements are called vectors. The set V is equipped with two operations,
namely addition (denoted by +) and scalar multiplication (denoted by ⋅ or simple
juxtaposition) which satisfy the following:
(a) (V, +) is an abelian group.
● Given x, y ∈ V, x + y ∈ V.
● Given x, y, z ∈ V, (x + y) + z = x + (y + z).
● There exists a neutral element 0 ∈ V such that 0 + x = x = x + 0 for all
x ∈ V.
● Given x ∈ V, there exists y ∈ V such that x + y = 0 = y + x. (As it is
unique, we write −x for y.)
● x + y = y + x for all x, y ∈ V.
(b) Scalar multiplication satisfies the following.
● Given κ ∈ F, x ∈ V, we have that κx ∈ V.

● 1x = x for all x ∈ V.
● κ(x + y) = κx + κy for all κ ∈ F, x, y ∈ V.
● (κ + λ)x = κx + λx for all κ, λ ∈ F, x ∈ V.
● κ(λx) = (κλ)x for all κ, λ ∈ F, x ∈ V.

1.3. Examples.
(a) Let n ∈ N be an integer. If p ∈ N is a prime, then Znp ∶= {(x1 , x2 , . . . , xn ) ∶
xj ∈ Zp , 1 ≤ j ≤ n} forms a vector space over Zp using the familiar operations
(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn ) ∶= (x1 + y1 , x2 + y2 , . . . , xn + yn )
and
κ(x1 , x2 , . . . , xn ) ∶= (κx1 , κx2 , . . . , κxn )
for all (x1 , x2 , . . . , xn ), (y1 , y2 , . . . , yn ) ∈ Znp and κ ∈ Zp .
(b) Let V = C([0, 1], C) ∶= {f ∶ [0, 1] → C ∶ f is continuous}. Then C([0, 1], C)
is a vector space over C using the operations
(f + g)(x) ∶= f (x) + g(x),
and
(κf )(x) = κ(f (x))
for all f, g ∈ C([0, 1], C) and κ ∈ C.
(c) If V is a vector space over C, then V is a vector space over R and over
Q, using the same operations, only restricting the scalar multiplication to
scalars in R or to scalars in Q respectively.

1.4. Definition. A subspace W of a vector space V over a field F is a non-


empty subset of V which is also a vector space over F.
We write W ≤ V to indicate that W is a subspace of V.
The familiar Subspace Test for subspaces of Rn from linear algebra carries
over to a completely general setting, and we leave the proof to the reader.

1.5. The Subspace Test. A non-empty subset W of a vector space V over a


field F is a subspace of V if and only if
(a) x, y ∈ W implies that x + y ∈ W, and
(b) x ∈ W and κ ∈ F implies that κx ∈ W.
Alternatively, we may replace these two conditions by the single condition that
κx + y ∈ W for all κ ∈ F and x, y ∈ W.

1.6. Notation. Given a subset X of a vector space V over a field F, we denote


by
⟨X⟩ ∶= ∩{W ∶ W ≤ V and X ⊆ W}.
This defines the smallest subspace (relative to the partial order of inclusion) of
V that contains X.

1.7. Definition. Let V be a vector space over a field F. A subset X ⊆ V is said


to be linearly dependent if there exist distinct x1 , x2 , . . . , xm ∈ X and κ1 , κ2 , . . . , κm ∈ F∖{0}
such that
κ1 x1 + κ2 x2 + ⋯ + κm xm = 0.
Otherwise, we say that X is linearly independent.

1.8. Example. The set {sin x, cos x, sin(2x), cos(2x)} is linearly independent
in the vector space C((−1, 1), R) over R.

Consider
0 = κ1 sin x + κ2 cos x + κ3 sin(2x) + κ4 cos(2x).
By differentiating three times we obtain the three new equations
(1) 0 = κ1 sin x + κ2 cos x + κ3 sin(2x) + κ4 cos(2x)
(2) 0 = κ1 cos x − κ2 sin x + 2κ3 cos(2x) − 2κ4 sin(2x)
(3) 0 = −κ1 sin x − κ2 cos x − 4κ3 sin(2x) − 4κ4 cos(2x)
(4) 0 = −κ1 cos x + κ2 sin x − 8κ3 cos(2x) + 8κ4 sin(2x).
These equations must hold for all x ∈ (−1, 1), and so setting x = 0, we obtain that
(5) 0 = κ2 + κ4
(6) 0 = κ1 + 2κ3
(7) 0 = −κ2 − 4κ4
(8) 0 = −κ1 − 8κ3 .
An easy calculation now shows that κ1 = κ2 = κ3 = κ4 = 0, and thus that the set
{sin x, cos x, sin(2x), cos(2x)} is indeed linearly independent.
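The "easy calculation" at the end can be made explicit: the system (5)-(8) decouples into two 2 × 2 linear systems, and a nonzero determinant for each forces the trivial solution. A small Python sketch of ours:

```python
# Equations (5) and (7) involve only (kappa_2, kappa_4); equations (6)
# and (8) involve only (kappa_1, kappa_3).  Each 2x2 system has a
# nonzero determinant, so each admits only the zero solution.
def det2(a, b, c, d):
    return a * d - b * c

d_24 = det2(1, 1, -1, -4)   # from 0 = k2 + k4 and 0 = -k2 - 4*k4
d_13 = det2(1, 2, -1, -8)   # from 0 = k1 + 2*k3 and 0 = -k1 - 8*k3
only_trivial_solution = d_24 != 0 and d_13 != 0
```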

The student who is not already familiar with partially and totally ordered sets
should consult the Appendix to Chapter 5.
1.9. Definition. Let (X, ≤) be a partially ordered set. A chain in X is a non-
empty subset C ⊆ X which is totally ordered. Thus, given c1 , c2 ∈ C, either c1 ≤ c2 or
c2 ≤ c1 .
Let ∅ ≠ Y ⊆ X. An upper bound in X for Y is an element u ∈ X such that
y ≤ u for all y ∈ Y .

1.10. Example. Let X ∶= {a, b, c, d}, and partially order the power set P(X) ∶=
{Y ∶ Y ⊆ X} by inclusion: i.e. Y1 ≤ Y2 if and only if Y1 ⊆ Y2 .

Then
C ∶= {∅, {a}, {a, c}, {a, c, d}}
is a chain in P(X), while
D ∶= {∅, {a}, {b}, {a, b}}

is not a chain in P(X), since {a} and {b} are not comparable. That is, {a} ≰ {b}
and {b} ≰ {a}.

1.11. Definition. Let V be a vector space over a field F. Let


L ∶= {L ⊆ V ∶ L is linearly independent}.
We partially order L by inclusion, so that L1 ≤ L2 if and only if L1 ⊆ L2 .
A basis for V is a maximal element B of L.

The proof that every vector space over a field F admits a basis is much deeper
than it may sound. It is known to be equivalent to the Axiom of Choice, and to
Zorn’s Lemma, which we now state. We shall go into further detail on the Axiom
of Choice in Appendix A. In fact, we shall now prove “half” of the equivalence by
using Zorn’s Lemma to prove that every vector space admits a basis.

1.12. Zorn’s Lemma. Let (X, ≤) be a non-empty poset. Suppose that every
chain C in X has an upper bound in X. Then X admits a maximal element.

1.13. Theorem. Let V be a vector space over a field F. Then V admits a basis.

Proof. Since a basis is by definition a maximal (in terms of inclusion) linearly


independent set, this should point one in the direction of how to apply Zorn’s Lemma.
It suggests that we should first partially order all linearly independent subsets of V.
We shall also need to show that the collection of such sets is non-empty, and then
we shall have to consider chains of linearly independent sets.
Notice that we have required very little creativity to come up with this plan of
attack. The only inspiration required was to notice that Zorn’s Lemma is a tool used
to produce maximal elements in certain partially ordered sets. Having formulated
our strategy, let’s begin.
Let
L ∶= {L ⊆ V ∶ L is linearly independent}.
Given L1 , L2 ∈ L, set L1 ≤ L2 if and only if L1 ⊆ L2 . In other words, we partially
order L by inclusion.
First we need to know that L ≠ ∅. We are inspired by the case where W ∶= {0}.
We know that {0} is always a linearly dependent set (why?), and yet the above
statement assures us that W admits a basis. The only possibility is to take that
basis to be the empty set.
Now in general, is ∅ ∈ L? That is, is ∅ ⊆ V linearly independent? To answer
this, one should perhaps ask whether or not ∅ is linearly dependent!
Since we can not find x1 , x2 , . . . , xn ∈ ∅ in the first place, ∅ can not be linearly
dependent, and thus it is linearly independent. But then ∅ ∈ L, and so L ≠ ∅!
The rest of the proof is more straightforward. Let
C = {Lλ }λ∈Λ

be a chain in L. Thus each Lλ ∈ L. Define L ∶= ∪λ∈Λ Lλ .


We claim that L is linearly independent. Suppose that x1 , x2 , . . . , xn ∈ L. Then
there exist Lλ1 , Lλ2 , . . . , Lλn such that xj ∈ Lλj , 1 ≤ j ≤ n. But C is a chain, so one
of these Lλj ’s must include all of the others, say Lλj0 . Hence
xj ∈ Lλj ⊆ Lλj0 , 1 ≤ j ≤ n.
Since Lλj0 ∈ L, it is linearly independent, whence {x1 , x2 , . . . , xn } is also linearly
independent (why?).
This shows that L is linearly independent, and so L ∈ L and clearly L is an upper
bound for C, since Lλ ⊆ L for all λ ∈ Λ.
By Zorn’s Lemma, L has a maximal element, and this is precisely the definition
of a basis for V.

1.14. Exercise. Let V be a vector space over a field F, and let L ⊆ V be a


linearly independent set. Then there exists a basis B for V such that L ⊆ B.

1.15. Theorem. Let V be a vector space over a field F, and let B be a basis for
V. Then
(a) each element x ∈ V can be expressed as a finite linear combination of ele-
ments of B; and
(b) except for the presence of terms with a zero coefficient, this can be done in
a unique way.
Proof.
(a) Define
W ∶= span B = {κ1 x1 + κ2 x2 + ⋯ + κn xn ∶ n ∈ N, κj ∈ F, xj ∈ B, 1 ≤ j ≤ n}.

We leave it as an exercise for the reader to prove that W ≤ V; that is, that
W is a subspace of V. Suppose that W ≠ V, i.e. that there exists y ∈ V ∖ W.
We claim that B ∪ {y} is linearly independent in V, contradicting the
maximality of B as a linearly independent set in V.
Indeed, otherwise we can find x1 , x2 , . . . , xn ∈ B and κ1 , κ2 , . . . , κn+1 ∈ F,
not all zero, such that
κ1 x1 + κ2 x2 + ⋯ + κn xn + κn+1 y = 0.
If κn+1 ≠ 0, then
−1
y= (κ1 x1 + κ2 x2 + ⋯ + κn xn ) ∈ W,
κn+1
a contradiction.
Thus κn+1 = 0, and so
κ1 x1 + κ2 x2 + ⋯ + κn xn = 0,
implying that κj = 0 for all j since B is linearly independent.

This contradiction shows that W = V, i.e. that V consists of all finite


linear combinations of its basis elements.
(b) This is left as an exercise for the reader.

1.16. Definition. A vector space V over a field F is said to be finite


-dimensional if it admits a finite basis.

1.17. Theorem. Let V be a finite-dimensional vector space over a field F. Any


two bases B1 and B2 for V have the same number of elements.
Proof. We shall argue by contradiction. Suppose that B ∶= {x1 , x2 , . . . , xm } is a
basis for V, and that Y ∶= {y1 , y2 , . . . , ym+1 } is linearly independent in V. By Exer-
cise 1.14 above, Y ⊆ B2 for some basis B2 of V. We shall show that this leads to a
contradiction.
By Theorem 1.15, V = span B, and thus
y1 ∈ span B,
i.e.,
y1 = κ1 x1 + κ2 x2 + ⋯ + κm xm
for some choice of κj ∈ F, 1 ≤ j ≤ m. Since y1 ≠ 0, not all κj ’s are equal to zero. By
reindexing the xj ’s if necessary, we may assume that κ1 ≠ 0.
By Exercise 1 below, span{y1 , x2 , x3 , . . . , xm } = V.
Thus y2 ∈ span{y1 , x2 , x3 , . . . , xm }, say
y2 = α1 y1 + α2 x2 + ⋯ + αm xm .
Arguing as before, not all of α2 , α3 , . . . , αm can be zero, otherwise {y1 , y2 } are not
linearly independent. Say α2 ≠ 0. Then (exercise)
span{y1 , y2 , x3 , . . . , xm } = V.
Continuing in this manner, we get to
span{y1 , y2 , y3 , . . . , ym } = V.
But then ym+1 ∈ span{y1 , y2 , y3 , . . . , ym }, contradicting the fact that our original set
{y1 , y2 , y3 , . . . , ym , ym+1 } was linearly independent.
Hence any linearly independent subset Y of V has at most m elements.
If C ⊆ V were a basis with fewer than m elements, then the above argument
would show that B is not linearly independent, a contradiction.
Thus all bases for V have exactly m elements.

Theorem 1.17 is what allows us to make the following definition.


1.18. Definition. Let V be a finite-dimensional vector space over a field F.
The dimension of V over F is the number of elements in a basis for V, denoted by
dimF V.

1.19. Examples.
(a) If n ∈ N and p ∈ N is prime, then dimZp Znp = n .
(b) The base field is extremely important when determining the dimension of
a vector space. For example,
dimC C = 1, and dimR C = 2.
Indeed, B ∶= {1, i} is easily seen to be a basis for C over R, while D ∶= {1}
is a basis for C over itself.
(c) What should dimQ C be?

Supplementary Examples.

S8.1. Example. Let F be a field. For each 1 ≤ n ∈ N, Fn ∶= {(x1 , x2 , . . . , xn ) ∶


xj ∈ F, 1 ≤ j ≤ n} is a vector space over F using the operations
(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn ) ∶= (x1 + y1 , x2 + y2 , . . . , xn + yn )
and
κ(x1 , x2 , . . . , xn ) ∶= (κx1 , κx2 , . . . , κxn ).
In particular, if n = 1, this shows that F is a vector space over F. The verification
is left to the reader.

S8.2. Example. If ∅ ≠ X is a set, F is a field and f ∶ X → F is a function, we


define the support of f to be
supp(f ) ∶= {x ∈ X ∶ f (x) ≠ 0}.
The set FX ∶= {f ∶ X → F ∶ f a function} is a vector space over F, using the
operations
(f + g)(x) ∶= f (x) + g(x) and (κf )(x) ∶= κ(f (x)), x ∈ X.
The set
c00 (X, F) ∶= {f ∈ FX ∶ supp(f ) is finite}
is a subspace of FX .
A basis for c00 (X, F) is the set B ∶= {δx ∶ x ∈ X}, where
δx ∶ X → F is defined by δx (y) ∶= 1 if y = x, and δx (y) ∶= 0 if y ≠ x.
If X is finite, then FX = c00 (X, F).
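The basis property of the δx is easy to see in a toy model. In the Python sketch below (ours; the field F is modelled by the rationals via Fraction, and the finitely supported f is stored as a dictionary on its support), a function with finite support is rebuilt as the finite linear combination ∑ f(x) δx.

```python
from fractions import Fraction

def delta(x):
    """The function delta_x: X -> F taking the value 1 at x and 0 elsewhere."""
    return lambda y: Fraction(1) if y == x else Fraction(0)

X = ["a", "b", "c"]
f = {"a": Fraction(2), "c": Fraction(-5)}          # supp(f) = {a, c}

def combo(y):
    # The finite linear combination sum_{x in supp(f)} f(x) * delta_x.
    return sum((coeff * delta(x)(y) for x, coeff in f.items()), Fraction(0))

matches = all(combo(y) == f.get(y, Fraction(0)) for y in X)
```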

S8.3. Example. Let F be a field. Then F[x] is a vector space over F, where
κ(qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ) ∶= (κqn xn + κqn−1 xn−1 + ⋯ + κq1 x + κq0 )
for all q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ F[x]. (We are using the usual addition
in F[x].)
Again, the verification of this is left to the reader.

S8.4. Example. The ring C([0, 1], R) ∶= {f ∶ [0, 1] → R ∶ f is continuous} is a


vector space over R, using point-wise addition and setting
(κf )(x) ∶= κ(f (x)), x ∈ [0, 1], κ ∈ R.

S8.5. Example. Let F be a field, and n ∈ N. Then Mn (F) is a vector space


over F, where
[xi,j ] + [yi,j ] ∶= [xi,j + yi,j ]
and
κ[xi,j ] ∶= [κxi,j ].
Note that dimF Mn (F) = n². The reader is left to discover a basis for Mn (F)
over F.

S8.6. Example. Let F be a field and V be a vector space over F. If W is a


subspace of V, and if Y is a subspace of W, then Y is a subspace of V.

S8.7. Example. Let ∅ ≠ Λ be a set, and let F be a field. If, for each λ ∈ Λ, we
have that Vλ is a vector space over F, then
∏λ∈Λ Vλ
is a vector space over F, where
(xλ )λ + (yλ )λ ∶= (xλ + yλ )λ ,
and
κ(xλ )λ ∶= (κxλ )λ ,
for all κ ∈ F, (xλ )λ , (yλ )λ ∈ ∏λ∈Λ Vλ .

S8.8. Example. Let F be a field and n ∈ N. Let Pn (F) ∶= {qn xn + qn−1 xn−1 +
⋯ + q1 x + q0 ∶ qj ∈ F, 0 ≤ j ≤ n}. Then Pn (F) is a finite-dimensional subspace of the
infinite-dimensional vector space F[x] over F, and
dimF Pn (F) = n + 1.
A basis for Pn (F) is B ∶= {1, x, x2 , . . . , xn }.

S8.9. Example. Consider the polynomial q(x) = x2 +x+1 over Z2 , and observe
that it is irreducible over Z2 , by virtue of the fact that its degree is 2 and it has no
roots in Z2 .
As we have seen, F ∶= Z2 [x]/⟨x2 + x + 1⟩ is a field, and a general element of F
looks like
b1 x + b0 + ⟨q(x)⟩ ∶ b1 , b0 ∈ Z2 .
Then F is a vector space over Z2 , where
(b1 x + b0 + ⟨q(x)⟩) + (d1 x + d0 + ⟨q(x)⟩) ∶= (b1 + d1 )x + (b0 + d0 ) + ⟨q(x)⟩,
and
κ(b1 x + b0 + ⟨q(x)⟩) ∶= (κb1 x + κb0 ) + ⟨q(x)⟩.
A basis B for F over Z2 is {1 + ⟨q(x)⟩, x + ⟨q(x)⟩}, and thus
dimZ2 (F) = 2.
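Since dimZ2(F) = 2, the field F has exactly four elements, and its arithmetic can be tabulated by hand or by machine. The sketch below (ours; elements are stored as coefficient pairs (b0, b1) for b0 + b1 x) implements the quotient arithmetic, using the relation x2 = x + 1 that ⟨x2 + x + 1⟩ imposes (recall −1 = 1 in Z2).

```python
# Arithmetic in F = Z_2[x]/<x^2 + x + 1>, elements (b0, b1) = b0 + b1*x.
def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    b0, b1 = u
    d0, d1 = v
    c0 = b0 * d0
    c1 = b0 * d1 + b1 * d0
    c2 = b1 * d1
    # Reduce the x^2 term using x^2 = x + 1.
    return ((c0 + c2) % 2, (c1 + c2) % 2)

x = (0, 1)
x_sq = mul(x, x)            # expect x^2 = x + 1, i.e. (1, 1)
inv_x = (1, 1)              # candidate inverse of x
is_inverse = mul(x, inv_x) == (1, 0)
```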

S8.10. Example. Let V be a vector space over a field F and W, Y be subspaces


of V. We say that V is an (internal) direct sum of W and Y if
● W ∩ Y = {0}, and
● V = W + Y ∶= {w + y ∶ w ∈ W, y ∈ Y}.
In this case we write V = W ⊕ Y.
Note that if char(F) ≠ 2, then F4 = W ⊕ Y, where W = {(w, −w, 0, z) ∶ w, z ∈ F}
and Y = {(x, x, y, 0) ∶ x, y ∈ F}, whereas if char(F) = 2, then Y ∩ W ≠ {(0, 0, 0, 0)}.
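The dependence on the characteristic can be checked exhaustively for small fields. A Python sketch of ours, modelling Zp as {0, ..., p − 1}, enumerates W and Y and intersects them for p = 2 and p = 3:

```python
# W = {(w, -w, 0, z)} and Y = {(x, x, y, 0)} inside (Z_p)^4.
def W(p):
    return {(w, (-w) % p, 0, z) for w in range(p) for z in range(p)}

def Y(p):
    return {(x, x, y, 0) for x in range(p) for y in range(p)}

inter_char2 = W(2) & Y(2)   # char 2: contains the nonzero vector (1,1,0,0)
inter_char3 = W(3) & Y(3)   # char 3 (odd): only the zero vector
```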

Appendix

A8.1. Vector spaces over general fields behave for the most part like vector
spaces over R or over C, although we must always be aware of those few cases where
a property of the underlying field (such as finiteness, or the finiteness of its character)
plays a role (e.g. Example 1.12).
Since we assume that the reader has already seen vector spaces over R in a linear
algebra course, this chapter should be quite familiar, with the possible exception of
the proof that every vector space admits a basis. We should point out that when a
vector space is infinite-dimensional, it may be very, very difficult to actually describe
a basis, and Theorem 1.13 merely asserts that one exists.
A8.2. Students who have seen Calculus or Fourier Analysis may have seen us
write, for example,
exp(x) ∶= e^x = 1 + x/1! + x^2 /2! + x^3 /3! + ⋯.
This is NOT a linear combination of the functions gn (x) ∶= x^n /n!, n ≥ 0. This is a
series expansion. Be aware that we can NEVER add up infinitely many things in
mathematics. The most general thing we can do is to take finite sums and then take
limits of these.
The above expression really means exactly that. We are claiming that
exp(⋅) = lim_{N →∞} ∑_{n=0}^{N} gn .

What we mean by the limit depends in general on the circumstance. In this instance,
we might be referring to point-wise limits, or to uniform convergence on closed and
bounded (i.e. so-called compact) subsets of R, or some other form of convergence
altogether.
The study of such series expansions falls outside the scope of a course on rings
and fields. We only bring it up to emphasise the fact that in vector spaces, we
are only ever allowed to consider finite sums of vectors, and in any setting – linear
algebra or otherwise – linear combinations only involve finitely many vectors. Be
warned.
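To see the point numerically, here is a small demonstration of mine (not from the notes): the finite partial sums S_N(1) = ∑_{n=0}^{N} 1/n! approach e, but no single finite sum equals it.

```python
# Partial sums of the exponential series at x = 1.
import math

def partial_sum(x, N):
    # S_N(x) = sum of x^n / n! for n = 0, ..., N  -- always a FINITE sum.
    return sum(x**n / math.factorial(n) for n in range(N + 1))

for N in (2, 5, 10, 15):
    print(N, partial_sum(1.0, N))
print(abs(partial_sum(1.0, 15) - math.e) < 1e-12)  # True
```

Each S_N is a genuine linear combination of the g_n; only the limit of these finite sums recovers exp.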

Exercises for Chapter 8

Exercise 8.1.
Let V and W be vector spaces over a field F. A map φ ∶ V → W is said to be
F-linear if φ(κx + y) = κφ(x) + φ(y) for all x, y ∈ V and κ ∈ F.
Let L(V, W) ∶= {φ ∶ V → W ∶ φ is F-linear}. If V = W, we abbreviate the
notation to L(V).
Prove that L(V, W) is a vector space over F using the operations
(φ + ψ)(x) ∶= φ(x) + ψ(x), x∈V
and
(κφ)(x) ∶= κ(φ(x)), x ∈ V, κ ∈ F.
Note. Both V and W must be vector spaces over the same field F. Also, if
we are only dealing with one field F at a time, and there is no risk of confusion,
we sometimes relax a bit and simply refer to these maps as linear (as opposed to
F-linear). Let’s face it – life is stressful enough and sometimes we just need to let
our hair down.

Exercise 8.2.
Let V and W be vector spaces over a field F, and let φ ∈ L(V, W). Prove that
ker φ ∶= {x ∈ V ∶ φ(x) = 0}
is a subspace of V, while ran φ ∶= {φ(x) ∶ x ∈ V} is a subspace of W.

Exercise 8.3.
Let V be a vector space over the field F. Prove that (L(V), +, ○) is a ring with
the operations
(φ + ψ)(x) ∶= φ(x) + ψ(x), x ∈ V
and
(φ ○ ψ)(x) ∶= φ(ψ(x)), x ∈ V.

Exercise 8.4.
Let V be a vector space over the field F, and let L ⊆ V be a linearly independent
set. Show that L can be extended to a basis B for V. That is, show that there exists
a basis B ⊆ V such that L ⊆ B.

Exercise 8.5.
Let V and W be vector spaces over a field F, and let φ ∈ L(V, W). We say that
φ is a (vector space) isomorphism if φ is bijective.
Prove that the following statements are equivalent.
(a) φ ∈ L(V, W) is an isomorphism.
(b) Given a basis B = {bλ ∶ λ ∈ Λ} for V, the set D ∶= {φ(bλ ) ∶ λ ∈ Λ} is a basis
for W.

Exercise 8.6.
Let V be a vector space over a field F, and let 0 ≠ φ ∈ L(V, F). Show that
ker φ
is a maximal proper subspace of V. That is, if W ≤ V is a subspace of V and
ker φ ≤ W ≤ V,
then either W = ker φ or W = V.

Exercise 8.7.
Let V be a vector space over the field F, and let W be a subspace of V. We
define a relation ∼ on V as follows: given x, y ∈ V, we set
x ∼ y if and only if x − y ∈ W.
(a) Prove that ∼ is an equivalence relation on V.
(b) As we did in the case of rings, we refer to the equivalence class x + W ∶= [x]
of x ∈ V as a coset of W. Prove that the set V/W ∶= {[x] ∶ x ∈ V} of
cosets of W forms a vector space using the operations
(x + W) + (y + W) ∶= (x + y) + W,
and
κ(x + W) ∶= (κx) + W.
Of course, the first thing you will have to do is to verify that these operations
are well-defined.

Exercise 8.8.
Prove the First Isomorphism Theorem for vector spaces:
If V and W are vector spaces over a field F and φ ∶ V → W is an F-linear map,
then
ran φ = φ(V) ≃ V/ker φ.

Exercise 8.9.
Prove the Second Isomorphism Theorem for vector spaces:
Let W and Y be subspaces of a vector space V over a field F, and set
W + Y ∶= {w + y ∶ w ∈ W, y ∈ Y}.
Prove that W + Y is a subspace of V, and
(W + Y)/Y ≃ W/(W ∩ Y).

Exercise 8.10.
Prove the Third Isomorphism Theorem for vector spaces:
Let W and Y be subspaces of a vector space V over a field F, and suppose
that Y ≤ W. Prove that
V/W ≃ (V/Y)/(W/Y).
CHAPTER 9

Extension fields

We’re intellectual opposites. Well, I’m intellectual and you’re opposite.
Mae West

1. A return to our roots (of polynomials)


1.1. Many of the problems which have arisen in the history of mathematics have
had at their heart the problem of finding solutions to polynomial equations. For
example, given a square whose sides each have length 1, the Pythagorean Theorem
assures us that the length x of a diagonal of the square is given by the solution to
x^2 = 1^2 + 1^2 = 2, or equivalently, to the equation x^2 − 2 = 0. In this way, one sees that
the desire to study very basic geometrical problems immediately creates the need to
extend the rational numbers to a larger field where this equation admits a solution.
Unsurprisingly, we turn to the field of real numbers. But even there, we are met with
disappointment. The very simple quadratic polynomial q(x) = x^2 + 1 ∈ R[x]
is irreducible. Since it is irreducible and of degree 2, it has no roots in R.
Again, we remedy this by introducing a larger field, namely C, where i = √−1 exists
and is a solution to that equation.
The goal in this Chapter is to generalise this phenomenon to more general fields.
Given F and a polynomial q(x) ∈ F[x], we wish to show that it is always possible to
construct a larger field E that contains F and also contains all of the roots of q(x).
The construction is somewhat abstract, and may not be to everyone’s taste, even
though it is perfectly acceptable from a mathematical viewpoint.
1.2. Definition. Let F be a field. An extension field of F is a pair (E, ι),
where E is a field and ι ∶ F → E is an embedding, that is, an injective homomor-
phism. In other words, E is an extension of F if Fι ∶= ι(F) is a subfield of E for
some isomorphism ι from F to Fι .

1.3. Remarks.
● We note that this is not the most common definition of an extension field
of a field F. Many authors define an extension field E of F simply as a
field E which contains F as a subfield, and we may do this if we identify F

with its image Fι ∶= ι(F) ⊆ E. There is no harm in doing this – indeed, this
is how we should interpret the notion of an extension field, provided that
we are careful about keeping track of this identification. In writing these
course notes however, we have taken the viewpoint that when first learning
a subject, it is good to be overly cautious, meticulous and probably a bit
pedantic, and to do the bookkeeping in plain sight for all to see.
● At times we shall have to simultaneously consider multiple extension fields
of a field F. When we are only dealing with one extension field (E, ι)
however, we shall use the notation bι ∶= ι(b) in the hope that it will make
the arguments more readable. This will also apply to polynomials in the
following way: suppose that q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ F[x] and
that (E, ι) is an extension of F. We shall denote by
q ι (x) ∶= qnι xn + qn−1ι xn−1 + ⋯ + q1ι x + q0ι
the corresponding polynomial in E[x] with coefficients qjι = ι(qj ), 0 ≤ j ≤ n.
● A brief comment on terminology: we shall also say that E is an extension
field of F via ι to mean that (E, ι) is an extension field of F.
● Finally, let (E, ι) be an extension field of F and q(x) ∈ F[x]. Then, in
the same way that we are (mentally) identifying F and Fι , we should be
identifying q(x) and q ι (x). Thus “finding a root for q(x)” in E really
means finding a root for q ι (x) in E! Despite this, we shall continue to use
the common abuse of terminology and refer to a root of q ι (x) in E as a
“root of q(x)”.

1.4. Examples.
(a) (R, ι) is an extension of Q, where ι(q) = q for all q ∈ Q.
Note that in this example, if q(x) = (3/2)x2 − (17/18)x + (222/1033) ∈ Q[x], then
q ι (x) = (3/2)ι x2 − (17/18)ι x + (222/1033)ι = (3/2)x2 − (17/18)x + (222/1033) ∈ R[x].
Hopefully the reader can see why we would want to abuse the terminology
in cases such as these, and refer to a root of q ι (x) in R as a “root of q(x)”.
(b) (C, ι) is an extension of R, where ι(x) = x ⋅ 1 + 0 ⋅ √−1 = x + i0 for all x ∈ R,
and therefore (C, ι∣Q ) is also an extension of Q. [Note that ι and i are
different! The first is an injective homomorphism, while the second is the
square root of −1 in C!]
(c) There does not exist an embedding φ ∶ Z7 → Z19 for which (Z19 , φ) is an
extension of Z7 . Indeed, if 1 ∈ Z7 denotes the multiplicative identity in Z7 ,
then the fact that char(Z7 ) = 7 implies that
0 = φ(7) = φ(1) + φ(1) + φ(1) + φ(1) + φ(1) + φ(1) + φ(1).
But the only element a ∈ Z19 that satisfies 7a = (a + a + a + a + a + a + a) = 0
is a = 0. Thus φ(1) = 0 = φ(0), contradicting the fact that φ is injective.
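The key step in part (c) is easily checked by machine (a verification of mine, not part of the notes): the only a ∈ Z19 with 7a = 0 is a = 0.

```python
# Solve 7a = 0 in Z19 by exhaustion; since gcd(7, 19) = 1, only a = 0 works,
# so any ring homomorphism Z7 -> Z19 must send 1 to 0 and cannot be injective.
solutions = [a for a in range(19) if (7 * a) % 19 == 0]
print(solutions)  # [0]
```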

In constructing extension fields which contain a root to a given polynomial, the


next result is the key to our construction. As previously mentioned – the construc-
tion is perhaps more abstract than some of us might have preferred.

1.5. Kronecker’s Theorem – The fundamental theorem of field theory.


Let F be a field and 0 ≠ q(x) be a non-constant polynomial in F[x]. Then there exists
an extension field (E, ι) of F such that q ι (x) has a root in E.
Proof. If q(x) already has a root in F, then we simply set E ∶= F and ι(a) = a for
all a ∈ F. Thus we suppose that q(x) does not have any roots in F.
Since F is a field, F[x] is a ufd. Thus we can factor q(x) as a product of
irreducible polynomials over F. Let p(x) denote one of these irreducible factors of
q(x). That is, q(x) = p(x)s(x) for some s(x) ∈ F[x], and p(x) is irreducible. Write
p(x) = pn xn + pn−1 xn−1 + ⋯ + p1 x + p0 .
Now consider E ∶= F[x]/⟨p(x)⟩. By Theorem 7.2.7, E is a field. The map
ι∶ F → E
a ↦ a + ⟨p(x)⟩
is easily seen to be an injective homomorphism. That is, for all a, b ∈ F,
ι(ab) = ab + ⟨p(x)⟩ = (a + ⟨p(x)⟩)(b + ⟨p(x)⟩) = ι(a)ι(b),
and
ι(a + b) = (a + b) + ⟨p(x)⟩ = (a + ⟨p(x)⟩) + (b + ⟨p(x)⟩) = ι(a) + ι(b).
Moreover, ker ι = {0}, since p(x) irreducible implies that deg p(x) ≥ 1, and thus
0 ≠ a ∈ F implies that 0 ≠ ι(a) = a + ⟨p(x)⟩ ∈ E.
Thus (E, ι) is an extension field of F.
We now leave it to the reader to verify that q ι (x) = pι (x)sι (x), and that pι (x)
is irreducible in Fι [x]. (This should be very easy if one understands what an iso-
morphism is.)
It is clear that any root of pι (x) is also a root of q ι (x). As such, we have reduced
the problem to finding roots of irreducible polynomials.
Here’s the kicker. Consider α ∶= x + ⟨p(x)⟩ ∈ E. Then αj = xj + ⟨p(x)⟩ for all j ≥ 1
and so
pι (α) = pιn αn + pιn−1 αn−1 + ⋯ + pι1 α1 + pι0
= pιn (xn + ⟨p(x)⟩) + pιn−1 (xn−1 + ⟨p(x)⟩) + ⋯ + pι1 (x + ⟨p(x)⟩) + pι0
= (pn + ⟨p(x)⟩)(xn + ⟨p(x)⟩) + (pn−1 + ⟨p(x)⟩)(xn−1 + ⟨p(x)⟩) + ⋯
⋯ + (p1 + ⟨p(x)⟩)(x + ⟨p(x)⟩) + (p0 + ⟨p(x)⟩)
= (pn xn + pn−1 xn−1 + ⋯ + p1 x + p0 ) + ⟨p(x)⟩

= p(x) + ⟨p(x)⟩
= 0 + ⟨p(x)⟩ ∈ E.
In other words, α is a root of pι (x), and hence of q ι (x) in E!
Are you kidding me? Nay! Am I kidding you? Nay nay!


Remark. We shall refer to the embedding ι of F into E = F[x]/⟨p(x)⟩ defined in
the proof of Kronecker’s Theorem by ι(a) = a + ⟨p(x)⟩ for all a ∈ F as the canonical
embedding of F into E. Thus
Fι ∶= ι(F) = {aι ∶= a + ⟨p(x)⟩ ∶ a ∈ F}
is a subfield of E which is isomorphic to F.
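The construction in the proof is concrete enough to compute with. Below is a small sketch of my own (not from the notes), taking F = Q and the irreducible polynomial p(x) = x^2 + 1: each coset is stored via its remainder b0 + b1 x upon division by p(x), and we check that α = x + ⟨p(x)⟩ satisfies p^ι(α) = 0.

```python
# Elements of E = Q[x]/<x^2 + 1> stored as remainders b0 + b1*x,
# i.e. pairs (b0, b1) of rational numbers.
from fractions import Fraction as Fr

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    # (b0 + b1 x)(d0 + d1 x), reduced using x^2 = -1 modulo <x^2 + 1>.
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])

alpha = (Fr(0), Fr(1))                               # alpha = x + <p(x)>
p_of_alpha = add(mul(alpha, alpha), (Fr(1), Fr(0)))  # p(alpha) = alpha^2 + 1
print(p_of_alpha == (0, 0))                          # True: alpha is a root
```

These are precisely the multiplication rules of the complex numbers restricted to rational coordinates, which is the sense in which the quotient construction manufactures a square root of −1.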

1.6. Examples.
(a) Let F = Q and q(x) = x2 + 1 ∈ Q[x]. Note that in this example, q(x) is al-
ready irreducible over Q, so q(x) will play the role of p(x) from Kronecker’s
Theorem. By the argument in the proof of Kronecker’s Theorem 1.5,
E ∶= Q[x]/⟨x2 + 1⟩
is an extension of Q via the canonical embedding aι ∶= ι(a) = a+⟨x2 +1⟩, a ∈
Q.
Moreover, q(x) = q2 x2 + q0 , where q2 = q0 = 1 and thus
q ι (x) = q2ι x2 + q0ι = (1 + ⟨x2 + 1⟩)x2 + (1 + ⟨x2 + 1⟩).
This might seem a bit confusing at first, but notice that if we write
K ∶= ⟨q(x)⟩ = ⟨x2 + 1⟩, then we are saying that
q ι (x) = (q2 + K)x2 + (q0 + K).
It is important to keep in mind that an element of E is a coset of K, so
q2ι ∶= q2 + K and q0ι ∶= q0 + K are elements of E. That is, q ι (x) = q2ι x2 + q0ι ∈
E[x], as required.
Let’s check that α is indeed a root for q ι (x):
Setting α = x + K = x + ⟨x2 + 1⟩ ∈ E,
q ι (α) = (q2 + K)α2 + (q0 + K)
= (q2 + K)(x + K)2 + (q0 + K)
= (q2 + K)(x2 + K) + (q0 + K)
= (q2 x2 + q0 ) + K
= q(x) + ⟨q(x)⟩
= (x2 + 1) + ⟨x2 + 1⟩
= 0 + ⟨x2 + 1⟩.

The reader will observe that we never needed to invoke C or “√−1” to
get a root for this polynomial! Indeed, this is one way to define √−1! (How
sweet is that?)

Observe also that


(−α)2 + 1ι = (−x + ⟨x2 + 1⟩)2 + (1 + ⟨x2 + 1⟩)
= ((−x)2 + ⟨x2 + 1⟩) + (1 + ⟨x2 + 1⟩)
= (x2 + 1) + ⟨x2 + 1⟩
= 0 + ⟨x2 + 1⟩.
Thus −α is also a root of q ι (x). (Of course, we could have simply noted
that (−α)2 + 1ι = (α)2 + 1ι . We are simply trying to emphasise what α
“really looks like”.)
Thus q ι (x) = (x − α)(x + α) factors into linear terms in E.
(b) Let
q(x) = (x2 + 2)(x3 + x2 + 1) ∈ Z5 [x].
Note that g(x) ∶= x2 + 2 and h(x) ∶= x3 + x2 + 1 are both irreducible polyno-
mials over Z5 , since neither has a root in Z5 and they each have degree in
{2, 3}.
By Kronecker’s Theorem,
E1 ∶= Z5 [x]/⟨x2 + 2⟩
and
E2 ∶= Z5 [x]/⟨x3 + x2 + 1⟩
are two extension fields of Z5 via the maps ι1 (a) = a + ⟨x2 + 2⟩ and ι2 (a) =
a + ⟨x3 + x2 + 1⟩ respectively. Note that each Ej contains a root αj of
q ιj (x), j = 1, 2. But
E1 = {a1 x + a0 + ⟨x2 + 2⟩ ∶ a0 , a1 ∈ Z5 }
has 25 elements, whereas
E2 = {b2 x2 + b1 x + b0 + ⟨x3 + x2 + 1⟩ ∶ b0 , b1 , b2 ∈ Z5 }
has 125 elements.
This shows that not all extensions of a given field F need to be isomor-
phic to each other. That is, there can exist many fundamentally different
(i.e. non-isomorphic) extensions of a given field F.
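The two cardinalities can be confirmed directly (a check of mine, not from the notes); the multiplication rule used for E1 below reduces via x^2 = −2 = 3 in Z5.

```python
# Count the cosets of the two quotients of Z5[x] and verify E1 is a field.
from itertools import product

E1 = list(product(range(5), repeat=2))   # a0 + a1*x + <x^2 + 2>
E2 = list(product(range(5), repeat=3))   # b0 + b1*x + b2*x^2 + <x^3 + x^2 + 1>
print(len(E1), len(E2))                  # 25 125

def mul1(u, v):
    # (a0 + a1 x)(b0 + b1 x) with x^2 = -2 = 3 over Z5.
    return ((u[0]*v[0] + 3*u[1]*v[1]) % 5, (u[0]*v[1] + u[1]*v[0]) % 5)

# Every non-zero element of E1 is invertible, as Theorem 7.2.7 predicts:
assert all(any(mul1(u, v) == (1, 0) for v in E1) for u in E1 if u != (0, 0))
```

Since 25 ≠ 125, no bijection between E1 and E2 exists, let alone an isomorphism.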

1.7. Definition. Let (E, ι) be an extension field of a field F, and let q(x) ∈ F[x].
We say that q(x) splits over E if q ι (x) can be written as a product of linear terms
with coefficients in E.
The field E is said to be a splitting field for q(x) ∈ F[x] if
(a) q(x) splits over E, and
(b) If Fι ⊆ K ⊆ E for some proper subfield K of E, then q(x) does not split
over K.

1.8. Examples.
(a) Consider q(x) = x2 − 2 ∈ Q[x]. Then q(x) does not split over Q, but it does
split over R. Indeed,
q ι (x) = x2 − 2 = (x − √2)(x + √2) ∈ R[x].
Note also that q(x) splits over Q(√2) = {a + b√2 ∶ a, b ∈ Q}, and thus R is
not the splitting field for q(x). As it turns out, Q(√2) is the splitting field
for q(x) ∈ Q[x].
(b) The same polynomial q(x) = x2 − 2 can be thought of as an element of
R[x]. In this case, Q(√2) would not be considered a splitting field for q(x)
over R, since any splitting field for q(x) over R would have to contain the
domain for the coefficients, namely R. In other words, if we are thinking of
q(x) as a polynomial with real coefficients, then q(x) already splits in R,
so R is the splitting field of q(x).

1.9. Notation. Let F be a field and (E, ι) be an extension field of F. If


α1 , α2 , . . . , αn ∈ E, we denote by F(α1 , α2 , . . . , αn ) the smallest subfield of E
containing Fι and {α1 , α2 , . . . , αn }.
If
q ι (x) = qnι (x − α1 )(x − α2 )⋯(x − αn ) ∈ E[x],
then F(α1 , α2 , . . . , αn ) is a splitting field for q(x) over F. The proof of this is left
as an exercise for the reader.

1.10. Theorem.
Let F be a field and 0 ≠ q(x) ∈ F[x]. Then there exists a splitting field K for q(x)
over F.
Proof. We shall prove this by induction on the degree, say n, of q(x). Of course, if
n = 0, then q(x) = q0 has no roots (since we assumed that q(x) ≠ 0), and so F is the
splitting field for q(x) over F.
If n = 1, then q(x) = q1 x + q0 with q1 ≠ 0, and thus α ∶= −q1−1 q0 ∈ F is the unique
root of q(x). In other words, F is again a splitting field for q(x).
Let N > 1 be fixed, and suppose that the result holds for all polynomials of
degree less than N . Suppose next that q(x) ∈ F[x] is a polynomial of degree N . By
Kronecker’s Theorem 1.5, there exists an extension field (E, ι) of F in which q ι (x)
has a root, say α1 ∈ E.
Thus
q ι (x) = (x − α1 )g(x)
for some g(x) ∈ E[x] with deg(g(x)) = N − 1 < N . By our induction hypothesis,
there exists an extension field (H, τ ) of E such that g(x) splits in H; that is, there
exist α2 , α3 , . . . , αN ∈ H such that
g τ (x) = gN −1τ (x − α2 )(x − α3 )⋯(x − αN ) ∈ H[x].
Let K ∶= F(α1 , α2 , . . . , αN ) ⊆ H. Then (K, τ ○ ι) is a splitting field for q(x) over F.

1.11. Example.
Let q(x) = x4 + 4x2 − 45 = (x2 − 5)(x2 + 9) ∈ Q[x], and let ι ∶ Q → C denote the
inclusion map ι(q) = q for all q ∈ Q. The roots of q ι (x) in C are {√5, −√5, 3i, −3i}.
Thus, a splitting field for q(x) over Q is K ∶= Q(√5, 3i) = Q(√5, i).
In particular,
K = {(a + b√5) + (c + d√5)i ∶ a, b, c, d ∈ Q}.
It is worth noting that K is a vector space over Q with basis {1, √5, i, √5 i}, and
thus
dimQ K = 4.
That K is a vector space over Q is not a coincidence. We shall study this phenomenon
presently.
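The stated roots are easy to confirm numerically (my own check, not part of the notes):

```python
# Evaluate q at the four claimed roots of x^4 + 4x^2 - 45.
import cmath

q = lambda z: z**4 + 4 * z**2 - 45
roots = [cmath.sqrt(5), -cmath.sqrt(5), 3j, -3j]
print(all(abs(q(r)) < 1e-9 for r in roots))  # True
```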

1.12. Exercise. What is the dimension of R as a vector space over Q?

1.13. Theorem. Let F be a field and 0 ≠ q(x) ∈ F[x] be an irreducible polyno-


mial over F. Suppose that (E, ι) is an extension field of F containing a root α of
q(x). Then
(a) F(α) ≃ F[x]/⟨q(x)⟩; and
(b) if deg (q(x)) = n ≥ 1, then
F(α) = {bιn−1 αn−1 + bιn−2 αn−2 + ⋯ + bι1 α + bι0 ∶ bj ∈ F, 0 ≤ j ≤ n − 1}.
In particular F(α) is an n-dimensional vector space over F.
Proof. Since q(x) is irreducible over F, by Theorem 7.2.7, ⟨q(x)⟩ is a maximal ideal
and F[x]/⟨q(x)⟩ is a field.
(a) Consider the so-called evaluation map
Θ ∶ F[x] → F(α)
f (x) ↦ f ι (α).
It is routine to verify (and the reader should check) that Θ is a surjective
homomorphism. Now q ι (α) = 0, so q(x) ∈ ker Θ. Since ker Θ is an ideal,
⟨q(x)⟩ ⊆ ker Θ. By maximality of ⟨q(x)⟩ combined with the fact that
Θ(1) = 1ι ≠ 0 (and hence ker Θ ≠ F[x]), it follows that
⟨q(x)⟩ = ker Θ.
By the First Isomorphism Theorem 4.4.2,
F(α) ≃ F[x]/⟨q(x)⟩,
via the map
Θ̂ ∶ F[x]/⟨q(x)⟩ → F(α)
f (x) + ⟨q(x)⟩ ↦ f ι (α).

(b) Since

F[x]/⟨q(x)⟩ =
{bn−1 xn−1 + bn−2 xn−2 + ⋯ + b1 x + b0 + ⟨q(x)⟩ ∶ bj ∈ F, 0 ≤ j ≤ n − 1},

we see that
F(α) = Θ̂(F[x]/⟨q(x)⟩)
= {bιn−1 αn−1 + bιn−2 αn−2 + ⋯ + bι1 α + bι0 ∶ bj ∈ F, 0 ≤ j ≤ n − 1}.

1.14. Remark. It is worth noting that the isomorphism between F(α) and
F[x]/⟨q(x)⟩ in Theorem 1.13 above is given by the map
Θ̂(f (x) + ⟨q(x)⟩) = f ι (α).

This follows immediately from the First Isomorphism Theorem and its proof.
Also, let us keep in mind that for b ∈ F, the map δ ∶ b ↦ b+⟨q(x)⟩ is an embedding
of F to F[x]/⟨q(x)⟩, and
Θ̂(b + ⟨q(x)⟩) = bι ∈ F(α).
Thus we may think of the map κ ∶= Θ̂ ○ δ ∶ F → F(α) satisfying κ(b) = bι as a canonical
embedding of F into F(α) induced by this construction.

1.15. Corollary. Let F be a field and 0 ≠ q(x) ∈ F[x] be irreducible. Suppose


that (E1 , ι1 ) and (E2 , ι2 ) are extension fields of F such that
● E1 contains a root α of q ι1 (x), and
● E2 contains a root β of q ι2 (x).
Then there exists an isomorphism Γ ∶ F(α) → F(β) such that Γ(ι1 (b)) = ι2 (b) for all
b ∈ F and Γ(α) = β.
Proof. This follows immediately from Theorem 1.13 and Remark 1.14. As we saw
there, there exist isomorphisms Θ̂1 ∶ F[x]/⟨q(x)⟩ → F(α) and Θ̂2 ∶ F[x]/⟨q(x)⟩ →
F(β) such that
● Θ̂j (b + ⟨q(x)⟩) = ιj (b) = bιj for all b ∈ F, j = 1, 2;
● Θ̂1 (x + ⟨q(x)⟩) = α; and
● Θ̂2 (x + ⟨q(x)⟩) = β.
It now suffices to set
Γ ∶= Θ̂2 ○ Θ̂1⁻¹ .



1.16. Example. Let q(x) = x6 − 2 ∈ Q[x]. Observe that q(x) is irreducible over
Q (why?) and that α ∶= ⁶√2 is a root of q(x) in R. Let ι ∶ Q → R be the inclusion
map ι(b) = b, so that bι = b for all b ∈ Q.
By Theorem 1.13,
Q(α) = {bι5 α5 + bι4 α4 + ⋯ + bι1 α + bι0 ∶ bj ∈ Q, 0 ≤ j ≤ 5}
= {b5 (⁶√2)5 + b4 (⁶√2)4 + ⋯ + b1 ⁶√2 + b0 ∶ bj ∈ Q, 0 ≤ j ≤ 5}
is a vector space of dimension 6 over Q with basis
B ∶= {1, ⁶√2, (⁶√2)2 , (⁶√2)3 , (⁶√2)4 , (⁶√2)5 }.

Note, however, that Q(α) = Q(⁶√2) is not a splitting field for q(x) over Q. If
ω ∈ C is a primitive sixth root of unity (i.e. ω = e^{2πi/6}, and thus {1, ω, ω2 , . . . , ω5 } is
the set of all sixth roots of unity in C), then {⁶√2, ω ⁶√2, ω2 ⁶√2, ω3 ⁶√2, ω4 ⁶√2, ω5 ⁶√2} are
all roots of q(x). Let’s see why.
First note that, setting αj ∶= ωj ⁶√2 for 0 ≤ j ≤ 5,
αj6 = (ωj ⁶√2)6 = ω6j (⁶√2)6 = (ω6 )j 2 = 1j ⋅ 2 = 2,
whence q(αj ) = αj6 − 2 = 2 − 2 = 0. Thus each αj is a root of q(x). But deg(q(x)) = 6,
and so, setting K ∶= Q(α0 , α1 , . . . , α5 ) ⊆ C, the polynomial q ι (x) ∈ K[x] has degree 6
and can have at most six roots. Thus
q ι (x) = (x − α0 )(x − α1 )⋯(x − α4 )(x − α5 )
splits over K, and (K, ι) is a splitting field for q(x).
Note that K is the smallest field in C containing all of the roots of q(x). Thus
it must contain α0 = ⁶√2 and α1 = ω ⁶√2. But then it also must contain α0⁻¹ α1 = ω.
Hence K ⊇ Q(ω, ⁶√2). Since Q(ω, ⁶√2) contains each αj , 0 ≤ j ≤ 5, we conclude that
K = Q(ω, ⁶√2).
What is the dimension of K over Q?
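The six roots can be confirmed numerically (a check of mine, not from the notes):

```python
# Verify that omega^j * 2^(1/6), j = 0..5, all satisfy x^6 - 2 = 0,
# where omega = e^(2*pi*i/6) is a primitive sixth root of unity.
import cmath

omega = cmath.exp(2j * cmath.pi / 6)
alphas = [omega**j * 2**(1 / 6) for j in range(6)]
print(all(abs(a**6 - 2) < 1e-9 for a in alphas))  # True
```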

The astute reader (which may or may not be the same person as the perspicacious
reader mentioned earlier) will have noticed that we have been referring to a splitting
field for a polynomial over a field, as opposed to the splitting field. The following
theorem will set things right.
But first, note that if θ ∶ F1 → F2 is an isomorphism of fields, and if f (x) =
bN xN + bN −1 xN −1 + ⋯ + b1 x + b0 ∈ F1 [x], then we extend our previous notation by
setting
f θ (x) ∶= bθN xN + bθN −1 xN −1 + ⋯ + bθ1 x + bθ0 ∈ F2 [x],
where bθj ∶= θ(bj ), 0 ≤ j ≤ N .

1.17. Theorem. Let F be a field and suppose that f (x) ∈ F[x]. If (K1 , ι1 ) and
(K2 , ι2 ) are splitting fields for f (x) over F, then there exists an isomorphism
Φ ∶ K1 → K2
which satisfies Φ(ι1 (b)) = ι2 (b) for all b ∈ F.

Proof. Let 1 ≤ N ∶= deg (f (x)).


If N = 1, then K1 = ι1 (F) and K2 = ι2 (F), so we may take Φ ∶= ι2 ○ ι1⁻¹ , and we
are done.
Now suppose that N ≥ 2, and that the result holds for all polynomials of degree
less than N . Since F[x] is a ufd by Theorem 7.1.10, we can factor f (x) as a product
of irreducible polynomials. Let q(x) be one of the irreducible factors of f (x).
Let α1 ∈ K1 be a root of q ι1 (x). (Note that all roots of q ι1 (x) are roots of f ι1 (x),
and hence must lie in K1 .) Similarly, we can find a root β1 ∈ K2 of q ι2 (x).
By Corollary 1.15, there exists an isomorphism θ ∶ F(α1 ) → F(β1 ) such that
θ(ι1 (b)) = ι2 (b) for all b ∈ F and θ(α1 ) = β1 .
It follows that
f ι2 (x) = (f ι1 )θ (x).

If we then write f ι1 (x) = (x − α1 )h(x) for some h(x) = hN −1 xN −1 + hN −2 xN −2 +


⋯ + h1 x + h0 ∈ F(α1 )[x], then
f ι2 (x) = (f ι1 )θ (x)
= (x − (α1 )θ )hθ (x)
= (x − β1 )(hθN −1 xN −1 + hθN −2 xN −2 + ⋯ + hθ1 x + hθ0 ) ∈ F(β1 )[x].
Moreover, (K1 , η1 ) is a splitting field for h(x) over F(α1 ), where η1 is the in-
clusion map of F(α1 ) into K1 , that is, η1 (z) = z for all z ∈ F(α1 ). In particular,
η1 (ι1 (b)) = ι1 (b) for all b ∈ F and η1 (α1 ) = α1 .
Also, (K2 , η2 ) is another splitting field for h(x) over F(α1 ), where η2 (z) = z θ ∶=
θ(z) for all z ∈ F(α1 ). That is, η2 first implements the isomorphism between F(α1 )
and F(β1 ), and then embeds F(β1 ) into K2 via the inclusion map.
By our induction hypothesis – noting that deg(h(x)) = N − 1 < N
– we may find an isomorphism Φ ∶ K1 → K2 with Φ(η1 (z)) = Φ(z) = η2 (z) for all
z ∈ F(α1 ); that is,
Φ(η1 (ι1 (b))) = Φ(ι1 (b)) = η2 (ι1 (b)) = (ι1 (b))θ = θ(ι1 (b)) = ι2 (b) for all b ∈ F,
(and Φ(η1 (α1 )) = Φ(α1 ) = η2 (α1 ) = θ(α1 ) = β1 ).

We agree with the reader; this is an awful lot of book-keeping. On the bright
side, it is mostly book-keeping and not theory. The main lines of the proof (after
the base case of the induction) were as follows:
● find one root α1 of f (x) in K1 , another root β1 of f (x) in K2 ;
● factor f ι1 (x) = (x − α1 )h(x) over F(α1 ), and use the isomorphism
θ ∶ F(α1 ) → F(β1 ) to obtain the factorisation f ι2 (x) = (x − β1 )hθ (x);
● observe that K1 is a splitting field for h(x) over F(α1 ) and that K2 is a
splitting field of h(x) over F(β1 ) (this is where the book-keeping of the
embeddings gets most hairy); and

● then use the fact that the remaining factor h(x) has degree less than N ,
which allows you to apply the induction hypothesis to conclude that K1
and K2 are isomorphic.
Almost all of the work above went into keeping track of where the coefficients of the
different factors were.

1.18. Remark. The previous result shows that any two splitting fields for a
polynomial f (x) ∈ F[x] are isomorphic, and thus – up to isomorphism – we can
speak about the splitting field of a polynomial f (x) ∈ F[x].

Given the construction of Φ from θ in the above result, it is not hard to see that
with a bit more work, we can also ensure that
Φ(αj ) = βj , 1 ≤ j ≤ N.
That is, without loss of generality, we have that K1 ∶= F(α1 , α2 , . . . , αN ) and
K2 ∶= F(β1 , β2 , . . . , βN ) are isomorphic via an isomorphism that “fixes” F (in other
words, sends ι1 (b) to ι2 (b) for all b ∈ F) and which satisfies Φ(αj ) = βj for all j.
It is important to note that when we do this for any fixed 1 ≤ j ≤ N , αj and βj
must be roots of the same irreducible factor of f (x). The reader may wish to review
Example 1.6 (b) to see why this is the case.

1.19. Definition. Let F be a field and let (E, ι) be an extension field of F. An


element α ∈ E is said to be algebraic over F if there exists a (non-zero) polynomial
q(x) ∈ F[x] such that q ι (α) = 0. Otherwise, we say that α is transcendental over
F.
We say that E itself is algebraic over F if every element of E is algebraic over
F. Otherwise we say that E is transcendental over F.

1.20. Examples.
(a) α ∶= √2 + √3 is algebraic over Q. To see this, consider the extension field
(R, ι) of Q where ι(b) = b for all b ∈ Q. Note that
(√2 + √3)2 = 5 + 2√6,
and thus
((√2 + √3)2 − 5)2 − 24 = 0.
In other words, α is a root of q ι (x), where
q(x) = (x2 − 5)2 − 24 = x4 − 10x2 + 1 ∈ Q[x].

We tend not to bother writing ι when it consists of the inclusion map;


in this case we think of F as a subfield of E, and an element q(x) ∈ F[x] as
an element of E[x]. This is especially true when F = Q and E = R, or even
a subfield of R that contains Q.

This leads us to say that a real number α is algebraic if there exists


q(x) ∈ Q[x] such that q(α) = 0. Recall that this is equivalent to the
condition that there exists p(x) ∈ Z[x] such that p(α) = 0.
(b) Although it is much, much harder to show, both π^k and e^j are
transcendental over Q for all j, k ∈ N.
(c) i ∶= √−1 is algebraic over R; it is a root of q(x) = x2 + 1. Indeed, we leave it
as an exercise for the reader to show that C is an algebraic extension of R.
(d) Are β ∶= π + e and γ ∶= eπ algebraic or transcendental over Q? An answer
to this would certainly get you a Ph.D., since both questions remain open
to this day. By Exercise 10, at least one of them is transcendental.
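Example (a) above can be verified numerically (my own check, not part of the notes):

```python
# alpha = sqrt(2) + sqrt(3) is a root of q(x) = x^4 - 10 x^2 + 1.
import math

alpha = math.sqrt(2) + math.sqrt(3)
print(abs(alpha**4 - 10 * alpha**2 + 1) < 1e-9)  # True
```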

Recall that F(x) denotes the field of quotients of the integral domain F[x].

1.21. Theorem. Let (E, ι) be an extension field of a field F.


(a) If α ∈ E is algebraic over F, then

F(α) ≃ F[x]/⟨q(x)⟩,

where α is a root of q ι (x) for some q(x) ∈ F[x] which satisfies

deg (q(x)) = min{deg (r(x)) ∶ r(x) ∈ F[x], rι (α) = 0}.

(b) If β ∈ E is transcendental over F, then

F(β) ≃ F(x).

Proof.
(a) Let us prove that if q(x) is chosen as above, then q(x) is irreducible. If we
can do that, then the result follows immediately from Theorem 1.13 (a).
Indeed, if q(x) = g(x)h(x) for some g(x), h(x) ∈ F[x] with

max{deg(g(x)), deg(h(x))} < deg(q(x)),

then either g ι (α) = 0 or hι (α) = 0, for otherwise q ι (α) ≠ 0.
But then a factor of degree strictly less than deg(q(x)) annihilates α, which
contradicts our choice of q(x). Hence q(x) is irreducible.
(b) We leave it as an assignment exercise for the reader to verify that the
evaluation map
θ∶ F(x) → F(β)
[(f (x), g(x))] ↦ f ι (β)(g ι (β))−1

is a (well-defined!!) isomorphism.


1.22. Corollary. Let F be a field and (E, ι) be an extension field of F. If


α ∈ E is algebraic over F, then there exists a unique monic irreducible polynomial
q(x) ∈ F[x] such that
(a) q ι (α) = 0, and
(b) if r(x) ∈ F[x] and rι (α) = 0, then q(x) ∣ r(x) in F[x].
We refer to the polynomial q(x) above as the minimal polynomial for α over
F.
Proof. Let Rα ∶= {0 ≠ r(x) ∈ F[x] ∶ rι (α) = 0}. Since α ∈ E is algebraic over F, Rα ≠ ∅,
and thus
n ∶= min{deg(r(x)) ∶ r(x) ∈ Rα } < ∞.

Let p(x) ∈ Rα be a polynomial of degree n. Multiplying p(x) by the inverse of
its leading coefficient pn yields a monic polynomial q(x), also of degree n, such that
q ι (α) = 0.
If r(x) ∈ Rα , then deg(r(x)) ≥ deg(q(x)). We can use the division algorithm to
write
r(x) = f (x)q(x) + g(x)
for some f (x), g(x) ∈ F[x] with g(x) = 0 or deg(g(x)) < n. Since rι (α) = q ι (α) = 0,
we conclude that g ι (α) = 0. But then g(x) = 0, for otherwise we contradict the
minimality of n.
Thus q(x) ∣ r(x) in F[x], as claimed.

1.23. Definition. Let F be a field and (E, ι) be an extension field of F. Then E


is a vector space over ι(F), and we define the degree of E over F to be the dimension
of the vector space E over ι(F). We write
[E ∶ F] ∶= dimι(F) E.

1.24. Examples.
(a) [C ∶ R] = 2, and {1, i} is a basis for C over R.
(b) [Q(√2) ∶ Q] = 2, and {1, √2} is a basis for Q(√2) over Q.
(c) [Q(⁵√2) ∶ Q] = 5, and {1, ⁵√2, ⁵√4, ⁵√8, ⁵√16} is a basis for Q(⁵√2) over Q.
(d) [R ∶ Q] = ∞. The proof of this is left as an exercise for the reader.

1.25. Theorem. Let (E, ι) be an extension field for F and suppose that
[E ∶ F] < ∞.
Then E is an algebraic extension of F.

Proof. Set n ∶= [E ∶ F] < ∞, and let α ∈ E. Then {1, α, α2 , . . . , αn } has n + 1


elements, so it must be linearly dependent over Fι = ι(F). In other words, there
exist q0 , q1 , q2 , . . . , qn ∈ F (not all equal to zero) such that
q0ι + q1ι α + q2ι α2 + ⋯ + qnι αn = 0.
This is the statement that α is a root of the non-zero polynomial q ι (x), where
q(x) = qn xn + ⋯ + q1 x + q0 ∈ F[x]. Hence α is algebraic over F.
Since α ∈ E was arbitrary, E is an algebraic extension of F.

1.26. The converse is false. Note that


K ∶= Q(√2, ³√2, ⁴√2, ⁵√2, . . .)

is an algebraic extension of Q, yet [K ∶ Q] = ∞.

1.27. Theorem. Let F be a field and (E, ι) be an extension field of F. Let


(K, η) be an extension field of E. Suppose that [E ∶ F] < ∞ and that [K ∶ E] < ∞.
Then

[K ∶ F] = [K ∶ E] [E ∶ F].

Proof. Let K ∶= {κ1 , κ2 , . . . , κn } be a basis for K over η(E), and E ∶= {α1 , α2 , . . . , αm }


be a basis for E over ι(F). Observe that η ○ ι ∶ F → K is an embedding.
Next, set L ∶= {αiη κj ∶ 1 ≤ i ≤ m, 1 ≤ j ≤ n}. We claim that L is a basis for K over
η ○ ι(F). If this is the case, then ∣L∣ = mn shows that [K ∶ F] = mn.
(i) Let β ∈ K. Since K is a basis for K over η(E), there exist γ1 , γ2 , . . . , γn ∈ E
such that

β = γ1η κ1 + γ2η κ2 + ⋯ + γnη κn .

Temporarily fix 1 ≤ j ≤ n. Since γj ∈ E and since E is a basis for E over Fι ,


there exist y1,j , y2,j , . . . , ym,j ∈ F such that

γj = ι(y1,j ) α1 + ι(y2,j ) α2 + ⋯ + ι(ym,j ) αm .

Hence
β = ∑_{j=1}^{n} γjη κj = ∑_{j=1}^{n} η(γj ) κj = ∑_{j=1}^{n} η ( ∑_{i=1}^{m} ι(yi,j ) αi ) κj
= ∑_{j=1}^{n} ( ∑_{i=1}^{m} (η ○ ι)(yi,j ) αiη ) κj = ∑_{j=1}^{n} ∑_{i=1}^{m} (η ○ ι)(yi,j ) (αiη κj ).

We conclude that spanη○ι(F) L = K, and so [K ∶ F] ≤ mn = [K ∶ E] [E ∶ F].


(ii) Next suppose that yi,j ∈ F, 1 ≤ i ≤ m, 1 ≤ j ≤ n satisfy
∑_{j=1}^{n} ∑_{i=1}^{m} (η ○ ι)(yi,j ) (αiη κj ) = 0.
For each 1 ≤ j ≤ n, define γj ∶= ∑_{i=1}^{m} ι(yi,j ) αi ∈ E. Then
0 = ∑_{j=1}^{n} ∑_{i=1}^{m} (η ○ ι)(yi,j ) (αiη κj ) = ∑_{j=1}^{n} γjη κj .
Since K is a basis for K over η(E), we may deduce that for each 1 ≤ j ≤ n,
0 = γjη = η(γj ).
But η is injective and so for each 1 ≤ j ≤ n, we have that
0 = ∑_{i=1}^{m} ι(yi,j ) αi .
Now E is a basis for E over ι(F), and so ι(yi,j ) = 0 (whence yi,j = 0) for all 1 ≤ i ≤ m,
and this for all 1 ≤ j ≤ n. Hence L is linearly independent over η ○ ι(F), and so it is
a basis for K over η ○ ι(F), completing the proof of the Theorem.

1.28. Examples.
(a) [Q(√2, √3) ∶ Q] = 4 = [Q(√2, √3) ∶ Q(√2)] [Q(√2) ∶ Q].
A basis for this extension over Q is {1, √2, √3, √6}.

Supplementary Examples.

S9.1. Example. Suppose that E is an extension field for the field F with F ⊆ E,
and that [E ∶ F] = 2. Note that {1} is linearly independent in E, since 1 ≠ 0, and so
{1} can be extended to a basis B ∶= {1, α} for E over F.
It follows that if γ ∈ E, then there exist a, b ∈ F such that γ = a ⋅ 1 + b ⋅ α. In
particular, α2 = a0 ⋅ 1 + b0 ⋅ α for some a0 , b0 ∈ F.
Thus
γ 2 = a2 ⋅ 1 + 2abα + b2 α2
= a2 ⋅ 1 + 2abα + b2 (a0 ⋅ 1 + b0 ⋅ α)
= (a2 + a0 b2 ) ⋅ 1 + (2ab + b0 b2 )α.
It follows that the set {1, γ, γ 2 } is linearly dependent over F, which implies that γ
satisfies a non-zero polynomial of degree at most 2 over F.
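Concretely, take F = Q, E = Q(√2) and α = √2, so that α2 = 2 ⋅ 1 + 0 ⋅ α (i.e. a0 = 2, b0 = 0). The computation above then shows that every γ = a + bα is a root of x2 − 2ax + (a2 − 2b2 ). A quick numerical sanity check (a sketch with arbitrarily chosen sample values, not part of the notes proper):

```python
import math

# In E = Q(sqrt(2)), every gamma = a + b*sqrt(2) with a, b rational satisfies
#   gamma^2 - 2a*gamma + (a^2 - 2b^2) = 0,
# a non-zero polynomial of degree 2 over Q, as the dependence argument predicts.
a, b = 3, 5  # sample values, chosen arbitrarily
gamma = a + b * math.sqrt(2)
value = gamma**2 - 2 * a * gamma + (a**2 - 2 * b**2)
print(abs(value) < 1e-9)  # True, up to floating-point error
```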
S9.2. Example. Let 0 ≠ p(x) = p2 x2 + p1 x + p0 ∈ Z2 [x] be an irreducible
polynomial of degree 2. Then p2 = 1, otherwise deg(p(x)) < 2. Also, in order to be
irreducible, p(x) can not have roots in Z2 , so
p(0) = p0 ≠ 0.
Thus p0 = 1. Also,
p(1) = p2 + p1 + p0 = 1 + p1 + 1 = p1 ≠ 0,
so p1 = 1.
Hence p(x) = x2 + x + 1. That is, there exists a unique irreducible polynomial of
degree 2 over Z2 .
What does this say about extension fields over Z2 of degree 2?
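The counting part can also be settled by machine: a quadratic over Z2 is irreducible precisely when it has no root in Z2, so a brute-force search over all four choices of (p1 , p0 ) suffices. (A small Python sketch, not part of the notes proper:)

```python
# Enumerate the quadratics p(x) = x^2 + p1*x + p0 over Z_2 (we may take
# p2 = 1, as noted above) and keep those with no root in Z_2; for degree 2,
# "no root" is the same as "irreducible".
irreducible = []
for p1 in (0, 1):
    for p0 in (0, 1):
        if all((a * a + p1 * a + p0) % 2 != 0 for a in (0, 1)):
            irreducible.append((p1, p0))

print(irreducible)  # [(1, 1)], i.e. only x^2 + x + 1 survives
```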
S9.3. Example. The polynomial q(x) = x4 +x+1 ∈ Z2 [x] is irreducible. Indeed,
it has no roots in Z2 , and so the only possible factorisation into polynomials of lower
degree would have to involve two polynomials r(x) and s(x) ∈ Z2 [x], each of degree
2.
Writing r(x) = x2 + r1 x + r0 and s(x) = x2 + s1 x + s0 , we see that q(x) = r(x)s(x)
forces r0 s0 = 1, whence r0 = s0 = 1.
Thus
q(x) = r(x)s(x) = (x2 + r1 x + 1)(x2 + s1 x + 1)
= x4 + (r1 + s1 )x3 + (1 + r1 s1 + 1)x2 + (r1 + s1 )x + 1,
from which we deduce that r1 + s1 = 0, r1 s1 = 0, and r1 + s1 = 1. The first and third
equations clearly contradict each other, so no such factorisation is possible.
It follows that F ∶= Z2 [x]/⟨x4 + x + 1⟩ is a field, and that a general element of F
is of the form
b3 x3 + b2 x2 + b1 x + b0 + ⟨x4 + x + 1⟩,
for some b0 , b1 , b2 , b3 ∈ Z2 . There are 16 such possibilities, and so ∣F∣ = 16 = 24 .
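Both claims of this example (that q(x) has no roots in Z2, and that it admits no factorisation into two monic quadratics) are finite checks, so they can be verified by machine. A Python sketch (my own verification, not from the notes), representing a polynomial by its list of coefficients from the constant term up:

```python
# q(x) = x^4 + x + 1 over Z_2, coefficients listed from the constant term up.
q = [1, 1, 0, 0, 1]

def poly_mul_mod2(f, g):
    """Multiply two polynomials with coefficients in Z_2."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % 2
    return out

# q has no roots in Z_2 = {0, 1}:
assert all(sum(c * a**k for k, c in enumerate(q)) % 2 != 0 for a in (0, 1))

# q is not a product of two monic quadratics (x^2 + r1 x + r0)(x^2 + s1 x + s0):
quadratics = [[r0, r1, 1] for r0 in (0, 1) for r1 in (0, 1)]
assert all(poly_mul_mod2(r, s) != q for r in quadratics for s in quadratics)

print("x^4 + x + 1 is irreducible over Z_2")
```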

S9.4. Example. Let p ∈ N be prime. Given any integer n ≥ 1, it can be
shown that there exists an irreducible polynomial of degree n over Zp , and thus, by
generalising the argument presented in Example S9.3, there exists a field with pn
elements.
S9.5. Example. It can be shown that the polynomial q(x) = x4 +1 is irreducible
over Z, but reducible over Zp for all primes p. In other words – this is the worst
case scenario in terms of trying to use the Mod-p Test, since that Test will fail for
all primes p.
Here’s a proof that q(x) is irreducible over Z. Indeed, q(m) ≥ 1 for all m ∈ Z
implies that q has no roots in Z, and hence the only possible factorisation is as a
product of quadratic terms. Suppose such a factorisation exists. Since the product
of the leading terms has a coefficient of 1, they must both be 1 or both be −1. By
multiplying both terms by −1 in the second case, we may assume without loss of
generality that each quadratic term is a monic polynomial.
Let r(x) = x2 + r1 x + r0 and s(x) = x2 + s1 x + s0 . By considering the constant
term of q(x) = r(x)s(x), we see that either r0 = s0 = 1 or r0 = s0 = −1. Now
q(x) = r(x)s(x) = x4 + (r1 + s1 )x3 + (r0 + s0 + r1 s1 )x2 + (r0 s1 + r1 s0 )x + r0 s0 .
Thus we must solve
● r1 + s1 = 0;
● r0 + s0 + r1 s1 = 0;
● r0 s1 + r1 s0 = 0;
● r0 = s0 ∈ {−1, 1}.
From the first and second equations, we see that r1 s1 = −r12 ∈ {−2, 2}, which is
unsolvable with r1 ∈ Z.
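The claim that q(x) = x4 + 1 is reducible over Zp for every prime p is beyond these notes, but it is easy to test experimentally for small primes: the sketch below (mine, not part of the notes proper) searches for a monic quadratic factor x2 + bx + c of x4 + 1 over Zp by long division.

```python
def quadratic_factor_mod_p(p):
    """Return (b, c) with x^2 + b*x + c dividing x^4 + 1 over Z_p, else None."""
    for b in range(p):
        for c in range(p):
            # Long division of x^4 + 0x^3 + 0x^2 + 0x + 1 by x^2 + b*x + c:
            q2 = 1                           # quotient coefficient of x^2
            q1 = (-b * q2) % p               # quotient coefficient of x
            q0 = (-c * q2 - b * q1) % p      # quotient constant term
            r1 = (-c * q1 - b * q0) % p      # remainder, coefficient of x
            r0 = (1 - c * q0) % p            # remainder, constant term
            if r1 == 0 and r0 == 0:
                return (b, c)
    return None

for p in (2, 3, 5, 7, 11, 13):
    print(p, quadratic_factor_mod_p(p))  # a factor turns up for every prime tested
```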
S9.6. Example. Here’s a more sophisticated proof of the fact that q(x) = x4 +1
is irreducible over Z. Suppose that
q(x) = r(x)s(x)
for some r(x), s(x) ∈ Z[x].
Then
(x + 1)4 + 1 = q(x + 1) = r(x + 1)s(x + 1),
showing that q(x + 1) can be factored in Z[x]. But
q(x + 1) = (x + 1)4 + 1 = x4 + 4x3 + 6x2 + 4x + 2
cannot be factored in Z[x] by Eisenstein’s Criterion, applied with p = 2.
S9.7. Example. Let F be a field and q(x) ∈ F[x] be a monic irreducible
polynomial of degree d over F. As we have seen, if E = F[x]/⟨q(x)⟩, then E is a field,
and φ ∶ F → E defined by φ(b) = b + ⟨q(x)⟩ is an embedding of F into E, so that E is
an extension field of F.
Let α ∶= x + ⟨q(x)⟩. Since
E = {rd−1 xd−1 + rd−2 xd−2 + ⋯ + r1 x + r0 + ⟨q(x)⟩ ∶ rj ∈ F, 0 ≤ j ≤ d − 1},
we see that B ∶= {1, α, α2 , . . . , αd−1 } is a basis for E over F.


S9.8. Example. Note that π^2 is transcendental over Q, but that π^2 is al-
gebraic over Q(π). Thus transcendental by itself is “not a thing”; you must be
transcendental (or algebraic, as the case may be) over a particular field.
S9.9. Example. Let F be a field of characteristic p, where p ∈ N is a prime
number. If E is an extension field of F, then char(E) = p.
To see this, note that E is a vector space over F. Let B = {eλ }λ∈Λ be a basis for
E over F, and y ∈ E. Then there exist b1 , b2 , . . . , bk ∈ F, λ1 , λ2 , . . . , λk ∈ Λ such that
y = b1 eλ1 + b2 eλ2 + ⋯ + bk eλk .
Hence
py = (pb1 )eλ1 + (pb2 )eλ2 + ⋯ + (pbk )eλk
= 0eλ1 + 0eλ2 + ⋯ + 0eλk
= 0.
It follows that the characteristic of E divides p, hence equals p, as the latter is prime.
A similar statement holds if F has characteristic zero. The proof is left to the
reader.
S9.10. Example. Suppose that F is a field and that g(x) ∈ F[x]. Recall
from Example 3.3.1 that if g(x) has a root r of multiplicity at least two in F, then
(x − r) ∣ g ′ (x), the derivative of g(x).
Stated another way, if gcd(g(x), g ′ (x)) = 1, then the roots of g(x) (in any splitting field) are distinct.
Suppose that char(F) = p, a prime, that n ∈ N, and that q(x) = x^{p^n} − x. Then
q ′ (x) = p^n x^{p^n −1} − 1 = 0 − 1 = −1 ∈ F[x], and so gcd(q(x), q ′ (x)) = 1. Let K be a
splitting field for q(x) over F. In K, q(x) will have p^n distinct roots, say
E ∶= {α1 , α2 , . . . , α_{p^n} }.
Curious, isn’t it, that we should use E to denote a set of roots, instead of a field?
But in fact, if a, b ∈ E, then a^{p^n} = a and b^{p^n} = b, so (ab)^{p^n} = a^{p^n} b^{p^n} = ab ∈ E, and (by
the binomial expansion, noting that each coefficient in the expansion other than the
first and the last is divisible by p)
(b − a)^{p^n} = b^{p^n} − a^{p^n} = b − a,
so that b − a ∈ E. In other words, E is a subring of K. If 0 ≠ a ∈ E, then a^{p^n} = a
implies that a^{p^n −1} = 1 in K, so that a^{−1} = a^{p^n −2} ∈ E. Thus E is in fact a subfield of
K, and E has exactly p^n elements.
That is, for any prime p and any n ∈ N, there exists a field with p^n elements.
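For p = 2 and n = 2 this field can be exhibited concretely as Z2 [x]/⟨x2 + x + 1⟩ (compare Example S9.2), and one can check directly that each of its four elements is a root of x^{p^n} − x = x4 − x. A small Python sketch (my own illustration, not part of the notes proper):

```python
# Elements of the field Z_2[x]/<x^2 + x + 1> are pairs (a0, a1) <-> a0 + a1*x.
def mul(u, v):
    """Multiply two elements, using x^2 = x + 1 (since x^2 + x + 1 = 0)."""
    a0, a1 = u
    b0, b1 = v
    c0 = a0 * b0            # constant term
    c1 = a0 * b1 + a1 * b0  # coefficient of x
    c2 = a1 * b1            # coefficient of x^2, rewritten as x + 1
    return ((c0 + c2) % 2, (c1 + c2) % 2)

def power(u, k):
    out = (1, 0)  # the multiplicative identity
    for _ in range(k):
        out = mul(out, u)
    return out

elements = [(a0, a1) for a0 in (0, 1) for a1 in (0, 1)]
# Every element of this field with 2^2 = 4 elements satisfies a^4 = a:
assert all(power(u, 4) == u for u in elements)
print("all 4 elements are roots of x^4 - x")
```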

Appendix

A9.1. In many texts, an extension field E of a field F is defined as a field which


contains F. Technically, this is false. What the authors really mean is that E
contains an isomorphic copy, say K of F, and that they are identifying K with F.
Suppose we consider the set E ∶= {[ x  y ; −y  x ] ∶ x, y ∈ R} ⊆ M2 (R) (writing
2 × 2 matrices row by row). Recall that E is isomorphic to C, the field of complex
numbers. It is readily seen that the map

φ ∶ R → E,  x ↦ [ x  0 ; 0  x ]

is an injective homomorphism of R into E. Now, K ∶= φ(R) is a field which, in terms


of all of its field properties, behaves exactly the same as R. Surely we can agree that
elements of E are not actually complex numbers, even though they behave exactly
the same way (again, in terms of their field properties) as complex numbers. If we
were to agree amongst friends to identify K and R – in other words, if we were to
agree not to quibble and just say that we are going to treat K and R as the same
thing – then we could say that E contains R.
We would be lying, of course, but if we had sufficient mathematical sophistica-
tion to understand that we are lying, to understand the nature of our lie, and to
understand how to eliminate the lie by reintroducing the map φ and painstakingly
keeping track of the embedding, then we could live fruitful and productive lives and
never talk about this again.
This is precisely what these sinful authors to whom I have earlier alluded are
doing. When they define an extension field E of a field F as a field which contains
F, they are agreeing to conflate the copy K ⊆ E of F and F itself.
So what does this have to do with Definition 1.7? If we were to adopt the
strategy of those naughty authors referenced above, then we could simply say that
E is a splitting field for a polynomial q(x) ∈ F[x] if, whenever K is an extension of
F and q(x) splits in K, then K ⊇ E.

A9.2. Although we shall not prove it here, it is known that the number of
monic irreducible polynomials of degree n over a field Fq , where q = p^m for some
prime p and positive integer m, is given by

N (q, n) ∶= (1/n) ∑_{d∣n} µ(d) q^{n/d} ,

where µ is the so-called Möbius function. More specifically, µ(d) is the sum of
the primitive dth roots of unity in C, and takes on values in {−1, 0, 1}. In particular,
N (q, n) ≥ 1 for all q, n.
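The formula is easy to test for q = 2 and small n: compute N (2, n) using an integer-valued Möbius function, and compare against a brute-force count of the monic irreducible polynomials over Z2, obtained by sieving out all products of lower-degree monic polynomials. (A Python sketch, not part of the notes proper; polynomials over Z2 are encoded as bitmasks.)

```python
def mobius(d):
    """The Moebius function, computed by trial division."""
    out, k = 1, 2
    while k * k <= d:
        if d % k == 0:
            d //= k
            if d % k == 0:
                return 0  # a squared prime factor
            out = -out
        k += 1
    return -out if d > 1 else out

def N(q, n):
    """(1/n) * sum over d | n of mobius(d) * q^(n/d); the division is exact."""
    return sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0) // n

def mul2(f, g):
    """Carry-less product of Z_2[x] polynomials encoded as bitmasks."""
    out = 0
    while g:
        if g & 1:
            out ^= f
        f <<= 1
        g >>= 1
    return out

def count_irreducible(n):
    """Count monic irreducible polynomials of degree n over Z_2 by sieving:
    every reducible monic polynomial of degree n is a product of two monic
    factors, one of which has degree d <= n/2."""
    reducible = set()
    for d in range(1, n // 2 + 1):
        for f in range(1 << d, 1 << (d + 1)):                # monic, degree d
            for g in range(1 << (n - d), 1 << (n - d + 1)):  # monic, degree n - d
                reducible.add(mul2(f, g))
    return (1 << n) - len(reducible)

print([N(2, n) for n in range(1, 7)])  # [2, 1, 2, 3, 6, 9]
assert all(count_irreducible(n) == N(2, n) for n in range(1, 7))
```

Note that N (2, 2) = 1 and N (2, 4) = 3 agree with Examples S9.2 and S9.3.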

A9.3. There is still some time left to continue asking questions while reading
the text. For instance, how many algebraic elements over Q are there in R? Stated
more precisely, what is the cardinality of the set
Λ ∶= {α ∈ R ∶ α is algebraic over Q}?
Is Λ a field?

Exercises for Chapter 9

Exercise 9.1.
Prove that Q[x]/⟨x2 + 1⟩ is isomorphic to Q[i].

Exercise 9.2.
Show that α = √13 + √7 is algebraic over Q, and find its minimal polynomial.

Exercise 9.3.
Show that β = √(2^{1/3} − i) is algebraic over Q, and find its minimal polynomial.

Exercise 9.4.
Find a basis for Q(√3, √6) over Q. What is the degree of this extension?

Exercise 9.5.
Find a basis for Q(√2, √6 + √10) over Q(√3 + √5). What is the degree of this
extension?

Exercise 9.6.
Prove that Z2 [x]/⟨x3 + x2 + 1⟩ is a field with 8 elements, and construct a multi-
plication table for this field.

Exercise 9.7.
Is e algebraic over Q(e7 )?

Exercise 9.8.
Prove or disprove that Q(√7) ≃ Q(√19).

Exercise 9.9.
Let F be a field of characteristic 5, and let b ∈ F. Prove that p(x) = x5 − b is
either irreducible over F, or splits in F.

Exercise 9.10.
Let α, β ∈ R be transcendental over Q. Prove that either αβ or α + β is tran-
scendental. (Recall from Example 1.20 (d) that we do not know whether π + e and
π e are algebraic or transcendental over Q. This shows that at least one of the two
is transcendental, but does not tell us which.)

Exercise 9.11.
Recall that a set X is said to be denumerable if there exists a bijection φ ∶
N → X. (We say that X is countable if X is either finite or denumerable.)
Let (R, ι) be the standard extension of Q (i.e. ι(q) = q for all q ∈ Q). Prove that
the set
Ω ∶= {α ∈ R ∶ α is algebraic over Q}
is denumerable.

Exercise 9.12.
Prove that the set Ω of algebraic real numbers defined in Exercise 9.11 is a field.

Exercise 9.13.
Let K ∶= Q(2^{1/2} , 2^{1/3} , 2^{1/4} , 2^{1/5} , . . .) be the extension field of Q from Paragraph 9.1.26.
Prove that K is algebraic over Q.
CHAPTER 10

Straight-edge and Compasses constructions

We are all here on earth to help others; what on earth the others are
here for I don’t know.
W.H. Auden

1. An ode to Wantzel
1.1. In this Chapter we shall see how Pierre Laurent Wantzel (I don’t know
why, but I like him already) managed to use field theory to prove lots of stuff in
geometry. So yes, that’s a thing now.
1.2. The story begins, as they say, with the ancient Greeks. By that we do not
mean that the Greeks themselves were ancient, but rather that we are referring to
Greeks who lived in ancient times. More specifically, we are interested in Plato,
born circa 425 bc. An Athenian philosopher, he held that the only “perfect” geo-
metrical forms were straight lines and circles, and as such, he was only interested
in geometrical constructions that could be accomplished with a straight-edge and a
pair of compasses. 1
Although he was not a mathematician, he nevertheless taught mathematics,
and one of his students, Eudoxos of Cnidus, is considered to have been one of
the greatest mathematicians of those days. In fact, much of what is to be found
in Elements, Euclid’s famous2 treatise on mathematics – written circa 300 bc – is
actually the work of Eudoxos.
1The Cambridge Dictionary asserts that a compass is “a device for finding direction with
a needle that can move easily and that always points to magnetic north”, whereas (a pair of)
compasses refers to “a V-shaped device that is used for drawing circles or measuring distances on
maps”. You could see where a pair of compasses might have been more useful than a compass when
performing geometrical constructions.
2The word “famous” is a curious beast when applied in a mathematical context. In this case,
we feel justified in using this expression because Euclid’s Elements is well-known outside of the
mathematical world. In fact, if one were to walk up to some random individual on the street and
ask them to name a famous mathematical text, then it is hard to think of another text the random
person might cite. Of course, it is far more likely that the person to whom one is posing this
question would ignore one, or – in a worst case scenario – would visit some extreme violence upon
one’s person, but that is an altogether different matter. It is worth bearing in mind that being


We begin with the first of the postulates stated in Euclid’s Elements.


1.3. Euclid’s postulates. Let P ⊆ R2 be a set containing at least two distinct
points. We shall consider the following two geometric operations, hereafter referred
to as admissible operations:
(i) Given two distinct points a and b in P, there is a unique line which contains
them, and hence a unique line segment connecting a to b.
(ii) Given points a, b, c in P with a ≠ b, we can construct a circle centred
at c with radius equal to the distance d(a, b) from a to b. (Here we are
using the usual Euclidean distance between two points a = (a1 , a2 ) and
b = (b1 , b2 ) in R^2 , namely d(a, b) ∶= √(∣a1 − b1 ∣^2 + ∣a2 − b2 ∣^2 ).)
The Greeks of Plato’s day referred to the construction of a geometric figure using
only a straight-edge and compasses (and the two admissible operations above) as
the plane method. We should think of these operations as telling us which lines
and circles we are allowed to “draw”.
1.4. Euclid elucidated a number of these straight-edge-and-compasses construc-
tions in Elements. More specifically, he demonstrated (amongst a great many other
things) that with only the plane method, it is possible to find the midpoint of a
given line segment, or to bisect a given angle, or to construct – given a line L and
a point x not on that line – a second line, parallel to L, and passing through the
point x. We shall demonstrate some of these constructions in the Appendix to this
chapter.

Our present goal is to try to describe which geometric figures may be constructed
using the plane method, starting from a minimal set P consisting of two distinct
points. By rotating, translating and scaling if necessary, we may assume without loss
of generality that those two points are p0 = (0, 0) and p1 = (1, 0). Thus, throughout
the remainder of this chapter

we shall always assume that {p0 = (0, 0), p1 = (1, 0)} ⊆ P ⊆ R2 .

In order to describe these geometric figures, we must first properly define what
it means to “construct” such a geometric object.
It is clear, for example, that since we can construct line segments which connect
pairs of distinct points using the plane method, we can construct a square S1 whose
area is 1 provided that the points q = (0, 1) and r = (1, 1) can be “constructed” from
the set P using the plane method – the points p0 = (0, 0) and p1 = (1, 0) already
lying in P by hypothesis. To do this, we simply connect p0 to p1 , p1 to r, r to q,
and q to p0 via line segments.
We can similarly “construct” a square S2 whose area is twice that of S1 provided
that we can somehow “construct” the points x = (√2, 0), y = (0, √2) and z = (√2, √2)
from P using the plane method, and “constructing” an arbitrary polygon can be
done provided that we can somehow “construct” the vertices of that polygon, after
which we may connect the appropriate vertices with line segments, which the plane
method allows us to draw.

“famous” in mathematics today means being known to literally hundreds of other mathematicians
studying the same branch of mathematics, and to absolutely no one who actually matters.
Thus by “constructing” a more general geometric figure using the plane method,
we shall mean constructing the points which define the object, and connecting them
using the lines, line segments and circles prescribed by the plane method. Of course,
being able to do so relies upon our having a rigorous definition of which points we
can “construct” from P using the plane method.

1.5. Definition. Let P ⊆ R2 be a set containing the points p0 = (0, 0) and


p1 = (1, 0). A point q ∈ R2 is said to be constructible in one step from P if it is
a point of intersection of two non-parallel lines, two distinct circles, or a line and a
circle obtained from admissible operations from P.
We say that r ∈ R2 is constructible from P if there exist finitely many points
q0 , q1 , q2 , . . . , qn = r ∈ R2 such that q0 is constructible in one step from P and, for
each 1 ≤ k ≤ n, qk is constructible in one step from P ∪ {q0 , q1 , q2 , . . . , qk−1 }.

A real number a ∈ R is said to be constructible from P if the point a ∶= (a, 0)


is constructible from P, and in the case where P = {p0 , p1 }, we abbreviate this to
the expression a ∈ R is a constructible real number. We shall denote by ΓP
the set of all real numbers which are constructible from P, and by Γ○ the set of all
constructible real numbers, so that Γ○ = Γ{p0 ,p1 } .

1.6. Remark. Suppose that P ⊆ R2 is a set containing the points p0 = (0, 0)


and p1 = (1, 0), and that q = (q1 , q2 ) and r = (r1 , r2 ) are constructible from P. Then
the x-axis, i.e. the set {(x, 0) ∶ x ∈ R}, corresponds to the line through p0 and p1 ,
which we may draw using the first admissible operation from Paragraph 1.3 above.
(Note: although the line itself may be drawn, this is not the same as saying that
every point (x, 0) on that line may be constructed. Constructible points come from
intersections of lines and circles with lines and circles.)
Moreover, the second admissible operation from Euclid’s postulates allows us
to construct a circle of radius ϱ ∶= d(q, r) centred at (0, 0), which clearly must
intersect the x-axis at the point (ϱ, 0). It follows that ϱ ∈ R is constructible from P,
i.e. ϱ ∈ ΓP . Conversely, if ϱ ∈ ΓP , then the point (ϱ, 0) ∈ R2 is constructible from P,
and ϱ = d((0, 0), (ϱ, 0)).
Thus we may also view those real numbers constructible from P as the set of all
possible distances between points in R2 which are themselves constructible from P.

This is not the only interesting way to interpret the set ΓP . Indeed, suppose
that the point r = (r1 , r2 ) is constructible from P.
● As we have just seen, we may construct the x-axis from P by the first
admissible operation.
● By Paragraph A10.1, we may also construct the y-axis.

● By Paragraph A10.2, we may draw a line parallel to the x-axis passing


through r. This line will intersect the y-axis at (0, r2 ). The circle centred
at p0 with radius r2 = d((0, 0), (0, r2 )) intersects the x-axis at (r2 , 0). Thus
r2 ∈ ΓP .
● By Paragraph A10.2, we may also draw a line parallel to the y-axis passing
through r. This line will intersect the x-axis at (r1 , 0), and thus r1 ∈ ΓP as
well.
In other words, given a point r = (r1 , r2 ) ∈ R2 which is constructible from P, the
points (r1 , 0) and (0, r2 ) are also constructible from P, and therefore r1 , r2 ∈ ΓP .
Conversely, if r1 and r2 ∈ ΓP are real numbers constructible from P, then
● By definition, the point (r1 , 0) is constructible from P.
● By Paragraph A10.1, we may draw a line L perpendicular to the x-axis
passing through (r1 , 0). Since the point (r2 , 0) is constructible, we may
draw a circle C centred at (r1 , 0) with radius ϱ ∶= ∣r2 ∣ which intersects the
line L at the points (r1 , −∣r2 ∣) and (r1 , ∣r2 ∣). Clearly r is one of those two
points.
Thus the point r ∶= (r1 , r2 ) is constructible from P.

From this we see that the


ΓP = {α, β ∈ R ∶ q ∶= (α, β) ∈ R2 is constructible from P}.

2. Enter fields
2.1. This is where things get interesting for those of us who have spent the past
few weeks learning about rings and fields. Our immediate goal is to prove that if
{p0 , p1 } ⊆ P, then ΓP is a field. First we observe that if α ∈ ΓP , then the circle
of radius ∣α∣ centred at p0 intersects the x-axis at (−∣α∣, 0) and at (∣α∣, 0). In other
words, α ∈ ΓP implies that −α ∈ ΓP as well.

2.2. Proposition. Let α, β ∈ ΓP . Then α + β and α − β lie in ΓP .


Proof. First note that ∣α∣, ∣β∣ ∈ ΓP by the comments of Paragraph 2.1.
Set a ∶= (∣α∣, 0). The second admissible operation allows us to draw a circle of
radius ∣β∣ centred at a, which intersects the x-axis at the points (∣α∣ − ∣β∣, 0) and
(∣α∣ + ∣β∣, 0). Hence ∣α∣ − ∣β∣, ∣α∣ + ∣β∣ ∈ ΓP .
But then −∣α∣ + ∣β∣, −∣α∣ − ∣β∣ ∈ ΓP , again, by Paragraph 2.1. Since
{α + β, α − β} ⊆ {∣α∣ − ∣β∣, ∣α∣ + ∣β∣, −∣α∣ + ∣β∣, −∣α∣ − ∣β∣}
for all real numbers α, β, we conclude that α + β, α − β ∈ ΓP .
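The set inclusion invoked in the last step holds for all real α and β (since α = ±∣α∣ and β = ±∣β∣), and is easy to confirm on sample values. (A throwaway Python check, not part of the notes proper:)

```python
import itertools

# Check that {alpha + beta, alpha - beta} is contained in
# {|alpha| - |beta|, |alpha| + |beta|, -|alpha| + |beta|, -|alpha| - |beta|}
# for a grid of sample values of either sign (all exactly representable floats).
values = [-3.5, -1.0, 0.0, 0.25, 2.0]
for a, b in itertools.product(values, repeat=2):
    candidates = {abs(a) - abs(b), abs(a) + abs(b),
                  -abs(a) + abs(b), -abs(a) - abs(b)}
    assert a + b in candidates and a - b in candidates

print("inclusion verified on all sample pairs")
```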


2.3. Proposition. Let α, β ∈ ΓP with α ≠ 0. Then

αβ ∈ ΓP and β/α ∈ ΓP .

Proof. First note that by the argument in Paragraph 2.1, it suffices to consider the
case where α, β > 0. If α = 1, there is nothing to prove, and so we henceforth assume
that α ≠ 1.
(a) Let a = (α, 0) and b = (0, β), which are constructible from P.
(b) Using Paragraph A.10.3, we can draw a line L through the point p1 = (1, 0)
which is parallel to the line through a and b. That line will intersect the
y-axis at the point c ∶= (0, γ), and thus γ ∈ ΓP .
The triangles (p0 a b) and (p0 p1 c) are similar, and thus

α/1 = d(p0 , a)/d(p0 , p1 ) = d(p0 , b)/d(p0 , c) = β/γ.

That is, β/α = γ ∈ ΓP .

Since 1 ∈ ΓP , the above argument with β = 1 implies that γ1 ∶= 1/α ∈ ΓP and
therefore that

αβ = β/γ1 ∈ ΓP .

[Figure 10.0: the points P0 , P1 , A and the lines L0 , L1 , L2 .]



2.4. Theorem. Suppose that {p0 = (0, 0), p1 = (1, 0)} ⊆ P ⊆ R2 . Then
Q ⊆ ΓP ⊆ R
is a field. Thus
ΓP = Q({α, β ∶ (α, β) ∈ P}).

Proof. That ΓP ⊆ R is clear from the definition.


Note that P has at least two distinct points, and that p1 = (1, 0) ∈ P implies
that 1 ∈ ΓP ≠ ∅. By Proposition 2.2 and Proposition 2.3, we see that α, β ∈ ΓP
implies that α − β and αβ lie in ΓP . By the Subring Test, ΓP is a unital subring.
Since any unital subring of R contains Z, ΓP contains Z.
Since 1 ∈ ΓP , and since β ≠ 0 implies that 1/β ∈ ΓP by Proposition 2.3, we see that
in fact, ΓP is a field. But Z ⊆ ΓP , and thus Q ⊆ ΓP as well.

Finally, by the comments in Remark 1.6, we know that ΓP = {α, β ∶ (α, β) ∈ R2 is constructible from P}.
Since we now know that ΓP is a field which also contains Q, it must be the smallest
field which contains Q and {α, β ∶ (α, β) ∈ P}. In other words,
ΓP = Q({α, β ∶ (α, β) ∈ P}).

2.5. Corollary. The set Γ○ of constructible real numbers is a subfield of R that


contains Q.

2.6. Remark. Although we have determined that the set Γ○ = Γ{p0 ,p1 } of con-
structible real numbers is a subfield of R that contains Q, we still don’t know much
about it. The next result establishes a relation between straight-edge and compasses
constructions and field extensions, which will be the key to understanding Γ○ , and
to establishing the impossibility of certain classical geometric constructions in the
next section of this Chapter.

2.7. Theorem. Suppose that {p0 , p1 } ⊆ P ⊆ R2 , and suppose that a point


r = (r1 , r2 ) ∈ R2 is constructible in one step from P. Then
[ΓP∪{r} ∶ ΓP ] ∈ {1, 2}.

Proof. It is clear from the definition of ΓP∪{r} that ΓP∪{r} = ΓP (r1 , r2 ). It remains
to find the degree of the extension. This leads us to consider three possibilities
for r.
(a) r is the point of intersection of two non-parallel lines L1 determined by
a = (a1 , a2 ) and b = (b1 , b2 ) ∈ P, and L2 determined by c = (c1 , c2 ) and
d = (d1 , d2 ) ∈ P.

We shall argue the case where the slopes of L1 and L2 are non-zero real
numbers. The cases where one line is horizontal and/or one line is vertical
are left as (routine) exercises for the reader.
Let us write L1 in the form L1 = {(x, y) ∈ R2 ∶ y = m1 x + n1 }. For
(x, y) ∈ L1 with (x, y) ∉ {a, b}, we have that

(x − a1 )/(x − b1 ) = (y − a2 )/(y − b2 ),

and therefore

y = ((a2 − b2 )/(a1 − b1 )) x + (a1 b2 − a2 b1 )/(a1 − b1 ).

Set m1 ∶= (a2 − b2 )/(a1 − b1 ) and n1 ∶= (a1 b2 − a2 b1 )/(a1 − b1 ), and observe
that m1 , n1 ∈ ΓP , since a1 , a2 , b1 , b2 ∈ ΓP .

A similar argument shows that we may write


L2 = {(x, y) ∈ R2 ∶ y = m2 x + n2 },
where m2 , n2 ∈ ΓP .
Thus r ∈ L1 ∩ L2 implies that
r2 = m1 r1 + n1 = m2 r1 + n2 ,
whence
r1 = (n2 − n1 )/(m1 − m2 ),  r2 = m1 (n2 − n1 )/(m1 − m2 ) + n1 .
Observe that r1 , r2 ∈ ΓP , and in this case,
ΓP (r1 , r2 ) = ΓP∪{r1 ,r2 } = ΓP .
Thus [ΓP (r1 , r2 ) ∶ ΓP ] = 1 in this case.

(b) r is the point of intersection of a line L determined by a = (a1 , a2 ) and


b = (b1 , b2 ) ∈ P, and a circle T centred at a point c = (c1 , c2 ) ∈ P and
containing the point d = (d1 , d2 ) ∈ P. Observe that the radius of the circle is
ρ ∶= √((d1 − c1 )^2 + (d2 − c2 )^2 ), and thus ρ^2 ∈ ΓP .
Again, we shall argue the case where L is parallel to neither axis, and
leave the other cases as an exercise for the reader. From part (a) above, we
see that we may express L as
L = {(x, y) ∈ R2 ∶ y = mx + n},
where m, n ∈ ΓP are fixed constants and m ≠ 0.
Now r = (r1 , r2 ) ∈ L ∩ T implies that
r2 = mr1 + n, and (r1 − c1 )2 + (r2 − c2 )2 = ρ2 .
Note that r2 ∈ ΓP (r1 ), and thus ΓP (r1 , r2 ) = ΓP (r1 ) in this case. To solve
for r1 , we must solve the equation
(r1 − c1 )2 + ((mr1 + n) − c2 )2 = ρ2 ,

or equivalently,

(1 + m2 )r12 + (−2c1 + 2mn − 2mc2 )r1 + (c21 + n2 − 2nc2 + c22 − ρ2 ) = 0.

But this is a quadratic equation, and thus [ΓP (r1 ) ∶ ΓP ] ∈ {1, 2} (depending
upon whether or not this equation has roots in ΓP .)

(c) The third and last possibility is that r is the point of intersection of two
circles T1 (centred at a = (a1 , a2 ) ∈ P and containing b = (b1 , b2 ) ∈ P) and
T2 (centred at c = (c1 , c2 ) ∈ P and containing d = (d1 , d2 ) ∈ P). As in
part (b), we note that if ρk is the radius of the circle Tk , then ρk^2 ∈ ΓP ,
k = 1, 2.
Thus r = (r1 , r2 ) ∈ T1 ∩ T2 implies that

(r1 − a1 )2 + (r2 − a2 )2 = ρ21 ,

and
(r1 − c1 )2 + (r2 − c2 )2 = ρ22 .
Taking the difference of these two equations, we obtain

r2 = αr1 + β,

where α, β ∈ ΓP . In particular, r2 ∈ ΓP (r1 ), and so ΓP (r1 , r2 ) = ΓP (r1 )


again! Although we don’t need the exact values of α and β, for those who
may be checking:

α = (a1 − c1 )/(a2 − c2 ), and β = ((a1^2 − c1^2 ) + (a2^2 − c2^2 ) − (ρ1^2 − ρ2^2 ))/(2(a2 − c2 )).

Substituting this into the equation for T1 , we see that

(r1 − a1 )2 + (αr1 + β − a2 )2 = ρ21 ,

so that r1 satisfies a quadratic equation over ΓP , and thus

[ΓP (r1 , r2 ) ∶ ΓP ] = [ΓP (r1 ) ∶ ΓP ] ∈ {1, 2}.
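The quadratic displayed in part (b) can be sanity-checked numerically: for a concrete line and circle, its roots should be the x-coordinates of the actual intersection points. (A quick Python sketch with sample data of my own choosing, not part of the notes proper:)

```python
import math

# Line y = m*x + n and circle centred at (c1, c2) with radius rho.
m, n = 1.0, 0.0              # the line y = x
c1, c2, rho = 0.0, 0.0, 1.0  # the unit circle

# Coefficients of the quadratic from part (b):
#   (1 + m^2) r1^2 + (-2 c1 + 2 m n - 2 m c2) r1
#       + (c1^2 + n^2 - 2 n c2 + c2^2 - rho^2) = 0.
A = 1 + m**2
B = -2 * c1 + 2 * m * n - 2 * m * c2
C = c1**2 + n**2 - 2 * n * c2 + c2**2 - rho**2

disc = math.sqrt(B**2 - 4 * A * C)
roots = sorted([(-B - disc) / (2 * A), (-B + disc) / (2 * A)])
print(roots)  # x-coordinates of the intersections of y = x with the unit circle

# These should be -1/sqrt(2) and 1/sqrt(2):
assert all(abs(r - e) < 1e-12
           for r, e in zip(roots, [-1 / math.sqrt(2), 1 / math.sqrt(2)]))
```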

2.8. Remark. In fact, the above proof shows that if r = (r1 , r2 ) ∈ R2 can be
constructed in one step from P, then [ΓP (r1 ) ∶ ΓP ] ∈ {1, 2} and r2 ∈ ΓP (r1 ), so that
[ΓP (r1 , r2 ) ∶ ΓP ] ∈ {1, 2}.

2.9. Corollary. Suppose that r = (r1 , r2 ) ∈ R2 is constructible from P. Then

[ΓP (r1 , r2 ) ∶ ΓP ] ∈ {2m ∶ 0 ≤ m ∈ Z}.

Proof. Choose q0 = p1 = (1, 0) ∈ P and qk = (ak , bk ) ∈ R2 such that each qk


is constructible in one step from P ∪ {q0 , q1 , q2 , . . . , qk−1 }, 1 ≤ k ≤ n, and qn =
(an , bn ) = r.
To simplify the notation, set Pk = P ∪ {q0 , q1 , q2 , . . . , qk }, 1 ≤ k ≤ n, and observe
that qk is constructible in one step from Pk−1 . By the above Remark 2.8,

[ΓPk−1 (ak , bk ) ∶ ΓPk−1 ] ∈ {1, 2}.

But ΓPk−1 (ak , bk ) = ΓPk , and thus

[ΓPk ∶ ΓPk−1 ] ∈ {1, 2}.

Finally, by Theorem 1.27,

[ΓPn ∶ ΓP ] = [ΓPn ∶ ΓPn−1 ][ΓPn−1 ∶ ΓPn−2 ]⋯[ΓP1 ∶ ΓP ] ∈ {2m ∶ 0 ≤ m ∈ Z},

say
[ΓPn ∶ ΓP ] = 2m0
for some 0 ≤ m0 ∈ Z.
Now ΓP ⊆ ΓP (r1 , r2 ) is a subfield of ΓPn , and thus

2m0 = [ΓPn ∶ ΓP ] = [ΓPn ∶ ΓP(r1 ,r2 ) ] [ΓP(r1 ,r2 ) ∶ ΓP ],

which implies that [ΓP (r1 , r2 ) ∶ ΓP ] divides 2m0 ; i.e.

[ΓP (r1 , r2 ) ∶ ΓP ] = 2n0 for some 0 ≤ n0 ≤ m0 .

2.10. Corollary. Let ϱ ∈ R be a constructible real number. Then

[Q(ϱ) ∶ Q] ∈ {2m ∶ 0 ≤ m ∈ Z}.

Proof. Observe that ϱ ∈ Γ○ if and only if r ∶= (ϱ, 0) is constructible from P ○ ∶=


{p0 , p1 }. By Corollary 2.9 above, this implies that

[ΓP ○ (ϱ, 0) ∶ ΓP ○ ] ∈ {2m ∶ 0 ≤ m ∈ Z}.

To complete the proof, it suffices to note that ΓP ○ = Q(0, 1) = Q.




3. Back to geometry
3.1. We started this journey into the realm of constructible numbers because
of our deep admiration of Euclid, and our appreciation of his weakness for lines
and circles. While the Greeks were masters of the plane method, there were three
constructions that fell beyond their reach, and which would subsequently become
famous3, namely: doubling the cube, squaring the circle, and trisecting
an angle.
the plane method remained open for approximately 2000 years, until the French
mathematician Pierre Laurent Wantzel (1814-1848) proved that the problems of
doubling the cube and trisecting the angle lay not with the Greeks, but with the
nature of field extensions related to constructible real numbers. The same Pierre
Laurent Wantzel also proved that a regular polygon is constructible if and only if
the number of its sides is a product of a power of 2 and (any number – including zero)
of distinct Fermat primes (primes of the form 2^{2^k} + 1). As for the problem of
squaring the circle – it was resolved
when Lindemann proved that π is transcendental over Q, as we shall discover below.

3.2. Doubling the cube. Since the set Γ○ of constructible real numbers is
a field that contains Q, it is clear that we can construct the points (0, 0), (1, 0),
(1, 1) and (0, 1) from the set P ○ ∶= {p0 , p1 }. By connecting these points with the
appropriate line segments, it follows that we may construct a square of area equal
to 1. By thinking “outside the box”, so to speak, we may view this as the face of a
cube in R3 whose volume is equal to 1.
Of course, using a straight-edge and compasses construction, we can “double
the area of the square”, because if we set
ϱ ∶= d((0, 0), (1, 1)),
then ϱ = √2 is constructible, and so we may also construct the points (0, 0), (√2, 0),
(√2, √2) and (0, √2), which are the vertices of a square of area equal to 2.
( 2, 2) and (0, 2), which are the vertices of a square of area equal to 2.
It occurred to the Greeks to ask:
starting with P ○ ∶= {p0 , p1 }, can we construct the face of a cube
whose volume is 2?

Despite the efforts of a great many and many great mathematicians, this prob-
lem remained unsolved for approximately 2000 years. Thanks to Pierre Laurent
Wantzel, we are now in a position to resolve this problem. Take that, Euclid.

3.3. Theorem. (Wantzel) The cube cannot be doubled in volume using a


straight-edge-and-compasses construction.
Proof. If it were possible to construct the face of this cube, then we could construct
a line segment of length 2^{1/3} , and in particular, we would conclude that α ∶= 2^{1/3} ∈ Γ○ .

3See our previous note on fame in mathematics.



Note that α is a root of the polynomial

q(x) = x3 − 2 ∈ Z[x].

By Eisenstein’s Criterion (with p = 2) (Theorem 7.3.14), we see that q(x) is irre-
ducible over Q. By Theorem 9.1.13, we see that Q(2^{1/3} ) ≃ Q[x]/⟨q(x)⟩ is a field
extension of degree 3 over Q. Since 3 is not a power of 2, Corollary 2.10 above
shows that 2^{1/3} ∉ Γ○ .
Thus we cannot construct the face of a cube of volume 2 in R2 using a straight-
edge and compasses construction and the points p0 and p1 .
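As an alternative to the Eisenstein argument, one could use the Mod-p Test: for a cubic, irreducibility over Zp amounts to having no root in Zp. A quick search (p = 7 is my own choice; note that p = 5 would fail, since 3^3 = 27 ≡ 2 (mod 5)):

```python
# Mod-p Test: x^3 - 2 has no root in Z_7, and a cubic with no roots in the
# field is irreducible there; hence x^3 - 2 is irreducible over Q as well.
roots_mod_7 = [a for a in range(7) if (a**3 - 2) % 7 == 0]
print(roots_mod_7)  # []

# For contrast, the test is inconclusive at p = 5, where a root exists:
roots_mod_5 = [a for a in range(5) if (a**3 - 2) % 5 == 0]
print(roots_mod_5)  # [3]
```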

3.4. Squaring the circle. Starting with P ○ = {p0 , p1 }, we can construct


a circle C whose radius is d(p0 , p1 ) = 1. The area it inscribes is then Area(C) =
π(12 ) = π.
A second question of geometry which troubled the Greeks was:
starting with P ○ ∶= {p0 , p1 }, can we construct a square whose area
is that inscribed by the circle C of radius 1?
Of course, we now recognise that the length of a side of such a square would have to be α ∶= √π.
At this stage, we shall unfortunately have to appeal to a result a bit outside the scope of these notes, namely the following wonderful theorem proven by Carl Louis Ferdinand von Lindemann in 1882 (see [vL82] for a proof of this fact).
3.5. Theorem. (Lindemann) The number π is transcendental. That is, π is
not the root of any non-zero polynomial with rational coefficients.
If √π were constructible (i.e. if we were to have √π ∈ Γ○), then by Corollary 2.10, we would know that [Q(√π) ∶ Q] < ∞. Since π ∈ Q(√π), this would imply that π is algebraic, a contradiction.
Thus we cannot construct a square whose area is that of the unit circle C.
3.6. Trisecting an angle. Paragraph A10.4 shows that it is always possible to bisect a constructible angle.
A third question of geometry which troubled and eluded the ancient Greeks was
that of “trisecting the angle”. Properly phrased, the question is whether or not one
could always trisect an angle which one could already construct using a straight-edge
and compasses.
For example, if one begins with the angle θ = π, then trisecting the angle θ is certainly possible using the plane method. Similarly, one can “trisect” the angle θ′ ∶= π/2, because one can construct an angle of π/6. (We leave these two problems as exercises for the interested reader.)
Let us consider the problem of trisecting other angles. We first construct two circles C1 and C2 of radius 1 ∈ Γ○ centred at p0 and p1 = (1, 0). They will intersect at the point z = (1/2, √3/2), and the triangle (p0 p1 z) is equilateral. Thus the angle ∠(zp0 p1) measures π/3 radians (or 60○), and so π/3 is a “constructible angle”. Is it possible to trisect this angle?
220 10. STRAIGHT-EDGE AND COMPASSES CONSTRUCTIONS

That is, is it possible to construct a line L which intersects the unit circle C at the point w such that the angle ∠(wp0 p1) measures π/9 radians?
3.7. Theorem. (Wantzel) The angle π/3 cannot be trisected using a straight-edge-and-compasses construction.
Proof. The question reduces to: what is that point w, and does it lie in Γ○? Writing w = (w1, w2), we see that w1 ∶= cos(π/9), and w2 ∶= sin(π/9). If w ∈ Γ○, then w1 and w2 must each be constructible as well!
Recall that for all 0 < α ∈ R, we have that
cos(3α) = cos(2α + α)
= cos(2α) cos(α) − sin(2α) sin(α)
= (2 cos² α − 1) cos α − (2 sin α cos α) sin α
= 2 cos³ α − cos α − 2(sin² α) cos α
= 2 cos³ α − cos α − 2(1 − cos² α) cos α
= 4 cos³ α − 3 cos α.
Applying this with α = π/9, and recalling that cos(3α) = cos(π/3) = 1/2, we see that w1 = cos(π/9) must satisfy

1/2 = cos(3α) = 4w1³ − 3w1,

or equivalently, w1 must be a root of the polynomial

q(x) = 8x³ − 6x − 1.
We claim that q(x) is irreducible over Q = ΓP ○. Indeed, suppose that q(x) were reducible over Q. Then we could write q(x) = g(x)h(x) where deg(g(x)) = 1 and deg(h(x)) = 2. By Theorem 7.3.8, if q(x) is reducible over Q, then q(x) is reducible over Z, and so we can find factors g1(x) and h1(x) ∈ Z[x] with deg(g1(x)) = 1 and deg(h1(x)) = 2 such that q(x) = g1(x)h1(x).
Set g1(x) = a1x + a0 and h1(x) = b2x² + b1x + b0 ∈ Z[x]. We must therefore solve

8x³ − 6x − 1 = (a1b2)x³ + (a1b1 + a0b2)x² + (a1b0 + a0b1)x + (a0b0),
where all ai , bj ’s lie in Z.
The equation a0 b0 = −1 implies that (a0 , b0 ) = (1, −1) or (a0 , b0 ) = (−1, 1). By
multiplying both g1 (x) and h1 (x) by −1 if necessary, we may assume without loss
of generality that a0 = 1, b0 = −1.
Thus
−6 = a1 b0 + a0 b1 = −a1 + b1 ,
0 = a1 b1 + a0 b2 = a1 b1 + b2 ,
and
8 = a1 b2 .
At this stage of the course, it is safe to leave it to the reader to verify that this cannot be done.
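For the sceptical reader, the verification can be automated. With a0 = 1 and b0 = −1 fixed as above, the first two displayed equations force b1 = a1 − 6 and b2 = −a1b1, while the third requires a1b2 = 8, so a1 must be a (possibly negative) divisor of 8. The short Python sketch below (our own variable names, not from the notes) runs through every case, and also checks numerically that cos(π/9) really is a root of q(x) = 8x³ − 6x − 1.

```python
import math

# Possible values of a1: the equation a1 * b2 = 8 forces a1 to divide 8.
a1_candidates = [s * d for d in (1, 2, 4, 8) for s in (1, -1)]

solutions = []
for a1 in a1_candidates:
    b1 = a1 - 6       # from  -6 = -a1 + b1
    b2 = -a1 * b1     # from   0 = a1*b1 + b2
    if a1 * b2 == 8:  # the remaining constraint  8 = a1*b2
        solutions.append((a1, b1, b2))

assert solutions == []  # no integer factorisation q = g1 * h1 exists

# Numerical sanity check: w1 = cos(pi/9) satisfies 8x^3 - 6x - 1 = 0.
w1 = math.cos(math.pi / 9)
assert abs(8 * w1 ** 3 - 6 * w1 - 1) < 1e-9
```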

It follows that q(x) is irreducible over Q. But then, as in Paragraph 3.2, we find that Q(w1) ≃ Q[x]/⟨q(x)⟩ is a field extension of degree 3 over Q, and so by Corollary 2.10, w1 ∉ Γ○.
We conclude that the angle π/9 cannot be constructed, and so the angle π/3 cannot be trisected.


Appendix

A10.1. Let a and b be distinct points in R². Here we describe how to construct a line L passing through a which is perpendicular to the line passing through a and b.

1. Let L1 denote the line passing through a and b, and let ϱ = d(a, b).
2. Let C be a circle of radius ϱ centred at a. The circle C will intersect L1 at two points, namely b and d. Then d(b, d) = 2ϱ.
3. Let Cb be the circle of radius 2ϱ centred at b, and Cd be the circle of radius
2ϱ centred at d. Then Cb and Cd will intersect at two points, namely y1
and y2 .
4. Let L denote the line passing through y1 and y2 . Observe that the line
segments S1 from y1 to y2 and S2 from d to b are the diagonals of the
rhombus determined by the vertices d, b, y1 and y2 , and as such, they are
perpendicular bisectors. Then L is the desired line.

√ where a = p0 = (0, 0) and b = p1 = (1, 0), we find that y1 = (0, 2)
(In the case
and y2 = (0, − 2). The line L connecting these two points is then the y-axis, which
is clearly perpendicular to the x-axis and passes through a = (0, 0).)
Figure 10.1. The line L through a perpendicular to the line through a and b.
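The concrete example above is easy to verify numerically. The small Python sketch below (our own names, not from the notes) simply intersects the two circles of radius 2ϱ = 2 centred at b = (1, 0) and d = (−1, 0) and checks that the resulting chord is the y-axis.

```python
import math

b, d = (1.0, 0.0), (-1.0, 0.0)   # rho = d(a, b) = 1, so the radius is 2
r = 2.0

# By symmetry the circles |p - b| = r and |p - d| = r meet on the
# vertical line x = 0; solve (x - 1)^2 + y^2 = r^2 at x = 0 for y.
x = 0.0
y = math.sqrt(r ** 2 - (x - b[0]) ** 2)
y1, y2 = (x, y), (x, -y)

# The chord through y1 and y2 is the y-axis: it passes through
# a = (0, 0) and is perpendicular to the x-axis through a and b.
assert y1[0] == y2[0] == 0.0
assert math.isclose(y, math.sqrt(3.0))
```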

A10.2. Let a, b and d be three non-collinear points in R². Here we describe how to construct a line through d which is perpendicular to the line passing through a and b.

1. Let L1 denote the line passing through a and b. If the line L2 through d
and b is perpendicular to L1 , then we are done.
2. Otherwise, let Cd be a circle of radius ϱ ∶= d(b, d) centred at d. Since ϱ is
greater than the distance from d to L1 , the circle Cd will intersect L1 at
two points, namely b and x.
3. Let Cb be the circle of radius ϱ centred at b, and Cx be the circle of radius
ϱ centred at x. Then Cb and Cx will intersect at two points, namely d and
y.
4. Let L2 denote the line passing through d and y.
We claim that L2 is perpendicular to L1 . (Obviously L2 contains d.) Indeed,
the figure (bdxy) is a rhombus, and the segments bx and dy are the diagonals of
that rhombus, which are perpendicular to each other.

Figure 10.2. The line through d and y is perpendicular to the line through a and b.

A10.3. Let L1 be a line in R² passing through the points a and b, and let d ∈ R² be a point not on L1. Here we describe how to construct a line L2 through d which is parallel to L1.

1. Observe that L1 is the (unique!) line passing through a and b.
2. By Paragraph A10.2, we can construct a line L2 which passes through d
and is perpendicular to L1 .
3. By Paragraph A10.1, we can construct a line L3 which is perpendicular to
L2 and passes through d.
Then L3 is parallel to L1 (since both are perpendicular to L2 ) and L3 passes
through d, as claimed.

Figure 10.3. The line L3 is parallel to the line L1 and passes through d.

A10.4. Suppose that we have three non-collinear points a, b and c. It is always possible to bisect the angle ∠abc formed by these three points.

Exercises for Chapter 10

Exercise 10.1.
Prove that √3 + √(36 + 25) is constructible.

Exercise 10.2.
Prove that it is always possible to bisect a constructible angle.

Exercise 10.3.
Prove that the angle θ can be trisected using the plane method if and only if the
polynomial
q(x) = 4x³ − 3x − cos θ
is reducible over Q(cos θ).

Exercise 10.4.
Prove that one may approximate the angle π/9 with any degree of precision using the plane method, starting from P ○ = {p0, p1}.

Exercise 10.5.
Show that it is possible to construct a regular pentagon using the plane method,
starting from P ○ = {p0 , p1 }.

Exercise 10.6.
Prove that if α, β ∈ Γ○, then √(α² + β²) ∈ Γ○. Is the converse true?

Exercise 10.7.
Prove that it is possible to construct a regular dodecagon (i.e. a 12-sided polygon,
all of whose sides are of equal length) using the plane method, starting from P ○ =
{p0 , p1 }.

Exercise 10.8.
Prove that [Q(√(4 + 2√3)) ∶ Q] = 2.

Exercise 10.9.
Let a, b, c ∈ Γ○ with a ≠ 0, and suppose that b² − 4ac ≥ 0. Prove that

(−b ± √(b² − 4ac)) / (2a) ∈ Γ○.

Exercise 10.10.
Prove that, given two distinct points a and b in the plane, it is possible to construct the midpoint of the line segment from a to b using a straight-edge and compasses.
APPENDIX A

The Axiom of Choice

1. Introduction
1.1. The Axiom of Choice has been mentioned earlier in these course notes, as
has Zorn’s Lemma. In this note, we shall examine these two equivalent statements,
as well as a third statement, the Well-Ordering Principle, which is also equivalent
to these two.

1.2. Set theory is the study of specific collections, called... wait for it... sets of
objects that are called elements of that set. Pure set theory deals with sets whose
elements are again sets. Experts in set theory claim that the theory of hereditarily-
finite sets – i.e. those finite sets whose elements are also finite sets, whose elements in
turn are also finite sets, whose elements ... etc., is formally equivalent to arithmetic,
which we shall neither define nor attempt to explain.
The form of set theory which interests us here arose out of attempts by math-
ematicians of the late 19th and early 20th centuries to try to understand infinite
sets, and originated with the work of the German mathematician Georg Cantor.
He had published a number of articles in number theory between 1867¹ and 1871², but began work on what was to become set theory by 1874.
In 1878, Cantor formulated what is now referred to as the Continuum Hypothesis (CH). Suppose that X is an infinite subset of the real line. The Continuum Hypothesis asserts that either there exists a bijection between X and the set N of natural numbers, or there exists a bijection between X and the set R of real numbers.
Many well-known and leading mathematicians of his day attempted to prove
this statement, including Cantor himself, as well as David Hilbert, who included
it as the first of the twenty-three unsolved problems he presented at the Second
International Congress of Mathematics in Paris, in 1900. Any attempt to prove this
would require one to understand not only sets of real numbers, but sets in general.
Like mangoes, mathematical theories need time to ripen. Early attempts to
define sets led to inconsistencies and paradoxes. Originally, it was thought that
any property P could be used to define a set, namely the set of all sets (or even
other objects) which satisfy property P . The mathematician-philosopher Bertrand
Russell produced the following worrisome example:

¹ The year of Canada’s birth.
² The year that the Government of Canada promised B.C. a railway, which they delivered in 1885, by the way.


Russell’s Paradox. Let P be the property of sets: X ∉ X. That is, a set X satisfies property P if and only if X does not belong to X.
Let
R ∶= {X a set ∶ X satisfies property P } = {X a set ∶ X ∉ X}.
The question becomes: does R satisfy property P? If so, then R ∉ R, and so R ∈ R. If not, then R ∈ R, so R ∉ R.

This is not a good state of affairs, and the German mathematician Ernst Zermelo, who was also aware of this paradox, thought that set theory should be axiomatised in order to be rigorous and preclude paradoxes such as the one above. In 1908, Zermelo produced a first axiomatisation of set theory, although he was unable at that time to prove that his axiomatic system was consistent. There was also a second problem, insofar as some logicians/set theorists were concerned, namely: Zermelo avoided Russell’s Paradox by means of an axiom which he referred to as the Separation Axiom. This, however, he stated using second-order logic, which is considered less desirable (in the sense that it requires stronger hypotheses) than the version offered by the Norwegian Thoralf Skolem and the German Abraham Fraenkel, which relied only on so-called first-order logic.
As it happens, the Hungarian mathematician John von Neumann also added to the list of so-called Zermelo-Fraenkel Axioms we shall enumerate below by introducing the Axiom of Foundation. We are not sufficiently versed in the history of this area to be able to explain why Skolem’s and von Neumann’s names don’t appear on the list of Axioms, although von Neumann’s name appears in so many other mathematical contexts that even if he were alive today, he would not have the right to complain.

1.3. The Zermelo-Fraenkel Axioms (ZF).


● The Null Set Axiom.
There exists a set ∅, called the empty set, which has no elements.
● The Axiom of Extension.
Two sets A and B are equal if and only if they have the same elements.

● The Axiom of Regularity, aka the Axiom of Foundation.


Every non-empty set A contains an element B such that no element of
A belongs to B.
● The Axiom of Specification.
If A is a set and P is a property, then there exists a set
B ∶= {b ∈ A ∶ b satisfies property P }.

● The Axiom of Pairing.


Given two sets A and B, there exists a set {A, B} whose only elements
are A and B. (Combined with the Null Set Axiom, we deduce that {A} is
also a set.)
● The Axiom of Union.
If A is a set, then there exists a set ∪A, called the union of A, defined
by
∪A = {b ∶ b ∈ B for some B ∈ A}.

● The Axiom of Power Sets.


Given a set A, there exists a set P(A) = {B ∶ B ⊆ A} whose elements
are all of the subsets of A.
● The Axiom of Infinity.
There exists an infinite set. More specifically, there exists a set Z such that ∅ ∈ Z and A ∈ Z implies that ∪{A, {A}} ∈ Z.

1.4. Just a quick comment about the Axiom of Specification: why doesn’t it
lead right back to Russell’s Paradox? Because the Axiom of Specification assumes
that you begin with a set, and you look at the members of that set which satisfy
a property P . In Russell’s Paradox, you don’t have a starting set – instead you
manufacture a set out of thin air through a property P . This is the kind of object
we now refer to as a class, which is an object more general than a set. In fact, a
class is a collection of sets, which means that we can’t apply Russell’s paradigm to
a class of classes.

1.5. There is one Axiom we have yet to specify, which – given the title of this
note – we might want to do sooner rather than later.

● The Axiom of Choice. Let Λ be a non-empty set, and let {Aλ }λ∈Λ be a
collection of non-empty sets. Then there exists a function f ∶ Λ → ∪λ∈Λ Aλ
such that f (γ) ∈ Aγ for each γ ∈ Λ.
In layman’s terms, if you have a (non-empty) collection of (non-empty) sets,
then you can choose an element from each set. In technical terms, we call such a
function f a choice function.
We write (ZFC) to indicate (ZF) plus the Axiom of Choice.

1.6. Remark. There are a couple of remarks we should make here.


(a) First: what the heck is going on? This is so obviously true that it isn’t
even worth mentioning! How could it not be true?
(b) Doesn’t it follow from the Axioms above? If not, what are they good for?

1.7. I sympathise with those who think it’s obviously true. I really do. For one
thing, the Axiom of Choice becomes an issue only if we have a lot of sets – in fact,
only when we are dealing with infinitely many such sets. (Whether those sets are
finite or infinite is not the issue – the issue is how many sets we have.)
The formal reason why this is so can be summarised in the following way: (ZFC)
is a “theory in first-order logic”, and our ability to “choose” an element from a
(single) non-empty set is an application of a rule from that theory that carries
the groovy moniker existential instantiation. We do not claim to be an expert in
the theory of first-order logic, but the upshot that a hard-working, good-looking
mathematician but non-logician like me can safely take away from this statement is
that given a non-empty set, (ZF) allows you to pick an element of that set. If you
are interested in the details of first-order logic, then be aware that treatments now
exist to cure this, but also that some people exhibit these symptoms for years and
still manage to lead productive lives, both inside and outside of mathematics.
Another thing to note is that existential instantiation is not a constructive argu-
ment. That is, we are not given a method of choosing an element from a non-empty
set A – all we know is that it is possible. If we relabel A as Aλ, then saying that this rule produces an element a ∈ Aλ is the same as saying that there exists a function f ∶ {λ} → Aλ that satisfies f (λ) = a. That is, we have our choice function.
Having done this once, first-order logic allows us to repeat this process finitely
often. That is, given finitely many non-empty sets A1, A2, . . . , An, we can find a function f ∶ {1, 2, . . . , n} → ∪nk=1 Ak such that f (k) ∈ Ak, 1 ≤ k ≤ n.
What first-order logic and the Zermelo-Fraenkel Axioms do not do, however, is
to carry out this procedure infinitely often, willy-nilly. We shall try to make the
expression “willy-nilly” more precise below.

Willy-Nilly, or not Willy-Nilly. That is the question.

1.8. While it is important for one to try to be original, it is also important for one to acknowledge the accomplishments of those who came before one. In particular, the same Bertrand Russell referenced above once said:
To choose one sock from each of infinitely many pairs of socks re-
quires the Axiom of Choice, but for shoes the Axiom is not needed.
It is interesting to try to figure out what exactly Russell meant by this, and
what it says about his sartorial predilections.
The difference to which Bertrand Russell is alluding is related to the non-
constructiveness of existential instantiation. Russell’s underlying hypothesis while
making this comment is that there is no way to distinguish between two socks in a
pair. That is, he is assuming that the two socks in a given pair are for all intents and
purposes identical. Thus, to pick a sock from a pair requires the rules of first-order
logic to run (ZF) theory. This allows us to pick a sock from each pair for finitely

many pairs, but there is no way to do this for infinitely many pairs at once. Thus in
order to pick a sock from amongst each of infinitely many pairs of socks, you need
something more than (ZF) theory, and the Axiom of Choice grants you your wish.

A second underlying assumption of Russell’s is that each pair of shoes includes a left shoe and a right shoe. This changes everything. We no longer require existential
instantiation to select a shoe from a pair – instead, we just specify the left shoe.
(We could have said the right.) Thus, to “pick a shoe from each of infinitely many
pairs of shoes”, we simply specify that we shall always pick the left shoe, thereby
circumventing existential instantiation altogether. Having a method to pick an ele-
ment of the non-empty sets we are dealing with allows us to avoid having recourse
to the Axiom of Choice.

1.9. Let us consider how such a thing might arise in practice in a mathematical
setting. Let {At ∶ t ∈ R} be a collection of non-empty subsets ∅ ≠ At ⊆ N, one such
set for every real number t ∈ R. We would like to choose one element from each set.
Can we do this in (ZF), or do we require the Axiom of Choice?
If we had an explicit method to choose the element at ∈ At , then we could indeed
avoid the Axiom of Choice. But how can we specify an element of At without
knowing exactly what At is? The natural numbers have a very special and very
useful property: they are (well-) ordered.

Before defining a well-order we remind the reader that a relation ρ on a non-empty set X is a subset of the set X × X = {(x, y) ∶ x, y ∈ X}. Often, we write
x ρ y to mean the ordered pair (x, y) ∈ ρ. This is especially true when dealing, for
example, with the usual relation ≤ for real numbers: no one writes (x, y) ∈≤; we all
write x ≤ y. Incidentally, the notation ≤ is used not only for the relation “less than
or equal to” for real numbers; it frequently appears to indicate a specific kind of a
relation known as a partial order on an arbitrary set, which we now define.

1.10. Definition. A relation ≤ on a non-empty set X is said to be a partial order if, given x, y, and z ∈ X,
(a) x ≤ x;
(b) if x ≤ y and y ≤ x, then x = y; and
(c) if x ≤ y and y ≤ z, then x ≤ z.
The prototypical example of a partial order is the partial order of inclusion on
the power set P(A) of a non-empty set A. That is, set P(A) = {B ∶ B ⊆ A} and for
B, C ∈ P(A), we set B ⊆ C to mean that every member of B is a member of C. It
is easy to see that ⊆ is a partial order on P(A).
The word partial refers to the fact that given two elements B and C in P(A),
they might not be comparable. For example, if A = {x, y}, B = {x} and C = {y},
then it is not the case that B ⊆ C, nor is it the case that C ⊆ B. Only some pairs of elements of P(A) are comparable.

In dealing with the natural numbers, we observe that they come equipped with
a partial order which we typically denote by ≤. In this setting, however, any two
natural numbers are comparable. If (X, ρ) is a partially ordered set, and if x, y ∈ X implies that either xρy or yρx, then we say that ρ is a total order on X.
Thus (N, ≤) is a totally ordered set.

1.11. It gets even better (and yes, you should seriously ask yourself whether
you deserve it). The most useful property of N for our current purposes is that it
possesses one more very striking property:
given any non-empty subset ∅ ≠ H ⊆ N, it admits a minimum
element;
that is, there exists an element m ∈ H such that m ≤ h for all h ∈ H.
If (X, ρ) is a partially ordered set with the property that every non-empty subset
Y of X admits a minimum element, then we say that (X, ρ) is well-ordered.
The observation above is that (N, ≤) is well-ordered.

1.12. What is the relevance of this to the Axiom of Choice? Returning to the
example in paragraph 1.9, we were given a collection {At ∶ t ∈ R} of non-empty
subsets of N. Since we have just seen that (N, ≤) is well-ordered, we can specify a
choice function
f ∶ R → ∪t∈R At
t ↦ min At .
That is, we have a rule for specifying which element of At we are choosing – we
are choosing the minimum element of At which we know exists (and is unique).
We did not need to know in advance what At was, and we did not need existential
instantiation to define f (t).
Thus we don’t need the Axiom of Choice to pick an element from each subset
At simultaneously.
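In programming terms, min is a uniform rule rather than an arbitrary choice, which is precisely why no appeal to the Axiom of Choice is needed. A toy Python illustration follows; the finite family below is our own invented stand-in for {At ∶ t ∈ R}.

```python
# Each value is a non-empty set of natural numbers; the well-ordering
# of N supplies a canonical choice rule, namely "take the minimum".
family = {0.5: {7, 3, 12}, 2.0: {5}, -1.3: {41, 2}}

def f(t):
    """The choice function t -> min A_t: no arbitrary choices involved."""
    return min(family[t])

chosen = {t: f(t) for t in family}
assert chosen == {0.5: 3, 2.0: 5, -1.3: 2}
```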

1.13. What if we had a collection {Bt ∶ t ∈ R} of non-empty subsets of R? That is, ∅ ≠ Bt ⊆ R for each t ∈ R? Do we need the Axiom of Choice to choose an element
from each set?
It is easy to check that (R, ≤) is a totally ordered set, as was (N, ≤). On the
other hand, the open interval (0, 1) = {x ∈ R ∶ 0 < x < 1} certainly doesn’t have a
minimum element, so that (R, ≤) is not a well-ordered set.
How would we specify an element of Bt without first knowing what Bt is? Unless
we first know something about the set Bt , there is no way of specifying which element
of Bt we are choosing. It follows that we do need the Axiom of Choice to do this.
(Of course, every subset of N is a subset of R, so if we had originally chosen
Bt = At for all t, then we could have just used the “minimum” element method
above. The point here is that if you have no information about Bt other than the
fact that it is non-empty, then you can’t use that argument here.)

1.14. We have not forgotten the second question raised in Remark 1.6, namely:
doesn’t the Axiom of Choice follow from the Axioms of (ZF)? In fact, we have already
partially addressed this – (ZF) and first-order logic do not allow us to choose an
element from each of infinitely-many sets at a time, in large part because of the
non-constructive nature of existential instantiation. (For those who are wondering,
yes, your humble author does love that phrase.)
There remains the question: does it contradict any of the (ZF) axioms? The
simple answer is “No”, although the proof that it does not is anything but simple.
That (ZFC) is consistent (i.e. doesn’t contain any inherent contradictions) was first
demonstrated by the Austro-Hungarian mathematician Kurt Gödel in 1938. Does
that mean that we can’t do mathematics without the Axiom of Choice? Interestingly
enough, in 1963, the American mathematician Paul Cohen demonstrated that one
can assume all of the Axioms of Zermelo-Fraenkel Theory, and also assume that the
Axiom of Choice is false(!!!), and still arrive at a consistent mathematical system.
For this and for his work on the Continuum Hypothesis (he showed that (CH) is
independent of (ZFC)), he was awarded the Fields medal, which is generally consid-
ered to be the mathematical equivalent of the Nobel Prize. Given how prestigious
and merited the Nobel Peace prize invariably is, a prize in mathematics is... (we
leave it as an exercise for the reader to complete this sentence).

2. An apology for the Axiom of Choice.


2.1. As important as it is to choose one’s socks wisely, if the only thing that ever resulted from the Axiom of Choice was our ability to choose a sock from each pair in an infinite collection of pairs of socks, then we would not be talking about it today.

The issue is that this seemingly innocuous assumption has major implications.
In particular, it is known to imply that every vector space (over any field) admits a vector-space basis. Given how useful vector space bases are when studying
linear algebra, it would be beyond tragic to lose them. Of course, one might argue
that one should instead drop the Axiom of Choice and just assume that every vector
space admits a basis. But the joke is then upon the one who argued this: as it so
happens, if we assume that every vector space admits a basis, then one can also
prove that the Axiom of Choice holds.
In other words, the Axiom of Choice is equivalent to the statement that every
vector space admits a basis. One of the statements is true if and only if the second
is true.
2.2. The number of statements that are equivalent to the Axiom of Choice is huge. Entire books are written on this subject - including the book Equivalents
of the Axiom of Choice, II by Herman Rubin and Jean Rubin, which includes 250
statements, each equivalent to the Axiom of Choice. Below we shall mention three
such equivalent statements which are among the most important.

2.3. The Axiom of Choice II. This version is a slight modification of the
Axiom of Choice, and you might want to think about how you would prove that it
is equivalent to the Axiom of Choice:
Let {Aλ }λ∈Λ be a non-empty collection of non-empty disjoint sets.
That is, λ1 ≠ λ2 ∈ Λ implies that Aλ1 ∩ Aλ2 = ∅.
Then there exists a function f ∶ Λ → ∪λ∈Λ Aλ such that f (λ) ∈
Aλ for all λ ∈ Λ.

2.4. The Well-Ordering Principle.


Let ∅ ≠ X be a non-empty set. Then there exists a relation ρ on
X such that (X, ρ) is well-ordered.

If (X, ≤) is a partially ordered set, then a subset C ⊆ X is said to be a chain in X if c, d ∈ C implies that either c ≤ d or d ≤ c. In other words, C is totally ordered using the relation inherited from X.

2.5. Zorn’s Lemma.


Let (X, ≤) be a partially ordered set. Suppose that for each chain
C in X, there exists an element µC ∈ X such that c ≤ µC for all
c ∈ C. (We say that C is bounded above in X.)
Then (X, ≤) admits a maximal element m. That is, there
exists m ∈ X such that if x ∈ X and m ≤ x, then x = m.

2.6. A note about maximal elements of partially ordered sets. Our decision to use “maximal ” instead of “maximum” is not just a matter of fancy. They usually mean different things.
Let A = {∅, {x}, {y}, {z}, {x, y}, {x, z}, {y, z}} be the set of proper subsets of
X = {x, y, z}, and partially order A by inclusion. Thus {x} ≤ {x, y} because {x} ⊆
{x, y}.
Observe that {x, y} is a maximal element of A. Nothing is bigger than {x, y} in
A. But {z} ⊈ {x, y}, and so {x, y} is not a maximum element of A. Note that {x, z} is also a maximal element. Maximal elements need not be unique. (Are maximum elements unique when they exist? Think about it!)
The issue is that a maximal element doesn’t have to be comparable to every
element in the set, while a maximum element does. In other words, a maximal
element only has to be bigger than or equal to those things that you can actually
compare it to, while a maximum element has to be bigger than or equal to everything
in the set! In this example, ∅ would be a minimum, and thus also a minimal element
of (A, ≤).
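The distinction is easy to check mechanically. The sketch below (Python, with frozensets standing in for the subsets; the helper names are our own) enumerates the maximal elements of (A, ≤) and confirms that none of them is a maximum.

```python
from itertools import combinations

X = ('x', 'y', 'z')
# A = the proper subsets of X (sizes 0, 1 and 2), ordered by inclusion.
A = [frozenset(c) for r in range(len(X)) for c in combinations(X, r)]

def is_maximal(s):
    """Maximal: no element of A strictly contains s."""
    return not any(s < t for t in A)

def is_maximum(s):
    """Maximum: s contains every element of A."""
    return all(t <= s for t in A)

maximal = sorted(sorted(s) for s in A if is_maximal(s))
assert maximal == [['x', 'y'], ['x', 'z'], ['y', 'z']]  # three maximal elements
assert not any(is_maximum(s) for s in A)                # but no maximum
```

(The enumeration turns up {y, z} as a third maximal element, of course; maximal elements abound.)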

That the Well-Ordering Principle implies the Axiom of Choice is definitely within
reach of the interested reader. We leave it as an exercise.

2.7. The following quote is worth keeping in mind:


The Axiom of Choice is obviously true, the Well-Ordering Princi-
ple obviously false, and who can tell about Zorn’s lemma?
Jerry Bona
It’s funny people like Professor Bona who put the world of mathematics into
perspective and allow us to laugh at ourselves for days at a time by revealing some
heretofore hidden lacuna in our intuition. It is also worth keeping in mind that
the Well-Ordering Principle is not constructive. It does not suggest any method of
actually finding a well-order on a given set X; it simply says that one exists.
The question of how you might try to well-order the complex numbers is certainly
worth thinking about.

2.8. Although your humble author has certainly not had a quote that compares
in notoriety to that mentioned above, it strikes your humble author that the Well-
Ordering Principle is not as “obviously false” as a somewhat different consequence
of the Axiom of Choice, namely the Banach-Tarski Paradox. This we shall address
in the next section.

3. The prosecution rests its case.


3.1. Given how “obviously true” the Axiom of Choice is, and how perfectly
intuitive it may seem, it would seem outrageous not to include it amongst the axioms
of set theory.
Consider then the following story, which by all accounts is true (see J. Mycielski,
Notices of the AMS 53, no. 2, page 209): Alfred Tarski, a Polish logician, proved
that the Axiom of Choice was equivalent to the following statement.
Given any infinite set X, there is a bijection from X to X × X ∶=
{(x, y) ∶ x, y ∈ X}.
He then submitted this result to the Comptes Rendus de l’Académie des Sci-
ences, where it was refereed by two famous French mathematicians, Maurice René
Fréchet, and Henri Lebesgue. Both referees rejected Tarski’s paper. Fréchet
wrote that the equivalence of two well-known truths is not a new result. Lebesgue
claimed that both statements were false, and so their equivalence was of no interest.
Tarski is said to have never submitted another paper to Comptes Rendus. (Feel
free to revise the sentence you completed at the end of Section 1.)

3.2. Perhaps the most damning evidence one can provide against the Axiom of Choice is the following result, which the Axiom of Choice implies, and which was alluded to above.

3.3. The Banach-Tarski Paradox. Let S ∶= {(x, y, z) ∈ R³ ∶ x² + y² + z² ≤ 1}, so that S denotes the (solid) ball in R³ centred at the origin. It is possible to partition the ball S into finitely many disjoint pieces in such a way that, by translating and rotating them, they can be rearranged into two disjoint identical balls, each having the same volume as S.

Note that stretching, bending, and twisting pieces is not allowed. Once you have
determined the pieces, you can only translate and rotate them.

If you are very thirsty, you might want to try this out on a bag of oranges at
home. Note that the inordinate amount of time you spend cleaning up your mess,
as well as your abject failure in accomplishing this task, might result from the fact
that most knives on the market today produce so-called “measurable” cuts of your
oranges. The same applies to grapefruit, and need we say (?), most other food
products.
The Banach-Tarski Paradox is, to coin a phrase, “paradoxical”. It is infuriating,
in large measure due to the fact that it is so highly counterintuitive. On the other
hand, once you accept the Axiom of Choice, this result follows as a consequence,
and you have no choice but to accept it and move on with your life.

4. So what to do?
4.1. Zermelo-Fraenkel theory plus Choice (ZFC) is popular because it works. It
works well. The Banach-Tarski Paradox and the existence of non-measurable sets
aside, it produces so many great results that the vast majority of mathematicians use
it without hesitation. Having said that, the Axiom of Choice is considered special,
and unlike the other (ZF) axioms. In general, when it is not required to prove a
result, a proof that avoids it is considered better than a proof that uses it. Also, it is
often considered “good form” to indicate when it is being used, although some uses
of the Axiom of Choice have become so commonplace that mathematicians expect
the reader to realise on their own that the Axiom of Choice has been used in the
argument.
There is a group of mathematicians who feel that to prove that a mathematical
object exists, one must construct it. They are referred to as constructivists (at least
this is how they are referred to in polite company). Your humble author has not
met a constructivist mathematician himself, and it occurs to your humble author as
he writes this that if we were to use constructivist logic, then we could not conclude
that one exists, could we? [How do I know that the articles that refer to them aren’t
all made up?] But this author is not a constructivist, and so is willing to accept
that they do exist.

4.2. We conclude by indicating that the Continuum Hypothesis, which was also
shown by Gödel and by Cohen to be independent of (ZFC), is not universally ac-
cepted at all. Any argument that uses the Continuum Hypothesis should mention
where it is being used, and should be avoided if there is any way at all of obtaining
the same result without it. Good results have been proven that require it, and are
usually stated as: assuming the continuum hypothesis, we show that ... .

4.3. Quiz. Can you avoid the Axiom of Choice when choosing an element from
(a) a finite set?
(b) each member of an infinite set of sets, each of which has only one element?
(c) each member of a finite set of sets if each of the members is infinite?
(d) each member of an infinite set of sets of rationals?
(e) each member of an infinite set of finite sets of reals?
(f) each member of a denumerable set of sets if each of the members is denu-
merable?
(g) each member of an infinite set of sets of reals?
(h) each member of an infinite set of sets of integers?

5. Equivalences
Let us now provide a proof of the equivalence of the Axiom of Choice, Zorn’s
Lemma and the Well-Ordering Principle. We remind the reader that a poset is a
partially ordered set, as defined above. We begin with the definition of an initial
segment, which will be required in the proof.

We also remind the reader that, given a collection {Xλ }λ∈Λ of non-empty sets,
the existence of a choice function f ∶ Λ → ∪λ∈Λ Xλ such that f (λ) ∈ Xλ for all
λ ∈ Λ is precisely the statement that
∏λ∈Λ Xλ ≠ ∅,
for the simple reason that
∏λ∈Λ Xλ ∶= {f ∶ Λ → ∪λ∈Λ Xλ a function such that f (λ) ∈ Xλ for all λ ∈ Λ}.
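As an aside not found in the notes: when Λ is finite, no choice principle is needed, since the product can be enumerated explicitly. The following Python sketch (the family X and its labels are invented purely for illustration) lists every choice function for a two-element index set and checks the defining property f (λ) ∈ Xλ.

```python
from itertools import product

# A finite family {X_lam} of non-empty sets, indexed by lam in {1, 2}.
# (Hypothetical data, for illustration only.)
X = {1: ["a", "b"], 2: [0, 1, 2]}

# An element of the product is a choice function f with f(lam) in X_lam;
# we represent each choice function as a dict lam -> f(lam).
labels = sorted(X)
choice_functions = [dict(zip(labels, values))
                    for values in product(*(X[lam] for lam in labels))]

# The product is non-empty, and every f really is a choice function.
assert len(choice_functions) == 2 * 3
assert all(f[lam] in X[lam] for f in choice_functions for lam in labels)
```

Of course, nothing of the sort is available for an arbitrary infinite family; that is precisely what the Axiom of Choice asserts.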

5.1. Definition. Let (X, ≤) be a poset, C ⊆ X be a chain in X, and d ∈ C. We
define
P (C, d) = {c ∈ C ∶ c < d}.
An initial segment of C is a subset of the form P (C, d) for some d ∈ C.

5.2. Example.
(a) For each r ∈ R, (−∞, r) is an initial segment of (R, ≤).
(b) For each n ∈ N, {1, 2, ..., n} is an initial segment of N.
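For a finite chain, Definition 5.1 can be computed directly. The following Python sketch (not part of the notes; the sets chosen are arbitrary illustrations) mirrors Example 5.2(b), where {1, 2, ..., n} = P (N, n + 1).

```python
def initial_segment(C, d):
    """P(C, d) = {c in C : c < d}, for a chain C under its usual order."""
    return {c for c in C if c < d}

# Example 5.2(b): {1, 2, ..., n} is the initial segment of N determined
# by d = n + 1.  A finite piece of N suffices to illustrate this.
C = set(range(1, 20))
n = 5
assert initial_segment(C, n + 1) == {1, 2, 3, 4, 5}

# The minimum element of the chain determines the empty initial segment.
assert initial_segment(C, min(C)) == set()
```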

5.3. Theorem. The following are equivalent:
(i) The Axiom of Choice (AC): given a non-empty collection {Xλ }λ∈Λ of non-
empty sets, ∏λ∈Λ Xλ ≠ ∅.
(ii) Zorn’s Lemma (ZL): Let (Y, ≤) be a poset. Suppose that every chain C ⊆ Y
has an upper bound. Then Y has a maximal element.
(iii) The Well-Ordering Principle (WO): Every non-empty set Z admits a well-
ordering.
Proof.
(i) implies (ii): This is the most delicate of the three implications. We shall
argue by contradiction.
Suppose that (X, ≤) is a poset such that every chain in X is bounded
above, but that X has no maximal elements. Given a chain C ⊆ X, we can find
an upper bound uC for C. Since uC is not a maximal element, we can find
vC ∈ X with uC < vC . We shall refer to such an element vC as a strict
upper bound for C.
By the Axiom of Choice, for each chain C in X, we can choose a strict
upper bound f (C). If C = ∅, we arbitrarily select x0 ∈ X and set f (∅) = x0 .
We shall say that a subset A ⊆ X satisfies property L if
(I) The partial order ≤ on X when restricted to A is a well-ordering of A,
and
(II) for all x ∈ A, x = f (P (A, x)).
● Claim 1: if A, B ⊆ X satisfy property L and A ≠ B, then either A is an
initial segment of B, or B is an initial segment of A.

Without loss of generality, we may assume that A ∖ B ≠ ∅. Let
x = min {a ∈ A ∶ a ∉ B}.
Note that x exists because A is well-ordered. Then P (A, x) ⊆ B. We shall
argue that B = P (A, x). If not, then B ∖ P (A, x) ≠ ∅, and using the
well-orderedness of B,
y = min {b ∈ B ∶ b ∉ P (A, x)}
exists. Thus P (B, y) ⊆ P (A, x).

Let z = min (A ∖ P (B, y)). Then z ≤ x = min (A ∖ B).

● Subclaim 1: P (A, z) = P (B, y).


By definition, P (A, z) ⊆ P (B, y).
To obtain the reverse inclusion, we first argue that if t ∈ P (B, y) =
A ∩ P (B, y), then P (A, t) ∪ {t} ⊆ P (B, y). By hypothesis, t ∈ P (B, y),
so suppose that u ∈ P (A, t). Now t ∈ P (B, y) ⊆ P (A, x), so u < t < x
implies that u ∈ P (A, x). In other words, P (A, t) ⊆ P (A, x) ⊆ B. But
then u ∈ B and u < t < y implies that u ∈ P (B, y).

We now have that if s ∈ P (B, y), then P (A, s) ∪ {s} ⊆ P (B, y) ⊆
P (A, x) ⊆ A. This forces s < z ∶= min (A ∖ P (B, y)), so that s ∈ P (A, z).
Together, we find that P (B, y) ⊆ P (A, z) ⊆ P (B, y), which proves the
subclaim.
Returning to the proof of the claim, we now have that z = f (P (A, z)) =
f (P (B, y)) = y. But y ∈ B, so y ≠ x. Hence z < x. Thus y = z ∈ P (A, x),
contradicting the definition of y. We deduce that P (A, x) = B, and hence
that B is an initial segment of A, thereby proving our claim.

Suppose that A ⊆ X has property L, and let x ∈ A. It follows from the


above argument that given y < x, either y ∈ A or y does not belong to any
set B with property L.
Let V = ∪{A ⊆ X ∶ A has property L}.
● Claim 2: We claim that if w = f (V ), then V ∪ {w} has property L.
Suppose that we can show this. Then V ∪ {w} ⊆ V , so w ∈ V , contradicting
the fact that w = f (V ) is a strict upper bound for V . This will complete the proof.

● Subclaim 2a: First we show that V itself has property L. We must show
that V is well-ordered, and that for all x ∈ V , x = f (P (V, x)).

(a) V is well-ordered.
Let ∅ ≠ B ⊆ V . Then there exists A0 ⊆ X so that A0 has property
L and B ∩ A0 ≠ ∅. Since A0 is well-ordered and ∅ ≠ B ∩ A0 ⊆ A0 ,
m ∶= min(B ∩ A0 ) exists. We claim that m = min(B).
Suppose that y ∈ B. Then there exists A1 ⊆ X so that A1 has property
L and y ∈ A1 . Now, both A0 and A1 have property L:
◇ if A0 = A1 , then m = min(B ∩ A1 ), so m ≤ y.
◇ if A0 ≠ A1 , then either
● A0 is an initial segment of A1 , so A0 = P (A1 , d) for some
d ∈ A1 . Then
m = min(B ∩ A0 ) = min(B ∩ A1 ),
since r ∈ A1 ∖ A0 implies that m < d ≤ r. Hence m ≤ y; or
● A1 is an initial segment of A0 , say A1 = P (A0 , d) ⊆ A0 for
some d ∈ A0 . Then
m = min(B ∩ A0 ) ≤ min(B ∩ A1 ).
Hence m ≤ y.
In both cases we see that m ≤ y. Since y ∈ B was arbitrary, m =
min(B).
Thus, any non-empty subset B of V has a minimum element, and so
V is well-ordered.
(b) Let x ∈ V . Then there exists A2 ⊆ X with property L so that x ∈ A2 .
Then x = f (P (A2 , x)). Suppose that y ∈ V and y < x. Then there exists
A3 ⊆ X with property L so that y ∈ A3 . Since A2 and A3 both have
property L, either
● A2 = A3 , and so y ∈ A2 ; or
● A2 = P (A3 , d) for some d ∈ A3 . Since x ∈ A2 , P (A2 , x) = P (A3 , x)
and therefore y ∈ A2 ; or
● A3 = P (A2 , d) for some d ∈ A2 . Then y ∈ A3 implies that y ∈ A2 .
In any of these three cases, y ∈ A2 . Hence P (V, x) ⊆ P (A2 , x). Since
A2 ⊆ V , we have that P (A2 , x) ⊆ P (V, x), whence P (A2 , x) = P (V, x).
But then
x = f (P (A2 , x)) = f (P (V, x)).
By (a) and (b), V has property L.

We now return to the proof of Claim 2. That is, we prove that if


w = f (V ), then V ∪ {w} has property L.
(I) V ∪ {w} is well-ordered.
We know that V is well-ordered by part (a) above. Suppose that
∅ ≠ B ⊆ V ∪ {w}. If B ∩ V ≠ ∅, then by (a) above, m ∶= min(B ∩ V )
exists. Clearly m ∈ V implies m ≤ f (V ) = w, so m = min(B ∩ (V ∪ {w})).
If ∅ ≠ B ⊆ V ∪ {w} and B ∩ V = ∅, then B = {w}, and so w = min(B)
exists.
Hence V ∪ {w} is well-ordered.
(II) Let x ∈ V ∪ {w}. If x ∈ V , then x = f (P (V, x)) by part (b) above. If x = w,
then
P (V ∪ {w}, x) = P (V ∪ {w}, w) = V,
so x = w = f (V ) = f (P (V ∪ {w}, x)).
By (I) and (II), V ∪ {w} has property L. As we saw in the statement
following Claim 2, this completes the proof that the Axiom of Choice implies
Zorn’s Lemma. Now let us never speak of this again.
(ii) implies (iii): Let X ≠ ∅ be a set. It is clear that every finite subset F ⊆ X
can be well-ordered. Let A denote the collection of pairs (Y, ≤Y ), where
Y ⊆ X and ≤Y is a well-ordering of Y . For (A, ≤A ), (B, ≤B ) ∈ A, we say
that A is an initial segment of B if the following two conditions are met:
● A ⊆ B and a1 ≤A a2 implies that a1 ≤B a2 ;
● if b ∈ B ∖ A, then a ≤B b for all a ∈ A.
Let us partially order A by setting (A, ≤A ) ≤ (B, ≤B ) if A is an initial
segment of B. Let C = {Cλ }λ∈Λ be a chain in A.
Then (exercise): ∪λ∈Λ Cλ is an upper bound for C.
By Zorn’s Lemma, A admits a maximal element, say (M, ≤M ). We
claim that M = X. Suppose otherwise. Then we can choose x0 ∈ X ∖ M
and set M0 = M ∪ {x0 }. Define a partial order on M0 via: x ≤M0 y if
either (a) x, y ∈ M and x ≤M y, or (b) x ∈ M0 is arbitrary and y = x0 . Then
(M0 , ≤M0 ) is a well-ordered set and (M, ≤M ) < (M0 , ≤M0 ), a contradiction
of the maximality of (M, ≤M ). Thus M = X and ≤M is a well-ordering of
X.
(iii) implies (i):
Suppose that {Xλ }λ∈Λ is a non-empty collection of non-empty sets. Let
X = ∪λ∈Λ Xλ . By hypothesis, X admits a well-ordering ≤X . Since each
∅ ≠ Xλ ⊆ X, it has a minimum element relative to the ordering on X.
Define a choice function f by setting f (λ) to be this minimum element of
Xλ for each λ ∈ Λ.
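Note that this last implication is entirely explicit once the well-ordering is in hand: f simply picks the ≤X -minimum of each Xλ . A small Python sketch follows (the finite family below is invented for illustration; on a finite union, the usual order on strings is already a well-ordering).

```python
def choice_function(family):
    """Given a family {X_lam} of non-empty sets whose union carries a
    well-ordering (here: the usual order on strings), define
    f(lam) = min of X_lam.  This is (iii) implies (i) in miniature."""
    return {lam: min(Xlam) for lam, Xlam in family.items()}

# Hypothetical finite family of non-empty sets.
family = {"vowels": {"o", "a", "i"}, "consonants": {"t", "z", "b"}}
f = choice_function(family)

assert f == {"vowels": "a", "consonants": "b"}
assert all(f[lam] in family[lam] for lam in family)
```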

Bibliography

[Her69] I.N. Herstein. Topics in Ring Theory. Chicago Lecture Notes in Mathematics. University
of Chicago Press, Chicago and London, 1969.
[Her75] I.N. Herstein. Topics in Algebra, 2nd Edition. John Wiley and Sons, New York, Chichester,
Brisbane, Toronto and Singapore, 1975.
[Ste04] I. Stewart. Galois Theory. CRC Mathematics Texts. Chapman and Hall, Boca Raton,
London, New York, Washington, D.C., 2004.
[vL82] C.L.F. von Lindemann. Über die Zahl π. Mathematische Annalen, 20:213–225, 1882.

Index

K-radical, 90
F-linear, 184
p-adic integers, 143
inverse
    multiplicative, 3

Abel, Niels, 51
Abel, Niels Henrik, 10
abelian group, 6
addition, 13
additive map, 92
admissible operations, 210
algebra, 18
algebraic element, 197
algebraically closed, 117, 153, 166
alphabet, 18
anti-symmetry, 112
Artinian ring, 139
ascending chain condition, 131
associate, 125
associative, 6
Auden, W.H., 209
automorphism, 59, 85
Axiom of Choice, 21, 111, 176, 229, 238

Banach-Tarski Paradox, 236
Banks, N., ii, 143
basis, 176
Berle, Milton, 95
Boolean ring, 35

cancellation law, 24, 37, 126, 128, 134
cancellation law for groups, 12
canonical embedding, 190
canonical homomorphism, 82
Cantor, Georg, 227
Capote, Truman, iii
cardinality of the continuum, 142
centre, 27
chain, 175, 234
character, 82
character of a ring, 42
closed (under an operation), 2
Cohen, Paul, 233
compact set in C, 170
constant polynomial, 20
constructible point, 211
constructible real number, 211
content, 158
continuous Heisenberg group, 9
Continuum Hypothesis, 227
corps, 51
coset, 69, 185
countable, 207
Criterion
    Eisenstein, 219
cuerpo, 51
cyclotomic polynomial, 166

Dedekind domains, 118
Dedekind, R., 31
Dedekind, Richard, 51
degree
    of an extension, 199
degree of a polynomial, 20
denumerable set, 207
derivative, 49, 204
descending chain condition, 139
dimension of a vector space, 178
direct products, 21
direct sum
    internal, 182
direct sums, 21
discrete Heisenberg group, 9
discrete valuation ring, 143
distance preserving, 3
division algorithm, 118, 122, 141, 156, 199
division ring, 32, 53


divisors of zero, 38
domain
    Euclidean, 118, 144
    unique factorisation, 131
doubling the cube, 218

Eisenstein’s Criterion, 163, 166, 219
embedding, 187
endomorphism, 30, 59
    of groups, 92
    of rings, 92
endomorphism ring, 35
epimorphism, 59
equivalence
    class, 112
equivalence class, 92
    representative, 112
equivalence classes, 101
equivalence of (AC), (ZL) and (WO), 238
equivalence relation, 92, 111
equivalence relations, 101
Euclid, 209
Euclid’s postulates, 210
Euclidean algorithm, 123, 147
Euclidean distance, 210
Euclidean domain, 118, 144
Euclidean norm, 118
Eudoxos of Cinidus, 209
evaluation at a point, 75
evaluation map, 193, 198
extension
    degree of, 199
extension field, 187

Fermat’s Last Theorem, 143
Fermat’s Theorem
    on sums of squares, 146
field, 51
    algebraic, 197
    splitting, 192
    transcendental, 197
field of quotients, 101
finite-dimensional vector space, 178
First Isomorphism Theorem, 77, 140, 193
    for vector spaces, 185
formal power series, 49
Fréchet, René, 235
Fraenkel, A., 31
Fraenkel, Abraham, 228
Fundamental Theorem of algebra, 166, 170

Gödel, Kurt, 233
Galois, Évariste, 51
Gaussian integers, 108, 127, 136, 146
    ring of, 40
general linear group, 8
Giraudoux, Jean, iii
grandmaman’s watch, 42, 68
greatest common divisor, 123
group, 6
    discrete Heisenberg, 9
    homomorphism, 9
    permutation, 8
    symmetric, 8
group homomorphism, 55
group ring, 16

Hamilton, W.R., 32
Heisenberg group, 9
Hilbert, D., 31
Hilbert, David, 227
Hitchcock, Alfred, iii
homomorphism, 55
    of groups, 55, 92
    of rings, 56
    of sets, 55
    of vector spaces, 55

ideal, 50, 63
    generated by a set, 66
    left, 35, 63
    maximal, 84, 91, 96
    prime, 96
    principal, 66
    proper, 96
    right, 63
    singly-generated, 66
Ideal Test, 63, 74, 90, 99, 132
idempotent, 48
identity element, 6
indeterminate, 18
initial segment, 237, 240
inner, 85
integers, 2
integral domain, 39, 45
internal direct sum, 182
inverse
    additive, 3
inverse element, 6
inverse system of rings, 143
invertible element of a ring, 25
irreducible, 125
isometry, 3
isomorphism, 59

    of vector spaces, 57, 184
Isomorphism Theorems, 56, 63, 76

Körper, 51
kernel, 57
    of a linear map, 56
Kronecker’s Theorem, 189–192

least common multiple, 139
Lebesgue, Henri, 235
left ideal, 35
left regular representation, 93
length of a word, 18
Levenson, Sam, 1
Lindemann
    Carl Louis Ferdinand von, 219
linear independence, 175
linear maps between vector spaces, 55
Liouville’s Theorem, 170

Macaulay, Rose, iii
Mann, Thomas, 173
maximal element (of a poset), 111
maximal ideal, 84, 91, 96
maximum element (of a poset), 111
Merkel, Angela, 153
Milligan, Spike, 13
minimal polynomial, 198
Mod-p Test, 163
monic polynomial, 20
monomorphism, 59
Moore, E. Hastings, 51
multiplication, 13
multiplicative identity, 14
multiplicative norm, 126, 141
multiplicity of a root (or zero), 149

natural numbers, 2
neutral element, 3, 6
nilpotent, 49
Noether, E., 31
Noetherian ring, 131
norm
    Euclidean, 118
    multiplicative, 126

opposite ring, 35

Parker, Dorothy, iii, 37
partial order, 110
partially ordered set, 110, 175
permutation, 11
permutation group, 8
permutation matrix, 17
Philips, Emo, 149
plane method, 210
Plato, 209
polynomial
    constant, 20
    monic, 20
poset, 110
potato chips
    bacon and hickory-flavoured, 167
power set, 29
prime element, 125
prime ideal, 96
primitive, 158
principal ideal, 66
principal ideal domain, 67, 97, 122
proper ideal, 96
proper subset, 96
property L, 238
pyjama party
    at Angela Merkel’s house, 153

quadratic equation, 10
quaternions
    rational, 28
    real, 28
quotient ring, 57

real number
    constructible, 211
reducible, 125
relation
    equivalence, 111
relation on a set, 110
relations
    equivalence, 101
relatively prime, 168
representative, 69, 112
ring, 13
    unital, 14
ring generated by a set, 23
ring homomorphism, 56
ring of Gaussian integers, 40
ring of polynomials, 18, 19
root (or zero) of a polynomial, 149
Russell’s Paradox, 228
Russell’s socks, 230
Russell, Bertrand, 21, 227

Second Isomorphism Theorem, 77, 79
    for vector spaces, 185
set homomorphism, 55

simple, 84, 97
simple ring, 131
skew field, 32, 53
Skolem, Thoralf, 228
socks, Russell’s, 230
special linear group, 8
spectrum, 153
splitting
    of a polynomial, 191
splitting field, 191, 192
squaring the circle, 218
strict upper bound, 238
subfield, 187
subgroup, 6, 109
subring, 26
Subring Test, 23, 26, 46, 57
subset
    proper, 96
subspace, 174
Subspace Test, 174
support of a function, 180
symmetric group, 8
symmetry, 4
symmetry (of a relation), 112

Tarski, Alfred, 235
Theorem
    Fermat’s Last, 143
    Kronecker’s, 190, 192
    Second Isomorphism, 79
Third Isomorphism Theorem, 79
    for vector spaces, 186
transcendental element, 197
trisecting the angle, 218

unique factorisation domain, 131
unit, 125
unitary group, 8
upper bound, 175

valuation, 118, 144
vector, 173
vector space, 173
    basis, 176
    dimension, 178
    finite-dimensional, 178
von Lindemann
    Carl Louis Ferdinand, 219
von Neumann, John, 228

Wantzel
    Pierre Laurent, 209, 218
well-defined, 71
well-ordered, 122
Well-Ordering Principle, 234, 238
West, Mae, 187
word, 18
    length, 18

Yarborough, Cale, 117
Youngman, Henny, 55

Zermelo, Ernst, 228
zero (or root) of a polynomial, 149
Zorn’s Lemma, 109, 111, 176, 234, 238
