PMath 334
Laurent W. Marcoux
A number of typos from the first edition have now been corrected. Presumably,
many others remain, and there may even be new ones! Please read the preface
below, and bring any remaining typos/errors to my attention.
In particular, I would like to thank the following people for already having
brought some typos and/or errors to my attention: T. Dieudonné, W. Dong, M.
Hürlimann, Z. Kun, M. Nguyen, J. Nordmann, A. Okell, N. Saloojee, S. Steel, J.
Tzoganakis, Z. Wang, and X. Yin.
Preface

The following is a set of class notes for the PMath 334 course I am currently
teaching at the University of Waterloo in January, 2021. They are a work in progress,
and – this being the “first edition” – they are replete with typos. A student should
approach these notes with the same caution with which he or she would approach
buzz saws; they can be very useful, but you should be alert and thinking the whole
time you have them in your hands. Enjoy.
Just one short comment about the Exercises at the end of each chapter. These
are of varying degrees of difficulty. Some are much easier than Assignment Questions,
and some are of a comparable level of difficulty. Those few Exercises that are marked by
three asterisks are definitely worth doing, as they are crucial to understanding the
underlying concepts. The marked exercises are also of varying levels of difficulty, but
it is better for the reader to discover some things on his/her own, since the reader
will then understand and retain those things better. Besides, the only way to learn
mathematics is to do mathematics.
In our humble opinion, an excellent approach to reading these notes is as follows.
● One first gathers the examples of rings from the second and third chapters.
One then reads the statements of the theorems/propositions/corollaries,
etc., and interprets those results for each of those examples. The purpose
of the theory is to understand and unify the examples.
● To learn the proofs, we recommend that one read the statement of a given
theorem or proposition, and tries to prove the result oneself. If one gets
stuck at a certain point in the proof, one reads the proof until one gets past
that point, and then one resumes the process of proving the result oneself.
Also, one should keep in mind that if one doesn’t know where to start, one can
always start with the definition, which means that one always knows where to start.
Just saying.
I strongly recommend that the reader consult other textbooks as well as these
notes. As ridiculous as this may sound, there are other people who can write as well
as if not better than your humble author, and it is important that the reader find the
source which best suits the reader. Moreover, by consulting multiple sources, the
reader will discover results not covered in any single reference. I shall only mention
two, namely the book of I.N. Herstein [Her69], and a wonderful little book on Galois
(and Field) Theory by I. Stewart [Ste04].
I would like to thank the following people for bringing some typos to my atten-
tion: T. Cooper, J. Demetrioff, R. Gambhir, A. Murphy (more than once), K. Na,
and B. Virsik. Any remaining typos and mistakes are the fault of my colleagues...
well.... ok, maybe they’re partly my fault too.
Finally, I would like to thank Mr. Nic Banks for providing me with a short list of
examples of Euclidean domains beyond the much shorter and apparently universally
standard list which appears in so many elementary ring theory textbooks I consulted.
It was a book to kill time for those who like it better dead.
Rose Macaulay
Contents

Preface
Chapter 1. A brief overview
    1. Groups
    Supplementary Examples
    Appendix
    Exercises for Chapter 1
Chapter 2. An introduction to Rings
    1. Definitions and basic properties
    2. Polynomial Rings – a first look
    3. New rings from old
    4. Basic results
    Supplementary Examples
    Appendix
    Exercises for Chapter 2
Chapter 3. Integral Domains and Fields
    1. Integral domains – definitions and basic properties
    2. The characteristic of a ring
    3. Fields – an introduction
    Supplementary Examples
    Appendix
    Exercises for Chapter 3
Chapter 4. Homomorphisms, ideals and quotient rings
    1. Homomorphisms and ideals
    2. Ideals
    3. Cosets and quotient rings
    4. The Isomorphism Theorems
    Supplementary Examples
    Appendix
    Exercises for Chapter 4
Chapter 5. Prime ideals, maximal ideals, and fields of quotients
    1. Prime and maximal ideals
    2. From integral domains to fields
Bibliography
Index
CHAPTER 1
A brief overview
1. Groups
1.1. Pure Mathematics is, we would argue, the study of mathematical objects
and structures, and of the relationships between these. We seek to understand and
classify these structures. This is a very vague statement, of course. What kinds of
objects and structures are we dealing with, and what do we mean by “relationships”?
The objects and structures are many and varied. Vector spaces, continuous func-
tions, integers, real and complex numbers, groups, rings, fields, algebras, topological
spaces and manifolds are just a very tiny sample of the cornucopia of objects which
come under the purview of modern mathematics. In dealing with various members
of a single category – and here we are not using category in the strict mathematical
sense, for categories are also a structure unto themselves – we often use functions
or morphisms between them to see how they are related. For example, an injective
function f from a non-empty set A to a non-empty set B suggests that B must have
at least as many elements as A. (In fact, we take this as the definition of “at least
as many elements” when A and B are infinite!)
1.2. The two principal structures we shall examine in this course are rings and
fields. The typical approach in most books begins with the definition of a ring, for
example, and then produces a (hopefully long) list of examples of rings to demon-
strate their importance. This process is, however, essentially the inverse of how
such a concept develops in the first place. In practice, one normally starts with a
long list of examples of objects, each of which has its own intrinsic value. Through
inspiration and hard work, one may discover certain commonalities between those
objects, and one may observe that any other object which shares that commonality
will – by necessity – behave in a certain way as suggested by the initial objects of
interest. The advantage of doing this is that if one can prove that certain behaviour
is a result of the commonality, as opposed to being specific to one of the examples,
then one may deduce that such behaviour will apply to all objects in this collection
and beyond.
But enough of generalities. Let us illustrate our point through an example. Since
both rings and fields exhibit a group structure under addition, let us momentarily
digress to examine these.
Examples (a), (c), (d), (e) and (f) have (at least the following) two things in
common:
(i) in each case, there is a neutral element; that is, an element z in the set
which satisfies a + z = a = z + a for all a in the original set.
For example, in the case of Z, we simply set z = 0, whereas for Mn (C),
we need to set z = [0i,j ], the zero matrix. In the case of C([0, 1], R), we
would set z to be the zero function, that is, the function z(x) = 0, x ∈ [0, 1].
In the case of Q∗ and Q+ , the number 1 = 1/1 serves as a neutral element
under multiplication.
Note that N does not possess a neutral element. Regardless of which
z ∈ N one chooses, n + z ≠ n for any n ∈ N.
(ii) Note also that for each element a in each of the first, third and fourth
examples, there is an element b such that a + b = z = b + a, where z is the
neutral element from item (i) above. This coincides with what we normally
refer to as the negative or the additive inverse of a. For example, in the
case of f ∈ C([0, 1], R), we set −f to be the function (−f )(x) = −(f (x)),
x ∈ [0, 1]. Again – that −f is continuous when f is continuous is proven in
a first-year Calculus course.
Given r = p/q ∈ Q∗ (in using this notation we are assuming that p, q ∈ Z),
we note that p ≠ 0 ≠ q, and so s ∶= q/p ∈ Q∗ , and sr = 1 = rs. That is, the
product of r and s yields the neutral element from above. Of course, we
normally refer to s as the (multiplicative) inverse of r.
Once again, N fails to have this property.
1.4. Example. The next example appears, at least on the surface, rather dif-
ferent from those above. Let ∆ denote an equilateral triangle, with vertices labelled
1, 2 and 3.
figure 1. ∆
Let us think of ∆ as sitting in the x, y-plane inside the 3-dimensional
real vector space R3 . We equip R3 with its usual Euclidean distance (or metric) as
follows: if x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ) ∈ R3 , the distance between x and y is
defined as
d(x, y) = √(∣x1 − y1 ∣² + ∣x2 − y2 ∣² + ∣x3 − y3 ∣²).
We first consider isometries of the triangle. By an isometry, we mean a map
φ ∶ ∆ → R3 such that d(x, y) = d(φ(x), φ(y)) for all x, y ∈ ∆. That is, φ preserves distance.
figure 2. the action of ϱ2
We might also fix one of the vertices, for example the top vertex (labelled 1 in
Figure 1), and reflect ∆ along the vertical line from the top vertex to the midpoint
of the horizontal line segment connecting the leftmost vertex (labelled 3 in Figure
1) to the rightmost vertex (labelled 2 in Figure 1). Let us call this symmetry φ1 ,
denote by φ2 the symmetry similarly obtained by fixing vertex 2, and by φ3 that
obtained by fixing vertex 3.
figure 3. the action of φ1
We have so far discovered six distinct symmetries of ∆. Are there others?
A moment’s thought will hopefully convince the reader that any symmetry must
take distinct vertices of ∆ to distinct vertices of ∆. There are only six ways of doing
this: once we have chosen where to send vertex 1, there remain two choices for where
to send vertex 2, and then vertex 3 must be sent to the remaining vertex. The total
number of choices is therefore 6 = 3⋅2⋅1. Since we already have produced six distinct
symmetries above, we indeed have them all. Let us denote by symm(∆) the set
symm(∆) = {ϱ0 , ϱ1 , ϱ2 , φ1 , φ2 , φ3 }
of all symmetries of the equilateral triangle ∆.
Thus
φ1 ○ ϱ2 (1) = φ1 (ϱ2 (1)) = φ1 (3) = 2,
and similarly φ1 ○ ϱ2 (2) = 1, φ1 ○ ϱ2 (3) = 3.
figure 4. The action of φ1 ○ ϱ2 .
But this is precisely what happens if we simply apply φ3 to ∆! In other words,
φ1 ○ ϱ2 = φ3 .
In analogy to the three examples from Example 1.3, a quick calculation shows
that ϱ0 serves as a neutral element for symm(∆). Also, given any symmetry α in
symm(∆), there exists a second symmetry β such that β ○ α = ϱ0 = α ○ β.
For example, if α = ϱ1 , then we may set β = ϱ2 , while if α = φ1 , then we choose
β = φ1 as well.
So, in terms of its behaviour using composition as the binary operation, symm(∆)
behaves more like (Z, +) than (N, +) does!
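The bookkeeping above is easy to mechanize. The following sketch is not from the notes (the dictionary encoding is our own); it records each symmetry of ∆ by the permutation it induces on the vertex labels, and checks the composition φ1 ○ ϱ2 = φ3 computed above.

```python
# Model each symmetry of the triangle by the permutation it induces
# on the vertex labels {1, 2, 3}, stored as a dict.
rho0 = {1: 1, 2: 2, 3: 3}   # the identity symmetry
rho1 = {1: 2, 2: 3, 3: 1}   # rotation sending 1 -> 2 -> 3 -> 1
rho2 = {1: 3, 2: 1, 3: 2}   # rotation sending 1 -> 3 -> 2 -> 1
phi1 = {1: 1, 2: 3, 3: 2}   # reflection fixing vertex 1
phi2 = {1: 3, 2: 2, 3: 1}   # reflection fixing vertex 2
phi3 = {1: 2, 2: 1, 3: 3}   # reflection fixing vertex 3

def compose(f, g):
    """Return f ∘ g: apply g first, then f."""
    return {v: f[g[v]] for v in g}

# phi1 ∘ rho2 sends 1 -> 2, 2 -> 1, 3 -> 3, which is exactly phi3:
assert compose(phi1, rho2) == phi3
# rho1 and rho2 invert each other, and each reflection is its own inverse:
assert compose(rho1, rho2) == rho0
assert compose(phi1, phi1) == rho0
```

The same dictionaries also confirm non-commutativity: composing in the other order, ϱ2 ○ φ1 gives φ2, not φ3.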
1.5. Each of the sets we have defined in Example 1.3 and Example 1.4 is inter-
esting in its own right. But now we have discovered that, except for the example of
the natural numbers, they all share a few properties in common. It is this kind of
analysis that leads mathematicians to make definitions such as the following.
1.7. Remark. Since a group consists of the non-empty set G along with the
binary operation ⋅, we normally write (G, ⋅) to denote a group. Informally, when
the binary operation is understood and there is no risk of confusion, we also refer
to a group G, and we even drop the ⋅ from the notation, writing gh to denote g ⋅ h.
We shall leave it as an exercise for the reader to show that if G is a group, then
it admits exactly one identity element e, and that given g ∈ G, the element h ∈ G
satisfying g ⋅ h = e = h ⋅ g is also uniquely defined. For this reason, we speak of the
identity element of a group G, as well as the inverse of g ∈ G. We also adopt the
notation g −1 to denote the unique inverse of g.
Supplementary Examples
S1.1. Example. Let n ∈ N. Let GLn (C) ∶= {T ∈ Mn (C) ∶ T is invertible}, and
let ⋅ represent usual matrix multiplication.
Then (GLn (C), ⋅) is a group, and the identity matrix In = diag(1, 1, . . . , 1) ∈ Mn (C)
is the identity of this group. We refer to GLn (C) as the general linear group in Mn (C).
Also,
Un (C) ∶= {U ∈ Mn (C) ∶ U is unitary}
is a group, called the unitary group of Mn (C).
S1.2. Example. Let n ∈ N. Then SLn (C) ∶= {T ∈ Mn (C) ∶ det T = 1} is a
group, again using usual matrix multiplication as the group operation.
This group is referred to as the special linear group of Mn (C).
S1.3. Example. More generally, let Ω ⊆ T ∶= {z ∈ C ∶ ∣z∣ = 1} be a multiplicative
group. Let
Dn (Ω) ∶= {T ∈ Mn (C) ∶ det T ∈ Ω}.
Then Dn (Ω) is a group, using usual matrix multiplication as the group operation.
S1.4. Example. The set of rational numbers Q is an abelian group under
addition, while the set Q∗ of non-zero rational numbers is an abelian group under
multiplication. As we have seen, the set Q+ of strictly positive rational numbers is
also an abelian group under multiplication.
S1.5. Example. Let (V, +, ⋅) be a vector space over R. (Here, ⋅ refers to scalar
multiplication.) Then (V, +) is an abelian group.
In particular, given a natural number n, (Rn , +) is an additive, abelian group.
This explains why the examples given in Example 1.3 are (abelian) groups.
S1.6. Example. Amongst the most important classes of groups are the so-
called permutation groups.
Let ∅ ≠ X be a non-empty set. A permutation of X is a bijective function
f ∶ X → X. Typically, if X = {1, 2, 3, . . . , n} for some natural number n, then
we use symbols such as σ or ϱ to denote permutations. We refer to the set of all
permutations of X as the symmetric group of X and (here we shall) denote it by
Σ(X). The group operation is composition of functions.
Note that the identity map ι ∶ X → X given by ι(x) = x, x ∈ X is a bijection.
Given two bijections f, g of X, their composition f ○ g is again a bijection, and
f ○ ι = f = ι ○ f . Thus ι serves as the identity of Σ(X) under composition. Also,
since f is a bijection, given y ∈ X, we may write y = f (x) for a unique choice of
x ∈ X. Thus we may define the bijection f −1 ∶ X → X via f −1 (y) = x. Clearly
f −1 ○ f = ι = f ○ f −1 , so that f −1 is the inverse for f under composition.
In general, Σ(X) is non-abelian. (For which sets X is it abelian?)
The fact that we refer to the set of permutations as the symmetric group, and
that we refer to the group symm(∆) from Example 1.4 as a group of symmetries is
not merely a coincidence. As we observed in the discussion of that Example, any
symmetry of ∆ is really just a permutation of the vertices, and any permutation of
the vertices leads to a symmetry of ∆.
S1.7. Example. The set Zn = {0, 1, 2, . . . , n − 1} is an abelian group under
addition modulo n. It is an interesting exercise to prove that if p ∈ N is a prime
number, then every group G with p elements behaves exactly like Zp under addition.
Technically speaking, (Zp , +) is the unique group of order p (i.e. with p elements)
up to group isomorphism.
To be explicit, if p ∈ N is a prime number and G = {g1 , g2 , . . . , gp } is a group
using the binary operation ⋅, then there exists a bijective map ϱ ∶ Zp → G such
that ϱ(a + b) = ϱ(a) ⋅ ϱ(b) for all a, b ∈ Zp . Such a map is referred to as a group
isomorphism. The proof of this for p = 2, 3 or p = 5 is certainly within reach.
We shall have more to say about so-called homomorphisms of groups in Chap-
ter 4.
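For small n, the group axioms for (Zn , +) can be verified by sheer brute force. The sketch below is illustrative only (the helper name is our own, not from the notes): it checks closure, associativity, a unique identity, inverses, and commutativity over all elements.

```python
# Brute-force check of the abelian group axioms for a finite set
# with a given binary operation.
def is_abelian_group(elements, op):
    E = list(elements)
    # closure
    if any(op(a, b) not in E for a in E for b in E):
        return False
    # associativity
    if any(op(op(a, b), c) != op(a, op(b, c)) for a in E for b in E for c in E):
        return False
    # a unique two-sided identity
    ids = [e for e in E if all(op(e, a) == a == op(a, e) for a in E)]
    if len(ids) != 1:
        return False
    e = ids[0]
    # every element has an inverse
    if any(not any(op(a, b) == e for b in E) for a in E):
        return False
    # commutativity
    return all(op(a, b) == op(b, a) for a in E for b in E)

# Z_n under addition mod n passes for every n we try:
for n in (2, 3, 5, 7):
    assert is_abelian_group(range(n), lambda a, b: (a + b) % n)
```

The same helper also confirms, for instance, that {1, 2, 3, 4} under multiplication mod 5 is an abelian group, while {0, 1} under ordinary multiplication is not (0 has no inverse).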
S1.8. Example. If (G, ⋅) and (H, ∗) are groups, then so is G × H ∶= {(g, h) ∶
g ∈ G, h ∈ H} under the operation ●, where
(g1 , h1 ) ● (g2 , h2 ) ∶= (g1 ⋅ g2 , h1 ∗ h2 ).
We leave the verification of this to the reader.
S1.9. Example. The discrete Heisenberg group H3 (Z) is the set of 3 × 3
matrices of the form
⎡1 a b⎤
⎢0 1 c⎥ ,
⎣0 0 1⎦
where a, b, c ∈ Z, and where
⎡1 a b⎤ ⎡1 x y⎤   ⎡1  a + x  y + az + b⎤
⎢0 1 c⎥ ⎢0 1 z⎥ = ⎢0    1       c + z  ⎥ .
⎣0 0 1⎦ ⎣0 0 1⎦   ⎣0    0         1    ⎦
If one replaces Z by R, one obtains the continuous Heisenberg group.
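The displayed multiplication rule can be verified directly. In the sketch below (the helper names are our own, not from the notes), an element of H3 (Z) is recorded by its upper-triangular entries (a, b, c), and the closed-form product is compared against ordinary 3 × 3 matrix multiplication.

```python
# An element of the discrete Heisenberg group is determined by its
# entries (a, b, c) above the diagonal.
def to_matrix(a, b, c):
    return [[1, a, b], [0, 1, c], [0, 0, 1]]

def matmul(X, Y):
    # plain 3x3 matrix multiplication
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def heisenberg_mult(t1, t2):
    a, b, c = t1
    x, y, z = t2
    # the closed form from the displayed product:
    # (1,2)-entry a + x, (1,3)-entry y + a*z + b, (2,3)-entry c + z
    return (a + x, y + a * z + b, c + z)

# the closed form agrees with honest matrix multiplication:
for t1 in [(1, 2, 3), (-1, 0, 4)]:
    for t2 in [(2, -5, 1), (0, 7, -3)]:
        assert to_matrix(*heisenberg_mult(t1, t2)) == matmul(to_matrix(*t1),
                                                             to_matrix(*t2))
```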
S1.10. Example. The set T ∶= {z ∈ C ∶ ∣z∣ = 1} is a group, using usual complex
multiplication as the operation.
Appendix
A1.1. Niels Henrik Abel (5 August 1802 - 6 April 1829) is known for a wide
variety of results that bear his name. Perhaps his most famous result is the proof
that the general polynomial equation of degree five over the real numbers cannot
be solved in radicals. (The well-known quadratic formula does exactly this for
polynomials of degree two. In the case of polynomials of degree three or four,
solutions in radicals also exist, but they are not as simple, and are consequently
less well known.)
Abel lived in poverty and died of tuberculosis. The Abel Prize in mathematics
is named in his honour. (There is no Nobel Prize in mathematics.) Given how
prestigious and merited the Nobel Peace Prize always is, a prize in the world of
mathematics is... (we leave it to the reader to complete this sentence).
Exercises for Chapter 1
Exercise 1.1.
Let (G, ⋅) be a group. Prove that if e, f ∈ G and gf = ge = g = eg = f g for all
g ∈ G, then e = f . That is, prove that the identity element of a group is unique.
Exercise 1.2.
Let (G, ⋅) be a group. Prove that if g, h1 , and h2 ∈ G and if gh2 = gh1 = e = h1 g =
h2 g, then h1 = h2 . This shows that the inverse of g is unique, allowing us to refer to
the inverse of g, which we then denote by g −1 .
Exercise 1.3.
Let (G, ⋅) be a group. Suppose that g 2 ∶= g ⋅ g = e for all g ∈ G. Prove that G is
abelian.
Exercise 1.4.
In Example 1.4, we determined that (symm(∆), ○) is a non-abelian group with
six elements, and we determined all of its elements by observing what each symmetry
does to the vertices of the triangle ∆.
Exercise 1.5.
Let Γ denote an isosceles triangle which is not equilateral. Determine
(symm(Γ), ○).
Exercise 1.6.
A permutation of a non-empty set X is a bijective function f ∶ X → X. Denote
by Σ(X) the set of all permutations of X. As we saw in Example S1.6, using
composition as the operation: f ○ g(x) ∶= f (g(x)) for all x ∈ X, we have that
(Σ(X), ○) is a group.
In the case where X = {1, 2, . . . , n}, we may write a permutation by specifying
its value at each of the elements of X as a 2 × n matrix: for example,
σ = [  1    2    3   ⋯  n−1   n  ]
    [ a1   a2   a3   ⋯  an−1  an ]
represents the permutation σ ∈ Σ(X) that satisfies σ(j) = aj , 1 ≤ j ≤ n.
(a) What is the relationship (if any) between Σ({1, 2, 3}) and symm(∆)?
(b) What is the relationship (if any) between Σ({1, 2, 3, 4}) and symm(◻)?
Exercise 1.7.
Let Q∗ ∶= {q ∈ Q ∶ q ≠ 0}. Let ⋅ represent the usual product of rational numbers.
Prove that (Q∗ , ⋅) is an abelian group.
Exercise 1.8.
For those of you with a background in linear algebra: let V be a vector space
over C and let
L(V) ∶= {T ∶ V → V ∶ T is linear}.
Set G(V) ∶= {T ∈ L(V) ∶ T is invertible}. Using composition of linear maps ○ as the
binary operation, prove that (G(V), ○) is a group.
Exercise 1.9.
Let (G, ⋅) be a group and g, h and k ∈ G. Suppose that
g ⋅ h = g ⋅ k.
Prove that h = k.
Exercise 1.10.
Let n ∈ N and let ω ∶= e2πi/n ∈ C. Let Ωn ∶= {1, ω, ω 2 , . . . , ω n−1 }. Prove that Ωn
is an abelian group under complex multiplication, and that if n is prime, then the
only subgroups of Ωn are {1} and Ωn itself.
CHAPTER 2
An introduction to Rings
1.2. Definition. A ring R is a non-empty set equipped with two binary oper-
ations, known as addition (denoted by +), and multiplication (denoted by ⋅, or
simply by juxtaposition of two elements). The operations must satisfy (A1) - (A5)
and (M1)-(M4) below:
(A1) a + b ∈ R for all a, b ∈ R.
(A2) (a + b) + c = a + (b + c) for all a, b, c ∈ R.
(A3) There exists a neutral element 0 ∈ R such that
a + 0 = a = 0 + a for all a ∈ R.
(A4) For every a ∈ R, there exists an element −a ∈ R such that
a + (−a) = 0 = (−a) + a.
(A5) a + b = b + a for all a, b ∈ R.
(M1) a ⋅ b ∈ R for all a, b ∈ R.
(M2) (a ⋅ b) ⋅ c = a ⋅ (b ⋅ c) for all a, b, c ∈ R.
(M3) a ⋅ (b + c) = a ⋅ b + a ⋅ c for all a, b, c ∈ R.
(M4) (a + b) ⋅ c = a ⋅ c + b ⋅ c for all a, b, c ∈ R.
1.3. Remark. Condition (M1) above states that R is closed under multiplica-
tion, (M2) says that multiplication is associative on R, while conditions (M3) and
(M4) state that multiplication is left and right distributive with respect to addition.
We point out that some authors require that all rings be unital. It is true that
given a non-unital ring, there is a way of attaching a multiplicative identity to it
(this will eventually appear as an Assignment question), however we shall refrain
from doing so because the abstract construction is not always natural in terms of
the (non-unital) rings we shall consider.
We point out that the requirement that 1 ≠ 0 serves to ensure that R = {0} is
not considered a unital ring.
1.4. Example. Given an abelian group (G, +), we can always turn it into a
ring by defining g1 ⋅ g2 = 0 for all g1 , g2 ∈ G.
1.7. Examples. The following are also rings, and the verification of this is left
to the reader.
(a) (C, +, ⋅). The complex numbers.
(b) (R, +, ⋅). The real numbers.
(c) (Q, +, ⋅). The rational numbers.
where rj ∈ R and gj ∈ G, 1 ≤ j ≤ m.
The sum of two elements of R⟨G⟩ is performed as follows: let a = ∑pj=1 rj gj and
b = ∑qi=1 si hi be two elements of R⟨G⟩. Set E ∶= {g1 , g2 , . . . , gp } ∪ {h1 , h2 , . . . , hq }.
After a change of notation, we may rewrite E = {y1 , y2 , . . . , yk } where all of the yj ’s
are distinct (and k ≤ p + q). We may then write a = ∑kj=1 aj yj and b = ∑kj=1 bj yj ,
noting that it is entirely possible that aj = 0 or that bj = 0 for some of the j’s.
The purpose of this was to write both a and b as a combination of the same set
of group elements.
Then
a + b = (∑kj=1 aj yj ) + (∑kj=1 bj yj ) ∶= ∑kj=1 (aj + bj )yj .
Let’s have a look at this in action through two concrete examples of group rings.
(a) Let G = {e, ω, ω 2 }, and equip G with the multiplication determined by the
following table:
 ⋅    e    ω    ω2
 e    e    ω    ω2
 ω    ω    ω2   e
 ω2   ω2   e    ω
and
a ⋅ b = (r1 e + r2 ω + r3 ω 2 )(s1 e + s2 ω + s3 ω 2 )
= r1 s1 e2 + r1 s2 e ⋅ ω + r1 s3 e ⋅ ω 2 + r2 s1 ω ⋅ e + r2 s2 ω 2 + r2 s3 ω 3
+ r3 s1 ω 2 ⋅ e + r3 s2 ω 3 + r3 s3 ω 4
= (r1 s1 + r2 s3 + r3 s2 ) e + (r1 s2 + r2 s1 + r3 s3 ) ω + (r1 s3 + r2 s2 + r3 s1 ) ω 2 .
(b) As a second example, let n ∈ N. A matrix P ∈ Mn (C) is said to be a
permutation matrix if every row and every column of P contains exactly
one “1”, and zeroes everywhere else. The reason for this nomenclature is
that if we view elements of Mn (C) as linear transformations on Cn written
with respect to the standard orthonormal basis B ∶= {e1 , e2 , . . . , en }, then a
permutation matrix corresponds to a linear transformation which permutes
the basis B; i.e. P ej = eσ(j) , where σ ∶ {1, 2, . . . , n} → {1, 2, . . . , n} is a
bijection.
Consider the set Pn of all n × n permutation matrices. We leave it to
the reader to prove that Pn is a group. In the case where n = 3, we find
that
P3 = {P1 , P2 , P3 , P4 , P5 , P6 }
     ⎡1 0 0⎤   ⎡0 1 0⎤   ⎡0 0 1⎤   ⎡1 0 0⎤   ⎡0 0 1⎤   ⎡0 1 0⎤
   = { ⎢0 1 0⎥ , ⎢0 0 1⎥ , ⎢1 0 0⎥ , ⎢0 0 1⎥ , ⎢0 1 0⎥ , ⎢1 0 0⎥ } .
     ⎣0 0 1⎦   ⎣1 0 0⎦   ⎣0 1 0⎦   ⎣0 1 0⎦   ⎣1 0 0⎦   ⎣0 0 1⎦
A general element of the group ring Z⟨P3 ⟩ looks like
a = ∑6j=1 aj Pj , where aj ∈ Z, 1 ≤ j ≤ 6.
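For the cyclic group G = {e, ω, ω 2 } of part (a), the group ring multiplication can be checked mechanically. In the sketch below (the coefficient-triple encoding is our own, not from the notes), an element r1 e + r2 ω + r3 ω 2 of R⟨G⟩ is stored as the triple (r1 , r2 , r3 ), and the product follows the formula computed above.

```python
# Multiply two elements of the group ring R<G> for the cyclic group
# G = {e, w, w^2}, using w^3 = e to collect terms.
def group_ring_mult(r, s):
    r1, r2, r3 = r
    s1, s2, s3 = s
    return (r1*s1 + r2*s3 + r3*s2,   # coefficient of e
            r1*s2 + r2*s1 + r3*s3,   # coefficient of w
            r1*s3 + r2*s2 + r3*s1)   # coefficient of w^2

# e acts as the multiplicative identity:
assert group_ring_mult((1, 0, 0), (4, -2, 7)) == (4, -2, 7)
# expanding w * w^2 = w^3 = e:
assert group_ring_mult((0, 1, 0), (0, 0, 1)) == (1, 0, 0)
```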
1.10. Remark. Notice that in the example above, L(V) is not only a ring, but
it is in fact a vector space over K. We say that it is an algebra over K.
2.2. Definition.
(a) Let n ∈ N and x1 , x2 , . . . , xn be a finite collection of abstract symbols (also
called indeterminates). The set {x1 , x2 , . . . , xn } is often called an alpha-
bet. A word in {x1 , x2 , . . . , xn } is a formal product (i.e. a concatenation)
W ∶= xi1 xi2 ⋯xik
where k ∈ N and ij ∈ {1, 2, . . . , n} for all 1 ≤ j ≤ k. That is, it is just a
concatenation of finitely many elements of the alphabet. The length of the
word W above is defined to be k. By convention, we define W0 to be the
empty word, with the understanding that W0 W = W = W W0 for any word
W ∈ W.
Let us denote by Wk the set of all words of length k in {x1 , x2 , . . . , xn },
and by W the set
W ∶= ∪∞k=0 Wk .
2.3. The case where n = 1 will be very close to our hearts. Given a commutative,
unital ring R, we have defined the ring of polynomials in x with coefficients in R to
be the set of formal symbols
R[x] = R⟨x⟩ = {p0 W0 + p1 x + ⋯ + pm xm ∶ m ≥ 1, pk ∈ R, 0 ≤ k ≤ m}.
In practice, we typically do one of two things:
● either we drop the W0 notation and simply write p0 + p1 x + ⋯ + pm xm , or
● we denote W0 by x0 and write p0 x0 + p1 x + ⋯ + pm xm .
In these notes, we will tend to drop the W0 altogether. Hopefully, this should
not cause any confusion, as the notation agrees with the usual notation (that we’ve
all come to know and love) for polynomials in x.
Two elements p0 + p1 x + ⋯ + pm xm (where pm ≠ 0) and q0 + q1 x + ⋯ + qn xn (where
qn ≠ 0) are considered equal if and only if n = m and pk = qk , 0 ≤ k ≤ n.
As we have just seen, R[x] is a ring using the operations (say m ≤ n)
(p0 + p1 x + ⋯ + pm xm ) + (q0 + q1 x + ⋯ + qn xn )
= (p0 + q0 ) + (p1 + q1 )x + ⋯ + (pm + qm )xm + qm+1 xm+1 + ⋯ + qn xn ,
and
(p0 + p1 x + ⋯ + pm xm )(q0 + q1 x + ⋯ + qn xn )
= p0 q0 + (p0 q1 + p1 q0 )x1 + (p0 q2 + p1 q1 + p2 q0 )x2 + ⋯ + pm qn xn+m .
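These two operations are coefficientwise addition and convolution of coefficient lists, which the following sketch (the list encoding is ours, not from the notes) makes concrete.

```python
# Represent p0 + p1 x + ... + pm x^m by its coefficient list [p0, ..., pm].
def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))   # pad the shorter polynomial with zeros
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mult(p, q):
    # the coefficient of x^k in the product is sum_{i+j=k} p_i q_j
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# (1 + x)(1 - x) = 1 - x^2
assert poly_mult([1, 1], [1, -1]) == [1, 0, -1]
assert poly_add([1, 2], [0, 0, 5]) == [1, 2, 5]
```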
3.3. Example.
(a) Direct products. Let Λ be a non-empty set, which may be finite or
infinite, countable or uncountable. Suppose that for each λ ∈ Λ, we are
given a ring Rλ . We define the direct product of the family {Rλ ∶ λ ∈ Λ}
to be the set
∏λ∈Λ Rλ ∶= {(rλ )λ ∶ rλ ∈ Rλ , λ ∈ Λ}.
and
z ⋅ w = (z1 w1 , z2 w2 , z3 w3 , . . .).
It follows that the element e = (1, 1, 1, . . .) defined above is the identity element
of ∏n∈N Z. We leave it as an exercise for the reader to prove that ⊕n∈N Z is non-unital.
3.7. Example. Let R = M2 (R). Let
Y = { T1 ∶= ⎡1 0⎤ , T2 ∶= ⎡0 1⎤ } .
            ⎣0 2⎦         ⎣0 0⎦
If S ⊆ M2 (R) is a ring and T1 ∈ S, then T1² ∈ S, and therefore
X1 ∶= 2T1 − T1² = ⎡1 0⎤ ∈ S.
                  ⎣0 0⎦
From this we see that
X2 ∶= T1 − X1 = ⎡0 0⎤ ∈ S
                ⎣0 2⎦
as well. This in turn implies that
⎡m 0⎤ = mX1 ∶= X1 + X1 + ⋯ + X1 (m times)
⎣0 0⎦
and nX2 , pT2 ∈ S for all m, n, p ∈ N. Since (additive) inverses are also in S, we see
that
W ∶= { ⎡m  p ⎤ ∶ m, n, p ∈ Z } ⊆ S.
       ⎣0  2n⎦
Thus
W ⊆ ∩{V ⊆ R ∶ Y ⊆ V and V is a ring}.
By definition, W consists of all upper-triangular 2×2 matrices whose (1, 1) and (1, 2)
entries are arbitrary integers, and whose (2, 2) entry is an arbitrary even integer.
To see that W = RY , it suffices to know that W is a ring. We could go through
the entire list of conditions from Definition 2.2 above, but it will be much easier to
apply the Subring Test below (see Proposition 4.10).
That W ≠ ∅ is obvious from its definition. If
W1 ∶= ⎡m1  p1 ⎤ and W2 ∶= ⎡m2  p2 ⎤ ∈ W,
      ⎣ 0  2n1⎦           ⎣ 0  2n2⎦
then
W1 − W2 = ⎡m1 − m2   p1 − p2  ⎤ ∈ W,
          ⎣   0     2(n1 − n2)⎦
and
W1 W2 = ⎡m1 m2  m1 p2 + p1 (2n2 )⎤ ∈ W.
        ⎣  0       2(2n1 n2 )    ⎦
By the Subring Test, W is a ring, and therefore it must be the smallest ring that
contains Y , i.e., W = RY .
4. Basic results
4.1. We now have at hand a large number of rings. It is time that we prove our
first results. We shall adopt the common notation whereby we express multiplication
through juxtaposition, that is: we write ab to mean a ⋅ b. We also introduce the
notation a − b to mean a + (−b).
4.8. Example. Let S = 2Z ∶= {. . . , −4, −2, 0, 2, 4, . . .} denote the set of all even
integers. We leave it as an exercise for the reader to show that 2Z is a subring of Z.
4.11. Remark. We point out that in the Subring Test, it is crucial that we
show that a − b ∈ S (and ab ∈ S), rather than a + b ∈ S (and ab ∈ S), for all a, b ∈ S.
Consider N = {1, 2, 3, . . .}, equipped with the usual addition and multiplication
it inherits as a subset of the ring Z. For m, n ∈ N, clearly m + n ∈ N and mn ∈ N.
But (N, +, ⋅) is not a ring because – for example – 2 ∈ N, but −2 ∈/ N. Alternatively,
(N, +, ⋅) is not a ring because it does not have a neutral element under addition - i.e.
0 ∈/ N.
4.12. Examples.
(a) Let R ∶= (C([0, 1], R), +, ⋅) and S = {f ∈ R ∶ f (x) = 0 for all x ∈ [1/2, 8/9]}.
Then S is a subring of R.
(b) For any ring R and any natural number n, Tn (R) is a subring of Mn (R).
(c) Q is a subring of R, and Z is a subring of Q and of R.
(d) Z5 is not a subring of Z10 . (Why not?)
Supplementary Examples
S2.1. Example. Let R = Z. Then
M2 (Z) = { ⎡a b⎤ ∶ a, b, c, d ∈ Z }
           ⎣c d⎦
is a ring, where
⎡a1 b1⎤ + ⎡a2 b2⎤ = ⎡a1 + a2   b1 + b2⎤ ,
⎣c1 d1⎦   ⎣c2 d2⎦   ⎣c1 + c2   d1 + d2⎦
and
⎡a1 b1⎤ ⎡a2 b2⎤ = ⎡a1 a2 + b1 c2   a1 b2 + b1 d2⎤ .
⎣c1 d1⎦ ⎣c2 d2⎦   ⎣c1 a2 + d1 c2   c1 b2 + d1 d2⎦
 ⋅   1    i    j    k
 1   1    i    j    k
 i   i   −1    k   −j
 j   j   −k   −1    i
 k   k    j   −i   −1
We define
(a1 1 + b1 i + c1 j + d1 k) + (a2 1 + b2 i + c2 j + d2 k) = (a1 + a2 ) 1 + (b1 + b2 ) i + (c1 + c2 ) j + (d1 + d2 ) k.
Multiplication behaves analogously to complex multiplication, only following the
multiplication table above. Thus, for example,
(π + 2j + 3k)(e + 2i) = πe + 2e j + 3e k + 2π i + 4 j i + 6 k i
= π e + 2π i + (2e + 6) j + (3e − 4) k.
We leave it to the reader to find the (long) formula for multiplying two arbitrary
real quaternions.
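The long formula is easy to derive mechanically. The sketch below (the tuple encoding is our own) stores a quaternion a 1 + b i + c j + d k as (a, b, c, d), multiplies using the table above, and reproduces the worked example.

```python
import math

# Multiply two real quaternions following Hamilton's table:
# i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j (and ji = -k, etc.).
def quat_mult(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,   # coefficient of 1
            a1*b2 + b1*a2 + c1*d2 - d1*c2,   # coefficient of i
            a1*c2 - b1*d2 + c1*a2 + d1*b2,   # coefficient of j
            a1*d2 + b1*c2 - c1*b2 + d1*a2)   # coefficient of k

# reproduces the worked example (pi + 2j + 3k)(e + 2i):
p = (math.pi, 0.0, 2.0, 3.0)
q = (math.e, 2.0, 0.0, 0.0)
expected = (math.pi * math.e, 2 * math.pi, 2 * math.e + 6, 3 * math.e - 4)
assert all(abs(x - y) < 1e-12 for x, y in zip(quat_mult(p, q), expected))
```

Note that swapping p and q changes the answer: quaternion multiplication is not commutative.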
Note that when we say that there is no notion of convergence here, we mean
that we could just as easily have defined an element of R[[x]] to be a sequence
(rn )n≥0 with rn ∈ R for all n ≥ 0, and (rn )n + (sn )n ∶= (rn + sn )n .
The reason for preferring the “series” notation is that it helps to “explain” how
one might come up with the multiplication
(rn )n ∗ (sn )n ∶= (tn )n , where tn ∶= ∑nj=0 rj sn−j .
In other words, the symbol xn , n ≥ 0, is really just a placeholder for the coefficient rn ∈ R.
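Truncating a formal power series to its first N coefficients makes the product (tn )n computable, since those coefficients determine the first N coefficients of any sum or product. The sketch below (our own encoding, not from the notes) illustrates this with the geometric series.

```python
# A (truncated) formal power series is just a list of coefficients
# [r0, r1, r2, ...]; the product coefficient is t_n = sum_{j<=n} r_j s_{n-j}.
def series_mult(r, s):
    N = min(len(r), len(s))
    return [sum(r[j] * s[n - j] for j in range(n + 1)) for n in range(N)]

# (1 - x)(1 + x + x^2 + ...) = 1, to the precision of the truncation:
one_minus_x = [1, -1, 0, 0, 0]
geometric   = [1, 1, 1, 1, 1]
assert series_mult(one_minus_x, geometric) == [1, 0, 0, 0, 0]
```

This is exactly polynomial multiplication with no degree bound, which is why R[x] sits inside R[[x]].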
Appendix
A2.1. As mentioned in the introductions to each of the first two chapters, math-
ematics typically evolves by considering a large number of objects that one is in-
terested in, and having one or more clever individuals notice a common link which
relates those objects to one another.
In the present case, examples arose out of number theory (integers, rational
numbers, real numbers, complex numbers), linear algebra (linear transformations,
matrices), set theory (Boolean rings), analysis (continuous functions, differentiable
functions), amongst other areas.
The formulation of the notion of a ring began in the 1870s with the study
of polynomials and algebraic integers, in part by Richard Dedekind. The term
“Zahlring” (German for “number ring”) was introduced by David Hilbert in the
1890s. Adolf Fraenkel gave the first axiomatic definition of a ring in 1914, but his
definition was stricter than the current definition. The modern axiomatic definition
of a (commutative) ring was given by Emmy Noether.
defined in Example 4.1 above becomes a ring using pointwise addition, but with
pointwise multiplication replaced by convolution:
(f ∗ g)(x) ∶= ∫_{−∞}^{∞} f (t)g(x − t) dt.
A2.3. The ring H of real quaternions was first described by the Irish mathe-
matician William Rowan Hamilton. Note that if ω1 , ω2 ∈ H, then ω1 ω2 ≠ ω2 ω1
in general, but if 0 ≠ ω ∈ H, then ω is invertible. A non-commutative ring in which
every non-zero element is invertible is referred to as a division ring or a skew
field.
A2.4. In some textbooks, the degree of the zero polynomial is defined to be
“−∞”, with the understanding that −∞ + (−∞) = −∞ = −∞ + n = n + (−∞) for all
n ∈ N. In this way, we see that
deg (p(x) + q(x)) ≤ max (deg(p(x)), deg (q(x)))
for all polynomials p(x), q(x) – that is, we don’t have to separately argue that
p(x) + q(x) might be equal to zero.
We have chosen not to use this terminology, because there is a tendency for
people to think of “−∞” as a number as opposed to a symbol, and to think of the
above “sums” as being actual addition, instead of mere convention.
A2.5. The ring of polynomials revisited. Recall that if R is a commutative
ring, then we defined the ring of polynomials R[x] to be the set of formal symbols:
R[x] = R⟨x⟩ = {p0 + p1 x + ⋯ + pm xm ∶ m ≥ 1, pk ∈ R, 0 ≤ k ≤ m}.
Let us now give an equivalent, but slightly different and very useful definition of the
ring of polynomials R[x].
Exercise 2.1.
Consider the set Pn of n×n permutation matrices in Mn (C). Thinking of Mn (C)
as a vector space over C, is
span{Pn } = Mn (C)?
Exercise 2.2.
Let
H = {H1, H2, H3} := { [1 0 0; 0 1 0; 0 0 1], [0 1 0; 0 0 1; 1 0 0], [0 0 1; 1 0 0; 0 1 0] }
(3 × 3 matrices written row by row, with rows separated by semicolons).
Prove that H is a group and write out its multiplication table (as we did with
G in Example 2.1.8).
(a) Is there a relationship between H and the group G from that example?
(b) Is there a relationship between Q⟨H⟩ and Q⟨G⟩?
If relationships do exist, how would you describe them?
Exercise 2.3.
Find all subrings of (Z, +, ⋅).
Exercise 2.4.
Let R and S be subrings of a commutative ring T .
(a) Define
RS := { ∑_{i=1}^n ri si : n ≥ 1, ri ∈ R, si ∈ S, 1 ≤ i ≤ n }.
Prove that RS is a subring of T . Is it always true that R ⊆ RS? Either
prove that it is, or find a counterexample to show that it is not always true.
(b) Did we need to take all finite sums in the expression for RS above? In
other words, if
A ∶= {rs ∶ r ∈ R, s ∈ S},
is A a subring of T ? Again, you must justify your answer!
(c) Prove that R ∩ S is a subring of T . Is the commutativity of T required
here?
(d) Was commutativity of T required in part (a) above? If not, prove part (a)
without assuming that T is commutative. Otherwise, provide an example
of a non-commutative ring T where RS is not a subring of T .
Exercise 2.5.
Let R and S be rings. Describe all subrings of R ⊕ S in terms of the subrings of
R and S.
EXERCISES FOR CHAPTER 2 35
Exercise 2.6.
Let G = Z ⊕ Z, equipped with the operation (g1 , h1 ) + (g2 , h2 ) ∶= (g1 + g2 , h1 + h2 ).
(a) Prove that G is an abelian group.
(b) Describe End(G), the endomorphism ring of G as defined in Example 2.9.
Exercise 2.7.
Let R be a ring and a, b ∈ R. Show that in general,
(a + b)^2 := (a + b)(a + b) ≠ a^2 + 2ab + b^2.
State the correct formula, and find the corresponding formula for (a + b)^3.
Exercise 2.8.
Consider M2 (R) equipped with the usual matrix addition and multiplication. If
L ⊆ M2 (R) is a subring with the additional property that R ∈ M2 (R) and M ∈ L
implies that RM ∈ L, we say that L is a left ideal of M2 (R).
Find all left ideals of M2 (R). Can you generalise this to Mn (R) for n ≥ 3?
Note. As you might imagine, one can also define right ideals in a similar
manner. What is the relationship between left and right ideals of Mn (R)?
Exercise 2.9.
Let (R, +, ⋅) be a ring. Define a new multiplication ∗ on R by setting
a ∗ b ∶= b ⋅ a for all a, b ∈ R.
Prove that (R, +, ∗) is also a ring.
This ring is referred to as the opposite ring of R.
Exercise 2.10.
Let R be a ring and suppose that a^2 = a for all a ∈ R. Prove that R is commutative; that is, prove that ab = ba for all a, b ∈ R.
A ring R for which a^2 = a for all a ∈ R is called a Boolean ring. Provide an
example of such a ring.
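One classical example of a Boolean ring is the power set of a set X, with symmetric difference playing the role of addition and intersection the role of multiplication. The Python sketch below (ours, not part of the notes) verifies both a² = a and commutativity by brute force for X = {0, 1, 2}; of course, a finite check is an illustration, not a proof of Exercise 2.10.

```python
from itertools import chain, combinations

X = (0, 1, 2)
# the power set of X, each subset stored as a frozenset
subsets = [frozenset(c)
           for c in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]

add = lambda a, b: a ^ b   # symmetric difference plays the role of +
mul = lambda a, b: a & b   # intersection plays the role of .

assert len(subsets) == 8
assert all(mul(a, a) == a for a in subsets)                           # a.a = a
assert all(mul(a, b) == mul(b, a) for a in subsets for b in subsets)  # ab = ba
```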
CHAPTER 3

Integral Domains and Fields
If you want to know what God thinks of money, just look at the people
he gave it to.
Dorothy Parker
Unfortunately, the cancellation law is not some universal truth that holds in all
algebraic structures at all times. In particular, the cancellation law does not have
to hold in an arbitrary ring.
In this Chapter, we shall isolate this “bad behaviour”, and we shall give a special
name to those rings which avoid it.
1.2. Definition. Let R be a ring. An element 0 ≠ r ∈ R is said to be a left
divisor of zero if there exists 0 ≠ x ∈ R such that rx = 0.
Similarly, 0 ≠ s ∈ R is said to be a right divisor of zero if there exists 0 ≠ y ∈ R
such that ys = 0.
Finally, 0 ≠ t ∈ R is said to be a (joint) divisor of zero if it is both a left and
a right divisor of 0.
We remark that if R is a commutative ring, then every left (or right) divisor of
zero is automatically a joint divisor of zero.
1.3. Examples.
(a) Let R = (C([0, 1], R), +, ⋅). Let f ∈ R be the function
f(x) = { 0 if 0 ≤ x ≤ 1/2; x − 1/2 if 1/2 ≤ x ≤ 1 }.
Let g ∈ R be the function
g(x) = { 1/2 − x if 0 ≤ x ≤ 1/2; 0 if 1/2 ≤ x ≤ 1 }.
Then f ≠ 0 ≠ g, but f g = 0 = gf , and thus f is a joint divisor of zero (as
is g).
(b) Let V = {x = (xn)_{n=1}^∞ ∈ R^N : limn xn = 0}. We leave it as an exercise for the
reader to prove that V is a vector space over R under the operations
(xn )n + (yn )n ∶= (xn + yn )n ,
λ(xn )n ∶= (λxn )n ,
for all (xn )n , (yn )n ∈ V and λ ∈ R.
Define the linear map
S∶ V → V
(x1 , x2 , x3 , . . .) ↦ (0, x1 , x2 , x3 , x4 , . . .).
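The functions f and g of Example 1.3(a) can be probed numerically. The Python sketch below (ours, purely illustrative) samples both on a grid and confirms that neither is the zero function while their pointwise product vanishes identically, which is exactly what makes them zero divisors.

```python
def f(x):
    # vanishes on [0, 1/2], equals x - 1/2 afterwards
    return 0.0 if x <= 0.5 else x - 0.5

def g(x):
    # equals 1/2 - x on [0, 1/2], vanishes afterwards
    return 0.5 - x if x <= 0.5 else 0.0

xs = [k / 1000 for k in range(1001)]          # a grid on [0, 1]
assert any(f(x) != 0 for x in xs)              # f is not the zero function
assert any(g(x) != 0 for x in xs)              # neither is g
assert all(f(x) * g(x) == 0 for x in xs)       # yet fg = 0
```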
1. INTEGRAL DOMAINS - DEFINITIONS AND BASIC PROPERTIES 39
1.5. Examples.
(a) The motivating example of an integral domain, and probably the origin of
the terminology, is the ring (Z, +, ⋅) of integers.
(b) Each of (Q, +, ⋅), (R, +, ⋅) and (C, +, ⋅) is also an integral domain.
(c) From Example 1.3 above,
(C([0, 1], R), +, ⋅)
is not an integral domain.
(d) M2 (R) is not commutative, so it has no chance of being an integral domain.
1.6. Examples.
(a) The ring of Gaussian integers
Z[i] := {a + bi : a, b ∈ Z, i^2 = −1}
is an integral domain.
(b) The subring (2Z, +, ⋅) of Z is not an integral domain, because it is not
unital!
(c) The ring (R[x], +, ⋅) of polynomials in one variable with coefficients in R is
an integral domain. We leave it to the reader to verify this.
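For Example (a), the usual route is via the norm N(a + bi) = a² + b², which is multiplicative. The Python sketch below (ours) does something cruder as a sanity check: it multiplies all pairs of small non-zero Gaussian integers and confirms that no product vanishes, and it also spot-checks the multiplicativity of the norm.

```python
def gmul(z, w):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i^2 = -1
    a, b = z
    c, d = w
    return (a*c - b*d, a*d + b*c)

nonzero = [(a, b) for a in range(-3, 4) for b in range(-3, 4) if (a, b) != (0, 0)]
assert all(gmul(z, w) != (0, 0) for z in nonzero for w in nonzero)   # no zero divisors here

# the norm N(a + bi) = a^2 + b^2 is multiplicative, which proves the claim in general
norm = lambda z: z[0]**2 + z[1]**2
assert all(norm(gmul(z, w)) == norm(z) * norm(w) for z in nonzero for w in nonzero)
```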
Again, we remind the reader that polynomial rings will be of special importance
later. Despite the simplicity of the proof, the following result is crucial, and hence
is designated a “theorem”.
1.8. Remark. The following useful fact was embedded in the above proof. For
ease of reference, we isolate it as a remark.
(b) implies (a). Now suppose that (R, +, ⋅) is a unital commutative ring and
that condition (b) holds. If there exists a ≠ 0 and b ∈ R with ab = 0, then
ab = 0 = a0. By hypothesis, b = 0. But then a is not a left divisor of zero.
Since R is commutative, a is not a right divisor of zero either. In other
words, R has no divisors of zero, and as such, it is an integral domain.
◻
Next, suppose that n is not prime, say that n = m1 m2 , where 1 < m1 , m2 < n.
Then m1 ≠ 0 ≠ m2 ∈ Zn , and m1 m2 = n mod n = 0 in Zn , proving that m1 and m2
are joint divisors of zero in Zn . Thus Zn is not an integral domain.
◻
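The dichotomy just proved is easy to see experimentally. The following Python sketch (ours, illustrative) lists the zero divisors of Zn by brute force: for n = 12 it finds exactly the non-zero elements sharing a common factor with 12, and for the prime n = 7 it finds none.

```python
def zero_divisors(n):
    # the nonzero a in Z_n for which some nonzero b satisfies ab = 0 (mod n)
    return sorted({a for a in range(1, n)
                     for b in range(1, n) if (a * b) % n == 0})

assert zero_divisors(12) == [2, 3, 4, 6, 8, 9, 10]
assert zero_divisors(7) == []            # Z_7 is an integral domain
```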
1.12. Exercise. Let p ∈ N be a prime number. Then Q(√p) := {a + b√p :
a, b ∈ Q} is an integral domain, using the usual addition and multiplication of real
numbers.
2.4. Examples.
(a) Each of the following rings has characteristic zero: Z, Q, R and C.
(b) If n ∈ N, then char (Zn , +, ⋅) = n. In particular, char(Z12 ) = 12, which
explains in part what is going on with grandmaman’s watch.
2. THE CHARACTER OF A RING 43
◻
The above Proposition will make our work significantly easier when it comes to
computing the characteristic of a unital ring, even those that on the surface appear
rather complicated.
2.6. Examples.
(a) Consider R = Z2 ⊕ Z3 = {(a, b) ∶ a ∈ Z2 , b ∈ Z3 }. We claim that char(R) = 6.
Indeed, the identity of R is e ∶= (1, 1). Note that as an ordered list, we
have that
(1e, 2e, 3e, 4e, 5e, 6e) = ((1, 1), (0, 2), (1, 0), (0, 1), (1, 2), (0, 0)).
That is, 6 = min{n ∈ N ∶ n e = 0}, and so by the above Proposition 2.5,
char(R) = 6.
(b) Consider next R = Z2 ⊕ Z4 = {(a, b) ∶ a ∈ Z2 , b ∈ Z4 }. Let e ∶= (1, 1) denote
the multiplicative identity of R.
Then
(1e, 2e, 3e, 4e) = ((1, 1), (0, 2), (1, 3), (0, 0)).
By Proposition 2.5, we deduce that
char(R) = 4.
(c) Let R = Z5 ⟨x, y, z⟩ denote the ring of polynomials in three non-commuting
variables x, y and z with coefficients in Z5 . Then R is a unital ring with
identity e = 1 ∈ Z5 . Thus
(1e, 2e, 3e, 4e, 5e) = (1, 2, 3, 4, 0),
and so by Proposition 2.5,
char(Z5 ⟨x, y, z⟩) = 5.
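Proposition 2.5 reduces computing char(R) to watching the orbit of the identity, which is easy to automate. The Python sketch below (ours; the function name is made up) finds the least m ≥ 1 with m·e = 0 in Zn1 ⊕ ⋯ ⊕ Znk, reproducing parts (a) and (b) above.

```python
def char_direct_sum(*moduli):
    # smallest m >= 1 with m.e = 0, where e = (1, ..., 1) in Z_{n1} + ... + Z_{nk};
    # note that m.e = (m mod n1, ..., m mod nk)
    m = 1
    while any(m % n != 0 for n in moduli):
        m += 1
    return m

assert char_direct_sum(2, 3) == 6    # Example 2.6(a)
assert char_direct_sum(2, 4) == 4    # Example 2.6(b)
```

Experimenting with other moduli here is a good warm-up for Exercise 3.5.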
These are the kinds of questions one should be asking oneself throughout the
learning process. Yes, it is time-consuming, but the better one understands math-
ematics, the less one has to memorise. In a perfect world, one would only need to
memorise definitions. (I’ve yet to figure out how to get around that!)
For example, rather than trying to memorise a proof line-by-line, one would only
need to remember the key step, relying upon oneself to fill in the routine steps of
the proof. The time invested in learning these things at the beginning is time saved
later on.
Proof. Consider
(r + s)^p = r^p + (p choose 1) r^{p−1} s + (p choose 2) r^{p−2} s^2 + ⋯ + (p choose p−1) r s^{p−1} + s^p.
Here, for 1 ≤ k ≤ p − 1,
(p choose k) = p! / (k! (p − k)!).
Now, p is prime and 1 ≤ k, p − k < p, so p divides neither k! nor (p − k)!. Since p
divides the numerator p!, it follows that p divides (p choose k) for 1 ≤ k ≤ p − 1. But then
(p choose k) r^{p−k} s^k = 0 for 1 ≤ k ≤ p − 1,
and so
(r + s)^p = r^p + 0 + 0 + ⋯ + 0 + s^p = r^p + s^p.
◻
Again, the reader should be asking (and hopefully answering) the question:
where was the commutativity of R used in this proof?
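For R = Zp, the conclusion itself can be confirmed exhaustively. A Python sketch (ours, illustrative): it checks the "freshman's dream" (r + s)^p = r^p + s^p in Zp for a few primes, and also checks the key divisibility fact about binomial coefficients used in the proof.

```python
from math import comb

for p in (2, 3, 5, 7):                 # Z_p is commutative of characteristic p
    for r in range(p):
        for s in range(p):
            assert pow(r + s, p, p) == (pow(r, p, p) + pow(s, p, p)) % p

# the binomial coefficients C(p, k) are divisible by p for 1 <= k <= p - 1
assert all(comb(5, k) % 5 == 0 for k in range(1, 5))
```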
3. Fields - an introduction
3.1. In Section 1 of this Chapter, we identified a property of rings which is
equivalent to the ring satisfying the cancellation law for multiplication, namely: we
asked that the ring be an integral domain. We argued that this is a requirement if
we hope to solve equations that involve elements of that ring.
Note that (Z, +, ⋅) is an integral domain, and yet if we only knew about inte-
gers and we didn’t know anything about rational numbers, we could not solve the
equation
2x = 1024.
That is, we could not do it by simply multiplying both sides by 1/2, because 1/2 ∉ Z. (Trial and
error would still work, I suppose.)
This leads us to consider rings which are even more special than integral do-
mains. In some ways, these new rings will behave almost as well as the real numbers
themselves.
3.4. Examples.
(a) Q, R and C are all fields, using the usual notions of addition and multipli-
cation.
(b) (Z, +, ⋅) is not a field, since 0 ≠ 2 ∈ Z does not admit a multiplicative inverse
in Z.
(c) Let F ∶= {0, 1, x, y}, equipped with the following addition and multiplication
operations.
+  0 1 x y          ⋅  0 1 x y
0  0 1 x y          0  0 0 0 0
1  1 0 y x   and    1  0 1 x y
x  x y 0 1          x  0 x y 1
y  y x 1 0          y  0 y 1 x
Then one can verify that (F, +, ⋅) is a field. Note that F has 4 = 2^2 elements.
Coincidence? We shall see.
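Verifying that (F, +, ⋅) really is a field means checking every axiom against the two tables, which is tedious by hand but immediate by machine. A Python sketch (ours), with the two tables transcribed row by row as strings:

```python
F = '01xy'
add_rows = {'0': '01xy', '1': '10yx', 'x': 'xy01', 'y': 'yx10'}
mul_rows = {'0': '0000', '1': '01xy', 'x': '0xy1', 'y': '0y1x'}

def add(a, b): return add_rows[a][F.index(b)]
def mul(a, b): return mul_rows[a][F.index(b)]

trips = [(a, b, c) for a in F for b in F for c in F]
assert all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a) for a in F for b in F)
assert all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in trips)   # + associative
assert all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in trips)   # . associative
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in trips)
assert all(add(a, '0') == a and mul(a, '1') == a for a in F)              # identities
assert all(any(add(a, b) == '0' for b in F) for a in F)                   # -a exists
assert all(any(mul(a, b) == '1' for b in F) for a in '1xy')               # a^{-1} exists
```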
Suppose that 0 ≠ r := a + b√p ∈ Q(√p). By Exercise 3.7 below, a^2 − pb^2 ≠ 0.
Thinking of r as an element of R, where we know it is invertible, we may write
r^{−1} = 1/(a + b√p) = (1/(a + b√p)) ⋅ ((a − b√p)/(a − b√p)) = (a − b√p)/(a^2 − pb^2).
That is,
r^{−1} = a/(a^2 − pb^2) − (b/(a^2 − pb^2)) √p ∈ Q(√p).
Thus Q(√p) is a field.
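The inversion formula above is concrete enough to compute with exactly. A Python sketch (ours, illustrative), using rational arithmetic: an element a + b√p is stored as the pair (a, b), and the check multiplies out (a + b√p)(c + d√p) = (ac + pbd) + (ad + bc)√p.

```python
from fractions import Fraction as Q

def inv(a, b, p):
    # inverse of a + b*sqrt(p) in Q(sqrt(p)), via the conjugate trick
    n = a*a - p*b*b            # nonzero whenever (a, b) != (0, 0) and p is prime
    return (a / n, -b / n)

a, b, p = Q(2), Q(3), 5
c, d = inv(a, b, p)
# (a + b sqrt(p))(c + d sqrt(p)) = (ac + p b d) + (ad + bc) sqrt(p), which should be 1
assert a*c + p*b*d == 1
assert a*d + b*c == 0
```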
Supplementary Examples.
S3.1. Example. We invite the student to verify that
Q(√2, √3) := {a + b√2 + c√3 + d√6 ∣ a, b, c, d ∈ Q}
is a subfield of R. That is, it is a subset of R which is a field using the addition and
multiplication it inherits from R.
S3.4. Example. Let D4 denote the set of all 2 × 2 matrices of the form
[a b; b a+b],
where a and b ∈ Z2 . We add and multiply elements of D4 using the usual matrix
operations.
We invite the reader to prove that D4 is a field under these operations.
Note. This example is due to R.A. Dean, and thus our choice of notation.
S3.6. Example. Let R be an integral domain, and let R[[x]] denote the ring
of formal power series defined in Example 2.4.1.
We leave it as an exercise for the reader to prove that R[[x]] is then an integral
domain as well. (Hint: if 0 ≠ p(x) = ∑_{n=0}^∞ pn x^n ∈ R[[x]], let ξ(p(x)) := min{k ≥ 0 : pk ≠ 0}.
Show that if 0 ≠ q(x) ∈ R[[x]], then ξ(p(x)q(x)) = ξ(p(x)) + ξ(q(x)). Why is this
helpful?)
We leave it as an exercise for the reader to check that the usual rules of differentiation
of real functions holds in this setting:
● (p(x) + q(x))′ = p′ (x) + q ′ (x);
● (r0 q(x))′ = r0 q ′ (x) for all r0 ∈ R;
● (p(x)q(x))′ = p′ (x)q(x) + p(x)q ′ (x).
Let a ∈ R. If (x − a)^2 ∣ q(x) in the sense that q(x) = (x − a)^2 g(x) for some
g(x) ∈ R[x], then (x − a) ∣ q′(x). To see this, we use the third property above to
write
q′(x) = ((x − a)^2)′ g(x) + (x − a)^2 g′(x) = 2(x − a) g(x) + (x − a)^2 g′(x) = (x − a)(2g(x) + (x − a)g′(x)).
Thus (x − a) ∣ q′(x).
We emphasise that elements of R[x] are not functions acting on a set (e.g. the
real line). As such, this is an abstract notion of derivative, which we have named
“derivative” because it behaves like the derivative of real-valued functions on R.
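Representing a polynomial by its list of coefficients makes the formal derivative and the claims above checkable by machine. A Python sketch (ours, illustrative; coefficient index = power of x, integer coefficients):

```python
def pmul(p, q):   # multiply coefficient lists
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def deriv(p):     # the formal derivative: sum p_n x^n maps to sum n p_n x^{n-1}
    return [n * p[n] for n in range(1, len(p))]

p, q = [1, 2, 3], [4, 0, 5]            # 1 + 2x + 3x^2 and 4 + 5x^2
lhs = deriv(pmul(p, q))
rhs = padd(pmul(deriv(p), q), pmul(p, deriv(q)))
assert lhs == rhs                      # the product rule (pq)' = p'q + pq'

# (x - 2)^2 divides q2(x) = (x - 2)^2 (x + 1), so (x - 2) divides q2'(x):
q2 = pmul(pmul([-2, 1], [-2, 1]), [1, 1])
dq = deriv(q2)
assert sum(c * 2**k for k, c in enumerate(dq)) == 0   # q2'(2) = 0, i.e. (x - 2) | q2'(x)
```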
Appendix
A3.1. Fields play a central role in algebra and number theory, in particular
in the search for solutions to equations. It can be shown that they have a strong
relation to Euclidean geometry by using field theory to demonstrate that one cannot,
using a compass and straightedge, trisect an (arbitrary) angle or find a square whose
area is the same as that of a circle of radius 1.
The solutions to a quadratic equation ax^2 + bx + c = 0, where a, b and c ∈ R and
a ≠ 0, are given by
x = (−b ± √(b^2 − 4ac)) / (2a).
There exist formulae, albeit more complicated than the one above, to identify the
solutions of cubic and quartic equations. In 1799, the Italian mathematician Paolo
Ruffini claimed that no such formula exists for a quintic equation (i.e. one of
degree 5), although his argument contained errors which were finally resolved by the
Norwegian Niels Henrik Abel, this being the same Abel whose name is used to
describe abelian groups.
The crowning result in this area is due to Évariste Galois, who derived neces-
sary and sufficient conditions for a polynomial equation to be solvable algebraically.
Galois’ contributions to algebra are extensive and quite amazing, especially consid-
ering that he died at the age of 20 of peritonitis, caused by a gunshot wound suffered
in a duel. Go figure.
The original word for a field was the German word Körper, introduced by
Richard Dedekind in 1871 (for subfields of the complex numbers), while the expres-
sion field was introduced much later (1893) by E. Hastings Moore. We mention
in passing that Körper is actually the German word for “body”, and the equivalent
word corps is used to describe a field in French, while cuerpo is used to describe a
field in Spanish. So English is the outlier, pretty much. How surprised the reader
chooses to be by this is really up to the reader, frankly.
Exercise 3.1.
Our definition of a joint divisor of zero 0 ≠ r in a ring R says that there exist
0 ≠ x, y ∈ R such that
xr = 0 = ry.
If r is a joint divisor of zero in R, is it always the case that there exists 0 ≠ z ∈ R
such that
zr = 0 = rz?
Exercise 3.2.
Consider the ring of polynomials in two commuting variables (Z[x, y], +, ⋅) with
coefficients in (the commutative, unital ring) Z, as defined in Example 2.1.8(b).
Prove that (Z[x, y], +, ⋅) is an integral domain.
Exercise 3.3.
Prove that the ring of Gaussian integers Z[i] is indeed a ring, as the name
suggests. (Hint: use the Subring Test. Which ring should Z[i] be contained in?)
Exercise 3.4.
For which rings (R, +, ⋅) is the ring of polynomials R[x] in one variable with
coefficients in R an integral domain? Formulate and prove your conjecture.
Exercise 3.5.
Let n1 , n2 , n3 ∈ N and suppose that R = Zn1 ⊕ Zn2 ⊕ Zn3 . Find
char(R),
and formulate a conjecture as to what the characteristic of the ring
S := Zn1 ⊕ Zn2 ⊕ ⋯ ⊕ Znk
should be, given n1 , n2 , . . . , nk ∈ N. Better yet, try to prove your conjecture!
Exercise 3.6.
Find a unital ring R and a polynomial p(x) = p0 + p1 x + ⋯ + pn x^n of degree at
least two such that the equation
p(x) = 0
admits infinitely many solutions in R.
Exercise 3.7.
Let a, b ∈ Q and suppose that p ∈ N is prime. Prove that a^2 − pb^2 ≠ 0. This was
used in Example 3.7 above.
Exercise 3.10.
Let H denote the ring of real quaternions, defined in Example 2.4.1. Prove
that H satisfies all of the properties of a field, except that multiplication need not
be commutative. As pointed out in the Appendix to Chapter 2, we say that H is a
division ring or a skew field.
CHAPTER 4

Homomorphisms, Ideals and Quotient Rings
Observe that the first multiplication, namely g1 ⋅ g2 , is taking place in (G, ⋅), while
the second product φ(g1 ) ∗ φ(g2 ) is taking place in (H, ∗). (Something similar
happened with linear maps above: the sum of x and y occurred in V, whereas the
sum of T x and T y occurred in W.) More often than not, multiplication in both
G and H is denoted simply by juxtaposition, and the reader will often see a group
homomorphism defined as a map φ ∶ G → H which satisfies φ(g1 g2 ) = φ(g1 )φ(g2 )
for all g1 , g2 ∈ G. The reader is expected to keep track of which multiplication is
being used where.
Which brings us to rings. Rings (R, +, ⋅) admit two operations, as did vector
spaces. It is not surprising, therefore, that – like vector space homomorphisms – our
notion of a homomorphism for a ring should require that we respect both of these
operations.
1.2. Definition. Let (R, +, ⋅) and (S, +̇, ∗) be two rings. A map φ ∶ R → S is
said to be a ring homomorphism if
φ(r1 + r2 ) = φ(r1 )+̇φ(r2 )
and
φ(r1 ⋅ r2 ) = φ(r1 ) ∗ φ(r2 )
for all r1 , r2 ∈ R.
1.3. Similar to the case when dealing with group homomorphisms, the reader
is often expected to keep track of the two different notions of addition and of mul-
tiplication arising in R and S, and a ring homomorphism from R to S is (infor-
mally) defined as a map φ ∶ R → S that satisfies φ(r1 + r2 ) = φ(r1 ) + φ(r2 ) and
φ(r1 r2 ) = φ(r1 )φ(r2 ) for all r1 , r2 ∈ R.
If there is a risk of confusion, we shall adopt the very strict notation from
Definition 1.2. In general, however, the notation given in the previous paragraph
should not cause misunderstanding, and we shall bow to tradition and use it with
almost reckless abandon.
1.7. Example. Let R and S be arbitrary rings and define φ ∶ R → S via φ(r) = 0
for all r ∈ R. Then φ is a ring homomorphism, known as the trivial homomorphism
from R to S. As the reader may have already surmised, not much information about
R and S can be gleaned from studying this map.
1.8. Example. Let p ∈ N be a prime number and consider R = Q(√p) = S.
Define
φ : Q(√p) → Q(√p), a + b√p ↦ a − b√p.
Then
φ((a1 + b1√p) + (a2 + b2√p)) = φ((a1 + a2) + (b1 + b2)√p)
= (a1 + a2) − (b1 + b2)√p
= (a1 − b1√p) + (a2 − b2√p)
= φ(a1 + b1√p) + φ(a2 + b2√p).
Similarly,
φ((a1 + b1√p)(a2 + b2√p)) = φ((a1 a2 + p b1 b2) + (a1 b2 + b1 a2)√p)
= (a1 a2 + p b1 b2) − (a1 b2 + b1 a2)√p
= (a1 − b1√p)(a2 − b2√p)
= φ(a1 + b1√p) φ(a2 + b2√p).
Thus φ is a ring homomorphism of Q(√p) into (in fact onto) itself.
This is again a good time to make sure that we are thinking while reading these
notes: what is so special about having three rings R, S and T ? Would it have worked
with four rings? With twenty? With infinitely many? If not, what goes wrong?
Is the map ψ ∶ A → S ⊕ R defined by ψ(r, s, t) = (s, r) a ring homomorphism? So
many questions! That is indeed the nature of mathematics.
in general. Yes, it can happen for some matrices, but it is not always true. For
example, if A := [0 1; 0 0] and B := [0 0; 1 0], then φ(A) = 0 = φ(B), but
φ(AB) = φ([1 0; 0 0]) = 1 ≠ 0 = φ(A)φ(B).
Thus φ is not a ring homomorphism.
1.13. Remark. From the point of view of set theory, there is not much differ-
ence between the sets A = {a, b, c} and B = {1, 2, 3}. True, one consists of letters and
the other of numbers, but those are not set-theoretic properties. The fact that each
set has three elements is a set-theoretic property, as is the fact that each set admits
a total of 8 subsets. In fact, from the point of view of Set Theory, B is just the set A
(after relabelling the entries of A). If we wish to impress people, we might say that
A and B are isomorphic as sets. Why we would want to impress people, especially
the kinds of people who would be impressed by such a statement, is another matter
altogether.
In much the same way, however, one should think of two isomorphic rings R
and S as being essentially the same, although presented differently. While one ring
might consist of numbers and another of functions, or matrices, or polynomials with
coefficients in some ring, any ring-theoretic property of R will be possessed by S
and vice-versa. That is, as rings, S is just the ring R, with the elements and
the operations relabelled.
1.14. Examples. The proofs of the following facts are computations which we
leave to the reader.
(a) Consider the map
α : T2 (C) → T3 (C), [t11 t12; 0 t22] ↦ [t11 0 t12; 0 0 0; 0 0 t22].
Then α is a monomorphism of T2 (C) into T3 (C). Clearly it is not surjective,
and so it is not an isomorphism.
We shall prove that φ(nr) = nφ(r) by induction, and leave the (similar)
proof that φ(r^n) = φ(r)^n for all r ∈ R and n ∈ N to the reader.
Let r ∈ R be arbitrary. First note that if n = 1, then φ(1r) = φ(r) =
1φ(r), which provides the initial step in our induction process.
In general, if φ(nr) = nφ(r), then
φ((n + 1)r) = φ(nr + 1r) = φ(nr + r)
= φ(nr) + φ(r) = nφ(r) + φ(r)
= (n + 1)φ(r).
This completes the induction step, and the proof that φ(nr) = nφ(r) for
all n ∈ N, r ∈ R.
(b) Since 0R = 0S ∈ S, 0T = φ(0S ) ∈ φ(S), and thus the latter is non-empty.
Let us apply the Subring Test: Let x, y ∈ φ(S). Then there exist r, s ∈ S
such that x = φ(r) and y = φ(s). Thus
x − y = φ(r) − φ(s) = φ(r − s) ∈ φ(S),
and
xy = φ(r)φ(s) = φ(rs) ∈ φ(S),
since r, s ∈ S and S a subring of R implies that r − s and rs ∈ S.
By the Subring Test, φ(S) is a subring of T .
(c) If S is commutative, then for x, y ∈ φ(S), there exist r, s ∈ S such that
x = φ(r) and y = φ(s). Thus
xy = φ(r)φ(s) = φ(rs) = φ(sr) = φ(s)φ(r) = yx,
proving that φ(S) is commutative as well.
(d) Let t ∈ T be arbitrary. Since φ is surjective, t = φ(r) for some r ∈ R. Thus
t = φ(r) = φ(1R ⋅ r) = φ(1R )φ(r) = φ(1R ) ⋅ t,
and similarly,
t = φ(r) = φ(r ⋅ 1R ) = φ(r)φ(1R ) = t ⋅ φ(1R ).
Since t ∈ T was arbitrary, φ(1R ) is a multiplicative identity for T . Since
this must be unique, φ(1R ) = 1T .
◻
Here is a wonderful little application of these results.
1.18. Proposition. Let n ∈ N and consider the decimal expansion of n,
n = ak 10^k + ak−1 10^{k−1} + ⋯ + a2 10^2 + a1 10 + a0.
Then n is divisible by 9 if and only if the sum ak + ak−1 + ⋯ + a2 + a1 + a0 of its digits
is divisible by 9.
Proof. Define the map
β∶ N → Z9
m ↦ m mod 9.
But β(10) = 1, and so by Proposition 1.17 above, β(10^j) = β(10)^j = 1^j = 1 for all
j ≥ 1. Thus
n mod 9 = β(n)
= β(ak ) + β(ak−1 ) + ⋯ + β(a2 ) + β(a1 ) + β(a0 )
= (ak + ak−1 + ⋯ + a2 + a1 + a0 ) mod 9.
In particular, n is divisible by 9 if and only if β(n) = n mod 9 = 0. But β(n) = 0
if and only if the sum of the digits of n is divisible by 9.
Putting these two things together, we see that n is divisible by 9 if and only if
the sum of its digits is divisible by 9, as claimed.
◻
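Proposition 1.18 is, of course, the classical "casting out nines" test, and it is easy to stress-test. A Python sketch (ours, illustrative); note that the proof actually gives the stronger statement that n and its digit sum are congruent mod 9.

```python
def digit_sum(n):
    return sum(int(d) for d in str(n))

# n and its digit sum are congruent mod 9, hence divisible by 9 together
assert all(n % 9 == digit_sum(n) % 9 for n in range(1, 100000))
assert 123456789 % 9 == 0 and digit_sum(123456789) == 45
```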
2. Ideals
2.1. We now turn our attention to a special class of subrings of a given ring R.
Elements of this class will be called ideals of R. They will be of interest because
they can always be realised as the kernels of some homomorphism from R into some
ring S, and as mentioned in paragraph 1.4 above, they will play a central role in
establishing the three Isomorphism Theorems for rings.
2.3. Note that the condition that rk ∈ K for all r ∈ R and k ∈ K implies in
particular that jk ∈ K for all j, k ∈ K. As such, in the definition of an ideal, it
suffices to ask that K be an additive subgroup of R with the property that rk and
kr ∈ K for all r ∈ R, k ∈ K.
That is, by combining these remarks with the Subring Test 2.4.10, we see that
to check that K is an ideal of R, it suffices to prove that
● k − j ∈ K for all j, k ∈ K; and
● rk and kr ∈ K for all k ∈ K, r ∈ R.
For ease of reference, we shall call this the Ideal Test.
We also point out that if R is commutative, then every left or right ideal is
automatically a two-sided ideal.
2.4. Example. Consider the ring Z. Let m ∈ N and set K = mZ = {mz ∶ z ∈ Z}.
By a (hopefully by now) routine application of the Subring Test, we see that K is a
subring of Z.
If k ∈ K, then k = mz for some z ∈ Z. Thus for any r ∈ Z,
kr = rk = r(mz) = m(rz),
showing that rk, kr ∈ K. Thus K is an ideal of Z.
2.5. Example. Let R = M3 (Z), and let
K = { [x 0 0; y 0 0; z 0 0] : x, y, z ∈ Z }.
That K is a subring of R is left as a computation for the reader. If R = [rij] ∈
M3 (Z) and K = [x 0 0; y 0 0; z 0 0] ∈ K, then
RK = [r11 x + r12 y + r13 z, 0, 0; r21 x + r22 y + r23 z, 0, 0; r31 x + r32 y + r33 z, 0, 0] ∈ K.
Thus K is a left ideal of M3 (Z). Note that L := [1 0 0; 0 0 0; 0 0 0] ∈ K and
S := [1 2 3; 0 0 0; 0 0 0] ∈ R, but
LS = S ∉ K,
proving that K is not a right ideal, and a fortiori not a two-sided ideal.
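The one-sidedness in Example 2.5 can be confirmed directly. A Python sketch (ours, illustrative), with 3 × 3 integer matrices stored as lists of rows:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def in_K(M):
    # K consists of the matrices whose second and third columns vanish
    return all(M[i][j] == 0 for i in range(3) for j in (1, 2))

R = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
K = [[10, 0, 0], [11, 0, 0], [12, 0, 0]]
assert in_K(matmul(R, K))                     # K absorbs multiplication on the left

L = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]         # the matrix L of Example 2.5
S = [[1, 2, 3], [0, 0, 0], [0, 0, 0]]         # the matrix S of Example 2.5
assert in_K(L)
assert matmul(L, S) == S and not in_K(S)      # LS = S lies outside K
```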
The most important example of an ideal stems from the following result. Al-
though its proof is routine, its significance is profound. As we shall eventually see,
every ideal of R arises in this manner, for an appropriate choice of S and corre-
sponding homomorphism φ.
whence k1 − k2 ∈ K.
If r ∈ R is arbitrary, then
φ(rk) = φ(r)φ(k) = φ(r)0 = 0 = 0φ(r) = φ(k)φ(r) = φ(kr),
proving that rk and kr both lie in K.
By the Ideal Test, K is a two-sided ideal of R.
Conversely, suppose that ker φ = {0}. If φ(r1 ) = φ(r2 ) for some r1 , r2 ∈ R, then
0 = φ(r1 ) − φ(r2 ) = φ(r1 − r2 ),
implying that r1 − r2 ∈ ker φ = {0}, whence r1 = r2 . Thus φ is injective.
◻
lies in JK.
If r ∈ R, then rjm ∈ J and km r ∈ K for all 1 ≤ m ≤ p, so that
ra = r ( ∑_{m=1}^p jm km ) = ∑_{m=1}^p (r jm) km ∈ JK,
and
ar = ( ∑_{m=1}^p jm km ) r = ∑_{m=1}^p jm (km r) ∈ JK.
By the Ideal Test, JK ⊲ R.
◻
2.10. Remarks.
(a) It is not hard to show that (and we leave it as an exercise for the reader to
show that) if R is a unital ring and ∅ ≠ F ⊆ R, then
⟨F⟩ = { ∑_{i=1}^n ri xi si : n ∈ N, ri, si ∈ R, xi ∈ F, 1 ≤ i ≤ n }.
When R is non-unital, there is a slight issue that arises with this de-
scription. For example, suppose that
R = { [0 x; 0 0] : x ∈ R },
which forms a non-unital subring of M2 (R). Let a = [0 1; 0 0], and consider
F = {a}. For any r, s ∈ R, it is straightforward to verify that ras = [0 0; 0 0],
and so a would not belong to the set described upon the right-hand side of
the above equation, although a ∈ ⟨a⟩ by definition. (The set on the right
hand side of the equation would just contain the zero matrix in this case.)
As such, this description of the ideal generated by a set F does not hold in
general when R is non-unital.
(b) What is true in general is that if R is any ring, a ∈ R, and we define
ma := a + a + ⋯ + a (m times), then ma ∈ ⟨a⟩ for all m ∈ N, and so (−m)a :=
−(ma) ∈ ⟨a⟩ as well. Writing Za := {ma : m ∈ Z}, we find that in general,
⟨a⟩ = { ma + ra + as + ∑_{k=1}^n rk a sk : m ∈ Z, n ≥ 1, r, s, rk, sk ∈ R, 1 ≤ k ≤ n }.
In the example above, this yields
⟨a⟩ = { [0 m; 0 0] : m ∈ Z }.
2.11. Exercise. Find all ideals of Z, proving thereby that every ideal of Z is
a principal ideal. (An integral domain in which every ideal is principal is called a
principal ideal domain, denoted by pid. Thus Z is a pid. We shall return to these
later.)
2.13. Example. Let R = (C([0, 1], R), +, ⋅). We leave it to the reader to verify
that J := {f ∈ R : f(x) = 0 for all x ∈ [0, 1/3]} and K := {g ∈ R : g(x) = 0 for all
x ∈ [1/4, 1]} are ideals of R.
● Consider J ∩ K. If h ∈ J ∩ K, then h ∈ J implies that h(x) = 0 for all
x ∈ [0, 1/3], and h ∈ K implies that h(x) = 0 for all x ∈ [1/4, 1]. But then
h(x) = 0 for all x ∈ [0, 1], so h = 0.
That is, J ∩ K = {0}.
● Next, consider JK. If f ∈ J and g ∈ K, then
f(x)g(x) = { 0 ⋅ g(x) = 0 if x ∈ [0, 1/3]; f(x) ⋅ 0 = 0 if x ∈ [1/4, 1] }.
Thus f (x)g(x) = 0 for all x ∈ [0, 1], so f g = 0.
More generally, if h ∈ JK, then h = ∑_{i=1}^n fi gi for some n ∈ N and some
fi ∈ J and gi ∈ K, 1 ≤ i ≤ n. Since each fi gi = 0 from the above computation,
we see that h = ∑_{i=1}^n 0 = 0.
Thus JK = {0}.
but idle. Assuming that your grandmaman was a high-ranking member of the army
and had a 24-hour watch with the hours indicated 0000, 0100, 0200, . . . , 2200, 2300
but with the date notably absent from the face of her watch (it’s been a while since
she retired, obviously), we see that 1400 hours refers not just to two in the afternoon
today, but in fact two in the afternoon tomorrow, the next day, yesterday and in
fact every day. If we take a philosophical leap of faith and make the simplifying
assumption that time has no beginning and no end (a mild exaggeration, perhaps),
then 1400 on your grandmaman’s watch would really indicate a member of {1400 +
2400n ∶ n ∈ Z}, which is the set of all two-in-the-afternoons throughout eternity,
moving forwards or backwards in time. In other words, two in the afternoon on
your grandmaman’s watch is just the set 1400 + 2400Z, a coset of the subring 2400Z
of Z.
The collection of all two-in-the-afternoons we think of as a single object, denoted
by 1400 on the face of your grandmaman’s watch. Observe that if we wanted to know
what her watch would say three hours from now, we would add 0300 + 2400Z to
1400 + 2400Z to get 1700 + 2400Z, i.e. 5 in the afternoon. Of course, if we wanted to
know the time tomorrow plus three hours from now, we would add 2700+2400Z to the
current time 1400+2400Z to get 4100+2400Z. But there is no 4100 on the face of your
grandmaman’s watch, because this is indicated by 1700, since 4100 = 1700 + 2400,
and thus 4100 + 2400Z is represented by 1700 + 2400Z.
The point is, if you understand your grandmaman’s watch, then you understand
cosets. Of course, the advent of smart-phone technology is quickly making your
grandmaman and indeed everyone’s grandmaman obsolete, but that unfortunate
circumstance is way beyond the scope of these lecture notes.
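For the computationally inclined: treating a time on the watch as an integer (in hundreds of hours, following the text), coset arithmetic in 2400Z is one line of Python. A sketch (ours; the function name is made up), where watch(t) returns the canonical representative of the coset t + 2400Z:

```python
def watch(t):
    # canonical representative in [0, 2400) of the coset t + 2400Z
    return t % 2400

assert watch(1400 + 300) == 1700     # three hours after 1400 is 1700
assert watch(1400 + 2700) == 1700    # ... as is "tomorrow plus three hours"
assert watch(4100) == watch(1700)    # 4100 + 2400Z and 1700 + 2400Z are the same coset
assert watch(-1000) == 1400          # yesterday at 1400 is still two in the afternoon
```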
To summarise: all of this suggests that the concept of a coset is either not as
complicated as we are tempted to believe it might be, or that your grandmaman
possessed some crazy-mad sophisticated technology. Your humble author prefers the
former explanation.
R/L ∶= {x + L ∶ x ∈ R}.
3.3. The astute reader (hopefully you!) will have noticed that we referred to
x above as a representative of the coset x + L, as opposed to the representative of
x + L. There’s a very good reason for this. Unless L = {0} is the trivial subring,
then the representative of x + L is not unique, as the next result proves.
3.5. Examples.
(a) Let R = Z and L = ⟨12⟩. Let us show that 7 + L = 19 + L ≠ 4 + L.
Observe that
7 + ⟨12⟩ = 7 + {. . . , −24, −12, 0, 12, 24, 36, . . .}
= {. . . , −17, −5, 7, 19, 31, 43, . . .}
= 19 + ⟨12⟩,
but that
4 + ⟨12⟩ = 4 + {. . . , −24, −12, 0, 12, 24, 36, . . .}
= {. . . , −20, −8, 4, 16, 28, 40, . . .}
≠ 7 + ⟨12⟩.
(For example, 4 ∈ 4 + ⟨12⟩, but 4 ∈/ 7 + ⟨12⟩.)
(b) Note that
D3 (R) = { [d1 0 0; 0 d2 0; 0 0 d3] : d1, d2, d3 ∈ R }
is a subring of T3 (R). It is not an ideal of T3 (R) since, for example, the
matrices
E12 := [0 1 0; 0 0 0; 0 0 0] ∈ T3 (R) and D1 := [1 0 0; 0 0 0; 0 0 0] ∈ D3 (R)
satisfy
E12 = D1 ⋅ E12 ∉ D3 (R).
Fix a matrix, say T = [17 2 π − e^2; 0 −12.1 −√2; 0 0 1] ∈ T3 (R). Then
T + D3 (R) = { [x 2 π − e^2; 0 y −√2; 0 0 z] : x, y, z ∈ R }.
3.6. In linear algebra, one shows that if V is a vector space (over a field F) and
W is a subspace of V, then we can view the set V/W of cosets of W in V as a vector
space using the operations
(x + W) + (y + W) ∶= (x + y) + W
and
κ(x + W) ∶= (κx) + W
for all x, y ∈ V and κ ∈ F. Unfortunately, in the case of rings, it is not true that the
cosets of a subring can always be made into a ring using the canonical operations.
This is where the notion of an ideal takes on all of its importance.
3.7. Theorem. Let R be a ring and L be a subring of R. The set R/L of all
cosets of L in R becomes a ring using the operations
(x + L) + (y + L) ∶= (x + y) + L
and
(x + L)(y + L) ∶= (xy) + L
for x + L, y + L ∈ R/L if and only if L is an ideal of R.
Proof.
● Suppose first that L is an ideal of R. Let us prove that R/L is a ring under the
operations defined above. Note that the Subring Test does not apply here – R/L
is not contained in anything that we already know to be a ring. As such, we are
forced to use the definition of a ring, which involves checking a great many things.
Life isn’t always easy.
The first thing that we have to do is to verify that the operations are well-defined.
(See the Appendix to this Chapter to learn more about when we have to check that
operations are well-defined.)
Suppose that x1 + L = x2 + L and that y1 + L = y2 + L, and set m1 ∶= x1 − x2 ∈ L
and m2 ∶= y1 − y2 ∈ L. Then (x1 + y1 ) − (x2 + y2 ) = m1 + m2 ∈ L, so that
(x1 + y1 ) + L = (x2 + y2 ) + L. Similarly,
(x1 y1 )−(x2 y2 ) = x1 y1 −x1 y2 +x1 y2 −x2 y2 = x1 (y1 −y2 )+(x1 −x2 )y2 = x1 m2 +m1 y2 ∈ L,
since L is an ideal. Thus
(x1 y1 ) + L = (x2 y2 ) + L.
That is, the operations of addition and multiplication we are using are well-defined.
Now we get to the nitty-gritty of checking that R/L is a ring. Sigh. Nah, “chin up”
– let’s do it.
carry it into the second. Anything in the kernel gets sent to 0, and so (for example)
if r ∈ ker φ, then so are r², r³, r + r² + r³, etc., so the homomorphism by itself can’t
distinguish between these elements.
We get extra information from the kernel as well; for example, if the kernel of
the ring homomorphism φ has exactly 10 elements in it, then not only are these 10
elements sent to 0, but in fact any element s in the range of φ is then the image of
exactly 10 elements of R. The reason for this is that φ(r1 ) = s = φ(r2 ) if and only
if φ(r2 − r1 ) = 0; i.e. if r2 = r1 + k for some k ∈ ker φ. In particular, the map φ is
injective if and only if its kernel is {0}, in which case φ loses no information – φ(R)
is isomorphic to R in this case. So the point is that ring homomorphisms are the
objects we really want to study.
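The fibre-counting phenomenon described above can be observed in a small finite example: the map φ ∶ Z20 → Z10 , x + ⟨20⟩ ↦ x + ⟨10⟩, is a ring homomorphism (since 10 divides 20) whose kernel has two elements, and every element of the image is hit by exactly two elements. A Python sketch, modeling each Zn simply as its set of residues (an illustration, not part of the notes):

```python
def phi(x):
    # The homomorphism Z_20 -> Z_10, x + <20> |-> x + <10>.
    return x % 10

Z20 = range(20)
kernel = [x for x in Z20 if phi(x) == 0]                # the two elements 0 and 10
fibers = {s: [x for x in Z20 if phi(x) == s] for s in range(10)}

print(kernel)                                           # [0, 10]
print({s: len(f) for s, f in fibers.items()})           # every fiber has size 2 = |ker phi|
```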
We originally defined an ideal K of a ring R as a subring with the special property
that k ∈ K and r ∈ R implies that both rk and kr lie in K. Let us (very) temporarily
change the name of such subrings to “multiplication absorbing subrings”. Now the
question becomes: why should we care about multiplication absorbing subrings?
Why did we study them in the first place?
Half of the reason lies in Theorem 2.6, which states that kernels of homomor-
phisms are in fact multiplication absorbing subrings. The other half of the reason
is Theorem 3.8 below, which says that these are the only kinds of multiplication
absorbing subrings. That is, multiplication absorbing subrings and kernels of ring
homomorphisms are just two ways of looking at the same thing. So why bother with
this alternate description? Suppose that we didn’t know that these two objects were
the same. If one were to present you with a subring L of a ring R and ask you if
it is the kernel of some homomorphism (and therefore very important), how would
you check if this is so? How would you look for a ring S and a homomorphism
φ ∶ R → S? You can’t say “Hey, Theorem 3.7 is the answer!”, because Theorem 3.7
only tells you that a subring L is a multiplication absorbing subring. Knowing that
L is the kernel of a homomorphism if and only if it is a multiplication absorbing
subring tells us that instead of looking for some ring S and some homomorphism
φ ∶ R → S whose kernel is L, you only have to apply the multiplication absorbing
subring Test, which – in our infinite wisdom – we actually refer to as the Ideal Test.
Alternatively, it suffices to check that R/L is a ring under the canonical operations.
So why don’t we use the expression “multiplication absorbing subring”? My
guess is that it is because other mathematicians are jealous of me and my fancy
terminology, but this is just speculation.
3.8. Theorem. Let R be a ring, and let K be an ideal of R. Then there exists
a ring S and a homomorphism φ ∶ R → S such that K = ker φ.
Proof. By Theorem 3.7 above, R/K is a ring using the canonical operations. Let
φ ∶ R → R/K
r ↦ r + K.
Then
φ(r1 + r2 ) = (r1 + r2 ) + K = (r1 + K) + (r2 + K) = φ(r1 ) + φ(r2 ),
and
φ(r1 r2 ) = (r1 r2 ) + K = (r1 + K)(r2 + K) = φ(r1 ) φ(r2 ).
Thus φ is a ring homomorphism.
The additive identity element of R/K is 0+K, since (r+K)+(0+K) = (r+0)+K =
r + K for all r ∈ R. By Proposition 3.4,
φ(r) = r + K = 0 + K
if and only if r ∈ K. That is, ker φ = K, which completes the proof.
◻
3.9. Example.
Let R ∶= R[x], and fix a real number r. Define the map
δr ∶ R[x] → R
p(x) ↦ p(r).
Then
δr ((pq)(x)) = (pq)(r) = p(r)q(r) = δr (p(x)) δr (q(x)),
and
δr ((p + q)(x)) = (p + q)(r) = p(r) + q(r) = δr (p(x)) + δr (q(x)).
Thus δr is a homomorphism, often referred to as evaluation at r.
Note that p(x) ∈ ker δr if and only if 0 = δr (p(x)) = p(r). That is, ker δr consists
precisely of those polynomials which have r as a root. We claim that ker δr = ⟨x − r⟩.
Let f (x) = x − r ∈ R[x]. Then f (r) = 0, so f ∈ ker δr . Note that ⟨f (x)⟩ ⊆ ker δr ,
since if g(x) ∈ ⟨f (x)⟩, then g(x) = h(x)f (x) for some h(x) ∈ R[x] (why don’t we
need to consider finite sums of functions here?), and thus g(r) = h(r)f (r) = h(r) ⋅ 0 = 0,
so that g(x) ∈ ker δr .
Conversely, if p(x) ∈ ker δr , then p(r) = 0, and so r is a root of p(x). But then
f (x) = x−r divides p(x), and so there exists k(x) ∈ R[x] such that p(x) = k(x)f (x) ∈
⟨f (x)⟩.
(Note: the fact that p(r) = 0 implies that x − r divides p(x) is something that
we know from high school in the setting of polynomials with coefficients in R, but
which we shall prove in greater generality in a later chapter (see Corollary 7.1.4)).
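The equality ker δr = ⟨x − r⟩ can be probed numerically: synthetic division of p(x) by x − r leaves remainder p(r), so p lies in the kernel precisely when the division is exact. A Python sketch (coefficient lists with constant term first; the particular polynomial below is just an illustration):

```python
def evaluate(p, r):
    """delta_r: evaluate the polynomial p (coefficients, constant term first) at r."""
    return sum(c * r**i for i, c in enumerate(p))

def divide_by_linear(p, r):
    """Synthetic division of p(x) by (x - r): returns (quotient, remainder)."""
    q, acc = [], 0
    for c in reversed(p):          # work from the leading coefficient down
        acc = c + r * acc
        q.append(acc)
    rem = q.pop()                  # the final accumulated value is p(r)
    return list(reversed(q)), rem

p = [-6, 11, -6, 1]                # p(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)
quot, rem = divide_by_linear(p, 2)
print(evaluate(p, 2), rem)         # 0 0: p(2) = 0, so (x - 2) divides p(x)
print(quot)                        # [3, -4, 1], i.e. the quotient x^2 - 4x + 3
```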
φ∶ Z/⟨κ⟩ → Zκ
n + ⟨κ⟩ ↦ n mod κ.
Once again, we must check that φ is well-defined, mustn’t we? But if n + ⟨κ⟩ =
m + ⟨κ⟩, then m − n ∈ ⟨κ⟩, i.e. m − n = κz for some z ∈ Z. Thus m mod κ = n mod κ,
and so φ is well-defined.
By definition, for φ(r) ∈ φ(R), τ (r+K) = φ(r), proving that τ is surjective. Also,
if r + K ∈ ker τ , then 0 = τ (r + K) = φ(r), proving that r ∈ ker φ. But K = ker φ,
and thus r + K = 0 + K is the zero element of R/K. That is, by Theorem 2.6, τ is
injective.
Thus τ is a bijective homomorphism (i.e. an isomorphism) from R/K onto φ(R),
completing the proof.
◻
τ (m1 + n1 ) = m1 + (M ∩ N ) = m2 + (M ∩ N ) = τ (m2 + n2 ).
Hence τ is well-defined.
Next, note that for each m ∈ M ,
τ (m + 0) = m + (M ∩ N ).
Thus τ is surjective.
Finally, suppose that n ∈ N . Then n = 0 + n ∈ M + N and τ (n) = τ (0 + n) =
0 + (M ∩ N ) is the zero element of M /(M ∩ N ). Thus n ∈ ker τ . Since n ∈ N was
arbitrary, N ⊆ ker τ .
Conversely, suppose that x = (m + n) ∈ ker τ . Then τ (x) = m + (M ∩ N ) =
0 + (M ∩ N ), and so m = m − 0 ∈ (M ∩ N ) ⊆ N . But n ∈ N and N is an ideal of R,
so x = m + n ∈ N . That is, ker τ ⊆ N .
Taken together, these two inclusions imply that ker τ = N .
By the First Isomorphism Theorem,
τ (M + N ) = M /(M ∩ N ) ≃ (M + N )/ ker τ = (M + N )/N.
◻
4. THE ISOMORPHISM THEOREMS 79
4.5. Example. Let us look at the Second Isomorphism Theorem (and its proof)
through the prism of an explicit example. It will be enlightening, to say the least.
Suppose that R = Z (already our example is special - for one thing, it is com-
mutative), M = ⟨12⟩ = 12Z, and N = ⟨20⟩ = 20Z. We leave it to the reader as an
exercise to prove that M + N = ⟨4⟩ = 4Z and M ∩ N = ⟨60⟩ = 60Z.
The Second Isomorphism Theorem says that
4Z/20Z = (M + N )/N ≃ M /(M ∩ N ) = 12Z/60Z.
So what does an element of 4Z/20Z look like? In general,
(M + N )/N = {(m + n) + N ∶ m ∈ M, n ∈ N } = {m + N ∶ m ∈ M },
since n + N = N for all n ∈ N . In this specific case, we are looking at the cosets of
the ideal 20Z of the ring 4Z, so the cosets look like
4Z/20Z = {0 + 20Z, 4 + 20Z, 8 + 20Z, 12 + 20Z, 16 + 20Z}.
Similarly,
12Z/60Z = {0 + 60Z, 12 + 60Z, 24 + 60Z, 36 + 60Z, 48 + 60Z}.
How did our proof proceed? We considered the map
τ ∶ 12Z + 20Z → 12Z/60Z
12r + 20s ↦ 12r + 60Z.
Let’s think about this for a second. Ok, long enough. We remarked above that
12Z + 20Z = 4Z. So our map τ is really a map from 4Z to 12Z/60Z. We must specify
what τ does to a multiple of 4!
When looking at an element of 4Z, we are first asking ourselves to express a
multiple of four in the form 12r + 20s. For example, 8 ∈ 4Z, so you might write
8 = 12(−1) + 20(1). Then again, I might write 8 = 12(4) + 20(−2).
As we have defined τ , you would get τ (8) = 12(−1) + 60Z = −12 + 60Z, while I
would compute τ (8) = 12(4) + 60Z = 48 + 60Z. Fortunately,
−12 + 60Z = {. . . , −72, −12, 48, 108, . . .} = 48 + 60Z,
so the answer you computed agrees with my answer. But what if someone else wrote
8 = 12r + 20s for entirely different values of r and s? Would their candidate for τ (8)
agree with ours?
That is what we are doing when we check whether or not τ is well-defined. We
are verifying that independent of how we express 8 = 12r + 20s, all of us agree on
what τ (8) should be. After all, the function τ sees only the number 8. It is you, I
and that other person who see the r’s and s’s. Our computations are different, but
if the formula for τ is going to make sense, we had better all agree on the value of
τ (8).
Of course, we can’t just check this for 8; we have to check that we all agree on
the formula for τ (k) whenever k ∈ 4Z. But if two people write k = 12r1 + 20s1 and
k = 12r2 + 20s2 respectively, then 12r1 − 12r2 = 20s2 − 20s1 . That means that
12r1 − 12r2 = 20(s2 − s1 ) ∈ ⟨20⟩ = 20Z,
and in particular, 12r1 − 12r2 is divisible by 5. Clearly it is divisible by 12 also. But
then it is divisible by lcm(5, 12) = 60, so 12r1 − 12r2 ∈ ⟨60⟩, and
τ (k) = 12r1 + ⟨60⟩ = 12r2 + ⟨60⟩.
In other words, while the two people may have written k completely differently, they
agree on what τ (k) is. Thus τ is well-defined.
The computation that shows that τ is a homomorphism is (this humble author
hopes) relatively straight-forward:
τ ((12r1 + 20s1 ) + (12r2 + 20s2 )) = τ (12(r1 + r2 ) + 20(s1 + s2 ))
= 12(r1 + r2 ) + ⟨60⟩
= (12r1 + ⟨60⟩) + (12r2 + ⟨60⟩)
= τ (12r1 + 20s1 ) + τ (12r2 + 20s2 )
and
τ ((12r1 + 20s1 )(12r2 + 20s2 )) = τ (12(12r1 r2 + 20r1 s2 + 20s1 r2 ) + 20(20s1 s2 ))
= 12(12r1 r2 + 20r1 s2 + 20s1 r2 ) + ⟨60⟩
= 144r1 r2 + ⟨60⟩
= (12r1 + ⟨60⟩)(12r2 + ⟨60⟩)
= τ (12r1 + 20s1 ) τ (12r2 + 20s2 ),
where the third equality holds because 240r1 s2 + 240s1 r2 ∈ ⟨60⟩.
To show that τ is onto, we consider an arbitrary element y of 12Z/60Z, so y =
12k + 60Z, where k ∈ {0, 1, 2, 3, 4}. Then 12k = 12k + 0 ∈ 12Z + 20Z = 4Z, so τ acts on
this element and by definition,
τ (12k) = τ (12k + 20(0)) = 12k + 60Z = y,
and τ is onto.
Finally, we compute ker τ . Now x = 12r + 20s ∈ ker τ if τ (x) = 0 + 60Z. But
τ (x) = 12r + 60Z, and so we need 12r ∈ 60Z, which means that r must be a multiple
of 5, i.e. r = 5b for some b ∈ Z. But then x = 12(5b) + 20s = 20(3b + s) ∈ ⟨20⟩; i.e.
x ∈ N . Conversely, if x ∈ ⟨20⟩ = 20Z, then we can write x = 0 + 20s, and by definition,
τ (x) = 0 + ⟨60⟩ = 0 + 60Z. Thus x ∈ ker τ .
We have proven that ker τ = ⟨20⟩. Finally, the First Isomorphism Theorem now
assures us that
4Z/20Z = (M + N )/N ≃ ran τ = M /(M ∩ N ) = 12Z/60Z.
And what, pray tell, is the isomorphism given by the First Isomorphism The-
orem? Let’s call it Φ. Then – keeping in mind that we can always write 4k =
(12(2) + 20(−1))k = 24k − 20k – we find that
Φ ∶ 4Z/20Z → 12Z/60Z
4k + 20Z ↦ 24k + 60Z.
4.6. Examples. The details of the following examples are left to the reader.
(a) Let K be an ideal of the ring R. The map
π ∶ R → R/K
r ↦ r+K
is a homomorphism, called the canonical homomorphism from R onto
R/K. Its kernel is ker π = K.
(b) For example, if K = ⟨m⟩ as an ideal of Z, then the map
π ∶ Z → Z/⟨m⟩
z ↦ z + mZ
is the canonical homomorphism.
4.8. Example.
(a) Let n ∈ N and R = Tn (R). Then
{kIn ∶ k ∈ Z} = {diag(k, k, . . . , k) ∶ k ∈ Z}
is isomorphic to Z.
(b) In Z5 [i], S ∶= {0+0i, 1+0i, 2+0i, 3+0i, 4+0i} is a subring which is isomorphic
to Z5 .
Supplementary Examples.
S4.1. Example. Let R and S be rings. The map ζ ∶ R → S defined by ζ(r) = 0
for all r ∈ R is a homomorphism, called the zero homomorphism.
Needless to say, it’s not the most interesting homomorphism around. Having
said that, one should look at Exercise 4.9 below.
S4.2. Example. ∗∗∗ Let R[x] denote the commutative ring of polynomials in
one variable with real coefficients. Then
R[x]/⟨x² + 1⟩ ≃ C.
S4.4. Example. Let R ∶= (C([0, 1], R), +, ⋅) be the ring of continuous real-
valued functions on [0, 1]. Let x0 ∈ [0, 1], and consider the map
εx0 ∶ C([0, 1], R) → R
f ↦ f (x0 ).
Then for f, g ∈ C([0, 1], R),
εx0 (f + g) = (f + g)(x0 ) = f (x0 ) + g(x0 ) = εx0 (f ) + εx0 (g)
and
εx0 (f g) = (f g)(x0 ) = f (x0 )g(x0 ) = εx0 (f ) εx0 (g),
proving that εx0 is a homomorphism of C([0, 1], R) into R. The kernel of this map
is the ideal
Mx0 ∶= {f ∈ C([0, 1], R) ∶ f (x0 ) = 0},
and it can be shown that if N ⊲ C([0, 1], R) is an ideal and Mx0 ⊆ N ⊆ C([0, 1], R),
then either N = Mx0 or N = C([0, 1], R). In other words, no proper ideal of C([0, 1], R)
properly contains Mx0 . We say that Mx0 is a maximal ideal of C([0, 1], R). Maximal
ideals will prove important below.
S4.6. Example. Let R = Z, M = 24Z and N = 15Z. Note that M and N are
ideals of Z, and
M + N = {m + n ∶ m ∈ 24Z, n ∈ 15Z}.
Note that 3 = 24 ⋅ 2 + 15 ⋅ (−3) ∈ M + N , and so 3z = 24 ⋅ (2z) + 15 ⋅ (−3z) ∈ M + N for
all z ∈ Z. Moreover, any element of M + N is divisible by 3, since each of 24 and 15
are. Hence
M + N = 3Z.
Also, we leave it to the reader to verify that M ∩ N = 120Z. By the Second Isomor-
phism Theorem,
3Z/15Z ≃ 24Z/120Z.
Hopefully this will remind the reader of the well-known equation
3/15 = 24/120,
which our result generalises.
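The isomorphism can again be tested by machine: writing each element of 3Z = M + N as 24r + 15s and recording 24r mod 120 should yield a single well-defined value per coset of 15Z, and the five values together should exhaust 24Z/120Z. A Python sketch with a bounded search (an illustration only, not part of the notes):

```python
def tau_value(r, s):
    """For x = 24r + 15s, the proposed tau(x) = 24r + 120Z, as a residue mod 120."""
    return (24 * r) % 120

# Check well-definedness on each coset representative 3k, k = 0..4:
images = {}
for x in [0, 3, 6, 9, 12]:
    reps = [(r, s) for r in range(-15, 16) for s in range(-15, 16)
            if 24 * r + 15 * s == x]
    images[x] = {tau_value(r, s) for r, s in reps}
print(images)   # each coset gets a single value; together they exhaust 24Z/120Z
```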
Appendix
A4.1. The question of when one has to check whether a function is well-defined
or not is not as difficult as it is sometimes made out to be. Basically, it comes down
to this: one is trying to describe a rule for a set of objects by specifying what that
rule does to one particular element of that set. Without knowing in advance that
the result is independent of which element of the set you chose, how do you know
that your rule even makes sense? To say that a function is “well-defined” is the
statement that the rule that you have for describing that function makes sense.
For example: suppose that to each National Hockey League (NHL) team we
wish to assign a natural number. Here is the “rule”.
Here is another possible “rule” - this time we try to assign a colour to each NHL
hockey team:
Pick an arbitrary player on that team. The colour we assign to
that team is the colour of that player’s right eye.
Does this “rule” make sense?
Again - no! It is entirely possible that one player on the team might have brown
eyes, while another member might have green eyes. The function we have described
to assign a colour to the team is not well-defined.
Undaunted, let’s try one more time. Let’s assign a number to an NHL team
according to the following “rule”.
Pick an arbitrary player of that team. The number we assign
to that team is the number of teammates that person has (not
counting himself).
This is an interesting one. If the team has 23 players, then – and this is the crucial
element – regardless of which player on the team you pick, that player will have
22 teammates. While the set of teammates of player A will differ from the set of
teammates of player B (for example, our definition states that player B is a teammate
of player A, but not of himself), nevertheless, the number of teammates does not
depend upon the particular player we chose. This way of assigning a number to a
team by choosing an individual player from that team and specifying a rule in terms
of something about that particular player makes sense. The function
f ∶ NHL teams → N
team Y ↦ the number of teammates of any chosen player x on team Y
is well-defined.
Let us now argue that the map
φ∶ Z/⟨12⟩ → Z/⟨4⟩
n + ⟨12⟩ ↦ n + ⟨4⟩
is well-defined. Again – in this case, note that 2 + ⟨12⟩ = 26 + ⟨12⟩. The function φ
does not see “2” nor “26”. It only sees the coset
2 + ⟨12⟩ = {. . . , −22, −10, 2, 14, 26, 38, . . .}.
So how does φ work? It says - pick an arbitrary element from this set, say n, and
then assign to the whole coset the value n + ⟨4⟩.
Here’s the beauty of it: if we had picked n = 2, then the result would be
2 + ⟨4⟩ = {. . . , −2, 2, 6, 10, 14, . . .}, whereas if we had picked n = 26, the result
would be 26 + ⟨4⟩ = {. . . , 22, 26, 30, 34, . . .}.
But
{. . . , 22, 26, 30, 34, . . .} = {. . . , −2, 2, 6, 10, 14, . . .}!!!
In fact, no matter which two representatives n1 and n2 of n + ⟨12⟩ we choose,
the fact that n1 + ⟨12⟩ = n2 + ⟨12⟩ implies that n1 − n2 ∈ ⟨12⟩ ⊆ ⟨4⟩, and thus n1 + ⟨4⟩ =
n2 + ⟨4⟩.
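This argument is easy to confirm exhaustively over a window of integers: whenever n1 and n2 differ by a multiple of 12, they differ by a multiple of 4, so the rule n + ⟨12⟩ ↦ n + ⟨4⟩ never disagrees with itself. A Python sketch (an illustration only):

```python
def phi_bar(n):
    """n + <12> |-> n + <4>, computed from a chosen representative n."""
    return n % 4

# Search for a counterexample: n1, n2 in the same coset mod 12 whose
# images mod 4 disagree.  Since <12> is contained in <4>, there is none.
checks = [(n1, n2) for n1 in range(-50, 50) for n2 in range(-50, 50)
          if (n1 - n2) % 12 == 0 and phi_bar(n1) != phi_bar(n2)]
print(checks)   # []: no counterexample, so phi is well-defined on cosets
```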
If you understand this example, you should understand “well-definedness”, and
you are on the path to a better and healthier mathematical lifestyle.
Thus
0 = φ(v) = ∑m i=1 αi φ(wλi ) + ∑nj=1 βj φ(yγj )
= ∑m i=1 αi ⋅ 0 + ∑nj=1 βj yγj
= ∑nj=1 βj yγj .
Since {yγ }γ∈Γ is linearly independent, this means that βj = 0 for all 1 ≤ j ≤ n, and
thus v = ∑mi=1 αi wλi ∈ W. That is, ker φ ⊆ W.
Together, these two containments prove that W = ker φ.
So - the issue is that it is “easier” to be the kernel of a vector space homomor-
phism – any subspace of that vector space will do – than it is to be the kernel of a
ring homomorphism – where you must be an ideal. But in both cases, kernels of
homomorphisms are the key! Cray cray.
A4.3. Let R be a commutative ring, and let K ⊲ R be an ideal of R. The
K-radical of R is the set
radK ∶= {r ∈ R ∶ there exists n ∈ N such that rⁿ ∈ K}.
Caveat. Unfortunately, there are several different notions within rings that are
referred to as radicals of the ring. The one above is but one, and it is not as
important as the Jacobson radical, which we shall see in the Assignments.
Proof. We claim that radK is an ideal of R which contains K. As always, we would
like to apply the Ideal Test to verify that this is the case.
That K ⊆ radK is easy to verify.
Let r, s ∈ radK and choose n, m ∈ N such that rⁿ , sᵐ ∈ K. By the binomial theorem
(which applies since R is commutative),
(r − s)^(m+n) = ∑^(m+n)_(j=0) (−1)ʲ (m+n choose j) r^(m+n−j) sʲ .
For 0 ≤ j ≤ m + n, either j ≤ m, in which case r^(m+n−j) ∈ K, or m + 1 ≤ j ≤ m + n, in
which case sʲ ∈ K. Since K is an ideal, each term in the above sum lies in K, and
thus the sum itself is an element of K.
That is, r − s ∈ radK .
Let r ∈ radK and fix n ∈ N such that rⁿ ∈ K. Let t ∈ R. Then, using the
commutativity of R,
(tr)ⁿ = tⁿ rⁿ ∈ K,
since rⁿ ∈ K and K ⊲ R. Hence rt = tr ∈ radK .
By the Ideal Test, radK is an ideal of R.
◻
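For a concrete instance, take R = Z and K = ⟨12⟩: some power of r lands in ⟨12⟩ exactly when both 2 and 3 divide r, so rad⟨12⟩ = ⟨6⟩. A Python sketch (a bounded power search suffices here, since r² is already divisible by 12 whenever 6 divides r; as always, an illustration only):

```python
def in_radical(r, k=12, max_power=6):
    """Is some power r^n (1 <= n <= max_power) in the ideal <k> of Z?"""
    return any((r ** n) % k == 0 for n in range(1, max_power + 1))

rad = [r for r in range(0, 37) if in_radical(r)]
print(rad)   # [0, 6, 12, 18, 24, 30, 36]: precisely the multiples of 6
```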
EXERCISES FOR CHAPTER 4 91
Exercise 4.1.
(a) Find all subrings of Z.
(b) Find all ideals of Z.
(c) Let m, n ∈ Z be relatively prime. (That is, gcd(m, n) = 1.) Show that
Zm ⊕ Zn is isomorphic to Zmn .
Exercise 4.2.
(a) Find a subring of M3 (C) which is not an ideal.
(b) Find all ideals of M3 (C).
(c) Can you generalise this result to Mn (C)?
Exercise 4.3.
Let R = C([0, 1], R) and let K ∶= {f ∈ C([0, 1], R) ∶ f (x) = 0, x ∈ [1/3, 7/8]}. Prove or
disprove that K is an ideal of R.
Exercise 4.4.
A proper ideal M of a ring R (i.e. an ideal that is not R itself) is said to be
maximal if whenever N ⊲ R and M is a proper subset of N , we have that N = R.
That is, M is not properly contained in any proper ideal of R.
(a) If the set K of R = C([0, 1], R) in Exercise 4.3 is in fact an ideal of R, is it
maximal?
(b) Find a maximal ideal of R.
Exercise 4.5.
Let R be a ring and
Dn (R) = {diag(d1 , d2 , . . . , dn ) ∶ dj ∈ R, 1 ≤ j ≤ n}
denote the ring of all diagonal matrices in Mn (R).
Find all ideals of Dn (R).
Exercise 4.6.
Consider the ideals J = ⟨m⟩ and K = ⟨n⟩ of Z. Find J + K, J ∩ K, and JK.
Exercise 4.8.
Let L be a subring of a ring R. Let r1 , r2 ∈ R. Prove that either r1 + L = r2 + L,
or (r1 + L) ∩ (r2 + L) = ∅.
In other words, the cosets of L partition the ring R.
Another way to understand this is the following. Recall that a relation ∼ on a
set X is said to be an equivalence relation if
● x ∼ x for all x ∈ X;
● if x ∼ y, then y ∼ x; and
● if x ∼ y and y ∼ z, then x ∼ z.
The equivalence class of x ∈ X under this relation is denoted by [x] ∶= {y ∈ X ∶
x ∼ y}.
(a) Prove that in any set X equipped with any equivalence relation ∼, the
equivalence classes partition the set X. That is, X = ∪x∈X [x], and for
x ≠ y ∈ X, either [x] = [y] or [x] ∩ [y] = ∅.
(b) Define a relation ∼ on R via: r ∼ s if r − s ∈ L. Prove that ∼ is an equiva-
lence relation on R.
Exercise 4.9.
Let m, n ∈ N with m < n. Describe all ring homomorphisms from Mn (C) to
Mm (C).
Exercise 4.10.
Let (G, +) be an abelian group. A map φ ∶ G → G is said to be additive if
φ(g+h) = φ(g)+φ(h) for all g, h ∈ G. (In essence, this says that φ is a group homo-
morphism from G into itself. A group homomorphism (resp. ring homomorphism)
from a group (resp. a ring) into itself is referred to as a group endomorphism
(resp. a ring endomorphism).)
We denote by End (G, +) the set of all additive maps on G.
(a) Prove that End (G, +) is a ring under the operations
(φ + ψ)(g) ∶= φ(g) + ψ(g),
and
(φ ∗ ψ)(g) ∶= φ ○ ψ(g) = φ(ψ(g))
for all g ∈ G.
Given a ring R and an element r ∈ R, define the map Lr ∶ R → R by Lr (x) ∶= rx.
In other words, Lr is left multiplication by r.
(b) Let (R, +, ⋅) be a ring and recall that (R, +) is an abelian group. Prove that
the map
λ ∶ (R, +, ⋅) → (End (R, +), +, ∗)
r → Lr
is an injective ring homomorphism.
In some contexts, this is referred to as the left regular representation of the ring
R. Where was commutativity of (R, +) used?
Exercise 4.11. In the same way that we worked our way through the Second Iso-
morphism Theorem in Example 4.5, work your way through the Third Isomorphism
Theorem for the special case where R = Z, M = ⟨10⟩, and N = ⟨50⟩.
CHAPTER 5
Prime ideals, maximal ideals, and fields of quotients
1.3. The notion of a maximal element in a partially ordered set is not the same
as the notion of a maximum element. A ring may have many maximal ideals (or
none - depending upon the ring). The key observation is that a maximal element
need only be bigger than the elements to which it is comparable – a maximum
element must first be comparable to everything in the partially ordered set, and
must then be bigger than everything. We discuss the definition at greater length in
the Appendix to this section.
1. PRIME AND MAXIMAL IDEALS 97
1.4. Examples.
(a) Recall that Z is a principal ideal domain, that is, every ideal K of Z is of
the form K = ⟨k⟩ for some k ∈ Z. In order for K to be a proper ideal, we
cannot have k = 1 nor k = −1. Note also that ⟨k⟩ = ⟨−k⟩, showing that there
is no loss of generality in assuming that k ≥ 0.
● Suppose first that k = 0. If r, s ∈ Z and rs ∈ ⟨0⟩ = {0}, then either
r = 0 ∈ ⟨0⟩ or s = 0 ∈ ⟨0⟩, and so ⟨0⟩ is a prime ideal.
● Suppose next that k = p ∈ N is a prime number. If r, s ∈ R and rs ∈ ⟨p⟩,
then rs = p m for some integer m. Since p divides the right-hand side
of the equation and is prime, it must divide the left-hand side of the
equation, and indeed, it must divide either r or s. By relabelling if
necessary, we may suppose that p divides r, and thus r ∈ ⟨p⟩. This
shows that ⟨p⟩ is a prime ideal.
● Suppose that k = k1 k2 for some 2 ≤ k1 , k2 ∈ N. That is, suppose that k
is not prime. Then with r = k1 , s = k2 , we see that rs = k ∈ ⟨k⟩, but
r, s ∈/ ⟨k⟩.
Together, these show that – other than ⟨0⟩ – an ideal K = ⟨k⟩ is a prime ideal
if and only if k is a prime number (or the negative of a prime number).
(b) Now let us determine the maximal ideals of Z. Again, we have seen that
all ideals of Z are of the form ⟨k⟩ for some 0 ≤ k ∈ Z, so these are the
only ones we need consider, and we can ignore the case where k = 1, since
K = ⟨1⟩ = ⟨−1⟩ = Z is not a proper ideal of Z.
Suppose first that k = 0. Then ⟨0⟩ is contained in any ideal – for
example, ⟨0⟩ ⊆ ⟨2⟩ ≠ Z, showing that ⟨0⟩ is not maximal.
Similarly, if k > 0 is a composite number, say k = k1 k2 where 2 ≤ k1 , k2 <
k, then
⟨k⟩ ⊆ ⟨k1 ⟩,
and these two ideals are distinct since k1 ∈ ⟨k1 ⟩ but k1 ∈/ ⟨k⟩. Hence ⟨k⟩ is
not maximal when k is composite.
Finally, suppose that k is prime. If ⟨k⟩ ⊆ ⟨m⟩ for some m ∈ N, then
k ∈ ⟨m⟩ implies that m divides k. But k prime then implies that either
m = 1 or m = k. If m = 1, then ⟨m⟩ = Z is not a proper ideal, while m = k
implies that ⟨m⟩ = ⟨k⟩.
Thus k prime implies that ⟨k⟩ is not properly contained in a proper
ideal of Z, whence ⟨k⟩ is maximal.
We conclude that (except for the zero ideal which is prime but not
maximal), the maximal and prime ideals of Z coincide.
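The classification in (a) can be probed by a bounded search for witnesses: ⟨k⟩ fails to be prime exactly when some product rs lies in ⟨k⟩ with neither factor in ⟨k⟩. A Python sketch (a search can only exhibit such witnesses, never prove primality, but it agrees with the classification above):

```python
def looks_prime_ideal(k, bound=200):
    """Search for r, s with rs in <k> but r, s not in <k>; such a pair
    witnesses that <k> is not a prime ideal of Z."""
    for r in range(1, bound):
        for s in range(1, bound):
            if (r * s) % k == 0 and r % k != 0 and s % k != 0:
                return False
    return True

results = {k: looks_prime_ideal(k) for k in range(2, 16)}
print(results)   # True exactly for k = 2, 3, 5, 7, 11, 13
```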
(c) Let 2 ≤ n ∈ N. Recall from the Assignments that Mn (R) is simple. That is,
it has no ideals other than the trivial ideals {0} and Mn (R) itself.
The matrix units E11 and E22 lie in Mn (R), and E11 E22 = 0, although
neither of E11 nor E22 lies in {0}. This shows that {0} is not a prime ideal
in Mn (R).
It is, however, maximal. (Why?)
98 5. PRIME IDEALS, MAXIMAL IDEALS, AND FIELDS OF QUOTIENTS
1.5. Remark. Hopefully we have not become so tired that we have stopped
constantly asking ourselves new questions while reading. For example, in Exam-
ple 1.4(a), what was so special about Z that made ⟨0⟩ a prime ideal? Unlike the
case of Tn (C) – where ⟨0⟩ is not a prime ideal – Z has the property that if r, s ∈ Z
and rs = 0, then r = 0 or s = 0.
This should look very familiar. It is one of the defining properties of an in-
tegral domain. Thus, in every integral domain, ⟨0⟩ is a prime ideal. Of course,
integral domains are commutative (and unital) by definition. Can we think of a
non-commutative ring R where ⟨0⟩ is a prime ideal?
Again - this is the kind of question that should come to mind while reading these
notes, and you should always be thinking while reading any mathematical text.
For instance, the ideal K ∶= ⟨x³ + 1⟩ of Z3 [x] is not a prime ideal. To see this, note
that p(x) = x² + 2x + 1 and q(x) = x + 1 lie in Z3 [x] and
p(x)q(x) = x³ + 2x² + x + x² + 2x + 1 = x³ + 3x² + 3x + 1 = x³ + 1 ∈ K,
though neither p(x) nor q(x) lies in K. (This is Exercise 5.2.)
(a) implies (b). First we argue that L is a proper ideal of R. Indeed, since
L is an ideal of R, R/L is a ring. The hypothesis that R/L is an integral
domain means that it is unital, and it is not hard to see that 1 + L must
be the multiplicative identity of R/L. Since 1 + L ≠ 0 + L (in a unital ring,
the multiplicative and additive identities must be different), we see that
1 − 0 = 1 ∈/ L, so L ≠ R.
Suppose that r, s ∈ R and rs ∈ L. Then r + L, s + L ∈ R/L and (r + L)(s +
L) = rs + L = 0 + L. Since R/L is an integral domain (by hypothesis), either
r + L = 0 + L, in which case r ∈ L, or s + L = 0 + L, in which case s ∈ L. This
shows that L is prime.
(b) implies (a). We have seen that R/L is a ring, and (r + L)(s + L) =
(rs) + L = (sr) + L = (s + L)(r + L), where the middle equality holds because
R is commutative. Moreover, 1+L is easily seen to act as the multiplicative
identity of R/L, so that the latter is unital. Thus we need only show that
(r + L)(s + L) = 0 + L implies that either r + L = 0 or s + L = 0. But
(r + L)(s + L) = 0 + L implies that rs + L = 0 + L, which happens if and only
if rs ∈ L. Since L is assumed to be prime, this shows that either r ∈ L (or
equivalently r + L = 0 + L) or s ∈ L (or equivalently s + L = 0 + L). Thus R/L
is an integral domain.
◻
(s + M )(r0 + M ) = (sr0 ) + M
= (sr0 + m) + M
=1+M
= (r0 + M )(s + M ),
M ⊂ N ⊂ R.
R/M = R/R = {0 + R}
is just the zero ring. In particular, it is not unital, and thus R/M is certainly
not a field. Suppose therefore that there exists N ⊲ R as above.
Since N ≠ R, N does not contain any invertible elements, and in par-
ticular, 1 ∈/ N .
Let n ∈ N ∖ M . Then n + M ≠ 0 + M , and for all r ∈ R,
1.10. Example. The converse to this is false. Recall that {0} is a prime ideal
in any integral domain, but it is not maximal in the integral domain Z, for example.
2.3. One of the issues we face in constructing the rational numbers from the
integers as above is the “non-uniqueness” of the representation of a rational number.
As we know all too well,
2/7 = 4/14 = −6/−21,
and indeed, we can find infinitely many ways of describing the same rational number.
If we want to say that they all represent the same rational number, one approach
might be to set up an equivalence relation as follows:
Let X ∶= Z × (Z ∖ {0}) ∶= {(a, b) ∶ a, b ∈ Z, b ≠ 0}. We define a relation ∼ on X via
(a, b) ∼ (c, d)
if ad = bc.
Let us verify that this is indeed an equivalence relation. Since ab = ba, we have
(a, b) ∼ (a, b), so ∼ is reflexive. If (a, b) ∼ (c, d), then ad = bc, whence cb = da and
(c, d) ∼ (a, b); that is, ∼ is symmetric. Finally, if (a, b) ∼ (c, d) and (c, d) ∼ (e, f ),
then ad = bc and cf = de, so that adf = bcf = bde, i.e. d(af − be) = 0. Since d ≠ 0
and Z is an integral domain, af = be, so that (a, b) ∼ (e, f ) and ∼ is transitive.
Proof. Notice that we are dealing with equivalence classes. Already, this should
trigger a Pavlovian effect, and if nothing else, we should start salivating every time
a bell rings.
Moreover, we are defining the operations of addition and multiplication using
representatives of these equivalence classes. This is the mathematical equivalent
2. FROM INTEGRAL DOMAINS TO FIELDS 103
of someone hitting your leg under the kneecap with a rubber mallet, and your leg
should pop forward and trigger the question: are these operations well-defined ? Let’s
check!
Suppose that [(a1 , b1 )] = [(a, b)] and that [(c1 , d1 )] = [(c, d)] in F. Then
(a1 , b1 ) ∼ (a, b), so a1 b = ab1 , and similarly c1 d = cd1 . Thus
(ad + bc)(b1 d1 ) = ab1 dd1 + bd1 b1 c
= a1 bdd1 + bb1 c1 d
= (a1 d1 + b1 c1 )(bd).
In other words,
[(ad + bc, bd)] = [(a1 d1 + b1 c1 , b1 d1 )],
proving that addition is indeed well-defined.
The proof that the multiplication operation above is well-defined is similar and
is left as an exercise for the reader.
From above, for all d ∈ D, [(0, d)] = [(0, 1)] ∶= α is the neutral element of F under
addition. Suppose that [(a, b)] ∈ F with a ≠ 0 ≠ b. Then
[(a, b)][(b, a)] = [(ab, ba)] = [(1, 1)] = [(ba, ab)] = [(b, a)][(a, b)],
and so [(b, a)] = [(a, b)]−1 is the multiplicative inverse of [(a, b)].
At long last, we conclude that (F, +, ⋅) is a field.
◻
2.6. Example. Let us apply this construction with D = Z. The resulting field
F is then F ∶= {[(p, q)] ∶ p, q ∈ Z, q ≠ 0}. We claim that this is isomorphic to Q.
Consider the map
Θ ∶ F → Q
[(p, q)] ↦ p/q.
Once again, we are defining a map on equivalence classes of sets, and since we
are defining the map in terms of a particular choice of representative, the first thing
that we must do is to check that our map Θ is well-defined.
Suppose that [(p, q)] = [(r, s)]. By definition, q ≠ 0 ≠ s are integers, and (p, q) ∼
(r, s), from which we find that ps = qr. But then dividing both sides by qs ≠ 0, we
get
Θ([(p, q)]) = p/q = r/s = Θ([(r, s)]),
proving that Θ is indeed well-defined.
Now
Θ([(p, q)] + [(r, s)]) = Θ([(ps + rq, qs)])
= (ps + rq)/qs
= p/q + r/s
= Θ([(p, q)]) + Θ([(r, s)]),
and
Θ([(p, q)][(r, s)]) = Θ([(pr, qs)])
= pr/qs
= (p/q)(r/s)
= Θ([(p, q)]) Θ([(r, s)]).
Thus Θ is a ring homomorphism.
To see that Θ is injective, it suffices to show that ker Θ = {[(0, 1)]}. But [(p, q)] ∈
ker Θ if and only if Θ([(p, q)]) = p/q = 0, which happens if and only if p = 0. Hence
ker Θ = {[(0, q)] ∶ q ∈ Z, 0 ≠ q} = {[(0, 1)]},
since q ≠ 0 implies that (0, q) ∼ (0, 1). Hence Θ is injective.
Finally, to see that Θ is surjective, and hence an isomorphism of rings, note that
for any p/q ∈ Q with p ∈ Z and 0 ≠ q ∈ Z, we have that
p/q = Θ([(p, q)]).
Since Θ is an isomorphism, we have just constructed (an isomorphic copy of) Q
from Z.
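The entire construction, in the case D = Z, fits comfortably into a few lines of code: pairs (a, b) with b ≠ 0, the relation (a, b) ∼ (c, d) when ad = bc, and the two operations, with each equivalence class represented by its reduced form with positive denominator. A Python sketch (an illustration, not part of the notes):

```python
from math import gcd

def normalize(a, b):
    """Canonical representative of the class [(a, b)]: reduced, denominator > 0."""
    assert b != 0
    if b < 0:
        a, b = -a, -b
    g = gcd(a, b)
    return (a // g, b // g)

def equivalent(p, q):
    (a, b), (c, d) = p, q
    return a * d == b * c            # the defining relation (a, b) ~ (c, d)

def add(p, q):
    (a, b), (c, d) = p, q
    return normalize(a * d + b * c, b * d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return normalize(a * c, b * d)

print(equivalent((2, 7), (4, 14)))   # True: 2*14 == 7*4
print(add((1, 2), (1, 3)))           # (5, 6)
print(mul((2, 7), (7, 2)))           # (1, 1): multiplicative inverses
```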
Of course, the fact that we are able to construct a field using an integral domain
is quite interesting, but if that field were to have no other connection to the integral
domain, then it would be an idle curiosity at best. In much the same way that the
rational numbers contain an isomorphic copy of the integers, we find that the field
F of quotients of an integral domain D always includes an isomorphic copy of D; in
other words – up to a matter of notation – it always includes D.
so that θ is a homomorphism. If θ(a) = [(0, 1)], then [(a, 1)] = [(0, 1)], so (a, 1) ∼
(0, 1). But this happens if and only if a ⋅ 1 = 1 ⋅ 0 = 0, i.e. if a = 0. Thus ker θ = {0},
whence θ is injective.
◻
It is standard to write p(x)/q(x) to denote the equivalence class [(p(x), q(x))]. In
particular, it is understood that
p(x)/q(x) = r(x)/s(x)
if and only if p(x)s(x) = r(x)q(x). Using this notation, the above summation and
multiplication become the familiar
p(x)/q(x) + r(x)/s(x) = (p(x)s(x) + q(x)r(x))/(q(x)s(x))
and
(p(x)/q(x)) (r(x)/s(x)) = p(x)r(x)/(q(x)s(x)).
Supplementary Examples.
S5.1. Example. Let R = C([0, 1], R), equipped with the usual point-wise ad-
dition and multiplication. This ring has an additional structure, namely that it is
a vector space over R. (We shall discuss vector spaces over general fields in a later
Chapter.)
Given x0 ∈ [0, 1], the map
εx0 ∶ C([0, 1], R) → R
f → f (x0 )
is easily seen to be a ring homomorphism. It is also linear and non-zero, since if
g(x) = 1 for all x ∈ [0, 1], then g ∈ C([0, 1], R) and εx0 (g) = 1 ≠ 0. By the First
Isomorphism Theorem,
R = ran εx0 ≃ C([0, 1], R)/ ker εx0 .
That ker εx0 is an ideal follows from Chapter 4. Since the co-dimension of ker εx0
is one, this implies that ker εx0 is a maximal ideal of C([0, 1], R).
Less easy to show is the fact that every maximal ideal of C([0, 1], R) is of the
form ker εx0 for some x0 ∈ [0, 1].
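The homomorphism property of εx0 is easy to sanity-check numerically. The following Python sketch represents elements of C([0, 1], R) as callables; the particular functions f and g, and the point x0 = 0.25, are our own illustrative choices, not taken from the text.

```python
# Elements of C([0,1], R) modelled as Python callables.
def add(f, g):
    return lambda x: f(x) + g(x)       # pointwise addition

def mul(f, g):
    return lambda x: f(x) * g(x)       # pointwise multiplication

def evaluation(x0):
    # the evaluation map eps_{x0}: f -> f(x0)
    return lambda f: f(x0)

f = lambda x: x * x
g = lambda x: 1 - x
eps = evaluation(0.25)

assert eps(add(f, g)) == eps(f) + eps(g)   # eps respects addition
assert eps(mul(f, g)) == eps(f) * eps(g)   # eps respects multiplication
assert eps(lambda x: 1.0) == 1.0           # eps(1) = 1, so eps is non-zero
```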
S5.3. Example. If F is a field, then {0} and F are the only ideals of F, and so
{0} is a maximal ideal.
S5.9. Example. Returning to C([0, 1], R): let K := {f ∈ C([0, 1], R) : f(x) =
0 for all x ∈ [1/3, 2/3]}. We invite the reader to verify that this is an ideal of C([0, 1], R).
Let
g(x) = 0 for x ∈ [0, 1/2], g(x) = x − 1/2 for x ∈ [1/2, 1],
and
h(x) = 1/2 − x for x ∈ [0, 1/2], h(x) = 0 for x ∈ [1/2, 1].
Then g(x)h(x) = 0 ∈ K, but neither g(x) nor h(x) is in K, so K is not prime.
In fact, essentially the same argument shows that if R is a commutative ring which
is not an integral domain, then the zero ideal {0} of R is not a prime ideal.
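A quick grid check of this example in Python (a sanity check on sample points, not a proof; the grid itself is our own choice):

```python
# g and h as in the example above, checked on a grid of 1001 points.
def g(x):
    return 0.0 if x <= 0.5 else x - 0.5

def h(x):
    return 0.5 - x if x <= 0.5 else 0.0

xs = [i / 1000 for i in range(1001)]
assert all(g(x) * h(x) == 0.0 for x in xs)              # gh = 0, hence gh lies in K
assert any(g(x) != 0.0 for x in xs if 1/3 <= x <= 2/3)  # g does not vanish on [1/3, 2/3]
assert any(h(x) != 0.0 for x in xs if 1/3 <= x <= 2/3)  # h does not vanish on [1/3, 2/3]
```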
S5.10. Example. If we are not dealing with unital rings, there need not exist
maximal ideals in general (even if we assume Zorn’s Lemma).
Consider R = (Q, +, ∗), where + is usual addition of rational numbers, but
a ∗ b ∶= 0 for all a, b ∈ Q.
We leave it to the reader to verify that R is indeed a ring.
What should an ideal of R look like? In essence, K ⊲ R requires that k1 − k2 ∈ K for all
k1 , k2 ∈ K, while a ∗ k = k ∗ a = 0 ∈ K is trivially verified. In other words, K must be
a subgroup of Q under addition, and any subgroup of Q under addition is an ideal
of R.
Suppose that M ⊊ Q is a proper subgroup under addition.
As per the first Assignment, there exists a prime number p ∈ N and an integer
N ∈ N such that n ≥ N implies that 1/p^n ∉ M.
Let N = {m + k/p^N : m ∈ M, k ∈ Z}. We claim that N is a subgroup of (Q, +) and
therefore an ideal of R = (Q, +, ∗), and that M ⊊ N ⊊ R, so that M is not a maximal
ideal of R.
● Let a := m1 + k1/p^N, b := m2 + k2/p^N ∈ N. Then
b − a = (m2 + k2/p^N) − (m1 + k1/p^N) = (m2 − m1) + (k2 − k1)/p^N ∈ N.
Thus N is an additive subgroup of Q, and so it is an ideal of R using ∗ as
our multiplication.
● Clearly M ⊂ N and M ≠ N since 1/p^N ∈ N ∖ M.
To see that N ≠ Q, note that 1/p^{N+1} ∉ N. Indeed, suppose otherwise.
Then
1/p^{N+1} = m + k/p^N for some m ∈ M, k ∈ Z,
and so
1/p^N = pm + k/p^{N−1} ∈ M,
a contradiction.
110 5. PRIME IDEALS, MAXIMAL IDEALS, AND FIELDS OF QUOTIENTS
Appendix
A5.1. Earlier in this Chapter, and also in the Exercises to the previous Chapter,
we defined the notion of a maximal ideal M in a ring R, and we also mentioned that
there is a difference between the notions of maximal elements of partially ordered
sets, and of maximum elements of partially ordered sets. Let us now examine this
more closely. We shall begin, as a wise man used to say, at the “big-inning”.
A5.2. Definition. Let ∅ ≠ X be a set. A relation ρ on X is a subset of the
set X × X = {(x, y) : x, y ∈ X} of ordered pairs of elements of X.
Often, we write x ρ y to mean the ordered pair (x, y) ∈ ρ. This is especially true
when dealing, for example, with the usual relation ≤ for real numbers: no one writes
(x, y) ∈≤; we all write x ≤ y. Incidentally, the notation ≤ is used not only for the
relation “less than or equal to” for real numbers; it frequently appears to indicate
a specific kind of relation known as a partial order on an arbitrary set, which we
now define.
A5.3. Example. Let X = Z. We might define a relation ∼ on X by setting
x ∼ y if ∣x − y∣ ≤ 1. Thus 3 ∼ 3, 3 ∼ 4, 4 ∼ 3, and 4 ∼ 5, but (3, 5) ∉ ∼ (i.e. 3 ≁ 5), since
∣3 − 5∣ = 2 > 1.
Note that there need not exist an explicit “formula” to describe ρ. Any subset
of X × X defines a relation.
A5.4. Definition. A relation ≤ on a non-empty set X is said to be a partial
order if, given x, y, and z ∈ X,
(a) x ≤ x;
(b) if x ≤ y and y ≤ x, then x = y; and
(c) if x ≤ y and y ≤ z, then x ≤ z.
We refer to the pair (X, ≤) as a partially ordered set, or more succinctly as a
poset.
A5.5. The prototypical example of a partial order is the partial order of inclu-
sion on the power set P (A) of a non-empty set A. That is, we set P (A) = {B ∶ B ⊆
A} and for B, C ∈ P (A), we set B ⊆ C to mean that every member of B is a member
of C. It is easy to see that ⊆ is a partial order on P (A).
The word partial refers to the fact that given two elements B and C in P (A),
they might not be comparable. For example, if A = {x, y}, B = {x} and C = {y},
then it is not the case that B ⊆ C, nor is it the case that C ⊆ B. Only some subsets
of P (A) are comparable to each other.
In dealing with the natural numbers, we observe that they come equipped with
a partial order which we typically denote by ≤. In this setting, however, any two
natural numbers are comparable. This leads us to the following notion.
A5.6. Definition. If (X, ρ) is a partially ordered set, and if x, y ∈ X implies that
either xρy or yρx, then we say that ρ is a total order on X.
A5.7. Example.
(a) It follows from the above discussion that (N, ≤) is a totally ordered set.
(b) It also follows from the above discussion that if A is a non-empty set with
at least two members, then (P (A), ⊆) is partially ordered, but not totally
ordered by inclusion.
A5.9. The key difference between these two notions is that a maximum element
is comparable to every other element of X, and is bigger than or equal to it, whereas
a maximal element need not be comparable to every other element of the set X –
it just needs to be bigger than or equal to those elements to which it is actually
comparable.
A5.10. Example. Let A = {∅, {x}, {y}, {z}, {x, y}, {x, z}, {y, z}} be the set of
proper subsets of X = {x, y, z}, and partially order A by inclusion ⊆. Then clearly
{x} ⊆ {x, y}.
Observe that {x, y} is a maximal element of A. Nothing is bigger than {x, y} in
A. But {z} ⊆/ {x, y}, and so {x, y} is not a maximum element of A. Note that {x, z}
is also a maximal element. Maximal elements need not be unique. (Are maximum
elements unique when they exist? Think about it!)
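The maximal-versus-maximum distinction in this example can be checked exhaustively. A small Python sketch over the poset A of proper subsets of {x, y, z}, ordered by inclusion:

```python
from itertools import combinations

# The poset A of proper subsets of {x, y, z}, ordered by inclusion.
X = {"x", "y", "z"}
A = [frozenset(c) for r in range(len(X)) for c in combinations(X, r)]

def is_maximal(s):
    # no element of A strictly contains s
    return not any(s < t for t in A)

def is_maximum(s):
    # s contains every element of A
    return all(t <= s for t in A)

maximal = {s for s in A if is_maximal(s)}
assert maximal == {frozenset({"x", "y"}), frozenset({"x", "z"}),
                   frozenset({"y", "z"})}      # three distinct maximal elements
assert not any(is_maximum(s) for s in A)       # but no maximum element
```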
A5.11. Remark. We have not addressed the question of whether or not max-
imal ideals always exist. As it so happens, the existence of maximal ideals in unital
rings is equivalent to Zorn’s Lemma, or equivalently to the Axiom of Choice, both
of which are discussed in the Appendix at the end of these course notes.
A5.12. We have seen that a partial order on a non-empty set is a specific kind
of relation that mimics the notion of inclusion between elements of the power set
P (X) of a set X, or the simple concept of less than or equal to between two real
numbers.
Another important relation tries to mimic the concept of two elements being
equivalent. The name chosen for such a relation couldn’t be better suited.
Hey, this sounds familiar: we’ve seen this kind of thing before, haven’t we? (Yes!
is the correct rhetorical answer to this not-so rhetorical question.) In the Appendix
Exercise 5.1.
Let 2 ≤ n ∈ N.
(a) Find all maximal ideals of Tn (C).
(b) Find all prime ideals of Tn (C).
Exercise 5.2.
Prove that p(x) = x^2 + 2x + 1 does not lie in the ideal ⟨x^3 + 1⟩ of Z3[x].
Exercise 5.3.
Let D be an integral domain. Let ∼ be the equivalence relation defined on
X ∶= D × (D ∖ {0}) in Proposition 2.4 above, and let F ∶= {[(a, b)] ∶ (a, b) ∈ X}.
Prove that the multiplication operation on F given by
[(a, b)][(c, d)] ∶= [(ac, bd)]
is well-defined.
Exercise 5.4.
With the hypotheses and notation from Theorem 2.5 above,
(a) Show that conditions (M1), (M2), (M3) and (M4) of Definition 2.1.2 are
satisfied by F.
(b) Prove that [(1, 1)] serves as a multiplicative identity for F.
Exercise 5.5.
Let R be a ring, and let M and N be prime ideals of R. Either prove that M ∩N
is a prime ideal of R, or find a counterexample to this statement if it is false.
Exercise 5.6.
Let R be a ring, and let M and N be maximal ideals of R. Either prove that
M ∩ N is a maximal ideal of R, or find a counterexample to this statement if it is
false.
Exercise 5.7.
Let R and S be rings and φ ∶ R → S be a homomorphism.
(a) Suppose that P ⊲ S is a prime ideal. Either prove that φ−1 (P ) ∶= {x ∈ R ∶
φ(x) ∈ P } is a prime ideal of R, or find a counterexample to this statement
if it is false.
(b) Suppose that M ⊲ S is a maximal ideal. Either prove that φ−1(M) := {x ∈
R : φ(x) ∈ M} is a maximal ideal of R, or find a counterexample to this
statement if it is false.
(c) Let K ⊲ R be an ideal. Prove that φ(K) need not be an ideal of S, but
that φ(K) is an ideal of S if φ is surjective.
EXERCISES FOR CHAPTER 5 115
Exercise 5.8.
Let R be a unital ring.
(a) Let K ≠ R be an ideal of R. Prove that there exists a maximal ideal M
such that K ⊆ M .
(b) Let r ∈ R be non-invertible. Prove that there exists a maximal ideal M
such that r ∈ M .
(c) Suppose that R has a unique maximal ideal. Prove that the set N of
non-invertible elements of R forms an ideal in R.
Exercise 5.9.
Let R be a finite, commutative and unital ring. Prove that every prime ideal is
maximal.
Exercise 5.10.
Let R be a commutative unital ring. Suppose that every proper ideal of R is a
prime ideal. Prove that R is a field.
CHAPTER 6
Euclidean Domains
Don’t ever wrestle with a pig. You’ll both get dirty, but the pig will
enjoy it.
Cale Yarborough
1. Euclidean Domains
1.1. Algebraists study algebraic structures. That is neither as deep nor as shal-
low as it may sound. Analysts study algebraic structures that admit some sort of
topological structure as well. The rational numbers are a perfect example of what
we are trying to get at here.
From the point of view of algebra, the rational numbers are a perfectly legitimate
object of study. Heck, they form a field, and fields – from an algebraic standpoint –
are swell. (That is not a technical term.) Of course, in some (algebraic) ways, the
rational numbers are lacking: as a field, Q is not algebraically closed. This can
be a severe handicap. But all in all, they are still really, really swell.
From the point of view of your average Analyst, the rational numbers are severely
lacking not only because they are algebraically incomplete, but more significantly
because they are not (analytically) complete; that is, it is possible to find Cauchy
sequences of rational numbers whose limits are not rational numbers. As a conse-
quence, basic analytical properties such as the Least Upper Bound property and
the Intermediate Value Theorem fail abjectly when dealing with the rationals as
the base field, and this suggests to the Analyst that the real numbers (or, for the
more daring and debonair – the complex numbers) are the field with which it is
preferable to work. Limits, one of the many “only true love”’s of the Analyst, tend
to be of little or no concern to Algebraists. Since they are interested in alternative,
more algebraic properties, the failure of the Intermediate Value Theorem is neither
here nor there nor anywhere else to them.
This is not a bad nor a good thing. It is just a thing.
many) books that we have consulted. What’s more – these are typically the only
three that appear in those references. See the Appendix for one more example -
so-called Dedekind domains.1
This is not to say that they are not important – they, and the techniques we shall
develop to study them, will be exceedingly important in the study of irreducibil-
ity/reducibility of polynomials in the coming chapters.
1.4. Example. Item (a) in the definition of a valuation should look eerily famil-
iar. In particular, it should look very much like a property enjoyed by the integers
referred to as the division algorithm. More precisely, the division algorithm for
Z says that if a and b are integers, with b ≠ 0, then there exist integers q and r such
that
a = bq + r,
where 0 ≤ r < ∣b∣.
If we set µ(d) = ∣d∣ for all d ∈ Z ∖ {0}, then clearly conditions (a) and (b) of
Definition 1.2 hold. Since Z is an integral domain, we have come to the not-so-
surprising conclusion that (Z, +, ⋅) is a Euclidean domain.
Observe that the conclusion is not-so-surprising in large part because the defi-
nition of a Euclidean domain is meant to capture exactly this property of Z.
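The division algorithm for Z is easy to realise in code. A small Python sketch (our own implementation; note that Python's built-in divmod returns a remainder with the sign of the divisor, so a minor adjustment is needed to guarantee 0 ≤ r < |b|):

```python
def div_alg(a, b):
    # Division algorithm in Z: returns (q, r) with a = b*q + r and 0 <= r < |b|.
    # Python's divmod gives a remainder with the sign of b, so adjust if needed.
    q, r = divmod(a, b)
    if r < 0:
        q, r = q + 1, r - b
    return q, r

assert div_alg(17, 5) == (3, 2)      # 17 = 5*3 + 2
assert div_alg(-17, 5) == (-4, 3)    # -17 = 5*(-4) + 3
assert div_alg(17, -5) == (-3, 2)    # 17 = (-5)*(-3) + 2
```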
1.5. The second most common and important example of a Euclidean domain is
that of a polynomial ring with coefficients in a field. In light of condition (a) above,
it behooves us (admit it – how many times have you seen the word “behoove” in
a mathematical text, eh?) to prove some version of the division algorithm in this
setting.
1As it turns out, our plea for more examples (from an earlier version of these notes) was
answered by N. Banks. We refer the reader to the Appendix for a brief discussion of these.
Consider
q(x)g(x) + r(x) = u(x)g(x) + s(x),
or equivalently,
(u(x) − q(x)) g(x) = r(x) − s(x).
But from condition (ii) applied to each of r(x) and s(x) we find that either
s(x) − r(x) = 0 or deg (s(x) − r(x)) < deg (g(x)).
On the other hand, if u(x) − q(x) ≠ 0, then deg ((u(x) − q(x)) g(x)) ≥
deg (g(x)).
From this it follows that we must have u(x) − q(x) = 0, i.e. u(x) = q(x).
But then r(x) − s(x) = 0 as well, i.e. r(x) = s(x), proving uniqueness.
1.7. Example. Let f(x) = 5x^4 + 3x^3 + 1 and g(x) = 3x^2 + 2x + 1 ∈ Z7[x].
To divide f(x) by g(x), we proceed with long division the way that we would
with integers. Thus, to keep track of the “x^2 place” and the “x^1 place”, we write
in all of the terms of f(x), including those with coefficient zero.
(The reader should not look at us with that expression: the natural number 1004 =
1 ⋅ 10^3 + 0 ⋅ 10^2 + 0 ⋅ 10^1 + 4 ⋅ 1. You’ve seen this kind of thing before.)
One next has to figure out how many 3x^2’s “fit” into 5x^4, and this requires us
to divide 5 by 3 in Z7. Since 5 = 3 ⋅ 4 in Z7, we shall require 4g(x) to begin with.
Performing that calculation yields the following.
                     4x^2
3x^2 + 2x + 1 )  5x^4  +3x^3  +0x^2  +0x  +1
Next, we must decide how many 3x^2’s “fit” into 2x^3, and again – this requires us
to divide 2 by 3 in Z7. Since 2 = 3 ⋅ 3 in Z7, we continue the above long division with
3x. Observe that we implicitly use the calculation that 3 − 6 = 0 − 3 = 4 in Z7.
1. EUCLIDEAN DOMAINS 121
                     4x^2  +3x
3x^2 + 2x + 1 )  5x^4  +3x^3  +0x^2  +0x  +1
                               4x^2  +4x  +1
Finally, we must decide how many 3x^2’s “fit” into 4x^2, and again – this requires
us to divide 4 by 3 in Z7. Since 4 = 3 ⋅ 6 in Z7, we finish the above long division with
6. Again, observe that we implicitly use the calculation that 4 − 5 = 6 and 1 − 6 = 2
in Z7.
                     4x^2  +3x  +6
3x^2 + 2x + 1 )  5x^4  +3x^3  +0x^2  +0x  +1
                               4x^2  +4x  +1
                               4x^2  +5x  +6
                                      6x  +2
Our conclusion is that
5x^4 + 3x^3 + 1 = (4x^2 + 3x + 6)(3x^2 + 2x + 1) + (6x + 2),
i.e.
f(x) = (4x^2 + 3x + 6)g(x) + (6x + 2),
and so q(x) = 4x^2 + 3x + 6 and r(x) = 6x + 2 in this example.
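The long division above can be automated. The following Python sketch implements polynomial division with remainder over Z_p (our own implementation, not from the text; coefficients are listed lowest degree first):

```python
def polydiv_mod(f, g, p):
    # Division with remainder in Z_p[x]: returns (q, r) with f = q*g + r
    # and r = 0 or deg(r) < deg(g).  Coefficient lists, lowest degree first.
    g = [c % p for c in g]
    while g and g[-1] == 0:         # strip trailing zero coefficients
        g.pop()
    inv = pow(g[-1], -1, p)         # inverse of the leading coefficient in Z_p
    r = [c % p for c in f]
    q = [0] * max(len(r) - len(g) + 1, 1)
    while True:
        while r and r[-1] == 0:     # strip the remainder's trailing zeros
            r.pop()
        if len(r) < len(g):
            return q, r
        shift = len(r) - len(g)
        c = (r[-1] * inv) % p       # next quotient coefficient
        q[shift] = c
        for i, gc in enumerate(g):
            r[i + shift] = (r[i + shift] - c * gc) % p

# Example 1.7: divide 5x^4 + 3x^3 + 1 by 3x^2 + 2x + 1 in Z_7[x]
q, r = polydiv_mod([1, 0, 0, 3, 5], [1, 2, 3], 7)
assert q == [6, 3, 4]   # q(x) = 4x^2 + 3x + 6
assert r == [2, 6]      # r(x) = 6x + 2
```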
1.8. A simple reason why we consider the division algorithm for polynomial
rings with coefficients in a field (as opposed to polynomial rings with coefficients in,
say, an integral domain) is that we must be able to divide the leading coefficient
of the numerator f(x) by the leading coefficient of the divisor g(x). Since g(x) =
bm x^m + bm−1 x^{m−1} + ⋯ + b0 ≠ 0, the leading coefficient bm ∈ F is non-zero, and so
given any f(x) = an x^n + an−1 x^{n−1} + ⋯ + a0, we know that we can divide an by bm.
The same applies to the remainder term of f(x)/g(x), so long as the degree of that
remainder term is at least that of g(x).
By way of contrast, consider the integral domain R = Z. If f(x) = 7x^5 + 1 and
g(x) = 2x^2, then for any q(x) ∈ Z[x],
deg (f(x) − q(x)g(x)) ≥ 5 = deg f(x),
since no choice of q(x) ∈ Z[x] can make the coefficient of x^5 equal to zero.
Proof.
(a) It is an immediate consequence of the division algorithm for polynomial
rings, Theorem 1.6 that there exist polynomials q(x), r(x) ∈ F[x] such that
f (x) = q(x)g(x) + r(x),
where r(x) = 0 or δ(r(x)) = deg(r(x)) < deg(g(x)) = δ(g(x)), and thus
condition (a) of Definition 1.2 is satisfied.
(b) If 0 ≠ f (x), g(x) ∈ F[x], then
δ(f (x)) ∶= deg(f (x))
≤ deg(f (x)) + deg(g(x))
= deg(f (x) g(x))
= δ(f (x)g(x)).
Thus δ is a valuation on F[x].
◻
2.5. Examples.
(a) 7 and −7 are both greatest common divisors of 35 and −49 in Z. This
highlights the fact that even if a greatest common divisor exists, it need
not be unique.
(b) Let R be a commutative ring, 0 ≠ a, b ∈ R, and d be a gcd of a and b.
Suppose that x ∈ R is invertible.
Now d ∣ a implies that a = dm for some m ∈ R, so a = (dx)x−1 m,
implying that dx ∣ a. Similarly, dx ∣ b, showing that dx is a common divisor
of a and b.
Moreover, if c ∣ a and c ∣ b, then c ∣ d, so d = cn for some n ∈ R, implying
that dx = cnx, and thus c ∣ dx. Hence dx is a gcd of a and b as well.
Applying this to the example above, we saw that 7 is a gcd of 35 and
−49 in Z. Since −1 ∈ Z is invertible (with inverse equal to itself), we see
that −7 is also a gcd of 35 and −49. We have simply generalised this to
other rings which may or may not have many more invertible elements.
The greatest common divisor d of two non-zero integers a and b can always be
expressed as a Z-linear combination of a and b. This notion carries over to Euclidean
domains as well: if d is a gcd of two non-zero elements a and b of a Euclidean
domain E, then there exist r, s ∈ E such that
d = ra + sb.
The above proof, while “short and sweet”, has the disadvantage that it does not
tell us how to find a gcd, even when one exists. The process of finding the gcd(a, b)
of two elements a, b of a Euclidean domain E is called the Euclidean algorithm,
and it will be left as a homework assignment question. See also Exercise 6.7 below.
3. UNIQUE FACTORISATION DOMAINS 125
Then f (x) = (x−1)2 and g(x) = (x−1)(x+2), and we recall from Example 2.5 (d)
above that d(x) = (x − 1) is a greatest common divisor of f (x) and g(x).
Note that with r(x) = −1/3 and s(x) = 1/3 ∈ Q[x],
r(x)f(x) + s(x)g(x) = (1/3) (−f(x) + g(x))
= (1/3) (−(x − 1)(x − 1) + (x − 1)(x + 2))
= (1/3) (x − 1) (−(x − 1) + x + 2) = (x − 1) = d(x).
This example was perhaps a little “too easy”. We’ll examine a more interesting
example in the homework assignment questions, once we develop the Euclidean
algorithm.
3.3. Examples.
(a) It is not hard to see that the only invertible elements of Z are 1 and −1;
the same is true of Z[x]. Thus the only associates of, say, q(x) = 12x^4 +
3x^2 − 2x + 1 are q(x) itself and −q(x) = −12x^4 − 3x^2 + 2x − 1.
(b) Consider D = Z, the integers.
● The invertible elements of Z are exactly {−1, 1}.
● An element 0 ≠ r ∈ Z is reducible if we can write it as r = uv where
neither u nor v is invertible. Since r ≠ 0, clearly neither u nor v can
be equal to zero. Since they do not belong to {−1, 1} either, we see
that 0 ≠ r ∈ Z is reducible exactly when it is either a composite natural
number, or the negative of a composite natural number.
To be irreducible, therefore, r must either be a prime natural number,
or the negative of a prime natural number.
● We leave it to the reader to verify that 0 ≠ r ∈ Z is “ring” prime (i.e.
prime in the sense of Definition 3.2 (b)) if and only if it is a prime
natural number, or the negative of a prime natural number. Observe,
therefore, that in Z, the notions of prime elements and irreducible
elements coincide. That prime elements are always irreducible is not
unique to Z – see Proposition 3.4 below.
● Finally, two elements a and b of Z are associates if and only if b ∈
{a, −a}.
(c) The invertible elements of Q[x] are the non-zero scalar polynomials. Thus
if q(x) ∈ Q[x], the associates of q(x) are {r0 q(x) : 0 ≠ r0 ∈ Q}. We shall
have much more to say about irreducible and prime elements of Q[x] later.
◻
To prove that the converse is false requires more work. We begin by introducing
the following definition.
3.6. Examples.
(a) Again - we turn to Z for inspiration. Here we may define ν(d) = ∣d∣ for all
d ∈ Z. Clearly this is a multiplicative norm on Z.
(b) Let Z[i] := {a + bi : a, b ∈ Z} denote the Gaussian integers. We may define
a multiplicative norm on Z[i] via
ν(a + bi) := ∣a + bi∣^2 = a^2 + b^2.
That this is indeed a multiplicative norm is left as an exercise for the reader.
(c) Let D = Z[√−5] := {a + b√−5 : a, b ∈ Z}. Then
ν(a + b√−5) := a^2 + 5b^2
defines a multiplicative norm on D. Again, the verification of this (as well
as the verification of the fact that D is an integral domain) is left to the
reader.
Proof.
(a) First note that ν(1) = ν(1 ⋅ 1) = ν(1)ν(1) implies that ν(1) ∈ {0, 1}. But
1 ≠ 0 implies that ν(1) ≠ 0, so ν(1) = 1.
Now suppose that u ∈ D is invertible. Then
1 = ν(1) = ν(u ⋅ u−1 ) = ν(u) ν(u−1 ).
Since ν(u), ν(u−1 ) ∈ N, it follows that ν(u) = ν(u−1 ) = 1.
(b) Suppose next that ν(x) = 1 implies that x is invertible, and let y ∈ D be an
element such that p ∶= ν(y) ∈ N is prime.
If y = uv for some u, v ∈ D, then
p = ν(y) = ν(uv) = ν(u)ν(v),
Let r := 1 + 2√−5. Then ν(r) = 1 + 5(2^2) = 21. We shall prove that r is irreducible,
but not prime in D.
r = 1 + 2√−5 is irreducible:
Suppose that r = uv for some u, v ∈ D. Then 21 = ν(r) = ν(uv) = ν(u)ν(v), and
thus (by relabelling u and v if necessary), we may assume that either ν(u) = 1 and
ν(v) = 21, or that ν(u) = 3 and ν(v) = 7.
Writing u = u1 + u2√−5, we see that ν(u) ∈ {1, 3} implies that u1^2 + 5u2^2 ∈ {1, 3},
whence u2 = 0 and u = u1 ∈ {−1, 1}. Thus u is invertible, proving that r is irreducible.
r = 1 + 2√−5 is not prime:
Note that
r(1 − 2√−5) = (1 + 2√−5)(1 − 2√−5) = 1 − 4(−5) = 21 = 3 ⋅ 7.
Clearly r divides the left-hand side of this equation. If it were prime, then it would
have to divide 3 or 7.
Suppose 3 = rx for some x ∈ D. Then
9 = ν(3) = ν(r)ν(x) = 21ν(x),
which is impossible with ν(x) ∈ N. Similarly, if 7 = rx for some x ∈ D, then
49 = ν(7) = ν(r)ν(x) = 21ν(x),
which is impossible with ν(x) ∈ N. Thus r = 1 + 2√−5 is not prime.
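The norm computations in this example are easily checked by machine. A Python sketch, representing a + b√−5 as the pair (a, b) (our own encoding, not from the text):

```python
def nrm(a, b):
    # the multiplicative norm nu(a + b*sqrt(-5)) = a^2 + 5b^2
    return a * a + 5 * b * b

def mul(u, v):
    # multiplication in D = Z[sqrt(-5)], elements encoded as pairs (a, b)
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)

r = (1, 2)                          # r = 1 + 2*sqrt(-5)
assert nrm(*r) == 21

# No element of D has norm 3 or 7 (any candidate must have |a| <= 2, |b| <= 1),
# so a factorisation 21 = nu(u)*nu(v) with nu(u) = 3 and nu(v) = 7 is impossible.
small_norms = {nrm(a, b) for a in range(-3, 4) for b in range(-2, 3)}
assert 3 not in small_norms and 7 not in small_norms

# r * conjugate(r) = 21 = 3 * 7, yet r divides neither 3 nor 7
assert mul(r, (1, -2)) == (21, 0)
```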
⟨a⟩ = ⟨b⟩
When dealing with principal ideal domains, the distinction between irreducibility
and primeness disappears, and we obtain one more equivalent property.
3.14. Example. Note that every field is a ufd, since F admits no non-zero,
non-invertible elements. This is cheap, but true.
Our next goal is to prove that every pid is a ufd. We shall use the very important
concept of Noetherian rings, alluded to at the beginning of this section, to do this.
3.16. Examples.
(a) Let n ∈ N and R = Tn (R). If K ⊲ Tn (R) and A := [ai,j] ∈ K, then for
each 1 ≤ i ≤ j ≤ n, Li,j := Ei,i A Ej,j ∈ K. Thus we see that K is spanned by
the (non-zero) matrix units it contains. Since we only have finitely many
matrix units to choose from, Tn (R) can have at most finitely many ideals.
In particular, any increasing chain of ideals must stabilise and so Tn (R) is
Noetherian.
We shall determine all of the ideals of Tn (R) in the Assignments.
(b) Let n ∈ N and R = Mn (R). Then, as seen in the Assignments, R is simple;
that is, the only ideals of R are {0} and R itself. Clearly any ascending
chain must stabilise.
Recall that an element of a pid is irreducible if and only if it is prime. Thus each
qj is prime, 1 ≤ j ≤ m. Since q1 ∣ p1 p2 ⋯pn , there exists k ≥ 1 such that q1 ∣ pk . By
reordering the terms p1 , p2 , . . . , pn (principal ideal domains are commutative, after
all!), we may assume that k = 1. Thus q1 ∣ p1 , and so there exists u1 ∈ D such that
p1 = q1 u1 . But p1 is irreducible, and thus either q1 is invertible, or u1 is. But q1 is
irreducible, so it is non-invertible. Hence u1 is invertible.
That is,
d = p1 p2 ⋯pn = q1 u1 p2 ⋯pn = q1 q2 ⋯qm .
Since every pid is an integral domain, the cancellation law holds, and so
(u1^{−1}q2) q3 ⋯ qm = p2 p3 ⋯ pn.
Note that u1^{−1}q2 is irreducible (hence prime), being the associate of an irreducible
element in D.
Since (u1^{−1}q2) ∣ p2 p3 ⋯ pn, there exists 2 ≤ k ≤ n such that (u1^{−1}q2) ∣ pk. By
reordering the pj’s if necessary, we may assume that k = 2, i.e. that (u1^{−1}q2) ∣ p2.
Write p2 = (u1^{−1}q2) u2 for some u2 ∈ D. Since p2 is irreducible, either (u1^{−1}q2)
is invertible, or u2 is. But (u1^{−1}q2) is irreducible, so it is non-invertible. Hence u2 is
invertible.
Thus
(u1^{−1}q2) q3 ⋯ qm = p2 p3 ⋯ pn = (u1^{−1}q2) u2 p3 p4 ⋯ pn.
Again, we may apply the cancellation law to conclude that
q3 q4 ⋯ qm = u2 p3 p4 ⋯pn ,
or equivalently, that
(u2^{−1}q3) q4 ⋯ qm = p3 p4 ⋯ pn.
Proceeding in this manner, we see that after m steps (and potentially m reorderings
of the pj terms), we may assume that p1 = u1 q1 and pj = (uj−1)^{−1} uj qj for 2 ≤ j ≤ m. At
that stage, we have
q1 q2 ⋯ qm = (u1 q1)(u1^{−1}u2 q2)(u2^{−1}u3 q3) ⋯ (um−1^{−1}um qm) pm+1 pm+2 ⋯ pn
= um (q1 q2 ⋯ qm) pm+1 pm+2 ⋯ pn.
Applying the cancellation law yet again yields
1 = um pm+1 pm+2 ⋯ pn .
If n > m, then this contradicts the fact that each pj, m + 1 ≤ j ≤ n, is irreducible, hence
non-invertible. Thus n = m, and each pair (qj , pj ) consists of associates, as required.
◻
The next result is familiar - but it’s certainly nice to have a formal proof of this
fact.
Supplementary Examples.
S6.1. Example. Most sources list at most three, and often just two, examples
of Euclidean domains. The first two are invariably the ring Z of integers and poly-
nomial rings with coefficients in a field, which we have dutifully mentioned above.
Let us now consider the Gaussian integers, which are typically mentioned third, if at
all.
Recall that the Gaussian integers are the set
Z[i] := {a + bi : a, b ∈ Z},
where i^2 = −1. We think of these as a unital (and necessarily commutative) subring
of C. Note that C is a field, and thus it is an integral domain. If a + bi ≠ 0 ≠ c + di
are non-zero Gaussian integers, then they are also non-zero complex numbers, so
(a + bi)(c + di) = (ac − bd) + (ad + bc)i ≠ 0.
As with Z, we may define a Euclidean valuation on Z[i] ∖ {0} through the map
µ(a + bi) ∶= ∣a + bi∣2 = a2 + b2 .
That µ satisfies condition (b) of a Euclidean valuation is left as an exercise for
the reader to verify. Let us verify condition (a).
Let a = w + xi and b = y + zi ∈ Z[i]. We must find q, r ∈ Z[i] such that a = qb + r,
where either r = 0 or µ(r) < µ(b). We consider two cases.
● Suppose first that b = k for some 0 < k ∈ N. It is not hard to see that we
can write
w = u1 k + v1, x = u2 k + v2,
where max(∣v1∣, ∣v2∣) ≤ k/2. Thus
a = w + xi = (u1 k + v1) + (u2 k + v2)i = (u1 + u2 i)k + (v1 + v2 i) = qk + r,
where q = (u1 + u2 i) and r = (v1 + v2 i). Note that if r ≠ 0, then
µ(r) = µ(v1 + v2 i) = v1^2 + v2^2 ≤ (k/2)^2 + (k/2)^2 < k^2 = µ(k) = µ(b).
● Now suppose that b = y + zi ≠ 0. Let k = b̄b = (y − zi)(y + zi), and observe
that 0 < y^2 + z^2 = k ∈ N. Let us now apply the above argument to ab̄ and k
to see that we may write
ab̄ = qk + r0,
where r0 = 0 or µ(r0) < µ(k).
We pause to observe that for all s + ti ∈ Z[i], µ(s + ti) = s^2 + t^2 = ∣s + ti∣^2,
the square of the absolute value of the complex number s + ti, and thus
µ((m + ni)(s + ti)) = ∣(m + ni)(s + ti)∣^2 = ∣m + ni∣^2 ∣s + ti∣^2 = µ(m + ni) µ(s + ti).
Note that r0 = ab̄ − qb̄b = b̄(a − qb), so that µ(b̄) µ(a − qb) = µ(r0) < µ(k) = µ(b̄) µ(b).
Since µ(b̄) = µ(b) = y^2 + z^2 ≠ 0 (recall that b ≠ 0), we may divide both sides
of this inequality by the positive integer µ(b̄) to obtain
µ(a − qb) < µ(b).
Let r = a − qb. Then µ(r) < µ(b) and
a = qb + r,
as required to prove that µ is a valuation on Z[i], and completing the proof
that Z[i] is a Euclidean domain.
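The proof above suggests a concrete procedure: divide in C and round each coordinate of the exact quotient to the nearest integer. A Python sketch of division with remainder in Z[i] along these lines (our own implementation, using Python's built-in complex numbers; equivalent in spirit to, though streamlined from, the two-case argument above):

```python
def mu(z):
    # mu(s + ti) = s^2 + t^2, the square of the absolute value
    return int(round(z.real)) ** 2 + int(round(z.imag)) ** 2

def gauss_divmod(a, b):
    # Division with remainder in Z[i]: returns (q, r) with a = q*b + r and
    # mu(r) < mu(b).  Rounding each coordinate of a/b to the nearest integer
    # leaves an error of at most 1/2 per coordinate, so
    # mu(r) <= (1/4 + 1/4) * mu(b) < mu(b).
    t = a / b
    q = complex(round(t.real), round(t.imag))
    return q, a - q * b

q, r = gauss_divmod(7 + 2j, 3 - 1j)
assert q == 2 + 1j and r == 1j          # 7 + 2i = (2 + i)(3 - i) + i
assert (7 + 2j) == q * (3 - 1j) + r
assert mu(r) < mu(3 - 1j)
```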
S6.2. Example. In Exercise 6.6, you are asked to prove that if E is a Euclidean
domain with Euclidean valuation µ, then
(a) µ(1) = min{µ(d) : 0 ≠ d ∈ E}, and
(b) x ∈ E is invertible if and only if µ(x) = µ(1).
Note that if F is a field and δ = deg(⋅) is the usual valuation on F[x], then
δ(1) = deg(1) = 0
is the minimum possible degree for any non-zero element of F[x]. It is achieved at
all p(x) = p0 ≠ 0.
In Z with valuation µ(n) = ∣n∣, we see that µ(1) = 1 is the minimum possible
value of µ (why?), achieved at n = 1 and n = −1.
S6.3. Example. Consider the Euclidean domain E ∶= Q[x], equipped with the
valuation δ(q(x)) ∶= deg(q(x)), q(x) ∈ Q[x].
Let’s see how to use the division algorithm to find the greatest common divisor
of f(x) = x^4 + x^3 − x − 1 and g(x) = x^3 + 1.
By definition, we may find q^(1)(x), r^(1)(x) ∈ E such that
f(x) = q^(1)(x) g(x) + r^(1)(x),
and either r^(1)(x) = 0 or deg(r^(1)(x)) < deg(g(x)). This is precisely the division
algorithm: the computation
              x  +1
x^3 + 1 )  x^4  +x^3  +0x^2  −x  −1
           x^4               +x
                 x^3  +0x^2  −2x  −1
                 x^3              +1
                             −2x  −2
shows that f(x) = (x + 1)(x^3 + 1) + (−2x − 2), so we set q^(1)(x) = x + 1 and r^(1)(x) =
(−2x − 2). Anything that divides both f(x) and g(x) must divide f(x) − q^(1)(x)g(x) =
r^(1)(x), so we look for common divisors of g(x) and r^(1)(x).
The advantage of doing this is that we have reduced the problem to the case
where the maximum degree of the two polynomials whose greatest common divisor
we are hunting has been reduced from deg(f (x)) = 4 to deg(g(x)) = 3. If we continue
doing this process, it must eventually stop, since we cannot have a degree that is
negative!
Consider
                −(1/2)x^2  +(1/2)x  −(1/2)
−2x − 2 )  x^3  +0x^2  +0x  +1
           x^3  +x^2
                −x^2        +1
                −x^2  −x
                        x  +1
                        x  +1
                            0
Indeed,
f(x) = x^4 + x^3 − x − 1 = (x + 1)(x − 1)(x^2 + x + 1),
while
g(x) = x^3 + 1 = (x + 1)(x^2 − x + 1).
We leave it to the reader to verify that (x^2 + x + 1) and (x^2 − x + 1) have no common
factors.
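The two divisions above are exactly the steps of the Euclidean algorithm in Q[x]. A Python sketch using exact rational arithmetic (our own implementation; coefficients are listed lowest degree first, and the gcd is normalised to be monic):

```python
from fractions import Fraction as F

def strip(p):
    # remove trailing zero coefficients in place
    while p and p[-1] == 0:
        p.pop()
    return p

def pdivmod(f, g):
    # Division with remainder in Q[x]; coefficients lowest degree first.
    f, g = strip([F(c) for c in f]), strip([F(c) for c in g])
    q = [F(0)] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        c = f[-1] / g[-1]           # next quotient coefficient
        d = len(f) - len(g)
        q[d] = c
        for i, gc in enumerate(g):
            f[i + d] -= c * gc
        strip(f)
    return q, f

def pgcd(f, g):
    # Euclidean algorithm in Q[x]; returns the monic gcd
    f, g = strip([F(c) for c in f]), strip([F(c) for c in g])
    while g:
        f, g = g, pdivmod(f, g)[1]
    return [c / f[-1] for c in f]

# f(x) = x^4 + x^3 - x - 1,  g(x) = x^3 + 1
assert pdivmod([-1, -1, 0, 1, 1], [1, 0, 0, 1])[1] == [-2, -2]  # remainder -2x - 2
assert pgcd([-1, -1, 0, 1, 1], [1, 0, 0, 1]) == [1, 1]          # gcd = x + 1
```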
obtained there shows that (x + 1), (x − 1), (x^2 + x + 1) and (x^2 − x + 1) must all divide
the least common multiple. Thus
h(x) := lcm(f(x), g(x)) = (x + 1)(x − 1)(x^2 + x + 1)(x^2 − x + 1) = x^6 − 1.
S6.5. Example. We know that Z is a pid, and that ⟨17⟩ is a prime ideal of Z.
Then
Z/⟨17⟩ ≃ Z17
is a field, as we have seen earlier.
S6.8. Example. Suppose that R and S are rings and that R is Noetherian. If
φ ∶ R → S is a homomorphism, then φ(R) is Noetherian as well.
Indeed, by the First Isomorphism Theorem,
φ(R) ≃ R/ker φ,
and we saw in the previous example that R/K is Noetherian for any ideal K of R.
S6.9. Example. Let D be a pid, and let K = ⟨b⟩ ⊲ D be an ideal of D. Then
D/K is a pid as well.
Again, we remark that any ideal of D/K is of the form J/K, where J ⊲ D is an
ideal which contains K. Since J = ⟨d⟩ for some d ∈ D, we see that J/K = ⟨d + K⟩ ⊲
D/K.
S6.10. Example. Those of you with a good background in Complex Anal-
ysis may be interested to learn that the ring
H(D) := {f : D → C ∣ f is holomorphic}
(where D denotes the open unit disc) is not Artinian, since the ideals
Kn := {f ∈ H(D) : f has a zero of order at least n at 0}, n ≥ 1,
are descending and all distinct.
Appendix
A6.1. Although we won’t prove it, it can be shown that if D is a ufd, then so
is D[x]. Having said this, we have proven that if D is a pid, then D is a ufd;
applying this to the pid Z, the result just quoted then shows that Z[x] is a ufd.
● use the fact that every pid is a ufd to note that the decomposition of
non-zero, non-invertible elements of Z into products of primes is essen-
tially unique (up to the order of the terms and replacing primes by their
associates). Since ν(a) = ν(b) when a and b are associates, and since
ν(a)ν(b) = ν(b)ν(a) because multiplication in N ∪ {0} is commutative, this
means that ν(n) is well-defined by the above formula.
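The construction just described is easy to realise in code. A Python sketch of the multiplicative norm determined by a choice of values f(p) on the primes (the dictionary f below is an arbitrary, hypothetical choice of our own, and must cover every prime factor of the inputs):

```python
def nu(n, f):
    # The multiplicative norm on Z determined by its values f[p] on primes.
    # f is a dict mapping primes to positive integers.
    if n == 0:
        return 0
    n, value = abs(n), 1
    p = 2
    while n > 1:                  # trial division: multiply in f[p] per factor
        while n % p == 0:
            value *= f[p]
            n //= p
        p += 1
    return value

f = {2: 3, 3: 5, 5: 2, 7: 11}
assert nu(12, f) == 3 * 3 * 5              # 12 = 2^2 * 3
assert nu(-10, f) == nu(10, f)             # associates receive the same norm
assert nu(6, f) * nu(35, f) == nu(210, f)  # multiplicativity
assert nu(1, f) == nu(-1, f) == 1          # units have norm 1
```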
This allows us to show that there exist uncountably many multiplicative
norms on Z; indeed, if we let P := {2, 3, 5, 7, 11, . . .} denote the set of prime natural
numbers, then each function f : P → N determines a multiplicative norm via
νf(p) := f(p) for all p ∈ P,
and the correspondence is bijective. How many such functions are there? Precisely
∣N∣^∣P∣, where ∣X∣ denotes the cardinality of a set X. In our case,
∣N∣^∣P∣ = (ℵ0)^ℵ0 = 2^ℵ0 = c,
the cardinality of the continuum. Of course, c = ∣R∣, so there are as many
multiplicative norms on Z as there are points in R, in the sense that there exists a
bijection between these two sets. Finding that bijection is another matter, however.
(We point out that these cardinality arguments are just culture. Don’t worry if
you haven’t seen them before.)
A6.4. As mentioned above, we can extend this analysis to more general pid’s.
If D is a pid, then D is a ufd. To define a multiplicative norm ν on D, it suffices to
choose a positive integer mp for every irreducible (equivalently every prime) element
of D, and then to set
● ν(0) = 0;
● ν(u) = 1 whenever u ∈ D is invertible, and
● ν(p) ∶= mp .
The value of ν(d) for an arbitrary non-zero, non-invertible element of D is then
entirely determined by its essentially unique factorisation as a product of primes.
The moral of the story is that pids are good.
A6.5. Don’t look at these notes that way. You’ve seen this phenomenon before.
Recall that if V and W are vector spaces over R, and if B ∶= {bλ }λ∈Λ is a basis for
V, then to define a linear map T ∶ V → W it suffices to choose any wλ ∈ W, λ ∈ Λ
and to set each T bλ ∶= wλ .
If v ∈ V is arbitrary, then there exist m ≥ 1, real numbers r1 , r2 , . . . , rm and
indices λ1 , λ2 , . . . , λm such that
v = r1 bλ1 + r2 bλ2 + ⋯ + rm bλm .
If T is to be linear, then we must have
T v = T (r1 bλ1 + r2 bλ2 + ⋯ + rm bλm )
= r1 T bλ1 + r2 T bλ2 + ⋯ + rm T bλm
= r1 wλ1 + r2 wλ2 + ⋯ + rm wλm .
APPENDIX 143
It is not that these situations are identical: but there are definite parallels, and
understanding one allows us to better understand the other. This is what makes
mathematics interesting – not just the attention that we get from rich and attractive
people at political pyjama parties.
A6.6. In an earlier version of this text, I put out a plea for more examples
of Euclidean domains. After consulting a number of elementary abstract algebra
textbooks, I was struck by the fact that virtually every such textbook always referred
to the three examples we mention in Section 6.1, and to no other examples.
Mr. Nic Banks answered my plea for more examples. A discrete valuation
ring (a.k.a. dvr) D may be defined as a pid with precisely one non-zero prime
ideal. By Exercise 6.11 below, this is the same as asking that it has precisely one
non-zero maximal ideal. (The expression local pid is also used.) It can be shown
that every dvr is a Euclidean domain. As explained to me by Mr. Banks, the
motivating examples for dvr’s come from number theory and algebraic geometry.
In particular, “dvr’s have very simple ideal factorisations ... and ideal factorisations
are essential in problems like Fermat’s Last Theorem”.
The identity of this ring is e ∶= (1 + ⟨p⟩, 1 + ⟨p2 ⟩, 1 + ⟨p3 ⟩, . . .), and the map
τ ∶ Z → lim← Z/pn Z
z ↦ (z + ⟨p⟩, z + ⟨p2 ⟩, z + ⟨p3 ⟩, . . .)
is an injective ring homomorphism. Thus Zp has characteristic 0, and contains (an
isomorphic copy of) Z as a subring.
● If 0 ≠ a, b ∈ D, then
µ(a) ∶= min{ν(ay) ∶ 0 ≠ y ∈ D}
≤ min{ν(aby) ∶ 0 ≠ y ∈ D}
= µ(ab).
● Now let a, b ∈ D with b ≠ 0. Choose 0 ≠ y0 ∈ D such that ν(by0 ) = µ(b) =
min{ν(by) ∶ 0 ≠ y ∈ D}. Since b, y0 are both non-zero and D is an integral domain,
by0 ≠ 0.
Now, choose q, r ∈ D such that
a = q(by0 ) + r,
where either r = 0 or ν(r) < ν(by0 ).
Suppose that r ≠ 0. Then ν(r) < ν(by0 ) = µ(b). Moreover,
µ(r) ∶= min{ν(ry) ∶ 0 ≠ y ∈ D} ≤ ν(r ⋅ 1) = ν(r),
and so
µ(r) ≤ ν(r) < µ(b).
Thus µ satisfies both conditions from A6.7, and so µ is a valuation on D,
as claimed.
Thus if ν satisfies only condition (a) of a valuation, then it induces a valuation
µ on D by setting µ(x) = min{ν(xy) ∶ 0 ≠ y ∈ D} for each 0 ≠ x ∈ D. Fascinating.
146 6. EUCLIDEAN DOMAINS
Exercise 6.1.
Divide f (x) = 3x4 + 2x3 + x + 2 by g(x) = x2 + 4 in Z5 [x].
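For readers who wish to check their hand computation, here is a rough Python transcription of long division in Zp [x]. The coefficient-list convention (constant term first) and the helper name are ours:

```python
def poly_divmod(f, g, p):
    """Return (q, r) with f = q*g + r in Z_p[x] and deg r < deg g.
    Assumes the leading coefficient of g is non-zero mod p."""
    f = [c % p for c in f]
    g = [c % p for c in g]
    inv_lead = pow(g[-1], -1, p)          # leading coefficient is invertible
    q = [0] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while len(r) >= len(g) and any(r):
        shift = len(r) - len(g)
        c = (r[-1] * inv_lead) % p
        q[shift] = c
        for i, gc in enumerate(g):        # subtract c * x^shift * g(x)
            r[shift + i] = (r[shift + i] - c * gc) % p
        while r and r[-1] == 0:           # drop trailing zero coefficients
            r.pop()
    return q, r

# f(x) = 3x^4 + 2x^3 + x + 2 and g(x) = x^2 + 4 in Z_5[x]
q, r = poly_divmod([2, 1, 0, 2, 3], [4, 0, 1], 5)
assert (q, r) == ([3, 2, 3], [0, 3])   # q(x) = 3x^2 + 2x + 3, r(x) = 3x
```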
Exercise 6.2.
Prove that p(x) = x2 + 2x + 1 does not lie in the ideal ⟨x3 + 1⟩ of Z3 [x].
Exercise 6.3.
Let D be a Euclidean domain, a, b, c ∈ D ∖ {0}, and suppose that d is a gcd of
a and b. Show that if a ∣ (bc), then
(a/d) ∣ c.
Exercise 6.4.
(a) Prove Fermat’s Theorem on sums of squares: Let p ∈ N be an odd
prime. Then p = a2 + b2 for some a, b ∈ N if and only if p ≡ 1 mod 4.
(b) Conclude that if p ∈ N is prime and p ≡ 3 mod 4, then p is a prime element
in the ring Z[i] of Gaussian integers.
Exercise 6.5.
Verify that the map µ ∶ Z[i] ∖ {0} → N defined by µ(a + bi) = a2 + b2 satisfies
µ((a + bi) (c + di)) = µ(a + bi) µ(c + di),
which was left as an exercise when establishing the fact that µ is a Euclidean valu-
ation on the set Z[i] of Gaussian integers.
Exercise 6.6.
Let E be a Euclidean domain with Euclidean valuation µ. Prove that
(a) µ(1) = min{µ(d) ∶ 0 ≠ d ∈ E}, and
(b) x ∈ E is invertible if and only if µ(x) = µ(1).
Exercise 6.8.
Let E be a Euclidean domain with Euclidean valuation µ. Prove that if b ∈ E
is non-zero and not invertible, then for any 0 ≠ a ∈ E,
µ(a) < µ(ab).
Exercise 6.9.
Let
f (x) = x10 − 3x9 + 3x8 − 11x7 − 11x5 + 19x4 − 13x3 + 8x2 − 9x + 3
and
g(x) = x6 − 3x5 + 3x4 − 9x3 + 5x2 − 5x + 2
be elements of E ∶= Q[x]. Find the greatest common divisor d(x) of f (x) and g(x).
Exercise 6.10.
Let E be a Euclidean domain with Euclidean valuation µ. In Example 1.3, we
saw that if ν(d) ∶= 2^µ(d) , d ∈ E, then ν is again a Euclidean valuation on E.
Observe that ν = f ○ µ, where f (x) = 2^x , x ≥ 0.
Do there exist other functions g ∶ [0, ∞) → [0, ∞) such that φg ∶= g ○ µ is a
Euclidean valuation on E whenever µ is? Can we characterise such functions g?
(We do not yet have an answer to this ourselves, although it is entirely possible that
such an answer is known.)
Exercise 6.11.
Let D be a pid. Prove that the following conditions are equivalent.
(a) D admits a unique non-zero prime ideal.
(b) D admits a unique non-zero maximal ideal.
CHAPTER 7
1.6. Theorem. Let F be a field and 0 ≠ f (x) ∈ F[x]. The number of roots of
f (x), counted according to multiplicity, is at most deg f (x).
Proof. We shall argue by induction on the degree of f (x). To that end, let n ∶=
deg(f (x)).
If n = 0, then f (x) = a0 ≠ 0, and so f (x) has no roots; i.e. it has zero roots. This
establishes the first step of the induction argument.
Now suppose that N = deg(f (x)) ∈ N and that the result holds for all polyno-
mials h(x) of degree less than N . That is, suppose that if deg(h(x)) < N , then the
number of roots of h(x), counted according to multiplicity, is at most deg (h(x)).
1. DIVISIBILITY IN POLYNOMIAL RINGS OVER A FIELD 151
If f (x) has no roots, then clearly we are done. Otherwise, let α ∈ F be a root of
f (x) of multiplicity k ≥ 1. By Lemma 1.5, we may write
f (x) = (x − α)k g(x)
for some 0 ≠ g(x) ∈ F[x], and g(α) ≠ 0. In particular, this shows that N =
deg (f (x)) ≥ k, and from above,
deg (g(x)) = N − k < N.
By our induction hypothesis, the number of roots of g(x) counted according to
multiplicity is at most deg(g(x)). But β ∈ F is a root of f (x) if and only if β = α or
β is a root of g(x). (In fact, since g(α) ≠ 0, these form disjoint sets, but we don’t
even need this.)
Hence the number of roots of f (x), counted according to multiplicity, is at most
k + deg (g(x)) = k + (N − k) = N = deg (f (x)),
completing the induction step and the proof.
◻
1.8. In the example above, we see that R is a commutative, unital ring, but it
is not an integral domain. On the other hand, although we shall not require it here,
it is not hard to see that Theorem 1.6 extends to polynomials over an integral domain.
Suppose that D is an integral domain, and that 0 ≠ f (x) ∈ D[x]. As seen in
Chapter 5 (more specifically, see Theorem 5.2.7), we can think of D as being a
subring of its field F of quotients, and thus we may also think of f (x) as an element
of F[x]. By Theorem 1.6 above, f (x) has at most deg (f (x)) roots in F. But any
root of f (x) in D (i.e. any element β ∈ D such that f (β) = 0) is automatically a
root of f (x) in F, and so f (x) has at most deg (f (x)) roots in D.
1.9. Example.
● Let n ∈ N and let f (x) = xn −1 ∈ C[x]. If ω ∶= e2πi/n , then ω, ω 2 , . . . , ω n−1 , ω n =
1 are n distinct roots of f (x). By Theorem 1.6, these are the only roots of
f (x) in C.
● Observe that f (x) = x4 − 1 only has two roots in R. Indeed,
f (x) = x4 − 1 = (x2 − 1)(x2 + 1).
Now g(x) = x2 − 1 has two real roots, namely a1 = 1 and a2 = −1, but
h(x) = x2 + 1 has no real roots.
1.10. Theorem. Let F be a field. Then F[x] is a ufd. As such, if f (x) ∈ F[x]
is a polynomial of degree at least one, then
(a) f (x) can be factored in F[x] as a product of irreducible polynomials, and
(b) except for the order of the terms and for non-zero scalar terms (which
correspond to the invertible elements of F[x]), the factorisation is unique.
1.11. Remark. Recall also from Theorem 6.3.10 that an element of F[x] is
prime if and only if it is irreducible. Thus every non-invertible element of F[x]
can be factored as a product of finitely many prime elements of F[x], in the same
(essentially) unique way.
1.12. Proposition. Let F be a field, and let q(x), u(x), v(x) ∈ F[x]. Suppose
that q(x) is irreducible and that
q(x) ∣ u(x)v(x).
Then, either q(x) ∣ u(x) or q(x) ∣ v(x).
Proof. By Remark 1.11, q(x) is prime in F[x]. The result now follows from this.
◻
1.13. Example.
(a) The polynomial f (x) = x4 − 1 ∈ R[x] can be factored as a product of irreducible
polynomials over R as
f (x) = (x2 − 1)(x2 + 1) = (x − 1)(x + 1)(x2 + 1).
The significance of item (b) in Theorem 1.10 is that we can also factor f (x)
as
f (x) = (3(x2 + 1))(π(x + 1))((1/(3π))(x − 1)),
and each of these terms is also irreducible.
On the other hand, the three terms in the second factorisation are
merely associates of the three terms in the first factorisation, written in a
different order.
(b) Note that although g(x) = x2 + 1 is irreducible in R[x], it is reducible in
C[x]. Indeed, there we can factor
f (x) = x4 − 1 = (x − 1)(x + 1)(x − i)(x + i).
Those last two factors do not lie in R[x].
1.14. Culture. Although we shall not prove it here, the field C has a won-
derfully useful property. Every polynomial f (x) ∈ C[x] can be factored into linear
terms. Stated another way, every complex polynomial of degree n ∈ N has exactly n
roots, counted according to multiplicity.
A field that enjoys this property is said to be algebraically closed. Thus C
is algebraically closed. Since f (x) = x2 + 1 can not be factored into linear terms in
R[x], R is not algebraically closed.
While it might not seem like much, this is actually the reason why people who
study operator theory (a version of linear algebra where the linear maps act on
normed vector spaces) tend to use C as opposed to R for their base field. The
eigenvalues of a matrix T ∈ Mn (R) can be found by considering the characteristic
polynomial of T . If this polynomial has no real roots, then T has no eigenvalues
whatsoever. But for any Y ∈ Mn (C), the characteristic polynomial always factors into
linear terms, and so Y has exactly n eigenvalues, counted according to multiplicity.
The usefulness of the eigenvalues of a matrix (and of their generalisation to operators,
namely spectrum) cannot be overstated.
Hold on, that’s a bit hyperbolic. What we mean by that is that eigenvalues are
really important. That is, they are really important to the study of matrices, and to
those disciplines (I’m talking to you, Physics) that rely heavily upon matrix theory.
If you are alone in a cage with a starving lion, a new hair-do and your wits, you
will generally be excused for not having the eigenvalues of a matrix be one of the first
things that spring to your mind, and your estimation of their general usefulness will
undoubtedly not be of the impossible to overstate variety. But if you are invited to
a pyjama party at Angela Merkel’s house, well, the sky’s the limit.
154 7. FACTORISATION IN POLYNOMIAL RINGS
2.4. Examples.
(a) For example, the polynomial f (x) = x2 − 2 factors as
f (x) = x2 − 2 = (x − √2)(x + √2) over R,
but it is irreducible over Q. If it were reducible over Q, we could write
f (x) = g(x)h(x),
2. REDUCIBILITY IN POLYNOMIAL RINGS OVER A FIELD 155
where g(x), h(x) ∈ Q[x] would necessarily each have degree one. Thus
g(x) = a1 x + a0 and h(x) = b1 x + b0 , with ai , bi ∈ Q and a1 ≠ 0 ≠ b1 . But then
f (−a0 /a1 ) = g(−a0 /a1 )h(−a0 /a1 ) = 0,
contradicting the fact that f (x) does not have any rational roots.
Similarly, the polynomial g(x) = x2 + 1 is irreducible as a polynomial in
R[x], but it is reducible over C, since g(x) = (x − i)(x + i) in C[x].
(b) Using Proposition 2.5 below, it is not hard to see that the polynomial
f (x) = 9x2 + 3 is irreducible over Q. On the other hand, the fact that
g(x) = 3 and h(x) = 3x2 + 1 are both non-invertible in Z[x] implies that
f (x) = g(x)h(x)
is in fact reducible over Z.
2.5. Proposition. Let F be a field and f (x) ∈ F[x]. Suppose that deg(f (x)) ∈
{2, 3}. The following statements are equivalent.
(a) f (x) is reducible in F[x].
(b) f (x) has a root in F.
Proof.
(a) implies (b). Suppose that f (x) is reducible over F. Then we may write
f (x) = g(x)h(x) for some g(x), h(x) ∈ F[x]. Since deg (f (x)) = deg(g(x))+
deg(h(x)) (why?), it follows that one of g(x) and h(x) must have degree
equal to 1. By relabelling if necessary, we may assume that it is g(x). But
then
g(x) = a1 x + a0 ∈ F[x]
with a1 ≠ 0 has a root in F, namely α = −a0 /a1 . Hence α is a root of f (x)
in F as well.
(b) implies (a). Suppose that α ∈ F is a root of multiplicity 1 ≤ k ≤ deg(f (x))
for f (x). By Corollary 1.4, x − α divides f (x), and so
f (x) = (x − α)g(x)
for some g(x) ∈ F[x] of degree deg (g(x)) = deg (f (x)) − 1 ∈ {1, 2}. By
Remark 2.3, f (x) is reducible over F.
◻
2.8. Proposition. Let p ∈ N be a prime number and suppose that q(x) ∈ Zp [x]
is an irreducible polynomial of degree n over Zp . Then
G ∶= Zp [x]/⟨q(x)⟩
is a field with pn elements.
Proof. By Theorem 2.7, it is clear that G is a field. The elements of G are of the
form
f (x) + ⟨q(x)⟩,
where f (x) ∈ Zp [x] is any polynomial. By the division algorithm, we can always
write f (x) = h(x)q(x) + r(x), where r(x) = 0 or 0 ≤ deg(r(x)) < deg (q(x)) = n.
That is,
f (x) + ⟨q(x)⟩ = r(x) + ⟨q(x)⟩.
Note that
r(x) = rn−1 xn−1 + rn−2 xn−2 + ⋯ + r1 x + r0
for some rj ∈ Zp , 0 ≤ j ≤ n − 1, which means that we have at most pn distinct cosets in
G. On the other hand, given any two distinct choices of r(x) and s(x) = sn−1 xn−1 +
sn−2 xn−2 + ⋯ + s1 x + s0 , the fact that deg(s(x) − r(x)) < n and s(x) − r(x) ≠ 0 implies
that the cosets are different.
This shows that we have exactly pn cosets in G. In other words, G is a field with
exactly pn elements, as claimed.
◻
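The counting in Proposition 2.8 can be checked mechanically in the smallest interesting case. The sketch below is our own; the reduction rule x2 = x + 1 comes from q(x) = x2 + x + 1 over Z2, and it builds the field with 4 elements:

```python
from itertools import product

p, n = 2, 2
# Remainders mod q(x) have degree < n; list them as coefficient tuples (a0, a1).
representatives = list(product(range(p), repeat=n))
assert len(representatives) == p ** n   # exactly p^n = 4 cosets

def mul_mod_q(a, b):
    """Multiply (a0 + a1 x)(b0 + b1 x) and reduce mod x^2 + x + 1 over Z_2,
    using x^2 = -(x + 1) = x + 1 in Z_2."""
    a0, a1 = a
    b0, b1 = b
    c0, c1, c2 = a0 * b0, a0 * b1 + a1 * b0, a1 * b1
    return ((c0 + c2) % p, (c1 + c2) % p)

# Every non-zero element has a multiplicative inverse: the quotient is a field.
nonzero = [x for x in representatives if x != (0, 0)]
for a in nonzero:
    assert any(mul_mod_q(a, b) == (1, 0) for b in nonzero)
```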
3.3. Examples.
(a) Let q(x) = 10x2 + 4x + 18. Then content(q(x)) = 2.
(b) Let r(x) = 10x2 + 7x + 18. Then content(r(x)) = 1, and r(x) is primitive.
(c) Let s(x) = −5. Then content(s(x)) = 5.
3.4. Note. We remark that many authors define the content of a non-zero poly-
nomial q(x) ∈ Z[x] without the absolute value sign. The only difference is that this
allows both 2 and −2 to be the content of q(x) in the example above; and we would
have to modify our notion of primitive to include the case where
gcd(qn , qn−1 , . . . , q1 , q0 ) = −1.
We are simply trying to be pragmatic.
The notion of content can be extended to polynomials over a general integral
domain D. In that setting, the notion of an absolute value makes no sense, and we
have to agree that the content of a non-zero polynomial over D should be the set of
all greatest common divisors of its coefficients.
3.5. The proof of Gauss’ Lemma below is extremely clever – almost too clever,
but it has the advantage that it is relatively easy to follow. It relies on the following
simple but effective observation.
3.6. Gauss’ Lemma. Suppose that 0 ≠ q(x), r(x) ∈ Z[x] are primitive. Then
q(x)r(x) is also primitive.
Proof. Suppose that γ ∶= content(q(x)r(x)). Then γ ∈ N, and our goal is to
prove that γ = 1. The “trick” is to realise that if γ ≠ 1, then there exists a prime
number p which divides γ, and thus p divides each coefficient of q(x)r(x).
Now the magic happens. Write q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 and r(x) =
rm xm + rm−1 xm−1 + ⋯ + r1 x + r0 , where n = deg(q(x)) and m = deg(r(x)). Then
q̄(x) ∶= q̄n xn + q̄n−1 xn−1 + ⋯ + q̄1 x + q̄0
and
r̄(x) ∶= r̄m xm + r̄m−1 xm−1 + ⋯ + r̄1 x + r̄0
lie in Zp [x], the ring of polynomials over the field Zp , where each q̄j ∶= qj mod p ∈ Zp ,
0 ≤ j ≤ n (and a similar statement holds for each r̄j ).
The fact that q(x) is primitive, i.e. that content(q(x)) = 1, implies that not
every coefficient of q(x) is divisible by p, and hence q̄(x) ≠ 0 in Zp [x]. Similarly,
the fact that r(x) is primitive implies that r̄(x) ≠ 0 in Zp [x]. Now Zp is a field and
hence an integral domain, and therefore Zp [x] is also an integral domain. But then
q̄(x)r̄(x) ≠ 0
in Zp [x]; note that q̄(x)r̄(x) is precisely the reduction of q(x)r(x) modulo p.
If p had divided γ, then this reduction would have been 0, since each coefficient of
q(x)r(x) would have been divisible by p!
Thus γ is not divisible by any prime number, and so γ = 1, i.e. q(x)r(x) is
primitive.
◻
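A numerical spot-check of Gauss' Lemma, using the definition of content as a positive gcd of the coefficients; the Python helpers are our own:

```python
from math import gcd
from functools import reduce

def content(coeffs):
    """Content of a non-zero integer polynomial (coefficients in any order)."""
    return reduce(gcd, (abs(c) for c in coeffs))

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

q = [18, 7, 10]   # 10x^2 + 7x + 18, primitive
r = [3, -2, 5]    # 5x^2 - 2x + 3, primitive
assert content(q) == content(r) == 1
assert content(poly_mul(q, r)) == 1   # the product is primitive too
```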
3.7. Remark. Let F and E be fields, and suppose that F ⊆ E. Suppose further-
more that f (x) ∈ F[x] is reducible over F. It is then easy to see that f (x) is also
reducible over E. For if we can write
f (x) = g(x)h(x)
where g(x), h(x) ∈ F[x] each have degree at least one, then it is clear that g(x), h(x) ∈
E[x] as well, proving that f (x) is reducible over E.
If f (x) is irreducible over F, then unfortunately no universal conclusions are
possible. For example, if we set p(x) = x2 + 1, then p(x) ∈ Q[x], Q ⊆ R ⊆ C, and p(x)
is irreducible over both Q and R, but p(x) = (x + i)(x − i) is reducible over C.
3.8. Theorem. Let q(x) ∈ Z[x]. If q(x) is reducible over Q, then q(x) is
reducible over Z. Moreover, if
q(x) = g(x)h(x)
in Q[x] with deg (g(x)), deg (h(x)) < deg (q(x)), then it is possible to factor
q(x) = g1 (x)h1 (x)
in Z[x] with deg (g1 (x)) = deg (g(x)) and deg (h1 (x)) = deg (h(x)).
Proof.
Suppose that q(x) is reducible over Q, and choose g(x), h(x) ∈ Q[x] such that
q(x) = g(x)h(x) with deg g(x), deg h(x) ≥ 1. We shall prove that q(x) is reducible
over Z by proving that there exist polynomials g0 (x), h0 (x) ∈ Z[x] such that
● deg (g0 (x)) = deg (g(x)),
● deg (h0 (x)) = deg (h(x)), and
● q(x) = g0 (x)h0 (x).
Case One. Suppose that q(x) is primitive.
Write g(x) = an xn +an−1 xn−1 +⋯+a1 x1 +a0 , h(x) = bm xm +bm−1 xm−1 +⋯+b1 x+b0 ,
where an ≠ 0 ≠ bm .
Set κ = min{k ∈ N ∶ k g(x) ∈ Z[x]} and λ ∶= min{k ∈ N ∶ k h(x) ∈ Z[x]}. (Note
that if we write the coefficients of g(x) as rational numbers each of whose numerator
and denominator are relatively prime, then κ is just the least common multiple of
those denominators. A similar remark holds for h(x).)
Now
κλ q(x) = (κg(x))(λh(x)).
Write κg(x) = γκg g0 (x) and λh(x) = γλh h0 (x) where γκg ∶= content (κg(x)) and
γλh ∶= content (λh(x)). Thus g0 and h0 are primitive elements of Z[x], and
κλ q(x) = γκg γλh (g0 (x)h0 (x)).
Since g0 (x) and h0 (x) are primitive, it follows from Gauss’s Lemma 3.6 that g0 (x)h0 (x)
is also primitive. Hence
κλ = content (κλq(x))
= content (γκg γλh (g0 (x)h0 (x)))
= γκg γλh .
3. FACTORISATION OF ELEMENTS OF Z[x] OVER Q 161
But then (keeping in mind that Z is an integral domain and so we have cancellation
there!)
q(x) = g0 (x)h0 (x)
factors in Z[x] as well, since
deg(g0 (x)) = deg(g(x)) ≥ 1 and deg(h0 (x)) = deg(h(x)) ≥ 1.
Case Two. If q(x) is not primitive, then we can write q(x) = γq q0 (x), where
q0 (x) ∈ Z[x] is primitive, and note that
q0 (x) = (g1 (x))(h(x)),
where g1 (x) = γq−1 g(x) ∈ Q[x] has the same degree as g(x). By Case One above,
q0 (x) factors over Z, say q0 (x) = r(x)s(x) with r(x), s(x) ∈ Z[x] having degree at
least one, and thus q(x) = (γq r(x))s(x) factors over Z as well.
◻
Suppose that q(x) ∈ Z[x] has degree 4 and is reducible over Q (and hence over Z).
A moment's thought shows that we can find polynomials g(x) and h(x) ∈ Z[x] such that q(x) = g(x)h(x)
and either:
● deg(g(x)) = 1, deg(h(x)) = 3, or
● deg(g(x)) = 2 = deg(h(x)).
3.11. We next introduce the first of two irreducibility tests. Recall that the
map
∆ ∶ Z[x] → Zp [x]
r(x) ↦ r̄(x)
defined in paragraph 3.5 is a homomorphism.
3.13. Remarks.
(a) The converse of this statement is false! Consider q(x) = x4 + 1. It can be
shown that q(x) is irreducible over Q, but its reduction x4 + 1 is reducible in Zp [x]
for all primes p.
The proof of this relies on group theory and will be omitted.
(b) It is tempting to ask, and we shall allow ourselves to give into the tempta-
tion, whether the irreducibility of q(x) over Zp in the Mod-p Test implies
that q(x) is irreducible not only over Q, but also over Z.
Alas, this fails. Consider the polynomial p(x) = 2x2 + 4 ∈ Z[x]. Its reduction
p̄(x) = 2x2 + 4 ∈ Z5 [x] is a polynomial of degree 2, and thus is reducible
over Z5 if and only if it has a root in Z5 . But
p̄(0) = 4, p̄(1) = 1, p̄(2) = 2, p̄(3) = 2, p̄(4) = 1,
showing that p̄(x) has no roots in Z5 , and is therefore irreducible over Z5 .
By the Mod-p Test, p(x) is irreducible over Q. On the other hand, we can
factor p(x) = g(x)h(x) as a product of non-invertible elements by setting
g(x) = 2 and h(x) = x + 2 ∈ Z[x], proving that p(x) is reducible over Z.
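The root searches appearing in these remarks are easy to mechanise. Here is a small helper of our own, applicable whenever Proposition 2.5 reduces reducibility over Zp to root-finding (degree 2 or 3):

```python
def has_root_mod_p(coeffs, p):
    """coeffs: constant term first.  True iff the reduction of the
    polynomial mod p has a root in Z_p."""
    def value(a):
        return sum(c * pow(a, i, p) for i, c in enumerate(coeffs)) % p
    return any(value(a) == 0 for a in range(p))

# p(x) = 2x^2 + 4 has no roots mod 5: irreducible over Z_5, hence
# irreducible over Q by the Mod-p Test -- yet reducible over Z.
assert not has_root_mod_p([4, 0, 2], 5)
```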
3.14. Eisenstein’s Criterion. Let q(x) = qn xn +qn−1 xn−1 +⋯+q1 x+q0 ∈ Z[x]
be a polynomial of degree n. Suppose that there exists a prime number p ∈ N such
that
(i) p ∤ qn ,
(ii) p ∣ qj for 0 ≤ j ≤ n − 1, but
(iii) p2 ∤ q0 .
Then q(x) is irreducible over Q.
3.15. Examples.
(a) Consider q(x) = 7x3 − 6x2 + 3x + 9 ∈ Z[x].
Let us set p = 2 and apply the Mod-2 Test.
Then the reduction of q(x) mod 2 is q̄(x) = x3 + 0x2 + x + 1 = x3 + x + 1 ∈ Z2 [x].
This has degree three. If it were to be reducible over Z2 , it would need to have a
root in Z2 , by Proposition 2.5.
But q̄(0) = 0 + 0 + 1 = 1 ≠ 0 and q̄(1) = 1 + 1 + 1 = 1 ≠ 0 in Z2 , so that
q̄(x) is irreducible over Z2 , and therefore q(x) is irreducible over Q by the
Mod-2 Test.
(b) Consider q(x) = 25x5 − 9x4 + 3x2 − 12 ∈ Z[x].
Set p = 3. Then p ∣ −12, p ∣ 3, p ∣ −9, while p ∤ 25 and p2 ∤ 12. By
Eisenstein’s Criterion, q(x) is irreducible over Q.
(c) Consider q(x) = 3x4 + 5x + 1 ∈ Z[x].
Since the constant term in q(x) is 1, Eisenstein’s Criterion will not help
us here. Let us try the Mod-p Test. Note that we can’t take p = 3, because
that would reduce the degree of q(x), which we are not allowed to do. Let
us try p = 2.
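Eisenstein's Criterion is also easy to check by machine. The helper below is our own transcription of conditions (i), (ii), (iii); it confirms Example (b) and shows why p = 3 is of no use for the polynomial of Example (c):

```python
def eisenstein(coeffs, p):
    """coeffs: q_0, ..., q_n with q_n != 0.  True iff the prime p
    witnesses Eisenstein's Criterion for the polynomial."""
    q0, qn, lower = coeffs[0], coeffs[-1], coeffs[:-1]
    return (qn % p != 0                          # (i)  p does not divide q_n
            and all(c % p == 0 for c in lower)   # (ii) p divides q_0, ..., q_{n-1}
            and q0 % (p * p) != 0)               # (iii) p^2 does not divide q_0

# Example (b): q(x) = 25x^5 - 9x^4 + 3x^2 - 12 with p = 3.
assert eisenstein([-12, 0, 3, 0, -9, 25], 3)
# Example (c): q(x) = 3x^4 + 5x + 1 -- p = 3 divides the leading
# coefficient, and 3 does not divide the constant term 1.
assert not eisenstein([1, 5, 0, 0, 3], 3)
```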
Supplementary Examples.
S7.2. Example. Let p(x) = x2 + 1 ∈ R[x]. Note that p(x) is irreducible since
deg p(x) = 2 and p(x) has no roots in R; i.e. p(x) ≥ 1 > 0 for all x ∈ R.
Note, however, that the polynomial q(x) = x4 + 2x2 + 1 = (x2 + 1)(x2 + 1) is
reducible, despite the fact that it has no roots over R.
S7.3. Example. One of the most beautiful results concerning the field of com-
plex numbers is that it is algebraically closed. This is the statement that if
p(x) ∈ C[x] is a non-constant polynomial, then p(x) admits a root in C, and is
usually referred to as the Fundamental theorem of algebra.
The standard, and the most accessible proof of this derives from the theory of
complex variables, and is beyond the scope of this course. It is remarkable, to say
the least, that such a “purely algebraic” concept can be proven using very “analytic
techniques”. This is why mathematics is better than chocolate, and almost as good
as potato chips. Of course, here we are not being so impudent as to compare
SUPPLEMENTARY EXAMPLES. 167
Appendix
Exercise 7.1.
Prove that p(x) = x2 − 2 is irreducible over Q.
Exercise 7.2.
Prove that p(x) = x3 + x2 + 2 is irreducible over Q.
Exercise 7.3.
Let q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ Q[x] be a polynomial of degree n ≥ 1.
Prove that there exist integers a, b ∈ Z and p0 , p1 , . . . , pn ∈ Z with gcd(p0 , p1 , . . . , pn ) =
1 such that
q(x) = (a/b)(pn xn + pn−1 xn−1 + ⋯ + p1 x + p0 ).
Exercise 7.4.
Let p(x) = x4 − 2x3 + x + 1. Prove that p(x) is irreducible over Q.
Exercise 7.5.
Let p(x) = 16x5 − 9x4 + 3x2 + 6x − 21 ∈ Q[x]. Prove that p(x) is irreducible
over Q.
Exercise 7.6.
Let F be a field. Prove that F[x, y] is not a pid.
Exercise 7.7.
Apply the division algorithm to f (x) = 6x4 −2x3 +x2 −3x+1 and g(x) = x2 +x−2 ∈
Z7 [x] to find q(x) and r(x) such that f (x) = g(x)q(x) + r(x), where r(x) = 0 or
deg(r(x)) < deg(g(x)).
Exercise 7.8.
Find all of the roots of p(x) = 5x3 + 4x2 − x + 9 in Z12 .
Exercise 7.9.
Find an invertible element p(x) of Z4 [x] such that deg(p(x)) > 1.
Exercise 7.10.
Give two different factorisations of p(x) = x2 + x + 8 in Z10 [x].
CHAPTER 8
Vector spaces
● 1x = x for all x ∈ V.
● κ(x + y) = κx + κy for all κ ∈ F, x, y ∈ V.
● (κ + λ)x = κx + λx for all κ, λ ∈ F, x ∈ V.
● κ(λx) = (κλ)x for all κ, λ ∈ F, x ∈ V.
1.3. Examples.
(a) Let n ∈ N be an integer. If p ∈ N is a prime, then Znp ∶= {(x1 , x2 , . . . , xn ) ∶
xj ∈ Zp , 1 ≤ j ≤ n} forms a vector space over Zp using the familiar operations
(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn ) ∶= (x1 + y1 , x2 + y2 , . . . , xn + yn )
and
κ(x1 , x2 , . . . , xn ) ∶= (κx1 , κx2 , . . . , κxn )
for all (x1 , x2 , . . . , xn ), (y1 , y2 , . . . , yn ) ∈ Znp and κ ∈ Zp .
(b) Let V = C([0, 1], C) ∶= {f ∶ [0, 1] → C ∶ f is continuous}. Then C([0, 1], C)
is a vector space over C using the operations
(f + g)(x) ∶= f (x) + g(x),
and
(κf )(x) = κ(f (x))
for all f, g ∈ C([0, 1], C) and κ ∈ C.
(c) If V is a vector space over C, then V is a vector space over R and over
Q, using the same operations, only restricting the scalar multiplication to
scalars in R or to scalars in Q respectively.
1.8. Example. The set {sin x, cos x, sin(2x), cos(2x)} is linearly independent
in the vector space C((−1, 1), R) over R.
Consider
0 = κ1 sin x + κ2 cos x + κ3 sin(2x) + κ4 cos(2x).
By differentiating three times we obtain the three new equations
(1) 0 = κ1 sin x + κ2 cos x + κ3 sin(2x) + κ4 cos(2x)
(2) 0 = κ1 cos x − κ2 sin x + 2κ3 cos(2x) − 2κ4 sin(2x)
(3) 0 = −κ1 sin x − κ2 cos x − 4κ3 sin(2x) − 4κ4 cos(2x)
(4) 0 = −κ1 cos x + κ2 sin x − 8κ3 cos(2x) + 8κ4 sin(2x).
These equations must hold for all x ∈ (−1, 1), and so setting x = 0, we obtain that
(5) 0 = κ2 + κ4
(6) 0 = κ1 + 2κ3
(7) 0 = −κ2 − 4κ4
(8) 0 = −κ1 − 8κ3 .
An easy calculation now shows that κ1 = κ2 = κ3 = κ4 = 0, and thus that the set
{sin x, cos x, sin(2x), cos(2x)} is indeed linearly independent.
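The "easy calculation" amounts to the coefficient matrix of equations (5)-(8) being invertible, which a short stdlib-only determinant computation of our own confirms:

```python
from itertools import permutations

def det(M):
    """Determinant by the Leibniz expansion over permutations (fine for 4x4)."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign, prod = 1, 1
        for i in range(n):                 # sign via inversion count
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

# Rows: equations (5)-(8) in the unknowns (k1, k2, k3, k4).
A = [[0, 1, 0, 1],
     [1, 0, 2, 0],
     [0, -1, 0, -4],
     [-1, 0, -8, 0]]
assert det(A) != 0   # so the only solution is k1 = k2 = k3 = k4 = 0
```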
The student who is not already familiar with partially and totally ordered sets
should consult the Appendix to Chapter 5.
1.9. Definition. Let (X, ≤) be a partially ordered set. A chain in X is a non-
empty subset C ⊆ X which is totally ordered. Thus, given c1 , c2 ∈ C, either c1 ≤ c2 or
c2 ≤ c1 .
Let ∅ ≠ Y ⊆ X. An upper bound in X for Y is an element u ∈ X such that
y ≤ u for all y ∈ Y .
1.10. Example. Let X ∶= {a, b, c, d}, and partially order the power set P(X) ∶=
{Y ∶ Y ⊆ X} by inclusion: i.e. Y1 ≤ Y2 if and only if Y1 ⊆ Y2 .
Then
C ∶= {∅, {a}, {a, c}, {a, c, d}}
is a chain in P(X), while
D ∶= {∅, {a}, {b}, {a, b}}
176 8. VECTOR SPACES
is not a chain in P(X), since {a} and {b} are not comparable. That is, {a} ≤/ {b}
and {b} ≤/ {a}.
The proof that every vector space over a field F admits a basis is much deeper
than it may sound. It is known to be equivalent to the Axiom of Choice, and to
Zorn’s Lemma, which we now state. We shall go into further detail on the Axiom
of Choice in Appendix A. In fact, we shall now prove “half” of the equivalence by
using Zorn’s Lemma to prove that every vector space admits a basis.
1.12. Zorn’s Lemma. Let (X, ≤) be a non-empty poset. Suppose that every
chain C in X has an upper bound in X. Then X admits a maximal element.
1.13. Theorem. Let V be a vector space over a field F. Then V admits a basis.
1.15. Theorem. Let V be a vector space over a field F, and let B be a basis for
V. Then
(a) each element x ∈ V can be expressed as a finite linear combination of ele-
ments of B; and
(b) except for the presence of terms with a zero coefficient, this can be done in
a unique way.
Proof.
(a) Define
W ∶= span B = {κ1 x1 + κ2 x2 + ⋯ + κn xn ∶ n ∈ N, κj ∈ F, xj ∈ B, 1 ≤ j ≤ n}.
We leave it as an exercise for the reader to prove that W ≤ V; that is, that
W is a subspace of V. Suppose that W ≠ V, i.e. that there exists y ∈ V ∖ W.
We claim that B ∪ {y} is linearly independent in V, contradicting the
maximality of B as a linearly independent set in V.
Indeed, otherwise we can find x1 , x2 , . . . , xn ∈ B and κ1 , κ2 , . . . , κn+1 ∈ F
such that
κ1 x1 + κ2 x2 + ⋯ + κn xn + κn+1 y = 0.
If κn+1 ≠ 0, then
y = (−1/κn+1 )(κ1 x1 + κ2 x2 + ⋯ + κn xn ) ∈ W,
a contradiction.
Thus κn+1 = 0, and so
κ1 x1 + κ2 x2 + ⋯ + κn xn = 0,
implying that κj = 0 for all j since B is linearly independent.
1.19. Examples.
(a) If n ∈ N and p ∈ N is prime, then dimZp Znp = n .
(b) The base field is extremely important when determining the dimension of
a vector space. For example,
dimC C = 1, and dimR C = 2.
Indeed, B ∶= {1, i} is easily seen to be a basis for C over R, while D ∶= {1}
is a basis for C over itself.
(c) What should dimQ C be?
Supplementary Examples.
S8.3. Example. Let F be a field. Then F[x] is a vector space over F, where
κ(qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ) ∶= (κqn xn + κqn−1 xn−1 + ⋯ + κq1 x + κq0 )
for all q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ F[x]. (We are using the usual addition
in F[x].)
Again, the verification of this is left to the reader.
S8.7. Example. Let ∅ ≠ Λ be a set, and let F be a field. If, for each λ ∈ Λ, we
have that Vλ is a vector space over F, then
∏ Vλ
λ∈Λ
is a vector space over F, where
(xλ )λ + (yλ )λ ∶= (xλ + yλ )λ ,
and
κ(xλ )λ ∶= (κxλ )λ ,
for all κ ∈ F, (xλ )λ , (yλ )λ ∈ ∏λ∈Λ Vλ .
S8.8. Example. Let F be a field and n ∈ N. Let Pn (F) ∶= {qn xn + qn−1 xn−1 +
⋯ + q1 x + q0 ∶ qj ∈ F, 0 ≤ j ≤ n}. Then Pn (F) is a finite-dimensional subspace of the
infinite-dimensional vector space F[x] over F, and
dimF Pn (F) = n + 1.
A basis for Pn (F) is B ∶= {1, x, x2 , . . . , xn }.
S8.9. Example. Consider the polynomial q(x) = x2 +x+1 over Z2 , and observe
that it is irreducible over Z2 , by virtue of the fact that its degree is 2 and it has no
roots in Z2 .
As we have seen, F ∶= Z2 [x]/⟨x2 + x + 1⟩ is a field, and a general element of F
looks like
b1 x + b0 + ⟨q(x)⟩ ∶ b1 , b0 ∈ Z2 .
Then F is a vector space over Z2 , where
(b1 x + b0 + ⟨q(x)⟩) + (d1 x + d0 + ⟨q(x)⟩) ∶= (b1 + d1 )x + (b0 + d0 ) + ⟨q(x)⟩,
and
κ(b1 x + b0 + ⟨q(x)⟩) ∶= (κb1 x + κb0 ) + ⟨q(x)⟩.
A basis B for F over Z2 is {1 + ⟨q(x)⟩, x + ⟨q(x)⟩}, and thus
dimZ2 (F) = 2.
Appendix
A8.1. Vector spaces over general fields behave for the most part like vector
spaces over R or over C, although we must always be aware of those few cases where
a property of the underlying field (such as finiteness, or the finiteness of its characteristic)
plays a role (e.g. Example 1.12).
Since we assume that the reader has already seen vector spaces over R in a linear
algebra course, this chapter should be quite familiar, with the possible exception of
the proof that every vector space admits a basis. We should point out that when a
vector space is infinite-dimensional, it may be very, very difficult to actually describe
a basis, and Theorem 1.13 merely asserts that one exists.
A8.2. Students who have seen Calculus or Fourier Analysis may have seen us
write, for example,
exp(x) ∶= ex = 1 + x/1! + x2 /2! + x3 /3! + ⋯.
This is NOT a linear combination of the functions gn (x) ∶= xn /n!, n ≥ 0. This is a
series expansion. Be aware that we can NEVER add up infinitely many things in
mathematics. The most general thing we can do is to take finite sums and then take
limits of these.
The above expression really means exactly that. We are claiming that
exp(⋅) = limN →∞ (g0 + g1 + ⋯ + gN ).
What we mean by the limit depends in general on the circumstance. In this instance,
we might be referring to point-wise limits, or to uniform convergence on closed and
bounded (i.e. so-called compact) subsets of R, or some other form of convergence
altogether.
The study of such series expansions falls outside the scope of a course on rings
and fields. We only bring it up to emphasise the fact that in vector spaces, we
are only ever allowed to consider finite sums of vectors, and in any setting – linear
algebra or otherwise – linear combinations only involve finitely many vectors. Be
warned.
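The distinction between a finite partial sum and its limit is easy to see numerically. A small sketch of our own, watching the partial sums of gn at x = 1 approach exp(1):

```python
import math

def partial_sum(x, N):
    """The finite sum g_0(x) + g_1(x) + ... + g_N(x), with g_n(x) = x^n / n!."""
    return sum(x ** n / math.factorial(n) for n in range(N + 1))

# Each partial sum is a genuine (finite) linear combination; only the
# limit of these finite sums recovers exp.
errors = [abs(partial_sum(1.0, N) - math.e) for N in (2, 5, 10, 15)]
assert all(a > b for a, b in zip(errors, errors[1:]))  # steadily improving
assert errors[-1] < 1e-12
```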
Exercise 8.1.
Let V and W be vector spaces over a field F. A map φ ∶ V → W is said to be
F-linear if φ(κx + y) = κφ(x) + φ(y) for all x, y ∈ V and κ ∈ F.
Let L(V, W) ∶= {φ ∶ V → W ∶ φ is F-linear}. If V = W, we abbreviate the
notation to L(V).
Prove that L(V, W) is a vector space over F using the operations
(φ + ψ)(x) ∶= φ(x) + ψ(x), x∈V
and
(κφ)(x) ∶= κ(φ(x)), x ∈ V, κ ∈ F.
Note. Both V and W must be vector spaces over the same field F. Also, if
we are only dealing with one field F at a time, and there is no risk of confusion,
we sometimes relax a bit and simply refer to these maps as linear (as opposed to
F-linear). Let’s face it – life is stressful enough and sometimes we just need to let
our hair down.
Exercise 8.2.
Let V and W be vector spaces over a field F, and let φ ∈ L(V, W). Prove that
ker φ ∶= {x ∈ V ∶ φ(x) = 0}
is a subspace of V, while ran φ ∶= {φ(x) ∶ x ∈ V} is a subspace of W.
Exercise 8.3.
Let V be a vector space over the field F. Prove that (L(V), +, ○) is a ring with
the operations
(φ + ψ)(x) ∶= φ(x) + ψ(x), x ∈ V
and
(φ ○ ψ)(x) ∶= φ(ψ(x)), x ∈ V.
Exercise 8.4.
Let V be a vector space over the field F, and let L ⊆ V be a linearly independent
set. Show that L can be extended to a basis B for V. That is, show that there exists
a basis B ⊆ V such that L ⊆ B.
Exercise 8.5.
Let V and W be vector spaces over a field F, and let φ ∈ L(V, W). We say that
φ is a (vector space) isomorphism if φ is bijective.
Prove that the following statements are equivalent.
(a) φ ∈ L(V, W) is an isomorphism.
(b) Given a basis B = {bλ ∶ λ ∈ Λ} for V, the set D ∶= {φ(bλ ) ∶ λ ∈ Λ} is a basis
for W.
EXERCISES FOR CHAPTER 8 185
Exercise 8.6.
Let V be a vector space over a field F, and let 0 ≠ φ ∈ L(V, F). Show that
ker φ
is a maximal proper subspace of V. That is, if W ≤ V is a subspace of V and
ker φ ≤ W ≤ V,
then either W = ker φ or W = V.
Exercise 8.7.
Let V be a vector space over the field F, and let W be a subspace of V. We
define a relation ∼ on V as follows: given x, y ∈ V, we set
x ∼ y if and only if x − y ∈ W.
(a) Prove that ∼ is an equivalence relation on V.
(b) As we did in the case of rings, we refer to the equivalence class x + W ∶= [x]
of x ∈ V as a coset of W. Prove that the set V/W ∶= {[x] ∶ x ∈ V} of
cosets of W forms a vector space using the operations
(x + W) + (y + W) ∶= (x + y) + W,
and
κ(x + W) ∶= (κx) + W.
Of course, the first thing you will have to do is to verify that these operations
are well-defined.
Exercise 8.8.
Prove the First Isomorphism Theorem for vector spaces:
If V and W are vector spaces over a field F and φ ∶ V → W is an F-linear map,
then
ran φ = φ(V) ≃ V/ker φ.
Exercise 8.9.
Prove the Second Isomorphism Theorem for vector spaces:
Let W and Y be subspaces of a vector space V over a field F, and set
W + Y ∶= {w + y ∶ w ∈ W, y ∈ Y}.
Prove that W + Y is a subspace of V, and
(W + Y)/Y ≃ W/(W ∩ Y).
Exercise 8.10.
Prove the Third Isomorphism Theorem for vector spaces:
Let W and Y be subspaces of a vector space V over a field F, and suppose
that Y ≤ W. Prove that
V/W ≃ (V/Y)/(W/Y).
CHAPTER 9
Extension fields
1.3. Remarks.
● We note that this is not the most common definition of an extension field
of a field F. Many authors define an extension field E of F simply as a
field E which contains F as a subfield, and we may do this if we identify F
187
with its image Fι ∶= ι(F) ⊆ E. There is no harm in doing this – indeed, this
is how we should interpret the notion of an extension field, provided that
we are careful about keeping track of this identification. In writing these
course notes however, we have taken the viewpoint that when first learning
a subject, it is good to be overly cautious, meticulous and probably a bit
pedantic, and to do the bookkeeping in plain sight for all to see.
● At times we shall have to simultaneously consider multiple extension fields
of a field F. When we are only dealing with one extension field (E, ι)
however, we shall use the notation bι ∶= ι(b) in the hope that it will make
the arguments more readable. This will also apply to polynomials in the
following way: suppose that q(x) = qn xn + qn−1 xn−1 + ⋯ + q1 x + q0 ∈ F[x] and
that (E, ι) is an extension of F. We shall denote by
q ι(x) ∶= qnι xn + qn−1ι xn−1 + ⋯ + q1ι x + q0ι
the corresponding polynomial in E[x] with coefficients qjι = ι(qj ), 0 ≤ j ≤ n.
● A brief comment on terminology: we shall also say that E is an extension
field of F via ι to mean that (E, ι) is an extension field of F.
● Finally, let (E, ι) be an extension field of F and q(x) ∈ F[x]. Then, in
the same way that we are (mentally) identifying F and Fι , we should be
identifying q(x) and q ι (x). Thus “finding a root for q(x)” in E really
means finding a root for q ι (x) in E! Despite this, we shall continue to use
the common abuse of terminology and refer to a root of q ι (x) in E as a
“root of q(x)”.
1.4. Examples.
(a) (R, ι) is an extension of Q, where ι(q) = q for all q ∈ Q.
Note that in this example, if q(x) = (3/2)x2 − (17/18)x + (222/1033) ∈ Q[x], then
q ι(x) = (3/2)ι x2 − (17/18)ι x + (222/1033)ι = (3/2)x2 − (17/18)x + (222/1033) ∈ R[x].
Hopefully the reader can see why we would want to abuse the terminology
in cases such as these, and refer to a root of q ι(x) in R as a “root of q(x)”.
(b) (C, ι) is an extension of R, where ι(x) = x ⋅ 1 + 0 ⋅ √−1 = x + i0 for all x ∈ R,
and therefore (C, ι∣Q ) is also an extension of Q. [Note that ι and i are
different! The first is an injective homomorphism, while the second is the
square root of −1 in C!]
(c) There does not exist an embedding φ ∶ Z7 → Z19 for which (Z19 , φ) is an
extension of Z7 . Indeed, if 1 ∈ Z7 denotes the multiplicative identity in Z7 ,
then the fact that char(Z7 ) = 7 implies that
0 = φ(7) = φ(1) + φ(1) + φ(1) + φ(1) + φ(1) + φ(1) + φ(1).
But the only element a ∈ Z19 that satisfies 7a = (a + a + a + a + a + a + a) = 0
is a = 0. Thus φ(1) = 0 = φ(0), contradicting the fact that φ is injective.
1. A RETURN TO OUR ROOTS (OF POLYNOMIALS) 189
= p(x) + ⟨p(x)⟩
= 0 + ⟨p(x)⟩ ∈ E.
In other words, α is a root of pι (x), and hence of q ι (x) in E!
Are you kidding me? Nay! Am I kidding you? Nay nay!
◻
Remark. We shall refer to the embedding ι of F into E = F[x]/⟨p(x)⟩ defined in
the proof of Kronecker’s Theorem by ι(a) = a + ⟨p(x)⟩ for all a ∈ F as the canonical
embedding of F into E. Thus
Fι ∶= ι(F) = {aι ∶= a + ⟨p(x)⟩ ∶ a ∈ F}
is a subfield of E which is isomorphic to F.
1.6. Examples.
(a) Let F = Q and q(x) = x2 + 1 ∈ Q[x]. Note that in this example, q(x) is al-
ready irreducible over Q, so q(x) will play the role of p(x) from Kronecker’s
Theorem. By the argument in the proof of Kronecker’s Theorem 1.5,
E ∶= Q[x]/⟨x2 + 1⟩
is an extension of Q via the canonical embedding aι ∶= ι(a) = a+⟨x2 +1⟩, a ∈
Q.
Moreover, q(x) = q2 x2 + q0 , where q2 = q0 = 1 and thus
q ι (x) = q2ι x2 + q0ι = (1 + ⟨x2 + 1⟩)x2 + (1 + ⟨x2 + 1⟩).
This might seem a bit confusing at first, but notice that if we write
K ∶= ⟨q(x)⟩ = ⟨x2 + 1⟩, then we are saying that
q ι (x) = (q2 + K)x2 + (q0 + K).
It is important to keep in mind that an element of E is a coset of K, so
q2ι ∶= q2 + K and q0ι ∶= q0 + K are elements of E. That is, q ι (x) = q2ι x2 + q0ι ∈
E[x], as required.
Setting α ∶= x + K = x + ⟨x2 + 1⟩ ∈ E, let’s check that α is indeed a root of q ι(x):
q ι (α) = (q2 + K)α2 + (q0 + K)
= (q2 + K)(x + K)2 + (q0 + K)
= (q2 + K)(x2 + K) + (q0 + K)
= (q2 x2 + q0 ) + K
= q(x) + ⟨q(x)⟩
= (x2 + 1) + ⟨x2 + 1⟩
= 0 + ⟨x2 + 1⟩.
The reader will observe that we never needed to invoke C or “√−1” to
get a root for this polynomial! Indeed, this is one way to define √−1! (How
sweet is that?)
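For the skeptical reader, the arithmetic of E = Q[x]/⟨x2 + 1⟩ can be checked by machine. The Python sketch below is our own illustration (not part of the notes' formal development): a coset b0 + b1x + K is stored as the pair (b0, b1), since x2 ≡ −1 modulo K.

```python
from fractions import Fraction

def add(u, v):
    # coset addition: (a + b x + K) + (c + d x + K) = (a + c) + (b + d) x + K
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2, and x^2 = -1 mod K
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

one = (Fraction(1), Fraction(0))
alpha = (Fraction(0), Fraction(1))      # alpha = x + <x^2 + 1>

# q^iota(alpha) = alpha^2 + 1 should be the zero coset:
print(add(mul(alpha, alpha), one))      # -> (Fraction(0, 1), Fraction(0, 1))
```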
1.7. Definition. Let (E, ι) be an extension field of a field F, and let q(x) ∈ F[x].
We say that q(x) splits over E if q ι (x) can be written as a product of linear terms
with coefficients in E.
The field E is said to be a splitting field for q(x) ∈ F[x] if
(a) q(x) splits over E, and
(b) If Fι ⊆ K ⊆ E for some proper subfield K of E, then q(x) does not split
over K.
1.8. Examples.
(a) Consider q(x) = x2 − 2 ∈ Q[x]. Then q(x) does not split over Q, but it does
split over R. Indeed,
q ι(x) = x2 − 2 = (x − √2)(x + √2) ∈ R[x].
Note also that q(x) splits over Q(√2) = {a + b√2 ∶ a, b ∈ Q}, and thus R is
not the splitting field for q(x). As it turns out, Q(√2) is the splitting field
for q(x) ∈ Q[x].
(b) The same polynomial q(x) = x2 − 2 can be thought of as an element of
R[x]. In this case, Q(√2) would not be considered a splitting field for q(x)
over R, since any splitting field for q(x) over R would have to contain the
domain for the coefficients, namely R. In other words, if we are thinking of
q(x) as a polynomial with real coefficients, then q(x) already splits in R,
so R is the splitting field of q(x).
1.10. Theorem.
Let F be a field and 0 ≠ q(x) ∈ F[x]. Then there exists a splitting field K for q(x)
over F.
Proof. We shall prove this by induction on the degree, say n, of q(x). Of course, if
n = 0, then q(x) = q0 has no roots (since we assumed that q(x) ≠ 0), and so F is the
splitting field for q(x) over F.
If n = 1, then q(x) = q1 x + q0 with q1 ≠ 0, and thus α ∶= −q1−1 q0 ∈ F is the unique
root of q(x). In other words, F is again a splitting field for q(x).
Let N > 1 be fixed, and suppose that the result holds for all polynomials of
degree less than N . Suppose next that q(x) ∈ F[x] is a polynomial of degree N . By
Kronecker’s Theorem 1.5, there exists an extension field (E, ι) of F in which q ι (x)
has a root, say α1 ∈ E.
Thus
q ι (x) = (x − α1 )g(x)
for some g(x) ∈ E[x] with deg(g(x)) = N − 1 < N. By our induction hypothesis,
there exists an extension field (H, τ) of E such that g(x) splits in H; that is, there
exist α2, α3, . . . , αN ∈ H such that
g τ(x) = gN−1τ (x − α2)(x − α3)⋯(x − αN) ∈ H[x].
Let K ∶= F(α1 , α2 , . . . , αN ) ⊆ H. Then (K, τ ○ ι) is a splitting field for q(x) over F.
1.11. Example.
Let q(x) = x4 + 4x2 − 45 = (x2 − 5)(x2 + 9) ∈ Q[x], and let ι ∶ Q → C denote the
inclusion map ι(q) = q for all q ∈ Q. The roots of q ι(x) in C are {√5, −√5, 3i, −3i}.
Thus, a splitting field for q(x) over Q is K ∶= Q(√5, 3i) = Q(√5, i).
In particular,
K = {(a + b√5) + (c + d√5)i ∶ a, b, c, d ∈ Q}.
It is worth noting that K is a vector space over Q with basis {1, √5, i, √5 i}, and
thus dimQ K = 4.
That K is a vector space over Q is not a coincidence. We shall study this phenomenon
presently.
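A quick numerical sanity check of this example in Python (an aside of ours, not a proof): the four claimed roots do indeed annihilate q.

```python
import cmath

def q(z):
    # q(x) = x^4 + 4x^2 - 45 = (x^2 - 5)(x^2 + 9)
    return z**4 + 4 * z**2 - 45

roots = [cmath.sqrt(5), -cmath.sqrt(5), 3j, -3j]
print(all(abs(q(r)) < 1e-9 for r in roots))   # -> True
```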
(b) Since
F[x]/⟨q(x)⟩ =
{bn−1 xn−1 + bn−2 xn−2 + ⋯ + b1 x + b0 + ⟨q(x)⟩ ∶ bj ∈ F, 0 ≤ j ≤ n − 1},
we see that
F(α) = Θ̂(F[x]/⟨q(x)⟩)
= {bιn−1 αn−1 + bιn−2 αn−2 + ⋯ + bι1 α + bι0 ∶ bj ∈ F, 0 ≤ j ≤ n − 1}.
1.14. Remark. It is worth noting that the isomorphism between F(α) and
F[x]/⟨q(x)⟩ in Theorem 1.13 above is given by the map
Θ̂(f(x) + ⟨q(x)⟩) = f ι(α).
This follows immediately from the First Isomorphism Theorem and its proof.
Also, let us keep in mind that for b ∈ F, the map δ ∶ b ↦ b+⟨q(x)⟩ is an embedding
of F to F[x]/⟨q(x)⟩, and
Θ̂(b + ⟨q(x)⟩) = bι ∈ F(α).
Thus we may think of the map κ ∶= Θ̂ ○ δ ∶ F → F(α) satisfying κ(b) = bι as a canonical
embedding of F into F(α) induced by this construction.
◻
The astute reader (which may or may not be the same person as the perspicacious
reader mentioned earlier) will have noticed that we have been referring to a splitting
field for a polynomial over a field, as opposed to the splitting field. The following
theorem will set things right.
But first, note that if θ ∶ F1 → F2 is an isomorphism of fields, and if f (x) =
bN xN + bN −1 xN −1 + ⋯ + b1 x + b0 ∈ F1 [x], then we extend our previous notation by
setting
f θ (x) ∶= bθN xN + bθN −1 xN −1 + ⋯ + bθ1 x + bθ0 ∈ F2 [x],
where bθj ∶= θ(bj ), 0 ≤ j ≤ N .
1.17. Theorem. Let F be a field and suppose that f (x) ∈ F[x]. If (K1 , ι1 ) and
(K2 , ι2 ) are splitting fields for f (x) over F, then there exists an isomorphism
Φ ∶ K1 → K2
which satisfies Φ(ι1 (b)) = ι2 (b) for all b ∈ F.
● then use the fact that the remaining factor h(x) has degree less than N ,
which allows you to apply the induction hypothesis to conclude that K1
and K2 are isomorphic.
Almost all of the work above went into keeping track of where the coefficients of the
different factors were.
1.18. Remark. The previous result shows that any two splitting fields for a
polynomial f (x) ∈ F[x] are isomorphic, and thus – up to isomorphism – we can
speak about the splitting field of a polynomial f (x) ∈ F[x].
Given the construction of Φ from θ in the above result, it is not hard to see that
with a bit more work, we can also ensure that
Φ(αj ) = βj , 1 ≤ j ≤ N.
That is, without loss of generality, we have that K1 ∶= F(α1, α2, . . . , αN) and
K2 ∶= F(β1, β2, . . . , βN) are isomorphic via an isomorphism that “fixes” F (in other
words, sends ι1(b) to ι2(b) for all b ∈ F) and which satisfies Φ(αj) = βj for all j.
It is important to note that when we do this for any fixed 1 ≤ j ≤ N, αj and βj
must be roots of the same irreducible factor of f(x). The reader may wish to review
Example 1.6 (b) to see why this is the case.
1.20. Examples.
(a) α ∶= √2 + √3 is algebraic over Q. To see this, consider the extension field
(R, ι) of Q where ι(b) = b for all b ∈ Q. Note that
(√2 + √3)2 = 5 + 2√6,
and thus
((√2 + √3)2 − 5)2 − 24 = 0.
In other words, α is a root of q ι (x), where
q(x) = (x2 − 5)2 − 24 = x4 − 10x2 + 1 ∈ Q[x].
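Numerically, one can watch the two squarings at work (a Python aside of ours):

```python
import math

alpha = math.sqrt(2) + math.sqrt(3)

# First squaring: alpha^2 = 5 + 2*sqrt(6)
print(abs(alpha**2 - (5 + 2 * math.sqrt(6))) < 1e-12)   # -> True

# Second squaring eliminates the remaining radical:
# (alpha^2 - 5)^2 - 24 = 0, i.e. alpha^4 - 10*alpha^2 + 1 = 0
print(abs(alpha**4 - 10 * alpha**2 + 1) < 1e-9)          # -> True
```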
Recall that F(x) denotes the field of quotients of the integral domain F[x].
F(α) ≃ F[x]/⟨q(x)⟩,
F(β) ≃ F(x).
Proof.
(a) Let us prove that if q(x) is chosen as above, then q(x) is irreducible. If we
can do that, then the result follows immediately from Theorem 1.13 (a).
Indeed, if q(x) = g(x)h(x) for some g(x), h(x) ∈ F[x] with
is a (well-defined!!) isomorphism.
◻
1.24. Examples.
(a) [C ∶ R] = 2, and {1, i} is a basis for C over R.
(b) [Q(√2) ∶ Q] = 2, and {1, √2} is a basis for Q(√2) over Q.
(c) [Q(⁵√2) ∶ Q] = 5, and {1, ⁵√2, ⁵√4, ⁵√8, ⁵√16} is a basis for Q(⁵√2) over Q.
1.25. Theorem. Let (E, ι) be an extension field for F and suppose that
[E ∶ F] < ∞.
Then E is an algebraic extension of F.
This is the statement that α is a root of the non-zero polynomial q ι(x), where
q(x) = qn+1 xn+1 + qn xn + ⋯ + q1 x + q0 ∈ F[x]. Hence α is algebraic over F.
Since α ∈ E was arbitrary, E is an algebraic extension of F.
◻
[K ∶ F] = [K ∶ E] [E ∶ F].
γj = y1,jι α1 + y2,jι α2 + ⋯ + ym,jι αm.
Hence
β = ∑_{j=1}^{n} γjη κj
= ∑_{j=1}^{n} η(γj) κj
= ∑_{j=1}^{n} η(∑_{i=1}^{m} ι(yi,j) αi) κj
= ∑_{j=1}^{n} (∑_{i=1}^{m} η ○ ι(yi,j) αiη) κj
= ∑_{j=1}^{n} ∑_{i=1}^{m} yi,jη○ι (αiη κj).
1.28. Examples.
(a) [Q(√2, √3) ∶ Q] = 4 = [Q(√2, √3) ∶ Q(√2)] [Q(√2) ∶ Q].
A basis for this extension is {1, √2, √3, √6}.
Supplementary Examples.
S9.1. Example. Suppose that E is an extension field for the field F with F ⊆ E,
and that [E ∶ F] = 2. Note that {1} is linearly independent in E, since 1 ≠ 0, and so
{1} can be extended to a basis B ∶= {1, α} for E over F.
It follows that if γ ∈ E, then there exist a, b ∈ F such that γ = a ⋅ 1 + b ⋅ α. In
particular, α2 = a0 ⋅ 1 + b0 ⋅ α for some a0 , b0 ∈ F.
Thus
γ 2 = a2 ⋅ 1 + 2abα + b2 α2
= a2 ⋅ 1 + 2abα + b2 (a0 ⋅ 1 + b0 ⋅ α)
= (a2 + a0 b2 ) ⋅ 1 + (2ab + b0 b2 )α.
It follows that the set {1, γ, γ 2} is linearly dependent over F, which implies that γ
satisfies a polynomial of degree at most 2 over F.
S9.2. Example. Let 0 ≠ p(x) = p2 x2 + p1 x + p0 ∈ Z2 [x] be an irreducible
polynomial of degree 2. Then p2 = 1, otherwise deg(p(x)) < 2. Also, in order to be
irreducible, p(x) cannot have roots in Z2, so
p(0) = p1 ⋅ 0 + p0 = p0 ≠ 0.
Thus p0 = 1. Also,
p(1) = p2 + p1 + p0 = 1 + p1 + 1 = p1 ≠ 0,
so p1 = 1.
Hence p(x) = x2 + x + 1. That is, there exists a unique irreducible polynomial of
degree 2 over Z2 .
What does this say about extension fields over Z2 of degree 2?
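The hand count in S9.2 is small enough to brute-force; here is a Python sketch of ours doing exactly that.

```python
# A monic quadratic over Z2 is x^2 + p1*x + p0; it is irreducible over Z2
# precisely when it has no root in Z2 = {0, 1}.
irreducible = []
for p1 in (0, 1):
    for p0 in (0, 1):
        has_root = any((t * t + p1 * t + p0) % 2 == 0 for t in (0, 1))
        if not has_root:
            irreducible.append((p1, p0))

print(irreducible)   # -> [(1, 1)], i.e. only x^2 + x + 1
```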
S9.3. Example. The polynomial q(x) = x4 +x+1 ∈ Z2 [x] is irreducible. Indeed,
it has no roots in Z2 , and so the only possible factorisation into polynomials of lower
degree would have to involve two polynomials r(x) and s(x) ∈ Z2 [x], each of degree
2.
Writing r(x) = x2 + r1 x + r0 and s(x) = x2 + s1 x + s0, we see that q(x) = r(x)s(x)
implies that r0 s0 = 1, and hence that r0 = s0 = 1.
Thus
q(x) = r(x)s(x) = (x2 + r1 x + 1)(x2 + s1 x + 1)
= x4 + (r1 + s1 )x3 + (1 + r1 s1 + 1)x2 + (r1 + s1 )x + 1,
from which we deduce that r1 + s1 = 0, r1 s1 = 1, and r1 + s1 = 1. The first and third
equation clearly contradict each other, so no such factorisation is possible.
It follows that F ∶= Z2 [x]/⟨x4 + x + 1⟩ is a field, and that a general element of F
is of the form
b3 x3 + b2 x2 + b1 x + b0 + ⟨x4 + x + 1⟩,
for some b0 , b1 , b2 , b3 ∈ Z2 . There are 16 such possibilities, and so ∣F∣ = 16 = 24 .
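One can even compute in F directly. The Python sketch below is our own (the bit-twiddling representation is a standard trick, not something from the notes): the coset of b3x3 + b2x2 + b1x + b0 is stored as the 4-bit integer with bits (b3, b2, b1, b0), and products are reduced modulo x4 + x + 1. As a bonus it checks a fact the notes do not prove here, namely that the coset of x generates the whole multiplicative group of F.

```python
def gf16_mul(u, v):
    # carry-less multiplication in Z2[x], reducing by x^4 + x + 1 = 0b10011
    acc = 0
    while v:
        if v & 1:
            acc ^= u
        v >>= 1
        u <<= 1
        if u & 0b10000:      # degree 4 reached: subtract x^4 + x + 1
            u ^= 0b10011
    return acc

# The 16 = 2^4 elements of F are encoded by the integers 0, 1, ..., 15.
# Powers x, x^2, ..., x^15 of the coset of x (encoded as 0b0010):
powers, t = [], 1
for _ in range(15):
    t = gf16_mul(t, 0b0010)
    powers.append(t)

print(len(set(powers)), powers[-1])   # -> 15 1  (x has multiplicative order 15)
```

In particular every nonzero element is a power of x, so every nonzero element is invertible, confirming that F is a field.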
for q(x) over F. In K, q(x) will have pn distinct roots, say E ∶= {α1 , α2 , . . . , αpn }.
Curious, isn’t it, that we should use E to denote a set of roots, instead of a field?
But in fact, if a, b ∈ E, then a^{p^n} = a and b^{p^n} = b, so (ab)^{p^n} = a^{p^n} b^{p^n} = ab ∈ E, and (by
the binomial expansion, noting that each coefficient in the expansion other than the
first and the last is divisible by p)
(b − a)^{p^n} = b^{p^n} − a^{p^n} = b − a,
so that b − a ∈ E. In other words, E is a subring of K. If 0 ≠ a ∈ E, then a^{p^n} = a
implies that a^{p^n − 1} = 1 in K, so that a^{−1} = a^{p^n − 2} ∈ E. Thus E is in fact a subfield of
K, and E has exactly p^n elements.
That is, for any prime p and any n ∈ N, there exists a field with pn elements.
Appendix
A9.2. Although we shall not prove it here, it is known that the number of monic
irreducible polynomials of degree n over a field Fq, where q = p^m for some prime p
and positive integer m, is given by
N(q, n) ∶= (1/n) ∑_{d∣n} µ(d) q^{n/d},
where µ is the so-called Möbius function. More specifically, µ(d) is the sum of
the primitive dth roots of unity in C, and takes on values in {−1, 0, 1}. In particular,
N (q, n) ≥ 1 for all q, n.
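The formula is easy to implement and to cross-check against Examples S9.2 and S9.3; here is a Python sketch of ours, with a hand-rolled Möbius function so that nothing beyond the standard library is needed. (Over Z2 every nonzero leading coefficient equals 1, so "monic" is no restriction there.)

```python
def mobius(d):
    # mu(d) = 0 if d has a squared prime factor, else (-1)^(# prime factors)
    if d == 1:
        return 1
    n, k, p = d, 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            k += 1
        else:
            p += 1
    if n > 1:
        k += 1
    return (-1) ** k

def N(q, n):
    # N(q, n) = (1/n) * sum over divisors d of n of mu(d) * q^(n/d)
    return sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0) // n

# degree 2: the unique x^2 + x + 1; degree 4: three, among them x^4 + x + 1
print(N(2, 2), N(2, 3), N(2, 4))   # -> 1 2 3
```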
A9.3. There is still some time left to continue asking questions while reading
the text. For instance, how many algebraic elements over Q are there in R? Stated
more precisely, what is the cardinality of the set
Λ ∶= {α ∈ R ∶ α is algebraic over Q}?
Is Λ a field?
EXERCISES FOR CHAPTER 9 207
Exercise 9.1.
Prove that Q[x]/⟨x2 + 1⟩ is isomorphic to Q[i].
Exercise 9.2.
Show that α = √13 + √7 is algebraic over Q, and find its minimal polynomial.
Exercise 9.3.
Show that β = ³√(√2 − i) is algebraic over Q, and find its minimal polynomial.
Exercise 9.4.
Find a basis for Q(√3, √6) over Q. What is the degree of this extension?
Exercise 9.5.
Find a basis for Q(√2, √6 + √10) over Q(√3 + √5). What is the degree of this
extension?
Exercise 9.6.
Prove that Z2 [x]/⟨x3 + x2 + 1⟩ is a field with 8 elements, and construct a multi-
plication table for this field.
Exercise 9.7.
Is e algebraic over Q(e7 )?
Exercise 9.8.
Prove or disprove that Q(√7) ≃ Q(√19).
Exercise 9.9.
Let F be a field of characteristic 5, and let b ∈ F. Prove that p(x) = x5 − b is
either irreducible over F, or splits in F.
Exercise 9.10.
Let α, β ∈ R be transcendental over Q. Prove that either αβ or α + β is tran-
scendental. (Recall from Example 1.20 (d) that we do not know whether π + e and
π e are algebraic or transcendental over Q. This shows that at least one of the two
is transcendental, but does not tell us which.)
Exercise 9.11.
Recall that a set X is said to be denumerable if there exists a bijection φ ∶
N → X. (We say that X is countable if X is either finite or denumerable.)
Let (R, ι) be the standard extension of Q (i.e. ι(q) = q for all q ∈ Q). Prove that
the set
Ω ∶= {α ∈ R ∶ α is algebraic over Q}
is denumerable.
Exercise 9.12.
Prove that the set Ω of algebraic real numbers defined in Exercise 9.11 is a field.
Exercise 9.13.
Let K ∶= Q(√2, ³√2, ⁴√2, ⁵√2, . . .) be the extension field of Q from Paragraph 9.1.26.
Prove that K is algebraic over Q.
CHAPTER 10
Straight-edge and compasses constructions
We are all here on earth to help others; what on earth the others are
here for I don’t know.
W.H. Auden
1. An ode to Wantzel
1.1. In this Chapter we shall see how Pierre Laurent Wantzel (I don’t know
why, but I like him already) managed to use field theory to prove lots of stuff in
geometry. So yes, that’s a thing now.
1.2. The story begins, as they say, with the ancient Greeks. By that we do not
mean that the Greeks themselves were ancient, but rather that we are referring to
Greeks who lived in ancient times. More specifically, we are interested in Plato,
born circa 425 bc. An Athenian philosopher, he held that the only “perfect” geo-
metrical forms were straight lines and circles, and as such, he was only interested
in geometrical constructions that could be accomplished with a straight-edge and a
pair of compasses. 1
Although he was not a mathematician, he nevertheless taught mathematics,
and one of his students, Eudoxos of Cnidus, is considered to have been one of
the greatest mathematicians of those days. In fact, much of what is to be found
in Elements, Euclid’s famous2 treatise on mathematics – written circa 300 bc – is
actually the work of Eudoxos.
1The Cambridge Dictionary asserts that a compass is “a device for finding direction with
a needle that can move easily and that always points to magnetic north”, whereas (a pair of)
compasses refers to “a V-shaped device that is used for drawing circles or measuring distances on
maps”. You could see where a pair of compasses might have been more useful than a compass when
performing geometrical constructions.
2The word “famous” is a curious beast when applied in a mathematical context. In this case,
we feel justified in using this expression because Euclid’s Elements is well-known outside of the
mathematical world. In fact, if one were to walk up to some random individual on the street and
ask them to name a famous mathematical text, then it is hard to think of another text the random
person might cite. Of course, it is far more likely that the person to whom one is posing this
question would ignore one, or – in a worst case scenario – would visit some extreme violence upon
one’s person, but that is an altogether different matter. It is worth bearing in mind that being
“famous” in mathematics today means being known to literally hundreds of other mathematicians
studying the same branch of mathematics, and to absolutely no one who actually matters.
210 10. STRAIGHT-EDGE AND COMPASSES CONSTRUCTIONS
Our present goal is to try to describe which geometric figures may be constructed
using the plane method, starting from a minimal set P consisting of two distinct
points. By rotating, translating and scaling if necessary, we may assume without loss
of generality that those two points are p0 = (0, 0) and p1 = (1, 0). Thus, throughout
the remainder of this chapter, we shall assume that {p0, p1} ⊆ P.
In order to describe these geometric figures, we must first properly define what
it means to “construct” such a geometric object.
It is clear, for example, that since we can construct line segments which connect
pairs of distinct points using the plane method, we can construct a square S1 whose
area is 1 provided that the points q = (1, 0) and r = (1, 1) can be “constructed” from
the set P using the plane method – the points p0 = (0, 0) and p1 = (1, 0) already
lying in P by hypothesis. To do this, we simply connect p0 to p1 , p1 to r, r to q,
and q to p0 via line segments.
We can similarly “construct” a square S2 whose area is twice that of S1 provided
that we can somehow “construct” the points x = (√2, 0), y = (0, √2) and z =
(√2, √2) from P using the plane method, and “constructing” an arbitrary polygon
can be done provided that we can somehow “construct” the vertices of that polygon,
after which we may connect the appropriate vertices with line segments, which the
plane method allows us to draw.
Thus by “constructing” a more general geometric figure using the plane method,
we shall mean constructing the points which define the object, and connecting them
using the lines, line segments and circles prescribed by the plane method. Of course,
being able to do so relies upon our having a rigorous definition of which points we
can “construct” from P using the plane method.
This is not the only interesting way to interpret the set ΓP . Indeed, suppose
that the point r = (r1 , r2 ) is constructible from P.
● as we have just seen, we may construct the x-axis from P by the first
admissible operation.
● By Paragraph A10.1, we may also construct the y-axis.
2. Enter fields
2.1. This is where things get interesting for those of us who have spent the past
few weeks learning about rings and fields. Our immediate goal is to prove that if
{p0, p1} ⊆ P, then ΓP is a field. First we observe that if α ∈ ΓP, then the circle
of radius ∣α∣ centred at p0 intersects the x-axis at (−∣α∣, 0) and at (∣α∣, 0). In other
words, α ∈ ΓP implies that −α ∈ ΓP as well.
Proof. First note that by the argument in Paragraph 2.1, it suffices to consider the
case where α, β > 0. If α = 1, there is nothing to prove, and so we henceforth assume
that α ≠ 1.
(a) Let a = (α, 0) and b = (0, β), which are constructible from P.
(b) Using Paragraph A.10.3, we can draw a line L through the point p1 = (1, 0)
which is parallel to the line through a and b. That line will intersect the
y-axis at the point c ∶= (0, γ), and thus γ ∈ ΓP.
The triangles (p0 a b) and (p0 p1 c) are similar, and thus
α/1 = d(p0, a)/d(p0, p1) = d(p0, b)/d(p0, c) = β/γ.
That is, β/α = γ ∈ ΓP.
Since 1 ∈ ΓP, the above argument with β = 1 implies that γ1 ∶= 1/α ∈ ΓP, and
therefore that
αβ = β/γ1 ∈ ΓP.
Figure 10.0
◻
2.4. Theorem. Suppose that {p0 = (0, 0), p1 = (1, 0)} ⊆ P ⊆ R2 . Then
Q ⊆ ΓP ⊆ R
is a field. Thus
ΓP = Q({α, β ∶ (α, β) ∈ P}).
Finally, by the comments in Remark 1.6, we know that {α, β ∶ (α, β) ∈ P} ⊆ ΓP.
Since we now know that ΓP is a field which also contains Q, it must be the smallest
field which contains Q and {α, β ∶ (α, β) ∈ P}. In other words,
ΓP = Q({α, β ∶ (α, β) ∈ P}).
◻
2.6. Remark. Although we have determined that the set Γ○ = Γ{p0 ,p1 } of con-
structible real numbers is a subfield of R that contains Q, we still don’t know much
about it. The next result establishes a relation between straight-edge and compasses
constructions and field extensions, which will be the key to understanding Γ○ , and
to establishing the impossibility of certain classical geometric constructions in the
next section of this Chapter.
Proof. It is clear from the definition of ΓP∪{r} that ΓP∪{r} = ΓP(r1, r2). It remains
to find the degree of the extension. This leads us to consider three possibilities
for r.
(a) r is the point of intersection of two non-parallel lines L1 determined by
a = (a1 , a2 ) and b = (b1 , b2 ) ∈ P, and L2 determined by c = (c1 , c2 ) and
d = (d1 , d2 ) ∈ P.
We shall argue the case where the slopes of L1 and L2 are non-zero real
numbers. The cases where one line is horizontal and/or one line is vertical
are left as (routine) exercises for the reader.
Let us write L1 in the form L1 = {(x, y) ∈ R2 ∶ y = m1 x + n1}. For
(x, y) ∈ L1 with (x, y) ∉ {a, b}, we have that
(x − a1)/(x − b1) = (y − a2)/(y − b2),
and therefore
y = ((a2 − b2)/(a1 − b1)) x + (a1 b2 − a2 b1)/(a1 − b1).
Set m1 = (a2 − b2)/(a1 − b1) and n1 = (a1 b2 − a2 b1)/(a1 − b1), and observe that
m1, n1 ∈ ΓP, since a1, a2, b1, b2 ∈ ΓP.
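The point of these formulas is that every step uses only the four field operations on previously constructed coordinates. A small Python sketch with exact rationals (our illustration, with arbitrarily chosen points) shows case (a) in action: the intersection of two lines whose defining points have coordinates in Q again has coordinates in Q.

```python
from fractions import Fraction as Frac

def slope_intercept(a, b):
    # m and n of the (non-vertical) line through a and b, as in the text:
    # m = (a2 - b2)/(a1 - b1),  n = (a1*b2 - a2*b1)/(a1 - b1)
    m = (a[1] - b[1]) / (a[0] - b[0])
    n = (a[0] * b[1] - a[1] * b[0]) / (a[0] - b[0])
    return m, n

a, b = (Frac(0), Frac(0)), (Frac(1), Frac(1))   # L1 through a and b
c, d = (Frac(0), Frac(1)), (Frac(1), Frac(0))   # L2 through c and d

m1, n1 = slope_intercept(a, b)
m2, n2 = slope_intercept(c, d)

# Solving m1*x + n1 = m2*x + n2 requires only field operations:
r1 = (n2 - n1) / (m1 - m2)
r2 = m1 * r1 + n1
print(r1, r2)   # -> 1/2 1/2, still rational
```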
or equivalently,
But this is a quadratic equation, and thus [ΓP (r1 ) ∶ ΓP ] ∈ {1, 2} (depending
upon whether or not this equation has roots in ΓP .)
(c) The third and last possibility is that r is the point of intersection of two
circles T1 (centred at a = (a1 , a2 ) ∈ P and containing b = (b1 , b2 ) ∈ P) and
T2 (centred at c = (c1 , c2 ) ∈ P and containing d = (d1 , d2 ) ∈ P). As in
part (b), we note that if ρk is the radius of the circle Tk , then ρk ∈ ΓP ,
k = 1, 2.
Thus r = (r1, r2) ∈ T1 ∩ T2 implies that
(r1 − a1 )2 + (r2 − a2 )2 = ρ12
and
(r1 − c1 )2 + (r2 − c2 )2 = ρ22 .
Taking the difference of these two equations, we obtain
r2 = αr1 + β,
2.8. Remark. In fact, the above proof shows that if r = (r1, r2) ∈ R2 can be
constructed in one step from P, then [ΓP(r1) ∶ ΓP] ∈ {1, 2} and r2 ∈ ΓP(r1), so that
[ΓP(r1, r2) ∶ ΓP] ∈ {1, 2}.
say
[ΓPn ∶ ΓP] = 2^{m0}
for some 0 ≤ m0 ∈ Z.
Now ΓP ⊆ ΓP (r1 , r2 ) is a subfield of ΓPn , and thus
3. Back to geometry
3.1. We started this journey into the realm of constructible numbers because
of our deep admiration of Euclid, and our appreciation of his weakness for lines
and circles. While the Greeks were masters of the plane method, there were three
constructions that fell beyond their reach, and which would subsequently become
famous,3 namely: doubling the cube, squaring the circle, and trisecting
an angle. The question of whether or not these constructions were possible using
the plane method remained open for approximately 2000 years, until the French
mathematician Pierre Laurent Wantzel (1814-1848) proved that the problems of
doubling the cube and trisecting the angle lay not with the Greeks, but with the
nature of field extensions related to constructible real numbers. The same Pierre
Laurent Wantzel also proved that a regular polygon is constructible if and only if
the number of its sides is a product of a power of 2 and (any number – including zero)
of distinct Fermat primes. As for the problem of squaring the circle – it was resolved
when Lindemann proved that π is transcendental over Q, as we shall discover below.
3.2. Doubling the cube. Since the set Γ○ of constructible real numbers is
a field that contains Q, it is clear that we can construct the points (0, 0), (1, 0),
(1, 1) and (0, 1) from the set P ○ ∶= {p0 , p1 }. By connecting these points with the
appropriate line segments, it follows that we may construct a square of area equal
to 1. By thinking “outside the box”, so to speak, we may view this as the face of a
cube in R3 whose volume is equal to 1.
Of course, using a straight-edge and compasses construction, we can “double
the area of the square”, because if we set
ϱ ∶= d((0, 0), (1, 1)),
then ϱ = √2 is constructible, and so we may also construct the points (0, 0), (√2, 0),
(√2, √2) and (0, √2), which are the vertices of a square of area equal to 2.
It occurred to the Greeks to ask:
starting with P ○ ∶= {p0 , p1 }, can we construct the face of a cube
whose volume is 2?
Despite the efforts of a great many and many great mathematicians, this prob-
lem remained unsolved for approximately 2000 years. Thanks to Pierre Laurent
Wantzel, we are now in a position to resolve this problem. Take that, Euclid.
That is, is it possible to construct a line L which intersects the unit circle C at the
point w such that the angle ∠(w p0 p1) measures π/9 radians?
3.7. Theorem. (Wantzel) The angle π/3 cannot be trisected using a straight-
edge-and-compasses construction.
Proof. The question reduces to: what is that point w, and does it lie in Γ○? Writing
w = (w1, w2), we see that w1 ∶= cos(π/9) and w2 ∶= sin(π/9). If w ∈ Γ○, then w1 and w2
must each be constructible as well!
Recall that for all 0 < α ∈ R, we have that
cos(3α) = cos(2α + α)
= cos(2α) cos(α) − sin(2α) sin(α)
= (2 cos2 α − 1) cos α − (2 sin α cos α) sin α
= 2 cos3 α − cos α − 2(sin2 α) cos α
= 2 cos3 α − cos α − 2(1 − cos2 α) cos α
= 4 cos3 α − 3 cos α.
Applying this with α = π/9, and recalling that cos(3α) = cos(π/3) = 1/2, we see that
w1 = cos(π/9) must satisfy
1/2 = 4w13 − 3w1,
or equivalently, w1 must be a root of the polynomial
q(x) = 8x3 − 6x − 1.
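Before attacking irreducibility, a one-line numerical check of ours that cos(π/9) really is a root:

```python
import math

w1 = math.cos(math.pi / 9)
print(abs(8 * w1**3 - 6 * w1 - 1) < 1e-12)   # -> True
```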
We claim that q(x) is irreducible over Q = ΓP ○ . Indeed, suppose that q(x)
were reducible over Q. Then we could write q(x) = g(x)h(x) where deg(g(x)) = 1
and deg(h(x)) = 2. By Theorem 7.3.8, if q(x) is reducible over Q, then q(x) is
reducible over Z, and we can find factors g1 (x) and h1 (x) ∈ Z[x] with deg(g1 (x)) = 1
and deg(h1 (x)) = 2 such that q(x) = g1 (x)h1 (x).
Set g1 (x) = a1 x + a0 and h1 (x) = b2 x2 + b1 x + b0 ∈ Z[x]. We must therefore solve
8x3 − 6x − 1 = (a1 b2 )x3 + (a1 b1 + a0 b2 )x2 + (a1 b0 + a0 b1 )x + (a0 b0 ),
where all ai , bj ’s lie in Z.
The equation a0 b0 = −1 implies that (a0 , b0 ) = (1, −1) or (a0 , b0 ) = (−1, 1). By
multiplying both g1 (x) and h1 (x) by −1 if necessary, we may assume without loss
of generality that a0 = 1, b0 = −1.
Thus
−6 = a1 b0 + a0 b1 = −a1 + b1 ,
0 = a1 b1 + a0 b2 = a1 b1 + b2 ,
and
8 = a1 b2 .
At this stage of the course, it is safe to leave it to the reader to verify that this
cannot be done.
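For the impatient reader, the impossibility can also be confirmed by an exhaustive search; the sketch below (ours, not part of the notes proper) solves for b1 and b2 from the equations −a1 + b1 = −6 and a1 b2 = 8, and then tests the remaining equation a1 b1 + b2 = 0 for every divisor a1 of 8:

```python
# Search for integers a1, b1, b2 with a1*b2 = 8, a1*b1 + b2 = 0,
# and -a1 + b1 = -6 (the system derived above, with a0 = 1, b0 = -1).
solutions = []
for a1 in (-8, -4, -2, -1, 1, 2, 4, 8):       # a1 must divide 8
    b2 = 8 // a1
    b1 = a1 - 6                               # from -a1 + b1 = -6
    if a1 * b1 + b2 == 0:                     # remaining equation
        solutions.append((a1, b1, b2))

assert solutions == []                        # no such factorisation exists
```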
3. BACK TO GEOMETRY 221
Appendix
1. Let L1 denote the line passing through a and b, and let ϱ = d(a, b).
2. Now let C be a circle of radius ϱ centred at a. The circle C will
intersect L1 at two points, namely b and d. Then d(b, d) = 2ϱ.
3. Let Cb be the circle of radius 2ϱ centred at b, and Cd be the circle of radius
2ϱ centred at d. Then Cb and Cd will intersect at two points, namely y1
and y2 .
4. Let L denote the line passing through y1 and y2 . Observe that the line
segments S1 from y1 to y2 and S2 from d to b are the diagonals of the
rhombus determined by the vertices d, b, y1 and y2 , and as such, they are
perpendicular bisectors of one another. Then L is the desired line.
(In the case where a = p0 = (0, 0) and b = p1 = (1, 0), we find that y1 = (0, √3)
and y2 = (0, −√3). The line L connecting these two points is then the y-axis, which
is clearly perpendicular to the x-axis and passes through a = (0, 0).)
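The coordinates of y1 and y2 can be verified numerically; the circle-intersection helper below is our own illustration, not part of the construction:

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles (assumes they meet in two points)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1**2 - r2**2 + d**2) / (2 * d)      # distance from c1 to the chord
    h = math.sqrt(r1**2 - a**2)               # half-length of the chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = -h * (y2 - y1) / d, h * (x2 - x1) / d
    return (mx + ox, my + oy), (mx - ox, my - oy)

# Circles of radius 2*rho = 2 centred at b = (1, 0) and d = (-1, 0).
p, q = circle_intersections((1, 0), 2, (-1, 0), 2)
# They meet on the y-axis, at height +/- sqrt(3).
assert abs(p[0]) < 1e-12 and abs(abs(p[1]) - math.sqrt(3)) < 1e-12
assert abs(q[0]) < 1e-12 and abs(abs(q[1]) - math.sqrt(3)) < 1e-12
```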
Figure 10.1. The line L through a perpendicular to the line through a and b.
1. Let L1 denote the line passing through a and b. If the line L2 through d
and b is perpendicular to L1 , then we are done.
2. Otherwise, let Cd be a circle of radius ϱ ∶= d(b, d) centred at d. Since ϱ is
greater than the distance from d to L1 , the circle Cd will intersect L1 at
two points, namely b and x.
3. Let Cb be the circle of radius ϱ centred at b, and Cx be the circle of radius
ϱ centred at x. Then Cb and Cx will intersect at two points, namely d and
y.
4. Let L2 denote the line passing through d and y.
We claim that L2 is perpendicular to L1 . (Obviously L2 contains d.) Indeed,
the figure (bdxy) is a rhombus, and the segments bx and dy are the diagonals of
that rhombus, which are perpendicular to each other.
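A quick numerical sanity check of this construction, with sample coordinates of our own choosing (L1 taken to be the x-axis):

```python
import math

# L1 is the x-axis; b = (0, 0) lies on L1, and d = (0.3, 0.8) does not.
b, d = (0.0, 0.0), (0.3, 0.8)
rho = math.hypot(d[0] - b[0], d[1] - b[1])    # rho = d(b, d)

# Cd (radius rho, centre d) meets the x-axis at b and at x:
# (t - 0.3)^2 + 0.8^2 = rho^2  =>  t = 0 or t = 0.6.
x = (0.6, 0.0)
assert abs(math.hypot(x[0] - d[0], x[1] - d[1]) - rho) < 1e-12

# Cb and Cx (radius rho, centres b and x) meet at d and at its mirror image y.
y = (0.3, -0.8)
assert abs(math.hypot(y[0] - b[0], y[1] - b[1]) - rho) < 1e-12
assert abs(math.hypot(y[0] - x[0], y[1] - x[1]) - rho) < 1e-12

# The line L2 through d and y is perpendicular to L1 (the x-axis).
dir_L2 = (y[0] - d[0], y[1] - d[1])           # vertical direction
dir_L1 = (1.0, 0.0)
assert dir_L2[0] * dir_L1[0] + dir_L2[1] * dir_L1[1] == 0.0
```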
Figure 10.2. The line through d and y is perpendicular to the line through a and b.
224 10. STRAIGHT-EDGE AND COMPASSES CONSTRUCTIONS
A10.3. Let L1 be a line in R2 passing through the points a and b, and let d ∈ R2
be a point not on L1 . Here we describe how to construct a line L3 through d which
is parallel to L1 : first construct the line L2 through d which is perpendicular to L1 ,
and then construct the line L3 through d which is perpendicular to L2 , using the
two constructions above. Since L1 and L3 are both perpendicular to L2 , they are
parallel.
Figure 10.3. The line L3 is parallel to the line L1 and passes through d.
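The underlying fact – two successive perpendiculars return a parallel – can be checked with a short computation (the helper perp is ours):

```python
# In R^2, if L2 is perpendicular to L1 and L3 is perpendicular to L2,
# then L3 is parallel to L1: rotating a direction vector by 90 degrees
# twice yields a (negated) multiple of the original direction.

def perp(v):
    """Rotate a direction vector by 90 degrees."""
    return (-v[1], v[0])

u = (2.0, 1.0)          # direction of L1 (arbitrary sample)
n = perp(u)             # direction of L2, perpendicular to L1
w = perp(n)             # direction of L3, perpendicular to L2

# w is parallel to u: the cross-product term vanishes.
assert u[0] * w[1] - u[1] * w[0] == 0.0
```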
Exercise 10.1.
Prove that √(3 + √(36 + 25)) is constructible.
Exercise 10.2.
Prove that it is always possible to bisect a constructible angle.
Exercise 10.3.
Prove that the angle θ can be trisected using the plane method if and only if the
polynomial
q(x) = 4x³ − 3x − cos θ
is reducible over Q(cos θ).
Exercise 10.4.
Prove that one may approximate the angle π/9 with any degree of precision using
the plane method, starting from P ○ = {p0 , p1 }.
Exercise 10.5.
Show that it is possible to construct a regular pentagon using the plane method,
starting from P ○ = {p0 , p1 }.
Exercise 10.6.
Prove that if α, β ∈ Γ○ , then √(α² + β²) ∈ Γ○ . Is the converse true?
Exercise 10.7.
Prove that it is possible to construct a regular dodecagon (i.e. a 12-sided polygon,
all of whose sides are of equal length) using the plane method, starting from P ○ =
{p0 , p1 }.
Exercise 10.8.
Prove that [Q(√(4 + 2√3)) ∶ Q] = 2.
Exercise 10.9.
Let a, b, c ∈ Γ○ with a ≠ 0 and b² − 4ac ≥ 0. Prove that
(−b ± √(b² − 4ac)) / (2a) ∈ Γ○ .
Exercise 10.10.
Prove that, given two distinct points a and b in the plane, it is possible to
construct the midpoint of the line segment from a to b using a straight-edge and
compasses.
APPENDIX A
1. Introduction
1.1. The Axiom of Choice has been mentioned earlier in these course notes, as
has Zorn’s Lemma. In this note, we shall examine these two equivalent statements,
as well as a third statement, the Well-Ordering Principle, which is also equivalent
to these two.
1.2. Set theory is the study of specific collections, called... wait for it... sets of
objects that are called elements of that set. Pure set theory deals with sets whose
elements are again sets. Experts in set theory claim that the theory of hereditarily
finite sets – i.e. those finite sets whose elements are also finite sets, whose elements
in turn are also finite sets, and so on – is formally equivalent to arithmetic,
which we shall neither define nor attempt to explain.
The form of set theory which interests us here arose out of attempts by math-
ematicians of the late 19th and early 20th centuries to try to understand infinite
sets, and originated with the work of the German mathematician Georg Cantor.
He had published a number of articles in number theory between 1867 and 1871,
but began work on what was to become set theory by 1874.
In 1878, Cantor formulated what is now referred to as the Continuum Hypothesis
(CH). Suppose that X is an infinite subset of the real line. The Continuum Hypothesis
asserts that either there exists a bijection between X and the set N of natural num-
bers, or there exists a bijection between X and the set R of real numbers.
Many well-known and leading mathematicians of his day attempted to prove
this statement, including Cantor himself, as well as David Hilbert, who included
it as the first of the twenty-three unsolved problems he presented at the Second
International Congress of Mathematicians in Paris, in 1900. Any attempt to prove this
would require one to understand not only sets of real numbers, but sets in general.
Like mangoes, mathematical theories need time to ripen. Early attempts to
define sets led to inconsistencies and paradoxes. Originally, it was thought that
any property P could be used to define a set, namely the set of all sets (or even
other objects) which satisfy property P . The mathematician-philosopher Bertrand
Russell produced the following worrisome example:
228 A. THE AXIOM OF CHOICE
Russell’s Paradox. Let P be the property of sets: X ∈/ X. That is, a set X satisfies
property P if and only if X does not belong to X.
Let
R ∶= {X a set ∶ X satisfies property P } = {X a set ∶ X ∈/ X}.
The question becomes: does R satisfy property P ? If so, then R ∈/ R, and so R ∈ R.
If not, then R ∈ R, so R ∈/ R.
This is not a good state of affairs, and the German mathematician Ernst Zer-
melo, who was also aware of this paradox, thought that set theory should be ax-
iomatised in order to be rigorous and preclude paradoxes such as the one above. In
1908, Zermelo produced a first axiomatisation of set theory, although he was unable
at that time to prove that his axiomatic system was consistent. There was also
a second problem, insofar as some logicians/set theorists were concerned, namely:
Zermelo avoided Russell’s Paradox by means of an axiom which he referred to as
the Separation Axiom. This, however, he stated using second order logic, which is
considered less desirable (in the sense that it requires stronger hypotheses) than
the version offered by the Norwegian Thoralf Skolem and the German Abraham
Fraenkel, which relied only on so-called first-order logic.
As it happens, the Hungarian mathematician John von Neumann also added
to the list of so-called Zermelo-Fraenkel Axioms we shall enumerate below by
introducing the Axiom of Foundation. We are not sufficiently versed in the history of
this area to be able to explain why Skolem and von Neumann’s names don’t appear
on the list of Axioms, although von Neumann’s name appears in so many other
mathematical contexts that even if he were alive today, he would not have the right
to complain.
1.4. Just a quick comment about the Axiom of Specification: why doesn’t it
lead right back to Russell’s Paradox? Because the Axiom of Specification assumes
that you begin with a set, and you look at the members of that set which satisfy
a property P . In Russell’s Paradox, you don’t have a starting set – instead you
manufacture a set out of thin air through a property P . This is the kind of object
we now refer to as a class, which is an object more general than a set. In fact, a
class is a collection of sets, which means that we can’t apply Russell’s paradigm to
a class of classes.
1.5. There is one Axiom we have yet to specify, which – given the title of this
note – we might want to do sooner rather than later.
● The Axiom of Choice. Let Λ be a non-empty set, and let {Aλ }λ∈Λ be a
collection of non-empty sets. Then there exists a function f ∶ Λ → ∪λ∈Λ Aλ
such that f (γ) ∈ Aγ for each γ ∈ Λ.
In layman’s terms, if you have a (non-empty) collection of (non-empty) sets,
then you can choose an element from each set. In technical terms, we call such a
function f a choice function.
We write (ZFC) to indicate (ZF) plus the Axiom of Choice.
1.7. I sympathise with those who think it’s obviously true. I really do. For one
thing, the Axiom of Choice becomes an issue only if we have a lot of sets – in fact,
only when we are dealing with infinitely many such sets. (Whether those sets are
finite or infinite is not the issue – the issue is how many sets we have.)
The formal reason why this is so can be summarised in the following way: (ZFC)
is a “theory in first-order logic”, and our ability to “choose” an element from a
(single) non-empty set is an application of a rule from that theory that carries
the groovy moniker existential instantiation. We do not claim to be an expert in
the theory of first-order logic, but the upshot that a hard-working, good-looking
mathematician but non-logician like me can safely take away from this statement is
that given a non-empty set, (ZF) allows you to pick an element of that set. If you
are interested in the details of first-order logic, then be aware that treatments now
exist to cure this, but also that some people exhibit these symptoms for years and
still manage to lead productive lives, both inside and outside of mathematics.
Another thing to note is that existential instantiation is not a constructive argu-
ment. That is, we are not given a method of choosing an element from a non-empty
set A – all we know is that it is possible. If we relabel A as Aλ , then saying that
existential instantiation produces an element a ∈ Aλ is the same as saying that there
exists a function f ∶ {λ} → Aλ that satisfies f (λ) = a. That is, we have our choice
function.
Having done this once, first-order logic allows us to repeat this process finitely
often. That is, given finitely many non-empty sets A1 , A2 , . . . , An , we can find a
function
f ∶ {1, 2, . . . , n} → A1 ∪ A2 ∪ ⋯ ∪ An such that f (k) ∈ Ak , 1 ≤ k ≤ n.
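In a programming language, this finite case is entirely unproblematic; a sketch (the sample family A is ours):

```python
# A choice function for finitely many non-empty sets, built in finitely
# many steps -- no Axiom of Choice needed. (next(iter(s)) plays the role
# of existential instantiation for a single non-empty set.)
A = {1: {3, 5}, 2: {"a", "b"}, 3: {(0, 0)}}   # A_1, A_2, A_3 (sample data)

f = {k: next(iter(s)) for k, s in A.items()}  # f(k) is some element of A_k

for k in A:
    assert f[k] in A[k]
```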
What first-order logic and the Zermelo-Fraenkel Axioms do not do, however, is
to carry out this procedure infinitely often, willy-nilly. We shall try to make the
expression “willy-nilly” more precise below.
many pairs, but there is no way to do this for infinitely many pairs at once. Thus in
order to pick a sock from amongst each of infinitely many pairs of socks, you need
something more than (ZF) theory, and the Axiom of Choice grants you your wish.
1.9. Let us consider how such a thing might arise in practice in a mathematical
setting. Let {At ∶ t ∈ R} be a collection of non-empty subsets ∅ ≠ At ⊆ N, one such
set for every real number t ∈ R. We would like to choose one element from each set.
Can we do this in (ZF), or do we require the Axiom of Choice?
If we had an explicit method to choose the element at ∈ At , then we could indeed
avoid the Axiom of Choice. But how can we specify an element of At without
knowing exactly what At is? The natural numbers have a very special and very
useful property: they are (well-) ordered.
In dealing with the natural numbers, we observe that they come equipped with
a partial order which we typically denote by ≤. In this setting, however, any two
natural numbers are comparable. If (X, ρ) is a partially ordered set, and if x, y ∈ X
implies that either xρy or yρx, then we say that ρ is a total order on X.
Thus (N, ≤) is a totally ordered set.
1.11. It gets even better (and yes, you should seriously ask yourself whether
you deserve it). The most useful property of N for our current purposes is that it
possesses one more very striking property:
given any non-empty subset ∅ ≠ H ⊆ N, H admits a minimum
element;
that is, there exists an element m ∈ H such that m ≤ h for all h ∈ H.
If (X, ρ) is a partially ordered set with the property that every non-empty subset
Y of X admits a minimum element, then we say that (X, ρ) is well-ordered.
The observation above is that (N, ≤) is well-ordered.
1.12. What is the relevance of this to the Axiom of Choice? Returning to the
example in paragraph 1.9, we were given a collection {At ∶ t ∈ R} of non-empty
subsets of N. Since we have just seen that (N, ≤) is well-ordered, we can specify a
choice function
f ∶ R → ∪t∈R At
t ↦ min At .
That is, we have a rule for specifying which element of At we are choosing – we
are choosing the minimum element of At which we know exists (and is unique).
We did not need to know in advance what At was, and we did not need existential
instantiation to define f (t).
Thus we don’t need the Axiom of Choice to pick an element from each subset
At simultaneously.
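The rule t ↦ min At is easily realised in code; in the sketch below the family {At} is a sample rule of our own invention, standing in for an arbitrary assignment of non-empty subsets of N to real numbers:

```python
# A definable choice function on subsets of N, exploiting well-ordering:
# no Axiom of Choice is required, since min(A_t) is a *rule*, not a choice.

def A(t: float) -> set:
    """A non-empty subset of N attached to the real number t (sample rule)."""
    n = int(abs(t)) + 1
    return {n, 2 * n, 3 * n}

def f(t: float) -> int:
    """The choice function: pick the minimum element of A_t."""
    return min(A(t))

assert f(0.0) == 1 and f(3.7) == 4 and f(-2.5) == 3
```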
1.14. We have not forgotten the second question raised in Remark 1.6, namely:
doesn’t the Axiom of Choice follow from the Axioms of (ZF)? In fact, we have already
partially addressed this – (ZF) and first-order logic do not allow us to choose an
element from each of infinitely-many sets at a time, in large part because of the
non-constructive nature of existential instantiation. (For those who are wondering,
yes, your humble author does love that phrase.)
There remains the question: does the Axiom of Choice contradict any of the (ZF)
axioms? The simple answer is “No”, although the proof that it does not is anything
but simple.
That (ZFC) is consistent (i.e. doesn’t contain any inherent contradictions) was first
demonstrated by the Austro-Hungarian mathematician Kurt Gödel in 1938. Does
that mean that we can’t do mathematics without the Axiom of Choice? Interestingly
enough, in 1962, the American mathematician Paul Cohen demonstrated that one
can assume all of the Axioms of Zermelo-Fraenkel Theory, and also assume that the
Axiom of Choice is false(!!!), and still arrive at a consistent mathematical system.
For this and for his work on the Continuum Hypothesis (he showed that (CH) is
independent of (ZFC)), he was awarded the Fields medal, which is generally consid-
ered to be the mathematical equivalent of the Nobel Prize. Given how prestigious
and merited the Nobel Peace prize invariably is, a prize in mathematics is... (we
leave it as an exercise for the reader to complete this sentence).
The issue is that this seemingly innocuous assumption has major implications.
In particular, it is known to imply that every vector space (over any field)
admits a vector-space basis. Given how useful vector space bases are when studying
linear algebra, it would be beyond tragic to lose them. Of course, one might argue
that one should instead drop the Axiom of Choice and just assume that every vector
space admits a basis. But the joke is then upon the one who argued this: as it so
happens, if we assume that every vector space admits a basis, then one can also
prove that the Axiom of Choice holds.
In other words, the Axiom of Choice is equivalent to the statement that every
vector space admits a basis. One of the statements is true if and only if the other
is true.
2.2. The number of statements that are equivalent to the Axiom of Choice is
huge. Entire books have been written on this subject – including the book Equivalents
of the Axiom of Choice, II by Herman Rubin and Jean Rubin, which includes 250
statements, each equivalent to the Axiom of Choice. Below we shall mention three
such equivalent statements, which are among the most important.
2.3. The Axiom of Choice II. This version is a slight modification of the
Axiom of Choice, and you might want to think about how you would prove that it
is equivalent to the Axiom of Choice:
Let {Aλ }λ∈Λ be a non-empty collection of non-empty disjoint sets.
That is, λ1 ≠ λ2 ∈ Λ implies that Aλ1 ∩ Aλ2 = ∅.
Then there exists a function f ∶ Λ → ∪λ∈Λ Aλ such that f (λ) ∈
Aλ for all λ ∈ Λ.
2.6. A note about maximal elements of partially ordered sets. Our decision to
use “maximal ” instead of “maximum” is not mere fancy. They usually mean different
things.
Let A = {∅, {x}, {y}, {z}, {x, y}, {x, z}, {y, z}} be the set of proper subsets of
X = {x, y, z}, and partially order A by inclusion. Thus {x} ≤ {x, y} because {x} ⊆
{x, y}.
Observe that {x, y} is a maximal element of A. Nothing is bigger than {x, y} in
A. But {z} ⊆/ {x, y}, and so {x, y} is not a maximum element of A. Note that {x, z}
is also a maximal element. Maximal elements need not be unique. (Are maximum
elements unique when they exist? Think about it!)
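This example is small enough to check mechanically; a sketch (the helper names are ours):

```python
from itertools import combinations

X = {"x", "y", "z"}
# A: all proper subsets of X, ordered by inclusion.
A = [frozenset(c) for r in range(3) for c in combinations(sorted(X), r)]

def maximal(poset):
    """Elements with nothing strictly above them in the inclusion order."""
    return {a for a in poset if not any(a < b for b in poset)}

def maximum(poset):
    """An element above everything, if one exists."""
    tops = [a for a in poset if all(b <= a for b in poset)]
    return tops[0] if tops else None

assert maximal(A) == {frozenset({"x", "y"}), frozenset({"x", "z"}),
                      frozenset({"y", "z"})}   # three maximal elements...
assert maximum(A) is None                      # ...but no maximum
```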
The issue is that a maximal element doesn’t have to be comparable to every
element in the set, while a maximum element does. In other words, a maximal
element only has to be bigger than or equal to those things that you can actually
compare it to, while a maximum element has to be bigger than or equal to everything
in the set! In this example, ∅ would be a minimum, and thus also a minimal element
of (A, ≤).
That the Well-Ordering Principle implies the Axiom of Choice is definitely within
reach of the interested reader. We leave it as an exercise.
2.8. Although your humble author has certainly not had a quote that compares
in notoriety to that mentioned above, it strikes your humble author that the Well-
Ordering Principle is not as “obviously false” as a somewhat different consequence
of the Axiom of Choice, namely the Banach-Tarski Paradox. This we shall address
in the next section.

3. The prosecution rests its case.
3.2. Perhaps the most damning evidence one can provide against the Axiom of
Choice is the following result, which the Axiom of Choice implies, and which was
alluded to above.
Note that stretching, bending, and twisting pieces is not allowed. Once you have
determined the pieces, you can only translate and rotate them.
If you are very thirsty, you might want to try this out on a bag of oranges at
home. Note that the inordinate amount of time you spend cleaning up your mess,
as well as your abject failure in accomplishing this task, might result from the fact
that most knives on the market today produce so-called “measurable” cuts of your
oranges. The same applies to grapefruit, and need we say (?), most other food
products.
The Banach-Tarski Paradox is, to coin a phrase, “paradoxical”. It is infuriating,
in large measure due to the fact that it is so highly counterintuitive. On the other
hand, once you accept the Axiom of Choice, this result follows as a consequence,
and you have no choice but to accept it and move on with your life.
4. So what to do?
4.1. Zermelo-Fraenkel theory plus Choice (ZFC) is popular because it works. It
works well. The Banach-Tarski Paradox and the existence of non-measurable sets
aside, it produces so many great results that the vast majority of mathematicians use
it without hesitation. Having said that – the Axiom of Choice is considered special,
and unlike the other (ZF) axioms. In general, when it is not required to prove a
result, a proof that avoids it is considered better than a proof that uses it. Also, it is
often considered “good form” to indicate when it is being used, although some uses
of the Axiom of Choice have become so commonplace that mathematicians expect
the reader to realise on their own that the Axiom of Choice has been used in the
argument.
There is a group of mathematicians who feel that to prove that a mathematical
object exists, one must construct it. They are referred to as constructivists (at least
this is how they are referred to in polite company). Your humble author has not
met a constructivist mathematician himself, and it occurs to your humble author as
he writes this that if we were to use constructivist logic, then we could not conclude
that one exists, could we? [How do I know that the articles that refer to them aren’t
all made up?] But this author is not a constructivist, and so is willing to accept
that they do exist.
4.2. We conclude by indicating that the Continuum Hypothesis, which was also
shown by Gödel and by Cohen to be independent of (ZFC), is not universally ac-
cepted at all. Any argument that uses the Continuum Hypothesis should mention
where it is being used, and should be avoided if there is any way at all of obtaining
the same result without it. Good results have been proven that require it, and are
usually stated as: assuming the continuum hypothesis, we show that ... .
4.3. Quiz. Can you avoid the Axiom of Choice to choose an element from
(a) a finite set?
(b) each member of an infinite set of sets, each of which has only one element?
(c) each member of a finite set of sets if each of the members is infinite?
(d) each member of an infinite set of sets of rationals?
(e) each member of an infinite set of finite sets of reals?
(f) each member of a denumerable set of sets if each of the members is denu-
merable?
(g) each member of an infinite set of sets of reals?
(h) each member of an infinite set of sets of integers?
5. Equivalences
Let us now provide a proof of the equivalence of the Axiom of Choice, Zorn’s
Lemma and the Well-Ordering Principle. We remind the reader that a poset is a
partially ordered set, as defined above. We begin with the definition of an initial
segment, which will be required in the proof.
We also remind the reader that, given a collection {Xλ }λ∈Λ of non-empty sets,
the existence of a choice function f ∶ Λ → ∪λ∈Λ Xλ such that f (λ) ∈ Xλ for all λ ∈ Λ
is precisely the statement that
∏λ∈Λ Xλ ≠ ∅,
for the simple reason that
∏λ∈Λ Xλ ∶= {f ∶ Λ → ∪λ∈Λ Xλ a function such that f (λ) ∈ Xλ for all λ ∈ Λ}.
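For a finite index set, the definition of the product as a set of choice functions can be realised directly; a sketch (the sample data are ours):

```python
from itertools import product

# For finitely many sets, each tuple in the Cartesian product is (the
# graph of) a choice function lambda -> X_lambda.
X = {"p": {1, 2}, "q": {7}, "r": {3, 4}}      # X_lambda, lambda in {p, q, r}

labels = sorted(X)                            # fix an order on the index set
tuples = list(product(*(X[l] for l in labels)))

# Every factor is non-empty, so the product is non-empty...
assert len(tuples) == 2 * 1 * 2
# ...and each tuple really is a choice function.
for t in tuples:
    assert all(t[i] in X[l] for i, l in enumerate(labels))
```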
5.2. Example.
(a) For each r ∈ R, (−∞, r) is an initial segment of (R, ≤).
(b) For each n ∈ N, {1, 2, ..., n} is an initial segment of N.
● Subclaim 2a: First we show that V itself has property L. We must show
that V is well-ordered, and that for all x ∈ V , x = f (P (V, x)).
(a) V is well-ordered.
Let ∅ =/ B ⊆ V . Then there exists A0 ⊆ X so that A0 has property
L and B ∩ A0 =/ ∅. Since A0 is well-ordered and ∅ =/ B ∩ A0 ⊆ A0 ,
m ∶= min(B ∩ A0 ) exists. We claim that m = min(B).
Suppose that y ∈ B. Then there exists A1 ⊆ X so that A1 has property
L and y ∈ A1 . Now, both A0 and A1 have property L:
◇ if A0 = A1 , then m = min(B ∩ A1 ), so m ≤ y.
◇ if A0 =/ A1 , then either
● A0 is an initial segment of A1 , so A0 = P (A1 , d) for some
d ∈ A1 . Then
m = min(B ∩ A0 ) = min(B ∩ A1 ),
since r ∈ A1 ∖ A0 implies that m < d ≤ r. Hence m ≤ y; or
● A1 is an initial segment of A0 , say A1 = P (A0 , d) ⊆ A0 for
some d ∈ A0 . Then
m = min(B ∩ A0 ) ≤ min(B ∩ A1 ).
Hence m ≤ y.
In both cases we see that m ≤ y. Since y ∈ B was arbitrary, m =
min(B).
Thus, any non-empty subset B of V has a minimum element, and so
V is well-ordered.
(b) Let x ∈ V . Then there exists A2 ⊆ X with property L so that x ∈ A2 .
Then x = f (P (A2 , x)). Suppose that y ∈ V and y < x. Then there exists