
Abstract Algebra

Joseph R. Mileti

May 5, 2014
Contents

1 Introduction
1.1 Abstraction
1.2 Power and Beauty
1.3 Abstracting the Integers
1.4 Codes and Ciphers

2 The Integers
2.1 Induction and Well-Ordering
2.2 Divisibility
2.3 Division with Remainder
2.4 GCDs and the Euclidean Algorithm
2.5 Primes and Factorizations in Z

3 Relations and Functions
3.1 Relations
3.2 Equivalence Relations
3.3 Functions
3.4 Injections, Surjections, Bijections
3.5 Defining Functions on Equivalence Classes
3.6 Modular Arithmetic

4 Introduction to Groups
4.1 Definitions
4.2 Basic Properties, Notation, and Conventions
4.3 Building Groups From Associative Operations
4.4 The Groups Z/nZ and U(Z/nZ)
4.5 The Symmetric Groups
4.6 Orders of Elements
4.7 Direct Products

5 Subgroups and Cosets
5.1 Subgroups
5.2 Generating Subgroups
5.3 The Alternating Groups
5.4 The Dihedral Groups
5.5 The Quaternion Group
5.6 The Center of a Group
5.7 Cosets
5.8 Lagrange's Theorem

6 Quotients and Homomorphisms
6.1 Quotients of Abelian Groups
6.2 Normal Subgroups and Quotient Groups
6.3 Isomorphisms
6.4 Internal Direct Products
6.5 Classifying Groups up to Isomorphism
6.6 Homomorphisms
6.7 The Isomorphism and Correspondence Theorems

7 Group Actions
7.1 Actions, Orbits, and Stabilizers
7.2 Permutations and Cayley's Theorem
7.3 The Conjugation Action and the Class Equation
7.4 Simplicity of A5
7.5 Counting Orbits

8 Cyclic, Abelian, and Solvable Groups
8.1 Structure of Cyclic Groups

9 Introduction to Rings
9.1 Definitions and Examples
9.2 Units and Zero Divisors
9.3 Polynomial Rings
9.4 Further Examples of Rings

10 Ideals and Ring Homomorphisms
10.1 Ideals, Quotients, and Homomorphisms
10.2 The Characteristic of a Ring
10.3 Polynomial Evaluation and Roots
10.4 Generating Subrings and Ideals in Commutative Rings
10.5 Prime and Maximal Ideals in Commutative Rings

11 Divisibility and Factorizations in Integral Domains
11.1 Divisibility and Associates
11.2 Irreducible and Prime Elements
11.3 Irreducible Polynomials in Q[x]
11.4 Unique Factorization Domains

12 Euclidean Domains and PIDs
12.1 Euclidean Domains
12.2 Principal Ideal Domains
12.3 Quotients of F[x]
12.4 Field of Fractions
Chapter 1

Introduction

1.1 Abstraction
Each of the sets N, Z, Q, R, and C comes equipped with naturally defined notions of addition and multiplication. Comparing these, we see many similarities. No matter which of these "universes" of numbers we work within, the order in which we add two numbers does not matter. That is, we always have a + b = b + a no matter what a and b are. There are of course some differences as well. In each of the first three sets, there is no number which when multiplied by itself gives 2, but in the latter two such a number does exist.
There is no reason to work only with objects one typically thinks of as numbers. We know how to
add and multiply vectors, polynomials, matrices, functions, etc. In many of these cases, several of the
“normal” properties of addition and multiplication are still true. For example, essentially all of the basic
properties of addition/multiplication carry over to polynomials and functions. However, occasionally the
operations of addition and multiplication that one defines fail to have the same properties. For example, the
"multiplication" of two vectors in R^3 given by the cross product is not commutative, nor is the multiplication of square matrices. When n ≥ 2, the "multiplication" of two vectors in R^n given by the dot product takes two vectors and returns an element of R rather than an element of R^n.
One fundamental feature of abstract algebra is to take the essential properties of these operations, codify
them as axioms, and then study all occasions where they arise. Of course, we first need to ask the question:
What is essential? Where do we draw the line for which properties we enforce by fiat? A primary goal
is to put enough in to force an interesting theory, but keep enough out to leave the theory as general and
robust as possible. The delicate balancing act of “interesting” and “general” is no easy task, but centuries
of research have isolated a few important collections of axioms as fitting these requirements. Groups, rings, and fields are often viewed as the most central objects, although there are many other examples (such as integral domains, semigroups, Lie algebras, etc.). Built upon these in various ways are many other structures, such
as vector spaces, modules, and algebras. We will emphasize the “big three”, but will also spend a lot of time
on integral domains and vector spaces, as well as introduce and explore a few others throughout our study.
However, before diving into a study of these strange objects, one may ask “Why do we care?”. We’ll
spend the rest of this section giving motivation, but first we introduce rough descriptions of the types of
objects that we will study at a high level so that we can refer to them. Do not worry at all about the details
at this point.
A group consists of a set of objects together with a way of combining two objects to form another
object, subject to certain axioms. For example, we’ll see that Z under addition is a group, as is R\{0} under
multiplication. More interesting examples are the set of all permutations (rearrangements) of the numbers
{1, 2, . . . , n} under a "composition" operation, and the set of all valid transformations of the Rubik's Cube
(again under a certain “composition” operation).


A ring consists of a set of objects together with two ways of combining two objects to form another object,
again subject to certain axioms. We call these operations addition and multiplication (even if they are not
“natural” versions of addition and multiplication) because the axioms assert that they behave in ways very
much like these operations. For example, we have a distributive law that says that a(b + c) = ab + ac for all
a, b, c. Besides Z, Q, R, and C under the usual operations, another ring is the set of all n × n matrices with
real entries using the addition and multiplication rules from linear algebra. Another interesting example is
“modular arithmetic”, where we work with the set {0, 1, 2, . . . , n − 1} and add/multiply as usual but cycle
back around when we get a value of n or greater. We denote these rings by Z/nZ, and they are examples of
finite rings that have many interesting number-theoretic properties.
A field is a special kind of ring that satisfies the additional requirement that every nonzero element has
a multiplicative inverse (like 1/2 is a multiplicative inverse for 2). Since elements of Z do not generally
have multiplicative inverses, the ring Z, under the usual operations, is not a field. However, Q, R, and
C are fields under the usual operations, and roughly one thinks of general fields as behaving very much
like these examples. It turns out that for prime numbers p, the ring given by "modular arithmetic" on
{0, 1, 2, . . . , p − 1} is a fascinating example of a field with finitely many elements.
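As a small illustration of this claim (not part of the text's formal development, which comes much later), one can check by brute force in Python that every nonzero element of Z/pZ has a multiplicative inverse when p is prime, while composite moduli fail; the helper name below is our own invention.

```python
# Sketch: brute-force search for multiplicative inverses in Z/nZ.
def has_all_inverses(n):
    """True if every a in {1, ..., n-1} has some b with a*b ≡ 1 (mod n)."""
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))

print(has_all_inverses(7))   # True: Z/7Z is a field
print(has_all_inverses(6))   # False: e.g. 2 has no inverse modulo 6
```

Of course, a finite search like this only confirms individual cases; the general fact requires proof.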

1.2 Power and Beauty


Throughout our study, we will encounter several fascinating and exotic mathematical objects. Not only
are these objects intrinsically beautiful and exciting to think about, but it turns out that they play a
fundamental role in understanding many of our typical mathematical objects, and that they have several
important applications, both to practical problems and to other areas of science and mathematics.
The complex numbers, formed by adding a new number i (unfortunately called “imaginary” for historical
reasons - blame Descartes!) to the real numbers and closing off under addition and multiplication, had origins
in trying to solve simple equations. We can write this set as

C = {a + bi : a, b ∈ R}

where i2 = −1. Although perhaps a mathematical curiosity at first, using these new numbers provided ways
to solve ordinary problems in mathematics that people did not know how to do using standard techniques.
In particular, they were surprisingly employed to find real roots to cubic polynomials by using mysterious
multiplications and root operations on these weird new numbers. Eventually, using complex numbers allowed
mathematicians to calculate certain real integrals that resisted solutions to the usual attacks. Since then,
complex numbers are now used in an essential way in many areas of science. They unify and simplify many
concepts and calculations in electromagnetism and fluid flow. Moreover, they are fundamentally embedded
in an essential way within quantum mechanics where “probability amplitudes” are measured using complex
numbers rather than nonnegative real numbers.
One can ask whether we might want to extend C further to include roots of other numbers, perhaps roots
of the new numbers like i. It turns out that we do not have to do this, because C already has everything
one would want in this regard! For example, one can check that
((1/√2) + (1/√2) · i)^2 = i,
so i has a square root. Not only is C closed under taking roots, but it turns out that every nontrivial
polynomial with real (or even complex) coefficients has a root in C! Thus, the complex numbers are not
“missing” anything from an algebraic point of view. We call fields with this property algebraically closed,
and we will eventually prove that C is such a field.
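The displayed identity can be sanity-checked numerically with Python's built-in complex type (floating point, so equality only holds up to rounding error); this is a check, not a proof.

```python
# Sketch: verify that (1/sqrt(2) + (1/sqrt(2))i)^2 is (numerically) i.
w = complex(1 / 2**0.5, 1 / 2**0.5)
print(abs(w * w - 1j) < 1e-12)  # True: w*w agrees with i up to rounding
```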
Even though C is such a pretty and useful object that has just about everything one could want, that
does not mean that we are unable to think about extending it further in different ways. William Hamilton

tried to do this by adding something new, let’s call it j, and thinking about how to do arithmetic with the
set {a + bi + cj : a, b, c ∈ R}. However, he was not able to define things like ij and j^2 in such a way that
the arithmetic worked out nicely. One day, he had the inspired idea of going up another dimension and
ditching commutativity of multiplication. As a result, he discovered the quaternions, which are the set of all
“numbers” of the form
H = {a + bi + cj + dk : a, b, c, d ∈ R}
where i^2 = j^2 = k^2 = −1. One also needs to explain how to multiply the elements i, j, k together, and this
is where things become especially interesting. We multiply them in a cycle so that ij = k, jk = i and
ki = j. However, when we go backwards in the cycle we introduce a negative sign so that ji = −k, kj = −i,
and ik = −j. We’ve lost commutativity of multiplication, but it turns out that H retains all of the other
essential properties, and is an interesting example of a ring that is not a field. Although H has not found
nearly as many applications to science and mathematics as C, it is still useful in some parts of physics and
to understand rotations in space. Furthermore, by understanding the limitations of H and what properties
fail, we gain a much deeper understanding of the role of commutativity of multiplication in fields like R and
C.
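Hamilton's rules can be encoded directly. The sketch below (names of our own choosing) represents a quaternion a + bi + cj + dk as a 4-tuple (a, b, c, d) and multiplies using the cycle rules just described, making the failure of commutativity concrete.

```python
# Sketch: quaternion multiplication via Hamilton's rules
# i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j (with sign flips backwards).
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j))  # (0, 0, 0, 1), i.e. k
print(qmul(j, i))  # (0, 0, 0, -1), i.e. -k: multiplication is not commutative
```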

1.3 Abstracting the Integers


Instead of extending C, we can work in certain subsets of C. One such example is the set of Gaussian integers,
which is the set
Z[i] = {a + bi : a, b ∈ Z}.
At first sight, this seems to be a strange world to work in. Unlike C, it does not in general have multiplicative
inverses. For example, there is no element of Z[i] that one can multiply 1 + i by to obtain 1. However, we’ll
see that it behaves like the integers Z in many ways, and moreover that studying its algebraic properties
gives us insight into the properties of the normal integers Z. For example, just like how we define primes in
the integers, we can define primes in Z[i] as those elements that cannot be factored in any "nontrivial" way (there are a few subtleties about how to define nontrivial, but we will sweep them under the rug for now until we study these types of objects in more detail). For example, the number 5 ∈ Z is certainly prime when viewed as an element of Z. However, it is no longer prime when viewed in the larger ring Z[i], because
we have
5 = (2 + i)(2 − i).
We also have 13 = (3 + 2i)(3 − 2i), but it turns out that 7 does not factor in any nontrivial way.
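These factorizations are easy to confirm with Python's complex numbers, and a quick search hints at why 7 resists: a nontrivial factor of 7 in Z[i] would need norm a^2 + b^2 = 7, which has no integer solutions (norms are developed properly later; this is just an illustration).

```python
# Sketch: confirm 5 = (2+i)(2-i) and 13 = (3+2i)(3-2i) in Z[i].
print((2 + 1j) * (2 - 1j))  # (5+0j)
print((3 + 2j) * (3 - 2j))  # (13+0j)

# No integers a, b satisfy a^2 + b^2 = 7 (3^2 = 9 already exceeds 7).
print(any(a*a + b*b == 7 for a in range(3) for b in range(3)))  # False
```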
Which primes in Z are no longer prime in this new world? Suppose that p ∈ Z is a prime which is the
sum of two squares. We can then fix a, b ∈ Z with p = a^2 + b^2. In the Gaussian integers Z[i] we have

p = a^2 + b^2 = (a + bi)(a − bi)

so the number p, which is prime in Z, factors in an interesting manner over the larger ring Z[i]. This is precisely what happened with 5 = 2^2 + 1^2 and 13 = 3^2 + 2^2 above. There is a converse to this result as well.
In fact, we will eventually be able to show the following theorem that also connects all of these ideas to facts
about modular arithmetic.

Theorem 1.3.1. Let p ∈ Z be an odd prime. The following are equivalent.

1. p is a sum of two squares, i.e. there exist a, b ∈ Z with p = a^2 + b^2.

2. p is not prime in Z[i].

3. −1 is a square modulo p, i.e. there exists x ∈ Z such that x^2 ≡ −1 (mod p).



4. p ≡ 1 (mod 4).
This theorem establishes a connection between sums of squares, factorizations in Z[i], and squares in the
ring Z/pZ (the ring representing “modular arithmetic”). Thus, from this simple example, we see how basic
number-theoretic questions can be understood and hopefully solved by studying these more exotic algebraic
objects. Moreover, it opens the door to many interesting questions. For example, this theorem gives a
simple characterization of when −1 is a square in Z/pZ, but determining which other elements are squares is
a fascinating problem leading to the beautiful result known as Quadratic Reciprocity.
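Though the proof must wait, the equivalences in Theorem 1.3.1 can be spot-checked by brute force for small odd primes; the naive primality helper below is our own and is fine at this scale.

```python
# Sketch: check two of Theorem 1.3.1's equivalences against p ≡ 1 (mod 4).
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

for p in range(3, 60):
    if not is_prime(p):
        continue
    sum_of_two_squares = any(a*a + b*b == p for a in range(p) for b in range(p))
    minus_one_is_square = any((x*x + 1) % p == 0 for x in range(p))
    assert sum_of_two_squares == minus_one_is_square == (p % 4 == 1)
print("Theorem 1.3.1's equivalences check out for odd primes below 60")
```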
Furthermore, we see from the above theorem that p can be written as the sum of two squares exactly
when p (which is prime in Z) fails to be prime in the larger ring Z[i]. We have turned a number-theoretic
question into one about factorizations in a new ring. In order to take this perspective, we need a solid
understanding of factorizations in the ring Z[i], which we will develop. Working in N, it is well-known that
every n ≥ 2 factors uniquely (up to order) into a product of primes, but this fact is less obvious than it may
initially appear. We will see a proof of it in Chapter 2, but when we try to generalize it to other contexts,
we will encounter exotic settings where the natural generalization fails. It turns out that things work out
nicely in Z[i], and that is one of the key ideas driving the theorem.
Closely related to which primes can be written as sums of squares, consider the problem of finding integer solutions to x^2 + y^2 = z^2. Of course, this equation is of interest because of the Pythagorean Theorem, so finding integer solutions is the same as finding right triangles all of whose sides have integer lengths. For example, we have 3^2 + 4^2 = 5^2 and 5^2 + 12^2 = 13^2. Finding all such Pythagorean triples is an interesting problem, and one can do it using elementary number theory. However, it's also possible to work in Z[i], and notice that x^2 + y^2 = z^2 if and only if (x + yi)(x − yi) = z^2 in Z[i]. Using this and the nice factorization
properties of Z[i], one can find all solutions.
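As a taste of how this works (the parametrization below is the classical one and is not derived in the text yet), squaring the Gaussian integer m + ni gives (m^2 − n^2) + 2mn·i, and comparing norms yields the familiar Pythagorean triples.

```python
# Sketch: generate Pythagorean triples from squares of Gaussian integers:
# (m + ni)^2 has real part m^2 - n^2, imaginary part 2mn, and norm (m^2 + n^2)^2.
def triple(m, n):
    return (m*m - n*n, 2*m*n, m*m + n*n)

print(triple(2, 1))  # (3, 4, 5)
print(triple(3, 2))  # (5, 12, 13)
for m in range(2, 6):
    for n in range(1, m):
        x, y, z = triple(m, n)
        assert x*x + y*y == z*z
```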
Fermat’s Last Theorem is the statement that if n ≥ 3, then there are no nontrivial integer solutions to

x^n + y^n = z^n

(here nontrivial means that each of x, y, and z is nonzero). Fermat scribbled a note in the margin of
one of his books stating that he had a proof but the margin was too small to contain it. For centuries,
mathematicians attempted to prove this result. If there exists a nontrivial solution for some n, then some
straightforward calculations show that there must be a nontrivial solution for either n = 4 or for some odd
prime p. Fermat did provide a proof showing that

x^4 + y^4 = z^4

has no nontrivial integer solutions. Suppose then that p is an odd prime and we want to show that

x^p + y^p = z^p

has no nontrivial solution. The idea is to generalize the above concepts from Z[i] to factor the left-hand side.
Although it may not be obvious at this point, by setting ζ = e^(2πi/p), it turns out that

x^p + y^p = (x + y)(x + ζy)(x + ζ^2 y) · · · (x + ζ^(p−1) y)

Thus, if (x, y, z) is a nontrivial solution, we have

(x + y)(x + ζy)(x + ζ^2 y) · · · (x + ζ^(p−1) y) = z^p

The idea then is to work in the ring

Z[ζ] = {a_0 + a_1 ζ + a_2 ζ^2 + · · · + a_(p−1) ζ^(p−1) : a_i ∈ Z}

Working with these strange “numbers”, Lamé put forward an argument asserting that there were no nontriv-
ial solutions, thus claiming a solution to Fermat’s Last Theorem. Liouville pointed out that this argument

relied essentially on the ring Z[ζ] having nice factorization properties like Z and Z[i], and that this was not
obvious. In fact, there do exist some primes p where these properties fail in Z[ζ], and in these situations
Lamé’s argument fell apart. Attempts to patch these arguments, and to generalize Quadratic Reciprocity,
led to many fundamental ideas in ring theory. By the way, a correct solution to Fermat’s Last Theorem was
eventually completed by Andrew Wiles in 1994, using ideas and methods from abstract algebra much more
sophisticated than these.

1.4 Codes and Ciphers


Amongst the most simple kind of ciphers is something called a substitution cipher. The basic idea is that
we replace each letter of a message with another letter. For example, we replace every E with a W , every
X with a P, etc. In order for this to work so that one can decode the message, we need to ensure that we do not use the same code letter for two different ordinary letters, i.e. we cannot replace every A with
an N and also every B with an N . With this in mind, every substitution cipher is given by a permutation,
or rearrangement, of the letters in the set {A, B, C, . . . , Z}. In other words, if we code the letters by
numbers, then the ciphering mechanism is done by choosing an element of the group of all permutations of
{1, 2, 3, . . . , 26}.
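Concretely, one might model a substitution cipher in Python as follows. This is a sketch with a fixed random seed so the permutation is reproducible; none of the names come from the text.

```python
# Sketch: a substitution cipher as a permutation of the alphabet.
import random

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
rng = random.Random(0)                       # fixed seed for reproducibility
permuted = "".join(rng.sample(alphabet, 26)) # a random permutation of A-Z

encode = str.maketrans(alphabet, permuted)
decode = str.maketrans(permuted, alphabet)

msg = "ATTACK AT DAWN"
ciphertext = msg.translate(encode)
assert ciphertext.translate(decode) == msg   # decoding inverts the permutation
```

Note that decoding is possible precisely because a permutation is invertible, which is the point made above about never reusing a code letter.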
Although this cipher might work against a naive and unsophisticated eavesdropper, it is not at all secure.
In fact, newspapers and magazines often use them as puzzles. The trick is that in any given language such as
English, letters occur with different frequencies. Thus, given an encoded message of any reasonable length,
one can analyze frequencies of letters to obtain the most likely encodings of common letters like E and T . By
making such a guess, one can then use that assumption to fill in (the most likely possibilities for) additional
letters. Although it's certainly possible to guess incorrectly and need to backtrack, using this
method one can almost always decipher the message quite quickly.
In order to combat this simple attack, people developed substitution schemes that changed based on the
position. For example, we can use five different permutations, and switch between them based on the position
of the letter. Thus, the letters in position 1, 6, 11, 16, etc. will all use one permutation, the letters in position
2, 7, 12, 17, etc. will use another, and so on. This approach works better, but it’s still susceptible to the
same frequency analysis attack if the message is long enough. If the person doing the coding doesn’t divulge
the number of permutations, or sometimes changes the number, then this adds some security. However, a
sophisticated eavesdropper can still break these systems. Also, if the person doing the coding does change the
number of permutations, there needs to be some way to communicate this change and the new permutations
to the receiver, which can be difficult.
In the previous example, the coding and decoding become unwieldy to carry out by hand if one uses more
than a few permutations. However, in the 20th century, people created machines to mechanize this process
so that the period (or number of steps before returning to the same permutation) was incredibly large.
The most famous of these is known as the Enigma machine. Although initially developed for commercial
purposes, the German military adopted enhanced versions of these machines to encode and decode military
communications. The machines used by the Germans involved 3 or 4 “rotors”. With these rotors fixed in
a given position, the machine had encoded within it a certain permutation of the letters. However, upon
pressing a key to learn the output letter according to that permutation, the machine would also turn (at
least) one of the rotors, thus resulting in a new permutation of the letters. The period of a 3-rotor machine
was 26^3 = 17,576, so as long as the messages were less than this length, a permutation would not be repeated
in the same message, and hence basic frequency analysis would not work.
If one knew the basic wiring of the rotors and some likely words (and perhaps possible positions of these
words) that often occurred in military communications, then one can start to mount an attack on the Enigma
machine. To complicate matters, the German military used machines with a manual “plugboard” in order to
provide additional letter swaps in a given message. Although this certainly made cryptanalysis more difficult,
some group theory (especially analysis of an operation called “conjugation” in the group of permutations
{1, 2, . . . , n}) played an essential role in the efforts of the Polish and British to decode messages. By the
middle of World War II, the British, using group theory, statistical analysis, specially built machines called
“Bombes”, and a lot of hard human work, were able to decode the majority of the military communications
using the Enigma machines. The resulting intelligence played an enormous role in allied operations and their
eventual victory.
Modern day cryptography makes even more essential use of the algebraic objects we will study. Public-
key cryptosystems and key-exchange algorithms like RSA and classical Diffie-Hellman work in the ring
Z/nZ of modular arithmetic. More sophisticated systems involve exotic finite groups based on certain
“elliptic” curves (and these curves themselves are defined over interesting finite fields). Even private-key
cryptosystems in widespread use often use these algebraic structures. For example, AES (the “Advanced
Encryption Standard") works by performing arithmetic in a field with 2^8 = 256 many elements.
Chapter 2

The Integers

2.1 Induction and Well-Ordering


We first recall three fundamental and intertwined facts about the natural numbers, namely Induction, Strong
Induction, and Well-Ordering on N. We will not attempt to “prove” them or argue in what sense they are
equivalent because in order to do so in a satisfying manner we would need to make clear our fundamental
assumptions about N. In any case, I hope that each of these facts is intuitively clear with a bit of thought.

Fact 2.1.1 (Principle of Mathematical Induction on N). Let X ⊆ N. Suppose that

• 0 ∈ X (the base case)

• n + 1 ∈ X whenever n ∈ X (the inductive step)

We then have that X = N.

Here’s the intuitive argument for why induction is true. By the first assumption, we know that 0 ∈ X.
Since 0 ∈ X, the second assumption tells us that 1 ∈ X. Since 1 ∈ X, the second assumption again tells us
that 2 ∈ X. By repeatedly applying the second assumption in this manner, each element of N is eventually
determined to be in X.
The way that induction works above is that 5 is shown to be an element of X using only the assumption
that 4 is an element of X. However, once we’ve arrived at 5, we’ve already shown that 0, 1, 2, 3, 4 ∈ X, so
why shouldn’t we be able to make use of all of these assumptions when arguing that 5 ∈ X? The answer is
that we can, and this version of induction is sometimes called strong induction.

Fact 2.1.2 (Principle of Strong Induction on N). Let X ⊆ N. Suppose that n ∈ X whenever k ∈ X for all
k ∈ N with k < n. We then have that X = N.

Thus, when arguing that n ∈ X, we are allowed to assume that we know all smaller numbers are in X.
Notice that with this formulation we can even avoid the base case of checking 0 because of a technicality: If
we have the above assumption, then 0 ∈ X because vacuously k ∈ X whenever k ∈ N satisfies k < 0, simply
because no such k exists. If that twist of logic makes you uncomfortable, feel free to argue a base case of 0
when doing strong induction.
The last of the three fundamental facts looks different from induction, but, like induction, it is based on the idea that the natural numbers start with 0 and are built by taking one discrete step forward at a time.

Fact 2.1.3 (Well-Ordering Property of N). Suppose that X ⊆ N with X ≠ ∅. There exists k ∈ X such that k ≤ n for all n ∈ X.


To see intuitively why this is true, suppose that X ⊆ N with X ≠ ∅. If 0 ∈ X, then we can take k = 0. Suppose not. If 1 ∈ X, then since 1 ≤ n for all n ∈ N with n ≠ 0, we can take k = 1. Continue on. If we keep going until we eventually find 7 ∈ X, then we know that 0, 1, 2, 3, 4, 5, 6 ∉ X, so we can take k = 7. If we keep going forever consistently finding that each natural number is not in X, then we have determined that X = ∅, which is a contradiction.
We now give an example of proof by induction. Notice that in this case we start with the base case of 1
rather than 0.
Theorem 2.1.4. For any n ∈ N+, we have

Σ_{k=1}^{n} (2k − 1) = n^2,

i.e.

1 + 3 + 5 + 7 + · · · + (2n − 1) = n^2.
Proof. We prove the result by induction. If we want to formally apply the above statement of induction, we are letting

X = {n ∈ N+ : Σ_{k=1}^{n} (2k − 1) = n^2}

and using the principle of induction to argue that X = N+. More formally still, if you feel uncomfortable starting with 1 rather than 0, we are letting

X = {0} ∪ {n ∈ N+ : Σ_{k=1}^{n} (2k − 1) = n^2}

and using the principle of induction to argue that X = N, then forgetting about 0 entirely. In the future, we will not bother to make these pedantic diversions to shoehorn our arguments into the technical versions expressed above, but you should know that it is always possible to do so.
• Base Case: Suppose that n = 1. We have

Σ_{k=1}^{1} (2k − 1) = 2 · 1 − 1 = 1,

so the left-hand side is 1. The right-hand side is 1^2 = 1. Therefore, the result is true when n = 1.

• Inductive Step: Suppose that for some fixed n ∈ N+ we know that

Σ_{k=1}^{n} (2k − 1) = n^2.

Notice that 2(n + 1) − 1 = 2n + 2 − 1 = 2n + 1, hence

Σ_{k=1}^{n+1} (2k − 1) = [Σ_{k=1}^{n} (2k − 1)] + [2(n + 1) − 1]
                      = [Σ_{k=1}^{n} (2k − 1)] + (2n + 1)
                      = n^2 + (2n + 1)    (by induction)
                      = (n + 1)^2.

Since the result holds of 1 and it holds of n + 1 whenever it holds of n, we conclude that the result holds for
all n ∈ N+ by induction.
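A quick numerical spot-check of the identity just proved (no finite check replaces the induction argument, of course):

```python
# Sketch: verify 1 + 3 + ... + (2n-1) = n^2 for small n.
for n in range(1, 100):
    assert sum(2*k - 1 for k in range(1, n + 1)) == n * n
print("1 + 3 + ... + (2n-1) = n^2 holds for n = 1, ..., 99")
```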

Our other example of induction will be the Binomial Theorem, which tells us how to completely expand a binomial to a power, i.e. how to expand the expression (x + y)^n. For the first few values we have:

• (x + y)^1 = x + y

• (x + y)^2 = x^2 + 2xy + y^2

• (x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3

• (x + y)^4 = x^4 + 4x^3 y + 6x^2 y^2 + 4xy^3 + y^4

We seek to determine the coefficients of the various x^i y^j terms.

Definition 2.1.5. Let n, k ∈ N with k ≤ n. We define

    \binom{n}{k} = \frac{n!}{k! · (n − k)!}

In combinatorics, we see that \binom{n}{k} is the number of subsets of an n-element set of size k. In particular,
the number \binom{n}{k} is always an integer, which might be surprising from a naive glance at the formula. The
following fundamental lemma gives a recursive way to calculate \binom{n}{k}.


Lemma 2.1.6. Let n, k ∈ N with k ≤ n. We have

    \binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}

Proof. One extremely unenlightening proof is to expand out the formula on the right and do terrible algebraic
manipulations on it. If you haven't done so, I encourage you to do it. If we believe the combinatorial
description of \binom{n}{k}, here's a more meaningful combinatorial argument. Let n, k ∈ N with k ≤ n. Consider a
set X with n + 1 many elements. To determine \binom{n+1}{k+1}, we need to count the number of subsets of X of size
k + 1. We do this as follows. Fix an arbitrary a ∈ X. Now an arbitrary subset of X of size k + 1 fits into
exactly one of the following types.

• The subset has a as an element. In this case, to completely determine the subset, we need to pick the
remaining k elements of the subset from X\{a}. Since X\{a} has n elements, the number of ways to
do this is \binom{n}{k}.

• The subset does not have a as an element. In this case, to completely determine the subset, we need
to pick all k + 1 elements of the subset from X\{a}. Since X\{a} has n elements, the number of ways
to do this is \binom{n}{k+1}.

Putting this together, we conclude that the number of subsets of X of size k + 1 equals \binom{n}{k} + \binom{n}{k+1}.
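The lemma also translates directly into a recursive computation of binomial coefficients. The following sketch is our own illustration (the function name is ours), using the lemma rewritten with shifted indices as C(n, k) = C(n − 1, k − 1) + C(n − 1, k):

```python
def binom(n, k):
    """Compute the binomial coefficient C(n, k) recursively via Lemma 2.1.6,
    rewritten as C(n, k) = C(n-1, k-1) + C(n-1, k), with C(n, 0) = C(n, n) = 1."""
    if k == 0 or k == n:
        return 1
    return binom(n - 1, k - 1) + binom(n - 1, k)

# Rows 0 through 4 of Pascal's triangle:
for n in range(5):
    print([binom(n, k) for k in range(n + 1)])
```

For example, binom(4, 2) returns 6, matching the coefficient of x^2 y^2 in the expansion of (x + y)^4 listed earlier.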

Theorem 2.1.7 (Binomial Theorem). Let x, y ∈ R and let n ∈ N+ . We have

    (x + y)^n = \binom{n}{0} x^n + \binom{n}{1} x^{n-1} y + · · · + \binom{n}{n-1} x y^{n-1} + \binom{n}{n} y^n
              = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k

Proof. We prove the result by induction. The base case is trivial. Suppose that we know the result for a
given n ∈ N+ . We have

    (x + y)^{n+1} = (x + y) · (x + y)^n
                  = (x + y) · [\binom{n}{0} x^n + \binom{n}{1} x^{n-1} y + · · · + \binom{n}{n-1} x y^{n-1} + \binom{n}{n} y^n]
                  = \binom{n}{0} x^{n+1} + \binom{n}{1} x^n y + \binom{n}{2} x^{n-1} y^2 + · · · + \binom{n}{n-1} x^2 y^{n-1} + \binom{n}{n} x y^n
                      + \binom{n}{0} x^n y + \binom{n}{1} x^{n-1} y^2 + · · · + \binom{n}{n-2} x^2 y^{n-1} + \binom{n}{n-1} x y^n + \binom{n}{n} y^{n+1}
                  = x^{n+1} + [\binom{n}{1} + \binom{n}{0}] x^n y + [\binom{n}{2} + \binom{n}{1}] x^{n-1} y^2 + · · · + [\binom{n}{n} + \binom{n}{n-1}] x y^n + y^{n+1}
                  = \binom{n+1}{0} x^{n+1} + \binom{n+1}{1} x^n y + \binom{n+1}{2} x^{n-1} y^2 + · · · + \binom{n+1}{n} x y^n + \binom{n+1}{n+1} y^{n+1}

where we have used the lemma to combine each of the sums to get the last line.
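As a quick sanity check on the theorem (our own numerical illustration, not part of the argument), both sides can be compared for integer inputs:

```python
from math import comb  # comb(n, k) is the binomial coefficient C(n, k)

def expand(x, y, n):
    # Right-hand side of the Binomial Theorem: sum of C(n, k) * x^(n-k) * y^k
    return sum(comb(n, k) * x ** (n - k) * y ** k for k in range(n + 1))

# Compare against direct computation of (x + y)^n:
assert expand(3, 5, 4) == (3 + 5) ** 4
assert expand(-2, 7, 9) == (-2 + 7) ** 9
```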

2.2 Divisibility
Definition 2.2.1. Let a, b ∈ Z. We say that a divides b, and write a | b, if there exists m ∈ Z with b = am.

For example, we have 2 | 6 because 2 · 3 = 6 and −3 | 21 because −3 · 7 = 21. We also have that 2 ∤ 5
since it is “obvious” that no such integer exists. If you are uncomfortable with that (and there is certainly
reason to be), we will be much more careful about such statements in a couple of sections.
Notice that a | 0 for every a ∈ Z because a · 0 = 0 for all a ∈ Z. In particular, we have 0 | 0 because as
noted we have 0 · 0 = 0. Of course we also have 0 · 3 = 0 and in fact 0 · m = 0 for all m ∈ Z, so every integer
serves as a “witness” that 0 | 0. Our definition says nothing about the m ∈ Z being unique.

Proposition 2.2.2. If a | b and b | c, then a | c.

Proof. Since a | b, there exists m ∈ Z with b = am. Since b | c, there exists n ∈ Z with c = bn. We then have

c = bn = (am)n = a(mn)

Since mn ∈ Z, it follows that a | c.

Proposition 2.2.3.

1. If a | b, then a | bk for all k ∈ Z.

2. If a | b and a | c, then a | (b + c).

3. If a | b and a | c, then a | (bm + cn) for all m, n ∈ Z.

Proof.

1. Let k ∈ Z. Since a | b, there exists m ∈ Z with b = am. We then have

bk = (am)k = a(mk)

Since mk ∈ Z, it follows that a | bk.



2. Since a | b, there exists m ∈ Z with b = am. Since a | c, there exists n ∈ Z with c = an. We then have

b + c = am + an = a(m + n)

Since m + n ∈ Z, it follows that a | b + c.

3. Let m, n ∈ Z. Since a | b, we conclude from part 1 that a | bm. Since a | c, we conclude from part 1
again that a | cn. Using part 2, it follows that a | (bm + cn).

Proposition 2.2.4. Suppose that a, b ∈ Z. If a | b and b ≠ 0, then |a| ≤ |b|.

Proof. Suppose that a | b with b ≠ 0. Fix d ∈ Z with ad = b. Since b ≠ 0, we have d ≠ 0. Thus, |d| ≥ 1, and
so

    |b| = |ad| = |a| · |d| ≥ |a| · 1 = |a|

Corollary 2.2.5. Suppose that a, b ∈ Z. If a | b and b | a, then either a = b or a = −b.

Proof. Suppose first that a ≠ 0 and b ≠ 0. By the previous Proposition, we know that both |a| ≤ |b| and
|b| ≤ |a|. It follows that |a| = |b|, and hence either a = b or a = −b.
Suppose now that a = 0. Since a | b, we may fix m ∈ Z with b = am. We then have b = am = 0m = 0
as well. Therefore, a = b.
Suppose finally that b = 0. Since b | a, we may fix m ∈ Z with a = bm. We then have a = bm = 0m = 0
as well. Therefore, a = b.

2.3 Division with Remainder


The primary goal of this section is to prove the following deeply fundamental result.

Theorem 2.3.1. Let a, b ∈ Z with b ≠ 0. There exist unique q, r ∈ Z such that a = qb + r and 0 ≤ r < |b|.
Uniqueness here means that if a = q1 b + r1 with 0 ≤ r1 < |b| and a = q2 b + r2 with 0 ≤ r2 < |b|, then q1 = q2
and r1 = r2 .

Here are a bunch of examples illustrating existence:

• Let a = 5 and b = 2. We have 5 = 2 · 2 + 1

• Let a = 135 and b = 45. We have 135 = 3 · 45 + 0

• Let a = 60 and b = 9. We have 60 = 6 · 9 + 6

• Let a = 29 and b = −11. We have 29 = (−2)(−11) + 7

• Let a = −45 and b = 7. We have −45 = (−7) · 7 + 4

• Let a = −21 and b = −4. We have −21 = 6 · (−4) + 3
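These examples can be reproduced mechanically. The sketch below is our own illustration (the function name is ours); note that Python's built-in // and % produce a remainder with the sign of b, so a correction step is needed to match the convention 0 ≤ r < |b|:

```python
def div_with_remainder(a, b):
    """Return (q, r) with a == q * b + r and 0 <= r < |b|.  Assumes b != 0."""
    q, r = a // b, a % b  # in Python, r has the same sign as b
    if r < 0:             # shift r into the range 0 <= r < |b|
        q, r = q + 1, r - b
    return q, r

# The examples above:
assert div_with_remainder(29, -11) == (-2, 7)
assert div_with_remainder(-45, 7) == (-7, 4)
assert div_with_remainder(-21, -4) == (6, 3)
```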

We begin by proving existence via a sequence of lemmas, starting in the case where a, b are natural
numbers rather than just integers.

Lemma 2.3.2. Let a, b ∈ N with b > 0. There exist q, r ∈ N such that a = qb + r and 0 ≤ r < b.

Proof. Fix b ∈ N with b > 0. For this fixed b, we prove the existence of q, r for all a ∈ N by induction. That
is, for this fixed b, we define

X = {a ∈ N : There exist q, r ∈ N with a = qb + r and 0 ≤ r < b}

and show that X = N by induction.


• Base Case: Suppose that a = 0. We then have a = 0 · b + 0 and clearly 0 < b, so we may take q = 0
and r = 0.
• Inductive Step: Suppose that we know the result for a given a ∈ N. Fix q, r ∈ Z with 0 ≤ r < b such
that a = qb + r. We then have a + 1 = qb + (r + 1). Since r, b ∈ N with r < b, we know that r + 1 ≤ b.
If r + 1 < b, then we are done. Otherwise, we have r + 1 = b, hence

a + 1 = qb + (r + 1)
= qb + b
= (q + 1)b
= (q + 1)b + 0

and so we may take q + 1 and 0.


The result follows by induction.
Lemma 2.3.3. Let a, b ∈ Z with b > 0. There exist q, r ∈ Z such that a = qb + r and 0 ≤ r < b.
Proof. If a ≥ 0, we are done by the previous lemma. Suppose that a < 0. We then have −a > 0, so
by the previous lemma we may fix q, r ∈ N with 0 ≤ r < b such that −a = qb + r. We then have
a = −(qb + r) = (−q)b + (−r). If r = 0, then −r = 0 and we are done. Otherwise we have 0 < r < b and

a = (−q)b + (−r)
= (−q)b − b + b + (−r)
= (−q − 1)b + (b − r)

Now since 0 < r < b, we have 0 < b − r < b, so this gives existence.
Lemma 2.3.4. Let a, b ∈ Z with b ≠ 0. There exist q, r ∈ Z such that a = qb + r and 0 ≤ r < |b|.
Proof. If b > 0, we are done by the previous lemma. Suppose that b < 0. We then have −b > 0, so by the
previous lemma we can fix q, r ∈ N with 0 ≤ r < −b and a = q(−b) + r. We then have a = (−q)b + r and
we are done because |b| = −b.
With that sequence of lemmas building to existence now in hand, we finish off the proof of the theorem.
Proof of Theorem 2.3.1. The final lemma above gives us existence. Suppose that

q1 b + r1 = a = q2 b + r2

where 0 ≤ r1 < |b| and 0 ≤ r2 < |b|. We then have

b(q2 − q1 ) = r1 − r2

hence b | (r2 − r1 ). Now −|b| < −r1 ≤ 0, so adding this to 0 ≤ r2 < |b|, we conclude that

−|b| < r2 − r1 < |b|



and therefore
|r2 − r1 | < |b|
Now if r2 − r1 ≠ 0, then since b | (r2 − r1 ), we would conclude that |b| ≤ |r2 − r1 |, a contradiction. It follows
that r2 − r1 = 0, and hence r1 = r2 . Since

q1 b + r1 = q2 b + r2

and r1 = r2 , we conclude that q1 b = q2 b. Now b ≠ 0, so it follows that q1 = q2 .


Proposition 2.3.5. Let a, b ∈ Z with b ≠ 0. Write a = qb + r for the unique choice of q, r ∈ Z with
0 ≤ r < |b|. We then have that b | a if and only if r = 0.
Proof. If r = 0, then a = qb + r = bq, so b | a. Suppose conversely that b | a and fix m ∈ Z with a = bm.
We then have a = mb + 0 and a = qb + r, so by the uniqueness part of the above theorem, we must have
r = 0.

2.4 GCDs and the Euclidean Algorithm


Definition 2.4.1. Suppose that a, b ∈ Z. We say that d ∈ Z is a common divisor of a and b if both d | a
and d | b.
The common divisors of 120 and 84 are {±1, ±2, ±3, ±4, ±6, ±12} (we will see a careful argument below).
The common divisors of 10 and 0 are {±1, ±2, ±5, ±10}. Every element of Z is a common divisor of 0 and
0. The following little proposition is fundamental to this entire section.
Proposition 2.4.2. Suppose that a, b, q, r ∈ Z and a = qb + r (we need not have 0 ≤ r < |b|). For any
d ∈ Z, we have that d is a common divisor of a and b if and only if d is a common divisor of b and r, i.e.

{d ∈ Z : d is a common divisor of a and b} = {d ∈ Z : d is a common divisor of b and r}.

Proof. Suppose first that d is a common divisor of b and r. Since d | b, d | r, and a = qb + r = b · q + r · 1, we
may use Proposition 2.2.3 to conclude that d | a.
Conversely, suppose that d is a common divisor of a and b. Since d | a, d | b, and r = a − qb = a · 1 + b · (−q),
we may use Proposition 2.2.3 to conclude that d | r.
For example, suppose that we are trying to find the set of common divisors of 120 and 84 (we wrote them
above, but now want to justify it). We repeatedly do division to reduce the problem as follows:

120 = 1 · 84 + 36
84 = 2 · 36 + 12
36 = 3 · 12 + 0

The first line tells us that the set of common divisors of 120 and 84 equals the set of common divisors of
84 and 36. The next line tells us that the set of common divisors of 84 and 36 equals the set of common
divisors of 36 and 12. The last line tells us that the set of common divisors of 36 and 12 equals the set of
common divisors of 12 and 0. Now the set of common divisors of 12 and 0 is simply the set of divisors of 12
(because every number divides 0). Putting it all together, we conclude that the set of common divisors of
120 and 84 equals the set of divisors of 12.
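The repeated-division procedure is easy to mechanize; the following sketch (the function name is ours) carries out exactly the chain of divisions above:

```python
def gcd(a, b):
    """Greatest common divisor via the Euclidean algorithm."""
    a, b = abs(a), abs(b)
    while b != 0:
        a, b = b, a % b  # replace the pair (a, b) by (b, r), where a = qb + r
    return a

assert gcd(120, 84) == 12
assert gcd(10, 0) == 10
assert gcd(0, 0) == 0
```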
Definition 2.4.3. Let a, b ∈ Z. We say that an element d ∈ Z is a greatest common divisor of a and b if:
• d ≥ 0

• d is a common divisor of a and b.

• Whenever c ∈ Z is a common divisor of a and b, we have c | d.

Notice that we are not defining the greatest common divisor of a and b to be the largest divisor of a and
b. The primary reason we do not is because this description fails to capture the most fundamental property
(namely that of being divisible by all other divisors, not just larger than them). Furthermore, if we were
to take that definition, then 0 and 0 would fail to have a greatest common divisor because every integer is
a common divisor of 0 and 0. With this definition however, it is a straightforward matter to check that 0
satisfies the above three conditions.
Since we require more of a greatest common divisor than just picking the largest, we first need to check
that they do indeed exist. The proof is an inductive formulation of the above method of calculation.

Theorem 2.4.4. Every pair of integers a, b ∈ Z has a unique greatest common divisor.

We first sketch the idea of the proof in the case where a, b ∈ N. If b = 0, we are done because it is
simple to verify that a is a greatest common divisor of a and 0. Suppose then that b ≠ 0. Fix q, r ∈ N with
a = qb + r and 0 ≤ r < b. Now the idea is to assert inductively the existence of a greatest common divisor
of b and r because this pair is “smaller” than the pair a and b. The only issue is how to make this intuitive
idea of “smaller” precise. There are several ways to do this, but perhaps the most straightforward is to only
induct on b. Thus, our base case handles all pairs of the form (a, 0). Next, we handle all pairs of the form (a, 1),
and in doing this we can use the fact that we know the result for all pairs of the form (a′, 0). Notice that
we can even change the value of the first coordinate here, which is why we used a′. Then, we handle all
pairs of the form (a, 2), and in doing this we can use the fact that we know the result for all pairs of the form
(a′, 0) and (a′, 1). We now begin the formal argument.

Proof. We begin by proving existence only in the special case where a, b ∈ N. We use (strong) induction on
b to prove the result. That is, we let

X = {b ∈ N : For all a ∈ N, there exists a greatest common divisor of a and b}

and prove that X = N by strong induction.

• Base Case: Suppose that b = 0. Let a ∈ N be arbitrary. We then have that the set of common divisors
of a and b equals the set of divisors of a (because every integer divides 0), so a satisfies the requirement
of a greatest common divisor of a and 0. Since a ∈ N was arbitrary, we showed that there exists a
greatest common divisor of a and 0 for every a ∈ N, hence 0 ∈ X.

• Inductive Step: Suppose then that b ∈ N+ and we know the result for all smaller natural numbers. In
other words, we are assuming that c ∈ X whenever 0 ≤ c < b. We prove that b ∈ X. Let a ∈ N be
arbitrary. From above, we may fix q, r ∈ Z with a = qb + r and 0 ≤ r < b. Since 0 ≤ r < b, we know by
strong induction that r ∈ X, hence b and r have a greatest common divisor d. By Proposition 2.4.2,
the set of common divisors of a and b equals the set of common divisors of b and r. It follows that
d is a greatest common divisor of a and b. Since a ∈ N was arbitrary, we showed that there exists a
greatest common divisor of a and b for every a ∈ N, hence b ∈ X.

Therefore, we have shown that X = N, which implies that whenever a, b ∈ N, there exists a greatest common
divisor of a and b.
To turn the argument into a proof for all a, b ∈ Z, we simply note that the set of divisors of an element m ∈ Z
equals the set of divisors of −m. So, for example, if a < 0 but b ≥ 0 we can simply take a greatest common
divisor of −a and b (which exists since −a, b ∈ N) and note that it will also be a greatest common divisor of
a and b. A similar argument works if a ≥ 0 and b < 0, or if both a < 0 and b < 0.

For uniqueness, suppose that c and d are both greatest common divisors of a and b. Since d is a greatest
common divisor and c is a common divisor, we know by the last condition that c | d. Similarly, since c is
a greatest common divisor and d is a common divisor, we know by the last condition that d | c. Therefore,
either c = d or c = −d. Using the first requirement that a greatest common divisor must be nonnegative,
we must have c = d.
Definition 2.4.5. Let a, b ∈ Z. We let gcd(a, b) be the unique greatest common divisor of a and b.
For example we have gcd(120, 84) = 12 and gcd(0, 0) = 0. The following corollary is immediate from
Proposition 2.4.2.
Corollary 2.4.6. Suppose that a, b, q, r ∈ Z and a = qb + r. We have gcd(a, b) = gcd(b, r).
The method of using repeated division and this corollary to reduce the problem of calculating greatest
common divisors is known as the Euclidean Algorithm. We saw it in action above with 120 and 84. Here
is another example, where we compute gcd(525, 182). We have
525 = 2 · 182 + 161
182 = 1 · 161 + 21
161 = 7 · 21 + 14
21 = 1 · 14 + 7
14 = 2 · 7 + 0
Therefore, gcd(525, 182) = gcd(7, 0) = 7.
Theorem 2.4.7. For all a, b ∈ Z, there exist k, ` ∈ Z with gcd(a, b) = ka + `b.
Proof. We begin by proving existence in the special case where a, b ∈ N. We use induction on b to prove the
result. That is, we let
X = {b ∈ N : For all a ∈ N, there exist k, ` ∈ Z with gcd(a, b) = ka + `b}
and prove that X = N by strong induction.
• Base Case: Suppose that b = 0. Let a ∈ N be arbitrary. We then have that

    gcd(a, b) = gcd(a, 0) = a

Since a = 1 · a + 0 · b, we may let k = 1 and ` = 0. Since a ∈ N was arbitrary, we conclude that
0 ∈ X.
• Inductive Step: Suppose then that b ∈ N+ and we know the result for all smaller nonnegative values.
In other words, we are assuming that c ∈ X whenever 0 ≤ c < b. We prove that b ∈ X. Let a ∈ N be
arbitrary. From above, we may fix q, r ∈ Z with a = qb + r and 0 ≤ r < b. We also know from above
that gcd(a, b) = gcd(b, r). Since 0 ≤ r < b, we know by strong induction that r ∈ X, hence there exist
k, ` ∈ Z with
gcd(b, r) = kb + `r
Now r = a − qb, so

    gcd(a, b) = gcd(b, r)
              = kb + `r
              = kb + `(a − qb)
              = kb + `a − `qb
              = `a + (k − q`)b
Since a ∈ N was arbitrary, we conclude that b ∈ X.

Therefore, we have shown that X = N, which implies that whenever a, b ∈ N, there exist k, ` ∈ Z with
gcd(a, b) = ka + `b. As in the proof of Theorem 2.4.4, the general case where a, b ∈ Z follows by replacing
a and/or b with −a and/or −b, and correspondingly negating k and/or `.
Given a, b ∈ Z, we can explicitly calculate k, ` ∈ Z by “winding up” the work created from the Euclidean
Algorithm. For example, we saw above that gcd(525, 182) = 7 by calculating

525 = 2 · 182 + 161


182 = 1 · 161 + 21
161 = 7 · 21 + 14
21 = 1 · 14 + 7
14 = 2 · 7 + 0

We now use these steps in reverse to calculate:

7=1·7+0·0
= 1 · 7 + 0 · (14 − 2 · 7)
= 0 · 14 + 1 · 7
= 0 · 14 + 1 · (21 − 1 · 14)
= 1 · 21 + (−1) · 14
= 1 · 21 + (−1) · (161 − 7 · 21)
= (−1) · 161 + 8 · 21
= (−1) · 161 + 8 · (182 − 1 · 161)
= 8 · 182 + (−9) · 161
= 8 · 182 + (−9) · (525 − 2 · 182)
= (−9) · 525 + 26 · 182

This wraps everything up perfectly, but it is easier to simply start at the fifth line.
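The winding-up computation can also be organized as a single recursion, often called the extended Euclidean algorithm. The sketch below is our own illustration (the function name is ours); the recursion mirrors the inductive step in the proof of Theorem 2.4.7:

```python
def extended_gcd(a, b):
    """Return (d, k, l) with d = gcd(a, b) = k*a + l*b.  Assumes a, b >= 0."""
    if b == 0:
        return a, 1, 0        # gcd(a, 0) = a = 1*a + 0*b
    q, r = divmod(a, b)       # a = q*b + r with 0 <= r < b
    d, k, l = extended_gcd(b, r)
    # d = k*b + l*r = k*b + l*(a - q*b) = l*a + (k - l*q)*b
    return d, l, k - l * q

assert extended_gcd(525, 182) == (7, -9, 26)  # matches the calculation above
```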
Before moving on, we work through another proof of the existence of greatest common divisors, along
with the fact that we can write gcd(a, b) as an integer combination of a and b. This proof also works because
of Theorem 2.3.1, but it uses well-ordering and establishes existence without a method of computation. One
may ask why we bother with another proof. One answer is that this result is so fundamental and important
that two different proofs help to reinforce its value. Another reason is that we will generalize each of these
distinct proofs in Chapter 11 to slightly different settings.
Theorem 2.4.8. Let a, b ∈ Z with at least one of a and b nonzero. The set

{ma + nb : m, n ∈ Z}

has positive elements, and the least positive element is a greatest common divisor of a and b. In particular,
for any a, b ∈ Z, there exist k, ` ∈ Z with gcd(a, b) = ka + `b.
Proof. Let
S = {ma + nb : m, n ∈ Z} ∩ N+
We first claim that S 6= ∅. If a > 0, then a = 1 · a + 0 · b ∈ S. Similarly, if b > 0, then b ∈ S. If a < 0,
then −a > 0 and −a = (−1) · a + 0 · b ∈ S. Similarly, if b < 0, then −b ∈ S. Since at least one of a and b
is nonzero, it follows that S 6= ∅. By the Well-Ordering property of N, we know that S has a least element.
Let d = min(S). Since d ∈ S, we may fix k, ` ∈ Z with d = ka + `b. We claim that d is the greatest common
divisor of a and b.

First, we need to check that d is a common divisor of a and b. We begin by showing that d | a. Fix
q, r ∈ Z with a = qd + r and 0 ≤ r < d. We want to show that r = 0. We have
r = a − qd
= a − q(ak + b`)
= (1 − qk) · a + (−q`) · b
Now if r > 0, then we have shown that r ∈ S, which contradicts the choice of d as the least element of S.
Hence, we must have r = 0, and so d | a.
We next show that d | b. Fix q, r ∈ Z with b = qd + r and 0 ≤ r < d. We want to show that r = 0. We
have
r = b − qd
= b − q(ak + b`)
= (−qk) · a + (1 − q`) · b
Now if r > 0, then we have shown that r ∈ S, which contradicts the choice of d as the least element of S.
Hence, we must have r = 0, and so d | b.
Finally, we need to check the last condition for d to be the greatest common divisor. Let c be a common
divisor of a and b. Since c | a, c | b, and d = ka + `b, we may use Proposition 2.2.3 to conclude that c | d.
We end this section with a result that will play an important role in proving the Fundamental Theorem
of Arithmetic in the next section.
Definition 2.4.9. Two elements a, b ∈ Z are relatively prime if gcd(a, b) = 1.
Proposition 2.4.10. Let a, b, c ∈ Z. If a | bc and gcd(a, b) = 1, then a | c.
Proof. Since a | bc, we may fix m ∈ Z with bc = am. Since gcd(a, b) = 1, we may fix k, ` ∈ Z with ak +b` = 1.
Multiplying this last equation through by c, we conclude that akc + b`c = c, so

    c = akc + `(bc)
      = akc + `(am)
      = a(kc + `m)

It follows that a | c.

2.5 Primes and Factorizations in Z


Definition 2.5.1. An element p ∈ Z is prime if p > 1 and the only positive divisors of p are 1 and p. If
n ∈ Z with n > 1 is not prime, we say that n is composite.
We begin with the following simple fact.
Proposition 2.5.2. Every n ∈ N with n > 1 is a product of primes.
Proof. We prove the result by strong induction on n. If n = 2, we are done because 2 itself is prime. Suppose
that n > 2 and we have proven the result for all k with 1 < k < n. If n is prime, we are done. Suppose that
n is not prime and fix a divisor c | n with 1 < c < n. Fix d ∈ N with cd = n. We then have that 1 < d < n,
so by induction, both c and d are products of primes, say c = p1 p2 · · · pk and d = q1 q2 · · · q` with each pi and
qj prime. We then have
n = cd = p1 p2 · · · pk q1 q2 · · · q`
so n is a product of primes. The result follows by induction.
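The proof suggests nothing efficient, but for modest numbers, trial division produces such a factorization explicitly. The sketch below is our own illustration (the function name is ours):

```python
def prime_factorization(n):
    """Return the list of primes (with repetition) whose product is n.  Assumes n > 1."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:   # divide out each prime factor as often as possible
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains is itself prime
        factors.append(n)
    return factors

assert prime_factorization(360) == [2, 2, 2, 3, 3, 5]
assert prime_factorization(97) == [97]
```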

Corollary 2.5.3. Every a ∈ Z with a ∉ {−1, 0, 1} is either a product of primes, or −1 times a product of
primes.

Corollary 2.5.4. Every a ∈ Z with a ∉ {−1, 0, 1} is divisible by at least one prime.
Proposition 2.5.5. There are infinitely many primes.
Proof. We know that 2 is a prime, so there is at least one prime. We will take an arbitrary given finite list of
primes and show that there exists a prime which is omitted. Suppose then that p1 , p2 , . . . , pk is an arbitrary
finite list of prime numbers with k ≥ 1. We show that there exists a prime not in the list. Let
n = p1 p2 · · · pk + 1
We have n ≥ 3, so by the above corollary we know that n is divisible by some prime q. If q = pi , we would
have that q | n and also q | p1 p2 · · · pk , so q | (n − p1 p2 · · · pk ). This would imply that q | 1, so |q| ≤ 1, a
contradiction. Therefore q ≠ pi for all i, and we have succeeded in finding a prime not in the list.
The next proposition uses our hard work on greatest common divisors. It really is the most useful
property of prime numbers. In fact, when we generalize the idea of primes to more exotic and abstract
contexts in Chapter 11, we will see that this property is of fundamental importance.
Proposition 2.5.6. If p ∈ Z is prime and p | ab, then either p | a or p | b.
Proof. Suppose that p | ab and p ∤ a. Since gcd(a, p) divides p and we know that p ∤ a, we have gcd(a, p) ≠ p.
The only other positive divisor of p is 1, so gcd(a, p) = 1. Therefore, by Proposition 2.4.10, we conclude
that p | b.
Now that we’ve handled the product of two numbers, we get the following corollary about finite products
by a trivial induction.
Corollary 2.5.7. If p is prime and p | a1 a2 · · · an , then p | ai for some i.
We now have all the tools necessary to provide the uniqueness of prime factorizations.
Theorem 2.5.8 (Fundamental Theorem of Arithmetic). Every natural number greater than 1 factors
uniquely (up to order) into a product of primes. In other words, if n ≥ 2 and
p1 p2 · · · pk = n = q1 q2 · · · q`
with p1 ≤ p2 ≤ · · · ≤ pk and q1 ≤ q2 ≤ · · · ≤ q` all primes, then k = ` and pi = qi for 1 ≤ i ≤ k.
Proof. Existence follows from above. We prove the result by (strong) induction on n. Suppose first that n
is prime and
p1 p2 · · · pk = n = q1 q2 · · · q`
with p1 ≤ p2 ≤ · · · ≤ pk and q1 ≤ q2 ≤ · · · ≤ q` all primes. Notice that pi | n for all i and qj | n for all j.
Since the only positive divisors of n are 1 and n, and 1 is not prime, we conclude that pi = n for all i and
qj = n for all j. If k ≥ 2, then p1 p2 · · · pk ≥ n^k > n, a contradiction, so we must have k = 1. Similarly we
must have ` = 1. Thus, the result holds for all primes n. In particular, it holds if n = 2.
Suppose now that n > 2 is a natural number and the result is true for all smaller natural numbers.
Suppose that
p1 p2 · · · pk = n = q1 q2 · · · q`
If n is prime, we are done by the above discussion. Suppose then that n is composite. In particular, we have
k ≥ 2 and ` ≥ 2. Now p1 | q1 q2 · · · q` , so p1 | qj for some j. Since qj is prime and p1 ≠ 1, we must have
p1 = qj . Similarly, we must have q1 = pi for some i. We then have
p1 = qj ≥ q1 = pi ≥ p1

hence all inequalities must be equalities, and we conclude that p1 = q1 . Canceling, we get

p2 · · · pk = q2 · · · q`

and this common number is smaller than n. By induction, it follows that k = ` and pi = qi for 2 ≤ i ≤ k.

Given a natural number n ∈ N with n ≥ 2, when we write its prime factorization, we typically group
together like primes and write

    n = p1^α1 p2^α2 · · · pk^αk

where the pi are distinct primes. We often allow the insertion of “extra” primes in the factorization of n by
permitting some αi to equal 0. This convention is particularly useful when comparing prime factorizations
of two numbers so that we can assume that both factorizations have the same primes occurring. It also
allows us to write 1 in such a form by choosing all αi to equal 0. Here is one example.

Proposition 2.5.9. Suppose that n, d ∈ N+ . Write the prime factorizations of n and d as

    n = p1^α1 p2^α2 · · · pk^αk
    d = p1^β1 p2^β2 · · · pk^βk

where the pi are distinct primes and possibly some αi and βj are 0. We then have that d | n if and only if
0 ≤ βi ≤ αi for all i.

Proof. Suppose first that 0 ≤ βi ≤ αi for all i. We then have that αi − βi ≥ 0 for all i, so we may let

    c = p1^{α1−β1} p2^{α2−β2} · · · pk^{αk−βk} ∈ Z

Notice that

    dc = p1^β1 p2^β2 · · · pk^βk · p1^{α1−β1} p2^{α2−β2} · · · pk^{αk−βk}
       = (p1^β1 p1^{α1−β1})(p2^β2 p2^{α2−β2}) · · · (pk^βk pk^{αk−βk})
       = p1^α1 p2^α2 · · · pk^αk
       = n

hence d | n.
Conversely, suppose that d | n and fix c ∈ Z with dc = n. Notice that c > 0 because d, n > 0. Now
we have dc = n, so c | n. If q is prime and q | c, then q | n by transitivity of divisibility, so q | pi for some
i by Corollary 2.5.7, and hence q = pi for some i because each pi is prime. Thus, we can write the prime
factorization of c as

    c = p1^γ1 p2^γ2 · · · pk^γk

where again we may have some γi equal to 0. We then have

    n = dc
      = (p1^β1 p2^β2 · · · pk^βk)(p1^γ1 p2^γ2 · · · pk^γk)
      = (p1^β1 p1^γ1)(p2^β2 p2^γ2) · · · (pk^βk pk^γk)
      = p1^{β1+γ1} p2^{β2+γ2} · · · pk^{βk+γk}

By the Fundamental Theorem of Arithmetic, we have βi + γi = αi for all i. Since βi , γi , αi ≥ 0 for all i, we
conclude that βi ≤ αi for all i.

Corollary 2.5.10. Let a, b ∈ N+ and write

    a = p1^α1 p2^α2 · · · pk^αk
    b = p1^β1 p2^β2 · · · pk^βk

where the pi are distinct primes. We then have

    gcd(a, b) = p1^{min{α1,β1}} p2^{min{α2,β2}} · · · pk^{min{αk,βk}}
Corollary 2.5.11. Suppose that n > 1 and n = p1^α1 p2^α2 · · · pk^αk where the pi are distinct primes. The number
of nonnegative divisors of n is

    \prod_{i=1}^{k} (αi + 1)

and the number of integer divisors of n is

    2 · \prod_{i=1}^{k} (αi + 1)

Proof. We know that a nonnegative divisor d of n must factor as

    d = p1^β1 p2^β2 · · · pk^βk

where 0 ≤ βi ≤ αi for all i. Thus, we have αi + 1 many choices for each βi . Notice that different choices of
βi give rise to different values of d by the Fundamental Theorem of Arithmetic.
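The counting formula is easy to check against a brute-force count. The sketch below is our own illustration (the function name is ours):

```python
def divisor_count(n):
    """Count the positive divisors of n > 1 as the product of (alpha_i + 1)."""
    count, d = 1, 2
    while d * d <= n:
        alpha = 0
        while n % d == 0:   # alpha is the exponent of d in the factorization
            alpha += 1
            n //= d
        count *= alpha + 1
        d += 1
    if n > 1:               # leftover prime factor with exponent 1
        count *= 2
    return count

# 12 = 2^2 * 3 has (2 + 1)(1 + 1) = 6 positive divisors: 1, 2, 3, 4, 6, 12
assert divisor_count(12) == 6
assert divisor_count(360) == sum(1 for d in range(1, 361) if 360 % d == 0)
```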
Proposition 2.5.12. Suppose that m, n ∈ N+ , and write the prime factorization of m as

    m = p1^α1 p2^α2 · · · pk^αk

where the pi are distinct primes. We then have that m is an nth power in N if and only if n | αi for all i.

Proof. Suppose first that n | αi for all i. For each i, fix βi such that αi = nβi . Since n and each αi are
nonnegative, it follows that each βi is also nonnegative. Letting ` = p1^β1 p2^β2 · · · pk^βk , we then have

    `^n = p1^{nβ1} p2^{nβ2} · · · pk^{nβk}
        = p1^α1 p2^α2 · · · pk^αk
        = m

so m is an nth power in N.
Suppose conversely that m is an nth power in N, and write m = `^n . Since m > 1, we have ` > 1. Write
the unique prime factorization of ` as

    ` = p1^β1 p2^β2 · · · pk^βk

We then have

    m = `^n = (p1^β1 p2^β2 · · · pk^βk)^n = p1^{nβ1} p2^{nβ2} · · · pk^{nβk}

By the Fundamental Theorem of Arithmetic, we have αi = nβi for all i, so n | αi for all i.
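Proposition 2.5.12 yields a concrete test for nth powers: factor m and check that n divides every exponent. The sketch below is our own illustration (the function name is ours; it assumes m > 1 and n ≥ 1):

```python
def is_nth_power(m, n):
    """Decide whether m > 1 is an n-th power by checking n | alpha_i for each exponent."""
    d = 2
    while d * d <= m:
        alpha = 0
        while m % d == 0:
            alpha += 1
            m //= d
        if alpha % n != 0:
            return False
        d += 1
    # any leftover m > 1 is a prime appearing with exponent 1
    return m == 1 or n == 1

assert is_nth_power(64, 2) and is_nth_power(64, 3) and is_nth_power(64, 6)
assert not is_nth_power(12, 2)
```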

Theorem 2.5.13. Let m, n ∈ N with m, n ≥ 2. If the unique prime factorization of m does not have the
property that every prime exponent is divisible by n, then ⁿ√m is irrational.

Proof. We prove the contrapositive. Suppose that ⁿ√m is rational and write ⁿ√m = a/b where a, b ∈ Z with
b ≠ 0. We may assume that a, b > 0 because ⁿ√m > 0. We then have

    a^n / b^n = (a/b)^n = m

hence

    a^n = b^n · m
Write a, b, m in their unique prime factorizations as

    a = p1^α1 p2^α2 · · · pk^αk
    b = p1^β1 p2^β2 · · · pk^βk
    m = p1^γ1 p2^γ2 · · · pk^γk

where the pi are distinct (and possibly some αi , βi , γi are equal to 0). Since a^n = b^n · m, we have

    p1^{nα1} p2^{nα2} · · · pk^{nαk} = p1^{nβ1+γ1} p2^{nβ2+γ2} · · · pk^{nβk+γk}

By the Fundamental Theorem of Arithmetic, we conclude that nαi = nβi + γi for all i. Therefore, for each
i, we have γi = nαi − nβi = n(αi − βi ), and so n | γi for each i.
Chapter 3

Relations and Functions

3.1 Relations
Definition 3.1.1. Given two sets A and B, we let A × B be the set of all ordered pairs (a, b) where a ∈ A
and b ∈ B. We call A × B the Cartesian product of A and B.
For example, we have

{1, 2, 3} × {6, 8} = {(1, 6), (1, 8), (2, 6), (2, 8), (3, 6), (3, 8)}

and
N × N = {(0, 0), (0, 1), (1, 0), (2, 0), . . . , (4, 7), . . . }
We also use the notation A^2 as shorthand for A × A, so R^2 = R × R really is the set of points in the plane.
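As an aside for readers who like to compute, Python's itertools.product enumerates exactly these ordered pairs (an illustration of ours, not part of the development):

```python
from itertools import product

A = [1, 2, 3]
B = [6, 8]
pairs = list(product(A, B))  # the Cartesian product A x B as ordered pairs
assert pairs == [(1, 6), (1, 8), (2, 6), (2, 8), (3, 6), (3, 8)]
assert len(pairs) == len(A) * len(B)
```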
Definition 3.1.2. Let A and B be sets. A (binary) relation between A and B is a subset R ⊆ A × B. If
A = B, then we call a subset of A × A a (binary) relation on A.
For example, let A = {1, 2, 3} and B = {6, 8} as above. Let

R = {(1, 6), (1, 8), (3, 8)}

We then have that R is a relation between A and B, although certainly not a very interesting one. However,
we’ll use it to illustrate a few facts. First, in a relation, it’s possible for an element of A to be related to
multiple elements of B, as in the case for 1 ∈ A in our example R. Also, it’s possible that an element of A
is related to no elements of B, as in the case of 2 ∈ A in our example R.
For a geometric example, let A be the set of points in the plane, and let B be the set of lines in the plane.
We can then define R ⊆ A × B to be the set of pairs (p, L) ∈ A × B such that p is a point on L.
Here are two examples of binary relations on Z:
• L = {(a, b) ∈ Z^2 : a < b}

• D = {(a, b) ∈ Z^2 : a | b}

We have (4, 7) ∈ L but (7, 4) ∉ L. Notice that (5, 5) ∉ L but (5, 5) ∈ D.
By definition, relations are sets. However, it is typically cumbersome to use set notation to write things
like (1, 6) ∈ R. Instead, it usually makes much more sense to use infix notation and write 1R6. Moreover,
we can use better notation for the relation by using a symbol like ∼ instead of R. In this case, we would
write 1 ∼ 6 instead of (1, 6) ∈ ∼, or 2 ≁ 8 instead of (2, 8) ∉ ∼.
With this new notation, we give a few examples of binary relations on R:


• Given x, y ∈ R, we let x ∼ y if x2 + y 2 = 1.
• Given x, y ∈ R, we let x ∼ y if x2 + y 2 ≤ 1.
• Given x, y ∈ R, we let x ∼ y if x = sin y.
• Given x, y ∈ R, we let x ∼ y if y = sin x.
Again, notice from these examples that given x ∈ R, there may be 0, 1, 2, or even infinitely many y ∈ R with
x ∼ y.
If we let A be the set of all finite sequences of 0’s and 1’s, then the following are binary relations on A:
• Given σ, τ ∈ A, we let σ ∼ τ if σ and τ have the same number of 1’s.
• Given σ, τ ∈ A, we let σ ∼ τ if σ occurs as a consecutive subsequence of τ (for example, we have
010 ∼ 001101011 because 010 appears in positions 5-6-7 of 001101011).
For a final example, let A be the set consisting of the 50 states. Let R be the subset of A × A con-
sisting of those pairs of states whose postal codes have the same second letter. For example, we have
(Iowa, California) ∈ R and (Iowa, Virginia) ∈ R because the postal codes of these states are IA, CA, and VA.
We also have (Minnesota, Tennessee) ∈ R because of the postal codes MN and TN. Now (Texas, Texas)
∈ R, but there is no a ∈ A with a ≠ Texas such that (Texas, a) ∈ R because no other state has X as the
second letter of its postal code. Texas stands alone.
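Relations of this kind are concrete enough to experiment with directly. The following Python sketch (illustrative only, not part of the formal development) models the relation R between A = {1, 2, 3} and B = {6, 8} literally as a set of ordered pairs:

```python
# A binary relation between A and B, modeled literally as a set of pairs.
A = {1, 2, 3}
B = {6, 8}
R = {(1, 6), (1, 8), (3, 8)}

# R is a subset of the Cartesian product A x B.
assert R <= {(a, b) for a in A for b in B}

# An element of A may be related to several elements of B, or to none.
related_to_1 = {b for (a, b) in R if a == 1}   # 1 relates to both 6 and 8
related_to_2 = {b for (a, b) in R if a == 2}   # 2 relates to nothing
assert related_to_1 == {6, 8}
assert related_to_2 == set()
```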

3.2 Equivalence Relations


Definition 3.2.1. An equivalence relation on a set A is a binary relation ∼ on A having the following three
properties:
• ∼ is reflexive: a ∼ a for all a ∈ A.
• ∼ is symmetric: Whenever a, b ∈ A satisfy a ∼ b, we have b ∼ a.
• ∼ is transitive: Whenever a, b, c ∈ A satisfy a ∼ b and b ∼ c, we have a ∼ c.
Consider the binary relation ∼ on Z where a ∼ b means that a ≤ b. Notice that ∼ is reflexive because
a ≤ a for all a ∈ Z. Also, ∼ is transitive because if a ≤ b and b ≤ c, then a ≤ c. However, ∼ is not
symmetric because 3 ∼ 4 but 4 ≁ 3. Thus, although ∼ satisfies two out of the three requirements, it is not
an equivalence relation.
A simple example of an equivalence relation is where A = R and a ∼ b means that |a| = |b|. In this case,
it is straightforward to check that ∼ is an equivalence relation. We now move on to some more interesting
examples which we treat more carefully.
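For relations on a small finite set, the three properties can be checked by brute force. Here is a Python sketch (not part of the text) verifying the claims above: ≤ is reflexive and transitive but not symmetric, while the relation |a| = |b| satisfies all three properties:

```python
# Brute-force checks of the three properties for a relation on a finite set A.
def is_reflexive(A, rel):
    return all(rel(a, a) for a in A)

def is_symmetric(A, rel):
    return all(rel(b, a) for a in A for b in A if rel(a, b))

def is_transitive(A, rel):
    return all(rel(a, c)
               for a in A for b in A for c in A
               if rel(a, b) and rel(b, c))

A = range(-5, 6)

leq = lambda a, b: a <= b
assert is_reflexive(A, leq) and is_transitive(A, leq)
assert not is_symmetric(A, leq)           # 3 <= 4 but not 4 <= 3

same_abs = lambda a, b: abs(a) == abs(b)  # an equivalence relation
assert is_reflexive(A, same_abs) and is_symmetric(A, same_abs) \
       and is_transitive(A, same_abs)
```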
Example 3.2.2. Let A be the set of all n × n matrices with real entries. Let M ∼ N mean that there exists
an invertible n × n matrix P such that M = P N P⁻¹. We then have that ∼ is an equivalence relation on A.
Proof. We need to check the three properties.
• Reflexive: Let M ∈ A. The n × n identity matrix I is invertible and satisfies I⁻¹ = I, so we have
M = IM I⁻¹. Therefore, ∼ is reflexive.
• Symmetric: Let M, N ∈ A with M ∼ N . Fix an invertible n × n matrix P with M = P N P⁻¹.
Multiplying on the left by P⁻¹ we get P⁻¹M = N P⁻¹, and now multiplying on the right by P we
conclude that P⁻¹M P = N . We know from linear algebra that P⁻¹ is also invertible and (P⁻¹)⁻¹ = P ,
so N = P⁻¹M (P⁻¹)⁻¹ and hence N ∼ M .
• Transitive: Let L, M, N ∈ A with L ∼ M and M ∼ N . Since L ∼ M , we may fix an invertible n × n
matrix P with L = P M P⁻¹. Since M ∼ N , we may fix an invertible n × n matrix Q with M = QN Q⁻¹.
We then have

L = P M P⁻¹ = P (QN Q⁻¹)P⁻¹ = (P Q)N (Q⁻¹P⁻¹)

Now by linear algebra, we know that the product of two invertible matrices is invertible, so P Q is
invertible and furthermore we know that (P Q)⁻¹ = Q⁻¹P⁻¹. Therefore, we have

L = (P Q)N (P Q)⁻¹

so L ∼ N .
Putting it all together, we conclude that ∼ is an equivalence relation on A.
Example 3.2.3. Let A be the set Z × (Z\{0}), i.e. A is the set of all pairs (a, b) ∈ Z² with b ≠ 0. Define a
relation ∼ on A as follows. Given a, b, c, d ∈ Z with b, d ≠ 0, we let (a, b) ∼ (c, d) mean ad = bc. We then
have that ∼ is an equivalence relation on A.
Proof. We check the three properties.
• Reflexive: Let a, b ∈ Z with b ≠ 0. Since ab = ba, it follows that (a, b) ∼ (a, b).
• Symmetric: Let a, b, c, d ∈ Z with b, d ≠ 0, and (a, b) ∼ (c, d). We then have that ad = bc. From this,
we conclude that cb = da, so (c, d) ∼ (a, b).
• Transitive: Let a, b, c, d, e, f ∈ Z with b, d, f ≠ 0 where (a, b) ∼ (c, d) and (c, d) ∼ (e, f ). We then have
that ad = bc and cf = de. Multiplying the first equation by f we see that adf = bcf . Multiplying the
second equation by b gives bcf = bde. Therefore, we know that adf = bde. Now d ≠ 0 by assumption,
so we may cancel it to conclude that af = be. It follows that (a, b) ∼ (e, f ).
Therefore, ∼ is an equivalence relation on A.
Let’s analyze the above situation more carefully. We have (1, 2) ∼ (2, 4), (1, 2) ∼ (4, 8), (1, 2) ∼ (−5, −10),
etc. If we think of (a, b) as representing the fraction a/b, then the relation (a, b) ∼ (c, d) is saying exactly that
the fractions a/b and c/d are equal. You may never have thought about equality of fractions as the result of
imposing an equivalence relation on pairs of integers, but that is exactly what it is. We will be more precise
about this below.
The next example is an important example in geometry. We introduce it now, and will return to it later.
Example 3.2.4. Let A be the set R²\{(0, 0)}. Define a relation ∼ on A by letting (x1 , y1 ) ∼ (x2 , y2 ) if there
exists a real number λ ≠ 0 with (x1 , y1 ) = (λx2 , λy2 ). We then have that ∼ is an equivalence relation on A.
Proof. We check the three properties.
• Reflexive: Let (x, y) ∈ R²\{(0, 0)}. We have (x, y) ∼ (x, y) because using λ = 1 we see that (x, y) =
(1 · x, 1 · y). Therefore, ∼ is reflexive.
• Symmetric: Suppose now that (x1 , y1 ) ∼ (x2 , y2 ), and fix a real number λ ≠ 0 such that (x1 , y1 ) =
(λx2 , λy2 ). We then have that x1 = λx2 and y1 = λy2 , so x2 = (1/λ) · x1 and y2 = (1/λ) · y1 (notice
that we are using λ ≠ 0 so we can divide by it). Hence (x2 , y2 ) = ((1/λ) · x1 , (1/λ) · y1 ), and so
(x2 , y2 ) ∼ (x1 , y1 ). Therefore, ∼ is symmetric.
• Transitive: Suppose that (x1 , y1 ) ∼ (x2 , y2 ) and (x2 , y2 ) ∼ (x3 , y3 ). Fix a real number λ ≠ 0 with
(x1 , y1 ) = (λx2 , λy2 ), and also fix a real number µ ≠ 0 with (x2 , y2 ) = (µx3 , µy3 ). We then have
that (x1 , y1 ) = ((λµ)x3 , (λµ)y3 ). Since both λ ≠ 0 and µ ≠ 0, notice that λµ ≠ 0 as well, so
(x1 , y1 ) ∼ (x3 , y3 ). Therefore, ∼ is transitive.

It follows that ∼ is an equivalence relation on A.

Definition 3.2.5. Let ∼ be an equivalence relation on a set A. Given a ∈ A, we let

a = {b ∈ A : a ∼ b}

The set a is called the equivalence class of a.

Some sources use the notation [a] instead of a. This notation helps emphasize that the equivalence class
of a is a subset of A rather than an element of A. However, it is cumbersome notation when we begin working
with equivalence classes. We will stick with our notation, although it might take a little time to get used to.
Notice that by the reflexive property of ∼, we have that a ∈ a for all a ∈ A.
For example, let’s return to where A is the set consisting of the 50 states and R is the subset of A × A
consisting of those pairs of states whose second letter of their postal codes are equal. It’s straightforward to
show that R is an equivalence relation on A. We have

Iowa = {California, Georgia, Iowa, Louisiana, Massachusetts, Pennsylvania, Virginia, Washington}

while
Minnesota = {Indiana, Minnesota, Tennessee}
and
Texas = {Texas}.
Notice that each of these are sets, even in the case of Texas.
For another example, suppose we are working as above with A = Z × (Z\{0}) where (a, b) ∼ (c, d) means
that ad = bc. As discussed above, some elements of (1, 2) are (1, 2), (2, 4), (4, 8), (−5, −10), etc. So

(1, 2) = {(1, 2), (2, 4), (4, 8), (−5, −10), . . . }

Again, I want to emphasize that (a, b) is a subset of A.


The following proposition is hugely fundamental. It says that if two equivalence classes overlap, then
they must in fact be equal. In other words, if ∼ is an equivalence on A, then the equivalence classes partition
the set A into pieces.

Proposition 3.2.6. Let ∼ be an equivalence relation on a set A and let a, b ∈ A. If a ∩ b ≠ ∅, then a = b.

Proof. Suppose that a ∩ b ≠ ∅. Fix c ∈ a ∩ b. We then have a ∼ c and b ∼ c. By symmetry, we know that
c ∼ b, and using transitivity we get that a ∼ b. Using symmetry again, we conclude that b ∼ a.
We first show that a ⊆ b. Let x ∈ a. We then have that a ∼ x. Since b ∼ a, we can use transitivity to
conclude that b ∼ x, hence x ∈ b.
We next show that b ⊆ a. Let x ∈ b. We then have that b ∼ x. Since a ∼ b, we can use transitivity to
conclude that a ∼ x, hence x ∈ a.
Putting this together, we get that a = b.

With that proposition in hand, we are ready for the foundational theorem about equivalence relations.

Theorem 3.2.7. Let ∼ be an equivalence relation on a set A and let a, b ∈ A.

1. a ∼ b if and only if a = b.

2. a ≁ b if and only if a ∩ b = ∅.

Proof. We first prove 1. Suppose first that a ∼ b. We then have that b ∈ a. Now we know that b ∼ b because
∼ is reflexive, so b ∈ b. Thus, b ∈ a ∩ b, so a ∩ b ≠ ∅. By the previous proposition, we conclude that a = b.
Suppose conversely that a = b. Since b ∼ b because ∼ is reflexive, we have that b ∈ b. Therefore, b ∈ a
and hence a ∼ b.
We now use everything we’ve shown to get 2 with little effort. Suppose that a ≁ b. Since we just proved
1, it follows that a ≠ b, so by the previous proposition we must have a ∩ b = ∅. Suppose conversely that
a ∩ b = ∅. We then have a ≠ b (because a ∈ a, so a ≠ ∅), so a ≁ b by part 1.

Therefore, given an equivalence relation ∼ on a set A, the equivalence classes partition A into pieces.
Working out the details in our postal code example, one can show that ∼ has 1 equivalence class of size 8
(namely Iowa, which is the same set as California and 6 others), 3 equivalence classes of size 4, 4 equivalence
classes of size 3, 7 equivalence classes of size 2, and 4 equivalence classes of size 1.
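Although it is not part of the formal development, the counting above is easy to verify mechanically. The following Python sketch hardcodes the 50 standard USPS abbreviations and tallies the class sizes:

```python
from collections import Counter

# The 50 standard USPS state abbreviations.
codes = """AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD
MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC
SD TN TX UT VT VA WA WV WI WY""".split()
assert len(codes) == 50

# Group states by the second letter of their postal code: the equivalence classes.
class_sizes = Counter(code[1] for code in codes)        # letter -> size of its class
size_distribution = Counter(class_sizes.values())       # class size -> number of classes

# Matches the text: one class of size 8, three of size 4,
# four of size 3, seven of size 2, and four of size 1.
assert size_distribution == {8: 1, 4: 3, 3: 4, 2: 7, 1: 4}
```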
Let’s revisit the example of A = Z × (Z\{0}) where (a, b) ∼ (c, d) means ad = bc. The equivalence class
of (1, 2), namely the set (1, 2), is the set of all pairs of integers which are ways of representing the fraction
1/2. In fact, this is how one can “construct” the rational numbers from the integers. We simply define the
rational numbers to be the set of equivalence classes of A under ∼. In other words, we let

a/b = (a, b)
So when we write something like

1/2 = 4/8

we are simply saying that

(1, 2) = (4, 8)

which is true because (1, 2) ∼ (4, 8).
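This construction can also be illustrated computationally. In the sketch below (Python, illustrative only), each class is identified by a canonical representative: the pair in lowest terms with positive second coordinate. All of the pairs listed above land in the same class:

```python
from math import gcd

# (a, b) ~ (c, d)  iff  a*d == b*c,  with b, d != 0.
def equivalent(p, q):
    (a, b), (c, d) = p, q
    return a * d == b * c

# A canonical representative of each class: lowest terms, positive denominator.
def canonical(p):
    a, b = p
    g = gcd(a, b)          # math.gcd returns a nonnegative value
    a, b = a // g, b // g
    return (-a, -b) if b < 0 else (a, b)

reps = [(1, 2), (2, 4), (-5, -10)]
assert all(equivalent((1, 2), p) for p in reps)
assert {canonical(p) for p in reps} == {(1, 2)}
```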

Example 3.2.8. Recall the example above where A = R²\{(0, 0)} and where (x1 , y1 ) ∼ (x2 , y2 ) means that
there exists a real number λ ≠ 0 with (x1 , y1 ) = (λx2 , λy2 ). The equivalence classes of ∼ are the lines through
the origin (omitting the origin itself ).

Proof. Our first claim is that every point of A is equivalent to exactly one of the following points.

• (0, 1)

• (1, m) for some m ∈ R.

We first show that every point is equivalent to at least one of the above points. Suppose that (x, y) ∈ A,
so (x, y) ≠ (0, 0). If x = 0, then we must have y ≠ 0, so (x, y) ∼ (0, 1) via λ = y. Now if x ≠ 0, then
(x, y) ∼ (1, y/x) via λ = x. This gives existence.
To show uniqueness, it suffices to show that no two of the above points are equivalent to each other,
because we already know that ∼ is an equivalence relation. Suppose that m ∈ R and that (0, 1) ∼ (1, m).
Fix λ ∈ R with λ ≠ 0 such that (0, 1) = (λ1, λm). Looking at the first coordinate, we conclude that λ = 0, a
contradiction. Therefore, (0, 1) is not equivalent to any point of the second type. Suppose now that m, n ∈ R
with (1, m) ∼ (1, n). Fix λ ∈ R with λ ≠ 0 such that (1, m) = (λ1, λn). Looking at first coordinates, we
must have λ = 1, so examining second coordinates gives m = λn = n. Therefore (1, m) ≁ (1, n) whenever
m ≠ n. This finishes the claim.
Now we examine the equivalence classes of each of the above points. We first handle (0, 1) and claim
that it equals the set of points in A on the line x = 0. Notice first that if (x, y) ∈ (0, 1), then (0, 1) ∼ (x, y),
so fixing λ ≠ 0 with (0, 1) = (λx, λy) we conclude that λx = 0 and hence x = 0. Thus, every element of
(0, 1) is indeed on the line x = 0. Now taking an arbitrary point on the line x = 0, say (0, y) with y ≠ 0, we
simply notice that (0, 1) ∼ (0, y) via λ = 1/y. Hence, every point on the line x = 0 is an element of (0, 1).

Finally, we fix m ∈ R and claim that (1, m) is the set of points in A on the line y = mx. Notice first that
if (x, y) ∈ (1, m), then (1, m) ∼ (x, y), hence (x, y) ∼ (1, m). Fix λ ≠ 0 with (x, y) = (λ1, λm). We then
have x = λ by looking at first coordinates, so y = λm = mx by looking at second coordinates. Thus, every
element of (1, m) lies on the line y = mx. Now take an arbitrary point in A on the line y = mx, say (x, mx).
We then have that x ≠ 0 because (0, 0) ∉ A. Thus (1, m) ∼ (x, mx) via λ = x. Hence, every point on the
line y = mx is an element of (1, m).
The set of equivalence classes of ∼ in the previous example is known as the projective real line.
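The canonical representatives found in the proof, (0, 1) and (1, m), suggest a simple computational picture (a Python sketch, with floating-point division standing in for real arithmetic): every nonzero point can be mapped to the representative of its class, and two points are equivalent exactly when they get the same representative.

```python
# (x1, y1) ~ (x2, y2) iff (x1, y1) = (lambda*x2, lambda*y2) for some lambda != 0.
# Every class contains exactly one of (0, 1) or (1, m), per the example above.
def canonical(point):
    x, y = point
    assert (x, y) != (0, 0)
    return (0, 1) if x == 0 else (1, y / x)

# Points on the line y = 3x are all identified:
assert canonical((2, 6)) == canonical((-1, -3)) == (1, 3.0)
# Points on the vertical line x = 0 are identified with (0, 1):
assert canonical((0, -7)) == (0, 1)
# Distinct slopes give distinct classes:
assert canonical((1, 2)) != canonical((1, 3))
```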

3.3 Functions
Intuitively, given two sets A and B, a function f : A → B is an input-output “mechanism” that produces a
unique output b ∈ B for any given input a ∈ A. Up through calculus, the vast majority of functions that
we encounter are given by simple formulas, so this “mechanism” was typically interpreted in an algorithmic
and computational sense. However, some functions such as f (x) = sin x, f (x) = ln x, or integral functions
like f (x) = ∫ₐˣ g(t) dt (given a continuous function g(t) and a fixed a ∈ R) were defined in more interesting
ways where it was not at all obvious how to compute them. We are now in a position to define functions
as relations that satisfy a certain property. Thinking about functions from this more abstract point of view
eliminates the vague “mechanism” concept because they will simply be certain types of sets. With this
perspective, we’ll see that functions can be defined in any way that a set can be defined. This approach
both clarifies the concept of a function as well as providing us with some much needed flexibility in defining
functions in more interesting ways.
Definition 3.3.1. Let A and B be sets. A function from A to B is a relation f between A and B such that
for each a ∈ A, there is a unique b ∈ B with (a, b) ∈ f .
For example, let A = {c, q, w, y} and let B = N = {0, 1, 2, 3, 4, . . . }. An example of a function from A to
B is the set
f = {(c, 71), (q, 4), (w, 9382), (y, 4)}.
Notice that in the definition of a function from A to B, we know that for every a ∈ A, there is a unique
b ∈ B such that (a, b) ∈ f . However, as this example shows, it may not be the case that for every b ∈ B,
there is a unique a ∈ A with (a, b) ∈ f . Be careful with the order of quantifiers!
Thinking of functions as special types of relations, and in particular as special types of sets, is occasionally
helpful (see below), but is often awkward in practice. For example, writing (c, 71) ∈ f to mean that f sends
c to 71 gets annoying very quickly. Using infix notation like c f 71 is not much better. Thus, we introduce
some new notation matching up with our old experience with functions.
Notation 3.3.2. Let A and B be sets.
• Instead of writing “f is a function from A to B”, we typically use the shorthand notation “f : A → B”.
• If f : A → B and a ∈ A, we write f (a) to mean the unique b ∈ B such that (a, b) ∈ f .
Therefore, in the above example of f , we have

f (c) = 71
f (q) = 4
f (w) = 9382
f (y) = 4
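The set-of-pairs viewpoint can be made concrete in code. In Python, a dict is essentially a finite function: each first coordinate appears exactly once. The sketch below (illustrative only) checks the defining condition for the example f above:

```python
# A function from A to B as a set of pairs: each a in A appears exactly once
# as a first coordinate. A Python dict enforces exactly this.
A = {'c', 'q', 'w', 'y'}
pairs = {('c', 71), ('q', 4), ('w', 9382), ('y', 4)}

def is_function(pairs, A):
    firsts = [a for (a, _) in pairs]
    # every element of A appears, and no element appears twice
    return set(firsts) == A and len(firsts) == len(set(firsts))

assert is_function(pairs, A)
f = dict(pairs)                      # lookup notation: f['c'] plays the role of f(c)
assert f['c'] == 71 and f['q'] == f['y'] == 4
```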

Definition 3.3.3. Let f : A → B be a function. We define the following.



• We call A the domain of f .


• We call B the codomain of f .
• We define range(f ) = {b ∈ B : There exists a ∈ A with f (a) = b}.
Notice that given a function f : A → B, we have range(f ) ⊆ B, but it is possible that range(f ) ≠ B. For
example, in the above case, we have that the codomain of f is N, but range(f ) = {4, 71, 9382}.
In general, given a function f : A → B, it may be very difficult to determine range(f ) because we may
need to search through all a ∈ A. For example, fix n ∈ N+ and define f : N → {0, 1, 2, . . . , n − 1} by letting
f (a) be the remainder when dividing a² by n. This simple but strange looking function has many interesting
properties, and we will explore it some in Chapter ??. Given a large number n, computing whether a given
number in {0, 1, 2, . . . , n − 1} is an element of range(f ) is thought to be extremely hard. In fact, it is widely
believed that there is no efficient algorithm to compute this when n is the product of two primes, and this
is the basis for some cryptosystems and pseudo-random number generators.
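For small n, of course, range(f ) can be computed by brute force, since the remainder of a² upon division by n depends only on the remainder of a. A quick Python sketch (not part of the text):

```python
# f : N -> {0, ..., n-1} with f(a) the remainder of a^2 upon division by n.
# Since f(a) depends only on a mod n, the range is found by checking a = 0, ..., n-1.
def squares_mod(n):
    return {a * a % n for a in range(n)}

assert squares_mod(7) == {0, 1, 2, 4}
assert squares_mod(15) == {0, 1, 4, 6, 9, 10}
```

For n as large as those used in cryptography, this brute-force search is hopeless, which is exactly the point made above.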
One nice feature of our definition of a function is that we immediately obtain a nice definition for when two
functions f : A → B and g : A → B are equal because we have defined when two sets are equal. Unwrapping
this definition, we see that f = g exactly when f and g have the same elements, which is precisely the same
thing as saying that f (a) = g(a) for all a ∈ A. In particular, the manner in which we describe functions does
not matter so long as the functions behave the same on all inputs. For example, if we define f : R → R and
g : R → R by letting f (x) = sin²x + cos²x and g(x) = 1, then we have that f = g. Just like sets (because
after all functions are sets!), the definitions do not matter as long as the elements are the same.
Definition 3.3.4. Suppose that f : A → B and g : B → C are functions. The composition of g and f ,
denoted g ◦ f , is the function g ◦ f : A → C defined by (g ◦ f )(a) = g(f (a)) for all a ∈ A.
Instead of defining g ◦ f in function language, one can also define function composition directly in terms
of the sets f and g. Suppose that f : A → B and g : B → C are functions. Define a new set

R = {(a, c) ∈ A × C : There exists b ∈ B with (a, b) ∈ f and (b, c) ∈ g}

Now R is a relation, and one can check that it is a function (using the assumption that f and g are both
functions). We define g ◦ f to be this set.
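The set R above can be computed directly for finite relations. The following Python sketch (not part of the text) implements composition on sets of pairs and checks it on a small example:

```python
# Composition defined directly on the sets of pairs, as in the text:
# g o f = {(a, c) : there is b with (a, b) in f and (b, c) in g}.
def compose(g, f):
    return {(a, c) for (a, b1) in f for (b2, c) in g if b1 == b2}

f = {(1, 'x'), (2, 'y')}           # f : {1, 2} -> {'x', 'y'}
g = {('x', True), ('y', False)}    # g : {'x', 'y'} -> {True, False}

assert compose(g, f) == {(1, True), (2, False)}
```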
Proposition 3.3.5. Let A, B, C, D be sets. Suppose that f : A → B, that g : B → C, and that h : C → D
are functions. We then have that (h ◦ g) ◦ f = h ◦ (g ◦ f ). Stated more simply, function composition is
associative whenever it is defined.
Proof. Let a ∈ A be arbitrary. We then have

((h ◦ g) ◦ f )(a) = (h ◦ g)(f (a))


= h(g(f (a)))
= h((g ◦ f )(a))
= (h ◦ (g ◦ f ))(a)

Therefore ((h ◦ g) ◦ f )(a) = (h ◦ (g ◦ f ))(a) for all a ∈ A. It follows that (h ◦ g) ◦ f = h ◦ (g ◦ f ).


Notice that in general we have f ◦ g ≠ g ◦ f even when both are defined! For example, if f : R → R is
f (x) = x + 1 and g : R → R is g(x) = x², then

(f ◦ g)(x) = f (g(x)) = f (x²) = x² + 1

while
(g ◦ f )(x) = g(f (x)) = g(x + 1) = (x + 1)² = x² + 2x + 1

For example, we have (f ◦ g)(1) = 1² + 1 = 2 while (g ◦ f )(1) = 1² + 2 · 1 + 1 = 4. Since we have found one
example of an x with (f ◦ g)(x) ≠ (g ◦ f )(x), we conclude that f ◦ g ≠ g ◦ f . It does not matter that there
do exist some values of x with (f ◦ g)(x) = (g ◦ f )(x) (for example, this is true when x = 0). Remember
that two functions are equal precisely when they agree on all inputs, so to show that the two functions are
not equal it suffices to find just one value where they disagree.

Definition 3.3.6. Let A be a set. The function idA : A → A defined by idA (a) = a for all a ∈ A is called
the identity function on A.

The identity function does leave other functions alone when we compose with it. However, we have to
be careful that we compose with the identity function on the correct set and the correct side.

Proposition 3.3.7. For any function f : A → B, we have f ◦ idA = f and idB ◦ f = f .

Proof. Let f : A → B. For any a ∈ A, we have

(f ◦ idA )(a) = f (idA (a)) = f (a)

Since a ∈ A was arbitrary, it follows that f ◦ idA = f . Similarly, for any a ∈ A, we have

(idB ◦ f )(a) = idB (f (a)) = f (a)

because f (a) is some element in B. Since a ∈ A was arbitrary, it follows that idB ◦ f = f .

Definition 3.3.8. Let f : A → B be a function.

• A left inverse of f is a function g : B → A such that g ◦ f = idA .

• A right inverse of f is a function g : B → A such that f ◦ g = idB .

• An inverse of f is a function g : B → A that is both a left inverse and a right inverse of f simultane-
ously, i.e. a function g : B → A such that both g ◦ f = idA and f ◦ g = idB .

Notice that we need inverses on both sides, and that the identity functions differ if A ≠ B. Consider the
function f : R → R given by f (x) = e^x. Does f have an inverse? As defined, we are asking whether there
exists a function g : R → R such that both g ◦ f = idR and f ◦ g = idR . A natural guess after a calculus
course would be to consider the function g : R+ → R given by g(x) = ln x. However, this g is not an inverse
of f because it is not even defined for all real numbers as would be required. Suppose that we try to correct
this minor defect by instead considering the function g : R → R defined by

g(x) = ln x if x > 0, and g(x) = 0 otherwise.

Now at least g is defined on all of R. Is g an inverse of f ? Let’s check the definitions. For any x ∈ R, we
have e^x > 0, so

(g ◦ f )(x) = g(f (x)) = g(e^x) = ln(e^x) = x = idR (x)

so we have shown that g ◦ f = idR . Thus, g is indeed a left inverse of f . Let’s examine f ◦ g. Now if x > 0,
then we have

(f ◦ g)(x) = f (g(x)) = f (ln x) = e^(ln x) = x = idR (x)

so everything is good there. However, if x < 0, then

(f ◦ g)(x) = f (g(x)) = f (0) = e⁰ = 1 ≠ x



Therefore, g is not a right inverse of f , and hence not an inverse for f . In fact, f does not have a right
inverse. One can prove this directly, but it will follow from Proposition 3.4.3 below. Notice that f has other
left inverses besides g. For example, define h : R → R by

h(x) = ln x if x > 0, and h(x) = 72x⁵ otherwise.

For any x ∈ R, we have e^x > 0, so

(h ◦ f )(x) = h(f (x)) = h(e^x) = ln(e^x) = x = idR (x)

so h ◦ f = idR . Thus, h is another left inverse of f . In fact, we can define a left inverse arbitrarily on the set
{x ∈ R : x ≤ 0}, so long as it is given by ln x on the positive reals.
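These claims about f (x) = e^x and its left inverses can be spot-checked numerically (a Python sketch using floating-point arithmetic, so equalities are checked only approximately):

```python
import math

f = math.exp                                        # f : R -> R, f(x) = e^x
g = lambda x: math.log(x) if x > 0 else 0           # one left inverse of f
h = lambda x: math.log(x) if x > 0 else 72 * x**5   # another left inverse of f

for x in [-2.0, 0.0, 1.5, 10.0]:
    assert math.isclose(g(f(x)), x, abs_tol=1e-9)   # g o f = id on R
    assert math.isclose(h(f(x)), x, abs_tol=1e-9)   # h o f = id on R

# But g is not a right inverse: f(g(-3)) = f(0) = 1, not -3.
assert f(g(-3)) == 1.0
```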
Notice that if we had instead considered our function f (x) = e^x as a function from R to R+, then f
does have an inverse. In fact, the function h : R+ → R defined by h(x) = ln x does satisfy h ◦ f = idR and
f ◦ h = idR+ . Thus, by restricting the codomain, we can obtain an inverse for a function that did not
have one originally.

Proposition 3.3.9. Let f : A → B be a function. If g : B → A is a left inverse of f and h : B → A is a


right inverse of f , then g = h.

Proof. By definition, we have that g ◦ f = idA and f ◦ h = idB . The key function to consider is the
composition (g ◦ f ) ◦ h = g ◦ (f ◦ h) (notice that these are equal by Proposition 3.3.5). We have

g = g ◦ idB
= g ◦ (f ◦ h)
= (g ◦ f ) ◦ h (by Proposition 3.3.5)
= idA ◦ h
=h

Therefore, we conclude that g = h.

Corollary 3.3.10. If f : A → B is a function, then there exists at most one function g : B → A that is an
inverse of f .

Proof. Suppose that g : B → A and h : B → A are both inverses of f . In particular, we then have that g is a
left inverse of f and h is a right inverse of f . Therefore, g = h by Proposition 3.3.9.

Corollary 3.3.11. Let f : A → B be a function.

1. If f has a left inverse, then f has at most one right inverse.

2. If f has a right inverse, then f has at most one left inverse.

Proof. Suppose that f has a left inverse, and fix such a left inverse g : B → A. Suppose that h1 : B → A
and h2 : B → A are both right inverses of f . Using Proposition 3.3.9, we conclude that g = h1 and g = h2 .
Therefore, h1 = h2 .
The proof of the second result is completely analogous.

Notice that it is possible for a function to have many left inverses, but this result says that f must
fail to have a right inverse in this case. This is exactly what happened above in the example where f : R → R
was the function f (x) = e^x.

3.4 Injections, Surjections, Bijections


Definition 3.4.1. Let f : A → B be a function.

• We say that f is injective (or one-to-one) if whenever f (a1 ) = f (a2 ) we have a1 = a2 .

• We say that f is surjective (or onto) if for all b ∈ B there exists a ∈ A such that f (a) = b. In other
words, f is surjective if range(f ) = B.

• We say that f is bijective if both f is injective and surjective.

An equivalent condition for f to be injective is obtained by simply taking the contrapositive, i.e. f : A → B
is injective if and only if whenever a1 ≠ a2 , we have f (a1 ) ≠ f (a2 ). Stated in more colloquial language, f is
injective if every element of B is hit by at most one element of A via f . In this manner, f is surjective if
every element of B is hit by at least one element of A via f , and f is bijective if every element of B is hit by
exactly one element of A via f .
For example, consider the function f : N+ → N+ defined by letting f (n) be the number of positive divisors
of n. Notice that f is not injective because f (3) = 2 = f (5) but of course 3 ≠ 5. On the other hand, f is
surjective, because given any n ∈ N+, we have f (2^(n−1)) = n.
If we want to prove that a function f : A → B is injective, it is usually better to use our official definition
than the contrapositive one with negations. Thus, we want to start by assuming that we are given arbitrary
a1 , a2 ∈ A that satisfy f (a1 ) = f (a2 ), and using this assumption we want to prove that a1 = a2 . The
reason why this approach is often preferable is because it is typically easier to work with and manipulate a
statement involving equality than it is to derive statements from a non-equality.

Proposition 3.4.2. Suppose that f : A → B and g : B → C are both functions.

1. If both f and g are injective, then g ◦ f is injective.

2. If both f and g are surjective, then g ◦ f is surjective.

3. If both f and g are bijective, then g ◦ f is bijective.

Proof.

1. Suppose that a1 , a2 ∈ A satisfy (g ◦ f )(a1 ) = (g ◦ f )(a2 ). We then have that g(f (a1 )) = g(f (a2 )). Using
the fact that g is injective, we conclude that f (a1 ) = f (a2 ). Now using the fact that f is injective, it
follows that a1 = a2 . Therefore, g ◦ f is injective.

2. Let c ∈ C. Since g : B → C is surjective, we may fix b ∈ B with g(b) = c. Since f : A → B is surjective,


we may fix a ∈ A with f (a) = b. We then have

(g ◦ f )(a) = g(f (a)) = g(b) = c

Therefore, g ◦ f is surjective.

3. This follows from combining 1 and 2.

Proposition 3.4.3. Let f : A → B be a function.

1. f is injective if and only if f has a left inverse (i.e. there is a function g : B → A with g ◦ f = idA ).

2. f is surjective if and only if f has a right inverse (i.e. there is a function g : B → A with f ◦ g = idB ).

3. f is bijective if and only if f has an inverse (i.e. there is a function g : B → A with both g ◦ f = idA
and f ◦ g = idB ).
Proof.
1. Suppose first that f has a left inverse, and fix a function g : B → A with g ◦ f = idA . Suppose that
a1 , a2 ∈ A satisfy f (a1 ) = f (a2 ). Applying the function g to both sides we see that g(f (a1 )) = g(f (a2 )),
and hence (g ◦ f )(a1 ) = (g ◦ f )(a2 ). We now have

a1 = idA (a1 )
= (g ◦ f )(a1 )
= (g ◦ f )(a2 )
= idA (a2 )
= a2

so a1 = a2 . It follows that f is injective.


Suppose conversely that f is injective. If A = ∅, then f = ∅, and we are done by letting g = ∅ (note that
this degenerate case requires B = ∅ as well; if the empty set as a function annoys you, just ignore this case).
Let’s assume then that A ≠ ∅ and fix a0 ∈ A. We define g : B → A as follows. Given b ∈ B, we define g(b) as follows:
• If b ∈ range(f ), then there exists a unique a ∈ A with f (a) = b (because f is injective), and we
let g(b) = a for this unique choice.
• If b ∉ range(f ), then we let g(b) = a0 .
This completes the definition of g : B → A. We need to check that g ◦ f = idA . Let a ∈ A be arbitrary.
Notice that g(f (a)) = a because in the definition of g on the value b = f (a) we simply notice that
b = f (a) ∈ range(f ) trivially, so we define g(b) = a. Thus, (g ◦ f )(a) = idA (a). Since a ∈ A was
arbitrary, it follows that g ◦ f = idA . Therefore, f has a left inverse.
2. Suppose first that f has a right inverse, and fix a function g : B → A with f ◦ g = idB . Let b ∈ B be
arbitrary. We then have that
b = idB (b) = (f ◦ g)(b) = f (g(b))
hence there exists a ∈ A with f (a) = b, namely a = g(b). Since b ∈ B was arbitrary, it follows that f
is surjective.
Suppose conversely that f is surjective. We define g : B → A as follows. For every b ∈ B, we know that
there exists (possibly many) a ∈ A with f (a) = b because f is surjective. Given b ∈ B, we then define
g(b) = a for some (any) a ∈ A for which f (a) = b. Now given any b ∈ B, notice that g(b) satisfies
f (g(b)) = b by definition of g, so (f ◦ g)(b) = b = idB (b). Since b ∈ B was arbitrary, it follows that
f ◦ g = idB .
3. The right to left direction is immediate from parts 1 and 2. For the left to right direction, suppose
that f is a bijection. By parts 1 and 2, f has a left inverse g and a right inverse h. By Proposition
3.3.9, we have g = h, so this common function is an inverse of f .
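For finite functions, the construction of a left inverse in the proof of part 1 can be carried out explicitly. The following Python sketch (illustrative; functions are represented as dicts) builds a left inverse of an injective f and confirms that it fails to be a right inverse when f is not surjective:

```python
# Build a left inverse of an injective finite function f : A -> B (as a dict).
def left_inverse(f, B, default):
    # Send each value in range(f) back to its unique preimage (uniqueness
    # uses injectivity of f), and everything outside range(f) to a fixed
    # default element a0 of A, exactly as in the proof.
    g = {b: default for b in B}
    for a, b in f.items():
        g[b] = a
    return g

f = {1: 'x', 2: 'y'}                 # injective, but not surjective onto B
B = {'x', 'y', 'z'}
g = left_inverse(f, B, default=1)

assert all(g[f[a]] == a for a in f)  # g o f = id_A
assert f[g['z']] != 'z'              # but f o g != id_B, since f is not surjective
```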

3.5 Defining Functions on Equivalence Classes


Suppose that ∼ is an equivalence relation on the set A. When we look at the equivalence classes of A, we
know that we have shattered the set A into pieces. Just as in the case where we constructed the rationals,

we often want to form a new set which consists of the equivalence classes themselves. Thus, the elements of
this new set are themselves sets. Here is the formal definition.

Definition 3.5.1. Let A be a set and let ∼ be an equivalence relation on A. The set whose elements are the
equivalence classes of A under ∼ is called the quotient of A by ∼ and is denoted A/∼.

Thus, if we let A = Z × (Z\{0}) where (a, b) ∼ (c, d) means ad = bc, then the set of all rationals is the
quotient A/∼. Letting Q = A/∼, we then have that the set (a, b) is an element of Q for every choice of
a, b ∈ Z with b ≠ 0. This quotient construction is extremely general, and we will see that it will play a
fundamental role in our studies. Before we delve into our first main example of modular arithmetic in the
next section, we first address an important and subtle question.
To begin with, notice that a given element of Q (namely a rational) is represented by many different pairs
of integers. After all, we have
1/2 = (1, 2) = (2, 4) = (−5, −10) = · · ·

Suppose that we want to define a function whose domain is Q. For example, we want to define a function
f : Q → Z. Now we can try to write down something like:

f ((a, b)) = a

Intuitively, we are trying to define f : Q → Z by letting f (a/b) = a. From a naive glance, this might look
perfectly reasonable. However, there is a real problem arising from the fact that elements of Q have many
representations. On the one hand, we should have

f ((1, 2)) = 1

and on the other hand we should have


f ((2, 4)) = 2

But we know that (1, 2) = (2, 4), which contradicts the very definition of a function (after all, a function
must have a unique output for any given input, but our description has imposed multiple different outputs
for the same input). Thus, if we want to define a function on Q, we need to check that our definition does
not depend on our choice of representatives.
For a positive example, consider the projective real line P . That is, let A = R²\{(0, 0)} where (x1 , y1 ) ∼
(x2 , y2 ) means that there exists a real number λ ≠ 0 such that (x1 , y1 ) = (λx2 , λy2 ). We then have
that P = A/∼. Consider trying to define the function g : P → R by

g((x, y)) = 5xy/(x² + y²)

First we check a technicality: If (x, y) ∈ P , then (x, y) ≠ (0, 0), so x² + y² ≠ 0, and hence the domain of g
really is all of P . Now we claim that g “makes sense”, i.e. that it actually is a function. To see this, we take
two elements (x1 , y1 ) and (x2 , y2 ) with (x1 , y1 ) = (x2 , y2 ), and check that g((x1 , y1 )) = g((x2 , y2 )). In other
words, we check that our definition of g does not actually depend on our choice of representative. Suppose
then that (x1 , y1 ) = (x2 , y2 ). We then have (x1 , y1 ) ∼ (x2 , y2 ), so we may fix λ ≠ 0 with (x1 , y1 ) = (λx2 , λy2 ).

Now

g((x1 , y1 )) = 5x1y1/(x1² + y1²)
             = 5(λx2)(λy2)/((λx2)² + (λy2)²)
             = 5λ²x2y2/(λ²x2² + λ²y2²)
             = (λ² · 5x2y2)/(λ² · (x2² + y2²))
             = 5x2y2/(x2² + y2²)
             = g((x2 , y2 )).

Hence, g does indeed make sense as a function on P .


Now technically we are being sloppy above when we start by defining a “function” and only checking after
the fact that it makes sense and actually results in an honest function. However, this shortcut is completely
standard. The process of checking that a “function” f : A/∼ → X defined via representatives of equivalence
classes is independent of the actual representatives chosen, and so actually makes sense, is called checking
that f is well-defined.
For a final example, let’s consider a function from a quotient to another quotient. Going back to Q, we
define a function f : Q → Q as follows:

$f(\overline{(a, b)}) = \overline{(a^2 + 3b^2,\ 2b^2)}$

We claim that this function is well-defined on Q. Intuitively, we want to define the following function on
fractions:

$f\left(\dfrac{a}{b}\right) = \dfrac{a^2 + 3b^2}{2b^2}$
Let’s check that it does indeed make sense. Suppose that a, b, c, d ∈ Z with b, d ≠ 0 and we have $\overline{(a, b)} = \overline{(c, d)}$,
i.e. that (a, b) ∼ (c, d). We need to show that $f(\overline{(a, b)}) = f(\overline{(c, d)})$, i.e. that $\overline{(a^2 + 3b^2,\ 2b^2)} = \overline{(c^2 + 3d^2,\ 2d^2)}$,
or equivalently that (a² + 3b², 2b²) ∼ (c² + 3d², 2d²). Since we are assuming that (a, b) ∼ (c, d), we know
that ad = bc. Hence

(a² + 3b²) · 2d² = 2a²d² + 6b²d²
= 2(ad)² + 6b²d²
= 2(bc)² + 6b²d²
= 2b²c² + 6b²d²
= 2b² · (c² + 3d²)

Therefore, (a² + 3b², 2b²) ∼ (c² + 3d², 2d²), which is to say that $f(\overline{(a, b)}) = f(\overline{(c, d)})$. It follows that f is
well-defined on Q.
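We can test this computation concretely as well. In the sketch below (Python; the helper names related and f are ours), pairs of integers stand for elements of Z × (Z \ {0}), and we check on sample data that ∼-related inputs are sent to ∼-related outputs:

```python
def related(p, q):
    # (a, b) ~ (c, d) if and only if ad = bc.
    (a, b), (c, d) = p, q
    return a * d == b * c

def f(p):
    # The map on representatives: (a, b) |-> (a^2 + 3b^2, 2b^2).
    a, b = p
    return (a * a + 3 * b * b, 2 * b * b)

# (1, 2) and (2, 4) represent the same rational 1/2 ...
assert related((1, 2), (2, 4))
# ... and their images, (13, 8) and (52, 32), are related as well.
assert related(f((1, 2)), f((2, 4)))
print(f((1, 2)), f((2, 4)))  # (13, 8) (52, 32)
```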

3.6 Modular Arithmetic


Definition 3.6.1. Let n ∈ N+ . We define a relation ≡n on Z by letting a ≡n b mean that n | (a − b). When
a ≡n b we say that a is congruent to b modulo n.

Proposition 3.6.2. Let n ∈ N+ . The relation ≡n is an equivalence relation on Z.

Proof. We need to check the three properties.

• Reflexive: Let a ∈ Z. Since a − a = 0 and n | 0, we have that n | (a − a), hence a ≡n a.

• Symmetric: Let a, b ∈ Z with a ≡n b. We then have that n | (a − b). Thus n | (−1)(a − b) by


Proposition 2.2.3, which says that n | (b − a), and so b ≡n a.

• Transitive: Let a, b, c ∈ Z with a ≡n b and b ≡n c. We then have that n | (a − b) and n | (b − c). Using
Proposition 2.2.3, it follows that n | [(a − b) + (b − c)], which is to say that n | (a − c). Therefore,
a ≡n c.

Putting it all together, we conclude that ≡n is an equivalence relation on Z.

By our general theory about equivalence relations, we know that ≡n partitions Z into equivalence classes.
We next determine the number of such equivalence classes.

Proposition 3.6.3. Let n ∈ N+ and let a ∈ Z. There exists a unique b ∈ {0, 1, . . . , n − 1} such that a ≡n b.
In fact, if we write a = qn + r for the unique choice of q, r ∈ Z with 0 ≤ r < n, then b = r.

Proof. As in the statement, fix q, r ∈ Z with a = qn + r and 0 ≤ r < n. We then have a − r = nq, so
n | (a − r). It follows that a ≡n r, so we have proven existence.
Suppose now that b ∈ {0, 1, . . . , n − 1} and a ≡n b. We then have that n | (a − b), so we may fix k ∈ Z
with nk = a − b. This gives a = kn + b. Since 0 ≤ b < n, we may use the uniqueness part of Theorem 2.3.1
to conclude that k = q (which is unnecessary) and also that b = r. This proves uniqueness.

Therefore, the quotient Z/ ≡n has n elements, and we can obtain unique representatives for these equiv-
alence classes by taking the representatives from the set {0, 1, . . . , n − 1}. For example, if n = 5, we have
that Z/ ≡n consists of the following five sets.

• $\overline{0}$ = {. . . , −10, −5, 0, 5, 10, . . . }

• $\overline{1}$ = {. . . , −9, −4, 1, 6, 11, . . . }

• $\overline{2}$ = {. . . , −8, −3, 2, 7, 12, . . . }

• $\overline{3}$ = {. . . , −7, −2, 3, 8, 13, . . . }

• $\overline{4}$ = {. . . , −6, −1, 4, 9, 14, . . . }
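These five classes can be generated mechanically: in Python, a % 5 picks out the chosen representative of the class of a (Python's % always returns a value in {0, . . . , n − 1}, even for negative a). A small sketch:

```python
n = 5
# Bucket a window of integers by their remainder mod n; each bucket is a
# finite slice of one equivalence class of Z under the relation of
# congruence mod n.
classes = {r: [] for r in range(n)}
for a in range(-10, 15):
    classes[a % n].append(a)

print(classes[0])  # [-10, -5, 0, 5, 10]
print(classes[3])  # [-7, -2, 3, 8, 13]
```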

Now that we’ve used ≡n to break Z up into n pieces, our next goal is to show how to add and multiply
elements of this quotient. The idea is to define addition/multiplication of elements of Z/ ≡n by simply
adding/multiplying representatives. In other words, we would like to define

$\overline{a} + \overline{b} = \overline{a + b}$   and   $\overline{a} \cdot \overline{b} = \overline{a \cdot b}$

Of course, whenever we define functions on equivalence classes via representatives, we need to be careful to
ensure that our function is well-defined, i.e. that it does not depend on the choice of representatives. That is
the content of the next result.

Proposition 3.6.4. Let n ∈ N+ , and suppose that a, b, c, d ∈ Z with a ≡n c and b ≡n d. We then have

1. a + b ≡n c + d

2. ab ≡n cd

Proof. Since a ≡n c and b ≡n d, we have n | (a − c) and n | (b − d).


1. Notice that
(a + b) − (c + d) = (a − c) + (b − d)
Since n | (a − c) and n | (b − d), it follows that n | [(a − c) + (b − d)] and so n | [(a + b) − (c + d)].
Therefore, a + b ≡n c + d.
2. Notice that

ab − cd = ab − bc + bc − cd
= (a − c) · b + (b − d) · c

Since n | (a − c) and n | (b − d), it follows that n | [(a − c) · b + (b − d) · c] and so n | (ab − cd). Therefore,
ab ≡n cd.
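In computational terms, Proposition 3.6.4 says that the remainder of a sum or product mod n depends only on the remainders of the operands. A brute-force Python check over a range of sample witnesses (the bounds and the choice n = 7 are arbitrary):

```python
n = 7
# If a ≡ c and b ≡ d (mod n), then a + b ≡ c + d and ab ≡ cd (mod n):
# shifting a or b by any multiple of n never changes the result mod n.
for a in range(-15, 15):
    for b in range(-15, 15):
        for k in (-2, -1, 1, 3):
            c, d = a + k * n, b + k * n
            assert (a + b) % n == (c + d) % n
            assert (a * b) % n == (c * d) % n
print("Proposition 3.6.4 verified on all sample witnesses")
```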

In this section, we’ve used the notation a ≡n b to denote the relation because it fits in with our general
infix notation. However, for both historical reasons and because the subscript can be annoying, one typically
uses the following notation.
Notation 3.6.5. Given a, b ∈ Z and n ∈ N+ , we write a ≡ b (mod n) to mean that a ≡n b.
Chapter 4

Introduction to Groups

4.1 Definitions
We begin with the notion of a group. In this context, we deal with just one operation. We choose to start
here in order to get practice with rigor and abstraction in as simple a setting as possible. Also, it turns out
that groups appear across all areas of mathematics in many different guises.
Definition 4.1.1. Let X be a set. A binary operation on X is a function f : X 2 → X.
In other words, a binary operation on a set X is a rule which tells us how to “put together” any two
elements of X. For example, the function f : R2 → R defined by f (x, y) = x2 ey is a binary operation on
R. Notice that a binary operation must be defined for all pairs of elements from X, and it must return an
element of X. The function f (x, y) = x/(y − 1) is not a binary operation on R because it is not defined for any
point of the form (x, 1). The function f : Z2 → R defined by f (x, y) = sin(xy) is defined on all of Z2 , but
it is not a binary operation on Z because although some values of f are integers (like f (0, 0) = 0), not all
outputs are integers even when we provide integer inputs (for example, f (1, 1) = sin 1 is not an integer).
Also, the dot product is not a binary operation on R3 because given two element of R3 it returns an element
of R (rather than an element of R3 ).
Instead of using the standard function notation for binary operations, one typically uses the so-called
infix notation. For example, when we add two numbers, we write x + y rather the far more cumbersome
+(x, y). For the binary operation involved with groups, we will follow this infix notation.
Definition 4.1.2. A group is a set G equipped with a binary operation · and an element e ∈ G such that:
1. (Associativity) For all a, b, c ∈ G, we have (a · b) · c = a · (b · c).
2. (Identity) For all a ∈ G, we have a · e = a and e · a = a.
3. (Inverses) For all a ∈ G, there exists b ∈ G with a · b = e and b · a = e.
In the abstract definition of a group, we have chosen to use the symbol · for the binary operation. This
symbol may look like the “multiplication” symbol, but the operation need not be the usual multiplication in
any sense. In fact, the · operation may be addition, exponentiation, or some bizarre and unnatural operation
we never thought of before.
Notice that any description of a group needs to provide three things: a set G, a binary operation · on G
(i.e. function from G2 to G), and a particular element e ∈ G. Absolutely any such choice of set, function,
and element can conceivably comprise a group. To check whether this is indeed the case, we need only check
whether the above three properties are true of that fixed choice.


Here are a few examples of groups (in most cases we do not verify all of the properties here, but we will
do so later):

1. (Z, +, 0) is a group, as are (Q, +, 0), (R, +, 0), and (C, +, 0).

2. (Q\{0}, ·, 1) is a group. We need to omit 0 because it has no multiplicative inverse. Notice that the
product of two nonzero elements of Q is nonzero, so · really is a binary operation on Q\{0}.
 
3. The set of invertible 2 × 2 matrices over R with matrix multiplication and identity $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. In order
to verify that multiplication is a binary operation on these matrices, it is important to note that the
product of two invertible matrices is itself invertible, and that the inverse of an invertible matrix is
itself invertible. These are fundamental facts from linear algebra.

4. ({1, −1, i, −i}, ·, 1) is a group where · is multiplication of complex numbers.

5. ({T, F }, ⊕, F ) where ⊕ is “exclusive or” on T (interpreted as “true”) and F (interpreted as “false”).

6. Fix n ∈ N+ and let {0, 1}n be the set of all sequences of 0’s and 1’s of length n. If we let ⊕ be the
bitwise “exclusive or” operation, then ({0, 1}n , ⊕, 0n ) is a group. Notice that if n = 1, and we interpret
0 as false and 1 as true, then this example looks just like the previous one.

In contrast, here are some examples of non-groups.

1. (Z, −, 0) is not a group. Notice that − is not associative because (3 − 2) − 1 = 1 − 1 = 0 but


3 − (2 − 1) = 3 − 1 = 2. Also, 0 is not an identity because although a − 0 = a for all a ∈ Z, we have
0 − 1 = −1 ≠ 1.

2. Let S be the set of all odd elements of Z, with the additional inclusion of 0, i.e. S = {0} ∪ {n ∈ Z : n
is odd}. Then (S, +, 0) is not a group. Notice that + is associative, that 0 is an identity, and that
inverses exist. However, + is not a binary operation on S because 1 + 3 = 4 ∉ S. In other words, S is
not closed under +.

3. (R, ·, 1) is not a group because 0 has no inverse.

Returning to the examples of groups above, notice that the group operation is commutative in all the
examples above except for 3. To see that the operation in 3 is not commutative, simply notice that each of
the matrices

$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$

are invertible (they both have determinant 1), but

$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$

while

$\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.$
Thus, there are groups in which the group operation · is not commutative. The special groups whose
operation does satisfy this additional fundamental property are named after Niels Abel.

Definition 4.1.3. A group (G, ·, e) is abelian if · is commutative, i.e. if a · b = b · a for all a, b ∈ G. A group
that is not abelian is called nonabelian.

We will see many examples of nonabelian groups in time.


Notice that for a group G, there is no requirement at all that the set G or the operation · are in any way
“natural” or “reasonable”. For example, suppose that we work with the set G = {3, ℵ, @} with operation ·
defined by the following table.

· 3 ℵ @
3 @ 3 ℵ
ℵ 3 ℵ @
@ ℵ @ 3

Interpret the above table as follows. To determine a · b, go to the row corresponding to a and the column
corresponding to b, and a · b will be the corresponding entry. For example, in order to compute 3 · @, we go
to the row labeled 3 and column labeled @, which tells us that 3 · @ = ℵ. We can indeed check that this
does form a group with identity element ℵ. However, this is a painful check because there are 27 choices of
triples for which we need to verify associativity. This table view of a group G described above is called the
Cayley table of G.
Here is another example of a group using the set G = {1, 2, 3, 4, 5, 6} with operation ∗ defined by the
following Cayley table:

∗ 1 2 3 4 5 6
1 1 2 3 4 5 6
2 2 1 6 5 4 3
3 3 5 1 6 2 4
4 4 6 5 1 3 2
5 5 3 4 2 6 1
6 6 4 2 3 1 5

It is straightforward to check that 1 is an identity for G and that every element has an inverse. However,
as above, I very strongly advise against checking associativity directly. We will see later how to build many
new groups and verify associativity without directly trying all possible triples. Notice that this last group is
an example of a finite nonabelian group because 2 ∗ 3 = 6 while 3 ∗ 2 = 5.
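Checking the 216 triples by hand is painful, but it is exactly the kind of task a computer does well. Below is a Python sketch (our own encoding of the Cayley table above as a nested dictionary) that verifies all of the group axioms for (G, ∗) by brute force:

```python
# The Cayley table of (G, *) from the text, encoded row by row:
# table[a][b] is the entry in row a, column b, i.e. a * b.
G = [1, 2, 3, 4, 5, 6]
table = {
    1: {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6},
    2: {1: 2, 2: 1, 3: 6, 4: 5, 5: 4, 6: 3},
    3: {1: 3, 2: 5, 3: 1, 4: 6, 5: 2, 6: 4},
    4: {1: 4, 2: 6, 3: 5, 4: 1, 5: 3, 6: 2},
    5: {1: 5, 2: 3, 3: 4, 4: 2, 5: 6, 6: 1},
    6: {1: 6, 2: 4, 3: 2, 4: 3, 5: 1, 6: 5},
}

def op(a, b):
    return table[a][b]

assert all(op(1, a) == a == op(a, 1) for a in G)                  # identity
assert all(any(op(a, b) == 1 == op(b, a) for b in G) for a in G)  # inverses
assert all(op(op(a, b), c) == op(a, op(b, c))                     # associativity
           for a in G for b in G for c in G)
assert op(2, 3) == 6 and op(3, 2) == 5                            # nonabelian
print("(G, *) satisfies all of the group axioms and is nonabelian")
```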

4.2 Basic Properties, Notation, and Conventions


First, we notice that in the definition of a group it was only stated that there exists an identity element.
There was no comment on how many such elements there are. We first prove that there is only one such
element in every group.

Proposition 4.2.1. Let (G, ·, e) be a group. There exists a unique identity element in G, i.e. if d ∈ G has
the property that a · d = a and d · a = a for all a ∈ G, then d = e.

Proof. The key element to consider is d · e. On the one hand, we know that d · e = d because e is an identity
element. On the other hand, we have d · e = e because d is an identity element. Therefore, d = d · e = e, so
d = e.

We now move on to a similar question about inverses. The axioms only state the existence of an inverse
for every given element. We now prove uniqueness.

Proposition 4.2.2. Let (G, ·, e) be a group. For each a ∈ G, there exists a unique b ∈ G such that a · b = e
and b · a = e.

Proof. Fix a ∈ G. By the group axioms, we know that there exists an inverse of a. Suppose that b and c
both work as inverses, i.e. that a · b = e = b · a and a · c = e = c · a. The crucial element to think about is
(b · a) · c = b · (a · c). We have

b=b·e
= b · (a · c)
= (b · a) · c
=e·c
= c,

hence b = c.
Definition 4.2.3. Let (G, ·, e) be a group. Given an element a ∈ G, we let a−1 denote the unique element
such that a · a−1 = e and a−1 · a = e.
Proposition 4.2.4. Let (G, ·, e) be a group and let a, b ∈ G.
1. There exists a unique element c ∈ G such that a · c = b, namely c = a−1 · b.
2. There exists a unique element c ∈ G such that c · a = b, namely c = b · a−1 .
Proof. We prove 1 and leave 2 as an exercise (it is completely analogous). Notice that c = a−1 · b works
because

a · (a−1 · b) = (a · a−1 ) · b
=e·b
= b.

Suppose now that d ∈ G satisfies a · d = b. We then have that

d=e·d
= (a−1 · a) · d
= a−1 · (a · d)
= a−1 · b.

Here’s an alternate presentation of the latter part (it is the exact same proof, just with more words and a
little more motivation). Suppose that d ∈ G satisfies a · d = b. We then have that a−1 · (a · d) = a−1 · b. By
associativity, the left-hand side is (a−1 · a) · d, which is e · d, which equals d. Therefore, d = a−1 · b.
Corollary 4.2.5 (Cancellation Laws). Let (G, ·, e) be a group and let a, b, c ∈ G.
1. If a · c = a · d, then c = d.
2. If c · a = d · a, then c = d.
Proof. Suppose that a · c = a · d. Letting b equal this common value, it follows from the uniqueness part of
Proposition 4.2.4 that c = d. Alternatively, multiply both sides on the left by a−1 and use associativity.
Part 2 is completely analogous.
In terms of the Cayley table of a group G, Proposition 4.2.4 says that every element of G appears
exactly once in each row and exactly once in each column of the table. In fancy terminology, the Cayley table
of a group is a Latin square.

Proposition 4.2.6. Let (G, ·, e) be a group and let a ∈ G. We have (a−1 )−1 = a.
Proof. By definition we have a · a−1 = e = a−1 · a. Thus, a satisfies the requirement to be the inverse of a−1 .
In other words, (a−1 )−1 = a.
Proposition 4.2.7. Let (G, ·, e) be a group and let a, b ∈ G. We have (a · b)−1 = b−1 · a−1 .
Proof. We have

(a · b) · (b−1 · a−1 ) = ((a · b) · b−1 ) · a−1


= (a · (b · b−1 )) · a−1
= (a · e) · a−1
= a · a−1
= e.

and similarly (b−1 · a−1 ) · (a · b) = e. Therefore, (a · b)−1 = b−1 · a−1 .


We now establish some notation. First, we often just say “Let G be a group” rather than the more precise
“Let (G, ·, e) be a group”. In other words, we typically do not explicitly mention the binary operation and
will just write · afterward if we feel the need to use it. The identity element is unique as shown in Proposition
4.2.1, so we do not feel the need to be so explicit about it and will just call it e later if we need to refer to
it. If some confusion may arise (for example, there are several natural binary operations on the given set),
then we will need to be explicit.
Let G be a group. If a, b ∈ G, we typically write ab rather than explicitly writing the binary operation
in a · b. Of course, sometimes it makes sense to insert the dot. For example, if G contains the integers 2
and 4 and 24, writing 24 would certainly be confusing. Also, if the group operation is standard addition or
some other operation with a conventional symbol, we will switch up and use that symbol to avoid confusion.
However, when there is no confusion, and when we are dealing with an abstract group without explicit
mention of what the operation actually is, we will tend to omit it.
With this convention in hand, associativity tells us that (ab)c = a(bc) for all a, b, c ∈ G. Now using
associativity repeatedly we obtain what can be called “generalized associativity”. For example, suppose that
G is a group and a, b, c, d ∈ G. We then have that (ab)(cd) = (a(bc))d because we can use associativity twice
as follows:

(ab)(cd) = ((ab)c)d
= (a(bc))d.

In general, such rearrangements are always possible by iteratively moving the parentheses around as long as
the order of the elements doesn’t change. In other words, no matter how we insert parentheses in abcd so that
it makes sense, we always obtain the same result. For a sequence of 4 elements it is straightforward to try
them all, and for 5 elements it is tedious but feasible. However, it is true no matter how long the sequence is.
A careful proof of this fact requires a careful definition of what a “permissible insertion of parentheses” means,
and at this level such an involved tangent is more distracting from our primary aims than it is enlightening.
We will simply take the result as true.
Keep in mind that the order of the elements occurring does matter. There is no reason at all to think that
abcd equals dacb unless the group is abelian. However, if G is abelian then upon any insertion of parentheses
into these expressions so that they make sense, they will evaluate to the same value. Thus, assuming that G is
commutative, we obtain a kind of “generalized commutativity” just like we have a “generalized associativity”.
Definition 4.2.8. Let G be a group. If G is a finite set, then the order of G, denoted |G|, is the number of
elements in G. If G is infinite, we simply write |G| = ∞.

At the moment, we only have a few examples of groups. We know that the usual sets of numbers under
addition like (Z, +, 0), (Q, +, 0), (R, +, 0), and (C, +, 0) are groups. We listed a few other examples above,
such as (Q\{0}, ·, 1), (R\{0}, ·, 1), and the invertible 2 × 2 matrices under multiplication. In these last few
examples, we needed to “throw away” some elements from the natural set in order to satisfy the inverse
requirement, and in the next section we outline this construction in general.

4.3 Building Groups From Associative Operations


When constructing new groups from scratch, the most difficult property to verify is the associative property
of the binary operation. In the next few sections, we will construct some fundamental groups using some
known associative operations. However, even if we have a known associative operation with a natural identity
element, we may not always have inverses. In the rest of this section, we show that in such a situation, if we
restrict down to the “invertible” elements, then we indeed do have a group.

Definition 4.3.1. Let · be a binary operation on a set A that is associative and has an identity element e.
Given a ∈ A, we say that a is invertible if there exists b ∈ A with both ab = e and ba = e. In this case, we
say that b is an inverse for a.

Proposition 4.3.2. Let · be a binary operation on a set A that is associative and has an identity element
e. We then have that every a ∈ A has at most one inverse.

Proof. The proof is completely analogous to the proof for functions. Let a ∈ A and suppose that b and c are
both inverses for a. We then have that

b = be
= b(ac)
= (ba)c
= ec
=c

so b = c.

Notation 4.3.3. Let · be a binary operation on a set A that is associative and has an identity element e. If
a ∈ A is invertible, then we denote its unique inverse by a−1 .

Proposition 4.3.4. Let · be a binary operation on a set A that is associative and has an identity element e.

1. e is an invertible element with e−1 = e.

2. If a and b are both invertible, then ab is invertible and (ab)−1 = b−1 a−1 .

3. If a is invertible, then a−1 is invertible and (a−1 )−1 = a.

Proof. 1. Since ee = e because e is an identity element, we see immediately that e is invertible and
e−1 = e.

2. Suppose that a, b ∈ A are both invertible. We then have that

abb−1 a−1 = aea−1


= aa−1
=e

and also

b−1 a−1 ab = b−1 eb
= b−1 b
=e

It follows that ab is invertible and (ab)−1 = b−1 a−1 .

3. Suppose that a ∈ A is invertible. By definition, we then have that both aa−1 = e and also that
a−1 a = e. Looking at these equations, we see that a satisfies the definition of being an inverse for a−1 .
Therefore, a−1 is invertible and (a−1 )−1 = a.

Corollary 4.3.5. Let · be a binary operation on a set A that is associative and has an identity element e.
Let B be the set of all invertible elements of A. We then have that · is a binary operation on B and that
(B, ·, e) is a group. Furthermore, if · is commutative on A, then (B, ·, e) is an abelian group.

Proof. By Proposition 4.3.4, we know that if b1 , b2 ∈ B, then b1 b2 ∈ B, and hence · is a binary operation
on B. Proposition 4.3.4 also tells us that e ∈ B. Furthermore, since e is an identity for A, and B ⊆ A, it
follows that e is an identity element for B. Now · is an associative operation on A and B ⊆ A, so it follows
that · is an associative operation on B. Finally, if b ∈ B, then Proposition 4.3.4 tells us that b−1 ∈ B as
well, so b does have an inverse in B. Putting this all together, we conclude that (B, ·, e) is a group. For the
final statement, simply notice that if · is a commutative operation on A, then since B ⊆ A it follows that ·
is a commutative operation on B.

One special case of this construction is that (Q\{0}, ·, 1), (R\{0}, ·, 1), and (C\{0}, ·, 1) are all abelian
groups. In each case, multiplication is an associative and commutative operation on the whole set, and 1 is
an identity element. Furthermore, in each case, every element but 0 is invertible. Thus, Corollary 4.3.5 says
that each of these are abelian groups.
Another associative operation that we know is the multiplication of matrices. Of course, the set of all
n × n matrices does not form a group because some matrices do not have inverses. However, if we restrict
down to just the invertible matrices, then Corollary 4.3.5 tells us that we do indeed obtain a group.

Definition 4.3.6. Let n ∈ N+ . The set of all n × n invertible matrices with real entries forms a group under
matrix multiplication, with identity element equal to the n × n identity matrix with 1’s on the diagonal and
0’s everywhere else. We denote this group by GLn (R) and call it the general linear group of degree n over
R.

Notice that GL1 (R) is really just the group (R\{0}, ·, 1). However, for each n ≥ 2, the group GLn (R) is
nonabelian so we have an infinite family of such groups (it is worthwhile to explicitly construct two n × n
invertible matrices that do not commute with each other).
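As a concrete instance, the two triangular matrices used earlier in this chapter already fail to commute, and the same pattern (an upper and a lower triangular matrix with 1's on the diagonal) works in any dimension. A small Python sketch with matrices encoded as lists of rows (the helper name matmul is ours):

```python
def matmul(A, B):
    # Product of two 2x2 matrices represented as nested lists of rows.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Two elements of GL_2(R), each of determinant 1 and hence invertible.
A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]

print(matmul(A, B))  # [[2, 1], [1, 1]]
print(matmul(B, A))  # [[1, 1], [1, 2]]
assert matmul(A, B) != matmul(B, A)
```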
Moreover, it is also possible to allow the entries of the matrices to come from a place other than R. For example,
we can consider matrices with entries from C. In this case, matrix multiplication is still associative, and
the usual identity matrix is still an identity element. Thus, we can define the following group:

Definition 4.3.7. Let n ∈ N+ . The set of all n × n invertible matrices with complex entries forms a group
under matrix multiplication, with identity element equal to the n × n identity matrix with 1’s on the diagonal
and 0’s everywhere else. We denote this group by GLn (C) and call it the general linear group of degree n
over C.

As we will see in Chapter 9, we can even generalize this matrix construction to other “rings”.

4.4 The Groups Z/nZ and U (Z/nZ)


Proposition 4.4.1. Let n ∈ N+ and let G = Z/ ≡n , i.e. the elements of G are the equivalence classes of Z
under the equivalence relation ≡n . Define a binary operation + on G by letting $\overline{a} + \overline{b} = \overline{a + b}$. We then have
that (G, +, $\overline{0}$) is an abelian group.
Proof. We have already shown in Proposition 3.6.4 that + is well-defined on G. With that in hand, the
definition makes it immediate that + is a binary operation on G. We now check the axioms for an abelian
group.
• Associative: For any a, b, c ∈ Z we have

$(\overline{a} + \overline{b}) + \overline{c} = \overline{a + b} + \overline{c}$
$= \overline{(a + b) + c}$
$= \overline{a + (b + c)}$   (since + is associative on Z)
$= \overline{a} + \overline{b + c}$
$= \overline{a} + (\overline{b} + \overline{c})$

so + is associative on G.

• Identity: For any a ∈ Z we have
$\overline{a} + \overline{0} = \overline{a + 0} = \overline{a}$
and
$\overline{0} + \overline{a} = \overline{0 + a} = \overline{a}$

• Inverses: For any a ∈ Z we have
$\overline{a} + \overline{-a} = \overline{a + (-a)} = \overline{0}$
and
$\overline{-a} + \overline{a} = \overline{(-a) + a} = \overline{0}$

• Commutative: For any a, b ∈ Z we have

$\overline{a} + \overline{b} = \overline{a + b}$
$= \overline{b + a}$   (since + is commutative on Z)
$= \overline{b} + \overline{a}$

so + is commutative on G.

Therefore, (G, +, $\overline{0}$) is an abelian group.
Definition 4.4.2. Let n ∈ N+ . We denote the above abelian group by Z/nZ. We call the group “Z mod
nZ”.
This notation is unmotivated at the moment, but will make sense when we discuss general quotient
groups (and be consistent with the more general notation we establish there). Notice that |Z/nZ| = n for
all n ∈ N+ . Thus, we have shown that there exists an abelian group of every finite order.
Let’s examine one example in detail. Consider Z/5Z. As discussed in the last section, the equivalence
classes $\overline{0}$, $\overline{1}$, $\overline{2}$, $\overline{3}$, and $\overline{4}$ are all distinct and give all elements of Z/5Z. By definition, we have $\overline{3} + \overline{4} = \overline{7}$.
This is perfectly correct, but 7 is not one of the special representatives we chose above. Since $\overline{7} = \overline{2}$, we can
also write $\overline{3} + \overline{4} = \overline{2}$, and now we have removed mention of all representatives other than the chosen ones.
Working it all out with only those representatives, we get the following Cayley table for Z/5Z:
+ 0 1 2 3 4
0 0 1 2 3 4
1 1 2 3 4 0
2 2 3 4 0 1
3 3 4 0 1 2
4 4 0 1 2 3
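The table above can be generated mechanically, since the class of a + b is represented by (a + b) mod n. A short Python sketch:

```python
n = 5
# Addition Cayley table of Z/5Z on the chosen representatives {0, 1, 2, 3, 4}.
table = [[(a + b) % n for b in range(n)] for a in range(n)]
for row in table:
    print(row)
# Each row (and column) is a cyclic shift of [0, 1, 2, 3, 4], so every
# element appears exactly once per row and column, as Proposition 4.2.4 demands.
```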

We also showed in Proposition 3.6.4 that multiplication is well-defined on the quotient. Now it is straightforward
to mimic the above arguments to show that multiplication is associative and commutative, and that
$\overline{1}$ is an identity. However, it seems unlikely that we always have inverses because Z itself fails to have
multiplicative inverses for all elements other than ±1. But let’s look at what happens in the case n = 5. For
example, we have $\overline{4} \cdot \overline{4} = \overline{16} = \overline{1}$, so in particular $\overline{4}$ does have a multiplicative inverse. Working out the
computations, we get the following table.

· 0 1 2 3 4
0 0 0 0 0 0
1 0 1 2 3 4
2 0 2 4 1 3
3 0 3 1 4 2
4 0 4 3 2 1

Examining the table, we are pleasantly surprised that every element other than $\overline{0}$ does have a multiplicative
inverse! In hindsight, there was no hope of $\overline{0}$ having a multiplicative inverse because $\overline{0} \cdot \overline{a} = \overline{0 \cdot a} = \overline{0} \neq \overline{1}$
for all a ∈ Z. With this one example we might have the innocent hope that we always get multiplicative
inverses for elements other than $\overline{0}$ when we change n. Let’s dash those hopes now by looking at the case when
n = 6.

· 0 1 2 3 4 5
0 0 0 0 0 0 0
1 0 1 2 3 4 5
2 0 2 4 0 2 4
3 0 3 0 3 0 3
4 0 4 2 0 4 2
5 0 5 4 3 2 1

Looking at the table, we see that only $\overline{1}$ and $\overline{5}$ have multiplicative inverses. There are other curiosities
as well. For example, we can have two nonzero elements whose product is zero, as shown by $\overline{3} \cdot \overline{4} = \overline{0}$. This
is an interesting fact, and we will return to such considerations when we get to ring theory. However, let’s get
to our primary concern of forming a group under multiplication. Since · is an associative and commutative
operation on Z/nZ with identity element $\overline{1}$, Corollary 4.3.5 tells us that we can form an abelian group by
trimming down to those elements that do have inverses. To get a better handle on what elements will be in
this group, we use the following fact.

Proposition 4.4.3. Let n ∈ N+ and let a ∈ Z. The following are equivalent.

1. There exists c ∈ Z with ac ≡ 1 (mod n).

2. gcd(a, n) = 1.

Proof. Assume 1. Fix c ∈ Z with ac ≡ 1 (mod n). We then have n | (ac − 1), so we may fix k ∈ Z with
nk = ac − 1. Rearranging, we see that ac + n(−k) = 1. Hence, there is an integer combination of a and n

which gives 1. Since gcd(a, n) is the least positive such integer, and there is no positive integer less than 1,
we conclude that gcd(a, n) = 1.
Suppose conversely that gcd(a, n) = 1. Fix k, ℓ ∈ Z with ak + nℓ = 1. Rearranging gives n(−ℓ) = ak − 1,
so n | (ak − 1). It follows that ak ≡ 1 (mod n), so we can choose c = k.
Thus, when n = 6, the fundamental reason why $\overline{1}$ and $\overline{5}$ have multiplicative inverses is that gcd(1, 6) =
1 = gcd(5, 6). In the case of n = 5, the reason why every element other than $\overline{0}$ had a multiplicative inverse
is that every positive integer less than 5 is relatively prime to 5, which is essentially just saying that
5 is prime. Notice that the above argument can be turned into an explicit algorithm for finding such a c as
follows. Given n ∈ N+ and a ∈ Z with gcd(a, n) = 1, use the Euclidean algorithm to produce k, ℓ ∈ Z with
ak + nℓ = 1. As the argument shows, we can then choose c = k.
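That algorithm is easy to implement. The sketch below (the function names are our own) computes the coefficients in the combination by the extended Euclidean algorithm, and returns the chosen representative of the inverse class when gcd(a, n) = 1:

```python
def extended_gcd(a, b):
    # Returns (g, x, y) with g = gcd(a, b) and a*x + b*y = g.
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def mod_inverse(a, n):
    # Representative of the inverse of the class of a in Z/nZ, when it exists.
    g, k, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("no inverse: gcd(a, n) != 1")
    return k % n

print(mod_inverse(4, 5))  # 4, since 4 * 4 = 16 ≡ 1 (mod 5)
print(mod_inverse(3, 7))  # 5, since 3 * 5 = 15 ≡ 1 (mod 7)
```

Calling mod_inverse(2, 6) raises an error, matching the fact that $\overline{2}$ has no inverse in Z/6Z.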
As a consequence of the above result and the fact that multiplication is well-defined (Proposition 3.6.4),
we see that if a ≡ b (mod n), then gcd(a, n) = 1 if and only if gcd(b, n) = 1. Of course we could have proved
this without this result: Suppose that a ≡ b (mod n). We then have that n | (a − b), so we may fix k ∈ Z
with nk = a − b. This gives a = kn + b so gcd(a, n) = gcd(b, n) by Corollary 2.4.6.
Now that we know which elements have multiplicative inverses, if we trim down to these elements then
we obtain an abelian group.
Proposition 4.4.4. Let n ∈ N+ and let G = Z/ ≡n , i.e. the elements of G are the equivalence classes of
Z under the equivalence relation ≡n . Let U be the following subset of G:

U = {$\overline{a}$ : a ∈ Z and gcd(a, n) = 1}

Define a binary operation · on U by letting $\overline{a} \cdot \overline{b} = \overline{a \cdot b}$. We then have that (U, ·, $\overline{1}$) is an abelian group.
Proof. Since · is an associative and commutative operation on Z/nZ with identity element $\overline{1}$, this follows
immediately from Corollary 4.3.5 and Proposition 4.4.3.
Definition 4.4.5. Let n ∈ N+ . We denote the above abelian group by U (Z/nZ).
Again, this notation may seem a bit odd, but we will come back and explain it in context when we get
to ring theory. To see some examples of Cayley tables of these groups, here is the Cayley table of U (Z/5Z):
· 1 2 3 4
1 1 2 3 4
2 2 4 1 3
3 3 1 4 2
4 4 3 2 1

and here is the Cayley table of U (Z/6Z):


· 1 5
1 1 5
5 5 1

Finally, we give the Cayley table of U (Z/8Z) because we will return to this group later as an important
example:
· 1 3 5 7
1 1 3 5 7
3 3 1 7 5
5 5 7 1 3
7 7 5 3 1
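These tables can be produced for any n directly from the definition: list the representatives a with gcd(a, n) = 1, then multiply and reduce mod n. A Python sketch (the helper names are ours):

```python
from math import gcd

def units(n):
    # Representatives of the elements of U(Z/nZ).
    return [a for a in range(n) if gcd(a, n) == 1]

def unit_product(a, b, n):
    # The group operation: multiply representatives, then reduce mod n.
    return (a * b) % n

print(units(8))  # [1, 3, 5, 7]
print(units(6))  # [1, 5]
# In U(Z/8Z), every element squares to the identity, as the table shows.
assert all(unit_product(a, a, 8) == 1 for a in units(8))
```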

Definition 4.4.6. We define a function ϕ : N+ → N+ as follows. For each n ∈ N+ , we let

ϕ(n) = |{a ∈ {0, 1, 2, . . . , n − 1} : gcd(a, n) = 1}|

The function ϕ is called the Euler ϕ-function or Euler totient function.


Directly from our definition of U (Z/nZ), we see that |U (Z/nZ)| = ϕ(n) for all n ∈ N+ . Calculating ϕ(n)
is a nontrivial task, although it is straightforward if we know the prime factorization of n (see Chapter ??).
Notice that ϕ(p) = p − 1 for all primes p, so |U (Z/pZ)| = p − 1 for all primes p.
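Computing ϕ directly from the definition, by counting gcd's, gives a simple (if slow) Python sketch:

```python
from math import gcd

def phi(n):
    # Euler's totient: the number of a in {0, 1, ..., n - 1} with gcd(a, n) = 1.
    return sum(1 for a in range(n) if gcd(a, n) == 1)

print([phi(n) for n in (5, 6, 8, 12)])  # [4, 2, 4, 4]
# phi(p) = p - 1 for every prime p, since 1, 2, ..., p - 1 are all coprime to p.
assert all(phi(p) == p - 1 for p in (2, 3, 5, 7, 11, 13))
```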

4.5 The Symmetric Groups


Using all of our hard work in Chapter 3, we can now construct a group whose elements are functions. The
key fact is that function composition is an associative operation by Proposition 3.3.5, so we can think about
using it as a group operation. However, there are a couple of things to consider. First if f : A → B and
g : B → C are functions so that g ◦ f makes sense, we would still have that f ◦ g doesn’t make sense unless
A = C. Even more importantly, f ◦ f doesn’t make sense unless A = B. Thus, if we want to make sure
that composition always works, we should stick only to those functions that are from a given set A to itself.
With this in mind, let’s consider the set S of all functions f : A → A. We now have that composition is an
associative operation on S, and furthermore idA serves as an identity element. Of course, not all elements
of S have an inverse, but we do know from Proposition 3.4.3 that an element f ∈ S has an inverse if and
only if f is a bijection. With that in mind, we can use Corollary 4.3.5 to form a group.
Definition 4.5.1. Let A be a set. A permutation of A is a bijection f : A → A. Rather than use the standard
symbols f, g, h, . . . employed for general functions, we typically employ letters late in the Greek alphabet, like
σ, τ, π, µ, . . . for permutations.
Definition 4.5.2. Let A be a set. The set of all permutations of A together with the operation of function
composition is a group called the symmetric group on A. We denote this group by SA . If we’re given n ∈ N+ ,
we simply write Sn rather than S{1,2,...,n} .
Remember that elements of SA are functions and the operation is composition. It’s exciting to have built
such an interesting group, but working with functions as group elements takes some getting used to. For an
example, suppose that A = {1, 2, 3, 4, 5, 6}. To define a permutation σ : A → A, we need to say what σ does
on each element of A, which we can do with a simple list. Define σ : A → A as follows.
• σ(1) = 5
• σ(2) = 6
• σ(3) = 3
• σ(4) = 1
• σ(5) = 4
• σ(6) = 2
We then have that σ : A → A is a permutation (note every element of A is hit by exactly one element). Now
it’s unnecessarily cumbersome to write out the value of σ(k) on each k in a bulleted list like this. Instead,
we can use the slightly more compact notation:
 
σ = ( 1 2 3 4 5 6 )
    ( 5 6 3 1 4 2 )

Interpret the above matrix by taking the top row as the inputs and each number below as the corresponding
output. Let's throw another permutation τ ∈ S6 into the mix. Define
 
τ = ( 1 2 3 4 5 6 )
    ( 3 1 5 6 2 4 )

Let's compute σ ◦ τ . Remember that function composition happens from right to left. That is, the composition
σ ◦ τ is obtained by performing τ first, followed by σ. For example, we have

(σ ◦ τ )(2) = σ(τ (2)) = σ(1) = 5

Working through the 6 inputs, we obtain:


 
σ ◦ τ = ( 1 2 3 4 5 6 )
        ( 3 5 4 2 6 1 )

On the other hand, we have

τ ◦ σ = ( 1 2 3 4 5 6 )
        ( 2 4 5 3 6 1 )
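Compositions like these are easy to check by machine. As a sketch (the encoding of a permutation as a Python dict from inputs to outputs is our own convention, not from the text):

```python
# Permutations of {1,...,6} as dicts from inputs to outputs.
sigma = {1: 5, 2: 6, 3: 3, 4: 1, 5: 4, 6: 2}
tau   = {1: 3, 2: 1, 3: 5, 4: 6, 5: 2, 6: 4}

def compose(f, g):
    """Return f o g, i.e. apply g first and then f."""
    return {x: f[g[x]] for x in g}

print(compose(sigma, tau))  # {1: 3, 2: 5, 3: 4, 4: 2, 5: 6, 6: 1}
print(compose(tau, sigma))  # {1: 2, 2: 4, 3: 5, 4: 3, 5: 6, 6: 1}
```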
Notice that σ ◦ τ ≠ τ ◦ σ. We have just shown that S6 is a nonabelian group. Let's determine just how large
these groups are.
Proposition 4.5.3. If |A| = n, then |SA | = n!. In particular, |Sn | = n! for all n ∈ N+ .
Proof. For notational convenience, assume that A = {1, 2, . . . , n}. We count the number of elements of the set SA
by constructing all possible permutations σ : A → A via a sequence of choices. To construct such a
permutation, we can first determine the value σ(1). Since
no numbers have been claimed, we have n possibilities for this value because we can choose any element
of A. Now we move on to determine the value σ(2). Notice that we can make σ(2) any value in A except
for the chosen value of σ(1) (because we need to ensure that σ is an injection). Thus, we have n − 1 many
possibilities. Now to determine the value of σ(3), we can choose any value in A other than σ(1) and σ(2)
because those have already been claimed, so we have n − 2 many choices. In general, when trying to define
σ(k + 1), we have already claimed k necessarily distinct elements of A, so we have exactly n − k possible
choices for the value. Thus, the number of permutations of A equals

n · (n − 1) · (n − 2) · · · 2 · 1 = n!
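The count above is easy to confirm by brute force with the standard library, enumerating all bijections of {1, . . . , n} directly:

```python
from itertools import permutations
from math import factorial

# Enumerate all bijections of {1,...,n} and compare the count with n!.
for n in range(1, 6):
    count = sum(1 for _ in permutations(range(1, n + 1)))
    assert count == factorial(n)
    print(n, count)  # prints n and n! for n = 1,...,5
```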

Let’s return to our element σ in S6 :


 
σ = ( 1 2 3 4 5 6 )
    ( 5 6 3 1 4 2 )

This method of representing σ is reasonably compact, but it hides the fundamental structure of what is
happening. For example, it is difficult to “see” what the inverse of σ is, or to understand what happens
when we compose σ with itself repeatedly. We now develop another method for representing a permutation
on A called cycle notation. The basic idea is to take an element of A and follow its path through σ. For
example, let’s start with 1. We have σ(1) = 5. Now instead of moving on to deal with 2, let’s continue this
thread and determine the value σ(5). Looking above, we see that σ(5) = 4. If we continue on this path
to investigate 4, we see that σ(4) = 1, and we have found a “cycle” 1 → 5 → 4 → 1 hidden inside σ. We
will denote this cycle with the notation (1 5 4). Now that those numbers are taken care of, we start again
with the smallest number not yet claimed, which in this case is 2. We have σ(2) = 6 and following up gives
σ(6) = 2. Thus, we have found the cycle 2 → 6 → 2 and we denote this by (2 6). We have now claimed all

numbers other than 3, and when we investigate 3 we see that σ(3) = 3, so we form the sad lonely cycle (3).
Putting this all together, we write σ in cycle notation as

σ = (1 5 4)(2 6)(3)

One might wonder whether we ever get “stuck” when trying to build these cycles. What would happen if we
follow 1 and we repeat a number before coming back to 1? For example, what if we see 1 → 3 → 6 → 2 → 6?
Don’t fear because this can never happen. The only way the example above could crop up is if the purported
permutation sent both 3 and 2 to 6, which would violate the fact that the purported permutation is injective.
Also, if we finish a few cycles and start up a new one, then it is not possible that our new cycle has any
elements in common with previous ones. For example, if we already have the cycle 1 → 3 → 2 → 1 and we
start with 4, we can’t find 4 → 5 → 3 because then both 1 and 5 would map to 3.
Our conclusion is that this process of writing down a permutation in cycle notation never gets stuck and
results in writing the given permutation as a product of disjoint cycles. Working through the same process
with the permutation

τ = ( 1 2 3 4 5 6 )
    ( 3 1 5 6 2 4 )
we see that in cycle notation we have
τ = (1 3 5 2)(4 6)
Now we can determine σ ◦ τ in cycle notation directly from the cycle notations of σ and τ . For example,
suppose we want to calculate the following:

(1 2 4)(3 6)(5) ◦ (1 6 2)(3 5 4)

We want to determine the cycle notation of the resulting function, so we first need to determine where it
sends 1. Again, function composition happens from right to left. Looking at the function represented on the
right, we see the cycle containing 1 is (1 6 2), so the right function sends 1 to 6. We then go to the function
on the left and see where it sends 6. The cycle containing 6 there is (3 6), so it takes 6 and sends it to 3.
Thus, the composition sends 1 to 3. Thus, our result starts out as

(1 3

Now we need to see what happens to 3. The function on the right sends 3 to 5, and the function on the left
leaves 5 alone, so we have
(1 3 5
When we move on to see what happens to 5, we notice that the right function sends it to 4 and then the left
function takes 4 to 1. Since 1 is the first element of the cycle we started, we now close the loop and have

(1 3 5)

We now pick up the least element not in the cycle and continue. Working it out, we end with:

(1 2 4)(3 6)(5) ◦ (1 6 2)(3 5 4) = (1 3 5)(2)(4 6)

Finally, we make our notation a bit more compact with a few conventions. First, we simply omit any cycles
of length 1, so we just write (1 2 4)(3 6) instead of (1 2 4)(3 6)(5). Of course, this requires an understanding
of which n we are using to avoid ambiguity, as the notation (1 2 4)(3 6) doesn't specify whether we are
viewing it as an element of S6 or S8 (in the latter case, the corresponding function fixes both 7 and 8). Also,
as with most group operations, we simply omit the ◦ when composing functions. Thus, we would write the
above as:
(1 2 4)(3 6)(1 6 2)(3 5 4) = (1 3 5)(2)(4 6)

Now there is potential for some conflict here. Looking at the first two cycles above, we meant to think of
(1 2 4)(3 6) as one particular function on {1, 2, 3, 4, 5, 6}, but by omitting the group operation it could also
be interpreted as (1 2 4) ◦ (3 6). Fortunately, these are exactly the same function because the cycles are
disjoint. Thus, there is no ambiguity.
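The procedure just described (start at the smallest unclaimed element and follow its path) is easy to mechanize. A sketch in Python, with the dict encoding and function name being our own conventions:

```python
def cycle_decomposition(perm):
    """Write a permutation (a dict on {1,...,n}) as a list of disjoint cycles,
    starting each new cycle at the smallest element not yet claimed."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        cycles.append(tuple(cycle))
    return cycles

# The permutation sigma from above: sigma = (1 5 4)(2 6)(3).
sigma = {1: 5, 2: 6, 3: 3, 4: 1, 5: 4, 6: 2}
print(cycle_decomposition(sigma))  # [(1, 5, 4), (2, 6), (3,)]
```

The injectivity of the permutation is exactly what guarantees the while loop always closes each cycle back at its starting point, mirroring the argument above.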
Let’s work out everything about S3 . First, we know that |S3 | = 3! = 6. Working through the possibilities,
we determine that
S3 = {id, (1 2), (1 3), (2 3), (1 2 3), (1 3 2)}
so S3 has the identity function, three 2-cycles, and two 3-cycles. Here is the Cayley table of S3 .
◦ id (1 2) (1 3) (2 3) (1 2 3) (1 3 2)
id id (1 2) (1 3) (2 3) (1 2 3) (1 3 2)
(1 2) (1 2) id (1 3 2) (1 2 3) (2 3) (1 3)
(1 3) (1 3) (1 2 3) id (1 3 2) (1 2) (2 3)
(2 3) (2 3) (1 3 2) (1 2 3) id (1 3) (1 2)
(1 2 3) (1 2 3) (1 3) (2 3) (1 2) (1 3 2) id
(1 3 2) (1 3 2) (2 3) (1 2) (1 3) id (1 2 3)
Notice that S3 is a nonabelian group with 6 elements. In fact, it is the smallest possible nonabelian group,
as we shall see later.
We end this section with two straightforward but important results.
Proposition 4.5.4. Disjoint cycles commute. That is, if A is a set and a1 , a2 , . . . , ak , b1 , b2 , . . . , bℓ ∈ A are
all distinct, then
(a1 a2 · · · ak )(b1 b2 · · · bℓ ) = (b1 b2 · · · bℓ )(a1 a2 · · · ak )
Proof. Simply work through where each ai and bj is sent on each side. Since ai ≠ bj for all i and j, each ai
is fixed by the cycle containing the bj 's and vice versa. Furthermore, if c ∈ A is such that c ≠ ai and c ≠ bj
for all i and j, then both cycles fix c, so both sides fix c.
Proposition 4.5.5. Let A be a set and let a1 , a2 , . . . , ak ∈ A be distinct. Let σ = (a1 a2 · · · ak ). We then
have that
σ⁻¹ = (ak ak−1 · · · a2 a1 )
    = (a1 ak ak−1 · · · a2 )
Proof. Let τ = (a1 ak ak−1 · · · a2 ). For any i with 1 ≤ i ≤ k − 1, we have
(τ ◦ σ)(ai ) = τ (σ(ai )) = τ (ai+1 ) = ai
and also
(τ ◦ σ)(ak ) = τ (σ(ak )) = τ (a1 ) = ak
Furthermore, for any i with 2 ≤ i ≤ k, we have
(σ ◦ τ )(a1 ) = σ(τ (a1 )) = σ(ak ) = a1
and also
(σ ◦ τ )(ai ) = σ(τ (ai )) = σ(ai−1 ) = ai
Finally, if c ∈ A is such that c 6= ai for all i, then
(τ ◦ σ)(c) = τ (σ(c)) = τ (c) = c
and
(σ ◦ τ )(c) = σ(τ (c)) = σ(c) = c
Since (τ ◦ σ)(x) = x = (σ ◦ τ )(x) for every element x of A, we conclude that τ ◦ σ = id = σ ◦ τ , and hence τ = σ⁻¹.

4.6 Orders of Elements


Definition 4.6.1. Let G be a group and let a ∈ G. For each n ∈ N+ , we define

a^n = aaa · · · a

where there are n total a's in the above product. If we want to be more formal, we define a^n recursively by
letting a^1 = a and a^(n+1) = a^n a for all n ∈ N+ . We also define

a^0 = e.

Finally, we extend the notation a^n to n ∈ Z as follows. If n < 0, we let

a^n = (a⁻¹)^|n| .

In other words, if n < 0, we let

a^n = a⁻¹ a⁻¹ a⁻¹ · · · a⁻¹

where there are |n| total a⁻¹'s in the above product.
Let G be a group and let a ∈ G. Notice that

a^3 a^2 = (aaa)(aa) = aaaaa = a^5

and

(a^3)^2 = a^3 a^3 = (aaa)(aaa) = aaaaaa = a^6

For other examples, notice that

a^5 a^(-2) = (aaaaa)(a⁻¹ a⁻¹) = aaaa a⁻¹ = aaa = a^3

and also

(a^4)⁻¹ = (aaaa)⁻¹ = a⁻¹ a⁻¹ a⁻¹ a⁻¹ = (a⁻¹)^4 = a^(-4)

In general, we have the following. A precise proof would involve induction using the formal recursive definition
of a^n given above. However, the special cases above show where these rules come from, and such a formal
argument adds little enlightenment, so we omit it.
Proposition 4.6.2. Let G be a group. Let a ∈ G and let m, n ∈ Z. We have
1. a^(m+n) = a^m a^n
2. a^(mn) = (a^m)^n
In some particular groups, this notation can be confusing in practice. For example, when working
in the group (Z, +, 0), we would have 2^4 = 2 + 2 + 2 + 2 = 8, because the operation is addition rather than
multiplication. However, with the standard operation of exponentiation on Z we have 2^4 = 16. Make sure to
understand which operation is in use whenever we use the notation a^n in a given group. It might be better
to use a different notation if confusion can arise.
Definition 4.6.3. Let G be a group and let a ∈ G. We define the order of a as follows. Let

S = {n ∈ N+ : a^n = e}

If S ≠ ∅, we let |a| = min(S) (which exists by well-ordering), and if S = ∅, we define |a| = ∞. In other
words, |a| is the least positive n such that a^n = e, provided such an n exists.

The reason why we choose to overload the word order for two apparently very different concepts (when
applied to a group versus when applied to an element of a group) will be explained in the next chapter on
subgroups.
Example 4.6.4. Here are some examples of computing orders of elements in a group.
• In any group G, we have |e| = 1.
• In the group (Z, +), we have |0| = 1 as noted, but |n| = ∞ for all n ≠ 0.
• In the group Z/nZ, we have |1| = n.
• In the group Z/12Z, we have |9| = 4 because

9^2 = 9 + 9 = 18 = 6, 9^3 = 9 + 9 + 9 = 27 = 3, and 9^4 = 9 + 9 + 9 + 9 = 36 = 0

• In the group U (Z/7Z) we have |4| = 3 because

4^2 = 4 · 4 = 16 = 2 and 4^3 = 4 · 4 · 4 = 64 = 1

The order of an element a ∈ G is the least positive m ∈ Z with a^m = e. There may be many other
larger positive powers of a that give the identity, or even negative powers that do. For example, consider
1 ∈ Z/2Z. We have |1| = 2, but 1^4 = 0, 1^6 = 0, and 1^(-14) = 0. In general, given the order of an element, we
can characterize all powers of that element which give the identity as follows.
Proposition 4.6.5. Let G be a group and let a ∈ G.
1. Suppose that |a| = m ∈ N+ . For any n ∈ Z, we have a^n = e if and only if m | n.
2. Suppose that |a| = ∞. For any n ∈ Z, we have a^n = e if and only if n = 0.
Proof. We first prove 1. Let m = |a| and notice that m > 0. We then have in particular that a^m = e.
Suppose first that n ∈ Z is such that m | n. Fix k ∈ Z with n = mk. We then have

a^n = a^(mk) = (a^m)^k = e^k = e

so a^n = e. Suppose conversely that n ∈ Z and that a^n = e. Since m > 0, we may write n = qm + r where
0 ≤ r < m. We then have

e = a^n
  = a^(qm+r)
  = a^(qm) a^r
  = (a^m)^q a^r
  = e^q a^r
  = a^r

Now by definition we know that m is the least positive power of a which gives the identity. Therefore, since
0 ≤ r < m and a^r = e, we must have that r = 0. It follows that n = qm, so m | n.
We now prove 2. Suppose that |a| = ∞. If n = 0, then we have a^n = a^0 = e. If n > 0, then we have
a^n ≠ e because by definition no positive power of a equals the identity. Suppose then that n < 0 and that
a^n = e. We then have

e = e⁻¹ = (a^n)⁻¹ = a^(-n)

Now −n > 0 because n < 0, but this is a contradiction because no positive power of a gives the identity. It
follows that if n < 0, then a^n ≠ e. Therefore, a^n = e if and only if n = 0.
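As a concrete check of part 1, recall from Example 4.6.4 that |9| = 4 in Z/12Z; in this additive group the power 9^k is the k-fold sum of 9. A brief sketch (the helper power is ours):

```python
def power(a, k, n):
    """a^k in the additive group Z/nZ: the k-fold sum of a (k may be negative)."""
    return (a * k) % n

# |9| = 4 in Z/12Z, so 9^k = 0 exactly when 4 | k (Proposition 4.6.5),
# including the zero and negative exponents.
exponents = [k for k in range(-12, 13) if power(9, k, 12) == 0]
print(exponents)  # [-12, -8, -4, 0, 4, 8, 12]
```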

Next, we work to understand the orders of elements in symmetric groups.

Proposition 4.6.6. Let A be a set and let a1 , a2 , . . . , ak ∈ A be distinct. Let σ = (a1 a2 · · · ak ) ∈ SA . We
then have that |σ| = k. In other words, the order of a cycle is its length.

Proof. If 1 ≤ i < k, we have σ^i (a1 ) = ai+1 ≠ a1 , so σ^i ≠ id. For each i, we have σ^k (ai ) = ai , so σ^k fixes each
ai . Since σ fixes all other elements of A, it follows that σ^k fixes all other elements of A. Therefore, σ^k = id,
and we have shown that |σ| = k.

Proposition 4.6.7. Let A be a set and let σ ∈ SA . We then have that |σ| is the least common multiple of
the cycle lengths occurring in the cycle notation of σ.

Proof. Suppose that σ = τ1 τ2 · · · τℓ where the τi are disjoint cycles. For each i, let mi be the length of the
cycle τi , and notice that |τi | = mi by Proposition 4.6.6. Since disjoint cycles commute by Proposition 4.5.4,
for any n ∈ N+ we have

σ^n = τ1^n τ2^n · · · τℓ^n

Now if mi | n for each i, then τi^n = id for each i by Proposition 4.6.5, so σ^n = id. Conversely, suppose that
n ∈ N+ is such that there exists i with mi ∤ n. We then have that τi^n ≠ id by Proposition 4.6.5, so we may
fix a ∈ A with τi^n (a) ≠ a. Now both a and τi^n (a) are fixed by each τj with j ≠ i (because the cycles are
disjoint). Therefore σ^n (a) = τi^n (a) ≠ a, and hence σ^n ≠ id.
It follows that σ^n = id if and only if mi | n for each i. Since |σ| is the least n ∈ N+ with σ^n = id, it
follows that |σ| is the least n ∈ N+ satisfying mi | n for each i, which is to say that |σ| is the least common
multiple of the mi .
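Proposition 4.6.7 gives a fast way to compute orders in SA by machine: find the cycle lengths and take their lcm. A sketch (function name and dict encoding ours; math.lcm requires Python 3.9 or later):

```python
from math import lcm  # Python 3.9+

def perm_order(perm):
    """Order of a permutation (a dict on a finite set): the lcm of the
    lengths of its disjoint cycles, by Proposition 4.6.7."""
    seen, result = set(), 1
    for start in perm:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        result = lcm(result, length)
    return result

# sigma = (1 5 4)(2 6)(3) has order lcm(3, 2, 1) = 6.
sigma = {1: 5, 2: 6, 3: 3, 4: 1, 5: 4, 6: 2}
print(perm_order(sigma))  # 6
```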

Suppose now that we have an element a of a group G and we know its order. How do we compute the
orders of the powers of a? For an example, consider 3 ∈ Z/30Z. It is straightforward to check that |3| = 10.
Let's look at the orders of some powers of 3. Now 3^2 = 6, and a simple check shows that |6| = 5, so |3^2| = 5.
Now consider 3^3 = 9. We have 9^2 = 18, then 9^3 = 27, and then 9^4 = 36 = 6. Since 30 is not a multiple of 9,
we see that we “cycle around”, and it is not quite as clear when we will hit 0 as we continue. However, if we
keep at it, we will find that |3^3| = 10. If we keep calculating away, we will find that |3^4| = 5 and |3^5| = 2.
We would like to have a better way to determine these values without resorting to tedious calculations, and
that is what the next proposition supplies.

Proposition 4.6.8. Let G be a group and let a ∈ G.

1. Suppose that |a| = m ∈ N+ . For any n ∈ Z, we have

|a^n| = m / gcd(m, n).

2. Suppose that |a| = ∞. We have |a^n| = ∞ for all n ∈ Z \ {0}.

Proof. We first prove 1. Fix n ∈ Z and let d = gcd(m, n). Since d is a common divisor of m and n, we may
fix s, t ∈ Z with m = ds and n = dt. Notice that s > 0 because both m > 0 and d > 0. With this notation,
we need to show that |a^n| = s.
Notice that

(a^n)^s = a^(ns) = a^(dts) = a^(mt) = (a^m)^t = e^t = e

Thus, s is a positive exponent k with (a^n)^k = e, and so |a^n| ≤ s.
Suppose now that k ∈ N+ with (a^n)^k = e. We need to show that s ≤ k. We have a^(nk) = e, so by
Proposition 4.6.5, we know that m | nk. Fix ℓ ∈ Z with mℓ = nk. We then have that dsℓ = dtk, so canceling
d > 0 we conclude that sℓ = tk, and hence s | tk. Now by the homework problem about least common
multiples, we know that gcd(s, t) = 1. Using Proposition 2.4.10, we conclude that s | k. Since s, k > 0, it
follows that s ≤ k.
Therefore, s is the least positive value of k such that (a^n)^k = e, and so we conclude that |a^n| = s.
We now prove 2. Suppose that n ∈ Z \ {0}. Let k ∈ N+ . We then have (a^n)^k = a^(nk). Now nk ≠ 0 because
n ≠ 0 and k > 0, hence (a^n)^k = a^(nk) ≠ e by part 2 of the previous proposition. Therefore, (a^n)^k ≠ e for
all k ∈ N+ , and hence |a^n| = ∞.

Corollary 4.6.9. Let n ∈ N+ and let k ∈ Z. In the group Z/nZ, we have

|k| = n / gcd(k, n)

Proof. Working in the group Z/nZ, we know that |1| = n. Now k = 1^k (in this additive group, 1^k is the
k-fold sum of 1), so the result follows from the previous proposition.

Proposition 4.6.10. If G has an element of order n ∈ N+ , then G has an element of order d for every
positive d | n.

Proof. Suppose that G has an element of order n, and fix a ∈ G with |a| = n. Let d ∈ N+ with d | n. Fix
k ∈ N+ with kd = n. Using Proposition 4.6.8 and the fact that gcd(n, k) = k (because k | n), we have

|a^k| = n / gcd(n, k) = n / k = d

Thus a^k is an element of G with order d.
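Corollary 4.6.9 and the Z/30Z example above can be verified by brute force; a sketch (the helper order_mod is ours):

```python
from math import gcd

def order_mod(a, n):
    """Order of a in the additive group Z/nZ, computed by brute force:
    keep adding a until we reach the identity 0."""
    k, x = 1, a % n
    while x != 0:
        x = (x + a) % n
        k += 1
    return k

n = 30
# Corollary 4.6.9: the order of m in Z/nZ is n / gcd(m, n).
for m in range(n):
    assert order_mod(m, n) == n // gcd(m, n)

# The powers 3^1, ..., 3^5 of the element 3 (which has order 10):
print([order_mod(3 * k % n, n) for k in range(1, 6)])  # [10, 5, 10, 5, 2]
```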

4.7 Direct Products


We give a construction for putting groups together.

Definition 4.7.1. Suppose that (Gi , ?i ) for 1 ≤ i ≤ n are all groups. Consider the Cartesian product of the
sets G1 , G2 , . . . , Gn , i.e.

G1 × G2 × · · · × Gn = {(a1 , a2 , . . . , an ) : ai ∈ Gi for 1 ≤ i ≤ n}

Define an operation · on G1 × G2 × · · · × Gn by letting

(a1 , a2 , . . . , an ) · (b1 , b2 , . . . , bn ) = (a1 ?1 b1 , a2 ?2 b2 , . . . , an ?n bn )

Then (G1 × G2 × · · · × Gn , ·) is a group which is called the (external) direct product of G1 , G2 , . . . , Gn .

We first verify the claim that (G1 × G2 × · · · × Gn , ·) is a group.

Proposition 4.7.2. Suppose that (Gi , ?i ) for 1 ≤ i ≤ n are all groups. The direct product defined above is
a group with the following properties:

• The identity is (e1 , e2 , . . . , en ) where ei is the unique identity of Gi .

• Given ai ∈ Gi for all i, we have

(a1 , a2 , . . . , an )⁻¹ = (a1⁻¹, a2⁻¹, . . . , an⁻¹),

where ai⁻¹ is the inverse of ai in the group Gi .
• |G1 × G2 × · · · × Gn | = |G1 | · |G2 | · · · |Gn |, i.e. the order of the direct product of the Gi is the product
of the orders of the Gi .
Proof. We first check that · is associative. Suppose that ai , bi , ci ∈ Gi for 1 ≤ i ≤ n. We have

((a1 , a2 , . . . , an ) · (b1 , b2 , . . . , bn )) · (c1 , c2 , . . . , cn ) = (a1 ?1 b1 , a2 ?2 b2 , . . . , an ?n bn ) · (c1 , c2 , . . . , cn )


= ((a1 ?1 b1 ) ?1 c1 , (a2 ?2 b2 ) ?2 c2 , . . . , (an ?n bn ) ?n cn )
= (a1 ?1 (b1 ?1 c1 ), a2 ?2 (b2 ?2 c2 ), . . . , an ?n (bn ?n cn ))
= (a1 , a2 , . . . , an ) · (b1 ?1 c1 , b2 ?2 c2 , . . . , bn ?n cn )
= (a1 , a2 , . . . , an ) · ((b1 , b2 , . . . , bn ) · (c1 , c2 , . . . , cn )).

Let ei be the identity of Gi . We now check that (e1 , e2 , . . . , en ) is an identity of the direct product. Let
ai ∈ Gi for all i. We have

(a1 , a2 , . . . , an ) · (e1 , e2 , . . . , en ) = (a1 ?1 e1 , a2 ?2 e2 , . . . , an ?n en )


= (a1 , a2 , . . . , an )

and

(e1 , e2 , . . . , en ) · (a1 , a2 , . . . , an ) = (e1 ?1 a1 , e2 ?2 a2 , . . . , en ?n an )


= (a1 , a2 , . . . , an )

hence (e1 , e2 , . . . , en ) is an identity for the direct product.


We finally check the claim about inverses. Let ai ∈ Gi for 1 ≤ i ≤ n. For each i, let ai⁻¹ be the
inverse of ai in Gi . We then have

(a1 , a2 , . . . , an ) · (a1⁻¹, a2⁻¹, . . . , an⁻¹) = (a1 ?1 a1⁻¹, a2 ?2 a2⁻¹, . . . , an ?n an⁻¹)
= (e1 , e2 , . . . , en )

and

(a1⁻¹, a2⁻¹, . . . , an⁻¹) · (a1 , a2 , . . . , an ) = (a1⁻¹ ?1 a1 , a2⁻¹ ?2 a2 , . . . , an⁻¹ ?n an )
= (e1 , e2 , . . . , en )

hence (a1⁻¹, a2⁻¹, . . . , an⁻¹) is an inverse of (a1 , a2 , . . . , an ).
Finally, since there are |G1 | elements to put in the first coordinate of the n-tuple, |G2 | elements to put
in the second coordinate, etc., it follows that

|G1 × G2 × · · · × Gn | = |G1 | · |G2 | · · · |Gn |.

For example, consider the group G = S3 × Z (where we are considering Z as a group under addition).
Elements of G are ordered pairs (σ, n) where σ ∈ S3 and n ∈ Z. For example, ((1 2), 8), (id, −6), and
((1 3 2), 42) are all elements of G. The group operation on G is obtained by working in each coordinate
separately and performing the corresponding group operation there. For example, we have

((1 2), 8) · ((1 3 2), 42) = ((1 2) ◦ (1 3 2), 8 + 42) = ((1 3), 50).

Notice that the direct product puts two groups together in a manner that makes them completely ignore
each other. Each coordinate goes about doing its business without interacting with the others at all.
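As a small illustration of this coordinatewise behavior (the encoding of groups as Python operation functions is our own), here is the operation on a product of two groups, using addition mod 2 in each coordinate:

```python
def direct_product_op(op1, op2):
    """Coordinatewise operation on the direct product of two groups:
    (a1, a2) . (b1, b2) = (a1 *1 b1, a2 *2 b2)."""
    return lambda x, y: (op1(x[0], y[0]), op2(x[1], y[1]))

# Example: Z/2Z x Z/2Z, with addition mod 2 in each coordinate.
add_mod2 = lambda a, b: (a + b) % 2
op = direct_product_op(add_mod2, add_mod2)
print(op((1, 0), (1, 1)))  # (0, 1) -- each coordinate ignores the other
```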

Proposition 4.7.3. The group G1 × G2 × · · · × Gn is abelian if and only if each Gi is abelian.


Proof. For each i, let ?i be the group operation of Gi .
Suppose first that each Gi is abelian. Suppose that ai , bi ∈ Gi for 1 ≤ i ≤ n. We then have

(a1 , a2 , . . . , an ) · (b1 , b2 , . . . , bn ) = (a1 ?1 b1 , a2 ?2 b2 , . . . , an ?n bn )


= (b1 ?1 a1 , b2 ?2 a2 , . . . , bn ?n an )
= (b1 , b2 , . . . , bn ) · (a1 , a2 , . . . , an ).

Therefore, G1 × G2 × · · · × Gn is abelian.
Suppose conversely that G1 × G2 × · · · × Gn is abelian. Fix i with 1 ≤ i ≤ n, and suppose that ai , bi ∈ Gi .
Consider the elements (e1 , . . . , ei−1 , ai , ei+1 , . . . , en ) and (e1 , . . . , ei−1 , bi , ei+1 , . . . , en ) in G1 × G2 × · · · × Gn .
Using the fact that the direct product is abelian, we see that

(e1 , . . . ,ei−1 , ai ?i bi , ei+1 , . . . , en )


= (e1 ?1 e1 , . . . , ei−1 ?i−1 ei−1 , ai ?i bi , ei+1 ?i+1 ei+1 , . . . , en ?n en )
= (e1 , . . . , ei−1 , ai , ei+1 , . . . , en ) · (e1 , . . . , ei−1 , bi , ei+1 , . . . , en )
= (e1 , . . . , ei−1 , bi , ei+1 , . . . , en ) · (e1 , . . . , ei−1 , ai , ei+1 , . . . , en )
= (e1 ?1 e1 , . . . , ei−1 ?i−1 ei−1 , bi ?i ai , ei+1 ?i+1 ei+1 , . . . , en ?n en )
= (e1 , . . . , ei−1 , bi ?i ai , ei+1 , . . . , en ).

Comparing the ith coordinates of the first and last tuple, we conclude that ai ?i bi = bi ?i ai . Therefore, Gi
is abelian.
We can build all sorts of groups with this construction. For example Z/2Z × Z/2Z is an abelian group
of order 4 with elements
Z/2Z × Z/2Z = {(0, 0), (0, 1), (1, 0), (1, 1)}
In this group, the element (0, 0) is the identity and all other elements have order 2. We can also use this
construction to build nonabelian groups of various orders. For example S3 × Z/2Z is a nonabelian group of
order 12.
Proposition 4.7.4. Let G1 , G2 , . . . , Gn be groups. Let ai ∈ Gi for 1 ≤ i ≤ n. The order of the element
(a1 , a2 , . . . , an ) ∈ G1 × G2 × · · · × Gn is the least common multiple of the orders of the ai ’s in Gi .
Proof. For each i, let mi = |ai |, so mi is the order of ai in the group Gi . Now since the group operation in
the direct product works in each coordinate separately, a simple induction shows that

(a1 , a2 , . . . , an )^k = (a1^k , a2^k , . . . , an^k )

for all k ∈ N+ . Now if mi | k for each i, then ai^k = ei for each i, so

(a1 , a2 , . . . , an )^k = (a1^k , a2^k , . . . , an^k ) = (e1 , e2 , . . . , en )

Conversely, suppose that k ∈ N+ is such that there exists i with mi ∤ k. For such an i, we then have that
ai^k ≠ ei by Proposition 4.6.5, so

(a1 , a2 , . . . , an )^k = (a1^k , a2^k , . . . , an^k ) ≠ (e1 , e2 , . . . , en )

It follows that (a1 , a2 , . . . , an )^k = (e1 , e2 , . . . , en ) if and only if mi | k for all i. Since |(a1 , a2 , . . . , an )| is the
least k ∈ N+ with (a1 , a2 , . . . , an )^k = (e1 , e2 , . . . , en ), it follows that |(a1 , a2 , . . . , an )| is the least k ∈ N+
satisfying mi | k for all i, which is to say that |(a1 , a2 , . . . , an )| is the least common multiple of the mi .

For example, suppose that we are working in the group S4 × Z/42Z and we consider the element
((1 4 2 3), 7). Since |(1 4 2 3)| = 4 in S4 and |7| = 6 in Z/42Z, it follows that the order of ((1 4 2 3), 7) in
S4 × Z/42Z equals lcm(4, 6) = 12.
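The lcm behavior is easy to confirm by brute force; a sketch in Z/4Z × Z/6Z, chosen to mirror the coordinate orders 4 and 6 in the example above:

```python
from math import lcm  # Python 3.9+

# The element (1, 1) of Z/4Z x Z/6Z has coordinate orders 4 and 6,
# so Proposition 4.7.4 predicts order lcm(4, 6) = 12.  Add (1, 1) to
# itself until we reach the identity (0, 0), counting the steps.
x, k = (1, 1), 1
while x != (0, 0):
    x = ((x[0] + 1) % 4, (x[1] + 1) % 6)
    k += 1
print(k, lcm(4, 6))  # 12 12
```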
For another example, the group Z/2Z × Z/2Z × Z/2Z is an abelian group of order 8 in which every
nonidentity element has order 2. Generalizing this construction by taking n copies of Z/2Z, we see how to
construct an abelian group of order 2^n in which every nonidentity element has order 2.
Chapter 5

Subgroups and Cosets

5.1 Subgroups
If we have a group G, we can consider subsets of G which also happen to be a group under the same operation.
If we take a subset H of a group G and restrict the group operation to H, the operation is trivially
associative on H because it is associative on G. The only issues are whether the operation remains a
binary operation on H (it could conceivably combine two elements of H and return an element not in H),
whether the identity is in H, and whether the inverse of every element of H is also in H. This gives the
following definition.

Definition 5.1.1. Let G be a group and let H ⊆ G. We say that H is a subgroup of G if

1. e ∈ H (H contains the identity of G).

2. ab ∈ H whenever a ∈ H and b ∈ H (H is closed under the group operation).

3. a⁻¹ ∈ H whenever a ∈ H (H is closed under inverses).

Example 5.1.2. Here are some examples of subgroups.

• For any group G, we always have two trivial examples of subgroups. Namely G is always a subgroup
of itself, and {e} is always a subgroup of G.

• Z is a subgroup of (Q, +) and (R, +). This follows because 0 ∈ Z, the sum of two integers is an integer,
and the additive inverse of an integer is an integer.

• The set 2Z = {2n : n ∈ Z} is a subgroup of (Z, +). This follows from the fact that 0 = 2 · 0 ∈ 2Z,
that 2m + 2n = 2(m + n) so the sum of two evens is even, and that −(2n) = 2(−n) so the additive
inverse of an even number is even.

• The set H = {0, 3} is a subgroup of Z/6Z. To check that H is a subgroup, we need to check the three
conditions. We have 0 ∈ H, so H contains the identity. Also 0⁻¹ = 0 ∈ H and 3⁻¹ = 3 ∈ H, so the
inverse of each element of H lies in H. Finally, to check that H is closed under the group operation,
we simply have to check the four possibilities. For example, we have 3 + 3 = 6 = 0 ∈ H. The other 3
possible sums are even easier.
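The three conditions in Definition 5.1.1 can be checked mechanically for a finite subset of Z/nZ; a sketch (the function name is ours):

```python
def is_subgroup(H, n):
    """Check the three subgroup conditions for a subset H of Z/nZ:
    contains the identity, closed under +, closed under inverses."""
    H = set(h % n for h in H)
    has_identity = 0 in H
    closed_op = all((a + b) % n in H for a in H for b in H)
    closed_inv = all((-a) % n in H for a in H)
    return has_identity and closed_op and closed_inv

print(is_subgroup({0, 3}, 6))     # True
print(is_subgroup({1, 3, 5}, 6))  # False: 0 is missing
print(is_subgroup({0, 1, 3, 5}, 6))  # False: 1 + 3 = 4 is not in H
```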

Example 5.1.3. Let G = (Z, +). Here are some examples of subsets of Z which are not subgroups of G.

• The set H = {2n + 1 : n ∈ Z} is not a subgroup of G because 0 ∉ H.


• The set H = {0} ∪ {2n + 1 : n ∈ Z} is not a subgroup of G because even though it contains 0 and is
closed under inverses, it is not closed under the group operation. For example, 1 ∈ H and 3 ∈ H, but
1 + 3 ∉ H.

• The set N is not a subgroup of G because even though it contains 0 and is closed under the group
operation, it is not closed under inverses. For example, 1 ∈ N but −1 ∉ N.

Now it is possible that a subset of a group G forms a group under a completely different binary operation
than the one used in G, but whenever we talk about a subgroup H of G we only think of restricting the
group operation of G down to H. For example, let G = (Q, +). The set H = Q\{0} is not a subgroup of
G (since it does not contain the identity) even though H can be made into a group with the completely
different operation of multiplication. When we consider a subset H of a group G, we only call it a subgroup
of G if it is a group with respect to the exact same binary operation.

Proposition 5.1.4. Let n ∈ N+ . The set H = {A ∈ GLn (R) : det(A) = 1} is a subgroup of GLn (R).

Proof. Letting In be the n × n identity matrix (which is the identity of GLn (R)), we have that det(In ) = 1
so In ∈ H. Suppose that M, N ∈ H so det(M ) = 1 = det(N ). We then have

det(M N ) = det(M ) · det(N ) = 1 · 1 = 1

so M N ∈ H. Suppose finally that M ∈ H. We have M M −1 = In , so det(M M −1 ) = det(In ) = 1, hence


det(M ) · det(M −1 ) = 1. Since M ∈ H we have det(M ) = 1, hence det(M −1 ) = 1. It follows that M −1 ∈ H.
We have checked the three properties of a subgroup, so H is a subgroup of G.

Definition 5.1.5. Let n ∈ N+ . We let SLn (R) be the above subgroup of GLn (R). That is,

SLn (R) = {A ∈ GLn (R) : det(A) = 1}

The group SLn (R) is called the special linear group of degree n.

The following proposition occasionally makes the process of checking whether a subset of a group is
indeed a subgroup a bit easier.

Proposition 5.1.6. Let G be a group and let H ⊆ G. The following are equivalent:

• H is a subgroup of G

• H ≠ ∅ and ab⁻¹ ∈ H whenever a, b ∈ H.

Proof. Suppose first that H is a subgroup of G. By definition, we must have e ∈ H, so H ≠ ∅. Suppose that
a, b ∈ H. Since H is a subgroup and b ∈ H, we must have b⁻¹ ∈ H. Now using the fact that a ∈ H and
b⁻¹ ∈ H, together with the second part of the definition of a subgroup, it follows that ab⁻¹ ∈ H.
Now suppose conversely that H ≠ ∅ and ab⁻¹ ∈ H whenever a, b ∈ H. We need to check the three
defining characteristics of a subgroup. Since H ≠ ∅, we may fix c ∈ H. Using our condition and the fact that
c ∈ H, it follows that e = cc⁻¹ ∈ H, so we have checked the first property. Now using the fact that e ∈ H,
given any a ∈ H we have a⁻¹ = ea⁻¹ ∈ H by our condition, so we have checked the third property. Suppose
now that a, b ∈ H. From what we just showed, we know that b⁻¹ ∈ H. Therefore, using our condition,
we conclude that ab = a(b⁻¹)⁻¹ ∈ H, so we have verified the second property. We have shown that all 3
properties hold for H, so H is a subgroup of G.

We end with an important fact.

Proposition 5.1.7. Let G be a group. If H and K are both subgroups of G, then H ∩ K is a subgroup of G.

Proof. We check the three properties for the subset H ∩ K:


• First notice that since both H and K are subgroups of G, we have that e ∈ H and e ∈ K. Therefore,
e ∈ H ∩ K.
• Suppose that a, b ∈ H ∩ K. We then have that both a ∈ H and b ∈ H, so ab ∈ H because H is a
subgroup of G. We also have that both a ∈ K and b ∈ K, so ab ∈ K because K is a subgroup of G.
Therefore, ab ∈ H ∩ K. Since a, b ∈ H ∩ K were arbitrary, it follows that H ∩ K is closed under the
group operation.
• Suppose that a ∈ H ∩ K. We then have that a ∈ H, so a−1 ∈ H because H is a subgroup of G. We
also have that a ∈ K, so a−1 ∈ K because K is a subgroup of G. Therefore, a−1 ∈ H ∩ K. Since
a ∈ H ∩ K was arbitrary, it follows that H ∩ K is closed under inverses.
Therefore, H ∩ K is a subgroup of G.

5.2 Generating Subgroups


Let G be a group and let c ∈ G. Suppose that we want to form the smallest subgroup of G containing the
element c. If H is any such subgroup, then we certainly need to have c2 = c · c ∈ H. Since we must have
c2 ∈ H, it then follows that c3 = c2 · c ∈ H. In this way, we see that cn ∈ H for any n ∈ N+ . Now we also
need to have e ∈ H, and since e = c0 , we conclude that cn ∈ H for all n ∈ N. Furthermore, H needs to be
closed under inverses, so c−1 ∈ H and in fact c−n = (cn )−1 ∈ H for all n ∈ N+ . Putting it all together, we
see that cn ∈ H for all n ∈ Z.
Definition 5.2.1. Let G be a group and let c ∈ G. We define

hci = {cn : n ∈ Z}

The set hci is called the subgroup of G generated by c.


The next proposition explains our choice of definition for hci. Namely, the set hci is the smallest subgroup
of G containing c as an element.
Proposition 5.2.2. Let G be a group and let c ∈ G. Let H = hci = {cn : n ∈ Z}.
1. H is a subgroup of G with c ∈ H.
2. If K is a subgroup of G with c ∈ K, then H ⊆ K.
Proof. We first prove 1. We check the three properties.
• Notice that e = c0 ∈ H.
• Suppose that a, b ∈ H. Fix m, n ∈ Z with a = cm and b = cn . We then have

ab = cm cn = cm+n

so ab ∈ H because m + n ∈ Z.
• Suppose that a ∈ H. Fix m ∈ Z with a = cm . We then have

a−1 = (cm )−1 = c−m

so a−1 ∈ H because −m ∈ Z.

Therefore, H is a subgroup of G.
We now prove 2. Suppose that K is a subgroup of G with c ∈ K. We first prove by induction on n ∈ N+
that cn ∈ K. We clearly have c1 = c ∈ K by assumption. Suppose that n ∈ N+ and we know that cn ∈ K.
Since cn ∈ K and c ∈ K, and K is a subgroup of G, it follows that cn+1 = cn c ∈ K. Therefore, by induction,
we know that cn ∈ K for all n ∈ N+ . Now c0 = e ∈ K because K is a subgroup of G, so cn ∈ K for all
n ∈ N. Finally, if n ∈ Z with n < 0, then c−n ∈ K because −n ∈ N+ and hence cn = (c−n )−1 ∈ K because
inverses of elements of K must be in K. Therefore, cn ∈ K for all n ∈ Z, which is to say that H ⊆ K.
For example, suppose that we are working with the group G = Z under addition. Since the group
operation is addition, given c, n ∈ Z, we have that cn (under the general group theory definition) equals nc
(under the usual definition of multiplication). Therefore,

hci = {nc : n ∈ Z}

so hci is the set of all multiples of c.


For another example, let G = Z/5Z and consider H = h3i. We certainly must have 0 = 3^0 ∈ H and also
3 ∈ H. Notice then that 3 + 3 = 6 = 1 ∈ H. From here, we conclude that 1 + 3 = 4 ∈ H (this is 3^3 in group
theory notation), and from this we conclude that 4 + 3 = 7 = 2 ∈ H. Putting it all together, it follows that
{0, 3, 1, 4, 2} ⊆ H and hence H = Z/5Z. Notice that we were able to stop once we computed 3^n for n with
0 ≤ n ≤ 4 because we’ve exhausted all of H. Now if we add 3 again we end up with 2 + 3 = 0, so we’ve
come back around to the identity. If we continue, we see that we will cycle through these values repeatedly.
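This computation is easy to replicate mechanically. The sketch below (the helper name `cyclic_subgroup` is ours, not from the text) lists the "powers" of 3 in the additive group Z/5Z in the order in which they appear:

```python
def cyclic_subgroup(c, n):
    """List the powers of c in the additive group Z/nZ, i.e. 0, c, c+c, ...,
    stopping once we return to the identity 0."""
    elems = [0]
    x = c % n
    while x != 0:
        elems.append(x)
        x = (x + c) % n
    return elems

# The powers of 3 in Z/5Z appear in the order 0, 3, 1, 4, 2, exhausting the group.
print(cyclic_subgroup(3, 5))
```

The same helper shows that a non-generator produces a proper subgroup, e.g. `cyclic_subgroup(2, 6)` gives only the even residues.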
This next proposition explains this “cycling” phenomenon more fully, and also finally justifies our over-
loading of the word order. Given a group G and an element c ∈ G, it says that |c| (the order of c as an
element of G) equals |hci| (the number of elements in the subgroup of G generated by c).
Proposition 5.2.3. Suppose that G is a group and that c ∈ G. Let H = hci.
1. Suppose that |c| = m. We then have that H = {ci : 0 ≤ i < m} = {e, c, c2 , . . . , cm−1 } and ck 6= c`
whenever 0 ≤ k < ` < m. Thus, |H| = m.
2. Suppose that |c| = ∞. We then have that ck 6= c` whenever k, ` ∈ Z with k < `, so |H| = ∞.
In particular, we have |c| = |hci|.
Proof. We first prove 1. By definition we have H = {cn : n ∈ Z}, so we trivially have that

{ci : 0 ≤ i < m} ⊆ H.

Now let n ∈ Z. Write n = qm + r where 0 ≤ r < m. We then have

cn = cqm+r = (cm )q cr = eq cr = cr

so cn = cr ∈ {ci : 0 ≤ i < m}. Therefore, H ⊆ {ci : 0 ≤ i < m} and combining this with the reverse
inclusion above we conclude that H = {ci : 0 ≤ i < m}.
Suppose now that 0 ≤ k < ` < m. Assume for the sake of obtaining a contradiction that ck = c` .
Multiplying both sides by c−k on the right, we see that ck c−k = c` c−k , hence

e = c0 = ck−k = ck c−k = c` c−k = c`−k

Now we have 0 ≤ k < ` < m, so 0 < ` − k < m. This contradicts the assumption that m = |c| is the least
positive power of c giving the identity. Hence, we must have ck 6= c` .
We now prove 2. Suppose that k, ` ∈ Z with k < `. Assume that ck = c` . As in part 1 we can multiply
both sides on the right by c−k to conclude that c`−k = e. Now ` − k > 0, so this contradicts the assumption
that |c| = ∞. Therefore, we must have ck 6= c` .

Corollary 5.2.4. If G is a finite group, then every element of G has finite order. Moreover, for each a ∈ G,
we have |a| ≤ |G|.
Proof. Let a ∈ G. We then have that hai ⊆ G, so |hai| ≤ |G|. The result follows because |a| = |hai|.
In fact, much more is true. We will see as a consequence of Lagrange’s Theorem that the order of every
element of a finite group G is actually a divisor of |G|.
Definition 5.2.5. A group G is cyclic if there exists c ∈ G such that G = hci. An element c ∈ G with
G = hci is called a generator of G.
For example, for each n ∈ N+ , the group Z/nZ is cyclic because 1 is a generator (since 1^k = k for all k).
Also, Z is cyclic because 1 is a generator (remember that hci is the set of all powers of c, both positive and
negative). In general, a cyclic group has many generators. For example, −1 is a generator of Z, and 3 is a
generator of Z/5Z as we saw above.
Proposition 5.2.6. Let G be a finite group with |G| = n. An element c ∈ G is a generator of G if and only
if |c| = n. In particular, G is cyclic if and only if it has an element of order n.
Proof. Suppose first that c is a generator of G so that G = hci. We know from above that |c| = |hci|, so
|c| = |G| = n. Suppose conversely that |c| = n. We then know that |hci| = n. Since hci ⊆ G and each has n
elements, we must have that G = hci.
For an example of a noncyclic group, consider S3 . The order of each element of S3 is either 1, 2, or 3, so
S3 has no element of order 6. Thus, S3 has no generators, and so S3 is not cyclic. We could also conclude
that S3 is not cyclic by using the following result.
Proposition 5.2.7. All cyclic groups are abelian.
Proof. Suppose that G is a cyclic group and fix c ∈ G with G = hci. Let a, b ∈ G. Since G = hci, we may
fix m, n ∈ Z with a = cm and b = cn . We then have

ab = cm cn
= cm+n
= cn+m
= cn cm
= ba.

Therefore, ab = ba for all a, b ∈ G, so G is abelian.


The converse of the preceding proposition is false. In other words, there exist abelian groups which are
not cyclic. For example, consider the group U (Z/8Z). We have U (Z/8Z) = {1, 3, 5, 7}, so |U (Z/8Z)| = 4,
but every nonidentity element has order 2 (as can be seen by examining the Cayley table at the end of
Section 4.4). Alternatively, the abelian group Z/2Z × Z/2Z has order 4 but every nonidentity element has
order 2. For an infinite example, the infinite abelian group (Q, +) is also not cyclic: we clearly have h0i ≠ Q,
and if q ≠ 0, then q/2 ∉ {nq : n ∈ Z} = hqi.
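The claim about U (Z/8Z) can be checked by brute force. A minimal sketch (the function name `mult_order` is ours):

```python
from math import gcd

def mult_order(a, n):
    """Least k >= 1 with a^k ≡ 1 (mod n), for a a unit modulo n."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

units = [a for a in range(1, 8) if gcd(a, 8) == 1]
orders = {a: mult_order(a, 8) for a in units}
# Every nonidentity unit has order 2, so no element has order 4,
# and hence U(Z/8Z) is abelian but not cyclic.
```

Here `orders` comes out to {1: 1, 3: 2, 5: 2, 7: 2}, matching the Cayley-table observation in the text.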
Suppose that G is a group. We have seen how to take an element c ∈ G and form the smallest subgroup
of G containing c by simply taking the set {cn : n ∈ Z}. Suppose now that we have many elements of G and
we want to form the smallest subgroup of G which contains them. For definiteness now, suppose that we
have c, d ∈ G. How would we construct the smallest possible subgroup of G containing both c and d, which
we denote by hc, di? A natural guess would be

{cm dn : m, n ∈ Z}

but unless G is abelian there is no reason to think that this set is closed under multiplication. For example,
we must have cdc ∈ hc, di, but it doesn’t obviously appear there. Whatever hc, di is, it must contain the
following elements:
cdcdcdc c−1 dc−1 d−1 c c3 d6 c−2 d7
If we take 3 elements, it gets even more complicated because we can alternate the 3 elements in such
sequences without any repetitive pattern. If we have infinitely many elements, it gets even worse. Since
the constructions get messy, we define the subgroup generated by an arbitrary set in a much less explicit
manner. The key idea comes from Proposition 5.2.2, which says that hci is the “smallest” subgroup of G
containing c as an element. Here, “smallest” does not mean the least number of elements, but instead means
the subgroup that is a subset of all other subgroups containing the elements in question.
Proposition 5.2.8. Let G be a group and let A ⊆ G. There exists a subgroup H of G with the following
properties:
• A ⊆ H.
• Whenever K is a subgroup of G with the property that A ⊆ K, we have H ⊆ K.
Furthermore, the subgroup H is unique (i.e. if both H1 and H2 have the above properties, then H1 = H2 ).
Proof. The idea is to intersect all of the subgroups of G that contain A, and argue that the result is a
subgroup. Since there might be infinitely many such subgroups, we cannot simply appeal to Proposition
5.1.7. However, our argument is very similar.
We first prove existence. Notice that there is at least one subgroup of G containing A, namely G itself.
Define
H = {a ∈ G : a ∈ K for all subgroups K of G such that A ⊆ K}
Notice we certainly have A ⊆ H by definition. Moreover, if K is a subgroup of G with the property that
A ⊆ K, then we have H ⊆ K by definition of H. We now show that H is indeed a subgroup of G.
• Since e ∈ K for every subgroup K of G with A ⊆ K, we have e ∈ H.
• Let a, b ∈ H. For any subgroup K of G such that A ⊆ K, we must have both a, b ∈ K by definition of
H, hence ab ∈ K because K is a subgroup. Since this is true for all such K, we conclude that ab ∈ H
by definition of H.
• Let a ∈ H. For any subgroup K of G such that A ⊆ K, we must have both a ∈ K by definition of H,
hence a−1 ∈ K because K is a subgroup. Since this is true for all such K, we conclude that a−1 ∈ H
by definition of H.
Combining these three, we conclude that H is a subgroup of G. This finishes the proof of existence.
Finally, suppose that H1 and H2 both have the above properties. Since H2 is a subgroup of G with
A ⊆ H2 , we know that H1 ⊆ H2 . Similarly, since H1 is a subgroup of G with A ⊆ H1 , we know that
H2 ⊆ H1 . Therefore, H1 = H2 .
Definition 5.2.9. Let G be a group and let A ⊆ G. We define hAi to be the unique subgroup H of G given
by Proposition 5.2.8. If A = {a1 , a2 , . . . , an }, we write ha1 , a2 , . . . , an i rather than h{a1 , a2 , . . . , an }i.
For example, consider the group G = S3 , and let H = h(1 2), (1 2 3)i. We know that H is a subgroup of
G, so id ∈ H. Furthermore, we must have (1 3 2) = (1 2 3)(1 2 3) ∈ H. We also must have

(1 3) = (1 2 3)(1 2) ∈ H

and
(2 3) = (1 2)(1 2 3) ∈ H.

Therefore, H = S3 , i.e. h(1 2), (1 2 3)i = S3 .


On the other hand, working in S3 , we have h(1 2 3), (1 3 2)i = {id, (1 2 3), (1 3 2)}. This follows from
the fact that {id, (1 2 3), (1 3 2)} is a subgroup containing the given set (which one can check directly, or
by simply noticing that it equals h(1 2 3)i).
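Proposition 5.2.8 defines hAi non-constructively, but in a finite group one can actually compute it by closing A under the group operation (in a finite group, closure under multiplication automatically gives the inverses as well). A sketch for subgroups of S3 , with permutations stored as tuples of images; all helper names are ours:

```python
def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); permutations are tuples listing the images of 1..n."""
    return tuple(p[q[i] - 1] for i in range(len(p)))

def generated_subgroup(gens, n):
    """Close gens (plus the identity) under composition; for a finite group this is <gens>."""
    elems = {tuple(range(1, n + 1))} | set(gens)
    while True:
        new = {compose(a, b) for a in elems for b in elems} - elems
        if not new:
            return elems
        elems |= new

t = (2, 1, 3)  # the transposition (1 2)
c = (2, 3, 1)  # the 3-cycle (1 2 3)
assert len(generated_subgroup({t, c}, 3)) == 6          # <(1 2), (1 2 3)> = S3
assert len(generated_subgroup({c, (3, 1, 2)}, 3)) == 3  # <(1 2 3), (1 3 2)>
```

Both assertions match the two examples above: the first pair of generators gives all of S3 , while the two 3-cycles only generate h(1 2 3)i.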
Definition 5.2.10. Let A be a set. A transposition is an element of SA whose cycle notation equals (a b)
with a 6= b. In other words, a transposition is a bijection which flips two elements of A and leaves all other
elements of A fixed.
Proposition 5.2.11. Let n ∈ N+ . Every element of Sn can be written as a product of transpositions. Thus,
if T ⊆ Sn is the set of all transpositions, then hT i = Sn .
Proof. We’ve seen that every permutation can be written as a product of disjoint cycles, so it suffices to
show that every cycle can be written as a product of transpositions. If a1 , a2 , . . . , ak ∈ {1, 2, . . . , n} are
distinct, we have
(a1 a2 a3 · · · ak−1 ak ) = (a1 ak )(a1 ak−1 ) · · · (a1 a3 )(a1 a2 )
The result follows.
To illustrate the above construction in a special case, we have

(1 7 2 5)(3 9)(4 8 6) = (1 5)(1 2)(1 7)(3 9)(4 6)(4 8)
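The displayed identity can be verified by composing the permutations directly. A sketch working in S9 (helper names are ours; products are composed with the rightmost factor applied first, matching the text):

```python
def cycle(elts, n=9):
    """The permutation of {1, ..., n}, as a tuple of images, given by a single cycle."""
    images = list(range(1, n + 1))
    for a, b in zip(elts, elts[1:] + [elts[0]]):
        images[a - 1] = b
    return tuple(images)

def product(perms):
    """The group product, composed so that the rightmost factor is applied first."""
    n = len(perms[0])
    result = tuple(range(1, n + 1))
    for p in perms:
        result = tuple(result[p[i] - 1] for i in range(n))
    return result

lhs = product([cycle([1, 7, 2, 5]), cycle([3, 9]), cycle([4, 8, 6])])
rhs = product([cycle([1, 5]), cycle([1, 2]), cycle([1, 7]),
               cycle([3, 9]), cycle([4, 6]), cycle([4, 8])])
assert lhs == rhs
```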

5.3 The Alternating Groups


At the end of the previous section, we showed that every element of Sn can be written as a product of
transpositions, and in fact we showed explicitly how to construct this product from a description of an
element of Sn as a product of cycles. However, this decomposition is far from unique and in fact it is even
possible to change the number of transpositions. For example, following our description, we have

(1 2 3) = (1 3)(1 2)

However, we can also write


(1 2 3) = (1 3)(2 3)(1 2)(1 3)
Although we can change the number and order of transpositions, it is a somewhat surprising fact that we
cannot change the parity of the number of transpositions. That is, it is impossible to find a σ ∈ Sn which we
can write simultaneously as a product of an even number of transpositions and also as a product of an odd
number of transpositions. To build up to this important fact, we introduce a new and interesting concept.
Definition 5.3.1. Let σ ∈ Sn . An inversion of σ is an ordered pair (i, j) with i < j but σ(i) > σ(j). We
let Inv(σ) be the set of all inversions of σ.
For example, let σ, τ, π ∈ S6 be given in two-line notation, where the bottom row lists the images of
1, 2, 3, 4, 5, 6 in order:

σ:  1 2 3 4 5 6        τ:  1 2 3 4 5 6        π:  1 2 3 4 5 6
    3 1 2 5 4 6            3 1 5 2 4 6            3 4 2 5 1 6

In cycle notation, these are

σ = (1 3 2)(4 5)(6) τ = (1 3 5 4 2)(6) π = (1 3 2 4 5)(6)

Although it’s not immediately obvious, notice that τ and π are obtained by swapping just two elements in
the bottom row of σ. More formally, they are obtained by composing with a transposition first, and we have

τ = σ ◦ (3 4) and π = σ ◦ (2 5). We now examine the inversions in each of these permutations. Notice that
it is typically easier to determine these from the two-line representations rather than from cycle notation:

Inv(σ) = {(1, 2), (1, 3), (4, 5)}


Inv(τ ) = {(1, 2), (1, 4), (3, 4), (3, 5)}
Inv(π) = {(1, 3), (1, 5), (2, 3), (2, 5), (3, 5), (4, 5)}

From this example, it may seem puzzling to see how the inversions are related. However, there is something
quite interesting that is happening. Let’s examine the relationship between Inv(σ) and Inv(τ ). By swapping
the third and fourth positions in the second row, the inversion (1, 3) in σ became the inversion (1, 4) in τ ,
and the inversion (4, 5) in σ became the inversion (3, 5) in τ , so those match up. However, we added a new
inversion by this swap, because originally we had σ(3) < σ(4), but the swapping made τ (3) > τ (4).
This accounts for the one additional inversion in τ . If instead we had σ(3) > σ(4), then this swap would
have lost an inversion. However, in either case, this example illustrates that a swapping of two adjacent
numbers either increases or decreases the number of inversions by 1.
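Inversion sets are straightforward to compute from the bottom row of the two-line notation. A sketch (the function name `inversions` is ours) reproducing the three sets above:

```python
def inversions(bottom_row):
    """All pairs (i, j) with i < j whose images are out of order; positions are 1-based."""
    n = len(bottom_row)
    return [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n)
            if bottom_row[i] > bottom_row[j]]

sigma = [3, 1, 2, 5, 4, 6]
tau   = [3, 1, 5, 2, 4, 6]
pi    = [3, 4, 2, 5, 1, 6]
assert inversions(sigma) == [(1, 2), (1, 3), (4, 5)]
assert inversions(tau) == [(1, 2), (1, 4), (3, 4), (3, 5)]
assert inversions(pi) == [(1, 3), (1, 5), (2, 3), (2, 5), (3, 5), (4, 5)]
```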

Lemma 5.3.2. Suppose σ ∈ Sn . If µ is a transposition consisting of two adjacent numbers, say µ = (k k+1),
then |Inv(σ)| and |Inv(σ ◦ µ)| differ by 1.

Proof. Notice that if i, j ∉ {k, k + 1}, then

(i, j) ∈ Inv(σ) ⇐⇒ (i, j) ∈ Inv(σ ◦ µ)

because µ fixes both i and j. Now since µ(k) = k + 1, µ(k + 1) = k, and µ fixes all other numbers, given
any i with i < k, we have

(i, k) ∈ Inv(σ) ⇐⇒ (i, k + 1) ∈ Inv(σ ◦ µ)


(i, k + 1) ∈ Inv(σ) ⇐⇒ (i, k) ∈ Inv(σ ◦ µ)

Similarly, given any j with j > k + 1, we have

(k, j) ∈ Inv(σ) ⇐⇒ (k + 1, j) ∈ Inv(σ ◦ µ)


(k + 1, j) ∈ Inv(σ) ⇐⇒ (k, j) ∈ Inv(σ ◦ µ)

The final thing to notice is that

(k, k + 1) ∈ Inv(σ) ⇐⇒ (k, k + 1) ∉ Inv(σ ◦ µ)

because if σ(k) > σ(k+1) then (σ◦µ)(k) < (σ◦µ)(k+1), while if σ(k) < σ(k+1) then (σ◦µ)(k) > (σ◦µ)(k+1).
Since we have a bijection between Inv(σ)\{(k, k + 1)} and Inv(σ ◦ µ)\{(k, k + 1)}, while (k, k + 1) lies in exactly
one of the sets Inv(σ) and Inv(σ ◦ µ), it follows that |Inv(σ)| and |Inv(σ ◦ µ)| differ by 1.

A similar analysis is more difficult to perform on π because the swapping involved two non-adjacent
numbers. As a result, elements in the middle had slightly more complicated interactions, and the above
example shows that a swap of this type can sizably increase the number of inversions. Although it is possible
to handle it directly, the key idea is to realize we can perform this swap through a sequence of adjacent
swaps. This leads to the following result.

Corollary 5.3.3. Suppose σ ∈ Sn . If µ is a transposition, then |Inv(σ)| 6≡ |Inv(σ ◦ µ)| (mod 2), i.e. the
parity of the number of inversions of σ does not equal the parity of the number of inversions of σ ◦ µ.

Proof. Let µ be a transposition, and write µ = (k `) where k < `. The key fact is that we can write (k `) as
a composition of 2(` − k) − 1 many adjacent transpositions in succession. In other words, we have

(k `) = (k k + 1) ◦ (k + 1 k + 2) ◦ · · · ◦ (` − 2 ` − 1) ◦ (` − 1 `) ◦ (` − 2 ` − 1) · · · ◦ (k + 1 k + 2) ◦ (k k + 1)

Thus, the number of inversions of σ ◦ µ equals the number of inversions of

σ ◦ (k k + 1) ◦ (k + 1 k + 2) ◦ · · · ◦ (` − 2 ` − 1) ◦ (` − 1 `) ◦ (` − 2 ` − 1) · · · ◦ (k + 1 k + 2) ◦ (k k + 1)

Using associativity, we can handle each of these in succession, and use Lemma 5.3.2 to conclude that each
changes the number of inversions by 1 (either increasing or decreasing it). Since there are an odd number of
adjacent transpositions, the result follows.
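Corollary 5.3.3 can also be confirmed by exhaustive search in a small symmetric group. The sketch below (helper names ours) checks every σ ∈ S5 against every transposition µ:

```python
from itertools import permutations, combinations

def num_inversions(p):
    """Count pairs (i, j) with i < j but p[i] > p[j]."""
    return sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)) on tuples of images of 1..n."""
    return tuple(p[q[i] - 1] for i in range(len(p)))

n = 5
for p in permutations(range(1, n + 1)):
    for a, b in combinations(range(1, n + 1), 2):
        mu = list(range(1, n + 1))
        mu[a - 1], mu[b - 1] = b, a
        # the inversion counts of p and p ∘ µ must have opposite parity
        assert num_inversions(p) % 2 != num_inversions(compose(p, tuple(mu))) % 2
```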
Definition 5.3.4. Let n ∈ N+ . We define a function ε : Sn → {1, −1} by letting ε(σ) = (−1)|Inv(σ)| , i.e.

ε(σ) = 1 if σ has an even number of inversions, and ε(σ) = −1 if σ has an odd number of inversions.

The value ε(σ) is called the sign of σ.


Proposition 5.3.5. Let n ∈ N+ .
• ε(id) = 1.
• If σ ∈ Sn can be written as a product of an even number of transpositions (not necessarily disjoint),
then ε(σ) = 1.
• If σ ∈ Sn can be written as a product of an odd number of transpositions (not necessarily disjoint),
then ε(σ) = −1.
Proof. The first of these is immediate because |Inv(id)| = 0. Suppose that σ = µ1 µ2 · · · µm where the µi
are transpositions. In order to use the above results, we then write

σ = id ◦ µ1 ◦ µ2 ◦ · · · ◦ µm

Now |Inv(id)| = 0, so using Corollary 5.3.3 repeatedly, we conclude that |Inv(id ◦ µ1 )| is odd, and then
|Inv(id ◦ µ1 ◦ µ2 )| is even, etc. In general, a straightforward induction on k shows that

|Inv(id ◦ µ1 ◦ µ2 · · · ◦ µk )|

is odd if k is odd, and is even if k is even. Thus, if m is even, then ε(σ) = 1, and if m is odd, then
ε(σ) = −1.
Corollary 5.3.6. It is impossible for a permutation to be written as both a product of an even number of
transpositions and as a product of an odd number of transpositions.
Proof. If we could write a permutation in both ways, then both ε(σ) = 1 and ε(σ) = −1, a contradiction.
Definition 5.3.7. Let n ∈ N+ and let σ ∈ Sn . If ε(σ) = 1, then we say that σ is an even permutation. If
ε(σ) = −1, then we say that σ is an odd permutation.
Proposition 5.3.8. Let n ∈ N+ . For all σ, τ ∈ Sn , we have ε(στ ) = ε(σ) · ε(τ ).
Proof. If either σ = id or τ = id, this is immediate from the fact that ε(id) = 1. We now handle the various
cases:

• If σ and τ can both be written as a product of an even number of transpositions, then στ can also be
written as a product of an even number of transpositions, so we have ε(σ) = 1, ε(τ ) = 1, and ε(στ ) = 1
by Proposition 5.3.5.

• If σ and τ can both be written as a product of an odd number of transpositions, then στ can be written
as a product of an even number of transpositions, so we have ε(σ) = −1, ε(τ ) = −1, and ε(στ ) = 1 by
Proposition 5.3.5.

• If σ can be written as a product of an even number of transpositions, and τ can be written as a


product of an odd number of transpositions, then στ can be written as a product of an odd number of
transpositions, so we have ε(σ) = 1, ε(τ ) = −1, and ε(στ ) = −1 by Proposition 5.3.5.

• If σ can be written as a product of an odd number of transpositions, and τ can be written as a


product of an even number of transpositions, then στ can be written as a product of an odd number
of transpositions, so we have ε(σ) = −1, ε(τ ) = 1, and ε(στ ) = −1 by Proposition 5.3.5.

Therefore, in all cases, we have ε(στ ) = ε(σ) · ε(τ ).

Proposition 5.3.9. Let n ∈ N+ . Suppose that σ ∈ Sn is a k-cycle.

• If k is an even number, then σ is an odd permutation, i.e. ε(σ) = −1.

• If k is an odd number, then σ is an even permutation, i.e. ε(σ) = 1.

Proof. Write σ = (a1 a2 a3 · · · ak−1 ak ) where the ai are distinct. As above, we have

σ = (a1 ak )(a1 ak−1 ) · · · (a1 a3 )(a1 a2 )

Thus, σ is the product of k − 1 many transpositions. If k is an even number, then k − 1 is an odd number,
and hence σ is an odd permutation. If k is an odd number, then k − 1 is an even number, and hence σ is an
even permutation.

Using Proposition 5.3.8 and Proposition 5.3.9, we can now easily compute the sign of any permutation
once we’ve written it in cycle notation. For example, suppose that

σ = (1 9 2 6 4 11)(3 8 5)(7 10)

We then have

ε(σ) = ε((1 9 2 6 4 11)) · ε((3 8 5)) · ε((7 10))


= (−1) · 1 · (−1)
=1

so σ is an even permutation.
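This cycle-type computation is easy to automate using Propositions 5.3.8 and 5.3.9: each k-cycle contributes (−1)^(k−1). A sketch (the function name is ours):

```python
def sign_from_cycle_type(cycle_lengths):
    """ε(σ) for σ written as disjoint cycles of the given lengths
    (Propositions 5.3.8 and 5.3.9); fixed points contribute +1."""
    s = 1
    for k in cycle_lengths:
        s *= (-1) ** (k - 1)
    return s

# σ = (1 9 2 6 4 11)(3 8 5)(7 10): a 6-cycle, a 3-cycle, and a 2-cycle
assert sign_from_cycle_type([6, 3, 2]) == 1
```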

Proposition 5.3.10. Let n ∈ N+ and define

An = {σ ∈ Sn : ε(σ) = 1}

We then have that An is a subgroup of Sn .



Proof. We have ε(id) = (−1)0 = 1, so id ∈ An . Suppose that σ, τ ∈ An so that ε(σ) = 1 = ε(τ ). We then
have
ε(στ ) = ε(σ) · ε(τ ) = 1 · 1 = 1
so στ ∈ An . Finally, suppose that σ ∈ An . We then have σσ −1 = id so

1 = ε(id)
= ε(σσ −1 )
= ε(σ) · ε(σ −1 )
= 1 · ε(σ −1 )
= ε(σ −1 )

so σ −1 ∈ An .

Definition 5.3.11. The subgroup An = {σ ∈ Sn : ε(σ) = 1} of Sn is called the alternating group of degree
n.

Let’s take a look at a few small examples. We trivially have A1 = {id}, and we also have A2 = {id}
because (1 2) is an odd permutation. The group S3 has 6 elements: the identity, three 2-cycles, and two
3-cycles, so A3 = {id, (1 2 3), (1 3 2)}. When we examine S4 , we see that it contains the following:

• The identity.

• Six 4-cycles.

• Eight 3-cycles.

• Six 2-cycles.

• Three elements that are the product of two disjoint 2-cycles.

Now of these, A4 consists of the identity, the eight 3-cycles, and the three products of two disjoint 2-cycles,
so |A4 | = 12. In general, we have the following.
Proposition 5.3.12. For any n ≥ 2, we have |An | = n!/2.

Proof. Define a function f : An → Sn by letting f (σ) = σ(1 2). We first claim that f is injective. To see
this, suppose that σ, τ ∈ An and that f (σ) = f (τ ). We then have σ(1 2) = τ (1 2). Multiplying on the right
by (1 2), we conclude that σ = τ . Therefore, f is injective.
We next claim that range(f ) = Sn \An . Suppose first that σ ∈ An . We then have ε(σ) = 1, so

ε(f (σ)) = ε(σ(1 2)) = ε(σ) · ε((1 2)) = 1 · (−1) = −1

so f (σ) ∈ Sn \An . Conversely, suppose that τ ∈ Sn \An so that ε(τ ) = −1. We then have

ε(τ (1 2)) = ε(τ ) · ε((1 2)) = (−1) · (−1) = 1

so τ (1 2) ∈ An . Now
f (τ (1 2)) = τ (1 2)(1 2) = τ
so τ ∈ range(f ). It follows that range(f ) = Sn \An . Therefore, f maps An bijectively onto Sn \An and hence
|An | = |Sn \An |. Since Sn is the disjoint union of these two sets and |Sn | = n!, it follows that |An | = n!/2.

5.4 The Dihedral Groups


We now define another subgroup of Sn which has a very geometric flavor. By way of motivation, consider
a regular n-gon, i.e. a convex polygon with n edges in which all edges have the same length and all interior
angles are congruent. Now we want to consider the symmetries of this polygon obtained by rigid motions.
That is, we want to think about all results obtained by picking up the polygon, moving it around in space,
and then placing it back down so that it looks exactly the same. For example, one possibility is that we rotate
the polygon around the center by 360/n degrees. We can also flip the polygon over across an imaginary line
through the center so long as we take vertices to vertices.
To put these geometric ideas in more algebraic terms, first label the vertices clockwise with the numbers
1, 2, 3, . . . , n. Now a symmetry of the polygon will move vertices to vertices, so it corresponds to a bijection
of {1, 2, 3, . . . , n}, i.e. an element of Sn . For example, the permutation (1 2 3 · · · n) corresponds to rotation
by 360/n degrees because it sends vertex 1 to the position originally occupied by vertex 2, sends vertex 2 to
the position originally occupied by vertex 3, etc.
Definition 5.4.1. Fix n ≥ 3. We define the following elements of Sn .
• r = (1 2 3 · · · n)
• s = (2 n)(3 n − 1)(4 n − 2) · · · . More formally, we have the following:
– If n = 2k where k ∈ N+ , let s = (2 n)(3 n − 1)(4 n − 2) · · · (k k + 2)
– If n = 2k + 1 where k ∈ N+ , let s = (2 n)(3 n − 1)(4 n − 2) · · · (k + 1 k + 2)
Thus, r corresponds to clockwise rotation by 360/n degrees (as described above), and s corresponds to
flipping the polygon across the line through the vertex 1 and the center. Now given two symmetries of the
polygon, we can do one followed by the other to get another symmetry. In other words, the composition
of two symmetries is a symmetry. We can also reverse any rigid motion, so the inverse of a symmetry is a
symmetry. Finally, the identity function is a trivial symmetry, so the set of all symmetries of the regular
n-gon forms a subgroup of Sn .
Definition 5.4.2. Let n ≥ 3. With the notation of r and s as above, we define Dn = hr, si, i.e. Dn is the
smallest subgroup of Sn containing both r and s. The group Dn is called the dihedral group of degree n.
In other words, we simply define Dn to be every symmetry we can obtain through a simple rotation and
a simple flip. This is a nice precise algebraic description. Since the collection of symmetries is closed under
composition and inverses, every element of Dn is indeed a rigid symmetry of the n-gon. We will explain
below (after we develop a bit of theory) why every such symmetry of a regular n-gon is given by an element
of Dn .
As an example, the following is an element of Dn .

r7 s5 r−4 s−4 rs−3 r14

We would like to find a way to simplify such expressions, and the next proposition is the primary tool that
we will need.
Proposition 5.4.3. Let n ≥ 3. We have the following
1. |r| = n.
2. |s| = 2.
3. sr = r−1 s = rn−1 s.
4. For all k ∈ N+ , we have srk = r−k s.

Proof. We have |r| = n because r is an n-cycle and |s| = 2 because s is a product of disjoint 2-cycles. We
now check that sr = r−1 s. First notice that

r−1 = (n n − 1 · · · 3 2 1) = (1 n n − 1 · · · 3 2)

Suppose first that n = 2k where k ∈ N+ . We then have

sr = (2 n)(3 n − 1)(4 n − 2) · · · (k k + 2)(1 2 3 · · · n)


= (1 n)(2 n − 1)(3 n − 2) · · · (k k + 1)

and

r−1 s = (n n − 1 · · · 3 2 1)(2 n)(3 n − 1)(4 n − 2) · · · (k k + 2)


= (1 n)(2 n − 1)(3 n − 2) · · · (k k + 1)

so sr = r−1 s. Suppose now that n = 2k + 1 where k ∈ N+ . We then have

sr = (2 n)(3 n − 1)(4 n − 2) · · · (k + 1 k + 2)(1 2 3 · · · n)


= (1 n)(2 n − 1)(3 n − 2) · · · (k k + 2)

and

r−1 s = (n n − 1 · · · 3 2 1)(2 n)(3 n − 1)(4 n − 2) · · · (k + 1 k + 2)


= (1 n)(2 n − 1)(3 n − 2) · · · (k k + 2)

so sr = r−1 s. Thus, sr = r−1 s in all cases. Since we know that |r| = n, we have rrn−1 = id and rn−1 r = id,
so r−1 = rn−1 . It follows that r−1 s = rn−1 s.
The last statement now follows by induction from the third.

We now give an example of how to use this proposition to simplify the above expression in the case n = 5.
We have

r7 s5 r−4 s−4 rs−3 r14 = r2 srrsr4 (using |r| = 5 and |s| = 2)
= r2 sr2 sr4
= r2 sr2 r−4 s
= r2 sr−2 s
= r2 sr3 s
= r2 r−3 ss
= r−1 s2
= r4 .
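We can double-check this simplification by computing with the actual permutations r = (1 2 3 4 5) and s = (2 5)(3 4) in S5 . A sketch (helper names ours):

```python
def compose(p, q):
    """(p ∘ q)(i) = p(q(i)) on tuples of images."""
    return tuple(p[q[i] - 1] for i in range(len(p)))

def power(p, k):
    """p^k for any integer k, using the inverse permutation when k < 0."""
    n = len(p)
    if k < 0:
        inv = [0] * n
        for i in range(n):
            inv[p[i] - 1] = i + 1
        return power(tuple(inv), -k)
    result = tuple(range(1, n + 1))
    for _ in range(k):
        result = compose(result, p)
    return result

r = (2, 3, 4, 5, 1)  # (1 2 3 4 5)
s = (1, 5, 4, 3, 2)  # (2 5)(3 4)

word = [(r, 7), (s, 5), (r, -4), (s, -4), (r, 1), (s, -3), (r, 14)]
result = tuple(range(1, 6))
for p, k in word:
    result = compose(result, power(p, k))
assert result == power(r, 4)  # r^7 s^5 r^-4 s^-4 r s^-3 r^14 = r^4
```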

Theorem 5.4.4. Let n ≥ 3.

1. Dn = {ri sk : 0 ≤ i ≤ n − 1, 0 ≤ k ≤ 1}

2. If ri sk = rj s` with 0 ≤ i, j ≤ n − 1 and 0 ≤ k, ` ≤ 1, then i = j and k = `.

In particular, we have |Dn | = 2n.



Proof. Using the fundamental relations that |r| = n, that |s| = 2, and that srk = r−k s for all k ∈ N+ , the
above argument shows that any product of r, s, and their inverses equals ri sk for some i, k with 0 ≤ i ≤ n−1
and 0 ≤ k ≤ 1. To be more precise, one can use the above relations to show that the set

{ri sk : 0 ≤ i ≤ n − 1, 0 ≤ k ≤ 1}

is closed under multiplication and under inverses, so it equals Dn . I will leave such a check for you if you
would like to work through it. This gives part 1.
Suppose now that ri sk = rj s` with 0 ≤ i, j ≤ n − 1 and 0 ≤ k, ` ≤ 1. Multiplying on the left by r−j and
on the right by s−k , we see that
ri−j = s`−k

Suppose for the sake of obtaining a contradiction that k 6= `. Since k, ` ∈ {0, 1}, we must have `−k ∈ {−1, 1},
so as s−1 = s it follows that ri−j = s`−k = s. Now we have s(1) = 1, so we must have ri−j (1) = 1 as well.
This implies that n | (i − j), so as −n < i − j < n, we conclude that i − j = 0. Thus ri−j = id, and we
conclude that s = id, a contradiction. Therefore, we must have k = `. It follows that s`−k = s0 = id, so
ri−j = id. Using the fact that |r| = n now, we see that n | (i − j), and as above this implies that i − j = 0
so i = j.
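The count |Dn | = 2n can also be confirmed computationally by closing {r, s} under composition inside Sn , with r and s as in Definition 5.4.1. A brute-force sketch (helper names ours):

```python
def compose(p, q):
    """(p ∘ q)(i) = p(q(i)) on tuples of images of 1..n."""
    return tuple(p[q[i] - 1] for i in range(len(p)))

def dihedral(n):
    """Close {identity, r, s} under composition; by Theorem 5.4.4 this is Dn."""
    r = tuple(list(range(2, n + 1)) + [1])                  # (1 2 ... n)
    s = tuple([1] + [n + 2 - i for i in range(2, n + 1)])   # (2 n)(3 n-1)...
    elems = {tuple(range(1, n + 1)), r, s}
    while True:
        new = {compose(a, b) for a in elems for b in elems} - elems
        if not new:
            return elems
        elems |= new

for n in range(3, 8):
    assert len(dihedral(n)) == 2 * n
```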

Corollary 5.4.5. Let n ≥ 3. We then have that Dn is a nonabelian group with |Dn | = 2n.

Proof. We claim that rs 6= sr. Suppose instead that rs = sr. Since we know that sr = r−1 s, it follows that
rs = r−1 s. Canceling the s on the right, we see that r = r−1 . Multiplying on the left by r we see that
r2 = id, but this is a contradiction because |r| = n ≥ 3. Therefore, rs 6= sr and hence Dn is nonabelian.
Now by the previous theorem we know that

Dn = {ri sk : 0 ≤ i ≤ n − 1, 0 ≤ k ≤ 1}

and by the second part of the theorem that the 2n elements described in the set are distinct.

We now come back around to giving a geometric justification for why the elements of Dn exactly cor-
respond to the symmetries of the regular n-gon. We have described why every element of Dn does indeed
give a symmetry (because both r and s do, and the set of symmetries must be closed under composition
and inversion), so we need only understand why all possible symmetries of the regular n-gon arise from an
element of Dn . To determine a symmetry, we first need to send the vertex labeled by 1 to another vertex.
We have n possible choices for where to send it, and suppose we send it to the original position of vertex k.
Once we have sent vertex 1 to the position of vertex k, we now need to determine where vertex 2 is sent.
Now vertex 2 must go to one of the vertices adjacent to k, so we only have 2 choices for where to send it.
Finally, once we’ve determined these two vertices (where vertex 1 and vertex 2 go), the rest of the n-gon is
determined because we have completely determined where an entire edge goes. Thus, there are a total of
n · 2 = 2n many possible symmetries. Since |Dn | = 2n, it follows that all symmetries are given by elements
of Dn .
Finally notice that D3 = S3 simply because D3 is a subgroup of S3 and |D3 | = 6 = |S3 |. In other words,
any permutation of the vertices of an equilateral triangle is obtainable via a rigid motion of the triangle.
However, if n ≥ 4, then |Dn | is much smaller than |Sn | as most permutations of the vertices of a regular
n-gon cannot be obtained from a rigid motion.
We end with the Cayley table of D4 :

◦ id r r2 r3 s rs r2 s r3 s
id id r r2 r3 s rs r2 s r3 s
r r r2 r3 id rs r2 s r3 s s
r2 r2 r3 id r r2 s r3 s s rs
r3 r3 id r r2 r3 s s rs r2 s
s s r3 s r2 s rs id r3 r2 r
rs rs s r3 s r2 s r id r3 r2
r2 s r2 s rs s r3 s r2 r id r3
r3 s r3 s r2 s rs s r3 r2 r id

5.5 The Quaternion Group


Consider the following three 2 × 2 matrices over C:

A = [ 0  1 ]      B = [ 0  i ]      C = [ i  0 ]
    [−1  0 ]          [ i  0 ]          [ 0 −i ]

For A, we have

A2 = [ 0  1 ] [ 0  1 ] = [−1  0 ] = −I
     [−1  0 ] [−1  0 ]   [ 0 −1 ]

For B, we have

B2 = [ 0  i ] [ 0  i ] = [−1  0 ] = −I
     [ i  0 ] [ i  0 ]   [ 0 −1 ]

For C, we have

C2 = [ i  0 ] [ i  0 ] = [−1  0 ] = −I
     [ 0 −i ] [ 0 −i ]   [ 0 −1 ]
Since
A2 = B 2 = C 2 = −I
it follows that
A4 = B 4 = C 4 = I
Thus, each of A, B, and C has order at most 4. One can check that none of them have order 3 (either directly,
using general results on powers giving the identity, or using the fact that otherwise A3 = I = A4 , so A = I,
a contradiction). In particular, each of these matrices is an element of GL2 (C) because A · A3 = I = A3 · A
(and similarly for B and C).
We have
$$AB = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = C$$
and
$$BA = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} = -C$$
We also have
$$CA = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} = B$$
and
$$AC = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} = -B$$

Finally, we have
$$BC = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = A$$
and
$$CB = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = -A$$
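These products are easy to verify mechanically. Below is a quick Python check (ours, purely illustrative) using nested lists for the 2 × 2 complex matrices and a small hand-rolled multiplication helper:

```python
def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists [[a, b], [c, d]]."""
    return [[sum(X[r][t] * Y[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

def neg(X):
    return [[-x for x in row] for row in X]

I = [[1, 0], [0, 1]]
A = [[0, 1], [-1, 0]]
B = [[0, 1j], [1j, 0]]
C = [[1j, 0], [0, -1j]]

# Each of A, B, C squares to -I, so each has order 4 in GL2(C).
for M in (A, B, C):
    assert matmul(M, M) == neg(I)

# The six products computed above: AB = C, BA = -C, CA = B, AC = -B, BC = A, CB = -A.
assert matmul(A, B) == C and matmul(B, A) == neg(C)
assert matmul(C, A) == B and matmul(A, C) == neg(B)
assert matmul(B, C) == A and matmul(C, B) == neg(A)
```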

Thus, if we are working in GL2 (C), then

hA, B, Ci = {I, −I, A, −A, B, −B, C, −C}.

This subgroup of 8 elements is called the quaternion group. However, just like in the situation for Dn , we
typically give these elements other names and forget that they are matrices (just as we often forget that the
elements of Dn are really elements of Sn ).

Definition 5.5.1. The quaternion group is the group on the set {1, −1, i, −i, j, −j, k, −k} where we define

i2 = −1     j2 = −1     k2 = −1
ij = k      jk = i      ki = j
ji = −k     kj = −i     ik = −j

and all other multiplications in the natural way. We denote this group by Q8 .

For example, we have (−k)(−j) = −(−(kj)) = kj = −i.
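As a sanity check, the multiplication rule can be encoded directly. Below is one possible Python encoding (ours, not standard) of Q8 elements as pairs (sign, symbol), with the symbol products taken from the definition above:

```python
# An element of Q8 is encoded as (sign, s) with sign in {1, -1} and s in "1ijk".
TABLE = {
    ('1', '1'): (1, '1'), ('1', 'i'): (1, 'i'), ('1', 'j'): (1, 'j'), ('1', 'k'): (1, 'k'),
    ('i', '1'): (1, 'i'), ('i', 'i'): (-1, '1'), ('i', 'j'): (1, 'k'), ('i', 'k'): (-1, 'j'),
    ('j', '1'): (1, 'j'), ('j', 'i'): (-1, 'k'), ('j', 'j'): (-1, '1'), ('j', 'k'): (1, 'i'),
    ('k', '1'): (1, 'k'), ('k', 'i'): (1, 'j'), ('k', 'j'): (-1, 'i'), ('k', 'k'): (-1, '1'),
}

def mul(x, y):
    """Multiply two encoded elements of Q8: the signs multiply, and the
    symbols (with the sign they generate) come from TABLE."""
    (s1, a), (s2, b) = x, y
    s, c = TABLE[(a, b)]
    return (s1 * s2 * s, c)

# The example above: (-k)(-j) = kj = -i.
assert mul((-1, 'k'), (-1, 'j')) == (-1, 'i')

# The operation is associative, as it must be for a group.
Q8 = [(s, a) for s in (1, -1) for a in "1ijk"]
assert all(mul(mul(x, y), z) == mul(x, mul(y, z))
           for x in Q8 for y in Q8 for z in Q8)
```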

5.6 The Center of a Group


Definition 5.6.1. Given a group G, we let Z(G) = {a ∈ G : ag = ga for all g ∈ G}, i.e. Z(G) is the set of
elements of G that commute with every element of G. We call Z(G) the center of G.

Proposition 5.6.2. For any group G, we have that Z(G) is a subgroup of G.

Proof. We check the three conditions.

• First notice that for any g ∈ G we have


eg = g = ge
directly from the identity axiom, hence e ∈ Z(G).

• Suppose that a, b ∈ Z(G). Let g ∈ G be arbitrary. We then have

g(ab) = (ga)b (by associativity)


= (ag)b (since a ∈ Z(G))
= a(gb) (by associativity)
= a(bg) (since b ∈ Z(G))
= (ab)g (by associativity)

Since g ∈ G was arbitrary, we conclude that g(ab) = (ab)g for all g ∈ G, so ab ∈ Z(G). Thus, Z(G) is
closed under multiplication.

• Suppose that a ∈ Z(G). Let g ∈ G be arbitrary. Since a ∈ Z(G), we have

ga = ag

Multiplying on the left by a−1 we conclude that

a−1 ga = a−1 ag

so
a−1 ga = g
Now multiplying this equation on the right by a−1 gives

a−1 gaa−1 = ga−1

so
a−1 g = ga−1
Since g ∈ G was arbitrary, we conclude that ga−1 = a−1 g for all g ∈ G, so a−1 ∈ Z(G). Thus, Z(G)
is closed under inverses.

Combining the above three facts, we conclude that Z(G) is a subgroup of G.

We now calculate Z(G) for many of the groups G that we have encountered.

• First notice that if G is an abelian group, then we trivially have that Z(G) = G. In particular, we
have Z(Z/nZ) = Z/nZ and Z(U (Z/nZ)) = U (Z/nZ) for all n ∈ N+ .

• On the homework, we proved that if n ≥ 3 and σ ∈ Sn \{id}, then there exists τ ∈ Sn with στ ≠ τ σ,
so σ ∉ Z(Sn ). Since the identity of a group is always in the center, it follows that Z(Sn ) = {id} for all
n ≥ 3. Notice that we also have Z(S1 ) = {id} trivially, but Z(S2 ) = {id, (1 2)} = S2 .

• If n ≥ 4, then Z(An ) = {id}. Suppose that σ ∈ An with σ ≠ id. We have that σ : {1, 2, . . . , n} →
{1, 2, . . . , n} is a bijection which is not the identity map. Thus, we may fix i ∈ {1, 2, . . . , n} with
σ(i) ≠ i, say σ(i) = j. Since n ≥ 4, we may fix k, ℓ ∈ {1, 2, . . . , n}\{i, j} with k ≠ ℓ. Define
τ : {1, 2, . . . , n} → {1, 2, . . . , n} by
$$\tau(m) = \begin{cases} k & \text{if } m = j \\ \ell & \text{if } m = k \\ j & \text{if } m = \ell \\ m & \text{otherwise} \end{cases}$$

In other words, τ = (j k ℓ) in cycle notation. Notice that τ ∈ An , that

(τ ◦ σ)(i) = τ (σ(i)) = τ (j) = k

and that
(σ ◦ τ )(i) = σ(τ (i)) = σ(i) = j.
Since j ≠ k, we have shown that the functions σ ◦ τ and τ ◦ σ disagree on i, and hence are distinct
functions. In other words, in An we have στ ≠ τ σ. Thus, Z(An ) = {id} if n ≥ 4.
Notice that A1 = {id} and A2 = {id}, so we also have Z(An ) = {id} trivially when n ≤ 2. However,
when n = 3, we have that A3 = {id, (1 2 3), (1 3 2)} = h(1 2 3)i is cyclic and hence abelian, so
Z(A3 ) = A3 .
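For small groups, these center calculations can be confirmed by brute force. Here is a Python sketch (all helper names are ours) representing permutations in one-line notation on {0, . . . , n − 1}:

```python
from itertools import permutations

def compose(s, t):
    """(s ∘ t)(x) = s(t(x)) for permutations in one-line notation."""
    return tuple(s[t[x]] for x in range(len(s)))

def center(G):
    """Elements of G commuting with every element of G."""
    return [a for a in G if all(compose(a, g) == compose(g, a) for g in G)]

def sym(n):
    return list(permutations(range(n)))

def alt(n):
    """Even permutations, detected by counting inversions."""
    def even(p):
        inv = sum(1 for i in range(len(p))
                  for j in range(i + 1, len(p)) if p[i] > p[j])
        return inv % 2 == 0
    return [p for p in sym(n) if even(p)]

assert len(center(sym(2))) == 2   # Z(S2) = S2
assert len(center(sym(3))) == 1   # Z(S3) = {id}
assert len(center(sym(4))) == 1   # Z(S4) = {id}
assert len(center(alt(3))) == 3   # Z(A3) = A3
assert len(center(alt(4))) == 1   # Z(A4) = {id}
```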

• We have Z(Q8 ) = {1, −1}. We trivially have that 1 ∈ Z(Q8 ), and showing that −1 ∈ Z(Q8 ) is simply
a matter of checking the various possibilities. Now i ∉ Z(Q8 ) because ij = k but ji = −k, and
−i ∉ Z(Q8 ) because (−i)j = −k and j(−i) = k. Similar arguments show that the other four elements
are not in Z(Q8 ) either.
• We claim that
$$Z(GL_2(\mathbb{R})) = \left\{ \begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix} : r \in \mathbb{R}\setminus\{0\} \right\}$$
To see this, first notice that if r ∈ R\{0}, then
$$\begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix} \in GL_2(\mathbb{R})$$
because
$$\begin{pmatrix} 1/r & 0 \\ 0 & 1/r \end{pmatrix}$$
is an inverse. Now for any matrix
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL_2(\mathbb{R})$$
we have
$$\begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} ra & rb \\ rc & rd \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix}$$
Therefore
$$\left\{ \begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix} : r \in \mathbb{R}\setminus\{0\} \right\} \subseteq Z(GL_2(\mathbb{R}))$$
Suppose now that A ∈ Z(GL2 (R)), i.e. that AB = BA for all B ∈ GL2 (R). Write
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
Using our hypothesis in the special case where
$$B = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$$
(which is easily seen to be invertible), we obtain
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
Multiplying out these matrices, we see that
$$\begin{pmatrix} 2a & b \\ 2c & d \end{pmatrix} = \begin{pmatrix} 2a & 2b \\ c & d \end{pmatrix}$$
Thus, b = 2b and c = 2c, which tells us that b = 0 and c = 0. It follows that
$$A = \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$$

Now using our hypothesis in the special case where
$$B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
(which is easily seen to be invertible), we obtain
$$\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$$
Multiplying out these matrices, we see that
$$\begin{pmatrix} 0 & a \\ d & 0 \end{pmatrix} = \begin{pmatrix} 0 & d \\ a & 0 \end{pmatrix}$$
Therefore a = d, and we conclude that
$$A = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$$
completing our proof.

• Generalizing the above example, one can show that

Z(GLn (R)) = {r · In : r ∈ R\{0}}

where In is the n × n identity matrix.
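The two test matrices in the proof do all the work: any matrix commuting with both must already be scalar. A small Python search over integer matrices illustrates this (the code and the search range are ours; the search even includes singular matrices, for which the same computation goes through):

```python
def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[r][t] * Y[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

# The two invertible test matrices used in the proof above.
B1 = [[2, 0], [0, 1]]
B2 = [[0, 1], [1, 0]]

def commutes_with_tests(A):
    return matmul(A, B1) == matmul(B1, A) and matmul(A, B2) == matmul(B2, A)

# Scalar matrices pass the test ...
assert commutes_with_tests([[7, 0], [0, 7]])

# ... and an exhaustive search over small integer matrices finds no others:
# commuting with B1 forces b = c = 0, and commuting with B2 then forces a = d.
for a in range(-3, 4):
    for b in range(-3, 4):
        for c in range(-3, 4):
            for d in range(-3, 4):
                if commutes_with_tests([[a, b], [c, d]]):
                    assert b == 0 and c == 0 and a == d
```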

On the homework, you will calculate Z(Dn ).

5.7 Cosets
Suppose that G is a group and that H is a subgroup of G. The idea we want to explore is how to collapse
the elements of H by considering them all to be “trivial” like the identity e. If we want this idea to work, we
would then want to identify two elements a, b ∈ G if we can get from one to the other via multiplication by
a “trivial” element. In other words, we want to identify elements a and b if there exists h ∈ H with ah = b.
For example, suppose that G is the group R2 = R × R under addition (so (a1 , b1 ) + (a2 , b2 ) = (a1 +
a2 , b1 + b2 )) and that H = {(0, b) : b ∈ R} is the y-axis. Notice that H is a subgroup of G. We want to consider
everything on the y-axis, that is every pair of the form (0, b), as trivial. Now if we want the y-axis to be
considered “trivial”, then we would want to consider two points to be the “same” if we can get from one to
the other by adding an element of the y-axis. Thus, we would want to identify (a1 , b1 ) with (a2 , b2 ) if and
only if a1 = a2 , because then we can add the “trivial” point (0, b2 − b1 ) to (a1 , b1 ) to get (a2 , b2 ).
Let’s move on to the group G = Z. Let n ∈ N+ and consider H = hni = nZ = {nk : k ∈ Z}. In this
situation, we want to consider all multiples of n to be “equal” to 0, and in general we want to consider
a, b ∈ Z to be equal if we can add some multiple of n to a in order to obtain b. In other words, we want to
identify a and b if and only if there exists k ∈ Z with a + kn = b. Working it out, we want to identify a
and b if and only if b − a is a multiple of n, i.e. if and only if a ≡ b (mod n). Thus, in the special case of
the subgroup nZ of Z, we recover the fundamental ideas of modular arithmetic and our eventual definition
of Z/nZ.

Left Cosets
Definition 5.7.1. Let G be a group and let H be a subgroup of G. We define a relation ∼H on G by letting
a ∼H b mean that there exists h ∈ H with ah = b.
Proposition 5.7.2. Let G be a group and let H be a subgroup of G. The relation ∼H is an equivalence
relation on G.
Proof. We check the three properties.
• Reflexive: Let a ∈ G. We have that ae = a and we know that e ∈ H because H is a subgroup of G, so
a ∼H a.
• Symmetric: Let a, b ∈ G with a ∼H b. Fix h ∈ H with ah = b. Multiplying on the right by h−1 , we
see that a = bh−1 , so bh−1 = a. Now h−1 ∈ H because H is a subgroup of G, so b ∼H a.
• Transitive: Let a, b, c ∈ G with a ∼H b and b ∼H c. Fix h1 , h2 ∈ H with ah1 = b and bh2 = c. We
then have
a(h1 h2 ) = (ah1 )h2 = bh2 = c
Now h1 h2 ∈ H because H is a subgroup of G, so a ∼H c.
Therefore, ∼H is an equivalence relation on G.
The next proposition is a useful little rephrasing of when a ∼H b.
Proposition 5.7.3. Let G be a group and let H be a subgroup of G. Given a, b ∈ G, we have a ∼H b if and
only if a−1 b ∈ H.
Proof. Suppose first that a, b ∈ G satisfy a ∼H b. Fix h ∈ H with ah = b. Multiplying on the left by a−1 ,
we conclude that h = a−1 b, so a−1 b = h ∈ H.
Suppose conversely that a−1 b ∈ H. Since

a(a−1 b) = (aa−1 )b = eb = b

it follows that a ∼H b.
Definition 5.7.4. Let G be a group and let H be a subgroup of G. Under the equivalence relation ∼H , we
have

a = {b ∈ G : a ∼H b}
= {b ∈ G : There exists h ∈ H with b = ah}
= {ah : h ∈ H}
= aH.

These equivalence classes are called left cosets of H in G.


Since the left cosets of H in G are the equivalence classes of the equivalence relation ∼H , all of our theory
about equivalence relations applies. For example, if two left cosets of H in G intersect nontrivially, then they
are in fact equal. Also, as is the case for general equivalence relations, the left cosets of H in G partition G
into pieces. Using Proposition 5.7.3, we obtain the following fundamental way of determining when two left
cosets are equal:
aH = bH ⇐⇒ a−1 b ∈ H ⇐⇒ b−1 a ∈ H
Let’s work through an example.

Example 5.7.5. Let G = S3 and let H = h(1 2)i = {id, (1 2)}. Determine the left cosets of H in G.

Proof. The left cosets are the sets σH for σ ∈ G. For example, we have the left coset

idH = {id ◦ id, id ◦ (1 2)} = {id, (1 2)}

We can also consider the left coset

(1 2)H = {id ◦ (1 2), (1 2) ◦ (1 2)} = {(1 2), id} = {id, (1 2)}

Thus, the two left cosets idH and (1 2)H are equal. This should not be surprising because

id−1 ◦ (1 2) = id ◦ (1 2) = (1 2) ∈ H

Alternatively, we can simply note that (1 2) ∈ idH and (1 2) ∈ (1 2)H, so since the left cosets idH and (1 2)H
intersect nontrivially, we know immediately that they must be equal. Working through all the examples, we
compute σH for each of the six σ ∈ S3 :

1. idH = (1 2)H = {id, (1 2)}

2. (1 3)H = (1 2 3)H = {(1 3), (1 2 3)}

3. (2 3)H = (1 3 2)H = {(2 3), (1 3 2)}

Thus, there are 3 distinct left cosets of H in G.
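We can reproduce this example mechanically. In the Python sketch below (ours, illustrative only), permutations of {0, 1, 2} stand for elements of S3 in one-line notation, so (1, 0, 2) plays the role of the transposition (1 2), and a left coset σH is built by composing σ with each element of H:

```python
from itertools import permutations

def compose(s, t):
    """(s ∘ t)(x) = s(t(x)) for permutations in one-line notation."""
    return tuple(s[t[x]] for x in range(len(s)))

G = list(permutations(range(3)))           # S3 acting on {0, 1, 2}
H = [(0, 1, 2), (1, 0, 2)]                 # ⟨(1 2)⟩ = {id, (1 2)}

# The left coset aH = {a ∘ h : h in H}, collected as sets to merge duplicates.
left_cosets = {frozenset(compose(a, h) for h in H) for a in G}

assert len(left_cosets) == 3               # three distinct left cosets, as above
assert all(len(c) == 2 for c in left_cosets)
```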

Intuitively, if H is a subgroup of G, then a left coset aH is simply a “translation” of the subgroup H


within G using the element a ∈ G. In the above example, (1 3)H is simply the “shift” of the subgroup H
using (1 3) because we obtain it by hitting every element of H on the left by (1 3).
To see this geometrically, consider again the group G = R2 under addition with subgroup

H = {(0, b) : b ∈ R}

equal to the y-axis (or equivalently the line x = 0). Since the operation in G is +, we will denote the left
coset of (a, b) ∈ G by (a, b) + H (rather than (a, b)H). Let’s consider the left coset (3, 0) + H. We have

(3, 0) + H = {(3, 0) + (0, b) : b ∈ R}


= {(3, b) : b ∈ R}

so (3, 0) + H gives the line x = 3. Thus, the left coset (3, 0) + H is the translation of the line x = 0 by 3 units to the right. Let’s now
consider the coset (3, 5) + H. We have

(3, 5) + H = {(3, 5) + (0, b) : b ∈ R}


= {(3, 5 + b) : b ∈ R}
= {(3, c) : c ∈ R}
= (3, 0) + H.

Notice that we could have obtained this with less work by noting that the inverse of (3, 5) in G is (−3, −5)
and (−3, −5) + (3, 0) = (0, −5) ∈ H, hence (3, 5) + H = (3, 0) + H. Therefore, the left coset (3, 5) + H also
gives the line x = 3. Notice that adding (3, 5) translates every element of H right by 3 and up by 5,
but as a set the vertical shift by 5 is washed away, since H already contains all vertical shifts.

Finally, let’s consider G = Z (under addition) and H = nZ = {nk : k ∈ Z} where n ∈ N+ . Notice that
given a, b ∈ Z, we have

a ∼nZ b ⇐⇒ There exists h ∈ nZ with a + h = b


⇐⇒ There exists k ∈ Z with a + nk = b
⇐⇒ There exists k ∈ Z with nk = b − a
⇐⇒ n | (b − a)
⇐⇒ b ≡ a (mod n)
⇐⇒ a ≡ b (mod n).

Therefore the two relations ∼nZ and ≡n are precisely the same, and we have recovered congruence modulo n
as a special case of our general construction. Since the relations are the same, they have the same equivalence
classes. Hence, the equivalence class of a under the equivalence relation ≡n , which in the past we denoted
by a, equals the equivalence class of a under the equivalence relation ∼nZ , which is the left coset a + nZ.

Right Cosets
In the previous section, we defined a ∼H b to mean that there exists h ∈ H with ah = b. Thus, we considered
two elements of G to be equivalent if we could get from a to b through multiplication by an element of H
on the right of a. In particular, with this definition, we saw that when we consider G = S3 and H = h(1 2)i,
we have (1 3) ∼H (1 2 3) because (1 3)(1 2) = (1 2 3).
What happens if we switch things up? For the rest of this section, completely ignore the definition of
∼H defined above because we will redefine it on the other side now.

Definition 5.7.6. Let G be a group and let H be a subgroup of G. We define a relation ∼H on G by letting
a ∼H b mean that there exists h ∈ H with ha = b.

The following results are proved exactly as above, just working on the other side.

Proposition 5.7.7. Let G be a group and let H be a subgroup of G. The relation ∼H is an equivalence
relation on G.

Proposition 5.7.8. Let G be a group and let H be a subgroup of G. Given a, b ∈ G, we have a ∼H b if and
only if ab−1 ∈ H.

Now it would be nice if this new equivalence relation was the same as the original equivalence relation.
Too bad. In general, they are different! For example, with this new equivalence relation, we do not have
(1 3) ∼H (1 2 3) because id ◦ (1 3) = (1 3) and (1 2)(1 3) = (1 3 2). Now that we know that the two relations
differ in general, we should think about the equivalence classes of this new equivalence relation.

Definition 5.7.9. Let G be a group and let H be a subgroup of G. Under the equivalence relation ∼H , we
have

a = {b ∈ G : a ∼H b}
= {b ∈ G : There exists h ∈ H with b = ha}
= {ha : h ∈ H}
= Ha.

These equivalence classes are called right cosets of H in G.

Let’s work out the right cosets of our previous example.



Example 5.7.10. Let G = S3 and let H = h(1 2)i = {id, (1 2)}. Determine the right cosets of H in G.

Proof. Working them all out, we obtain.

• Hid = H(1 2) = {id, (1 2)}

• H(1 3) = H(1 3 2) = {(1 3), (1 3 2)}

• H(2 3) = H(1 2 3) = {(2 3), (1 2 3)}

Thus, there are 3 distinct right cosets of H in G.

Notice that although we obtained both 3 left cosets and 3 right cosets, these cosets were different. In
particular, we have (1 3)H ≠ H(1 3). In other words, it is not true in general that the left coset aH equals
the right coset Ha. Notice that this is fundamentally an issue because S3 is nonabelian. If we were working
in an abelian group G with a subgroup H of G, then ah = ha for all h ∈ H, so aH = Ha.
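The failure of aH = Ha can be seen directly in this example. In the Python sketch below (ours; 0-indexed one-line notation, so (2, 1, 0) is the transposition (1 3)):

```python
from itertools import permutations

def compose(s, t):
    """(s ∘ t)(x) = s(t(x)) for permutations in one-line notation."""
    return tuple(s[t[x]] for x in range(len(s)))

H = [(0, 1, 2), (1, 0, 2)]         # ⟨(1 2)⟩, with (1, 0, 2) playing (1 2)
a = (2, 1, 0)                      # the transposition (1 3)

left = {compose(a, h) for h in H}  # aH = {(1 3), (1 2 3)}
right = {compose(h, a) for h in H} # Ha = {(1 3), (1 3 2)}

assert left != right               # the left and right cosets of (1 3) differ
```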
As in the left coset section, using Proposition 5.7.8, we have the following fundamental way of determining
when two right cosets are equal:

Ha = Hb ⇐⇒ ab−1 ∈ H ⇐⇒ ba−1 ∈ H

Index of a Subgroup
As we saw in the previous sections, it is not in general true that left cosets are right cosets and vice versa.
However, in the one example we saw above, we at least had the same number of left cosets as we had right
cosets. This is a general and important fact, which we now establish.
First, let’s once again state the fundamental way to tell when two left cosets are equal and when two
right cosets are equal.
aH = bH ⇐⇒ a−1 b ∈ H ⇐⇒ b−1 a ∈ H
Ha = Hb ⇐⇒ ab−1 ∈ H ⇐⇒ ba−1 ∈ H
Suppose now that G is a group and that H is a subgroup of G. Let LH be the set of left cosets of H in
G and let RH be the set of right cosets of H in G. We will show that |LH | = |RH | by defining a bijection
f : LH → RH . Now the natural idea is to define f by letting f (aH) = Ha. However, we need to be very
careful. Recall that the left cosets of H in G are the equivalence classes of a certain equivalence relation.
By “defining” f as above we are giving a definition based on particular representatives of these equivalence
classes, and it may be possible that aH = bH but Ha ≠ Hb. In other words, we must determine whether f
is well-defined.
In fact, in general the above f is not well-defined. Consider our standard example of G = S3 and
H = h(1 2)i. Checking our above computations, we have (1 3)H = (1 2 3)H but H(1 3) ≠ H(1 2 3).
Therefore, in this particular case, that choice of f is not well-defined. We need to define f differently to
make it well-defined in general, and the following lemma is the key to do so.

Lemma 5.7.11. Let H be a subgroup of G and let a, b ∈ G. The following are equivalent.

1. aH = bH

2. Ha−1 = Hb−1

Proof. Suppose that aH = bH. We then have that a−1 b ∈ H, so using the fact that (b−1 )−1 = b, we see that
a−1 (b−1 )−1 ∈ H. It follows that Ha−1 = Hb−1 .
Suppose conversely that Ha−1 = Hb−1 . We then have that a−1 (b−1 )−1 ∈ H, so using the fact that
(b−1 )−1 = b, we see that a−1 b ∈ H. It follows that aH = bH.

We can now prove our result.

Proposition 5.7.12. Let G be a group and let H be a subgroup of G. Let LH be the set of left cosets of H
in G and let RH be the set of right cosets of H in G. Define f : LH → RH by letting f (aH) = Ha−1 . We
then have that f is a well-defined bijection from LH onto RH . In particular, |LH | = |RH |.

Proof. Notice that f is well-defined by the above lemma because if aH = bH, then

f (aH) = Ha−1 = Hb−1 = f (bH).

We next check that f is injective. Suppose that f (aH) = f (bH), so that Ha−1 = Hb−1 . By the other
direction of the lemma, we have that aH = bH. Therefore, f is injective.
Finally, we need to check that f is surjective. Fix an element of RH , say Hb. We then have that
b−1 H ∈ LH and
f (b−1 H) = H(b−1 )−1 = Hb.
Hence, range(f ) = RH , so f is surjective.
Putting it all together, we conclude that f is a well-defined bijection from LH onto RH .
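A brute-force search confirms both halves of this discussion in S3: the naive map aH → Ha is not well-defined for this subgroup, while aH → Ha−1 is. A Python sketch (all helper names are ours):

```python
from itertools import permutations

def compose(s, t):
    return tuple(s[t[x]] for x in range(len(s)))

def inverse(p):
    q = [0] * len(p)
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]                 # ⟨(1 2)⟩ in 0-indexed notation

def left(a):                               # the left coset aH
    return frozenset(compose(a, h) for h in H)

def right(a):                              # the right coset Ha
    return frozenset(compose(h, a) for h in H)

# aH = bH does not always give Ha = Hb, so aH -> Ha is NOT well-defined ...
bad = [(a, b) for a in G for b in G
       if left(a) == left(b) and right(a) != right(b)]
assert bad != []

# ... but aH = bH always gives Ha^(-1) = Hb^(-1), matching Lemma 5.7.11.
assert all(right(inverse(a)) == right(inverse(b))
           for a in G for b in G if left(a) == left(b))
```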

Definition 5.7.13. Let G be a group and let H be a subgroup of G. We define [G : H] to be the number of
left cosets of H in G (or equivalently the number of right cosets of H in G). That is, [G : H] is the number
of equivalence classes of G under the equivalence relation ∼H . If there are infinitely many left cosets (or
equivalently infinitely many right cosets), we write [G : H] = ∞. We call [G : H] the index of H in G.

For example, we saw above that [S3 : h(1 2)i] = 3. For any n ∈ N+ , we have [Z : nZ] = n because the
left cosets of nZ in Z are 0 + nZ, 1 + nZ, . . . , (n − 1) + nZ.

5.8 Lagrange’s Theorem


We are now in position to prove one of the most fundamental theorems about finite groups. We start with
a proposition.

Proposition 5.8.1. Let G be a group and let H be a subgroup of G. Let a ∈ G. Define a function
f : H → aH by letting f (h) = ah. We then have that f is a bijection, so |H| = |aH|. In other words, all left
cosets of H in G have the same size.

Proof. Notice that f is surjective because if b ∈ aH, then we may fix h ∈ H with b = ah, and notice that
f (h) = ah = b, so b ∈ range(f ). Suppose that h1 , h2 ∈ H and that f (h1 ) = f (h2 ). We then have that
ah1 = ah2 , so by canceling the a’s on the left (i.e. multiplying on the left by a−1 ), we conclude that h1 = h2 .
Therefore, f is injective. Putting this together with the fact that f is surjective, we conclude that f is a
bijection. The result follows.

Theorem 5.8.2 (Lagrange’s Theorem). Let G be a finite group and let H be a subgroup of G. We have

|G| = [G : H] · |H|.

In particular, |H| divides |G| and


|G|
[G : H] = .
|H|
Proof. The previous proposition shows that the [G : H] many left cosets of H in G each have cardinality
|H|. Since G is partitioned into [G : H] many sets each of size |H|, we conclude that |G| = [G : H] · |H|.

For example, instead of finding all of the left cosets of h(1 2)i in S3 to determine that [S3 : h(1 2)i] = 3,
we could have simply calculated
|S3 | 6
[S3 : h(1 2)i] = = = 3.
|h(1 2)i| 2
Notice the assumption in Lagrange’s Theorem that G is finite. It makes no sense to calculate [Z : nZ] in

this manner (if you try to write ∞ you will make me very angry). We end this section with several simple
consequences of Lagrange’s Theorem.
Corollary 5.8.3. Let G be a finite group. Let K be a subgroup of G and let H be a subgroup of K. We
then have
[G : H] = [G : K] · [K : H].

Proof. Since G is finite (and hence trivially both H and K are finite) we may use Lagrange’s Theorem to
note that
|G| |K| |G|
[G : K] · [K : H] = · = = [G : H].
|K| |H| |H|

Corollary 5.8.4. Let G be a finite group and let a ∈ G. We then have that |a| divides |G|.

Proof. Let H = hai. By Proposition 5.2.3, we know that H is a subgroup of G and |a| = |H|. Therefore, by
Lagrange’s Theorem, we may conclude that |a| divides |G|.
Corollary 5.8.5. Let G be a finite group. We then have that a|G| = e for all a ∈ G.
Proof. Let m = |a|. By the previous corollary, we know that m | |G|. Therefore, by Proposition 4.6.5, it
follows that a|G| = e.

Theorem 5.8.6. Every group of prime order is cyclic (so in particular every group of prime order is abelian).
In fact, if G is a group of prime order, then every nonidentity element is a generator of G.
Proof. Let p = |G| be the prime order of G. Suppose that c ∈ G with c 6= e. We know that |c| divides p, so
as p is prime we must have that either |c| = 1 or |c| = p. Since c 6= e, we must have |c| = p. By Proposition
5.2.6, we conclude that c is a generator of G. Hence, every nonidentity element of G is a generator of G.
Theorem 5.8.7 (Euler’s Theorem). Let n ∈ N+ and let a ∈ Z with gcd(a, n) = 1. We then have aϕ(n) ≡ 1
(mod n).
Proof. We apply Corollary 5.8.4 to the group U (Z/nZ). Since gcd(a, n) = 1, we have that a ∈ U (Z/nZ).
Since |U (Z/nZ)| = ϕ(n), Corollary 5.8.4 tells us that aϕ(n) = 1. Therefore, aϕ(n) ≡ 1 (mod n).

Corollary 5.8.8 (Fermat’s Little Theorem). Let p ∈ N+ be prime.


1. If a ∈ Z with p - a, then ap−1 ≡ 1 (mod p).
2. If a ∈ Z, then ap ≡ a (mod p).

Proof. We first prove 1. Suppose that a ∈ Z with p - a. We then have that gcd(a, p) = 1 (because gcd(a, p)
divides p, so it must be either 1 or p, but it can not be p since p - a). Now using the fact that ϕ(p) = p − 1,
we conclude from Euler’s Theorem that ap−1 ≡ 1 (mod p).
We next prove 2. Suppose that a ∈ Z. If p - a, then ap−1 ≡ 1 (mod p) by part 1, so multiplying both
sides by a gives ap ≡ a (mod p). Now if p | a, then we trivially have p | ap as well, so p | (ap − a) and hence
ap ≡ a (mod p).
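Both theorems are easy to test numerically using Python's built-in three-argument `pow` for modular exponentiation; here ϕ is computed naively by counting units (the helper name `phi` is ours):

```python
from math import gcd

def phi(n):
    """Euler's totient, computed naively as |U(Z/nZ)|."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

# Euler's Theorem: a^phi(n) ≡ 1 (mod n) whenever gcd(a, n) = 1.
for n in range(2, 50):
    for a in range(1, n):
        if gcd(a, n) == 1:
            assert pow(a, phi(n), n) == 1

# Fermat's Little Theorem: a^p ≡ a (mod p) for every a and prime p.
for p in (2, 3, 5, 7, 11, 13):
    for a in range(50):
        assert pow(a, p, p) == a % p
```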
Chapter 6

Quotients and Homomorphisms

6.1 Quotients of Abelian Groups


The quotient construction is one of the most subtle but important constructions in group theory (and algebra
in general). As suggested by the name, the quotient should somehow be “smaller” than G. Given a group
G together with a subgroup H, the idea is to build a new group G/H that is obtained by “trivializing” H.
We already know that using H we get an equivalence relation ∼H which breaks up the group G into cosets.
The fundamental idea is to make a new group whose elements are these cosets. However, the first question
we need to ask ourselves is whether we will use left cosets or right cosets (i.e. which version of ∼H will we
work with). To avoid this issue, and to focus on the key elements of the construction first, we will begin by
assuming that our group is abelian so that there is no worry about which side we work on. As we will see,
this abelian assumption helps immensely in other ways as well.
Suppose then that G is an abelian group and that H is a subgroup of G. We let G/H be the set of all
(left) cosets of H in G, so elements of G/H are of the form aH for some a ∈ G. In the language established
when discussing equivalence relations, we are looking at the set G/∼H , but we simplify the notation and
just write G/H. Now that we have decided what the elements of G/H are, we need to define the operation.
How should we define the product of aH and bH? The natural idea is simply to define (aH) · (bH) = (ab)H.
That is, given the two cosets aH and bH, take the product ab in G and form its coset. Alarm bells should
immediately go off in your head and you should be asking: Is this well-defined? After all, a coset has many
different representatives. What if aH = cH and bH = dH? On the one hand, the product should be (ab)H
and on the other the product should be (cd)H. Are these necessarily the same? Recall that we are dealing
with equivalence classes so aH = bH if and only if a ∼H b. The next proposition says that everything is
indeed well-defined in the abelian case we are currently considering.

Proposition 6.1.1. Let G be an abelian group and let H be a subgroup of G. Suppose that a ∼H c and
b ∼H d. We then have that ab ∼H cd.

Proof. Fix h, k ∈ H such that ah = c and bk = d. We then have that

cd = ahbk = abhk.

Now hk ∈ H because H is a subgroup of G. Since cd = (ab)(hk), it follows that ab ∼H cd.

Notice how fundamentally we used the fact that G was abelian in this proof to write hb = bh. Overcoming
this apparent stumbling block will be our primary focus when we get to nonabelian groups.
For example, suppose that G = R2 and H = {(0, b) : b ∈ R} is the y-axis. Again, we will write (a, b) + H
for the left coset of (a, b) in G. The elements of the quotient group G/H are the left cosets of H in G, which


we know are the set of vertical lines in the plane. Let’s examine how we add two elements of G/H. The
definition says that we add left cosets by finding representatives of those cosets, adding those representatives,
and then taking the left coset of the result. In other words, to add two vertical lines, we pick points on the
lines, add those points, and output the line containing the result. For example, we add the cosets (3, 2) + H
and (4, −7) + H by computing (3, 2) + (4, −7) = (7, −5) and outputting the corresponding coset (7, −5) + H.
In other words, we have
((3, 2) + H) + ((4, −7) + H) = (7, −5) + H.
Now we could have chosen different representatives of those two cosets. For example, we have (3, 2) + H =
(3, 16) + H (after all both are on the line x = 3) and (4, −7) + H = (4, 1) + H, and if we calculate the sum
using these representatives we see that

((3, 16) + H) + ((4, 1) + H) = (7, 17) + H.

Now although the elements of G given by (7, −5) and (7, 17) are different, the cosets (7, −5) + H and
(7, 17) + H are equal because (7, −5) ∼H (7, 17).
We are now ready to formally define the quotient of an abelian group G by a subgroup H. We verify
that the given definition really is a group in the proposition immediately after the definition.

Definition 6.1.2. Let G be an abelian group and let H be a subgroup of G. We define a new group, called
the quotient of G by H and denoted G/H, by letting the elements be the left cosets of H in G (i.e. the
equivalence classes of G under ∼H ), and defining the binary operation aH · bH = (ab)H. The identity is eH
(where e is the identity of G) and the inverse of aH is a−1 H.

Proposition 6.1.3. Let G be an abelian group and let H be a subgroup of G. The set G/H with the operation
just defined is indeed a group with |G/H| = [G : H]. Furthermore, it is an abelian group.

Proof. We verified that the operation aH · bH = (ab)H is well-defined in Proposition 6.1.1. With that in
hand, we just need to check the group axioms.
We first check that the · is an associative operation on G/H. For any a, b, c ∈ G we have

(aH · bH) · cH = (ab)H · cH


= ((ab)c)H
= (a(bc))H (since · is associative on G)
= aH · (bc)H
= aH · (bH · cH).

We next check that eH is an identity. For any a ∈ G we have

aH · eH = (ae)H = aH

and
eH · aH = (ea)H = aH.
For inverses, notice that given any a ∈ G, we have

aH · a−1 H = (aa−1 )H = eH

and
a−1 H · aH = (a−1 a)H = eH.

Thus, G/H is indeed a group, and it has order [G : H] because the elements are the left cosets of H in G.
Finally, we verify that G/H is abelian by noting that for any a, b ∈ G, we have

aH · bH = (ab)H
= (ba)H (since G is abelian)
= bH · aH.

Here is another example. Suppose that G = U (Z/18Z) and let H = h17i = {1, 17}. We then have that
the left cosets of H in G are:

• 1H = 17H = {1, 17}

• 5H = 13H = {5, 13}

• 7H = 11H = {7, 11}

Therefore, |G/H| = 3. To multiply two cosets, we choose representatives and multiply. For example, we
could calculate
5H · 7H = (5 · 7)H = 17H.
We can multiply the exact same two cosets using different representatives. For example, we have 7H = 11H,
so we could calculate
5H · 11H = (5 · 11)H = 1H.
Notice that we obtained the same answer since 1H = 17H. Now there is no canonical choice of representatives
for the various cosets, so if you want to give each element of G/H a unique “name”, then you simply have
to pick which representative of each coset you will use. We will choose (somewhat arbitrarily) to view G/H
as the following:
G/H = {1H, 5H, 7H}.
Here is the Cayley table of G/H using these choices of representatives.

· 1H 5H 7H
1H 1H 5H 7H
5H 5H 7H 1H
7H 7H 1H 5H

Again, notice that using the definition we have 5H · 7H = 17H, but since 17 was not one of our chosen
representatives and 17H = 1H where 1 is one of our chosen representatives, we used 5H · 7H = 1H in the
above table.
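This example can be verified mechanically, including the well-definedness of the coset operation. A Python sketch (the helper name `coset` is ours):

```python
from math import gcd

n = 18
U = [a for a in range(1, n) if gcd(a, n) == 1]   # U(Z/18Z) = {1, 5, 7, 11, 13, 17}
H = [1, 17]

def coset(a):
    """The coset aH inside U(Z/18Z), as a set of representatives mod 18."""
    return frozenset(a * h % n for h in H)

cosets = {coset(a) for a in U}
assert len(cosets) == 3                          # the three cosets listed above

# The operation aH * bH = (ab)H is well-defined: any choice of representatives
# from the same cosets produces the same product coset.
for a in U:
    for b in U:
        for a2 in coset(a):
            for b2 in coset(b):
                assert coset(a * b % n) == coset(a2 * b2 % n)

assert coset(5 * 7) == coset(1)                  # 5H · 7H = 17H = 1H
```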
Finally, let us take a moment to realize that we have been dealing with quotient groups all along when
working with Z/nZ. This group is exactly the quotient of the group G = Z under addition by the subgroup
H = nZ = {nk : k ∈ Z}. Recall from Section 5.7 that given a, b ∈ Z, we have

a ∼nZ b ⇐⇒ a ≡ b (mod n)

so the left cosets of nZ are precisely the equivalence classes of ≡n . Stated in symbols, if a is the equivalence
class of a under ≡n , then a = a + nZ (we are again using + in left cosets because that is the operation in
Z). Furthermore, our definition of the operation in Z/nZ was given by

a + b = a + b.

However, in terms of cosets, this simply says


(a + nZ) + (b + nZ) = (a + b) + nZ
which is exactly our definition of the operation in the quotient Z/nZ. Therefore, the group Z/nZ is precisely
the quotient of the group Z by the subgroup nZ, which is the reason why we have used that notation all
along.

6.2 Normal Subgroups and Quotient Groups


In discussing quotients of abelian groups, we used the abelian property in two places. The first place was to
avoid choosing whether we were dealing with left cosets or right cosets (since they will be the same whenever
G is abelian). The second place we used commutativity was in checking that the group operation on the
quotient was well-defined. Now the first choice of left/right cosets doesn’t seem so hard to overcome because
we can simply choose one. For the moment, say we choose to work with left cosets. However, the well-defined
issue looks like it might be hard to overcome because we used commutativity in what appears to be a very
central manner. Let us recall our proof. We had a subgroup H of an abelian group G, and we were assuming
a ∼H c and b ∼H d. We then fixed h, k ∈ H with ah = c and bk = d and observed that
cd = ahbk = abhk
hence ab ∼H cd because hk ∈ H. Notice the key use of commutativity to write hb = bh. In fact, we can
easily see that the operation is not always well-defined if G is nonabelian. Consider our usual example of
G = S3 with H = h(1 2)i = {id, (1 2)}. We computed the left cosets in Section 5.7:
1. idH = (1 2)H = {id, (1 2)}
2. (1 3)H = (1 2 3)H = {(1 3), (1 2 3)}
3. (2 3)H = (1 3 2)H = {(2 3), (1 3 2)}
Thus, we have id ∼H (1 2) and (1 3) ∼H (1 2 3). Now id(1 3) = (1 3) and (1 2)(1 2 3) = (2 3), but a quick
look at the left cosets shows that (1 3) ≁H (2 3). Hence
id(1 3) ≁H (1 2)(1 2 3).
In other words, the operation σH · τ H = (στ )H is not well-defined because on the one hand we would have
idH · (1 3)H = (1 3)H
and on the other hand
(1 2)H · (1 2 3)H = (2 3)H
which contradicts the definition of a function since (1 3)H ≠ (2 3)H.
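The counterexample above can also be rediscovered by brute force: searching over representatives in S3 finds pairs with aH = cH and bH = dH but (ab)H ≠ (cd)H. A Python sketch (0-indexed one-line notation, helper names ours):

```python
from itertools import permutations

def compose(s, t):
    """(s ∘ t)(x) = s(t(x)) for permutations in one-line notation."""
    return tuple(s[t[x]] for x in range(len(s)))

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]                 # ⟨(1 2)⟩ in 0-indexed notation

def coset(a):
    return frozenset(compose(a, h) for h in H)

# Search for representatives with aH = cH and bH = dH but (ab)H != (cd)H:
# the would-be operation on left cosets is not well-defined for this H.
broken = [(a, b, c, d)
          for a in G for c in G if coset(a) == coset(c)
          for b in G for d in G if coset(b) == coset(d)
          if coset(compose(a, b)) != coset(compose(c, d))]
assert broken != []
```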
With the use of commutativity so essential in our well-defined proof and a general counterexample in
hand, it might appear that we have little hope in dealing with nonabelian groups. We just showed that we
have no hope for an arbitrary subgroup H of G, but maybe we can get by with some special subgroups. Recall
again our proof that used
cd = ahbk = abhk
We really do need to get b next to a, but with a little thought we can see some wiggle room. For example, one
way we could get by is if H ⊆ Z(G). Under this assumption, the elements of H commute with everything
in G, so the above argument still works. This suffices, but it is quite restrictive. Notice that we could make
things work with even less. We would be able to get by if we weakened the assumption that H commutes
with all elements of G to the following:

For all g ∈ G and all h ∈ H, there exists ` ∈ H with hg = g`.


In other words, we do not need to be able to move b past h in a manner which does not disturb h at all, but
instead we could get by with moving b past h in a manner which changes h to perhaps a different element
of H.
It turns out that the above condition on a subgroup H of a group G is equivalent to many other
fundamental concepts. First, we introduce the following definition.
Definition 6.2.1. Let G be a group and let H be a subgroup of G. Given g ∈ G, we define

gHg −1 = {ghg −1 : h ∈ H}

Proposition 6.2.2. Let G be a group and let H be a subgroup of G. The following are equivalent.
1. For all g ∈ G and all h ∈ H, there exists ` ∈ H with hg = g`.
2. For all g ∈ G and all h ∈ H, there exists ` ∈ H with gh = `g.
3. g −1 hg ∈ H for all g ∈ G and all h ∈ H.
4. ghg −1 ∈ H for all g ∈ G and all h ∈ H.
5. gHg −1 ⊆ H for all g ∈ G.
6. gHg −1 = H for all g ∈ G.
7. Hg ⊆ gH for all g ∈ G.
8. gH ⊆ Hg for all g ∈ G.
9. gH = Hg for all g ∈ G.
Proof. 1 ⇒ 2: Suppose that we know 1. Let g ∈ G and let h ∈ H. Applying 1 with g −1 ∈ G and h ∈ H, we
may fix ` ∈ H with hg −1 = g −1 `. Multiplying on the left by g we see that ghg −1 = `, and then multiplying
on the right by g we conclude that gh = `g.
2 ⇒ 1: Suppose that we know 2. Let g ∈ G and let h ∈ H. Applying 2 with g −1 ∈ G and h ∈ H, we may
fix ` ∈ H with g −1 h = `g −1 . Multiplying on the right by g we see that g −1 hg = `, and then multiplying on
the left by g we conclude that hg = g`.
1 ⇒ 3: Suppose that we know 1. Let g ∈ G and let h ∈ H. By 1, we may fix ` ∈ H with hg = g`.
Multiplying on the left by g −1 , we see that g −1 hg = ` ∈ H. Since g ∈ G and h ∈ H were arbitrary, we
conclude that g −1 hg ∈ H for all g ∈ G and all h ∈ H.
3 ⇒ 1: Suppose that we know 3. Let g ∈ G and let h ∈ H. Now

hg = ehg = (gg −1 )hg = g(g −1 hg)

and by 3 we know that g −1 hg ∈ H. Part 1 follows.


2 ⇔ 4: This follows in exactly the same manner as 1 ⇔ 3.
4 ⇔ 5: These two are simply restatements of each other.
5 ⇒ 6: Suppose we know 5. To prove 6, we need only prove that H ⊆ gHg −1 for all g ∈ G (since
the reverse containment is given to us). Let g ∈ G and let h ∈ H. By 5 applied to g −1 , we see that
g −1 H(g −1 )−1 ⊆ H, hence g −1 Hg ⊆ H and in particular we conclude that g −1 hg ∈ H. Since

h = ehe = (gg −1 )h(gg −1 ) = g(g −1 hg)g −1

and g −1 hg ∈ H, we see that h ∈ gHg −1 . Since g ∈ G and h ∈ H were arbitrary, it follows that H ⊆ gHg −1
for all g ∈ G.
96 CHAPTER 6. QUOTIENTS AND HOMOMORPHISMS

6 ⇒ 5: This is trivial.
1 ⇔ 7: These two are simply restatements of each other.
2 ⇔ 8: These two are simply restatements of each other.
9 ⇒ 8: This is trivial.
7 ⇒ 9: Suppose that we know 7. Since 7 ⇒ 1 ⇒ 2 ⇒ 8, it follows that we know 8 as well. Putting 7 and
8 together we conclude 9.
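Condition 4 of the proposition is easy to test by brute force. As a sketch (again with a tuple encoding of permutations of my own choosing), conjugating H = h(1 2)i by every element of S3 shows that H fails the condition, matching the earlier counterexample:

```python
from itertools import permutations

def compose(p, q):
    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    # inverse permutation: inv maps p(i) back to i
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v - 1] = i + 1
    return tuple(inv)

S3 = list(permutations(range(1, 4)))
H = {(1, 2, 3), (2, 1, 3)}               # <(1 2)> = {id, (1 2)}

conjugates = {compose(compose(g, h), inverse(g)) for g in S3 for h in H}
# some g h g^{-1} lies outside H, so H fails condition 4 and is not normal
assert not conjugates <= H
```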

As the Proposition shows, the condition we are seeking to ensure that multiplication of left cosets is
well-defined is equivalent to the condition that the left cosets of H in G are equal to the right cosets of H
in G. Thus, by adopting that condition we automatically get rid of the other problematic question of which
side to work on. This condition is shaping up to be so useful that we give the subgroups which satisfy it a
special name.

Definition 6.2.3. Let G be a group and let H be a subgroup of G. We say that H is a normal subgroup of
G if gHg −1 ⊆ H for all g ∈ G (or equivalently, if any of the properties in the previous proposition hold).

Our entire goal in defining and exploring the concept of a normal subgroup H of a group G was to
allow us to prove that multiplication of left cosets via representatives is well-defined. It turns out that
normality of H is precisely equivalent to this operation being well-defined.

Proposition 6.2.4. Let G be a group and let H be a subgroup of G. The following are equivalent.

1. H is a normal subgroup of G.

2. Whenever a, b, c, d ∈ G with a ∼H c and b ∼H d, we have ab ∼H cd. (Here, ∼H is the equivalence


relation corresponding to left cosets.)

Proof. We first prove that 1 implies 2. Suppose that a, b, c, d ∈ G with a ∼H c and b ∼H d. Fix h, k ∈ H
such that ah = c and bk = d. Since H is a normal subgroup of G, we may fix ` ∈ H with hb = b`. We then
have that
cd = ahbk = ab`k.
Now `k ∈ H because H is a subgroup of G. Since cd = (ab)(`k), it follows that ab ∼H cd.
We now prove that 2 implies 1. We prove that H is a normal subgroup of G by showing that g −1 hg ∈ H
for all g ∈ G and h ∈ H. Let g ∈ G and let h ∈ H. Notice that we have e ∼H h because eh = h and g ∼H g
because ge = g. Since we are assuming 2, it follows that eg ∼H hg and hence g ∼H hg. Fix k ∈ H with
gk = hg. Multiplying on the left by g −1 , we get k = g −1 hg, so g −1 hg ∈ H. The result follows.

For example, A3 = h(1 2 3)i is a normal subgroup of S3 . To show this, we can directly compute the
cosets (although we will see a faster method soon). The left cosets of A3 in S3 are:

• idA3 = (1 2 3)A3 = (1 3 2)A3 = {id, (1 2 3), (1 3 2)}

• (1 2)A3 = (1 3)A3 = (2 3)A3 = {(1 2), (1 3), (2 3)}

The right cosets of A3 in S3 are:

• A3 id = A3 (1 2 3) = A3 (1 3 2) = {id, (1 2 3), (1 3 2)}

• A3 (1 2) = A3 (1 3) = A3 (2 3) = {(1 2), (1 3), (2 3)}

Thus, σA3 = A3 σ for all σ ∈ S3 and hence A3 is a normal subgroup of S3 .
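The coset computation above can also be done by machine. This sketch (with a tuple encoding of permutations that is my own, not the text's) confirms σA3 = A3 σ for every σ ∈ S3:

```python
from itertools import permutations

def compose(p, q):
    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

S3 = set(permutations(range(1, 4)))
A3 = {(1, 2, 3), (2, 3, 1), (3, 1, 2)}   # {id, (1 2 3), (1 3 2)}

for g in S3:
    # left coset gA3 equals right coset A3g for every g, so A3 is normal
    assert {compose(g, h) for h in A3} == {compose(h, g) for h in A3}
```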

Proposition 6.2.5. Let G be a group and let H be a subgroup of G. If H ⊆ Z(G), then H is a normal
subgroup of G.

Proof. For any g ∈ G and h ∈ H, we have hg = gh because h ∈ Z(G). Therefore, H is a normal subgroup
of G by Condition 1 above.
Example 6.2.6. Suppose that n ≥ 4 is even and write n = 2k for k ∈ N+ . Since Z(Dn ) = {id, rk } from
the homework, we conclude that {id, rk } is a normal subgroup of Dn .
Proposition 6.2.7. Let G be a group and let H be a subgroup of G. If [G : H] = 2, then H is a normal
subgroup of G.
Proof. Since [G : H] = 2, we know that there are exactly two distinct left cosets of H in G and exactly two
distinct right cosets of H in G. One left coset of H in G is eH = H. Since the left cosets partition G and
there are only two of them, it must be the case that the other left coset of H in G is the set G\H (that is
the set G with the set H removed). Similarly, one right coset of H in G is He = H. Since the right cosets
partition G and there are only two of them, it must be the case that the other right coset of H in G is the
set G\H.
To show that H is a normal subgroup of G, we know that it suffices to show that gH = Hg for all g ∈ G
(by Proposition 6.2.2). Let g ∈ G be arbitrary. We have two cases.
• Suppose that g ∈ H. We then have that g ∈ gH and g ∈ eH, so gH ∩eH 6= ∅ and hence gH = eH = H.
We also have that g ∈ Hg and g ∈ He, so Hg ∩ He 6= ∅ and hence Hg = He = H. Therefore
gH = H = Hg.
• Suppose that g ∈/ H. We then have g ∈/ eH, so gH 6= eH, and hence we must have gH = G\H
(because it is the only other left coset). We also have g ∈/ He, so Hg 6= He, and hence Hg = G\H.
Therefore gH = G\H = Hg.
Thus, for any g ∈ G, we have gH = Hg. It follows that H is a normal subgroup of G.
Since [S3 : A3 ] = 2, this proposition gives a different way to prove that A3 is a normal subgroup of S3
without doing all of the calculations we did above. In fact, we get the following.
Corollary 6.2.8. An is a normal subgroup of Sn for all n ∈ N+ .
Proof. This is trivial if n = 1. When n ≥ 2, we have

[Sn : An ] = |Sn |/|An |
           = n!/(n!/2)    (by Proposition 5.3.12)
           = 2

so An is a normal subgroup of Sn by Proposition 6.2.7.


Corollary 6.2.9. For any n ≥ 3, the subgroup H = hri is a normal subgroup of Dn .
Proof. We have |H| = |r| = n and |Dn | = 2n, so [Dn : H] = 2, and again the result follows from Proposition 6.2.7.
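For a concrete check of this corollary with n = 4, here is a sketch encoding the elements of Dn as pairs (i, j) standing for r^i s^j, using the relation s r = r^{-1} s (this encoding is mine, not the text's):

```python
N = 4                                    # work in D4 for concreteness

def mult(x, y):
    # r^a s^b * r^c s^d = r^(a + (-1)^b c) s^(b + d), using s r = r^{-1} s
    (a, b), (c, d) = x, y
    return ((a + (-1) ** b * c) % N, (b + d) % 2)

D = [(i, j) for i in range(N) for j in range(2)]
H = {(i, 0) for i in range(N)}           # <r>, a subgroup of index 2

for g in D:
    # gH = Hg for every g, so <r> is normal in D4
    assert {mult(g, h) for h in H} == {mult(h, g) for h in H}
```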
We are now in a position to define the quotient of a group G (possibly nonabelian) by a normal subgroup
H. As in the abelian case, we verify that the definition really is a group immediately after.
Definition 6.2.10. Let G be a group and let H be a normal subgroup of G. We define a new group, called
the quotient of G by H and denoted G/H, by letting the elements be the left cosets of H in G (i.e. the
equivalence classes of G under ∼H ), and defining the binary operation aH · bH = (ab)H. The identity is eH
(where e is the identity of G) and the inverse of aH is a−1 H.

Proposition 6.2.11. Let G be a group and let H be a normal subgroup of G. The set G/H with the
operation just defined is indeed a group with |G/H| = [G : H].

Proof. We verified that the operation aH · bH = (ab)H is well-defined in Proposition 6.2.4. With that in
hand, we just need to check the group axioms.
We first check that · is an associative operation on G/H. For any a, b, c ∈ G we have

(aH · bH) · cH = (ab)H · cH


= ((ab)c)H
= (a(bc))H (since · is associative on G)
= aH · (bc)H
= aH · (bH · cH).

We next check that eH is an identity. For any a ∈ G we have

aH · eH = (ae)H = aH

and
eH · aH = (ea)H = aH.
For inverses, notice that given any a ∈ G, we have

aH · a−1 H = (aa−1 )H = eH

and
a−1 H · aH = (a−1 a)H = eH.
Thus, G/H is indeed a group, and it has order [G : H] because the elements are the left cosets of H in G.

For example, suppose that G = D4 and H = Z(G) = {id, r2 }. We know from Proposition 6.2.5 that H
is a normal subgroup of G. The left cosets (and hence right cosets because H is normal in G) of H in G are:

• idH = r2 H = {id, r2 }

• rH = r3 H = {r, r3 }

• sH = r2 sH = {s, r2 s}

• rsH = r3 sH = {rs, r3 s}

As usual, there are no “best” choices of representatives for these cosets when we consider G/H. We choose
to take
G/H = {idH, rH, sH, rsH}.
The Cayley table of G/H using these representatives is:

· idH rH sH rsH
idH idH rH sH rsH
rH rH idH rsH sH
sH sH rsH idH rH
rsH rsH sH rH idH

Notice that we had to switch to our “chosen” representatives several times when constructing this table. For
example, we have
rH · rH = r2 H = idH
and
sH · rH = srH = r−1 sH = r3 sH = rsH.
Examining the table, we see a few interesting facts. The group G/H is abelian even though G is not abelian.
Furthermore, every nonidentity element of G/H has order 2 even though r itself has order 4. We next see
how the order of an element in the quotient relates to the order of the representative in the original group.
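The table above can be generated mechanically. A sketch, again encoding D4 elements as pairs (i, j) = r^i s^j (my encoding, not the text's), builds the four cosets and confirms that every element of G/H squares to the identity coset:

```python
def mult(x, y):
    # r^a s^b * r^c s^d = r^(a + (-1)^b c) s^(b + d) in D4
    (a, b), (c, d) = x, y
    return ((a + (-1) ** b * c) % 4, (b + d) % 2)

D4 = [(i, j) for i in range(4) for j in range(2)]
H = {(0, 0), (2, 0)}                     # Z(D4) = {id, r^2}

def coset(g):
    return frozenset(mult(g, h) for h in H)

G_mod_H = {coset(g) for g in D4}
e = coset((0, 0))

assert len(G_mod_H) == 4                 # |G/H| = [G : H] = 8/2 = 4
for C in G_mod_H:
    g = next(iter(C))
    assert coset(mult(g, g)) == e        # (gH)^2 = g^2 H = H for every coset
```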

Proposition 6.2.12. Let G be a group and let H be a normal subgroup of G. Suppose that a ∈ G has finite
order. The order of aH (in the group G/H) is finite and divides the order of a (in the group G).

Proof. Let n = |a| (in the group G). We have an = e, so

(aH)n = an H = eH.

Now eH is the identity of G/H, so we have found some power of the element aH which gives the identity in
G/H. Thus, aH has finite order. Let m = |aH| (in the group G/H). Since the nth power of aH gives the
identity, it follows from Proposition 4.6.5 that m | n.

It is possible that |aH| is strictly smaller than |a|. In the above example of D4 , we have |r| = 4 but
|rH| = 2. Notice however that 2 | 4 as the previous proposition proves must be the case.
We now show how it is possible to prove theorems about groups using quotients and induction. The
general idea is as follows. Given a group G, proper subgroups of G and nontrivial quotients of G (that is by
normal subgroups other than {e} and G) are “smaller” than G. So the idea is to prove a result about finite
groups by using induction on the order of the group. By the inductive hypothesis, we know information
about the proper subgroups and nontrivial quotients, so the hope is to piece together that information to
prove the result about G. We give an example of this technique by proving the following theorem.

Theorem 6.2.13. Let p ∈ N+ be prime. If G is a finite abelian group with p | |G|, then G has an element
of order p.

Proof. The proof is by induction on |G|. If |G| = 1, then the result is trivial because p ∤ 1 (if you don’t
like this vacuous base case, simply note that if |G| = p, then every nonidentity element of G has order p by
Lagrange’s Theorem). Suppose then that G is a finite abelian group with p | |G|, and suppose that the result
is true for all abelian groups K satisfying p | |K| and |K| < |G|. Fix a ∈ G with a 6= e, and let H = hai. We
then have that
H is a normal subgroup of G because G is abelian. We now have two cases.

• Case 1: Suppose that p | |a|. By Proposition 4.6.10, G has an element of order p.

• Case 2: Suppose that p ∤ |a|. We have

|G/H| = [G : H] = |G|/|H|

so |G| = |H| · |G/H|. Since p | |G| and p ∤ |H| (because |H| = |a|), it follows that p | |G/H|. Now
|G/H| < |G| because |H| > 1, so by induction there exists an element bH ∈ G/H with |bH| = p. By
Proposition 6.2.12, we may conclude that p | |b|. Therefore, by Proposition 4.6.10, G has an element
of order p.

In either case, we have concluded that G has an element of order p. The result follows by induction.

Notice where we used the two fundamental assumptions that p is prime and that G is abelian. For the
prime assumption, we used the key fact that if p | ab, then either p | a or p | b. In fact, the result is not
true if you leave out the assumption that p is prime. The abelian group U (Z/8Z) has order 4 but it has no
element of order 4.
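The failing example is easy to verify directly; a quick sketch computing element orders in U(Z/8Z):

```python
from math import gcd

U8 = [a for a in range(1, 8) if gcd(a, 8) == 1]      # U(Z/8Z) = {1, 3, 5, 7}

def order(a, n):
    # multiplicative order of a modulo n
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

assert len(U8) == 4                                  # 4 divides |U(Z/8Z)|...
assert all(order(a, 8) in (1, 2) for a in U8)        # ...but no element has order 4
```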
We made use of the abelian assumption to get that H was a normal subgroup of G without any work.
In general, with a little massaging, we could slightly alter the above proof as long as we could assume that
every group has some normal subgroup other than the two trivial normal subgroups {e} and G (because this
would allow us to either use induction on H or on G/H). Unfortunately, it is not in general true that every
group always has such a normal subgroup. Those that do not are given a name.
Definition 6.2.14. A group G is simple if |G| > 1 and the only normal subgroups of G are {e} and G.
A simple group is a group that we are unable to “break up” into a smaller normal subgroup H and
corresponding smaller quotient G/H. They are the “atoms” of group theory and are analogous to the primes.
In fact, for every prime p, the abelian group Z/pZ is simple for the trivial reason that its only subgroups
are {0} and Z/pZ by Lagrange’s Theorem. It turns out that these are the only simple abelian
groups (see homework). Now if these were the only simple groups, there would not be much of a problem
because they are quite easy to get a handle on. However, there are infinitely many finite simple nonabelian
groups. For example, we will see later that An is a simple group for all n ≥ 5. The existence of these groups
is a serious obstruction to any inductive proof for all groups in the above style. Nonetheless, the above
theorem is true for all groups (including nonabelian ones), and the result is known as Cauchy’s Theorem.
We will prove it later using more advanced tools, but we will start that proof knowing that we have already
handled the abelian case.

6.3 Isomorphisms
Definitions and Examples
We have developed several ways to construct groups. We started with well known groups like Z, Q and
GLn (R). From there, we introduced the groups Z/nZ and U (Z/nZ). After that, we developed our first
family of nonabelian groups in the symmetric groups Sn . With those in hand, we obtained many other
groups as subgroups of these, such as SLn (R), An , and Dn . Finally, we built new groups from all of these
using direct products and quotients.
With such a rich supply of groups now, it is time to realize that some of these groups are essentially the
“same”. For example, let G = Z/2Z, let H = S2 , and let K be the group ({T, F }, ⊕) where ⊕ is “exclusive
or” discussed in the first section. Here are the Cayley tables of these groups.
+ 0 1
0 0 1
1 1 0

◦     id    (1 2)
id    id    (1 2)
(1 2) (1 2) id

⊕ F T
F F T
T T F
Now of course these groups are different because the sets are completely different. The elements of G are
equivalence classes and thus subsets of Z, the elements of H are permutations of the set {1, 2} (and hence
are really certain functions), and the elements of K are T and F . Furthermore the operations themselves
have little in common since in G we have addition of cosets via representatives, in H we have function
composition, and in K we have this funny logic operation. However, despite all these differences, a glance
at the above tables tells us that there is a deeper “sameness” to them. For G and H, if we pair off 0 with id
and pair off 1 with (1 2), then we have provided a kind of “Rosetta stone” for translating between the groups.
This is formalized with the following definition.
Definition 6.3.1. Let (G, ·) and (H, ?) be groups. An isomorphism from G to H is a function ϕ : G → H
such that

1. ϕ is a bijection.

2. ϕ(a · b) = ϕ(a) ? ϕ(b) for all a, b ∈ G. In shorthand, ϕ preserves the group operation.

Thus, an isomorphism ϕ : G → H is a pairing of elements of G with elements of H (that is the bijection
part) in such a way that we can either operate on the G side first and then walk over to H, or equivalently
can walk over to H with our elements first and then operate on that side. In our case of G = Z/2Z and
H = S2 , we define ϕ : G → H by letting ϕ(0) = id and ϕ(1) = (1 2). Clearly, ϕ is a bijection. Now we have
to check four pairs to verify the second property. We have

• ϕ(0 + 0) = ϕ(0) = id and ϕ(0) ◦ ϕ(0) = id ◦ id = id.

• ϕ(0 + 1) = ϕ(1) = (1 2) and ϕ(0) ◦ ϕ(1) = id ◦ (1 2) = (1 2).

• ϕ(1 + 0) = ϕ(1) = (1 2) and ϕ(1) ◦ ϕ(0) = (1 2) ◦ id = (1 2).

• ϕ(1 + 1) = ϕ(0) = id and ϕ(1) ◦ ϕ(1) = (1 2) ◦ (1 2) = id.

Therefore, ϕ(a + b) = ϕ(a) ◦ ϕ(b) for all a, b ∈ Z/2Z. Since we already noted that ϕ is a bijection, it follows
that ϕ is an isomorphism. All of these checks are really just implicit in the above tables. The bijection
ϕ : G → H we defined pairs off 0 with id and pairs off 1 with (1 2). Since this aligning of elements carries
the table of G to the table of H as seen in the above tables, ϕ is an isomorphism.
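These four checks can be automated. A sketch, with permutations of {1, 2} stored as tuples of images (an encoding of my own): the pairing 0 ↦ id, 1 ↦ (1 2) preserves the operation, while the reversed pairing does not.

```python
def compose(p, q):
    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

phi = {0: (1, 2), 1: (2, 1)}             # 0 -> id, 1 -> (1 2)
assert all(phi[(a + b) % 2] == compose(phi[a], phi[b])
           for a in (0, 1) for b in (0, 1))          # phi preserves the operation

psi = {0: (2, 1), 1: (1, 2)}             # the reversed pairing
assert psi[(0 + 0) % 2] != compose(psi[0], psi[0])   # psi does not
```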
Notice if instead we define ψ : G → H by letting ψ(0) = (1 2) and ψ(1) = id, then ψ is not an isomorphism.
To see this, we just need to find a counterexample to the second property. We have

ψ(0 + 0) = ψ(0) = (1 2)

and
ψ(0) ◦ ψ(0) = (1 2) ◦ (1 2) = id
so
ψ(0 + 0) 6= ψ(0) ◦ ψ(0).
Essentially we are writing the Cayley tables as:

+ 0 1
0 0 1
1 1 0

◦     (1 2) id
(1 2) id    (1 2)
id    (1 2) id

and noting that ψ does not carry the table of G to the table of H (as can be seen in the (1, 1) entry). Thus,
even if one bijection is an isomorphism, there may be other bijections which are not. We make the following
definition.

Definition 6.3.2. Let G and H be groups. We say that G and H are isomorphic, and write G ∼
= H, if there
exists an isomorphism ϕ : G → H.

In colloquial language, two groups G and H are isomorphic exactly when there is some way to pair off
elements of G with H which maps the Cayley table of G onto the Cayley table of H. For example, we have
U (Z/8Z) ∼= Z/2Z × Z/2Z via the bijection

ϕ(1) = (0, 0) ϕ(3) = (0, 1) ϕ(5) = (1, 0) ϕ(7) = (1, 1)

as shown by the following tables:



· 1 3 5 7
1 1 3 5 7
3 3 1 7 5
5 5 7 1 3
7 7 5 3 1

+      (0, 0) (0, 1) (1, 0) (1, 1)
(0, 0) (0, 0) (0, 1) (1, 0) (1, 1)
(0, 1) (0, 1) (0, 0) (1, 1) (1, 0)
(1, 0) (1, 0) (1, 1) (0, 0) (0, 1)
(1, 1) (1, 1) (1, 0) (0, 1) (0, 0)

Furthermore, looking back at the introduction we see that the crazy group G = {3, ℵ, @} is isomorphic to
Z/3Z as the following ordering of elements shows.

· 3 ℵ @
3 @ 3 ℵ
ℵ 3 ℵ @
@ ℵ @ 3

+ 1 0 2
1 2 1 0
0 1 0 2
2 0 2 1
Also, the 6 element group at the very end of Section 4.1 is isomorphic to S3 :
∗ 1 2 3 4 5 6
1 1 2 3 4 5 6
2 2 1 6 5 4 3
3 3 5 1 6 2 4
4 4 6 5 1 3 2
5 5 3 4 2 6 1
6 6 4 2 3 1 5

◦       id      (1 2)   (1 3)   (2 3)   (1 2 3) (1 3 2)
id      id      (1 2)   (1 3)   (2 3)   (1 2 3) (1 3 2)
(1 2)   (1 2)   id      (1 3 2) (1 2 3) (2 3)   (1 3)
(1 3)   (1 3)   (1 2 3) id      (1 3 2) (1 2)   (2 3)
(2 3)   (2 3)   (1 3 2) (1 2 3) id      (1 3)   (1 2)
(1 2 3) (1 2 3) (1 3)   (2 3)   (1 2)   (1 3 2) id
(1 3 2) (1 3 2) (2 3)   (1 2)   (1 3)   id      (1 2 3)
The property of being isomorphic has the following basic properties. Roughly, we are saying that
isomorphism is an equivalence relation on the set of all groups. Formally, there are technical problems talking
about “the set of all groups” (some collections are simply too big to be sets), but let’s not dwell on those
details here.
Proposition 6.3.3.

1. For any group G, the function idG : G → G is an isomorphism, so G ∼= G.

2. If ϕ : G → H is an isomorphism, then ϕ−1 : H → G is an isomorphism. In particular, if G ∼= H, then
H ∼= G.

3. If ϕ : G → H and ψ : H → K are isomorphisms, then ψ ◦ ϕ : G → K is an isomorphism. In particular,
if G ∼= H and H ∼= K, then G ∼= K.
Proof.
1. Let (G, ·) be a group. The function idG : G → G is a bijection, and for any a, b ∈ G we have

idG (a · b) = a · b = idG (a) · idG (b)

so idG : G → G is an isomorphism.
2. Let · be the group operation in G and let ? be the group operation in H. Since ϕ : G → H is a
bijection, we know that it has an inverse ϕ−1 : H → G which is also a bijection (because ϕ−1 has an
inverse, namely ϕ). We need only check the second property. Let c, d ∈ H. Since ϕ is a bijection, it is
a surjection, so we may fix a, b ∈ G with ϕ(a) = c and ϕ(b) = d. By definition of ϕ−1 , we then have
ϕ−1 (c) = a and ϕ−1 (d) = b. Now

ϕ(a · b) = ϕ(a) ? ϕ(b) = c ? d



so by definition of ϕ−1 we also have ϕ−1 (c ? d) = a · b. Therefore,

ϕ−1 (c ? d) = a · b

Putting this information together we see that

ϕ−1 (c ? d) = a · b = ϕ−1 (c) · ϕ−1 (d)

Since c, d ∈ H were arbitrary, the second property holds. Therefore, ϕ−1 : H → G is an isomorphism.
3. Let · be the group operation in G, let ? be the group operation in H, and let ∗ be the group operation
in K. Since the composition of bijections is a bijection, it follows that ψ ◦ ϕ : G → K is a bijection.
For any a, b ∈ G, we have

(ψ ◦ ϕ)(a · b) = ψ(ϕ(a · b))


= ψ(ϕ(a) ? ϕ(b)) (because ϕ is an isomorphism)
= ψ(ϕ(a)) ∗ ψ(ϕ(b)) (because ψ is an isomorphism)
= (ψ ◦ ϕ)(a) ∗ (ψ ◦ ϕ)(b)

Therefore, ψ ◦ ϕ : G → K is an isomorphism.

It gets tiresome consistently using different notation for the operation in G and the operation in H, so
we will stop doing it unless absolutely necessary. Thus, we will write

ϕ(a · b) = ϕ(a) · ϕ(b)

where you need to keep in mind that the · on the left is the group operation in G and the · on the right is
the group operation in H.
Proposition 6.3.4. Let ϕ : G → H be an isomorphism. We have the following.
1. ϕ(eG ) = eH .
2. ϕ(a−1 ) = ϕ(a)−1 for all a ∈ G.
3. ϕ(an ) = ϕ(a)n for all a ∈ G and all n ∈ Z.
Proof.
1. We have
ϕ(eG ) = ϕ(eG · eG ) = ϕ(eG ) · ϕ(eG )
hence
eH · ϕ(eG ) = ϕ(eG ) · ϕ(eG )
Using the cancellation law, it follows that ϕ(eG ) = eH .
2. Let a ∈ G. We have
ϕ(a) · ϕ(a−1 ) = ϕ(a · a−1 ) = ϕ(eG ) = eH
and
ϕ(a−1 ) · ϕ(a) = ϕ(a−1 · a) = ϕ(eG ) = eH
Therefore, ϕ(a−1 ) = ϕ(a)−1 .

3. For n = 0, this says that ϕ(eG ) = eH , which is true by part 1. The case n = 1 is trivial, and the case
n = −1 is part 2. We first prove the result for all n ∈ N+ by induction. We already noticed that n = 1
is trivial. Suppose that n ∈ N+ is such that ϕ(an ) = ϕ(a)n for all a ∈ G. For any a ∈ G we have
ϕ(an+1 ) = ϕ(an · a)
= ϕ(an ) · ϕ(a) (since ϕ is an isomorphism)
= ϕ(a)n · ϕ(a) (by induction)
= ϕ(a)n+1
Thus, the result holds for n + 1. Therefore, the result is true for all n ∈ N+ by induction. We finally
handle n ∈ Z with n < 0. For any a ∈ G we have
ϕ(an ) = ϕ((a−1 )−n )
= ϕ(a−1 )−n (since −n > 0)
= (ϕ(a)−1 )−n (by part 2)
= ϕ(a)n
Thus, the result is true for all n ∈ Z.

Theorem 6.3.5. Let G be a cyclic group.


1. If |G| = ∞, then G ∼= Z.

2. If |G| = n, then G ∼= Z/nZ.
Proof.
1. Suppose that |G| = ∞. Since G is cyclic, we may fix c ∈ G with G = hci. Since |G| = ∞, it follows
that |c| = ∞. Define ϕ : Z → G by letting ϕ(n) = cn . Notice that ϕ is surjective because G = hci.
Also, if ϕ(m) = ϕ(n), then cm = cn , so m = n by Proposition 5.2.3. Therefore, ϕ is injective. Putting
this together, we conclude that ϕ is a bijection. For any k, ` ∈ Z, we have
ϕ(k + `) = ck+` = ck · c` = ϕ(k) · ϕ(`)
Therefore, ϕ is an isomorphism. It follows that Z ∼= G and hence G ∼= Z.
2. Suppose that |G| = n. Since G is cyclic, we may fix c ∈ G with G = hci. Since |G| = n, it follows
from Proposition 5.2.3 that |c| = n. Define ϕ : Z/nZ → G by letting ϕ(k) = ck . Since we are defining
a function on equivalence classes, we first need to check that ϕ is well-defined.
Suppose that k, ` ∈ Z with k = `. We then have that k ≡ ` (mod n), so n | (k − `). Fix d ∈ Z with
nd = k − `. We then have k = ` + nd, hence
ϕ(k) = ck = c`+nd = c` (cn )d = c` ed = c` = ϕ(`)
Thus, ϕ is well-defined.
We next check that ϕ is a bijection. Suppose that k, ` ∈ Z with ϕ(k) = ϕ(`). We then have that
ck = c` , hence ck−` = e. Since |c| = n, we know from Proposition 4.6.5 that n | (k − `). Therefore
k ≡ ` (mod n), and hence k = `. It follows that ϕ is injective. To see that ϕ is surjective, if a ∈ G,
then a ∈ hci, so we may fix k ∈ Z with a = ck and note that ϕ(k) = ck = a. Hence, ϕ is a bijection.
Finally notice that for any k, ` ∈ Z, we have
ϕ(k + `) = ck+` = ck · c` = ϕ(k) · ϕ(`)
Therefore, ϕ is an isomorphism. It follows that Z/nZ ∼= G and hence G ∼= Z/nZ.

Now on Homework 4, we proved that |11| = 6 in U (Z/18Z). Since |U (Z/18Z)| = 6, it follows that
U (Z/18Z) is a cyclic group of order 6, and hence U (Z/18Z) ∼= Z/6Z.
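As a quick computational check (a sketch): the powers of 11 exhaust U(Z/18Z), so 11 is a generator.

```python
from math import gcd

U18 = {a for a in range(1, 18) if gcd(a, 18) == 1}

powers, x = set(), 1
for _ in range(len(U18)):
    x = (x * 11) % 18
    powers.add(x)

assert len(U18) == 6
assert powers == U18        # <11> = U(Z/18Z), so the group is cyclic of order 6
```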

Corollary 6.3.6. Let p ∈ N+ be prime. Any two groups of order p are isomorphic.

Proof. Suppose that G and H have order p. By the previous theorem, we have G ∼= Z/pZ and H ∼= Z/pZ.
Using symmetry and transitivity of ∼=, it follows that G ∼= H.

Properties Preserved by Isomorphisms


We have spent a lot of time establishing the existence of isomorphisms between various groups. How do
we show that two groups are not isomorphic? The first thing to note is that if G ∼= H, then we must have
|G| = |H| simply because if there is a bijection between two sets, then they have the same size. But what
can we do if |G| = |H|? Looking at all possible bijections and ruling them out is not particularly feasible.
The fundamental idea is that isomorphic groups are really just “renamings” of each other, so they should
have all of the same fundamental properties. The next proposition is an example of this idea.

Proposition 6.3.7. Suppose that G and H are groups with G ∼= H. If G is abelian, then H is abelian.

Proof. Since G ∼= H, we may fix an isomorphism ϕ : G → H. Let h1 , h2 ∈ H. Since ϕ is an isomorphism, it
is in particular surjective, so we may fix g1 , g2 ∈ G with ϕ(g1 ) = h1 and ϕ(g2 ) = h2 . We then have

h1 · h2 = ϕ(g1 ) · ϕ(g2 )
= ϕ(g1 · g2 ) (since ϕ is an isomorphism)
= ϕ(g2 · g1 ) (since G is abelian)
= ϕ(g2 ) · ϕ(g1 ) (since ϕ is an isomorphism)
= h2 · h1

Since h1 , h2 ∈ H were arbitrary, it follows that H is abelian.

As an example, even though |S3 | = 6 = |Z/6Z|, we have S3 6∼= Z/6Z because S3 is nonabelian while Z/6Z
is abelian.

Proposition 6.3.8. Suppose that G and H are groups with G ∼= H. If G is cyclic, then H is cyclic.

Proof. Since G ∼= H, we may fix an isomorphism ϕ : G → H. Since G is cyclic, we may fix c ∈ G with
G = hci. We claim that H = hϕ(c)i. Let h ∈ H. Since ϕ is in particular a surjection, we may fix g ∈ G with
ϕ(g) = h. Since g ∈ G = hci, we may fix n ∈ Z with g = cn . We then have

h = ϕ(g) = ϕ(cn ) = ϕ(c)n

so h ∈ hϕ(c)i. Since h ∈ H was arbitrary, it follows that H = hϕ(c)i and hence H is cyclic.

As an example, consider the groups Z/4Z and Z/2Z × Z/2Z. Each of these groups is abelian of order
4. However, Z/4Z is cyclic while Z/2Z × Z/2Z is not. It follows that Z/4Z 6∼= Z/2Z × Z/2Z.

Proposition 6.3.9. Suppose that ϕ : G → H is an isomorphism. For any a ∈ G, we have |a| = |ϕ(a)| (in
other words, the order of a in G equals the order of ϕ(a) in H).

Proof. Let a ∈ G. Suppose that n ∈ N+ and an = eG . Using the fact that ϕ(eG ) = eH we have

ϕ(a)n = ϕ(an ) = ϕ(eG ) = eH

so ϕ(a)n = eH . Conversely suppose that n ∈ N+ and ϕ(a)n = eH . We then have that

ϕ(an ) = ϕ(a)n = eH

so ϕ(an ) = eH = ϕ(eG ), and hence an = eG because ϕ is injective. Combining both of these, we have shown
that
{n ∈ N+ : an = eG } = {n ∈ N+ : ϕ(a)n = eH }
It follows that both of these sets are either empty (and so |a| = ∞ = |ϕ(a)|) or both have the same least
element (equal to the common order of |a| and |ϕ(a)|).
Thus, for example, we have Z/4Z × Z/4Z 6∼= Z/2Z × Z/8Z because the latter group has an element of
order 8 while a4 = e for all a ∈ Z/4Z × Z/4Z.
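The order comparison can be sketched computationally: the order of (a, b) in Z/m1Z × Z/m2Z is the least common multiple of the coordinate orders, and the maximum order distinguishes the two groups.

```python
from math import gcd

def element_orders(m1, m2):
    # order of a in Z/mZ is m / gcd(a, m); the order of a pair is the lcm
    def lcm(x, y):
        return x * y // gcd(x, y)
    return [lcm(m1 // gcd(a, m1), m2 // gcd(b, m2))
            for a in range(m1) for b in range(m2)]

assert max(element_orders(4, 4)) == 4    # a^4 = e for all a in Z/4Z x Z/4Z
assert max(element_orders(2, 8)) == 8    # but Z/2Z x Z/8Z has an element of order 8
```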

6.4 Internal Direct Products


We know how to take two groups and “put them together” by taking the direct product. We now want to
think about how to reverse this process. Given a group G which is not obviously a direct product, how can
we recognize it as being naturally isomorphic to the direct product of two of its subgroups? Before jumping
in, let’s look at the direct product again.
Suppose then that H and K are groups and consider their direct product H × K. Define

H 0 = {(h, eK ) : h ∈ H}

and
K 0 = {(eH , k) : k ∈ K}
We claim that H 0 is a normal subgroup of H ×K. To see that it is a subgroup, simply note that (eH , eK ) ∈ H 0 ,
that
(h1 , eK ) · (h2 , eK ) = (h1 h2 , eK eK ) = (h1 h2 , eK )
and that
(h, eK )−1 = (h−1 , eK −1 ) = (h−1 , eK )
To see that H 0 is a normal subgroup of H × K, notice that if (h, eK ) ∈ H 0 and (a, b) ∈ H × K, then

(a, b) · (h, eK ) · (a, b)−1 = (a, b) · (h, eK ) · (a−1 , b−1 )


= (ah, beK ) · (a−1 , b−1 )
= (aha−1 , beK b−1 )
= (aha−1 , eK )

which is an element of H 0 . Similarly, K 0 is a normal subgroup of H × K. Notice further that H ∼


= H 0 via
the function ϕ(h) = (h, eK ) and similarly K ∼ = K 0
via the function ψ(k) = (e H , k).
We can say some more about the relationship between the subgroups H 0 and K 0 of H × K. Notice that

H 0 ∩ K 0 = {(eH , eK )} = {eH×K }

In other words, the only thing that H 0 and K 0 have in common is the identity element of H × K. Finally,
note that any element of H × K can be written as a product of an element of H 0 and an element of K 0 : If
(h, k) ∈ H × K, then (h, k) = (h, eK ) · (eH , k).

It turns out that these few facts characterize when a given group G is naturally isomorphic to the direct
product of two of its subgroups, as we will see in Theorem 6.4.4. Before proving this result, we first introduce
a definition and a few lemmas.
Definition 6.4.1. Let G be a group, and let H and K be subgroups of G. We define

HK = {hk : h ∈ H and k ∈ K}.

Notice that HK will be a subset of G, but in general it is not a subgroup of G. For an explicit example,
consider G = S3 , H = h(1 2)i = {id, (1 2)} and K = h(1 3)i = {id, (1 3)}. We have

HK = {id ◦ id, id ◦ (1 3), (1 2) ◦ id, (1 2)(1 3)}
= {id, (1 3), (1 2), (1 3 2)}

Thus (1 3) ∈ HK and (1 2) ∈ HK, but (1 3)(1 2) = (1 2 3) ∈/ HK. Therefore, HK is not a subgroup of G.
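This failure of closure can be confirmed mechanically; a sketch with permutations stored as tuples of images (my encoding, with composition applying the right factor first):

```python
def compose(p, q):
    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

ID, T12, T13 = (1, 2, 3), (2, 1, 3), (3, 2, 1)   # id, (1 2), (1 3)
H, K = {ID, T12}, {ID, T13}

HK = {compose(h, k) for h in H for k in K}
assert len(HK) == 4                       # {id, (1 3), (1 2), (1 3 2)}
assert compose(T13, T12) not in HK        # (1 3)(1 2) = (1 2 3) escapes HK
```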
Lemma 6.4.2. Suppose that H and K are both subgroups of a group G and H ∩ K = {e}. Suppose that
h1 , h2 ∈ H and k1 , k2 ∈ K are such that h1 k1 = h2 k2 . We then have that h1 = h2 and k1 = k2 .
Proof. We have h1 k1 = h2 k2 , so multiplying on the left by h2 −1 and on the right by k1 −1 , we see that
h2 −1 h1 = k2 k1 −1 . Now h2 −1 h1 ∈ H because H is a subgroup of G, and k2 k1 −1 ∈ K because K is a subgroup
of G. Therefore, this common value is an element of H ∩ K = {e}. Thus, we have h2 −1 h1 = e and hence
h1 = h2 , and similarly we have k2 k1 −1 = e and hence k1 = k2 .
Lemma 6.4.3. Suppose that H and K are both normal subgroups of a group G and that H ∩ K = {e}. We
then have that hk = kh for all h ∈ H and k ∈ K.
Proof. Let h ∈ H and k ∈ K. Consider the element hkh−1 k −1 . Since K is normal in G, we have hkh−1 ∈ K
and since k −1 ∈ K it follows that hkh−1 k −1 ∈ K. Similarly, since H is normal in G, we have kh−1 k −1 ∈ H
and since h ∈ H it follows that hkh−1 k −1 ∈ H. Therefore, hkh−1 k −1 ∈ H ∩K and hence hkh−1 k −1 = e.
Multiplying on the right by kh, we conclude that hk = kh.
Theorem 6.4.4. Suppose that H and K are subgroups of G with the following properties:
1. H and K are both normal subgroups of G.
2. H ∩ K = {e}.
3. HK = G, i.e. for all g ∈ G, there exists h ∈ H and k ∈ K with g = hk.
In this case, the function ϕ : H × K → G defined by ϕ((h, k)) = h · k is an isomorphism. Thus, we have
G∼= H × K.
Proof. Define ϕ : H × K → G by letting ϕ((h, k)) = h · k. We check the following:
• ϕ is injective: Suppose that ϕ((h1 , k1 )) = ϕ((h2 , k2 )). We then have h1 k1 = h2 k2 , so since H ∩K = {e},
we can conclude that h1 = h2 and k1 = k2 using Lemma 6.4.2. Therefore, (h1 , k1 ) = (h2 , k2 ).
• ϕ is surjective: This is immediate from the assumption that G = HK.
• ϕ preserves the group operation: Given (h1 , k1 ), (h2 , k2 ) ∈ H × K, we have

ϕ((h1 , k1 ) · (h2 , k2 )) = ϕ((h1 h2 , k1 k2 ))
= h1 h2 k1 k2
= h1 k1 h2 k2 (by Lemma 6.4.3)
= ϕ((h1 , k1 )) · ϕ((h2 , k2 )).

Therefore, ϕ is an isomorphism.

Definition 6.4.5. Let G be a group. Let H and K be subgroups of G such that:

1. H and K are both normal subgroups of G.

2. H ∩ K = {e}.

3. HK = G.

We then say that G is the internal direct product of H and K. In this case, we have that H × K ∼= G via
the function ϕ((h, k)) = hk, as shown in Theorem 6.4.4.

As an example, consider the group G = U (Z/8Z). Let H = ⟨3⟩ = {1, 3} and let K = ⟨5⟩ = {1, 5}. We
then have that H and K are both normal subgroups of G (because G is abelian), that H ∩ K = {1}, and
that
HK = {1 · 1, 3 · 1, 1 · 5, 3 · 5} = {1, 3, 5, 7} = G
Therefore, U (Z/8Z) is the internal direct product of H and K. I want to emphasize that U (Z/8Z) does not
equal H × K. After all, as sets we have

U (Z/8Z) = {1, 3, 5, 7}

while
H × K = {(1, 1), (3, 1), (1, 5), (3, 5)}
However, the above result shows that these two groups are isomorphic via the function ϕ : H ×K → U (Z/8Z)
given by ϕ((h, k)) = h · k. Since H and K are each cyclic of order 2, it follows that each of H and K are
isomorphic to Z/2Z. Using the following result, we conclude that U (Z/8Z) ∼ = Z/2Z × Z/2Z.
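All of the claims in this example are finite computations, so they can be checked directly. The sketch below (my own, not part of the text) verifies them in Python; since U(Z/8Z) is abelian, the normality conditions are automatic.

```python
from math import gcd

# U(Z/8Z): the units modulo 8 under multiplication.
G = [a for a in range(1, 8) if gcd(a, 8) == 1]
H = [1, 3]   # the subgroup generated by 3
K = [1, 5]   # the subgroup generated by 5

print(G)                                          # [1, 3, 5, 7]
print(set(H) & set(K))                            # {1}: trivial intersection
print(sorted({h * k % 8 for h in H for k in K}))  # [1, 3, 5, 7]: HK = G

# Every nonidentity element squares to 1, so the group is not cyclic and is
# isomorphic to Z/2Z x Z/2Z rather than Z/4Z.
print(all(a * a % 8 == 1 for a in G))             # True
```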
Proposition 6.4.6. Suppose that G1 ∼= H1 and G2 ∼= H2 . We then have that G1 × G2 ∼= H1 × H2 .
Proof. Since G1 ∼= H1 , we may fix an isomorphism ϕ1 : G1 → H1 . Since G2 ∼= H2 , we may fix an isomorphism
ϕ2 : G2 → H2 . Define ψ : G1 × G2 → H1 × H2 by letting

ψ((a1 , a2 )) = (ϕ1 (a1 ), ϕ2 (a2 ))

We check the following:

• ψ is injective: Suppose that a1 , b1 ∈ G1 and a2 , b2 ∈ G2 satisfy ψ((a1 , a2 )) = ψ((b1 , b2 )). We then have
(ϕ1 (a1 ), ϕ2 (a2 )) = (ϕ1 (b1 ), ϕ2 (b2 )), so

ϕ1 (a1 ) = ϕ1 (b1 ) and ϕ2 (a2 ) = ϕ2 (b2 )

Now ϕ1 is an isomorphism, so ϕ1 is injective, and hence the first equality implies that a1 = b1 . Similarly,
ϕ2 is an isomorphism, so ϕ2 is injective, and hence the second equality implies that a2 = b2 . Therefore,
(a1 , a2 ) = (b1 , b2 ).

• ψ is surjective: Let (c1 , c2 ) ∈ H1 × H2 . We then have that c1 ∈ H1 and c2 ∈ H2 . Now ϕ1 is
an isomorphism, so ϕ1 is surjective, so we may fix a1 ∈ G1 with ϕ1 (a1 ) = c1 . Similarly, ϕ2 is an
isomorphism, so ϕ2 is surjective, so we may fix a2 ∈ G2 with ϕ2 (a2 ) = c2 . We then have

ψ((a1 , a2 )) = (ϕ1 (a1 ), ϕ2 (a2 )) = (c1 , c2 )

so (c1 , c2 ) ∈ range(ψ).

• ψ preserves the group operation: Let a1 , b1 ∈ G1 and a2 , b2 ∈ G2 . We have

ψ((a1 , a2 ) · (b1 , b2 )) = ψ((a1 b1 , a2 b2 ))


= (ϕ1 (a1 b1 ), ϕ2 (a2 b2 ))
= (ϕ1 (a1 ) · ϕ1 (b1 ), ϕ2 (a2 ) · ϕ2 (b2 )) (since ϕ1 and ϕ2 preserve the operations)
= (ϕ1 (a1 ), ϕ2 (a2 )) · (ϕ1 (b1 ), ϕ2 (b2 ))
= ψ((a1 , a2 )) · ψ((b1 , b2 ))

Putting it all together, we conclude that ψ : G1 × G2 → H1 × H2 is an isomorphism, so G1 × G2 ∼= H1 × H2 .

Corollary 6.4.7. Let G be a finite group. Let H and K be subgroups of G such that:

1. H and K are both normal subgroups of G.

2. H ∩ K = {e}.

3. |H| · |K| = |G|.

We then have that G is the internal direct product of H and K.

Proof. We need only prove that HK = G. By Lemma 6.4.2, since H ∩ K = {e}, distinct pairs (h, k) with
h ∈ H and k ∈ K yield distinct products hk, so |HK| = |H| · |K|. Since we are assuming that |H| · |K| = |G|,
it follows that |HK| = |G|, and since G is finite we conclude that HK = G.

Suppose that G = D6 . Let


H = {id, r2 , r4 , s, r2 s, r4 s}

and let
K = {id, r3 }

We have that K = Z(G) from the homework, so K is a normal subgroup of G. It is straightforward to check
that H is a subgroup of G and
[G : H] = |G|/|H| = 12/6 = 2
Using Proposition 6.2.7, we conclude that H is a normal subgroup of G. Now H ∩ K = {id} and |H| · |K| =
6 · 2 = 12 = |G| (you can also check directly that HK = G). It follows that G is the internal direct product
of H and K, so in particular we have G ∼ = H × K. Notice that K is cyclic of order 2, so K ∼ = Z/2Z.
Furthermore, H is a group of order 6, and it is not hard to convince yourself that H ∼ = D3 : Roughly, you can
map r2 (where r is rotation in D6 ) to r (where r is rotation in D3 ). Geometrically, when working with H,
you are essentially looking at the regular hexagon and focusing only on the rotations by 120° (corresponding
to r2 ), 240° (corresponding to r4 ) and the identity, along with the standard flip. This really just corresponds
exactly to the rigid motions of the triangle. I hope that convinces you that H ∼ = D3 , but you can check
formally by looking at the corresponding Cayley tables. Now D3 = S3 , so it follows that H ∼= S3 and hence

D6 ∼= H × K ∼= S3 × Z/2Z

where the last isomorphism uses Proposition 6.4.6.
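The claims about D6 in this example can also be verified by a direct computation. In the sketch below (my own model, not the text's), D6 acts as permutations of the hexagon's vertices 0, ..., 5, with r the rotation i ↦ i + 1 (mod 6) and s the flip i ↦ −i (mod 6).

```python
def compose(p, q):                     # apply q first, then p
    return tuple(p[q[i]] for i in range(6))

def power(p, n):
    out = tuple(range(6))
    for _ in range(n):
        out = compose(p, out)
    return out

e = tuple(range(6))
r = tuple((i + 1) % 6 for i in range(6))   # rotation by 60 degrees
s = tuple((-i) % 6 for i in range(6))      # a flip

D6 = [compose(power(r, i), power(s, j)) for j in range(2) for i in range(6)]

K = {e, power(r, 3)}
H = {e, power(r, 2), power(r, 4),
     s, compose(power(r, 2), s), compose(power(r, 4), s)}

center = {g for g in D6 if all(compose(g, x) == compose(x, g) for x in D6)}
print(center == K)                                        # True: K = Z(D6)
print(H & K == {e})                                       # True
print({compose(h, k) for h in H for k in K} == set(D6))   # True: HK = G
```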



6.5 Classifying Groups up to Isomorphism


A natural and interesting question is to count how many groups there are of various orders up to isomorphism.
It’s easy to see that any two groups of order 1 are isomorphic, so there is only one group of order 1 up to
isomorphism. The next case that we can quickly dispose of is that of groups of prime order.
Proposition 6.5.1. For each prime p ∈ Z, every group G of order p is isomorphic to Z/pZ. Thus, there is
exactly one group of order p up to isomorphism.
Proof. Let p ∈ Z be prime. If G is a group of order p, then G is cyclic by Theorem 5.8.6, and hence
isomorphic to Z/pZ by Theorem 6.3.5.
The first nonprime case is 4. Before jumping in to that, we prove a useful little proposition.
Proposition 6.5.2. If G is a group with the property that a2 = e for all a ∈ G, then G is abelian.
Proof. Let a, b ∈ G. Since ba ∈ G, we have (ba)2 = e, so baba = e. Multiplying on the left by b and using
the fact that b2 = e, we conclude that aba = b. Multiplying this on the right by a and using the fact that
a2 = e, we deduce that ab = ba. Since a, b ∈ G were arbitrary, it follows that G is abelian.
Proposition 6.5.3. Every group of order 4 is isomorphic to exactly one of Z/4Z or Z/2Z × Z/2Z. Thus,
there are exactly two groups of order 4 up to isomorphism.
Proof. First, notice that Z/4Z 6∼= Z/2Z × Z/2Z by Proposition 6.3.8 because the former is cyclic but the
latter is not. Thus, there are at least two groups of order 4 up to isomorphism.
Let G be a group of order 4. By Lagrange’s Theorem, we know that every nonidentity element of G has
order either 2 or 4. We have two cases.
• Case 1: Suppose that G has an element of order 4. By Proposition 5.2.6, we then have that G is cyclic,
so G ∼= Z/4Z by Theorem 6.3.5.
• Case 2: Suppose that G has no element of order 4. Thus, every nonidentity element of G has order 2.
We then have that a2 = e for all a ∈ G, so G is abelian by Proposition 6.5.2. Fix distinct nonidentity
elements a, b ∈ G. We then have that |a| = 2 = |b|. Let H = ⟨a⟩ = {e, a} and K = ⟨b⟩ = {e, b}. Since
G is abelian, we know that H and K are normal subgroups of G. We also have H ∩ K = {e} and
|H| · |K| = 2 · 2 = 4 = |G|. Using Corollary 6.4.7, we conclude that G is the internal direct product of
H and K. Now H and K are both cyclic of order 2, so they are both isomorphic to Z/2Z by Theorem
6.3.5. It follows that
G∼ =H ×K ∼ = Z/2Z × Z/2Z
where the latter isomorphism follows from Proposition 6.4.6.
Thus, every group of order 4 is isomorphic to one of the groups Z/4Z or Z/2Z × Z/2Z. Furthermore, no
group can be isomorphic to both because Z/4Z 6∼= Z/2Z × Z/2Z as mentioned above.
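The order 4 classification is small enough to confirm by exhaustive search. The sketch below (an illustration, not part of the text) enumerates all Cayley tables on {0, 1, 2, 3} with identity 0, keeps those satisfying the group axioms, and counts isomorphism classes; it finds exactly two, as the proposition asserts.

```python
from itertools import permutations

E = range(4)

def is_group(t):
    # t is a 4x4 Cayley table with identity 0; check that every row and
    # column is a permutation (Latin square) and that t is associative.
    for i in E:
        if len({t[i][j] for j in E}) != 4 or len({t[j][i] for j in E}) != 4:
            return False
    return all(t[t[a][b]][c] == t[a][t[b][c]] for a in E for b in E for c in E)

# Row 0 and column 0 are forced by the identity; enumerate the rest.
tables = []
for r1 in permutations(E):
    if r1[0] != 1:
        continue
    for r2 in permutations(E):
        if r2[0] != 2:
            continue
        for r3 in permutations(E):
            if r3[0] != 3:
                continue
            t = (tuple(E), r1, r2, r3)
            if is_group(t):
                tables.append(t)

def isomorphic(t, u):
    # try every relabeling of {1, 2, 3} (the identity 0 must map to itself)
    for p in permutations(E):
        if p[0] == 0 and all(p[t[a][b]] == u[p[a]][p[b]] for a in E for b in E):
            return True
    return False

classes = []
for t in tables:
    if not any(isomorphic(t, u) for u in classes):
        classes.append(t)

print(len(tables))    # 4 labeled group tables
print(len(classes))   # 2 isomorphism classes: Z/4Z and Z/2Z x Z/2Z
```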
We’ve now shown that every group of order strictly less than 6 is abelian. We know that S3 = D3 is
a nonabelian group of order 6, so this is the smallest possible size of a nonabelian group. In fact, up to
isomorphism, it is the only nonabelian group of order 6.
Proposition 6.5.4. Every group of order 6 is isomorphic to exactly one of Z/6Z or S3 . Thus, there are
exactly two groups of order 6 up to isomorphism.
Proof. First, notice that Z/6Z 6∼= S3 by Proposition 6.3.7 because the former is abelian but the latter is not.
Thus, there are at least two groups of order 6 up to isomorphism.
Let G be a group of order 6. By Lagrange’s Theorem, we know that every nonidentity element of G has
order either 2, 3 or 6. We have two cases.

• Case 1: Suppose that G is abelian. By Theorem 6.2.13, we know that G has an element a of order
3 and an element b of order 2. Let H = ⟨a⟩ = {e, a, a2 } and K = ⟨b⟩ = {e, b}. Since G is abelian,
we know that H and K are normal subgroups of G. We also have H ∩ K = {e} (notice that b 6= a2
because |a2 | = 3 also) and |H| · |K| = 3 · 2 = 6 = |G|. Using Corollary 6.4.7, we conclude that G is the
internal direct product of H and K. Now H and K are both cyclic, so H ∼ = Z/3Z and K ∼ = Z/2Z by
Theorem 6.3.5. It follows that
G∼=H ×K ∼ = Z/3Z × Z/2Z
where the latter isomorphism follows from Proposition 6.4.6. Finally, notice that Z/3Z × Z/2Z is cyclic
by Problem 3b on Homework 5 (or by directly checking that |(1, 1)| = 6), so Z/3Z × Z/2Z ∼ = Z/6Z by
Theorem 6.3.5. Thus, G ∼= Z/6Z.
• Case 2: Suppose that G is not abelian. We then have that G is not cyclic by Proposition 5.2.7, so G
has no element of order 6 by Proposition 5.2.6. Also, it is not possible that every element of G has
order 2, since then G would be abelian by Proposition 6.5.2. Thus, G must have an element of order 3.
Fix a ∈ G with |a| = 3. Notice that a2 ∈/ {e, a}, that a2 = a−1 , and that |a2 | = 3 (because
(a2 )2 = a4 = a, and (a2 )3 = a6 = e).
Now take an arbitrary b ∈ G\{e, a, a2 }. We then have ab ∈/ {e, a, a2 , b} (for example, if ab = e, then
b = a−1 = a2 , while if ab = a2 , then b = a by cancellation) and also a2 b ∈/ {e, a, a2 , b, ab}. Therefore,
the 6 elements of G are e, a, a2 , b, ab, and a2 b, i.e. G = {e, a, a2 , b, ab, a2 b}.
We next determine |b|. Now we know that b 6= e, so since G is not cyclic we know that either |b| = 2
or |b| = 3. Suppose then that |b| = 3. We know that b2 must be one of the six elements of G. We
do not have b2 = e because |b| = 3, and arguments similar to those above show that b2 ∈ / {b, ab, a2 b}
(for example, if b2 = ab, then a = b by cancellation). Now if b2 = a, then multiplying both sides by
b on the right, we could conclude that b3 = ab, so e = ab, and hence b = a−1 = a2 , a contradiction.
Similarly, if b2 = a2 , then multiplying by b on the right gives e = a2 b, so multiplying on the left by a
allows us to conclude that a = b, a contradiction. Since all cases end in a contradiction, we conclude
that |b| 6= 3, and hence we must have |b| = 2.
Since ba ∈ G and G = {e, a, a2 , b, ab, a2 b}, we now determine which of the six elements equals ba.
Notice that ba ∈/ {e, a, a2 , b} for similar reasons as above. If ba = ab, then a and b commute, and from
here it is straightforward to check that all elements of G commute, contradicting the fact that G is
nonabelian. It follows that we have ba = a2 b = a−1 b.
To recap, we have |a| = 3, |b| = 2, and ba = a−1 b. Now in D3 we also have |r| = 3, |s| = 2, and
sr = r−1 s. Define a bijection ϕ : G → D3 as follows:
– ϕ(e) = id.
– ϕ(a) = r.
– ϕ(a2 ) = r2 .
– ϕ(b) = s.
– ϕ(ab) = rs.
– ϕ(a2 b) = r2 s.
Using the properties of a, b ∈ G along with those for r, s ∈ D3 , it is straightforward to check that ϕ
preserves the group operation. Therefore, G ∼
= D3 , and since D3 = S3 , we conclude that G ∼ = S3 .
Thus, every group of order 6 is isomorphic to one of the groups Z/6Z or S3 . Furthermore, no group can be
isomorphic to both because Z/6Z 6∼= S3 as mentioned above.
We know that there are at least 5 groups of order 8 up to isomorphism:

Z/8Z Z/4Z × Z/2Z Z/2Z × Z/2Z × Z/2Z D4 Q8



The first three of these are abelian, while the latter two are not. Notice that Z/8Z is not isomorphic to the other
two abelian groups because they are not cyclic. Also, Z/4Z × Z/2Z 6∼ = Z/2Z × Z/2Z × Z/2Z because the
former has an element of order 4 while the latter does not. Finally, D4 6∼ = Q8 because D4 has five elements
of order 2 and two of order 4, while Q8 has one element of order 2 and six of order 4. Now it is possible
to show that every group of order 8 is isomorphic to one of these five, but this is difficult to do with our
elementary methods. We need to build more tools, not only to tackle this problem, but also to think about
groups of larger order. Although many of these methods will arise in later chapters, the following result is
within our grasp now.
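Before moving on, the element-order counts used above to distinguish D4 from Q8 can be checked directly. In this sketch (my own models, not the text's), D4 acts on the square's vertices and Q8 is realized as the unit quaternions ±1, ±i, ±j, ±k under the Hamilton product.

```python
from collections import Counter

def compose(p, q):                     # D4 as vertex permutations: q first, then p
    return tuple(p[q[i]] for i in range(4))

e = (0, 1, 2, 3)
r = (1, 2, 3, 0)                       # rotation by 90 degrees
s = (0, 3, 2, 1)                       # a flip

powers = [e]
for _ in range(3):
    powers.append(compose(r, powers[-1]))
D4 = powers + [compose(p, s) for p in powers]

def qmul(x, y):                        # Hamilton product on a + bi + cj + dk
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

Q8 = [(sg*a, sg*b, sg*c, sg*d) for sg in (1, -1)
      for (a, b, c, d) in [(1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)]]

def order(x, op, ident):
    y, n = x, 1
    while y != ident:
        y, n = op(x, y), n + 1
    return n

print(Counter(order(x, compose, e) for x in D4))          # five of order 2, two of order 4
print(Counter(order(x, qmul, (1, 0, 0, 0)) for x in Q8))  # one of order 2, six of order 4
```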

Theorem 6.5.5. Let G be a group. If G/Z(G) is cyclic, then G is abelian.

Proof. Recall that Z(G) is a normal subgroup of G by Proposition 5.6.2 and Proposition 6.2.5. Suppose that
G/Z(G) is cyclic. We may then fix c ∈ G such that G/Z(G) = ⟨cZ(G)⟩. Now we have (cZ(G))n = cn Z(G)
for all n ∈ Z, so
G/Z(G) = {cn Z(G) : n ∈ Z}

We first claim that every a ∈ G can be written in the form a = cn z for some n ∈ Z and z ∈ Z(G). To see
this, let a ∈ G. Now aZ(G) ∈ G/Z(G), so we may fix n ∈ Z with aZ(G) = cn Z(G). Since a ∈ aZ(G), we
have a ∈ cn Z(G), so there exists z ∈ Z(G) with a = cn z.
Suppose now that a, b ∈ G. From above, we may write a = cn z and b = cm w where n, m ∈ Z and
z, w ∈ Z(G). We have

ab = cn zcm w
= cn cm zw (since z ∈ Z(G))
= cn+m zw
= cm+n wz (since z ∈ Z(G))
= cm cn wz
= cm wcn z (since w ∈ Z(G))
= ba

Since a, b ∈ G were arbitrary, it follows that G is abelian.

Corollary 6.5.6. Let G be a group with |G| = pq for (not necessarily distinct) primes p, q ∈ Z. We then
have that either Z(G) = G or Z(G) = {e}.

Proof. We know that Z(G) is a normal subgroup of G. Thus, |Z(G)| divides |G| by Lagrange’s Theorem, so
|Z(G)| must be one of 1, p, q, or pq. Suppose that |Z(G)| = p. We then have that

|G/Z(G)| = |G|/|Z(G)| = pq/p = q,

so G/Z(G) is cyclic by Theorem 5.8.6. Using Theorem 6.5.5, it follows that G is abelian, and hence Z(G) = G,
contradicting the fact that |Z(G)| = p. Similarly, we can not have |Z(G)| = q. Therefore, either |Z(G)| = 1,
in which case Z(G) = {e}, or |Z(G)| = pq, in which case Z(G) = G.

For example, since S3 is nonabelian and |S3 | = 6 = 2·3, this result immediately implies that Z(S3 ) = {id}.
Similarly, since D5 is nonabelian and |D5 | = 10 = 2 · 5, we conclude that Z(D5 ) = {id}. We will come back
and use this result in more interesting contexts later.
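For a group as small as S3 , the center can also be found by brute force. A quick sketch (my own, with 0-indexed permutation tuples):

```python
from itertools import permutations

def compose(p, q):                     # apply q first, then p
    return tuple(p[q[i]] for i in range(3))

S3 = list(permutations(range(3)))
center = [g for g in S3 if all(compose(g, x) == compose(x, g) for x in S3)]
print(center)   # [(0, 1, 2)]: only the identity, so Z(S3) = {id}
```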

6.6 Homomorphisms
Our definition of isomorphism had two requirements: that the function was a bijection and that it preserves
the operation. We next investigate what happens if we drop the former and just require that the function
preserves the operation.
Definition 6.6.1. Let G and H be groups. A homomorphism from G to H is a function ϕ : G → H such
that ϕ(a · b) = ϕ(a) · ϕ(b) for all a, b ∈ G.
Consider the following examples:
• The determinant function det : GLn (R) → R\{0} is a homomorphism where the operation on R\{0}
is multiplication. This is because det(AB) = det(A) · det(B) for any matrices A and B. Notice that
det is certainly not an injective function.
• Consider the sign function ε : Sn → {1, −1} where the operation on {±1} is multiplication. Notice
that ε is a homomorphism by Proposition 5.3.8. Again, for n ≥ 3, the function ε is very far from being
injective.
• For any groups G and H, the function π1 : G × H → G given by π1 ((g, h)) = g and the function
π2 : G × H → H given by π2 ((g, h)) = h are both homomorphisms. For example, given any g1 , g2 ∈ G
and h1 , h2 ∈ H we have
π1 ((g1 , h1 ) · (g2 , h2 )) = π1 ((g1 g2 , h1 h2 ))
= g1 g2
= π1 ((g1 , h1 )) · π1 ((g2 , h2 ))
and similarly for π2 .
• For a given group G with normal subgroup N , the function π : G → G/N defined by π(a) = aN for all
a ∈ G is a homomorphism because for any a, b ∈ G, we have
π(ab) = abN
= aN · bN
= π(a) · π(b)

Proposition 6.6.2. Let ϕ : G → H be a homomorphism. We have the following.


1. ϕ(eG ) = eH .
2. ϕ(a−1 ) = ϕ(a)−1 for all a ∈ G.
3. ϕ(an ) = ϕ(a)n for all a ∈ G and all n ∈ Z.
Proof. In Proposition 6.3.4, the corresponding proof for isomorphisms, we only used the fact that ϕ preserves
the group operation, not that it is a bijection.
Definition 6.6.3. Let ϕ : G → H be a homomorphism. We define ker(ϕ) = {g ∈ G : ϕ(g) = eH }. The set
ker(ϕ) is called the kernel of ϕ.
For example, for det : GLn (R) → R\{0}, we have
ker(det) = {A ∈ GLn (R) : det(A) = 1} = SLn (R)
and for ε : Sn → {±1}, we have
ker(ε) = {σ ∈ Sn : ε(σ) = 1} = An
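The second example can be made concrete for n = 3: computing the sign by counting inversions, the kernel of ε consists exactly of the even permutations. A sketch (my own, not from the text):

```python
from itertools import permutations

def sign(p):
    # (-1) raised to the number of inversions of p
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

S3 = list(permutations(range(3)))
kernel = [p for p in S3 if sign(p) == 1]
print(kernel)        # the three even permutations, i.e. A3
print(len(kernel))   # 3 = |S3| / 2
```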

Proposition 6.6.4. If ϕ : G → H is a homomorphism, then ker(ϕ) is a normal subgroup of G.


Proof. Let K = ker(ϕ). We first check that K is indeed a subgroup of G. Since ϕ(eG ) = eH by Proposition
6.6.2, we see that eG ∈ K. Given a, b ∈ K, we have ϕ(a) = eH = ϕ(b), so
ϕ(ab) = ϕ(a) · ϕ(b) = eH · eH = eH
and hence ab ∈ K. Given a ∈ K, we have ϕ(a) = eH , so
ϕ(a−1 ) = ϕ(a)−1 = eH−1 = eH

so a−1 ∈ K. Putting this all together, we see that K is a subgroup of G. Suppose now that a ∈ K and
g ∈ G. Since a ∈ K we have ϕ(a) = eH , so
ϕ(gag −1 ) = ϕ(g) · ϕ(a) · ϕ(g −1 )
= ϕ(g) · eH · ϕ(g)−1
= ϕ(g) · ϕ(g)−1
= eH
and hence gag −1 ∈ K. Therefore, K is a normal subgroup of G.
We’ve just seen that the kernel of a homomorphism is always a normal subgroup of G. It’s a nice fact
that every normal subgroup of a group arises in this way, so we get another equivalent characterization of a
normal subgroup.
Theorem 6.6.5. Let G be a group and let K be a normal subgroup of G. There exists a group H and a
homomorphism ϕ : G → H with K = ker(ϕ).
Proof. Suppose that K is a normal subgroup of G. We can then form the quotient group H = G/K. Define
ϕ : G → H by letting ϕ(a) = aK for all a ∈ G. As discussed above, ϕ is a homomorphism. Notice that for
any a ∈ G, we have
a ∈ ker(ϕ) ⇐⇒ ϕ(a) = eK
⇐⇒ aK = eK
⇐⇒ eK = aK
⇐⇒ e−1 a ∈ K
⇐⇒ a ∈ K
Therefore ϕ is a homomorphism with ker(ϕ) = K.
Proposition 6.6.6. Let ϕ : G → H be a homomorphism. ϕ is injective if and only if ker(ϕ) = {eG }.
Proof. Suppose first that ϕ is injective. We know that ϕ(eG ) = eH , so eG ∈ ker(ϕ). Let a ∈ ker(ϕ) be
arbitrary. We then have ϕ(a) = eH = ϕ(eG ), so since ϕ is injective we conclude that a = eG . Therefore,
ker(ϕ) = {eG }.
Suppose conversely that ker(ϕ) = {eG }. Let a, b ∈ G be arbitrary with ϕ(a) = ϕ(b). We then have
ϕ(a−1 b) = ϕ(a−1 ) · ϕ(b)
= ϕ(a)−1 · ϕ(b)
= ϕ(a)−1 · ϕ(a)
= eH
so a−1 b ∈ ker(ϕ). Now we are assuming that ker(ϕ) = {eG }, so we conclude that a−1 b = eG . Multiplying
on the left by a, we see that a = b. Therefore, ϕ is injective.
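As a concrete illustration of this criterion (my own example, not the text's), reduction mod 3 is a homomorphism from Z/6Z to Z/3Z with nontrivial kernel, and it is indeed not injective:

```python
phi = lambda a: a % 3    # reduction mod 3: Z/6Z -> Z/3Z
G = range(6)

# phi preserves the operation: phi(a + b) = phi(a) + phi(b), computed mod 3
print(all(phi((a + b) % 6) == (phi(a) + phi(b)) % 3 for a in G for b in G))  # True

kernel = [a for a in G if phi(a) == 0]
print(kernel)             # [0, 3]: the kernel is nontrivial...
print(phi(0) == phi(3))   # True: ...and indeed phi is not injective
```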

In contrast to Proposition 6.3.9, we only obtain the following for homomorphisms.


Proposition 6.6.7. Suppose that ϕ : G → H is a homomorphism. If a ∈ G has finite order, then |ϕ(a)|
divides |a|.
Proof. Suppose that a ∈ G has finite order and let n = |a|. We have an = eG , so
ϕ(a)n = ϕ(an ) = ϕ(eG ) = eH
Thus, ϕ(a) has finite order, and furthermore, if we let m = |ϕ(a)|, then m | n by Proposition 4.6.5.
Given a homomorphism ϕ : G → H, it is possible that |ϕ(a)| < |a| for an element a ∈ G. For example,
if ε : S4 → {1, −1} is the sign homomorphism, then |(1 2 3 4)| = 4 but |ε((1 2 3 4))| = | − 1| = 2. For a
more extreme example, given any group G, the function ϕ : G → {e} from G to the trivial group given by
ϕ(a) = e for all a is a homomorphism with the property that |ϕ(a)| = 1 for all a ∈ G. In particular, an
element of infinite order can be sent by a homomorphism to an element of finite order.
For another example, suppose that N is a normal subgroup of G, and consider the homomorphism
π : G → G/N given by π(a) = aN . For any a ∈ G, we know by Proposition 6.6.7 that |π(a)| must divide
|a|, which is to say that |aN | (the order of the element aN ) must divide |a|. Thus, we have generalized
Proposition 6.2.12.
Definition 6.6.8. Suppose A and B are sets and that f : A → B is a function.
• Given a subset S ⊆ A, we let f → [S] = {f (a) : a ∈ S} = {b ∈ B : There exists a ∈ S with f (a) = b}.
• Given a subset T ⊆ B, we let f ← [T ] = {a ∈ A : f (a) ∈ T }.
In other words, f → [S] is the set of outputs that arise when restricting inputs to S, and f ← [T ] is the set of
inputs that are sent to an output in T .
Given a function f : A → B, some sources use the notation f (S) for our f → [S], and use f −1 (T ) for our
f ← [T ]. This notation is standard, but extraordinarily confusing. On the one hand, the notation f (S) suggests
that S is a valid input to f rather than a set of valid inputs. To avoid confusion here, we choose to use
square brackets. In addition, the notation f −1 (T ), or even f −1 [T ], is additionally confusing because f may
fail to have an inverse (after all, it need not be bijective). Thus, we will use our (somewhat unorthodox)
notation.
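For finite sets, both operations are easy to compute. Here is a small sketch using this notation (the particular function f is a made-up example):

```python
def image(f, S):
    """f->[S]: the set of outputs of f on inputs from S."""
    return {f(a) for a in S}

def preimage(f, A, T):
    """f<-[T]: the inputs from the domain A that f sends into T."""
    return {a for a in A if f(a) in T}

f = lambda n: n * n % 10          # the last digit of n^2
A = set(range(10))

print(image(f, {1, 2, 3}))        # {1, 4, 9}
print(preimage(f, A, {4}))        # {2, 8}: f is not injective, but the
                                  # preimage is still perfectly well-defined
```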
Theorem 6.6.9. Let G1 and G2 be groups, and let ϕ : G1 → G2 be a homomorphism. We have the following:
1. If H1 is a subgroup of G1 , then ϕ→ [H1 ] is a subgroup of G2 .
2. If N1 is a normal subgroup of G1 and ϕ is surjective, then ϕ→ [N1 ] is a normal subgroup of G2 .
3. If H2 is a subgroup of G2 , then ϕ← [H2 ] is a subgroup of G1 .
4. If N2 is a normal subgroup of G2 , then ϕ← [N2 ] is a normal subgroup of G1 .
Proof. Throughout, let e1 be the identity of G1 and let e2 be the identity of G2 .
1. Suppose that H1 is a subgroup of G1 .
• Notice that we have e1 ∈ H1 because H1 is a subgroup of G1 , so e2 = ϕ(e1 ) ∈ ϕ→ [H1 ].
• Let c, d ∈ ϕ→ [H1 ] be arbitrary. Fix a, b ∈ H1 with ϕ(a) = c and ϕ(b) = d. Since a, b ∈ H1 and
H1 is a subgroup of G1 , it follows that ab ∈ H1 . Now
ϕ(ab) = ϕ(a) · ϕ(b) = cd
so cd ∈ ϕ→ [H1 ].

• Let c ∈ ϕ→ [H1 ] be arbitrary. Fix a ∈ H1 with ϕ(a) = c. Since a ∈ H1 and H1 is a subgroup of


G1 , it follows that a−1 ∈ H1 . Now

ϕ(a−1 ) = ϕ(a)−1 = c−1

so c−1 ∈ ϕ→ [H1 ].
Putting it all together, we conclude that ϕ→ [H1 ] is a subgroup of G2 .
2. Suppose that N1 is a normal subgroup of G1 and that ϕ is surjective. We know from part 1 that
ϕ→ [N1 ] is a subgroup of G2 . Suppose that d ∈ G2 and c ∈ ϕ→ [N1 ] are arbitrary. Since ϕ is surjective,
we may fix b ∈ G1 with ϕ(b) = d. Since c ∈ ϕ→ [N1 ], we can fix a ∈ G1 with ϕ(a) = c. We then have

dcd−1 = ϕ(b) · ϕ(a) · ϕ(b)−1


= ϕ(b) · ϕ(a) · ϕ(b−1 )
= ϕ(bab−1 )

so dcd−1 ∈ ϕ→ [N1 ]. Since d ∈ G2 and c ∈ ϕ→ [N1 ] were arbitrary, we conclude that ϕ→ [N1 ] is a normal
subgroup of G2 .
3. Suppose that H2 is a subgroup of G2 .
• Notice that we have e2 ∈ H2 because H2 is a subgroup of G2 . Since ϕ(e1 ) = e2 ∈ H2 , it follows
that e1 ∈ ϕ← [H2 ].
• Let a, b ∈ ϕ← [H2 ] be arbitrary, so ϕ(a) ∈ H2 and ϕ(b) ∈ H2 . We then have

ϕ(ab) = ϕ(a) · ϕ(b) ∈ H2

because H2 is a subgroup of G2 , so ab ∈ ϕ← [H2 ].


• Let a ∈ ϕ← [H2 ] be arbitrary, so ϕ(a) ∈ H2 . We then have

ϕ(a−1 ) = ϕ(a)−1 ∈ H2

because H2 is a subgroup of G2 , so a−1 ∈ ϕ← [H2 ].


Putting it all together, we conclude that ϕ← [H2 ] is a subgroup of G1 .
4. Suppose that N2 is a normal subgroup of G2 . We know from part 3 that ϕ← [N2 ] is a subgroup of G1 . Suppose
that b ∈ G1 and a ∈ ϕ← [N2 ] are arbitrary, so ϕ(a) ∈ N2 . We then have

ϕ(bab−1 ) = ϕ(b) · ϕ(a) · ϕ(b−1 )


= ϕ(b) · ϕ(a) · ϕ(b)−1

Now ϕ(b) ∈ G2 and ϕ(a) ∈ N2 , so ϕ(b) · ϕ(a) · ϕ(b)−1 ∈ N2 because N2 is a normal subgroup of G2 .
Therefore, ϕ(bab−1 ) ∈ N2 , and hence bab−1 ∈ ϕ← [N2 ]. Since b ∈ G1 and a ∈ ϕ← [N2 ] were arbitrary,
we conclude that ϕ← [N2 ] is a normal subgroup of G1 .

Corollary 6.6.10. If ϕ : G1 → G2 is a homomorphism, then range(ϕ) is a subgroup of G2 .


Proof. This follows immediately from the previous theorem because G1 is trivially a subgroup of G1 and
range(ϕ) = ϕ→ [G1 ].

6.7 The Isomorphism and Correspondence Theorems


Suppose that ϕ : G → H is a homomorphism. There are two ways in which ϕ can fail to be an isomorphism:
ϕ can fail to be injective and ϕ can fail to be surjective. The latter is not really a problem at all, because
range(ϕ) = ϕ→ [G] is a subgroup of H, and if we restrict the codomain down to this subgroup, then ϕ
becomes surjective. The real issue is how to deal with a lack of injectivity. Let K = ker(ϕ). As we’ve seen
in Proposition 6.6.6, if K = {eG }, then ϕ is injective. But what if K is a nontrivial subgroup?
The general idea is as follows. All elements of K get sent to the same element of H via ϕ, namely to eH .
Now viewing K as a subgroup of G, the cosets of K break up G into pieces. The key insight is that two
elements which belong to the same coset of K must get sent via ϕ to the same element of H, and conversely
if two elements belong to different cosets of K then they must be sent via ϕ to distinct values of H. This is
the content of the following lemma.

Lemma 6.7.1. Suppose that ϕ : G → H is a homomorphism and let K = ker(ϕ). For any a, b ∈ G, we have
that a ∼K b if and only if ϕ(a) = ϕ(b).

Proof. Suppose first that a, b ∈ G satisfy a ∼K b. Fix k ∈ K with ak = b and notice that

ϕ(b) = ϕ(ak)
= ϕ(a) · ϕ(k)
= ϕ(a) · eH
= ϕ(a)

Suppose conversely that a, b ∈ G satisfy ϕ(a) = ϕ(b). We then have

ϕ(a−1 b) = ϕ(a−1 ) · ϕ(b)


= ϕ(a)−1 · ϕ(b)
= ϕ(a)−1 · ϕ(a)
= eH

so a−1 b ∈ K. It follows that a ∼K b.

In less formal terms, the lemma says that ϕ is constant on each coset of K, and assigns distinct values
to distinct cosets. This sets up a well-defined injective function from the quotient group G/K to the group
H. Restricting down to range(ϕ) on the right, this function is a bijection. Furthermore, this function is an
isomorphism as we now prove in the following fundamental theorem.

Theorem 6.7.2 (First Isomorphism Theorem). Let ϕ : G → H be a homomorphism and let K = ker(ϕ).
Define a function ψ : G/K → H by letting ψ(aK) = ϕ(a). We then have that ψ is a well-defined function
which is an isomorphism onto the subgroup range(ϕ) of H. Therefore

G/ ker(ϕ) ∼
= range(ϕ)

Proof. We check the following.

• ψ is well-defined: Suppose that a, b ∈ G with aK = bK. We then have a ∼K b, so by Lemma 6.7.1, we


have that ϕ(a) = ϕ(b) and hence ψ(aK) = ψ(bK).

• ψ is injective: Suppose that a, b ∈ G with ψ(aK) = ψ(bK). We then have that ϕ(a) = ϕ(b), so a ∼K b
by Lemma 6.7.1 (in the other direction), and hence aK = bK.

• ψ is surjective onto range(ϕ): First notice that since ϕ : G → H is a homomorphism, we know from
Corollary 6.6.10 that range(ϕ) is a subgroup of H. Now for any a ∈ G, we have ψ(aK) = ϕ(a) ∈
range(ϕ), so range(ψ) ⊆ range(ϕ). Moreover, given an arbitrary c ∈ range(ϕ), we can fix a ∈ G with
ϕ(a) = c, and then we have ψ(aK) = ϕ(a) = c. Thus, range(ϕ) ⊆ range(ψ). Combining both of these,
we conclude that range(ψ) = range(ϕ).

• ψ preserves the group operation: For any a, b ∈ G, we have

ψ(aK · bK) = ψ(abK)


= ϕ(ab)
= ϕ(a) · ϕ(b)
= ψ(aK) · ψ(bK)

Putting it all together, the function ψ : G/K → H defined by ψ(aK) = ϕ(a) is a well-defined injective
homomorphism that maps G/K surjectively onto range(ϕ). This proves the result.

For example, consider the homomorphism det : GLn (R) → R\{0}, where we view R\{0} as a group
under multiplication. As discussed in the previous section, we have ker(det) = SLn (R). Notice that det is a
surjective function because every nonzero real number arises as the determinant of some invertible matrix (for
example, the identity matrix with the (1, 1) entry replaced by r has determinant r). The First Isomorphism
Theorem tells us that
GLn (R)/SLn (R) ∼ = R\{0}
Here is what is happening intuitively. The subgroup SLn (R), which is the kernel of the determinant
homomorphism, is the set of n × n matrices with determinant 1. This breaks up the invertible n × n matrices
into cosets which correspond exactly to the nonzero real numbers, in the sense that all matrices of a given
determinant form a coset. The First Isomorphism Theorem says that multiplication in the quotient (where
you take representative matrices from the corresponding cosets, multiply the matrices, and then expand to the resulting
coset) corresponds exactly to just multiplying the real numbers which “label” the cosets.
For another example, consider the sign homomorphism ε : Sn → {±1} where n ≥ 2. We know that ε
is a homomorphism by Proposition 5.3.8, and we have ker(ε) = An by definition of An . Notice that ε is
surjective because n ≥ 2 (so ε(id) = 1 and ε((1 2)) = −1). By the First Isomorphism Theorem we have

Sn /An ∼
= {±1}

Now the quotient group Sn /An consists of two cosets: the even permutations form one coset (namely An )
and the odd permutations form the other coset. The isomorphism above is simply saying that we can label
all of the even permutations with 1 and all of the odd permutations with −1, and in this way multiplication
in the quotient corresponds exactly to multiplication of the labels.
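For n = 3 this picture can be verified computationally. The sketch below (my own, with 0-indexed permutations) checks that there are exactly two cosets of A3 in S3 and that multiplying permutations matches multiplying the labels +1 and −1.

```python
from itertools import permutations

def compose(p, q):                     # apply q first, then p
    return tuple(p[q[i]] for i in range(3))

def sign(p):
    inv = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inv % 2 else 1

S3 = list(permutations(range(3)))
A3 = [p for p in S3 if sign(p) == 1]
coset = lambda a: frozenset(compose(a, n) for n in A3)

print(len({coset(a) for a in S3}))     # 2: the even and the odd permutations

# Multiplying cosets corresponds to multiplying the sign labels:
print(all(sign(compose(a, b)) == sign(a) * sign(b)
          for a in S3 for b in S3))    # True
```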
Recall that if G is a group and H and K are subgroups of G, then we defined

HK = {hk : h ∈ H and k ∈ K}

(see Definition 6.4.1). Immediately after that definition we showed that HK is not in general a subgroup of
G by considering the case where G = S3 , H = ⟨(1 2)⟩ = {id, (1 2)}, and K = ⟨(1 3)⟩ = {id, (1 3)}. In
this case, we had

HK = {id ◦ id, id ◦ (1 3), (1 2) ◦ id, (1 2)(1 3)}


= {id, (1 3), (1 2), (1 3 2)}

which is not a subgroup of G. Moreover, notice that

KH = {id ◦ id, id ◦ (1 2), (1 3) ◦ id, (1 3)(1 2)}


= {id, (1 2), (1 3), (1 2 3)}

so we also have that HK 6= KH. Fortunately, if at least one of H or K is normal in G, then neither of these
problems arise.
Proposition 6.7.3. Let H and N be subgroups of a group G, and suppose that N is a normal subgroup of
G. We have the following:
1. HN = N H.
2. HN is a subgroup of G (and hence N H is a subgroup of G).
Proof. We first show that HN = N H.
• We show that HN ⊆ N H. Let a ∈ HN be arbitrary, and fix h ∈ H and n ∈ N with a = hn. One
can show that a ∈ N H directly from Condition 2 of Proposition 6.2.2, but we work with the conjugate
instead. Notice that
a = hn = (hnh−1 )h
where hnh−1 ∈ N (because N is a normal subgroup of G) and h ∈ H. Thus, a ∈ N H.
• We show that N H ⊆ HN . Let a ∈ N H be arbitrary, and fix n ∈ N and h ∈ H with a = nh. Again,
one can use Condition 1 of Proposition 6.2.2 directly, or notice that

a = nh = h(h−1 nh)

where h ∈ H and h−1 nh = h−1 n(h−1 )−1 ∈ N because N is a normal subgroup of G. Thus, a ∈ HN .
We now check that HN is a subgroup of G.
• Since H and N are subgroups of G, we have e ∈ H and e ∈ N . Since e = ee, it follows that e ∈ HN .
• Let a, b ∈ HN be arbitrary. Fix h1 , h2 ∈ H and n1 , n2 ∈ N with a = h1 n1 and b = h2 n2 . Since N is
a normal subgroup of G, n1 ∈ N , and h2 ∈ G, we may use Property 1 from Proposition 6.2.2 to fix
n3 ∈ N with n1 h2 = h2 n3 . We then have

ab = h1 n1 h2 n2 = h1 h2 n3 n2

Now h1 , h2 ∈ H and H is a subgroup of G, so h1 h2 ∈ H. Also, n3 , n2 ∈ N and N is a subgroup of G,


so n3 n2 ∈ N . Therefore
ab = (h1 h2 )(n3 n2 ) ∈ HN

• Let a ∈ HN . Fix h ∈ H and n ∈ N with a = hn. Notice that n−1 ∈ N because N is a normal
subgroup of G. Since N is a normal subgroup of G, n−1 ∈ N , and h−1 ∈ G, we may use Property 1
from Proposition 6.2.2 to fix k ∈ N with n−1 h−1 = h−1 k. We then have

a−1 = (hn)−1 = n−1 h−1 = h−1 k

Now h−1 ∈ H because H is a subgroup of G, so a−1 = h−1 k ∈ HN .


Therefore, HN is a subgroup of G. Since N H = HN from above, it follows that N H is also a subgroup of
G.
120 CHAPTER 6. QUOTIENTS AND HOMOMORPHISMS

Theorem 6.7.4 (Second Isomorphism Theorem). Let G be a group, let H be a subgroup of G, and let N be
a normal subgroup of G. We then have that HN is a subgroup of G, that N is a normal subgroup of HN ,
that H ∩ N is a normal subgroup of H, and that
HN/N ≅ H/(H ∩ N )
Proof. First notice that since N is a normal subgroup of G, we know from Proposition 6.7.3 that HN is a
subgroup of G. Notice that since e ∈ H, we have that n = en ∈ HN for all n ∈ N , so N ⊆ HN .
Since N is a normal subgroup of G, it follows that N is a normal subgroup of HN because conjugation by
elements of HN is just a special case of conjugation by elements of G. Therefore, we may form the quotient
group HN/N . Notice that we also have H ⊆ HN (because e ∈ N ), so we may define ϕ : H → HN/N by
letting ϕ(h) = hN . We check the following:
• ϕ is a homomorphism: For any h1 , h2 ∈ H, we have
ϕ(h1 h2 ) = h1 h2 N
= h1 N · h2 N
= ϕ(h1 ) · ϕ(h2 )

• ϕ is surjective: Let a ∈ HN be arbitrary. We need to show that aN ∈ range(ϕ). Fix h ∈ H and n ∈ N


with a = hn. Notice that we then have h ∼N a by definition of ∼N , so hN = aN . It follows that
ϕ(h) = hN = aN , so aN ∈ range(ϕ).
• ker(ϕ) = H ∩ N : Suppose first that a ∈ H ∩ N . We then have that a ∈ N , so aN ∩ eN ≠ ∅ and hence
aN = eN . It follows that ϕ(a) = aN = eN , so a ∈ ker(ϕ). Since a ∈ H ∩ N was arbitrary, we conclude
that H ∩ N ⊆ ker(ϕ).
Suppose conversely that a ∈ ker(ϕ) is arbitrary. Notice that the domain of ϕ is H, so a ∈ H. Since
a ∈ ker(ϕ), we have ϕ(a) = eN , so aN = eN . It follows that e−1 a ∈ N , so a ∈ N . Therefore, a ∈ H
and a ∈ N , so a ∈ H ∩ N . Since a ∈ ker(ϕ) was arbitrary, we conclude that ker(ϕ) ⊆ H ∩ N .
Therefore, ϕ : H → HN/N is a surjective homomorphism with kernel equal to H ∩ N . Since H ∩ N is the
kernel of a homomorphism with domain H, we know that H ∩ N is a normal subgroup of H. By the First
Isomorphism Theorem, we conclude that
HN/N ≅ H/(H ∩ N )
This completes the proof.
Suppose that we have a normal subgroup N of a group G. We can then form the quotient group G/N
and consider the projection homomorphism π : G → G/N given by π(a) = aN . Now if we have a subgroup
of the quotient G/N , then Theorem 6.6.9 tells us that we can pull this subgroup back via π to obtain a
subgroup of G. Call this resulting subgroup H. Now the elements of G/N are cosets, so when we pull back
we see that H will be a union of cosets of N . In particular, since eN will be an element of G/N , we will
have that N ⊆ H.
Conversely, suppose that H is a subgroup of G with the property that N ⊆ H. Suppose that a ∈ H and
b ∈ G with a ∼N b. We may then fix n ∈ N with an = b. Since N ⊆ H, we have that n ∈ H, so b = an ∈ H.
Thus, if H contains an element of a given coset of N , then H must contain every element of that coset. In
other words, H is a union of cosets of N . Using Theorem 6.6.9 again, we can go across π and notice that
the cosets which are subsets of H form a subgroup of the quotient. Furthermore, we have
π → [H] = {π(a) : a ∈ H}
= {aN : a ∈ H}

so we can view π → [H] as simply the quotient group H/N .


Taken together, this is summarized in the following two important results. Most of the key ideas are
discussed above and/or are included in Theorem 6.6.9, but I will omit a careful proof of the second.
Theorem 6.7.5 (Third Isomorphism Theorem). Let G be a group. Let N and H be normal subgroups of G
with N ⊆ H. We then have that H/N is a normal subgroup of G/N and that
(G/N )/(H/N ) ≅ G/H
Proof. Define ϕ : G/N → G/H by letting ϕ(aN ) = aH.
• ϕ is well-defined: If a, b ∈ G with aN = bN , then a ∼N b, so a ∼H b because N ⊆ H, and hence
aH = bH.
• ϕ is a homomorphism: For any a, b ∈ G, we have

ϕ(aN · bN ) = ϕ(abN )
= abH
= aH · bH
= ϕ(aN ) · ϕ(bN )

• ϕ is surjective: For any a ∈ G, we have ϕ(aN ) = aH, so aH ∈ range(ϕ).


• ker(ϕ) = H/N : For any a ∈ G, we have

aN ∈ ker(ϕ) ⇐⇒ ϕ(aN ) = eH
⇐⇒ aH = eH
⇐⇒ e−1 a ∈ H
⇐⇒ a ∈ H

Therefore, ker(ϕ) = {aN : a ∈ H} = H/N .


Notice in particular that since H/N is the kernel of the homomorphism ϕ : G/N → G/H, we know that
H/N is a normal subgroup of G/N by Proposition 6.6.4. Now using the First Isomorphism Theorem, we
conclude that
(G/N )/(H/N ) ≅ G/H
This completes the proof.
Theorem 6.7.6 (Correspondence Theorem). Let G be a group and let N be a normal subgroup of G. For
every subgroup H of G with N ⊆ H, we have that H/N is a subgroup of G/N and the function

H 7→ H/N

is a bijection from subgroups of G containing N to subgroups of G/N . Furthermore, we have the following
properties for any subgroups H1 and H2 of G that both contain N :
1. H1 is a subgroup of H2 if and only if H1 /N is a subgroup of H2 /N .
2. H1 is a normal subgroup of H2 if and only if H1 /N is a normal subgroup of H2 /N .
3. If H1 is a subgroup of H2 , then [H2 : H1 ] = [H2 /N : H1 /N ].
Chapter 7

Group Actions

Our definition of a group involved axioms for an abstract algebraic object. However, many groups arise in a
setting where the elements of the group naturally “move around” the elements of some set. For example, the
elements of GLn (R) represent linear transformations on Rn and so “move around” points in n-dimensional
space. When we discussed Dn , we thought of the elements of Dn as moving the vertices of a regular n-gon.
The general notion of a group working on a set in this way is called a group action. Typically these sets are
very geometric or combinatorial in nature, and we can understand the sets themselves by understanding the
groups. Perhaps more surprising, we can turn this idea around to obtain a great deal of information about
groups by understanding the sets on which they act.
The concept of understanding an algebraic object by seeing it “act” on other objects, often from different
areas of mathematics (geometry, topology, analysis, combinatorics, etc.), is a tremendously important part
of modern mathematics. In practice, groups typically arise as symmetries of some object just like our
introduction of Dn as the “symmetries” of the regular n-gon. Instead of n-gons, the objects can be graphs,
manifolds, topological spaces, vector spaces, etc.

7.1 Actions, Orbits, and Stabilizers


Definition 7.1.1. Let G be a group and let X be a (nonempty) set. A group action of G on X is a function
f : G × X → X, where we write f (g, x) as g ∗ x, such that
• e ∗ x = x for all x ∈ X.
• a ∗ (b ∗ x) = (a · b) ∗ x for all a, b ∈ G and x ∈ X.
We describe this situation by saying that “G acts on X”.
Here are several examples of group actions.
• GLn (R) acts on Rn where A ∗ ~x = A~x. Notice that In ~x = ~x for all ~x ∈ Rn and A(B~x) = (AB)~x for all
A, B ∈ GLn (R) and all ~x ∈ Rn from linear algebra.
• Sn acts on {1, 2, . . . , n} where σ ∗ i = σ(i). Notice that id ∗ i = i for all i ∈ {1, 2, . . . , n} and for any
σ, τ ∈ Sn and i ∈ {1, 2, . . . , n}, we have
σ ∗ (τ ∗ i) = σ ∗ τ (i)
= σ(τ (i))
= (σ ◦ τ )(i)
= (σ ◦ τ ) ∗ i


• Let G be a group. G acts on G by left multiplication, i.e. g ∗ a = g · a. Notice that e ∗ a = e · a = a


for all a ∈ G. The second axiom of a group action follows from associativity of the group operation
because for all g, h, a ∈ G we have

g ∗ (h ∗ a) = g ∗ (h · a)
= g · (h · a)
= (g · h) · a
= (g · h) ∗ a

• Let G be a group. G acts on G by conjugation, i.e. g ∗ a = gag −1 . Notice that e ∗ a = eae−1 = a for
all a ∈ G. Also, for all g, h, a ∈ G we have

g ∗ (h ∗ a) = g ∗ (hah−1 )
= ghah−1 g −1
= gha(gh)−1
= (g · h) ∗ a

• If G acts on X and H is a subgroup of G, then H acts on X by simply restricting the function. For
example, since Dn is a subgroup of Sn , we see that Dn acts on {1, 2, . . . , n} via σ ∗ i = σ(i). Also,
since SLn (R) is a subgroup of GLn (R), it acts on Rn via matrix multiplication as well.
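Since both axioms are universally quantified statements over finite sets in examples like the second one, they can be checked exhaustively. Here is a small Python sketch (our own encoding, not from the text: a permutation of {1, . . . , n} is a tuple whose (i−1)-st entry is the image of i) confirming both axioms for the standard action of S4 on {1, 2, 3, 4}:

```python
from itertools import permutations

n = 4
G = list(permutations(range(1, n + 1)))   # S_4: g[i-1] is the image of i
X = range(1, n + 1)

def act(g, i):
    """The standard action sigma * i = sigma(i)."""
    return g[i - 1]

def compose(s, t):
    """Return s ∘ t (apply t first, then s)."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

ident = tuple(X)
# Axiom 1: e * x = x for all x.
assert all(act(ident, i) == i for i in X)
# Axiom 2: a * (b * x) = (a · b) * x for all a, b, x.
assert all(act(a, act(b, i)) == act(compose(a, b), i)
           for a in G for b in G for i in X)
print("both action axioms hold for S4 acting on {1, 2, 3, 4}")
```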

Proposition 7.1.2. Suppose that G acts on X. Define a relation ∼ on X by letting x ∼ y if there exists
a ∈ G with a ∗ x = y. The relation ∼ is an equivalence relation on X.

Proof. We check the three properties:

• Reflexive: For any x ∈ X, we have e ∗ x = x, so x ∼ x.

• Symmetric: Suppose that x, y ∈ X with x ∼ y. Fix a ∈ G with a ∗ x = y. We then have

a−1 ∗ y = a−1 ∗ (a ∗ x)
= (a−1 · a) ∗ x
=e∗x
=x

so y ∼ x.

• Transitive: Suppose that x, y, z ∈ X with x ∼ y and y ∼ z. Fix a, b ∈ G with a ∗ x = y and b ∗ y = z.


We then have
(b · a) ∗ x = b ∗ (a ∗ x) = b ∗ y = z
so x ∼ z.

Definition 7.1.3. Suppose that G acts on X. The equivalence class of x under the above relation ∼ is called
the orbit of x. We denote this equivalence class by Ox . Notice that Ox = {a ∗ x : a ∈ G}.

If G acts on X, we know from our general theory of equivalence relations that the orbits partition X.
For example, consider the case where G = GL2 (R) and X = R2 with action A ∗ x = Ax. Notice that

O(0,0) = {(0, 0)}. Let’s consider O(1,0) . We claim that O(1,0) = R2 \{(0, 0)}. Since the orbits partition R2 ,
we know that (0, 0) ∉ O(1,0) . Now if a, b ∈ R with a ≠ 0, then the matrix

    [ a  0 ]
    [ b  1 ]

is in GL2 (R) since it has determinant a ≠ 0, and

    [ a  0 ] [ 1 ]   [ a ]
    [ b  1 ] [ 0 ] = [ b ]

so (a, b) ∈ O(1,0) . If b ∈ R with b ≠ 0, then the matrix

    [ 0  1 ]
    [ b  0 ]

is in GL2 (R) since it has determinant −b ≠ 0, and

    [ 0  1 ] [ 1 ]   [ 0 ]
    [ b  0 ] [ 0 ] = [ b ]

so (0, b) ∈ O(1,0) . Putting everything together, it follows that O(1,0) = R^2 \ {(0, 0)}. Since the orbits are
equivalence classes, we conclude that O(a,b) = R^2 \ {(0, 0)} whenever (a, b) ≠ (0, 0). Thus, the orbits partition
R^2 into two pieces: the origin and the rest.
In contrast, consider what happens if we consider the following subgroup H of GL2 (R). Let

    H = { [ cos θ  −sin θ ]         }     { [ cos θ   sin θ ]         }
        { [ sin θ   cos θ ] : θ ∈ R }  ∪  { [ sin θ  −cos θ ] : θ ∈ R }
If you’ve seen the terminology, then H is the set of all orthogonal 2 × 2 matrices, i.e. matrices A such that
AT = A−1 (where AT is the transpose of A). Intuitively, the elements of the set on the left are rotations by
angle θ and the elements of the set on the right are flips followed by rotations (very much like the alternate
definition of Dn on the homework, but now with all possible real values for the angles). One can show that
elements of H preserve distance. That is, if A ∈ H and ~v ∈ R2 , then ||A~v || = ||~v || (this can be checked
using general theory, or by checking directly using the above matrices: suppose that (x, y) ∈ R^2 with
x^2 + y^2 = r^2 and show that the same is true after hitting (x, y) with any of the above matrices). Thus, if
we let H act on R2 , it follows that every element of O(1,0) is on the circle of radius 1 centered at the origin.
Furthermore, O(1,0) contains all of these points because
    
    [ cos θ  −sin θ ] [ 1 ]   [ cos θ ]
    [ sin θ   cos θ ] [ 0 ] = [ sin θ ]
which gives all points on the unit circle as we vary θ. In general, by working through the details, one can
show that the orbits of this action are the circles centered at the origin.
Definition 7.1.4. Suppose that G acts on X. For each x ∈ X, define Gx = {a ∈ G : a ∗ x = x}. The set
Gx is called the stabilizer of x.
Proposition 7.1.5. Suppose that G acts on X. For each x ∈ X, the set Gx is a subgroup of G.
Proof. Let x ∈ X. Since e ∗ x = x, we have e ∈ Gx . Suppose that a, b ∈ Gx . We then have that a ∗ x = x
and b ∗ x = x, so
(a · b) ∗ x = a ∗ (b ∗ x) = a ∗ x = x
and hence a · b ∈ Gx . Suppose that a ∈ Gx so that a ∗ x = x. We then have that a−1 ∗ (a ∗ x) = a−1 ∗ x. Now
a−1 ∗ (a ∗ x) = (a−1 ∗ a) ∗ x = e ∗ x = x
so a−1 ∗ x = x and hence a−1 ∈ Gx . Therefore, Gx is a subgroup of G.

Since D4 is a subgroup of S4 , and S4 acts on {1, 2, 3, 4} via σ ∗ i = σ(i), it follows that D4 acts on
{1, 2, 3, 4} as well. To see how this works, we should remember our formal definitions of r and s as elements
of Sn . In D4 , we have r = (1 2 3 4) and s = (2 4). Working out all of the elements as permutations, we see
that
id r = (1 2 3 4) r2 = (1 3)(2 4) r3 = (1 4 3 2)
s = (2 4) rs = (1 2)(3 4) r2 s = (1 3) r3 s = (1 4)(2 3)
Notice that
G1 = G3 = {id, s} G2 = G4 = {id, r2 s}
and that
O1 = O2 = O3 = O4 = {1, 2, 3, 4}
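These orbit and stabilizer computations can be double-checked mechanically. In the Python sketch below (the tuple encoding and helper names are our own, not from the text), we build D4 inside S4 and recover exactly the stabilizers listed above:

```python
def compose(s, t):
    """Return s ∘ t (apply t first, then s)."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

ident = (1, 2, 3, 4)
r = (2, 3, 4, 1)    # r = (1 2 3 4)
s = (1, 4, 3, 2)    # s = (2 4)

# D4 = {id, s, r, rs, r^2, r^2 s, r^3, r^3 s} as permutations of {1, 2, 3, 4}.
D4 = []
g = ident
for _ in range(4):
    D4.append(g)                # g = r^i
    D4.append(compose(g, s))    # r^i s
    g = compose(r, g)

orbit = lambda i: {p[i - 1] for p in D4}
stab = lambda i: {p for p in D4 if p[i - 1] == i}

r2s = compose(compose(r, r), s)                  # r^2 s = (1 3)
assert orbit(1) == orbit(2) == {1, 2, 3, 4}      # every orbit is all of {1,2,3,4}
assert stab(1) == stab(3) == {ident, s}          # G1 = G3 = {id, s}
assert stab(2) == stab(4) == {ident, r2s}        # G2 = G4 = {id, r^2 s}
```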
Lemma 7.1.6. Suppose that G acts on X, and let x ∈ X. For any a, b ∈ G, we have

a ∗ x = b ∗ x ⇐⇒ a ∼Gx b.

Proof. Suppose first that a, b ∈ G with a ∗ x = b ∗ x. We then have that

(a−1 · b) ∗ x = a−1 ∗ (b ∗ x)
= a−1 ∗ (a ∗ x)
= (a−1 · a) ∗ x
=e∗x
=x

Therefore, a−1 b ∈ Gx and hence a ∼Gx b.


Suppose conversely that a, b ∈ G with a ∼Gx b. Fix h ∈ Gx with ah = b. We then have that

b ∗ x = (a · h) ∗ x
= a ∗ (h ∗ x)
=a∗x

so a ∗ x = b ∗ x.
Theorem 7.1.7 (Orbit-Stabilizer Theorem). Suppose that G acts on X, and let x ∈ X. There is a bijection
between Ox and the set of (left) cosets of Gx in G, so |Ox | = [G : Gx ]. In particular, if G is finite, then

|G| = |Ox | · |Gx |

so |Ox | divides |G|.


Proof. Let LGx be the set of left cosets of Gx in G. Define f : LGx → Ox by letting f (aGx ) = a ∗ x. Now we
are defining a function on cosets, but the right-to-left direction of Lemma 7.1.6 tells us that f is well-defined.
Also, the left-to-right direction of Lemma 7.1.6 tells us that f is injective. Now if y ∈ Ox , then we may fix
a ∈ G with y = a ∗ x, and notice that f (aGx ) = a ∗ x = y, so f is surjective. Therefore, f : LGx → Ox is a
bijection and hence |Ox | = [G : Gx ].
Suppose now that G is finite. By Lagrange’s Theorem, we know that
[G : Gx ] = |G| / |Gx |

so

|Ox | = |G| / |Gx |

and hence
|G| = |Ox | · |Gx |.
This completes the proof.

For example, consider the standard action of Sn on {1, 2, . . . , n}. For every i ∈ {1, 2, . . . , n}, we have
Oi = {1, 2, . . . , n} (because for each j, there is a permutation sending i to j), so as |Sn | = n!, it follows
that |Gi | = n!/n = (n − 1)!. In other words, there are (n − 1)! permutations of {1, 2, . . . , n} which fix a given
element of {1, 2, . . . , n}. Of course, this could have been proven directly, but it is an immediate consequence
of the Orbit-Stabilizer Theorem.
In the case of D4 acting on {1, 2, 3, 4} discussed above, we saw that Oi = {1, 2, 3, 4} for all i. Thus, each
stabilizer Gi satisfies |Gi | = |D4 |/4 = 8/4 = 2, as was verified above.
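As a quick sanity check of the Orbit-Stabilizer Theorem for the standard action, the following Python sketch (our own encoding, not from the text) counts the orbit and stabilizer of 1 in S5:

```python
from itertools import permutations
from math import factorial

n = 5
G = list(permutations(range(1, n + 1)))    # S_5 under the standard action
orbit1 = {g[0] for g in G}                 # orbit of 1: all images g(1)
stab1 = [g for g in G if g[0] == 1]        # stabilizer of 1

assert len(stab1) == factorial(n - 1)      # (n-1)! permutations fix 1
assert len(G) == len(orbit1) * len(stab1)  # |G| = |O_1| * |G_1|
print(len(G), len(orbit1), len(stab1))     # 120 5 24
```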

7.2 Permutations and Cayley’s Theorem


With several examples in hand, we start with the following proposition, which says that given a group action,
a given element of G actually permutes the elements of X.

Definition 7.2.1. Suppose that G acts on X. Given a ∈ G, define a function πa : X → X by letting


πa (x) = a ∗ x.

Proposition 7.2.2. Suppose that G acts on X. For each fixed a ∈ G, the function πa is a permutation of
X, i.e. is a bijection from X to X.

Proof. Let a ∈ G be arbitrary. We first check that πa is injective. Let x, y ∈ X and assume that πa (x) =
πa (y). We then have a ∗ x = a ∗ y, so

x=e∗x
= (a−1 · a) ∗ x
= a−1 ∗ (a ∗ x)
= a−1 ∗ (a ∗ y)
= (a−1 · a) ∗ y
=e∗y
= y.

Thus, the function πa is injective.


We now check that πa is surjective. Let x ∈ X be arbitrary. Notice that

πa (a−1 ∗ x) = a ∗ (a−1 ∗ x)
= (a · a−1 ) ∗ x
=e∗x
=x

hence x ∈ range(πa ). It follows that πa is surjective.


Therefore, πa : X → X is a bijection, so πa is a permutation of X.

Thus, given a group action of G on X, we can associate to each a ∈ G a bijection πa . Since each πa is
a bijection from X to X, we can view it as an element of SX . The function that takes a and produces the
function πa is in fact a homomorphism.

Proposition 7.2.3. Suppose that G acts on X. The function ϕ : G → SX defined by letting ϕ(a) = πa is a
homomorphism.
Proof. We know from Proposition 7.2.2 that πa ∈ SX for all a ∈ G. Let a, b ∈ G be arbitrary. For any
x ∈ X, we have

πa·b (x) = (a · b) ∗ x
= a ∗ (b ∗ x) (by the axioms of a group action)
= a ∗ πb (x)
= πa (πb (x))
= (πa ◦ πb )(x)

Since x ∈ X was arbitrary, it follows that πa·b = πa ◦ πb . Thus, we have

ϕ(a · b) = πa·b = πa ◦ πb = ϕ(a) ◦ ϕ(b).

Since a, b ∈ G were arbitrary, it follows that ϕ : G → SX is a homomorphism.


We have shown that given an action of G on X, we obtain a homomorphism from G to SX . We now
show that we can reverse this process. That is, if we start with a homomorphism from G to SX , we obtain
a group action. Thus, we can view group actions from either perspective.
Proposition 7.2.4. Let G be a group and let X be a set. If ϕ : G → SX is a homomorphism, then the
function from G × X to X defined by letting a ∗ x = ϕ(a)(x) is a group action of G on X.
Proof. Since ϕ is a homomorphism, we know that ϕ(e) = idX , hence for any x ∈ X we have

e ∗ x = ϕ(e)(x)
= idX (x)
=x

Suppose now that a, b ∈ G and x ∈ X are arbitrary. We then have

a ∗ (b ∗ x) = ϕ(a)(b ∗ x)
= ϕ(a)(ϕ(b)(x))
= (ϕ(a) ◦ ϕ(b))(x)
= ϕ(a · b)(x)
= (a · b) ∗ x

Therefore, the function a ∗ x = ϕ(a)(x) is a group action.


Historically, the concept of a group was originally studied in the context of permuting the roots of a given
polynomial (which leads to Galois Theory). In particular, group theory in the 18th and early 19th centuries
really consisted of the study of subgroups of Sn and thus had a much more “concrete” feel to it. The more
general abstract definition of a group (as any set with any operation satisfying the axioms) didn’t arise until
later. In particular, Lagrange’s Theorem was originally proved only for subgroups of symmetric groups.
However, once the abstract definition of a group as we now know it came about, it was quickly proved that
every finite group was isomorphic to a subgroup of a symmetric group. Thus, up to isomorphism, the older
more “concrete” study of groups is equivalent to the more abstract study we are conducting. This result is
known as Cayley’s Theorem, and we now go about proving it. Before jumping into the proof, we first handle
some preliminaries about the symmetric groups.

Proposition 7.2.5. Suppose that X and Y are sets and that f : X → Y is a bijection. For each σ ∈ SX , the
function f ◦σ ◦f −1 is a permutation of Y . Furthermore, the function ϕ : SX → SY given by ϕ(σ) = f ◦σ ◦f −1
is an isomorphism. In particular, if |X| = |Y |, then SX ∼= SY .

Proof. Notice that f ◦ σ ◦ f −1 is a function from Y to Y , and is a composition of bijections, so is itself a


bijection. Thus, f ◦ σ ◦ f −1 is a permutation of Y . Define ϕ : SX → SY by letting ϕ(σ) = f ◦ σ ◦ f −1 . We
check the following.

• ϕ is bijective. To see this, we show that ϕ has an inverse. Define ψ : SY → SX by letting ψ(τ ) =
f −1 ◦ τ ◦ f (notice that f −1 ◦ τ ◦ f is indeed a permutation of X by a similar argument as above). For
any σ ∈ SX , we have

(ψ ◦ ϕ)(σ) = ψ(ϕ(σ))
= ψ(f ◦ σ ◦ f −1 )
= f −1 ◦ (f ◦ σ ◦ f −1 ) ◦ f
= (f −1 ◦ f ) ◦ σ ◦ (f −1 ◦ f )
= idX ◦ σ ◦ idX
= σ

A symmetric computation shows that (ϕ ◦ ψ)(τ ) = τ for every τ ∈ SY , so ψ is an inverse of ϕ, and
hence ϕ is a bijection.
• ϕ preserves the group operation. Let σ1 , σ2 ∈ SX . We then have

ϕ(σ1 ◦ σ2 ) = f ◦ (σ1 ◦ σ2 ) ◦ f −1
= f ◦ σ1 ◦ (f −1 ◦ f ) ◦ σ2 ◦ f −1
= (f ◦ σ1 ◦ f −1 ) ◦ (f ◦ σ2 ◦ f −1 )
= ϕ(σ1 ) ◦ ϕ(σ2 )

Therefore, ϕ : SX → SY is an isomorphism.

Corollary 7.2.6. If X is a finite set with |X| = n, then SX ≅ Sn .

Proof. If X is finite with |X| = n, we may fix a bijection f : X → {1, 2, 3, . . . , n} and apply the previous
proposition.

Theorem 7.2.7 (Cayley’s Theorem). Let G be a group. There exists a subgroup H of SG such that G ≅ H.
Therefore, if |G| = n ∈ N+ , then G is isomorphic to a subgroup of Sn .

Proof. Consider the action of G on G given by a ∗ b = a · b. We know by Proposition 7.2.3 that the function
ϕ : G → SG given by ϕ(a) = πa is a homomorphism. We now check that ϕ is injective. Suppose that a, b ∈ G
with ϕ(a) = ϕ(b). We then have that πa = πb , so in particular we have πa (e) = πb (e), hence a · e = b · e, and
so a = b. It follows that ϕ is injective. Since range(ϕ) is a subgroup of SG by Corollary 6.6.10, we can view
ϕ as an isomorphism from G to range(ϕ). Therefore, G is isomorphic to a subgroup of SG .
Suppose finally that |G| = n ∈ N+ . We know from above that SG ≅ Sn , so we may fix an isomorphism
ψ : SG → Sn . We then have that ψ ◦ ϕ : G → Sn is injective (because the composition of injective functions is
injective) and that ψ ◦ ϕ preserves the group operation (as in the proof that the composition of isomorphisms
is an isomorphism), so G is isomorphic to the subgroup range(ψ ◦ ϕ) of Sn .
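The homomorphism a ↦ πa from the proof is easy to realize concretely for a small group. The sketch below (our own illustration, using Z/4Z written additively) checks that a ↦ πa is an injective homomorphism into the permutations of the underlying set:

```python
# Left-multiplication action of Z/4Z on itself, following the proof of
# Cayley's Theorem: a ↦ π_a, where π_a(b) = a + b (mod 4).
n = 4

def pi(a):
    """π_a as a tuple over {0, 1, 2, 3}: pi(a)[b] is a + b mod n."""
    return tuple((a + b) % n for b in range(n))

def compose(s, t):
    """Return s ∘ t (apply t first, then s)."""
    return tuple(s[t[i]] for i in range(n))

# ϕ(a) = π_a is injective, so Z/4Z is isomorphic to its image in S_{Z/4Z}.
assert len({pi(a) for a in range(n)}) == n
# ϕ is a homomorphism: π_{a+b} = π_a ∘ π_b.
assert all(pi((a + b) % n) == compose(pi(a), pi(b))
           for a in range(n) for b in range(n))
```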

7.3 The Conjugation Action and the Class Equation


Conjugacy Classes and Centralizers
Let G be a group. Throughout this section, we will consider the special case of the action of G on G given by
conjugation. That is, we are considering the action g ∗ a = gag −1 . In this case, the orbits and the stabilizers
of the action are given special names.
Definition 7.3.1. Let G be a group and consider the action of G on G given by conjugation.
• The orbits of this action are called conjugacy classes. The conjugacy class of a is the set

Oa = {g ∗ a : g ∈ G} = {gag −1 : g ∈ G}.

• For a ∈ G, the stabilizer Ga is called the centralizer of a in G and is denoted CG (a). Notice that

g ∈ CG (a) ⇐⇒ g ∗ a = a ⇐⇒ gag −1 = a ⇐⇒ ga = ag.

Thus,
CG (a) = {g ∈ G : ga = ag}
is the set of elements of G which commute with a.
By our general theory of group actions, we know that CG (a) is a subgroup of G for every a ∈ G. Now
the conjugacy classes are orbits, so they are subsets of G which partition G, but in general they are certainly
not subgroups of G. However, we do know that if G is finite, then the size of every conjugacy class divides
|G| by the Orbit-Stabilizer Theorem because the size of a conjugacy class is the index of the corresponding
centralizer subgroup. In fact, in this case, the Orbit-Stabilizer Theorem says that if G is finite, then

|Oa | · |CG (a)| = |G|.

Notice that if a ∈ G, then we have a ∈ CG (a) (because a trivially commutes with a), so since CG (a) is a
subgroup of G containing a, it follows that ⟨a⟩ ⊆ CG (a). It is often possible to use this simple fact together
with the above equality to help calculate conjugacy classes.
As an example, consider the group G = S3 . We work out the conjugacy class and centralizer of the
various elements. Notice first that CG (id) = G because every element commutes with the identity, and the
conjugacy class of id is {id} because σ ◦ id ◦ σ −1 = id for all σ ∈ G. Now consider the element (1 2). On the
one hand, we know that ⟨(1 2)⟩ = {id, (1 2)} is a subset of CG ((1 2)), so |CG ((1 2))| ≥ 2. Since |G| = 6, we
conclude that |O(1 2) | ≤ 3. Now we know that (1 2) ∈ O(1 2) because O(1 2) is the equivalence class of (1 2).
Since
• (2 3)(1 2)(2 3)−1 = (2 3)(1 2)(2 3) = (1 3)
• (1 3)(1 2)(1 3)−1 = (1 3)(1 2)(1 3) = (2 3)
it follows that (1 3) and (2 3) are also in O(1 2) . We now have three elements of O(1 2) , and since |O(1 2) | ≤ 3,
we conclude that
O(1 2) = {(1 2), (1 3), (2 3)}
Notice that we can now conclude that |CG ((1 2))| = 2, so in fact we must have CG ((1 2)) = {id, (1 2)}
without doing any other calculations.
We have now found two conjugacy classes which take up 4 of the elements of G = S3 . Let’s look at the
conjugacy class of (1 2 3). We know it contains (1 2 3), and since

(2 3)(1 2 3)(2 3)−1 = (2 3)(1 2 3)(2 3) = (1 3 2)



it follows that (1 3 2) is there as well. Since the conjugacy classes partition G, these are the only possible
elements so we conclude that
O(1 2 3) = {(1 2 3), (1 3 2)}
Using the Orbit-Stabilizer Theorem, it follows that |CG ((1 2 3))| = 3, so since ⟨(1 2 3)⟩ ⊆ CG ((1 2 3)) and
|⟨(1 2 3)⟩| = 3, we conclude that CG ((1 2 3)) = ⟨(1 2 3)⟩. Putting it all together, we see that S3 breaks up
into three conjugacy classes:
{id} {(1 2), (1 3), (2 3)} {(1 2 3), (1 3 2)}
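This decomposition of S3 into conjugacy classes can be confirmed by brute force; the Python sketch below (the encoding and helpers are our own, not from the text) computes the conjugation orbits directly:

```python
from itertools import permutations

G = list(permutations((1, 2, 3)))   # S_3 as tuples: g[i-1] is the image of i

def compose(s, t):
    """Return s ∘ t (apply t first, then s)."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v - 1] = i + 1
    return tuple(inv)

def conj_class(a):
    """The orbit of a under conjugation: {g a g^{-1} : g in G}."""
    return frozenset(compose(compose(g, a), inverse(g)) for g in G)

classes = {conj_class(a) for a in G}
# Three classes: {id} (size 1), the 3-cycles (size 2), the 2-cycles (size 3).
assert sorted(len(c) for c in classes) == [1, 2, 3]
```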
The fact that the 2-cycles form one conjugacy class and the 3-cycles form another is a specific case of a
general fact which we now prove.
Lemma 7.3.2. Let σ ∈ Sn be a k-cycle, say σ = (a1 a2 . . . ak ). For any τ ∈ Sn , the permutation τ στ −1
is a k-cycle and in fact
τ στ −1 = (τ (a1 ) τ (a2 ) . . . τ (ak ))
(Note: this k-cycle may not have the smallest element first, so we may have to “rotate” it to have it in
standard cycle notation).
Proof. For any i with 1 ≤ i ≤ k − 1, we have σ(ai ) = ai+1 , hence
(τ στ −1 )(τ (ai )) = τ (σ(τ −1 (τ (ai )))) = τ (σ(ai )) = τ (ai+1 )
Furthermore, since σ(ak ) = a1 , we have
(τ στ −1 )(τ (ak )) = τ (σ(τ −1 (τ (ak )))) = τ (σ(ak )) = τ (a1 )
To finish the proof, we need to show that τ στ −1 fixes all elements distinct from the τ (ai ). Suppose then
that b ≠ τ (ai ) for each i. We then have that τ −1 (b) ≠ ai for all i. Since σ fixes all elements other than the ai , it
follows that σ fixes τ −1 (b). Therefore
(τ στ −1 )(b) = τ (σ(τ −1 (b))) = τ (τ −1 (b)) = b
Putting it all together, we conclude that τ στ −1 = (τ (a1 ) τ (a2 ) . . . τ (ak )).
For example, suppose that σ = (1 6 3 4) and τ = (1 7)(2 4 9 6)(5 8). To determine τ στ −1 , we need only
apply τ to each of the elements in the cycle σ. Thus,
τ στ −1 = (7 2 3 9) = (2 3 9 7)
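This relabeling rule is straightforward to test computationally. The sketch below (with our own helper functions, not from the text) checks the example just computed:

```python
def perm_from_cycles(n, *cycles):
    """Build a permutation of {1,...,n} (as a tuple) from disjoint cycles."""
    p = list(range(1, n + 1))
    for cyc in cycles:
        for i, a in enumerate(cyc):
            p[a - 1] = cyc[(i + 1) % len(cyc)]
    return tuple(p)

def compose(s, t):
    """Return s ∘ t (apply t first, then s)."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v - 1] = i + 1
    return tuple(inv)

n = 9
sigma = perm_from_cycles(n, (1, 6, 3, 4))
tau = perm_from_cycles(n, (1, 7), (2, 4, 9, 6), (5, 8))

# τστ^{-1} is obtained by applying τ to the entries of σ's cycle.
conj = compose(compose(tau, sigma), inverse(tau))
assert conj == perm_from_cycles(n, (2, 3, 9, 7))
```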
This result extends beyond k-cycles, and in fact we get the reverse direction as well.
Theorem 7.3.3. Two elements of Sn are conjugate in Sn if and only if they have the same cycle structure,
i.e. if and only if their cycle notations have the same number of k-cycles for each k ∈ N+ .
Proof. Let σ ∈ Sn and suppose that we write σ in cycle notation as σ = π1 π2 · · · πm with the πi disjoint
cycles. For any τ ∈ Sn , we have
τ στ −1 = τ π1 π2 · · · πm τ −1 = (τ π1 τ −1 )(τ π2 τ −1 ) · · · (τ πm τ −1 )
By the lemma, each τ πi τ −1 is a cycle of the same length as πi . Furthermore, the various cycles τ πi τ −1 are
disjoint from each other because they are obtained by applying τ to the elements of the cycles and τ is a
bijection. Therefore, τ στ −1 has the same cycle structure as σ.
Suppose conversely that σ and ρ have the same cycle structure. Match up the cycles of σ with the cycles
of ρ in a manner which preserves cycle length (including 1-cycles). Define τ : {1, 2, . . . , n} → {1, 2, . . . , n}
as follows. Given i ∈ {1, 2, . . . , n}, let τ (i) be the element of the corresponding cycle in the corresponding
position. Notice that τ is a bijection because the cycles in a given cycle notation are disjoint. By inserting
τ τ −1 between the various cycles of σ, we can use the lemma to conclude that τ στ −1 = ρ.

As an illustration of the latter part of the theorem, suppose that we are working in S8 and we have

σ = (1 8)(2 5)(4 7 6) ρ = (2 7 3)(4 5)(6 8)

Notice that σ and ρ have the same cycle structure, so they are conjugate in S8 . We can then write

σ = (3)(1 8)(2 5)(4 7 6)


ρ = (1)(4 5)(6 8)(2 7 3)

The proof then defines τ by matching up the corresponding numbers, i.e. τ (3) = 1, τ (1) = 4, τ (8) = 5, etc.
Working it out, we see that we can take

τ = (1 4 2 6 3)(5 8)(7)

and that it will satisfy τ στ −1 = ρ.

The Class Equation


Given an action of G on X, we know that the orbits partition X. In particular, the conjugacy classes of a
group G partition G. Thus, if we pick a unique element bi from each of the m conjugacy classes of G, then
we have
|G| = |Ob1 | + |Ob2 | + · · · + |Obm |
Furthermore, since we know by the Orbit-Stabilizer Theorem that the size of an orbit divides the order of G,
it follows that every summand on the right is a divisor of |G|.
We can get a more useful version of the above equation if we bring together the orbits of size 1. Let’s
begin by examining which elements of G form a conjugacy class by themselves. We have

Oa = {a} ⇐⇒ gag −1 = a for all g ∈ G


⇐⇒ ga = ag for all g ∈ G
⇐⇒ a ∈ Z(G)

Thus, the elements which form a conjugacy class by themselves are exactly the elements of Z(G). Now if we
bring together these elements, and pick a unique element ai from each of the k conjugacy classes of size at
least 2, we get the equation:

|G| = |Z(G)| + |Oa1 | + |Oa2 | + · · · + |Oak |

Since Z(G) is a subgroup of G, we may use Lagrange’s Theorem to conclude that each of the summands
on the right is a divisor of |G|. Furthermore, we have |Oai | ≥ 2 for all i, although it might happen that
|Z(G)| = 1. Finally, using the Orbit-Stabilizer Theorem, we can rewrite this latter equation as

|G| = |Z(G)| + [G : CG (a1 )] + [G : CG (a2 )] + · · · + [G : CG (ak )]

where again each of the summands on the right is a divisor of |G| and [G : CG (ai )] ≥ 2 for all i. Either of
these last two equations is known as the Class Equation for G.
We saw above that the class equation for S3 reads as

6=1+2+3

In Problem 6c on Homework 4, we computed the number of elements of S5 of each given cycle structure, so
the class equation for S5 reads as

120 = 1 + 10 + 15 + 20 + 20 + 24 + 30
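Both class equations can be verified by brute force, since conjugacy classes are just the orbits of the conjugation action. The following Python sketch (our own encoding, not from the text) recomputes the class sizes for S5:

```python
from itertools import permutations

G = list(permutations(range(1, 6)))   # S_5, |G| = 120

def compose(s, t):
    """Return s ∘ t (apply t first, then s)."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v - 1] = i + 1
    return tuple(inv)

# Each conjugacy class is {g a g^{-1} : g in G} for some a.
classes = {frozenset(compose(compose(g, a), inverse(g)) for g in G) for a in G}
sizes = sorted(len(c) for c in classes)
assert sizes == [1, 10, 15, 20, 20, 24, 30]   # one size per cycle structure
assert sum(sizes) == 120                      # the class equation for S_5
```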

Theorem 7.3.4. Let p ∈ N+ be prime and let G be a group with |G| = p^n for some n ∈ N+ . We then have
that Z(G) ≠ {e}.

Proof. Let a1 , a2 , . . . , ak be representatives from the conjugacy classes of size at least 2. By the Class
Equation, we know that
|G| = |Z(G)| + |Oa1 | + |Oa2 | + · · · + |Oak |
By the Orbit-Stabilizer Theorem, we know that |Oai | divides p^n for each i. By the Fundamental Theorem
of Arithmetic, it follows that each |Oai | is one of 1, p, p^2 , . . . , p^n . Now we know that |Oai | > 1 for each i, so
p divides each |Oai |. Since
|Z(G)| = |G| − |Oa1 | − |Oa2 | − · · · − |Oak |
and p divides every term on the right-hand side, we conclude that p | |Z(G)|. In particular, Z(G) ≠ {e}.
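For a concrete instance of Theorem 7.3.4, the sketch below (our own encoding, not from the text) computes the center of D4, a group of order 2^3, and finds that it is nontrivial of order 2:

```python
def compose(s, t):
    """Return s ∘ t (apply t first, then s)."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

ident = (1, 2, 3, 4)
r = (2, 3, 4, 1)    # r = (1 2 3 4)
s = (1, 4, 3, 2)    # s = (2 4)

# Build D4 = {r^i, r^i s : 0 <= i <= 3} as permutations of {1, 2, 3, 4}.
D4 = []
g = ident
for _ in range(4):
    D4.append(g)
    D4.append(compose(g, s))
    g = compose(r, g)

# Z(D4): the elements commuting with everything.
center = [a for a in D4 if all(compose(a, b) == compose(b, a) for b in D4)]
r2 = compose(r, r)
assert sorted(center) == sorted([ident, r2])   # Z(D4) = {id, r^2}, so 2 | |Z(D4)|
```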

Corollary 7.3.5. Let p ∈ N+ be prime. Every group of order p2 is abelian.

Proof. Let G be a group of order p2 . Using Corollary 6.5.6, we know that either Z(G) = {e} or Z(G) = G.
The former is impossible by Theorem 7.3.4, so Z(G) = G and hence G is abelian.

Proposition 7.3.6. Let p ∈ N+ be prime. If G is a group of order p^2 , then either

G ≅ Z/p^2 Z or G ≅ Z/pZ × Z/pZ

Therefore, up to isomorphism, there are exactly two groups of order p^2 .

Proof. Suppose that G is a group of order p^2 . By Corollary 7.3.5, G is abelian. If there exists an element
of G with order p^2 , then G is cyclic and G ≅ Z/p^2 Z. Suppose then that G has no element of order p^2 . By
Lagrange’s Theorem, the order of every element divides p^2 , so the order of every nonidentity element of G
must be p. Fix a ∈ G with a ≠ e, and let H = ⟨a⟩. Since |H| = p < |G|, we may fix b ∈ G with b ∉ H and
let K = ⟨b⟩. Notice that H and K are both normal subgroups of G because G is abelian. Now H ∩ K is
a subgroup of K, so |H ∩ K| divides |K| = p. We can’t have |H ∩ K| = p, because this would imply that
H ∩ K = K, which would contradict the fact that b ∉ H. Therefore, we must have |H ∩ K| = 1 and hence
H ∩ K = {e}. Since |H| · |K| = p^2 = |G|, we may use Corollary 6.4.7 to conclude that G is the internal
direct product of H and K. Since H and K are both cyclic of order p, they are both isomorphic to Z/pZ, so

G ≅ H × K ≅ Z/pZ × Z/pZ

Finally, notice that Z/p^2 Z ≇ Z/pZ × Z/pZ because the first group is cyclic while the second is not (by
Problem 7 on Homework 7, for example).

We are now in a position to extend Theorem 6.2.13 to arbitrary groups.

Theorem 7.3.7 (Cauchy’s Theorem). Let p ∈ N+ be prime. If G is a group with p | |G|, then G has an
element of order p.

Proof. The proof is by induction on |G|. If |G| = 1, then the result is trivial because p ∤ 1 (again, if you don’t
like this vacuous base case, simply note that if |G| = p, then every nonidentity element of G has order p by
Lagrange’s Theorem). Suppose then that G is a finite group with p | |G|, and suppose that the result is true
for all groups K satisfying p | |K| and |K| < |G|. Let a1 , a2 , . . . , ak be representatives from the conjugacy
classes of size at least 2. By the Class Equation, we know that

|G| = |Z(G)| + [G : CG (a1 )] + [G : CG (a2 )] + · · · + [G : CG (ak )]

We have two cases.


134 CHAPTER 7. GROUP ACTIONS

• Suppose that p | [G : CG (ai )] for all i. Since

|Z(G)| = |G| − [G : CG (a1 )] − [G : CG (a2 )] − · · · − [G : CG (ak )]

and p divides every term on the right, it follows that p | |Z(G)|. Now Z(G) is an abelian group, so
Theorem 6.2.13 tells us that Z(G) has an element of order p. Thus, G has an element of order p.
• Suppose that p ∤ [G : CG (ai )] for some i. Fix such an i. By Lagrange's Theorem we have

|G| = [G : CG (ai )] · |CG (ai )|

Since p is a prime number with both p | |G| and p ∤ [G : CG (ai )], we must have that p | |CG (ai )|. Now
|CG (ai )| < |G| because ai ∉ Z(G), so by induction, CG (ai ) has an element of order p. Therefore, G
has an element of order p.
The result follows by induction.
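As an optional aside (not part of the original notes), Cauchy's Theorem is easy to check by brute force on a small group. The following Python sketch verifies it for G = S4, whose order 24 is divisible by the primes 2 and 3; permutations are written 0-indexed in one-line notation, and the helper names are our own.

```python
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p(q(i)) for permutations in one-line notation
    return tuple(p[q[i]] for i in range(len(p)))

def order(p):
    # order of p: the smallest k >= 1 with p^k = id
    e = tuple(range(len(p)))
    k, cur = 1, p
    while cur != e:
        cur = compose(p, cur)
        k += 1
    return k

S4 = list(permutations(range(4)))
for p in (2, 3):  # the primes dividing |S4| = 24
    assert any(order(s) == p for s in S4)
```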

7.4 Simplicity of A5
Suppose that H is a subgroup of G. We know that one of the equivalent conditions for H to be a normal
subgroup of G is that ghg −1 ∈ H for all g ∈ G and h ∈ H. This leads to the following simple result.
Proposition 7.4.1. Let H be a subgroup of G. We then have that H is a normal subgroup of G if and only
if H is a union of conjugacy classes of G.
Proof. Suppose first that H is a normal subgroup of G. Suppose that H includes an element h from some
conjugacy class of G. Now the conjugacy class of h in G equals {ghg −1 : g ∈ G}, and since h ∈ H and H is
normal in G, it follows that {ghg −1 : g ∈ G} ⊆ H. Thus, if H includes an element of some conjugacy class
of G, then H must include that entire conjugacy class. It follows that H is a union of conjugacy classes of G.
Suppose conversely that H is a subgroup of G that is a union of conjugacy classes of G. Given any g ∈ G
and h ∈ H, we have that ghg −1 is an element of the conjugacy class of h in G, so since h ∈ H, we must have
ghg −1 ∈ H as well. Therefore, H is a normal subgroup of G.
If we understand the conjugacy classes of a group G, we can use this proposition to help us understand
the normal subgroups of G. Let's begin such an analysis by looking at S4 . Recall that in Homework 4 we
counted the number of elements of Sn of various cycle types. In particular, we showed that if k ≤ n, then
the number of k-cycles in Sn equals:
n(n − 1)(n − 2) · · · (n − k + 1) / k
Also, if n ≥ 4, we showed that the number of permutations in Sn which are the product of two disjoint
2-cycles equals
n(n − 1)(n − 2)(n − 3) / 8
Using these results, we see that S4 consists of the following numbers of elements of each cycle type.
• Identity: 1
• 2-cycles: 4·3/2 = 6
• 3-cycles: 4·3·2/3 = 8
• 4-cycles: 4·3·2·1/4 = 6
• Product of two disjoint 2-cycles: 4·3·2·1/8 = 3.
Since two elements of S4 are conjugates in S4 exactly when they have the same cycle type, this breakdown
gives the conjugacy classes of S4 . In particular, the class equation of S4 is:

24 = 1 + 3 + 6 + 6 + 8.
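As an optional computational aside, this class equation can be verified by brute force in Python, computing each conjugacy class directly (permutations in 0-indexed one-line notation; the helper names are ours).

```python
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

S4 = list(permutations(range(4)))
seen, sizes = set(), []
for s in S4:
    if s not in seen:
        # the conjugacy class of s: all g s g^(-1) for g in S4
        c = {compose(compose(g, s), inverse(g)) for g in S4}
        seen |= c
        sizes.append(len(c))
assert sorted(sizes) == [1, 3, 6, 6, 8]  # the class equation 24 = 1+3+6+6+8
```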

Using this class equation, let’s examine the possible normal subgroups of S4 . We already know that A4 is a
normal subgroup of S4 since it has index 2 in S4 (see Proposition 6.2.7). However, another way to see this
is that A4 contains the identity, the 3-cycles, and the products of two disjoint 2-cycles, so it is a union of
conjugacy classes of S4 . In particular, it arises from taking the 1, the 3, and the 8 in the above class equation
and putting them together to form a subgroup of size 1 + 3 + 8 = 12.
Aside from the trivial examples of {id} and S4 itself, are there any other normal subgroups of S4 ? Suppose
that H is a normal subgroup of S4 with {id} ⊊ H ⊊ S4 and H ≠ A4 . We certainly know that id ∈ H.
By Lagrange’s Theorem, we know we must have that |H| | 24. We also know that H must be a union of
conjugacy classes of S4 . Thus, we would need to find a way to add some collection of the various numbers
in the above class equation, necessarily including the number 1, such that their sum is a divisor of 24. One
way is 1 + 3 + 8 = 12, which gave A4 . Working through the various possibilities, we see that the only other
nontrivial way to make it work is 1 + 3 = 4. This corresponds to the subset

{id, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)}.

Now this subset is certainly closed under conjugation, but it is not immediately obvious that it is a subgroup.
Each nonidentity element here has order 2, so it is closed under inverses. Performing the simple check, it
turns out that it is also closed under composition, so indeed this is another normal subgroup of S4 . Thus,
the normal subgroups of S4 are {id}, S4 , A4 , and this subgroup of order 4.
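A quick Python check (an optional aside, not in the original notes) confirms both claims about this 4-element set: it is closed under composition, and it is closed under conjugation by every element of S4.

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

# V = {id, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)} in 0-indexed one-line notation
V = {(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}

# closed under composition (finite, contains id, each element is its own
# inverse, so this suffices for V to be a subgroup)
assert all(compose(a, b) in V for a in V for b in V)

# normal in S4: closed under conjugation by every g in S4
S4 = list(permutations(range(4)))
assert all(compose(compose(g, v), inverse(g)) in V for g in S4 for v in V)
```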
Now let’s examine the possible normal subgroups of A4 . We already know three examples: {id}, A4 , and
the just discovered subgroup of S4 of size 4 (it is contained in A4 , and it must be normal in A4 because it
is normal in S4 ). Now the elements of A4 are the identity, the 3-cycles, and the products of two disjoint
2-cycles. Although the set of 3-cycles forms one conjugacy class in S4 , the set of eight 3-cycles does not form
one conjugacy class in A4 . We can see this immediately because |A4 | = 12 and 8 - 12. The problem is that
the elements of S4 which happen to conjugate one 3-cycle to another might all be odd permutations and so
do not exist in A4 . How can we determine the conjugacy classes in A4 without simply plowing through all
of the calculations from scratch?
Let’s try to work out the conjugacy class of (1 2 3) in A4 . First notice that we know that the conjugacy
class of (1 2 3) in S4 has size 8, so by the Orbit-Stabilizer Theorem we conclude that |CS4 ((1 2 3))| = 24/8 = 3.
Since h(1 2 3)i ⊆ CS4 ((1 2 3)) and |h(1 2 3)i| = 3, it follows that CS4 ((1 2 3)) = h(1 2 3)i. Now h(1 2 3)i ⊆ A4 ,
so we conclude that CA4 ((1 2 3)) = h(1 2 3)i as well. Therefore, by the Orbit-Stabilizer Theorem, the
conjugacy class of (1 2 3) in A4 has size 12/3 = 4. If we want to work out what exactly it is, it suffices to find
4 conjugates of (1 2 3) in A4 . Fortunately, we know how to compute conjugates quickly in Sn using Lemma
7.3.2 and the discussion afterwards:
• id(1 2 3)id−1 = (1 2 3)
• (1 2 4)(1 2 3)(1 2 4)−1 = (2 4 3)
• (2 3 4)(1 2 3)(2 3 4)−1 = (1 3 4)
• (1 2)(3 4)(1 2 3)[(1 2)(3 4)]−1 = (2 1 4) = (1 4 2)
Thus, the conjugacy class of (1 2 3) in A4 is:

{(1 2 3), (1 3 4), (1 4 2), (2 4 3)}



If we work with a 3-cycle not in this set (for example, (1 2 4)), the above argument goes through to show
that its conjugacy class also has size 4, so its conjugacy class must be the other four 3-cycles in A4 . Thus,
we get the conjugacy class
{(1 2 4), (1 3 2), (1 4 3), (2 3 4)}
Finally, let’s look at the conjugacy class of (1 2)(3 4) in A4 . We have

• id(1 2)(3 4)id−1 = (1 2)(3 4)

• (2 3 4)(1 2)(3 4)(2 3 4)−1 = (1 3)(4 2) = (1 3)(2 4)

• (2 4 3)(1 2)(3 4)(2 4 3)−1 = (1 4)(2 3)

so the products of two disjoint 2-cycles still form one conjugacy class in A4 . Why did this conjugacy class
not break up? By the Orbit-Stabilizer Theorem, we know that |CS4 ((1 2)(3 4))| = 24/3 = 8. If we actually
compute this centralizer, we see that four of its elements are even permutations and four of its elements are
odd permutations. Therefore, |CA4 ((1 2)(3 4))| = 4 (the four even permutations in CS4 ((1 2)(3 4))), hence
using the Orbit-Stabilizer Theorem again we conclude that the conjugacy class of (1 2)(3 4) in A4 has size
12/4 = 3.
Putting everything together, we see that the class equation of A4 is:

12 = 1 + 3 + 4 + 4.
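This class equation can likewise be verified by brute force (an optional Python aside; permutations in 0-indexed one-line notation, helper names ours).

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def sign(p):
    # parity via inversion count: +1 for even, -1 for odd
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

A4 = [p for p in permutations(range(4)) if sign(p) == 1]
seen, sizes = set(), []
for s in A4:
    if s not in seen:
        c = {compose(compose(g, s), inverse(g)) for g in A4}
        seen |= c
        sizes.append(len(c))
assert sorted(sizes) == [1, 3, 4, 4]  # the class equation 12 = 1 + 3 + 4 + 4
```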

Working through the possibilities as in S4 , we conclude that the three normal subgroups of A4 we found
above are indeed all of the normal subgroups of A4 (there is no other way to add some subcollection of these
numbers which includes 1 to obtain a divisor of 12). Notice that we can also conclude the following.

Proposition 7.4.2. A4 has no subgroup of order 6, so the converse of Lagrange’s Theorem is false.

Proof. If H were a subgroup of A4 with |H| = 6, then H would have index 2 in A4 , so it would be normal in
A4 . However, we just saw that A4 has no normal subgroup of order 6.
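This can also be confirmed by exhaustive search (an optional Python aside): a hypothetical subgroup of order 6 would contain the identity, and a finite subset containing the identity and closed under composition is automatically a subgroup, so it suffices to test all C(11, 5) = 462 six-element subsets of A4 that contain id for closure.

```python
from itertools import permutations, combinations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def sign(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

A4 = [p for p in permutations(range(4)) if sign(p) == 1]
e = (0, 1, 2, 3)
others = [p for p in A4 if p != e]

def closed(H):
    # a finite subset containing id and closed under the operation is a subgroup
    return all(compose(a, b) in H for a in H for b in H)

# no 6-element subset of A4 containing the identity is closed, so A4 has no
# subgroup of order 6
assert not any(closed({e, *rest}) for rest in combinations(others, 5))
```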

In the case of S4 that we just worked through, we saw that when we restrict down to A4 , some conjugacy
classes split into two and others stay intact. This is a general phenomenon in Sn , as we now show. We first
need the following fact.
Lemma 7.4.3. Let H be a subgroup of Sn . If H contains an odd permutation, then |H ∩ An | = |H|/2.

Proof. Suppose that H contains an odd permutation, and fix such an element τ ∈ H. Let X = H ∩ An
be the set of even permutations in H and let Y = H\An be the set of odd permutations in H. Define
f : X → Sn by letting f (σ) = στ . We claim that f maps X bijectively onto Y . We check the following:

• f is injective: Suppose that σ1 , σ2 ∈ X with f (σ1 ) = f (σ2 ). We then have σ1 τ = σ2 τ , so σ1 = σ2 by cancellation.

• range(f ) ⊆ Y : Let σ ∈ X. We have σ ∈ H, so στ ∈ H because H is a subgroup of Sn . Also, σ is an even permutation and τ is an odd permutation, so στ is an odd permutation. It follows that f (σ) = στ ∈ Y .

• Y ⊆ range(f ): Let ρ ∈ Y . We then have ρ ∈ H and τ ∈ H, so ρτ −1 ∈ H because H is a subgroup of Sn . Also, we have that both ρ and τ −1 are odd permutations, so ρτ −1 is an even permutation. It follows that ρτ −1 ∈ X and since f (ρτ −1 ) = ρτ −1 τ = ρ, we conclude that ρ ∈ range(f ).

Therefore, f maps X bijectively onto Y , and hence |X| = |Y |. Since H = X ∪ Y and X ∩ Y = ∅, we conclude
that |H| = |X| + |Y |. It follows that |H| = 2 · |X|, so |X| = |H|/2.

Proposition 7.4.4. Let σ ∈ An . Let X be the conjugacy class of σ in Sn , and let Y be the conjugacy class
of σ in An .

• If σ commutes with some odd permutation in Sn , then Y = X.

• If σ does not commute with any odd permutation in Sn , then Y ⊆ X with |Y | = |X|/2.

Proof. First notice that X ⊆ An because σ ∈ An and all elements of X have the same cycle type as σ. Let
H = CSn (σ) and let K = CAn (σ). Notice that Y ⊆ X and K = H ∩ An . By the Orbit-Stabilizer Theorem
applied in each of Sn and An , we know that

|H| · |X| = n! and |K| · |Y | = n!/2
and therefore
2 · |K| · |Y | = |H| · |X|
Suppose first that σ commutes with some odd permutation in Sn . We then have that H contains an odd
permutation, so by the lemma we know that |K| = |H ∩ An | = |H|/2. Plugging this into the above equation,
we see that |H| · |Y | = |H| · |X|, so |Y | = |X|. Since Y ⊆ X, it follows that Y = X.
Suppose now that σ does not commute with any odd permutation in Sn . We then have that H ⊆ An ,
so K = H ∩ An = H and hence |K| = |H|. Plugging this into the above equation, we see that 2 · |H| · |Y | =
|H| · |X|, so |Y | = |X|/2.
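Proposition 7.4.4 can be checked exhaustively for n = 4 with a short Python script (an optional aside, not in the original notes): for each even σ, the conjugacy class in A4 either equals the class in S4 or is exactly half of it, according to whether σ commutes with some odd permutation.

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def sign(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if sign(p) == 1]

def conj_class(s, G):
    return {compose(compose(g, s), inverse(g)) for g in G}

for s in A4:
    X, Y = conj_class(s, S4), conj_class(s, A4)
    if any(sign(g) == -1 and compose(g, s) == compose(s, g) for g in S4):
        assert Y == X                           # the class stays intact in A4
    else:
        assert Y < X and 2 * len(Y) == len(X)   # the class splits in half
```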

Let’s put all of the knowledge to work in order to study A5 . We begin by recalling Problem 6c on
Homework 4, where we computed the number of elements of S5 of each given cycle type:

• Identity: 1
• 2-cycles: 5·4/2 = 10
• 3-cycles: 5·4·3/3 = 20
• 4-cycles: 5·4·3·2/4 = 30
• 5-cycles: 5·4·3·2·1/5 = 24
• Product of two disjoint 2-cycles: 5·4·3·2/8 = 15

• Product of a 3-cycle and a 2-cycle which are disjoint: This count equals the number of 3-cycles (each 3-cycle determines the 2-cycle on the remaining two elements), which is 20 from above.

This gives the following class equation for S5 :

120 = 1 + 10 + 15 + 20 + 20 + 24 + 30.

Now in A5 we only have the identity, the 3-cycles, the 5-cycles, and the product of two disjoint 2-cycles.
Let’s examine what happens to these conjugacy classes in A5 . We know that each of these conjugacy classes
either stays intact or breaks in half by the above proposition.

• The set of 3-cycles has size 20, and since (1 2 3) commutes with the odd permutation (4 5), this
conjugacy class stays intact in A5 .

• The set of 5-cycles has size 24, and since 24 - 60, it is not possible that this conjugacy class stays intact
in A5 . Therefore, the set of 5-cycles breaks up into two conjugacy classes of size 12.

• The set of products of two disjoint 2-cycles has size 15. Since this is an odd number, it is not possible
that it breaks into two pieces of size 15/2, so it must remain a conjugacy class in A5 .

Therefore, the class equation for A5 is:

60 = 1 + 12 + 12 + 15 + 20.

With this in hand, we can examine the normal subgroups of A5 . Of course we know that {id} and A5 are
normal subgroups of A5 . Suppose that H is a normal subgroup of A5 with {id} ⊊ H ⊊ A5 . We then have
that id ∈ H, that |H| | 60, and that H is a union of conjugacy classes of A5 . Thus, we would need to
find a way to add some collection of the various numbers in the above class equation, necessarily including
the number 1, such that their sum is a divisor of 60. Working through the possibilities, we see that this is
not possible except in the cases where we take all of the numbers (corresponding to A5 ) or where we take only the 1
(corresponding to {id}). We have proved the following important theorem.

Theorem 7.4.5. A5 is a simple group.

In fact, An is a simple group for all n ≥ 5, but the above method of proof falls apart for n > 5 (the
class equations get too long and there occasionally are ways to add certain numbers to get divisors of |An |).
Thus, we need some new techniques. One approach is to use induction on n ≥ 5 starting with the base case
that we just proved, but we will not go through all the details here.
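Both the class equation of A5 and the simplicity argument above can be verified by brute force (an optional Python aside; permutations in 0-indexed one-line notation, helper names ours).

```python
from itertools import permutations, combinations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def sign(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

A5 = [p for p in permutations(range(5)) if sign(p) == 1]
seen, sizes = set(), []
for s in A5:
    if s not in seen:
        c = {compose(compose(g, s), inverse(g)) for g in A5}
        seen |= c
        sizes.append(len(c))
assert sorted(sizes) == [1, 12, 12, 15, 20]  # 60 = 1 + 12 + 12 + 15 + 20

# no subcollection of {12, 12, 15, 20} together with the 1 sums to a proper
# nontrivial divisor of 60, so A5 is simple
rest = [12, 12, 15, 20]
sums = {1 + sum(c) for k in range(5) for c in combinations(rest, k)}
assert not any(1 < s < 60 and 60 % s == 0 for s in sums)
```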

7.5 Counting Orbits


Suppose that we color each vertex of a square with one of n colors. In total, there are n⁴ many ways to do
this because we have n choices for each of the 4 vertices. However, some of these colorings are the “same” up
to symmetry. For example, if we color the upper left vertex red and the other 3 green, then we can get this
by starting with the coloring where the upper right vertex is red and the others are green, and then rotating. How many possible
colorings are there up to symmetry?
We can turn this question into a question about counting orbits of a given group action. We begin by
letting
X = {(a1 , a2 , a3 , a4 ) : 1 ≤ ai ≤ n for all i}
Intuitively, we are labeling the four vertices of the square with the numbers 1, 2, 3, 4, and we are letting ai
denote the color of vertex i. If we use the labeling where the upper left corner is 1, the upper right is 2, the
lower right is 3, and the lower left is 4, then the two colorings above are

(R, G, G, G) and (G, R, G, G)

where R is the number we associated to the color red and G is the number for green. Of course, these are distinct
elements of X, but we want to make them the “same”. Notice that S4 acts on X via the action:

σ ∗ (a1 , a2 , a3 , a4 ) = (aσ(1) , aσ(2) , aσ(3) , aσ(4) )

For example, we have


(1 2 3) ∗ (a1 , a2 , a3 , a4 ) = (a2 , a3 , a1 , a4 )
so
(1 2 3) ∗ (R, B, Y, G) = (B, Y, R, G)
Now we really don’t want to consider these two colorings to be the same because although we can get from
one to the other via an element of S4 , we can’t do it from a rigid motion of the square. However, since D4 is

a subgroup of S4 , we can restrict the above to an action of D4 on X. Recall that when viewed as a subgroup
of S4 we can list the elements of D4 as:

id, r = (1 2 3 4), r² = (1 3)(2 4), r³ = (1 4 3 2),
s = (2 4), rs = (1 2)(3 4), r²s = (1 3), r³s = (1 4)(2 3)

The key insight is that two colorings are the same exactly when they are in the same orbit of this action by
D4 . For example, we have the following orbits:

• O(R,R,R,R) = {(R, R, R, R)}

• O(R,G,G,G) = {(R, G, G, G), (G, R, G, G), (G, G, R, G), (G, G, G, R)}

• O(R,G,R,G) = {(R, G, R, G), (G, R, G, R)}

Thus, to count the number of colorings up to symmetry, we want to count the number of orbits of this
action. The problem in attacking this problem directly is that the orbits have different sizes, so we cannot
simply divide n⁴ by the common size of the orbits. We need a better way to count the number of orbits of
an action.

Definition 7.5.1. Suppose that G acts on X. For each g ∈ G, let Xg = {x ∈ X : g ∗ x = x}. The set Xg is
called the fixed-point set of g.

Theorem 7.5.2 (Burnside’s Lemma - due originally to Cauchy - sometimes also attributed to Frobenius).
Suppose that G acts on X and that both G and X are finite. If k is the number of orbits of the action, then
k = (1/|G|) · Σ_{g∈G} |Xg |

Thus, the number of orbits is the average number of elements fixed by each g ∈ G.

Proof. We count the set A = {(g, x) ∈ G × X : g ∗ x = x} in two different ways. On the one hand, for
each g ∈ G, there are |Xg | many elements of A in the “row” corresponding to g, so |A| = Σ_{g∈G} |Xg |. On
the other hand, for each x ∈ X, there are |Gx | many elements of A in the “column” corresponding to x, so
|A| = Σ_{x∈X} |Gx |. Using the Orbit-Stabilizer Theorem, we know that

Σ_{g∈G} |Xg | = |A| = Σ_{x∈X} |Gx | = Σ_{x∈X} |G|/|Ox | = |G| · Σ_{x∈X} 1/|Ox |

and therefore
(1/|G|) · Σ_{g∈G} |Xg | = Σ_{x∈X} 1/|Ox |

Let’s examine this latter sum. Let P1 , P2 , . . . , Pk be the distinct orbits of X. We then have that
Σ_{x∈X} 1/|Ox | = Σ_{i=1}^{k} Σ_{x∈Pi} 1/|Pi | = Σ_{i=1}^{k} |Pi | · (1/|Pi |) = Σ_{i=1}^{k} 1 = k

Therefore,
(1/|G|) · Σ_{g∈G} |Xg | = Σ_{x∈X} 1/|Ox | = k

Of course, Burnside’s Lemma will only be useful if it is not hard to compute the various values |Xg |. Let’s
return to our example of D4 acting on the set X above. First notice that Xid = X, so |Xid | = n⁴. Let's move
on to |Xr | where r = (1 2 3 4). Which elements (a1 , a2 , a3 , a4 ) are fixed by r? We have r ∗ (a1 , a2 , a3 , a4 ) =
(a2 , a3 , a4 , a1 ), so we need a1 = a2 , a2 = a3 , a3 = a4 , and a4 = a1 . Thus, an element (a1 , a2 , a3 , a4 ) is fixed
by r exactly when all the ai are equal. There are n such choices (because we can pick a1 arbitrarily, and
then all the others are determined), so |Xr | = n. In general, given any σ ∈ D4 , an element of X is in Xσ
exactly when all of the entries in each cycle of σ get the same color. Therefore, we have |Xσ | = n^d where d
is the number of cycles in the cycle notation of σ, assuming that we include the 1-cycles. For example, we
have |Xr² | = n² and |Xs | = n³. Working this out in the above cases and using the fact that |D4 | = 8, we
conclude from Burnside’s Lemma that the number of ways to color the vertices of the square with n colors
up to symmetry is:
(1/8)(n⁴ + n + n² + n + n³ + n² + n³ + n²) = (1/8)(n⁴ + 2n³ + 3n² + 2n)
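As an optional check, we can compare this formula against a direct orbit count in Python, listing the eight elements of D4 in 0-indexed one-line notation (as in the table above).

```python
from itertools import product

# D4 as permutations of the vertex labels, matching id, r, r^2, r^3, s, rs,
# r^2 s, r^3 s from the text (0-indexed one-line notation)
D4 = [
    (0, 1, 2, 3),  # id
    (1, 2, 3, 0),  # r = (1 2 3 4)
    (2, 3, 0, 1),  # r^2 = (1 3)(2 4)
    (3, 0, 1, 2),  # r^3 = (1 4 3 2)
    (0, 3, 2, 1),  # s = (2 4)
    (1, 0, 3, 2),  # rs = (1 2)(3 4)
    (2, 1, 0, 3),  # r^2 s = (1 3)
    (3, 2, 1, 0),  # r^3 s = (1 4)(2 3)
]

def count_orbits(n):
    orbits = set()
    for x in product(range(n), repeat=4):      # all n^4 colorings
        orbit = frozenset(tuple(x[s[i]] for i in range(4)) for s in D4)
        orbits.add(orbit)
    return len(orbits)

for n in range(1, 6):
    assert count_orbits(n) == (n**4 + 2 * n**3 + 3 * n**2 + 2 * n) // 8
```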
Let’s examine the problem of coloring the faces of a cube. We will label the faces of a cube as a 6-sided die
is labeled so that opposing faces sum to 7. For example, we could take the top face to be 1, the bottom 6,
the front 2, the back 5, the right 3, and the left 4. With this labeling, the symmetries of the cube forms a
subgroup of S6 . Letting G be this subgroup, notice that |G| = 24 because we can put any of the 6 faces on
top, and then rotate around the top in 4 distinct ways. Working through the actually elements, one sees
that G equals the following subset of S6 :

id (2 3 5 4) (2 5)(3 4) (2 4 5 3)
(1 2)(3 4)(5 6) (1 3 2)(4 5 6) (1 5 6 2) (1 4 2)(3 5 6)
(1 2 3)(4 6 5) (1 3)(2 5)(4 6) (1 5 3)(2 4 6) (1 4 6 3)
(1 2 4)(3 6 5) (1 3 6 4) (1 5 4)(2 3 6) (1 4)(2 5)(3 6)
(1 2 6 5) (1 3 5)(2 6 4) (1 5)(2 6)(3 4) (1 4 5)(2 6 3)
(1 6)(3 4) (1 6)(2 3)(4 5) (1 6)(2 5) (1 6)(2 4)(3 5)

Let’s examine how these elements arise.

• Identity: 1 of these, and it fixes all n⁶ many colorings.

• 4-cycles: These are 90° rotations around the line through the centers of two opposing faces. There are
6 of these (3 such lines, and we can rotate in either direction), and each fixes n³ many colorings.

• Product of two 2-cycles: These are 180° rotations around the line through the centers of two opposing
faces. There are 3 of these, and each fixes n⁴ many colorings.

• Product of two 3-cycles: These are rotations around the line through opposite corners of the cube.
There are 8 of these (there are four pairs of opposing corners, and then we can rotate either 120° or
240° for each), and each fixes n² many colorings.

• Product of three 2-cycles: These are 180° rotations around a line through the midpoints of opposing edges
of the cube. There are 6 of these (there are 6 such pairs of opposing edges), and each fixes n³ many
colorings.
Using Burnside’s Lemma, we conclude that the total number of colorings of the faces of a cube using n colors
up to symmetry is:
(1/24)(n⁶ + 3n⁴ + 12n³ + 8n²)
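As an optional check, the following Python sketch generates the rotation group of the cube from two of the rotations in the table above (the choice of the generators (2 3 5 4) and (1 2 6 5), two rotations about perpendicular face axes, is ours) and compares Burnside's count against the formula.

```python
def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def cycle_count(p):
    # number of cycles of p, counting fixed points as 1-cycles
    seen, count = set(), 0
    for i in range(len(p)):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return count

# 0-indexed one-line forms of the rotations (2 3 5 4) and (1 2 6 5)
r1 = (0, 2, 4, 1, 3, 5)   # (2 3 5 4): fixes faces 1 and 6
r2 = (1, 5, 2, 3, 0, 4)   # (1 2 6 5): fixes faces 3 and 4

G = {tuple(range(6)), r1, r2}
changed = True
while changed:             # close under composition with the generators
    changed = False
    for a in list(G):
        for b in (r1, r2):
            c = compose(a, b)
            if c not in G:
                G.add(c)
                changed = True
assert len(G) == 24        # the full rotation group of the cube

for n in range(1, 5):      # Burnside's count agrees with the formula
    assert sum(n ** cycle_count(g) for g in G) // 24 == \
        (n ** 6 + 3 * n ** 4 + 12 * n ** 3 + 8 * n ** 2) // 24
```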
Chapter 8

Cyclic, Abelian, and Solvable Groups

8.1 Structure of Cyclic Groups


We know that every cyclic group of order n ∈ N+ is isomorphic to Z/nZ, and that every infinite cyclic group
is isomorphic to Z. However, there is still more to say. In this section, we completely determine the subgroup
structure of cyclic groups. We begin with the following important fact.
Proposition 8.1.1. Every subgroup of a cyclic group is cyclic.
Proof. Suppose that G is a cyclic group and that H is a subgroup of G. Fix a generator c of G so that
G = hci. If H = {e} then H = hei, so H is cyclic. Suppose then that H 6= {e}. Since H 6= {e}, there exists
n ∈ Z\{0} such that cn ∈ H. Notice that we also have c−n = (cn )−1 ∈ H because H is a subgroup of G, so

{n ∈ N+ : cn ∈ H} ≠ ∅

Let m be the least element of this set (which exists by well-ordering). We show that H = hcm i. Since
cm ∈ H and H is a subgroup of G, we know that hcm i ⊆ H by Proposition 5.2.2. Let a ∈ H be arbitrary.
Since H ⊆ G, we also have a ∈ G, so we may fix k ∈ Z with a = ck . Write k = qm + r where 0 ≤ r < m.
We then have

a = ck
= cqm+r
= cmq cr ,

hence

cr = (cmq )−1 · a
= c−mq · a
= (cm )−q · a.

Now cm ∈ H and a ∈ H. Since H is a subgroup of G, we know that it is closed under inverses and the group
operation, hence (cm )−q · a ∈ H and so cr ∈ H. By choice of m as the smallest positive power of c which
lies in H, we conclude that we must have r = 0. Therefore,

a = ck = cqm = (cm )q ∈ hcm i.

Since a ∈ H was arbitrary, it follows that H ⊆ hcm i. Combining this with the reverse containment above,
we conclude that H = hcm i, so H is cyclic.
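For a concrete check of this proposition in the additive group Z/12Z, the following Python sketch (an optional aside) enumerates all subgroups by brute force and confirms that each is cyclic; it also finds exactly 6 of them, one for each divisor of 12, foreshadowing Theorem 8.1.5.

```python
n = 12
subgroups = []
for mask in range(1, 2 ** n):            # every nonempty subset of {0,...,11}
    H = {i for i in range(n) if mask >> i & 1}
    if 0 in H and all((a + b) % n in H for a in H for b in H):
        subgroups.append(frozenset(H))   # nonempty, finite, closed => subgroup

def generated(a):
    # the cyclic subgroup <a> of Z/nZ, written additively
    H, x = {0}, a % n
    while x != 0:
        H.add(x)
        x = (x + a) % n
    return frozenset(H)

for H in subgroups:
    assert any(generated(a) == H for a in H)   # every subgroup is cyclic
assert len(subgroups) == 6                     # one per divisor of 12
```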


Therefore, if G is a cyclic group and c is a generator for G, then every subgroup of G can be written in
the form hck i for some k ∈ Z (since every element of G equals ck for some k ∈ Z). It is possible that the
same subgroup can be written in two different such ways. For example, in Z/6Z with generator 1, we have
h2i = {0, 2, 4} = h4i. We next determine when we have such equalities. We begin with the following lemma
that holds in any group.

Lemma 8.1.2. Let G be a group, and let a ∈ G have finite order n ∈ N+ . Let m ∈ Z and let d = gcd(m, n).
We then have that ham i = had i.

Proof. Since d | m, we may fix k ∈ Z with m = dk. We then have

am = adk = (ad )k

Therefore, am ∈ had i and hence ham i ⊆ had i. We now prove the reverse containment. Since d = gcd(m, n),
we may fix s, t ∈ Z with d = ms + nt. Now

ad = ams+nt
= ams ant
= (am )s (an )t
= (am )s et
= (am )s

so ad ∈ ham i, and hence had i ⊆ ham i. Combining the two containments, we conclude that ham i = had i.

For example, in our above case where G = Z/6Z and a = 1, we have |a| = 6 and 4 = a4 . Since
gcd(4, 6) = 2, the lemma immediately implies that ha4 i = ha2 i, and hence h4i = h2i.
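In additive notation for Z/nZ, the lemma says that the subgroup generated by m equals the subgroup generated by gcd(m, n). A brief Python check (an optional aside, not in the original notes):

```python
from math import gcd

def generated(m, n):
    # the cyclic subgroup <m> of the additive group Z/nZ
    H, x = {0}, m % n
    while x != 0:
        H.add(x)
        x = (x + m) % n
    return H

for n in range(1, 30):
    for m in range(n):
        assert generated(m, n) == generated(gcd(m, n), n)
```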

Proposition 8.1.3. Let G be a finite cyclic group of order n ∈ N+ , and let c be an arbitrary generator of
G. If H is a subgroup of G, then H = hcd i for some d ∈ N+ with d | n. Furthermore, if H = hcd i, then
|H| = n/d.

Proof. Let H be a subgroup of G. We know that H is cyclic by Proposition 8.1.1, so we may fix a ∈ G with
H = hai. Since G = hci, we can fix m ∈ Z with a = cm . Let d = gcd(m, n) and notice that d ∈ N+ and
d | n. We then have
H = hai = hcm i = hcd i
by Lemma 8.1.2. For the last claim, notice that if H = hcd i, then

|H| = |hcd i|
= |cd |
= n/gcd(d, n) (by Proposition 4.6.8)
= n/d (since d | n implies gcd(d, n) = d).
This completes the proof.

We can improve on the lemma to determine precisely when two powers of the same element generate the
same subgroup.

Proposition 8.1.4. Let G be a group and let a ∈ G.



1. Suppose that |a| = n ∈ N+ . For any `, m ∈ Z, we have that ha` i = ham i if and only if gcd(`, n) =
gcd(m, n).

2. Suppose that |a| = ∞. For any `, m ∈ Z, we have ha` i = ham i if and only if either ` = m or ` = −m.

Proof. 1. Let `, m ∈ Z be arbitrary.

• Suppose first that ha` i = ham i. By Proposition 4.6.8, we have

|ha` i| = |a` | = n/gcd(`, n)

and

|ham i| = |am | = n/gcd(m, n)

Since ha` i = ham i, it follows that |ha` i| = |ham i|, and hence

n/gcd(`, n) = n/gcd(m, n).

Thus, we have n · gcd(`, n) = n · gcd(m, n). Since n 6= 0, we can divide by it to conclude that
gcd(`, n) = gcd(m, n).
• Suppose now that gcd(`, n) = gcd(m, n), and let d be this common value. By Lemma 8.1.2, we
have both ha` i = had i and ham i = had i. Therefore, ha` i = ham i.

2. Let `, m ∈ Z be arbitrary.

• Suppose first that ha` i = ham i. We then have am ∈ ha` i, so we may fix k ∈ Z with am = (a` )k .
It follows that am = a`k , hence m = `k by Proposition 5.2.3, and so ` | m. Similarly, using the
fact that a` ∈ ham i, it follows that m | `. Since both ` | m and m | `, we can use Corollary 2.2.5
to conclude that either ` = m or ` = −m.
• If ` = m, then we trivially have ha` i = ham i. Suppose then that ` = −m. We then have
a` = a−m = (am )−1 , so a` ∈ ham i and hence ha` i ⊆ ham i. We also have am = a−` = (a` )−1 , so
am ∈ ha` i and hence ham i ⊆ ha` i. Combining both of these, we conclude that ha` i = ham i.

We know by Lagrange’s Theorem that if H is a subgroup of a finite group G, then |H| is a divisor of |G|.
In general, the converse to Lagrange’s Theorem is false, as we saw in Proposition 7.4.2 (although |A4 | = 12,
the group A4 has no subgroup of order 6). However, for cyclic groups, we have the very strong converse that
there is a unique subgroup of order d for each divisor d of |G|.

Theorem 8.1.5. Let G be a cyclic group, and let c be an arbitrary generator of G.

1. Suppose that |G| = n ∈ N+ . For each d ∈ N+ with d | n, there is a unique subgroup of G of order d,
namely hcn/d i = {a ∈ G : ad = e}.

2. Suppose that |G| = ∞. Every subgroup of G equals hcm i for some m ∈ N, and each of these subgroups
are distinct.

Proof. 1. Let d ∈ N+ with d | n. We then have that n/d ∈ Z and d · (n/d) = n. Thus, (n/d) | n, and we have
|hcn/d i| = n/(n/d) = d by Proposition 8.1.3. Therefore, hcn/d i is a subgroup of G of order d.
We next show that hcn/d i = {a ∈ G : ad = e}.

• hcn/d i ⊆ {a ∈ G : ad = e}: Let a ∈ hcn/d i be arbitrary. Fix k ∈ Z with a = (cn/d )k . We then have a = cnk/d , so

ad = cnk = (cn )k = ek = e.

• {a ∈ G : ad = e} ⊆ hcn/d i: Let a ∈ G with ad = e. Since c is a generator for G, we can fix k ∈ Z with a = ck . We then have that

ckd = (ck )d = ad = e

Since |c| = n, we may use Proposition 4.6.5 to conclude that n | kd. Fix m ∈ Z with nm = kd. We then have that (n/d) · m = k, so

(cn/d )m = c(n/d)·m = ck = a

and hence a ∈ hcn/d i.
Putting these containments together, we conclude that hcn/d i = {a ∈ G : ad = e}.
Suppose now that H is an arbitrary subgroup of G with order d. By Lagrange’s Theorem, every
element of H has order dividing d, so hd = e for all h ∈ H. It follows that H ⊆ {a ∈ G : ad = e}, and
since each of these sets have cardinality d, we conclude that H = {a ∈ G : ad = e}.
2. Let H be an arbitrary subgroup of G. We know that H is cyclic by Proposition 8.1.1, so we may fix
a ∈ G with H = hai. Since G = hci, we can fix m ∈ Z with a = cm . If m < 0, then −m > 0, and
H = hcm i = hc−m i by Proposition 8.1.4.
Finally, notice that if m, n ∈ N with m ≠ n, then we also have m ≠ −n (because both m and n are
nonnegative), so hcm i ≠ hcn i by Proposition 8.1.4 again.
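Part 1 can be checked concretely in Z/36Z (written additively, so hcn/d i becomes the set of multiples of n/d). The following optional Python sketch verifies both the order count and the description {a ∈ G : ad = e}; the modulus 36 is just a sample choice.

```python
n = 36  # sample modulus; the group is Z/36Z written additively

def generated(m):
    # the cyclic subgroup <m> of Z/nZ
    H, x = {0}, m % n
    while x != 0:
        H.add(x)
        x = (x + m) % n
    return H

for d in [k for k in range(1, n + 1) if n % k == 0]:
    H = generated(n // d)                # plays the role of <c^(n/d)>
    assert len(H) == d                   # it has order d
    # and it equals {a in G : a^d = e}, i.e. {a : d*a = 0 mod n}
    assert H == {a for a in range(n) if (d * a) % n == 0}
```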

Corollary 8.1.6. The subgroups of Z are precisely hni = nZ = {nk : k ∈ Z} for each n ∈ N, and each of
these are distinct.
Proof. This is immediate from the fact that Z is a cyclic group generated by 1, where the nth power of 1 in additive notation is n · 1 = n for all n ∈ N.
We now determine the number of elements of each order in a cyclic group. We start by determining the
number of generators.
Proposition 8.1.7. Let G be a cyclic group.
1. If |G| = n ∈ N+ , then G has exactly ϕ(n) distinct generators.
2. If |G| = ∞, then G has exactly 2 generators.
Proof. Since G is cyclic, we may fix c ∈ G with G = hci.
1. Suppose that |G| = n ∈ N+ . We then have
G = {c0 , c1 , c2 , . . . , cn−1 }
and we know that ck 6= c` whenever 0 ≤ k < ` < n. Thus, we need only determine how many of these
elements are generators. By Proposition 8.1.4, we have hck i = hc1 i if and only if gcd(k, n) = gcd(1, n).
Since hci = G and gcd(1, n) = 1, we conclude that hck i = G if and only if gcd(k, n) = 1. Therefore,
the number of generators of G equals the number of k ∈ {0, 1, 2, . . . , n − 1} such that gcd(k, n) = 1,
which is ϕ(n).

2. Suppose that |G| = ∞. We then have that G = {ck : k ∈ Z} and that each of these elements are
distinct by Proposition 5.2.3. Thus, we need only determine the number of k ∈ Z such that hck i = G,
which is the number of k in Z such that hck i = hc1 i. Using Proposition 8.1.4, we have hck i = hc1 i if
and only if either k = 1 or k = −1. Thus, G has exactly 2 generators.

Proposition 8.1.8. Let G be a cyclic group of order n and let d | n. We then have that G has exactly ϕ(d)
many elements of order d.
Proof. Let H = hcn/d i = {a ∈ G : ad = e}. We know from Theorem 8.1.5 that H is the unique subgroup of
G having order d. Since H = hcn/d i, we know that H is cyclic, and hence Proposition 8.1.7 tells us that H
has exactly ϕ(d) many elements of order d. Now if a ∈ G is any element with |a| = d, then ad = e, and hence
a ∈ H. Therefore, every element of G with order d lies in H, and hence G has exactly ϕ(d) many elements
of order d.
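An optional Python check of this proposition in Z/24Z, counting the elements of each order directly (the modulus 24 is just a sample choice):

```python
from math import gcd

def phi(n):
    # Euler phi: count of k in {0, ..., n-1} with gcd(k, n) = 1
    return sum(1 for k in range(n) if gcd(k, n) == 1)

def order(a, n):
    # additive order of a in Z/nZ
    k, x = 1, a % n
    while x != 0:
        x = (x + a) % n
        k += 1
    return k

n = 24
for d in [k for k in range(1, n + 1) if n % k == 0]:
    assert sum(1 for a in range(n) if order(a, n) == d) == phi(d)
```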
Corollary 8.1.9. For any n ∈ N+ , we have
n = Σ_{d|n} ϕ(d)

where the summation is over all positive divisors d of n.


Proof. Let n ∈ N+ . Fix any cyclic group G of order n, such as G = Z/nZ. We know that every element
of G has order some divisor of n, and furthermore we know that if d | n, then G has exactly ϕ(d) many
elements of order d. Therefore, the sum on the right-hand side simply counts the number of elements of G
by breaking them up into the various possible orders. Since |G| = n, the result follows.
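The identity can be checked numerically for many n (an optional Python aside, not in the original notes):

```python
from math import gcd

def phi(n):
    # Euler phi: count of k in {0, ..., n-1} with gcd(k, n) = 1
    return sum(1 for k in range(n) if gcd(k, n) == 1)

for n in range(1, 200):
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    assert n == sum(phi(d) for d in divisors)
```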
Notice that the above proposition gives a recursive way to calculate ϕ(n) because
ϕ(n) = n − Σ_{d|n, d<n} ϕ(d)

for all n ∈ N+ . Finally, we end this section by determining when two subgroups of a cyclic group are
subgroups of each other.
Proposition 8.1.10. Let G be a cyclic group.
1. Suppose that |G| = n ∈ N+ . For each d ∈ N+ with d | n, let Hd be the unique subgroup of G of order
d. We then have that Hk ⊆ H` if and only if k | `.
2. Suppose that |G| = ∞. For each k ∈ N, let Hk = hck i. For any k, ` ∈ N, we have Hk ⊆ H` if and only
if ` | k.
Proof. 1. Let k, ` ∈ N+ be arbitrary with k | n and ` | n. Notice that if Hk ⊆ H` , then Hk is a subgroup
of H` , so k | ` by Lagrange’s Theorem. Conversely, suppose that k | `. Let a ∈ Hk be arbitrary. By
Theorem 8.1.5, we have ak = e. Since k | `, we can fix m ∈ Z with ` = mk, and notice that
a` = amk = (ak )m = em = e
so a ∈ H` by Theorem 8.1.5. Therefore, Hk ⊆ H` .
2. Let k, ` ∈ N be arbitrary. If ` | k, then we can fix m ∈ Z with k = m`, in which case we have
ck = (c` )m ∈ hc` i, so hck i ⊆ hc` i and hence Hk ⊆ H` . Conversely, suppose that Hk ⊆ H` . Since
ck ∈ Hk , we then have ck ∈ hc` i, so we can fix m ∈ Z with ck = (c` )m . We then have ck = c`m , so
k = `m by Proposition 5.2.3, and hence ` | k.
Chapter 9

Introduction to Rings

9.1 Definitions and Examples


We now define another fundamental algebraic structure. Whereas groups have one binary operation, rings
come equipped with two. We tend to think of them as “addition” and “multiplication”, although just like
for groups all that matters is the properties that they satisfy.

Definition 9.1.1. A ring is a set R equipped with two binary operations + and · and two elements 0, 1 ∈ R
satisfying the following properties:

1. a + (b + c) = (a + b) + c for all a, b, c ∈ R.

2. a + b = b + a for all a, b ∈ R.

3. a + 0 = a = 0 + a for all a ∈ R.

4. For all a ∈ R, there exists b ∈ R with a + b = 0 = b + a.

5. a · (b · c) = (a · b) · c for all a, b, c ∈ R.

6. a · 1 = a and 1 · a = a for all a ∈ R.

7. a · (b + c) = a · b + a · c for all a, b, c ∈ R.

8. (a + b) · c = a · c + b · c for all a, b, c ∈ R.

Notice that the first four axioms simply say that (R, +, 0) is an abelian group.

Some sources omit property 6 and so do not require that their rings have a multiplicative identity. They
call our rings either “rings with identity” or “rings with 1”. We will not discuss such objects here, and for
us “ring” implies that there is a multiplicative identity. Notice that we are requiring that + is commutative
but we do not require that · is commutative. Also, we are not requiring that elements have multiplicative
inverses.
For example, Z, Q, R, and C are all rings with the standard notions of addition and multiplication along
with the usual 0 and 1. For each n ∈ N+ , the set Mn (R) of all n × n matrices with entries from R is a ring,
where + and · are the usual matrix addition and multiplication, 0 is the zero matrix, and 1 = In is the n × n
identity matrix. Certainly some matrices in Mn (R) fail to be invertible, but that is not a problem because


the ring axioms say nothing about the existence of multiplicative inverses. Notice that the multiplicative
group GLn (R) is not a ring because it is not closed under addition. For example,

( 1 0 )   ( −1  0 )   ( 0 0 )
( 0 1 ) + (  0 −1 ) = ( 0 0 )

but this latter matrix is not an element of GL2 (R). The next proposition gives some examples of finite rings.

Proposition 9.1.2. Let n ∈ N+ . The set Z/nZ of equivalence classes of the relation ≡n with operations

a+b=a+b and a · b = ab

is a ring with additive identity 0 and multiplicative identity 1.

Proof. We already know that (Z/nZ, +, 0) is an abelian group. We proved that multiplication of equivalence
classes given by
a · b = ab
is well-defined in Proposition 3.6.4, so it gives a binary operation on Z/nZ. It is now straightforward to check
that 1 is a multiplicative identity, that · is associative, and that · distributes over addition by appealing to
these facts in Z.
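For a concrete check, the ring operations of Z/nZ (computed on least nonnegative representatives) can be verified against the axioms by brute force. A small sketch of ours for n = 6:

```python
# Model Z/6Z by the representatives {0, ..., 5}; the class operations
# become addition and multiplication mod 6.
n = 6
def add(a, b): return (a + b) % n
def mul(a, b): return (a * b) % n

for a in range(n):
    assert mul(a, 1) == a and mul(1, a) == a                   # 1 is an identity
    for b in range(n):
        for c in range(n):
            assert mul(a, mul(b, c)) == mul(mul(a, b), c)      # associativity of *
            assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))  # distributivity
```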

Initially, it may seem surprising that we require that + is commutative in rings. In fact, this axiom
follows from the others (so logically we could omit it). To see this, suppose that R satisfies all of the above
axioms except possibly axiom 2. Let a, b ∈ R be arbitrary. We then have

(1 + 1) · (a + b) = (1 + 1) · a + (1 + 1) · b
=1·a+1·a+1·b+1·b
=a+a+b+b

and also

(1 + 1) · (a + b) = 1 · (a + b) + 1 · (a + b)
=a+b+a+b

hence
a + a + b + b = a + b + a + b.
Now we are assuming that (R, +, 0) is a group because we have axioms 1, 3, and 4, so by left and right
cancellation we conclude that a + b = b + a.
If R is a ring, then we know that (R, +, 0) is a group. In particular, every a ∈ R has a unique additive
inverse. Since we have another binary operation in multiplication, we choose new notation for the additive
inverse of a (rather than using the old a−1, which looks multiplicative).

Definition 9.1.3. Let R be a ring. Given a ∈ R, we let −a be the unique additive inverse of a, so a+(−a) = 0
and (−a) + a = 0.

Proposition 9.1.4. Let R be a ring.

1. For all a ∈ R, we have a · 0 = 0 = 0 · a.

2. For all a, b ∈ R, we have a · (−b) = −(a · b) = (−a) · b.

3. For all a, b ∈ R, we have (−a) · (−b) = a · b.



4. For all a ∈ R, we have −a = (−1) · a.


Proof.
1. Let a ∈ R be arbitrary. We have

a · 0 = a · (0 + 0)
= (a · 0) + (a · 0).

Therefore
0 + (a · 0) = (a · 0) + (a · 0)
By cancellation in the group (R, +, 0), it follows that a · 0 = 0. Similarly we have

0 · a = (0 + 0) · a
= (0 · a) + (0 · a).

Therefore
0 + (0 · a) = (0 · a) + (0 · a).
By cancellation in the group (R, +, 0), it follows that 0 · a = 0.
2. Let a, b ∈ R be arbitrary. We have

0=a·0 (by 1)
= a · (b + (−b))
= (a · b) + (a · (−b))

Therefore, a · (−b) is the additive inverse of a · b (since + is commutative), which is to say that
a · (−b) = −(a · b). Similarly

0=0·b (by 1)
= (a + (−a)) · b
= (a · b) + ((−a) · b)

so (−a) · b is the additive inverse of a · b, which is to say that (−a) · b = −(a · b).
3. Let a, b ∈ R be arbitrary. Using 2, we have

(−a) · (−b) = −(a · (−b))


= −(−(a · b))
=a·b

where we have used the group theoretic fact that the inverse of the inverse is the original element
(i.e. Proposition 4.2.6).
4. Let a ∈ R be arbitrary. We have

(−1) · a = −(1 · a) (by 2)


= −a

Definition 9.1.5. Let R be a ring. Given a, b ∈ R, we define a − b = a + (−b).

Proposition 9.1.6. If R is a ring with 1 = 0, then R = {0}.

Proof. Suppose that 1 = 0. For every a ∈ R, we have

a = a · 1 = a · 0 = 0

Thus, a = 0 for all a ∈ R. It follows that R = {0}.

Definition 9.1.7. A commutative ring is a ring R such that · is commutative, i.e. such that a · b = b · a for
all a, b ∈ R.

Definition 9.1.8. Let R be a ring. A subring of R is a subset S ⊆ R such that

• S is an additive subgroup of R (so S contains 0, is closed under +, and closed under additive inverses).

• 1∈S

• ab ∈ S whenever a ∈ S and b ∈ S.

Here are two important examples of subrings.


1. Q[√2] = {a + b√2 : a, b ∈ Q} is a subring of R. To see this, first notice that 0 = 0 + 0√2 ∈ Q[√2]
and 1 = 1 + 0√2 ∈ Q[√2]. Suppose that x, y ∈ Q[√2] and fix a, b, c, d ∈ Q with x = a + b√2 and
y = c + d√2. We then have that

x + y = (a + b√2) + (c + d√2) = (a + c) + (b + d)√2,

so x + y ∈ Q[√2]. We also have

−x = −(a + b√2) = (−a) + (−b)√2 ∈ Q[√2],

so −x ∈ Q[√2], and

xy = (a + b√2)(c + d√2) = ac + ad√2 + bc√2 + 2bd = (ac + 2bd) + (ad + bc)√2,

so xy ∈ Q[√2]. Therefore, Q[√2] is a subring of R.

2. Z[i] = {a + bi : a, b ∈ Z} is a subring of C. To see this, first notice that 0 = 0 + 0i ∈ Z[i] and


1 = 1 + 0i ∈ Z[i]. Suppose that x, y ∈ Z[i] and fix a, b, c, d ∈ Z with x = a + bi and y = c + di. We
then have that

x + y = (a + bi) + (c + di)
= (a + c) + (b + d)i

so x + y ∈ Z[i]. We also have

−x = −(a + bi) = (−a) + (−b)i ∈ Z[i]



so −x ∈ Z[i], and also

xy = (a + bi)(c + di)
= ac + adi + bci + bdi2
= (ac − bd) + (ad + bc)i

so xy ∈ Z[i]. Therefore, Z[i] is a subring of C. The ring Z[i] is called the ring of Gaussian integers.
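The closure computations in both examples are mechanical, and mirror how one would implement these rings. A sketch for Q[√2], representing a + b√2 as a pair of exact rationals (the function names are ours):

```python
from fractions import Fraction as Q

# Represent a + b*sqrt(2), with a, b in Q, as the pair (a, b).
def add(x, y):
    (a, b), (c, d) = x, y
    return (a + c, b + d)              # (a + c) + (b + d)sqrt(2)

def mul(x, y):
    (a, b), (c, d) = x, y
    return (a*c + 2*b*d, a*d + b*c)    # (ac + 2bd) + (ad + bc)sqrt(2)

x = (Q(1, 2), Q(3))                    # 1/2 + 3*sqrt(2)
y = (Q(2), Q(-1))                      # 2 - sqrt(2)
# sums and products again have rational coordinates, i.e. lie in Q[sqrt(2)]
assert add(x, y) == (Q(5, 2), Q(2))
assert mul(x, y) == (Q(-5), Q(11, 2))
```

The Gaussian integers work the same way, with integer pairs and the product rule (ac − bd, ad + bc) coming from i2 = −1.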

In our work on group theory, we made extensive use of subgroups of a given group G to understand G.
In our discussion of rings, the concept of subrings will play a much smaller role. Some rings of independent
interest can be seen as subrings of larger rings, such as Q[√2] and Z[i] above. However, we will not typically
try to understand R by looking at its subrings. Partly, this is due to the fact that we will spend a large
amount of time working with infinite rings (as opposed to the significant amount of time we spent on finite
groups, where Lagrange’s Theorem played a key role). As we will see, our attention will turn toward certain
subsets of a ring called ideals which play a role similar to normal subgroups of a group.
Finally, one small note. Some authors use a slightly different definition of a subring in that they do not
require that 1 ∈ S. They use the idea that a subring of R should be a subset S ⊆ R which forms a ring
with the inherited operations, but the multiplicative identity of S could be different from the multiplicative
identity of R. Again, since subrings will not play a particularly important role for us, we will not dwell on
this distinction.

9.2 Units and Zero Divisors


As we mentioned, our ring axioms do not require that elements have multiplicative inverses. Those elements
which do have multiplicative inverses are given special names.

Definition 9.2.1. Let R be a ring. An element u ∈ R is a unit if it has a multiplicative inverse, i.e. there
exists v ∈ R with uv = 1 = vu. We denote the set of units of R by U (R).

In other words, the units of R are the invertible elements of R under the associative operation · with
identity 1 as in Section 4.3. For example, the units in Z are {±1} and the units in Q are Q\{0} (and similarly
for R and C). The units in Mn (R) are the invertible n × n matrices.

Proposition 9.2.2. Let R be a ring and let u ∈ U (R). There is a unique v ∈ R with uv = 1 and vu = 1.

Proof. Existence follows from the assumption that u is a unit, and uniqueness is immediate from Proposition
4.3.2.

Definition 9.2.3. Let R be a ring and let u ∈ U (R). We let u−1 be the unique multiplicative inverse of u,
so uu−1 = 1 and u−1 u = 1.

Proposition 9.2.4. Let R be a ring. The set U (R) forms a group under multiplication with identity 1.

Proof. Immediate from Corollary 4.3.5.

This finally explains our notation for the group U (Z/nZ); namely, we are considering Z/nZ as a ring
and forming the corresponding unit group of that ring. Notice that for any n ∈ N+ , we have GLn (R) =
U (Mn (R)).
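Unit groups like U (Z/nZ) are small enough to enumerate by brute force; a quick sketch:

```python
# u is a unit in Z/nZ exactly when some v satisfies u*v = 1 mod n
# (commutativity makes the other equation v*u = 1 automatic).
def units(n):
    return {u for u in range(n) if any(u * v % n == 1 for v in range(n))}

assert units(6) == {1, 5}
assert units(5) == {1, 2, 3, 4}
assert units(12) == {1, 5, 7, 11}
```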
Let’s recall the multiplication table of Z/6Z:

· 0 1 2 3 4 5
0 0 0 0 0 0 0
1 0 1 2 3 4 5
2 0 2 4 0 2 4
3 0 3 0 3 0 3
4 0 4 2 0 4 2
5 0 5 4 3 2 1

Looking at the table, we see that U (Z/6Z) = {1, 5} as we already know. As remarked when we first saw this
table in Section 4.4, there are some other interesting things. For example, we have 2 · 3 = 0, so it is possible
to have the product of two nonzero elements result in 0. Elements which are part of such a pair are given a
name.
Definition 9.2.5. Let R be a ring. A zero divisor is a nonzero element a ∈ R such that there exists a nonzero
b ∈ R such that either ab = 0 or ba = 0 (or both).
In the above case of Z/6Z, we see that the zero divisors are {2, 3, 4}. The concepts of unit and zero divisor
are antithetical, as we now see.
Proposition 9.2.6. Let R be a ring. No element is both a unit and a zero divisor.
Proof. Suppose that R is a ring and that a ∈ R is a unit. If b ∈ R satisfies ab = 0, then

b=1·b
= (a−1 a) · b
= a−1 · (ab)
= a−1 · 0
=0

Thus, there is no nonzero b ∈ R with ab = 0. Similarly, if b ∈ R satisfies ba = 0, then

b=b·1
= b · (aa−1 )
= (ba) · a−1
= 0 · a−1
=0

Thus, there is no nonzero b ∈ R with ba = 0. It follows that a is not a zero divisor.


Proposition 9.2.7. Let n ∈ N+ and let a ∈ Z.
• If gcd(a, n) = 1, then a is a unit in Z/nZ.
• If gcd(a, n) > 1 and a 6= 0, then a is a zero divisor in Z/nZ.
Proof. The first part is given by Proposition 4.4.3. Suppose that gcd(a, n) > 1 and a 6= 0. Let d = gcd(a, n)
and notice that 0 < d < n because n - a (since we are assuming a 6= 0). We have d | n and d | a, so we may
fix b, c ∈ Z with db = n and dc = a. Notice that 0 < b < n because d, n ∈ N and d > 1, hence b 6= 0. Now

ab = (dc)b = (db)c = nc

so n | ab. It follows that a · b = ab = 0. Since b 6= 0, we conclude that a is a zero divisor.
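This classification is easy to confirm experimentally. A sketch that checks the gcd criterion against direct searches for inverses and zero-divisor partners:

```python
from math import gcd

# In Z/12Z, gcd(a, 12) decides whether a nonzero a is a unit
# or a zero divisor (Proposition 9.2.7).
n = 12
for a in range(1, n):
    is_unit = any(a * b % n == 1 for b in range(1, n))
    is_zero_divisor = any(a * b % n == 0 for b in range(1, n))
    assert is_unit == (gcd(a, n) == 1)
    assert is_zero_divisor == (gcd(a, n) > 1)
```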



Therefore, every nonzero element of Z/nZ is either a unit or a zero divisor. However, this is not true in
every ring. The units in Z are {±1} but there are no zero divisors in Z. We now define three important
classes of rings.

Definition 9.2.8. Let R be a ring.

• R is a division ring if 1 6= 0 and every nonzero element of R is a unit.

• R is a field if R is a commutative division ring. Thus, a field is a commutative ring with 1 6= 0 for
which every nonzero element is a unit.

• R is an integral domain if R is a commutative ring with 1 6= 0 which has no zero divisors. Equivalently,
an integral domain is a commutative ring R with 1 6= 0 such that whenever ab = 0, either a = 0 or
b = 0.

Clearly every field is a division ring. We also have the following.

Proposition 9.2.9. Every field is an integral domain.

Proof. Suppose that R is a field. If a ∈ R is nonzero, then it is a unit, so it is not a zero divisor by Proposition
9.2.6.

For example, each of Q, R, and C are fields. The ring Z is an example of an integral domain which is
not a field, so the concept of integral domain is strictly weaker. There also exist division rings which are not
fields, such as the Hamiltonian Quaternions as discussed in Section 1.2:

H = {a + bi + cj + dk : a, b, c, d ∈ R}

with
i2 = −1    j2 = −1    k2 = −1
ij = k     jk = i     ki = j
ji = −k    kj = −i    ik = −j
We will return to such objects (and will actually prove that H really is a division ring) later.

Corollary 9.2.10. We have the following.

• For each prime p ∈ N+ , the ring Z/pZ is a field (and hence also an integral domain).

• For each composite n ∈ N+ with n ≥ 2, the ring Z/nZ is not an integral domain.

Proof. If p ∈ N+ is prime, then every nonzero element of Z/pZ is a unit by Proposition 9.2.7 (because if
a ∈ {1, 2, . . . , p − 1}, then gcd(a, p) = 1 because p is prime). Suppose that n ∈ N+ with n ≥ 2 is composite.
Fix d ∈ N+ with 1 < d < n such that d | n. We then have that gcd(d, n) = d 6= 1 and d 6= 0, so d is a zero
divisor by Proposition 9.2.7. Therefore, Z/nZ is not an integral domain.

Since there are infinitely many primes, this corollary provides us with an infinite supply of finite fields.
Here are the addition and multiplication tables of Z/5Z to get a picture of one of these objects.

+ 0 1 2 3 4 · 0 1 2 3 4
0 0 1 2 3 4 0 0 0 0 0 0
1 1 2 3 4 0 1 0 1 2 3 4
2 2 3 4 0 1 2 0 2 4 1 3
3 3 4 0 1 2 3 0 3 1 4 2
4 4 0 1 2 3 4 0 4 3 2 1
Another example of a field is the ring Q[√2] discussed in the last section. Since Q[√2] is a subring of R,
it is a commutative ring with 1 ≠ 0. To see that it is a field, we need only check that every nonzero element
has an inverse. Suppose that a, b ∈ Q are arbitrary with a + b√2 ≠ 0.

• We first claim that a − b√2 ≠ 0. Suppose instead that a − b√2 = 0. We then have a = b√2. Now if
b = 0, then we would have a = 0, so a + b√2 = 0, which is a contradiction. Thus, b ≠ 0. Dividing both
sides of a = b√2 by b gives √2 = a/b, contradicting the fact that √2 is irrational. Thus, we must have
a − b√2 ≠ 0.

Since a − b√2 ≠ 0, we have a2 − 2b2 = (a + b√2)(a − b√2) ≠ 0, and

1/(a + b√2) = (1/(a + b√2)) · ((a − b√2)/(a − b√2)) = (a − b√2)/(a2 − 2b2) = a/(a2 − 2b2) + (−b/(a2 − 2b2))√2.

Since a, b ∈ Q, we have both a/(a2 − 2b2) ∈ Q and −b/(a2 − 2b2) ∈ Q, so 1/(a + b√2) ∈ Q[√2].
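The rationalizing computation is effective: it is exactly how one inverts elements of Q[√2] in code. A sketch with exact rational arithmetic (names ours):

```python
from fractions import Fraction as Q

# Invert a nonzero x = a + b*sqrt(2) by multiplying by the conjugate:
# 1/(a + b*sqrt(2)) = a/(a^2 - 2b^2) + (-b/(a^2 - 2b^2))*sqrt(2).
def inverse(a, b):
    d = a * a - 2 * b * b   # nonzero for (a, b) != (0, 0), since sqrt(2) is irrational
    return (a / d, -b / d)

a, b = Q(1), Q(1)           # x = 1 + sqrt(2)
c, e = inverse(a, b)        # gives -1 + sqrt(2)
# multiply back: (a + b*sqrt(2))(c + e*sqrt(2)) = (ac + 2be) + (ae + bc)*sqrt(2)
assert (a * c + 2 * b * e, a * e + b * c) == (Q(1), Q(0))
```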
Although we will spend a bit of time discussing noncommutative rings, the focus of our study will be
commutative rings and often we will be working with integral domains (and sometimes more specifically
with fields). The next proposition is a fundamental tool when working in integral domains. Notice that it
can fail in arbitrary commutative rings. For example, in Z/6Z we have 3 · 2 = 3 · 4, but 2 ≠ 4.
Proposition 9.2.11. Suppose that R is an integral domain and that ab = ac with a 6= 0. We then have that
b = c.
Proof. Since ab = ac, we have ab − ac = 0. Using the distributive law, we see that a(b − c) = 0 (more
formally, we have ab + (−ac) = 0, so ab + a(−c) = 0, hence a(b + (−c)) = 0, and thus a(b − c) = 0). Since R
is an integral domain, either a = 0 or b − c = 0. Now the former is impossible by assumption, so we conclude
that b − c = 0. Adding c to both sides, we conclude that b = c.
Although Z is an example of integral domain which is not a field, it turns out that all such examples are
infinite.
Proposition 9.2.12. Every finite integral domain is a field.
Proof. Suppose that R is a finite integral domain. Let a ∈ R with a 6= 0. Define λa : R → R by letting
λa (b) = ab. Now if b, c ∈ R with λa (b) = λa (c), then ab = ac, so b = c by Proposition 9.2.11. Therefore,
λa : R → R is injective. Since R is finite, it must be the case that λa is surjective. Thus, there exists b ∈ R
with λa (b) = 1, which is to say that ab = 1. Since R is commutative, we also have ba = 1. Therefore, a is a
unit in R. Since a ∈ R with a 6= 0 was arbitrary, it follows that R is a field.
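The counting argument in this proof can be watched in action in Z/pZ: left multiplication by a nonzero element permutes the ring, and the preimage of 1 is the inverse. A sketch:

```python
# In Z/pZ with p prime, the map b -> a*b is injective for a != 0,
# hence (by finiteness) surjective, so it hits 1 (Prop 9.2.12).
p = 7
for a in range(1, p):
    image = [a * b % p for b in range(p)]
    assert sorted(image) == list(range(p))  # lambda_a is a bijection
    b = image.index(1)                      # the preimage of 1
    assert a * b % p == 1                   # b is the inverse of a
```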
We can define direct products of rings just as we did for direct products of groups. The proof that the
componentwise operations give rise to a ring is straightforward.
Definition 9.2.13. Suppose that R1 , R2 , . . . , Rn are all rings. Consider the Cartesian product

R1 × R2 × · · · × Rn = {(a1 , a2 , . . . , an ) : ai ∈ Ri for 1 ≤ i ≤ n}

Define operations + and · on R1 × R2 × · · · × Rn by letting

(a1 , a2 , . . . , an ) + (b1 , b2 , . . . , bn ) = (a1 + b1 , a2 + b2 , . . . , an + bn )


(a1 , a2 , . . . , an ) · (b1 , b2 , . . . , bn ) = (a1 · b1 , a2 · b2 , . . . , an · bn )

where we are using the operations in Ri in the ith components. We then have that R1 × R2 × · · · × Rn with
these operations is a ring which is called the (external) direct product of R1 , R2 , . . . , Rn .
Notice that if R and S are nonzero rings, then R × S is never an integral domain (even if R and S are
both integral domains, or even fields) because (1, 0) · (0, 1) = (0, 0).
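A two-line check of this zero-divisor phenomenon, taking R = S = Z/5Z (both fields):

```python
p = 5
# componentwise multiplication in Z/5Z x Z/5Z
def mul(x, y):
    return (x[0] * y[0] % p, x[1] * y[1] % p)

# (1, 0) and (0, 1) are nonzero, yet their product is zero
assert mul((1, 0), (0, 1)) == (0, 0)
```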

9.3 Polynomial Rings


Given a ring R, we show in this section how to build a new and important ring from R. The idea is to form
the set of all "polynomials with coefficients in R". For example, we are used to working with polynomials
over the real numbers R, which look like

3x7 + 5x4 − 2x2 + 142x − π.

Intuitively, we add and multiply polynomials in the obvious way:


• (4x2 + 3x − 2) + (8x3 − 2x2 − 81x + 14) = 8x3 + 2x2 − 78x + 12
• (2x2 + 5x − 3) · (−7x2 + 2x + 1) = −14x4 − 31x3 + 33x2 − x − 3
In more detail, we multiplied by collecting like powers of x as follows:

(2 · (−7))x4 + (2 · 2 + 5 · (−7))x3 + (2 · 1 + 5 · 2 + (−3) · (−7))x2 + (5 · 1 + (−3) · 2)x + (−3) · 1

In general, we aim to carry over the above idea except we will allow our coefficients to come from a
general ring R. Thus, a typical element should look like:

an xn + an−1 xn−1 + · · · + a1 x + a0

where each ai ∈ R. Before we dive into giving a formal definition, we first discuss polynomials a little more
deeply.
In previous math courses, we typically thought of a polynomial with real coefficients as describing a
certain function from R to R resulting from “plugging in for x”. Thus, when we wrote the polynomial
4x2 + 3x − 2, we were probably thinking of it as the function which sends 1 to 4 + 3 − 2 = 5, sends 2 to
16 + 6 − 2 = 20, etc. In contrast, we will consciously avoid defining a polynomial as the resulting function
obtained by "plugging in for x". To see why this distinction matters, consider the case where we are
working with the ring R = Z/2Z and we have the two polynomials 1x2 + 1x + 1 and 1. These look like
different polynomials on the face of it. However, notice that

1 · 02 + 1 · 0 + 1 = 1

and also

1 · 12 + 1 · 1 + 1 = 1,

so these two distinct polynomials "evaluate" to the same thing whenever we plug in elements of R (namely,
they both produce the same function, i.e. the constant function that outputs 1 for each input). Therefore, the resulting functions
are indeed equal as functions despite the fact the polynomials have distinct forms.
We will enforce this distinction between polynomials and the resulting functions, so we will simply define
our ring in a manner that two different looking polynomials are really distinct elements of our ring. In order
to do this carefully, let’s go back and look at our polynomials. For example, consider the polynomial with
real coefficients given by 5x3 + 9x − 2. Since we are not “plugging in” for x, this polynomial is determined
by its sequence of coefficients. In other words, if we order the sequence from coefficients of smaller powers to
coefficients of larger powers, we can represent this polynomial as the sequence (−2, 9, 0, 5). Performing this
step gets rid of the superfluous x which was really just serving as a placeholder and honestly did not have
any real meaning. We will adopt this perspective and simply define a polynomial to be such a sequence.
However, since polynomials can have arbitrarily large degree, these finite sequences would all be of different
lengths. We get around this problem by defining polynomials as infinite sequences of elements of R in which
only finitely many of the terms are nonzero.

Definition 9.3.1. Let A be a set. An infinite sequence of elements of A is an infinite list a0 , a1 , a2 , . . .


where an ∈ A for all n ∈ N. We use the notation {an }n∈N , or often just {an } (which can be confusing!), for
an infinite sequence. More formally, an infinite sequence is a function f : N → A which we can view as the
sequence of values f (0), f (1), f (2), . . . .

Definition 9.3.2. Let R be a ring. We define a new ring denoted R[x] whose elements are the set of all
infinite sequences {an } of elements of R such that {n ∈ N : an 6= 0} is finite. We define two binary operations
on R[x] as follows.

{an} + {bn} = {an + bn}

and

{an} · {bn} = {a0 bn + a1 bn−1 + · · · + an−1 b1 + an b0} = { ∑_{k=0}^{n} ak bn−k } = { ∑_{i+j=n} ai bj }.

We will see below that this makes R[x] into a ring called the polynomial ring over R.
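In code, this product formula is a convolution of coefficient sequences. A sketch over Z, with a polynomial stored as the finite list [a0, a1, . . . ] of its coefficients (our own representation):

```python
# Multiply coefficient lists: the coefficient of x^n in the product
# is the convolution sum of a_i * b_j over all i + j = n.
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (2x^2 + 5x - 3)(-7x^2 + 2x + 1) = -14x^4 - 31x^3 + 33x^2 - x - 3,
# matching the worked example at the start of this section
assert poly_mul([-3, 5, 2], [1, 2, -7]) == [-3, -1, 33, -31, -14]
```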

Let's pause for a moment to ensure that the above definition makes sense. Suppose that {an} and {bn}
are both infinite sequences of elements of R with only finitely many nonzero terms. Fix M, N ∈ N such
that an = 0 for all n > M and bn = 0 for all n > N. We then have that an + bn = 0 for all n > max{M, N},
so the infinite sequence {an + bn} only has finitely many nonzero terms. Also, we have

∑_{i+j=n} ai bj = 0

for all n > M + N (because if n > M + N and i + j = n, then either i > M or j > N, so either ai = 0
or bj = 0), hence the infinite sequence { ∑_{i+j=n} ai bj } only has finitely many nonzero terms. We now check
that these operations turn R[x] into a ring.

Theorem 9.3.3. Let R be a ring. The set R[x] with the above operations is a ring with additive identity
the infinite sequence 0, 0, 0, 0, . . . and multiplicative identity the infinite sequence 1, 0, 0, 0, . . . . Furthermore,
if R is commutative, then R[x] is commutative.

Proof. Many of these checks are routine. For example, + is associative on R[x] because for any infinite
sequences {an }, {bn }, and {cn }, we have

{an } + ({bn } + {cn }) = {an } + {bn + cn }


= {an + (bn + cn )}
= {(an + bn ) + cn } (since + is associative on R)
= {an + bn } + {cn }
= ({an } + {bn }) + {cn }

The other ring axioms involving addition are completely analogous. However, checking the axioms involv-
ing multiplication is more interesting because our multiplication operation is much more complicated than
componentwise multiplication. For example, let {en } be the infinite sequence 1, 0, 0, 0, . . . , i.e.

en = 1 if n = 0, and en = 0 if n > 0.

Now for any infinite sequence {an}, we have

{en} · {an} = { ∑_{k=0}^{n} ek an−k } = {e0 an−0} = {1 · an} = {an},

where we have used the fact that ek = 0 whenever k > 0. We also have

{an} · {en} = { ∑_{k=0}^{n} ak en−k } = {an e0} = {an · 1} = {an},

where we have used the fact that if 0 ≤ k < n, then n − k ≥ 1, so en−k = 0. Therefore, the infinite sequence
1, 0, 0, 0, . . . is indeed a multiplicative identity of R[x].
The most interesting (i.e. difficult) check is that · is associative on R[x]. We have

{an} · ({bn} · {cn}) = {an} · { ∑_{ℓ=0}^{n} bℓ cn−ℓ }
                     = { ∑_{k=0}^{n} ak · ( ∑_{ℓ=0}^{n−k} bℓ cn−k−ℓ ) }
                     = { ∑_{k=0}^{n} ∑_{ℓ=0}^{n−k} ak (bℓ cn−k−ℓ) }
                     = { ∑_{k+ℓ+m=n} ak (bℓ cm) }

and also

({an} · {bn}) · {cn} = { ∑_{k=0}^{n} ak bn−k } · {cn}
                     = { ∑_{ℓ=0}^{n} ( ∑_{k=0}^{ℓ} ak bℓ−k ) · cn−ℓ }
                     = { ∑_{ℓ=0}^{n} ∑_{k=0}^{ℓ} (ak bℓ−k) cn−ℓ }
                     = { ∑_{k+ℓ+m=n} (ak bℓ) cm }.

Since multiplication in R is associative, we know that ak (bℓ cm) = (ak bℓ) cm for all k, ℓ, m ∈ N, so

{ ∑_{k+ℓ+m=n} ak (bℓ cm) } = { ∑_{k+ℓ+m=n} (ak bℓ) cm }

and hence

{an} · ({bn} · {cn}) = ({an} · {bn}) · {cn}.
One also needs to check the two distributive laws, and also that R[x] is commutative when R is commutative,
but these are notably easier than associativity of multiplication, and are left as an exercise.
The above formal definition of R[x] is precise and useful when trying to prove theorems about polynomials,
but it is terrible to work with intuitively. Thus, when discussing elements of R[x], we will typically use
standard polynomial notation. For example, if we are dealing with Q[x], we will simply write the formal
element

(5, 0, 0, −1/3, 7, 0, 22/7, 0, 0, 0, . . . )

as

5 − (1/3)x3 + 7x4 + (22/7)x6

or

(22/7)x6 + 7x4 − (1/3)x3 + 5,
and we will treat the x as a meaningless placeholder symbol called an indeterminate. This is where the
x comes from in the notation R[x] (if for some reason we want to use a different indeterminate, say t, we
instead use the notation R[t]). Formally, R[x] will be the set of infinite sequences, but we often use this
more straightforward notation in the future when working with polynomials. When working in R[x] with
this notation, we will typically call elements of R[x] by names like p(x) and q(x) and write something like
“Let p(x) = an xn + an−1 xn−1 + · · · + a1 x + a0 be an element of R[x]”. Since I can’t say this enough, do not
simply view this as saying that p(x) is the resulting function. Keep a clear distinction in your mind between
an element of R[x] and the function it represents via “evaluation” that we discuss in Definition 10.1.9 below!
It is also possible to connect up the formal definition of R[x] (as infinite sequences of elements of R with
finitely many nonzero terms) and the more gentle standard notation of polynomials as follows. Every element
a ∈ R can be naturally associated with the sequence (a, 0, 0, 0, . . . ) in R[x]. If we simply define x to be the
sequence (0, 1, 0, 0, 0, . . . ), then working in the ring R[x] it is not difficult to check that x2 = (0, 0, 1, 0, 0, . . . ),
that x3 = (0, 0, 0, 1, 0, . . . ), etc. With these identifications, if we interpret the additions and multiplications
implicit in the polynomial
(22/7)x6 + 7x4 − (1/3)x3 + 5
as their formal counterparts defined above, then everything matches up as we would expect.
Definition 9.3.4. Let R be a ring. Given a nonzero element {an } ∈ R[x], we define the degree of {an } to
be max{n ∈ N : an 6= 0}. In the more relaxed notation, given a nonzero polynomial p(x) ∈ R[x], say

p(x) = an xn + an−1 xn−1 + · · · + a1 x + a0

with an 6= 0, we define the degree of p(x) to be n. We write deg(p(x)) for the degree of p(x). Notice that we
do not define the degree of the zero polynomial.
The next proposition gives some relationship between the degrees of polynomials and the degrees of their
sum/product.
Proposition 9.3.5. Let R be a ring and let p(x), q(x) ∈ R[x] be nonzero.
1. Either p(x) + q(x) = 0 or deg(p(x) + q(x)) ≤ max{deg(p(x)), deg(q(x))}.
2. If deg(p(x)) 6= deg(q(x)), then p(x) + q(x) 6= 0 and deg(p(x) + q(x)) = max{deg(p(x)), deg(q(x))}.
3. Either p(x) · q(x) = 0 or deg(p(x)q(x)) ≤ deg(p(x)) + deg(q(x)).

Proof. We give an argument using our formal definitions. Let p(x) be the sequence {an} and let q(x) be the
sequence {bn}. Let M = deg(p(x)), so that aM ≠ 0 and an = 0 for all n > M. Let N = deg(q(x)), so that
bN ≠ 0 and bn = 0 for all n > N.
1. Suppose that p(x) + q(x) 6= 0. For any n > max{M, N }, we have an + bn = 0 + 0 = 0, so deg(p(x) +
q(x)) ≤ max{M, N }.
2. Suppose that M 6= N . Suppose first that M > N . We then have aM + bM = aM + 0 = aM 6= 0, so
p(x) + q(x) 6= 0 and deg(p(x) + q(x)) ≥ max{deg(p(x)), deg(q(x))}. Combining this with the inequality
in part 1, it follows that deg(p(x) + q(x)) = max{deg(p(x)), deg(q(x))}. A similar argument works if
N > M.
3. Suppose that p(x)q(x) ≠ 0. Let n > M + N and consider the sum

∑_{k=0}^{n} ak bn−k.

Notice that if k > M, then ak = 0, so ak bn−k = 0. Also, if k ≤ M, then n − k ≥ n − M > N, hence
bn−k = 0 and so ak bn−k = 0. Thus, ak bn−k = 0 for all k with 0 ≤ k ≤ n, so it follows that

∑_{k=0}^{n} ak bn−k = 0.

Therefore, deg(p(x)q(x)) ≤ M + N.

Notice that in the ring Z/6Z[x], we have

(2x2 + 5x + 4)(3x + 1) = 5x2 + 5x + 4,

so the product of a degree 2 polynomial and a degree 1 polynomial results in a degree 2 polynomial. It
follows that we can indeed have a strict inequality in the latter case. In fact, we can have the product of two
nonzero polynomials result in the zero polynomial, i.e. there may exist zero divisors in R[x]. For example,
working in Z/6Z[x] again, we have

(4x + 2) · 3x2 = 0.
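Both Z/6Z[x] products can be recomputed by convolution with coefficients reduced mod 6 (a quick sketch; coefficient lists run [a0, a1, . . . ]):

```python
# Multiply polynomials over Z/nZ by convolving coefficient lists mod n.
def poly_mul_mod(p, q, n=6):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % n
    return out

# (2x^2 + 5x + 4)(3x + 1): the would-be x^3 coefficient 2*3 = 6
# vanishes mod 6, so the degree of the product drops to 2
assert poly_mul_mod([4, 5, 2], [1, 3]) == [4, 5, 5, 0]
# (4x + 2) * 3x^2: every coefficient of the product is divisible by 6
assert poly_mul_mod([2, 4], [0, 0, 3]) == [0, 0, 0, 0]
```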
Fortunately, for well-behaved rings, the degree of the product always equals the sum of the degrees.
Proposition 9.3.6. Let R be an integral domain. If p(x), q(x) ∈ R[x] are both nonzero, then p(x)q(x) 6= 0
and deg(p(x)q(x)) = deg(p(x)) + deg(q(x)).
Proof. Let p(x) be the sequence {an} and let q(x) be the sequence {bn}. Let M = deg(p(x)), so that aM ≠ 0
and an = 0 for all n > M. Let N = deg(q(x)), so that bN ≠ 0 and bn = 0 for all n > N. Now consider

∑_{k=0}^{M+N} ak bM+N−k.

Notice that if k > M, then ak = 0, so ak bM+N−k = 0. Also, if k < M, then M + N − k = N + (M − k) > N,
hence bM+N−k = 0 and so ak bM+N−k = 0. Therefore, we have

∑_{k=0}^{M+N} ak bM+N−k = aM bM+N−M = aM bN.

Now we have aM ≠ 0 and bN ≠ 0, so since R is an integral domain we know that aM bN ≠ 0. Therefore,

∑_{k=0}^{M+N} ak bM+N−k = aM bN ≠ 0.

It follows that p(x)q(x) ≠ 0, and furthermore that deg(p(x)q(x)) ≥ M + N. Since we already know that
Corollary 9.3.7. If R is an integral domain, then R[x] is an integral domain. Hence, if F is a field, then
F [x] is an integral domain.
Proof. Immediate from Proposition 9.3.6 and Proposition 9.2.9.
Proposition 9.3.8. Let R be an integral domain. The units in R[x] are precisely the units in R. In other
words, if we identify an element of R with the corresponding constant polynomial, then U (R[x]) = U (R).
Proof. Recall that the identity of R[x] is the constant polynomial 1. If a ∈ U (R), then considering a and
a−1 and 1 all as constant polynomials in R[x], we have aa−1 = 1 and a−1 a = 1, so a ∈ U (R[x]). Suppose
conversely that p(x) ∈ R[x] is a unit. Fix q(x) ∈ R[x] with p(x) · q(x) = 1 = q(x) · p(x). Notice that we must
have both p(x) and q(x) be nonzero because 1 6= 0. Using Proposition 9.3.6, we then have

0 = deg(1) = deg(p(x) · q(x)) = deg(p(x)) + deg(q(x))

Since deg(p(x)), deg(q(x)) ∈ N, it follows that deg(p(x)) = 0 = deg(q(x)). Therefore, p(x) and q(x) are both
nonzero constant polynomials, say p(x) = a and q(x) = b. We then have ab = 1 and ba = 1 in R[x], so these
equations are true in R as well. It follows that a ∈ U (R).
This proposition can be false when R is only a commutative ring (see the homework).
Corollary 9.3.9. Let F be a field. The units in F [x] are precisely the nonzero constant polynomials. In other
words, if we identify an element of F with the corresponding constant polynomial, then U (F [x]) = F \{0}.
Proof. Immediate from Proposition 9.3.8 and the fact that U (F ) = F \{0}.
Recall Theorem 2.3.1, which said that if a, b ∈ Z and b ≠ 0, then there exist q, r ∈ Z with

a = qb + r

and 0 ≤ r < |b|. Furthermore, the q and r are unique. Intuitively, if b ≠ 0, then we can always divide by
b and obtain a “smaller” remainder. This simple result was fundamental in proving results about greatest
common divisors (such as the fact that they exist!). If we hope to generalize this to other rings, we need a
way to interpret “smaller” in new ways. Fortunately, for polynomial rings, we can use the notion of degrees.
We’re already used to this in R[x] from polynomial long division. However, in R[x] for a general ring R,
things may not work out so nicely. The process of polynomial long division involves dividing by the leading
coefficient of the divisor, so we need to assume that it is a unit.
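As a concrete illustration, here is a sketch of division with remainder in (Z/pZ)[x], where every nonzero leading coefficient is a unit; the representation (coefficient lists [a0, a1, . . . ]) and function names are ours:

```python
# Divide f by g in (Z/pZ)[x], p prime, g nonzero mod p.
# Returns (q, r) with f = q*g + r and r = [] or deg(r) < deg(g).
def poly_divmod(f, g, p):
    f = [c % p for c in f]
    lead_inv = pow(g[-1], -1, p)        # inverse of the leading coefficient of g
    q = [0] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        c = f[-1] * lead_inv % p        # next quotient coefficient
        shift = len(f) - len(g)
        q[shift] = c
        for i, b in enumerate(g):       # subtract c * x^shift * g(x)
            f[shift + i] = (f[shift + i] - c * b) % p
        while f and f[-1] == 0:         # discard the killed top terms
            f.pop()
    return q, f

# In (Z/5Z)[x]: x^2 + 1 = (x + 4)(x + 1) + 2
q, r = poly_divmod([1, 0, 1], [1, 1], 5)
assert q == [4, 1] and r == [2]
```

The `pow(b, -1, p)` modular-inverse form requires Python 3.8+; each loop pass kills the top term of the dividend, mirroring the inductive step in the proof below.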
Theorem 9.3.10. Let R be a ring, and let f (x), g(x) ∈ R[x] with g(x) 6= 0. Write

g(x) = bm xm + bm−1 xm−1 + · · · + b1 x + b0

where bm 6= 0.
1. If bm ∈ U (R), then there exist q(x), r(x) ∈ R[x] with f (x) = q(x) · g(x) + r(x) and either r(x) = 0 or
deg(r(x)) < deg(g(x)).

2. If R is an integral domain, then there exists at most one pair q(x), r(x) ∈ R[x] with f (x) = q(x) · g(x) +
r(x) and either r(x) = 0 or deg(r(x)) < deg(g(x)).
Proof. 1. Suppose that bm ∈ U (R). We prove the existence of q(x) and r(x) for all f (x) ∈ R[x] by
induction on deg(f (x)). We begin by handling some simple cases that will serve as base cases for our
induction. Notice first that if f (x) = 0, then we may take q(x) = 0 and r(x) = 0 because

f (x) = 0 · g(x) + 0.

Also, if f(x) ≠ 0 but deg(f(x)) < deg(g(x)), then we may take q(x) = 0 and r(x) = f(x) because

f (x) = 0 · g(x) + f (x).

We handle all other polynomials f (x) using induction on deg(f (x)). Suppose then that deg(f (x)) ≥
deg(g(x)) and that we know the existence result is true for all p(x) ∈ R[x] with either p(x) = 0 or
deg(p(x)) < deg(f (x)). Write

f(x) = a_n x^n + a_{n-1} x^{n-1} + · · · + a_1 x + a_0

where a_n ≠ 0. Since we are assuming that deg(f(x)) ≥ deg(g(x)), we have n ≥ m. Consider the
polynomial
p(x) = f(x) − a_n b_m^{-1} x^{n-m} · g(x).
We have

p(x) = f(x) − a_n b_m^{-1} x^{n-m} · g(x)
     = (a_n x^n + a_{n-1} x^{n-1} + · · · + a_0) − a_n b_m^{-1} x^{n-m} · (b_m x^m + b_{m-1} x^{m-1} + · · · + b_0)
     = (a_n x^n + a_{n-1} x^{n-1} + · · · + a_0) − (a_n b_m^{-1} b_m x^n + a_n b_m^{-1} b_{m-1} x^{n-1} + · · · + a_n b_m^{-1} b_0 x^{n-m})
     = (a_n x^n + a_{n-1} x^{n-1} + · · · + a_0) − (a_n x^n + a_n b_m^{-1} b_{m-1} x^{n-1} + · · · + a_n b_m^{-1} b_0 x^{n-m})
     = (a_{n-1} − a_n b_m^{-1} b_{m-1}) x^{n-1} + · · · + (a_{n-m} − a_n b_m^{-1} b_0) x^{n-m} + a_{n-m-1} x^{n-m-1} + · · · + a_0.

Therefore, either p(x) = 0 or deg(p(x)) < n = deg(f(x)). By induction, we may fix q∗(x), r∗(x) ∈ R[x]
with
p(x) = q∗(x) · g(x) + r∗(x)
where either r∗(x) = 0 or deg(r∗(x)) < deg(g(x)). We then have

f(x) − a_n b_m^{-1} x^{n-m} · g(x) = q∗(x) · g(x) + r∗(x),

hence

f(x) = a_n b_m^{-1} x^{n-m} · g(x) + q∗(x) · g(x) + r∗(x)
     = (a_n b_m^{-1} x^{n-m} + q∗(x)) · g(x) + r∗(x).

Thus, if we let q(x) = a_n b_m^{-1} x^{n-m} + q∗(x) and r(x) = r∗(x), we have either r(x) = 0 or deg(r(x)) <
deg(g(x)), so we have proven existence for f(x).
2. Suppose that R is an integral domain. Suppose that

q1 (x) · g(x) + r1 (x) = f (x) = q2 (x) · g(x) + r2 (x)

where qi (x), ri (x) ∈ R[x] and either ri (x) = 0 or deg(ri (x)) < deg(g(x)) for each i. We then have

q1 (x) · g(x) − q2 (x) · g(x) = r2 (x) − r1 (x)


162 CHAPTER 9. INTRODUCTION TO RINGS

hence
(q1 (x) − q2 (x)) · g(x) = r2 (x) − r1 (x)
Suppose that r2(x) − r1(x) ≠ 0. Since R is an integral domain, we know that R[x] is an integral domain
by Corollary 9.3.7. Thus, we must have q1(x) − q2(x) ≠ 0 and g(x) ≠ 0, and also

deg(r2 (x) − r1 (x)) = deg(q1 (x) − q2 (x)) + deg(g(x)) ≥ deg(g(x))

by Proposition 9.3.6. However, this is a contradiction because deg(r2 (x) − r1 (x)) < deg(g(x)) since
for each i, we have either ri (x) = 0 or deg(ri (x)) < deg(g(x)). We conclude that we must have
r2 (x) − r1 (x) = 0 and thus r1 (x) = r2 (x). Canceling this common term from

q1 (x) · g(x) − q2 (x) · g(x) = r2 (x) − r1 (x),

we conclude that
q1 (x) · g(x) = q2 (x) · g(x)
Since g(x) ≠ 0 and R[x] is an integral domain, it follows that q1(x) = q2(x) as well.

Corollary 9.3.11. Let F be a field, and let f(x), g(x) ∈ F[x] with g(x) ≠ 0. There exist unique q(x), r(x) ∈
F [x] with f (x) = q(x) · g(x) + r(x) and either r(x) = 0 or deg(r(x)) < deg(g(x)).
Proof. Immediate from the previous theorem together with the fact that fields are integral domains, and
U (F ) = F \{0}.
Let's compute an example when F = Z/7Z. Working in Z/7Z[x], let

f(x) = 3x^4 + 6x^3 + 1x^2 + 2x + 2,

and let
g(x) = 2x^2 + 5x + 1.
We perform long division, i.e. follow the proof, to find q(x) and r(x). Notice that the leading coefficient of
g(x) is 2, and that in Z/7Z we have 2^{-1} = 4. We begin by computing

3 · 4 · x^{4−2} = 5x^2.

This will be the first term in our resulting quotient. We then multiply this by g(x) and subtract from f (x)
to obtain

f(x) − 5x^2 · g(x) = (3x^4 + 6x^3 + 1x^2 + 2x + 2) − 5x^2 · (2x^2 + 5x + 1)
                   = (3x^4 + 6x^3 + 1x^2 + 2x + 2) − (3x^4 + 4x^3 + 5x^2)
                   = 2x^3 + 3x^2 + 2x + 2.

We now continue on with this new polynomial (this is where we appealed to induction in the proof) as our
“new” f (x). We follow the proof recursively and compute

2 · 4 · x = 1x

This will be our next term in the quotient. We now subtract 1x · g(x) from our current polynomial to obtain

(2x^3 + 3x^2 + 2x + 2) − x · g(x) = (2x^3 + 3x^2 + 2x + 2) − x · (2x^2 + 5x + 1)
                                  = (2x^3 + 3x^2 + 2x + 2) − (2x^3 + 5x^2 + 1x)
                                  = 5x^2 + 1x + 2.

Continuing on, we compute


5·4=6
and add this to our quotient. We now subtract 6 · g(x) from our current polynomial to obtain

(5x^2 + 1x + 2) − 6 · g(x) = (5x^2 + 1x + 2) − 6 · (2x^2 + 5x + 1)
                           = (5x^2 + 1x + 2) − (5x^2 + 2x + 6)
                           = 6x + 3.

We have arrived at a point where our polynomial has degree less than that of g(x), so we have bottomed out
in the above proof at a base case. Adding up our contributions to the quotient gives

q(x) = 5x^2 + 1x + 6,

and we are left with the remainder


r(x) = 6x + 3.
Therefore, we have written

3x^4 + 6x^3 + 1x^2 + 2x + 2 = (5x^2 + 1x + 6) · (2x^2 + 5x + 1) + (6x + 3).
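Readers who want to experiment with this algorithm can mechanize it. The following Python sketch of the division procedure from Theorem 9.3.10 over Z/pZ is our own illustration (the function name and the coefficient-list representation are not from the text); it assumes p is prime, so that every nonzero leading coefficient of g(x) is automatically a unit.

```python
# Sketch of the division algorithm in (Z/pZ)[x] from Theorem 9.3.10.
# A polynomial is a list of coefficients [a0, a1, ..., an], lowest degree first.

def poly_divmod_mod_p(f, g, p):
    """Return (q, r) with f = q*g + r and r = 0 or deg(r) < deg(g), mod p."""
    f = [a % p for a in f]
    g = [b % p for b in g]
    while g and g[-1] == 0:          # trim so g[-1] is the leading coefficient
        g.pop()
    m = len(g) - 1
    inv = pow(g[-1], -1, p)          # b_m^{-1}; exists because p is prime
    q = [0] * max(len(f) - m, 1)
    r = f[:]
    while r and r[-1] == 0:
        r.pop()
    while r and len(r) - 1 >= m:
        n = len(r) - 1
        c = (r[n] * inv) % p         # next quotient coefficient a_n * b_m^{-1}
        q[n - m] = c
        for i in range(m + 1):       # subtract c * x^(n-m) * g(x)
            r[n - m + i] = (r[n - m + i] - c * g[i]) % p
        while r and r[-1] == 0:
            r.pop()
    return q, r

# The example above: f = 3x^4 + 6x^3 + x^2 + 2x + 2, g = 2x^2 + 5x + 1 over Z/7Z.
q, r = poly_divmod_mod_p([2, 2, 1, 6, 3], [1, 5, 2], 7)
print(q, r)  # [6, 1, 5] [3, 6], i.e. q = 5x^2 + x + 6 and r = 6x + 3
```

Running it on the example reproduces each step of the long division above, arriving at the same quotient and remainder.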

Notice that if R is not a field, then such q(x) and r(x) may not exist, even if R is a nice ring like Z. For
example, there are no q(x), r(x) ∈ Z[x] with

x^2 = q(x) · 2x + r(x)

and either r(x) = 0 or deg(r(x)) < deg(2x) = 1. To see this, first notice that r(x) would have to be a
constant polynomial, say r(x) = c. If

q(x) = a_n x^n + a_{n-1} x^{n-1} + · · · + a_1 x + a_0

with a_n ≠ 0, then we would have

x^2 = 2a_n x^{n+1} + 2a_{n-1} x^n + · · · + 2a_1 x^2 + 2a_0 x + c.

Since 2a_n ≠ 0 (because Z is an integral domain), we must have n = 1 and hence

x^2 = 2a_1 x^2 + 2a_0 x + c.

It follows that 2a_1 = 1, which is a contradiction because 2 ∤ 1 in Z. Thus, no such q(x) and r(x) exist in this
case.
Furthermore, if R is not an integral domain, we may not have uniqueness. For example, in Z/6Z with
f(x) = 4x^2 + 1 and g(x) = 2x, we have

4x^2 + 1 = 2x · 2x + 1
4x^2 + 1 = 5x · 2x + 1
4x^2 + 1 = (5x + 3) · 2x + 1,

so we can obtain several different quotients. We can even have different quotients and remainders. If we
stay in Z/6Z[x] but use f(x) = 4x^2 + 2x and g(x) = 2x + 1, then we have

4x^2 + 2x = 2x · (2x + 1) + 0
4x^2 + 2x = (2x + 3) · (2x + 1) + 3.
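These identities are easy to verify by brute force. The helper functions below are our own hypothetical names, not from the text; they implement the convolution product and pointwise sum of coefficient lists, reduced mod 6.

```python
# Verify the non-unique quotients in Z/6Z[x]. Polynomials are coefficient
# lists [a0, a1, ...], lowest degree first.

def poly_mul_mod(a, b, n):
    """Product of two polynomials with coefficients reduced mod n."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % n
    return out

def poly_add_mod(a, b, n):
    """Sum of two polynomials mod n (pad the shorter one with zeros)."""
    length = max(len(a), len(b))
    a = a + [0] * (length - len(a))
    b = b + [0] * (length - len(b))
    return [(x + y) % n for x, y in zip(a, b)]

f = [1, 0, 4]                                        # 4x^2 + 1
g = [0, 2]                                           # 2x
for q in ([0, 2], [0, 5], [3, 5]):                   # 2x, 5x, and 5x + 3
    print(poly_add_mod(poly_mul_mod(q, g, 6), [1], 6))   # [1, 0, 4] every time
```

All three loop iterations print the coefficient list of 4x^2 + 1, confirming the three distinct quotients.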

9.4 Further Examples of Rings


Power Series Rings
When we defined R[x], the ring of polynomials with coefficients in R, we restricted our infinite sequences to
have only a finite number of nonzero terms so that they “correspond” to polynomials. However, the entire
apparatus we constructed goes through without a hitch if we take the set of all infinite sequences with no
restriction. Intuitively, an infinite sequence {an } corresponds to the “power series”

a_0 + a_1 x + a_2 x^2 + a_3 x^3 + · · ·

The nice thing about defining our objects as infinite sequences is that there is no confusion at all about
“plugging in for x”, because there is no x. Thus, there are no issues about convergence or whether this is a
well-defined function at all. We define our + and · on these infinite sequences exactly as in the polynomial
ring case, and the proofs of the ring axioms follow word for word (actually, the proof is a bit easier because
we don’t have to worry about the resulting sequences having only a finite number of nonzero terms).

Definition 9.4.1. Let R be a ring. Let R[[x]] be the set of all infinite sequences {a_n} where each a_n ∈ R.
We define two binary operations on R[[x]] as follows:

{a_n} + {b_n} = {a_n + b_n}

and

{a_n} · {b_n} = {a_0 b_n + a_1 b_{n-1} + · · · + a_{n-1} b_1 + a_n b_0}
             = { Σ_{k=0}^{n} a_k b_{n-k} }
             = { Σ_{i+j=n} a_i b_j }.

Theorem 9.4.2. Let R be a ring. The above operations make R[[x]] into a ring with additive identity the
infinite sequence 0, 0, 0, 0, . . . and multiplicative identity the infinite sequence 1, 0, 0, 0, . . . . Furthermore, if
R is commutative, then R[[x]] is commutative. We call R[[x]] the ring of formal power series over R, or
simply the power series ring over R.

If you have worked with generating functions in combinatorics as strictly combinatorial objects (i.e. you
did not worry about values of x where convergence made sense, or work with the resulting function on the
restricted domain), then you were in fact working in the ring R[[x]] (or perhaps the larger C[[x]]). From
another perspective, if you have worked with infinite series, then you know that

1/(1 − x) = 1 + x + x^2 + x^3 + · · ·

for all real numbers x with |x| < 1. You have probably used this fact when you worked with generating
functions as well, even if you weren't thinking about these as functions. If you are not thinking about
the above equality in terms of functions, and thus ignore issues about convergence on the right, then
what you are really doing is working in the ring R[[x]] and saying that 1 − x is a unit with inverse
1 + x + x^2 + x^3 + · · ·. We now verify this fact. In fact, for any ring R, we have

(1 − x)(1 + x + x^2 + x^3 + · · ·) = 1    and    (1 + x + x^2 + x^3 + · · ·)(1 − x) = 1



in R[[x]]. To see this, one can simply multiply out the left-hand sides naively and notice that the constant
term is 1, while the coefficient of x^k equals 0 for all k ≥ 1. For example, we have

(1 − x)(1 + x + x^2 + x^3 + · · ·) = 1 · 1 + (1 · 1 + (−1) · 1) · x + (1 · 1 + (−1) · 1) · x^2 + · · ·
                                  = 1 + 0 · x + 0 · x^2 + · · ·
                                  = 1

and similarly for the other order. More formally, we can argue this as follows. Let R be a ring. Let {a_n} be
the infinite sequence defined by

1
 if n = 0
an = −1 if n = 1

0 otherwise

and notice that {an } is the formal version of 1 − x. Let {bn } be the infinite sequence defined by bn = 1
for all n ∈ N, and notice that {bn } is the formal version of 1 + x + x2 + . . . . Finally, let {en } be the finite
sequence defined by
(
1 if n = 0
en =
0 otherwise

and notice that {e_n} is the formal version of 1. Now a_0 · b_0 = 1 · 1 = 1 = e_0, and for any n ∈ N+, we have

Σ_{k=0}^{n} a_k b_{n-k} = a_0 b_n + a_1 b_{n-1}    (since a_k = 0 if k ≥ 2)
                        = 1 · 1 + (−1) · 1
                        = 1 − 1
                        = 0
                        = e_n
Therefore {a_n} · {b_n} = {e_n}. The proof that {b_n} · {a_n} = {e_n} is similar.
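One can also confirm this identity coefficient by coefficient with a machine, at least up to a truncation. The following Python sketch (function name ours) computes the convolution product of Definition 9.4.1 on the first N terms of the two sequences.

```python
# Check that (1 - x)(1 + x + x^2 + ...) = 1 in R[[x]], coefficient by
# coefficient, truncating both series to their first N terms.

def cauchy_product(a, b):
    """First min(len(a), len(b)) terms of the product {sum of a_k * b_{n-k}}."""
    N = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(N)]

N = 20
one_minus_x = [1, -1] + [0] * (N - 2)   # the sequence 1, -1, 0, 0, ...
geometric = [1] * N                     # the sequence 1, 1, 1, ...
print(cauchy_product(one_minus_x, geometric))   # [1, 0, 0, ..., 0]
```

Every coefficient past the constant term cancels to 0, exactly as in the argument above.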

Matrix Rings
One of our fundamental examples of a ring is the ring Mn(ℝ) of all n × n matrices with real entries. It
turns out that we can generalize this construction to Mn(R) for any ring R.

Definition 9.4.3. Let R be a ring and let n ∈ N+. We let Mn(R) be the ring of all n × n matrices with
entries in R with the following operations. Writing an element of Mn(R) as [a_{i,j}], we define

[a_{i,j}] + [b_{i,j}] = [a_{i,j} + b_{i,j}]    and    [a_{i,j}] · [b_{i,j}] = [ Σ_{k=1}^{n} a_{i,k} b_{k,j} ]

With these operations, Mn(R) is a ring with additive identity the matrix of all zeros and multiplicative
identity the matrix [e_{i,j}] with all zeros except for ones on the diagonal, i.e.

e_{i,j} = 1 if i = j,    e_{i,j} = 0 otherwise.

The verification of the ring axioms on Mn(R) is mostly straightforward (we know the necessary results
in the case R = ℝ from linear algebra). The hardest check once again is that · is associative. We formally
carry that out now. Given matrices [a_{i,j}], [b_{i,j}], and [c_{i,j}] in Mn(R), we have

[a_{i,j}] · ([b_{i,j}] · [c_{i,j}]) = [a_{i,j}] · [ Σ_{ℓ=1}^{n} b_{i,ℓ} c_{ℓ,j} ]
                                    = [ Σ_{k=1}^{n} a_{i,k} · ( Σ_{ℓ=1}^{n} b_{k,ℓ} c_{ℓ,j} ) ]
                                    = [ Σ_{k=1}^{n} Σ_{ℓ=1}^{n} a_{i,k} (b_{k,ℓ} c_{ℓ,j}) ]
                                    = [ Σ_{k=1}^{n} Σ_{ℓ=1}^{n} (a_{i,k} b_{k,ℓ}) c_{ℓ,j} ]
                                    = [ Σ_{ℓ=1}^{n} Σ_{k=1}^{n} (a_{i,k} b_{k,ℓ}) c_{ℓ,j} ]
                                    = [ Σ_{ℓ=1}^{n} ( Σ_{k=1}^{n} a_{i,k} b_{k,ℓ} ) · c_{ℓ,j} ]
                                    = [ Σ_{k=1}^{n} a_{i,k} b_{k,j} ] · [c_{i,j}]
                                    = ([a_{i,j}] · [b_{i,j}]) · [c_{i,j}]

One nice thing about this more general construction of matrix rings is that it provides us with a decent
supply of noncommutative rings.

Proposition 9.4.4. Let R be a ring with 1 ≠ 0. For each n ≥ 2, the ring Mn(R) is noncommutative.

Proof. We claim that the matrix of all zeros except for a 1 in the (1, 2) position does not commute with the
matrix of all zeros except for a 1 in the (2, 1) position. To see this, let A = [a_{i,j}] be the matrix where

a_{i,j} = 1 if i = 1 and j = 2,    a_{i,j} = 0 otherwise,

and let B = [b_{i,j}] be the matrix where

b_{i,j} = 1 if i = 2 and j = 1,    b_{i,j} = 0 otherwise.

Now the (1, 1) entry of AB equals

Σ_{k=1}^{n} a_{1,k} b_{k,1} = a_{1,1} b_{1,1} + a_{1,2} b_{2,1} + a_{1,3} b_{3,1} + · · · + a_{1,n} b_{n,1}
                            = 0 + 1 + 0 + · · · + 0
                            = 1

while the (1, 1) entry of BA equals

Σ_{k=1}^{n} b_{1,k} a_{k,1} = b_{1,1} a_{1,1} + b_{1,2} a_{2,1} + b_{1,3} a_{3,1} + · · · + b_{1,n} a_{n,1}
                            = 0 + 0 + 0 + · · · + 0
                            = 0

Therefore, AB ≠ BA, and hence Mn(R) is noncommutative.

In particular, this construction gives us examples of finite noncommutative rings. For example, the ring
M2(Z/2Z) is a noncommutative ring with 2^4 = 16 elements.
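The 2 × 2 case of the proof above is easy to check concretely. The sketch below (helper name ours) multiplies the two matrices A and B from the proof over Z/2Z.

```python
# The matrices from Proposition 9.4.4 in M_2(Z/2Z): A has a single 1 in
# position (1,2), B has a single 1 in position (2,1).

def mat_mul_mod(A, B, n):
    """Product of two 2x2 matrices with entries reduced mod n."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % n for j in range(2)]
            for i in range(2)]

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
print(mat_mul_mod(A, B, 2))   # [[1, 0], [0, 0]]
print(mat_mul_mod(B, A, 2))   # [[0, 0], [0, 1]]  -- so AB != BA
```

The (1, 1) entries of the two products are 1 and 0 respectively, matching the computation in the proof.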

Rings of Functions Under Pointwise Operations


Consider the set F of all functions f : ℝ → ℝ. For example, the function f(x) = x^2 + 1 is an element of F,
as are g(x) = sin x and h(x) = e^x. We add and multiply functions pointwise as usual, so f · g is the function
(f · g)(x) = (x^2 + 1) sin x. Since ℝ is a ring and everything happens pointwise, it's not difficult to see that
F is a ring under these pointwise operations, with 0 equal to the constant function f (x) = 0, and 1 equal to
the constant function f (x) = 1. In fact, we can vastly generalize this construction.

Definition 9.4.5. Let X be a set and let R be a ring. Consider the set F of all functions f : X → R. We
define + and · on F to be pointwise addition and multiplication, i.e. given f, g ∈ F, we define f + g : X → R
to be the function such that
(f + g)(x) = f (x) + g(x)

for all x ∈ X, and we define f · g : X → R to be the function such that

(f · g)(x) = f (x) · g(x)

for all x ∈ X. With these operations, F is a ring with additive identity equal to the constant function f (x) = 0
and multiplicative identity equal to the constant function f (x) = 1. Furthermore, if R is commutative, then
F is commutative.

Notice that if R is a ring and n ∈ N, then the direct product R^n = R × R × · · · × R can be viewed as a special
case of this construction where X = {1, 2, . . . , n}. To see this, notice that a function f : {1, 2, . . . , n} → R
corresponds naturally to an n-tuple (a_1, a_2, . . . , a_n) with each a_i ∈ R by letting a_i = f(i). Since the
operations in both F and R^n are pointwise, the operations of + and · correspond as well. Although these
two descriptions really are just different ways of defining the same object, we can formally verify that these
rings are indeed isomorphic once we define ring isomorphisms in the next section.
Now if R is a ring and we let X = N, then elements of F correspond to infinite sequences {a_n} of
elements of R. Although this is the same underlying set as R[[x]], and both F and R[[x]] have the same addition
operation (namely {a_n} + {b_n} = {a_n + b_n}), the multiplication operations are different. In F, we have
pointwise multiplication, so {a_n} · {b_n} = {a_n · b_n}, while the operation in R[[x]] is more complicated (and
more interesting!).
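The contrast between the two products is easy to see on a small example. The following Python snippet (our own illustration, not from the text) multiplies the all-ones sequence by itself both ways, truncated to six terms.

```python
# The sequence {a_n} = 1, 1, 1, ... multiplied by itself under the two
# products (truncated to 6 terms): pointwise as in F, convolution as in R[[x]].

a = [1] * 6

pointwise = [x * y for x, y in zip(a, a)]
convolution = [sum(a[k] * a[n - k] for k in range(n + 1)) for n in range(len(a))]

print(pointwise)     # [1, 1, 1, 1, 1, 1]
print(convolution)   # [1, 2, 3, 4, 5, 6]
```

The convolution output is the coefficient sequence of (1 + x + x^2 + · · ·)^2, visibly different from the pointwise product.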
We obtain other fundamental and fascinating rings by taking subrings of these examples. For instance,
let F be our original example of the set of all functions f : ℝ → ℝ (so we are taking R = ℝ and X = ℝ). Let
C ⊆ F be the subset of all continuous functions from ℝ to ℝ (i.e. continuous at every point). By results from
calculus and/or analysis, the sum and product of continuous functions are continuous, as are the constant
functions 0 and 1, along with −f for every continuous function f . It follows that C is a subring of F.

Polynomial Rings in Several Variables


Let R be a ring. In Section 9.3, we defined the polynomial ring R[x]. In this setting, elements of Z[x] look
like 12x + 7, x^3 + 2x^2 − 5x + 13, etc. In other words, elements of R[x] correspond to polynomials in one
variable. What about polynomials in many variables? For example, suppose we want to consider polynomials
like the following:
7x^5 y^2 − 4x^3 y^2 + 11xy^2 − 16xy + 5x^2 − 3y + 42
There are two ways to go about making the ring of such “polynomials” precise. One way is to start from
scratch like we did for polynomials of one variable. Thinking about infinite sequences {an } of elements of
R as functions f : N → R, we had that the underlying set of R[x] consisted of those functions f : N → R
such that {n ∈ N : f(n) ≠ 0} is finite. For polynomials of two variables, we can instead consider the set of
all functions f : N^2 → R such that {(k, ℓ) ∈ N^2 : f((k, ℓ)) ≠ 0} is finite. For example, the above polynomial
would be given by the function f with f((5, 2)) = 7, f((3, 2)) = −4, f((2, 0)) = 5, etc. We then add by simply
adding the functions pointwise as we’ve just seen, but multiplication is more interesting. Given two such
functions f, g : N^2 → R, to find out what (f · g)((k, ℓ)) is, we need to find all pairs of pairs that sum pointwise
to (k, ℓ), and add up all of their contributions. For example, we would have

(f · g)((2, 1)) = f ((0, 0)) · g((2, 1)) + f ((0, 1)) · g((2, 0)) + f ((1, 0)) · g((1, 1))
+ f ((1, 1)) · g((1, 0)) + f ((2, 0)) · g((0, 1)) + f ((2, 1)) · g((0, 0))

With such a definition, it is possible (but tedious) to check that we obtain a ring.
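The finitely-supported-function definition is straightforward to sketch in code: represent a polynomial in two variables as a dictionary mapping exponent pairs to nonzero coefficients. The representation and function name below are our own illustration, not from the text.

```python
# A two-variable polynomial over Z as a dict {(k, l): coefficient}, i.e. a
# finitely-supported function from N^2 to Z. Multiplication adds exponent
# pairs pointwise and sums all contributions, as described above.

def poly2_mul(f, g):
    out = {}
    for (k1, l1), a in f.items():
        for (k2, l2), b in g.items():
            key = (k1 + k2, l1 + l2)
            out[key] = out.get(key, 0) + a * b
    return {key: c for key, c in out.items() if c != 0}   # drop zero terms

# (x + y) * (x - y) = x^2 - y^2
f = {(1, 0): 1, (0, 1): 1}       # x + y
g = {(1, 0): 1, (0, 1): -1}      # x - y
print(poly2_mul(f, g))           # {(2, 0): 1, (0, 2): -1}
```

Notice how the xy terms from the two cross products cancel, just as the contributions to (f · g)((k, ℓ)) are summed in the definition above.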
An alternate approach is to define R[x, y] to simply be the ring (R[x])[y] obtained by first taking R[x],
and then forming a new polynomial ring from it. From this perspective, one can view the above polynomial
as
(7x^5 − 4x^3 + 11x) · y^2 + (−16x − 3) · y + (5x^2 + 42)
We could also define it as (R[y])[x], in which case we view the above polynomial as

(7y^2) · x^5 + (−4y^2) · x^3 + (5) · x^2 + (11y^2 − 16y) · x + (−3y + 42).

This approach is particularly nice because we do not need to recheck the ring axioms, as they follow from
our results on polynomial rings (of one variable). The downside is that working with polynomials in this
way, rather than as sums of monomials like the original polynomial above, is slightly less natural. Nonetheless, it is
possible to prove that all three definitions result in naturally isomorphic rings. We will return to these ideas
later, along with working in polynomial rings of more than two variables.
Chapter 10

Ideals and Ring Homomorphisms

10.1 Ideals, Quotients, and Homomorphisms


Ideals
Suppose that R is a ring and that I is a subset of R which is an additive subgroup. We know that when we
look only at addition, I breaks R up into (additive) cosets of the form

r + I = {r + a : a ∈ I}

These cosets are the equivalence classes of the equivalence relation ∼I on R defined by r ∼I s if there exists
a ∈ I with r + a = s. Since (R, +, 0) is abelian, we know that the additive subgroup I of R is normal in R,
so we can take the quotient R/I as additive groups. In this quotient, we know from our general theory of
quotient groups that addition is well-defined by:

(r + I) + (s + I) = (r + s) + I.

Now if we want to turn the resulting quotient into a ring (rather than just an abelian group), we would
certainly require that multiplication of cosets is well-defined as well. In other words, we would need to know
that if r, s, t, u ∈ R with r ∼I t and s ∼I u, then rs ∼I tu. A first guess might be that we should
require that I is closed under multiplication as well. Before jumping to conclusions, let's work out whether
this happens for free, or, if it looks grim, what additional conditions on I we might want to require.
Suppose then that r, s, t, u ∈ R with r ∼I t and s ∼I u. Fix a, b ∈ I with r + a = t and s + b = u. We
then have

tu = (r + a)(s + b)
= r(s + b) + a(s + b)
= rs + rb + as + ab

Now in order for rs ∼I tu, we would want rb + as + ab ∈ I. To ensure this, it would suffice to know that
rb ∈ I, as ∈ I, and ab ∈ I because we are assuming that I is an additive subgroup. The last of these, that
ab ∈ I, would follow if we include the additional assumption that I is closed under multiplication as we
guessed above. However, a glance at the other two suggests that we might need to require more. These other
summands suggest that we want I to be closed under “super multiplication”, i.e. that if we take an element
of I and multiply it by any element of R on either side, then we stay in I. If we have these conditions on I
(which at this point looks like an awful lot to ask), then everything should work out fine. We give a special
name to subsets of R that have this property.


Definition 10.1.1. Let R be a ring. An ideal of R is a subset I ⊆ R with the following properties:

• 0 ∈ I.

• a + b ∈ I whenever a ∈ I and b ∈ I.

• −a ∈ I whenever a ∈ I.

• ra ∈ I whenever r ∈ R and a ∈ I.

• ar ∈ I whenever r ∈ R and a ∈ I.

Notice that the first three properties simply say that I is an additive subgroup of R.

For example, consider the ring R = Z. Suppose n ∈ N and let I = nZ = {nk : k ∈ Z}. We then have
that I is an ideal of R. To see this, first notice that we already know that nZ is a subgroup of Z. Now for
any m ∈ Z and k ∈ Z, we have

m · (nk) = n · (mk) ∈ nZ and (nk) · m = n · (km) ∈ nZ

Therefore, I = nZ does satisfy the additional conditions necessary, so I = nZ is an ideal of R. Before going
further, let’s note one small simplification in the definition of an ideal.

Proposition 10.1.2. Let R be a ring. Suppose that I ⊆ R satisfies the following:

• 0 ∈ I.

• a + b ∈ I whenever a ∈ I and b ∈ I.

• ra ∈ I whenever r ∈ R and a ∈ I.

• ar ∈ I whenever r ∈ R and a ∈ I.

We then have that I is an ideal of R.

Proof. Notice that the only condition that is missing is that I is closed under additive inverses. For any
a ∈ I, we have −a = (−1) · a, so −a ∈ I by the third condition (notice here that we are using the fact that
all our rings have a multiplicative identity).

When R is commutative, we can even pare this to three conditions.

Corollary 10.1.3. Let R be a commutative ring. Suppose that I ⊆ R satisfies the following:

• 0 ∈ I.

• a + b ∈ I whenever a ∈ I and b ∈ I.

• ra ∈ I whenever r ∈ R and a ∈ I.

We then have that I is an ideal of R.

Proof. Use the previous proposition together with the fact that if r ∈ R and a ∈ I, then ar = ra ∈ I because
R is commutative.

For another example of an ideal, consider the ring R = Z[x]. Let I be set of all polynomials with 0
constant term. Formally, we are letting I be the set of infinite sequences {an } with a0 = 0. Let’s prove that
I is an ideal using a more informal approach to the polynomial ring (make sure you know how to translate
everything we are saying into formal terms). Notice that the zero polynomial is trivially in I and that I is
closed under addition because the constant term of the sum of two polynomials is the sum of their constant
terms. Finally, if f (x) ∈ R and p(x) ∈ I, then we can write

f(x) = a_n x^n + a_{n-1} x^{n-1} + · · · + a_1 x + a_0

and
p(x) = b_m x^m + b_{m-1} x^{m-1} + · · · + b_1 x.
Multiplying out the polynomials, we see that the constant term of f (x)p(x) is a0 · 0 = 0 and similarly the
constant term of p(x)f (x) is 0 · a0 = 0. Therefore, we have both f (x)p(x) ∈ I and p(x)f (x) ∈ I. It follows
that I is an ideal of R.
From the above discussion, it appears that if I is an ideal of R, then it makes sense to take the quotient
of R by I and have both addition and multiplication of cosets be well-defined. However, our definition was
motivated by finding conditions that would make checking that multiplication is well-defined easy. As in
our definition of a normal subgroup, it is a pleasant surprise that the implication can be reversed so our
conditions are precisely what is needed for the quotient to make sense.

Proposition 10.1.4. Let R be a ring and let I ⊆ R be an additive subgroup of (R, +, 0). The following are
equivalent.

1. I is an ideal of R.

2. Whenever r, s, t, u ∈ R with r ∼I t and s ∼I u, we have rs ∼I tu.

Proof. We first prove 1 → 2. Let r, s, t, u ∈ R be arbitrary with r ∼I t and s ∼I u. Fix a, b ∈ I with r + a = t


and s + b = u. We then have

tu = (r + a)(s + b)
= r(s + b) + a(s + b)
= rs + rb + as + ab.

Since I is an ideal of R and b ∈ I, we know that both rb ∈ I and ab ∈ I. Similarly, since I is an ideal of
R and a ∈ I, we know that as ∈ I. Now I is an ideal of R, so it is an additive subgroup of R, and hence
rb + as + ab ∈ I. Since
tu = rs + (rb + as + ab)
we conclude that rs ∼I tu.
We now prove 2 → 1. We are assuming that I is an additive subgroup of R and condition 2. Let r ∈ R
and let a ∈ I be arbitrary. Now r ∼I r because r + 0 = r and 0 ∈ I. Also, we have a ∼I 0 because
a + (−a) = 0 and −a ∈ I (as a ∈ I and I is an additive subgroup).

• Since r ∼I r and a ∼I 0, we may use condition 2 to conclude that ra ∼I r0, which is to say that
ra ∼I 0. Thus, we may fix b ∈ I with ra + b = 0. Since b ∈ I and I is an additive subgroup, it follows
that ra = −b ∈ I.

• Since a ∼I 0 and r ∼I r, we may use condition 2 to conclude that ar ∼I 0r, which is to say that
ar ∼I 0. Thus, we may fix b ∈ I with ar + b = 0. Since b ∈ I and I is an additive subgroup, it follows
that ar = −b ∈ I.

Therefore, we have both ra ∈ I and ar ∈ I. Since r ∈ R and a ∈ I were arbitrary, we conclude that I is an
ideal of R.
We are now ready to formally define quotient rings.
Definition 10.1.5. Let R be a ring and let I be an ideal of R. Let R/I be the set of additive cosets of I in
R, i.e. the set of equivalence classes of R under ∼I. Define operations on R/I by letting

(a + I) + (b + I) = (a + b) + I    and    (a + I) · (b + I) = ab + I.
With these operations, the set R/I becomes a ring with additive identity 0 + I and multiplicative identity
1 + I (we did the hard part of checking that the operations are well-defined, and from here the ring axioms
follow because the ring axioms hold in R). Furthermore, if R is commutative, then R/I is also commutative.
Let n ∈ N+ . As discussed above, the set nZ = {nk : k ∈ Z} is an ideal of Z. Our familiar ring Z/nZ
defined in the first section is precisely the quotient of Z by this ideal nZ, hence the notation again.
As we have seen, ideals of a ring correspond to normal subgroups of a group in that they are the “special”
subsets for which it makes sense to take a quotient. There is one small but important note to be made here.
Of course, every normal subgroup of a group G is a subgroup of G. However, it is not true that every ideal
of a ring R is a subring of R. The reason is that for I to be an ideal of R, it is not required that 1 ∈ I. In
fact, if I is an ideal of R and 1 ∈ I, then r = r · 1 ∈ I for all r ∈ R, so I = R. Since every subring S of R
must satisfy 1 ∈ S, it follows that the only ideal of R which is a subring of R is the whole ring itself. This
might seem to be quite a nuisance, but as mentioned above, we will pay very little attention to subrings of
a given ring, and the vast majority of our focus will be on ideals.
We end our discussion of ideals in general rings with a simple characterization of when two elements
of a ring R represent the same coset.
Proposition 10.1.6. Let R be a ring and let I be an ideal of R. Let r, s ∈ R. The following are equivalent.
1. r + I = s + I
2. r ∼I s
3. r − s ∈ I
4. s − r ∈ I
Proof. Notice that 1 ↔ 2 follows from our general theory of equivalence relations, because r + I is simply the
equivalence class of r under the relation ∼I.
2 → 3: Suppose that r ∼I s, and fix a ∈ I with r + a = s. Subtracting s and a from both sides, it follows
that r − s = −a (you should work through the details of this if you are nervous). Now a ∈ I and I is an
additive subgroup of R, so −a ∈ I. It follows that r − s ∈ I.
3 → 4: Suppose that r − s ∈ I. Since I is an additive subgroup of R, we know that −(r − s) ∈ I. Since
−(r − s) = s − r, it follows that s − r ∈ I.
4 → 2: Suppose that s − r ∈ I. Since r + (s − r) = s, it follows that r ∼I s.
We now work through an example of a quotient ring other than Z/nZ. Let R = Z[x] and notice that R
is commutative. We consider two different quotients of R.
• First, let
I = (x^2 − 2) · Z[x] = {(x^2 − 2) · p(x) : p(x) ∈ Z[x]}.
For example, 3x^3 + 5x^2 − 6x − 10 ∈ I because
3x^3 + 5x^2 − 6x − 10 = (x^2 − 2) · (3x + 5)
and 3x + 5 ∈ Z[x]. We claim that I is an ideal of Z[x]:

– Notice that 0 ∈ I because 0 = (x^2 − 2) · 0.


– Let f(x), g(x) ∈ I be arbitrary. By definition of I, we can fix p(x), q(x) ∈ Z[x] with f(x) =
(x^2 − 2) · p(x) and g(x) = (x^2 − 2) · q(x). We then have
f(x) + g(x) = (x^2 − 2) · p(x) + (x^2 − 2) · q(x)
            = (x^2 − 2) · (p(x) + q(x)).
Since p(x) + q(x) ∈ Z[x], it follows that f(x) + g(x) ∈ I.
– Let h(x) ∈ Z[x] and f(x) ∈ I be arbitrary. By definition of I, we can fix p(x) ∈ Z[x] with
f(x) = (x^2 − 2) · p(x). We then have
h(x) · f(x) = h(x) · (x^2 − 2) · p(x)
            = (h(x) · p(x)) · (x^2 − 2).
Since h(x) · p(x) ∈ Z[x], it follows that h(x) · f(x) ∈ I.
Since Z[x] is commutative, it follows that I is an ideal of Z[x]. Consider the quotient ring Z[x]/I.
Notice that we have
(7x^3 + 3x^2 − 13x − 2) + I = (x + 4) + I
because
(7x^3 + 3x^2 − 13x − 2) − (x + 4) = 7x^3 + 3x^2 − 14x − 6
                                  = (x^2 − 2) · (7x + 3) ∈ I.
In fact, for every polynomial f(x) ∈ Z[x], there exists r(x) ∈ Z[x] with either r(x) = 0 or deg(r(x)) < 2
such that f(x) + I = r(x) + I. To see this, notice that the leading coefficient of x^2 − 2 is 1, which is a
unit in Z. Now given any f(x) ∈ Z[x], Theorem 9.3.10 tells us that there exist q(x), r(x) ∈ Z[x] with
f(x) = q(x) · (x^2 − 2) + r(x)
and either r(x) = 0 or deg(r(x)) < 2. We then have that
f(x) − r(x) = (x^2 − 2) · q(x),
so f(x) − r(x) ∈ I, and hence f(x) + I = r(x) + I. Furthermore, if r1(x), r2(x) are such that either
ri(x) = 0 or deg(ri(x)) < 2, and r1(x) + I = r2(x) + I, then one can show that r1(x) = r2(x).
This is a good exercise now, but we will prove more general facts later. In other words, elements of
Z[x]/I, which are additive cosets, can be uniquely represented by the zero polynomial together with
polynomials of degree less than 2.
• Let
J = 2x · Z[x] = {2x · f(x) : f(x) ∈ Z[x]}.
As in the previous example, it's straightforward to check that J is an ideal of R. Consider the quotient
ring Z[x]/J. Notice that we have
(2x^2 + x − 7) + J = (3x − 7) + J
because
(2x^2 + x − 7) − (3x − 7) = 2x^2 − 2x
                          = 2x · (x − 1) ∈ J.

However, notice that the leading coefficient of 2x is 2, which is not a unit in Z. Thus, we can't make an
argument like the one above work, and it is indeed harder to find unique representatives of the cosets
in Z[x]/J. For example, one can show that x^3 + J ≠ r(x) + J for any r(x) ∈ Z[x] of degree at most 2.
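For the first quotient Z[x]/I above, finding the degree-at-most-1 representative of a coset amounts to division by x^2 − 2, which works over Z because x^2 − 2 is monic. Here is a hedged Python sketch (function name ours) that repeatedly replaces x^2 by 2:

```python
# Reduce a polynomial in Z[x] modulo x^2 - 2: repeatedly replace x^2 by 2.
# Polynomials are coefficient lists [a0, a1, ..., an], lowest degree first.

def reduce_mod_x2_minus_2(f):
    """Return the coefficients of r(x) with f + I = r + I and deg(r) < 2."""
    f = f[:]
    while len(f) > 2:
        c = f.pop()              # leading coefficient of x^n with n >= 2
        f[-2] += 2 * c           # x^n = x^(n-2) * x^2  ~  2 * x^(n-2) mod I
    return f

# The coset from the text: (7x^3 + 3x^2 - 13x - 2) + I = (x + 4) + I.
print(reduce_mod_x2_minus_2([-2, -13, 3, 7]))   # [4, 1], i.e. x + 4
```

The same idea fails for J = 2x · Z[x]: eliminating a term a x^n with n ≥ 1 would require dividing a by 2, which is not always possible in Z.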

Ring Homomorphisms

Definition 10.1.7. Let R and S be rings. A (ring) homomorphism from R to S is a function ϕ : R → S


such that

• ϕ(r + s) = ϕ(r) + ϕ(s) for all r, s ∈ R.

• ϕ(r · s) = ϕ(r) · ϕ(s) for all r, s ∈ R.

• ϕ(1R ) = 1S .

A (ring) isomorphism from R to S is a homomorphism ϕ : R → S which is a bijection.

Definition 10.1.8. Given two rings R and S, we say that R and S are isomorphic, and write R ≅ S, if
there exists an isomorphism ϕ : R → S.

Notice that we have the additional requirement that ϕ(1R) = 1S. When we discussed group homomorphisms,
we derived ϕ(eG) = eH rather than explicitly requiring it (see Proposition 6.6.2 and Proposition 6.3.4).
Unfortunately, it does not follow for free from the other two conditions in the ring case. If you go back and
look at the proof that ϕ(eG ) = eH in the group case, you will see that we used the fact that ϕ(eG ) has an
inverse, but it is not true in rings that every element must have a multiplicative inverse. To see an example
where the condition can fail, consider the ring Z × Z (we have not formally defined the direct product of
rings, but it works in the same way). Define ϕ : Z → Z × Z by ϕ(n) = (n, 0). It is not hard to check that ϕ
satisfies the first two conditions for a ring homomorphism, but ϕ(1) = (1, 0) while the identity of Z × Z is
(1, 1).

Definition 10.1.9. Let R be a ring and let c ∈ R. Define Evc : R[x] → R by letting

Evc({a_n}) = Σ_n a_n c^n.

Notice that the above sum makes sense because elements {a_n} ∈ R[x] have only finitely many nonzero terms (if
{a_n} is nonzero, we can stop the sum at M = deg({a_n})). Intuitively, we are defining

Evc(a_n x^n + a_{n-1} x^{n-1} + · · · + a_1 x + a_0) = a_n c^n + a_{n-1} c^{n-1} + · · · + a_1 c + a_0.

Thus, Evc is the function which says "evaluate the polynomial at c".

The next proposition is really fundamental. Intuitively, it says the following. Suppose that we are given
a commutative ring R, and we take two polynomials in R[x]. Given any c ∈ R, we obtain the same result if we
first add/multiply the polynomials and then plug c into the result, or if we first plug c into each polynomial
and then add/multiply the results.

Proposition 10.1.10. Let R be a commutative ring and let c ∈ R. The function Evc is a ring homomor-
phism.
10.1. IDEALS, QUOTIENTS, AND HOMOMORPHISMS 175

Proof. We clearly have Evc (1) = 1 (where the 1 in parentheses is the constant polynomial 1). For any
{an }, {bn } ∈ R[x], we have

Evc({an} + {bn}) = Evc({an + bn})
                 = Σ_n (an + bn)c^n
                 = Σ_n (an c^n + bn c^n)
                 = Σ_n an c^n + Σ_n bn c^n
                 = Evc({an}) + Evc({bn})

Thus, Evc preserves addition. We now check that Evc preserves multiplication. Suppose that {an }, {bn } ∈
R[x]. Let M1 = deg({an }) and M2 = deg({bn }) (for this argument, let Mi = 0 if the corresponding
polynomial is the zero polynomial). We know that M1 + M2 ≥ deg({an } · {bn }), so
Evc({an} · {bn}) = Evc({Σ_{i+j=n} ai bj})
                 = Σ_{n=0}^{M1+M2} (Σ_{i+j=n} ai bj)c^n
                 = Σ_{n=0}^{M1+M2} Σ_{i+j=n} ai bj c^n
                 = Σ_{n=0}^{M1+M2} Σ_{i+j=n} ai bj c^{i+j}
                 = Σ_{n=0}^{M1+M2} Σ_{i+j=n} ai bj c^i c^j
                 = Σ_{n=0}^{M1+M2} Σ_{i+j=n} ai c^i bj c^j          (since R is commutative)
                 = Σ_{i=0}^{M1} Σ_{j=0}^{M2} ai c^i bj c^j          (since ai = 0 if i > M1 and bj = 0 if j > M2)
                 = Σ_{i=0}^{M1} (ai c^i · (Σ_{j=0}^{M2} bj c^j))
                 = (Σ_{i=0}^{M1} ai c^i) · (Σ_{j=0}^{M2} bj c^j)
                 = Evc({an}) · Evc({bn})

Thus, Evc preserves multiplication. It follows that Evc is a ring homomorphism.


Let’s take a look at what can fail when R is noncommutative. Consider the polynomials p(x) = ax and
q(x) = bx. We then have that p(x)q(x) = abx2 . Now if c ∈ R, then

Evc (p(x)q(x)) = Evc (abx2 ) = abc2


176 CHAPTER 10. IDEALS AND RING HOMOMORPHISMS

and
Evc (p(x)) · Evc (q(x)) = acbc
It seems impossible to argue that these are equal in general if you cannot commute b with c. To find
a specific counterexample, it suffices to find a noncommutative ring with two elements b and c such that
bc2 ≠ cbc (because then we can take a = 1). It’s not hard to find two matrices which satisfy this, so the
corresponding Evc will not be a ring homomorphism.
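To make the failure concrete, here is a quick check in the noncommutative ring of 2 × 2 integer matrices; the particular matrices b and c below are just one choice that happens to satisfy bc2 ≠ cbc (taking a = 1 as above), not ones fixed by the text:

```python
def matmul(A, B):
    """Multiply 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

b = [[1, 0], [1, 1]]
c = [[1, 1], [0, 1]]

# With a = 1 we have p(x) = x and q(x) = bx, so (pq)(x) = bx^2.
# Then Ev_c(p * q) = b * c^2, while Ev_c(p) * Ev_c(q) = c * (b * c).
bcc = matmul(b, matmul(c, c))
cbc = matmul(matmul(c, b), c)
print(bcc)  # [[1, 2], [1, 3]]
print(cbc)  # [[2, 3], [1, 2]]
assert bcc != cbc  # evaluation at c does not preserve multiplication
```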
In the future, given p(x) ∈ R[x] and c ∈ R, we will tend to write the informal p(c) for the formal
notion Evc (p(x)). In this informal notation, the proposition says that if R is commutative, c ∈ R, and
p(x), q(x) ∈ R[x], then

(p + q)(c) = p(c) + q(c) and (p · q)(c) = p(c) · q(c)

Definition 10.1.11. Let ϕ : R → S be a ring homomorphism. We define ker(ϕ) to be the kernel of ϕ when
viewed as a homomorphism of the additive groups, i.e. ker(ϕ) = {a ∈ R : ϕ(a) = 0S }.
Recall that the normal subgroups of a group G are precisely the kernels of group homomorphisms with
domain G. Continuing the analogy between normal subgroups of a group and ideals of a ring, we might
hope that the ideals of a ring R are precisely the kernels of ring homomorphisms with domain R. The next
propositions confirm this.
Proposition 10.1.12. If ϕ : R → S is a ring homomorphism, then ker(ϕ) is an ideal of R.
Proof. Let K = ker(ϕ). Since we know in particular that ϕ is an additive group homomorphism, we know
from our work on groups that K is an additive subgroup of R (see Proposition 6.6.4). Let a ∈ K and r ∈ R
be arbitrary. Since a ∈ K, we have ϕ(a) = 0. Therefore

ϕ(ra) = ϕ(r) · ϕ(a)


= ϕ(r) · 0S
= 0S

so ra ∈ K, and

ϕ(ar) = ϕ(a) · ϕ(r)


= 0S · ϕ(r)
= 0S

so ar ∈ K. Therefore, K is an ideal of R.
Proposition 10.1.13. Let R be a ring and let I be an ideal of R. There exists a ring S and a ring
homomorphism ϕ : R → S such that I = ker(ϕ).
Proof. Consider the ring S = R/I and the projection π : R → R/I defined by π(a) = a + I. As in the group
case, it follows that π is a ring homomorphism and that ker(π) = I.
Now many of the results about group homomorphisms carry over to ring homomorphisms. For example,
we have the following.
Proposition 10.1.14. Let ϕ : R → S be a ring homomorphism. ϕ is injective if and only if ker(ϕ) = {0R }.
Proof. Notice that ϕ is in particular a homomorphism of additive groups. Thus, the result follows from the
corresponding result about groups. However, let’s prove it again because it is an important result.
Suppose first that ϕ is injective. We know that ϕ(0R ) = 0S , so 0R ∈ ker(ϕ). If a ∈ ker(ϕ), then
ϕ(a) = 0S = ϕ(0R ), so a = 0R because ϕ is injective. Therefore, ker(ϕ) = {0R }.

Suppose conversely that ker(ϕ) = {0R }. Let r, s ∈ R with ϕ(r) = ϕ(s). We then have

ϕ(r − s) = ϕ(r + (−s))


= ϕ(r) + ϕ(−s)
= ϕ(r) − ϕ(s)
= 0S

so r − s ∈ ker(ϕ). We are assuming that ker(ϕ) = {0R }, so r − s = 0R and hence r = s.


Next we discuss the ring theoretic analogues of Theorem 6.6.9.
Theorem 10.1.15. Let R1 and R2 be rings, and let ϕ : R1 → R2 be a homomorphism. We have the
following:
1. If S1 is a subring of R1 , then ϕ→ [S1 ] is a subring of R2 .
2. If I1 is an ideal of R1 and ϕ is surjective, then ϕ→ [I1 ] is an ideal of R2 .
3. If S2 is a subring of R2 , then ϕ← [S2 ] is a subring of R1 .
4. If I2 is an ideal of R2 , then ϕ← [I2 ] is an ideal of R1 .
Proof. 1. Suppose that S1 is a subring of R1 . We then have that S1 is an additive subgroup of R1 , so
ϕ→ [S1 ] is an additive subgroup of R2 by Theorem 6.6.9. Since 1 ∈ S1 and ϕ(1) = 1 by definition
of subrings and ring homomorphisms, we have that 1 ∈ ϕ→ [S1 ] as well. Now let c, d ∈ ϕ→ [S1 ] be
arbitrary. Fix a, b ∈ S1 with ϕ(a) = c and ϕ(b) = d. Since a, b ∈ S1 and S1 is a subring of R1 , it
follows that ab ∈ S1 . Now
ϕ(ab) = ϕ(a) · ϕ(b) = cd
so cd ∈ ϕ→ [S1 ]. Thus, ϕ→ [S1 ] is a subring of R2 .
2. Suppose that I1 is an ideal of R1 and that ϕ is surjective. Since I1 is an additive subgroup of R1 ,
we know from Theorem 6.6.9 that ϕ→ [I1 ] is an additive subgroup of R2 . Suppose that d ∈ R2 and
c ∈ ϕ→ [I1 ] are arbitrary. Since ϕ is surjective, we may fix b ∈ R1 with ϕ(b) = d. Since c ∈ ϕ→ [I1 ], we
can fix a ∈ I1 with ϕ(a) = c. We then have

dc = ϕ(b) · ϕ(a)
= ϕ(ba)

so dc ∈ ϕ→ [I1 ]. Also, we have

cd = ϕ(a) · ϕ(b)
= ϕ(ab)

so cd ∈ ϕ→ [I1 ]. It follows that ϕ→ [I1 ] is an ideal of R2 .


3. Suppose that S2 is a subring of R2 . We then have that S2 is an additive subgroup of R2 , so ϕ← [S2 ]
is an additive subgroup of R1 by Theorem 6.6.9. Since 1 ∈ S2 and ϕ(1) = 1 by definition of subrings
and ring homomorphisms, we have that 1 ∈ ϕ← [S2 ] as well. Now let a, b ∈ ϕ← [S2 ] be arbitrary, so
ϕ(a) ∈ S2 and ϕ(b) ∈ S2 . We then have

ϕ(ab) = ϕ(a) · ϕ(b) ∈ S2

because S2 is a subring of R2 , so ab ∈ ϕ← [S2 ]. Thus, ϕ← [S2 ] is a subring of R1 .



4. Suppose that I2 is an ideal of R2 . Since I2 is an additive subgroup of R2 , we know from Theorem


6.6.9 that ϕ← [I2 ] is an additive subgroup of R1 . Let b ∈ R1 and a ∈ ϕ← [I2 ] be arbitrary, so ϕ(a) ∈ I2 .
We then have that ϕ(ba) = ϕ(b) · ϕ(a) ∈ I2 because I2 is an ideal of R2 , so ba ∈ ϕ← [I2 ]. Similarly, we
have ϕ(ab) = ϕ(a) · ϕ(b) ∈ I2 because I2 is an ideal of R2 , so ab ∈ ϕ← [I2 ]. Since b ∈ R1 and a ∈ ϕ← [I2 ]
were arbitrary, we conclude that ϕ← [I2 ] is an ideal of R1 .

Corollary 10.1.16. If ϕ : R1 → R2 is a (ring) homomorphism, then range(ϕ) is a subring of R2 .


Proof. This follows immediately from the previous theorem because R1 is trivially a subring of R1 and
range(ϕ) = ϕ→ [R1 ].
We end with brief statements of the Isomorphism and Correspondence Theorems. As one might expect,
the corresponding results from group theory do a lot of the heavy lifting in the following proofs, but we still
need to check a few extra things in each case (like the corresponding functions preserve multiplication).
Theorem 10.1.17 (First Isomorphism Theorem). Let ϕ : R → S be a ring homomorphism and let K =
ker(ϕ). Define a function ψ : R/K → S by letting ψ(a + K) = ϕ(a). We then have that ψ is a well-defined
function which is a ring isomorphism onto the subring range(ϕ) of S. Therefore

R/ker(ϕ) ≅ range(ϕ)
Let’s see the First Isomorphism Theorem in action. Let R = Z[x] and let I be the ideal of all polynomials
with 0 constant term, i.e.

I = {an xn + an−1 xn−1 + · · · + a1 x + a0 ∈ Z[x] : a0 = 0}


= {an xn + an−1 xn−1 + · · · + a1 x : ai ∈ Z for 1 ≤ i ≤ n}

It is straightforward to check that I is an ideal. Consider the ring R/I. Intuitively, taking the quotient
by I, we “kill off” all polynomials without a constant term. Thus, it seems reasonable to suspect that
two polynomials will be in the same coset in the quotient exactly when they have the same constant term
(because then the difference of the two polynomials will be in I). As a result, we might expect that R/I ≅ Z.
Although it is possible to formalize and prove all of these statements directly, we can also deduce them all
from the First Isomorphism Theorem, as we now show.
Since Z is commutative, the function Ev0 : Z[x] → Z is a ring homomorphism by Proposition 10.1.10.
Notice that Ev0 is surjective because Ev0 (a) = a for all a ∈ Z (where we interpret the a in parentheses as
the constant polynomial a). Now ker(Ev0 ) = I because for any p(x) ∈ Z[x], we have that p(0) = Ev0 (p(x))
is the constant term of p(x). Since I = ker(Ev0 ), we know that I is an ideal of Z[x] by Proposition 10.1.12.
Finally, using the First Isomorphism Theorem together with the fact that Ev0 is surjective, we conclude that
Z[x]/I ≅ Z.
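The coset intuition can be spot-checked computationally: two integer polynomials lie in the same coset of I = ker(Ev0) exactly when their constant terms agree. A small sketch, representing an integer polynomial by its coefficient list with the constant term first (the names here are illustrative):

```python
def ev0(coeffs):
    """Ev_0 on a polynomial given by its coefficient list: the constant term."""
    return coeffs[0] if coeffs else 0

p = [7, 3, 0, 5]   # 5x^3 + 3x + 7
q = [7, -1, 4]     # 4x^2 - x + 7

# Pad the shorter list with zeros so the subtraction is termwise.
width = max(len(p), len(q))
diff = [a - b for a, b in zip(p + [0] * (width - len(p)),
                              q + [0] * (width - len(q)))]

# p - q has zero constant term, i.e. p - q lies in I = ker(Ev_0)
print(ev0(p), ev0(q), ev0(diff))  # 7 7 0
```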
We end by stating the remaining theorems. Again, most of the work can be outsourced to the corre-
sponding theorems about groups, and all that remains is to check that multiplication behaves appropriately
in each theorem.
Theorem 10.1.18 (Second Isomorphism Theorem). Let R be a ring, let S be a subring of R, and let I be
an ideal of R. We then have that S + I = {r + a : r ∈ S, a ∈ I} is a subring of R, that I is an ideal of S + I,
that S ∩ I is an ideal of S, and that
(S + I)/I ≅ S/(S ∩ I)
Theorem 10.1.19 (Correspondence Theorem). Let R be a ring and let I be an ideal of R. For every subring
S of R with I ⊆ S, we have that S/I is a subring of R/I and the function

S 7→ S/I

is a bijection from subrings of R containing I to subrings of R/I. Also, for every ideal J of R with I ⊆ J,
we have that J/I is an ideal of R/I and the function
J 7→ J/I
is a bijection from ideals of R containing I to ideals of R/I. Furthermore, we have the following properties
for any subrings S1 and S2 of R that both contain I, and ideals J1 and J2 of R that both contain I:
1. S1 is a subring of S2 if and only if S1 /I is a subring of S2 /I.
2. J1 is an ideal of J2 if and only if J1 /I is an ideal of J2 /I.
Theorem 10.1.20 (Third Isomorphism Theorem). Let R be a ring. Let I and J be ideals of R with I ⊆ J.
We then have that J/I is an ideal of R/I and that
(R/I)/(J/I) ≅ R/J

10.2 The Characteristic of a Ring


Let R be a ring (not necessarily commutative). We know that R has an element 1, so it makes sense to
consider 1 + 1, 1 + 1 + 1, etc. It is natural to call these resulting ring elements 2, 3, etc. However, we need to
be careful when we interpret these. For example, in the ring Z/2Z, the multiplicative identity is 1, and we
have 1 + 1 = 0. Thus, in the above notation, we would have 2 = 0 in the ring Z/2Z. To avoid such confusion,
we introduce new notation by putting an underline beneath a number n to denote the corresponding result
of adding 1 ∈ R to itself n times. Here is the formal definition.
Definition 10.2.1. Let R be a ring. For each n ∈ Z, we define an element n̲ recursively as follows. Let
• 0̲ = 0
• n̲+̲1̲ = n̲ + 1 for all n ∈ N
• n̲ = −(−̲n̲) if n ∈ Z with n < 0
For example, if we unravel the above definition, we have
4̲ = 1 + 1 + 1 + 1                −̲3̲ = −3̲ = −(1 + 1 + 1) = (−1) + (−1) + (−1)
As these examples illustrate, if n is negative, the element n̲ is just the result of adding −1 ∈ R to itself |n|
many times. Let’s analyze how to add and multiply elements of the form m̲ and n̲. Notice that we have
3̲ + 4̲ = (1 + 1 + 1) + (1 + 1 + 1 + 1)
      = 1 + 1 + 1 + 1 + 1 + 1 + 1
      = 7̲
and using the distributive law we have
3̲ · 4̲ = (1 + 1 + 1)(1 + 1 + 1 + 1)
      = (1 + 1 + 1) · 1 + (1 + 1 + 1) · 1 + (1 + 1 + 1) · 1 + (1 + 1 + 1) · 1
      = (1 + 1 + 1) + (1 + 1 + 1) + (1 + 1 + 1) + (1 + 1 + 1)
      = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1
      = 1̲2̲
At least for positive m and n, these computations illustrate the following result.

Proposition 10.2.2. Let R be a ring. For any m, n ∈ Z, we have

• m̲+̲n̲ = m̲ + n̲
• m̲·̲n̲ = m̲ · n̲

Therefore the function ϕ : Z → R defined by ϕ(n) = n̲ is a ring homomorphism.
Proof. The first of these follows from the exponent rules for groups in Proposition 4.6.2 by viewing 1 as an
element of the additive group (R, +, 0) (in this additive setting, the group theory notation 1n and our new
notation n̲ denote the same elements). The second can be proven using the distributive law and induction on
n (for positive n), but we omit the details.
Proposition 10.2.3. Let R be a ring. For all a ∈ R and n ∈ Z, we have n̲ · a = a · n̲. Thus, elements of
the set {n̲ : n ∈ Z} commute with all elements of R.
Proof. Let a ∈ R. Notice that

0̲ · a = 0 · a
      = 0
      = a · 0
      = a · 0̲

If n ∈ N+ , we have

n̲ · a = (1 + 1 + · · · + 1) · a
      = 1 · a + 1 · a + ··· + 1 · a
      = a + a + ··· + a
      = a · 1 + a · 1 + ··· + a · 1
      = a · (1 + 1 + · · · + 1)
      = a · n̲

where each of the above sums has n terms (if you find the . . . not sufficiently formal, you can again give a
formal inductive argument). Suppose now that n ∈ Z with n < 0. We then have −n > 0, hence

n̲ · a = (−(−̲n̲)) · a
      = −((−̲n̲) · a)
      = −(a · (−̲n̲)) (from above)
      = a · (−(−̲n̲))
      = a · n̲

Definition 10.2.4. Let R be a ring. We define the characteristic of R, denoted char(R), as follows. If
there exists n ∈ N+ with n̲ = 0, we define char(R) to be the least such n. If no such n exists, we define
char(R) = 0. In other words, char(R) is the order of 1 when viewed as an element of the additive group
(R, +, 0) if this order is finite, but equals 0 if this order is infinite.
For example, char(Z/nZ) = n for all n ∈ N+ and char(Z) = 0. Also, we have char(Q) = 0 and
char(R) = 0. For an example of an infinite ring with nonzero characteristic, notice that char((Z/nZ)[x]) = n
for all n ∈ N+ .
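For a concrete finite example, the characteristic of Z/nZ can be computed by literally adding 1 to itself until 0 is reached; a minimal sketch (the function name is ours, not notation from the text):

```python
def characteristic_mod(n):
    """Return char(Z/nZ): the least k in N+ with k copies of 1 summing to 0 mod n."""
    one = 1 % n          # the multiplicative identity of Z/nZ
    total = one
    k = 1
    while total != 0:
        total = (total + one) % n
        k += 1
    return k

print([characteristic_mod(n) for n in [2, 3, 6, 12]])  # [2, 3, 6, 12]
```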

Proposition 10.2.5. Let R be a ring with char(R) = n and let ϕ : Z → R be the ring homomorphism
ϕ(m) = m̲. Let S be the subgroup of (R, +, 0) generated by 1, so S = {m̲ : m ∈ Z}. We then have that
S = range(ϕ), so S is a subring of R. Furthermore:

• If n ≠ 0, then ker(ϕ) = nZ, and so by the First Isomorphism Theorem it follows that Z/nZ ≅ S.
• If n = 0, then ker(ϕ) = {0}, so ϕ is injective and hence Z ≅ S.

In particular, if char(R) = n ≠ 0, then R has a subring isomorphic to Z/nZ, while if char(R) = 0, then R
has a subring isomorphic to Z.

Proof. The additive subgroup of (R, +, 0) generated by 1 is

S = {m̲ : m ∈ Z} = {ϕ(m) : m ∈ Z} = range(ϕ)

Since S is the range of a ring homomorphism, we have that S is a subring of R by Corollary 10.1.16. Now if
n = 0, then m̲ ≠ 0 for all nonzero m ∈ Z, so ker(ϕ) = {0}. Suppose that n ≠ 0. To see that ker(ϕ) = nZ,
note that the order of 1 when viewed as an element of the abelian group (R, +, 0) equals n, so m̲ = 0 if and
only if n | m by Proposition 4.6.5.

Definition 10.2.6. Let R be a ring. The subring S = {m̲ : m ∈ Z} of R defined in the previous proposition
is called the prime subring of R.

Proposition 10.2.7. If R is an integral domain, then either char(R) = 0 or char(R) is prime.

Proof. Let R be an integral domain and let n = char(R). Notice that n ≠ 1 because 1 ≠ 0 as R is an
integral domain. Suppose that n ≥ 2 and that n is composite. Fix k, ℓ ∈ N with 2 ≤ k, ℓ < n and n = kℓ.
We then have
0 = n̲ = k̲ℓ̲ = k̲ · ℓ̲
Since R is an integral domain, either k̲ = 0 or ℓ̲ = 0. However, this is a contradiction because k, ℓ < n and
n = char(R) is the least positive value of m with m̲ = 0. Therefore, either n = 0 or n is prime.
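The role of the integral domain hypothesis is easy to watch in Z/6Z, where char = 6 = 2 · 3 and correspondingly the product of 2 and 3 is 0, exhibiting zero divisors. A quick sanity check (the prime 7 used for contrast is our own choice):

```python
n = 6
two, three = 2 % n, 3 % n
print((two * three) % n)  # 0, so 2 and 3 are zero divisors in Z/6Z

# By contrast, for a prime modulus every product of nonzero elements is nonzero:
p = 7
assert all((a * b) % p != 0 for a in range(1, p) for b in range(1, p))
```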

10.3 Polynomial Evaluation and Roots


In this section, we will investigate the idea of roots of polynomials (i.e. those values that give 0 when plugged
in) along with the question of when two different polynomials determine the same function. All of these
results will rely heavily on the fact that evaluation at a point is a ring homomorphism, which holds in any
commutative ring (Proposition 10.1.10). In other words, if R is commutative, a ∈ R, and p(x), q(x) ∈ R[x],
then
(p + q)(a) = p(a) + q(a) and (p · q)(a) = p(a) · q(a).
Therefore, all of our results will assume that R is commutative, and occasionally we will need to assume
more.

Definition 10.3.1. Let R be a commutative ring. A root of a polynomial f (x) ∈ R[x] is an element a ∈ R
such that f (a) = 0 (or more formally Eva (f (x)) = 0).

Proposition 10.3.2. Let R be a commutative ring, let a ∈ R, and let f (x) ∈ R[x]. The following are
equivalent:

1. a is a root of f (x).

2. There exists g(x) ∈ R[x] with f (x) = (x − a) · g(x).



Proof. Suppose first that there exists g(x) ∈ R[x] with f (x) = (x − a) · g(x). We then have

f (a) = (a − a) · g(a)
= 0 · g(a)
= 0.

Written more formally, we have

Eva (f (x)) = Eva ((x − a) · g(x))


= Eva (x − a) · Eva (g(x)) (since Eva is a ring homomorphism)
= 0 · Eva (g(x))
= 0.

Therefore, a is a root of f (x).


Suppose conversely that a is a root of f (x). Since the leading coefficient of x − a is 1, which is a unit in
R, we can apply Theorem 9.3.10 to fix q(x), r(x) ∈ R[x] with

f (x) = q(x) · (x − a) + r(x)

and either r(x) = 0 or deg(r(x)) < deg(x − a). Since deg(x − a) = 1, we have that r(x) is a constant
polynomial, so we can fix c ∈ R with r(x) = c. We then have

f (x) = q(x) · (x − a) + c

Now plugging in a (or applying Eva ) we see that

f (a) = q(a) · (a − a) + c
= q(a) · 0 + c
= c.

Since we are assuming that f (a) = 0, it follows that c = 0, and thus

f (x) = q(x) · (x − a)

Thus, we can take g(x) = q(x).
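The quotient q(x) produced in this proof can be computed explicitly by synthetic division, which carries out the division algorithm for the monic divisor x − a. A sketch over Z, with coefficients listed from the leading term down (names and conventions are ours):

```python
def divide_by_linear(coeffs, a):
    """Divide f(x) by (x - a) via synthetic division.

    coeffs lists f from the leading coefficient down.
    Returns (quotient coefficients, remainder); the remainder equals f(a),
    so it is 0 exactly when a is a root of f.
    """
    quotient = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        quotient.append(carry)
    remainder = quotient.pop()
    return quotient, remainder

# f(x) = x^3 - 6x^2 + 11x - 6 has the root a = 1:
q, r = divide_by_linear([1, -6, 11, -6], 1)
print(q, r)  # [1, -5, 6] 0, i.e. f(x) = (x - 1)(x^2 - 5x + 6)
```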


Proposition 10.3.3. If R is an integral domain and f (x) ∈ R[x] is a nonzero polynomial, then f (x) has at
most deg(f (x)) many roots in R. In particular, every nonzero polynomial in R[x] has finitely many roots.
Proof. We prove the result by induction on deg(f (x)). If deg(f (x)) = 0, then f (x) is a nonzero constant
polynomial, so has 0 roots in R. Suppose that the statement is true for all polynomials of degree n, and let
f (x) ∈ R[x] with deg(f (x)) = n + 1. If f (x) has no roots in R, then we are done because 0 ≤ n + 1. Suppose
then that f (x) has at least one root in R, and fix such a root a ∈ R. Using Proposition 10.3.2, we can fix
g(x) ∈ R[x] with
f (x) = (x − a) · g(x).
We now have

n + 1 = deg(f (x))
= deg((x − a) · g(x))
= deg(x − a) + deg(g(x)) (by Proposition 9.3.6)
= 1 + deg(g(x)),

so deg(g(x)) = n. By induction, we know that g(x) has at most n roots in R. Notice that if b is a root of
f (x), then
0 = f (b) = (b − a) · g(b)
so either b − a = 0 or g(b) = 0 (because R is an integral domain), and hence either b = a or b is a root of
g(x). Therefore, f (x) has at most n + 1 roots in R, namely the roots of g(x) together with a. The result
follows by induction.
Corollary 10.3.4. Let R be an integral domain and let n ∈ N. Let f (x), g(x) ∈ R[x] be polynomials of
degree at most n (including possibly the zero polynomial). If there exist at least n + 1 many a ∈ R such that
f (a) = g(a), then f (x) = g(x).
Proof. Suppose that there are at least n + 1 many points a ∈ R with f (a) = g(a). Consider the polynomial
h(x) = f (x) − g(x) ∈ R[x]. Notice that either h(x) = 0 or deg(h(x)) ≤ n by Proposition 9.3.5. Since h(x)
has at least n + 1 many roots in R, we must have h(x) = 0 by Proposition 10.3.3. Thus, f (x) − g(x) = 0.
Adding g(x) to both sides, we conclude that f (x) = g(x).
Back when we defined the polynomial ring R[x], we were very careful to define an element as a sequence
of coefficients rather than as the function from R to R given by evaluation. As we saw, if R = Z/2Z,
then the two distinct polynomials
1x2 + 1x + 1        and        1
give the same function on Z/2Z because they evaluate to the same value for each element of Z/2Z. Notice
that each of these polynomials has degree at most 2, but they only agree at the 2 points of Z/2Z, so the
previous corollary (which would require agreement at 3 points) does not apply.
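This example is easy to verify directly; a small sketch evaluating both polynomials at every element of Z/2Z (coefficient lists have the constant term first, and the helper name is ours):

```python
def ev_mod(coeffs, c, n):
    """Evaluate a polynomial (constant term first) at c in Z/nZ."""
    return sum(a * pow(c, i, n) for i, a in enumerate(coeffs)) % n

f = [1, 1, 1]  # x^2 + x + 1 over Z/2Z
g = [1]        # the constant polynomial 1
print([ev_mod(f, c, 2) for c in range(2)])  # [1, 1]
print([ev_mod(g, c, 2) for c in range(2)])  # [1, 1]
```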
Corollary 10.3.5. Let R be an infinite integral domain. If f (x), g(x) ∈ R[x] are such that f (a) = g(a) for
all a ∈ R, then f (x) = g(x). Thus, distinct polynomials give different functions from R to R.
Proof. Suppose that f (a) = g(a) for all a ∈ R. Consider the polynomial h(x) = f (x) − g(x) ∈ R[x]. For
every a ∈ R, we have h(a) = f (a) − g(a) = 0. Since R is infinite, we conclude that h(x) has infinitely many
roots, so by the Proposition 10.3.3 we must have that h(x) = 0. Thus, f (x) − g(x) = 0. Adding g(x) to both
sides, we conclude that f (x) = g(x).
Let R be an integral domain and let n ∈ N. Suppose that we have n + 1 many distinct points
a1 , a2 , . . . , an+1 ∈ R, along with n + 1 many values b1 , b2 , . . . , bn+1 ∈ R (which may not be distinct, and may
equal some of the ai ). We know from Corollary 10.3.4 that there can be at most one polynomial f (x) ∈ R[x]
with either f (x) = 0 or deg(f (x)) ≤ n such that f (ai ) = bi for all i. Must one always exist?
In Z, the answer is no. Suppose, for example, that n = 1, and we want to find a polynomial of degree at
most 1 such that f (0) = 1 and f (2) = 2. Writing f (x) = ax + b, we then need 1 = f (0) = a · 0 + b, so b = 1,
and hence f (x) = ax + 1. We also need 2 = f (2) = a · 2 + 1, which would require 2 · (1 − a) = 1. Since 2 ∤ 1,
there is no a ∈ Z satisfying this. Notice, however, that if we go up to Q, then f (x) = (1/2) · x + 1 works as an
example here. In general, such polynomials always exist when working over a field F .
Theorem 10.3.6 (Lagrange Interpolation). Let F be a field and let n ∈ N. Suppose that we have n + 1
many distinct points a1 , a2 , . . . , an+1 ∈ F , along with n + 1 many values b1 , b2 , . . . , bn+1 ∈ F (which may
not be distinct, and may equal some of the ai ). There exists a unique polynomial f (x) ∈ F [x] with either
f (x) = 0 or deg(f (x)) ≤ n such that f (ai ) = bi for all i.
Before jumping into the general proof of this result, we first consider the case when bk = 1 and bi = 0
for all i ≠ k. In order to build a polynomial f (x) of degree at most n such that f (ai ) = bi = 0 for all i ≠ k,
we need to make the ai with i ≠ k roots of our polynomial. The idea then is to consider the polynomial

hk (x) = (x − a1 ) · · · (x − ak−1 )(x − ak+1 ) · · · (x − an+1 )



Now for all i ≠ k, we have hk (ai ) = 0 = bi , so we are good there. However, when we plug in ak , we obtain

hk (ak ) = (ak − a1 ) · · · (ak − ak−1 )(ak − ak+1 ) · · · (ak − an+1 )

which is most likely not equal to 1. However, since the ai are distinct, we have that ak − ai ≠ 0 for all i ≠ k.
Now F is an integral domain (because it is a field), so the product above is nonzero. Since F is a field, we can
divide by the resulting value. This suggests considering the polynomial

gk (x) = [(x − a1 ) · · · (x − ak−1 )(x − ak+1 ) · · · (x − an+1 )] / [(ak − a1 ) · · · (ak − ak−1 )(ak − ak+1 ) · · · (ak − an+1 )]

Notice now that gk (ai ) = 0 = bi for all i ≠ k, and gk (ak ) = 1 = bk . Thus, we are successful in the very
special case where one of the bi equals 1 and the rest are 0.
How do we generalize this? First, if bk ≠ 0 (but possibly not 1) while bi = 0 for all i ≠ k, then we just
scale gk (x) accordingly and consider the polynomial bk · gk (x). Fortunately, to handle the general case, we
need only add up these polynomials accordingly!
Proof of Theorem 10.3.6. For each k, let

gk (x) = [(x − a1 ) · · · (x − ak−1 )(x − ak+1 ) · · · (x − an+1 )] / [(ak − a1 ) · · · (ak − ak−1 )(ak − ak+1 ) · · · (ak − an+1 )]

Notice that deg(gk (x)) = n for each k (since there are n terms in the product, each of which has degree 1),
that gk (ai ) = 0 whenever i ≠ k, and that gk (ak ) = 1. Now let

f (x) = b1 · g1 (x) + b2 · g2 (x) + · · · + bn · gn (x) + bn+1 · gn+1 (x)

Since each bk · gk (x) is either 0 or has degree n, it follows that either f (x) = 0 or deg(f (x)) ≤ n. Furthermore,
for any i, we have

f (ai ) = b1 · g1 (ai ) + · · · + bi · gi (ai ) + · · · + bn+1 · gn+1 (ai )
       = 0 + · · · + 0 + bi · gi (ai ) + 0 + · · · + 0
       = bi · 1
       = bi

Thus, f (ai ) = bi for all i. Uniqueness follows immediately from Corollary 10.3.4.
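The formula in the proof translates directly into code over a finite field Z/pZ, where dividing by the denominator means multiplying by its inverse mod p. A hedged sketch (function names are ours, and it relies on Python 3.8+ for `pow(d, -1, p)`):

```python
def lagrange_interpolate(points, p):
    """Return the coefficients (constant term first) of the unique polynomial
    of degree <= n through the n+1 given points over the field Z/pZ."""
    def poly_mul(f, g):
        # Multiply two coefficient lists mod p.
        out = [0] * (len(f) + len(g) - 1)
        for i, a in enumerate(f):
            for j, b in enumerate(g):
                out[i + j] = (out[i + j] + a * b) % p
        return out

    coeffs = [0] * len(points)
    for k, (ak, bk) in enumerate(points):
        # Build g_k(x) = prod_{i != k} (x - a_i) / (a_k - a_i)
        num = [1]
        denom = 1
        for i, (ai, _) in enumerate(points):
            if i != k:
                num = poly_mul(num, [(-ai) % p, 1])
                denom = denom * (ak - ai) % p
        scale = bk * pow(denom, -1, p) % p   # modular inverse of the denominator
        for i, c in enumerate(num):
            coeffs[i] = (coeffs[i] + c * scale) % p
    return coeffs

# Through (0, 1) and (2, 2) over Z/5Z: f(x) = 3x + 1, since 2 * 3 = 1 mod 5
print(lagrange_interpolate([(0, 1), (2, 2)], 5))  # [1, 3]
```

Note that this succeeds on the pair of points f(0) = 1, f(2) = 2 that had no solution over Z, because 2 is invertible mod 5.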

10.4 Generating Subrings and Ideals in Commutative Rings


In Section 5.2, we proved that if G is a group and A ⊆ G, then there is a “smallest” subgroup of G that
contains A, i.e. there exists a subgroup H of G containing A with the property that H is a subset of any
other subgroup of G containing A. We proved this result in Proposition 5.2.8, and called the resulting H
the subgroup generated by A. We now prove the analogue for subrings.
Proposition 10.4.1. Let R be a ring and let A ⊆ R. There exists a subring S of R with the following
properties:
• A ⊆ S.
• Whenever T is a subring of R with the property that A ⊆ T , we have S ⊆ T .
Furthermore, the subring S is unique (i.e. if both S1 and S2 have the above properties, then S1 = S2 ).

Proof. The idea is to intersect all of the subrings of R that contain A, and argue that the result is a subring.
We first prove existence. Notice that there is at least one subring of R containing A, namely R itself. Define

S = {a ∈ R : a ∈ T for all subrings T of R such that A ⊆ T }

Notice we certainly have A ⊆ S by definition. Moreover, if T is a subring of R with the property that A ⊆ T ,
then we have S ⊆ T by definition of S. We now show that S is indeed a subring of R.

• Since 0, 1 ∈ T for every subring T of R with A ⊆ T , we have 0, 1 ∈ S.

• Let a, b ∈ S. For any subring T of R such that A ⊆ T , we must have both a, b ∈ T by definition of S,
hence a + b ∈ T because T is a subring. Since this is true for all such T , we conclude that a + b ∈ S
by definition of S.

• Let a ∈ S. For any subring T of R such that A ⊆ T , we must have a ∈ T by definition of S,
hence −a ∈ T because T is a subring. Since this is true for all such T , we conclude that −a ∈ S
by definition of S.

• Let a, b ∈ S. For any subring T of R such that A ⊆ T , we must have both a, b ∈ T by definition of
S, hence ab ∈ T because T is a subring. Since this is true for all such T , we conclude that ab ∈ S by
definition of S.

Combining these four properties, we conclude that S is a subring of R. This finishes the proof of existence.
Finally, suppose that S1 and S2 both have the above properties. Since S2 is a subring of R with A ⊆ S2 ,
we know that S1 ⊆ S2 . Similarly, since S1 is a subring of R with A ⊆ S1 , we know that S2 ⊆ S1 . Therefore,
S1 = S2 .

In group theory, when G was a group and A = {a} consisted of a single element, the corresponding
subgroup was {an : n ∈ Z}. We want to develop a similar explicit description in the commutative ring
case (one can also look at noncommutative rings, but the description is significantly more complicated).
However, instead of just looking at the smallest ring containing one element, we are going to generalize the
construction as follows. Assume that we already have a subring S of R. With this base in hand, suppose
that we now want to include one other element a ∈ R in our subring. We then want to know: what is the
smallest subring containing S ∪ {a}? Notice that if we really want just the smallest subring that contains a
single a ∈ R, we can simply take S to be the prime subring of R from Definition 10.2.6 (because every subring of
R must contain 1, it follows that every subring of R must also contain the prime subring of R).

Proposition 10.4.2. Suppose that R is a commutative ring, that S is a subring of R, and that a ∈ R. Let

S[a] = {p(a) : p(x) ∈ S[x]}
     = {sn an + sn−1 an−1 + · · · + s1 a + s0 : n ∈ N and s0 , s1 , . . . , sn ∈ S}.

We have the following.

1. S[a] is a subring of R with S ∪ {a} ⊆ S[a].

2. If T is a subring R with S ∪ {a} ⊆ T , then S[a] ⊆ T .

Therefore, S[a] is the smallest subring of R containing S ∪ {a}.

Proof. We first check 1.

• Notice that 0, 1 ∈ {p(a) : p(x) ∈ S[x]} by considering the zero polynomial and the constant polynomial 1.

• Given two elements of S[a], say p(a) and q(a) where p(x), q(x) ∈ S[x], we have that p(a) + q(a) =
(p + q)(a) and p(a) · q(a) = (pq)(a). Since p(x) + q(x) ∈ S[x] and p(x) · q(x) ∈ S[x], it follows that S[a]
is closed under addition and multiplication.
• Consider an arbitrary element p(a) ∈ S[a], where p(x) ∈ S[x]. Since evaluation is a ring homomor-
phism, we then have that −p(a) = (−p)(a), where −p is the additive inverse of p in S[x]. Thus, S[a]
is closed under additive inverses.
Therefore, S[a] is a subring of R. Finally, notice that S ∪ {a} ⊆ S[a] by considering the constant polynomials
and x ∈ S[x].
Suppose now that T is an arbitrary subring of R with S ∪ {a} ⊆ T . Let n ∈ N and let s0 , s1 , . . . , sn ∈ S
be arbitrary. Notice that a0 = 1 ∈ T because T is a subring of R, and ak ∈ T for all k ∈ N+ because a ∈ T
and T is closed under multiplication. Since sk ∈ T for all k (because S ⊆ T ), we can use the fact that T is
closed under multiplication to conclude that sk ak ∈ T for all k. Finally, since T is closed under addition, we
conclude that sn an + sn−1 an−1 + · · · + s1 a + s0 ∈ T . Therefore, S[a] ⊆ T .
In group theory, we know that if |a| = n, then we can write ⟨a⟩ = {ak : 0 ≤ k ≤ n − 1} in place of
{ak : k ∈ Z}. In other words, sometimes we have redundancy in the set {ak : k ∈ Z} and can express ⟨a⟩
with less. The same holds true now, and sometimes we can obtain all elements of S[a] using fewer than all
polynomials in S[x]. For example, consider Q as a subring of R. Suppose that we now want to include √2.
From above, we then have that

Q[√2] = {p(√2) : p(x) ∈ Q[x]}
      = {bn (√2)n + bn−1 (√2)n−1 + · · · + b1 (√2) + b0 : n ∈ N and b0 , b1 , . . . , bn ∈ Q}.

Although this certainly works, there is a lot of redundancy in the set on the right. For example, when we
plug √2 into x3 + x2 − 7, we obtain

(√2)3 + (√2)2 − 7 = √8 + 2 − 7
                  = 2√2 − 5

which is the same thing as plugging √2 into 2x − 5. In fact, since {a + b√2 : a, b ∈ Q} is a subring of R, as
we checked in Section 9.1, it suffices to plug √2 into only the polynomials of the form a + bx where a, b ∈ Q.
A similar thing happened with Z[i] = {a + bi : a, b ∈ Z}, where if we want to find the smallest subring of C
containing Z ∪ {i}, we just needed to plug i into linear polynomials with coefficients in Z.
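This closure property is easy to see in code: representing a + b√2 as the pair (a, b), the product of two such pairs is again such a pair, so linear polynomials suffice. A small sketch using exact rational arithmetic (names are ours):

```python
from fractions import Fraction

def mul_sqrt2(x, y):
    """Multiply a + b*sqrt(2) and c + d*sqrt(2), using (sqrt(2))^2 = 2.

    Elements of Q[sqrt(2)] are pairs (a, b) of rationals meaning a + b*sqrt(2);
    the product stays in the same form, so no higher powers of sqrt(2) appear.
    """
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

# (sqrt(2))^3 + (sqrt(2))^2 - 7 = 2*sqrt(2) - 5, computed as pairs:
r2 = (Fraction(0), Fraction(1))          # sqrt(2) itself
r2_sq = mul_sqrt2(r2, r2)                # (2, 0)
r2_cu = mul_sqrt2(r2_sq, r2)             # (0, 2)
result = (r2_cu[0] + r2_sq[0] - 7, r2_cu[1] + r2_sq[1])
print(result)  # (Fraction(-5, 1), Fraction(2, 1)), i.e. 2*sqrt(2) - 5
```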
Suppose instead that we consider Q as a subring of R, and we now want to include ∛2. We know from
above that

Q[∛2] = {p(∛2) : p(x) ∈ Q[x]}
      = {bn (∛2)n + bn−1 (∛2)n−1 + · · · + b1 (∛2) + b0 : n ∈ N and b0 , b1 , . . . , bn ∈ Q}.

Can we express Q[∛2] as {a + b∛2 : a, b ∈ Q}? Notice that (∛2)2 = ∛4 must be an element of Q[∛2], but
it is not clear whether it is possible to write ∛4 in the form a + b∛2 where a, b ∈ Q. In fact, it is impossible
to find such a and b, although it is tedious to verify this fact now. However, it is true that

Q[∛2] = {a + b∛2 + c∛4 : a, b, c ∈ Q},

so we need only plug ∛2 into quadratic polynomials with coefficients in Q. We will verify all of these facts,
and investigate these kinds of questions in more detail later.
Suppose now that S is a subring of a commutative ring R and a1 , a2 , . . . , an ∈ R. How do we obtain an
explicit description of S[a1 , a2 , . . . , an ]? It is natural to guess that one obtains S[a1 , a2 , . . . , an ] by plugging

the point (a1 , a2 , . . . , an ) into all polynomials in n variables over S. This is indeed the case, but we will wait
until we develop the theory of such multivariable polynomials more deeply.
We now move on to generating ideals. Analogously to Proposition 5.2.8 and Proposition 10.4.1, we have
the following result. Since the argument is completely analogous, we omit the proof.

Proposition 10.4.3. Let R be a ring and let A ⊆ R. There exists an ideal I of R with the following properties:

• A ⊆ I.

• Whenever J is an ideal of R with the property that A ⊆ J, we have I ⊆ J.

Furthermore, the ideal is unique (i.e. if both I1 and I2 have the above properties, then I1 = I2 ).

Let R be a commutative ring. Suppose that a ∈ R, and we want an explicit description of the smallest
ideal that contains the element a (so we are considering the case where A = {a}). Notice that ra must be
an element of this ideal for all r ∈ R. Fortunately, the set of all such elements turns out to be an ideal.

Proposition 10.4.4. Let R be a commutative ring and let a ∈ R. Let I = {ra : r ∈ R}.

1. I is an ideal of R with a ∈ I.

2. If J is an ideal of R with a ∈ J, then I ⊆ J.

Therefore, I is the smallest ideal of R containing a.

Proof. We first prove 1. We begin by noting that a = 1 · a ∈ I and that 0 = 0 · a ∈ I. For any r, s ∈ R, we
have
ra + sa = (r + s)a ∈ I
so I is closed under addition. For any r, s ∈ R, we have

r · (sa) = (rs) · a ∈ I,

so I is closed under multiplication by elements of R. Therefore, I is an ideal of R with a ∈ I by Corollary 10.1.3.


We now prove 2. Let J be an ideal of R with a ∈ J. For any r ∈ R, we have ra ∈ J because J is an ideal
of R and a ∈ J. It follows that I ⊆ J.

Since we will be looking at these types of ideals often, and they are somewhat analogous to cyclic
subgroups, we steal the notation that we used in group theory.

Definition 10.4.5. Let R be a commutative ring and let a ∈ R. We define hai = {ra : r ∈ R}. The ideal
hai is called the ideal generated by a.

Be careful with this overloaded notation! If G is a group and a ∈ G, then hai = {an : n ∈ Z}. However,
if R is a commutative ring and a ∈ R, then hai = {ra : r ∈ R}.
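In a finite commutative ring such as Z/nZ, the set {ra : r ∈ R} from Proposition 10.4.4 can be computed directly by letting r run over the whole ring. A sketch (the helper name is our own):

```python
def principal_ideal(a, n):
    """The principal ideal <a> = {r*a mod n : r in Z/nZ} of the ring Z/nZ."""
    return {(r * a) % n for r in range(n)}

print(sorted(principal_ideal(4, 12)))   # [0, 4, 8]
```

For example, in Z/12Z the ideal h4i consists of the three multiples of 4.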

Definition 10.4.6. Let R be a commutative ring and let I be an ideal of R. We say that I is a principal
ideal if there exists a ∈ R with I = hai.

Proposition 10.4.7. The ideals of Z are precisely the sets nZ = hni for every n ∈ N. Thus, every ideal of
Z is principal.

Proof. For each n ∈ N, we know that hni = {kn : k ∈ Z} is an ideal by Proposition 10.4.4. Suppose now
that I is an arbitrary ideal of Z. Since ideals of Z are in particular additive subgroups of Z, one can use
Corollary 8.1.6 to argue that every ideal of Z is of this form. However, we give a direct argument.
Let I be an arbitrary ideal of Z. We know that 0 ∈ I. If I = {0}, then I = h0i, and we are done. Suppose
then that I 6= {0}. Notice that if k ∈ I, then −k ∈ I as well, so since I contains a nonzero element, we have
I ∩ N+ 6= ∅. By well-ordering, we may let n = min(I ∩ N+ ). We claim that I = hni.
First notice that since n ∈ I and I is an ideal of Z, it follows from Proposition 10.4.4 that hni ⊆ I. Let
m ∈ I be arbitrary. Fix q, r ∈ Z with m = qn + r and 0 ≤ r < n. We then have that r = m − qn = m + (−q)n.
Since m, n ∈ I and I is an ideal, we know that (−q)n ∈ I and so r = m + (−q)n ∈ I. Now 0 ≤ r < n and
n = min(I ∩ N+ ), so we must have r = 0. It follows that m = qn, so m ∈ hni. Now m ∈ I was arbitrary, so
I ⊆ hni. Putting this together with the above, we conclude that I = hni.

Lemma 10.4.8. Let R be a commutative ring and let I be an ideal of R. We have that I = R if and only
if I contains a unit of R.

Proof. If I = R, then 1 ∈ I, so I contains a unit. Suppose conversely that I contains a unit, and fix such a
unit u ∈ I. Since u is a unit, we may fix v ∈ R with vu = 1. Since u ∈ I and I is an ideal, we conclude that
1 ∈ I. Now for any r ∈ R, we have r = r · 1 ∈ I again because I is an ideal of R. Thus, R ⊆ I, and since
I ⊆ R trivially, it follows that I = R.

Using principal ideals, we can prove the following simple but important result.

Proposition 10.4.9. Let R be a commutative ring. The following are equivalent.

1. R is a field.

2. The only ideals of R are {0} and R.

Proof. We first prove that 1 implies 2. Suppose that R is a field. Let I be an ideal of R with I 6= {0}. Fix
a ∈ I with a 6= 0. Since R is a field, every nonzero element of R is a unit, so a is a unit. Since a ∈ I, we
may use the lemma to conclude that I = R.
We now prove that 2 implies 1 by proving the contrapositive. Suppose that R is not a field. Fix a nonzero
element a ∈ R such that a is not a unit. Let I = hai = {ra : r ∈ R}. We know from above that I is an
ideal of R. If 1 ∈ I, then we may fix r ∈ R with ra = 1, which implies that a is a unit (remember that we
are assuming that R is commutative). Therefore, 1 ∈/ I, and hence I 6= R. Since a 6= 0 and a ∈ I, we have
I 6= {0}. Therefore, I is an ideal of R distinct from {0} and R.
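Proposition 10.4.9 can be observed concretely in Z/nZ, where every ideal is principal: for a prime modulus every nonzero element generates the whole ring, while a composite modulus admits a proper nonzero ideal. A brute-force sketch (the helper name is our own):

```python
def principal_ideal(a, n):
    # The ideal <a> = {r*a mod n : r in Z/nZ}.
    return {(r * a) % n for r in range(n)}

# Z/7Z is a field: every nonzero element generates the whole ring.
print(all(principal_ideal(a, 7) == set(range(7)) for a in range(1, 7)))   # True

# Z/6Z is not a field: <2> is an ideal other than {0} and the whole ring.
print(sorted(principal_ideal(2, 6)))   # [0, 2, 4]
```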

Given finitely many elements a1 , a2 , . . . , an of a commutative ring R, we can also describe the smallest
ideal of R containing {a1 , a2 , . . . , an } in a simple manner. We leave the verification as an exercise.

Definition 10.4.10. Let R be a commutative ring and let a1 , a2 , . . . , an ∈ R. We define

ha1 , a2 , . . . , an i = {r1 a1 + r2 a2 + · · · + rn an : ri ∈ R for each i}

This set is the smallest ideal of R containing a1 , a2 , . . . , an , and we call it the ideal generated by a1 , a2 , . . . , an .

For example, consider the ring Z. We know that h15, 42i is an ideal of Z. Now

h15, 42i = {15k + 42` : k, ` ∈ Z}

is an ideal of Z, so it must equal hni for some n ∈ Z by Proposition 10.4.7. Since gcd(15, 42) = 3, we know
that 3 divides every element in h15, 42i, and we also know that 3 is an actual element of h15, 42i. Working
out the details, one can show that h15, 42i = h3i. Generalizing this argument, one can show that if a, b ∈ Z,
then ha, bi = hgcd(a, b)i.
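The equality h15, 42i = h3i can also be checked by brute force: inside a finite window, the integer combinations 15k + 42ℓ coincide with the multiples of gcd(15, 42) = 3. A sketch (the window size is an arbitrary choice):

```python
from math import gcd

N = 100   # arbitrary window: check integers m with |m| <= N
combos = {15 * k + 42 * l for k in range(-N, N + 1) for l in range(-N, N + 1)}
window = set(range(-N, N + 1))

print(gcd(15, 42))                                              # 3
print((combos & window) == {m for m in window if m % 3 == 0})   # True
```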
10.5 Prime and Maximal Ideals in Commutative Rings


Throughout this section, we work in the special case when R is a commutative ring. In this context, we
define two very special kinds of ideals.

Definition 10.5.1. Let R be a commutative ring.

• A prime ideal of R is an ideal P ⊆ R such that P 6= R and whenever ab ∈ P , then either a ∈ P or
b ∈ P.

• A maximal ideal of R is an ideal M ⊆ R such that M 6= R and there exists no ideal I of R with
M ( I ( R.

For an example, consider the ideal hxi in the commutative ring Z[x]. We have

hxi = {x · p(x) : p(x) ∈ Z[x]}
    = {x · (an xn + an−1 xn−1 + · · · + a1 x + a0 ) : n ∈ N and a0 , a1 , . . . , an ∈ Z}
    = {an xn+1 + an−1 xn + · · · + a1 x2 + a0 x : n ∈ N and a0 , a1 , . . . , an ∈ Z}
    = {f (x) ∈ Z[x] : the constant term of f (x) is 0}
    = {f (x) ∈ Z[x] : f (0) = 0}.

We claim that hxi is a prime ideal of Z[x]. To see this, suppose that f (x), g(x) ∈ Z[x] and that f (x)g(x) ∈ hxi.
We then have that f (0)g(0) = 0, so either f (0) = 0 or g(0) = 0 (because Z is an integral domain). It follows
that either f (x) ∈ hxi or g(x) ∈ hxi.
For an example of an ideal that is not a prime ideal, consider the ideal h6i = 6Z in Z. We have that
2 · 3 ∈ 6Z but 2 ∈/ 6Z and 3 ∈/ 6Z. Also, h6i is not a maximal ideal because h6i ( h3i ( Z (every multiple
of 6 is a multiple of 3, but 3 ∈/ h6i).
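The failure of h6i to be a prime ideal, and the success of h5i, can be checked by a brute-force search for the defining condition over a finite window of integers. A sketch (the bound is an arbitrary choice, and the function name is our own):

```python
def ideal_is_prime(n, bound=50):
    """Brute-force the prime-ideal condition for nZ over |a|, |b| <= bound:
    look for a counterexample with a*b in nZ but neither a nor b in nZ."""
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if (a * b) % n == 0 and a % n != 0 and b % n != 0:
                return False   # found ab in nZ with a, b both outside nZ
    return True

print(ideal_is_prime(6), ideal_is_prime(5))   # False True
```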

Proposition 10.5.2. Let R be a commutative ring with 1 6= 0. The ideal {0} is a prime ideal of R if and
only if R is an integral domain.

Proof. Suppose first that {0} is a prime ideal of R. Let a, b ∈ R be arbitrary with ab = 0. We then have
that ab ∈ {0}, so either a ∈ {0} or b ∈ {0} because {0} is a prime ideal of R. Thus, either a = 0 or b = 0.
Hence, R has no zero divisors, and since R is commutative with 1 6= 0, we conclude that R is an integral domain.
Conversely, suppose that R is an integral domain. Notice that {0} is an ideal of R with {0} 6= R (since
1 6= 0). Let a, b ∈ R be arbitrary with ab ∈ {0}. We then have that ab = 0, so as R is an integral domain we
can conclude that either a = 0 or b = 0. Thus, either a ∈ {0} or b ∈ {0}. It follows that {0} is a prime ideal
of R.

We can now classify the prime and maximal ideals of Z.

Proposition 10.5.3. Let I be an ideal of Z.

1. I is a prime ideal if and only if either I = {0} or I = hpi for some prime p ∈ N+ .

2. I is a maximal ideal if and only if I = hpi for some prime p ∈ N+ .

Proof. 1. First notice that {0} is a prime ideal of Z by Proposition 10.5.2 because Z is an integral domain.
Suppose now that p ∈ N+ is prime and let I = hpi = {pk : k ∈ Z}, which is indeed an ideal of Z
by Proposition 10.4.4. Suppose that a, b ∈ Z are such that ab ∈ pZ. Fix k ∈ Z
with ab = pk. We then have that p | ab, so as p is prime, either p | a or p | b by Proposition 2.5.6. If
p | a, then we can fix ` ∈ Z with a = p`, from which we can conclude that a ∈ pZ. Similarly, if p | b,
then we can fix ` ∈ Z with b = p`, from which we can conclude that b ∈ pZ. Thus, either a ∈ pZ or
b ∈ pZ. Therefore, pZ is a prime ideal of Z.
Suppose conversely that I is a prime ideal of Z. By Proposition 10.4.7, we can fix n ∈ N with I = hni.
We need to prove that either n = 0 or n is prime. Notice that n 6= 1 because h1i = Z is not a prime
ideal of Z by definition. Suppose that n ≥ 2 is not prime. We can then fix c, d ∈ N with n = cd and
both 1 < c < n and 1 < d < n. Notice that cd = n ∈ hni. However, if c ∈ hni, then fixing k ∈ Z
with c = nk, we notice that k > 0 (because c, n > 0), so c = nk ≥ n, a contradiction. Thus, c ∈ / hni.
Similarly, d ∈/ hni. It follows that hni is not a prime ideal when n ≥ 2 is composite. Therefore, if I = hni
is a prime ideal, then either n = 0 or n is prime.
2. Suppose first that I = hpi for some prime p ∈ N+ . Let J be an ideal of Z with I ⊆ J. We prove that
either J = I or J = Z. By Proposition 10.4.7, we can fix n ∈ N with J = hni. Since I ⊆ J and p ∈ I,
we have that p ∈ J = hni. Thus, we can fix k ∈ Z with p = nk. Notice that k > 0 because p, n > 0.
Since p is prime, it follows that either n = 1 or n = p. If n = 1, then J = h1i = Z. If n = p, then
J = hpi = I. Therefore, I = hpi is a maximal ideal of Z.
Suppose conversely that I is a maximal ideal of Z. By Proposition 10.4.7, we can fix n ∈ N with
I = hni. We need to prove that n is prime. Notice that n 6= 0 because h0i is not a maximal
ideal of Z (we have h0i ( h2i ( Z). Also, n 6= 1 because h1i = Z is not a maximal ideal of Z by
definition. Suppose that n ≥ 2 is not prime. We can then fix c, d ∈ N with n = cd and both 1 < c < n
and 1 < d < n. Notice that hni ⊆ hdi because every multiple of n is a multiple of d. However, we
have d ∈/ hni because n ∤ d (as 1 < d < n). Furthermore, we have hdi 6= Z because 1 ∈/ hdi. Since
hni ( hdi ( Z, we conclude that hni is not a maximal ideal of Z. Therefore, if I = hni is a maximal
ideal, then n is prime.
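The maximality criterion in the proof above reduces to a divisor search: hni ( hdi ( Z happens exactly when d is a divisor of n with 1 < d < n, so for n ≥ 2 the ideal hni is maximal if and only if no such d exists, i.e. if and only if n is prime. A brute-force sketch (the function name is our own):

```python
def ideal_is_maximal(n):
    """<n> is maximal in Z iff no ideal <d> sits strictly between <n> and Z,
    i.e. iff n >= 2 and n has no divisor d with 1 < d < n."""
    return n >= 2 and all(n % d != 0 for d in range(2, n))

print([n for n in range(20) if ideal_is_maximal(n)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```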

We can classify the prime and maximal ideals of a commutative ring by the properties of the corresponding
quotient ring.
Theorem 10.5.4. Let R be a commutative ring and let P be an ideal of R. P is a prime ideal of R if and
only if R/P is an integral domain.
Proof. Suppose first that P is a prime ideal of R. Since R is commutative, we know that R/P is commutative.
By definition, we have P 6= R, so 1 ∈ / P , and hence 1 + P 6= 0 + P . Finally, suppose that a, b ∈ R with
(a + P ) · (b + P ) = 0 + P . We then have that ab + P = 0 + P , so ab ∈ P . Since P is a prime ideal, either
a ∈ P or b ∈ P . Therefore, either a + P = 0 + P or b + P = 0 + P . It follows that R/P is an integral domain.
Suppose conversely that R/P is an integral domain. We then have that 1 + P 6= 0 + P by definition of
an integral domain, hence 1 ∈ / P and so P 6= R. Suppose that a, b ∈ R with ab ∈ P . We then have
(a + P ) · (b + P ) = ab + P = 0 + P
Since R/P is an integral domain, we conclude that either a + P = 0 + P or b + P = 0 + P . Therefore, either
a ∈ P or b ∈ P . It follows that P is a prime ideal of R.
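For R = Z and P = hni with n ≥ 2, Theorem 10.5.4 specializes to a statement that is easy to test by brute force: hni is prime exactly when Z/nZ has no zero divisors. A sketch (the helper name is our own):

```python
def has_zero_divisors(n):
    """True if the quotient ring Z/nZ contains nonzero a, b with a*b = 0."""
    return any((a * b) % n == 0
               for a in range(1, n) for b in range(1, n))

print(has_zero_divisors(6), has_zero_divisors(7))   # True False
```

Here 2 · 3 = 0 in Z/6Z, while Z/7Z is an integral domain, matching the fact that h6i is not a prime ideal of Z but h7i is.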
Theorem 10.5.5. Let R be a commutative ring and let M be an ideal of R. M is a maximal ideal of R if
and only if R/M is a field.
Proof. We give two proofs. The first is the slick “highbrow” proof. Using the Correspondence Theorem and
Proposition 10.4.9, we have
M is a maximal ideal of R ⇐⇒ M 6= R and there are no ideals I of R with M ( I ( R
⇐⇒ there are no ideals of R/M other than {0 + M } and R/M
⇐⇒ R/M is a field
If you don’t like appealing to the Correspondence Theorem (which is a shame, because it’s awesome), we
can prove it directly via a “lowbrow” proof.
Suppose first that M is a maximal ideal of R. Fix a nonzero element a+M ∈ R/M . Since a+M 6= 0+M ,
we have that a ∈/ M . Let I = {ra + m : r ∈ R, m ∈ M }. We then have that I is an ideal of R (check it)
with M ⊆ I and a ∈ I. Since a ∈/ M , we have M ( I, so as M is maximal it follows that I = R. Thus,
we may fix r ∈ R and m ∈ M with ra + m = 1. We then have ra − 1 = −m ∈ M , so ra + M = 1 + M . It
follows that (r + M )(a + M ) = 1 + M , so a + M has an inverse in R/M (recall that R and hence R/M is
commutative, so we only need an inverse on one side).
Suppose conversely that R/M is a field. Since R/M is a field, we have 1 + M 6= 0 + M , so 1 ∈
/ M and
hence M ( R. Fix an ideal I of R with M ( I. Since M ( I, we may fix a ∈ I\M . Since a ∈ / M , we have
a+M 6= 0+M , and using the fact that R/M is a field we may fix b+M ∈ R/M with (a+M )(b+M ) = 1+M .
We then have ab + M = 1 + M , so ab − 1 ∈ M . Fixing m ∈ M with ab − 1 = m, we then have ab − m = 1.
Now a ∈ I, so ab ∈ I as I is an ideal. Also, we have m ∈ M ⊆ I. It follows that 1 = ab − m ∈ I, and thus
I = R. Therefore, M is a maximal ideal of R.
From the theorem, we see that pZ is a maximal ideal of Z for every prime p ∈ N+ because we know
that Z/pZ is a field, which gives a much faster proof of one part of Proposition 10.5.3. We also obtain the
following nice corollary.

Corollary 10.5.6. Let R be a commutative ring. Every maximal ideal of R is a prime ideal of R.
Proof. Suppose that M is a maximal ideal of R. We then have that R/M is a field by Theorem 10.5.5, so
R/M is an integral domain by Proposition 9.2.9. Therefore, M is a prime ideal of R by Theorem 10.5.4.
The converse is not true. As we’ve seen, the ideal {0} is a prime ideal of Z, but it is certainly not a
maximal ideal of Z.
Chapter 11

Divisibility and Factorizations in Integral Domains

Our primary example of an integral domain that is not a field is Z. We spent a lot of time developing the
arithmetic of Z in Chapter 2, and there we worked through the notions of divisibility, greatest common
divisors, and prime factorizations. In this chapter, we explore how much of this arithmetic of Z we can carry
over to other types of integral domains.

11.1 Divisibility and Associates


We begin with the development of divisibility in integral domains. It is possible, but more involved, to
discuss these concepts in more general rings. However, in a noncommutative ring, we would need to
refer constantly to which side we are working on, and this draws attention away from our primary concerns.
Even in a commutative ring, divisibility can have some undesirable properties (see below). Although the
general commutative case is worthy of investigation, the fundamental examples for us will be integral domains
anyway, so we want to focus on the interesting questions in this setting.
Definition 11.1.1. Let R be an integral domain and let a, b ∈ R. We say that a divides b, and write a | b,
if there exists d ∈ R with b = ad.
Of course, this is just our definition of divisibility in Z generalized to an arbitrary integral domain R.
For example, in the ring Z[x], we have

x2 + 3x − 1 | x4 − x3 − 11x2 + 10x − 2

because
(x2 + 3x − 1)(x2 − 4x + 2) = x4 − x3 − 11x2 + 10x − 2
If R is a field, then for any a, b ∈ R with a 6= 0, we have a | b because b = a(a−1 b). Thus, the divisibility
relation is trivial in fields. Also notice that we have that a ∈ R is a unit if and only if a | 1.
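Products like the one above are easy to verify mechanically: polynomial multiplication is a convolution of coefficient lists. A sketch with coefficients listed constant term first (the helper name is our own):

```python
def poly_mult(p, q):
    """Multiply polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (x^2 + 3x - 1)(x^2 - 4x + 2) = x^4 - x^3 - 11x^2 + 10x - 2
print(poly_mult([-1, 3, 1], [2, -4, 1]))   # [-2, 10, -11, -1, 1]
```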
Proposition 11.1.2. Let R be an integral domain and let a, b, c ∈ R.
1. If a | b and b | c, then a | c.
2. If a | b and a | c, then a | (rb + sc) for all r, s ∈ R.
Proof.

1. Fix k, m ∈ R with b = ak and c = bm. We then have c = bm = (ak)m = a(km), so a | c.

2. Fix k, m ∈ R with b = ak and c = am. Let r, s ∈ R. We then have

rb + sc = r(ak) + s(am)
= a(rk) + a(sm) (since R is commutative)
= a(rk + sm)

so a | (rb + sc).

For an example of where working in an integral domain makes divisibility have more desirable properties,
consider the following proposition, which would be false in general commutative rings. Working in R =
Z × Z (which is not an integral domain), notice that (2, 0) | (6, 0) via both (2, 0) · (3, 0) = (6, 0) and also
(2, 0) · (3, 5) = (6, 0).

Proposition 11.1.3. Let R be an integral domain. Suppose that a, b ∈ R with a 6= 0 and that a | b. There
exists a unique d ∈ R such that ad = b.

Proof. The existence of such a d follows immediately from the definition of divisibility. Suppose that c, d ∈ R
satisfy ac = b and ad = b. We then have that ac = ad. Since a 6= 0 and R is an integral domain, we may use
Proposition 9.2.11 to cancel the a’s to conclude that c = d.

Recall that if R is a ring, then we denote the set of units of R by U (R), and that U (R) forms a
multiplicative group by Proposition 9.2.4.

Definition 11.1.4. Let R be an integral domain. Define a relation ∼ on R by letting a ∼ b if there exists a
u ∈ U (R) such that b = au.

Proposition 11.1.5. Let R be an integral domain. The relation ∼ is an equivalence relation.

Proof. We check the properties.

• Reflexive: Given a ∈ R, we have a = a · 1, so since 1 ∈ U (R) it follows that a ∼ a.

• Symmetric: Suppose that a, b ∈ R with a ∼ b. Fix u ∈ U (R) with b = au. Multiplying on the right by
u−1 , we see that a = bu−1 . Since u−1 ∈ U (R) (by Proposition 9.2.4), it follows that b ∼ a.

• Transitive: Suppose that a, b, c ∈ R with a ∼ b and b ∼ c. Fix u, v ∈ U (R) with b = au and c = bv.
We then have c = bv = (au)v = a(uv). Since uv ∈ U (R) (by Proposition 9.2.4), it follows that a ∼ c.

Therefore, ∼ is an equivalence relation.

Definition 11.1.6. Let R be an integral domain. Elements of the same equivalence class are called associates.
In other words, given a, b ∈ R, then a and b are associates if there exists u ∈ U (R) with b = au.

For example, we have U (Z) = {±1}, so the associates of a given n ∈ Z are exactly ±n. Thus, the
equivalence classes partition Z into the sets {0}, {±1}, {±2}, . . . .

Proposition 11.1.7. Let R be an integral domain and let a, b ∈ R. The following are equivalent.

1. a and b are associates in R.

2. Both a | b and b | a.
Proof. Suppose first that a and b are associates. Fix u ∈ U (R) with b = au. We then clearly have a | b, and
since a = bu−1 we have b | a.
Suppose conversely that both a | b and b | a. Fix c, d ∈ R with b = ac and a = bd. Notice that if a = 0,
then b = ac = 0c = 0, so a = b1 and a and b are associates. Suppose instead that a 6= 0. We then have

a1 = a = bd = (ac)d = acd

Since R is an integral domain and a 6= 0, it follows that cd = 1 (using Proposition 9.2.11), so both c, d ∈ U (R).
Therefore, as b = ac, it follows that a and b are associates.

It will be very useful for us to rephrase the definition of divisibility and associates in terms of principal
ideals. Notice that the elements a and b switch sides in the next proposition.

Proposition 11.1.8. Let R be an integral domain and let a, b ∈ R. We have a | b if and only if hbi ⊆ hai.

Proof. Suppose first that a | b. Fix d ∈ R with b = ad. We then have b ∈ hai, hence hbi ⊆ hai because hbi is
the smallest ideal containing b.
Suppose conversely that hbi ⊆ hai. Since b ∈ hbi, we then have in particular that b ∈ hai. Thus, we may
fix d ∈ R with b = ad. It follows that a | b.

Corollary 11.1.9. Let R be an integral domain and let a, b ∈ R. We have hai = hbi if and only if a and b
are associates.

Proof. We have

hai = hbi ⇐⇒ hai ⊆ hbi and hbi ⊆ hai


⇐⇒ b | a and a | b (by Proposition 11.1.8)
⇐⇒ a and b are associates. (by Proposition 11.1.7)

Finally, we begin a discussion of greatest common divisors in integral domains. Recall that in our
discussion about Z, we avoided defining the greatest common divisor as the largest common divisor of a and
b, and instead said that it had the property that every other common divisor of a and b was also a divisor
of it. Taking this approach was useful in Z (after all, gcd(a, b) had this much stronger property anyway and
otherwise gcd(0, 0) would not make sense), but it is absolutely essential when we try to generalize it to other
integral domains since a general integral domain has no notion of “order” or “largest”.

Definition 11.1.10. Let R be an integral domain and let a, b ∈ R.

1. A common divisor of a and b is an element c ∈ R such that c | a and c | b.

2. An element d ∈ R is called a greatest common divisor of a and b if

• d is a common divisor of a and b.


• For every common divisor c of a and b, we have c | d.

When we worked in Z, we added the additional requirement that gcd(a, b) was nonnegative so that it
would be unique. However, just like with “order”, there is no notion of “positive” in a general integral
domain. Thus, we will have to live with a lack of uniqueness in general. Fortunately, any two greatest
common divisors are associates, as we now prove.

Proposition 11.1.11. Suppose that R is an integral domain and a, b ∈ R.


• If d is a greatest common divisor of a and b, then every associate of d is also a greatest common divisor
of a and b.
• If d and d0 are both greatest common divisors of a and b, then d and d0 are associates.
Proof. Suppose that d is a greatest common divisor of a and b. Suppose that d0 is an associate of d. We
then have that d0 | d, so since d is a common divisor of a and b and divisibility is transitive, it follows that d0
is a common divisor of a and b. Let c be a common divisor of a and b. Since d is a greatest common divisor
of a and b, we know that c | d. Now d | d0 because d and d0 are associates, so by transitivity of divisibility,
we conclude that c | d0 . Therefore, every common divisor of a and b divides d0 , and hence d0 is a greatest
common divisor of a and b.
Suppose that d and d0 are both greatest common divisors of a and b. We then have that d0 is a common
divisor of a and b, so d0 | d because d is a greatest common divisor of a and b. Similarly, we have that d is a
common divisor of a and b, so d | d0 because d0 is a greatest common divisor of a and b. Using Proposition
11.1.7, we conclude that d and d0 are associates.
So far, we have danced around the fundamental question: Given an integral domain R and elements
a, b ∈ R, must there exist a greatest common divisor of a and b? The answer is no in general, so proving
existence in “nice” integral domains will be a top priority for us.

11.2 Irreducible and Prime Elements


Our next goal is to generalize the notion of a prime number in Z to an arbitrary integral domain. When we
were working with Z, we noticed the fundamental fact about prime numbers that we used again and again
was Proposition 2.5.6, which stated that if p ∈ Z is a prime and p | ab, then either p | a or p | b. However, in
a general integral domain, it is far from obvious that this property is equivalent to the usual idea of being
unable to factor it nontrivially, so we introduce two different notions.
Definition 11.2.1. Let R be an integral domain and let p ∈ R be nonzero and not a unit.
• We say that p is irreducible if whenever p = ab, then either a is a unit or b is a unit.
• We say that p is prime if whenever p | ab, either p | a or p | b.
The first definition says that an element is irreducible exactly when every possible factorization contains a
“trivial” element. The next proposition says that an element is irreducible exactly when it has only “trivial”
divisors.
Proposition 11.2.2. Let R be an integral domain. A nonzero nonunit p ∈ R is irreducible if and only if
the only divisors of p are the units and the associates of p.
Proof. Let p ∈ R be nonzero and not a unit.
Suppose that p is irreducible. Let a ∈ R with a | p. Fix d ∈ R with p = ad. Since p is irreducible, we
conclude that either a is a unit or d is a unit. If a is a unit, we are done. If d is a unit, then we have that p and
a are associates. Therefore, every divisor of p is either a unit or an associate of p.
Suppose conversely that the only divisors of p are the units and associates of p. Suppose that p = ab
and that a is not a unit. Notice that a 6= 0 because p 6= 0. We have that a | p, so since a is not a unit, we
must have that a is an associate of p. Fix a unit u with p = au. We then have ab = p = au, so since R is an
integral domain and a 6= 0, it follows that b = u. Therefore b is a unit.
Let’s examine these definitions and concepts in the context of the integral domain Z. In Z, our original
definition of p being “prime” was really about p only having trivial divisors, so in hindsight our old definition
more closely resembles our new definition of irreducible. Notice that we also only worked with positive
elements p ∈ Z when talking about “primes” and their divisors. However, it’s straightforward to check that
if a, b ∈ Z and a | b, then (−a) | b. Thus, our old definition of p ∈ Z being “prime” is the same as saying
that p ≥ 2 and the only divisors of p are ±1 and ±p, i.e. the units and associates. If we no longer insist
that p ≥ 2, then this is precisely the same as saying that p is irreducible in Z. Notice now that after dropping
that requirement, we have that −2, −3, −5, . . . are irreducibles in Z. This is generalized in the following
result.
Proposition 11.2.3. Let R be an integral domain.
1. If p ∈ R is irreducible, then every associate of p is irreducible.
2. If p ∈ R is prime, then every associate of p is prime.
Proof. Exercise (see homework).
Let’s examine some elements of the integral domain Z[x]. We know that U (Z[x]) = U (Z) = {1, −1} by
Proposition 9.3.8. Notice that 2x2 + 6 is not irreducible in Z[x] because 2x2 + 6 = 2 · (x2 + 3), and neither
2 nor x2 + 3 is a unit. Also, x2 − 4x − 21 = (x + 3)(x − 7), and neither x + 3 nor x − 7 is a unit in Z[x], so
x2 − 4x − 21 is not irreducible in Z[x].
In contrast, we claim that x + 3 is irreducible in Z[x]. To see this, suppose that f (x), g(x) ∈ Z[x] are
such that x + 3 = f (x)g(x). Notice that we must have that f (x) and g(x) are both nonzero. By Proposition
9.3.6, we know that
deg(x + 3) = deg(f (x)) + deg(g(x))
so
1 = deg(f (x)) + deg(g(x))
It follows that one of deg(f (x)) and deg(g(x)) equals 0, while the other is 1. Suppose without loss of
generality that deg(f (x)) = 0, so f (x) is a nonzero constant, say f (x) = c. Since deg(g(x)) = 1, we can fix
a, b ∈ Z with g(x) = ax + b. We then have

x + 3 = c · (ax + b) = ca · x + cb

It follows that ca = 1 in Z, so c ∈ {1, −1}, and hence f (x) = c is a unit in Z[x]. Therefore, x + 3 is irreducible
in Z[x].
One can also show that x + 3 is prime in Z[x]. The key fact to use here is Proposition 10.3.2, from which
we conclude that for any h(x) ∈ Z[x], we have that x + 3 | h(x) in Z[x] if and only if h(−3) = 0. Suppose
then that f (x), g(x) ∈ Z[x] are such that x + 3 | f (x)g(x). We then have that f (−3)g(−3) = 0, so since Z is
an integral domain, either f (−3) = 0 or g(−3) = 0. It follows that either x + 3 | f (x) in Z[x] or x + 3 | g(x)
in Z[x].
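The evaluation criterion just used, namely that x + 3 | h(x) in Z[x] if and only if h(−3) = 0, gives a one-line divisibility test. For instance, x2 + x − 6 = (x + 3)(x − 2), while x2 + 1 is not divisible by x + 3. A sketch (the helper name is our own):

```python
def divisible_by_x_plus_3(coeffs):
    """Test x + 3 | h(x) in Z[x] by evaluating h(-3) (coeffs: constant first)."""
    return sum(c * (-3) ** i for i, c in enumerate(coeffs)) == 0

print(divisible_by_x_plus_3([-6, 1, 1]))   # x^2 + x - 6: True
print(divisible_by_x_plus_3([1, 0, 1]))    # x^2 + 1: False
```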
From these examples, we see that there certainly does appear to be some connection between the concepts
of irreducible and prime, but they also have a slightly different flavor. In Z, it is true that an element is
prime exactly when it is irreducible. The hard direction here is essentially the content of Proposition 2.5.6
(although again we technically only dealt with positive elements there, but we can use Proposition 11.2.3).
In general, one direction of this equivalence from Z holds in every integral domain, but we will see that the
other does not.
Proposition 11.2.4. Let R be an integral domain. If p is prime, then p is irreducible.
Proof. Suppose that p is prime. By definition of prime, p is nonzero and not a unit. Let a, b ∈ R be arbitrary
with p = ab. We then have p1 = ab, so p | ab. Since p is prime, we conclude that either p | a or p | b.
Suppose that p | a. Fix c ∈ R with a = pc. We then have

p1 = p = ab = (pc)b = p(cb)
Since R is an integral domain and p 6= 0, we may cancel it to conclude that 1 = cb, so b is a unit. Suppose
instead that p | b. Fix d ∈ R with b = pd. We then have

p1 = p = ab = a(pd) = p(ad)

Since R is an integral domain and p 6= 0, we may cancel it to conclude that 1 = ad, so a is a unit. Therefore,
either a or b is a unit. It follows that p is irreducible.
As in divisibility, it is helpful to rephrase our definition of prime elements in terms of principal ideals,
and fortunately our common names here coincide.
Proposition 11.2.5. Let R be an integral domain and let p ∈ R be nonzero. The ideal hpi is a prime ideal
of R if and only if p is a prime element of R.
Proof. Suppose first that hpi is a prime ideal of R. Notice that p 6= 0 by assumption and that p is not a unit
because hpi 6= R. Suppose that a, b ∈ R and p | ab. We then have that ab ∈ hpi, so as hpi is a prime ideal
we know that either a ∈ hpi or b ∈ hpi. In the former case, we conclude that p | a, and in the latter case we
conclude that p | b. Since a, b ∈ R were arbitrary, it follows that p is a prime element of R.
Suppose conversely that p is a prime element of R. By definition, we know that p is not a unit, so 1 ∈/ hpi
and hence hpi =
6 R. Suppose that a, b ∈ R and ab ∈ hpi. We then have that p | ab, so as p is a prime element
we know that either p | a or p | b. In the former case, we conclude that a ∈ hpi and in the latter case we
conclude that b ∈ hpi. Since a, b ∈ R were arbitrary, it follows that hpi is a prime ideal of R.
One word of caution here is that if R is an integral domain, then h0i = {0} is a prime ideal, but 0 is not
a prime element.
Recall from Corollary 9.3.9 that if F is a field, then U (F [x]) = U (F ) = F \{0}, i.e. the units of F [x] are
the nonzero constant polynomials. Using this classification of the units, we can give a simple characterization
of the irreducible elements in F [x] as those polynomials that cannot be factored into two polynomials of smaller degree.
Proposition 11.2.6. Let F be a field and let f (x) ∈ F [x] be a nonconstant polynomial. The following are
equivalent.
1. f (x) is irreducible in F [x].
2. There do not exist nonzero polynomials g(x), h(x) ∈ F [x] with f (x) = g(x) · h(x) and both deg(g(x)) <
deg(f (x)) and deg(h(x)) < deg(f (x)).
Proof. We prove 1 ↔ 2 by instead proving ¬1 ↔ ¬2.
• ¬1 → ¬2: Suppose that 1 is false, so f (x) is not irreducible in F [x]. Since f (x) is a nonconstant
polynomial, we know that f (x) is nonzero and not a unit in F [x]. Therefore, there exist g(x), h(x) ∈
F [x] with f (x) = g(x)h(x) and such that neither g(x) nor h(x) is a unit. Notice that g(x) and h(x)
are both nonzero (because f (x) is nonzero), so since U (F [x]) = U (F ) = F \{0} by Corollary 9.3.9, it
follows that g(x) and h(x) are both nonconstant polynomials. Now

deg(f (x)) = deg(g(x) · h(x)) = deg(g(x)) + deg(h(x))

by Proposition 9.3.6. Since h(x) is nonconstant, we have deg(h(x)) ≥ 1, and hence deg(g(x)) <
deg(f (x)). Similarly, since g(x) is nonconstant, we have deg(g(x)) ≥ 1, and hence deg(h(x)) <
deg(f (x)). Thus, we’ve shown that 2 is false.
• ¬2 → ¬1: Suppose that 2 is false, and fix nonzero polynomials g(x), h(x) ∈ F [x] with f (x) = g(x)·h(x)
and both deg(g(x)) < deg(f (x)) and deg(h(x)) < deg(f (x)). Since

deg(f (x)) = deg(g(x) · h(x)) = deg(g(x)) + deg(h(x))


by Proposition 9.3.6, we must have deg(g(x)) 6= 0 and deg(h(x)) 6= 0. Since U (F [x]) = U (F ) = F \{0}
by Corollary 9.3.9, it follows that neither g(x) nor h(x) is a unit. Thus, f (x) is not irreducible in F [x],
and so 1 is false.

Life is not as nice if we are working over an integral domain that is not a field. For example, we showed
above that 2x2 + 6 is not irreducible in Z[x] because 2x2 + 6 = 2 · (x2 + 3), and neither 2 nor x2 + 3 is a
unit in Z[x]. However, it is not possible to write 2x2 + 6 = g(x) · h(x) where g(x), h(x) ∈ Z[x] and both
deg(g(x)) < 2 and deg(h(x)) < 2.

Proposition 11.2.7. Let F be a field and let f (x) ∈ F [x] be a nonzero polynomial with deg(f (x)) ≥ 2. If
f (x) has a root in F , then f (x) is not irreducible in F [x].

Proof. Suppose that f (x) has a root in F , and fix such a root a ∈ F . By Proposition 10.3.2, it follows that
(x − a) | f (x) in F [x]. Fixing g(x) ∈ F [x] with f (x) = (x − a) · g(x), we then have that g(x) is nonzero (since
f (x) is nonzero) and also

deg(f (x)) = deg(x − a) + deg(g(x)) = 1 + deg(g(x)).

Therefore, deg(g(x)) = deg(f (x)) − 1 ≥ 2 − 1 = 1. Since x − a and g(x) both have degree at least 1, we
conclude that neither is a unit in F [x] by Corollary 9.3.9, and hence f (x) is not irreducible in F [x].

Theorem 11.2.8. Let F be a field and let f (x) ∈ F [x] be a nonzero polynomial.

1. If deg(f (x)) = 1, then f (x) is irreducible in F [x].

2. If deg(f (x)) = 2 or deg(f (x)) = 3, then f (x) is irreducible in F [x] if and only if f (x) has no roots in
F.

Proof. 1. This follows immediately from Proposition 11.2.6 and Proposition 9.3.6.

2. Suppose that either deg(f (x)) = 2 or deg(f (x)) = 3. If f (x) has a root in F , then f (x) is not
irreducible immediately by Proposition 11.2.7. Suppose conversely that f (x) is not irreducible in F [x].
Write f (x) = g(x)h(x) where g(x), h(x) ∈ F [x] are nonunits. We have

deg(f (x)) = deg(g(x)) + deg(h(x))

Now g(x) and h(x) are not units, so they each have degree at least 1. Since deg(f (x)) ∈ {2, 3}, it
follows that at least one of g(x) or h(x) has degree equal to 1. Suppose without loss of generality that
deg(g(x)) = 1 and write g(x) = ax + b where a, b ∈ F with a ≠ 0. We then have

f (−ba−1 ) = g(−ba−1 ) · h(−ba−1 )


= (a · (−ba−1 ) + b) · h(−ba−1 )
= (−b + b) · h(−ba−1 )
= 0 · h(−ba−1 )
=0

so −ba−1 ∈ F is a root of f (x).


200 CHAPTER 11. DIVISIBILITY AND FACTORIZATIONS IN INTEGRAL DOMAINS

Notice that this theorem can fail if we are working in R[x] for an integral domain R that is not a field. For example,
in Z[x], we have that deg(3x + 12) = 1, but 3x + 12 is not irreducible in Z[x] because 3x + 12 = 3 · (x + 4),
and neither 3 nor x + 4 is a unit in Z[x]. Of course, the theorem implies that 3x + 12 is irreducible in Q[x]
(the factorization 3x + 12 = 3 · (x + 4) does not work here because 3 is a unit in Q[x]).
Also, note that even in the case of F [x] for a field F , in order to use the nonexistence of roots to
prove that a polynomial is irreducible, we require that the polynomial has degree 2 or 3. This restriction is
essential. Consider the polynomial x4 + 6x2 + 5 in Q[x]. Since x4 + 6x2 + 5 = (x2 + 1)(x2 + 5), it follows
that x4 + 6x2 + 5 is not irreducible in Q[x]. However, notice that x4 + 6x2 + 5 has no roots in Q (or even in
R) because a4 + 6a2 + 5 > 0 for all a ∈ R.
For an example of how to use the theorem affirmatively, consider the polynomial f (x) = x3 − 2 in the
ring Q[x]. We know that f (x) has no roots in Q because ∛2 is not rational by Theorem 2.5.13. Thus,
f (x) is irreducible in Q[x]. Notice that f (x) is not irreducible when viewed as an element of R[x] because
it does have a root in R. In fact, no polynomial in R[x] of odd degree is irreducible because every such
polynomial has a root (this uses the Intermediate Value Theorem because as x → ±∞, on one side we must
have f (x) → ∞ and on the other we must have f (x) → −∞). Moreover, it turns out that every irreducible
polynomial in R[x] has degree either 1 or 2, though this is far from obvious at this point, and relies on an
important result called the Fundamental Theorem of Algebra.
By Proposition 11.2.4, we know that in an integral domain R, every prime element of R is irreducible in
R. Although in the special case of Z we know that every irreducible is prime, this is certainly not obvious in
general. For example, we’ve just shown that x3 − 2 is irreducible in Q[x], but it’s much less clear whether
x3 − 2 is prime in Q[x]. In fact, for general integral domains R, there can be irreducible elements that are
not prime. For a somewhat exotic but interesting example of this, let R be the subring of Q[x] consisting of those
polynomials whose constant term and coefficient of x are both elements of Z. In other words, let
R = {an xn + an−1 xn−1 + · · · + a1 x + a0 ∈ Q[x] : a0 ∈ Z and a1 ∈ Z}
It is straightforward to check that R is indeed a subring of Q[x]. Furthermore, since Q[x] is an integral domain,
it follows that R is an integral domain as well. We also still have that deg(f (x)g(x)) = deg(f (x)) + deg(g(x))
for all f (x), g(x) ∈ R because this property holds in Q[x]. From this, it follows that U (R) = {1, −1} and
that 3 ∈ R is irreducible in R. Notice that 3 | x2 , i.e. 3 | x · x, because (1/3)x2 ∈ R and

3 · ((1/3)x2 ) = x2 .

However, we also have that 3 ∤ x in R, essentially because (1/3)x is not an element of R. To see this more
formally, suppose that h(x) ∈ R and 3 · h(x) = x. Since the degree of the product of two elements of R is
the sum of the degrees, we have deg(h(x)) = 1. Since h(x) ∈ R, we can write h(x) = ax + b where a, b ∈ Z.
We then have 3a · x + 3b = x, so 3a = 1, contradicting the fact that 3 ∤ 1 in Z. Since 3 | x · x in R, but 3 ∤ x
in R, it follows that 3 is not prime in R.
Another example of an integral domain where some irreducibles are not prime is the integral domain
Z[√−5] = {a + b√−5 : a, b ∈ Z}.

Although this also looks like a bizarre example, we will see that Z[√−5] and many other examples like it
play a fundamental role in algebraic number theory. In Z[√−5], we have two different factorizations of 6:

(1 + √−5)(1 − √−5) = 6 = 2 · 3

It turns out that each of the four factors that appear are irreducible in Z[√−5], but none are prime. We
will establish all of these facts later, but here we can at least argue that 2 is not prime. Notice that since
2 · 3 = (1 + √−5)(1 − √−5), we have that 2 | (1 + √−5)(1 − √−5) in Z[√−5]. Now if 2 | 1 + √−5 in Z[√−5],
then we can fix a, b ∈ Z with 2(a + b√−5) = 1 + √−5, which gives 2a + 2b√−5 = 1 + √−5, so 2a = 1, a
contradiction. Thus, 2 ∤ 1 + √−5. A similar argument shows that 2 ∤ 1 − √−5.

11.3 Irreducible Polynomials in Q[x]


We spend this section gaining some understanding of irreducible elements in Q[x]. Notice that every element
of Q[x] is an associate of a polynomial in Z[x] because we can simply multiply through by the product of
all denominators (which is a unit because it is a nonzero constant polynomial). Thus, up to associates, it suffices to
examine polynomials with integer coefficients. We begin with a simple but important result about potential
rational roots of such a polynomial.
Theorem 11.3.1 (Rational Root Theorem). Suppose that f (x) ∈ Z[x] is a nonzero polynomial and write
f (x) = an xn + an−1 xn−1 + · · · + a1 x + a0
where each ai ∈ Z and an ≠ 0. Suppose that q ∈ Q is a root of f (x). If q = b/c where b ∈ Z and c ∈ N+ are
relatively prime (so we write q in “lowest terms”), then b | a0 and c | an .
Proof. We have
an · (b/c)n + an−1 · (b/c)n−1 + · · · + a1 · (b/c) + a0 = 0
Multiplying through by cn we get
an bn + an−1 bn−1 c + · · · + a1 bcn−1 + a0 cn = 0
From this, we see that
an bn = c · [−(an−1 bn−1 + · · · + a1 bcn−2 + a0 cn−1 )]
and hence c | an bn in Z. Using the fact that gcd(b, c) = 1, it follows that c | an (either repeatedly apply
Proposition 2.4.10, or use Problem 2 on Homework 2 to conclude that gcd(bn , c) = 1 followed by one
application of Proposition 2.4.10). On the other hand, we see that
a0 cn = b · [−(an bn−1 + an−1 bn−2 c + · · · + a1 cn−1 )]
and hence b | a0 cn . Using the fact that gcd(b, c) = 1 as above, it follows that b | a0 .
As an example, we show that the polynomial f (x) = 2x3 − x2 + 7x − 9 is irreducible in Q[x]. By the Rational
Root Theorem, the only possible roots of f (x) in Q are ±9, ±3, ±1, ±9/2, ±3/2, ±1/2. We check the values:
• f (9) = 1431 and f (−9) = −1611
• f (3) = 57 and f (−3) = −93
• f (1) = −1 and f (−1) = −19
• f (9/2) = 369/2 and f (−9/2) = −243

• f (1/2) = −11/2 and f (−1/2) = −13

• f (3/2) = 6 and f (−3/2) = −57/2

Thus, f (x) has no roots in Q. Since deg(f (x)) = 3, we can use Theorem 11.2.8 to conclude that f (x) is
irreducible in Q[x]. Notice that f (x) is not irreducible in R[x] because it has a root in the interval (1, 3/2)
by the Intermediate Value Theorem.
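The table of values above can be reproduced mechanically; a sketch in Python with exact Fraction arithmetic (the helper names are mine):

```python
from fractions import Fraction

def rational_root_candidates(coeffs):
    """Candidates b/c from the Rational Root Theorem, where coeffs lists
    a0, a1, ..., an, and b | a0, c | an with gcd(b, c) = 1."""
    a0, an = abs(coeffs[0]), abs(coeffs[-1])
    bs = [b for b in range(1, a0 + 1) if a0 % b == 0]
    cs = [c for c in range(1, an + 1) if an % c == 0]
    return sorted({Fraction(s * b, c) for b in bs for c in cs for s in (1, -1)})

def evaluate(coeffs, q):
    """Evaluate a0 + a1*q + ... + an*q^n exactly."""
    return sum(a * q**i for i, a in enumerate(coeffs))

# f(x) = 2x^3 - x^2 + 7x - 9, stored as [a0, a1, a2, a3]
f = [-9, 7, -1, 2]
roots = [q for q in rational_root_candidates(f) if evaluate(f, q) == 0]
# roots is empty, confirming that f(x) has no rational roots
```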
As we mentioned above, we are focusing on polynomials in Q[x] which have integer coefficients since every
polynomial in Q[x] is an associate of such a polynomial. However, even if f (x) ∈ Z[x], when we check for
irreducibility in Q[x], we have to consider the possibility that a potential factorization involves polynomials
whose coefficients are fractions. For example, we have

x2 = (2x) · ((1/2) · x)

Of course, in this case there also exists a factorization into smaller degree polynomials in Z[x] because
we can write x2 = x · x. Our first task is to prove that this is always the case. We will need the following
lemma.

Lemma 11.3.2. Suppose that g(x), h(x) ∈ Z[x] and that p ∈ Z is a prime which divides all coefficients of
g(x)h(x). We then have that either p divides all coefficients of g(x), or p divides all coefficients of h(x).

Proof. Let g(x) be the polynomial {bn }, let h(x) be the polynomial {cn }, and let g(x)h(x) be the polynomial
{an }. We are supposing that p | an for all n. Suppose that p ∤ bn for some n and also that p ∤ cn for some n
(possibly different). Let k be least such that p ∤ bk , and let ℓ be least such that p ∤ cℓ . Notice that

ak+ℓ = Σ_{i=0}^{k+ℓ} bi ck+ℓ−i
     = bk cℓ + ( Σ_{i=0}^{k−1} bi ck+ℓ−i ) + ( Σ_{i=k+1}^{k+ℓ} bi ck+ℓ−i )

and hence

bk cℓ = ak+ℓ − ( Σ_{i=0}^{k−1} bi ck+ℓ−i ) − ( Σ_{i=k+1}^{k+ℓ} bi ck+ℓ−i )

Now if 0 ≤ i ≤ k − 1, then p | bi by choice of k, hence p | bi ck+ℓ−i . Also, if k + 1 ≤ i ≤ k + ℓ, then k + ℓ − i < ℓ,
so p | ck+ℓ−i by choice of ℓ, hence p | bi ck+ℓ−i . Since p | ak+ℓ by assumption, it follows that p divides every
summand on the right-hand side. Therefore, p divides the right-hand side, and thus p | bk cℓ . Since p is
prime, it follows that either p | bk or p | cℓ , but both of these are impossible by choice of k and ℓ. Therefore,
it must be the case that either p | bn for all n, or p | cn for all n.

Proposition 11.3.3 (Gauss’ Lemma). Suppose that f (x) ∈ Z[x] and that g(x), h(x) ∈ Q[x] with f (x) =
g(x)h(x). There exist polynomials g ∗ (x), h∗ (x) ∈ Z[x] such that f (x) = g ∗ (x)h∗ (x) and both deg(g ∗ (x)) =
deg(g(x)) and deg(h∗ (x)) = deg(h(x)). In fact, there exist nonzero s, t ∈ Q with

• f (x) = g ∗ (x)h∗ (x)

• g ∗ (x) = s · g(x)

• h∗ (x) = t · h(x)

Proof. If all of the coefficients of g(x) and h(x) happen to be integers, then we are happy. Suppose not.
Let a ∈ Z be the least common multiple of the denominators of the coefficients of g(x), and let b ∈ Z be the
least common multiple of the denominators of the coefficients of h(x). Let d = ab. Multiplying both sides of
the equation f (x) = g(x)h(x) through by d to “clear denominators”, we see that

d · f (x) = (a · g(x)) · (b · h(x))

where each of the three factors d · f (x), a · g(x), and b · h(x) is a polynomial in Z[x]. We have at least one
of a > 1 or b > 1, hence d = ab > 1.
Fix a prime divisor p of d. We then have that p divides all coefficients of d · f (x), so by the previous
lemma either p divides all coefficients of a · g(x), or p divides all coefficients of b · h(x). In the former case,
we have

(d/p) · f (x) = ((a/p) · g(x)) · (b · h(x))

where each of the three factors is a polynomial in Z[x]. In the latter case, we have
 
(d/p) · f (x) = (a · g(x)) · ((b/p) · h(x))

where each of the three factors is a polynomial in Z[x]. Now if d/p = 1, then we are done by letting g ∗ (x)
be the first factor and letting h∗ (x) be the second. Otherwise, we continue the argument by dividing out
another prime factor of d/p from all coefficients of one of the two polynomials. Continue until we have handled
all primes which occur in a factorization of d. Formally, you can do induction on d.
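The denominator-clearing procedure in this proof can be sketched directly. This rendering is my own (coefficient lists are ordered a0, a1, . . . ; it assumes, as in the statement, that the product g(x)h(x) lies in Z[x]):

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def clear_denominators(poly):
    """Return (m, m * poly) where m is the lcm of the coefficient
    denominators, so the scaled polynomial has integer coefficients."""
    m = reduce(lcm, (c.denominator for c in poly), 1)
    return m, [int(c * m) for c in poly]

def integer_factorization(g, h):
    """Given g, h in Q[x] with g * h in Z[x], return integer-coefficient
    scalings g*, h* with g* h* = g h, following the proof of Gauss's Lemma."""
    a, gs = clear_denominators(g)
    b, hs = clear_denominators(h)
    d, p = a * b, 2
    while d > 1:  # strip each prime factor of d from gs or from hs
        while d % p == 0:
            if all(c % p == 0 for c in gs):
                gs = [c // p for c in gs]
            else:  # Lemma 11.3.2 guarantees p divides all coefficients of hs
                hs = [c // p for c in hs]
            d //= p
        p += 1
    return gs, hs
```

For the example x2 = (2x) · ((1/2)x), this produces the integer factorization x · x.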

An immediate consequence of Gauss’ Lemma is the following, which greatly simplifies the check for
whether a given polynomial with integer coefficients is irreducible in Q[x].

Corollary 11.3.4. Let f (x) ∈ Z[x]. If there do not exist nonconstant polynomials g(x), h(x) ∈ Z[x] with
f (x) = g(x) · h(x), then f (x) is irreducible in Q[x]. Furthermore, if f (x) is monic, then it suffices to show
that no such monic g(x) and h(x) exist.

Proof. The first part is immediate from Proposition 11.2.6 and Gauss’ Lemma. Now suppose that f (x) ∈ Z[x]
is monic, and that g(x), h(x) ∈ Z[x] with f (x) = g(x)h(x). Notice that the leading term of f (x) is the
product of the leading terms of g(x) and h(x), so as f (x) is monic and all coefficients are in Z, either both
g(x) and h(x) are monic or both have leading coefficient −1. In the latter case, we can multiply both through
by −1 to get a factorization into monic polynomials in Z[x] of the same degree.

As an example, consider the polynomial f (x) = x4 + 3x3 + 7x2 − 9x + 1 ∈ Q[x]. We claim that f (x)
is irreducible in Q[x]. We first check for rational roots. We know that the only possibilities are ±1 and we
check these:

• f (1) = 1 + 3 + 7 − 9 + 1 = 3

• f (−1) = 1 − 3 + 7 + 9 + 1 = 15

Thus, f (x) has no rational roots.


By the corollary, it suffices to show that f (x) is not the product of two monic nonconstant polynomials
in Z[x]. Notice that f (x) has no monic divisor of degree 1 because such a divisor would imply that f (x) has
a rational root, which we just ruled out. Thus, we need only consider the possibility that f (x) is the product
of two monic polynomials in Z[x] of degree 2. Consider a factorization:

f (x) = (x2 + ax + b)(x2 + cx + d)

where a, b, c, d ∈ Z. We then have

x4 + 3x3 + 7x2 − 9x + 1 = x4 + (a + c)x3 + (b + ac + d)x2 + (ad + bc)x + bd

We therefore have the following equations

1. a + c = 3

2. b + ac + d = 7

3. ad + bc = −9

4. bd = 1

Since bd = 1 and b, d ∈ Z, we have that b ∈ {1, −1}. We check the possibilities.



• Suppose that b = 1. By equation 4, we conclude that d = 1. Thus, by equation 3, we conclude that
a + c = −9, but this contradicts equation 1.
• Suppose that b = −1. By equation 4, we conclude that d = −1. Thus, by equation 3, we conclude that
−a − c = −9, so a + c = 9, but this contradicts equation 1.
In all cases, we have reached a contradiction. We conclude that f (x) is irreducible in Q[x].
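The case analysis above can also be confirmed by a brute-force search over the system of equations (the bound |a| ≤ 20 is an assumption of the sketch, not from the text):

```python
# Search for integer solutions to the system arising from
# x^4 + 3x^3 + 7x^2 - 9x + 1 = (x^2 + ax + b)(x^2 + cx + d).
solutions = []
for b, d in [(1, 1), (-1, -1)]:       # equation bd = 1 forces b = d = ±1
    for a in range(-20, 21):
        c = 3 - a                      # equation a + c = 3
        if b + a * c + d == 7 and a * d + b * c == -9:
            solutions.append((a, b, c, d))
# solutions stays empty, so no such factorization exists
```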
The following theorem, when it applies, is a simple way to determine that certain polynomials in Z[x]
are irreducible in Q[x]. Although it has limited general use (a polynomial chosen at random typically does
not satisfy the hypotheses), it is surprising how often it applies to “natural” polynomials that you want
to verify are irreducible.
Theorem 11.3.5 (Eisenstein’s Criterion). Suppose that f (x) ∈ Z[x] and write
f (x) = an xn + an−1 xn−1 + · · · + a1 x + a0
If there exists a prime p ∈ N+ such that
• p ∤ an
• p | ai for 0 ≤ i ≤ n − 1
• p2 ∤ a0
then f (x) is irreducible in Q[x].
Proof. Fix such a prime p. We use Corollary 11.3.4. Suppose that g(x), h(x) ∈ Z[x] are not constant
polynomials with f (x) = g(x)h(x). We then have
n = deg(f (x)) = deg(g(x)) + deg(h(x))
Since we are assuming that g(x) and h(x) are not constant, they each have degree at least 1, and so by the
above equality they both have degree at most n − 1.
Let g(x) be the polynomial {bn } and let h(x) be the polynomial {cn }. We have a0 = b0 c0 , so since p | a0
and p is prime, either p | b0 or p | c0 . Furthermore, since p2 ∤ a0 by assumption, we can not have both p | b0
and p | c0 . Without loss of generality (by switching the roles of g(x) and h(x) if necessary), suppose that
p | b0 and p ∤ c0 .
We now prove that p | bk for 0 ≤ k ≤ n − 1 by (strong) induction. Suppose that we have k with
0 ≤ k ≤ n − 1 and we know that p | bi for 0 ≤ i < k. Now
ak = bk c0 + bk−1 c1 + · · · + b1 ck−1 + b0 ck
and hence
bk c0 = ak − bk−1 c1 − · · · − b1 ck−1 − b0 ck
By assumption, we have p | ak , and by induction we have p | bi for 0 ≤ i < k. It follows that p divides every
term on the right-hand side, so p | bk c0 . Since p is prime and p ∤ c0 , it follows that p | bk .
Thus, we have shown that p | bk for 0 ≤ k ≤ n − 1. Now we have
an = bn c0 + bn−1 c1 + · · · + b1 cn−1 + b0 cn
= bn−1 c1 + · · · + b1 cn−1 + b0 cn
where the last line follows from the fact that bn = 0 (since we are assuming deg(g(x)) < n). Now we know
p | bk for 0 ≤ k ≤ n − 1, so p divides every term on the right. It follows that p | an , contradicting our
assumption. Therefore, by Corollary 11.3.4, f (x) is irreducible in Q[x].
For example, the polynomial x4 + 6x3 + 27x2 − 297x + 24 is irreducible in Q[x] using Eisenstein’s Criterion
with p = 3. For each n ∈ N+ the polynomial xn − 2 is irreducible in Q[x] using Eisenstein’s Criterion with
p = 2. In particular, we have shown that Q[x] has irreducible polynomials of every positive degree.
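These checks are routine to automate; a minimal sketch (the function name is mine):

```python
def eisenstein_applies(coeffs, p):
    """Check Eisenstein's Criterion for coeffs = [a0, a1, ..., an] at the
    prime p: p does not divide an, p divides every other ai, and p^2 does
    not divide a0."""
    a0, an = coeffs[0], coeffs[-1]
    return (an % p != 0
            and all(a % p == 0 for a in coeffs[:-1])
            and a0 % (p * p) != 0)

# x^4 + 6x^3 + 27x^2 - 297x + 24 with p = 3
print(eisenstein_applies([24, -297, 27, 6, 1], 3))   # → True
# x^5 - 2 with p = 2
print(eisenstein_applies([-2, 0, 0, 0, 0, 1], 2))    # → True
```

Note that the criterion is sufficient but not necessary: when it fails, as for x2 + 2x + 1 at p = 2, the polynomial may or may not be irreducible.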

11.4 Unique Factorization Domains


An element of an integral domain R is irreducible if we can not break it down nontrivially. In Z, we proved the
Fundamental Theorem of Arithmetic, which said that every element could be broken down into irreducibles
in an essentially unique way. Of course, our uniqueness of irreducible factorizations was only up to order,
but there was another issue that we were able to sweep under the rug in Z by working only with positive
elements. For example, consider the following factorizations of 30 in Z:

30 = 2 · 3 · 5
= (−2) · (−3) · 5
= (−2) · 3 · (−5)

In Z, the elements −2, −3, and −5 are also irreducible/prime in Z under our new definition of these concepts.
Thus, if we move away from working only with positive primes, then we lose a bit more uniqueness. However,
if we slightly loosen the requirements that any two factorizations are “the same up to order” to “the same
up to order and associates”, we might have a chance.
Definition 11.4.1. A Unique Factorization Domain, or UFD, is an integral domain R such that:
1. Every nonzero nonunit is a product of irreducible elements.
2. If q1 q2 · · · qn = r1 r2 · · · rm where each qi and ri are irreducible, then n = m and there exists a permu-
tation σ ∈ Sn such that qi and rσ(i) are associates for all i.
Thus, a UFD is an integral domain in which the analogue of the Fundamental Theorem of Arithmetic
(Theorem 2.5.8) holds. We want to prove that several important integral domains that we have studied are
indeed UFDs, and to set the stage for this, we first go back and think about the proofs in Z.
In Proposition 2.5.2, we proved that every n ≥ 2 was a product of irreducible elements of Z by induction.
Intuitively, if n is not irreducible, then factor it, and if those factors are not irreducible, then factor them,
etc. The key fact forcing this process to “bottom out”, and hence making the induction work, is that
the numbers are getting smaller upon factorization and we can not have an infinite descending sequence
of natural numbers. In a general integral domain, however, there may not be something that is getting
“smaller” and hence this argument could conceivably break down. In fact, there are exotic integral domains
where it is possible to factor an element forever without ever reaching an irreducible. For an example of this
situation, consider the subring R of Q[x] consisting of those polynomials whose constant term is an integer.
In other words, let

R = {p(x) ∈ Q[x] : p(0) ∈ Z}


= {an xn + an−1 xn−1 + · · · + a1 x + a0 ∈ Q[x] : a0 ∈ Z}.

It is straightforward to check that R is a subring of Q[x]. Furthermore, since Q[x] is an integral domain,
it follows that R is an integral domain as well. We still have that deg(f (x)g(x)) = deg(f (x)) + deg(g(x))
for all f (x), g(x) ∈ R because this property holds in Q[x]. From this, it follows that any element of U (R)
must be a constant polynomial. Since the constant terms of elements of R are integers, we conclude that
U (R) = {1, −1}. Now consider the element x ∈ R, which is nonzero and not a unit. Notice that x is not
irreducible in R because we can write x = 2 · ((1/2)x), and neither 2 nor (1/2)x is a unit in R. In fact, we claim that
x can not be written as a product of irreducibles in R. To see this, suppose that p1 (x), p2 (x), . . . , pn (x) ∈ R
and that
x = p1 (x)p2 (x) · · · pn (x).
Since deg(x) = 1, one of the pi (x) must have degree 1 and the rest must have degree 0 (notice that it is not
possible that some pi (x) is the zero polynomial since x is a nonzero polynomial). Since R is commutative,

we can assume that deg(p1 (x)) = 1 and deg(pi (x)) = 0 for 2 ≤ i ≤ n. Write p1 (x) = ax + b and pi (x) = ci
for 2 ≤ i ≤ n, where b, c2 , c3 , . . . , cn ∈ Z, each ci ≠ 0, and a ∈ Q. We then have

x = (ac2 c3 · · · cn )x + bc2 c3 · · · cn

This implies that bc2 c3 · · · cn = 0, and since each ci ≠ 0, it follows that b = 0. Thus, we have p1 (x) = ax.
Now notice that

p1 (x) = ax = 2 · ((a/2)x)

and neither 2 nor (a/2)x is a unit in R. Therefore, p1 (x) is not irreducible in R. We took an arbitrary way to
write x as a product of elements of R, and showed that at least one of the factors was not irreducible. It follows
that x can not be written as a product of irreducible elements in R. From this example, we realize that it is
hopeless to prove the existence part of factorizations in general integral domains, so we will need to isolate
some special properties of integral domains that will rule out such “infinite descent”. In general, relations
that do not have such infinite descent are given a special name.

Definition 11.4.2. A relation ∼ on a set A is well-founded if there does not exist a sequence of elements
a1 , a2 , a3 , . . . from A such that an+1 ∼ an for all n ∈ N+ .

Notice that < is well-founded on N but not on Z, Q, or R. If we instead define ∼ by saying that a ∼ b
if |a| < |b|, then ∼ is well-founded on Z, but not on Q or R. The fact that < is well-founded on N is the
foundation for well-ordering and induction proofs on N. We want to think about whether a certain divisibility
relation is well-founded. However, the relation we want to think about is not ordinary divisibility, but a
slightly different notion.

Definition 11.4.3. Let R be an integral domain and let a, b ∈ R. We write a ∥ b to mean that a | b and
that a is not an associate of b. We call ∥ the strict divisibility relation.

For example, in Z the strict divisibility relation is well-founded because if a, b ∈ Z\{0} and a ∥ b, then
|a| < |b|. However, if R is the subring of Q[x] consisting of those polynomials whose constant term is an
integer, then the strict divisibility relation is not well-founded. To see this, let an = (1/2n )x for each n ∈ N+ .
Notice that an ∈ R for all n ∈ N+ . Also, we have an = 2an+1 for each n ∈ N+ . Since 2 is not a unit in R, we
also have that an and an+1 are not associates for each n ∈ N+ . Therefore, an+1 ∥ an for each n ∈ N+ . The
fact that the strict divisibility relation is not well-founded on R is the fundamental reason why elements of
R may not factor into irreducibles.

Proposition 11.4.4. Let R be an integral domain. If the strict divisibility relation on R is well-founded
(i.e. there does not exist d1 , d2 , d3 , . . . in R such that dn+1 ∥ dn for all n ∈ N+ ), then every nonzero nonunit
in R is a product of irreducibles.

Proof. We prove the contrapositive. Suppose that there is a nonzero nonunit element of R that is not a
product of irreducibles. Fix such an element a. We define a sequence of nonzero nonunit elements d1 , d2 , . . .
in R with dn+1 ∥ dn recursively as follows. Start by letting d1 = a. Assume inductively that dn is a nonzero
nonunit which is not a product of irreducibles. In particular, dn is itself not irreducible, so we may write
dn = bc for some choice of nonzero nonunits b and c. Now it is not possible that both b and c are products
of irreducibles because otherwise dn would be as well. Thus, we may let dn+1 be one of b and c, chosen so
that dn+1 is also not a product of irreducibles. Notice that dn+1 is a nonzero nonunit, that dn+1 | dn , and
that dn+1 is not an associate of dn because neither b nor c is a unit. Therefore, dn+1 ∥ dn . Since we can
continue this recursive construction for all n ∈ N+ , it follows that the strict divisibility relation on R is not
well-founded.
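In Z this descent really does terminate, since absolute values strictly decrease along strict divisibility; a minimal sketch for integers n ≥ 2 (trial division is my choice of how to find a nontrivial factor):

```python
def factor_into_irreducibles(n):
    """Mirror the proof's descent for n >= 2 in Z: if n splits nontrivially
    as d * (n // d), recurse into both factors; otherwise n admits no
    nontrivial splitting, so n is irreducible (prime) and the descent stops."""
    for d in range(2, n):
        if n % d == 0:
            return factor_into_irreducibles(d) + factor_into_irreducibles(n // d)
    return [n]
```

For example, `factor_into_irreducibles(30)` returns `[2, 3, 5]`.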

We can also give another proof of Proposition 11.4.4 by using an important combinatorial result.

Definition 11.4.5. Let {0, 1}∗ be the set of all finite sequences of 0’s and 1’s (including the “empty string”
λ). A tree is a subset T ⊆ {0, 1}∗ which is closed under initial segments. In other words, if σ ∈ T and τ is
an initial segment of σ, then τ ∈ T .

For example, the set {λ, 0, 1, 00, 01, 011, 0110, 0111} is a tree.

Lemma 11.4.6 (König’s Lemma). Every infinite tree has an infinite branch. In other words, if T is a tree
with infinitely many elements, then there is an infinite sequence of 0’s and 1’s such that every finite initial
segment of this sequence is an element of T .

Proof. Let T be a tree with infinitely many elements. We build the infinite sequence in stages. That is, we
define finite sequences σ0 ≺ σ1 ≺ σ2 ≺ . . . recursively where each |σn | = n. In our construction, we maintain
the invariant that there are infinitely many elements of T extending σn .
We begin by defining σ0 = λ and notice that there are infinitely many elements of T extending λ because
λ is an initial segment of every element of T trivially. Suppose that we have defined σn in such a way that
|σn | = n and there are infinitely many elements of T extending σn . We then must have that either there
are infinitely many elements of T extending σn 0 or there are infinitely many elements of T extending σn 1.
Thus, we may fix an i ∈ {0, 1} such that there are infinitely many elements of T extending σn i, and define
σn+1 = σn i.
We now take the unique infinite sequence extending all of the σn and notice that it has the required
properties.

Proof 2 of Proposition 11.4.4. Suppose that a ∈ R is a nonzero nonunit. Recursively factor a into two
nonunits, and hence two nonassociate divisors, down a tree. If we ever reach an irreducible, stop that
branch but continue down the others. This tree can not have an infinite path because this would violate
the fact that the strict divisibility relation is well-founded on R. Therefore, by König’s Lemma, the tree is
finite. It follows that a is the product of the leaves, and hence a product of irreducibles.

Now that we have a decent handle on when an integral domain will have the property that every element
is a product of irreducibles, we now move on to the uniqueness aspect. In the proof of the Fundamental
Theorem of Arithmetic in Z, we made essential use of Proposition 2.5.6 saying that irreducibles in Z are
prime, and this will be crucial in our generalizations as well. Hence, we are again faced with the question
of when irreducibles are guaranteed to be prime. As we saw in Section 11.2, irreducibles need not be
prime in general integral domains either. We now spend the rest of this section showing that if irreducibles
are prime, then we do in fact obtain uniqueness.

Definition 11.4.7. Let R be an integral domain and c ∈ R. Define a function ordc : R → N ∪ {∞} as
follows. Given a ∈ R, let ordc (a) be the largest k ∈ N such that ck | a if one exists, and otherwise let
ordc (a) = ∞. Here, we interpret c0 = 1, so we always have 0 ∈ {k ∈ N : ck | a}.

Notice that we have ord0 (a) = 0 whenever a 6= 0 and ord0 (0) = ∞. Also, for any a ∈ R and u ∈ U (R),
we have ordu (a) = ∞ because uk ∈ U (R) for all k ∈ N, hence uk | a for all k ∈ N.

Lemma 11.4.8. Let R be an integral domain. Let a, c ∈ R with c 6= 0, and let k ∈ N. The following are
equivalent.

1. ordc (a) = k

2. ck | a and ck+1 ∤ a

3. There exists m ∈ R with a = ck m and c ∤ m

Proof. • 1 → 2 is immediate.

• 2 → 1: Suppose that ck | a and ck+1 ∤ a. We clearly have ordc (a) ≥ k. Suppose that there exists ℓ > k
with cℓ | a. Since ℓ > k, we have ℓ ≥ k + 1. This implies that ck+1 | cℓ , so since cℓ | a we conclude
that ck+1 | a. This contradicts our assumption. Therefore, there is no ℓ > k with cℓ | a, and hence
ordc (a) = k.
• 2 → 3: Suppose that ck | a and ck+1 ∤ a. Fix m ∈ R with a = ck m. If c | m, then we may fix
n ∈ R with m = cn, which would imply that a = ck cn = ck+1 n, contradicting the fact that ck+1 ∤ a.
Therefore, we must have c ∤ m.
• 3 → 2: Fix m ∈ R with a = ck m and c ∤ m. We clearly have ck | a. Suppose that ck+1 | a and fix
n ∈ R with a = ck+1 n. We then have ck m = ck+1 n. Since R is an integral domain and c ≠ 0, it follows
that ck ≠ 0. Canceling ck from both sides of ck m = ck+1 n (again since R is an integral domain),
we conclude that m = cn. This implies that c | m, which is a contradiction. Therefore, ck+1 ∤ a.

A rough intuition is that ordc (a) intends to count the number of “occurrences” of c inside of a. A natural
hope would be that ordc (ab) = ordc (a) + ordc (b) for all a, b, c ∈ R, but this fails in general. For example,
consider Z. We have ord10 (70) = 1 but ord10 (14) = 0 and ord10 (5) = 0, so ord10 (70) ≠ ord10 (14) + ord10 (5).
Since 10 = 2 · 5, what is happening in this example is that the 2 was split off into the 14, while the 5 went
into the other factor. One might hope that since irreducibles can not be factored nontrivially, this kind of
behavior would not happen if we replace 10 by an irreducible element. Although this is true in Z, it is not
necessarily the case in general.
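In Z, where ordc (a) is finite whenever |c| ≥ 2 and a ≠ 0, the definition can be computed directly; a minimal sketch (the function name and the cap parameter are mine):

```python
def ord_c(c, a, cap=64):
    """Compute ord_c(a) in Z for |c| >= 2 and a != 0: the largest k with
    c^k | a. Such a k always exists here; cap only guards against misuse."""
    k = 0
    while k < cap and a % (c ** (k + 1)) == 0:
        k += 1
    return k

print(ord_c(10, 70), ord_c(10, 14), ord_c(10, 5))  # → 1 0 0
```

This reproduces the failure of additivity for the non-prime element 10, since 1 ≠ 0 + 0.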
As introduced in Section 11.2, in the integral domain

Z[√−5] = {a + b√−5 : a, b ∈ Z}

we have

(1 + √−5)(1 − √−5) = 6 = 2 · 3

where all four factors are irreducible. Now ord2 (6) = 1 because 6 = 2 · 3 and 2 ∤ 3 in Z[√−5]. However, we
have ord2 (1 + √−5) = 0 = ord2 (1 − √−5). Thus,

ord2 ((1 + √−5)(1 − √−5)) ≠ ord2 (1 + √−5) + ord2 (1 − √−5)

In other words, although 2 is irreducible in Z[√−5] and we can “find it” in the product 6 = (1 + √−5)(1 − √−5),
we can not “find it” in either of the factors. This strange behavior takes some getting used to, and
we will explore it much further in later chapters. The fundamental obstacle is that although 2 is irreducible
in Z[√−5], it is not prime there. In fact, although 2 | (1 + √−5)(1 − √−5), we have both 2 ∤ 1 + √−5 and
2 ∤ 1 − √−5.
For an even worse example, consider the subring R of Q[x] consisting of those polynomials whose constant
term and coefficient of x are both elements of Z. In Section 11.2, we showed that 3 was not prime in R
because 3 | x2 but 3 - x. However, we still have that U (R) = {1, −1}, so 3 is irreducible in R. However,
notice that ord3 (x2 ) = ∞ and ord3 (x) = 0. Thus, we certainly do not have ord3 (x2 ) = ord3 (x) + ord3 (x).
The takeaway fact from these examples is that although irreducibility was defined to mean that we could
not further “split” the element nontrivially, this definition does not carry with it the nice properties that we
might expect. The next theorem shows that everything works much better for the possibly smaller class of
prime elements.
Theorem 11.4.9. Let R be an integral domain and let p ∈ R be prime. We have the following.
1. ordp (ab) = ordp (a) + ordp (b) for all a, b ∈ R.
2. ordp (an ) = n · ordp (a) for all a ∈ R and n ∈ N.
11.4. UNIQUE FACTORIZATION DOMAINS 209

Proof. We prove 1, from which 2 follows by induction. Let a, b ∈ R. First notice that if ordp (a) = ∞, then
pk | a for all k ∈ N, hence pk | ab for all k ∈ N, and thus ordp (ab) = ∞. Similarly, if ordp (b) = ∞, then
ordp (ab) = ∞. Suppose then that both ordp (a) and ordp (b) are finite, and let k = ordp (a) and ℓ = ordp (b).
Using Lemma 11.4.8, we may then write a = pk m where p ∤ m and b = pℓ n where p ∤ n. We then have

ab = pk m pℓ n = pk+ℓ · mn

Now if p | mn, then since p is prime, we conclude that either p | m or p | n, but both of these are
contradictions. Therefore, p ∤ mn. Using Lemma 11.4.8 again, it follows that ordp (ab) = k + ℓ.
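In Z, for example, ordp (a) can be computed by repeated division, and part 1 of the theorem can be spot-checked numerically. A minimal sketch (the helper name ord_p is ours):

```python
def ord_p(p, a):
    # largest k with p^k dividing a, for a nonzero integer a and a prime p
    k = 0
    while a % p == 0:
        a //= p
        k += 1
    return k

# Theorem 11.4.9 part 1 in Z with p = 2: ord_2(12 * 40) = ord_2(12) + ord_2(40)
assert ord_p(2, 12 * 40) == ord_p(2, 12) + ord_p(2, 40)   # 5 == 2 + 3
```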
Proposition 11.4.10. Let R be an integral domain, and let p ∈ R be irreducible.
1. For any irreducible q that is an associate of p, we have ordp (q) = 1.
2. For any irreducible q that is not an associate of p, we have ordp (q) = 0.
3. For any unit u, we have ordp (u) = 0.
Proof. 1. Suppose that q is an irreducible that is an associate of p. Fix a unit u with q = pu. Notice that
if p | u, then since u | 1, we conclude that p | 1, which would imply that p is a unit, contradicting our
definition of irreducible. Thus, p ∤ u, and so ordp (q) = 1 by Lemma 11.4.8.
2. Suppose that q is an irreducible that is not an associate of p. Since q is irreducible, its only divisors
are units and associates of q. Since p is neither a unit nor an associate of q, it follows that p ∤ q. Therefore,
ordp (q) = 0.
3. This is immediate because if p | u, then since u | 1, we could conclude that p | 1 and hence that p is a
unit, contradicting the definition of irreducible.

Proposition 11.4.11. Let R be an integral domain. Let a ∈ R and let p ∈ R be prime. Suppose that
a = uq1 q2 · · · qk where u is a unit and the qi are irreducibles. We then have that exactly ordp (a) many of the
qi are associates of p.
Proof. Since p ∈ R is prime, Theorem 11.4.9 implies that

ordp (a) = ordp (uq1 q2 · · · qk )
         = ordp (u) + ordp (q1 ) + ordp (q2 ) + · · · + ordp (qk )
         = ordp (q1 ) + ordp (q2 ) + · · · + ordp (qk )

By Proposition 11.4.10, the terms on the right are 1 when qi is an associate of p and 0 otherwise. The result
follows.
Theorem 11.4.12. Let R be an integral domain. Suppose that the strict divisibility relation on R is well-
founded (i.e. there does not exist d1 , d2 , d3 , . . . such that dn+1 strictly divides dn for all n ∈ N+ ), and that
every irreducible in R is prime. We then have that R is a UFD.
Proof. Since the strict divisibility relation on R is well-founded, we can use Proposition 11.4.4 to conclude
that every nonzero nonunit element of R can be written as a product of irreducibles. Suppose now that
q1 q2 · · · qn = r1 r2 · · · rm , where each qi and each rj is irreducible. Call this common element a. We know that
every irreducible element of R is prime by assumption. Thus, for any irreducible p ∈ R, Proposition 11.4.11

(with u = 1) tells us that exactly ordp (a) many of the qi are associates of p, and also that exactly ordp (a)
many of the rj are associates of p. Thus, for every irreducible p ∈ R, there are an equal number of associates
of p on each side. Matching up the elements on the left with corresponding associates on the right tells us
that m = n and gives the required permutation.
Of course, we are left with the question of how to prove that many natural integral domains have these two
properties. Now in Z, the proof of Proposition 2.5.6 (that irreducibles are prime) relied upon the existence
of greatest common divisors. In the next chapter, we will attempt to generalize the special aspects of Z that
allowed us to prove the existence and fundamental properties of GCDs.
Chapter 12

Euclidean Domains and PIDs

12.1 Euclidean Domains


Many of our results in Chapter 2 about Z, and in particular Proposition 2.5.6, trace back to Theorem
2.3.1 about division with remainder. Similarly, in Corollary 9.3.11 we established an analog of division with
remainder in the integral domain F [x] of polynomials over a field F . Since this phenomenon has arisen a
few times, we now define a class of integral domains that allow “division with remainder”. One of our ultimate
hopes is that such integral domains will always be UFDs.

Definition 12.1.1. Let R be an integral domain. A function N : R\{0} → N is called a Euclidean function
on R if for all a, b ∈ R with b ≠ 0, there exist q, r ∈ R such that

a = qb + r

and either r = 0 or N (r) < N (b).

Definition 12.1.2. An integral domain R is a Euclidean domain if there exists a Euclidean function on R.

Example 12.1.3. As alluded to above, Theorem 2.3.1 and Corollary 9.3.11 establish the following:

• The function N : Z\{0} → N defined by N (a) = |a| is a Euclidean function on Z, so Z is a Euclidean
domain.

• Let F be a field. The function N : F [x]\{0} → N defined by N (f (x)) = deg(f (x)) is a Euclidean
function on F [x], so F [x] is a Euclidean domain.

Notice that we do not require the uniqueness of q and r in our definition of a Euclidean function. Although
it was certainly a nice perk to have some aspect of uniqueness in Z and F [x], it turns out to be unnecessary
for the theoretical results of interest about Euclidean domains. Furthermore, although the above Euclidean
function for F [x] does provide true uniqueness, the one for Z does not, since uniqueness there only holds
if we additionally require the remainder to be nonnegative (for example, we have 13 = 4 · 3 + 1 and also
13 = 5 · 3 + (−2), where both N (1) < N (3) and N (−2) < N (3)). We will see more natural Euclidean
functions on integral domains for which uniqueness fails, and we want to be as general as possible.
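A quick check of this nonuniqueness, using the Euclidean function N (x) = |x| on Z (the helper name valid_division is ours):

```python
def valid_division(a, b, q, r):
    # checks the Euclidean-function condition on Z with N(x) = |x|
    return a == q * b + r and (r == 0 or abs(r) < abs(b))

# both decompositions of 13 by 3 from the text are legitimate:
assert valid_division(13, 3, 4, 1)     # 13 = 4*3 + 1
assert valid_division(13, 3, 5, -2)    # 13 = 5*3 + (-2)
```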
The name Euclidean domain comes from the fact that any such integral domain supports the ability to
find greatest common divisors via the Euclidean algorithm. Even more fundamentally, the notion of “size”
given by a Euclidean function N : R\{0} → N allows us to use induction to prove the existence of greatest
common divisors. We begin with the following generalization of a simple result we proved about Z, which
works in any integral domain (in fact, in any commutative ring).


Proposition 12.1.4. Let R be an integral domain. Let a, b, q, r ∈ R with a = qb + r. For any d ∈ R, we
have that d is a common divisor of a and b if and only if d is a common divisor of b and r, i.e.

{d ∈ R : d is a common divisor of a and b} = {d ∈ R : d is a common divisor of b and r}

Proof. Suppose first that d is a common divisor of b and r. Since d | b, d | r, and a = qb + r = b · q + r · 1, it
follows that d | a.
Conversely, suppose that d is a common divisor of a and b. Since d | a, d | b, and r = a − qb = a · 1 + b · (−q),
it follows that d | r.
Theorem 12.1.5. Let R be a Euclidean domain. Every pair of elements a, b ∈ R has a greatest common
divisor.
Proof. Since R is a Euclidean domain, we may fix a Euclidean function N : R\{0} → N. We first handle
the special case when b = 0 since N (0) is not defined. If b = 0, then the set of common divisors of a and b
equals the set of divisors of a (because every element divides 0), so a satisfies the requirement of a greatest
common divisor. We now use (strong) induction on N (b) ∈ N to prove the result.
• Base Case: Suppose that b ∈ R is nonzero and N (b) = 0. Fix q, r ∈ R with a = qb + r and either r = 0
or N (r) < N (b). Since N (b) = 0, we can not have N (r) < N (b), so we must have r = 0. Therefore, we
have a = qb. It is now easy to check that b is a greatest common divisor of a and b.
• Inductive Step: Suppose then that b ∈ R is nonzero and we know the result for all pairs x, y ∈ R with
either y = 0 or N (y) < N (b). Fix q, r ∈ R with a = qb + r and either r = 0 or N (r) < N (b). By
(strong) induction, we know that b and r have a greatest common divisor d. By Proposition 12.1.4,
the set of common divisors of a and b equals the set of common divisors of b and r. It follows that d
is a greatest common divisor of a and b.

As an example, consider working in the integral domain Q[x] and trying to find a greatest common divisor
of the following two polynomials:

f (x) = x5 + 3x3 + 2x2 + 6 g(x) = x4 − x3 + 4x2 − 3x + 3

We apply the Euclidean Algorithm as follows (we suppress the computations of the long divisions):

x5 + 3x3 + 2x2 + 6 = (x + 1)(x4 − x3 + 4x2 − 3x + 3) + (x2 + 3)


x4 − x3 + 4x2 − 3x + 3 = (x2 − x + 1)(x2 + 3) + 0

Thus, the set of common divisors of f (x) and g(x) equals the set of common divisors of x2 + 3 and 0, which
is just the set of divisors of x2 + 3. Therefore, x2 + 3 is a greatest common divisor of f (x) and g(x). Now
this is not the only greatest common divisor, because any associate of x2 + 3 will also be a greatest common
divisor of f (x) and g(x) by Proposition 11.1.11. The units in Q[x] are the nonzero constants, so other
greatest common divisors are 2x2 + 6, (5/6)x2 + 5/2, etc. We would like to have a canonical choice for which
to pick, akin to choosing the nonnegative value when working in Z.
Definition 12.1.6. Let F be a field. A monic polynomial in F [x] is a nonzero polynomial whose leading
coefficient is 1.
Notice that every nonzero polynomial in F [x] is an associate of a unique monic polynomial (if the
leading coefficient is a ≠ 0, just multiply by a−1 to get a monic associate, and notice that this is the only way to
multiply by a nonzero constant to make it monic). By restricting to monic polynomials, we get a canonical
choice for a greatest common divisor.

Definition 12.1.7. Let F be a field and let f (x), g(x) ∈ F [x] be polynomials. If at least one of f (x) and
g(x) is nonzero, we define gcd(f (x), g(x)) to be the unique monic polynomial which is a greatest common
divisor of f (x) and g(x). Notice that if both f (x) and g(x) are the zero polynomial, then 0 is the only greatest
common divisor of f (x) and g(x), so we define gcd(f (x), g(x)) = 0.
Now x2 + 3 is monic, so from the above computations, we have
gcd(x5 + 3x3 + 2x2 + 6, x4 − x3 + 4x2 − 3x + 3) = x2 + 3
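The computation above is easy to mechanize. A minimal sketch of the Euclidean algorithm in Q[x] (our own helper names; polynomials are coefficient lists over Q, lowest degree first, and the final answer is normalized to be monic):

```python
from fractions import Fraction

def poly_divmod(f, g):
    # long division in Q[x]; polynomials are coefficient lists, lowest degree first
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(1, len(f) - len(g) + 1)
    while len(f) >= len(g) and any(f):
        c = f[-1] / g[-1]              # cancel the current leading term of f
        shift = len(f) - len(g)
        q[shift] = c
        for i, gc in enumerate(g):
            f[shift + i] -= c * gc
        while f and f[-1] == 0:
            f.pop()
    return q, f                        # quotient and remainder

def poly_gcd(f, g):
    # the Euclidean algorithm, normalized to return the monic gcd
    while any(g):
        _, r = poly_divmod(f, g)
        f, g = g, r
    f = [Fraction(c) for c in f]
    return [c / f[-1] for c in f]

f = [6, 0, 2, 3, 0, 1]   # x^5 + 3x^3 + 2x^2 + 6
g = [3, -3, 4, -1, 1]    # x^4 - x^3 + 4x^2 - 3x + 3
assert poly_gcd(f, g) == [3, 0, 1]   # x^2 + 3, matching the computation above
```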
We end this section by showing that the Gaussian Integers Z[i] = {a + bi : a, b ∈ Z} are also a Euclidean
domain. In order to show this, we will use the following result.
Proposition 12.1.8. The subring Q[i] = {q + ri : q, r ∈ Q} is a field.
Proof. Let α ∈ Q[i] be nonzero and write α = q + ri where q, r ∈ Q. We then have that either q ≠ 0 or
r ≠ 0, so

1/α = 1/(q + ri)
    = (1/(q + ri)) · ((q − ri)/(q − ri))
    = (q − ri)/(q2 + r2 )
    = q/(q2 + r2 ) + (−r/(q2 + r2 )) · i

Since both q/(q2 + r2 ) and −r/(q2 + r2 ) are elements of Q, it follows that 1/α ∈ Q[i]. Therefore, Q[i] is a field.

Definition 12.1.9. We define a function N : Q[i] → Q by letting N (q + ri) = q2 + r2 . The function N is
called the norm on the field Q[i].
Proposition 12.1.10. For the function N (q + ri) = q 2 + r2 defined on Q[i], we have
1. N (α) ≥ 0 for all α ∈ Q[i].
2. N (α) = 0 if and only if α = 0.
3. N (q) = q 2 for all q ∈ Q.
4. N (α) ∈ N for all α ∈ Z[i].
5. N (αβ) = N (α) · N (β) for all α, β ∈ Q[i].
Proof. The first four are all immediate from the definition. Let α, β ∈ Q[i] be arbitrary, and write α = q + ri
and β = s + ti where q, r, s, t ∈ Q. We have
N (αβ) = N ((q + ri)(s + ti))
= N (qs + rsi + qti − rt)
= N ((qs − rt) + (rs + qt)i)
= (qs − rt)2 + (rs + qt)2
= q 2 s2 − 2qsrt + r2 t2 + r2 s2 + 2rsqt + q 2 t2
= q 2 s2 + r2 s2 + q 2 t2 + r2 t2
= (q 2 + r2 ) · (s2 + t2 )
= N (q + ri) · N (s + ti)
= N (α) · N (β)
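Multiplicativity of the norm is also easy to spot-check on particular Gaussian integers, e.g. with Python's built-in complex numbers (exact here, since the real and imaginary parts are small integers):

```python
def norm(z):
    # N(a + bi) = a^2 + b^2
    return z.real ** 2 + z.imag ** 2

alpha, beta = 3 + 2j, 5 - 1j
assert norm(alpha * beta) == norm(alpha) * norm(beta)   # 338 on both sides
```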

Corollary 12.1.11. U (Z[i]) = {1, −1, i, −i}.

Proof. Notice that {1, −1, i, −i} ⊆ U (Z[i]) because 12 = 1, (−1)2 = 1, and i · (−i) = 1. Suppose conversely
that α ∈ U (Z[i]) and write α = c + di where c, d ∈ Z. Since α ∈ U (Z[i]), we can fix β ∈ Z[i] with αβ = 1.
We then have N (αβ) = N (1), so N (α) · N (β) = 1 by the previous proposition. Since N (α), N (β) ∈ N, we
conclude that N (α) = 1 and N (β) = 1. We then have c2 + d2 = N (α) = 1. It follows that one of c or d is 0,
and the other is ±1. Thus, α = c + di ∈ {1, −1, i, −i}.

Theorem 12.1.12. Z[i] is a Euclidean domain with Euclidean function N (a + bi) = a2 + b2 .

Proof. Notice that Z[i] is an integral domain because it is a subring of C. Let α, β ∈ Z[i] be arbitrary with β ≠ 0.
When we divide α by β in the field Q[i], we get α/β = s + ti for some s, t ∈ Q. Fix integers m, n ∈ Z closest
to s, t ∈ Q respectively, i.e. fix m, n ∈ Z so that |m − s| ≤ 1/2 and |n − t| ≤ 1/2. Let γ = m + ni ∈ Z[i], and let
ρ = α − βγ ∈ Z[i]. We then have that α = βγ + ρ, so we need only show that N (ρ) < N (β). Now

N (ρ) = N (α − βγ)
      = N (β · (s + ti) − β · γ)
      = N (β · ((s + ti) − (m + ni)))
      = N (β · ((s − m) + (t − n)i))
      = N (β) · N ((s − m) + (t − n)i)
      = N (β) · ((s − m)2 + (t − n)2 )
      ≤ N (β) · (1/4 + 1/4)
      = (1/2) · N (β)
      < N (β)

where the last line follows because N (β) > 0 (since β 6= 0).

We work out an example of finding a greatest common divisor of 8 + 9i and 10 − 5i in Z[i]. We follow the above
proof to find quotients and remainders. Notice that

(8 + 9i)/(10 − 5i) = ((8 + 9i)/(10 − 5i)) · ((10 + 5i)/(10 + 5i))
                   = (80 + 40i + 90i − 45)/(100 + 25)
                   = (35 + 130i)/125
                   = 7/25 + (26/25) · i
Following the proof (where we take the closest integers to 7/25 and 26/25, namely 0 and 1), we should use
the quotient i and determine the remainder from there. We thus write

8 + 9i = i · (10 − 5i) + (3 − i)

Notice that N (3 − i) = 9 + 1 = 10 which is less than N (10 − 5i) = 100 + 25 = 125. Following the Euclidean
algorithm, we next calculate

(10 − 5i)/(3 − i) = ((10 − 5i)/(3 − i)) · ((3 + i)/(3 + i))
                  = (30 + 10i − 15i + 5)/(9 + 1)
                  = (35 − 5i)/10
                  = 7/2 − (1/2) · i

Following the proof (where we now have many choices, because 7/2 is equally close to 3 and 4, and −1/2 is
equally close to −1 and 0), we choose to take the quotient 3. We then write

10 − 5i = 3 · (3 − i) + (1 − 2i)

Notice that N (1 − 2i) = 1 + 4 = 5 which is less than N (3 − i) = 9 + 1 = 10. Going to the next step, we
calculate
(3 − i)/(1 − 2i) = ((3 − i)/(1 − 2i)) · ((1 + 2i)/(1 + 2i))
                 = (3 + 6i − i + 2)/(1 + 4)
                 = (5 + 5i)/5
                 = 1 + i

Therefore, we have
3 − i = (1 + i) · (1 − 2i) + 0

Putting together the various divisions, we see the Euclidean algorithm as:

8 + 9i = i · (10 − 5i) + (3 − i)
10 − 5i = 3 · (3 − i) + (1 − 2i)
3 − i = (1 + i) · (1 − 2i) + 0

Thus, the set of common divisors of 8 + 9i and 10 − 5i equals the set of common divisors of 1 − 2i and 0,
which is just the set of divisors of 1 − 2i. Since a greatest common divisor is unique up to associates and the
units of Z[i] are 1, −1, i, −i, it follows that the set of greatest common divisors of 8 + 9i and 10 − 5i is

{1 − 2i, −1 + 2i, 2 + i, −2 − i}
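The whole procedure can be automated. A minimal sketch (our own helper names; Gaussian integers represented as pairs (a, b) for a + bi) that implements the division step from the proof of Theorem 12.1.12 and iterates it; since rounding ties may be broken either way, it can return any associate of 1 − 2i:

```python
from fractions import Fraction

def norm(a):
    # N(a + bi) = a^2 + b^2
    return a[0] ** 2 + a[1] ** 2

def gi_mul(a, b):
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def gi_divmod(a, b):
    # a/b = a * conj(b) / N(b) in Q[i]; round each part to a nearest integer
    n = norm(b)
    s = Fraction(a[0] * b[0] + a[1] * b[1], n)
    t = Fraction(a[1] * b[0] - a[0] * b[1], n)
    q = (round(s), round(t))
    qb = gi_mul(q, b)
    return q, (a[0] - qb[0], a[1] - qb[1])

def gi_gcd(a, b):
    # the Euclidean algorithm in Z[i]
    while b != (0, 0):
        _, r = gi_divmod(a, b)
        a, b = b, r
    return a

d = gi_gcd((8, 9), (10, -5))
assert d in [(1, -2), (-1, 2), (2, 1), (-2, -1)]   # an associate of 1 - 2i
```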

In a Euclidean domain with a “nice” Euclidean function, say one where N (b) < N (a) whenever b strictly
divides a, one can mimic the inductive argument in Z to prove that every element is a product of irreducibles.
For example, it is relatively straightforward to prove that our Euclidean functions on F [x] and Z[i] satisfy
this, and so every (nonzero nonunit) element factors into irreducibles. In fact, one can show that every
Euclidean domain has a (possibly different) Euclidean function N with the property that N (b) < N (a)
whenever b strictly divides a. However, rather than develop this interesting theory, we approach these
problems from another perspective.

12.2 Principal Ideal Domains


We chose our definition of a Euclidean domain to abstract away the fundamental fact about Z that we can
always divide in such a way to get a quotient along with a “smaller” remainder. As we have seen, this ability
allows us to carry over to these integral domains the existence of greatest common divisors and the method
of finding them via the Euclidean Algorithm.
Recall that back when we were working with Z, we had another characterization of (and proof of existence for)
a greatest common divisor in Theorem 2.4.8. We proved that the greatest common divisor of two nonzero
integers a and b was the least positive number of the form ma + nb where m, n ∈ Z. Now the “least” part
will have no analogue in a general integral domain, so we will have to change that. Perhaps surprisingly,
it turns out that the way to generalize this construction is to think about ideals. The reason why is that
{ma + nb : m, n ∈ Z} is simply the ideal ⟨a, b⟩. As we will see, in hindsight, what made this approach to
greatest common divisors work in Z is the fact that every ideal of Z is principal (from Proposition 10.4.7).
We give the integral domains which have this property a special name.
Definition 12.2.1. A principal ideal domain, or PID, is an integral domain in which every ideal is principal.
Before working with these rings on their own terms, we first prove that every Euclidean domain is a PID,
so that we have a decent supply of examples. Our proof generalizes the one for Z (see Proposition 10.4.7) in
the sense that instead of looking for a smallest positive element of the ideal, we simply look for an element
of smallest “size” according to a given Euclidean function.
Theorem 12.2.2. Every Euclidean domain is a PID.
Proof. Let R be a Euclidean domain, and fix a Euclidean function N : R\{0} → N. Suppose that I is an
ideal of R. If I = {0}, then I = ⟨0⟩. Suppose then that I ≠ {0}. The set

{N (a) : a ∈ I\{0}}

is a nonempty subset of N. By the well-ordering property of N, the set has a least element m. Fix b ∈ I
with N (b) = m. Since b ∈ I, we clearly have ⟨b⟩ ⊆ I. Suppose now that a ∈ I. Fix q, r ∈ R with

a = qb + r

and either r = 0 or N (r) < N (b). Since r = a − qb and both a, b ∈ I, it follows that r ∈ I. Now if r ≠ 0,
then N (r) < N (b) = m, contradicting the minimality of m. Therefore, we must have r = 0, and so a = qb. It
follows that a ∈ ⟨b⟩. Since a ∈ I was arbitrary, we conclude that I ⊆ ⟨b⟩. Therefore, I = ⟨b⟩.
Corollary 12.2.3. Z, F [x] for F a field, and Z[i] are all PIDs.
Notice that all fields F are also PIDs, for the trivial reason that the only ideals of F are {0} = ⟨0⟩ and
F = ⟨1⟩. In fact, all fields are trivially Euclidean domains via absolutely any function N : F \{0} → N,
because we can always divide by a nonzero element with zero as a remainder. It turns out that there are
PIDs which are not Euclidean domains, but we will not construct examples of such rings now.
Returning to our other characterization of greatest common divisors in Z, recall that if a, b ∈ Z are not
both zero, then we considered the set

{ma + nb : m, n ∈ Z}

and proved that the least positive element of this set was the greatest common divisor. In our current ring-
theoretic language, the above set is the ideal ⟨a, b⟩ of Z, and a generator of this ideal is a greatest common
divisor. With this change in perspective/language, we can carry this argument over to an arbitrary PID.
Theorem 12.2.4. Let R be a PID and let a, b ∈ R.

1. There exists a greatest common divisor of a and b.


2. If d is a greatest common divisor of a and b, then there exist r, s ∈ R with d = ra + sb.
Proof.
1. Let a, b ∈ R. Consider the ideal

I = ⟨a, b⟩ = {ra + sb : r, s ∈ R}

Since R is a PID, the ideal I is principal, so we may fix d ∈ R with I = ⟨d⟩. Since d ∈ ⟨d⟩ = ⟨a, b⟩, we
may fix r, s ∈ R with ra + sb = d. We claim that d is a greatest common divisor of a and b.
First notice that a ∈ I since a = 1a + 0b, so a ∈ ⟨d⟩, and hence d | a. Also, we have b ∈ I because
b = 0a + 1b, so b ∈ ⟨d⟩, and hence d | b. Thus, d is a common divisor of a and b.
Suppose now that c is a common divisor of a and b. Fix m, n ∈ R with a = cm and b = cn. We then
have

d = ra + sb
  = r(cm) + s(cn)
  = c(rm + sn)

Thus, c | d. Putting it all together, we conclude that d is a greatest common divisor of a and b.
2. For the d in part 1, we showed in the proof that there exist r, s ∈ R with d = ra + sb. Let d′ be any
other greatest common divisor of a and b, and fix a unit u with d′ = du. We then have

d′ = du = (ra + sb)u = (ru)a + (su)b
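In Z, part 2 is effective: the extended Euclidean algorithm computes d together with the coefficients r and s along the way. A standard sketch (not from the text):

```python
def extended_gcd(a, b):
    # returns (d, r, s) with d = gcd(a, b) and d = r*a + s*b
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

d, r, s = extended_gcd(120, 84)
assert d == 12 and r * 120 + s * 84 == 12
```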

It is possible to define a greatest common divisor of elements a1 , a2 , . . . , an ∈ R completely analogously
to our definition for pairs of elements. If we do so, even in the case of a nice Euclidean domain, we can
not immediately generalize the idea of the Euclidean Algorithm to many elements without doing a kind
of repeated nesting that gets complicated. However, notice that we can very easily generalize our PID
arguments to prove that greatest common divisors exist and are unique up to associates by following the
above proofs and simply replacing the ideal ⟨a, b⟩ with the ideal ⟨a1 , a2 , . . . , an ⟩. We even conclude that it is
possible to write a greatest common divisor in the form r1 a1 + r2 a2 + · · · + rn an . The assumption that all
ideals are principal is extremely powerful.
With the hard work of the last couple of sections in hand, we can now carry over much of our later work
in Z which dealt with relatively prime integers and primes. The next definition and ensuing two propositions
directly generalize corresponding results about Z.
Definition 12.2.5. Let R be a PID. Two elements a, b ∈ R are relatively prime if 1 is a greatest common
divisor of a and b.
Proposition 12.2.6. Let R be a PID and let a, b, c ∈ R. If a | bc and a and b are relatively prime, then
a | c.
Proof. Fix d ∈ R with bc = ad. Fix r, s ∈ R with ra + sb = 1. Multiplying this last equation through by c,
we conclude that rac + sbc = c, so
c = rac + s(bc)
= rac + s(ad)
= a(rc + sd)
It follows that a | c.
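In Z, this is the familiar cancellation fact behind many divisibility arguments; a quick numeric check with our own example numbers:

```python
import math

a, b, c = 9, 10, 18
assert (b * c) % a == 0          # a | bc
assert math.gcd(a, b) == 1       # a and b are relatively prime
assert c % a == 0                # hence a | c
```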

Proposition 12.2.7. Suppose that R is a PID. If p is irreducible, then p is prime.


Proof. Suppose that p ∈ R is irreducible. By definition, p is nonzero and not a unit. Suppose that a, b ∈ R
are such that p | ab. Fix a greatest common divisor d of p and a. Since d | p, we may fix c ∈ R with p = dc.
Now p is irreducible, so either d is a unit or c is a unit. We handle each case.
• Suppose that d is a unit. We then have that 1 is an associate of d, so 1 is also a greatest common divisor
of p and a. Therefore, p and a are relatively prime, so as p | ab, we may use Proposition 12.2.6 to
conclude that p | b.
• Suppose that c is a unit. We then have that pc−1 = d, so p | d. Since d | a, it follows that p | a.
Therefore, either p | a or p | b. It follows that p is prime.
Excellent. We’ve developed enough theory of PIDs to prove that GCDs always exist, and used that to
prove that irreducibles are always prime in PIDs. However, there is still the question of why strict divisibility
is well-founded in PIDs. Now that we no longer have a notion of size of elements, this seems daunting. Notice
that the defining property of PIDs is a condition on the ideals, so we first rephrase the strict divisibility
relation in terms of ideals.
Proposition 12.2.8. Suppose that R is an integral domain and a, b ∈ R. We have that a strictly divides b
if and only if ⟨b⟩ ⊊ ⟨a⟩.
Proof. Immediate from Proposition 11.1.8 and Corollary 11.1.9.
Thus, if we want to prove that strict divisibility is well-founded, i.e. if we want to prove that there is no
infinite descending sequence of strict divisibility, then we can instead prove that there is no infinite ascending
chain of ideals. Commutative rings that have this property are given a special name in honor of Emmy Noether.
Definition 12.2.9. A commutative ring R is said to be Noetherian if there is no strictly increasing sequence
of ideals

I1 ⊊ I2 ⊊ I3 ⊊ . . .

Equivalently, whenever

I1 ⊆ I2 ⊆ I3 ⊆ . . .

is a sequence of ideals, there exists N ∈ N+ such that Ik = IN for all k ≥ N .
Immediately from the definition, we obtain the following result.
Proposition 12.2.10. Let R be a Noetherian integral domain. We then have that strict divisibility is a
well-founded relation on R, and hence every nonzero nonunit element of R can be written as a product of
irreducibles.
Proof. Let R be a Noetherian integral domain. We need to prove that strict divisibility is well-founded on
R. Suppose instead that we have a sequence a1 , a2 , a3 , . . . such that an+1 strictly divides an for all n ∈ N+ .
Letting In = ⟨an ⟩ for all n, it follows from Proposition 12.2.8 that In ⊊ In+1 for all n ∈ N+ . Thus, we have

I1 ⊊ I2 ⊊ I3 ⊊ . . .

which contradicts the definition of Noetherian. Therefore, strict divisibility is well-founded on R. The last
statement now follows from Proposition 11.4.4.
This is all well and good, but we need a “simple” way to determine when a commutative ring R is
Noetherian. Fortunately, we have the following.
Theorem 12.2.11. Let R be a commutative ring. The following are equivalent.
1. R is Noetherian.

2. Every ideal of R is finitely generated (i.e. for every ideal I of R, there exist a1 , a2 , . . . , am ∈ R with
I = ⟨a1 , a2 , . . . , am ⟩).
Proof. Suppose first that every ideal of R is finitely generated. Let

I1 ⊆ I2 ⊆ I3 ⊆ . . .

be a sequence of ideals. Let

J = I1 ∪ I2 ∪ I3 ∪ · · · = {r ∈ R : r ∈ Ik for some k ∈ N+ }

We claim that J is an ideal of R.


• First notice that 0 ∈ I1 ⊆ J.
• Let a, b ∈ J be arbitrary. Fix k, ℓ ∈ N+ with a ∈ Ik and b ∈ Iℓ . We then have a, b ∈ Imax{k,ℓ} , so
a + b ∈ Imax{k,ℓ} ⊆ J.
• Let a ∈ J and r ∈ R be arbitrary. Fix k ∈ N+ with a ∈ Ik . We then have that ra ∈ Ik ⊆ J.
Since J is an ideal of R and we are assuming that every ideal of R is finitely generated, there exist
a1 , a2 , . . . , am ∈ R with J = ⟨a1 , a2 , . . . , am ⟩. For each i, fix ki ∈ N+ with ai ∈ Iki . Let N = max{k1 , k2 , . . . , km }.
We then have that ai ∈ IN for each i, hence J = ⟨a1 , a2 , . . . , am ⟩ ⊆ IN . Therefore, for any n ≥ N , we have

IN ⊆ In ⊆ J ⊆ IN

hence In = IN .
Suppose conversely that some ideal of R is not finitely generated, and fix such an ideal J. Define a
sequence of elements of J as follows. Let a1 be an arbitrary element of J. Suppose that we have defined
a1 , a2 , . . . , ak ∈ J. Since J is not finitely generated, we have that

⟨a1 , a2 , . . . , ak ⟩ ⊊ J

so we may let ak+1 be some (any) element of J\⟨a1 , a2 , . . . , ak ⟩. Letting In = ⟨a1 , a2 , . . . , an ⟩ for each n ∈ N+ ,
we then have

I1 ⊊ I2 ⊊ I3 ⊊ . . .

so R is not Noetherian.
Corollary 12.2.12. Every PID is Noetherian.
Proof. This follows immediately from Theorem 12.2.11 and the fact that in a PID every ideal is generated
by one element.
Corollary 12.2.13. Every PID is a UFD, and thus every Euclidean domain is a UFD as well.
Proof. Let R be a PID. By Corollary 12.2.12, we know that R is Noetherian, and so strict divisibility is
well-founded by Proposition 12.2.10. Furthermore, we know that every irreducible in R is prime by
Proposition 12.2.7. Therefore, R is a UFD by Theorem 11.4.12.
Finally, we bring together many of the fundamental properties of elements and ideals in any PID.
Proposition 12.2.14. Let R be a PID and let a ∈ R with a ≠ 0. The following are equivalent.
1. ⟨a⟩ is a maximal ideal.

2. ⟨a⟩ is a prime ideal.

3. a is a prime.

4. a is irreducible.

Proof. We have already proved much of this, so let’s recap what we know.

• 1 → 2 is Corollary 10.5.6.

• 2 ↔ 3 is Proposition 11.2.5.

• 3 → 4 is Proposition 11.2.4.

• 4 → 3 is Proposition 12.2.7.

To finish the equivalences, we prove that 4 → 1.


Suppose that a ∈ R is irreducible, and let M = ⟨a⟩. Since a is not a unit, we have that 1 ∉ ⟨a⟩, so
M ≠ R. Suppose that I is an ideal with M ⊆ I ⊆ R. Since R is a PID, there exists b ∈ R with I = ⟨b⟩. We
then have that ⟨a⟩ ⊆ ⟨b⟩, so b | a by Proposition 11.1.8. Fix c ∈ R with a = bc. Since a is irreducible, either
b is a unit or c is a unit. In the former case, we have that 1 ∈ ⟨b⟩ = I, so I = R. In the latter case, we have
that b is an associate of a, so I = ⟨b⟩ = ⟨a⟩ = M by Corollary 11.1.9. Thus, there is no ideal I of R with
M ⊊ I ⊊ R.

It is natural to misread the previous proposition to conclude that in a PID every prime ideal is maximal.
This is almost true, but pay careful attention to the assumption that a ≠ 0. In a PID R, the ideal {0} is
always a prime ideal, but it is only maximal in the trivial special case when R is a field. In a PID, every
nonzero prime ideal is maximal.

Corollary 12.2.15. If R is a PID, then every nonzero prime ideal is maximal.

12.3 Quotients of F [x]


We have now carried over many of the concepts and proofs that originated in the study of the integers Z
to the polynomial ring F [x] over a field F (and more generally to any Euclidean domain, or even any PID). One
construction we have not focused on is the quotient construction. In the integers Z, we know that every
nonzero ideal has the form nZ = ⟨n⟩ for some n ∈ N+ (because after all Z is a PID, and we have ⟨n⟩ = ⟨−n⟩).
The quotients of Z by these ideals are the now very familiar rings Z/nZ.
Since we have so much in common between Z and F [x], we would like to study the analogous quotients
of F [x]. Now if F is a field, then F [x] is a PID, so every nonzero ideal of F [x] equals ⟨p(x)⟩ for some
p(x) ∈ F [x]. Suppose then that p(x) ∈ F [x], and let I = ⟨p(x)⟩. Given f (x) ∈ F [x], we will write [f (x)] for
the coset f (x) + I, just like we wrote [k] for the coset k + nZ. As long as we do not change the ideal I (so do
not change p(x)) in a given construction, there should be no confusion as to which quotient we are working
in. When dealing with these quotients, our first task is to find unique representatives for the cosets, as in
Z/nZ where we saw that 0, 1, . . . , n − 1 served as distinct representatives for the cosets.

Proposition 12.3.1. Let F be a field and let p(x) ∈ F [x] be nonzero. Let I = ⟨p(x)⟩ and work in F [x]/I.
For all f (x) ∈ F [x], there exists a unique h(x) ∈ F [x] such that both:

• [f (x)] = [h(x)]

• Either h(x) = 0 or deg(h(x)) < deg(p(x))



In other words, if we let

S = {h(x) ∈ F [x] : h(x) = 0 or deg(h(x)) < deg(p(x))}

then the elements of S provide unique representatives for the cosets in F [x]/I.

Proof. We first prove existence. Let f (x) ∈ F [x]. Since p(x) ≠ 0, we may fix q(x), r(x) with

f (x) = q(x)p(x) + r(x)

and either r(x) = 0 or deg(r(x)) < deg(p(x)). We then have

p(x)q(x) = f (x) − r(x)

Thus p(x) | (f (x) − r(x)), and so f (x) − r(x) ∈ I. It follows from Proposition 10.1.6 that [f (x)] = [r(x)], so we
may take h(x) = r(x). This proves existence.
We now prove uniqueness. Suppose that h1 (x), h2 (x) ∈ S (so each is either 0 or has smaller degree
than p(x)) and that [h1 (x)] = [h2 (x)]. Using Proposition 10.1.6, we then have that h1 (x) − h2 (x) ∈ I and
hence p(x) | (h1 (x) − h2 (x)). Notice that every nonzero multiple of p(x) has degree greater than or equal to
deg(p(x)) (since the degree of a product is the sum of the degrees in F [x]). Now either h1 (x) − h2 (x) = 0
or it has degree less than deg(p(x)), but we’ve just seen that the latter is impossible. Therefore, it must be
the case that h1 (x) − h2 (x) = 0, and so h1 (x) = h2 (x). This proves uniqueness.

Let’s look at an example. Suppose that we are working with F = Q, and we let p(x) = x2 − 2x + 3.
Consider the quotient ring R = Q[x]/⟨p(x)⟩. From above, we know that every element in this quotient
is represented uniquely by either a constant polynomial or a polynomial of degree 1. Thus, some distinct
elements of R are [1], [3/7], [x], and [2x − 5/3]. We add elements in the quotient ring R by adding representatives
as usual, so for example we have

[4x − 7] + [2x + 8] = [6x + 1]

Multiplication of elements of R is more interesting if we try to convert the resulting product to one of our
chosen representatives. For example, we have

[2x + 7] · [x − 1] = [(2x + 7)(x − 1)] = [2x2 + 5x − 7]

which is perfectly correct, but the resulting representative isn’t one of our chosen ones. If we follow the
above proof, we should divide 2x2 + 5x − 7 by x2 − 2x + 3 and use the remainder as our representative. We
have

2x2 + 5x − 7 = 2 · (x2 − 2x + 3) + (9x − 13)

so

(2x2 + 5x − 7) − (9x − 13) ∈ ⟨p(x)⟩

and hence

[2x2 + 5x − 7] = [9x − 13]

It follows that in the quotient we have

[2x + 7] · [x − 1] = [9x − 13]

Here’s another way to determine that the product is [9x − 13]. Notice that for any f (x), g(x) ∈ F [x], we have

[f (x)] + [g(x)] = [f (x) + g(x)]        [f (x)] · [g(x)] = [f (x) · g(x)]

by definition of addition and multiplication in the quotient. Now p(x) = x2 − 2x + 3, so in the quotient we have
[x2 − 2x + 3] = [0]. It follows that

[x2 ] + [−2x + 3] = [0]

and hence

[x2 ] = [2x − 3]

Therefore,

[2x + 7] · [x − 1] = [2x2 + 5x − 7]
                  = [2] · [x2 ] + [5x − 7]
                  = [2] · [2x − 3] + [5x − 7]
                  = [4x − 6] + [5x − 7]
                  = [9x − 13]
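Reducing a product to its chosen representative is just polynomial division with remainder. A minimal sketch (our own helper names; coefficients over Q as lists, lowest degree first):

```python
from fractions import Fraction

def poly_mul(f, g):
    # multiply two polynomials given as coefficient lists, lowest degree first
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def poly_mod(f, p):
    # the remainder of f upon division by p: the chosen coset
    # representative in F[x]/<p(x)> from Proposition 12.3.1
    f = [Fraction(c) for c in f]
    p = [Fraction(c) for c in p]
    while len(f) >= len(p):
        c = f[-1] / p[-1]
        shift = len(f) - len(p)
        for i, pc in enumerate(p):
            f[shift + i] -= c * pc
        while f and f[-1] == 0:
            f.pop()
    return f

p = [3, -2, 1]                                 # x^2 - 2x + 3
prod = poly_mod(poly_mul([7, 2], [-1, 1]), p)  # [2x + 7] * [x - 1]
assert prod == [-13, 9]                        # 9x - 13, as above
```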
For another example, consider the ring F = Z/2Z. Since we will be working in quotients of F [x], and too
many equivalence classes begins to get confusing, we will simply write F = {0, 1} rather than F = {[0], [1]}.
Consider the polynomial p(x) = x2 + 1 ∈ F [x]. We then have that the elements of F [x]/⟨x2 + 1⟩ are represented
uniquely by either a constant polynomial or a polynomial of degree 1. Thus, the distinct elements of F [x]/⟨x2 + 1⟩
are given by [ax + b] for a, b ∈ F . Since |F | = 2, we have two choices for each of a and b, and hence the quotient
has four elements. Here are the addition and multiplication tables for F [x]/⟨x2 + 1⟩, where we write ax + b in
place of the coset [ax + b]:

  +   |  0    1    x    x+1          ·   |  0    1    x    x+1
  ----+---------------------         ----+---------------------
  0   |  0    1    x    x+1          0   |  0    0    0    0
  1   |  1    0    x+1  x            1   |  0    1    x    x+1
  x   |  x    x+1  0    1            x   |  0    x    1    x+1
  x+1 |  x+1  x    1    0            x+1 |  0    x+1  x+1  0
The addition table is fairly straightforward, but some work went into constructing the multiplication table.
For example, we have
x · x + 1 = x2 + x
To determine which of our chosen representatives gives this coset, we notice that x2 + 1 = 0 (since clearly
x2 + 1 ∈ hx2 + 1i), so x2 + 1 = 0. Adding 1 to both sides and noting that 1 + 1 = 0, we conclude that x2 = 1.
Therefore
x · x + 1 = x2 + x
= x2 + x
=1+x
=x+1
The other entries of the table can be found similarly. In fact, we have done all the hard work because we
now know that x2 = 1 and we can use that throughout.
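Since x^2 reduces to 1 in this quotient, the multiplication table can also be generated mechanically. The following Python sketch (encoding a·x + b as the pair (a, b) is our own convention, not from the text) reproduces the table above:

```python
# Arithmetic in F[x]/<x^2 + 1> with F = Z/2Z.
elems = [(0, 0), (0, 1), (1, 0), (1, 1)]                       # 0, 1, x, x+1
names = {(0, 0): "0", (0, 1): "1", (1, 0): "x", (1, 1): "x+1"}

def mul(u, v):
    a, b = u
    c, d = v
    # (ax+b)(cx+d) = ac*x^2 + (ad+bc)x + bd; reduce using x^2 = 1, coefficients mod 2
    x2, x1, x0 = a * c, a * d + b * c, b * d
    return (x1 % 2, (x0 + x2) % 2)

for u in elems:
    print(names[u], ":", [names[mul(u, v)] for v in elems])
```

Each printed row matches the corresponding row of the multiplication table; in particular the row for x is 0, x, 1, x+1.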
Suppose instead that we work with the polynomial p(x) = x^2 + x + 1, so that we are considering the
quotient F[x]/⟨x^2 + x + 1⟩. As above, since deg(x^2 + x + 1) = 2, we get unique representatives from the
elements ax + b for a, b ∈ {0, 1}. Although we have the same representatives, the cosets are different, and the
multiplication table changes considerably:

  +    | 0     1     x     x+1
  -----+-----------------------
  0    | 0     1     x     x+1
  1    | 1     0     x+1   x
  x    | x     x+1   0     1
  x+1  | x+1   x     1     0

  ·    | 0     1     x     x+1
  -----+-----------------------
  0    | 0     0     0     0
  1    | 0     1     x     x+1
  x    | 0     x     x+1   1
  x+1  | 0     x+1   1     x

For example, let’s determine \overline{x} · \overline{x + 1} in this situation. Notice first that x^2 + x + 1 ∈ ⟨x^2 + x + 1⟩, so
\overline{x^2 + x + 1} = \overline{0}. Adding \overline{x + 1} to both sides, and using the fact that \overline{1} + \overline{1} = \overline{0} and \overline{x} + \overline{x} = \overline{0}, we conclude that
\overline{x^2} = \overline{x + 1}
Therefore, we have
\overline{x} · \overline{x + 1} = \overline{x^2 + x}
= \overline{x^2} + \overline{x}
= \overline{x + 1} + \overline{x}
= \overline{1}

Notice that every nonzero element of the quotient F[x]/⟨x^2 + x + 1⟩ has a multiplicative inverse, so the
quotient in this case is a field. We have succeeded in constructing a field of order 4. This is our first
example of a finite field that does not have prime order.
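One can also verify by machine that every nonzero element of this quotient is invertible. Here is a short Python sketch (encoding a·x + b as the pair (a, b) is our own convention, not from the text):

```python
# Arithmetic in F[x]/<x^2 + x + 1> with F = Z/2Z: a field with 4 elements.
elems = [(0, 0), (0, 1), (1, 0), (1, 1)]          # 0, 1, x, x+1
one = (0, 1)

def mul(u, v):
    a, b = u
    c, d = v
    # (ax+b)(cx+d) = ac*x^2 + (ad+bc)x + bd; reduce using x^2 = x + 1, mod 2
    x2, x1, x0 = a * c, a * d + b * c, b * d
    return ((x1 + x2) % 2, (x0 + x2) % 2)

# every nonzero element has a (unique) multiplicative inverse, so this is a field
for u in elems[1:]:
    inverses = [v for v in elems if mul(u, v) == one]
    assert len(inverses) == 1
    print(u, "has inverse", inverses[0])
```

In particular, x and x + 1 are inverses of each other, matching the table above.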
The reason why we obtained a field when taking the quotient by ⟨x^2 + x + 1⟩, but not when taking the
quotient by ⟨x^2 + 1⟩, is the following proposition. It is the analogue of the fact that Z/pZ is a field if and
only if p is prime (equivalently, irreducible) in Z.
Proposition 12.3.2. Let F be a field and let p(x) ∈ F[x] be nonzero. We have that F[x]/⟨p(x)⟩ is a field
if and only if p(x) is irreducible in F[x].
Proof. Let p(x) ∈ F[x] be nonzero. Since F is a field, we know that F[x] is a PID. Using Proposition 12.2.14
and Theorem 10.5.5, we conclude that

F[x]/⟨p(x)⟩ is a field ⟺ ⟨p(x)⟩ is a maximal ideal of F[x]
                       ⟺ p(x) is irreducible in F[x]
When F = {0, 1} as above, we see that x^2 + 1 is not irreducible (since 1 is a root, and indeed
x^2 + 1 = (x + 1)(x + 1) in F[x]), but x^2 + x + 1 is irreducible (because it has degree 2 and neither 0 nor 1 is
a root). Generalizing the previous constructions, we get the following.
Proposition 12.3.3. Let F be a finite field with k elements. If p(x) ∈ F[x] is irreducible and deg(p(x)) = n,
then F[x]/⟨p(x)⟩ is a field with k^n elements.
Proof. Since p(x) is irreducible, we know from the previous proposition that F[x]/⟨p(x)⟩ is a field. Since
deg(p(x)) = n, we can represent the elements of the quotient uniquely by elements of the form

a_{n−1}x^{n−1} + · · · + a_1 x + a_0

where each a_i ∈ F. Now F has k elements, so we have k choices for each value of a_i. We can make this
choice for each of the n coefficients a_i, so we have k^n many choices in total.
It turns out that if p ∈ N+ is prime and n ∈ N+, then there exists an irreducible polynomial in Z/pZ[x]
of degree n, so there exists a field of order p^n. However, directly proving that such polynomials exist is
nontrivial. It is also possible to invert this whole idea by first proving that there exist fields of order p^n,
working to understand their structure, and then using these results to prove that there exist irreducible
polynomials in Z/pZ[x] of each degree n. We will do this later in Chapter ??. In addition, we will prove that
every finite field has prime power order, and that any two finite fields of the same order are isomorphic, so
this process of taking quotients of Z/pZ[x] by irreducible polynomials suffices to construct all finite fields
(up to isomorphism).

Finally, we end this section with one way to construct the complex numbers. Consider the ring R[x]
of polynomials with real coefficients. Let p(x) = x^2 + 1, and notice that p(x) has no roots in R because
a^2 + 1 ≥ 1 for all a ∈ R. Since deg(p(x)) = 2, it follows that p(x) = x^2 + 1 is irreducible in R[x]. From
above, we conclude that R[x]/⟨x^2 + 1⟩ is a field. Now elements of the quotient are represented uniquely by
\overline{ax + b} for a, b ∈ R. We have x^2 + 1 ∈ ⟨x^2 + 1⟩, so \overline{x^2 + 1} = \overline{0} and hence \overline{x^2} = \overline{−1}. It follows that \overline{x}^2 = \overline{−1}, so \overline{x}
can play the role of “i”. Notice that for any a, b, c, d ∈ R we have
\overline{ax + b} + \overline{cx + d} = \overline{(a + c)x + (b + d)}
and
\overline{ax + b} · \overline{cx + d} = \overline{acx^2 + (ad + bc)x + bd}
= \overline{acx^2} + \overline{(ad + bc)x + bd}
= \overline{ac} · \overline{x^2} + \overline{(ad + bc)x + bd}
= \overline{ac} · \overline{−1} + \overline{(ad + bc)x + bd}
= \overline{(ad + bc)x + (bd − ac)}
Notice that this multiplication is exactly the same as when you treat the complex numbers as having the
form ai + b and “formally add and multiply” using the rule that i^2 = −1. One advantage of our quotient
construction is that we do not need to verify all of the field axioms: we get them for free from our general
theory.
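As a quick sanity check, the multiplication rule just derived agrees with Python’s built-in complex arithmetic. In this sketch (encoding ax + b as the pair (a, b) is our own convention), x plays the role of i:

```python
# Multiplication in R[x]/<x^2 + 1>, with a*x + b encoded as the pair (a, b).
def mul(u, v):
    a, b = u
    c, d = v
    # (ax+b)(cx+d) = ac*x^2 + (ad+bc)x + bd, and x^2 = -1 in the quotient
    return (a * d + b * c, b * d - a * c)

# compare against Python's complex numbers, with x playing the role of i
a, b, c, d = 2.0, 3.0, -1.0, 5.0
prod = mul((a, b), (c, d))
z = complex(b, a) * complex(d, c)
print(prod)                 # (7.0, 17.0)
print(z.imag, z.real)       # 7.0 17.0
```

Both computations give (2x + 3)(−x + 5) = 7x + 17, i.e. (2i + 3)(−i + 5) = 7i + 17.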

12.4 Field of Fractions


Let R be an integral domain. In this section, we show that R can be embedded in a field F. Furthermore,
our construction gives a “smallest” such field F, in a sense to be made precise below. We call this field the
field of fractions of R and denote it Frac(R). Our method for building Frac(R) generalizes the construction
of the rationals from the integers we outlined in Chapter 3, and in particular we will carry out all of the
necessary details that we omitted there. Notice that the fact that every integral domain R can be embedded
in a field “explains” why we have cancellation in integral domains (because, when viewed in the larger field
Frac(R), we can multiply both sides by a multiplicative inverse).
Definition 12.4.1. Let R be an integral domain. Define P = R × (R\{0}). We define a relation ∼ on P
by letting (a, b) ∼ (c, d) if ad = bc.
Proposition 12.4.2. ∼ is an equivalence relation on P .
Proof. We check the three properties:
• Reflexive: For any a, b ∈ R with b ≠ 0, we have (a, b) ∼ (a, b) because ab = ba, so ∼ is reflexive.
• Symmetric: Suppose that a, b, c, d ∈ R with b, d ≠ 0 and (a, b) ∼ (c, d). We then have ad = bc, hence
cb = bc = ad = da. It follows that (c, d) ∼ (a, b).
• Transitive: Suppose that a, b, c, d, e, f ∈ R with b, d, f ≠ 0, (a, b) ∼ (c, d), and (c, d) ∼ (e, f). We then
have ad = bc and cf = de. Hence,
(af)d = (ad)f
= (bc)f    (since ad = bc)
= b(cf)
= b(de)    (since cf = de)
= (be)d
Therefore, af = be because R is an integral domain and d ≠ 0, and hence (a, b) ∼ (e, f).

We now define Frac(R) to be the set of equivalence classes, i.e. Frac(R) = P/∼, and we denote the
equivalence class of (a, b) by \overline{(a, b)}. We need to define addition and multiplication on Frac(R) to make it
into a field. Mimicking addition and multiplication of rationals, we want to define
\overline{(a, b)} + \overline{(c, d)} = \overline{(ad + bc, bd)}        \overline{(a, b)} · \overline{(c, d)} = \overline{(ac, bd)}
Notice first that since b, d ≠ 0, we have bd ≠ 0 because R is an integral domain, so we have no issues there.
However, we need to check that the operations are well-defined.
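These operations are easy to experiment with in code. The following Python sketch (the class name and encoding are our own, not from the text) implements the pairs and the relation ∼ for R = Z, and compares the result with Python’s built-in Fraction:

```python
from fractions import Fraction

class Frac:
    """A pair (a, b) with b != 0, compared via the relation (a, b) ~ (c, d) iff ad = bc."""
    def __init__(self, a, b):
        assert b != 0
        self.a, self.b = a, b

    def __eq__(self, other):
        # (a, b) ~ (c, d) iff a*d == b*c
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        # (a, b) + (c, d) = (ad + bc, bd)
        return Frac(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        # (a, b) * (c, d) = (ac, bd)
        return Frac(self.a * other.a, self.b * other.b)

# well-definedness in action: different representatives give equivalent answers
assert Frac(1, 2) + Frac(1, 3) == Frac(2, 4) + Frac(2, 6)
q = Frac(1, 2) * Frac(2, 3)
print(q.a, q.b)              # 2 6  (a representative of the class of 1/3)
print(Fraction(q.a, q.b))    # 1/3
```

Note that the product comes out as the pair (2, 6) rather than (1, 3): the operations never need to simplify, because equality is the relation ∼ rather than equality of pairs.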

Proposition 12.4.3. Let a1, a2, b1, b2, c1, c2, d1, d2 ∈ R with b1, b2, d1, d2 ≠ 0. Suppose that (a1, b1) ∼ (a2, b2)
and (c1, d1) ∼ (c2, d2). We then have

• (a1 d1 + b1 c1 , b1 d1 ) ∼ (a2 d2 + b2 c2 , b2 d2 )

• (a1 c1 , b1 d1 ) ∼ (a2 c2 , b2 d2 )

Thus, the above operations of + and · on Frac(R) are well-defined.

Proof. Since (a1 , b1 ) ∼ (a2 , b2 ) and (c1 , d1 ) ∼ (c2 , d2 ), it follows that a1 b2 = b1 a2 and c1 d2 = d1 c2 . We have

(a1 d1 + b1 c1 ) · b2 d2 = a1 b2 d1 d2 + c1 d2 b1 b2
= b1 a2 d1 d2 + d1 c2 b1 b2
= b1 d1 · (a2 d2 + b2 c2 )

so (a1 d1 + b1 c1 , b1 d1 ) ∼ (a2 d2 + b2 c2 , b2 d2 ).
Multiplying a1 b2 = b1 a2 and c1 d2 = d1 c2 , we conclude that

a1 b2 · c1 d2 = b1 a2 · d1 c2

and hence
a1 c1 · b2 d2 = b1 d1 · a2 c2

Therefore, (a1 c1 , b1 d1 ) ∼ (a2 c2 , b2 d2 ).

Now that we have successfully defined addition and multiplication, we are ready to prove that the resulting
object is a field.

Theorem 12.4.4. If R is an integral domain, then Frac(R) is a field with the following properties:

• The additive identity of Frac(R) is \overline{(0, 1)}.

• The multiplicative identity of Frac(R) is \overline{(1, 1)}.

• The additive inverse of \overline{(a, b)} ∈ Frac(R) is \overline{(−a, b)}.

• If \overline{(a, b)} ∈ Frac(R) with \overline{(a, b)} ≠ \overline{(0, 1)}, then a ≠ 0, and the multiplicative inverse of \overline{(a, b)} is \overline{(b, a)}.

Proof. We check each of the field axioms.


1. Associativity of +: Let q, r, s ∈ Frac(R). Fix a, b, c, d, e, f ∈ R with b, d, f ≠ 0 such that q = \overline{(a, b)},
r = \overline{(c, d)}, and s = \overline{(e, f)}. We then have

q + (r + s) = \overline{(a, b)} + (\overline{(c, d)} + \overline{(e, f)})
= \overline{(a, b)} + \overline{(cf + de, df)}
= \overline{(a(df) + b(cf + de), b(df))}
= \overline{(adf + bcf + bde, bdf)}
= \overline{((ad + bc)f + (bd)e, (bd)f)}
= \overline{(ad + bc, bd)} + \overline{(e, f)}
= [\overline{(a, b)} + \overline{(c, d)}] + \overline{(e, f)}
= (q + r) + s

2. Commutativity of +: Let q, r ∈ Frac(R). Fix a, b, c, d ∈ R with b, d ≠ 0 such that q = \overline{(a, b)} and
r = \overline{(c, d)}. We then have

q + r = \overline{(a, b)} + \overline{(c, d)}
= \overline{(ad + bc, bd)}
= \overline{(cb + da, db)}
= \overline{(c, d)} + \overline{(a, b)}
= r + q

3. \overline{(0, 1)} is an additive identity: Let q ∈ Frac(R). Fix a, b ∈ R with b ≠ 0 such that q = \overline{(a, b)}. We then
have

q + \overline{(0, 1)} = \overline{(a, b)} + \overline{(0, 1)}
= \overline{(a · 1 + b · 0, b · 1)}
= \overline{(a, b)}
= q

Since we already proved commutativity of +, we conclude that \overline{(0, 1)} + q = q also.

4. Additive inverses: Let q ∈ Frac(R). Fix a, b ∈ R with b ≠ 0 such that q = \overline{(a, b)}. Let r = \overline{(−a, b)}. We
then have

q + r = \overline{(a, b)} + \overline{(−a, b)}
= \overline{(ab + b(−a), bb)}
= \overline{(ab − ab, bb)}
= \overline{(0, bb)}
= \overline{(0, 1)}

where the last line follows from the fact that 0 · 1 = 0 = 0 · bb, so (0, bb) ∼ (0, 1). Since we already
proved commutativity of +, we conclude that r + q = \overline{(0, 1)} also.

5. Associativity of ·: Let q, r, s ∈ Frac(R). Fix a, b, c, d, e, f ∈ R with b, d, f ≠ 0 such that q = \overline{(a, b)},
r = \overline{(c, d)}, and s = \overline{(e, f)}. We then have

q · (r · s) = \overline{(a, b)} · (\overline{(c, d)} · \overline{(e, f)})
= \overline{(a, b)} · \overline{(ce, df)}
= \overline{(a(ce), b(df))}
= \overline{((ac)e, (bd)f)}
= \overline{(ac, bd)} · \overline{(e, f)}
= (\overline{(a, b)} · \overline{(c, d)}) · \overline{(e, f)}
= (q · r) · s

6. Commutativity of ·: Let q, r ∈ Frac(R). Fix a, b, c, d ∈ R with b, d ≠ 0 such that q = \overline{(a, b)} and
r = \overline{(c, d)}. We then have

q · r = \overline{(a, b)} · \overline{(c, d)}
= \overline{(ac, bd)}
= \overline{(ca, db)}
= \overline{(c, d)} · \overline{(a, b)}
= r · q

7. \overline{(1, 1)} is a multiplicative identity: Let q ∈ Frac(R). Fix a, b ∈ R with b ≠ 0 such that q = \overline{(a, b)}. We
then have

q · \overline{(1, 1)} = \overline{(a, b)} · \overline{(1, 1)}
= \overline{(a · 1, b · 1)}
= \overline{(a, b)}
= q

Since we already proved commutativity of ·, we conclude that \overline{(1, 1)} · q = q also.

8. Multiplicative inverses: Let q ∈ Frac(R) with q ≠ \overline{(0, 1)}. Fix a, b ∈ R with b ≠ 0 such that q = \overline{(a, b)}.
Since \overline{(a, b)} ≠ \overline{(0, 1)}, we know that (a, b) ≁ (0, 1), so a · 1 ≠ b · 0, which means that a ≠ 0. Let r = \overline{(b, a)},
which makes sense because a ≠ 0. We then have

q · r = \overline{(a, b)} · \overline{(b, a)}
= \overline{(ab, ba)}
= \overline{(ab, ab)}
= \overline{(1, 1)}

where the last line follows from the fact that ab · 1 = ab · 1, so (ab, ab) ∼ (1, 1). Since we already proved
commutativity of ·, we conclude that r · q = \overline{(1, 1)} also.

9. Distributivity: Let q, r, s ∈ Frac(R). Fix a, b, c, d, e, f ∈ R with b, d, f ≠ 0 such that q = \overline{(a, b)},
r = \overline{(c, d)}, and s = \overline{(e, f)}. We then have

q · (r + s) = \overline{(a, b)} · (\overline{(c, d)} + \overline{(e, f)})
= \overline{(a, b)} · \overline{(cf + de, df)}
= \overline{(a(cf + de), b(df))}
= \overline{(acf + ade, bdf)}
= \overline{(b(acf + ade), b(bdf))}
= \overline{(abcf + abde, bbdf)}
= \overline{((ac)(bf) + (bd)(ae), (bd)(bf))}
= \overline{(ac, bd)} + \overline{(ae, bf)}
= \overline{(a, b)} · \overline{(c, d)} + \overline{(a, b)} · \overline{(e, f)}
= q · r + q · s

Although R is certainly not a subring of Frac(R) (it is not even a subset), our next proposition says that
R can be embedded in Frac(R).
Proposition 12.4.5. Let R be an integral domain. Define θ : R → Frac(R) by θ(a) = \overline{(a, 1)}. We then have
that θ is an injective ring homomorphism.
Proof. Notice first that θ(1) = \overline{(1, 1)}, which is the multiplicative identity of Frac(R). Now for any a, b ∈ R,
we have
θ(a + b) = \overline{(a + b, 1)}
= \overline{(a · 1 + 1 · b, 1 · 1)}
= \overline{(a, 1)} + \overline{(b, 1)}
= θ(a) + θ(b)
and
θ(ab) = \overline{(ab, 1)}
= \overline{(a · b, 1 · 1)}
= \overline{(a, 1)} · \overline{(b, 1)}
= θ(a) · θ(b)
Thus, θ : R → Frac(R) is a ring homomorphism. Suppose now that a, b ∈ R with θ(a) = θ(b). We then have
that \overline{(a, 1)} = \overline{(b, 1)}, so (a, 1) ∼ (b, 1). It follows that a · 1 = 1 · b, so a = b. Therefore, θ is injective.
We have now completed our primary objective of showing that every integral domain can be embedded
in a field. Our final result about Frac(R) is that it is the “smallest” such field. We can’t hope to prove that
it is a subset of every field containing R, because of course we can always rename elements. However, we can
show that if R embeds in some field K, then you can also embed Frac(R) in K. In fact, we show that there
is a unique way to do it so that you “extend” the embedding of R.
Theorem 12.4.6. Let R be an integral domain, and let θ : R → Frac(R) be defined by θ(a) = \overline{(a, 1)} as above.
Suppose that K is a field and that ψ : R → K is an injective ring homomorphism. Then there exists a unique
injective ring homomorphism ϕ : Frac(R) → K such that ϕ ◦ θ = ψ.

Proof. First notice that if b ∈ R with b ≠ 0, then ψ(b) ≠ 0 (because ψ(0) = 0 and ψ is assumed to be
injective). Thus, if b ∈ R with b ≠ 0, then ψ(b) has a multiplicative inverse in K. Define ϕ : Frac(R) → K
by letting
ϕ(\overline{(a, b)}) = ψ(a) · ψ(b)^{−1}
We check the following.
• ϕ is well-defined: Suppose that \overline{(a, b)} = \overline{(c, d)}. We then have (a, b) ∼ (c, d), so ad = bc. From this we
conclude that ψ(ad) = ψ(bc), and since ψ is a ring homomorphism, it follows that ψ(a) · ψ(d) = ψ(c) · ψ(b).
We have b, d ≠ 0, so ψ(b) ≠ 0 and ψ(d) ≠ 0 by our initial comment. Multiplying both sides by
ψ(b)^{−1} · ψ(d)^{−1}, we conclude that ψ(a) · ψ(b)^{−1} = ψ(c) · ψ(d)^{−1}. Therefore, ϕ(\overline{(a, b)}) = ϕ(\overline{(c, d)}).
• ϕ(1_{Frac(R)}) = 1_K: We have

ϕ(\overline{(1, 1)}) = ψ(1) · ψ(1)^{−1} = 1 · 1^{−1} = 1
• ϕ preserves addition: Let a, b, c, d ∈ R with b, d ≠ 0. We have

ϕ(\overline{(a, b)} + \overline{(c, d)}) = ϕ(\overline{(ad + bc, bd)})
= ψ(ad + bc) · ψ(bd)^{−1}
= (ψ(ad) + ψ(bc)) · (ψ(b) · ψ(d))^{−1}
= (ψ(a) · ψ(d) + ψ(b) · ψ(c)) · ψ(b)^{−1} · ψ(d)^{−1}
= ψ(a) · ψ(b)^{−1} + ψ(c) · ψ(d)^{−1}
= ϕ(\overline{(a, b)}) + ϕ(\overline{(c, d)})

• ϕ preserves multiplication: Let a, b, c, d ∈ R with b, d ≠ 0. We have

ϕ(\overline{(a, b)} · \overline{(c, d)}) = ϕ(\overline{(ac, bd)})
= ψ(ac) · ψ(bd)^{−1}
= ψ(a) · ψ(c) · (ψ(b) · ψ(d))^{−1}
= ψ(a) · ψ(c) · ψ(b)^{−1} · ψ(d)^{−1}
= ψ(a) · ψ(b)^{−1} · ψ(c) · ψ(d)^{−1}
= ϕ(\overline{(a, b)}) · ϕ(\overline{(c, d)})

• ϕ is injective: Let a, b, c, d ∈ R with b, d ≠ 0, and suppose that ϕ(\overline{(a, b)}) = ϕ(\overline{(c, d)}). We then
have that ψ(a) · ψ(b)^{−1} = ψ(c) · ψ(d)^{−1}. Multiplying both sides by ψ(b) · ψ(d), we conclude that
ψ(a) · ψ(d) = ψ(b) · ψ(c). Since ψ is a ring homomorphism, it follows that ψ(ad) = ψ(bc). Now ψ is
injective, so we conclude that ad = bc. Thus, we have (a, b) ∼ (c, d), and so \overline{(a, b)} = \overline{(c, d)}.
• ϕ ◦ θ = ψ: For any a ∈ R, we have

(ϕ ◦ θ)(a) = ϕ(θ(a))
= ϕ(\overline{(a, 1)})
= ψ(a) · ψ(1)^{−1}
= ψ(a) · 1^{−1}
= ψ(a)

It follows that (ϕ ◦ θ)(a) = ψ(a) for all a ∈ R.



We finally prove uniqueness. Suppose that φ : Frac(R) → K is a ring homomorphism with φ ◦ θ = ψ. For
any a ∈ R, we have
φ(\overline{(a, 1)}) = φ(θ(a)) = ψ(a)
Now for any b ∈ R with b ≠ 0, we have

ψ(b) · φ(\overline{(1, b)}) = φ(\overline{(b, 1)}) · φ(\overline{(1, b)})
= φ(\overline{(b, 1)} · \overline{(1, b)})
= φ(\overline{(b · 1, 1 · b)})
= φ(\overline{(b, b)})
= φ(\overline{(1, 1)})
= 1

where we used the fact that (b, b) ∼ (1, 1) and that φ is a ring homomorphism, so it sends the multiplicative
identity to 1 ∈ K. Thus, for every b ∈ R with b ≠ 0, we have

φ(\overline{(1, b)}) = ψ(b)^{−1}

Now for any a, b ∈ R with b ≠ 0, we have

φ(\overline{(a, b)}) = φ(\overline{(a, 1)} · \overline{(1, b)})
= φ(\overline{(a, 1)}) · φ(\overline{(1, b)})
= ψ(a) · ψ(b)^{−1}    (from above)
= ϕ(\overline{(a, b)})

Therefore, φ = ϕ.

Corollary 12.4.7. Suppose that R is an integral domain that is a subring of a field K. If every element
of K can be written as ab^{−1} for some a, b ∈ R with b ≠ 0, then K ≅ Frac(R).
Proof. Let ψ : R → K be the inclusion map ψ(r) = r, and notice that ψ is an injective ring homomorphism. By
the proof of the previous result, the function ϕ : Frac(R) → K defined by

ϕ(\overline{(a, b)}) = ψ(a) · ψ(b)^{−1} = ab^{−1}

is an injective ring homomorphism. By assumption, ϕ is surjective, so ϕ is an isomorphism. Therefore,
K ≅ Frac(R).
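For instance, taking R = Z and K = Q: every rational number has the form ab^{−1} with a, b ∈ Z and b ≠ 0, so the corollary recovers the familiar fact that Q ≅ Frac(Z). A quick Python check of this observation:

```python
from fractions import Fraction

# Q contains Z as a subring, and every rational equals a * b**(-1) with
# a, b in Z and b != 0, so the corollary gives Q isomorphic to Frac(Z).
q = Fraction(22, 7)
a, b = q.numerator, q.denominator
assert b != 0
assert q == Fraction(a) * (1 / Fraction(b))
print(a, b)    # 22 7
```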
