Algebra Notes
Joseph R. Mileti
May 5, 2014
Contents

1 Introduction
1.1 Abstraction
1.2 Power and Beauty
1.3 Abstracting the Integers
1.4 Codes and Ciphers

2 The Integers
2.1 Induction and Well-Ordering
2.2 Divisibility
2.3 Division with Remainder
2.4 GCDs and the Euclidean Algorithm
2.5 Primes and Factorizations in Z

3 Relations and Functions
3.1 Relations
3.2 Equivalence Relations
3.3 Functions
3.4 Injections, Surjections, and Bijections
3.5 Defining Functions on Equivalence Classes

4 Introduction to Groups
4.1 Definitions
4.2 Basic Properties, Notation, and Conventions
4.3 Building Groups From Associative Operations
4.4 The Groups Z/nZ and U(Z/nZ)
4.5 The Symmetric Groups
4.6 Orders of Elements
4.7 Direct Products
Chapter 1

Introduction
1.1 Abstraction
Each of the sets N, Z, Q, R, and C comes equipped with naturally defined notions of addition and multiplication. Comparing these, we see many similarities. No matter which of these “universes” of numbers we work within, the order in which we add two numbers does not matter. That is, we always have a + b = b + a no matter what a and b are. There are of course some differences as well. In each of the first three sets, there is no number which when multiplied by itself gives 2, but in the latter two such a number does exist.
There is no reason to work only with objects one typically thinks of as numbers. We know how to
add and multiply vectors, polynomials, matrices, functions, etc. In many of these cases, several of the
“normal” properties of addition and multiplication are still true. For example, essentially all of the basic
properties of addition/multiplication carry over to polynomials and functions. However, occasionally the
operations of addition and multiplication that one defines fail to have the same properties. For example, the
“multiplication” of two vectors in R^3 given by cross product is not commutative, nor is the multiplication of square matrices. When n ≥ 2, the “multiplication” of two vectors in R^n given by the dot product takes two vectors and returns an element of R rather than an element of R^n.
One fundamental feature of abstract algebra is to take the essential properties of these operations, codify
them as axioms, and then study all occasions where they arise. Of course, we first need to ask the question:
What is essential? Where do we draw the line for which properties we enforce by fiat? A primary goal
is to put enough in to force an interesting theory, but keep enough out to leave the theory as general and
robust as possible. The delicate balancing act of “interesting” and “general” is no easy task, but centuries
of research have isolated a few important collections of axioms as fitting these requirements. Groups, rings, and fields are often viewed as the most central objects, although there are many other examples (such as integral domains, semigroups, Lie algebras, etc.). Built upon these in various ways are many other structures, such
as vector spaces, modules, and algebras. We will emphasize the “big three”, but will also spend a lot of time
on integral domains and vector spaces, as well as introduce and explore a few others throughout our study.
However, before diving into a study of these strange objects, one may ask “Why do we care?”. We’ll
spend the rest of this section giving motivation, but first we introduce rough descriptions of the types of
objects that we will study at a high level so that we can refer to them. Do not worry at all about the details
at this point.
A group consists of a set of objects together with a way of combining two objects to form another object, subject to certain axioms. For example, we'll see that Z under addition is a group, as is R\{0} under multiplication. More interesting examples are the set of all permutations (rearrangements) of the numbers {1, 2, . . . , n} under a “composition” operation, and the set of all valid transformations of the Rubik's Cube (again under a certain “composition” operation).
A ring consists of a set of objects together with two ways of combining two objects to form another object,
again subject to certain axioms. We call these operations addition and multiplication (even if they are not
“natural” versions of addition and multiplication) because the axioms assert that they behave in ways very
much like these operations. For example, we have a distributive law that says that a(b + c) = ab + ac for all
a, b, c. Besides Z, Q, R, and C under the usual operations, another ring is the set of all n × n matrices with
real entries using the addition and multiplication rules from linear algebra. Another interesting example is
“modular arithmetic”, where we work with the set {0, 1, 2, . . . , n − 1} and add/multiply as usual but cycle
back around when we get a value of n or greater. We denote these rings by Z/nZ, and they are examples of
finite rings that have many interesting number-theoretic properties.
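To make the cycling concrete, here is a small Python sketch of arithmetic in Z/nZ; this is an illustration of the idea only (the ring itself is defined carefully later in these notes), and the choice n = 6 is ours:

    # Arithmetic in Z/nZ: represent elements by 0, 1, ..., n-1 and
    # reduce with % after each operation so values cycle back around.
    n = 6

    def add_mod(a, b):
        return (a + b) % n

    def mul_mod(a, b):
        return (a * b) % n

    print(add_mod(4, 5))  # 3, since 9 cycles back around past 6
    print(mul_mod(4, 5))  # 2, since 20 = 3*6 + 2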
A field is a special kind of ring that satisfies the additional requirement that every nonzero element has
a multiplicative inverse (like 1/2 is a multiplicative inverse for 2). Since elements of Z do not generally
have multiplicative inverses, the ring Z, under the usual operations, is not a field. However, Q, R, and
C are fields under the usual operations, and roughly one thinks of general fields as behaving very much
like these examples. It turns out that for prime numbers p, the ring given by “modular arithmetic” on
{0, 1, 2, . . . , p − 1} is a fascinating example of a field with finitely many elements.
The complex numbers are the set

C = {a + bi : a, b ∈ R}

where i^2 = −1. Although perhaps a mathematical curiosity at first, using these new numbers provided ways
to solve ordinary problems in mathematics that people did not know how to do using standard techniques.
In particular, they were surprisingly employed to find real roots to cubic polynomials by using mysterious
multiplications and root operations on these weird new numbers. Eventually, using complex numbers allowed
mathematicians to calculate certain real integrals that resisted solutions to the usual attacks. Since then,
complex numbers are now used in an essential way in many areas of science. They unify and simplify many
concepts and calculations in electromagnetism and fluid flow. Moreover, they are fundamentally embedded
in an essential way within quantum mechanics where “probability amplitudes” are measured using complex
numbers rather than nonnegative real numbers.
One can ask whether we might want to extend C further to include roots of other numbers, perhaps roots
of the new numbers like i. It turns out that we do not have to do this, because C already has everything
one would want in this regard! For example, one can check that
((1/√2) + (1/√2)i)^2 = i,
so i has a square root. Not only is C closed under taking roots, but it turns out that every nontrivial
polynomial with real (or even complex) coefficients has a root in C! Thus, the complex numbers are not
“missing” anything from an algebraic point of view. We call fields with this property algebraically closed,
and we will eventually prove that C is such a field.
Even though C is such a pretty and useful object that has just about everything one could want, that
does not mean that we are unable to think about extending it further in different ways. William Hamilton
tried to do this by adding something new, let's call it j, and thinking about how to do arithmetic with the set {a + bi + cj : a, b, c ∈ R}. However, he was not able to define things like ij and j^2 in such a way that
the arithmetic worked out nicely. One day, he had the inspired idea of going up another dimension and
ditching commutativity of multiplication. As a result, he discovered the quaternions, which are the set of all
“numbers” of the form
H = {a + bi + cj + dk : a, b, c, d ∈ R}
where i^2 = j^2 = k^2 = −1. One also needs to explain how to multiply the elements i, j, k together, and this is where things become especially interesting. We multiply them in a cycle so that ij = k, jk = i, and ki = j. However, when we go backwards in the cycle we introduce a negative sign, so that ji = −k, kj = −i,
and ik = −j. We’ve lost commutativity of multiplication, but it turns out that H retains all of the other
essential properties, and is an interesting example of a ring that is not a field. Although H has not found
nearly as many applications to science and mathematics as C, it is still useful in some parts of physics and
to understand rotations in space. Furthermore, by understanding the limitations of H and what properties
fail, we gain a much deeper understanding of the role of commutativity of multiplication in fields like R and
C.
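To see the multiplication rules in action, here is a small Python sketch of quaternion multiplication. The formula below is just our expansion of (a1 + b1·i + c1·j + d1·k)(a2 + b2·i + c2·j + d2·k) using the rules above:

    # Quaternion multiplication: represent a + bi + cj + dk as (a, b, c, d)
    # and expand products using i^2 = j^2 = k^2 = -1, ij = k, ji = -k, etc.
    def qmul(p, q):
        a1, b1, c1, d1 = p
        a2, b2, c2, d2 = q
        return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
                a1*b2 + b1*a2 + c1*d2 - d1*c2,
                a1*c2 - b1*d2 + c1*a2 + d1*b2,
                a1*d2 + b1*c2 - c1*b2 + d1*a2)

    i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
    print(qmul(i, j))  # (0, 0, 0, 1), i.e. k
    print(qmul(j, i))  # (0, 0, 0, -1), i.e. -k: not commutative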
p = a^2 + b^2 = (a + bi)(a − bi)
so the number p, which is prime in Z, factors in an interesting manner over the larger ring Z[i]. This is
precisely what happened with 5 = 2^2 + 1^2 and 13 = 3^2 + 2^2 above. There is a converse to this result as well.
In fact, we will eventually be able to show the following theorem that also connects all of these ideas to facts
about modular arithmetic.
Theorem. Let p be an odd prime. The following are equivalent:

1. p = a^2 + b^2 for some a, b ∈ Z.
2. p is not prime in the ring Z[i].
3. −1 is a square in Z/pZ.
4. p ≡ 1 (mod 4).
This theorem establishes a connection between sums of squares, factorizations in Z[i], and squares in the
ring Z/pZ (the ring representing “modular arithmetic”). Thus, from this simple example, we see how basic
number-theoretic questions can be understood and hopefully solved by studying these more exotic algebraic
objects. Moreover, it opens the door to many interesting questions. For example, this theorem gives a
simple characterization of when −1 is a square in Z/pZ, but determining which other elements are squares is
a fascinating problem leading to the beautiful result known as Quadratic Reciprocity.
Furthermore, we see from the above theorem that p can be written as the sum of two squares exactly
when p (which is prime in Z) fails to be prime in the larger ring Z[i]. We have turned a number-theoretic
question into one about factorizations in a new ring. In order to take this perspective, we need a solid
understanding of factorizations in the ring Z[i], which we will develop. Working in N, it is well-known that
every n ≥ 2 factors uniquely (up to order) into a product of primes, but this fact is less obvious than it may
initially appear. We will see a proof of it in Chapter 2, but when we try to generalize it to other contexts,
we will encounter exotic settings where the natural generalization fails. It turns out that things work out nicely in Z[i], and that is one of the key ideas driving the theorem.
Closely related to which primes can be written as sums of squares, consider the problem of finding integer solutions to x^2 + y^2 = z^2. Of course, this equation is of interest because of the Pythagorean Theorem, so finding integer solutions is the same as finding right triangles all of whose sides have integer lengths. For example, we have 3^2 + 4^2 = 5^2 and 5^2 + 12^2 = 13^2. Finding all such Pythagorean triples is an interesting problem, and one can do it using elementary number theory. However, it's also possible to work in Z[i], and notice that x^2 + y^2 = z^2 if and only if (x + yi)(x − yi) = z^2 in Z[i]. Using this and the nice factorization properties of Z[i], one can find all solutions.
Fermat’s Last Theorem is the statement that if n ≥ 3, then there are no nontrivial integer solutions to
xn + y n = z n
(here nontrivial means that each of x, y, and z are nonzero). Fermat scribbled a note in the margin of
one of his books stating that he had a proof but the margin was too small to contain it. For centuries,
mathematicians attempted to prove this result. If there exists a nontrivial solution for some n, then some
straightforward calculations show that there must be a nontrivial solution for either n = 4 or for some odd
prime p. Fermat did provide a proof showing that
x^4 + y^4 = z^4
has no nontrivial integer solutions. Suppose then that p is an odd prime and we want to show that
x^p + y^p = z^p
has no nontrivial solution. The idea is to generalize the above concepts from Z[i] to factor the left-hand side.
Although it may not be obvious at this point, by setting ζ = e2πi/p , it turns out that
Working with these strange “numbers”, Lamé put forward an argument asserting that there were no nontriv-
ial solutions, thus claiming a solution to Fermat’s Last Theorem. Liouville pointed out that this argument
relied essentially on the ring Z[ζ] having nice factorization properties like Z and Z[i], and that this was not
obvious. In fact, there do exist some primes p where these properties fail in Z[ζ], and in these situations
Lamé’s argument fell apart. Attempts to patch these arguments, and to generalize Quadratic Reciprocity,
led to many fundamental ideas in ring theory. By the way, a correct solution to Fermat’s Last Theorem was
eventually completed by Andrew Wiles in 1994, using ideas and methods from abstract algebra much more
sophisticated than these.
Early in World War II, the theory of permutations of the set {1, 2, . . . , n} played an essential role in the efforts of the Polish and British to decode messages encrypted by the German Enigma machines. By the
middle of World War II, the British, using group theory, statistical analysis, specially built machines called
“Bombes”, and a lot of hard human work, were able to decode the majority of the military communications
using the Enigma machines. The resulting intelligence played an enormous role in allied operations and their
eventual victory.
Modern day cryptography makes even more essential use of the algebraic objects we will study. Public-
key cryptosystems and key-exchange algorithms like RSA and classical Diffie-Hellman work in the ring
Z/nZ of modular arithmetic. More sophisticated systems involve exotic finite groups based on certain
“elliptic” curves (and these curves themselves are defined over interesting finite fields). Even private-key
cryptosystems in widespread use often use these algebraic structures. For example, AES (the “Advanced
Encryption Standard”) works by performing arithmetic in a field with 2^8 = 256 many elements.
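As a taste of how such systems use modular arithmetic, here is a toy Diffie-Hellman key exchange in Python. The specific numbers below are made up for illustration; real systems use primes that are hundreds of digits long:

    # Toy Diffie-Hellman in Z/pZ: both parties end up with g^(ab) mod p
    # without ever transmitting a or b.
    p, g = 467, 2          # public: a prime modulus and a base
    a, b = 153, 197        # private exponents chosen by the two parties
    A = pow(g, a, p)       # first party transmits g^a mod p
    B = pow(g, b, p)       # second party transmits g^b mod p
    shared_1 = pow(B, a, p)      # first party computes (g^b)^a
    shared_2 = pow(A, b, p)      # second party computes (g^a)^b
    print(shared_1 == shared_2)  # True: both hold the same shared secret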
Chapter 2
The Integers
Here’s the intuitive argument for why induction is true. By the first assumption, we know that 0 ∈ X.
Since 0 ∈ X, the second assumption tells us that 1 ∈ X. Since 1 ∈ X, the second assumption again tells us
that 2 ∈ X. By repeatedly applying the second assumption in this manner, each element of N is eventually
determined to be in X.
The way that induction works above is that 5 is shown to be an element of X using only the assumption
that 4 is an element of X. However, once we’ve arrived at 5, we’ve already shown that 0, 1, 2, 3, 4 ∈ X, so
why shouldn’t we be able to make use of all of these assumptions when arguing that 5 ∈ X? The answer is
that we can, and this version of induction is sometimes called strong induction.
Fact 2.1.2 (Principle of Strong Induction on N). Let X ⊆ N. Suppose that n ∈ X whenever k ∈ X for all
k ∈ N with k < n. We then have that X = N.
Thus, when arguing that n ∈ X, we are allowed to assume that we know all smaller numbers are in X.
Notice that with this formulation we can even avoid the base case of checking 0 because of a technicality: If
we have the above assumption, then 0 ∈ X because vacuously k ∈ X whenever k ∈ N satisfies k < 0, simply
because no such k exists. If that twist of logic makes you uncomfortable, feel free to argue a base case of 0
when doing strong induction.
The last of the three fundamental facts looks different from induction, but like induction is based on the
concept that natural numbers start with 0 and are built by taking one discrete step at a time forward.
Fact 2.1.3 (Well-Ordering Property of N). Suppose that X ⊆ N with X ≠ ∅. There exists k ∈ X such that k ≤ n for all n ∈ X.
To see intuitively why this is true, suppose that X ⊆ N with X ≠ ∅. If 0 ∈ X, then we can take k = 0. Suppose not. If 1 ∈ X, then since 1 ≤ n for all n ∈ N with n ≠ 0, we can take k = 1. Continue on. If we keep going until we eventually find 7 ∈ X, then we know that 0, 1, 2, 3, 4, 5, 6 ∉ X, so we can take k = 7. If we keep going forever consistently finding that each natural number is not in X, then we have determined that X = ∅, which is a contradiction.
We now give an example of proof by induction. Notice that in this case we start with the base case of 1
rather than 0.
Theorem 2.1.4. For any n ∈ N+, we have

\sum_{k=1}^{n} (2k − 1) = n^2

i.e.

1 + 3 + 5 + 7 + · · · + (2n − 1) = n^2
Proof. We prove the result by induction. If we want to formally apply the above statement of induction, we are letting

X = {n ∈ N+ : \sum_{k=1}^{n} (2k − 1) = n^2}

and using the principle of induction to argue that X = N+. More formally still, if you feel uncomfortable starting with 1 rather than 0, we are letting

X = {0} ∪ {n ∈ N+ : \sum_{k=1}^{n} (2k − 1) = n^2}

and using the principle of induction to argue that X = N, then forgetting about 0 entirely. In the future, we will not bother to make these pedantic diversions to shoehorn our arguments into the technical versions expressed above, but you should know that it is always possible to do so.
• Base Case: Suppose that n = 1. We have

\sum_{k=1}^{1} (2k − 1) = 2 · 1 − 1 = 1

so the left-hand side is 1. The right-hand side is 1^2 = 1. Therefore, the result is true when n = 1.
• Inductive Step: Suppose that for some fixed n ∈ N+ we know that

\sum_{k=1}^{n} (2k − 1) = n^2

Adding 2(n + 1) − 1 = 2n + 1 to both sides, we then have

\sum_{k=1}^{n+1} (2k − 1) = n^2 + 2n + 1 = (n + 1)^2

so the result holds for n + 1.
Since the result holds of 1 and it holds of n + 1 whenever it holds of n, we conclude that the result holds for
all n ∈ N+ by induction.
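Although no finite computation can replace the induction proof, a quick numerical sanity check of the identity is easy to write down; here is one possible sketch in Python:

    # Check that 1 + 3 + ... + (2n - 1) = n^2 for the first several n.
    for n in range(1, 11):
        assert sum(2*k - 1 for k in range(1, n + 1)) == n**2
    print("identity verified for n = 1, ..., 10")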
Our other example of induction will be the Binomial Theorem, which tells us how to completely expand
a binomial to a power, i.e. how to expand the expression (x + y)n . For the first few values we have:
• (x + y)^1 = x + y
• (x + y)^2 = x^2 + 2xy + y^2
Lemma. Let n, k ∈ N with k ≤ n. We have

\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}

Proof. One extremely unenlightening proof is to expand out the formula on the right and do terrible algebraic manipulations on it. If you haven't done so, I encourage you to do it. If we believe the combinatorial description of \binom{n}{k}, here's a more meaningful combinatorial argument. Let n, k ∈ N with k ≤ n. Consider a set X with n + 1 many elements. To determine \binom{n+1}{k+1}, we need to count the number of subsets of X of size k + 1. We do this as follows. Fix an arbitrary a ∈ X. Now an arbitrary subset of X of size k + 1 fits into exactly one of the following types.

• The subset has a as an element. In this case, to completely determine the subset, we need to pick the remaining k elements of the subset from X\{a}. Since X\{a} has n elements, the number of ways to do this is \binom{n}{k}.

• The subset does not have a as an element. In this case, to completely determine the subset, we need to pick all k + 1 elements of the subset from X\{a}. Since X\{a} has n elements, the number of ways to do this is \binom{n}{k+1}.

Putting this together, we conclude that the number of subsets of X of size k + 1 equals \binom{n}{k} + \binom{n}{k+1}.
Theorem (Binomial Theorem). For any x, y ∈ R and n ∈ N+, we have

(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n−k} y^k

Proof. We prove the result by induction. The base case is trivial. Suppose that we know the result for a given n ∈ N+. We have

(x + y)^{n+1} = (x + y) · (x + y)^n
= (x + y) · [\binom{n}{0} x^n + \binom{n}{1} x^{n−1} y + · · · + \binom{n}{n−1} x y^{n−1} + \binom{n}{n} y^n]
= \binom{n}{0} x^{n+1} + \binom{n}{1} x^n y + \binom{n}{2} x^{n−1} y^2 + · · · + \binom{n}{n−1} x^2 y^{n−1} + \binom{n}{n} x y^n
  + \binom{n}{0} x^n y + \binom{n}{1} x^{n−1} y^2 + · · · + \binom{n}{n−2} x^2 y^{n−1} + \binom{n}{n−1} x y^n + \binom{n}{n} y^{n+1}
= x^{n+1} + [\binom{n}{1} + \binom{n}{0}] x^n y + [\binom{n}{2} + \binom{n}{1}] x^{n−1} y^2 + · · · + [\binom{n}{n} + \binom{n}{n−1}] x y^n + y^{n+1}
= \binom{n+1}{0} x^{n+1} + \binom{n+1}{1} x^n y + \binom{n+1}{2} x^{n−1} y^2 + · · · + \binom{n+1}{n} x y^n + \binom{n+1}{n+1} y^{n+1}

where we have used the lemma to combine each of the sums to get the last line.
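The lemma also gives a practical way to compute binomial coefficients: each row of coefficients of (x + y)^{n+1} is obtained from the previous row by adding adjacent entries. Here is a small Python sketch of this idea (Pascal's triangle):

    # Build the coefficients of (x + y)^n row by row using
    # C(n+1, k+1) = C(n, k) + C(n, k+1), with 1's at both ends.
    def binomial_row(n):
        row = [1]
        for _ in range(n):
            row = [1] + [row[k] + row[k + 1] for k in range(len(row) - 1)] + [1]
        return row

    print(binomial_row(2))  # [1, 2, 1]: matches (x + y)^2 = x^2 + 2xy + y^2
    print(binomial_row(4))  # [1, 4, 6, 4, 1]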
2.2 Divisibility
Definition 2.2.1. Let a, b ∈ Z. We say that a divides b, and write a | b, if there exists m ∈ Z with b = am.
For example, we have 2 | 6 because 2 · 3 = 6 and −3 | 21 because −3 · 7 = 21. We also have that 2 ∤ 5 since it is “obvious” that no such integer exists. If you are uncomfortable with that (and there is certainly reason to be), we will be much more careful about such statements in a couple of sections.

Notice that a | 0 for every a ∈ Z because a · 0 = 0 for all a ∈ Z. In particular, we have 0 | 0 because as noted we have 0 · 0 = 0. Of course we also have 0 · 3 = 0 and in fact 0 · m = 0 for all m ∈ Z, so every integer serves as a “witness” that 0 | 0. Our definition says nothing about the m ∈ Z being unique.
Proposition 2.2.2. Let a, b, c ∈ Z. If a | b and b | c, then a | c.

Proof. Since a | b, there exists m ∈ Z with b = am. Since b | c, there exists n ∈ Z with c = bn. We then have

c = bn = (am)n = a(mn)

so a | c.
Proposition 2.2.3. Let a, b, c ∈ Z.

1. If a | b, then a | bk for all k ∈ Z.
2. If a | b and a | c, then a | (b + c).
3. If a | b and a | c, then a | (bm + cn) for all m, n ∈ Z.

Proof.

1. Since a | b, there exists m ∈ Z with b = am. For any k ∈ Z, we then have

bk = (am)k = a(mk)

so a | bk.
2. Since a | b, there exists m ∈ Z with b = am. Since a | c, there exists n ∈ Z with c = an. We then have
b + c = am + an = a(m + n)
3. Let m, n ∈ Z. Since a | b, we conclude from part 1 that a | bm. Since a | c, we conclude from part 1
again that a | cn. Using part 2, it follows that a | (bm + cn).
Proposition. Let a, b ∈ Z. If a | b and b ≠ 0, then |a| ≤ |b|.

Proof. Suppose that a | b with b ≠ 0. Fix d ∈ Z with ad = b. Since b ≠ 0, we have d ≠ 0. Thus, |d| ≥ 1, and so

|b| = |ad| = |a| · |d| ≥ |a| · 1 = |a|
Proposition. Let a, b ∈ Z. If a | b and b | a, then either a = b or a = −b.

Proof. Suppose first that a ≠ 0 and b ≠ 0. By the previous Proposition, we know that both |a| ≤ |b| and |b| ≤ |a|. It follows that |a| = |b|, and hence either a = b or a = −b.

Suppose now that a = 0. Since a | b, we may fix m ∈ Z with b = am. We then have b = am = 0m = 0 as well. Therefore, a = b.

Suppose finally that b = 0. Since b | a, we may fix m ∈ Z with a = bm. We then have a = bm = 0m = 0 as well. Therefore, a = b.
2.3 Division with Remainder

Theorem 2.3.1. Let a, b ∈ Z with b ≠ 0. There exist unique q, r ∈ Z such that a = qb + r and 0 ≤ r < |b|. Uniqueness here means that if a = q1 b + r1 with 0 ≤ r1 < |b| and a = q2 b + r2 with 0 ≤ r2 < |b|, then q1 = q2 and r1 = r2.
We begin by proving existence via a sequence of lemmas, starting in the case where a, b are natural
numbers rather than just integers.
Lemma 2.3.2. Let a, b ∈ N with b > 0. There exist q, r ∈ N such that a = qb + r and 0 ≤ r < b.
Proof. Fix b ∈ N with b > 0. For this fixed b, we prove the existence of q, r for all a ∈ N by induction on a. If a = 0, then we may take q = 0 and r = 0, since 0 = 0 · b + 0 and 0 ≤ 0 < b. Suppose now that we have q, r ∈ N with a = qb + r and 0 ≤ r < b. If r + 1 < b, then a + 1 = qb + (r + 1) and we are done. Otherwise, r + 1 = b, in which case

a + 1 = qb + (r + 1)
= qb + b
= (q + 1)b
= (q + 1)b + 0

so q + 1 and 0 work. The result follows by induction.

Lemma 2.3.3. Let a ∈ Z and b ∈ N with b > 0. There exist q, r ∈ Z such that a = qb + r and 0 ≤ r < b.

Proof. If a ≥ 0, we are done by the previous lemma, so suppose that a < 0. Since −a > 0, the previous lemma allows us to fix q, r ∈ N with −a = qb + r and 0 ≤ r < b. If r = 0, then a = (−q)b + 0. Otherwise, 0 < r < b and

a = (−q)b + (−r)
= (−q)b − b + b + (−r)
= (−q − 1)b + (b − r)

Now since 0 < r < b, we have 0 < b − r < b, so this gives existence.
Lemma 2.3.4. Let a, b ∈ Z with b ≠ 0. There exist q, r ∈ Z such that a = qb + r and 0 ≤ r < |b|.

Proof. If b > 0, we are done by the previous lemma. Suppose that b < 0. We then have −b > 0, so by the previous lemma we can fix q, r ∈ Z with 0 ≤ r < −b and a = q(−b) + r. We then have a = (−q)b + r, and we are done because |b| = −b.
With that sequence of lemmas building to existence now in hand, we finish off the proof of the theorem.
Proof of Theorem 2.3.1. The final lemma above gives us existence. For uniqueness, suppose that

q1 b + r1 = a = q2 b + r2

where 0 ≤ r1 < |b| and 0 ≤ r2 < |b|. We then have

b(q2 − q1) = r1 − r2

hence b | (r2 − r1). Now −|b| < −r1 ≤ 0, so adding this to 0 ≤ r2 < |b|, we conclude that

−|b| < r2 − r1 < |b|

and therefore

|r2 − r1| < |b|

Now if r2 − r1 ≠ 0, then since b | (r2 − r1), we would conclude that |b| ≤ |r2 − r1|, a contradiction. It follows that r2 − r1 = 0, and hence r1 = r2. Since

q1 b + r1 = q2 b + r2

we then have q1 b = q2 b, and since b ≠ 0, it follows that q1 = q2.
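Division with remainder as stated in Theorem 2.3.1 differs slightly from the built-in operators of many programming languages when negative numbers are involved. Here is one possible Python sketch that always produces 0 ≤ r < |b| (Python's divmod already does this for b > 0; we adjust when b < 0):

    # q, r with a = q*b + r and 0 <= r < |b|, for any b != 0.
    def div_with_remainder(a, b):
        q, r = divmod(a, b)
        if r < 0:             # can only happen when b < 0
            q, r = q + 1, r - b
        return q, r

    print(div_with_remainder(7, -2))   # (-3, 1): 7 = (-3)*(-2) + 1
    assert all(a == q*b + r and 0 <= r < abs(b)
               for a in range(-30, 31) for b in range(-7, 8) if b != 0
               for q, r in [div_with_remainder(a, b)])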
2.4 GCDs and the Euclidean Algorithm

Proposition 2.4.2. Let a, b, q, r ∈ Z with a = qb + r. Then the set of common divisors of a and b equals the set of common divisors of b and r.

Proof. Suppose first that d is a common divisor of b and r. Since d | b, d | r, and a = qb + r = bq + r · 1, we may use Proposition 2.2.3 to conclude that d | a.

Conversely, suppose that d is a common divisor of a and b. Since d | a, d | b, and r = a − qb = a · 1 + b(−q), we may use Proposition 2.2.3 to conclude that d | r.
For example, suppose that we are trying to find the set of common divisors of 120 and 84 (we wrote them
above, but now want to justify it). We repeatedly do division to reduce the problem as follows:
120 = 1 · 84 + 36
84 = 2 · 36 + 12
36 = 3 · 12 + 0
The first line tells us that the set of common divisors of 120 and 84 equals the set of common divisors of
84 and 36. The next line tells us that the set of common divisors of 84 and 36 equals the set of common
divisors of 36 and 12. The last line tells us that the set of common divisors of 36 and 12 equals the set of
common divisors of 12 and 0. Now the set of common divisors of 12 and 0 is simply the set of divisors of 12
(because every number divides 0). Putting it all together, we conclude that the set of common divisors of
120 and 84 equals the set of divisors of 12.
Definition 2.4.3. Let a, b ∈ Z. We say that an element d ∈ Z is a greatest common divisor of a and b if:

• d ≥ 0,
• d is a common divisor of a and b, and
• whenever c ∈ Z is a common divisor of a and b, we have c | d.
Notice that we are not defining the greatest common divisor of a and b to be the largest divisor of a and
b. The primary reason we do not is because this description fails to capture the most fundamental property
(namely that of being divisible by all other divisors, not just larger than them). Furthermore, if we were
to take that definition, then 0 and 0 would fail to have a greatest common divisor because every integer is
a common divisor of 0 and 0. With this definition however, it is a straightforward matter to check that 0
satisfies the above three conditions.
Since we require more of a greatest common divisor than just picking the largest, we first need to check
that they do indeed exist. The proof is an inductive formulation of the above method of calculation.
Theorem 2.4.4. Every pair of integers a, b ∈ Z has a unique greatest common divisor.
We first sketch the idea of the proof in the case where a, b ∈ N. If b = 0, we are done because it is
simple to verify that a is a greatest common divisor of a and 0. Suppose then that b ≠ 0. Fix q, r ∈ N with
a = qb + r and 0 ≤ r < b. Now the idea is to assert inductively the existence of a greatest common divisor
of b and r because this pair is “smaller” than the pair a and b. The only issue is how to make this intuitive
idea of “smaller” precise. There are several ways to do this, but perhaps the most straightforward is to only
induct on b. Thus, our base case handles all pairs of the form (a, 0). Next, we handle all pairs of the form (a, 1), and in doing this we can use the fact that we know the result for all pairs of the form (a′, 0). Notice that we can even change the value of the first coordinate here, which is why we used a′. Then, we handle all pairs of the form (a, 2), and in doing this we can use the fact that we know the result for all pairs of the form (a′, 0) and (a′, 1). We now begin the formal argument.
Proof. We begin by proving existence only in the special case where a, b ∈ N. We use (strong) induction on b to prove the result. That is, we let

X = {b ∈ N : for all a ∈ N, there exists a greatest common divisor of a and b}

and prove that X = N by strong induction.
• Base Case: Suppose that b = 0. Let a ∈ N be arbitrary. We then have that the set of common divisors
of a and b equals the set of divisors of a (because every integer divides 0), so a satisfies the requirement
of a greatest common divisor of a and 0. Since a ∈ N was arbitrary, we showed that there exists a
greatest common divisor of a and 0 for every a ∈ N, hence 0 ∈ X.
• Inductive Step: Suppose then that b ∈ N+ and we know the result for all smaller natural numbers. In
other words, we are assuming that c ∈ X whenever 0 ≤ c < b. We prove that b ∈ X. Let a ∈ N be
arbitrary. From above, we may fix q, r ∈ Z with a = qb + r and 0 ≤ r < b. Since 0 ≤ r < b, we know by
strong induction that r ∈ X, hence b and r have a greatest common divisor d. By Proposition 2.4.2,
the set of common divisors of a and b equals the set of common divisors of b and r. It follows that
d is a greatest common divisor of a and b. Since a ∈ N was arbitrary, we showed that there exists a
greatest common divisor of a and b for every a ∈ N, hence b ∈ X.
Therefore, we have shown that X = N, which implies that whenever a, b ∈ N, there exists a greatest common
divisor of a and b.
To turn the argument into a proof for all a, b ∈ Z, we simply note the set of divisors of an element m ∈ Z
equals the set of divisors of −m. So, for example, if a < 0 but b ≥ 0 we can simply take a greatest common
divisor of −a and b (which exists since −a, b ∈ N) and note that it will also be a greatest common divisor of
a and b. A similar argument works if a ≥ 0 and b < 0, or if both a < 0 and b < 0.
For uniqueness, suppose that c and d are both greatest common divisors of a and b. Since d is a greatest
common divisor and c is a common divisor, we know by the last condition that c | d. Similarly, since c is
a greatest common divisor and d is a common divisor, we know by the last condition that d | c. Therefore,
either c = d or c = −d. Using the first requirement that a greatest common divisor must be nonnegative,
we must have c = d.
Definition 2.4.5. Let a, b ∈ Z. We let gcd(a, b) be the unique greatest common divisor of a and b.
For example, we have gcd(120, 84) = 12 and gcd(0, 0) = 0. The following corollary is immediate from Proposition 2.4.2.
Corollary 2.4.6. Suppose that a, b, q, r ∈ Z and a = qb + r. We have gcd(a, b) = gcd(b, r).
The method of using repeated division and this corollary to reduce the problem of calculating greatest common divisors is known as the Euclidean Algorithm. We saw it in action above with 120 and 84. Here is another example where we are trying to compute gcd(525, 182). We have
is another example where we are trying to compute gcd(525, 182). We have
525 = 2 · 182 + 161
182 = 1 · 161 + 21
161 = 7 · 21 + 14
21 = 1 · 14 + 7
14 = 2 · 7 + 0
Therefore, gcd(525, 182) = gcd(7, 0) = 7.
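The Euclidean Algorithm translates directly into a short loop. Here is a minimal Python sketch; the abs at the end matches our convention that greatest common divisors are nonnegative:

    # gcd via the Euclidean Algorithm: repeatedly replace (a, b) with
    # (b, r), where r is the remainder of dividing a by b.
    def gcd(a, b):
        while b != 0:
            a, b = b, a % b
        return abs(a)

    print(gcd(525, 182))  # 7, matching the computation above
    print(gcd(120, 84))   # 12
    print(gcd(0, 0))      # 0, matching gcd(0, 0) = 0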
Theorem 2.4.7. For all a, b ∈ Z, there exist k, ℓ ∈ Z with gcd(a, b) = ka + ℓb.
Proof. We begin by proving existence in the special case where a, b ∈ N. We use induction on b to prove the result. That is, we let

X = {b ∈ N : for all a ∈ N, there exist k, ℓ ∈ Z with gcd(a, b) = ka + ℓb}

and prove that X = N by strong induction.
• Base Case: Suppose that b = 0. Let a ∈ N be arbitrary. We then have that

gcd(a, b) = gcd(a, 0) = a

Since a = 1 · a + 0 · b, we may let k = 1 and ℓ = 0. Since a ∈ N was arbitrary, we conclude that 0 ∈ X.
• Inductive Step: Suppose then that b ∈ N+ and we know the result for all smaller nonnegative values. In other words, we are assuming that c ∈ X whenever 0 ≤ c < b. We prove that b ∈ X. Let a ∈ N be arbitrary. From above, we may fix q, r ∈ Z with a = qb + r and 0 ≤ r < b. We also know from above that gcd(a, b) = gcd(b, r). Since 0 ≤ r < b, we know by strong induction that r ∈ X, hence there exist k, ℓ ∈ Z with

gcd(b, r) = kb + ℓr
Now r = a − qb, so

gcd(a, b) = gcd(b, r)
= kb + ℓr
= kb + ℓ(a − qb)
= kb + ℓa − qℓb
= ℓa + (k − qℓ)b

Since a ∈ N was arbitrary, we conclude that b ∈ X.
Therefore, we have shown that X = N, which implies that whenever a, b ∈ N, there exist k, ℓ ∈ Z with gcd(a, b) = ka + ℓb.
Given a, b ∈ Z, we can explicitly calculate k, ℓ ∈ Z by “winding up” the work created from the Euclidean Algorithm. For example, we saw above that gcd(525, 182) = 7 by calculating
7=1·7+0·0
= 1 · 7 + 0 · (14 − 2 · 7)
= 0 · 14 + 1 · 7
= 0 · 14 + 1 · (21 − 1 · 14)
= 1 · 21 + (−1) · 14
= 1 · 21 + (−1) · (161 − 7 · 21)
= (−1) · 161 + 8 · 21
= (−1) · 161 + 8 · (182 − 1 · 161)
= 8 · 182 + (−9) · 161
= 8 · 182 + (−9) · (525 − 2 · 182)
= (−9) · 525 + 26 · 182
This wraps everything up perfectly, but it is easier to simply start at the fifth line.
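Rather than winding up by hand, one can carry the coefficients along during the Euclidean Algorithm itself. Here is a sketch of this “extended” Euclidean Algorithm in Python; the variable names are ours, chosen to track the invariant stated in the comment:

    # Extended Euclidean Algorithm: returns (d, k, l) with d = gcd(a, b)
    # and d = k*a + l*b.  Loop invariant:
    #   old_r = old_k*a + old_l*b   and   r = k*a + l*b.
    def extended_gcd(a, b):
        old_r, r = a, b
        old_k, k = 1, 0
        old_l, l = 0, 1
        while r != 0:
            q = old_r // r
            old_r, r = r, old_r - q*r
            old_k, k = k, old_k - q*k
            old_l, l = l, old_l - q*l
        return old_r, old_k, old_l

    print(extended_gcd(525, 182))  # (7, -9, 26): 7 = (-9)*525 + 26*182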
Before moving on, we work through another proof of the existence of greatest common divisors, along
with the fact that we can write gcd(a, b) as an integer combination of a and b. This proof also works because
of Theorem 2.3.1, but it uses well-ordering and establishes existence without a method of computation. One
may ask why we bother with another proof. One answer is that this result is so fundamental and important
that two different proofs help to reinforce its value. Another reason is that we will generalize each of these
distinct proofs in Chapter 11 to slightly different settings.
Theorem 2.4.8. Let a, b ∈ Z with at least one of a and b nonzero. The set

{ma + nb : m, n ∈ Z}

has positive elements, and the least positive element is a greatest common divisor of a and b. In particular, for any a, b ∈ Z, there exist k, ℓ ∈ Z with gcd(a, b) = ka + ℓb.
Proof. Let
S = {ma + nb : m, n ∈ Z} ∩ N+
We first claim that S 6= ∅. If a > 0, then a = 1 · a + 0 · b ∈ S. Similarly, if b > 0, then b ∈ S. If a < 0,
then −a > 0 and −a = (−1) · a + 0 · b ∈ S. Similarly, if b < 0, then −b ∈ S. Since at least one of a and b
is nonzero, it follows that S 6= ∅. By the Well-Ordering property of N, we know that S has a least element.
Let d = min(S). Since d ∈ S, we may fix k, ℓ ∈ Z with d = ka + ℓb. We claim that d is the greatest common divisor of a and b.
First, we need to check that d is a common divisor of a and b. We begin by showing that d | a. Fix q, r ∈ Z with a = qd + r and 0 ≤ r < d. We want to show that r = 0. We have

r = a − qd
= a − q(ka + ℓb)
= (1 − qk) · a + (−qℓ) · b
Now if r > 0, then we have shown that r ∈ S, which contradicts the choice of d as the least element of S.
Hence, we must have r = 0, and so d | a.
We next show that d | b. Fix q, r ∈ Z with b = qd + r and 0 ≤ r < d. We want to show that r = 0. We
have
r = b − qd
= b − q(ka + ℓb)
= (−qk) · a + (1 − qℓ) · b
Now if r > 0, then we have shown that r ∈ S, which contradicts the choice of d as the least element of S.
Hence, we must have r = 0, and so d | b.
Finally, we need to check the last condition for d to be the greatest common divisor. Let c be a common
divisor of a and b. Since c | a, c | b, and d = ka + ℓb, we may use Proposition 2.2.3 to conclude that c | d.
We end this section with a result that will play an important role in proving the Fundamental Theorem
of Arithmetic in the next section.
Definition 2.4.9. Two elements a, b ∈ Z are relatively prime if gcd(a, b) = 1.
Proposition 2.4.10. Let a, b, c ∈ Z. If a | bc and gcd(a, b) = 1, then a | c.
Proof. Since a | bc, we may fix m ∈ Z with bc = am. Since gcd(a, b) = 1, we may fix k, ℓ ∈ Z with ak + bℓ = 1. Multiplying this last equation through by c, we conclude that akc + bℓc = c, so

c = akc + bℓc
= akc + ℓ(bc)
= akc + ℓ(am)
= a(kc + ℓm)

It follows that a | c.
2.5 Primes and Factorizations in Z

hence all inequalities must be equalities, and we conclude that p1 = q1. Canceling, we get

p2 · · · pk = q2 · · · qℓ

and this common number is smaller than n. By induction, it follows that k = ℓ and pi = qi for 2 ≤ i ≤ k.
Given a natural number n ∈ N with n ≥ 2, when we write its prime factorization, we typically group together like primes and write

n = p1^{α1} p2^{α2} · · · pk^{αk}

where the pi are distinct primes. We often allow the insertion of “extra” primes in the factorization of n by permitting some αi to equal 0. This convention is particularly useful when comparing the prime factorizations of two numbers so that we can assume that both factorizations have the same primes occurring. It also allows us to write 1 in such a form by choosing all αi to equal 0.
Proposition. Let n, d ∈ N+, and write

n = p1^{α1} p2^{α2} · · · pk^{αk} and d = p1^{β1} p2^{β2} · · · pk^{βk}

where the pi are distinct primes and possibly some αi and βi are 0. We then have that d | n if and only if 0 ≤ βi ≤ αi for all i.

Proof. Suppose first that 0 ≤ βi ≤ αi for all i. We then have that αi − βi ≥ 0 for all i, so we may let

c = p1^{α1−β1} p2^{α2−β2} · · · pk^{αk−βk} ∈ N+

Notice that

dc = (p1^{β1} p2^{β2} · · · pk^{βk})(p1^{α1−β1} p2^{α2−β2} · · · pk^{αk−βk}) = p1^{α1} p2^{α2} · · · pk^{αk} = n

hence d | n.
Conversely, suppose that d | n and fix c ∈ Z with dc = n. Notice that c > 0 because d, n > 0. Now we have dc = n, so c | n. If q is prime and q | c, then q | n by transitivity of divisibility, so q | pi for some i by Corollary 2.5.7, and hence q = pi for some i because each pi is prime. Thus, we can write the prime factorization of c as

c = p1^{γ1} p2^{γ2} · · · pk^{γk}

where again we may have some γi equal to 0. We then have

n = dc
= (p1^{β1} p2^{β2} · · · pk^{βk})(p1^{γ1} p2^{γ2} · · · pk^{γk})
= (p1^{β1} p1^{γ1})(p2^{β2} p2^{γ2}) · · · (pk^{βk} pk^{γk})
= p1^{β1+γ1} p2^{β2+γ2} · · · pk^{βk+γk}

By the Fundamental Theorem of Arithmetic, we have βi + γi = αi for all i. Since βi, γi, αi ≥ 0 for all i, we conclude that βi ≤ αi for all i.
In particular, the positive divisors of n = p1^{α1} p2^{α2} · · · pk^{αk} are exactly the numbers of the form

d = p1^{β1} p2^{β2} · · · pk^{βk}

where 0 ≤ βi ≤ αi for all i. Thus, we have αi + 1 many choices for each βi. Notice that different choices of βi give rise to different values of d by the Fundamental Theorem of Arithmetic, so n has exactly (α1 + 1)(α2 + 1) · · · (αk + 1) positive divisors.
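Here is a small Python sketch of this divisor count; passing the factorization in as a dictionary from primes to exponents is an assumption of convenience, not anything in the notes:

    # Number of positive divisors from a prime factorization: one
    # independent choice of exponent beta_i in {0, ..., alpha_i} per prime.
    def num_divisors(factorization):
        count = 1
        for alpha in factorization.values():
            count *= alpha + 1
        return count

    print(num_divisors({2: 3, 3: 1, 5: 1}))                 # 16 divisors of 120 = 2^3 * 3 * 5
    print(len([d for d in range(1, 121) if 120 % d == 0]))  # 16, by brute force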
Proposition 2.5.12. Suppose that m, n ∈ N+, and write the prime factorization of m as

m = p1^{α1} p2^{α2} · · · pk^{αk}

where the pi are distinct primes. We then have that m is an nth power in N if and only if n | αi for all i.

Proof. Suppose first that n | αi for all i. For each i, fix βi such that αi = nβi. Since n and each αi are nonnegative, it follows that each βi is also nonnegative. Letting ℓ = p1^{β1} p2^{β2} · · · pk^{βk}, we then have

ℓ^n = p1^{nβ1} p2^{nβ2} · · · pk^{nβk} = p1^{α1} p2^{α2} · · · pk^{αk} = m
so m is an nth power in N.
Suppose conversely that m is an nth power in N, and write m = ℓ^n. Since m > 1, we have ℓ > 1. Write the unique prime factorization of ℓ as

ℓ = p1^{β1} p2^{β2} · · · pk^{βk}

We then have

m = ℓ^n = (p1^{β1} p2^{β2} · · · pk^{βk})^n = p1^{nβ1} p2^{nβ2} · · · pk^{nβk}

By the Fundamental Theorem of Arithmetic, we have αi = nβi for all i, so n | αi for all i.
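Proposition 2.5.12 suggests one way to test for perfect powers in code: factor the number and inspect the exponents. A rough Python sketch using trial division (fine for small inputs, far too slow for large ones):

    # Is m an n-th power?  Factor m by trial division and check that
    # every exponent in the factorization is divisible by n.
    def is_nth_power(m, n):
        exponents = []
        p = 2
        while p * p <= m:
            alpha = 0
            while m % p == 0:
                m //= p
                alpha += 1
            if alpha > 0:
                exponents.append(alpha)
            p += 1
        if m > 1:                 # a leftover prime factor with exponent 1
            exponents.append(1)
        return all(alpha % n == 0 for alpha in exponents)

    print(is_nth_power(216, 3))  # True:  216 = 2^3 * 3^3 = 6^3
    print(is_nth_power(72, 2))   # False: 72 = 2^3 * 3^2, and 2 does not divide 3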
Theorem 2.5.13. Let m, n ∈ N with m, n ≥ 2. If the unique prime factorization of m does not have the property that every prime exponent is divisible by n, then \sqrt[n]{m} is irrational.

Proof. We prove the contrapositive. Suppose that \sqrt[n]{m} is rational and write \sqrt[n]{m} = a/b where a, b ∈ Z with b ≠ 0. We may assume that a, b > 0 because \sqrt[n]{m} > 0. We then have

a^n / b^n = (a/b)^n = m
hence

a^n = b^n · m

Write a, b, m in their unique prime factorizations as

a = p1^{α1} p2^{α2} · · · pk^{αk}, b = p1^{β1} p2^{β2} · · · pk^{βk}, m = p1^{γ1} p2^{γ2} · · · pk^{γk}

where the pi are distinct (and possibly some αi, βi, γi are equal to 0). Since a^n = b^n · m, we have

p1^{nα1} p2^{nα2} · · · pk^{nαk} = p1^{nβ1+γ1} p2^{nβ2+γ2} · · · pk^{nβk+γk}

By the Fundamental Theorem of Arithmetic, we conclude that nαi = nβi + γi for all i. Therefore, for each i, we have γi = nαi − nβi = n(αi − βi), and so n | γi for each i.
Chapter 3

Relations and Functions
3.1 Relations
Definition 3.1.1. Given two sets A and B, we let A × B be the set of all ordered pairs (a, b) where a ∈ A
and b ∈ B. We call A × B the Cartesian product of A and B.
For example, we have
{1, 2, 3} × {6, 8} = {(1, 6), (1, 8), (2, 6), (2, 8), (3, 6), (3, 8)}
and
N × N = {(0, 0), (0, 1), (1, 0), (2, 0), . . . , (4, 7), . . . }
We also use the notation A^2 as shorthand for A × A, so R^2 = R × R really is the set of points in the plane.
Definition 3.1.2. Let A and B be sets. A (binary) relation between A and B is a subset R ⊆ A × B. If
A = B, then we call a subset of A × A a (binary) relation on A.
For example, let A = {1, 2, 3} and B = {6, 8} as above. Let

R = {(1, 6), (1, 8), (3, 6)}
We then have that R is a relation between A and B, although certainly not a very interesting one. However,
we’ll use it to illustrate a few facts. First, in a relation, it’s possible for an element of A to be related to
multiple elements of B, as in the case for 1 ∈ A in our example R. Also, it’s possible that an element of A
is related to no elements of B, as in the case of 2 ∈ A in our example R.
For a geometric example, let A be the set of points in the plane, and let B be the set of lines in the plane.
We can then define R ⊆ A × B to be the set of pairs (p, L) ∈ A × B such that p is a point on L.
Here are two examples of binary relations on Z:
• L = {(a, b) ∈ Z2 : a < b}
• D = {(a, b) ∈ Z2 : a | b}
We have (4, 7) ∈ L but (7, 4) ∉ L. Notice that (5, 5) ∉ L but (5, 5) ∈ D.
By definition, relations are sets. However, it is typically cumbersome to use set notation to write things
like (1, 6) ∈ R. Instead, it usually makes much more sense to use infix notation and write 1R6. Moreover,
we can use better notation for the relation by using a symbol like ∼ instead of R. In this case, we would
write 1 ∼ 6 instead of (1, 6) ∈ ∼, or 2 ≁ 8 instead of (2, 8) ∉ ∼.
With this new notation, we give a few examples of binary relations on R:
• Given x, y ∈ R, we let x ∼ y if x2 + y 2 = 1.
• Given x, y ∈ R, we let x ∼ y if x2 + y 2 ≤ 1.
• Given x, y ∈ R, we let x ∼ y if x = sin y.
• Given x, y ∈ R, we let x ∼ y if y = sin x.
Again, notice from these examples that given x ∈ R, there may be 0, 1, 2, or even infinitely many y ∈ R with x ∼ y.
If we let A be the set of all finite sequences of 0’s and 1’s, then the following are binary relations on A:
• Given σ, τ ∈ A, we let σ ∼ τ if σ and τ have the same number of 1’s.
• Given σ, τ ∈ A, we let σ ∼ τ if σ occurs as a consecutive subsequence of τ (for example, we have
010 ∼ 001101011 because 010 appears in positions 5-6-7 of 001101011).
For a final example, let A be the set consisting of the 50 states. Let R be the subset of A × A consisting of those pairs of states whose second letter of their postal codes are equal. For example, we have (Iowa, California) ∈ R and (Iowa, Virginia) ∈ R because the postal codes of these states are IA, CA, and VA. We also have (Minnesota, Tennessee) ∈ R because of the postal codes MN and TN. Now (Texas, Texas) ∈ R, but there is no a ∈ A with a ≠ Texas such that (Texas, a) ∈ R because no other state has X as the second letter of its postal code. Texas stands alone.
3.2 Equivalence Relations

Definition. An equivalence relation on a set A is a binary relation ∼ on A with the following three properties:

• Reflexive: a ∼ a for all a ∈ A.
• Symmetric: Whenever a, b ∈ A satisfy a ∼ b, we have b ∼ a.
• Transitive: Whenever a, b, c ∈ A satisfy a ∼ b and b ∼ c, we have a ∼ c.

L = (P Q)N(P Q)^{−1}

so L ∼ N. Putting it all together, we conclude that ∼ is an equivalence relation on A.
Example 3.2.3. Let A be the set Z × (Z\{0}), i.e. A is the set of all pairs (a, b) ∈ Z^2 with b ≠ 0. Define a relation ∼ on A as follows. Given a, b, c, d ∈ Z with b, d ≠ 0, we let (a, b) ∼ (c, d) mean ad = bc. We then have that ∼ is an equivalence relation on A.
Proof. We check the three properties.
• Reflexive: Let a, b ∈ Z with b ≠ 0. Since ab = ba, it follows that (a, b) ∼ (a, b).

• Symmetric: Let a, b, c, d ∈ Z with b, d ≠ 0, and (a, b) ∼ (c, d). We then have that ad = bc. From this, we conclude that cb = da, so (c, d) ∼ (a, b).

• Transitive: Let a, b, c, d, e, f ∈ Z with b, d, f ≠ 0 where (a, b) ∼ (c, d) and (c, d) ∼ (e, f). We then have that ad = bc and cf = de. Multiplying the first equation by f, we see that adf = bcf. Multiplying the second equation by b gives bcf = bde. Therefore, we know that adf = bde. Now d ≠ 0 by assumption, so we may cancel it to conclude that af = be. It follows that (a, b) ∼ (e, f).
Therefore, ∼ is an equivalence relation on A.
Let’s analyze the above situation more carefully. We have (1, 2) ∼ (2, 4), (1, 2) ∼ (4, 8), (1, 2) ∼ (−5, −10),
etc. If we think of (a, b) as representing the fraction ab , then the relation (a, b) ∼ (c, d) is saying exactly that
the fractions ab and dc are equal. You may never have thought about equality of fractions as the result of
imposing an equivalence relation on pairs of integers, but that is exactly what it is. We will be more precise
about this below.
The next example is an important example in geometry. We introduce it now, and will return to it later.
Example 3.2.4. Let A be the set R^2\{(0, 0)}. Define a relation ∼ on A by letting (x1, y1) ∼ (x2, y2) if there exists a real number λ ≠ 0 with (x1, y1) = (λx2, λy2). We then have that ∼ is an equivalence relation on A.

Proof. We check the three properties.

• Reflexive: For (x, y) ∈ R^2\{(0, 0)}, we have (x, y) ∼ (x, y) because using λ = 1 we see that (x, y) = (1 · x, 1 · y). Therefore, ∼ is reflexive.

• Symmetric: Suppose now that (x1, y1) ∼ (x2, y2), and fix a real number λ ≠ 0 such that (x1, y1) = (λx2, λy2). We then have that x1 = λx2 and y1 = λy2, so x2 = (1/λ) · x1 and y2 = (1/λ) · y1 (notice that we are using λ ≠ 0 so we can divide by it). Hence (x2, y2) = ((1/λ) · x1, (1/λ) · y1), and so (x2, y2) ∼ (x1, y1). Therefore, ∼ is symmetric.

• Transitive: Suppose that (x1, y1) ∼ (x2, y2) and (x2, y2) ∼ (x3, y3). Fix a real number λ ≠ 0 with (x1, y1) = (λx2, λy2), and also fix a real number µ ≠ 0 with (x2, y2) = (µx3, µy3). We then have that (x1, y1) = ((λµ)x3, (λµ)y3). Since both λ ≠ 0 and µ ≠ 0, notice that λµ ≠ 0 as well, so (x1, y1) ∼ (x3, y3). Therefore, ∼ is transitive.
Definition. Let ∼ be an equivalence relation on a set A and let a ∈ A. The equivalence class of a is the set

ā = {b ∈ A : a ∼ b}

Some sources use the notation [a] instead of ā. This notation helps emphasize that the equivalence class of a is a subset of A rather than an element of A. However, it is cumbersome notation when we begin working with equivalence classes. We will stick with our notation, although it might take a little time to get used to. Notice that by the reflexive property of ∼, we have that a ∈ ā for all a ∈ A.
For example, let’s return to where A is the set consisting of the 50 states and R is the subset of A × A
consisting of those pairs of states whose second letter of their postal codes are equal. It’s straightforward to
show that R is an equivalence relation on A. We have
while
Minnesota = {Indiana, Minnesota, Tennessee}
and
Texas = {Texas}.
Notice that each of these are sets, even in the case of Texas.
For another example, suppose we are working as above with A = Z × (Z\{0}) where (a, b) ∼ (c, d) means that ad = bc. As discussed above, some elements of \overline{(1, 2)} are (1, 2), (2, 4), (4, 8), (−5, −10), etc.

Proposition. Let ∼ be an equivalence relation on a set A and let a, b ∈ A. If ā ∩ b̄ ≠ ∅, then ā = b̄.

Proof. Suppose that ā ∩ b̄ ≠ ∅. Fix c ∈ ā ∩ b̄. We then have a ∼ c and b ∼ c. By symmetry, we know that c ∼ b, and using transitivity we get that a ∼ b. Using symmetry again, we conclude that b ∼ a.

We first show that ā ⊆ b̄. Let x ∈ ā. We then have that a ∼ x. Since b ∼ a, we can use transitivity to conclude that b ∼ x, hence x ∈ b̄.

We next show that b̄ ⊆ ā. Let x ∈ b̄. We then have that b ∼ x. Since a ∼ b, we can use transitivity to conclude that a ∼ x, hence x ∈ ā.

Putting this together, we get that ā = b̄.
With that proposition in hand, we are ready for the foundational theorem about equivalence relations.

Theorem. Let ∼ be an equivalence relation on a set A, and let a, b ∈ A.

1. a ∼ b if and only if ā = b̄.
2. a ≁ b if and only if ā ∩ b̄ = ∅.

Proof. We first prove 1. Suppose first that a ∼ b. We then have that b ∈ ā. Now we know that b ∼ b because ∼ is reflexive, so b ∈ b̄. Thus, b ∈ ā ∩ b̄, so ā ∩ b̄ ≠ ∅. By the previous proposition, we conclude that ā = b̄. Suppose conversely that ā = b̄. Since b ∼ b because ∼ is reflexive, we have that b ∈ b̄. Therefore, b ∈ ā and hence a ∼ b.

We now use everything we've shown to get 2 with little effort. Suppose that a ≁ b. Since we just proved 1, it follows that ā ≠ b̄, so by the previous proposition we must have ā ∩ b̄ = ∅. Suppose conversely that ā ∩ b̄ = ∅. We then have ā ≠ b̄ (because a ∈ ā, so ā ≠ ∅), so a ≁ b by part 1.
Therefore, given an equivalence relation ∼ on a set A, the equivalence classes partition A into pieces. Working out the details in our postal code example, one can show that ∼ has 1 equivalence class of size 8 (namely \overline{Iowa}, which is the same set as \overline{California} and 6 others), 3 equivalence classes of size 4, 4 equivalence classes of size 3, 7 equivalence classes of size 2, and 4 equivalence classes of size 1.
Let’s revisit the example of A = Z × (Z\{0}) where (a, b) ∼ (c, d) means ad = bc. The equivalence class
of (1, 2), namely the set (1, 2) is the set of all pairs of integers which are ways of representing the fraction
1
2 . In fact, this is how once can “construct” the rational numbers from the integers. We simply define the
rational numbers to be the set of equivalence classes of A under ∼. In other words, we let
a
= (a, b)
b
So when we write something like
1 4
=
2 8
we are simply saying that
(1, 2) = (4, 8)
which is true because (1, 2) ∼ (4, 8).
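This construction is easy to play with in code. Here is a Python sketch of the relation together with a canonical representative for each class (the familiar reduced fraction with positive denominator); the helper names are ours:

    # (a, b) ~ (c, d) on Z x (Z \ {0}) means a*d = b*c.
    from math import gcd

    def equivalent(p, q):
        (a, b), (c, d) = p, q
        return a * d == b * c

    # Pick out the reduced representative of the class of (a, b).
    def canonical(p):
        a, b = p
        g = gcd(a, b)              # math.gcd is always nonnegative
        a, b = a // g, b // g
        return (-a, -b) if b < 0 else (a, b)

    print(equivalent((1, 2), (4, 8)))               # True
    print(canonical((4, 8)), canonical((-5, -10)))  # (1, 2) (1, 2)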
Example 3.2.8. Recall the example above where A = R^2\{(0, 0)} and where (x1, y1) ∼ (x2, y2) means that there exists a real number λ ≠ 0 with (x1, y1) = (λx2, λy2). The equivalence classes of ∼ are the lines through the origin (omitting the origin itself).
Proof. Our first claim is that every point of A is equivalent to exactly one of the following points:

• (0, 1)
• (1, m) for some m ∈ R
We first show that every point is equivalent to at least one of the above points. Suppose that (x, y) ∈ A, so (x, y) ≠ (0, 0). If x = 0, then we must have y ≠ 0, so (x, y) ∼ (0, 1) via λ = y. Now if x ≠ 0, then (x, y) ∼ (1, y/x) via λ = x. This gives existence.

To show uniqueness, it suffices to show that no two of the above points are equivalent to each other because we already know that ∼ is an equivalence relation. Suppose that m ∈ R and that (0, 1) ∼ (1, m). Fix λ ∈ R with λ ≠ 0 such that (0, 1) = (λ · 1, λm). Looking at the first coordinate, we conclude that λ = 0, a contradiction. Therefore, (0, 1) is not equivalent to any point of the second type. Suppose now that m, n ∈ R with (1, m) ∼ (1, n). Fix λ ∈ R with λ ≠ 0 such that (1, m) = (λ · 1, λn). Looking at first coordinates, we must have λ = 1, so examining second coordinates gives m = λn = n. Therefore, (1, m) ≁ (1, n) whenever m ≠ n. This finishes the claim.
Now we examine the equivalence classes of each of the above points. We first handle (0, 1) and claim that \overline{(0, 1)} equals the set of points in A on the line x = 0. Notice first that if (x, y) ∈ \overline{(0, 1)}, then (0, 1) ∼ (x, y), so fixing λ ≠ 0 with (0, 1) = (λx, λy), we conclude that λx = 0 and hence x = 0. Thus, every element of \overline{(0, 1)} is indeed on the line x = 0. Now taking an arbitrary point on the line x = 0, say (0, y) with y ≠ 0, we simply notice that (0, 1) ∼ (0, y) via λ = 1/y. Hence, every point on the line x = 0 is an element of \overline{(0, 1)}.

Finally, we fix m ∈ R and claim that \overline{(1, m)} is the set of points in A on the line y = mx. Notice first that if (x, y) ∈ \overline{(1, m)}, then (1, m) ∼ (x, y), hence (x, y) ∼ (1, m). Fix λ ≠ 0 with (x, y) = (λ · 1, λm). We then have x = λ by looking at first coordinates, so y = λm = mx by looking at second coordinates. Thus, every element of \overline{(1, m)} lies on the line y = mx. Now take an arbitrary point in A on the line y = mx, say (x, mx). We then have that x ≠ 0 because (0, 0) ∉ A. Thus, (1, m) ∼ (x, mx) via λ = 1/x. Hence, every point on the line y = mx is an element of \overline{(1, m)}.
The set of equivalence classes of ∼ in the previous example is known as the projective real line.
3.3 Functions
Intuitively, given two sets A and B, a function f : A → B is an input-output “mechanism” that produces a unique output b ∈ B for any given input a ∈ A. Up through calculus, the vast majority of functions that we encounter are given by simple formulas, so this “mechanism” was typically interpreted in an algorithmic and computational sense. However, some functions such as f(x) = sin x, f(x) = ln x, or integral functions like f(x) = \int_a^x g(t) dt (given a continuous function g(t) and a fixed a ∈ R) were defined in more interesting ways where it was not at all obvious how to compute them. We are now in a position to define functions as relations that satisfy a certain property. Thinking about functions from this more abstract point of view eliminates the vague “mechanism” concept because they will simply be certain types of sets. With this perspective, we'll see that functions can be defined in any way that a set can be defined. This approach both clarifies the concept of a function and provides us with some much needed flexibility in defining functions in more interesting ways.
Definition 3.3.1. Let A and B be sets. A function from A to B is a relation f between A and B such that for each a ∈ A, there is a unique b ∈ B with (a, b) ∈ f.
For example, let A = {c, q, w, y} and let B = N = {0, 1, 2, 3, 4, . . . }. An example of a function from A to
B is the set
f = {(c, 71), (q, 4), (w, 9382), (y, 4)}.
Notice that in the definition of a function from A to B, we know that for every a ∈ A, there is a unique
b ∈ B such that (a, b) ∈ f . However, as this example shows, it may not be the case that for every b ∈ B,
there is a unique a ∈ A with (a, b) ∈ f . Be careful with the order of quantifiers!
Thinking of functions as special types of relations, and in particular as special types of sets, is occasionally
helpful (see below), but is often awkward in practice. For example, writing (c, 71) ∈ f to mean that f sends
c to 71 gets annoying very quickly. Using infix notation like c f 71 is not much better. Thus, we introduce
some new notation matching up with our old experience with functions.
Notation 3.3.2. Let A and B be sets.
• Instead of writing “f is a function from A to B”, we typically use the shorthand notation “f : A → B”.
• If f : A → B and a ∈ A, we write f (a) to mean the unique b ∈ B such that (a, b) ∈ f .
Therefore, in the above example of f , we have
f (c) = 71
f (q) = 4
f (w) = 9382
f (y) = 4
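Since a function is literally a set of pairs with unique first coordinates, it is natural to model one in code as a lookup table. A Python dictionary captures exactly this; the check below is a direct translation of Definition 3.3.1:

    # The function f above as a set of pairs, and the uniqueness check
    # from the definition: each a in A appears exactly once on the left.
    f = {('c', 71), ('q', 4), ('w', 9382), ('y', 4)}
    A = {'c', 'q', 'w', 'y'}

    def is_function(rel, domain):
        return all(sum(1 for (x, _) in rel if x == a) == 1 for a in domain)

    print(is_function(f, A))  # True
    print(dict(f)['q'])       # 4, i.e. f(q) = 4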
Given f : A → B and g : B → C, we define the composition by letting

R = {(a, c) ∈ A × C : there exists b ∈ B with (a, b) ∈ f and (b, c) ∈ g}

Now R is a relation, and one can check that it is a function (using the assumption that f and g are both functions). We define g ◦ f to be this set.
Proposition 3.3.5. Let A, B, C, D be sets. Suppose that f : A → B, that g : B → C, and that h : C → D
are functions. We then have that (h ◦ g) ◦ f = h ◦ (g ◦ f ). Stated more simply, function composition is
associative whenever it is defined.
Proof. Let a ∈ A be arbitrary. We then have

((h ◦ g) ◦ f)(a) = (h ◦ g)(f(a)) = h(g(f(a)))

while

(h ◦ (g ◦ f))(a) = h((g ◦ f)(a)) = h(g(f(a)))

so ((h ◦ g) ◦ f)(a) = (h ◦ (g ◦ f))(a). Since a ∈ A was arbitrary, we conclude that (h ◦ g) ◦ f = h ◦ (g ◦ f).

In contrast, function composition is very far from commutative. For example, consider f : R → R and g : R → R defined by f(x) = x + 1 and g(x) = x^2. We then have

(f ◦ g)(x) = f(g(x)) = f(x^2) = x^2 + 1

while

(g ◦ f)(x) = g(f(x)) = g(x + 1) = (x + 1)^2 = x^2 + 2x + 1
For example, we have (f ◦ g)(1) = 1^2 + 1 = 2 while (g ◦ f)(1) = 1^2 + 2 · 1 + 1 = 4. Since we have found one example of an x with (f ◦ g)(x) ≠ (g ◦ f)(x), we conclude that f ◦ g ≠ g ◦ f. It does not matter that there do exist some values of x with (f ◦ g)(x) = (g ◦ f)(x) (for example, this is true when x = 0). Remember that two functions are equal precisely when they agree on all inputs, so to show that the two functions are not equal it suffices to find just one value where they disagree.
Definition 3.3.6. Let A be a set. The function idA : A → A defined by idA (a) = a for all a ∈ A is called
the identity function on A.
The identity function does leave other functions alone when we compose with it. However, we have to be careful that we compose with the identity function on the correct set and on the correct side. For example, if f : A → B, then for any a ∈ A we have

(idB ◦ f)(a) = idB(f(a)) = f(a)

because f(a) is some element in B. Since a ∈ A was arbitrary, it follows that idB ◦ f = f.
Definition. Let f : A → B be a function.

• A left inverse of f is a function g : B → A such that g ◦ f = idA.

• A right inverse of f is a function g : B → A such that f ◦ g = idB.

• An inverse of f is a function g : B → A that is both a left inverse and a right inverse of f simultaneously, i.e. a function g : B → A such that both g ◦ f = idA and f ◦ g = idB.
Notice that we need inverses on both sides, and the identity functions differ if A ≠ B. Consider the function f : R → R given by f(x) = e^x. Does f have an inverse? As defined, we are asking whether there exists a function g : R → R such that both g ◦ f = idR and f ◦ g = idR. A natural guess after a calculus course would be to consider the function g : R+ → R given by g(x) = ln x. However, g is not an inverse of f because it is not even defined for all real numbers as would be required. Suppose that we try to correct this minor defect by instead considering the function g : R → R defined by g(x) = ln x for x > 0, and g(x) = 0 otherwise.
Now at least g is defined on all of R. Is g an inverse of f? Let's check the definitions. For any x ∈ R, we have e^x > 0, so

(g ◦ f)(x) = g(f(x)) = g(e^x) = ln(e^x) = x = idR(x)

so we have shown that g ◦ f = idR. Thus, g is indeed a left inverse of f. Let's examine f ◦ g. Now if x > 0, then we have

(f ◦ g)(x) = f(g(x)) = f(ln x) = e^{ln x} = x = idR(x)
so everything is good there. However, if x < 0, then

(f ◦ g)(x) = f(g(x)) = f(0) = e^0 = 1 ≠ x

Therefore, g is not a right inverse of f, and hence not an inverse for f. In fact, g does not have a right inverse. One can prove this directly, but it will follow from Proposition 3.4.3 below. Notice that f has other left inverses besides g. For example, define h : R → R by h(x) = ln x for x > 0, and h(x) = 72x^5 otherwise. For any x ∈ R we have e^x > 0, so (h ◦ f)(x) = h(e^x) = ln(e^x) = x, and hence h ◦ f = idR. Thus, h is another left inverse of f. In fact, we can define a left inverse arbitrarily on the set {x ∈ R : x ≤ 0} so long as it is defined by ln x for positive reals.
Notice that if we had instead considered our function f(x) = e^x as a function from R to R+, then f does have an inverse. In fact, the function h : R+ → R defined by h(x) = ln x does satisfy h ◦ f = idR and f ◦ h = idR+. Thus, when restricting the codomain, we can obtain an inverse for a function that did not have one originally.
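The exp/ln discussion above is easy to replicate numerically; here is a small Python sketch (floating-point arithmetic introduces rounding, so the outputs are only approximate):

    # g is a left inverse of f (g(f(x)) = x for all x), but not a right
    # inverse: f(g(x)) != x whenever x <= 0.
    import math

    def f(x):
        return math.exp(x)

    def g(x):
        return math.log(x) if x > 0 else 0

    print(g(f(-3)))  # -3.0: (g o f)(-3) = -3, as for every real input
    print(f(g(-3)))  # 1.0, not -3: g fails to be a right inverse of f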
Proposition 3.3.9. Let f : A → B be a function. Suppose that g : B → A is a left inverse of f and that h : B → A is a right inverse of f. We then have that g = h.

Proof. By definition, we have that g ◦ f = idA and f ◦ h = idB. The key function to consider is the composition (g ◦ f) ◦ h = g ◦ (f ◦ h) (notice that these are equal by Proposition 3.3.5). We have

g = g ◦ idB
= g ◦ (f ◦ h)
= (g ◦ f) ◦ h (by Proposition 3.3.5)
= idA ◦ h
= h
Corollary 3.3.10. If f : A → B is a function, then there exists at most one function g : B → A that is an
inverse of f .
Proof. Suppose that g : B → A and h : B → A are both inverses of f . In particular, we then have that g is a
left inverse of f and h is a right inverse of f . Therefore, g = h by Proposition 3.3.9.
Proposition 3.3.11. Let f : A → B be a function.
1. If f has a left inverse, then f has at most one right inverse.
2. If f has a right inverse, then f has at most one left inverse.
Proof. Suppose that f has a left inverse, and fix such a left inverse g : B → A. Suppose that h1 : B → A
and h2 : B → A are both right inverses of f . Using Proposition 3.3.9, we conclude that g = h1 and g = h2 .
Therefore, h1 = h2 .
The proof of the second result is completely analogous.
Notice that it is possible for a function to have many left inverses, but this result says that f must
then fail to have a right inverse. This is exactly what happened above in the example where f : R → R
was the function f (x) = e^x .
Definition 3.4.1. Let f : A → B be a function.
• We say that f is injective (or one-to-one) if whenever a1 , a2 ∈ A satisfy f (a1 ) = f (a2 ), we have a1 = a2 .
• We say that f is surjective (or onto) if for all b ∈ B there exists a ∈ A such that f (a) = b. In other
words, f is surjective if range(f ) = B.
• We say that f is bijective if f is both injective and surjective.
An equivalent condition for f to be injective is obtained by simply taking the contrapositive, i.e. f : A → B
is injective if and only if whenever a1 ≠ a2 , we have f (a1 ) ≠ f (a2 ). Stated in more colloquial language, f is
injective if every element of B is hit by at most one element of A via f . In this manner, f is surjective if
every element of B is hit by at least one element of A via f , and f is bijective if every element of B is hit by
exactly one element of A via f .
For example, consider the function f : N+ → N+ defined by letting f (n) be the number of positive divisors
of n. Notice that f is not injective because f (3) = 2 = f (5) but of course 3 ≠ 5. On the other hand, f is
surjective, because given any n ∈ N+ , we have f (2^(n−1) ) = n.
If we want to prove that a function f : A → B is injective, it is usually better to use our official definition
than the contrapositive one with negations. Thus, we want to start by assuming that we are given arbitrary
a1 , a2 ∈ A that satisfy f (a1 ) = f (a2 ), and using this assumption we want to prove that a1 = a2 . The
reason why this approach is often preferable is because it is typically easier to work with and manipulate a
statement involving equality than it is to derive statements from a non-equality.
Proposition 3.4.2. Let f : A → B and g : B → C be functions.
1. If f and g are both injective, then g ◦ f is injective.
2. If f and g are both surjective, then g ◦ f is surjective.
Proof.
1. Suppose that a1 , a2 ∈ A satisfy (g ◦ f )(a1 ) = (g ◦ f )(a2 ). We then have that g(f (a1 )) = g(f (a2 )). Using
the fact that g is injective, we conclude that f (a1 ) = f (a2 ). Now using the fact that f is injective, it
follows that a1 = a2 . Therefore, g ◦ f is injective.
2. Suppose that c ∈ C is arbitrary. Since g is surjective, we may fix b ∈ B with g(b) = c. Since f is
surjective, we may fix a ∈ A with f (a) = b. We then have (g ◦ f )(a) = g(f (a)) = g(b) = c.
Therefore, g ◦ f is surjective.
Proposition 3.4.3. Let f : A → B be a function, and assume that A ≠ ∅.
1. f is injective if and only if f has a left inverse (i.e. there is a function g : B → A with g ◦ f = idA ).
2. f is surjective if and only if f has a right inverse (i.e. there is a function g : B → A with f ◦ g = idB ).
3. f is bijective if and only if f has an inverse (i.e. there is a function g : B → A with both g ◦ f = idA
and f ◦ g = idB ).
Proof.
1. Suppose first that f has a left inverse, and fix a function g : B → A with g ◦ f = idA . Suppose that
a1 , a2 ∈ A satisfy f (a1 ) = f (a2 ). Applying the function g to both sides we see that g(f (a1 )) = g(f (a2 )),
and hence (g ◦ f )(a1 ) = (g ◦ f )(a2 ). We now have
a1 = idA (a1 )
= (g ◦ f )(a1 )
= (g ◦ f )(a2 )
= idA (a2 )
= a2
so a1 = a2 . Therefore, f is injective.
3.5 Defining Functions on Equivalence Classes
Given a set A and an equivalence relation ∼ on A, we often want to form a new set which consists of the
equivalence classes themselves. Thus, the elements of this new set are themselves sets. Here is the formal definition.
Definition 3.5.1. Let A be a set and let ∼ be an equivalence relation on A. The set whose elements are the
equivalence classes of A under ∼ is called the quotient of A by ∼ and is denoted A/∼.
Thus, if we let A = Z × (Z\{0}), where (a, b) ∼ (c, d) means ad = bc, then the set of all rationals is the
quotient A/∼. Letting Q = A/∼, we then have that the equivalence class of (a, b) is an element of Q for every choice of
a, b ∈ Z with b ≠ 0. This quotient construction is extremely general, and we will see that it will play a
fundamental role in our studies. Before we delve into our first main example of modular arithmetic in the
next section, we first address an important and subtle question.
To begin with, notice that a given element of Q (namely a rational) is represented by many different pairs
of integers. After all, we have
1/2 = (1, 2) = (2, 4) = (−5, −10) = · · ·
Suppose that we want to define a function whose domain is Q. For example, we want to define a function
f : Q → Z. Now we can try to write down something like:
f ((a, b)) = a
Intuitively, we are trying to define f : Q → Z by letting f (a/b) = a. From a naive glance, this might look
perfectly reasonable. However, there is a real problem arising from the fact that elements of Q have many
representations. On the one hand, we should have
f ((1, 2)) = 1
But we know that (1, 2) = (2, 4), so we should also have f ((2, 4)) = 2. Since 1 ≠ 2, our description has
imposed multiple different outputs for the same input, which contradicts the very definition of a function
(after all, a function must have a unique output for any given input). Thus, if we want to define a function on Q, we need to check that our definition does
not depend on our choice of representatives.
For a positive example, consider the projective real line P . That is, let A = R^2 \{(0, 0)}, where (x1 , y1 ) ∼
(x2 , y2 ) means there exists a real number λ ≠ 0 such that (x1 , y1 ) = (λx2 , λy2 ). We then have
that P = A/∼. Consider trying to define the function g : P → R by
g((x, y)) = 5xy / (x^2 + y^2 )
First we check a technicality: If (x, y) ∈ P , then (x, y) ≠ (0, 0), so x^2 + y^2 ≠ 0, and hence the domain of g
really is all of P . Now we claim that g “makes sense”, i.e. that it actually is a function. To see this, we take
two elements (x1 , y1 ) and (x2 , y2 ) with (x1 , y1 ) = (x2 , y2 ), and check that g((x1 , y1 )) = g((x2 , y2 )). In other
words, we check that our definition of g does not actually depend on our choice of representative. Suppose
then that (x1 , y1 ) = (x2 , y2 ). We then have (x1 , y1 ) ∼ (x2 , y2 ), so we may fix λ 6= 0 with (x1 , y1 ) = (λx2 , λy2 ).
Now
g((x1 , y1 )) = 5x1 y1 / (x1^2 + y1^2 )
            = 5(λx2 )(λy2 ) / ((λx2 )^2 + (λy2 )^2 )
            = 5λ^2 x2 y2 / (λ^2 x2^2 + λ^2 y2^2 )
            = (λ^2 · 5x2 y2 ) / (λ^2 · (x2^2 + y2^2 ))
            = 5x2 y2 / (x2^2 + y2^2 )
            = g((x2 , y2 )).
Thus, g does not depend on the choice of representative, and so g is a well-defined function on P .
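As a quick sanity check, the computation above can be mirrored numerically. The following Python sketch (our own illustration; the name g simply matches the function above) confirms on a few sample points that g((x, y)) is unchanged when the representative (x, y) is rescaled by a nonzero λ:

```python
def g(x, y):
    # g((x, y)) = 5xy / (x^2 + y^2), defined since (x, y) != (0, 0)
    return 5 * x * y / (x**2 + y**2)

# scaling a representative by any nonzero lambda leaves g unchanged
for (x, y) in [(1.0, 2.0), (-3.0, 0.5), (0.0, 4.0)]:
    for lam in [2.0, -1.0, 0.25]:
        assert abs(g(x, y) - g(lam * x, lam * y)) < 1e-9
print("g is constant on equivalence classes (on these samples)")
```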
As another example, consider trying to define a function f : Q → Q by letting
f ((a, b)) = (a^2 + 3b^2 , 2b^2 )
We claim that this function is well-defined on Q. Intuitively, we want to define the following function on
fractions:
f (a/b) = (a^2 + 3b^2 ) / (2b^2 )
Let’s check that it does indeed make sense. Suppose that a, b, c, d ∈ Z with b, d ≠ 0 and we have (a, b) = (c, d),
i.e. that (a, b) ∼ (c, d). We need to show that f ((a, b)) = f ((c, d)), i.e. that (a^2 + 3b^2 , 2b^2 ) = (c^2 + 3d^2 , 2d^2 ),
or equivalently that (a^2 + 3b^2 , 2b^2 ) ∼ (c^2 + 3d^2 , 2d^2 ). Since we are assuming that (a, b) ∼ (c, d), we know
that ad = bc. Hence
(a^2 + 3b^2 ) · 2d^2 = 2a^2 d^2 + 6b^2 d^2 = 2(ad)^2 + 6b^2 d^2 = 2(bc)^2 + 6b^2 d^2 = 2b^2 c^2 + 6b^2 d^2 = 2b^2 · (c^2 + 3d^2 )
Therefore, (a2 + 3b2 , 2b2 ) ∼ (c2 + 3d2 , 2d2 ), which is to say that f ((a, b)) = f ((c, d)). It follows that f is
well-defined on Q.
Definition 3.6.1. Let n ∈ N+ . Given a, b ∈ Z, we write a ≡n b to mean that n | (a − b).
Proposition 3.6.2. For each n ∈ N+ , the relation ≡n is an equivalence relation on Z.
Proof.
• Reflexive: Let a ∈ Z. Since a − a = 0 and n | 0, we have a ≡n a.
• Symmetric: Let a, b ∈ Z with a ≡n b. We then have n | (a − b). Since b − a = −(a − b), it follows that
n | (b − a), and hence b ≡n a.
• Transitive: Let a, b, c ∈ Z with a ≡n b and b ≡n c. We then have that n | (a − b) and n | (b − c). Using
Proposition 2.2.3, it follows that n | [(a − b) + (b − c)], which is to say that n | (a − c). Therefore,
a ≡n c.
By our general theory about equivalence relations, we know that ≡n partitions Z into equivalence classes.
We next determine the number of such equivalence classes.
Proposition 3.6.3. Let n ∈ N+ and let a ∈ Z. There exists a unique b ∈ {0, 1, . . . , n − 1} such that a ≡n b.
In fact, if we write a = qn + r for the unique choice of q, r ∈ Z with 0 ≤ r < n, then b = r.
Proof. As in the statement, fix q, r ∈ Z with a = qn + r and 0 ≤ r < n. We then have a − r = nq, so
n | (a − r). It follows that a ≡n r, so we have proven existence.
Suppose now that b ∈ {0, 1, . . . , n − 1} and a ≡n b. We then have that n | (a − b), so we may fix k ∈ Z
with nk = a − b. This gives a = kn + b. Since 0 ≤ b < n, we may use the uniqueness part of Theorem 2.3.1
to conclude that k = q (which is unnecessary) and also that b = r. This proves uniqueness.
Therefore, the quotient Z/≡n has n elements, and we can obtain unique representatives for these equiv-
alence classes by taking the representatives from the set {0, 1, . . . , n − 1}. For example, if n = 5, we have
that Z/≡n consists of the following five sets:
{. . . , −10, −5, 0, 5, 10, . . . }
{. . . , −9, −4, 1, 6, 11, . . . }
{. . . , −8, −3, 2, 7, 12, . . . }
{. . . , −7, −2, 3, 8, 13, . . . }
{. . . , −6, −1, 4, 9, 14, . . . }
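For readers who like to experiment, Python's % operator computes exactly the representative r from Proposition 3.6.3, even for negative a (a small sketch of our own):

```python
n = 5
# Python's % operator returns a remainder in {0, ..., n-1} even for
# negative inputs, so a % n picks out the canonical representative
for a in [-10, -9, -1, 0, 3, 7, 12]:
    print(f"{a} ≡ {a % n} (mod {n})")
# e.g. -9 % 5 == 1, matching the class {..., -9, -4, 1, 6, 11, ...}
```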
Now that we’ve used ≡n to break Z up into n pieces, our next goal is to show how to add and multiply
elements of this quotient. The idea is to define addition/multiplication of elements of Z/≡n by simply
adding/multiplying representatives. In other words, writing [a] for the equivalence class of a (we will often
blur the distinction and simply write a for its class when no confusion can arise), we would like to define
[a] + [b] = [a + b]      and      [a] · [b] = [ab]
Of course, whenever we define functions on equivalence classes via representatives, we need to be careful to
ensure that our function is well-defined, i.e. that it does not depend of the choice of representatives. That is
the content of the next result.
Proposition 3.6.4. Let n ∈ N+ , and suppose that a, b, c, d ∈ Z satisfy a ≡n c and b ≡n d. We then have:
1. a + b ≡n c + d
2. ab ≡n cd
Proof. For part 1, since n | (a − c) and n | (b − d), it follows from Proposition 2.2.3 that n | [(a − c) + (b − d)],
which is to say that n | [(a + b) − (c + d)]. Therefore, a + b ≡n c + d.
For part 2, notice that
ab − cd = ab − bc + bc − cd
        = (a − c) · b + (b − d) · c
Since n | (a − c) and n | (b − d), it follows that n | [(a − c) · b + (b − d) · c] and so n | (ab − cd). Therefore,
ab ≡n cd.
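Proposition 3.6.4 can also be checked by brute force for a small modulus. The following Python sketch (our own, with n = 6 chosen arbitrarily) tests every quadruple of representatives in a window around 0:

```python
n = 6
# brute-force check of Proposition 3.6.4 for n = 6: if a ≡ c and b ≡ d
# (mod n), then a + b ≡ c + d and a * b ≡ c * d (mod n)
reps = range(-12, 13)
for a in reps:
    for b in reps:
        for c in reps:
            for d in reps:
                if (a - c) % n == 0 and (b - d) % n == 0:
                    assert ((a + b) - (c + d)) % n == 0
                    assert (a * b - c * d) % n == 0
print("addition and multiplication are well-defined mod", n)
```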
In this section, we’ve used the notation a ≡n b to denote the relation because it fits in with our general
infix notation. However, for both historical reasons and because the subscript can be annoying, one typically
uses the following notation.
Notation 3.6.5. Given a, b ∈ Z and n ∈ N+ , we write a ≡ b (mod n) to mean that a ≡n b.
Chapter 4
Introduction to Groups
4.1 Definitions
We begin with the notion of a group. In this context, we deal with just one operation. We choose to start
here in order to get practice with rigor and abstraction in as simple a setting as possible. Also, it turns out
that groups appear across all areas of mathematics in many different guises.
Definition 4.1.1. Let X be a set. A binary operation on X is a function f : X^2 → X.
In other words, a binary operation on a set X is a rule which tells us how to “put together” any two
elements of X. For example, the function f : R^2 → R defined by f (x, y) = x^2 e^y is a binary operation on
R. Notice that a binary operation must be defined for all pairs of elements from X, and it must return an
element of X. The function f (x, y) = x/(y − 1) is not a binary operation on R because it is not defined for any
point of the form (x, 1). The function f : Z^2 → R defined by f (x, y) = sin(xy) is defined on all of Z^2 , but
it is not a binary operation on Z because although some values of f are integers (like f (0, 0) = 0), not all
outputs are integers even when we provide integer inputs (for example, f (1, 1) = sin 1 is not an integer).
Also, the dot product is not a binary operation on R^3 because given two elements of R^3 it returns an element
of R (rather than an element of R3 ).
Instead of using the standard function notation for binary operations, one typically uses the so-called
infix notation. For example, when we add two numbers, we write x + y rather than the far more cumbersome
+(x, y). For the binary operation involved with groups, we will follow this infix notation.
Definition 4.1.2. A group is a set G equipped with a binary operation · and an element e ∈ G such that:
1. (Associativity) For all a, b, c ∈ G, we have (a · b) · c = a · (b · c).
2. (Identity) For all a ∈ G, we have a · e = a and e · a = a.
3. (Inverses) For all a ∈ G, there exists b ∈ G with a · b = e and b · a = e.
In the abstract definition of a group, we have chosen to use the symbol · for the binary operation. This
symbol may look like the “multiplication” symbol, but the operation need not be the usual multiplication in
any sense. In fact, the · operation may be addition, exponentiation, or some bizarre and unnatural operation
we never thought of before.
Notice that any description of a group needs to provide three things: a set G, a binary operation · on G
(i.e. function from G2 to G), and a particular element e ∈ G. Absolutely any such choice of set, function,
and element can conceivably comprise a group. To check whether this is indeed the case, we need only check
whether the above three properties are true of that fixed choice.
Here are a few examples of groups (in most cases we do not verify all of the properties here, but we will
do so later):
1. (Z, +, 0) is a group, as are (Q, +, 0), (R, +, 0), and (C, +, 0).
2. (Q\{0}, ·, 1) is a group. We need to omit 0 because it has no multiplicative inverse. Notice that the
product of two nonzero elements of Q is nonzero, so · really is a binary operation on Q\{0}.
3. The set of invertible 2 × 2 matrices over R, with matrix multiplication as the operation and the 2 × 2 identity matrix as the identity element. In order
to verify that multiplication is a binary operation on these matrices, it is important to note that the
product of two invertible matrices is itself invertible, and that the inverse of an invertible matrix is
itself invertible. These are fundamental facts from linear algebra.
6. Fix n ∈ N+ and let {0, 1}^n be the set of all sequences of 0’s and 1’s of length n. If we let ⊕ be the
bitwise “exclusive or” operation, then ({0, 1}^n , ⊕, 0^n ) is a group. Notice that if n = 1, and we interpret
0 as false and 1 as true, then this example looks just like the previous one.
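A quick Python sketch of our own (not part of the notes) illustrates example 6: bitwise exclusive or on length-n tuples has the all-zero tuple as the identity, and every element is its own inverse. (We do not check associativity here; it is inherited coordinatewise from the two-element case.)

```python
from itertools import product

n = 3
# the set {0,1}^n under bitwise exclusive or; identity is the all-zero tuple
elements = list(product([0, 1], repeat=n))
xor = lambda s, t: tuple(a ^ b for a, b in zip(s, t))
e = (0,) * n

assert all(xor(s, e) == s for s in elements)   # identity
assert all(xor(s, s) == e for s in elements)   # every element is its own inverse
print(len(elements), "elements; identity and self-inverse checks pass")
```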
In contrast, here are some examples of structures that fail to be groups:
2. Let S be the set of all odd elements of Z, with the additional inclusion of 0, i.e. S = {0} ∪ {n ∈ Z : n
is odd}. Then (S, +, 0) is not a group. Notice that + is associative, that 0 is an identity, and that
inverses exist. However, + is not a binary operation on S because 1 + 3 = 4 ∉ S. In other words, S is
not closed under +.
Returning to the examples of groups above, notice that the group operation is commutative in all the
examples above except for 3. To see that the operation in 3 is not commutative, simply notice that each of
the matrices
( 1 1 )      ( 1 0 )
( 0 1 )      ( 1 1 )
are invertible (they both have determinant 1), but
( 1 1 ) ( 1 0 )   ( 2 1 )
( 0 1 ) ( 1 1 ) = ( 1 1 )
while
( 1 0 ) ( 1 1 )   ( 1 1 )
( 1 1 ) ( 0 1 ) = ( 1 2 ).
Thus, there are groups in which the group operation · is not commutative. The special groups that satisfy
this additional fundamental property are named after Niels Abel.
Definition 4.1.3. A group (G, ·, e) is abelian if · is commutative, i.e. if a · b = b · a for all a, b ∈ G. A group
that is not abelian is called nonabelian.
A group can also be described simply by writing down a table for its operation. For example, consider the set G = {3, ℵ, @} with the operation · given by the following table:
·  3  ℵ  @
3  @  3  ℵ
ℵ  3  ℵ  @
@  ℵ  @  3
Interpret the above table as follows. To determine a · b, go to the row corresponding to a and the column
corresponding to b, and a · b will be the corresponding entry. For example, in order to compute 3 · @, we go
to the row labeled 3 and column labeled @, which tells us that 3 · @ = ℵ. We can indeed check that this
does form a group with identity element ℵ. However, this is a painful check because there are 27 choices of
triples for which we need to verify associativity. This table view of a group G described above is called the
Cayley table of G.
Here is another example of a group using the set G = {1, 2, 3, 4, 5, 6} with operation ∗ defined by the
following Cayley table:
∗ 1 2 3 4 5 6
1 1 2 3 4 5 6
2 2 1 6 5 4 3
3 3 5 1 6 2 4
4 4 6 5 1 3 2
5 5 3 4 2 6 1
6 6 4 2 3 1 5
It is straightforward to check that 1 is an identity for G and that every element has an inverse. However,
as above, I very strongly advise against checking associativity directly. We will see later how to build many
new groups and verify associativity without directly trying all possible triples. Notice that this last group is
an example of a finite nonabelian group because 2 ∗ 3 = 6 while 3 ∗ 2 = 5.
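If we do want to perform the painful check by machine rather than by hand, the brute-force verification is only a few lines of Python (a sketch of our own; the table below transcribes the Cayley table just given):

```python
from itertools import product

# the Cayley table of the group on {1, ..., 6} above, as a dict;
# table[(a, b)] is a * b, transcribed row by row
rows = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 6, 5, 4, 3],
    [3, 5, 1, 6, 2, 4],
    [4, 6, 5, 1, 3, 2],
    [5, 3, 4, 2, 6, 1],
    [6, 4, 2, 3, 1, 5],
]
table = {(a, b): rows[a - 1][b - 1] for a in range(1, 7) for b in range(1, 7)}

# the "painful check": try all 6^3 = 216 triples
ok = all(table[(table[(a, b)], c)] == table[(a, table[(b, c)])]
         for a, b, c in product(range(1, 7), repeat=3))
print("associative:", ok)
```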
4.2 Basic Properties, Notation, and Conventions
Proposition 4.2.1. Let (G, ·, e) be a group. There exists a unique identity element in G, i.e. if d ∈ G has
the property that a · d = a and d · a = a for all a ∈ G, then d = e.
Proof. The key element to consider is d · e. On the one hand, we know that d · e = d because e is an identity
element. On the other hand, we have d · e = e because d is an identity element. Therefore, d = d · e = e, so
d = e.
We now move on to a similar question about inverses. The axioms only state the existence of an inverse
for every given element. We now prove uniqueness.
Proposition 4.2.2. Let (G, ·, e) be a group. For each a ∈ G, there exists a unique b ∈ G such that a · b = e
and b · a = e.
Proof. Fix a ∈ G. By the group axioms, we know that there exists an inverse of a. Suppose that b and c
both work as inverses, i.e. that a · b = e = b · a and a · c = e = c · a. The crucial element to think about is
(b · a) · c = b · (a · c). We have
b=b·e
= b · (a · c)
= (b · a) · c
=e·c
= c,
hence b = c.
Definition 4.2.3. Let (G, ·, e) be a group. Given an element a ∈ G, we let a−1 denote the unique element
such that a · a−1 = e and a−1 · a = e.
Proposition 4.2.4. Let (G, ·, e) be a group and let a, b ∈ G.
1. There exists a unique element c ∈ G such that a · c = b, namely c = a−1 · b.
2. There exists a unique element c ∈ G such that c · a = b, namely c = b · a−1 .
Proof. We prove 1 and leave 2 as an exercise (it is completely analogous). Notice that c = a−1 · b works
because
a · (a−1 · b) = (a · a−1 ) · b
=e·b
= b.
For uniqueness, suppose that d ∈ G satisfies a · d = b. We then have
d=e·d
= (a−1 · a) · d
= a−1 · (a · d)
= a−1 · b.
Here’s an alternate presentation of the latter part (it is the exact same proof, just with more words and a
little more motivation). Suppose that d ∈ G satisfies a · d = b. We then have that a−1 · (a · d) = a−1 · b. By
associativity, the left-hand side is (a−1 · a) · d, which is e · d, which equals d. Therefore, d = a−1 · b.
Corollary 4.2.5 (Cancellation Laws). Let (G, ·, e) be a group and let a, c, d ∈ G.
1. If a · c = a · d, then c = d.
2. If c · a = d · a, then c = d.
Proof. Suppose that a · c = a · d. Letting b equal this common value, it follows from the uniqueness part of
Proposition 4.2.4 that c = d. Alternatively, multiply both sides on the left by a−1 and use associativity.
Part 2 is completely analogous.
In terms of the Cayley table of a group G, Proposition 4.2.4 says that every element of G appears
exactly once in each row and exactly once in each column of the table. In fancy terminology, the Cayley table
of a group is a Latin square.
Proposition 4.2.6. Let (G, ·, e) be a group and let a ∈ G. We have (a−1 )−1 = a.
Proof. By definition we have a · a−1 = e = a−1 · a. Thus, a satisfies the requirement to be the inverse of a−1 .
In other words, (a−1 )−1 = a.
Proposition 4.2.7. Let (G, ·, e) be a group and let a, b ∈ G. We have (a · b)−1 = b−1 · a−1 .
Proof. We have
(a · b) · (b−1 · a−1 ) = ((a · b) · b−1 ) · a−1 = (a · (b · b−1 )) · a−1 = (a · e) · a−1 = a · a−1 = e
and similarly (b−1 · a−1 ) · (a · b) = e. Therefore, b−1 · a−1 is the inverse of a · b, which is to say that (a · b)−1 = b−1 · a−1 .
With associativity in hand, we can also make sense of products of more than three elements. For example, given a, b, c, d ∈ G, we have
(ab)(cd) = ((ab)c)d
= (a(bc))d.
In general, such rearrangements are always possible by iteratively moving the parentheses around as long as
the order of the elements doesn’t change. In other words, no matter how we insert parentheses in abcd so that
it makes sense, we always obtain the same result. For a sequence of 4 elements it is straightforward to try
them all, and for 5 elements it is tedious but feasible. However, it is true no matter how long the sequence is.
A careful proof of this fact requires careful definitions of what a “permissible insertion of parentheses” means,
and at this level such an involved tangent is more distracting from our primary aims than it is enlightening.
We will simply take the result as true.
Keep in mind that the order of the elements occurring does matter. There is no reason at all to think that
abcd equals dacb unless the group is abelian. However, if G is abelian then upon any insertion of parentheses
into these expressions so that they make sense, they will evaluate to the same value. Thus, assuming that G is
commutative, we obtain a kind of “generalized commutativity” just like we have a “generalized associativity”.
Definition 4.2.8. Let G be a group. If G is a finite set, then the order of G, denoted |G|, is the number of
elements in G. If G is infinite, we simply write |G| = ∞.
At the moment, we only have a few examples of groups. We know that the usual sets of numbers under
addition like (Z, +, 0), (Q, +, 0), (R, +, 0), and (C, +, 0) are groups. We listed a few other examples above,
such as (Q\{0}, ·), (R\{0}, ·), and the invertible 2 × 2 matrices under multiplication. In these last few
examples, we needed to “throw away” some elements from the natural set in order to satisfy the inverse
requirement, and in the next section we outline this construction in general.
4.3 Building Groups From Associative Operations
Definition 4.3.1. Let · be a binary operation on a set A that is associative and has an identity element e.
Given a ∈ A, we say that a is invertible if there exists b ∈ A with both ab = e and ba = e. In this case, we
say that b is an inverse for a.
Proposition 4.3.2. Let · be a binary operation on a set A that is associative and has an identity element
e. We then have that every a ∈ A has at most one inverse.
Proof. The proof is completely analogous to the proof for functions. Let a ∈ A and suppose that b and c are
both inverses for a. We then have that
b = be
= b(ac)
= (ba)c
= ec
=c
so b = c.
Notation 4.3.3. Let · be a binary operation on a set A that is associative and has an identity element e. If
a ∈ A is invertible, then we denote its unique inverse by a−1 .
Proposition 4.3.4. Let · be a binary operation on a set A that is associative and has an identity element e.
1. The element e is invertible, and e−1 = e.
2. If a and b are both invertible, then ab is invertible and (ab)−1 = b−1 a−1 .
3. If a is invertible, then a−1 is invertible and (a−1 )−1 = a.
Proof. 1. Since ee = e because e is an identity element, we see immediately that e is invertible and
e−1 = e.
2. Suppose that a and b are both invertible. We then have
(ab)(b−1 a−1 ) = ((ab)b−1 )a−1 = (a(bb−1 ))a−1 = (ae)a−1 = aa−1 = e
and also
(b−1 a−1 )(ab) = ((b−1 a−1 )a)b = (b−1 (a−1 a))b = (b−1 e)b = b−1 b = e
Therefore, ab is invertible and (ab)−1 = b−1 a−1 .
3. Suppose that a ∈ A is invertible. By definition, we then have that both aa−1 = e and also that
a−1 a = e. Looking at these equations, we see that a satisfies the definition of being an inverse for a−1 .
Therefore, a−1 is invertible and (a−1 )−1 = a.
Corollary 4.3.5. Let · be a binary operation on a set A that is associative and has an identity element e.
Let B be the set of all invertible elements of A. We then have that · is a binary operation on B and that
(B, ·, e) is a group. Furthermore, if · is commutative on A, then (B, ·, e) is an abelian group.
Proof. By Proposition 4.3.4, we know that if b1 , b2 ∈ B, then b1 b2 ∈ B, and hence · is a binary operation
on B. Proposition 4.3.4 also tells us that e ∈ B. Furthermore, since e is an identity for A, and B ⊆ A, it
follows that e is an identity element for B. Now · is an associative operation on A and B ⊆ A, so it follows
that · is an associative operation on B. Finally, if b ∈ B, then Proposition 4.3.4 tells us that b−1 ∈ B as
well, so b does have an inverse in B. Putting this all together, we conclude that (B, ·, e) is a group. For the
final statement, simply notice that if · is a commutative operation on A, then since B ⊆ A it follows that ·
is a commutative operation on B.
One special case of this construction is that (Q\{0}, ·, 1), (R\{0}, ·, 1), and (C\{0}, ·, 1) are all abelian
groups. In each case, multiplication is an associative and commutative operation on the whole set, and 1 is
an identity element. Furthermore, in each case, every element but 0 is invertible. Thus, Corollary 4.3.5 says
that each of these are abelian groups.
Another associative operation that we know is the multiplication of matrices. Of course, the set of all
n × n matrices does not form a group because some matrices do not have inverses. However, if we restrict
down to just the invertible matrices, then Corollary 4.3.5 tells us that we do indeed obtain a group.
Definition 4.3.6. Let n ∈ N+ . The set of all n × n invertible matrices with real entries forms a group under
matrix multiplication, with identity element equal to the n × n identity matrix with 1’s on the diagonal and
0’s everywhere else. We denote this group by GLn (R) and call it the general linear group of degree n over
R.
Notice that GL1 (R) is really just the group (R\{0}, ·, 1). However, for each n ≥ 2, the group GLn (R) is
nonabelian so we have an infinite family of such groups (it is worthwhile to explicitly construct two n × n
invertible matrices that do not commute with each other).
Moreover, it is also possible to allow the entries of the matrices to come from a place other than R. For example,
we can consider matrices with entries from C. In this case, matrix multiplication is still associative, and
the usual identity matrix is still an identity element. Thus, we can define the following group:
Definition 4.3.7. Let n ∈ N+ . The set of all n × n invertible matrices with complex entries forms a group
under matrix multiplication, with identity element equal to the n × n identity matrix with 1’s on the diagonal
and 0’s everywhere else. We denote this group by GLn (C) and call it the general linear group of degree n
over C.
As we will see in Chapter 9, we can even generalize this matrix construction to other
“rings”.
4.4 The Groups Z/nZ and U(Z/nZ)
Fix n ∈ N+ , and let G = Z/≡n be the set of equivalence classes of Z under ≡n . As before, write [a] for the
equivalence class of a. Proposition 3.6.4 tells us that the operation
[a] + [b] = [a + b]
is well-defined, i.e. that it does not depend on the choice of representatives. We now check that this operation
makes G into a group.
• Associativity: For any a, b, c ∈ Z we have
([a] + [b]) + [c] = [a + b] + [c]
                 = [(a + b) + c]
                 = [a + (b + c)]      (since + is associative on Z)
                 = [a] + [b + c]
                 = [a] + ([b] + [c])
so + is associative on G.
• Identity: For any a ∈ Z we have
[a] + [0] = [a + 0] = [a]      and      [0] + [a] = [0 + a] = [a]
so [0] is an identity element.
• Inverses: For any a ∈ Z we have
[a] + [−a] = [a + (−a)] = [0]      and      [−a] + [a] = [(−a) + a] = [0]
so [−a] is an inverse of [a].
• Commutativity: For any a, b ∈ Z we have
[a] + [b] = [a + b]
          = [b + a]      (since + is commutative on Z)
          = [b] + [a]
so + is commutative on G.
Therefore, (G, +, [0]) is an abelian group.
Definition 4.4.2. Let n ∈ N+ . We denote the above abelian group by Z/nZ. We call the group “Z mod
nZ”.
This notation is unmotivated at the moment, but will make sense when we discuss general quotient
groups (and be consistent with the more general notation we establish there). Notice that |Z/nZ| = n for
all n ∈ N+ . Thus, we have shown that there exists an abelian group of every finite order.
Let’s examine one example in detail. Consider Z/5Z. As discussed in the last section, the equivalence
classes 0, 1, 2, 3 and 4 are all distinct and give all elements of Z/5Z. By definition, we have 3 + 4 = 7.
This is perfectly correct, but 7 is not one of the special representatives we chose above. Since 7 = 2, we can
also write 3 + 4 = 2 and now we have removed mention of all representatives other than the chosen ones.
Working it all out with only those representatives, we get the following Cayley table for Z/5Z:
+ 0 1 2 3 4
0 0 1 2 3 4
1 1 2 3 4 0
2 2 3 4 0 1
3 3 4 0 1 2
4 4 0 1 2 3
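The table above is easy to regenerate programmatically. Here is a small Python sketch of our own (the helper name cayley_table_mod is ours) that prints the Cayley table of Z/nZ using the canonical representatives {0, . . . , n − 1}:

```python
def cayley_table_mod(n, op):
    # print the Cayley table of ({0, ..., n-1}, op mod n)
    print("   " + " ".join(str(j) for j in range(n)))
    for i in range(n):
        row = " ".join(str(op(i, j) % n) for j in range(n))
        print(i, "|", row)

cayley_table_mod(5, lambda a, b: a + b)   # reproduces the table above
```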
We also showed in Proposition 3.6.4 that multiplication is well-defined on the quotient. Now it is straight-
forward to mimic the above arguments to show that multiplication is associative and commutative, and that
1 is an identity. However, it seems unlikely that we always have inverses because Z itself fails to have mul-
tiplicative inverses for all elements other than ±1. But let’s look at what happens in the case n = 5. For
example, we have 4 · 4 = 16 = 1, so in particular 4 does have a multiplicative inverse. Working out the
computations, we get the following table.
· 0 1 2 3 4
0 0 0 0 0 0
1 0 1 2 3 4
2 0 2 4 1 3
3 0 3 1 4 2
4 0 4 3 2 1
Examining the table, we are pleasantly surprised that every element other than 0 does have a multiplicative
inverse! In hindsight, there was no hope of 0 having a multiplicative inverse because [0] · [a] = [0 · a] = [0] ≠ [1]
for all a ∈ Z. With this one example, we might have the innocent hope that we always get multiplicative
inverses for elements other than 0 when we change n. Let’s dash those hopes now by looking at the case when
n = 6.
· 0 1 2 3 4 5
0 0 0 0 0 0 0
1 0 1 2 3 4 5
2 0 2 4 0 2 4
3 0 3 0 3 0 3
4 0 4 2 0 4 2
5 0 5 4 3 2 1
Looking at the table, we see that only 1 and 5 have multiplicative inverses. There are other curiosities
as well. For example, we can have two nonzero elements whose product is zero, as shown by 3 · 4 = 0. This
is an interesting fact, and we will return to such considerations when we get to ring theory. However, let’s get
to our primary concern of forming a group under multiplication. Since · is an associative and commutative
operation on Z/nZ with identity element 1, Corollary 4.3.5 tells us that we can form an abelian group by
trimming down to those elements that do have inverses. To get a better handle on what elements will be in
this group, we use the following fact.
Proposition 4.4.3. Let n ∈ N+ and let a ∈ Z. The following are equivalent:
1. There exists c ∈ Z with ac ≡ 1 (mod n).
2. gcd(a, n) = 1.
Proof. Assume 1. Fix c ∈ Z with ac ≡ 1 (mod n). We then have n | (ac − 1), so we may fix k ∈ Z with
nk = ac − 1. Rearranging, we see that ac + n(−k) = 1. Hence, there is an integer combination of a and n
which gives 1. Since gcd(a, n) is the least positive such integer, and there is no positive integer less than 1,
we conclude that gcd(a, n) = 1.
Suppose conversely that gcd(a, n) = 1. Fix k, ` ∈ Z with ak + n` = 1. Rearranging gives n(−`) = ak − 1,
so n | (ak − 1). It follows that ak ≡ 1 (mod n), so we can choose c = k.
Thus, when n = 6, the fundamental reason why 1 and 5 have multiplicative inverses is that gcd(1, 6) =
1 = gcd(5, 6). In the case of n = 5, the reason why every element other than 0 had a multiplicative inverse
is because every possible number less than 5 is relatively prime with 5, which is essentially just saying that
5 is prime. Notice that the above argument can be turned into an explicit algorithm for finding such a c as
follows. Given n ∈ N+ and a ∈ Z with gcd(a, n) = 1, use the Euclidean algorithm to produce k, ` ∈ Z with
ak + n` = 1. As the argument shows, we can then choose c = k.
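The algorithm just described is easy to implement. The following Python sketch (our own; the helper names are ours) runs the extended Euclidean algorithm to produce k, ℓ with ak + nℓ = 1 and then returns the inverse k reduced mod n:

```python
def extended_gcd(a, b):
    # returns (g, x, y) with g = gcd(a, b) and a*x + b*y = g
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def inverse_mod(a, n):
    # the multiplicative inverse of a modulo n, when gcd(a, n) = 1
    g, k, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a and n are not relatively prime")
    return k % n

print(inverse_mod(4, 5))   # 4, since 4 * 4 = 16 ≡ 1 (mod 5)
print(inverse_mod(5, 6))   # 5, since 5 * 5 = 25 ≡ 1 (mod 6)
```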
As a consequence of the above result and the fact that multiplication is well-defined (Proposition 3.6.4),
we see that if a ≡ b (mod n), then gcd(a, n) = 1 if and only if gcd(b, n) = 1. Of course we could have proved
this without this result: Suppose that a ≡ b (mod n). We then have that n | (a − b), so we may fix k ∈ Z
with nk = a − b. This gives a = kn + b so gcd(a, n) = gcd(b, n) by Corollary 2.4.6.
Now that we know which elements have multiplicative inverses, if we trim down to these elements then
we obtain an abelian group.
Proposition 4.4.4. Let n ∈ N+ and let G = Z/≡n , i.e. the elements of G are the equivalence classes of
Z under the equivalence relation ≡n . Let U be the following subset of G:
U = {[a] : a ∈ Z and gcd(a, n) = 1}
Define a binary operation · on U by letting [a] · [b] = [ab]. We then have that (U, ·, [1]) is an abelian group.
Proof. Since · is an associative and commutative operation on Z/nZ with identity element 1, this follows
immediately from Corollary 4.3.5 and Proposition 4.4.3.
Definition 4.4.5. Let n ∈ N+ . We denote the above abelian group by U (Z/nZ).
Again, this notation may seem a bit odd, but we will come back and explain it in context when we get
to ring theory. To see some examples of Cayley tables of these groups, here is the Cayley table of U (Z/5Z):
· 1 2 3 4
1 1 2 3 4
2 2 4 1 3
3 3 1 4 2
4 4 3 2 1
Finally, we give the Cayley table of U (Z/8Z) because we will return to this group later as an important
example:
· 1 3 5 7
1 1 3 5 7
3 3 1 7 5
5 5 7 1 3
7 7 5 3 1
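A short Python sketch of our own reproduces the elements of these groups and confirms the feature of U (Z/8Z) visible in its table, namely that every element is its own inverse:

```python
from math import gcd

def units(n):
    # representatives of U(Z/nZ): those a in {1, ..., n-1} with gcd(a, n) = 1
    return [a for a in range(1, n) if gcd(a, n) == 1]

print(units(5))   # [1, 2, 3, 4]
print(units(8))   # [1, 3, 5, 7]
# in U(Z/8Z), every element squares to the identity:
print(all((a * a) % 8 == 1 for a in units(8)))   # True
```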
4.5 The Symmetric Groups
Interpret the above matrix by taking the top row as the inputs and each number below as the corresponding
output. Let’s throw another permutation τ ∈ S6 into the mix. Define
τ = ( 1 2 3 4 5 6 )
    ( 3 1 5 6 2 4 )
Let’s compute σ ◦ τ . Remember that function composition happens from right to left. That is, the composition
σ ◦ τ is obtained by performing τ first and following up by performing σ. For example, we have (σ ◦ τ )(1) = σ(τ (1)) = σ(3) = 3.
Notice also that if |A| = n, then |SA | = n!: there are n choices for where a permutation can send the first
element, then n − 1 remaining choices for the second, and so on, giving
n · (n − 1) · (n − 2) · · · 2 · 1 = n!
permutations in total.
This method of representing σ is reasonably compact, but it hides the fundamental structure of what is
happening. For example, it is difficult to “see” what the inverse of σ is, or to understand what happens
when we compose σ with itself repeatedly. We now develop another method for representing a permutation
on A called cycle notation. The basic idea is to take an element of A and follow its path through σ. For
example, let’s start with 1. We have σ(1) = 5. Now instead of moving on to deal with 2, let’s continue this
thread and determine the value σ(5). Looking above, we see that σ(5) = 4. If we continue on this path
to investigate 4, we see that σ(4) = 1, and we have found a “cycle” 1 → 5 → 4 → 1 hidden inside σ. We
will denote this cycle with the notation (1 5 4). Now that those numbers are taken care of, we start again
with the smallest number not yet claimed, which in this case is 2. We have σ(2) = 6 and following up gives
σ(6) = 2. Thus, we have found the cycle 2 → 6 → 2 and we denote this by (2 6). We have now claimed all
numbers other than 3, and when we investigate 3 we see that σ(3) = 3, so we form the sad lonely cycle (3).
Putting this all together, we write σ in cycle notation as
σ = (1 5 4)(2 6)(3)
One might wonder whether we ever get “stuck” when trying to build these cycles. What would happen if we
follow 1 and we repeat a number before coming back to 1? For example, what if we see 1 → 3 → 6 → 2 → 6?
Don’t fear because this can never happen. The only way the example above could crop up is if the purported
permutation sent both 3 and 2 to 6, which would violate the fact that the purported permutation is injective.
Also, if we finish a few cycles and start up a new one, then it is not possible that our new cycle has any
elements in common with previous ones. For example, if we already have the cycle 1 → 3 → 2 → 1 and we
start with 4, we can’t find 4 → 5 → 3 because then both 1 and 5 would map to 3.
Our conclusion is that this process of writing down a permutation in cycle notation never gets stuck and
results in writing the given permutation as a product of disjoint cycles. Working through the same process
with the permutation
τ = ( 1 2 3 4 5 6 )
    ( 3 1 5 6 2 4 )
we see that in cycle notation we have
τ = (1 3 5 2)(4 6)
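The procedure of following each element's path through a permutation translates directly into code. The following Python sketch (our own; permutations are stored as dictionaries) recovers the cycle notation of the σ and τ above:

```python
def to_cycles(perm):
    # perm maps i -> perm[i] on {1, ..., n}, given as a dict;
    # returns the disjoint-cycle decomposition as a list of tuples
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        cycles.append(tuple(cycle))
    return cycles

sigma = {1: 5, 2: 6, 3: 3, 4: 1, 5: 4, 6: 2}
tau   = {1: 3, 2: 1, 3: 5, 4: 6, 5: 2, 6: 4}
print(to_cycles(sigma))   # [(1, 5, 4), (2, 6), (3,)]
print(to_cycles(tau))     # [(1, 3, 5, 2), (4, 6)]
```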
Now we can determine σ ◦ τ in cycle notation directly from the cycle notations of σ and τ . For example,
suppose we want to calculate the following composition:
(1 2 4)(3 6)(5) ◦ (1 6 2)(3 5 4)
We want to determine the cycle notation of the resulting function, so we first need to determine where it
sends 1. Again, function composition happens from right to left. Looking at the function represented on the
right, we see the cycle containing 1 is (1 6 2), so the right function sends 1 to 6. We then go to the function
on the left and see where it sends 6. The cycle containing 6 there is (3 6), so it takes 6 and sends it to 3.
Thus, the composition sends 1 to 3. Thus, our result starts out as
(1 3
Now we need to see what happens to 3. The function on the right sends 3 to 5, and the function on the left
takes 5 and leave it alone, so we have
(1 3 5
When we move on to see what happens to 5, we notice that the right function sends it to 4 and then the left
function takes 4 to 1. Since 1 is the first element of the cycle we started with, we now close the loop and have
(1 3 5)
We now pick up the least element not yet in a cycle and continue. Working it out, we end with:
(1 3 5)(2)(4 6)
Finally, we make our notation a bit more compact with a few conventions. First, we simply omit any cycles
of length 1, so we just write (1 2 4)(3 6) instead of (1 2 4)(3 6)(5). Of course, this requires an understanding
of which n we are using to avoid ambiguity, as the notation (1 2 4)(3 6) doesn’t specify whether we are
viewing it as an element of S6 or S8 (in the latter case, the corresponding function fixes both 7 and 8). Also,
as with most group operations, we simply omit the ◦ when composing functions. Thus, we would write the
above as:
(1 2 4)(3 6)(1 6 2)(3 5 4) = (1 3 5)(2)(4 6)
Now there is potential for some conflict here. Looking at the first two cycles above, we meant to think of
(1 2 4)(3 6) as one particular function on {1, 2, 3, 4, 5, 6}, but by omitting the group operation it could also
be interpreted as (1 2 4) ◦ (3 6). Fortunately, these are exactly the same function because the cycles are
disjoint. Thus, there is no ambiguity.
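Composition in cycle notation can likewise be automated. This Python sketch (our own helper names) rebuilds the two functions from their disjoint cycles, composes them right-to-left, and recovers the answer computed by hand above:

```python
def from_cycles(cycles, n):
    # build the function {1,...,n} -> {1,...,n} from disjoint cycles
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            perm[a] = cyc[(i + 1) % len(cyc)]
    return perm

def compose(left, right):
    # (left o right)(x) = left(right(x)): apply right first
    return {x: left[right[x]] for x in right}

def to_cycles(perm):
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x); cycle.append(x); x = perm[x]
        cycles.append(tuple(cycle))
    return cycles

left  = from_cycles([(1, 2, 4), (3, 6)], 6)
right = from_cycles([(1, 6, 2), (3, 5, 4)], 6)
print(to_cycles(compose(left, right)))   # [(1, 3, 5), (2,), (4, 6)]
```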
Let’s work out everything about S3 . First, we know that |S3 | = 3! = 6. Working through the possibilities,
we determine that
S3 = {id, (1 2), (1 3), (2 3), (1 2 3), (1 3 2)}
so S3 has the identity function, three 2-cycles, and two 3-cycles. Here is the Cayley table of S3 .
◦ id (1 2) (1 3) (2 3) (1 2 3) (1 3 2)
id id (1 2) (1 3) (2 3) (1 2 3) (1 3 2)
(1 2) (1 2) id (1 3 2) (1 2 3) (2 3) (1 3)
(1 3) (1 3) (1 2 3) id (1 3 2) (1 2) (2 3)
(2 3) (2 3) (1 3 2) (1 2 3) id (1 3) (1 2)
(1 2 3) (1 2 3) (1 3) (2 3) (1 2) (1 3 2) id
(1 3 2) (1 3 2) (2 3) (1 2) (1 3) id (1 2 3)
Notice that S3 is a nonabelian group with 6 elements. In fact, it is the smallest possible nonabelian group,
as we shall see later.
We end this section with two straightforward but important results.
Proposition 4.5.4. Disjoint cycles commute. That is, if A is a set and a1 , a2 , . . . , ak , b1 , b2 , . . . , bℓ ∈ A are
all distinct, then
(a1 a2 · · · ak )(b1 b2 · · · bℓ ) = (b1 b2 · · · bℓ )(a1 a2 · · · ak )
Proof. Simply work through where each ai and bj are sent on each side. Since ai ≠ bj for all i and j, each ai
is fixed by the cycle containing the bj ’s and vice versa. Furthermore, if c ∈ A is such that c ≠ ai and c ≠ bj
for all i and j, then both cycles fix c, so both sides fix c.
Proposition 4.5.5. Let A be a set and let a1 , a2 , . . . , ak ∈ A be distinct. Let σ = (a1 a2 · · · ak ). We then
have that
σ −1 = (ak ak−1 · · · a2 a1 )
= (a1 ak ak−1 · · · a2 )
Proof. Let τ = (a1 ak ak−1 · · · a2 ). For any i with 1 ≤ i ≤ k − 1, we have
(τ ◦ σ)(ai ) = τ (σ(ai )) = τ (ai+1 ) = ai
and also
(τ ◦ σ)(ak ) = τ (σ(ak )) = τ (a1 ) = ak
Furthermore, for any i with 2 ≤ i ≤ k, we have
(σ ◦ τ )(a1 ) = σ(τ (a1 )) = σ(ak ) = a1
and also
(σ ◦ τ )(ai ) = σ(τ (ai )) = σ(ai−1 ) = ai
Finally, if c ∈ A is such that c ≠ ai for all i, then
(τ ◦ σ)(c) = τ (σ(c)) = τ (c) = c
and
(σ ◦ τ )(c) = σ(τ (c)) = σ(c) = c
Since (τ ◦ σ)(x) = x and (σ ◦ τ )(x) = x for every element x of A, we conclude that τ ◦ σ = id = σ ◦ τ .
Therefore, τ is the inverse of σ, which is what we wanted to show.
4.6 Orders of Elements
Definition 4.6.1. Let G be a group and let a ∈ G. For n ∈ N+ , we define
a^n = aaa · · · a
where there are n total a’s in the above product. If we want to be more formal, we define a^n recursively by
letting a^1 = a and a^(n+1) = a^n a for all n ∈ N+ . We also define
a^0 = e
and, for n ∈ Z with n < 0, we define
a^n = (a−1 )^|n| .
With these definitions, the usual rules of exponents hold, i.e. a^m a^n = a^(m+n) and (a^m )^n = a^(mn) for all m, n ∈ Z. For example, notice that
a^3 a^2 = (aaa)(aa) = aaaaa = a^5
and
(a^3 )^2 = a^3 a^3 = (aaa)(aaa) = aaaaaa = a^6
Definition 4.6.3. Let G be a group and let a ∈ G. Consider the set
S = {n ∈ N+ : a^n = e}
If S ≠ ∅, we let |a| = min(S) (which exists by well-ordering), and if S = ∅, we define |a| = ∞. In other
words, |a| is the least positive n such that a^n = e, provided such an n exists. We call |a| the order of a.
The reason why we choose to overload the word order for two apparently very different concepts (when
applied to a group versus when applied to an element of a group) will be explained in the next chapter on
subgroups.
Example 4.6.4. Here are some examples of computing orders of elements in a group.
• In any group G, we have |e| = 1.
• In the group (Z, +), we have |0| = 1 as noted, but |n| = ∞ for all n ≠ 0.
• In the group Z/nZ, we have |1| = n.
• In the group Z/12Z, we have |9| = 4 because
9^2 = 9 + 9 = 18 = 6      9^3 = 9 + 9 + 9 = 27 = 3      9^4 = 9 + 9 + 9 + 9 = 36 = 0
The order of an element a ∈ G is the least positive m ∈ Z with a^m = e. There may be many other
larger positive powers of a that give the identity, or even negative powers that do. For example, consider
1 ∈ Z/2Z. We have |1| = 2, but 1^4 = 0, 1^6 = 0, and 1^−14 = 0. In general, given the order of an element, we
can characterize all powers of that element which give the identity as follows.
Proposition 4.6.5. Let G be a group and let a ∈ G.
1. Suppose that |a| = m ∈ N+ . For any n ∈ Z, we have a^n = e if and only if m | n.
2. Suppose that |a| = ∞. For any n ∈ Z, we have a^n = e if and only if n = 0.
Proof. We first prove 1. Let m = |a| and notice that m > 0. We then have in particular that a^m = e.
Suppose first that n ∈ Z is such that m | n. Fix k ∈ Z with n = mk. We then have
a^n = a^(mk) = (a^m )^k = e^k = e
so a^n = e. Suppose conversely that n ∈ Z and that a^n = e. Since m > 0, we may write n = qm + r where
0 ≤ r < m. We then have
e = a^n
  = a^(qm+r)
  = a^(qm) a^r
  = (a^m )^q a^r
  = e^q a^r
  = a^r
Now by definition we know that m is the least positive power of a which gives the identity. Therefore, since
0 ≤ r < m and a^r = e, we must have that r = 0. It follows that n = qm so m | n.
We now prove 2. Suppose that |a| = ∞. If n = 0, then we have a^n = a^0 = e. If n > 0, then we have
a^n ≠ e because by definition no positive power of a equals the identity. Suppose then that n < 0 and that
a^n = e. We then have
e = e−1 = (a^n )−1 = a^−n
Now −n > 0 because n < 0, but this is a contradiction because no positive power of a gives the identity. It
follows that if n < 0, then a^n ≠ e. Therefore, a^n = e if and only if n = 0.
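In a finite group, the order of an element can be found by simply multiplying until we reach the identity. The following Python sketch (our own; op is the group operation and e the identity) confirms the computation |9| = 4 in Z/12Z from Example 4.6.4:

```python
def order(a, op, e, bound=10**6):
    # least n in N+ with a^n = e, or None if not found within the bound
    x, n = a, 1
    while x != e:
        x, n = op(x, a), n + 1
        if n > bound:
            return None
    return n

add_mod_12 = lambda x, y: (x + y) % 12
print(order(9, add_mod_12, 0))   # 4, matching |9| = 4 in Z/12Z
print(order(1, add_mod_12, 0))   # 12, matching |1| = n in Z/nZ
```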
Proposition 4.6.6. Let A be a set, let a1 , a2 , . . . , ak ∈ A be distinct, and let σ ∈ SA be the cycle σ = (a1 a2 · · · ak ). We then have |σ| = k.
Proof. If 1 ≤ i < k, we have σ^i (a1 ) = ai+1 ≠ a1 , so σ^i ≠ id. For each i, we have σ^k (ai ) = ai , so σ^k fixes each
ai . Since σ fixes all other elements of A, it follows that σ^k fixes all other elements of A. Therefore, σ^k = id
and we have shown that |σ| = k.
Proposition 4.6.7. Let A be a set and let σ ∈ SA . We then have that |σ| is the least common multiple of
the cycle lengths occurring in the cycle notation of σ.
Proof. Suppose that σ = τ1 τ2 · · · τℓ where the τi are disjoint cycles. For each i, let mi be the length of the
cycle τi , and notice that |τi | = mi by Proposition 4.6.6. Since disjoint cycles commute by Proposition 4.5.4, for any n ∈ N+ we
have
σ^n = τ1^n τ2^n · · · τℓ^n
Now if mi | n for each i, then τi^n = id for each i by Proposition 4.6.5, so σ^n = id. Conversely, suppose that
n ∈ N+ is such that there exists i with mi ∤ n. We then have that τi^n ≠ id by Proposition 4.6.5, so we may
fix a ∈ A with τi^n (a) ≠ a. Now both a and τi^n (a) are fixed by each τj with j ≠ i (because the cycles are
disjoint). Therefore σ^n (a) = τi^n (a) ≠ a, and hence σ^n ≠ id.
It follows that σ^n = id if and only if mi | n for each i. Since |σ| is the least n ∈ N+ with σ^n = id, it
follows that |σ| is the least n ∈ N+ satisfying mi | n for each i, which is to say that |σ| is the least common
multiple of the mi .
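Proposition 4.6.7 gives an immediate algorithm: read off the cycle lengths and take their least common multiple. A Python sketch of our own:

```python
from math import gcd
from functools import reduce

def lcm(a, b):
    return a * b // gcd(a, b)

def permutation_order(cycles):
    # |sigma| = lcm of the cycle lengths (Proposition 4.6.7)
    return reduce(lcm, (len(c) for c in cycles), 1)

print(permutation_order([(1, 5, 4), (2, 6), (3,)]))   # lcm(3, 2, 1) = 6
print(permutation_order([(1, 3, 5, 2), (4, 6)]))      # lcm(4, 2) = 4
```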
Suppose now that we have an element a of a group G and we know its order. How do we compute the
orders of the powers of a? For an example, consider 3 ∈ Z/30Z. It is straightforward to check that |3| = 10.
Let’s look at the orders of some powers of 3. Now 3^2 = 6, and a simple check shows that |6| = 5, so |3^2 | = 5.
Now consider 3^3 = 9. We have 9^2 = 18, then 9^3 = 27, and then 9^4 = 36 = 6. Since 30 is not a multiple of 9,
we see that we “cycle around”, and it is not quite as clear when we will hit 0 as we continue. However, if we
keep at it, we will find that |3^3 | = 10. If we keep calculating away, we will find that |3^4 | = 5 and |3^5 | = 2.
We would like to have a better way to determine these values without resorting to tedious calculations, and
that is what the next proposition supplies.
Proposition 4.6.8. Let G be a group and let a ∈ G.
1. Suppose that |a| = m ∈ N+ . For any n ∈ Z, we have |a^n | = m / gcd(m, n).
2. Suppose that |a| = ∞. For any n ∈ Z\{0}, we have |a^n | = ∞.
Proof. We first prove 1. Fix n ∈ Z and let d = gcd(m, n). Since d is a common divisor of m and n, we may
fix s, t ∈ Z with m = ds and n = dt. Notice that s > 0 because both m > 0 and d > 0. With this notation,
we need to show that |a^n | = s.
Notice that
(a^n )^s = a^(ns) = a^(dts) = a^(mt) = (a^m )^t = e^t = e
Thus, s is a positive integer k with (a^n )^k = e, and so |a^n | ≤ s.
Suppose now that k ∈ N+ with (a^n )^k = e. We need to show that s ≤ k. We have a^(nk) = e, so by
Proposition 4.6.5, we know that m | nk. Fix ℓ ∈ Z with mℓ = nk. We then have that dsℓ = dtk, so canceling
d > 0 we conclude that sℓ = tk, and hence s | tk. Now by the homework problem about least common
multiples, we know that gcd(s, t) = 1. Using Proposition 2.4.10, we conclude that s | k. Since s, k > 0, it
follows that s ≤ k.
Therefore, s is the least positive value of k such that (a^n )^k = e, and so we conclude that |a^n | = s = m / gcd(m, n).
We now prove 2. Suppose that n ∈ Z\{0}. Let k ∈ N+ . We then have (a^n )^k = a^(nk). Now nk ≠ 0 because
n ≠ 0 and k > 0, hence (a^n )^k = a^(nk) ≠ e by Proposition 4.6.5. Therefore, (a^n )^k ≠ e for all k ∈ N+ ,
and hence |a^n | = ∞.
Proposition 4.6.10. If G has an element of order n ∈ N+ , then G has an element of order d for every
positive d | n.
Proof. Suppose that G has an element of order n, and fix a ∈ G with |a| = n. Let d ∈ N+ with d | n. Fix
k ∈ N+ with kd = n. Using Proposition 4.6.8 and the fact that k | n, we have
|a^k | = n / gcd(n, k) = n / k = d
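The formula |a^n | = m / gcd(m, n) can be tested against the hand computations in Z/30Z from above. A small Python sketch (our own helper names):

```python
from math import gcd

def order_mod(a, n):
    # additive order of a in Z/nZ: least k in N+ with k*a ≡ 0 (mod n)
    k = 1
    while (k * a) % n != 0:
        k += 1
    return k

m = order_mod(3, 30)                 # |3| = 10 in Z/30Z
for j in [2, 3, 4, 5]:
    power = (j * 3) % 30             # the element 3^j (written additively)
    assert order_mod(power, 30) == m // gcd(m, j)
print("|3^j| = 10 / gcd(10, j) for j = 2, 3, 4, 5")
```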
4.7 Direct Products
Definition 4.7.1. Suppose that (Gi , ?i ) for 1 ≤ i ≤ n are all groups. Consider the Cartesian product of the
sets G1 , G2 , . . . , Gn , i.e.
G1 × G2 × · · · × Gn = {(a1 , a2 , . . . , an ) : ai ∈ Gi for 1 ≤ i ≤ n}
We define a binary operation · on this set by working coordinatewise, i.e. by letting
(a1 , a2 , . . . , an ) · (b1 , b2 , . . . , bn ) = (a1 ?1 b1 , a2 ?2 b2 , . . . , an ?n bn )
The resulting structure is called the direct product of G1 , G2 , . . . , Gn .
Proposition 4.7.2. Suppose that (Gi , ?i ) for 1 ≤ i ≤ n are all groups. The direct product defined above is
a group with the following properties:
• The identity element is (e1 , e2 , . . . , en ), where ei is the identity of Gi .
• The inverse of (a1 , a2 , . . . , an ) is (a1^−1 , a2^−1 , . . . , an^−1 ), where ai^−1 is the inverse of ai in the group Gi .
• |G1 × G2 × · · · × Gn | = |G1 | · |G2 | · · · |Gn |, i.e. the order of the direct product of the Gi is the product
of the orders of the Gi .
Proof. We first check that · is associative. Suppose that ai , bi , ci ∈ Gi for 1 ≤ i ≤ n. We have
((a1 , . . . , an ) · (b1 , . . . , bn )) · (c1 , . . . , cn ) = (a1 ?1 b1 , . . . , an ?n bn ) · (c1 , . . . , cn )
                                              = ((a1 ?1 b1 ) ?1 c1 , . . . , (an ?n bn ) ?n cn )
                                              = (a1 ?1 (b1 ?1 c1 ), . . . , an ?n (bn ?n cn ))      (each ?i is associative)
                                              = (a1 , . . . , an ) · (b1 ?1 c1 , . . . , bn ?n cn )
                                              = (a1 , . . . , an ) · ((b1 , . . . , bn ) · (c1 , . . . , cn ))
Let ei be the identity of Gi . We now check that (e1 , e2 , . . . , en ) is an identity of the direct product. Let
ai ∈ Gi for all i. We have
(a1 , a2 , . . . , an ) · (e1 , e2 , . . . , en ) = (a1 ?1 e1 , a2 ?2 e2 , . . . , an ?n en ) = (a1 , a2 , . . . , an )
and
(e1 , e2 , . . . , en ) · (a1 , a2 , . . . , an ) = (e1 ?1 a1 , e2 ?2 a2 , . . . , en ?n an ) = (a1 , a2 , . . . , an )
so (e1 , e2 , . . . , en ) is indeed an identity element. Finally, given ai ∈ Gi for all i, we have
(a1 , a2 , . . . , an ) · (a1^−1 , a2^−1 , . . . , an^−1 ) = (a1 ?1 a1^−1 , a2 ?2 a2^−1 , . . . , an ?n an^−1 ) = (e1 , e2 , . . . , en )
and
(a1^−1 , a2^−1 , . . . , an^−1 ) · (a1 , a2 , . . . , an ) = (a1^−1 ?1 a1 , a2^−1 ?2 a2 , . . . , an^−1 ?n an ) = (e1 , e2 , . . . , en )
hence (a1^−1 , a2^−1 , . . . , an^−1 ) is an inverse of (a1 , a2 , . . . , an ).
Finally, since there are |G1 | elements to put in the first coordinate of the n-tuple, |G2 | elements to put
in the second coordinates, etc., it follows that
|G1 × G2 × · · · × Gn | = |G1 | · |G2 | · · · |Gn |.
For example, consider the group G = S3 × Z (where we are considering Z as a group under addition).
Elements of G are ordered pairs (σ, n) where σ ∈ S3 and n ∈ Z. For example, ((1 2), 8), (id, −6), and
((1 3 2), 42) are all elements of G. The group operation on G is obtained by working in each coordinate
separately and performing the corresponding group operation there. For example, we have
((1 2), 8) · ((1 3 2), 42) = ((1 2) ◦ (1 3 2), 8 + 42) = ((1 3), 50).
Notice that the direct product puts two groups together in a manner that makes them completely ignore
each other. Each coordinate goes about doing its business without interacting with the others at all.
Proposition 4.7.3. Suppose that (Gi , ?i ) for 1 ≤ i ≤ n are all groups. The direct product G1 × G2 × · · · × Gn is abelian if and only if Gi is abelian for every i.
Proof. Suppose first that every Gi is abelian. For any ai , bi ∈ Gi for 1 ≤ i ≤ n, we have
(a1 , a2 , . . . , an ) · (b1 , b2 , . . . , bn ) = (a1 ?1 b1 , a2 ?2 b2 , . . . , an ?n bn )
                                      = (b1 ?1 a1 , b2 ?2 a2 , . . . , bn ?n an )
                                      = (b1 , b2 , . . . , bn ) · (a1 , a2 , . . . , an )
Therefore, G1 × G2 × · · · × Gn is abelian.
Suppose conversely that G1 × G2 × · · · × Gn is abelian. Fix i with 1 ≤ i ≤ n. Suppose that ai , bi ∈ Gi .
Consider the elements (e1 , . . . , ei−1 , ai , ei+1 , . . . , en ) and (e1 , . . . , ei−1 , bi , ei+1 , . . . , en ) in G1 ×G2 ×· · ·×Gn . Using
the fact that the direct product is abelian, we see that
(e1 , . . . , ai ?i bi , . . . , en ) = (e1 , . . . , ai , . . . , en ) · (e1 , . . . , bi , . . . , en )
                             = (e1 , . . . , bi , . . . , en ) · (e1 , . . . , ai , . . . , en )
                             = (e1 , . . . , bi ?i ai , . . . , en )
Comparing the ith coordinates of the first and last tuple, we conclude that ai ?i bi = bi ?i ai . Therefore, Gi
is abelian.
We can build all sorts of groups with this construction. For example Z/2Z × Z/2Z is an abelian group
of order 4 with elements
Z/2Z × Z/2Z = {(0, 0), (0, 1), (1, 0), (1, 1)}
In this group, the element (0, 0) is the identity and all other elements have order 2. We can also use this
construction to build nonabelian groups of various orders. For example S3 × Z/2Z is a nonabelian group of
order 12.
Proposition 4.7.4. Let G1 , G2 , . . . , Gn be groups. Let ai ∈ Gi for 1 ≤ i ≤ n. The order of the element
(a1 , a2 , . . . , an ) ∈ G1 × G2 × · · · × Gn is the least common multiple of the orders of the ai ’s in Gi .
Proof. For each i, let mi = |ai |, so mi is the order of ai in the group Gi . Now since the group operation in
the direct product works in each coordinate separately, a simple induction shows that
(a1 , a2 , . . . , an )^k = (a1^k , a2^k , . . . , an^k )
for every k ∈ N+ . Thus, if mi | k for all i, then ai^k = ei for each i by Proposition 4.6.5, and hence
(a1 , a2 , . . . , an )^k = (e1 , e2 , . . . , en ).
Conversely, suppose that k ∈ N+ is such that there exists i with mi ∤ k. For such an i, we then have that
ai^k ≠ ei by Proposition 4.6.5, so
(a1 , a2 , . . . , an )^k = (a1^k , a2^k , . . . , an^k ) ≠ (e1 , e2 , . . . , en )
It follows that (a1 , a2 , . . . , an )^k = (e1 , e2 , . . . , en ) if and only if mi | k for all i. Since |(a1 , a2 , . . . , an )| is the
least k ∈ N+ with (a1 , a2 , . . . , an )^k = (e1 , e2 , . . . , en ), it follows that |(a1 , a2 , . . . , an )| is the least k ∈ N+
satisfying mi | k for all i, which is to say that |(a1 , a2 , . . . , an )| is the least common multiple of the mi .
For example, suppose that we are working in the group S4 × Z/42Z and we consider the element
((1 4 2 3), 7). Since |(1 4 2 3)| = 4 in S4 and |7| = 6 in Z/42Z, it follows that the order of ((1 4 2 3), 7) in
S4 × Z/42Z equals lcm(4, 6) = 12.
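We can confirm the least-common-multiple behavior by direct computation. In the Python sketch below (our own), we replace the 4-cycle (1 4 2 3) by the element 1 of Z/4Z, which also has order 4, so the expected order of the pair is still lcm(4, 6) = 12:

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def order_pair(a, b, n1, n2):
    # order of (a, b) in Z/n1Z x Z/n2Z by direct computation
    x, y, k = a % n1, b % n2, 1
    while (x, y) != (0, 0):
        x, y, k = (x + a) % n1, (y + b) % n2, k + 1
    return k

# order of (1, 7) in Z/4Z x Z/42Z: lcm(|1|, |7|) = lcm(4, 6) = 12
assert order_pair(1, 7, 4, 42) == lcm(4, 42 // gcd(42, 7))
print(order_pair(1, 7, 4, 42))   # 12
```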
For another example, the group Z/2Z × Z/2Z × Z/2Z is an abelian group of order 8 in which every
nonidentity element has order 2. Generalizing this construction by taking n copies of Z/2Z, we see how to
construct an abelian group of order 2^n in which every nonidentity element has order 2.
Chapter 5
Subgroups and Cosets
5.1 Subgroups
If we have a group G, we can consider subsets of G which also happen to be a group under the same operation.
If we take a subset H of a group G and restrict the group operation to H, we trivially have that the operation
is associative on H because it is associative on G. The only issues are whether the operation remains a
binary operation on H (it could conceivably combine two elements of H and return an element not in H),
whether the identity is there, and whether the inverse of every element of H is also in H. This gives the
following definition.
Definition 5.1.1. Let G be a group. A subset H ⊆ G is a subgroup of G if all of the following are true:
1. e ∈ H.
2. ab ∈ H whenever a ∈ H and b ∈ H (i.e. H is closed under the group operation).
3. a−1 ∈ H whenever a ∈ H (i.e. H is closed under inverses).
Example 5.1.2. Here are some examples of subgroups.
• For any group G, we always have two trivial examples of subgroups. Namely G is always a subgroup
of itself, and {e} is always a subgroup of G.
• Z is a subgroup of (Q, +) and (R, +). This follows because 0 ∈ Z, the sum of two integers is an integer,
and the additive inverse of an integer is an integer.
• The set 2Z = {2n : n ∈ Z} is a subgroup of (Z, +). This follows from the fact that 0 = 2 · 0 ∈ 2Z,
that 2m + 2n = 2(m + n) so the sum of two evens is even, and that −(2n) = 2(−n) so the additive
inverse of an even number is even.
• The set H = {0, 3} is a subgroup of Z/6Z. To check that H is a subgroup, we need to check the three
conditions. We have 0 ∈ H, so H contains the identity. Also 0−1 = 0 ∈ H and 3−1 = 3 ∈ H, so the
inverse of each element of H lies in H. Finally, to check that H is closed under the group operation,
we simply have to check the four possibilities. For example, we have 3 + 3 = 6 = 0 ∈ H. The other 3
possible sums are even easier.
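Checking the three subgroup conditions for a subset of Z/nZ is easily mechanized. A Python sketch of our own (the helper name is_subgroup_mod is ours):

```python
def is_subgroup_mod(H, n):
    # check the three conditions for H ⊆ Z/nZ under addition mod n
    has_identity = 0 in H
    closed = all((a + b) % n in H for a in H for b in H)
    has_inverses = all((-a) % n in H for a in H)
    return has_identity and closed and has_inverses

print(is_subgroup_mod({0, 3}, 6))      # True
print(is_subgroup_mod({0, 1, 3}, 6))   # False: 1 + 3 = 4 is missing
```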
Example 5.1.3. Let G = (Z, +). Here are some examples of subsets of Z which are not subgroups of G.
• The set H = {0} ∪ {2n + 1 : n ∈ Z} is not a subgroup of G because even though it contains 0 and is
closed under inverses, it is not closed under the group operation. For example, 1 ∈ H and 3 ∈ H, but
1 + 3 ∉ H.
• The set N is not a subgroup of G because even though it contains 0 and is closed under the group
operation, it is not closed under inverses. For example, 1 ∈ N but −1 ∉ N.
Now it is possible that a subset of a group G forms a group under a completely different binary operation
than the one used in G, but whenever we talk about a subgroup H of G, we only think of restricting the
group operation of G down to H. For example, let G = (Q, +). The set H = Q\{0} is not a subgroup of
G (since it does not contain the identity) even though H can be made into a group with the completely
different operation of multiplication. When we consider a subset H of a group G, we only call it a subgroup
of G if it is a group with respect to the exact same binary operation.
Proposition 5.1.4. Let n ∈ N+ . The set H = {A ∈ GLn (R) : det(A) = 1} is a subgroup of GLn (R).
Proof. Letting In be the n × n identity matrix (which is the identity of GLn (R)), we have that det(In ) = 1
so In ∈ H. Suppose that M, N ∈ H, so det(M ) = 1 = det(N ). We then have
det(M N ) = det(M ) · det(N ) = 1 · 1 = 1
so M N ∈ H. Finally, if M ∈ H, then det(M −1 ) = det(M )−1 = 1, so M −1 ∈ H. Therefore, H is a subgroup of GLn (R).
Definition 5.1.5. Let n ∈ N+ . We let SLn (R) be the above subgroup of GLn (R). That is,
SLn (R) = {A ∈ GLn (R) : det(A) = 1}
The group SLn (R) is called the special linear group of degree n.
The following proposition occasionally makes the process of checking whether a subset of a group is
indeed a subgroup a bit easier.
Proposition 5.1.6. Let G be a group and let H ⊆ G. The following are equivalent:
• H is a subgroup of G.
• H ≠ ∅, and ab−1 ∈ H whenever a, b ∈ H.
Proof. Suppose first that H is a subgroup of G. By definition, we must have e ∈ H, so H 6= ∅. Suppose that
a, b ∈ H. Since H is a subgroup and b ∈ H, we must have b−1 ∈ H. Now using the fact that a ∈ H and
b−1 ∈ H, together with the second part of the definition of a subgroup, it follows that ab−1 ∈ H.
Now suppose conversely that H ≠ ∅ and ab−1 ∈ H whenever a, b ∈ H. We need to check the three
defining characteristics of a subgroup. Since H ≠ ∅, we may fix c ∈ H. Using our condition and the fact that
c ∈ H, it follows that e = cc−1 ∈ H, so we have checked the first property. Now using the fact that e ∈ H,
given any a ∈ H we have a−1 = ea−1 ∈ H by our condition, so we have checked the third property. Suppose
now that a, b ∈ H. From what we just showed, we know that b−1 ∈ H. Therefore, using our condition,
we conclude that ab = a(b−1 )−1 ∈ H, so we have verified the second property. We have shown that all 3
properties hold for H, so H is a subgroup of G.
Proposition 5.1.7. Let G be a group. If H and K are both subgroups of G, then H ∩ K is a subgroup of G.
5.2 Generating Subgroups
Definition 5.2.1. Let G be a group and let c ∈ G. We define
⟨c⟩ = {c^n : n ∈ Z}
Proposition 5.2.2. Let G be a group, let c ∈ G, and let H = ⟨c⟩.
1. H is a subgroup of G with c ∈ H.
2. If K is a subgroup of G with c ∈ K, then H ⊆ K.
Proof. We first prove 1. Notice that c = c^1 ∈ H and that e = c^0 ∈ H.
• Suppose that a, b ∈ H. Fix m, n ∈ Z with a = c^m and b = c^n . We then have
ab = c^m c^n = c^(m+n)
so ab ∈ H because m + n ∈ Z.
• Suppose that a ∈ H. Fix m ∈ Z with a = c^m . We then have
a−1 = (c^m )−1 = c^−m
so a−1 ∈ H because −m ∈ Z.
Therefore, H is a subgroup of G.
We now prove 2. Suppose that K is a subgroup of G with c ∈ K. We first prove by induction on n ∈ N+
that c^n ∈ K. We clearly have c^1 = c ∈ K by assumption. Suppose that n ∈ N+ and we know that c^n ∈ K.
Since c^n ∈ K and c ∈ K, and K is a subgroup of G, it follows that c^(n+1) = c^n c ∈ K. Therefore, by induction,
we know that c^n ∈ K for all n ∈ N+ . Now c^0 = e ∈ K because K is a subgroup of G, so c^n ∈ K for all
n ∈ N. Finally, if n ∈ Z with n < 0, then c^−n ∈ K because −n ∈ N+ , and hence c^n = (c^−n )−1 ∈ K because
inverses of elements of K must be in K. Therefore, c^n ∈ K for all n ∈ Z, which is to say that H ⊆ K.
For example, suppose that we are working with the group G = Z under addition. Since the group
operation is addition, given c, n ∈ Z, we have that c^n (under the general group theory definition) equals nc
(under the usual definition of multiplication). Therefore,
⟨c⟩ = {nc : n ∈ Z}
Proposition 5.2.3. Let G be a group, let c ∈ G, and let H = ⟨c⟩.
1. If |c| = m ∈ N+ , then H = {c^i : 0 ≤ i < m}, and the elements c^i for 0 ≤ i < m are distinct. In
particular, |⟨c⟩| = |c|.
2. If |c| = ∞, then the elements c^k for k ∈ Z are all distinct.
Proof. We first prove 1. Since every element of {c^i : 0 ≤ i < m} is a power of c, we clearly have
{c^i : 0 ≤ i < m} ⊆ H.
Suppose now that a ∈ H, and fix n ∈ Z with a = c^n . Write n = qm + r for the unique q, r ∈ Z with
0 ≤ r < m. We then have
c^n = c^(qm+r) = (c^m )^q c^r = e^q c^r = c^r
so c^n = c^r ∈ {c^i : 0 ≤ i < m}. Therefore, H ⊆ {c^i : 0 ≤ i < m}, and combining this with the reverse
inclusion above, we conclude that H = {c^i : 0 ≤ i < m}.
Suppose now that 0 ≤ k < ℓ < m. Assume for the sake of obtaining a contradiction that c^k = c^ℓ .
Multiplying both sides by c^−k on the right, we see that c^k c^−k = c^ℓ c^−k , hence
e = c^(ℓ−k)
Now we have 0 ≤ k < ℓ < m, so 0 < ℓ − k < m. This contradicts the assumption that m = |c| is the least
positive power of c giving the identity. Hence, we must have c^k ≠ c^ℓ .
We now prove 2. Suppose that k, ℓ ∈ Z with k < ℓ. Assume that c^k = c^ℓ . As in part 1, we can multiply
both sides on the right by c^−k to conclude that c^(ℓ−k) = e. Now ℓ − k > 0, so this contradicts the assumption
that |c| = ∞. Therefore, we must have c^k ≠ c^ℓ .
Corollary 5.2.4. If G is a finite group, then every element of G has finite order. Moreover, for each a ∈ G,
we have |a| ≤ |G|.
Proof. Let a ∈ G. We then have that ⟨a⟩ ⊆ G, so |⟨a⟩| ≤ |G|. The result follows because |a| = |⟨a⟩|.
In fact, much more is true. We will see as a consequence of Lagrange’s Theorem that the order of every
element of a finite group G is actually a divisor of |G|.
Definition 5.2.5. A group G is cyclic if there exists c ∈ G such that G = ⟨c⟩. An element c ∈ G with
G = ⟨c⟩ is called a generator of G.
For example, for each n ∈ N+ , the group Z/nZ is cyclic because 1 is a generator (since 1^k = k for all k,
where the power 1^k here means 1 added to itself k times). Also, Z is cyclic because 1 is a generator
(remember that ⟨c⟩ is the set of all powers of c, both positive and negative). In general, a cyclic group has
many generators. For example, −1 is a generator of Z, and 3 is a generator of Z/5Z as we saw above.
Proposition 5.2.6. Let G be a finite group with |G| = n. An element c ∈ G is a generator of G if and only
if |c| = n. In particular, G is cyclic if and only if it has an element of order n.
Proof. Suppose first that c is a generator of G, so that G = ⟨c⟩. We know from above that |c| = |⟨c⟩|, so
|c| = |G| = n. Suppose conversely that |c| = n. We then know that |⟨c⟩| = n. Since ⟨c⟩ ⊆ G and each has n
elements, we must have that G = ⟨c⟩.
For an example of a noncyclic group, consider S3 . The order of each element of S3 is either 1, 2, or 3, so
S3 has no element of order 6. Thus, S3 has no generators, and so S3 is not cyclic. We could also conclude
that S3 is not cyclic by using the following result.
Proposition 5.2.7. All cyclic groups are abelian.
Proof. Suppose that G is a cyclic group and fix c ∈ G with G = ⟨c⟩. Let a, b ∈ G. Since G = ⟨c⟩, we may
fix m, n ∈ Z with a = c^m and b = c^n . We then have
ab = c^m c^n
   = c^(m+n)
   = c^(n+m)
   = c^n c^m
   = ba.
What happens if we want to find the smallest subgroup ⟨c, d⟩ containing two elements c, d ∈ G? A natural guess is the set
{c^m d^n : m, n ∈ Z}
but unless G is abelian there is no reason to think that this set is closed under multiplication. For example,
we must have cdc ∈ ⟨c, d⟩, but it doesn’t obviously appear there. Whatever ⟨c, d⟩ is, it must contain the
following elements:
cdcdcdc      c−1 dc−1 d−1 c      c^3 d^6 c^−2 d^7
If we take 3 generating elements, it gets even more complicated, because we can alternate the 3 elements in
such sequences without any repetitive pattern. If we have infinitely many elements, it gets even worse. Since
the constructions get messy, we define the subgroup generated by an arbitrary set in a much less explicit
manner. The key idea comes from Proposition 5.2.2, which says that hci is the “smallest” subgroup of G
containing c as an element. Here, “smallest” does not mean the least number of elements, but instead means
the subgroup that is a subset of all other subgroups containing the elements in question.
Proposition 5.2.8. Let G be a group and let A ⊆ G. There exists a subgroup H of G with the following
properties:
• A ⊆ H.
• Whenever K is a subgroup of G with the property that A ⊆ K, we have H ⊆ K.
Furthermore, the subgroup H is unique (i.e. if both H1 and H2 have the above properties, then H1 = H2 ).
Proof. The idea is to intersect all of the subgroups of G that contain A, and argue that the result is a
subgroup. Since there might be infinitely many such subgroups, we can not simply appeal to Proposition
5.1.7. However, our argument is very similar.
We first prove existence. Notice that there is at least one subgroup of G containing A, namely G itself.
Define
H = {a ∈ G : a ∈ K for all subgroups K of G such that A ⊆ K}
Notice we certainly have A ⊆ H by definition. Moreover, if K is a subgroup of G with the property that
A ⊆ K, then we have H ⊆ K by definition of H. We now show that H is indeed a subgroup of G.
• Since e ∈ K for every subgroup K of G with A ⊆ K, we have e ∈ H.
• Let a, b ∈ H. For any subgroup K of G such that A ⊆ K, we must have both a, b ∈ K by definition of
H, hence ab ∈ K because K is a subgroup. Since this is true for all such K, we conclude that ab ∈ H
by definition of H.
• Let a ∈ H. For any subgroup K of G such that A ⊆ K, we must have a ∈ K by definition of H,
hence a−1 ∈ K because K is a subgroup. Since this is true for all such K, we conclude that a−1 ∈ H
by definition of H.
Combining these three, we conclude that H is a subgroup of G. This finishes the proof of existence.
Finally, suppose that H1 and H2 both have the above properties. Since H2 is a subgroup of G with
A ⊆ H2 , we know that H1 ⊆ H2 . Similarly, since H1 is a subgroup of G with A ⊆ H1 , we know that
H2 ⊆ H1 . Therefore, H1 = H2 .
Definition 5.2.9. Let G be a group and let A ⊆ G. We define hAi to be the unique subgroup H of G given
by Proposition 5.2.8. If A = {a1 , a2 , . . . , an }, we write ha1 , a2 , . . . , an i rather than h{a1 , a2 , . . . , an }i.
For example, consider the group G = S3 , and let H = h(1 2), (1 2 3)i. We know that H is a subgroup of
G, so id ∈ H. Furthermore, we must have (1 3 2) = (1 2 3)(1 2 3) ∈ H. We also must have
(1 3) = (1 2 3)(1 2) ∈ H
and
(2 3) = (1 2)(1 2 3) ∈ H.
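Although hAi was defined abstractly as an intersection, in a finite group one can compute it concretely by starting from the identity and repeatedly multiplying by generators until nothing new appears. Here is a small Python sketch of that closure process (my own illustration, not from the notes), applied to the example above; a permutation σ is stored as the tuple (σ(1), . . . , σ(n)):

    # Sketch: compute the subgroup generated by a set of permutations by
    # closing under composition (in a finite group, inverses come for free,
    # since a^{-1} is a positive power of a).
    def compose(s, t):                       # (s o t)(x) = s(t(x))
        return tuple(s[t[x] - 1] for x in range(len(s)))

    def generated(gens):
        group = {tuple(range(1, len(gens[0]) + 1))}   # start with id
        frontier = list(group)
        while frontier:
            g = frontier.pop()
            for a in gens:
                h = compose(g, a)
                if h not in group:           # found a new element
                    group.add(h)
                    frontier.append(h)
        return group

    # (1 2) = (2, 1, 3) and (1 2 3) = (2, 3, 1):
    print(len(generated([(2, 1, 3), (2, 3, 1)])))     # 6

The output 6 confirms the computation above: h(1 2), (1 2 3)i is all of S3 .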
(1 2 3) = (1 3)(1 2)
Although it’s not immediately obvious, notice that τ and π are obtained by swapping just two elements in
the bottom row of σ. More formally, they are obtained by composing with a transposition first, and we have
τ = σ ◦ (3 4) and π = σ ◦ (2 5). We now examine the inversions in each of these permutations. Notice that
it is typically easier to determine these in the two-row representation rather than in cycle notation:
From this example, it may seem puzzling to see how the inversions are related. However, there is something
quite interesting that is happening. Let’s examine the relationship between Inv(σ) and Inv(τ ). By swapping
the third and fourth positions in the second row, the inversion (1, 3) in σ became the inversion (1, 4) in τ ,
and the inversion (4, 5) in σ became the inversion (3, 5) in τ , so those match up. However, we added a new
inversion by this swap: although originally we had σ(3) < σ(4), the swapping made τ (3) > τ (4).
This accounts for the one additional inversion in τ . If instead we had σ(3) > σ(4), then this swap would
have lost an inversion. However, in either case, this example illustrates that a swapping of two adjacent
numbers either increases or decreases the number of inversions by 1.
Lemma 5.3.2. Suppose σ ∈ Sn . If µ is a transposition consisting of two adjacent numbers, say µ = (k k+1),
then |Inv(σ)| and |Inv(σ ◦ µ)| differ by 1.
Proof. Notice first that if i < j and neither i nor j is in {k, k + 1}, then (i, j) ∈ Inv(σ ◦ µ) if and only if
(i, j) ∈ Inv(σ), because µ fixes both i and j. Now since µ(k) = k + 1, µ(k + 1) = k, and µ fixes all other
numbers, given any i with i < k, we have (i, k) ∈ Inv(σ ◦ µ) if and only if (i, k + 1) ∈ Inv(σ), and
(i, k + 1) ∈ Inv(σ ◦ µ) if and only if (i, k) ∈ Inv(σ). Similarly, for any j with k + 1 < j, the pairs (k, j) and
(k + 1, j) trade places in the same way. Thus, the inversions other than (k, k + 1) match up in pairs. Finally,
the pair (k, k + 1) is an element of exactly one of Inv(σ) and Inv(σ ◦ µ),
because if σ(k) > σ(k+1) then (σ◦µ)(k) < (σ◦µ)(k+1), while if σ(k) < σ(k+1) then (σ◦µ)(k) > (σ◦µ)(k+1).
Since we have a bijection between Inv(σ)\{(k, k + 1)} and Inv(σ ◦ µ)\{(k, k + 1)}, while (k, k + 1) is exactly
one of the sets Inv(σ) and Inv(σ ◦ µ), it follows that |Inv(σ)| and |Inv(σ ◦ µ)| differ by 1.
A similar analysis is more difficult to perform on π because the swapping involved two non-adjacent
numbers. As a result, elements in the middle had slightly more complicated interactions, and the above
example shows that a swap of this type can sizably increase the number of inversions. Although it is possible
to handle it directly, the key idea is to realize we can perform this swap through a sequence of adjacent
swaps. This leads to the following result.
Corollary 5.3.3. Suppose σ ∈ Sn . If µ is a transposition, then |Inv(σ)| 6≡ |Inv(σ ◦ µ)| (mod 2), i.e. the
parity of the number of inversions of σ does not equal the parity of the number of inversions of σ ◦ µ.
Proof. Let µ be a transposition, and write µ = (k `) where k < `. The key fact is that we can write (k `) as
a composition of 2(` − k) − 1 many adjacent transpositions in succession. In other words, we have
(k `) = (k k + 1) ◦ (k + 1 k + 2) ◦ · · · ◦ (` − 2 ` − 1) ◦ (` − 1 `) ◦ (` − 2 ` − 1) ◦ · · · ◦ (k + 1 k + 2) ◦ (k k + 1)
Therefore, σ ◦ µ equals
σ ◦ (k k + 1) ◦ (k + 1 k + 2) ◦ · · · ◦ (` − 2 ` − 1) ◦ (` − 1 `) ◦ (` − 2 ` − 1) · · · ◦ (k + 1 k + 2) ◦ (k k + 1)
Using associativity, we can handle each of these in succession, and use Lemma 5.3.2 to conclude that each
changes the number of inversions by 1 (either increasing or decreasing it). Since there are an odd number of
adjacent transpositions, the result follows.
Definition 5.3.4. Let n ∈ N+ . We define a function ε : Sn → {1, −1} by letting ε(σ) = (−1)|Inv(σ)| , i.e.
ε(σ) = 1 if σ has an even number of inversions, and ε(σ) = −1 if σ has an odd number of inversions.
Proposition 5.3.5. Let n ∈ N+ and let σ ∈ Sn . Suppose that σ = µ1 ◦ µ2 ◦ · · · ◦ µm where each µi is a
transposition. If m is even, then ε(σ) = 1, and if m is odd, then ε(σ) = −1.
Proof. Notice that
σ = id ◦ µ1 ◦ µ2 ◦ · · · ◦ µm
Now |Inv(id)| = 0, so using Corollary 5.3.3 repeatedly, we conclude that |Inv(id ◦ µ1 )| is odd, and then
|Inv(id ◦ µ1 ◦ µ2 )| is even, etc. In general, a straightforward induction on k shows that
|Inv(id ◦ µ1 ◦ µ2 · · · ◦ µk )|
is odd if k is odd, and is even if k is even. Thus, if m is even, then ε(σ) = 1, and if m is odd, then
ε(σ) = −1.
Corollary 5.3.6. It is impossible for a permutation to be written as both a product of an even number of
transpositions and as a product of an odd number of transpositions.
Proof. If we could write a permutation σ in both ways, then Proposition 5.3.5 would give both ε(σ) = 1 and
ε(σ) = −1, a contradiction.
Definition 5.3.7. Let n ∈ N+ and let σ ∈ Sn . If ε(σ) = 1, then we say that σ is an even permutation. If
ε(σ) = −1, then we say that σ is an odd permutation.
Proposition 5.3.8. Let n ∈ N+ . For all σ, τ ∈ Sn , we have ε(στ ) = ε(σ) · ε(τ ).
Proof. If either σ = id or τ = id, this is immediate from the fact that ε(id) = 1. We now handle the various
cases:
• If σ and τ can both be written as a product of an even number of transpositions, then στ can also be
written as a product of an even number of transpositions, so we have ε(σ) = 1, ε(τ ) = 1, and ε(στ ) = 1
by Proposition 5.3.5.
• If σ and τ can both be written as a product of an odd number of transpositions, then στ can be written
as a product of an even number of transpositions, so we have ε(σ) = −1, ε(τ ) = −1, and ε(στ ) = 1 by
Proposition 5.3.5.
• If one of σ and τ can be written as a product of an even number of transpositions and the other as a
product of an odd number, then στ can be written as a product of an odd number of transpositions,
so ε(στ ) = −1 = ε(σ) · ε(τ ) by Proposition 5.3.5.
Proposition 5.3.9. Let n ∈ N+ and let σ ∈ Sn be a k-cycle. If k is even, then σ is an odd permutation,
and if k is odd, then σ is an even permutation.
Proof. Write σ = (a1 a2 a3 · · · ak−1 ak ) where the ai are distinct. As above, we have
σ = (a1 ak ) ◦ (a1 ak−1 ) ◦ · · · ◦ (a1 a3 ) ◦ (a1 a2 )
Thus, σ is the product of k − 1 many transpositions. If k is an even number, then k − 1 is an odd number,
and hence σ is an odd permutation. If k is an odd number, then k − 1 is an even number, and hence σ is an
even permutation.
Using Proposition 5.3.8 and Proposition 5.3.9, we can now easily compute the sign of any permutation
once we’ve written it in cycle notation. For example, suppose that
σ = (1 2 3 4 5)(6 7)(8 9)
We then have
ε(σ) = ε((1 2 3 4 5)) · ε((6 7)) · ε((8 9)) = 1 · (−1) · (−1) = 1
so σ is an even permutation.
Proposition 5.3.10. Let n ∈ N+ . The set
An = {σ ∈ Sn : ε(σ) = 1}
is a subgroup of Sn .
Proof. We have ε(id) = (−1)0 = 1, so id ∈ An . Suppose that σ, τ ∈ An , so that ε(σ) = 1 = ε(τ ). We then
have
ε(στ ) = ε(σ) · ε(τ ) = 1 · 1 = 1
so στ ∈ An . Finally, suppose that σ ∈ An . We then have σσ −1 = id so
1 = ε(id)
= ε(σσ −1 )
= ε(σ) · ε(σ −1 )
= 1 · ε(σ −1 )
= ε(σ −1 )
so σ −1 ∈ An .
Definition 5.3.11. The subgroup An = {σ ∈ Sn : ε(σ) = 1} of Sn is called the alternating group of degree
n.
Let’s take a look at a few small examples. We trivially have A1 = {id}, and we also have A2 = {id}
because (1 2) is an odd permutation. The group S3 has 6 elements: the identity, three 2-cycles, and two
3-cycles, so A3 = {id, (1 2 3), (1 3 2)}. When we examine S4 , we see that it contains the following:
• The identity.
• Six 4-cycles.
• Eight 3-cycles.
• Six 2-cycles.
• Three products of two disjoint 2-cycles.
Now of these, A4 consists of the identity, the eight 3-cycles, and the three products of two disjoint 2-cycles,
so |A4 | = 12. In general, we have the following.
Proposition 5.3.12. For any n ≥ 2, we have |An | = n!/2.
Proof. Define a function f : An → Sn by letting f (σ) = σ(1 2). We first claim that f is injective. To see
this, suppose that σ, τ ∈ An and that f (σ) = f (τ ). We then have σ(1 2) = τ (1 2). Multiplying on the right
by (1 2), we conclude that σ = τ . Therefore, f is injective.
We next claim that range(f ) = Sn \An . Suppose first that σ ∈ An . We then have ε(σ) = 1, so
ε(f (σ)) = ε(σ(1 2)) = ε(σ) · ε((1 2)) = 1 · (−1) = −1
so f (σ) ∈ Sn \An . Conversely, suppose that τ ∈ Sn \An so that ε(τ ) = −1. We then have
ε(τ (1 2)) = ε(τ ) · ε((1 2)) = (−1) · (−1) = 1
so τ (1 2) ∈ An . Now
f (τ (1 2)) = τ (1 2)(1 2) = τ
so τ ∈ range(f ). It follows that range(f ) = Sn \An . Therefore, f maps An bijectively onto Sn \An and hence
|An | = |Sn \An |. Since Sn is the disjoint union of these two sets and |Sn | = n!, it follows that |An | = n!/2.
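For small n, Proposition 5.3.12 is easy to confirm by exhaustive enumeration. A quick Python check (my illustration, not from the notes):

    # Sketch: count the even permutations in S4 and compare with 4!/2.
    from itertools import combinations, permutations

    def is_even(perm):
        return sum(1 for i, j in combinations(range(len(perm)), 2)
                   if perm[i] > perm[j]) % 2 == 0

    evens = [p for p in permutations(range(1, 5)) if is_even(p)]
    print(len(evens))        # 12 = 4!/2, matching |A4| above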
5.4 The Dihedral Groups

Recall that the dihedral group Dn is the subgroup of Sn generated by the n-cycle r = (1 2 3 · · · n) and the
“flip” s, which is a product of disjoint 2-cycles fixing the vertex 1. Products of r and s can be written down
in many different ways, and we would like to find a way to simplify such expressions; the next proposition
is the primary tool that we will need.
Proposition 5.4.3. Let n ≥ 3. We have the following
1. |r| = n.
2. |s| = 2.
3. sr = r−1 s = rn−1 s.
4. For all k ∈ N+ , we have srk = r−k s.
Proof. We have |r| = n because r is an n-cycle and |s| = 2 because s is a product of disjoint 2-cycles. We
now check that sr = r−1 s. First notice that
r−1 = (n n − 1 · · · 3 2 1) = (1 n n − 1 · · · 3 2)
and a direct computation of where each of the products sr and r−1 s sends every element of {1, 2, . . . , n}
(treating the cases of n even and n odd separately, since the disjoint 2-cycles making up s depend on the
parity of n) shows that the two products agree in every case,
so sr = r−1 s. Thus, sr = r−1 s in all cases. Since we know that |r| = n, we have rrn−1 = id and rn−1 r = id,
so r−1 = rn−1 . It follows that r−1 s = rn−1 s.
The last statement now follows by induction from the third.
We now give an example of how to use this proposition to simplify such expressions in the case n = 5. We
have
sr2 sr4 = (sr2 )(sr4 ) = (r−2 s)(sr4 ) = r−2 s2 r4 = r−2 r4 = r2
Theorem 5.4.4. Let n ≥ 3. We have the following.
1. Dn = {ri sk : 0 ≤ i ≤ n − 1, 0 ≤ k ≤ 1}
2. The 2n elements listed in part 1 are distinct.
Proof. Using the fundamental relations that |r| = n, that |s| = 2, and that srk = r−k s for all k ∈ N+ , the
above argument shows that any product of r, s, and their inverses equals ri sk for some i, k with 0 ≤ i ≤ n−1
and 0 ≤ k ≤ 1. To be more precise, one can use the above relations to show that the set
{ri sk : 0 ≤ i ≤ n − 1, 0 ≤ k ≤ 1}
is closed under multiplication and under inverses, so it equals Dn . I will leave such a check for you if you
would like to work through it. This gives part 1.
Suppose now that ri sk = rj s` with 0 ≤ i, j ≤ n − 1 and 0 ≤ k, ` ≤ 1. Multiplying on the left by r−j and
on the right by s−k , we see that
ri−j = s`−k
Suppose for the sake of obtaining a contradiction that k 6= `. Since k, ` ∈ {0, 1}, we must have `−k ∈ {−1, 1},
so as s−1 = s it follows that ri−j = s`−k = s. Now we have s(1) = 1, so we must have ri−j (1) = 1 as well.
This implies that n | (i − j), so as −n < i − j < n, we conclude that i − j = 0. Thus ri−j = id, and we
conclude that s = id, a contradiction. Therefore, we must have k = `. It follows that s`−k = s0 = id, so
ri−j = id. Using the fact that |r| = n now, we see that n | (i − j), and as above this implies that i − j = 0
so i = j.
Corollary 5.4.5. Let n ≥ 3. We then have that Dn is a nonabelian group with |Dn | = 2n.
Proof. We claim that rs 6= sr. Suppose instead that rs = sr. Since we know that sr = r−1 s, it follows that
rs = r−1 s. Canceling the s on the right, we see that r = r−1 . Multiplying on the left by r we see that
r2 = id, but this is a contradiction because |r| = n ≥ 3. Therefore, rs 6= sr and hence Dn is nonabelian.
Now by the previous theorem we know that
Dn = {ri sk : 0 ≤ i ≤ n − 1, 0 ≤ k ≤ 1}
and the second part of the theorem tells us that the 2n elements described in that set are distinct. Therefore,
|Dn | = 2n.
We now come back around to giving a geometric justification for why the elements of Dn exactly cor-
respond to the symmetries of the regular n-gon. We have described why every element of Dn does indeed
give a symmetry (because both r and s do, and the set of symmetries must be closed under composition
and inversion), so we need only understand why all possible symmetries of the regular n-gon arise from an
element of Dn . To determine a symmetry, we first need to decide where the vertex labeled 1 is sent.
We have n possible choices for where to send it, and suppose we send it to the original position of vertex k.
Once we have sent vertex 1 to the position of vertex k, we now need to determine where vertex 2 is sent.
Now vertex 2 must go to one of the vertices adjacent to k, so we only have 2 choices for where to send it.
Finally, once we’ve determined these two vertices (where vertex 1 and vertex 2 go), the rest of the n-gon is
determined because we have completely determined where an entire edge goes. Thus, there are a total of
n · 2 = 2n many possible symmetries. Since |Dn | = 2n, it follows that all symmetries are given by elements
of Dn .
Finally notice that D3 = S3 simply because D3 is a subgroup of S3 and |D3 | = 6 = |S3 |. In other words,
any permutation of the vertices of an equilateral triangle is obtainable via a rigid motion of the triangle.
However, if n ≥ 4, then |Dn | is much smaller than |Sn | as most permutations of the vertices of a regular
n-gon can not be obtained from a rigid motion.
We end with the Cayley table of D4 :
◦ id r r2 r3 s rs r2 s r3 s
id id r r2 r3 s rs r2 s r3 s
r r r2 r3 id rs r2 s r3 s s
r2 r2 r3 id r r2 s r3 s s rs
r3 r3 id r r2 r3 s s rs r2 s
s s r3 s r2 s rs id r3 r2 r
rs rs s r3 s r2 s r id r3 r2
r2 s r2 s rs s r3 s r2 r id r3
r3 s r3 s r2 s rs s r3 r2 r id
5.5 The Quaternion Group

Consider the following three matrices, each with entries in C:
A = [ 0 1 ; −1 0 ]    B = [ 0 i ; i 0 ]    C = [ i 0 ; 0 −i ]
(here [ a b ; c d ] denotes the 2 × 2 matrix with first row a, b and second row c, d). For A, we have
A2 = [ 0 1 ; −1 0 ][ 0 1 ; −1 0 ] = [ −1 0 ; 0 −1 ] = −I
For B, we have
B2 = [ 0 i ; i 0 ][ 0 i ; i 0 ] = [ −1 0 ; 0 −1 ] = −I
For C, we have
C2 = [ i 0 ; 0 −i ][ i 0 ; 0 −i ] = [ −1 0 ; 0 −1 ] = −I
Since
A2 = B 2 = C 2 = −I
it follows that
A4 = B 4 = C 4 = I
Thus, each of A, B, and C has order at most 4. One can check that none of them have order 3 (either directly,
using general results on powers giving the identity, or using the fact that otherwise A3 = I = A4 , so A = I,
a contradiction). In particular, each of these matrices is an element of GL2 (C) because A · A3 = I = A3 · A
(and similarly for B and C).
We have
AB = [ 0 1 ; −1 0 ][ 0 i ; i 0 ] = [ i 0 ; 0 −i ] = C
and
BA = [ 0 i ; i 0 ][ 0 1 ; −1 0 ] = [ −i 0 ; 0 i ] = −C
We also have
CA = [ i 0 ; 0 −i ][ 0 1 ; −1 0 ] = [ 0 i ; i 0 ] = B
and
AC = [ 0 1 ; −1 0 ][ i 0 ; 0 −i ] = [ 0 −i ; −i 0 ] = −B
Finally, we have
BC = [ 0 i ; i 0 ][ i 0 ; 0 −i ] = [ 0 1 ; −1 0 ] = A
and
CB = [ i 0 ; 0 −i ][ 0 i ; i 0 ] = [ 0 −1 ; 1 0 ] = −A
It follows that the set {I, −I, A, −A, B, −B, C, −C} is closed under multiplication and inverses, so it is a
subgroup of GL2 (C) with 8 elements. This subgroup of 8 elements is called the quaternion group. However,
just like in the situation for Dn , we typically give these elements other names and forget that they are
matrices (just as we often forget that the elements of Dn are really elements of Sn ).
Definition 5.5.1. The quaternion group is the group on the set {1, −1, i, −i, j, −j, k, −k} where we define
i2 = −1 j 2 = −1 k 2 = −1
ij = k jk = i ki = j
ji = −k kj = −i ik = −j
and all other multiplications in the natural way. We denote this group by Q8 .
5.6 The Center of a Group

Definition 5.6.1. Let G be a group. The center of G is the set
Z(G) = {a ∈ G : ag = ga for all g ∈ G}
of elements that commute with every element of G.
We claim that Z(G) is a subgroup of G. First, e ∈ Z(G) because eg = g = ge for all g ∈ G. Next, let
a, b ∈ Z(G). For any g ∈ G, we have
g(ab) = (ga)b = (ag)b = a(gb) = a(bg) = (ab)g
Since g ∈ G was arbitrary, we conclude that g(ab) = (ab)g for all g ∈ G, so ab ∈ Z(G). Thus, Z(G) is
closed under multiplication.
Finally, let a ∈ Z(G), and let g ∈ G be arbitrary. Since a ∈ Z(G), we have
ga = ag
Multiplying this equation on the left by a−1 gives
a−1 ga = a−1 ag
so
a−1 ga = g
Now multiplying this equation on the right by a−1 gives
a−1 gaa−1 = ga−1
so
a−1 g = ga−1
Since g ∈ G was arbitrary, we conclude that ga−1 = a−1 g for all g ∈ G, so a−1 ∈ Z(G). Thus, Z(G)
is closed under inverses.
We now calculate Z(G) for many of the groups G that we have encountered.
• First notice that if G is an abelian group, then we trivially have that Z(G) = G. In particular, we
have Z(Z/nZ) = Z/nZ and Z(U (Z/nZ)) = U (Z/nZ) for all n ∈ N+ .
• On the homework, we proved that if n ≥ 3 and σ ∈ Sn \{id}, then there exists τ ∈ Sn with στ 6= τ σ,
so σ ∈/ Z(Sn ). Since the identity of a group is always in the center, it follows that Z(Sn ) = {id} for all
n ≥ 3. Notice that we also have Z(S1 ) = {id} trivially, but Z(S2 ) = {id, (1 2)} = S2 .
• If n ≥ 4, then Z(An ) = {id}. Suppose that σ ∈ An with σ 6= id. We have that σ : {1, 2, . . . , n} →
{1, 2, . . . , n} is a bijection which is not the identity map. Thus, we may fix i ∈ {1, 2, . . . , n} with
σ(i) 6= i, say σ(i) = j. Since n ≥ 4, we may fix k, ` ∈ {1, 2, . . . , n}\{i, j} with k 6= `. Define
τ : {1, 2, . . . , n} → {1, 2, . . . , n} by letting τ (j) = k, τ (k) = `, τ (`) = j, and τ (m) = m for all other m.
In other words, τ is the 3-cycle (j k `), so τ ∈ An . We then have
(τ ◦ σ)(i) = τ (σ(i)) = τ (j) = k
and that
(σ ◦ τ )(i) = σ(τ (i)) = σ(i) = j.
Since j 6= k, we have shown that the functions σ ◦ τ and τ ◦ σ disagree on i, and hence are distinct
functions. In other words, in An we have στ 6= τ σ. Thus, Z(An ) = {id} if n ≥ 4.
Notice that A1 = {id} and A2 = {id}, so we also have Z(An ) = {id} trivially when n ≤ 2. However,
when n = 3, we have that A3 = {id, (1 2 3), (1 3 2)} = h(1 2 3)i is cyclic and hence abelian, so
Z(A3 ) = A3 .
• We have Z(Q8 ) = {1, −1}. We trivially have that 1 ∈ Z(Q8 ), and showing that −1 ∈ Z(Q8 ) is simply
a matter of checking the various possibilities. Now i ∈/ Z(Q8 ) because ij = k but ji = −k, and
−i ∈/ Z(Q8 ) because (−i)j = −k and j(−i) = k. Similar arguments show that the other four elements
are not in Z(Q8 ) either.
• We claim that
Z(GL2 (R)) = { [ r 0 ; 0 r ] : r ∈ R\{0} }
To see this, first notice that if r ∈ R\{0}, then [ r 0 ; 0 r ] ∈ GL2 (R) because [ 1/r 0 ; 0 1/r ] is an
inverse. Now for any matrix [ a b ; c d ] ∈ GL2 (R), we have
[ r 0 ; 0 r ][ a b ; c d ] = [ ra rb ; rc rd ] = [ a b ; c d ][ r 0 ; 0 r ]
Therefore
{ [ r 0 ; 0 r ] : r ∈ R\{0} } ⊆ Z(GL2 (R))
Suppose now that A ∈ Z(GL2 (R)), i.e. that AB = BA for all B ∈ GL2 (R). Write
A = [ a b ; c d ]
Comparing the entries of AB and BA for the particular choices B = [ 1 1 ; 0 1 ] and B = [ 1 0 ; 1 1 ]
shows that c = 0, b = 0, and a = d, so A = [ a 0 ; 0 a ] with a 6= 0. This gives the reverse inclusion,
and the claim follows.
5.7 Cosets
Suppose that G is a group and that H is a subgroup of G. The idea we want to explore is how to collapse
the elements of H by considering them all to be “trivial” like the identity e. If we want this idea to work, we
would then want to identify two elements a, b ∈ G if we can get from one to the other via multiplication by
a “trivial” element. In other words, we want to identify elements a and b if there exists h ∈ H with ah = b.
For example, suppose that G is the group R2 = R × R under addition (so (a1 , b1 ) + (a2 , b2 ) = (a1 +
a2 , b1 + b2 )) and that H = {(0, b) : b ∈ R} is the y-axis. Notice that H is a subgroup of G. We want to consider
everything on the y-axis, that is every pair of the form (0, b), as trivial. Now if we want the y-axis to be
considered “trivial”, then we would want to consider two points to be the “same” if we can get from one to
the other by adding an element of the y-axis. Thus, we would want to identify (a1 , b1 ) with (a2 , b2 ) if and
only if a1 = a2 , because then we can add the “trivial” point (0, b2 − b1 ) to (a1 , b1 ) to get (a2 , b2 ).
Let’s move on to the group G = Z. Let n ∈ N+ and consider H = hni = nZ = {nk : k ∈ Z}. In this
situation, we want to consider all multiples of n to be “equal” to 0, and in general we want to consider
a, b ∈ Z to be equal if we can add some multiple of n to a in order to obtain b. In other words, we want to
identify a and b if and only if there exists k ∈ Z with a + kn = b. Working it out, we want to identify a
and b if and only if b − a is a multiple of n, i.e. if and only if a ≡ b (mod n). Thus, in the special case of
the subgroup nZ of Z, we recover the fundamental ideas of modular arithmetic and our eventual definition
of Z/nZ.
Left Cosets
Definition 5.7.1. Let G be a group and let H be a subgroup of G. We define a relation ∼H on G by letting
a ∼H b mean that there exists h ∈ H with ah = b.
Proposition 5.7.2. Let G be a group and let H be a subgroup of G. The relation ∼H is an equivalence
relation on G.
Proof. We check the three properties.
• Reflexive: Let a ∈ G. We have that ae = a and we know that e ∈ H because H is a subgroup of G, so
a ∼H a.
• Symmetric: Let a, b ∈ G with a ∼H b. Fix h ∈ H with ah = b. Multiplying on the right by h−1 , we
see that bh−1 = a. Now h−1 ∈ H because H is a subgroup of G, so b ∼H a.
• Transitive: Let a, b, c ∈ G with a ∼H b and b ∼H c. Fix h1 , h2 ∈ H with ah1 = b and bh2 = c. We
then have
a(h1 h2 ) = (ah1 )h2 = bh2 = c
Now h1 h2 ∈ H because H is a subgroup of G, so a ∼H c.
Therefore, ∼H is an equivalence relation on G.
The next proposition is a useful little rephrasing of when a ∼H b.
Proposition 5.7.3. Let G be a group and let H be a subgroup of G. Given a, b ∈ G, we have a ∼H b if and
only if a−1 b ∈ H.
Proof. Suppose first that a, b ∈ G satisfy a ∼H b. Fix h ∈ H with ah = b. Multiplying on the left by a−1 ,
we conclude that h = a−1 b, so a−1 b = h ∈ H.
Suppose conversely that a−1 b ∈ H. Since
a(a−1 b) = (aa−1 )b = eb = b
it follows that a ∼H b.
Definition 5.7.4. Let G be a group and let H be a subgroup of G. Under the equivalence relation ∼H , we
have
a = {b ∈ G : a ∼H b}
= {b ∈ G : There exists h ∈ H with b = ah}
= {ah : h ∈ H}
= aH.
We call the sets of the form aH the left cosets of H in G.
Example 5.7.5. Let G = S3 and let H = h(1 2)i = {id, (1 2)}. Determine the left cosets of H in G.
Proof. The left cosets are the sets σH for σ ∈ G. For example, we have the left coset
idH = {id ◦ id, id ◦ (1 2)} = {id, (1 2)}
and also
(1 2)H = {(1 2) ◦ id, (1 2)(1 2)} = {(1 2), id}
Thus, the two left cosets idH and (1 2)H are equal. This should not be surprising because
id−1 ◦ (1 2) = id ◦ (1 2) = (1 2) ∈ H
Alternatively, we can simply note that (1 2) ∈ idH and (1 2) ∈ (1 2)H, so since the left cosets idH and (1 2)H
intersect nontrivially, we know immediately that they must be equal. Working through all the examples, we
compute σH for each of the six σ ∈ S3 :
idH = (1 2)H = {id, (1 2)}
(1 3)H = (1 2 3)H = {(1 3), (1 2 3)}
(2 3)H = (1 3 2)H = {(2 3), (1 3 2)}
Thus, there are exactly 3 distinct left cosets of H in S3 .
For another example, let G be the group R2 = R × R under addition, and let H be the subgroup
H = {(0, b) : b ∈ R}
equal to the y-axis (or equivalently the line x = 0). Since the operation in G is +, we will denote the left
coset of (a, b) ∈ G by (a, b) + H (rather than (a, b)H). Let’s consider the left coset (3, 0) + H. We have
(3, 0) + H = {(3, 0) + (0, b) : b ∈ R} = {(3, b) : b ∈ R}
so (3, 0) + H gives the line x = 3. Thus, the left coset (3, 0) + H is the translation of the line x = 0. Let’s now
consider the coset (3, 5) + H. We have
(3, 5) + H = {(3, 5) + (0, b) : b ∈ R} = {(3, 5 + b) : b ∈ R} = {(3, b) : b ∈ R}
Notice that we could have obtained this with less work by noting that the inverse of (3, 5) in G is (−3, −5)
and (−3, −5) + (3, 0) = (0, −5) ∈ H, hence (3, 5) + H = (3, 0) + H. Therefore, the left coset (3, 5) + H also
gives the line x = 3. Notice that every element of H when hit by (3, 5) translates right 3 and shifts up 5,
but as a set this latter shift of 5 is washed away.
Finally, let’s consider G = Z (under addition) and H = nZ = {nk : k ∈ Z} where n ∈ N+ . Notice that
given a, b ∈ Z, we have
a ∼nZ b ⇐⇒ there exists k ∈ Z with a + nk = b ⇐⇒ n | (b − a) ⇐⇒ a ≡ b (mod n)
Therefore the two relations ∼nZ and ≡n are precisely the same, and we have recovered congruence modulo n
as a special case of our general construction. Since the relations are the same, they have the same equivalence
classes. Hence, the equivalence class of a under the equivalence relation ≡n , which in the past we denoted
by a, equals the equivalence class of a under the equivalence relation ∼nZ , which is the left coset a + nZ.
Right Cosets
In the previous section, we defined a ∼H b to mean that there exists h ∈ H with ah = b. Thus, we considered
two elements of G to be equivalent if we could get from a to b through multiplication by an element of H
on the right of a. In particular, with this definition, we saw that when we consider G = S3 and H = h(1 2)i,
we have (1 3) ∼H (1 2 3) because (1 3)(1 2) = (1 2 3).
What happens if we switch things up? For the rest of this section, completely ignore the definition of
∼H defined above because we will redefine it on the other side now.
Definition 5.7.6. Let G be a group and let H be a subgroup of G. We define a relation ∼H on G by letting
a ∼H b mean that there exists h ∈ H with ha = b.
The following results are proved exactly as above just working on the other side.
Proposition 5.7.7. Let G be a group and let H be a subgroup of G. The relation ∼H is an equivalence
relation on G.
Proposition 5.7.8. Let G be a group and let H be a subgroup of G. Given a, b ∈ G, we have a ∼H b if and
only if ab−1 ∈ H.
Now it would be nice if this new equivalence relation was the same as the original equivalence relation.
Too bad. In general, they are different! For example, with this new equivalence relation, we do not have
(1 3) ∼H (1 2 3) because id ◦ (1 3) = (1 3) and (1 2)(1 3) = (1 3 2). Now that we know that the two relations
differ in general, we should think about the equivalence classes of this new equivalence relation.
Definition 5.7.9. Let G be a group and let H be a subgroup of G. Under the equivalence relation ∼H , we
have
a = {b ∈ G : a ∼H b}
= {b ∈ G : There exists h ∈ H with b = ha}
= {ha : h ∈ H}
= Ha.
We call the sets of the form Ha the right cosets of H in G.
Example 5.7.10. Let G = S3 and let H = h(1 2)i = {id, (1 2)}. Determine the right cosets of H in G.
Proof. The right cosets are the sets Hσ for σ ∈ G. Working through the computations as above, we find:
Hid = H(1 2) = {id, (1 2)}
H(1 3) = H(1 3 2) = {(1 3), (1 3 2)}
H(2 3) = H(1 2 3) = {(2 3), (1 2 3)}
Thus, there are exactly 3 distinct right cosets of H in S3 .
Notice that although we obtained both 3 left cosets and 3 right cosets, these cosets were different. In
particular, we have (1 3)H 6= H(1 3). In other words, it is not true in general that the left coset aH equals
the right coset Ha. Notice that this is fundamentally an issue because S3 is nonabelian. If we were working
in an abelian group G with a subgroup H of G, then ah = ha for all h ∈ H, so aH = Ha.
As in the left coset section, using Proposition 5.7.8, we have the following fundamental way of determining
when two right cosets are equal:
Ha = Hb ⇐⇒ ab−1 ∈ H ⇐⇒ ba−1 ∈ H
Index of a Subgroup
As we saw in the previous sections, it is not in general true that left cosets are right cosets and vice versa.
However, in the one example we saw above, we at least had the same number of left cosets as we had right
cosets. This is a general and important fact, which we now establish.
First, let’s once again state the fundamental way to tell when two left cosets are equal and when two
right cosets are equal.
aH = bH ⇐⇒ a−1 b ∈ H ⇐⇒ b−1 a ∈ H
Ha = Hb ⇐⇒ ab−1 ∈ H ⇐⇒ ba−1 ∈ H
Suppose now that G is a group and that H is a subgroup of G. Let LH be the set of left cosets of H in
G and let RH be the set of right cosets of H in G. We will show that |LH | = |RH | by defining a bijection
f : LH → RH . Now the natural idea is to define f by letting f (aH) = Ha. However, we need to be very
careful. Recall that the left cosets of H in G are the equivalence classes of a certain equivalence relation.
By “defining” f as above we are giving a definition based on particular representatives of these equivalence
classes, and it may be possible that aH = bH but Ha 6= Hb. In other words, we must determine whether f
is well-defined.
In fact, in general the above f is not well-defined. Consider our standard example of G = S3 and
H = h(1 2)i. Checking our above computations, we have (1 3)H = (1 2 3)H but H(1 3) 6= H(1 2 3).
Therefore, in this particular case, that choice of f is not well-defined. We need to define f differently to
make it well-defined in general, and the following lemma is the key to do so.
Lemma 5.7.11. Let H be a subgroup of G and let a, b ∈ G. The following are equivalent.
1. aH = bH
2. Ha−1 = Hb−1
Proof. Suppose that aH = bH. We then have that a−1 b ∈ H so using the fact that (b−1 )−1 = b, we see that
a−1 (b−1 )−1 ∈ H. It follows that Ha−1 = Hb−1 .
Suppose conversely that Ha−1 = Hb−1 . We then have that a−1 (b−1 )−1 ∈ H, so using the fact that
(b−1 )−1 = b, we see that a−1 b ∈ H. It follows that aH = bH.
Proposition 5.7.12. Let G be a group and let H be a subgroup of G. Let LH be the set of left cosets of H
in G and let RH be the set of right cosets of H in G. Define f : LH → RH by letting f (aH) = Ha−1 . We
then have that f is a well-defined bijection from LH onto RH . In particular, |LH | = |RH |.
Proof. Notice that f is well-defined by the above lemma because if aH = bH, then Ha−1 = Hb−1 , i.e.
f (aH) = f (bH).
We next check that f is injective. Suppose that f (aH) = f (bH), so that Ha−1 = Hb−1 . By the other
direction of the lemma, we have that aH = bH. Therefore, f is injective.
Finally, we need to check that f is surjective. Fix an element of RH , say Hb. We then have that
b−1 H ∈ LH and
f (b−1 H) = H(b−1 )−1 = Hb.
Hence, range(f ) = RH , so f is surjective.
Putting it all together, we conclude that f is a well-defined bijection from LH onto RH .
Definition 5.7.13. Let G be a group and let H be a subgroup of G. We define [G : H] to be the number of
left cosets of H in G (or equivalently the number of right cosets of H in G). That is, [G : H] is the number
of equivalence classes of G under the equivalence relation ∼H . If there are infinitely many left cosets (or
equivalently infinitely many right cosets), we write [G : H] = ∞. We call [G : H] the index of H in G.
For example, we saw above that [S3 : h(1 2)i] = 3. For any n ∈ N+ , we have [Z : nZ] = n because the
left cosets of nZ in Z are 0 + nZ, 1 + nZ, . . . , (n − 1) + nZ.
5.8 Lagrange’s Theorem

Proposition 5.8.1. Let G be a group and let H be a subgroup of G. Let a ∈ G. Define a function
f : H → aH by letting f (h) = ah. We then have that f is a bijection, so |H| = |aH|. In other words, all left
cosets of H in G have the same size.
Proof. Notice that f is surjective because if b ∈ aH, then we may fix h ∈ H with b = ah, and notice that
f (h) = ah = b, so b ∈ range(f ). Suppose that h1 , h2 ∈ H and that f (h1 ) = f (h2 ). We then have that
ah1 = ah2 , so by canceling the a’s on the left (i.e. multiplying on the left by a−1 ), we conclude that h1 = h2 .
Therefore, f is injective. Putting this together with the fact that f is surjective, we conclude that f is a
bijection. The result follows.
Theorem 5.8.2 (Lagrange’s Theorem). Let G be a finite group and let H be a subgroup of G. We have
|G| = [G : H] · |H|.
Proof. The left cosets of H in G partition G, because they are the equivalence classes of the equivalence
relation ∼H . There are [G : H] many of them, and each has exactly |H| elements by Proposition 5.8.1.
Therefore, |G| = [G : H] · |H|.
For example, instead of finding all of the left cosets of h(1 2)i in S3 to determine that [S3 : h(1 2)i] = 3,
we could have simply calculated
[S3 : h(1 2)i] = |S3 | / |h(1 2)i| = 6/2 = 3.
Notice the assumption in Lagrange’s Theorem that G is finite. It makes no sense to calculate [Z : nZ] in
this manner (if you try to write ∞/∞ you will make me very angry). We end this section with several simple
consequences of Lagrange’s Theorem.
consequences of Lagrange’s Theorem.
Corollary 5.8.3. Let G be a finite group. Let K be a subgroup of G and let H be a subgroup of K. We
then have
[G : H] = [G : K] · [K : H].
Proof. Since G is finite (and hence trivially both H and K are finite) we may use Lagrange’s Theorem to
note that
[G : K] · [K : H] = (|G|/|K|) · (|K|/|H|) = |G|/|H| = [G : H].
Corollary 5.8.4. Let G be a finite group and let a ∈ G. We then have that |a| divides |G|.
Proof. Let H = hai. By Proposition 5.2.3, we know that H is a subgroup of G and |a| = |H|. Therefore, by
Lagrange’s Theorem, we may conclude that |a| divides |G|.
Corollary 5.8.5. Let G be a finite group. We then have that a|G| = e for all a ∈ G.
Proof. Let m = |a|. By the previous corollary, we know that m | |G|. Therefore, by Proposition 4.6.5, it
follows that a|G| = e.
Theorem 5.8.6. Every group of prime order is cyclic (so in particular every group of prime order is abelian).
In fact, if G is a group of prime order, then every nonidentity element is a generator of G.
Proof. Let p = |G| be the prime order of G. Suppose that c ∈ G with c 6= e. We know that |c| divides p, so
as p is prime we must have that either |c| = 1 or |c| = p. Since c 6= e, we must have |c| = p. By Proposition
5.2.6, we conclude that c is a generator of G. Hence, every nonidentity element of G is a generator of G.
Theorem 5.8.7 (Euler’s Theorem). Let n ∈ N+ and let a ∈ Z with gcd(a, n) = 1. We then have aϕ(n) ≡ 1
(mod n).
Proof. We apply Corollary 5.8.4 to the group U (Z/nZ). Since gcd(a, n) = 1, we have that a ∈ U (Z/nZ).
Since |U (Z/nZ)| = ϕ(n), Corollary 5.8.4 tells us that aϕ(n) = 1. Therefore, aϕ(n) ≡ 1 (mod n).
Corollary 5.8.8 (Fermat’s Little Theorem). Let p ∈ N+ be prime. We have the following.
1. If a ∈ Z and p - a, then ap−1 ≡ 1 (mod p).
2. For all a ∈ Z, we have ap ≡ a (mod p).
Proof. We first prove 1. Suppose that a ∈ Z with p - a. We then have that gcd(a, p) = 1 (because gcd(a, p)
divides p, so it must be either 1 or p, but it can not be p since p - a). Now using the fact that ϕ(p) = p − 1,
we conclude from Euler’s Theorem that ap−1 ≡ 1 (mod p).
We next prove 2. Suppose that a ∈ Z. If p - a, then ap−1 ≡ 1 (mod p) by part 1, so multiplying both
sides by a gives ap ≡ a (mod p). Now if p | a, then we trivially have p | ap as well, so p | (ap − a) and hence
ap ≡ a (mod p).
Chapter 6

Quotients and Homomorphisms

6.1 Quotients of Abelian Groups
Proposition 6.1.1. Let G be an abelian group and let H be a subgroup of G. Suppose that a ∼H c and
b ∼H d. We then have that ab ∼H cd.
Proof. Fix h, k ∈ H with ah = c and bk = d. Using the fact that G is abelian, we have
cd = ahbk = abhk.
Since hk ∈ H, it follows that ab ∼H cd.
Notice how fundamentally we used the fact that G was abelian in this proof to write hb = bh. Overcoming
this apparent stumbling block will be our primary focus when we get to nonabelian groups.
For example, suppose that G = R2 and H = {(0, b) : b ∈ R} is the y-axis. Again, we will write (a, b) + H
for the left coset of H in G. The elements of the quotient group G/H are the left cosets of H in G, which
we know are the set of vertical lines in the plane. Let’s examine how we add two elements of G/H. The
definition says that we add left cosets by finding representatives of those cosets, adding those representatives,
and then taking the left coset of the result. In other words, to add two vertical lines, we pick points on the
lines, add those points, and output the line containing the result. For example, we add the cosets (3, 2) + H
and (4, −7) + H by computing (3, 2) + (4, −7) = (7, −5) and outputting the corresponding coset (7, −5) + H.
In other words, we have
((3, 2) + H) + ((4, −7) + H) = (7, −5) + H.
Now we could have chosen different representatives of those two cosets. For example, we have (3, 2) + H =
(3, 16) + H (after all both are on the line x = 3) and (4, −7) + H = (4, 1) + H, and if we calculate the sum
using these representatives we see that
((3, 16) + H) + ((4, 1) + H) = (7, 17) + H.
Now although the elements of G given by (7, −5) and (7, 17) are different, the cosets (7, −5) + H and
(7, 17) + H are equal because (7, −5) ∼H (7, 17).
We are now ready to formally define the quotient of an abelian group G by a subgroup H. We verify
that the given definition really is a group in the proposition immediately after the definition.
Definition 6.1.2. Let G be an abelian group and let H be a subgroup of G. We define a new group, called
the quotient of G by H and denoted G/H, by letting the elements be the left cosets of H in G (i.e. the
equivalence classes of G under ∼H ), and defining the binary operation aH · bH = (ab)H. The identity is eH
(where e is the identity of G) and the inverse of aH is a−1 H.
Proposition 6.1.3. Let G be an abelian group and let H be a subgroup of G. The set G/H with the operation
just defined is indeed a group with |G/H| = [G : H]. Furthermore, it is an abelian group.
Proof. We verified that the operation aH · bH = (ab)H is well-defined in Proposition 6.1.1. With that in
hand, we just need to check the group axioms.
We first check that · is an associative operation on G/H. For any a, b, c ∈ G we have
(aH · bH) · cH = (ab)H · cH = ((ab)c)H = (a(bc))H = aH · (bc)H = aH · (bH · cH).
For the identity, notice that given any a ∈ G, we have
aH · eH = (ae)H = aH
and
eH · aH = (ea)H = aH.
For inverses, notice that given any a ∈ G, we have
aH · a−1 H = (aa−1 )H = eH
and
a−1 H · aH = (a−1 a)H = eH.
Thus, G/H is indeed a group, and it has order [G : H] because the elements are the left cosets of H in G.
Finally, we verify that G/H is abelian by noting that for any a, b ∈ G, we have
aH · bH = (ab)H
= (ba)H (since G is abelian)
= bH · aH.
Here is another example. Suppose that G = U (Z/18Z) and let H = h17i = {1, 17}. We then have that
the left cosets of H in G are:
1H = 17H = {1, 17}
5H = 13H = {5, 13}
7H = 11H = {7, 11}
Therefore, |G/H| = 3. To multiply two cosets, we choose representatives and multiply. For example, we
could calculate
5H · 7H = (5 · 7)H = 17H.
We can multiply the exact same two cosets using different representatives. For example, we have 7H = 11H,
so we could calculate
5H · 11H = (5 · 11)H = 1H.
Notice that we obtained the same answer since 1H = 17H. Now there is no canonical choice of representatives
for the various cosets, so if you want to give each element of G/H a unique “name”, then you simply have
to pick which representative of each coset you will use. We will choose (somewhat arbitrarily) to view G/H
as the following:
G/H = {1H, 5H, 7H}.
Here is the Cayley table of G/H using these choices of representatives.
· 1H 5H 7H
1H 1H 5H 7H
5H 5H 7H 1H
7H 7H 1H 5H
Again, notice that using the definition we have 5H · 7H = 17H, but since 17 was not one of our chosen
representatives and 17H = 1H where 1 is one of our chosen representatives, we used 5H · 7H = 1H in the
above table.
Finally, let us take a moment to realize that we have been dealing with quotient groups all along when
working with Z/nZ. This group is exactly the quotient of the group G = Z under addition by the subgroup
H = nZ = {nk : k ∈ Z}. Recall from Section 5.7 that given a, b ∈ Z, we have
a ∼nZ b ⇐⇒ a ≡ b (mod n)
so the left cosets of nZ are precisely the equivalence classes of ≡n . Stated in symbols, if a is the equivalence
class of a under ≡n , then a = a + nZ (we are again using + in left cosets because that is the operation in
Z). Furthermore, our definition of the operation in Z/nZ was given by
a + b = a + b, that is, (a + nZ) + (b + nZ) = (a + b) + nZ, which is exactly the rule of adding left cosets
via representatives.
6.2 Normal Subgroups and Quotient Groups

Given a group G, a subgroup H of G, and an element g ∈ G, we define
gHg −1 = {ghg −1 : h ∈ H}
Proposition 6.2.2. Let G be a group and let H be a subgroup of G. The following are equivalent.
1. For all g ∈ G and all h ∈ H, there exists ` ∈ H with hg = g`.
2. For all g ∈ G and all h ∈ H, there exists ` ∈ H with gh = `g.
3. g −1 hg ∈ H for all g ∈ G and all h ∈ H.
4. ghg −1 ∈ H for all g ∈ G and all h ∈ H.
5. gHg −1 ⊆ H for all g ∈ G.
6. gHg −1 = H for all g ∈ G.
7. Hg ⊆ gH for all g ∈ G.
8. gH ⊆ Hg for all g ∈ G.
9. gH = Hg for all g ∈ G.
Proof. 1 ⇒ 2: Suppose that we know 1. Let g ∈ G and let h ∈ H. Applying 1 with g −1 ∈ G and h ∈ H, we
may fix ` ∈ H with hg −1 = g −1 `. Multiplying on the left by g we see that ghg −1 = `, and then multiplying
on the right by g we conclude that gh = `g.
2 ⇒ 1: Suppose that we know 2. Let g ∈ G and let h ∈ H. Applying 2 with g −1 ∈ G and h ∈ H, we may
fix ` ∈ H with g −1 h = `g −1 . Multiplying on the right by g we see that g −1 hg = `, and then multiplying on
the left by g we conclude that hg = g`.
1 ⇒ 3: Suppose that we know 1. Let g ∈ G and let h ∈ H. By 1, we may fix ` ∈ H with hg = g`.
Multiplying on the left by g −1 , we see that g −1 hg = ` ∈ H. Since g ∈ G and h ∈ H were arbitrary, we
conclude that g −1 hg ∈ H for all g ∈ G and all h ∈ H.
3 ⇒ 1: Suppose that we know 3. Let g ∈ G and let h ∈ H. By 3, we have g −1 hg ∈ H, and since
g(g −1 hg) = hg, we may take ` = g −1 hg to witness 1.
3 ⇔ 4: These follow from each other by replacing g with g −1 .
4 ⇔ 5: These two are simply restatements of each other.
5 ⇒ 6: Suppose that we know 5, so that gHg −1 ⊆ H for all g ∈ G. Fix g ∈ G. We need only show that
H ⊆ gHg −1 . Let h ∈ H. Applying 5 to g −1 , we see that g −1 hg = g −1 h(g −1 )−1 ∈ H. Since
h = g(g −1 hg)g −1
and g −1 hg ∈ H, we see that h ∈ gHg −1 . Since g ∈ G and h ∈ H were arbitrary, it follows that H ⊆ gHg −1
for all g ∈ G. Combining this with 5, we conclude that gHg −1 = H for all g ∈ G.
6 ⇒ 5: This is trivial.
1 ⇔ 7: These two are simply restatements of each other.
2 ⇔ 8: These two are simply restatements of each other.
9 ⇒ 8: This is trivial.
7 ⇒ 9: Suppose that we know 7. Since 7 ⇒ 1 ⇒ 2 ⇒ 8, it follows that we know 8 as well. Putting 7 and
8 together we conclude 9.
As the Proposition shows, the condition we are seeking to ensure that multiplication of left cosets is
well-defined is equivalent to the condition that the left cosets of H in G are equal to the right cosets of H
in G. Thus, by adopting that condition we automatically get rid of the other problematic question of which
side to work on. This condition is shaping up to be so useful that we give the subgroups which satisfy it a
special name.
Definition 6.2.3. Let G be a group and let H be a subgroup of G. We say that H is a normal subgroup of
G if gHg −1 ⊆ H for all g ∈ G (or equivalently any of properties in the previous proposition hold).
Our entire goal in defining and exploring the concept of a normal subgroup H of a group G was to
allow us to prove that multiplication of left cosets via representatives is well-defined. It turns out that this
condition is precisely equivalent to this operation being well-defined.
Proposition 6.2.4. Let G be a group and let H be a subgroup of G. The following are equivalent.
1. H is a normal subgroup of G.
2. Whenever a, b, c, d ∈ G satisfy a ∼H c and b ∼H d, we have ab ∼H cd.
Proof. We first prove that 1 implies 2. Suppose that a, b, c, d ∈ G with a ∼H c and b ∼H d. Fix h, k ∈ H
such that ah = c and bk = d. Since H is a normal subgroup of G, we may fix ` ∈ H with hb = b`. We then
have that
cd = ahbk = ab`k.
Now `k ∈ H because H is a subgroup of G. Since cd = (ab)(`k), it follows that ab ∼H cd.
We now prove that 2 implies 1. We prove that H is a normal subgroup of G by showing that g −1 hg ∈ H
for all g ∈ G and h ∈ H. Let g ∈ G and let h ∈ H. Notice that we have e ∼H h because eh = h and g ∼H g
because ge = g. Since we are assuming 2, it follows that eg ∼H hg and hence g ∼H hg. Fix k ∈ H with
gk = hg. Multiplying on the left by g −1 , we get k = g −1 hg, so g −1 hg ∈ H. The result follows.
For example, A3 = h(1 2 3)i is a normal subgroup of S3 . To show this, we can directly compute the
cosets (although we will see a faster method soon). The left cosets of A3 in S3 are:
idA3 = (1 2 3)A3 = (1 3 2)A3 = {id, (1 2 3), (1 3 2)}
(1 2)A3 = (1 3)A3 = (2 3)A3 = {(1 2), (1 3), (2 3)}
and computing the right cosets gives exactly the same two sets, so gA3 = A3 g for all g ∈ S3 .
Proposition 6.2.5. Let G be a group and let H be a subgroup of G. If H ⊆ Z(G), then H is a normal
subgroup of G.
Proof. For any g ∈ G and h ∈ H, we have hg = gh because h ∈ Z(G). Therefore, H is a normal subgroup
of G by Condition 1 above.
Example 6.2.6. Suppose that n ≥ 4 is even and write n = 2k for k ∈ N+ . Since Z(Dn ) = {id, rk } from
the homework, we conclude that {id, rk } is a normal subgroup of Dn .
Proposition 6.2.7. Let G be a group and let H be a subgroup of G. If [G : H] = 2, then H is a normal
subgroup of G.
Proof. Since [G : H] = 2, we know that there are exactly two distinct left cosets of H in G and exactly two
distinct right cosets of H in G. One left coset of H in G is eH = H. Since the left cosets partition G and
there are only two of them, it must be the case that the other left coset of H in G is the set G\H (that is
the set G with the set H removed). Similarly, one right coset of H in G is He = H. Since the right cosets
partition G and there are only two of them, it must be the case that the other right coset of H in G is the
set G\H.
To show that H is a normal subgroup of G, we know that it suffices to show that gH = Hg for all g ∈ G
(by Proposition 6.2.2). Let g ∈ G be arbitrary. We have two cases.
• Suppose that g ∈ H. We then have that g ∈ gH and g ∈ eH, so gH ∩eH 6= ∅ and hence gH = eH = H.
We also have that g ∈ Hg and g ∈ He, so Hg ∩ He 6= ∅ and hence Hg = He = H. Therefore
gH = H = Hg.
• Suppose that g ∈/ H. We then have g ∈/ eH, so gH 6= eH, and hence we must have gH = G\H
(because it is the only other left coset). We also have g ∈/ He, so Hg 6= He, and hence Hg = G\H.
Therefore gH = G\H = Hg.
Thus, for any g ∈ G, we have gH = Hg. It follows that H is a normal subgroup of G.
Since [S3 : A3 ] = 2, this proposition gives a different way to prove that A3 is a normal subgroup of S3
without doing all of the calculations we did above. In fact, we get the following.
Corollary 6.2.8. An is a normal subgroup of Sn for all n ∈ N+ .
Proof. This is trivial if n = 1. When n ≥ 2, we have
[Sn : An ] = |Sn | / |An | = n! / (n!/2) = 2 (by Proposition 5.3.12)
so An is a normal subgroup of Sn by Proposition 6.2.7.
Just as in the abelian case, whenever H is a normal subgroup of G, we define G/H to be the set of left
cosets of H in G (i.e. the equivalence classes of G under ∼H ), with the operation aH · bH = (ab)H.
Proposition 6.2.11. Let G be a group and let H be a normal subgroup of G. The set G/H with the
operation just defined is indeed a group with |G/H| = [G : H].
Proof. We verified that the operation aH · bH = (ab)H is well-defined in Proposition 6.2.4. With that in
hand, we just need to check the group axioms.
We first check that · is an associative operation on G/H. For any a, b, c ∈ G we have
(aH · bH) · cH = (ab)H · cH = ((ab)c)H = (a(bc))H = aH · (bc)H = aH · (bH · cH).
For the identity, notice that given any a ∈ G, we have
aH · eH = (ae)H = aH
and
eH · aH = (ea)H = aH.
For inverses, notice that given any a ∈ G, we have
aH · a−1 H = (aa−1 )H = eH
and
a−1 H · aH = (a−1 a)H = eH.
Thus, G/H is indeed a group, and it has order [G : H] because the elements are the left cosets of H in G.
For example, suppose that G = D4 and H = Z(G) = {id, r2 }. We know from Proposition 6.2.5 that H
is a normal subgroup of G. The left cosets (and hence right cosets because H is normal in G) of H in G are:
• idH = r2 H = {id, r2 }
• rH = r3 H = {r, r3 }
• sH = r2 sH = {s, r2 s}
• rsH = r3 sH = {rs, r3 s}
As usual, there are no “best” choices of representatives for these cosets when we consider G/H. We choose
to take
G/H = {idH, rH, sH, rsH}.
The Cayley table of G/H using these representatives is:
· idH rH sH rsH
idH idH rH sH rsH
rH rH idH rsH sH
sH sH rsH idH rH
rsH rsH sH rH idH
Notice that we had to switch to our “chosen” representatives several times when constructing this table. For
example, we have
rH · rH = r2 H = idH
and
sH · rH = srH = r−1 sH = r3 sH = rsH.
Examining the table, we see a few interesting facts. The group G/H is abelian even though G is not abelian.
Furthermore, every nonidentity element of G/H has order 2 even though r itself has order 4. We next see
how the order of an element in the quotient relates to the order of the representative in the original group.
Proposition 6.2.12. Let G be a group and let H be a normal subgroup of G. Suppose that a ∈ G has finite
order. The order of aH (in the group G/H) is finite and divides the order of a (in the group G).
Proof. Let n = |a|, so that an = e in G. We then have
(aH)n = an H = eH.
Now eH is the identity of G/H, so we have found some power of the element aH which gives the identity in
G/H. Thus, aH has finite order. Let m = |aH| (in the group G/H). Since we checked that the nth power of
aH gives the identity, it follows from Proposition 4.6.5 that m | n.
It is possible that |aH| is strictly smaller than |a|. In the above example of D4 , we have |r| = 4 but
|rH| = 2. Notice however that 2 | 4 as the previous proposition proves must be the case.
We now show how it is possible to prove theorems about groups using quotients and induction. The
general idea is as follows. Given a group G, proper subgroups of G and nontrivial quotients of G (that is by
normal subgroups other than {e} and G) are “smaller” than G. So the idea is to prove a result about finite
groups by using induction on the order of the group. By the inductive hypothesis, we know information
about the proper subgroups and nontrivial quotients, so the hope is to piece together that information to
prove the result about G. We give an example of this technique by proving the following theorem.
Theorem 6.2.13. Let p ∈ N+ be prime. If G is a finite abelian group with p | |G|, then G has an element
of order p.
Proof. The proof is by induction on |G|. If |G| = 1, then the result is trivial because p - 1 (if you don’t
like this vacuous base case, simply note that if |G| = p, then every nonidentity element of G has order p by
Lagrange’s Theorem). Suppose then that G is a finite abelian group with p | |G|, and suppose that the result
is true for all abelian groups K satisfying p | |K| and |K| < |G|. Fix a ∈ G with a 6= e, and let H = hai. We
then have that
H is a normal subgroup of G because G is abelian. We now have two cases.
• Case 1: Suppose that p | |a|. Then by Proposition 4.6.10, G has an element of order p.
• Case 2: Suppose that p - |a|. In this case, consider the quotient group G/H. We have
|G/H| = [G : H] = |G|/|H|
so |G| = |H| · |G/H|. Since p | |G| and p - |H| (because |H| = |a|), it follows that p | |G/H|. Now
|G/H| < |G| because |H| > 1, so by induction there exists an element bH ∈ G/H with |bH| = p. By
Proposition 6.2.12, we may conclude that p | |b|. Therefore, by Proposition 4.6.10, G has an element
of order p.
In either case, we have concluded that G has an element of order p. The result follows by induction.
Notice where we used the two fundamental assumptions that p is prime and that G is abelian. For the
prime assumption, we used the key fact that if p | ab, then either p | a or p | b. In fact, the result is not
true if you leave out the assumption that p is prime. The abelian group U (Z/8Z) has order 4 but it has no
element of order 4.
We made use of the abelian assumption to get that H was a normal subgroup of G without any work.
In general, with a little massaging, we could slightly alter the above proof as long as we could assume that
every group has some normal subgroup other than the two trivial normal subgroups {e} and G (because this
would allow us to either use induction on H or on G/H). Unfortunately, it is not in general true that every
group always has such a normal subgroup. Those that do not are given a name.
Definition 6.2.14. A group G is simple if |G| > 1 and the only normal subgroups of G are {e} and G.
A simple group is a group that we are unable to “break up” into a smaller normal subgroup H and
corresponding smaller quotient G/H. They are the “atoms” of the groups and are analogous to the primes.
In fact, for every prime p, the abelian group Z/pZ is simple for the trivial reason that its only subgroups
at all are {0} and all of Z/pZ by Lagrange’s Theorem. It turns out that these are the only simple abelian
groups (see homework). Now if these were the only simple groups at all, there would not be much of a problem
because they are quite easy to get a handle on. However, there are infinitely many finite simple nonabelian
groups. For example, we will see later that An is a simple group for all n ≥ 5. The existence of these groups
is a serious obstruction to any inductive proof for all groups in the above style. Nonetheless, the above
theorem is true for all groups (including nonabelian ones), and the result is known as Cauchy’s Theorem.
We will prove it later using more advanced tools, but we will start that proof knowing that we have already
handled the abelian case.
6.3 Isomorphisms
Definitions and Examples
We have developed several ways to construct groups. We started with well known groups like Z, Q and
GLn (R). From there, we introduced the groups Z/nZ and U (Z/nZ). After that, we developed our first
family of nonabelian groups in the symmetric groups Sn . With those in hand, we obtained many other
groups as subgroups of these, such as SLn (R), An , and Dn . Finally, we built new groups from all of these
using direct products and quotients.
With such a rich supply of groups now, it is time to realize that some of these groups are essentially the
“same”. For example, let G = Z/2Z, let H = S2 , and let K be the group ({T, F }, ⊕) where ⊕ is “exclusive
or” discussed in the first section. Here are the Cayley tables of these groups.
+ 0 1 ◦ id (1 2) ⊕ F T
0 0 1 id id (1 2) F F T
1 1 0 (1 2) (1 2) id T T F
Now of course these groups are different because the sets are completely different. The elements of G are
equivalence classes and thus subsets of Z, the elements of H are permutations of the set {1, 2} (and hence
are really certain functions), and the elements of K are T and F . Furthermore the operations themselves
have little in common since in G we have addition of cosets via representatives, in H we have function
composition, and in K we have this funny logic operation. However, despite all these differences, a glance
at the above tables tells us that there is a deeper “sameness” to them. For G and H, if we pair off 0 with id
and pair off 1 with (1 2), then we have provided a kind of “Rosetta stone” for translating between the groups.
This is formalized with the following definition.
Definition 6.3.1. Let (G, ·) and (H, ?) be groups. An isomorphism from G to H is a function ϕ : G → H
such that
1. ϕ is a bijection.
2. ϕ(a · b) = ϕ(a) ? ϕ(b) for all a, b ∈ G. In shorthand, ϕ preserves the group operation.
For example, define ϕ : Z/2Z → S2 by letting ϕ(0) = id and ϕ(1) = (1 2). Checking the four possible cases
directly (for instance, ϕ(1 + 1) = ϕ(0) = id and ϕ(1) ◦ ϕ(1) = (1 2)(1 2) = id), we see that ϕ(a + b) =
ϕ(a) ◦ ϕ(b) for all a, b ∈ Z/2Z. Since we already noted that ϕ is a bijection, it follows that ϕ is an
isomorphism. All of these checks are really just implicit in the above table. The bijection
ϕ : G → H we defined pairs off 0 with id and pairs off 1 with (1 2). Since this aligning of elements carries
the table of G to the table of H as seen in the above tables, ϕ is an isomorphism.
Notice if instead we define ψ : G → H by letting ψ(0) = (1 2) and ψ(1) = id, then ψ is not an isomorphism.
To see this, we just need to find a counterexample to the second property. We have
ψ(0 + 0) = ψ(0) = (1 2)
and
ψ(0) ◦ ψ(0) = (1 2) ◦ (1 2) = id
so
ψ(0 + 0) 6= ψ(0) ◦ ψ(0)
Essentially we are writing the Cayley tables as:
+ 0 1 ◦ (1 2) id
0 0 1 (1 2) id (1 2)
1 1 0 id (1 2) id
and noting that ψ does not carry the table of G to the table of H (as can be seen in the (1, 1) entry). Thus,
even if one bijection is an isomorphism, there may be other bijections which are not. We make the following
definition.
Definition 6.3.2. Let G and H be groups. We say that G and H are isomorphic, and write G ∼= H, if there
exists an isomorphism ϕ : G → H.
In colloquial language, two groups G and H are isomorphic exactly when there is some way to pair off
elements of G with H which maps the Cayley table of G onto the Cayley table of H. For example, we have
U (Z/8Z) ∼= Z/2Z × Z/2Z via the bijection
1 ↦ (0, 0)    3 ↦ (1, 0)    5 ↦ (0, 1)    7 ↦ (1, 1)
Furthermore, looking back at the introduction, we see that the crazy group G = {3, ℵ, @} is isomorphic to
Z/3Z, as the following ordering of elements shows.
· 3 ℵ @ + 1 0 2
3 @ 3 ℵ 1 2 1 0
ℵ 3 ℵ @ 0 1 0 2
@ ℵ @ 3 2 0 2 1
Also, the 6 element group at the very end of Section 4.1 is isomorphic to S3 :
∗ 1 2 3 4 5 6 ◦ id (1 2) (1 3) (2 3) (1 2 3) (1 3 2)
1 1 2 3 4 5 6 id id (1 2) (1 3) (2 3) (1 2 3) (1 3 2)
2 2 1 6 5 4 3 (1 2) (1 2) id (1 3 2) (1 2 3) (2 3) (1 3)
3 3 5 1 6 2 4 (1 3) (1 3) (1 2 3) id (1 3 2) (1 2) (2 3)
4 4 6 5 1 3 2 (2 3) (2 3) (1 3 2) (1 2 3) id (1 3) (1 2)
5 5 3 4 2 6 1 (1 2 3) (1 2 3) (1 3) (2 3) (1 2) (1 3 2) id
6 6 4 2 3 1 5 (1 3 2) (1 3 2) (2 3) (1 2) (1 3) id (1 2 3)
The property of being isomorphic has the following basic properties. Roughly, we are saying that iso-
morphism is an equivalence relation on the set of all groups. Formally, there are technical problems talking
about “the set of all groups” (some collections are simply too big to be sets), but let’s not dwell on those
details here.
Proposition 6.3.3.
1. For any group G, the identity function idG : G → G is an isomorphism, so G ∼= G.
2. If ϕ : G → H is an isomorphism, then ϕ−1 : H → G is an isomorphism. In particular, if G ∼= H, then
H ∼= G.
3. If ϕ : G → H and ψ : H → K are isomorphisms, then ψ ◦ ϕ : G → K is an isomorphism. In particular,
if G ∼= H and H ∼= K, then G ∼= K.
Proof.
1. Let (G, ·) be a group. The function idG : G → G is a bijection, and for any a, b ∈ G we have
idG (a · b) = a · b = idG (a) · idG (b)
so idG : G → G is an isomorphism.
2. Let · be the group operation in G and let ? be the group operation in H. Since ϕ : G → H is a
bijection, we know that it has an inverse ϕ−1 : H → G which is also a bijection (because ϕ−1 has an
inverse, namely ϕ). We need only check the second property. Let c, d ∈ H. Since ϕ is a bijection, it is
a surjection, so we may fix a, b ∈ G with ϕ(a) = c and ϕ(b) = d. By definition of ϕ−1 , we then have
ϕ−1 (c) = a and ϕ−1 (d) = b. Now
ϕ−1 (c ? d) = ϕ−1 (ϕ(a) ? ϕ(b)) = ϕ−1 (ϕ(a · b)) = a · b = ϕ−1 (c) · ϕ−1 (d)
Since c, d ∈ H were arbitrary, the second property holds. Therefore, ϕ−1 : H → G is an isomorphism.
3. Let · be the group operation in G, let ? be the group operation in H, and let ∗ be the group operation
in K. Since the composition of bijections is a bijection, it follows that ψ ◦ ϕ : G → K is a bijection.
For any a, b ∈ G, we have
(ψ ◦ ϕ)(a · b) = ψ(ϕ(a · b)) = ψ(ϕ(a) ? ϕ(b)) = ψ(ϕ(a)) ∗ ψ(ϕ(b)) = ((ψ ◦ ϕ)(a)) ∗ ((ψ ◦ ϕ)(b))
Therefore, ψ ◦ ϕ : G → K is an isomorphism.
It gets tiresome consistently using different notation for the operation in G and the operation in H, so
we will stop doing it unless absolutely necessary. Thus, we will write
ϕ(a · b) = ϕ(a) · ϕ(b)
where you need to keep in mind that the · on the left is the group operation in G and the · on the right is
the group operation in H.
Proposition 6.3.4. Let ϕ : G → H be an isomorphism. We have the following.
1. ϕ(eG ) = eH .
2. ϕ(a−1 ) = ϕ(a)−1 for all a ∈ G.
3. ϕ(an ) = ϕ(a)n for all a ∈ G and all n ∈ Z.
Proof.
1. We have
ϕ(eG ) = ϕ(eG · eG ) = ϕ(eG ) · ϕ(eG )
hence
eH · ϕ(eG ) = ϕ(eG ) · ϕ(eG )
Using the cancellation law, it follows that ϕ(eG ) = eH .
2. Let a ∈ G. We have
ϕ(a) · ϕ(a−1 ) = ϕ(a · a−1 ) = ϕ(eG ) = eH
and
ϕ(a−1 ) · ϕ(a) = ϕ(a−1 · a) = ϕ(eG ) = eH
Therefore, ϕ(a−1 ) = ϕ(a)−1 .
3. For n = 0, this says that ϕ(eG ) = eH , which is true by part 1. The case n = 1 is trivial, and the case
n = −1 is part 2. We first prove the result for all n ∈ N+ by induction. We already noticed that n = 1
is trivial. Suppose that n ∈ N+ is such that ϕ(an ) = ϕ(a)n for all a ∈ G. For any a ∈ G we have
ϕ(an+1 ) = ϕ(an · a)
= ϕ(an ) · ϕ(a) (since ϕ is an isomorphism)
= ϕ(a)n · ϕ(a) (by induction)
n+1
= ϕ(a)
Thus, the result holds for n + 1. Therefore, the result is true for all n ∈ N+ by induction. We finally
handle n ∈ Z with n < 0. For any a ∈ G we have
ϕ(an ) = ϕ((a−1 )−n )
= ϕ(a−1 )−n (since −n > 0)
= (ϕ(a)−1 )−n (by part 2)
= ϕ(a)n
Thus, the result is true for all n ∈ Z.
Theorem 6.3.5. Let G be a cyclic group.
1. If |G| = n ∈ N+, then G ≅ Z/nZ.
2. If G is infinite, then G ≅ Z.

Now on Homework 4, we proved that |11| = 6 in U(Z/18Z). Since |U(Z/18Z)| = 6, it follows that U(Z/18Z) is a cyclic group of order 6, and hence U(Z/18Z) ≅ Z/6Z.
Corollary 6.3.6. Let p ∈ N+ be prime. Any two groups of order p are isomorphic.

Proof. Suppose that G and H have order p. By Theorem 5.8.6, both G and H are cyclic, so by the previous theorem we have G ≅ Z/pZ and H ≅ Z/pZ. Using symmetry and transitivity of ≅, it follows that G ≅ H.
Proposition 6.3.7. Suppose that G ≅ H. If G is abelian, then H is abelian.

Proof. Since G ≅ H, we may fix an isomorphism ϕ : G → H. Let h1, h2 ∈ H. Since ϕ is surjective, we may fix g1, g2 ∈ G with ϕ(g1) = h1 and ϕ(g2) = h2. We then have
h1 · h2 = ϕ(g1) · ϕ(g2)
        = ϕ(g1 · g2)    (since ϕ is an isomorphism)
        = ϕ(g2 · g1)    (since G is abelian)
        = ϕ(g2) · ϕ(g1)    (since ϕ is an isomorphism)
        = h2 · h1
Since h1, h2 ∈ H were arbitrary, it follows that H is abelian.
Proposition 6.3.8. Suppose that G ≅ H. If G is cyclic, then H is cyclic.

Proof. Since G ≅ H, we may fix an isomorphism ϕ : G → H. Since G is cyclic, we may fix c ∈ G with G = ⟨c⟩. We claim that H = ⟨ϕ(c)⟩. Let h ∈ H. Since ϕ is in particular a surjection, we may fix g ∈ G with ϕ(g) = h. Since g ∈ G = ⟨c⟩, we may fix n ∈ Z with g = cn. We then have
h = ϕ(g) = ϕ(cn) = ϕ(c)n
so h ∈ ⟨ϕ(c)⟩. Since h ∈ H was arbitrary, it follows that H = ⟨ϕ(c)⟩ and hence H is cyclic.
As an example, consider the groups Z/4Z and Z/2Z × Z/2Z. Each of these groups is abelian of order 4. However, Z/4Z is cyclic while Z/2Z × Z/2Z is not. It follows that Z/4Z ≇ Z/2Z × Z/2Z.
Proposition 6.3.9. Suppose that ϕ : G → H is an isomorphism. For any a ∈ G, we have |a| = |ϕ(a)| (in
other words, the order of a in G equals the order of ϕ(a) in H).
Proof. Let a ∈ G. Suppose that n ∈ N+ and an = eG. Using the fact that ϕ(eG) = eH, we have
ϕ(a)n = ϕ(an) = ϕ(eG) = eH
Conversely, suppose that n ∈ N+ and ϕ(a)n = eH. We then have
ϕ(an) = ϕ(a)n = eH = ϕ(eG)
and hence an = eG because ϕ is injective. Combining both of these, we have shown that
{n ∈ N+ : an = eG} = {n ∈ N+ : ϕ(a)n = eH}
It follows that both of these sets are either empty (in which case |a| = ∞ = |ϕ(a)|) or both have the same least element (equal to the common order of a and ϕ(a)).
Thus, for example, we have Z/4Z × Z/4Z ≇ Z/2Z × Z/8Z because the latter group has an element of order 8 while a4 = e for all a ∈ Z/4Z × Z/4Z.
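Order computations like these are easy to mechanize. In the following sketch, order_profile is our own helper; it tabulates element orders in a product of cyclic groups, using the facts that the order of a in Z/mZ is m/gcd(a, m) and that the order of a tuple is the lcm of the component orders:

```python
from math import gcd
from itertools import product

def lcm(a, b):
    return a * b // gcd(a, b)

def order_profile(moduli):
    # Count how many elements of each order Z/m1Z x ... x Z/mkZ has.
    profile = {}
    for elt in product(*(range(m) for m in moduli)):
        o = 1
        for a, m in zip(elt, moduli):
            o = lcm(o, m // gcd(a, m))   # order of a in Z/mZ is m // gcd(a, m)
        profile[o] = profile.get(o, 0) + 1
    return profile

print(order_profile((4, 4)))  # {1: 1, 2: 3, 4: 12}: no element of order 8
print(order_profile((2, 8)))  # {1: 1, 2: 3, 4: 4, 8: 8}: order 8 occurs
```

Since the two profiles disagree about order 8, Proposition 6.3.9 confirms that the two groups are not isomorphic.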
6.4 Internal Direct Products

Suppose that H and K are groups, and consider the direct product H × K. Inside H × K, let
H′ = {(h, eK) : h ∈ H}
and
K′ = {(eH, k) : k ∈ K}
We claim that H′ is a normal subgroup of H × K. To see that it is a subgroup, simply note that (eH, eK) ∈ H′, that
(h1, eK) · (h2, eK) = (h1h2, eKeK) = (h1h2, eK)
and that
(h, eK)−1 = (h−1, eK−1) = (h−1, eK)
To see that H′ is a normal subgroup of H × K, notice that if (h, eK) ∈ H′ and (a, b) ∈ H × K, then
(a, b) · (h, eK) · (a, b)−1 = (a, b) · (h, eK) · (a−1, b−1) = (aha−1, beKb−1) = (aha−1, eK) ∈ H′
A completely analogous argument shows that K′ is a normal subgroup of H × K. Next, notice that
H′ ∩ K′ = {(eH, eK)} = {eH×K}
In other words, the only thing that H′ and K′ have in common is the identity element of H × K. Finally, note that any element of H × K can be written as a product of an element of H′ and an element of K′: If (h, k) ∈ H × K, then (h, k) = (h, eK) · (eH, k).
It turns out that these few facts characterize when a given group G is naturally isomorphic to the direct
product of two of its subgroups, as we will see in Theorem 6.4.4. Before proving this result, we first introduce
a definition and a few lemmas.
Definition 6.4.1. Let G be a group, and let H and K be subgroups of G. We define
HK = {hk : h ∈ H and k ∈ K}
Notice that HK will be a subset of G, but in general it is not a subgroup of G. For an explicit example, consider G = S3, H = ⟨(1 2)⟩ = {id, (1 2)}, and K = ⟨(1 3)⟩ = {id, (1 3)}. We have
HK = {id, (1 3), (1 2), (1 2)(1 3)} = {id, (1 2), (1 3), (1 3 2)}
which is not a subgroup of S3 by Lagrange's Theorem, because |HK| = 4 does not divide |S3| = 6.

Lemma 6.4.2. Let G be a group, and let H and K be subgroups of G with H ∩ K = {e}. If h1, h2 ∈ H and k1, k2 ∈ K satisfy h1k1 = h2k2, then h1 = h2 and k1 = k2.

Proof. Suppose that h1k1 = h2k2. Multiplying on the left by h2−1 and on the right by k1−1, we see that h2−1h1 = k2k1−1. Now the left-hand side is an element of H and the right-hand side is an element of K, because H and K are subgroups of G. Therefore, this common value is an element of H ∩ K = {e}. Thus, we have h2−1h1 = e and hence h1 = h2, and similarly we have k2k1−1 = e and hence k1 = k2.
Lemma 6.4.3. Suppose that H and K are both normal subgroups of a group G and that H ∩ K = {e}. We
then have that hk = kh for all h ∈ H and k ∈ K.
Proof. Let h ∈ H and k ∈ K. Consider the element hkh−1 k −1 . Since K is normal in G, we have hkh−1 ∈ K
and since k −1 ∈ K it follows that hkh−1 k −1 ∈ K. Similarly, since H is normal in G, we have kh−1 k −1 ∈ H
and since h ∈ H it follows that hkh−1k−1 ∈ H. Therefore, hkh−1k−1 ∈ H ∩ K and hence hkh−1k−1 = e.
Multiplying on the right by kh, we conclude that hk = kh.
Theorem 6.4.4. Suppose that H and K are subgroups of G with the following properties
1. H and K are both normal subgroups of G.
2. H ∩ K = {e}.
3. HK = G, i.e. for all g ∈ G, there exist h ∈ H and k ∈ K with g = hk.
In this case, the function ϕ : H × K → G defined by ϕ((h, k)) = h · k is an isomorphism. Thus, we have G ≅ H × K.
Proof. Define ϕ : H × K → G by letting ϕ((h, k)) = h · k. We check the following:
• ϕ is injective: Suppose that ϕ((h1 , k1 )) = ϕ((h2 , k2 )). We then have h1 k1 = h2 k2 , so since H ∩K = {e},
we can conclude that h1 = h2 and k1 = k2 using Lemma 6.4.2. Therefore, (h1 , k1 ) = (h2 , k2 ).
• ϕ is surjective: This is immediate from the assumption that G = HK.
• ϕ preserves the group operation: Let (h1, k1), (h2, k2) ∈ H × K be arbitrary. Using Lemma 6.4.3 (whose hypotheses hold by properties 1 and 2) to commute k1 and h2, we have
ϕ((h1, k1) · (h2, k2)) = ϕ((h1h2, k1k2))
                      = h1h2k1k2
                      = h1k1h2k2
                      = ϕ((h1, k1)) · ϕ((h2, k2))
Therefore, ϕ is an isomorphism.
Definition 6.4.5. Let G be a group, and let H and K be subgroups of G with the following properties:
1. H and K are both normal subgroups of G.
2. H ∩ K = {e}.
3. HK = G.
We then say that G is the internal direct product of H and K. In this case, we have that H × K ≅ G via the function ϕ((h, k)) = hk, as shown in Theorem 6.4.4.
As an example, consider the group G = U(Z/8Z). Let H = ⟨3⟩ = {1, 3} and let K = ⟨5⟩ = {1, 5}. We
then have that H and K are both normal subgroups of G (because G is abelian), that H ∩ K = {1}, and
that
HK = {1 · 1, 3 · 1, 1 · 5, 3 · 5} = {1, 3, 5, 7} = G
Therefore, U (Z/8Z) is the internal direct product of H and K. I want to emphasize that U (Z/8Z) does not
equal H × K. After all, as sets we have
U (Z/8Z) = {1, 3, 5, 7}
while
H × K = {(1, 1), (3, 1), (1, 5), (3, 5)}
However, the above result shows that these two groups are isomorphic via the function ϕ : H ×K → U (Z/8Z)
given by ϕ((h, k)) = h · k. Since H and K are each cyclic of order 2, it follows that each of H and K is isomorphic to Z/2Z. Using the following result, we conclude that U(Z/8Z) ≅ Z/2Z × Z/2Z; a brute-force check of the isomorphism ϕ itself appears in the sketch below.
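Here is that brute-force check, a sketch verifying that ϕ((h, k)) = h · k (mod 8) is a bijection from H × K onto U(Z/8Z) that preserves the operation:

```python
from math import gcd
from itertools import product

U8 = [a for a in range(8) if gcd(a, 8) == 1]   # {1, 3, 5, 7}
H, K = [1, 3], [1, 5]                          # H = <3>, K = <5>

# phi((h, k)) = h * k mod 8 should hit each element of U(Z/8Z) exactly once.
phi = {(h, k): (h * k) % 8 for h, k in product(H, K)}
assert sorted(phi.values()) == sorted(U8)

# ... and the componentwise product in H x K maps to the product in U(Z/8Z).
assert all(
    phi[((h1 * h2) % 8, (k1 * k2) % 8)] == (phi[(h1, k1)] * phi[(h2, k2)]) % 8
    for (h1, k1) in phi for (h2, k2) in phi
)
print("U(Z/8Z) is the internal direct product of <3> and <5>")
```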
Proposition 6.4.6. Suppose that G1 ≅ H1 and G2 ≅ H2. We then have that G1 × G2 ≅ H1 × H2.
Proof. Since G1 ≅ H1, we may fix an isomorphism ϕ1 : G1 → H1. Since G2 ≅ H2, we may fix an isomorphism ϕ2 : G2 → H2. Define ψ : G1 × G2 → H1 × H2 by letting
ψ((a1, a2)) = (ϕ1(a1), ϕ2(a2))
We check the following:
• ψ is injective: Suppose that a1, b1 ∈ G1 and a2, b2 ∈ G2 satisfy ψ((a1, a2)) = ψ((b1, b2)). We then have (ϕ1(a1), ϕ2(a2)) = (ϕ1(b1), ϕ2(b2)), so ϕ1(a1) = ϕ1(b1) and ϕ2(a2) = ϕ2(b2). Now ϕ1 is an isomorphism, so ϕ1 is injective, and hence the first equality implies that a1 = b1. Similarly, ϕ2 is an isomorphism, so ϕ2 is injective, and hence the second equality implies that a2 = b2. Therefore, (a1, a2) = (b1, b2).
• ψ is surjective: Let (c1, c2) ∈ H1 × H2 be arbitrary. Since ϕ1 and ϕ2 are surjective, we may fix a1 ∈ G1 and a2 ∈ G2 with ϕ1(a1) = c1 and ϕ2(a2) = c2. We then have
ψ((a1, a2)) = (ϕ1(a1), ϕ2(a2)) = (c1, c2)
so (c1, c2) ∈ range(ψ).
• ψ preserves the group operation: For any a1, b1 ∈ G1 and a2, b2 ∈ G2, we have
ψ((a1, a2) · (b1, b2)) = ψ((a1b1, a2b2))
                      = (ϕ1(a1b1), ϕ2(a2b2))
                      = (ϕ1(a1) · ϕ1(b1), ϕ2(a2) · ϕ2(b2))    (since ϕ1 and ϕ2 are isomorphisms)
                      = (ϕ1(a1), ϕ2(a2)) · (ϕ1(b1), ϕ2(b2))
                      = ψ((a1, a2)) · ψ((b1, b2))
Therefore, ψ : G1 × G2 → H1 × H2 is an isomorphism.
Corollary 6.4.7. Let G be a finite group. Let H and K be subgroups of G such that:
1. H and K are both normal subgroups of G.
2. H ∩ K = {e}.
3. |H| · |K| = |G|.
We then have that G is the internal direct product of H and K.

Proof. We need only prove that HK = G. In the proof of Lemma 6.4.2, where we showed that h1k1 = h2k2 implies h1 = h2 and k1 = k2, we only used the fact that H ∩ K = {e}. Therefore, since we are assuming that H ∩ K = {e}, distinct pairs (h, k) ∈ H × K give distinct products hk, and it follows that |HK| = |H| · |K|. Since we are assuming that |H| · |K| = |G|, it follows that |HK| = |G|, and since G is finite we conclude that HK = G.
As an example, let's show that D6 ≅ S3 × Z/2Z. Working in G = D6, let
H = {id, r2, r4, s, r2s, r4s}
and let
K = {id, r3}
We have that K = Z(G) from the homework, so K is a normal subgroup of G. It is straightforward to check
that H is a subgroup of G and
[G : H] = |G| / |H| = 12/6 = 2
Using Proposition 6.2.7, we conclude that H is a normal subgroup of G. Now H ∩ K = {id} and |H| · |K| = 6 · 2 = 12 = |G| (you can also check directly that HK = G). It follows that G is the internal direct product of H and K, so in particular we have G ≅ H × K. Notice that K is cyclic of order 2, so K ≅ Z/2Z. Furthermore, H is a group of order 6, and it is not hard to convince yourself that H ≅ D3: Roughly, you can map r2 (where r is rotation in D6) to r (where r is rotation in D3). Geometrically, when working with H, you are essentially looking at the regular hexagon and focusing only on the rotations by 120° (corresponding to r2) and 240° (corresponding to r4), along with the identity and the standard flip. This corresponds exactly to the rigid motions of the triangle. I hope that convinces you that H ≅ D3, but you can check formally by looking at the corresponding Cayley tables. Now D3 = S3, so it follows that H ≅ S3 and hence
D6 ≅ H × K ≅ S3 × Z/2Z
We can now classify all groups of order 6 up to isomorphism. Suppose that G is a group with |G| = 6. We consider two cases.
• Case 1: Suppose that G is abelian. By Theorem 6.2.13, we know that G has an element a of order 3 and an element b of order 2. Let H = ⟨a⟩ = {e, a, a2} and K = ⟨b⟩ = {e, b}. Since G is abelian, we know that H and K are normal subgroups of G. We also have H ∩ K = {e} (notice that b ≠ a2 because |a2| = 3) and |H| · |K| = 3 · 2 = 6 = |G|. Using Corollary 6.4.7, we conclude that G is the internal direct product of H and K. Now H and K are both cyclic, so H ≅ Z/3Z and K ≅ Z/2Z by Theorem 6.3.5. It follows that
G ≅ H × K ≅ Z/3Z × Z/2Z
where the latter isomorphism follows from Proposition 6.4.6. Finally, notice that Z/3Z × Z/2Z is cyclic by Problem 3b on Homework 5 (or by directly checking that |(1, 1)| = 6), so Z/3Z × Z/2Z ≅ Z/6Z by Theorem 6.3.5. Thus, G ≅ Z/6Z.
• Case 2: Suppose that G is not abelian. We then have that G is not cyclic by Proposition 5.2.7, so G has no element of order 6 by Proposition 5.2.6. Also, it is not possible that every nonidentity element of G has order 2, by Proposition 6.5.2. Thus, G must have an element of order 3. Fix a ∈ G with |a| = 3. Notice that a2 ∉ {e, a}, that a2 = a−1, and that |a2| = 3 (because (a2)2 = a4 = a and (a2)3 = a6 = e). Now take an arbitrary b ∈ G\{e, a, a2}. We then have ab ∉ {e, a, a2, b} (for example, if ab = e, then b = a−1 = a2, while if ab = a2, then b = a by cancellation), and also a2b ∉ {e, a, a2, b, ab}. Therefore, the 6 elements of G are e, a, a2, b, ab, and a2b, i.e. G = {e, a, a2, b, ab, a2b}.
We next determine |b|. We know that b ≠ e, so since G is not cyclic, we know that either |b| = 2 or |b| = 3. Suppose, for the sake of obtaining a contradiction, that |b| = 3. We know that b2 must be one of the six elements of G. We do not have b2 = e because |b| = 3, and arguments similar to those above show that b2 ∉ {b, ab, a2b} (for example, if b2 = ab, then a = b by cancellation). Now if b2 = a, then multiplying both sides by b on the right, we could conclude that b3 = ab, so e = ab, and hence b = a−1 = a2, a contradiction. Similarly, if b2 = a2, then multiplying by b on the right gives e = a2b, so multiplying on the left by a allows us to conclude that a = b, a contradiction. Since all cases end in a contradiction, we conclude that |b| ≠ 3, and hence we must have |b| = 2.
Since ba ∈ G and G = {e, a, a2, b, ab, a2b}, we now determine which of the six elements equals ba. Notice that ba ∉ {e, a, a2, b} for similar reasons as above. If ba = ab, then a and b commute, and from here it is straightforward to check that all elements of G commute, contradicting the fact that G is nonabelian. It follows that we must have ba = a2b = a−1b.
To recap, we have |a| = 3, |b| = 2, and ba = a−1 b. Now in D3 we also have |r| = 3, |s| = 2, and
sr = r−1 s. Define a bijection ϕ : G → D3 as follows:
– ϕ(e) = id.
– ϕ(a) = r.
– ϕ(a2 ) = r2 .
– ϕ(b) = s.
– ϕ(ab) = rs.
– ϕ(a2 b) = r2 s.
Using the properties of a, b ∈ G along with those of r, s ∈ D3, it is straightforward to check that ϕ preserves the group operation. Therefore, G ≅ D3, and since D3 = S3, we conclude that G ≅ S3.
Thus, every group of order 6 is isomorphic to one of the groups Z/6Z or S3. Furthermore, no group can be isomorphic to both because Z/6Z ≇ S3, as mentioned above.
We know that there are at least 5 groups of order 8 up to isomorphism:
Z/8Z    Z/4Z × Z/2Z    Z/2Z × Z/2Z × Z/2Z    D4    Q8
The first three of these are abelian, while the latter two are not. Notice that Z/8Z is not isomorphic to the other two abelian groups because they are not cyclic. Also, Z/4Z × Z/2Z ≇ Z/2Z × Z/2Z × Z/2Z because the former has an element of order 4 while the latter does not. Finally, D4 ≇ Q8 because D4 has five elements of order 2 and two of order 4, while Q8 has one element of order 2 and six of order 4. Now it is possible
to show that every group of order 8 is isomorphic to one of these five, but this is difficult to do with our
elementary methods. We need to build more tools, not only to tackle this problem, but also to think about
groups of larger order. Although many of these methods will arise in later chapters, the following result is
within our grasp now.
Theorem 6.5.5. Let G be a group. If G/Z(G) is cyclic, then G is abelian.

Proof. Recall that Z(G) is a normal subgroup of G by Proposition 5.6.2 and Proposition 6.2.5. Suppose that G/Z(G) is cyclic. We may then fix c ∈ G such that G/Z(G) = ⟨cZ(G)⟩. Now we have (cZ(G))n = cnZ(G)
for all n ∈ Z, so
G/Z(G) = {cn Z(G) : n ∈ Z}
We first claim that every a ∈ G can be written in the form a = cnz for some n ∈ Z and z ∈ Z(G). To see this, let a ∈ G. Now aZ(G) ∈ G/Z(G), so we may fix n ∈ Z with aZ(G) = cnZ(G). Since a ∈ aZ(G), we have a ∈ cnZ(G), so there exists z ∈ Z(G) with a = cnz.
Suppose now that a, b ∈ G. From above, we may write a = cn z and b = cm w where n, m ∈ Z and
z, w ∈ Z(G). We have
ab = cnzcmw
   = cncmzw    (since z ∈ Z(G))
   = cn+mzw
   = cm+nwz    (since z ∈ Z(G))
   = cmcnwz
   = cmwcnz    (since w ∈ Z(G))
   = ba
Since a, b ∈ G were arbitrary, we conclude that G is abelian.
Corollary 6.5.6. Let G be a group with |G| = pq for (not necessarily distinct) primes p, q ∈ Z. We then
have that either Z(G) = G or Z(G) = {e}.
Proof. We know that Z(G) is a normal subgroup of G. Thus, |Z(G)| divides |G| by Lagrange’s Theorem, so
|Z(G)| must be one of 1, p, q, or pq. Suppose that |Z(G)| = p. We then have that
|G/Z(G)| = |G| / |Z(G)| = pq/p = q,
so G/Z(G) is cyclic by Theorem 5.8.6. Using Theorem 6.5.5, it follows that G is abelian, and hence Z(G) = G,
contradicting the fact that |Z(G)| = p. Similarly, we can not have |Z(G)| = q. Therefore, either |Z(G)| = 1,
in which case Z(G) = {e}, or |Z(G)| = pq, in which case Z(G) = G.
For example, since S3 is nonabelian and |S3 | = 6 = 2·3, this result immediately implies that Z(S3 ) = {id}.
Similarly, since D5 is nonabelian and |D5 | = 10 = 2 · 5, we conclude that Z(D5 ) = {id}. We will come back
and use this result in more interesting contexts later.
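Conclusions like these are easy to confirm by brute force. In the following sketch, the helpers compose, generate, and center are our own, and D5 is realized inside S5 by the rotation (1 2 3 4 5) and the flip (2 5)(3 4):

```python
from itertools import permutations

def compose(s, t):
    # (s ∘ t)(i) = s(t(i)); permutations of {1,...,n} stored as tuples.
    return tuple(s[t[i] - 1] for i in range(len(t)))

def generate(gens, identity):
    # Close the generators under composition.
    group, frontier = {identity}, list(gens)
    while frontier:
        g = frontier.pop()
        if g not in group:
            group.add(g)
            frontier.extend(compose(g, h) for h in gens)
    return group

def center(group):
    return [a for a in group if all(compose(a, g) == compose(g, a) for g in group)]

S3 = list(permutations((1, 2, 3)))
print(center(S3))                      # [(1, 2, 3)]: only the identity

r = (2, 3, 4, 5, 1)                    # r = (1 2 3 4 5)
s = (1, 5, 4, 3, 2)                    # the flip (2 5)(3 4), fixing vertex 1
D5 = generate([r, s], (1, 2, 3, 4, 5))
assert len(D5) == 10
print(center(D5))                      # again only the identity
```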
6.6 Homomorphisms
Our definition of isomorphism had two requirements: that the function was a bijection and that it preserves
the operation. We next investigate what happens if we drop the former and just require that the function
preserves the operation.
Definition 6.6.1. Let G and H be groups. A homomorphism from G to H is a function ϕ : G → H such
that ϕ(a · b) = ϕ(a) · ϕ(b) for all a, b ∈ G.
Consider the following examples:
• The determinant function det : GLn (R) → R\{0} is a homomorphism where the operation on R\{0}
is multiplication. This is because det(AB) = det(A) · det(B) for any matrices A and B. Notice that
det is certainly not an injective function.
• Consider the sign function ε : Sn → {1, −1}, where the operation on {±1} is multiplication. Notice that ε is a homomorphism by Proposition 5.3.8. Again, for n ≥ 3, the function ε is very far from being
injective.
• For any groups G and H, the function π1 : G × H → G given by π1((g, h)) = g and the function π2 : G × H → H given by π2((g, h)) = h are both homomorphisms. For example, given any g1, g2 ∈ G
and h1 , h2 ∈ H we have
π1 ((g1 , h1 ) · (g2 , h2 )) = π1 ((g1 g2 , h1 h2 ))
= g1 g2
= π1 ((g1 , h1 )) · π1 ((g2 , h2 ))
and similarly for π2 .
• For a given group G with normal subgroup N, the function π : G → G/N defined by letting π(a) = aN for all
a ∈ G is a homomorphism because for any a, b ∈ G, we have
π(ab) = abN
= aN · bN
= π(a) · π(b)
Many of the basic properties from Proposition 6.3.4 carry over to homomorphisms with exactly the same proofs: if ϕ : G → H is a homomorphism, then ϕ(eG) = eH, ϕ(a−1) = ϕ(a)−1 for all a ∈ G, and ϕ(an) = ϕ(a)n for all a ∈ G and n ∈ Z.

Given a homomorphism ϕ : G → H, the kernel of ϕ is the set
ker(ϕ) = {a ∈ G : ϕ(a) = eH}
We claim that K = ker(ϕ) is always a normal subgroup of G. First, since ϕ(eG) = eH, we have eG ∈ K. Next, if a, b ∈ K, then
ϕ(ab) = ϕ(a) · ϕ(b) = eH · eH = eH
so ab ∈ K. Finally, if a ∈ K, then
ϕ(a−1) = ϕ(a)−1 = eH−1 = eH
so a−1 ∈ K. Putting this all together, we see that K is a subgroup of G. Suppose now that a ∈ K and
g ∈ G. Since a ∈ K we have ϕ(a) = eH , so
ϕ(gag −1 ) = ϕ(g) · ϕ(a) · ϕ(g −1 )
= ϕ(g) · eH · ϕ(g)−1
= ϕ(g) · ϕ(g)−1
= eH
and hence gag −1 ∈ K. Therefore, K is a normal subgroup of G.
We’ve just seen that the kernel of a homomorphism is always a normal subgroup of G. It’s a nice fact
that every normal subgroup of a group arises in this way, so we get another equivalent characterization of a
normal subgroup.
Theorem 6.6.5. Let G be a group and let K be a normal subgroup of G. There exists a group H and a
homomorphism ϕ : G → H with K = ker(ϕ).
Proof. Suppose that K is a normal subgroup of G. We can then form the quotient group H = G/K. Define
ϕ : G → H by letting ϕ(a) = aK for all a ∈ G. As discussed above, ϕ is a homomorphism. Notice that for
any a ∈ G, we have
a ∈ ker(ϕ) ⇐⇒ ϕ(a) = eK
⇐⇒ aK = eK
⇐⇒ eK = aK
⇐⇒ e−1 a ∈ K
⇐⇒ a ∈ K
Therefore ϕ is a homomorphism with ker(ϕ) = K.
Proposition 6.6.6. Let ϕ : G → H be a homomorphism. ϕ is injective if and only if ker(ϕ) = {eG }.
Proof. Suppose first that ϕ is injective. We know that ϕ(eG) = eH, so eG ∈ ker(ϕ). Let a ∈ ker(ϕ) be
arbitrary. We then have ϕ(a) = eH = ϕ(eG ), so since ϕ is injective we conclude that a = eG . Therefore,
ker(ϕ) = {eG }.
Suppose conversely that ker(ϕ) = {eG }. Let a, b ∈ G be arbitrary with ϕ(a) = ϕ(b). We then have
ϕ(a−1 b) = ϕ(a−1 ) · ϕ(b)
= ϕ(a)−1 · ϕ(b)
= ϕ(a)−1 · ϕ(a)
= eH
so a−1 b ∈ ker(ϕ). Now we are assuming that ker(ϕ) = {eG }, so we conclude that a−1 b = eG . Multiplying
on the left by a, we see that a = b. Therefore, ϕ is injective.
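As a concrete illustration of this proposition, consider the following sketch. The homomorphism used here, ϕ(a) = 3a on Z/12Z under addition, is our own choice of example rather than one from the text:

```python
G = range(12)
phi = lambda a: (3 * a) % 12

# phi preserves the operation (addition mod 12):
assert all(phi((a + b) % 12) == (phi(a) + phi(b)) % 12 for a in G for b in G)

kernel = [a for a in G if phi(a) == 0]
print(kernel)            # [0, 4, 8], which is not {0} ...
assert phi(1) == phi(5)  # ... and correspondingly phi is not injective
```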
We next record how subgroups behave under images and preimages. Given a homomorphism ϕ : G1 → G2 (with identities e1 and e2), write ϕ→[A] = {ϕ(a) : a ∈ A} for the image of a set A ⊆ G1, and ϕ←[B] = {a ∈ G1 : ϕ(a) ∈ B} for the preimage of a set B ⊆ G2. We then have the following:
1. If H1 is a subgroup of G1, then ϕ→[H1] is a subgroup of G2.
2. If N1 is a normal subgroup of G1 and ϕ is surjective, then ϕ→[N1] is a normal subgroup of G2.
3. If H2 is a subgroup of G2, then ϕ←[H2] is a subgroup of G1.
4. If N2 is a normal subgroup of G2, then ϕ←[N2] is a normal subgroup of G1.

Proof.
1. Suppose that H1 is a subgroup of G1. Since e1 ∈ H1 and ϕ(e1) = e2, we have e2 ∈ ϕ→[H1]. Next, given c, d ∈ ϕ→[H1], we may fix a, b ∈ H1 with ϕ(a) = c and ϕ(b) = d, and then cd = ϕ(a) · ϕ(b) = ϕ(ab) ∈ ϕ→[H1] because ab ∈ H1. Finally, given c ∈ ϕ→[H1], we may fix a ∈ H1 with ϕ(a) = c, and then
c−1 = ϕ(a)−1 = ϕ(a−1) ∈ ϕ→[H1]
so c−1 ∈ ϕ→[H1].
Putting it all together, we conclude that ϕ→[H1] is a subgroup of G2.
2. Suppose that N1 is a normal subgroup of G1 and that ϕ is surjective. We know from part 1 that
ϕ→ [N1 ] is a subgroup of G2 . Suppose that d ∈ G2 and c ∈ ϕ→ [N1 ] are arbitrary. Since ϕ is surjective,
we may fix b ∈ G1 with ϕ(b) = d. Since c ∈ ϕ→[N1], we can fix a ∈ N1 with ϕ(a) = c. We then have
dcd−1 = ϕ(b) · ϕ(a) · ϕ(b)−1 = ϕ(bab−1)
Since N1 is a normal subgroup of G1, we have bab−1 ∈ N1, so dcd−1 ∈ ϕ→[N1]. Since d ∈ G2 and c ∈ ϕ→[N1] were arbitrary, we conclude that ϕ→[N1] is a normal
subgroup of G2 .
3. Suppose that H2 is a subgroup of G2.
• Notice that we have e2 ∈ H2 because H2 is a subgroup of G2. Since ϕ(e1) = e2 ∈ H2, it follows that e1 ∈ ϕ←[H2].
• Let a, b ∈ ϕ←[H2] be arbitrary, so ϕ(a) ∈ H2 and ϕ(b) ∈ H2. We then have
ϕ(ab) = ϕ(a) · ϕ(b) ∈ H2
because H2 is closed under the group operation, so ab ∈ ϕ←[H2].
• Let a ∈ ϕ←[H2] be arbitrary, so ϕ(a) ∈ H2. We then have
ϕ(a−1) = ϕ(a)−1 ∈ H2
so a−1 ∈ ϕ←[H2].
Putting it all together, we conclude that ϕ←[H2] is a subgroup of G1.
4. Suppose that N2 is a normal subgroup of G2. We know from part 3 that ϕ←[N2] is a subgroup of G1. Let b ∈ G1 and a ∈ ϕ←[N2] be arbitrary, so ϕ(a) ∈ N2. We then have
ϕ(bab−1) = ϕ(b) · ϕ(a) · ϕ(b)−1
Now ϕ(b) ∈ G2 and ϕ(a) ∈ N2 , so ϕ(b) · ϕ(a) · ϕ(b)−1 ∈ N2 because N2 is a normal subgroup of G2 .
Therefore, ϕ(bab−1 ) ∈ N2 , and hence bab−1 ∈ ϕ← [N2 ]. Since b ∈ G1 and a ∈ ϕ← [N2 ] were arbitrary,
we conclude that ϕ← [N2 ] is a normal subgroup of G1 .
Lemma 6.7.1. Suppose that ϕ : G → H is a homomorphism and let K = ker(ϕ). For any a, b ∈ G, we have
that a ∼K b if and only if ϕ(a) = ϕ(b).
Proof. Suppose first that a, b ∈ G satisfy a ∼K b. Fix k ∈ K with ak = b and notice that
ϕ(b) = ϕ(ak)
     = ϕ(a) · ϕ(k)
     = ϕ(a) · eH
     = ϕ(a)
Conversely, suppose that a, b ∈ G satisfy ϕ(a) = ϕ(b). We then have
ϕ(a−1b) = ϕ(a)−1 · ϕ(b) = ϕ(a)−1 · ϕ(a) = eH
so a−1b ∈ K. Fixing k ∈ K with a−1b = k, we then have ak = b, so a ∼K b.
In less formal terms, the lemma says that ϕ is constant on each coset of K, and assigns distinct values
to distinct cosets. This sets up a well-defined injective function from the quotient group G/K to the group
H. Restricting down to range(ϕ) on the right, this function is a bijection. Furthermore, this function is an
isomorphism as we now prove in the following fundamental theorem.
Theorem 6.7.2 (First Isomorphism Theorem). Let ϕ : G → H be a homomorphism and let K = ker(ϕ).
Define a function ψ : G/K → H by letting ψ(aK) = ϕ(a). We then have that ψ is a well-defined function
which is an isomorphism onto the subgroup range(ϕ) of H. Therefore,
G/ker(ϕ) ≅ range(ϕ)

Proof. We check the following:
• ψ is well-defined: Suppose that a, b ∈ G with aK = bK, i.e. a ∼K b. By Lemma 6.7.1, we then have ϕ(a) = ϕ(b), so ψ(aK) = ψ(bK).
• ψ is a homomorphism: For any a, b ∈ G, we have
ψ(aK · bK) = ψ(abK) = ϕ(ab) = ϕ(a) · ϕ(b) = ψ(aK) · ψ(bK)
• ψ is injective: Suppose that a, b ∈ G with ψ(aK) = ψ(bK). We then have that ϕ(a) = ϕ(b), so a ∼K b by Lemma 6.7.1 (in the other direction), and hence aK = bK.
• ψ is surjective onto range(ϕ): First notice that since ϕ : G → H is a homomorphism, we know from
Corollary 6.6.10 that range(ϕ) is a subgroup of H. Now for any a ∈ G, we have ψ(aK) = ϕ(a) ∈
range(ϕ), so range(ψ) ⊆ range(ϕ). Moreover, given an arbitrary c ∈ range(ϕ), we can fix a ∈ G with
ϕ(a) = c, and then we have ψ(aK) = ϕ(a) = c. Thus, range(ϕ) ⊆ range(ψ). Combining both of these,
we conclude that range(ψ) = range(ϕ).
Putting it all together, the function ψ : G/K → H defined by ψ(aK) = ϕ(a) is a well-defined injective homomorphism that maps G/K surjectively onto range(ϕ). This proves the result.
For example, consider the homomorphism det : GLn (R) → R\{0}, where we view R\{0} as a group
under multiplication. As discussed in the previous section, we have ker(det) = SLn (R). Notice that det is a
surjective function because every nonzero real number arises as the determinant of some invertible matrix (for
example, the identity matrix with the (1, 1) entry replaced by r has determinant r). The First Isomorphism
Theorem tells us that
GLn(R)/SLn(R) ≅ R\{0}
Here is what is happening intuitively. The subgroup SLn(R), which is the kernel of the determinant homomorphism, is the set of n × n matrices with determinant 1. This breaks up the invertible n × n matrices into cosets which correspond exactly to the nonzero real numbers, in the sense that all matrices of a given determinant form a coset. The First Isomorphism Theorem says that multiplication in the quotient (where you take representative matrices from the corresponding cosets, multiply them, and then form the resulting coset) corresponds exactly to multiplying the real numbers which "label" the cosets.
For another example, consider the sign homomorphism ε : Sn → {±1} where n ≥ 2. We know that ε
is a homomorphism by Proposition 5.3.8, and we have ker(ε) = An by definition of An . Notice that ε is
surjective because n ≥ 2 (so ε(id) = 1 and ε((1 2)) = −1). By the First Isomorphism Theorem we have
Sn/An ≅ {±1}
Now the quotient group Sn /An consists of two cosets: the even permutations form one coset (namely An )
and the odd permutations form the other coset. The isomorphism above is simply saying that we can label
all of the even permutations with 1 and all of the odd permutations with −1, and in this way multiplication
in the quotient corresponds exactly to multiplication of the labels.
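This labeling is easy to verify by machine. In the following sketch, the sign of a permutation is computed by counting inversions (a standard equivalent of the definition via transpositions), and we check that multiplying representatives of two cosets of A3 in S3 always lands in the coset labeled by the product of the labels:

```python
from itertools import permutations

def sign(p):
    # (-1) raised to the number of inversions.
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def compose(s, t):
    return tuple(s[t[i] - 1] for i in range(len(t)))

S3 = list(permutations((1, 2, 3)))
cosets = {1: [p for p in S3 if sign(p) == 1],    # A3, the even coset
          -1: [p for p in S3 if sign(p) == -1]}  # the odd coset

assert all(sign(compose(p, q)) == e1 * e2
           for e1, ps in cosets.items() for p in ps
           for e2, qs in cosets.items() for q in qs)
print("coset multiplication in S3/A3 matches multiplication in {±1}")
```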
Recall that if G is a group and H and K are subgroups of G, then we defined
HK = {hk : h ∈ H and k ∈ K}
(see Definition 6.4.1). Immediately after that definition, we showed that HK is not in general a subgroup of G by considering the case where G = S3, H = ⟨(1 2)⟩ = {id, (1 2)}, and K = ⟨(1 3)⟩ = {id, (1 3)}. In this case, we had
HK = {id, (1 2), (1 3), (1 3 2)}    while    KH = {id, (1 2), (1 3), (1 2 3)}
so we also have that HK ≠ KH. Fortunately, if at least one of H or K is normal in G, then neither of these problems arises.
Proposition 6.7.3. Let H and N be subgroups of a group G, and suppose that N is a normal subgroup of
G. We have the following:
1. HN = N H.
2. HN is a subgroup of G (and hence N H is a subgroup of G).
Proof. We first show that HN = N H.
• We show that HN ⊆ N H. Let a ∈ HN be arbitrary, and fix h ∈ H and n ∈ N with a = hn. One
can show that a ∈ N H directly from Condition 2 of Proposition 6.2.2, but we work with the conjugate
instead. Notice that
a = hn = (hnh−1 )h
where hnh−1 ∈ N (because N is a normal subgroup of G) and h ∈ H. Thus, a ∈ N H.
• We show that N H ⊆ HN . Let a ∈ N H be arbitrary, and fix n ∈ N and h ∈ H with a = nh. Again,
one can use Condition 1 of Proposition 6.2.2 directly, or notice that
a = nh = h(h−1 nh)
where h ∈ H and h−1 nh = h−1 n(h−1 )−1 ∈ N because N is a normal subgroup of G. Thus, a ∈ HN .
We now check that HN is a subgroup of G.
• Since H and N are subgroups of G, we have e ∈ H and e ∈ N . Since e = ee, it follows that e ∈ HN .
• Let a, b ∈ HN be arbitrary. Fix h1 , h2 ∈ H and n1 , n2 ∈ N with a = h1 n1 and b = h2 n2 . Since N is
a normal subgroup of G, n1 ∈ N , and h2 ∈ G, we may use Property 1 from Proposition 6.2.2 to fix
n3 ∈ N with n1 h2 = h2 n3 . We then have
ab = h1n1h2n2 = h1h2n3n2
Since h1h2 ∈ H and n3n2 ∈ N, it follows that ab ∈ HN.
• Let a ∈ HN . Fix h ∈ H and n ∈ N with a = hn. Notice that n−1 ∈ N because N is a normal
subgroup of G. Since N is a normal subgroup of G, n−1 ∈ N , and h−1 ∈ G, we may use Property 1
from Proposition 6.2.2 to fix k ∈ N with n−1h−1 = h−1k. We then have
a−1 = (hn)−1 = n−1h−1 = h−1k ∈ HN
so a−1 ∈ HN. Putting it all together, we conclude that HN is a subgroup of G.
Theorem 6.7.4 (Second Isomorphism Theorem). Let G be a group, let H be a subgroup of G, and let N be
a normal subgroup of G. We then have that HN is a subgroup of G, that N is a normal subgroup of HN ,
that H ∩ N is a normal subgroup of H, and that
HN/N ≅ H/(H ∩ N)
Proof. First notice that since N is a normal subgroup of G, we know from Proposition 6.7.3 that HN is a subgroup of G. Notice also that since e ∈ H, we have n = en ∈ HN for all n ∈ N, so N ⊆ HN.
Since N is a normal subgroup of G, it follows that N is a normal subgroup of HN because conjugation by
elements of HN is just a special case of conjugation by elements of G. Therefore, we may form the quotient
group HN/N . Notice that we also have H ⊆ HN (because e ∈ N ), so we may define ϕ : H → HN/N by
letting ϕ(h) = hN . We check the following:
• ϕ is a homomorphism: For any h1 , h2 ∈ H, we have
ϕ(h1 h2 ) = h1 h2 N
= h1 N · h2 N
= ϕ(h1 ) · ϕ(h2 )
• ϕ is surjective: Let cN ∈ HN/N be arbitrary, where c ∈ HN. Fix h ∈ H and n ∈ N with c = hn. Since n ∈ N, we have cN = hnN = hN = ϕ(h), so cN ∈ range(ϕ).
• ker(ϕ) = H ∩ N: For any h ∈ H, we have
h ∈ ker(ϕ) ⇐⇒ hN = N ⇐⇒ h ∈ N ⇐⇒ h ∈ H ∩ N
In particular, H ∩ N = ker(ϕ) is a normal subgroup of H, and the First Isomorphism Theorem gives
H/(H ∩ N) ≅ range(ϕ) = HN/N
which completes the proof.

Theorem 6.7.5 (Third Isomorphism Theorem). Let G be a group, and let H and N be normal subgroups of G with N ⊆ H. We then have that H/N is a normal subgroup of G/N and that
(G/N)/(H/N) ≅ G/H

Proof. Define ϕ : G/N → G/H by letting ϕ(aN) = aH. Notice that ϕ is well-defined: if aN = bN, then a−1b ∈ N ⊆ H, so aH = bH. Also, ϕ is a homomorphism because for any a, b ∈ G we have
ϕ(aN · bN) = ϕ(abN)
           = abH
           = aH · bH
           = ϕ(aN) · ϕ(bN)
Moreover, ϕ is clearly surjective. Finally, for any a ∈ G we have
aN ∈ ker(ϕ) ⇐⇒ ϕ(aN) = eH
            ⇐⇒ aH = eH
            ⇐⇒ e−1a ∈ H
            ⇐⇒ a ∈ H
so ker(ϕ) = {aN : a ∈ H} = H/N. In particular, H/N is a normal subgroup of G/N, and the First Isomorphism Theorem gives
(G/N)/(H/N) ≅ range(ϕ) = G/H
Theorem 6.7.6 (Correspondence Theorem). Let G be a group, and let N be a normal subgroup of G. The function
H ↦ H/N
is a bijection from subgroups of G containing N to subgroups of G/N . Furthermore, we have the following
properties for any subgroups H1 and H2 of G that both contain N :
1. H1 is a subgroup of H2 if and only if H1 /N is a subgroup of H2 /N .
2. H1 is a normal subgroup of H2 if and only if H1 /N is a normal subgroup of H2 /N .
3. If H1 is a subgroup of H2 , then [H2 : H1 ] = [H2 /N : H1 /N ].
Chapter 7
Group Actions
Our definition of a group involved axioms for an abstract algebraic object. However, many groups arise in a
setting where the elements of the group naturally “move around” the elements of some set. For example, the
elements of GLn (R) represent linear transformations on Rn and so “move around” points in n-dimensional
space. When we discussed Dn , we thought of the elements of Dn as moving the vertices of a regular n-gon.
The general notion of a group working on a set in this way is called a group action. Typically these sets are
very geometric or combinatorial in nature, and we can understand the sets themselves by understanding the
groups. Perhaps more surprising, we can turn this idea around to obtain a great deal of information about
groups by understanding the sets on which they act.
The concept of understanding an algebraic object by seeing it “act” on other objects, often from different
areas of mathematics (geometry, topology, analysis, combinatorics, etc.), is a tremendously important part
of modern mathematics. In practice, groups typically arise as symmetries of some object just like our
introduction of Dn as the “symmetries” of the regular n-gon. Instead of n-gons, the objects can be graphs,
manifolds, topological spaces, vector spaces, etc.
7.1 Actions, Orbits, and Stabilizers

Definition 7.1.1. Let G be a group and let X be a set. A (left) group action of G on X is a function from G × X to X, where we write g ∗ x for the image of the pair (g, x), satisfying:
1. e ∗ x = x for all x ∈ X.
2. g ∗ (h ∗ x) = (g · h) ∗ x for all g, h ∈ G and x ∈ X.

Consider the following examples:
• The group Sn acts on X = {1, 2, . . . , n} via σ ∗ i = σ(i). Similarly, GLn(R) acts on Rn via A ∗ x = Ax (matrix-vector multiplication).
• Let G be a group. G acts on G by left multiplication, i.e. g ∗ a = g · a. Notice that e ∗ a = e · a = a for all a ∈ G. Also, for all g, h, a ∈ G we have
g ∗ (h ∗ a) = g ∗ (h · a)
= g · (h · a)
= (g · h) · a
= (g · h) ∗ a
• Let G be a group. G acts on G by conjugation, i.e. g ∗ a = gag −1 . Notice that e ∗ a = eae−1 = a for
all a ∈ G. Also, for all g, h, a ∈ G we have
g ∗ (h ∗ a) = g ∗ (hah−1 )
= ghah−1 g −1
= gha(gh)−1
= (g · h) ∗ a
• If G acts on X and H is a subgroup of G, then H acts on X by simply restricting the function. For
example, since Dn is a subgroup of Sn , we see that Dn acts on {1, 2, . . . , n} via σ ∗ i = σ(i). Also,
since SLn (R) is a subgroup of GLn (R), it acts on Rn via matrix multiplication as well.
Proposition 7.1.2. Suppose that G acts on X. Define a relation ∼ on X by letting x ∼ y if there exists
a ∈ G with a ∗ x = y. The relation ∼ is an equivalence relation on X.
Proof. We check the three properties.
• Reflexive: For any x ∈ X, we have e ∗ x = x, so x ∼ x.
• Symmetric: Suppose that x, y ∈ X with x ∼ y, and fix a ∈ G with a ∗ x = y. We then have
a−1 ∗ y = a−1 ∗ (a ∗ x)
        = (a−1 · a) ∗ x
        = e ∗ x
        = x
so y ∼ x.
• Transitive: Suppose that x, y, z ∈ X with x ∼ y and y ∼ z. Fix a, b ∈ G with a ∗ x = y and b ∗ y = z. We then have
(b · a) ∗ x = b ∗ (a ∗ x) = b ∗ y = z
so x ∼ z.
Therefore, ∼ is an equivalence relation on X.
Definition 7.1.3. Suppose that G acts on X. The equivalence class of x under the above relation ∼ is called
the orbit of x. We denote this equivalence class by Ox . Notice that Ox = {a ∗ x : a ∈ G}.
If G acts on X, we know from our general theory of equivalence relations that the orbits partition X.
For example, consider the case where G = GL2 (R) and X = R2 with action A ∗ x = Ax. Notice that
O(0,0) = {(0, 0)}. Let’s consider O(1,0) . We claim that O(1,0) = R2 \{(0, 0)}. Since the orbits partition R2 ,
we know that (0, 0) ∉ O(1,0). Now if a, b ∈ R with a ≠ 0, then the matrix
( a 0 )
( b 1 )
is in GL2(R) since it has determinant a ≠ 0, and
( a 0 ) ( 1 )   ( a )
( b 1 ) ( 0 ) = ( b )
so (a, b) ∈ O(1,0). If b ∈ R with b ≠ 0, then the matrix
( 0 1 )
( b 0 )
is in GL2(R) since it has determinant −b ≠ 0, and
( 0 1 ) ( 1 )   ( 0 )
( b 0 ) ( 0 ) = ( b )
so (0, b) ∈ O(1,0). Putting everything together, it follows that O(1,0) = R2\{(0, 0)}. Since the orbits are equivalence classes, we conclude that O(a,b) = R2\{(0, 0)} whenever (a, b) ≠ (0, 0). Thus, the orbits partition R2 into two pieces: the origin and the rest.
In contrast, consider what happens if we instead use the following subgroup H of GL2(R). Let
H = { ( cos θ  −sin θ
        sin θ   cos θ ) : θ ∈ R } ∪ { ( cos θ   sin θ
                                        sin θ  −cos θ ) : θ ∈ R }
If you’ve seen the terminology, then H is the set of all orthogonal 2 × 2 matrices, i.e. matrices A such that
AT = A−1 (where AT is the transpose of A). Intuitively, the elements of the set on the left are rotations by
angle θ and the elements of the set on the right are flips followed by rotations (very much like the alternate
definition of Dn on the homework, but now with all possible real values for the angles). One can show that
elements of H preserve distance. That is, if A ∈ H and v ∈ R2, then ||Av|| = ||v|| (this can be checked using general theory, or directly using the above matrices: suppose that (x, y) ∈ R2 with x2 + y2 = r2, and show that the same is true after hitting (x, y) with any of the above matrices). Thus, if
we let H act on R2 , it follows that every element of O(1,0) is on the circle of radius 1 centered at the origin.
Furthermore, O(1,0) contains all of these points because
( cos θ  −sin θ ) ( 1 )   ( cos θ )
( sin θ   cos θ ) ( 0 ) = ( sin θ )
which gives all points on the unit circle as we vary θ. In general, by working through the details, one can show that the orbits of this action are the circles centered at the origin.
Definition 7.1.4. Suppose that G acts on X. For each x ∈ X, define Gx = {a ∈ G : a ∗ x = x}. The set
Gx is called the stabilizer of x.
Proposition 7.1.5. Suppose that G acts on X. For each x ∈ X, the set Gx is a subgroup of G.
Proof. Let x ∈ X. Since e ∗ x = x, we have e ∈ Gx . Suppose that a, b ∈ Gx . We then have that a ∗ x = x
and b ∗ x = x, so
(a · b) ∗ x = a ∗ (b ∗ x) = a ∗ x = x
and hence a · b ∈ Gx . Suppose that a ∈ Gx so that a ∗ x = x. We then have that a−1 ∗ (a ∗ x) = a−1 ∗ x. Now
a−1 ∗ (a ∗ x) = (a−1 ∗ a) ∗ x = e ∗ x = x
so a−1 ∗ x = x and hence a−1 ∈ Gx . Therefore, Gx is a subgroup of G.
Since D4 is a subgroup of S4, and S4 acts on {1, 2, 3, 4} via σ ∗ i = σ(i), it follows that D4 acts on {1, 2, 3, 4} as well. To see how this works, we should remember our formal definitions of r and s as elements
of Sn . In D4 , we have r = (1 2 3 4) and s = (2 4). Working out all of the elements as permutations, we see
that
id r = (1 2 3 4) r2 = (1 3)(2 4) r3 = (1 4 3 2)
s = (2 4) rs = (1 2)(3 4) r2 s = (1 3) r3 s = (1 4)(2 3)
Notice that
G1 = G3 = {id, s} G2 = G4 = {id, r2 s}
and that
O1 = O2 = O3 = O4 = {1, 2, 3, 4}
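These orbits and stabilizers are easy to recompute by machine. In the following sketch, compose and generate are our own helpers; notice in the output that |Oi| · |Gi| = 8 = |D4| in every case, which is exactly the Orbit-Stabilizer Theorem proved below:

```python
def compose(s, t):
    # (s ∘ t)(i) = s(t(i)); permutations of {1,...,4} stored as tuples.
    return tuple(s[t[i] - 1] for i in range(len(t)))

def generate(gens, identity):
    group, frontier = {identity}, list(gens)
    while frontier:
        g = frontier.pop()
        if g not in group:
            group.add(g)
            frontier.extend(compose(g, h) for h in gens)
    return group

r = (2, 3, 4, 1)                      # r = (1 2 3 4)
s = (1, 4, 3, 2)                      # s = (2 4)
D4 = generate([r, s], (1, 2, 3, 4))
assert len(D4) == 8

for i in (1, 2, 3, 4):
    orbit = {g[i - 1] for g in D4}                    # O_i = {g(i) : g in D4}
    stabilizer = {g for g in D4 if g[i - 1] == i}     # G_i
    assert len(orbit) * len(stabilizer) == len(D4)
    print(i, sorted(orbit), sorted(stabilizer))
```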
Lemma 7.1.6. Suppose that G acts on X, and let x ∈ X. For any a, b ∈ G, we have
a ∗ x = b ∗ x ⇐⇒ a ∼Gx b.
Proof. Suppose first that a, b ∈ G satisfy a ∗ x = b ∗ x. We then have
(a−1 · b) ∗ x = a−1 ∗ (b ∗ x)
            = a−1 ∗ (a ∗ x)
            = (a−1 · a) ∗ x
            = e ∗ x
            = x
so a−1 · b ∈ Gx, and hence a ∼Gx b.
Conversely, suppose that a ∼Gx b, and fix h ∈ Gx with b = a · h. Since h ∗ x = x, we have
b ∗ x = (a · h) ∗ x
      = a ∗ (h ∗ x)
      = a ∗ x
so a ∗ x = b ∗ x.
Theorem 7.1.7 (Orbit-Stabilizer Theorem). Suppose that G acts on X, and let x ∈ X. There is a bijection
between Ox and the set of (left) cosets of Gx in G, so |Ox| = [G : Gx]. In particular, if G is finite, then
|Ox| = |G| / |Gx|
and hence
|G| = |Ox| · |Gx|

Proof. Define a function f from Ox to the set of left cosets of Gx in G by letting f(a ∗ x) = aGx. For any a, b ∈ G, Lemma 7.1.6 tells us that a ∗ x = b ∗ x if and only if a ∼Gx b, i.e. if and only if aGx = bGx. Reading this in one direction shows that f is well-defined, and reading it in the other shows that f is injective. Moreover, f is surjective because every left coset aGx equals f(a ∗ x). Therefore, f is a bijection and |Ox| = [G : Gx]. When G is finite, we have [G : Gx] = |G|/|Gx| by Lagrange's Theorem, and the remaining statements follow. This completes the proof.
For example, consider the standard action of Sn on {1, 2, . . . , n}. For every i ∈ {1, 2, . . . , n}, we have
Oi = {1, 2, . . . , n} (because for each j, there is a permutation sending i to j), so as |Sn| = n!, it follows that |Gi| = n!/n = (n − 1)!. In other words, there are (n − 1)! permutations of {1, 2, . . . , n} which fix a given
element of {1, 2, . . . , n}. Of course, this could have been proven directly, but it is an immediate consequence
of the Orbit-Stabilizer Theorem.
In the case of D4 acting on {1, 2, 3, 4} discussed above, we saw that Oi = {1, 2, 3, 4} for all i. Thus, each stabilizer Gi satisfies |Gi| = |D4|/4 = 8/4 = 2, as was verified above.
Suppose that G acts on X. For each fixed a ∈ G, we obtain a function πa : X → X defined by letting πa(x) = a ∗ x.

Proposition 7.2.2. Suppose that G acts on X. For each fixed a ∈ G, the function πa is a permutation of X, i.e. πa is a bijection from X to X.
Proof. Let a ∈ G be arbitrary. We first check that πa is injective. Let x, y ∈ X and assume that πa (x) =
πa (y). We then have a ∗ x = a ∗ y, so
x=e∗x
= (a−1 · a) ∗ x
= a−1 ∗ (a ∗ x)
= a−1 ∗ (a ∗ y)
= (a−1 · a) ∗ y
=e∗y
= y.
Thus, πa is injective. We next check that πa is surjective. Let x ∈ X be arbitrary. Notice that a−1 ∗ x ∈ X and
πa(a−1 ∗ x) = a ∗ (a−1 ∗ x)
            = (a · a−1) ∗ x
            = e ∗ x
            = x
so x ∈ range(πa). Therefore, πa is a permutation of X.
Thus, given a group action of G on X, we can associate to each a ∈ G a bijection πa . Since each πa is
a bijection from X to X, we can view it as an element of SX . The function that takes a and produces the
function πa is in fact a homomorphism.
Proposition 7.2.3. Suppose that G acts on X. The function ϕ : G → SX defined by letting ϕ(a) = πa is a
homomorphism.
Proof. We know from Proposition 7.2.2 that πa ∈ SX for all a ∈ G. Let a, b ∈ G be arbitrary. For any
x ∈ X, we have
πa·b (x) = (a · b) ∗ x
= a ∗ (b ∗ x) (by the axioms of a group action)
= a ∗ πb (x)
= πa (πb (x))
= (πa ◦ πb )(x)
Since x ∈ X was arbitrary, it follows that πa·b = πa ◦ πb, which is to say that ϕ(a · b) = ϕ(a) ◦ ϕ(b). Therefore, ϕ is a homomorphism.

Conversely, suppose that we start with a homomorphism ϕ : G → SX. We can then define a group action of G on X by letting a ∗ x = ϕ(a)(x). To see that this satisfies the axioms of a group action, first notice that ϕ(e) = idX (homomorphisms send the identity to the identity), so for any x ∈ X we have
e ∗ x = ϕ(e)(x)
      = idX(x)
      = x
Also, for any a, b ∈ G and x ∈ X, we have
a ∗ (b ∗ x) = ϕ(a)(b ∗ x)
            = ϕ(a)(ϕ(b)(x))
            = (ϕ(a) ◦ ϕ(b))(x)
            = ϕ(a · b)(x)
            = (a · b) ∗ x
Thus, group actions of G on X correspond to homomorphisms from G to SX.
Proposition 7.2.5. Suppose that X and Y are sets and that f : X → Y is a bijection. For each σ ∈ SX , the
function f ◦σ ◦f −1 is a permutation of Y . Furthermore, the function ϕ : SX → SY given by ϕ(σ) = f ◦σ ◦f −1
is an isomorphism. In particular, if |X| = |Y|, then SX ≅ SY.
Proof. First notice that for each σ ∈ SX, the function f ◦ σ ◦ f−1 is a composition of bijections, and so is a permutation of Y. We check the following:
• ϕ is bijective: To see this, we show that ϕ has an inverse. Define ψ : SY → SX by letting ψ(τ) =
f −1 ◦ τ ◦ f (notice that f −1 ◦ τ ◦ f is indeed a permutation of X by a similar argument as above). For
any σ ∈ SX , we have
(ψ ◦ ϕ)(σ) = ψ(ϕ(σ))
= ψ(f ◦ σ ◦ f −1 )
= f −1 ◦ (f ◦ σ ◦ f −1 ) ◦ f
= (f −1 ◦ f ) ◦ σ ◦ (f −1 ◦ f )
= idX ◦ σ ◦ idX
= σ
and a symmetric computation shows that (ϕ ◦ ψ)(τ) = τ for all τ ∈ SY. Therefore, ϕ is a bijection.
• ϕ preserves the group operation: For any σ1, σ2 ∈ SX, we have
ϕ(σ1 ◦ σ2) = f ◦ (σ1 ◦ σ2) ◦ f−1
= f ◦ σ1 ◦ (f −1 ◦ f ) ◦ σ2 ◦ f −1
= (f ◦ σ1 ◦ f −1 ) ◦ (f ◦ σ2 ◦ f −1 )
= ϕ(σ1 ) ◦ ϕ(σ2 )
Therefore, ϕ : SX → SY is an isomorphism.
Corollary 7.2.6. If X is a finite set with |X| = n ∈ N+, then SX ≅ Sn.

Proof. If X is finite with |X| = n, we may fix a bijection f : X → {1, 2, 3, . . . , n} and apply the previous proposition.
Theorem 7.2.7 (Cayley’s Theorem). Let G be a group. There exists a subgroup H of SG such that G ∼
= H.
Therefore, if |G| = n ∈ N+ , then G is isomorphic to a subgroup of Sn .
Proof. Consider the action of G on G given by a ∗ b = a · b. We know by Proposition 7.2.3 that the function
ϕ : G → SG given by ϕ(a) = πa is a homomorphism. We now check that ϕ is injective. Suppose that a, b ∈ G
with ϕ(a) = ϕ(b). We then have that πa = πb , so in particular we have πa (e) = πb (e), hence a · e = b · e, and
so a = b. It follows that ϕ is injective. Since range(ϕ) is a subgroup of SG by Corollary 6.6.10, we can view ϕ as an isomorphism from G to range(ϕ). Therefore, G is isomorphic to a subgroup of SG.
Suppose finally that |G| = n ∈ N+. We know from above that SG ≅ Sn, so we may fix an isomorphism
ψ : SG → Sn . We then have that ψ ◦ ϕ : G → Sn is injective (because the composition of injective functions is
injective) and that ψ ◦ ϕ preserves the group operation (as in the proof that the composition of isomorphisms
is an isomorphism), so G is isomorphic to the subgroup range(ψ ◦ ϕ) of Sn .
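For a concrete instance of this construction, the following sketch builds the permutation πa for each a in G = Z/4Z (acting on itself by left addition, written additively) and checks that a ↦ πa is injective and turns addition into composition:

```python
G = list(range(4))

def pi(a):
    # The permutation of G given by left addition by a: pi_a(b) = a + b.
    return tuple((a + b) % 4 for b in G)

images = {a: pi(a) for a in G}
assert len(set(images.values())) == len(G)   # a -> pi_a is injective

def compose(s, t):
    # (s ∘ t)(b) = s(t(b)); here permutations of {0,1,2,3} are 0-indexed.
    return tuple(s[t[b]] for b in G)

# Addition in G corresponds to composition of the permutations pi_a.
assert all(images[(a + c) % 4] == compose(images[a], images[c])
           for a in G for c in G)
print("Z/4Z embeds into S_G as", sorted(images.values()))
```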
7.3 The Conjugation Action and the Class Equation

Consider the action of a group G on itself by conjugation, i.e. g ∗ a = gag−1. We introduce some terminology for the orbits and stabilizers of this action.
• For a ∈ G, the orbit Oa is called the conjugacy class of a in G. Notice that
Oa = {g ∗ a : g ∈ G} = {gag−1 : g ∈ G}.
• For a ∈ G, the stabilizer Ga is called the centralizer of a in G and is denoted CG(a). Notice that for any g ∈ G, we have
g ∈ Ga ⇐⇒ gag−1 = a ⇐⇒ ga = ag
Thus,
CG(a) = {g ∈ G : ga = ag}
is the set of elements of G which commute with a.
By our general theory of group actions, we know that CG (a) is a subgroup of G for every a ∈ G. Now
the conjugacy classes are orbits, so they are subsets of G which partition G, but in general they are certainly
not subgroups of G. However, we do know that if G is finite, then the size of every conjugacy class divides
|G| by the Orbit-Stabilizer Theorem because the size of a conjugacy class is the index of the corresponding
centralizer subgroup. In fact, in this case, the Orbit-Stabilizer Theorem says that if G is finite, then
|Oa| = |G| / |CG(a)| = [G : CG(a)]
Notice that if a ∈ G, then we have a ∈ CG(a) (because a trivially commutes with itself), so since CG(a) is a subgroup of G containing a, it follows that ⟨a⟩ ⊆ CG(a). It is often possible to use this simple fact together with the above equality to help calculate conjugacy classes.
As an example, consider the group G = S3 . We work out the conjugacy class and centralizer of the
various elements. Notice first that CG(id) = G because every element commutes with the identity, and the
conjugacy class of id is {id} because σ ◦ id ◦ σ −1 = id for all σ ∈ G. Now consider the element (1 2). On the
one hand, we know that ⟨(1 2)⟩ = {id, (1 2)} is a subset of CG((1 2)), so |CG((1 2))| ≥ 2. Since |G| = 6, we
conclude that |O(1 2) | ≤ 3. Now we know that (1 2) ∈ O(1 2) because O(1 2) is the equivalence class of (1 2).
Since
• (2 3)(1 2)(2 3)−1 = (2 3)(1 2)(2 3) = (1 3)
• (1 3)(1 2)(1 3)−1 = (1 3)(1 2)(1 3) = (2 3)
it follows that (1 3) and (2 3) are also in O(1 2) . We now have three elements of O(1 2) , and since |O(1 2) | ≤ 3,
we conclude that
O(1 2) = {(1 2), (1 3), (2 3)}
Notice that we can now conclude that |CG ((1 2))| = 2, so in fact we must have CG ((1 2)) = {id, (1 2)}
without doing any other calculations.
We have now found two conjugacy classes which take up 4 of the elements of G = S3 . Let’s look at the
conjugacy class of (1 2 3). We know it contains (1 2 3), and since
(1 2)(1 2 3)(1 2)−1 = (1 3 2)
it follows that (1 3 2) is there as well. Since the conjugacy classes partition G, these are the only possible
elements so we conclude that
O(1 2 3) = {(1 2 3), (1 3 2)}
Using the Orbit-Stabilizer Theorem it follows that |CG((1 2 3))| = 3, so since ⟨(1 2 3)⟩ ⊆ CG((1 2 3)) and |⟨(1 2 3)⟩| = 3, we conclude that CG((1 2 3)) = ⟨(1 2 3)⟩. Putting it all together, we see that S3 breaks up
into three conjugacy classes:
{id} {(1 2), (1 3), (2 3)} {(1 2 3), (1 3 2)}
The fact that the 2-cycles form one conjugacy class and the 3-cycles form another is a specific case of a
general fact which we now prove.
Lemma 7.3.2. Let σ ∈ Sn be a k-cycle, say σ = (a1 a2 . . . ak ). For any τ ∈ Sn , the permutation τ στ −1
is a k-cycle and in fact
τ στ −1 = (τ (a1 ) τ (a2 ) . . . τ (ak ))
(Note: this k-cycle may not have the smallest element first, so we may have to “rotate” it to have it in
standard cycle notation).
Proof. For any i with 1 ≤ i ≤ k − 1, we have σ(ai ) = ai+1 , hence
(τ στ −1 )(τ (ai )) = τ (σ(τ −1 (τ (ai )))) = τ (σ(ai )) = τ (ai+1 )
Furthermore, since σ(ak ) = a1 , we have
(τ στ −1 )(τ (ak )) = τ (σ(τ −1 (τ (ak )))) = τ (σ(ak )) = τ (a1 )
To finish the proof, we need to show that τστ−1 fixes all elements distinct from the τ(ai). Suppose then that b ≠ τ(ai) for each i. We then have that τ−1(b) ≠ ai for all i. Since σ fixes all elements other than the ai, it follows that σ fixes τ−1(b). Therefore,
(τ στ −1 )(b) = τ (σ(τ −1 (b))) = τ (τ −1 (b)) = b
Putting it all together, we conclude that τ στ −1 = (τ (a1 ) τ (a2 ) . . . τ (ak )).
For example, suppose that σ = (1 6 3 4) and τ = (1 7)(2 4 9 6)(5 8). To determine τ στ −1 , we need only
apply τ to each of the elements in the cycle σ. Thus,
τ στ −1 = (7 2 3 9) = (2 3 9 7)
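This recipe is easy to confirm by machine. The following sketch represents permutations directly in cycle notation (as lists of tuples, a representation chosen just for this check) and verifies the computation above on every point of {1, . . . , 9}:

```python
def apply(cycles, x):
    # Apply a permutation written as disjoint cycles to the point x.
    for cyc in cycles:
        if x in cyc:
            return cyc[(cyc.index(x) + 1) % len(cyc)]
    return x

sigma = [(1, 6, 3, 4)]
tau = [(1, 7), (2, 4, 9, 6), (5, 8)]
tau_inv = [tuple(reversed(c)) for c in tau]   # reverse each cycle

def conj(x):
    # (tau sigma tau^{-1})(x) = tau(sigma(tau^{-1}(x)))
    return apply(tau, apply(sigma, apply(tau_inv, x)))

# Lemma 7.3.2 predicts (tau(1) tau(6) tau(3) tau(4)) = (7 2 3 9).
expected = [(7, 2, 3, 9)]
assert all(conj(x) == apply(expected, x) for x in range(1, 10))
print("tau sigma tau^{-1} = (7 2 3 9) = (2 3 9 7)")
```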
This result extends beyond k-cycles, and in fact we get the reverse direction as well.
Theorem 7.3.3. Two elements of Sn are conjugate in Sn if and only if they have the same cycle structure,
i.e. if and only if their cycle notations have the same number of k-cycles for each k ∈ N+ .
Proof. Let σ ∈ Sn and suppose that we write σ in cycle notation as σ = π1π2 · · · πm with the πi disjoint cycles. For any τ ∈ Sn, we have
τστ−1 = τπ1π2 · · · πmτ−1 = (τπ1τ−1)(τπ2τ−1) · · · (τπmτ−1)
By the lemma, each τ πi τ −1 is a cycle of the same length as πi . Furthermore, the various cycles τ πi τ −1 are
disjoint from each other because they are obtained by applying τ to the elements of the cycles and τ is a
bijection. Therefore, τ στ −1 has the same cycle structure as σ.
Suppose conversely that σ and ρ have the same cycle structure. Match up the cycles of σ with the cycles
of ρ in a manner which preserves cycle length (including 1-cycles). Define τ : {1, 2, . . . , n} → {1, 2, . . . , n}
as follows. Given i ∈ {1, 2, . . . , n}, let τ (i) be the element of the corresponding cycle in the corresponding
position. Notice that τ is a bijection because the cycles in a given cycle notation are disjoint. By inserting
τ τ −1 between the various cycles of σ, we can use the lemma to conclude that τ στ −1 = ρ.
As an illustration of the latter part of the theorem, suppose that we are working in S8 and we have
Notice that σ and ρ have the same cycle structure, so they are conjugate in S8 . We can then write
The proof then defines τ by matching up the corresponding numbers, i.e. τ (3) = 1, τ (1) = 4, τ (8) = 5, etc.
Working it out, we see that we can take
τ = (1 4 2 6 3)(5 8)(7)
Thus, the elements which form a conjugacy class by themselves are exactly the elements of Z(G). Now if we bring together these elements, and pick a unique element ai from each of the k conjugacy classes of size at least 2, we get the equation:
|G| = |Z(G)| + |Oa1| + |Oa2| + · · · + |Oak|
Since Z(G) is a subgroup of G and each |Oai| arises from the Orbit-Stabilizer Theorem, each of the summands on the right is a divisor of |G|. Furthermore, we have |Oai| ≥ 2 for all i, although it might happen that |Z(G)| = 1. Finally, using the Orbit-Stabilizer Theorem, we can rewrite this latter equation as
|G| = |Z(G)| + [G : CG(a1)] + [G : CG(a2)] + · · · + [G : CG(ak)]
where again each of the summands on the right is a divisor of |G| and [G : CG(ai)] ≥ 2 for all i. Either variation of this equation is known as the Class Equation for G.
We saw above that the class equation for S3 reads as
6=1+2+3
In Problem 6c on Homework 4, we computed the number of elements of S5 of each given cycle structure, so
the class equation for S5 reads as
120 = 1 + 10 + 15 + 20 + 20 + 24 + 30
Theorem 7.3.4. Let p ∈ N+ be prime and let G be a group with |G| = pn for some n ∈ N+. We then have that Z(G) ≠ {e}.
Proof. Let a1 , a2 , . . . , ak be representatives from the conjugacy classes of size at least 2. By the Class
Equation, we know that
|G| = |Z(G)| + |Oa1 | + |Oa2 | + · · · + |Oak |
By the Orbit-Stabilizer Theorem, we know that |Oai | divides pn for each i. By the Fundamental Theorem
of Arithmetic, it follows that each |Oai | is one of 1, p, p2 , . . . , pn . Now we know that |Oai | > 1 for each i, so
p divides each |Oai |. Since
|Z(G)| = |G| − |Oa1 | − |Oa2 | − · · · − |Oak |
and p divides every term on the right-hand side, we conclude that p | |Z(G)|. In particular, Z(G) 6= {e}.
Corollary 7.3.5. Let p ∈ N+ be prime. Every group of order p2 is abelian.

Proof. Let G be a group of order p2. Using Corollary 6.5.6, we know that either Z(G) = {e} or Z(G) = G. The former is impossible by Theorem 7.3.4, so Z(G) = G and hence G is abelian.

Theorem 7.3.6. Let p ∈ N+ be prime, and let G be a group with |G| = p2. We then have that either
G ≅ Z/p2Z    or    G ≅ Z/pZ × Z/pZ
Proof. Suppose that G is a group of order p2. By Corollary 7.3.5, G is abelian. If there exists an element of G with order p2, then G is cyclic and G ≅ Z/p2Z. Suppose then that G has no element of order p2. By
Lagrange’s Theorem, the order of every element divides p2 , so the order of every nonidentity element of G
must be p. Fix a ∈ G with a 6= e, and let H = hai. Since |H| = p < |G|, we may fix b ∈ G with b ∈ / H and
let K = hbi. Notice that H and K are both normal subgroups of G because G is abelian. Now H ∩ K is
a subgroup of K, so |H ∩ K| divides |K| = p. We can’t have |H ∩ K| = p, because this would imply that
H ∩ K = K, which would contradict the fact that b ∈ / H. Therefore, we must have |H ∩ K| = 1 and hence
H ∩ K = {e}. Since |H| · |K| = p2 = |G|, we may use Corollary 6.4.7 to conclude that G is the internal
direct product of H and K. Since H and K are both cyclic of order p, they are both isomorphic to Z/pZ, so
G ≅ H × K ≅ Z/pZ × Z/pZ
Theorem 7.3.7 (Cauchy’s Theorem). Let p ∈ N+ be prime. If G is a group with p | |G|, then G has an
element of order p.
Proof. The proof is by induction on |G|. If |G| = 1, then the result is trivial because p ∤ 1 (if you don't like this vacuous base case, simply note that if |G| = p, then every nonidentity element of G has order p by Lagrange's Theorem). Suppose then that G is a finite group with p | |G|, and suppose that the result is true
for all groups K satisfying p | |K| and |K| < |G|. Let a1 , a2 , . . . , ak be representatives from the conjugacy
classes of size at least 2. By the Class Equation, we know that
|G| = |Z(G)| + [G : CG(a1)] + [G : CG(a2)] + · · · + [G : CG(ak)]
We now consider two cases.
• Suppose that p | [G : CG(ai)] for all i. Since
|Z(G)| = |G| − [G : CG(a1)] − [G : CG(a2)] − · · · − [G : CG(ak)]
and p divides every term on the right, it follows that p | |Z(G)|. Now Z(G) is an abelian group, so Theorem 6.2.13 tells us that Z(G) has an element of order p. Thus, G has an element of order p.
• Suppose that p ∤ [G : CG(ai)] for some i. Fix such an i. By Lagrange's Theorem we have
|G| = [G : CG(ai)] · |CG(ai)|
Since p is a prime number with both p | |G| and p ∤ [G : CG(ai)], we must have that p | |CG(ai)|. Now
|CG(ai)| < |G| because ai ∉ Z(G), so by induction, CG(ai) has an element of order p. Therefore, G has an element of order p.
The result follows by induction.
7.4 Simplicity of A5
Suppose that H is a subgroup of G. We know that one of the equivalent conditions for H to be a normal subgroup of G is that ghg−1 ∈ H for all g ∈ G and h ∈ H. This leads to the following simple result.
Proposition 7.4.1. Let H be a subgroup of G. We then have that H is a normal subgroup of G if and only
if H is a union of conjugacy classes of G.
Proof. Suppose first that H is a normal subgroup of G. Suppose that H includes an element h from some
conjugacy class of G. Now the conjugacy class of h in G equals {ghg −1 : g ∈ G}, and since h ∈ H and H is
normal in G, it follows that {ghg −1 : g ∈ G} ⊆ H. Thus, if H includes an element of some conjugacy class
of G, then H must include that entire conjugacy class. It follows that H is a union of conjugacy classes of G.
Suppose conversely that H is a subgroup of G that is a union of conjugacy classes of G. Given any g ∈ G
and h ∈ H, we have that ghg −1 is an element of the conjugacy class of h in G, so since h ∈ H, we must have
ghg −1 ∈ H as well. Therefore, H is a normal subgroup of G.
If we understand the conjugacy classes of a group G, we can use this proposition to help us understand
the normal subgroups of G. Let's begin such an analysis by looking at S4. Recall that in Homework 4 we counted the number of elements of Sn of various cycle types. In particular, we showed that if k ≤ n, then the number of k-cycles in Sn equals
n(n − 1)(n − 2) · · · (n − k + 1) / k
Also, if n ≥ 4, we showed that the number of permutations in Sn which are the product of two disjoint 2-cycles equals
n(n − 1)(n − 2)(n − 3) / 8
Using these results, we see that S4 consists of the following numbers of elements of each cycle type:
• Identity: 1
• 2-cycles: (4 · 3)/2 = 6
• 3-cycles: (4 · 3 · 2)/3 = 8
• 4-cycles: (4 · 3 · 2 · 1)/4 = 6
• Product of two disjoint 2-cycles: (4 · 3 · 2 · 1)/8 = 3
Since two elements of S4 are conjugates in S4 exactly when they have the same cycle type, this breakdown
gives the conjugacy classes of S4 . In particular, the class equation of S4 is:
24 = 1 + 3 + 6 + 6 + 8.
Using this class equation, let’s examine the possible normal subgroups of S4 . We already know that A4 is a
normal subgroup of S4 since it has index 2 in S4 (see Proposition 6.2.7). However another way to see this
is that A4 contains the identity, the 3-cycles, and the products of two disjoint 2-cycles, so it is a union of
conjugacy classes of S4 . In particular, it arises from taking the 1, the 3, and the 8 in the above class equation
and putting them together to form a subgroup of size 1 + 3 + 8 = 12.
Aside from the trivial examples of {id} and S4 itself, are there any other normal subgroups of S4? Suppose that H is a normal subgroup of S4 with {id} ⊊ H ⊊ S4 and H ≠ A4. We certainly know that id ∈ H.
By Lagrange’s Theorem, we know we must have that |H| | 24. We also know that H must be a union of
conjugacy classes of S4 . Thus, we would need to find a way to add some collection of the various numbers
in the above class equation, necessarily including the number 1, such that their sum is a divisor of 24. One
way is 1 + 3 + 8 = 12 which gave A4 . Working through the various possibilities, we see that they only other
nontrivial way to make it work is 1 + 3 = 4. This corresponds to the subset
Now this subset is certainly closed under conjugation, but it is not immediately obvious that it is a subgroup.
Each nonidentity element here has order 2, so it is closed under inverses. Performing the simple check, it
turns out that it is also closed under composition, so indeed this is another normal subgroup of S4 . Thus,
the normal subgroups of S4 are {id}, S4 , A4 , and this subgroup of order 4.
Now let’s examine the possible normal subgroups of A4 . We already know three examples: {id}, A4 , and
the just discovered subgroup of S4 of size 4 (it is contained in A4 , and it must be normal in A4 because it
is normal in S4 ). Now the elements of A4 are the identity, the 3-cycles, and the products of two disjoint
2-cycles. Although the set of 3-cycles forms one conjugacy class in S4 , the set of eight 3-cycles does not form
one conjugacy class in A4. We can see this immediately because |A4| = 12 and 8 ∤ 12. The problem is that the elements of S4 which happen to conjugate one 3-cycle to another might all be odd permutations, and so do not lie in A4. How can we determine the conjugacy classes in A4 without simply plowing through all
of the calculations from scratch?
Let’s try to work out the conjugacy class of (1 2 3) in A4 . First notice that we know that the conjugacy
class of (1 2 3) in S4 has size 8, so by the Orbit-Stabilizer Theorem we conclude that |CS4 ((1 2 3))| = 24 8 = 3.
Since h(1 2 3)i ⊆ CS4 ((1 2 3)) and |h(1 2 3)i| = 3, it follows that CS4 ((1 2 3)) = h(1 2 3)i. Now h(1 2 3)i ⊆ A4 ,
so we conclude that CA4 ((1 2 3)) = h(1 2 3)i as well. Therefore, by the Orbit-Stabilizer Theorem, the
conjugacy class of (1 2 3) in A4 has size 12 3 = 4. If we want to work out what exactly it is, it suffices to find
4 conjugates of (1 2 3) in A4 . Fortunately, we know how to compute conjugates quickly in Sn using Lemma
7.3.2 and the discussion afterwards:
• id(1 2 3)id−1 = (1 2 3)
• (1 2 4)(1 2 3)(1 2 4)−1 = (2 4 3)
• (2 3 4)(1 2 3)(2 3 4)−1 = (1 3 4)
• (1 2)(3 4)(1 2 3)[(1 2)(3 4)]−1 = (2 1 4) = (1 4 2)
Thus, the conjugacy class of (1 2 3) in A4 is:
{(1 2 3), (2 4 3), (1 3 4), (1 4 2)}
If we work with a 3-cycle not in this set (for example, (1 2 4)), the above argument works through to show
that its conjugacy class also has size 4, so its conjugacy class must be the other four 3-cycles in A4 . Thus,
we get the conjugacy class
{(1 2 4), (1 3 2), (1 4 3), (2 3 4)}
Finally, let’s look at the conjugacy class of (1 2)(3 4) in A4 . We have
so the products of two disjoint 2-cycles still form one conjugacy class in A4 . Why did this conjugacy class
not break up? By the Orbit-Stabilizer Theorem, we know that |CS4((1 2)(3 4))| = 24/3 = 8. If we actually compute this centralizer, we see that four of its elements are even permutations and four of its elements are odd permutations. Therefore, |CA4((1 2)(3 4))| = 4 (the four even permutations in CS4((1 2)(3 4))), hence using the Orbit-Stabilizer Theorem again we conclude that the conjugacy class of (1 2)(3 4) in A4 has size 12/4 = 3.
Putting everything together, we see that the class equation of A4 is:
12 = 1 + 3 + 4 + 4.
Working through the possibilities as in S4, we conclude that the three normal subgroups of A4 we found above are indeed all of the normal subgroups of A4 (there is no other way to add some subcollection of these numbers which includes 1 to obtain a divisor of 12). Notice that we can also conclude the following.
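Both the class equation of A4 and the search through sums of class sizes can be reproduced by machine. The following sketch computes the conjugacy classes of A4 by brute force and then lists every union of classes that contains {id} and whose total size divides 12; as noted above, these are only candidates, and one must still check that each is actually a subgroup:

```python
from itertools import combinations, permutations

def compose(s, t):
    return tuple(s[t[i] - 1] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v - 1] = i + 1
    return tuple(inv)

def is_even(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return inversions % 2 == 0

A4 = [p for p in permutations((1, 2, 3, 4)) if is_even(p)]

# Conjugacy classes of A4, computed directly from the definition.
classes, seen = [], set()
for a in A4:
    if a not in seen:
        cls = {compose(compose(g, a), inverse(g)) for g in A4}
        classes.append(cls)
        seen |= cls
sizes = [len(c) for c in classes]
print(sorted(sizes))                 # [1, 3, 4, 4]: the class equation of A4

# Unions of classes containing {id} whose total size divides 12.
for k in range(1, len(classes) + 1):
    for combo in combinations(range(len(classes)), k):
        if 0 in combo and 12 % sum(sizes[i] for i in combo) == 0:
            print([sizes[i] for i in combo])   # [1], [1, 3], [1, 3, 4, 4]
```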
Proposition 7.4.2. A4 has no subgroup of order 6, so the converse of Lagrange’s Theorem is false.
Proof. If H were a subgroup of A4 with |H| = 6, then H would have index 2 in A4, and so would be normal in A4. However, we just saw that A4 has no normal subgroup of order 6.
In the case of S4 that we just worked through, we saw that when we restrict down to A4 , some conjugacy
classes split into two and others stay intact. This is a general phenomenon in Sn , as we now show. We first
need the following fact.
Lemma 7.4.3. Let H be a subgroup of Sn. If H contains an odd permutation, then |H ∩ An| = |H|/2.
Proof. Suppose that H contains an odd permutation, and fix such an element τ ∈ H. Let X = H ∩ An
be the set of even permutations in H and let Y = H\An be the set of odd permutations in H. Define
f : X → Sn by letting f(σ) = στ. We claim that f maps X bijectively onto Y. We check the following:
• range(f) ⊆ Y: If σ ∈ X, then σ is even, so στ is odd (the product of an even permutation and an odd permutation is odd), and στ ∈ H because H is closed under composition. Thus, f(σ) ∈ Y.
• f is injective: If σ1, σ2 ∈ X satisfy f(σ1) = f(σ2), then σ1τ = σ2τ, so σ1 = σ2 by cancellation.
• f maps onto Y: Given ρ ∈ Y, notice that ρτ−1 ∈ H and that ρτ−1 is even (the product of two odd permutations is even), so ρτ−1 ∈ X and f(ρτ−1) = ρτ−1τ = ρ.
Therefore, f maps X bijectively onto Y, and hence |X| = |Y|. Since H = X ∪ Y and X ∩ Y = ∅, we conclude that |H| = |X| + |Y|. It follows that |H| = 2 · |X|, so |X| = |H|/2.
Proposition 7.4.4. Let σ ∈ An. Let X be the conjugacy class of σ in Sn, and let Y be the conjugacy class of σ in An.
1. If σ commutes with some odd permutation in Sn, then Y = X.
2. If σ does not commute with any odd permutation in Sn, then |Y| = |X|/2, so the conjugacy class of σ in Sn breaks up into two conjugacy classes of equal size in An.
Proof. First notice that X ⊆ An because σ ∈ An and all elements of X have the same cycle type as σ. Let
H = CSn (σ) and let K = CAn (σ). Notice that Y ⊆ X and K = H ∩ An . By the Orbit-Stabilizer Theorem
applied in each of Sn and An , we know that
|H| · |X| = n!    and    |K| · |Y| = n!/2
and therefore
2 · |K| · |Y | = |H| · |X|
Suppose first that σ commutes with some odd permutation in Sn. We then have that H contains an odd permutation, so by the lemma we know that |K| = |H ∩ An| = |H|/2. Plugging this into the above equation, we see that |H| · |Y| = |H| · |X|, so |Y| = |X|. Since Y ⊆ X, it follows that Y = X.
Suppose now that σ does not commute with any odd permutation in Sn . We then have that H ⊆ An ,
so K = H ∩ An = H and hence |K| = |H|. Plugging this into the above equation, we see that 2 · |H| · |Y| = |H| · |X|, so |Y| = |X|/2.
Let’s put all of the knowledge to work in order to study A5 . We begin by recalling Problem 6c on
Homework 4, where we computed the number of elements of S5 of each given cycle type:
• Identity: 1
• 2-cycles: (5 · 4)/2 = 10
• 3-cycles: (5 · 4 · 3)/3 = 20
• 4-cycles: (5 · 4 · 3 · 2)/4 = 30
• 5-cycles: (5 · 4 · 3 · 2 · 1)/5 = 24
• Product of two disjoint 2-cycles: (5 · 4 · 3 · 2)/8 = 15
• Product of a 3-cycle and a 2-cycle which are disjoint: This equals the number of 3-cycles as discussed above, which is 20.
Thus, the class equation of S5 is:
120 = 1 + 10 + 15 + 20 + 20 + 24 + 30.
Now in A5 we only have the identity, the 3-cycles, the 5-cycles, and the product of two disjoint 2-cycles.
Let’s examine what happens to these conjugacy classes in A5 . We know that each of these conjugacy classes
either stays intact or breaks in half by the above proposition.
• The set of 3-cycles has size 20, and since (1 2 3) commutes with the odd permutation (4 5), this
conjugacy class stays intact in A5 .
• The set of 5-cycles has size 24, and since 24 ∤ 60, it is not possible that this conjugacy class stays intact in A5. Therefore, the set of 5-cycles breaks up into two conjugacy classes of size 12.
• The set of products of two disjoint 2-cycles has size 15. Since this is an odd number, it is not possible that it breaks into two pieces of size 15/2, so it must remain a conjugacy class in A5.
Thus, the class equation of A5 is:
60 = 1 + 12 + 12 + 15 + 20.
With this in hand, we can examine the normal subgroups of A5. Of course, we know that {id} and A5 are normal subgroups of A5. Suppose that H is a normal subgroup of A5 with {id} ⊊ H ⊊ A5. We then have
that id ∈ H, that |H| | 60, and that H is a union of conjugacy classes of A5 . Thus, we would need to
find a way to add some collection of the various numbers in the above class equation, necessarily including
the number 1, such that their sum is a divisor of 60. Working through the possibilities, we see that this is
not possible except in the cases where we take all the numbers (corresponding to A5) and where we take only the 1 (corresponding to {id}). We have proved the following important theorem.

Theorem 7.4.5. The group A5 is simple, i.e. the only normal subgroups of A5 are {id} and A5.
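The finite search behind this argument can be carried out mechanically. The following sketch checks every subset of the class equation of A5 that contains the class {id}, confirming that only the two trivial possibilities have a sum dividing 60:

```python
from itertools import combinations

class_sizes = [1, 12, 12, 15, 20]      # the class equation of A5
for k in range(1, len(class_sizes) + 1):
    for combo in combinations(range(len(class_sizes)), k):
        if 0 in combo:                  # must contain the class {id}
            total = sum(class_sizes[i] for i in combo)
            if 60 % total == 0:
                # Only [1] and [1, 12, 12, 15, 20] are ever printed.
                print([class_sizes[i] for i in combo])
```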
In fact, An is a simple group for all n ≥ 5, but the above method of proof falls apart for n > 5 (the
class equations get too long and there occasionally are ways to add certain numbers to get divisors of |An |).
Thus, we need some new techniques. One approach is to use induction on n ≥ 5 starting with the base case
that we just proved, but we will not go through all the details here.
7.5 Counting Orbits

Suppose that we want to count the number of ways to color the vertices of a square using n colors, where we consider two colorings to be the same if one can be obtained from the other by a symmetry of the square. Label the vertices of the square with 1, 2, 3, 4, and let X be the set of all colorings when we distinguish the vertices, i.e.
X = {(a1, a2, a3, a4) : 1 ≤ ai ≤ n}
so that |X| = n4. For example, if n ≥ 2, then X contains both (R, G, R, G) and (G, R, G, R), where R is the number we associated to the color red and G the number for the color green. Of course, these are distinct elements of X, but we want to make them the "same". Notice that S4 acts on X via the action:
σ ∗ (a1, a2, a3, a4) = (aσ(1), aσ(2), aσ(3), aσ(4))
Since D4 is a subgroup of S4, we can restrict the above to an action of D4 on X. Recall that when viewed as a subgroup
of S4 we can list the elements of D4 as:
id r = (1 2 3 4) r2 = (1 3)(2 4) r3 = (1 4 3 2)
s = (2 4) rs = (1 2)(3 4) r2 s = (1 3) r3 s = (1 4)(2 3)
The key insight is that two colorings are the same exactly when they are in the same orbit of this action by
D4. For example, we have the following orbits:
O(R,G,R,G) = {(R, G, R, G), (G, R, G, R)}
O(R,R,R,G) = {(R, R, R, G), (R, R, G, R), (R, G, R, R), (G, R, R, R)}
Thus, to count the number of colorings up to symmetry, we want to count the number of orbits of this action. The difficulty in attacking this directly is that the orbits have different sizes, so we cannot simply divide n4 by a common orbit size. We need a better way to count the number of orbits of
an action.
Definition 7.5.1. Suppose that G acts on X. For each g ∈ G, let Xg = {x ∈ X : g ∗ x = x}. The set Xg is
called the fixed-point set of g.
Theorem 7.5.2 (Burnside’s Lemma - due originally to Cauchy - sometimes also attributed to Frobenius).
Suppose that G acts on X and that both G and X are finite. If k is the number of orbits of the action, then
k = (1/|G|) · Σ_{g∈G} |X_g|.
Thus, the number of orbits is the average number of elements fixed by each g ∈ G.
Proof. We count the set A = {(g, x) ∈ G × X : g ∗ x = x} in two different ways. On the one hand, for each g ∈ G, there are |X_g| many elements of A in the "row" corresponding to g, so |A| = Σ_{g∈G} |X_g|. On the other hand, for each x ∈ X, there are |G_x| many elements of A in the "column" corresponding to x, so |A| = Σ_{x∈X} |G_x|. Using the Orbit-Stabilizer Theorem, we know that
Σ_{g∈G} |X_g| = |A| = Σ_{x∈X} |G_x| = Σ_{x∈X} |G|/|O_x| = |G| · Σ_{x∈X} 1/|O_x|
and therefore
(1/|G|) · Σ_{g∈G} |X_g| = Σ_{x∈X} 1/|O_x|.
Let's examine this latter sum. Let P_1, P_2, ..., P_k be the distinct orbits of X. Since O_x = P_i whenever x ∈ P_i, we then have
Σ_{x∈X} 1/|O_x| = Σ_{i=1}^{k} Σ_{x∈P_i} 1/|P_i| = Σ_{i=1}^{k} |P_i| · (1/|P_i|) = Σ_{i=1}^{k} 1 = k.
Therefore,
(1/|G|) · Σ_{g∈G} |X_g| = Σ_{x∈X} 1/|O_x| = k.
Of course, Burnside's Lemma will only be useful if it is not hard to compute the various values |X_g|. Let's return to our example of D4 acting on the set X above. First notice that X_id = X, so |X_id| = n⁴. Let's move on to |X_r| where r = (1 2 3 4). Which elements (a1, a2, a3, a4) are fixed by r? We have r ∗ (a1, a2, a3, a4) = (a2, a3, a4, a1), so we need a1 = a2, a2 = a3, a3 = a4, and a4 = a1. Thus, an element (a1, a2, a3, a4) is fixed by r exactly when all the a_i are equal. There are n such choices (because we can pick a1 arbitrarily, and then all the others are determined), so |X_r| = n. In general, given any σ ∈ D4, an element of X is in X_σ exactly when all of the entries in each cycle of σ get the same color. Therefore, we have |X_σ| = n^d where d is the number of cycles in the cycle notation of σ, assuming that we include the 1-cycles. For example, we have |X_{r²}| = n² and |X_s| = n³. Working this out in the above cases and using the fact that |D4| = 8, we
conclude from Burnside’s Lemma that the number of ways to color the vertices of the square with n colors
up to symmetry is:
(1/8) · (n⁴ + n + n² + n + n³ + n² + n³ + n²) = (1/8) · (n⁴ + 2n³ + 3n² + 2n).
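As a sanity check, we can count the orbits directly for small n and compare with the formula. The sketch below (not from the notes) encodes the eight elements of D4 as position permutations, using the same convention as the computation of X_r above:

    from itertools import product

    # sigma sends the coloring x to (x[sigma[0]], ..., x[sigma[3]])
    D4 = [(0,1,2,3), (1,2,3,0), (2,3,0,1), (3,0,1,2),
          (0,3,2,1), (1,0,3,2), (2,1,0,3), (3,2,1,0)]

    def orbit_count(n):
        colorings = set(product(range(n), repeat=4))
        orbits = 0
        while colorings:
            x = colorings.pop()
            orbits += 1
            for sigma in D4:            # remove the entire orbit of x
                colorings.discard(tuple(x[i] for i in sigma))
        return orbits

    for n in range(1, 6):
        assert orbit_count(n) == (n**4 + 2*n**3 + 3*n**2 + 2*n) // 8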
Let’s examine the problem of coloring the faces of a cube. We will label the faces of a cube as a 6-sided die
is labeled so that opposing faces sum to 7. For example, we could take the top face to be 1, the bottom 6,
the front 2, the back 5, the right 3, and the left 4. With this labeling, the symmetries of the cube form a subgroup of S6. Letting G be this subgroup, notice that |G| = 24 because we can put any of the 6 faces on top, and then rotate around the top in 4 distinct ways. Working through the actual elements, one sees
that G equals the following subset of S6 :
id (2 3 5 4) (2 5)(3 4) (2 4 5 3)
(1 2)(3 4)(5 6) (1 3 2)(4 5 6) (1 5 6 2) (1 4 2)(3 5 6)
(1 2 3)(4 6 5) (1 3)(2 5)(4 6) (1 5 3)(2 4 6) (1 4 6 3)
(1 2 4)(3 6 5) (1 3 6 4) (1 5 4)(2 3 6) (1 4)(2 5)(3 6)
(1 2 6 5) (1 3 5)(2 6 4) (1 5)(2 6)(3 4) (1 4 5)(2 6 3)
(1 6)(3 4) (1 6)(2 3)(4 5) (1 6)(2 5) (1 6)(2 4)(3 5)
Let's count how many colorings each type of element fixes:
• The identity: it fixes all n⁶ colorings.
• 4-cycles: These are 90° or 270° rotations around the line through the centers of opposite faces. There are 6 of these, and each fixes n³ many colorings.
• Product of two disjoint 2-cycles: These are 180° rotations around the line through the centers of opposite faces. There are 3 of these, and each fixes n⁴ many colorings.
• Product of two 3-cycles: These are rotations around the line through opposite corners of the cube. There are 8 of these (there are four pairs of opposing corners, and then we can rotate either 120° or 240° for each), and each fixes n² many colorings.
• Product of three 2-cycles: These are 180° rotations around a line through the middle of opposing edges of the cube. There are 6 of these (there are 6 such pairs of opposing edges), and each fixes n³ many colorings.
Using Burnside’s Lemma, we conclude that the total number of colorings of the faces of a cube using n colors
up to symmetry is:
(1/24) · (n⁶ + 3n⁴ + 12n³ + 8n²).
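We can also confirm this count mechanically from the table of 24 permutations listed above, using the observation that σ fixes n^d colorings where d is the number of cycles of σ (including the omitted 1-cycles). A sketch, not part of the original notes:

    G = ["", "(2 3 5 4)", "(2 5)(3 4)", "(2 4 5 3)",
         "(1 2)(3 4)(5 6)", "(1 3 2)(4 5 6)", "(1 5 6 2)", "(1 4 2)(3 5 6)",
         "(1 2 3)(4 6 5)", "(1 3)(2 5)(4 6)", "(1 5 3)(2 4 6)", "(1 4 6 3)",
         "(1 2 4)(3 6 5)", "(1 3 6 4)", "(1 5 4)(2 3 6)", "(1 4)(2 5)(3 6)",
         "(1 2 6 5)", "(1 3 5)(2 6 4)", "(1 5)(2 6)(3 4)", "(1 4 5)(2 6 3)",
         "(1 6)(3 4)", "(1 6)(2 3)(4 5)", "(1 6)(2 5)", "(1 6)(2 4)(3 5)"]

    def num_cycles(cycle_str):
        # count the cycles written in the notation, then add the omitted 1-cycles
        moved, cycles = 0, 0
        for part in cycle_str.replace("(", " ").split(")"):
            pts = part.split()
            if pts:
                cycles += 1
                moved += len(pts)
        return cycles + (6 - moved)

    def colorings(n):
        return sum(n ** num_cycles(g) for g in G) // 24

    for n in range(1, 6):
        assert colorings(n) == (n**6 + 3*n**4 + 12*n**3 + 8*n**2) // 24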
Chapter 8

Cyclic, Abelian, and Solvable Groups

8.1 Structure of Cyclic Groups

Proposition 8.1.1. Every subgroup of a cyclic group is cyclic.

Proof. Let G = ⟨c⟩ be a cyclic group with generator c, and let H be a subgroup of G. If H = {e}, then H = ⟨e⟩ is cyclic, so assume that H ≠ {e}. Fix a nonidentity element c^k ∈ H with k ≠ 0; since H is closed under inverses, either k or −k is positive, and hence
{n ∈ ℕ⁺ : c^n ∈ H} ≠ ∅.
Let m be the least element of this set (which exists by well-ordering). We show that H = ⟨c^m⟩. Since c^m ∈ H and H is a subgroup of G, we know that ⟨c^m⟩ ⊆ H by Proposition 5.2.2. Let a ∈ H be arbitrary. Since H ⊆ G, we also have a ∈ G, so we may fix k ∈ ℤ with a = c^k. Write k = qm + r where 0 ≤ r < m. We then have
a = c^k = c^{qm+r} = c^{qm} c^r,
hence
c^r = (c^{qm})^{−1} · a = c^{−qm} · a = (c^m)^{−q} · a.
Now c^m ∈ H and a ∈ H. Since H is a subgroup of G, we know that it is closed under inverses and the group operation, hence (c^m)^{−q} · a ∈ H and so c^r ∈ H. By choice of m as the smallest positive power of c which lies in H, we conclude that we must have r = 0. Therefore,
a = c^k = c^{qm} = (c^m)^q ∈ ⟨c^m⟩.
Since a ∈ H was arbitrary, it follows that H ⊆ ⟨c^m⟩. Combining this with the reverse containment above, we conclude that H = ⟨c^m⟩, so H is cyclic.
Therefore, if G is a cyclic group and c is a generator for G, then every subgroup of G can be written in the form ⟨c^k⟩ for some k ∈ ℤ (since every element of G equals c^k for some k ∈ ℤ). It is possible that the same subgroup can be written in two different such ways. For example, in ℤ/6ℤ with generator 1, we have ⟨2⟩ = {0, 2, 4} = ⟨4⟩. We next determine when we have such equalities. We begin with the following lemma that holds in any group.
Lemma 8.1.2. Let G be a group, and let a ∈ G have finite order n ∈ ℕ⁺. Let m ∈ ℤ and let d = gcd(m, n). We then have that ⟨a^m⟩ = ⟨a^d⟩.

Proof. Since d = gcd(m, n), we have d | m, so we may fix t ∈ ℤ with m = dt. We then have
a^m = a^{dt} = (a^d)^t.
Therefore, a^m ∈ ⟨a^d⟩ and hence ⟨a^m⟩ ⊆ ⟨a^d⟩. We now prove the reverse containment. Since d = gcd(m, n), we may fix k, ℓ ∈ ℤ with d = mk + nℓ. Now
a^d = a^{mk+nℓ} = a^{mk} a^{nℓ} = (a^m)^k (a^n)^ℓ = (a^m)^k e^ℓ = (a^m)^k,
so a^d ∈ ⟨a^m⟩, and hence ⟨a^d⟩ ⊆ ⟨a^m⟩. Combining the two containments, we conclude that ⟨a^m⟩ = ⟨a^d⟩.
For example, in our above case where G = ℤ/6ℤ and a = 1, we have |a| = 6 and a⁴ = 4 (writing powers multiplicatively for the additive group). Since gcd(4, 6) = 2, the lemma immediately implies that ⟨a⁴⟩ = ⟨a²⟩, and hence ⟨4⟩ = ⟨2⟩.
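Since ℤ/nℤ is concrete, this is easy to test computationally. In the sketch below (an illustration, not from the notes), the "power" a^m in the additive group ℤ/nℤ is m · a mod n:

    from math import gcd

    def generated(n, m):
        """The cyclic subgroup <m> of Z/nZ."""
        return {(k * m) % n for k in range(n)}

    n = 6
    for m in range(n):
        assert generated(n, m) == generated(n, gcd(m, n))
    assert generated(6, 4) == generated(6, 2) == {0, 2, 4}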
Proposition 8.1.3. Let G be a finite cyclic group of order n ∈ ℕ⁺, and let c be an arbitrary generator of G. If H is a subgroup of G, then H = ⟨c^d⟩ for some d ∈ ℕ⁺ with d | n. Furthermore, if H = ⟨c^d⟩, then |H| = n/d.

Proof. Let H be a subgroup of G. We know that H is cyclic by Proposition 8.1.1, so we may fix a ∈ G with H = ⟨a⟩. Since G = ⟨c⟩, we can fix m ∈ ℤ with a = c^m. Let d = gcd(m, n) and notice that d ∈ ℕ⁺ and d | n. We then have
H = ⟨a⟩ = ⟨c^m⟩ = ⟨c^d⟩
by Lemma 8.1.2. For the last claim, notice that if H = ⟨c^d⟩, then
|H| = |⟨c^d⟩| = |c^d| = n/gcd(d, n) = n/d,
where we used Proposition 4.6.8 together with the fact that gcd(d, n) = d because d | n. This completes the proof.
We can improve on the lemma to determine precisely when two powers of the same element generate the
same subgroup.
Proposition 8.1.4. Let G be a group and let a ∈ G.
1. Suppose that |a| = n ∈ ℕ⁺. For any ℓ, m ∈ ℤ, we have that ⟨a^ℓ⟩ = ⟨a^m⟩ if and only if gcd(ℓ, n) = gcd(m, n).
2. Suppose that |a| = ∞. For any ℓ, m ∈ ℤ, we have ⟨a^ℓ⟩ = ⟨a^m⟩ if and only if either ℓ = m or ℓ = −m.
Proof. 1. Let ℓ, m ∈ ℤ be arbitrary.
• Suppose first that ⟨a^ℓ⟩ = ⟨a^m⟩. By Proposition 4.6.8, we have
|⟨a^ℓ⟩| = |a^ℓ| = n/gcd(ℓ, n)
and
|⟨a^m⟩| = |a^m| = n/gcd(m, n).
Since ⟨a^ℓ⟩ = ⟨a^m⟩, it follows that |⟨a^ℓ⟩| = |⟨a^m⟩|, and hence
n/gcd(ℓ, n) = n/gcd(m, n).
Thus, we have n · gcd(ℓ, n) = n · gcd(m, n). Since n ≠ 0, we can divide by it to conclude that gcd(ℓ, n) = gcd(m, n).
• Suppose now that gcd(ℓ, n) = gcd(m, n), and let d be this common value. By Lemma 8.1.2, we have both ⟨a^ℓ⟩ = ⟨a^d⟩ and ⟨a^m⟩ = ⟨a^d⟩. Therefore, ⟨a^ℓ⟩ = ⟨a^m⟩.
2. Let ℓ, m ∈ ℤ be arbitrary.
• Suppose first that ⟨a^ℓ⟩ = ⟨a^m⟩. We then have a^m ∈ ⟨a^ℓ⟩, so we may fix k ∈ ℤ with a^m = (a^ℓ)^k. It follows that a^m = a^{ℓk}, hence m = ℓk by Proposition 5.2.3, and so ℓ | m. Similarly, using the fact that a^ℓ ∈ ⟨a^m⟩, it follows that m | ℓ. Since both ℓ | m and m | ℓ, we can use Corollary 2.2.5 to conclude that either ℓ = m or ℓ = −m.
• If ℓ = m, then we trivially have ⟨a^ℓ⟩ = ⟨a^m⟩. Suppose then that ℓ = −m. We then have a^ℓ = a^{−m} = (a^m)^{−1}, so a^ℓ ∈ ⟨a^m⟩ and hence ⟨a^ℓ⟩ ⊆ ⟨a^m⟩. We also have a^m = a^{−ℓ} = (a^ℓ)^{−1}, so a^m ∈ ⟨a^ℓ⟩ and hence ⟨a^m⟩ ⊆ ⟨a^ℓ⟩. Combining both of these, we conclude that ⟨a^ℓ⟩ = ⟨a^m⟩.
We know by Lagrange's Theorem that if H is a subgroup of a finite group G, then |H| is a divisor of |G|. In general, the converse to Lagrange's Theorem is false, as we saw in Proposition 7.4.2 (although |A4| = 12, the group A4 has no subgroup of order 6). However, for cyclic groups, we have the very strong converse that there is a unique subgroup of order d for each divisor d of |G|.

Theorem 8.1.5. Let G be a cyclic group with generator c.
1. Suppose that |G| = n ∈ ℕ⁺. For each d ∈ ℕ⁺ with d | n, there is a unique subgroup of G of order d, namely ⟨c^{n/d}⟩ = {a ∈ G : a^d = e}.
2. Suppose that |G| = ∞. Every subgroup of G equals ⟨c^m⟩ for some m ∈ ℕ, and these subgroups are distinct for distinct m.
Proof. 1. Let d ∈ ℕ⁺ with d | n. We then have that n/d ∈ ℤ and d · (n/d) = n. Thus, (n/d) | n, and we have |⟨c^{n/d}⟩| = n/(n/d) = d by Proposition 8.1.3. Therefore, ⟨c^{n/d}⟩ is a subgroup of G of order d.
We next show that ⟨c^{n/d}⟩ = {a ∈ G : a^d = e}. For one containment, notice that if a ∈ ⟨c^{n/d}⟩, say a = (c^{n/d})^ℓ, then a^d = (c^{(n/d)·ℓ})^d = (c^n)^ℓ = e^ℓ = e. For the other, suppose that a ∈ G satisfies a^d = e, and fix k ∈ ℤ with a = c^k. Since c^{kd} = a^d = e, we know that n | kd, so we may fix m ∈ ℤ with kd = nm, in which case k = (n/d) · m. We then have
(c^{n/d})^m = c^{(n/d)·m} = c^k = a
and hence a ∈ ⟨c^{n/d}⟩.
Putting these containments together, we conclude that ⟨c^{n/d}⟩ = {a ∈ G : a^d = e}.
Suppose now that H is an arbitrary subgroup of G with order d. By Lagrange's Theorem, every element of H has order dividing d, so h^d = e for all h ∈ H. It follows that H ⊆ {a ∈ G : a^d = e}, and since each of these sets has cardinality d, we conclude that H = {a ∈ G : a^d = e}.
2. Let H be an arbitrary subgroup of G. We know that H is cyclic by Proposition 8.1.1, so we may fix a ∈ G with H = ⟨a⟩. Since G = ⟨c⟩, we can fix m ∈ ℤ with a = c^m. If m < 0, then −m > 0, and H = ⟨c^m⟩ = ⟨c^{−m}⟩ by Proposition 8.1.4.
Finally, notice that if m, n ∈ ℕ with m ≠ n, then we also have m ≠ −n (because both m and n are nonnegative), so ⟨c^m⟩ ≠ ⟨c^n⟩ by Proposition 8.1.4 again.
Corollary 8.1.6. The subgroups of ℤ are precisely ⟨n⟩ = nℤ = {nk : k ∈ ℤ} for each n ∈ ℕ, and each of these is distinct.

Proof. This is immediate from Theorem 8.1.5, the fact that ℤ is a cyclic group generated by 1, and the fact that the "power" 1^n in additive notation is n · 1 = n for all n ∈ ℕ.
We now determine the number of elements of each order in a cyclic group. We start by determining the
number of generators.
Proposition 8.1.7. Let G be a cyclic group.
1. If |G| = n ∈ N+ , then G has exactly ϕ(n) distinct generators.
2. If |G| = ∞, then G has exactly 2 generators.
Proof. Since G is cyclic, we may fix c ∈ G with G = hci.
1. Suppose that |G| = n ∈ ℕ⁺. We then have
G = {c^0, c^1, c^2, ..., c^{n−1}}
and we know that c^k ≠ c^ℓ whenever 0 ≤ k < ℓ < n. Thus, we need only determine how many of these elements are generators. By Proposition 8.1.4, we have ⟨c^k⟩ = ⟨c^1⟩ if and only if gcd(k, n) = gcd(1, n). Since ⟨c⟩ = G and gcd(1, n) = 1, we conclude that ⟨c^k⟩ = G if and only if gcd(k, n) = 1. Therefore, the number of generators of G equals the number of k ∈ {0, 1, 2, ..., n − 1} such that gcd(k, n) = 1, which is ϕ(n).
2. Suppose that |G| = ∞. We then have that G = {c^k : k ∈ ℤ} and that each of these elements is distinct by Proposition 5.2.3. Thus, we need only determine the number of k ∈ ℤ such that ⟨c^k⟩ = G, which is the number of k ∈ ℤ such that ⟨c^k⟩ = ⟨c^1⟩. Using Proposition 8.1.4, we have ⟨c^k⟩ = ⟨c^1⟩ if and only if either k = 1 or k = −1. Thus, G has exactly 2 generators.
Proposition 8.1.8. Let G be a cyclic group of order n and let d | n. We then have that G has exactly ϕ(d)
many elements of order d.
Proof. Fix a generator c of G, and let H = ⟨c^{n/d}⟩ = {a ∈ G : a^d = e}. We know from Theorem 8.1.5 that H is the unique subgroup of G having order d. Since H = ⟨c^{n/d}⟩, we know that H is cyclic, and hence Proposition 8.1.7 tells us that H has exactly ϕ(d) many generators. Since |H| = d, the generators of H are exactly the elements of H of order d, so H has exactly ϕ(d) many elements of order d. Now if a ∈ G is any element with |a| = d, then a^d = e, and hence a ∈ H. Therefore, every element of G with order d lies in H, and hence G has exactly ϕ(d) many elements of order d.
Corollary 8.1.9. For any n ∈ ℕ⁺, we have
n = Σ_{d|n} ϕ(d).

Proof. Let G be a cyclic group of order n. Every element of G has order d for exactly one divisor d of n (the order of an element divides n by Lagrange's Theorem), and G has exactly ϕ(d) many elements of order d by Proposition 8.1.8. Summing over the divisors of n therefore counts each of the n elements of G exactly once.
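Both Proposition 8.1.8 and Corollary 8.1.9 can be checked directly in ℤ/nℤ. The following snippet (an illustration, not from the notes) counts elements by additive order:

    from math import gcd

    def phi(d):
        return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)

    def order(n, a):
        """Additive order of a in Z/nZ."""
        k, x = 1, a % n
        while x != 0:
            x = (x + a) % n
            k += 1
        return k

    for n in [6, 12, 30]:
        counts = {}
        for a in range(n):
            counts[order(n, a)] = counts.get(order(n, a), 0) + 1
        for d in range(1, n + 1):
            if n % d == 0:
                assert counts.get(d, 0) == phi(d)
        assert sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n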
Finally, we end this section by determining when two subgroups of a cyclic group are subgroups of each other.
Proposition 8.1.10. Let G be a cyclic group with generator c.
1. Suppose that |G| = n ∈ ℕ⁺. For each d ∈ ℕ⁺ with d | n, let H_d be the unique subgroup of G of order d. We then have that H_k ⊆ H_ℓ if and only if k | ℓ.
2. Suppose that |G| = ∞. For each k ∈ ℕ, let H_k = ⟨c^k⟩. For any k, ℓ ∈ ℕ, we have H_k ⊆ H_ℓ if and only if ℓ | k.
Proof. 1. Let k, ℓ ∈ ℕ⁺ be arbitrary with k | n and ℓ | n. Notice that if H_k ⊆ H_ℓ, then H_k is a subgroup of H_ℓ, so k | ℓ by Lagrange's Theorem. Conversely, suppose that k | ℓ. Let a ∈ H_k be arbitrary. By Theorem 8.1.5, we have a^k = e. Since k | ℓ, we can fix m ∈ ℤ with ℓ = mk, and notice that
a^ℓ = a^{mk} = (a^k)^m = e^m = e,
so a ∈ H_ℓ by Theorem 8.1.5. Therefore, H_k ⊆ H_ℓ.
2. Let k, ℓ ∈ ℕ be arbitrary. If ℓ | k, then we can fix m ∈ ℤ with k = mℓ, in which case we have c^k = (c^ℓ)^m ∈ ⟨c^ℓ⟩, so ⟨c^k⟩ ⊆ ⟨c^ℓ⟩ and hence H_k ⊆ H_ℓ. Conversely, suppose that H_k ⊆ H_ℓ. Since c^k ∈ H_k, we then have c^k ∈ ⟨c^ℓ⟩, so we can fix m ∈ ℤ with c^k = (c^ℓ)^m. We then have c^k = c^{ℓm}, so k = ℓm by Proposition 5.2.3, and hence ℓ | k.
Chapter 9
Introduction to Rings
Definition 9.1.1. A ring is a set R equipped with two binary operations + and · and two elements 0, 1 ∈ R satisfying the following properties:
1. a + (b + c) = (a + b) + c for all a, b, c ∈ R.
2. a + b = b + a for all a, b ∈ R.
3. a + 0 = a = 0 + a for all a ∈ R.
4. For all a ∈ R, there exists b ∈ R with a + b = 0 = b + a.
5. a · (b · c) = (a · b) · c for all a, b, c ∈ R.
6. a · 1 = a = 1 · a for all a ∈ R.
7. a · (b + c) = a · b + a · c for all a, b, c ∈ R.
8. (a + b) · c = a · c + b · c for all a, b, c ∈ R.
Notice that the first four axioms simply say that (R, +, 0) is an abelian group.
Some sources omit property 6 and so do not require that their rings have a multiplicative identity. They
call our rings either “rings with identity” or “rings with 1”. We will not discuss such objects here, and for
us “ring” implies that there is a multiplicative identity. Notice that we are requiring that + is commutative
but we do not require that · is commutative. Also, we are not requiring that elements have multiplicative
inverses.
For example, Z, Q, R, and C are all rings with the standard notions of addition and multiplication along
with the usual 0 and 1. For each n ∈ N+ , the set Mn (R) of all n × n matrices with entries from R is a ring,
where + and · are the usual matrix addition and multiplication, 0 is the zero matrix, and 1 = In is the n × n
identity matrix. Certainly some matrices in Mn (R) fail to be invertible, but that is not a problem because
the ring axioms say nothing about the existence of multiplicative inverses. Notice that the multiplicative
group GLn (R) is not a ring because it is not closed under addition. For example
[ 1 0 ]   [ −1  0 ]   [ 0 0 ]
[ 0 1 ] + [  0 −1 ] = [ 0 0 ]
but this latter matrix is not an element of GL2 (R). The next proposition gives some examples of finite rings.
Proposition 9.1.2. Let n ∈ ℕ⁺. The set ℤ/nℤ of equivalence classes of the relation ≡ₙ, with operations
\overline{a} + \overline{b} = \overline{a + b} and \overline{a} · \overline{b} = \overline{a · b},
is a ring with additive identity \overline{0} and multiplicative identity \overline{1}.
Proof. We already know that (Z/nZ, +, 0) is an abelian group. We proved that multiplication of equivalence
classes given by
\overline{a} · \overline{b} = \overline{a · b}
is well-defined in Proposition 3.6.4, so it gives a binary operation on Z/nZ. It is now straightforward to check
that 1 is a multiplicative identity, that · is associative, and that · distributes over addition by appealing to
these facts in Z.
Initially, it may seem surprising that we require that + is commutative in rings. In fact, this axiom
follows from the others (so logically we could omit it). To see this, suppose that R satisfies all of the above
axioms except possibly axiom 2. Let a, b ∈ R be arbitrary. We then have
(1 + 1) · (a + b) = (1 + 1) · a + (1 + 1) · b
=1·a+1·a+1·b+1·b
=a+a+b+b
and also
(1 + 1) · (a + b) = 1 · (a + b) + 1 · (a + b)
=a+b+a+b
hence
a + a + b + b = a + b + a + b.
Now we are assuming that (R, +, 0) is a group because we have axioms 1, 3, and 4, so by left and right
cancellation we conclude that a + b = b + a.
If R is a ring, then we know that (R, +, 0) is a group. In particular, every a ∈ R has a unique additive
inverse. Since we have another binary operation in multiplication, we choose new notation for the additive inverse of a (rather than use the old a⁻¹, which looks multiplicative).
Definition 9.1.3. Let R be a ring. Given a ∈ R, we let −a be the unique additive inverse of a, so a+(−a) = 0
and (−a) + a = 0.
We now establish some basic facts about arithmetic in rings.

Proposition 9.1.4. Let R be a ring.
1. a · 0 = 0 = 0 · a for all a ∈ R.
2. a · (−b) = −(a · b) = (−a) · b for all a, b ∈ R.
3. (−a) · (−b) = a · b for all a, b ∈ R.
4. If 1 = 0, then R = {0}.

Proof. 1. Let a ∈ R be arbitrary. We have
a · 0 = a · (0 + 0) = (a · 0) + (a · 0).
Therefore
0 + (a · 0) = (a · 0) + (a · 0)
By cancellation in the group (R, +, 0), it follows that a · 0 = 0. Similarly we have
0 · a = (0 + 0) · a
= (0 · a) + (0 · a).
Therefore
0 + (0 · a) = (0 · a) + (0 · a).
By cancellation in the group (R, +, 0), it follows that 0 · a = 0.
2. Let a, b ∈ R be arbitrary. We have
0=a·0 (by 1)
= a · (b + (−b))
= (a · b) + (a · (−b))
Therefore, a · (−b) is the additive inverse of a · b (since + is commutative), which is to say that
a · (−b) = −(a · b). Similarly
0=0·b (by 1)
= (a + (−a)) · b
= (a · b) + ((−a) · b)
so (−a) · b is the additive inverse of a · b, which is to say that (−a) · b = −(a · b).
3. Let a, b ∈ R be arbitrary. Using 2, we have
(−a) · (−b) = −(a · (−b)) = −(−(a · b)) = a · b,
where we have used the group-theoretic fact that the inverse of the inverse is the original element (i.e. Proposition 4.2.6).
4. Suppose that 1 = 0. Let a ∈ R be arbitrary. Using 1, we have
a = a · 1 = a · 0 = 0.
Therefore, every element of R equals 0, so R = {0}.
Definition 9.1.7. A commutative ring is a ring R such that · is commutative, i.e. such that a · b = b · a for
all a, b ∈ R.
Definition. Let R be a ring. A subring of R is a subset S ⊆ R satisfying the following:
• S is an additive subgroup of R (so S contains 0, is closed under +, and is closed under additive inverses).
• 1 ∈ S.
• ab ∈ S whenever a ∈ S and b ∈ S.
For example, consider ℤ[i] = {a + bi : a, b ∈ ℤ} ⊆ ℂ. Notice that 0, 1 ∈ ℤ[i] and that ℤ[i] is an additive subgroup of ℂ. Given x, y ∈ ℤ[i], say x = a + bi and y = c + di where a, b, c, d ∈ ℤ, we have
x + y = (a + bi) + (c + di) = (a + c) + (b + d)i,
so x + y ∈ ℤ[i], and
xy = (a + bi)(c + di) = ac + adi + bci + bdi² = (ac − bd) + (ad + bc)i,
so xy ∈ ℤ[i]. Therefore, ℤ[i] is a subring of ℂ. The ring ℤ[i] is called the ring of Gaussian integers.
In our work on group theory, we made extensive use of subgroups of a given group G to understand G.
In our discussion of rings, the concept of subrings will play a much smaller role. Some rings of independent interest can be seen as subrings of larger rings, such as ℚ[√2] and ℤ[i] above. However, we will not typically
try to understand R by looking at its subrings. Partly, this is due to the fact that we will spend a large
amount of time working with infinite rings (as opposed to the significant amount of time we spent on finite
groups, where Lagrange’s Theorem played a key role). As we will see, our attention will turn toward certain
subsets of a ring called ideals which play a role similar to normal subgroups of a group.
Finally, one small note. Some authors use a slightly different definition of a subring in that they do not
require that 1 ∈ S. They use the idea that a subring of R should be a subset S ⊆ R which forms a ring
with the inherited operations, but the multiplicative identity of S could be different from the multiplicative
identity of R. Again, since subrings will not play a particularly important role for us, we will not dwell on
this distinction.
Definition 9.2.1. Let R be a ring. An element u ∈ R is a unit if it has a multiplicative inverse, i.e. there
exists v ∈ R with uv = 1 = vu. We denote the set of units of R by U (R).
In other words, the units of R are the invertible elements of R under the associative operation · with
identity 1 as in Section 4.3. For example, the units in Z are {±1} and the units in Q are Q\{0} (and similarly
for R and C). The units in Mn (R) are the invertible n × n matrices.
Proposition 9.2.2. Let R be a ring and let u ∈ U (R). There is a unique v ∈ R with uv = 1 and vu = 1.
Proof. Existence follows from the assumption that u is a unit, and uniqueness is immediate from Proposition
4.3.2.
Definition 9.2.3. Let R be a ring and let u ∈ U(R). We let u⁻¹ be the unique multiplicative inverse of u, so uu⁻¹ = 1 and u⁻¹u = 1.
Proposition 9.2.4. Let R be a ring. The set U(R) forms a group under multiplication with identity 1.
This finally explains our notation for the group U (Z/nZ); namely, we are considering Z/nZ as a ring
and forming the corresponding unit group of that ring. Notice that for any n ∈ N+ , we have GLn (R) =
U (Mn (R)).
Let’s recall the multiplication table of Z/6Z:
· 0 1 2 3 4 5
0 0 0 0 0 0 0
1 0 1 2 3 4 5
2 0 2 4 0 2 4
3 0 3 0 3 0 3
4 0 4 2 0 4 2
5 0 5 4 3 2 1
Looking at the table, we see that U (Z/6Z) = {1, 5} as we already know. As remarked when we first saw this
table in Section 4.4, there are some other interesting things. For example, we have 2 · 3 = 0, so it is possible
to have the product of two nonzero elements result in 0. Elements which are part of such a pair are given a
name.
Definition 9.2.5. Let R be a ring. A zero divisor is a nonzero element a ∈ R such that there exists a nonzero b ∈ R such that either ab = 0 or ba = 0 (or both).
In the above case of Z/6Z, we see that the zero divisors are {2, 3, 4}. The concept of unit and zero divisor
are antithetical as we now see.
Proposition 9.2.6. Let R be a ring. No element is both a unit and zero divisor.
Proof. Suppose that R is a ring and that a ∈ R is a unit. If b ∈ R satisfies ab = 0, then
b = 1 · b = (a⁻¹a) · b = a⁻¹ · (ab) = a⁻¹ · 0 = 0.
Similarly, if b ∈ R satisfies ba = 0, then
b = b · 1 = b · (aa⁻¹) = (ba) · a⁻¹ = 0 · a⁻¹ = 0.
Therefore, there is no nonzero b ∈ R with either ab = 0 or ba = 0, so a is not a zero divisor.
Proposition 9.2.7. Let n ∈ ℕ⁺ and let a ∈ {1, 2, ..., n − 1}. If gcd(a, n) = 1, then a is a unit in ℤ/nℤ, while if d = gcd(a, n) > 1, then a is a zero divisor in ℤ/nℤ.

Proof. If gcd(a, n) = 1, then we may fix k, ℓ ∈ ℤ with ak + nℓ = 1, in which case a · k = 1 in ℤ/nℤ, so a is a unit. Suppose instead that d = gcd(a, n) > 1. Let b = n/d, so that b ∈ {1, 2, ..., n − 1} is nonzero in ℤ/nℤ. Fixing c ∈ ℤ with a = dc, we have
ab = (dc)b = (db)c = nc,
so ab = 0 in ℤ/nℤ while a and b are both nonzero. Hence, a is a zero divisor.

Therefore, every nonzero element of ℤ/nℤ is either a unit or a zero divisor. However, this is not true in
every ring. The units in Z are {±1} but there are no zero divisors in Z. We now define three important
classes of rings.
Definition 9.2.8. Let R be a ring with 1 ≠ 0.
• R is a division ring if every nonzero element of R is a unit.
• R is a field if R is a commutative division ring. Thus, a field is a commutative ring with 1 ≠ 0 for which every nonzero element is a unit.
• R is an integral domain if R is a commutative ring with 1 ≠ 0 which has no zero divisors. Equivalently, an integral domain is a commutative ring R with 1 ≠ 0 such that whenever ab = 0, either a = 0 or b = 0.
Proposition 9.2.9. Every field is an integral domain.

Proof. Suppose that R is a field. If a ∈ R is nonzero, then it is a unit, so it is not a zero divisor by Proposition 9.2.6. Hence, R has no zero divisors, and R is an integral domain.
For example, each of Q, R, and C are fields. The ring Z is an example of an integral domain which is
not a field, so the concept of integral domain is strictly weaker. There also exist division rings which are not
fields, such as the Hamiltonian Quaternions as discussed in Section 1.2:
H = {a + bi + cj + dk : a, b, c, d ∈ R}
with
i² = −1    j² = −1    k² = −1
ij = k     jk = i     ki = j
ji = −k    kj = −i    ik = −j
We will return to such objects (and will actually prove that H really is a division ring) later.
Corollary 9.2.10.
• For each prime p ∈ ℕ⁺, the ring ℤ/pℤ is a field (and hence also an integral domain).
• For each composite n ∈ ℕ⁺ with n ≥ 2, the ring ℤ/nℤ is not an integral domain.
Proof. If p ∈ N+ is prime, then every nonzero element of Z/pZ is a unit by Proposition 9.2.7 (because if
a ∈ {1, 2, . . . , p − 1}, then gcd(a, p) = 1 because p is prime). Suppose that n ∈ N+ with n ≥ 2 is composite.
Fix d ∈ N+ with 1 < d < n such that d | n. We then have that gcd(d, n) = d 6= 1 and d 6= 0, so d is a zero
divisor by Proposition 9.2.7. Therefore, Z/nZ is not an integral domain.
Since there are infinitely many primes, this corollary provides us with an infinite supply of finite fields. Here are the addition and multiplication tables of ℤ/5ℤ to get a picture of one of these objects.
+ 0 1 2 3 4 · 0 1 2 3 4
0 0 1 2 3 4 0 0 0 0 0 0
1 1 2 3 4 0 1 0 1 2 3 4
2 2 3 4 0 1 2 0 2 4 1 3
3 3 4 0 1 2 3 0 3 1 4 2
4 4 0 1 2 3 4 0 4 3 2 1
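The dichotomy of Proposition 9.2.7 is easy to observe computationally. The sketch below (not from the notes) classifies the nonzero elements of ℤ/nℤ by brute force and compares the units against the gcd criterion:

    from math import gcd

    def classify(n):
        units, zero_divisors = [], []
        for a in range(1, n):
            if any((a * b) % n == 1 for b in range(n)):
                units.append(a)
            if any((a * b) % n == 0 for b in range(1, n)):
                zero_divisors.append(a)
        return units, zero_divisors

    units, zds = classify(6)
    assert units == [1, 5] and zds == [2, 3, 4]
    for n in [5, 6, 12]:
        units, zds = classify(n)
        assert units == [a for a in range(1, n) if gcd(a, n) == 1]
        # every nonzero element is exactly one of the two
        assert set(units) | set(zds) == set(range(1, n)) and not set(units) & set(zds)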
Another example of a field is the ring ℚ[√2] discussed in the last section. Since ℚ[√2] is a subring of ℝ, it is a commutative ring with 1 ≠ 0. To see that it is a field, we need only check that every nonzero element has an inverse. Suppose that a, b ∈ ℚ are arbitrary with a + b√2 ≠ 0.
• We first claim that a − b√2 ≠ 0. Suppose instead that a − b√2 = 0. We then have a = b√2. Now if b = 0, then we would have a = 0, so a + b√2 = 0, which is a contradiction. Thus, b ≠ 0. Dividing both sides of a = b√2 by b gives √2 = a/b, contradicting the fact that √2 is irrational. Thus, we must have a − b√2 ≠ 0.
Since a − b√2 ≠ 0, we have a² − 2b² = (a + b√2)(a − b√2) ≠ 0, and
1/(a + b√2) = (1/(a + b√2)) · ((a − b√2)/(a − b√2)) = (a − b√2)/(a² − 2b²) = a/(a² − 2b²) + (−b/(a² − 2b²)) · √2.
Since a, b ∈ ℚ, we have both a/(a² − 2b²) ∈ ℚ and −b/(a² − 2b²) ∈ ℚ, so 1/(a + b√2) ∈ ℚ[√2].
Although we will spend a bit of time discussing noncommutative rings, the focus of our study will be
commutative rings and often we will be working with integral domains (and sometimes more specifically
with fields). The next proposition is a fundamental tool when working in integral domains. Notice that it can fail in arbitrary commutative rings. For example, in ℤ/6ℤ we have 3 · 2 = 3 · 4, but 2 ≠ 4.
Proposition 9.2.11. Suppose that R is an integral domain and that ab = ac with a 6= 0. We then have that
b = c.
Proof. Since ab = ac, we have ab − ac = 0. Using the distributive law, we see that a(b − c) = 0 (more
formally, we have ab + (−ac) = 0, so ab + a(−c) = 0, hence a(b + (−c)) = 0, and thus a(b − c) = 0). Since R
is an integral domain, either a = 0 or b − c = 0. Now the former is impossible by assumption, so we conclude
that b − c = 0. Adding c to both sides, we conclude that b = c.
Although Z is an example of integral domain which is not a field, it turns out that all such examples are
infinite.
Proposition 9.2.12. Every finite integral domain is a field.
Proof. Suppose that R is a finite integral domain. Let a ∈ R with a 6= 0. Define λa : R → R by letting
λa (b) = ab. Now if b, c ∈ R with λa (b) = λa (c), then ab = ac, so b = c by Proposition 9.2.11. Therefore,
λa : R → R is injective. Since R is finite, it must be the case that λa is surjective. Thus, there exists b ∈ R
with λa (b) = 1, which is to say that ab = 1. Since R is commutative, we also have ba = 1. Therefore, a is a
unit in R. Since a ∈ R with a 6= 0 was arbitrary, it follows that R is a field.
We can define direct products of rings just as we did for direct products of groups. The proof that the componentwise operations give rise to a ring is straightforward.
Definition 9.2.13. Suppose that R1 , R2 , . . . , Rn are all rings. Consider the Cartesian product
R1 × R2 × · · · × Rn = {(a1 , a2 , . . . , an ) : ai ∈ Ri for 1 ≤ i ≤ n}
where we are using the operations in Ri in the ith components. We then have that R1 × R2 × · · · × Rn with
these operations is a ring which is called the (external) direct product of R1 , R2 , . . . , Rn .
Notice that if R and S are nonzero rings, then R × S is never an integral domain (even if R and S are
both integral domains, or even fields) because (1, 0) · (0, 1) = (0, 0).
9.3 Polynomial Rings

We are familiar with polynomials having real coefficients. In general, we aim to carry over this idea, except that we will allow our coefficients to come from a general ring R. Thus, a typical element should look like:
a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0
where each ai ∈ R. Before we dive into giving a formal definition, we first discuss polynomials a little more
deeply.
In previous math courses, we typically thought of a polynomial with real coefficients as describing a
certain function from R to R resulting from “plugging in for x”. Thus, when we wrote the polynomial
4x2 + 3x − 2, we were probably thinking of it as the function which sends 1 to 4 + 3 − 2 = 5, sends 2 to
16 + 6 − 2 = 20, etc. In contrast, we will consciously avoid defining a polynomial as the resulting function
obtained by "plugging in for x". To see why this distinction matters, consider the case where we are working with the ring R = ℤ/2ℤ and we have the two polynomials 1x² + 1x + 1 and 1. These look like different polynomials on the face of it. However, notice that
1 · 0² + 1 · 0 + 1 = 1
and also
1 · 1² + 1 · 1 + 1 = 1,
so these two distinct polynomials "evaluate" to the same thing whenever we plug in elements of R (namely, they both produce the same function, i.e. the function that outputs 1 for each input). Therefore, the resulting functions are indeed equal as functions despite the fact that the polynomials have distinct forms.
We will enforce this distinction between polynomials and the resulting functions, so we will simply define
our ring in a manner that two different looking polynomials are really distinct elements of our ring. In order
to do this carefully, let’s go back and look at our polynomials. For example, consider the polynomial with
real coefficients given by 5x3 + 9x − 2. Since we are not “plugging in” for x, this polynomial is determined
by its sequence of coefficients. In other words, if we order the sequence from coefficients of smaller powers to
coefficients of larger powers, we can represent this polynomial as the sequence (−2, 9, 0, 5). Performing this
step gets rid of the superfluous x which was really just serving as a placeholder and honestly did not have
any real meaning. We will adopt this perspective and simply define a polynomial to be such a sequence.
However, since polynomials can have arbitrarily large degree, these finite sequences would all be of different lengths. We get around this problem by defining polynomials as infinite sequences of elements of R in which
only finitely many of the terms are nonzero.
Definition 9.3.2. Let R be a ring. We define a new ring denoted R[x] whose elements are the set of all
infinite sequences {an } of elements of R such that {n ∈ N : an 6= 0} is finite. We define two binary operations
on R[x] as follows.
{a_n} + {b_n} = {a_n + b_n}
and
{a_n} · {b_n} = {Σ_{i+j=n} a_i b_j} = {Σ_{k=0}^{n} a_k b_{n−k}}.
We will see below that this makes R[x] into a ring, called the polynomial ring over R.
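To make the formal definition concrete, here is a minimal sketch (not from the notes) that represents a polynomial over ℤ as its list of coefficients [a_0, a_1, ...] and implements the two operations above:

    def poly_add(p, q):
        n = max(len(p), len(q))
        p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
        return [a + b for a, b in zip(p, q)]

    def poly_mul(p, q):
        result = [0] * (len(p) + len(q) - 1)
        for i, a in enumerate(p):
            for j, b in enumerate(q):
                result[i + j] += a * b  # contribute a_i * b_j to the x^(i+j) term
        return result

    # (1 + 2x) * (3 + x^2) = 3 + 6x + x^2 + 2x^3
    assert poly_mul([1, 2], [3, 0, 1]) == [3, 6, 1, 2]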
Let’s pause for a moment to ensure that the above definition makes sense. Suppose that {an } and {bn }
are both infinite sequences of elements of R for which only finitely many nonzero terms. Fix M, N ∈ N such
that an = 0 for all n > M and bn = 0 for all n > N . We then have that an + bn = 0 for all n > max{M, N },
so the infinite sequence {a_n + b_n} only has finitely many nonzero terms. Also, we have
Σ_{i+j=n} a_i b_j = 0
for all n > M + N (because if n > M + N and i + j = n, then either i > M or j > N, so either a_i = 0 or b_j = 0), hence the infinite sequence {Σ_{i+j=n} a_i b_j} only has finitely many nonzero terms. We now check that these operations turn R[x] into a ring.
Theorem 9.3.3. Let R be a ring. The set R[x] with the above operations is a ring with additive identity
the infinite sequence 0, 0, 0, 0, . . . and multiplicative identity the infinite sequence 1, 0, 0, 0, . . . . Furthermore,
if R is commutative, then R[x] is commutative.
Proof. Many of these checks are routine. For example, + is associative on R[x] because for any infinite sequences {a_n}, {b_n}, and {c_n}, we have
({a_n} + {b_n}) + {c_n} = {(a_n + b_n) + c_n} = {a_n + (b_n + c_n)} = {a_n} + ({b_n} + {c_n}),
using the associativity of + in R. The other ring axioms involving addition are completely analogous. However, checking the axioms involving multiplication is more interesting because our multiplication operation is much more complicated than componentwise multiplication. For example, let {e_n} be the infinite sequence 1, 0, 0, 0, ..., i.e.
e_n = 1 if n = 0, and e_n = 0 if n > 0.
For any {a_n} ∈ R[x], we have
{e_n} · {a_n} = {Σ_{k=0}^{n} e_k a_{n−k}} = {e_0 a_n} = {1 · a_n} = {a_n},
where we have used the fact that e_k = 0 whenever k > 0. We also have
{a_n} · {e_n} = {Σ_{k=0}^{n} a_k e_{n−k}} = {a_n e_0} = {a_n · 1} = {a_n},
where we have used the fact that if 0 ≤ k < n, then n − k > 0, so e_{n−k} = 0. Therefore, the infinite sequence 1, 0, 0, 0, ... is indeed a multiplicative identity of R[x].
The most interesting (i.e. difficult) check is that · is associative on R[x]. We have
{a_n} · ({b_n} · {c_n}) = {a_n} · {Σ_{ℓ=0}^{n} b_ℓ c_{n−ℓ}}
                        = {Σ_{k=0}^{n} a_k · (Σ_{ℓ=0}^{n−k} b_ℓ c_{n−k−ℓ})}
                        = {Σ_{k=0}^{n} Σ_{ℓ=0}^{n−k} a_k (b_ℓ c_{n−k−ℓ})}
                        = {Σ_{k+ℓ+m=n} a_k (b_ℓ c_m)}
and also
({a_n} · {b_n}) · {c_n} = {Σ_{k=0}^{n} a_k b_{n−k}} · {c_n}
                        = {Σ_{ℓ=0}^{n} (Σ_{k=0}^{ℓ} a_k b_{ℓ−k}) · c_{n−ℓ}}
                        = {Σ_{ℓ=0}^{n} Σ_{k=0}^{ℓ} (a_k b_{ℓ−k}) c_{n−ℓ}}
                        = {Σ_{k+ℓ+m=n} (a_k b_ℓ) c_m}.
Since multiplication in R is associative, we know that a_k (b_ℓ c_m) = (a_k b_ℓ) c_m for all k, ℓ, m ∈ ℕ, so
{Σ_{k+ℓ+m=n} a_k (b_ℓ c_m)} = {Σ_{k+ℓ+m=n} (a_k b_ℓ) c_m}
and hence
{a_n} · ({b_n} · {c_n}) = ({a_n} · {b_n}) · {c_n}.
One also needs to check the two distributive laws, and also that R[x] is commutative when R is commutative,
but these are notably easier than associativity of multiplication, and are left as an exercise.
The above formal definition of R[x] is precise and useful when trying to prove theorems about polynomials,
but it is terrible to work with intuitively. Thus, when discussing elements of R[x], we will typically use
standard polynomial notation. For example, if we are dealing with Q[x], we will simply write the formal
element
(5, 0, 0, −1/3, 7, 0, 22/7, 0, 0, 0, ...)
as
5 − (1/3)x³ + 7x⁴ + (22/7)x⁶
or
(22/7)x⁶ + 7x⁴ − (1/3)x³ + 5,
and we will treat the x as a meaningless placeholder symbol called an indeterminate. This is where the
x comes from in the notation R[x] (if for some reason we want to use a different indeterminate, say t, we
instead use the notation R[t]). Formally, R[x] will be the set of infinite sequences, but we often use this
more straightforward notation in the future when working with polynomials. When working in R[x] with
this notation, we will typically call elements of R[x] by names like p(x) and q(x) and write something like
“Let p(x) = an xn + an−1 xn−1 + · · · + a1 x + a0 be an element of R[x]”. Since I can’t say this enough, do not
simply view this as saying that p(x) is the resulting function. Keep a clear distinction in your mind between
an element of R[x] and the function it represents via “evaluation” that we discuss in Definition 10.1.9 below!
It is also possible to connect up the formal definition of R[x] (as infinite sequences of elements of R with
finitely many nonzero terms) and the more gentle standard notation of polynomials as follows. Every element
a ∈ R can be naturally associated with the sequence (a, 0, 0, 0, ...) in R[x]. If we simply define x to be the sequence (0, 1, 0, 0, 0, ...), then working in the ring R[x] it is not difficult to check that x² = (0, 0, 1, 0, 0, ...), that x³ = (0, 0, 0, 1, 0, ...), etc. With these identifications, if we interpret the additions and multiplications implicit in the polynomial
(22/7)x⁶ + 7x⁴ − (1/3)x³ + 5
as their formal counterparts defined above, then everything matches up as we would expect.
Definition 9.3.4. Let R be a ring. Given a nonzero element {a_n} ∈ R[x], we define the degree of {a_n} to be max{n ∈ ℕ : a_n ≠ 0}. In the more relaxed notation, given a nonzero polynomial p(x) ∈ R[x], say
p(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0
with a_n ≠ 0, we define the degree of p(x) to be n. We write deg(p(x)) for the degree of p(x). Notice that we do not define the degree of the zero polynomial.
The next proposition gives some relationship between the degrees of polynomials and the degrees of their
sum/product.
Proposition 9.3.5. Let R be a ring and let p(x), q(x) ∈ R[x] be nonzero.
1. Either p(x) + q(x) = 0 or deg(p(x) + q(x)) ≤ max{deg(p(x)), deg(q(x))}.
2. If deg(p(x)) 6= deg(q(x)), then p(x) + q(x) 6= 0 and deg(p(x) + q(x)) = max{deg(p(x)), deg(q(x))}.
3. Either p(x) · q(x) = 0 or deg(p(x)q(x)) ≤ deg(p(x)) + deg(q(x)).
Proof. We give an argument using our formal definitions. Let p(x) be the sequence {a_n} and let q(x) be the sequence {b_n}. Let M = deg(p(x)), so that a_M ≠ 0 and a_n = 0 for all n > M. Let N = deg(q(x)), so that b_N ≠ 0 and b_n = 0 for all n > N.
1. Suppose that p(x) + q(x) 6= 0. For any n > max{M, N }, we have an + bn = 0 + 0 = 0, so deg(p(x) +
q(x)) ≤ max{M, N }.
2. Suppose that M 6= N . Suppose first that M > N . We then have aM + bM = aM + 0 = aM 6= 0, so
p(x) + q(x) 6= 0 and deg(p(x) + q(x)) ≥ max{deg(p(x)), deg(q(x))}. Combining this with the inequality
in part 1, it follows that deg(p(x) + q(x)) = max{deg(p(x)), deg(q(x))}. A similar argument works if
N > M.
3. Suppose that p(x)q(x) ≠ 0. Let n > M + N and consider the sum
Σ_{k=0}^{n} a_k b_{n−k}.
For each k, either k > M (so a_k = 0) or n − k > N (so b_{n−k} = 0), hence every term of this sum is 0. Therefore, deg(p(x)q(x)) ≤ M + N.
For example, working in ℤ/6ℤ[x], we have
(2x² + 1) · (3x + 1) = 6x³ + 2x² + 3x + 1 = 2x² + 3x + 1,
so the product of a degree 2 polynomial and a degree 1 polynomial results in a degree 2 polynomial. It follows that we can indeed have a strict inequality in the latter case. In fact, we can have the product of two nonzero polynomials result in the zero polynomial, i.e. there may exist zero divisors in R[x]. For example, working in ℤ/6ℤ[x] again, we have
(4x + 2) · 3x² = 12x³ + 6x² = 0.
Fortunately, for well-behaved rings, the degree of the product always equals the sum of the degrees.
Proposition 9.3.6. Let R be an integral domain. If p(x), q(x) ∈ R[x] are both nonzero, then p(x)q(x) 6= 0
and deg(p(x)q(x)) = deg(p(x)) + deg(q(x)).
Proof. Let p(x) be the sequence {a_n} and let q(x) be the sequence {b_n}. Let M = deg(p(x)), so that a_M ≠ 0 and a_n = 0 for all n > M. Let N = deg(q(x)), so that b_N ≠ 0 and b_n = 0 for all n > N. Now consider
Σ_{k=0}^{M+N} a_k b_{M+N−k}.
If k < M, then M + N − k > N, so b_{M+N−k} = 0, while if k > M, then a_k = 0. Thus, the only possibly nonzero term in this sum is a_M b_N, and a_M b_N ≠ 0 because a_M ≠ 0, b_N ≠ 0, and R is an integral domain. Therefore, the coefficient of x^{M+N} in p(x)q(x) is nonzero, so p(x)q(x) ≠ 0 and (using part 3 of Proposition 9.3.5 for the other inequality) deg(p(x)q(x)) = M + N.

Corollary 9.3.7. If R is an integral domain, then R[x] is an integral domain.

Proposition 9.3.8. Let R be an integral domain. The units of R[x] are exactly the constant polynomials given by units of R, i.e. U(R[x]) = U(R).

Proof. If a ∈ U(R), then the constant polynomials a and a⁻¹ multiply to 1 in R[x], so a ∈ U(R[x]). Conversely, suppose that p(x) ∈ U(R[x]), and fix q(x) ∈ R[x] with p(x)q(x) = 1. By Proposition 9.3.6, we have deg(p(x)) + deg(q(x)) = deg(1) = 0. Since deg(p(x)), deg(q(x)) ∈ ℕ, it follows that deg(p(x)) = 0 = deg(q(x)). Therefore, p(x) and q(x) are both nonzero constant polynomials, say p(x) = a and q(x) = b. We then have ab = 1 and ba = 1 in R[x], so these equations are true in R as well. It follows that a ∈ U(R).
This proposition can be false when R is only a commutative ring (see the homework).
Corollary 9.3.9. Let F be a field. The units in F [x] are precisely the nonzero constant polynomials. In other
words, if we identify an element of F with the corresponding constant polynomial, then U (F [x]) = F \{0}.
Proof. Immediate from Proposition 9.3.8 and the fact that U (F ) = F \{0}.
Recall Theorem 2.3.1, which said that if a, b ∈ ℤ and b ≠ 0, then there exist q, r ∈ ℤ with
a = qb + r
and 0 ≤ r < |b|. Furthermore, the q and r are unique. Intuitively, if b ≠ 0, then we can always divide by
b and obtain a “smaller” remainder. This simple result was fundamental in proving results about greatest
common divisors (such as the fact that they exist!). If we hope to generalize this to other rings, we need a
way to interpret “smaller” in new ways. Fortunately, for polynomial rings, we can use the notion of degrees.
We’re already used to this in R[x] from polynomial long division. However, in R[x] for a general ring R,
things may not work out so nicely. The process of polynomial long division involves dividing by the leading
coefficient of the divisor, so we need to assume that it is a unit.
Theorem 9.3.10. Let R be a ring, and let f(x), g(x) ∈ R[x] with g(x) ≠ 0. Write
g(x) = b_m x^m + b_{m−1} x^{m−1} + · · · + b_1 x + b_0,
where b_m ≠ 0.
1. If bm ∈ U (R), then there exist q(x), r(x) ∈ R[x] with f (x) = q(x) · g(x) + r(x) and either r(x) = 0 or
deg(r(x)) < deg(g(x)).
2. If R is an integral domain, then there exists at most one pair q(x), r(x) ∈ R[x] with f(x) = q(x) · g(x) + r(x) and either r(x) = 0 or deg(r(x)) < deg(g(x)).

Proof. 1. Suppose that b_m ∈ U(R). We prove the existence of q(x) and r(x) for all f(x) ∈ R[x] by induction on deg(f(x)). We begin by handling some simple cases that will serve as base cases for our induction. Notice first that if f(x) = 0, then we may take q(x) = 0 and r(x) = 0 because
f(x) = 0 · g(x) + 0.
Also, if f(x) ≠ 0 but deg(f(x)) < deg(g(x)), then we may take q(x) = 0 and r(x) = f(x) because
f(x) = 0 · g(x) + f(x).
We handle all other polynomials f(x) using induction on deg(f(x)). Suppose then that deg(f(x)) ≥ deg(g(x)) and that we know the existence result is true for all p(x) ∈ R[x] with either p(x) = 0 or deg(p(x)) < deg(f(x)). Write
f(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0,
where a_n ≠ 0. Since we are assuming that deg(f(x)) ≥ deg(g(x)), we have n ≥ m. Consider the polynomial
p(x) = f(x) − a_n b_m^{−1} x^{n−m} · g(x).
The coefficient of x^n in a_n b_m^{−1} x^{n−m} · g(x) is a_n b_m^{−1} b_m = a_n, so the x^n terms cancel in the subtraction. Therefore, either p(x) = 0 or deg(p(x)) < n = deg(f(x)). By induction, we may fix q*(x), r*(x) ∈ R[x] with
p(x) = q*(x) · g(x) + r*(x),
where either r*(x) = 0 or deg(r*(x)) < deg(g(x)). We then have
f(x) − a_n b_m^{−1} x^{n−m} · g(x) = q*(x) · g(x) + r*(x),
hence
f(x) = a_n b_m^{−1} x^{n−m} · g(x) + q*(x) · g(x) + r*(x)
     = (a_n b_m^{−1} x^{n−m} + q*(x)) · g(x) + r*(x),
which completes the existence proof.

2. Suppose that R is an integral domain, and suppose that
f(x) = q_1(x) · g(x) + r_1(x) and f(x) = q_2(x) · g(x) + r_2(x),
where q_i(x), r_i(x) ∈ R[x] and either r_i(x) = 0 or deg(r_i(x)) < deg(g(x)) for each i. We then have
q_1(x) · g(x) + r_1(x) = q_2(x) · g(x) + r_2(x),
hence
(q_1(x) − q_2(x)) · g(x) = r_2(x) − r_1(x).
Suppose that r_2(x) − r_1(x) ≠ 0. Since R is an integral domain, we know that R[x] is an integral domain by Corollary 9.3.7. Thus, we must have q_1(x) − q_2(x) ≠ 0 and g(x) ≠ 0, and also
deg(r_2(x) − r_1(x)) = deg(q_1(x) − q_2(x)) + deg(g(x)) ≥ deg(g(x))
by Proposition 9.3.6. However, this is a contradiction because deg(r_2(x) − r_1(x)) < deg(g(x)), since for each i, we have either r_i(x) = 0 or deg(r_i(x)) < deg(g(x)). We conclude that we must have r_2(x) − r_1(x) = 0 and thus r_1(x) = r_2(x). Canceling this common term from
q_1(x) · g(x) + r_1(x) = q_2(x) · g(x) + r_2(x),
we conclude that
q1 (x) · g(x) = q2 (x) · g(x)
Since g(x) 6= 0 and R[x] is an integral domain, it follows that q1 (x) = q2 (x) as well.
Corollary 9.3.11. Let F be a field, and let f (x), g(x) ∈ F [x] with g(x) 6= 0. There exist unique q(x), r(x) ∈
F [x] with f (x) = q(x) · g(x) + r(x) and either r(x) = 0 or deg(r(x)) < deg(g(x)).
Proof. Immediate from the previous theorem together with the fact that fields are integral domains, and
U (F ) = F \{0}.
Let's compute an example when F = ℤ/7ℤ. Working in ℤ/7ℤ[x], let f(x) be a polynomial with leading term 3x⁴, and let
g(x) = 2x² + 5x + 1.
We perform long division, i.e. follow the proof, to find q(x) and r(x). Notice that the leading coefficient of g(x) is 2 and that in ℤ/7ℤ we have 2⁻¹ = 4. We begin by computing
3 · 4 · x^{4−2} = 5x².
This will be the first term in our resulting quotient. We then multiply this by g(x) and subtract from f (x)
to obtain
We now continue on with this new polynomial (this is where we appealed to induction in the proof) as our
“new” f (x). We follow the proof recursively and compute
2 · 4 · x = 1x
This will be our next term in the quotient. We now subtract 1x · g(x) from our current polynomial to obtain
We have arrived at a point where our polynomial has degree less than that of g(x), so we have bottomed out in the above proof at a base case. This final polynomial is the remainder r(x), and adding up our contributions to the quotient gives
q(x) = 5x² + 1x + 2.
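The division procedure in the proof of Theorem 9.3.10 translates directly into code. In the sketch below, the polynomial f(x) is a made-up example chosen to be consistent with the quotient computed above (the original f(x) did not survive in this text); g(x) = 2x² + 5x + 1 is as above:

    def poly_divmod_mod_p(f, g, p):
        """Divide f by g in (Z/pZ)[x]; coefficients are lists [a_0, a_1, ...]."""
        f = [a % p for a in f]
        m = len(g) - 1
        inv = pow(g[-1], -1, p)       # b_m^{-1}; the leading coefficient must be a unit
        q = [0] * max(len(f) - m, 1)
        while len(f) > m:
            if f[-1] == 0:            # strip a vanished leading term
                f.pop()
                continue
            n = len(f) - 1            # current degree
            c = (f[-1] * inv) % p     # next quotient term: a_n * b_m^{-1} * x^(n-m)
            q[n - m] = c
            for i in range(len(g)):   # subtract c * x^(n-m) * g(x)
                f[n - m + i] = (f[n - m + i] - c * g[i]) % p
            f.pop()                   # the leading term now cancels
        return q, f                   # quotient and remainder

    f = [6, 0, 0, 6, 3]               # hypothetical f(x) = 3x^4 + 6x^3 + 6
    g = [1, 5, 2]                     # g(x) = 2x^2 + 5x + 1
    q, r = poly_divmod_mod_p(f, g, 7)
    assert q == [2, 1, 5]             # q(x) = 5x^2 + 1x + 2, as computed above
    assert r == [4, 3]                # the remainder for this particular f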
Notice that if R is not a field, then such q(x) and r(x) may not exist, even if R is a nice ring like Z. For
example, there are no q(x), r(x) ∈ Z[x] with
x2 = q(x) · 2x + r(x)
and either r(x) = 0 or deg(r(x)) < deg(2x) = 1. To see this, first notice that r(x) would have to be a constant polynomial, say r(x) = c. Comparing degrees, q(x) would then have to have degree 1, say q(x) = a_1 x + a_0, in which case
x² = 2a_1 x² + 2a_0 x + c.
It follows that 2a_1 = 1, which is a contradiction because 2 ∤ 1 in ℤ. Thus, no such q(x) and r(x) exist in this case.
Furthermore, if R is not an integral domain, we may not have uniqueness. For example, in Z/6Z with
f (x) = 4x2 + 1 and g(x) = 2x, we have
4x2 + 1 = 2x · 2x + 1
4x2 + 1 = 5x · 2x + 1
4x2 + 1 = (5x + 3) · 2x + 1,
so we can obtain several different quotients. We can even have different quotients and remainders. If we
stay in Z/6Z[x] but use f (x) = 4x2 + 2x and g(x) = 2x + 1, then we have
4x2 + 2x = 2x · (2x + 1) + 0
4x2 + 2x = (2x + 3) · (2x + 1) + 3.
Power Series Rings

What happens if we drop the requirement that only finitely many terms of our sequences are nonzero? Intuitively, the resulting objects are "formal power series" of the form
a_0 + a_1 x + a_2 x² + a_3 x³ + · · ·
The nice thing about defining our objects as infinite sequences is that there is no confusion at all about
“plugging in for x”, because there is no x. Thus, there are no issues about convergence or whether this is a
well-defined function at all. We define our + and · on these infinite sequences exactly as in the polynomial
ring case, and the proofs of the ring axioms follows word for word (actually, the proof is a bit easier because
we don’t have to worry about the resulting sequences having only a finite number of nonzero terms).
Definition 9.4.1. Let R be a ring. Let R[[x]] be the set of all infinite sequences {an } where each an ∈ R.
We define two binary operations on R[[x]] as follows:
{a_n} + {b_n} = {a_n + b_n}
and
{a_n} · {b_n} = {Σ_{k=0}^{n} a_k b_{n−k}}.

Theorem 9.4.2. Let R be a ring. The above operations make R[[x]] into a ring with additive identity the
infinite sequence 0, 0, 0, 0, . . . and multiplicative identity the infinite sequence 1, 0, 0, 0, . . . . Furthermore, if
R is commutative, then R[[x]] is commutative. We call R[[x]] the ring of formal power series over R, or
simply the power series ring over R.
If you have worked with generating functions in combinatorics as strictly combinatorial objects (i.e. you
did not worry about values of x where convergence made sense, or work with the resulting function on the restricted domain), then you were in fact working in the ring ℝ[[x]] (or perhaps the larger ℂ[[x]]). From another perspective, if you have worked with infinite series, then you know that
1/(1 − x) = 1 + x + x² + x³ + · · ·
for all real numbers x with |x| < 1. You have probably used this fact when you worked with generating
functions as well, even if you weren’t thinking about these as functions. If you are not thinking about
the above equality in terms of functions, and thus ignored issues about convergence on the right, then
what you are doing is that you are working in the ring R[[x]] and saying that 1 − x is a unit with inverse
1 + x + x² + x³ + · · ·. We now verify this fact. In fact, for any ring R, we have
(1 − x) · (1 + x + x² + x³ + · · ·) = 1 = (1 + x + x² + x³ + · · ·) · (1 − x)
in R[[x]]. To see this, one can simply multiply out the left-hand sides naively and notice that the coefficient of x^k equals 0 for all k ≥ 1, while the constant term is 1. For example, we have
(1 − x) · (1 + x + x² + x³ + · · ·) = 1 + x + x² + x³ + · · · − x − x² − x³ − · · · = 1,
and similarly for the other order. More formally, we can argue this as follows. Let R be a ring. Let {a_n} be
the infinite sequence defined by
a_n = 1 if n = 0, a_n = −1 if n = 1, and a_n = 0 otherwise,
and notice that {a_n} is the formal version of 1 − x. Let {b_n} be the infinite sequence defined by b_n = 1 for all n ∈ ℕ, and notice that {b_n} is the formal version of 1 + x + x² + · · ·. Finally, let {e_n} be the infinite sequence defined by
e_n = 1 if n = 0, and e_n = 0 otherwise,
and notice that {e_n} is the formal version of 1. Now a_0 · b_0 = 1 · 1 = 1 = e_0, and for any n ∈ ℕ⁺, we have
Σ_{k=0}^{n} a_k b_{n−k} = a_0 b_n + a_1 b_{n−1}   (since a_k = 0 if k ≥ 2)
                        = 1 · 1 + (−1) · 1 = 1 − 1 = 0 = e_n.
Therefore, {a_n} · {b_n} = {e_n}. The proof that {b_n} · {a_n} = {e_n} is similar.
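Since a computer can only store finitely many coefficients, we can check this identity up to any fixed number of terms. A minimal sketch, not from the notes:

    N = 20                         # number of coefficients to check
    a = [1, -1] + [0] * (N - 2)    # the formal version of 1 - x
    b = [1] * N                    # the formal version of 1 + x + x^2 + ...

    product = [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(N)]
    assert product == [1] + [0] * (N - 1)   # the formal version of 1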
Matrix Rings
One of our fundamental examples of a ring is the ring Mₙ(ℝ) of all n × n matrices with entries in ℝ. It turns out that we can generalize this construction to Mₙ(R) for any ring R.
Definition 9.4.3. Let R be a ring and let n ∈ ℕ⁺. We let Mₙ(R) be the ring of all n × n matrices with entries in R with the following operations. Writing an element of Mₙ(R) as [a_{i,j}], we define
[a_{i,j}] + [b_{i,j}] = [a_{i,j} + b_{i,j}]   and   [a_{i,j}] · [b_{i,j}] = [Σ_{k=1}^{n} a_{i,k} b_{k,j}].
With these operations, Mₙ(R) is a ring with additive identity the matrix of all zeros and multiplicative identity the matrix [e_{i,j}] with all zeros except for ones on the diagonal, i.e.
e_{i,j} = 1 if i = j, and e_{i,j} = 0 otherwise.
The verification of the ring axioms on Mₙ(R) is mostly straightforward (we know the necessary results in the case R = ℝ from linear algebra). The hardest check once again is that · is associative. We formally carry that out now. Given matrices [a_{i,j}], [b_{i,j}], and [c_{i,j}] in Mₙ(R), we have
[a_{i,j}] · ([b_{i,j}] · [c_{i,j}]) = [a_{i,j}] · [Σ_{ℓ=1}^{n} b_{i,ℓ} c_{ℓ,j}]
                                   = [Σ_{k=1}^{n} a_{i,k} · (Σ_{ℓ=1}^{n} b_{k,ℓ} c_{ℓ,j})]
                                   = [Σ_{k=1}^{n} Σ_{ℓ=1}^{n} a_{i,k} (b_{k,ℓ} c_{ℓ,j})]
                                   = [Σ_{k=1}^{n} Σ_{ℓ=1}^{n} (a_{i,k} b_{k,ℓ}) c_{ℓ,j}]
                                   = [Σ_{ℓ=1}^{n} Σ_{k=1}^{n} (a_{i,k} b_{k,ℓ}) c_{ℓ,j}]
                                   = [Σ_{ℓ=1}^{n} (Σ_{k=1}^{n} a_{i,k} b_{k,ℓ}) · c_{ℓ,j}]
                                   = [Σ_{k=1}^{n} a_{i,k} b_{k,j}] · [c_{i,j}]
                                   = ([a_{i,j}] · [b_{i,j}]) · [c_{i,j}].
One nice thing about this more general construction of matrix rings is that it provides us with a decent
supply of noncommutative rings.
Proposition 9.4.4. Let R be a ring with 1 6= 0. For each n ≥ 2, the ring Mn (R) is noncommutative.
Proof. We claim that the matrix of all zeros except for a 1 in the (1, 2) position does not commute with the matrix of all zeros except for a 1 in the (2, 1) position. To see this, let A = [a_{i,j}] be the matrix where a_{1,2} = 1 and a_{i,j} = 0 otherwise, and let B = [b_{i,j}] be the matrix where b_{2,1} = 1 and b_{i,j} = 0 otherwise. The (1, 1) entry of A · B is Σ_k a_{1,k} b_{k,1} = a_{1,2} b_{2,1} = 1, while the (1, 1) entry of B · A is Σ_k b_{1,k} a_{k,1} = 0. Since 1 ≠ 0 in R, we conclude that A · B ≠ B · A, so Mₙ(R) is not commutative.
In particular, this construction gives us examples of finite noncommutative rings. For example, the ring M₂(ℤ/2ℤ) is a noncommutative ring with 2⁴ = 16 elements.
Definition 9.4.5. Let X be a set and let R be a ring. Consider the set F of all functions f : X → R. We
define + and · on F to be pointwise addition and multiplication, i.e. given f, g ∈ F, we define f + g : X → R to be the function such that
(f + g)(x) = f(x) + g(x)
for all x ∈ X, and f · g : X → R to be the function such that
(f · g)(x) = f(x) · g(x)
for all x ∈ X. With these operations, F is a ring with additive identity equal to the constant function f(x) = 0
and multiplicative identity equal to the constant function f (x) = 1. Furthermore, if R is commutative, then
F is commutative.
Notice that if R is a ring and n ∈ ℕ, then the direct product Rⁿ = R × R × · · · × R can be viewed as a special case of this construction where X = {1, 2, ..., n}. To see this, notice that a function f : {1, 2, ..., n} → R naturally corresponds to the n-tuple (a_1, a_2, ..., a_n) where a_i = f(i). Since the operations in both F and Rⁿ are pointwise, the operations of + and · correspond as well. Although these
two descriptions really are just different ways of defining the same object, we can formally verify that these
rings are indeed isomorphic once we define ring isomorphisms in the next section.
Now if R is a ring and we let X = N, then elements of F correspond to infinite sequences {an } of
elements of R. Although this is same underlying set as R[[x]] and both F and R[[x]] have the same addition
operation (namely {an } + {bn } = {an + bn }), the multiplication operations are different. In F, we have
pointwise multiplication, so {an } · {bn } = {an · bn }, while the operation in R[[x]] is more complicated (and
more interesting!).
We obtain other fundamental and fascinating rings by taking subrings of these examples. For instance,
let F be our original example of the set of all functions f : R → R (so we are taking R = R and X = R). Let
C ⊆ F be the subset of all continuous functions from R to R (i.e. continuous at every point). By results from
calculus and/or analysis, the sum and product of continuous functions is continuous, as are the constant
functions 0 and 1, along with −f for every continuous function f . It follows that C is a subring of F.
One approach to defining the two-variable polynomial ring R[x, y] is to use functions from ℕ × ℕ to R (with only finitely many nonzero values), where multiplication is given by a two-dimensional analogue of the convolution above. For example, under this approach we have
(f · g)((2, 1)) = f((0, 0)) · g((2, 1)) + f((0, 1)) · g((2, 0)) + f((1, 0)) · g((1, 1)) + f((1, 1)) · g((1, 0)) + f((2, 0)) · g((0, 1)) + f((2, 1)) · g((0, 0)).
With such a definition, it is possible (but tedious) to check that we obtain a ring.
An alternate approach is to define R[x, y] to simply be the ring (R[x])[y] obtained by first taking R[x],
and then forming a new polynomial ring from it. From this perspective, one can view the above polynomial
as
(7x5 − 4x3 + 11x) · y 2 + (−16x − 3) · y + (5x2 + 42)
We could also define it as (R[y])[x], in which case we view the above polynomial as
7y² · x⁵ − 4y² · x³ + 5 · x² + (11y² − 16y) · x + (42 − 3y).
This approach is particularly nice because we do not need to recheck the ring axioms, as they follow from our results on polynomial rings (in one variable). The downside is that working with polynomials in this way, rather than as sums of monomials like the original polynomial above, is slightly less natural. Nonetheless, it is
possible to prove that all three definitions result in naturally isomorphic rings. We will return to these ideas
later, along with working in polynomial rings of more than two variables.
Chapter 10

Ideals and Ring Homomorphisms

10.1 Ideals, Quotients, and Homomorphisms

Let R be a ring, and let I be an additive subgroup of R. Given r ∈ R, recall the additive coset
r + I = {r + a : a ∈ I}.
These cosets are the equivalence classes of the equivalence relation ∼I on R defined by r ∼I s if there exists
a ∈ I with r + a = s. Since (R, +, 0) is abelian, we know that the additive subgroup I of R is normal in R,
so we can take the quotient R/I as additive groups. In this quotient, we know from our general theory of
quotient groups that addition is well-defined by:
(r + I) + (s + I) = (r + s) + I.
Now if we want to turn the resulting quotient into a ring (rather than just an abelian group), we would
certainly require that multiplication of cosets is well-defined as well. In other words, we would need to know
that if r, s, t, u ∈ R with r ∼I t and s ∼I u, then rs ∼I tu. A first guess might be that we should require that I is closed under multiplication as well. Before jumping to conclusions, let's work out whether
this happens for free, or if it looks grim, then what additional conditions on I we might want to require.
Suppose then that r, s, t, u ∈ R with r ∼I t and s ∼I u. Fix a, b ∈ I with r + a = t and s + b = u. We
then have
tu = (r + a)(s + b)
= r(s + b) + a(s + b)
= rs + rb + as + ab
Now in order for rs ∼I tu, we would want rb + as + ab ∈ I. To ensure this, it would suffice to know that
rb ∈ I, as ∈ I, and ab ∈ I because we are assuming that I is an additive subgroup. The last of these, that
ab ∈ I, would follow if we include the additional assumption that I is closed under multiplication as we
guessed above. However, a glance at the other two suggests that we might need to require more. These other
summands suggest that we want I to be closed under “super multiplication”, i.e. that if we take an element
of I and multiply it by any element of R on either side, then we stay in I. If we have these conditions on I
(which at this point looks like an awful lot to ask), then everything should work out fine. We give a special
name to subsets of R that have this property.
Definition 10.1.1. Let R be a ring. An ideal of R is a subset I ⊆ R with the following properties:
• 0 ∈ I.
• a + b ∈ I whenever a ∈ I and b ∈ I.
• −a ∈ I whenever a ∈ I.
• ra ∈ I whenever r ∈ R and a ∈ I.
• ar ∈ I whenever r ∈ R and a ∈ I.
Notice that the first three properties simply say that I is an additive subgroup of R.
For example, consider the ring R = Z. Suppose n ∈ N and let I = nZ = {nk : k ∈ Z}. We then have
that I is an ideal of R. To see this, first notice that we already know that nZ is a subgroup of Z. Now for
any m ∈ ℤ and k ∈ ℤ, we have
m · (nk) = n · (mk) ∈ nℤ and (nk) · m = n · (km) ∈ nℤ.
Therefore, I = nℤ does satisfy the additional conditions necessary, so I = nℤ is an ideal of R. Before going
further, let’s note one small simplification in the definition of an ideal.
Proposition 10.1.2. Let R be a ring. Suppose that I ⊆ R satisfies the following:
• 0 ∈ I.
• a + b ∈ I whenever a ∈ I and b ∈ I.
• ra ∈ I whenever r ∈ R and a ∈ I.
• ar ∈ I whenever r ∈ R and a ∈ I.
We then have that I is an ideal of R.

Proof. Notice that the only condition that is missing is that I is closed under additive inverses. For any a ∈ I, we have −a = (−1) · a, so −a ∈ I by the third condition (notice here that we are using the fact that all our rings have a multiplicative identity).

Corollary 10.1.3. Let R be a commutative ring. Suppose that I ⊆ R satisfies the following:
• 0 ∈ I.
• a + b ∈ I whenever a ∈ I and b ∈ I.
• ra ∈ I whenever r ∈ R and a ∈ I.
We then have that I is an ideal of R.

Proof. Use the previous proposition together with the fact that if r ∈ R and a ∈ I, then ar = ra ∈ I because
R is commutative.
For another example of an ideal, consider the ring R = ℤ[x]. Let I be the set of all polynomials with 0
constant term. Formally, we are letting I be the set of infinite sequences {an } with a0 = 0. Let’s prove that
I is an ideal using a more informal approach to the polynomial ring (make sure you know how to translate
everything we are saying into formal terms). Notice that the zero polynomial is trivially in I and that I is
closed under addition because the constant term of the sum of two polynomials is the sum of their constant
terms. Finally, if f(x) ∈ R and p(x) ∈ I, then we can write
f(x) = a_m x^m + a_{m−1} x^{m−1} + · · · + a_1 x + a_0
and
p(x) = b_n x^n + b_{n−1} x^{n−1} + · · · + b_1 x.
Multiplying out the polynomials, we see that the constant term of f(x)p(x) is a_0 · 0 = 0 and similarly the constant term of p(x)f(x) is 0 · a_0 = 0. Therefore, we have both f(x)p(x) ∈ I and p(x)f(x) ∈ I. It follows
that I is an ideal of R.
From the above discussion, it appears that if I is an ideal of R, then it makes sense to take the quotient
of R by I and have both addition and multiplication of cosets be well-defined. However, our definition was motivated by finding conditions that would make checking that multiplication is well-defined easy. As in our definition of a normal subgroup, it is a pleasant surprise that the implication can be reversed, so our conditions are precisely what is needed for the quotient to make sense.
Proposition 10.1.4. Let R be a ring and let I ⊆ R be an additive subgroup of (R, +, 0). The following are equivalent:
1. I is an ideal of R.
2. Whenever r, s, t, u ∈ R with r ∼I t and s ∼I u, we have rs ∼I tu.

Proof. We first prove 1 → 2. Suppose that I is an ideal of R. Let r, s, t, u ∈ R with r ∼I t and s ∼I u, and fix a, b ∈ I with r + a = t and s + b = u. We then have
tu = (r + a)(s + b)
= r(s + b) + a(s + b)
= rs + rb + as + ab.
Since I is an ideal of R and b ∈ I, we know that both rb ∈ I and ab ∈ I. Similarly, since I is an ideal of
R and a ∈ I, we know that as ∈ I. Now I is an ideal of R, so it is an additive subgroup of R, and hence
rb + as + ab ∈ I. Since
tu = rs + (rb + as + ab)
we conclude that rs ∼I tu.
We now prove 2 → 1. We are assuming that I is an additive subgroup of R and condition 2. Let r ∈ R
and let a ∈ I be arbitrary. Now r ∼I r because r + 0 = r and 0 ∈ I. Also, we have a ∼I 0 because
a + (−a) = 0 and −a ∈ I (as a ∈ I and I is an additive subgroup).
• Since r ∼I r and a ∼I 0, we may use condition 2 to conclude that ra ∼I r0, which is to say that
ra ∼I 0. Thus, we may fix b ∈ I with ra + b = 0. Since b ∈ I and I is an additive subgroup, it follows
that ra = −b ∈ I.
• Since a ∼I 0 and r ∼I r, we may use condition 2 to conclude that ar ∼I 0r, which is to say that
ar ∼I 0. Thus, we may fix b ∈ I with ar + b = 0. Since b ∈ I and I is an additive subgroup, it follows
that ar = −b ∈ I.
172 CHAPTER 10. IDEALS AND RING HOMOMORPHISMS
Therefore, we have both ra ∈ I and ar ∈ I. Since r ∈ R and a ∈ I were arbitrary, we conclude that I is an
ideal of R.
We are now ready to formally define quotient rings.
Definition 10.1.5. Let R be a ring and let I be an ideal of R. Let R/I be the set of additive cosets of I in
R, i.e. the set of equivalence classes of R under ∼I. Define operations on R/I by letting
(a + I) + (b + I) = (a + b) + I          (a + I) · (b + I) = ab + I
With these operations, the set R/I becomes a ring with additive identity 0 + I and multiplicative identity
1 + I (we did the hard part of checking that the operations are well-defined, and from here the ring axioms
follow because the ring axioms hold in R). Furthermore, if R is commutative, then R/I is also commutative.
Let n ∈ N+ . As discussed above, the set nZ = {nk : k ∈ Z} is an ideal of Z. Our familiar ring Z/nZ
defined in the first section is precisely the quotient of Z by this ideal nZ, hence the notation again.
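To make the coset arithmetic concrete, here is a small Python sketch of ours (not part of the formal development) modeling Z/6Z by least nonnegative representatives:

    n = 6
    def coset(a):
        return a % n              # least nonnegative representative of a + nZ
    def add(a, b):
        return coset(a + b)       # (a + nZ) + (b + nZ) = (a + b) + nZ
    def mul(a, b):
        return coset(a * b)       # (a + nZ) . (b + nZ) = ab + nZ
    print(add(4, 5), mul(4, 5))   # 3 2, since 9 + 6Z = 3 + 6Z and 20 + 6Z = 2 + 6Z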
As we have seen, ideals of a ring correspond to normal subgroups of a group in that they are the “special”
subsets for which it makes sense to take a quotient. There is one small but important note to be made here.
Of course, every normal subgroup of a group G is a subgroup of G. However, it is not true that every ideal
of a ring R is a subring of R. The reason is that for I to be an ideal of R, it is not required that 1 ∈ I. In
fact, if I is an ideal of R and 1 ∈ I, then r = r · 1 ∈ I for all r ∈ R, so I = R. Since every subring S of R
must satisfy 1 ∈ S, it follows that the only ideal of R which is a subring of R is the whole ring itself. This
might seem to be quite a nuisance, but as mentioned above, we will pay very little attention to subrings of
a given ring, and the vast majority of our focus will be on ideals.
We end our discussion of ideals in general rings with a simple characterization of when two elements
of a ring R represent the same coset.
Proposition 10.1.6. Let R be a ring and let I be an ideal of R. Let r, s ∈ R. The following are equivalent.
1. r + I = s + I
2. r ∼I s
3. r − s ∈ I
4. s − r ∈ I
Proof. Notice that 1 ↔ 2 follows from our general theory of equivalence relations, because r + I is simply
the equivalence class of r under the relation ∼I.
2 → 3: Suppose that r ∼I s, and fix a ∈ I with r + a = s. Subtracting s and a from both sides, it follows
that r − s = −a (you should work through the details of this if you are nervous). Now a ∈ I and I is an
additive subgroup of R, so −a ∈ I. It follows that r − s ∈ I.
3 → 4: Suppose that r − s ∈ I. Since I is an additive subgroup of R, we know that −(r − s) ∈ I. Since
−(r − s) = s − r, it follows that s − r ∈ I.
4 → 2: Suppose that s − r ∈ I. Since r + (s − r) = s, it follows that r ∼I s.
We now work through an example of a quotient ring other than Z/nZ. Let R = Z[x] and notice that R
is commutative. We consider two different quotients of R.
• First, let
I = (x^2 − 2) · Z[x] = {(x^2 − 2) · p(x) : p(x) ∈ Z[x]}
For example, 3x^3 + 5x^2 − 6x − 10 ∈ I because
3x^3 + 5x^2 − 6x − 10 = (x^2 − 2) · (3x + 5)
and 3x + 5 ∈ Z[x]. We claim that I is an ideal of Z[x]: the zero polynomial is (x^2 − 2) · 0 ∈ I, the sum
(x^2 − 2) · p(x) + (x^2 − 2) · q(x) = (x^2 − 2) · (p(x) + q(x)) is an element of I, and for any f(x) ∈ Z[x] we have
f(x) · (x^2 − 2) · p(x) = (x^2 − 2) · (f(x) · p(x)) ∈ I (using the fact that Z[x] is commutative).
However, notice that the leading coefficient of 2x is 2, which is not a unit in Z. Thus, we can't make an
argument like the one above work, and it is indeed harder to find unique representatives of the cosets
in Z[x]/J. For example, one can show that x^3 + J ≠ r(x) + J for any r(x) ∈ Z[x] of degree at most 2.
Ring Homomorphisms
Definition 10.1.7. Let R and S be rings. A function ϕ : R → S is a (ring) homomorphism if it has the
following properties:
• ϕ(a + b) = ϕ(a) + ϕ(b) for all a, b ∈ R.
• ϕ(ab) = ϕ(a) · ϕ(b) for all a, b ∈ R.
• ϕ(1R) = 1S.
If ϕ is in addition a bijection, we call ϕ a (ring) isomorphism.
Definition 10.1.8. Given two rings R and S, we say that R and S are isomorphic, and write R ≅ S, if
there exists an isomorphism ϕ : R → S.
Notice that we have the additional requirement that ϕ(1R ) = 1S . When we discussed group homomor-
phisms, we derived ϕ(eG ) = eH rather than explicitly require it (see Proposition 6.6.2 and Proposition 6.3.4).
Unfortunately, it does not follow for free from the other two conditions in the ring case. If you go back and
look at the proof that ϕ(eG) = eH in the group case, you will see that we used the fact that ϕ(eG) has an
inverse; but it is not true in rings that every element must have a multiplicative inverse. To see an example
where the condition can fail, consider the ring Z × Z (we have not formally defined the direct product of
rings, but it works in the same way). Define ϕ : Z → Z × Z by ϕ(n) = (n, 0). It is not hard to check that ϕ
satisfies the first two conditions for a ring homomorphism, but ϕ(1) = (1, 0) while the identity of Z × Z is
(1, 1).
Definition 10.1.9. Let R be a ring and let c ∈ R. Define Evc : R[x] → R by letting
Evc({an}) = Σ_n an c^n
Notice that the above sum makes sense because elements {an} ∈ R[x] have only finitely many nonzero terms (if
{an} is nonzero, we can stop the sum at M = deg({an})). Intuitively, we are defining
Evc(an x^n + an−1 x^{n−1} + · · · + a1 x + a0) = an c^n + an−1 c^{n−1} + · · · + a1 c + a0.
Thus, Evc is the function which says “evaluate the polynomial at c”.
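As an informal illustration (a Python sketch of ours, representing a polynomial as a list of coefficients indexed by degree), Evc amounts to the following:

    def ev(coeffs, c):
        # coeffs[i] is the coefficient of x^i; only finitely many entries
        return sum(a * c**i for i, a in enumerate(coeffs))

    print(ev([3, 0, 1], 2))   # 3 + 0*2 + 1*4 = 7, i.e. Ev_2(x^2 + 3)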
The next proposition is really fundamental. Intuitively, it says the following. Suppose that we are given
a commutative ring R and two polynomials in R[x]. Given any c ∈ R, we obtain the same result if we
first add/multiply the polynomials and then plug c into the result, or if we first plug c into each polynomial
and then add/multiply the results.
Proposition 10.1.10. Let R be a commutative ring and let c ∈ R. The function Evc is a ring homomor-
phism.
Proof. We clearly have Evc(1) = 1 (where the 1 in parentheses is the constant polynomial 1). For any
{an}, {bn} ∈ R[x], we have
Evc({an} + {bn}) = Σ_n (an + bn) c^n = Σ_n an c^n + Σ_n bn c^n = Evc({an}) + Evc({bn})
Thus, Evc preserves addition. We now check that Evc preserves multiplication. Suppose that {an}, {bn} ∈
R[x]. Let M1 = deg({an}) and M2 = deg({bn}) (for this argument, let Mi = 0 if the corresponding
polynomial is the zero polynomial). We know that M1 + M2 ≥ deg({an} · {bn}), so
Evc({an} · {bn}) = Evc({Σ_{i+j=n} ai bj})
= Σ_{n=0}^{M1+M2} (Σ_{i+j=n} ai bj) c^n
= Σ_{n=0}^{M1+M2} Σ_{i+j=n} ai bj c^n
= Σ_{n=0}^{M1+M2} Σ_{i+j=n} ai bj c^{i+j}
= Σ_{n=0}^{M1+M2} Σ_{i+j=n} ai bj c^i c^j
= Σ_{n=0}^{M1+M2} Σ_{i+j=n} (ai c^i) · (bj c^j)      (since R is commutative)
= Σ_{i=0}^{M1} Σ_{j=0}^{M2} (ai c^i) · (bj c^j)      (since ai = 0 if i > M1 and bj = 0 if j > M2)
= Σ_{i=0}^{M1} (ai c^i · (Σ_{j=0}^{M2} bj c^j))
= (Σ_{i=0}^{M1} ai c^i) · (Σ_{j=0}^{M2} bj c^j)
= Evc({an}) · Evc({bn})
Thus, Evc preserves multiplication as well, so Evc is a ring homomorphism.
Notice where we used the fact that R is commutative. If R is not commutative, then Evc need not preserve
multiplication: given a, b ∈ R, consider p(x) = ax and q(x) = bx, so that p(x) · q(x) = abx^2. We then have
Evc(p(x) · q(x)) = abc^2
and
Evc(p(x)) · Evc(q(x)) = acbc
It seems impossible to argue that these are equal in general if you can not commute b with c. To find
a specific counterexample, it suffices to find a noncommutative ring with two elements b and c such that
bc^2 ≠ cbc (because then we can take a = 1). It's not hard to find two matrices which satisfy this, so the
corresponding Evc will not be a ring homomorphism.
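As a hedged illustration, the following Python sketch exhibits one such pair of 2 × 2 integer matrices (our choice; any pair with bc^2 ≠ cbc would do):

    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    b = [[0, 1], [1, 0]]           # swap matrix
    c = [[1, 0], [0, 0]]           # projection onto the first coordinate
    bcc = matmul(matmul(b, c), c)  # b * c^2
    cbc = matmul(matmul(c, b), c)  # c * b * c
    print(bcc, cbc)                # [[0, 0], [1, 0]] versus [[0, 0], [0, 0]]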
In the future, given p(x) ∈ R[x] and c ∈ R, we will tend to write the informal p(c) for the formal
notion Evc(p(x)). In this informal notation, the proposition says that if R is commutative, c ∈ R, and
p(x), q(x) ∈ R[x], then
(p + q)(c) = p(c) + q(c)          and          (pq)(c) = p(c) · q(c).
Definition 10.1.11. Let ϕ : R → S be a ring homomorphism. We define ker(ϕ) to be the kernel of ϕ when
viewed as a homomorphism of the additive groups, i.e. ker(ϕ) = {a ∈ R : ϕ(a) = 0S}.
Recall that the normal subgroups of a group G are precisely the kernels of group homomorphisms with
domain G. Continuing the analogy between normal subgroups of a group and ideals of a ring, we might
hope that the ideals of a ring R are precisely the kernels of ring homomorphisms with domain R. The next
propositions confirm this.
Proposition 10.1.12. If ϕ : R → S is a ring homomorphism, then ker(ϕ) is an ideal of R.
Proof. Let K = ker(ϕ). Since we know in particular that ϕ is an additive group homomorphism, we know
from our work on groups that K is an additive subgroup of R (see Proposition 6.6.4). Let a ∈ K and r ∈ R
be arbitrary. Since a ∈ K, we have ϕ(a) = 0. Therefore
ϕ(ra) = ϕ(r) · ϕ(a) = ϕ(r) · 0 = 0
so ra ∈ K, and
ϕ(ar) = ϕ(a) · ϕ(r) = 0 · ϕ(r) = 0
so ar ∈ K. Therefore, K is an ideal of R.
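For a concrete (if informal) Python illustration, consider the reduction map Z → Z/6Z sampled on a finite window; its kernel is exactly the ideal 6Z:

    phi = lambda a: a % 6                      # the projection Z -> Z/6Z
    kernel_sample = [a for a in range(-18, 19) if phi(a) == 0]
    print(kernel_sample)                       # [-18, -12, -6, 0, 6, 12, 18]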
Proposition 10.1.13. Let R be a ring and let I be an ideal of R. There exists a ring S and a ring
homomorphism ϕ : R → S such that I = ker(ϕ).
Proof. Consider the ring S = R/I and the projection π : R → R/I defined by π(a) = a + I. As in the group
case, it follows that π is a ring homomorphism and that ker(π) = I.
Now many of the results about group homomorphisms carry over to ring homomorphisms. For example,
we have the following.
Proposition 10.1.14. Let ϕ : R → S be a ring homomorphism. ϕ is injective if and only if ker(ϕ) = {0R }.
Proof. Notice that ϕ is in particular a homomorphism of additive groups. Thus, the result follows from the
corresponding result about groups. However, let’s prove it again because it is an important result.
Suppose first that ϕ is injective. We know that ϕ(0R ) = 0S , so 0R ∈ ker(ϕ). If a ∈ ker(ϕ), then
ϕ(a) = 0S = ϕ(0R ), so a = 0R because ϕ is injective. Therefore, ker(ϕ) = {0R }.
Suppose conversely that ker(ϕ) = {0R}. Let r, s ∈ R with ϕ(r) = ϕ(s). We then have
ϕ(r − s) = ϕ(r) − ϕ(s) = 0S
so r − s ∈ ker(ϕ) = {0R}. It follows that r − s = 0R, i.e. that r = s. Therefore, ϕ is injective.
Corollary 10.1.16. If ϕ : R → S is a ring homomorphism, then range(ϕ) is a subring of S. Furthermore,
if R is commutative, then range(ϕ) is a commutative ring. To see the latter claim, let c, d ∈ range(ϕ) be
arbitrary, and fix a, b ∈ R with ϕ(a) = c and ϕ(b) = d. Since ab = ba in R, we have
dc = ϕ(b) · ϕ(a)
= ϕ(ba)
= ϕ(ab)
= ϕ(a) · ϕ(b)
= cd
Theorem 10.1.17 (First Isomorphism Theorem). Let ϕ : R → S be a ring homomorphism. We then have
R/ ker(ϕ) ≅ range(ϕ)
Let's see the First Isomorphism Theorem in action. Let R = Z[x] and let I be the ideal of all polynomials
with 0 constant term, i.e.
I = {p(x) ∈ Z[x] : the constant term of p(x) is 0}
It is straightforward to check that I is an ideal. Consider the ring R/I. Intuitively, taking the quotient
by I we “kill off” all polynomials without a constant term. Thus, it seems reasonable to suspect that
two polynomials will be in the same coset in the quotient exactly when they have the same constant term
(because then the difference of the two polynomials will be in I). As a result, we might expect that R/I ≅ Z.
Although it is possible to formalize and prove all of these statements directly, we can also deduce them all
from the First Isomorphism Theorem, as we now show.
Since Z is commutative, the function Ev0 : Z[x] → Z is a ring homomorphism by Proposition 10.1.10.
Notice that Ev0 is surjective because Ev0 (a) = a for all a ∈ Z (where we interpret the a in parentheses as
the constant polynomial a). Now ker(Ev0 ) = I because for any p(x) ∈ Z[x], we have that p(0) = Ev0 (p(x))
is the constant term of p(x). Since I = ker(Ev0 ), we know that I is an ideal of Z[x] by Proposition 10.1.12.
Finally, using the First Isomorphism Theorem together with the fact that Ev0 is surjective, we conclude that
Z[x]/I ≅ Z.
We end by stating the remaining theorems. Again, most of the work can be outsourced to the corre-
sponding theorems about groups, and all that remains is to check that multiplication behaves appropriately
in each theorem.
Theorem 10.1.18 (Second Isomorphism Theorem). Let R be a ring, let S be a subring of R, and let I be
an ideal of R. We then have that S + I = {r + a : r ∈ S, a ∈ I} is a subring of R, that I is an ideal of S + I,
that S ∩ I is an ideal of S, and that
(S + I)/I ≅ S/(S ∩ I)
Theorem 10.1.19 (Correspondence Theorem). Let R be a ring and let I be an ideal of R. For every subring
S of R with I ⊆ S, we have that S/I is a subring of R/I and the function
S 7→ S/I
is a bijection from subrings of R containing I to subrings of R/I. Also, for every ideal J of R with I ⊆ J,
we have that J/I is an ideal of R/I and the function
J 7→ J/I
is a bijection from ideals of R containing I to ideals of R/I. Furthermore, we have the following properties
for any subrings S1 and S2 of R that both contain I, and ideals J1 and J2 of R that both contain I:
1. S1 is a subring of S2 if and only if S1 /I is a subring of S2 /I.
2. J1 is an ideal of J2 if and only if J1 /I is an ideal of J2 /I.
Theorem 10.1.20 (Third Isomorphism Theorem). Let R be a ring. Let I and J be ideals of R with I ⊆ J.
We then have that J/I is an ideal of R/I and that
(R/I)/(J/I) ≅ R/J
10.2 The Characteristic of a Ring
Given a ring R, an element a ∈ R, and n ∈ Z, recall that n · a denotes the n-fold additive multiple of a (so
n · a = a + a + · · · + a with n terms when n > 0, we have 0 · a = 0, and n · a = −((−n) · a) when n < 0). We
first check that (n · 1) · a = n · a = a · (n · 1) for all n ∈ Z and all a ∈ R; in the computation below, we write
n · a and a · n for the products (n · 1) · a and a · (n · 1). For n = 0, we have
0 · a = 0 = a · 0
If n ∈ N+ , we have
n · a = (1 + 1 + · · · + 1) · a
= 1 · a + 1 · a + ··· + 1 · a
= a + a + ··· + a
= a · 1 + a · 1 + ··· + a · 1
= a · (1 + 1 + · · · + 1)
=a·n
where each of the above sums has n terms (if you find the . . . not sufficiently formal, you can again give a
formal inductive argument). Suppose now that n ∈ Z with n < 0. We then have −n > 0, hence
n · a = (−(−n)) · a
= −((−n) · a)
= −(a · (−n)) (from above)
= a · (−(−n))
=a·n
Definition 10.2.4. Let R be a ring. We define the characteristic of R, denoted char(R), as follows. If
there exists n ∈ N+ with n · 1 = 0, we define char(R) to be the least such n. If no such n exists, we define
char(R) = 0. In other words, char(R) is the order of 1 when viewed as an element of the additive group
(R, +, 0) if this order is finite, and char(R) = 0 if this order is infinite.
For example, char(Z/nZ) = n for all n ∈ N+ and char(Z) = 0. Also, we have char(Q) = 0 and
char(R) = 0. For an example of an infinite ring with nonzero characteristic, notice that char((Z/nZ)[x]) = n
for all n ∈ N+.
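As a quick sanity check (an informal Python sketch of ours), we can compute char(Z/nZ) by adding 1 to itself until the sum reaches 0:

    def char_zmod(n):
        total, k = 0, 0
        while True:
            k += 1
            total = (total + 1) % n   # the k-fold sum 1 + 1 + ... + 1 in Z/nZ
            if total == 0:
                return k

    print([char_zmod(n) for n in range(2, 8)])   # [2, 3, 4, 5, 6, 7]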
Proposition 10.2.5. Let R be a ring with char(R) = n, and let ϕ : Z → R be the ring homomorphism
given by ϕ(m) = m · 1. Let S be the subgroup of (R, +, 0) generated by 1, so S = {m · 1 : m ∈ Z}. We then
have that S = range(ϕ), so S is a subring of R. Furthermore:
• If n ≠ 0, then ker(ϕ) = nZ, and so by the First Isomorphism Theorem it follows that Z/nZ ≅ S.
• If n = 0, then ker(ϕ) = {0}, so ϕ is injective and hence Z ≅ S.
In particular, if char(R) = n ≠ 0, then R has a subring isomorphic to Z/nZ, while if char(R) = 0, then R
has a subring isomorphic to Z.
Proof. By definition, we have
S = {m · 1 : m ∈ Z} = {ϕ(m) : m ∈ Z} = range(ϕ)
Since S is the range of a ring homomorphism, we have that S is a subring of R by Corollary 10.1.16. Now if
n = 0, then m · 1 ≠ 0 for all nonzero m ∈ Z, so ker(ϕ) = {0}. Suppose that n ≠ 0. To see that ker(ϕ) = nZ,
note that the order of 1 when viewed as an element of the abelian group (R, +, 0) equals n, so m · 1 = 0 if and
only if n | m by Proposition 4.6.5.
Definition 10.2.6. Let R be a ring. The subring S = {m · 1 : m ∈ Z} of R defined in the previous proposition
is called the prime subring of R.
Proposition 10.2.7. If R is an integral domain, then char(R) is either 0 or prime.
Proof. Let R be an integral domain and let n = char(R). Notice that n ≠ 1 because 1 ≠ 0 as R is an
integral domain. Suppose that n ≥ 2 and that n is composite. Fix k, ℓ ∈ N with 2 ≤ k, ℓ < n and n = kℓ.
We then have
0 = n · 1 = (k · 1) · (ℓ · 1)
Since R is an integral domain, either k · 1 = 0 or ℓ · 1 = 0. However, this is a contradiction because k, ℓ < n and
n = char(R) is the least positive value of m with m · 1 = 0. Therefore, either n = 0 or n is prime.
10.3 Polynomial Evaluation and Roots
Definition 10.3.1. Let R be a commutative ring. A root of a polynomial f (x) ∈ R[x] is an element a ∈ R
such that f (a) = 0 (or more formally Eva (f (x)) = 0).
Proposition 10.3.2. Let R be a commutative ring, let a ∈ R, and let f (x) ∈ R[x]. The following are
equivalent:
1. a is a root of f(x).
2. There exists g(x) ∈ R[x] with f(x) = (x − a) · g(x).
Proof. Suppose first that there exists g(x) ∈ R[x] with f (x) = (x − a) · g(x). We then have
f (a) = (a − a) · g(a)
= 0 · g(a)
= 0,
so a is a root of f(x).
Suppose conversely that a is a root of f(x). Since x − a has leading coefficient 1, which is a unit, we can
divide with remainder: fix q(x), r(x) ∈ R[x] with
f(x) = q(x) · (x − a) + r(x)
and either r(x) = 0 or deg(r(x)) < deg(x − a). Since deg(x − a) = 1, we have that r(x) is a constant
polynomial, so we can fix c ∈ R with r(x) = c. We then have
f(x) = q(x) · (x − a) + c
and hence
f(a) = q(a) · (a − a) + c = q(a) · 0 + c = c.
Since a is a root of f(x), it follows that c = f(a) = 0, and therefore
f(x) = q(x) · (x − a)
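The division by x − a used here can be carried out by synthetic division (Horner's scheme); the following Python sketch, with names of our own choosing, returns the quotient q(x) together with the remainder f(a):

    def divide_by_linear(coeffs, a):
        # coeffs lists f's coefficients from highest degree down to the constant
        q = [coeffs[0]]
        for c in coeffs[1:]:
            q.append(q[-1] * a + c)
        r = q.pop()                  # the final value is the remainder f(a)
        return q, r

    print(divide_by_linear([1, 0, -2, 1], 1))   # ([1, 1, -1], 0)
    # so x^3 - 2x + 1 = (x - 1)(x^2 + x - 1), confirming that 1 is a root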
Proposition 10.3.3. Let R be an integral domain and let f(x) ∈ R[x] be a nonzero polynomial. We then
have that f(x) has at most deg(f(x)) many roots in R.
Proof. We prove the result by induction on deg(f(x)). If deg(f(x)) = 0, then f(x) is a nonzero constant
polynomial, which has no roots in R. Suppose now that n ∈ N and that the result holds for all nonzero
polynomials of degree n. Let f(x) ∈ R[x] with deg(f(x)) = n + 1. If f(x) has no roots in R, then we are
done, so suppose that a ∈ R is a root of f(x). By Proposition 10.3.2, we may fix g(x) ∈ R[x] with
f(x) = (x − a) · g(x). Notice that g(x) is nonzero because f(x) is nonzero, and that
n + 1 = deg(f(x))
= deg((x − a) · g(x))
= deg(x − a) + deg(g(x)) (by Proposition 9.3.6)
= 1 + deg(g(x)),
so deg(g(x)) = n. By induction, we know that g(x) has at most n roots in R. Notice that if b is a root of
f (x), then
0 = f (b) = (b − a) · g(b)
so either b − a = 0 or g(b) = 0 (because R is an integral domain), and hence either b = a or b is a root of
g(x). Therefore, f (x) has at most n + 1 roots in R, namely the roots of g(x) together with a. The result
follows by induction.
Corollary 10.3.4. Let R be an integral domain and let n ∈ N. Let f (x), g(x) ∈ R[x] be polynomials of
degree at most n (including possibly the zero polynomial). If there exist at least n + 1 many a ∈ R such that
f (a) = g(a), then f (x) = g(x).
Proof. Suppose that there are at least n + 1 many points a ∈ R with f (a) = g(a). Consider the polynomial
h(x) = f (x) − g(x) ∈ R[x]. Notice that either h(x) = 0 or deg(h(x)) ≤ n by Proposition 9.3.5. Since h(x)
has at least n + 1 many roots in R, we must have h(x) = 0 by Proposition 10.3.3. Thus, f (x) − g(x) = 0.
Adding g(x) to both sides, we conclude that f (x) = g(x).
Back when we defined the polynomial ring R[x], we were very careful to define an element as a sequence
of coefficients rather than as the function defined from R to R given by evaluation. As we saw, if R = Z/2Z,
then the two distinct polynomials
x^2 + x + 1          and          1
give the same function on Z/2Z because they evaluate to the same value at each element of Z/2Z. Notice
that each of these polynomials has degree at most 2, but they only agree at the 2 points of Z/2Z, so we
can not apply the previous corollary (which would require agreement at 3 points).
Corollary 10.3.5. Let R be an infinite integral domain. If f (x), g(x) ∈ R[x] are such that f (a) = g(a) for
all a ∈ R, then f (x) = g(x). Thus, distinct polynomials give different functions from R to R.
Proof. Suppose that f (a) = g(a) for all a ∈ R. Consider the polynomial h(x) = f (x) − g(x) ∈ R[x]. For
every a ∈ R, we have h(a) = f (a) − g(a) = 0. Since R is infinite, we conclude that h(x) has infinitely many
roots, so by Proposition 10.3.3 we must have that h(x) = 0. Thus, f(x) − g(x) = 0. Adding g(x) to both
sides, we conclude that f (x) = g(x).
Let R be an integral domain and let n ∈ N. Suppose that we have n + 1 many distinct points
a1 , a2 , . . . , an+1 ∈ R, along with n + 1 many values b1 , b2 , . . . , bn+1 ∈ R (which may not be distinct, and may
equal some of the ai). We know from Corollary 10.3.4 that there can be at most one polynomial f(x) ∈ R[x]
with either f (x) = 0 or deg(f (x)) ≤ n such that f (ai ) = bi for all i. Must one always exist?
In Z, the answer is no. Suppose, for example, that n = 1, and we want to find a polynomial of degree at
most 1 such that f (0) = 1 and f (2) = 2. Writing f (x) = ax + b, we then need 1 = f (0) = a · 0 + b, so b = 1,
and hence f(x) = ax + 1. We also need 2 = f(2) = a · 2 + 1, which would require 2a = 1. Since 2 ∤ 1,
there is no a ∈ Z satisfying this. Notice, however, that if we go up to Q, then f(x) = (1/2) · x + 1 works as an
example here. In general, such polynomials always exist when working over a field F .
Theorem 10.3.6 (Lagrange Interpolation). Let F be a field and let n ∈ N. Suppose that we have n + 1
many distinct points a1 , a2 , . . . , an+1 ∈ F , along with n + 1 many values b1 , b2 , . . . , bn+1 ∈ F (which may
not be distinct, and may equal some of the ai ). There exists a unique polynomial f (x) ∈ F [x] with either
f (x) = 0 or deg(f (x)) ≤ n such that f (ai ) = bi for all i.
Before jumping into the general proof of this result, we first consider the case where bk = 1 and bi = 0
for all i ≠ k. In order to build a polynomial f(x) of degree at most n such that f(ai) = bi = 0 for all i ≠ k,
we need to make the ai with i ≠ k roots of our polynomial. The idea then is to consider the polynomial
(x − a_1) · · · (x − a_{k−1}) · (x − a_{k+1}) · · · (x − a_{n+1})
Now for all i ≠ k, plugging ai into this polynomial gives 0 = bi, so we are good there. However, when we plug in ak, we obtain
(a_k − a_1) · · · (a_k − a_{k−1}) · (a_k − a_{k+1}) · · · (a_k − a_{n+1})
which is most likely not equal to 1. However, since the ai are distinct, we have that ak − ai ≠ 0 for all i ≠ k.
Now F is an integral domain (because it is a field), so the product above is nonzero. Since F is a field, we can
divide by the resulting value. This suggests considering the polynomial
g_k(x) = ∏_{i≠k} (x − a_i)/(a_k − a_i)
Notice now that gk(ai) = 0 = bi for all i ≠ k, and gk(ak) = 1 = bk. Thus, we are successful in the very
special case where one of the bi equals 1 and the rest are 0.
How do we generalize this? First, if bk ≠ 0 (but possibly not 1) while bi = 0 for all i ≠ k, then we just
scale gk(x) accordingly and consider the polynomial bk · gk(x). Fortunately, to handle the general case, we
need only add up these polynomials accordingly!
Proof of Theorem 10.3.6. For each k, let
g_k(x) = ∏_{i≠k} (x − a_i)/(a_k − a_i)
Notice that deg(gk(x)) = n for each k (since there are n terms in the product, each of which has degree 1),
that gk(ai) = 0 whenever i ≠ k, and that gk(ak) = 1. Now let
f(x) = b_1 · g_1(x) + b_2 · g_2(x) + · · · + b_{n+1} · g_{n+1}(x)
Since each nonzero bk · gk(x) has degree n, it follows that either f(x) = 0 or deg(f(x)) ≤ n. Furthermore, for any i,
we have
f(a_i) = b_1 · g_1(a_i) + · · · + b_{n+1} · g_{n+1}(a_i) = b_i · g_i(a_i) = b_i
This proves existence, and uniqueness follows from Corollary 10.3.4.
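As an illustration, here is an informal Python sketch of this construction over the field Z/7Z (the helper names are ours, and polynomials are coefficient lists in ascending degree):

    p = 7                                        # work in the field F = Z/7Z

    def poly_mul(u, v):
        out = [0] * (len(u) + len(v) - 1)
        for i, x in enumerate(u):
            for j, y in enumerate(v):
                out[i + j] = (out[i + j] + x * y) % p
        return out

    def interpolate(points):                     # points: list of (a_i, b_i) pairs
        f = [0] * len(points)
        for k, (ak, bk) in enumerate(points):
            gk = [1]
            for i, (ai, _) in enumerate(points):
                if i != k:
                    inv = pow((ak - ai) % p, -1, p)            # divide by a_k - a_i in F
                    gk = poly_mul(gk, [(-ai * inv) % p, inv])  # factor (x - a_i)/(a_k - a_i)
            f = [(u + bk * v) % p for u, v in zip(f, gk)]
        return f

    print(interpolate([(0, 1), (1, 3), (2, 2)]))
    # [1, 0, 2], i.e. f(x) = 2x^2 + 1, and indeed f(0), f(1), f(2) = 1, 3, 2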
10.4 Generating Subrings and Ideals in Commutative Rings
Proposition 10.4.1. Let R be a ring and let A ⊆ R. There exists a subring S of R with the following
properties:
• A ⊆ S.
• Whenever T is a subring of R with A ⊆ T, we have S ⊆ T.
Furthermore, the subring S is unique (i.e. if both S1 and S2 have the above properties, then S1 = S2).
Proof. The idea is to intersect all of the subrings of R that contain A, and argue that the result is a subring.
We first prove existence. Notice that there is at least one subring of R containing A, namely R itself. Define
S = ⋂ {T : T is a subring of R with A ⊆ T}
Notice that we certainly have A ⊆ S by definition. Moreover, if T is a subring of R with the property that A ⊆ T,
then we have S ⊆ T by definition of S. We now show that S is indeed a subring of R.
• For any subring T of R such that A ⊆ T, we have 0, 1 ∈ T because T is a subring of R. Since this is
true for all such T, we conclude that 0, 1 ∈ S by definition of S.
• Let a, b ∈ S. For any subring T of R such that A ⊆ T , we must have both a, b ∈ T by definition of S,
hence a + b ∈ T because T is a subring. Since this is true for all such T , we conclude that a + b ∈ S
by definition of S.
• Let a ∈ S. For any subring T of R such that A ⊆ T, we must have a ∈ T by definition of S,
hence −a ∈ T because T is a subring. Since this is true for all such T, we conclude that −a ∈ S by
definition of S.
• Let a, b ∈ S. For any subring T of R such that A ⊆ T , we must have both a, b ∈ T by definition of
S, hence ab ∈ T because T is a subring. Since this is true for all such T , we conclude that ab ∈ S by
definition of S.
Combining these four properties, we conclude that S is a subring of R. This finishes the proof of existence.
Finally, suppose that S1 and S2 both have the above properties. Since S2 is a subring of R with A ⊆ S2 ,
we know that S1 ⊆ S2 . Similarly, since S1 is a subring of R with A ⊆ S1 , we know that S2 ⊆ S1 . Therefore,
S1 = S2 .
In group theory, when G was a group and A = {a} consisted of a single element, the corresponding
subgroup was {a^n : n ∈ Z}. We want to develop a similar explicit description in the commutative ring
case (one can also look at noncommutative rings, but the description is significantly more complicated).
However, instead of just looking at the smallest ring containing one element, we are going to generalize the
construction as follows. Assume that we already have a subring S of R. With this base in hand, suppose
that we now want to include one other element a ∈ R in our subring. We then want to know what is the
smallest subring containing S ∪ {a}? Notice that if we really want just the smallest ring that contains one
a ∈ R, we can simply take S to be the prime subring of R from Definition 10.2.6 (because every subring of
R must contain 1, it follows that every subring of R must also contain the prime subring of R).
Proposition 10.4.2. Suppose that R is a commutative ring, that S is a subring of R, and that a ∈ R. Let
S[a] = {p(a) : p(x) ∈ S[x]} = {s_n a^n + s_{n−1} a^{n−1} + · · · + s_1 a + s_0 : n ∈ N and s_0, s_1, . . . , s_n ∈ S}
We then have that S[a] is the smallest subring of R containing S ∪ {a}.
Proof. We first check that S[a] is a subring of R.
• Notice that 0, 1 ∈ {p(a) : p(x) ∈ S[x]} by considering the zero polynomial and the constant polynomial 1.
• Given two elements of S[a], say p(a) and q(a) where p(x), q(x) ∈ S[x], we have that p(a) + q(a) =
(p + q)(a) and p(a) · q(a) = (pq)(a). Since p(x) + q(x) ∈ S[x] and p(x) · q(x) ∈ S[x], it follows that S[a]
is closed under addition and multiplication.
• Consider an arbitrary element p(a) ∈ S[a], where p(x) ∈ S[x]. Since evaluation is a ring homomor-
phism, we then have that −p(a) = (−p)(a), where −p is the additive inverse of p in S[x]. Thus, S[a]
is closed under additive inverses.
Therefore, S[a] is a subring of R. Finally, notice that S ∪ {a} ⊆ S[a] by considering the constant polynomials
and the polynomial x ∈ S[x].
Suppose now that T is an arbitrary subring of R with S ∪ {a} ⊆ T . Let n ∈ N+ and let s1 , s2 , . . . , sn ∈ S
be arbitrary. Notice that a0 = 1 ∈ T because T is a subring of R, and ak ∈ T for all k ∈ N+ because a ∈ T
and T is closed under multiplication. Since sk ∈ T for all k (because S ⊆ T ), we can use the fact that T is
closed under multiplication to conclude that sk ak ∈ T for all k. Finally, since T is closed under addition, we
conclude that sn an + sn−1 an−1 + · · · + s1 a + s0 ∈ T . Therefore, S[a] ⊆ T .
In group theory, we know that if |a| = n, then we can write ⟨a⟩ = {a^k : 0 ≤ k ≤ n − 1} in place of
{a^k : k ∈ Z}. In other words, sometimes we have redundancy in the set {a^k : k ∈ Z} and can express ⟨a⟩
with less. The same holds true now, and sometimes we can obtain all elements of S[a] using fewer than all
polynomials in S[x]. For example, consider Q as a subring of R. Suppose that we now want to include √2.
From above, we then have that
Q[√2] = {p(√2) : p(x) ∈ Q[x]}
= {b_n (√2)^n + b_{n−1} (√2)^{n−1} + · · · + b_1 (√2) + b_0 : n ∈ N and b_0, b_1, . . . , b_n ∈ Q}.
Although this certainly works, there is a lot of redundancy in the set on the right. For example, when we
plug √2 into x^3 + x^2 − 7, we obtain
(√2)^3 + (√2)^2 − 7 = √8 + 2 − 7 = 2√2 − 5,
which is the same thing as plugging √2 into 2x − 5. In fact, since {a + b√2 : a, b ∈ Q} is a subring of R, as
we checked in Section 9.1, it suffices to plug √2 into only the polynomials of the form a + bx where a, b ∈ Q.
A similar thing happened with Z[i] = {a + bi : a, b ∈ Z}: to find the smallest subring of C
containing Z ∪ {i}, we just needed to plug i into linear polynomials with coefficients in Z.
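As an informal aside (a Python sketch of ours, not part of the formal development), representing elements of Q[√2] as pairs (a, b) standing for a + b√2 makes the closure under multiplication, and the computation above, easy to check:

    from fractions import Fraction as F

    def mul(x, y):                    # (a + b*sqrt(2)) * (c + d*sqrt(2))
        a, b = x
        c, d = y
        return (a * c + 2 * b * d, a * d + b * c)

    r2 = (F(0), F(1))                 # the element sqrt(2)
    square = mul(r2, r2)              # (sqrt(2))^2 = 2
    cube = mul(square, r2)            # (sqrt(2))^3 = 2*sqrt(2)
    val = (cube[0] + square[0] - 7, cube[1] + square[1])
    print(val)                        # (-5, 2), i.e. 2*sqrt(2) - 5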
Suppose instead that we consider Q as a subring of R, and we now want to include ∛2. We know from
above that
Q[∛2] = {p(∛2) : p(x) ∈ Q[x]}
= {b_n (∛2)^n + b_{n−1} (∛2)^{n−1} + · · · + b_1 (∛2) + b_0 : n ∈ N and b_0, b_1, . . . , b_n ∈ Q}.
Can we express Q[∛2] as {a + b∛2 : a, b ∈ Q}? Notice that (∛2)^2 = ∛4 must be an element of Q[∛2], but
it is not clear whether it is possible to write ∛4 in the form a + b∛2 where a, b ∈ Q. In fact, it is impossible
to find such a and b, although it is tedious to verify this fact now. However, it is true that
Q[∛2] = {a + b∛2 + c∛4 : a, b, c ∈ Q},
so we need only plug ∛2 into quadratic polynomials with coefficients in Q. We will verify all of these facts,
and investigate these kinds of questions in more detail later.
Suppose now that S is a subring of a commutative ring R and a1 , a2 , . . . , an ∈ R. How do we obtain an
explicit description of S[a1 , a2 , . . . , an ]? It is natural to guess that one obtains S[a1 , a2 , . . . , an ] by plugging
the point (a1 , a2 , . . . , an ) into all polynomials in n variables over S. This is indeed the case, but we will wait
until we develop the theory of such multivariable polynomials more deeply.
We now move on to generating ideals. Analogously to Proposition 5.2.8 and Proposition 10.4.1, we have
the following result. Since the argument is completely analogous, we omit the proof.
Proposition 10.4.3. Let R be a ring and let A ⊆ R. There exists an ideal I of R with the following properties:
• A ⊆ I.
• Whenever J is an ideal of R with A ⊆ J, we have I ⊆ J.
Furthermore, the ideal is unique (i.e. if both I1 and I2 have the above properties, then I1 = I2).
Let R be a commutative ring. Suppose that a ∈ R, and we want an explicit description of the smallest
ideal that contains the element a (so we are considering the case where A = {a}). Notice that ra must be
an element of this ideal for all r ∈ R. Fortunately, the resulting set turns out to be an ideal.
Proposition 10.4.4. Let R be a commutative ring and let a ∈ R. Let I = {ra : r ∈ R}.
1. I is an ideal of R with a ∈ I.
2. Whenever J is an ideal of R with a ∈ J, we have I ⊆ J.
Proof. We first prove 1. We begin by noting that a = 1 · a ∈ I and that 0 = 0 · a ∈ I. For any r, s ∈ R, we
have
ra + sa = (r + s)a ∈ I
so I is closed under addition. For any r, s ∈ R, we have
r · (sa) = (rs) · a ∈ I
so I is closed under multiplication by arbitrary elements of R (and ra = ar because R is commutative).
Therefore, I is an ideal of R with a ∈ I.
For 2, suppose that J is an ideal of R with a ∈ J. For any r ∈ R, we have ra ∈ J because J is an ideal
of R, so I = {ra : r ∈ R} ⊆ J.
Since we will be looking at these types of ideals often, and they are sometimes analogous to cyclic sub-
groups, we steal the notation that we used in group theory.
Definition 10.4.5. Let R be a commutative ring and let a ∈ R. We define ⟨a⟩ = {ra : r ∈ R}. The ideal
⟨a⟩ is called the ideal generated by a.
Be careful with this overloaded notation! If G is a group and a ∈ G, then ⟨a⟩ = {a^n : n ∈ Z}. However,
if R is a commutative ring and a ∈ R, then ⟨a⟩ = {ra : r ∈ R}.
Definition 10.4.6. Let R be a commutative ring and let I be an ideal of R. We say that I is a principal
ideal if there exists a ∈ R with I = ⟨a⟩.
Proposition 10.4.7. The ideals of Z are precisely the sets nZ = ⟨n⟩ for n ∈ N. Thus, every ideal of
Z is principal.
Proof. For each n ∈ N, we know that ⟨n⟩ = {kn : k ∈ Z} is an ideal by Proposition 10.4.4. Suppose now
that I is an arbitrary ideal of Z. Since ideals of Z are in particular additive subgroups of Z, one can use
Corollary 8.1.6 to argue that every ideal of Z is of this form. However, we give a direct argument.
Let I be an arbitrary ideal of Z. We know that 0 ∈ I. If I = {0}, then I = ⟨0⟩, and we are done. Suppose
then that I ≠ {0}. Notice that if k ∈ I, then −k ∈ I as well, hence I ∩ N+ ≠ ∅. By well-ordering, we may
let n = min(I ∩ N+). We claim that I = ⟨n⟩.
First notice that since n ∈ I and I is an ideal of Z, it follows from Proposition 10.4.4 that ⟨n⟩ ⊆ I. Let
m ∈ I be arbitrary. Fix q, r ∈ Z with m = qn + r and 0 ≤ r < n. We then have that r = m − qn = m + (−q)n.
Since m, n ∈ I and I is an ideal, we know that (−q)n ∈ I and so r = m + (−q)n ∈ I. Now 0 ≤ r < n and
n = min(I ∩ N+), so we must have r = 0. It follows that m = qn, so m ∈ ⟨n⟩. Now m ∈ I was arbitrary, so
I ⊆ ⟨n⟩. Putting this together with the above, we conclude that I = ⟨n⟩.
Lemma 10.4.8. Let R be a commutative ring and let I be an ideal of R. We have that I = R if and only
if I contains a unit of R.
Proof. If I = R, then 1 ∈ I, so I contains a unit. Suppose conversely that I contains a unit, and fix such a
unit u ∈ I. Since u is a unit, we may fix v ∈ R with vu = 1. Since u ∈ I and I is an ideal, we conclude that
1 ∈ I. Now for any r ∈ R, we have r = r · 1 ∈ I again because I is an ideal of R. Thus, R ⊆ I, and since
I ⊆ R trivially, it follows that I = R.
Using principal ideals, we can prove the following simple but important result.
Proposition 10.4.9. Let R be a commutative ring with 1 ≠ 0. The following are equivalent.
1. R is a field.
2. The only ideals of R are {0} and R.
Proof. We first prove that 1 implies 2. Suppose that R is a field. Let I be an ideal of R with I ≠ {0}. Fix
a ∈ I with a ≠ 0. Since R is a field, every nonzero element of R is a unit, so a is a unit. Since a ∈ I, we
may use the lemma to conclude that I = R.
We now prove that 2 implies 1 by proving the contrapositive. Suppose that R is not a field. Fix a nonzero
element a ∈ R such that a is not a unit. Let I = ⟨a⟩ = {ra : r ∈ R}. We know from above that I is an
ideal of R. If 1 ∈ I, then we may fix r ∈ R with ra = 1, which implies that a is a unit (remember that we
are assuming that R is commutative). Therefore, 1 ∉ I, and hence I ≠ R. Since a ≠ 0 and a ∈ I, we have
I ≠ {0}. Therefore, I is an ideal of R distinct from {0} and R.
Given finitely many elements a1, a2, . . . , an of a commutative ring R, we can also describe the smallest
ideal of R containing {a1, a2, . . . , an} in a simple manner. We leave the verification as an exercise. Let
⟨a1, a2, . . . , an⟩ = {r1a1 + r2a2 + · · · + rnan : r1, r2, . . . , rn ∈ R}.
This set is the smallest ideal of R containing a1, a2, . . . , an, and we call it the ideal generated by a1, a2, . . . , an.
For example, consider the ring Z. We know that ⟨15, 42⟩ is an ideal of Z. Now
⟨15, 42⟩ = {15k + 42ℓ : k, ℓ ∈ Z}
is an ideal of Z, so it must equal ⟨n⟩ for some n ∈ N by Proposition 10.4.7. Since gcd(15, 42) = 3, we know
that 3 divides every element in ⟨15, 42⟩, and we also know that 3 is an actual element of ⟨15, 42⟩ (for example,
3 = 15 · 3 + 42 · (−1)). Working out the details, one can show that ⟨15, 42⟩ = ⟨3⟩. Generalizing this argument,
one can show that if a, b ∈ Z, then ⟨a, b⟩ = ⟨gcd(a, b)⟩.
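As a quick empirical check (an informal Python sketch; the finite sampling window is our own ad hoc choice), the integer combinations 15k + 42ℓ produce exactly the multiples of gcd(15, 42) = 3:

    from math import gcd

    combos = {15 * k + 42 * l for k in range(-20, 21) for l in range(-20, 21)}
    multiples = {m for m in range(-30, 31) if m % gcd(15, 42) == 0}
    print(multiples <= combos)   # True: every multiple of 3 in the window arises as 15k + 42l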
10.5 Prime and Maximal Ideals in Commutative Rings
Definition 10.5.1. Let R be a commutative ring.
• A prime ideal of R is an ideal P ⊆ R such that P ≠ R and such that whenever a, b ∈ R satisfy ab ∈ P,
either a ∈ P or b ∈ P.
• A maximal ideal of R is an ideal M ⊆ R such that M ≠ R and there exists no ideal I of R with
M ⊊ I ⊊ R.
For an example, consider the ideal ⟨x⟩ in the commutative ring Z[x]. We have
⟨x⟩ = {x · p(x) : p(x) ∈ Z[x]} = {f(x) ∈ Z[x] : f(0) = 0},
the set of polynomials with constant term 0.
We claim that ⟨x⟩ is a prime ideal of Z[x]. To see this, suppose that f(x), g(x) ∈ Z[x] and that f(x)g(x) ∈ ⟨x⟩.
We then have that f(0)g(0) = 0, so either f(0) = 0 or g(0) = 0 (because Z is an integral domain). It follows
that either f(x) ∈ ⟨x⟩ or g(x) ∈ ⟨x⟩.
For an example of an ideal that is not a prime ideal, consider the ideal ⟨6⟩ = 6Z in Z. We have that
2 · 3 ∈ 6Z but 2 ∉ 6Z and 3 ∉ 6Z. Also, ⟨6⟩ is not a maximal ideal because ⟨6⟩ ⊊ ⟨3⟩ ⊊ Z (every multiple
of 6 is a multiple of 3, but 3 ∉ ⟨6⟩).
Proposition 10.5.2. Let R be a commutative ring with 1 ≠ 0. The ideal {0} is a prime ideal of R if and
only if R is an integral domain.
Proof. Suppose first that {0} is a prime ideal of R. Let a, b ∈ R be arbitrary with ab = 0. We then have
that ab ∈ {0}, so either a ∈ {0} or b ∈ {0} because {0} is a prime ideal of R. Thus, either a = 0 or b = 0.
It follows that R has no zero divisors, and since 1 ≠ 0 by assumption, we conclude that R is an integral domain.
Conversely, suppose that R is an integral domain. Notice that {0} is an ideal of R with {0} ≠ R (since
1 ≠ 0). Let a, b ∈ R be arbitrary with ab ∈ {0}. We then have that ab = 0, so as R is an integral domain we
can conclude that either a = 0 or b = 0. Thus, either a ∈ {0} or b ∈ {0}. It follows that {0} is a prime ideal
of R.
Proposition 10.5.3. Let I be an ideal of Z.
1. I is a prime ideal if and only if either I = {0} or I = ⟨p⟩ for some prime p ∈ N+.
2. I is a maximal ideal if and only if I = ⟨p⟩ for some prime p ∈ N+.
Proof. 1. First notice that {0} is a prime ideal of Z by Proposition 10.5.2 because Z is an integral domain.
Suppose now that p ∈ N+ is prime and let I = ⟨p⟩ = {pk : k ∈ Z}. We then have that I = ⟨p⟩, so I is
indeed an ideal of Z by Proposition 10.4.4. Suppose that a, b ∈ Z are such that ab ∈ pZ. Fix k ∈ Z
with ab = pk. We then have that p | ab, so as p is prime, either p | a or p | b by Proposition 2.5.6. If
p | a, then we can fix ℓ ∈ Z with a = pℓ, from which we can conclude that a ∈ pZ. Similarly, if p | b,
then we can fix ℓ ∈ Z with b = pℓ, from which we can conclude that b ∈ pZ. Thus, either a ∈ pZ or
b ∈ pZ. Therefore, pZ is a prime ideal of Z.
Suppose conversely that I is a prime ideal of Z. By Proposition 10.4.7, we can fix n ∈ N with I = ⟨n⟩.
We need to prove that either n = 0 or n is prime. Notice that n ≠ 1 because ⟨1⟩ = Z is not a prime
ideal of Z by definition. Suppose that n ≥ 2 is not prime. We can then fix c, d ∈ N with n = cd and
both 1 < c < n and 1 < d < n. Notice that cd = n ∈ ⟨n⟩. However, if c ∈ ⟨n⟩, then fixing k ∈ Z
with c = nk, we notice that k > 0 (because c, n > 0), so c = nk ≥ n, a contradiction. Thus, c ∉ ⟨n⟩.
Similarly, d ∉ ⟨n⟩. It follows that ⟨n⟩ is not a prime ideal when n ≥ 2 is composite. Therefore, if I = ⟨n⟩
is a prime ideal, then either n = 0 or n is prime.
2. Suppose first that I = ⟨p⟩ for some prime p ∈ N+. Let J be an ideal of Z with I ⊆ J. We prove that
either J = I or J = Z. By Proposition 10.4.7, we can fix n ∈ N with J = ⟨n⟩. Since I ⊆ J and p ∈ I,
we have that p ∈ J = ⟨n⟩. Thus, we can fix k ∈ Z with p = nk. Notice that k > 0 because p, n > 0.
Since p is prime, it follows that either n = 1 or n = p. If n = 1, then J = ⟨1⟩ = Z. If n = p, then
J = ⟨p⟩ = I. Therefore, I = ⟨p⟩ is a maximal ideal of Z.
Suppose conversely that I is a maximal ideal of Z. By Proposition 10.4.7, we can fix n ∈ N with
I = ⟨n⟩. We need to prove that n is prime. Notice that n ≠ 0 because ⟨0⟩ is not a maximal
ideal of Z (we have ⟨0⟩ ⊊ ⟨2⟩ ⊊ Z). Also, n ≠ 1 because ⟨1⟩ = Z is not a maximal ideal of Z by
definition. Suppose that n ≥ 2 is not prime. We can then fix c, d ∈ N with n = cd and both 1 < c < n
and 1 < d < n. Notice that ⟨n⟩ ⊆ ⟨d⟩ because every multiple of n is a multiple of d. However, we
have d ∉ ⟨n⟩ because n ∤ d (as 1 < d < n). Furthermore, we have ⟨d⟩ ≠ Z because 1 ∉ ⟨d⟩. Since
⟨n⟩ ⊊ ⟨d⟩ ⊊ Z, we conclude that ⟨n⟩ is not a maximal ideal of Z. Therefore, if I = ⟨n⟩ is a maximal
ideal, then n is prime.
We can classify the prime and maximal ideals of a commutative ring by the properties of the corresponding
quotient ring.
Theorem 10.5.4. Let R be a commutative ring and let P be an ideal of R. P is a prime ideal of R if and
only if R/P is an integral domain.
Proof. Suppose first that P is a prime ideal of R. Since R is commutative, we know that R/P is commutative.
By definition, we have P ≠ R, so 1 ∉ P, and hence 1 + P ≠ 0 + P. Finally, suppose that a, b ∈ R with
(a + P) · (b + P) = 0 + P. We then have that ab + P = 0 + P, so ab ∈ P. Since P is a prime ideal, either
a ∈ P or b ∈ P. Therefore, either a + P = 0 + P or b + P = 0 + P. It follows that R/P is an integral domain.
Suppose conversely that R/P is an integral domain. We then have that 1 + P ≠ 0 + P by definition of
an integral domain, hence 1 ∉ P and so P ≠ R. Suppose that a, b ∈ R with ab ∈ P. We then have
(a + P ) · (b + P ) = ab + P = 0 + P
Since R/P is an integral domain, we conclude that either a + P = 0 + P or b + P = 0 + P . Therefore, either
a ∈ P or b ∈ P . It follows that P is a prime ideal of R.
Theorem 10.5.5. Let R be a commutative ring and let M be an ideal of R. M is a maximal ideal of R if
and only if R/M is a field.
Proof. We give two proofs. The first is the slick “highbrow” proof. Using the Correspondence Theorem and
Proposition 10.4.9, we have
M is a maximal ideal of R ⟺ there is no ideal I of R with M ⊊ I ⊊ R
⟺ there are no ideals of R/M other than {0 + M} and R/M
⟺ R/M is a field
If you don’t like appealing to the Correspondence Theorem (which is a shame, because it’s awesome), we
can prove it directly via a “lowbrow” proof.
Suppose first that M is a maximal ideal of R. Fix a nonzero element a + M ∈ R/M. Since a + M ≠ 0 + M,
we have that a ∉ M. Let I = {ra + m : r ∈ R, m ∈ M}. We then have that I is an ideal of R (check it!)
with M ⊆ I and a ∈ I. Since a ∉ M, we have M ⊊ I, so as M is maximal it follows that I = R. Thus,
we may fix r ∈ R and m ∈ M with ra + m = 1. We then have ra − 1 = −m ∈ M, so ra + M = 1 + M. It
follows that (r + M)(a + M) = 1 + M, so a + M has an inverse in R/M (recall that R and hence R/M is
commutative, so we only need an inverse on one side).
Suppose conversely that R/M is a field. Since R/M is a field, we have 1 + M ≠ 0 + M, so 1 ∉ M and
hence M ⊊ R. Fix an ideal I of R with M ⊊ I. Since M ⊊ I, we may fix a ∈ I \ M. Since a ∉ M, we have
a + M ≠ 0 + M, and using the fact that R/M is a field we may fix b + M ∈ R/M with (a + M)(b + M) = 1 + M.
We then have ab + M = 1 + M, so ab − 1 ∈ M. Fixing m ∈ M with ab − 1 = m, we then have ab − m = 1.
Now a ∈ I, so ab ∈ I as I is an ideal. Also, we have m ∈ M ⊆ I. It follows that 1 = ab − m ∈ I, and thus
I = R. Therefore, M is a maximal ideal of R.
From the theorem, we see that pZ is a maximal ideal of Z for every prime p ∈ N+ because we know
that Z/pZ is a field, which gives a much faster proof of one part of Proposition 10.5.3. We also obtain the
following nice corollary.
Corollary 10.5.6. Let R be a commutative ring. Every maximal ideal of R is a prime ideal of R.
Proof. Suppose that M is a maximal ideal of R. We then have that R/M is a field by Theorem 10.5.5, so
R/M is an integral domain by Proposition 9.2.9. Therefore, M is a prime ideal of R by Theorem 10.5.4.
The converse is not true. As we’ve seen, the ideal {0} is a prime ideal of Z, but it is certainly not a
maximal ideal of Z.
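To see the two theorems side by side in a small Python sketch of ours: Z/5Z is a field (every nonzero coset has a multiplicative inverse), while Z/6Z has zero divisors and so is not even an integral domain:

    print([pow(a, -1, 5) for a in range(1, 5)])      # [1, 3, 2, 4]: inverses in Z/5Z
    print([(a, b) for a in range(1, 6)
                  for b in range(1, 6) if (a * b) % 6 == 0])
    # [(2, 3), (3, 2), (3, 4), (4, 3)]: zero divisors in Z/6Z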
Chapter 11
Divisibility and Factorizations in Integral Domains
Our primary example of an integral domain that is not a field is Z. We spent a lot of time developing the
arithmetic of Z in Chapter 2, and there we worked through the notions of divisibility, greatest common
divisors, and prime factorizations. In this chapter, we explore how much of this arithmetic of Z we can carry
over to other types of integral domains.
11.1 Divisibility and Associates
Definition 11.1.1. Let R be an integral domain and let a, b ∈ R. We say that a divides b, and write a | b,
if there exists c ∈ R with b = ac.
For example, in Z[x] we have
x^2 + 3x − 1 | x^4 − x^3 − 11x^2 + 10x − 2
because
(x^2 + 3x − 1)(x^2 − 4x + 2) = x^4 − x^3 − 11x^2 + 10x − 2
If R is a field, then for any a, b ∈ R with a ≠ 0, we have a | b because b = a(a^{−1}b). Thus, the divisibility
relation is trivial in fields. Also notice that a ∈ R is a unit if and only if a | 1.
Proposition 11.1.2. Let R be an integral domain and let a, b, c ∈ R.
1. If a | b and b | c, then a | c.
2. If a | b and a | c, then a | (rb + sc) for all r, s ∈ R.
Proof.
1. Fix k, m ∈ R with b = ak and c = bm. We then have c = bm = (ak)m = a(km), so a | c.
2. Fix k, m ∈ R with b = ak and c = am. For any r, s ∈ R, we have
rb + sc = r(ak) + s(am)
= a(rk) + a(sm) (since R is commutative)
= a(rk + sm)
so a | (rb + sc).
For an example of where working in an integral domain makes divisibility have more desirable properties,
consider the following proposition, which would be false in general commutative rings. Working in R =
Z × Z (which is not an integral domain), notice that (2, 0) | (6, 0) via both (2, 0) · (3, 0) = (6, 0) and also
(2, 0) · (3, 5) = (6, 0).
Proposition 11.1.3. Let R be an integral domain. Suppose that a, b ∈ R with a ≠ 0 and that a | b. There
exists a unique d ∈ R such that ad = b.
Proof. The existence of d follows immediately from the definition of divisibility. Suppose that c, d ∈ R
satisfy ac = b and ad = b. We then have that ac = ad. Since a ≠ 0 and R is an integral domain, we may use
Proposition 9.2.11 to cancel the a's to conclude that c = d.
Recall that if R is a ring, then we denote the set of units of R by U (R), and that U (R) forms a
multiplicative group by Proposition 9.2.4.
Definition 11.1.4. Let R be an integral domain. Define a relation ∼ on R by letting a ∼ b if there exists a
u ∈ U (R) such that b = au.
Proposition 11.1.5. Let R be an integral domain. The relation ∼ is an equivalence relation on R.
Proof.
• Reflexive: For any a ∈ R, we have a = a · 1 and 1 ∈ U(R), so a ∼ a.
• Symmetric: Suppose that a, b ∈ R with a ∼ b. Fix u ∈ U(R) with b = au. Multiplying on the right by
u−1 , we see that a = bu−1 . Since u−1 ∈ U (R) (by Proposition 9.2.4), it follows that b ∼ a.
• Transitive: Suppose that a, b, c ∈ R with a ∼ b and b ∼ c. Fix u, v ∈ U (R) with b = au and c = bv.
We then have c = bv = (au)v = a(uv). Since uv ∈ U(R) (by Proposition 9.2.4), it follows that a ∼ c.
Definition 11.1.6. Let R be an integral domain. Elements of the same equivalence class are called associates.
In other words, given a, b ∈ R, then a and b are associates if there exists u ∈ U (R) with b = au.
For example, we have U (Z) = {±1}, so the associates of a given n ∈ Z are exactly ±n. Thus, the
equivalence classes partition Z into the sets {0}, {±1}, {±2}, . . . .
Proposition 11.1.7. Let R be an integral domain and let a, b ∈ R. The following are equivalent.
1. a and b are associates.
2. Both a | b and b | a.
Proof. Suppose first that a and b are associates. Fix u ∈ U (R) with b = au. We then clearly have a | b, and
since a = bu−1 we have b | a.
Suppose conversely that both a | b and b | a. Fix c, d ∈ R with b = ac and a = bd. Notice that if a = 0,
then b = ac = 0c = 0, so a = b · 1 and a and b are associates. Suppose instead that a ≠ 0. We then have
a · 1 = a = bd = (ac)d = acd
Since R is an integral domain and a ≠ 0, it follows that cd = 1 (using Proposition 9.2.11), so both c, d ∈ U(R).
Therefore, as b = ac, it follows that a and b are associates.
It will be very useful for us to rephrase the definition of divisibility and associates in terms of principal
ideals. Notice that the elements a and b switch sides in the next proposition.
Proposition 11.1.8. Let R be an integral domain and let a, b ∈ R. We have a | b if and only if ⟨b⟩ ⊆ ⟨a⟩.
Proof. Suppose first that a | b. Fix d ∈ R with b = ad. We then have b ∈ ⟨a⟩, hence ⟨b⟩ ⊆ ⟨a⟩ because ⟨b⟩ is
the smallest ideal containing b.
Suppose conversely that ⟨b⟩ ⊆ ⟨a⟩. Since b ∈ ⟨b⟩, we then have in particular that b ∈ ⟨a⟩. Thus, we may
fix d ∈ R with b = ad. It follows that a | b.
Corollary 11.1.9. Let R be an integral domain and let a, b ∈ R. We have ⟨a⟩ = ⟨b⟩ if and only if a and b
are associates.
Proof. We have ⟨a⟩ = ⟨b⟩ if and only if both ⟨a⟩ ⊆ ⟨b⟩ and ⟨b⟩ ⊆ ⟨a⟩, which by Proposition 11.1.8 happens
if and only if both b | a and a | b, which by Proposition 11.1.7 happens if and only if a and b are associates.
Finally, we begin a discussion of greatest common divisors in integral domains. Recall that in our
discussion about Z, we avoided defining the greatest common divisor as the largest common divisor of a and
b, and instead said that it had the property that every other common divisor of a and b was also a divisor
of it. Taking this approach was useful in Z (after all, gcd(a, b) had this much stronger property anyway and
otherwise gcd(0, 0) would not make sense), but it is absolutely essential when we try to generalize it to other
integral domains since a general integral domain has no notion of “order” or “largest”.
When we worked in Z, we added the additional requirement that gcd(a, b) was nonnegative so that it
would be unique. However, just like with “order”, there is no notion of “positive” in a general integral
domain. Thus, we will have to live with a lack of uniqueness in general. Fortunately, any two greatest
common divisors are associates, as we now prove.
Definition 11.1.10. Let R be an integral domain and let a, b ∈ R. We say that d ∈ R is a greatest common
divisor of a and b if d is a common divisor of a and b (i.e. d | a and d | b), and whenever c ∈ R is a common
divisor of a and b, we have c | d.
Proposition 11.1.11. Let R be an integral domain and let a, b ∈ R.
• If d is a greatest common divisor of a and b, then every associate of d is also a greatest common divisor
of a and b.
• If d and d′ are both greatest common divisors of a and b, then d and d′ are associates.
Proof. Suppose that d is a greatest common divisor of a and b, and suppose that d′ is an associate of d. We
then have that d′ | d, so since d is a common divisor of a and b and divisibility is transitive, it follows that d′
is a common divisor of a and b. Let c be a common divisor of a and b. Since d is a greatest common divisor
of a and b, we know that c | d. Now d | d′ because d and d′ are associates, so by transitivity of divisibility,
we conclude that c | d′. Therefore, every common divisor of a and b divides d′, and hence d′ is a greatest
common divisor of a and b.
Suppose now that d and d′ are both greatest common divisors of a and b. We then have that d′ is a common
divisor of a and b, so d′ | d because d is a greatest common divisor of a and b. Similarly, we have that d is a
common divisor of a and b, so d | d′ because d′ is a greatest common divisor of a and b. Using Proposition
11.1.7, we conclude that d and d′ are associates.
So far, we have danced around the fundamental question: Given an integral domain R and elements
a, b ∈ R, must there exist a greatest common divisor of a and b? The answer is no in general, so proving
existence in “nice” integral domains will be a top priority for us.
11.2 Irreducible and Prime Elements
Definition 11.2.1. Let R be an integral domain. A nonzero nonunit p ∈ R is irreducible if whenever
p = ab with a, b ∈ R, either a or b is a unit in R.
Definition 11.2.2. Let R be an integral domain. A nonzero nonunit p ∈ R is prime if whenever p | ab
with a, b ∈ R, either p | a or p | b.
Recall that when we studied Z, we only considered positive elements p ∈ Z when talking about “primes”
and their divisors. However, it's straightforward to check that if a, b ∈ Z and a | b, then (−a) | b. Thus,
our old definition of a p ∈ Z being “prime” is the same as saying that p ≥ 2 and the only divisors of p are
±1 and ±p, i.e. the units and associates. If we no longer insist that p ≥ 2, then this is precisely the same
as saying that p is irreducible in Z. Notice now that after dropping
that requirement, we have that −2, −3, −5, . . . are irreducibles in Z. This is generalized in the following
result.
Proposition 11.2.3. Let R be an integral domain.
1. If p ∈ R is irreducible, then every associate of p is irreducible.
2. If p ∈ R is prime, then every associate of p is prime.
Proof. Exercise (see homework).
Let's examine some elements of the integral domain Z[x]. We know that U(Z[x]) = U(Z) = {1, −1} by
Proposition 9.3.8. Notice that 2x^2 + 6 is not irreducible in Z[x] because 2x^2 + 6 = 2 · (x^2 + 3), and neither
2 nor x^2 + 3 is a unit. Also, x^2 − 4x − 21 = (x + 3)(x − 7), and neither x + 3 nor x − 7 is a unit in Z[x], so
x^2 − 4x − 21 is not irreducible in Z[x].
In contrast, we claim that x + 3 is irreducible in Z[x]. To see this, suppose that f (x), g(x) ∈ Z[x] are
such that x + 3 = f (x)g(x). Notice that we must have that f (x) and g(x) are both nonzero. By Proposition
9.3.6, we know that
deg(x + 3) = deg(f (x)) + deg(g(x))
so
1 = deg(f (x)) + deg(g(x))
It follows that one of deg(f (x)) and deg(g(x)) equals 0, while the other is 1. Suppose without loss of
generality that deg(f (x)) = 0, so f (x) is a nonzero constant, say f (x) = c. Since deg(g(x)) = 1, we can fix
a, b ∈ Z with g(x) = ax + b. We then have
x + 3 = c · (ax + b) = ac · x + cb
It follows that ac = 1 in Z, so c ∈ {1, −1}, and hence f (x) = c is a unit in Z[x]. Therefore, x + 3 is irreducible
in Z[x].
One can also show that x + 3 is prime in Z[x]. The key fact to use here is Proposition 10.3.2, from which
we conclude that for any h(x) ∈ Z[x], we have that x + 3 | h(x) in Z[x] if and only if h(−3) = 0. Suppose
then that f (x), g(x) ∈ Z[x] are such that x + 3 | f (x)g(x). We then have that f (−3)g(−3) = 0, so since Z is
an integral domain, either f (−3) = 0 or g(−3) = 0. It follows that either x + 3 | f (x) in Z[x] or x + 3 | g(x)
in Z[x].
From these examples, we see that there certainly does appear to be some connection between the concepts
of irreducible and prime, but they also have a slightly different flavor. In Z, it is true that an element is
prime exactly when it is irreducible. The hard direction here is essentially the content of Proposition 2.5.6
(although again we technically only dealt with positive elements there, but we can use Proposition 11.2.3).
In general, one direction of this equivalence from Z holds in every integral domain, but we will see that the other
other does not.
Proposition 11.2.4. Let R be an integral domain. If p is prime, then p is irreducible.
Proof. Suppose that p is prime. By definition of prime, p is nonzero and not a unit. Let a, b ∈ R be arbitrary
with p = ab. We then have p · 1 = ab, so p | ab. Since p is prime, we conclude that either p | a or p | b.
Suppose that p | a. Fix c ∈ R with a = pc. We then have
p · 1 = p = ab = (pc)b = p(cb)
Since R is an integral domain and p ≠ 0, we may cancel it to conclude that 1 = cb, so b is a unit. Suppose
instead that p | b. Fix d ∈ R with b = pd. We then have
p · 1 = p = ab = a(pd) = p(ad)
Since R is an integral domain and p ≠ 0, we may cancel it to conclude that 1 = ad, so a is a unit. Therefore,
either a or b is a unit. It follows that p is irreducible.
As in divisibility, it is helpful to rephrase our definition of prime elements in terms of principal ideals,
and fortunately our common names here coincide.
Proposition 11.2.5. Let R be an integral domain and let p ∈ R be nonzero. The ideal ⟨p⟩ is a prime ideal
of R if and only if p is a prime element of R.
Proof. Suppose first that ⟨p⟩ is a prime ideal of R. Notice that p ≠ 0 by assumption and that p is not a unit
because ⟨p⟩ ≠ R. Suppose that a, b ∈ R and p | ab. We then have that ab ∈ ⟨p⟩, so as ⟨p⟩ is a prime ideal
we know that either a ∈ ⟨p⟩ or b ∈ ⟨p⟩. In the former case, we conclude that p | a, and in the latter case we
conclude that p | b. Since a, b ∈ R were arbitrary, it follows that p is a prime element of R.
Suppose conversely that p is a prime element of R. By definition, we know that p is not a unit, so 1 ∉ ⟨p⟩
and hence ⟨p⟩ ≠ R. Suppose that a, b ∈ R and ab ∈ ⟨p⟩. We then have that p | ab, so as p is a prime element
we know that either p | a or p | b. In the former case, we conclude that a ∈ ⟨p⟩, and in the latter case we
conclude that b ∈ ⟨p⟩. Since a, b ∈ R were arbitrary, it follows that ⟨p⟩ is a prime ideal of R.
One word of caution here is that if R is an integral domain, then ⟨0⟩ = {0} is a prime ideal, but 0 is not
a prime element.
Recall from Corollary 9.3.9 that if F is a field, then U(F[x]) = U(F) = F \ {0} is the set of nonzero constant
polynomials. Using this classification of the units, we can give a simple characterization of the irreducible
elements in F[x] as those polynomials that can not be factored into two polynomials of smaller degree.
Proposition 11.2.6. Let F be a field and let f (x) ∈ F [x] be a nonconstant polynomial. The following are
equivalent.
1. f (x) is irreducible in F [x].
2. There do not exist nonzero polynomials g(x), h(x) ∈ F [x] with f (x) = g(x) · h(x) and both deg(g(x)) <
deg(f (x)) and deg(h(x)) < deg(f (x)).
Proof. We prove 1 ↔ 2 by instead proving ¬1 ↔ ¬2.
• ¬1 → ¬2: Suppose that 1 is false, so f(x) is not irreducible in F[x]. Since f(x) is a nonconstant
polynomial, we know that f(x) is nonzero and not a unit in F[x]. Therefore, there exist g(x), h(x) ∈
F[x] with f(x) = g(x)h(x) and such that neither g(x) nor h(x) is a unit. Notice that g(x) and h(x)
are both nonzero (because f(x) is nonzero), so since U(F[x]) = U(F) = F \ {0} by Corollary 9.3.9, it
follows that g(x) and h(x) are both nonconstant polynomials. Now
deg(f(x)) = deg(g(x)) + deg(h(x))
by Proposition 9.3.6. Since h(x) is nonconstant, we have deg(h(x)) ≥ 1, and hence deg(g(x)) <
deg(f (x)). Similarly, since g(x) is nonconstant, we have deg(g(x)) ≥ 1, and hence deg(h(x)) <
deg(f (x)). Thus, we’ve shown that 2 is false.
• ¬2 → ¬1: Suppose that 2 is false, and fix nonzero polynomials g(x), h(x) ∈ F [x] with f (x) = g(x)·h(x)
and both deg(g(x)) < deg(f(x)) and deg(h(x)) < deg(f(x)). Since
deg(f(x)) = deg(g(x)) + deg(h(x))
by Proposition 9.3.6, we must have deg(g(x)) ≠ 0 and deg(h(x)) ≠ 0. Since U(F[x]) = U(F) = F \ {0}
by Corollary 9.3.9, it follows that neither g(x) nor h(x) is a unit. Thus, f (x) is not irreducible in F [x],
and so 1 is false.
Life is not as nice if we are working over an integral domain that is not a field. For example, we showed
above that 2x^2 + 6 is not irreducible in Z[x] because 2x^2 + 6 = 2 · (x^2 + 3), and neither 2 nor x^2 + 3 is a
unit in Z[x]. However, it is not possible to write 2x^2 + 6 = g(x) · h(x) where g(x), h(x) ∈ Z[x] and both
deg(g(x)) < 2 and deg(h(x)) < 2.
Proposition 11.2.7. Let F be a field and let f (x) ∈ F [x] be a nonzero polynomial with deg(f (x)) ≥ 2. If
f (x) has a root in F , then f (x) is not irreducible in F [x].
Proof. Suppose that f (x) has a root in F , and fix such a root a ∈ F . By Proposition 10.3.2, it follows that
(x − a) | f (x) in F [x]. Fixing g(x) ∈ F [x] with f (x) = (x − a) · g(x), we then have that g(x) is nonzero (since
f(x) is nonzero) and also
deg(f(x)) = deg(x − a) + deg(g(x)) = 1 + deg(g(x))
by Proposition 9.3.6. Therefore, deg(g(x)) = deg(f(x)) − 1 ≥ 2 − 1 = 1. Since x − a and g(x) both have degree at least 1, we
conclude that neither is a unit in F [x] by Corollary 9.3.9, and hence f (x) is not irreducible in F [x].
Theorem 11.2.8. Let F be a field and let f(x) ∈ F[x] be a nonzero polynomial.
1. If deg(f(x)) = 1, then f(x) is irreducible in F[x].
2. If deg(f(x)) = 2 or deg(f(x)) = 3, then f(x) is irreducible in F[x] if and only if f(x) has no roots in F.
Proof. 1. This follows immediately from Proposition 11.2.6 and Proposition 9.3.6.
2. Suppose that either deg(f(x)) = 2 or deg(f(x)) = 3. If f(x) has a root in F, then f(x) is not
irreducible immediately by Proposition 11.2.7. Suppose conversely that f(x) is not irreducible in F[x].
Write f(x) = g(x)h(x) where g(x), h(x) ∈ F[x] are nonunits. We have
deg(f(x)) = deg(g(x)) + deg(h(x))
Now g(x) and h(x) are not units, so they each have degree at least 1. Since deg(f(x)) ∈ {2, 3}, it
follows that at least one of g(x) or h(x) has degree equal to 1. Suppose without loss of generality that
deg(g(x)) = 1, and write g(x) = ax + b where a, b ∈ F with a ≠ 0. We then have that −a^{−1}b is a
root of g(x), so f(−a^{−1}b) = g(−a^{−1}b) · h(−a^{−1}b) = 0, and hence f(x) has a root in F.
Notice that this theorem can be false if we are working in R[x] for an integral domain R. For example,
in Z[x], we have that deg(3x + 12) = 1, but 3x + 12 is not irreducible in Z[x] because 3x + 12 = 3 · (x + 4),
and neither 3 nor x + 4 is a unit in Z[x]. Of course, the theorem implies that 3x + 12 is irreducible in Q[x]
(the factorization 3x + 12 = 3 · (x + 4) does not work here because 3 is a unit in Q[x]).
Also, note that even in the case of F [x] for a field F , in order to use the nonexistence of roots to
prove that a polynomial is irreducible, we require that the polynomial has degree 2 or 3. This restriction is
essential. Consider the polynomial x4 + 6x2 + 5 in Q[x]. Since x4 + 6x2 + 5 = (x2 + 1)(x2 + 5), it follows
that x4 + 6x2 + 5 is no irreducible in Q[x]. However, notice that x4 + 6x2 + 5 has no roots in Q (or even in
R) because a4 + 6a2 + 5 > 0 for all a ∈ Q.
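As an informal check in Python (by the rational root theorem, the only candidate rational roots are ±1 and ±5, since any rational root of this monic polynomial must be an integer dividing 5):

    f = lambda x: x**4 + 6 * x**2 + 5
    print([f(x) for x in (1, -1, 5, -5)])   # [12, 12, 780, 780]: no rational roots

    def poly_mul(u, v):                     # coefficient lists, ascending degree
        out = [0] * (len(u) + len(v) - 1)
        for i, a in enumerate(u):
            for j, b in enumerate(v):
                out[i + j] += a * b
        return out

    print(poly_mul([1, 0, 1], [5, 0, 1]))   # [5, 0, 6, 0, 1] = 5 + 6x^2 + x^4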
For an example of how to use the theorem affirmatively, consider the polynomial f(x) = x^3 − 2 in the
ring Q[x]. We know that f(x) has no roots in Q because ∛2 is not rational by Theorem 2.5.13. Thus,
f(x) is irreducible in Q[x]. Notice that f(x) is not irreducible when viewed as an element of R[x] because
it does have a root in R. In fact, no polynomial in R[x] of odd degree greater than 1 is irreducible, because
every such polynomial has a root (this uses the Intermediate Value Theorem: as x → ±∞, on one side we must
have f (x) → ∞ and on the other we must have f (x) → −∞). Moreover, it turns out that every irreducible
polynomial in R[x] has degree either 1 or 2, though this is far from obvious at this point, and relies on an
important result called the Fundamental Theorem of Algebra.
By Proposition 11.2.4, we know that in an integral domain R, every prime element of R is irreducible in
R. Although in the special case of Z we know that every irreducible is prime, this is certainly not obvious in
general. For example, we've just shown that x^3 − 2 is irreducible in Q[x], but it's much less clear whether
x^3 − 2 is prime in Q[x]. In fact, for general integral domains R, there can be irreducible elements that are
not prime. For a somewhat exotic but interesting example of this, let R be the subring of Q[x] consisting of those
polynomials whose constant term and coefficient of x are both elements of Z. In other words, let
R = {a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 ∈ Q[x] : a_0 ∈ Z and a_1 ∈ Z}
It is straightforward to check that R is indeed a subring of Q[x]. Furthermore, since Q[x] is an integral domain,
it follows that R is an integral domain as well. We also still have that deg(f(x)g(x)) = deg(f(x)) + deg(g(x))
for all f(x), g(x) ∈ R because this property holds in Q[x]. From this, it follows that U(R) = {1, −1} and
that 3 ∈ R is irreducible in R. Notice that 3 | x^2, i.e. 3 | x · x, because (1/3) · x^2 ∈ R and
3 · ((1/3) · x^2) = x^2.
However, we also have that 3 ∤ x in R, essentially because (1/3) · x is not an element of R. To see this more
formally, suppose that h(x) ∈ R and 3 · h(x) = x. Since the degree of the product of two elements of R is
the sum of the degrees, we have deg(h(x)) = 1. Since h(x) ∈ R, we can write h(x) = ax + b where a, b ∈ Z.
We then have 3a · x + 3b = x, so 3a = 1, contradicting the fact that 3 ∤ 1 in Z. Since 3 | x · x in R, but 3 ∤ x
in R, it follows that 3 is not prime in R.
Another example of an integral domain where some irreducibles are not prime is the integral domain

Z[√−5] = {a + b√−5 : a, b ∈ Z}.

Although this also looks like a bizarre example, we will see that Z[√−5] and many other examples like it
play a fundamental role in algebraic number theory. In Z[√−5], we have two different factorizations of 6:

(1 + √−5)(1 − √−5) = 6 = 2 · 3.

It turns out that each of the four factors that appear are irreducible in Z[√−5], but none are prime. We
will establish all of these facts later, but here we can at least argue that 2 is not prime. Notice that since
2 · 3 = (1 + √−5)(1 − √−5), we have that 2 | (1 + √−5)(1 − √−5) in Z[√−5]. Now if 2 | 1 + √−5 in Z[√−5],
then we can fix a, b ∈ Z with 2(a + b√−5) = 1 + √−5, which gives 2a + 2b√−5 = 1 + √−5, so 2a = 1, a
contradiction. Thus, 2 ∤ 1 + √−5. A similar argument shows that 2 ∤ 1 − √−5.
• f(3/2) = 6 and f(−3/2) = −57/2

Thus, f(x) has no roots in Q. Since deg(f(x)) = 3, we can use Theorem 11.2.8 to conclude that f(x) is
irreducible in Q[x]. Notice that f(x) is not irreducible in R[x] because it has a root in the interval
(1, 3/2) by the Intermediate Value Theorem.
As we mentioned above, we are focusing on polynomials in Q[x] which have integer coefficients since every
polynomial in Q[x] is an associate of such a polynomial. However, even if f (x) ∈ Z[x], when we check for
irreducibility in Q[x], we have to consider the possibility that a potential factorization involves polynomials
whose coefficients are fractions. For example, we have

x^2 = (2x) · ((1/2)x).
Of course, in this case there also exists a factorization into smaller degree polynomials in Z[x] because
we can write x^2 = x · x. Our first task is to prove that this is always the case. We will need the following
lemma.
Lemma 11.3.2. Suppose that g(x), h(x) ∈ Z[x] and that p ∈ Z is a prime which divides all coefficients of
g(x)h(x). We then have that either p divides all coefficients of g(x), or p divides all coefficients of h(x).
Proof. Let g(x) be the polynomial {bn}, let h(x) be the polynomial {cn}, and let g(x)h(x) be the polynomial
{an}. We are supposing that p | an for all n. Suppose that p ∤ bn for some n and also that p ∤ cn for some n
(possibly different). Let k be least such that p ∤ bk, and let ℓ be least such that p ∤ cℓ. Notice that

ak+ℓ = Σ_{i=0}^{k+ℓ} bi ck+ℓ−i = bk cℓ + ( Σ_{i=0}^{k−1} bi ck+ℓ−i ) + ( Σ_{i=k+1}^{k+ℓ} bi ck+ℓ−i ),

hence

bk cℓ = ak+ℓ − ( Σ_{i=0}^{k−1} bi ck+ℓ−i ) − ( Σ_{i=k+1}^{k+ℓ} bi ck+ℓ−i ).

Now p | ak+ℓ by assumption, p divides every term of the first sum because p | bi whenever i < k, and p
divides every term of the second sum because p | ck+ℓ−i whenever i > k (as then k + ℓ − i < ℓ). Therefore,
p divides the right-hand side, and hence p | bk cℓ. Since p is prime, either p | bk or p | cℓ, which is a
contradiction.
Proposition 11.3.3 (Gauss’ Lemma). Suppose that f (x) ∈ Z[x] and that g(x), h(x) ∈ Q[x] with f (x) =
g(x)h(x). There exist polynomials g ∗ (x), h∗ (x) ∈ Z[x] such that f (x) = g ∗ (x)h∗ (x) and both deg(g ∗ (x)) =
deg(g(x)) and deg(h∗ (x)) = deg(h(x)). In fact, there exist nonzero s, t ∈ Q with
• g ∗ (x) = s · g(x)
• h∗ (x) = t · h(x)
Proof. If each of the coefficients of g(x) and h(x) happen to be integers, then we are happy. Suppose not.
Let a ∈ Z be the least common multiple of the denominators of the coefficients of g(x), and let b ∈ Z be the
least common multiple of the denominators of the coefficients of h(x). Let d = ab. Multiplying both sides of
the equation f(x) = g(x)h(x) through by d to "clear denominators", we see that

d · f(x) = (a · g(x)) · (b · h(x)),

where each of the three factors d · f(x), a · g(x), and b · h(x) is a polynomial in Z[x]. We have at least one
of a > 1 or b > 1, hence d = ab > 1.
Fix a prime divisor p of d. We then have that p divides all coefficients of d · f (x), so by the previous
lemma either p divides all coefficients of a · g(x), or p divides all coefficients of b · h(x). In the former case,
we have

(d/p) · f(x) = ((a/p) · g(x)) · (b · h(x)),
where each of the three factors is a polynomial in Z[x]. In the latter case, we have

(d/p) · f(x) = (a · g(x)) · ((b/p) · h(x)),
where each of the three factors is a polynomial in Z[x]. Now if d/p = 1, then we are done by letting g*(x)
be the first factor and letting h*(x) be the second. Otherwise, we continue the argument by dividing out
another prime factor of d/p from all coefficients of one of the two polynomials. Continue until we have handled
all primes which occur in a factorization of d. Formally, one can do induction on d.
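The proof above is completely effective, and the following Python sketch mirrors it under stated assumptions (polynomials as lists of Fractions with the constant term first; the helper names are ours): clear denominators, then strip each prime factor of d, sending it to whichever factor it divides entirely, as Lemma 11.3.2 guarantees is possible.

from fractions import Fraction
from math import gcd

def integer_factorization(g, h):
    # g, h are lists of Fractions (constant term first) whose product has
    # integer coefficients; returns integer-coefficient rescalings G, H of
    # the same degrees with G*H = g*h, following the proof's descent
    def lcm(x, y):
        return x * y // gcd(x, y)
    def denom_lcm(p):
        out = 1
        for c in p:
            out = lcm(out, c.denominator)
        return out
    a, b = denom_lcm(g), denom_lcm(h)
    G = [c * a for c in g]             # a*g has integer coefficients
    H = [c * b for c in h]             # b*h has integer coefficients
    d = a * b                          # invariant: d*(g*h) = G*H
    p = 2
    while d > 1:
        while d % p == 0:
            if all(c % p == 0 for c in G):
                G = [c / p for c in G]
            else:                      # by Lemma 11.3.2, p divides all of H
                H = [c / p for c in H]
            d //= p
        p += 1
    return [int(c) for c in G], [int(c) for c in H]

# the text's example x^2 = (2x) * ((1/2)x):
g = [Fraction(0), Fraction(2)]         # 2x
h = [Fraction(0), Fraction(1, 2)]      # (1/2)x
print(integer_factorization(g, h))     # ([0, 1], [0, 1]), i.e. x^2 = x * x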
An immediate consequence of Gauss’ Lemma is the following, which greatly simplifies the check for
whether a given polynomial with integer coefficients is irreducible in Q[x].
Corollary 11.3.4. Let f (x) ∈ Z[x]. If there do not exist nonconstant polynomials g(x), h(x) ∈ Z[x] with
f (x) = g(x) · h(x), then f (x) is irreducible in Q[x]. Furthermore, if f (x) is monic, then it suffices to show
that no such monic g(x) and h(x) exist.
Proof. The first part is immediate from Proposition 11.2.6 and Gauss' Lemma. Now suppose that f(x) ∈ Z[x]
is monic. Suppose that g(x), h(x) ∈ Z[x] with f(x) = g(x)h(x). Notice that the leading term of f(x) is the
product of the leading terms of g(x) and h(x), so as f(x) is monic and all coefficients are in Z, either both
g(x) and h(x) are monic or both have leading coefficient −1. In the latter case, we can multiply both through
by −1 to get a factorization into monic polynomials in Z[x] of the same degrees.
As an example, consider the polynomial f(x) = x^4 + 3x^3 + 7x^2 − 9x + 1 ∈ Q[x]. We claim that f(x)
is irreducible in Q[x]. We first check for rational roots. We know that the only possibilities are ±1, and we
check these:

• f(1) = 1 + 3 + 7 − 9 + 1 = 3

• f(−1) = 1 − 3 + 7 + 9 + 1 = 15

Thus, f(x) has no rational roots, and hence no factor of degree 1 in Q[x]. Since f(x) is monic, Corollary
11.3.4 says that it now suffices to show that f(x) is not the product of two monic quadratics in Z[x]. Suppose
then that f(x) = (x^2 + ax + b)(x^2 + cx + d) with a, b, c, d ∈ Z. Expanding and comparing coefficients, we
would need:

1. a + c = 3

2. b + ac + d = 7

3. ad + bc = −9

4. bd = 1

From equation 4, either b = d = 1 or b = d = −1. If b = d = 1, then equation 3 gives a + c = −9, contradicting
equation 1. If b = d = −1, then equation 3 gives a + c = 9, again contradicting equation 1. Therefore, no such
factorization exists, and f(x) is irreducible in Q[x].
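For readers who like to double-check such case analyses mechanically, here is a small Python sketch of the same search (the variable names follow the equations above; the search range for a is a generous illustrative bound):

# search for monic quadratic factors (x^2 + a x + b)(x^2 + c x + d) of
# f(x) = x^4 + 3x^3 + 7x^2 - 9x + 1 with integer a, b, c, d
target = [1, 3, 7, -9, 1]            # coefficients of x^4, x^3, x^2, x, 1
found = []
for b, d in [(1, 1), (-1, -1)]:      # bd = 1 forces b = d = 1 or b = d = -1
    for a in range(-30, 31):
        c = 3 - a                    # forced by the x^3 coefficient
        if [1, a + c, b + a * c + d, a * d + b * c, b * d] == target:
            found.append((a, b, c, d))
print(found)                         # prints []: no such factorization exists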
11.4 Unique Factorization Domains

Recall that the Fundamental Theorem of Arithmetic says that factorizations of integers n ≥ 2 into positive
primes are unique up to order. Once we allow negative numbers, even this form of uniqueness is lost. For
example, in Z we have

30 = 2 · 3 · 5
= (−2) · (−3) · 5
= (−2) · 3 · (−5)
In Z, the elements −2, −3, and −5 are also irreducible/prime under our new definitions of these concepts.
Thus, if we move away from working only with positive primes, then we lose a bit more uniqueness. However,
if we slightly loosen the requirement that any two factorizations are "the same up to order" to "the same
up to order and associates", we might have a chance.
Definition 11.4.1. A Unique Factorization Domain, or UFD, is an integral domain R such that:
1. Every nonzero nonunit is a product of irreducible elements.
2. If q1 q2 · · · qn = r1 r2 · · · rm where each qi and ri are irreducible, then n = m and there exists a permu-
tation σ ∈ Sn such that qi and rσ(i) are associates for all i.
Thus, a UFD is an integral domain in which the analogue of the Fundamental Theorem of Arithmetic
(Theorem 2.5.8) holds. We want to prove that several important integral domains that we have studied are
indeed UFDs, and to set the stage for this, we first go back and think about the proofs in Z.
In Proposition 2.5.2, we proved that every n ≥ 2 was a product of irreducible elements of Z by induction.
Intuitively, if n is not irreducible, then factor it, and if those factors are not irreducible, then factor them,
etc. The key fact forcing this process to “bottom out”, and hence making the induction work, is that
the numbers are getting smaller upon factorization and we can not have an infinite descending sequence
of natural numbers. In a general integral domain, however, there may not be something that is getting
“smaller” and hence this argument could conceivably break down. In fact, there are exotic integral domains
where it is possible to factor an element forever without ever reaching an irreducible. For an example of this
situation, consider the subring R of Q[x] consisting of those polynomials whose constant term is an integer.
In other words, let

R = {an x^n + an−1 x^{n−1} + · · · + a1 x + a0 ∈ Q[x] : a0 ∈ Z}.
It is straightforward to check that R is a subring of Q[x]. Furthermore, since Q[x] is an integral domain,
it follows that R is an integral domain as well. We still have that deg(f (x)g(x)) = deg(f (x)) + deg(g(x))
for all f (x), g(x) ∈ R because this property holds in Q[x]. From this, it follows that any element of U (R)
must be a constant polynomial. Since the constant terms of elements of R are integers, we conclude that
U (R) = {1, −1}. Now consider the element x ∈ R, which is nonzero and not a unit. Notice that x is not
irreducible in R because we can write x = 2 · ((1/2)x), and neither 2 nor (1/2)x is a unit in R. In fact, we claim that
x can not be written as a product of irreducibles in R. To see this, suppose that p1 (x), p2 (x), . . . , pn (x) ∈ R
and that
x = p1 (x)p2 (x) · · · pn (x).
Since deg(x) = 1, one of the pi (x) must have degree 1 and the rest must have degree 0 (notice that it is not
possible that some pi (x) is the zero polynomial since x is a nonzero polynomial). Since R is commutative,
we can assume that deg(p1(x)) = 1 and deg(pi(x)) = 0 for 2 ≤ i ≤ n. Write p1(x) = ax + b, and pi(x) = ci
for 2 ≤ i ≤ n, where b, c2, c3, . . . , cn ∈ Z, each ci ≠ 0, and a ∈ Q. We then have

x = (ac2 c3 · · · cn)x + bc2 c3 · · · cn.

This implies that bc2 c3 · · · cn = 0, and since each ci ≠ 0, it follows that b = 0. Thus, we have p1(x) = ax.
Now notice that

p1(x) = ax = 2 · ((a/2)x),

and neither 2 nor (a/2)x is a unit in R. Therefore, p1(x) is not irreducible in R. We took an arbitrary way to
write x as a product of elements of R, and showed that at least one of the factors was not irreducible. It follows
that x can not be written as a product of irreducible elements in R. From this example, we realize that it is
hopeless to prove the existence part of factorizations in general integral domains, so we will need to isolate
some special properties of integral domains that will rule out such “infinite descent”. In general, relations
that do not have such infinite descent are given a special name.
Definition 11.4.2. A relation ∼ on a set A is well-founded if there does not exist a sequence of elements
a1 , a2 , a3 , . . . from A such that an+1 ∼ an for all n ∈ N+ .
Notice that < is well-founded on N but not on Z, Q, or R. If we instead define ∼ by saying that a ∼ b
if |a| < |b|, then ∼ is well-founded on Z, but not on Q or R. The fact that < is well-founded on N is the
foundation for well-ordering and induction proofs on N. We want to think about whether a certain divisibility
relation is well-founded. However, the relation we want to think about is not ordinary divisibility, but a
slightly different notion.
Definition 11.4.3. Let R be an integral domain and let a, b ∈ R. We write a ∥ b to mean that a | b and
that a is not an associate of b. We call ∥ the strict divisibility relation.
For example, in Z the strict divisibility relation is well-founded because if a, b ∈ Z\{0} and a ∥ b, then
|a| < |b|. However, if R is the subring of Q[x] consisting of those polynomials whose constant term is an
integer, then the strict divisibility relation is not well-founded. To see this, let an = (1/2^n)·x for each n ∈ N+.
Notice that an ∈ R for all n ∈ N+. Also, we have an = 2an+1 for each n ∈ N+. Since 2 is not a unit in R, we
also have that an and an+1 are not associates for each n ∈ N+. Therefore, an+1 ∥ an for each n ∈ N+. The
fact that the strict divisibility relation is not well-founded on R is the fundamental reason why elements of
R may not factor into irreducibles.
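A few lines of Python make this descent concrete (representing each an = (1/2^n)·x by its coefficient of x alone, a shortcut of ours):

from fractions import Fraction

# an = (1/2^n) x, represented by its x-coefficient 1/2^n
a = [Fraction(1, 2 ** n) for n in range(1, 8)]
for n in range(len(a) - 1):
    assert a[n] == 2 * a[n + 1]   # an+1 strictly divides an, forever
print(a[:3])                      # [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]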
Proposition 11.4.4. Let R be an integral domain. If the strict divisibility relation on R is well-founded
(i.e. there does not exist d1, d2, d3, . . . in R such that dn+1 ∥ dn for all n ∈ N+), then every nonzero nonunit
in R is a product of irreducibles.
Proof. We prove the contrapositive. Suppose that there is a nonzero nonunit element of R that is not a
product of irreducibles. Fix such an element a. We define a sequence of nonzero nonunit elements d1 , d2 , . . .
in R with dn+1 ∥ dn recursively as follows. Start by letting d1 = a. Assume inductively that dn is a nonzero
nonunit which is not a product of irreducibles. In particular, dn is itself not irreducible, so we may write
dn = bc for some choice of nonzero nonunits b and c. Now it is not possible that both b and c are products
of irreducibles because otherwise dn would be as well. Thus, we may let dn+1 be one of b and c, chosen so
that dn+1 is also not a product of irreducibles. Notice that dn+1 is a nonzero nonunit, that dn+1 | dn , and
that dn+1 is not an associate of dn because neither b nor c are units. Therefore, dn+1 ∥ dn. Since we can
continue this recursive construction for all n ∈ N+ , it follows that the strict divisibility relation on R is not
well-founded.
We can also give another proof of Proposition 11.4.4 by using an important combinatorial result.
Definition 11.4.5. Let {0, 1}∗ be the set of all finite sequences of 0’s and 1’s (including the “empty string”
λ). A tree is a subset T ⊆ {0, 1}∗ which is closed under initial segments. In other words, if σ ∈ T and τ is
an initial segment of σ, then τ ∈ T.
For example, the set {λ, 0, 1, 00, 01, 011, 0110, 0111} is a tree.
Lemma 11.4.6 (König’s Lemma). Every infinite tree has an infinite branch. In other words, if T is a tree
with infinitely many elements, then there is an infinite sequence of 0’s and 1’s such that every finite initial
segment of this sequence is an element of T .
Proof. Let T be a tree with infinitely many elements. We build the infinite sequence in stages. That is, we
define finite sequences σ0 ≺ σ1 ≺ σ2 ≺ · · · recursively where each |σn| = n. In our construction, we maintain
the invariant that there are infinitely many elements of T extending σn.
We begin by defining σ0 = λ and notice that there are infinitely many elements of T extending λ because
λ is an initial segment of every element of T trivially. Suppose that we have defined σn in such a way that
|σn | = n and there are infinitely many elements of T extending σn . We then must have that either there
are infinitely many elements of T extending σn 0 or there are infinitely many elements of T extending σn 1.
Thus, we may fix an i ∈ {0, 1} such that there are infinitely many elements of T extending σn i, and define
σn+1 = σn i.
We now take the unique infinite sequence extending all of the σn and notice that it has the required
properties.
Proof 2 of Proposition 11.4.4. Suppose that a ∈ R is a nonzero nonunit. Recursively factor a into two
nonunits, and hence two nonassociate divisors, down a tree. If we ever reach an irreducible, stop that
branch but continue down the others. This tree can not have an infinite path because that would violate
the fact that the strict divisibility relation is well-founded on R. Therefore, by König's Lemma, the tree is
finite. It follows that a is the product of the leaves, and hence a product of irreducibles.
Now that we have a decent handle on when an integral domain will have the property that every element
is a product of irreducibles, we now move on to the uniqueness aspect. In the proof of the Fundamental
Theorem of Arithmetic in Z, we made essential use of Proposition 2.5.6 saying that irreducibles in Z are
prime, and this will be crucial in our generalizations as well. Hence, we are again faced with the question
of when irreducibles are guaranteed to be prime. As we saw in Section 11.2, irreducibles need not be
prime in general integral domains either. We now spend the rest of this section showing that if irreducibles
are prime, then we do in fact obtain uniqueness.
Definition 11.4.7. Let R be an integral domain and c ∈ R. Define a function ordc : R → N ∪ {∞} as
follows. Given a ∈ R, let ordc(a) be the largest k ∈ N such that c^k | a if one exists, and otherwise let
ordc(a) = ∞. Here, we interpret c^0 = 1, so we always have 0 ∈ {k ∈ N : c^k | a}.
Notice that we have ord0(a) = 0 whenever a ≠ 0 and ord0(0) = ∞. Also, for any a ∈ R and u ∈ U(R),
we have ordu(a) = ∞ because u^k ∈ U(R) for all k ∈ N, hence u^k | a for all k ∈ N.
Lemma 11.4.8. Let R be an integral domain. Let a, c ∈ R with c ≠ 0, and let k ∈ N. The following are
equivalent.

1. ordc(a) = k

2. c^k | a and c^{k+1} ∤ a

3. There exists m ∈ R with a = c^k m and c ∤ m
Proof. • 1 → 2 is immediate.
• 2 → 1: Suppose that c^k | a and c^{k+1} ∤ a. We clearly have ordc(a) ≥ k. Suppose that there exists ℓ > k
with c^ℓ | a. Since ℓ > k, we have ℓ ≥ k + 1. This implies that c^{k+1} | c^ℓ, so since c^ℓ | a we conclude
that c^{k+1} | a. This contradicts our assumption. Therefore, there is no ℓ > k with c^ℓ | a, and hence
ordc(a) = k.

• 2 → 3: Suppose that c^k | a and c^{k+1} ∤ a. Fix m ∈ R with a = c^k m. If c | m, then we may fix
n ∈ R with m = cn, which would imply that a = c^k cn = c^{k+1} n, contradicting the fact that c^{k+1} ∤ a.
Therefore, we must have c ∤ m.

• 3 → 2: Fix m ∈ R with a = c^k m and c ∤ m. We clearly have c^k | a. Suppose that c^{k+1} | a and fix
n ∈ R with a = c^{k+1} n. We then have c^k m = c^{k+1} n. Since R is an integral domain and c ≠ 0, it follows
that c^k ≠ 0. Canceling c^k from both sides of c^k m = c^{k+1} n (again since R is an integral domain),
we conclude that m = cn. This implies that c | m, which is a contradiction. Therefore, c^{k+1} ∤ a.
A rough intuition is that ordc(a) intends to count the number of "occurrences" of c inside of a. A natural
hope would be that ordc(ab) = ordc(a) + ordc(b) for all a, b, c ∈ R, but this fails in general. For example,
consider Z. We have ord10(70) = 1, but ord10(14) = 0 and ord10(5) = 0, so ord10(70) ≠ ord10(14) + ord10(5).
Since 10 = 2 · 5, what is happening in this example is that the 2 was split off into the 14, while the 5 went
into the other factor. One might hope that since irreducibles can not be factored nontrivially, this kind of
behavior would not happen if we replace 10 by an irreducible element. Although this is true in Z, it is not
necessarily the case in general.
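In Z itself, ordc is easy to compute directly. A small Python sketch (with hypotheses chosen to avoid the infinite cases) reproduces the example above:

def ord_Z(c, a):
    # largest k with c^k | a, for integers with |c| >= 2 and a != 0
    k = 0
    while a % c == 0:
        a //= c
        k += 1
    return k

print(ord_Z(10, 70), ord_Z(10, 14), ord_Z(10, 5))  # 1 0 0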
As introduced in Section 11.2, in the integral domain

Z[√−5] = {a + b√−5 : a, b ∈ Z}

we have

(1 + √−5)(1 − √−5) = 6 = 2 · 3,

where all four factors are irreducible. Now ord2(6) = 1 because 6 = 2 · 3 and 2 ∤ 3 in Z[√−5]. However, we
have ord2(1 + √−5) = 0 = ord2(1 − √−5). Thus,

ord2((1 + √−5)(1 − √−5)) ≠ ord2(1 + √−5) + ord2(1 − √−5).

In other words, although 2 is irreducible in Z[√−5] and we can "find it" in the product 6 = (1 + √−5)(1 − √−5),
we can not "find it" in either of the factors. This strange behavior takes some getting used to, and
we will explore it much further in later chapters. The fundamental obstacle is that although 2 is irreducible
in Z[√−5], it is not prime there. In fact, although 2 | (1 + √−5)(1 − √−5), we have both 2 ∤ 1 + √−5 and
2 ∤ 1 − √−5.
For an even worse example, consider the subring R of Q[x] consisting of those polynomials whose constant
term and coefficient of x are both elements of Z. In Section 11.2, we showed that 3 was not prime in R
because 3 | x^2 but 3 ∤ x. We still have that U(R) = {1, −1}, so 3 is irreducible in R. However,
notice that ord3(x^2) = ∞ and ord3(x) = 0. Thus, we certainly do not have ord3(x^2) = ord3(x) + ord3(x).
The takeaway fact from these examples is that although irreducibility was defined to mean that we could
not further “split” the element nontrivially, this definition does not carry with it the nice properties that we
might expect. The next theorem shows that everything works much better for the possibly smaller class of
prime elements.
Theorem 11.4.9. Let R be an integral domain and let p ∈ R be prime. We have the following.
1. ordp(ab) = ordp(a) + ordp(b) for all a, b ∈ R.

2. ordp(a^n) = n · ordp(a) for all a ∈ R and n ∈ N.
Proof. We prove 1, from which 2 follows by induction. Let a, b ∈ R. First notice that if ordp(a) = ∞, then
p^k | a for all k ∈ N, hence p^k | ab for all k ∈ N, and thus ordp(ab) = ∞. Similarly, if ordp(b) = ∞, then
ordp(ab) = ∞. Suppose then that both ordp(a) and ordp(b) are finite, and let k = ordp(a) and ℓ = ordp(b).
Using Lemma 11.4.8, we may then write a = p^k m where p ∤ m and b = p^ℓ n where p ∤ n. We then have

ab = p^k m p^ℓ n = p^{k+ℓ} · mn.

Now if p | mn, then since p is prime, we conclude that either p | m or p | n, but both of these are
contradictions. Therefore, p ∤ mn. Using Lemma 11.4.8 again, it follows that ordp(ab) = k + ℓ.
Proposition 11.4.10. Let R be an integral domain, and let p ∈ R be irreducible.

1. For any irreducible q that is an associate of p, we have ordp(q) = 1.

2. For any irreducible q that is not an associate of p, we have ordp(q) = 0.

3. For any unit u, we have ordp(u) = 0.

Proof. 1. Suppose that q is an irreducible that is an associate of p. Fix a unit u with q = pu. Notice that
if p | u, then since u | 1, we conclude that p | 1, which would imply that p is a unit, contradicting our
definition of irreducible. Thus, p ∤ u, and so ordp(q) = 1 by Lemma 11.4.8.

2. Suppose that q is an irreducible that is not an associate of p. Since q is irreducible, its only divisors
are units and associates of q. Since p is neither a unit nor an associate of q, it follows that p ∤ q. Therefore,
ordp(q) = 0.

3. This is immediate because if p | u, then since u | 1, we could conclude that p | 1 and hence p is a unit,
contradicting the definition of irreducible.
Proposition 11.4.11. Let R be an integral domain. Let a ∈ R and let p ∈ R be prime. Suppose that
a = u q1 q2 · · · qk where u is a unit and the qi are irreducibles. We then have that exactly ordp(a) many of the
qi are associates of p.

Proof. Since p ∈ R is prime, Theorem 11.4.9 implies that

ordp(a) = ordp(u) + ordp(q1) + ordp(q2) + · · · + ordp(qk).

By Proposition 11.4.10, the terms on the right are 1 when qi is an associate of p and 0 otherwise. The result
follows.
Theorem 11.4.12. Let R be an integral domain. Suppose that the strict divisibility relation on R is well-
founded (i.e. there does not exist d1, d2, d3, . . . such that dn+1 ∥ dn for all n ∈ N+), and that every irreducible
in R is prime. We then have that R is a UFD.
Proof. Since the strict divisibility relation on R is well-founded, we can use Proposition 11.4.4 to conclude
that every nonzero nonunit element of R can be written as a product of irreducibles. Suppose now that
q1 q2 · · · qn = r1 r2 · · · rm where each qi and rj are irreducible. Call this common element a. We know that
every irreducible element of R is prime by assumption. Thus, for any irreducible p ∈ R, Proposition 11.4.11
(with u = 1) tells us that exactly ordp(a) many of the qi are associates of p, and also that exactly ordp(a)
many of the rj are associates of p. Thus, for every irreducible p ∈ R, there are an equal number of associates
of p on each side. Matching up the elements on the left with corresponding associates on the right tells us
that m = n and gives the required permutation.
Of course, we are left with the question of how to prove that many natural integral domains have these two
properties. Now in Z, the proof of Proposition 2.5.6 (that irreducibles are prime) relied upon the existence
of greatest common divisors. In the next chapter, we will attempt to generalize the special aspects of Z that
allowed us to prove some of the existence and fundamental properties of GCDs.
Chapter 12

Euclidean Domains and PIDs

12.1 Euclidean Domains
Definition 12.1.1. Let R be an integral domain. A function N : R\{0} → N is called a Euclidean function
on R if for all a, b ∈ R with b ≠ 0, there exist q, r ∈ R such that

a = qb + r

and either r = 0 or N(r) < N(b).
Definition 12.1.2. An integral domain R is a Euclidean domain if there exists a Euclidean function on R.
Example 12.1.3. As alluded to above, Theorem 2.3.1 and Corollary 9.3.11 establish the following:

• The function N : Z\{0} → N defined by N(a) = |a| is a Euclidean function on Z, so Z is a Euclidean
domain.

• Let F be a field. The function N : F[x]\{0} → N defined by N(f(x)) = deg(f(x)) is a Euclidean
function on F[x], so F[x] is a Euclidean domain.
Notice that we do not require the uniqueness of q and r in our definition of a Euclidean function. Although
it was certainly a nice perk to have some aspect of uniqueness in Z and F[x], it turns out to be unnecessary
for the theoretical results of interest about Euclidean domains. Furthermore, although the above Euclidean
function for F[x] does provide true uniqueness, the one for Z does not, since uniqueness there only holds
if the remainder is required to be nonnegative (for example, we have 13 = 4 · 3 + 1 and also 13 = 5 · 3 + (−2), where both
N (1) < N (3) and N (−2) < N (3)). We will see more natural Euclidean functions on integral domains for
which uniqueness fails, and we want to be as general as possible.
The name Euclidean domain comes from the fact that any such integral domain supports the ability to
find greatest common divisors via the Euclidean algorithm. Even more fundamentally, the notion of "size"
given by a Euclidean function N : R\{0} → N allows us to use induction to prove the existence of greatest
common divisors. We begin with the following generalization of a simple result we proved about Z which
works in any integral domain (even any commutative ring).
Proposition 12.1.4. Let R be a commutative ring, and let a, b, q, r ∈ R with a = qb + r. Then the set of
common divisors of a and b equals the set of common divisors of b and r.

Proof. Suppose first that d is a common divisor of b and r. Since d | b, d | r, and a = qb + r = b · q + r · 1, it
follows that d | a.
Conversely, suppose that d is a common divisor of a and b. Since d | a, d | b, and r = a − qb = a · 1 + b · (−q),
it follows that d | r.
Theorem 12.1.5. Let R be a Euclidean domain. Every pair of elements a, b ∈ R has a greatest common
divisor.
Proof. Since R is a Euclidean domain, we may fix a Euclidean function N : R\{0} → N. We first handle
the special case when b = 0 since N (0) is not defined. If b = 0, then the set of common divisors of a and b
equals the set of divisors of a (because every element divides 0), so a satisfies the requirement of a greatest
common divisor. We now use (strong) induction on N (b) ∈ N to prove the result.
• Base Case: Suppose that b ∈ R is nonzero and N (b) = 0. Fix q, r ∈ R with a = qb + r and either r = 0
or N (r) < N (b). Since N (b) = 0, we can not have N (r) < N (b), so we must have r = 0. Therefore, we
have a = qb. It is now easy to check that b is a greatest common divisor of a and b.
• Inductive Step: Suppose then that b ∈ R is nonzero and we know the result for all pairs x, y ∈ R with
either y = 0 or N (y) < N (b). Fix q, r ∈ R with a = qb + r and either r = 0 or N (r) < N (b). By
(strong) induction, we know that b and r have a greatest common divisor d. By Proposition 12.1.4,
the set of common divisors of a and b equals the set of common divisors of b and r. It follows that d
is a greatest common divisor of a and b.
As an example, consider working in the integral domain Q[x] and trying to find a greatest common divisor
of the following two polynomials:

f(x) = x^5 + 3x^3 + 2x^2 + 6        g(x) = x^4 − x^3 + 4x^2 − 3x + 3

We apply the Euclidean Algorithm as follows (we suppress the computations of the long divisions):

x^5 + 3x^3 + 2x^2 + 6 = (x + 1) · (x^4 − x^3 + 4x^2 − 3x + 3) + (x^2 + 3)
x^4 − x^3 + 4x^2 − 3x + 3 = (x^2 − x + 1) · (x^2 + 3) + 0

Thus, the set of common divisors of f(x) and g(x) equals the set of common divisors of x^2 + 3 and 0, which is just
the set of divisors of x^2 + 3. Therefore, x^2 + 3 is a greatest common divisor of f(x) and g(x). Now this is
not the only greatest common divisor because we know that any associate of x^2 + 3 will also be a greatest
common divisor of f(x) and g(x) by Proposition 11.1.11. The units in Q[x] are the nonzero constants, so
other greatest common divisors are 2x^2 + 6, (5/6)x^2 + 5/2, etc. We would like to have a canonical choice for which
to pick, akin to choosing the nonnegative value when working in Z.
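The computation above is easy to automate. Here is a Python sketch of the Euclidean algorithm in Q[x] (polynomials as lists of Fractions with the constant term first; all names are ours), normalized to be monic, which is exactly the canonical choice adopted below:

from fractions import Fraction

def poly_divmod(f, g):
    # division with remainder in Q[x]; f, g are coefficient lists with the
    # constant term first and no trailing zeros; g must be nonzero
    f = f[:]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        shift = len(f) - len(g)
        coef = f[-1] / g[-1]
        q[shift] = coef
        for i, c in enumerate(g):
            f[shift + i] -= coef * c
        while f and f[-1] == 0:   # drop the cancelled leading terms
            f.pop()
    return q, f

def poly_gcd(f, g):
    # Euclidean algorithm; returns the monic greatest common divisor
    while g:
        f, g = g, poly_divmod(f, g)[1]
    return [c / f[-1] for c in f]

# the example from the text:
f = [Fraction(c) for c in (6, 0, 2, 3, 0, 1)]    # x^5 + 3x^3 + 2x^2 + 6
g = [Fraction(c) for c in (3, -3, 4, -1, 1)]     # x^4 - x^3 + 4x^2 - 3x + 3
print(poly_gcd(f, g))   # [Fraction(3, 1), Fraction(0, 1), Fraction(1, 1)], i.e. x^2 + 3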
Definition 12.1.6. Let F be a field. A monic polynomial in F[x] is a nonzero polynomial whose leading
coefficient is 1.
Notice that every nonzero polynomial in F[x] is an associate of a unique monic polynomial (if the
leading coefficient is a ≠ 0, just multiply by a^{−1} to get a monic associate, and notice that this is the only way to
multiply by a nonzero constant to make it monic). By restricting to monic polynomials, we get a canonical
choice for a greatest common divisor.
Definition 12.1.7. Let F be a field and let f (x), g(x) ∈ F [x] be polynomials. If at least one of f (x) and
g(x) is nonzero, we define gcd(f (x), g(x)) to be the unique monic polynomial which is a greatest common
divisor of f (x) and g(x). Notice that if both f (x) and g(x) are the zero polynomial, then 0 is the only greatest
common divisor of f (x) and g(x), so we define gcd(f (x), g(x)) = 0.
Now x^2 + 3 is monic, so from the above computations, we have

gcd(x^5 + 3x^3 + 2x^2 + 6, x^4 − x^3 + 4x^2 − 3x + 3) = x^2 + 3.
We end this section by showing that the Gaussian Integers Z[i] = {a + bi : a, b ∈ Z} are also a Euclidean
domain. In order to show this, we will use the following result.
Proposition 12.1.8. The subring Q[i] = {q + ri : q, r ∈ Q} is a field.
Proof. Let α ∈ Q[i] be nonzero and write α = q + ri where q, r ∈ Q. We then have that either q ≠ 0 or
r ≠ 0, so

1/α = 1/(q + ri)
    = (1/(q + ri)) · ((q − ri)/(q − ri))
    = (q − ri)/(q^2 + r^2)
    = q/(q^2 + r^2) + (−r/(q^2 + r^2)) · i.

Since both q/(q^2 + r^2) and −r/(q^2 + r^2) are elements of Q, it follows that 1/α ∈ Q[i]. Therefore, Q[i] is a field.
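The displayed inverse formula is directly computable. A brief Python sketch (pairs of Fractions standing for q + ri; the function name is ours):

from fractions import Fraction

def inverse_in_Qi(q, r):
    # inverse of a nonzero q + ri in Q[i], per the computation above
    n = q * q + r * r                 # q^2 + r^2 > 0 since (q, r) != (0, 0)
    return (q / n, -r / n)

q, r = Fraction(1, 2), Fraction(-2, 3)
iq, ir = inverse_in_Qi(q, r)
# check (q + ri)(iq + ir i) = 1 + 0i:
print(q * iq - r * ir, q * ir + r * iq)   # 1 0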
Proof. Notice that {1, −1, i, −i} ⊆ U(Z[i]) because 1^2 = 1, (−1)^2 = 1, and i · (−i) = 1. Suppose conversely
that α ∈ U(Z[i]) and write α = c + di where c, d ∈ Z. Since α ∈ U(Z[i]), we can fix β ∈ Z[i] with αβ = 1.
We then have N (αβ) = N (1), so N (α) · N (β) = 1 by the previous proposition. Since N (α), N (β) ∈ N, we
conclude that N(α) = 1 and N(β) = 1. We then have c^2 + d^2 = N(α) = 1. It follows that one of c or d is 0,
and the other is ±1. Thus, α = c + di ∈ {1, −1, i, −i}.
Proof. Notice that Z[i] is an integral domain because it is a subring of C. Let α, β ∈ Z[i] be arbitrary with β ≠ 0.
When we divide α by β in the field Q[i], we get α/β = s + ti for some s, t ∈ Q. Fix integers m, n ∈ Z closest
to s, t ∈ Q respectively, i.e. fix m, n ∈ Z so that |m − s| ≤ 1/2 and |n − t| ≤ 1/2. Let γ = m + ni ∈ Z[i], and let
ρ = α − βγ ∈ Z[i]. We then have that α = βγ + ρ, so we need only show that N(ρ) < N(β). Now

N(ρ) = N(α − βγ)
     = N(β · (s + ti) − β · γ)
     = N(β · ((s + ti) − (m + ni)))
     = N(β · ((s − m) + (t − n)i))
     = N(β) · N((s − m) + (t − n)i)
     = N(β) · ((s − m)^2 + (t − n)^2)
     ≤ N(β) · (1/4 + 1/4)
     = (1/2) · N(β)
     < N(β),

where the last line follows because N(β) > 0 (since β ≠ 0).
We work out an example of finding a greatest common divisor of 8 + 9i and 10 − 5i in Z[i]. We follow the above
proof to find quotients and remainders. Notice that

(8 + 9i)/(10 − 5i) = ((8 + 9i)/(10 − 5i)) · ((10 + 5i)/(10 + 5i))
                   = (80 + 40i + 90i − 45)/(100 + 25)
                   = (35 + 130i)/125
                   = 7/25 + (26/25) · i.

Following the proof (where we take the closest integers to 7/25 and 26/25), we should use the quotient i and
determine the remainder from there. We thus write

8 + 9i = i · (10 − 5i) + (3 − i).
Notice that N(3 − i) = 9 + 1 = 10, which is less than N(10 − 5i) = 100 + 25 = 125. Following the Euclidean
algorithm, we next calculate

(10 − 5i)/(3 − i) = ((10 − 5i)/(3 − i)) · ((3 + i)/(3 + i))
                  = (30 + 10i − 15i + 5)/(9 + 1)
                  = (35 − 5i)/10
                  = 7/2 − (1/2) · i.
Following the proof (where we now have many choices because 7/2 is equally close to 3 and 4, and −1/2 is equally
close to −1 and 0), we choose to take the quotient 3. We then write

10 − 5i = 3 · (3 − i) + (1 − 2i).
Notice that N (1 − 2i) = 1 + 4 = 5 which is less than N (3 − i) = 9 + 1 = 10. Going to the next step, we
calculate
(3 − i)/(1 − 2i) = ((3 − i)/(1 − 2i)) · ((1 + 2i)/(1 + 2i))
                 = (3 + 6i − i + 2)/(1 + 4)
                 = (5 + 5i)/5
                 = 1 + i.
Therefore, we have
3 − i = (1 + i) · (1 − 2i) + 0
Putting together the various divisions, we see the Euclidean algorithm as:
8 + 9i = i · (10 − 5i) + (3 − i)
10 − 5i = 3 · (3 − i) + (1 − 2i)
3 − i = (1 + i) · (1 − 2i) + 0
Thus, the set of common divisors of 8 + 9i and 10 − 5i equals the set of common divisors of 1 − 2i and 0,
which is just the set of divisors of 1 − 2i. Since a greatest common divisor is unique up to associates and the
units of Z[i] are 1, −1, i, −i, it follows that the set of greatest common divisors of 8 + 9i and 10 − 5i is

{1 − 2i, −1 + 2i, 2 + i, −2 − i}.
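The division step in the proof (round the exact quotient in Q[i] to a nearest Gaussian integer) is easy to implement, which gives a working Euclidean algorithm for Z[i]. A Python sketch with Gaussian integers as pairs of ints (all names ours); where ties occur it may choose different quotients than we did above, but it still lands on an associate of the same greatest common divisor:

def gauss_divmod(a, b):
    # division with remainder in Z[i]; a, b are (real, imag) pairs, b != (0, 0)
    (ar, ai), (br, bi) = a, b
    n = br * br + bi * bi                 # N(b)
    def nearest(num):                     # a nearest integer to num / n
        return (2 * num + n) // (2 * n)
    m = nearest(ar * br + ai * bi)        # rounded real part of a/b
    k = nearest(ai * br - ar * bi)        # rounded imaginary part of a/b
    rho = (ar - (br * m - bi * k), ai - (br * k + bi * m))   # a - b*(m + ki)
    return (m, k), rho

def gauss_gcd(a, b):
    while b != (0, 0):
        a, b = b, gauss_divmod(a, b)[1]
    return a

print(gauss_gcd((8, 9), (10, -5)))   # (-2, -1), an associate of 1 - 2i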
In a Euclidean domain with a "nice" Euclidean function, say one where N(b) < N(a) whenever b ∥ a, one can
mimic the inductive argument in Z to prove that every element is a product of irreducibles. For example, it
is relatively straightforward to prove that our Euclidean functions on F[x] and Z[i] satisfy this, and so every
(nonzero nonunit) element factors into irreducibles. In fact, one can show that every Euclidean domain has
a (possibly different) Euclidean function N with the property that N(b) < N(a) whenever b ∥ a. However,
rather than develop this interesting theory, we approach these problems from another perspective.
{N(a) : a ∈ I\{0}}

is a nonempty subset of N. By the well-ordering property of N, the set has a least element m. Fix b ∈ I
with N(b) = m. Since b ∈ I, we clearly have ⟨b⟩ ⊆ I. Suppose now that a ∈ I. Fix q, r ∈ R with

a = qb + r

and either r = 0 or N(r) < N(b). Since r = a − qb and both a, b ∈ I, it follows that r ∈ I. Now if r ≠ 0,
then N(r) < N(b) = m, contradicting the minimality of m. Therefore, we must have r = 0 and so a = qb. It
follows that a ∈ ⟨b⟩. Since a ∈ I was arbitrary, we conclude that I ⊆ ⟨b⟩. Therefore, I = ⟨b⟩.
Corollary 12.2.3. Z, F [x] for F a field, and Z[i] are all PIDs.
Notice also that all fields F are PIDs for the trivial reason that the only ideals of F are {0} = ⟨0⟩ and
F = ⟨1⟩. In fact, all fields are also trivially Euclidean domains via absolutely any function N : F\{0} → N
because we can always divide by a nonzero element with zero as a remainder. It turns out that there are
PIDs which are not Euclidean domains, but we will not construct examples of such rings now.
Returning to our other characterization of greatest common divisors in Z, recall that if a, b ∈ Z are not
both zero, then we considered the set

{ma + nb : m, n ∈ Z}

and proved that the least positive element of this set was the greatest common divisor. In our current ring-
theoretic language, the above set is the ideal ⟨a, b⟩ of Z, and a generator of this ideal is a greatest common
divisor. With this change in perspective/language, we can carry this argument over to an arbitrary PID.
Theorem 12.2.4. Let R be a PID and let a, b ∈ R.
I1 ⊊ I2 ⊊ I3 ⊊ · · ·

which contradicts the definition of Noetherian. Therefore, ∥ is well-founded on R. The last statement now
follows from Proposition 11.4.4.
This is all well and good, but we need a “simple” way to determine when a commutative ring R is
Noetherian. Fortunately, we have the following.
Theorem 12.2.11. Let R be a commutative ring. The following are equivalent.
1. R is Noetherian.
2. Every ideal of R is finitely generated (i.e. for every ideal I of R, there exist a1, a2, . . . , am ∈ R with
I = ⟨a1, a2, . . . , am⟩).
Proof. Suppose first that every ideal of R is finitely generated. Let

I1 ⊆ I2 ⊆ I3 ⊆ · · ·

be an arbitrary ascending chain of ideals of R, and let J be the union of the In, which is an ideal of R (the
union of an ascending chain of ideals is an ideal). By assumption, J is finitely generated, so we may fix
a1, a2, . . . , am ∈ R with J = ⟨a1, a2, . . . , am⟩. For each i, fix Ni ∈ N+ with ai ∈ I_{Ni}, and let
N = max{N1, N2, . . . , Nm}. We then have ai ∈ IN for all i, hence J ⊆ IN. Now for any n ≥ N, we have

IN ⊆ In ⊆ J ⊆ IN,

hence In = IN. Therefore, the chain stabilizes, and R is Noetherian.
Suppose conversely that some ideal of R is not finitely generated, and fix such an ideal J. Define a
sequence of elements of J as follows. Let a1 be an arbitrary element of J. Suppose that we have defined
a1, a2, . . . , ak ∈ J. Since J is not finitely generated, we have that

⟨a1, a2, . . . , ak⟩ ⊊ J,

so we may let ak+1 be some (any) element of J\⟨a1, a2, . . . , ak⟩. Letting In = ⟨a1, a2, . . . , an⟩ for each n ∈ N+,
we then have

I1 ⊊ I2 ⊊ I3 ⊊ · · ·,

so R is not Noetherian.
Corollary 12.2.12. Every PID is Noetherian.
Proof. This follows immediately from Theorem 12.2.11 and the fact that in a PID every ideal is generated
by one element.
Corollary 12.2.13. Every PID is a UFD, and thus every Euclidean domain is a UFD as well.
Proof. Let R be a PID. By Corollary 12.2.12, we know that R is Noetherian, and so ∥ is well-founded
by Proposition 12.2.10. Furthermore, we know that every irreducible in R is prime by Proposition 12.2.7.
Therefore, R is a UFD by Theorem 11.4.12.
Finally, we bring together many of the fundamental properties of elements and ideals in any PID.
Proposition 12.2.14. Let R be a PID and let a ∈ R with a ≠ 0. The following are equivalent.

1. ⟨a⟩ is a maximal ideal.

2. ⟨a⟩ is a prime ideal.
3. a is a prime.
4. a is irreducible.
Proof. We have already proved much of this, so let’s recap what we know.
• 1 → 2 is Corollary 10.5.6.
• 2 ↔ 3 is Proposition 11.2.5.
• 3 → 4 is Proposition 11.2.4.
• 4 → 3 is Proposition 12.2.7
It is natural to misread the previous proposition to conclude that in a PID every prime ideal is maximal.
This is almost true, but pay careful attention to the assumption that a 6= 0. In a PID R, the ideal {0} is
always a prime ideal, but it is only maximal in the trivial special case of when R is a field. In a PID, every
nonzero prime ideal is maximal.
Proposition 12.3.1. Let F be a field and let p(x) ∈ F[x] be nonzero. Let I = ⟨p(x)⟩ and work in F[x]/I.
For all f(x) ∈ F[x], there exists a unique h(x) ∈ F[x] such that both:

• f(x) and h(x) represent the same coset, i.e. f(x) = h(x) in F[x]/I

• either h(x) = 0 or deg(h(x)) < deg(p(x))

In other words, if we let S = {h(x) ∈ F[x] : h(x) = 0 or deg(h(x)) < deg(p(x))},
then the elements of S provide unique representatives for the cosets in F[x]/I.
Proof. We first prove existence. Let f(x) ∈ F[x]. Since p(x) ≠ 0, we may fix q(x), r(x) ∈ F[x] with

f(x) = q(x)p(x) + r(x)

and either r(x) = 0 or deg(r(x)) < deg(p(x)). Thus p(x) | (f(x) − r(x)), and so f(x) − r(x) ∈ I. It follows
from Proposition 10.1.6 that f(x) = r(x) in F[x]/I, so we may take h(x) = r(x). This proves existence.
We now prove uniqueness. Suppose that h1(x), h2(x) ∈ S (so each is either 0 or has smaller degree
than p(x)) and that h1(x) = h2(x) in F[x]/I. Using Proposition 10.1.6, we then have that h1(x) − h2(x) ∈ I and
hence p(x) | (h1 (x) − h2 (x)). Notice that every nonzero multiple of p(x) has degree greater than or equal to
deg(p(x)) (since the degree of a product is the sum of the degrees in F [x]). Now either h1 (x) − h2 (x) = 0
or it has degree less than deg(p(x)), but we’ve just seen that the latter is impossible. Therefore, it must be
the case that h1 (x) − h2 (x) = 0, and so h1 (x) = h2 (x). This proves uniqueness.
Let's look at an example. Suppose that we are working with F = Q and we let p(x) = x^2 − 2x + 3.
Consider the quotient ring R = Q[x]/⟨p(x)⟩. From above, we know that every element in this quotient
is represented uniquely by either a constant polynomial or a polynomial of degree 1. Thus, some distinct
elements of R are 1, 3/7, x, and 2x − 5/3. We add elements in the quotient ring R by adding representatives
as usual, so for example we have

(4x − 7) + (2x + 8) = 6x + 1.
Multiplication of elements of R is more interesting if we try to convert the resulting product to one of our
chosen representatives. For example, we have

(2x + 7) · (x − 1) = 2x^2 + 5x − 7,

which is perfectly correct, but the resulting representative isn't one of our chosen ones. If we follow the
above proof, we should divide 2x^2 + 5x − 7 by x^2 − 2x + 3 and use the remainder as our representative. We
have

2x^2 + 5x − 7 = 2 · (x^2 − 2x + 3) + (9x − 13),

so

(2x^2 + 5x − 7) − (9x − 13) ∈ ⟨p(x)⟩,

and hence 2x^2 + 5x − 7 = 9x − 13 in the quotient. It follows that in the quotient we have

(2x + 7) · (x − 1) = 9x − 13.
Here's another way to determine that the product is 9x − 13: since x^2 − 2x + 3 = 0 in the quotient, we have
x^2 = 2x − 3 there, so in any representative we can simply replace x^2 by 2x − 3 and repeat until the degree is at most 1.
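On representatives of degree at most 1, this substitution gives a one-line multiplication rule. A Python sketch (representatives as pairs (coefficient of x, constant term), a representation of ours):

def mul_mod_p(u, v):
    # multiply representatives ax + b and cx + d in Q[x]/<x^2 - 2x + 3>,
    # replacing x^2 by 2x - 3 in the product
    a, b = u
    c, d = v
    ac = a * c
    return (2 * ac + a * d + b * c, -3 * ac + b * d)

print(mul_mod_p((2, 7), (1, -1)))   # (9, -13), i.e. 9x - 13, as computed above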
Now let F = {0, 1} be the field Z/2Z, and consider the quotient F[x]/⟨x^2 + x + 1⟩. Since x^2 + x + 1 = 0
in this quotient, we have

x^2 = −x − 1 = x + 1.

Therefore, we have

x · (x + 1) = x^2 + x = (x + 1) + x = 2x + 1 = 1.
Notice that every nonzero element of the quotient F[x]/⟨x^2 + x + 1⟩ has a multiplicative inverse, so the
quotient in this case is a field. We have succeeded in constructing a field of order 4. This is our first
example of a finite field which does not have prime order.
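This field of order 4 is small enough to check entirely by machine. A Python sketch (elements as pairs (a, b) standing for ax + b with a, b ∈ {0, 1}; x^2 reduces to x + 1 since −1 = 1 here; the names are ours):

def add4(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul4(u, v):
    # (ax + b)(cx + d) = ac x^2 + (ad + bc)x + bd, then x^2 -> x + 1
    a, b = u
    c, d = v
    ac = a * c
    return ((ac + a * d + b * c) % 2, (ac + b * d) % 2)

elems = [(0, 0), (0, 1), (1, 0), (1, 1)]     # 0, 1, x, x + 1
for u in elems[1:]:                          # every nonzero element is a unit
    assert any(mul4(u, v) == (0, 1) for v in elems[1:])
print(mul4((1, 0), (1, 1)))                  # x * (x + 1) = 1, printed as (0, 1)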
The reason why we obtained a field when taking the quotient by ⟨x^2 + x + 1⟩ but not when taking the
quotient by ⟨x^2 + 1⟩ is the following. It is the analogue of the fact that Z/pZ is a field if and only if p is
prime (equivalently irreducible) in Z.
Proposition 12.3.2. Let F be a field and let p(x) ∈ F[x] be nonzero. We have that F[x]/⟨p(x)⟩ is a field
if and only if p(x) is irreducible in F[x].

Proof. Let p(x) ∈ F[x] be nonzero. Since F is a field, we know that F[x] is a PID. Using Proposition 12.2.14
and Theorem 10.5.5, we conclude that F[x]/⟨p(x)⟩ is a field if and only if ⟨p(x)⟩ is a maximal ideal, if and
only if p(x) is irreducible in F[x].
When F = {0, 1} as above, we see that x^2 + 1 is not irreducible (since 1 is a root, and indeed x^2 + 1 = (x + 1)(x + 1)),
but x^2 + x + 1 is irreducible (because it has degree 2 and neither 0 nor 1 is a root). Generalizing the previous
constructions, we get the following.
Proposition 12.3.3. Let F be a finite field with k elements. If p(x) ∈ F[x] is irreducible and deg(p(x)) = n,
then F[x]/⟨p(x)⟩ is a field with k^n elements.

Proof. Since p(x) is irreducible, we know from the previous proposition that F[x]/⟨p(x)⟩ is a field. Since
deg(p(x)) = n, we can represent the elements of the quotient uniquely by elements of the form

a_{n−1}x^{n−1} + · · · + a1 x + a0

where each ai ∈ F. Now F has k elements, so we have k choices for each value of ai. We can make this
choice for each of the n coefficients ai, so we have k^n many choices in total.
It turns out that if p ∈ N+ is prime and n ∈ N+, then there exists an irreducible polynomial in Z/pZ[x]
of degree n, so there exists a field of order p^n. However, directly proving that such polynomials exist is
nontrivial. It is also possible to invert this whole idea by first proving that there exist fields of order p^n,
working to understand their structure, and then using these results to prove that there exist irreducible
polynomials in Z/pZ[x] of each degree n. We will do this later in Chapter ??. In addition, we will prove that
every finite field has order some prime power, and that any two finite fields of the same order are isomorphic, so
this process of taking quotients of Z/pZ[x] by irreducible polynomials suffices to construct all finite fields
(up to isomorphism).
Finally, we end this section with one way to construct the complex numbers. Consider the ring R[x]
of polynomials with real coefficients. Let p(x) = x^2 + 1 and notice that p(x) has no roots in R because
a^2 + 1 ≥ 1 for all a ∈ R. Since deg(p(x)) = 2, it follows that p(x) = x^2 + 1 is irreducible in R[x]. From
above, we conclude that R[x]/⟨x^2 + 1⟩ is a field. Now elements of the quotient are represented uniquely by
ax + b for a, b ∈ R. In the quotient, we have x^2 + 1 = 0 and hence x^2 = −1, so x can play the role of "i".
Notice that for any a, b, c, d ∈ R, we have in the quotient

(ax + b) + (cx + d) = (a + c)x + (b + d)

and

(ax + b) · (cx + d) = acx^2 + (ad + bc)x + bd
                    = ac · (−1) + (ad + bc)x + bd
                    = (ad + bc)x + (bd − ac).
Notice that this multiplication is the exact same as when you treat the complex numbers as having the
form ai + b and “formally add and multiply” using the rule that i2 = −1. One advantage of our quotient
construction is that we do not need to verify all of the field axioms. We get them for free from our general
theory.
We now define Frac(R) to be the set of equivalence classes, i.e. Frac(R) = P/∼. We need to define
addition and multiplication on Frac(R) to make it into a field. Mimicking addition and multiplication of rationals,
we want to define

(a, b) + (c, d) = (ad + bc, bd)        (a, b) · (c, d) = (ac, bd).
Notice first that since b, d ≠ 0, we have bd ≠ 0 because R is an integral domain, so we have no issues there.
However, we need to check that the operations are well-defined: suppose that (a1, b1) ∼ (a2, b2) and
(c1, d1) ∼ (c2, d2). We must show that both

• (a1 d1 + b1 c1, b1 d1) ∼ (a2 d2 + b2 c2, b2 d2)

• (a1 c1, b1 d1) ∼ (a2 c2, b2 d2)
Proof. Since (a1, b1) ∼ (a2, b2) and (c1, d1) ∼ (c2, d2), it follows that a1 b2 = b1 a2 and c1 d2 = d1 c2. We have

(a1 d1 + b1 c1) · b2 d2 = a1 b2 d1 d2 + c1 d2 b1 b2
                        = b1 a2 d1 d2 + d1 c2 b1 b2
                        = b1 d1 · (a2 d2 + b2 c2),

so (a1 d1 + b1 c1, b1 d1) ∼ (a2 d2 + b2 c2, b2 d2).
Multiplying a1 b2 = b1 a2 and c1 d2 = d1 c2, we conclude that

a1 b2 · c1 d2 = b1 a2 · d1 c2,

and hence

a1 c1 · b2 d2 = b1 d1 · a2 c2,

so (a1 c1, b1 d1) ∼ (a2 c2, b2 d2).
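These verifications mirror directly into code. A Python sketch of Frac(R) for R = Z (pairs (a, b) with b ≠ 0, compared via ∼; this toy class is ours, distinct from Python's built-in fractions module):

class Frac:
    def __init__(self, a, b):
        assert b != 0
        self.a, self.b = a, b
    def __eq__(self, other):     # (a, b) ~ (c, d) iff ad = bc
        return self.a * other.b == self.b * other.a
    def __add__(self, other):    # (a, b) + (c, d) = (ad + bc, bd)
        return Frac(self.a * other.b + self.b * other.a, self.b * other.b)
    def __mul__(self, other):    # (a, b) * (c, d) = (ac, bd)
        return Frac(self.a * other.a, self.b * other.b)

# well-definedness in action: equivalent inputs give equivalent outputs
assert Frac(1, 2) == Frac(2, 4)
assert Frac(1, 2) + Frac(1, 3) == Frac(2, 4) + Frac(2, 6)
assert Frac(1, 2) * Frac(1, 3) == Frac(2, 4) * Frac(2, 6)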
Now that we have successfully defined addition and multiplication, we are ready to prove that the resulting
object is a field.
Theorem 12.4.4. If R is an integral domain, then Frac(R) is a field with the following properties.

• If (a, b) ∈ Frac(R) with (a, b) ≠ (0, 1), then a ≠ 0 and the multiplicative inverse of (a, b) is (b, a).
2. Commutativity of +: Let q, r ∈ Frac(R). Fix a, b, c, d ∈ R with b, d ≠ 0 such that q = (a, b) and
r = (c, d). We then have
q + r = (a, b) + (c, d)
= (ad + bc, bd)
= (cb + da, db)
= (c, d) + (a, b)
=r+q
3. (0, 1) is an additive identity: Let q ∈ Frac(R). Fix a, b ∈ R with b ≠ 0 such that q = (a, b). We then
have

q + (0, 1) = (a, b) + (0, 1) = (a · 1 + b · 0, b · 1) = (a, b) = q,

and similarly (0, 1) + q = q by commutativity of +.
4. Additive inverses: Let q ∈ Frac(R). Fix a, b ∈ R with b ≠ 0 such that q = (a, b). Let r = (−a, b). We
then have
q + r = (a, b) + (−a, b)
= (ab + b(−a), bb)
= (ab − ab, bb)
= (0, bb)
= (0, 1)
where the last line follows from the fact that 0 · 1 = 0 = 0 · bb. Since we already proved commutativity
of +, we conclude that r + q = (0, 1) also.
6. Commutativity of ·: Let q, r ∈ Frac(R). Fix a, b, c, d ∈ R with b, d ≠ 0 such that q = (a, b) and
r = (c, d). We then have
q · r = (a, b) · (c, d)
= (ac, bd)
= (ca, db)
= (c, d) · (a, b)
=r·q
7. (1, 1) is a multiplicative identity: Let q ∈ Frac(R). Fix a, b ∈ R with b ≠ 0 such that q = (a, b). We
then have

q · (1, 1) = (a, b) · (1, 1) = (a · 1, b · 1) = (a, b) = q,

and similarly (1, 1) · q = q by commutativity of ·.
8. Multiplicative inverses: Let q ∈ Frac(R) with q ≠ (0, 1). Fix a, b ∈ R with b ≠ 0 such that q = (a, b).
Since (a, b) ≠ (0, 1), we know that (a, b) ≁ (0, 1), so a · 1 ≠ b · 0, which means that a ≠ 0. Let r = (b, a),
which makes sense because a ≠ 0. We then have

q · r = (a, b) · (b, a)
      = (ab, ba)
      = (ab, ab)
      = (1, 1),

where the last line follows from the fact that ab · 1 = ab = ab · 1. Since we already proved commutativity
of ·, we conclude that r · q = (1, 1) also.
Although R is certainly not a subring of Frac(R) (it is not even a subset), our next proposition says that
R can be embedded in Frac(R).
Proposition 12.4.5. Let R be an integral domain. Define θ : R → Frac(R) by θ(a) = (a, 1). We then have
that θ is an injective ring homomorphism.
Proof. Notice first that θ(1) = (1, 1), which is the multiplicative identity of Frac(R). Now for any a, b ∈ R,
we have
θ(a + b) = (a + b, 1)
= (a · 1 + 1 · b, 1 · 1)
= (a, 1) + (b, 1)
= θ(a) + θ(b)
and
θ(ab) = (ab, 1)
= (a · b, 1 · 1)
= (a, 1) · (b, 1)
= θ(a) · θ(b)
Thus, θ : R → Frac(R) is a ring homomorphism. Suppose now that a, b ∈ R with θ(a) = θ(b). We then have
that (a, 1) = (b, 1), so (a, 1) ∼ (b, 1). It follows that a · 1 = 1 · b, so a = b. Therefore, θ is injective.
We have now completed our primary objective in showing that every integral domain can be embedded
in a field. Our final result about Frac(R) is that it is the "smallest" such field. We can't hope to prove that
it is a subset of every field containing R because of course we can always rename elements. However, we can
show that if R embeds in some field K, then we can also embed Frac(R) in K. In fact, we show that there
is a unique way to do it so that we "extend" the embedding of R.
Theorem 12.4.6. Let R be an integral domain. Let θ : R → Frac(R) be defined by θ(a) = (a, 1) as above.
Suppose that K is a field and that ψ : R → K is an injective ring homomorphism. There exists a unique
injective ring homomorphism ϕ : Frac(R) → K such that ϕ ◦ θ = ψ.
Proof. First notice that if b ∈ R with b ≠ 0, then ψ(b) ≠ 0 (because ψ(0) = 0 and ψ is assumed to be
injective). Thus, if b ∈ R with b ≠ 0, then ψ(b) has a multiplicative inverse in K. Define ϕ : Frac(R) → K
by letting

ϕ((a, b)) = ψ(a) · ψ(b)^{−1}.
We check the following.
• ϕ is well-defined: Suppose that (a, b) = (c, d). We then have (a, b) ∼ (c, d), so ad = bc. From this we
conclude that ψ(ad) = ψ(bc), and since ψ is a ring homomorphism it follows that ψ(a)·ψ(d) = ψ(c)·ψ(b).
We have b, d ≠ 0, so ψ(b) ≠ 0 and ψ(d) ≠ 0 by our initial comment. Multiplying both sides by
ψ(b)^{−1} · ψ(d)^{−1}, we conclude that ψ(a) · ψ(b)^{−1} = ψ(c) · ψ(d)^{−1}. Therefore, ϕ((a, b)) = ϕ((c, d)).
• ϕ(1Frac(R)) = 1K: We have ϕ((1, 1)) = ψ(1) · ψ(1)^{−1} = 1 · 1^{−1} = 1.
• ϕ is injective: Let a, b, c, d ∈ R with b, d ≠ 0 and suppose that ϕ((a, b)) = ϕ((c, d)). We then
have that ψ(a) · ψ(b)^{−1} = ψ(c) · ψ(d)^{−1}. Multiplying both sides by ψ(b) · ψ(d), we conclude that
ψ(a) · ψ(d) = ψ(b) · ψ(c). Since ψ is a ring homomorphism, it follows that ψ(ad) = ψ(bc). Now ψ is
injective, so we conclude that ad = bc. Thus, we have (a, b) ∼ (c, d) and so (a, b) = (c, d).
• ϕ ◦ θ = ψ: For any a ∈ R, we have

(ϕ ◦ θ)(a) = ϕ(θ(a))
           = ϕ((a, 1))
           = ψ(a) · ψ(1)^{−1}
           = ψ(a) · 1^{−1}
           = ψ(a)
We finally prove uniqueness. Suppose that φ : Frac(R) → K is a ring homomorphism with φ ◦ θ = ψ. For
any a ∈ R, we have

φ((a, 1)) = φ(θ(a)) = ψ(a).

Now for any b ∈ R with b ≠ 0, we have

ψ(b) · φ((1, b)) = φ((b, 1)) · φ((1, b)) = φ((b, 1) · (1, b)) = φ((b, b)) = φ((1, 1)) = 1,

where we used the fact that (b, b) ∼ (1, 1) and that φ is a ring homomorphism, so it sends the multiplicative
identity to 1 ∈ K. Thus, for every b ∈ R with b ≠ 0, we have

φ((1, b)) = ψ(b)^{−1}.

Therefore, for any a, b ∈ R with b ≠ 0, we have

φ((a, b)) = φ((a, 1) · (1, b)) = φ((a, 1)) · φ((1, b)) = ψ(a) · ψ(b)^{−1} = ϕ((a, b)),

so φ = ϕ.
Corollary 12.4.7. Suppose that R is an integral domain which is a subring of a field K. If every element
of K can be written as ab^{−1} for some a, b ∈ R with b ≠ 0, then K ≅ Frac(R).
Proof. Let ψ : R → K be the inclusion map ψ(r) = r, and notice that ψ is an injective ring homomorphism. By
the proof of the previous result, the function ϕ : Frac(R) → K defined by