ALGEBRA 1 (MATH 235)

COURSE NOTES
FALL 2023
VERSION: October 5, 2023

EYAL Z. GOREN,
MCGILL UNIVERSITY
[email protected]

© All rights reserved to the author.

These notes will be updated and corrected throughout the semester. I will mark with z the point I got to in
relation to revisions. If you find any mistakes kindly bring them to my attention, especially if they are in the
part that was already revised. Thanks, E.G.
Contents

Part 1. Some Language and Notation of Mathematics

1. Sets
1.1. First definitions
1.2. Algebra of set operations
2. Proofs: ideas and techniques
2.1. Proving equality by two inequalities
2.2. Proof by contradiction and the contrapositive
2.3. Proof by Induction
2.4. Prove or disprove
2.5. The pigeonhole principle
3. Functions
3.1. Injective, surjective, bijective and inverse image
3.2. Composition of functions
3.3. The inverse function
4. Cardinality of a set
4.1. Cantor’s diagonal argument
5. Relations
6. Number systems
6.1. The polar representation
6.2. The complex exponential function
6.3. The Fundamental Theorem of Algebra
7. Fields and rings - definitions and first examples
7.1. Some formal consequences of the axioms
8. Exercises

Part 2. Arithmetic in Z
9. Division, GCD and the Euclidean Algorithm
9.1. Division with residue
9.2. Division
9.3. GCD
9.4. The Euclidean algorithm
10. Primes and unique factorization
10.1. Primes
10.2. Applications of the Fundamental Theorem of Arithmetic
11. Exercises

Part 3. Congruences and modular arithmetic

12. Congruence relations
12.1. Fermat’s little theorem
12.2. Solving equations in Z/nZ
12.2.1. Linear equations
12.2.2. Quadratic equations
12.3. Public key cryptography; the RSA method
13. Exercises

Part 4. Polynomials and their arithmetic

14. The ring of polynomials
15. Division with residue
16. Arithmetic in F[x]
16.1. Some remarks about divisibility in a commutative ring T
16.2. GCD of polynomials
16.3. The Euclidean algorithm for polynomials
16.4. Irreducible polynomials and unique factorization
16.5. Roots
16.6. Eisenstein’s criterion
16.7. Roots of polynomials in Z/pZ
17. Exercises

Part 5. Rings
18. Some basic definitions and examples
19. Ideals
20. Homomorphisms
20.1. Units
21. Quotient rings
21.1. The quotient ring F[x]/(f(x))
21.2. Every polynomial has a root in a bigger field
21.3. Roots of polynomials over Z/pZ
22. The First Isomorphism Theorem
22.1. Isomorphism of rings
22.2. The First Isomorphism Theorem
22.3. The Chinese Remainder Theorem
22.3.1. Inverting Z/mnZ → Z/mZ × Z/nZ
23. Prime and maximal ideals
24. Exercises

Part 6. Groups
25. First definitions and examples
25.1. Definitions and some formal consequences
25.2. Examples
25.3. Subgroups
26. Permutation groups and dihedral groups
26.1. Permutation groups
26.2. Cycles
26.3. The Dihedral group
27. The theorem of Lagrange
27.1. Cosets
27.2. Lagrange’s theorem
28. Homomorphisms and isomorphisms
28.1. Homomorphisms of groups
28.2. Isomorphism
29. Group actions on sets
29.1. Basic definitions
29.2. Basic properties
29.3. Some examples
30. The Cauchy-Frobenius Formula
30.1. Some applications to Combinatorics
31. Cauchy’s theorem: a wonderful proof
32. The first isomorphism theorem for groups
32.1. Normal subgroups
32.2. Quotient groups
32.3. The first isomorphism theorem
32.4. Groups of low order
32.4.1. Groups of order 1
32.4.2. Groups of order 2, 3, 5, 7
32.4.3. Groups of order 4
32.4.4. Groups of order 6
32.5. Odds and evens
32.6. Odds and Ends
32.7. Exercises

Part 7. Appendices

Appendix A: The Cantor-Bernstein Theorem
Appendix B: The irrationality of e
Appendix C: Euclidean rings
Appendix D: The complex exponential function

Index

Introduction.
It is important to realize that Algebra started in antiquity as an applied science. Like every science, it was born
of necessity; in this case, the need to solve everyday problems, some of which are mentioned below. Since the
19th century, if not earlier, there has been a growing side to Algebra which is purely theoretical. However, it
is important to realize that some of the most abstract algebraic structures of the past, such as Galois fields
(nowadays simply called finite fields), have in modern times become part of the foundation of certain branches
of applied algebra, for example in applications to coding theory and cryptography. The lesson (and please
repeat it to any politician you happen to meet) is that what is at present considered “pure” and what is
considered “applied” reflects only our temporary perspective, and in time many of the pure, seemingly useless, branches
of mathematics turn out to have critical relevance to real-world applications.

The word “algebra” is derived from the title of a book - Hisab al-jabr w’al-muqabala - written by the Persian
scholar Abu Ja’far Muhammad ibn Musa Al-Khwarizmi (790 - 840 AD). The word al-jabr itself comes from
the root of reunite (in the sense of completing or putting together) and refers to one of the methods of
solving quadratic equations (by completing the square) described in that book. The book can be considered
as the first treatise on algebra. The word algorithm is in fact derived from the name Al-Khwarizmi. The
book was very much concerned with methods for solving known practical problems; Al-Khwarizmi intended to
teach (in his own words) “... what is easiest and most useful in arithmetic, such as men constantly require
in cases of inheritance, legacies, partition, lawsuits, and trade, and in all their dealings with one another, or
where the measuring of lands, the digging of canals, geometrical computations, and other objects of various
sorts and kinds are concerned.”¹

Algebra I is a first course in Algebra. Little is assumed in the way of background. Though the course is
self-contained, it puts some of the responsibility of digesting and exploring the material on the student, as is
normal in university studies. You’ll soon realize that we are also learning a new language in this course and a
new attitude towards mathematics. The language is the language of modern mathematics; it is very formal,
precise and concise. One of the challenges of the course is digesting and memorizing the new concepts and
definitions. The new attitude is an attitude where any assumptions one is making while making an argument
have to be justified, or at least clearly stated as a postulate, and from there on one proceeds in a logical and
clear manner towards the conclusion. This is called proof and one of the main challenges in the course is to
understand what constitutes a good proof and to be able to write proofs yourself.² A further challenge for
most students is that the key ideas we learn in this course are very abstract, bordering on philosophy and art,
yet they are truly scientific in their precision. You should expect to not understand everything right away;
you should expect to need time to reflect on the meaning of the new ideas and concepts that we introduce.

Here are some pointers as to how to cope with the challenges of this course:
• Read the class notes and the textbook over and over again. Try and give yourself examples of the
theorems and propositions and try and provide counterexamples when some of the hypotheses are
dropped.
• Do lots and lots of exercises. The more, the better.
• This textbook attempts to be lean and so everything in it is important. Examples and exercises may
contain important observations and may be referred to later.
• Explain to your friends, and possibly to your family, the material of the course.³ Work together with
your classmates on assignments, but write your own solutions in the end; try to understand different
solutions to assignments and try and find flaws in your friends’ solutions.
• Use the instructor’s and the TA’s office hours, as well as the math help center, to quickly close any
gap and clarify any point you’re not sure about.

¹ Cited from http://www-groups.dcs.st-and.ac.uk/~history/Biographies/Al-Khwarizmi.html.
² To quote former Prime Minister Jean Chrétien, “The proof is the proof and when you have a good proof it is proven”.
³ Unless you start noticing that you are becoming less and less popular.

So what is this course really about?

We are going to start by learning some of the notation and language of mathematics. We are going to discuss
sets and functions and various properties and operations one can perform on those. We are going to talk
about proofs and some techniques of proof, such as “induction” and “proving the contrapositive”. We are
going “to split infinities”; some infinities are more infinite than others.
We are going to discuss different structures in which one can do arithmetic, such as rational, real and
complex numbers, polynomial rings and yet more abstract systems called rings and fields. We are going to
see unifying patterns of such systems and some of their applications. For example, a finite field, perhaps
initially a strange beast, is a key notion in modern-day computer science; it is a concept absolutely essential
and fundamental for cryptographic schemes as well as data management.
We are then going to do some really abstract algebra for a while, learning about rings and homomorphisms
and ideals. Our motivation is mostly to know how to construct a finite field and work with it in practice. We
also lay the basis for further study in the following algebra courses.
The final section of the course deals with groups and group actions on sets. After the previous section on
rings and fields we’ll feel more comfortable with the abstract notions of group theory. Furthermore, there are
going to be plenty of concrete examples in this section and applications to problems in combinatorics and to
the study of symmetries. Here is one concrete example: imagine that a jewelry company wants, as a publicity
stunt, to display all necklaces one can make using 10 diamonds and 10 rubies, perhaps under the (banal)
slogan “to each their own". In each necklace 10 diamonds and 10 rubies are to be used. The necklace itself
is just round, with no hanging or protruding parts. Thus, we can provide an example of such a necklace as
DDRRDRDRRRDDDDRRDRDR
(where the last R is adjacent to the first D). Now, when we consider a particular design such as above, we
do want to identify it with the following design
DRRDRDRRRDDDDRRDRDRD
(we’ve put the first D at the last spot), because this is just the same pattern; if the necklace is put on a
table, it is just rotating it a bit, or, alternately, looking at it from a slightly different angle. Also, note that
the pattern
DDRRDRDRRRDDDDRRDRDR
is identified with
RDRDRRDDDDRRRDRDRRDD
which corresponds to flipping over the necklace as the second sequence is the first sequence read from right
to left. Now the question is how many rubies and diamonds we need to purchase in order to make all the
different designs? It turns out that this can be approached using the theory of group actions on sets and a
general formula we’ll develop can be applied here. It turns out that there are 4752 such different designs;
that will require 47520 diamonds and 47520 rubies. Perhaps the idea should be reconsidered ;-)
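
The count of 4752 can be reproduced with a short computation. The following is only a sketch, assuming the orbit-counting idea behind the group-action formula mentioned above: average, over all rotations and flips of the necklace, the number of designs that the symmetry leaves unchanged. The function names are hypothetical and chosen for illustration.

```python
def cycle_lengths(perm):
    """Lengths of the cycles of a permutation given as a list i -> perm[i]."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            lengths.append(length)
    return lengths


def fixed_designs(lengths, diamonds):
    """Number of designs with exactly `diamonds` D's that are constant on every cycle.

    This counts the ways to choose cycles whose lengths add up to `diamonds`.
    """
    ways = [0] * (diamonds + 1)
    ways[0] = 1
    for length in lengths:
        for total in range(diamonds, length - 1, -1):
            ways[total] += ways[total - length]
    return ways[diamonds]


def count_necklaces(diamonds, rubies):
    """Designs up to rotation and flipping, by averaging fixed designs over all symmetries."""
    n = diamonds + rubies
    symmetries = []
    for k in range(n):
        symmetries.append([(i + k) % n for i in range(n)])  # rotation by k places
        symmetries.append([(k - i) % n for i in range(n)])  # a flip
    total = sum(fixed_designs(cycle_lengths(p), diamonds) for p in symmetries)
    return total // len(symmetries)


print(count_necklaces(10, 10))
```

For 10 diamonds and 10 rubies there are 40 symmetries, and the average comes out to 4752, in agreement with the figure quoted in the text.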

Part 1. Some Language and Notation of Mathematics

1. Sets
1.1. First definitions. A set is a collection of elements. The notion of a set is logically not quite defined
(what’s a “collection"? an “element"?) but, hopefully, it makes sense to us. What we have is the ability to
say whether an element is a member of a set or not. Thus, in a sense, a set is a property, and its elements
are the objects having that property (the property is to be in the set).⁴
There are various ways to define sets:
(1) By writing it down:
S = {1, 3, 5}.
The set is named S and its elements are 1, 3 and 5. The use of curly brackets is mandatory! Another
example is
T = {2, 3, Jim’s football}.
This is a set whose elements are the numbers 2, 3 and Jim’s football. It is assumed here that “Jim"
refers to one particular individual.
A set can also be given as all objects with a certain property:
S1 = {all beluga whales}.
Another example is
T5 = {n : n is an odd integer, n³ = n}.
The colon means that the part that follows is the list of properties n must satisfy, i.e. the colon is
shorthand for “such that”. Note that this set is equal to the set
U⁺ = {n : n² = 1}.
Our eccentric notation T5, S1, U⁺ is just to make a point that a set can be denoted in many ways.
(2) Sometimes we write a set where the description of its elements is implicit, to be understood by the
reader. For example:
N = {0, 1, 2, 3, . . . }, Z = {. . . , −2, −1, 0, 1, 2, . . . },
and
Q = {a/b : a, b ∈ Z, b ≠ 0}.
Thus N is the set of natural numbers, Z is the set of integers and Q the set of rational numbers.⁵
The use of the letters N, Z, Q is standard. Other standard notation is
R = the set of real numbers ( = points on the line),
and the complex numbers
C = {a + bi : a, b ∈ R}.
Here i is the imaginary number satisfying i² = −1 (we’ll come back to that in §6). Note that we
sneaked in new notation. If A is a set, the notation x ∈ A means x is an element (a member) of A,
while x ∉ A means that x is not an element of A. Thus, the expression C = {a + bi : a, b ∈ R} is
saying that C is the set whose elements are a + bi, where a and b are real numbers; these are formal
expressions and it is not assumed that we know how to add a to bi. For example, 1 + i, √3 + πi are
complex numbers. Every real number r is a complex number that we still write as r instead of the
more formal expression r + 0i. For example, in the notation above, 3 ∈ S, 2 ∉ S, Jim’s football ∈ T
but ∉ U⁺, i ∈ C but i ∉ R.
We haven’t really defined any of these sets rigorously. We have assumed that the reader understands
what we mean. This suffices for the level of this course. A rigorous treatment is usually
given in a logic course for N (constructed via the Peano axioms) or in an analysis course for R (via
Dedekind cuts). The set of real numbers can also be thought of as the set of all numbers written in
a possibly infinite decimal expansion. Thus, 1, 2, 1/3 = 0.33333 . . . and indeed any rational number
is an element of R, as are π = 3.1415926 . . . , e = 2.718281 . . . , √2 = 1.414 . . . and so on. In fact
π, e and √2 are not rational numbers; we will prove that √2 is not rational in Proposition 10.2.4,
and that e is irrational in Appendix B. The proof that π is irrational is much harder.

⁴ Like some upscale gentlemen’s clubs, they are defined as much by those that aren’t members as by those that are.
⁵ For us 0 is a natural number, that is, we include it in N, but some authors do not. The letter Z comes from “Zahlen” meaning
“numbers” in German and Q comes from “quotient”.

Two sets A, B, are equal, A = B, if they have the same elements, that is, if every element of A is an element
of B and vice-versa. Thus, for example, A = {1, 2} is equal to B = {2, 1} (the order doesn’t matter), and
A = {−1, 1} is equal to B = {x ∈ R : x² − 1 = 0} (the description doesn’t matter). Also A = {1, −1} is
equal to B = {1, 1, −1, 1} (repetitions do not matter, either). We say that
A ⊆ B,
(A is contained in B), or simply A ⊂ B, if every element of A is an element of B. For example N ⊂ Z.
Some authors use A ⊂ B to mean A is contained in B but not equal to it. We do not follow this convention
and for us A ⊂ B allows A = B. If we want to say that A is contained in B and not equal to it, we shall use
A ⊊ B. Note that A = B holds precisely when both A ⊂ B and B ⊂ A.
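
The conventions above can be tried out on a computer. The sketch below assumes Python's built-in `set` type as a stand-in for finite sets; the helper names `is_subset` and `sets_equal` are hypothetical, and the real numbers are replaced by a finite window of integers because a program cannot scan all of R.

```python
# Python sets ignore order and repetition, like the sets in the text.
A = {1, 2}
B = {2, 1}
print(A == B)                     # order does not matter
print({1, -1} == {1, 1, -1, 1})   # repetitions do not matter

# A set given by a property, searched over a finite window of integers.
roots = {x for x in range(-10, 11) if x * x - 1 == 0}
print(roots == {-1, 1})           # the description does not matter


def is_subset(A, B):
    """A ⊂ B in the sense of the notes: every element of A is in B (equality allowed)."""
    return all(x in B for x in A)


def sets_equal(A, B):
    """Equality checked as two inclusions, A ⊂ B and B ⊂ A."""
    return is_subset(A, B) and is_subset(B, A)


print(is_subset({1}, {1, 2}), is_subset({1, 2}, {1}))
print(sets_equal({1, 2}, {2, 1}))
```

The function `sets_equal` mirrors the last remark: equality holds exactly when both inclusions hold.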
The notation
∅
stands for the empty set. It is a set but it has no elements. Admittedly, that sounds funny... the logic behind
it is that we want the intersection of sets to always be a set. We let
A ∩ B = { x : x ∈ A and x ∈ B}
be the intersection of A and B, the set of common elements, and we let
A ∪ B = { x : x ∈ A or x ∈ B}
be the union of A and B. For example, {1, 3} ∩ {n : n² = n} = {1}, N ∩ {x : −x ∈ N} = {0}, S1 ∩ T5 =
∅.⁶
We shall also need arbitrary unions and intersections. Let I be a non-empty set (thought of as an index
set) and suppose that for each i ∈ I we are given a set Ai . Then
∩i ∈ I A i = { x : x ∈ A i , ∀ i },
(∀ means “for all") is the set of elements belonging to each Ai , and
∪i∈ I Ai = { x : x ∈ Ai , for some i },
is the set of elements appearing in at least one Ai . For example, define for i ∈ Z,
Ai = { x ∈ Z : x ≥ i }
(so A−1 = {−1, 0, 1, 2, 3, 4, . . . }, A0 = {0, 1, 2, 3, . . . }, A1 = {1, 2, 3, 4, . . . }, A2 = {2, 3, 4, 5, . . . } and so on).
Then ∪i∈Z Ai = Z, while ∩i∈Z Ai = ∅.
Here is another example: for every real number x with 0 ≤ x ≤ 1, define
Sx = {( x, y) : 0 ≤ y ≤ x, y ∈ R}.
Then ∪0≤ x≤1 Sx is the triangular area in the plane whose vertices are (0, 0), (1, 0), (1, 1).

Yet another operation on sets is the difference of sets:

A \ B = {x : x ∈ A, x ∉ B}.
We will also use the notation A − B instead of A \ B. We remark that A ∪ B = B ∪ A, A ∩ B = B ∩ A
but A \ B is not equal to B \ A, unless A = B.
⁶ Now wrap your mind around this: We may form sets whose elements are sets themselves. For example, if A1 = {1, 2}, A2 = {3, 4}
then we can form the set S := {A1, A2}. The set S has precisely two elements, namely, A1 and A2. Note that 3, for example,
is not an element of S. So far so good. But consider the set T = {∅}. The set T is not empty - it has precisely one element.
That element is the empty set. Likewise, the set {∅, {∅}} has two elements. One element is the empty set, the other is the
set T. As T is not the empty set, these two elements are indeed distinct. Thus, we have created matter out of vacuum.

A good way to decipher formulas involving two or three sets is by diagrams. For example:

The last diagram shows clearly that A ∪ B ∪ C − (A ∩ B ∩ C) = (A − C) ∪ (B − A) ∪ (C − B), though that
is not considered a proof!
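
Like a diagram, a computer check is not a proof either, but it is a quick sanity test of such a formula. The sketch below assumes a small universe {0, 1, 2} and tries every choice of A, B, C among its subsets; the function names are hypothetical.

```python
from itertools import combinations


def all_subsets(universe):
    """Every subset of a finite universe, as a list of Python sets."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def identity_holds(A, B, C):
    """Compare the two sides of A ∪ B ∪ C − (A ∩ B ∩ C) = (A − C) ∪ (B − A) ∪ (C − B)."""
    left = (A | B | C) - (A & B & C)
    right = (A - C) | (B - A) | (C - B)
    return left == right


def check_identity(universe):
    """True if the identity holds for every triple of subsets of the universe."""
    subsets = all_subsets(universe)
    return all(identity_holds(A, B, C)
               for A in subsets for B in subsets for C in subsets)


print(check_identity({0, 1, 2}))
```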

Another definition we shall often use is that of the cartesian product. Let A1 , A2 , . . . , An be sets. Then
A1 × A2 × · · · × An = {( x1 , x2 , . . . , xn ) : xi ∈ Ai , for 1 ≤ i ≤ n}.
In particular,
A × B = {( a, b) : a ∈ A, b ∈ B}.
Example 1.1.1. Let A = {1, 2, 3}, B = {1, 2}. Then
A × B = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)}.
Note that (3, 1) ∈ A × B but (1, 3) ∉ A × B, because although 1 ∈ A, 3 ∉ B.
Example 1.1.2. Let A = B = R. Then
A × B = {( x, y) : x ∈ R, y ∈ R}.
This is just the presentation of the plane in cartesian coordinates (which explains why we call such products
“cartesian" products).
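
Example 1.1.1 can be reproduced mechanically. The sketch below assumes Python's standard `itertools.product`, which enumerates ordered pairs (a, b) with a taken from the first argument and b from the second; the variable name `AxB` is hypothetical.

```python
from itertools import product

A = {1, 2, 3}
B = {1, 2}

# All ordered pairs (a, b) with a in A and b in B.
AxB = set(product(A, B))

print(sorted(AxB))
print((3, 1) in AxB)   # 3 ∈ A and 1 ∈ B
print((1, 3) in AxB)   # fails because 3 ∉ B
```

The product has 3 · 2 = 6 elements, one for each choice of a first coordinate from A and a second coordinate from B.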

1.2. Algebra of set operations. Up till now we have only introduced notation and vocabulary. With the
exception of the notion of the empty set, none of what we have said so far has any depth. We now wish to make
general statements relating some of these operations. Once a statement is important enough to highlight
it, it falls under the heading of a Lemma, a Proposition or a Theorem (or, more colloquially, a Claim, an
Assertion and so on). Usually, “Lemma" is reserved for technical statements often to be used in the proof
of a proposition or a theorem. “Proposition" and “Theorem" are more or less the same. They are used for
claims that are more conceptual, or central, with “Theorem" implying even more importance. However, none
of these rules is absolute. For example, consider the following proposition:
Proposition 1.2.1. Let I be a set. Let A be a set and Bi , i ∈ I, be sets as well, then
A ∩ (∪i∈ I Bi ) = ∪i∈ I ( A ∩ Bi ),
and
A ∪ (∩i∈ I Bi ) = ∩i∈ I ( A ∪ Bi ).
Furthermore,
A \ (∪i∈ I Bi ) = ∩i∈ I ( A \ Bi ),

and
A \ (∩i∈ I Bi ) = ∪i∈ I ( A \ Bi ).

We will prove parts of this Proposition below and the rest is left as an exercise.
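
Before proving anything, one can gain some confidence in Proposition 1.2.1 by testing it on examples. The sketch below is an experiment, not a proof: it assumes a finite, non-empty index set (the family is a Python list with at least one member) and random subsets of {0, ..., 7}. All function names are hypothetical.

```python
import random


def union_all(sets):
    """Union of a list of sets."""
    result = set()
    for s in sets:
        result |= s
    return result


def intersection_all(sets):
    """Intersection of a non-empty list of sets."""
    result = set(sets[0])
    for s in sets[1:]:
        result &= s
    return result


def proposition_holds(A, family):
    """Check the four identities of Proposition 1.2.1 for A and the sets in `family`."""
    return (
        (A & union_all(family)) == union_all([A & B for B in family])
        and (A | intersection_all(family)) == intersection_all([A | B for B in family])
        and (A - union_all(family)) == intersection_all([A - B for B in family])
        and (A - intersection_all(family)) == union_all([A - B for B in family])
    )


def random_subset(rng, universe):
    """Keep each element of the universe with probability 1/2."""
    return {x for x in universe if rng.random() < 0.5}


def run_trials(trials=200, seed=0):
    """True if the proposition holds in every random trial."""
    rng = random.Random(seed)
    universe = range(8)
    for _ in range(trials):
        A = random_subset(rng, universe)
        family = [random_subset(rng, universe) for _ in range(rng.randint(1, 4))]
        if not proposition_holds(A, family):
            return False
    return True


print(run_trials())
```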

2. Proofs: ideas and techniques

Proposition 1.2.1 is not obvious. It is not even clear at first sight whether it’s correct. For that reason we
insist on proofs in mathematics. Proofs give us confidence that we are making true statements and they
reveal, to a lesser or greater extent, why the statements hold true. The proof should demonstrate that the
statements made are true. In fact, in French the commonly used word for a proof is démonstration; in the
past “demonstration” was often used instead of “proof” in English as well (though today this usage is very rare),
and the famous QED at the end of a proof stands for “quod erat demonstrandum”, meaning “which was to
be demonstrated”. We shall prove some of the statements in the proposition now, leaving the rest as an
exercise. Our method of proof for Proposition 1.2.1 is by a standard technique.

2.1. Proving equality by two inequalities. When one wants to show that two real numbers x, y are equal,
it is often easier to show instead that x ≤ y and y ≤ x and to conclude that x = y.
In the same spirit, to show two sets A and B are equal, one may show that every element of A is an
element of B and that every element of B is an element of A. That is, we prove two “inequalities", A ⊆ B
and B ⊆ A. Thus, our principle of proof is
A=B
if and only if
x ∈ A ⇒ x ∈ B and x ∈ B ⇒ x ∈ A.
(The notation ⇒ means “implies that".)

Let us now prove the statement A ∩ (∪i∈ I Bi ) = ∪i∈ I ( A ∩ Bi ). The way you should write it in an assignment,
a test, or a research paper is as follows:

Proposition 2.1.1. Let I be a set. Let A and Bi , i ∈ I, be sets then

A ∩ (∪i∈ I Bi ) = ∪i∈ I ( A ∩ Bi ).

Proof. Let x ∈ A ∩ (∪i∈ I Bi ); then x ∈ A and x ∈ ∪i∈ I Bi . That is, x ∈ A and x ∈ Bi0 for some i0 ∈ I.
Then x ∈ A ∩ Bi0 and so x ∈ ∪i∈ I ( A ∩ Bi ). We have shown so far that A ∩ (∪i∈ I Bi ) ⊂ ∪i∈ I ( A ∩ Bi ).
Conversely, let x ∈ ∪i∈ I ( A ∩ Bi ). Then, there is some i0 ∈ I such that x ∈ A ∩ Bi0 and so for that i0 we
have x ∈ A and x ∈ Bi0 . In particular, x ∈ A and x ∈ ∪i∈ I Bi and so x ∈ A ∩ (∪i∈ I Bi ). □

(The symbol □ designates that the proof is complete. It is equivalent to writing QED.)

Let’s also do
A \ (∪i∈ I Bi ) = ∩i∈ I ( A \ Bi ).
We use the same technique. Let x ∈ A \ (∪i∈ I Bi ); thus x ∈ A and x ∉ ∪i∈ I Bi . That means that x ∈ A and
for all i ∈ I we have x ∉ Bi . That is, for all i ∈ I we have x ∈ A \ Bi and so x ∈ ∩i∈ I ( A \ Bi ).
Conversely, let x ∈ ∩i∈ I ( A \ Bi ). Then, for all i ∈ I we have x ∈ A \ Bi . That is, x ∈ A and x ∉ Bi for
every i. Thus, x ∈ A and x ∉ ∪i∈ I Bi and it follows that x ∈ A \ (∪i∈ I Bi ).

2.2. Proof by contradiction and the contrapositive. Proof by contradiction is a very useful technique,
even though using it too often shows lack of deeper understanding of the subject. Suppose that some
statement is to be proven true. In this technique one assumes that the statement is false and then one
proceeds to derive logical consequences of this assumption until an obvious contradiction arises. Here is an
easy example that illustrates this:

Claim. There is no solution to the equation x² − y² = 1 in positive integers.

Proof. Assume not. Then there are positive integers x, y such that x² − y² = 1. Then, (x − y)(x + y) = 1.
However, the only ways to write 1 as a product of two integers are 1 × 1 and (−1) × (−1) and, in either case, it follows that x − y =
x + y. It follows that 2y = (x + y) − (x − y) = 0 and so that y = 0. Contradiction (because we have
assumed both x and y are positive).

Claim. If x and y are two integers whose sum is odd, then exactly one of them is odd.
Proof. Suppose not. Then, either both x and y are odd, or both x and y are even. In the first case x =
2a + 1, y = 2b + 1, for some integers a, b, and x + y = 2( a + b + 1) is even; contradiction. In the second
case, x = 2a, y = 2b, for some integers a, b, and x + y = 2( a + b) is even; contradiction again.

Somewhat related is the technique of proving the contrapositive. Let A and B be two assertions and
let ¬ A, ¬ B be their negation. Logically the implication
A⇒B
is equivalent to
¬ B ⇒ ¬ A.
Here is an example. Let A be the statement “it rains" and B the statement “it’s wet outside". Then ¬ A
is the statement “it doesn’t rain" and ¬ B is the statement “it’s dry outside". The meaning of A ⇒ B is “it
rains therefore it’s wet outside" and its contrapositive is “it’s dry outside therefore it doesn’t rain". Those
statements are equivalent. Here is a mathematical example:

Claim. If x and y are integers such that xy is even then either x or⁷ y is even.
Proof. The contrapositive is: If both x and y are odd then xy is odd. To prove that, write x = 2a + 1, y =
2b + 1 for some integers a, b. Then xy = 4ab + 2a + 2b + 1 = 2(2ab + a + b) + 1, is one more than an even
integer and so is odd.
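
Both points of this subsection can be checked by machine on small cases. The sketch below first runs through the four truth-value combinations to confirm that A ⇒ B and ¬B ⇒ ¬A always agree, and then tests the claim and its contrapositive on a finite range of integers (an assumption made for illustration; a finite test is of course not a proof). The function names are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Truth value of the implication p ⇒ q."""
    return (not p) or q


# A ⇒ B and ¬B ⇒ ¬A have the same truth value in all four cases.
equivalent = all(
    implies(a, b) == implies(not b, not a)
    for a, b in product([False, True], repeat=2)
)
print(equivalent)


def claim_holds(x, y):
    """If xy is even then x or y is even."""
    return implies((x * y) % 2 == 0, x % 2 == 0 or y % 2 == 0)


def contrapositive_holds(x, y):
    """If both x and y are odd then xy is odd."""
    return implies(x % 2 == 1 and y % 2 == 1, (x * y) % 2 == 1)


window = range(-20, 21)
print(all(claim_holds(x, y) and contrapositive_holds(x, y)
          for x in window for y in window))
```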

2.3. Proof by Induction. Induction is perhaps the most fun technique. Its logical foundations also lie deeper
than the previous methods. The principle of induction, to be explained below, rests on the following axiom:

Axiom: Every non-empty subset of N has a minimal element.

We remark that the axiom is intuitively obvious. The reason we state it as an axiom is that
when one develops the theory of sets in a very formal way from fundamental axioms the assertion stated
above doesn’t follow from simpler axioms and, in one form or another, has to be included as an axiom.
Theorem 2.3.1. (Principle of Induction) Let n0 be a given natural number. Suppose that for every natural
number n ≥ n0 we are given a statement Pn . Suppose that we know that:
(1) Pn0 is true.
(2) If Pn is true then Pn+1 is true.
Then Pn is true for every n ≥ n0 .
⁷ Note that in mathematics “or” always means “and/or”.

Proof. Suppose not. Then the set

S = {n ∈ N : n ≥ n0 , Pn is false}
is a non-empty set and therefore has a minimal element a. Note that a > n0 , because Pn0 is true by (1), and
so a − 1 ≥ n0 . Now a − 1 6∈ S because of the minimality of a and so Pa−1 is true. But then, by (2), also Pa
is true. Contradiction.
Remark 2.3.2. This proof is a gem of logic. Understand it and memorize it.
Remark 2.3.3. The mental picture I have of induction is a picture of a staircase. The first step is marked n0
and each higher step by the integers n0 + 1, n0 + 2, n0 + 3, . . . . I know I can reach that first step and I know
that if I can reach a certain step (say marked n) then I can reach the next one (marked n + 1). I conclude
that I can reach every step.

Example 2.3.4. Prove that for every positive integer n we have

\[ 1 + 2 + \cdots + n = \frac{n(n+1)}{2}. \]
(The statement $P_n$ is $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ and it is made for $n \ge 1$, that is, $n_0 = 1$.)
The first case is when n = 1 (this is called the base case of the induction). In this case we need to show
that $1 = \frac{1 \cdot (1+1)}{2}$, which is obvious.
Now, we assume that the statement is true for n, that is, we assume that
\[ 1 + 2 + \cdots + n = \frac{n(n+1)}{2}, \]
and we need to show it’s true for n + 1. That is, we need to prove that
\[ 1 + 2 + \cdots + n + (n+1) = \frac{(n+1)(n+2)}{2}. \]
(By achieving that we would have shown that if $P_n$ is true then $P_{n+1}$ is true.) We use the assumption that
it’s true for n (that’s called the induction hypothesis) and write
\[
\begin{aligned}
1 + 2 + \cdots + n + (n+1) &= \frac{n(n+1)}{2} + n + 1 \\
&= \frac{n(n+1) + 2(n+1)}{2} \\
&= \frac{n^2 + 3n + 2}{2} \\
&= \frac{(n+1)(n+2)}{2}.
\end{aligned}
\]
Thus, we have shown that $P_{n+1}$ is true as well and the proof is complete.
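
The formula, and the algebra of the induction step, can be spot-checked numerically. The sketch below assumes nothing beyond integer arithmetic; the function names are hypothetical. Checking finitely many n does not replace the induction argument, which covers all n at once.

```python
def sum_first(n):
    """1 + 2 + ... + n computed by brute force."""
    return sum(range(1, n + 1))


def formula(n):
    """The closed form n(n+1)/2; the product n(n+1) is always even."""
    return n * (n + 1) // 2


def induction_step_identity(n):
    """The step from P_n to P_(n+1): adding n+1 to n(n+1)/2 gives (n+1)(n+2)/2."""
    return formula(n) + (n + 1) == formula(n + 1)


print(formula(1) == 1)                                           # base case
print(all(sum_first(n) == formula(n) for n in range(1, 200)))
print(all(induction_step_identity(n) for n in range(1, 200)))
```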
Example 2.3.5. Here we prove the following statement: Let q ≠ 1 be a real number. Then for every n ∈ N
we have
\[ 1 + q + \cdots + q^n = \frac{1 - q^{n+1}}{1 - q}. \]
The statement $P_n$ is “for every real number $q \neq 1$, $1 + q + \cdots + q^n = \frac{1 - q^{n+1}}{1 - q}$”. The base case, that is, the
first n for which the statement is claimed true, is n = 0; the statement is then
\[ 1 = \frac{1 - q}{1 - q}, \]
which is obviously true.
Now suppose that the statement is true for n. That is, suppose that
\[ 1 + q + \cdots + q^n = \frac{1 - q^{n+1}}{1 - q}. \]
We need to show that
\[ 1 + q + \cdots + q^{n+1} = \frac{1 - q^{n+2}}{1 - q}. \]
Indeed,
\[
\begin{aligned}
1 + q + \cdots + q^{n+1} &= (1 + q + \cdots + q^n) + q^{n+1} \\
&= \frac{1 - q^{n+1}}{1 - q} + q^{n+1} \\
&= \frac{1 - q^{n+1}}{1 - q} + \frac{(1 - q)\, q^{n+1}}{1 - q} \\
&= \frac{1 - q^{n+1}}{1 - q} + \frac{q^{n+1} - q^{n+2}}{1 - q} \\
&= \frac{1 - q^{n+2}}{1 - q}.
\end{aligned}
\]
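
The geometric sum formula can be tested with exact arithmetic. The sketch below assumes rational values of q (the statement is about real q, but rationals allow exact comparison via Python's `fractions.Fraction`); the function names are hypothetical.

```python
from fractions import Fraction


def geometric_sum(q, n):
    """1 + q + ... + q^n computed term by term."""
    return sum(q ** k for k in range(n + 1))


def closed_form(q, n):
    """(1 - q^(n+1)) / (1 - q), valid for q different from 1."""
    return (1 - q ** (n + 1)) / (1 - q)


samples = [Fraction(1, 2), Fraction(-3), Fraction(5, 7), Fraction(2), Fraction(0)]
print(all(geometric_sum(q, n) == closed_form(q, n)
          for q in samples for n in range(15)))
```

Using fractions rather than floating-point numbers avoids rounding errors, so the two sides can be compared with exact equality.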
Here are further examples of statements that are easy to prove by induction. (i) For n ≥ 0, n² ≤ 4ⁿ. (ii)
1 + 3 + · · · + (2n − 1) = n², for n ≥ 1. More thought is required for the following: (iii) Consider the following
scenario: there are n people at a party and when the bell rings each is supposed to throw a pie at the person
closest to him or her. (Well, things were getting wild...) Assume that all the distances between the people
are mutually distinct. If n is odd then at least one person is not going to be hit by a pie.

2.4. Prove or disprove. A common exercise, and a situation one often faces in research, is to prove or
disprove a particular statement. For example,
“Prove or disprove: for every natural number n, 4n + 1 is either a square or a sum of two squares."
At that point you are requested first to form a hunch, a guess, an opinion about whether the statement
is true or false. To form that hunch you can try some examples (4 · 0 + 1 = 1 = 1², 4 · 1 + 1 = 5 =
1² + 2², 4 · 2 + 1 = 9 = 3², 4 · 3 + 1 = 13 = 2² + 3², ...) to see if the statement holds for these examples,
or you can think whether the statement is similar to other statements you know to hold true/false, or, when
at a loss, toss a coin. After deciding on your initial position, if you believe the statement is true you should
proceed to find a proof. If you don’t, then you have two options. You can try and show that if the statement
is true it will imply a contradiction to a known fact, or you can provide one counterexample. The statement
being false doesn’t mean it’s false for every n; it means it’s false for at least one n. In the case at hand,
if we take n = 5 we find that 4 · 5 + 1 = 21, which is neither a square nor a sum of two squares (just try all
possibilities) and so the statement is false.
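
“Trying all possibilities” is exactly the kind of task a short program does well. The sketch below searches for the first n at which the statement fails; it assumes that a square counts as 0² + b², so a single test covers both cases. The function names are hypothetical.

```python
from math import isqrt


def is_square_or_sum_of_two_squares(m):
    """True if m = a*a + b*b for some integers a, b >= 0 (a = 0 covers plain squares)."""
    for a in range(isqrt(m) + 1):
        rest = m - a * a
        b = isqrt(rest)
        if b * b == rest:
            return True
    return False


def first_counterexample(limit):
    """Smallest n <= limit for which 4n + 1 fails the test, or None if there is none."""
    for n in range(limit + 1):
        if not is_square_or_sum_of_two_squares(4 * n + 1):
            return n
    return None


print(first_counterexample(100))
```

The search stops at n = 5, that is at 4 · 5 + 1 = 21, the counterexample given in the text.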

2.5. The pigeonhole principle. The pigeonhole principle is simple to state, yet it is a powerful tool. It states
the following:⁸

If there are more pigeons than pigeonholes then in one pigeonhole there must be at least two pigeons.

⁸ To quote Clint Eastwood: “I know things about pigeons, Lily”.

Following are examples of applications of the pigeonhole principle. To appreciate its power, I recommend
reading the statement first and trying to come up with an independent proof, before reading the provided
proof.
Example 2.5.1. Let a, b, c, d, e, f be 6 integers. Then there are among them two integers whose difference is
divisible by 5.
To prove this consider the remainders of a, b, c, d, e, f upon division by 5. The remainder is either 0, 1, 2, 3
or 4 (these are the 5 pigeonholes) but we get from our numbers 6 remainders (those are the pigeons).
Therefore, there are two numbers among a, b, c, d, e, f with the same remainder. Say a and b have the same
remainder r. Then a − b is divisible by 5 (because a = 5a′ + r, b = 5b′ + r and so a − b = 5(a′ − b′)).
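
The argument of Example 2.5.1 is effectively an algorithm: drop each number into the pigeonhole labelled by its remainder and stop as soon as a pigeonhole is hit twice. A small sketch (the function name and the sample numbers are ours):

```python
def pair_with_difference_divisible_by(numbers, m=5):
    """Return two entries of `numbers` whose difference is divisible by m,
    or None if all remainders are distinct."""
    seen = {}                      # remainder -> first number seen with it
    for x in numbers:
        r = x % m                  # in Python, x % m lies in 0..m-1 for m > 0
        if r in seen:
            return seen[r], x
        seen[r] = x
    return None

a, b = pair_with_difference_divisible_by([3, 11, 20, 7, 14, 9])
print(a, b, (a - b) % 5)
```

With m + 1 or more integers the function can never return None, which is exactly the content of the pigeonhole principle here.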
A similar example is the following:
Example 2.5.2. Let n ≥ 2 be an integer. In any group of n people, there are two that have the same number
of friends within the group.
We prove that by induction on n. The case n = 2 is trivial. Let N ≥ 2; suppose the claim holds for all n with
2 ≤ n ≤ N, and consider a group of N + 1 people. If there is a person with 0 friends, we can look at the rest of the
people. This is a group of N people and the number of friends each has in this smaller group is the same as
in the original group. We can apply induction to conclude that two have the same number of friends (initially
in the smaller group, but in fact also in the larger group).
The other case is when each person in the group of N + 1 people has at least 1 friend. By considering
the number of friends each person in the group has, we get N + 1 integer values between 1 and N and so,
by the pigeonhole principle, two must be equal.

Here is another way to formulate the statement we have just proven. A graph is a collection of vertices and
edges connecting them. We say that a graph is simple if it has no loops and no multiple edges, namely, an
edge always goes from a vertex to a different vertex and between any two vertices there is at most one edge.
A graph is called finite if it has finitely many vertices. The degree of a vertex v is the number of edges one of
whose terminal points is v. What we proved is that in a finite simple graph two of the vertices must have the
same degree. (Given a party, create a vertex for every person and connect two vertices if the corresponding
persons are friends.)
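
For small graphs the statement can be checked by brute force, which is a good way to convince oneself before (or after) reading the proof. The sketch below enumerates every simple graph on n labelled vertices; the function names are ours, and the search is only feasible for very small n since there are 2^(n(n−1)/2) graphs.

```python
from itertools import combinations, product

def degrees(n, edges):
    """Degree of each vertex 0..n-1 in the simple graph with the given edges."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def every_graph_has_repeated_degree(n):
    """Check all simple graphs on n labelled vertices."""
    possible_edges = list(combinations(range(n), 2))
    for choice in product([False, True], repeat=len(possible_edges)):
        edges = [e for e, keep in zip(possible_edges, choice) if keep]
        if len(set(degrees(n, edges))) == n:   # all degrees distinct
            return False                       # would be a counterexample
    return True

print(all(every_graph_has_repeated_degree(n) for n in range(2, 6)))
```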
Finally, notice that we have used a variant of induction called “complete induction”. Namely, we used
the following principle. Suppose that a statement P_n is given for every natural number n greater than or equal to a
natural number n_0. Suppose that the statement P_{n_0} is true and that for every n ≥ n_0, if all the statements
P_{n_0}, P_{n_0+1}, ..., P_n are true then so is P_{n+1}. Then P_n is true for every n ≥ n_0.
The proof is more or less the same as the proof of Theorem 2.3.1, so we leave it as an exercise.9

3. Functions
We are familiar with functions already from high school, where functions were usually, probably always,
functions of a real variable. For example, the function y = sin(x) or y = 3x + 1. Here x ∈ R and the value y
is in R too. We are also used to plotting such a function using Cartesian coordinates as all the pairs (x, y) where
y = sin(x) or y = 3x + 1, as the case may be. We want to abstract this notion and for a start we may say that
we have a function f : R → R, where for the first example f(x) = sin(x) and for the second f(x) = 3x + 1.
That is, instead of writing y we write f(x). In this language, the graph consists of all the points in the plane of the
form (x, f(x)), or, what is the same, all the points (x, y) such that y = f(x).
There are more formal and less formal ways to define a function. Here we take the most pedestrian
approach. Let A and B be sets. A function f from A to B,
f : A → B,
is a rule assigning to each element of A a single element of B. The set A is called the source, or the domain,
of the function, and B the target, or codomain, of the function. For a ∈ A, f(a) is called the image of a
(under f) and f(A) = {f(a) : a ∈ A} is the image of f.

9 You should not take that as meaning that you can safely ignore the matter of finding a proof for complete induction. On the
contrary, it means that you should understand the proof of Theorem 2.3.1 so well that it is clear to you how to prove the variant
given here.

Example 3.0.1. The simplest example is the identity function. Let A be any set and define
1_A : A → A
to be the function sending each element to itself. Namely,
1_A(x) = x,
for any x ∈ A.
Example 3.0.2. Let A = {1, 2, 3}, B = {1, 2} and consider the following rules for f : A −→ B.
(1) f (1) = 2, f (2) = 1, f (3) = 1.
(2) f (1) = 1 or 2, f (2) = 2, f (3) = 1.
(3) f (1) = 1, f (2) = 1.
The first recipe defines a function from A to B. The second recipe does not, because 1 is assigned two
possible values. The third also doesn’t define a function because no information is given about f (3).
Example 3.0.3. Let R≥0 denote the non-negative real numbers. Consider the following attempts to define
functions.
(1) f : R → R, f(x) = √x.
(2) f : R≥0 → R, f(x) = y, where y is a real number such that y² = x.
(3) f : R≥0 → R, f(x) = the non-negative root of x = +√x.
(4) f : R → R, f(x) = 1/x.
The first definition fails because −1 doesn’t have a square root in R. The second definition fails because every
positive number has 2 square roots (differing by a sign) and it isn’t clear which root one is supposed to take. This
problem also exists in the first definition. The third definition does define a function. The fourth definition
doesn’t define a function, because the value f(0) is not defined.
There are various ways to define a function. It can be done by writing down f ( a) for every a ∈ A explicitly,
it can be done by providing a formula, and it can be done by giving some other description. For example,
take A to be the set of all people who ever lived, B = A and f : A → A is given by
f(a) = a’s mother.
This definitely looks like a good definition at first sight. However, the astute reader will note the problem
here. If this function were truly well-defined then the set A would have to be infinite, because if it were finite we would
have a person who is a descendant of themselves (consider a, f(a), f(f(a)), f(f(f(a))), ...). Since a mother is
older than any of her children by at least a day, say (just not to worry about an exact number), it follows
that, if A and f are indeed well defined, people have existed forever. The various ways to resolve the
paradox, namely to provide an explanation as to why f is not well-defined, are rather amazing and I leave it
to you as an amusing exercise. (And, no, I don’t view that as a proof that God exists.)

Here is some more notation: the symbol ∀ means “for all”. The symbol ∃ means “there exists”. The symbol ∃!
means “there exists a unique”. A function can also be defined by using a set
Γ ⊂ A × B,
with the following property: ∀a ∈ A, ∃!b ∈ B such that (a, b) ∈ Γ. (Read: for all a in A there exists a unique b
in B such that (a, b) is in Γ.) We then define f(a) to be the unique b such that (a, b) ∈ Γ. Conversely, given
a function f we let
Γ = Γ_f = {(a, f(a)) : a ∈ A}.
The set Γ_f is called the graph of f.
Example 3.0.4. Let A be a set and Γ ⊂ A × A the “diagonal”,
Γ = {(x, x) : x ∈ A}.
The function defined by Γ is 1_A.
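
For finite sets, the condition "∀a ∈ A, ∃!b ∈ B such that (a, b) ∈ Γ" can be tested mechanically. The following sketch (the function name and the encoding of Γ as a Python set of pairs are our choices) applies the test to the three recipes of Example 3.0.2.

```python
def function_from_graph(A, B, graph):
    """Return a dict a -> b if `graph`, a set of pairs, is a subset of A x B
    such that every a in A occurs in exactly one pair (a, b).
    Return None otherwise."""
    if not all(a in A and b in B for a, b in graph):
        return None                          # graph is not a subset of A x B
    f = {}
    for a in A:
        values = {b for (x, b) in graph if x == a}
        if len(values) != 1:                 # no value, or more than one value
            return None
        f[a] = values.pop()
    return f

A, B = {1, 2, 3}, {1, 2}
recipe1 = {(1, 2), (2, 1), (3, 1)}
recipe2 = {(1, 1), (1, 2), (2, 2), (3, 1)}   # "f(1) = 1 or 2"
recipe3 = {(1, 1), (2, 1)}                   # nothing said about f(3)

for recipe in (recipe1, recipe2, recipe3):
    print(function_from_graph(A, B, recipe))
```

Only the first recipe yields a function; the other two fail the uniqueness and the existence requirement, respectively.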

Let f_1, f_2 : A → B be functions. We say that f_1 = f_2 if for every a ∈ A we have f_1(a) = f_2(a).
Equivalently (exercise!) if Γ_{f_1} = Γ_{f_2}. Note here that the description of f_1 and f_2 doesn’t matter, only
whether f_1(a) = f_2(a) for all a ∈ A. For example, let f_1 : R → R be the constant function f_1(x) = 1 for
all x. Let f_2 : R → R be the function f_2(x) = sin(x)² + cos(x)². Then f_1 = f_2, although they were defined
differently.

3.1. Injective, surjective, bijective and inverse image. We introduce some properties of functions. Let
f : A → B
be a function. Then:
(1) f is called injective if f(a) = f(a′) ⇒ a = a′. I.e., different elements of A go to different elements
of B. Such a function is also called one-to-one.
(2) f is called surjective, or onto, if ∀b ∈ B, ∃a ∈ A such that f(a) = b. I.e., every element in the
target is the image of some element in the source under the function f.
(3) f is called bijective if it is both injective and surjective. In that case, every element of B is the
image of a unique element of A.
Let f : A → B be a function. Let U ⊂ B. We define the pre-image of U to be the set
f⁻¹(U) = {a : a ∈ A, f(a) ∈ U}.
If U consists of a single element, U = {u}, we usually write f⁻¹(u) (instead of f⁻¹({u})) and call it the
fibre of f over u.
Example 3.1.1. (1) f : R → R, f(x) = x². Then f is neither surjective (a square is always non-negative)
nor injective, as f(x) = f(−x). We have f⁻¹([1, 4]) = [1, 2] ∪ [−2, −1] and f⁻¹(0) =
{0}, f⁻¹(−1) = ∅.
(2) f : R → R≥0, f(x) = x². Then f is surjective but not injective.
(3) f : R≥0 → R≥0, f(x) = x². Then f is bijective.
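
On finite sets these notions are easy to experiment with. In the sketch below a function is stored as a Python dict, and we replace R by the finite set {−3, ..., 3} so that the squaring map can be tabulated; both choices, and all names, are ours.

```python
def is_injective(f):
    """f is a dict a -> f(a); injective means no value is taken twice."""
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    """Surjective onto `codomain` means every element of it is a value of f."""
    return set(f.values()) == set(codomain)

def preimage(f, U):
    """The set of all a in the domain with f(a) in U."""
    return {a for a, b in f.items() if b in U}

square = {x: x * x for x in range(-3, 4)}           # x -> x^2 on {-3, ..., 3}
square_nonneg = {x: x * x for x in range(0, 4)}     # restricted to {0, ..., 3}

print(is_injective(square), is_injective(square_nonneg))
print(preimage(square, {1, 4}), preimage(square, {-1}))
```

As in the example, restricting the source removes the failure of injectivity, and shrinking the target to the set of values removes the failure of surjectivity.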

3.2. Composition of functions. Let

f : A → B, g : B → C,
be functions. We define their composition, g ◦ f , to be the function:
g ◦ f : A → C, ( g ◦ f )( x ) = g( f ( x )).
The picture is
$A \xrightarrow{\;f\;} B \xrightarrow{\;g\;} C$, with $g \circ f$ the composite arrow from A to C.

Lemma 3.2.1. We have the following properties:

(1) If g ◦ f is injective then f is injective.
(2) If g ◦ f is surjective then g is surjective.
Proof. Suppose that g ◦ f is injective. Let a, a′ ∈ A be elements such that f(a) = f(a′). We need to
show that a = a′. We have g(f(a)) = g(f(a′)) or, otherwise said, (g ◦ f)(a) = (g ◦ f)(a′). Since g ◦ f is
injective, a = a′.
Suppose now that g ◦ f is surjective. Let c ∈ C. We need to show that there is an element b ∈ B
with g(b) = c. Since g ◦ f is surjective, there is a ∈ A such that (g ◦ f)(a) = c. Let b = f(a); then g(b) =
g(f(a)) = (g ◦ f)(a) = c.
A simple, but important property of composition is that it is associative: if f : A → B, g : B → C, h : C → D
are functions then
h ◦ ( g ◦ f ) = (h ◦ g) ◦ f .
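
A tiny finite example shows that Lemma 3.2.1 cannot be improved: g ◦ f may be bijective while f is not surjective and g is not injective. The sketch below, with functions stored as dicts and names of our choosing, also checks associativity on this example.

```python
def compose(g, f):
    """Return g o f for functions given as dicts (f: A -> B, g: B -> C)."""
    return {a: g[f[a]] for a in f}

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    return set(f.values()) == set(codomain)

f = {1: 1}              # f: A = {1} -> B = {1, 2}
g = {1: 1, 2: 1}        # g: B -> C = {1}
h = {1: "c"}            # h: C -> D = {"c"}

gf = compose(g, f)
print(is_injective(gf), is_surjective(gf, {1}))     # g o f is bijective
print(is_surjective(f, {1, 2}), is_injective(g))    # yet these both fail
print(compose(h, compose(g, f)) == compose(compose(h, g), f))
```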

3.3. The inverse function. Let f : A → B be a bijective function. In this case we can define the inverse
function
f⁻¹ : B → A,
by the property
f⁻¹(b) = a if f(a) = b.
This is well defined: since f is surjective such an a exists for every b and is unique (because f is injective).
Thus f⁻¹ is a function. It is easy to verify that
f⁻¹ ◦ f = 1_A, f ◦ f⁻¹ = 1_B.
To tie it up with previous definitions, note that if we let
ϕ : A × B → B × A, ϕ(a, b) = (b, a),
then
ϕ(Γ_f) = Γ_{f⁻¹}.
In fact, this gives another way of defining f⁻¹.
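
The description of f⁻¹ through the swap ϕ translates directly into code for finite sets. In this sketch (names ours) a function is a dict, its graph is the set of its items, and the inverse is obtained by swapping each pair; the function refuses to answer if f is not a bijection.

```python
def inverse(f, codomain):
    """Inverse of a bijection f: A -> codomain, given as a dict.
    Raises ValueError if f is not bijective."""
    swapped = {(b, a) for a, b in f.items()}    # phi applied to the graph of f
    g = dict(swapped)
    # Fewer keys than f has entries: two a's share a value (not injective).
    # Keys differ from the codomain: some b is not hit (not surjective).
    if len(g) != len(f) or set(g) != set(codomain):
        raise ValueError("f is not a bijection onto the given codomain")
    return g

f = {1: "x", 2: "y", 3: "z"}
g = inverse(f, {"x", "y", "z"})
print(all(g[f[a]] == a for a in f), all(f[g[b]] == b for b in g))
```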

4. Cardinality of a set
Imagine a group of students about to enter a lecture hall. The instructor wants to know if there are
sufficient chairs for all the students. There are two ways to do that. One is to count both students and
chairs separately and determine which is larger. The other is to ask each student to take a seat. If there are
students left standing, the number of chairs is too small. If there are chairs left unoccupied, there are more
chairs than students. In the remaining case there is a perfect match between students and chairs and so their
number (cardinality) is equal.
This idea proves very powerful in discussing the cardinality (“size", “magnitude", “number of elements") of
sets, finite or infinite.
Georg Cantor revolutionized mathematics,10 and human thought, by defining two sets A, B, possibly
infinite, to be of equal cardinality, denoted |A| = |B|, if there is a bijective function
f : A → B.

10 Cantor’s own words are as fresh as ever; the following are the opening paragraphs of Cantor’s paper whose title is translated
into English as “Contributions to the founding of the theory of transfinite numbers". The translation is based on the Dover
edition, but I have replaced “aggregate", “uniting" etc. by the modern terminology “set", “union" and so on, and the use of
quotation marks by italics.

“By a set we are to understand any collection into a whole M of definite and separate objects m of our intuition or our thought.
These objects are called the elements of M. In signs we express this thus:
(1) M = { m }.
We denote the union of many sets M, N, P, . . . , which have no common elements, into a single set by
(2) ( M, N, P, . . . ).
The elements of this set are, therefore, the elements of M, of N, of P, . . . , taken together. [....]
Every set M has a definite power, which we will also call its cardinal number. We will call by the name power or cardinal number
of M the general concept which, by means of our active faculty of thought, arises from the set M when we make abstraction of
the nature of its various elements m and of the order in which they are given."
What is striking to me is that Cantor’s investigation is intimately tied with introspective meditation on the nature of thought
and abstraction. The notion of cardinal number is not claimed to be absolute; it arises as a result of our thought process. Note
the effort in trying to explain what cardinality is. In current times such efforts are rare. One simply says when two sets have the
same cardinality, as we have done, dodging the issue of what that cardinality is.
Cantor suffered vicious personal attacks in reaction to his theory. Leopold Kronecker’s public opposition and personal attacks
included describing Cantor as a “scientific charlatan”, a “renegade” and a “corrupter of youth”. On the other hand, David
Hilbert defended Cantor by declaring “No one shall expel us from the Paradise that Cantor has created.” However, this came
too late for Cantor; he was already dead by then.

(Note that then there is an inverse function f⁻¹ : B → A which is also a bijection, so it doesn’t matter if we
require a bijection from A to B or from B to A.) He defined the cardinality of A to be no larger than B’s if
there is an injective function
f : A → B,
and this is denoted |A| ≤ |B|. We also say that the cardinality of A is less than B’s, |A| < |B|, if |A| ≤ |B|
and |A| ≠ |B|. As a sanity check we’d like to know at least the following.
Proposition 4.0.1. Let A, B, C be sets. If | A| = | B| and | B| = |C | then | A| = |C |.
Proof. Let f : A → B, g : B → C be bijections. Then
g◦ f: A → C
is also a bijection. Indeed: if for some x, y ∈ A we have ( g ◦ f )( x ) = ( g ◦ f )(y) then g( f ( x )) = g( f (y)).
Since g is injective, f ( x ) = f (y) and, since f is injective, x = y.
To show g ◦ f is surjective, let c ∈ C and choose b ∈ B such that g(b) = c; such b exists since g is surjective.
Since f is surjective, there is an a ∈ A such that f ( a) = b. Then ( g ◦ f )( a) = g( f ( a)) = g(b) = c.
To show the definitions and notations make sense at all, we surely need to know the following theorem.
Theorem 4.0.2 (Cantor-Bernstein). If | A| ≤ | B| and | B| ≤ | A| then | A| = | B|.
Knowing the proof is not required for this course. It is an ingenious and intricate proof, but requires no more
background than we already have. We give it in Appendix A. We would like to explain, though, why the theorem is not obvious.
What we are given is that there is some injective function f from A to B and some injective function g
from B to A. Those functions need not be related to each other in any way. We should conclude from that
that there is a bijective function h from A to B. It is not necessarily true that h = f. One should somehow
construct h from f and g. Here is an example: let A be the set of points in the plane at distance at most 1
from the origin (the closed unit disk) and B the square [−1, 1] × [−1, 1]. The function
f : A → B, f(x) = x,
is a well-defined injective function but not a bijection. The function
g : B → A, g(b) = b/√2,
is also a well-defined injective function, but not a bijection. One can find a bijection from A to B, but it is not
immediately clear how to find it based on the knowledge of f and g (and in fact in this particular example, it
is better to “rethink the situation” rather than to deduce it from f and g).

Remark 4.0.3. By definition, |A| ≤ |B| if there is an injective function f : A → B. We show here that if A
is not the empty set, this is equivalent to the existence of a surjective function g : B → A. Indeed, given f
injective, choose some element a_0 ∈ A and define g : B → A as follows: if b = f(a) for some a ∈ A then
g(b) = a; otherwise, let g(b) = a_0. Then g is well defined, as in the case where b = f(a) the a is unique
because f is injective. The function g is surjective, because given some a ∈ A we find that g(f(a)) = a and
so every a is the image of some b ∈ B.
Conversely, given a surjective function g : B → A, define f : A → B by choosing for every a ∈ A some
element b ∈ B such that g(b) = a and letting f(a) = b. There could be more than one b such that g(b) = a, but we just choose
one of them. We get this way a function f : A → B. We claim that it is injective. Indeed, if f(a_1) = f(a_2)
then g(f(a_1)) = g(f(a_2)). But by the construction of f, g(f(a_1)) = a_1 and g(f(a_2)) = a_2. Therefore,
a_1 = a_2 and so f is injective.
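
Both constructions of Remark 4.0.3 can be carried out literally when A and B are finite. The sketch below stores functions as dicts; the function names and the sample sets are ours, and "choose one b" is implemented as "take the first b encountered".

```python
def surjection_from_injection(f, B, a0):
    """Given an injective f: A -> B (a dict) and a chosen a0 in A,
    build the surjection g: B -> A of the remark."""
    partial_inverse = {b: a for a, b in f.items()}   # well defined: f injective
    return {b: partial_inverse.get(b, a0) for b in B}

def injection_from_surjection(g, A):
    """Given a surjective g: B -> A (a dict), build an injection f: A -> B
    by choosing, for each a, one b with g(b) = a."""
    chosen = {}
    for b, a in g.items():
        chosen.setdefault(a, b)      # keep the first b found for each a
    return {a: chosen[a] for a in A}

A = {0, 1, 2}
B = {"p", "q", "r", "s"}
f = {0: "p", 1: "q", 2: "r"}

g = surjection_from_injection(f, B, a0=0)
f2 = injection_from_surjection(g, A)
print(set(g.values()) == A, len(set(f2.values())) == len(A))
```

Note that f2 need not equal f: the element s also maps to 0 under g, so either p or s may be chosen as f2(0).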

A set A is called countable (or enumerable) if it is either finite or has the same cardinality as N. If it is
finite there is a bijective function f : {0, 1, ..., n − 1} → A, where n is the number of elements of A, and so
A = {f(0), f(1), ..., f(n − 1)}. If A is infinite, there is a bijective function f : N → A. The elements of A
are thus {f(0), f(1), f(2), f(3), ...}. If we introduce the notation a_i = f(i) then we can also enumerate the
elements of A as {a_0, a_1, a_2, a_3, ...} and this explains the terminology.

Example 4.0.4. Let A be the set {0, 2, 4, 6, ...} and B the set {0, 1, 4, 9, 16, 25, ...}. Then
|N| = |A| = |B|.
Indeed, one verifies that the functions
f : N → A, f(x) = 2x,
and
g : N → B, g(x) = x²,
are bijections. We can then conclude that |A| = |B| and, if we want to, we can find the bijection. It is
h = g ◦ f⁻¹; that is, h(x) = (x/2)².

Example 4.0.5. The cardinality of N is the cardinality of A = { x ∈ N : x is not a square}.

Instead of trying to write a bijection explicitly, which is not that straightforward, we use the Cantor-Bernstein
theorem. It is important to digest the following argument!
The function
f : A → N, f(a) = a,
is injective. Thus, |A| ≤ |N|. Consider the function
g : N → A, g(x) = 4x + 3.
First, this is well defined. Namely, 4x + 3 is really in A. This is because if n is a square, n leaves residue 1
or 0 when divided by 4 (if n = (2m)² = 4m² the residue is zero; if n = (2m + 1)² = 4(m² + m) + 1 the
residue is one). But 4x + 3 leaves residue 3. Clearly g is injective and so |N| ≤ |A|. By the Cantor-Bernstein
theorem, |A| = |N|.
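
The two facts used here, that squares leave residue 0 or 1 modulo 4 and that 4x + 3 is therefore never a square, are easy to spot-check numerically. This is only a sanity check over a finite range (the range bounds are arbitrary), not a replacement for the argument above.

```python
from math import isqrt

def is_square(n):
    """True if the non-negative integer n is a perfect square."""
    return isqrt(n) ** 2 == n

def g(x):
    """The map N -> {non-squares} used in Example 4.0.5."""
    return 4 * x + 3

print({m * m % 4 for m in range(1000)})               # residues of squares mod 4
print(any(is_square(g(x)) for x in range(10_000)))    # is some 4x + 3 a square?
```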

Proposition 4.0.6. |N| = |Z|.

Proof. We define f : Z → N by
$$f(x) = \begin{cases} 2x & x \ge 0, \\ -2x - 1 & x < 0. \end{cases}$$
Then f is a bijective function, as is easy to check.
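
One way to carry out the "easy check" is to write down the inverse: even numbers come from non-negative integers and odd numbers from negative ones. A short sketch (the name f_inverse is ours):

```python
def f(x):
    """The bijection Z -> N of Proposition 4.0.6."""
    return 2 * x if x >= 0 else -2 * x - 1

def f_inverse(n):
    """Inverse map N -> Z: n = 2x gives x = n/2, n = -2x - 1 gives x = -(n+1)/2."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

print([f(x) for x in range(-3, 4)])
print(all(f_inverse(f(x)) == x for x in range(-100, 100)))
```

The integers −50, ..., 49 are sent exactly onto 0, ..., 99, which illustrates how f interleaves the negative and non-negative integers.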
Proposition 4.0.7. |N| = |N × N|.

Proof. Define f : N → N × N by f(n) = (n, 0). This is an injective function. Define
g : N × N → N, g(n, m) = 2^n · 3^m.
This is also an injective function: if 2^n · 3^m = 2^a · 3^b then n = a, m = b by unique factorization (to be discussed
in § 10). By the Cantor-Bernstein theorem, we conclude that |N| = |N × N|.
Corollary 4.0.8. |Z| = |Z × Z|.

Proof. Let h : N → N × N be a bijection. A bijection f : N → Z induces a bijection
g = (f, f) : N × N → Z × Z,
and the composition
$Z \xrightarrow{f^{-1}} N \xrightarrow{\;h\;} N \times N \xrightarrow{\;g\;} Z \times Z$
is also a bijection.
Exercise 4.0.9. Prove that |N| = |Q|. (Hint: there’s an easy injection Q → Z × Z).

When Cantor laid down the foundations for the study of infinite cardinals he also dropped a bombshell:

4.1. Cantor’s diagonal argument.

Theorem 4.1.1 (Cantor). |N| ≠ |R|.
The argument in the proof became known as Cantor’s diagonal argument and is used in many proofs.

Proof. Suppose that |N| = |R|. We can then enumerate the real numbers as a_0, a_1, a_2, .... Let us write the
decimal expression of each number as
$$\begin{aligned}
a_0 &= \epsilon_0\, b^0_1 \dots b^0_{n(0)} \,.\, c_{00} c_{01} c_{02} c_{03} \dots \\
a_1 &= \epsilon_1\, b^1_1 \dots b^1_{n(1)} \,.\, c_{10} c_{11} c_{12} c_{13} \dots \\
a_2 &= \epsilon_2\, b^2_1 \dots b^2_{n(2)} \,.\, c_{20} c_{21} c_{22} c_{23} \dots \\
a_3 &= \epsilon_3\, b^3_1 \dots b^3_{n(3)} \,.\, c_{30} c_{31} c_{32} c_{33} \dots \\
&\;\;\vdots
\end{aligned}$$
where we agree to use 000000000... instead of 999999999... (so we write 1.00000000000... and not
0.9999999999..., etc.). Here ε_i is a sign, + or −, and each b^i_j, c_ij is a digit, i.e. in {0, 1, ..., 9}.
Now consider the number
$$0.e_0 e_1 e_2 e_3 e_4 \dots, \qquad e_i = \begin{cases} 3 & c_{ii} \neq 3, \\ 4 & c_{ii} = 3. \end{cases}$$
This is a real number that differs from each a_i at the (i + 1)-th digit after the decimal point and hence is not
equal to any a_i. (Its digits are all 3 or 4, so our convention about 999999999... plays no role.) It follows that the list a_0, a_1, a_2, ... cannot consist of all real numbers and so we arrive at a
contradiction.
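
The diagonal construction can be tried out on any finite list of digit strings, which makes it clear why the new number escapes the list. In the sketch below each string stands for the first few digits after the decimal point of one number; the sample rows and the function name are ours.

```python
def diagonal_digits(rows):
    """rows[i] holds the digits after the decimal point of the i-th number
    (each string must have more than i digits). Return the digits
    e_0 e_1 ... with e_i = 3 if the i-th digit of rows[i] is not 3, else 4."""
    return "".join("4" if row[i] == "3" else "3" for i, row in enumerate(rows))

rows = [
    "1415926",
    "3333333",
    "5000000",
    "1428571",
    "7182818",
    "0000000",
    "9999999",
]
d = diagonal_digits(rows)
print("0." + d)
print(all(d[i] != rows[i][i] for i in range(len(rows))))
```

Whatever list is supplied, the output differs from the i-th row in position i, so it cannot coincide with any row.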
Remark 4.1.2. The Continuum hypothesis, formulated by Georg Cantor, asserts that there is no set A
such that |N| < |A| < |R|. Much later, Kurt Gödel and Paul Cohen proved that neither the hypothesis,
nor its negation, can be proven from the standard axioms of set theory. That is, in the axiom system we
are using in mathematics, we can neither prove the hypothesis nor provide a counter-example. These discoveries
were nothing short of shocking. They came at a time when humanity was fascinated with its own power,
especially the power of the intellect. These results proved the inherent limitations of thought.

One may ask at this point if there is a cardinality bigger than that of R. That is, is there a set T such that
|R| < |T|? The answer to that is yes. One may take T to be the set of all subsets of R. This is discussed in
the exercises.

5. Relations
A relation on a set S is best described as a subset Γ ⊂ S × S. For s, t ∈ S, we say that s is “related” to t if (s, t) ∈ Γ.
Though the format reminds one of functions, the actual relevance of the notion of functions here is minimal.
For example, usually for a given s there will be many elements t such that (s, t) ∈ Γ, which is the opposite of
what we require for functions, where there is precisely one t for a given s. We shall usually denote that x is
related to y, namely that (x, y) ∈ Γ, by x ∼ y.
Note that so far the definition is wide enough to allow any Γ. Here are some basic examples:
(1) Γ = S × S. In this case for any x, y, we have x ∼ y. Any two elements are related.
(2) Γ = ∅. In this case for no x, y we have x ∼ y. No elements are related (including an element with
itself).
(3) Γ = {(s, s) : s ∈ S} (the “diagonal”). In this case x ∼ x for all x ∈ S, but if x ≠ y then x ≁ y.
(4) Γ = {(x, y) : x, y ∈ S, x ≤ y}, where S is the interval of real numbers [0, 1]. Then to say that x ∼ y
means that x ≤ y, in the sense of the usual inequality of real numbers.
(5) Γ = {(x, y) : x, y ∈ Z, x − y is divisible by 5} is the relation on S = Z where x ∼ y if x − y is divisible by 5.

A relation on S is called reflexive if for all x ∈ S we have x ∼ x. In words: every element is related to itself.
The relations in (1), (3), (4) and (5) are reflexive, but the relation in (2) is not (except in the trivial case
where S is the empty set).
A relation is called symmetric if for all x, y ∈ S, x ∼ y ⇒ y ∼ x. In words, whenever x is related to y
also y is related to x. The relations (1), (2), (3), (5) are symmetric, but (4) is not.
A relation is called transitive if for all x, y, z ∈ S, x ∼ y and y ∼ z imply x ∼ z. All the relations (1) -
(5) are transitive.
A relation on a set S is called a partial order if it is reflexive and transitive and, in addition, if both x ∼ y
and y ∼ x then x = y. We then use the notation x ≤ y for x ∼ y and, in this notation, we always have:
x ≤ x, the implication x ≤ y, y ≤ z ⇒ x ≤ z holds, and if both x ≤ y and y ≤ x then x = y. There may
very well be x, y for which neither x ≤ y nor y ≤ x holds. A linear order (or a simple order or a total order)
is a partial order such that for every x, y we have either x ≤ y or y ≤ x.
For example, for real numbers we have the relation ≤ of “less than or equal to”. That is, for real numbers
a, b we have the notion of a ≤ b and this is a linear ordering. It is reflexive (a ≤ a), satisfies transitivity
(a ≤ b and b ≤ c implies that a ≤ c) and if both a ≤ b and b ≤ a then a = b. We can put a relation on the
positive natural numbers N⁺, by saying that a ∼ b if a|b. Since a|b and b|c implies a|c, this relation
is transitive. Also, if a|b and b|a then a = b and, clearly, always a|a. Thus, this is a partial ordering of the
positive natural numbers. Note that for a = 2, b = 3 neither a|b nor b|a. That is, this is not a linear order.
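
On a finite set each of these properties can be verified by checking all pairs or triples. The sketch below does so for divisibility and for congruence modulo 5, both restricted to the finite set {1, ..., 12} (our choice, made only so that the check terminates); all function names are ours.

```python
from itertools import product

def is_reflexive(S, rel):
    return all(rel(x, x) for x in S)

def is_symmetric(S, rel):
    return all(rel(y, x) for x, y in product(S, repeat=2) if rel(x, y))

def is_transitive(S, rel):
    return all(rel(x, z) for x, y, z in product(S, repeat=3)
               if rel(x, y) and rel(y, z))

def is_antisymmetric(S, rel):
    """x ~ y and y ~ x together force x = y."""
    return all(x == y for x, y in product(S, repeat=2)
               if rel(x, y) and rel(y, x))

def is_linear(S, rel):
    """Every two elements are comparable."""
    return all(rel(x, y) or rel(y, x) for x, y in product(S, repeat=2))

S = range(1, 13)
divides = lambda a, b: b % a == 0
congruent_mod_5 = lambda x, y: (x - y) % 5 == 0

for name, rel in [("a | b", divides), ("a = b mod 5", congruent_mod_5)]:
    print(name, is_reflexive(S, rel), is_symmetric(S, rel),
          is_transitive(S, rel), is_antisymmetric(S, rel))
```

Divisibility comes out as a partial order that is not linear, while congruence modulo 5 is reflexive, symmetric and transitive but not antisymmetric.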

Another important class of relations, even more important for this course than order relations, are the
equivalence relations. They are very far from order relations. A relation is called an equivalence relation if
it satisfies the following properties:
(1) (Reflexive) For every x we have x ∼ x.
(2) (Symmetric) If x ∼ y then y ∼ x.
(3) (Transitive) If x ∼ y and y ∼ z then x ∼ z.
Equivalence relations arise when one wishes to identify elements in a given set R according to some principle.
If we reflect on the meaning of “identify" we see that what we aim at is that x is identified with x (well,
obviously!), that if x is identified with y then y is identified with x, and that if x is identified with y and y is
identified with z then, by all accounts, x should be identified with z too. That is, we aim at an equivalence
relation. And conversely, an equivalence relation is a way to identify elements in a set. We identify x with all
the elements y such that x ∼ y.
Example 5.0.1. (1) A permutation σ of a set A is a bijection σ : A → A. A permutation of n elements
is a bijection
σ : {1, 2, . . . , n} → {1, 2, . . . , n}.
We denote by S_n the set of all permutations σ of n elements. There are n! permutations, that is,
|S_n| = n!. Define a relation on S_n by saying for two permutations σ, τ that
σ ∼ τ if σ (1) = τ (1).
This is an equivalence relation.
(2) Here is a simple example, that will be generalized in §12. Define a relation on the integers Z by
saying that a ∼ b if a − b is even. This is an equivalence relation as a − a = 0 is always even, if a − b
is even then so is b − a and if both a − b and b − c are even then so is a − c = ( a − b) + (b − c).
(3) Define a relation on sets11 by saying that
A ∼ B if | A| = | B|.
This is an equivalence relation. The identity function A → A shows A ∼ A. If A ∼ B there is a
bijection f : A → B and then the inverse function f −1 : B → A is a bijection too, showing B ∼ A.
Finally, if | A| = | B| and | B| = |C | then | A| = |C |, by Proposition 4.0.1.
11 We ignore here the fact that the collection of all sets is not a set and we only defined relations on a set. The example would
be correct if we restricted ourselves to all sets that are subsets of some fixed given set.

Let S be a set and S_i, i ∈ I, be subsets of S. We say that S is the disjoint union of the sets S_i if S = ∪_{i∈I} S_i
and for any i ≠ j we have S_i ∩ S_j = ∅. We denote this by
S = ∐_{i∈I} S_i.

Lemma 5.0.2. Let ∼ be an equivalence relation on a set S. Define the equivalence class [ x ] of an element x ∈
S as follows:
[ x ] = {y : y ∈ S, x ∼ y}.
This is a subset of S. The following holds:
(1) Two equivalence classes are either disjoint or equal.
(2) S is a disjoint union of equivalence classes.
Conversely, if S is a disjoint union S = ∐_{i∈I} U_i of non-empty sets U_i (this is called a partition of S) then
there is a unique equivalence relation on S for which the U_i are the equivalence classes.

Proof. Let x, y be elements of S and suppose that [x] ∩ [y] ≠ ∅. Then, there is an element z such that
x ∼ z, y ∼ z. Since ∼ is symmetric also z ∼ y and using transitivity x ∼ y. Now, if s ∈ [y] then y ∼ s and
by transitivity x ∼ s and so s ∈ [x] and we showed [y] ⊂ [x]. Since x ∼ y also y ∼ x and the same argument
gives [x] ⊂ [y]. We conclude that [x] = [y].
Every element of S lies in its own equivalence class, because ∼ is reflexive. Together with (1), it follows that S is a
disjoint union of equivalence classes.
To prove the second part of the lemma, we define that x ∼ y if both x and y lie in the same set U_i. It is
clearly reflexive and symmetric. It is also transitive: x ∼ y means x, y ∈ U_i for some i, y ∼ z means y, z ∈ U_j
for some j. But there is a unique set of the partition containing y because the union is a disjoint union. That is, U_i = U_j and
so x, z ∈ U_j; that is, x ∼ z. The equivalence classes are clearly the U_i. The relation is unique because an equivalence
relation is determined by its equivalence classes: x ∼ y if and only if y ∈ [x].
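
Both directions of Lemma 5.0.2 can be made concrete for finite sets. The sketch below computes the equivalence classes of a relation by comparing each element with one representative of every class found so far (which is legitimate precisely because of part (1) of the lemma), and conversely builds the relation from a partition. The names and the sample data are ours.

```python
from itertools import permutations

def equivalence_classes(S, rel):
    """Partition the finite collection S into the classes of the
    equivalence relation rel (a function of two arguments)."""
    classes = []
    for x in S:
        for cls in classes:
            if rel(x, next(iter(cls))):    # compare with one representative
                cls.add(x)
                break
        else:
            classes.append({x})            # x starts a new class
    return classes

def relation_from_partition(partition):
    """The relation x ~ y iff x and y lie in the same block."""
    block_of = {x: i for i, block in enumerate(partition) for x in block}
    return lambda x, y: block_of[x] == block_of[y]

# a ~ b if a - b is even, on the integers -6, ..., 6
parity_classes = equivalence_classes(range(-6, 7), lambda a, b: (a - b) % 2 == 0)

# sigma ~ tau if sigma(1) = tau(1), on the permutations of {1, 2, 3, 4};
# a permutation is stored as the tuple (sigma(1), ..., sigma(4))
perm_classes = equivalence_classes(permutations(range(1, 5)),
                                   lambda s, t: s[0] == t[0])

print(len(parity_classes), len(perm_classes), {len(c) for c in perm_classes})
```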

Example 5.0.3. We revisit Example 5.0.1 and find the equivalence classes. In the first case, there are n
equivalence classes [σ_1], ..., [σ_n] (each having (n − 1)! permutations in it), where we may choose σ_i to be
the permutation:
$$\sigma_i(j) = \begin{cases} j & j \neq 1, i, \\ 1 & j = i, \\ i & j = 1. \end{cases}$$
That is, σ_i is the permutation sending 1 to i and i to 1 and leaving all other elements intact.
In the second case there are two equivalence classes, the even integers and the odd integers.
In the third case, the equivalence classes are the cardinalities, per definition (namely, one could say that
a cardinality is an equivalence class of sets under the relation of existence of bijection). Some distinct
equivalence classes are “all sets of cardinality 3” (that is, all sets that are in bijection with {1, 2, 3}, which are
precisely the sets of 3 elements), “all sets of cardinality ℵ_0” (that is, all sets that are in bijection with N and
this includes Z, N × N, Q, ...), “all sets of cardinality 2^{ℵ_0}”, which is the cardinality of the real numbers R
(see also Exercise 12).

We introduce the following terminology: Let S be a set with an equivalence relation. We say that a subset
T ⊆ S is a complete set of representatives if S = ∐_{t∈T} [t]. That is, S is the disjoint union of the equivalence
classes of the elements in T: every equivalence class [x] in S is equal to an equivalence class [t] for a unique
element t ∈ T.
Sometimes we want to index the elements of T and so as a notational variant we say that a subset {x_i :
i ∈ I} ⊆ S, I some index set, is a complete set of representatives if the equivalence classes [x_i] are disjoint
and S = ∪_{i∈I} [x_i]. This means that every equivalence class is of the form [x_i] for a unique i ∈ I. That is, if
i ≠ j then [x_i] ≠ [x_j] (and in fact [x_i] ∩ [x_j] = ∅).

6. Number systems
Again we start with an apology of a sort. The formal discussion of number systems is a rather involved piece
of mathematics. Our approach is pragmatic. We assume that at some level we all know what integers
and real numbers are and no confusion shall arise there. We use those to define more complicated notions.
As we have already said, we denote the natural numbers N by
N = {0, 1, 2, . . . }, N+ = {1, 2, 3, . . . }.
We also denote the integers by
Z = {. . . , −2, −1, 0, 1, 2, . . . }.
The rational numbers are the set
Q = {a/b : a, b ∈ Z, b ≠ 0},
where we identify a/b with c/d if ad = bc. The real numbers R are the “points on the line”. Each real number
has a decimal expansion such as 0.19874526348 . . . that may or may not repeat itself from some point on.
For example:
1/3 = 0.3333333 . . . , 1/2 = 0.5000000 . . . , 1/7 = 0.142857142857142857142857142857 . . . ,
π = 3.141592653589793238462643383 . . . .
It is a fact that a number is rational if and only if from some point on its decimal expansion becomes periodic.
The length of the period of a rational number can be explained. For example, the period of a number of the
form 1/n, where n > 1 is an integer divisible by neither 2 nor 5, is found as follows.
Consider the sequence 1, 10, 100, 1000, ... and their residues upon division by n, say r_0 = 1, r_1, r_2, ....
The first d > 0 such that r_d = 1 is the period of the decimal expansion of 1/n. For example, for n =
3 we have 1, 10, 100, ... that give residues 1, 1, 1, ... when divided by 3, and indeed 1/3 = 0.3333... has
period 1. On the other hand, the residues of 1, 10, 100, ... when divided by 7 are r_0 = 1, r_1 = 3, r_2 =
2, r_3 = 6, r_4 = 4, r_5 = 5, r_6 = 1 which predicts a period of 6 for the decimal expansion of 1/7 and indeed
1/7 = 0.142857142857142857142857142857....
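
The recipe for the period can be compared with the digits produced by long division. The sketch below implements both; the function names are ours, and the first function assumes, as in the text, that n > 1 is divisible by neither 2 nor 5 (otherwise the residue 1 never recurs).

```python
from math import gcd

def order_of_10(n):
    """Smallest d > 0 such that 10^d leaves residue 1 upon division by n."""
    if n <= 1 or gcd(n, 10) != 1:
        raise ValueError("n must be > 1 and divisible by neither 2 nor 5")
    r, d = 10 % n, 1
    while r != 1:
        r = (r * 10) % n       # residue of the next power of 10
        d += 1
    return d

def decimal_digits(n, count):
    """First `count` digits after the decimal point of 1/n, by long division."""
    digits, r = [], 1
    for _ in range(count):
        r *= 10
        digits.append(r // n)
        r %= n
    return digits

for n in (3, 7, 13, 21, 37):
    d = order_of_10(n)
    print(n, d, decimal_digits(n, 2 * d))
```

In each case the printed digits consist of a block of length d repeated twice.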

The complex numbers are defined as the set

C = { a + bi : a, b ∈ R}.
Here i is a formal symbol. We can equally describe the complex numbers as points (a, b) ∈ R², the plane. The
function f : C → R², f(a + bi) = (a, b) is bijective. The x-axis is now called the real axis and the y-axis
the imaginary axis. If z = a + bi is a complex number, a is called the real part of z and is denoted Re(z)
and b is called the imaginary part of z and is denoted Im(z). We therefore have
z = Re(z) + Im(z)i.
The point corresponding to z in the plane model is (Re(z), Im(z)).
One can perform arithmetic operations with the complex numbers using the following definitions:
−( a + bi ) = − a − bi, ( a + bi ) + (c + di ) = ( a + c) + (b + d)i.
Up to this point things look nice in the plane model as well:
−( a, b) = (− a, −b), ( a, b) + (c, d) = ( a + c, b + d).
That is to say, the addition is then just the addition rule for vectors. The key point is that we can also define
multiplication. The definition doesn’t have any a priori interpretation in the plane model; it’s a new operation.
We let
( a + bi )(c + di ) = ( ac − bd) + ( ad + bc)i.
In particular,
$i^2 = -1$.
This shows that we have really gone beyond the realm of real numbers because there is no real number whose
square is −1. The operations described above satisfy the usual rules of arithmetic, such as
$(z + z') + z'' = z + (z' + z'')$, $z(z' + z'') = zz' + zz''$, ...

(We shall later say that the complex numbers form a field.) In fact, given these rules and $i^2 = -1$, there
is no need to memorize the formula for multiplication as it just follows by expansion: $(a + bi)(c + di) =
ac + adi + bci + bdi^2 = ac - bd + adi + bci = ac - bd + (ad + bc)i$.
Let z = a + bi be a complex number. We define the complex conjugate of z, z̄, as follows:
z̄ = a − bi.
Lemma 6.0.1. The complex conjugate has the following properties:
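(1) $\overline{\overline{z}} = z$.
(2) $\overline{z_1 + z_2} = \overline{z_1} + \overline{z_2}$, $\overline{z_1 \cdot z_2} = \overline{z_1} \cdot \overline{z_2}$.
(3) $\mathrm{Re}(z) = \frac{z + \bar{z}}{2}$, $\mathrm{Im}(z)\,i = \frac{z - \bar{z}}{2}$.
(4) Define for z = a + bi,
$|z| = \sqrt{a^2 + b^2}$.
(This is just the distance of the point (a, b) from the origin.) Then $|z|^2 = z \cdot \bar{z}$ and the following
holds:
$|z_1 + z_2| \le |z_1| + |z_2|$, $|z_1 \cdot z_2| = |z_1| \cdot |z_2|$.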

Proof. Denote z = z1 = a + bi, z2 = c + di.

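We have $\bar{z} = a - bi$ and so $\overline{\bar{z}} = a - (-b)i = a + bi = z$. That is (1). For (2) we calculate
$\overline{z_1 + z_2} = \overline{(a + c) + (b + d)i}$
$= (a + c) - (b + d)i$
$= a - bi + c - di$
$= \overline{a + bi} + \overline{c + di}$
$= \overline{z_1} + \overline{z_2}$.
Similarly,
$\overline{z_1 z_2} = \overline{(ac - bd) + (ad + bc)i}$
$= (ac - bd) - (ad + bc)i$
$= (a - bi)(c - di)$
$= \overline{a + bi} \cdot \overline{c + di}$
$= \overline{z_1} \cdot \overline{z_2}$.
We have $(z + \bar{z})/2 = ((a + bi) + (a - bi))/2 = a = \mathrm{Re}(z)$ and $(z - \bar{z})/2 = ((a + bi) - (a - bi))/2 = bi =
\mathrm{Im}(z)\,i$, which is (3). Next, $|z|^2 = a^2 + b^2 = (a + bi)(a - bi) = z \cdot \bar{z}$. Now,
$|z_1 z_2|^2 = z_1 z_2 \cdot \overline{z_1 z_2}$
$= z_1 z_2 \cdot \overline{z_1} \cdot \overline{z_2}$
$= z_1 \cdot \overline{z_1} \cdot z_2 \cdot \overline{z_2}$
$= |z_1|^2 \cdot |z_2|^2$.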

Thus, the assertion |z1 · z2 | = |z1 | · |z2 | follows by taking roots. The inequality |z1 + z2 | ≤ |z1 | + |z2 | viewed
in the plane model for complex numbers is precisely the assertion that the sum of the lengths of two sides of
a triangle is greater or equal to the length of the third side.
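As a sanity check (certainly not a replacement for the proof), the properties in Lemma 6.0.1 can be tested numerically with Python's built-in `complex` type. The helper name `check_conjugate_properties` and the tolerance are our own choices; floating point arithmetic only allows comparison up to rounding error.

```python
def check_conjugate_properties(z1: complex, z2: complex, tol: float = 1e-12) -> bool:
    """Numerically test properties (1)-(4) of Lemma 6.0.1 for the given z1, z2."""

    def close(u: complex, v: complex) -> bool:
        # Equality up to rounding error, relative to the size of the numbers.
        return abs(u - v) <= tol * max(1.0, abs(u), abs(v))

    c1, c2 = z1.conjugate(), z2.conjugate()
    return all([
        close(c1.conjugate(), z1),                 # (1) conjugating twice gives z back
        close((z1 + z2).conjugate(), c1 + c2),     # (2) conjugation respects sums
        close((z1 * z2).conjugate(), c1 * c2),     # (2) conjugation respects products
        close((z1 + c1) / 2, z1.real),             # (3) real part
        close((z1 - c1) / 2, z1.imag * 1j),        # (3) imaginary part times i
        close(abs(z1) ** 2, z1 * c1),              # (4) |z|^2 = z * conj(z)
        close(abs(z1 * z2), abs(z1) * abs(z2)),    # (4) multiplicativity of | |
        abs(z1 + z2) <= abs(z1) + abs(z2) + tol,   # (4) triangle inequality
    ])


print(check_conjugate_properties(3 - 4j, -1 + 2j))  # True
```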
Example 6.0.2. If $z \neq 0$ then z has an inverse with respect to multiplication. Indeed, $z \cdot \frac{\bar{z}}{|z|^2} = \frac{z \cdot \bar{z}}{|z|^2} = 1$. We
write
$z^{-1} = \frac{\bar{z}}{|z|^2}$.

Just to illustrate calculation with complex numbers, we calculate $1 + 2i + \frac{3 - i}{1 + 5i}$. Using $z^{-1} = \bar{z}/|z|^2$, we
have $\frac{1}{1 + 5i} = \frac{1 - 5i}{26}$ and so $\frac{3 - i}{1 + 5i} = (3 - i)(1 - 5i)/26 = -\frac{1}{13} - \frac{8}{13}i$ and thus $1 + 2i + \frac{3 - i}{1 + 5i} = \frac{12}{13} + \frac{18}{13}i$.
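The calculation above can be reproduced exactly. The sketch below is an illustration under our own conventions: a complex number a + bi with rational a, b is stored as the pair (a, b), and the helpers `add`, `mul`, `inverse` are hypothetical names implementing the formulas of this section with exact fractions.

```python
from fractions import Fraction


def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i."""
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def inverse(z):
    """z^{-1} = conj(z) / |z|^2 for z != 0."""
    a, b = z
    norm_sq = a * a + b * b  # |z|^2
    if norm_sq == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (Fraction(a) / norm_sq, Fraction(-b) / norm_sq)


quotient = mul((3, -1), inverse((1, 5)))   # (3 - i)/(1 + 5i)
total = add((1, 2), quotient)              # 1 + 2i + (3 - i)/(1 + 5i)
print(quotient)  # (Fraction(-1, 13), Fraction(-8, 13))
print(total)     # (Fraction(12, 13), Fraction(18, 13))
```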

6.1. The polar representation. Considering a complex number z = a + bi in the plane model as the vec-
tor ( a, b) we see that we can describe each complex number z by the length r of the corresponding vector
( a, b) and the angle θ it forms with the real axis.

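[Figure: the complex number z drawn as a vector of length r in the plane, forming the angle θ with the real axis.]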

We have
$r = |z|$, $\sin\theta = \frac{\mathrm{Im}(z)}{|z|}$, $\cos\theta = \frac{\mathrm{Re}(z)}{|z|}$.

Lemma 6.1.1. If $z_1$ has parameters $r_1, \theta_1$ and $z_2$ has parameters $r_2, \theta_2$ then $z_1 z_2$ has parameters $r_1 r_2, \theta_1 + \theta_2$
(up to multiples of 360° or 2π rad).

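Proof. We have $r_1 r_2 = |z_1||z_2| = |z_1 z_2|$ and this shows that $r_1 r_2$ is the length of $z_1 z_2$. Let θ be the angle
of $z_1 z_2$; then

$\sin\theta = \frac{\mathrm{Im}(z_1 z_2)}{|z_1 z_2|}$
$= \frac{\mathrm{Re}(z_1)\mathrm{Im}(z_2) + \mathrm{Re}(z_2)\mathrm{Im}(z_1)}{|z_1||z_2|}$
$= \frac{\mathrm{Re}(z_1)}{|z_1|}\cdot\frac{\mathrm{Im}(z_2)}{|z_2|} + \frac{\mathrm{Re}(z_2)}{|z_2|}\cdot\frac{\mathrm{Im}(z_1)}{|z_1|}$
$= \cos\theta_1 \sin\theta_2 + \cos\theta_2 \sin\theta_1$
$= \sin(\theta_1 + \theta_2)$.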

Similarly, we get
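$\cos\theta = \frac{\mathrm{Re}(z_1 z_2)}{|z_1 z_2|}$
$= \frac{\mathrm{Re}(z_1)}{|z_1|}\cdot\frac{\mathrm{Re}(z_2)}{|z_2|} - \frac{\mathrm{Im}(z_1)}{|z_1|}\cdot\frac{\mathrm{Im}(z_2)}{|z_2|}$
$= \cos(\theta_1)\cos(\theta_2) - \sin(\theta_1)\sin(\theta_2)$
$= \cos(\theta_1 + \theta_2)$.
It follows that $\theta = \theta_1 + \theta_2$ up to multiples of 360°.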
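Lemma 6.1.1 can be observed numerically. The following sketch uses the standard library function `cmath.polar`, which returns the pair (r, θ) of a complex number; the helper name `polar_product_check` and the tolerances are our own, and the inputs are assumed to be non-zero.

```python
import cmath
import math


def polar_product_check(z1: complex, z2: complex) -> bool:
    """Check that lengths multiply and angles add (mod 2*pi) for z1 * z2."""
    r1, t1 = cmath.polar(z1)
    r2, t2 = cmath.polar(z2)
    r, t = cmath.polar(z1 * z2)
    # Angles are only determined up to multiples of 2*pi, so we compare
    # the difference after reducing it modulo 2*pi.
    diff = (t - (t1 + t2)) % (2 * math.pi)
    angles_agree = min(diff, 2 * math.pi - diff) <= 1e-9
    lengths_agree = math.isclose(r, r1 * r2, rel_tol=1e-12)
    return lengths_agree and angles_agree


print(polar_product_check(-1 + 1j, -1 - 1j))  # True
print(polar_product_check(1j, 1j))            # True
```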

6.2. The complex exponential function. Let θ be any real number. Let $e^{i\theta}$ denote the unit vector whose
angle is θ. Note that we define $e^{i\theta}$ this way. That is, $e^{i\theta}$ is the complex number of length 1 and angle θ.
Clearly we have
$e^{i\theta} = \cos\theta + i\sin\theta$.
If z is any complex number with length r and angle θ then we have the equality $z = |z|e^{i\theta}$. The formula we
proved in Lemma 6.1.1 is
$z_1 z_2 = |z_1||z_2| e^{i(\theta_1 + \theta_2)}$
and in particular, if we take the complex numbers $e^{i\theta_1}, e^{i\theta_2}$ themselves, we find that
$e^{i\theta_1} e^{i\theta_2} = e^{i(\theta_1 + \theta_2)}$.
Let z = a + bi be a complex number; then we define
$e^z = e^a e^{ib}$,
where $e^a$ is the usual exponential (a is a real number) and $e^{ib}$ is as defined above. Combining the formulas
for the real exponent with our definition for $e^{i\theta}$, we conclude that for any complex numbers $z_1, z_2$,
$e^{z_1} e^{z_2} = e^{z_1 + z_2}$.
We have defined here $e^z$ in a purely formal way. A more analytic approach, and in fact the “correct" approach,
is the following:
We say that a sequence of complex numbers $z_1, z_2, \dots$ converges to a complex number A if
$\lim_{n \to \infty} |A - z_n| = 0$.

In this sense, one can show that for every complex number z the series
$1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots + \frac{z^n}{n!} + \cdots$
converges and is equal to $e^z$. This is well known to hold for z a real number, and so we see that our definition
of $e^z$ for z a complex number is a natural extension of the function $e^z$ for z a real number. See Appendix D
for more on the complex exponential function.
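To see the two descriptions of $e^z$ agree in practice, one can compare partial sums of the series with the formal definition $e^a(\cos b + i\sin b)$. This is only a numerical illustration under our own naming (`exp_partial_sum`, `exp_by_definition`); it proves nothing about convergence.

```python
import math


def exp_partial_sum(z: complex, terms: int) -> complex:
    """Sum of the first `terms` terms 1 + z + z^2/2! + ... of the series."""
    total = 0j
    term = 1 + 0j              # invariant: term == z**n / n!
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total


def exp_by_definition(z: complex) -> complex:
    """e^z = e^a * (cos b + i sin b) for z = a + bi, as defined in the text."""
    return math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))


z = 1 + 2j
for terms in (5, 10, 20, 40):
    # The distance to e^z shrinks rapidly as more terms are added.
    print(terms, abs(exp_partial_sum(z, terms) - exp_by_definition(z)))
```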

Example 6.2.1. Consider the equation $x^n - a = 0$, where a is a non-zero complex number. We claim that
this equation has n distinct roots in C. Write $a = re^{i\theta}$ (so |a| = r and the line from 0 to a forms an angle θ
with the real axis). A complex number $z = Re^{i\Theta}$ is a solution to the equation if and only if $z^n = R^n e^{in\Theta} = re^{i\theta}$.
That is, if and only if
$R^n = r$, $n\Theta \equiv \theta \pmod{2\pi}$.
Thus, the solutions are exactly
$z = r^{1/n} e^{i\left(\frac{\theta}{n} + j\cdot\frac{2\pi}{n}\right)}$, $j = 0, 1, \dots, n - 1$.
In particular, taking a = 1 the solutions are called the roots of unity of order n. There are precisely n of
them: the points on the unit circle having angles $0, \frac{2\pi}{n}, 2\cdot\frac{2\pi}{n}, \dots, (n - 1)\cdot\frac{2\pi}{n}$.

[Figure: three unit circles showing the 3rd roots of 1, the 4th roots of 1 and the 5th roots of 1.]
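The formula of Example 6.2.1 translates directly into a short computation. The sketch below is an illustration with a function name of our choosing (`nth_roots`); it uses `cmath.polar` to obtain r and θ and `cmath.rect` to build the number with given length and angle. Results are floating point approximations.

```python
import cmath
import math


def nth_roots(a: complex, n: int) -> list[complex]:
    """The n solutions of x^n = a, for a != 0, via z = r^(1/n) e^{i(theta/n + j*2*pi/n)}."""
    if a == 0 or n < 1:
        raise ValueError("need a non-zero a and n >= 1")
    r, theta = cmath.polar(a)
    return [
        cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * j) / n)
        for j in range(n)
    ]


# The 4th roots of unity are (approximately) 1, i, -1, -i.
for z in nth_roots(1, 4):
    print(z)
```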

6.3. The Fundamental Theorem of Algebra. A complex polynomial f(x) is an expression of the form
$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$,
where n is a non-negative integer, x is a variable, and the coefficients $a_i$ are complex numbers. If all the
coefficients are real we may call it a real polynomial; if all the coefficients are rational numbers we may call
it a rational polynomial and so on. But note that $x^2 + 1$ is at once a rational, a real and a complex polynomial.
The zero polynomial, denoted 0, is the case when n = 0 and $a_0 = 0$.
A polynomial defines a function
$f : \mathbb{C} \to \mathbb{C}$, $z \mapsto f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$.
The notation $\mapsto$ appearing in this formula means “maps to". If $a_n \neq 0$ then we say f has degree n. If f(z) = 0
for some particular complex number z, we say that z is a root (or a solution, or a zero) of the polynomial f.
Example 6.3.1. Consider the polynomial $f(x) = x^2 + 1$. It has degree 2 and $f(i) = i^2 + 1 = -1 + 1 =
0$, $f(-i) = (-i)^2 + 1 = -1 + 1 = 0$. So i and −i are roots of f. This is a special case of Example 6.2.1,
because $x^4 - 1 = (x^2 + 1)(x^2 - 1)$.
Theorem 6.3.2 (The Fundamental Theorem of Algebra). Let f ( x ) be a complex polynomial of degree at
least 1. Then f ( x ) has a root in C.
Proofs of the theorem are beyond the scope of this course. There are many proofs. In the course Honours
Algebra 4 one sees an algebraic proof using Galois theory; in the course Complex Variables and Transforms
one sees an analytic proof. The theorem is attributed to Gauss who proved it in 1799. There are some issues
regarding the completeness of that proof; he later published several other proofs of the theorem. Gauss didn’t
discover the theorem, though; many attempts and partial results were known before his work, and he was
aware of that literature.
Proposition 6.3.3. Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ be a complex non-zero polynomial of
degree n. Then
$f(x) = a_n \prod_{i=1}^{n} (x - z_i)$,
for suitable complex numbers zi , not necessarily distinct. The numbers zi are all roots of f and any root of f
is equal to some zi . Moreover, this factorization is unique.
Proof. We prove the result by induction on n. For n = 0 we understand the product $\prod_{i=1}^{n} (x - z_i)$ as 1, by
definition.¹² And so, the claim is just that a constant polynomial is equal to its leading coefficient. Clear.
Now, assume that f has degree at least one. By the Fundamental Theorem of Algebra there is a complex
number $z_n$, say, such that $f(z_n) = 0$. We claim that for every complex number z we can write
$f(x) = (x - z) g(x) + r$,

¹² This is a convention: the empty product is equal to one, the empty sum is equal to zero.

where g(x) is a polynomial of degree n − 1 with leading coefficient $a_n$ and r is a complex number. Indeed,
write $g(x) = b_{n-1} x^{n-1} + \cdots + b_1 x + b_0$ and equate the coefficients of $x^n, \dots, x$ in $(x - z) g(x) = b_{n-1} x^n + (b_{n-2} -
z b_{n-1}) x^{n-1} + \cdots + (b_0 - z b_1) x - z b_0$ and f(x). We want complex numbers $b_0, \dots, b_{n-1}$ such that
$b_{n-1} = a_n$, $b_{n-2} - z b_{n-1} = a_{n-1}$, . . . , $b_0 - z b_1 = a_1$,
and there is no problem solving these equations one after the other, starting with $b_{n-1}$. Thus, we can choose g(x) with leading coefficient $a_n$ such
that $f(x) - (x - z) g(x) = r$ is a constant. Note that g(x) depends on z, but this is not reflected in our
notation.
Now, apply this to the particular choice $z = z_n$. We have $f(x) - (x - z_n) g(x) = r$. We view r as a
polynomial and substitute $x = z_n$. We get
$f(z_n) - (z_n - z_n) g(z_n) = r$.
Since $f(z_n) = 0$ we conclude that r = 0.
We showed that if $f(z_n) = 0$ then
$f(x) = (x - z_n) g(x)$, $g(x) = b_{n-1} x^{n-1} + \cdots + b_0$.
In fact, $b_{n-1} = a_n$. Using the induction hypothesis, we have
$g(x) = a_n \prod_{i=1}^{n-1} (x - z_i)$,
for some complex numbers $z_i$ and so
$f(x) = a_n \prod_{i=1}^{n} (x - z_i)$.
We note that $f(z_j) = a_n \prod_{i=1}^{n} (z_j - z_i) = 0$, because the product contains the term $(z_j - z_j)$. If f(z) = 0
then $a_n \prod_{i=1}^{n} (z - z_i) = 0$. But, if a product of complex numbers is zero one of the numbers is already zero
(use $|w_1 w_2 \cdots w_k| = |w_1| \cdot |w_2| \cdots |w_k|$). Since $a_n \neq 0$, we must have $z = z_i$ for some i.
It remains to prove the uniqueness of the factorization. Suppose that
$f(x) = a_n \prod_{i=1}^{n} (x - z_i) = a \prod_{i=1}^{n} (x - t_i)$.
Since the leading coefficient of f is $a_n$ we must have $a = a_n$. We now argue by induction on the degree of
f. The case of degree 0 is clear. Assume f has degree greater than zero. Then the $t_i$ are roots of f and
so $t_1$ is equal to some $z_i$, and we may re-index the $z_i$ so that $t_1 = z_1$. Dividing both sides by $x - z_1$ we then
conclude that¹³
$a_n \prod_{i=2}^{n} (x - z_i) = a_n \prod_{i=2}^{n} (x - t_i)$,
and, by induction, $z_i = t_i$ for all i.
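The step f(x) = (x − z)g(x) + r in the proof is constructive: the equations for the coefficients of g are solved one after the other, from the top coefficient down. The sketch below follows that recipe; the function name `divide_by_linear` and the convention of listing coefficients from the highest degree to the lowest are our own assumptions, and the input is assumed to have degree at least one.

```python
def divide_by_linear(coeffs, z):
    """Given coeffs = [a_n, ..., a_1, a_0] of f, return (g, r) with f(x) = (x - z) g(x) + r.

    g is returned as [b_{n-1}, ..., b_0]. The recursion is the one from the
    proof: b_{n-1} = a_n and b_{k-1} = a_k + z * b_k; finally r = a_0 + z * b_0.
    """
    if len(coeffs) < 2:
        raise ValueError("f should have degree at least one")
    b = [coeffs[0]]
    for a in coeffs[1:]:
        b.append(a + z * b[-1])
    r = b.pop()          # the last value computed is the constant r = f(z)
    return b, r


# f(x) = x^2 + 1 and z = i: the remainder is 0 and g(x) = x + i.
print(divide_by_linear([1, 0, 1], 1j))
# f(x) = x^3 - 6x^2 + 11x - 6 and z = 2: g(x) = x^2 - 4x + 3, r = 0.
print(divide_by_linear([1, -6, 11, -6], 2))   # ([1, -4, 3], 0)
```

Note that the remainder r always equals f(z), which is exactly why r = 0 when z is a root.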
We remark that for n = 1, 2 the expression of f as a product is well-known from high school:
$ax + b = a \cdot \left( x - \frac{-b}{a} \right)$,
$ax^2 + bx + c = a \cdot \left( x - \frac{-b + \sqrt{b^2 - 4ac}}{2a} \right) \cdot \left( x - \frac{-b - \sqrt{b^2 - 4ac}}{2a} \right)$.
There are also formulas for the roots for polynomials of degree 3 and 4, but in degrees 5 and higher no such
general formulas exist. It is not that they are merely unknown; they cannot exist. This follows from Galois theory,
taught in the course Algebra 4.
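For degree 2 the factorization into linear factors can be checked numerically, also for complex coefficients. This is a small sketch with names of our own (`quadratic_roots`, `expand`); it uses `cmath.sqrt`, which returns one of the two complex square roots, and works with floating point approximations.

```python
import cmath


def quadratic_roots(a: complex, b: complex, c: complex):
    """Both roots of a*x^2 + b*x + c for a != 0, via the quadratic formula."""
    if a == 0:
        raise ValueError("the leading coefficient must be non-zero")
    s = cmath.sqrt(b * b - 4 * a * c)
    return (-b + s) / (2 * a), (-b - s) / (2 * a)


def expand(a: complex, z1: complex, z2: complex):
    """Coefficients of a*(x - z1)*(x - z2), from highest degree to lowest."""
    return a, -a * (z1 + z2), a * z1 * z2


print(quadratic_roots(1, 0, 1))   # the roots of x^2 + 1, namely i and -i

# Expanding a(x - z1)(x - z2) recovers the original coefficients
# (up to rounding error).
a, b, c = 3 + 1j, -2 - 6j, 12
print(expand(a, *quadratic_roots(a, b, c)))
```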
¹³ We say that f(x)/g(x) = h(x) if h(x) is a polynomial such that f(x) = g(x)h(x). We shall see later that h(x) is uniquely
determined. In our case clearly $f(x)/(x - z_1) = a_n \prod_{i=2}^{n} (x - z_i)$.

The story of the solvability of polynomials is a fascinating one. One looks for formulas for solving
polynomial equations that only involve the coefficients of the polynomials and elementary operations such
as adding, subtracting, multiplying, dividing and taking n-th roots. This is called “solving by radicals". The
formulas for solving linear and quadratic polynomials by radicals have been known since antiquity: in some form or
another already to the Babylonians as early as 2000 BC and in a definite form to the Indian mathematician
Brahmagupta in 628 AD. This knowledge propagated through Al-Khwarizmi and others to Europe, and one
of the first European books containing such formulas was published in the 12th century by the Spanish Jewish
scholar Avraham bar Khiyya Ha-Nasi. The formulas for polynomials of degree 3 and 4 were published by the
Italian mathematician Gerolamo Cardano in 1545 (the case of a quartic reduces to a cubic and was discovered
by his student Ludovico Ferrari), and the case of cubic was also known to his contemporary and compatriot
Niccolo Fontana, who went by the name Tartaglia. In fact, Tartaglia had told Cardano how to solve cubics
but sworn him to complete secrecy. As it happened, both were preceded, by a margin of three decades, by Scipione
del Ferro, who preferred taunting his colleagues with challenges to solve particular cubic equations to
publishing his result. Cardano, after discovering del Ferro’s work, didn’t consider himself bound by his promise
to Tartaglia anymore and, indeed, in his book Ars Magna attributed the solution of the cubic to del
Ferro, while acknowledging that Tartaglia had told him of the method, but not its proof. Cardano’s
publication didn’t go down well with Tartaglia and a bitter feud ensued. Indeed, the solutions to the cubic and
quartic equations made Cardano and his book well-known throughout Europe and the scientific prestige easily
translated into material benefits as well.
For a long time finding a formula for equations of degree 5 was one of the outstanding problems of mathematics
and for a long time everyone was sure that such formulas must exist, although Gauss expressed some serious
doubts. Niels Henrik Abel, a Norwegian mathematical genius who died in 1829 at the age of 26 (and after
whom the Abel prize - the “Nobel prize for mathematics" - is named), proved that such formulas cannot
exist. Independently, Évariste Galois, who died at the age of 20 in 1832, proved not only the insolvability of
the quintic, but in fact of the general equation of any degree greater than 4, developing in the process the theory of
groups - a theory that transformed algebra and number theory.

7. Fields and rings - definitions and first examples

In the examples of number systems we have already discussed there are implicit structures that we want
to bring to light by defining them now in a formal way. At this point we just provide the definitions and
reexamine previous examples. Later we shall enter a systematic development of the theory. The process we
are about to carry out is very typical of how mathematics develops. Interesting structures are discovered
(in this case arithmetic of integers, rational numbers, complex numbers, but later we shall also consider
permutation groups). The typical (good) mathematician will ask herself what are the reasons that these
structures are so rich, and proceed to distill the particular features responsible for that richness (in our case,
we have “operations" and certain identities between operations, or, for groups, we have “composition"). The
next step is to turn the tables and ask, now that we have these abstracted structures, what other
examples may we find where these abstracted structures are manifest? In our context, these will be rings and
fields, or, in the context of groups, these will be abstract groups, no longer tied to permutations of objects, and
we would want to know what properties are shared between all these examples.

An operation (more pedantically called a “binary operation") on a set R is a function

w : R × R → R.

That is, an operation is a rule taking two elements of R and returning a new one. For example:

w : C × C → C, w ( z1 , z2 ) = z1 + z2 ,

or
w : C × C → C, w ( z1 , z2 ) = z1 z2 .
Often, for a general set R, we may denote w(z1 , z2 ) by z1 + z2 , or z1 z2 , if we want to stress the fact that
the operation behaves like addition, or multiplication. We shall of course focus on mathematical examples,
but one can certainly be more adventurous. For example, we can take R to be the set of sound waves and
the operation w to be the juxtaposition of two sound waves.

Definition 7.0.1. A ring R is a non-empty set together with two operations, called “addition" and “multiplication",
that are denoted, respectively, by
$(x, y) \mapsto x + y$, $(x, y) \mapsto xy$.
One requires the following axioms to hold:
(1) x + y = y + x, ∀ x, y ∈ R. (Commutativity of addition)
(2) ( x + y) + z = x + (y + z), ∀ x, y, z ∈ R. (Associativity of addition)
(3) There exists an element in R, denoted 0, such that 0 + x = x, ∀ x ∈ R. (Neutral element for
addition)
(4) ∀ x ∈ R, ∃y ∈ R such that x + y = 0. (Inverse with respect to addition)
(5) ( xy)z = x (yz), ∀ x, y, z ∈ R. (Associativity of multiplication)
(6) There exists an element 1 ∈ R such that 1x = x1 = x, ∀ x ∈ R. (Neutral element for multiplication)
(7) z( x + y) = zx + zy, ( x + y)z = xz + yz, ∀ x, y, z ∈ R. (Distributivity)

We remark that for us, by definition, a ring always has an identity element with respect to multiplication.
Most, but not all, authors follow this convention.
Definition 7.0.2. Note that the multiplication operation is not assumed to be commutative in general.
If xy = yx for all x, y ∈ R, we say R is a commutative ring. If for every non-zero x ∈ R there is an
element y ∈ R such that xy = yx = 1, and also 0 ≠ 1 in R, we call R a division ring. A commutative
division ring is called a field.
Example 7.0.3. Z is a commutative ring. It is not a division ring and so it is not a field.
Example 7.0.4. The rational numbers Q form a field. The real numbers R form a field. In both cases we
assume the properties of addition and multiplication as “well known". The complex numbers also form a field,
in fact we have at some level already used all the axioms implicitly in our calculations, but now we prove it
formally using that R is a field.

Proposition 7.0.5. C is a field.

Proof. Let z1 = a1 + b1 i, z2 = a2 + b2 i, z3 = a3 + b3 i. We verify the axioms:
1. z1 + z2 = ( a1 + a2 ) + (b1 + b2 )i = ( a2 + a1 ) + (b2 + b1 )i = z2 + z1 .
2. (z1 + z2 ) + z3 = [( a1 + a2 ) + (b1 + b2 )i ] + a3 + b3 i = [( a1 + a2 ) + a3 ] + [(b1 + b2 ) + b3 ]i = [ a1 + ( a2 +
a3 )] + [b1 + (b2 + b3 )]i = z1 + [( a2 + a3 ) + (b2 + b3 )i ] = z1 + (z2 + z3 ).
3. Clearly 0 + z1 = z1 .
4. We have (− a1 − b1 i ) + ( a1 + b1 i ) = (− a1 + a1 ) + (−b1 + b1 )i = 0 + 0i = 0.
5. (z1 z2 )z3 = [( a1 + b1 i )( a2 + b2 i )]( a3 + b3 i ) = (( a1 a2 − b1 b2 ) + ( a1 b2 + b1 a2 )i )( a3 + b3 i ) = ( a1 a2 −
b1 b2 ) a3 − ( a1 b2 + b1 a2 )b3 + (( a1 a2 − b1 b2 )b3 + ( a1 b2 + b1 a2 ) a3 )i = a1 a2 a3 − b1 b2 a3 − a1 b2 b3 − b1 a2 b3 +
( a1 a2 b3 − b1 b2 b3 + a1 b2 a3 + b1 a2 a3 )i. One now develops the product z1 (z2 z3 ) in the same way and checks
that the answers match. We don’t do that here.
6. Clearly 1 · z1 = z1 · 1 = z1 .
7. z1 (z2 + z3 ) = ( a1 + b1 i )(( a2 + a3 ) + (b2 + b3 )i ) = a1 ( a2 + a3 ) − b1 (b2 + b3 ) + (b1 ( a2 + a3 ) + a1 (b2 +
b3 ))i = ( a1 a2 − b1 b2 ) + (b1 a2 + a1 b2 )i + ( a1 a3 − b1 b3 ) + (b1 a3 + a1 b3 )i = ( a1 + b1 i )( a2 + b2 i ) + ( a1 +
b1 i )( a3 + b3 i ) = z1 z2 + z1 z3 .
Before proving the next property, we check that z1 z2 = z2 z1 . We have z1 z2 = ( a1 + b1 i ) ( a2 + b2 i ) =
a1 a2 − b1 b2 + ( a1 b2 + b1 a2 )i = a2 a1 − b2 b1 + ( a2 b1 + b2 a1 )i = ( a2 + b2 i )( a1 + b1 i ) = z2 z1 . In particu-
lar, (z1 + z2 )z3 = z3 (z1 + z2 ) = z3 z1 + z3 z2 = z1 z3 + z2 z3 .
Finally, as we have already seen, if $z_1 \neq 0$ then $z_1 \cdot \frac{\overline{z_1}}{|z_1|^2} = 1$. We proved that C is a field.
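The expansion of $z_1(z_2 z_3)$ that the proof leaves to the reader can at least be spot-checked by machine. The sketch below is not a proof: it merely tests associativity, commutativity and distributivity of the multiplication formula on random complex numbers with integer coordinates, stored as pairs (a, b), where arithmetic is exact. The helper names and the sample size are our own choices.

```python
import random


def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i."""
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def axioms_hold(z1, z2, z3) -> bool:
    """Axioms 5 and 7, and commutativity of multiplication, for one triple."""
    return (
        mul(mul(z1, z2), z3) == mul(z1, mul(z2, z3))
        and mul(z1, add(z2, z3)) == add(mul(z1, z2), mul(z1, z3))
        and mul(z1, z2) == mul(z2, z1)
    )


random.seed(0)


def random_point():
    return (random.randint(-50, 50), random.randint(-50, 50))


print(all(axioms_hold(random_point(), random_point(), random_point())
          for _ in range(1000)))   # True
```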

Example 7.0.6. Here is an example of a non-commutative ring. The elements of this ring are 2-by-2 matrices
with entries in R. We define

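$\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a + \alpha & b + \beta \\ c + \gamma & d + \delta \end{pmatrix}$, $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a\alpha + b\gamma & a\beta + b\delta \\ c\alpha + d\gamma & c\beta + d\delta \end{pmatrix}$.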
c d γ δ c+γ d+δ c d γ δ cα + dγ cβ + dδ

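The verification of the axioms is a straightforward, but tiresome, business. We shall not do it here. The zero
element is $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ and the identity element is $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. To see that the ring is not commutative we give the
following example.

$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$.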
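The two products in this example are quickly confirmed by a direct implementation of the multiplication rule. In this sketch a matrix is stored as a pair of rows; the names `mat_mul`, `E12` and `E21` are ours.

```python
def mat_mul(A, B):
    """Product of two 2-by-2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (p, q), (r, s) = B
    return ((a * p + b * r, a * q + b * s),
            (c * p + d * r, c * q + d * s))


E12 = ((0, 1), (0, 0))
E21 = ((0, 0), (1, 0))

print(mat_mul(E12, E21))  # ((1, 0), (0, 0))
print(mat_mul(E21, E12))  # ((0, 0), (0, 1))
```

Since the two results differ, multiplication in this ring is not commutative.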

7.1. Some formal consequences of the axioms. We note some useful formal consequences of the axioms
defining a ring:
(1) The element 0 appearing in axiom (3) is unique. Indeed, if q is another element with the same
property then q + x = x for any x and in particular q + 0 = 0. But also, using the property of 0 and
commutativity, we have q + 0 = 0 + q = q. So q = 0.
(2) The element y appearing in axiom (4) is unique. Indeed, if for a given x we have x + y = x + y′ = 0
then y = y + (x + y′) = (y + x) + y′ = (x + y) + y′ = 0 + y′ = y′. We shall denote this element y
by −x.
(3) We have −(− x ) = x and −( x + y) = − x − y, where, technically − x − y means (− x ) + (−y). To
prove that, it is enough, after what we have just proven, to show that − x + x = 0 and that ( x +
y) + (− x − y) = 0 (after all, −(− x ) is that unique element that when added to − x gives 0, etc.).
The first is clear, and ( x + y) + (− x − y) = x + (− x ) + y + (−y) = 0 + 0 = 0.
Although this proof is simple and short, it is based on an important idea, worth internalizing.
We prove that an object is equal to another object by proving they both have a property known to
uniquely determine the object. Namely, the proof that −(− x ) = x consists of showing that the
objects x and −(− x ) both have the property that adding them to − x gives zero. Since this is known
to uniquely characterize the additive inverse of − x, denoted −(− x ), we conclude that x = −(− x ).
(4) The element 1 in axiom (6) is unique. (Use the same argument as in (1)).
(5) We have x · 0 = 0, 0 · x = 0. Indeed, x · 0 = x · (0 + 0) = x · 0 + x · 0. Let y = x · 0; then y = y + y
and so 0 = −y + y = −y + (y + y) = (−y + y) + y = 0 + y = y. The same argument, applied to
0 · x = (0 + 0) · x, gives 0 · x = 0.

We shall see many examples of rings and fields in the course. For now, we just give one more definition and
some examples.
Definition 7.1.1. Let R be a ring. A subset S ⊂ R is called a subring if 0, 1 ∈ S and if a, b ∈ S implies
that a + b, − a and ab ∈ S.
Note that the definition says that the operations of addition and multiplication in R give operations of
addition and multiplication in S (namely, the outcome is in S and so we get functions S × S → S), satisfying
all the axioms of a ring. It follows that S is a ring whose zero element is that of R and whose identity is,
likewise, that of R.
For example, Q is a subring of R and Z is a subring of Q, as well as of R.
Example 7.1.2. Consider the set {0, 1} with the following addition and multiplication tables.
 + | 0  1          × | 0  1
 0 | 0  1          0 | 0  0
 1 | 1  0          1 | 0  1
One can verify that this is a ring by directly checking the axioms. (We shall later see that this is the ring of
integers modulo 2).
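Because the set has only two elements, "directly checking the axioms" is a finite task that a computer can carry out. The sketch below encodes the two tables as dictionaries and tests the seven axioms of Definition 7.0.1 by brute force; the function name `is_ring` and the encoding are our own.

```python
from itertools import product

R = (0, 1)
ADD = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}   # the addition table
MUL = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # the multiplication table


def is_ring(elements, add, mul, zero, one) -> bool:
    """Brute-force check of axioms (1)-(7) for finite operation tables."""
    a = lambda x, y: add[(x, y)]
    m = lambda x, y: mul[(x, y)]
    pairs = list(product(elements, repeat=2))
    triples = list(product(elements, repeat=3))
    return (
        all(a(x, y) == a(y, x) for x, y in pairs)                          # (1)
        and all(a(a(x, y), z) == a(x, a(y, z)) for x, y, z in triples)     # (2)
        and all(a(zero, x) == x for x in elements)                         # (3)
        and all(any(a(x, y) == zero for y in elements) for x in elements)  # (4)
        and all(m(m(x, y), z) == m(x, m(y, z)) for x, y, z in triples)     # (5)
        and all(m(one, x) == x == m(x, one) for x in elements)             # (6)
        and all(m(z, a(x, y)) == a(m(z, x), m(z, y))
                and m(a(x, y), z) == a(m(x, z), m(y, z))
                for x, y, z in triples)                                    # (7)
    )


print(is_ring(R, ADD, MUL, zero=0, one=1))  # True
```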
Example 7.1.3. Consider all expressions of the form $\{a + b\sqrt{2} : a, b \in \mathbb{Z}\}$. We use the notation $\mathbb{Z}[\sqrt{2}]$ for
this set. This set is actually a ring. Since $\mathbb{Z}[\sqrt{2}] \subset \mathbb{R}$ and R is a ring (even a field!), it is enough to check it’s
a subring. Indeed, $0, 1 \in \mathbb{Z}[\sqrt{2}]$. Suppose $a + b\sqrt{2}, c + d\sqrt{2} \in \mathbb{Z}[\sqrt{2}]$. Then: (1) $(a + b\sqrt{2}) + (c + d\sqrt{2}) =
(a + c) + (b + d)\sqrt{2} \in \mathbb{Z}[\sqrt{2}]$; (2) $-(a + b\sqrt{2}) = -a - b\sqrt{2} \in \mathbb{Z}[\sqrt{2}]$; (3) $(a + b\sqrt{2})(c + d\sqrt{2}) =
(ac + 2bd) + (ad + bc)\sqrt{2} \in \mathbb{Z}[\sqrt{2}]$.
Let now
$\mathbb{Q}[\sqrt{2}] = \{a + b\sqrt{2} : a, b \in \mathbb{Q}\}$.

This is a field. The verification that this is a subring of C is the same as above. It is thus a commutative
ring in which 0 ≠ 1. We need to show the existence of an inverse for multiplication. If $a + b\sqrt{2}$ is not zero then either a or b is
not zero. If
$c = a^2 - 2b^2$
is zero then either b = 0 (but then a ≠ 0 and so c ≠ 0, so this case doesn’t happen), or $2 = (a/b)^2$, so that $\sqrt{2} = |a/b|$ is a rational
number. We shall prove in Proposition 10.2.4 that this is not the case. Thus, c ≠ 0. Now, $\frac{a}{c} - \frac{b}{c}\sqrt{2} \in \mathbb{Q}[\sqrt{2}]$
and it is easy to check that
$(a + b\sqrt{2}) \left( \frac{a}{c} - \frac{b}{c}\sqrt{2} \right) = 1$.
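The inverse formula can be tried out with exact rational arithmetic. In the sketch below, which uses our own conventions, the element $a + b\sqrt{2}$ is stored as the pair (a, b), and `mul` and `inverse` are hypothetical helper names implementing the formulas of this example.

```python
from fractions import Fraction


def mul(u, v):
    """(a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)


def inverse(u):
    """Inverse of a + b*sqrt2 (not zero): a/c - (b/c)*sqrt2 with c = a^2 - 2b^2."""
    a, b = Fraction(u[0]), Fraction(u[1])
    c = a * a - 2 * b * b
    if c == 0:
        # For rational a, b this happens only when a = b = 0.
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (a / c, -b / c)


u = (1, 1)                      # the element 1 + sqrt2
print(inverse(u))               # (Fraction(-1, 1), Fraction(1, 1)), i.e. -1 + sqrt2
print(mul(u, inverse(u)))       # (Fraction(1, 1), Fraction(0, 1)), i.e. 1
```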

8. Exercises
(1) Let B be a given set. What does the equality A ∪ B = B ∩ C imply on A? on C?
(2) Calculate the following intersection and union of sets (provide short explanations, if not complete
proofs).
Notation: If a, b are real numbers we use the following notation:
[ a, b] = { x ∈ R| a ≤ x ≤ b}.
[ a, b) = { x ∈ R| a ≤ x < b}.
( a, b] = { x ∈ R| a < x ≤ b}.
( a, b) = { x ∈ R| a < x < b}.
We also use
[ a, ∞) = { x ∈ R| a ≤ x }.
(−∞, b] = { x ∈ R| x ≤ b}.
(−∞, ∞) = R.
If $A_1, A_2, A_3, \dots$ are sets, we may write $\cup_{i=1}^{N} A_i$ for $A_1 \cup A_2 \cup \cdots \cup A_N$ and $\cup_{i=1}^{\infty} A_i$ for $\cup_{i \in \{1, 2, 3, \dots\}} A_i$.
(a) Let N ≥ 1 be a natural number. What is $\cup_{n=1}^{N} [-n, n]$? What is $\cap_{n=1}^{N} [-n, n]$?
(b) What is $\cup_{n=1}^{\infty} [n, n + 1]$? What is $\cup_{n=1}^{\infty} (n, n + 2)$?
(c) What is $\cup_{n=1}^{\infty} (n, n + 1)$? What is $\cup_{n=1}^{\infty} (1/n, 1]$?
(d) Let $A_n = \{x^n : x \in \mathbb{N}\}$. What is $\cap_{n=1}^{\infty} A_n$?
(e) Let B = (−1, 1) × (−1, 1), the open square in the plane. Write B as an infinite union of closed
discs of positive radius (where a closed disc with center (v0 , v1 ) and positive radius is a set of
the form $\{(v_0, v_1) + (x, y) : x^2 + y^2 \le r\}$ for some fixed positive real number r).
(f) Prove that one cannot write this B as a finite union of discs.
(3) Using the intervals [0, 2] and (1, 3] and the operations of union, intersection and difference, create 8
different sets. For example, the set [0, 1] ∪ (2, 3] is ([0, 2] \ (1, 3]) ∪ ((1, 3] \ [0, 2]).
(4) Let A, B and C be sets. Prove or disprove:
(a) ( A \ B) \ C = A \ ( B \ C );
(b) ( B \ A) ∪ ( A \ B) = ( A ∪ B) \ ( A ∩ B).
(5) Prove that
A \ (∩i∈ I Bi ) = ∪i∈ I ( A \ Bi ).
(6) Prove that the Principle of Induction (Theorem 2.3.1) implies the statement: “every non-empty
subset of N has a minimal element".
(7) Prove by induction that for n ≥ 1,
$1^3 + \cdots + n^3 = (1 + \cdots + n)^2$.
You may use the formula for the right hand side previously given.
(8) Prove by induction that $2^n > n^2$ for n ≥ 5. It is false if we take n ≥ 0 (because it fails for n = 2),
but it happens to hold for n = 0. Where would your proof break down if you try to argue by induction
starting at n = 0?
(9) Let $C_n$ denote the number of ways to cover the squares of a 2 × n checkers board using plain dominos.
Therefore, $C_1 = 1$, $C_2 = 2$, $C_3 = 3$. Compute $C_4$ and $C_5$ and find a recursive formula for $C_n$. Prove
by induction that $C_n = \frac{1}{\sqrt{5}} \cdot \left( \left( \frac{1 + \sqrt{5}}{2} \right)^{n+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{n+1} \right)$.
(10) Prove by induction that for n ≥ 1
1 + 3 + 5 + · · · + (2n − 1) = n2 .

(11) Let A = {1, 2, 3, 4} and B = { a, b, c}.


(a) Write 4 different surjective functions from A to B.

(b) Write 4 different injective functions from B to A.
(c) How many functions are there from A to B?
(d) How many surjective functions are there from A to B?
(e) How many injective functions are there from A to B?
(f) How many functions are there from B to A?
(g) How many surjective functions are there from B to A?
(h) How many injective functions are there from B to A?
(12) Let A be a set with a elements, B a set with b elements, where a, b are finite. Show that the number
of functions f : B → A is $a^b$. One sometimes writes the set of functions from B to A using the
notation $A^B$. In this notation, prove that $|A^B| = |A|^{|B|}$.
In particular, using this interpretation what should $0^b$ be equal to if b > 0? What should $a^0$ be
equal to if a > 0? What should $0^0$ be equal to?
Note that this allows us to define the power of any cardinality to any cardinality. For example, if
$\aleph_0 = |\mathbb{N}|$ then $\aleph_0^{\aleph_0} := |\mathbb{N}^{\mathbb{N}}| = |\{f : \mathbb{N} \to \mathbb{N}\}|$ and $2^{\aleph_0} = |\{f : \mathbb{N} \to \{0, 1\}\}|$. One can show that
$2^{\aleph_0} = |\mathbb{R}|$ by first showing $|\mathbb{R}| = |(0, 1)|$ by providing explicitly a bijection and then using the binary
expansion of real numbers to show $2^{\aleph_0} = |(0, 1)|$ (which is a bit tedious as the binary expansion is not
quite unique: 1/2 is both 0.1000 . . . and 0.011111 . . . ). In fact, also $\aleph_0^{\aleph_0} = 2^{\aleph_0}$. See also Exercise 19.

(13) Let f : A → B be a function.

a) Prove that f is bijective if and only if there exists a function g : B → A such that $f \circ g = 1_B$
and $g \circ f = 1_A$.
b) Prove or disprove: if there exists a function g : B → A such that $f \circ g = 1_B$ then f is bijective.

(14) Consider N × N as a rectangular array:

(0, 0) (0, 1) (0, 2) (0, 3) ...
(1, 0) (1, 1) (1, 2) (1, 3) ...
(2, 0) (2, 1) (2, 2) (2, 3) ...
(3, 0) (3, 1) (3, 2) (3, 3) ...
⋮
Count the elements of N × N using diagonals as follows:
0 1 3 6 10 ...
2 4 7 11 ...
5 8 12 ...
9 13 ...
14 ...
⋮
This defines a function
f : N×N → N
where f (m, n) is the number appearing in the (m, n) place. (For example, f (0, 0) = 0, f (3, 1) =
13, f (2, 2) = 12.) Provide an explicit formula for f (it is what one calls “a polynomial function in
the variables m, n". It may be a good idea to first find a formula for f (0, n)). Note that this f is a
bijection N × N → N. And there is nothing wild about it. It is a rather simple polynomial.
(15) Prove that if | A1 | = | A2 | and | B1 | = | B2 | then | A1 × B1 | = | A2 × B2 |.

(16) Let A, B be sets and assume A is not the empty set. Prove that | A| ≤ | B| if and only if there is a
surjective function B → A.
(17) Prove that |N| = |Q|. (Hint: Show two inequalities; note that there is an easy injection Q → Z ×
Z).
(18) Prove or disprove: if | A1 | = | A2 |, A1 ⊇ B1 , A2 ⊇ B2 , and | B1 | = | B2 | then | A1 \ B1 | = | A2 \ B2 |.
(19) Let A be a set. Then $|A| < |2^A|$, where $2^A$ is the set of all subsets of A. (Another common notation
for the set of subsets of A is P(A).) Prove this as follows: First show $|A| \le |2^A|$ by constructing
an injection $A \to 2^A$. Suppose now that there is a bijection
$A \to 2^A$, $a \mapsto U_a$.
Define a subset U of A by
$U = \{a : a \notin U_a\}$.
Show that if $U = U_b$ we get a contradiction. (This is some sort of “diagonal argument".) Put all
this together to conclude $|A| < |2^A|$.
(20) Prove that a number is rational if and only if from some point on its decimal expansion becomes
periodic.
(21) Prove that if a product z1 · z2 of complex numbers is equal to zero then at least one of z1 , z2 is zero.
(22) Let f be the complex polynomial $f(x) = (3 + i)x^2 + (-2 - 6i)x + 12$. Find a complex number z such
that the equation f(x) = z has a unique solution (use the formula for solving a quadratic equation).
(23) Find the general form of a complex number z such that: (i) $z^2$ is a real number (i.e., $\mathrm{Im}(z^2) = 0$).
(ii) $z^2$ is a purely imaginary number (i.e., $\mathrm{Re}(z^2) = 0$). (iii) $z^2 = \bar{z}$. (iv) $\mathrm{Im}(z^2 + \bar{z}) = 0$. (v)
$\mathrm{Re}(z^2 + \bar{z}) = 0$. Also, in each of these cases, plot the answer on the complex plane.
(24) The ring of 2 × 2 matrices over a field F.
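Let F be a field. We consider the set
$M_2(\mathbb{F}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{F} \right\}$.
It is called the two-by-two matrices over F. We define the addition of two matrices as
$\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} + \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 & b_1 + b_2 \\ c_1 + c_2 & d_1 + d_2 \end{pmatrix}$.
We define multiplication by
$\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 a_2 + b_1 c_2 & a_1 b_2 + b_1 d_2 \\ c_1 a_2 + d_1 c_2 & c_1 b_2 + d_1 d_2 \end{pmatrix}$.
Prove that this is a ring. For each of the following subsets of $M_2(\mathbb{F})$ determine if they are subrings
or not.
(a) The set $\left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in M_2(\mathbb{F}) \right\}$.
(b) The set $\left\{ \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \in M_2(\mathbb{F}) \right\}$.
(c) The set $\left\{ \begin{pmatrix} a & 0 \\ c & d \end{pmatrix} \in M_2(\mathbb{F}) \right\}$.
(d) The set $\left\{ \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \in M_2(\mathbb{F}) \right\}$.
(e) The set $\left\{ \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \in M_2(\mathbb{F}) \right\}$.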
Remark: One defines in a very similar way the ring of n × n matrices with entries in a field F.

(25) The zero ring. Let e be a formal symbol and let R = {e}. Define e + e = e, e · e = e. Verify that R
is a ring in which 1 = 0. Prove that if S is any ring in which 1 = 0 then S has a unique element and,
if we choose to denote it by e, then e + e = e, e · e = e.
(26) Prove that a commutative ring with 2 elements is a field.
(27) Prove that a commutative ring with 3 elements is a field.

Part 2. Arithmetic in Z
In this part of the course we are going to study arithmetic in the ring of integers Z. We are going to focus
on particular properties of this ring. Our choice of properties is motivated by an analogy to be drawn later
between integers and polynomials. In fact, there is a general class of rings to which one can extend this
analogy, called Euclidean rings; they are discussed briefly in Appendix C.
We are focusing here on rather simple properties of the integers. However, the integers have been the
playground of mathematicians for millennia and the fascination with their properties never waned. Starting
from the Greek mathematicians and philosophers, for whom integers were a manifestation of the divine, simple
problems concerning integers were explored. Remarkably, for most of those problems we still do not know
the answer. For example, a positive integer n is called perfect if the sum of its divisors, excluding itself, is
equal to n. Thus, 6 is a perfect number as 6 = 1 + 2 + 3, as is 28. We do not know if odd perfect numbers
exist and we do not know if there are infinitely many even perfect numbers (although I would suspect that
the answer to the first is no, and for the second is yes). Because of the special role prime numbers play in
arithmetic, perhaps the most substantial open problems are problems concerning prime numbers, and some
of these are mentioned below.

9. Division, GCD and the Euclidean Algorithm

9.1. Division with residue.
Theorem 9.1.1. (Division with residue) Let a, b be integers with b ≠ 0. There exist integers q, r such that
a = qb + r, 0 ≤ r < | b |.
Moreover, q and r are uniquely determined. r is called the residue and q the quotient.
Proof. For simplicity, assume b > 0. Very similar arguments prove the case b < 0.
Consider the set
S = { a − bx : x ∈ Z, a − bx ≥ 0}.
It is the set of all non-negative “residues”. We claim that S is a non-empty set. Indeed, if a ≥ 0 take x = 0
and it follows that a ∈ S. If a < 0 take x = a and a − bx = a(1 − b) ≥ 0 (because b > 0 and so b ≥ 1). That
is, a(1 − b) ∈ S. Since S is a non-empty subset of N, it follows that S has a minimal element r = a − bq
for some q. Then r < b; otherwise, 0 ≤ r − b = a − b(q + 1) is an element of S as well and smaller than r,
which is a contradiction. It follows that
a = bq + r, 0 ≤ r < b.
We now show that q and r are unique. Suppose
\[ a = bq' + r', \qquad 0 \le r' < b. \]
If $q = q'$ then also $r = a - bq = a - bq' = r'$. Else, either $q > q'$ or $q' > q$. Note that
\[ 0 = bq + r - (bq' + r') = b(q - q') + (r - r'). \]
If $q > q'$ then $r' = r + b(q - q') \ge r + b \ge b$. Contradiction. If $q < q'$ we get $r = r' + b(q' - q) \ge b$ and
again a contradiction.
Example 9.1.2. Let a = 40 and b = 13. We then have
40 = 3 · 13 + 1,
and as 0 ≤ 1 < 13 we have divided 40 by 13 with residue 1. If we take a = 40 and b = −13 then
40 = (−3) · (−13) + 1
and since 0 ≤ 1 < | − 13| we have indeed divided 40 by −13 with residue 1. On the other hand, take a = −40
and b = 13. We have −40 = −3 · 13 − 1, but this is not proper division with residue as the residue, namely
−1, is negative. The proper way to do that is
−40 = (−4) · 13 + 12.
Indeed 0 ≤ 12 < 13.
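
The divisions of Example 9.1.2 can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the function name divide_with_residue is our own choice. It relies on Python's built-in divmod, whose remainder takes the sign of b, and adjusts the result so that 0 ≤ r < |b| as required by Theorem 9.1.1.

```python
def divide_with_residue(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < |b| (b must be non-zero)."""
    if b == 0:
        raise ValueError("b must be non-zero")
    q, r = divmod(a, b)  # Python's remainder has the sign of b, so r <= 0 when b < 0
    if r < 0:
        # Here b < 0; moving one copy of b from r to q shifts r into [0, |b|)
        q, r = q + 1, r - b
    return q, r


print(divide_with_residue(40, 13))    # (3, 1)
print(divide_with_residue(40, -13))   # (-3, 1)
print(divide_with_residue(-40, 13))   # (-4, 12)
```

Note that for b > 0 no adjustment is needed, because divmod already returns a non-negative remainder in that case.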

9.2. Division.
Definition 9.2.1. Let a, b be integers. We say that a|b (read, a divides b) if there is an element c ∈ Z such
that b = ac.
Here are some properties:
(1) a|b ⇒ a| − b.
(2) a|b ⇒ a|bd for any d ∈ Z.
(3) a|b, a|d ⇒ a|(b ± d).
(4) if a|b and b| a then a = ±b.
Proof. Write b = ac. Then −b = a · (−c) and so a| − b. Also, bd = a · (cd) and so a|bd.
Write also d = ae. Then b ± d = a · (c ± e) and so a|(b ± d).
Now, for the last property. For some integers c, d we have b = ac, a = bd and so we find that a = bd = acd
and so a(1 − cd) = 0. If a = 0 then also b = 0 and the conclusion a = ±b is definitely true. Otherwise, we
must have cd = 1. But, as c and d are integers, we must have c = d = ±1 and the conclusion follows.

Remark 9.2.2. Extreme cases are sometimes confusing. It follows from the definition that 0 divides only 0,
that every number divides 0 and that ±1 (and only them) divide any number.
Example 9.2.3. Which are the integers that can divide both n and 2n + 5? If a divides n, then a divides 2n
and so, if a divides 2n + 5 then a divides 5, because 5 = (2n + 5) − 2n. Therefore a = ±1, ±5 are the only
possibilities. However, note that this does not imply that they actually occur in every case. If n = 0, they all
do. If n = 1 only ±1 are common divisors.
Corollary 9.2.4. Let a ≠ 0. a|b if and only if in dividing b by a with residue, b = aq + r, the residue r is zero.
Proof. If the residue r = 0 then b = aq and so a|b. If a|b and b = aq + r then a|(b − aq), i.e., a|r. But 0 ≤ r < |a|
and so that’s possible only if r = 0.

9.3. GCD.
Definition 9.3.1. Let a, b, be integers, not both zero. The greatest common divisor (gcd) of a and b,
denoted gcd( a, b) or just ( a, b) if the context is clear, is the largest (positive) integer dividing both a and b.
Theorem 9.3.2. Let a, b, be integers, not both zero, and d = ( a, b) their gcd. Then every common divisor
of a and b divides d. There are integers u, v such that
d = ua + vb.
Moreover, d is the minimal positive number that has the form ua + vb.
Proof. Let
S = {ma + nb : m, n ∈ Z, ma + nb > 0}.
First note that S ≠ ∅. Indeed, a · a + b · b > 0 and so a · a + b · b ∈ S. Let D be the minimal element of S. Then, for some u, v ∈ Z
we have D = ua + vb.
We claim that D = d. To show D | a, write a = qD + r, 0 ≤ r < D. Then, D > r = a − qD =
a − q(ua + vb) = (1 − qu) a − qvb. If r ≠ 0 then r = (1 − qu) a − qvb is an element of S smaller than D
and that’s a contradiction. It follows that r = 0, that is D | a. In the same way, D |b.
On the other hand, let e be any common divisor of a and b. Then e also divides ua + vb = D, and in
particular e ≤ D because D > 0. It follows that D is the largest common divisor of a, b, so D = d, and also
that any other common divisor of a and b divides it.
Corollary 9.3.3. If a|bc and gcd( a, b) = 1 then a|c.
Proof. We have 1 = ua + vb for some integers u, v. Since a|uac and a|vbc we have a|(uac + vbc) = c.

9.4. The Euclidean algorithm.[14] The question arises: how do we compute in practice the gcd of two
integers? This is a very practical issue, even in the simple task of simplifying fractions! As we shall see, there
are two methods. One method uses the prime factorization of the two numbers; we shall discuss that later.
The other method, which is much more efficient, is the Euclidean algorithm.
Theorem 9.4.1. (The Euclidean Algorithm) Let a, b be positive integers with a ≥ b. If b| a then gcd( a, b) = b.
Else perform the following recursive division with residue:
\begin{align*}
a &= bq_0 + r_0, & 0 < r_0 < b, \\
b &= r_0 q_1 + r_1, & 0 \le r_1 < r_0, \\
r_0 &= r_1 q_2 + r_2, & 0 \le r_2 < r_1, \\
&\;\;\vdots
\end{align*}
For some t we must first get that $r_{t+1} = 0$. That is,
\begin{align*}
r_{t-2} &= r_{t-1} q_t + r_t, & 0 < r_t < r_{t-1}, \\
r_{t-1} &= r_t q_{t+1}.
\end{align*}
Then $r_t$ is the gcd of a and b.
Before giving the proof we provide two examples.
1). Take a = 113, b = 54. Then

113 = 54 · 2 + 5
54 = 5 · 10 + 4
5 = 4·1+1
4 = 1 · 4.
Thus gcd(113, 54) = 1.

2). Now take a = 442, b = 182. Then

442 = 182 · 2 + 78
182 = 78 · 2 + 26
78 = 26 · 3
and so gcd(442, 182) = 26.
Proof. Let d = gcd(a, b). We claim that $d \mid r_n$ for every n. We prove that by induction: First, since $d \mid a$ and $d \mid b$,
we get $d \mid (a - bq_0) = r_0$. Suppose that $d \mid r_i$ for i = 0, 1, 2, . . . , n. Since $r_{n+1} = r_{n-1} - r_n q_{n+1}$ (where, for n = 0,
we read $r_{-1}$ as b) we get that $d \mid r_{n+1}$ as well. In particular, $d \mid r_t$.
We now show that $r_t \mid a$, $r_t \mid b$. It then follows that $r_t \mid d$ and therefore $r_t = d$. We again prove that by
induction. We have $r_t \mid r_t$ and $r_t \mid r_t q_{t+1} = r_{t-1}$. Suppose we have already shown that $r_t$ divides $r_t, r_{t-1}, \dots, r_n$.
Then, since $r_{n-1} = r_n q_{n+1} + r_{n+1}$ we also get $r_t \mid r_{n-1}$. Therefore, $r_t$ divides $r_0, r_1, \dots, r_t$. Again, $b = r_0 q_1 + r_1$
and so $r_t \mid b$ and then $a = bq_0 + r_0$ and so $r_t \mid a$.
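
The recursive divisions of Theorem 9.4.1 translate directly into a loop. Here is a minimal Python sketch, offered as an illustration only; the name euclid_steps and the choice to record each division as a tuple are our own. It reproduces the two numerical examples given before the proof.

```python
def euclid_steps(a, b):
    """Return (gcd, steps) for positive integers a >= b.

    Each step is a tuple (dividend, divisor, quotient, residue),
    i.e. one line "dividend = divisor * quotient + residue" of the algorithm.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r          # the divisor and residue become the next pair
    return a, steps          # the last non-zero residue (or b itself if b | a)


g, steps = euclid_steps(442, 182)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {divisor} * {q} + {r}")
print("gcd =", g)            # gcd = 26
```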
[14] Euclid (about 325 - 265 BC), after whom Euclidean geometry and the Euclidean algorithm are named, was a Greek
mathematician, most famous for his magnum opus “Elements”, consisting of 13 books devoted to plane and spatial geometry,
number theory and algebra. Besides landmark results, such as the infinitude of primes and the Fundamental Theorem of
Arithmetic, the Elements are important for having set mathematics on the right course. Euclid’s postulates, laying the logical
foundation to plane geometry, served as a model for the development of mathematics at large. The Elements is the most
successful textbook ever written and has served since its inception until the 19th and even early 20th century as a textbook in
high school and university. Abraham Lincoln kept a copy of Euclid in his saddlebag, and studied it late at night by lamplight;
he related that he said to himself, “You never can make a lawyer if you do not understand what demonstrate means; and I left
my situation in Springfield, went home to my father’s house, and stayed there till I could give any proposition in the six books
of Euclid at sight”.
A further bonus supplied by the Euclidean algorithm is that it allows us to find u, v such that gcd( a, b) =
ua + vb. We just illustrate it in two examples:
1). Take a = 113, b = 54. Then, as we saw,

113 = 54 · 2 + 5
54 = 5 · 10 + 4
5 = 4·1+1
4 = 1 · 4.
and so gcd(113, 54) = 1. We have 1 = 5 − 4 · 1, and we keep replacing each residue by its expression in
terms of the previous residues (the important numbers to modify are the residues, not the
quotients $q_i$). 4 = 54 − 5 · 10 and we get 1 = 5 − (54 − 5 · 10) = −54 + 5 · 11. Next, 5 = 113 − 54 · 2 and
we get 1 = −54 + 5 · 11 = −54 + (113 − 54 · 2) · 11 = 54 · (−23) + 113 · 11. Thus,
1 = gcd(54, 113) = −23 · 54 + 11 · 113.

2). Now take a = 442, b = 182. Then

442 = 182 · 2 + 78
182 = 78 · 2 + 26
78 = 26 · 3
and so gcd(442, 182) = 26. Here the process is easier: 26 = 182 − 78 · 2 = 182 − (442 − 182 · 2) · 2 =
5 · 182 − 2 · 442.
26 = gcd(182, 442) = 5 · 182 − 2 · 442.
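
The back-substitution carried out in the two examples above can be organized so that u and v are computed while the divisions are performed. The following Python sketch shows one standard way of doing so (often called the extended Euclidean algorithm); it is an illustration under our own naming, not the procedure of the notes verbatim. It keeps, for each residue, an expression of that residue as a combination of a and b.

```python
def extended_gcd(a, b):
    """Return (d, u, v) with d = gcd(a, b) = u*a + v*b, for positive integers a, b."""
    # Invariant maintained throughout: r0 == u0*a + v0*b and r1 == u1*a + v1*b
    r0, u0, v0 = a, 1, 0
    r1, u1, v1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        # The new residue r0 - q*r1 inherits its expression from the two previous ones
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    return r0, u0, v0


print(extended_gcd(113, 54))    # (1, 11, -23):  1 = 11*113 - 23*54
print(extended_gcd(442, 182))   # (26, -2, 5):  26 = -2*442 + 5*182
```

The pair (u, v) is not unique; this sketch happens to return the same coefficients as the hand computation above.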

10. Primes and unique factorization

10.1. Primes.
Definition 10.1.1. An integer p ≠ 0, ±1 is called prime if its only divisors are ±1, ±p.
The phrase “prime number” is usually used to denote a prime positive integer. An integer p > 1 is prime iff
its only positive divisors are 1 and p.

The sieve of Eratosthenes:[15] This is a method that allows one to construct rapidly a list of all primes less
than a given number N. We illustrate that with N = 50. One writes all the numbers from 2 to 50:

2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,41, 42, 43, 44, 45,
46, 47, 48, 49, 50
The first number on the list is prime. This is 2. We write it in bold-face and cross all its multiples (we denote
crossing out by an underline):

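$\mathbf{2}, 3, \underline{4}, 5, \underline{6}, 7, \underline{8}, 9, \underline{10}, 11, \underline{12}, 13, \underline{14}, 15, \underline{16}, 17, \underline{18}, 19, \underline{20}, 21, \underline{22}, 23, \underline{24},$
$25, \underline{26}, 27, \underline{28}, 29, \underline{30}, 31, \underline{32}, 33, \underline{34}, 35, \underline{36}, 37, \underline{38}, 39, \underline{40}, 41, \underline{42}, 43, \underline{44},$
$45, \underline{46}, 47, \underline{48}, 49, \underline{50}$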
The first number on the list not in bold-face and not crossed out is prime. This is 3. We write it in bold-face
and cross all its multiples (we denote crossing out by another underline):

[15] Eratosthenes of Cyrene, 276BC - 194BC, was a Greek mathematician and is famous for his work on prime numbers and for
measuring the diameter of the earth. For more see http://www-groups.dcs.st-and.ac.uk/%7Ehistory/Biographies/Eratosthenes.html
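$\mathbf{2}, \mathbf{3}, \underline{4}, 5, \underline{6}, 7, \underline{8}, \underline{9}, \underline{10}, 11, \underline{12}, 13, \underline{14}, \underline{15}, \underline{16}, 17, \underline{18}, 19, \underline{20}, \underline{21}, \underline{22}, 23, \underline{24},$
$25, \underline{26}, \underline{27}, \underline{28}, 29, \underline{30}, 31, \underline{32}, \underline{33}, \underline{34}, 35, \underline{36}, 37, \underline{38}, \underline{39}, \underline{40}, 41, \underline{42}, 43, \underline{44},$
$\underline{45}, \underline{46}, 47, \underline{48}, 49, \underline{50}$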
The first number on the list not in bold-face and not crossed out is prime. This is 5. We write it in bold-face
and cross all its multiples (we denote crossing out by an underline):

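$\mathbf{2}, \mathbf{3}, \underline{4}, \mathbf{5}, \underline{6}, 7, \underline{8}, \underline{9}, \underline{10}, 11, \underline{12}, 13, \underline{14}, \underline{15}, \underline{16}, 17, \underline{18}, 19, \underline{20}, \underline{21}, \underline{22}, 23, \underline{24},$
$\underline{25}, \underline{26}, \underline{27}, \underline{28}, 29, \underline{30}, 31, \underline{32}, \underline{33}, \underline{34}, \underline{35}, \underline{36}, 37, \underline{38}, \underline{39}, \underline{40}, 41, \underline{42}, 43, \underline{44},$
$\underline{45}, \underline{46}, 47, \underline{48}, 49, \underline{50}$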
The first number on the list not in bold-face and not crossed out is prime. This is 7. We write it in bold-face
and cross all its multiples (we denote crossing out by yet another underline):

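$\mathbf{2}, \mathbf{3}, \underline{4}, \mathbf{5}, \underline{6}, \mathbf{7}, \underline{8}, \underline{9}, \underline{10}, 11, \underline{12}, 13, \underline{14}, \underline{15}, \underline{16}, 17, \underline{18}, 19, \underline{20}, \underline{21}, \underline{22}, 23, \underline{24},$
$\underline{25}, \underline{26}, \underline{27}, \underline{28}, 29, \underline{30}, 31, \underline{32}, \underline{33}, \underline{34}, \underline{35}, \underline{36}, 37, \underline{38}, \underline{39}, \underline{40}, 41, \underline{42}, 43, \underline{44},$
$\underline{45}, \underline{46}, 47, \underline{48}, \underline{49}, \underline{50}$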
The next number, 11, is already greater than $\sqrt{N} = \sqrt{50} \approx 7.071\ldots$. So we stop, because any number is
a product of prime numbers (see Lemma 10.1.3 below) and so any number less than or equal to N, which is not
prime, has at least one prime divisor smaller than or equal to $\sqrt{N}$. Thus, any number left on our list is prime.

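$\mathbf{2}, \mathbf{3}, \underline{4}, \mathbf{5}, \underline{6}, \mathbf{7}, \underline{8}, \underline{9}, \underline{10}, \mathbf{11}, \underline{12}, \mathbf{13}, \underline{14}, \underline{15}, \underline{16}, \mathbf{17}, \underline{18}, \mathbf{19}, \underline{20}, \underline{21}, \underline{22}, \mathbf{23}, \underline{24},$
$\underline{25}, \underline{26}, \underline{27}, \underline{28}, \mathbf{29}, \underline{30}, \mathbf{31}, \underline{32}, \underline{33}, \underline{34}, \underline{35}, \underline{36}, \mathbf{37}, \underline{38}, \underline{39}, \underline{40}, \mathbf{41}, \underline{42}, \mathbf{43}, \underline{44},$
$\underline{45}, \underline{46}, \mathbf{47}, \underline{48}, \underline{49}, \underline{50}$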
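
The sieving procedure just described is easy to program. The following Python sketch is one possible rendering, with names of our own choosing. As a small departure from the description above, it starts crossing out the multiples of p at p · p rather than at 2p; this is a common shortcut and loses nothing, since any smaller multiple of p has a prime divisor smaller than p and so was already crossed out.

```python
def sieve(N):
    """Return the list of all primes <= N (for N >= 2) by the sieve of Eratosthenes."""
    crossed = [False] * (N + 1)   # crossed[k] is True once k has been crossed out
    primes = []
    for p in range(2, N + 1):
        if crossed[p]:
            continue
        primes.append(p)          # first number not crossed out: it is prime
        # Cross out multiples of p; the range is empty as soon as p*p > N,
        # which is the stopping criterion explained in the text
        for multiple in range(p * p, N + 1, p):
            crossed[multiple] = True
    return primes


print(sieve(50))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```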

Theorem 10.1.2 (The Fundamental Theorem of Arithmetic). Every non-zero integer n is a product of primes.
(We allow the empty product, equal by definition to 1). That is, one can write every non-zero integer n as
n = ep1 p2 · · · pm ,
where e = ±1 and 0 ≤ p1 ≤ p2 ≤ · · · ≤ pm are primes (m ≥ 0). Moreover, this way of writing n is unique.
Proof. We first show n can be written this way. We may assume n is positive (if n is negative, apply the
statement to −n, −n = p1 p2 · · · pm and thus n = −1 · p1 p2 · · · pm ).
Lemma 10.1.3. Every positive integer is a product of prime numbers. (We allow the empty product, equal
by definition to 1).
Proof. Suppose not. Then the set S of positive integers that are not a product of prime numbers has a minimal
element, say $n_0$. $n_0$ is not one, or a prime, because in those cases it is a product of primes (1, as said, is the
empty product). Thus, there are integers $1 < s < n_0$, $1 < t < n_0$ such that $n_0 = st$. Note that s and t are
not in S because they are smaller than $n_0$. Thus, $s = q_1 q_2 \cdots q_a$ is a product of primes, $t = r_1 r_2 \cdots r_b$ is a
product of primes and therefore $n_0 = q_1 q_2 \cdots q_a r_1 r_2 \cdots r_b$ is also a product of primes. This contradicts
our initial assumption that there are positive integers that are not a product of prime
numbers. Thus, every positive integer is a product of prime numbers.
Choosing the sign e appropriately and ordering the primes in increasing order we conclude that any non-zero
integer n = ep1 p2 · · · pm , where e = ±1 and 0 ≤ p1 ≤ p2 ≤ · · · ≤ pm are primes. We now show uniqueness.
For this we need the following important fact.
Proposition 10.1.4. Let p > 1 be an integer. The following are equivalent:
(1) p is a prime number;
(2) if p| ab then p| a or p|b.
Proof. Suppose p is prime and p|ab. If p ∤ a then gcd(p, a) = 1 and so, as we have already seen
(Corollary 9.3.3), p|b.
Now suppose that p satisfies (2). Let p = st. By replacing s by −s and t by −t, we may assume that
s, t are positive. As p = st, p|st and so p|s, say. So $s = ps'$ and $p = ps't$. But we must have then
that $s' = t = 1$, because $s'$, t are positive integers. Therefore p = s. So p has no proper divisors and hence
is prime.
Remark 10.1.5. Suppose that p is a prime dividing a product of integers q1 q2 · · · qt . Arguing by induction
on t, one concludes that p divides some qi .
We now finish the proof of the theorem. Suppose that
n = ep1 p2 · · · pm ,
and also
n = µq1 q2 · · · qt ,
are two expressions of n as in the statement of the theorem. First, e is negative if and only if n is, and
the same holds for µ. So e = µ. We may then assume n is positive and e = µ = 1 and we argue by
induction on n. The case n = 1 is clear: a product of one or more primes will be greater than 1 so the only
way to express n is as the empty product. Assume the statement holds for 1, 2, . . . , n − 1 and consider two
factorizations of n:
n = p1 p2 · · · p m ,
and
n = q1 q2 · · · q t .
First, note that m ≥ 1 and t ≥ 1 because n > 1. Assume that p1 ≤ q1 (the argument in the other case goes
the same way). We have p1 |n and so p1 |q1 q2 · · · qt . It follows that p1 divides some qi but then, qi being
prime, p1 = qi . Furthermore, p1 ≤ q1 ≤ qi = p1 , so p1 = q1 . We then have the factorizations
\[ n/p_1 = p_2 \cdots p_m = q_2 \cdots q_t. \]
Since n/p1 < n we may apply the induction hypothesis and conclude that m = t and pi = qi for all i.
The Fundamental Theorem exhibits the prime numbers as the building blocks of the integers. In itself, it
doesn’t tell us if there are finitely many or infinitely many of them.
Theorem 10.1.6 (Euclid). There are infinitely many prime numbers.
Proof. Let p1 , p2 , . . . , pn be distinct prime numbers. We show then that there is a prime not in this list. It
follows that there couldn’t be only finitely many prime numbers.
Consider the integer $N = p_1 p_2 \cdots p_n + 1$ and its prime factorization. Let q be a prime dividing N. If $q \in
\{p_1, p_2, \dots, p_n\}$ then $q \mid p_1 p_2 \cdots p_n$ and so $q \mid (N - p_1 p_2 \cdots p_n)$, that is q|1, which is a contradiction. Thus, q
is a prime not in the list $\{p_1, p_2, \dots, p_n\}$.
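
A small computation illustrates the argument. In the Python sketch below (our own illustration; the helper smallest_prime_factor is a hypothetical name), we start from the list 2, 3, 5, 7, 11, 13, form the product of its members plus one, and find a prime divisor of the result. Note that the new number need not itself be prime: the proof only guarantees that its prime divisors are missing from the list.

```python
def smallest_prime_factor(n):
    """Return the smallest prime dividing n (n >= 2), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n): n itself is prime


primes = [2, 3, 5, 7, 11, 13]
product_plus_one = 1
for p in primes:
    product_plus_one *= p
product_plus_one += 1

q = smallest_prime_factor(product_plus_one)
print(product_plus_one, q, q in primes)   # 30031 59 False  (30031 = 59 * 509)
```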
So! We know that every integer is a product of prime numbers, we know there are infinitely many prime
numbers. That teaches us about the integers, and invites some more questions:

– How frequent are prime numbers? The Prime Number Theorem asserts that the number of primes
in the interval [1, n] is roughly n/ log n, in the sense that the ratio between the true number and the
estimate n/ log n approaches 1 as n goes to infinity. The result was conjectured by Gauss[16] at the age of 15 or
16 and proven by J. Hadamard and Ch. de la Vallée Poussin, independently, in 1896. The famous Riemann
hypothesis, currently (Fall 2023) one of the Clay Institute million dollar millennium prize problems, is initially
a conjecture about the zeros of some complex valued function – the so-called Riemann zeta function. An
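equivalent formulation is the following: let π(x) be the number of prime numbers in the interval [1, x], for
x > 1, and let $\mathrm{Li}(x) = \int_2^x dt/\log t$ be the logarithmic integral, a sharper form of the estimate x/ log x;
then for some constant C, we have
\[ |\pi(x) - \mathrm{Li}(x)| < C \sqrt{x} \cdot \log x \]
for all x ≥ 2.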

[16] Johann Carl Friedrich Gauss, 1777 - 1855, worked in a wide variety of fields in both mathematics and physics including number
theory, analysis, differential geometry, geodesy, magnetism, astronomy and optics. His work has had an immense influence
in many areas of Science. He is mentioned in the same breath with Euclid, Archimedes and Newton as one of the giants of
mathematics.
– How small can the gaps between consecutive primes be? For example, we have (3, 5), (5, 7), (11, 13),
(17, 19), ... But, are there infinitely many such pairs? The answer is believed to be yes but no one has
proven it yet (Fall 2023). This is called the Twin Primes Conjecture. Incidentally, it is not hard to prove
that the gap between primes can be arbitrarily large. Given an integer N ≥ 2, consider the numbers
N! + 2, N! + 3, N! + 4, . . . , N! + N.
This is a set of N − 1 consecutive integers none of which is prime, because N! + k is divisible by k for
every 2 ≤ k ≤ N. In recent breakthroughs, originating with
Yitang Zhang in 2013, it was established that there is a computable constant C such that there are infinitely
many pairs of primes at most C apart. Zhang provided C = 70,000,000 and J. Maynard improved that not
long after to C = 600; “Polymath project 8”, a large group of mathematicians working on the problem
using an online platform, improved the bound further to C = 246. It is believed that their methods can be
improved to yield C = 6, but not (oh, painful irony!) to yield C = 2, leaving the twin primes conjecture just
beyond reach.
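
The construction of long runs of composite numbers, N! + 2, . . . , N! + N, can be verified numerically for small N. The following Python sketch is an illustration only; the names is_prime and composite_run are ours, and the primality test is plain trial division, adequate only for small inputs.

```python
from math import factorial


def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def composite_run(N):
    """Return the list N! + 2, N! + 3, ..., N! + N (N - 1 consecutive integers)."""
    return [factorial(N) + k for k in range(2, N + 1)]


N = 8
run = composite_run(N)
print(run[0], run[-1], len(run))                                  # 40322 40328 7
print(all(x % k == 0 for k, x in zip(range(2, N + 1), run)))      # True: k divides N! + k
print(any(is_prime(x) for x in run))                              # False
```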

– How far does one need to go until the next prime shows up? For example, it is known that there
is always a prime between n and 2n. It follows rather easily from the prime number theorem for n ≫ 0,
but it in fact holds for every n. There is a rather elementary, yet ingenious, proof of this fact, the technical
background for which is not much more than Stirling’s formula. We sometimes give it in the Number Theory
course.

– What about adding primes? Goldbach’s conjecture asserts that every even integer greater than 2 is
the sum of two prime numbers. For example, 4 = 2 + 2, 6 = 3 + 3, 8 = 3 + 5, 10 = 3 + 7, 12 = 5 + 7, 14 =
3 + 11, 16 = 5 + 11, .... It has been verified (Winter 2016) up to $n \le 4 \times 10^{17}$, but no proof is currently
known (Fall 2023).
The “odd Goldbach conjecture” states that every odd integer greater than 5 is the sum of 3 primes.
Remarkably this is known. A series of works over a century proved better and better results towards the odd
conjecture. In particular, it was known in 2002 to be true for every odd integer greater than $C = 2 \cdot 10^{1346}$,
which is computationally out of reach. Harald Helfgott proved the full conjecture in 2013, by improving the
constant C greatly so that the remaining cases could be verified by direct computation (and, in fact, were
already verified prior to his work).
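
Goldbach's conjecture is easy to test for small even numbers. The Python sketch below is a toy verification of our own (the names is_prime and goldbach_pair are hypothetical); it has nothing to do with the large-scale verifications mentioned above, and of course proves nothing. For each even n it returns the decomposition with the smallest possible first prime, which may differ from the decomposition listed in the text (for example 16 = 3 + 13 rather than 5 + 11).

```python
def is_prime(n):
    """Trial division; adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_pair(n):
    """Return primes (p, q) with p <= q and p + q == n, or None if no such pair exists."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None


for n in range(4, 18, 2):
    print(n, goldbach_pair(n))

# Every even number from 4 to 2000 has at least one decomposition
print(all(goldbach_pair(n) is not None for n in range(4, 2001, 2)))   # True
```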

10.2. Applications of the Fundamental Theorem of Arithmetic.

Proposition 10.2.1. Let a, b be non-zero integers. Then a|b if and only if $a = e p_1^{a_1} \cdots p_m^{a_m}$ and
$b = \mu p_1^{a'_1} \cdots p_m^{a'_m} q_1^{b_1} \cdots q_t^{b_t}$ (products of distinct primes) with $a'_i \ge a_i$ for all i = 1, . . . , m.
Proof. Clearly for such factorizations it follows that a|b, in fact
\[ b/a = (\mu/e)\, p_1^{a'_1 - a_1} \cdots p_m^{a'_m - a_m} q_1^{b_1} \cdots q_t^{b_t}. \]
Conversely, if a|b, write $a = e p_1^{a_1} \cdots p_m^{a_m}$ and $b/a = \nu p_1^{a''_1} \cdots p_m^{a''_m} q_1^{b_1} \cdots q_t^{b_t}$, with ν = ±1 and $a''_i \ge 0$.
Then $b = (\nu e)\, p_1^{a_1 + a''_1} \cdots p_m^{a_m + a''_m} q_1^{b_1} \cdots q_t^{b_t}$ and let µ = νe, $a'_i = a_i + a''_i$.
Corollary 10.2.2. Let $a = p_1^{a_1} \cdots p_m^{a_m}$, $b = p_1^{b_1} \cdots p_m^{b_m}$ with $p_i$ distinct prime numbers and $a_i$, $b_i$ non-negative
integers. (Any two positive integers can be written this way). Then
\[ \gcd(a, b) = p_1^{\min(a_1, b_1)} \cdots p_m^{\min(a_m, b_m)}. \]
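
Corollary 10.2.2 gives the second method for computing the gcd that was promised in the discussion of the Euclidean algorithm. The following Python sketch, with function names of our own, factors both numbers by trial division and takes the minimum of the exponents. It is meant as an illustration; factoring is far slower than the Euclidean algorithm for large numbers.

```python
from collections import Counter


def factorize(n):
    """Return a Counter mapping each prime divisor of n (n >= 1) to its exponent."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:                 # whatever is left is a prime
        factors[n] += 1
    return factors


def gcd_by_factorization(a, b):
    """gcd of positive integers a, b as the product of p ** min(exponents)."""
    fa, fb = factorize(a), factorize(b)
    result = 1
    # A prime missing from one factorization has exponent 0 there, so min(...) = 0;
    # it is therefore enough to run over the primes common to both
    for p in fa.keys() & fb.keys():
        result *= p ** min(fa[p], fb[p])
    return result


print(dict(factorize(442)), dict(factorize(182)))   # {2: 1, 13: 1, 17: 1} {2: 1, 7: 1, 13: 1}
print(gcd_by_factorization(442, 182))               # 26
```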
The next proposition establishes the existence of real numbers that are not rational. It can be generalized
considerably. In fact, it is known that in randomly choosing a number in the interval [0, 1] the probability of
picking a rational number is zero. So, although there are infinitely many rational numbers, even in the interval
[0, 1], they are still a rather meagre set inside the real numbers. In fact, recall that by Cantor’s diagonal
argument, we know that the cardinality of R is greater than that of Q, so there must be some real number
that is not rational. However, the advantage of Proposition 10.2.4 is that it shows that a specific number is
irrational. We will need the following result.
Proposition 10.2.3. Any non-zero rational number q can be written as
\[ q = e p_1^{a_1} \cdots p_m^{a_m}, \]
where e = ±1, the $p_i$ are distinct prime numbers and $a_1, \dots, a_m$ are non-zero integers (possibly negative).
Moreover, this expression is unique (up to reordering the primes).
Proof. By definition, for some integers a, b we have q = a/b. Let us write:
\[ a = e_a r_1^{s_1} \cdots r_n^{s_n}, \qquad b = e_b r_1^{t_1} \cdots r_n^{t_n}, \]
where $e_a, e_b \in \{\pm 1\}$, the $r_i$ are distinct primes and $s_i$, $t_i$ are non-negative integers, possibly zero. This is
always possible to do because we allow zero exponents here. Then, clearly,
\[ q = (e_a/e_b)\, r_1^{s_1 - t_1} \cdots r_n^{s_n - t_n}. \]
Now, omitting the primes such that $s_i - t_i = 0$ from the list, calling the remaining primes $p_1, \dots, p_m$, and
letting $e = e_a/e_b$, we obtain an expression as desired.
Suppose that we have two such expressions for q. Again, by allowing zero exponents, it is enough to
consider the following situation:
\[ q = e p_1^{a_1} \cdots p_m^{a_m} = e' p_1^{a'_1} \cdots p_m^{a'_m}. \]
Since $e$, $e'$ determine the sign of q, they must be equal and we need to show that $a_i = a'_i$ for all i. Dividing
through, we get an expression of the form,
\[ 1 = p_1^{c_1} \cdots p_m^{c_m}, \]
where $c_i = a_i - a'_i$ and we need to show all the $c_i$ are zero. By rearranging the primes, we may assume
that $c_1, \dots, c_t$ are negative and $c_{t+1}, \dots, c_m$ are non-negative. We conclude that
\[ p_1^{-c_1} \cdots p_t^{-c_t} = p_{t+1}^{c_{t+1}} \cdots p_m^{c_m}. \]
In this expression there are no negative exponents. Thus, from unique factorization for integers, since all the
primes are distinct all the powers must be zero.
Proposition 10.2.4. $\sqrt{2}$ is not a rational number.
Proof. Suppose it is, and write $\sqrt{2} = p_1^{a_1} \cdots p_m^{a_m}$, distinct primes with non-zero exponents (possibly negative).
Then
\[ 2 = p_1^{2a_1} \cdots p_m^{2a_m} \]
must be the unique factorization of 2. However, 2 is prime. So there must be only one prime on the right
hand side, i.e. m = 1. Then $2 = p_1^{2a_1}$ and we must have $p_1 = 2$ and $2a_1 = 1$. But this contradicts the fact
that $a_1$ is an integer.

Here is another proof:
Suppose that $\sqrt{2}$ is rational and write $\sqrt{2} = m/n$, where (m, n) = 1. Then
\[ 2n^2 = m^2. \]
This implies that $2 \mid m^2 = m \cdot m$. As 2 is prime, it follows that 2|m. Thus, m is even, say m = 2k. It follows
that $2n^2 = 4k^2$ and so $n^2 = 2k^2$. Therefore, $2 \mid n^2$ and by the same considerations 2|n. This means that 2
divides both n and m, contrary to our assumption. Thus, assuming $\sqrt{2}$ is rational leads to a contradiction
and so $\sqrt{2}$ is not a rational number.
Example 10.2.5. We can draw the conclusion that also $\alpha = \sqrt{1 + \sqrt{2}}$ is irrational. Indeed, if α were rational,
say equal to m/n, then $\alpha^2 = m^2/n^2$ is also rational and so is $\alpha^2 - 1 = (m^2 - n^2)/n^2$. But $\alpha^2 - 1 = \sqrt{2}$,
which we know to be irrational.
Some problems sound very hard, but turn out to have elementary solutions. For example, consider the
statement: “there exist two irrational numbers a, b such that $a^b$ is rational”. This seems hard, but a shrewd
observation due to Dov Jarden gives a proof: if $\sqrt{2}^{\sqrt{2}}$ is rational then we are done. Otherwise, consider
$a^b$, where $a = \sqrt{2}^{\sqrt{2}}$ and $b = \sqrt{2}$. As $a^b = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \sqrt{2}^{2} = 2$, we are done. This clever argument doesn’t tell us which
is the correct example, $\sqrt{2}^{\sqrt{2}}$, or $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$. In fact, by the Gelfond-Schneider theorem, we know $\sqrt{2}^{\sqrt{2}}$ is
transcendental (namely, it doesn’t solve any non-zero polynomial with rational coefficients; being irrational just
means it doesn’t solve any linear polynomial with rational coefficients), so the example is actually $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$.
See also Exercise 20.

Although the second proof of Proposition 10.2.4 is more elementary, the first proof is better. It is much
easier to generalize and indeed in a similar way, one can show $\sqrt{3}$, $\sqrt[5]{18}$ etc. are irrational.
It is also known that e is irrational (see Appendix B) and π is irrational (hard). But it is still an open
question (Fall 2023) whether Euler’s constant
\[ \gamma = \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log(n) \right) \approx 0.57721 \]
is rational or not (it is believed to be irrational; if γ is rational, it was proved that its denominator has to
be greater than $10^{242080}$!).
11. Exercises
(1) Find the quotient and remainder when a is divided by b:
(a) a = 302, b = 19.
(b) a = −302, b = 19.
(c) a = 0, b = 19.
(d) a = 2000, b = 17.
(e) a = 2001, b = 17.
(f) a = 2002, b = 17.
(2) Prove that the square of any integer a is either of the form 3k or of the form 3k + 1 for some integer
k. (Hint: write a in the form 3q + r, where r = 0, 1 or 2.)
(3) Prove or disprove: If a|(b + c) then a|b or a|c. If a|bc then a|b. If a|c and (a + b)|c then b|c.
(4) If r ∈ Z and r is a solution of $x^2 + ax + b = 0$ (where a, b ∈ Z) prove that r|b.
(5) If n ∈ Z, what are the possible values of
(a) (n, n + 2);
(b) (n, n + 6).
(6) Find the following gcd’s. In each case also express ( a, b) as ua + vb for suitable integers u, v ∈ Z.
(a) (56, 72).
(b) (24, 138).
(c) (143, 227).
(d) (314, 159).
(7) If a|c and b|c, must ab divide c? What if ( a, b) = 1?
(8) The least common multiple of nonzero integers a, b is the smallest positive integer m such that a|m
and b|m. We denote it by lcm( a, b) or [ a, b]. Prove that:
(a) If a|k and b|k then [ a, b]|k.
(b) $[a, b] = \frac{ab}{(a, b)}$ if a > 0, b > 0.
(9) Let $a = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$ and $b = p_1^{s_1} p_2^{s_2} \cdots p_k^{s_k}$, where $p_1, p_2, \dots, p_k$ are distinct positive primes and
each $r_i, s_i \ge 0$. Prove that
(a) $(a, b) = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$, where $n_i = \min(r_i, s_i)$.
(b) $[a, b] = p_1^{t_1} p_2^{t_2} \cdots p_k^{t_k}$, where $t_i = \max(r_i, s_i)$.
(10) Prove or disprove: If n is an integer and n > 2, then there exists a prime p such that n < p < n!.
(11) Find all the primes between 1 and 150. The solution should consist of a list of all the primes + giving
the last prime used to sieve + explanation why you didn’t have to sieve by larger primes.
(12) Unlike twin primes, prove that there is a unique triplet of primes; namely, if n, n + 2, n + 4 are primes
for some n ≥ 1 then n = 3.
(13) Prove that $\sqrt{2 + \sqrt{3}}$ is irrational.
(14) Prove that if p is a prime then $\sqrt{p}$ is irrational.
(15) Prove that if p is a prime then $\sqrt[3]{p}$ is irrational.
(16) Let k ≥ 1 be an integer. Prove that if a positive integer n is not a k-th power of another integer
then $\sqrt[k]{n}$ is irrational.
(17) Prove that there are no rational numbers a, b such that $\sqrt{3} = a + b\sqrt{2}$.
(18) If the ratio of the frequencies of two musical notes is 3 : 2, we say the notes form a fifth. For
example, the notes C and G (where G is higher) form a fifth, as do the notes G and D (where D is
higher). In fact, in the following sequence any two consecutive notes are supposed to form a fifth:
C, G, D, A, E, B, F♯, C♯, G♯, D♯, A♯, F, C.
(The last C is 7 octaves higher than the first C).
On the other hand, the ratio between two consecutive C’s is an octave and is 2 : 1. Explain why
this leads to a contradiction; the sequence above cannot really be a sequence of fifths. This is solved,
in tuning a piano, by the so-called equal temperament method; the fifths are not quite in ratio 3 : 2.
In fact, they are in ratio x : 1, where x is chosen so as to ensure that the octaves stay in ratio 2 : 1.
What is x and how close is it to 3 : 2?
(19) Prove that there are infinitely many primes congruent to 3 modulo 4.
(20) Let I be the interval $((1/e)^{1/e}, \infty)$. Then every rational number in I is either of the form $a^a$ where a
is irrational, or of the form $n^n$ where n is an integer. This gives infinitely many examples of irrational
numbers a such that $a^a$ is rational.
Here are some suggestions as to how to prove this statement. First, using analysis show that the
function $f(x) = x^x$ is a bijection between the interval (1/e, ∞) and I. Now, let r be a rational number
in I and assume that $r = a^a$, where a is rational too. Write a = n/m and r = b/c where b, c, m, n
are positive integers and (b, c) = (m, n) = 1. The goal is to show m = 1.
By analysing $(n/m)^{n/m} = b/c$ conclude that we must have $c^m = m^n$. If m > 1, take a prime p
dividing both c and m, say $p^i \,\|\, c$ and $p^j \,\|\, m$ (the notation $p^i \,\|\, c$ means that $p^i \mid c$, but $p^{i+1} \nmid c$); deduce
that im = jn and from that $p^j \mid j$. Show that this is not possible.
z
Part 3. Congruences and modular arithmetic

12. Congruence relations

Let n be a positive integer. Define a relation on the set of integers by x ∼ y if n|( x − y) (we shall also write
that as x ≡ y (mod n), or simply x ≡ y, if n is clear from the context). We say that x is congruent to y
modulo n.

Lemma 12.0.1. Congruence modulo n is an equivalence relation on Z. The set {0, 1, . . . , n − 1} is a complete
set of representatives.

Proof. First n|( x − x ) so x ≡ x and the relation is thus reflexive. If n|( x − y) then n| − ( x − y) = y − x,
so the relation is symmetric. Suppose n|( x − y), n|(y − z) then n|(( x − y) + (y − z)) = x − z and so the
relation is transitive too.
Let x be any integer and write x = qn + r with 0 ≤ r < n. Then x − r = qn and so x ≡ r. It follows
that every equivalence class is represented by some r ∈ {0, 1, . . . , n − 1}. The equivalence classes defined by
elements of {0, 1, . . . , n − 1} are disjoint. If not, then for some 0 ≤ i < j < n we have i ≡ j, that is, n|( j − i ).
But 0 < j − i < n and we get a contradiction.

Theorem 12.0.2. Denote the equivalence classes of congruence modulo n by $\bar{0}, \bar{1}, \dots, \overline{n-1}$ instead of
[0], [1], . . . , [n − 1]. Denote this set by Z/nZ. The set Z/nZ is a commutative ring under the following
operations:
\[ \bar{i} + \bar{j} = \overline{i + j}, \qquad \bar{i} \cdot \bar{j} = \overline{ij}. \]
The neutral element for addition is $\bar{0}$, for multiplication $\bar{1}$, and the inverse of $\bar{i}$ with respect to addition
is $\overline{-i} = \overline{n - i}$.

Before proving the theorem we illustrate the definitions in a numerical example:

Example 12.0.3. We take n = 13 and calculate 5̄ · 6̄ − 5̄. First, 5̄ · 6̄ = $\overline{30}$ = 4̄. Then 4̄ − 5̄ = 4̄ + $\overline{-5}$ =
$\overline{4-5} = \overline{-1} = \overline{12}$. Note that we could have also calculated 4̄ − 5̄ = 4̄ + 8̄ = $\overline{12}$, or 5̄ · 6̄ − 5̄ = 5̄(6̄ − 1̄) =
5̄ · 5̄ = $\overline{25} = \overline{12}$.

Modular arithmetic, that is, calculating in the ring Z/nZ, is sometimes called “clock arithmetic”. The reason
is the following. The usual clock is really displaying hours modulo 12. When 5 hours pass from the time
10 o’clock the clock shows 3. Note that 3 ≡ 15 (mod 12). We are used to adding hours modulo 12 (or
modulo 24, for that matter), but we are not used to multiplying hours as that doesn’t quite make sense.
However, if you’d like you can think about multiplication as repeated addition 5 · 3 = 5 + 5 + 5. So, in that
sense, we are already familiar with the operations modulo 12 and the definitions above are a generalization.
Continuing with our numerical example, let us solve the equation 4x + 2 = 7 in Z/13Z. Now, and from
now on, we are just writing 4, 2, 7 etc. for 4̄, 2̄, 7̄. So we need to solve 4x = 5. Let us first look for a
residue class r modulo 13 so that 4r ≡ 1 (mod 13). We guess (but later we’ll develop methods for that)
that r = 10 and check: 4 · 10 = 40 ≡ 1 (mod 13). We now multiply both sides of the equation 4x = 5 by
10. Then, 4x = 5 implies 10 · 4x ≡ x ≡ 50 ≡ 11 (mod 13). Thus, the only possibility is x = 11. We go
back to the original equation 4x = 5 and verify that 4 · 11 ≡ 5 (mod 13). We found the solution x = 11.
The idea of the calculation was to use that for a given a ≠ 0 we can find r such that ra = 1. This is
used to solve the equation ax = b by reducing to x = rax = rb, and we did that for the particular
case a = 4 and b = 5. We remark that in general such an r need not exist if the modulus n is not a prime.
These issues will be discussed later. In particular, as we shall see, there are efficient ways to find the solution
to ax ≡ 1 (mod n), but the situation for equations of the form ax ≡ b (mod n) is more involved.
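
The calculation above can be replayed on a computer. The following is only a minimal sketch: the function name inverse_by_search is made up for illustration, and the inverse is found by trying every residue, which mirrors the "guess and check" of the text rather than the efficient methods developed later.

```python
def inverse_by_search(a, n):
    """Return r in {1, ..., n-1} with a*r ≡ 1 (mod n), or None if there is none."""
    for r in range(1, n):
        if (a * r) % n == 1:
            return r
    return None

n = 13
r = inverse_by_search(4, n)        # the residue class r with 4r ≡ 1 (mod 13)
x = (r * (7 - 2)) % n              # multiply both sides of 4x = 5 by r
print(r, x)                        # 10 11
assert (4 * x + 2) % n == 7        # check in the original equation 4x + 2 = 7

# When the modulus is not prime an inverse need not exist:
print(inverse_by_search(2, 4))     # None
```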

Example 12.0.4. As another example, we give the addition and multiplication table of Z/5Z.

 + | 0 1 2 3 4         · | 0 1 2 3 4
---+-----------       ---+-----------
 0 | 0 1 2 3 4         0 | 0 0 0 0 0
 1 | 1 2 3 4 0         1 | 0 1 2 3 4
 2 | 2 3 4 0 1         2 | 0 2 4 1 3
 3 | 3 4 0 1 2         3 | 0 3 1 4 2
 4 | 4 0 1 2 3         4 | 0 4 3 2 1
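
Tables of this kind are easy to generate for any n. The sketch below assumes that residue classes are represented by the integers 0, ..., n − 1; the function name tables is an illustrative choice.

```python
def tables(n):
    """Return the addition and multiplication tables of Z/nZ as lists of rows."""
    add = [[(i + j) % n for j in range(n)] for i in range(n)]
    mul = [[(i * j) % n for j in range(n)] for i in range(n)]
    return add, mul

add5, mul5 = tables(5)
for row in mul5:
    print(*row)    # one row of the multiplication table of Z/5Z per line
```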

Proof. (Of Theorem 12.0.2) We first prove that the operations do not depend on the representatives for the
equivalence classes that we have chosen.
Suppose ī = $\overline{i'}$, j̄ = $\overline{j'}$, where i, i', j, j' need not be in the set {0, 1, 2, . . . , n − 1}. We have defined ī + j̄ =
$\overline{i+j}$ and so we need to check that this is the same as $\overline{i'+j'}$. Since ī = $\overline{i'}$, n|(i − i') and similarly n|(j − j').
Therefore, n|((i + j) − (i' + j')); that is, $\overline{i+j} = \overline{i'+j'}$.
We also need to show that $\overline{ij} = \overline{i'j'}$. But, ij − i'j' = ij − ij' + ij' − i'j' = i(j − j') + j'(i − i') and so
n|(ij − i'j').
The verification of the axioms is now easy if we make use of the fact that Z is a commutative ring:
(1) ī + j̄ = $\overline{i+j} = \overline{j+i}$ = j̄ + ī. (The first and third equalities are by definition and the second equality
follows from Z being a commutative ring.)
(2) (ī + j̄) + k̄ = $\overline{i+j}$ + k̄ = $\overline{(i+j)+k}$. Note that at this point we used the simplification that we can
use any representative of the equivalence class to carry out the operations. Had we insisted on always
using the representative in the set {0, 1, 2, . . . , n − 1} we would usually have needed to replace i + j
by its representative in that set and things would be turning messy. Now, $\overline{(i+j)+k} = \overline{i+(j+k)}$ =
ī + $\overline{j+k}$ = ī + (j̄ + k̄).
(3) 0̄ + ī = $\overline{0+i}$ = ī.
(4) ī + $\overline{-i}$ = $\overline{i+(-i)}$ = 0̄. Note that $\overline{-i} = \overline{n-i}$.
(5) (ī · j̄)k̄ = $\overline{ij}$ · k̄ = $\overline{(ij)k} = \overline{i(jk)}$ = ī · $\overline{jk}$ = ī(j̄ · k̄). (A dot is added occasionally merely to make it
easier to read.)
(6) 1̄ · ī = $\overline{1 \cdot i}$ = ī and ī · 1̄ = $\overline{i \cdot 1}$ = ī.
(7) ī(j̄ + k̄) = ī · $\overline{j+k}$ = $\overline{i(j+k)} = \overline{ij+ik} = \overline{ij} + \overline{ik}$ = ī · j̄ + ī · k̄. Similarly, (j̄ + k̄)ī = $\overline{j+k}$ · ī =
$\overline{(j+k)i} = \overline{ji+ki} = \overline{ji} + \overline{ki}$ = j̄ · ī + k̄ · ī.
Furthermore, this is a commutative ring: ī · k̄ = $\overline{ik} = \overline{ki}$ = k̄ · ī.

In the proof we saw that the ring properties of Z/nZ, the set of equivalence classes modulo n, all follow
from the ring properties of Z. We shall later see that this can be generalized to any ring R: if we impose
a correct notion of an equivalence relation, the equivalence classes themselves will form a ring and the fact
that the ring axioms hold for it follows from the fact that they hold for R.

Theorem 12.0.5. Z/nZ is a field if and only if n is prime.

Before providing the proof we introduce some terminology. Let R be a ring, x ∈ R a non-zero element. x is
called a zero divisor if there is an element y ≠ 0 such that either xy = 0 or yx = 0 (or both).

Lemma 12.0.6. Let R be a commutative ring. If R has zero divisors then R is not a field.

Proof. Let x ≠ 0 be a zero divisor and let y ≠ 0 be an element such that xy = 0. If R is a field then there
is an element z ∈ R such that zx = 1. But then z(xy) = z · 0 = 0 and also z(xy) = (zx)y = 1 · y = y.
So y = 0, and that is a contradiction.

Proof. (Of Theorem 12.0.5) If n = 1 then Z/nZ has a single element and so 0 = 1 in that ring. Therefore, it is
not a field. Suppose that n > 1 and n is not prime, n = ab where 1 < a < n, 1 < b < n. Then ā ≠ 0̄, b̄ ≠ 0̄
but ā · b̄ = $\overline{ab}$ = n̄ = 0̄. So Z/nZ has zero divisors and thus is not a field.

Suppose now that n is prime and let ā ≠ 0̄. That is, n ∤ a, which, since n is prime, means that (n, a) = 1.
Consider the list of elements
0̄ · ā, 1̄ · ā, . . . , $\overline{n-1}$ · ā.
We claim that they are distinct elements of Z/nZ. Suppose that ī · ā = j̄ · ā, for some 0 ≤ i ≤ j ≤ n − 1;
then $\overline{ia} = \overline{ja}$, which means that n|(ia − ja) = (i − j)a. Since (n, a) = 1, it follows that n|(i − j) and, as
0 ≤ j − i < n, that means i = j. Thus, the list 0̄ · ā, 1̄ · ā, . . . , $\overline{n-1}$ · ā contains n distinct elements of Z/nZ and so it must
contain 1̄. That is, there’s an i such that ī · ā = 1̄ and therefore ā is invertible.
Here is another proof for the invertibility of ā. Since ( a, n) = 1, for suitable integers u, v we have
ua + vn = 1. But that means that n|(ua − 1). That is, ua ≡ 1 (mod n).

The first proof has the advantage that a minor variant shows that every finite commutative ring with no
zero-divisors is a field; the second proof has the advantage that in our particular situation one can use the
Euclidean algorithm to calculate u such that ua ≡ 1 (mod n), namely to calculate a−1 in the field. We shall
see that this is a tremendously useful fact.
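
The second proof translates directly into a procedure. The following is a sketch of the extended Euclidean algorithm, under the assumption that we track the coefficients u, v alongside the remainders; the names extended_gcd and inverse_mod are illustrative and not notation used elsewhere in these notes.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and u*a + v*b = g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r     # the usual Euclidean step
        old_u, u = u, old_u - q * u     # keep old_r = old_u*a + old_v*b
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def inverse_mod(a, n):
    """Return the representative in {0, ..., n-1} of the inverse of a modulo n."""
    g, u, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("a is not invertible modulo n")
    return u % n

print(inverse_mod(4, 13))    # 10, the residue found by guessing earlier
```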

Let p be a prime number. We denote Z/pZ also by $\mathbb{F}_p$. It is a field with p elements. It is a fact that any
finite field (that is, any field with finitely many elements) has cardinality a power of a prime and for any prime
power there is a field with that cardinality - we shall learn ways to construct such fields in this course. Finite fields,
such as $\mathbb{F}_p$, play an important role in coding and cryptography as well as in pure mathematics. Interestingly
enough, these fields used to be called Galois fields and a notation such as GF(p, n) (Galois field of
$p^n$ elements) was used. Initially, such fields were the height of abstraction; today we routinely use them in
various cryptographic and coding theory implementations.

12.1. Fermat’s little theorem.

Theorem 12.1.1. (Fermat¹⁷) Let p be a prime number. Let a ≢ 0 (mod p) then

$a^{p-1} \equiv 1 \pmod{p}$.

Before proving the theorem we state two auxiliary statements whose proofs are left as an exercise.

Lemma 12.1.2. Let p be a prime number. We have $p \mid \binom{p}{i}$ for every 1 ≤ i ≤ p − 1.¹⁸

Lemma 12.1.3. Let R be a commutative ring and x, y ∈ R. Interpret $\binom{n}{i}$ as adding $\binom{n}{i}$ times the element 1
to itself. Then the binomial formula holds in R:
$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}$.

¹⁷ Pierre de Fermat, 1601 - 1665, was a French lawyer and government official most remembered for his work in number theory;
in particular for Fermat’s Last Theorem that states the equation $x^n + y^n = z^n$ does not have solutions in positive integers
x, y, z such that xyz 6= 0 for any n ≥ 3. He famously scribbled in the margin of his copy of Diophantus’ Arithmetica book
that he had discovered a marvellous proof that the margin is too small to contain. Fermat’s last theorem was proved by Sir
Andrew Wiles in 1994, ushering in the process a whole new area of number theory. Wiles’ proof is so difficult, and builds on
so much mathematics that didn’t exist at Fermat’s time, that Fermat’s claim cannot be accepted as such. Fermat had also
made important contributions to calculus and diophantine equations by introducing the so-called method of infinite descent.
This method, still in use today, assumes that a given diophantine equation (namely, f ( x ) = 0, f ( x ) ∈ Z[ x ]) has any solution
in integers and attempts to use those to provide a solution in smaller integers (in absolute value). The process can then be
repeated, yielding eventually a contradiction, since a strictly decreasing sequence of non-negative integers cannot continue indefinitely.
¹⁸ Recall that the binomial coefficient $\binom{n}{a}$ is defined as follows. Let n, a be natural numbers. Define $\binom{n}{a} = \frac{n!}{a!\,(n-a)!}$, where 0! := 1.
This is in fact an integer and has the combinatorial interpretation that $\binom{n}{a}$ is the number of ways to choose a distinct objects
from a set of n objects, where the order in which we choose the objects doesn’t matter. That is, it is the number of subsets
of cardinality a of a set of cardinality n. This, incidentally, is a proof that $\binom{n}{a}$ is indeed an integer, and the reason the binomial
number $\binom{n}{a}$ is often read “n choose a”.

Proof. (of Fermat’s little theorem) We prove the statement by induction on 1 ≤ a ≤ p − 1. For a = 1 the
result is clear. Suppose the result holds for a and consider a + 1, provided a + 1 < p. We have, by the binomial
formula,
$(a + 1)^p = \sum_{i=0}^{p} \binom{p}{i} a^i$
$= 1 + \binom{p}{1} a + \binom{p}{2} a^2 + \cdots + \binom{p}{p-1} a^{p-1} + a^p$
$\equiv 1 + a^p \pmod{p}$ (using the lemma)
$\equiv 1 + a \pmod{p}$ (using the induction hypothesis)
Since 1 + a ≢ 0 (mod p) it has an inverse y in $\mathbb{F}_p$, y(1 + a) ≡ 1. Then, $y(1 + a)^p \equiv y(1 + a) \equiv 1$. As also
$y(1 + a)^p = y(1 + a)(1 + a)^{p-1} \equiv (1 + a)^{p-1}$, we find that $(1 + a)^{p-1} \equiv 1$.
Example 12.1.4. We calculate $2^{100}$ modulo 13. We have $2^{100} = 2^{96} \cdot 2^4 = (2^{12})^8 \cdot 2^4 \equiv 2^4 \equiv 3$ modulo 13.
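
As a quick numerical sanity check of the theorem and of the example, one can use Python's built-in three-argument pow, which computes powers modulo a given integer. This is only an illustration for p = 13, not a proof.

```python
p = 13

# Fermat: a^(p-1) ≡ 1 (mod p) for every a not divisible by p.
assert all(pow(a, p - 1, p) == 1 for a in range(1, p))

# The exponent 100 may therefore be reduced modulo p - 1 = 12.
print(100 % (p - 1))       # 4
print(pow(2, 100, p))      # 3
```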
Fermat’s little theorem gives a criterion for numbers to be composite. Let n be a positive integer. If there
is 1 ≤ a ≤ n − 1 such that $a^{n-1} \not\equiv 1$ (mod n) then n is not prime. Unfortunately, it is possible that for
every 1 ≤ a ≤ n − 1 such that (a, n) = 1, one has $a^{n-1} \equiv 1$ (mod n) and yet n is not prime. Thus, this
test fails to recognize such n as composite numbers. Such numbers are called Carmichael numbers. There
are infinitely many such numbers. The first are 561, 1105, 1729, 2465, 2821, 6601, 8911, 10585, 15841,
29341, ...
Primality testing routines first test divisibility by small primes available to the program as pre-computed
data and then use various methods that are more sophisticated versions of the following method: choose
randomly some 1 ≤ a < n; if (a, n) ≠ 1 then n is not prime. If (a, n) = 1 the program calculates $a^{n-1}$
(mod n). If the result is not 1 (mod n) then n is not prime. If the result is 1, the program chooses another a.
After a certain number of tests, say 10, if n passed all the tests it is declared as “prime”, though there is
no absolute reassurance it is indeed a prime; it could be a Carmichael number, or we could have just been
unlucky in choosing our elements a. We remark that calculating $a^{n-1}$ (mod n) can be done quickly. One
calculates $a, a^2, a^4, a^8, a^{16}, \dots$ modulo n, as long as the power is less than n, each term being the square of
the previous one. One then expresses the exponent n − 1 in base 2 and multiplies the corresponding powers to
find the result. Here is an example: Let us calculate $3^{54}$ (mod 55) (random
choice of numbers). We have $3$, $3^2 = 9$, $3^4 = 81 = 26$, $3^8 = 26^2 = 676 = 16$, $3^{16} = 16^2 = 256 = 36$, $3^{32} =
36^2 = 1296 = 31$. Now, 54 = 2 + 4 + 16 + 32 and so $3^{54} = 9 \cdot 26 \cdot 36 \cdot 31 = 4$. In particular, 55 is not a
prime – not that this is a particularly shrewd observation...
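
The repeated-squaring computation and the basic test described above can be sketched as follows. The function names power_mod and fermat_test and the default of 10 rounds are illustrative assumptions; real primality-testing routines use more refined variants of this idea.

```python
import math
import random

def power_mod(a, e, n):
    """Compute a^e mod n by repeated squaring, reading e in base 2."""
    result = 1
    square = a % n              # holds a^(2^k) mod n at step k
    while e > 0:
        if e & 1:               # the current binary digit of e is 1
            result = (result * square) % n
        square = (square * square) % n
        e >>= 1
    return result

def fermat_test(n, rounds=10):
    """Return False if n is certainly composite, True if n passed every round."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        if math.gcd(a, n) != 1 or power_mod(a, n - 1, n) != 1:
            return False
    return True

print(power_mod(3, 54, 55))     # 4, as computed by hand in the text
```

A return value of True only means that n passed the test; as explained above, n could still be a Carmichael number.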

It is important to note that there is a polynomial-time algorithm (this means that the running time of the
algorithm is bounded by a polynomial in the number of digits of the input) to decide, without any doubt,
if an integer is prime. Such an algorithm was discovered by Agrawal, Kayal and Saxena in 2002 and was a
real sensation at the time. Note, however, that the algorithm does not produce a decomposition of n
in case n is composite. An efficient algorithm that did so would compromise the very backbone of e-commerce and military
security.

12.2. Solving equations in Z/nZ. There is no general method for solving polynomial equations in Z/nZ.
We just present some selected topics.
12.2.1. Linear equations. We want to consider the equation ax + b = 0 in Z/nZ. Let us assume that ( a, n) =
1. Then, there are integers u, v such that 1 = ua + vn. We remark that u, v are found by the Euclidean
algorithm. Note that this implies that ua ≡ 1 (mod n). Thus, if x solves ax + b = 0 in Z/nZ then x solves
the equation uax + ub = 0 (mod n), that is x + ub = 0 and so x = −ub in Z/nZ. Conversely, if x = −ub
in Z/nZ where ua = 1 in Z/nZ then ax = a(−ub) = − aub = −b in Z/nZ.
We summarize: if ( a, n) = 1 then the equation
ax + b = 0 (mod n),
has a unique solution x = −ub, where u is such that ua = 1 (mod n).

Example 12.2.1. Here is a specific example: Let us solve 12x + 3 = 0 (mod 17). First 17 = 12 + 5, 12 =
2 · 5 + 2, 5 = 2 · 2 + 1, so (12, 17) = 1 and, moreover, 1 = 5 − 2 · 2 = 5 − 2 · (12 − 2 · 5) = 5 · 5 − 2 ·
12 = 5 · (17 − 12) − 2 · 12 = 5 · 17 − 7 · 12. We see that −7 · 12 ≡ 1 (mod 17). Thus, the solution
is x = 7 · 3 = 21 = 4 (mod 17).

More generally, if ( a, n) = d, then the equation ax + b ≡ 0 (mod n) has a solution if and only if d|b.
Indeed, if we have a solution, then b = kn − ax for some integer k and we see that d|b. Conversely, if d|b
consider the equation ( a/d) x + (b/d) ≡ 0 (mod n/d). A solution to this equation also solves the original
equation ax + b ≡ 0 (mod n). Now ( a/d, n/d) = 1 and we are in the case already discussed above.
Unlike the case d = 1, if d > 1 there is more than one solution to ax + b ≡ 0
(mod n). In fact, one can prove that if $x_0$ is one solution, then a complete set of solutions modulo n is
$x_0,\; x_0 + \frac{n}{d},\; x_0 + 2 \cdot \frac{n}{d},\; \dots,\; x_0 + (d - 1) \cdot \frac{n}{d}$.
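
The discussion of linear equations can be summarized in a short routine. This is a sketch under stated assumptions: the name solve_linear is illustrative, and the inverse of a/d modulo n/d is obtained from Python's built-in pow(x, -1, m) (available from Python 3.8) instead of carrying out the Euclidean algorithm by hand.

```python
import math

def solve_linear(a, b, n):
    """Return all x in {0, ..., n-1} with a*x + b ≡ 0 (mod n), in increasing order."""
    d = math.gcd(a, n)
    if b % d != 0:
        return []                              # solvable only if d | b
    a1, b1, n1 = a // d, b // d, n // d        # reduced equation modulo n/d
    u = pow(a1, -1, n1) if n1 > 1 else 0       # u*a1 ≡ 1 (mod n1)
    x0 = (-u * b1) % n1
    return [x0 + t * n1 for t in range(d)]     # x0, x0 + n/d, ..., x0 + (d-1)n/d

print(solve_linear(12, 3, 17))   # [4]
print(solve_linear(6, 3, 9))     # [1, 4, 7]
print(solve_linear(6, 2, 9))     # []
```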
12.2.2. Quadratic equations. Consider the equation $ax^2 + bx + c = 0$ in Z/nZ and assume n is a prime
greater than 2. In that case, assuming that a ≠ 0 modulo n, there is an element $(2a)^{-1}$. One can prove that
the equation has a solution if and only if $b^2 - 4ac$ is a square in Z/nZ (which may or may not be the case).
In case it is a square, the solutions of this equation are given by the usual formula:
$(2a)^{-1}\left(-b \pm \sqrt{b^2 - 4ac}\right)$.
For example, the equation $x^2 + x + 1$ has no solution in Z/5Z because the discriminant $b^2 - 4ac$ is in this
case $1^2 - 4 = -3 = 2$ in Z/5Z and 2 can be checked not to be a square in Z/5Z (one just tries: $0^2 = 0$,
$1^2 = 1$, $2^2 = 4$, $3^2 = 4$, $4^2 = 1$ in Z/5Z). On the other hand, $x^2 + x + 1$ can be solved in Z/7Z. The
solutions are $4(-1 \pm \sqrt{-3}) = -4 \pm 4\sqrt{4} = -4 \pm 8 = -4 \pm 1 = \{2, 4\}$.
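
The two examples can be checked with a few lines of code. In this sketch (the name solve_quadratic is illustrative) the square roots of the discriminant are found simply by trying every residue, which is adequate only for small primes.

```python
def solve_quadratic(a, b, c, p):
    """Solve a*x^2 + b*x + c = 0 in Z/pZ, for p an odd prime and a not divisible by p."""
    disc = (b * b - 4 * a * c) % p
    # Square roots of the discriminant, found by exhaustive search.
    roots_of_disc = [s for s in range(p) if (s * s) % p == disc]
    inv_2a = pow(2 * a, -1, p)                 # (2a)^(-1), Python 3.8+
    return sorted({(inv_2a * (-b + s)) % p for s in roots_of_disc})

print(solve_quadratic(1, 1, 1, 5))   # []      (2 is not a square in Z/5Z)
print(solve_quadratic(1, 1, 1, 7))   # [2, 4]
```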

When n is not prime, we shall not study the problem in this course, beyond remarking that one can proceed
by trying all possibilities if n is small and that the number of solutions can be very large. For example:
consider the equation $x^3 - x = 0$ in Z/8Z. We can verify that its solutions are 0, 1, 3, 5, 7. There are 5 solutions
but the equation has degree three. We shall later see that in any field a polynomial equation of degree n
has at most n roots. As the example of $x^3 - x$ in Z/8Z suggests, this may fail spectacularly in a general
commutative ring.
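
For small n, "trying all possibilities" is a one-line search. The sketch below uses an illustrative function roots taking the coefficients in increasing order of degree, and compares Z/8Z with the field Z/7Z.

```python
def roots(coeffs, n):
    """Roots in Z/nZ of the polynomial whose coefficient of x^k is coeffs[k]."""
    return [x for x in range(n)
            if sum(c * x**k for k, c in enumerate(coeffs)) % n == 0]

print(roots([0, -1, 0, 1], 8))   # x^3 - x in Z/8Z: [0, 1, 3, 5, 7]
print(roots([0, -1, 0, 1], 7))   # x^3 - x in Z/7Z: [0, 1, 6]
```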

12.3. Public key cryptography; the RSA method.¹⁹ We cannot go here too much into the practical
aspects of cryptography. Suffice it to say that in many cryptographic applications two parties X and Y wish to
exchange a secret. Given any large integer n that secret can be represented as a number modulo n, and we
leave it to the reader’s imagination to devise methods for that. The method proceeds as follows:

X chooses two large primes p < q.

X calculates n = pq.
X calculates k = ( p − 1)(q − 1).
X chooses an integer d such that (d, k) = 1.
X finds an integer e such that ed ≡ 1 (mod k).
X publishes for anyone to see the data e, n. This is called the public key.

The rest of the data p, q, k, d is kept secret. In fact, p, q, k can be destroyed altogether and only d is kept,
and kept secret!

d is called the private key.

¹⁹ The RSA method described above is named after Ron Rivest, Adi Shamir and Len Adleman, who discovered it in 1977.

Y, wishing to send a secret, writes it as an integer b modulo n, which is also relatively prime to n, and sends $b^e$
(mod n) to X, allowing anyone interested to see that message. The point is that it is believed to be very difficult
to find what b is, even when one knows $b^e$, e and n; this is usually called the RSA problem (it is related to, but
not the same as, the discrete log problem, which asks for the exponent e given b and $b^e$). Thus, someone seeing
Y’s message cannot find the secret b from it.

X, upon receiving Y’s message $b^e$, calculates $(b^e)^d$.

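Lemma 12.3.1. We have $b^{ed} \equiv b \pmod{n}$.
Proof. We need to show that $b^{ed} \equiv b \pmod{p}$ and $b^{ed} \equiv b \pmod{q}$. Then $p \mid (b^{ed} - b)$ and $q \mid (b^{ed} - b)$ and
so (using that p, q are distinct primes), $n = pq \mid (b^{ed} - b)$.
The argument being symmetric, we just show $b^{ed} \equiv b \pmod{p}$. Since ed ≡ 1 (mod k), where k = (p − 1)(q − 1),
we can write ed = 1 + vk for some integer v. We have modulo p
$b^{ed} = b^{1+vk}$
$= b \cdot ((b^{p-1})^{q-1})^v$
$= b \cdot (1^{q-1})^v$ (Fermat’s little theorem)
$= b$.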

We have shown that X can retrieve Y’s secret.

Here is a numerical example:

p = 10007, q = 10009;
n = p*q = 100160063
k = (p-1)*(q-1) = 100140048
d = 10001
e = 88695185
b = 3
b^e = 33265563
33265563^10001 = 3 Mod n.
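
The numerical example can be reproduced with Python's built-in modular exponentiation. This is a toy sketch with the small primes of the example; the variable names cipher and recovered are illustrative, pow(d, -1, k) requires Python 3.8 or later, and a real implementation would involve much larger primes and additional safeguards not discussed here.

```python
import math

p, q = 10007, 10009
n = p * q                        # 100160063
k = (p - 1) * (q - 1)            # 100140048
d = 10001                        # private key, chosen with (d, k) = 1
assert math.gcd(d, k) == 1
e = pow(d, -1, k)                # public exponent: e*d ≡ 1 (mod k)

b = 3                            # the secret, relatively prime to n
cipher = pow(b, e, n)            # what Y sends: b^e mod n
recovered = pow(cipher, d, n)    # what X computes: (b^e)^d mod n
print(n, k, e, recovered)
```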

RSA rests on several security assumptions. Besides the belief that recovering b from $b^e$ modulo n is inherently
difficult, it also presupposes that factoring an integer of the form pq, where p and q are
primes of similar size, is a computationally difficult problem whose solution has running time comparable to
the size of p. All evidence so far indicates that this is so. Suppose that p is a prime of the order of magnitude
of $10^{1000}$ (at this point in time (2016) this is considered very secure). Let us calculate how long it will take
to factor n by brute force trial and error. Cray’s supercomputer Titan does 20,000 trillion ($2 \times 10^{16}$) floating-point
operations (flops) per second. It cost 97 million dollars to build. To simplify, let us assume that this many
flops boils down to trying $2 \times 10^{16}$ integers as factors of n per second.
We can assume that such a computer can’t be built for less than 10 million dollars. A computer 10000
times faster would probably cost today in excess of 10 billion dollars, if it can be constructed at all. The existence
of anything bigger would probably be public knowledge for budgetary reasons and the size of resources needed
for constructing and running such a computer. Even with dedicated architecture it wouldn’t perform more
than $10^{25}$ operations per second. Factoring an integer greater than $10^{1000}$ by brute force would then take
more than $10^{975}$ seconds, which is much much more than a billion years. Even relaxing our assumptions
greatly shows that for all practical purposes, as long as our security assumptions are valid, it is not feasible
to break RSA based on a key of this size.

13. Exercises
(1) A relation can be either reflexive or not, symmetric or not, transitive or not. This gives a priori 8
possibilities (e.g., reflexive, non-symmetric, transitive). For each possibility either give an example
of such a relation, or indicate why this possibility doesn’t occur. Whenever possible, give “natural
examples”.
(2) Let A be a set.
(a) Let Γ = A × A. What relation does Γ define on A? Is it an equivalence relation? For every
a ∈ A write the set of elements b ∈ A such that a ∼ b.
(b) Let Γ = {( a, a) : a ∈ A}. What relation does Γ define on A? Is it an equivalence relation? For
every a ∈ A write the set of elements b ∈ A such that a ∼ b.
(c) Define a relation on non-zero real numbers by saying that a ∼ b if a/b is a rational number.
Show that this is an equivalence relation.
(d) Define a relation on complex numbers by saying that z1 ∼ z2 if |z1 | = |z2 |. Show that this is
an equivalence relation. What do the equivalence classes look like graphically?
(e) Let f(x) be a polynomial of the form $x^n + a_{n-1} x^{n-1} + \cdots + a_0$, where n ≥ 1 and $a_i$ ∈ R.
Describe a relation on R by saying that a ∼ b if f ( a) = f (b). Prove that this is an equivalence
relation. Prove that if n is odd, there is an element a ∈ R whose equivalence class consists only
of itself. (This question requires some use of calculus.)
(3) Given an integer N, we write N in decimal expansion as $N = n_k n_{k-1} \dots n_0$, the $n_i$ being the digits
of N. Note that this means that $N = n_0 + 10 n_1 + 10^2 n_2 + \cdots + 10^k n_k$. In the following you are
asked to show certain divisibility criteria that can be proved by using congruences.
(a) Prove that a positive integer $N = n_k n_{k-1} \dots n_0$ is divisible by 3 if and only if the sum of its
digits $n_0 + n_1 + \cdots + n_k$ is divisible by 3. (Hint: show that in fact N and $n_0 + n_1 + \cdots + n_k$
are congruent to the same number modulo 3.) Example: 34515 is divisible by 3 because 3 +
4 + 5 + 1 + 5 = 18 is divisible by 3.
(b) Prove that a positive integer $N = n_k n_{k-1} \dots n_0$ is divisible by 11 if and only if the sum of its
digits with alternating signs $n_0 - n_1 + n_2 - \cdots + (-1)^k n_k$ is divisible by 11. Example: 1234563
is divisible by 11 since 1 − 2 + 3 − 4 + 5 − 6 + 3 = 0 is divisible by 11.
(c) Prove that a positive integer $N = n_k n_{k-1} \dots n_0$ is divisible by 7 if and only if when we let
$M = n_k n_{k-1} \dots n_1$, we have that $M - 2n_0$ is divisible by 7. Example: take the number 7 ∗ 11 ∗
13 ∗ 17 = 17017. It is clearly divisible by 7. Let us check the criterion against this example. We
form the number 1701 − 2 ∗ 7 = 1687 and then the number 168 − 2 ∗ 7 = 154 and then the
number 15 − 2 ∗ 4 = 7. So it works. Let us also check the number 82. It is not divisible by 7, in
fact its residue modulo 7 is 5. Also 8 − 2 ∗ 2 = 4, so the criterion shows that it’s not divisible.
Note though that in this case the number N = 82 and the number M = 8 − 2 ∗ 2 = 4 don’t
have the same residue modulo 7. So you need to construct your argument a little differently
than in the previous exercises.
(4) To check if you had multiplied correctly two large numbers A and B, A × B = C, you can make the
following check: sum the digits of A; keep doing it repeatedly until you get a single digit number a.
Do the same for B and C and get numbers b, c. If you have multiplied correctly, the sum of digits of
ab is c. Prove that this is so. This is called in French “preuve par neuf".

Example: I have multiplied A = 367542 by B = 687653 and got C = 252741358926. To check

(though this doesn’t prove the multiplication is correct) I do: 3 + 6 + 7 + 5 + 4 + 2 = 27, 2 + 7 = 9
and a = 9. Also 6 + 8 + 7 + 6 + 5 + 3 = 35, 3 + 5 = 8 and b = 8. ab = 72 and its sum of digits is
9. On the other hand 2 + 5 + 2 + 7 + 4 + 1 + 3 + 5 + 8 + 9 + 2 + 6 = 54, 5 + 4 = 9. So it checks.
3
(5) Calculate the expression 23··65+
−3 in Z/5Z and in Z/7Z. Note: we write $\frac{1}{a}$ to denote the inverse $a^{-1}$
10
of a with respect to multiplication in the field. For example, in Z/5Z, $\frac{1}{3} = 2$ and in Z/7Z, $\frac{1}{3} = 5$.
The expression $\frac{a}{b}$ means $a \cdot b^{-1}$.
(6) (a) Find all solutions to the equation $x^2 + x = 0$ in Z/pZ, where p is prime.

(b) Find all solutions to the equation $x^2 + x = 0$ in Z/6Z.

(7) Solve the following equations:
(a) 12x = 2 in Z/19Z.
(b) 7x = 2 in Z/24Z.
(c) 31x = 1 in Z/50Z.
(d) 34x = 1 in Z/97Z.
(e) 27x = 2 in Z/40Z.
(f) 15x = 5 in Z/63Z.
(8) (a) Let p > 2 be a prime. Prove that an equation of the form $ax^2 + bx + c$ (where a, b, c ∈ $\mathbb{F}_p$, a ≠ 0)
has a solution in Z/pZ if and only if $b^2 - 4ac$ is a square in Z/pZ. If this is so, prove that the
solutions are given by the familiar formula.
(b) Determine for which values of a the equation $x^2 + x + a$ has a solution in Z/7Z.
(9) Calculate the following:
(a) $(2^{19808} + 6)^{-1} + 1$ (mod 11).
(b) $12, 12^2, 12^4, 12^8, 12^{16}, 12^{25}$ all modulo 29. (Hint: think how to proceed most efficiently before
computing).
(10) Let p be a prime number, prove that $p \mid \binom{p}{i}$ for every 1 ≤ i ≤ p − 1.
(11) Let R be a commutative ring and x, y ∈ R. Interpret $\binom{n}{i}$ as the sum in R of $\binom{n}{i}$ times the element 1.
Then the binomial formula holds in R:
$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}$.

(12) Find all values a for which the system of equations xy = a, x + y = 1 has a solution in Z/19Z. Is
there such an a for which there is a unique solution?
(13) Prove Wilson’s theorem: Let p be a prime number then ( p − 1)! ≡ −1 (mod p).
(14) Let p be an odd prime and consider the sum
$\frac{m}{n} := 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{p-1}$.
Prove that p|m. One method starts by multiplying this sum by (p − 1)!. The other method considers
the ring Z[1/p] := { $\frac{a}{b}$ : p ∤ b, a, b ∈ Z}. As p is an element of this ring, we can talk about divisibility
by p and define for two elements x, y in this ring that x ≡ y (mod p) if p|( x − y). Show that there
is a well-defined commutative ring structure on the set of congruence classes. Show that the residue
classes are represented once more by 0, 1, . . . , p − 1. Let a be an integer not congruent to 0 modulo
p. There is an integer b such that ab ≡ 1 (mod p); namely, the congruence class of b is the inverse
of a in the field Z/pZ. Show that $b \equiv \frac{1}{a} \pmod{p}$, where this congruence is taking place in the
ring Z[1/p]. With these preparations, prove that p|m.
(15) Provide another proof of Fermat’s little theorem: Let p be a prime number and a ≢ 0 (mod p) then
$a^{p-1} \equiv 1 \pmod{p}$.
Start the proof by showing that the residue classes a, 2a, 3a, . . . , (p − 1)a are distinct and non-zero,
and deduce that $(p - 1)! \cdot a^{p-1} \equiv (p - 1)! \pmod{p}$.

Part 4. Polynomials and their arithmetic

Some of our main achievements in studying the arithmetic of Z are the following:
• Division with residue, greatest common divisor and the Euclidean algorithm.
• Primes and unique factorization.
• Construction of new rings, the rings Z/nZ, out of Z. In particular, the construction of fields with
p elements, for every prime number p.
In this section and the following one, we shall see that these constructions, notions and theorems hold true
also for the ring of polynomials over a field. One particular application will be, once we develop sufficient
theory, the construction of finite fields with $p^n$ elements, where p is a prime and n ≥ 1 an integer. One
should note that the only ring we know so far with $p^2$ elements, the ring $Z/p^2Z$, is never a field. And so,
at this point it is not clear at all whether fields with $p^2$ elements, say, exist at all, let alone how to calculate
in them. All that we will be able to do.

14. The ring of polynomials

Let R be a commutative ring. A good example to keep in mind is R = Z or R = C, but our discussion allows
any commutative ring, so rings like Z/nZ and Z[i ] = { a + bi ∈ C : a, b ∈ Z} are perfectly good examples
as well. We define the ring of polynomials over R as
$R[x] = \{ a_n x^n + \cdots + a_1 x + a_0 : a_i \in R \}$.
In the definition n is any non-negative integer. $a_n x^n + \cdots + a_1 x + a_0$ is called a polynomial with coefficients
in R and the $a_i$ are called the coefficients. Note that we allow some, or even all, coefficients to be zero. We
implicitly identify polynomials of the form $a_n x^n + \cdots + a_1 x + a_0$ and $0 \cdot x^{n+m} + \cdots + 0 \cdot x^{n+1} + a_n x^n + \cdots +
a_1 x + a_0$, and we often write, say, $x^5 + x^3 + x$ instead of $x^5 + 0 \cdot x^4 + 1 \cdot x^3 + 0 \cdot x^2 + 1 \cdot x + 0$. The zero
polynomial 0 is the choice n = 0 and $a_0 = 0$. We define addition by adding coefficients (assume n ≥ m; a
similar formula holds when n ≤ m)
$(a_n x^n + \cdots + a_1 x + a_0) + (b_m x^m + \cdots + b_1 x + b_0)$
$= a_n x^n + \cdots + (a_m + b_m) x^m + \cdots + (a_1 + b_1) x + (a_0 + b_0)$.
If you wish, by adding zero coefficients, we can always assume that our polynomials are both of the form
$a_n x^n + \cdots + a_1 x + a_0$ and $b_n x^n + \cdots + b_1 x + b_0$, in which case we can write their sum as
$(a_n x^n + \cdots + a_1 x + a_0) + (b_n x^n + \cdots + b_1 x + b_0) = (a_n + b_n) x^n + \cdots + (a_1 + b_1) x + (a_0 + b_0)$.
We also define multiplication by
$(a_n x^n + \cdots + a_1 x + a_0)(b_m x^m + \cdots + b_1 x + b_0) = c_{n+m} x^{n+m} + \cdots + c_1 x + c_0$,
where
$c_i = a_0 b_i + a_1 b_{i-1} + \cdots + a_{i-1} b_1 + a_i b_0$.
Note that in the formula for $c_i$ it is entirely possible that some $a_j$ or $b_j$, 0 ≤ j ≤ i, are not defined; this
happens if j > n or j > m, respectively. In this case we understand $a_j$, or $b_j$, as zero.
Example 14.0.1. Take R = Z then, for example,
$(2x^2 + x - 2) + (x^3 + x - 1) = x^3 + 2x^2 + 2x - 3$,
and
$(2x^2 + x - 2)(x^3 + x - 1) = 2x^5 + x^4 - x^2 - 3x + 2$.
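
The definitions of addition and multiplication can be tried out on this example. The sketch below assumes a polynomial is stored as the list of its coefficients $[a_0, a_1, \dots, a_n]$ over the integers; the names poly_add and poly_mul are illustrative.

```python
def poly_add(f, g):
    """Add polynomials given as coefficient lists [a0, a1, ..., an]."""
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))      # pad with zero coefficients
    g = g + [0] * (size - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Multiply polynomials: c_i is the sum of the products a_j * b_(i-j)."""
    if not f or not g:
        return []
    c = [0] * (len(f) + len(g) - 1)
    for j, a in enumerate(f):
        for l, b in enumerate(g):
            c[j + l] += a * b
    return c

f = [-2, 1, 2]         # 2x^2 + x - 2
g = [-1, 1, 0, 1]      # x^3 + x - 1
print(poly_add(f, g))  # [-3, 2, 2, 1]
print(poly_mul(f, g))  # [2, -3, -1, 0, 1, 2]
```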

A polynomial $f(x) = a_n x^n + \cdots + a_1 x + a_0$ is called monic if $a_n = 1$. It is called of degree n if $a_n \neq 0$. If f
has degree 0, that is f(x) = a, a ∈ R, a ≠ 0, then f is called a constant polynomial. The degree of the zero
polynomial is not defined, but it is also considered as a constant polynomial.
Proposition 14.0.2. With the operations defined above R[x] is a commutative ring, with zero being the
zero polynomial and 1 being the constant polynomial 1. The additive inverse of $a_n x^n + \cdots + a_1 x + a_0$ is
$-a_n x^n - \cdots - a_1 x - a_0$.

Since the proof is straightforward we leave it as an exercise.

Proposition 14.0.3. If R is an integral domain then R[ x ] is an integral domain. If f ( x ), g( x ) ∈ R[ x ] are
non-zero polynomials,
deg( f ( x ) · g( x )) = deg( f ( x )) + deg( g( x )).
Proof. Say deg(f(x)) = n, deg(g(x)) = m. Then, by definition, $f(x) = a_n x^n + \cdots + a_1 x + a_0$ with $a_n \neq
0$ and $g(x) = b_m x^m + \cdots + b_1 x + b_0$ with $b_m \neq 0$. Therefore $f(x) g(x) = a_n b_m x^{n+m} + (a_n b_{m-1} +
a_{n-1} b_m) x^{n+m-1} + \cdots$. Since R is an integral domain $a_n b_m \neq 0$, and so f(x)g(x) ≠ 0 and deg(f(x) · g(x)) =
n + m.

Let F be a field. We shall see that there are many similarities between the ring of integers Z and the ring
of polynomials with coefficients in F, F[ x ]. Initially, we shall see that they share many notions concerning
divisibility. The notion of absolute value for Z which measures the size of an integer is replaced by the notion
of degree for polynomials, which is the “size" of a polynomial. Using this notion we will define division with
residue and a Euclidean algorithm. Later on, after introducing some ideas from ring theory, we will also
construct the analog of the rings of congruence classes Z/nZ (n ≥ 1 an integer) and in particular of the
fields Z/pZ (p a prime number).

15. Division with residue

Let F be a field. We have defined the ring of polynomials F[ x ]; it is an integral domain but is never a field;
for example x does not have an inverse with respect to multiplication.
Theorem 15.0.1. Let f(x), g(x) be two polynomials in F[x], g(x) ≠ 0. Then, there exist unique polynomials
q(x), r(x) in F[x] such that
f(x) = q(x)g(x) + r(x), r(x) = 0 or deg r(x) < deg g(x).
Proof. We first show the existence and later the uniqueness. Consider the set
S = { f ( x ) − q( x ) g( x ) : q( x ) ∈ F[ x ]}.
If 0 ∈ S then there is a q( x ) such that f ( x ) = q( x ) g( x ) and we take r ( x ) = 0. Else, choose an element r ( x )
in S of minimal degree. Since r ( x ) is in S we can write r ( x ) = f ( x ) − q( x ) g( x ) for some q( x ).
Claim. deg r ( x ) < deg g( x ).
Let us write $r(x) = r_n x^n + \cdots + r_1 x + r_0$ and $g(x) = g_m x^m + \cdots + g_1 x + g_0$, with $r_n \neq 0$, $g_m \neq 0$. Assume,
by contradiction, that n ≥ m. Then, the polynomial $r_1(x) = r(x) - r_n g_m^{-1} x^{n-m} g(x) = (r_{n-1} -
r_n g_m^{-1} g_{m-1}) x^{n-1} + \cdots$ has degree smaller than r(x). On the other hand, the expression $r_1(x) = r(x) -
r_n g_m^{-1} x^{n-m} g(x) = f(x) - q(x) g(x) - r_n g_m^{-1} x^{n-m} g(x) = f(x) - (q(x) + r_n g_m^{-1} x^{n-m}) g(x)$ shows that $r_1(x) \in
S$. Contradiction. We have therefore established the existence of q(x), r(x) such that
f ( x ) = q ( x ) g ( x ) + r ( x ), r ( x ) = 0 or deg r ( x ) < deg g( x ).
We now prove uniqueness. Suppose that also
f ( x ) = q1 ( x ) g ( x ) + r1 ( x ), r1 ( x ) = 0 or deg r1 ( x ) < deg g( x ).
We need to show that q( x ) = q1 ( x ), r ( x ) = r1 ( x ). We have,
(q( x ) − q1 ( x )) g( x ) = r1 ( x ) − r ( x ).
The right hand side is either zero or has degree less than $g(x)$. If it's zero then, since $\mathbb{F}[x]$ is an integral domain, we also have $q(x) = q_1(x)$. If $r(x) \neq r_1(x)$ then also $q(x) \neq q_1(x)$ but then the degree of the left hand side is $\deg(q(x) - q_1(x)) + \deg(g(x)) \geq \deg(g(x)) > \deg(r_1(x) - r(x))$ and we get a contradiction.
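
The existence argument is effectively an algorithm: keep subtracting $r_n g_m^{-1} x^{n-m} g(x)$ until the degree drops below $\deg g$. Here is a minimal Python sketch over $\mathbb{Q}$. The coefficient-list representation (lowest degree first) and the names `trim` and `poly_divmod` are illustrative assumptions, not notation from these notes.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r = [] or deg r < deg g.

    Polynomials are lists of rational coefficients, lowest degree first.
    """
    f = [Fraction(c) for c in trim(f)]
    g = [Fraction(c) for c in trim(g)]
    if not g:
        raise ZeroDivisionError("g must be a non-zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)        # this is n - m in the proof
        coeff = r[-1] / g[-1]          # this is r_n * g_m^{-1}
        q[shift] = coeff
        for i, gc in enumerate(g):     # subtract coeff * x^shift * g(x)
            r[i + shift] -= coeff * gc
        r = trim(r)                    # the leading term has cancelled
    return trim(q), r

# Divide x^3 + x^2 - x - 1 by 4x^2 + 5x + 1.
q, r = poly_divmod([-1, -1, 1, 1], [1, 5, 4])
```

The sketch relies on exact `Fraction` arithmetic, so leading terms cancel exactly; with floating point numbers the `trim` step would not be reliable.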

16. Arithmetic in F[ x ]
In this section F is a field. We denote by F× the set of non-zero elements of F.
16.1. Some remarks about divisibility in a commutative ring T. The definitions we made in § 9.2 can be
made in general and the same basic properties hold. Let T be a commutative ring and a, b ∈ T. We say that
a divides b if b = ac for some c ∈ T. We have the following properties:
(1) a|b ⇒ a| − b.
(2) a|b ⇒ a|bd for any d ∈ T.
(3) a|b, a|d ⇒ a|(b ± d).
In particular, the definition and properties hold for the ring of polynomials R[ x ], where R is a commutative
ring.
16.2. GCD of polynomials.
Definition 16.2.1. Let $f(x), g(x) \in \mathbb{F}[x]$, not both zero. The greatest common divisor of $f(x)$ and $g(x)$, denoted $\gcd(f(x), g(x))$ or just $(f(x), g(x))$, is the monic polynomial of largest degree dividing both $f(x)$ and $g(x)$. (We shall see below that there is a unique such polynomial.)
Theorem 16.2.2. Let f ( x ), g( x ) be polynomials, not both zero. The gcd of f ( x ) and g( x ), h( x ) =
( f ( x ), g( x )), is unique and can be expressed as
h ( x ) = u ( x ) f ( x ) + v ( x ) g ( x ), u ( x ), v ( x ) ∈ F[ x ].
It is the monic polynomial of minimal degree having such an expression. If t( x ) divides both g( x ) and f ( x )
then t( x )|h( x ).
Proof. Consider the following set of monic polynomials
S = { a( x ) : a( x ) = u( x ) f ( x ) + v( x ) g( x ) for some u( x ), v( x ) ∈ F[ x ], a( x ) monic}.
$S$ contains a non-zero polynomial, because if $f(x) \neq 0$, $f(x) = bx^n + \text{l.o.t.}$ (lower order terms), then $b^{-1} f(x) \in S$; if $f(x) = 0$
then g( x ) is not zero and the same argument can be applied to g( x ). Let h( x ) be an element of min-
imal degree of S. We claim that h( x ) divides both f ( x ) and g( x ). Since the situation is symmetric,
we just prove h( x )| f ( x ). Suppose not, then we can write f ( x ) = q( x )h( x ) + r ( x ), where r ( x ) is a
non-zero polynomial of degree smaller than h( x ). Then r ( x ) = f ( x ) − q( x )(u( x ) f ( x ) + v( x ) g( x )) =
(1 − q( x )u( x )) · f ( x ) − q( x )v( x ) · g( x ) and so, if we let r1 ( x ) be r ( x ) divided by its leading coefficient, we
see that r1 ( x ) ∈ S and has degree smaller than h( x ), which is a contradiction.
By construction, h( x ) is the monic polynomial of minimal degree having such an expression. If t( x ) divides
both g( x ) and f ( x ) then t( x )|(u( x ) f ( x ) + v( x ) g( x )) = h( x ). Therefore, h( x ) is a monic polynomial of the
largest possible degree dividing both f ( x ), g( x ). Suppose that h1 ( x ) is another monic polynomial dividing f ( x )
and g( x ) having the largest possible degree, i.e., the degree of h( x ). Then, we have h( x ) = h1 ( x )b( x ) by
what we proved. Since both polynomials have the same degree b( x ) must be a constant polynomial, and,
then, since both are monic, b( x ) = 1. We have shown the gcd is unique.

16.3. The Euclidean algorithm for polynomials.

Theorem 16.3.1. Let $f(x), g(x) \in \mathbb{F}[x]$ be non-zero polynomials, $g(x) = a_n x^n + \text{l.o.t.}$ If $g(x) \mid f(x)$ then $(f(x), g(x)) = a_n^{-1} g(x)$. Else, define inductively,
\[
\begin{aligned}
f(x) &= q_0(x) g(x) + r_0(x), & \deg(r_0) &< \deg(g) \\
g(x) &= q_1(x) r_0(x) + r_1(x), & \deg(r_1) &< \deg(r_0) \\
r_0(x) &= q_2(x) r_1(x) + r_2(x), & \deg(r_2) &< \deg(r_1) \\
&\ \ \vdots \\
r_{t-2}(x) &= q_t(x) r_{t-1}(x) + r_t(x), & \deg(r_t) &< \deg(r_{t-1}) \\
r_{t-1}(x) &= q_{t+1}(x) r_t(x).
\end{aligned}
\]

(Footnote 20: l.o.t. = lower order terms.)


This is indeed possible, and the process always terminates. Letting $r_t(x) = c_m x^m + \cdots + c_0$, we have
\[ (f(x), g(x)) = c_m^{-1} r_t(x). \]

Moreover, this algorithm also allows expressing ( f ( x ), g( x )) in the form u( x ) f ( x ) + v( x ) g( x ).

Proof. Each step in the process is done based on Theorem 15.0.1. The process must terminate because the
degrees decrease and they are natural numbers.
It is easy to see that $r_t \mid r_{t-1}$. Suppose we know $r_t$ divides $r_{t-1}, r_{t-2}, \dots, r_a$; then, since $r_{a-1} = q_{a+1} r_a + r_{a+1}$, we get also that $r_t \mid r_{a-1}$. We conclude that $r_t$ divides $r_0, r_1, \dots, r_t$. Exactly the same argument gives that $r_t$ divides $g(x)$ and $f(x)$.
Conversely, if a( x ) divides f ( x ) and g( x ) then a( x )|( f ( x ) − q0 ( x ) g( x )) = r0 ( x ) and therefore a( x )|( g( x ) −
q1 ( x )r0 ( x )) = r1 ( x ), etc. We see that a( x )|rt ( x ) and so rt ( x ), once divided by its leading coefficient, must
be the greatest common divisor of f ( x ) and g( x ).

Example 16.3.2. (1) $f(x) = x^2 + 1$, $g(x) = x^2 + 2ix - 1$, complex polynomials. We have
\[
\begin{aligned}
f(x) &= 1 \cdot (x^2 + 2ix - 1) + (-2ix + 2) \\
x^2 + 2ix - 1 &= \left(\tfrac{1}{-2i}\, x - \tfrac{1}{2}\right)(-2ix + 2).
\end{aligned}
\]
It follows that $(f(x), g(x)) = \frac{1}{-2i}(-2ix + 2) = x + i$. This implies that $-i$ is a root of both polynomials, as one can verify.
(2) Now we choose $\mathbb{F} = \mathbb{Z}/3\mathbb{Z}$, the field with 3 elements. We take $f(x) = x^3 + 2x + 1$, $g(x) = x^2 + 1$. We then have,
\[
\begin{aligned}
f(x) &= x \cdot (x^2 + 1) + (x + 1) \\
x^2 + 1 &= (x - 1) \cdot (x + 1) + 2 \\
x + 1 &= (2x + 2) \cdot 2.
\end{aligned}
\]
This implies that $(f(x), g(x)) = 1$. We have
\[
\begin{aligned}
2 &= (x^2 + 1) - (x - 1) \cdot (x + 1) \\
&= g(x) - (x - 1)(f(x) - x g(x)) \\
&= (-x + 1) f(x) + (x^2 - x + 1) g(x).
\end{aligned}
\]
And so we find (note that $1 = -2$ in $\mathbb{F}$)
\[ 1 = (f(x), g(x)) = (x - 1) f(x) - (x^2 - x + 1) g(x). \]
(3) Consider the polynomials $f(x) = x^3 + 5x^2 + 4x$, $g(x) = x^3 + x^2 - x - 1$, as rational polynomials. Then
\[
\begin{aligned}
f(x) &= 1 \cdot g(x) + 4x^2 + 5x + 1 \\
x^3 + x^2 - x - 1 &= \left(\tfrac{1}{4}x - \tfrac{1}{16}\right)(4x^2 + 5x + 1) - \tfrac{15}{16}x - \tfrac{15}{16} \\
4x^2 + 5x + 1 &= -\tfrac{16}{15}(4x + 1)\left(-\tfrac{15}{16}x - \tfrac{15}{16}\right).
\end{aligned}
\]
It follows that $(f(x), g(x)) = x + 1$.
To express $x + 1$ as $u(x) f(x) + v(x) g(x)$ we work backwards:
\[
\begin{aligned}
-\tfrac{15}{16}(x + 1) &= g(x) - \left(\tfrac{1}{4}x - \tfrac{1}{16}\right)(4x^2 + 5x + 1) \\
&= g(x) - \left(\tfrac{1}{4}x - \tfrac{1}{16}\right)(f(x) - g(x)) \\
&= -\left(\tfrac{1}{4}x - \tfrac{1}{16}\right) \cdot f(x) + \left(\tfrac{15}{16} + \tfrac{1}{4}x\right) \cdot g(x).
\end{aligned}
\]
Thus,
\[ x + 1 = (f(x), g(x)) = \left(-\tfrac{1}{15} + \tfrac{4}{15}x\right) \cdot f(x) - \left(1 + \tfrac{4}{15}x\right) \cdot g(x). \]
(4) Now consider the same polynomials over the field $\mathbb{F} = \mathbb{Z}/3\mathbb{Z}$. We now have:
\[
\begin{aligned}
f(x) &= 1 \cdot g(x) + x^2 + 2x + 1 \\
x^3 + x^2 - x - 1 &= (x - 1)(x^2 + 2x + 1).
\end{aligned}
\]
Therefore, now we have $(f(x), g(x)) = x^2 + 2x + 1 = (x + 1)^2$.
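
The back-substitution in these examples can be mechanised by carrying the coefficients $u, v$ along with each remainder. The following Python sketch does this over $\mathbb{Q}$ and reproduces example (3). The list representation (lowest degree first) and all function names are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zeros; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(f, g):
    """Division with residue; g must be non-zero and trimmed."""
    q, r = [], list(f)
    while len(r) >= len(g):
        shift, c = len(r) - len(g), r[-1] / g[-1]
        q = add(q, [Fraction(0)] * shift + [c])
        r = add(r, mul([Fraction(0)] * shift + [-c], g))
    return q, r

def ext_gcd(f, g):
    """Return (h, u, v): h = gcd(f, g) monic and h = u*f + v*g (f, g not both zero)."""
    f = [Fraction(c) for c in trim(f)]
    g = [Fraction(c) for c in trim(g)]
    # Invariants: r0 = u0*f + v0*g and r1 = u1*f + v1*g.
    r0, u0, v0 = f, [Fraction(1)], []
    r1, u1, v1 = g, [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        neg_q = [-c for c in q]
        r0, u0, v0, r1, u1, v1 = (r1, u1, v1, r,
                                  add(u0, mul(neg_q, u1)),
                                  add(v0, mul(neg_q, v1)))
    lead = r0[-1]                      # normalise to a monic gcd
    return ([c / lead for c in r0],
            [c / lead for c in u0],
            [c / lead for c in v0])

# Example (3): f = x^3 + 5x^2 + 4x, g = x^3 + x^2 - x - 1 over Q.
h, u, v = ext_gcd([0, 4, 5, 1], [-1, -1, 1, 1])
```

Here `h` is $x + 1$, `u` is $-\tfrac{1}{15} + \tfrac{4}{15}x$ and `v` is $-1 - \tfrac{4}{15}x$, in agreement with the computation by hand.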

One interesting application of the Euclidean algorithm is that it allows one to check whether two polynomials $f, g \in \mathbb{C}[x]$ (for example) have a common root, although it does not find the root. The point is that if $f(\alpha) = g(\alpha) = 0$ then $(x - \alpha) \mid f(x)$ and $(x - \alpha) \mid g(x)$. Indeed, divide $f(x)$ by $x - \alpha$ with residue: $f(x) = q(x)(x - \alpha) + r(x)$. Division with residue gives that $r(x)$ is a constant polynomial. On the other hand, substitute $x = \alpha$ to see that $r(\alpha) = f(\alpha) - q(\alpha)(\alpha - \alpha) = 0$. Thus, $r(x) = 0$ and $(x - \alpha) \mid f(x)$; similarly, $(x - \alpha) \mid g(x)$. Thus, if $f$ and $g$ have a common root $\alpha$ then $(x - \alpha) \mid (f, g)$ and so $(f, g) \neq 1$.
Conversely, suppose that $d(x) := (f, g) \neq 1$. As $d(x)$ is a non-constant polynomial it has a root $\alpha \in \mathbb{C}$ and as $f(x) = d(x) q_1(x)$, $g(x) = d(x) q_2(x)$ for some $q_i(x) \in \mathbb{C}[x]$, it follows that $f(\alpha) = g(\alpha) = 0$.
The interesting point about this use of the Euclidean algorithm is that it provides a quick and efficient answer to the question of whether two polynomials $f$ and $g$ have a common root. It should be stressed that the straightforward approach, namely “calculate the roots of $f$ and $g$ and check if one of them is the same", usually fails, because there is no general method to find in exact form the roots of polynomials of high degree.

16.4. Irreducible polynomials and unique factorization. Let $\mathbb{F}$ be a field. We define a relation on polynomials $f(x) \in \mathbb{F}[x]$. We say that $f(x) \sim g(x)$ if there is an element $a \in \mathbb{F}$, $a \neq 0$, such that $f(x) = a g(x)$.
Lemma 16.4.1. This relation is an equivalence relation. Related polynomials are called associates. The associates of $1$ are the non-zero scalars $\mathbb{F}^\times$.
Proof. The relation is reflexive because $f(x) = 1 \cdot f(x)$ and symmetric, because $f(x) = a g(x)$ implies $g(x) = a^{-1} f(x)$. It is also transitive since $f(x) = a g(x)$ and $g(x) = b h(x)$ implies $f(x) = ab\, h(x)$ and $ab \neq 0$. Finally, the last claim follows immediately from the definition.
The motivation for this definition of being “associated" is that as far as division goes two associated polynomials behave the same. Indeed, suppose that $f \sim g$, say $f = ag$. If $f \mid h$, that is, if $h = f f_1$ for some polynomial $f_1$, then $h = g(a f_1)$ and that shows that $g \mid h$ as well. The argument can be reversed and one concludes that if $g \mid h$ then $f \mid h$. It is also easy to check that if $h \mid f$ then $h \mid g$ and, conversely, if $h \mid g$ then $h \mid f$.
Why did this issue not come up for integers?! Well, in a sense it was always there but it was so “visible"
that it required no special definition. The units of Z are just ±1 and in this case the correct definition is
to say that two integers a, b are associate if a = ±b. Under this definition it is clear that a|c ⇔ b|c and
c| a ⇔ c|b.
Let us return to polynomials. A non-constant polynomial $f$ is called irreducible if $g \mid f$ implies that $g \sim 1$ or $g \sim f$. As we have noted, if $g \mid f$ and $g_1 \sim g$ then $g_1 \mid f$. Therefore, trying to define “irreducible" by requiring that $g \mid f$ implies $g(x) = 1$ or $g(x) = f(x)$ will not make sense; in general it is too strong a requirement.
Proposition 16.4.2. Let f ( x ) ∈ F[ x ] be a non-constant polynomial. The following are equivalent:
(1) f is irreducible.
(2) if f | gh then f | g or f |h.
Proof. Suppose that $f$ is irreducible, $f \mid gh$ and $f \nmid g$. The only monic polynomials dividing $f$ are $1$ and $a^{-1} f$, where $a$ is the leading coefficient of $f$. Therefore, $(f, g) = 1$ and so, for suitable polynomials $u, v$ we have $uf + vg = 1$. Then $ufh + vgh = h$. Since $f$ divides the left hand side, it also divides the right hand side, i.e., $f \mid h$.
Suppose now that f has the property f | gh ⇒ f | g or f |h. Let g be a divisor of f . Then f = gh for some h
and so f | gh. Therefore, f | g or f |h. Since h| f too, the situation concerning whether f | g or f |h is entirely
symmetric and we can assume that g| f and f | g. This implies that deg( g) ≤ deg( f ) and deg( f ) ≤ deg( g),

and so deg( f ) = deg( g). But then deg(h) = deg( f ) − deg( g) = 0 and so h is a constant polynomial. We
find that f ∼ g.
Example 16.4.3. Here are some comments on irreducible polynomials.
(1) Every linear polynomial is irreducible.
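(2) If $f = ax^2 + bx + c$ is reducible then $f = (\alpha x + \beta)(\gamma x + \delta)$, where $\alpha, \beta, \gamma, \delta \in \mathbb{F}$ and $\alpha \neq 0$, $\gamma \neq 0$. It follows then that $f$ has a root in $\mathbb{F}$, for example $x = -\alpha^{-1}\beta$.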
Conversely, suppose that f has a root α ∈ F then, as we shall see shortly (Theorem 16.5.1), f =
( x − α) g( x ) for some polynomial g( x ) ∈ F[ x ], and degree considerations dictate that g( x ) is a linear
polynomial.
Therefore, for quadratic polynomials one can say that $f$ is reducible if and only if $f$ has a root in $\mathbb{F}$. If, furthermore, $2 \neq 0$ in the field $\mathbb{F}$ then, as we have seen in the assignments, we know that $f$ has a root if and only if $b^2 - 4ac$ is a square in $\mathbb{F}$. In fact, in that case, the unique factorization of $f$ is
\[ ax^2 + bx + c = a \left( x - \frac{-b + \sqrt{b^2 - 4ac}}{2a} \right) \left( x - \frac{-b - \sqrt{b^2 - 4ac}}{2a} \right). \]
(3) If $f$ has degree 3 it is still true that $f$ is reducible if and only if $f$ has a root. But if $f$ has degree 4 or higher this may fail. For example, the polynomial $x^2 - 2$ is irreducible over $\mathbb{Q}$ because $\sqrt{2}$ is irrational. Same for $x^2 - 3$. Thus, for example, the polynomials $(x^2 - 2)^2$, $(x^2 - 2)(x^2 - 3)$, $(x^2 - 3)^2$ are reducible over $\mathbb{Q}$ but don’t have a root. (Indeed, if $\alpha$ is a root of $(x^2 - 2)(x^2 - 3)$ then in $\mathbb{C}$ we have $(\alpha - \sqrt{2})(\alpha + \sqrt{2})(\alpha - \sqrt{3})(\alpha + \sqrt{3}) = 0$ and so $\alpha$ is $\pm\sqrt{2}$ or $\pm\sqrt{3}$ and, in any case, is not rational.)
(4) The property of $f$ being irreducible depends on the field. It is not an absolute property. For example, $x^2 - 2$ is irreducible in $\mathbb{Q}[x]$ but is reducible in $\mathbb{C}[x]$ because there we can write $x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2})$.
Theorem 16.4.4. (Unique factorization for polynomials) Let $f(x) \in \mathbb{F}[x]$ be a non-zero polynomial. Then there is an $a \in \mathbb{F}^\times$ and distinct monic irreducible polynomials $f_1, \dots, f_g$ and positive integers $r_1, \dots, r_g$ such that
\[ f = a f_1^{r_1} \cdots f_g^{r_g}. \tag{3} \]
Moreover, if
\[ f = b h_1^{s_1} \cdots h_t^{s_t}, \]
where $b \in \mathbb{F}^\times$, the $h_i$ are distinct monic irreducible polynomials and $s_i > 0$, then $a = b$, $g = t$, and after re-naming the $h_i$’s we have $h_i = f_i$ for all $i$, and $r_i = s_i$ for all $i$.
Proof. The proof is very similar to the proof for integers. We first prove the existence of factorization.
Suppose that there is a non-zero polynomial f ( x ) with no such factorization. Choose then a non-zero
polynomial $f(x)$ of minimal degree for which no such factorization exists. Then $f(x)$ is not a constant polynomial and is not an irreducible polynomial either, else $f(x) = a_n x^n + \cdots + a_0 = a_n \cdot (a_n^{-1} f(x))$ is a suitable factorization. It follows that $f(x) = f_1(x) f_2(x)$, where each $f_i(x)$ has degree less than that of $f(x)$.
Therefore, each f i ( x ) has a factorization
f 1 ( x ) = c1 a1 ( x ) · · · a m ( x ), f 2 ( x ) = c2 b1 ( x ) · · · bn ( x ),
with ci ∈ F and ai , b j monic irreducible polynomials, not necessarily distinct. It follows that
f ( x ) = (c1 c2 ) a1 ( x ) · · · am ( x )b1 ( x ) · · · bn ( x ),
has also a factorization as claimed, by collecting factors. Contradiction. Thus, no such f ( x ) exists, and every
polynomial has a factorization as claimed.

We now show the uniqueness of the factorization. Suppose that

f ( x ) = c1 a1 ( x ) · · · am ( x ) = c2 b1 ( x ) · · · bn ( x ),
with ci ∈ F and ai , b j monic irreducible polynomials, not necessarily distinct (the polynomials are not related to
those appearing in the previous part of the proof). We prove the result by induction on degree f . Since ci is the

leading coefficient of f , we have c1 = c2 . In particular, the case of deg( f ) = 0 holds. Assume that we proved
uniqueness for all polynomials of degree $\leq d$ and $\deg(f) = d + 1 \geq 1$. Since $a_1(x) \mid c_2 b_1(x) \cdots b_n(x)$ and $a_1(x)$ is irreducible, it follows that either $a_1(x) \mid c_2$ (which is impossible because $c_2$ is a constant) or $a_1(x) \mid b_i(x)$ for some $i$. But since $b_i(x)$ is irreducible it then follows that $a_1(x) \sim b_i(x)$ and so, both polynomials being monic, $a_1(x) = b_i(x)$.
Let us re-number the bi so that a1 = b1 . Then, dividing by a1 ( x ) we have
c1 a2 ( x ) · · · am ( x ) = c2 b2 ( x ) · · · bn ( x ).
Induction gives that m = n and, after re-numbering the bi , ai ( x ) = bi ( x ), i = 2, 3, . . . , n.

Example 16.4.5. Here are some examples.

(1) $f(x) = ax + b$, $a \neq 0$, has unique factorization $a(x + a^{-1}b)$.
(2) $f(x) = ax^2 + bx + c$ is irreducible if and only if it has no root in $\mathbb{F}$, as we have seen above. If this is the case,
\[ f(x) = a(x^2 + a^{-1}bx + a^{-1}c) \]
is the unique factorization. Otherwise, $f$ has two roots, say $\alpha$ and $\beta$ (and if $2 \neq 0$ in $\mathbb{F}$ we have a formula for them) and
\[ f(x) = a(x - \alpha)(x - \beta) \]
is the unique factorization.
(3) Consider the polynomial $f(x) = (x^2 - 2)(x^2 - 3)$ over the field $K = \mathbb{Q}(\sqrt{2})$. We claim that $x^2 - 3$ is irreducible over $K$. Indeed, if not, $\sqrt{3} \in K$ and so $\sqrt{3} = a + b\sqrt{2}$ for some rational numbers $a, b$. Squaring, we get
\[ 3 = a^2 + 2b^2 + 2ab\sqrt{2}. \]
But this implies that $2ab\sqrt{2}$ is rational and so $ab = 0$. If $b = 0$ we get that $\sqrt{3} = a$ is rational, which is a contradiction. If $a = 0$ we get that $3 = 2b^2$, which is a contradiction, because of unique factorization of rational numbers: the power of 2 in the left hand side (i.e., in 3) is 0, while in the right hand side (i.e., in $2b^2$) it is odd, whatever it may be. Therefore,
\[ f(x) = (x - \sqrt{2})(x + \sqrt{2})(x^2 - 3) \]
is the unique factorization of $f$ over $\mathbb{Q}(\sqrt{2})$.
(4) What is the unique factorization of $x^4 + x$ over $\mathbb{F}_2[x]$? An obvious factor is $x$ and then we are left with $x^3 + 1$. We note that $x = 1$ is a root ($2 = 0$ now!) and so $x^3 + 1 = (x + 1)(x^2 + x + 1)$ (cf. Theorem 16.5.1); this is correct as $-1 = 1$ in $\mathbb{F}_2$. The polynomial $x^2 + x + 1$ is quadratic and so is reducible over $\mathbb{F}_2$ if and only if it has a root in $\mathbb{F}_2$. We check by calculation that neither $x = 0$ nor $x = 1$ is a root. We conclude that the unique factorization is given in this case by
\[ x^4 + x = x(x + 1)(x^2 + x + 1). \]
(5) Over the complex numbers any non-constant polynomial $f$ factors as
\[ f(x) = a_n \prod_{i=1}^{n} (x - z_i), \]

(see Proposition 6.3.3) and this is precisely its unique factorization, except if some of the zi are equal
and then we may wish to collect them together to get a factorization as in (3).
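
The factorization in example (4) is small enough to check by machine. The following Python sketch multiplies the three factors in $\mathbb{F}_2[x]$ and confirms by trying every residue class that $x^2 + x + 1$ has no root. The helper names `poly_mul_mod` and `roots_mod` are illustrative assumptions; polynomials are coefficient lists, lowest degree first.

```python
def poly_mul_mod(a, b, p):
    """Multiply two coefficient lists (lowest degree first) in (Z/pZ)[x]."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out

def roots_mod(f, p):
    """All a in Z/pZ with f(a) = 0, found by trying every residue class."""
    return [a for a in range(p)
            if sum(c * a**k for k, c in enumerate(f)) % p == 0]

x, x_plus_1, quad = [0, 1], [1, 1], [1, 1, 1]      # x, x + 1, x^2 + x + 1
product = poly_mul_mod(poly_mul_mod(x, x_plus_1, 2), quad, 2)
quad_roots = roots_mod(quad, 2)
```

Here `product` is the coefficient list of $x^4 + x$ and `quad_roots` is empty, so the quadratic factor is irreducible over $\mathbb{F}_2$. The brute-force root search is only sensible for small $p$.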

We can now deduce from Theorem 16.4.4 the analogues of Proposition 10.2.1 and Corollary 10.2.2. The
proofs are the same.
Proposition 16.4.6. Let $f, g$ be non-zero polynomials in $\mathbb{F}[x]$. Then $f \mid g$ if and only if $f = a p_1^{a_1} \cdots p_m^{a_m}$ and $g = b p_1^{a_1'} \cdots p_m^{a_m'} q_1^{b_1} \cdots q_t^{b_t}$ (products of distinct irreducible monic polynomials $p_i$, $q_j$; $a, b$ non-zero scalars) with $a_i' \geq a_i$ for all $i = 1, \dots, m$.

Corollary 16.4.7. Let $f = a p_1^{a_1} \cdots p_m^{a_m}$, $g = b p_1^{b_1} \cdots p_m^{b_m}$ with $p_i$ distinct irreducible monic polynomials, $a, b$ non-zero scalars and $a_i, b_i$ non-negative integers. (Any two non-zero polynomials can be written this way.) Then
\[ \gcd(f, g) = p_1^{\min(a_1, b_1)} \cdots p_m^{\min(a_m, b_m)}. \]

16.5. Roots. Let F be a field and let f ( x ) ∈ F[ x ] be a non-zero polynomial. Recall that an element a ∈ F
is called a root (or zero, or solution) of f if f ( a) = 0.

Theorem 16.5.1. Let f ( x ) ∈ F[ x ] be a non-zero polynomial.

(1) If f ( a) = 0 then f ( x ) = ( x − a) g( x ) for a unique polynomial g( x ) ∈ F[ x ]. In particular, if f is
irreducible of degree greater than 1 then f has no roots in F.
(2) Let deg( f ) = d then f has at most d roots.

Proof. Suppose that f ( a) = 0 and divide f by x − a with a residue, getting

f ( x ) = g( x )( x − a) + r ( x ),
where r ( x ) is either zero or a polynomial of degree less than that of x − a. That is, in either case, r ( x ) is a
constant. Substitute x = a. We get 0 = f ( a) = g( a)( a − a) + r = r and so f ( x ) = ( x − a) g( x ).
Consider the factorization of $f$ into irreducible monic polynomials:
\[ f = A (x - a_1)^{s_1} \cdots (x - a_m)^{s_m} f_1(x)^{r_1} \cdots f_n(x)^{r_n}, \]
where the $f_i$ are irreducible polynomials of degree larger than 1, the $r_i, s_i$ are positive and $A \in \mathbb{F}^\times$. Note that if $f(a) = 0$ then, since $f_i(a) \neq 0$ (else $f_i(x) = (x - a) g_i(x)$ for some polynomial $g_i$, hence reducible), we must have $a = a_i$ for some $i$. It follows that the number of roots of $f$, counting multiplicities, is $s_1 + s_2 + \cdots + s_m = \deg((x - a_1)^{s_1} \cdots (x - a_m)^{s_m}) \leq \deg(f) = d$.

A field F is called algebraically closed if any non-constant polynomial f ( x ) ∈ F[ x ] has a root in F. Recall
the following important result.

Theorem 16.5.2 (The Fundamental Theorem of Algebra). The field of complex numbers is algebraically
closed.

It is a fact (proven in Algebra III) that every field is contained in an algebraically closed field. If F is
algebraically closed, then the only irreducible polynomials over F are the linear polynomials, viz. x − a, a ∈ F.
It follows then that
\[ f(x) = A (x - a_1)^{s_1} \cdots (x - a_m)^{s_m}, \]
where A is the leading coefficient of f and a1 , . . . , am are the roots (with multiplicities s1 , . . . , sm ).

A natural question is, for a given field F and a given polynomial f ( x ), to tell if f has a root in F or not. It
is the first check one can do when trying to determine whether f ( x ) is irreducible or not. Unfortunately, in
general this is impossible to decide; but we have some partial answers in special cases.

Proposition 16.5.3. Let $f(x) = a_n x^n + \cdots + a_1 x + a_0$ be a non-constant polynomial with integer coefficients. If $a = s/t$, $(s, t) = 1$, is a rational root of $f$ then $s \mid a_0$ and $t \mid a_n$.

Proof. We have $a_n (s/t)^n + \cdots + a_1 (s/t) + a_0 = 0$ and so
\[ a_n s^n + a_{n-1} s^{n-1} t + \cdots + a_1 s t^{n-1} + a_0 t^n = 0. \]
Since $s$ divides $a_n s^n + a_{n-1} s^{n-1} t + \cdots + a_1 s t^{n-1}$, it follows that $s \mid a_0 t^n$. Then, since $(s, t) = 1$, we get that $s \mid a_0$. Similarly, $t$ divides $a_{n-1} s^{n-1} t + \cdots + a_1 s t^{n-1} + a_0 t^n$, so $t$ divides $a_n s^n$. Now $(s, t) = 1$ implies that $t \mid a_n$.

Example 16.5.4. Problem: Find the rational roots of the polynomial $x^4 - \tfrac{7}{2}x^3 + \tfrac{5}{2}x^2 - \tfrac{7}{2}x + \tfrac{3}{2}$.

The roots are the same as for the polynomial $2x^4 - 7x^3 + 5x^2 - 7x + 3$. They are thus of the form $s/t$, where $s = \pm 1, \pm 3$, $t = \pm 1, \pm 2$. We have the possibilities $\pm 1, \pm 1/2, \pm 3, \pm 3/2$. By checking each case, we find the roots are $1/2$ and $3$. We remark that after having found the root $1/2$ we can divide the polynomial $2x^4 - 7x^3 + 5x^2 - 7x + 3$ by $x - 1/2$, finding $2x^3 - 6x^2 + 2x - 6$, whose roots are the roots of $x^3 - 3x^2 + x - 3$. So, in fact, the only possibilities for additional roots are $\pm 1, \pm 3$. This way we are spared the need to check whether $-1/2$ and $\pm 3/2$ are roots.
Here is another example. Is the polynomial $x^3 + 2x^2 + 5$ irreducible over $\mathbb{Q}$?
In this case, if it is reducible then one of the factors would have to have degree 1. (This type of argument only works for polynomials of degree at most 3. For higher degree, we might have a reducible polynomial with no linear factor, e.g., $(x^2 + 1)(x^2 + 3)$.) Namely, the polynomial would have a rational root. But the rational roots can only be $\pm 1, \pm 5$ and one verifies those are not roots. Thus, the polynomial is irreducible.
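
The search in Example 16.5.4 is a finite check, so it is easy to automate. The Python sketch below lists the candidates $s/t$ allowed by Proposition 16.5.3 and keeps those that are roots. The names `divisors` and `rational_roots` are illustrative assumptions; coefficients are integers listed from $a_0$ up to $a_n$, with $a_0 \neq 0$ and $a_n \neq 0$ assumed.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of a_0 + a_1 x + ... + a_n x^n (a_0 != 0, a_n != 0)."""
    a_0, a_n = coeffs[0], coeffs[-1]
    # Candidates s/t with s | a_0 and t | a_n, both signs.
    candidates = {Fraction(sign * s, t)
                  for s in divisors(a_0)
                  for t in divisors(a_n)
                  for sign in (1, -1)}

    def value(x):
        acc = Fraction(0)
        for c in reversed(coeffs):     # Horner's rule, highest coefficient first
            acc = acc * x + c
        return acc

    return sorted(x for x in candidates if value(x) == 0)

roots_quartic = rational_roots([3, -7, 5, -7, 2])   # 2x^4 - 7x^3 + 5x^2 - 7x + 3
roots_cubic = rational_roots([5, 0, 2, 1])          # x^3 + 2x^2 + 5
```

The first call returns $1/2$ and $3$; the second returns an empty list, which for a cubic is enough to conclude irreducibility over $\mathbb{Q}$.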

Proposition 16.5.5. If f ( x ) ∈ R[ x ] is a polynomial of odd degree then f has a root in R.

Proof. Since the roots of f are the roots of − f , we may assume that f ( x ) = an x n + · · · + a1 x + a0 , ai ∈
R, an > 0. An easy estimate shows that there is an N > 0 such that f ( N ) > 0 and f (− N ) < 0. By the
intermediate value theorem there is some a, − N ≤ a ≤ N such that f ( a) = 0. (Another proof appears in
the exercises.)

16.6. Eisenstein’s criterion.

Theorem 16.6.1. Let $f(x) = x^n + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$ be a monic polynomial with integer coefficients. Suppose that for some prime $p$ we have $p \mid a_i$ for all $i = 0, \dots, n-1$, and $p^2 \nmid a_0$. Then $f(x)$ is irreducible over $\mathbb{Q}$.

Proof. We first prove that $f(x)$ is irreducible over $\mathbb{Z}$. Namely, suppose $f(x) = g(x) h(x)$, where $g(x), h(x) \in \mathbb{Z}[x]$ and both polynomials are not constant. Let us write $g(x) = c_a x^a + \cdots + c_1 x + c_0$, $h(x) = d_b x^b + \cdots + d_1 x + d_0$. Note that $c_a, d_b$ are $\pm 1$. Reduce the identity $f(x) = g(x) h(x)$ modulo $p$ (a bar denotes reduction of the coefficients modulo $p$) to get $x^{a+b} = \bar{g}(x) \cdot \bar{h}(x)$. By unique factorization for $\mathbb{Z}/p\mathbb{Z}[x]$ we conclude that $\bar{g}(x) = \bar{c}_a x^a$, $\bar{h}(x) = \bar{d}_b x^b$, and in particular, that $p \mid c_0$, $p \mid d_0$. It follows that $p^2 \mid c_0 d_0 = a_0$ and that’s a contradiction.
We next prove that if $f(x)$ is reducible over $\mathbb{Q}$ it is reducible over $\mathbb{Z}$. Suppose that $f(x) = g(x) h(x)$, where $g(x), h(x) \in \mathbb{Q}[x]$ are non-constant polynomials. Multiply by a suitable integer $N$ to get that $F(x) = G(x) H(x)$, where $F(x) = N f(x)$ is in $\mathbb{Z}[x]$ and all its coefficients are divisible by $N$, and $G(x), H(x) \in \mathbb{Z}[x]$ are non-constant polynomials. It is enough to prove that if a prime $p$ divides all the coefficients of $F(x)$, it either divides all the coefficients of $G(x)$ or all the coefficients of $H(x)$, because then, peeling off one prime at a time, we find a factorization of $f(x)$ into polynomials with integer coefficients.
Reduce the equation $F(x) = G(x) H(x)$ modulo $p$, for a prime $p$ that divides all the coefficients of $F(x)$, to find $0 = \bar{G}(x) \cdot \bar{H}(x)$. Using that $\mathbb{Z}/p\mathbb{Z}[x]$ is an integral domain, we conclude that either $\bar{G}(x) = 0$ or $\bar{H}(x) = 0$, which means that either $p$ divides all the coefficients of $G$, or all the coefficients of $H$.
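
Checking the hypotheses of Theorem 16.6.1 for a given prime is a one-line divisibility test. The following Python sketch does just that; the function name `eisenstein_applies` is an illustrative assumption, coefficients are listed from $a_0$ up to the leading coefficient, and the primality of `p` is assumed rather than checked. A result of `False` only means that the criterion gives no information for that $p$; it does not show that the polynomial is reducible.

```python
def eisenstein_applies(coeffs, p):
    """True if the monic integer polynomial a_0 + a_1 x + ... + x^n satisfies
    p | a_i for all i < n and p^2 does not divide a_0 (p assumed prime)."""
    *lower, lead = coeffs
    return (lead == 1
            and len(lower) >= 1
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)

cubic_ok = eisenstein_applies([-2, 0, 0, 1], 2)        # x^3 - 2 with p = 2
quartic_ok = eisenstein_applies([5, 10, 0, 0, 1], 5)   # x^4 + 10x + 5 with p = 5
no_info = eisenstein_applies([4, 0, 1], 2)             # x^2 + 4: 4 divides a_0
```

For instance, the criterion applies to $x^3 - 2$ with $p = 2$ and to $x^4 + 10x + 5$ with $p = 5$, so both are irreducible over $\mathbb{Q}$.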

16.7. Roots of polynomials in $\mathbb{Z}/p\mathbb{Z}$. Let $p$ be a prime and let $\mathbb{Z}/p\mathbb{Z}$ be the field with $p$ elements whose elements are congruence classes modulo $p$. By Fermat’s little theorem, every element of $(\mathbb{Z}/p\mathbb{Z})^\times$ is a root of $x^{p-1} - 1$. This gives $p - 1$ distinct roots of $x^{p-1} - 1$ and so these must be all the roots and each must appear with multiplicity one. It follows that the roots of $x^p - x$ are precisely the elements of $\mathbb{Z}/p\mathbb{Z}$, again each with multiplicity one. That is,
\[ x^p - x = \prod_{a=0}^{p-1} (x - \bar{a}). \]

Proposition 16.7.1. Let f ( x ) be any polynomial in Z/pZ[ x ]. Then f ( x ) has a root in Z/pZ if and only
if gcd( f ( x ), x p − x ) 6= 1.

Proof. If $f(a) = 0$ for some $a \in \mathbb{Z}/p\mathbb{Z}$ then $(x - a) \mid f(x)$, but also $(x - a) \mid (x^p - x)$. It follows that $\gcd(f(x), x^p - x) \neq 1$. Conversely, if $h(x) = \gcd(f(x), x^p - x) \neq 1$ then, since $h(x) \mid x^p - x = \prod_{a=0}^{p-1} (x - \bar{a})$, by unique factorization we must have $h(x) = \prod_{i=1,\dots,n} (x - a_i)$ for some distinct elements $a_1, \dots, a_n$ of $\mathbb{Z}/p\mathbb{Z}$. In particular, each such $a_i$ is a root of $f(x)$.
The straightforward way to check if $f(x)$ has a root in $\mathbb{Z}/p\mathbb{Z}$ is just to try all possibilities for $x$. Suppose that $f(x)$ has a small degree relative to $p$. Even then, except in special cases, we still have to try $p$ residue classes, each in its turn, to see if any of them is a root. But $p$ may be very large, much too large for this method to be feasible. For example, $p$ might be of cryptographic size $\approx 2^{2048}$ (according to the RSA company, this should be secure until 2030). Even with a computer doing $10^{10}$ operations per second, which is about what a good laptop does these days (2016), checking all these possibilities will take about $10^{600}$ years!
Proposition 16.7.1 suggests a different method: Calculate $\gcd(f(x), x^p - x)$. Note that except for the first step
\[ x^p - x = q_0(x) f(x) + r_0(x), \]
all the polynomials involved in the Euclidean algorithm would have very small degrees (smaller than $f$’s for example) and so the Euclidean algorithm will terminate very quickly. The first step, though, could be very time consuming given what we know at this point. Later we shall see that it can, in fact, be done quickly (in order of magnitude $\log(p)$). For example, using the software GP/PARI, it took my laptop 69 microseconds to determine that for $p = 2^{2203} - 1$, the polynomial $f(x) = x^3 + x + 1$ is irreducible. This $p$ is an example of a Mersenne prime; for numbers of the form $2^n - 1$ we have special methods to ascertain their primality. Even a prime of the form $p = 2^{9941} - 1$ was no problem; it took 3.582 seconds to determine that $f(x)$ is reducible modulo $p$. It even took only 7.710 seconds to find the root. As the root has close to 3000 digits, I am not listing it here.
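
One standard way to make the first step fast is to compute $x^p \bmod f(x)$ by repeated squaring, reducing modulo $f$ after every multiplication, so that only polynomials of degree less than $\deg f$ ever appear. The Python sketch below is a minimal illustration of this idea, not a description of what GP/PARI does internally. All function names are assumptions, polynomials are coefficient lists (lowest degree first), $p$ is assumed prime, $\deg f \geq 1$ is assumed, and `pow(c, -1, p)` requires Python 3.8 or later.

```python
def trim(a):
    """Drop trailing zeros in place; the zero polynomial is []."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mod(a, f, p):
    """Remainder of a modulo f in (Z/pZ)[x]; f non-zero and trimmed."""
    a = trim([c % p for c in a])
    inv = pow(f[-1], -1, p)                 # inverse of the leading coefficient
    while len(a) >= len(f):
        c, shift = a[-1] * inv % p, len(a) - len(f)
        for i, fc in enumerate(f):
            a[i + shift] = (a[i + shift] - c * fc) % p
        trim(a)
    return a

def poly_mulmod(a, b, f, p):
    """Product a*b reduced modulo f."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return poly_mod(out, f, p)

def x_power_mod(e, f, p):
    """x^e mod f by repeated squaring: about log2(e) squarings."""
    result, base = [1], poly_mod([0, 1], f, p)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result

def poly_gcd(a, b, p):
    """Monic gcd in (Z/pZ)[x] by the Euclidean algorithm (a non-zero)."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def has_root_mod_p(f, p):
    """True iff f has a root in Z/pZ, via gcd(f, x^p - x) != 1."""
    f = trim([c % p for c in f])
    r = x_power_mod(p, f, p)                # x^p mod f
    r = r + [0] * (2 - len(r))
    r[1] = (r[1] - 1) % p                   # now r = (x^p - x) mod f
    return len(poly_gcd(f, trim(r), p)) > 1

big_p = 2**127 - 1                          # a Mersenne prime, congruent to 3 mod 4
```

For example, `has_root_mod_p([1, 0, 1], big_p)` tests whether $x^2 + 1$ has a root modulo $2^{127} - 1$ using roughly 127 squarings, whereas trying all residue classes would be hopeless.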

We have seen that many of the features of arithmetic in Z can be carried out in F[ x ]. We still don’t have
an analogue of passing from Z to Z/nZ in the context of F[ x ]. This is one motivation for studying rings in
much more detail; we’d like to be able to emulate the process of Z to Z/nZ for general rings, not just F[ x ].

17. Exercises
(1) In each case, divide $f(x)$ by $g(x)$ with residue:
(a) $f(x) = 3x^4 - 2x^3 + 6x^2 - x + 2$, $g(x) = x^2 + x + 1$ in $\mathbb{Q}[x]$.
(b) $f(x) = x^4 - 7x + 1$, $g(x) = 2x^2 + 1$ in $\mathbb{Q}[x]$.
(c) $f(x) = 2x^4 + x^2 - x + 1$, $g(x) = 2x - 1$ in $\mathbb{Z}/5\mathbb{Z}[x]$.
(d) $f(x) = 4x^4 + 2x^3 + 6x^2 + 4x + 5$, $g(x) = 3x^2 + 2$ in $\mathbb{Z}/7\mathbb{Z}[x]$.
(2) Use the Euclidean algorithm to find the gcd of the following pairs of polynomials and express it as a combination of the two polynomials.
(a) $x^4 - x^3 - x^2 + 1$ and $x^3 - 1$ in $\mathbb{Q}[x]$.
(b) $x^5 + x^4 + 2x^3 - x^2 - x - 2$ and $x^4 + 2x^3 + 5x^2 + 4x + 4$ in $\mathbb{Q}[x]$.
(c) $x^4 + 3x^3 + 2x + 4$ and $x^2 - 1$ in $\mathbb{Z}/5\mathbb{Z}[x]$.
(d) $4x^4 + 2x^3 + 3x^2 + 4x + 5$ and $3x^3 + 5x^2 + 6x$ in $\mathbb{Z}/7\mathbb{Z}[x]$.
(e) $x^3 - ix^2 + 4x - 4i$ and $x^2 + 1$ in $\mathbb{C}[x]$.
(f) $x^4 + x + 1$ and $x^2 + x + 1$ in $\mathbb{Z}/2\mathbb{Z}[x]$.
(3) Consider the equation $x^2 + x = 0$ over $\mathbb{Z}/n\mathbb{Z}$.
(a) Find an $n$ such that the equation has at least 4 solutions.
(b) Find an $n$ such that the equation has at least 8 solutions.
(4) Is the given polynomial irreducible:
(a) $x^2 - 3$ in $\mathbb{Q}[x]$? In $\mathbb{R}[x]$?
(b) $x^2 + x - 2$ in $\mathbb{F}_3[x]$? In $\mathbb{F}_7[x]$? (For any prime $p$ we denote $\mathbb{Z}/p\mathbb{Z}$ also by $\mathbb{F}_p$. This notation is used only for primes! Namely, one does not use a notation such as $\mathbb{F}_n$ if $n$ is not a prime.)
(5) Find the rational roots of the polynomial $2x^4 + 4x^3 - 5x^2 - 5x + 2$.
(6) For a polynomial $g(x) = a_n x^n + \cdots + a_1 x + a_0$ with complex coefficients, let $\bar{g}(x) = \bar{a}_n x^n + \cdots + \bar{a}_1 x + \bar{a}_0$ be the polynomial obtained by taking the complex conjugate of the coefficients. Check that $\overline{g_1(x) g_2(x)} = \bar{g}_1(x) \cdot \bar{g}_2(x)$. Let $f(x)$ be a polynomial with real coefficients and
\[ f(x) = a (x - \alpha_1)^{a_1} (x - \alpha_2)^{a_2} \cdots (x - \alpha_r)^{a_r}, \]
its unique factorization over $\mathbb{C}$. Apply complex conjugation to both sides. Deduce that if $\alpha$ is a root of $f$ with multiplicity $d$ then $\bar{\alpha}$ is a root of $f$ with the same multiplicity. Deduce that if $f$ has odd degree then $f$ has a real root.
(7) Let $p > 2$ be a prime. Calculate the gcd of $x^{p-1} - 1$ and $x^2 + 1$ in the ring $\mathbb{Z}/p\mathbb{Z}[x]$, using the Euclidean algorithm method, and conclude that $-1$ is a square in $\mathbb{Z}/p\mathbb{Z}$ if and only if $p \equiv 1 \pmod 4$.

(8) Let $p$ be a prime number. Use the factorization of $x^p - x$ to deduce Wilson’s theorem: $(p - 1)! \equiv -1 \pmod p$.

Part 5. Rings
18. Some basic definitions and examples
Recall our definition of a ring.
Definition 18.0.1. A ring R is a non-empty set together with two operations, called “addition" and “multi-
plication" that are denoted, respectively, by
\[ (x, y) \mapsto x + y, \qquad (x, y) \mapsto xy. \]
One requires the following axioms to hold:
(1) x + y = y + x, ∀ x, y ∈ R. (Commutativity of addition)
(2) ( x + y) + z = x + (y + z), ∀ x, y, z ∈ R. (Associativity of addition)
(3) There exists an element in R, denoted 0, such that 0 + x = x, ∀ x ∈ R. (Neutral element for
addition)
(4) ∀ x ∈ R, ∃y ∈ R such that x + y = 0. (Inverse with respect to addition)
(5) ( xy)z = x (yz), ∀ x, y, z ∈ R. (Associativity of multiplication)
(6) There exists an element 1 ∈ R such that 1x = x1 = x, ∀ x ∈ R. (Neutral element for multiplication)
(7) z( x + y) = zx + zy, ( x + y)z = xz + yz, ∀ x, y, z ∈ R. (Distributivity)

Recall also that a ring $R$ is called a division ring (or sometimes a skew-field) if $1 \neq 0$ in $R$ and any non-zero element of $R$ has an inverse with respect to multiplication. A commutative division ring is precisely what we call a field.

Example 18.0.2. Z is a commutative ring. It is not a division ring and so is not a field. The rational
numbers Q form a field. The real numbers R form a field. The complex numbers C form a field.
We have also noted some useful formal consequences of the axioms defining a ring:
(1) The element 0 appearing in axiom (3) is unique.
(2) Given x, the element y appearing in axiom (4) is unique. We shall denote y by − x.
(3) We have $-(-x) = x$ and $-(x + x') = -x - x'$, where, technically, $-x - x'$ means $(-x) + (-x')$.
(4) We have x · 0 = 0, 0 · x = 0.

Here are some further examples. We do not prove that the ring axioms hold; this is left as an exercise.
Example 18.0.3. Let $\mathbb{F}$ be a field and $n \geq 1$ an integer. Consider the set of $n \times n$ matrices:
\[ M_n(\mathbb{F}) = \left\{ \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix} : a_{ij} \in \mathbb{F} \right\}. \]
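For example:
(1) for n = 1 we just get $(a_{11})$, $a_{11} \in F$;
(2) for n = 2, $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$;
(3) for n = 3 we get $\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$.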
In general we shall write an n × n matrix as $(a_{ij})$, or $(a_{ij})_{i,j=1}^{n}$ if we need to be very clear about the dimensions
of the matrix. The index i is the row index and the index j is the column index. We then define
\[ (a_{ij}) + (b_{ij}) = (a_{ij} + b_{ij}), \qquad (a_{ij})(b_{ij}) = (c_{ij}), \]
where
\[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}. \]
We can say that the ij entry of the product AB is the dot product of the i-th row of A with the j-th column
of B:
\[
\begin{pmatrix} & \vdots & \\ \cdots & c_{ij} & \cdots \\ & \vdots & \end{pmatrix}
=
\begin{pmatrix} & \vdots & \\ a_{i1} & \dots & a_{in} \\ & \vdots & \end{pmatrix}
\begin{pmatrix} \cdots & b_{1j} & \cdots \\ & \vdots & \\ \cdots & b_{nj} & \cdots \end{pmatrix}
\]

For example:
(1) for n = 1 we get ( a) + (b) = ( a + b) and ( a)(b) = ( ab). Namely, we just get F again!
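(2) for n = 2, we have
\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix},
\]
and
\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix}.
\]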

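Under these definitions Mn(F) is a ring, called the ring of n × n matrices with entries in F, with identity
given by the identity matrix
\[
I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \dots & 0 & 1 \end{pmatrix},
\]
and zero given by the zero matrix (the matrix all whose entries are zero). For n ≥ 2 this is a non-commutative
ring. For example, for n = 2 we have
\[
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.
\]
These are never equal, else 2 = 1 in F, which implies 1 = 0 in F, which is never the case, by definition.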
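The product formula and the failure of commutativity can be checked mechanically. The following is a minimal sketch, assuming matrices are stored as Python lists of rows with integer entries; the helper name mat_mul is ours and not part of the notes.

```python
def mat_mul(A, B):
    """Multiply two n x n matrices given as lists of rows.

    The (i, j) entry is the dot product of row i of A with column j of B.
    """
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)  # [[2, 1], [1, 1]]
print(BA)  # [[1, 1], [1, 2]]
```

The two products differ in the top-left entry (2 versus 1), which is exactly the comparison used in the text.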
Example 18.0.4. Let e be a formal symbol and F a field. The ring of dual numbers, F[e], is defined as
F[e] = { a + be : a, b ∈ F},
with the following addition and multiplication:
( a + be) + (c + de) = a + c + (b + d)e, ( a + be)(c + de) = ac + ( ad + bc)e.
Note that e is a zero divisor: e ≠ 0 but e² = 0.
Example 18.0.5. Let R1 , R2 be rings. Then R1 × R2 is a ring with the following operations:
( a1 , b1 ) + ( a2 , b2 ) = ( a1 + a2 , b1 + b2 ), ( a1 , b1 )( a2 , b2 ) = ( a1 a2 , b1 b2 ).
The zero element is (0R1 , 0R2 ) and the identity element is (1R1 , 1R2 ). The ring R1 × R2 is called the direct
product of R1 and R2 .
Some of the examples of rings are rather complicated (for example, rings of matrices) or exotic looking (like
the ring of dual numbers) and one would be justified to ask why one would want to consider such rings. As
one progresses in mathematics, the need for the concept of ring becomes more and more clear. In linear
algebra we learn that linear transformations of R3 , such as rotations fixing the origin, can be described by
3 × 3 matrices with real entries and composition of linear transformations matches multiplication of matrices
in M3(R). Modern algebraic geometry associates to every commutative ring R a space Spec(R) (it is a set,
there is a notion of open subsets and of functions, and so on) whose points are the prime ideals of R (see
below for the concept of a prime ideal) and, in this setting, the ring of dual numbers is used to study the
tangent space at a point of Spec(R). If F is a field, we understand the importance of the ring of polynomials
F[x] and similarly F[x, y], but if we want to consider polynomials in x “modulo a fixed polynomial f(x)”, or
polynomials in x and y “modulo a collection of polynomials f1(x, y), . . . , fn(x, y)”, etc., we get new rings and
ring theory is the best way to have a rigorous language to discuss those. Thus, with no more apologies, we
proceed to develop the very basics of this huge area of mathematics.

Definition 18.0.6. Let S ⊂ R be a subset. S is called a subring of R if the following holds:

(1) 0R , 1R belong to S;
(2) s1 , s2 ∈ S ⇒ s1 ± s2 ∈ S;
(3) s1 , s2 ∈ S ⇒ s1 s2 ∈ S.
Note that in this case S is a ring in its own right.
Example 18.0.7. The easiest examples are Z, Q, R being subrings of C. We’ve already seen examples of
subrings of the ring of 2 × 2 matrices in Exercise 24, page 31.
Consider the subset {(r, 0) : r ∈ R} of the ring R × R. It is closed under addition and multiplication.
It is even a ring because (r, 0)(1, 0) = (r, 0) and so (1, 0) serves as an identity element for this subset.
Nonetheless, it is not a subring of R × R according to our definition, because the identity element of R × R,
which is (1, 1), does not belong to the set {(r, 0) : r ∈ R}. Some authors have other conventions (and thus
would consider R × {0} as a subring of R × R), but by-and-large our conventions are the more widespread.
One needs to be careful when considering other textbooks, though.
Example 18.0.8. Let n ≠ ±1 be a square free integer (i.e., if p | n, p prime, then p² ∤ n). Then √n is not
a rational number. Indeed, if √n is rational, √n = s/t, (s, t) = 1, then n = s²/t². Let p be a prime dividing n,
and so p² ∤ n. Since nt² = s², p | s². But then p | s. Looking at the power of p in the unique factorization of
both sides, it follows that p | t and thus p | (s, t), a contradiction.
Consider
Z[√n] = {a + b√n : a, b ∈ Z}.
This is a subset of C, containing 0 and 1 and is closed under addition and multiplication:
(a + b√n) + (c + d√n) = (a + c) + (b + d)√n,
and
(a + b√n)(c + d√n) = (ac + bdn) + (ad + bc)√n.
We remark that any element of this ring has a unique expression as a + b√n. Indeed, if a + b√n = c + d√n,
either b = d (and then obviously a = c) or √n = (a − c)/(d − b) is a rational number, which it is not.
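Because every element has a unique expression a + b√n, arithmetic in Z[√n] can be carried out on the pair (a, b) alone. Here is a small sketch of the two formulas above; the class name QuadInt is a hypothetical choice made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuadInt:
    """The element a + b*sqrt(n) of Z[sqrt(n)], stored as the pair (a, b)."""
    a: int
    b: int
    n: int

    def __add__(self, other):
        assert self.n == other.n
        return QuadInt(self.a + other.a, self.b + other.b, self.n)

    def __mul__(self, other):
        # (a + b sqrt(n))(c + d sqrt(n)) = (ac + bdn) + (ad + bc) sqrt(n)
        assert self.n == other.n
        return QuadInt(self.a * other.a + self.b * other.b * self.n,
                       self.a * other.b + self.b * other.a,
                       self.n)


x = QuadInt(1, 2, 5)    # 1 + 2*sqrt(5)
y = QuadInt(3, -1, 5)   # 3 - sqrt(5)
print(x + y)  # QuadInt(a=4, b=1, n=5)
print(x * y)  # QuadInt(a=-7, b=5, n=5)
```

No floating point square root is ever taken, so the computation is exact.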

19. Ideals
Definition 19.0.1. Let R be a ring. A (two-sided) ideal I of R is a subset of R such that
(1) 0 ∈ I;
(2) if a, b ∈ I then a + b ∈ I;
(3) if a ∈ I, r ∈ R, then ra ∈ I and ar ∈ I.
Remark 19.0.2. Note that if a ∈ I then −1 · a = − a ∈ I.
We shall use the notation I ◁ R to indicate that I is an ideal of R.
Example 19.0.3. I = {0} and I = R are always ideals. They are called the trivial ideals.
Example 19.0.4. Suppose that R is a division ring (e.g., a field) and I ◁ R is a non-zero ideal. Then I = R.
Indeed, there is an element a ∈ I such that a ≠ 0. Then 1 = a⁻¹a ∈ I and so for every r ∈ R we
have r = r · 1 ∈ I. That is, I = R. We conclude that a division ring has only the trivial ideals. (Note also
that the argument shows for any ring R that if an ideal I contains an invertible element of R then I = R.)
Example 19.0.5. Let R be a commutative ring. Let r ∈ R. The principal ideal (r ) is defined as
(r ) = {ra : a ∈ R} = { ar : a ∈ R}.
We also denote this ideal by rR or Rr. This is indeed an ideal: First 0 = r · 0 is in (r ). Second, given two
elements ra1 , ra2 in (r ) we have ra1 + ra2 = r ( a1 + a2 ) ∈ (r ) and for every s ∈ R we have s(ra1 ) = (sr ) a1 =
(rs) a1 = r (sa1 ) ∈ rR (using commutativity!), (ra1 )s = r ( a1 s) ∈ rR.
Definition 19.0.6. Let R be a commutative ring. If every ideal of R is principal, one calls R a principal ideal
ring.
Theorem 19.0.7. Z is a principal ideal ring. In fact, the list
(0), (1), (2), (3), (4), . . .
is a complete list of the ideals of Z. (Note that another notation is 0, 1Z, 2Z, 3Z, 4Z, . . . .)
Proof. We already know that these are ideals, in fact principal ideals, and we note that for i > 0 the minimal
positive number in the ideal (i ) is i. Thus, these ideals are distinct.
Let I be an ideal of Z. If I = {0} then I appears in the list above. Else, there is some non-zero
element a ∈ I. If a < 0 then − a = −1 · a ∈ I and so I has a positive element in it. Choose the smallest
positive element in I and call it i.
First, since i ∈ I so is ia for any a ∈ Z and so (i ) ⊂ I. Let b ∈ I. Divide b by i with residue: b = qi + r,
where 0 ≤ r < i. Note that r = b − qi is an element of I, smaller than i. The only possibility is that r = 0
and so b = qi ∈ (i ). Thus, I = (i ).
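As a sanity check of the proof, one can take a concrete ideal of Z and look for its smallest positive element. The sketch below assumes the ideal is the set of all integer combinations 12x + 42y (a set that contains 0 and is closed under addition and under multiplication by integers), and it searches only a bounded window of coefficients, which is an assumption of the illustration and not part of the argument.

```python
from math import gcd


def smallest_positive_element(a, b, bound=50):
    """Smallest positive value of a*x + b*y for |x|, |y| <= bound (brute force)."""
    values = {a * x + b * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)}
    return min(v for v in values if v > 0)


a, b = 12, 42
i = smallest_positive_element(a, b)
print(i, gcd(a, b))  # 6 6
```

Every combination 12x + 42y is a multiple of 6, in line with the conclusion I = (i) of the theorem.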
Theorem 19.0.8. Let F be a field. The ring F[ x ] is a principal ideal ring. Two ideals ( f ( x )), ( g( x )) are equal
if and only if f ∼ g.
Proof. The proof is very similar to the case of Z. Let I be an ideal. If I = {0} then I = (0), the principal
ideal generated by 0. Else, let f(x) ∈ I be a non-zero polynomial whose degree is minimal among all non-zero
elements of I. On the one hand I ⊇ (f(x)). On the other hand, let g(x) ∈ I and write g(x) =
q(x)f(x) + r(x), where r(x) is either zero or of degree smaller than that of f. But r(x) = g(x) − q(x)f(x) ∈ I.
Thus, by the minimality of the degree of f, we must have r(x) = 0 and so g(x) = q(x)f(x) ∈ (f(x)). That is, I ⊆ (f(x)).
At this point we need a definition and a lemma.
Let R be any ring. The units of R are denoted R× and defined as follows:
R× = { x ∈ R : ∃y ∈ R, xy = yx = 1}.
For example, 1R is always a unit. If R is a field then, by definition, R× = R − {0}. Given x ∈ R× the y such
that xy = yx = 1 is unique and we denote it x⁻¹. Indeed, if also xy1 = 1 then y(xy1) = y and on the other
hand y( xy1 ) = (yx )y1 = 1 · y1 = y1 . Thus, y = y1 .
Lemma 19.0.9. Let R be a commutative integral domain and a, b ∈ R. We say that a ∼ b (a and b are
associates) if for some unit u of R, au = b. This is an equivalence relation. If a | b and b | a then a ∼ b.
Proof. (Lemma) As a = 1 · a and au = b implies a = u⁻¹b if u is a unit (else u⁻¹ doesn’t even make sense),
this is a reflexive and symmetric relation. If au = b and bv = c, where u, v ∈ R× then a(uv) = c and uv is a
unit (the inverse is v⁻¹u⁻¹). This shows transitivity.
Now suppose a | b and b | a. If a = 0 then b = 0 and conversely, and there’s nothing to prove. Else, write
au = b for some u ∈ R and bv = a for some v ∈ R. We need to show that u, v are units. But, we have
a(1 − uv) = a − (au)v = a − bv = a − a = 0. Since R is an integral domain and a ≠ 0 it must be that
1 − uv = 0. Thus uv = 1. Starting from b(1 − vu) = 0 we get in the same way that vu = 1. It follows that
u, v are units of R.
Note that for R = F[ x ], the notion of being associate is precisely the one we have previously defined.
Indeed, the units of R are just the non-zero scalars and two polynomials are associate precisely if they differ
by multiplication by a non-zero scalar. As R is also an integral domain, we may apply the Lemma. So, suppose
that ( f ( x )) ⊃ ( g( x )) then g( x ) = f ( x )h( x ) for some polynomial h( x ) ∈ F[ x ]. That is f ( x )| g( x ). Thus, if
( f ( x )) = ( g( x )) then f | g and g| f and so f ∼ g.
If f | g, say g( x ) = f ( x )h( x ) then any multiple of g( x ), say g( x )t( x ) is equal to f ( x )[h( x )t( x )] and
so ( g( x )) ⊂ ( f ( x )). If f ∼ g then f | g and g| f and so, by the argument above, ( f ( x )) = ( g( x )).
Example 19.0.10. Let F be a field. One can show that all the ideals of F[e] are {0} = (0), F[e] = (1) and
(e) = {be : b ∈ F} and so the ring of dual numbers is also a principal ideal ring.
Example 19.0.11. The ring of polynomials C[ x, y] in two variables with complex coefficients is not a principal
ideal ring. We claim that the set of polynomials I = { f ( x, y) : f (0, 0) = 0}, namely, polynomials with zero
constant term, is an ideal that is not principal. We leave that as an exercise.
Example 19.0.12. Let R1 , R2 be rings with ideals I1 , I2 , respectively. Then I1 × I2 is an ideal of R1 × R2 .
Example 19.0.13. Let us consider the ring F[ x ] and in it the set
S = {f(x) : f(x) = a0 + a2 x² + a3 x³ + . . . },
of polynomials with no x term. Note that 0 ∈ S and s1, s2 ∈ S ⇒ s1 + s2 ∈ S and even s1 s2 ∈ S. However, S
is not an ideal. We have 1 ∈ S but x = x · 1 ∉ S.
Example 19.0.14. Consider the ring Z[√5]. In this ring we consider
I = {5a + b√5 : a, b ∈ Z}.
We claim that I is an ideal. This can be verified directly, but it is easier to note that I is in fact the principal
ideal (√5).
Example 19.0.15. Let R be a ring and I1 , I2 two ideals of R. Then
I1 + I2 = {i1 + i2 : i1 ∈ I1 , i2 ∈ I2 }
is an ideal of R. Inductively, the sum of n ideals I1 + I2 + · · · + In is an ideal. A particular case is the
following: Let R be a commutative ring and Ii = ri R a principal ideal. Then
\[ r_1 R + r_2 R + \dots + r_n R = \Big\{ \sum_{i=1}^{n} r_i a_i : a_i \in R \Big\} \]
is an ideal of R; we often denote it by (r1, r2, . . . , rn) or ⟨r1, r2, . . . , rn⟩ (so in particular a principal ideal (r)
may also be denoted ⟨r⟩).
Let us consider the situation of the ring R = Z[√−5] and the ideal ⟨2, 1 + √−5⟩. We know abstractly
that this is an ideal. We claim that this ideal is not principal. In particular, this shows that this ideal is
not Z[√−5] and, more importantly, gives us an example of a ring with non-principal ideals.
Suppose that ⟨2, 1 + √−5⟩ = ⟨a + b√−5⟩. It follows that 2 = (a + b√−5)(c + d√−5) and so that 2 =
(a − b√−5)(c − d√−5) (check!). Therefore, by multiplying these two equations, 4 = (a² + 5b²)(c² + 5d²).
This is an equation in integers and so (because 0 < a² + 5b² ≤ 4) a ∈ {±1, ±2}, b = 0 and we conclude
that ⟨2, 1 + √−5⟩ = ⟨a⟩ is equal to ⟨1⟩ or ⟨2⟩. Now, if ⟨2, 1 + √−5⟩ = ⟨2⟩ this implies that 1 + √−5 =
2(c + d√−5), which is a contradiction. If ⟨2, 1 + √−5⟩ = ⟨1⟩ then 1 = 2(c1 + d1√−5) + (1 + √−5)(c2 +
d2√−5) = (2c1 + c2 − 5d2) + √−5(2d1 + c2 + d2). Therefore, 2d1 + c2 + d2 = 0, that is, −2d1 − c2 = d2
and we get 1 = 2c1 + c2 − 5d2 = 2c1 + c2 + 10d1 + 5c2 = 2(c1 + 3c2 + 5d1). This is an equation in integers
and it implies that 1 is even. Contradiction.
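The two steps of this argument can be spot-checked numerically. The sketch below assumes elements a + b√−5 are stored as integer pairs (a, b); the helper names norm and combo are ours. The first part lists the candidate generators, and the second part tests, on a bounded window of coefficients only, that 1 is never a combination of 2 and 1 + √−5.

```python
def norm(a, b):
    """(a + b*sqrt(-5)) * (a - b*sqrt(-5)) = a^2 + 5*b^2."""
    return a * a + 5 * b * b


# Candidate generators: the norm must be a positive divisor of 4.
# The window -2..2 suffices because a^2 <= 4 and 5*b^2 <= 4.
candidates = [(a, b) for a in range(-2, 3) for b in range(-2, 3)
              if norm(a, b) > 0 and 4 % norm(a, b) == 0]
print(candidates)  # [(-2, 0), (-1, 0), (1, 0), (2, 0)]


def combo(c1, d1, c2, d2):
    """2*(c1 + d1*s) + (1 + s)*(c2 + d2*s) with s = sqrt(-5), as a pair (a, b)."""
    return (2 * c1 + c2 - 5 * d2, 2 * d1 + c2 + d2)


window = range(-3, 4)
hits_one = any(combo(c1, d1, c2, d2) == (1, 0)
               for c1 in window for d1 in window
               for c2 in window for d2 in window)
print(hits_one)  # False
```

The finite search is only an illustration; the parity argument in the text is what rules out all integer coefficients at once.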

20. Homomorphisms
When we discussed sets, we also considered functions from one set to another. It was the functions, really,
that made the subject much deeper and more useful. Indeed, without functions, we wouldn’t even be able to
compare cardinalities of sets. Moreover, functions on sets are as natural as things get in mathematics. I can
consider the set of guests of a wedding. A well-known headache-inducing problem is to try to find a function
from the set of guests to the set of tables that associates to each guest a table so as to maximize the number
of friends around each table.
It is a general principle in mathematics that when sets are endowed with more structure, the maps should
take these structures into account; we say, “respect" the structures. We are about to make the definition
for rings and later on we will see it for groups. In Algebra II (or other linear algebra courses) you will see it
for vector spaces. In each case, the correct way to define functions should be evident; one should be very
careful not to ignore the special features of the special sets (rings, groups, vector spaces, ...) that we are
considering.

Definition 20.0.1. Let R, S be rings. A function f : R → S is a ring homomorphism if the following holds:
(1) f (1R ) = 1S ;
(2) f (r1 + r2 ) = f (r1 ) + f (r2 );
(3) f (r1 r2 ) = f (r1 ) f (r2 ).

Here are some formal consequences (that are nonetheless very useful).
• f (0R ) = 0S . Indeed, f (0R ) = f (0R + 0R ) = f (0R ) + f (0R ). Let y = f (0R ) then y = y + y.
Adding −y to both sides we find 0S = y = f (0R ).
• We have f (−r ) = − f (r ). Indeed: 0S = f (0R ) = f (r + (−r )) = f (r ) + f (−r ) and so f (−r ) =
− f (r ) (just because it sums with f (r ) to 0S !)
• We have f (r1 − r2 ) = f (r1 ) − f (r2 ), because f (r1 − r2 ) = f (r1 + (−r2 )) (this, by definition) and
so f (r1 − r2 ) = f (r1 ) + f (−r2 ) = f (r1 ) − f (r2 ).
Note, in particular, that f (0R ) = 0S is a consequence of axioms (2), (3). On the other hand f (1R ) = 1S
does not follow from (2), (3) and we therefore include it as an axiom (though not all authors do that). Here
is an example. Consider,
f : R → R × R, f (r ) = (r, 0).
This map satisfies f (r1 + r2 ) = f (r1 ) + f (r2 ) and f (r1 r2 ) = f (r1 ) f (r2 ), but f (1) = (1, 0) is not the identity
element of R × R. So this is not a ring homomorphism.
On the other hand, if S ⊂ R is a subring then the inclusion map i : S → R, i (s) = s, is a ring homomorphism.
Note that this explains why in the definition of a subring we insisted on 1R ∈ S.
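The example f(r) = (r, 0) can be tested directly. This is a minimal sketch assuming R = Z (an assumption made only for the illustration) and checking the three conditions on a finite sample; the helper names add, mul and ONE are ours.

```python
def add(p, q):
    """Componentwise addition in R x R."""
    return (p[0] + q[0], p[1] + q[1])


def mul(p, q):
    """Componentwise multiplication in R x R."""
    return (p[0] * q[0], p[1] * q[1])


ONE = (1, 1)  # identity element of R x R


def f(r):
    return (r, 0)


sample = range(-5, 6)
additive = all(f(r + s) == add(f(r), f(s)) for r in sample for s in sample)
multiplicative = all(f(r * s) == mul(f(r), f(s)) for r in sample for s in sample)
print(additive, multiplicative, f(1) == ONE)  # True True False
```

Only the condition on the identity element fails, which is why it has to be stated as a separate axiom.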
Proposition 20.0.2. Let f : R → S be a homomorphism of rings. The image of f is a subring of S.
Proof. As we have seen, f (0R ) = 0S . Also, by definition f (1R ) = 1S and so 0S , 1S ∈ Im( f ). Let now s1 , s2 ∈
Im( f ), say si = f (ri ). Then, s1 ± s2 = f (r1 ) ± f (r2 ) = f (r1 ± r2 ) and so s1 ± s2 ∈ Im( f ). Similarly, s1 s2 =
f (r1 r2 ) and so s1 s2 ∈ Im( f ).
Definition 20.0.3. Let f : R → S be a homomorphism of rings. The kernel of f , Ker( f ), is defined as
follows:
Ker( f ) = {r ∈ R : f (r ) = 0}.
Proposition 20.0.4. Ker( f ) is an ideal of R. The map f is injective if and only if Ker( f ) = {0}.
Proof. First, since f (0R ) = 0S we have 0R ∈ Ker( f ). Suppose that r1 , r2 ∈ Ker( f ) then f (ri ) = 0S and we
find that f (r1 + r2 ) = f (r1 ) + f (r2 ) = 0S + 0S = 0S , so r1 + r2 ∈ Ker( f ).
Now suppose that r1 ∈ Ker( f ) and r ∈ R is any element. We need to show that rr1 , r1 r ∈ Ker( f ). We
calculate f (rr1 ) = f (r ) f (r1 ) = f (r )0S = 0S , so rr1 ∈ Ker( f ). Similarly for r1 r.
So far we proved that Ker( f ) is an ideal. Suppose now that f is injective. Then f (r ) = 0S implies f (r ) =
f (0R ) and so r = 0R . That is, Ker( f ) = {0R }.
Suppose conversely that Ker( f ) = {0R }. If f (r1 ) = f (r2 ) then 0S = f (r1 ) − f (r2 ) = f (r1 − r2 ) and
so r1 − r2 ∈ Ker( f ). Since Ker( f ) = {0R }, we must have r1 − r2 = 0R ; that is, r1 = r2 . We proved that f
is injective.
We now look at some examples:
Example 20.0.5. Let n ≥ 1 be an integer. Define a function,
f : Z → Z/nZ,
by f ( a) = ā (the congruence class of a modulo n). Then f is a homomorphism:
(1) f(1) = 1̄ and 1̄ is indeed the identity element of Z/nZ;
(2) $f(a + b) = \overline{a + b} = \bar{a} + \bar{b} = f(a) + f(b)$;
(3) $f(ab) = \overline{ab} = \bar{a}\,\bar{b} = f(a) f(b)$.

We observe that f is surjective and its kernel is {a : a ≡ 0 (mod n)} = {a : n | a} = nZ.
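A quick numerical check of this example, as a sketch: we assume n = 6 and represent the class of a by its remainder in {0, . . . , n − 1}; the sample range is arbitrary.

```python
n = 6


def f(a):
    """Class of a modulo n, represented by the remainder in 0..n-1."""
    return a % n


sample = range(-20, 21)
assert f(1) == 1
assert all(f(a + b) == (f(a) + f(b)) % n for a in sample for b in sample)
assert all(f(a * b) == (f(a) * f(b)) % n for a in sample for b in sample)

# Elements of the sample lying in the kernel: exactly the multiples of n.
kernel_sample = [a for a in sample if f(a) == 0]
print(kernel_sample)  # [-18, -12, -6, 0, 6, 12, 18]
```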
Example 20.0.6. Let R, S, be any rings and define
f : R × S → R, f ((r, s)) = r.
This is a homomorphism:
(1) f (1R×S ) = f ((1R , 1S )) = 1R ;
(2) f ((r1 , s1 ) + (r2 , s2 )) = f ((r1 + r2 , s1 + s2 )) = r1 + r2 = f ((r1 , s1 )) + f ((r2 , s2 ));
(3) f ((r1 , s1 )(r2 , s2 )) = f ((r1 r2 , s1 s2 )) = r1 r2 = f ((r1 , s1 )) f ((r2 , s2 )).
The kernel of f is {(r, s) : r = 0} = {(0, s) : s ∈ S} = {0} × S.
Example 20.0.7. Let F be a field and F[e] the ring of dual numbers. Define
f : F[e] → F, f ( a + be) = a.
Then f is a homomorphism:
(1) f (1) = 1;
(2) f (( a + be) + (c + de)) = f ( a + c + (b + d)e) = a + c = f ( a + be) + f (c + de);
(3) f (( a + be)(c + de)) = f ( ac + ( ad + bc)e) = ac = f ( a + be) f (c + de).
The kernel of f is { a + be : a = 0, b ∈ F} = {be : b ∈ F}. We claim that this is the ideal (e). On the one
hand be certainly is in (e) for any b. That is Ker( f ) ⊆ (e). On the other hand (c + de)e = ce and that
shows (e) ⊆ Ker( f ).
Example 20.0.8. Let F be a field. Let a ∈ F be a fixed element. Define
α : F[ x ] → F, α( g( x )) = g( a).
Then α is a homomorphism:
(1) α(1) is the value of the constant polynomial 1 at a which is just 1, so α(1) = 1.
(2) We have α( f + g) = ( f + g)( a) = f ( a) + g( a) = α( f ) + α( g);
(3) Similarly, α( f g) = ( f g)( a) = f ( a) g( a) = α( f )α( g).
Therefore, α is a homomorphism. It is called a specialization homomorphism or an evaluation homomor-
phism. The kernel of α is { f ∈ F[ x ] : f ( a) = 0} and it is equal to the principal ideal ( x − a). Indeed:
if g( x ) ∈ ( x − a) then g( x ) = ( x − a) g1 ( x ) and so g( a) = ( a − a) g1 ( a) = 0. Conversely, if g( a) = 0,
Theorem 16.5.1 says that g( x ) = ( x − a) g1 ( x ) for some polynomial g1 ( x ) and so g( x ) ∈ ( x − a).
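The evaluation homomorphism is easy to experiment with. In the sketch below polynomials are stored as non-empty coefficient lists from the constant term upward, with integer coefficients; that representation, and the helper names poly_add, poly_mul and evaluate, are assumptions of the illustration.

```python
def poly_add(f, g):
    """Sum of two polynomials given as coefficient lists (constant term first)."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [x + y for x, y in zip(f, g)]


def poly_mul(f, g):
    """Product of two polynomials given as non-empty coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] += x * y
    return out


def evaluate(f, a):
    """Value of f at a, computed by Horner's rule."""
    result = 0
    for c in reversed(f):
        result = result * a + c
    return result


a = 2
f = [1, 0, 3]   # 1 + 3x^2
g = [-2, 1]     # x - 2, an element of the ideal (x - a)
assert evaluate(poly_add(f, g), a) == evaluate(f, a) + evaluate(g, a)
assert evaluate(poly_mul(f, g), a) == evaluate(f, a) * evaluate(g, a)
print(evaluate(f, a), evaluate(g, a), evaluate(poly_mul(f, g), a))  # 13 0 0
```

The product f · g is a multiple of x − a, and its value at a is 0, as the description of the kernel predicts.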

Example 20.0.9. Let A be the set of all continuous functions f : [0, 1] → R. Define the sum (resp. product)
of two functions f , g to be the function f + g (resp. f g) whose value at any x is f ( x ) + g( x ) (resp. f ( x ) g( x )).
That is:
( f + g)( x ) = f ( x ) + g( x ), ( f g)( x ) = f ( x ) g( x ).
This is a ring (in particular, these are operations – the sum and product of continuous functions is continuous!).
Its zero element is the constant function zero and its identity element is the constant function 1. Let a ∈ [0, 1]
be a fixed element. Define
ϕ : A → R, ϕ ( f ) = f ( a ).
Then ϕ is a ring homomorphism whose kernel consists of all the functions vanishing at the point a.

20.1. Units. Let R be any ring. Recall the definition of units: The units of R are denoted R× and defined
as follows:
R× = { x ∈ R : ∃y ∈ R, xy = yx = 1}.
For example, 1R is always a unit. If R is a field then, by definition, R× = R − {0}.
Lemma 20.1.1. We have the following properties:
(1) If r1 , r2 ∈ R× then r1 r2 ∈ R× .
(2) Let f : R → S be a homomorphism of rings then f ( R× ) ⊆ S× .

Proof. Suppose that r1 , r2 ∈ R× and r1 y1 = y1 r1 = 1, r2 y2 = y2 r2 = 1. Let y = y2 y1 then (r1 r2 )y =

r1 (r2 y2 )y1 = r1 · 1 · y1 = r1 y1 = 1. A similar computation gives y(r1 r2 ) = 1 and so r1 r2 ∈ R× .
Let now f : R → S be a homomorphism and r ∈ R× with ry = yr = 1R . Then f (r ) f (y) = f (ry) =
f (1R ) = 1S and f (y) f (r ) = f (yr ) = f (1R ) = 1S . It follows that f (r ) ∈ S× .

Example 20.1.2. We have Z× = {±1}. We have Q× = Q − {0}.

Example 20.1.3. We have F[e]× = {a + be : a ≠ 0}. Indeed, if a ≠ 0 then (a + be)(a⁻¹ − a⁻²be) = 1
(where a⁻² is by definition (a²)⁻¹; it satisfies a⁻²a = a⁻¹). Conversely, if (a + be)(c + de) = 1 then ac = 1
and so a ≠ 0.
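The inverse formula can be verified with exact rational arithmetic. This is a sketch assuming F = Q, with the class name Dual chosen for the illustration.

```python
from fractions import Fraction


class Dual:
    """The dual number a + b*e over Q, where e*e = 0."""

    def __init__(self, a, b):
        self.a, self.b = Fraction(a), Fraction(b)

    def __mul__(self, other):
        # (a + be)(c + de) = ac + (ad + bc)e
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def inverse(self):
        # a^(-1) - a^(-2) * b * e, defined only when a != 0
        if self.a == 0:
            raise ZeroDivisionError("a + b*e is a unit only when a != 0")
        return Dual(1 / self.a, -self.b / self.a ** 2)

    def __repr__(self):
        return f"{self.a} + {self.b}e"


e = Dual(0, 1)
x = Dual(2, 3)
print(e * e)            # 0 + 0e
print(x.inverse())      # 1/2 + -3/4e
print(x * x.inverse())  # 1 + 0e
```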
Example 20.1.4. Let n ≠ 0, 1 be a square free integer. Recall that Z[√n] = {a + b√n : a, b ∈ Z} and
every element of this ring has a unique expression as a + b√n. We claim that
Z[√n]× = {a + b√n : a² − b²n = ±1}.
Indeed, if a² − b²n = ±1 then (a + b√n)(a − b√n) = ±1 and so a + b√n is invertible with inverse ±(a −
b√n). Conversely, if a + b√n is invertible, say (a + b√n)(c + d√n) = 1 (for some c, d ∈ Z) then ad + bc = 0
and so also (a − b√n)(c − d√n) = 1. We get that
(a + b√n)(a − b√n)(c + d√n)(c − d√n) = 1.
But (a + b√n)(a − b√n) = a² − b²n and (c + d√n)(c − d√n) = c² − d²n are integers. So
(a + b√n)(a − b√n) = a² − b²n = ±1.

If n is negative, then it is easy to see that the only solutions are a = ±1, b = 0, except when n = −1,
where a = 0, b = ±1 are also solutions. Namely, in general, only ±1 are units for n < 0, but for Z[i] the units are
±1, ±i. On the other hand, for n > 1 it turns out that there are infinitely many units; there are infinitely
many solutions (a, b) to the so-called Pell equation

a² − b²n = 1.

This is not an easy statement. It is interesting to try and prove this, but don’t be discouraged if you can’t.
The Pell equation has been studied extensively and is related to Archimedes’ cattle problem, which asks for “the
number of the cattle of the sun which once grazed upon the plains of Sicily”. It is stated in the form of a
poem, ultimately reducing to a Pell equation, the solutions of which are just enormous. Finding the smallest
solution to a Pell equation is a complicated problem, related to the continued fraction expansion of √n. It
behaves rather erratically. For example, the smallest solution in positive integers to a² − 60 · b² = 1 is a = 31, b = 4. In contrast,
the smallest solution in positive integers to a² − 61 · b² = 1 is a = 1766319049, b = 226153980.
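The two numerical claims can be checked with exact integer arithmetic. The sketch below finds the solution for n = 60 by a naive search over b (the function name and the search bound are our own choices), and only verifies the quoted solution for n = 61, since a naive search up to b = 226153980 would be far too slow.

```python
from math import isqrt


def smallest_pell_solution(n, max_b=10**6):
    """Smallest b >= 1, together with a > 0, such that a^2 - n*b^2 = 1.

    Naive search: for each b test whether n*b^2 + 1 is a perfect square.
    Returns None if nothing is found up to max_b.
    """
    for b in range(1, max_b + 1):
        a2 = n * b * b + 1
        a = isqrt(a2)
        if a * a == a2:
            return a, b
    return None


print(smallest_pell_solution(60))  # (31, 4)

# Verification (not a search) of the solution quoted for n = 61.
assert 1766319049**2 - 61 * 226153980**2 == 1
```

Python integers have arbitrary precision, so the check for n = 61 is exact despite the size of the numbers.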

Example 20.1.5. Let F be a field. The units of the ring M2(F) are the matrices
\[
GL_2(F) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc \neq 0 \right\}.
\]
Indeed, suppose that for the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we have ad − bc ≠ 0. Consider the matrix
\[
(ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
\]
(where by $t \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we mean $\begin{pmatrix} ta & tb \\ tc & td \end{pmatrix}$. It is equal to $\begin{pmatrix} a & b \\ c & d \end{pmatrix} t$). We claim that this is the inverse. We
have
\[
(ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}
= (ad - bc)^{-1} \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
Similarly, one checks that $\begin{pmatrix} a & b \\ c & d \end{pmatrix} (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
Suppose now that $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible. The expression ad − bc is called the determinant of the matrix
$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and is denoted det(M). One can verify by a laborious but straightforward calculation that
for any two matrices M, N we have
det(MN) = det(M) det(N).
If the matrix M has an inverse, say MN = NM = I2, then
det(MN) = det(M) det(N) = det(I2) = 1,
and that shows that det(M) ≠ 0. One can then show that N is necessarily $(ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. In
fact, a more general fact is true.
Let R be a ring and x ∈ R×, yx = xy = 1. Suppose also that zx = xz = 1. Then (y − z)x = 1 − 1 = 0
and so (y − z)(xy) = 0 · y = 0. Therefore, since xy = 1, we have y − z = 0, that is, y = z.
Applying this to MN = I2 and $M (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = I_2$, we conclude that
\[
N = (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.
\]
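The inverse formula and the multiplicativity of the determinant can be checked on an example. This is a sketch assuming F = Q, with entries stored as Fraction objects; the function names det2, mul2 and inverse2 are ours.

```python
from fractions import Fraction


def det2(M):
    """Determinant ad - bc of a 2 x 2 matrix given as a list of two rows."""
    (a, b), (c, d) = M
    return a * d - b * c


def mul2(M, N):
    """Product of two 2 x 2 matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse2(M):
    """Inverse via (ad - bc)^(-1) * [[d, -b], [-c, a]]; needs ad - bc != 0."""
    (a, b), (c, d) = M
    t = det2(M)
    if t == 0:
        raise ValueError("determinant is zero, so the matrix is not a unit")
    t_inv = Fraction(1) / t
    return [[t_inv * d, -t_inv * b], [-t_inv * c, t_inv * a]]


M = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
P = [[Fraction(0), Fraction(1)], [Fraction(5), Fraction(7)]]
I2 = [[1, 0], [0, 1]]

N = inverse2(M)
print(N)  # [[Fraction(-2, 1), Fraction(1, 1)], [Fraction(3, 2), Fraction(-1, 2)]]
assert mul2(M, N) == I2 and mul2(N, M) == I2
assert det2(mul2(M, P)) == det2(M) * det2(P)
```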

*********************

21. Quotient rings

In this section we construct quotient rings. These are rings constructed out of a given ring by a process
of “modding out by an ideal”. Our main motivation is constructing finite fields whose cardinality is a power
of a prime number, generalizing the fields Z/pZ. Finite fields, historically one of the most esoteric aspects
of algebra, should be considered today also as part of applied mathematics. Their usefulness to computer
science and engineering, mainly through the subjects of cryptography, coding theory and complexity theory,
is enormous. Knowing how to construct and compute in finite fields is one of the main goals of this course.

Consider a surjective ring homomorphism f : R → S. Given an element s ∈ S let r be an element of R
such that f(r) = s. How unique is r? If a ∈ I := Ker(f) then f(r + a) = f(r) + f(a) = f(r) + 0 = f(r).
Conversely, if f(r1) = s then f(r1 − r) = f(r1) − f(r) = s − s = 0 so a := r1 − r ∈ I and r1 = r + a. Let
us use the notation
r + I = {r + i : i ∈ I }.
We proved that if r is any element of R such that f (r ) = s then
f⁻¹(s) = r + I.
Thus, in a sense, we may identify elements of S with cosets of R and from this point of view we may say that
the cosets (thought of as being the elements of S) form a ring.
In this section we perform a key construction that eliminates the need for S. Given a ring R and a two-sided
ideal I ◁ R we construct a new ring R/I, whose elements are cosets of I.
Definition 21.0.1. Let R be a ring and I ◁ R a two-sided ideal. A coset of I is a subset of R of the form
a + I := {a + i : i ∈ I},
where a is an element of R.
Example 21.0.2. Suppose that R = Z and I = (n) for some positive integer n. Then a + (n) = {. . . , a −
n, a, a + n, a + 2n, . . . } are precisely the integers congruent to a modulo n. Thus, a coset of the ideal (n) is
the same thing as a congruence class modulo n.
Lemma 21.0.3. We have the following facts:
(1) Every element of R belongs to a coset of I.
(2) Two cosets are either equal or disjoint.
(3) The following are equivalent: (i) a + I = b + I; (ii) a ∈ b + I; (iii) a − b ∈ I.
Proof. The first claim is easy: the element r belongs to the coset r + I, because r = r + 0 and 0 ∈ I.
Suppose that (a + I) ∩ (b + I) ≠ ∅. Then, there is an element of R that can be written as
a + i1 = b + i2 ,
for some i1 , i2 ∈ I. We show that a + I ⊂ b + I; by symmetry we have the opposite inclusion and so the
cosets are equal. An element of a + I has the form a + i for some i ∈ I. We have a + i = b + (i2 − i1 ) + i =
b + (i2 − i1 + i ). Note that i2 − i1 + i ∈ I and so a + i ∈ b + I.
We next prove the equivalence of (i), (ii) and (iii). Clearly (i) implies (ii) because a ∈ a + I. If (ii) holds
then a = b + i for some i ∈ I and so a − b = i ∈ I and (iii) holds. If (iii) holds then a − b = i for some i ∈ I,
and so a ∈ a + I and also a = b + i ∈ b + I. That is, (a + I) ∩ (b + I) ≠ ∅ and so a + I = b + I.
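The three statements of the lemma can be observed on a small finite example. The sketch below assumes R = Z/12Z (represented by the remainders 0, . . . , 11) and I = (4) = {0, 4, 8}; both choices, and the helper name coset, are made only for this illustration.

```python
n = 12
R = range(n)                      # representatives of Z/12Z
I = {(4 * a) % n for a in R}      # the principal ideal (4) = {0, 4, 8}


def coset(a):
    """The coset a + I, as a frozenset of representatives."""
    return frozenset((a + i) % n for i in I)


cosets = {coset(a) for a in R}
print(sorted(sorted(c) for c in cosets))
# [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]

# (1) every element lies in a coset; (2) two cosets are equal or disjoint;
# (3) a + I = b + I exactly when a - b lies in I.
assert all(a in coset(a) for a in R)
assert all(c == d or not (c & d) for c in cosets for d in cosets)
assert all((coset(a) == coset(b)) == ((a - b) % n in I) for a in R for b in R)
```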
Theorem 21.0.4. Let R be a ring and I ◁ R a two-sided ideal. Denote the collection of cosets of I in R
by R/I. Define addition by
( a + I ) + (b + I ) = ( a + b) + I,
and multiplication by
( a + I )(b + I ) = ab + I.
These operations are well defined and make R/I into a ring (a quotient ring) with zero element 0 + I = I
and identity element 1 + I.
Proof. First, our definition of the operations makes use of writing a coset as a + I. This way of writing is
not unique and so we should check that our definitions are independent of the choice of the element a such
that the coset is equal to a + I. Namely, if
a + I = a′ + I, b + I = b′ + I,
we need to check that
a + b + I = a′ + b′ + I, ab + I = a′b′ + I.
Now, (a + b) − (a′ + b′) = (a − a′) + (b − b′). By Lemma 21.0.3, a − a′ ∈ I, b − b′ ∈ I and so (a + b) −
(a′ + b′) ∈ I. Therefore, by the same lemma, a + b + I = a′ + b′ + I. Also ab − a′b′ = (a − a′)b + a′(b − b′).
Now, a − a′ ∈ I, b − b′ ∈ I and so (a − a′)b ∈ I, a′(b − b′) ∈ I and thus ab − a′b′ ∈ I. Therefore, ab + I =
a′b′ + I.
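We now verify the ring axioms. It will be convenient to write ā for a + I. With this notation we have
$\bar{a} + \bar{b} = \overline{a + b}, \qquad \bar{a}\,\bar{b} = \overline{ab}.$
The axioms follow from the definition of the operations and the fact that they hold for R. To make clear at
what point we use that the axioms hold in R, we use the notation $\overset{!}{=}$ to draw attention to this.
(1) $\bar{a} + \bar{b} = \overline{a + b} \overset{!}{=} \overline{b + a} = \bar{b} + \bar{a}$.
(2) $\bar{a} + (\bar{b} + \bar{c}) = \bar{a} + \overline{b + c} = \overline{a + (b + c)} \overset{!}{=} \overline{(a + b) + c} = \overline{a + b} + \bar{c} = (\bar{a} + \bar{b}) + \bar{c}$.
(3) We have $\bar{0} + \bar{a} = \overline{0 + a} \overset{!}{=} \bar{a}$. (We remark that $\bar{0} = I$.)
(4) We have $\bar{a} + \overline{-a} = \overline{a + (-a)} \overset{!}{=} \bar{0}$.
(5) $\bar{a}(\bar{b}\,\bar{c}) = \bar{a}\,\overline{bc} = \overline{a(bc)} \overset{!}{=} \overline{(ab)c} = \overline{ab}\,\bar{c} = (\bar{a}\,\bar{b})\,\bar{c}$.
(6) We have $\bar{a}\,\bar{1} = \overline{a \cdot 1} \overset{!}{=} \bar{a}$ and $\bar{1}\,\bar{a} = \overline{1 \cdot a} \overset{!}{=} \bar{a}$.
(7) $(\bar{a} + \bar{b})\bar{c} = \overline{a + b}\,\bar{c} = \overline{(a + b)c} \overset{!}{=} \overline{ac + bc} = \overline{ac} + \overline{bc} = \bar{a}\,\bar{c} + \bar{b}\,\bar{c}$. Also, $\bar{c}(\bar{a} + \bar{b}) = \bar{c}\,\overline{a + b} = \overline{c(a + b)} \overset{!}{=} \overline{ca + cb} = \overline{ca} + \overline{cb} = \bar{c}\,\bar{a} + \bar{c}\,\bar{b}$.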

Proposition 21.0.5. The natural map,
π : R → R/I, a ↦ π(a) := ā
is a surjective ring homomorphism with kernel I. Thus, every ideal I ◁ R is the kernel of some ring homomorphism
from R to some other ring.
Proof. Note that 1 ↦ 1̄, which is the identity element of R/I. We have $\pi(a + b) = \overline{a + b} = \bar{a} + \bar{b} = \pi(a) + \pi(b)$.
Also, $\pi(ab) = \overline{ab} = \bar{a}\,\bar{b} = \pi(a)\,\pi(b)$. We have shown that π is a ring homomorphism and it
is clearly surjective.
The kernel of π consists of the elements a ∈ R such that π(a) = ā = 0̄, namely, the elements a such that a + I =
0 + I. By Lemma 21.0.3 this is the set of elements a such that a − 0 ∈ I, namely, the kernel is precisely I.
Example 21.0.6. Consider the ring Z. If we take the ideal {0} then Z/{0} can be identified with Z; the
map Z → Z/{0} is a bijective ring homomorphism. Now let n > 0. The ring Z/(n) has as elements
the cosets a + (n). Two cosets a + (n), b + (n) are equal if and only if a − b ∈ (n), that is, precisely
when n|( a − b). We see that the elements of Z/(n) are just the congruence classes modulo n, as we have
in fact noted before, and the operations on Z/(n) are just the operations we defined on congruence classes.
Thus, the quotient rings of Z are (either Z or) the familiar rings of congruences. In particular, if p is a
prime number we get the field Z/pZ of p elements. Following on the analogy between Z and F[ x ], it is
natural to examine next the quotient rings of F[ x ]. We shall see that in fact we can get this way fields and
in particular fields whose cardinality is any power of a prime (in contrast, Z/pᵃZ is never a field for a > 1).
It is a fact that any finite field has cardinality a power of a prime, so the methods we develop in this course
produce all finite fields. We shall not prove, though, in this course that any finite field has cardinality a power
of a prime, or that we get all finite fields this way. This is done usually in Algebra IV.

21.1. The quotient ring F[x]/(f(x)). Let F be a field, f(x) ∈ F[x] a non-constant polynomial and
(f(x)) = f(x) · F[x] the principal ideal it defines. Consider the quotient ring F[x]/(f(x)). Suppose
that $f(x) = x^n + a_{n-1} x^{n-1} + \dots + a_0$ is a monic polynomial of degree n. The following lemma is an
analogue of Lemma 12.0.1.
Lemma 21.1.1. Every element of F[x]/(f(x)) is of the form $\overline{g(x)} := g(x) + (f(x))$ for a unique polynomial
g(x) which is either zero or of degree less than n.
Proof. Let h(x) be a polynomial. To say that we have equality of cosets, h(x) + (f(x)) = g(x) + (f(x)), is
to say that h(x) = q(x)f(x) + g(x) for some polynomial q(x). The requirement that g is zero or deg(g) < deg(f)
amounts to the assertion that the expression
h(x) = q(x)f(x) + g(x)
is the one gotten by dividing h by f with residue. We know that this is always possible and in a unique
fashion.
Theorem 21.1.2. Let F be a field, f ( x ) ∈ F[ x ] a non-constant irreducible polynomial of degree n. The
quotient ring F[x]/(f(x)) is a field. If F is a finite field of cardinality q then F[x]/(f(x)) is a field with q^n
elements.

Proof. We already know that F[x]/(f(x)) is a commutative ring. We note that 0̄ ≠ 1̄ because 1 ∉ (f)
(if it were, f would be a constant polynomial). Thus, we only need to show that a non-zero element has
an inverse. Let ḡ be a non-zero element. That means that g(x) ∉ (f(x)), thus f(x) ∤ g(x), and so
gcd(f, g) = 1 (here is where we use that f is irreducible). Therefore, there are polynomials u(x), v(x) such
that
u(x) f(x) + v(x) g(x) = 1.
Passing to the quotient ring, that means that v̄ ḡ = 1̄, and 1̄ is the identity of the quotient ring. Therefore ḡ
is invertible (and its inverse is v̄).
Finally, by Lemma 21.1.1, every element of F[x]/(f(x)) has a unique representative of the form a_{n−1} x^{n−1} +
· · · + a_1 x + a_0, where a_0, a_1, . . . , a_{n−1} are elements of F. If F has q elements, we get q^n such polynomials, as
q^n is the number of choices for the coefficients a_0, a_1, . . . , a_{n−1}.
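The inverse produced in the proof can be computed by running the Euclidean algorithm on f and g while keeping track of the coefficient v. The following is a minimal Python sketch of that idea, under our own conventions (they are not part of the notes): a polynomial over Z/pZ is a list of coefficients, lowest degree first, the zero polynomial is the empty list, and the names `divmod_poly` and `inverse_mod` are ours. It uses `pow(c, -1, p)` for inverses in Z/pZ, which assumes Python 3.8 or later.

```python
def trim(a):
    """Drop leading zero coefficients (stored at the end of the list)."""
    while a and a[-1] == 0:
        a.pop()
    return a

def sub(a, b, p):
    n = max(len(a), len(b))
    return trim([((a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0)) % p
                 for i in range(n)])

def mul(a, b, p):
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return trim(out)

def divmod_poly(a, b, p):
    """Quotient and residue of a divided by b (b non-zero, trimmed) over Z/pZ."""
    a = trim([c % p for c in a])
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)               # inverse of the leading coefficient of b
    while len(a) >= len(b):
        c = a[-1] * inv % p
        shift = len(a) - len(b)
        q[shift] = c
        for i, bi in enumerate(b):
            a[shift + i] = (a[shift + i] - c * bi) % p
        trim(a)
    return trim(q), a

def inverse_mod(g, f, p):
    """A polynomial v with v*g = 1 in (Z/pZ)[x]/(f); requires gcd(f, g) = 1."""
    f = trim([c % p for c in f])
    r0, r1 = f, divmod_poly(g, f, p)[1]
    v0, v1 = [], [1]                      # invariant: r_i = v_i * g modulo f
    while r1:
        q, r = divmod_poly(r0, r1, p)
        r0, r1 = r1, r
        v0, v1 = v1, sub(v0, mul(q, v1, p), p)
    if len(r0) != 1:
        raise ValueError("g is not invertible modulo f")
    c = pow(r0[0], -1, p)                 # the gcd is a non-zero constant; rescale it to 1
    return divmod_poly([c * x % p for x in v0], f, p)[1]
```

For instance, `inverse_mod([0, 1], [1, 1, 1], 2)` returns `[1, 1]`: in F_2[x]/(x^2 + x + 1) the inverse of x̄ is the class of x + 1.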
Example 21.1.3. A field with 4 elements. Take the field F to be F_2 = Z/2Z and consider the polynomial
x^2 + x + 1 over that field. Because it is of degree 2 and has no root in F_2 it must be irreducible.
Therefore, F_2[x]/(x^2 + x + 1) is a field K with 4 elements. Let us list its elements:
K = {0̄, 1̄, x̄, x̄ + 1̄}.
(This is the list of polynomials a_0 + a_1 x with a_0, a_1 ∈ F_2.) We can describe the addition and multiplication
by tables:

  +     |  0̄      1̄      x̄      x̄+1̄
  0̄     |  0̄      1̄      x̄      x̄+1̄
  1̄     |  1̄      0̄      x̄+1̄    x̄
  x̄     |  x̄      x̄+1̄    0̄      1̄
  x̄+1̄   |  x̄+1̄    x̄      1̄      0̄

  ·     |  0̄      1̄      x̄      x̄+1̄
  0̄     |  0̄      0̄      0̄      0̄
  1̄     |  0̄      1̄      x̄      x̄+1̄
  x̄     |  0̄      x̄      x̄+1̄    1̄
  x̄+1̄   |  0̄      x̄+1̄    1̄      x̄
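The two tables can be regenerated mechanically. Here is a small Python sketch, assuming we encode the class of a_0 + a_1 x as the pair (a0, a1); the encoding and the names `add`, `mul`, `table` are ours. The only field-specific input is the relation x^2 = x + 1, which holds in K because x^2 + x + 1 = 0 and −1 = 1 in F_2.

```python
# Elements of K = F_2[x]/(x^2 + x + 1); (a0, a1) stands for the class of a0 + a1*x.
ELEMENTS = [(0, 0), (1, 0), (0, 1), (1, 1)]
NAMES = {(0, 0): "0", (1, 0): "1", (0, 1): "x", (1, 1): "x+1"}

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    # (a0 + a1 x)(b0 + b1 x) = a0 b0 + (a0 b1 + a1 b0) x + a1 b1 x^2, and x^2 = x + 1 in K.
    a0, a1 = u
    b0, b1 = v
    c0 = a0 * b0 + a1 * b1
    c1 = a0 * b1 + a1 * b0 + a1 * b1
    return (c0 % 2, c1 % 2)

def table(op):
    """Rows and columns are indexed by ELEMENTS, in that order."""
    return [[NAMES[op(u, v)] for v in ELEMENTS] for u in ELEMENTS]

if __name__ == "__main__":
    for symbol, op in (("+", add), ("*", mul)):
        print(symbol, [NAMES[e] for e in ELEMENTS])
        for u, row in zip(ELEMENTS, table(op)):
            print(NAMES[u], row)
```

Replacing the reduction rule in `mul` by the one coming from another irreducible quadratic gives the tables of the corresponding field in the same way.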

***************

Example 21.1.4. A field with 9 elements. Consider the polynomial x^2 + 1 over F_3 = Z/3Z. It is
quadratic and has no root in F_3, hence is irreducible over F_3. We conclude that L = F_3[x]/(x^2 + 1)
is a field with 9 elements. Note that in F_3 the element −1 = 2 is not a square. However, in L we
have x^2 = x^2 − (x^2 + 1) = −1 and so −1 is now a square: its root is x (viewed as an element of L). In
fact, any quadratic polynomial over F_3 has a root in L, because the discriminant b^2 − 4ac is one of 0, 1, 2
and all of those are squares in L.
For example, consider the polynomial t^2 + t + 2. It has discriminant −7 ≡ −1 (mod 3), which is not a
square in F_3. In the field L we have x^2 = −1. The roots of the polynomial are then (−1 ± √−1)/2 =
2(−1 ± x) = 1 ± 2x (recall that 1/2 = 2 in F_3).
On the other hand, one can prove that the polynomial t^3 + t^2 + 2 is irreducible over F_3 and stays irreducible
over L. In MATH 370 we learn a systematic theory for deciding which polynomials stay irreducible and which
do not.
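The claims about squares and about the roots 1 ± 2x can be checked by listing all 9 elements of L. A short Python sketch, assuming the class of a_0 + a_1 x is encoded as the pair (a0, a1) (our convention, as are the function names):

```python
P = 3  # we work in L = F_3[x]/(x^2 + 1); (a0, a1) stands for the class of a0 + a1*x

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    # x^2 = -1 in L, so the product a1*b1 of the x-coefficients is subtracted from the constant term.
    a0, a1 = u
    b0, b1 = v
    return ((a0 * b0 - a1 * b1) % P, (a0 * b1 + a1 * b0) % P)

ELEMENTS = [(a0, a1) for a0 in range(P) for a1 in range(P)]

def g(t):
    """g(t) = t^2 + t + 2, evaluated in L."""
    return add(add(mul(t, t), t), (2, 0))

roots = [t for t in ELEMENTS if g(t) == (0, 0)]      # the classes of 1 + x and 1 + 2x
squares = {mul(t, t) for t in ELEMENTS}              # contains the classes of 0, 1 and 2
```

Here `roots` comes out as `[(1, 1), (1, 2)]`, that is, 1 + x = 1 − 2x and 1 + 2x, in agreement with the formula 1 ± 2x.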

Example 21.1.5. Fields with 8 and 16 elements. A polynomial of degree 3 is irreducible if and only if
it doesn’t have a root. We can verify that x^3 + x + 1 doesn’t have a root in F_2 = Z/2Z and conclude
that F_2[x]/(x^3 + x + 1) is a field with 8 elements. Consider the field K with 4 elements constructed above.
We note that the polynomial t^2 + t + x̄ is irreducible over K (simply by substituting for t any of the four
elements of K and checking). Thus, we get a field L with 16 elements
L = K[t]/(t^2 + t + x̄).
Remark 21.1.6. To construct a finite field of p^n elements, where p is a prime number, we need to find a
polynomial of degree n which is irreducible over the field Z/pZ. One can prove that such a polynomial
always exists by a counting argument. Finding a specific one for a given p and n is harder. Nonetheless,
given p and n one can find such an irreducible polynomial and so construct a field of p^n elements explicitly,
for example in the sense that one can write a computer program that makes calculations in such a field.
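For small p and n, one naive way to find such a polynomial is exhaustive search: a monic polynomial of degree n is reducible exactly when some monic polynomial of degree between 1 and n/2 divides it. The Python sketch below implements this trial division; it is only an illustration under our own conventions (coefficient lists, lowest degree first; function names are ours), not the counting argument mentioned in the remark and not an efficient method.

```python
from itertools import product

def poly_rem(h, f, p):
    """Residue of h on division by the monic polynomial f, coefficients mod p.
    Polynomials are lists of coefficients, lowest degree first."""
    h = [c % p for c in h]
    n = len(f) - 1                        # degree of f
    while len(h) > n:
        lead = h.pop()                    # coefficient of the current top power
        shift = len(h) - n                # subtract lead * x^shift * f
        for i in range(n):
            h[shift + i] = (h[shift + i] - lead * f[i]) % p
    return h

def is_irreducible(f, p):
    """Trial division of the monic polynomial f by all monic polynomials of degree 1..deg(f)//2."""
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for low in product(range(p), repeat=d):
            g = list(low) + [1]
            if not any(poly_rem(f, g, p)):
                return False
    return True

def first_irreducible(p, n):
    """The first monic irreducible polynomial of degree n over Z/pZ found by the search."""
    for low in product(range(p), repeat=n):
        f = list(low) + [1]
        if is_irreducible(f, p):
            return f
    return None
```

For example, `first_irreducible(2, 2)` returns `[1, 1, 1]`, the polynomial x^2 + x + 1 used above to construct the field with 4 elements.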

21.2. Every polynomial has a root in a bigger field.

Theorem 21.2.1. Let F be a field and f(x) ∈ F[x] a non-constant polynomial. There is a field L containing F
and an element ℓ ∈ L such that f(ℓ) = 0.
Proof. If g | f and g(ℓ) = 0 then also f(ℓ) = 0, so we may assume that f is irreducible. Let L = F[x]/(f(x)).
This is a field. We have a natural map F → L, a ↦ ā. This map is an injective ring homomorphism and we
identify F with its image in L so as to say that L ⊃ F.
Now, suppose that f(x) = a_n x^n + · · · + a_1 x + a_0. To say that f has a root in L is to say that for some
element ℓ ∈ L we have
a_n ℓ^n + · · · + a_1 ℓ + a_0 = 0.
We check that this holds for the element ℓ = x̄. Indeed,
a_n x̄^n + · · · + a_1 x̄ + a_0 = (a_n x^n + · · · + a_1 x + a_0) + (f(x)) = f(x) + (f(x)) = 0_L.

Example 21.2.2. According to this result, −1 has a square root in the field R[x]/(x^2 + 1). One can show
that R[x]/(x^2 + 1) ≅ C.

21.3. Roots of polynomials over Z/pZ. We can now continue our discussion, begun in § 16.7, of the
efficient determination of whether a small degree polynomial f(x) over Z/pZ has a root in Z/pZ. Recall
that the only remaining point was whether the Euclidean algorithm step,
x^p − x = q(x) f(x) + r(x),
can be done rapidly. Now we can answer that affirmatively. Note that r(x) + x is exactly the representative
of x^p in the ring F[x]/(f(x)), where F = Z/pZ. This representative can be calculated quickly by the method
we already used for calculating powers. We need to calculate
x, x^2, x^4, x^8, . . .
and express p in base 2, p = ∑ a_i 2^i, a_i ∈ {0, 1}; then x^p = ∏_{i : a_i ≠ 0} x^{2^i}. We see that the limiting factor
now is how quickly we can carry out multiplication in the ring F[x]/(f(x)). It is not hard to see that this
depends on the degree of f and not on p.
Let us illustrate this by finding if x^3 + x + 1 has a root in the field with 17 elements. We calculate x^17 in
the quotient ring L = F_17[x]/(x^3 + x + 1). We have x, x^2,
x^4 = x(x^3 + x + 1) − (x^2 + x) = −(x^2 + x) in L,
x^8 = (x^2 + x)^2 = x^4 + 2x^3 + x^2 = −(x^2 + x) + 2(−x − 1) + x^2 = −3x − 2,
x^16 = (3x + 2)^2 = 9x^2 + 12x + 4
and so
x^17 − x = x(9x^2 + 12x + 4) − x = 9(−x − 1) + 12x^2 + 3x = 12x^2 − 6x − 9.
This is the residue of dividing x^17 − x by x^3 + x + 1. Now we continue with the Euclidean algorithm, in the
way we are used to.
x^3 + x + 1 = (10x + 5)(12x^2 − 6x − 9) + 2x + 12,
12x^2 − 6x − 9 = (6x − 5)(2x + 12)
and it follows (the last non-zero residue being 2x + 12 = 2(x + 6)) that

gcd(x^17 − x, x^3 + x + 1) = x + 6,
and so that x^3 + x + 1 has a unique root modulo 17, equal to −6 = 11. It is easy to verify this conclusion:
x^3 + x + 1 = (x + 6)(x^2 + 11x + 3), which shows that x = 11 is indeed a root. The polynomial x^2 + 11x + 3
has discriminant 121 − 12 = 109 ≡ 7 (mod 17), which is not a square modulo 17 (by direct verification),
hence it is irreducible modulo 17 and there are no other roots.
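The whole computation, repeated squaring of x in the quotient ring followed by the Euclidean algorithm, fits in a short program. The following Python sketch reproduces the example; polynomials are coefficient lists (lowest degree first), all names are ours, and `pow(c, -1, p)` assumes Python 3.8 or later.

```python
def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mod(h, f, p):
    """Residue of h modulo the non-zero polynomial f over Z/pZ."""
    h = trim([c % p for c in h])
    f = trim([c % p for c in f])
    inv = pow(f[-1], -1, p)               # inverse of the leading coefficient of f
    while len(h) >= len(f):
        c = h[-1] * inv % p
        shift = len(h) - len(f)
        for i, fi in enumerate(f):
            h[shift + i] = (h[shift + i] - c * fi) % p
        trim(h)
    return h

def poly_mul(a, b, p):
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_sub(a, b, p):
    n = max(len(a), len(b))
    return trim([((a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0)) % p
                 for i in range(n)])

def x_power(e, f, p):
    """Representative of x^e in (Z/pZ)[x]/(f), by repeated squaring along the binary digits of e."""
    result, square = [1], poly_mod([0, 1], f, p)
    while e:
        if e & 1:
            result = poly_mod(poly_mul(result, square, p), f, p)
        square = poly_mod(poly_mul(square, square, p), f, p)
        e >>= 1
    return result

def poly_gcd(a, b, p):
    """Monic gcd of a and b (not both zero) over Z/pZ, by the Euclidean algorithm."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

p, f = 17, [1, 1, 0, 1]                          # f = x^3 + x + 1 over F_17
r = poly_sub(x_power(p, f, p), [0, 1], p)        # residue of x^17 - x modulo f
g = poly_gcd(f, r, p)                            # product of the linear factors of f
```

With these inputs `r` is `[8, 11, 12]`, that is 12x^2 + 11x + 8 = 12x^2 − 6x − 9, and `g` is `[6, 1]`, that is x + 6, matching the hand computation.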

22. The First Isomorphism Theorem

22.1. Isomorphism of rings.

Definition 22.1.1. Let R, S be rings. A ring homomorphism f : R → S is called an isomorphism if f is
bijective.

Lemma 22.1.2. If f : R → S is a ring isomorphism then the inverse function g = f^{−1}, g : S → R, is also a
ring homomorphism, hence an isomorphism. (The inverse function is defined by g(s) = r, where r is the
unique element such that f(r) = s.)

Proof. First, because f (1R ) = 1S we have g(1S ) = 1R . Next, let s1 , s2 ∈ S. We need to prove g(s1 + s2 ) =
g(s1 ) + g(s2 ) and g(s1 s2 ) = g(s1 ) g(s2 ). It is enough to prove that
f ( g(s1 + s2 )) = f ( g(s1 ) + g(s2 )), f ( g(s1 s2 )) = f ( g(s1 ) g(s2 )),
because f is injective. But f ( g(s1 ) + g(s2 )) = f ( g(s1 )) + f ( g(s2 )) = s1 + s2 = f ( g(s1 + s2 )) and f ( g(s1 ) g(s2 )) =
f ( g(s1 )) f ( g(s2 )) = s1 s2 = f ( g(s1 s2 )).
Definition 22.1.3. Let R, S be rings. We say that R and S are isomorphic if there is a ring isomorphism R → S.

Lemma 22.1.4. Being isomorphic is an equivalence relation on rings.

Proof. First, the identity function is always a ring homomorphism from R to R, so this relation is reflexive.
Secondly, if f : R → S is an isomorphism then g : S → R is an isomorphism, where g is the inverse function
to f . Thus, the relation is symmetric. Now suppose f : R → S and g : S → T are ring isomorphisms between
the rings R, S, T. To show the relation is transitive we need to prove that g ◦ f : R → T is an isomorphism.
Indeed:
(1) ( g ◦ f )(1R ) = g( f (1R )) = g(1S ) = 1T ;
(2) ( g ◦ f )(r1 + r2 ) = g( f (r1 + r2 )) = g( f (r1 ) + f (r2 )) = g( f (r1 )) + g( f (r2 )) = ( g ◦ f )(r1 ) + ( g ◦
f )(r2 );
(3) ( g ◦ f )(r1 r2 ) = g( f (r1 r2 )) = g( f (r1 ) f (r2 )) = g( f (r1 )) · g( f (r2 )) = ( g ◦ f )(r1 ) · ( g ◦ f )(r2 ).

We shall denote that R is isomorphic to S by R ≅ S.

22.2. The First Isomorphism Theorem.

Theorem 22.2.1. Let f : R → S be a surjective homomorphism of rings and let I = ker(f). Then there is an
isomorphism F : R/I → S such that the following diagram commutes, that is, F ◦ π = f:

         f
    R ------→ S
     \       ↗
    π \     / F
       ↘   /
        R/I

where π : R → R/I is the canonical map g ↦ ḡ.
Proof. We define a function,
F : R/I → S,
by F(ḡ) = f(g). We first prove that this map is well defined. Suppose that ḡ = ḡ1. We need to show
that f(g) = f(g1). This holds because ḡ = ḡ1 means g − g1 ∈ I = ker(f), and so f(g) − f(g1) = f(g − g1) = 0. Now:
• F(1_{R/I}) = F(1̄_R) = f(1_R) = 1_S;
• F(ḡ + h̄) = f(g + h) = f(g) + f(h) = F(ḡ) + F(h̄) (as ḡ + h̄ is the class of g + h);
• F(ḡ h̄) = f(gh) = f(g) · f(h) = F(ḡ) · F(h̄) (as ḡ h̄ is the class of gh).
We also have
(F ◦ π)(g) = F(ḡ) = f(g),
so F ◦ π = f. Since f is surjective, it follows that F is surjective. We next show F is injective. Suppose
that F(ḡ) = 0_S. Then f(g) = 0_S and so g ∈ I. Thus, ḡ = 0_{R/I}; the kernel of F is zero and so F is injective.
Example 22.2.2. We consider again the homomorphism Z → Z/nZ. It is a surjective ring homomorphism
with kernel given by the principal ideal (n) and we conclude that
Z/(n) ≅ Z/nZ,
a fact we have noticed somewhat informally before.
Example 22.2.3. We have R[x]/(x^2 + 1) ≅ C. To show that, define a ring homomorphism
R[x] → C,
by ∑_{j=0}^{n} a_j x^j ↦ ∑_{j=0}^{n} a_j i^j.
This is a well defined function taking 1 to 1. It is easy to verify it is a homomorphism.
In fact, recall that C[x] → C, f ↦ f(i), is a homomorphism. Our map is the restriction of the evaluation-at-i
homomorphism to the subring R[x] and so is also a homomorphism. It is surjective because, for example,
a + bx ↦ a + bi.
The kernel I definitely contains x^2 + 1, and so all its multiples. That is, I contains the ideal (x^2 + 1).
Because every ideal of R[x] is principal, I = (f) for some polynomial f. Because x^2 + 1 ∈ (f), f | (x^2 + 1).
Since x^2 + 1 is irreducible over R, either f ∼ 1 or f ∼ x^2 + 1. If f ∼ 1 we have (f) = R[x] and so any
polynomial is in the kernel, which is clearly not the case (for example, 1 is not in the kernel). Thus f ∼ x^2 + 1
and I = (x^2 + 1); by the first isomorphism theorem we have
R[x]/(x^2 + 1) ≅ C.

22.3. The Chinese Remainder Theorem.

Theorem 22.3.1. Let m, n be positive integers such that (m, n) = 1. Then
Z/mnZ ≅ Z/mZ × Z/nZ.
Proof. We define a function
f : Z → Z/mZ × Z/nZ, f ( a) = ( a (mod m), a (mod n)).
This function is a ring homomorphism:
• f(1) = (1 (mod m), 1 (mod n)) = (1_{Z/mZ}, 1_{Z/nZ});
• f ( a + b) = ( a + b (mod m), a + b (mod n)) = ( a (mod m), a (mod n)) + (b (mod m), b (mod n)) =
f ( a ) + f ( b );
• f ( ab) = ( ab (mod m), ab (mod n)) = ( a (mod m), a (mod n)) · (b (mod m), b (mod n)) =
f ( a ) f ( b ).
The kernel of the map is the set { a : m| a, n| a} = { a : mn| a} (using that (m, n) = 1), that is, the kernel
is the principal ideal (mn). That means that the integers 0, 1, . . . , mn − 1 all have different images in the
target. Since the target has mn elements, we conclude that f is surjective. By the first isomorphism theorem
Z/mnZ ≅ Z/mZ × Z/nZ.
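The counting step in the proof is easy to test numerically: the residues 0, 1, . . . , mn − 1 have distinct images exactly when (m, n) = 1. A small Python sketch (the function name is ours):

```python
from math import gcd

def crt_map_is_bijective(m, n):
    """Does a -> (a mod m, a mod n) hit all m*n pairs as a runs over 0, ..., mn - 1?"""
    images = {(a % m, a % n) for a in range(m * n)}
    return len(images) == m * n

# The map is a bijection precisely for relatively prime moduli (checked here for small m, n).
consistent = all(crt_map_is_bijective(m, n) == (gcd(m, n) == 1)
                 for m in range(1, 13) for n in range(1, 13))
```

For instance, the map is bijective for (m, n) = (7, 11) but not for (4, 6), where 0 and 12 have the same image.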


This theorem is very useful. It says that to solve an equation modulo mn, (m, n) = 1, is the same as solving it
modulo m and modulo n. That is, for given integers a_0, . . . , a_d and an integer A we have a_d A^d + · · · + a_1 A +
a_0 ≡ 0 (mod mn) if and only if we have a_d A^d + · · · + a_1 A + a_0 ≡ 0 (mod m) and a_d A^d + · · · + a_1 A + a_0 ≡
0 (mod n). Here is an example:
Example 22.3.2. Solve the equation 5x + 2 = 0 modulo 77.
We consider the equation modulo 7 and get 5x = −2 = 5 (mod 7), so x = 1 (mod 7); we consider
it modulo 11 and get 5x = −2 = 20 (mod 11), so x = 4 (mod 11). There is an x ∈ Z such
that x (mod 7) = 1, x (mod 11) = 4 and in fact x is unique modulo 77 (this is the CRT). We can guess
that x = 15 will do in this case, but it raises the general problem of finding the inverse isomorphism to
Z/mnZ → Z/mZ × Z/nZ.
22.3.1. Inverting Z/mnZ → Z/mZ × Z/nZ. Suppose we know how to find an integer e1 such that e1 = 1
(mod m), e1 = 0 (mod n), and an integer e2 such that e2 = 0 (mod m), e2 = 1 (mod n). Then we would have solved
our problem. Indeed, given two congruence classes a (mod m), b (mod n), take the integer ae1 + be2.
It is congruent to a modulo m and to b modulo n.
Since (m, n) = 1 we may find u, v such that 1 = um + vn. Put
e1 = 1 − um = vn, e2 = 1 − vn = um.
These are the integers we are looking for.
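This recipe translates directly into code. Below is a minimal Python sketch, with function names of our choosing: the extended Euclidean algorithm produces u, v with 1 = um + vn, and then e1, e2 are formed exactly as above.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) = u*a + v*b."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def crt_pair(a, m, b, n):
    """The residue x mod mn with x = a (mod m) and x = b (mod n), for positive m, n with (m, n) = 1."""
    g, u, v = extended_gcd(m, n)
    if g != 1:
        raise ValueError("moduli must be relatively prime")
    e1 = 1 - u * m        # e1 = 1 (mod m), e1 = 0 (mod n)
    e2 = 1 - v * n        # e2 = 0 (mod m), e2 = 1 (mod n)
    return (a * e1 + b * e2) % (m * n)
```

For example, `crt_pair(1, 7, 4, 11)` returns 15, the solution guessed in Example 22.3.2.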
Example 22.3.3. Solve the equation 56x + 23 = 0 (mod 323).
We have 323 = 17 · 19.
• Solution modulo 17.
We have the equation 5x + 6 = 0 (mod 17), or x = −6 · 5^{−1} = 11 · 5^{−1}. To find 5^{−1} we look
for u, v such that 1 = u · 5 + v · 17.
17 = 3 · 5 + 2, 5 = 2 · 2 + 1 so 1 = 5 − 2 · 2 = 5 − 2 · (17 − 3 · 5) = 7 · 5 − 2 · 17 and so 7 · 5 = 1
(mod 17). We conclude that x = 11 · 7 = 77 = 9 (mod 17).
• Solution modulo 19.
We have the equation − x + 4 = 0 so x = 4 (mod 19) is a solution.
• Finding e1 , e2 .
We have 19 = 17 + 2, 17 = 8 · 2 + 1 so 1 = 17 − 8 · 2 = 9 · 17 − 8 · 19. It follows that e1 =
1 − 9 ∗ 17 = −152, e2 = 1 + 8 ∗ 19 = 153.
• We conclude that the solution to the equation 56x + 23 = 0 (mod 323) is 9 ∗ e1 + 4 ∗ e2 = −1368 +
612 = −756 and modulo 323 this is 213.

With linear equations, there is in fact a quicker way to solve the equation that we already know. If the leading
coefficient a in ax + b is prime to mn, there is a c such that ca ≡ 1 (mod mn) (c can be found using the
Euclidean algorithm). Then x = −bc (mod mn).
Example 22.3.4. Solve the equation x^2 = 118 (mod 323). As before we reduce to solving x^2 = 118 = 16
(mod 17) and x^2 = 118 = 4 (mod 19). There are two solutions in each case, given by x = ±4 (mod 17)
and x = ±2 (mod 19). We conclude that overall we have 4 solutions given by
±4(−152) ± 2(153) (mod 323).
One can then reduce those numbers to standard representatives and find that 21, 55, 268, 302 (mod 323) are
the four solutions.
One can be more precise about the connection between solutions mod mn and solutions mod m and mod n.
First, let us generalize the Chinese Remainder Theorem:
Theorem 22.3.5. Let m1, . . . , mk be pairwise relatively prime non-zero integers (that is, (mi, mj) = 1 for i ≠ j). Then
there is an isomorphism
Z/m1 m2 . . . mk Z ≅ Z/m1 Z × Z/m2 Z × · · · × Z/mk Z,
given by
a (mod m1 m2 . . . mk) ↦ (a (mod m1), a (mod m2), . . . , a (mod mk)).

The theorem is not hard to prove by induction on k. The main case, k = 2, is the one we proved above.
Now, let g(x) = a_n x^n + · · · + a_1 x + a_0 be a polynomial with integer coefficients. Let S be the set of solutions of
g(x) = 0 in Z/m1 m2 . . . mk Z and Si the set of solutions of g(x) = 0 in Z/mi Z. Then we have a bijection
S ↔ S1 × S2 × · · · × Sk,
given by
a (mod m1 m2 . . . mk) ↦ (a (mod m1), a (mod m2), . . . , a (mod mk)).
Indeed, g(a) (mod m1 m2 . . . mk) is mapped to (g(a) (mod m1), g(a) (mod m2), . . . , g(a) (mod mk)) and
g(a) ≡ 0 (mod m1 m2 . . . mk) if and only if for every i we have g(a) ≡ 0 (mod mi). This shows that we
have a map
S → S1 × S2 × · · · × Sk,
which is injective by Theorem 22.3.5.
But, conversely, given solutions ri to g(x) mod mi, there is a unique r (mod m1 m2 · · · mk) such that r ≡ ri
(mod mi) for all i, and g(r) ≡ 0 (mod m1 m2 · · · mk) because it is true modulo every mi.
In particular, we may draw the following conclusion.
Corollary 22.3.6. Let m1, . . . , mk be pairwise relatively prime non-zero integers. Let s be the number of solutions to the
equation a_n x^n + · · · + a_1 x + a_0 = 0 (mod m1 m2 · · · mk) and let si be the number of solutions modulo mi.
Then
s = s1 s2 · · · sk.
Example 22.3.7. The equation x^2 = 1 has 8 solutions modulo 2 · 3 · 5 · 7 = 210, because it has one solution
mod 2, and 2 solutions mod each of 3, 5 and 7.
Example 22.3.8. The equation 34x = 85 (mod 17 · 19) has 17 solutions, because it has 17 solutions modulo
17 (it is then the equation 0 · x = 0 (mod 17)) and has a unique solution modulo 19 (it is then the equation
4x = 10 (mod 19) and x = 12 is the unique solution).
On the other hand, the equation 34x = 5 (mod 17 · 19) has no solutions, because it has no solutions
modulo 17.
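For moduli this small the counts in the corollary and in the last two examples can be confirmed by brute force. A Python sketch, where a polynomial is given by its list of integer coefficients, lowest degree first (our convention, as is the function name):

```python
def count_solutions(coeffs, modulus):
    """Number of x in Z/modulus with coeffs[0] + coeffs[1]*x + ... = 0 (mod modulus)."""
    def value(x):
        total = 0
        for c in reversed(coeffs):        # Horner's rule, reducing at every step
            total = (total * x + c) % modulus
        return total
    return sum(1 for x in range(modulus) if value(x) == 0)

# x^2 - 1 modulo 210 = 2 * 3 * 5 * 7 and modulo each prime factor
local_counts = [count_solutions([-1, 0, 1], m) for m in (2, 3, 5, 7)]
global_count = count_solutions([-1, 0, 1], 210)
```

Here `local_counts` is `[1, 2, 2, 2]` and `global_count` is 8, their product. Similarly `count_solutions([-85, 34], 323)` is 17 and `count_solutions([-5, 34], 323)` is 0.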

There remains the question how to calculate a solution mod m1 m2 · · · mk from solutions mod m1 , mod m2 ,
. . . , mod mk . That is, how to find explicitly the inverse to the map
Z/m1 m2 . . . mk Z → Z/m1 Z × Z/m2 Z × · · · × Z/mk Z.
We explain how to do that for 3 numbers m1 , m2 , m3 , though the method is general.
We first find integers e1 , e2 such that
e1 ≡ 1 (mod m1 ), e1 ≡ 0 (mod m2 m3 ),
and
e2 ≡ 0 (mod m1 ), e2 ≡ 1 (mod m2 m3 ).
This we know how to do because we are only dealing with two relatively prime numbers, that is, m1 and
m2 m3 . Then, find λ2 , λ3 such that
λ2 ≡ 1 (mod m2 ), λ2 ≡ 0 (mod m3 ),
and
λ3 ≡ 0 (mod m2 ), λ3 ≡ 1 (mod m3 ).
Then, the numbers
µ1 = e1, µ2 = e2 λ2, µ3 = e2 λ3,
are congruent to (1, 0, 0), (0, 1, 0) and (0, 0, 1) respectively in the ring Z/m1 Z × Z/m2 Z × Z/m3 Z. To find
an integer mod m1 m2 m3 mapping to (a, b, c) in Z/m1 Z × Z/m2 Z × Z/m3 Z take aµ1 + bµ2 + cµ3:
aµ1 + bµ2 + cµ3 ↦ (a, b, c) ∈ Z/m1 Z × Z/m2 Z × Z/m3 Z.
Let us illustrate all this with a numerical example.

Example 22.3.9. Find the solutions to the equation x^2 + x + 2 = 0 (mod 7 · 11 · 23).

The solutions modulo 7 are S1 = {3}; the solutions modulo 11 are S2 = {4, 6}; the solutions modulo
23 are S3 = {9, 13}. (Those are found by brute computation.) To find the corresponding solutions modulo
7 · 11 · 23 we first find the µi above.
First 11 · 23 = 36 · 7 + 1 and so e1 = 253, e2 = −252 satisfy e1 ≡ 1 (mod 7), e1 ≡ 0 (mod 253), e2 ≡ 0
(mod 7), e2 ≡ 1 (mod 253). To find λ2, λ3 we note that 1 = 23 − 2 · 11 and so λ2 = 23, λ3 = −22 satisfy
λ2 ≡ 1 (mod 11), λ2 ≡ 0 (mod 23) and λ3 ≡ 0 (mod 11), λ3 ≡ 1 (mod 23). The numbers µ1, µ2, µ3 are
then 253, −252 · 23, −252 · (−22). They should be understood as numbers modulo 7 · 11 · 23 = 1771 and so
we can replace −252 · 23 by 1288 (which is congruent to it modulo 1771) and −252 · (−22) by 231. Then
to get a number congruent to (a, b, c) in Z/7Z × Z/11Z × Z/23Z, we take a · 253 + b · 1288 + c · 231.
In particular, the solution (3, 4, 9) produces 3 · 253 + 4 · 1288 + 9 · 231 = 7990 which we can replace by the
number 906 which is congruent to it modulo 1771. By our theory 906 is one of the 4 solutions to the equation
x^2 + x + 2 = 0 (mod 1771) and this can be verified.
Do you think you could have found this solution in an easier way?! (I can’t see how).
Example 22.3.10. Although the method described to find µ1, µ2, µ3 is perhaps theoretically the quickest, in
practice you may find it less confusing to perform the following variant. First, as 1 = am1 + bm2 m3 for some
integers a, b, if we let µ1 = bm2 m3 then
µ1 ≡ 1 (mod m1), µ1 ≡ 0 (mod m2), µ1 ≡ 0 (mod m3).
Similarly, write now (for different a, b) 1 = am2 + bm1 m3 and let µ2 = bm1 m3. Then µ2 ≡ 1 (mod m2)
and µ2 ≡ 0 (mod m1), µ2 ≡ 0 (mod m3). Finally, write 1 = am3 + bm1 m2 and let µ3 = bm1 m2. Then
µ3 ≡ 1 (mod m3) and µ3 ≡ 0 (mod m1), µ3 ≡ 0 (mod m2).
Consider the following numerical example. Let us solve the equation x^2 + x = 0 (mod 60). The solutions mod
4 are {0, 3}, mod 3 are {0, 2} and mod 5 are {0, 4}. Using that gcd(4, 15) = 1 we find that 1 = 3 · 15 − 11 · 4
and so µ1 = 45. Using that gcd(3, 20) = 1 we find that 1 = 2 · 20 − 13 · 3 and so that µ2 = 40. Finally, as
gcd(5, 12) = 1 we find that 1 = 3 · 12 − 7 · 5 and so µ3 = 36.
The general solution to x^2 + x = 0 (mod 60) is thus
45 · {0, 3} + 40 · {0, 2} + 36 · {0, 4} = {0, 144, 80, 224, 135, 279, 215, 359} = {0, 15, 20, 24, 35, 39, 44, 59} (mod 60),
where in each pair of braces one chooses either of the two entries, giving 8 solutions in all.
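The variant just described works for any number of pairwise relatively prime moduli, and is easy to automate. The Python sketch below computes the µi as b · (M/mi), where M is the product of the moduli and b is an inverse of M/mi modulo mi, finds the local solutions by brute force, and combines them. The function names are ours, and `pow(x, -1, m)` assumes Python 3.8 or later.

```python
from itertools import product

def idempotents(moduli):
    """Return (M, [mu_1, ..., mu_k]) with mu_i = 1 (mod m_i) and mu_i = 0 (mod m_j) for j != i."""
    M = 1
    for m in moduli:
        M *= m
    mus = []
    for m in moduli:
        rest = M // m                 # product of the other moduli
        b = pow(rest, -1, m)          # b * rest = 1 (mod m)
        mus.append(b * rest % M)
    return M, mus

def roots_mod(coeffs, m):
    """Brute-force roots in Z/m of the polynomial with the given coefficients (lowest degree first)."""
    return [x for x in range(m)
            if sum(c * x**i for i, c in enumerate(coeffs)) % m == 0]

def solve(coeffs, moduli):
    """All roots modulo the product of the moduli, assembled from the roots modulo each m_i."""
    M, mus = idempotents(moduli)
    local = [roots_mod(coeffs, m) for m in moduli]
    return sorted(sum(r * mu for r, mu in zip(choice, mus)) % M
                  for choice in product(*local))
```

For example, `idempotents([4, 3, 5])` returns `(60, [45, 40, 36])` and `solve([0, 1, 1], [4, 3, 5])` returns `[0, 15, 20, 24, 35, 39, 44, 59]`. For the moduli 7, 11, 23 the function `idempotents` gives 253, 1288, 231, as in Example 22.3.9.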

23. Prime and maximal ideals

Let R be a commutative ring and I an ideal of R.
Definition 23.0.1. The ideal I is called maximal if I ≠ R and the only ideals of R containing I are I and
R, namely there is no ideal J such that I ⊊ J ⊊ R. The ideal I is called prime if I ≠ R and if ab ∈ I implies
a ∈ I or b ∈ I.
Theorem 23.0.2. The following hold:
(1) I is a prime ideal if and only if R/I is an integral domain.
(2) I is a maximal ideal if and only if R/I is a field.
(3) A maximal ideal is a prime ideal.
Proof. (1) Let I be a prime ideal. The quotient ring R/I is a commutative ring. Let ā, b̄ ∈ R/I and
suppose that ā · b̄ = 0̄. This means that the class of ab is 0̄ and so that ab ∈ I. Thus, a ∈ I or b ∈ I, which
implies that ā = 0̄ or b̄ = 0̄. Furthermore, since I ≠ R, the ring R/I is not the zero ring and so 0_{R/I} ≠ 1_{R/I}
(a ring in which 0 is equal to 1 is the zero ring). This proves that R/I is an integral domain.
Suppose conversely that R/I is an integral domain and that ab ∈ I. Then ā · b̄ = 0̄ and this
implies ā = 0̄ or b̄ = 0̄. Thus, a ∈ I or b ∈ I and so, as also I ≠ R (because 0̄ ≠ 1̄ in R/I), I is a prime ideal.
(2) Suppose that I is a maximal ideal. If 1 ∈ I then I = R and that is a contradiction. Thus, 1 ∉ I
and therefore in R/I we have 1̄ ≠ 0̄. Since R is commutative so is R/I and it remains to prove that
every non-zero element ā of R/I is invertible. Consider the ideal ⟨a, I⟩ = (a) + I of R. It strictly
contains I because a ∉ I. Therefore (a) + I = R and in particular, for some r ∈ R and i ∈ I we
have ar + i = 1. In the ring R/I we get ā · r̄ = 1̄.
Suppose now that R/I is a field and let π : R → R/I be the canonical homomorphism. Since 0̄ ≠ 1̄
in a field, I ≠ R. Let J ⊇ I
be an ideal. Then π(J) is an ideal of R/I (if f : R → S is a ring homomorphism and J an ideal of R
then f(J) is an ideal of Im(f)). Since R/I is a field, π(J) is either the zero ideal of R/I or R/I. In
the first case we have π(J) = {0̄}, which means that every element of J belongs to I and so J = I.
In the second case we have π(J) = R/I. Let r ∈ R; then for some j ∈ J we have r + I = j + I and
that means that r = j + i for some i ∈ I. But J ⊇ I and so r ∈ J. It follows that J = R.
(3) Since a field is an integral domain, if I is maximal it is prime.

Example 23.0.3. Let R be a commutative ring. Two elements f, g of R are called associate if f = ge for
some unit e. We write f ∼ g. Being associate is an equivalence relation.
Suppose that R is a commutative domain, namely, a non-zero commutative ring with no zero divisors.
Consider an ideal I of R of the form (f). When is I a prime ideal? If I = {0}, namely, if f = 0, then I is a
prime ideal as R/I ≅ R. If f ≠ 0 and I is a prime ideal then I ≠ R and so f is also not a unit. It further
has the property that f | ab implies f | a or f | b. Such a non-zero non-unit element f is called prime. It is easy
to reverse the argument and we find that if I = (f) is a prime ideal, either I is equal to zero, or f is a prime
element and, vice-versa, if f is a prime element I = (f) is a prime ideal.
Now, we may ask when is I = (f) a maximal ideal? The simple answer is that this is the case if I ≠ R
(namely, if f is not a unit) and whenever f ∤ g (namely, whenever g ∉ (f)) then (g, f) = R, namely, one
can write 1 = ag + bf for some a, b ∈ R. This is not very interesting. In fact, it is merely a restatement of
the fact that R/(f) is a field.
However, suppose that R is a principal ideal domain, namely, a non-zero commutative ring with no zero
divisors in which every ideal is principal. One can check that (f) = (g) if and only if f ∼ g. When is an ideal
I = (f) maximal?
If f = 0 and I is maximal, then any ideal of the form (r), where r ≠ 0, must be the whole ring (as this
ideal contains I properly); that is, any r ≠ 0 is a unit and that implies that R is a field. Conversely, if R is
a field then {0} is a maximal ideal. Now, suppose that f ≠ 0 and that (f) is maximal. Then (f) is also a prime
ideal and so f is a prime element. However, because R is a principal ideal domain, we can show that conversely, if f is prime
then (f) is not just a prime ideal, but in fact a maximal ideal. Indeed, suppose that J ⊇ (f) is an ideal. Write
J = (g), which is possible because R is a principal ideal domain. Then (g) ⊇ (f) and that means that f = gb for
some b ∈ R. Thus, f | gb. If f | g then g ∈ (f) and so (g) ⊆ (f) and therefore (g) = (f) (and g ∼ f). If f | b,
as b | f, we get that f ∼ b, which implies that g is a unit. That is, (g) = R.
To summarize. In any commutative domain, if I = (f) is a prime ideal then either f = 0, or f is a prime
element. If f is a prime element, or zero, I = (f) is a prime ideal. If R is a principal ideal domain, then any
non-zero prime ideal is maximal (and always a maximal ideal is prime).

24. Exercises
(1) Recall that for the ring Z a complete list of ideals is given by (0), (1), (2), (3), (4), (5), . . . , where
(n) is the principal ideal generated by n, namely, (n) = {na : a ∈ Z}. Find the complete list of
ideals of the ring Z × Z.
(2) Let R be a ring and let I and J be two ideals of R.
(a) Prove that I ∩ J is an ideal of R, where
I ∩ J = {r : r ∈ I, r ∈ J }.
It is called the intersection of the ideals I and J.
(b) Prove that
I + J = {i + j : i ∈ I, j ∈ J }
is an ideal of R. It is called the sum of the ideals I and J.
(c) Find for every two ideals of the ring Z their sum and intersection.
(3) Let F be a field. Prove that the ring M2 (F) of 2 × 2 matrices with entries in F has no non-trivial
(two-sided) ideals. That is, every ideal is either the zero ideal or M2 (F) itself.
(Note: there is also a notion of a one-sided ideal that we don’t discuss in this course. The ring
M2 (F) has a non-trivial one sided ideal. The notion of one-sided ideals is studied in Higher Algebra
I & II).
(4) The ring of real quaternions H (Hamilton’s quaternions). Let i, j, k be formal symbols and
H = { a + bi + cj + dk : a, b, c, d ∈ R}.
Addition on H is defined by
(a + bi + cj + dk) + (a′ + b′i + c′j + d′k) = (a + a′) + (b + b′)i + (c + c′)j + (d + d′)k.
Multiplication is determined by defining
i^2 = j^2 = −1, ij = −ji = k,
(and one extends this to a product rule by linearity).
(a) Prove that the map
        { [  z1    z2 ]                }
    H → { [ −z̄2   z̄1 ]  : z1, z2 ∈ C },
taking a + bi + cj + dk to the matrix
    [  a + bi    c + di ]
    [ −c + di    a − bi ],
is bijective and, denoting it by f, satisfies: f(x + y) =
f(x) + f(y) and f(i)^2 = f(j)^2 = −I_2, f(i) f(j) = f(k) = −f(j) f(i).
(b) Use part (a) to conclude that H is indeed a ring, by proving it is a subring of M2 (C).
(c) Prove that H is a non-commutative division ring.
(5) In the following questions it’s useful to remember that under our definitions a ring homomorphism
takes 1 to 1.
(a) Prove that there is no ring homomorphism Z/5Z → Z.
(b) Prove that there is no ring homomorphism Z/5Z → Z/7Z.
(c) Prove that the rings Z/2Z × Z/2Z and Z/4Z are not isomorphic.
(d) Is there a ring homomorphism Z/4Z → Z/2Z × Z/2Z?
(e) Is there a ring homomorphism Z/2Z × Z/2Z → Z/4Z?
(6) (a) Let R be a commutative ring and let r1 , . . . , rn be elements of R. We define (r1 , . . . , rn ) to be
the set
{r1 a1 + · · · + rn an : ai ∈ R for all i}.
Prove that (r1 , . . . , rn ) is an ideal of R. We call it the ideal generated by r1 , . . . , rn .

(b) Now apply that to the case where R = Z[ x ] (polynomials with integer coefficients). Let (2, x )
be the ideal generated by 2 and x.
(i) Prove that the ideal (2, x ) is not principal and conclude that Z[ x ] is not a principal ideal ring.
(ii) Find a homomorphism f : R → Z/2Z such that (2, x) = Ker(f).
(7) Prove that C[ x, y] is not a principal ideal ring, for example, that the ideal ( x, y) is not a principal
ideal.
(8) Prove that no two of the following rings are isomorphic:
(a) R × R × R × R (with addition and multiplication given coordinate by coordinate);
(b) M2 (R);
(c) The ring H of real quaternions.
(9) Let f : R → S be a ring homomorphism.
(a) Let J ◁ S be an ideal. Prove that f^{−1}(J) (equal by definition to {r ∈ R : f(r) ∈ J}) is an ideal
of R.
(b) Prove that if f is surjective and I ◁ R is an ideal then f(I) is an ideal (where f(I) = {f(i) : i ∈
I}).
(c) Show, by example, that if f is not surjective the assertion in (b) need not hold.
(10) Let F be a field and let
        { [ a11  a12  a13 ]            }
    R = { [  0   a22  a23 ]  : aij ∈ F }.
        { [  0    0   a33 ]            }
Let
I = {(aij) ∈ R : a11 = a22 = a33 = 0}.
Prove that R is a subring of M3(F), I is an ideal of R and R/I ≅ F × F × F.
(11) Let d be an integer, which is not a square of another integer.
(a) Prove that d is not a square of a rational number.
(b) Let Q[√d] := {a + b√d : a, b ∈ Q}. Show that Q[√d] is a subring of C and is in fact a field.
(c) Prove that Q[√d] ≅ Q[x]/(x^2 − d).
(d) Prove that the fields Q[√2] and Q[√3] are not isomorphic.
(12) Prove a Chinese Remainder Theorem for polynomials:
Let F be a field and let f(x), g(x) be two non-constant polynomials that are relatively prime,
gcd(f, g) = 1. Prove that
F[x]/(fg) ≅ F[x]/(f) × F[x]/(g).
(Hint: mimic the proof of the Chinese Remainder Theorem for integers.)
(13) Let R and S be rings and let I ◁ R, J ◁ S be ideals. Prove that
(R × S)/(I × J) ≅ (R/I) × (S/J).

(14) For each of the rings Z/60Z and F[x]/(x^4 + 2x^3 + x^2) find all their ideals and identify all their
homomorphic images. Suggestion: To find the ideals use exercise (9) to reduce the calculation to
ideals of the ring Z or F[x] that contain the ideal (60), respectively (x^4 + 2x^3 + x^2), but explain why
this is valid.
(15) Using the Chinese remainder theorem, find the solutions (if any) to the following polynomial equations
(for example, in (d), write the solutions as integers mod 30 and so on):
(a) 15x = 11 in Z/18Z.
(b) 15x = 12 in Z/63Z.
(c) x^2 = 37 in Z/63Z.
(d) x^2 = 4 in Z/30Z.

(16) Prove that F_5[x]/(x^2 + 2) is a field. Calculate the following expressions as polynomials of degree
smaller than 2: (x^2 + x) ∗ (x^3 + 1) − (x + 1), (x^2 + 3)^{−1} and (x^2 − 1)/(x^2 + 3). Find all the roots
of the polynomials t^2 + 2 and t^2 + 3 in this field.
(17) Prove that F_2[x]/(x^3 + x + 1) is a field. Calculate the following expressions as polynomials of degree
smaller than 3: (x^2 + x) ∗ (x^3 + 1) − (x + 1), (x^2 + 3)^{−1} and (x^2 − 1)/(x^2 + 3). Find all the roots
of the polynomials t^3 + t^2 + 1 and t^3 + 1 in this field.
(18) Let f(x) = x^2 − 2 ∈ F_19[x].
(a) Prove that f is irreducible and deduce that L := F_19[x]/(x^2 − 2) is a field.
(b) Find the roots of the polynomial t^2 − t + 2 in the field L.
(c) Prove that h(x) = x^3 − 2 is irreducible over F_19. Prove that it is also irreducible over L.
(d) Find x^19 − x in F_19[x]/(x^3 − 2) as being represented by a polynomial of degree at most 2 (hint:
x^19 = (x^3)^6 · x). Use this to rapidly calculate gcd(x^19 − x, h(x)) and conclude also in this way
that h(x) is irreducible over F_19.
(19) Find all the solutions to the equation x^3 = 38 (mod 195).

Part 6. Groups
25. First definitions and examples
25.1. Definitions and some formal consequences.
Definition 25.1.1. A group G is a non-empty set with an operation
G × G → G, ( a, b) 7→ ab,
such that the following axioms hold:
(1) ( ab)c = a(bc). (Associativity)
(2) There exists an element e ∈ G such that eg = ge for all g ∈ G. (Identity)
(3) For every g ∈ G there exists an element d ∈ G such that dg = gd = e. (Inverse)
Here are some formal consequences of the definition:
(1) e is unique. Say ẽ has the same property then ẽ = eẽ, using the property of e, but also eẽ = e,
using the property of ẽ. Thus, e = ẽ.
(2) d appearing in (3) is unique (therefore we shall call it “the inverse of g" and denote it by g^{−1}). Say d̃ also satisfies d̃g = gd̃ = e. Then
d̃ = d̃e = d̃(gd) = (d̃g)d = ed = d.

(3) Cancelation: ab = cb ⇒ a = c, and ba = bc ⇒ a = c. If ab = cb then (ab)b^{−1} = (cb)b^{−1} and so a = a(bb^{−1}) = c(bb^{−1}) = c. The second implication is proved in the same way, multiplying by b^{−1} on the left.
(4) (ab)^{−1} = b^{−1}a^{−1}. To show that, we need to show that b^{−1}a^{−1} “functions as the inverse of ab". We have (ab)(b^{−1}a^{−1}) = a(bb^{−1})a^{−1} = aea^{−1} = aa^{−1} = e. Similarly, (b^{−1}a^{−1})(ab) = b^{−1}(a^{−1}a)b = b^{−1}eb = b^{−1}b = e.
(5) (a^{−1})^{−1} = a. This is because aa^{−1} = a^{−1}a = e also shows that a is the inverse of a^{−1}.
(6) Define a^0 = e, a^n = a^{n−1}a for n > 0 and a^n = (a^{−1})^{−n} for n < 0. Then we have
a^m a^n = a^{m+n}, (a^m)^n = a^{mn}.

25.2. Examples.
Example 25.2.1. The trivial group G is a group with one element e and multiplication law ee = e.
Example 25.2.2. If R is a ring, then R with addition only is a group. It is a commutative group. The
operation in this case is of course written g + h. In general a group is called commutative or abelian if for
all g, h ∈ G we have gh = hg. It is customary in such cases to write the operation in the group as g + h and
not as gh, but this is not a must. This example thus includes Z, Q, R, C, F, F[e], M2 (F), Z/nZ, all with the
addition operation.
Example 25.2.3. Let R be a ring. Recall that the units R× of R are defined as
{u ∈ R : ∃v ∈ R, uv = vu = 1}.
This is a group. If u_1, u_2 ∈ R× have inverses v_1, v_2, respectively, then, as above, one checks that v_2v_1 is an inverse for u_1u_2 and so R× is closed under the product operation. The associative law holds because it holds in R; 1_R serves as the identity. If R is not commutative there is no reason for R× to be commutative, though in certain cases it may be.
Thus we get the examples of Z× = {±1}, Q× = Q − {0}, R× = R − {0}, C× = C − {0}, and more generally, F× = F − {0}. We also have GL2(F) = {M ∈ M2(F) : det(M) ≠ 0}, F[e]× = {a + be : a ≠ 0}.
Proposition 25.2.4. Let n > 1 be an integer. The group Z/nZ× is precisely
{1 ≤ a ≤ n : ( a, n) = 1}.

Proof. If ā is invertible then ab = 1 (mod n) for some integer b; say ab = 1 + kn for some k ∈ Z. If d| a, d|n
then d|1. Therefore ( a, n) = 1.
Conversely, suppose that ( a, n) = 1 then for some u, v we have 1 = ua + vn and so ua = 1 (mod n).
One defines Euler’s ϕ function on positive integers by
\[ \varphi(n) = \begin{cases} 1 & n = 1, \\ |(\mathbb{Z}/n\mathbb{Z})^{\times}| & n > 1. \end{cases} \]
One can prove that this is a multiplicative function, namely, if (n, m) = 1 then ϕ(nm) = ϕ(n) ϕ(m). I
invite you to try and prove this based on the Chinese Remainder Theorem.

Here are some specific examples:

n    (Z/nZ)×               ϕ(n)
2    {1}                   1
3    {1, 2}                2
4    {1, 3}                2
5    {1, 2, 3, 4}          4
6    {1, 5}                2
7    {1, 2, 3, 4, 5, 6}    6
8    {1, 3, 5, 7}          4
9    {1, 2, 4, 5, 7, 8}    6
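
The table can be checked mechanically. The following is a minimal sketch (not part of the original notes; the helper names units and phi are ours) that lists the units of Z/nZ using the criterion (a, n) = 1 of Proposition 25.2.4 and tests multiplicativity on one coprime pair.

```python
from math import gcd

def units(n):
    """Units of Z/nZ, represented by the integers 1 <= a <= n with gcd(a, n) = 1."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]

def phi(n):
    """Euler's phi function: phi(1) = 1 and phi(n) = |(Z/nZ)^x| for n > 1."""
    return 1 if n == 1 else len(units(n))

# Reproduce the rows of the table for n = 2, ..., 9.
for n in range(2, 10):
    print(n, units(n), phi(n))

# Multiplicativity on one coprime pair (an illustration, not a proof).
assert phi(4 * 9) == phi(4) * phi(9)
```

Of course this only confirms individual values; the proof of multiplicativity via the Chinese Remainder Theorem is still left to you.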

Example 25.2.5. If G, H are groups then G × H is a group with the operation

( g1 , h1 )( g2 , h2 ) = ( g1 g2 , h1 h2 ).
The identity is (eG , e H ) and ( g, h)−1 = ( g −1 , h −1 ).
Example 25.2.6. Consider a solid object and all the rotations that take it to itself. For example, if the object
is a cube we may rotate it ninety degrees relative to an axis that goes from the middle point of a face to the
middle point of the opposite face. Each such rotation, or succession of rotations, is thought of as an element
of the group of symmetries of the solid.
For example, for a tetrahedron we get a group of symmetries with 12 elements. For a cube we get a group
of symmetries with 24 elements. Can you see that?
25.3. Subgroups. Let G be a group. A subset H ⊆ G is called a subgroup if the following holds:
(1) eG ∈ H;
(2) a, b ∈ H ⇒ ab ∈ H;
(3) a ∈ H ⇒ a−1 ∈ H.
Clearly then H is a group in its own right.
Example 25.3.1. The subset S1 of C× consisting of all complex numbers of absolute value 1 is a subgroup.
Indeed 1 ∈ S1 . If s1 , s2 ∈ S1 then |s1 s2 | = |s1 | |s2 | = 1 so s1 s2 ∈ S1 . If z is any non-zero complex number
then 1 = |1| = |zz−1 | = |z| · |z−1 | and so |z−1 | = 1/|z|. If z ∈ S1 it therefore follows that z−1 ∈ S1 .
Let n ≥ 1 be an integer. The subset µ_n of C× consisting of all complex numbers x such that x^n = 1 is a subgroup of C×, and in fact of S1, having n elements. It is called the group of n-th roots of unity. The proof is left as an exercise.
Example 25.3.2. Let F be a field and
\[ H = \left\{ \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} : a \in \mathbb{F} \right\}. \]

Then H is a subgroup of GL2 (F).


Definition 25.3.3. Let G be a group. G is called cyclic if there is an element g ∈ G such that G = {g^n : n ∈ Z}; that is, any element of G is a power of g. The element g is then called a generator of G.
Example 25.3.4. Let G be any group. Let g ∈ G and define
⟨g⟩ := {g^n : n ∈ Z}.
This is a cyclic subgroup of G (it may be finite or infinite).
Example 25.3.5. The group Z is cyclic. As a generator we may take 1 (or −1).
Example 25.3.6. The group (Z/5Z)× = {1, 2, 3, 4} is cyclic. The elements 2, 3 are generators. The
group Z/8Z× is not cyclic. One can check that the square of any element is 1.
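
The claims of Example 25.3.6 are easy to test by brute force. Here is a small sketch under our own naming (units, generated_subgroup and generators are hypothetical helpers, not notation from the notes): an element is a generator exactly when its powers exhaust the whole group of units.

```python
from math import gcd

def units(n):
    """Units of Z/nZ, represented by integers in 1..n."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]

def generated_subgroup(g, n):
    """The set {g^k mod n : k >= 0}; we assume n > 1 and that g is a unit mod n."""
    powers, x = set(), 1
    while x not in powers:
        powers.add(x)
        x = (x * g) % n
    return powers

def generators(n):
    """All units g whose powers give every unit of Z/nZ."""
    us = units(n)
    return [g for g in us if generated_subgroup(g, n) == set(us)]

print(generators(5))   # the generators of (Z/5Z)^x
print(generators(8))   # empty: (Z/8Z)^x is not cyclic
```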

26. Permutation groups and dihedral groups

26.1. Permutation groups.
Definition 26.1.1. A permutation of a set T is a bijective function f : T → T. We shall denote the set
of permutations of T by ST (or Σ T ). If T = {1, 2, · · · , n} then we shall denote ST as Sn . It is called “the
symmetric group on n letters".
Proposition 26.1.2. For every non-empty set T, ST is a group under composition of functions. The cardinality
of Sn is n!.
Proof. The product of two permutations f , g is their composition f ◦ g; it is again a permutation. We
have [( f ◦ g) ◦ h](t) = ( f ◦ g)(h(t)) = f ( g(h(t))) = f (( g ◦ h)(t)) = [ f ◦ ( g ◦ h)](t). Thus, as functions, we
have ( f ◦ g) ◦ h = f ◦ ( g ◦ h) and so the operation of composition of functions is associative.
The identity is just the identity function. The inverse of a permutation f is the inverse function f −1 , which
satisfies f ◦ f −1 = f −1 ◦ f = IdT .
Finally, to define a permutation f on {1, 2, . . . , n} we can choose the image of 1 arbitrarily (n choices), the image of 2 could be any element different from f(1) (n − 1 choices), the image of 3 can be any element different from the images of 1 and 2 (n − 2 choices), and so on. Altogether, we have n · (n − 1) · (n − 2) · · · 2 · 1 = n! choices.
Example 26.1.3. (1) For n = 1, S1 consists of a single element and so is the trivial group.
(2) For n = 2 we have two permutations.
(a) Id. Id(1) = 1, Id(2) = 2.
(b) σ. σ(1) = 2, σ(2) = 1.
We may also represent these permutations in the form of tables:
\[ \mathrm{Id} = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}, \qquad \sigma = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}. \]
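(3) For n = 3 we have 6 permutations. One of them is σ given by σ(1) = 2, σ(2) = 3, σ(3) = 1, or in table form
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}. \]
The table form is a better notation and we list all elements of S3 in that form.
\[ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \]
\[ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}. \]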

 
The first line is in fact a cyclic subgroup of S3. It is the subgroup generated by
\[ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}. \]
(4) In general, let T_0 be a subset of T. We can define two subgroups of Σ_T. The first is H = {σ ∈ Σ_T : σ(t) = t, ∀t ∈ T_0} and the second is I = {σ ∈ Σ_T : σ(T_0) = T_0}. Then H ⊆ I ⊆ Σ_T are subgroups. H and I depend on the choice of T_0, but we do not reflect this in the notation. If Σ_T = Sn and T_0 has m elements then H has (n − m)! elements and I has (n − m)!m! elements.

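The groups Sn are not commutative for n ≥ 3. For example:
\[ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}. \]
Here is another example of multiplication, in S5 this time:
\[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 1 & 4 & 5 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 4 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 4 & 1 & 3 \end{pmatrix}. \]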
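
These products are compositions of functions, so the right-hand factor is applied first. A minimal sketch of the computation follows; the encoding of a permutation as the bottom row of its table (a tuple p with p[i-1] = σ(i)) and the name compose are our choices, not notation from the notes.

```python
from itertools import permutations

def compose(f, g):
    """Return f∘g: apply g first, then f. A permutation is the bottom row of its table."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

a = (2, 1, 3)
b = (2, 3, 1)
print(compose(a, b), compose(b, a))   # the two products differ, so S_3 is not commutative

# The S_5 product from the text.
print(compose((3, 2, 1, 4, 5), (5, 2, 4, 3, 1)))

# |S_n| = n! for a small n.
print(len(list(permutations(range(1, 5)))))
```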

26.2. Cycles. There is a still more efficient notation for permutations in Sn. Fix n ≥ 1. A cycle (in Sn) is an expression of the form
(a_1 a_2 · · · a_t),
where a_i ∈ {1, 2, . . . , n} are distinct elements. This expression is understood as the permutation σ given by
\[ \sigma(a) = \begin{cases} a_{i+1} & a = a_i,\ i < t, \\ a_1 & a = a_t, \\ a & \text{else.} \end{cases} \]


Pictorially:
a_1 → a_2 → a_3 → · · · → a_t → a_1
(and elements outside the cycle don’t move).
For example, the permutation $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$ is the cycle (1 2 3) and the permutation $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$ is the cycle (1 2). A cycle with two elements (i j) (not necessarily consecutive) is called a transposition.
Definition 26.2.1. Let G be a group. The order of G, denoted |G|, or ♯G, is the number of elements of G (written ∞ if not finite).
Let g ∈ G. The order of g is defined as |⟨g⟩|, the order of the cyclic group generated by g. It is also denoted by o(g), or ord(g).

Lemma 26.2.2. Let g ∈ G. Then o(g) is the minimal positive integer k such that g^k = e.

Proof. Let k be the minimal positive integer such that g^k = e (∞ if such doesn’t exist).
Suppose first that o(g) is finite, say equals r. Then the r + 1 elements e, g, g^2, . . . , g^r cannot be distinct and so g^i = g^j for some 0 ≤ i < j ≤ r. It follows that g^{j−i} = e and so k ≤ j − i ≤ r. In particular k is also finite. So r is finite implies k is finite and k ≤ r.
Suppose now that k is finite. Let n be an integer and write n = ak + b where 0 ≤ b < k. Then g^n = (g^k)^a g^b = e^a g^b = g^b. We conclude that ⟨g⟩ ⊆ {e, g, · · · , g^{k−1}}, and so k is finite implies that r is finite and r ≤ k.
Example 26.2.3. Let ( a1 a2 · · · at ) be a cycle. Its order is t.

Two cycles σ, τ are called disjoint if they contain no common elements. In this case, clearly στ = τσ. Moreover, (στ)^n = σ^n τ^n and since σ^n and τ^n move disjoint sets of elements, (στ)^n = Id if and only if σ^n = Id and τ^n = Id. Thus (στ)^n = Id if and only if o(σ)|n and o(τ)|n, and we deduce that o(στ) (namely, the least n such that (στ)^n = Id) is lcm(o(σ), o(τ)).
Arguing in the same way a little more generally we obtain:
Lemma 26.2.4. Let σ1 , . . . , σn be disjoint permutations of orders r1 , . . . , rn , respectively. Then the order of
the permutation σ1 ◦ σ2 ◦ · · · ◦ σn is lcm(r1 , r2 , . . . , rn ).
Combining this lemma with the following proposition allows us to calculate the order of every permutation
very quickly.
Proposition 26.2.5. Every permutation is a product of disjoint cycles.
We shall not provide a formal proof of this proposition, but illustrate it by examples.
 
Example 26.2.6. Consider the permutation
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 1 & 5 & 2 \end{pmatrix}. \]
To write it as a product of cycles we begin by (1 and check where 1 goes to. It goes to 3. So we write (13 and check where 3 goes to. It goes to 1 and so we have (13). The first number we didn’t consider is 2. 2 goes to 4 and so we write (13)(24 and 4 goes to 5 and so we write (13)(245 . Now, 5 goes to 2 and so we have σ = (13)(245). The order of σ is lcm(2, 3) = 6.
 
Example 26.2.7. Consider the permutation
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \end{pmatrix}. \]
It is written as a product of disjoint cycles as follows: (1324)(5 10 8)(79). To find this expression, we did the same procedure described above: We start with (1 , continue with (13 , because 1 goes to 3, and then with (132 , because 3 goes to 2. Then we find that 2 goes to 4 which goes to 1 and we have found (1324). The first number not in this list is 5 which goes to 10 and so we have (1324)(5 10. Since 10 goes to 8 and 8 to 5 we get now (1324)(5 10 8). The first number not in this list is 6 that goes to 6 and that gives (1324)(5 10 8)(6). We then continue with 7. Since 7 goes to 9 which goes to 7 we have (1324)(5 10 8)(6)(79). We have considered all numbers and so σ = (1324)(5 10 8)(6)(79) = (1324)(5 10 8)(79). The order of σ is lcm(4, 3, 2) = 12.
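
The procedure used in the last two examples is easy to mechanize. The following is a sketch only (the function names cycles and order are ours, and a permutation is again encoded by the bottom row of its table); it follows each number until the cycle closes and then applies Lemma 26.2.4.

```python
from math import lcm  # variadic lcm requires Python 3.9 or later

def cycles(p):
    """Disjoint cycles of p (tuple with p[i-1] = image of i), omitting fixed points."""
    seen, result = set(), []
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:          # follow start -> p(start) -> ... until the cycle closes
            seen.add(x)
            cycle.append(x)
            x = p[x - 1]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

def order(p):
    """Order of p: the lcm of the lengths of its disjoint cycles (1 for the identity)."""
    return lcm(*(len(c) for c in cycles(p)))

sigma = (3, 4, 2, 1, 10, 6, 9, 5, 7, 8)
print(cycles(sigma), order(sigma))
```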
Example 26.2.8. Suppose we want to find a permutation of order 10 in S7 . We simply take (12345)(67). If
we want to find a permutation of order 10 in S10 we can take either (12345)(67) or (123456789 10) (and all
variants on this).

Finally we remark on the computation of σ^{−1} for a permutation σ. If σ is given in the form of a table, for example:
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \end{pmatrix}, \]
then because σ(i) = j ⇔ σ^{−1}(j) = i, the table describing σ^{−1} is the same table but read from the bottom to the top. That is
\[ \sigma^{-1} = \begin{pmatrix} 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \end{pmatrix}. \]
Only that we follow our convention and write the columns in the conventional order and so we get
\[ \sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 4 & 3 & 1 & 2 & 8 & 6 & 9 & 10 & 7 & 5 \end{pmatrix}. \]
If σ is a cycle, say σ = (i_1 i_2 . . . i_{k−1} i_k), then σ^{−1} is easily seen to be (i_k i_{k−1} . . . i_2 i_1). So,
(i_1 i_2 . . . i_{k−1} i_k)^{−1} = (i_k i_{k−1} . . . i_2 i_1).

Now, if σ is a product of disjoint cycles, σ = σ_1σ_2 · · · σ_r, then σ^{−1} = σ_r^{−1} · · · σ_2^{−1}σ_1^{−1} (by a generalization of the rule (ab)^{−1} = b^{−1}a^{−1}), but, since those cycles are disjoint they commute, and so we can also write this as σ^{−1} = σ_1^{−1}σ_2^{−1} · · · σ_r^{−1}. (This last manipulation is wrong if the cycles are not disjoint!) Thus, for example, the inverse of σ = (1324)(5 10 8)(79) is (4231)(8 10 5)(97), which we can also write, if we wish, as (1423)(5 8 10)(79).
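
Both recipes for the inverse can be compared on the example above. This is a small sketch with our own helper names (inverse, from_cycles); from_cycles assumes the cycles it is given are disjoint.

```python
def inverse(p):
    """Inverse of p (tuple with p[i-1] = image of i): read the table from bottom to top."""
    q = [0] * len(p)
    for i, j in enumerate(p, start=1):
        q[j - 1] = i              # sigma(i) = j  <=>  sigma^{-1}(j) = i
    return tuple(q)

def from_cycles(cycle_list, n):
    """Permutation of {1..n} given as a product of disjoint cycles."""
    p = list(range(1, n + 1))
    for c in cycle_list:
        for k, a in enumerate(c):
            p[a - 1] = c[(k + 1) % len(c)]   # each entry goes to the next one, the last to the first
    return tuple(p)

sigma = (3, 4, 2, 1, 10, 6, 9, 5, 7, 8)
print(inverse(sigma))
print(from_cycles([(1, 4, 2, 3), (5, 8, 10), (7, 9)], 10))
```

The two printed tuples should agree, which is the statement that (1423)(5 8 10)(79) is the inverse of (1324)(5 10 8)(79).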

26.3. The Dihedral group. Consider a regular polygon with n sides, n ≥ 3, in the plane, symmetric around the origin (0, 0). The dihedral group Dn is defined as the symmetries of the polygon. Let us number the vertices of the polygon by 1, 2, . . . , n in the clockwise direction and say the first vertex 1 lies on the x-axis. One sees that every symmetry must permute the vertices and in fact either maintains or reverses their order (we allow flip-over of the polygon). In fact, if σ is a symmetry then σ(1) = j and σ(2) = j + 1 or j − 1 (where we understand n + 1 as 1 and 1 − 1 as n) and σ is uniquely determined by these conditions. For example, the permutation y given by the cycle (1 2 3 · · · n) is an element of the dihedral group rotating the polygon by angle 360°/n in the clockwise direction (so if t is a point on the boundary of the polygon such that the line from (0, 0) to t forms an angle θ with the x-axis, then y(t) is the point forming an angle θ − 360°/n).

[Figure 1. The dihedral g...]

Another symmetry, x, is reflection through the x-axis. The symmetry x is given as a permutation by the product (2 n)(3 n−1) · · · (n/2 2+n/2) if n is even and (2 n)(3 n−1) · · · ((n+1)/2 1+(n+1)/2) if n is odd. In terms of angles, x changes an angle θ to −θ.
Theorem 26.3.1. The elements of the dihedral group are
Dn = {e, y, . . . , y^{n−1}, x, yx, y^2x, . . . , y^{n−1}x},
and the relations x^2 = y^n = 1 and xyxy = 1 hold. In particular, Dn has 2n elements.
Proof. It is enough to show that for any vertex j ∈ {1, 2, . . . , n} there is a unique element of the set
{e, y, . . . , y^{n−1}, x, yx, y^2x, . . . , y^{n−1}x}
that takes 1 to j and 2 to j + 1 and there is a unique element taking 1 to j and 2 to j − 1. This shows both that every element of Dn is in the list and that all elements of the list are different.
We calculate that
y^a(1) = a + 1, y^a(2) = a + 2,
and
y^a x(1) = a + 1, y^a x(2) = y^a(n) = a
(vertex labels being read modulo n, as above). This proves our claims.
The relations x^2 = y^n = 1 are evident. We check that xyxy = 1, by checking that xyxy(j) = j for j = 1, 2.
We have xyxy(1) = xyx(2) = xy(n) = x(1) = 1 and xyxy(2) = xyx(3) = xy(n − 1) = x(n) = 2.
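
One can also verify the theorem numerically for small n by realizing y and x as permutations of the vertices. The sketch below is ours (the function dihedral and the tuple encoding p[i-1] = image of i are assumptions of the illustration): it builds the 2n elements y^a and y^a x and checks the relations.

```python
def compose(f, g):
    """f∘g on tuples p with p[i-1] = image of i (apply g first)."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def dihedral(n):
    """Return (e, y, x, elements) for D_n acting on the vertices 1..n, n >= 3."""
    e = tuple(range(1, n + 1))
    y = tuple(i % n + 1 for i in range(1, n + 1))             # the cycle (1 2 ... n)
    x = tuple((n + 1 - j) % n + 1 for j in range(1, n + 1))   # j -> n + 2 - j mod n; fixes 1
    rotations = [e]
    for _ in range(n - 1):
        rotations.append(compose(y, rotations[-1]))           # e, y, y^2, ..., y^(n-1)
    reflections = [compose(r, x) for r in rotations]          # x, yx, ..., y^(n-1) x
    return e, y, x, rotations + reflections

for n in (3, 4, 5, 8):
    e, y, x, elements = dihedral(n)
    xy = compose(x, y)
    print(n, len(set(elements)), compose(x, x) == e, compose(xy, xy) == e)
```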

The nature of the symmetries 1, y, . . . , y^{n−1} is clear: y^j rotates clockwise by angle j · 360°/n.

Proposition 26.3.2. Let 0 ≤ j < n. The element y^j x is a reflection through the line forming an angle −j · 360°/2n with the x-axis.
Proof. The symmetry y^j x is not trivial. If it fixes an angle θ it must be reflection through the line with that angle. Note that y^j x sends the angle θ to −θ and then adds −j · 360°/n, so the equation is θ = −θ − j · 360°/n (mod 360°). That is, θ = −j · 360°/2n.

Example 26.3.3. We revisit Example 25.2.6. The technique is rather similar to studying the dihedral group.
The symmetries of solids we considered there preserve orientation (we don’t allow flip-overs through a fourth
dimension). If we number the vertices of a tetrahedron by 1, 2, 3, 4 and the vertices of the cube by 1, 2, . . . , 8,
we may view the groups of symmetries, call them A and B, respectively, as subgroups of S4 and S8, respectively.
If we fix an edge on the tetrahedron, or the cube, then a symmetry is completely determined by its effect
on this edge. The edge, as a moment’s reflection shows, can go to any other edge on the solid and in two
ways. Since a tetrahedron has 6 edges and a cube has 12, we conclude that the group A has 12 elements and
the group B has 24. If we wish we can enumerate the elements of the groups A, B, by listing the permutations
they induce.

27. The theorem of Lagrange

27.1. Cosets. Let H < G be a subgroup of G. A left coset of H in G is a subset of the form

gH := { gh : h ∈ H },

for some g ∈ G. The set gH is called the left coset of g; g is called a representative of the coset gH.

Example 27.1.1. Consider the subgroup H of S3 given by {1, (123), (132)}. Here are some cosets: H =
1H = (123) H = (132) H, (12) H = (13) H = (23) H = {(12), (23), (13)}. We leave the verification to the
reader.

Lemma 27.1.2. Let H be a subgroup of G.

(1) Two left cosets are either equal or disjoint.
(2) Let g1 H, g2 H be two left cosets. The following are equivalent: (i) g1 H = g2 H; (ii) g1 ∈ g2 H;
(iii) g2−1 g1 ∈ H.

Proof. Suppose that g_1H ∩ g_2H ≠ ∅, so for some h_1, h_2 ∈ H we have g_1h_1 = g_2h_2. We prove that g_1H ⊆ g_2H. Let h ∈ H. Then g_1h = ((g_2h_2)h_1^{−1})h = g_2(h_2h_1^{−1}h) ∈ g_2H. By symmetry we also have g_2H ⊆ g_1H and so g_1H = g_2H.

We now prove the equivalence of the assertions (i) - (iii). Suppose (i) holds. Then g_1 = g_1e ∈ g_2H and (ii) holds. Suppose (ii) holds; say g_1 = g_2h. Then g_2^{−1}g_1 = h ∈ H and (iii) holds. Suppose that (iii) holds; g_2^{−1}g_1 = h for some h ∈ H. Then g_1 = g_2h and so g_1H ∩ g_2H ≠ ∅. By what we have proved in the first part, g_1H = g_2H.

Remark 27.1.3. The Lemma and its proof should be compared with Lemma 21.0.3. In fact, since R is an
abelian group and an ideal I is a subgroup, that lemma is a special case of the lemma above.

Corollary 27.1.4. G is a disjoint union of cosets of H. Let {g_i : i ∈ I} be a set of elements of G such that each coset has the form g_iH for a unique g_i. That is, G = ∐_{i∈I} g_iH. Then the {g_i : i ∈ I} are called a complete set of representatives.

In the same manner one defines a right coset of H in G to be a subset of the form Hg = {hg : h ∈ H }
and Lemma 27.1.2 holds for right cosets with the obvious modifications. Two right cosets are either equal
or disjoint and the following are equivalent: (i) Hg1 = Hg2 ; (ii) g1 ∈ Hg2 ; (iii) g1 g2−1 ∈ H. Thus, the
Corollary holds true for right cosets as well.
We remark that the intersection of a left coset and a right coset may be non-empty, yet not a coset itself.
For example, take H = {1, (12)} in S3 . We have the following table.

g        gH                 Hg
1        {1, (12)}          {1, (12)}
(12)     {(12), 1}          {(12), 1}
(13)     {(13), (123)}      {(13), (132)}
(23)     {(23), (132)}      {(23), (123)}
(123)    {(123), (13)}      {(123), (23)}
(132)    {(132), (23)}      {(132), (13)}
The table demonstrates that indeed any two left (resp. right) cosets are either equal or disjoint, but the
intersection of a left coset with a right coset may be non-empty and properly contained in both.
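
The table can be recomputed by brute force. In this sketch (our own encoding: a permutation is the bottom row of its table, and products are compositions with the right factor applied first, as in §26.1) we list the left and right cosets of H = {1, (12)} in S3.

```python
from itertools import permutations

def compose(f, g):
    """f∘g on tuples p with p[i-1] = image of i (apply g first)."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

S3 = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]            # the identity and the transposition (12)

def left_coset(g):
    return frozenset(compose(g, h) for h in H)    # gH

def right_coset(g):
    return frozenset(compose(h, g) for h in H)    # Hg

left = {left_coset(g) for g in S3}
right = {right_coset(g) for g in S3}
print(len(left), len(right), left == right)

t13 = (3, 2, 1)                        # the transposition (13)
print(left_coset(t13) & right_coset(t13))   # non-empty, but smaller than either coset
```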

27.2. Lagrange’s theorem.

Theorem 27.2.1. Let G be a finite group and H a subgroup of G. Then,
|H| divides |G|.
Moreover, let {g_i : i ∈ I} be a complete set of representatives for the cosets of H, then |I| = |G|/|H|. In particular, the cardinality of I does not depend on the choice of a complete set of representatives. It is called the index of H in G.

Proof. We have,
G = ∐_{i∈I} g_iH.
Let a, b ∈ G. We claim that the function
f : aH → bH, x ↦ ba^{−1}x,
is a well defined bijection. First, x = ah for some h and so ba^{−1}x = bh ∈ bH and so the map is well defined. It is surjective, because given an element y ∈ bH, say y = bh, it is the image of ah. The map is also injective: if ba^{−1}x_1 = ba^{−1}x_2 then multiplying both sides by ab^{−1} we get x_1 = x_2.
We conclude that each coset gi H has the same number of elements, which is exactly the number of
elements in H = eH. We get therefore that
| G | = | H | · | I |.
That completes the proof.
Here are some applications of Lagrange’s theorem:
(1) Let G be a finite group of prime order p. Then G is cyclic; in fact, every element of G that is not the identity generates G.
Indeed, let g ≠ e. Then H = ⟨g⟩ is a non-trivial subgroup. So |H| > 1 and divides p. It follows that |H| = |G| and so that ⟨g⟩ = G.
(2) In a similar vein, we conclude that a group of order 6 say, cannot have elements of order 4, or 5, or of any order not dividing 6. This follows immediately from Lagrange’s theorem, keeping in mind that ord(g) = |⟨g⟩|.
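
Both applications can be observed in the additive groups Z/nZ. The following sketch (the name additive_order is ours) computes orders by repeated addition, without using Lagrange’s theorem, so that the divisibility can be checked independently.

```python
def additive_order(a, n):
    """Order of a in the additive group Z/nZ, found by repeated addition."""
    k, x = 1, a % n
    while x != 0:
        x = (x + a) % n
        k += 1
    return k

# In a group of order 6 every order divides 6.
print([additive_order(a, 6) for a in range(6)])

# In a group of prime order 7 every non-identity element has order 7, i.e. generates.
print([additive_order(a, 7) for a in range(1, 7)])
```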

28. Homomorphisms and isomorphisms

28.1. Homomorphisms of groups.
Definition 28.1.1. Let G, H be groups and
f : G → H,
a function. The function f is called a group homomorphism, if
f ( g1 g2 ) = f ( g1 ) f ( g2 ) , ∀ g1 , g2 ∈ G.
In that case, we define the kernel of f as:
Ker( f ) = { g ∈ G : f ( g) = e H }.
Lemma 28.1.2. Let f : G → H be a group homomorphism. Then:
(1) f (eG ) = e H ;
(2) f ( g−1 ) = f ( g)−1 ;
(3) The image of f is a subgroup of H.
Proof. We have f (eG ) = f (eG eG ) = f (eG ) f (eG ). Multiplying (in H) both sides by f (eG )−1 we find e H =
f (eG ). Now, e H = f (eG ) = f ( gg−1 ) = f ( g) f ( g−1 ), which shows that f ( g−1 ) = f ( g)−1 .
Finally, we show that Im( f ) is a subgroup of H. Note that e H = f (eG ) ∈ Im( f ). If h1 , h2 ∈ Im( f ),
say hi = f ( gi ) then h1 h2 = f ( g1 g2 ) and h1−1 = f ( g1−1 ). This shows that h1 h2 , h1−1 ∈ Im( f ).
Proposition 28.1.3. Let f : G → H be a group homomorphism. Ker( f ) is a subgroup of G. The homomor-
phism f is injective if and only if Ker( f ) = {eG }.
Proof. First, we proved that f(e_G) = e_H and so e_G ∈ Ker(f). Next, if g_1, g_2 ∈ Ker(f) then f(g_1) = f(g_2) = e_H and so f(g_1g_2) = f(g_1)f(g_2) = e_H e_H = e_H. Therefore, g_1g_2 ∈ Ker(f). Finally, if g ∈ Ker(f) then f(g^{−1}) = f(g)^{−1} = e_H^{−1} = e_H and so g^{−1} ∈ Ker(f) as well.

Suppose f is injective. Then, since f(e_G) = e_H, e_G is the only element mapping to e_H and so Ker(f) = {e_G}. Conversely, suppose Ker(f) = {e_G} and f(g_1) = f(g_2). Then e_H = f(g_1)^{−1}f(g_2) = f(g_1^{−1})f(g_2) = f(g_1^{−1}g_2). That means that g_1^{−1}g_2 ∈ Ker(f) and so g_1^{−1}g_2 = e_G. That is, g_1 = g_2.

28.2. Isomorphism.
Definition 28.2.1. A group homomorphism f : G → H is called an isomorphism if it is bijective.
As in the case of rings, one verifies that if f is an isomorphism, the inverse function g = f −1 is automatically
a homomorphism and so an isomorphism as well. Also, one easily checks that a composition of group
homomorphisms is a group homomorphism. It follows that being isomorphic is an equivalence relation on
groups. Cf. §22.1.
Example 28.2.2. Let n be a positive integer. Any two cyclic groups of order n are isomorphic.
Indeed, suppose that G = ⟨g⟩, H = ⟨h⟩ are cyclic groups of order n. Define, for any integer a,
f(g^a) = h^a.
This is well defined; if g^a = g^b then g^{a−b} = e_G and so n|(a − b). Thus, a = b + kn and f(g^a) = h^a = h^b(h^n)^k = h^b = f(g^b). Obviously f is a surjective homomorphism; f is also injective, because f(g^a) = h^a = e_H implies that n|a and so g^a = e_G.
In particular, we conclude that any cyclic group of order n is isomorphic to the group Z/nZ (with the
group operation being addition).
Example 28.2.3. Let p be a prime number then any two groups of order p are isomorphic. Indeed, we have
seen that such groups are necessarily cyclic.

Theorem 28.2.4. (Cayley) Let G be a finite group of order n then G is isomorphic to a subgroup of Sn .

Proof. Let g ∈ G and let

σg : G → G, σg ( a) = ga.
We claim that σg is a permutation. It is injective, because σg ( a) = σg (b) ⇒ ga = gb ⇒ a = b. It is
surjective, because for any b ∈ G, σg ( g−1 b) = b.
Identifying the permutations of G with Sn (just call the elements of G, 1, 2, 3, . . . ), we get a map

G → Sn, g ↦ σ_g.

This map is a homomorphism of groups: σgh ( a) = gha = σg (σh ( a)). That is, σgh = σg ◦ σh . This homomor-
phism is injective: if σg is the identity permutation then σg (e) = e and that implies ge = e, that is g = e. We
get that G is isomorphic to its image, which is a subgroup of Sn , under this homomorphism.
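
To see the proof in action, here is a minimal sketch for one small group, (Z/8Z)× with multiplication mod 8. The function name cayley_embedding and the labelling of the elements by 1, 2, 3, 4 in the listed order are choices of this illustration only.

```python
def compose(f, g):
    """f∘g on tuples p with p[i-1] = image of i (apply g first)."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def cayley_embedding(elements, op):
    """Send g to the permutation a -> g*a, written on the labels 1..n of the elements."""
    label = {a: i + 1 for i, a in enumerate(elements)}
    return {g: tuple(label[op(g, a)] for a in elements) for g in elements}

G = [1, 3, 5, 7]                       # the units of Z/8Z

def mul(a, b):
    return (a * b) % 8

sigma = cayley_embedding(G, mul)
for g in G:
    print(g, sigma[g])

# Homomorphism property sigma_{gh} = sigma_g ∘ sigma_h, and injectivity.
print(all(compose(sigma[g], sigma[h]) == sigma[mul(g, h)] for g in G for h in G))
print(len(set(sigma.values())) == len(G))
```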

Remark 28.2.5. We were somewhat informal about identifying the permutations of G with Sn . A more
rigorous approach is the following.

Lemma 28.2.6. Let T, Z be sets and f : T → Z a bijection. The groups of permutations of T and of Z are isomorphic.

Proof. Let σ ∈ S_T, a permutation of T. Then f ◦ σ ◦ f^{−1} is a function from Z to itself, and being a composition of bijections is a bijection itself. We shall write more simply fσf^{−1} for f ◦ σ ◦ f^{−1}. We therefore get a function
S_T → S_Z, σ ↦ fσf^{−1}.
We claim that σ ↦ fσf^{−1} is a homomorphism. Indeed, given σ_1, σ_2 ∈ S_T we have fσ_1σ_2f^{−1} = (fσ_1f^{−1})(fσ_2f^{−1}).
Moreover, it is easy to write an inverse to this homomorphism,

S_Z → S_T, τ ↦ f^{−1}τf.

Therefore, we found a bijective homomorphism S_T → S_Z, which shows those two permutation groups are isomorphic.

29. Group actions on sets

29.1. Basic definitions. Let G be a group and let S be a non-empty set. We say that G acts on S if we are given a function
G × S → S, (g, s) ↦ g ⋆ s,
such that:
(i) e ⋆ s = s for all s ∈ S;
(ii) (g_1g_2) ⋆ s = g_1 ⋆ (g_2 ⋆ s) for all g_1, g_2 ∈ G and s ∈ S.

Given an action of G on S we can define the following sets. Let s ∈ S. Define the orbit of s

Orb(s) = {g ⋆ s : g ∈ G}.

Note that Orb(s) is a subset of S, equal to all the images of s under the action of the group G. We also define the stabilizer of s to be
Stab(s) = {g ∈ G : g ⋆ s = s}.
Note that Stab(s) is a subset of G. In fact, it is a subgroup, as Lemma 29.2.1 states.

29.2. Basic properties.

Lemma 29.2.1. (1) Let s_1, s_2 ∈ S. We say that s_1 is related to s_2, i.e., s_1 ∼ s_2, if there exists g ∈ G such that g ⋆ s_1 = s_2. This is an equivalence relation. The equivalence class of s_1 is its orbit Orb(s_1).
(2) Let s ∈ S. The set Stab(s) is a subgroup of G.
(3) Suppose that both G and S have finitely many elements. Then
|Orb(s)| = |G| / |Stab(s)|.
Proof. (1) We need to show that this relation is reflexive, symmetric and transitive. First, we have e ⋆ s = s and hence s ∼ s, meaning the relation is reflexive. Second, if s_1 ∼ s_2 then for a suitable g ∈ G we have g ⋆ s_1 = s_2. Therefore, g^{−1} ⋆ (g ⋆ s_1) = g^{−1} ⋆ s_2 and (g^{−1}g) ⋆ s_1 = g^{−1} ⋆ s_2. It follows that e ⋆ s_1 = g^{−1} ⋆ s_2 and so s_1 = g^{−1} ⋆ s_2, which implies that s_2 ∼ s_1.
It remains to show the relation is transitive. If s_1 ∼ s_2 and s_2 ∼ s_3 then for suitable g_1, g_2 ∈ G we have g_1 ⋆ s_1 = s_2 and g_2 ⋆ s_2 = s_3. Therefore, (g_2g_1) ⋆ s_1 = g_2 ⋆ (g_1 ⋆ s_1) = g_2 ⋆ s_2 = s_3, and hence s_1 ∼ s_3.
Moreover, by its very definition, the equivalence class of an element s_1 of S is all the elements of the form g ⋆ s_1 for some g ∈ G, namely, Orb(s_1).
(2) Let H = Stab(s). We have to show that: (i) e ∈ H, (ii) if g_1, g_2 ∈ H then g_1g_2 ∈ H, and (iii) if g ∈ H then g^{−1} ∈ H.
First, by the definition of group action, we have e ⋆ s = s. Therefore, e ∈ H. Next, suppose that g_1, g_2 ∈ H, i.e., g_1 ⋆ s = s and g_2 ⋆ s = s. Then, (g_1g_2) ⋆ s = g_1 ⋆ (g_2 ⋆ s) = g_1 ⋆ s = s, which proves that g_1g_2 ∈ H. Finally, if g ∈ H then g ⋆ s = s and so g^{−1} ⋆ (g ⋆ s) = g^{−1} ⋆ s. That is, (g^{−1}g) ⋆ s = g^{−1} ⋆ s and so e ⋆ s = g^{−1} ⋆ s, or s = g^{−1} ⋆ s, and therefore g^{−1} ∈ H.
(3) We claim that there exists a bijection between the left cosets of H = Stab(s) and the orbit of s. If we show that, then by Lagrange’s theorem,
|Orb(s)| = no. of left cosets of H = index of H = |G|/|H|.
Define a function
φ : {left cosets of H} → Orb(s),
by
φ(gH) = g ⋆ s.
We claim that φ is a well-defined bijection.
Well-defined: Suppose that g_1H = g_2H. We need to show that the rule φ gives the same result whether we take the representative g_1 or the representative g_2 of the coset, that is, we need to show g_1 ⋆ s = g_2 ⋆ s. Note that g_1^{−1}g_2 ∈ H, i.e., (g_1^{−1}g_2) ⋆ s = s. We get g_1 ⋆ s = g_1 ⋆ ((g_1^{−1}g_2) ⋆ s) = (g_1(g_1^{−1}g_2)) ⋆ s = g_2 ⋆ s.
φ is surjective: Let t ∈ Orb(s); then t = g ⋆ s for some g ∈ G. Thus, φ(gH) = g ⋆ s = t, and we get that φ is surjective.
φ is injective: Suppose that φ(g_1H) = φ(g_2H). We need to show that g_1H = g_2H. Indeed, φ(g_1H) = φ(g_2H) implies g_1 ⋆ s = g_2 ⋆ s and so that g_2^{−1} ⋆ (g_1 ⋆ s) = g_2^{−1} ⋆ (g_2 ⋆ s); that is, (g_2^{−1}g_1) ⋆ s = e ⋆ s = s. Therefore, g_2^{−1}g_1 ∈ Stab(s) = H and hence g_1H = g_2H.
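
The formula in part (3) can be tested on a concrete action. As an illustration of our own choosing (it is not one of the examples in the notes), let S4 act on the subsets of {1, 2, 3, 4} by applying the permutation to each element; the sketch below computes the orbit and stabilizer of the subset {1, 2}.

```python
from itertools import permutations

G = list(permutations(range(1, 5)))      # S_4, as tuples p with p[i-1] = image of i

def act(p, s):
    """Action of p on a subset s of {1, 2, 3, 4}: apply p to each element."""
    return frozenset(p[i - 1] for i in s)

def orbit(s):
    return {act(p, s) for p in G}

def stabilizer(s):
    return [p for p in G if act(p, s) == s]

s = frozenset({1, 2})
print(len(orbit(s)), len(stabilizer(s)), len(G))
```

The product of the first two printed numbers should be the third, as the lemma predicts.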
Corollary 29.2.2. The set S is a disjoint union of orbits.
Proof. The orbits are the equivalence classes of the equivalence relation ∼ defined in Lemma 29.2.1. Any
equivalence relation partitions the set into disjoint equivalence classes.

29.3. Some examples.

Example 29.3.1. Let G = Sn, the symmetric group on T = {1, 2, . . . , n}. The group G acts on T in a natural way:
σ ⋆ i := σ(i).

The stabilizer of i is the permutations fixing i. These permutations can be identified with permutations of
the set T − {i }, because σ ∈ Stab(i ) induces a permutation of T − {i } and a permutation of T − {i } can be
extended to a permutation of T sending i to itself. Thus, for every i, Stab(i) ≅ S_{n−1}. The orbit of i is T; for every j ≠ i the transposition (ij) shows that j ∈ Orb(i).
Example 29.3.2. Let G be the group of real numbers R. The group operation is addition. Let S be the
sphere in R3 of radius 1 about the origin. The group R acts by rotating around the z-axis. An element r ∈ R
rotates by r radians. For every point s ∈ S, different from the poles, the stabilizer is 2πZ. For the poles the
stabilizer is R. The orbit of every point is the circle of latitude on which it lies (for a pole, the orbit is the pole itself).
Example 29.3.3. Let G be a group and H a subgroup of G. Then H acts on G by
H × G → G, (h, g) 7→ hg.
Here H plays the role of the group and G the role of the set in the definition. This is indeed a group
action: e_H g = g for all g ∈ G, because by definition e_H = e_G. Also, h_1(h_2g) = (h_1h_2)g is nothing but the associative law.
The orbit of g ∈ G is
Orb( g) = {hg : h ∈ H } = Hg.
That is, the orbits are the right cosets of H. We have that G is a disjoint union of orbits, namely, a disjoint
union of cosets. The stabilizer of any element g ∈ G is {e}. The formula we have proven, |Orb(g)| =
| H |/|Stab( g)|, gives us | Hg| = | H | for any g ∈ G, and we see that we have another point of view on
Lagrange’s theorem.
Example 29.3.4. We consider a roulette with n sectors and write n = i1 + · · · + ik , for some positive (and
fixed) integers i1 , . . . , ik . We suppose we have different colors c1 , . . . , ck and we color i1 sectors of the roulette
by the color c1 , i2 sectors by the color c2 and so on. The sectors can be chosen as we wish and so there are
many possibilities. We get a set S of colored roulettes. Now, we turn the roulette r steps clockwise, say, and
we get another colored roulette, usually with different coloring. Nonetheless, it is natural to view the two
coloring as the same, since “they only depend on your point of view". We may formalize this by saying that
the group Z/nZ acts on S; r acts on a colored roulette by turning it r steps clockwise, and by saying that
we are interested in the number of orbits for this action.
Example 29.3.5. Let G be the dihedral group $D_8$. Recall that G is the group of symmetries of a regular
octagon in the plane,
\[ G = \{e, y, y^2, \dots, y^7, x, yx, y^2x, \dots, y^7x\}, \]
where y is rotation clockwise by angle 2π/8 and x is reflection through the x-axis. We have the relations
\[ x^2 = y^8 = e, \qquad xyxy = e. \]
We let S be the set of colorings of the octagon ( = necklaces laid on the table) having 4 red vertices (rubies)
and 4 green vertices (sapphires). The group G acts on S by its action on the octagon.
For example, the coloring $s_0$, consisting of alternating green and red, is certainly preserved under x and
under $y^2$. Therefore, the stabilizer of $s_0$ contains at least the set of eight elements
\[ \{e, y^2, y^4, y^6, x, y^2x, y^4x, y^6x\}. \tag{4} \]
Remember that the stabilizer is a subgroup and, by Lagrange’s theorem, of order dividing 16 = |G|. On the
other hand, $\mathrm{Stab}(s_0) \neq G$ because $y \notin \mathrm{Stab}(s_0)$. It follows that the stabilizer has exactly 8 elements and is
equal to the set in (4).
Let H be the stabilizer of $s_0$. According to Lemma 29.2.1 the orbit of $s_0$ is in bijection with the left cosets
of $H = \{e, y^2, y^4, y^6, x, y^2x, y^4x, y^6x\}$. By Lagrange’s theorem there are two cosets. For example, H and yH
are distinct cosets, because $y \notin H$. The proof of Lemma 29.2.1 tells us how to find the orbit: it is the set $\{s_0, ys_0\}$, which
is of course quite clear if you think about it.
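The stabilizer and orbit of the alternating coloring can also be found by brute force. This is a sketch under the assumption that the vertices are labelled 0, ..., 7 and that the 16 symmetries are the maps i ↦ k + i and i ↦ k − i (mod 8); the labelling and the names `group`, `act`, `stab` and `orb` are choices made here for illustration.

```python
n = 8

# The 16 symmetries of the octagon as maps on vertex labels 0..7:
# rotations i -> k + i and reflections i -> k - i (all mod 8).
group = [tuple((k + i) % n for i in range(n)) for k in range(n)] + \
        [tuple((k - i) % n for i in range(n)) for k in range(n)]

def act(g, s):
    """Move the color sitting at vertex i to vertex g[i]."""
    t = [None] * n
    for i in range(n):
        t[g[i]] = s[i]
    return tuple(t)

s0 = tuple("GR"[i % 2] for i in range(n))   # alternating green / red

stab = [g for g in group if act(g, s0) == s0]
orb = {act(g, s0) for g in group}

print(len(stab), len(orb))
```

The product of the two sizes equals |G| = 16, in agreement with |Orb(s)| = |G|/|Stab(s)|.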

30. The Cauchy-Frobenius Formula

Theorem 30.0.1. (CFF)$^{21}$ Let G be a finite group acting on a finite set S. Let N be the number of orbits
of G in S. Define
\[ I(g) = |\{s \in S : g \star s = s\}| \]
(the number of elements of S fixed by the action of g). Then$^{22}$
\[ N = \frac{1}{|G|} \sum_{g \in G} I(g). \]

Remark 30.0.2. Note that I ( g) is the number of fixed points for the action of g on S. Thus, the CFF can
be interpreted as saying that the number of orbits is the average number of fixed points (though this does
not make the assertion more obvious).
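Proof. We define a function
\[ T : G \times S \to \{0, 1\}, \qquad T(g, s) = \begin{cases} 1 & g \star s = s, \\ 0 & g \star s \neq s. \end{cases} \]
Note that for a fixed g ∈ G we have
\[ I(g) = \sum_{s \in S} T(g, s), \]
and that for a fixed s ∈ S we have
\[ |\mathrm{Stab}(s)| = \sum_{g \in G} T(g, s). \]
Let us fix representatives $s_1, \dots, s_N$ for the N disjoint orbits of G in S. Now,
\begin{align*}
\sum_{g \in G} I(g) &= \sum_{g \in G} \Big( \sum_{s \in S} T(g, s) \Big) = \sum_{s \in S} \Big( \sum_{g \in G} T(g, s) \Big) \\
&= \sum_{s \in S} |\mathrm{Stab}(s)| = \sum_{s \in S} \frac{|G|}{|\mathrm{Orb}(s)|} \\
&= \sum_{i=1}^{N} \sum_{s \in \mathrm{Orb}(s_i)} \frac{|G|}{|\mathrm{Orb}(s)|} = \sum_{i=1}^{N} \sum_{s \in \mathrm{Orb}(s_i)} \frac{|G|}{|\mathrm{Orb}(s_i)|} \\
&= \sum_{i=1}^{N} \frac{|G|}{|\mathrm{Orb}(s_i)|} \cdot |\mathrm{Orb}(s_i)| = \sum_{i=1}^{N} |G| \\
&= N \cdot |G|.
\end{align*}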
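The formula can be sanity-checked numerically. The sketch below assumes a concrete action chosen only for illustration: Z/6Z rotating the 2-colorings of 6 sectors. It counts the orbits directly and compares with the average number of fixed points; `rotate`, `I`, `N_direct` and `N_cff` are names introduced here.

```python
from itertools import product
from fractions import Fraction

n = 6
S = list(product("ab", repeat=n))        # all 2-colorings of n sectors

def rotate(s, r):
    """Action of r in Z/nZ: turn the coloring r steps."""
    k = (-r) % n
    return s[k:] + s[:k]

def I(r):
    """Number of colorings fixed by r."""
    return sum(1 for s in S if rotate(s, r) == s)

# Direct count: collect each orbit as a set of colorings.
orbits = {frozenset(rotate(s, r) for r in range(n)) for s in S}
N_direct = len(orbits)

# CFF: average number of fixed points.
N_cff = Fraction(sum(I(r) for r in range(n)), n)

print(N_direct, N_cff)
```

Both counts agree (they equal 14 for this action), which is what the theorem asserts.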

z
Remark 30.0.3. If N, the number of orbits, is equal to 1 we say that G acts transitively on S. It means exactly
that: for every $s_1, s_2 \in S$ there exists g ∈ G such that $g \star s_1 = s_2$. Note that if G and S are finite and G
acts transitively, then the number of elements in S divides the number of elements in G,
\[ |S| \;\big|\; |G|, \]
because, if S = Orb(s) then |S| = |G|/|Stab(s)|.
Corollary 30.0.4. Let G be a finite group acting transitively on a finite set S. Suppose that |S| > 1. Then
there exists g ∈ G without fixed points.
21 This is also sometimes called Burnside’s formula.
22 The sum appearing in the formula means just that: if you write $G = \{g_1, \dots, g_n\}$ then $\sum_{g \in G} I(g)$ is $\sum_{i=1}^{n} I(g_i) = I(g_1) + I(g_2) + \cdots + I(g_n)$. The double summation $\sum_{g \in G} \sum_{s \in S} T(g, s)$ appearing in the proof means that if we write $S = \{s_1, \dots, s_m\}$ then the
double sum is $T(g_1, s_1) + T(g_1, s_2) + \cdots + T(g_1, s_m) + T(g_2, s_1) + T(g_2, s_2) + \cdots + T(g_2, s_m) + \cdots + T(g_n, s_1) + T(g_n, s_2) + \cdots + T(g_n, s_m)$.

Proof. By contradiction. Suppose that every g ∈ G has a fixed point in S. That is, suppose that for
every g ∈ G we have
\[ I(g) \geq 1. \]
Since I(e) = |S| > 1 we have that
\[ \sum_{g \in G} I(g) > |G|. \]
By the Cauchy-Frobenius formula, the number of orbits N is greater than 1, contradicting the assumption that G acts transitively.

30.1. Some applications to Combinatorics.

Example 30.1.1. How many roulettes with 11 wedges painted 2 blue, 2 green and 7 red are there when we
allow rotations?

Let S be the set of painted roulettes. Let us enumerate the sectors of a roulette by the numbers 1, . . . , 11.
The set S is a set of $\binom{11}{2}\binom{9}{2} = 1980$ elements (choose which 2 are blue, and then choose out of the
nine left which 2 are green).
Let G be the group Z/11Z. It acts on S by rotations. The element 1 rotates a painted roulette by
angle 2π/11 anti-clockwise. The element n rotates a painted roulette by angle 2nπ/11 anti-clockwise. We
are interested in N, the number of orbits for this action. We use CFF.
The identity element always fixes the whole set. Thus I (0) = 1980. We claim that if 1 ≤ i ≤ 10 then i
doesn’t fix any element of S. We use the following fact that we have proved before: Let G be a finite group
of prime order p. Let g ≠ e be an element of G. Then ⟨g⟩ = G.
Suppose that 1 ≤ i ≤ 10 and i fixes s. Then so does ⟨i⟩ = Z/11Z (the stabilizer is a subgroup). But any
coloring fixed under rotation by 1 must be single colored! Contradiction.
Applying CFF we get
\[ N = \frac{1}{11} \sum_{n=0}^{10} I(n) = \frac{1}{11} \cdot 1980 = 180. \]
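The count can be confirmed by brute force. This is an illustrative sketch, not the method of the notes: it lists all colorings and picks, for each, the lexicographically smallest rotation as a canonical representative of its orbit. The function names `colorings` and `count_orbits` are assumptions made here.

```python
from itertools import combinations

def colorings(n, blue, green):
    """All colorings of n sectors with `blue` B's, `green` G's and the rest R."""
    out = []
    for bs in combinations(range(n), blue):
        rest = [i for i in range(n) if i not in bs]
        for gs in combinations(rest, green):
            s = ["R"] * n
            for i in bs:
                s[i] = "B"
            for i in gs:
                s[i] = "G"
            out.append(tuple(s))
    return out

def count_orbits(n, blue, green):
    """Return (|S|, number of orbits of S under rotation)."""
    S = colorings(n, blue, green)
    canonical = {min(s[r:] + s[:r] for r in range(n)) for s in S}
    return len(S), len(canonical)

print(count_orbits(11, 2, 2))
```

For 11 sectors this reports 1980 colorings falling into 180 orbits, matching the CFF computation.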

Example 30.1.2. How many roulettes with 12 wedges painted 2 blue, 2 green and 8 red are there when we
allow rotations?

Let S be the set of painted roulettes. Let us enumerate the sectors of a roulette by the numbers 1, . . . , 12.
The set S is a set of $\binom{12}{2}\binom{10}{2} = 2970$ elements (choose which 2 are blue, and then choose out of the
ten left which 2 are green).
Let G be the group Z/12Z. It acts on S by rotations. The element 1 rotates a painted roulette by
angle 2π/12 anti-clockwise. The element n rotates a painted roulette by angle 2nπ/12 anti-clockwise. We
are interested in N, the number of orbits for this action. We use CFF.
The identity element always fixes the whole set. Thus I(0) = 2970. We claim that if 1 ≤ i ≤ 11 and i ≠ 6
then i doesn’t fix any element of S. Indeed, suppose that i fixes a painted roulette. Say in that roulette
the r-th sector is blue. Then so must be the (r + i)-th sector (because the r-th sector goes under the action of i
to the (r + i)-th sector). Therefore so must be the (r + 2i)-th sector. But there are only 2 blue sectors! The only
possibility is that the (r + 2i)-th sector is the same as the r-th sector, namely, i = 6.
If i is equal to 6 and we enumerate the sectors of a roulette by the numbers 1, . . . , 12 we may write i as
the permutation
(1 7)(2 8)(3 9)(4 10)(5 11)(6 12).
In any coloring fixed by i = 6 the colors of the pairs (1 7), (2 8), (3 9), (4 10), (5 11) and (6 12) must be the
same. We may choose one pair for blue, one pair for green. The rest would be red. Thus there are 30 = 6 · 5
possible choices. We summarize:

element g              I(g)
0                      2970
1 ≤ i ≤ 11, i ≠ 6      0
i = 6                  30
Applying CFF we get that there are
\[ N = \frac{1}{12}(2970 + 30) = 250 \]
different roulettes.
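The table of fixed-point counts can be recomputed mechanically. The following sketch assumes sectors labelled 0, ..., 11 and rotation implemented as a cyclic shift of a tuple; `fixed`, `table` and `N` are names introduced here for illustration.

```python
from itertools import combinations

n = 12

# All colorings with 2 blue (B), 2 green (G) and 8 red (R) sectors.
S = set()
for bs in combinations(range(n), 2):
    rest = [i for i in range(n) if i not in bs]
    for gs in combinations(rest, 2):
        S.add(tuple("B" if i in bs else "G" if i in gs else "R" for i in range(n)))

def fixed(r):
    """I(r): number of colorings unchanged by a rotation of r steps."""
    return sum(1 for s in S if s[r:] + s[:r] == s)

table = {r: fixed(r) for r in range(n)}
N = sum(table.values()) // n

print(table, N)
```

Only the rotations by 0 and by 6 steps have fixed colorings, and the average over the 12 group elements gives the number of orbits.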

Example 30.1.3. In this example S is the set of necklaces made of four rubies and four sapphires laid on the
table. We ask how many necklaces there are when we allow rotations and flipping-over. We may think of S
as the colorings of a regular octagon in which four vertices are green and four are red. The group $G = D_8$ acts on S
and we are interested in the number of orbits for the group G. The results are the following:

element g                      I(g)
e                              70
$y, y^3, y^5, y^7$             0
$y^2, y^6$                     2
$y^4$                          6
$xy^i$ for i = 0, . . . , 7    6

We explain how the entries in the table are obtained:

The identity always fixes the whole set S. The number of elements in S is $\binom{8}{4} = 70$ (choosing which 4
would be green).
The element y cannot fix any coloring, because any coloring fixed by y must have all sections of the same
color (because y = (1 2 3 4 5 6 7 8)). If $y^r$ fixes a coloring $s_0$ so does $(y^r)^r = y^{(r^2)}$, because the stabilizer is
a subgroup. Apply that for r = 3, 5, 7 to see that if $y^r$ fixes a coloring so does y, which is impossible.$^{23}$
Now, $y^2$, written as a permutation, is (1 3 5 7)(2 4 6 8). We see that if, say, 1 is green so are 3, 5, 7 and
the rest must be red. That is, all the freedom we have is to choose whether the cycle (1 3 5 7) is green or
red. This gives us two colorings fixed by $y^2$. The same rationale applies to $y^6$ = (8 6 4 2)(7 5 3 1).
Consider now $y^4$. It may be written in permutation notation as (1 5)(2 6)(3 7)(4 8). In any coloring fixed
by $y^4$ each of the cycles (1 5), (2 6), (3 7) and (4 8) must be single colored. There are thus $\binom{4}{2} = 6$
possibilities (choosing which 2 out of the four cycles would be green).
It remains to deal with the elements $xy^i$. We recall that these are all reflections. There are two kinds of
reflections. One may be written using permutation notation as
\[ (i_1\ i_2)(i_3\ i_4)(i_5\ i_6) \]
(with the other two vertices being fixed; for example, x = (2 8)(3 7)(4 6) is of this form). The other kind is
of the form
\[ (i_1\ i_2)(i_3\ i_4)(i_5\ i_6)(i_7\ i_8). \]

23 $y^{(3^2)} = y^9 = y$ because $y^8 = e$, etc.


(For example xy = (1 8)(2 7)(3 6)(4 5) is of this sort). Whichever the case, one uses similar reasoning to
deduce that there are 6 colorings preserved by a reflection.

One needs only apply CFF to get that there are
\[ N = \frac{1}{16}(70 + 2 \cdot 2 + 6 + 8 \cdot 6) = 8 \]
distinct necklaces.
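The whole table, and the final count, can be reproduced by enumeration. In this sketch a coloring is recorded as the set of its 4 green vertices, with vertices labelled 0, ..., 7 and the dihedral group realized as the maps i ↦ k + i and i ↦ k − i (mod 8); these conventions and the names `fixed_counts`, `N_cff` and `N_direct` are assumptions made for illustration.

```python
from itertools import combinations

n = 8

# Rotations i -> k + i (first 8 entries) and reflections i -> k - i (last 8).
group = [tuple((k + i) % n for i in range(n)) for k in range(n)] + \
        [tuple((k - i) % n for i in range(n)) for k in range(n)]

# A coloring is determined by which 4 of the 8 vertices are green.
S = [frozenset(c) for c in combinations(range(n), 4)]

def act(g, s):
    """Image of the set of green vertices under the symmetry g."""
    return frozenset(g[i] for i in s)

fixed_counts = [sum(1 for s in S if act(g, s) == s) for g in group]
N_cff = sum(fixed_counts) // len(group)
N_direct = len({frozenset(act(g, s) for g in group) for s in S})

print(fixed_counts, N_cff, N_direct)
```

The rotation counts come out as 70, 0, 2, 0, 6, 0, 2, 0 and every reflection fixes 6 colorings, so both the CFF average and the direct orbit count give 8.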
Example 30.1.4. Consider a tetrahedron with faces marked 1, 2, 3, 4. It takes a little thinking, but one can
see that each symmetry is determined by its action on the faces. As we have done previously, consider
symmetries of the tetrahedron that preserve orientation. We noted before (Example 26.3.3) that there are
12 such symmetries, but here we want a more explicit description. For example, (123) is such a symmetry;
it rotates the tetrahedron relative to the plane on which the face 4 lies. On the other hand, (12) is not
such a symmetry. We conclude that the symmetries that preserve orientation are a subgroup of $S_4$ that
is not equal to $S_4$. Let us call this subgroup $A_4$ (later on, we shall define the groups $A_n$ in general and
our notation is consistent). Clearly $A_4$ contains {1, (123), (132), (234), (243), (134), (143), (124), (142)} and
since it is closed under multiplication (being a subgroup) also (132)(134) = (12)(34) and similarly, (13)(24)
and (14)(23), are elements of $A_4$. We have already identified 12 elements in $A_4$ and since the order of $A_4$
divides the order of $S_4$, which is 24, and $A_4 \neq S_4$, the group $A_4$ is in fact equal to
{1, (123), (132), (234), (243), (134), (143), (124), (142), (12)(34), (13)(24), (14)(23)}.
Let us count how many colorings of the faces of the tetrahedron there are using 4 distinct colors, each once.
The number of colorings is 4! = 24 (choose for each face its color). No symmetry but the identity preserves
a coloring and so by CFF we get that the number of colorings up to $A_4$ identifications is 24/12 = 2.
Suppose now we want to color with 2 colors, say red and blue, painting two faces red and two faces blue.
The total number of colorings is $\binom{4}{2} = 6$. In this case, a three cycle cannot fix a coloring, while each
permutation of the type (ab)(cd) fixes exactly two colorings (choose if the faces a, b are both red or both
blue). Therefore, the number of colorings up to symmetries is
\[ N = \frac{1}{12}(6 + 3 \times 2) = 1. \]
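Both tetrahedron counts can be verified by listing orbits. The sketch below assumes faces labelled 0, ..., 3 and takes $A_4$ to be the permutations with an even number of inversions, which agrees with the list of 12 elements above; `is_even`, `act` and `count_orbits` are illustrative names.

```python
from itertools import permutations

def is_even(p):
    """True if the permutation p (a tuple on 0..n-1) has an even number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

A4 = [p for p in permutations(range(4)) if is_even(p)]

def act(p, coloring):
    """Face i carries coloring[i]; after applying p, face p[i] carries it."""
    out = [None] * 4
    for i in range(4):
        out[p[i]] = coloring[i]
    return tuple(out)

def count_orbits(colorings):
    """Number of orbits of A4 on the given set of face colorings."""
    return len({frozenset(act(p, c) for p in A4) for c in colorings})

distinct = set(permutations("abcd"))    # four different colors, each used once
two_two = set(permutations("RRBB"))     # two red faces and two blue faces

print(len(A4), count_orbits(distinct), count_orbits(two_two))
```

This reproduces the values 2 and 1 obtained from CFF.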

31. Cauchy’s theorem: a wonderful proof

One application of group actions is to provide a simple proof of an important theorem in the theory of
finite groups. Every other proof I know is very complicated.
Theorem 31.0.1. (Cauchy) Let G be a finite group of order n and let p be a prime dividing n. Then G has
an element of order p.
Proof. Let S be the set consisting of p-tuples $(g_1, \dots, g_p)$ of elements of G, considered up to cyclic permutations. Thus if T is the set of p-tuples $(g_1, \dots, g_p)$ of elements of G, S is the set of orbits for the action
of Z/pZ on T by cyclic shifts. One may therefore apply CFF (the identity fixes all $n^p$ tuples, and each of the p − 1 non-trivial shifts fixes exactly the n constant tuples) and get
\[ |S| = \frac{n^p - n}{p} + n. \]
If $n \mid |S|$, then n divides $(n^p - n)/p$ and so p divides $n^{p-1} - 1$. But p divides n and we get a contradiction.
Thus, $n \nmid |S|$.
Now define an action of G on S. Given g ∈ G and $(g_1, \dots, g_p) \in S$ we define
\[ g(g_1, \dots, g_p) = (gg_1, \dots, gg_p). \]
This is well defined. Since the order of G is n, since $n \nmid |S|$, and since S is a disjoint union of orbits of G,
there must be an orbit Orb(s) whose size is not n. However, the size of an orbit is |G|/|Stab(s)|, and
we conclude that there must be an element $(g_1, \dots, g_p)$ in S with a non-trivial stabilizer. This means that for
some g ∈ G, such that g ≠ e, we have
$(gg_1, \dots, gg_p)$ is equal to $(g_1, \dots, g_p)$ up to a cyclic shift.
This means that for some i we have
\[ (gg_1, \dots, gg_p) = (g_{i+1}, g_{i+2}, g_{i+3}, \dots, g_p, g_1, g_2, \dots, g_i). \]
Therefore, $gg_1 = g_{i+1}$, $g^2g_1 = gg_{i+1} = g_{2i+1}$, . . . , $g^pg_1 = \cdots = g_{pi+1} = g_1$ (we always read the indices
mod p). That is, there exists g ≠ e with
\[ g^p = e. \]
Since the order of g divides p and p is prime, the order of g must be p.
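The counting step in the proof, $|S| = (n^p - n)/p + n$, only depends on n and p, so it can be tested on small values. A rough sketch, in which tuples are taken over the labels 0, ..., n − 1 (standing in for the elements of G) and `cyclic_classes` is a name chosen here:

```python
from itertools import product

def cyclic_classes(n, p):
    """Number of p-tuples over an n-element set, counted up to cyclic shift."""
    tuples = product(range(n), repeat=p)
    # The smallest rotation serves as a canonical representative of each class.
    return len({min(t[r:] + t[:r] for r in range(p)) for t in tuples})

n, p = 6, 3
size_S = cyclic_classes(n, p)

print(size_S, (n**p - n) // p + n, size_S % n)
```

For n = 6 and p = 3 the count matches the formula, and it is not divisible by n, as the proof requires.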

32. The first isomorphism theorem for groups

32.1. Normal subgroups.
Definition 32.1.1. Let G be a group and H a subgroup of G. H is called a normal subgroup if for every g ∈ G
we have
gH = Hg.
Note that gH = Hg if and only if $gHg^{-1} = H$, where $gHg^{-1} = \{ghg^{-1} : h \in H\}$. Thus, we could
also define a normal subgroup H to be a subgroup such that $gHg^{-1} = H$ for all g ∈ G, equivalently, ∀g ∈
G, ∀h ∈ H, $ghg^{-1} \in H$.
Lemma 32.1.2. Let H be a subgroup of a group G. Then H is normal if and only if
\[ gHg^{-1} \subset H, \quad \forall g \in G. \]
Proof. Clearly if H is normal, $gHg^{-1} \subset H$, ∀g ∈ G. Suppose then that $gHg^{-1} \subset H$, ∀g ∈ G. Given g ∈ G
we have then $gHg^{-1} \subset H$ and also $g^{-1}H(g^{-1})^{-1} \subset H$. The last inclusion is just $g^{-1}Hg \subset H$, which is
equivalent to $H \subset gHg^{-1}$. We conclude that $gHg^{-1} = H$.

Our main example of a normal subgroup is the kernel of a homomorphism.

Proposition 32.1.3. Let f : G → H be a group homomorphism. Then Ker( f ) is a normal subgroup of G.
Proof. We proved already that Ker(f) is a subgroup. Let g ∈ G, h ∈ Ker(f); we need to show that $ghg^{-1} \in \mathrm{Ker}(f)$,
that is, $f(ghg^{-1}) = e_H$. We calculate $f(ghg^{-1}) = f(g)f(h)f(g^{-1}) = f(g)e_Hf(g^{-1}) = f(g)f(g^{-1}) =
f(g)f(g)^{-1} = e_H$.
Example 32.1.4. For any group G, {eG } and G are normal subgroups. If G is a commutative group, any
subgroup of G is a normal subgroup.
32.2. Quotient groups. Similar to the construction of a quotient ring, we construct quotient groups.
Let G be a group and H a normal subgroup of G. We let the quotient group G mod H, denoted G/H,
be the collection of left cosets of H. We define multiplication by
( aH )(bH ) = abH.
We claim that this is well defined, namely, if $aH = a_1H$, $bH = b_1H$ then $abH = a_1b_1H$. Indeed, we
have $a = a_1h$ for some h ∈ H and $b = b_1h'$ for some h′ ∈ H. Also, $hb_1 \in Hb_1 = b_1H$ and so $hb_1 = b_1h''$ for
some h″ ∈ H. Then, $abH = a_1hb_1h'H = a_1b_1h''h'H = a_1b_1H$ (if t ∈ H then tH = H).
We now verify the group axioms. We use the notation $\bar{a}$ for aH. Then the group law is
\[ \bar{a}\,\bar{b} = \overline{ab}. \]
We have $(\bar{a}\,\bar{b})\bar{c} = \overline{ab}\,\bar{c} = \overline{(ab)c} = \overline{a(bc)} = \bar{a}\,\overline{bc} = \bar{a}(\bar{b}\,\bar{c})$. Thus, this is an associative operation. We
have $\bar{a}\,\bar{e}_G = \overline{ae_G} = \bar{a}$ and $\bar{e}_G\,\bar{a} = \overline{e_Ga} = \bar{a}$. So there is an identity element $e_{G/H}$ and it is equal to $\bar{e}_G = H$.
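Finally, $\bar{a}\,\overline{a^{-1}} = \overline{aa^{-1}} = \bar{e}_G = e_{G/H}$ and $\overline{a^{-1}}\,\bar{a} = \overline{a^{-1}a} = \bar{e}_G = e_{G/H}$. Thus, every $\bar{a}$ is invertible and its
inverse is $\overline{a^{-1}}$ (that is, $(aH)^{-1} = a^{-1}H$).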
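Normality is exactly what makes the product of cosets independent of the chosen representatives. The sketch below illustrates this in $S_3$ (permutations of {0, 1, 2} as tuples), comparing the normal subgroup $A_3$ with a subgroup of order 2; the helper names `mul`, `left_coset` and `product_well_defined` are assumptions made for illustration.

```python
from itertools import permutations

def mul(p, q):
    """Product pq of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))        # S_3

def left_coset(a, H):
    return frozenset(mul(a, h) for h in H)

def product_well_defined(H):
    """True if aH = a1H and bH = b1H always imply abH = a1b1H."""
    return all(
        left_coset(mul(a, b), H) == left_coset(mul(a1, b1), H)
        for a in G for b in G
        for a1 in left_coset(a, H)      # every a1 with a1H = aH
        for b1 in left_coset(b, H)      # every b1 with b1H = bH
    )

A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # normal subgroup of order 3
K = [(0, 1, 2), (1, 0, 2)]              # subgroup {e, (0 1)}, not normal

print(product_well_defined(A3), product_well_defined(K))
```

The rule (aH)(bH) = abH is consistent for $A_3$ and fails for the non-normal subgroup, which is why the construction assumes H is normal.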

32.3. The first isomorphism theorem.

Theorem 32.3.1. Let f : G → H be a surjective group homomorphism. The canonical map π : G → G/Ker(f)
is a homomorphism with kernel Ker(f). There is an isomorphism F : G/Ker(f) → H, such that the following
diagram commutes:
\[ \begin{array}{ccc} G & \xrightarrow{\;f\;} & H \\ {\scriptstyle \pi}\downarrow & \nearrow{\scriptstyle F} & \\ G/\mathrm{Ker}(f) & & \end{array} \]

Proof. First we check that π : G → G/Ker(f) is a homomorphism, where $\pi(a) = \bar{a} = a\mathrm{Ker}(f)$. Indeed,
this is just the formula $\overline{ab} = \bar{a}\,\bar{b}$. The kernel is {a ∈ G : aKer(f) = Ker(f)} = Ker(f).
Let us define
\[ F : G/\mathrm{Ker}(f) \to H, \qquad F(\bar{a}) = f(a). \]
This is well defined: if $\bar{a} = \bar{b}$ then $b^{-1}a \in \mathrm{Ker}(f)$, so $f(b) = f(b)f(b^{-1}a) = f(b(b^{-1}a)) = f(a)$. Clearly
F ◦ π = f.
F is a homomorphism: $F(\bar{a}\,\bar{b}) = F(\overline{ab}) = f(ab) = f(a)f(b) = F(\bar{a})F(\bar{b})$. Furthermore, F is surjective,
since given h ∈ H we may find a ∈ G such that f(a) = h and so $F(\bar{a}) = h$. Finally, F is injective,
because $F(\bar{a}) = f(a) = e_H$ means that a ∈ Ker(f) so $\bar{a} = e_{G/H}$.
Example 32.3.2. Let F be a field. Recall the group of matrices $GL_2(\mathbb{F})$,
\[ GL_2(\mathbb{F}) = \left\{ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{F},\ \det(M) = ad - bc \neq 0 \right\}. \]
We have also noted that the determinant is multiplicative:
\[ \det(MN) = \det(M)\det(N). \]
We may now view this fact as saying that the function
\[ \det : GL_2(\mathbb{F}) \to \mathbb{F}^{\times} \]
is a group homomorphism. It is a surjective group homomorphism, because given any $a \in \mathbb{F}^{\times}$ the matrix
$\begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}$ has determinant a. The kernel is called $SL_2(\mathbb{F})$; it consists of the matrices with determinant 1.
It is a normal subgroup of $GL_2(\mathbb{F})$ and by the first isomorphism theorem $GL_2(\mathbb{F})/SL_2(\mathbb{F}) \cong \mathbb{F}^{\times}$.
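For a finite field the isomorphism has a concrete counting consequence: the fibers of det are the cosets of the kernel, so they all have the same size. A small sketch, assuming the field Z/3Z and matrices stored as 4-tuples (a, b, c, d); the names `GL2`, `fibers` and `SL2` are illustrative.

```python
from itertools import product

p = 3   # work over the field Z/pZ; p = 3 is an arbitrary small choice

# Invertible 2x2 matrices (a, b, c, d), i.e. those with ad - bc != 0 mod p.
GL2 = [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
       if (a * d - b * c) % p != 0]

# Group the matrices by the value of their determinant.
fibers = {}
for (a, b, c, d) in GL2:
    fibers.setdefault((a * d - b * c) % p, []).append((a, b, c, d))

SL2 = fibers[1]   # the kernel of det

print(len(GL2), len(SL2), sorted(fibers))
```

Here there are p − 1 = 2 fibers, each of the same size as the kernel, consistent with the quotient having as many elements as the group of units of the field.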
Example 32.3.3. The homomorphic images of $S_3$. We wish to identify all the homomorphic images of $S_3$.
If $f : S_3 \to G$ is a group homomorphism then Ker(f) is a normal subgroup of $S_3$. We begin therefore by
finding all normal subgroups of $S_3$.
We know that every nontrivial subgroup of $S_3$ contains a subgroup of the form ⟨(ij)⟩ for some transposition (ij) or the subgroup $A_3 := \langle (123) \rangle$. That there are no other subgroups follows from the following
observation: if H ⊂ K ⊂ G are groups and G is finite, then |G|/|K| divides |G|/|H|, because the quotient
is |K|/|H|. In our situation, for a non-trivial subgroup H we have $|S_3|/|H|$ is either 2 or 3 and those are
prime. It follows that either |K| = |G| or |K| = |H| and so that either K = G or K = H.
The subgroups of order 2 are not normal. For example, $(13)(12)(13)^{-1} = (13)(12)(13) = (23)$, which
shows that {1, (12)} is not normal, etc. On the other hand, the subgroup $A_3 := \{1, (123), (132)\}$ is normal.
This follows from it being of index 2 (see assignments); another argument appears below. Since $S_3/A_3$ has
order 2, it must be isomorphic to Z/2Z.
We conclude that there are three options:
(1) Ker( f ) = {1}. In this case, S3 is isomorphic to its image.
(2) Ker(f) = $S_3$. In this case $S_3/\mathrm{Ker}(f) = S_3/S_3 \cong \{1\}$ is the trivial group.
(3) Ker(f) = $A_3$. In this case $S_3/A_3$ is a group of 2 elements, obviously cyclic. Thus $S_3/A_3 \cong \mathbb{Z}/2\mathbb{Z}$.

32.4. Groups of low order.

32.4.1. Groups of order 1. There is a unique group of order 1, up to isomorphism. It consists of its identity
element alone. There is only one way to define a homomorphism between two groups of order 1 and it is an
isomorphism.

32.4.2. Groups of order 2, 3, 5, 7. Recall that we proved that every group G of prime order is cyclic, and,
in fact, any non-trivial element is a generator. This implies that any subgroup of G different from {eG } is
equal to G. We also proved that any two cyclic groups having the same order are isomorphic. We therefore
conclude:

Corollary 32.4.1. Every group G of prime order p is isomorphic to Z/pZ; it has no subgroups apart from
the trivial subgroups $\{e_G\}$, G.

In particular, this corollary applies to groups of order 2, 3, 5, 7.

32.4.3. Groups of order 4. Let G be a group of order 4.

First case: G is cyclic.
In this case we have G ≅ Z/4Z. Its subgroups are {0}, Z/4Z and H = ⟨2⟩ = {0, 2}. There are no other
subgroups because if a subgroup J contains an element g it contains the cyclic subgroup generated by g. In
our case, the elements 1 and 3 are generators, so any subgroup not equal to G is contained in {0, 2}.
Since G is abelian, H is normal. G/H has order |G|/|H| = 4/2 = 2 and so G/H ≅ Z/2Z.

Second case. G is not cyclic.

Claim: Every element of G different from $e_G$ has order 2.
Proof: we have ord(g) = |⟨g⟩| and it divides |G|. So, in our case, ord(g) = 1, 2 or 4. If ord(g) = 4, we get
that G is cyclic and if ord(g) = 1 then g = $e_G$. Thus, we must have ord(g) = 2.

Claim: Let G be a group in which every element different from the identity has order 2. Then G is
commutative.
Proof: Note first that if a ∈ G has order 2 (or is the identity) then $aa = e_G$ and so $a^{-1} = a$. Now, we need
to show that for every a, b ∈ G we have ab = ba. But this is equivalent to $ab = b^{-1}a^{-1}$. Multiply both
sides by ab and we see that we need to prove that $abab = e_G$. But, $abab = (ab)^2$ and so is equal to $e_G$, by
assumption.

One example of a group of order 4 satisfying all these properties is Z/2Z × Z/2Z. We claim that G ≅
Z/2Z × Z/2Z. Pick two distinct elements $g_1, g_2$ of G, not equal to the identity. Define a map
\[ f : \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} \to G, \qquad f(a, b) = g_1^{a} g_2^{b}. \]
This is well defined: if (a, b) = (a′, b′) then a = a′ + 2c, b = b′ + 2d and we get $f(a, b) = g_1^{a} g_2^{b} =
g_1^{a'} (g_1^2)^c g_2^{b'} (g_2^2)^d = g_1^{a'} g_2^{b'} = f(a', b')$. The map is also a homomorphism: $f((a_1, b_1) + (a_2, b_2)) = f(a_1 +
a_2, b_1 + b_2) = g_1^{a_1 + a_2} g_2^{b_1 + b_2} = g_1^{a_1} g_1^{a_2} g_2^{b_1} g_2^{b_2}$. Because G is commutative we can rewrite this as $f(a_1 + a_2, b_1 +
b_2) = g_1^{a_1} g_2^{b_1} g_1^{a_2} g_2^{b_2} = f(a_1, b_1) \cdot f(a_2, b_2)$.
The image of f is a subgroup with at least 3 elements, namely, $e_G, g_1, g_2$. By Lagrange the image then
must be G. It follows that f is surjective and so, because both source and target have four elements, is also
injective.

The non-trivial subgroups of Z/2Z × Z/2Z are all cyclic. They are {(0, 0), (0, 1)}, {(0, 0), (1, 0)} and
{(0, 0), (1, 1)}. Since the group is commutative they are all normal and the quotient in every case has
order 2, hence isomorphic to Z/2Z.

32.4.4. Groups of order 6. We know three candidates already: Z/6Z, Z/2Z × Z/3Z and $S_3$. Now, in
fact, Z/6Z ≅ Z/2Z × Z/3Z (CRT). And since $S_3$ is not commutative it is not isomorphic to Z/6Z. In
fact, every group of order 6 is isomorphic to either Z/6Z or $S_3$. We don’t prove it here.

The subgroups of Z/6Z: Let n be a positive integer. We have a surjective group homomorphism π :
Z → Z/nZ. Similar to the situation with rings one can show that this gives a bijection between subgroups H
of Z that contain nZ and subgroups K of Z/nZ. The bijection is given by
\[ H \mapsto \pi(H), \qquad K \mapsto \pi^{-1}(K). \]
The subgroups of Z are all cyclic, having the form mZ for some m ≥ 0 (same proof as for ideals, really), and mZ contains nZ exactly when m | n. We thus
conclude that the subgroups of Z/nZ are cyclic and generated by the elements m such that m|n. Thus,
for n = 6 we find the cyclic subgroups generated by 1, 2, 3, 6. Those are the subgroups Z/6Z, {0, 2, 4}, {0, 3}, {0}.
They are all normal and the quotients are isomorphic respectively to {0}, Z/2Z, Z/3Z, Z/6Z.
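The list of subgroups of Z/6Z can be double-checked by exhaustive search. In this sketch `subgroups_of_Zn` builds the cyclic subgroups generated by the divisors of n, and `all_subgroups` tests every subset containing 0 for closure under addition (which suffices in a finite group); both function names are introduced here.

```python
from itertools import combinations

def subgroups_of_Zn(n):
    """Cyclic subgroups <m> of Z/nZ, one for each divisor m of n."""
    return {m: sorted({(m * k) % n for k in range(n)})
            for m in range(1, n + 1) if n % m == 0}

def all_subgroups(n):
    """Every subset of Z/nZ that contains 0 and is closed under addition."""
    found = []
    for size in range(1, n + 1):
        for c in combinations(range(n), size):
            s = set(c)
            if 0 in s and all((a + b) % n in s for a in s for b in s):
                found.append(sorted(s))
    return found

subs = subgroups_of_Zn(6)
print(subs)
print(all_subgroups(6))
```

For n = 6 the two lists coincide, so the four subgroups named above are the only ones.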

The subgroups of S3 : Those were classified above.

32.5. Odds and evens. Let n ≥ 2 be an integer. One can show that there is a way to assign a sign, ±1, to
any permutation in Sn such that the following properties hold:
• sgn(στ) = sgn(σ) · sgn(τ).
• sgn((ij)) = −1 for i ≠ j.
We do not prove that here, but we shall prove that next term in MATH 251. Note that since any permutation
is a product of transpositions, the two properties together determine the sign of any permutation. Here are
some examples: sgn((12)) = −1, sgn((123)) = sgn((13)(12)) = sgn((13)) · sgn((12)) = 1, sgn((1234)) =
sgn((14)(13)(12)) = $(-1)^3$ = −1.

The property sgn(στ ) = sgn(σ) · sgn(τ ) could be phrased as saying that the function
sgn : Sn −→ {±1}
is a surjective group homomorphism. We define An as the kernel of the homomorphism sgn. It is called the
alternating group on n letters and its elements are called even permutations. The elements of Sn \ An are
called odd permutations. The group An is a normal subgroup of Sn , being a kernel of a homomorphism. Its
cardinality is n!/2. Here are some examples:
• $A_2 = \{1\}$;
• $A_3 = \{1, (123), (132)\}$;
• $A_4 = \{1, (12)(34), (13)(24), (14)(23), (123), (132), (234), (243), (124), (142), (134), (143)\}$. (It is easy
to check that these are 12 distinct even permutations, so the list must be equal to $A_4$.)
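The sign can be computed from the cycle decomposition, since a cycle of length L is a product of L − 1 transpositions (as in the example (1234) = (14)(13)(12)). The following sketch, with permutations of {0, ..., n − 1} stored as tuples and the names `sgn` and `mul` chosen here, checks multiplicativity on $S_4$ and counts the even permutations.

```python
from itertools import permutations

def sgn(p):
    """Sign of a permutation of 0..n-1 given as a tuple, from its cycle lengths."""
    seen = set()
    sign = 1
    for i in range(len(p)):
        if i in seen:
            continue
        length = 0
        j = i
        while j not in seen:          # walk around the cycle containing i
            seen.add(j)
            j = p[j]
            length += 1
        if length % 2 == 0:           # an even-length cycle is an odd permutation
            sign = -sign
    return sign

def mul(p, q):
    """Product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if sgn(p) == 1]

multiplicative = all(sgn(mul(s, t)) == sgn(s) * sgn(t) for s in S4 for t in S4)
print(multiplicative, len(A4))
```

On $S_4$ the function is multiplicative and its kernel has 4!/2 = 12 elements.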

32.6. Odds and Ends.

Example 32.6.1. We prove that in Z/pZ, for p a prime, any element is a sum of two squares.
Clearly this holds for p = 2, so we assume p > 2. To begin with, $(\mathbb{Z}/p\mathbb{Z})^{\times} = \{1, \dots, p - 1\}$ is a group
under multiplication; it has p − 1 elements. Consider the homomorphism:
\[ \mathrm{sq} : (\mathbb{Z}/p\mathbb{Z})^{\times} \to (\mathbb{Z}/p\mathbb{Z})^{\times}, \qquad \mathrm{sq}(x) = x^2. \]
Let H be its image, a subgroup of $(\mathbb{Z}/p\mathbb{Z})^{\times}$. The kernel of sq is the solutions to $x^2 = 1$, which are
precisely ±1. Note that 1 ≠ −1. It follows that $H \cong (\mathbb{Z}/p\mathbb{Z})^{\times}/\{\pm 1\}$ is a group with (p − 1)/2 elements
consisting precisely of the non-zero congruence classes that are squares. Let $H^{*} = H \cup \{0\}$; it is a subset
of Z/pZ with (p + 1)/2 elements consisting of all squares.
Let a ∈ Z/pZ. Then the two sets $H^{*}$ and $a - H^{*} := \{a - h : h \in H^{*}\}$ have size (p + 1)/2 and so
must intersect (because Z/pZ has $p < 2 \cdot \frac{p+1}{2}$ elements). That is, there are two squares $x^2, y^2$ such
that $a - x^2 = y^2$ and so $a = x^2 + y^2$.
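The pigeonhole argument can be mirrored directly in code: an element a is a sum of two squares exactly when the set of squares meets its translate a − (squares). A short sketch for a few small primes; the function names `squares_mod` and `is_sum_of_two_squares` are illustrative.

```python
def squares_mod(p):
    """The set of all squares in Z/pZ, including 0."""
    return {(x * x) % p for x in range(p)}

def is_sum_of_two_squares(a, p):
    """True if a = x^2 + y^2 in Z/pZ, tested by intersecting the two sets."""
    H_star = squares_mod(p)
    return any((a - h) % p in H_star for h in H_star)

for p in (2, 3, 5, 7, 11, 13):
    print(p, len(squares_mod(p)),
          all(is_sum_of_two_squares(a, p) for a in range(p)))
```

For each odd prime in the list the set of squares has (p + 1)/2 elements and every residue is a sum of two squares.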

We next tie together the notions of homomorphism and group action.

Lemma 32.6.2. Let G be a group and T a non-empty set. To give an action of G on T is equivalent to
giving a homomorphism ρ : G → ST .
Proof. Suppose that we are given an action of G on T. Pick an element g ∈ G. We claim that the function
\[ T \to T, \qquad t \mapsto g \star t, \]
is a permutation of T. Indeed, if $gt_1 = gt_2$ then $g^{-1}(gt_1) = g^{-1}(gt_2)$, so $(g^{-1}g)t_1 = (g^{-1}g)t_2$; that
is, $et_1 = et_2$ and so $t_1 = t_2$. Also, given t ∈ T we have $g(g^{-1}t) = (gg^{-1})t = et = t$, showing surjectivity.
Let us denote then this function by ρ(g), ρ(g)t = gt. We have a function
\[ \rho : G \to S_T. \]
We claim that this function is a homomorphism. We need to show that $\rho(g_1g_2) = \rho(g_1) \circ \rho(g_2)$, i.e., that for
every t ∈ T we have $\rho(g_1g_2)(t) = (\rho(g_1) \circ \rho(g_2))(t)$. Indeed, $\rho(g_1g_2)t = (g_1g_2)t = g_1(g_2t) = g_1(\rho(g_2)t) =
\rho(g_1)(\rho(g_2)(t)) = (\rho(g_1) \circ \rho(g_2))(t)$.
Conversely, suppose that
\[ \rho : G \to S_T \]
is a group homomorphism. Define an action of G on T by
\[ g \star t := \rho(g)(t). \]
We claim this is a group action. Since ρ is a homomorphism we have $\rho(e) = \mathrm{Id}_T$ and so $e \star t = \rho(e)(t) =
\mathrm{Id}_T(t) = t$. Now, $g_1 \star (g_2 \star t) = \rho(g_1)(\rho(g_2)(t)) = (\rho(g_1) \circ \rho(g_2))(t) = \rho(g_1g_2)(t) = (g_1g_2) \star t$.

32.7. Exercises.
(1) Write the following permutations as a product of disjoint cycles in $S_9$ and find their order:
(a) $\sigma\tau^2\sigma$, where σ = (1234)(68) and τ = (123)(398)(45).
(b) στστ, σ = (123), τ = (345)(17).
(c) $\sigma^{-1}\tau\sigma$, σ = (123456789), τ = (12)(345)(6789).
(d) $\sigma^{-1}\tau\sigma$, σ = (123456789), τ = (12345)(6789).
(2) Which of the following are subgroups of S4 ?
(a) {1, (12)(34), (13)(24), (14)(23)}.
(b) {1, (1234), (13), (24), (13)(24)}.
(c) {1, (423), (432), (42), (43), (23)}.
(d) {1, (123), (231), (124), (142)}.
(3) (a) Let Q8 be the set of eight elements {±1, ±i, ± j, ±k} in the quaternion ring H, discussed in a
previous assignment (so ij = k = − ji etc.). Show that Q8 is a group.
(b) For each of the groups S3 , D4 , Q8 do the following:
(i) Write their multiplication table;
(ii) Find the order of each element of the group;
(iii) Find all the subgroups. Which of them are cyclic?
 
(4) (a) Find the order of the permutation $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 1 & 5 & 6 & 2 & 7 & 8 & 4 \end{pmatrix}$ and write it as a product
of cycles.
(b) Find a permutation in S12 of order 60. Is there a permutation of larger order in S12 ?

(5) Let H1 , H2 be subgroups of a group G. Prove that H1 ∩ H2 is a subgroup of G.

(6) Let G be a group and let H1 , H2 be subgroups of G. Prove that if H1 ∪ H2 is a subgroup then either
H1 ⊆ H2 or H2 ⊆ H1 .

(7) Let F be a field. Prove that
\[ SL_2(\mathbb{F}) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{F},\ ad - bc = 1 \right\} \]
is a group. If F has q elements, how many elements are in the group $SL_2(\mathbb{F})$? (You may use the
fact that $GL_2(\mathbb{F})$ is a group, being the units of the ring $M_2(\mathbb{F})$.)

(8) Let n ≥ 2 be an integer. Let $S_n$ be the group of permutations on n elements {1, 2, ..., n}. A
permutation σ is called a transposition if σ = (i j) for some i ≠ j, namely, σ exchanges i and j
and leaves the rest of the elements in their places. Prove that every element of $S_n$ is a product of
transpositions.
Hint: Reduce to the case of cycles.

(9) (a) Let n ≥ 1 be an integer. Show that the order of a ∈ Z/nZ, viewed as a group of order n with respect
to addition, is $n/\gcd(a, n)$.
(b) Let n ≥ 3. Find the order of every element of the dihedral group Dn .

(10) Let n > 1 be an integer, relatively prime to 10. Consider the decimal expansion of 1/n. It is
periodic, as we have proven in a previous assignment. Prove that the length of the period is precisely
the order of 10 in the group Z/nZ× (the group of congruence classes relatively prime to n, under
multiplication).
For example: 1/3 = 0.33333 . . . has period 1 and the order of 10 (mod 3) = 1 (mod 3), with
respect to multiplication, is 1. We have 1/7 = 0.142857142857142857 . . . , which has
period 6; the order of 10 (mod 7) = 3 (mod 7) is 6 as we may check: 3, $3^2 = 9 \equiv 2$, $3^3 \equiv 6$, $3^4 \equiv 18 \equiv 4$, $3^5 \equiv 12 \equiv 5$, $3^6 \equiv 15 \equiv 1$.

Hint: how does one calculate the decimal expansion in practice?

(11) (a) Prove that $\mathbb{Z}_2 \times \mathbb{Z}_3 \cong \mathbb{Z}_6$.
(b) Prove that $\mathbb{Z}_6 \not\cong S_3$, though both groups have 6 elements.
(c) Prove that the following groups of order 8 are not isomorphic: $\mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2$, $\mathbb{Z}_2 \times \mathbb{Z}_4$, $\mathbb{Z}_8$, $D_4$, Q.
(Hint: use group properties such as “commutative”, “exists an element of order k”, “number of
elements of order k”, ...)
Remark: One can prove that every group of order 6 is isomorphic to $\mathbb{Z}_6$ or $S_3$. One can prove that
every group of order 8 is isomorphic to one in our list.
(12) Let G be a finite group acting on a finite set S. Prove that if ⟨a⟩ = ⟨b⟩ for two elements a, b ∈ G then
I(a) = I(b), where for any element g ∈ G, I(g) = |{s ∈ S : gs = s}|.
(13) Let p be a prime number. Let G be a finite group of $p^r$ elements. Let S be a finite set having N
elements and assume that (p, N) = 1. Assume that G acts on S. Prove that G has a fixed point in
S. Namely, there exists s ∈ S such that g ⋆ s = s for every g ∈ G.
(14) Find how many necklaces with 4 Rubies, 5 Sapphires and 3 Diamonds are there, up to the usual
identifications.
(15) Find all the homomorphic images of the groups D4 , Q. (Guidance: If G is one of these groups, show
that this amounts to classifying the normal subgroups of G and for each such normal subgroup H
find an isomorphism of G/H with a group known to us.)
(16) Let R be the ring of matrices
\[ \left\{ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix} : a_{ij} \in \mathbb{Z}_2 \right\}. \]
Note that R has $2^6 = 64$ elements. Let G be the group of units of R. In this case it consists of the
matrices such that $a_{11} = a_{22} = a_{33} = 1$. Therefore G has 8 elements. Prove that $G \cong D_4$.
(17) Let H be a subgroup of index 2 of a group G. Prove that H is normal in G.

Part 7. Appendices
Appendix A: The Cantor-Bernstein Theorem
In this appendix we prove the Cantor-Bernstein theorem (Theorem 4.0.2).

Theorem 32.7.1. Let X, Y be two sets such that |X| ≤ |Y| and |Y| ≤ |X|. Then |X| = |Y|.

Proof. We divide the proof into three parts. First, given a set X, denote by P ( X ) the set whose elements
are the subsets of X. The first step is the following “fixed point" lemma.

Lemma 32.7.2. Let X be a set and let ϕ : P ( X ) → P ( X ) be a function with the property that for A1 , A2 ∈
P ( X ), A1 ⊆ A2 =⇒ ϕ( A1 ) ⊆ ϕ( A2 ). Then, there is A0 ∈ P ( X ) such that

ϕ ( A0 ) = A0 .

Proof. [Lemma] Consider the set

D = { A ∈ P ( X ) : A ⊆ ϕ( A)}.

Let
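\[ A_0 = \bigcup_{A \in D} A. \]

First, since $A \subseteq A_0$ for every A ∈ D, we get $\varphi(A_0) \supseteq \bigcup_{A \in D} \varphi(A) \supseteq \bigcup_{A \in D} A = A_0$. From this we also conclude that $\varphi(\varphi(A_0)) \supseteq \varphi(A_0)$. Thus,
$\varphi(A_0) \in D$ and consequently, from the definition of $A_0$, $\varphi(A_0) \subseteq A_0$. It follows that $A_0 = \varphi(A_0)$.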

Now, by assumption, there are injective functions

f : X → Y, g : Y → X.

Lemma 32.7.3. There is a subset $A_0$ of X such that $g^{-1}$ gives a bijection between $X - A_0$ and $Y - f(A_0)$.

Proof. [Lemma] Define a function ϕ : P(X) → P(X) by
\[ \varphi(A) = X - g(Y - f(A)). \]
If $A_1 \subseteq A_2$ then $f(A_1) \subseteq f(A_2)$ and so $Y - f(A_1) \supseteq Y - f(A_2)$, and the same holds after applying g. Taking complements in X reverses the inclusion once more. Thus,
$\varphi(A_1) \subseteq \varphi(A_2)$. We can thus apply the previous lemma and find a subset $A_0$ of X such that
\[ A_0 = \varphi(A_0) = X - g(Y - f(A_0)). \]
That means that $X - A_0 = g(Y - f(A_0))$ and so that $X - A_0$ is contained in the image of g and, g being injective, $g^{-1}$ gives
a bijection $X - A_0 \to Y - f(A_0)$.

We now proceed to the last part of the proof where we construct a bijection
\[ h : X \to Y. \]
We define
\[ h(x) = \begin{cases} f(x) & x \in A_0, \\ g^{-1}(x) & x \in X - A_0. \end{cases} \]
Note that h is injective when restricted to $A_0$ and when restricted to $X - A_0$. At the same time $h(A_0) = f(A_0)$
is disjoint from $h(X - A_0) = Y - f(A_0)$. Thus, h is injective. On the other hand, as $Y = f(A_0) \cup (Y - f(A_0))$,
h is also surjective.

Appendix B: The irrationality of e

In this section we prove that e is irrational. Every rational number has the property that its decimal
expansion is eventually periodic and so one idea could be to prove that e doesn’t have this property. The
problem though is that we know very little about the expansion of e. For example, we do not know if every
digit appears infinitely often in its expansion. Instead, we shall use a different method. It is based on the fact
that e has an expression as an infinite sum that "converges too fast".
We make use of the formula, or, in certain approaches, the definition, of e as an infinite sum:
$$
e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} + \cdots \approx 2.7182818\ldots
$$
Suppose that e is a rational number. Then e = a/N, where a and N are positive integers and N > 1, and so $N! \cdot e$ is surely an integer, as is $N! \cdot e - N! \sum_{n=0}^{N} \frac{1}{n!}$. Therefore,
$$
N! \cdot \sum_{n=N+1}^{\infty} \frac{1}{n!} = \frac{1}{N+1} + \frac{1}{(N+1)(N+2)} + \frac{1}{(N+1)(N+2)(N+3)} + \cdots
$$
is an integer. This number is clearly positive. On the other hand, it is smaller than
$$
\frac{1}{N+1} + \frac{1}{(N+1)^2} + \frac{1}{(N+1)^3} + \cdots = \frac{1}{N} < 1.
$$
As there is no positive integer smaller than 1, we have arrived at a contradiction and so e is irrational.
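The key inequality can be checked with exact rational arithmetic. This is only a sketch: the function name `scaled_tail` and the truncation parameter `terms` are assumptions made for the illustration, and the code evaluates a finite truncation of the infinite tail, which is bounded above by the full tail and so also by 1/N.

```python
from fractions import Fraction
from math import factorial

def scaled_tail(N, terms=50):
    """N! * sum_{n=N+1}^{N+terms} 1/n!, computed exactly (a truncation of the tail)."""
    return sum(Fraction(factorial(N), factorial(n))
               for n in range(N + 1, N + terms + 1))

for N in range(2, 8):
    t = scaled_tail(N)
    # The truncated tail is positive and stays strictly below 1/N.
    print(N, float(t), 0 < t < Fraction(1, N))
```

A number strictly between 0 and 1/N cannot be an integer, which is exactly the contradiction used in the proof.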

Appendix C: Euclidean rings

Euclidean rings are a class of rings where division with residue is possible. They unite together our two
examples of such rings: the integers Z and rings of polynomials over a field F[ x ]. Moreover, there are other
interesting examples of Euclidean rings and in such rings many of the results we had obtained for Z and F[ x ]
will hold, but we shall only demonstrate a few of those.
Let R be a commutative ring which is an integral domain. That means that R has no zero divisors; that
is for two elements x, y ∈ R, xy = 0 implies x = 0 or y = 0. Suppose that there is a function
$$
|\cdot| : R - \{0\} \to \mathbb{N},
$$
such that for all x, y in R with y ≠ 0 there are elements q, r ∈ R such that x = qy + r and either r = 0 or
|r| < |y|. (In a general Euclidean ring q, r need not be unique.) Then R is called a Euclidean ring.
Let R be a Euclidean ring and a, b non-zero elements of R. We may perform the Euclidean algorithm for
a, b in R as follows:

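$$
\begin{aligned}
a &= b q_0 + r_0, & |r_0| &< |b|, \\
b &= r_0 q_1 + r_1, & |r_1| &< |r_0|, \\
r_0 &= r_1 q_2 + r_2, & |r_2| &< |r_1|, \\
&\;\;\vdots
\end{aligned}
$$

For some t we must first get that $r_{t+1} = 0$. That is,

$$
\begin{aligned}
r_{t-2} &= r_{t-1} q_t + r_t, & |r_t| &< |r_{t-1}|, \\
r_{t-1} &= r_t q_{t+1}.
\end{aligned}
$$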

As said, this process of repeated division with residue must stop, as the sizes $|r_i|$ of the residues are decreasing
natural numbers. Then $r_t$ is the gcd of a and b. That is, $r_t$ divides both a and b, and any element of R dividing
both a and b divides $r_t$. Note though that $r_t$ is not uniquely determined by these properties, as for every unit
u of R also $u r_t$ would have these properties (also, in each step of the division with residue we have to choose
$q_i$, and that affects the residues as well). In the case of Z we could make the gcd unique by demanding that
it is positive, and in the case of F[x] we could make it unique by requiring it to be a monic polynomial, but
for a general ring R there are no such normalizations. And so, we should really say “a gcd" of a, b and not
“the gcd" of a, b. As in the case of Z or F[ x ] we can prove:
Proposition 32.7.4. R is a principal ideal domain. Let a, b ∈ R, not both zero. Let r be a generator of the
ideal of R generated by a and b, ⟨r⟩ = ⟨a, b⟩. Then r is a gcd of a and b.
Proof. Let us show first that R is a principal ideal domain. Let I be a non-zero ideal of R and consider the
non-empty set of natural numbers
{| a| : a ∈ I − {0}}.
This set has a minimal element, necessarily of the form |a0| for some element a0 ∈ I, a0 ≠ 0. Clearly ⟨a0⟩ ⊆ I.
Let x ∈ I and write x = q a0 + r where either r = 0 or |r| < |a0|. As r = x − q a0 ∈ I, we cannot have
|r| < |a0| and so we must have r = 0. That is, x ∈ ⟨a0⟩ and we got I ⊆ ⟨a0⟩. Therefore, I = ⟨a0⟩ and is
principal.
Assume that b is not zero and let r be a gcd of a, b obtained by the Euclidean algorithm. It follows from the
algorithm that r = xa + yb for some x, y ∈ R and so r ∈ ⟨a, b⟩ and consequently ⟨r⟩ ⊆ ⟨a, b⟩. On the other
hand, as r | a we have a ∈ ⟨r⟩ and similarly b ∈ ⟨r⟩ and it follows that ⟨a, b⟩ ⊆ ⟨r⟩. That is, ⟨r⟩ = ⟨a, b⟩,
where r is a gcd of a and b. Remark that the generators of the ideal ⟨r⟩ are precisely the gcd's of a, b (they
are the elements $\{ur : u \in R^\times\}$).
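The claim that the algorithm yields r = xa + yb can be made concrete in the case R = Z. The following is a minimal sketch, not the notes' own presentation: the function name `extended_gcd` and the bookkeeping variables are assumptions of this illustration. It runs the repeated division with residue while tracking how each residue is written as a combination of a and b.

```python
def extended_gcd(a, b):
    """Return (r, x, y) with r = x*a + y*b, where r is a gcd of the integers a and b."""
    # Invariant kept throughout: r0 = x0*a + y0*b and r1 = x1*a + y1*b.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1                      # quotient of the division with residue
        r0, r1 = r1, r0 - q * r1          # new residue
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0

r, x, y = extended_gcd(240, 46)
print(r, x, y)   # 2 -9 47, and indeed -9*240 + 47*46 = 2
```

Since r is a combination of a and b it lies in ⟨a, b⟩, and since r divides both a and b we get ⟨a, b⟩ ⊆ ⟨r⟩, matching the argument in the proof.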
The next step is to show that for r ∈ R, r ≠ 0, r ∉ $R^\times$, the following properties are equivalent: (i) if r | ab
then r | a or r | b; (ii) if r = ab then either a or b is a unit. Such an element is called a prime element of R. A
proof very similar to those for Z or F[x] gives us unique factorization:
Theorem 32.7.5. Let R be a Euclidean ring and a ≠ 0 an element of R. Then, there are non-associated
prime elements $p_1, \dots, p_t$ of R and positive integers $a_1, \dots, a_t$ and a unit u such that
$$
a = u\, p_1^{a_1} \cdots p_t^{a_t}.
$$
Moreover, the factorization is unique in the sense that if $a = v\, q_1^{b_1} \cdots q_s^{b_s}$, with v a unit, $q_i$ non-associated
prime elements and $b_i$ positive integers, then, possibly after renaming the $q_i$, we have s = t, $p_i \sim q_i$ for all i
and $a_i = b_i$ for all i.
Here is an interesting example of a Euclidean ring, called the ring of Gaussian integers. It is the ring
Z[i] = {x + yi : x, y ∈ Z}, where
$$
|x + yi| = x^2 + y^2.
$$
Given a, b ∈ Z[i] such that b ≠ 0, write, using division in C, a/b = s + ti, where s, t are real numbers (in fact
they come out rational numbers) and let $q = s_0 + t_0 i$, where $s_0$ is the integer nearest to s and $t_0$ the integer
nearest to t. There is a unique r ∈ Z[i] such that the equation
a = qb + r
holds; that is, define r = a − qb. One then verifies that
|r| < |b|,
and so we get that Z[i] is Euclidean. It is interesting that if you try to mimic this argument for $R = \mathbb{Z}[\sqrt{-5}]$
and $|x + y\sqrt{-5}| = x^2 + 5y^2$ the proof doesn't work. We know that because we have seen that R is not a
principal ideal domain (see page 67). It is interesting to analyze the exact point where the proof fails, but we
leave that to the interested reader.
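The division with residue in Z[i] described above can be sketched in exact integer arithmetic. This is an illustration under stated assumptions: Gaussian integers are represented as pairs `(x, y)` standing for x + yi, and all function names (`norm`, `mul`, `sub`, `nearest`, `divmod_gauss`, `gcd_gauss`) are hypothetical helpers introduced here. Ties in "nearest integer" are rounded up, which is one of several valid choices.

```python
def norm(z):
    """The size |x + yi| = x^2 + y^2 used in the notes."""
    x, y = z
    return x * x + y * y

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def sub(z, w):
    return (z[0] - w[0], z[1] - w[1])

def nearest(p, n):
    """An integer nearest to p/n, for integers p and n > 0 (ties round up)."""
    return (2 * p + n) // (2 * n)

def divmod_gauss(a, b):
    """Return (q, r) with a = q*b + r and norm(r) < norm(b); b must be non-zero."""
    n = norm(b)
    # a/b = a * conj(b) / norm(b), so s = p_re / n and t = p_im / n are rational.
    p_re, p_im = mul(a, (b[0], -b[1]))
    q = (nearest(p_re, n), nearest(p_im, n))
    return q, sub(a, mul(q, b))

def gcd_gauss(a, b):
    """Euclidean algorithm in Z[i]; the result is a gcd, determined up to a unit."""
    while b != (0, 0):
        _, r = divmod_gauss(a, b)
        a, b = b, r
    return a

a, b = (11, 3), (1, 8)
g = gcd_gauss(a, b)
print(g, norm(g))
```

The reason the residue is small: s and t each differ from the chosen integers by at most 1/2, so the complex number a/b − q has squared absolute value at most 1/2, and therefore x² + y² for r = a − qb is at most half of that for b.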

Appendix D: The complex exponential function

Consider the series
$$
\sum_{n=0}^{\infty} \frac{z^n}{n!} = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots
$$

Lemma 32.7.6. For every complex number z the series converges to a complex number that we shall call $e^z$.
Proof. Left as an exercise. The key is to show that for every real number ε > 0 and z ∈ C there is
an integer n ≥ 0 such that for all integers N ≥ 0,
$$
\left| \sum_{k=n}^{n+N} \frac{z^k}{k!} \right| < \varepsilon.
$$
This implies that the real part and imaginary part of this sum are both less than ε in absolute value and this, in turn, implies
that the sequences $\mathrm{Re}\big(\sum_{k=1}^{N} z^k/k!\big)$ and $\mathrm{Im}\big(\sum_{k=1}^{N} z^k/k!\big)$, as N varies, are Cauchy
sequences of real numbers that thus converge.
Theorem 32.7.7. The complex exponential function $e^z$ satisfies
$$
e^{z_1 + z_2} = e^{z_1} e^{z_2}.
$$
Proof. As formal power series we have
$$
\begin{aligned}
e^{z_1 + z_2} &= \sum_{n=0}^{\infty} \frac{(z_1 + z_2)^n}{n!} \\
&= \sum_{n=0}^{\infty} \sum_{j=0}^{n} \frac{\binom{n}{j} z_1^{j} z_2^{n-j}}{n!} \\
&= \sum_{n=0}^{\infty} \sum_{j=0}^{n} \frac{z_1^{j}}{j!} \cdot \frac{z_2^{n-j}}{(n-j)!} \\
&= \Big( \sum_{n=0}^{\infty} \frac{z_1^n}{n!} \Big) \Big( \sum_{n=0}^{\infty} \frac{z_2^n}{n!} \Big).
\end{aligned}
$$
To justify that as an equality of complex numbers we should work with finite sums $\sum_{n=0}^{2M} \frac{(z_1 + z_2)^n}{n!}$, where we get
equalities up to the last step, where we don't quite get $\big(\sum_{n=0}^{2M} \frac{z_1^n}{n!}\big)\big(\sum_{n=0}^{2M} \frac{z_2^n}{n!}\big)$, but a sub-sum of this containing
all terms appearing in $\big(\sum_{n=0}^{M} \frac{z_1^n}{n!}\big)\big(\sum_{n=0}^{M} \frac{z_2^n}{n!}\big)$.
However, it is easy to show that the two sums differ by a quantity
that goes to zero as M goes to infinity. We leave the details to the reader.
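The last remark of the proof can be observed numerically. The sketch below is only an illustration in floating point, not a proof: the function name `exp_partial` and the sample values `z1`, `z2` are assumptions chosen for the example. It compares the partial sum up to 2M for z1 + z2 with the product of the partial sums up to M for z1 and z2.

```python
import cmath

def exp_partial(z, M):
    """Partial sum sum_{n=0}^{M} z^n / n! of the exponential series."""
    total, term = 0j, 1 + 0j
    for n in range(M + 1):
        total += term
        term *= z / (n + 1)   # turns z^n/n! into z^(n+1)/(n+1)!
    return total

# Hypothetical sample points for the illustration.
z1, z2 = 1 + 2j, -0.5 + 1j

for M in (5, 10, 20, 40):
    gap = abs(exp_partial(z1 + z2, 2 * M) - exp_partial(z1, M) * exp_partial(z2, M))
    print(M, gap)

# Comparison with the library exponential, as a sanity check of the partial sums.
print(abs(exp_partial(z1 + z2, 80) - cmath.exp(z1 + z2)))
```

The printed gaps shrink quickly as M grows, in line with the statement that the two finite sums differ by a quantity tending to zero.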
Lemma 32.7.8 (Euler's formula). Let θ be a real number. Then
$$
e^{i\theta} = \cos(\theta) + i \sin(\theta).
$$
Remark 32.7.9. In the text, where we discussed complex numbers, we have defined $e^{i\theta}$ this way as a quick
and dirty method to get a function $e^z$ with good properties. We now show that this formula follows from the
more systematic approach of defining $e^z$ via a power series, valid for either a real or a complex number z.
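Proof. We have
$$
\begin{aligned}
e^{i\theta} &= \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!} \\
&= \sum_{n=0}^{\infty} i^{2n} \frac{\theta^{2n}}{(2n)!} + i \sum_{n=0}^{\infty} i^{2n} \frac{\theta^{2n+1}}{(2n+1)!} \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n}}{(2n)!} + i \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n+1}}{(2n+1)!} \\
&= \cos(\theta) + i \sin(\theta).
\end{aligned}
$$
We have assumed here familiarity with the Taylor series expansions of sin and cos.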
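The splitting into even and odd terms used in the proof can be checked numerically. This is a floating-point sketch only: the function name `cos_sin_from_series` and the cutoff `terms` are assumptions of the illustration, and the comparison is against the standard library's `math.cos` and `math.sin`.

```python
import math

def cos_sin_from_series(theta, terms=30):
    """Real and imaginary parts of sum (i*theta)^n / n!, split into even and odd n."""
    c = sum((-1) ** n * theta ** (2 * n) / math.factorial(2 * n)
            for n in range(terms))
    s = sum((-1) ** n * theta ** (2 * n + 1) / math.factorial(2 * n + 1)
            for n in range(terms))
    return c, s

theta = 1.0
c, s = cos_sin_from_series(theta)
print(abs(c - math.cos(theta)), abs(s - math.sin(theta)))
```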
Corollary 32.7.10 (de Moivre's formula). Let θ be a real number and n an integer. Then
$$
(\cos(\theta) + i \sin(\theta))^n = \cos(n\theta) + i \sin(n\theta).
$$
Proof. The left hand side is equal to $(e^{i\theta})^n$, while the right hand side is equal to $e^{in\theta}$. Theorem 32.7.7 implies
these are equal.

Example 32.7.11. By expanding we find trigonometric formulas, and this is the easiest way to know them,
without memorizing them!
$$
\cos(2\theta) + i \sin(2\theta) = (\cos(\theta) + i \sin(\theta))^2 = \cos(\theta)^2 - \sin(\theta)^2 + 2i \cos(\theta) \sin(\theta).
$$
By separating real and imaginary parts, we find:
$$
\cos(2\theta) = \cos(\theta)^2 - \sin(\theta)^2, \qquad \sin(2\theta) = 2 \cos(\theta) \sin(\theta).
$$
Example 32.7.12. As $e^{i\theta} \cdot e^{-i\theta} = 1$ we find that $1 = (\cos(\theta) + i \sin(\theta))(\cos(-\theta) + i \sin(-\theta)) = (\cos(\theta) + i \sin(\theta))(\cos(\theta) - i \sin(\theta))$ (cos is an even function and sin is an odd function; this is well known but is also
evident from their Taylor expansions). This gives the identity
$$
\cos(\theta)^2 + \sin(\theta)^2 = 1.
$$
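As a further worked instance of the same method (added here as an illustration, not part of the original examples), taking n = 3 in de Moivre's formula and expanding with the binomial theorem gives the triple angle formulas:

```latex
\begin{aligned}
\cos(3\theta) + i\sin(3\theta)
  &= (\cos(\theta) + i\sin(\theta))^3 \\
  &= \cos(\theta)^3 + 3i\cos(\theta)^2\sin(\theta) - 3\cos(\theta)\sin(\theta)^2 - i\sin(\theta)^3, \\
\cos(3\theta) &= \cos(\theta)^3 - 3\cos(\theta)\sin(\theta)^2, \\
\sin(3\theta) &= 3\cos(\theta)^2\sin(\theta) - \sin(\theta)^3.
\end{aligned}
```

The last two lines come from separating real and imaginary parts, using $i^2 = -1$ and $i^3 = -i$.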